Nanopore RNA Sequencing Protocol
=Defining Files=
To make this pipeline as generally applicable as possible, I thought it best to start by defining where everything is on your computer ''first'', so that you can call these shorthands in the ''Terminal'' later.

'''Note:''' This also lets you copy and paste the code blocks below without having to change anything in the code itself.

The locations we will need to define are as follows:
* The location of your '''Guppy''' installation and its bin folder at <code>root/ont-guppy/bin</code>. We will call this in ''Terminal'' with <code>$GUPPY</code>.
* The location where you want to store the ''Guppy'' output as a log text file, if desired (in case you need to refer to it at some later point). We will call this with <code>$GLOGS</code>.
* The location where you will store your reference transcriptome (or genome) files. We will call this with <code>$REF</code>.
* The locations of your raw nanopore <code>.fast5</code> data, where you want to store the basecalled <code>.fastq</code> data, and where you want to store the aligned <code>.sam</code>, <code>.bam</code>, and '''m6ANet''' output data. We will define these as <code>$NRAW</code>, <code>$NBASE</code>, and <code>$NALI</code>, respectively.
* The location of the root ''m6ANet'' folder. We will call this with <code>$M6AO</code>.

Note that for my installation, I put all of these in a folder named <code>Research</code> within my home directory. If you do the same, you can refer to this base directory with the string <code>/home/$USER/Research</code>, regardless of what your username actually is (''Linux only''; the path to your home directory depends on your operating system).

In addition, at the start of each run, I define the name of my sample (currently the flow cell ID), which is also the name of the folder that contains the raw <code>.fast5</code> reads in my Nanopore raw data folder.
This string (<code>SAMPLEID</code>) will be used to fill in file names and folder names as we go along, so you shouldn’t have to edit any of the code as you process more samples. We will do the same for the reference transcriptome file.

{{warning|If you do additional runs and re-run this pipeline, you will need to change <code>SAMPLEID</code> and execute the following code block each time before running the pipeline. Otherwise, you will overwrite the data from your last run! '''Additionally''', you will need to redefine these variables each time you open a new ''Terminal'' window. '''If you do not have some of this software installed yet, I recommend coming back to this section and running this code after you have installed it.'''}}

===Shell File Paths===
{{tip|You can find out what your current working directory is in the ''Terminal'' by typing <code>echo $PWD</code> and pressing Enter. Similarly, you can store your current directory in a variable by setting that variable equal to <code>$PWD</code>.
<code>PWD</code> holds the same path that the <code>pwd</code> (''print working directory'') command prints, and it is a shortcut that will benefit you a lot as you navigate and define folder paths in the shell.}}

<code>Bash</code>
<syntaxhighlight lang="bash">
# Sample and reference identifiers (change SAMPLEID for every new run)
SAMPLEID="FAR91556"
GREFERENCEID="GRCh38.p14.rna.fna"
CHR="chr19"

# Software locations (GUPPYVER must be set before GUPPY, which uses it)
GUPPYVER="6.4.2"
GUPPY="/home/$USER/Research/ont-guppy-${GUPPYVER}/bin"
GLOGS="/home/$USER/Research/logs/guppy"
REF="/home/$USER/Research/Ref"
MM2="/home/$USER/Research/minimap2"
EPI="/home/$USER/Research/EpiNano"

# Data locations: raw, basecalled, aligned, and analysis output
NRAW="/home/$USER/Research/Data/Nanopore/Raw"
NBASE="/home/$USER/Research/Data/Nanopore/Basecalled"
NALI="/home/$USER/Research/Data/Nanopore/Aligned"
M6AO="/home/$USER/Research/Data/Nanopore/m6ANet_results"
EPIOUT="/home/$USER/Research/Data/Nanopore/EpiNano"
</syntaxhighlight>

===R File Paths===
<code>R</code>
<syntaxhighlight lang="R">
# Keep these identifiers in sync with the shell variables above
SAMPLEID <- "FAR91556"
GREFERENCEID <- "GRCh38.p14.rna.fna"
CHR <- "chr19"

# Replace "tj" with your own username.
# The empty last element with sep = "/" leaves a trailing slash on each path.
OUTPUT_PATH <- paste("/home/tj/Research/Data/Nanopore/Aligned", SAMPLEID, "", sep = "/")
M6ANET_PATH <- paste("/home/tj/Research/Data/Nanopore/m6ANet_results", SAMPLEID, "", sep = "/")
</syntaxhighlight>

Note that you can check any of the shell variables in ''Terminal'' at any time using the <code>echo</code> command. For example, to check the current <code>SAMPLEID</code>, type <code>echo "${SAMPLEID}"</code> and press Enter.

Also, notice that here we are only defining one sample to be analyzed, which is all we need for ''m6ANet'' or ''EpiNano-SVM''. ''EpiNano-Error'', however, requires two samples as input if you use it. For that analysis, I will leave defining the samples to be compared to the start of the ''EpiNano-Error'' section.
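
Because a typo in one of these paths tends to surface only later as a failed step, it can help to check the variables before starting a run. The following is a minimal sketch and not part of the original pipeline: the function names <code>check_paths</code> and <code>check_overwrite</code> are hypothetical, and the overwrite check assumes that each sample's output is written to a subfolder named after <code>SAMPLEID</code>.

```bash
# Hypothetical helper (an assumption, not part of the pipeline):
# for each variable NAME given as an argument, report whether it is unset,
# points to a missing directory, or points to an existing directory.
check_paths() {
    local name value status=0
    for name in "$@"; do
        # Only accept plain variable names, so the eval below stays safe.
        case "$name" in
            ''|*[!A-Za-z0-9_]*) echo "BADNAME  $name"; status=1; continue ;;
        esac
        # Look up the value of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "UNSET    $name"
            status=1
        elif [ ! -d "$value" ]; then
            echo "MISSING  $name=$value"
            status=1
        else
            echo "OK       $name=$value"
        fi
    done
    return "$status"
}

# Hypothetical helper: report whether BASE/SAMPLE already exists, which would
# mean a run with this SAMPLEID could overwrite earlier results.
# Assumes per-sample output lives in a subfolder named after the sample.
check_overwrite() {
    local dir="$1/$2"
    if [ -e "$dir" ]; then
        echo "EXISTS   $dir"
        return 1
    fi
    echo "FREE     $dir"
}
```

After defining the variables above, a call such as <code>check_paths GUPPY GLOGS REF NRAW NBASE NALI M6AO</code> prints one status line per variable, and <code>check_overwrite "$NALI" "$SAMPLEID"</code> reports whether a folder for the current sample already exists. Both functions only read the file system and never create or delete anything.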