Nanopore RNA Sequencing Protocol
=Sequencing=
Next up, we need to basecall our raw data obtained from the ''flow cell'' and ''MinION'' device. To do so, we will use the standalone command-line version of ''Guppy''. '''Note that your file paths should already be defined''', as described in the section at the start of the article, and that this needs to be done each time you open a new ''Terminal'' window.

==Basecalling (Guppy)==
Navigate to the folder containing ''Guppy'' using the <code>cd</code> command (for example, mine is at <code>~/Research/ont-guppy-6.4.2/bin</code>, so I can navigate there using <code>cd ~/Research/ont-guppy-6.4.2/bin</code>, or by substituting the path with <code>$GUPPY</code>) and execute the following. Note that the last line of code includes a path to store the log of the ''Guppy'' run, which requires the <code>$GLOGS</code> folder to already exist.

{{warning|The line <code>-x auto \</code> is only valid for newer versions of ''Guppy'' that support CUDA GPUs. If you are basecalling with an old version of ''Guppy'', like 3.1.5, or you are on a machine that does not have a CUDA GPU (like a ''Mac''), you’ll want to execute the command without this line. You should also remove the line <code>--disable_qscore_filtering</code> with older versions, as this is a newer ''Guppy'' feature.}}

<code>Bash</code>
<syntaxhighlight lang="Bash">
$GUPPY/guppy_basecaller \
    --input_path $NRAW/${SAMPLEID} \
    --save_path $NBASE/${SAMPLEID} \
    --flowcell FLO-MIN106 \
    --kit SQK-RNA002 \
    --disable_qscore_filtering \
    -x auto \
    > $GLOGS/${SAMPLEID}_guppy${GUPPYVER}_$(date +"%Y%m%d%H%M%S").txt
</syntaxhighlight>

Note that the lines containing <code>--flowcell</code> and <code>--kit</code> can be replaced with a configuration (e.g. <code>--config rna_r9.4.1_70bps_hac.cfg</code>).
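
Because the basecalling command above depends on several variables and on the <code>$GLOGS</code> folder already existing, a quick check beforehand can save a failed run. The following is only a minimal sketch: <code>require_vars</code> is a hypothetical helper (not part of ''Guppy''), and the variable names are simply the ones used in the command above.

```bash
# Hypothetical helper (an assumption, not a Guppy feature): report every
# variable named in the argument list that is unset or empty.
require_vars() {
    missing=0
    for name in "$@"; do
        # Look up the value of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "Missing: \$$name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Check the variables used by the basecalling command; only if all are
# defined, create the log folder (does nothing if it already exists).
if require_vars GUPPY NRAW NBASE GLOGS SAMPLEID GUPPYVER; then
    mkdir -p "$GLOGS"
fi
```

If anything is reported as missing, define it as described at the start of the article before running ''Guppy''.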

<code>-x auto</code> specifies that you want to run the basecaller using the default GPU (remove this line if you want to use the CPU, or replace <code>auto</code> with <code>cuda:X</code>, where <code>X</code> is the identifier of the GPU you want to use). The last line just saves the output as a text file for logging, in case you ever need to reference it later, with the ID of your sample and the current date and time stamped in by the system. You can exclude this if you wish.

==Alignment (Minimap2)==
Next let’s do the alignment. The first thing I want to do is merge all the fastq files into one. I don’t think you need this step, but I wanted to avoid errors from trying to process them in sequence and/or having to use a <code>for</code> loop.

<code>Bash</code>
<syntaxhighlight lang="Bash">
# Merge all basecalled reads into one file (if re-running, delete the old master file first, since it also matches *.fastq)
cat $NBASE/${SAMPLEID}/*.fastq > $NBASE/${SAMPLEID}/${SAMPLEID}_master.fastq
# Create the alignment output folder for this sample (-p: no error if it already exists)
mkdir -p $NALI/${SAMPLEID}
</syntaxhighlight>

Now, let’s use ''Minimap2'' to start the alignment (if ''Minimap2'' isn’t installed via ''Conda'', you need to change directories first (to <code>$MM2/minimap2</code>) and then replace <code>minimap2</code> with <code>./minimap2</code>). We will export this as a <code>.sam</code> file first, and then use ''Samtools'' to convert it into a binary file (<code>.bam</code>). For ''m6ANet'', note that we need to map the reads to the transcriptome, so your <code>${GREFERENCEID}</code> should be defined accordingly above.

<code>Bash</code>
<syntaxhighlight lang="Bash">
minimap2 -ax map-ont -uf --secondary=no \
    $REF/${GREFERENCEID} \
    $NBASE/${SAMPLEID}/${SAMPLEID}_master.fastq > $NALI/${SAMPLEID}/${SAMPLEID}_taligned_unsorted.sam
</syntaxhighlight>

{{note|I used <code>_taligned</code> here because I was using the transcriptome as a reference.
You can change this to whatever you wish, for example <code>_galigned</code> if using the genome; just make sure you use the updated name later on when you call the file.}}

Keep in mind that <code>minimap2</code> here is taking a few arguments, which are required for ''Nanopore'' mRNA reads. <code>-a</code> specifies that the output should be in SAM format (with CIGAR strings), and <code>-x map-ont</code> selects the preset for long, noisy reads (~10% error rate) from an ''Oxford Nanopore'' device. <code>-uf</code> specifies that you want to use the transcript strand for finding canonical splicing sites GT-AG. <code>--secondary=no</code> specifies that you do not want to output secondary alignments.

Previously, with ''EpiNano'', I also aligned data to the genome using the arguments <code>-ax splice -uf -k14</code>, which is similar but 1) selects the spliced alignment preset (<code>splice</code>), used for mapping to the genome, and 2) sets the minimizer k-mer length to 14 (<code>-k14</code>). You can remove these arguments or add new ones in this line as you wish, but only do so if you know exactly what it is you want ''Minimap2'' to do differently. More information about ''Minimap2'' arguments can be found in the [https://lh3.github.io/minimap2/minimap2.html Minimap2 documentation].

==Formatting (Samtools)==
Next we need to convert the exported <code>.sam</code> file to binary (<code>.bam</code>), which is required for some of the steps we’re going to perform down the road for ''m6A detection''. Then, we will need to sort the <code>.bam</code> file (in other words, organize the mapped reads by reference sequence, such as chromosome, and position), and then index it by generating a <code>.bam.bai</code> file.

<code>Bash</code>
<syntaxhighlight lang="Bash">
cd $NALI/${SAMPLEID}
# Convert the SAM file to binary (BAM)
samtools view -bS ${SAMPLEID}_taligned_unsorted.sam > ${SAMPLEID}_taligned_unsorted.bam
# Sort the alignments by reference sequence and position
samtools sort ${SAMPLEID}_taligned_unsorted.bam -o ${SAMPLEID}_taligned.bam
# Index the sorted file
samtools index ${SAMPLEID}_taligned.bam ${SAMPLEID}_taligned.bam.bai
</syntaxhighlight>

===Isolate Chromosome===
{{note|This step is ''ONLY'' for alignments that have been done to the genome and for processing with ''EpiNano''. If you are using ''m6ANet'', you’ll want to skip this portion.}}

This will speed up the ''EpiNano'' processing step substantially, since it will not be analyzing ''all'' of your data at once. ''Note that we defined the chromosome we wanted to isolate at the start of the protocol'', with <code>${CHR}</code>. The commands below assume the sorted genome-aligned file is named <code>${SAMPLEID}_aligned.bam</code>; if you used a different suffix (such as <code>_galigned</code>), substitute it here.

<code>Bash</code>
<syntaxhighlight lang="Bash">
# Extract only the alignments on ${CHR} (requires the indexed, sorted file)
samtools view -b ${SAMPLEID}_aligned.bam ${CHR} > ${SAMPLEID}_${CHR}.bam
samtools index ${SAMPLEID}_${CHR}.bam ${SAMPLEID}_${CHR}.bam.bai
</syntaxhighlight>

===Check Read Count===
''Samtools'' can count the total number of reads in a <code>.bam</code> file via the command <code>view -c</code>. At this stage you should count the total number of reads in both the master aligned file and (if aligned to the genome) the individual chromosome you isolated. This can be done via the following:

<code>Bash</code>
<syntaxhighlight lang="Bash">
samtools view -c ${SAMPLEID}_aligned.bam
samtools view -c ${SAMPLEID}_${CHR}.bam
</syntaxhighlight>

Or, if mapped to the transcriptome:

<code>Bash</code>
<syntaxhighlight lang="Bash">
samtools view -c ${SAMPLEID}_taligned.bam
</syntaxhighlight>

For reference, the <code>FAM95931</code> sample (identified here by ''flow cell'' ID) we ran had about 1.9 million reads total and 106,000 reads across chromosome 19. This was using a standard ''MinION'' flow cell, ''not'' a ''Flongle''; a ''Flongle'' run should have about 120-200k reads.
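
To put the two counts in context, it can help to express the chromosome count as a share of the total. The following is a minimal sketch rather than part of the original protocol: <code>pct_of_total</code> is a hypothetical helper name, and the numbers are the approximate counts quoted above for the <code>FAM95931</code> sample.

```bash
# Hypothetical helper (an assumption, not a Samtools feature):
# print PART as a percentage of TOTAL, rounded to two decimals.
pct_of_total() {
    awk -v part="$1" -v total="$2" 'BEGIN {
        # Avoid dividing by zero if the total count is 0.
        if (total == 0) { print "NA"; exit }
        printf "%.2f\n", 100 * part / total
    }'
}

# Approximate counts quoted above: 106,000 reads on chromosome 19
# out of about 1.9 million reads in total.
pct_of_total 106000 1900000
```

In practice, the two arguments would come from the <code>samtools view -c</code> commands above, for example <code>pct_of_total "$(samtools view -c ${SAMPLEID}_${CHR}.bam)" "$(samtools view -c ${SAMPLEID}_aligned.bam)"</code>.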