Nanopore RNA Sequencing Protocol

==Alignment (Minimap2)==

Next, let’s do the alignment. With the basecalling code above, ''Dorado'' should write the reads from all the <code>.fast5</code> or <code>.pod5</code> files to one large basecalled <code>.fastq</code> file. If you instead produced one <code>.fastq</code> file per input file, you can merge them like so:

<code>Bash</code>
<syntaxhighlight lang="Bash">
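# Merge the per-file FASTQ output into one file (skip if Dorado already wrote a single FASTQ)
cat "$NBASE/${SAMPLEID}"/*.fastq > "$NBASE/${SAMPLEID}/${SAMPLEID}_master.fastq"
# Create the per-sample alignment output directory (-p: no error if it already exists)
mkdir -p "$NALI/${SAMPLEID}"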
</syntaxhighlight>
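Before aligning, it can be worth checking that the merged file contains the number of reads you expect. The following is a minimal sketch, assuming the FASTQ uses the standard four-lines-per-record layout (no wrapped sequence or quality lines); <code>count_fastq_reads</code> is a hypothetical helper name, not part of any tool used in this protocol.

```shell
# Count reads in a FASTQ file, assuming exactly 4 lines per record.
# A non-integer result suggests a truncated or malformed file.
count_fastq_reads() {
    awk 'END { print NR / 4 }' "$1"
}

# Usage with the variables defined earlier in this protocol
MASTER="$NBASE/${SAMPLEID}/${SAMPLEID}_master.fastq"
if [ -f "$MASTER" ]; then
    echo "Reads in merged FASTQ: $(count_fastq_reads "$MASTER")"
fi
```

If the count is not a whole number, re-check the merge step before running the alignment.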

Now, let’s use ''Minimap2'' to start the alignment. If ''Minimap2'' isn’t installed via ''Conda'', you need to change into its directory first (<code>cd $MM2/minimap2</code>) and then replace <code>minimap2</code> with <code>./minimap2</code> in the command below. We will export the alignment as a <code>.sam</code> file first, and then use ''Samtools'' to convert it into a binary file (<code>.bam</code>). For ''m6ANet'', note that we need to map the reads to the transcriptome, so your <code>${GREFERENCEID}</code> should be defined accordingly above.

<code>Bash</code>
<syntaxhighlight lang="Bash">
# Align the merged reads to the reference and write an unsorted SAM file
minimap2 -ax map-ont -uf --secondary=no \
"$REF/${GREFERENCEID}" \
"$NBASE/${SAMPLEID}/${SAMPLEID}_master.fastq" > "$NALI/${SAMPLEID}/${SAMPLEID}_taligned_unsorted.sam"
</syntaxhighlight>

{{note|I used <code>_taligned</code> here because I was using the transcriptome as a reference. You can change this to whatever you wish, for example <code>_galigned</code> if using the genome. Just make sure you use the updated name later on when you call the file.}}
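If ''Minimap2'' was not installed via ''Conda'', an alternative to changing directories is to resolve the path to the executable once and reuse it. This is only a sketch: it assumes the local build sits at <code>$MM2/minimap2/minimap2</code>, as implied by the instructions above, and <code>resolve_minimap2</code> and <code>MINIMAP2</code> are hypothetical names introduced here for illustration.

```shell
# Resolve the minimap2 executable: prefer one on PATH (e.g. a Conda install),
# otherwise fall back to a local build assumed to be at $MM2/minimap2/minimap2.
resolve_minimap2() {
    if command -v minimap2 >/dev/null 2>&1; then
        command -v minimap2
    elif [ -x "${MM2:-}/minimap2/minimap2" ]; then
        printf '%s\n' "${MM2:-}/minimap2/minimap2"
    else
        return 1
    fi
}

# Store the resolved path, or warn if nothing was found
MINIMAP2=$(resolve_minimap2) || echo "minimap2 not found on PATH or under \$MM2/minimap2" >&2
```

You could then call <code>"$MINIMAP2"</code> in place of <code>minimap2</code> in the alignment command, without leaving your working directory.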

Keep in mind that <code>minimap2</code> takes a few arguments here, which are required for ''Nanopore'' mRNA reads. <code>-a</code> writes the output in SAM format (with CIGAR strings) instead of the default PAF, and <code>-x map-ont</code> selects the preset for long, noisy reads (~10% error rate) from an ''Oxford Nanopore'' device. <code>-uf</code> specifies that you want to use the transcript strand for finding canonical splicing sites GT-AG. <code>--secondary=no</code> specifies that you do not want to output secondary alignments.

Previously, with ''EpiNano'', I also aligned data to the genome using the arguments <code>-ax splice -uf -k14</code>, which is similar, but 1) <code>splice</code> selects the spliced alignment preset (genome alignment) and 2) <code>-k14</code> sets the k-mer (minimizer) length to 14, shorter than the preset default of 15, which increases sensitivity for noisy reads. The number of bases loaded into memory per mini-batch is controlled by a different option, the uppercase <code>-K</code>.
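For reference, the genome alignment described above could look roughly like the following. This is a sketch under assumptions: <code>GENOME_FASTA</code> (and its default file name <code>genome.fa</code>) is a hypothetical variable that is not defined elsewhere in this protocol, and the output uses the <code>_galigned</code> suffix suggested in the note above.

```shell
# GENOME_FASTA is a hypothetical variable (not defined in this protocol);
# point it at your genome FASTA file.
GENOME_FASTA="${GENOME_FASTA:-$REF/genome.fa}"
READS="$NBASE/${SAMPLEID}/${SAMPLEID}_master.fastq"
OUT_SAM="$NALI/${SAMPLEID}/${SAMPLEID}_galigned_unsorted.sam"

if command -v minimap2 >/dev/null 2>&1 && [ -f "$GENOME_FASTA" ] && [ -f "$READS" ]; then
    # splice: spliced long-read alignment preset
    # -uf:    look for GT-AG splice sites on the transcript strand only
    # -k14:   use 14-mer minimizers for noisy direct RNA reads
    minimap2 -ax splice -uf -k14 "$GENOME_FASTA" "$READS" > "$OUT_SAM"
else
    echo "Skipping genome alignment: minimap2, reference, or reads not found" >&2
fi
```

The guard only avoids writing an empty <code>.sam</code> file when an input is missing; the alignment arguments themselves are the ones listed above.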

You can remove these arguments or add new ones in this line as you wish, but only do so if you know exactly what you want ''Minimap2'' to do differently. More information about ''Minimap2'' arguments can be found in the [https://lh3.github.io/minimap2/minimap2.html Minimap2 documentation].
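Once the alignment has finished, a quick look at the <code>.sam</code> file can confirm that reads actually mapped before you convert it. The sketch below counts alignment records by reading the FLAG field (column 2), where bit 0x4 marks an unmapped read; <code>sam_mapping_summary</code> is a hypothetical helper name. Note that the mapped count is per alignment record, so it may include supplementary alignments and can exceed the number of reads.

```shell
# Summarise a SAM file: count mapped and unmapped alignment records.
# Bit 0x4 of the FLAG field (column 2) marks an unmapped read.
sam_mapping_summary() {
    awk -F '\t' '
        /^@/ { next }                              # skip header lines
        int($2 / 4) % 2 == 1 { unmapped++; next }  # FLAG bit 0x4 is set
        { mapped++ }
        END { printf "mapped=%d unmapped=%d\n", mapped, unmapped }
    ' "$1"
}

# Usage with the output file written by the alignment command above
SAM="$NALI/${SAMPLEID}/${SAMPLEID}_taligned_unsorted.sam"
if [ -f "$SAM" ]; then
    sam_mapping_summary "$SAM"
fi
```

If ''Samtools'' is available, <code>samtools flagstat</code> reports a more complete version of the same summary.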