Nanopore RNA Sequencing Protocol
====EpiNano====
''EpiNano'' is an algorithm written in Python that identifies RNA modifications in direct RNA sequencing reads, using the <code>.bam</code> files produced by ''Guppy'' and ''Minimap2''. It extracts a set of ‘features’ from the direct RNA sequencing reads, which are then used to predict whether a base-calling ‘error’ is caused by the presence of an RNA modification. ''EpiNano'' can be run in two ways: two samples (one containing a methylase knockdown or knockout) can be compared in a pairwise fashion, which the authors refer to as ''EpiNano-Error'', or a single sample can be compared to a pre-trained model, referred to as ''EpiNano-SVM''. Unfortunately, the models for ''EpiNano-SVM'' were trained on an old version of ''Guppy'' that is no longer supported, so we will use ''m6ANet'' for our m6A detection. I have left this portion of the protocol in, with all the code required to get ''EpiNano'' up and running, in case you have a reason to use it. '''If you plan on using only m6ANet, please skip this section.'''

=====Pre-requisites=====
''EpiNano'' relies upon a series of tools and software packages, both inside and outside of ''R''. We therefore need to check that all of these are installed before getting ''EpiNano'', and install any that are missing using either ''Conda'' or <code>install.packages()</code> (depending on the package). Many of the required packages are older versions of commonly used ones, although some of the newer versions do work with ''EpiNano''. I am therefore going to make a ''Conda'' environment specifically for ''EpiNano'' and switch to it before the install. That way, the older packages are installed in an ''EpiNano''-specific environment that can be loaded and used separately from our base ''Conda'' environment.
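Before creating the environment, it can be useful to check whether one named <code>epinano</code> is already present from an earlier attempt. The following is only a minimal sketch and not part of the original protocol: the helper name <code>env_listed</code> is hypothetical, and it assumes that the first column of the <code>conda env list</code> output is the environment name.

```shell
# Hypothetical helper (not part of Conda or EpiNano): reads the output of
# `conda env list` on stdin and succeeds if the named environment is listed.
# Assumes the environment name is the first whitespace-separated column.
env_listed() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Only query Conda if it is actually on the PATH.
if command -v conda >/dev/null 2>&1; then
    if conda env list | env_listed epinano; then
        echo "An environment named epinano already exists."
    else
        echo "No environment named epinano yet; safe to create it."
    fi
else
    echo "conda was not found on the PATH."
fi
```

If the environment already exists, you can simply activate it instead of creating it again.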
<code>Bash</code>
<syntaxhighlight lang="Bash">
conda create --name epinano
conda activate --stack epinano
</syntaxhighlight>

The text in parentheses next to your name in ''Terminal'' should now swap to <code>(epinano)</code>. To return to your base environment, simply type <code>conda activate --stack base</code> and execute. Note that at any time, you can see a list of installed ''Conda'' packages along with their versions via the following command:

<code>Bash</code>
<syntaxhighlight lang="Bash">
conda list
</syntaxhighlight>

If you just made a new environment named <code>epinano</code>, it should be empty. Let’s now start installing packages. I like to start with <code>scikit-learn</code>, since it will install a lot of the correct dependencies for the other packages. To install a specific version, add a double equals sign and the version number after the package name, such as <code>conda install scikit-learn==0.20.2</code>. It may take a while for ''Conda'' to find the package this way.

{{warning|''EpiNano'' 1.2 does '''NOT''' work with the latest versions of Python, scikit-learn, etc. Therefore, you will want to install the version of each package listed here, which can be done with <code>conda install package==version</code>. As packages install, pay attention to which packages might be upgraded or downgraded as you go along. If you do not, you might run into errors when trying to run ''EpiNano'' scripts because a package was changed without your knowledge!}}

{| class="wikitable"
|+EpiNano Conda Packages
|-
! scope="col"| Package
! scope="col"| Version
! scope="col"| Notes
|-
|biopython
|1.76
|
|-
|dask
|2.5.2
|
|-
|h5py
|2.8.0
|
|-
|java openjdk
|1.8.0
|This should already be installed via a previous step.
|-
|minimap2
|2.14-r886
|This should already be installed via a previous step.
|-
|nanopolish
|0.12.4
|
|-
|numpy
|1.15.4
|
|-
|pandas
|0.23.4
|
|-
|pysam
|0.15.3+
|
|-
|python
|3.6.7
|<b>Note:</b> The latest version of Python does not work with scikit-learn 0.20.2!
|-
|sam2tsv
|
|Included with the EpiNano repo at <code>EpiNano/misc/</code>.
|-
|samtools
|0.1.19
|This should already be installed via a previous step.
|-
|scikit-learn
|0.20.2
|<b>Note:</b> <code>Epinano_Predict.py</code> does not work with the latest version. You ''must'' install this one for EpiNano.
|}

Now let’s check the ''R'' packages. Boot up ''R'' in the terminal by typing <code>R</code>, and execute the following in the console to generate a list of the currently installed packages. If you do not have ''R'', you can install it [https://anaconda.org/r/r using Conda], or by following the instructions provided [https://www.r-project.org/ here].

<code>R</code>
<syntaxhighlight lang="R">
as.data.frame(installed.packages()[ , c(1, 3:4)])
</syntaxhighlight>

Cross-reference the list that prints with the following (again, sorted in alphabetical order here for you), and install any ''R'' packages that you do not have using the <code>install.packages()</code> command. I recommend starting with the <code>car</code> package, since it will install a lot of the others in the list automatically for you. I also had no issue installing the latest version of each package listed here, so you likely don’t need to install a specific version for ''EpiNano'' to run.

{{warning|Some of these packages may have issues installing on ''Linux'' if certain commands are unavailable to ''R''. For example, you may also need to install <code>curl</code> or <code>gfortran</code> in order to get the packages <code>tidyverse</code> and <code>car</code> to install. Pay attention to the output in the console and read the directions for the next steps if any fail (which will be apparent if you see the line <code>Installation of package had non-zero exit status</code> printed anywhere). After all the installations are done, you should recheck the list of installed packages (see the note following the table below).}}

{| class="wikitable"
|+EpiNano R Packages
|-
! scope="col"| R Package
! scope="col"| Version
! scope="col"| Notes
|-
|car
|3.0-3
|
|-
|dplyr
|1.0.1
|
|-
|forcats
|0.4.0
|
|-
|ggplot2
|3.1.1
|
|-
|ggrepel
|0.8.1
|
|-
|optparse
|1.6.6
|
|-
|outliers
|0.14
|
|-
|purrr
|0.3.2
|
|-
|readr
|1.3.1
|
|-
|reshape2
|1.4.3
|
|-
|stringr
|1.4.0
|
|-
|tibble
|3.0.3
|
|-
|tidyr
|0.8.3
|
|-
|tidyverse
|1.2.1
|
|}

After finishing, check that these packages actually installed by executing the <code>as.data.frame(installed.packages()[ , c(1, 3:4)])</code> command once again.

=====Installing EpiNano=====
Now that all the pre-requisites are checked, we can go ahead and install ''EpiNano'' itself. For simplicity, I will keep mine in the <code>~/Research</code> directory, so that everything I use for the nanopore is kept in one place. You can place it wherever you like, since we will define (or have defined) the file path to it in the console.

<code>Bash</code>
<syntaxhighlight lang="Bash">
cd ~/Research
git clone "https://github.com/novoalab/EpiNano.git"
</syntaxhighlight>

''EpiNano'' should now be installed. Note that a '''ReadMe''' for ''EpiNano'' is available [https://github.com/novoalab/EpiNano here], which goes over its requirements and how to execute it from the shell (note: this ReadMe is also included as a file in your ''EpiNano'' folder after you clone the repo). It also describes some of the arguments that you will include when you execute it later.

=====Preparing Reference for EpiNano=====
One final thing: we need to generate the reference files required to run ''EpiNano'' later. This can be done by navigating to the folder containing your genome assembly (mine is at <code>~/Research/Ref</code>) and running a few lines of code (if files ending in <code>.fa.fai</code> and <code>.fa.dict</code> already exist in that folder for your assembly, you can skip this step).
If you do not have a genome reference assembly yet, you may grab one from the [https://www.ncbi.nlm.nih.gov/projects/genome/guide/human/index.shtml NIH].

{{warning|The commands <code>faidx</code> and <code>picard</code> may not run if you don’t have the appropriate software installed. You’ll know whether each is installed when you try to execute it. If <code>faidx</code> is missing, you can install it with the command <code>sudo apt install python3-pyfaidx</code> (''Linux''). ''Picard'' will need to be downloaded from the [https://github.com/broadinstitute/picard/releases/tag/2.27.4 Picard GitHub releases page] (select the <code>picard.jar</code> file and place it in the directory you will run it from, which is <code>~/Research</code> in the code below).}}

<code>Bash</code>
<syntaxhighlight lang="Bash">
sudo apt install python3-pyfaidx
</syntaxhighlight>

Next, we need to make the appropriate sequence libraries. This can be done as follows. Keep in mind that your paths may be different from the ones listed here!

<code>Bash</code>
<syntaxhighlight lang="Bash">
# Build the FASTA index (.fa.fai) next to the reference.
# faidx may also print the sequences when no region is given, so discard that output.
cd ~/Research/Ref
faidx GRCh38.p13.genome.fa > /dev/null

# Build the sequence dictionary (.fa.dict) next to the same reference file.
cd ~/Research/
java -jar picard.jar CreateSequenceDictionary \
    -R ~/Research/Ref/GRCh38.p13.genome.fa \
    -O ~/Research/Ref/GRCh38.p13.genome.fa.dict
</syntaxhighlight>

{{tip|The <code>picard.jar</code> file needs to be in the current working directory for the above code to work. If it isn’t found, either place it in your ''Terminal'' working directory before attempting the code again, or switch your ''Terminal'' directory to the location of your <code>picard.jar</code> file with the <code>cd</code> command. Alternatively, you can make a shell shortcut (alias) to <code>picard.jar</code> by specifying its full path.}}
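The skip condition mentioned for this step (files ending in <code>.fa.fai</code> and <code>.fa.dict</code> already sitting next to your assembly) can be checked before running anything. This is a minimal sketch under stated assumptions, not part of the original protocol: <code>reference_is_prepared</code> is a hypothetical helper name, and the reference path is the one used in this section, so adjust it to your own.

```shell
# Hypothetical helper: succeeds only if both companion files exist and are
# non-empty next to the given FASTA file (<fasta>.fai and <fasta>.dict).
reference_is_prepared() {
    [ -s "$1.fai" ] && [ -s "$1.dict" ]
}

# Path used in this protocol; an assumption, change it to match your setup.
ref_fasta="$HOME/Research/Ref/GRCh38.p13.genome.fa"

if reference_is_prepared "$ref_fasta"; then
    echo "Index and dictionary found for $ref_fasta; this step can be skipped."
else
    echo "Index or dictionary missing for $ref_fasta; run the commands above."
fi
```

The check only confirms that the files are present and non-empty. It does not verify that they were built from the same version of the assembly.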