Processing Oxford Nanopore Long Reads Using Amazon Web Services | Biomedical Chemistry: Research and Methods

Published: Dec 25, 2020

DOI: https://doi.org/10.18097/BMCRM00131

Keywords:

cloud computing; bioinformatics; sequencing; RNA; transcript; postgenomic technologies

V.V. Shapovalova

Center for Strategic Planning and Management of Medical and Biological Health Risks, 10 Pogodinskaya str., Moscow, 119121 Russia

S.P. Radko

Institute of Biomedical Chemistry, 10 Pogodinskaya str., Moscow, 119121 Russia

K.G. Ptitsyn

Institute of Biomedical Chemistry, 10 Pogodinskaya str., Moscow, 119121 Russia

G.S. Krasnov

Institute of Biomedical Chemistry, 10 Pogodinskaya str., Moscow, 119121 Russia

K.V. Nakhod

Institute of Biomedical Chemistry, 10 Pogodinskaya str., Moscow, 119121 Russia

O.S. Konash

Institute of Biomedical Chemistry, 10 Pogodinskaya str., Moscow, 119121 Russia

M.A. Vinogradina

Institute of Biomedical Chemistry, 10 Pogodinskaya str., Moscow, 119121 Russia

E.A. Ponomarenko

Institute of Biomedical Chemistry, 10 Pogodinskaya str., Moscow, 119121 Russia

D.S. Druzhilovskiy

Institute of Biomedical Chemistry, 10 Pogodinskaya str., Moscow, 119121 Russia

A.V. Lisitsa

Institute of Biomedical Chemistry, 10 Pogodinskaya str., Moscow, 119121 Russia; West Siberian Interregional Scientific and Educational Center, Tyumen State University, 6 Volodarsky str., Tyumen, 625003 Russia

Abstract

Genomes and transcriptomes are studied using sequencers that read the nucleotide sequence of genomic DNA, RNA, or complementary DNA (cDNA). The analysis consists of an experimental part (obtaining the primary data) and bioinformatic processing of those data. The bioinformatic part is run with different sets of input parameters, and selecting the optimal parameter values usually requires significant computing power. This article describes a protocol for processing transcriptome data on virtual machines provided by the Amazon Web Services (AWS) cloud platform, using as an example the recently emerged technology for sequencing long DNA and RNA molecules (Oxford Nanopore technology). As a result, a virtual machine and instructions for its use have been developed, allowing a wide range of molecular biologists to independently process results obtained with Oxford Nanopore sequencing.

How to Cite

Shapovalova, V., Radko, S., Ptitsyn, K., Krasnov, G., Nakhod, K., Konash, O., Vinogradina, M., Ponomarenko, E., Druzhilovskiy, D., & Lisitsa, A. (2020). Processing Oxford Nanopore Long Reads Using Amazon Web Services. Biomedical Chemistry: Research and Methods, 3(4), e00131. https://doi.org/10.18097/BMCRM00131

Issue

Vol. 3 No. 4 (2020)

Section

PROTOCOLS OF EXPERIMENTS, USEFUL MODELS, PROGRAMS AND SERVICES
