1 Introduction

1.1 Overview

1.1.1 Single-cell/single-nucleus RNA-seq technologies

Single-cell/single-nucleus RNA sequencing technologies (sc/snRNAseq) allow a detailed and comprehensive analysis of cellular processes in many sub-specialties of the life sciences (Alon et al. 2021; Birnbaum 2018; Davie et al. 2018; Soysa et al. 2019; Ding et al. 2020; Ji et al. 2019; Luecken and Theis 2019; Macosko et al. 2015; Majumdar et al. 2021; Mathys et al. 2019; Papalexi and Satija 2018; Polioudakis et al. 2019; Potter 2018; Ravenscroft et al. 2020; Shaw, Tian, and Xu 2021; Suva and Tirosh 2019; Tang et al. 2009; Wen and Tang 2016; Y. Zhang et al. 2018). Sc/snRNAseq involves the generation of single-cell or single-nucleus suspensions from tissues or cultured cells and the capture of RNA molecules from individual cells in small compartments. Within these compartments, the cells are lysed, and the RNA molecules are used to generate cDNA tagged with a DNA barcode. The barcode within each compartment is unique and identifies all cDNA generated from that compartment (Macosko et al. 2015; Nayak and Hasija 2021; Saliba et al. 2014; Tang et al. 2009; Zheng et al. 2017). This general theme has given rise to a variety of technologies that differ from each other in the method used to create compartments and in the target RNA molecules that are captured. Droplet-based techniques such as Drop-seq generate compartments using microfluidics technology, with gel beads delivering barcoded oligos and reverse transcription reagents. 10X Genomics adapted this technology in its proprietary Chromium Controller, which captures the 3’ ends of RNA molecules through oligo-dT-mediated reverse transcription (X. Zhang et al. 2019). SMART-seq (and the subsequent SMART-seq2) technology uses fluorescence-activated cell sorting (FACS) or the Fluidigm C1 platform to capture single cells in microwell plates and permits the detection of full-length transcripts (Picelli et al. 2014; Tang et al. 2009).
Massively parallel single-cell RNA-sequencing (MARS-seq) and Cell Expression by Linear amplification and Sequencing (CEL-seq) technologies also use FACS to capture individual cells in microwell plates (384-well plates) and provide cost-effective, customizable platforms for detecting RNA molecules (Keren-Shaul et al. 2019). The Seq-Well technology employs a portable microwell platform with gel-bead-mediated delivery of barcoded oligos to capture single cells and polyA-tailed transcripts (Gierahn et al. 2017). Single-nucleus sequencing technologies detect transcripts from the nuclei of individual cells and are especially useful when the initial sample is formalin-fixed or paraffin-embedded tissue (Bakken et al. 2018; Habib et al. 2017, 2016; Zheng et al. 2017). Single-cell combinatorial indexing (sci-) sequencing technology delivers barcode labels in situ to fixed cells or isolated nuclei, and the samples are then isolated for second-strand synthesis using either FACS or dilution. Readers are referred to detailed review and benchmarking studies of these methodologies elsewhere (Ding et al. 2020; Mereu et al. 2020).
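
The role of the compartment barcode described above can be illustrated with a toy example. This is only a sketch under stated assumptions: each read is assumed to begin with a fixed-length barcode followed by the cDNA sequence, and the 4-nt barcode length, the function name, and the read sequences are hypothetical and do not correspond to any specific protocol.

```python
from collections import defaultdict

# Hypothetical barcode length; real protocols use longer barcodes.
BARCODE_LENGTH = 4


def group_by_barcode(reads, barcode_length=BARCODE_LENGTH):
    """Split each read into (barcode, cDNA) and collect cDNA per barcode."""
    compartments = defaultdict(list)
    for read in reads:
        barcode = read[:barcode_length]
        cdna = read[barcode_length:]
        compartments[barcode].append(cdna)
    return dict(compartments)


# Toy reads: the first two share a barcode, so they are assigned
# to the same compartment (and hence the same cell).
toy_reads = ["AAACGGTTCA", "AAACTTGACC", "CCGTGGTTCA"]
grouped = group_by_barcode(toy_reads)
```

Reads that share a barcode end up in the same group, which is the property that later allows counts to be reported per cell.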

1.1.2 Alignment of sequencing reads

Following the initial barcoding and cDNA preparation, the samples are subjected to cDNA library preparation, which involves the addition of a ‘sequencing index’ in specific single-cell technologies described above, and to quality assessment of the libraries (Web Page 2021a). High-throughput sequencing of these libraries yields millions of reads that carry the barcode and the associated RNA sequence information. These reads can be processed using specialized software such as the Cell Ranger pipeline from 10X Genomics (Zheng et al. 2017) or using generic alignment and enumeration tools such as Kallisto, Salmon-Alevin, Bowtie, or STARsolo (Brüning et al. 2021; Du et al. 2020).

1.1.3 Post-alignment processing

Alignment and counting of the sequenced reads usually result in a gene-versus-barcode table with count data for each gene in each cell (the gene-count matrix). Each platform used for alignment and enumeration reports this data in a different format. The 10X Genomics Cell Ranger pipeline produces three files: the matrix of counts, the barcodes file, and the genes file, the latter two identifying the columns (cells) and rows (genes) of the matrix. Other methods produce the gene-count table as a dense matrix, which can be stored as comma-separated values (CSV) or tab-delimited text (.txt) files, or in the dgCMatrix format, a sparse-matrix class in R (Brüning et al. 2021; Du et al. 2020).
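
The three-file layout described above can be sketched as follows. The sketch assumes that the matrix file stores only non-zero entries as (gene index, barcode index, count) triplets with 1-based indices, and that the genes and barcodes files supply the row and column labels. The gene names, barcodes, and function name are hypothetical.

```python
def triplets_to_table(genes, barcodes, triplets):
    """Expand sparse (gene_idx, barcode_idx, count) triplets into a
    dense gene -> barcode -> count table. Indices are assumed 1-based."""
    table = {gene: {barcode: 0 for barcode in barcodes} for gene in genes}
    for gene_idx, barcode_idx, count in triplets:
        gene = genes[gene_idx - 1]
        barcode = barcodes[barcode_idx - 1]
        table[gene][barcode] = count
    return table


# Hypothetical contents of the three files.
genes = ["GeneA", "GeneB", "GeneC"]              # genes file
barcodes = ["AAAC", "CCGT"]                      # barcodes file
triplets = [(1, 1, 5), (3, 1, 2), (2, 2, 7)]     # matrix file, non-zero entries

table = triplets_to_table(genes, barcodes, triplets)
```

Because most genes are not detected in most cells, storing only the non-zero triplets is far more compact than the dense table, which is why sparse formats are common for this kind of data.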

Depending on the technology used, transcripts are counted using unique molecular identifiers (UMIs) or a read identity specific to the method (Mereu et al. 2020). The total number of UMIs or reads per cell helps identify outliers in the data. These outliers include cells that underwent poor lysis, cells that were partially lysed before the capture of the cDNA, and two or more cells captured within one compartment (doublets), which results in a substantially higher number of mRNA molecules from a single compartment (Luecken and Theis 2019; Mereu et al. 2020; Nayak and Hasija 2021).
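
As a minimal sketch of how per-cell totals can expose such outliers, the example below sums the counts for each barcode and flags cells whose total lies far from the median. The fold-change thresholds (0.25 and 4.0 times the median), the function name, and the toy counts are assumptions chosen for illustration; they are not recommended cut-offs.

```python
from statistics import median


def flag_outliers(totals, low_fold=0.25, high_fold=4.0):
    """Label each cell 'low', 'high', or 'ok' by comparing its total
    count with the median total. Thresholds are illustrative only."""
    med = median(totals.values())
    flags = {}
    for barcode, total in totals.items():
        if total < low_fold * med:
            flags[barcode] = "low"    # e.g. poorly or partially lysed cell
        elif total > high_fold * med:
            flags[barcode] = "high"   # e.g. possible doublet
        else:
            flags[barcode] = "ok"
    return flags


# Toy per-cell gene counts (barcode -> gene -> UMI count).
counts = {
    "cell1": {"GeneA": 40, "GeneB": 60},
    "cell2": {"GeneA": 55, "GeneB": 65},
    "cell3": {"GeneA": 50, "GeneB": 40},
    "cell4": {"GeneA": 3, "GeneB": 2},
    "cell5": {"GeneA": 300, "GeneB": 300},
}
totals = {barcode: sum(genes.values()) for barcode, genes in counts.items()}
flags = flag_outliers(totals)
```

In practice the thresholds are chosen after inspecting the distribution of totals for the data set at hand rather than fixed in advance.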

Except in the case of single-nucleus sequencing, RNA molecules encoded by nuclear genes as well as by cell organelles such as mitochondria and chloroplasts can be captured. A genome annotation that includes mitochondrial genes can be used to align and enumerate transcripts from both nuclear and mitochondrial genes. The abundance of mitochondrial (or other organelle) transcripts relative to that of nuclear-encoded transcripts is usually low across cell types and can therefore be used as a quality control metric for evaluating cellular integrity before sequencing (Osorio and Cai 2020).

The typical output of a single-cell sequencing method ranges from hundreds to thousands of cells. Many single-cell technologies also allow the sequential and repeated capture of more cells from a single sample source, resulting in millions of cells captured and sequenced (Cao et al. 2020, 2017; Mereu et al. 2020). With the thousands of genes that can potentially be expressed in each cell type, the gene-count matrix produced by each of these technologies is a voluminous table of millions of data points. These data points represent the gene-count information in a ‘multi-dimensional data plane’, and the data need to be projected onto a reduced-dimensional space to extract meaningful information (Becht et al. 2018). Before such dimensional reduction, the counts for each cell need to be normalized and, in several cases, scaled across cells (Hafemeister and Satija 2019; McCarthy et al. 2017; Satija et al. 2015; Vieth et al. 2019).

Apart from the count information, many of these experiments require additional information describing the source of the samples, experimental conditions, and treatments. This ‘meta’ information helps in evaluating the processed data and in extracting results such as the differential expression of genes under different conditions.
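
The mitochondrial quality control metric and the normalization step mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the procedure of any particular package: mitochondrial genes are assumed to be recognizable by an "MT-" name prefix, normalization is assumed to be a log transform of counts rescaled to a fixed total of 10,000 per cell, and the function names and toy counts are hypothetical.

```python
import math


def mito_fraction(cell_counts, mito_prefix="MT-"):
    """Fraction of a cell's counts that come from mitochondrial genes.
    The 'MT-' prefix is an assumed naming convention."""
    total = sum(cell_counts.values())
    if total == 0:
        return 0.0
    mito = sum(count for gene, count in cell_counts.items()
               if gene.startswith(mito_prefix))
    return mito / total


def log_normalize(cell_counts, scale_factor=10_000):
    """Rescale counts to a common per-cell total, then apply log(1 + x).
    The scale factor is an assumed, illustrative choice."""
    total = sum(cell_counts.values())
    if total == 0:
        return {gene: 0.0 for gene in cell_counts}
    return {gene: math.log1p(count / total * scale_factor)
            for gene, count in cell_counts.items()}


# Toy cells (gene -> count); values are made up for illustration.
cell_a = {"ACTB": 90, "GAPDH": 105, "MT-CO1": 5}
cell_b = {"ACTB": 30, "GAPDH": 20, "MT-CO1": 50}

fraction_a = mito_fraction(cell_a)   # low mitochondrial share
fraction_b = mito_fraction(cell_b)   # much higher share; would be inspected
normalized_a = log_normalize(cell_a)
```

Dividing by the per-cell total before the log transform makes cells with different sequencing depths comparable, which is the purpose of normalization before dimensional reduction.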

1.2 Seurat and Monocle

Several analytical pipelines have been developed for processing gene-count tables into manageable and minable datasets. At the time of writing, the most popular methods (in terms of the number of publications) for processing post-alignment data are Seurat and Monocle (Web Page 2021b; Vieth et al. 2019). Seurat is an R package that enables users to perform quality control, normalization, dimensionality reduction, and clustering of cells, among several other functionalities (Butler et al. 2018; Satija et al. 2015; Stuart et al. 2019; Vieth et al. 2019). These functionalities are implemented as versatile functions that are easy to learn and use for bioinformaticians and computational biologists across life sciences disciplines. Monocle is an R package that has also been used in a large number of studies to process and analyze single-cell data. In addition, Monocle provides a tool to analyze temporal changes in samples through pseudotime analysis (Qiu, Hill, et al. 2017; Qiu, Mao, et al. 2017; Trapnell et al. 2014). The details of the steps taken to perform normalization, data scaling, principal component analysis, dimensional reduction, and pseudotime analysis are well documented in several publications and dedicated web portals (Web Page, n.d.a, n.d.b).
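
To give a sense of what the principal component analysis step involves, the sketch below finds the first principal component of a small cells-by-genes table using power iteration on the covariance matrix. It is a generic textbook illustration and not the implementation used by Seurat or Monocle; the function names and the toy data are hypothetical.

```python
import math


def center_columns(rows):
    """Subtract each gene's (column's) mean across cells."""
    n = len(rows)
    means = [sum(column) / n for column in zip(*rows)]
    return [[value - mean for value, mean in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix between genes (columns)."""
    n = len(centered)
    dims = len(centered[0])
    return [[sum(row[i] * row[j] for row in centered) / (n - 1)
             for j in range(dims)] for i in range(dims)]


def first_principal_component(rows, iterations=100):
    """Approximate the leading eigenvector of the covariance matrix
    by repeated multiplication and renormalization (power iteration)."""
    cov = covariance(center_columns(rows))
    vector = [1.0] * len(cov)
    for _ in range(iterations):
        product = [sum(c * v for c, v in zip(cov_row, vector))
                   for cov_row in cov]
        norm = math.sqrt(sum(p * p for p in product))
        if norm == 0:
            break  # no variance, or start vector orthogonal to it
        vector = [p / norm for p in product]
    return vector


# Toy data: 4 cells x 2 genes, where gene 2 is always twice gene 1.
cells = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
pc1 = first_principal_component(cells)
```

In this toy table all variation lies along a single direction, so the first component points along it (gene 2 weighted twice as much as gene 1); real data sets spread their variation over many components, of which only the leading ones are usually retained.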

1.3 Need for Ryabhatta and Natian

However, for a large number of life sciences researchers, the analysis of single-cell sequencing data has been limited by the command-line nature of these tools. While the functions and steps used in Seurat and Monocle are easy to learn for researchers with an R background, the functions and data formats are complex enough to be a barrier to data exploration by non-computational scientists. Learning R and the tools associated with single-cell data analysis is the ultimate solution for researchers interested in long-term analysis of scRNAseq data, but exploring the use and scope of single-cell datasets should not be limited by computer literacy. To bring computational and non-computational scientists to a common table, where they can analyze the data and discuss the results of their analysis, we provide two graphical user interfaces (GUIs), ‘Ryabhatta’ and ‘Natian’. These GUIs are based on the Shiny package in R and are designed for users with no R background. Their installation and operation have been evaluated on Windows, macOS, and Linux operating systems. The inclusion of life sciences researchers with extensive experience in molecular and cellular biology, but limited R experience, in single-cell data analysis will provide novel biological insights from existing and future single-cell data sets.

References

Alon, S., D. R. Goodwin, A. Sinha, A. T. Wassie, F. Chen, E. R. Daugharthy, Y. Bando, et al. 2021. “Expansion Sequencing: Spatially Precise in Situ Transcriptomics in Intact Biological Systems.” Journal Article. Science 371 (6528). https://doi.org/10.1126/science.aax2656.

Bakken, T. E., R. D. Hodge, J. A. Miller, Z. Yao, T. N. Nguyen, B. Aevermann, E. Barkan, et al. 2018. “Single-Nucleus and Single-Cell Transcriptomes Compared in Matched Cortical Cell Types.” Journal Article. PLoS One 13 (12): e0209648. https://doi.org/10.1371/journal.pone.0209648.

Becht, E., L. McInnes, J. Healy, C. A. Dutertre, I. W. H. Kwok, L. G. Ng, F. Ginhoux, and E. W. Newell. 2018. “Dimensionality Reduction for Visualizing Single-Cell Data Using UMAP.” Journal Article. Nat Biotechnol. https://doi.org/10.1038/nbt.4314.

Birnbaum, K. D. 2018. “Power in Numbers: Single-Cell RNA-Seq Strategies to Dissect Complex Tissues.” Journal Article. Annu Rev Genet 52: 203–21. https://doi.org/10.1146/annurev-genet-120417-031247.

Brüning, Ralf Schulze, Lukas Tombor, Marcel H. Schulz, Stefanie Dimmeler, and David John. 2021. “Comparative Analysis of Common Alignment Tools for Single Cell RNA Sequencing.” Journal Article. bioRxiv, 2021.02.15.430948. https://doi.org/10.1101/2021.02.15.430948.

Butler, A., P. Hoffman, P. Smibert, E. Papalexi, and R. Satija. 2018. “Integrating Single-Cell Transcriptomic Data Across Different Conditions, Technologies, and Species.” Journal Article. Nat Biotechnol 36 (5): 411–20. https://doi.org/10.1038/nbt.4096.

Cao, J., D. R. O’Day, H. A. Pliner, P. D. Kingsley, M. Deng, R. M. Daza, M. A. Zager, et al. 2020. “A Human Cell Atlas of Fetal Gene Expression.” Journal Article. Science 370 (6518). https://doi.org/10.1126/science.aba7721.

Cao, J., J. S. Packer, V. Ramani, D. A. Cusanovich, C. Huynh, R. Daza, X. Qiu, et al. 2017. “Comprehensive Single-Cell Transcriptional Profiling of a Multicellular Organism.” Journal Article. Science 357 (6352): 661–67. https://doi.org/10.1126/science.aam8940.

Davie, K., J. Janssens, D. Koldere, M. De Waegeneer, U. Pech, L. Kreft, S. Aibar, et al. 2018. “A Single-Cell Transcriptome Atlas of the Aging Drosophila Brain.” Journal Article. Cell 174 (4): 982–998 e20. https://doi.org/10.1016/j.cell.2018.05.057.

Ding, J., X. Adiconis, S. K. Simmons, M. S. Kowalczyk, C. C. Hession, N. D. Marjanovic, T. K. Hughes, et al. 2020. “Systematic Comparison of Single-Cell and Single-Nucleus RNA-Sequencing Methods.” Journal Article. Nat Biotechnol 38 (6): 737–46. https://doi.org/10.1038/s41587-020-0465-8.

Du, Y., Q. Huang, C. Arisdakessian, and L. X. Garmire. 2020. “Evaluation of STAR and Kallisto on Single Cell RNA-Seq Data Alignment.” Journal Article. G3 (Bethesda) 10 (5): 1775–83. https://doi.org/10.1534/g3.120.401160.

Gierahn, T. M., M. H. Wadsworth II, T. K. Hughes, B. D. Bryson, A. Butler, R. Satija, S. Fortune, J. C. Love, and A. K. Shalek. 2017. “Seq-Well: Portable, Low-Cost RNA Sequencing of Single Cells at High Throughput.” Journal Article. Nat Methods 14 (4): 395–98. https://doi.org/10.1038/nmeth.4179.

Habib, N., I. Avraham-Davidi, A. Basu, T. Burks, K. Shekhar, M. Hofree, S. R. Choudhury, et al. 2017. “Massively Parallel Single-Nucleus RNA-Seq with DroNc-Seq.” Journal Article. Nat Methods 14 (10): 955–58. https://doi.org/10.1038/nmeth.4407.

Habib, N., Y. Li, M. Heidenreich, L. Swiech, I. Avraham-Davidi, J. J. Trombetta, C. Hession, F. Zhang, and A. Regev. 2016. “Div-Seq: Single-Nucleus RNA-Seq Reveals Dynamics of Rare Adult Newborn Neurons.” Journal Article. Science 353 (6302): 925–28. https://doi.org/10.1126/science.aad7038.

Hafemeister, C., and R. Satija. 2019. “Normalization and Variance Stabilization of Single-Cell RNA-Seq Data Using Regularized Negative Binomial Regression.” Journal Article. Genome Biol 20 (1): 296. https://doi.org/10.1186/s13059-019-1874-1.

Ji, Q., Y. Zheng, G. Zhang, Y. Hu, X. Fan, Y. Hou, L. Wen, et al. 2019. “Single-Cell RNA-Seq Analysis Reveals the Progression of Human Osteoarthritis.” Journal Article. Ann Rheum Dis 78 (1): 100–110. https://doi.org/10.1136/annrheumdis-2017-212863.

Keren-Shaul, H., E. Kenigsberg, D. A. Jaitin, E. David, F. Paul, A. Tanay, and I. Amit. 2019. “MARS-Seq2.0: An Experimental and Analytical Pipeline for Indexed Sorting Combined with Single-Cell RNA Sequencing.” Journal Article. Nat Protoc 14 (6): 1841–62. https://doi.org/10.1038/s41596-019-0164-4.

Luecken, M. D., and F. J. Theis. 2019. “Current Best Practices in Single-Cell RNA-Seq Analysis: A Tutorial.” Journal Article. Mol Syst Biol 15 (6): e8746. https://doi.org/10.15252/msb.20188746.

Macosko, E. Z., A. Basu, R. Satija, J. Nemesh, K. Shekhar, M. Goldman, I. Tirosh, et al. 2015. “Highly Parallel Genome-Wide Expression Profiling of Individual Cells Using Nanoliter Droplets.” Journal Article. Cell 161 (5): 1202–14. https://doi.org/10.1016/j.cell.2015.05.002.

Majumdar, U., S. Manivannan, M. Basu, Y. Ueyama, M. C. Blaser, E. Cameron, M. R. McDermott, et al. 2021. “Nitric Oxide Prevents Aortic Valve Calcification by S-Nitrosylation of USP9X to Activate NOTCH Signaling.” Journal Article. Sci Adv 7 (6). https://doi.org/10.1126/sciadv.abe3706.

Mathys, H., J. Davila-Velderrain, Z. Peng, F. Gao, S. Mohammadi, J. Z. Young, M. Menon, et al. 2019. “Single-Cell Transcriptomic Analysis of Alzheimer’s Disease.” Journal Article. Nature 570 (7761): 332–37. https://doi.org/10.1038/s41586-019-1195-2.

McCarthy, D. J., K. R. Campbell, A. T. Lun, and Q. F. Wills. 2017. “Scater: Pre-Processing, Quality Control, Normalization and Visualization of Single-Cell RNA-Seq Data in R.” Journal Article. Bioinformatics 33 (8): 1179–86. https://doi.org/10.1093/bioinformatics/btw777.

Mereu, E., A. Lafzi, C. Moutinho, C. Ziegenhain, D. J. McCarthy, A. Alvarez-Varela, E. Batlle, et al. 2020. “Benchmarking Single-Cell RNA-Sequencing Protocols for Cell Atlas Projects.” Journal Article. Nat Biotechnol 38 (6): 747–55. https://doi.org/10.1038/s41587-020-0469-4.

Nayak, R., and Y. Hasija. 2021. “A Hitchhiker’s Guide to Single-Cell Transcriptomics and Data Analysis Pipelines.” Journal Article. Genomics 113 (2): 606–19. https://doi.org/10.1016/j.ygeno.2021.01.007.

Osorio, D., and J. J. Cai. 2020. “Systematic Determination of the Mitochondrial Proportion in Human and Mice Tissues for Single-Cell RNA Sequencing Data Quality Control.” Journal Article. Bioinformatics. https://doi.org/10.1093/bioinformatics/btaa751.

Papalexi, E., and R. Satija. 2018. “Single-Cell RNA Sequencing to Explore Immune Cell Heterogeneity.” Journal Article. Nat Rev Immunol 18 (1): 35–45. https://doi.org/10.1038/nri.2017.76.

Picelli, S., O. R. Faridani, A. K. Bjorklund, G. Winberg, S. Sagasser, and R. Sandberg. 2014. “Full-Length RNA-Seq from Single Cells Using Smart-Seq2.” Journal Article. Nat Protoc 9 (1): 171–81. https://doi.org/10.1038/nprot.2014.006.

Polioudakis, D., L. de la Torre-Ubieta, J. Langerman, A. G. Elkins, X. Shi, J. L. Stein, C. K. Vuong, et al. 2019. “A Single-Cell Transcriptomic Atlas of Human Neocortical Development During Mid-Gestation.” Journal Article. Neuron 103 (5): 785–801 e8. https://doi.org/10.1016/j.neuron.2019.06.011.

Potter, S. S. 2018. “Single-Cell RNA Sequencing for the Study of Development, Physiology and Disease.” Journal Article. Nat Rev Nephrol 14 (8): 479–92. https://doi.org/10.1038/s41581-018-0021-7.

Qiu, X., A. Hill, J. Packer, D. Lin, Y. A. Ma, and C. Trapnell. 2017. “Single-Cell mRNA Quantification and Differential Analysis with Census.” Journal Article. Nat Methods 14 (3): 309–15. https://doi.org/10.1038/nmeth.4150.

Qiu, X., Q. Mao, Y. Tang, L. Wang, R. Chawla, H. A. Pliner, and C. Trapnell. 2017. “Reversed Graph Embedding Resolves Complex Single-Cell Trajectories.” Journal Article. Nat Methods 14 (10): 979–82. https://doi.org/10.1038/nmeth.4402.

Ravenscroft, T. A., J. Janssens, P. T. Lee, B. Tepe, P. C. Marcogliese, S. Makhzami, T. C. Holmes, S. Aerts, and H. J. Bellen. 2020. “Drosophila Voltage-Gated Sodium Channels Are Only Expressed in Active Neurons and Are Localized to Distal Axonal Initial Segment-Like Domains.” Journal Article. J Neurosci 40 (42): 7999–8024. https://doi.org/10.1523/JNEUROSCI.0142-20.2020.

Saliba, A. E., A. J. Westermann, S. A. Gorski, and J. Vogel. 2014. “Single-Cell RNA-Seq: Advances and Future Challenges.” Journal Article. Nucleic Acids Res 42 (14): 8845–60. https://doi.org/10.1093/nar/gku555.

Satija, R., J. A. Farrell, D. Gennert, A. F. Schier, and A. Regev. 2015. “Spatial Reconstruction of Single-Cell Gene Expression Data.” Journal Article. Nat Biotechnol 33 (5): 495–502. https://doi.org/10.1038/nbt.3192.

Shaw, R., X. Tian, and J. Xu. 2021. “Single-Cell Transcriptome Analysis in Plants: Advances and Challenges.” Journal Article. Mol Plant 14 (1): 115–26. https://doi.org/10.1016/j.molp.2020.10.012.

Soysa, T. Y. de, S. S. Ranade, S. Okawa, S. Ravichandran, Y. Huang, H. T. Salunga, A. Schricker, A. Del Sol, C. A. Gifford, and D. Srivastava. 2019. “Single-Cell Analysis of Cardiogenesis Reveals Basis for Organ-Level Developmental Defects.” Journal Article. Nature 572 (7767): 120–24. https://doi.org/10.1038/s41586-019-1414-x.

Stuart, T., A. Butler, P. Hoffman, C. Hafemeister, E. Papalexi, W. M. Mauck III, Y. Hao, M. Stoeckius, P. Smibert, and R. Satija. 2019. “Comprehensive Integration of Single-Cell Data.” Journal Article. Cell 177 (7): 1888–1902 e21. https://doi.org/10.1016/j.cell.2019.05.031.

Suva, M. L., and I. Tirosh. 2019. “Single-Cell RNA Sequencing in Cancer: Lessons Learned and Emerging Challenges.” Journal Article. Mol Cell 75 (1): 7–12. https://doi.org/10.1016/j.molcel.2019.05.003.

Tang, F., C. Barbacioru, Y. Wang, E. Nordman, C. Lee, N. Xu, X. Wang, et al. 2009. “mRNA-Seq Whole-Transcriptome Analysis of a Single Cell.” Journal Article. Nat Methods 6 (5): 377–82. https://doi.org/10.1038/nmeth.1315.

Trapnell, C., D. Cacchiarelli, J. Grimsby, P. Pokharel, S. Li, M. Morse, N. J. Lennon, K. J. Livak, T. S. Mikkelsen, and J. L. Rinn. 2014. “The Dynamics and Regulators of Cell Fate Decisions Are Revealed by Pseudotemporal Ordering of Single Cells.” Journal Article. Nat Biotechnol 32 (4): 381–86. https://doi.org/10.1038/nbt.2859.

Vieth, B., S. Parekh, C. Ziegenhain, W. Enard, and I. Hellmann. 2019. “A Systematic Evaluation of Single Cell RNA-Seq Analysis Pipelines.” Journal Article. Nat Commun 10 (1): 4667. https://doi.org/10.1038/s41467-019-12266-7.

Web Page. 2021a. https://www.illumina.com/content/dam/illumina-marketing/documents/products/other/single-cell-sequencing-ebook-770-2019-007.pdf.

———. 2021b. https://www.scrna-tools.org/.

———. n.d.a. https://satijalab.org/seurat/.

———. n.d.b. http://cole-trapnell-lab.github.io/monocle-release/docs/.

Wen, L., and F. Tang. 2016. “Single-Cell Sequencing in Stem Cell Biology.” Journal Article. Genome Biol 17: 71. https://doi.org/10.1186/s13059-016-0941-0.

Zhang, X., T. Li, F. Liu, Y. Chen, J. Yao, Z. Li, Y. Huang, and J. Wang. 2019. “Comparative Analysis of Droplet-Based Ultra-High-Throughput Single-Cell RNA-Seq Systems.” Journal Article. Mol Cell 73 (1): 130–142 e5. https://doi.org/10.1016/j.molcel.2018.10.020.

Zhang, Y., J. Gao, Y. Huang, and J. Wang. 2018. “Recent Developments in Single-Cell RNA-Seq of Microorganisms.” Journal Article. Biophys J 115 (2): 173–80. https://doi.org/10.1016/j.bpj.2018.06.008.

Zheng, G. X., J. M. Terry, P. Belgrader, P. Ryvkin, Z. W. Bent, R. Wilson, S. B. Ziraldo, et al. 2017. “Massively Parallel Digital Transcriptional Profiling of Single Cells.” Journal Article. Nat Commun 8: 14049. https://doi.org/10.1038/ncomms14049.