10X Genomics Single Cell RNA Sequencing report

Customer Name	DEMO
Customer Institution	MIT
Customer Email	xxx@mit.edu
Project ID	2019A15xx

LC Sciences

1.Introduction

Single-cell genomic technologies have revolutionized the way scientists can interrogate heterogeneous tissues or rare subpopulations of cells. Single-cell RNA sequencing (scRNA-seq) has been at the forefront of this development, both in laboratory methods and in computational methods for robust downstream data analysis.

10x Genomics’ single-cell RNA-seq (scRNA-seq) technology, the Chromium™ Single Cell 3’ Solution, analyzes transcriptomes on a cell-by-cell basis, using microfluidic partitioning to capture single cells and prepare barcoded next-generation sequencing (NGS) cDNA libraries. Specifically, single cells, reverse transcription (RT) reagents, Gel Beads containing barcoded oligonucleotides, and oil are combined on a microfluidic chip to form reaction vesicles called Gel Beads in Emulsion, or GEMs. GEMs are formed in parallel within the microfluidic channels of the chip, allowing the user to process hundreds to tens of thousands of single cells in a single 7-minute Chromium™ Instrument run. Cells are loaded at a limiting dilution to maximize the number of GEMs containing a single cell, which keeps the doublet rate low while maintaining a high cell recovery rate of up to ~65%.

Each functional GEM contains a single cell, a single Gel Bead, and RT reagents. Within each GEM reaction vesicle, the cell is lysed, the Gel Bead is dissolved to release its identically barcoded RT oligonucleotides into solution, and polyadenylated mRNA is reverse transcribed. As a result, all cDNAs from a single cell carry the same barcode, allowing the sequencing reads to be mapped back to their cell of origin. The preparation of NGS libraries from these barcoded cDNAs is then carried out in a highly efficient bulk reaction.


2.Technical procedure

2.1 Experimental procedure

2.2 Bioinformatic analysis

See Appendix for more details.


3.Project information

3.1 Sample information

Species name：human

Latin name：Homo sapiens

Specimens：liver tissue

3.2 Database

Database	Web links	Version/date
Genome	ftp://ftp.ensembl.org/pub/release-96/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.toplevel.fa.gz	v96
Gene Ontology (GO)	http://geneontology.org	2016.04
KEGG	http://www.kegg.jp/kegg	2016.05

3.3 Bioinformatics software

Class	Software	Version
Fundamental analysis	CellRanger	5.0.1
Advanced analysis	Seurat	3.1.1
Plots	R	3.5.2


4.Analysis results

4.1 Cellranger_result

Overview of the single-cell RNA sequencing QC results reported by CellRanger (web_summary.html).

Document location：

summary/1_Cellranger_result/gex/web_summary.html

Sample	Number of Reads	Valid Barcodes	Sequencing Saturation	Q30 Bases in Barcode	Q30 Bases in RNA Read	Q30 Bases in UMI
con	638,901,019	97.4%	67.3%	93.7%	90.1%	92.4%
gex	278,686,771	98.1%	74.6%	96.9%	94.3%	96.4%

Document location：

summary/1_Cellranger_result/sample_sequence_stat.xlsx

Sample	Estimated Number of Cells	Mean Reads per Cell	Median Genes per Cell	Reads Mapped to Genome	Reads Mapped Confidently to Genome	Reads Mapped Confidently to Intergenic Regions	Reads Mapped Confidently to Intronic Regions	Reads Mapped Confidently to Exonic Regions	Reads Mapped Confidently to Transcriptome	Reads Mapped Antisense to Gene	Fraction Reads in Cells	Total Genes Detected
con	11,670	54,747	1,985	96.1%	91.1%	3.3%	35.1%	52.8%	49.3%	1.3%	94.7%	31,787
gex	6,296	44,264	1,529	97.1%	92.4%	2.2%	28.2%	62.0%	58.3%	1.0%	84.0%	26,403

Document location：

summary/1_Cellranger_result/sample_align_stat.xlsx

4.2 Cell filter & Cell clustering

4.2.1 Cell filter

Seurat makes it easy to explore QC metrics and filter cells. We can visualize gene and molecule (UMI) counts, plot their relationship, and exclude cells with a clear outlier number of detected genes as potential multiplets. The following criteria were used to filter cells:

1. Gene counts per cell: 500 to Inf.

2. UMI counts per cell < Inf.

3. Percentage of mitochondrial genes < 25% (customized).

4. DoubletFinder was used to detect doublets.

Sample	before_filter_cell_num	after_filter_cell_num	percent
gex	6296	5911	93.89%
con	11670	9923	85.03%

Document location：

src/summary_part/3_Cluster_result/sample_cells_filt_stat.xlsx
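The filter criteria listed in this section can be sketched in a few lines of Python. This is only an illustration of the logic, not the Seurat code used for this project: the `CellQC` record, the function names, and the treatment of the 500-gene bound as inclusive are assumptions, and the doublet flag is assumed to have been computed beforehand (for example by DoubletFinder). The UMI criterion imposes no finite bound here, so it is not checked.

```python
from dataclasses import dataclass


@dataclass
class CellQC:
    """Hypothetical per-cell QC record (names are illustrative)."""
    barcode: str
    n_genes: int       # number of genes detected in the cell
    n_umis: int        # total UMI count of the cell
    pct_mito: float    # percentage of UMIs from mitochondrial genes
    is_doublet: bool   # doublet call, assumed to be precomputed


def passes_filter(cell, min_genes=500, max_pct_mito=25.0):
    """Apply the gene-count, mitochondrial and doublet criteria.

    The lower gene bound is treated as inclusive (an assumption).
    """
    return (
        cell.n_genes >= min_genes
        and cell.pct_mito < max_pct_mito
        and not cell.is_doublet
    )


def filter_cells(cells, **thresholds):
    """Return the retained cells and the retained percentage."""
    kept = [c for c in cells if passes_filter(c, **thresholds)]
    percent = 100.0 * len(kept) / len(cells) if cells else 0.0
    return kept, percent
```

The retained percentage is simply the ratio of the two cell counts; for the gex sample in the table above, 100 × 5911 / 6296 gives approximately 93.89.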

Document location：

summary/3_Cluster_result/Basicinfo_nGene_nUMI_pMito.png

summary/3_Cluster_result/Filter_Basicinfo_nGene_nUMI_pMito.png

4.2.2 Cell clustering

Cell clustering contains the following steps:

1) Normalizing the data

After removing unwanted cells from the dataset, the next step is to normalize the data. By default, we employ the global-scaling normalization method “LogNormalize”, which normalizes the gene expression measurements for each cell by the total expression of that cell, multiplies the result by a scale factor, and log-transforms it.
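As a rough illustration of this normalization, the sketch below applies the LogNormalize formula to the raw counts of one cell. It assumes the scale factor of 10,000 that Seurat documents as its default and the natural-log `log1p` transform; whether this project changed the scale factor is not stated in the report, and the function name is hypothetical.

```python
import math


def log_normalize(counts, scale_factor=10_000):
    """LogNormalize-style transform for the raw UMI counts of one cell.

    Each count is divided by the cell's total count, multiplied by
    scale_factor (10,000 is assumed here), and passed through log(1 + x).
    """
    total = sum(counts)
    if total == 0:
        # A cell without any counts stays at zero expression.
        return [0.0 for _ in counts]
    return [math.log1p(c / total * scale_factor) for c in counts]


# Hypothetical cell with three genes and four UMIs in total.
normalized = log_normalize([0, 1, 3])
```

Because each cell is divided by its own total, two cells with the same relative expression profile but different sequencing depth receive the same normalized values.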

2) PCA (principal component analysis)

To overcome the extensive technical noise in any single gene for scRNA-seq data, Seurat clusters cells based on their PCA scores, with each PC essentially representing a ‘metagene’ that combines information across a correlated gene set.

3) Clustering cells

Seurat implements a graph-based clustering approach. Distances between cells are calculated from the previously identified PCs. Seurat's approach was heavily inspired by manuscripts that applied graph-based clustering to scRNA-seq data (SNN-Cliq) and to CyTOF data (PhenoGraph). Briefly, these methods embed cells in a graph structure, for example a K-nearest neighbor (KNN) graph with edges drawn between cells with similar gene expression patterns, and then attempt to partition this graph into highly interconnected quasi-cliques or communities. As in PhenoGraph, we first construct a KNN graph based on the Euclidean distance in PCA space, and refine the edge weights between any two cells based on the shared overlap in their local neighborhoods (Jaccard similarity). To cluster the cells, we apply a modularity optimization technique (SLM) to iteratively group cells together, with the goal of optimizing the standard modularity function.
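The graph-construction step can be illustrated with a small, dependency-free sketch. It builds a KNN graph from Euclidean distances and weights each edge by the Jaccard overlap of the two neighborhoods. The function names, the choice to include each cell in its own neighborhood, and the toy coordinates are assumptions made for illustration; this is not Seurat's implementation, and the modularity optimization (SLM) step is not shown.

```python
import math


def knn(points, k):
    """For each point, return the index set of its k nearest neighbours."""
    neighbours = []
    for i, p in enumerate(points):
        # Sort all other points by Euclidean distance (ties broken by index).
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbours.append({j for _, j in dists[:k]})
    return neighbours


def snn_edges(points, k):
    """Weight every KNN edge by the Jaccard overlap of both neighbourhoods."""
    nbrs = knn(points, k)
    # Assumption: a cell counts as a member of its own neighbourhood.
    hoods = [n | {i} for i, n in enumerate(nbrs)]
    edges = {}
    for i, n in enumerate(nbrs):
        for j in n:
            shared = len(hoods[i] & hoods[j])
            union = len(hoods[i] | hoods[j])
            edges[(min(i, j), max(i, j))] = shared / union
    return edges


# Toy "PCA coordinates": two well separated groups of three cells each.
pcs = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
graph = snn_edges(pcs, k=2)
```

In this toy input, edges only connect cells within the same group, so a community-detection step run on `graph` would have two obvious communities to recover.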

4) t-SNE (t-distributed Stochastic Neighbor Embedding) visualization

Seurat uses t-SNE as a powerful tool to visualize and explore these datasets. t-SNE aims to place cells with similar local neighborhoods in high-dimensional space close together in low-dimensional space.

Clusters	0	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18	19	20	21	22	23	Total
gex	22(0.37%)	2(0.03%)	6(0.10%)	5(0.08%)	8(0.14%)	1033(17.48%)	1028(17.39%)	13(0.22%)	883(14.94%)	844(14.28%)	675(11.42%)	174(2.94%)	5(0.08%)	0(0%)	408(6.90%)	287(4.86%)	251(4.25%)	17(0.29%)	87(1.47%)	81(1.37%)	41(0.69%)	2(0.03%)	33(0.56%)	6(0.10%)	5911(100%)
con	2044(20.60%)	1460(14.71%)	1445(14.56%)	1185(11.94%)	1077(10.85%)	11(0.11%)	0(0%)	1006(10.14%)	3(0.03%)	0(0%)	4(0.04%)	464(4.68%)	592(5.97%)	442(4.45%)	3(0.03%)	0(0%)	2(0.02%)	120(1.21%)	2(0.02%)	0(0%)	5(0.05%)	36(0.36%)	0(0%)	22(0.22%)	9923(100%)

Document location：

summary/3_Cluster_result/sample_cell_cluster_stat.xlsx

Document location：

summary/3_Cluster_result/sample_cell_cluster_stat.png

summary/3_Cluster_result/cluster_sample_stat.png

Document location：

summary/3_Cluster_result/tsne.png

Document location：

summary/3_Cluster_result/tsne.sample_split.png

4.3 Differentially expressed genes (up-regulation) analysis per cluster

4.3.1 Differentially expressed genes analysis

We used a likelihood-ratio test (bimod) to identify genes differentially expressed in a single cluster compared with all other cells. Differentially expressed genes were identified using the following criteria:

1) p_value ≤ 0.01

2) log2FC ≥ 0.26, where log2FC is the log2 fold change of the average expression between the two groups.

3) The gene is detected in more than 10% of the cells in the cluster of interest.
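The three criteria above amount to a simple filter on per-gene statistics, sketched below. The function names are hypothetical, and the fold change is computed as a plain ratio of the two group means, which is an assumption: the pipeline may add a pseudocount or work on a different scale, so values reported elsewhere in this document may differ slightly.

```python
import math


def log2_fold_change(target_mean, other_mean):
    """log2 ratio of mean expression in the target cluster vs. all other cells.

    Assumes both means are positive and that no pseudocount is added.
    """
    return math.log2(target_mean / other_mean)


def is_marker(p_value, log2fc, pct_in_cluster,
              max_p=0.01, min_log2fc=0.26, min_pct=0.10):
    """Return True if a gene meets all three marker criteria.

    pct_in_cluster is the fraction (0 to 1) of cells in the target cluster
    in which the gene is detected.
    """
    return (
        p_value <= max_p
        and log2fc >= min_log2fc
        and pct_in_cluster > min_pct
    )


# Hypothetical gene: 8.0 mean expression in the cluster, 2.0 elsewhere.
fc = log2_fold_change(8.0, 2.0)
keep = is_marker(p_value=0.001, log2fc=fc, pct_in_cluster=0.35)
```

A log2FC threshold of 0.26 corresponds to roughly a 1.2-fold difference in average expression, since 2^0.26 ≈ 1.2.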

Cluster	0	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18	19	20	21	22	23
AL627309.1	0.00	0.00	0.00	0.01	0.01	0.01	0.00	0.01	0	0	0.01	0.01	0.01	0.00	0.01	0	0	0	0	0	0	0	0	0
CICP27	0.00	0.00	0.00	0.00	0.01	0	0	0.00	0.00	0	0	0	0.00	0.00	0	0.03	0	0.01	0	0	0	0	0	0
AL627309.6	0.00	0.03	0	0.05	0	0.01	0.07	0	0.01	0.17	0.00	0	0	0.00	0	0.01	0	0	0.03	0	0	0	0.01	0
AL627309.7	0	0.00	0	0	0	0	0.00	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0
AL627309.5	0.01	0.13	0.00	0.16	0.01	0.01	0.04	0.03	0.01	0.06	0.01	0.02	0.00	0.07	0.01	0	0.04	0.04	0.02	0.02	0.02	0.01	0	0
AL627309.4	0	0.00	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0
WASH9P	0	0	0	0	0	0.00	0.00	0	0	0.00	0	0	0.00	0	0	0	0	0	0	0.01	0	0	0	0
AP006222.1	0.01	0.01	0.01	0.01	0.02	0.01	0.00	0.01	0	0.01	0.01	0.01	0.01	0.01	0	0.01	0	0	0	0	0	0	0	0
AL732372.2	0.00	0.01	0.01	0.02	0.00	0	0.01	0.00	0.00	0.01	0	0.01	0.00	0.01	0	0	0.01	0	0	0	0.04	0	0	0
AL732372.3	0	0	0	0.00	0	0	0	0	0	0	0	0	0	0.00	0	0.01	0	0	0	0	0	0	0	0

Document location：

summary/4_Markergene_result/all_cluster_gene_avg_exp.xlsx

Target_Cluster	GeneID	GeneName	Target_Cluster_mean	Other_Cluster_mean	log2FC	Pvalue	Qvalue	Description	GO	KEGG	KO_ENTRY	EC
0	ENSG00000168685	IL7R	12.52	6.24	1.01	0	0	interleukin 7 receptor [Source:HGNC Symbol;Acc:HGNC:6024]	GO:0000018(regulation of DNA recombination);GO:0000902(cell morphogenesis);GO:0001915(negative regulation of T cell mediated cytotoxicity);GO:0002377(immunoglobulin production);GO:0003823(antigen binding);GO:0004896(cytokine receptor activity);GO:0004917(interleukin-7 receptor activity);GO:0005515(protein binding);GO:0005576(extracellular region);GO:0005886(plasma membrane);GO:0006955(immune response);GO:0007165(signal transduction);GO:0007166(cell surface receptor signaling pathway);GO:0008284(positive regulation of cell population proliferation);GO:0008361(regulation of cell size);GO:0009897(external side of plasma membrane);GO:0010628(positive regulation of gene expression);GO:0016020(membrane);GO:0016021(integral component of membrane);GO:0019221(cytokine-mediated signaling pathway);GO:0030217(T cell differentiation);GO:0030665(clathrin-coated vesicle membrane);GO:0033089(positive regulation of T cell differentiation in thymus);GO:0038111(interleukin-7-mediated signaling pathway);GO:0042100(B cell proliferation);GO:0048535(lymph node development);GO:0048872(homeostasis of number of cells);GO:0061024(membrane organization);GO:0070233(negative regulation of T cell apoptotic process);GO:1904894(positive regulation of receptor signaling pathway via STAT)	04060(Cytokine-cytokine receptor interaction);04068(FoxO signaling pathway);04151(PI3K-Akt signaling pathway);04630(Jak-STAT signaling pathway);04640(Hematopoietic cell lineage);05200(Pathways in cancer);05340(Primary immunodeficiency)	K05072	NA
0	ENSG00000227507	LTB	16.19	8.63	0.91	0	0	lymphotoxin beta [Source:HGNC Symbol;Acc:HGNC:6711]	GO:0005102(signaling receptor binding);GO:0005125(cytokine activity);GO:0005164(tumor necrosis factor receptor binding);GO:0005515(protein binding);GO:0005575(cellular_component);GO:0005615(extracellular space);GO:0005886(plasma membrane);GO:0006955(immune response);GO:0007165(signal transduction);GO:0007267(cell-cell signaling);GO:0010467(gene expression);GO:0010469(regulation of signaling receptor activity);GO:0016020(membrane);GO:0016021(integral component of membrane);GO:0033209(tumor necrosis factor-mediated signaling pathway);GO:0043588(skin development);GO:0045084(positive regulation of interleukin-12 biosynthetic process);GO:0048535(lymph node development)	04060(Cytokine-cytokine receptor interaction);04064(NF-kappa B signaling pathway);05323(Rheumatoid arthritis)	K03157	NA
0	ENSG00000277734	TRAC	8.53	4.77	0.84	0	0	T cell receptor alpha constant [Source:HGNC Symbol;Acc:HGNC:12029]	NA	NA	NA	NA
0	ENSG00000142541	RPL13A	39.76	23.28	0.77	0	0	ribosomal protein L13a [Source:HGNC Symbol;Acc:HGNC:10304]	GO:0000184(nuclear-transcribed mRNA catabolic process, nonsense-mediated decay);GO:0003723(RNA binding);GO:0003729(mRNA binding);GO:0003735(structural constituent of ribosome);GO:0005634(nucleus);GO:0005730(nucleolus);GO:0005737(cytoplasm);GO:0005829(cytosol);GO:0005840(ribosome);GO:0005925(focal adhesion);GO:0006412(translation);GO:0006413(translational initiation);GO:0006417(regulation of translation);GO:0006614(SRP-dependent cotranslational protein targeting to membrane);GO:0015934(large ribosomal subunit);GO:0016020(membrane);GO:0017148(negative regulation of translation);GO:0019083(viral transcription);GO:0022625(cytosolic large ribosomal subunit);GO:0071346(cellular response to interferon-gamma);GO:0097452(GAIT complex);GO:1901194(negative regulation of formation of translation preinitiation complex);GO:1990904(ribonucleoprotein complex)	03010(Ribosome)	K02872	NA
0	ENSG00000127152	BCL11B	4.75	2.82	0.75	0	0	BCL11B, BAF complex component [Source:HGNC Symbol;Acc:HGNC:13222]	GO:0000978(RNA polymerase II proximal promoter sequence-specific DNA binding);GO:0000981(DNA-binding transcription factor activity, RNA polymerase II-specific);GO:0001228(DNA-binding transcription activator activity, RNA polymerase II-specific);GO:0003334(keratinocyte development);GO:0003382(epithelial cell morphogenesis);GO:0003676(nucleic acid binding);GO:0003700(DNA-binding transcription factor activity);GO:0005515(protein binding);GO:0005634(nucleus);GO:0006355(regulation of transcription, DNA-templated);GO:0007409(axonogenesis);GO:0008285(negative regulation of cell population proliferation);GO:0009791(post-embryonic development);GO:0010468(regulation of gene expression);GO:0010837(regulation of keratinocyte proliferation);GO:0019216(regulation of lipid metabolic process);GO:0021773(striatal medium spiny neuron differentiation);GO:0021902(commitment of neuronal cell to specific neuron type in forebrain);GO:0021953(central nervous system neuron differentiation);GO:0031077(post-embryonic camera-type eye development);GO:0033077(T cell differentiation in thymus);GO:0033153(T cell receptor V(D)J recombination);GO:0035701(hematopoietic stem cell migration);GO:0042475(odontogenesis of dentin-containing tooth);GO:0043005(neuron projection);GO:0043066(negative regulation of apoptotic process);GO:0043368(positive T cell selection);GO:0043565(sequence-specific DNA binding);GO:0043588(skin development);GO:0045664(regulation of neuron differentiation);GO:0045944(positive regulation of transcription by RNA polymerase II);GO:0046632(alpha-beta T cell differentiation);GO:0046872(metal ion binding);GO:0048538(thymus development);GO:0071678(olfactory bulb axon guidance);GO:0097535(lymphoid lineage cell migration into thymus)	05202(Transcriptional misregulation in cancer)	K22046	NA
0	ENSG00000167286	CD3D	5.87	3.55	0.73	0	0	CD3d molecule [Source:HGNC Symbol;Acc:HGNC:1673]	GO:0002250(adaptive immune response);GO:0002376(immune system process);GO:0003713(transcription coactivator activity);GO:0004888(transmembrane signaling receptor activity);GO:0005737(cytoplasm);GO:0005886(plasma membrane);GO:0007166(cell surface receptor signaling pathway);GO:0009897(external side of plasma membrane);GO:0016020(membrane);GO:0016021(integral component of membrane);GO:0030217(T cell differentiation);GO:0030665(clathrin-coated vesicle membrane);GO:0042101(T cell receptor complex);GO:0042105(alpha-beta T cell receptor complex);GO:0042803(protein homodimerization activity);GO:0045059(positive thymic T cell selection);GO:0045944(positive regulation of transcription by RNA polymerase II);GO:0046982(protein heterodimerization activity);GO:0050776(regulation of immune response);GO:0050852(T cell receptor signaling pathway);GO:0051260(protein homooligomerization);GO:0061024(membrane organization)	04640(Hematopoietic cell lineage);04658(Th1 and Th2 cell differentiation);04659(Th17 cell differentiation);04660(T cell receptor signaling pathway);05142(Chagas disease (American trypanosomiasis));05162(Measles);05166(Human T-cell leukemia virus 1 infection);05169(Epstein-Barr virus infection);05170(Human immunodeficiency virus 1 infection);05340(Primary immunodeficiency)	NA	NA
0	ENSG00000008988	RPS20	16.86	10.27	0.72	0	0	ribosomal protein S20 [Source:HGNC Symbol;Acc:HGNC:10405]	GO:0000184(nuclear-transcribed mRNA catabolic process, nonsense-mediated decay);GO:0003723(RNA binding);GO:0003735(structural constituent of ribosome);GO:0005515(protein binding);GO:0005622(intracellular);GO:0005654(nucleoplasm);GO:0005737(cytoplasm);GO:0005829(cytosol);GO:0005840(ribosome);GO:0006412(translation);GO:0006413(translational initiation);GO:0006614(SRP-dependent cotranslational protein targeting to membrane);GO:0015935(small ribosomal subunit);GO:0016020(membrane);GO:0019083(viral transcription);GO:0022627(cytosolic small ribosomal subunit);GO:0070062(extracellular exosome)	03010(Ribosome)	K02969	NA
0	ENSG00000129824	RPS4Y1	8.00	4.95	0.69	0	0	ribosomal protein S4 Y-linked 1 [Source:HGNC Symbol;Acc:HGNC:10425]	GO:0000184(nuclear-transcribed mRNA catabolic process, nonsense-mediated decay);GO:0003723(RNA binding);GO:0003735(structural constituent of ribosome);GO:0005622(intracellular);GO:0005634(nucleus);GO:0005654(nucleoplasm);GO:0005829(cytosol);GO:0005840(ribosome);GO:0005844(polysome);GO:0006412(translation);GO:0006413(translational initiation);GO:0006614(SRP-dependent cotranslational protein targeting to membrane);GO:0007275(multicellular organism development);GO:0016020(membrane);GO:0019083(viral transcription);GO:0019843(rRNA binding);GO:0022627(cytosolic small ribosomal subunit)	03010(Ribosome)	K02987	NA
0	ENSG00000179144	GIMAP7	5.42	3.38	0.68	0	0	GTPase, IMAP family member 7 [Source:HGNC Symbol;Acc:HGNC:22404]	GO:0000166(nucleotide binding);GO:0003924(GTPase activity);GO:0005515(protein binding);GO:0005525(GTP binding);GO:0005737(cytoplasm);GO:0005783(endoplasmic reticulum);GO:0005794(Golgi apparatus);GO:0005811(lipid droplet);GO:0005829(cytosol);GO:0042802(identical protein binding);GO:0042803(protein homodimerization activity);GO:0043231(intracellular membrane-bounded organelle);GO:0046039(GTP metabolic process)	NA	NA	NA
0	ENSG00000160654	CD3G	3.93	2.46	0.68	0	0	CD3g molecule [Source:HGNC Symbol;Acc:HGNC:1675]	GO:0002250(adaptive immune response);GO:0002376(immune system process);GO:0004888(transmembrane signaling receptor activity);GO:0005886(plasma membrane);GO:0005887(integral component of plasma membrane);GO:0007163(establishment or maintenance of cell polarity);GO:0007166(cell surface receptor signaling pathway);GO:0009897(external side of plasma membrane);GO:0015031(protein transport);GO:0016020(membrane);GO:0016021(integral component of membrane);GO:0030159(receptor signaling complex scaffold activity);GO:0030217(T cell differentiation);GO:0030665(clathrin-coated vesicle membrane);GO:0038096(Fc-gamma receptor signaling pathway involved in phagocytosis);GO:0042101(T cell receptor complex);GO:0042105(alpha-beta T cell receptor complex);GO:0042110(T cell activation);GO:0042608(T cell receptor binding);GO:0042803(protein homodimerization activity);GO:0045059(positive thymic T cell selection);GO:0046982(protein heterodimerization activity);GO:0050776(regulation of immune response);GO:0050852(T cell receptor signaling pathway);GO:0051260(protein homooligomerization);GO:0061024(membrane organization);GO:0065003(protein-containing complex assembly);GO:0070228(regulation of lymphocyte apoptotic process)	04640(Hematopoietic cell lineage);04658(Th1 and Th2 cell differentiation);04659(Th17 cell differentiation);04660(T cell receptor signaling pathway);05142(Chagas disease (American trypanosomiasis));05162(Measles);05166(Human T-cell leukemia virus 1 infection);05169(Epstein-Barr virus infection);05170(Human immunodeficiency virus 1 infection)	NA	NA

Document location:

summary/4_Markergene_result/Markergene_list.xlsx

Clusters	0	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18	19	20	21	22	23
UP_gene_number	196	662	207	655	181	254	752	370	777	772	293	281	544	575	408	383	466	771	728	503	419	1013	240	634

Document location:

summary/4_Markergene_result/Markergene_stat.xlsx

Document location:

summary/4_Markergene_result/Markergene_stat.png

The expression distributions of the top 10 marker genes of each cluster were then visualized using heat maps, t-SNE plots, dot (bubble) plots, and ridge plots.

Document location:

summary/4_Markergene_result/all_cluster_markers_heatmap.png

Document location:

summary/4_Markergene_result/top10_marker_plot/cluster0_top10_marker.png

Document location：

summary/4_Markergene_result/top10_marker_plot/cluster0_top10_marker_tsne.png

Document location：

summary/4_Markergene_result/top10_marker_plot/cluster0_top10_marker_dotplot.png

Document location：

summary/4_Markergene_result/top10_marker_plot/cluster0_top10_marker_RidgePlot.png

4.3.2 Differentially expressed genes GO enrichment analysis

Gene Ontology (GO) is an international standardized gene functional classification system that offers a dynamically updated controlled vocabulary and strictly defined concepts to comprehensively describe the properties of genes and their products in any organism. GO has three ontologies: molecular function, cellular component, and biological process. The basic unit of GO is the GO term, and each GO term belongs to one of these ontologies.

GO enrichment analysis identifies all GO terms that are significantly enriched in the differentially expressed genes compared with the genome background, and selects the differentially expressed genes that correspond to specific biological functions. First, all differentially expressed genes were mapped to GO terms in the Gene Ontology database (http://www.geneontology.org/) and the number of genes was counted for every term. Significantly enriched GO terms were then determined with a hypergeometric test. The p-value is calculated as:

$$P = 1 - \sum_{i=0}^{m-1} \frac{\binom{M}{i}\binom{N-M}{n-i}}{\binom{N}{n}}$$

Here N is the number of all genes with GO annotation; n is the number of differentially expressed genes in N; M is the number of all genes annotated to a given GO term; m is the number of differentially expressed genes in M. A p-value ≤ 0.05 was used as the threshold, and GO terms meeting this condition were defined as significantly enriched in the differentially expressed genes. This analysis identifies the main biological functions associated with the differentially expressed genes.
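The hypergeometric test described here can be sketched with the Python standard library. The function below computes the upper-tail probability of observing at least m differentially expressed genes in a term, using the symbols N, n, M and m as defined above. The function name and the example numbers are hypothetical, and the pipeline's own implementation (including any multiple-testing correction) may differ.

```python
from math import comb


def enrichment_p_value(N, n, M, m):
    """Upper-tail hypergeometric probability P(X >= m).

    N: all annotated genes (background)
    n: differentially expressed genes among the N
    M: genes annotated to the term of interest
    m: differentially expressed genes annotated to the term
    """
    upper = min(n, M)
    # Sum exact integer counts first, then divide once to limit rounding error.
    tail = sum(comb(M, i) * comb(N - M, n - i) for i in range(m, upper + 1))
    return tail / comb(N, n)


# Hypothetical example: 10 background genes, 3 of them differentially
# expressed, 4 annotated to the term, 2 of the 3 fall in the term.
p = enrichment_p_value(N=10, n=3, M=4, m=2)
```

Summing the upper tail from m to min(n, M) is equivalent to one minus the cumulative probability up to m - 1, and it avoids subtracting two nearly equal numbers when the p-value is very small.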

Document location：

summary/5_Enrichment_result/1_GO_Enrichment

4.3.3 Differentially expressed genes pathway enrichment analysis

Genes usually interact with each other to carry out biological functions, and pathway-based analysis helps to further understand these functions. KEGG is the major public pathway-related database. Pathway enrichment analysis identifies metabolic pathways or signal transduction pathways that are significantly enriched in the differentially expressed genes compared with the whole-genome background. The p-value is calculated with the same hypergeometric formula as in the GO analysis:

$$P = 1 - \sum_{i=0}^{m-1} \frac{\binom{M}{i}\binom{N-M}{n-i}}{\binom{N}{n}}$$

Here N is the number of all transcripts with KEGG annotation, n is the number of differentially expressed genes in N, M is the number of all transcripts annotated to a specific pathway, and m is the number of differentially expressed genes in M. A p-value ≤ 0.05 was used as the threshold, and pathways meeting this condition were defined as significantly enriched in the differentially expressed genes.

Document location：

summary/5_Enrichment_result/2_KEGG_Enrichment

4.4 Cell annotation (auto by SingleR)

SingleR performs unbiased cell type recognition from single-cell RNA sequencing data, by leveraging reference transcriptomic datasets of pure cell types to infer the cell of origin of each single cell independently.

Document location：

summary/6_SingleR_result/cell_annot_HPCA_label.main_cell_pie.png

Document location：

summary/6_SingleR_result/cell_annot_HPCA_label.main_cell_tsne.png

Notice:

The cell types included in the HPCA (Human Primary Cell Atlas) reference used by SingleR are limited. Other tools or methods (for example marker gene database-based, correlation-based, or supervised classification-based approaches) are recommended for identifying tissue-specific cell types or cell subtypes.


5.References

[1] CellRanger：http://support.10xgenomics.com/single-cell/software/overview/welcome.

[2] Seurat：Butler A , Hoffman P , Smibert P , et al. Integrating single-cell transcriptomic data across different conditions, technologies, and species[J]. Nature Biotechnology, 2018.

[3] bimod：McDavid A, Finak G, Chattopadhyay P K, et al. Data Exploration, Quality Control and Testing in Single-Cell qPCR-Based Gene Expression Experiments[J]. Bioinformatics, 2013, 29(4):461-467.

[4] t-SNE：Maaten L V D, Hinton G. Visualizing Data using t-SNE[J]. Journal of Machine Learning Research, 2008, 9:2579-2605.

[5] KEGG：Kanehisa M, Araki M, et al. KEGG for linking genomes to life and the environment[J]. Nucleic Acids Research, 2008.

[6] Zheng G X, Terry J M, Belgrader P, et al. Massively parallel digital transcriptional profiling of single cells[J]. Nature Communications, 2017, 8:14049.

[7] Cannoodt R, Saelens W, Saeys Y. Computational methods for trajectory inference from single‐cell transcriptomics[J]. European Journal of Immunology, 2016, 46(11):2496-2506.


6.Contact us

Address：2575 West Bellfort Street, Suite 270, Houston, TX, 77054 USA

Website:http://www.lcsciences.com/

Email: support@lcsciences.com

Tel: (713) 664-7087