Expression Recovery in Single Cell RNA Sequencing

Event description


Time: 9:00 AM - 10:00 AM PST, Feb 22, 2018. A link will be sent 24 hours before the event.

Please RSVP. Registration closes at 8:59 AM PST on Feb 21, 2018.


Cells are the basic biological units of multicellular organisms. Recent technological breakthroughs have made it possible to measure gene expression at the single-cell level through single-cell RNA sequencing (scRNA-seq). The collection of abundances of all RNA species in a cell forms its “molecular fingerprint”, enabling the investigation of many fundamental biological questions beyond those possible with traditional bulk RNA-seq experiments, such as: How do single cells differ? What are the cell types that make up a tissue? How do cells transition between biological states and acquire new functions?

In this talk, I will describe the statistical challenges posed by single-cell RNA sequencing data and introduce the models that we have developed. Using 9 representative public data sets generated by multiple experimental protocols, we have shown that a Poisson-based model can capture the technical noise in UMI-based counts (UMIs, or unique molecular identifiers, are a barcoding technique that I will describe). I will formulate two general challenges in single-cell experiments: distribution recovery and transcript recovery. I will briefly describe the distribution recovery problem, but will focus on transcript recovery for most of the talk.
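As a rough illustration of what a Poisson technical-noise model implies, the sketch below simulates UMI counts for one gene across cells. It assumes the observed count is Poisson with rate equal to capture efficiency times true expression; that parameterization, the function names, and the numbers are illustrative assumptions, not the model presented in the talk.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw one Poisson(rate) variate using Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_umi_counts(true_expression, efficiency, n_cells, seed=0):
    """Simulate observed UMI counts for one gene in n_cells identical cells.

    Assumption (not from the talk): count ~ Poisson(efficiency * true_expression).
    """
    rng = random.Random(seed)
    rate = efficiency * true_expression
    return [sample_poisson(rate, rng) for _ in range(n_cells)]


# Hypothetical values: 50 transcripts per cell, 10% of them captured.
counts = simulate_umi_counts(true_expression=50.0, efficiency=0.1, n_cells=5000)

mean = sum(counts) / len(counts)
variance = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
zero_fraction = counts.count(0) / len(counts)

# Under pure Poisson noise the mean and variance should be close to each other.
print(f"mean={mean:.2f} variance={variance:.2f} zeros={zero_fraction:.3f}")
```

Under this assumption, the sample mean and variance should roughly agree. A variance well above the mean in real data would point to biological variation or extra technical noise beyond the Poisson component.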

In single-cell RNA sequencing experiments, not all transcripts present in the cell are sequenced. The efficiency, that is, the proportion of transcripts in each cell that are eventually represented in the data, can vary between 2% and 60% across cells, and can be especially low in the recently developed highly parallelized technologies. This leads to a severe case of not-at-random missing data that confounds analysis. To address this issue, we have developed SAVER, a noise reduction and missing-data imputation framework for single-cell RNA sequencing. SAVER fills in the zeros in the expression matrix and improves the expression estimates derived from low read counts. We have demonstrated the accuracy of this procedure in two ways: through “thinning experiments” that subsample from real high-quality scRNA-seq data sets, and through comparisons to gold-standard RNA-FISH measurements. I will illustrate how this critical recovery step improves downstream analyses in single-cell experiments.
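The idea behind a thinning experiment can be sketched in a few lines. The example below assumes binomial thinning, where each observed transcript is kept independently with a fixed probability. The abstract does not specify the subsampling scheme, so this choice, the function name, and the counts are hypothetical.

```python
import random


def thin_counts(counts, keep_prob, rng):
    """Keep each observed transcript independently with probability keep_prob.

    Assumption: binomial thinning; the scheme used in the talk may differ.
    """
    return [sum(rng.random() < keep_prob for _ in range(c)) for c in counts]


rng = random.Random(0)

# Hypothetical high-quality counts for eight genes in one cell.
reference = [120, 45, 8, 0, 300, 17, 2, 60]

# Simulate a low-efficiency experiment that retains about 10% of transcripts.
thinned = thin_counts(reference, keep_prob=0.1, rng=rng)

# The reference counts act as known ground truth when scoring a recovery method.
for ref, obs in zip(reference, thinned):
    print(f"reference={ref:4d} thinned={obs:3d}")
```

Because the reference counts are known, any recovery method applied to the thinned data can be scored against them. Genes with low reference counts tend to drop to zero after thinning, which mimics the missing-data problem described above.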

Short Bio:
Nancy R. Zhang is a statistician with over 13 years of experience working in the field of genomics. She is currently Associate Professor in the Department of Statistics in the Wharton School at the University of Pennsylvania.
