Recommender Systems for Gene Expression Recovery in scRNA-seq Data
Time: Tuesday, July 30, 6:30pm - 8:30pm
Location: Crystal Foyer and Crystal B
Description: Single-cell RNA sequencing (scRNA-seq) is a powerful gene expression profiling technique that is revolutionizing the study of complex cellular systems and responses in the biological sciences. However, current scRNA-seq approaches suffer from suboptimal target recovery, which leads to many false negatives. The resulting inflation of null readings adds noise to data visualization and may confound its interpretation. Because cells represent coherent phenotypes defined by conserved molecular circuitries, and because these circuitries are encoded in multi-gene expression patterns, information about one node in the data is expected to be embedded in other nodes. Under this hypothesis, several approaches have been proposed to impute missing values by extracting information from the non-zero values in the data set. As recommender systems are widely used to make predictions from sparse data matrices, we hypothesized that they could provide an effective means of recovering missing values in scRNA-seq data. We applied machine learning (non-negative matrix factorization, NMF) and collaborative filtering (k-nearest neighbors, KNN) recommender systems to scRNA-seq data to impute missing values. We compared these approaches to existing imputation approaches and found that they produce significantly better results and may uncover hidden features in the data.
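The two recommender-style approaches named in the description can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, which the abstract does not specify. It assumes a small cells-by-genes matrix, treats every zero as a missing value (a simplification, since some zeros are genuine), and uses hypothetical function names (`nmf_impute`, `knn_impute`), toy data, and parameter values chosen only for illustration.

```python
import numpy as np

def nmf_impute(X, rank=2, n_iter=500, eps=1e-9, seed=0):
    """Masked NMF: fit W @ H to the non-zero entries only, then fill the zeros.

    Assumption: every zero is a dropout (missing), which real data violates.
    """
    rng = np.random.default_rng(seed)
    M = (X > 0).astype(float)                 # 1 where a value was observed
    W = rng.random((X.shape[0], rank)) + 0.1  # cell factors, strictly positive
    H = rng.random((rank, X.shape[1])) + 0.1  # gene factors, strictly positive
    for _ in range(n_iter):
        # Weighted multiplicative updates: masked-out zeros carry no weight.
        W *= ((M * X) @ H.T) / ((M * (W @ H)) @ H.T + eps)
        H *= (W.T @ (M * X)) / (W.T @ (M * (W @ H)) + eps)
    X_hat = W @ H
    return np.where(M == 1, X, X_hat)         # keep observed values as they are

def knn_impute(X, k=2):
    """Cell-based collaborative filtering with cosine similarity.

    Each zero is replaced by a similarity-weighted mean of the non-zero
    values that the k most similar cells have for the same gene.
    """
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0                   # avoid dividing by zero
    U = X / norms
    sim = U @ U.T
    np.fill_diagonal(sim, -np.inf)            # a cell is not its own neighbor
    out = X.astype(float).copy()
    for i in range(X.shape[0]):
        nbrs = np.argsort(sim[i])[-k:]        # indices of the k nearest cells
        w = sim[i, nbrs]
        for j in np.flatnonzero(X[i] == 0):
            obs = X[nbrs, j] > 0              # neighbors that observed gene j
            if obs.any() and w[obs].sum() > 0:
                out[i, j] = np.dot(w[obs], X[nbrs[obs], j]) / w[obs].sum()
    return out

# Toy expression matrix (rows: cells, columns: genes); values are made up.
X = np.array([
    [5., 0., 3., 1.],
    [4., 2., 0., 1.],
    [0., 2., 3., 1.],
    [1., 5., 0., 4.],
    [0., 4., 1., 5.],
])

nmf_out = nmf_impute(X)
knn_out = knn_impute(X)
print(np.round(nmf_out, 2))
print(np.round(knn_out, 2))
```

Both functions leave observed (non-zero) entries untouched and change only the zeros. In practice, a held-out evaluation that masks known non-zero values and measures how well they are recovered would be one way to compare such methods, though the abstract does not say how the comparison was carried out.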