Study of Single Cell Integration Using Machine Learning






Create Date 16/06/2023
Last Updated 16/06/2023

Description

Study of Single Cell Integration Using Machine Learning

Anamika¹, Ajay K Kaushik²

¹Maharaja Agrasen Institute of Technology, ²Maharaja Agrasen Institute of Technology

Abstract: Single-cell genomics has revolutionized our understanding of biology by enabling the measurement of DNA, RNA, and proteins in individual cells. However, analyzing single-cell data is challenging because measurements are sparse and noisy, molecular sampling depths vary, and batch effects are present. Additionally, current pipelines for single-cell data analysis treat cells as static snapshots, disregarding the underlying dynamic biological processes. Incorporating temporal dynamics alongside state changes over time remains a crucial open challenge in single-cell data science.

This paper presents a methodology for analyzing single-cell multiomics data collected from mobilized peripheral CD34+ hematopoietic stem and progenitor cells (HSPCs) isolated from four healthy human donors. The data comprises five time points over a ten-day period, during which cells were cultured with StemSpan SFEM media supplemented with CC100 and thrombopoietin (TPO) and incubated at 37°C. Two single-cell assays were used to measure two modalities each: chromatin accessibility (DNA) and gene expression (RNA) for the Multiome kit, and gene expression (RNA) and surface protein levels for the CITEseq kit.

The task is to predict gene expression from chromatin accessibility for the Multiome samples, and protein levels from gene expression for the CITEseq samples. The cell types include Mast Cell Progenitor, Megakaryocyte Progenitor, Neutrophil Progenitor, Monocyte Progenitor, Erythrocyte Progenitor, Hematopoietic Stem Cell, and B-Cell Progenitor.

The methodology includes exploratory data analysis, data pre-processing and feature engineering, and a model architecture comprising LightGBM and a neural network. The data pre-processing includes normalization, transformation, standardization, and batch-effect correction. Feature engineering involves decomposition methods such as Principal Component Analysis, Incremental PCA, and Factor Analysis, as well as feature selection based on stable correlations within each group. Cell-type encoding is done using a one-hot encoding scheme.
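The pre-processing and feature-engineering steps named above can be illustrated with a minimal sketch. It assumes NumPy and scikit-learn, uses synthetic counts, and picks its own parameters (the scaling factor of 1e4, four components, the helper name `build_features`, and the cell-type abbreviations are all hypothetical). Batch-effect correction, Incremental PCA, Factor Analysis, and correlation-based feature selection are omitted, so this is not the authors' pipeline.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def build_features(counts, cell_types, n_components=4, seed=0):
    """Return PCA components concatenated with one-hot cell-type columns."""
    # Normalization: divide each cell by its total count (sampling depth),
    # rescale, then apply a log1p transformation.
    depth = counts.sum(axis=1, keepdims=True)
    normalized = np.log1p(counts / np.maximum(depth, 1.0) * 1e4)

    # Standardization: zero mean and unit variance per feature.
    scaled = StandardScaler().fit_transform(normalized)

    # Decomposition: keep a small number of principal components.
    pca = PCA(n_components=n_components, random_state=seed)
    components = pca.fit_transform(scaled)

    # Cell-type encoding: one binary column per observed cell type.
    encoder = OneHotEncoder(handle_unknown="ignore")
    onehot = encoder.fit_transform(cell_types.reshape(-1, 1)).toarray()

    return np.hstack([components, onehot])


# Synthetic stand-in data: 60 cells, 30 features, 3 hypothetical cell types.
rng = np.random.default_rng(0)
counts = rng.poisson(1.0, size=(60, 30)).astype(float)
cell_types = rng.choice(np.array(["HSC", "MkP", "EryP"]), size=60)

features = build_features(counts, cell_types)
print(features.shape)
```

For brevity the sketch fits every transformer on the full matrix. In a cross-validated setting the scaler, PCA, and encoder would normally be fitted on the training folds only and then applied to the held-out fold.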

Cross-validation is performed using grouped k-fold cross-validation (GroupKFold). The model achieves an accuracy of 87.63% on the test dataset.
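A grouped k-fold split can be sketched as follows, assuming scikit-learn's `GroupKFold`. The abstract does not say which variable defines the groups; grouping by donor is an assumption made here for illustration. The data is synthetic, and `Ridge` is only a lightweight stand-in for the LightGBM and neural-network models described above.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GroupKFold

# Synthetic stand-in data: 200 cells, 8 input features, 3 targets.
rng = np.random.default_rng(0)
n_cells, n_features, n_targets = 200, 8, 3
X = rng.normal(size=(n_cells, n_features))
W = rng.normal(size=(n_features, n_targets))
y = X @ W + 0.1 * rng.normal(size=(n_cells, n_targets))

# Assumption: each cell carries one of four hypothetical donor IDs.
donors = np.repeat(np.arange(4), n_cells // 4)

fold_scores = []
held_out = []
splitter = GroupKFold(n_splits=4)
for train_idx, valid_idx in splitter.split(X, y, groups=donors):
    # GroupKFold keeps every group entirely on one side of the split.
    assert set(donors[train_idx]).isdisjoint(donors[valid_idx])

    model = Ridge(alpha=1.0).fit(X[train_idx], y[train_idx])
    # Ridge.score returns R^2, averaged uniformly over the targets.
    fold_scores.append(model.score(X[valid_idx], y[valid_idx]))
    held_out.append(sorted(set(donors[valid_idx].tolist())))

print(held_out)
print(np.round(fold_scores, 3))
```

Holding out whole groups means each validation score reflects performance on cells from a donor the model never saw during training, which a plain shuffled k-fold split would not guarantee.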

