TY - JOUR
T1 - Multi-Cohort Transcriptomic Subtyping of B-Cell Acute Lymphoblastic Leukemia
AU - Mäkinen, Ville Petteri
AU - Rehn, Jacqueline
AU - Breen, James
AU - Yeung, David
AU - White, Deborah L.
N1 - Funding Information:
D.L.W. was funded by the National Health and Medical Research Council of Australia Target Call for Research (APP1160833) and by Cancer Council SA Beat Cancer Project Principal Cancer Research Fellowship (PRF1618). The work was also supported by the Australasian Leukaemia and Lymphoma Group and the Australian and New Zealand Children’s Haematology/Oncology Group.
PY - 2022/5/1
Y1 - 2022/5/1
AB - RNA sequencing provides a snapshot of the functional consequences of genomic lesions that drive acute lymphoblastic leukemia (ALL). The aims of this study were to elucidate diagnostic associations (via machine learning) between mRNA-seq profiles, independently verify ALL lesions and develop easy-to-interpret transcriptome-wide biomarkers for ALL subtyping in the clinical setting. A training dataset of 1279 ALL patients from six North American cohorts was used for developing machine learning models. Results were validated in 767 patients from Australia with a quality control dataset across 31 tissues from 1160 non-ALL donors. A novel batch correction method was introduced and applied to adjust for cohort differences. Out of 18,503 genes with usable expression, 11,830 (64%) were confounded by cohort effects and excluded. Six ALL subtypes (ETV6::RUNX1, KMT2A, DUX4, PAX5 P80R, TCF3::PBX1, ZNF384) that covered 32% of patients were robustly detected by mRNA-seq (positive predictive value ≥ 87%). Five other frequent subtypes (CRLF2, hypodiploid, hyperdiploid, PAX5 alterations and Ph-positive) were distinguishable in 40% of patients at lower accuracy (52% ≤ positive predictive value ≤ 73%). Based on these findings, we introduce the Allspice R package to predict ALL subtypes and driver genes from unadjusted mRNA-seq read counts as encountered in real-world settings. Two examples of Allspice applied to previously unseen ALL patient samples with atypical lesions are included.
KW - RNA-seq
KW - acute lymphoblastic leukemia
KW - confounder adjustment
KW - machine learning
UR - http://www.scopus.com/inward/record.url?scp=85128433633&partnerID=8YFLogxK
U2 - 10.3390/ijms23094574
DO - 10.3390/ijms23094574
M3 - Article
AN - SCOPUS:85128433633
VL - 23
JO - International Journal of Molecular Sciences
JF - International Journal of Molecular Sciences
SN - 1661-6596
IS - 9
M1 - 4574
ER -