Multi-Cohort Transcriptomic Subtyping of B-Cell Acute Lymphoblastic Leukemia

Ville Petteri Mäkinen, Jacqueline Rehn, James Breen, David Yeung, Deborah L. White

Research output: Contribution to journal › Article › peer-review

7 Citations (Scopus)


RNA sequencing provides a snapshot of the functional consequences of genomic lesions that drive acute lymphoblastic leukemia (ALL). The aims of this study were to elucidate diagnostic associations (via machine learning) between mRNA-seq profiles, to independently verify ALL lesions, and to develop easy-to-interpret transcriptome-wide biomarkers for ALL subtyping in the clinical setting. A training dataset of 1279 ALL patients from six North American cohorts was used to develop machine learning models. Results were validated in 767 patients from Australia, with a quality control dataset spanning 31 tissues from 1160 non-ALL donors. A novel batch correction method was introduced and applied to adjust for cohort differences. Of 18,503 genes with usable expression, 11,830 (64%) were confounded by cohort effects and excluded. Six ALL subtypes (ETV6::RUNX1, KMT2A, DUX4, PAX5 P80R, TCF3::PBX1, ZNF384) that together covered 32% of patients were robustly detected by mRNA-seq (positive predictive value ≥ 87%). Five other frequent subtypes (CRLF2, hypodiploid, hyperdiploid, PAX5 alterations, and Ph-positive) were distinguishable in 40% of patients at lower accuracy (positive predictive value between 52% and 73%). Based on these findings, we introduce the Allspice R package, which predicts ALL subtypes and driver genes from unadjusted mRNA-seq read counts as encountered in real-world settings. Two examples of Allspice applied to previously unseen ALL patient samples with atypical lesions are included.
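The abstract reports accuracy per subtype as positive predictive value (PPV): the share of samples assigned to a subtype that truly belong to it, TP / (TP + FP). The following is a minimal Python sketch of that calculation, not the study's code and not the Allspice API. The function name and the toy labels are hypothetical and chosen only for illustration; the one figure taken from the abstract is the count of excluded genes.

```python
from collections import Counter


def positive_predictive_value(true_labels, predicted_labels):
    """Return {subtype: PPV}, where PPV = TP / (TP + FP).

    Only subtypes that were predicted at least once appear in the result,
    because PPV is undefined for a subtype that was never called.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    # Number of samples assigned to each subtype (TP + FP).
    predicted = Counter(predicted_labels)
    # Number of those assignments that match the true subtype (TP).
    correct = Counter(
        p for t, p in zip(true_labels, predicted_labels) if t == p
    )
    return {subtype: correct[subtype] / n for subtype, n in predicted.items()}


# Hypothetical toy labels, NOT data from the study.
truth = ["ETV6::RUNX1", "ETV6::RUNX1", "KMT2A", "DUX4", "KMT2A"]
calls = ["ETV6::RUNX1", "ETV6::RUNX1", "KMT2A", "KMT2A", "KMT2A"]
ppv = positive_predictive_value(truth, calls)

# Share of genes excluded for cohort confounding, using the abstract's counts.
excluded_share = 11830 / 18503
```

In this toy case one DUX4 sample is miscalled as KMT2A, so the KMT2A PPV falls to two thirds while ETV6::RUNX1 stays at 1.0. The same ratio applied to the gene counts reproduces the 64% exclusion figure quoted in the abstract.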

Original language: English
Article number: 4574
Journal: International Journal of Molecular Sciences
Issue number: 9
Publication status: Published or Issued - 1 May 2022


Keywords

  • RNA-seq
  • acute lymphoblastic leukemia
  • confounder adjustment
  • machine learning

ASJC Scopus subject areas

  • Catalysis
  • Molecular Biology
  • Spectroscopy
  • Computer Science Applications
  • Physical and Theoretical Chemistry
  • Organic Chemistry
  • Inorganic Chemistry
