Machine learning models predicted cannabis use disorder transitions with an AUC of 0.74 using demographics, wearables, and social factors

Using the All of Us cohort, machine learning models predicted progression from cannabis use to cannabis use disorder with moderate accuracy (AUC = 0.74), with demographics being the strongest predictors and social determinants of health adding meaningful value.

Zamora, Gabriel et al. · Drug and alcohol dependence · 2026 · Moderate Evidence · Longitudinal Cohort

Quick Facts

Study Type
Longitudinal Cohort
Evidence
Moderate Evidence
Sample
Not reported

What This Study Found

For cannabis users, the elastic net and random forest models each achieved an AUC of about 0.74, with no significant difference between them. Demographic variables were the strongest predictors in both models. Social determinants of health, particularly income, contributed substantially. Wearable-derived metrics (activity and sleep data) provided incremental value in the linear model but made a limited independent contribution in the random forest.

Key Numbers

Cannabis cohort AUC: elastic net (EN) = 0.740, random forest (RF) = 0.741 (DeLong p = 0.764). Stimulant cohort AUC: RF = 0.732, EN = 0.698 (DeLong p = 0.219). Demographics were the strongest predictors; income was the most important social determinants of health (SDoH) variable.
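AUC is a ranking measure, not a percent-correct score, so an AUC of 0.74 should not be read as "right 74% of the time." The sketch below illustrates the difference on a made-up toy sample; the labels, scores, and function names are hypothetical and are not taken from the study. It computes AUC as the share of (person who transitioned, person who did not) pairs in which the model gave the higher risk score to the person who transitioned, and contrasts that with accuracy at two cutoffs.

```python
def auc_from_scores(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy_at(labels, scores, threshold):
    """Share of people classified correctly when score >= threshold means 'predicted positive'."""
    hits = sum((s >= threshold) == (y == 1) for y, s in zip(labels, scores))
    return hits / len(labels)


# Hypothetical toy sample: 2 of 10 people go on to a use disorder diagnosis.
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.40, 0.15, 0.35, 0.30, 0.20, 0.10, 0.10, 0.05, 0.05, 0.05]

auc = auc_from_scores(labels, scores)
acc_half = accuracy_at(labels, scores, 0.5)
acc_low = accuracy_at(labels, scores, 0.3)
print(auc, acc_half, acc_low)
```

In this toy sample the AUC stays the same whichever cutoff is chosen, while accuracy moves with the cutoff and with how rare the outcome is. At the 0.5 cutoff the model flags no one and still scores well on accuracy, which is why the two figures answer different questions.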

How They Did This

Data came from the All of Us Research Program, a nationwide cohort integrating electronic health records, surveys, wearable data, and social determinants of health. Individuals with baseline cannabis use were followed for incident substance use disorder (SUD) diagnoses. Elastic net logistic regression and random forest models were trained, then compared by AUC on independent test sets.
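The comparison described above can be sketched roughly as follows. This is an illustration only, assuming scikit-learn is installed and using synthetic data in place of the All of Us cohort, which is not available here. The feature counts, class balance, hyperparameters, and variable names are all assumptions, not the authors' settings, and the sketch leaves out the DeLong test the study used to compare the two AUCs.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the cohort (assumption): 20 features, ~15% positive class.
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=6,
    weights=[0.85, 0.15], random_state=0,
)

# Hold out an independent test set, stratified so both splits keep the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0,
)

# Elastic net logistic regression: mixed L1/L2 penalty, which requires the saga solver.
elastic_net = make_pipeline(
    StandardScaler(),
    LogisticRegression(
        penalty="elasticnet", solver="saga", l1_ratio=0.5, C=1.0, max_iter=5000,
    ),
)
random_forest = RandomForestClassifier(n_estimators=300, random_state=0)

# Fit each model on the training split and score it by AUC on the held-out split.
auc_by_model = {}
for name, model in [("EN", elastic_net), ("RF", random_forest)]:
    model.fit(X_train, y_train)
    risk = model.predict_proba(X_test)[:, 1]
    auc_by_model[name] = roc_auc_score(y_test, risk)

print(auc_by_model)
```

The numbers this prints describe the synthetic data only and say nothing about the study's results. Scoring both models on the same held-out split is what makes their AUCs directly comparable.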

Why This Research Matters

Predicting who will progress from cannabis use to a use disorder could enable targeted prevention. This study shows that readily available demographic and social data already provide moderate predictive power, with wearable technology adding incremental value.

The Bigger Picture

This represents a step toward precision prevention in substance use. While an AUC of 0.74 is moderate, combining easily collected demographic data with emerging wearable technology could eventually enable proactive clinical intervention.

What This Study Doesn't Tell Us

Moderate predictive accuracy limits clinical utility. The All of Us cohort may not be fully representative. Electronic health record diagnoses may undercount SUD. Wearable data made a limited contribution, possibly because of data quality or relevance.

Questions This Raises

  • Would longer follow-up periods improve prediction accuracy?
  • Could genetic data or neuroimaging biomarkers significantly boost performance?
  • How should moderate-accuracy predictions be used ethically in clinical settings?

Trust & Context

Key Stat:
AUC = 0.74 for predicting cannabis use to disorder transition
Evidence Grade:
Moderate: large diverse cohort with multimodal data and appropriate ML methodology, but moderate predictive accuracy and observational design.
Study Age:
2026 publication using the All of Us Research Program cohort.
Original Title:
Comparing random forest and elastic net models to predict substance use disorder transitions in participants with cannabis and stimulant use: Evidence from the All of Us cohort.
Published In:
Drug and alcohol dependence, 278, 113012 (2026)
Database ID:
RTHC-08733

Evidence Hierarchy

Meta-Analysis / Systematic Review
Randomized Controlled Trial
Cohort / Case-Control: follows or compares groups over time (this study)
Cross-Sectional / Observational
Case Report / Animal Study

Follows a group of people over time to track how outcomes develop.


Frequently Asked Questions

Can machine learning predict who will develop cannabis use disorder?

Yes, with moderate accuracy (AUC of about 0.74). Demographics were the strongest predictors, with income and other social factors adding meaningful value. The models performed similarly for the cannabis and stimulant use cohorts.

Did wearable data help predict cannabis use disorder?

Somewhat. Activity and sleep data from wearables provided incremental value in linear models, but limited independent contribution in the more complex random forest model. Demographics remained the strongest predictors.


Cite This Study

RTHC-08733 · https://rethinkthc.com/research/RTHC-08733

APA

Zamora, Gabriel; Gunawan, Tommy; Zhao, Qingyu; Meruelo, Alejandro D. (2026). Comparing random forest and elastic net models to predict substance use disorder transitions in participants with cannabis and stimulant use: Evidence from the All of Us cohort. Drug and alcohol dependence, 278, 113012. https://doi.org/10.1016/j.drugalcdep.2025.113012

MLA

Zamora, Gabriel, et al. "Comparing random forest and elastic net models to predict substance use disorder transitions in participants with cannabis and stimulant use: Evidence from the All of Us cohort." Drug and alcohol dependence, 2026. https://doi.org/10.1016/j.drugalcdep.2025.113012

RethinkTHC

RethinkTHC Research Database. "Comparing random forest and elastic net models to predict su..." RTHC-08733. Retrieved from https://rethinkthc.com/research/zamora-2026-comparing-random-forest-and

Access the Original Study

Study data sourced from PubMed, a service of the U.S. National Library of Medicine, National Institutes of Health.

This study breakdown was produced by the RethinkTHC research team. We analyze and report published research findings without making health recommendations. All interpretations are based solely on the published abstract and study data.