Data Con LA 2020
Description
Over the last 10 years, Cedars-Sinai Health System has made significant efforts to reduce unnecessary C-sections and bring down variability in rates by provider. As part of that effort, providers in the Obstetrics department and data scientists from the Enterprise Data Intelligence team built a model to predict the likelihood that a laboring mother should have a C-section. The goal of the model is to reduce the number of unnecessary C-sections and also to identify necessary C-sections earlier in the course of labor.
Model overview
* Predictions are first made based on factors known within the first hour of admission, including prenatal visit information, demographics and basic measurements
* The model is then updated every 10 minutes to incorporate new information recorded during that interval: lab values, cervical exam measurements, medications administered and other relevant events
* The admission-based predictions achieved an AUC of 0.78; the continuous predictions achieve an AUC of 0.93 after the first 4 hours of predictions on validation data
* The models use streaming data to make predictions and return them within ~10 minutes of the data being recorded
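The 10-minute update cycle described above can be sketched as a simple re-scoring loop. Everything here is a hypothetical illustration: the feature names, event structure, and stand-in scoring function are assumptions, not the production model.

```python
from datetime import datetime, timedelta

# Hypothetical sketch: re-score a laboring patient every 10 minutes as new
# intrapartum events (labs, cervical exams, medications) arrive.
REFRESH = timedelta(minutes=10)

def collect_features(admit_features, events, now):
    """Merge admission features with any events recorded up to `now`."""
    feats = dict(admit_features)
    for event in events:
        if event["time"] <= now:
            feats[event["name"]] = event["value"]
    return feats

def score(features):
    # Placeholder for the trained model; here, dilation alone drives the score.
    return 1.0 - min(features.get("dilation", 0) / 10.0, 1.0)

admit = {"dilation": 2, "gestational_age": 39}
events = [{"time": datetime(2020, 1, 1, 0, 25), "name": "dilation", "value": 6}]

# Two consecutive 10-minute scoring ticks; the second tick picks up the new exam.
t0 = datetime(2020, 1, 1, 0, 20)
scores = [score(collect_features(admit, events, t0 + i * REFRESH)) for i in range(2)]
```

The key property this sketch shows is that each tick re-merges all data recorded so far, so a late-arriving measurement is reflected at the next refresh.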
Speaker
Matthew Wells, Senior Data Scientist, Cedars-Sinai Health System
2. Overview
Thank you for joining! This presentation will examine the work involved in developing a C-section risk prediction algorithm, from project inception to operational model.
What are we talking about today?
* Background: how this project came about
* Development: how we built the prediction model
* Results: how accurate the model was
* Plans: what will happen next with the model
* Productionalizing: how the model is being validated and rolled out
6. Background
What was the goal for this model?
Cesarean delivery can be a life-saving procedure for mother and baby, but it also presents increased risks for both. Therefore, it is important to consider the benefits and risks before deciding on a Cesarean delivery.
Source: “Safe prevention of the primary Cesarean delivery”, American College of Obstetricians and Gynecologists, ACOG.org, March 2014
The purpose of this model was to automate the mental math involved in this benefit and risk comparison by providing a prediction of the likelihood of each delivery method.
8. Development
What have people done for this in the past?
Prior models used factors that are all measured at or before admission to the hospital:
* Age
* Height
* Weight
* Ethnicity
* Comorbidities
* Initial cervical exam
* Obstetric history
* Gestational age
* Fetal ultrasound
Sources: “Predicting Vaginal delivery in nulliparous women undergoing induction of labor”, Kawakita, AJP, 2018; “Prediction of Cesarean delivery in the term nulliparous women”, Burke, AJOG, 2017
9. Development
What have people done for this in the past?
Using these measures gathered at the time of admission, previous efforts concluded with nomograms and calculators based on models that achieved AUCs of 0.69 and 0.76.
10. Development
How was this project going to be different?
This model would combine the admit factors with data collected during labor:
* Intrapartum cervical exams
* Medication
* Epidural
* Fetal heart rate
11. Development
Were there other considerations?
Between 2012 and 2019, there were 44,526 deliveries by 469 different physicians at Cedars-Sinai Medical Center. But if we build a model based on historical data, won’t we just carry forward all of the traits of non-necessary C-sections that we’re trying to avoid?
1. 103 providers had >50 deliveries in that time.
2. Of that group of physicians, 36 had a C-section rate <24%. This was our target physician group. These 36 physicians delivered 11,763 babies between 2012 and 2019.
3. To ensure that the target MD group was also balancing the risks of not doing C-sections, we looked at mother morbidity measures, infant morbidity measures and NICU admissions. None of these measures were significantly different between the target and overall MD groups.
12. Development
What did the trained ALEx OBserve model look like?
1. Filter for just the 11,763 births delivered by the target physician group
2. Pull in factors available at admission
3. Run the data through an automated machine learning process
4. Return the admit-based C-section prediction results and combine them with real-time factors
5. Run the combined data through the automated machine learning process
6. Get a predicted C-section likelihood for all deliveries in the target physician group at ten-minute intervals
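A minimal sketch of this two-stage flow, with stand-in models: the thresholds and rules below are made-up illustrations (the real models come from the automated ML process), but the shape of the pipeline, where the admit prediction feeds the continuous model as a feature, follows the steps above.

```python
# Hypothetical two-stage sketch: an admission-time model produces a baseline
# risk, which then feeds the continuous model as one of its features.

def admit_model(admit_features):
    # Stand-in for the automated-ML admission model.
    return 0.30 if admit_features["dilation"] < 3 else 0.10

def continuous_model(features):
    # Stand-in for the continuous model; the admit prediction is one input.
    risk = features["admit_prediction"]
    if features["fhr_pattern"] == "category_2":
        risk += 0.15
    return min(risk, 1.0)

def predict(delivery):
    admit_pred = admit_model(delivery["admit"])
    feats = dict(delivery["intrapartum"], admit_prediction=admit_pred)
    return continuous_model(feats)

delivery = {
    "admit": {"dilation": 2},
    "intrapartum": {"fhr_pattern": "category_2"},
}
risk = predict(delivery)  # baseline admit risk plus an intrapartum adjustment
```

Chaining the stages this way means the continuous model never has to re-learn the admission signal; it only learns how intrapartum events shift that baseline.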
14. Results
How good were predictions at time of admission?
Measures included: initial cervical exam (dilation, effacement, station), gestational age, mother age, prenatal measures (BP, BMI, weight), pain scale score, temperature at admission, pulse at admission
Models included: random forest, XGBoost, extreme gradient boosted trees, generalized additive models, logistic regression, extra trees classifier, elastic-net classifier, support vector machines, RuleFit classifier, decision tree
Process: get admit measures for the target population, run the data through the models, pick the best model
Top model: random forest
AUC: 0.753
Sensitivity: 0.99
Specificity: 0.15
Confusion matrix (rows: actual, columns: predicted):
                      Predicted Cesarean   Predicted Non-Cesarean
Actual Cesarean               59                    327
Actual Non-Cesarean           19                  1,930
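The reported sensitivity and specificity can be recovered from the confusion matrix; the numbers line up when non-Cesarean (vaginal) delivery is treated as the positive class, consistent with the cited prior work on predicting vaginal delivery. The cell labels below reflect that reading.

```python
# Recomputing the reported metrics from the confusion matrix above, with
# non-Cesarean delivery as the positive class.
tp = 1930   # actual non-Cesarean, predicted non-Cesarean
fn = 19     # actual non-Cesarean, predicted Cesarean
tn = 59     # actual Cesarean, predicted Cesarean
fp = 327    # actual Cesarean, predicted non-Cesarean

sensitivity = tp / (tp + fn)   # ~0.99, matching the slide
specificity = tn / (tn + fp)   # ~0.15, matching the slide
```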
15. Results
What features were most important for the admit predictions?
Permutation-based feature importance at admission, from least to most important: temperature, trimester 1 BMI, weight, pulse, effacement, station, mother age, BMI at admission, gestational age, dilation.
Feature importance is determined by shuffling values in a column and then analyzing model performance. More important features will have a greater impact on model accuracy when shuffled. The magnitude of the impact is calculated for all features and then normalized to a scale of 0-1.
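The shuffling procedure described above can be sketched end to end on toy data. The two-feature dataset and threshold "model" here are illustrative assumptions, not the production model; they just make the mechanism visible: shuffling an informative column hurts accuracy, shuffling a noise column does not.

```python
import random

# Minimal sketch of permutation-based feature importance:
# shuffle one column at a time and measure the drop in accuracy.
rng = random.Random(0)

# Toy data: the label depends only on feature 0; feature 1 is pure noise.
rows = [[rng.uniform(0, 10), rng.uniform(0, 10)] for _ in range(200)]
labels = [1 if r[0] > 5 else 0 for r in rows]

def model(row):
    # Stand-in "trained" model that thresholds feature 0.
    return 1 if row[0] > 5 else 0

def accuracy(rows, labels):
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)

baseline = accuracy(rows, labels)  # 1.0 for this toy model

importances = []
for col in range(2):
    shuffled_col = [r[col] for r in rows]
    rng.shuffle(shuffled_col)
    permuted = [r[:col] + [v] + r[col + 1:] for r, v in zip(rows, shuffled_col)]
    importances.append(baseline - accuracy(permuted, labels))

# Normalize to a 0-1 scale, as on the slide.
top = max(importances)
normalized = [i / top for i in importances]
```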
16. Results
How good were predictions during the course of labor?
Measures included: admit-time prediction, dilation, station, effacement, minutes into labor, medication administration (Oxytocin, Dinoprostone, Misoprostol), Foley catheter present, fetal heart rate pattern, membrane status
Models included: random forest, XGBoost, extreme gradient boosted trees, generalized additive models, logistic regression, extra trees classifier, elastic-net classifier, support vector machines, RuleFit classifier, decision tree
Process: get continuous measures for the target population, run the data through the models, pick the best model
Top model: extreme gradient boosted trees classifier
AUC: 0.961
Sensitivity: 0.98
Specificity: 0.772
Confusion matrix (rows: actual, columns: predicted):
                      Predicted Cesarean   Predicted Non-Cesarean
Actual Cesarean             64,786                19,053
Actual Non-Cesarean          5,521               265,476
17. Results
What features were most important for the course of labor predictions?
Permutation-based feature importance, from least to most important: FHR variability, FHR pattern, Foley catheter, membrane status, FHR rate, effacement, minutes since admission, station, dilation, admit prediction.
18. Results
What did the predictions look like over the course of labor?
[Chart: delivery method prediction accuracy over the course of labor, hours 0-96, showing the share of each predicted/actual combination: predicted non-Cesarean/actual C-section, predicted C-section/actual non-Cesarean, predicted C-section/actual C-section, predicted non-Cesarean/actual non-Cesarean]
At 4 hours into the course of labor, the sensitivity is 99.4% and the specificity is 59.5%.
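The time-sliced view above could be produced by grouping the 10-minute predictions by hours since admission and tallying the four predicted/actual combinations within each bucket. A sketch with hypothetical records (the tuples are illustrative, not real data):

```python
from collections import Counter

# (hours_into_labor, predicted, actual) - hypothetical prediction records
records = [
    (4, "non-cesarean", "non-cesarean"),
    (4, "non-cesarean", "non-cesarean"),
    (4, "cesarean", "cesarean"),
    (4, "non-cesarean", "cesarean"),
    (8, "cesarean", "cesarean"),
]

def outcome_mix(records, hour):
    """Fraction of each predicted/actual pair among predictions at `hour`."""
    at_hour = [(p, a) for h, p, a in records if h == hour]
    counts = Counter(at_hour)
    return {pair: n / len(at_hour) for pair, n in counts.items()}

mix = outcome_mix(records, 4)  # shares of the four outcome combinations at hour 4
```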
20. Production
How did we get the queries running in real time?
Data sources: flowsheet measures & events, medication administration, location information, prenatal info, demographic information.
Query changes: limit to current patients, point to the new data source, restructure the query, and apply additional data transformation.
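The query changes listed above might look something like the following sketch. The table and column names are entirely hypothetical; the point is the shape of the change: the training query scanned the full history, while the production version reads the live source and limits to currently admitted patients.

```python
# Hypothetical before/after of the real-time query restructuring.
TRAINING_QUERY = """
SELECT patient_id, measure, value, recorded_at
FROM warehouse.flowsheet_history
"""

PRODUCTION_QUERY = """
SELECT patient_id, measure, value, recorded_at
FROM live.flowsheet_current          -- point to the new data source
WHERE discharge_time IS NULL         -- limit to current patients
  AND recorded_at > :last_run_time   -- only rows new since the last refresh
"""
```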
24. Ongoing
What are we doing with it now?
Does the data look right?
* Data types!
* Timing
* Is it all there?
* Is it consistent?
* Does it match history?
Is it prospectively accurate?
* AUC
* Timing
* Prediction behavior
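The data checks above (types, timing, completeness, consistency) can be expressed as a small validation pass over each streaming record. The field names and range rules here are illustrative assumptions, not the production checks.

```python
# Sketch of per-record data validation for the streaming pipeline.
EXPECTED = {
    "dilation": float,     # does the data look right? (data types!)
    "recorded_at": str,    # timing field must be present
    "patient_id": str,
}

def validate(record):
    """Return a list of problems found in one streaming record."""
    problems = []
    for field, expected_type in EXPECTED.items():
        if field not in record:
            problems.append(f"missing {field}")          # is it all there?
        elif not isinstance(record[field], expected_type):
            problems.append(f"bad type for {field}")     # data types!
    if isinstance(record.get("dilation"), float):
        if not 0.0 <= record["dilation"] <= 10.0:
            problems.append("dilation out of range")     # is it consistent?
    return problems

issues = validate({"dilation": 12.0, "patient_id": "p1"})
```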
25. Ongoing
What else are we monitoring?
A potential benefit of this model is its ability to reduce disparities in C-section rates that may occur in patient subpopulation segments (race, ethnicity, insurance, primary language spoken) without a biologically plausible mechanism.
Allocation harms: when opportunities, resources or information are withheld or offered at different levels to different groups.
Quality-of-service harms: when prediction accuracy or other outcome metrics are not equal across groups.
Source: https://fairlearn.github.io/user_guide/fairness_in_machine_learning.html
Mitigation strategies: although major sources of explicit bias have not been included in the model, implicit bias can be pernicious. We use libraries and algorithms available from Fairlearn to track and potentially mitigate bias in the model that is not biologically plausible.
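The quality-of-service check amounts to computing outcome metrics separately within each patient subgroup and comparing them; Fairlearn's `MetricFrame` provides exactly this (per-group metrics plus a `difference()` disparity summary). A stdlib sketch of the underlying comparison, with illustrative data:

```python
from collections import defaultdict

def accuracy_by_group(y_true, y_pred, groups):
    """Accuracy computed separately for each sensitive-feature group."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p, g in zip(y_true, y_pred, groups):
        totals[g] += 1
        hits[g] += int(t == p)
    return {g: hits[g] / totals[g] for g in totals}

# Illustrative labels, predictions, and subgroup assignments.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]
groups = ["a", "a", "a", "b", "b", "b"]

per_group = accuracy_by_group(y_true, y_pred, groups)
gap = max(per_group.values()) - min(per_group.values())  # disparity to monitor
```

A monitoring job would alarm when `gap` exceeds a tolerance for any tracked metric, flagging a potential quality-of-service harm.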
26. Plans
What do we want to do with it?
* Mobile-ready dashboard
* Ongoing EHR integration
* Waveform vitals