Duke Industry Statistics Symposium - Real World Evidence, EHRs and Cancer Surveillance
1. Real World Evidence, EHRs,
and Cancer Surveillance
Warren A. Kibbe, Ph.D.
Professor, Biostats & Bioinformatics
Chief Data Officer, Duke Cancer Institute
warren.kibbe@duke.edu
@wakibbe
3. Personal & Professional Background
• PhD in Chemistry at Caltech, Postdoc in
molecular genetics of RAS
• Cancer research for 20+ years - cancer
informatics, data science, healthcare
• Faculty in the Feinberg School of Medicine
at Northwestern for 15+ years
• Director NCI CBIIT 2013-2017; Acting NCI
Deputy Director 2016-2017
• Lost three grandparents to cancer
4. (10,000+ patient tumors and increasing)
Courtesy of P. Kuhn (USC)
2006-2015:
A Decade of Illuminating the
Underlying Causes of Primary
Untreated Tumors Omics
Characterization
Cancer is a grand challenge
Requires:
• Deep biological understanding
• Advances in scientific methods
• Advances in instrumentation
• Advances in technology
• Data and computation
• Mathematical models
Cancer Research and Care generate
detailed data that is critical to
create a learning health system for cancer
5. In 2016 there were an estimated
1,700,000 new cancer cases
and
600,000 cancer deaths
- American Cancer Society
Cancer remains the second most common cause
of death in the U.S.
- Centers for Disease Control and Prevention
6. In 2016 there were an
estimated
15,500,000
cancer survivors in the U.S.
7. Understanding Cancer
• Precision medicine will lead to fundamental
understanding of the complex interplay between
genetics, epigenetics, nutrition, environment and
clinical presentation and direct effective,
evidence-based prevention and treatment.
Ramifications across many aspects of health care
8. Cancer Genomics
• Several distinct molecular forms of
cancer at each organ site
• The genomic abnormalities of each
cancer are unique
• The same molecular abnormalities
are found in cancers that arise in
different organs
Our understanding of biology, cancer, and intervention is changing!
9. Genomics, Data resources, and
clinical trials are changing
• NCI Cancer Genomic Data Commons
• Umbrella and basket clinical trials
such as NCI MATCH and Pediatric
MATCH
10. NCI Genomic Data Commons
launched at ASCO on June 6, 2016
https://gdc.cancer.gov
2.6 PB of legacy data and 1.5 PB of harmonized data.
11. NCI “MATCH” TRIAL:
MATCH = Molecular Analysis for Therapy CHoice
• For patients with any form of cancer who have progressive disease and
no standard of care treatment
• Novel approach: Each patient is treated according to the molecular
abnormalities of his/her tumor, rather than according to the tumor site of
origin
• A national trial: more than 1000 approved sites throughout the USA
• FY16 increased appropriation made it possible to increase screening phase
from 3,000 patients to more than 6,000 patients
• Fastest-enrolling trial in NCI’s history: started in 2015, now enrolling
about 500 patients per month into screening phase
• 30 treatment arms that test drugs inhibiting different molecular
abnormalities
• About 20% of screened patients are found to have at least one molecular
abnormality that makes them eligible for a treatment arm; about 75% of
eligible patients enroll in treatment phase of trial
12. NCI “MATCH” CANCER TREATMENT TRIAL:
STATE BY STATE ENROLLMENT
MATCH = Molecular
Analysis for Therapy CHoice
13. NCI-MATCH Success
• In June 2017, the trial reached its goal of
sequencing the tumors of 6,000 patients,
nearly two years early
• Its availability through more than 1100
participating sites reflects the broad
interest in the promise of genomics, and
the ability of such a study to deliver that
promise to the community
14. NCI-MATCH Important Discovery
In patients tested so far, the
tumor gene variants we are
studying are less common
than expected, from 3.47
percent to zero
15. NCI-MATCH Important Discovery
(cont’d)
• Prevalence rates for many tumor gene
abnormalities are lower than expected – for
several of the treatment arms to reach their 35-
patient goal, tens of thousands of patients need
to be screened
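The screening burden described above can be sketched with simple arithmetic. This is a minimal illustration, not the trial's actual statistical design: the function name `patients_to_screen` is hypothetical, and it assumes that the roughly 75% enrollment rate among eligible patients (quoted earlier for the trial as a whole) applies equally to each individual arm.

```python
import math


def patients_to_screen(target_enrolled, prevalence, enroll_rate=0.75):
    """Expected number of patients to screen to fill one treatment arm.

    target_enrolled: arm accrual goal (35 patients in the slide above).
    prevalence: fraction of screened patients carrying the arm's variant.
    enroll_rate: fraction of eligible patients who enroll
                 (assumption: the trial-wide ~75% applies per arm).
    """
    if prevalence <= 0 or enroll_rate <= 0:
        raise ValueError("prevalence and enroll_rate must be positive")
    return math.ceil(target_enrolled / (prevalence * enroll_rate))


if __name__ == "__main__":
    # 3.47% is the highest prevalence quoted above; 0.1% is a
    # hypothetical rare variant chosen only for illustration.
    for prevalence in (0.0347, 0.001):
        needed = patients_to_screen(35, prevalence)
        print(f"prevalence {prevalence:.2%}: screen about {needed:,} patients")
```

Under these assumptions, even the most common variant needs well over a thousand screened patients per arm, and a variant at 0.1% prevalence needs tens of thousands, which is consistent with the point made on the slide.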
16. MATCH and Precision Oncology
• It isn’t just about matching patients to
therapy, it is also about avoiding
therapies that will not work.
• Biology is complex, and we still have a
lot of basic biology to understand
• Genomics+imaging+clinical labs is the
first wave of precision oncology
17. Genomics is the beginning of
precision medicine, not the end!
18. Machine Learning
• Large data sets, particularly
population-based with a well-
annotated comparator set, are ideal
• Machine Learning and Deep Learning
on image features is feasible,
accurate, reproducible and scalable
21. Team Science is critical
Clinical Trials
Biostatistics
Bioinformatics
Clinical Care
Clinical Research
EHRs, Imaging, Lab Systems
Data Science
Analytics and Visualization
24. Biology and Medicine are now
data intensive enterprises
Scale is rapidly changing
Technology, data, computing and
IT are pervasive in the lab, the
clinic, in the home, and across the
population
25. Cancer Research Data Ecosystem – Cancer Moonshot BRP
Well characterized
research data sets Cancer cohorts Patient data
EHR, Lab Data, Imaging,
PROs, Smart Devices,
Decision Support
Learning from every
cancer patient
Active research
participation
Research information
donor
Clinical Research
Observational studies
Proteogenomics
Imaging data
Clinical trials
Discovery
Patient engaged
Research
Surveillance
Big Data
Implementation research
SEER
GDC
27. The SEER Program
• Funded by NCI to support research on the diagnosis,
treatment and outcomes of cancer since 1973
• Population-based registries covering ~30% of the US
population
– Representing racial and ethnic minorities
– Various geographic subgroups
• ~450,000+ incident cases reported annually
• Undergoing contract recompete: full and open for the first
time in 44 years!
– Expanding the registry coverage
– Developing two categories of registries under the SEER Program
• Core data collection
• Registries to support research
28. SEER Precision Cancer
Surveillance
Surveillance data captured/planned on each cancer patient for the entire population
Pathology
Molecular
Characterization
Detailed Initial
Treatment
Detailed
Subsequent
Treatment
Survival
Cause of Death
Progression
Recurrence
Complement trials to support
development of new diagnostics
and treatments
Understand treatment and
improve outcomes in the
“real world”
Genome
Demographics
29. The evolution of SEER
Genomics
Pharmacy data
Oncology
Practice data
Claims
CancerLinQ
Hospital data
PROs
Radiation
Oncology
30. What are the critical missing pieces for SEER to
address the challenges and better support
research?
• Detailed AND longitudinal treatment
• Outcomes other than survival
– Recurrence and disease progression
– Patient generated health data
• Genomic information
32. Two categories of missing treatment
• Orally Administered Anti-neoplastics
• Traditional Infusion Chemotherapy
• Require different approaches
– Oral treatments provided at pharmacies
– Infusion often provided in the outpatient (community oncology
practice) setting with limited access by registrars
• Solution: linkages
– Data can be automatically added
• Does not require registrars to capture the large and complex set of agents
– Often consistent and/or standardized format
– Provides both initial and subsequent therapy
– Potential to monitor compliance
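The linkage idea above can be sketched as a simple deterministic match between registry cases and pharmacy or claims records. This is only an illustration under stated assumptions: the record types, field names, and the matching key (last name, birth date, 5-digit ZIP) are all hypothetical, and the slides do not say how SEER actually performs its linkages. Production linkage typically uses probabilistic matching and privacy-protecting identifiers.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RegistryCase:  # hypothetical registry record
    case_id: str
    last_name: str
    birth_date: str  # ISO format, e.g. "1950-01-02"
    zip_code: str


@dataclass(frozen=True)
class ClaimRecord:  # hypothetical pharmacy/claims record
    last_name: str
    birth_date: str
    zip_code: str
    service_date: str  # ISO format, so string order is date order
    agent: str


def link_key(last_name, birth_date, zip_code):
    """Normalize identifying fields into a deterministic matching key."""
    return (last_name.strip().upper(), birth_date, zip_code[:5])


def link_treatments(cases, claims):
    """Attach each case's claims in date order (initial, then subsequent)."""
    index = defaultdict(list)
    for claim in claims:
        key = link_key(claim.last_name, claim.birth_date, claim.zip_code)
        index[key].append(claim)

    linked = {}
    for case in cases:
        key = link_key(case.last_name, case.birth_date, case.zip_code)
        matches = sorted(index.get(key, []), key=lambda c: c.service_date)
        linked[case.case_id] = [(c.service_date, c.agent) for c in matches]
    return linked
```

Because matched records arrive already coded and dated, the registrar does not have to abstract each agent by hand, and the ordered list gives both initial and subsequent therapy for a case.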
33. Treatment Linkages: Oral agents
• 25%+ systemic Rx and growing
• No population-based information (CTs/case series data only)
• Capturing pharmacy data offers potential for
– Supplementing treatment
– Monitoring disparities in use and nonadherence
– Identifying adverse events
34. Treatment Linkages: Claims data for
infusion Rx
• Value of claims for treatment
• Standardized format and nomenclature from all providers
• High degree of accuracy and detail based on HCPCS codes for treatment
• Longitudinal data permits assessment of initial and subsequent therapy
• SEER-Medicare (65+)
• Linked since early 1990s
• >1600 publications
• Moonshot sponsored Claims Workshop in Sept 2017
– Proposal: expand the integration of claims from multiple insurers, already accomplished
in KY and Seattle, to the broader SEER program
• Central claims processors (oncology)
• Represent patient populations for all payers & all ages
• A single central processor for 25-45% of oncologists within 7 SEER registries
• Pilot in GA: 6 of 12 oncology practices completed 4.5 years of data capture (~15% of
cancer patients in GA)
• Implemented in 5 additional registries
35. Preliminary Data from 6 Months Claims in 4 Georgia Oncology Practices:
Common Regimens for Treatment of Initial and Recurrent Breast Cancer
Common regimens, initial treatment of breast cancer: 4 regimens, 4,676 administrations
Common regimens, treatment of recurrent or metastatic breast cancer: 9 regimens, 1,262 administrations
[Figure: Administration frequency for chemotherapy regimens commonly used for initial breast cancer treatment (6 months of data); axis: Frequency of Administration]
[Figure: Administration frequency for chemotherapy regimens commonly used for treatment of recurrent or metastatic breast cancer (6 months of data); axis: Frequency of Administration]
36. Other opportunities to capture
treatment
Link with oncology practice intermediaries
• Radiation Oncology/EMRs – Metrik/Mosaic (45%) & Varian
(50%)
– Pilots in development/implementation in KY
– Opportunity to capture more detailed radiation information
• For both initial and subsequent therapy (e.g., treatment of metastatic lesions)
• Oncology Practice EMR data
– MOU for data exchange with ASCO CancerLinQ (CLQ) to automatically
report data to registries from oncology practices
• Pilot protocol in process (9 practices in 3 states: IA, GA, UT)
• CLQ has agreements with >2000 oncologists currently and growing
• Opportunity to capture both systemic and orally administered
treatment and supplement other data (recurrence)
39. Capturing outcomes other than survival:
Recurrence
Identifying patients with distant recurrent disease is critical with
>18 million cancer survivors for whom we cannot describe the risk
of recurrence
• Identification of recurrence is complex
• It can be diagnosed via multiple methods which vary by:
– cancer site
– time to recurrence
– diagnosing physician type
• primary care, oncologist, radiation oncologist, radiologist etc.
– diagnostic method (biopsy, imaging, serology)
• Accurate measurement of recurrence requires capture of multiple
layered, combined data sources and new methods (NLP) to provide
comprehensive capture of recurrence(s).
40. Capturing outcomes other than
survival: Patient Generated Health Data
• Certain categories of information can ONLY
be collected from the patient
• Symptoms/QOL
• Health History
• Working with partners to test solutions, e.g.,
patient portals, direct patient reporting, and
patient-generated data sources
• MBCA/ACS/NCI funded activities for
– Focus groups on patient reporting to registries
– Financial toxicity as an important outcome
• NCI funded registry study in 4 registries to
explore methods for patients to provide data to
registries via electronic formats
42. Genomic data now critical to
understand each patient’s cancer
• The increased availability of targeted agents and the
proliferation of genomic testing of tumors
represent a special challenge to registrars.
– Many individual biomarkers, which may or may not be available to
registrars in the EMR
– Genomic panels can consist of hundreds of mutation tests with varying
structure and actionable information
– It is not feasible for registrars to capture all these data so alternative
methods must be considered.
43. Capturing Genomics Data in SEER through
NLP and automation
• The number of clinically important biomarkers, and the requirement to capture
them, is increasing
– Important for both prognosis and predicting response to therapy
• These are often in unstructured path reports
– We cannot continuously increase the number of required data items registrars
must collect
– Targeted NLP and machine learning can supplement the work registrars do
– Human-assisted machine learning may be the optimal way forward
(using machines where possible, with human adjudication still
required for many things)
• Tools and methods developed using path reports can be applied to other
unstructured text in the EMR (radiology dictations, clinical notes, etc.)
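As a rough illustration of targeted NLP with human adjudication, the sketch below pulls a few breast cancer biomarker statuses out of free-text pathology wording and flags conflicting mentions for registrar review. It is a toy pattern-matching example, not the SEER or DOE pipeline: the marker list, the regular expression, and the function name `extract_biomarkers` are assumptions made here, and it does not handle negation ("not positive"), misspellings, or results reported as percentages or scores.

```python
import re

# Assumption: only three markers and three status words are recognized.
BIOMARKER_PATTERN = re.compile(
    r"\b(?P<marker>ER|PR|HER2)\b"      # marker abbreviation as a whole word
    r"[^.;\n]{0,40}?"                   # short gap within the same clause
    r"\b(?P<status>positive|negative|equivocal)\b",
    re.IGNORECASE,
)


def extract_biomarkers(report_text):
    """Return (statuses, needs_review) for one pathology report.

    statuses: first status found for each marker, e.g. {"ER": "positive"}.
    needs_review: markers with conflicting mentions, to be sent to a
                  human (registrar) for adjudication.
    """
    statuses = {}
    needs_review = set()
    for match in BIOMARKER_PATTERN.finditer(report_text):
        marker = match.group("marker").upper()
        status = match.group("status").lower()
        if marker in statuses and statuses[marker] != status:
            needs_review.add(marker)
        statuses.setdefault(marker, status)
    return statuses, sorted(needs_review)


if __name__ == "__main__":
    text = (
        "Estrogen receptor (ER): Positive (95%). PR negative. "
        "HER2 by IHC: equivocal; FISH pending."
    )
    print(extract_biomarkers(text))
```

The design choice mirrors the slide: the machine handles the routine, clearly stated results, and anything ambiguous or contradictory is routed to a person rather than guessed.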
44. Capturing Genomics Data in SEER through
Linkages: targeting high volume/high
relevance sources
– GHI
• Oncotype DX 21 and 16 gene assay completed and repeating annually
• Added 43% of test results not seen in hospital reported data (sent to MD office)
– BRCA panels for Breast and Ovarian
• Pilot linkage in GA/CA completed for 2013-15 data
• Processes for linkage confirmation/data access in development
• Goal – link data across all SEER
– Syapse
• Genomic data acquisition specialist (analogous to the Claims vendor)
• Receive and store data from >15 genomic testing labs for clinical client access
• Developing pilot in GA for linkage with selected genomic lab data
– GenomeDX and Foundation Medicine
• Pilots in discussion to link panels with SEER data
45. OncotypeDx Population-based results corroborating
CTs in a real world setting (n=38,568)
• Risk Score (RS) predicted breast
cancer mortality
• Known chemo use followed
recommendations*
• 7% of RS <18
• 34% of RS 18-30
• 69% of RS ≥31
• No significant association of RS
with non-breast cancer
mortality (p=0.66)
*Chemotherapy known to be under-reported
in SEER
[Figure legend: High RS Group, Intermediate RS Group, Low RS Group]
46. SEER-Linked Virtual Bio-Repository Pilot
7 registries funded for pilot of pancreas and breast 9/15
• Focus on “exceptional” survivors
• 431 early stage node negative breast cancer (< 2 yr survival)
• 224 pancreatic adenocarcinoma long term survivors (> 5 yr survival)
• Matched controls for both sets of cancers
• Purpose
• Assess best practices across multiple registries
• Estimate costs of supporting a SEER wide system
• Assess availability of specimens
• Understand human subjects/consent as requirements vary by registry
and prepare for common rule changes
• In addition to primary objectives:
– Once completed, the pilot will provide a well annotated set of cancers with unusual
responses that will be available for research purposes
47. Real World Evidence
• Needs big data!
• Needs population representation
• Needs epidemiologists and
statisticians to understand the
potential biases in representation
• EHRs, NLP, Machine Learning can
power real world evidence learning
• Critical for a Learning Health
System
But people can make effective decisions on the same number of factors…
How can we use machine learning and other techniques to reduce cognitive overload?
The scale of genomics, population science and data science is dramatically changing!
Talking points for DOE slide Pilot 3
The SEER cancer surveillance program has been collecting data on all cancer patients since 1973
Data include patient characteristics, initial treatment, survival and cause of death for the entire population. The process is represented in the top portion of the figure.
These data from surveillance provide a “report card” on how we are doing in advancing treatment and prevention of cancer across the entire population and in subgroups where there may be disparities in outcome
SEER is also a national resource that can support the development of new diagnostics and new treatments by leveraging this unique comprehensive system on cancer to provide a sampling frame and infrastructure for tissue based research, cohort studies and clinical trials.
With growing complexity of cancer diagnosis and treatment, capturing essential information to continue this process is increasingly difficult.
Information gaps (subsequent treatment and disease progression or recurrence) exist that are now essential to understanding differences in outcome seen in the real world cancer patient.
Because the data collection process has been largely manual and the primary sources of data (such as EMR documents) are predominantly free text, the new methods for applying advanced computational capabilities (both data mining and modeling) to automate data extraction through this collaboration with DOE represent an opportunity to:
• Close the critical information gaps
• Develop a nimble, flexible platform on which we can
– add new sources such as genomics to provide a deeper understanding of drivers of cancer and outcomes in the population
– increase the timeliness of reporting to minimize the time lag between new medical technologies and implementing changes
This will enable better understanding of how the real world patient is treated and the outcomes associated with those treatments in the context of our complex medical and social environment (compared with 3% in CTs).
Now Gina will discuss the proposed implementation process for Pilot 3