The document provides an overview of data mining applications in healthcare. It discusses data mining uses like descriptive and predictive analysis using classification, clustering, and association rules. Technologies covered include database technologies, OLAP, visualization, data scrubbing, and natural language processing. Applications discussed are in safety and quality, clinical research, financial analysis, and public health areas like disease surveillance. The goal is to allow participants to evaluate where these technologies are useful in healthcare.
Data Mining Healthcare Applications
1. Data Mining Applications In Healthcare
2. Introduction
Goals of today’s presentation:
• Provide an overview of the technologies that are relevant to the development and deployment of data mining solutions in healthcare
• Allow participants to evaluate where the technology is useful
3. What is Data mining?
Divining knowledge from data
4. Topic Outline
Data mining
• Uses
• Algorithms
• Technology
• Applications in healthcare
5. Data Mining Uses
• Descriptive: understand and characterize
> Clustering
> Summarization
> Association Rules
> Sequence Discovery
• Predictive: extrapolate and forecast
> Classification
> Regression
> Time-Series
7. Technology solutions
Technology
Data Mining Infrastructure Technologies
• Database Technologies
• On-Line Analytical Processing (OLAP)
• Visualization Technologies
• Data scrubbing technologies
• Natural Language Processing (NLP)
8. Database Technologies
• Data warehouse vs. Data mart
• Relational technologies
> Oracle
> Microsoft
• XML-databases
> Raining Data
11. Data scrubbing technologies
• Data cleansing
• Filling in missing data
• In healthcare, there is a strong need for de-identification to protect privacy
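As one illustration of "filling in missing data", the sketch below replaces missing numeric values with the median of the observed values. The record layout, the field name `glucose`, and the choice of median imputation are assumptions made for the example; the slides do not name a specific technique.

```python
from statistics import median


def fill_missing(records, field):
    """Return copies of `records` with missing `field` values set to the median.

    A value counts as missing when it is absent or None.
    """
    observed = [r[field] for r in records if r.get(field) is not None]
    if not observed:
        # Nothing to impute from; return unchanged copies.
        return [dict(r) for r in records]
    fill_value = median(observed)
    filled = []
    for r in records:
        copy = dict(r)  # leave the caller's data untouched
        if copy.get(field) is None:
            copy[field] = fill_value
        filled.append(copy)
    return filled


# Hypothetical lab values; None marks a missing measurement.
visits = [
    {"visit": 1, "glucose": 90},
    {"visit": 2, "glucose": None},
    {"visit": 3, "glucose": 110},
    {"visit": 4, "glucose": 100},
]
cleaned = fill_missing(visits, "glucose")
```

Median imputation is only one option; whether it is appropriate depends on why the values are missing.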
12. De-Identification of Medical Records *
• Names;
• social security numbers;
• all elements of a street address, city, county, precinct, zip code, & their equivalent geocodes, except for the initial three digits of a zip code for areas that contain over 20,000 people;
• medical record numbers;
• health plan beneficiary numbers;
• account numbers;
• certificate/license numbers;
• all elements of dates (except year) for dates directly related to the individual (e.g., birth date, admission/discharge dates, date of death); and all ages over 89 and all elements of dates (including year) indicative of such age, except that such ages and elements may be aggregated into a single category of age 90 or older;
• license plate numbers, vehicle identifiers and serial numbers;
• device identifiers and serial numbers;
• URL addresses;
• Internet Protocol (IP) address numbers;
• biometric identifiers, including finger and voice prints;
• telephone numbers;
• fax numbers;
• e-mail addresses;
• full face photographic images and comparable images;
• any other unique identifying number except as created by IHS to re-identify information.
* Source: Policy and Procedures for De-Identification of Protected Health Information and Subsequent Re-Identification, 45 CFR 164.514(a)-(c), posted by IHS (Indian Health Service)
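A few of the rules above can be sketched in code. The example below is a minimal illustration, not a complete de-identification procedure: the field names, the ISO date strings, and the `zip3_allowed` flag (which the caller sets only when the three-digit zip area is known to contain over 20,000 people) are all assumptions made for the example, and only a subset of the listed identifiers is handled.

```python
# Hypothetical field names for identifiers that are removed outright.
DIRECT_IDENTIFIERS = {
    "name", "ssn", "street_address", "city", "county",
    "medical_record_number", "health_plan_number", "account_number",
    "phone", "fax", "email", "url", "ip_address",
}
# Hypothetical date fields, assumed to hold "YYYY-MM-DD" strings.
DATE_FIELDS = {"birth_date", "admission_date", "discharge_date", "death_date"}


def deidentify(record, zip3_allowed=False):
    """Return a reduced copy of `record` following a few of the listed rules."""
    age = record.get("age")
    over_89 = age is not None and age > 89
    out = {}
    for key, value in record.items():
        if key in DIRECT_IDENTIFIERS:
            continue  # drop direct identifiers entirely
        if key == "zip_code":
            # Keep only the first three digits, and only when permitted.
            if zip3_allowed:
                out["zip3"] = str(value)[:3]
        elif key in DATE_FIELDS:
            # A birth year would reveal an age over 89, so drop it in that case.
            if key == "birth_date" and over_89:
                continue
            out[key.replace("_date", "_year")] = int(str(value)[:4])
        elif key == "age":
            # Ages over 89 are aggregated into a single "90+" category.
            out["age"] = "90+" if over_89 else value
        else:
            out[key] = value
    return out


# Made-up patient record for illustration only.
patient = {
    "name": "Jane Doe",
    "ssn": "000-00-0000",
    "zip_code": "39216",
    "birth_date": "1911-04-12",
    "admission_date": "2004-06-01",
    "age": 93,
    "diagnosis": "pneumonia",
}
clean = deidentify(patient, zip3_allowed=True)
```

Free-text fields such as clinical notes can still contain names or dates, which is one reason scrubbing and NLP appear together in this technology list.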
13. Natural Language Processing
• NLP Uses
> translation, summarization, information extraction, document retrieval or categorization
• NLP Companies in health care
> A-Life
> Language and Computing
• NLP Approaches
> Clustering, Classification, Linguistic analysis, knowledge-based analysis
14. Applications in Healthcare
• Safety and quality
• Clinical Research
• Financial
• Public Health
15. “To Err Is Human” IOM Report
• Characterization
> JCAHO Core Measures
> CMS Quality measures starter set
> Improves patient care – reactive response
• Prediction
> Identifying cases that can result in bad clinical outcomes and raising appropriate alarms
> Impacts patient care – proactive response
16. Quality Measures – Initial Set*
Starter Set of 10 Hospital Quality Measures, grouped by condition:
• Acute Myocardial Infarction (AMI)/Heart attack
> Aspirin at arrival
> Aspirin at discharge
> Beta-Blocker at arrival
> Beta-Blocker at discharge
> ACE inhibitor for left ventricular systolic dysfunction
• Heart Failure
> Left ventricular function assessment
> ACE inhibitor for left ventricular systolic dysfunction
• Pneumonia
> Initial antibiotic timing
> Pneumococcal vaccination
> Oxygenation assessment
*Source: http://www.cms.hhs.gov/quality/hospital/overview.pdf
17. Safety and Quality
• University of Mississippi Medical Center
> Data Warehouse Technologies to understand Medication Errors – Funded by AHRQ
> Anonymous report data collection
> Data mining technologies
> Use of Neural networks and associative rule inference
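To make the rule-inference idea concrete, the sketch below computes the two standard measures of an association rule, support and confidence, over a handful of made-up anonymous error reports. The report attributes (`night_shift`, `handwritten_order`, and so on) are hypothetical and are not taken from the University of Mississippi Medical Center project, whose actual method the slides do not describe.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) for the rule antecedent -> consequent."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # the antecedent never occurs, so the rule is undefined
    both = support(transactions, set(antecedent) | set(consequent))
    return both / base


# Hypothetical anonymous medication-error reports, one set of attributes each.
reports = [
    {"night_shift", "handwritten_order", "wrong_dose"},
    {"night_shift", "wrong_dose"},
    {"day_shift", "handwritten_order", "wrong_drug"},
    {"night_shift", "handwritten_order", "wrong_dose"},
    {"day_shift", "electronic_order"},
]

# Rule: night_shift -> wrong_dose
rule_support = support(reports, {"night_shift", "wrong_dose"})
rule_confidence = confidence(reports, {"night_shift"}, {"wrong_dose"})
```

A full miner such as Apriori would enumerate candidate itemsets and keep those above a minimum support; this sketch only scores a rule that has already been chosen.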
18. Clinical Research & Clinical Trials
• Pharmacy and medical claims data
• Drug efficacy and clinical trials – for example, how effective is a particular drug regimen?
• Protein structure analysis
• Genomic data mining
• Diagnostic imaging data research
19. The bottom line on cost
• General utilization review – does the care provided meet accepted clinical and cost guidelines?
• Drug utilization review
• Outlier analysis – exceptions to treatment – analyzing treatments that cost more or less than normal.
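The outlier analysis described above can be sketched as follows. This is one possible approach, assuming that "normal" is the median cost and that a case is an exception when it lies more than `k` median absolute deviations from it; the case identifiers, the costs, and the threshold `k=3.0` are illustrative assumptions.

```python
from statistics import median


def cost_outliers(costs, k=3.0):
    """Flag cases whose cost is far above or below the typical cost.

    `costs` maps a case identifier to its treatment cost. A case is flagged
    when its distance from the median exceeds `k` times the median absolute
    deviation (MAD).
    """
    center = median(costs.values())
    mad = median(abs(c - center) for c in costs.values())
    if mad == 0:
        return {}  # no spread to measure against, so flag nothing
    flags = {}
    for case, cost in costs.items():
        if abs(cost - center) > k * mad:
            flags[case] = "high" if cost > center else "low"
    return flags


# Hypothetical treatment costs for one procedure.
costs = {"A": 1000, "B": 1100, "C": 950, "D": 1050, "E": 4000, "F": 200}
flagged = cost_outliers(costs)
```

The median and MAD are used here instead of the mean and standard deviation because a single very expensive case would otherwise inflate the baseline it is compared against.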
20. Data mining in public health
• Syndromic surveillance
• Bio-terrorism detection
• Communicable disease reporting (Centers for Disease Control and Prevention (CDC))
Example effort: AEGIS
• DAWN (Drug Abuse Warning Network)
• Food and Drug Administration (FDA) – reporting of adverse drug events.
21. Conclusion
Data mining
• Uses
> Descriptive
> Predictive
• Algorithms
> Classification
> Clustering
> Association rules
• Technology
> Database
> OLAP
> Visualization
> Scrubbing
> NLP
• Applications in healthcare
> Safety and Quality
> Clinical Research
> Financial
> Public Health