When I see a data analyst quickly transform and drill through a new pile of data to uncover a keen insight, I feel like I'm watching a new movie from the Marvel universe. If you haven't explored and learned to apply cloud, big data, streaming data, and rapid analytics techniques, then you haven't uncovered your superpowers yet. Here's how you can get started.
2. • Introduction
• Why do we need super powers?
• What are big data super powers?
• Cloud infrastructure
• Hadoop and distributed databases
• Stream data processing
• Rapid analytics
• How do you get your super powers?
4. • Complex coordination of services
• Care management, behavioral health, provider
networks, community services, telemedicine,
transportation, health neighborhood
• More varieties and larger volume of data
• Clinical notes, lab results, social media
• Wearable fitness and health trackers
• Faster turn-around time on business questions
• Missed opportunities, interactive discovery and
refinement of needs
• Population Health and Care Management
• Complex groups of people with complex needs
• Value-based payment models
• Complex revenue models including cost
management
5. 1. Cloud infrastructure
2. Hadoop and distributed databases
3. Stream data processing
4. Rapid analytics
Also (for another time)
• Machine learning
• Streaming analytics
• Interactive data visualization
6. Definition: The provisioning and use of compute,
data management, and analysis resources
through an external technology vendor who
hosts those services.
For example: AWS, MS Azure, Google Cloud
• Provision server, storage,
and network infrastructure
more quickly.
• Reduce development time by
using higher-level services
like database, messaging,
software solutions, and
machine learning as a service.
• Align infrastructure
investment with business
solutions and value.
• Learning curve and adoption
• Security management
• Development of skills
and new processes
7. • New infrastructure on same day it’s requested
• Build, test, and deploy features in days
• Convert from fixed to variable cost
• 12 million messages per week
• 20% annual growth
We’re a much more responsive and agile organization
using AWS, and that helps us grow our organization.
9. 1. Identify primary business goals
2. Evaluate vendors and partners
3. Align roadmap and milestones with strategic priorities
4. Track and report value realization
1. Identify a cross-team cloud-migration team
2. Train in alignment with vendor and technology direction
3. Establish new cloud-first IT processes
4. Test and refine processes through roadmap execution
10. Definition: Highly scalable, distributed data storage, processing, and
query solutions including relational databases, Hadoop, and NoSQL
databases.
For example: Hortonworks, MS HDInsight, AWS Redshift, Cloudera,
MS SQL PDW, Teradata, Netezza, Vertica, Google Bigtable, IBM
BigInsights
• Ability to cost-effectively
scale to billions of rows and
many terabytes+ of data
• Specialized data structure
and query/search tools for
text, images, relationships,
and documents
• Storage and transformation
on common platform
• Apply best tools for each
data processing need
• Many, diverse vendors
crowding the space
• Learning curve and adoption
• New requirements in data
architecture, modeling, and
metadata
• Duplication of data for
different needs
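The scaling benefit above depends on spreading rows across many machines. Below is a minimal sketch of hash partitioning, one common way distributed stores place data. The node count, the `member_id` key, and the function names are hypothetical and not tied to any vendor listed on this slide.

```python
import hashlib
from collections import defaultdict


def node_for_key(key: str, num_nodes: int) -> int:
    """Map a row key to a node index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def partition_rows(rows, key_field, num_nodes):
    """Group rows by the node that would store them."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[node_for_key(str(row[key_field]), num_nodes)].append(row)
    return dict(nodes)


# Hypothetical claim rows keyed by member id (illustrative data only).
rows = [{"member_id": f"M{i:05d}", "amount": i * 10} for i in range(1000)]
placement = partition_rows(rows, "member_id", num_nodes=4)

for node, stored in sorted(placement.items()):
    print(f"node {node}: {len(stored)} rows")
```

Because the same key always hashes to the same node, a lookup by `member_id` only needs to touch one node, while a full scan can run on all nodes in parallel.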
11. New patient data in the US would create a stack of paper
1,000 miles high every year.
For comparison:
• Commercial airplane: 35,000 ft (6.6 miles)
• International Space Station: 250 miles
• Hubble Telescope: 340 miles
12. • 15 million members
• IL, MT, NM, OK, TX
• Single view of membership
• ACA reporting
• Understand cross-channel member
interactions
“improve customer service by understanding what our customers are
experiencing and enabling us to have a real-time view of what’s going
on in our business”
14. 1. Identify primary business goals and quick wins
2. Leverage road-shows to demonstrate possibilities
3. Establish executive support and departmental
sponsorship
4. Track and report value realization
5. Scale into enterprise strategy
1. Identify energetic learners and partner to get
started
2. Use known data to create a proof of concept
and innovative business insights
3. Create flashy demonstrations, videos, and
roadshows to create excitement
4. Train core team and expand big data impact
with best practices
15. • Support intra-business cycle
management decisions
• Enable active front-line
decision making
• Personal rather than
aggregate decisions
• New data processing
paradigm
• Leads to increased
volume of data
• Managing out-of-order and
incomplete transactions
• Learning curve and adoption
Definition: The movement and processing of data for
decision making and management on a timeframe that
enables the business to rapidly adapt to customer
interactions and changing needs.
For example: Internet of Things, Lambda Architecture,
AWS Kinesis, Azure Event Hubs, Spark, Storm, Flume,
and Kafka.
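The challenge of out-of-order transactions noted above can be sketched with a small reorder buffer. This is an illustrative approach under stated assumptions, not the API of Kafka, Spark, Storm, or any other tool named here: events carry an integer event time, the buffer waits `allowed_lateness` time units before releasing them, and anything arriving later than that is set aside. The class and parameter names are hypothetical.

```python
import heapq


class ReorderBuffer:
    """Hold events briefly so they can be emitted in event-time order."""

    def __init__(self, allowed_lateness: int):
        self.allowed_lateness = allowed_lateness
        self._heap = []        # min-heap ordered by event time
        self._max_seen = None  # largest event time observed so far
        self._seq = 0          # tie-breaker so payloads are never compared
        self.late = []         # events that arrived too late to reorder

    def _watermark(self):
        # Events at or before this time are considered complete.
        return self._max_seen - self.allowed_lateness

    def add(self, event_time: int, payload):
        """Accept one event; return the events now safe to emit, in order."""
        if self._max_seen is not None and event_time < self._watermark():
            self.late.append((event_time, payload))
            return []
        heapq.heappush(self._heap, (event_time, self._seq, payload))
        self._seq += 1
        if self._max_seen is None or event_time > self._max_seen:
            self._max_seen = event_time
        ready = []
        while self._heap and self._heap[0][0] <= self._watermark():
            t, _, p = heapq.heappop(self._heap)
            ready.append((t, p))
        return ready

    def flush(self):
        """Emit everything still buffered, for example at end of stream."""
        ready = []
        while self._heap:
            t, _, p = heapq.heappop(self._heap)
            ready.append((t, p))
        return ready


# Hypothetical arrival order of event times; 5 arrives far too late.
buffer = ReorderBuffer(allowed_lateness=3)
emitted = []
for t in [1, 3, 2, 7, 4, 6, 12, 5]:
    emitted.extend(buffer.add(t, f"event@{t}"))
emitted.extend(buffer.flush())

print([t for t, _ in emitted])
print([t for t, _ in buffer.late])
```

A larger `allowed_lateness` tolerates more disorder but delays every decision, which is the core trade-off when moving from batch to stream processing.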
16. • Ability to classify and adjust
patient risk throughout the stay
• Reduced heart-failure
readmissions from 26% to 21%
• Redirected resources to highest
risk patients
“It also allows companies to almost
instantaneously detect fraud and intrusions,
rather than waiting to collect all the data and
processing it after it is too late.”
18. 1. Identify missed business opportunities
2. Establish executive support and departmental sponsorship
3. Track and report value realization
1. Identify team with mix of batch and application expertise
2. Consider a new real-time source or conversion of existing batch process
3. Evaluate technology options between ETL vendor, Open Source, and Cloud
4. Implement POC to lay ground work for enterprise best practices
5. Don’t expect one technology for everything
19. • Quickly prototype and test
hypotheses
• Intuitiveness of visual
exploration
• Reduced investment from
developers / IT teams
• Encourages intimate
understanding of operational
data and processes
• Potential duplication of effort
across teams or initiatives
• Most effective with the
adoption of formal processes
by analysts
Definition: Processes and tools that let analysts
move quickly from raw data to analytical datasets
to insights without establishing a large-scale
project.
For example: Agile methods, Data Lake, self-service
desktop analytics, Tableau, QlikView, Power BI
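The path from raw data to an analytical dataset to an insight can be sketched in a few lines using only the Python standard library. The call-center log below is made-up sample data, and the column names and helper functions are assumptions for illustration only.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical raw extract; row 3 is missing its wait time.
RAW = """call_id,channel,wait_seconds,resolved
1,phone,120,yes
2,web,30,yes
3,phone,,no
4,phone,300,no
5,web,45,yes
"""


def load_clean(raw_text):
    """Raw data -> analytical dataset: drop incomplete rows, fix types."""
    rows = []
    for rec in csv.DictReader(io.StringIO(raw_text)):
        if not rec["wait_seconds"]:
            continue  # skip rows with a missing wait time
        rows.append({
            "channel": rec["channel"],
            "wait_seconds": int(rec["wait_seconds"]),
            "resolved": rec["resolved"] == "yes",
        })
    return rows


def summarize(rows):
    """Analytical dataset -> insight: per-channel volume, wait, resolution."""
    by_channel = defaultdict(list)
    for row in rows:
        by_channel[row["channel"]].append(row)
    return {
        channel: {
            "calls": len(group),
            "avg_wait": mean(r["wait_seconds"] for r in group),
            "resolved_rate": sum(r["resolved"] for r in group) / len(group),
        }
        for channel, group in by_channel.items()
    }


summary = summarize(load_clean(RAW))
for channel, stats in sorted(summary.items()):
    print(channel, stats)
```

The point of the sketch is the short loop: an analyst can change the cleaning rule or the grouping and rerun it in seconds, without a project plan.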
20. • Wellmark
• Text analytics, Call center and
web logs, Out of state claims
• North Carolina BCBS
• Extract Factory
• Cleveland Clinic
• Agile processes, Visual
exploration
1. Agile analysis lets people analyze data the way they
actually think.
2. Quick iterations. It’s like agile development in this way.
3. Data granular enough to answer unanticipated questions.
4. New importance of personal skill, knowledge of the
subject, and skill managing data.
5. It thrives in organizations that encourage it.
21. 135 million prescription claims
500 GB of data in text files
Insights within days
No existing infrastructure
Setup time: 2 days
• Create AWS Redshift database
• Upload files to S3
• Load data to Redshift
• Profile and check data quality
• Cleanse and transform with SQL
• Build reports in Tableau and R
1 month collaboratively running
analysis and gaining insights
Total infrastructure cost: $300
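The load and profiling steps above can be sketched as SQL generated from Python. This is a sketch under assumptions, not the statements used in the project: the table name `rx_claims`, the column `drug_code`, the S3 path, the IAM role ARN, and the pipe delimiter are all placeholders. `COPY ... FROM ... IAM_ROLE ... DELIMITER ... IGNOREHEADER` is standard Redshift syntax, but the right options depend on the actual file layout.

```python
import re

_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def _check_identifier(name: str) -> str:
    """Allow only simple SQL identifiers, since they are interpolated."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def build_copy_statement(table, s3_uri, iam_role_arn, delimiter="|"):
    """Build a Redshift COPY statement that loads text files from S3."""
    return (
        f"COPY {_check_identifier(table)}\n"
        f"FROM '{s3_uri}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"DELIMITER '{delimiter}'\n"
        "IGNOREHEADER 1;"
    )


def build_profile_query(table, column):
    """Build a basic data-quality profile for one column."""
    table = _check_identifier(table)
    column = _check_identifier(column)
    return (
        "SELECT COUNT(*) AS row_count,\n"
        f"       COUNT(DISTINCT {column}) AS distinct_values,\n"
        f"       SUM(CASE WHEN {column} IS NULL THEN 1 ELSE 0 END) AS null_count\n"
        f"FROM {table};"
    )


# Placeholder values for illustration only.
copy_sql = build_copy_statement(
    table="rx_claims",
    s3_uri="s3://example-bucket/claims/",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftRole",
)
profile_sql = build_profile_query("rx_claims", "drug_code")

print(copy_sql)
print(profile_sql)
```

The generated statements would be run through whatever SQL client is connected to the cluster; the profiling query is a reasonable first check before any cleansing or transformation.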
22. 1. Evaluate organization-wide analytics culture and needs
2. Identify opportunities and gain executive buy-in for future
3. Create roadmap and implement change program
1. Understand how analysts and end-users actually do their work
2. Evaluate technologies, processes, and education to support analysts and users
3. Identify processes that are impeding analysts and users without adding value
4. Execute plan in alignment with strategic direction
23. Super Power: Getting Started Guides
Cloud infrastructure: AWS, MS Azure
Hadoop and distributed databases: Hortonworks Data Platform Sandbox
Stream data processing: Hortonworks Data Flow
Rapid analytics: Tableau
• Do online tutorials.
• Take a problem you
already know how
to solve well and
re-solve it using
these technologies.
• Ask Amitech team
members for help,
advice, and
tutorials.
Editor's Notes
From McKinsey - http://www.mckinsey.com/industries/healthcare-systems-and-services/our-insights/the-big-data-revolution-in-us-health-care
• Kaiser Permanente has fully implemented a new computer system, HealthConnect, to ensure data exchange across all medical facilities and promote the use of electronic health records. The integrated system has improved outcomes in cardiovascular disease and achieved an estimated $1 billion in savings from reduced office visits and lab tests.
• Blue Shield of California, in partnership with NantHealth, is improving health-care delivery and patient outcomes by developing an integrated technology system that will allow doctors, hospitals, and health plans to deliver evidence-based care that is more coordinated and personalized. This will help improve performance in a number of areas, including prevention and care coordination.
• AstraZeneca established a four-year partnership with WellPoint’s data and analytics subsidiary, HealthCore, to conduct real-world studies to determine the most effective and economical treatments for some chronic illnesses and common diseases. AstraZeneca will use HealthCore data, together with its own clinical-trial data, to guide R&D investment decisions. The company is also in talks with payors about providing coverage for drugs already on the market, again using HealthCore data as evidence.
https://www.cloudcomputing-news.net/news/2016/jun/27/why-healthcare-industrys-move-cloud-computing-accelerating/
http://www.level3.com/~/media/files/ebooks/en_cloud_eb_healthcare.pdf
http://www.cloud-council.org/deliverables/CSCC-Impact-of-Cloud-Computing-on-Healthcare.pdf
https://aws.amazon.com/health/customer-stories/
MiHIN Case Study
The Michigan Health Information Network Shared Services (MiHIN) uses AWS to process more than 12 million patient health information messages weekly, keep pace with 20 percent growth in demand for new services, and build and test new features in days rather than months. MiHIN enables the exchange of health information throughout the state of Michigan. The company’s network is used by Michigan healthcare providers and payers to securely share patient information. MiHIN is all-in on AWS, running its health-information network in the AWS Cloud.