Clover

is building a
Time Machine
DataEngConf
Apr 7, 2016
What is time?
Time | tīm |

noun


“The indefinite continued progression of existence and events that occur in apparently irreversible succession from the past through the present to the future.”
~The Internet
A software engineer’s worst nightmare
Who are these people talking about time machines?!
< Jasmine
Alyssa >
Clover Health is reinventing the health insurance model by using data
Complex & Rich.
Made with love and COBOL
Not owning your data
Typical lifecycle of data
App
1459671246251
User clicks on a button
1459671246251
Life event = publish event
Clover’s lifecycle of data
Member goes to doctor (Jan 10, 2016)
Billing enters claims (Apr 1, 2016)
Transaction clearinghouse systems and third-party claims processor
Data entry human at claims processor
Our pipelines
Clover publishes claim data in our DWH and knows a member went to the doctor on Jan 10, 2016 as of Apr 10, 2016
Happy path
What did Clover really know about someone’s health event that happened on Jan 10, 2016 as of Apr 2016, May 2016, vs Jun 2016? What did the claims processor know?
Oops, processed the claim wrong (restatement Apr 11, 2016)
Oops, there was a data entry error (restatement Jun 11, 2016)
Unreliable path
Oops, the pipeline is broken (breakage Jun 12, 2016)
Operational complexity
Members Providers Financials
Clover Data Platform
Applications, Data warehouse, Data Science models
Trying to figure out what happened
• [drawing - WHY IS THE TRASH CAN ON FIRE]
• ENTAILS: VASE, CRYING CHILD, CAT POOPING ON THE FLOOR
In order to predict what will happen
Not just about trash cans on fire
In the context at Clover Health
How do we make decisions to affect health outcomes?
Handling complexity
Temporal data structures
Current state and friends
Lossy!
Hard to analyze!
Current state batch redux
Upsert?
Replace?
Keeping the event log
Sensible!!!
(kinda)
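To make the contrast between a current-state table and an event log concrete, here is a minimal sketch in plain Python. The event tuples, statuses, and function names are hypothetical illustrations, not data or code from the talk.

```python
from datetime import date

# Hypothetical events: (published_on, member_id, status). Not data from the talk.
EVENTS = [
    (date(2016, 4, 1), "m1", "claim received"),
    (date(2016, 4, 11), "m1", "claim restated"),
    (date(2016, 6, 11), "m1", "claim corrected"),
]


def upsert_current_state(events):
    """Current-state table: each upsert overwrites the previous value (lossy)."""
    state = {}
    for _published_on, member_id, status in events:
        state[member_id] = status
    return state


def state_as_of(events, as_of):
    """Event log: replay only the events published on or before as_of."""
    return upsert_current_state(e for e in events if e[0] <= as_of)


current = upsert_current_state(EVENTS)
april_view = state_as_of(EVENTS, date(2016, 4, 15))
```

The current-state dictionary only ever holds the last status, while the log can still answer what was known in mid-April.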
Restating history (event log)
Amnesia!
Time: It’s a matter of perspective
[Figure: timeline with date ticks 1/1 and 1/3]
Two time dimensions
[Figure: plot with axes Effective Time and Published Time]
Time as spatial data
Rectangles!
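One way to read the rectangle idea is that each fact occupies a half-open range on the effective axis and another on the published axis. The sketch below assumes that reading. The `Rect` class, its field names, and the sample dates are hypothetical and only illustrate point-in-rectangle and overlap tests.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Rect:
    """A fact as a rectangle: half-open [start, end) ranges on both time axes."""
    effective_start: date
    effective_end: Optional[date]   # None = open-ended
    published_start: date
    published_end: Optional[date]   # None = still the current belief
    state: str


def _in_range(point, start, end):
    return start <= point and (end is None or point < end)


def contains(rect, effective_at, published_at):
    """True if the point (effective_at, published_at) lies inside the rectangle."""
    return (_in_range(effective_at, rect.effective_start, rect.effective_end)
            and _in_range(published_at, rect.published_start, rect.published_end))


def _axis_overlap(s1, e1, s2, e2):
    return (e2 is None or s1 < e2) and (e1 is None or s2 < e1)


def overlaps(a, b):
    """True if two rectangles share at least one point on both axes."""
    return (_axis_overlap(a.effective_start, a.effective_end,
                          b.effective_start, b.effective_end)
            and _axis_overlap(a.published_start, a.published_end,
                              b.published_start, b.published_end))


# Hypothetical example: a belief published Apr 1 and superseded on Jun 11.
old = Rect(date(2016, 1, 1), None, date(2016, 4, 1), date(2016, 6, 11), "enrolled")
new = Rect(date(2016, 1, 1), date(2016, 3, 1), date(2016, 6, 11), None, "enrolled")
```

Because the published ranges are half-open, the superseded rectangle and its replacement touch at Jun 11 without overlapping.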
• Uniform treatment of event logs and snapshots
• Reproduce event and snapshot views from one structure
• Relatively simpler data access patterns
How this helps us
Multi-temporal
[Diagram labels: Member goes to doctor; Clearinghouse; Claims; SFTP; DWH; Clover]
Implementations at Clover
Why we use relational (PostgreSQL)
• Industry standard
• Wide adoption
• Robust
• Approachable
• Not constrained by scale
• Distributed / sharding
• Transactions!
• Global clock!
PostgreSQL
• Not limited to scalar types
• GiST indexes!
• Exclusion constraints!
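In PostgreSQL, an exclusion constraint of the form EXCLUDE USING gist (id WITH =, publish_tr WITH &&, effective_tr WITH &&) rejects any two rows for the same id whose ranges overlap on both axes. As a rough illustration of that rule only, the plain-Python stand-in below performs the same check in memory. `BitemporalStore` and `ExclusionViolation` are hypothetical names, and this is not how the database implements it.

```python
from datetime import date


def ranges_overlap(a, b):
    """Half-open [start, end) overlap test; end=None means unbounded."""
    (a_start, a_end), (b_start, b_end) = a, b
    return (b_end is None or a_start < b_end) and (a_end is None or b_start < a_end)


class ExclusionViolation(Exception):
    """Raised when a new row would overlap an existing row for the same id."""


class BitemporalStore:
    """In-memory stand-in for a table guarded by an exclusion constraint."""

    def __init__(self):
        self.rows = []

    def insert(self, row_id, publish_range, effective_range, state):
        # Same id AND overlapping publish range AND overlapping effective range
        # is the combination the constraint forbids.
        for other_id, other_pub, other_eff, _state in self.rows:
            if (other_id == row_id
                    and ranges_overlap(other_pub, publish_range)
                    and ranges_overlap(other_eff, effective_range)):
                raise ExclusionViolation(
                    f"row for id={row_id!r} overlaps an existing row")
        self.rows.append((row_id, publish_range, effective_range, state))


store = BitemporalStore()
store.insert("m1", (date(2016, 4, 1), None), (date(2016, 1, 1), None), "enrolled")
```

The database enforces this declaratively and concurrently, whereas the linear scan here is just to make the rule readable.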
An example of bitemporal merge in SQL
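-- Template: :id, :publish_ts, :effective_tb, :effective_te and :state are bind
-- parameters (backticked on the slide); bitemporal_table stands in for "table".
INSERT INTO bitemporal_table (id, publish_tb, publish_tr, effective_tr, state)
-- Close the publish range of the rows that are current as of :publish_ts.
SELECT id AS id
     , LOWER(publish_tr) AS publish_tb
     , TSRANGE(LOWER(publish_tr), :publish_ts, '[)') AS publish_tr
     , effective_tr AS effective_tr
     , state AS state
FROM bitemporal_table
WHERE id = :id
  AND publish_tr @> :publish_ts
UNION ALL
-- Publish the new state with an open-ended publish range.
SELECT :id AS id
     , :publish_ts AS publish_tb
     , TSRANGE(:publish_ts, NULL, '[)') AS publish_tr
     , TSRANGE(:effective_tb, :effective_te, '[)') AS effective_tr
     , :state AS state
-- The conflict target is an assumption; the slide does not show one.
ON CONFLICT (id, publish_tb, effective_tr) DO UPDATE
SET publish_tr = EXCLUDED.publish_tr
  , effective_tr = EXCLUDED.effective_tr
  , state = EXCLUDED.state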
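The merge above does two things: it closes the publish range of the rows currently published for an id, and it appends the new state with an open-ended publish range. The in-memory sketch below mirrors those two steps in plain Python. The dictionary layout borrows the slide's column names, while `bitemporal_merge` and the sample values are hypothetical.

```python
from datetime import datetime


def contains(rng, ts):
    """Half-open [lower, upper) containment; upper=None means unbounded."""
    lower, upper = rng
    return lower <= ts and (upper is None or ts < upper)


def bitemporal_merge(rows, row_id, publish_ts, effective_tb, effective_te, state):
    """Return a new row list with current rows closed and the new row appended."""
    merged = []
    for row in rows:
        if row["id"] == row_id and contains(row["publish_tr"], publish_ts):
            # Step 1: end the publish range of the currently published row.
            row = {**row, "publish_tr": (row["publish_tr"][0], publish_ts)}
        merged.append(row)
    # Step 2: publish the new state from publish_ts onward.
    merged.append({
        "id": row_id,
        "publish_tb": publish_ts,
        "publish_tr": (publish_ts, None),
        "effective_tr": (effective_tb, effective_te),
        "state": state,
    })
    return merged


rows = bitemporal_merge([], "m1", datetime(2016, 4, 10),
                        datetime(2016, 1, 10), None, "visited doctor")
rows = bitemporal_merge(rows, "m1", datetime(2016, 6, 11),
                        datetime(2016, 1, 10), None, "visit corrected")
```

After the second call the first row is still present with a closed publish range, so the earlier belief remains queryable.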
Abstracting that away — SQLAlchemy
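from sqlalchemy.ext import compiler as sa_compiler

# BitemporalMerge is Clover's custom SQL element; its definition is not shown on the slide.
@sa_compiler.compiles(BitemporalMerge)
def _bitemporal_merge(element, compiler, **kw):
    # Compile each step of the merge and join them into one multi-statement string.
    return ';\n'.join([
        compiler.process(element.create_stg_working_table),
        compiler.process(element.process_intersecting_set),
        compiler.process(element.publish_new_set),
        compiler.process(element.clean_up_working_tables),
    ])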
Abstracting that away — Airflow
BitemporalMerge operator (template for a task)
Abstracting that away - Alembic
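from alembic.operations import MigrateOperation, Operations
from sqlalchemy.dialects.postgresql import ExcludeConstraint

@Operations.register_operation('create_bitemporal_table')
class CreateBitemporalTableOp(MigrateOperation):
    """Create a bitemporal src table."""

    # Excerpt from a method body; the enclosing signature is not shown on the slide.
        # .items() below implies a mapping, so default to a dict rather than a list.
        identities = identities or {}
        identity_constraints = [(expr, '=') for identity, expr in identities.items()]
        additional_exclusions = additional_exclusions or []
        exclusion_constraints = identity_constraints + additional_exclusions
        # Rows with the same identity may not overlap in both time ranges.
        exclusion = ExcludeConstraint(
            ('published_as_of', '&&'),
            (self.as_on_name, '&&'),
            *exclusion_constraints)
        current_publish_ixes = []
        current_publish_current_as_on_ixes = []
        …..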
Temporality as a concept
import sqlalchemy as sa
import clover_web.models.temporal as temporal

@temporal.add_clock('prop_a', 'prop_b')
class MyModel(temporal.Clocked, SomeBase):
    prop_a = sa.Column(sa.Integer)
    prop_b = sa.Column(sa.Text)

prop_a_hm = temporal.get_history_model(MyModel.prop_a)
Temporality as a concept
effective/valid
published
S3 archives/versions
Using the time machine
What was the member’s status according to the claims processor on Dec 1, 2015?
What was the member’s status according to us on Dec 1, 2015?
What is the member’s current full effective history?
What is our latest understanding of the member’s status according to the claims processor?
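Questions like these reduce to lookups on the two time axes. The sketch below covers only the effective and published dimensions. The perspective of a second party such as the claims processor would need an additional axis, which is not modeled here. The rows and function names are hypothetical, not Clover's data or API.

```python
from datetime import date

# Hypothetical rows: (effective_range, published_range, status).
# Ranges are half-open [start, end); None means unbounded.
ROWS = [
    ((date(2015, 1, 1), None), (date(2015, 1, 5), date(2016, 2, 1)), "active"),
    ((date(2015, 1, 1), date(2015, 11, 1)), (date(2016, 2, 1), None), "active"),
    ((date(2015, 11, 1), None), (date(2016, 2, 1), None), "terminated"),
]


def _contains(rng, point):
    start, end = rng
    return start <= point and (end is None or point < end)


def status_as_of(rows, effective_at, published_at):
    """What did we believe at published_at about the status on effective_at?"""
    for effective, published, status in rows:
        if _contains(effective, effective_at) and _contains(published, published_at):
            return status
    return None


def current_effective_history(rows):
    """Full effective history according to the latest published belief."""
    current = [(eff, status) for eff, pub, status in rows if pub[1] is None]
    return sorted(current, key=lambda item: item[0][0])


believed_then = status_as_of(ROWS, date(2015, 12, 1), date(2015, 12, 1))
believed_now = status_as_of(ROWS, date(2015, 12, 1), date(2016, 3, 1))
history = current_effective_history(ROWS)
```

The same effective date yields different answers depending on when the question is asked, which is the point of keeping both dimensions.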
[drawing with cause and effect enumerated]
Figuring out what happened
• How do we know if a call queue campaign was successful?
• How do we know how and where to deploy our nurses?
• How do we know what impact a certain data integration will have on understanding the risk profile of our members?
Making meaningful decisions about health outcomes
• Richard Snodgrass (http://www.cs.arizona.edu/~rts/publications.html)
• Developing Time-Oriented Database Applications in SQL
Further resources
Questions

MINDCTI Revenue Release Quarter One 2024MIND CTI
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024The Digital Insurer
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 

Último (20)

MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 

DataEngConf SF16 - Multi-temporal Data Structures

  • 1. Clover is building a Time Machine. DataEngConf, Apr 7, 2016
  • 3. Time | tīm | noun. The indefinite continued progression of existence and events that occur in apparently irreversible succession from the past through the present to the future.
  • 6. Who are these people talking about time machines?! Jasmine and Alyssa
  • 7. Clover Health is reinventing the health insurance model by using data
  • 10. Made with love and COBOL
  • 13. Typical lifecycle of data: User clicks on a button (1459671246251); App (1459671246251). Life event = publish event
  • 14. Clover’s lifecycle of data
      • Happy path: Member goes to doctor (Jan 10, 2016); Billing enters claims (Apr 1, 2016); Transaction clearinghouse systems and third-party claims processor; Data entry human at claims processor; Our pipelines; Clover publishes claim data in our DWH and knows a member went to the doctor on Jan 10, 2016 as of Apr 10, 2016.
      • Unreliable path: Oops, processed the claim wrong (restatement Apr 11, 2016). Oops, there was a data entry error (restatement Jun 11, 2016). Oops, the pipeline is broken (breakage Jun 12, 2016).
      • What did Clover really know about someone’s health event that happened on Jan 10, 2016 as of April 2016, May 2016, vs Jun 2016? What did the claims processor know?
  • 16. Operational complexity: Members, Providers, Financials; Clover Data Platform; Applications, Data warehouse, Data Science models
  • 17. Trying to figure out what happened
      • [drawing - WHY IS THE TRASH CAN ON FIRE]
      • ENTAILS: VASE, CRYING CHILD, CAT POOPING ON THE FLOOR
  • 18. In order to predict what will happen. Not just about trash cans on fire
  • 19. In the context at Clover Health: How do we make decisions to affect health outcomes?
  • 21. Current state and friends: Lossy! Hard to analyze!
  • 22. Current state batch redux: Upsert? Replace?
  • 23. Keeping the event log: Sensible!!! (kinda)
  • 24. Restating history (event log): Amnesia!
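The trade-off on slides 21 to 24 can be sketched in plain Python. This is only an illustration: the record shape, the `claim-1` id and the status values are assumptions, not Clover's schema. An upserted current-state table keeps only the last value, while an append-only log can still answer "what had been published by a given date".

```python
from datetime import date

# Hypothetical restatements of one claim, in the order they were published.
updates = [
    {"id": "claim-1", "published": date(2016, 4, 10), "status": "paid"},
    {"id": "claim-1", "published": date(2016, 4, 11), "status": "denied"},
    {"id": "claim-1", "published": date(2016, 6, 11), "status": "paid"},
]

# Current state: upsert by id. Every write overwrites the previous one (lossy).
current_state = {}
for u in updates:
    current_state[u["id"]] = u["status"]

# Event log: append only. Nothing is overwritten, so history can be replayed.
event_log = list(updates)


def status_as_of(log, claim_id, when):
    """Return the latest status published on or before `when`, or None."""
    seen = [e for e in log if e["id"] == claim_id and e["published"] <= when]
    return max(seen, key=lambda e: e["published"])["status"] if seen else None


print(current_state["claim-1"])                               # paid
print(status_as_of(event_log, "claim-1", date(2016, 5, 1)))   # denied
```

Note that the log only has one timestamp per entry. If an old entry were edited in place to restate history, the earlier belief would be gone, which is presumably the "amnesia" the slide refers to.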
  • 25. Time: It’s a matter of perspective
  • 26. Two time dimensions [figure: axes Effective Time and Published Time, with date ticks]
  • 27. Time as spatial data: Rectangles!
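One way to read "rectangles" is that each stored fact covers a half-open range on the effective axis and a half-open range on the published axis, so a lookup becomes a point-in-rectangle test. The sketch below assumes that reading; the `Rect` class, its field names and the sample states are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Rect:
    """`state` held over [eff_from, eff_to), as published over [pub_from, pub_to)."""
    state: str
    eff_from: date
    eff_to: Optional[date]   # None means open-ended
    pub_from: date
    pub_to: Optional[date]   # None means still the published version

    def contains(self, effective: date, published: date) -> bool:
        in_eff = self.eff_from <= effective and (self.eff_to is None or effective < self.eff_to)
        in_pub = self.pub_from <= published and (self.pub_to is None or published < self.pub_to)
        return in_eff and in_pub


# Sample data (made up): on Apr 15 we learn the member disenrolled on Mar 1.
rects = [
    Rect("enrolled", date(2016, 1, 1), None, date(2016, 1, 3), date(2016, 4, 15)),
    Rect("enrolled", date(2016, 1, 1), date(2016, 3, 1), date(2016, 4, 15), None),
    Rect("disenrolled", date(2016, 3, 1), None, date(2016, 4, 15), None),
]


def lookup(effective: date, published: date) -> Optional[str]:
    """Point query: what did we believe about `effective`, as of `published`?"""
    for r in rects:
        if r.contains(effective, published):
            return r.state
    return None


print(lookup(date(2016, 3, 10), date(2016, 4, 1)))   # enrolled
print(lookup(date(2016, 3, 10), date(2016, 5, 1)))   # disenrolled
```

Because the rectangles for one entity never overlap, each point query has at most one answer.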
  • 28. How this helps us
      • Uniform treatment of event logs and snapshots
      • Reproduce event and snapshot views from one structure
      • Relatively simpler data access patterns
  • 29. Multi-temporal Clover [diagram: Member goes to doctor, Claims, Clearinghouse, SFTP, DWH]
  • 31. Why we use relational (PostgreSQL)
      • Industry standard
      • Wide adoption
      • Robust
      • Approachable
      • Not constrained by scale
      • Distributed / sharding
      • Transactions!
      • Global clock!
      • PostgreSQL
          • Not limited to scalar types
          • GiST indexes!
          • Exclusion constraints!
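  • 32. An example of bitemporal merge in SQL (backticked names are parameters supplied by the caller)
        INSERT INTO table
        -- Close the currently published row at `publish_ts`.
        SELECT id id
             , LOWER(publish_tr) publish_tb
             , TSRANGE(LOWER(publish_tr), `publish_ts`, '[)') publish_tr
             , effective_tr effective_tr
             , state state
          FROM table
         WHERE id = `id`
           AND publish_tr @> `publish_ts`
        UNION ALL
        -- Publish the new row, open-ended in publish time.
        SELECT `id` id
             , `publish_ts` publish_tb
             , TSRANGE(`publish_ts`, NULL, '[)') publish_tr
             , TSRANGE(`effective_tb`, `effective_te`, '[)') effective_tr
             , `state` state
        -- The conflict target (id, publish_tb) is an assumption; the slide omits it.
        ON CONFLICT (id, publish_tb) DO UPDATE
        SET publish_tr = EXCLUDED.publish_tr
          , effective_tr = EXCLUDED.effective_tr
          , state = EXCLUDED.state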
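The same merge can be sketched outside the database to show what the statement on slide 32 does to the rows. This is a simplified in-memory model, not Clover's implementation: the tuple layout and the `bitemporal_merge` name are assumptions, and `None` stands in for an unbounded upper bound such as `TSRANGE(x, NULL, '[)')`.

```python
from datetime import datetime

# A row is (id, publish_from, publish_to, effective_from, effective_to, state).


def bitemporal_merge(rows, row_id, publish_ts, effective_from, effective_to, state):
    """Return a new list with the open row for row_id closed and the new row appended."""
    merged = []
    for rid, pub_from, pub_to, eff_from, eff_to, st in rows:
        covers = pub_from <= publish_ts and (pub_to is None or publish_ts < pub_to)
        if rid == row_id and covers:
            # Close the publish range at publish_ts; the old row is kept, not deleted.
            pub_to = publish_ts
        merged.append((rid, pub_from, pub_to, eff_from, eff_to, st))
    # Publish the new version, open-ended in publish time.
    merged.append((row_id, publish_ts, None, effective_from, effective_to, state))
    return merged


visit_start, visit_end = datetime(2016, 1, 10), datetime(2016, 1, 11)
first_publish, restatement = datetime(2016, 4, 10), datetime(2016, 4, 11)

rows = bitemporal_merge([], "claim-1", first_publish, visit_start, visit_end, "paid")
rows = bitemporal_merge(rows, "claim-1", restatement, visit_start, visit_end, "denied")
for row in rows:
    print(row)
```

After the second call the "paid" row is still present with its publish range closed at the restatement, and the "denied" row is the only one with an open publish range.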
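  • 33. Abstracting that away: SQLAlchemy
        from sqlalchemy.ext import compiler as sa_compiler

        # BitemporalMerge is a custom SQL construct defined elsewhere (not shown on the slide).
        @sa_compiler.compiles(BitemporalMerge)
        def _bitemporal_merge(element, compiler, **kw):
            # Compile each step and join them into one multi-statement string.
            return (';\n').join([
                compiler.process(element.create_stg_working_table),
                compiler.process(element.process_intersecting_set),
                compiler.process(element.publish_new_set),
                compiler.process(element.clean_up_working_tables),
            ])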
  • 34. Abstracting that away: Airflow. BitemporalMerge operator (template for a task)
  • 35. Abstracting that away: Alembic
        @Operations.register_operation('create_bitemporal_table')
        class CreateBitemporalTableOp(MigrateOperation):
            """Create a bitemporal src table"""
            # The statements below sit inside a method whose signature the slide does not show.
                identities = identities or {}  # mapping of identity name to column expression
                identity_constraints = [(expr, '=') for identity, expr in identities.items()]
                additional_exclusions = additional_exclusions or []
                exclusion_contraints = identity_constraints + additional_exclusions
                # No two rows with the same identity may overlap in both time ranges.
                exclusion = sa.dialects.postgresql.ExcludeConstraint(
                    ('published_as_of', '&&'),
                    ('{}'.format(self.as_on_name), '&&'),
                    *exclusion_contraints)
                current_publish_ixes = []
                current_publish_current_as_on_ixes = []
                …..
  • 36. Temporality as a concept
        import sqlalchemy as sa
        import clover_web.models.temporal as temporal

        @temporal.add_clock('prop_a', 'prop_b')
        class MyModel(temporal.Clocked, SomeBase):
            prop_a = sa.Column(sa.Integer)
            prop_b = sa.Column(sa.Text)

        prop_a_hm = temporal.get_history_model(MyModel.prop_a)
  • 37. Temporality as a concept: effective/valid, published, S3 archives/versions
  • 38. Using the time machine
      • What was member’s status according to the claims processor on Dec 1, 2015?
      • What was member’s status according to us on Dec 1, 2015?
      • What is the member’s current full effective history?
      • What is our latest understanding of the member’s status according to the claims processor?
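Questions like these reduce to filters over the two time dimensions. The sketch below is one possible reading, with made-up rows for a single member. Which axis corresponds to "according to the claims processor" versus "according to us" is not stated on the slide, so the functions are named after the axes they filter on; all names here are hypothetical.

```python
from datetime import date

# Hypothetical bitemporal rows for one member; None means an open upper bound.
rows = [
    {"status": "active", "eff": (date(2015, 1, 1), None),
     "pub": (date(2015, 1, 5), date(2016, 1, 20))},
    {"status": "active", "eff": (date(2015, 1, 1), date(2015, 11, 1)),
     "pub": (date(2016, 1, 20), None)},
    {"status": "inactive", "eff": (date(2015, 11, 1), None),
     "pub": (date(2016, 1, 20), None)},
]


def within(bounds, when):
    """True if `when` falls in the half-open range [lower, upper)."""
    lower, upper = bounds
    return lower <= when and (upper is None or when < upper)


def status(effective, published=None):
    """Status effective on `effective`, as published on `published` (latest if None)."""
    for r in rows:
        pub_ok = r["pub"][1] is None if published is None else within(r["pub"], published)
        if pub_ok and within(r["eff"], effective):
            return r["status"]
    return None


def current_effective_history():
    """All currently published rows, ordered along the effective axis."""
    live = [r for r in rows if r["pub"][1] is None]
    return [(r["eff"], r["status"]) for r in sorted(live, key=lambda r: r["eff"][0])]


print(status(date(2015, 12, 1), published=date(2015, 12, 1)))  # active
print(status(date(2015, 12, 1)))                               # inactive
print(current_effective_history())
```

The first call answers with what had been published by Dec 1, 2015; the second answers with the latest published understanding of that same effective date.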
  • 39. Figuring out what happened [drawing with cause and effect enumerated]
  • 40. Making meaningful decisions about health outcomes
      • How do we know if a call queue campaign was successful?
      • How do we know how and where to deploy our nurses?
      • How do we know what impact a certain data integration will have on understanding the risk profile of our members?
  • 41. Further resources
      • Richard Snodgrass (http://www.cs.arizona.edu/~rts/publications.html)
      • Developing Time-Oriented Databases in SQL