Ashim Joshi, Head of Innovation at InfoTrack, discusses how the Elasticsearch Service helped tackle a variety of use cases at InfoTrack, such as building a data lake and architecting a data mart layer.
See the video: https://www.elastic.co/elasticon/tour/2019/sydney/infotrack-creating-a-single-source-of-truth-with-the-elastic-stack
InfoTrack: Creating a single source of truth with the Elastic Stack
1. Creating a single source of truth with the Elastic Stack
A session with Ashim Joshi - Head of Innovation
2.
Intro
The problem
Validation + Roadmap
Datalake
Datamarts
API + Elastic
Final Product
Future Implementation
Conclusion
Idea to Reality
3.
We deal with huge volumes of activity daily.
We initially monitored our IIS logs manually using a file log tool. This was horribly inefficient and unscalable, and it presented transparency issues because we couldn’t trace applications.
With the Elastic Cloud we had all of our pain points answered. Elasticsearch could store and search our data blazingly fast, Logstash transformed our data into a readable format as we streamed it in, and Kibana allowed us to display our data in beautiful customisable dashboards. The Elastic Cloud saved us the months it would have taken to build an in-house solution, and spared us from resourcing a dedicated developer to maintain the product.
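To make the log transformation step concrete, here is a minimal Python sketch of the kind of parsing that turns a raw IIS log line into a searchable document. It is not InfoTrack's Logstash configuration: the field list, the output field names and the sample line are all assumptions chosen for illustration, and timestamps are assumed to be UTC.

```python
import json
from datetime import datetime, timezone

# Hypothetical field list. A real IIS W3C log declares its own fields
# in a "#Fields:" header line, so this would normally be read from the file.
FIELDS = ["date", "time", "s-ip", "cs-method", "cs-uri-stem", "sc-status", "time-taken"]


def parse_iis_line(line, fields=FIELDS):
    """Turn one space-delimited log line into a dict, or None if it is not a data line."""
    if not line.strip() or line.startswith("#"):
        return None  # header/comment lines carry no request data
    values = line.split()
    if len(values) != len(fields):
        return None  # malformed line: skip rather than guess
    record = dict(zip(fields, values))
    timestamp = datetime.strptime(
        record["date"] + " " + record["time"], "%Y-%m-%d %H:%M:%S"
    ).replace(tzinfo=timezone.utc)  # assumption: log times are UTC
    return {
        "@timestamp": timestamp.isoformat(),
        "server_ip": record["s-ip"],
        "method": record["cs-method"],
        "path": record["cs-uri-stem"],
        "status": int(record["sc-status"]),
        "time_taken_ms": int(record["time-taken"]),
    }


# Made-up sample line matching the hypothetical field list above.
sample = "2019-05-01 10:15:30 10.0.0.5 GET /search 200 42"
doc = parse_iis_line(sample)
print(json.dumps(doc, indent=2))
```

Typed fields such as a numeric status and a proper timestamp are what make the data easy to filter and chart once it is indexed.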
4.
Job titles aren’t binding. Each of us is equipped with a diverse skill set, and we always leverage each other’s interests and strengths to create better outcomes for our projects.
x3 Data Engineers
x1 Data Scientist
x6 Developers
x2 Designers
x1 Product Manager
8.
Diagram labels: Datamarts, Streaming, API Layer, Apps, Data Lake
9.
Diagram labels: Datamarts, Streaming, Platform, Apps, Data Lake
10.
A data lake is a centralized repository that allows you to store all
your structured and unstructured raw data at any scale.
S3 bucket(s)
structured data from relational databases (rows and columns)
semi-structured data (CSV, logs, XML, JSON, Parquet)
unstructured data (emails, documents, PDFs)
binary data (images, audio, video)
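As an illustration of how raw files of these different kinds might be laid out in one bucket, the Python sketch below builds an object key from a file's extension. The prefix scheme, the extension mapping and the function name are assumptions made for this example, not InfoTrack's actual layout. Structured data exported from relational databases cannot be recognised by extension alone, so it is not covered by the mapping.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from file extension to the data categories listed above.
CATEGORY_BY_EXTENSION = {
    ".csv": "semi-structured",
    ".log": "semi-structured",
    ".xml": "semi-structured",
    ".json": "semi-structured",
    ".parquet": "semi-structured",
    ".eml": "unstructured",
    ".docx": "unstructured",
    ".pdf": "unstructured",
    ".png": "binary",
    ".jpg": "binary",
    ".mp3": "binary",
    ".mp4": "binary",
}


def raw_key(source, filename, ingest_date):
    """Build a hypothetical object key: raw/<category>/<source>/<date>/<filename>."""
    extension = PurePosixPath(filename).suffix.lower()
    category = CATEGORY_BY_EXTENSION.get(extension, "unclassified")
    return f"raw/{category}/{source}/{ingest_date}/{filename}"


print(raw_key("crm", "invoice.PDF", "2019-05-01"))
print(raw_key("web", "access.log", "2019-05-01"))
```

Keeping the source system and ingest date in the key means raw data stays traceable to where and when it arrived, whatever its format.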
15.
Why do we need a data lake?
Data is stored everywhere, across different databases and services.
It is getting harder and harder to query our data.
It is challenging to see the overall picture of our business.
18.
Why do we need a data lake?
Data lakes are next-generation hybrid data management solutions that can meet big data challenges and drive new levels of real-time analytics from a single source of truth.
19.
Diagram labels: Data Source, Data Source, Amazon MSK, S3 Raw Data, Databricks, Streaming, Batch, Process, Lambda, Data files, Elasticsearch, Data marts, Snowflake, S3 Processed Data
Our Datalake
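The slide does not show how processed data reaches Elasticsearch. As one possible illustration, the Python sketch below converts processed records into the newline-delimited JSON body that the Elasticsearch Bulk API accepts: an action line followed by a source line for each document. The index name, the record fields and the function name are hypothetical.

```python
import json


def to_bulk_ndjson(records, index_name, id_field=None):
    """Serialise records as a Bulk API body: one action line plus one source line per record."""
    lines = []
    for record in records:
        action = {"index": {"_index": index_name}}
        if id_field is not None and id_field in record:
            # A stable _id makes re-running a batch overwrite documents instead of duplicating them.
            action["index"]["_id"] = str(record[id_field])
        lines.append(json.dumps(action))
        lines.append(json.dumps(record))
    if not lines:
        return ""
    # The Bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Made-up processed records and index name, for illustration only.
processed = [
    {"id": 1, "customer": "Acme", "orders": 3},
    {"id": 2, "customer": "Globex", "orders": 7},
]
payload = to_bulk_ndjson(processed, "datamart-demo", id_field="id")
print(payload)
```

The resulting string would be sent to the `_bulk` endpoint with the `application/x-ndjson` content type. The HTTP call is left out so the sketch runs without a cluster.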
21.
Geospatial
Entities
Time Series
Documents
Graphs
Customer Data
22.
Data mart
BI Finance Marketing Other Product teams
Data mart Data mart Data mart Data mart Data mart
Data Catalogs
26.
Diagram labels: Data Source, Data Source, Amazon MSK, S3 Raw Data, Databricks, Streaming, Batch, Process, Lambda, Data Files, Elasticsearch, Keywords, Translation API Layer, Details, S3 Processed Data
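One way to read the "Translation API Layer" in this diagram is as a thin layer that turns a client's keywords into an Elasticsearch query and returns the matching details. The Python sketch below shows only the translation step, under that assumption. The function name, the searched fields (`title`, `body`) and the filter field are hypothetical; the query body uses the standard Elasticsearch query DSL (`bool`, `multi_match`, `term`).

```python
def build_search_body(keywords, filters=None, size=10):
    """Translate API-level keywords and exact-match filters into a query DSL body."""
    query = {
        "bool": {
            "must": [
                # Hypothetical searchable fields; a real index defines its own.
                {"multi_match": {"query": keywords, "fields": ["title", "body"]}}
            ]
        }
    }
    if filters:
        # Exact-match constraints go in "filter" so they do not affect relevance scoring.
        query["bool"]["filter"] = [
            {"term": {field: value}} for field, value in sorted(filters.items())
        ]
    return {"query": query, "size": size}


body = build_search_body("property title", filters={"state": "NSW"}, size=5)
print(body)
```

Keeping the translation in one function means client applications never need to know the query DSL or the index layout behind the API.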
28.
Diagram labels: Data Source, Data Source, Amazon MSK, S3 Raw Data, Databricks, Streaming, Batch, Process, Lambda, Data Files, Elasticsearch, S3 Processed Data, Data marts, Translation API Layer
30.
What is searchable content?
32.
Our clients ingesting data through our APIs. Pay per API request?
A scalable and performant platform to support our product, data and customer needs
Data anomaly detection in authority source data -> provide data that's more complete and accurate than source data by combining and processing multiple sources
Our clients developing their own products on their data
Single view of the customer
Marketplace for data