Organize & manage master
meta data centrally, built
upon kong, cassandra, neo4j
& elasticsearch.
Hello!
I am Akhil Agrawal
Managing master & meta data is a very common problem with no good open source alternative, as far as I know, so I am initiating this project: MasterMetaData
Started BIZense in 2008 &
Digikrit in 2015
1.
Problem
Let’s start with the problem we are addressing: why mastermetadata?
Why MasterMetaData?
Less Frequently Changing
◎ Master data and meta data share one behavior: both change infrequently, although they serve different purposes.
◎ Infrequently changing data, whether it describes real-world entities (master data) or other data (meta data), can be stored, accessed and managed in very similar ways.
Why MasterMetaData?
No Open Source Option
◎ MDM solutions exist (mostly from ERP vendors such as SAP and Oracle, and analytics companies such as Informatica and SAS), but the intersection of master data and meta data has only recently begun to be explored.
◎ There is no open source alternative for smaller companies, or one that can be embedded in SaaS products.
2.
Definitions
Let’s start with some definitions
around data categories
Definition of Data Categories
Meta Data
Information about other forms of data (it can describe master data, transaction data or lower-level meta data).
Master Data
Real-world entities such as customers and partners (only their stable attributes are considered part of master data).
Transaction Data
Real-world interactions that have a very short lifespan and whose occurrence is tied to time and space (the definition is stable, but attribute values change and each new data point is unique).
Master Meta Data
The combination of master data and meta data defined at the application, enterprise or global level (although master data and meta data differ greatly in volume and variety, they share many access patterns).
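The relationship between the categories defined above can be sketched in a few lines of Python. This is only an illustration: every class and field name below is hypothetical and not taken from the MasterMetaData project.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class MetaData:
    """Describes other data: here, the attributes an entity type may carry."""
    entity_type: str
    attributes: tuple


@dataclass
class MasterRecord:
    """A real-world entity; only its stable attributes are stored."""
    meta: MetaData
    values: dict

    def __post_init__(self):
        # Reject attributes that the meta data does not describe.
        unknown = set(self.values) - set(self.meta.attributes)
        if unknown:
            raise ValueError(f"attributes not described by meta data: {sorted(unknown)}")


@dataclass
class Transaction:
    """A short-lived interaction tied to a point in time; each one is unique."""
    subject: MasterRecord
    action: str
    at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


customer_meta = MetaData("customer", ("name", "country"))
alice = MasterRecord(customer_meta, {"name": "Alice", "country": "IN"})
order = Transaction(alice, "placed_order")
```

The meta data changes least often, the master record changes rarely, and a new transaction is created for every interaction.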
3.
Implementation
Let’s discuss the implementation –
technologies & concepts involved
Background
◎ Faced difficulty managing master and meta data in previous projects
◎ Implemented a custom solution while building a mobile ad platform
◎ Currently implementing the same features for a communication platform
◎ Have worked with elasticsearch + kibana; kong + cassandra seem useful
Built With the Following Technologies
neo4j
highly scalable native graph database that leverages data relationships as first-class entities and handles evolving data challenges
elasticsearch
search and analyze data in real time; the de facto standard for making data accessible through search and aggregations
cassandra
the right choice when you need linear scalability and high availability without compromising performance & durability
kong
the open source management layer for APIs and microservices, delivering security, high performance and reliability
lua
a powerful, fast, lightweight, embeddable scripting language, used here for writing kong plugins that provide access to master and meta data
kibana
explore and visualize data in elasticsearch; an open source project from the elasticsearch team with an intuitive interface, visualizations & dashboards
Open source, Scalable, Searchable, Ready to Use
Project mastermetadata needs to be ready to use for at least a few use cases, such as location, device, movie, tour etc.
Why neo4j for mastermetadata?
Challenges
◎ Complex & hierarchical data sets
◎ Real-time query performance
◎ Dynamic structure
◎ Evolving relationships
Why neo4j?
◎ Native graph store
◎ Flexible schema
◎ Performance and scalability
◎ High availability
Referenced from http://neo4j.com/use-cases/master-data-management
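To make the "hierarchical data" and "evolving relationships" points concrete, here is a small in-memory sketch in Python. It does not use neo4j or its API; `TinyGraph` and the relationship names are hypothetical stand-ins that only show why a graph model suits this kind of data.

```python
from collections import defaultdict


class TinyGraph:
    """In-memory stand-in for a graph store; this is not the neo4j API."""

    def __init__(self):
        # node -> relationship type -> set of target nodes
        self._edges = defaultdict(lambda: defaultdict(set))

    def relate(self, source, rel_type, target):
        self._edges[source][rel_type].add(target)

    def ancestors(self, node, rel_type):
        """Follow rel_type edges upward and return every node reached, nearest first."""
        found, frontier = [], [node]
        while frontier:
            current = frontier.pop(0)
            for parent in sorted(self._edges[current][rel_type]):
                if parent not in found:
                    found.append(parent)
                    frontier.append(parent)
        return found


graph = TinyGraph()
graph.relate("Bengaluru", "LOCATED_IN", "Karnataka")
graph.relate("Karnataka", "LOCATED_IN", "India")
# A new relationship type can be added later without changing any schema.
graph.relate("Bengaluru", "HAS_TIMEZONE", "Asia/Kolkata")

print(graph.ancestors("Bengaluru", "LOCATED_IN"))
```

Walking the city, state, country hierarchy is a plain traversal, and adding the time zone relationship required no change to existing nodes.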
Why elasticsearch for mastermetadata?
Scale
◎ Real-Time Data
◎ Massively Distributed
◎ High Availability
◎ Multitenancy
◎ Per-Operation Persistence
Search
◎ Full-Text Search
◎ Document-Oriented
◎ Schema-Free
◎ Developer-Friendly, RESTful API
◎ Built on top of Apache Lucene™
Analytics
◎ Real-Time Advanced Analytics
◎ Very flexible Query DSL
◎ Flexible analytics & visualization platform - Kibana
◎ Real-time summary and charting of streaming data
Referenced from https://www.elastic.co/products/elasticsearch
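As a rough illustration of the Query DSL mentioned above, the following Python sketch builds a search body that combines a full-text `match` query with a `terms` aggregation. The field names `name` and `country` are assumptions made for the example; the body would be sent to an index's `_search` endpoint, which is not done here.

```python
import json


def build_search_body(text, facet_field, size=10):
    """Return a search request body: a full-text match plus a terms aggregation."""
    return {
        "size": size,
        "query": {"match": {"name": text}},
        "aggs": {"by_" + facet_field: {"terms": {"field": facet_field}}},
    }


# Hypothetical search over location master data, grouped by country.
body = build_search_body("san", "country")
print(json.dumps(body, indent=2))
```

One request can therefore serve both the "search" and the "analytics" access patterns listed on the slide.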
Why kong for mastermetadata?
Secure, Manage & Extend your APIs and Microservices
RESTful Interface
Plugin Oriented
Platform Agnostic
Referenced from https://getkong.org/
[Diagram: Without Kong vs. With Kong]
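The "RESTful Interface" point can be sketched as follows. The Python below only builds, and does not send, the two Admin API requests that would register an upstream and expose it through the gateway. It assumes a Kong release whose Admin API offers `/services` and `/routes` resources (releases contemporary with this deck used a different resource model), an Admin API listening on `localhost:8001`, and a hypothetical `locations` upstream on port 9000.

```python
import json
from urllib.request import Request

ADMIN_URL = "http://localhost:8001"  # assumed local Kong Admin API address


def admin_post(path, payload):
    """Build (but do not send) a JSON POST request for the Kong Admin API."""
    return Request(
        ADMIN_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical upstream that serves location master data.
service = admin_post("/services", {"name": "locations", "url": "http://localhost:9000"})
# Expose it through the gateway under the /locations path.
route = admin_post("/services/locations/routes", {"paths": ["/locations"]})

for request in (service, route):
    print(request.get_method(), request.full_url)
```

Passing either request to `urllib.request.urlopen` would apply it against a running gateway.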
4.
Interesting
What interesting things are happening around this?
Master & Metadata Management Intersection
Maximized Metadata Model
◎ The data model describing the metadata needs to be “maximized” to cover as many use cases as possible
◎ The metadata model needs to include all metadata in the organization as well as cover the master data
◎ Governing the metadata model requires the ability to describe as much of the system's metadata as possible, so that data describing other data can be governed
Minimalistic Master Data Model
◎ The master data model needs to be “minimalist”
◎ The master data model is neither inclusive of all data in the organization nor specific to the applications that use it for a particular purpose
◎ Central governance of master data requires a minimalistic backing data model, so that it can be governed without application-specific details
◎ The master data model is basically metadata describing the master data
Referenced from http://blogs.gartner.com/andrew_white/2011/04/26/more-on-metadata-and-master-data-management-intersection/
From Big Data To Smart Data
Zero Latency Organization
Types Of Latency
data
◎ latency linked to the data (capturing)
◎ latency linked to analytical processes (processing)
structural
◎ latency linked to decision-making processes
◎ time needed to implement actions linked with decisions
action
◎ data latency added to structural latency
◎ time needed from the capture of data until the action takes place
Smart Data
value
data is considered smart based on the value it brings to decision making and action taking (rather than anything else such as size or source)
master
data that represents real-world entities and remains stable over time is smart data, as it provides a common data reference
meta
data that describes other data, whether master, transactional or lower-level meta data, is also smart data, as it helps in understanding
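The latency breakdown above amounts to simple addition, sketched here in Python. Treating data latency as capture plus processing, and structural latency as decision plus implementation, is an assumption read off the bullet lists; the class name and the sample numbers are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class Latency:
    """All durations in seconds; the four components follow the slide's breakdown."""
    capture: float         # data latency: capturing
    processing: float      # data latency: analytical processing
    decision: float        # structural latency: decision making
    implementation: float  # structural latency: implementing the action

    @property
    def data(self):
        return self.capture + self.processing

    @property
    def structural(self):
        return self.decision + self.implementation

    @property
    def action(self):
        # Action latency is data latency added to structural latency.
        return self.data + self.structural


# Illustrative numbers only.
sample = Latency(capture=5, processing=60, decision=3600, implementation=600)
print(sample.data, sample.structural, sample.action)
```

With these sample numbers the structural part dominates, so faster data capture alone would barely shorten the time to action.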
5.
Get Involved
Let’s discuss ways to get involved in
this project
Areas where you can get involved
DEMO
Functional Tests, Integration Tests, Run Demo
CODE
Implement Ideas, Fix Bugs, Enhance Features
DOCUMENT
User Documentation, Developer Documentation
Current Focus
Devices
Storage: Device, Browser, OS
Access: User Agent
Locations
Storage: Country, State, City
Access: IP Address
Tours
Storage: People, Interest, Culture, Destination, City, Activity, Duration
Access: What, Where, For
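The "Locations" row above (stored as country, state and city, accessed by IP address) could look roughly like this in Python. The lookup table is hypothetical and uses reserved documentation address ranges rather than real allocations; a real deployment would read the ranges from the master data store.

```python
import ipaddress

# Hypothetical master data: network ranges mapped to (country, state, city).
# The ranges are reserved documentation networks, not real allocations.
LOCATIONS = {
    ipaddress.ip_network("192.0.2.0/24"): ("India", "Karnataka", "Bengaluru"),
    ipaddress.ip_network("198.51.100.0/24"): ("India", "Maharashtra", "Mumbai"),
}


def locate(ip_text):
    """Return the (country, state, city) tuple for an IP address, or None if unknown."""
    address = ipaddress.ip_address(ip_text)
    for network, location in LOCATIONS.items():
        if address in network:
            return location
    return None


print(locate("192.0.2.17"))
print(locate("203.0.113.5"))
```

The storage shape (country, state, city) and the access key (IP address) are deliberately different, which is the distinction the slide draws between storage and access.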
Storage & Access
Master Data Storage
Storage that is highly efficient for reads while remaining efficient for writes. It must also support searching the stored data and offer a flexible, efficient query interface for faster access.
Meta Data Storage
Storage that is highly flexible in defining relationships such as inheritance, composition and others. Graph-modeled relationships are the most flexible to change as the model evolves.
Diagram featured by poweredtemplate.com
Meta Data Access
CRUD, Fill in the blanks, Semantic Query, Search
Master Data Access
CRUD, Query (Structured / Unstructured) & Search
References
◎ https://getkong.org/
◎ http://neo4j.com/
◎ http://cassandra.apache.org/
◎ https://www.elastic.co/
◎ http://booksite.elsevier.com/9780123743695/10steps_DataCategories.pdf
◎ http://blogs.gartner.com/andrew_white/2011/04/26/more-on-metadata-and-master-data-management-intersection/
◎ http://neo4j.com/use-cases/master-data-management/
Thanks!
Any questions?
You can find me at:
@digikrit
akhil@digikrit.com
Special thanks to all the people who made and released these awesome resources for free:
◎ Presentation template by SlidesCarnival
◎ Presentation models by SlideModel & PoweredTemplate
◎ The companies behind kong, cassandra, neo4j & elasticsearch

Master Meta Data

  • 1. Organize & manage master meta data centrally, built upon kong, cassandra, neo4j & elasticsearch.
  • 2. Hello! I am Akhil Agrawal. Managing master & meta data is a very common problem with no good open source alternative as far as I know, so I am initiating this project: MasterMetaData. Started BIZense in 2008 & Digikrit in 2015.
  • 3. 1. Problem Let’s start with the problem we are addressing: why MasterMetaData?
  • 4. Why MasterMetaData? Less Frequently Changing ◎ Master data and meta data share one behavior, infrequent change, although their purposes differ. ◎ Less frequently changing data, whether it is data about real world entities (master data) or data about other data (meta data), can be stored, accessed and managed in very similar ways.
  • 5. Why MasterMetaData? No Open Source Option ◎ There are MDM solutions (mostly from ERP vendors like SAP, Oracle etc. & analytics companies like Informatica, SAS) but the master meta data intersection is being explored only recently. ◎ There are no open source alternatives for smaller companies, nor anything that can be embedded in SaaS products.
  • 6. 2. Definitions Let’s start with some definitions around data categories
  • 7. Definition of Data Categories ◎ Meta Data: meta information about other forms of data (can describe master, transaction or lower level meta data) ◎ Master Data: real world entities like customer, partner etc. (only the stable attributes are considered part of master data) ◎ Transaction Data: real world interactions which have a very short lifespan and whose occurrence is linked with time/space (attribute values are unstable and changing; the definition/description is stable, but each new data point is unique) ◎ Master Meta Data: combination of master and meta data defined at application, enterprise or global level (although the volume and variety of master & meta data are very different, they have a lot of common access patterns)
  • 8.
  • 9. 3. Implementation Let’s discuss the implementation – technologies & concepts involved
  • 10. Background ◎ Faced difficulty with managing master and meta data in previous projects ◎ Implemented a custom solution while building a mobile ad platform ◎ Currently implementing the same features for the communication platform ◎ Have worked with elasticsearch + kibana, while kong + cassandra seem useful
  • 11. Built With the Following Technologies ◎ neo4j: highly scalable native graph database that leverages data relationships as first-class entities, handles evolving data challenges ◎ elasticsearch: search and analyze data in real time, de facto standard for making data accessible through search and aggregations ◎ cassandra: right choice when you need linear scalability and high availability without compromising performance & durability ◎ kong: the open source management layer for APIs and microservices, delivering security, high performance and reliability ◎ lua: a powerful, fast, lightweight, embeddable scripting language, used for writing kong plugins that provide access to various master meta data ◎ kibana: explore and visualize data in elasticsearch, open source project from the elasticsearch team, intuitive interface, visualization & dashboards
  • 12. Open Source, Scalable, Searchable, Ready to Use. Project mastermetadata needs to be ready to use for at least a few use cases like location, device, movie, tour etc.
  • 13. Why neo4j for mastermetadata? Challenges ◎ Complex & hierarchical data sets ◎ Real-time query performance ◎ Dynamic structure ◎ Evolving relationships Why neo4j? ◎ Native graph store ◎ Flexible schema ◎ Performance and scalability ◎ High availability Referenced from http://neo4j.com/use-cases/master-data-management
  • 14. Why elasticsearch for mastermetadata? Scale: ◎ Real-Time Data ◎ Massively Distributed ◎ High Availability ◎ Multitenancy ◎ Per-Operation Persistence Search: ◎ Full-Text Search ◎ Document-Oriented ◎ Schema-Free ◎ Developer-Friendly, RESTful API ◎ Built on top of Apache Lucene™ Analytics: ◎ Real-Time Advanced Analytics ◎ Very flexible Query DSL ◎ Flexible analytics & visualization platform (Kibana) ◎ Real-time summary and charting of streaming data Referenced from https://www.elastic.co/products/elasticsearch
  • 15. Why kong for mastermetadata? Secure, Manage & Extend your APIs and Microservices ◎ RESTful Interface ◎ Plugin Oriented ◎ Platform Agnostic Referenced from https://getkong.org/ (Diagram: Without Kong vs. With Kong)
  • 16. 4. Interesting What interesting things are happening around this?
  • 17. Master & Metadata Management Intersection Maximized Metadata Model: ◎ the data model describing the metadata needs to be “maximized” to cover as many use cases as possible ◎ the meta data model needs to be inclusive of all metadata in the organization as well as cover the master data ◎ governance of the metadata model requires the ability to describe the maximum amount of metadata in the system, so that data describing other data can be governed Minimalistic Master Data Model: ◎ the data model describing master data needs to be “minimalist” ◎ the master data model is neither inclusive of all data in the organization, nor specific to the applications using it for a specific purpose ◎ central governance of master data requires that the data model backing it is minimalistic, so that it can be governed without application specific details ◎ the master data model is basically metadata describing the master data Referenced from http://blogs.gartner.com/andrew_white/2011/04/26/more-on-metadata-and-master-data-management-intersection/
  • 18. From Big Data To Smart Data: Zero Latency Organization Types Of Latency: Data ◎ latency linked to the data (capturing) ◎ latency linked to analytical processes (processing) Structural ◎ latency linked to decision making processes ◎ time needed to implement actions linked with decisions Action ◎ data latency added to structural latency ◎ time needed from capturing of data until the action takes place Smart Data: ◎ Value: data is considered smart based on the value it brings to decision making and action taking (rather than anything else like size, source, etc.) ◎ Master data, which represents real world entities and remains stable over time, is smart data as it helps with common data reference ◎ Meta data, which describes other data, whether master, transactional or lower level meta data, is also smart data as it helps in understanding
  • 19.
  • 20. 5. Get Involved Let’s discuss ways to get involved in this project
  • 21. Areas where you can get involved ◎ DEMO: Functional Tests, Integration Tests, Run Demo ◎ CODE: Implement Ideas, Fix Bugs, Enhance Features ◎ DOCUMENT: User Documentation, Developer Documentation
  • 22. Current Focus ◎ Devices (Storage: Device, Browser, OS; Access: User Agent) ◎ Locations (Storage: Country, State, City; Access: IP Address) ◎ Tours (Storage: People, Interest, Culture, Destination, City, Activity, Duration; Access: What, Where, For)
  • 23. Storage & Access ◎ Master Data Storage: storage which is highly efficient for reads while remaining efficient for writes, with the additional requirement that the stored data be searchable through a flexible, efficient query interface for faster access ◎ Meta Data Storage: storage which is highly flexible in defining relationships like inheritance, composition or other relationships; graph modeled relationships are the most flexible to change as the model evolves ◎ Meta Data Access: CRUD, Fill in the blanks, Semantic Query, Search ◎ Master Data Access: CRUD, Query (Structured / Unstructured) & Search Diagram featured by poweredtemplate.com
  • 24. References ◎ https://getkong.org/ ◎ http://neo4j.com/ ◎ http://cassandra.apache.org/ ◎ https://www.elastic.co/ ◎ http://booksite.elsevier.com/9780123743695/10steps_DataCategories.pdf ◎ http://blogs.gartner.com/andrew_white/2011/04/26/more-on-metadata-and-master-data-management-intersection/ ◎ http://neo4j.com/use-cases/master-data-management/
  • 25. Thanks! Any questions? You can find me at: @digikrit, akhil@digikrit.com. Special thanks to all the people who made and released these awesome resources for free: ◎ Presentation template by SlidesCarnival ◎ Presentation models by SlideModel & PoweredTemplate ◎ The companies behind kong, cassandra, neo4j & elasticsearch
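
The storage and access split described on slides 22 and 23 can be sketched in a few lines of Python. This is a minimal in-memory illustration under stated assumptions, not the project's implementation: the names `MetaType` and `MasterStore` and the sample city records are hypothetical, and the slides propose neo4j, cassandra and elasticsearch for the real storage and search layers. The meta type models an inheritance relationship, and the master store offers CRUD plus a naive search.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class MetaType:
    """Meta data (hypothetical model): which attributes an entity may have."""
    name: str
    attributes: Tuple[str, ...]
    parent: Optional["MetaType"] = None  # inheritance relationship

    def all_attributes(self) -> Tuple[str, ...]:
        # Inherited attributes come first, then the type's own attributes.
        inherited = self.parent.all_attributes() if self.parent is not None else ()
        return inherited + self.attributes


@dataclass
class MasterStore:
    """Master data (hypothetical model): records checked against a MetaType."""
    meta: MetaType
    records: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def _check(self, values: Dict[str, str]) -> None:
        # Reject attributes that the meta type does not describe.
        unknown = set(values) - set(self.meta.all_attributes())
        if unknown:
            raise ValueError(f"unknown attributes: {sorted(unknown)}")

    def create(self, key: str, **values: str) -> None:
        self._check(values)
        self.records[key] = dict(values)

    def read(self, key: str) -> Optional[Dict[str, str]]:
        return self.records.get(key)

    def update(self, key: str, **values: str) -> None:
        self._check(values)
        self.records[key].update(values)

    def delete(self, key: str) -> None:
        self.records.pop(key, None)

    def search(self, text: str) -> List[str]:
        # Naive case-insensitive substring match over all attribute values.
        needle = text.lower()
        return sorted(
            key for key, record in self.records.items()
            if any(needle in value.lower() for value in record.values())
        )


# Sample data is illustrative only.
place = MetaType("place", ("name",))
city = MetaType("city", ("country",), parent=place)

cities = MasterStore(city)
cities.create("blr", name="Bengaluru", country="India")
cities.create("pnq", name="Pune", country="India")
cities.create("par", name="Paris", country="France")

print(city.all_attributes())   # ('name', 'country')
print(cities.search("india"))  # ['blr', 'pnq']
```

Keeping the meta type separate from the records mirrors the deck's point that the model (meta data) and the entities (master data) change rarely but are accessed differently: the model through relationships, the entities through CRUD and search.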