Telecom Bell
Cloud Migration Kickoff
Yashodhan Kale
Delivery Solutions Architect | Databricks
05/30/2023
Contents
1. Summary
2. Platform & Architecture
3. Approach
4. Operating Model
5. Additional Details
Summary | Business challenges
• TELECOM BELL must improve network QoS to align with consumers' changing emphasis on mobile connectivity and data usage.
• As IoT and 5G advance, customers easily switch providers, prompting TELECOM BELL to prioritize personalized engagement using customer data for customized messaging and services.
• TELECOM BELL is subject to many regulations, including data privacy and security regulations, and needs effective ways to adhere to them.
• Power of data: there is a data-volume explosion, requiring both focus and new capabilities.
• Pressure to show growth and profits is constant, and data and AI will be a critical enabler.
Summary | Technical Challenges

Today there are increased expectations and pressure on the Telecom organization to have a strong data & analytics strategy:
• Data platform is not scalable for analytics and AI/ML
• Upfront capacity planning and cost
• Governance of the data on HDFS is a challenge
• Data sits in silos and is not easy to integrate or connect
• Lack of discoverability of data (catalog)
• Housekeeping: maintenance of the in-house cluster is difficult through different portals and installations
• Advanced disaster recovery, durability, and availability are hard to achieve
• A larger IT infrastructure staff is required
Summary | Executive Plan

• Telecom Bell wants to improve the Quality of Service (QoS) of its network and, to get there, will start migrating its core applications to the cloud.
• Databricks will bring industry-leading expertise and Databricks platform expertise to drive the transformation at speed.
• Confluent will bring its event streaming platform built on Kafka and the necessary platform support.
• Telecom Bell has a team of 10 engineers with expertise in Kafka and Spark.
• Desired timeline: May 2024.
Platform & Architecture | Current Architecture

Limitations
• Data platform is not scalable for analytics and AI/ML
• Upfront capacity planning and cost
• Governance of the data on HDFS is a challenge
• Data sits in silos and is not easy to integrate or connect
• Lack of discoverability of data (catalog)
• Housekeeping: maintenance of the in-house cluster is difficult through different portals and installations
• Advanced disaster recovery, durability, and availability are hard to achieve
• A larger IT infrastructure staff is required
Platform & Architecture | End State Architecture

Design a target-state architecture for a scalable, secure, and well-governed data platform (AI/ML self-serve and advanced engineering capabilities, including the necessary governance-on-lake capability).

Highlights
• Warehouse + data lake capabilities at scale, with governance
• Data product mindset: marketplace and self-service capabilities
• MLOps: full ML lifecycle
• Domain data tiers: advanced data management capabilities, curated and democratized data layers

Designing and activating a world-class data platform, fundamental principles:
• Scalability
• Performance
• Industrialized processes governing the pipeline
• Distributed, fault-tolerant architecture
• Open file format for better interoperability between systems
• Security and reliability
• Data provenance and lineage
• ACID compliance
Platform & Architecture | Current vs. New

New:
1. More performant and optimized Spark engine
2. Governance under the same roof
Platform & Architecture | Artifacts
Key components of the data platform:
A World Class Data Platform!
Approach | Our Tenets

Because "approach is the first step towards achieving goals":
A. Security is job zero
B. Agile methodology
C. Continuous delivery of results
D. Leverage customer assets first
E. Multiple-velocity joint delivery approach
F. Zero downtime
G. Log the journey at every step to look back and learn
H. Principle of least access privilege (PoLAP)
Approach | Objectives

Across four horizons (Mindset, Strategic roadmap, Platform, Industrialization):
• Build the data strategy roadmap that empowers Telecom Bell to overcome its business challenges
• Build strong foundations with data platform development and implementation
• Co-create an operating model that would take TELECOM BELL where it wants, in a sustainable way
• Migrate core applications to the cloud in a secure and reliable way
Operating Model | Joint Delivery Approach

Executive Leadership: Databricks Leadership (1), Telecom Bell Leadership (1)
Program Management: Databricks Lead (1), Telecom Bell Lead (1)

Delivery pods:
A. Application Team
B. Platform Team
C. Data Quality & Governance
D. Bringing it Together

Each pod pairs Telecom Bell resources (1 to 4 per pod) with Databricks Professional Services resources (3 to 5 per pod).

Meeting cadence:
• Bi-weekly Steering Committee meetings
• Weekly PMO meetings
• Daily delivery team meetings
Operating Model | Pod Structure
(Pods mix Databricks and Telecom Bell resources: 16 and 12 respectively in total)

Application Team: Leader; Scrum Master; Functional Domain Expert; Data Visualization Engineer; Customer Success Engineer; Data Engineer

Platform: Leader; Azure Platform Cloud Architect; Cloud DevOps Engineer; Resident Solutions Architect; Delivery Solutions Architect; Customer Success Engineer

Data Quality & Governance: Leader; Test / Quality Lead; Data Quality Engineer; Data Governance Lead; Data Lineage and Profiling Engineer; Product Owner

Bring it Together: Leader; Delivery Lead; Change Management Specialist; PMO Lead; Roadmap Officer

Shared resources across pods: Scrum Master; Cloud DevOps Engineer (2); Resident Solutions Architect; Specialist Solutions Architect (Security); Enterprise Support
Operating Model | Road Map

Deliverables (from program kickoff to celebrating completion):
1. Diagnostic of the current environment
2. End state architecture
3. Platform
4. Migration: 10%
5. Migration: 60%
6. Migration: 100%

Along the way, measure progress:
• Consistently communicate, remove roadblocks, and eliminate friction
• Celebrate completion of quick wins to strengthen morale

Process goals:
1. Human-centered change: focus on each individual team member's technical skills and capacity for change; reskill team members whose roles are changing.
2. Mindset change: adopt 'data as a product', a self-service platform, federated governance, and domain-specific ownership.
3. Migration playbook: a repeatable guideline to migrate applications to the new architecture.
Operating Model | Timeline
(Q2 2023 through Q2 2024; Steerco meetings each quarter)

Agile: update the roadmap and plan per evolving priorities.

Platform: Databricks workspace setup; Confluent workspace setup; cost management reports; security and compliance (phase 1, phase 2); cost optimization; move towards infrastructure as code; handover.

Application: current state diagnostics; define elements/sources/data; refactor the code; test and modify; talk to the business team and incorporate changes; deploy; document and knowledge transfer (KT); handover.

Data Quality + Governance: assess current state and catalog critical data elements; assess current state data governance; prepare the governance strategy (identify roles, define the interaction model); design and deliver the governance structure; best practices and tagging; design target state DQ monitoring; implement target state DQ monitoring; handover.

Bring it together: assess skill and capability gaps within the organization; define pods and teams; create an upskilling curriculum and set up training sessions; establish ways of working (documentation, win celebrations); project management: continuously monitor, foresee and mitigate risks, and seek leadership guidance; arrange handover of all areas.
Additional Details | Industrialization: Competitive Differentiation
• High throughput of innovation analytics (AI/ML)
• Predictive analytics at scale
• Data-driven (real-time what-if analysis)
• Harmonized MDM; ML- and AI-based DQ
• Fast, repeatable time-to-market from idea to product

Additional Details | Future Scope
Additional Details | Risk & Mitigation - Technical

Risks and mitigating actions:
• Data loss risk: reconciliation, checkpointing, auditing, and monitoring; use fault-tolerant ingestion/migration tools such as Azure Data Factory (Copy activity).
• Data corruption and data integrity risk: data validation in which each record is compared bidirectionally: every record in the old system is compared against the target system, and every record in the target system against the old system.
• Interference risks (simultaneous use of the source application): align with the stakeholders of each source on how bandwidth can be shared; the "Bring it Together" team comes into play to address this.
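The bidirectional comparison described for data integrity can be sketched in plain Python (a minimal illustration on keyed records; in practice this would run as a distributed Spark job, and `record_id` and the sample fields are hypothetical):

```python
def validate_bidirectional(source_rows, target_rows, key="record_id"):
    """Compare two record sets both ways: every source record must
    exist (and match) in the target, and every target record must
    exist in the source."""
    src = {r[key]: r for r in source_rows}
    tgt = {r[key]: r for r in target_rows}
    missing_in_target = sorted(src.keys() - tgt.keys())
    missing_in_source = sorted(tgt.keys() - src.keys())
    # Keys present on both sides whose full records differ.
    mismatched = sorted(k for k in src.keys() & tgt.keys() if src[k] != tgt[k])
    return {
        "missing_in_target": missing_in_target,
        "missing_in_source": missing_in_source,
        "mismatched": mismatched,
        "ok": not (missing_in_target or missing_in_source or mismatched),
    }

# Example: one record never arrived in the target, one value drifted.
old_system = [{"record_id": 1, "plan": "5G"}, {"record_id": 2, "plan": "4G"}]
new_system = [{"record_id": 1, "plan": "5G-Plus"}]
report = validate_bidirectional(old_system, new_system)
```

Here `report["ok"]` is False: record 2 is missing in the target and record 1 mismatches, which is exactly what a reconciliation step would surface before cutover.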
• Schema evolution (changing dimensions): rely on the Delta file format's schema evolution feature (schema-on-read). Further, to make sure no incompatible schemas come in, a catalog and governance layer will be leveraged (Databricks Unity Catalog).
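Delta handles this natively (e.g. additive column merges on write), but the policy the mitigation relies on can be sketched in plain Python; the column names and type strings below are hypothetical:

```python
def evolve_schema(current, incoming):
    """Additive-only schema evolution: new columns are appended,
    but a type change on an existing column is treated as an
    incompatible schema and rejected."""
    merged = dict(current)
    for col, dtype in incoming.items():
        if col in merged and merged[col] != dtype:
            raise TypeError(
                f"incompatible change for column {col!r}: "
                f"{merged[col]} -> {dtype}")
        merged[col] = dtype
    return merged

# A new batch adds cell_id: accepted and merged into the table schema.
table_schema = {"msisdn": "string", "bytes_used": "long"}
batch_schema = {"msisdn": "string", "bytes_used": "long", "cell_id": "string"}
table_schema = evolve_schema(table_schema, batch_schema)
```

A batch that retyped `bytes_used` to `string` would raise instead of silently corrupting the table, which is the behavior the catalog and governance layer is meant to enforce.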
• Authorization risk: MFA and identity federation; access controls at row and column level via Delta Lake.
• Data security risk: apply encryption where possible and appropriate; store all tokens and keys securely in Azure Key Vault and rotate them at regular intervals.
• Downtime due to migration: a replicate-and-activate approach.
Additional Details | Risk & Mitigation - Other

• Resource availability and competing priorities: make sure employees are fully informed about participation in workshops and/or interviews; get the right people at the right time.
• Senior leadership buy-in and delays in decision making: secure strong support from the leadership group, including areas not directly affected by the initial changes (one team, one direction); establish governance to provide clarity on accountabilities for decision making.
• Potential impacts to other projects: strong support from senior leadership if there is a need to put existing projects on hold; review the current state of ongoing projects to assess their impact on the finance model; prioritize major changes and focus on the big obstacles upfront.
• Lack of people adoption (major change): agile and inspirational change management and communication structure; leverage the Bring it Together team, and roles such as change management experts, to steward people readiness and prepare for change.
• Design in isolation (enterprise integration): work with scalable and flexible design principles in mind to ensure proper integration and alignment with the business; it is a partnership approach; gather key inputs to support cross-functional process design decisions where applicable.
• Availability of key data inputs and information: simplify data requests to collect data and information at the appropriate level of detail; assign designated Databricks and Telecom Bell contacts to ensure a smooth and timely transition of data; use the discovery phase to identify hidden environmental risks to foresee and mitigate.
Additional Details | Assumptions

1. Platform: Telecom Bell's on-premise platform is owned and managed by Telecom Bell, and Databricks will get the necessary support to extend the setup to provision the solution per the scope of this effort.
2. Data Security: Telecom Bell is responsible for the design, integration, and operation of all client Identity and Access Management, Security Incident and Event Management, Vulnerability Scanning, and Security Testing tooling and processes as appropriate.
5. Access & Setup: Telecom Bell will provide system access to all source systems or applications required by the scope, and will provide access to systems and environments (including DEV and SIT) within 5 business days of receipt of a request.
6. Access & Setup: Databricks personnel will not have access to unencrypted PII data. Telecom Bell will be responsible for encrypting any PII data prior to extraction into the Databricks platform.
7. Access & Setup: PII and GDPR data handling will be done by Telecom Bell as per the existing delivery practices; any additional arrangement is out of scope.
9. Project Management: Telecom Bell will provide relevant functional, technical, and process documentation for data platforms and systems required by the scope.
10. Project Management: Telecom Bell will nominate full-time business and technical SMEs aligned to this project as per the agreed pod structure.
11. Project Management: Telecom Bell data owners/nominees will make every attempt to attend the Scrum meetings and ceremonies to present their progress on the issues assigned.
12. Project Management: Telecom Bell will make sure we get the required time and support from all the stakeholders for the complete success of the project.
14. Data Build: The Databricks team will reuse and extend the existing data ingestion tooling and framework to support the ingestion activities into the platform. The project will carry out a data discovery exercise to assess the local market data quality and readiness.
15. Data Build: The source system inventory has already been identified and is in place.
16. License: The Cloudera CDH on-premise license expired in March 2022; however, extended support is required and has been obtained.
Additional Details | Questions

• Is there an onboarding guide for the consultants to get started in your environment?
• Is there a source system inventory already identified that can be shared?
• What are the roles and skills of the existing 10 engineers on the team?
• What is the current data governance mechanism?
• Other than Cloudera, what other paid subscriptions and packages are installed on the concerned architecture?
• Is there any major business contingency on this project plan? If so, what is the impact of delayed delivery?
• What are all the compliance and regulatory requirements that Telecom Bell needs to follow for the concerned data?
• Does Telecom Bell already have an Azure account? If so, what level of enterprise support plan is subscribed?
• Does Telecom Bell already have a Confluent account? If so, what level of enterprise support plan is subscribed?
• Are any licenses due to expire?
• What is Cloudera's extended support expiry date?
Thank you
Thank you so much for your time today.
Yashodhan Kale

BACKGROUND
Modern Technologist | Data and ML at scale
Design and drive clients' Data and AI journeys powered by cloud analytics expertise! Offering data product mindset-driven solutions to deliver platforms and beyond: self-service framework, rapid experimentation lab, democratized data, data products marketplace, multi-cloud solutions, data lake, data fabric, data mesh patterns with federated governance, domain-specific ownership, and more.

CERTIFICATIONS
• Amazon Web Services Certified Data Analytics - Specialty
• Amazon Web Services Solutions Architect - Associate
• Cloudera Certified Developer for Apache Hadoop (CCDH)

RELEVANT FUNCTIONAL AND INDUSTRY EXPERIENCE
Industry Focus: HealthCare, Retail, Market Research, Finance
Functional Expertise: Digital Transformation, Analytics and CDO Strategy, Open Source, Machine Learning, IoT, Data-Driven Re-invention

SELECTED EXPERIENCES
• Fortune 5 American healthcare company: Establish and manage DevOps, Data Engineering, and ML engineering teams in close collaboration with Data Scientists. Set up a self-service Data and ML platform on Azure cloud for a Retail enterprise, incorporating an experimentation framework, Model Training pipelines, and real-time inference using Azure AKS, Kubeflow, and Snowflake. Implement an Rx enterprise Data and ML platform on Azure cloud, enabling ETL pipelines with Databricks and Apache Airflow. Lead the development of large-scale projects, including legacy modernization, Rx personalization, and Retail personalization programs that impact millions of lives daily. Collaborate with technology partners, MSFT and NVIDIA, to present objectives and findings and incorporate feedback for ML solutions with specialized NVIDIA GPUs. Architect and oversee the implementation of the Refrigerator IoT project on Azure, leveraging IoT Hub, Azure Analytics, and Databricks. Lead the development of SAP HANA to Spark integration. Manage the enhancement team in Data Engineering for pharmacy-related projects, ensuring critical business deliveries. Design data-driven solutions, including self-service analytics platforms, rapid experimentation labs, democratized data, multi-cloud solutions, data fabric, data mesh patterns with federated governance, and domain-specific ownership. Develop an ingestion framework for seamless data migration across projects and cloud storage services.
• Multinational American information, data & market measurement company: Build a retail store data aggregation engine (Retail Intelligence system) for 24 countries, initially using Hadoop MapReduce, later upgraded to Spark. Migrate on-premise batch processes to the cloud using Docker, Azure Batch Services, and Azure Shipyard for cost efficiency. Perform performance tuning on Apache Spark, cloud Hadoop clusters (HDI), and Databricks on Azure and Hadoop platforms.

PREVIOUSLY
• Sr Cloud Solution Architect @ Amazon Web Services, Level 6
• Sr ML Engineering Manager @ Databricks, Level 6

WHAT HAS BROUGHT ME HERE
• Customer Obsession
• Deliver Results
• Earn Trust
• Learn and Be Curious
Current platform pain points vs. new platform capabilities:
• Current: upfront cost; not easy to integrate/connect; lack of discoverability; effort to make data HA and durable; end of support; maintenance.
• New: ACID compliance; time travel; data as product; interoperability; self-service experimentation; scale and pay-as-you-go; lakehouse governance; data migration; identity management and SSO; event streaming with exactly-once semantics.
Platform & Architecture | Artifacts

Key components of the data platform: Lakehouse, MLOps, Governance, Databricks Marketplace, and Databricks Notebooks.
1. Share insights: quickly discover new insights with built-in interactive visualizations, or leverage libraries such as Matplotlib and ggplot. Export results and notebooks in HTML or IPYNB format, or build and share dashboards that always stay up to date.

2. Work together: share notebooks and work with peers across teams in multiple languages (R, Python, SQL, and Scala) and libraries of your choice. Real-time co-authoring, commenting, and automated versioning simplify collaboration while providing control.

3. Production at scale: schedule notebooks to automatically run machine learning and data pipelines at scale. Create multistage pipelines using Databricks Workflows. Set up alerts and quickly access audit logs for easy monitoring and troubleshooting.
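As a sketch of how such a multistage Workflows pipeline could be defined programmatically: the payload shape below follows the Databricks Jobs API 2.1 create-job request (tasks with `task_key`/`depends_on`), but the job name, notebook paths, cron schedule, and email address are all hypothetical examples:

```python
import json

# Hypothetical three-stage pipeline: ingest -> transform -> publish.
job_spec = {
    "name": "telecom-bell-daily-pipeline",
    "tasks": [
        {"task_key": "ingest",
         "notebook_task": {"notebook_path": "/Pipelines/ingest"}},
        # Each downstream task declares its upstream dependency,
        # so Workflows runs the stages in order.
        {"task_key": "transform",
         "depends_on": [{"task_key": "ingest"}],
         "notebook_task": {"notebook_path": "/Pipelines/transform"}},
        {"task_key": "publish",
         "depends_on": [{"task_key": "transform"}],
         "notebook_task": {"notebook_path": "/Pipelines/publish"}},
    ],
    # Nightly run at 02:00 UTC (Quartz cron syntax).
    "schedule": {"quartz_cron_expression": "0 0 2 * * ?",
                 "timezone_id": "UTC"},
    # Alerting hook for the monitoring mentioned above.
    "email_notifications": {"on_failure": ["data-team@example.com"]},
}

payload = json.dumps(job_spec)
```

This JSON payload would be POSTed to the workspace's job-creation endpoint; the same structure is what the Workflows UI builds under the hood.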
Hadoop Migration to databricks cloud project plan.pptx

  • 1. Telecom Bell Cloud Migration Kickoff Yashodhan Kale Delivery Solutions Architect | Databricks 05/30/2023
  • 2. Contents: 1. Summary 2. Platform & Architecture 3. Approach 4. Operating Model 5. Additional Details
  • 3. Summary | Business Challenges 1 TELECOM BELL must improve network QoS to align with consumers' changing emphasis on mobile connectivity and data usage. As IoT and 5G advance, customers switch providers easily, prompting TELECOM BELL to prioritize personalized engagement, using customer data for customized messaging and services. TELECOM BELL is subject to many regulations, including data privacy and security regulations, and needs effective ways to adhere to them. Power of data: there is a data-volume explosion, requiring both focus and new capabilities. Increased pressure to show growth and profits is constant, and data and AI will be a critical enabler.
  • 4. Summary | Technical Challenges 2 Today there are increased expectations and pressure on the Telecom organization to have a strong data & analytics strategy. • Data platform is not scalable for analytics and AI/ML • Upfront capacity planning and cost • Governance of the data on HDFS is a challenge • Data sits in silos and is not easy to integrate/connect • Lack of discoverability of data (no catalog) • Housekeeping: maintenance of the in-house cluster is difficult through different portals and installations • Advanced disaster recovery, durability and availability are hard to achieve • A bigger IT infra staff is required
  • 5. Summary | Executive Plan 3 Telecom Bell wants to improve the Quality of Service (QoS) of its network and, to get there, will start by migrating its core applications to the cloud. Databricks will bring industry-leading expertise and Databricks platform expertise to drive the transformation at speed. Confluent will bring its event streaming platform built on Kafka and the necessary platform support. Telecom Bell has a team of 10 engineers with expertise in Kafka and Spark. Desired timeline: May 2024.
  • 6. Contents: 1. Summary 2. Platform & Architecture 3. Approach 4. Operating Model 5. Additional Details
  • 7. Platform & Architecture | Current Architecture 1 Limitations • Data platform is not scalable for analytics and AI/ML • Upfront capacity planning and cost • Governance of the data on HDFS is a challenge • Data sits in silos and is not easy to integrate/connect • Lack of discoverability of data (no catalog) • Housekeeping: maintenance of the in-house cluster is difficult through different portals and installations • Advanced disaster recovery, durability and availability are hard to achieve • A bigger IT infra staff is required
  • 8. Platform & Architecture | End State Architecture 2 Design a target state architecture for a scalable, secure and well-governed data platform (AI/ML self-serve, advanced engineering capabilities, including the necessary governance-on-lake capability). Highlights • Warehouse + data lake capabilities at scale, with governance • Data product mindset: marketplace, self-service capabilities • MLOps: full ML lifecycle • Domain data tiers: advanced data management capabilities, curated democratized data layers. Designing and activating a World Class Data Platform: Fundamental Principles • Scalability • Performance • Industrialized processes governing the pipeline • Distributed, fault-tolerant architecture • Open file format for better interoperability between systems • Security and reliability • Data provenance and lineage • ACID compliant
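Two of the principles above, distributed fault tolerance and ACID compliance, come down to the same guarantee: a reader must never observe a half-written table. The sketch below is a deliberately simplified, pure-Python illustration of that atomic-commit idea using a temp-file-plus-rename trick; it is not how Delta Lake is implemented (Delta uses an ordered transaction log), and the file name and records are made up.

```python
import json
import os
import tempfile

def atomic_write(path, records):
    """Publish `records` at `path` so readers see either the old or the
    new version, never a partial file: write to a temp file first, then
    atomically swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(records, f)
        os.replace(tmp, path)  # atomic rename on POSIX and Windows
    finally:
        if os.path.exists(tmp):  # clean up if the commit never happened
            os.remove(tmp)

atomic_write("snapshot.json", [{"id": 1, "qos": "good"}])
with open("snapshot.json") as f:
    print(json.load(f))  # -> [{'id': 1, 'qos': 'good'}]
```

A crash before `os.replace` leaves the previous snapshot untouched, which is exactly the property the "ACID compliant" principle asks of the platform.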
  • 9. Platform & Architecture | Current vs New 3 New: 1. A more performant and optimized Spark engine 2. Governance under the same roof
  • 10. Platform & Architecture | Artifacts 4 Key components of the data platform: A World Class Data Platform!
  • 11. Contents: 1. Summary 2. Platform & Architecture 3. Approach 4. Operating Model 5. Additional Details
  • 12. Approach | Our Tenets 1 Because "Approach is the first step towards achieving goals": A. Security is job zero B. Agile methodology C. Continuous delivery of results D. Leverage customer assets first E. Multiple-velocity joint delivery approach F. Zero downtime G. Log the journey at every step to look back & learn H. Principle of least access privilege (PoLAP)
  • 13. Approach | Objectives 2 Build the data strategy roadmap that empowers Telecom Bell to overcome its business challenges, across three horizons plus industrialization: Horizon 1, Strategic roadmap. Horizon 2, Platform: build strong foundations with data platform development and implementation. Horizon 3, Mindset: co-create an operating model that would take TELECOM BELL where it wants to go, in a sustainable way. 4, Industrialization: migrate core applications to the cloud in a secure and reliable way.
  • 14. Contents: 1. Summary 2. Platform & Architecture 3. Approach 4. Operating Model 5. Additional Details
  • 15. Operating Model | Joint Delivery Approach A. Executive Leadership: Databricks Leadership (1), Telecom Bell Leadership (1). B. Program Management: Databricks Lead (1), Telecom Bell Lead (1). C. Delivery pods, staffed jointly: Application Team (Databricks Professional Services 5, Telecom Bell resources 3), Platform Team (Databricks Professional Services 5, Telecom Bell resources 3), Data Quality & Governance (Databricks Professional Services 3, Telecom Bell resources 4), Bringing It Together (Databricks Professional Services 3, Telecom Bell resources 1). D. Meeting Cadence • Bi-weekly Steering Committee meetings • Weekly PMO meetings • Daily delivery team meetings
  • 16. Operating Model | Pod Structure 2 (legend: Databricks resource / Telecom Bell resource; shared resources noted; 16 Databricks resources, 12 Telecom Bell resources). Leadership: Delivery Solutions Architect, Customer Success Engineer. Application Team: Leader, Scrum Master, Functional Domain Expert, Data Visualization Engineer, Customer Success Engineer, Data Engineer, Resident Solutions Architect (shared). Platform: Leader, Azure Platform Cloud Architect, Cloud DevOps Engineer (2), Resident Solutions Architect, Specialist Solutions Architect (Security) (shared), Scrum Master, Enterprise Support. Data Quality & Governance: Leader, Test/Quality Lead, Data Quality Engineer, Data Governance Lead, Data Lineage and Profiling Engineer, Product Owner, Resident Solutions Architect, Specialist Solutions Architect (Security), Cloud DevOps Engineer (shared). Bring It Together: Leader, Delivery Lead, Change Management Specialist, PMO Lead, Roadmap Officer, Enterprise Support.
  • 17. Operating Model | Road Map 3 Program kickoff, then deliverables: 1. Diagnostic of the current environment 2. End state architecture 3. Platform 4. Migration: 10% 5. Migration: 60% 6. Migration: 100%, followed by a celebration of completion. Along the way: measure progress; consistently communicate, remove roadblocks & eliminate friction; celebrate completion of quick wins to strengthen morale. Process goals: 1. Human-centered change: focus on each individual team member's technical skills and capacity for change; reskill team members whose roles are changing. 2. Mindset change: adopt 'Data as a Product', a self-service platform, federated governance and domain-specific ownership. 3. Migration playbook: a repeatable guideline to migrate applications to the new architecture.
  • 18. Operating Model | Timeline 3 (Q2 2023 to Q2 2024; Agile: update roadmap and plan per evolving priorities; quarterly Steerco meetings.) Platform: current-state diagnostics; Databricks workspace setup; Confluent workspace setup; design target state; security and compliance (phase 1, phase 2); cost management reports; cost optimization; move towards infra-as-code; handover. Application: assess current state & catalog critical data elements; define elements/sources/data; refactor the code; test & modify; deploy; document & KT; handover. Data Quality + Governance: assess current-state data governance; prepare governance strategy (identify roles, define interaction model); design & deliver governance structure; best practices and tagging; design target-state DQ monitoring; implement target-state DQ monitoring; handover. Bring It Together: assess skill and capability gaps within the organization; define pods and teams; create upskilling curriculum and set up training sessions; establish ways of working (documentation, win celebrations); talk to the business team; incorporate changes; project management: continuously monitor, foresee and mitigate risks, fetch leadership guidance; arrange handover of all areas.
  • 19. Contents: 1. Summary 2. Platform & Architecture 3. Approach 4. Operating Model 5. Additional Details
  • 20. Additional Details | Future Scope 1 Industrialization: Competitive Differentiation • High throughput of innovation analytics (AI/ML) • Predictive analytics at scale • Data-driven (real-time what-if analysis) • Harmonized MDM; ML- and AI-based DQ • Fast, repeatable time-to-market from idea to product
  • 21. Additional Details | Risk & Mitigation - Technical 1 • Data loss risk: reconciliation, checkpointing, audit, monitoring; use of fault-tolerant ingestion/migration tools such as Azure Data Factory (Copy activity). • Data corruption and data integrity risk: data validation in which each record is compared bidirectionally, checking every record in the old system against the target system and every record in the target system against the old system. • Interference risk (simultaneous use of source applications): align with the stakeholders of each source on how the bandwidth can be shared; the 'Bring It Together' team comes into play to address this. • Schema evolution (changing dimensions): the Delta file format's schema evolution feature, relying on schema-on-read; to further ensure no incompatible schemas come in, a catalog and governance layer would be leveraged (Databricks Unity Catalog). • Authorization risk: MFA and identity federation; access controls at row and column level via Delta Lake. • Data security risk: apply encryption where possible and appropriate; all tokens and keys will be securely stored in Azure Key Vault and rotated at regular intervals. • Downtime due to migration: replicate-and-activate approach.
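The bidirectional comparison described for the data corruption and integrity risk can be sketched in a few lines. This is an in-memory illustration with made-up records and a hypothetical `id` key; at migration scale the same logic would run as a distributed Spark join rather than over Python dictionaries.

```python
def reconcile(source, target, key="id"):
    """Compare records in both directions: report records missing from
    the target, records missing from the source, and keys present in
    both systems whose records differ."""
    src = {r[key]: r for r in source}
    tgt = {r[key]: r for r in target}
    missing_in_target = [src[k] for k in sorted(src.keys() - tgt.keys())]
    missing_in_source = [tgt[k] for k in sorted(tgt.keys() - src.keys())]
    mismatched = sorted(k for k in src.keys() & tgt.keys()
                        if src[k] != tgt[k])
    return missing_in_target, missing_in_source, mismatched

old_system = [{"id": 1, "plan": "5G"}, {"id": 2, "plan": "4G"}]
new_system = [{"id": 1, "plan": "5G"}, {"id": 3, "plan": "IoT"}]
print(reconcile(old_system, new_system))
```

A migration sign-off would require all three result lists to be empty; anything else feeds the reconciliation and audit loop.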
  • 22. Additional Details | Risk & Mitigation - Other 2 • Resource availability & competing priorities: make sure employees are fully advised about participation in workshops and/or interviews; get the right people at the right time. • Senior leadership buy-in and delays in decision making: strong support from the leadership group, including areas not directly affected by the initial changes (one team, one direction); establish governance to provide clarity on accountabilities for decision making. • Potential impacts to other projects: strong support from senior leadership if there is a need to put a hold on existing projects; review the current state of ongoing projects to see how they impact the finance model; prioritize major changes and focus on the big obstacles upfront. • Lack of people adoption (major change): agile and inspirational change management and communication structure; leverage the 'Bring It Together' team, and roles such as change management experts, to steward people readiness and prepare for change. • Design in isolation (enterprise integration): work with scalable and flexible design principles in mind to ensure proper integration and alignment with the business; it is a partnership approach; gather key inputs to support cross-functional process design decisions where applicable. • Availability of key data inputs and information: simplify data requests to collect data and information at the appropriate level of detail; assign designated Databricks and Telecom Bell contacts to ensure a smooth and timely handover of data; use the discovery phase to identify hidden environmental risks to foresee and mitigate.
  • 23. Additional Details | Assumptions 3 1. Platform: Telecom Bell's on-premise platform is owned and managed by Telecom Bell, and Databricks will get the necessary support to extend the setup to provision the solution per the scope of this effort. 2. Data Security: Telecom Bell is responsible for the design, integration and operation of all client identity and access management, security incident and event management, vulnerability scanning and security testing tooling and processes, as appropriate. 5. Access & Setup: Telecom Bell will provide system access to all source systems or applications required by the scope, and will provide access to systems and environments (including DEV, SIT) within 5 business days of receipt of a request. 6. Access & Setup: Databricks personnel will not have access to unencrypted PII data; Telecom Bell will be responsible for encrypting any PII data prior to extraction into the Databricks platform. 7. Access & Setup: PII and GDPR data handling will be done by Telecom Bell per its existing delivery practices; any additional arrangement is out of scope. 9. Project Management: Telecom Bell will provide relevant functional, technical and process documentation for the data platforms and systems required by the scope. 10. Project Management: Telecom Bell will nominate full-time business and technical SMEs aligned to this project per the agreed pod structure. 11. Project Management: Telecom Bell data owners/nominees will make every attempt to attend the Scrum meetings and ceremonies to present their progress on the issues assigned. 12. Project Management: Telecom Bell will make sure we get the required time and support from all stakeholders for the complete success of the project. 14. Data Build: the Databricks team will reuse and extend the existing data ingestion tooling and framework to support ingestion activities into the platform; the project will carry out a data discovery exercise to assess local market data quality and readiness. 15. Data Build: the source system inventory has already been identified and is in place. 16. License: the Cloudera CDH on-premise license expired in March 2022; however, the required extended support has been obtained.
  • 24. Additional Details | Questions 4 • Is there an onboarding guide for the consultants to get started in your environment? • Is there a source system inventory already identified that can be shared? • What are the roles and skills of the existing 10 engineers on the team? • What is the current data governance mechanism? • Other than Cloudera, what other paid subscriptions and packages are installed on the concerned architecture? • Is there any major business contingency on this project plan? If so, what is the impact of a delayed delivery? • What compliance requirements and regulations does Telecom Bell need to follow for the concerned data? • Does Telecom Bell already have an Azure account? If so, what level of enterprise support plan is subscribed? • Does Telecom Bell already have a Confluent account? If so, what level of enterprise support plan is subscribed? • Are any licenses due to expire? • What is Cloudera's extended support expiry date?
  • 25. Thank you Thank you so much for your time today.
  • 26. Yashodhan Kale BACKGROUND SELECTED EXPERIENCES Amazon Web Services Certified Data Analytics - Specialty Amazon Web Services Solutions Architect - Associate Cloudera Certified Developer for Apache Hadoop (CCDH) RELEVANT FUNCTIONAL AND INDUSTRY EXPERIENCE Modern Technologist | Data and ML at scale Design and drive clients' Data and AI journeys powered by cloud analytics expertise. Offering data-product-mindset-driven solutions to deliver platforms and beyond: self-service framework, rapid experimentation lab, democratized data, data products marketplace, multi-cloud solutions, data lake, data fabric, data mesh patterns with federated governance, domain-specific ownership, and more Industry Focus: • Healthcare • Retail • Market Research • Finance Functional Expertise: • Digital Transformation • Analytics and CDO Strategy • Open Source • Machine Learning, IoT • Data-Driven Re-invention • Fortune 5 American healthcare company Establish and manage DevOps, Data Engineering, and ML engineering teams in close collaboration with Data Scientists. Set up a self-service Data and ML platform on Azure cloud for a Retail enterprise, incorporating an experimentation framework, Model Training pipelines, and real-time inference using Azure AKS, Kubeflow, and Snowflake. Implement an Rx enterprise Data and ML platform on Azure cloud, enabling ETL pipelines with Databricks and Apache Airflow. Lead the development of large-scale projects, including legacy modernization, Rx personalization, and Retail personalization programs that impact millions of lives daily. Collaborate with technology partners, MSFT and NVIDIA, to present objectives and findings and incorporate feedback for ML solutions with specialized NVIDIA GPUs. Architect and oversee the implementation of the Refrigerator IoT project on Azure, leveraging IoT Hub, Azure Analytics, and Databricks. Lead the development of SAP HANA to Spark integration.
Manage the enhancement team in Data Engineering for pharmacy-related projects, ensuring critical business deliveries. Design data-driven solutions, including self-service analytics platforms, rapid experimentation labs, democratized data, multi-cloud solutions, data fabric, data mesh patterns with federated governance, and domain-specific ownership. Develop an ingestion framework for seamless data migration across projects and cloud storage services. • Multinational American information, data & market measurement company Build a retail store data aggregation engine (Retail Intelligence system) for 24 countries, initially using Hadoop MapReduce, later upgraded to Spark. Migrate on-premise batch processes to the cloud using Docker, Azure Batch Services, and Azure Shipyard for cost efficiency. Perform performance tuning on Apache Spark, cloud Hadoop clusters (HDI), and Databricks on Azure and Hadoop platforms. CERTIFICATIONS PREVIOUSLY Sr Cloud Solution Architect @ Amazon Web Services Level 6 Sr ML Engineering Manager @ Databricks Level 6 WHAT HAS BROUGHT ME HERE • Customer Obsession • Deliver Results • Earn trust • Learn and Be Curious
  • 27. ACID compliance, Time travel, Data as a product, Interoperability, Self-service experimentation, Scale & pay-as-you-go, Lakehouse, Governance, Data migration, Identity management, SSO, Event streaming, Exactly-once semantics
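"Exactly-once semantics" in this list usually means at-least-once delivery from the broker combined with an idempotent (or transactional) sink, so that redelivered messages have no extra effect. The class below is a hypothetical, in-memory stand-in for a Kafka consumer committing (partition, offset) pairs; it illustrates the pattern, not Confluent's implementation.

```python
class IdempotentSink:
    """Apply each (partition, offset) at most once, so that duplicate
    deliveries from an at-least-once broker leave the state unchanged."""

    def __init__(self):
        self.applied = set()  # committed (partition, offset) pairs
        self.state = {}       # aggregated usage counters

    def apply(self, partition, offset, key, value):
        if (partition, offset) in self.applied:
            return False  # duplicate delivery: skip, no double counting
        self.state[key] = self.state.get(key, 0) + value
        self.applied.add((partition, offset))
        return True

sink = IdempotentSink()
sink.apply(0, 41, "bytes_used", 10)
sink.apply(0, 41, "bytes_used", 10)  # broker redelivers the same offset
print(sink.state)  # -> {'bytes_used': 10}
```

In production the applied-offset set and the state update must be committed atomically (Kafka transactions, or a Delta MERGE keyed on the offset) so the two cannot diverge on failure.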
  • 28. Upfront cost Not easy to integrate/ connect Lack of discoverability Efforts to make data HA & durable End of support Maintenance
  • 29. Platform & Architecture | Artifacts 1 Key components of the data platform: A World Class Data Platform!
  • 31. MLOps
  • 34. Databricks Notebooks 1 Share insights Quickly discover new insights with built-in interactive visualizations, or leverage libraries such as Matplotlib and ggplot. Export results and Notebooks in HTML or IPYNB format, or build and share dashboards that always stay up to date. 3 Production at scale Schedule Notebooks to automatically run machine learning and data pipelines at scale. Create multistage pipelines using Databricks Workflows. Set up alerts and quickly access audit logs for easy monitoring and troubleshooting. 2 Work together Share Notebooks and work with peers across teams in multiple languages (R, Python, SQL and Scala) and libraries of your choice. Real-time coauthoring, commenting and automated versioning simplify collaboration while providing control.
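The "multistage pipelines using Databricks Workflows" point can be pictured as a job whose tasks declare their dependencies, which is how the Databricks Jobs API expresses a DAG (task_key / depends_on / notebook_task). The job name, notebook paths and cron schedule below are hypothetical; the final loop only derives the execution order implied by the declared dependencies.

```python
# Sketch of a three-stage pipeline definition in the shape used by the
# Databricks Jobs API; all names and paths here are illustrative.
job = {
    "name": "telecom-bell-network-qos-pipeline",
    "schedule": {"quartz_cron_expression": "0 0 2 * * ?",
                 "timezone_id": "UTC"},
    "tasks": [
        {"task_key": "ingest",
         "notebook_task": {"notebook_path": "/pipelines/ingest_cdrs"}},
        {"task_key": "transform",
         "depends_on": [{"task_key": "ingest"}],
         "notebook_task": {"notebook_path": "/pipelines/curate_silver"}},
        {"task_key": "publish",
         "depends_on": [{"task_key": "transform"}],
         "notebook_task": {"notebook_path": "/pipelines/gold_dashboards"}},
    ],
}

# Derive the execution order the scheduler would follow from the
# declared dependencies (a simple topological sort).
order = []
remaining = {t["task_key"]: {d["task_key"] for d in t.get("depends_on", [])}
             for t in job["tasks"]}
while remaining:
    ready = sorted(k for k, deps in remaining.items() if deps <= set(order))
    order.append(ready[0])
    del remaining[ready[0]]
print(order)  # -> ['ingest', 'transform', 'publish']
```

The same dict, serialized as JSON, is what a jobs/create API call or a Terraform/infra-as-code definition would carry.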

Editor's Notes

  1. Ex-AWS; Databricks ML Engineering Sr Manager. I have built data platforms and delivered campaign management and personalization that touch millions of lives a day. Worked extensively in the retail, healthcare, telecom and finance industries, worked in 3 different countries, and experienced start-up culture. And I know how to deliver results. Qualities that have brought me here: customer obsession, delivering results, earning trust, and not giving up on learning. FIFA, chess, salsa. Transition: that's me; with that, let's get going.
  2. 10K-ft overview of the business, the technical picture, and the plan, with a slightly deeper look into the platform. Transition: what, how and when.
  3. Personalization, customer engagement, regulations (data privacy and security), data volume, showing growth and profits. Top priority: improve QoS. Transition: let's look at some technical challenges.
  4. To address the business challenges above, a strong data and analytics strategy is needed: maintaining the pace of innovation = experimentation capability = pay-as-you-go is crucial + easy access to data + SaaS model of services. Red flags: the Kafka and Spark architecture which processes network data is at end of support. Transition: we saw the business and technical challenges, so what's the plan? The direction: next slide.
  5. We will improve the QoS of the network and start with migration: Databricks, Confluent, Telecom Bell's 10 engineers; we plan to complete this project in 12 months. Transition: alright, how do we achieve this? I have put together a plan that I will walk you through; feedback, suggestions and concerns are all welcome, and we will craft the final version together.
  6. TB's on-premise architecture would look more or less like this.
  7. Transition: enough time on architecture; let's take two differences and move on from here. 1. More optimized, more performant, with fewer configurations to worry about: Z-Ordering, VACUUM, Auto Optimize features. 2. Integration with UC.
  8. A leap towards a data-as-a-product mindset: federated governance, self-service platform, interoperability, sharing within and across the organization (notebooks and code). Add Marketplace.
  9. Talk about a few in the interest of time. Security: Azure Key Vault, encryption where possible. Network setup: no data will flow through the public internet; private endpoints will be used. Principle of least access privilege (PoLAP). Zero downtime: replicate and then activate. Leverage customer assets first: you will see in the next few slides the 10 engineers distributed across all project areas.
  10. The operating model is designed to deliver these objectives over the next 12 months, after the essential roadmap and planning piece. Platform: not only runs existing apps but empowers Bell to accelerate the pace of innovation and provide solutions beyond the scope of this project, such as personalized customer engagement and other business experiments. Also, the OM delivers a needed shift in mindset: think "data as a product", create a data-product culture, with features of marketplace, federated governance and Delta Sharing. Lastly, it delivers a pay-as-you-go, secure & low-maintenance solution that can handle the immediate need to migrate to the cloud, given the end of support.
  11. Sharing resources where possible; used all 10 TB engineers. Assuming enterprise support from Confluent and Azure. Total count.
  12. Diagnostic: a complete picture of where we are, our pain points, scope for improvement, assets. Final end-state architecture.
  13. Timeline activity. If any party has any concerns, we can definitely relook at this and try to adjust it to make it smoothly achievable.
  14. Ex-AWS, ex-Accenture ML Engineering Sr Manager. I have built data platforms and delivered campaign management and personalization that touch millions of lives a day. Worked extensively in the retail, healthcare, telecom and finance industries, worked in 3 different countries, and experienced start-up culture. And I know how to deliver results. Qualities that have brought me here: customer obsession, delivering results, earning trust, and not giving up on learning. FIFA, chess, salsa. Transition: that's me; with that, let's get going.
  15. A leap towards a data-as-a-product mindset: federated governance, self-service platform, interoperability, sharing within and across the organization (notebooks and code).
  16. A few pain points: these services all run on premise; upgrades. Limitations: the data platform is not scalable for analytics and AI/ML; upfront capacity planning and cost; governance of the data on HDFS is a challenge; data sits in silos and is not easy to integrate/connect; lack of discoverability of data (no catalog); housekeeping: maintenance of the in-house cluster is difficult through different portals and installations; advanced disaster recovery, durability and availability; a bigger IT infra staff is required.
  17. A leap towards a data-as-a-product mindset: federated governance, self-service platform, interoperability, sharing within and across the organization (notebooks and code). Add Marketplace.