Three Big Data Case Studies
Great use cases of Big Data
• Big Data Exploration: Find, visualize, and understand all big data to improve decision making
• Enhanced 360° View of the Customer: Extend existing customer views (CRM, etc.) by incorporating additional internal and external information sources
• Security/Intelligence Extension: Lower risk, detect fraud, and monitor cyber security in real time
• Data Warehouse Augmentation: Integrate big data and data warehouse capabilities to increase operational efficiency
• Operations Analysis: Analyze a variety of machine data for improved business results
Why Big Data
• Greater efficiencies in business processes
• New insights from combining and analyzing data types in new ways
• Develop new business models with resulting increased market presence and revenue
[Diagram labels: File Systems, Relational Data, Content Mgmt, Email, CRM, Supply Chain, ERP, RSS Feeds, Cloud, Custom Sources; Data; Views; Applications/Users]
Atidan Approach
• Implement a Hadoop-centric reference architecture
• Move enterprise batch processing to Hadoop
• Make Hadoop the single point of truth
• Massively reduce ETL by transforming within Hadoop
• Move results and aggregates back to legacy systems for consumption
• Retain, within Hadoop, source files at the finest granularity for re-use
Top Criteria
• Allow users to use familiar consumption interfaces (web, mobile)
• Enable businesses to unlock previously unusable data
[Diagram: Big Data Architecture, high level. Labels: Ingest; Preprocess Raw Data; Simplify Your Warehouse; Unlock Big Data]
Atidan Case Study
Usage Analysis using Hadoop
• Business Need
  • A large conglomerate had to analyze the last 10 years of usage of its web applications using the IIS logs
  • The logs received from IIS were stored in multiple files, e.g. daily logs
  • The data was unstructured free text and also contained irrelevant data
  • The exact analysis criteria, parameters, and desired outcome were not known in advance
• Solution
  • A traditional RDBMS could not handle the problem due to the type and volume of the data and the uncertainty around the ultimate analysis criteria
  • Atidan delivered a Hadoop-based solution that transformed raw data into reports
  • The solution was tolerant of data inconsistencies
  • Hadoop provided elasticity for incremental data addition
  • Scalability in the range of petabytes
  • Depending on data size and complexity, the processing can be scaled from one node to 100 nodes
  • Schema-less architecture helped in dynamically changing the data model and analytics even at a late stage in the project
  • The organization gained completely new and unexpected insights into employee, customer, and vendor/partner behavior
  • Correlations between employees' usage patterns and attrition as well as productivity were established
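The deck does not show the Hadoop jobs themselves. As a rough illustration of the kind of transformation described above, the sketch below counts requests per HTTP status and per month from IIS logs in the W3C extended format, written as a map step and a reduce step in plain Python. The function names and the sample log are hypothetical; the only assumption about the input is that a `#Fields:` header names the columns, as the W3C format does. Rows that do not match the header are skipped rather than failing the run, which mirrors the tolerance of data inconsistencies mentioned in the solution.

```python
from collections import Counter


def map_log_lines(lines):
    """Map step: emit (month, status) for every well-formed request line."""
    fields = []
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The header names the columns of the rows that follow it.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#"):
            continue  # other directives and blank lines carry no requests
        values = line.split()
        if not fields or len(values) != len(fields):
            continue  # skip inconsistent rows instead of failing the job
        row = dict(zip(fields, values))
        if "date" in row and "sc-status" in row:
            yield row["date"][:7], row["sc-status"]  # "YYYY-MM", status code


def reduce_counts(pairs):
    """Reduce step: aggregate the emitted pairs into two counters."""
    by_status, by_month = Counter(), Counter()
    for month, status in pairs:
        by_status[status] += 1
        by_month[month] += 1
    return by_status, by_month


# Hypothetical sample input, including one row that does not fit the header.
SAMPLE_LOG = """\
#Software: Microsoft Internet Information Services
#Fields: date time cs-method cs-uri-stem sc-status
2001-01-03 10:15:00 GET /app/home 200
2001-01-03 10:15:02 GET /app/missing 404
garbage line that does not match the header
2001-02-11 08:00:41 POST /app/orders 201
2001-02-11 08:01:10 GET /app/home 200
""".splitlines()

by_status, by_month = reduce_counts(map_log_lines(SAMPLE_LOG))
print(dict(by_status))
print(dict(by_month))
```

Because the map step reads the column layout from the log itself, a new analysis question (for example, counts per user) only needs a different key in the `yield`, not a new schema.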
[Charts for Usage Analysis using Hadoop: "Request Types" (categories: Accepted…, BadRequest…, Created(201), Forbidden…, Not…, NotFound…, OK(200), Unauthorise…); "Monthly Requests" (January to November, 2001 and 2002); "Users" (categories: Amare, Amit, Bhagat, Mukesh, Praneel, Sanjog, Vimal)]
Big Query - Hive
• The size of data being collected and analyzed in industry for business intelligence (BI) is growing rapidly, making traditional warehousing solutions prohibitively expensive
• MapReduce is low level and complex to write
• Hive provides a high-level, SQL-like query language
• This allows for ad hoc analysis
• The business need not know in advance which patterns to look for
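To make the point about ad hoc, SQL-like analysis concrete, here is a minimal sketch. It does not use Hive: Python's built-in `sqlite3` module stands in for it so the example runs anywhere, and the `requests` table, its columns, and its rows are all hypothetical. HiveQL accepts `SELECT ... GROUP BY` statements of the same general shape, but the exact queries used in the project are not given in the deck.

```python
import sqlite3

# In-memory database standing in for a Hive table over parsed log data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE requests (day TEXT, username TEXT, status INTEGER)")
conn.executemany(
    "INSERT INTO requests VALUES (?, ?, ?)",
    [
        ("2001-01-03", "user_a", 200),
        ("2001-01-03", "user_b", 404),
        ("2001-02-11", "user_a", 201),
        ("2001-02-11", "user_a", 200),
    ],
)

# First question: how many requests per status code?
status_counts = conn.execute(
    "SELECT status, COUNT(*) FROM requests GROUP BY status ORDER BY status"
).fetchall()

# A follow-up question that was not planned in advance: requests per user.
user_counts = conn.execute(
    "SELECT username, COUNT(*) FROM requests GROUP BY username ORDER BY username"
).fetchall()

print(status_counts)
print(user_counts)
```

Each new question is one more query over the same table, with no new hand-written map and reduce functions.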
Atidan Case Study
Customer data collection (KYC) using Hadoop
• Business Need
  • A financial institution had to periodically collect customer data
  • Customers are very reluctant to provide updated data
  • This customer data has to be cross-checked against the billions of transactions the institution receives per day
  • The institution wanted to collate data available in the public domain from known social media sites
  • The data was unstructured free text and also contained irrelevant data
• Solution
  • A graph database is constructed over the extracted social data to analyze transactions
  • Atidan delivered a Hadoop-based solution that transformed raw data into a graph database
  • Aggregate customer information from existing sources, social media, and government sources
  • Analyze transactions to find hidden patterns
  • Enable link analysis and risk monitoring
  • Facilitate decision making (new products) and customer discovery
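The deck does not name the graph database or its query language. As a hedged illustration of what link analysis over such a graph can look like, the sketch below builds an undirected graph from relationship pairs and uses breadth-first search to collect every entity connected to a given customer. The node labels, the edge list, and the function names are hypothetical, and a plain in-memory dictionary stands in for the graph database.

```python
from collections import defaultdict, deque


def build_graph(edges):
    """Store each relationship in both directions (undirected graph)."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def linked_group(graph, start):
    """Breadth-first search: every entity reachable from `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


# Hypothetical relationships aggregated from several sources.
edges = [
    ("customer:C1", "account:A1"),
    ("customer:C2", "account:A2"),
    ("account:A1", "account:A2"),        # a transaction between two accounts
    ("customer:C3", "account:A3"),
    ("customer:C2", "social:handle_x"),  # profile found on a social site
]

graph = build_graph(edges)
group = linked_group(graph, "customer:C1")
print(sorted(group))
```

Here customers C1 and C2 end up in the same group only because of the transaction between their accounts, which is the kind of indirect link that is hard to see in separate tables.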
[Diagram for Customer data collection (KYC) using Hadoop. Labels: Customer Information; Web; Social; Channel Partners; Utility Providers; Aadhar; UIDAI; Big Data Processing; Graph Database; Customer Clustering; Income/Expense changes; Corporate structure changes; AML; Peer group analysis; Pattern Analysis]
Advantages of Hadoop (KYC) Solution to Banks
• Lowers cost of follow-up with users
• Reduces losses by highlighting risky users early
• Graph database based AML
• Insights into
  • New products
  • New customers
  • New loans to existing customers
  • New investment opportunities for customers
• Reduces operational errors
• Traceability of data source
[Diagram labels: AML; Graph Queries; Due Diligence; Risk; Credit Scoring; Mitigation; Analysis; Peer groups; New Prospects; Insights; New Products; New Customers]
Atidan Case Study
Email scanning and categorization using MongoDB
Business Need
Retrieve potentially millions of daily emails from a common webmail account, categorize them, and post them to each individual user's page for frontend access
The existing process had significant performance, reliability, and scalability issues. Users would also receive a lot of spam
Solution
Atidan proposed a MongoDB-Drupal based solution with the following approach:
• A scheduler was created to pull only headers from the common webmail account shared by all users
• The headers were stored in the intermediate catalog in MongoDB
• Data was transformed based on the recipient address and user preferences, and spam was removed. The email body was fetched for the filtered records and saved into the final catalog in MongoDB
• Emails from the final catalog were pushed to the front-end platform (Drupal)
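The steps above can be sketched as a small pipeline. This is an illustration only: the project used Node.js and MongoDB, whereas the sketch is in Python with plain lists standing in for the intermediate and final catalogs. The field names, the spam rule, the preference format (a set of blocked senders per user), and the `fetch_body` callback are all hypothetical. The point it shows is the ordering: filter on headers first, then fetch bodies only for the messages that survive.

```python
SPAM_WORDS = {"lottery", "winner"}  # hypothetical spam rule


def is_spam(header):
    """Flag a header whose subject contains any of the spam words."""
    subject = header["subject"].lower()
    return any(word in subject for word in SPAM_WORDS)


def filter_headers(intermediate_catalog, preferences):
    """Keep headers for known users that pass their preferences and the spam rule."""
    kept = []
    for header in intermediate_catalog:
        blocked_senders = preferences.get(header["recipient"])
        if blocked_senders is None:
            continue  # recipient is not a known user
        if header["sender"] in blocked_senders or is_spam(header):
            continue
        kept.append(header)
    return kept


def build_final_catalog(headers, fetch_body):
    """Fetch the body only for the headers that survived filtering."""
    return [{**header, "body": fetch_body(header["id"])} for header in headers]


# Hypothetical intermediate catalog: headers only, no bodies yet.
intermediate_catalog = [
    {"id": "m1", "recipient": "ann@example.com", "sender": "boss@example.com", "subject": "Quarterly report"},
    {"id": "m2", "recipient": "ann@example.com", "sender": "promo@example.com", "subject": "You are a WINNER"},
    {"id": "m3", "recipient": "bob@example.com", "sender": "news@example.com", "subject": "Weekly digest"},
    {"id": "m4", "recipient": "nobody@example.com", "sender": "x@example.com", "subject": "Hello"},
]
preferences = {"ann@example.com": set(), "bob@example.com": {"news@example.com"}}

bodies_fetched = []  # records which bodies were actually requested


def fetch_body(message_id):
    bodies_fetched.append(message_id)
    return f"(body of {message_id})"


final_catalog = build_final_catalog(
    filter_headers(intermediate_catalog, preferences), fetch_body
)
print([message["id"] for message in final_catalog])
print(bodies_fetched)
```

Of the four headers, only one body is fetched, which is where the saving comes from when most of the daily volume is spam or unwanted mail.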
Key Takeaways
• Leverage the power of MongoDB in processing the "Big Data" of millions of daily emails. It is much faster, easy to scale, and very flexible
• The task was split into multiple sub-tasks, and a better algorithm was used for performance and efficiency
Technologies used
• Node.js (data transformation)
• MongoDB (database)
• Schema-less
• RESTful service to access data from the browser
• Drupal (frontend)
• Basic unit of data storage and transfer was the JSON object
• Storage and querying
  • NoSQL/Simple/Schema-less database
• Advantages
  • Highly scalable, very flexible, simple
• Connectivity
  • Node.js
  • Server-side JavaScript
Thank you!
www.atidan.com
social@atidan.com

Mais conteúdo relacionado

Mais procurados

Big data Presentation
Big data PresentationBig data Presentation
Big data PresentationAswadmehar
 
Big Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture CapabilitiesBig Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture CapabilitiesAshraf Uddin
 
Building a Winning Roadmap for Analytics
Building a Winning Roadmap for AnalyticsBuilding a Winning Roadmap for Analytics
Building a Winning Roadmap for AnalyticsIronside
 
Data mining slides
Data mining slidesData mining slides
Data mining slidessmj
 
Application of predictive analytics
Application of predictive analyticsApplication of predictive analytics
Application of predictive analyticsPrasad Narasimhan
 
Big Data - The 5 Vs Everyone Must Know
Big Data - The 5 Vs Everyone Must KnowBig Data - The 5 Vs Everyone Must Know
Big Data - The 5 Vs Everyone Must KnowBernard Marr
 
Introduction To Predictive Analytics Part I
Introduction To Predictive Analytics   Part IIntroduction To Predictive Analytics   Part I
Introduction To Predictive Analytics Part Ijayroy
 
Big Data Analytics Powerpoint Presentation Slide
Big Data Analytics Powerpoint Presentation SlideBig Data Analytics Powerpoint Presentation Slide
Big Data Analytics Powerpoint Presentation SlideSlideTeam
 
Importance of Data Analytics
 Importance of Data Analytics Importance of Data Analytics
Importance of Data AnalyticsProduct School
 
Big data analytics in banking sector
Big data analytics in banking sectorBig data analytics in banking sector
Big data analytics in banking sectorAnil Rana
 
Big Data - Applications and Technologies Overview
Big Data - Applications and Technologies OverviewBig Data - Applications and Technologies Overview
Big Data - Applications and Technologies OverviewSivashankar Ganapathy
 
Big Data PPT by Rohit Dubey
Big Data PPT by Rohit DubeyBig Data PPT by Rohit Dubey
Big Data PPT by Rohit DubeyRohit Dubey
 
Classification in data mining
Classification in data mining Classification in data mining
Classification in data mining Sulman Ahmed
 
Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...
Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...
Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...Edureka!
 

Mais procurados (20)

Big data Presentation
Big data PresentationBig data Presentation
Big data Presentation
 
Big Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture CapabilitiesBig Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture Capabilities
 
Building a Winning Roadmap for Analytics
Building a Winning Roadmap for AnalyticsBuilding a Winning Roadmap for Analytics
Building a Winning Roadmap for Analytics
 
Data mining slides
Data mining slidesData mining slides
Data mining slides
 
Big data case study collection
Big data   case study collectionBig data   case study collection
Big data case study collection
 
Application of predictive analytics
Application of predictive analyticsApplication of predictive analytics
Application of predictive analytics
 
Big Data - The 5 Vs Everyone Must Know
Big Data - The 5 Vs Everyone Must KnowBig Data - The 5 Vs Everyone Must Know
Big Data - The 5 Vs Everyone Must Know
 
Big Data ppt
Big Data pptBig Data ppt
Big Data ppt
 
Introduction To Predictive Analytics Part I
Introduction To Predictive Analytics   Part IIntroduction To Predictive Analytics   Part I
Introduction To Predictive Analytics Part I
 
Big Data Analytics Powerpoint Presentation Slide
Big Data Analytics Powerpoint Presentation SlideBig Data Analytics Powerpoint Presentation Slide
Big Data Analytics Powerpoint Presentation Slide
 
Big data Analytics
Big data AnalyticsBig data Analytics
Big data Analytics
 
Business intelligence
Business intelligenceBusiness intelligence
Business intelligence
 
Importance of Data Analytics
 Importance of Data Analytics Importance of Data Analytics
Importance of Data Analytics
 
Big data analytics in banking sector
Big data analytics in banking sectorBig data analytics in banking sector
Big data analytics in banking sector
 
Big Data - Applications and Technologies Overview
Big Data - Applications and Technologies OverviewBig Data - Applications and Technologies Overview
Big Data - Applications and Technologies Overview
 
Big Data PPT by Rohit Dubey
Big Data PPT by Rohit DubeyBig Data PPT by Rohit Dubey
Big Data PPT by Rohit Dubey
 
Classification in data mining
Classification in data mining Classification in data mining
Classification in data mining
 
Big data-ppt
Big data-pptBig data-ppt
Big data-ppt
 
Big data
Big dataBig data
Big data
 
Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...
Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...
Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...
 

Destaque

Business case for Big Data Analytics
Business case for Big Data AnalyticsBusiness case for Big Data Analytics
Business case for Big Data AnalyticsVijay Rao
 
Big Data Case Study: Fortune 100 Telco
Big Data Case Study: Fortune 100 TelcoBig Data Case Study: Fortune 100 Telco
Big Data Case Study: Fortune 100 TelcoBlueData, Inc.
 
Big Data - 25 Amazing Facts Everyone Should Know
Big Data - 25 Amazing Facts Everyone Should KnowBig Data - 25 Amazing Facts Everyone Should Know
Big Data - 25 Amazing Facts Everyone Should KnowBernard Marr
 

Destaque (6)

Business case for Big Data Analytics
Business case for Big Data AnalyticsBusiness case for Big Data Analytics
Business case for Big Data Analytics
 
Big Data Case Study: Fortune 100 Telco
Big Data Case Study: Fortune 100 TelcoBig Data Case Study: Fortune 100 Telco
Big Data Case Study: Fortune 100 Telco
 
Big Data Trends
Big Data TrendsBig Data Trends
Big Data Trends
 
Big Data - 25 Amazing Facts Everyone Should Know
Big Data - 25 Amazing Facts Everyone Should KnowBig Data - 25 Amazing Facts Everyone Should Know
Big Data - 25 Amazing Facts Everyone Should Know
 
What is big data?
What is big data?What is big data?
What is big data?
 
What is Big Data?
What is Big Data?What is Big Data?
What is Big Data?
 

Semelhante a Three Big Data Case Studies

Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8Cloudera, Inc.
 
[Webinar] Getting to Insights Faster: A Framework for Agile Big Data
[Webinar] Getting to Insights Faster: A Framework for Agile Big Data[Webinar] Getting to Insights Faster: A Framework for Agile Big Data
[Webinar] Getting to Insights Faster: A Framework for Agile Big DataInfochimps, a CSC Big Data Business
 
Skillwise Big Data part 2
Skillwise Big Data part 2Skillwise Big Data part 2
Skillwise Big Data part 2Skillwise Group
 
Empowering Businesses through Big Data Analytics
Empowering Businesses through  Big Data AnalyticsEmpowering Businesses through  Big Data Analytics
Empowering Businesses through Big Data AnalyticsOlha Hrytsay
 
Unlocking Operational Intelligence from the Data Lake
Unlocking Operational Intelligence from the Data LakeUnlocking Operational Intelligence from the Data Lake
Unlocking Operational Intelligence from the Data LakeMongoDB
 
Overview - IBM Big Data Platform
Overview - IBM Big Data PlatformOverview - IBM Big Data Platform
Overview - IBM Big Data PlatformVikas Manoria
 
Business Intelligence Architecture
Business Intelligence ArchitectureBusiness Intelligence Architecture
Business Intelligence ArchitecturePhilippe Julio
 
Deteo. Data science, Big Data expertise
Deteo. Data science, Big Data expertise Deteo. Data science, Big Data expertise
Deteo. Data science, Big Data expertise deteo
 
Hadoop in the Cloud: Common Architectural Patterns
Hadoop in the Cloud: Common Architectural PatternsHadoop in the Cloud: Common Architectural Patterns
Hadoop in the Cloud: Common Architectural PatternsDataWorks Summit
 
Big Data Paris - A Modern Enterprise Architecture
Big Data Paris - A Modern Enterprise ArchitectureBig Data Paris - A Modern Enterprise Architecture
Big Data Paris - A Modern Enterprise ArchitectureMongoDB
 
Modern Data Architectures for Business Outcomes
Modern Data Architectures for Business OutcomesModern Data Architectures for Business Outcomes
Modern Data Architectures for Business OutcomesAmazon Web Services
 
Creating a Modern Data Architecture for Digital Transformation
Creating a Modern Data Architecture for Digital TransformationCreating a Modern Data Architecture for Digital Transformation
Creating a Modern Data Architecture for Digital TransformationMongoDB
 
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...Denodo
 
Foundation of Business Intelligence for Business Firms .ppt
Foundation of Business Intelligence for Business Firms .pptFoundation of Business Intelligence for Business Firms .ppt
Foundation of Business Intelligence for Business Firms .pptRoshni814224
 
New Innovations in Information Management for Big Data - Smarter Business 2013
New Innovations in Information Management for Big Data - Smarter Business 2013New Innovations in Information Management for Big Data - Smarter Business 2013
New Innovations in Information Management for Big Data - Smarter Business 2013IBM Sverige
 

Semelhante a Three Big Data Case Studies (20)

Accelerating Data Warehouse Modernization
Accelerating Data Warehouse ModernizationAccelerating Data Warehouse Modernization
Accelerating Data Warehouse Modernization
 
Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8
 
[Webinar] Getting to Insights Faster: A Framework for Agile Big Data
[Webinar] Getting to Insights Faster: A Framework for Agile Big Data[Webinar] Getting to Insights Faster: A Framework for Agile Big Data
[Webinar] Getting to Insights Faster: A Framework for Agile Big Data
 
Skillwise Big Data part 2
Skillwise Big Data part 2Skillwise Big Data part 2
Skillwise Big Data part 2
 
Skilwise Big data
Skilwise Big dataSkilwise Big data
Skilwise Big data
 
Empowering Businesses through Big Data Analytics
Empowering Businesses through  Big Data AnalyticsEmpowering Businesses through  Big Data Analytics
Empowering Businesses through Big Data Analytics
 
Unlocking Operational Intelligence from the Data Lake
Unlocking Operational Intelligence from the Data LakeUnlocking Operational Intelligence from the Data Lake
Unlocking Operational Intelligence from the Data Lake
 
Overview - IBM Big Data Platform
Overview - IBM Big Data PlatformOverview - IBM Big Data Platform
Overview - IBM Big Data Platform
 
Bi orientations
Bi orientationsBi orientations
Bi orientations
 
Business Intelligence Architecture
Business Intelligence ArchitectureBusiness Intelligence Architecture
Business Intelligence Architecture
 
Deteo. Data science, Big Data expertise
Deteo. Data science, Big Data expertise Deteo. Data science, Big Data expertise
Deteo. Data science, Big Data expertise
 
Hadoop in the Cloud: Common Architectural Patterns
Hadoop in the Cloud: Common Architectural PatternsHadoop in the Cloud: Common Architectural Patterns
Hadoop in the Cloud: Common Architectural Patterns
 
Big Data Paris - A Modern Enterprise Architecture
Big Data Paris - A Modern Enterprise ArchitectureBig Data Paris - A Modern Enterprise Architecture
Big Data Paris - A Modern Enterprise Architecture
 
Modern Data Architectures for Business Outcomes
Modern Data Architectures for Business OutcomesModern Data Architectures for Business Outcomes
Modern Data Architectures for Business Outcomes
 
Creating a Modern Data Architecture for Digital Transformation
Creating a Modern Data Architecture for Digital TransformationCreating a Modern Data Architecture for Digital Transformation
Creating a Modern Data Architecture for Digital Transformation
 
Hadoop and Your Enterprise Data Warehouse
Hadoop and Your Enterprise Data WarehouseHadoop and Your Enterprise Data Warehouse
Hadoop and Your Enterprise Data Warehouse
 
Retail & CPG
Retail & CPGRetail & CPG
Retail & CPG
 
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
 
Foundation of Business Intelligence for Business Firms .ppt
Foundation of Business Intelligence for Business Firms .pptFoundation of Business Intelligence for Business Firms .ppt
Foundation of Business Intelligence for Business Firms .ppt
 
New Innovations in Information Management for Big Data - Smarter Business 2013
New Innovations in Information Management for Big Data - Smarter Business 2013New Innovations in Information Management for Big Data - Smarter Business 2013
New Innovations in Information Management for Big Data - Smarter Business 2013
 

Último

From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 

Último (20)

From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 

Three Big Data Case Studies

  • 2. Great use cases of Big Data Big Data Exploration Find, visualize, understand all big data to improve decision making Enhanced 3600 View of the Customer Extend existing customer views (CRM, etc) by incorporating additional internal and external information sources Security/Intelligence Extension Lower risk, detect fraud and monitor cyber security in real-time Data Warehouse Augmentation Integrate big data and data warehouse capabilities to increase operational efficiency Operations Analysis Analyze a variety of machine data for improved business results
  • 3. • Greater efficiencies in business processes • New insights from combining and analyzing data types in new ways • Develop new business models with resulting increased market presence and revenue Why Big Data File Systems Relational Data Content Mgmt Email CRM Supply Chain ERP RSS Feeds Cloud Custom SourcesDataViews Applications/ Users
  • 4. Atidan Approach Implement a Hadoop- centric reference architecture Move enterprise batch processing to Hadoop Make Hadoop the single point of truth Massively reduce ETL by transforming within Hadoop Move results and aggregates back to legacy systems for consumption Retain, within Hadoop, source files at the finest granularity for re-use Top Criteria • Allow users to use familiar consumption interfaces (web, mobile) • Enable businesses to unlock previously unusable data Unlock Big Data Simplify Your Warehouse Preprocess Raw Data Ingest BigData ArchitectureHighlevel
  • 5.
  • 6. Atidan Case Study Usage Analysis using Hadoop • Business Need • A large conglomerate had to analyze the last 10 years usage of its web applications by using the IIS logs • The logs received from IIS were stored in multiple files e.g. Daily logs • The data had free text, it was unstructured and it also contained irrelevant data • The exact analysis criteria/parameters/desired outcome were not pre-known • Solution • Traditional RDBMS could not handle the problem due to the type and volume of the data and the uncertainty around ultimate analysis criteria • Atidan delivered a Hadoop based solution that performed transformation of raw data into reports easily • The solution was fault tolerant to data inconsistencies • Hadoop provided elasticity to incremental data addition • Scalability in the range of Peta Bytes • Based on data size and complexity, the processing can be scaled from one node to 100 nodes • Schema-less architecture helped in dynamically changing the data model and analytics even at a late stage in the project • The organization got completely new and unexpected insights on employee, customer and vendor/partner behavior • Correlations between employee’s usage pattern and attrition as well as productivity were established
  • 7. Atidan Case Study Usage Analysis using Hadoop [Charts: Request Types (request counts by HTTP status, e.g. Created (201), OK (200)) • Monthly Requests (January to November, 2001 and 2002) • Users (request counts for Amare, Amit, Bhagat, Mukesh, Praneel, Sanjog, Vimal)]
  • 8. Big Query - Hive • The volume of data collected and analyzed in industry for business intelligence (BI) is growing rapidly, making traditional warehousing solutions prohibitively expensive • MapReduce is low-level and complex to write • Hive provides a high-level, SQL-like query language • This allows for ad-hoc analysis • The business need not know in advance which patterns to look for
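To show what "SQL-like" buys over hand-written map and reduce functions, here is a small sketch that uses Python's built-in `sqlite3` module as a stand-in for Hive. It is not Hive itself: the `requests` table, its columns and the sample rows are assumptions for illustration, and only the general shape of the queries (a `SELECT` with `GROUP BY`) is what a Hive user would also write.

```python
import sqlite3

# Hypothetical rows: (day, username, HTTP status).
rows = [
    ("2001-01-03", "amit", 200),
    ("2001-01-03", "amit", 404),
    ("2001-02-11", "vimal", 201),
    ("2001-02-11", "vimal", 200),
]

conn = sqlite3.connect(":memory:")  # in-memory database, stand-in for Hive
conn.execute("CREATE TABLE requests (day TEXT, username TEXT, status INTEGER)")
conn.executemany("INSERT INTO requests VALUES (?, ?, ?)", rows)

# Ad-hoc question 1: how many requests per month?
monthly = conn.execute(
    """
    SELECT substr(day, 1, 7) AS month, COUNT(*) AS hits
    FROM requests
    GROUP BY substr(day, 1, 7)
    ORDER BY month
    """
).fetchall()

# Ad-hoc question 2, thought of later: which users hit client errors?
errors_by_user = conn.execute(
    """
    SELECT username, COUNT(*) AS errors
    FROM requests
    WHERE status >= 400
    GROUP BY username
    """
).fetchall()

conn.close()
```

Each new question is one declarative query rather than a new pair of map and reduce functions, which is why the pattern to look for does not have to be known in advance.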
  • 9.
  • 10. Atidan Case Study Customer data collection (KYC) using Hadoop • Business Need • A financial institution had to collect customer data periodically • Customers are very reluctant to provide updated data • The customer data has to be cross-checked against the billions of transactions the institution receives per day • The institution wanted to collate data available in the public domain from known social media sites • The data was unstructured free text and contained irrelevant entries • Solution • A graph database is constructed over the extracted social data to analyze transactions • Atidan delivered a Hadoop-based solution that transformed the raw data into a graph database • Aggregate customer information from existing sources, social media and government sources • Analyze transactions to find hidden patterns • Enable link analysis and risk monitoring • Facilitate decision making (new products) and customer discovery
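Link analysis over a transaction graph can be illustrated with a short, standard-library Python sketch. It is only an assumption about how such an analysis might start, not the graph database used in the project: the customer IDs, amounts and the helpers `build_graph` and `linked_groups` are hypothetical. Customers are nodes, each transaction is an undirected edge, and a breadth-first search groups customers that are connected directly or through intermediaries.

```python
from collections import defaultdict, deque

# Hypothetical transactions: (payer, payee, amount).
transactions = [
    ("cust_A", "cust_B", 1200.0),
    ("cust_B", "cust_C", 300.0),
    ("cust_D", "cust_E", 9800.0),
]


def build_graph(edges):
    """Build an undirected adjacency map from (payer, payee, amount) rows."""
    graph = defaultdict(set)
    for src, dst, _amount in edges:
        graph[src].add(dst)
        graph[dst].add(src)
    return graph


def linked_groups(graph):
    """Return connected components: customers linked by any chain of payments."""
    seen = set()
    groups = []
    for start in sorted(graph):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        group = []
        while queue:  # breadth-first search from the unvisited start node
            node = queue.popleft()
            group.append(node)
            for neighbour in sorted(graph[node]):
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        groups.append(sorted(group))
    return groups


graph = build_graph(transactions)
groups = linked_groups(graph)
```

In this sample, `cust_A` and `cust_C` never transact directly but land in the same group through `cust_B`; surfacing that kind of indirect link is the purpose of link analysis in a KYC or AML review.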
  • 11. Atidan Case Study Customer data collection (KYC) using Hadoop [Diagram: Customer Information sources (Web, Social, Channel Partners, Utility Providers, Aadhar UIDAI) • Big Data Processing • Graph Database • Customer Clustering • Income/Expense changes • Corporate structure changes • AML • Peer group analysis • Pattern Analysis]
  • 12. Advantages of Hadoop (KYC) Solution to Banks • Lowers the cost of follow-up with users • Reduces losses by highlighting risky users early • Graph-database-based AML • Insights into • New products • New customers • New loans to existing customers • New investment opportunities for customers • Reduces operational errors • Traceability of data source [Diagram labels: AML, Graph Queries, Due Diligence, Risk, Credit Scoring, Mitigation, Analysis, Peer groups, New Prospects, Insights, New Products, New Customers]
  • 13.
  • 14. Atidan Case Study Email scanning and categorization using MongoDB • Business Need • Retrieve potentially millions of daily emails from a common webmail account, categorize them and post them to each individual user's page for frontend access • The existing process had significant performance, reliability and scalability issues; users also received a lot of spam • Solution • Atidan proposed a MongoDB-Drupal based solution with the following approach: • A scheduler pulls only the headers from the common all-user webmail account • The headers are stored in an intermediate catalog in MongoDB • The data is transformed based on the recipient address and user preferences, and spam is removed • The email body is fetched only for the filtered records and saved into the final catalog in MongoDB • Emails from the final catalog are pushed to the frontend platform (Drupal) • Key Takeaways • MongoDB handled the 'Big Data' of millions of daily emails; it is much faster, easy to scale and very flexible • The task was split into multiple sub-tasks, and a better algorithm was used for performance and efficiency
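The headers-first approach described on this slide can be sketched as a small Python pipeline. Plain lists and dicts stand in for the webmail account and the MongoDB catalogs, so no database driver is involved; the field names, the `blocked_words` preference and all function names are assumptions made for illustration (the original solution used Node.js for the transformation).

```python
# Hypothetical mailbox: full messages as they sit in the common webmail account.
MAILBOX = [
    {"id": 1, "to": "amit@example.com", "subject": "Quarterly report", "body": "Q1 numbers"},
    {"id": 2, "to": "amit@example.com", "subject": "WIN A PRIZE now", "body": "Click here"},
    {"id": 3, "to": "unknown@example.com", "subject": "Hello", "body": "Hi"},
]

# Hypothetical per-user preferences keyed by recipient address.
PREFERENCES = {"amit@example.com": {"blocked_words": ["prize"]}}


def pull_headers(mailbox):
    """Step 1: copy only the lightweight header fields, never the body."""
    return [{"id": m["id"], "to": m["to"], "subject": m["subject"]} for m in mailbox]


def is_spam(header, blocked_words):
    """Flag a header whose subject contains any blocked word (case-insensitive)."""
    subject = header["subject"].lower()
    return any(word in subject for word in blocked_words)


def build_final_catalog(intermediate, mailbox_by_id, preferences):
    """Steps 3-4: filter on recipient and preferences, then fetch bodies."""
    final = []
    for header in intermediate:
        prefs = preferences.get(header["to"])
        if prefs is None:
            continue  # no known user for this recipient address
        if is_spam(header, prefs["blocked_words"]):
            continue
        # The body is fetched only for records that survived the filters.
        final.append(dict(header, body=mailbox_by_id[header["id"]]["body"]))
    return final


intermediate_catalog = pull_headers(MAILBOX)  # step 2: intermediate catalog
mailbox_by_id = {m["id"]: m for m in MAILBOX}
final_catalog = build_final_catalog(intermediate_catalog, mailbox_by_id, PREFERENCES)
```

The design point is ordering: filtering on cheap headers first means the expensive body fetch runs only for mail that will actually be shown to a user.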
  • 15. Atidan Case Study Email scanning and categorization using MongoDB
  • 16. Technologies used • Node.js (data transformation) • MongoDB (database) • Schema-less • RESTful service to access data from the browser • Drupal (frontend) • The basic unit of data storage and transfer was the JSON object • Storage and querying • NoSQL, simple, schema-less database • Advantages • Highly scalable, very flexible, simple • Connectivity • Node.js: server-side JavaScript