SlideShare uma empresa Scribd logo
1 de 31
LOG DATA ANALYSIS PLATFORM
May, 2015
Agenda
1) User-Group Introduction
2) Problematic
3) Log Data Analysis System Overview
4) Task Analysis
5) Solution Architecture
6) Trade-off Analysis
7) Automation
8) Performance Testing
9) Outcome & Plans
PROBLEMATIC
Demo Lab: Why we’ve started this project?
1) Increase Internal Experience
2) Create Reference Solution w/o NDA Limitations
3) Get Playground for Tests
4) Provide Demo Environment for Customers (using their data)
5) Decrease time to Market (by introducing automation)
LOG DATA ANALYSIS PLATFORM :
OVERVIEW
Log Data Analysis Platform Details
Key Facts:
• ~270-300 Web Servers
• Log Types: HTTPD Access
logs, Error logs, Application
Server Servlet, OS Service
Logs
• ~500K events per minute
• 150GB of data per day
Technologies:
• Flume
• Hadoop/HDFS, MapReduce
• Hive, Impala
• Oozie
• Elasticsearch, Kibana 3
• Tableau Analytics platform
• Puppet + Vagrant
Log Data Examples
Access log:
127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326
Error log:
[Sun Mar 7 20:58:27 2004] [info] [client 64.242.88.10] (104)Connection reset by peer: client
stopped connection before send body completed
[Sun Mar 7 21:16:17 2004] [error] [client 24.70.56.49] File does not exist:
/home/httpd/twiki/view/Main/WebHome
Vmstat
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 305416 260688 29160 2356920 2 2 4 1 0 0 6 1 92 2 0
iostat
Linux 2.6.32-100.28.5.el6.x86_64 (dev-db) 07/09/2011
avg-cpu: %user %nice %system %iowait %steal %idle
5.68 0.00 0.52 2.03 0.00 91.76
TASK ANALYSIS
Architecture Drivers: Use Cases
Architecture Drivers: Quality Attributes (1/3)
Architecture Drivers: Quality Attributes (2/3)
Architecture Drivers: Quality Attributes (3/3)
Architecture Drivers: Limitations
Demo Lab: Marketecture
SOLUTION ARCHITECTURE
Solution Architecture
Batch Layer Serving Layer
Speed Layer
Raw Data
Storage
Data
Strea
m
Real-time
Views
Static Views
Precomputing
Precomputing
Ad-hoc Batch
Views
Static Batch
Views
Corporate BI
Tool
Legend:
Layer boundary
Data flow (with direction indicated)
Query flow
Apache HTTP Servers
Raw Data
Storage Pre-computing Batch Views
Real-Time Views
Dashboard/
Search
Data Stream
Real-Time Processing and
Aggregations
BI Tool
 Avro as a Raw Data Storage file
format
 Parquet as a Batch Views file
format
 Star schema as a Batch Views
data model
Architecture: Flume Topology
Batch ETL
TRADE-OFF ANALYSIS
Distribution Selection
Hive Stinger vs Impala
Compression Ratio
Access Speed
AUTOMATION
Automation (saves time and money)
80% 20%
Development and Debugging F&P Testing, Demo
Local Development Cloud Development
vagrant up
Automation Process
Phase Tool Notes
VM Provisioning Vagrant — Supports:
VirtualBox, VMWare ESX, Amazon AWS
VM Bootstraping Puppet — Installs Cloudera Manager, Cloudera Distribution
Hadoop, ElasticSearch+Kibana, Flume, Microstrategy, Log
Generator.
— Creates Cluster using Cloudera Manager API.
Configure ETL
and BI
Puppet — Configures Flume, Oozie, ElasticSearch, Impala, Hive,
Microstrategy Dashboards
Integration Tests Puppet — Generates Workload and ensures data go through.
— Checks Logs for errors.
— Calculates timing/throughput.
PERFORMANCE TESTING
Log Generator
1 Thread can generate:
4200 events / second (File source)
5500 events / second (TCP source)
Accurate Sizing
100k/min
50k/min
20k/min
200k
/min
Calculator!
OUTCOME & PLANS
Outcome
1) Demo lab, playground, testing platform (in 1 hour)
2) Sizing Calculator
3) Help to get 3 new customers (one is really, really
huge)
4) Strategic Partnership with Cloudera
5) Tons of experience and fun 
Plans
1) Add support for other Hadoop Distributions
(Hortonworks, MapR)
2) Make Project Open-Source
Thank You!
31
SoftServe US Office
One Congress Plaza,
111 Congress Avenue, Suite 2700 Austin, TX
78701
Tel: 512.516.8880
Contacts
Valentyn Kropov
vkrop@softserveinc.com
Tel: 866.687.3588 x4341

Mais conteúdo relacionado

Mais procurados

MLLeap, or How to Productionize Data Science Workflows Using Spark by Mikha...
  MLLeap, or How to Productionize Data Science Workflows Using Spark by Mikha...  MLLeap, or How to Productionize Data Science Workflows Using Spark by Mikha...
MLLeap, or How to Productionize Data Science Workflows Using Spark by Mikha...
Spark Summit
 

Mais procurados (20)

Metail at Cambridge AWS User Group Main Meetup #3
Metail at Cambridge AWS User Group Main Meetup #3Metail at Cambridge AWS User Group Main Meetup #3
Metail at Cambridge AWS User Group Main Meetup #3
 
An Introduction to the Heatmap / Histogram Plugin
An Introduction to the Heatmap / Histogram PluginAn Introduction to the Heatmap / Histogram Plugin
An Introduction to the Heatmap / Histogram Plugin
 
ADF Mapping Data Flow Private Preview Migration
ADF Mapping Data Flow Private Preview MigrationADF Mapping Data Flow Private Preview Migration
ADF Mapping Data Flow Private Preview Migration
 
Presto: Distributed sql query engine
Presto: Distributed sql query engine Presto: Distributed sql query engine
Presto: Distributed sql query engine
 
Hopsworks - The Platform for Data-Intensive AI
Hopsworks - The Platform for Data-Intensive AIHopsworks - The Platform for Data-Intensive AI
Hopsworks - The Platform for Data-Intensive AI
 
Sql to dax
Sql to daxSql to dax
Sql to dax
 
Module Owb Repositories
Module Owb RepositoriesModule Owb Repositories
Module Owb Repositories
 
Anomaly Detection using Spark MLlib and Spark Streaming
Anomaly Detection using Spark MLlib and Spark StreamingAnomaly Detection using Spark MLlib and Spark Streaming
Anomaly Detection using Spark MLlib and Spark Streaming
 
Prometheus loves Grafana
Prometheus loves GrafanaPrometheus loves Grafana
Prometheus loves Grafana
 
Apache NiFi: A Drag and Drop Approach
Apache NiFi: A Drag and Drop ApproachApache NiFi: A Drag and Drop Approach
Apache NiFi: A Drag and Drop Approach
 
Advanced MLflow: Multi-Step Workflows, Hyperparameter Tuning and Integrating ...
Advanced MLflow: Multi-Step Workflows, Hyperparameter Tuning and Integrating ...Advanced MLflow: Multi-Step Workflows, Hyperparameter Tuning and Integrating ...
Advanced MLflow: Multi-Step Workflows, Hyperparameter Tuning and Integrating ...
 
MLeap: Productionize Data Science Workflows Using Spark
MLeap: Productionize Data Science Workflows Using SparkMLeap: Productionize Data Science Workflows Using Spark
MLeap: Productionize Data Science Workflows Using Spark
 
Automatic Forecasting using Prophet, Databricks, Delta Lake and MLflow
Automatic Forecasting using Prophet, Databricks, Delta Lake and MLflowAutomatic Forecasting using Prophet, Databricks, Delta Lake and MLflow
Automatic Forecasting using Prophet, Databricks, Delta Lake and MLflow
 
Kibana + timelion: time series with the elastic stack
Kibana + timelion: time series with the elastic stackKibana + timelion: time series with the elastic stack
Kibana + timelion: time series with the elastic stack
 
MLLeap, or How to Productionize Data Science Workflows Using Spark by Mikha...
  MLLeap, or How to Productionize Data Science Workflows Using Spark by Mikha...  MLLeap, or How to Productionize Data Science Workflows Using Spark by Mikha...
MLLeap, or How to Productionize Data Science Workflows Using Spark by Mikha...
 
Azure Data Factory for Redmond SQL PASS UG Sept 2018
Azure Data Factory for Redmond SQL PASS UG Sept 2018Azure Data Factory for Redmond SQL PASS UG Sept 2018
Azure Data Factory for Redmond SQL PASS UG Sept 2018
 
Managed Feature Store for Machine Learning
Managed Feature Store for Machine LearningManaged Feature Store for Machine Learning
Managed Feature Store for Machine Learning
 
Microsoft Azure Data Factory Hands-On Lab Overview Slides
Microsoft Azure Data Factory Hands-On Lab Overview SlidesMicrosoft Azure Data Factory Hands-On Lab Overview Slides
Microsoft Azure Data Factory Hands-On Lab Overview Slides
 
Building Your First Apache Apex (Next Gen Big Data/Hadoop) Application
Building Your First Apache Apex (Next Gen Big Data/Hadoop) ApplicationBuilding Your First Apache Apex (Next Gen Big Data/Hadoop) Application
Building Your First Apache Apex (Next Gen Big Data/Hadoop) Application
 
A Short Presentation on Kafka
A Short Presentation on KafkaA Short Presentation on Kafka
A Short Presentation on Kafka
 

Destaque

Well Log Interpretation and Petrophysical Analisis in [Autosaved]
Well Log Interpretation and Petrophysical Analisis in [Autosaved]Well Log Interpretation and Petrophysical Analisis in [Autosaved]
Well Log Interpretation and Petrophysical Analisis in [Autosaved]
Ridho Nanda Pratama
 

Destaque (8)

Application Monitoring with WSO2 App Server
Application Monitoring with WSO2 App ServerApplication Monitoring with WSO2 App Server
Application Monitoring with WSO2 App Server
 
Designing Big Data Systems Like a Pro
Designing Big Data Systems Like a ProDesigning Big Data Systems Like a Pro
Designing Big Data Systems Like a Pro
 
Cloudera Impala
Cloudera ImpalaCloudera Impala
Cloudera Impala
 
Approaching Quality in Digital Era
Approaching Quality in Digital EraApproaching Quality in Digital Era
Approaching Quality in Digital Era
 
Impala Performance Update
Impala Performance UpdateImpala Performance Update
Impala Performance Update
 
Training Webinar: Effective Platform Server Monitoring
Training Webinar: Effective Platform Server MonitoringTraining Webinar: Effective Platform Server Monitoring
Training Webinar: Effective Platform Server Monitoring
 
Well Log Interpretation and Petrophysical Analisis in [Autosaved]
Well Log Interpretation and Petrophysical Analisis in [Autosaved]Well Log Interpretation and Petrophysical Analisis in [Autosaved]
Well Log Interpretation and Petrophysical Analisis in [Autosaved]
 
Well log data processing
Well log data processingWell log data processing
Well log data processing
 

Semelhante a Log Data Analysis Platform by Valentin Kropov

Microsoft SQL Server - StreamInsight Overview Presentation
Microsoft SQL Server - StreamInsight Overview PresentationMicrosoft SQL Server - StreamInsight Overview Presentation
Microsoft SQL Server - StreamInsight Overview Presentation
Microsoft Private Cloud
 
Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...
Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...
Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...
Cloudera, Inc.
 

Semelhante a Log Data Analysis Platform by Valentin Kropov (20)

Apache Eagle: Secure Hadoop in Real Time
Apache Eagle: Secure Hadoop in Real TimeApache Eagle: Secure Hadoop in Real Time
Apache Eagle: Secure Hadoop in Real Time
 
Apache Eagle at Hadoop Summit 2016 San Jose
Apache Eagle at Hadoop Summit 2016 San JoseApache Eagle at Hadoop Summit 2016 San Jose
Apache Eagle at Hadoop Summit 2016 San Jose
 
Apache Eagle Architecture Evolvement
Apache Eagle Architecture EvolvementApache Eagle Architecture Evolvement
Apache Eagle Architecture Evolvement
 
Server Monitoring (Scaling while bootstrapped)
Server Monitoring  (Scaling while bootstrapped)Server Monitoring  (Scaling while bootstrapped)
Server Monitoring (Scaling while bootstrapped)
 
Metadata and Provenance for ML Pipelines with Hopsworks
Metadata and Provenance for ML Pipelines with Hopsworks Metadata and Provenance for ML Pipelines with Hopsworks
Metadata and Provenance for ML Pipelines with Hopsworks
 
What is going on? Application Diagnostics on Azure - Copenhagen .NET User Group
What is going on? Application Diagnostics on Azure - Copenhagen .NET User GroupWhat is going on? Application Diagnostics on Azure - Copenhagen .NET User Group
What is going on? Application Diagnostics on Azure - Copenhagen .NET User Group
 
Productionizing Machine Learning with a Microservices Architecture
Productionizing Machine Learning with a Microservices ArchitectureProductionizing Machine Learning with a Microservices Architecture
Productionizing Machine Learning with a Microservices Architecture
 
Cowboy Dating with Big Data or DWH Evolution in Action, Борис Трофимов
Cowboy Dating with Big Data or DWH Evolution in Action, Борис ТрофимовCowboy Dating with Big Data or DWH Evolution in Action, Борис Трофимов
Cowboy Dating with Big Data or DWH Evolution in Action, Борис Трофимов
 
Microsoft SQL Server - StreamInsight Overview Presentation
Microsoft SQL Server - StreamInsight Overview PresentationMicrosoft SQL Server - StreamInsight Overview Presentation
Microsoft SQL Server - StreamInsight Overview Presentation
 
Supercharge Your Product Development with Continuous Delivery & Serverless Co...
Supercharge Your Product Development with Continuous Delivery & Serverless Co...Supercharge Your Product Development with Continuous Delivery & Serverless Co...
Supercharge Your Product Development with Continuous Delivery & Serverless Co...
 
Sumo Logic QuickStart Webinar - Jan 2016
Sumo Logic QuickStart Webinar - Jan 2016Sumo Logic QuickStart Webinar - Jan 2016
Sumo Logic QuickStart Webinar - Jan 2016
 
Visualizing Big Data in Realtime
Visualizing Big Data in RealtimeVisualizing Big Data in Realtime
Visualizing Big Data in Realtime
 
Service quality monitoring system architecture
Service quality monitoring system architectureService quality monitoring system architecture
Service quality monitoring system architecture
 
Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...
Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...
Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...
 
Bquery Reporting & Analytics Architecture
Bquery Reporting & Analytics ArchitectureBquery Reporting & Analytics Architecture
Bquery Reporting & Analytics Architecture
 
Cowboy dating with big data, Борис Трофімов
Cowboy dating with big data, Борис ТрофімовCowboy dating with big data, Борис Трофімов
Cowboy dating with big data, Борис Трофімов
 
Apache Eagle in Action
Apache Eagle in ActionApache Eagle in Action
Apache Eagle in Action
 
BizSpark Startup Night Windows Azure March 29, 2011
BizSpark Startup Night Windows Azure March 29, 2011BizSpark Startup Night Windows Azure March 29, 2011
BizSpark Startup Night Windows Azure March 29, 2011
 
Evolving Architecture
Evolving ArchitectureEvolving Architecture
Evolving Architecture
 
Cloud Lambda Architecture Patterns
Cloud Lambda Architecture PatternsCloud Lambda Architecture Patterns
Cloud Lambda Architecture Patterns
 

Mais de SoftServe

Mais de SoftServe (20)

Digital Product Security
Digital Product SecurityDigital Product Security
Digital Product Security
 
Testing Tools and Tips
Testing Tools and TipsTesting Tools and Tips
Testing Tools and Tips
 
Android Mobile Application Testing: Human Interface Guideline, Tools
Android Mobile Application Testing: Human Interface Guideline, ToolsAndroid Mobile Application Testing: Human Interface Guideline, Tools
Android Mobile Application Testing: Human Interface Guideline, Tools
 
Android Mobile Application Testing: Specific Functional, Performance, Device ...
Android Mobile Application Testing: Specific Functional, Performance, Device ...Android Mobile Application Testing: Specific Functional, Performance, Device ...
Android Mobile Application Testing: Specific Functional, Performance, Device ...
 
How to Reduce Time to Market Using Microsoft DevOps Solutions
How to Reduce Time to Market Using Microsoft DevOps SolutionsHow to Reduce Time to Market Using Microsoft DevOps Solutions
How to Reduce Time to Market Using Microsoft DevOps Solutions
 
Containerization: The DevOps Revolution
Containerization: The DevOps Revolution Containerization: The DevOps Revolution
Containerization: The DevOps Revolution
 
Essential Data Engineering for Data Scientist
Essential Data Engineering for Data Scientist Essential Data Engineering for Data Scientist
Essential Data Engineering for Data Scientist
 
Rapid Prototyping for Big Data with AWS
Rapid Prototyping for Big Data with AWS Rapid Prototyping for Big Data with AWS
Rapid Prototyping for Big Data with AWS
 
Implementing Test Automation: What a Manager Should Know
Implementing Test Automation: What a Manager Should KnowImplementing Test Automation: What a Manager Should Know
Implementing Test Automation: What a Manager Should Know
 
Using AWS Lambda for Infrastructure Automation and Beyond
Using AWS Lambda for Infrastructure Automation and BeyondUsing AWS Lambda for Infrastructure Automation and Beyond
Using AWS Lambda for Infrastructure Automation and Beyond
 
Advanced Analytics and Data Science Expertise
Advanced Analytics and Data Science ExpertiseAdvanced Analytics and Data Science Expertise
Advanced Analytics and Data Science Expertise
 
Agile Big Data Analytics Development: An Architecture-Centric Approach
Agile Big Data Analytics Development: An Architecture-Centric ApproachAgile Big Data Analytics Development: An Architecture-Centric Approach
Agile Big Data Analytics Development: An Architecture-Centric Approach
 
Big Data as a Service: A Neo-Metropolis Model Approach for Innovation
Big Data as a Service: A Neo-Metropolis Model Approach for InnovationBig Data as a Service: A Neo-Metropolis Model Approach for Innovation
Big Data as a Service: A Neo-Metropolis Model Approach for Innovation
 
Personalized Medicine in a Contemporary World by Eugene Borukhovich, SVP Heal...
Personalized Medicine in a Contemporary World by Eugene Borukhovich, SVP Heal...Personalized Medicine in a Contemporary World by Eugene Borukhovich, SVP Heal...
Personalized Medicine in a Contemporary World by Eugene Borukhovich, SVP Heal...
 
Health 2.0 WinterTech: Will Artificial Intelligence change healthcare? by Eug...
Health 2.0 WinterTech: Will Artificial Intelligence change healthcare? by Eug...Health 2.0 WinterTech: Will Artificial Intelligence change healthcare? by Eug...
Health 2.0 WinterTech: Will Artificial Intelligence change healthcare? by Eug...
 
Managing Requirements with Word and TFS by Max Markov
Managing Requirements with Word and TFS by Max MarkovManaging Requirements with Word and TFS by Max Markov
Managing Requirements with Word and TFS by Max Markov
 
How to Implement Hybrid Cloud Solutions Successfully
How to Implement Hybrid Cloud Solutions SuccessfullyHow to Implement Hybrid Cloud Solutions Successfully
How to Implement Hybrid Cloud Solutions Successfully
 
Product Management in Outsourcing by Roman Kolodchak and Roman Pavlyuk
Product Management in Outsourcing by Roman Kolodchak and Roman PavlyukProduct Management in Outsourcing by Roman Kolodchak and Roman Pavlyuk
Product Management in Outsourcing by Roman Kolodchak and Roman Pavlyuk
 
From Sandbox to Production by Vadym Fedorov
From Sandbox to Production by Vadym FedorovFrom Sandbox to Production by Vadym Fedorov
From Sandbox to Production by Vadym Fedorov
 
Why Ukraine? by Brian Borack, COO
Why Ukraine? by Brian Borack, COOWhy Ukraine? by Brian Borack, COO
Why Ukraine? by Brian Borack, COO
 

Último

Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
amitlee9823
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
amitlee9823
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
amitlee9823
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
amitlee9823
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
amitlee9823
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
amitlee9823
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 

Último (20)

Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics Program
 
ELKO dropshipping via API with DroFx.pptx
ELKO dropshipping via API with DroFx.pptxELKO dropshipping via API with DroFx.pptx
ELKO dropshipping via API with DroFx.pptx
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 

Log Data Analysis Platform by Valentin Kropov

  • 1. LOG DATA ANALYSIS PLATFORM May, 2015
  • 2. Agenda 1) User-Group Introduction 2) Problematic 3) Log Data Analysis System Overview 4) Task Analysis 5) Solution Architecture 6) Trade-off Analysis 7) Automation 8) Performance Testing 9) Outcome & Plans
  • 4. Demo Lab: Why we’ve started this project? 1) Increase Internal Experience 2) Create Reference Solution w/o NDA Limitations 3) Get Playground for Tests 4) Provide Demo Environment for Customers (using their data) 5) Decrease time to Market (by introducing automation)
  • 5. LOG DATA ANALYSIS PLATFORM : OVERVIEW
  • 6. Log Data Analysis Platform Details Key Facts: • ~270-300 Web Servers • Log Types: HTTPD Access logs, Error logs, Application Server Servlet, OS Service Logs • ~500K events per minute • 150GB of data per day Technologies: • Flume • Hadoop/HDFS, MapReduce • Hive, Impala • Oozie • Elasticsearch, Kibana 3 • Tableau Analytics platform • Puppet + Vagrant
  • 7. Log Data Examples Access log: 127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326 Error log: [Sun Mar 7 20:58:27 2004] [info] [client 64.242.88.10] (104)Connection reset by peer: client stopped connection before send body completed [Sun Mar 7 21:16:17 2004] [error] [client 24.70.56.49] File does not exist: /home/httpd/twiki/view/Main/WebHome Vmstat procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------ r b swpd free buff cache si so bi bo in cs us sy id wa st 0 0 305416 260688 29160 2356920 2 2 4 1 0 0 6 1 92 2 0 iostat Linux 2.6.32-100.28.5.el6.x86_64 (dev-db) 07/09/2011 avg-cpu: %user %nice %system %iowait %steal %idle 5.68 0.00 0.52 2.03 0.00 91.76
  • 10. Architecture Drivers: Quality Attributes (1/3)
  • 11. Architecture Drivers: Quality Attributes (2/3)
  • 12. Architecture Drivers: Quality Attributes (3/3)
  • 16. Solution Architecture Batch Layer Serving Layer Speed Layer Raw Data Storage Data Strea m Real-time Views Static Views Precomputing Precomputing Ad-hoc Batch Views Static Batch Views Corporate BI Tool Legend: Layer boundary Data flow (with direction indicated) Query flow Apache HTTP Servers Raw Data Storage Pre-computing Batch Views Real-Time Views Dashboard/ Search Data Stream Real-Time Processing and Aggregations BI Tool  Avro as a Raw Data Storage file format  Parquet as a Batch Views file format  Star schema as a Batch Views data model
  • 21. Hive Stinger vs Impala Compression Ratio Access Speed
  • 23. Automation (saves time and money) 80% 20% Development and Debugging F&P Testing, Demo Local Development Cloud Development
  • 25. Automation Process Phase Tool Notes VM Provisioning Vagrant — Supports: VirtualBox, VMWare ESX, Amazon AWS VM Bootstraping Puppet — Installs Cloudera Manager, Cloudera Distribution Hadoop, ElasticSearch+Kibana, Flume, Microstrategy, Log Generator. — Creates Cluster using Cloudera Manager API. Configure ETL and BI Puppet — Configures Flume, Oozie, ElasticSearch, Impala, Hive, Microstrategy Dashboards Integration Tests Puppet — Generates Workload and ensures data go through. — Checks Logs for errors. — Calculates timing/throughput.
  • 27. Log Generator 1 Thread can generate: 4200 events / second (File source) 5500 events / second (TCP source)
  • 30. Outcome 1) Demo lab, playground, testing platform (in 1 hour) 2) Sizing Calculator 3) Help to get 3 new customers (one is really, really huge) 4) Strategic Partnership with Cloudera 5) Tons of experience and fun  Plans 1) Add support for other Hadoop Distributions (Hortonworks, MapR) 2) Make Project Open-Source
  • 31. Thank You! 31 SoftServe US Office One Congress Plaza, 111 Congress Avenue, Suite 2700 Austin, TX 78701 Tel: 512.516.8880 Contacts Valentyn Kropov vkrop@softserveinc.com Tel: 866.687.3588 x4341