Advanced Analytics: Going From Big Data to Big Answers
1. Intel Confidential — Do Not Forward
Intel Information Technology
Going From Big Data to Big Answers
March 19, 2014
Ajay Chandramouly
@ajayc47
2. Intel Information Technology
Intel Confidential – for internal use
Agenda
• Impact and Value of Big Data
• Intel IT Use Cases – Bringing Value of Big Data to the Enterprise
• Call to Action – Bringing Value of Big Data to Your Enterprise
3. Big Data
4. Big Data, Analytics, Value
Big Data = the “Asset”; Analytics = the “Action”
5. The Four Pillars of Big Data
Volume: Massive scale and growth of unstructured data
• 80%~90% of total data
• Growing 10x~50x faster than structured (relational) data
• 10x~100x of traditional data warehousing
Velocity: Real-time rather than batch-style analysis
• Data streamed in, tortured, and discarded
• Making impact on the spot rather than after the fact
Variety: Heterogeneity and variable nature of Big Data
• Many different forms (text, document, image, video, ...)
• No schema or weak schema
• Inconsistent syntax and semantics
Variability: Predictive analytics for future trends and patterns
• Deep, complex analysis (machine learning, statistical modeling, graph algorithms, …), versus
• Traditional business intelligence (querying, reporting, …)
Big Data augments traditional Business Intelligence
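As a rough illustration of the Velocity pillar (data streamed in, analyzed on the spot, and discarded), the sketch below keeps a running mean over a stream without storing the individual readings. The function name `running_mean` and the sample readings are hypothetical, chosen only for illustration; they are not part of Intel's platform.

```python
from typing import Iterable, Iterator, Tuple


def running_mean(stream: Iterable[float]) -> Iterator[Tuple[int, float]]:
    """Yield (count, mean) after each value; no past values are retained."""
    count = 0
    mean = 0.0
    for value in stream:
        count += 1
        # Incremental update: only the count and current mean are kept,
        # so each reading can be discarded as soon as it is processed.
        mean += (value - mean) / count
        yield count, mean


# Hypothetical sensor readings arriving one at a time.
readings = iter([10.0, 20.0, 30.0, 40.0])
results = list(running_mean(readings))
print(results[-1])
```

A batch-style analysis would instead collect all readings first and compute the mean afterwards; the streaming version has an answer available after every reading.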
6. Big Data Requires Different Approaches
[Diagram labels: Machine Generated, Human Generated, Business Generated; Edge, Scale Up, Distributed, Scale Out; Network, Storage, Compute; Intel® Optimized Big Data; In-Memory, XDW, MPP; LOB, IoT]
One Size Doesn’t Fit All
7. Going from Data to Insight and Action
8. Intel IT – What We’re Doing in Big Data
9. 6,500 IT Employees
59 IT sites globally
150,000 Connected Systems
40,000 Handheld Devices
100,000 Intel Employees
164 Intel Sites across 63 Countries
68 Data Centers
25% reduction with virtualization
inspire employees
IT is business
changing traditional thinking
service reliability
10. IT Leadership
Transform: “License to Decide” (Strategic Relationship)
Contribute Value: “Right to Influence” (Collaborative Relationship)
Deliver Services: “Reason to Exist” (Transactional Relationship)
11. Intel IT Vision for Big Data Analytics
Priority
We run big data analytics programs in each of our key lines of business. Also, all our key strategic initiatives have a big data component.
Strategy
Implement an internal, cost-effective big data platform and, in parallel, build the necessary skill set within the organization.
Approach
Gradually build business value through advanced analytics of big data.
Business Value
The value of our big data efforts was about USD 100M in 2012. We expect that figure to grow 10x by 2014.
IT formed an enterprise Big Data Analytics organization that solves high-value problems
12. Big Data Path to Competitive Advantage
Big Data Use Cases
• SMG: Web usage data for Marketing/Campaign predictions (What)
• SSG IT: IT Incident Predictability; Context Aware Analytics for LBS
• Security: Network Intrusion Prediction and Prevention
Tailor-made and Unique Big Data environment based on Intel needs
2011 - 12
• Defined strategy and implementation plan
• Hadoop Path-finding
• Deployed chosen MPP platform
• Acquired big data skills
• Deployed 3 big data projects (3 done)
• Completed big data distribution evaluation
• Landed internal Hadoop cluster in Prod
• Implemented Internal Hadoop Production cluster
2013
• Implement Internal Hadoop Pre-Production cluster
• Deliver a solid platform for the first set of use cases.
• Deploy internal 5-6-10 projects on top of the BI big data platform
• Deploy the qualified Big Data business use cases
• Deliver business value through use of this platform.
• Expand Big Data Platforms to support use case demand.
• Set up BDP as a service with integration of IT processes.
• Prescriptive guidance for development and architecture.
• Standardize processes & tools
2014
• Expand IBD platform for the next set of use cases. Deliver business value through the use of it.
  • Deploy internal 5-6-100 projects on top of the BI big data platform
• Evolve the IBD platform towards the next generation Hadoop ecosystem
  • Adopt IDH3 with Hadoop 2.0/YARN
  • HBase for storage-intensive use cases
  • Explore SQL on Hadoop use cases
  • Expand Big Data Platforms to support Enterprise BI use cases.
• Continuous improvement and expansion of platform, capabilities, guidance, process and tools.
TMG - POC: Assess feasibility of Hadoop for MIDAS as lower cost solution
HR - POC: Talent Intelligence
13. Intel’s Compound Big Data Platform Components
MPP Platform
3rd-party solution
100x faster than traditional systems
Intel® Xeon® processor E7 family blades scale easily
Intel Distribution Of Hadoop
Based on Apache Hadoop
Optimized for Intel® Xeon® processors, SSD and 10GbE
HBase NoSQL DB
Spark (In-Memory Analytics)
MPP – Massively Parallel Processing
Predictive Analytics Engine
In-house development
Enables real-time, ongoing predictive service
Intel® Xeon® processor E7 family
Intel Data Platform: Analytics Toolkit
14. Hadoop Use Cases
Contextual Recommendation Engine: Provides recommendation engine and analytic capabilities to acquisition.
Value:
• Provides new, intelligent capabilities and map management technologies which can be offered as paid services, estimated at $1-4M.
Incident Predictability: Reduces incidents and their impact on users and IT.
Value:
• Provides 10-30% reduction in number of new incidents created, at an estimated cost avoidance of $4M over 2 years.
Web Data Mining & Customer Insight: Provides customer and network usage analytics for Intel.com and customer advertising.
Value:
• Provides means to predict and adjust product position or pricing based on response to marketing campaigns.
15. Intel IT Multi Data Warehouse Strategy
Big Data is a Part of a Comprehensive BI Strategy
16. Call to Action
17. People & Skills
CxO, Program Manager, Project Manager, Solutions Architect, Data Architect, Data Engineer, SE/Developer, DBA
18. The practice of data science
• Content assessment, normalization
• Content de-duplication
• Content tagging, taxonomy, folksonomy
• Content harvesting
• Content classification and analytics
• Copyright and attribution management
• Demographic/Customer segmentation
• Web log mining and path analysis
• Scraper, phishing, and fraud detection
• Social media monitoring
• Sentiment/Spam detection
• Recommendation engines
• Digital forensics
• Rich media indexing
• Faceted search and federated search
• Entity recognition and linking
• Personalization
• Ad Optimization
• Retention optimization
• Heterogeneous information architecture
• Reporting & Visual Presentation
• Key-movers – data hub and spoke
• Self organizing networks
• Transitive Relation Mining
• Triangle/Quad closing; triangulation
• Six degrees of separation
• Time series modeling
• Trending analysis
• Predictive modeling
• Surfacing better outcomes, better value from data
• Investigative information seeking, synthesis, visualization, and discovery
19. Summary
• Build a cost-effective, versatile Big Data platform. One size does not fit all.
• Technology is important, but skill sets are essential.
• Ecosystem is more mature than ever. Easier than ever to get started.
Big data analytics has led to big value across every sector
20. IT @ Intel: Sharing Intel IT Best Practices with the World
Learn more about Intel IT’s initiatives at www.intel.com/IT
Or @ajayc47
CIO and IT Perspective
IT White Papers, Audio-Video Blogs
IT-to-IT Community
22. 1 Slide About Hadoop
Hadoop is…
23. Last Slide About Hadoop