The socially integrated world, the rise of mobile, the Internet of Things - this explosion of data can be directed and used, rather than simply managed. That's why Big Data and advanced analytics are key components of most digital transformation strategies.
In the last year, Microsoft has made key moves to extend its data platform into this realm. Stalwart platforms like SQL Server and Excel join up with new PaaS offerings to make up a dynamic and powerful Big Data/advanced analytics ecosystem.
In this webinar, our experts covered:
- Why you should include Big Data and advanced analytics in your digital transformation strategy
- Challenges facing digital transformation initiatives
- What options the Microsoft toolset offers for Big Data (Hadoop) and advanced analytics
- How to leverage products and services you already own for your digital transformation
Transforming Business in a Digital Era with Big Data and Microsoft
1. Transforming Business in a Digital Era with Big Data and Microsoft
2. ABOUT PERFICIENT
Perficient is a leading information technology consulting firm serving clients throughout North America.
We help clients implement business-driven technology solutions that integrate business processes, improve worker productivity, increase customer loyalty and create a more agile enterprise to better respond to new business opportunities.
3. PERFICIENT PROFILE
Founded in 1997
Public, NASDAQ: PRFT
2014 revenue $456 million
Major market locations: Allentown, Atlanta, Ann Arbor, Boston, Charlotte, Chicago, Cincinnati, Columbus, Dallas, Denver, Detroit, Fairfax, Houston, Indianapolis, Lafayette, Milwaukee, Minneapolis, New York City, Northern California, Oxford (UK), Southern California, St. Louis, Toronto
Global delivery centers in China and India
>2,600 colleagues
Dedicated solution practices
~90% repeat business rate
Alliance partnerships with major technology vendors
Multiple vendor/industry technology and growth awards
4. OUR SOLUTIONS PORTFOLIO
INDUSTRIES
Healthcare
Financial Services
Life Sciences
Consumer Markets
Automotive & Transportation
High Tech
Telecom
Energy & Utilities
Manufacturing
Media & Entertainment
PORTAL
Portal Frameworks
Search
Security
Web Analytics
Web Content Management
Social & Collaboration
Mobility
Experience Design
INTEGRATION
Integration Frameworks
Cloud Architecture
Reference Architecture
Application Integration
Enterprise Application Integration
Service Oriented Architecture
Process & Content Integration
Business Process Management
Complex Event Processing
Rules Engines
DATA & CONTENT
Business Analytics
Business Intelligence
Predictive Analytics
Reporting
Structured Data Management
Data Integration, Quality & Governance
Enterprise Data Warehouse
Master Data Management
Product & Information Management
Unstructured Data Management
Big Data
Content Intelligence
Content Management
Enterprise Search
CUSTOMER EXPERIENCE
Customer 360
Multi Channel Enablement
Relationship Management
Social Engagement
Commerce
Marketing Strategy Implementation
Order Management
Supply Chain Management
Service & Support
Managed Hosting
Sales & Service Support
Customer Service, Sales Force Automation
Experience Design
Strategic Roadmaps & Envision Workshops
User Research & Metrics Analysis
Creative & Interaction Design
Custom & Responsive UI Development
Digital Marketing
Search Engine Marketing
Online Advertising
Content Strategy
Conversion Optimization
Management Consulting
BUSINESS OPERATIONS
Corporate Performance Management
Budgeting, Forecasting & Planning
Business Analysis & Predictive Analytics
Enterprise Business Solutions
Oracle EBS
Vertex Tax Solutions
Human Resource Solutions
Employee Portals
Human Resource Management
Talent Management
Enterprise Social Platforms
Social Strategy
Lync Unified Communications
Office 365
Management Consulting
7. AGENDA
Introduction
Digital Transformation & Big Data
Big Data Challenges
Big Data & Microsoft
In the Cloud with HDInsight
In the Data Center with APS
11. BIG DATA CHALLENGES:
How to get value from Big Data?
Governance & Security Concerns/Issues
Analytical/Technology Talent
Integrating different sources of data
Integrating Enterprise Data with Big Data
Defining Strategy
Funding
18. BIG DATA & MICROSOFT
Andrew Tegethoff, Microsoft BI Practice Lead
19. MICROSOFT BUSINESS INTELLIGENCE
Consulting on strategic and tactical aspects of BI with the Microsoft Data Platform
20. BIG DATA
• Volume
o Terabytes, petabytes, exabytes
• Velocity
o How much data is created every minute?
• Variety
o Social, Web, Internet of Things, etc.
21. BIG DATA
What types of data are we talking about?
People to People
Online forums
Social networks
Blogs
SMS threads
Email threads
People to Machine
E-commerce
Bank cards
Credit cards
Mobile devices
Digital TV
Machine to Machine
Medical devices
GPS devices
Bar code scanners
Sensors
Surveillance cams
22. ENTER HADOOP
Hadoop is an open source framework for the storage and processing of very large data sets.
The Hadoop ecosystem consists of many additional tools that perform functions like:
• Resource management
• Extract, Transform and Load (“ETL”) and/or Extract, Load and Transform (“ELT”)
• Full text search
• Workflow scheduling
• SQL querying
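To make "processing of very large data sets" more concrete, here is a minimal single-machine sketch of the map, shuffle and reduce steps that Hadoop's MapReduce model distributes across a cluster. The slide does not name MapReduce or show any code, so the function names and the word-count task are illustrative assumptions, not part of the webinar.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence on an in-memory input."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


if __name__ == "__main__":
    # Made-up sample input, for illustration only.
    print(word_count(["big data big value", "data in the cloud"]))
```

On a real cluster the map and reduce functions run in parallel on many nodes against data stored in the distributed file system; only the shape of the computation carries over from this sketch.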
23. WHAT CAN HADOOP DO?
• Allow you to keep pace with more volume, more variety, and greater velocity of data.
• Allow you to store all of your data in its raw form, so you can ask questions later that were not thought of when the data was captured.
• Enable you to ask questions of your data that previously couldn’t be answered, as well as capture data that previously couldn’t be captured.
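The idea of storing raw data first and deciding on the questions later is often called "schema on read". The sketch below illustrates it under stated assumptions: the sample records, field names and helper functions are all invented for this example and do not come from the webinar.

```python
import json

# Made-up raw records, stored exactly as they arrived (one JSON document per line).
RAW_EVENTS = [
    '{"device": "gps-1", "speed_kmh": 62, "ts": "2015-03-01T10:00:00"}',
    '{"device": "scanner-7", "barcode": "0123456789012", "ts": "2015-03-01T10:00:05"}',
    '{"device": "gps-1", "speed_kmh": 88, "ts": "2015-03-01T10:01:00"}',
]


def read_with_schema(raw_lines, fields):
    """Apply a schema at query time: parse each raw line and keep only
    records that contain every requested field."""
    for line in raw_lines:
        record = json.loads(line)
        if all(field in record for field in fields):
            yield {field: record[field] for field in fields}


def max_speed_by_device(raw_lines):
    """A question nobody planned for at capture time: top speed per device."""
    best = {}
    for row in read_with_schema(raw_lines, ["device", "speed_kmh"]):
        best[row["device"]] = max(best.get(row["device"], 0), row["speed_kmh"])
    return best


if __name__ == "__main__":
    print(max_speed_by_device(RAW_EVENTS))
```

Because nothing was discarded or reshaped at load time, a different question (for example, which barcodes were scanned) only needs a different field list, not a new ingestion pipeline.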
24. INTRODUCING HDINSIGHT
• Key part of Microsoft’s Big Data/Hadoop story
• “PaaS” option for cloud Hadoop
• Azure wraps an Apache Hadoop implementation created through a partnership between Hortonworks and Microsoft
• Uses Azure Storage (Tables) for scalable “NoSQL” cloud storage
• Integrates Big Data into existing applications, BI solutions, reporting environments, and Excel
25. HOW DOES IT WORK?
• Establish an Azure Storage account
• Set up an HDInsight cluster
o Account cost relates directly to size of cluster & uptime!
• Upload data
o Using native JavaScript, the Hadoop command line, a Sqoop connection from SQL Server or Azure SQL Database, or a raft of third-party tools
• Connect and analyze
o Use SQL Server and/or Excel via ODBC
o Integrate with applications via Hadoop.NET or Azure SQL Database via Sqoop
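As an illustration of the "Sqoop connection from SQL Server" upload path, the sketch below assembles a `sqoop import` command line. The flags used (`--connect`, `--username`, `-P`, `--table`, `--target-dir`, `--num-mappers`) are standard Sqoop import arguments; the host name, database, table and target directory are hypothetical, and the helper function itself is an assumption made for this example.

```python
import shlex


def build_sqoop_import(jdbc_url, username, table, target_dir, num_mappers=4):
    """Return the argument list for a Sqoop import of one table.

    -P makes Sqoop prompt for the password instead of taking it on the
    command line.
    """
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "-P",
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", str(num_mappers),
    ]


if __name__ == "__main__":
    # Hypothetical connection details, for illustration only.
    args = build_sqoop_import(
        "jdbc:sqlserver://example-host:1433;databaseName=SalesDb",
        "loader",
        "Orders",
        "/data/raw/orders",
    )
    # Print a shell-safe version of the command rather than running it.
    print(" ".join(shlex.quote(arg) for arg in args))
```

Building the command as a list keeps quoting explicit; the resulting list could be passed to `subprocess.run` on a machine where Sqoop and the SQL Server JDBC driver are installed.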
26. CLOUDERA ON AZURE
• CDH – Cloudera Hadoop distribution
• Installed on Azure Virtual Machines running Linux
• Cloudera’s preferred cloud platform
• “IaaS” option for cloud-based Hadoop
28. ADVANCED ANALYTICS WITH AZURE ML
CLOUD-BASED DATA SCIENCE & PREDICTIVE ANALYTICS
• Fully managed Azure offering
• Browser-based development environment
• Deploy predictive models as a web service with the Azure ML API
• Data sources: HDInsight, Azure Storage, local data files, HTTP
• Includes best-in-class algorithms from Xbox & Bing
• Built-in support for the R language, including over 350 packages, or “BYO” R code
• Deploy in minutes
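A model deployed as a web service is typically called by posting JSON to its scoring endpoint with an API key. The sketch below only builds such a request and does not send it. The endpoint URL, the API key, the column names and the payload layout are all assumptions for illustration; the actual request schema is defined by the deployed service and should be taken from its own documentation.

```python
import json
import urllib.request


def build_scoring_request(endpoint_url, api_key, column_names, rows):
    """Build (but do not send) a POST request for a scoring web service.

    The payload shape below is an assumed example, not a confirmed schema.
    """
    payload = {
        "Inputs": {"input1": {"ColumnNames": column_names, "Values": rows}},
        "GlobalParameters": {},
    }
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical endpoint, key and features.
    request = build_scoring_request(
        "https://example.invalid/score",
        "YOUR_API_KEY",
        ["age", "visits"],
        [[42, 7]],
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Sending the request would be a call to `urllib.request.urlopen(request)` against a real endpoint; the response format likewise depends on the service.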
30. ANALYZE BIG DATA WITH POWER BI
Connect to an HDInsight cluster using Power Query
Extract data into Power Pivot, join with other datasets from a variety of sources to create powerful mashups
Easily translate Big Data into compelling visualizations with Power View
31. CLOUD HADOOP: THE VALUE PROP
• Enables Big Data/Hadoop proposition, but on a scalable pay-as-you-go basis
• Enhances analytical capability over “loosely structured” data
• Expands scope and type of analysis possible across a wide variety of use cases
• Integrates easily with existing data systems
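Pay-as-you-go pricing means the bill scales with cluster size and with how long the cluster is running. A rough sketch of that arithmetic follows. The linear formula is a simplification, the hourly rate is a made-up placeholder rather than an actual Azure price, and storage and data transfer charges are ignored.

```python
def estimate_cluster_cost(node_count, hourly_rate_per_node, hours_up):
    """Estimate compute cost as nodes x hourly rate x hours of uptime."""
    if node_count < 0 or hourly_rate_per_node < 0 or hours_up < 0:
        raise ValueError("all inputs must be non-negative")
    return node_count * hourly_rate_per_node * hours_up


if __name__ == "__main__":
    RATE = 0.50  # hypothetical price per node-hour, not a real quote

    # Same 8-node cluster: left running all month vs. 3 hours per working day.
    always_on = estimate_cluster_cost(8, RATE, 24 * 30)
    on_demand = estimate_cluster_cost(8, RATE, 3 * 22)
    print(f"always on: {always_on:.2f}, on demand: {on_demand:.2f}")
```

The comparison shows why deleting or scaling down an idle cluster matters: with this model, cost is driven entirely by node-hours.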
32. ANALYTICS PLATFORM SYSTEM (APS)
• Turnkey, on-premises Big Data analytics appliance
• Relational Data
o Massively Parallel Processing (MPP) with SQL Server Parallel Data Warehouse (PDW)
• Non-Relational Data
o 100% Hadoop installation via on-premises version of HDInsight
• Seamless Querying
o PolyBase – query Big Data using SQL
• Performance
o In-Memory Columnstore
o Scale up to 6 PB
34. SQL SERVER PDW
– Massively Parallel Processing (MPP)
• Fundamentally different from the Symmetric Multi-Processing (SMP) of a typical RDBMS
• “Shared nothing” architecture
• Large number of dedicated processors
• Every CPU has its own storage
– Better query and load performance
• Amplified by inclusion of In-Memory Columnstores
– Fault-tolerant, inexpensive, yet comprehensive VLDW solution
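The "shared nothing" idea can be sketched in a few lines: rows are hash-distributed so that each node owns a disjoint slice of the data, each node aggregates only its own slice, and the partial results are then combined. This is a conceptual illustration only, not how PDW is implemented. The function names and the sample rows are invented, and the "nodes" here are plain Python lists processed one after another rather than in parallel.

```python
import zlib
from collections import defaultdict


def node_for(key, node_count):
    """Pick the owning node with a stable hash of the distribution key."""
    return zlib.crc32(key.encode("utf-8")) % node_count


def distribute(rows, node_count):
    """Send every (key, amount) row to exactly one node's private storage."""
    nodes = [[] for _ in range(node_count)]
    for key, amount in rows:
        nodes[node_for(key, node_count)].append((key, amount))
    return nodes


def local_totals(node_rows):
    """Work done independently on one node: sum amounts per key."""
    totals = defaultdict(int)
    for key, amount in node_rows:
        totals[key] += amount
    return dict(totals)


def mpp_sum(rows, node_count):
    """Combine per-node results; a key lives on one node, so keys never collide."""
    combined = {}
    for node_rows in distribute(rows, node_count):
        combined.update(local_totals(node_rows))
    return combined


if __name__ == "__main__":
    # Made-up sales rows: (region, amount).
    sales = [("east", 10), ("west", 5), ("east", 7), ("north", 3)]
    print(mpp_sum(sales, node_count=3))
```

Because all rows for a given key land on the same node, the per-key aggregation needs no cross-node coordination, which is the property that lets this style of query scale with the number of nodes.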
35. ONCE AGAIN… HDInsight
– Fundamentally the same product, but deployed within the APS appliance
– 100% Apache Hadoop
– Query with SQL via PolyBase
– Fully integrated with Microsoft platform
• User authentication with ADFS
• High availability
• Windows Server Failover Clustering
36. ON-PREMISES HADOOP: THE VALUE PROP
• Relational and non-relational data management in one turnkey solution
• Lowest cost per TB for a data warehouse appliance in the industry
• Hardware choices: Dell, HP, Quanta
• Integrates into Windows infrastructure
• Performance, security, and scalability
37. PLANNING FOR THE FUTURE
• Establishing target problems
• Identifying resources (i.e., Azure, on-premises)
• Defining and acquiring required skillsets
• Bringing it all together
38. POLL
Where do you feel you need help with Big Data technology?
a. Establishing a business case
b. Transitioning from POC to production
c. Establishing a Solution Architecture
d. Hadoop/Open Source toolset
e. Microsoft toolset
f. Other