Abstract: Many firms are adopting a cloud-first strategy and migrating their on-premises technologies to the cloud. Logitech is one of them. We have adopted the AWS platform and big data on the cloud for all of our analytical needs, including Amazon Redshift and S3. In this presentation, I will cover: the business rationale for migrating to the cloud; how data virtualization enables the migration; and running data virtualization itself in the cloud.
Logitech Accelerates Cloud Analytics Using Data Virtualization by Avinash Deshpande
1. LOGITECH ACCELERATES
CLOUD ANALYTICS USING
BIG DATA FABRIC
Avinash Deshpande
Chief Software Architect, Big data and Advanced Analytics
adeshpande2@logitech.com
2. Logitech designs products that have an everyday place in people's lives, connecting them
to the digital experiences they care about. Over 30 years ago, Logitech started
connecting people through computers; now it’s designing products that bring people
together through music, gaming, video and computing.
In 1981, Logitech was founded in the village of Apples, Switzerland. The start-up was
based in a farm building – the Swiss equivalent of a Silicon Valley garage.
At the heart of Logitech’s success lies its ability to design product experiences that tap
into genuine consumer needs. Under a number of different brands, the company offers
PC peripherals; cases and keyboards for tablets; equipment for gamers; mobile speakers
and earphones for music and sports enthusiasts; devices to make video collaboration
simple in the workplace; and entertainment and control products for the home.
THE LOGITECH STORY
3. LOGITECH – PRODUCT PORTFOLIO
• Mice + Keyboards
• Mobile
• Smart Home
• Gaming
• Speakers
• Video
5. LOGITECH DATA USE CASES
Structured Semi-Structured Unstructured
Batch Data Velocity Real-Time
Social Media
Sentiment
Predictive Analytics
Demand Forecasting
Price violations
on Retail sites
Data Warehousing Text Mining
Security Video
Analysis
Retail Data
scraping
Machine Learning
IoT
Multi site ERP
Marketing Funnel
Sales Channel Mgmt
Smart Home
Natural Language
Processing (NLP) VR Gaming
Device Events
6. ANALYTICS AT SCALE - SUMMARY
• Create a decentralised self-service analytics environment for traditional business reporting and
analytics (Descriptive and Diagnostic Analytics). Becomes a purely EXPLICIT experience.
• Allow for a centralised, cross-functional shared advanced analytics service tasked with delivering Predictive
and Prescriptive analytics to the organisation.
• A minimal investment, with leveraged return.
7. • Consumer Reviews of Logitech and Competitor Products
§ Core capabilities to scrape retail websites for consumer reviews
§ Raw and structured data available for advanced analysis
§ Support and performance for petabyte data volumes
§ Sentiment analysis for business decisions
§ BGs data analysis of consumer complaints, issues and negative reviews
• Logitech Product Pricing
§ Core capabilities to scrape retail websites for pricing of Logitech products
§ Amazon buy button analysis
§ Amazon.com and marketplace price analysis
§ Price violation analysis
DATA VOLUME – COMPLEXITY (RETAIL USE CASE)
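The price violation analysis listed above could look roughly like the following. This is a minimal sketch, not Logitech's actual pipeline: it assumes scraped listings arrive as records with a SKU, a retailer and a listed price, and that a violation means a listing priced below an agreed minimum for that SKU. The `Listing` type, `find_price_violations`, and all sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    """One scraped retail listing (hypothetical shape)."""
    sku: str
    retailer: str
    price: float


def find_price_violations(listings, min_prices):
    """Return listings priced below the minimum configured for their SKU.

    SKUs without a configured minimum are ignored rather than flagged.
    """
    violations = []
    for listing in listings:
        floor = min_prices.get(listing.sku)
        if floor is not None and listing.price < floor:
            violations.append(listing)
    return violations


# Made-up sample data for illustration only.
min_prices = {"MOUSE-01": 49.99, "KEYB-02": 79.99}
listings = [
    Listing("MOUSE-01", "retailer-a", 49.99),  # at the minimum: allowed
    Listing("MOUSE-01", "retailer-b", 39.99),  # below the minimum: violation
    Listing("KEYB-02", "retailer-a", 84.50),   # above the minimum: allowed
    Listing("CAM-03", "retailer-c", 10.00),    # no minimum configured: ignored
]

violations = find_price_violations(listings, min_prices)
for v in violations:
    print(f"{v.sku} at {v.retailer}: {v.price:.2f} is below {min_prices[v.sku]:.2f}")
```

In practice the comparison would run over the scraped data set at scale; the per-listing rule stays the same.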
15. LOGITECH CONFIDENTIAL: NOT FOR DISTRIBUTION
REAL-TIME ON DEMAND delivery to your PHONE and DESKTOP and DASHBOARD
• Executive Summaries
• Customer by Product
• Product by Customer
• Demand / Supply updates
• Market Analytics / Market Share
• Marketing Reports
• Competitive Analysis
• Sentiment
• ...
NLP is a scalable self-service environment, meaning
we can open it to business users (self-service) and
allow them to improve and drive business impact
and adoption. It is language agnostic, meaning we
can publish reports in the language in which they are
written.
17. SOLUTION ARCHITECTURE
Amazon Web Services
AWS Glacier AWS S3 AWS Redshift
Pentaho DI
Pentaho Operations Mart
CloudWatch SNS IAM CloudTrail EMR Spark Python / R
AWS RDS
Denodo Data Virtualization
Tableau Pentaho BA Data Interfaces Web Services OBIEE CUBES
Snowflake
Text Analytics
18. JOURNEY TO CLOUD
Cloud empowers IT organizations to redefine the way Data
services are produced and delivered
Scalable
• Infrastructure scaled up and down on the fly (elastic)
• Focus on simplicity, security, robustness, and scalability
Efficient
• Infrastructure costs are pay-as-you-use
Reliable
• AWS managed services
Managed & Governed
• Transparency on usage patterns
• Breadth of services offered, pricing, performance and flexibility
19. IMPACT OF DATA VIRTUALIZATION PLATFORMS
“By 2018, organizations with data virtualization
capabilities will spend 40% less on building and
managing data integration processes for
connecting distributed data assets.”
-Gartner
20. NEED FOR DATA VIRTUALIZATION
Abstract access to disparate data sources
A single semantic repository
Optimized data availability in real-time to consumers
Centralized, governed and secured data layer
21. MANAGING BIG DATA WITH DATA LAKES
• Organizations are exploring data lakes as consolidated repositories of massive volumes of
raw, detailed data of various types and formats to overcome Big Data challenges.
• But creating a physical data lake presents its own hurdles, one of which is the need to store
the data twice which can lead to governance challenges with regard to data access and
quality.
• Data Virtualization technologies can improve an organization’s ability to govern and
extract more value from its data lakes by extending them as logical data lakes.
- Ventana Research
23. • Logical model can be predefined for the data
• Eliminates load processes and the need to update the data
• Uses the security and governance system already in place
• Collects and maintains statistics and determines optimal query execution
• Uses caching and query pushdown for optimal performance
• Array of connection options from structured to unstructured data
• Business layer enabling data consistency through a single object serving multiple
consumers
• Rapid prototyping
• Data Audits
• Sandbox for data science
VIRTUALIZATION BENEFITS
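To make the first two benefits concrete (a predefined logical model, and no load process to keep a copy up to date), here is a minimal sketch using Python's built-in `sqlite3` module. It is an analogy only and does not use Denodo: two attached in-memory databases stand in for separate source systems, and a view plays the role of the logical model. The schema names `erp` and `crm`, the tables, and the data are all assumptions made for illustration.

```python
import sqlite3

# Two attached in-memory databases stand in for independent source systems.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS erp")
conn.execute("ATTACH DATABASE ':memory:' AS crm")

conn.execute("CREATE TABLE erp.orders (order_id INTEGER, customer_id INTEGER, amount REAL)")
conn.execute("CREATE TABLE crm.customers (customer_id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO erp.orders VALUES (?, ?, ?)",
    [(1, 10, 120.0), (2, 11, 80.0), (3, 10, 45.5)],
)
conn.executemany(
    "INSERT INTO crm.customers VALUES (?, ?)",
    [(10, "Acme"), (11, "Globex")],
)

# The "logical model": a view spanning both sources. No rows are copied.
# (SQLite only lets TEMP views reference objects in other attached databases.)
conn.execute("""
    CREATE TEMP VIEW customer_revenue AS
    SELECT c.name AS customer, SUM(o.amount) AS revenue
    FROM erp.orders AS o
    JOIN crm.customers AS c ON c.customer_id = o.customer_id
    GROUP BY c.name
""")

query = "SELECT customer, revenue FROM customer_revenue ORDER BY customer"
rows = conn.execute(query).fetchall()
print(rows)

# A change in a source is visible on the next query, with no reload step.
conn.execute("INSERT INTO erp.orders VALUES (4, 11, 20.0)")
refreshed = conn.execute(query).fetchall()
print(refreshed)
```

Consumers query `customer_revenue` without knowing which source holds which table, which is the abstraction the slide describes. A real virtualization layer adds cross-technology connectors, caching and query optimization on top of this idea.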
24. • Catalog exploration
o Graphical representation of data model
o Data lineage
o Integrated catalog search
• Data Discovery
o User friendly query wizards with drill down capabilities
o Export to CSV, Excel and Tableau Data Extracts
• Secure
o Leverages Denodo’s security model and access control
o Available via SSL/TLS
GOVERNANCE - DENODO INFORMATION SELF SERVICE
25. CLOUD AND DV BENEFITS
• Proactive – IT has embraced cloud as a model for achieving
innovation through increased efficiency, reliability and agility
• Reusability and template development
• Rapid innovation within governance structure, balanced
costs, risks and service levels
• Greater efficiency and reliability, enabling broader audience
to consume IT services via self-service
26. LESSONS LEARNT
Pros
• Reduced spend
• Live migration
• Flexible and cost effective
• Better business continuity
• Speed to deliver
• Easier to manage
• More efficient IT operations
• Low hardware costs
• No or reduced software costs
Cons
• Possible learning curve
• Accountability
• Getting all vendors to gel well
28. DATA VIRTUALIZATION IN IOT ECOSYSTEM
Other RDBMS
(apps, CRM, SAP)
Other Sources
(SaaS, SFDC, etc.)
Ingestion Streaming
analytics
Big Data
Storage
Batch analytics,
Machine learning
Streaming data
Traditional batch
processing
(ETL to EDW)
Semantic
Model
Secure
+
Combine
+
Enrich
Meta Base
30. BENEFITS
• Adds context to device data
– Enriches and augments it with other sources (internal or external)
– No replication: enables Virtual Data Lake
• Simplifies publication of useful results
– End-user oriented semantic model
– Reports and dashboards in SQL
– Data as a service (REST, OData)
• Improves governance
– Security (AD integration, data restrictions, masking, etc.)
– End-to-end lineage
– Abstraction of source technologies
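The data restrictions and masking mentioned under governance can be sketched as a policy applied at the semantic layer before results reach a consumer. The example below is a simplified illustration, not Denodo's security model: the role names, the `POLICIES` structure, `apply_policy`, `mask_email`, and the sample records are all hypothetical.

```python
def mask_email(email):
    """Hide all but the first character of the local part of an address."""
    local, _, domain = email.partition("@")
    return f"{local[:1]}***@{domain}"


def apply_policy(rows, role, policies):
    """Return only the rows the role may see, with sensitive columns masked.

    The input rows are never modified; each visible row is copied first.
    """
    rules = policies[role]
    result = []
    for row in rows:
        if not rules["row_filter"](row):
            continue  # row-level restriction
        visible = dict(row)
        for column in rules["masked_columns"]:
            visible[column] = mask_email(visible[column])  # column masking
        result.append(visible)
    return result


# Hypothetical policies: one restricted analyst role and one unrestricted role.
POLICIES = {
    "emea_analyst": {
        "row_filter": lambda row: row["region"] == "EMEA",
        "masked_columns": ["email"],
    },
    "admin": {
        "row_filter": lambda row: True,
        "masked_columns": [],
    },
}

# Made-up device-owner records for illustration only.
DEVICE_OWNERS = [
    {"device_id": "d-1", "region": "EMEA", "email": "anna@example.com"},
    {"device_id": "d-2", "region": "AMR", "email": "bob@example.com"},
]

analyst_view = apply_policy(DEVICE_OWNERS, "emea_analyst", POLICIES)
admin_view = apply_policy(DEVICE_OWNERS, "admin", POLICIES)
print(analyst_view)
print(admin_view)
```

Keeping the rules in one place, rather than in each report or dashboard, is what lets every consumer of the semantic model see consistently restricted data.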