Cloudera and Teradata discuss a best-in-class solution that enables companies to put data and analytics at the center of their strategy, achieve greater agility, and reduce the costs and complexity of their current environment.
Pervasive analytics through data & analytic centricity
1. Achieving Pervasive Analytics
through Data and Analytics
Centricity
Chris Twogood | VP of Product & Services Marketing | Teradata Corporation
Clarke Patterson | Senior Director of Product Marketing | Cloudera
2. Chris Twogood is Vice President of Product and Services Marketing for Teradata Corporation. He is responsible for Teradata, Aster, and Hadoop, as well as Teradata services.
Clarke Patterson is the Senior Director of Product Marketing at Cloudera. In this role he is responsible for product, solutions, and partner marketing activities supporting Cloudera’s platform for big data.
6. Now more than ever, data changes how we work
Instrumentation: Everything that can be measured will be measured.
Consumerization: Employees and customers expect more personal interactions, but not at the cost of their privacy.
Experimentation: The most innovative companies embrace experimentation and agility.
10. Data Drives Industries
Telecommunications: Optimize network performance
Financial Services: Money laundering detection
Public Sector: Cyber security detection
Retail: Product recommendations
Healthcare: Personalized medicine
11. Data Drives Business
Marketing: Increase conversions by 2%
Sales: Convert 5% more leads
Operations: Reduce fraud by 3%
Customer Satisfaction: Reduce churn by 1%
Product: Increase user adoption by 10%
12. Marketing Drives Revenue Growth: Increase Revenue by 20%
                     Traditional   Data Driven
Landing Page Visits  1000          1000
Conversion Rate      10%           12%
Revenue              $100          $120 (+20%)
13. Business value is everywhere
Customer 360: Increase revenue & customer satisfaction
Data-Driven Products: Improve operational efficiency & product quality
Security, Risk, & Compliance: Manage security, risk & compliance
14. …It is difficult to harness all the data available to create business value
15. …it takes too long to give users the answers they need…
25. [Diagram: enterprise subject areas – Inventory, Returns, Manufacturing, Supply Chain, Customer Service, Orders, Revenue, Expenses, Case History, Customers, Products, Pipeline, Campaign History – spanning Finance, Sales, Marketing, Operations, and Customer Experience; 2,855 cross-functional questions]
Given the rise in warranty costs, isolate the problem first to a plant, then to a battery lot. Communicate with affected customers who have not yet made a warranty claim on batteries, through Marketing and Customer Service channels, to recall cars with those batteries.
27. [Chart: monthly costs, $0–60 m, January–June, broken out by Inventory, Warranty, Materials, and Labor]
28. [Chart: the same monthly cost view with a drill-down on Warranty Costs, $0–4.5 m, January–June]
31. PRODUCT SENSOR: What are the temperature readings for batteries by manufacturer?
SOCIAL MEDIA: What is the sentiment towards our line of hybrid vehicles?
CUSTOMER CARE AUDIO RECORDINGS: Which customers likely expressed anger with customer care?
DIGITAL ADVERTISING: Which ad creative generated the most clicks?
CLICKSTREAM: How many visitors did we have to our hybrid cars microsite yesterday?
36. …but there are no free lunches in Information Management – merely more and different options
Explicit or implicit, there is always, always, always a schema; “pay me now, or pay me later (and over and over).”
42. • Eliminate risk of personal injury and
bad publicity
• Quantify total potential cost / liability
• Minimize costs of recall and repair
– Repair only cars with a high risk of developing problems (> 30°)
• Reimbursement from battery
manufacturer
• Feedback to engineering
• Firmware fix to help alleviate issues
(battery fans turn on earlier)
Business Strategy
Repair these cars | Monitor these cars | Test these cars on next visit
This can only be done by combining detailed sensor data with warehouse business data
Temperatures for Battery Lot 4102
43. Non-Coupled | Loosely Coupled | Tightly Coupled
Data is leveraged dynamically
Data migrates based on usage
45. SQL ANALYTICS: What is the forecast of demand for our hybrid cars in the Western region?
PATH / TIME SERIES ANALYTICS: What is the typical path to purchase for an extended warranty?
TEXT ANALYTICS: What can text-based service forms tell us about potentially larger safety issues?
RICH MEDIA ANALYTICS: How many customers that called Customer Service expressed a frustrated tone of voice?
GRAPH ANALYTICS: Which customers are highly influential on social media and regularly post about our hybrid vehicles?
49. UNIFIED DATA ARCHITECTURE: System Economics View
[Diagram: SOURCES (ERP, SCM, CRM, Images, Audio and Video, Machine Logs, Text, Web and Social) feed an INTEGRATED DATA WAREHOUSE, an INTEGRATED DISCOVERY PLATFORM, and HADOOP, under shared Security, Workload Management and REAL TIME PROCESSING; ANALYTIC TOOLS & APPS span Math and Stats, Data Mining, Business Intelligence, Applications, Languages, and Search; USERS include Marketing, Executives, Operational Systems, Frontline and Knowledge Workers, Customers, Partners, Engineers, Data Scientists, and Business Analysts; system economics range across $/Insight, $/Value, and $/TB]
50. UNIFIED DATA ARCHITECTURE
[Diagram: an ingest layer feeds the INTEGRATED DATA WAREHOUSE (TERADATA DATABASE), the INTEGRATED DISCOVERY PLATFORM (TERADATA ASTER DATABASE), and the DATA PLATFORM (CLOUDERA DISTRIBUTION FOR HADOOP), under shared Security, Workload Management and REAL TIME PROCESSING, with Applications in the consumption layer]
Key Takeaway: What does analytics look like today? Whether you look at the business or consumer space, analytics is delivering value today. But this value is still limited and not always what we want.
What happens when that report is not quite right? Or what happens to your product recommendations when your daughter makes a purchase?
Key Takeaway: Analytics are becoming ingrained into everything we do. They are informing the companies and products that we use everyday. Sometimes customers are the users of the analytics, other times the company uses them to offer a better service to that customer.
Tell a story piecing all of the use cases on the slides together (Don’t name customers)…
A box with hardware in it (Netapp: predictive support)
Electricity from light (Opower: Energy usage analytics).
The right ad served to you on the computer (OpenX ad exchange)
Detecting malware for your business (CounterTack security platform)
A new technology is emerging that combines data science expertise with deep understanding of business problems. These solutions use algorithmic data mining on your own data and often on external third party data accessible by cloud ecosystems and APIs. Data Driven Solutions make predictions about business functions, prescribe what to do next, and in many cases take action autonomously. Trained analysts are not required to query databases. Instead, business users get answers directly from the software. These answers typically feed seamlessly into the flow of business activity, often invisibly.
http://www.forbes.com/sites/ciocentral/2014/04/18/8-ways-to-build-and-use-the-new-breed-of-data-driven-applications/
===================
Unlocking analytic agility and analytic reach will change our lives. Organizations have already figured this out.
Setting their sights on pervasive analytics, agile organizations have already had significant impacts on our lives.
Imagine a world when you walk into your office, and a piece of hardware is waiting for you, because the company’s predictive analytics says that your piece of hardware has 92% chance of failure.
Imagine your office manager telling you to turn off your lights and conserve energy, because of smart meters, he can see that your office is consuming more energy than everyone else on the block.
Imagine you get back from a business trip and need to file your expense report. Instead of entering each individual receipt – which we all know is a never-ending list and a never-ending process – the system automatically processes receipts for you, based on the text on your receipts. That would be a pretty great world.
We see a few trends driving the increased importance of having a strategy for data.
The Internet has changed everything, and we are more connected than ever before. We all expect to be on the web these days; we rely on it for work, shopping, entertainment, and social interaction.
With the simultaneous proliferation of mobile devices and sensors, we now have the ability to measure almost everything. As a result, we’re generating data, and moving it, at a rate that’s entirely new.
In this new online world, customers and employees expect more personalization, but not at the cost of privacy. Security matters.
Ultimately, data enables us to better understand our customers, patients, employees, or students. Innovative organizations embrace experimentation and agile methods.
Representative Customer Stories
Vivint: Everything that can be measured, will be measured.
Challenge: Vivint needed a central repository to gather and analyze data generated from each of the 20-30 sensors -- e.g. thermostats, smart appliances, video cameras, window and door sensors, and smoke and carbon monoxide sensors -- in every one of its 800,000 customers' homes.
Solution: Vivint has deployed an enterprise data hub on Cloudera that allows it to look across many data streams simultaneously for behaviors, geo-location, and actionable events.
Benefit: With its enterprise data hub that combines sensor data across multiple data streams, Vivint can glean new insights that help the company understand and enrich customers' lives. For example, knowing when a home is occupied or vacant is important to security – but when tied into the heating, ventilation and cooling (HVAC) system, you can add a layer of energy cost savings by cooling or heating a home based on occupancy.
Western Union: Employees and customers expect more personal interactions.
Challenge: With customers spanning every corner of the globe and all walks of life, Western Union saw an opportunity to personalize the experience for each customer by combining the volumes of information about their transactions -- of which Western Union processed 29 per second in 2013 -- with user behavior data, clickstream data, and mobile usage patterns.
Solution: Western Union implemented an enterprise data hub on Cloudera to centralize its data -- both structured and unstructured -- in order to provide a 360-degree customer view, while also supporting use cases for risk management and AML compliance.
Benefit: Deeper customer understanding is driving product improvements and enhancements that improve Western Union customers' experience. For example, Western Union learned through its EDH that many customers in key sectors process the same transactions repeatedly, prompting the company to add a "Send Again" button to its mobile app to streamline the processing of repeat transactions. By deploying that capability, the company immediately saw an uptick in conversions in those key sectors.
Marketing Associates: The most innovative companies embrace experimentation and agility.
Challenge: Marketing Associates' Magnify Analytic Solutions division has built expertise executing B2C online marketing contests and product giveaways for large clients such as Chrysler, DuPont, Ford, and Jaguar, which requires intensive data processing, elastic flexibility and scalability, and agility and performance. Ford recently offered Magnify the opportunity to manage its entire CRM system -- which Magnify jumped at, but knew it would need a new big data infrastructure to support.
Solution: Without any prior in-house experience with Hadoop, Magnify built an enterprise data hub on Cloudera, leaning heavily on Cloudera Manager, Search, Impala, and integrations with SAS and Tableau to streamline the new platform's adoption.
Benefit: The EDH has been a tremendous success, enabling Magnify to deliver a self-service, 360-degree view of consumers to its clients (vs. sending them Excel spreadsheets every 1–2 days, as was the case before). Better yet, the all-inclusive price of Cloudera Enterprise, Data Hub Edition and all the resources needed to build its development, production, and QA environments came in well below the ongoing costs of the traditional environment.
Key takeaway: We can measure and act on everything now. 16B connected devices. Only those that can harness this data can take advantage of it. “If you can’t measure it, you can’t fix it.” –DJ Patil
Source: http://www.forbes.com/sites/gilpress/2014/08/22/internet-of-things-by-the-numbers-market-estimates-and-forecasts/
Key Takeaway: We can analyze anything now. Numerical, text, audio, video. We are now able to discover insights in complex data. Leveraging text analytics, rich media analytics, graph analytics, time series, etc. All of these analytics allow us to get a complete understanding of any data problem we are trying to solve. And they are no longer limited by data. This allows us to enter new use cases and expand our understanding of the problems at hand.
Analytics continues to drive more value: early analytics returned 13X per $1 spent.
Key Takeaway:
Source: 18th Annual Global CEO survey
Key Takeaways: Industries are already beginning to transform. How can data help transform the way employees and customers interact with these industries?
========
Data is not only impacting departments, but also transforming industries.
Telecommunications is figuring out how to optimize network performance based on call logs
Financial services institutions are doing their part to reduce criminal activity by leveraging data to better identify and prevent money laundering.
Government is leveraging data to protect our sensitive information from cyber threats.
The way these industries are operating is being re-written to better incorporate the use of data.
Key Takeaways: Employees are already asking the right questions, we just need to help them achieve their goals through the use of data.
==============
Data is driving this transformation. But in order to do so, we must align it with the needs of business. By understanding business objectives and priorities, organizations can put data in context:
Marketing, for example: how can data increase landing page conversions by 2% – by optimizing the landing page, or perhaps by drawing the right viewers to the page?
Sales: how can we convert 5% more leads by prioritizing inbound leads based on their behaviors?
Or how can we help reduce fraud by 3% by being able to detect persistent threats?
If we can align data to our business objectives, we can ensure that when the data reaches the actual user, they have the information they need to directly impact the numbers that matter to the business’s bottom line.
Let’s drill into marketing to see how we can achieve 2% landing page conversion increase, and how this will impact revenue.
Traditionally:
A 10% conversion rate on 1,000 landing page visits yields 100 conversions; if each conversion results in the sale of one $1 widget, that is 100 widgets at $1 each, or $100 in revenue.
Data driven:
How can we impact this equation with data?
I have information on my customer CRM instance for example. I can use this information to better profile my ideal buyer, so instead of running blanket ads at random sites, I can now target the most likely individuals to buy.
By bringing the right people to the landing page, I see a small uptick in my conversion rate. Next I want to pull information from my marketing automation system in order to optimize the landing page. Maybe I run an A/B test to measure the effectiveness of changing the design of the page, or maybe the text on the page, in order to optimize for actual conversions.
Between these two data-driven analyses, we were able to help marketing achieve its goal of increasing landing page conversion by 2%, which resulted in a 20% increase in revenue.
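The worked example above can be sketched in a few lines of Python. The numbers are the slide's illustrative figures; revenue is modeled as $1 per conversion so the totals match the slide's $100 vs. $120.

```python
# Illustrative sketch of the slide's conversion arithmetic.
# Assumption: each conversion is worth $1 of widget revenue,
# which reproduces the slide's $100 vs. $120 totals.
visits = 1000

traditional_revenue = visits * 0.10 * 1.00   # 10% conversion rate
data_driven_revenue = visits * 0.12 * 1.00   # 12% after targeting + A/B testing

lift = (data_driven_revenue - traditional_revenue) / traditional_revenue
print(f"${traditional_revenue:.0f} -> ${data_driven_revenue:.0f} ({lift:.0%} lift)")
```

A two-point improvement in conversion rate translates directly into the 20% revenue lift shown on the slide.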
With maturity of the platform and technology ecosystem, and with enterprises better understanding not only the promise of the technology but also how to implement it, we are seeing a fundamental shift in the market…..
Hadoop and big data are no longer only about technologies, nor are they simply about cost reduction. In fact, there has been a shift towards aligning data to business objectives in order to derive even greater value out of big data.
The three areas of opportunities within businesses generally are:
Customer 360 - How do I understand my customers and my channel better to improve my topline?
Data-driven products - How do I create better and more products to satisfy the needs of my customers?
Risk - How do I make sure that the company complies with rules and regulations, protects customer and enterprise information, and minimizes risk factors?
More data means more details
Despite being data-rich, however, many organizations are still insight-poor.
More data means more opportunity to gain more insight –and better insight – which drives better decisions that give you a competitive edge.
But you can’t afford to get lost in the details. Your company runs on thousands and thousands of details, little decisions and connections.
And more and more data is coming into your business each and every day. You need to understand which details to focus on to reveal the details or patterns you didn’t even know to look for.
The data-driven business puts data and analytics at the center
A data driven business achieves sustainable competitive advantage by leveraging insights from data to deliver greater value to their customers. For decades, Teradata has helped companies become data driven and transform the industries in which they operate. Our experiences have given us deep insight as to the business capabilities required to become data driven: strategic, operational, and cultural capabilities.
Strategic capabilities mean that organizations view data as a valuable asset – as important as any capital asset. They develop their corporate strategy around data. They practice fact-based decision making over intuition and gut instincts. They accelerate innovation with data.
Operational capabilities mean that organizations are grounded by data, using it to run the business and measure success. They empower both back and front office business functions with access to data for fact based decision making. They integrate data as appropriate to provide a foundation for cross-functional analysis and connecting-the-dots across the business. They are able to take action at the point of insight.
Data driven businesses foster a Cultural environment that rewards and encourages the use of data. They value creativity and experimentation and take data-informed risks. They leverage data to understand and improve the customer experience. They possess a willingness to adapt and refine both strategy and execution based on data. They balance governance with agility. They challenge everything based on data. All of this is predicated on senior executives possessing the muscle to transform the organization so that the data and models actually yield better decisions.
But most companies do not need to be convinced of the need to become a data driven business. Much research exists in the public domain on the financial benefits of injecting analytics into the core operations of the business. One recent example from MIT cited that data driven businesses realize between 5 and 6% higher margins.
We’re already doing this with self-driving cars, where we employ machines to react and respond to road conditions better than humans can.
Electronic Vehicle/ Battery Warranty Demo
Auto manufacturers can now combine sensor data with their business data to further reduce warranty costs
In the past, an auto manufacturer might see a rise in warranty costs
Then, from the business data that is tightly coupled in the data warehouse, isolate the problem to batteries assembled in a specific plant
And then drill down to isolate it to a specific lot number in that plant
Then, they’d make a business decision to recall all cars with batteries from that lot.
This is impossible to do with an application-centric silo approach
Application Centric
APPLICATION CENTRIC: A collection of point solutions for reporting and analytics designed for a specific purpose or a major data subject area that provides value on the margin to specific end users, but limits an organization’s ability to flexibly ask questions across the larger enterprise, and incurs excessive costs through redundancies.
An application-centric approach typically embodies a narrow view of its purpose. It stores the data it needs for a specialized purpose in its own dedicated system without regard to how other processes or functions may need that data. Application Centric approaches are limited to application views, whereby data models are based on rigid processes defined by the application, and the context for analytics originates from the applications. Decisions are colored by the nature of the application and the limited scope of the specific data sources that enable specific processes.
Application Centric refers to a collection of point solutions whereby the data is managed to satisfy a defined set of use cases or departmental needs, by providing answers to questions defined in advance.
Application Centric is a de-facto architecture defined by applications – be it operational applications with bolt-on analytics reflective of a defined process (i.e. inventory management, order management, sales force automation, and others) or analytical applications built to support currently defined departmental functions (i.e. Marketing database, Financial Performance Management, etc.).
Thanks to increasingly affordable processing power, bandwidth, and a greater awareness of the value of analytics, many companies have spent years building application centric solutions in all areas of their organization. However, it’s not hard to see what quickly happens when these dedicated, application-centric environments proliferate: redundant data and an inability to connect the dots across the enterprise.
For example, Operations may require Inventory, Returns, Orders, Manufacturing, Products, and Supply Chain data to answer key, known business questions that help run their day-to-day department. For argument’s sake, let’s say that Operations can answer 54 key Operations questions within its Operations data mart. Similarly, Finance may require Orders, Expenses, and Customer data to answer 32 key Finance questions.
Companies need to move from Application Centric to Data and Analytics Centric.
Data Centric
DATA CENTRIC: An integrated approach to enterprise information management that enables decisions based on a broad set of information sources, resulting in increased flexibility at the lowest possible cost.
Our point of view is that defining a data strategy begins with understanding how you use data to satisfy a variety of use cases within your enterprise, in a thoughtful, repeatable, agile way. We call this data-driven view “The Data Centric Approach.”
Data Centric is a data driven view. The data centric approach enables the enterprise to make decisions on the data available across the enterprise, by reflecting data across the broadest set of sources. Data-centric refers to an enterprise data architecture that is designed to reflect a logical view of data, independent of today’s defined processes, yet inclusive of business rules that are shared across the enterprise. Rule reuse dramatically reduces development and maintenance costs and improves reliability.
It is a nimble environment whereby the data enables the answering of any question, including those that users did not foresee asking. It provides agility because it is independent of process and source system idiosyncrasies.
By integrating data, organizations can ask exponentially more questions compared to taking a silo’d approach: 2,855 questions can now be asked, versus the 205 you get by summing up the various data marts. That is the power of the 360° view. Further, the data centric approach provides the lowest TCO because redundancies in the form of system and people costs are significantly reduced.
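The intuition behind "exponentially more questions" can be illustrated with a toy model: in a silo'd approach, each mart can only combine its own subject areas, while an integrated warehouse can combine any two subjects anywhere in the enterprise. The subject lists and the one-question-per-pair model are hypothetical; the 205 and 2,855 figures come from Teradata's own counting, not from this sketch.

```python
from itertools import combinations

# Hypothetical data marts and their subject areas (not the slide's full list).
marts = {
    "Operations": ["Inventory", "Returns", "Orders", "Manufacturing", "Supply Chain"],
    "Finance":    ["Orders", "Expenses", "Revenue"],
    "Marketing":  ["Customers", "Campaign History"],
}

def pairwise_questions(subjects):
    # Toy model: one "question" per pair of subject areas that can be joined.
    return len(list(combinations(sorted(set(subjects)), 2)))

# Silo'd: sum what each mart can answer on its own.
silo = sum(pairwise_questions(s) for s in marts.values())

# Data centric: any two subjects across the enterprise can be combined.
integrated = pairwise_questions(set().union(*marts.values()))

print(silo, integrated)
```

Even with only three marts and nine subject areas, the integrated count is well over double the silo'd sum, and the gap widens combinatorially as subjects are added.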
Electronic Vehicle/ Battery Warranty Demo
Auto manufacturers can now combine sensor data with their business data to further reduce warranty costs
In the past, an auto manufacturer might see a rise in warranty costs
Then, from the business data that is tightly coupled in the data warehouse, isolate the problem to batteries assembled in a specific plant
And then drill down to isolate it to a specific lot number in that plant
Then, they’d make a business decision to recall all cars with batteries from that lot.
By the way, this isn’t a bad decision; they isolated the problem quickly from tightly coupled business data in the DW.
However, by leveraging the battery sensor data together with the business data, a better decision can be made.
Loosely coupling the battery sensor data to the business data showed that two thirds of the cars with the bad battery lot are fine, while the other third are running hot.
The business decision with the added sensor data is to recall only that third of the cars and continue to monitor the others.
This results in significantly less cost, fewer customers being impacted, and less damage to the brand image.
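The recall decision described above boils down to a join between warehouse data (which cars carry batteries from the suspect lot) and sensor data (which of those cars actually run hot). A minimal sketch, with made-up VINs and lot numbers, and the 30° threshold from the demo:

```python
# Tightly coupled warehouse data: which battery lot went into which car.
cars = [
    ("VIN001", "4102"),
    ("VIN002", "4102"),
    ("VIN003", "4102"),
    ("VIN004", "3990"),  # different lot, not affected
]

# Loosely coupled sensor data: peak battery temperature observed per car.
peak_temp = {"VIN001": 34.5, "VIN002": 24.1, "VIN003": 25.0, "VIN004": 22.8}

suspect_lot = "4102"
affected = [vin for vin, lot in cars if lot == suspect_lot]

# Only cars actually running hot (> 30 degrees) are recalled; the rest are monitored.
recall  = [vin for vin in affected if peak_temp[vin] > 30.0]
monitor = [vin for vin in affected if peak_temp[vin] <= 30.0]

print("recall:", recall)
print("monitor:", monitor)
```

Instead of recalling all three affected cars, only the one running hot comes in for repair; the others stay on the road under observation.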
Tightly Coupled
Tightly coupled data is a methodology used to define upfront the structure and rules for how data is to be rationalized, organized, and optimized for performance. Data is transformed into packaged finished goods for consumption.
The benefits of this approach are ease of use, faster response times, data quality and integrity assurances, and consistent results.
Tightly Coupled is best used for heterogeneous data that is frequently accessed and extensively reused, with strong needs for data quality and integrity.
The Big Data phenomenon brought a whole new set of opportunities and challenges. It started out being about the three Vs; today it is about how customers are using the data and analytics.
In this early era of big data, we have once again seen organizations revert to an application centric approach. Again, there is value on the margin as organizations begin to get their arms around emerging data sources such as clickstream, sensor, social media text, and even rich media such as audio, video and images.
What our experiences have taught us still hold true:
These new data sources are exponentially more valuable when integrated with other data sets
Costs can be reduced dramatically by taking an approach based on sharing and reuse
Pre “big data,” there was a single approach to data integration whereby data is made to look the same, or normalized, in some sort of persistence such as a database, and only then can value be created. The idea is that by absorbing the costs of data integration up front, the costs of extracting insights decrease. We call this approach “Tightly Coupled.” It is still an extremely valuable methodology, but it is no longer sufficient as the sole approach to managing ALL data in the enterprise.
Post “big data,” using the same tightly coupled approach to integration undermines the value of newer data sets that have unknown or under-appreciated value. Here, new methodologies are essential to cost effectively manage and integrate the data. These distinctions are incredibly helpful in understanding the value of Big Data, where best to think about investments, and highlighting challenges that remain a fundamental hindrance to most enterprises.
Big Data has had an impact on all the different data types:
Marketing expanded from Customer and Campaign History to include Big Data like interaction data, digital advertising, and clickstream interactions.
Operations expanded from Inventory, Returns, and Manufacturing data to include Big Data from server logs, sensor data, and telemetry.
Finance expanded from Orders, Expenses, and Revenue to include electronic commerce.
Customer Experience grew from Case History to include social media, audio recordings, and IVR routings.
Sales expanded from Customers, Prospects, and Pipeline to include customer portal interactions.
Not all of this data has strong value. We do not want to spend time modeling data that has unknown business value.
Taking the merits of the different technologies off the table, what some of us are thinking is…
Traditional approaches to DI require lots of upfront investment BEFORE requirements and value are understood.
I think that if you take the merits of the different technologies off the table, what some of us are thinking is this: the time-consuming and expensive part of a “traditional” Business Intelligence and Analytics project is the up-front data modelling and data integration; maybe we just shouldn’t bother with them any more?
Traditional approaches to data integration involve lots of difficult and unglamorous work that adds time and cost to Analytic projects, often before the precise requirements and value of those projects are fully understood. And that makes the idea of a frictionless data acquisition process – Data Lake Big Idea #2 - very appealing.
Late-binding brilliant for evolving data structures
No free lunch / engineering trade-offs
Flip-side of increased flexibility is, well, increased flexibility.
So-called schema-less information management means storing raw data – and then “late-binding” one of many different schemas, or interpretations, to the data at query run time, rather than applying a single interpretation to the data at load time.
This approach is a brilliant way of dealing with data whose structure evolves rapidly. If you are capturing raw device log data today, probably you are not modelling that data - or at least all of that data - relationally. Because if you do, you risk having to re-visit the target data model and associated ETL processes every time you upgrade the device firmware so that the device is capable of capturing new attributes about itself and its environment.
But in Information Technology there is always a "but". Engineering is about trade-off and compromise. The flip-side of increased flexibility is, well, increased flexibility. Giving users multiple ways of interpreting data typically also means giving them several ways of interpreting it incorrectly. Assuring data quality is much more complex without a pre-defined schema. And a well-designed schema-on-load implementation enables us to optimise access paths, so that more users can make more use of valuable data.
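The late-binding idea above can be sketched in a few lines. This is an illustrative example, not any vendor's API: raw device log records are stored as-is, and each consumer binds its own interpretation (its own set of fields) at read time, so a firmware upgrade that adds a new attribute breaks nothing.

```python
import json

# Raw device log records stored as-is ("schema on read"); newer firmware
# adds a "humidity" attribute without invalidating older records.
raw_logs = [
    '{"device": "d1", "temp_c": 21.5}',
    '{"device": "d2", "temp_c": 19.0, "humidity": 0.40}',
]

def bind_schema(record, fields):
    """Late-bind one interpretation: project only the fields this
    query cares about, tolerating attributes that are absent."""
    data = json.loads(record)
    return {f: data.get(f) for f in fields}

# Two consumers apply two different schemas to the same raw data.
temps = [bind_schema(r, ["device", "temp_c"]) for r in raw_logs]
climate = [bind_schema(r, ["device", "temp_c", "humidity"]) for r in raw_logs]

print(temps[0])    # {'device': 'd1', 'temp_c': 21.5}
print(climate[0])  # the older record simply has no humidity value
```

The flip-side described above shows up immediately: nothing stops a third consumer from binding a wrong or inconsistent field list, which is exactly the data-quality risk a pre-defined schema would have caught at load time.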
In order to take advantage of this big data phenomenon, it is imperative to understand that all data has value at different levels of integration. This introduces Tightly Coupled, Loosely Coupled, and Non-Coupled data.
Tightly Coupled
Tightly coupled data is a methodology used to define upfront the structure and rules for how data is to be rationalized, organized, and optimized for performance. Data is transformed into packaged finished goods for consumption.
The benefits of this approach are ease of use, faster response times, data quality and integrity assurances, and consistent results.
Tightly Coupled is best used for heterogeneous data that is frequently accessed and extensively reused, with strong needs for data quality and integrity.
Loosely Coupled
Loosely-coupled data is a methodology whereby the effort to apply structure and rules is deferred as late as possible, often to runtime, so as to avoid unnecessary data preparation. Only the bare minimum of data rationalization, in the form of a key, occurs. Data is treated as raw material stored close to its original form.
The end user benefit of loosely-coupled data is the flexibility to shape data at the user's discretion, and the opportunity to leverage data that would otherwise be out of reach due to the impracticality of applying tight coupling methods.
Loosely-coupled is best used for homogenous data that is less frequently accessed, or where the structure of the source data is evolving - which makes on-going rationalization untenable.
Non Coupled
Non-coupled data is data in its purest raw form, with no additional keys defined during acquisition or prior to consuming the data to aid in integration. Integration of non-coupled data with loosely and tightly coupled data can occur through expertly written end user code that creates the needed linkages (keys) on the fly.
The end user benefit of non-coupled data is the opportunity to leverage data that would otherwise be withheld during the time in which data provisioners are working to define additional structure.
Non-coupled is best used for data sets with no perceived value from integration with other data sets, or in cases where the understanding of a new data set is still in progress.
Costs can be reduced dramatically by taking an approach based on sharing and reuse. Additionally, these new data sources are exponentially more valuable when integrated with other data sets, yielding 6,350 key business questions when ALL DATA is leveraged with different degrees of integration.
Leveraging SQL and combining tightly coupled and loosely coupled data, we now understand that two-thirds of the battery lots are fine, and we can join that data with customer and warranty records to recall only those batteries that are affected. By leveraging the battery sensor data together with the business data, a better decision can be made.
The business decision with the added sensor data is to recall only one-third of the cars and continue to monitor the others.
This results in significantly lower cost, fewer customers being impacted, and less damage to the brand image.
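The battery-recall join can be illustrated with an in-memory SQLite sketch. All table and column names here are hypothetical stand-ins: warranty records play the tightly coupled role (modeled up front), while sensor readings are loosely coupled, rationalized only with a battery-lot key so they can be joined when needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Tightly coupled business data: customer/warranty records, modeled up front.
cur.execute("CREATE TABLE warranty (vin TEXT, battery_lot TEXT)")
cur.executemany("INSERT INTO warranty VALUES (?, ?)",
                [("V1", "LOT-A"), ("V2", "LOT-B"), ("V3", "LOT-C")])

# Loosely coupled sensor data: raw readings carrying only the key
# (battery_lot) needed to link them to the business data.
cur.execute("CREATE TABLE sensor (battery_lot TEXT, max_temp_c REAL)")
cur.executemany("INSERT INTO sensor VALUES (?, ?)",
                [("LOT-A", 38.0), ("LOT-B", 71.5), ("LOT-C", 36.2)])

# Recall only the vehicles whose battery lot actually shows overheating.
cur.execute("""
    SELECT w.vin FROM warranty w
    JOIN sensor s ON s.battery_lot = w.battery_lot
    WHERE s.max_temp_c > 60.0
""")
recalls = [row[0] for row in cur.fetchall()]
print(recalls)  # one of three cars recalled instead of all three
```

The design point is the one the slide makes: without the sensor join, the only safe decision is a blanket recall; with it, the recall can be scoped to the lots that are actually at risk.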
Maximize customer satisfaction
Proactively contact customers to offer repair before problem develops
Setup early detection on the sensor data
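Early detection on the sensor data could be as simple as a rolling-mean threshold. This is a minimal sketch with hypothetical readings and limits, not a production detector: it raises an alert while temperatures are trending up, before any catastrophic value appears, so the customer can be contacted proactively.

```python
from collections import deque

def early_detector(readings, window=3, limit=60.0):
    """Flag a battery as soon as the rolling mean of its temperature
    readings exceeds the limit, i.e. before an outright failure."""
    recent = deque(maxlen=window)
    for i, temp in enumerate(readings):
        recent.append(temp)
        if len(recent) == window and sum(recent) / window > limit:
            return i  # index of the reading that triggered the alert
    return None

# Temperatures trend upward; the alert fires several readings before
# the extreme 90-degree value is ever seen.
alert_at = early_detector([35.0, 41.0, 58.0, 63.0, 66.0, 90.0])
print(alert_at)  # 4
```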
As we continue to advance our UDA you will not only see feature advancements in the Data Platform, IDW, Integrated Discovery Platform, and Real Time processing, but a continued focus on the orchestration of the entire environment. Here you see Restful APIs for both loading data and for accessing data across the UDA.
RESTFUL API
A REST API is a style of interface used by web pages all over the Internet. It allows a web page to access a database directly from the browser, eliminating the need for an ODBC driver or any other special software on the device. Therefore, any phone, tablet, or other Internet-connected device with a web browser can access any source that exposes a REST API.
Teradata REST Services is software that runs on a Teradata Managed Server (or customer supplied server) which provides a REST API interface to the Teradata Database. It allows a web page on any phone/tablet/BYOD device to access the Teradata Database without the need to install any software on the mobile device. The web page makes Teradata REST API calls just by sending the query to a URL. Teradata REST Services then sends the query to the Teradata Database and returns the answer to the web page on the device in JSON format.
Middleware providing an HTTP+JSON bridge to Teradata.
Provides the ability to open database sessions, submit SQL requests and access responses, and access metadata.
Requires zero client install.
Ideal for web, mobile, scripting languages.
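The "send the query to a URL, get JSON back" flow can be sketched with nothing but the Python standard library. Note the endpoint path and payload field names below are hypothetical illustrations, not the documented Teradata REST Services API; no network call is made, and the response shown is a hand-written sample.

```python
import json
import urllib.request

def build_query_request(base_url, sql):
    """Package a SQL request as an HTTP POST with a JSON body,
    the way a browser or mobile page would call a REST service."""
    payload = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/queries",  # hypothetical endpoint path
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_query_request("https://td-rest.example.com/api", "SELECT 1")
print(req.get_method(), req.full_url)

# The service returns the answer set as JSON, which any device with a
# browser can consume with zero client install (sample response below).
sample_response = '{"results": [{"rowCount": 1, "data": [{"c1": 1}]}]}'
rows = json.loads(sample_response)["results"][0]["data"]
print(rows)
```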
Real Time Processing
In the open source area we have deployed projects at insurance companies (Liberty Mutual) and research organizations (Mayo Clinic) through consulting services, combining Storm, Elasticsearch, and Kafka.
We also have partnerships through which we have optimized Teradata with TIBCO and IBM Streams.
Teradata has multiple options for deploying real-time capabilities so that clients can match their requirements to the best approach.
Open source options include the Lambda Architecture, which Teradata has successfully deployed as a professional services engagement for clients. The Lambda Architecture is predicated on Speed, Serving, and Batch layers that leverage open source projects such as Storm, Elasticsearch, and Kafka.
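The division of labor among those three layers can be sketched in a few lines. Plain dicts stand in here for the real components (in a deployment, Kafka/Storm would feed the speed layer and a batch job would rebuild the batch view): the serving layer answers queries by merging the precomputed batch view with fresh speed-layer increments.

```python
# Batch layer: totals recomputed periodically over all historical data.
batch_view = {"page_a": 1000, "page_b": 250}

# Speed layer: increments accumulated per event since the last batch run.
speed_view = {}

def ingest_event(page):
    """Speed-layer update: applied immediately as each event arrives."""
    speed_view[page] = speed_view.get(page, 0) + 1

def serve(page):
    """Serving layer: merge the batch view with fresh speed-layer data."""
    return batch_view.get(page, 0) + speed_view.get(page, 0)

for p in ["page_a", "page_a", "page_c"]:
    ingest_event(p)

print(serve("page_a"))  # batch total plus two fresh events
print(serve("page_c"))  # seen only by the speed layer so far
```

When the batch layer next recomputes, its view absorbs those events and the speed layer's increments are discarded, which is the trade-off the pattern makes to get both accuracy and low latency.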
Data Platform
[add content]
IDW
[add content]
Aster
[add content]
With Teradata, you can:
Put data and analytics at the center: It's no longer acceptable to discover and extract data for each project individually. Nor is it acceptable to rely on a single technology or data integration methodology. When you work with Teradata, you will be able to establish a data- and analytic-centric strategic roadmap that takes a holistic view of your information. Your new approach will focus on organizing and facilitating access to all your data so that it is ready to accept a wide range of data sources that meet an extensive set of user needs. And with Teradata, you will have all the tools needed to perform the broadest set of analytics: complex SQL queries, in-database statistics, time series and pathing, graph, text analytics, machine learning, and more. When you start with a data-centric framework, you remove existing silos and are able to integrate your information, no matter what the usage pattern. And because you take an approach that optimizes data placement and where analytics are performed, your level of data duplication and movement is minimized. Your collection of data islands becomes a single strategic asset.
Create an agile environment to drive innovation on demand: With other options available to users, you need to deliver results faster than ever if you want them to continue working with you. By partnering with Teradata, you can balance enterprise needs for agility and governance. You can give your business users a level of self-service previously unavailable. Your users can quickly leverage newly introduced data, and leverage stabilized data to enable them to quickly get new insights – and then turn those insights into action. This accelerated path to the right data makes you a valued resource for business users seeking answers.
Simplify your infrastructure: Each new data source and analytic engine becomes one more moving piece in an already complicated infrastructure. By working with Teradata, you will establish a plan for masking data complexity from end users, and for managing complexity for data provisioners, based on a proven methodology and a best-in-class technology environment. Using the concept of "load once, use many," you will dramatically reduce the amount of data movement across your environment and the number of unique connections. Additionally, you will be able to provide a single interface to access all your data, making multiple data platforms appear to be one from the end-user perspective. Your users spend less effort getting at the information they need, and your staff spends less time juggling duplicate sets of data and chasing down inconsistencies.