Search and Analytics Platform for Text and Rich Media
Open Innovation is transforming everything
Connected people, apps and things generating massive data in many forms
How do you bridge the gap between data and outcomes?
Augmented Intelligence power apps for competitive advantage
Machine Learning at the Service of Business Augmented Intelligence
HPE Big Data Advanced Analytics Software Solutions
Strong information and weak information
HPE IDOL: Natural Language Processing (NLP) engine
2. Open Innovation is transforming everything
Traditional Data Analytics (Contain Cost)
– Closed technology architecture design
– “After-the-fact” static analytics, e.g. monthly reporting
– Analyze data at “rest”
– Premise-based systems
Open Innovation Data Analytics (Create Outcomes)
– Seamless blending of open source, advanced technology, deployment choices…
– Real-time insight & understanding via machine learning
– Put data science into your processes – Next-gen apps and services
– Analyze and apply perishable data anywhere at any time
Journey to the New Style of Business
3. Connected people, apps and things generating massive data in many forms
Human data
Machine data
Business data
10x faster growth than traditional business data
4. How do you bridge the gap between data and outcomes?
How do you consume
any data generated
or understood by
humans?
How do you identify
key aspects and
patterns to determine
outcomes?
How do you
automate to take
action?
Data sources Diverse Modern
Apps
Q1 Q2 Q3
5. Augmented Intelligence power apps for competitive advantage
Augmented Intelligence
powered by HPE
Artificial intelligence, machine learning and natural
language processing using advanced analytics functions.
7. HPE Big Data Advanced Analytics Software Solutions
Vertica high-performance
advanced analytics
− Real-time performance at scale
− Premise, Cloud, and Hybrid
− Native optimized
Hadoop options
IDOL augmented intelligence for
human information
− Advanced enterprise search and rich media
analytics
− Analyze text, audio, image, and streaming
video
Haven OnDemand APIs and
Services
− Machine Learning as a Service
− Delivered on Microsoft Azure Cloud
− Accessible to any developer
Deep Learning, Text Analytics, Face Detection, Neural Network, Speech Recognition, Categorization
9. An analysis platform without data is like humans without senses
150 data sources
Index without relocation
10. Why is processing human data different?
Human Information is made up of ideas, is diverse and has context
– Ideas don’t exactly match like data does; they have distance.
– Human Information is not static – it’s dynamic and lives everywhere.
– Legacy techniques have all fallen short.
Mobile, Texts, Email, Audio, Video, Social Media
Transactional Data, Documents, Search Engine, Images, IT/OT
16. Strong information and weak information
Key words are small amounts of very strong information without context.
Larger amounts of weaker information are what humans refer to as “context”.
“Mercury”
Is it a planet? Is it an element? Is it a car?
With high certainty, it’s an element!
“A heavy element and the only metal that is liquid at standard conditions for
temperature and pressure with the symbol Hg and atomic number 80,
commonly known as quicksilver”
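The way weak contextual evidence accumulates into a confident answer can be illustrated with a small naive Bayes sketch. This is not IDOL's algorithm; the sense vocabularies, the uniform prior and the function names below are assumptions made up for illustration only.

```python
import math
import re
from collections import Counter

# Hypothetical toy "training" text for three senses of the keyword "Mercury".
SENSE_CONTEXTS = {
    "planet": "planet orbit sun solar system closest planet orbit",
    "element": "element metal liquid symbol atomic number quicksilver metal",
    "car": "car sedan engine dealer model ford car",
}


def train(sense_contexts):
    """Count word occurrences per sense and collect the shared vocabulary."""
    counts = {sense: Counter(text.split()) for sense, text in sense_contexts.items()}
    vocab = set().union(*counts.values())
    return counts, vocab


def sense_posterior(text, counts, vocab):
    """Return P(sense | text) under a uniform prior with add-one smoothing."""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w in vocab]
    log_scores = {}
    for sense, c in counts.items():
        total = sum(c.values())
        log_scores[sense] = sum(
            math.log((c[w] + 1) / (total + len(vocab))) for w in words
        )
    top = max(log_scores.values())
    unnorm = {s: math.exp(v - top) for s, v in log_scores.items()}
    z = sum(unnorm.values())
    return {s: v / z for s, v in unnorm.items()}


COUNTS, VOCAB = train(SENSE_CONTEXTS)

# The bare keyword carries no context, so every sense stays equally likely.
KEYWORD_ONLY = sense_posterior("Mercury", COUNTS, VOCAB)

# The surrounding words are individually weak, but together they are decisive.
WITH_CONTEXT = sense_posterior(
    "A heavy element and the only metal that is liquid at standard conditions "
    "for temperature and pressure with the symbol Hg and atomic number 80, "
    "commonly known as quicksilver",
    COUNTS,
    VOCAB,
)
```

With the keyword alone the three senses remain tied; with the surrounding sentence the "element" sense dominates, which mirrors the point of the slide.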
17. Using Cognitive Analysis to form a human-like understanding of content
HPE IDOL: Natural Language Processing (NLP) engine
Fundamentally created to understand natural
human language using probabilistic modeling
and NLP algorithms
• Allows incoming data to dictate the model,
not pre-defined rules, dictionaries, or semantic webs
Self-Learning / Machine Learning
• Updates as more data is added or removed
• Adapts to changing definitions or meaning
Fundamentally language-independent
• Treats words as symbols
Optimized with language packs
• Eduction, sentiment analysis, speech analytics
Information Theory and
Bayesian Inference
18. IDOL’s Core Capabilities
Data Enrichment
– What is it? Augment data with other relevant data
– Example: Extract company names from tweets and make tweets searchable by company names
Advanced Enterprise Search
– What is it? Context sensitive search across internal and external sources
– Example: Search for Samsung and get results related to Samsung, Apple iPhone, Huawei
Knowledge Discovery
– What is it? Uncover trends, patterns & relationships without explicit queries
– Example: Uncover root causes of customer attrition with social media and call center data
Rich Media Analytics
– What is it? Recognize and analyze image, video and audio
– Example: Logo/object/text recognition and speech-to-text transcription in broadcast media
19. HPE IDOL - Market leadership
Gartner Magic Quadrant for Enterprise Search 2015
For the 2nd consecutive year, Gartner has
positioned HPE as a leader in its Enterprise
Search Magic Quadrant 2015 based on ability
to execute and completeness of vision.
Source: Gartner (August 2015)
This graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the
context of the entire document. The Gartner document is available upon request from HPE.
Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise
technology users to select only those vendors with the highest ratings or other designation. Gartner research
publications consist of opinions of Gartner’s research organization and should not be construed as statements of
fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of
merchantability or fitness for a particular purpose.
20. HPE IDOL - Market leadership
The Forrester Wave™: Big Data Search And Knowledge Discovery Solutions, Q3 2015
The Forrester Wave™ is copyrighted by Forrester Research, Inc. Forrester and Forrester
Wave are trademarks of Forrester Research, Inc. The Forrester Wave is a graphical
representation of Forrester's call on a market and is plotted using a detailed spreadsheet
with exposed scores, weightings, and comments. Forrester does not endorse any vendor,
product, or service depicted in the Forrester Wave. Information is based on best available
resources. Opinions reflect judgment at the time and are subject to change.
• A leader in overall results based upon
strategy and current offering
• Top ranked in strategy
22. Over 500 IDOL functions to augment your intelligence
Automatic hyperlinking
Conceptual search
Keyword search
Fieldtext search
Phrase search
Phonetic search
Field modulation
Fuzzy matching
Implicit profiling
Explicit profiling
Community and expertise network
Agents
Intent-based ranking
Alerting
Social feedback
Eduction
Automatic clustering
Clustering 2D/3D
Autoclassification
Auto language detection
Sentiment analysis
Automatic taxonomy generation
Automatic Query Guidance
Highlighting
Parametric refinement
Summarization
Real-time predictive query
Metadata extraction
Automatic tagging
Faceted navigation
Inquire: Search your data
Investigate: Analyze your data
Interact: Personalize your data
Improve: Enhance your data
23. Language independence
–Free from linguistic restraints and
rules
–Automatically adapts to changing
definitions
–Over 150 languages
–Single-byte, multibyte and Unicode
languages
–Optional language packs for
localization
24. Clustering
Automatically partition the data so that similar information is clustered together
Example clusters: Product performance issues; Side letters; Off balance sheet transactions
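As a rough illustration of partitioning documents so that similar ones end up together, here is a minimal single-pass clustering sketch based on bag-of-words cosine similarity. It is not IDOL's clustering method; the sample documents, the threshold and all function names are assumptions for illustration.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Turn text into a bag-of-words term-frequency vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(docs, threshold=0.3):
    """Assign each document to the most similar cluster, or start a new one."""
    clusters = []  # each entry: {"centroid": Counter, "members": [doc indices]}
    for index, doc in enumerate(docs):
        vec = vectorize(doc)
        best, best_sim = None, threshold
        for candidate in clusters:
            sim = cosine(vec, candidate["centroid"])
            if sim >= best_sim:
                best, best_sim = candidate, sim
        if best is None:
            clusters.append({"centroid": Counter(vec), "members": [index]})
        else:
            best["centroid"].update(vec)  # fold the document into the centroid
            best["members"].append(index)
    return [c["members"] for c in clusters]


# Hypothetical documents echoing the cluster labels on the slide.
DOCS = [
    "side letter signed with the distributor",
    "product performance issues reported by customers",
    "another side letter with the distributor",
    "customers reported product performance issues again",
]
GROUPS = cluster(DOCS)
```

No categories are defined up front: the groups emerge from the word overlap between documents, which is the behaviour the slide describes.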
25. Automatic Query Guidance
Add context to short queries by grouping results into concepts
Query: ”Madonna”
Query search → Results: Documents containing ”Madonna”
Conceptual clustering → Result documents about:
1. Singer
2. Italian Renaissance
3. Madonna
Further suggestions…
Most likely meaning…
26. Enhance your data
Exploratory analytics that help you discover the “unknown unknowns”
– Managed classification
  – Create categories using business rules or training
– Automatic classification and clustering
  – Automatically determine categories based on patterns and relationships in information
  – Spot analysis of all themes and grouping
  – Time sensitive analysis: What’s hot? What’s new?
– Eduction
  – Apply structure to unstructured data by extracting key fields and entities
  – Hundreds of entities supported, including names, addresses, credit card information, sentiment, intent, etc.
– Audio analysis
  – Speaker independent speech to text, speaker identification, audio events, language identification, etc.
– Image and video analysis
  – Next generation image classification (is this a car?/find more like “this”)
  – On-screen OCR, logo detection, intelligent scene analysis, color and texture analysis, story segmentation, etc.
27. Hundreds of conceptual entities
Eduction
– Quickly narrow search results with auto-identified facets and
conceptual entities such as employee names from documents
– Validate or customize entities
– Is this a valid credit card number?
– What are all docs that contain SSNs?
– If area code is 415, output as Home Office
– Pinpoint accuracy for multibyte languages such as CJK, Thai and some
European languages
Names
Places
IP addresses
Companies
Events
Relationships
Medicines
Airports
Cars
Social Security numbers
Phone numbers
Credit cards
Dates
Holidays
Job titles
Currencies
… many more
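The validation and customization questions above ("Is this a valid credit card number?", "If area code is 415, output as Home Office") can be sketched with regular expressions plus a Luhn checksum. This is only an illustration of the idea, not the Eduction grammar format or API; the patterns, labels and function names are assumptions.

```python
import re

# Illustrative patterns only; real entity grammars are far more thorough.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")
PHONE_RE = re.compile(r"\((\d{3})\)\s*\d{3}-\d{4}")


def luhn_valid(number):
    """Validate a card number with the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:  # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def extract_entities(text):
    """Extract card and phone entities, validating or relabelling each match."""
    entities = []
    for match in CARD_RE.finditer(text):
        entities.append(
            {"type": "credit_card", "value": match.group(),
             "valid": luhn_valid(match.group())}
        )
    for match in PHONE_RE.finditer(text):
        # Custom output rule from the slide: area code 415 means Home Office.
        label = "Home Office" if match.group(1) == "415" else "Other"
        entities.append({"type": "phone", "value": match.group(), "label": label})
    return entities


SAMPLE = (
    "Paid with 4111 1111 1111 1111, not 4111 1111 1111 1112. "
    "Call (415) 555-0100 or (212) 555-0199."
)
ENTITIES = extract_entities(SAMPLE)
```

The first card number is the well-known Visa test number and passes the checksum; the second differs by one digit and fails, so a match on the pattern alone is not treated as a valid entity.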
28. Analyze your data
– Quickly evaluate the relevance of information
  – Automatic Query Guidance (providing top themes from query results in real time)
  – Concept navigation via advanced visualizations (node graphs, theme tracking, topic maps, broadcast analysis)
  – Intelligent summarization (simple, concept and context)
  – Intelligent highlighting (search terms, phrases, concepts, context, fidelity to query grammar)
  – Concept streaming (real-time summaries from audio that are contextual to queries and intent)
  – Intelligent de-duplication, including “near” de-duplication
– Use structure to navigate the data
  – Structured, semi-structured and XML support
  – Parametric search (unlimited nesting and association support)
  – Directed navigation (create compelling navigation for users)
29. Personalize your data
We are what we…
30. Discover Relationships for Richer Insight
Knowledge Graph
Customer A is in
Customer B’s network
Customer C is linked
to Customer E via
Customer D
Customers F and G
purchased the same
model last year
Customer H is the
most influential in
Customer B’s network
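Statements like these can be derived from a simple graph of customer relationships. The sketch below uses a breadth-first search for "linked via" paths and neighbour degree as a stand-in for influence. The edge list, the influence measure and the function names are hypothetical and are not how IDOL builds its knowledge graph.

```python
from collections import defaultdict, deque

# Hypothetical relationship edges between customers.
EDGES = [
    ("A", "B"), ("B", "H"), ("A", "H"), ("H", "I"), ("H", "J"),
    ("C", "D"), ("D", "E"),
]


def build_graph(edges):
    """Build an undirected adjacency map from an edge list."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the node path or None if unconnected."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in sorted(graph[node]):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


def most_influential(graph, customer):
    """Pick the direct contact with the most connections (degree as influence)."""
    return max(sorted(graph[customer]), key=lambda n: len(graph[n]))


GRAPH = build_graph(EDGES)
PATH_C_TO_E = shortest_path(GRAPH, "C", "E")   # C is linked to E via D
TOP_IN_B_NETWORK = most_influential(GRAPH, "B")
```

Degree is the crudest possible influence measure; it is used here only to keep the example short.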
31. Intent-based ranking
– Search results personalized and targeted based on user and context
– Profile developed through complete behavior analysis… implicit or explicit profiling
– Gather data from content consumption, content contribution, interaction with colleagues, etc.
32. Topical sentiment analysis
– Decomposition and classification within a sentence to pull out specific
topics
– “I stayed at the Marriott last week, and though the mattresses were
very nice, the service was awful.”
– Is this Positive? Negative? Neutral?
– How much Positive? How much Negative?
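One way to picture the decomposition is to split the sentence into clauses and score each topic separately, as in the toy sketch below. The lexicon, the topic list and the clause-splitting rule are assumptions for illustration; production sentiment analysis is considerably more sophisticated.

```python
import re

# Hypothetical sentiment lexicon, intensifiers and topic vocabulary.
LEXICON = {"nice": 1.0, "great": 1.0, "awful": -1.0, "rude": -1.0}
INTENSIFIERS = {"very": 1.5}
TOPICS = {"mattresses", "service", "room"}


def topical_sentiment(sentence):
    """Score each topic by the sentiment words found in its own clause."""
    results = {}
    # Split on commas and a few conjunctions to approximate clause boundaries.
    for clause in re.split(r",|\bbut\b|\bthough\b|\band\b", sentence.lower()):
        tokens = re.findall(r"[a-z]+", clause)
        score, boost = 0.0, 1.0
        for token in tokens:
            if token in INTENSIFIERS:
                boost = INTENSIFIERS[token]  # applies to the next word only
                continue
            if token in LEXICON:
                score += boost * LEXICON[token]
            boost = 1.0
        for topic in (t for t in tokens if t in TOPICS):
            results[topic] = results.get(topic, 0.0) + score
    return results


SCORES = topical_sentiment(
    "I stayed at the Marriott last week, and though the mattresses were "
    "very nice, the service was awful."
)
```

A single document-level score would blur the two opinions together; scoring per topic keeps the positive mattress comment and the negative service comment apart, and the magnitudes answer "how much".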
33. Search video as easily as text
Transform rich media into intelligent assets
Live video or playback
from archived footage
On-screen text
recognition
Face identification
Automatically generated
transcript using speech
recognition
Speaker identification
Timecode
synchronization
Automatic keyframe
generation
Automate
Automatically create metadata,
keyframes, transcriptions
Understand
Understand video footage and audio
streams in real time
Act
Apply advanced analytics such as
clustering and categorization, and link
with other file types
34. Image technology: 2D objects
Registered image Test image
Generic Logo recognition
Registered
Logos
Test image
35. Intuitive Knowledge Discovery for Self-Service Analytics
Visualization to simplify analytics workflow Topics Map
Sunburst
Result Comparison
Rich Contextual View
Business Intelligence for Human Information (BIFHI)
37. Customer care, turbocharged
Customer Self Service via IDOL Search
Key Differentiators
Automate more customer service with advanced
features such as contextual search, automatic
hyperlinking, implicit query, sentiment analysis,
alerting, and chat agents
Find and act on 100% of information - regardless
of language, source or information format
Scalably and securely access virtually all systems,
including cloud repositories with over 400 pre-built
enterprise-class connectors
How Customer Would Deploy
IDOL powered self service web-based support using all available knowledge sources: knowledge base, contact center, forums,
product reviews and more, with connectors to existing OSS, BSS, media solutions and network management as needed
How IDOL would drive competitive advantage for customer
• Reduced churn & improved CSAT due to enhanced automation of customer self-service and improved user experience;
• SG&A cost reduction with single systems for internal and external support
Solutions like HPE Service
Anywhere run IDOL Search
to improve service quality
and staff efficiency
38. Monitor social media to proactively address incidents and issues
Social Customer Service via IDOL
Key Differentiators
• Combine social and public data (Twitter, news, etc)
with customer data in the enterprise to gain insights
• Strong text analytics to synthesize and summarize
large volumes of data – sentiment analytics, concept
extraction, extract place names
How Customer Would Deploy
Deploy IDOL with connectors to various data sources.
How Social Customer Service would drive competitive advantage for customer
Tap into other sources of customer feedback for proactive and reactive resolution of service issues.
Improve customer satisfaction, mitigate churn, identify upsell opportunities
39. Build a knowledge graph of your organization and automate customer interaction
Workforce Productivity via IDOL Knowledge Management
Key Differentiators
• Automates manual customer care processes &
actions
• Expertise location to team and deliver best response
• Proactively deliver and manage relevant & timely
data
How Customer Would Deploy
Deploy a comprehensive platform for customer
interaction to automate a time-consuming, labor-
intensive process.
How IDOL would drive competitive advantage for customer
• SG&A cost savings: Increased customer satisfaction (decrease churn rate), decreased call center load, understand your
organization better to align resources and eliminate inefficiency
• Increased revenue: Re-deploy resources to high value-add customer services
Unstructured
Structured
Collaboration
Expertise
Location
Categorization
Eduction
Taxonomy
40. A Smarter Data Lake Needs…
Automatically analyse rich media
Connectors & Policies
IDOL Features
Integration points with Hadoop
Understand myriad file formats and types
Break down information silos across enterprise
Improved, intuitive visibility to contents
KeyView
IDOL Server (incl HDFS Sync)
Image Server & Video Server
Using IDOL to enhance Hadoop (Beta for Evaluation)
• Any Source − Build, enrich, and clean up your data lake
• Data Clarity and Mapped Security – Data dictionary and information security within your data lake
• Advanced Analytics - Provide contextual search and text, image, video, speech machine learning
41. Analyze video, audio, images to support & drive the next wave of experience and monetization
On-Screen Text Recognition
Multimedia Analytics via IDOL Multimedia
Key Differentiators
• Automate - create metadata, key frames, transcriptions
• Understand - video and audio streams in real time
• Act – apply advanced analytics (cluster, categorize, link)
How Customer Would Deploy
In line with strategic next wave value added services, rich content,
and services strategy
How IDOL Multimedia Analytics would drive competitive advantage for Customer
Drive next wave content, publishing, and advertising/monetization (revenue enhancement)
- Value Added Services to compete against OTT
- Content screening, moderation
- Ad verification
- Compliance
New Age On-Demand Internet Video, Audio
Multi Language, Video Analytics, Face detection, Sentiment extraction, Advanced IDOL Analytics, Speech-to-Text, Speaker Identify
42. IDOL powered Healthcare Analytics for 360 degree clinical intelligence
Core Capabilities:
• Integrated modular platform for variety of use cases
• Hundreds of data connectors and data types
• Rapid identification of concepts, patterns and relationships
• Conceptual search on all data
• Advanced security for compliance
Healthcare specific capabilities:
• SNOMED CT taxonomy with 344K+ concepts and 2M terms
• Integrated ICD codes
Reconciliation
ID discrepancies between
diagnostic code and clinical notes
Monitor KPIs and Metrics
Reporting
Abstraction
Rapid Chart Access
43. IDOL powered Smart City Solution
Integration
Integrate data feeds from across the city into a common command center for investigation and event monitoring
Analytics
Add video, audio, and event analytics to the feeds to enable real time monitoring for security trends and incidents
Data Fusion
Complete the puzzle with additional information sources like social media, broadcast media monitoring, employee databases, etc.
Built-in Scalability
Unlimited expansion and connectivity already included at all levels by design.
Automation
Streamlined workflow and automated process for triggers and alerts
45. Spanish Ministry of Interior Improves Safety With Big Data Analytics
Challenge
– Create airline passenger registration system and compare information against existing police databases, to protect the country against crime.
– Be able to intercept suspect passengers before they take a flight, during transit or at their destination point.
Solution
– HPE IDOL + Vertica
Benefits
– Extract meaning from virtually all forms of information associated with any airline booking, including unstructured data such as audio, video, images, social media, email and web content, as well as structured data such as customer transaction logs
– Perform language-independent analysis and flag potential targets.
“2016 will see 3.6 billion passengers worldwide making journeys by air, so we need to employ every possible way to protect our borders against crime”
46. China Mobile
Communications service provider industry
Challenge
– Allow users to access information on thousands of public services
directly from their mobile phones – success of the Wireless City
platform depends on the users’ ability to quickly find information
Solution
– HPE IDOL
Result
– Over 740 million subscribers can search through more than 8,000
applications for public service information, including public
transportation schedules, public health records, traffic offenses and
more
– Users receive more accurate search results than ever before
– China Mobile customers get the most relevant and useful information
regardless of the terms they use in the search
Private | Confidential | Internal Use Only
47. Leading American multinational telecom
Paying careful attention to every aspect of customer-facing processes and applications
Challenge
– Provide support desk staff with fast access to precise information
required to address customer’s problem
– Improve knowledge management system search capabilities
Solution
– HPE IDOL
– HPE Big Data Professional Services
Result
– Reduced time-to-resolution with fast queries that ensure support
experts can resolve customer issues quickly
– Relevant results as query functionality makes sure that results deliver
information most likely to resolve customer issues
48. Leading financial software, data and media company
Subscribers require up-to-the-second information on market conditions and trends
Challenge
– Deliver search performance at the scale required by the size of its data
repository, 200 million messages, 15-20 million chats daily
– Provide robust, cost-efficient solution with scalability for large and
growing volume of data, supported by small IT headcount
Solution
– HPE IDOL
– HPE Big Data Professional Services
Result
– Detects trends in real-time messaging and chats for subscribers
– Accommodates 10+ billion of document entries without compromising
performance today
– Ensures scalability delivers ROI in the future
49. NANA Management Services
HPE IDOL brings higher security, lower cost, better business data
Challenge
– Businesses are challenged to manage high expense of traditional
guard services, false alarm rate and security equipment failure
Solution
– HPE IDOL engine embedded in OEM security solution
– NMS Virtual Guard with Milestone video surveillance platform
Result
– Increased visibility with limited number of cameras versus human
security guards. NMS Virtual Guard records every event
– Better value from assets, system alerts users when a camera goes
down (often recording only black)
– Efficient cost structure, NMS Virtual Guard and HPE IDOL can reduce
security costs by 80% over traditional guard staffing
50. Free State of Saxony
HPE IDOL offers government powerful, centralized search
Challenge
– State government needed a future-proof, easy-to-administer search
solution for all administrative departments in Saxony, Germany
Solution
– HPE IDOL
Result
– A reliable, cost-effective search solution with simple administration and
high-speed indexing of web services, files, cartography, more
– 150 different internet portals indexed each night, with changes to
100,000+ documents
– System manages 110 km of documents, covering 1,100 years of
Saxon, German, and European history across five locations
51. HPE’s Resume Search solution
Finding talent using Big Data Platform
Challenge
– Provide a fast, reliable means for finding the right talent for contracted
service engagements in HPE’s client base
Solution
– HPE IDOL
– HPE Project Portfolio Management
Result
– Meaning-based search of unstructured data across thousands of
resumes helps locate in-house talent quickly
– Resource Brokers identify key attributes, skills and experience
required by Enterprise Services projects
52. HPE on HPE CX Analytics
Answering critical customer satisfaction issues better and faster
Challenge
– Pull all customer-related data into a centralized repository
– Create a set of analytics services its business units can use to
improve the company’s Net Promoter Score® (NPS)
Solution
– HPE IDOL Information Analytics and HPE Vertica Analytics Platform
– Tableau
– Hadoop
Result
– Maximize value of customer experience data to improve customer
satisfaction
– Provide snapshot of customer experience metrics that is current and
comprehensive; answer complex questions in 5-10 minutes
– Generate a 360-degree view of the HPE customer experience
53. Dept. of State Development, Business and Innovation
Public Sector – Victoria, Australia
Challenge
– Provide a single, secure, enterprise-wide search platform across
multiple information sources inside and outside the organization
– Locate information from different information sources such as HPE
TRIM, the DSDBI Intranet, shared network drives, Salesforce and
external sites such as Hansard, Australian Bureau of Statistics,
Victoria online and other websites
Solution
– HPE IDOL, Microsearch Portlet, Microsearch consulting services
Result
– Easily and quickly find relevant information with near real-time search
across millions of documents, 9 enterprise and Internet content
sources, leading to significant time savings
– Single sign-on allows filtered results, preventing the inadvertent
disclosure of sensitive information
54. Dubai Police
Accelerating law enforcement for Smart Cities
Challenge
– Use automation to locate wanted vehicles efficiently
– Read a range of number plate styles in both English and Arabic
lettering and different color codes
Solution
– HPE IDOL
– HPE Media Management and Analytics Platform
Result
– System helped Dubai Police capture 2,739 people locally and
internationally, over 18 month period
– Success led to version 2, incorporating improved cameras with ability
to read across six lanes of high-speed traffic
55. Auckland Transport
Driving groundbreaking Future Cities initiative
Challenge
– Enable Safe City Solution to be predictive instead of reactive
– Deploy video analytics to support safety and well-being of citizens with
Future Cities initiative
Solution
– HPE IDOL
– HPE Media Management and Analytics Platform
Result
– Optimizing video analytics with more than 2000 video feeds recorded,
200 video analytics running in real time
– Detect red light jumps, congestion, clearway violation & much more
– Utilize data from more than 2,000 cameras to monitor traffic patterns for
more than 1.4M citizens
– Implement license plate recognition for accurate identification and
scene analysis
56. High risk environment
Base protection
Challenge
– Detect threats anytime, anywhere by correlating
intelligence from multiple sources in various forms –
audio, video, reports and 3rd party sensors
Solution
– HPE IDOL
– HPE Media Management and Analytics Platform
Result
– Automatically flag anomalies by analyzing feeds from
aerostat, UAV, towers, and correlating with other events
– Use biometric databases to relay real-time recognition of
facial features and license plates
57. Stanford Children’s Health
Research for healthcare provider ranking study
Challenges
– Quality and clinical effectiveness research on ~115K patients, ~390K
encounters, ~3M documents
– Diverse data types (structured and unstructured) across data silos
involved
– Time constraints vs extensive search scope
Solution
– HPE IDOL
– Ontology Tagger and Analytics User Interface
Results
– Cross patient search for cohort identification
– Intuitive UI for simple query construction
– Easy clinical note review with highlights, navigation and related
concepts
– Portable queries and results
– Fast indexing
58. Global health services
Robust search technology supports health services needs of 80 million customers worldwide
Challenge
– Detect meaning of data even if data didn’t conform to specific standard
e.g. physician, MD, doctor, or Dr
– Fast query results to support positive customer experience
Solution
– HPE IDOL
Result
– Customers can quickly identify providers that meet their needs for
specialty, location and other important criteria
– Solution supports business and fiscal objectives with lower cost-in-
network providers
– Scalability maximizes ROI over time
59. Fortune 500 global diversified healthcare company
Claims data
– Provider information
– FWA recovery data
– Call center data
– Treatment/Service data
– Social media
Population and community health
Care management/Care coordination
Surveillance, Analysis, Product development innovation
Consumer activation/Engagement/Education
Reputation management/Outreach
Innovation focus
Lines of business
– Innovation
– Brand
– Care delivery
– Product development
– Payment integrity
– Provider
– Consumer activation
60. Fortune 500 global diversified healthcare company
Accelerate and increase cost savings
Challenges
– Drive to find savings by improving payment integrity
– Address evolving patterns of FWA
– Disparate payment systems, no single view
– Skill gaps limit access to analysis
– Long turnaround time for BI analysis reports
Solution
– HPE IDOL
Results
– 24X Improvement in analysis turnaround
– Multi $M savings in weeks
– Self-service analysis for business analysts
– Single point of access covering multiple systems
– Dynamic rule-engine tests against new and historical claims to identify
potential recoveries
– Scale out on Hadoop Architecture
– Flexible platform supporting continual additions of new data and use-cases
61. Beijing Future Advertising
Next generation sports marketing
Challenge
– Deliver high value-add services for advertising clients
Solution
– HPE Media Management and Analytics Platform
– HPE IDOL
Result
– Integrated: Bring broadcast and social media analytics together
– Efficient: Automate monitoring & analysis of broadcasts & audience
reactions. Reduced data classification and processing time from 10
tens to minutes
– Effective: Tap into insights from audience/consumer engagements
– Impactful: Provide guidance for strategy and resource investment
62. NASCAR
Fan and Media Engagement Center
Challenge
– Economic conditions
– Rapidly changing media landscape (social media growth)
– Rev pressures from sponsors
– Industry leadership expectation
Solution
– HPE IDOL
Results
– Live monitoring and analysis of broadcast, news and social media
– Sponsors’ brand and fan sentiment analyses
– Analytics to support race team sponsorship renewals
– Crisis management
– Build fan base with active engagement
View customer testimonial on YouTube
63. Summary
• Holistic – Integrate data silos & unlock hidden insights
• Proven – Sustained market leadership
• Versatile – One platform for diverse use cases
For more information, please visit:
www.hpe.com/software/IDOL
67. IDOL Data Ingestion pipeline
Lua scripting engine is available within connectors
KeyView file format process, Eduction and Lua scripting engine are available within CFS
OCR
Audio/Video
Category
APA Agents
Repository
Connector
Connector
framework server
Content
Repository
Connector
Repository
Connector
DIH
IDOL Proxy
Index tasks
72. Retrieval methods
– Conceptual
–Natural language
–Conceptual matching
–Unstructured refinement
– Business rules
–Boolean
–Keyword
– Parametric
–Structured refinement
73. Over 100 operators for Boolean search
AND
OR
NOT
NEAR
NEARn
DNEAR
DNEARn
WNEAR
WNEARn
BEFORE
AFTER
EOR
WHEN
WHENn
vAND
vSUBSTRING
vMATCHES
NEAR
NEAR/n
SENTENCE
PARAGRAPH
BEFORE
AFTER
ORDER
SOUNDEX
MANY
[n] WORD
CASE
PHRASE
. >
. >=
. <
. <=
. !=
. =
LANG/x
TODAY
YESTERDAY
NOW
NOW+n
NOW-n
term
term*
term?
vOR
vNOT
vACCRUE
vANY
vALL
vIN
vWHEN
vCONTAINS
vENDS
vSTARTS
vSUBSTRING
vCONTAINS
vENDS
vSTARTS
FREETEXT
STEM
TYPO
TYPO/n
YES-NO
PRODUCT
SUM
COMPLEMENT
LOGSUM
LOGSUM/n
MULT
MULT/n
FREQ
term~
term[100]
term[*1.5]
"term"
"term phrase"
term:field
"term
phrase":field
~term
FUZZY()
FUZZYnn()
SOUNDEX()
APCMMOD[]
term[~]
74. Conceptual search
– High recall and precision
  – Return documents that do not contain query terms but are conceptually related
– Input sentences or entire document as query
  – Extracts main concepts in the query to deliver the most relevant results
75. IDOL powers the largest systems in the world:
Scalability
– Millions of users
  – Government agency: 2.5 million users
– Billions of documents
  – Major financial services firm: over 1 bn emails
  – Major pharma: 50 terabytes of data in discovery repository alone
– High throughput
  – Major information services providers: alert on 46m emails per day
76. Mapped security
– Fully integrated Kerberos authentication together with Secure Sockets Layer (SSL) encryption across all transactions
– Compliance with all major security standards, including US DoD 5015.2, UK TNA 2002, Australia’s VERS, ISO 15489
– Full range of customizable security functionality:
  – Discretionary access control (ACL based)
  – Mandatory access control (based on metadata)
  – Kerberized access to IDOL
  – SSO authentication using Windows Active Directory
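The ACL-based (discretionary) access control item can be pictured as a filter applied to search results before they reach the user. The following Python sketch is an illustration only, assuming each document carries a set of permitted groups; the `Document` class, its fields and `visible_results` are hypothetical names, not IDOL's API.

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    title: str
    allowed_groups: set = field(default_factory=set)  # hypothetical ACL field

def visible_results(results, user_groups):
    """Keep only documents whose ACL shares at least one group with the user."""
    groups = set(user_groups)
    return [d for d in results if d.allowed_groups & groups]

results = [
    Document("Q3 forecast", {"finance"}),
    Document("Cafeteria menu", {"finance", "engineering", "everyone"}),
    Document("Design review", {"engineering"}),
]
titles = [d.title for d in visible_results(results, ["engineering", "everyone"])]
print(titles)  # ['Cafeteria menu', 'Design review']
```

Filtering at query time, rather than maintaining a separate index per user, is one way a search layer can reuse the permissions already defined in the source repository.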
77. Search your data
Accuracy
• Conceptual, Keyword or Object
• Extensive Field combinations
• Full Meta Search
Robust Architecture
• Linearly Scalable
• Fault Tolerant
• Disaster Recovery Friendly
Reach
• All Information
• Real-Time Data
• Audio and Video
Security
• Mapped Security
• Fully Extendable
• Leverages Existing Security
78. Personalize your data
Explicit profiling (Agent): user-defined
• Define your interest using:
  - Natural language descriptions
  - Keyword/Boolean rules
  - Refine by example
• Automatically monitor information
• Customizable
• Share interests with knowledge community
Implicit profiling: capture behavior data
• Fully automatic
• Ongoing monitoring of data consumption and contribution
• Multi-faceted profiles
• Always up-to-date
Dynamic communities of interest
• Expert identification
• Define business rules to guide relationships
• Automatically form and manage community
• Collaboration networks
• Document rating
• Consumer groups
[Diagram labels: Agents, Profiles, Communities, Expertise]
79. Understanding the customer at the level of a dialog
Contextual Segmentation
Geo + Demo + Psychographic segments: 18-35 yrs, 35-65, Seniors, Have Kids, Male, Female
Behavioral segments:
– Feature driven: Functions, Performance
– Buzz driven: Reviews, News, Adverts, Social media
Semantic segments: Large Screen, Lots of storage, High Res Display, Would give it 5 stars, Great Value for Price
80. Automatic hyperlinking
–Automatically retrieves conceptually related content
–Searches automatically done for the user
–Increase productivity and reduce duplicate work
81. Visualization of main topics
82. Summarization
Quick summary (N+ lines)
Context summary (What is this doc about in relation to query terms?)
Concept summary (What do I look for with regard to interest rates?)
Information Theory and Bayesian Inference
83. Directed navigation
Narrow search with facets
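Narrowing a search with facets can be sketched as two steps: count the values of a field across the current results, then filter on the value the user picks. The Python example below is a generic illustration under that assumption; the result set, field names and helper functions are hypothetical and do not come from IDOL.

```python
from collections import Counter

# Hypothetical result set; the field names are illustrative only.
RESULTS = [
    {"title": "Annual report", "type": "PDF", "year": 2015},
    {"title": "Earnings call", "type": "Audio", "year": 2016},
    {"title": "Press release", "type": "PDF", "year": 2016},
    {"title": "Product demo", "type": "Video", "year": 2016},
]

def facet_counts(results, field):
    """Count how many results carry each value of the given field."""
    return Counter(r[field] for r in results if field in r)

def narrow(results, field, value):
    """Keep only the results whose field equals the selected facet value."""
    return [r for r in results if r.get(field) == value]

print(facet_counts(RESULTS, "type"))   # facet values offered to the user
selected = narrow(RESULTS, "type", "PDF")
print(facet_counts(selected, "year"))  # remaining facets after narrowing
```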
84. Foster collaboration by automatically matching and connecting employees with similar needs
Connect with your colleagues
Experts
Communities
Files
Social
85. Eduction
<Organization>
• National Security Agency
<Names>
• President Obama
• Vladimir Putin
• Edward Snowden
<Places>
• Moscow
• St. Petersburg
• Washington
• Syria
• Russia
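The entity lists above can be reproduced with a very small dictionary-based extractor. This Python sketch is only a stand-in for the idea of entity extraction: it matches fixed names with regular expressions, whereas the slides do not describe how Eduction itself works. The `ENTITY_LISTS` table reuses the slide's examples, and the sample sentence and function name are made up for illustration.

```python
import re

# Entity dictionaries taken from the slide's examples.
ENTITY_LISTS = {
    "Organization": ["National Security Agency"],
    "Names": ["President Obama", "Vladimir Putin", "Edward Snowden"],
    "Places": ["Moscow", "St. Petersburg", "Washington", "Syria", "Russia"],
}

def extract_entities(text):
    """Return {entity_type: [matches in order of appearance]} for the text."""
    found = {}
    for entity_type, names in ENTITY_LISTS.items():
        # Try longer names first so a longer entity wins over a shorter one.
        ordered = sorted(names, key=len, reverse=True)
        alternatives = "|".join(re.escape(n) for n in ordered)
        # Require non-word characters (or text boundaries) around each match.
        pattern = re.compile(r"(?<!\w)(?:%s)(?!\w)" % alternatives)
        matches = pattern.findall(text)
        if matches:
            found[entity_type] = matches
    return found

sample = ("Sample text mentioning President Obama, Vladimir Putin, "
          "Moscow and the National Security Agency.")
print(extract_entities(sample))
```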
86. Automatically redact sensitive or offensive entities
– Profanity
– Personally Identifiable Information (PII)
– Payment Card Industry Data Security Standard (PCI-DSS)
– …and hundreds of entities
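As one concrete case of redaction, payment card numbers can be found with a pattern and confirmed with the Luhn checksum before masking. The Python sketch below shows that general technique under simple assumptions (13 to 19 digits, optional space or hyphen separators); it is not IDOL's redaction logic, and the names `CARD_CANDIDATE`, `luhn_valid` and `redact_cards` are hypothetical.

```python
import re

# Candidate card numbers: 13 to 19 digits, optionally separated by space or hyphen.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")

def luhn_valid(digits):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for index, ch in enumerate(reversed(digits)):
        d = int(ch)
        if index % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def redact_cards(text, mask="[REDACTED]"):
    """Replace digit runs that look like valid card numbers with a mask."""
    def replace(match):
        digits = re.sub(r"\D", "", match.group())
        return mask if luhn_valid(digits) else match.group()
    return CARD_CANDIDATE.sub(replace, text)

# 4111 1111 1111 1111 is a widely used test number; the order number fails Luhn.
print(redact_cards("Card 4111 1111 1111 1111, order 1234567890123."))
# Card [REDACTED], order 1234567890123.
```

The checksum step reduces false positives, so long numeric identifiers that merely resemble card numbers are left untouched.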
87. In-Document structured analytics: dynamically create fields at query time
• Define new “computed” fields on the fly
• Define a new Total_Price field based on the indexed list price and a run-time tax rate
• Define parametric ranges on the fly
• Add new typed fields on the fly
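The Total_Price example can be made concrete with a few lines of Python. This is a sketch of the concept only, not IDOL query syntax: it assumes an indexed field called `List_Price` (a hypothetical name), computes Total_Price from a tax rate supplied at query time, and assigns each value to a range whose bounds are also chosen at query time.

```python
from bisect import bisect_right

# Hypothetical indexed documents; List_Price stands in for the indexed list price.
DOCS = [
    {"title": "Tablet", "List_Price": 200.0},
    {"title": "Phone", "List_Price": 499.0},
]

def with_total_price(docs, tax_rate):
    """Return copies of docs with Total_Price computed from a run-time tax rate."""
    return [
        {**d, "Total_Price": round(d["List_Price"] * (1 + tax_rate), 2)}
        for d in docs
    ]

def range_label(value, bounds):
    """Map a value to a 'low-high' bucket from sorted bounds chosen at query time."""
    i = bisect_right(bounds, value)
    if i == 0 or i == len(bounds):
        return "out of range"
    return f"{bounds[i - 1]}-{bounds[i]}"

priced = with_total_price(DOCS, tax_rate=0.08)
for d in priced:
    print(d["title"], d["Total_Price"], range_label(d["Total_Price"], [0, 250, 500, 1000]))
```

Because the tax rate and the range bounds are arguments rather than stored values, the same indexed data can answer queries for different regions or price bands without re-indexing.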
88. IDOL Speech to Text adopts next-gen algorithms
• Deep Learning Speech to Text Algorithms
– Deep Neural Networks (DNNs) used for Acoustic Modeling
– Provide deep learning of the features of Speech
– Better at approximating Speech than statistical-based algorithms
• Language Packs trained on Speech Corpora containing many hours of Speech
89. Most Advanced Speech Technology
–Deep Neural Networks (DNNs) used for acoustic modeling
– convert spoken words to text
– Acoustic + Language Model
– Speech-to-Text and IDOL’s conceptual understanding
– Eliminate manually adding metadata to A/V clips
– Superior to phonetic and statistical-based approaches
– Model of language disambiguates similar terms
– U.S. President “Bush”
– “bush” as in a large plant
90. Limitations of Phonetic Search
– Phonetic sounds do not have a unique match
– Only capable of keyword matching
– “Cambridge University”
– /k ey m b r ih jh y uw n ih v er s ax t iy/
– The University of Cambridge
– Cambridge colleges
– Kings College
– Trinity Hall
/k/ /ae/ /t/ → “cat”, “category”, “scatty”, “catalogue”?
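The ambiguity of the /k/ /ae/ /t/ example can be demonstrated with a toy phoneme lexicon. In this Python sketch the transcriptions are rough, hand-written assumptions for illustration (not taken from a real pronunciation dictionary), and the search simply looks for the query phonemes as a contiguous run inside each word.

```python
# Toy lexicon with approximate, hand-written phoneme transcriptions (assumptions).
LEXICON = {
    "cat": ["k", "ae", "t"],
    "category": ["k", "ae", "t", "ax", "g", "ao", "r", "iy"],
    "scatty": ["s", "k", "ae", "t", "iy"],
    "catalogue": ["k", "ae", "t", "ax", "l", "ao", "g"],
    "dog": ["d", "ao", "g"],
}

def contains_sequence(phonemes, query):
    """Return True if query occurs as a contiguous run inside phonemes."""
    n = len(query)
    return any(phonemes[i:i + n] == query for i in range(len(phonemes) - n + 1))

def phonetic_matches(query):
    """List every lexicon word whose phonemes contain the query sequence."""
    return [word for word, phonemes in LEXICON.items()
            if contains_sequence(phonemes, query)]

print(phonetic_matches(["k", "ae", "t"]))
# ['cat', 'category', 'scatty', 'catalogue']
```

A search for "cat" at the phoneme level hits all four words, which is the lack of a unique match that the slide describes.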
91. IDOL Speech is supported by powerful algorithms
Acoustic model, language model and lexicon for each language
• 30+ languages supported
• Real-time operation
• Speaker Independent
• Ability to customize language
• Telephony and broadcast models
Acoustic Model: models of fundamental sound patterns, different for low quality telephone models (8 kHz) and higher quality broadcast models (16 kHz+)
Language Model: base language models and customized models that include common phrases and word sequences
Lexicon: trained pronunciation dictionary with vocabulary
[Diagram: Speech → Front End Processing → Recognition Algorithm (using Acoustic Model, Language Model and Lexicon) → Text]
92. Image technology: Text
Document field extraction
<item>
<price>$6.23</price>
<date>10/2/2012</date>
<purpose>Lunch</purpose>
…
</item>
OCR: Read text from images
1D and 2D barcode reading
ISBN (“9870140189865”)
PDF-417 (“LASTNAME, FIRSTNAME,…”)
Data Matrix (“The Future of Ticketing…”)
Many more (about 20 barcode types)
Image artifacts such as wrinkled paper
Avoid non-text parts of the image
Column understanding
93. Image technology: human analysis
Primary clothing color = white; Not nude
Primary clothing color = white; Not nude
Primary clothing color = black; Not nude
Face detection
Face analysis
Found “President Obama” face