SlideShare uma empresa Scribd logo
1 de 33
High Speed, Big Data Analysis using Exalytics
Mark Rittman, Technical Director, Rittman Mead
OUGN 2013 Conference, April 2013, Oslo

T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
Mark Rittman

• Mark Rittman, Co-Founder of Rittman Mead
• Oracle ACE Director, specialising in Oracle BI&DW
• 14 Years Experience with Oracle Technology
• Regular columnist for Oracle Magazine
• Author of two Oracle Press Oracle BI books
   ‣ Oracle Business Intelligence Developers Guide
   ‣ Oracle Exalytics Revealed
• Writer for Rittman Mead Blog :
  http://www.rittmanmead.com/blog
• Email : mark.rittman@rittmanmead.com
• Twitter : @markrittman




 T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
About Rittman Mead

• Oracle BI and DW platinum partner
• World leading specialist partner for technical excellence, solutions delivery and
  innovation in Oracle BI
• Approximately 30 consultants worldwide
• All expert in Oracle BI and DW
• UK based
• Offices in US, Europe (Belgium) and India
• Skills in broad range of supporting Oracle tools:
   ‣ OBIEE
   ‣ OBIA
   ‣ ODIEE
   ‣ Essbase, Oracle OLAP
   ‣ GoldenGate
   ‣ Exadata

 T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
What is Oracle Exalytics?

• Hardware Element
   ‣ Sun Fire X4470 M2 server
   ‣ 1TB RAM, 40 Cores, 3.6TB HDD
• Software Element
   ‣ OBIEE 11.1.1.6 with Exalytics Enhancements
   ‣ Oracle Essbase 11.1.2 with Exalytics Enhancements
   ‣ Oracle TimesTen 11.2.2.2 for Exalytics
   ‣ New in v1.1 : Support for Oracle Endeca
     Information Discovery, Golden Gate, ODI
   ‣ Runs on 64-bit Oracle Linux
     (Exalogic distribution)
• OBIEE and Essbase are licensed as
  Oracle BI Foundation
• Exalytics features can only be used in
  conjunction with Exalytics hardware

T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
Exalytics as the Exa-Machine for BI

• Runs the BI layer on a high-performance, multi-core, 1TB server
• In-memory cache used to accelerate the BI part of the stack
• If Exadata addresses 80% of the query performance,
  Exalytics addresses the remaining 20%                              Oracle BI

   ‣ Consistent response times for queries
   ‣ In-memory caching of aggregates
   ‣ 40 cores for high concurrency                             In-Memory DB/Cache
   ‣ Re-engineered BI and OLAP software
     that assumes 40 cores and 1TB RAM



                                                                     ERP/Apps                    DW




 T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
Enables High-Density Analysis of Big Data

• BI tier is rarely the bottleneck, but it can be if very dense visualizations are used
   ‣ Sparklines, grid of charts etc
• Exalytics’ 40 cores and 1TB RAM make higher density presentation viable
   ‣ Single query sent to the database
   ‣ Exalytics breaks data up to create microcharts
• Also helps support high numbers of concurrent users (100+)




 T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
Also Supports Essbase, and Endeca Information Discovery

• In-Memory Essbase for planning, budgeting and sales analysis-style OLAP applications
• Endeca Information Discovery for search/analytic applications against diverse data


           Oracle
         Exalytics
        In-Memory
          Machine




 T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
Oracle’s Wider Strategy for Business Analytics

• Connect to all of your data, from all your sources,
• Subject it to the full range of possible inquiry
• Package solutions for known problems and fixed sources, and
• Deploy to PCs and mobile devices, on premise or in the cloud


          Any Data,                    Full Range of                Integrated                   On Premise,
         Any Source                      Analytics                 Analytic Apps                  On Cloud,
                                                                                                  On Mobile




 T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
Big Data : New Datasets of Higher Variety, Volume and Velocity

• Traditional BI datasets have been relatively small, and well structured
• Financial data and other metrics, with attributes and hierarchies to slice-and-dice it
• Big data is all about collecting and analyzing data sets of wider scope
   ‣ Volume - TBs of data collected from sensors, transactions and other low-granularity
     data sources
   ‣ Variety - unstructured, semi-structured as well as structured sources
   ‣ Velocity - data arriving in real-time, and analyzed in real-time




 T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
Hadoop and MapReduce

• Apache Hadoop is one of the most well-known Big Data technologies
• Family of open-source products used to store, and analyze distributed datasets
• Hadoop is the enabling framework, automatically parallelises and co-ordinates jobs
• MapReduce is the programming framework
  for filtering, sorting and aggregating data
   ‣ Map : filter data and pass on to reducers
   ‣ Reduce : sort, group and return results
• MapReduce jobs can be written in any
  language (Java etc), but it is complicated




 T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
Big Data in the Context of the Oracle Exalytics Platform

• Exalytics, through in-memory aggregates and InfiniBand connection to Exadata,
  can analyze vast (structured) datasets held in relational and OLAP databases
• Endeca Information Discovery can analyze unstructured and semi-structured sources
• InfiniBand connector to Big Data Applicance + Hadoop connector in OBIEE supports
  analysis via Map/Reduce
• Oracle R distribution + Oracle Enterprise R supports SAS-style statistical analysis
  of large data sets, as part of Oracle Advanced Analytics Option
• OBIEE can access Hadoop datasource through another
  Apache technology called Hive




 T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
New in OBIEE 11.1.1.7 : Hadoop Connectivity through Hive

• MapReduce jobs are typically written in Java, but Hive can make this simpler
• Hive is a query environment over Hadoop/MapReduce to support SQL-like queries
• Hive server accepts HiveQL queries via HiveODBC or HiveJDBC, automatically
  creates MapReduce jobs against data previously loaded into the Hive HDFS tables
• Approach used by ODI and OBIEE
  to gain access to Hadoop data
• Allows Hadoop data to be accessed just like
  any other data source (sort of...)




T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
An example Hive Query Session: Connect and Display Table List

[oracle@bigdatalite ~]$ hive
Hive history file=/tmp/oracle/hive_job_log_oracle_201304170403_1991392312.txt

hive> show tables;
OK
dwh_customer
dwh_customer_tmp
                               Hive Server lists out all “tables”
i_dwh_customer
                               that have been defined within the
ratings
                               Hive
src_customer
                               environment
src_sales_person
weblog
weblog_preprocessed
weblog_sessionized
Time taken: 2.925 seconds




 T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
An example Hive Query Session: Display Table Row Count

hive> select count(*) from src_customer;
                                                          Request count(*) from table

Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
                                                                     Hive server generates
set hive.exec.reducers.bytes.per.reducer=
In order to limit the maximum number of reducers:                    MapReduce job to “map” table
set hive.exec.reducers.max=                                          key/value pairs, and then reduce
In order to set a constant number of reducers:                       the results to table count
set mapred.reduce.tasks=
Starting Job = job_201303171815_0003, Tracking URL =
  http://localhost.localdomain:50030/jobdetails.jsp?jobid=job_201303171815_0003
Kill Command = /usr/lib/hadoop-0.20/bin/
 hadoop job -Dmapred.job.tracker=localhost.localdomain:8021 -kill job_201303171815_0003

2013-04-17 04:06:59,867 Stage-1 map   =   0%, reduce =   0%
2013-04-17 04:07:03,926 Stage-1 map   =   100%, reduce   = 0%
2013-04-17 04:07:14,040 Stage-1 map   =   100%, reduce   = 33%                  MapReduce job automatically run
2013-04-17 04:07:15,049 Stage-1 map   =   100%, reduce   = 100%                 by Hive Server
Ended Job = job_201303171815_0003
OK


25
Time taken: 22.21 seconds                                Results returned to user




T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
Importing Hadoop/Hive Metadata into RPD

• HiveODBC driver has to be installed into Windows environment, so that
  BI Administration tool can connect to Hive and return table metadata
• Import as ODBC datasource, change physical DB type to Apache Hadoop afterwards
• Note that OBIEE queries cannot span >1 Hive schema (no table prefixes)

                                                                                 2




            1


                                                                                                        3



T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
Set up ODBC Connection at the OBIEE Server (Linux Only)

• OBIEE 11.1.1.7+ ships with HiveODBC drivers, need to use 7.x versions though
• Configure the ODBC connection in odbc.ini, name needs to match RPD ODBC name
• BI Server should then be able to connect to the Hive server, and Hadoop/MapReduce
 [ODBC Data Sources]
 AnalyticsWeb=Oracle BI Server
 Cluster=Oracle BI Server
 SSL_Sample=Oracle BI Server
 bigdatalite=Oracle 7.1 Apache Hive Wire Protocol

 [bigdatalite]
 Driver=/u01/app/Middleware/Oracle_BI1/common/ODBC/
  Merant/7.0.1/lib/ARhive27.so
 Description=Oracle 7.1 Apache Hive Wire Protocol
 ArraySize=16384
 Database=default
 DefaultLongDataBuffLen=1024
 EnableLongDataBuffLen=1024
 EnableDescribeParam=0
 Hostname=bigdatalite
 LoginTimeout=30
 MaxVarcharSize=2000
 PortNumber=10000
 RemoveColumnQualifiers=0
 StringDescribeType=12
 TransactionMode=0
 UseCurrentSchema=0



T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
Oracle Advanced Analytics : In-Database Predictive Analytics

• In-Database Predictive Analytics and Statistical Analysis
• Massively-Scalable, able to analyze huge volumes of data
• Exposed through SQL and R, enabling broad usage


           Predictive             Comprehensive Predictive Analytic
           Analytics              Platform Built-Inside the DatabaseData Mining, Text MiningStatistical Analysis (built on R)B
                                  ‣Scalable and Parallel
                                  Tightly-Integrated with SQL
                                  Works Inside Exadata and
                                  Big Data Appliance

                                                  Text Mining

                                                   Statistics

                                                  Data Mining




 T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
What is R, and Oracle R Enterprise?

• R is a statistical language similar to
  Base SAS, or SPSS
• Open-source, run by the R Project
  (http://www.r-project.org)
• R environment is a suite of client/server
  products for statistical data manipulation
  and graphical analysis
• Modeling and Analysis performed
  in-memory using “frames”
• Enhanced by community-contributed packages
• R distribute the open-source version of R
  with Oracle Linux
• Oracle R Enterprise extends R to allow analysis
  against frames stored in Oracle tables, views
  and embed R scripts in database PL/SQL packages

T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
Capabilities of R Compared to SQL (Built-In Stats Functions)

• R provides a wide variety of statistical and graphical techniques
• Linear and non-linear modeling, classical statistical tests, time-series analysis
• Classification, clustering and other capabilities
• Matrix arithmetic, with scalar, vector, matrices, list and data frame (aka table) structures
• Extensible through community-contributed packages, and interacts with C++, Java etc
• Available for Oracle Database 11gR2 through the Advanced Analytics Option
• Extends the (free) SQL statistical capabilities provided by Oracle Database
   ‣ Ranking, Windowing, Reporting
   ‣ Lag/Lead, First/Last
   ‣ Linear Regression, Inverse Percentile etc




 T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
Oracle R Enterprise

• Regular R is constrained by only working with in-memory datasets (frames)
• Data from tables and other database structures has to be loaded into memory
• Oracle R Enterprise (ORE) removes this constraint by allowing frames to reside in DB
• Automatically exploits database parallelism, plus Oracle scalability / resilience
• ORE provides three key areas of functionality
   ‣ Embedded R
   ‣ In-Database Statistics Engine
     (R extensions for Oracle SQL)
   ‣ Transparency Layer
     (access RBDMS-based frames)




 T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
Typical R / ORE Topology

• Analyst workstation contains Open Source R client tools
• Models created in-memory on workstation                                                 In-Memory
                                                                                          R Engine
• ORE provides capability to access datasets             Analyst Workstation /
                                                         Laptop
  stored in Oracle RBDMS, transparently                  (2 core,
                                                         16GB RAM)
• Database has ORE embedded within it
                                                                                         In-Memory
• ORE provides data for workstation, and                                                 R Engines
                                                                                         spawned by DB
  can spawn its own R sessions for
  in-database R analysis                                 Oracle Database
                                                         Server with
• Enables lights-out R analysis, plus connectivity       ORE

  to Hadoop and Map/Reduce via
  Oracle R Connector for Hadoop                                          ORE Hadoop Connector


                                                                                           Hadoop Server
                                                                                           (Oracle Big
                                                                                           Data Appliance)




 T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
Oracle Exalytics In-Memory Machine as a Platform for Endeca & R

• Exalytics server acts as an
  analysis workstation “on steroids”
• 1TB Ram + 40 cores for multiple R engines
• Infiniband connection to Exadata and ORE
• Endeca Information Discovery uses 40 CPU
  cores for massively parallel indexing, analysis
• More of Endeca Server datastore in RAM
• OBIEE uses RAM and cores for TimesTen
  datamart, in-memory data source federation
  and Presentation Server caching, performance
• All products benefit from high-end server
  features and fast connectivity to Exadata + BDA




 T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
An Example OBIEE / Endeca / R on Exalytics Scenario

• “Airline On-Time Performance and Causes of Flight Delays” dataset
• Provide by Bureau of Transportation Statistics, Research and Innovative Technology
  Administration, United States Department of Transportation
• Dataset containing 123M rows of non-stop US domestic flight legs
• Source and destination airports, operator, aircraft type
• Type and duration of delay, delay reason
• Freely-available “big data” set
• What can OBIEE, R and EID tell us?
   ‣ OBIEE - dashboard analysis + drilling
   ‣ EID - discovery, and analysis of supporting
     information describing delays, reasons etc
   ‣ ORE - deep insight into specific questions




 T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
Endeca Information Discovery Search/Analytic Dashboard

• Search capabilities of EID
  help us explore messy and
  unfamiliar data
• “McDonel” corrects to “McDonnell”
• Matches to all variants of
  “McDonnell Douglas”
• Dashboard tells us that
  Delta Airlines has suffered
                                                                                   Any variant of
  most delays using McDonnell                                                      McDonnell
                                                                                   Douglas is
  Douglas aircraft, regardless of                                                  retrieved.
  variations of spelling




 T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
Discovery Environment Provides Many Ways to Explore Data



 “San Fran” provides completion
 suggestions across all attributes,
 allowing us to discover many
 representations of “San Fran” that
 may be present in the data



                                                                                         Unlike search engines, EID
                                                                                         has a rich, SQL-like language for
                                                                                         aggregating and calculations, to
 Similarly, with each step of the
                                                                                         populate graphs and visualizations
 data exploration, all available
                                                                                         in the dashboard
 records in current filter set are
 summarised by facets - prompts
 for every available attribute,
 populated by only the valid values
 that will lead to non-empty results
 sets, allowing users to uncover
 relationships and patterns in the
 data using attributes that they
 may not even be aware existed




T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
Text Analytics through EID Lexical / Parsing / Enrichment Features

• Flexible key/value data model and unstructured text enrichment capabilities of EID allow
  text analytics to be combined with data discovery and analytics




                           Most common issue for older,                              Whilst for newer 777, most
                           MD-88 aircraft is “corroded or                            common issue is “inoperative
                           cracked skin on the fuselage”                             emergency lights in the cabin”




 T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
Exalytics OBIEE Dashboard : “Speed of Thought” Analysis



                     123m rows of data, analyzed live
                     with detail in Oracle DB, and
                     aggregates in TimesTen




                  “Go-less” prompts and dashboard
                  controls for instant response to
                  filter changes



                                                          Interactive visuals, in the form of
                                                          maps, graphs, tables, scorecards
                                                          and KPIs




T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
Using R to Answer In-Depth, Statistical Questions

• Are some airports more prone to delays than others?
• Are some days of the week likely to see fewer delays than others?
   ‣ Are these differences significant?
• How do arrival delay distributions differ for the best and worst 3 airlines compared to the
  industry?
   ‣ Are there significant differences among airlines?
• For American Airlines, how has the distribution of delays for departures and arrivals
  evolved over time?
• How do average annual arrival delays compare across select airlines?
   ‣ What is the underlying trend for each airline?




 T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
Preparing the Dataset for R, and Running R Queries

• Create R frames using data
  from Oracle RDBMS, using
  ORE transparency layer
• Create R queries to manipulate
  flight delays data
• Build regression models
• Score and rank data
                                                             ontimeSubset <- subset(ONTIME_S, UNIQUECARRIER %in%
• 40 cores and 1TB RAM in Exalytics                          c("AA", "AS", "CO", "DL","WN","NW")) res22
                                                             <- with(ontimeSubset, tapply(ARRDELAY,
  allows multiple R engines to                               list(UNIQUECARRIER, YEAR), mean, na.rm = TRUE))
                                                             g_range <- range(0, res22, na.rm = TRUE)
  be spawned, processing larger                              rindex <- seq_len(nrow(res22))
                                                             cindex <- seq_len(ncol(res22))
  datasets than desktop workstation                          par(mfrow = c(2,3))
                                                             for(i in rindex) {
  could support                                                temp <- data.frame(index = cindex, avg_delay = res22[i,])
                                                               plot(avg_delay ~ index, data = temp, col = "black",
                                                                    axes = FALSE, ylim = g_range, xlab = "", ylab = "",
                                                             main = attr(res22, "dimnames")[[1]][i])
                                                             axis(1, at = cindex, labels = attr(res22, "dimnames")[[2]])
                                                             axis(2, at = 0:ceiling(g_range[2]))
                                                             abline(lm(avg_delay ~ index, data = temp), col = "green")
                                                             lines(lowess(temp$index, temp$avg_delay), col="red")}



T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
Integrating with OBIEE and Oracle BI Publisher

• R scripts can be embedded in BI Publisher data models
• Results returned as image vectors in XML, and rendered as BI Publisher output
• R scripts can also be referenced in functions etc and included in OBIEE RPD




 T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
R Analysis Output within the OBIEE Dashboard




                                             Regression analysis used to
                                             predict average delay for a route,
                                             using ORE integration within
                                             OBIEE BI Repository




     Display flight delay per airport for
     top N busiest airports
     with parameters that are passed to
     live R engines, using R script in BIP
     data model



T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
Summary

• Analytics options available with Oracle Database and Oracle Fusion Middleware support
  a wide range of analytic tools and engines
• Oracle Exalytics is an excellent platform to run these on, based on RAM and CPU #
• OBIEE, with TimesTen for Exalytics and the Summary Advisor, supports
  “speed of thought” analytics using a rich, interactive dashboard
• Endeca Information Discovery provides a search / analytic interface to enable you to
  discover the questions that need answering
• Oracle R Enterprise, part of the Advanced Analytics Option for Oracle Database,
  enables deep analysis and insights using the R statistical language and environment
• OBIEE can now connect to Hadoop/MapReduce, through Hive
• A combined, integrated analysis toolset based on an Oracle Engineered System




 T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
High Speed, Big Data Analysis using Exalytics
Mark Rittman, Technical Director, Rittman Mead
OUGN 2013 Conference, April 2013, Oslo

T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com

Mais conteúdo relacionado

Mais procurados

Hadoop Architecture Options for Existing Enterprise DataWarehouse
Hadoop Architecture Options for Existing Enterprise DataWarehouseHadoop Architecture Options for Existing Enterprise DataWarehouse
Hadoop Architecture Options for Existing Enterprise DataWarehouseAsis Mohanty
 
Apache Spark Overview @ ferret
Apache Spark Overview @ ferretApache Spark Overview @ ferret
Apache Spark Overview @ ferretAndrii Gakhov
 
Large scale ETL with Hadoop
Large scale ETL with HadoopLarge scale ETL with Hadoop
Large scale ETL with HadoopOReillyStrata
 
Column Stores and Google BigQuery
Column Stores and Google BigQueryColumn Stores and Google BigQuery
Column Stores and Google BigQueryCsaba Toth
 
Introduction to the Hadoop EcoSystem
Introduction to the Hadoop EcoSystemIntroduction to the Hadoop EcoSystem
Introduction to the Hadoop EcoSystemShivaji Dutta
 
Big Data and Hadoop Introduction
 Big Data and Hadoop Introduction Big Data and Hadoop Introduction
Big Data and Hadoop IntroductionDzung Nguyen
 
Hadoop Ecosystem
Hadoop EcosystemHadoop Ecosystem
Hadoop EcosystemLior Sidi
 
Extending Hadoop for Fun & Profit
Extending Hadoop for Fun & ProfitExtending Hadoop for Fun & Profit
Extending Hadoop for Fun & ProfitMilind Bhandarkar
 
Spark, Python and Parquet
Spark, Python and Parquet Spark, Python and Parquet
Spark, Python and Parquet odsc
 
Introduction to the Hadoop Ecosystem with Hadoop 2.0 aka YARN (Java Serbia Ed...
Introduction to the Hadoop Ecosystem with Hadoop 2.0 aka YARN (Java Serbia Ed...Introduction to the Hadoop Ecosystem with Hadoop 2.0 aka YARN (Java Serbia Ed...
Introduction to the Hadoop Ecosystem with Hadoop 2.0 aka YARN (Java Serbia Ed...Uwe Printz
 
Paradigmas de procesamiento en Big Data: estado actual, tendencias y oportu...
Paradigmas de procesamiento en  Big Data: estado actual,  tendencias y oportu...Paradigmas de procesamiento en  Big Data: estado actual,  tendencias y oportu...
Paradigmas de procesamiento en Big Data: estado actual, tendencias y oportu...Facultad de Informática UCM
 
Basics of big data analytics hadoop
Basics of big data analytics hadoopBasics of big data analytics hadoop
Basics of big data analytics hadoopAmbuj Kumar
 
The Evolution of the Hadoop Ecosystem
The Evolution of the Hadoop EcosystemThe Evolution of the Hadoop Ecosystem
The Evolution of the Hadoop EcosystemCloudera, Inc.
 
The Zoo Expands: Labrador *Loves* Elephant, Thanks to Hamster
The Zoo Expands: Labrador *Loves* Elephant, Thanks to HamsterThe Zoo Expands: Labrador *Loves* Elephant, Thanks to Hamster
The Zoo Expands: Labrador *Loves* Elephant, Thanks to HamsterMilind Bhandarkar
 
Flink in Zalando's world of Microservices
Flink in Zalando's world of Microservices   Flink in Zalando's world of Microservices
Flink in Zalando's world of Microservices ZalandoHayley
 
Apache Cassandra Lunch #50: Machine Learning with Spark + Cassandra
Apache Cassandra Lunch #50: Machine Learning with Spark + CassandraApache Cassandra Lunch #50: Machine Learning with Spark + Cassandra
Apache Cassandra Lunch #50: Machine Learning with Spark + CassandraAnant Corporation
 
Apache HAWQ and Apache MADlib: Journey to Apache
Apache HAWQ and Apache MADlib: Journey to ApacheApache HAWQ and Apache MADlib: Journey to Apache
Apache HAWQ and Apache MADlib: Journey to ApachePivotalOpenSourceHub
 
Introduction to the Hadoop Ecosystem (FrOSCon Edition)
Introduction to the Hadoop Ecosystem (FrOSCon Edition)Introduction to the Hadoop Ecosystem (FrOSCon Edition)
Introduction to the Hadoop Ecosystem (FrOSCon Edition)Uwe Printz
 

Mais procurados (20)

Hadoop Architecture Options for Existing Enterprise DataWarehouse
Hadoop Architecture Options for Existing Enterprise DataWarehouseHadoop Architecture Options for Existing Enterprise DataWarehouse
Hadoop Architecture Options for Existing Enterprise DataWarehouse
 
Apache Spark Overview
Apache Spark OverviewApache Spark Overview
Apache Spark Overview
 
Apache Spark Overview @ ferret
Apache Spark Overview @ ferretApache Spark Overview @ ferret
Apache Spark Overview @ ferret
 
Large scale ETL with Hadoop
Large scale ETL with HadoopLarge scale ETL with Hadoop
Large scale ETL with Hadoop
 
Column Stores and Google BigQuery
Column Stores and Google BigQueryColumn Stores and Google BigQuery
Column Stores and Google BigQuery
 
Introduction to the Hadoop EcoSystem
Introduction to the Hadoop EcoSystemIntroduction to the Hadoop EcoSystem
Introduction to the Hadoop EcoSystem
 
Big Data and Hadoop Introduction
 Big Data and Hadoop Introduction Big Data and Hadoop Introduction
Big Data and Hadoop Introduction
 
Hadoop Ecosystem
Hadoop EcosystemHadoop Ecosystem
Hadoop Ecosystem
 
Extending Hadoop for Fun & Profit
Extending Hadoop for Fun & ProfitExtending Hadoop for Fun & Profit
Extending Hadoop for Fun & Profit
 
Spark, Python and Parquet
Spark, Python and Parquet Spark, Python and Parquet
Spark, Python and Parquet
 
Introduction to the Hadoop Ecosystem with Hadoop 2.0 aka YARN (Java Serbia Ed...
Introduction to the Hadoop Ecosystem with Hadoop 2.0 aka YARN (Java Serbia Ed...Introduction to the Hadoop Ecosystem with Hadoop 2.0 aka YARN (Java Serbia Ed...
Introduction to the Hadoop Ecosystem with Hadoop 2.0 aka YARN (Java Serbia Ed...
 
Paradigmas de procesamiento en Big Data: estado actual, tendencias y oportu...
Paradigmas de procesamiento en  Big Data: estado actual,  tendencias y oportu...Paradigmas de procesamiento en  Big Data: estado actual,  tendencias y oportu...
Paradigmas de procesamiento en Big Data: estado actual, tendencias y oportu...
 
Basics of big data analytics hadoop
Basics of big data analytics hadoopBasics of big data analytics hadoop
Basics of big data analytics hadoop
 
The Evolution of the Hadoop Ecosystem
The Evolution of the Hadoop EcosystemThe Evolution of the Hadoop Ecosystem
The Evolution of the Hadoop Ecosystem
 
The Zoo Expands: Labrador *Loves* Elephant, Thanks to Hamster
The Zoo Expands: Labrador *Loves* Elephant, Thanks to HamsterThe Zoo Expands: Labrador *Loves* Elephant, Thanks to Hamster
The Zoo Expands: Labrador *Loves* Elephant, Thanks to Hamster
 
Hadoop to spark-v2
Hadoop to spark-v2Hadoop to spark-v2
Hadoop to spark-v2
 
Flink in Zalando's world of Microservices
Flink in Zalando's world of Microservices   Flink in Zalando's world of Microservices
Flink in Zalando's world of Microservices
 
Apache Cassandra Lunch #50: Machine Learning with Spark + Cassandra
Apache Cassandra Lunch #50: Machine Learning with Spark + CassandraApache Cassandra Lunch #50: Machine Learning with Spark + Cassandra
Apache Cassandra Lunch #50: Machine Learning with Spark + Cassandra
 
Apache HAWQ and Apache MADlib: Journey to Apache
Apache HAWQ and Apache MADlib: Journey to ApacheApache HAWQ and Apache MADlib: Journey to Apache
Apache HAWQ and Apache MADlib: Journey to Apache
 
Introduction to the Hadoop Ecosystem (FrOSCon Edition)
Introduction to the Hadoop Ecosystem (FrOSCon Edition)Introduction to the Hadoop Ecosystem (FrOSCon Edition)
Introduction to the Hadoop Ecosystem (FrOSCon Edition)
 

Semelhante a Ougn2013 high speed, in-memory big data analysis with oracle exalytics

Part 1 - Introduction to Hadoop and Big Data Technologies for Oracle BI & DW ...
Part 1 - Introduction to Hadoop and Big Data Technologies for Oracle BI & DW ...Part 1 - Introduction to Hadoop and Big Data Technologies for Oracle BI & DW ...
Part 1 - Introduction to Hadoop and Big Data Technologies for Oracle BI & DW ...Mark Rittman
 
Deep-Dive into Big Data ETL with ODI12c and Oracle Big Data Connectors
Deep-Dive into Big Data ETL with ODI12c and Oracle Big Data ConnectorsDeep-Dive into Big Data ETL with ODI12c and Oracle Big Data Connectors
Deep-Dive into Big Data ETL with ODI12c and Oracle Big Data ConnectorsMark Rittman
 
Part 4 - Hadoop Data Output and Reporting using OBIEE11g
Part 4 - Hadoop Data Output and Reporting using OBIEE11gPart 4 - Hadoop Data Output and Reporting using OBIEE11g
Part 4 - Hadoop Data Output and Reporting using OBIEE11gMark Rittman
 
ODI11g, Hadoop and "Big Data" Sources
ODI11g, Hadoop and "Big Data" SourcesODI11g, Hadoop and "Big Data" Sources
ODI11g, Hadoop and "Big Data" SourcesMark Rittman
 
IBANK - Oracle developers-guide
IBANK - Oracle developers-guide IBANK - Oracle developers-guide
IBANK - Oracle developers-guide ibankuk
 
How to Integrate OBIEE and Essbase / EPM Suite (OOW 2012)
How to Integrate OBIEE and Essbase / EPM Suite (OOW 2012)How to Integrate OBIEE and Essbase / EPM Suite (OOW 2012)
How to Integrate OBIEE and Essbase / EPM Suite (OOW 2012)Mark Rittman
 
OBIEE, Endeca, Hadoop and ORE Development (on Exalytics) (ODTUG 2013)
OBIEE, Endeca, Hadoop and ORE Development (on Exalytics) (ODTUG 2013)OBIEE, Endeca, Hadoop and ORE Development (on Exalytics) (ODTUG 2013)
OBIEE, Endeca, Hadoop and ORE Development (on Exalytics) (ODTUG 2013)Mark Rittman
 
Inside Oracle Exalytics and Oracle TimesTen for Exalytics - Hotsos 2012
Inside Oracle Exalytics and Oracle TimesTen for Exalytics - Hotsos 2012Inside Oracle Exalytics and Oracle TimesTen for Exalytics - Hotsos 2012
Inside Oracle Exalytics and Oracle TimesTen for Exalytics - Hotsos 2012Mark Rittman
 
Delivering the Data Factory, Data Reservoir and a Scalable Oracle Big Data Ar...
Delivering the Data Factory, Data Reservoir and a Scalable Oracle Big Data Ar...Delivering the Data Factory, Data Reservoir and a Scalable Oracle Big Data Ar...
Delivering the Data Factory, Data Reservoir and a Scalable Oracle Big Data Ar...Mark Rittman
 
In-Memory Oracle BI Applications (UKOUG Analytics Event, July 2013)
In-Memory Oracle BI Applications (UKOUG Analytics Event, July 2013)In-Memory Oracle BI Applications (UKOUG Analytics Event, July 2013)
In-Memory Oracle BI Applications (UKOUG Analytics Event, July 2013)Mark Rittman
 
ODI12c as your Big Data Integration Hub
ODI12c as your Big Data Integration HubODI12c as your Big Data Integration Hub
ODI12c as your Big Data Integration HubMark Rittman
 
Big Data & Oracle Technologies
Big Data & Oracle TechnologiesBig Data & Oracle Technologies
Big Data & Oracle TechnologiesOleksii Movchaniuk
 
ODI 11g in the Enterprise - BIWA 2013
ODI 11g in the Enterprise - BIWA 2013ODI 11g in the Enterprise - BIWA 2013
ODI 11g in the Enterprise - BIWA 2013Mark Rittman
 
Gluent New World #02 - SQL-on-Hadoop : A bit of History, Current State-of-the...
Gluent New World #02 - SQL-on-Hadoop : A bit of History, Current State-of-the...Gluent New World #02 - SQL-on-Hadoop : A bit of History, Current State-of-the...
Gluent New World #02 - SQL-on-Hadoop : A bit of History, Current State-of-the...Mark Rittman
 
Deploying OBIEE in the Cloud - Oracle Openworld 2014
Deploying OBIEE in the Cloud - Oracle Openworld 2014Deploying OBIEE in the Cloud - Oracle Openworld 2014
Deploying OBIEE in the Cloud - Oracle Openworld 2014Mark Rittman
 
Oracle Exalytics - Tips and Experiences from the Field (Enkitec E4 Conference...
Oracle Exalytics - Tips and Experiences from the Field (Enkitec E4 Conference...Oracle Exalytics - Tips and Experiences from the Field (Enkitec E4 Conference...
Oracle Exalytics - Tips and Experiences from the Field (Enkitec E4 Conference...Mark Rittman
 
IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...
IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...
IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...Mark Rittman
 
Using Oracle Big Data SQL 3.0 to add Hadoop & NoSQL to your Oracle Data Wareh...
Using Oracle Big Data SQL 3.0 to add Hadoop & NoSQL to your Oracle Data Wareh...Using Oracle Big Data SQL 3.0 to add Hadoop & NoSQL to your Oracle Data Wareh...
Using Oracle Big Data SQL 3.0 to add Hadoop & NoSQL to your Oracle Data Wareh...Mark Rittman
 
What is Big Data Discovery, and how it complements traditional business anal...
What is Big Data Discovery, and how it complements  traditional business anal...What is Big Data Discovery, and how it complements  traditional business anal...
What is Big Data Discovery, and how it complements traditional business anal...Mark Rittman
 
OBIEE & Essbase Integration with Oracle BI Foundation 11.1.1.7 (ODTUG 2013)
OBIEE & Essbase Integration with Oracle BI Foundation 11.1.1.7 (ODTUG 2013)OBIEE & Essbase Integration with Oracle BI Foundation 11.1.1.7 (ODTUG 2013)
OBIEE & Essbase Integration with Oracle BI Foundation 11.1.1.7 (ODTUG 2013)Mark Rittman
 

Semelhante a Ougn2013 high speed, in-memory big data analysis with oracle exalytics (20)

Part 1 - Introduction to Hadoop and Big Data Technologies for Oracle BI & DW ...
Part 1 - Introduction to Hadoop and Big Data Technologies for Oracle BI & DW ...Part 1 - Introduction to Hadoop and Big Data Technologies for Oracle BI & DW ...
Part 1 - Introduction to Hadoop and Big Data Technologies for Oracle BI & DW ...
 
Deep-Dive into Big Data ETL with ODI12c and Oracle Big Data Connectors
Deep-Dive into Big Data ETL with ODI12c and Oracle Big Data ConnectorsDeep-Dive into Big Data ETL with ODI12c and Oracle Big Data Connectors
Deep-Dive into Big Data ETL with ODI12c and Oracle Big Data Connectors
 
Part 4 - Hadoop Data Output and Reporting using OBIEE11g
Part 4 - Hadoop Data Output and Reporting using OBIEE11gPart 4 - Hadoop Data Output and Reporting using OBIEE11g
Part 4 - Hadoop Data Output and Reporting using OBIEE11g
 
ODI11g, Hadoop and "Big Data" Sources
ODI11g, Hadoop and "Big Data" SourcesODI11g, Hadoop and "Big Data" Sources
ODI11g, Hadoop and "Big Data" Sources
 
IBANK - Oracle developers-guide
IBANK - Oracle developers-guide IBANK - Oracle developers-guide
IBANK - Oracle developers-guide
 
How to Integrate OBIEE and Essbase / EPM Suite (OOW 2012)
How to Integrate OBIEE and Essbase / EPM Suite (OOW 2012)How to Integrate OBIEE and Essbase / EPM Suite (OOW 2012)
How to Integrate OBIEE and Essbase / EPM Suite (OOW 2012)
 
OBIEE, Endeca, Hadoop and ORE Development (on Exalytics) (ODTUG 2013)
OBIEE, Endeca, Hadoop and ORE Development (on Exalytics) (ODTUG 2013)OBIEE, Endeca, Hadoop and ORE Development (on Exalytics) (ODTUG 2013)
OBIEE, Endeca, Hadoop and ORE Development (on Exalytics) (ODTUG 2013)
 
Inside Oracle Exalytics and Oracle TimesTen for Exalytics - Hotsos 2012
Inside Oracle Exalytics and Oracle TimesTen for Exalytics - Hotsos 2012Inside Oracle Exalytics and Oracle TimesTen for Exalytics - Hotsos 2012
Inside Oracle Exalytics and Oracle TimesTen for Exalytics - Hotsos 2012
 
Delivering the Data Factory, Data Reservoir and a Scalable Oracle Big Data Ar...
Delivering the Data Factory, Data Reservoir and a Scalable Oracle Big Data Ar...Delivering the Data Factory, Data Reservoir and a Scalable Oracle Big Data Ar...
Delivering the Data Factory, Data Reservoir and a Scalable Oracle Big Data Ar...
 
In-Memory Oracle BI Applications (UKOUG Analytics Event, July 2013)
In-Memory Oracle BI Applications (UKOUG Analytics Event, July 2013)In-Memory Oracle BI Applications (UKOUG Analytics Event, July 2013)
In-Memory Oracle BI Applications (UKOUG Analytics Event, July 2013)
 
ODI12c as your Big Data Integration Hub
ODI12c as your Big Data Integration HubODI12c as your Big Data Integration Hub
ODI12c as your Big Data Integration Hub
 
Big Data & Oracle Technologies
Big Data & Oracle TechnologiesBig Data & Oracle Technologies
Big Data & Oracle Technologies
 
ODI 11g in the Enterprise - BIWA 2013
ODI 11g in the Enterprise - BIWA 2013ODI 11g in the Enterprise - BIWA 2013
ODI 11g in the Enterprise - BIWA 2013
 
Gluent New World #02 - SQL-on-Hadoop : A bit of History, Current State-of-the...
Gluent New World #02 - SQL-on-Hadoop : A bit of History, Current State-of-the...Gluent New World #02 - SQL-on-Hadoop : A bit of History, Current State-of-the...
Gluent New World #02 - SQL-on-Hadoop : A bit of History, Current State-of-the...
 
Deploying OBIEE in the Cloud - Oracle Openworld 2014
Deploying OBIEE in the Cloud - Oracle Openworld 2014Deploying OBIEE in the Cloud - Oracle Openworld 2014
Deploying OBIEE in the Cloud - Oracle Openworld 2014
 
Oracle Exalytics - Tips and Experiences from the Field (Enkitec E4 Conference...
Oracle Exalytics - Tips and Experiences from the Field (Enkitec E4 Conference...Oracle Exalytics - Tips and Experiences from the Field (Enkitec E4 Conference...
Oracle Exalytics - Tips and Experiences from the Field (Enkitec E4 Conference...
 
IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...
IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...
IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...
 
Using Oracle Big Data SQL 3.0 to add Hadoop & NoSQL to your Oracle Data Wareh...
Using Oracle Big Data SQL 3.0 to add Hadoop & NoSQL to your Oracle Data Wareh...Using Oracle Big Data SQL 3.0 to add Hadoop & NoSQL to your Oracle Data Wareh...
Using Oracle Big Data SQL 3.0 to add Hadoop & NoSQL to your Oracle Data Wareh...
 
What is Big Data Discovery, and how it complements traditional business anal...
What is Big Data Discovery, and how it complements  traditional business anal...What is Big Data Discovery, and how it complements  traditional business anal...
What is Big Data Discovery, and how it complements traditional business anal...
 
OBIEE & Essbase Integration with Oracle BI Foundation 11.1.1.7 (ODTUG 2013)
OBIEE & Essbase Integration with Oracle BI Foundation 11.1.1.7 (ODTUG 2013)OBIEE & Essbase Integration with Oracle BI Foundation 11.1.1.7 (ODTUG 2013)
OBIEE & Essbase Integration with Oracle BI Foundation 11.1.1.7 (ODTUG 2013)
 

Mais de Mark Rittman

The Future of Analytics, Data Integration and BI on Big Data Platforms
The Future of Analytics, Data Integration and BI on Big Data PlatformsThe Future of Analytics, Data Integration and BI on Big Data Platforms
The Future of Analytics, Data Integration and BI on Big Data PlatformsMark Rittman
 
Using Oracle Big Data Discovey as a Data Scientist's Toolkit
Using Oracle Big Data Discovey as a Data Scientist's ToolkitUsing Oracle Big Data Discovey as a Data Scientist's Toolkit
Using Oracle Big Data Discovey as a Data Scientist's ToolkitMark Rittman
 
From lots of reports (with some data Analysis) 
to Massive Data Analysis (Wit...
From lots of reports (with some data Analysis) 
to Massive Data Analysis (Wit...From lots of reports (with some data Analysis) 
to Massive Data Analysis (Wit...
From lots of reports (with some data Analysis) 
to Massive Data Analysis (Wit...Mark Rittman
 
SQL-on-Hadoop for Analytics + BI: What Are My Options, What's the Future?
SQL-on-Hadoop for Analytics + BI: What Are My Options, What's the Future?SQL-on-Hadoop for Analytics + BI: What Are My Options, What's the Future?
SQL-on-Hadoop for Analytics + BI: What Are My Options, What's the Future?Mark Rittman
 
Social Network Analysis using Oracle Big Data Spatial & Graph (incl. why I di...
Social Network Analysis using Oracle Big Data Spatial & Graph (incl. why I di...Social Network Analysis using Oracle Big Data Spatial & Graph (incl. why I di...
Social Network Analysis using Oracle Big Data Spatial & Graph (incl. why I di...Mark Rittman
 
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...Mark Rittman
 
OTN EMEA Tour 2016 : Deploying Full BI Platforms to Oracle Cloud
OTN EMEA Tour 2016 : Deploying Full BI Platforms to Oracle CloudOTN EMEA Tour 2016 : Deploying Full BI Platforms to Oracle Cloud
OTN EMEA Tour 2016 : Deploying Full BI Platforms to Oracle CloudMark Rittman
 
OTN EMEA TOUR 2016 - OBIEE12c New Features for End-Users, Developers and Sys...
OTN EMEA TOUR 2016  - OBIEE12c New Features for End-Users, Developers and Sys...OTN EMEA TOUR 2016  - OBIEE12c New Features for End-Users, Developers and Sys...
OTN EMEA TOUR 2016 - OBIEE12c New Features for End-Users, Developers and Sys...Mark Rittman
 
Enkitec E4 Barcelona : SQL and Data Integration Futures on Hadoop :
Enkitec E4 Barcelona : SQL and Data Integration Futures on Hadoop : Enkitec E4 Barcelona : SQL and Data Integration Futures on Hadoop :
Enkitec E4 Barcelona : SQL and Data Integration Futures on Hadoop : Mark Rittman
 
Oracle BI Hybrid BI : Mode 1 + Mode 2, Cloud + On-Premise Business Analytics
Oracle BI Hybrid BI : Mode 1 + Mode 2, Cloud + On-Premise Business AnalyticsOracle BI Hybrid BI : Mode 1 + Mode 2, Cloud + On-Premise Business Analytics
Oracle BI Hybrid BI : Mode 1 + Mode 2, Cloud + On-Premise Business AnalyticsMark Rittman
 
Unlock the value in your big data reservoir using oracle big data discovery a...
Unlock the value in your big data reservoir using oracle big data discovery a...Unlock the value in your big data reservoir using oracle big data discovery a...
Unlock the value in your big data reservoir using oracle big data discovery a...Mark Rittman
 
Riga dev day 2016 adding a data reservoir and oracle bdd to extend your ora...
Riga dev day 2016   adding a data reservoir and oracle bdd to extend your ora...Riga dev day 2016   adding a data reservoir and oracle bdd to extend your ora...
Riga dev day 2016 adding a data reservoir and oracle bdd to extend your ora...Mark Rittman
 
Big Data for Oracle Devs - Towards Spark, Real-Time and Predictive Analytics
Big Data for Oracle Devs - Towards Spark, Real-Time and Predictive AnalyticsBig Data for Oracle Devs - Towards Spark, Real-Time and Predictive Analytics
Big Data for Oracle Devs - Towards Spark, Real-Time and Predictive AnalyticsMark Rittman
 
OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...
OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...
OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...Mark Rittman
 
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case Study
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case StudyOracle Big Data Spatial & Graph 
Social Media Analysis - Case Study
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case StudyMark Rittman
 
Deploying Full BI Platforms to Oracle Cloud
Deploying Full BI Platforms to Oracle CloudDeploying Full BI Platforms to Oracle Cloud
Deploying Full BI Platforms to Oracle CloudMark Rittman
 
Adding a Data Reservoir to your Oracle Data Warehouse for Customer 360-Degree...
Adding a Data Reservoir to your Oracle Data Warehouse for Customer 360-Degree...Adding a Data Reservoir to your Oracle Data Warehouse for Customer 360-Degree...
Adding a Data Reservoir to your Oracle Data Warehouse for Customer 360-Degree...Mark Rittman
 
Deploying Full Oracle BI Platforms to Oracle Cloud - OOW2015
Deploying Full Oracle BI Platforms to Oracle Cloud - OOW2015Deploying Full Oracle BI Platforms to Oracle Cloud - OOW2015
Deploying Full Oracle BI Platforms to Oracle Cloud - OOW2015Mark Rittman
 
End to-end hadoop development using OBIEE, ODI, Oracle Big Data SQL and Oracl...
End to-end hadoop development using OBIEE, ODI, Oracle Big Data SQL and Oracl...End to-end hadoop development using OBIEE, ODI, Oracle Big Data SQL and Oracl...
End to-end hadoop development using OBIEE, ODI, Oracle Big Data SQL and Oracl...Mark Rittman
 
OBIEE11g Seminar by Mark Rittman for OU Expert Summit, Dubai 2015
OBIEE11g Seminar by Mark Rittman for OU Expert Summit, Dubai 2015OBIEE11g Seminar by Mark Rittman for OU Expert Summit, Dubai 2015
OBIEE11g Seminar by Mark Rittman for OU Expert Summit, Dubai 2015Mark Rittman
 

Mais de Mark Rittman (20)

The Future of Analytics, Data Integration and BI on Big Data Platforms
The Future of Analytics, Data Integration and BI on Big Data PlatformsThe Future of Analytics, Data Integration and BI on Big Data Platforms
The Future of Analytics, Data Integration and BI on Big Data Platforms
 
Using Oracle Big Data Discovey as a Data Scientist's Toolkit
Using Oracle Big Data Discovey as a Data Scientist's ToolkitUsing Oracle Big Data Discovey as a Data Scientist's Toolkit
Using Oracle Big Data Discovey as a Data Scientist's Toolkit
 
From lots of reports (with some data Analysis) 
to Massive Data Analysis (Wit...
From lots of reports (with some data Analysis) 
to Massive Data Analysis (Wit...From lots of reports (with some data Analysis) 
to Massive Data Analysis (Wit...
From lots of reports (with some data Analysis) 
to Massive Data Analysis (Wit...
 
SQL-on-Hadoop for Analytics + BI: What Are My Options, What's the Future?
SQL-on-Hadoop for Analytics + BI: What Are My Options, What's the Future?SQL-on-Hadoop for Analytics + BI: What Are My Options, What's the Future?
SQL-on-Hadoop for Analytics + BI: What Are My Options, What's the Future?
 
Social Network Analysis using Oracle Big Data Spatial & Graph (incl. why I di...
Social Network Analysis using Oracle Big Data Spatial & Graph (incl. why I di...Social Network Analysis using Oracle Big Data Spatial & Graph (incl. why I di...
Social Network Analysis using Oracle Big Data Spatial & Graph (incl. why I di...
 
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
 
OTN EMEA Tour 2016 : Deploying Full BI Platforms to Oracle Cloud
OTN EMEA Tour 2016 : Deploying Full BI Platforms to Oracle CloudOTN EMEA Tour 2016 : Deploying Full BI Platforms to Oracle Cloud
OTN EMEA Tour 2016 : Deploying Full BI Platforms to Oracle Cloud
 
OTN EMEA TOUR 2016 - OBIEE12c New Features for End-Users, Developers and Sys...
OTN EMEA TOUR 2016  - OBIEE12c New Features for End-Users, Developers and Sys...OTN EMEA TOUR 2016  - OBIEE12c New Features for End-Users, Developers and Sys...
OTN EMEA TOUR 2016 - OBIEE12c New Features for End-Users, Developers and Sys...
 
Enkitec E4 Barcelona : SQL and Data Integration Futures on Hadoop :
Enkitec E4 Barcelona : SQL and Data Integration Futures on Hadoop : Enkitec E4 Barcelona : SQL and Data Integration Futures on Hadoop :
Enkitec E4 Barcelona : SQL and Data Integration Futures on Hadoop :
 
Oracle BI Hybrid BI : Mode 1 + Mode 2, Cloud + On-Premise Business Analytics
Oracle BI Hybrid BI : Mode 1 + Mode 2, Cloud + On-Premise Business AnalyticsOracle BI Hybrid BI : Mode 1 + Mode 2, Cloud + On-Premise Business Analytics
Oracle BI Hybrid BI : Mode 1 + Mode 2, Cloud + On-Premise Business Analytics
 
Unlock the value in your big data reservoir using oracle big data discovery a...
Unlock the value in your big data reservoir using oracle big data discovery a...Unlock the value in your big data reservoir using oracle big data discovery a...
Unlock the value in your big data reservoir using oracle big data discovery a...
 
Riga dev day 2016 adding a data reservoir and oracle bdd to extend your ora...
Riga dev day 2016   adding a data reservoir and oracle bdd to extend your ora...Riga dev day 2016   adding a data reservoir and oracle bdd to extend your ora...
Riga dev day 2016 adding a data reservoir and oracle bdd to extend your ora...
 
Big Data for Oracle Devs - Towards Spark, Real-Time and Predictive Analytics
Big Data for Oracle Devs - Towards Spark, Real-Time and Predictive AnalyticsBig Data for Oracle Devs - Towards Spark, Real-Time and Predictive Analytics
Big Data for Oracle Devs - Towards Spark, Real-Time and Predictive Analytics
 
OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...
OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...
OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...
 
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case Study
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case StudyOracle Big Data Spatial & Graph 
Social Media Analysis - Case Study
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case Study
 
Deploying Full BI Platforms to Oracle Cloud
Deploying Full BI Platforms to Oracle CloudDeploying Full BI Platforms to Oracle Cloud
Deploying Full BI Platforms to Oracle Cloud
 
Adding a Data Reservoir to your Oracle Data Warehouse for Customer 360-Degree...
Adding a Data Reservoir to your Oracle Data Warehouse for Customer 360-Degree...Adding a Data Reservoir to your Oracle Data Warehouse for Customer 360-Degree...
Adding a Data Reservoir to your Oracle Data Warehouse for Customer 360-Degree...
 
Deploying Full Oracle BI Platforms to Oracle Cloud - OOW2015
Deploying Full Oracle BI Platforms to Oracle Cloud - OOW2015Deploying Full Oracle BI Platforms to Oracle Cloud - OOW2015
Deploying Full Oracle BI Platforms to Oracle Cloud - OOW2015
 
End to-end hadoop development using OBIEE, ODI, Oracle Big Data SQL and Oracl...
End to-end hadoop development using OBIEE, ODI, Oracle Big Data SQL and Oracl...End to-end hadoop development using OBIEE, ODI, Oracle Big Data SQL and Oracl...
End to-end hadoop development using OBIEE, ODI, Oracle Big Data SQL and Oracl...
 
OBIEE11g Seminar by Mark Rittman for OU Expert Summit, Dubai 2015
OBIEE11g Seminar by Mark Rittman for OU Expert Summit, Dubai 2015OBIEE11g Seminar by Mark Rittman for OU Expert Summit, Dubai 2015
OBIEE11g Seminar by Mark Rittman for OU Expert Summit, Dubai 2015
 

Ougn2013 high speed, in-memory big data analysis with oracle exalytics

  • 1. High Speed, Big Data Analysis using Exalytics Mark Rittman, Technical Director, Rittman Mead OUGN 2013 Conference, April 2013, Oslo T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
  • 2. Mark Rittman • Mark Rittman, Co-Founder of Rittman Mead • Oracle ACE Director, specialising in Oracle BI&DW • 14 Years Experience with Oracle Technology • Regular columnist for Oracle Magazine • Author of two Oracle Press Oracle BI books ‣ Oracle Business Intelligence Developers Guide ‣ Oracle Exalytics Revealed • Writer for Rittman Mead Blog : http://www.rittmanmead.com/blog • Email : mark.rittman@rittmanmead.com • Twitter : @markrittman T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
  • 3. About Rittman Mead • Oracle BI and DW platinum partner • World leading specialist partner for technical excellence, solutions delivery and innovation in Oracle BI • Approximately 30 consultants worldwide • All expert in Oracle BI and DW • UK based • Offices in US, Europe (Belgium) and India • Skills in broad range of supporting Oracle tools: ‣ OBIEE ‣ OBIA ‣ ODIEE ‣ Essbase, Oracle OLAP ‣ GoldenGate ‣ Exadata T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
  • 4. What is Oracle Exalytics? • Hardware Element ‣ Sun Fire X4470 M2 server ‣ 1TB RAM, 40 Cores, 3.6TB HDD • Software Element ‣ OBIEE 11.1.1.6 with Exalytics Enhancements ‣ Oracle Essbase 11.1.2 with Exalytics Enhancements ‣ Oracle TimesTen 11.2.2.2 for Exalytics ‣ New in v1.1 : Support for Oracle Endeca Information Discovery, Golden Gate, ODI ‣ Runs on 64-bit Oracle Linux (Exalogic distribution) • OBIEE and Essbase are licensed as Oracle BI Foundation • Exalytics features can only be used in conjunction with Exalytics hardware T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
  • 5. Exalytics as the Exa-Machine for BI • Runs the BI layer on a high-performance, multi-core, 1TB server • In-memory cache used to accelerate the BI part of the stack • If Exadata addresses 80% of the query performance, Exalytics addresses the remaining 20% Oracle BI ‣ Consistent response times for queries ‣ In-memory caching of aggregates ‣ 40 cores for high concurrency In-Memory DB/Cache ‣ Re-engineered BI and OLAP software that assumes 40 cores and 1TB RAM ERP/Apps DW T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
  • 6. Enables High-Density Analysis of Big Data • BI tier is rarely the bottleneck, but it can be if very dense visualizations are used ‣ Sparklines, grid of charts etc • Exalytics’ 40 cores and 1TB RAM make higher density presentation viable ‣ Single query sent to the database ‣ Exalytics breaks data up to create microcharts • Also helps support high numbers of concurrent users (100+) T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
  • 7. Also Supports Essbase, and Endeca Information Discovery • In-Memory Essbase for planning, budgeting and sales analysis-style OLAP applications • Endeca Information Discovery for search/analytic applications against diverse data Oracle Exalytics In-Memory Machine T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
  • 8. Oracle’s Wider Strategy for Business Analytics • Connect to all of your data, from all your sources, • Subject it to the full range of possible inquiry • Package solutions for known problems and fixed sources, and • Deploy to PCs and mobile devices, on premise or in the cloud Any Data, Full Range of Integrated On Premise, Any Source Analytics Analytic Apps On Cloud, On Mobile T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
  • 9. Big Data : New Datasets of Higher Variety, Volume and Velocity • Traditional BI datasets have been relatively small, and well structured • Financial data and other metrics, with attributes and hierarchies to slice-and-dice it • Big data is all about collecting and analyzing data sets of wider scope ‣ Volume - TBs of data collected from sensors, transactions and other low-granularity data sources ‣ Variety - unstructured, semi-structured as well as structured sources ‣ Velocity - data arriving in real-time, and analyzed in real-time T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
  • 10. Hadoop and MapReduce • Apache Hadoop is one of the most well-known Big Data technologies • Family of open-source products used to store, and analyze distributed datasets • Hadoop is the enabling framework, automatically parallelises and co-ordinates jobs • MapReduce is the programming framework for filtering, sorting and aggregating data ‣ Map : filter data and pass on to reducers ‣ Reduce : sort, group and return results • MapReduce jobs can be written in any language (Java etc), but it is complicated T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
  • 11. Big Data in the Context of the Oracle Exalytics Platform • Exalytics, through in-memory aggregates and InfiniBand connection to Exadata, can analyze vast (structured) datasets held in relational and OLAP databases • Endeca Information Discovery can analyze unstructured and semi-structured sources • InfiniBand connector to Big Data Applicance + Hadoop connector in OBIEE supports analysis via Map/Reduce • Oracle R distribution + Oracle Enterprise R supports SAS-style statistical analysis of large data sets, as part of Oracle Advanced Analytics Option • OBIEE can access Hadoop datasource through another Apache technology called Hive T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
  • 12. New in OBIEE 11.1.1.7 : Hadoop Connectivity through Hive • MapReduce jobs are typically written in Java, but Hive can make this simpler • Hive is a query environment over Hadoop/MapReduce to support SQL-like queries • Hive server accepts HiveQL queries via HiveODBC or HiveJDBC, automatically creates MapReduce jobs against data previously loaded into the Hive HDFS tables • Approach used by ODI and OBIEE to gain access to Hadoop data • Allows Hadoop data to be accessed just like any other data source (sort of...) T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
  • 13. An example Hive Query Session: Connect and Display Table List [oracle@bigdatalite ~]$ hive Hive history file=/tmp/oracle/hive_job_log_oracle_201304170403_1991392312.txt hive> show tables; OK dwh_customer dwh_customer_tmp Hive Server lists out all “tables” i_dwh_customer that have been defined within the ratings Hive src_customer environment src_sales_person weblog weblog_preprocessed weblog_sessionized Time taken: 2.925 seconds T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
  • 14. An example Hive Query Session: Display Table Row Count hive> select count(*) from src_customer; Request count(*) from table Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks determined at compile time: 1 In order to change the average load for a reducer (in bytes): Hive server generates set hive.exec.reducers.bytes.per.reducer= In order to limit the maximum number of reducers: MapReduce job to “map” table set hive.exec.reducers.max= key/value pairs, and then reduce In order to set a constant number of reducers: the results to table count set mapred.reduce.tasks= Starting Job = job_201303171815_0003, Tracking URL = http://localhost.localdomain:50030/jobdetails.jsp?jobid=job_201303171815_0003 Kill Command = /usr/lib/hadoop-0.20/bin/ hadoop job -Dmapred.job.tracker=localhost.localdomain:8021 -kill job_201303171815_0003 2013-04-17 04:06:59,867 Stage-1 map = 0%, reduce = 0% 2013-04-17 04:07:03,926 Stage-1 map = 100%, reduce = 0% 2013-04-17 04:07:14,040 Stage-1 map = 100%, reduce = 33% MapReduce job automatically run 2013-04-17 04:07:15,049 Stage-1 map = 100%, reduce = 100% by Hive Server Ended Job = job_201303171815_0003 OK 25 Time taken: 22.21 seconds Results returned to user T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
  • 15. Importing Hadoop/Hive Metadata into RPD • HiveODBC driver has to be installed into Windows environment, so that BI Administration tool can connect to Hive and return table metadata • Import as ODBC datasource, change physical DB type to Apache Hadoop afterwards • Note that OBIEE queries cannot span >1 Hive schema (no table prefixes) 2 1 3 T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
  • 16. Set up ODBC Connection at the OBIEE Server (Linux Only) • OBIEE 11.1.1.7+ ships with HiveODBC drivers, need to use 7.x versions though • Configure the ODBC connection in odbc.ini, name needs to match RPD ODBC name • BI Server should then be able to connect to the Hive server, and Hadoop/MapReduce [ODBC Data Sources] AnalyticsWeb=Oracle BI Server Cluster=Oracle BI Server SSL_Sample=Oracle BI Server bigdatalite=Oracle 7.1 Apache Hive Wire Protocol [bigdatalite] Driver=/u01/app/Middleware/Oracle_BI1/common/ODBC/ Merant/7.0.1/lib/ARhive27.so Description=Oracle 7.1 Apache Hive Wire Protocol ArraySize=16384 Database=default DefaultLongDataBuffLen=1024 EnableLongDataBuffLen=1024 EnableDescribeParam=0 Hostname=bigdatalite LoginTimeout=30 MaxVarcharSize=2000 PortNumber=10000 RemoveColumnQualifiers=0 StringDescribeType=12 TransactionMode=0 UseCurrentSchema=0 T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
  • 17. Oracle Advanced Analytics : In-Database Predictive Analytics • In-Database Predictive Analytics and Statistical Analysis • Massively-Scalable, able to analyze huge volumes of data • Exposed through SQL and R, enabling broad usage Predictive Comprehensive Predictive Analytic Analytics Platform Built-Inside the DatabaseData Mining, Text MiningStatistical Analysis (built on R)B ‣Scalable and Parallel Tightly-Integrated with SQL Works Inside Exadata and Big Data Appliance Text Mining Statistics Data Mining T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
  • 18. What is R, and Oracle R Enterprise? • R is a statistical language similar to Base SAS, or SPSS • Open-source, run by the R Project (http://www.r-project.org) • R environment is a suite of client/server products for statistical data manipulation and graphical analysis • Modeling and Analysis performed in-memory using “frames” • Enhanced by community-contributed packages • R distribute the open-source version of R with Oracle Linux • Oracle R Enterprise extends R to allow analysis against frames stored in Oracle tables, views and embed R scripts in database PL/SQL packages T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
  • 19. Capabilities of R Compared to SQL (Built-In Stats Functions) • R provides a wide variety of statistical and graphical techniques • Linear and non-linear modeling, classical statistical tests, time-series analysis • Classification, clustering and other capabilities • Matrix arithmetic, with scalar, vector, matrices, list and data frame (aka table) structures • Extensible through community-contributed packages, and interacts with C++, Java etc • Available for Oracle Database 11gR2 through the Advanced Analytics Option • Extends the (free) SQL statistical capabilities provided by Oracle Database ‣ Ranking, Windowing, Reporting ‣ Lag/Lead, First/Last ‣ Linear Regression, Inverse Percentile etc T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
  • 20. Oracle R Enterprise • Regular R is constrained by only working with in-memory datasets (frames) • Data from tables and other database structures has to be loaded into memory • Oracle R Enterprise (ORE) removes this constraint by allowing frames to reside in DB • Automatically exploits database parallelism, plus Oracle scalability / resilience • ORE provides three key areas of functionality ‣ Embedded R ‣ In-Database Statistics Engine (R extensions for Oracle SQL) ‣ Transparency Layer (access RBDMS-based frames) T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
  • 21. Typical R / ORE Topology • Analyst workstation contains Open Source R client tools • Models created in-memory on workstation In-Memory R Engine • ORE provides capability to access datasets Analyst Workstation / Laptop stored in Oracle RBDMS, transparently (2 core, 16GB RAM) • Database has ORE embedded within it In-Memory • ORE provides data for workstation, and R Engines spawned by DB can spawn its own R sessions for in-database R analysis Oracle Database Server with • Enables lights-out R analysis, plus connectivity ORE to Hadoop and Map/Reduce via Oracle R Connector for Hadoop ORE Hadoop Connector Hadoop Server (Oracle Big Data Appliance) T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
  • 22. Oracle Exalytics In-Memory Machine as a Platform for Endeca & R • Exalytics server acts as an analysis workstation “on steroids” • 1TB Ram + 40 cores for multiple R engines • Infiniband connection to Exadata and ORE • Endeca Information Discovery uses 40 CPU cores for massively parallel indexing, analysis • More of Endeca Server datastore in RAM • OBIEE uses RAM and cores for TimesTen datamart, in-memory data source federation and Presentation Server caching, performance • All products benefit from high-end server features and fast connectivity to Exadata + BDA T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
  • 23. An Example OBIEE / Endeca / R on Exalytics Scenario • “Airline On-Time Performance and Causes of Flight Delays” dataset • Provide by Bureau of Transportation Statistics, Research and Innovative Technology Administration, United States Department of Transportation • Dataset containing 123M rows of non-stop US domestic flight legs • Source and destination airports, operator, aircraft type • Type and duration of delay, delay reason • Freely-available “big data” set • What can OBIEE, R and EID tell us? ‣ OBIEE - dashboard analysis + drilling ‣ EID - discovery, and analysis of supporting information describing delays, reasons etc ‣ ORE - deep insight into specific questions T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
  • 24. Endeca Information Discovery Search/Analytic Dashboard • Search capabilities of EID help us explore messy and unfamiliar data • “McDonel” corrects to “McDonnell” • Matches to all variants of “McDonnell Douglas” • Dashboard tells us that Delta Airlines has suffered Any variant of most delays using McDonnell McDonnell Douglas is Douglas aircraft, regardless of retrieved. variations of spelling T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
  • 25. Discovery Environment Provides Many Ways to Explore Data “San Fran” provides completion suggestions across all attributes, allowing us to discover many representations of “San Fran” that may be present in the data Unlike search engines, EID has a rich, SQL-like language for aggregating and calculations, to Similarly, with each step of the populate graphs and visualizations data exploration, all available in the dashboard records in current filter set are summarised by facets - prompts for every available attribute, populated by only the valid values that will lead to non-empty results sets, allowing users to uncover relationships and patterns in the data using attributes that they may not even be aware existed T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
  • 26. Text Analytics through EID Lexical / Parsing / Enrichment Features • Flexible key/value data model and unstructured text enrichment capabilities of EID allow text analytics to be combined with data discovery and analytics Most common issue for older, Whilst for newer 777, most MD-88 aircraft is “corroded or common issue is “inoperative cracked skin on the fuselage” emergency lights in the cabin” T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
  • 27. Exalytics OBIEE Dashboard : “Speed of Thought” Analysis 123m rows of data, analyzed live with detail in Oracle DB, and aggregates in TimesTen “Go-less” prompts and dashboard controls for instant response to filter changes Interactive visuals, in the form of maps, graphs, tables, scorecards and KPIs T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
  • 28. Using R to Answer In-Depth, Statistical Questions • Are some airports more prone to delays than others? • Are some days of the week likely to see fewer delays than others? ‣ Are these differences significant? • How do arrival delay distributions differ for the best and worst 3 airlines compared to the industry? ‣ Are there significant differences among airlines? • For American Airlines, how has the distribution of delays for departures and arrivals evolved over time? • How do average annual arrival delays compare across select airlines? ‣ What is the underlying trend for each airline? T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
  • 29. Preparing the Dataset for R, and Running R Queries • Create R frames using data from Oracle RDBMS, using ORE transparency layer • Create R queries to manipulate flight delays data • Build regression models • Score and rank data ontimeSubset <- subset(ONTIME_S, UNIQUECARRIER %in% • 40 cores and 1TB RAM in Exalytics c("AA", "AS", "CO", "DL","WN","NW")) res22 <- with(ontimeSubset, tapply(ARRDELAY, allows multiple R engines to list(UNIQUECARRIER, YEAR), mean, na.rm = TRUE)) g_range <- range(0, res22, na.rm = TRUE) be spawned, processing larger rindex <- seq_len(nrow(res22)) cindex <- seq_len(ncol(res22)) datasets than desktop workstation par(mfrow = c(2,3)) for(i in rindex) { could support temp <- data.frame(index = cindex, avg_delay = res22[i,]) plot(avg_delay ~ index, data = temp, col = "black", axes = FALSE, ylim = g_range, xlab = "", ylab = "", main = attr(res22, "dimnames")[[1]][i]) axis(1, at = cindex, labels = attr(res22, "dimnames")[[2]]) axis(2, at = 0:ceiling(g_range[2])) abline(lm(avg_delay ~ index, data = temp), col = "green") lines(lowess(temp$index, temp$avg_delay), col="red")} T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
  • 30. Integrating with OBIEE and Oracle BI Publisher • R scripts can be embedded in BI Publisher data models • Results returned as image vectors in XML, and rendered as BI Publisher output • R scripts can also be referenced in functions etc and included in OBIEE RPD T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
  • 31. R Analysis Output within the OBIEE Dashboard Regression analysis used to predict average delay for a route, using ORE integration within OBIEE BI Repository Display flight delay per airport for top N busiest airports with parameters that are passed to live R engines, using R script in BIP data model T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
  • 32. Summary • Analytics options available with Oracle Database and Oracle Fusion Middleware support a wide range of analytic tools and engines • Oracle Exalytics is an excellent platform to run these on, based on RAM and CPU # • OBIEE, with TimesTen for Exalytics and the Summary Advisor, supports “speed of thought” analytics using a rich, interactive dashboard • Endeca Information Discovery provides a search / analytic interface to enable you to discover the questions that need answering • Oracle R Enterprise, part of the Advanced Analytics Option for Oracle Database, enables deep analysis and insights using the R statistical language and environment • OBIEE can now connect to Hadoop/MapReduce, through Hive • A combined, integrated analysis toolset based on an Oracle Engineered System T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
  • 33. High Speed, Big Data Analysis using Exalytics Mark Rittman, Technical Director, Rittman Mead OUGN 2013 Conference, April 2013, Oslo T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com