Copyright © 2015 KNIME.com AG
Big Data Science is just a Click Away!
Rosaria Silipo
KNIME.com
Variety, Volume, Velocity
Variety:
• integrating heterogeneous data (and tools)
Volume:
• from small files...
• ...to distributed data repositories (Hadoop)
• bring the tools to the data
Velocity:
• from distributing heavy computations...
• ...to real-time scoring of millions of records/sec.
Every Minute…
IoT
The Challenge
Energy Usage Prediction from Smart Meters Data
• Read Smart Meter Energy Data (176 million rows)
• Clean Up and Aggregate total Energy Usage by hour, day, week, month, year
• Calculate Behavioral Measures for each Smart Meter
• Cluster Smart Meters with Similar Behavior (k-Means)
• Predict Energy Usage in Clustered Smart Meters (Auto-Regressive Time Series Prediction)
Workflow 1
Workflow 2
Workflow 3
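The slides do not show how the auto-regressive prediction step is computed. As a rough illustration only, the sketch below fits a first-order auto-regressive model x[t] = a + b * x[t-1] by ordinary least squares and forecasts one step ahead. The model order, the function names, and the sample values are assumptions made for this example; they are not taken from the KNIME workflows.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = a + b * x[t-1]; returns (a, b)."""
    prev = series[:-1]   # x[t-1]
    curr = series[1:]    # x[t]
    n = len(prev)
    if n < 2:
        raise ValueError("need at least three observations")
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    var_prev = sum((p - mean_prev) ** 2 for p in prev)
    if var_prev == 0:
        raise ValueError("series is constant; slope is undefined")
    cov = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(prev, curr))
    b = cov / var_prev
    a = mean_curr - b * mean_prev
    return a, b


def predict_next(series, a, b):
    """One-step-ahead forecast from the last observed value."""
    return a + b * series[-1]


# Hypothetical hourly usage values that follow x[t] = 1 + 0.5 * x[t-1].
hourly_usage = [0.0, 1.0, 1.5, 1.75, 1.875]
a, b = fit_ar1(hourly_usage)
forecast = predict_next(hourly_usage, a, b)
print(a, b, forecast)
```

In practice a higher model order and one model per cluster of meters would be more likely; this only shows the shape of the calculation.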
Workflow 1: PrepareData
~ 2 days
Big Data
Big Data Support
• KNIME Big Data Access Nodes
– preconfigured connectors
– in-database processing
• Big Data Platforms
– HDFS, Hive, Impala, HP Vertica, Hortonworks, ParStream, Actian, any big data platform really!
• Spark MLlib integration (coming soon)
• Streaming Executor (coming soon)
Hadoop Sandboxes
• Hortonworks:
http://hortonworks.com/products/hortonworks-sandbox/
• Cloudera:
http://www.cloudera.com/content/cloudera/en/downloads/quickstart_vms.html
• VirtualBox:
https://www.virtualbox.org/
• VMware Player:
http://www.vmware.com/
… as easy as 1, 2, 3, … 4
1. Access Big Data
2. Select Table
3. In-DB Processing
4. Into KNIME
1. Database Connector
Generic Database Connector
– Can connect to any JDBC source
– Register new JDBC driver via preferences page
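A JDBC source is addressed by a connection URL. As a small illustration, the helper below assembles one in the common `jdbc:<subprotocol>://<host>:<port>/<database>` form. HiveServer2 uses the `hive2` subprotocol and listens on port 10000 by default; the host name and database name here are assumptions (for instance a local sandbox VM), and the helper itself is hypothetical, not part of KNIME.

```python
def jdbc_url(subprotocol, host, port, database):
    """Build a JDBC URL of the form jdbc:<subprotocol>://<host>:<port>/<database>."""
    return f"jdbc:{subprotocol}://{host}:{port}/{database}"


# Assumed values: a Hive sandbox reachable on localhost, default database.
hive_url = jdbc_url("hive2", "localhost", 10000, "default")
print(hive_url)
```

Other databases use other subprotocols and sometimes a different URL layout, so the driver documentation remains the authority for the exact string.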
1. Register JDBC Driver
• Open KNIME and go to File -> Preferences
• Increase connection timeout for long-running retrieval operations
1. Dedicated Connectors
Dedicated pre-configured connectors
– Bundling necessary JDBC drivers
– Easy to use
– DB-specific behavior/capability
Some dedicated connectors are part of the open source KNIME Analytics Platform, some belong to the commercial KNIME Big Data Extension
Figure callouts: "works for most Hadoop Hive installations, including Hortonworks"; "free"
2. Data Table Selection
3. In-Database Processing
• Filter rows and columns
• Join tables/queries (similar settings as the Joiner node)
• Sort your data
• Write your own query
• Aggregate* your data (similar settings as the GroupBy node)
* Database GroupBy node exposes DB-specific aggregation methods
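The operations listed above (filter, join, sort, aggregate) all run inside the database, so only the result travels back to the client. The sketch below shows the kind of single SQL query such a chain of operations amounts to. It uses Python's built-in `sqlite3` as a stand-in for a Hive or Impala connection, and the table names, column names, and values are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for the big data platform.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE meter_readings (meter_id INTEGER, reading_hour TEXT, kwh REAL);
CREATE TABLE meters (meter_id INTEGER, region TEXT);
INSERT INTO meter_readings VALUES
  (1, '2010-01-01 00', 0.5), (1, '2010-01-01 01', 0.75),
  (2, '2010-01-01 00', 1.0), (2, '2010-01-01 01', 3.0),
  (3, '2010-01-01 00', 9.9);
INSERT INTO meters VALUES (1, 'north'), (2, 'south'), (3, 'test');
""")

# Filter rows, join two tables, aggregate, and sort, all in one statement
# that the database executes; the client only receives the small result.
query = """
SELECT m.region, SUM(r.kwh) AS total_kwh
FROM meter_readings AS r
JOIN meters AS m ON m.meter_id = r.meter_id
WHERE m.region <> 'test'
GROUP BY m.region
ORDER BY total_kwh DESC
"""
rows = conn.execute(query).fetchall()
print(rows)
```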
3. Queries for average Measures
3. Average Monthly Values
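The query behind this slide is not reproduced in the text. As a hedged guess at its general shape, the sketch below computes the average reading per meter and calendar month with a GROUP BY. SQLite (via Python's built-in `sqlite3`) stands in for the real platform, the schema is hypothetical, and `strftime` is SQLite's date function; Hive and Impala use different date functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the real database
conn.execute(
    "CREATE TABLE meter_readings (meter_id INTEGER, reading_ts TEXT, kwh REAL)"
)
conn.executemany(
    "INSERT INTO meter_readings VALUES (?, ?, ?)",
    [
        (1, "2010-01-05 10:00:00", 1.0),
        (1, "2010-01-20 10:00:00", 3.0),
        (1, "2010-02-03 10:00:00", 5.0),
        (2, "2010-01-07 10:00:00", 2.0),
    ],
)

# Average kWh per meter and month, computed inside the database.
monthly_avg = conn.execute("""
    SELECT meter_id,
           strftime('%Y-%m', reading_ts) AS month,
           AVG(kwh) AS avg_kwh
    FROM meter_readings
    GROUP BY meter_id, month
    ORDER BY meter_id, month
""").fetchall()
print(monthly_avg)
```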
4. Import Data from Database
< 30 min
New Big Data Platform?
No problem!
Just change the connector node!
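The same idea can be expressed in code: if the processing logic depends only on a generic connection object, switching platforms means changing the line that creates the connection and nothing else. The sketch below illustrates this with Python's DB-API, using two in-memory SQLite databases as stand-ins for two different platforms. The function names and the table are hypothetical.

```python
import sqlite3


def total_usage(conn):
    """Run the same aggregation against whatever DB-API connection is passed in."""
    cur = conn.cursor()
    cur.execute("SELECT SUM(kwh) FROM meter_readings")
    return cur.fetchone()[0]


def make_backend(rows):
    """Create a stand-in backend; a real setup would open a Hive, Impala, etc. connection."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE meter_readings (meter_id INTEGER, kwh REAL)")
    conn.executemany("INSERT INTO meter_readings VALUES (?, ?)", rows)
    return conn


# Two independent "platforms"; the processing function is unchanged.
backend_a = make_backend([(1, 1.5), (2, 2.5)])
backend_b = make_backend([(1, 10.0)])
total_a = total_usage(backend_a)
total_b = total_usage(backend_b)
print(total_a, total_b)
```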
Other Useful Database Nodes
• Drop table
– missing table handling
– cascade option
• Execute any SQL statement
– executes several queries separated by ; and new line
• Manipulate existing queries
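The "separated by ; and new line" convention can be sketched as follows: split the script wherever a semicolon is directly followed by a line break, then execute the pieces in order. This is an illustration of the convention only, not the node's implementation; `run_script` is a hypothetical helper, and SQLite stands in for the target database.

```python
import sqlite3


def run_script(conn, script):
    """Execute statements separated by ';' followed by a newline; return how many ran."""
    executed = 0
    for piece in script.split(";\n"):
        stmt = piece.rstrip(";").strip()
        if stmt:  # skip the empty remainder after the last separator
            conn.execute(stmt)
            executed += 1
    return executed


conn = sqlite3.connect(":memory:")
script = (
    "DROP TABLE IF EXISTS daily_usage;\n"
    "CREATE TABLE daily_usage (meter_id INTEGER, day TEXT, kwh REAL);\n"
    "INSERT INTO daily_usage VALUES (1, '2010-01-01', 12.5);\n"
)
count = run_script(conn, script)
stored = conn.execute("SELECT COUNT(*) FROM daily_usage").fetchone()[0]
print(count, stored)
```

A semicolon inside a string literal that happens to be followed by a line break would be split incorrectly by this simple approach, which is one reason to keep each statement on its own line.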
KNIME Big Data Extension
• KNIME Big Data Access Nodes
– preconfigured connectors
– HDFS File Handling
– Hive/Impala Loader
• Big Data Platforms
– HDFS, Hive, Impala, HP Vertica, Hortonworks, ParStream, Actian, SAP HANA (planned), …
• Spark MLlib integration (coming soon)
• Streaming Executor (coming soon)
HDFS File Handling
• KNIME & Extensions -> KNIME File Handling Nodes
• HDFS Connection and HDFS File Permission nodes
Hive/Impala Loader
• Upload a KNIME data table to Hive/Impala
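Conceptually, uploading a table means creating a target table with matching columns and writing the rows into it. The sketch below shows that idea with plain SQL inserts against SQLite, used here as a stand-in. It is not a description of how the loader node works internally, and `load_table`, the table name, and the columns are hypothetical.

```python
import sqlite3


def load_table(conn, table, columns, rows):
    """Create the table if needed, insert all rows, and return the resulting row count.

    `table` and the column names are interpolated into SQL, so they must come
    from trusted code, never from user input.
    """
    col_defs = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({col_defs})")
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)
    conn.commit()
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


conn = sqlite3.connect(":memory:")  # stand-in for a Hive/Impala connection
loaded = load_table(
    conn,
    "meter_clusters",
    [("meter_id", "INTEGER"), ("cluster", "INTEGER")],
    [(1, 0), (2, 3)],
)
print(loaded)
```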
KNIME Big Data Extension: Download and Install
KNIME.com Extension Store
License Required!
Installation Instructions
http://tech.knime.org/installation-instructions
Product Description
http://www.knime.org/knime-big-data-extension
License on KNIME Store
http://tech.knime.org/knime-store
30-day trial license available with special Promotion Code
education@knime.com
References
• Whitepaper “KNIME opens the Doors to Big Data”
http://www.knime.org/files/big_data_in_knime_1.pdf
• Blog Post “Integrating Big data is as Easy as 1,2,3, … 4”
http://www.knime.org/blog/integrating-big-data-is-as-easy-as-1-2-3-4
• The Big Data Extension Product Description
http://www.knime.org/knime-big-data-extension
Thank You!
• education@knime.com
• Twitter: @KNIME
• LinkedIn Group: KNIME
• KNIME Blog: http://www.knime.org/blog
Big Data as easy as 1, 2, 3, ... 4 ... with KNIME
