Big Data as easy as 1, 2, 3, ... 4 ... with KNIME
1. Copyright © 2015 KNIME.com AG
Big Data Science is just a Click Away!
Rosaria Silipo
KNIME.com
2. Copyright © 2015 KNIME.com AG
Variety, Volume, Velocity
Variety:
• integrating heterogeneous data (and tools)
Volume:
• from small files...
• ...to distributed data repositories (Hadoop)
• bring the tools to the data
Velocity:
• from distributing heavy computations...
• ...to real-time scoring of millions of records per second
6. Copyright © 2015 KNIME.com AG
Energy Usage Prediction from Smart Meters Data
• Read Smart Meter Energy Data (176 million rows)
• Clean Up and Aggregate total Energy Usage by hour, day, week, month, year
• Calculate Behavioral Measures for each Smart Meter
• Cluster Smart Meters with Similar Behavior (k-Means)
• Predict Energy Usage in Clustered Smart Meters (Auto-Regressive Time Series Prediction)
Workflow 1
Workflow 2
Workflow 3
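The aggregation step in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the readings, their (meter, timestamp, kWh) layout, and the helper name aggregate are made up for the example and are not taken from the KNIME workflows.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical readings: (meter_id, ISO timestamp, kWh used in that interval).
readings = [
    ("m1", "2015-01-05T00:00:00", 0.25),
    ("m1", "2015-01-05T00:30:00", 0.50),
    ("m1", "2015-01-05T01:00:00", 0.75),
    ("m2", "2015-01-05T00:30:00", 1.00),
    ("m2", "2015-01-06T00:00:00", 2.00),
]


def aggregate(rows, fmt):
    """Sum usage per (meter, time bucket); fmt is a strftime pattern naming the bucket."""
    totals = defaultdict(float)
    for meter, stamp, kwh in rows:
        bucket = datetime.fromisoformat(stamp).strftime(fmt)
        totals[(meter, bucket)] += kwh
    return dict(totals)


hourly = aggregate(readings, "%Y-%m-%d %H")   # total per meter and hour
daily = aggregate(readings, "%Y-%m-%d")       # total per meter and day
monthly = aggregate(readings, "%Y-%m")        # total per meter and month
```

With 176 million rows this kind of aggregation is better pushed into the database than done in client memory; the sketch only shows what the aggregated tables contain.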
9. Copyright © 2015 KNIME.com AG
Big Data Support
• KNIME Big Data Access Nodes
– preconfigured connectors
– in-database processing
• Big Data Platforms
– HDFS, Hive, Impala, HP Vertica, Hortonworks, ParStream, Actian, any big data platform really!
• Spark MLlib integration (coming soon)
• Streaming Executor (coming soon)
10. Copyright © 2015 KNIME.com AG
Hadoop Sandboxes
• Hortonworks:
http://hortonworks.com/products/hortonworks-sandbox/
• Cloudera:
http://www.cloudera.com/content/cloudera/en/downloads/quickstart_vms.html
• VirtualBox
https://www.virtualbox.org/
• VMware Player
http://www.vmware.com/
11. Copyright © 2015 KNIME.com AG
… as easy as 1, 2, 3, … 4
1. Access Big Data
2. Select Table
3. In-DB Processing
4. Into KNIME
12. Copyright © 2015 KNIME.com AG
1. Database Connector
Generic Database Connector
– Can connect to any JDBC source
– Register new JDBC drivers via the preferences page
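As a rough illustration of what a generic JDBC connection needs, the sketch below builds the connection URL for a Hive source. The URL form jdbc:hive2://host:port/database and the driver class org.apache.hive.jdbc.HiveDriver are the standard HiveServer2 JDBC conventions; the helper function and the host name are hypothetical and not part of KNIME.

```python
# Driver class shipped with the Apache Hive JDBC driver (HiveServer2).
HIVE_DRIVER_CLASS = "org.apache.hive.jdbc.HiveDriver"


def hive_jdbc_url(host, port=10000, database="default"):
    """Build a HiveServer2 JDBC URL; 10000 is the usual HiveServer2 port."""
    return f"jdbc:hive2://{host}:{port}/{database}"


# Hypothetical host name, e.g. a local Hadoop sandbox VM.
url = hive_jdbc_url("sandbox.example.com")
print(url)
```

Other databases follow the same pattern with their own driver class and URL prefix, which is why a single generic connector can cover any JDBC source.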
13. Copyright © 2015 KNIME.com AG
1. Register JDBC Driver
Open KNIME and go to File -> Preferences.
Increase the connection timeout for long-running retrieval operations.
14. Copyright © 2015 KNIME.com AG
1. Dedicated Connectors
Dedicated pre-configured connectors
– Bundling necessary JDBC drivers
– Easy to use
– DB specific behavior/capability
Some dedicated connectors are part of the open source KNIME Analytics Platform; some belong to the commercial KNIME Big Data Extension.
works for most Hadoop Hive installations, including Hortonworks
free
16. Copyright © 2015 KNIME.com AG
3. In-Database Processing
• Filter rows and columns
• Join tables/queries (similar settings as the Joiner node)
• Sort your data
• Write your own query
• Aggregate* your data (similar settings as the GroupBy node)
* Database GroupBy node exposes DB specific aggregation methods
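The operations listed above all translate into SQL that runs inside the database, so only the final result travels to the client. The sketch below illustrates the idea with Python's built-in sqlite3 module as a stand-in for Hive or Impala; the table names, columns, and values are invented for the example.

```python
import sqlite3

# In-memory SQLite database as a stand-in for a big data platform.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE meters (meter_id TEXT PRIMARY KEY, region TEXT);
CREATE TABLE readings (meter_id TEXT, hour INTEGER, kwh REAL);
INSERT INTO meters VALUES ('m1', 'north'), ('m2', 'south'), ('m3', 'north');
INSERT INTO readings VALUES
  ('m1', 0, 0.5), ('m1', 1, 1.5),
  ('m2', 0, 2.0), ('m2', 1, 4.0),
  ('m3', 0, 0.25), ('m3', 1, 0.25);
""")

# Filter, join, aggregate, and sort are all expressed in one query,
# so the database does the work and returns one row per meter.
query = """
SELECT m.meter_id, AVG(r.kwh) AS avg_kwh      -- aggregate
FROM readings AS r
JOIN meters AS m ON m.meter_id = r.meter_id   -- join
WHERE m.region = 'north'                      -- filter rows
GROUP BY m.meter_id
ORDER BY avg_kwh DESC                         -- sort
"""
result = conn.execute(query).fetchall()
print(result)  # [('m1', 1.0), ('m3', 0.25)]
```

Aggregation functions beyond the standard ones (AVG, SUM, COUNT, ...) differ between databases, which is what the footnote about DB-specific aggregation methods refers to.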
17. Copyright © 2015 KNIME.com AG
3. Queries for average Measures
19. Copyright © 2015 KNIME.com AG
4. Import Data from Database
< 30 min
20. Copyright © 2015 KNIME.com AG
New Big Data Platform?
No problem!
Just change the connector node!
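The same idea can be sketched in code: if the downstream steps depend only on a database connection, switching platforms means replacing the piece that creates the connection. In this illustration two in-memory SQLite databases stand in for two different platforms; make_platform and total_usage are hypothetical names, and real platforms may still differ in SQL dialect.

```python
import sqlite3


def make_platform(rows):
    """Hypothetical 'connector': returns a DB-API connection to one backend."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (meter_id TEXT, kwh REAL)")
    conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    return conn


def total_usage(conn):
    """Downstream step: uses only the connection, not the platform behind it."""
    cur = conn.cursor()
    cur.execute(
        "SELECT meter_id, SUM(kwh) FROM readings "
        "GROUP BY meter_id ORDER BY meter_id"
    )
    return cur.fetchall()


# Two stand-in platforms holding different data.
platform_a = make_platform([("m1", 1.0), ("m1", 2.0)])
platform_b = make_platform([("m2", 0.5), ("m2", 0.5), ("m3", 4.0)])

# The downstream step is unchanged; only the connection differs.
usage_a = total_usage(platform_a)
usage_b = total_usage(platform_b)
```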
21. Copyright © 2015 KNIME.com AG
Other Useful Database Nodes
• Drop table
– missing table handling
– cascade option
• Execute any SQL statement
– executes several queries separated by ; and a new line
• Manipulate existing queries
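A minimal sketch of the "several queries separated by ; and a new line" behavior, again using sqlite3 as a stand-in database. The splitting rule below is an assumption about how such a script can be processed, not KNIME's implementation; run_statements and the table name are hypothetical. DROP TABLE IF EXISTS plays the role of missing table handling; SQLite has no CASCADE option for DROP TABLE, so that part is left out.

```python
import sqlite3

script = """DROP TABLE IF EXISTS hourly_usage;
CREATE TABLE hourly_usage (meter_id TEXT, hour INTEGER, kwh REAL);
INSERT INTO hourly_usage VALUES ('m1', 0, 0.5);
INSERT INTO hourly_usage VALUES ('m1', 1, 1.5);"""


def run_statements(conn, text):
    """Split on ';' followed by a newline and run each non-empty statement in order."""
    statements = [s.strip().rstrip(";") for s in text.split(";\n")]
    executed = 0
    for statement in statements:
        if statement:
            conn.execute(statement)
            executed += 1
    return executed


conn = sqlite3.connect(":memory:")
count = run_statements(conn, script)
rows = conn.execute("SELECT COUNT(*) FROM hourly_usage").fetchone()[0]
```

This naive split would break a statement whose string literal contains a semicolon followed by a line break, so it is only suitable for simple scripts.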
23. Copyright © 2015 KNIME.com AG
KNIME Big Data Extension
• KNIME Big Data Access Nodes
– preconfigured connectors
– HDFS File Handling
– Hive/Impala Loader
• Big Data Platforms
– HDFS, Hive, Impala, HP Vertica, Hortonworks, ParStream, Actian, SAP HANA (to be), …
• Spark MLlib integration (coming soon)
• Streaming Executor (coming soon)
24. Copyright © 2015 KNIME.com AG
HDFS File Handling
• KNIME & Extensions -> KNIME File Handling Nodes
• HDFS Connection and HDFS File Permission nodes
25. Copyright © 2015 KNIME.com AG
Hive/Impala Loader
• Upload a KNIME data table to Hive/Impala
26. Copyright © 2015 KNIME.com AG
KNIME Big Data Extension: Download and Install
KNIME.com Extension Store
License Required!
Installation Instructions
http://tech.knime.org/installation-instructions
Product Description
http://www.knime.org/knime-big-data-extension
27. Copyright © 2015 KNIME.com AG
License on KNIME Store
http://tech.knime.org/knime-store
30-day trial license available with special Promotion Code
education@knime.com
28. Copyright © 2015 KNIME.com AG
References
• Whitepaper “KNIME opens the Doors to Big Data”
http://www.knime.org/files/big_data_in_knime_1.pdf
• Blog Post “Integrating Big data is as Easy as 1,2,3, … 4”
http://www.knime.org/blog/integrating-big-data-is-as-easy-as-1-2-3-4
• The Big Data Extension Product Description
http://www.knime.org/knime-big-data-extension
29. Copyright © 2015 KNIME.com AG
Thank You!
• education@knime.com
• Twitter: @KNIME
• LinkedIn Group: KNIME
• KNIME Blog: http://www.knime.org/blog