Latest Trends in Technology: BigData Analytics, Virtualization, Cloud Computing, Internet of Things (IoT)
1. Assoc. Prof. Abzetdin ADAMOV, Chair of Computer Engineering Department
aadamov@qu.edu.az
http://ce.qu.edu.az/~aadamov
Shahdag, 29 November 2014
2. Content
•Why Data Mining in BigData?
•Internet Statistics
•BigData Infrastructure
•Web Crawlers for Web Analytics
•Natural Language Processing (NLP)
•Virtualization
•Introduction to Cloud Computing
•Introduction to Internet of Things (IoT)
3. Digital Universe: Volume of Digital Data
•2008 – 480,000 petabytes (PB)
•2009 – 800,000 PB
•2010 – 1,200,000 PB, or 1.2 zettabytes (ZB)
•2011 – 1.8 ZB
•2012 – 2.7 ZB
•2014 – approx. 6.2 ZB
•Expected to reach 35 ZB by 2020
IDC's Digital Universe Study
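The year-over-year growth implied by these figures can be worked out directly. The following is a minimal sketch, assuming the volumes are comparable point estimates; the function name `annual_growth` and the dictionary `volumes_zb` are illustrative and not part of the original slides.

```python
# Digital Universe volumes in zettabytes, as listed on the slide
# (480,000 PB = 0.48 ZB, 800,000 PB = 0.8 ZB).
volumes_zb = {2008: 0.48, 2009: 0.8, 2010: 1.2, 2011: 1.8, 2012: 2.7, 2014: 6.2}


def annual_growth(v_start, v_end, years):
    """Compound annual growth factor between two measurements."""
    return (v_end / v_start) ** (1.0 / years)


# Print the growth factor for each pair of consecutive data points.
listed_years = sorted(volumes_zb)
for prev, curr in zip(listed_years, listed_years[1:]):
    factor = annual_growth(volumes_zb[prev], volumes_zb[curr], curr - prev)
    print(f"{prev}-{curr}: x{factor:.2f} per year")
```

The 2012 to 2014 step spans two years, so the function takes the square root of the overall ratio to give a per-year factor.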
4. Big Measures for Big Data
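•kilobyte (kB): 10^3 bytes (binary: 2^10)
•megabyte (MB): 10^6 bytes (binary: 2^20)
•gigabyte (GB): 10^9 bytes (binary: 2^30)
•terabyte (TB): 10^12 bytes (binary: 2^40)
•petabyte (PB): 10^15 bytes (binary: 2^50)
•exabyte (EB): 10^18 bytes (binary: 2^60)
•zettabyte (ZB): 10^21 bytes (binary: 2^70)
•yottabyte (YB): 10^24 bytes (binary: 2^80)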
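A small conversion helper makes the relationship between these units concrete. This is a sketch only: the names `DECIMAL_EXPONENTS`, `convert`, and `binary_overhead` are hypothetical, and the decimal (power-of-ten) definitions are assumed for conversion.

```python
# Decimal exponent for each unit: 1 unit = 10**exponent bytes.
DECIMAL_EXPONENTS = {
    "kB": 3, "MB": 6, "GB": 9, "TB": 12,
    "PB": 15, "EB": 18, "ZB": 21, "YB": 24,
}


def convert(value, from_unit, to_unit):
    """Convert a quantity between decimal byte units."""
    shift = DECIMAL_EXPONENTS[from_unit] - DECIMAL_EXPONENTS[to_unit]
    if shift >= 0:
        return value * 10 ** shift
    return value / 10 ** (-shift)


def binary_overhead(unit):
    """Ratio of the binary size (2**(10n)) to the decimal size (10**(3n))."""
    exponent = DECIMAL_EXPONENTS[unit]
    return 2 ** (10 * exponent // 3) / 10 ** exponent


print(convert(1_200_000, "PB", "ZB"))  # the 2010 figure expressed in ZB
print(binary_overhead("ZB"))           # how far 2^70 exceeds 10^21
```

The gap between the binary and decimal definitions widens with each step up the scale, which is why the two columns should not be treated as interchangeable for large volumes.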
5. Why Does Data Grow So Fast?
Data sets are gathered by ubiquitous devices:
•Information-sensing mobile devices,
•Aerial sensory technologies (remote sensing),
•Software logs,
•Cameras,
•Microphones,
•Radio-frequency identification readers,
•Wireless sensor networks
6. The Internet Is the Biggest Country
[Figure: two bar charts. "Population" (million): USA, India, China, Internet. "Area" (million square km): USA, Canada, Russia, Internet.]
7. Internet Penetration
[Figure: bar chart of Internet penetration (%) by country Internet code]
Countries in chart order: EE, LV, RU, LT, BY, UA, KG, AM, KZ, UZ, GE, MD, AZ, TM, TJ
Bar labels recovered (%), highest first: 13, 12.4, 8.2, 4.1, 1.5, 1.1, 0.9, 0.6, 0.59, 0.5, 0.34, 0.32, 0.04, 0.03
Note: Internet stats for December 2001. Average Internet usage in the world: 8% (500 million, 2001)
8. Foundations of the Web
[Figure: bar chart of Internet penetration (%) by country Internet code]
EE 34.7, LV 18, RU 15, LT 10, BY 5.2, UA 5, KG 3, AZ 2.7, AM 2.3, KZ 2.2, UZ 1.8, GE 1.8, MD 1.3, TM 0.9, TJ 0.8
Note: Internet stats for December 2004
9. Foundations of the Web
[Figure: bar chart of Internet penetration (%) by country Internet code]
EE 65.6, LV 59.4, LT 59.2, BY 29.1, RU 27.1, AZ 18.2, MD 16.2, UA 14.7, KG 13.8, KZ 12.3, UZ 8.8, GE 7.8, TJ 6.6, AM 5.8, TM 1.4
Note: Internet stats for September 2009
Average Internet usage in the world: 21.9%
10. Foundations of the Web
[Figure: bar chart of Internet penetration (%) by country Internet code]
Countries in chart order: EE, LV, LT, AM, BY, AZ, RU, KG, KZ, UA, MD, GE, UZ, TJ, TM
Bar labels recovered (%), highest first: 68.2, 59.5, 47.1, 46, 44.1, 40, 39.3, 34.1, 33.9, 30.9, 28.3, 26.8, 9.2, 1.6
Note: Internet stats for March 2011
Average Internet usage in the world: 30.2%
11. Foundations of the Web
[Figure: bar chart of Internet penetration (%) by country Internet code]
Countries in chart order: EE, LV, LT, AM, AZ, RU, BY, KZ, MD, UA, UZ, GE, UZ, TJ, TM
Bar labels recovered (%), highest first: 78, 71.1, 65.1, 60.6, 50, 47.7, 46, 45, 44.8, 34.1, 30.2, 28.4, 26, 13, 5
Note: Internet stats for June 2012
Average Internet usage in the world: 34.3%
http://www.internetworldstats.com
12. Internet Penetration
[Figure: bar chart of Internet penetration (%) by country Internet code]
EE 79, LV 74, LT 68, AZ 54.2, RU 53.3, KZ 53.3, BY 46.9, GE 45.5, MD 43.4, AM 39.2, UZ 36.5, UA 33.7, KG 21.7, TJ 14.5, TM 7.2
Note: Internet stats for March 2013. Average Internet usage in the world: 39% (2.7 billion, 2013)
15. SUN
[Figure: diagram showing SUN, numbered items 1 to 8, and four DNS nodes]
- Countries, Cities, User Groups, …
16. Problem with Moore’s Law
•The number of transistors that can be placed on an integrated circuit doubles every 18 months to two years
•It’s predicted to reach its limit with existing technology in 2020
•Cutting the size of a transistor to a single atom may defeat that concept
•The Digital Universe is growing much faster than processing power
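The doubling rule in the first bullet corresponds to exponential growth of the form 2^(t/T), where T is the doubling period. A minimal sketch of that arithmetic follows; the function name `growth_factor` and the ten-year horizon are illustrative choices, not figures from the slides.

```python
def growth_factor(months_elapsed, doubling_period_months):
    """Multiplicative growth after months_elapsed for a given doubling period."""
    return 2 ** (months_elapsed / doubling_period_months)


# Ten years (120 months) under the two doubling periods named in the bullet.
for period in (18, 24):
    factor = growth_factor(120, period)
    print(f"doubling every {period} months -> x{factor:.1f} in 10 years")
```

The spread between the two results shows how sensitive a long-range projection is to the assumed doubling period.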
19. Google's New Data Centers
Map of Google Data Centers Worldwide
An estimated 450,000 servers draw upwards of 20 megawatts, which costs on the order of US$2 million per month in electricity charges.
22. Everything as a Service
•Utility computing = Infrastructure as a Service (IaaS)
–Why buy machines when you can rent cycles?
–Examples: Amazon’s EC2 (Elastic Compute Cloud), Rackspace, Microsoft Azure
•Platform as a Service (PaaS)
–Give me a nice API and take care of the maintenance, upgrades, …
–Example: Google App Engine
•Software as a Service (SaaS)
–Just run it for me!
–Examples: Gmail, Salesforce
23. Web Crawlers for Web Analytics
•Indexing
•Searching
•Ranking
•Analysis
•Crawling is an essential job for all Internet giants: Google, Yahoo, Facebook, etc.
Some available open-source crawlers: Apache Nutch, Crawler4j, Bixo, Heritrix, etc.
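The core loop of a crawler (fetch a page, extract its links, queue the unseen ones) can be sketched in a few lines. This is a simplified illustration and not how any of the tools named above are implemented: the `PAGES` dictionary stands in for the network, and `LinkExtractor` and `crawl` are hypothetical names.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical in-memory "web": URL -> HTML (no network access needed).
PAGES = {
    "http://example.test/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.test/a": '<a href="/b">B</a>',
    "http://example.test/b": '<a href="/">home</a> <a href="/missing">?</a>',
}


def crawl(seed, fetch=PAGES.get):
    """Breadth-first crawl from seed; returns fetched URLs in visit order."""
    seen, order, frontier = {seed}, [], deque([seed])
    while frontier:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:
            continue  # page could not be fetched
        order.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            target = urljoin(url, href)  # resolve relative links
            if target not in seen:
                seen.add(target)
                frontier.append(target)
    return order


print(crawl("http://example.test/"))
```

A production crawler would add politeness delays, error handling, and persistent storage; the `seen` set here is the minimum needed to avoid revisiting pages.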
24. Web Crawlers for Web Analytics
•Thanks to crawlers, any website can appear in search results without doing any extra work.
•Crawling can be customized with meta tags and “robots.txt”
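How a crawler consults robots.txt can be shown with Python's standard `urllib.robotparser` module. The rules, URLs, and the agent name `ExampleBot` below are made up for illustration, and the file content is parsed from a string so that the sketch runs without network access.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: block every crawler from /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(url, agent="ExampleBot"):
    """Return True if the given agent may fetch the URL under the rules above."""
    return parser.can_fetch(agent, url)


print(allowed("http://example.test/index.html"))
print(allowed("http://example.test/private/data.html"))
```

In practice a crawler would load the file from the site itself with `set_url()` and `read()` before checking each URL.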
25. Natural Language Processing (NLP)
•Natural Language Processing (NLP)
•Computational Linguistics (CL)
•Machine Translation (MT)
26. Natural Language Processing (NLP)
•Multilingual NLP
•Text Mining in Multimedia Networks
•Mining Text Streams
•Text Mining in Social Media
•Cross-Lingual Mining of Text Data
•Contextual analysis of text data
Some available NLP tools: NLTK, Apache OpenNLP, MontyLingua, VisualText, etc.
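A first step in most text mining tasks is tokenizing text and counting terms. The sketch below uses only the Python standard library rather than any of the tools listed; the stopword list and the names `tokenize` and `term_frequencies` are illustrative assumptions.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; real tools ship much larger ones.
STOPWORDS = {"the", "a", "of", "and", "in", "is", "to"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def term_frequencies(text):
    """Count how often each non-stopword token occurs."""
    return Counter(token for token in tokenize(text) if token not in STOPWORDS)


sample = "The data of the Web is big data"
print(term_frequencies(sample).most_common(3))
```

The regular expression only handles ASCII letters, so multilingual text would need a Unicode-aware tokenizer.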
27. Data Mining and Knowledge Discovery
Data-driven Decision Making Model
28. Virtualization as an Infrastructure
[Figure: two stack diagrams]
Traditional Stack (bottom to top): Hardware; Operating System; App, App, App
Virtualized Stack (bottom to top): Hardware; Hypervisor; OS, OS, OS; App, App, App
32. IP Address – Dotted decimal notation
•32 bit binary
•Four 8-bit octets
Ex: 11100011010100101001110110110001
11100011 - 01010010 - 10011101 - 10110001
E3 - 52 - 9D - B1
•What’s a subnet?
–device interfaces with the same subnet part of the IP address
–can physically reach each other without an intervening router
Example: 122.97.211.200
We can view these values in their binary form:
122 = 01111010
97 = 01100001
211 = 11010011
200 = 11001000
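The octet conversions and the subnet idea above can be checked with a short script. This is a sketch under stated assumptions: the helper names are hypothetical, and the /24 prefix length used in the subnet test is an illustrative choice that does not come from the slides.

```python
import ipaddress


def to_binary_octets(address):
    """Render each dotted-decimal octet as an 8-bit binary string."""
    return [format(int(octet), "08b") for octet in address.split(".")]


def to_hex_octets(address):
    """Render each dotted-decimal octet as two uppercase hex digits."""
    return [format(int(octet), "02X") for octet in address.split(".")]


def same_subnet(addr_a, addr_b, prefix_len):
    """True if both addresses share the same network part for this prefix."""
    network = ipaddress.ip_network(f"{addr_a}/{prefix_len}", strict=False)
    return ipaddress.ip_address(addr_b) in network


print(to_binary_octets("122.97.211.200"))
print(to_hex_octets("227.82.157.177"))
# Assumed /24 prefix: the first three octets form the subnet part.
print(same_subnet("122.97.211.200", "122.97.211.17", 24))
print(same_subnet("122.97.211.200", "122.97.212.17", 24))
```

The address 227.82.157.177 is the dotted-decimal form of the hex octets E3, 52, 9D, B1 shown earlier on this slide.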
36. Data Mining and Knowledge Discovery
•Text Mining from Web
•Natural Language Processing
•Web Crawling
•Large-Scale Data Management and Processing
•Internet Structure Research and Visualization
•Knowledge visualization
•Cluster and Distributed Computing