5. PSU definition of IIS Research
The goal of the Intelligent Information Systems
Research is to explore and support all levels of
research that will improve and enhance our ability
to generate, manage, search, and mine
information and knowledge. Current research
covers Internet database design and analysis,
mobile Web computing, Web mining and
navigation, Web agents, novel and intelligent
Web tools, multimedia retrieval, Web and internet
models, Web usage, automatic content analysis
and digital libraries, Web search, niche search
engines, semantic web, scientific databases, data
mining, and information retrieval.
All about Web ….
6. Knowledge
Information
Data
Data : Simple things; easily captured, structured,
transferred, compressible and quantifiable
Information : Relevant and related data having
some purpose; needs consensus on meaning, and
human mediation is necessary
Knowledge : Valuable information from the human
mind; contextual; hard to capture and structure
electronically; mostly tacit
Source : Adapted from Thomas H. Davenport, Information Ecology
8. Definition - DM and KDD
Data mining (DM) is a step in the knowledge
discovery process consisting of particular data
mining algorithms that, under some acceptable
computational efficiency limitations, find patterns
or models in data.
Knowledge discovery in databases (KDD)
(Fayyad et al., 1996) is the process of identifying
valid, novel, potentially useful, and ultimately
understandable patterns or models in data.
9. Data Items
Data items refer to an elementary description
of things, events, activities, and transactions
that are recorded, classified, and stored but
are not organized to convey any specific
meaning.
Data items can be numbers, letters, figures,
sounds, or images. Examples of data items
are a student grade in a class and the number of
hours an employee worked in a certain week.
10. Information
Information refers to data that have been
organized so that they have meaning and
value to the recipient.
For example, a grade point average (GPA) is
data, but a student’s name coupled with his or
her GPA is information.
The recipient interprets the meaning and
draws conclusions and implications from the
information.
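The step from data items to information can be sketched in a few lines of Python. This is only an illustration: the student names and GPA values are invented and do not come from the slides, and `describe` is a hypothetical helper.

```python
# Data items: bare values with no context (hypothetical numbers).
gpas = [3.4, 2.8, 3.9]

# Information: the same values organized so they mean something to the
# recipient. The names are invented for illustration only.
students = [
    {"name": "Alice", "gpa": 3.4},
    {"name": "Bilal", "gpa": 2.8},
    {"name": "Chen", "gpa": 3.9},
]


def describe(student):
    """Turn one organized record into a statement a recipient can interpret."""
    return f"{student['name']} has a GPA of {student['gpa']:.1f}"


for student in students:
    print(describe(student))
```

The list `gpas` on its own says nothing about whom the numbers belong to; pairing each value with a name is what gives it meaning for the recipient.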
11. Knowledge
Knowledge consists of data and/or
information that have been organized and
processed to convey understanding, experience,
accumulated learning, and expertise as they
apply to a current business problem. For
example, a company recruiting at your
university has found over time that students
with grade point averages over 3.0 have had
the most success in its management program.
Based on its experience, that company may
14. Major Types of Systems
• Executive Support Systems (ESS)
• Decision Support Systems (DSS)
• Management Information Systems (MIS)
• Knowledge Work Systems (KWS)
• Office Systems
• Transaction Processing Systems (TPS)
MAJOR TYPES OF SYSTEMS IN
ORGANIZATIONS
16. Task – LOSE WEIGHT!
UK survey - relevant respondents were asked to
name the main ways in which using dieting
apps on their smartphone had helped them to
lose weight. The five highest-ranked answers
were as follows:
1. Easier to track calories and food intake at the
push of a button (47%)
2. Can check calorie content of items before
deciding to eat them (36%)
3. Helpful for planning healthy and nutritious
meals in advance (32%)
4. Helps keep me motivated (24%)
5. Cheaper and easier alternative to diet books
and magazines (18%)
17. Information System
System : Business example : Diet example
Executive Support System : 5 year sales plan : 6 month diet plan
Management Information System : Inventory Control : What should be in refrigerator
Decision Support System : Pricing/Profit Analysis : Best food within budget
Knowledge Work System : Engineering/Graphics : Which food to eat with which food
Office System : Word Processing/Agenda : Keeping agenda of eating timings
Transaction Processing System : Order Processing
19. Magnitude of Data
Human brain capacity 2.5 PETABYTES
Total digital data created 422 EXABYTES (2008)
Web size - 98 PETABYTES (2010)
Total genome sequences of all people on Earth 4.75 EXABYTES
Web users - 2 Billion + (2011)
World’s digital storage capacity 1 ZETTABYTE (2011)
Digital data created 1.8 ZETTABYTES (2011), 2.7 ZETTABYTES (2012)
Digital data to be produced - 35 ZETTABYTES (by 2020)
Drastic reduction in the price per gigabyte of storage
=> All data is being or is going to be conserved!
=> Huge data centres
1 bit on 12 atoms …. 1 bit on 1000000 atoms
Source : IDC Digital Universe 2010 / Popular Science Nov 2011 / IBM
GIGA 10^9
TERA 10^12
PETA 10^15
EXA 10^18
ZETTA 10^21
YOTTA 10^24
20. Some systems we will come across…
Decision Support Systems
Strategic Information Systems
OLAP
Executive Information Systems
Enterprise Information Systems
…..
Green Information Systems
based on
Data Warehousing
Data Mining
21. Data Equity
UK report : The annual economic benefit of big
data analytics could exceed £40 billion for the UK
in 2017, from the government and enterprise point
of view.
The US government has dedicated 200 million dollars
to research on handling big data, including a
$10 million data project at the University of
California, Berkeley, support for a geosciences
data effort called Earth Cube, and more.
23. Obama’s secret weapon in re-election
With a sluggish economy, unemployment
teetering at around the eight per cent mark, and
growing anti-Obama sentiment in some parts of
the country, a second term seemed an uphill task
for Obama and it was going to take an
extraordinary campaign to make it happen.
From the get-go, David Axelrod, the brain behind
the Obama campaign, recognised the role that
data and information could play in the election.
The process had been initiated in 2008 but
databases were scattered and it wasn’t until the
2010 midterm elections that the Democratic
Party, despite heavy losses, was able to
streamline the data to accurately forecast results
in a meaningful way.
24. Obama’s secret weapon in re-election
Pakistani scientist Rayid Ghani
Ghani’s job was to make sense of huge amounts of
information
“The core of the work I was doing was looking at a
large amount of data and making sense of it to help
other people make better decisions”
If the 2008 campaign was about charisma and hope,
the 2012 campaign was about science and data.
How data helps you is that it makes you more efficient
and helps you spend your money carefully and in the
right way.
25. Collaborative Social IS
Data content published by individuals on the web,
specifically in collaborative environments such as social
forums, blogs, games, etc.
Data can be analysed/created by thousands of users
(crowd-sourcing initiatives).
• Right after the earthquake in Haiti in 2010, the
company holding the license to use geo-imagery
opened up the map data of Haiti to the general public.
Web users all over the world examined the
imagery and within a couple of days populated the
map with information such as refugee camps, hospitals,
and damaged buildings. The relief workers in the area
used the map to organise the relief work.
http://observedchange.com/demos/linked-haiti/
27. Other scenarios …
• Online business transactional system :
Millions of transactions over internet, e.g.
online game servers, business transaction
portals or other business to business oriented
services
• New York Stock Exchange produces about
one terabyte of data per day
• Walmart Data Warehouse Intelligence
• Querying these transactions in Column
databases!
• NoSQL Databases
• Graph Databases
Knowledge is both an individual attribute and a collective attribute of the firm.
Knowledge is generally believed to have a location, either in the minds of humans or in specific business processes.
Knowledge is “sticky” and not universally applicable or easily moved.
Knowledge is thought to be situational and contextual. For example, you must know when to perform a procedure as well as how to perform it.
Tacit knowledge : Knowledge residing in the minds of employees that has not been documented, for example how to ride a bicycle.
i.e. Knowledge is a cognitive, even a physiological, event that takes place inside people’s heads.
Explicit knowledge : Knowledge that has been documented
It is also stored in libraries and records, shared in lectures, and stored by firms in the form of business processes and employee know-how. Knowledge can reside in e-mail, voice mail, graphics, and unstructured documents as well as structured
Information System (IS) is a term used for systems which give some information.
IS are categorized depending on how the information is used …
ESS – 6 month diet plan!
MIS – What should be in my refrigerator and what should not be!
DSS – which is the best food for me within my budget
KWS - Which food should be eaten with which other food for maximum energy and minimum
OS – Keeping agenda of eating timings
TPS – Transaction processing is simply a system to input the data and convert it into another form
What we see from the example: technically speaking, a simple mobile app fits in all 5 categories of information systems.
So, the Mobile Information System is here to help individuals, enterprises, and governments.
eBay: 6.5 PB (2009); Google: 1 PB of new data every 3 days (2009). It is said that Google is today the largest manufacturer of computer hardware:
to handle the data, it designs and produces its own hardware. Maybe this is because of cases like the following: once, in a security workshop, an expert shared a case in which a security-sensitive company asked proprietary vendors to submit an affidavit (registered with a court) stating that there was no backdoor chip in their servers. Not one vendor went ahead with the sale. The company built its servers from off-the-shelf motherboards and processors!
The Guardian, Tuesday 29 May 2012 - Cyber-attack concerns raised over Boeing 787 chip's 'back door'
Sergei Skorobogatov, Christopher Woods (Cambridge Univ., UK). Breakthrough silicon scanning discovers backdoor in military chip. Cryptographic Hardware and Embedded Systems Workshop (CHES-2012), 9-12 September 2012, LNCS 7428, Springer, ISBN 978-3-642-33026-1, pp. 23-40.
Researchers claim chip used in military systems and civilian aircraft has built-in function that could let in hackers
Talk of yottabytes of data is already on the cards…
In a presentation at ECAI 2012, a researcher from IBM Haifa said that from 2020 onward we should start thinking about how to handle yottabytes!
In binary, Giga to Yotta correspond to 2^30, 2^40, 2^50, 2^60, 2^70, and 2^80.
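The relationship between the decimal and binary readings of these prefixes can be checked with a short Python sketch. The function names `decimal_value` and `binary_value` are illustrative choices, not part of the lecture material.

```python
# Prefixes from Giga to Yotta, in increasing order.
PREFIXES = ["giga", "tera", "peta", "exa", "zetta", "yotta"]


def decimal_exponent(prefix):
    """Power of ten for a prefix: 9 for giga, 12 for tera, ... 24 for yotta."""
    return 9 + 3 * PREFIXES.index(prefix)


def binary_exponent(prefix):
    """Power of two for a prefix: 30 for giga, 40 for tera, ... 80 for yotta."""
    return 30 + 10 * PREFIXES.index(prefix)


def decimal_value(prefix):
    return 10 ** decimal_exponent(prefix)


def binary_value(prefix):
    return 2 ** binary_exponent(prefix)


for p in PREFIXES:
    # The gap between the two readings grows with each step up the scale.
    ratio = binary_value(p) / decimal_value(p)
    print(f"{p:>5}: 10^{decimal_exponent(p)} vs 2^{binary_exponent(p)} "
          f"(ratio {ratio:.3f})")
```

The binary value is always slightly larger than the decimal one, and the difference widens at each step, so it matters which reading a quoted storage figure uses.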
At IBM, one bit was stored on 12 atoms. Currently it takes about a million atoms to store a bit on a modern hard disk. Below 12 atoms, the researchers found that the bits randomly lost information, owing to quantum effects. That is a data density more than 80,000 times that of current disks (1,000,000 / 12 ≈ 83,000).
http://www-03.ibm.com/press/us/en/pressrelease/36473.wss
An example of sensor and machine data is found at the Large Hadron Collider at CERN, the European Organization for Nuclear Research. CERN scientists can generate 40 terabytes of data every second during experiments.
Similarly, Boeing jet engines can produce 10 terabytes of operational information for every 30 minutes they run. A four-engine jumbo jet can create 640 terabytes of data on just one Atlantic crossing; multiply that by the more than 25,000 flights flown each day, and you get an understanding of the impact that sensor and machine-produced data can have on a BI environment.
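The arithmetic behind the jet-engine figures can be reproduced as a rough sketch. The 8-hour crossing time is an assumption: it is not stated in the notes, but it is the duration implied by 640 terabytes at 10 terabytes per engine per half hour. The daily total follows the notes in multiplying by every flight, so it should be read as an upper-bound illustration only.

```python
TB_PER_ENGINE_PER_HALF_HOUR = 10   # figure quoted in the notes
ENGINES = 4                        # four-engine jumbo jet
FLIGHT_HOURS = 8                   # ASSUMPTION: implied by the 640 TB figure
FLIGHTS_PER_DAY = 25_000           # figure quoted in the notes

half_hours = FLIGHT_HOURS * 2
tb_per_flight = TB_PER_ENGINE_PER_HALF_HOUR * ENGINES * half_hours

# Treats every flight as a four-engine Atlantic crossing, as the notes do.
tb_per_day = tb_per_flight * FLIGHTS_PER_DAY
eb_per_day = tb_per_day / 10**6    # 1 exabyte = 10^6 terabytes (decimal)

print(f"Per flight: {tb_per_flight} TB")
print(f"Per day:    {tb_per_day} TB = {eb_per_day} EB")
```

Under these assumptions one crossing yields 640 TB, and the daily total lands in the exabyte range, which is the scale the "Magnitude of Data" slide refers to.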
Governments are funding research on handling open big data.
The amount of UNPROTECTED yet sensitive data is growing even faster.
Games and social networks produce millions of transactions per second. The challenge is catering for them and at the same time producing statistical reports on them.
ACID properties are not fully complied with.
Graph databases which comply with ACID properties