Chapter 2
Data Governance and IT Architecture Support Long-Term Performance
Prepared by Dr. Derek Sedlack, South University
Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
Learning Objectives
Enterprise Architecture and Data Governance
Information Systems: The Basics
Data Centers, Cloud Computing, and Virtualization
Cloud Services Add Agility
Information Management
Information Management
INFORMATION MANAGEMENT HARNESSES SCATTERED DATA
Information Management
The use of IT tools and methods to collect, process, consolidate,
store, and secure data from sources that are often fragmented
and inconsistent.
A continuous plan is needed to guide, control, and govern
IT growth.
Information management is critical to data security and
compliance with continually evolving regulatory requirements,
such as the Sarbanes-Oxley Act, Basel III, the Computer Fraud
and Abuse Act (CFAA), the USA PATRIOT Act, and the Health
Insurance Portability and Accountability Act (HIPAA).
Data Silos
Stand-alone data stores that are not accessible by other information
systems that need the data and that cannot be updated consistently.
They exist because of a lack of IT architecture, support only single
functions, and do not support cross-functional needs.
Key Performance Indicators (KPIs)
These measures demonstrate the effectiveness of a business
process at achieving organizational goals.
Present data in easy-to-comprehend and comparison-ready
formats.
KPI examples: current ratio; accounts payable turnover; net
profit margin; new followers per week; cost per lead; order
status.
Figure 2.4 Data (or information) silos are ISs that do not have the capability to exchange data with other ISs, making timely coordination and communication across functions or departments difficult.
Reasons information deficiencies are still a problem
Data Silos
Lost or bypassed data
Poorly designed interfaces
Nonstandardized data formats
Cannot hit moving targets
Figure 2.5 Factors that are increasing demand for collaboration
technology.
Global, mobile workforce
62% of the workforce works outside an office at some point.
This number is increasing.
Mobility-driven consumerization
Growing number of cloud collaboration services.
Principle of “any”
Growing need to connect anybody, anytime, anywhere on any
device
Obvious benefits of information management
Improves decision quality
Improves the accuracy and reliability of management
predictions
Reduces the risk of noncompliance
Reduces time and cost
Explain information management.
Why do organizations still have information deficiency
problems?
What is a data silo?
Explain KPIs and give an example.
What three factors are driving collaboration and information
sharing?
What are the business benefits of information management?
Suggested Answers:
1. Information management is the use of IT tools and methods
to collect, process, consolidate, store, and secure data from
sources that are often fragmented and inconsistent. A modern
organization needs to manage a variety of information which
goes beyond the structured types like numbers and texts to
include semi-structured and unstructured contents such as video
and sound. The digital library includes content from social
media, texts, photos, videos, music, documents, address books,
events, and downloads. Maintaining (updating, expanding, and
porting) the contents of an organization’s digital library on a
variety of platforms is the task of Information Management.
Specifically, Information Management deals with how
information is organized, stored, and secured, and the speed and
ease with which it is captured, analyzed and reported.
2. Over many decades, changes in technology and the
information companies require, along with different
management teams, changing priorities, and increases or
decreases in IT investments as they compete with other demands
on an organization’s budget, have all contributed to information deficiencies. Other
common reasons include: data silos (information trapped in
departments’ databases), data lost or bypassed during transit,
poorly designed user interfaces requiring extra effort from
users, non-standardized data formats, and fast-moving changes
in the type of information desired, particularly unstructured
content, requiring expensive investments.
3. A data silo is one of the data deficiencies that can be
addressed. It refers to the situation where the databases
belonging to different functional units (e.g., departments) in an
organization are not shared between the units because of a lack
of integration. Data silos support a single function and therefore
do not support the cross-functional needs of an organization.
The lack of sharing and exchange of data between functional
units raises issues regarding reliability and currency of data,
requiring extensive verification to be trusted. Data silos exist
when there is no overall IT architecture to guide IS investments,
data coordination, and communication.
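The reliability and currency problem described above can be sketched in a few lines. This is a minimal illustration only: the two "departmental databases", their field names, and the helper functions are hypothetical and are not taken from the chapter.

```python
# Two hypothetical departmental "silos" that each keep their own customer records.
sales_db = {
    101: {"name": "Acme Corp", "email": "orders@acme.example"},
    102: {"name": "Birch Ltd", "email": "info@birch.example"},
}
billing_db = {
    101: {"name": "Acme Corp", "email": "ap@acme.example"},    # email differs from sales
    103: {"name": "Cedar Inc", "email": "pay@cedar.example"},  # unknown to sales
}


def find_conflicts(a: dict, b: dict) -> dict:
    """Return {customer_id: [fields]} for records both silos hold but disagree on."""
    conflicts = {}
    for cust_id in a.keys() & b.keys():
        fields = sorted(f for f in a[cust_id] if a[cust_id].get(f) != b[cust_id].get(f))
        if fields:
            conflicts[cust_id] = fields
    return conflicts


def missing_from(a: dict, b: dict) -> list:
    """Customer IDs present in silo b but absent from silo a."""
    return sorted(b.keys() - a.keys())


if __name__ == "__main__":
    print("Conflicting fields:", find_conflicts(sales_db, billing_db))
    print("In billing, not in sales:", missing_from(sales_db, billing_db))
```

Because neither silo can see the other, every such mismatch has to be found and verified by hand before the data can be trusted, which is the extra verification effort the answer refers to.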
4. KPIs are performance measurements. These measures
demonstrate the effectiveness of a business process at achieving
organizational goals. KPIs present data in easy-to-comprehend
and comparison-ready formats. KPIs help reduce the complex
nature of organizational performance to a small number of
understandable measures.
Examples of key comparisons are actual vs. budget, actual vs.
forecasted, and this year vs. prior years.
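As a rough illustration of how a KPI reduces performance to a comparable number, the sketch below computes two of the KPIs named in the chapter (current ratio and net profit margin) and an actual vs. budget variance. The function names and all figures are hypothetical and chosen only for illustration.

```python
def current_ratio(current_assets: float, current_liabilities: float) -> float:
    """Current ratio = current assets / current liabilities."""
    return current_assets / current_liabilities


def net_profit_margin(net_profit: float, revenue: float) -> float:
    """Net profit margin as a percentage of revenue."""
    return 100.0 * net_profit / revenue


def variance_pct(actual: float, baseline: float) -> float:
    """Percentage by which actual differs from a baseline (budget, forecast, or prior year)."""
    return 100.0 * (actual - baseline) / baseline


if __name__ == "__main__":
    # Hypothetical figures, not taken from the chapter.
    print(f"Current ratio:     {current_ratio(500_000, 250_000):.2f}")
    print(f"Net profit margin: {net_profit_margin(120_000, 1_000_000):.1f}%")
    print(f"Actual vs. budget: {variance_pct(1_050_000, 1_000_000):+.1f}%")
```

The same `variance_pct` helper covers all three key comparisons; only the baseline passed in changes.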
5. Forrester (forrester.com) identified three factors driving the
trend toward collaboration and information sharing technology.
These are:
Global, mobile workforce (a growing number of employees
telecommute)
Mobility-driven consumerization (cloud-based collaboration
solutions are on the rise)
Principle of “any” (there is a growing need to connect anybody,
anytime, anywhere, and on any device)
6. The following four benefits have been identified:
Improves decision quality (due to timely response using reliable
data)
Improves the accuracy and reliability of management
predictions (“what is going to happen” as opposed to financial
reporting on “what has happened”)
Reduces the risk of noncompliance (due to improved
compliance with regulation resulting from better information
quality and governance), and
Reduces the time and cost of locating relevant information (due
to savings in time and effort through integration and
optimization of repositories)
Enterprise Architecture and Data Governance
Enterprise architecture (EA)
The way IT systems and processes are structured.
Helps or impedes day-to-day operations and efforts to execute
business strategy.
Solves two critical challenges: where are we going; how do we
get there?
Strategic Focus
IT systems’ complexity
Poor business alignment
Business and IT Benefits of EA
Cuts IT costs; increases productivity with information, insight,
and ideas
Determines competitiveness, flexibility, and IT economics
Aligns IT capabilities with business strategy to grow, innovate,
and respond to market demands
Reduces risk of buying or building systems and enterprise apps
EA Components
Business Architecture
Application Architecture
Data Architecture
Technical Architecture
Enterprise-wide Data Governance
Crosses boundaries and is used by people throughout the enterprise.
Increased importance through new regulations and pressure to
reduce costs.
Reduces legal risks associated with unmanaged or inconsistently
managed information
Dependent on Governance
Food Industry
Financial Services Industry
Health-care Industry
Master Data Management (MDM)
Creates high-quality trustworthy data:
Running the business with transactional or operational use
Improving the business with analytic use
Requires strong data governance to manage availability,
usability, integrity, and security.
Politics: The People Conflict
Cultures of distrust between technology and employees may
exist.
Genuine commitment to change, with support from senior
management, can bridge the divide.
Methodologies can only provide a framework, not solve people
problems
Explain the relationship between complexity and planning. Give
an example.
Explain enterprise architecture.
What are the four components of EA?
What are the business benefits of EA?
How can EA maintain alignment between IT and business
strategy?
What are the two ways that data are used in an organization?
What is the function of data governance?
Why has interest in data governance and MDM increased?
What role does personal conflict or politics play in the success
of data governance?
Suggested Answers:
1. As enterprise information systems become more complex, the
importance of long-range IT planning increases dramatically.
Companies cannot simply add storage, new apps, or data
analytics on an as-needed basis and expect those additions to
work with the existing systems.
The relationship between complexity and planning is easier to
see in physical things such as skyscrapers and transportation
systems. If you are constructing a simple cabin in a remote area,
you do not need a detailed plan for expansion or to make sure
that the cabin fits into its environment. If you are building a
simple, single-user, non-distributed system, you would not need
a well-thought-out growth plan either. Enterprise systems,
however, are neither simple nor isolated, so it is no longer
feasible to manage big data, content from mobile devices and social
networks, and data in the cloud without the well-designed set of
plans, or blueprint, provided by EA. The EA guides and controls
software add-ons and upgrades, hardware, systems, networks,
cloud services, and other digital technology investments.
2. Enterprise architecture (EA) is the way IT systems and
processes are structured. EA is an ongoing process of creating,
maintaining, and leveraging IT. It helps to solve two critical
challenges: where an organization is going and how it will get
there. EA helps, or impedes, day-to-day operations and efforts
to execute business strategy.
There are two problems that the EA is designed to address:
IT systems’ complexity. IT systems have become unmanageably
complex and expensive to maintain.
Poor business alignment. Organizations find it difficult to keep
their increasingly expensive IT systems aligned with business
needs.
EA is the roadmap that is used for controlling the direction of
IT investments and it is a significant item in long-range
planning. It is the blueprint that guides the build out of overall
IT capabilities consisting of four sub-architectures (see question
3). EA defines the vision, standards, and plan that guide the
priorities, operations, and management of the IT systems
supporting the business.
3. The four components are:
Business Architecture (the processes the business uses to meet
its goals);
Application architecture (design of IS applications and their
interactions);
Data architecture (organization and access of enterprise data);
Technical architecture (the hardware and software infrastructure
that supports applications and their interactions)
4. EA cuts IT costs and increases productivity by giving
decision makers access to information, insights, and ideas
where and when they need them.
EA determines an organization’s competitiveness, flexibility,
and IT economics for the next decade and beyond. That is, it
provides a long-term view of a company’s processes, systems,
and technologies so that IT investments do not simply fulfill
immediate needs.
EA helps align IT capabilities with business strategy—to grow,
innovate, and respond to market demands, supported by an IT
practice that is 100 percent in accord with business objectives.
EA can reduce the risk of buying or building systems and
enterprise apps that are incompatible or unnecessarily expensive
to maintain and integrate.
5. EA starts with the organization’s target (where it is going),
not with where it is. Once an organization identifies the
strategic direction in which it is heading and the business
drivers to which it is responding, this shared vision of the future
will dictate changes in business, technical, information, and
solutions architectures of the enterprise, assign priorities to
those changes, and keep those changes grounded in business
value. EA guides and controls software add-ons and upgrades,
hardware, systems, networks, cloud services, and other digital
technology investments which are aligned with the business
strategy.
6. Data are used in an organization for running the business
(transactional or operational use) and for improving the
business (analytic use).
7. Data governance is the process of creating and agreeing to
standards and requirements for the collection, identification,
storage, and use of data. The success of every data-driven
strategy or marketing effort depends on data governance. Data
governance policies must address structured, semi-structured,
and unstructured data (discussed in Section 2.3) to ensure that
insights can be trusted.
Data governance allows managers to determine where their data
originates, who owns them, and who is responsible for what—in
order to know they can trust the available data when needed.
Data governance is an enterprise-wide project because data
cross boundaries and are used by people throughout the
enterprise.
8. As data sources and volumes continue to increase, so does the
need to manage data as a strategic asset in order to extract its
full value. Making business data consistent, trusted, and
accessible across the enterprise is a critical first step in
customer-centric business models. With appropriate data
governance and MDM, managers are able to extract maximum
value from their data, specifically by making better use of
opportunities that are buried within behavioral data. Strong data
governance is needed to manage the availability, usability,
integrity, and security of the data used throughout the enterprise
so that data are of sufficient quality to meet business needs.
9. There may be a culture of distrust between technology and
employees in an organization. To overcome this, there must be a
genuine commitment to change. Such a commitment must come
from senior management. A methodology, such as data
governance, cannot solve people problems. It only provides a
framework in which such problems can be solved.
Information Systems: The Basics
DATA, INFORMATION, & KNOWLEDGE
Raw data describes products, customers, events, activities, and
transactions that are recorded, classified, and stored.
Information is data that have been processed, organized, or put
into context so that they have meaning and value to the recipient.
Knowledge is information that conveys understanding, experience,
and expertise as applied to a current problem or activity.
Data
Information
Knowledge
Figure 2.8 Input-processing-output model.
Transaction Processing Systems (TPS)
Internal transactions: originate or occur within the organization
(payroll, purchases, etc.).
External transactions: originate outside the organization
(customers, suppliers, etc.).
Improve sales and customer satisfaction, and reduce many
types of data errors that have financial impact.
Batch v. Online Real-Time Processing
Batch Processing: collects all transactions for a time period,
then processes the data and updates the data store.
OLTP: processes each transaction as it occurs (real-time).
Batch processing costs less than OLTP, but the data store may be
out of date between updates.
Management Information Systems (MIS)
General-purpose reporting systems that provide reports to
managers for tracking operations, monitoring, and control.
Periodic: reports created or run according to a pre-set schedule.
Exception: generated only when something is outside designated
parameters.
Ad Hoc, or On Demand: unplanned, generated as needed.
Decision Support Systems (DSS)
Interactive applications that support decision making.
Support unstructured and semi-structured decisions with the
following characteristics:
Easy-to-use interactive interface
Models or formulas that enable sensitivity analysis
Data from multiple sources
Transaction Issues
Huge volumes of database transactions cause volatility (constant
use or updates).
Volatility makes databases impractical for complex decision-making
and problem-solving tasks.
Data are therefore extracted, transformed, and loaded (ETL) into a
data warehouse, where they are better organized for analysis.
Business Process Management and Improvement
Contrast data, information, and knowledge.
Define TPS and give an example.
When is batch processing used?
When are real-time processing capabilities needed?
Explain why TPSs need to process incoming data before they
are stored.
Define MIS and DSS and give an example of each.
Why are databases inappropriate for doing data analysis?
Suggested Answers:
1. Data, or raw data, refers to a basic description of products,
customers, events, activities, and transactions that are recorded,
classified, and stored. Data are the raw material from which
information is produced and the quality, reliability and integrity
of the data must be maintained for the information to be useful.
Information is data that has been processed, organized, or put
into context so that it has meaning and value to the person
receiving it.
Knowledge consists of data and/or information that have been
processed, organized, and put into context to be meaningful,
and to convey understanding, experience, accumulated learning,
and expertise as they apply to a current problem or activity.
2. Transaction processing systems are designed to process
specific types of data input from ongoing transactions. TPSs can
be manual, as when data are typed into a form on a screen, or
automated by using scanners or sensors to capture data.
A TPS processes organizational data such as sales orders,
payroll, accounting, financial, marketing, purchasing, and inventory
control data. Transactions are either:
Internal transactions: Transactions that originate from within
the organization or that occur within the organization. Examples
are payroll, purchases, budget transfers, and payments (in
accounting terms, they’re referred to as accounts payable).
External transactions: Transactions that originate from outside
the organization, e.g., from customers, suppliers, regulators,
distributors, and financing institutions.
TPSs are essential systems. Transactions that do not get
captured can result in lost sales, dissatisfied customers, and
many other types of data errors having financial impact. For
example, if accounting issues a check as payment for an invoice
(bill) and that check is cashed, if that transaction is not
captured, the amount of cash on the financial statements is
overstated, the invoice continues to show as unpaid, and the
invoice may be paid a second time. Or if services are provided,
but not recorded, the company loses that service revenue.
3. Batch processing is used when there are multiple transactions
which can be accumulated and processed at one time. These
transactions are not as time sensitive as those that need to be
processed in real time. The transactions may be collected for a
day, a shift, or over another period of time, and then they are
processed. Batch processing often is used to process payroll in a
weekly or bi-weekly manner. Batch processing is less costly
than real-time processing.
4. Online transaction processing (OLTP), or real-time
processing, is used when a system must be updated as each
transaction occurs. The input device or website for entering
transactions must be directly linked to the transaction
processing system (TPS). This type of entry is used for more
time sensitive data, such as reservation systems in which the
user must know how many seats or rooms are available.
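The contrast between answers 3 and 4 can be sketched as follows. This is a simplified model under stated assumptions, not a real TPS: the `Inventory`, `RealTimeProcessor`, and `BatchProcessor` classes and the flight code are hypothetical names introduced for illustration.

```python
class Inventory:
    """Hypothetical data store: seats available per flight."""

    def __init__(self, seats: dict):
        self.seats = dict(seats)

    def apply(self, flight: str, qty: int) -> None:
        self.seats[flight] -= qty


class RealTimeProcessor:
    """OLTP style: the data store is updated as each transaction occurs."""

    def __init__(self, store: Inventory):
        self.store = store

    def submit(self, flight: str, qty: int) -> None:
        self.store.apply(flight, qty)


class BatchProcessor:
    """Batch style: transactions are collected, then processed together."""

    def __init__(self, store: Inventory):
        self.store = store
        self.pending = []

    def submit(self, flight: str, qty: int) -> None:
        self.pending.append((flight, qty))  # the store is NOT updated yet

    def run_batch(self) -> int:
        """Process all pending transactions; return how many were applied."""
        count = len(self.pending)
        for flight, qty in self.pending:
            self.store.apply(flight, qty)
        self.pending.clear()
        return count


if __name__ == "__main__":
    rt_store, batch_store = Inventory({"F100": 10}), Inventory({"F100": 10})
    rt, batch = RealTimeProcessor(rt_store), BatchProcessor(batch_store)
    for proc in (rt, batch):
        proc.submit("F100", 3)
    print(rt_store.seats, batch_store.seats)  # batch store is stale until run_batch()
    batch.run_batch()
    print(rt_store.seats, batch_store.seats)
```

Between `submit` and `run_batch`, the batch store still shows the old seat count. That delay is acceptable for payroll but not for a reservation system, where the user must see current availability.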
5. Processing improves data quality, which is important because
reports and decisions are only as good as the data they are based
on. As data is collected or captured, it is validated to detect and
correct obvious errors and omissions.
Data errors detected later may be difficult or time-consuming
to correct. You can better understand the difficulty of
detecting and correcting errors by considering identity theft.
Victims of identity theft face enormous challenges and
frustration trying to correct data about them.
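Validation at the point of capture might look like the sketch below. The record layout and the three rules are assumptions made for illustration; a real TPS would apply whatever checks its own data standards require.

```python
from datetime import date


def validate_sale(record: dict) -> list:
    """Return a list of problems found in a sales record (empty list = valid).

    The field names and rules are hypothetical examples of input checks.
    """
    errors = []
    # Omissions: every required field must be present and non-empty.
    for field in ("customer_id", "amount", "sale_date"):
        if record.get(field) in (None, ""):
            errors.append(f"missing {field}")
    # Obvious errors: a sale cannot have a zero or negative amount.
    amount = record.get("amount")
    if isinstance(amount, (int, float)) and amount <= 0:
        errors.append("amount must be positive")
    # Obvious errors: a sale cannot be dated in the future.
    sale_date = record.get("sale_date")
    if isinstance(sale_date, date) and sale_date > date.today():
        errors.append("sale_date is in the future")
    return errors
```

A record is stored only when `validate_sale` returns an empty list; otherwise the problems are reported while the person entering the data can still correct them.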
6. General purpose reporting systems are referred to as
management information systems (MIS). Their objective is to
provide reports to managers for tracking operations, monitoring,
and control.
MIS is used by middle managers in functional areas and
provides routine information for planning, organizing, and
controlling operations. Types of reports include:
Periodic: reports created to run according to a pre-set schedule,
such as daily, weekly, and quarterly.
Exception: reports generated only when something is outside the
norm, either higher or lower than expected. An example might
be increased sales in a hardware store prior to a hurricane.
Ad hoc, or on demand, reports are unplanned reports generated
as needed.
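An exception report can be sketched as a filter that produces output only when a value falls outside designated parameters. The thresholds and sales figures below are hypothetical.

```python
def exception_report(daily_sales: dict, low: float, high: float) -> dict:
    """Return only the entries that fall outside the designated range [low, high].

    An empty result means nothing needs reporting, so no report is issued.
    """
    return {day: amount for day, amount in daily_sales.items()
            if amount < low or amount > high}


if __name__ == "__main__":
    # Hypothetical daily sales for a hardware store; thresholds are illustrative.
    sales = {"Mon": 4_100, "Tue": 3_900, "Wed": 9_800, "Thu": 4_300}
    flagged = exception_report(sales, low=3_000, high=6_000)
    if flagged:
        print("Exception report:", flagged)
```

A periodic report would print every day on a schedule; the exception report stays silent unless something like the pre-hurricane spike occurs.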
Decision support systems (DSS) are interactive applications that
support decision making. Configurations of a DSS range from
relatively simple applications that support a single user to
complex enterprise-wide systems. A DSS can support the
analysis and solution of a specific problem, the evaluation of a
strategic opportunity, or ongoing operations. These
systems support unstructured and semi-structured decisions,
such as whether to make-or-buy-or-outsource products, or what
new products to develop and introduce into existing markets.
Decision support systems are used by decision makers and
managers to combine models and data to solve semi-structured
and unstructured problems with user involvement.
To provide such support, DSSs have certain characteristics to
support the decision maker and the decision making process.
Three defining characteristics of DSSs are:
an easy-to-use interactive interface
models that enable sensitivity analysis, what-if analysis, goal
seeking, and risk analysis
data from multiple sources - internal and external sources plus
data added by the decision maker who may have insights
relevant to the decision situation.
Having models is what distinguishes DSS from MIS. Some
models are developed by end users through an interactive and
iterative process. Decision makers can manipulate models to
conduct experiments and sensitivity analyses, such as what-if,
and goal-seeking. What-if analysis refers to changing
assumptions or data in the model to see the impacts of the
changes on the outcome. For example, if sales forecasts are
based on a 5 percent increase in customer demand, a what-if
analysis would replace the 5 percent with higher and/or lower
demand estimates to determine what would happen to sales if
the demands were different. With goal seeking, the decision
maker has a specific outcome in mind and needs to figure out
how that outcome could be achieved and whether it’s feasible to
achieve that desired outcome. A DSS also can estimate the risk
of alternative strategies or actions.
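What-if analysis and goal seeking can be sketched with a deliberately simple model. The linear sales model, the function names, and the figures are assumptions for illustration; a real DSS model would be richer, and bisection is just one way to perform goal seeking.

```python
def forecast_sales(base_sales: float, demand_growth: float) -> float:
    """Toy model: sales scale linearly with the change in customer demand."""
    return base_sales * (1.0 + demand_growth)


def what_if(base_sales: float, growth_scenarios: list) -> dict:
    """Re-run the model under alternative demand assumptions."""
    return {g: forecast_sales(base_sales, g) for g in growth_scenarios}


def goal_seek(base_sales: float, target: float,
              lo: float = -1.0, hi: float = 1.0, tol: float = 1e-6) -> float:
    """Find the demand growth that yields the target sales, by bisection.

    Assumes the model increases with growth and the answer lies in [lo, hi].
    """
    if not forecast_sales(base_sales, lo) <= target <= forecast_sales(base_sales, hi):
        raise ValueError("target is not achievable within the search range")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if forecast_sales(base_sales, mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # What-if: replace the 5 percent assumption with lower and higher estimates.
    print(what_if(1_000_000, [0.03, 0.05, 0.08]))
    # Goal seeking: what growth is needed to reach 1.1 million in sales?
    print(round(goal_seek(1_000_000, 1_100_000), 4))
```

What-if analysis runs the model forward from changed assumptions; goal seeking runs it backward from a desired outcome, and the `ValueError` corresponds to finding that the outcome is not feasible within the range considered.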
California Pizza Kitchen (CPK) uses a DSS to support inventory
decisions. CPK has 77 restaurants located in various states in
the U.S. Maintaining inventory of all restaurants at optimal
levels was challenging and time-consuming. A DSS has made it
easy for the managers to keep records updated and make
decisions. Many CPK restaurants increased sales by 5 percent
after implementing a DSS.
7. Databases are used for recording and processing transactions.
Due to the number of transactions, the data in the databases are
constantly in a state of change making it difficult to use for
complex decision making.
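The usual remedy, covered on the Transaction Issues slide, is to move data into a data warehouse through extract, transform, and load (ETL). The sketch below shows the three steps using Python's built-in `sqlite3` module. The table names and rows are hypothetical, and a single in-memory database stands in for what would normally be separate operational and warehouse systems.

```python
import sqlite3


def run_etl(conn: sqlite3.Connection) -> None:
    """Extract raw sales rows, transform them into daily totals, load a summary table."""
    # Extract: read from the (volatile) operational table.
    rows = conn.execute("SELECT sale_date, amount FROM sales").fetchall()

    # Transform: aggregate individual transactions into one total per day.
    totals = {}
    for sale_date, amount in rows:
        totals[sale_date] = totals.get(sale_date, 0.0) + amount

    # Load: write the stable, analysis-ready snapshot to the warehouse table.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_sales (sale_date TEXT PRIMARY KEY, total REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO daily_sales VALUES (?, ?)",
                     sorted(totals.items()))
    conn.commit()


def demo() -> list:
    """Build a tiny in-memory example and return the warehouse rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (sale_date TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)",
                     [("2015-01-05", 20.0), ("2015-01-05", 30.0), ("2015-01-06", 15.0)])
    run_etl(conn)
    return conn.execute(
        "SELECT sale_date, total FROM daily_sales ORDER BY sale_date"
    ).fetchall()
```

Analysts query `daily_sales`, which changes only when the ETL job runs, instead of the `sales` table, which changes with every transaction.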
Data Centers, Cloud Computing, and Virtualization
IT Infrastructures
On-premises data centers
Virtualization
Cloud Computing
Data Centers
Large numbers of network servers used for the storage,
processing, management, distribution, and archiving of data,
systems, Web traffic, services, and enterprise applications.
National Climatic Data Center
U.S. National Security Agency
Apple
Business Is Reliant upon Data
Uber (car-hailing service)
Users flooded social media with complaints.
WhatsApp (smartphone text-messaging service)
Competition added 2 million new registered users within 24
hours of WhatsApp outage (a record).
Unified Data Center
Cisco’s single solution integrating computing, storage,
networking, virtualization, and management into a single
(unified) platform.
Virtualization gives greater IT flexibility and cuts costs:
Instant access to data any time in any format
Respond faster to changing data analytic needs
Cut complexity and cost
Unified Data Center compared to traditional data integration
and replication methods:
Greater Agility
Streamlined Approach
Better Insight
What is “The Cloud”?
A general term for infrastructure that uses the Internet and
private networks to access, share, and deliver computing
resources.
Scalable delivery as a service to end-users over a network.
Because it is a newer technology, it should be approached with
greater diligence than other IT decisions, including attention to
vendor management and service-level agreements (SLAs).
Service-Level Agreements
A negotiated agreement between a company and service
provider that can be a legally binding contract or an informal
contract.
The goal is not building the best SLA terms, but getting the
terms that are most meaningful to the business.
Types of Clouds
Private Cloud: Single-tenant environments with stronger
security and control (retained) for regulated industries and
critical data.
Public Cloud: Multiple-tenant virtualized services utilizing the
same pool of servers across a public network (distributed).
Cloud Infrastructure
Provided on demand for storage virtualization, network
virtualization, and hardware virtualization.
Software or virtualization layer creates virtual machines (VMs)
where the CPU, RAM, HD, NIC, and other components behave
as hardware, but are created with software.
Virtualization
A virtual machine is created by a software layer (the virtualization
layer) and contains its own operating system and applications, as a
physical computer does.
Infrastructure as a Service
Platform as a Service
Software as a Service
Figure 2.17 Virtual machines running on a simple computer
hardware layer.
Characteristics & Benefits
Memory-intensive
Huge amounts of RAM due to massive processing requirements
Energy-efficient
Up to 95% reduction in energy use per server through less
physical hardware
Scalability and load balancing
Handles dynamic demand requests like during the Super Bowl
or World Series
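Load balancing can be sketched as a dispatcher that spreads incoming requests across the available servers. The slide does not name a balancing policy; round robin is used here only because it is the simplest to show, and the class and server names are hypothetical.

```python
from collections import Counter
from itertools import cycle


class RoundRobinBalancer:
    """Distributes incoming requests across servers in turn."""

    def __init__(self, servers: list):
        if not servers:
            raise ValueError("at least one server is required")
        self._next_server = cycle(servers)

    def route(self, request_id: int) -> str:
        """Return the server that should handle this request."""
        return next(self._next_server)


def simulate(num_requests: int, servers: list) -> Counter:
    """Route num_requests requests and count how many each server received."""
    balancer = RoundRobinBalancer(servers)
    return Counter(balancer.route(i) for i in range(num_requests))


if __name__ == "__main__":
    # Adding a VM to the list is how this toy model "scales" for a demand spike.
    print(simulate(9, ["vm-a", "vm-b", "vm-c"]))
```

During a spike in demand, more virtual machines can be added to the server list so that no single machine absorbs the whole load.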
What is a data center?
Describe cloud computing.
What is the difference between data centers and cloud
computing?
What are the benefits of cloud computing?
How can cloud computing solve the problems of managing
software licenses?
What is an SLA? Why are SLAs important?
What factors should be considered when selecting a cloud
vendor or provider?
When are private clouds used instead of public clouds?
Explain three issues that need to be addressed when moving to
cloud computing or services.
How does a virtual machine (VM) function?
Explain virtualization.
What are the characteristics and benefits of virtualization?
When is load balancing important?
Suggested Answers:
1. A data center consists of a large number of network servers
(Figure 2.13) used for the storage, processing, management,
distribution, and archiving of data, systems, Web traffic,
services, and enterprise applications. Data center also refers to
the building or facility that houses the servers and equipment.
2. Cloud computing is the general term for infrastructures that
use the Internet and private networks to access, share, and
deliver computing resources.
3. A main difference between a cloud and data center is that a
cloud is an off-premise form of computing that stores data on
the Internet. In contrast, a data center refers to on-premises
hardware and equipment that store data within an organization’s
local network. Cloud services are outsourced to a third-party
cloud provider who manages the updates, security, and ongoing
maintenance. Data centers are typically run by an in-house IT
department.
A data center is owned by the company. Since only the company
owns the infrastructure, a data center is more suitable for
organizations that run many different types of applications and
have complex workloads. A data center, like a factory, has
limited capacity. Once it is built, the amount of storage and the
workload the center can handle does not change without
purchasing and installing more equipment.
A data center is physically connected to a local network, which
makes it easier to restrict access to apps and information to
only authorized, company-approved people and equipment.
However, the cloud is accessible by anyone with the proper
credentials and Internet connection. This accessibility
arrangement increases exposure to company data at many more
entry and exit points.
Cloud computing is the delivery of computing and storage
resources as a service to end-users over a network. With cloud
computing, shared resources (such as hard drives for storage)
and software apps are provided to computers and other devices
on-demand, like a public utility. That is, it’s similar to
electricity: a utility that companies have available on demand
and pay for based on usage. Cloud systems are
scalable. That is, they can be adjusted to meet changes in
business needs.
A drawback of the cloud is control because a third party
manages it. Companies do not have as much control as they do
with a data center.
4. Answers may vary.
Many IT infrastructures are extremely expensive to manage and
too complex to easily adapt. Because cloud computing resources
are scalable “on demand”, this increases IT agility and
responsiveness. In a business world where first movers gain the
advantage, IT responsiveness and agility provide a competitive
edge. Access to data in the cloud is possible via any device that
can access the Internet, allowing users to be more responsive
and productive.
Cloud services are outsourced to a third-party cloud provider
who manages the updates, security, and ongoing maintenance,
including backups and disaster recovery, relieving this burden
from the business. The business saves the costs of increased
staff, power consumption, and disposal of discontinued
hardware. Additionally, cloud services significantly reduce IT
costs and complexity through improved workload optimization
and service delivery.
5. Cloud computing makes it more affordable for companies to
use services that in the past would have been packaged as
software and required buying, installing and maintaining on any
number of individual machines. A major type of service
available via the cloud is called software as a service, or SaaS.
Because applications are hosted by vendors and provided on
demand, rather than via physical installations or seat licenses (a
key characteristic of cloud computing), applications are
accessed online through a Web browser instead of stored on a
computer. Companies pay only for the computing resources or
services they use. Vendors handle the upgrades and companies
do not purchase or manage software licenses. They simply pay
for the number of concurrent users.
6. An SLA is a negotiated agreement between a company and
service provider that can be a legally binding contract or an
informal contract.
An SLA serves “as a means of formally documenting the
service(s), performance expectations, responsibilities, and
limits between cloud service providers and their users. A typical
SLA describes levels of service using various attributes such as:
availability, serviceability, performance, operations, billing,
and penalties associated with violations of such attributes.”
(Cloud Standards Customer Council, 2012, pp. 5–6.)
7. See Table 2.5:
8. Companies or government agencies set up their own private
clouds when they need stronger security and control for
regulated industries and critical data.
9. Issues that need to be addressed when moving to public cloud
computing or services include:
Infrastructure issues – Cloud computing runs on a shared
infrastructure so there is less customization for a company’s
specific requirements. The network and WAN (wide area
network) become more critical in the IT infrastructure. Network
bandwidth is also an issue as enough is needed to support the
increase in network traffic. With cloud computing, it may be
more difficult to get to the root of performance problems, like
the unplanned outages that occurred with Google’s Gmail and
Workday’s human resources apps. The trade-off is cost vs.
control.
Disruption issues – There is a risk of disrupting operations or
customers in the process of moving operations to the cloud.
Management issues – Putting part of the IT architecture or
workload into the cloud requires different management
approaches, different IT skills, and knowing how to manage
vendor relationships and contracts.
(The astute student may also describe the following:
Strategic issues such as deciding which workloads to export to
the cloud; which set of standards to follow for cloud computing;
how to resolve privacy and security issues; and how
departments or business units will get new IT resources.)
10. A virtual machine (VM) is a software layer that runs its own
Operating System (OS) and apps as if it were a physical
computer. A VM behaves exactly like a physical computer and
contains its own virtual (software based) CPU, RAM, hard drive
and Network Interface Card. An OS cannot tell the difference
between a VM and a physical machine, nor can apps or other
computers on a network tell the difference. (See Figure 2.17 for
details.)
11. Virtualization is a concept that has several meanings in IT
and therefore several definitions. The major type of
virtualization is hardware virtualization, which remains popular
and widely used. Virtualization is often a key part of an
enterprise’s disaster recovery plan. In general, virtualization
separates business applications and data from hardware
resources. This separation allows companies to pool hardware
resources—rather than to dedicate servers to applications—and
assign those resources to applications as needed. The major
types of virtualization are the following:
Storage virtualization is the pooling of physical storage from
multiple network storage devices into what appears to be a
single storage device that is managed from a central console.
Network virtualization combines the available resources in a
network by splitting the network load into manageable parts,
each of which can be assigned (or reassigned) to a particular
server on the network.
Hardware virtualization is the use of software to emulate
hardware or a total computer environment other than the one the
software is actually running in. It allows a piece of hardware to
run multiple operating system images at once. This kind of
software is sometimes known as a virtual machine.
Virtualization increases the flexibility of IT assets, allowing
companies to consolidate IT infrastructure, reduce maintenance
and administration costs, and prepare for strategic IT initiatives.
Virtualization is not primarily about cost-cutting, which is
a tactical reason. More importantly, for strategic reasons,
virtualization is used because it enables flexible sourcing and
cloud computing.
12. Memory-intensive: VMs need a huge amount of RAM
(random access memory, or primary memory) because of their
massive processing requirements.
Energy-efficient: VMs minimize energy consumed running and
cooling servers in the data center— representing up to a 95
percent reduction in energy use per server.
Scalability and load balancing: Virtualization provides load
balancing to handle the demand for requests to the site. The
VMware infrastructure automatically distributes the load across
a cluster of physical servers to ensure the maximum
performance of all running VMs.
13. When a big event happens, such as the Super Bowl, millions
of people hit a Web site at the same time. Virtualization
provides load balancing to handle the demand for requests to
the site. Load balancing is key to solving many of today’s IT
challenges.
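As a rough illustration of the load balancing described above, the sketch below spreads a burst of requests across a small cluster by always choosing the server with the fewest assigned requests. The server names and the least-loaded policy are assumptions made for illustration; the source does not state which policy any particular product uses.

```python
def assign_requests(servers, request_count):
    """Assign each request to the server that currently has the fewest requests."""
    load = {name: 0 for name in servers}
    for _ in range(request_count):
        # Pick the least-loaded server; ties go to the earliest-listed server.
        target = min(servers, key=lambda name: load[name])
        load[target] += 1
    return load


# Hypothetical cluster of three physical servers handling ten requests.
cluster = ["server-a", "server-b", "server-c"]
distribution = assign_requests(cluster, 10)
print(distribution)
```

Because no server ends up with more than one request above any other, a sudden spike is shared rather than landing on a single machine.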
Cloud Services Add Agility
Software as a Service (SaaS)
End-user apps, like SalesForce
Platform as a Service (PaaS)
Tools and services making coding and deployment faster and
more efficient, like Google App Engine
Infrastructure as a Service (IaaS)
Hardware and software that power computing resources, like
EC2 & S3 (Amazon Web Services)
Data as a Service (DaaS)
Data shared among clouds, systems, and apps, regardless of the data
source or storage location.
Chapter 2
Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
Cloud Services Add Agility
Data as a Service (DaaS)
Easier for data architects to select data from different pools,
filter out sensitive data, and make the remaining data available
on-demand.
Shifts the risks and burdens of data management to a
third-party cloud provider.
Chapter 2
Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
Cloud Services Add Agility
Cloudy Weather Ahead?
Companies using as-a-service models (such as CRM and HR
management) are still responsible for regulatory compliance.
Legal departments become involved due to high stakes around
legal and compliance issues.
Cost savings, flexibility, and improved responsiveness require IT,
legal, and senior management oversight.
Chapter 2
Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
Cloud Services Add Agility
What is SaaS?
Describe the cloud computing stack.
What is PaaS?
What is IaaS?
Why is DaaS growing in popularity?
How might companies risk violating regulation or compliance
requirements with cloud services?
Chapter 2
Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
Suggested Answers:
1. Any software that is provided on demand is referred to as
software as a service, or SaaS.
SaaS is a widely used model in which software is available to
users as needed. Specifically, in SaaS, a service provider hosts
the application at its data center and customers access it via a
standard Web browser. Other terms for SaaS are on-demand
computing and hosted services. The idea is basically the same:
Instead of buying and installing expensive packaged enterprise
applications, users can access software apps over a network,
with an Internet browser being the only necessity. A SaaS
provider licenses an application to customers either on-demand,
through a subscription, based on usage (pay-as-you-go), or
increasingly at no cost when the opportunity exists to generate
revenue from advertisements or through other methods.
2. The cloud computing stack consists of the following three
categories:
SaaS apps are designed for end-users.
PaaS is a set of tools and services that make coding and
deploying these apps faster and more efficient.
IaaS consists of hardware and software that power computing
resources— servers, storage, operating systems, and networks.
See Figure 2.19 for a graphical representation.
3. PaaS provides a standard unified platform for app
development, testing, and deployment, thus benefiting software
development. This computing platform allows the creation of
Web applications quickly and easily without the complexity of
buying and maintaining the underlying infrastructure. Without
PaaS, the cost of developing some apps would be prohibitive.
The trend is for PaaS to be combined with IaaS.
4. Infrastructure as a service (IaaS) is a way of delivering cloud
computing infrastructure as an on-demand service. Rather than
purchasing servers, software, data center space, or networks,
companies instead buy all computing resources as a fully
outsourced service.
5. The DaaS model is growing in popularity as data become
more complex, difficult, and expensive to maintain.
Data as a service (DaaS) enables data to be shared among
clouds, systems, apps, and so on regardless of the data source or
where they are stored. DaaS makes it easier for data architects
to select data from different pools, filter out sensitive data, and
make the remaining data available on-demand. A key benefit of
DaaS is that it shifts the risks and burdens of data
management to a third-party cloud provider.
6. Companies are increasingly adopting software, platform,
infrastructure, and data management as services, and are starting to
embrace mobility as a service and big data as a service, because they
typically no longer have to worry about the costs of buying,
maintaining, or updating their own data servers. However, regulations
mandate that confidential data be protected regardless of
whether the data are on-premises or in the cloud. Therefore, a
company’s legal department needs to get involved in these IT
decisions. Put simply, moving to cloud services is not simply an
IT decision because the stakes around legal and compliance
issues are very high.
Chapter 3
Data Management,
Big Data Analytics, and
Records Management
Prepared by Dr. Derek Sedlack, South University
Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
Learning Objectives
Data Warehouse and Big Data Analytics
Data and Text Mining
Business Intelligence
Electronic Records Management
Database Management Systems
Database Management Systems
Databases
Collections of data sets or records stored in a systematic way.
Stores data generated by business apps, sensors, operations, and
transaction-processing systems (TPS).
The data in databases are extremely volatile.
Medium and large enterprises typically have many databases of
various types.
Volatile data change frequently.
Chapter 3
Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
Database Management Systems
Data Warehouses
Integrate data from multiple databases and data silos, and
organize them for complex analysis, knowledge discovery, and
to support decision making.
May require formatting, processing, and/or standardization.
Loaded at specific times making them non-volatile and ready
for analysis.
Chapter 3
Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
Database Management Systems
Data Marts
Small-scale data warehouses that support a single function or
one department.
Enterprises that cannot afford to invest in data warehousing may
start with one or more data marts.
Chapter 3
Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
Database Management Systems
Business intelligence (BI)
Tools and techniques that process data and conduct statistical
analysis for insight and discovery.
Used to discover meaningful relationships in the data, stay
informed in real time, detect trends, and identify opportunities
and risks.
Chapter 3
Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
Database Management Systems
Database Management System (DBMS)
Integrate with data collection systems such as TPS and business
applications.
Stores data in an organized way.
Provides facilities for accessing and managing data.
Relational DBMSs are the standard database model adopted by most enterprises.
They store data in tables consisting of columns and rows, similar to
the format of a spreadsheet.
Chapter 3
Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
Database Management Systems
Relational Database Management System (RDBMS)
Provides access to data using a declarative language.
Declarative Language
Simplifies data access by requiring that users only specify what
data they want to access without defining how it will be
retrieved.
Structured Query Language (SQL) is an example of a
declarative language:
SELECT column_name(s)
FROM table_name
WHERE condition
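To make the SELECT / FROM / WHERE template concrete, here is a minimal sketch using Python's built-in sqlite3 module. The orders table, its columns, and the sample rows are hypothetical and exist only for illustration.

```python
import sqlite3

# In-memory database with a hypothetical "orders" table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [("Acme", 120.0), ("Globex", 45.5), ("Acme", 80.0)],
)

# Declarative query: it states WHAT rows are wanted (orders over 50),
# not HOW the database should locate them.
rows = conn.execute(
    "SELECT customer, total FROM orders WHERE total > ? ORDER BY id",
    (50,),
).fetchall()
print(rows)
conn.close()
```

The query never mentions indexes, loops, or file layout; the DBMS decides how to retrieve the matching rows.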
Chapter 3
Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
Database Management Systems
DBMS Functions
Data filtering and profiling
Data integrity and maintenance
Data synchronization
Data security
Data access
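The first function in the list, data filtering and profiling, amounts to inspecting data for errors, redundancies, and incomplete information. A minimal sketch of that idea follows; the field names, sample records, and the rule that an empty value counts as missing are all assumptions for illustration, not features of any particular DBMS.

```python
def profile(records, required_fields):
    """Report the positions of incomplete rows and exact duplicate rows."""
    incomplete, duplicates, seen = [], [], set()
    for index, record in enumerate(records):
        # A row is incomplete if any required field is absent or empty.
        if any(not record.get(field) for field in required_fields):
            incomplete.append(index)
        # Rows with identical field/value pairs are flagged as redundant.
        key = tuple(sorted(record.items()))
        if key in seen:
            duplicates.append(index)
        seen.add(key)
    return {"incomplete": incomplete, "duplicates": duplicates}


# Hypothetical customer records: row 1 lacks an email, row 2 repeats row 0.
customers = [
    {"id": 1, "name": "Ann Lee", "email": "ann@example.com"},
    {"id": 2, "name": "Bo Chen", "email": ""},
    {"id": 1, "name": "Ann Lee", "email": "ann@example.com"},
]
report = profile(customers, ["id", "name", "email"])
print(report)
```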
Chapter 3
Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
Database Management Systems
Online Transaction Processing and Online Analytics Processing
Online Transaction Processing (OLTP)
Designed to manage transaction data, which are volatile.
Breaks down complex information into simpler data tables to strike a
balance between transaction-processing efficiency and query
efficiency.
Cannot be optimized for data mining
Online Analytics Processing (OLAP)
A means of organizing large business databases.
Divided into one or more cubes that fit the way business is
conducted.
Chapter 3
Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
Database Management Systems
DBMSs (mid-2014)
Oracle’s MySQL
Microsoft’s SQL Server
PostgreSQL
IBM’s DB2
Teradata Database.
Chapter 3
Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
Database Management Systems
Trend Toward NoSQL Systems
Higher performance
Easy distribution of data on different nodes, which
enables scalability and fault tolerance
Greater flexibility
Simpler administration
Chapter 3
Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
Database Management Systems
Centralized and Distributed Database Architecture
Centralized Database Architecture
Better control of data quality.
Better IT security.
Distributed Database Architecture
Allow both local and remote access.
Use client/server architecture to process requests.
Chapter 3
Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
Database Management Systems
Garbage In, Garbage Out
Dirty Data
Lacks integrity/validation and reduces user trust.
Incomplete, out of context, outdated, inaccurate, inaccessible,
or overwhelming.
Chapter 3
Cost of Poor Quality Data = Lost Business + Cost to Prevent Errors + Cost to Correct Errors
Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
Database Management Systems
Principle of Diminishing Data Value
The value of data diminishes as they age.
Blind spots (lack of data availability) of 30 days or longer
inhibit peak performance.
Global financial services institutions rely on near-real-time data
for peak performance.
Principle of 90/90 Data Use
A majority of stored data, as high as 90 percent, is seldom accessed after 90 days (except
for auditing purposes).
Roughly 90 percent of data lose most of their value after 3
months.
Chapter 3
Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
Database Management Systems
Principle of data in context
The capability to capture, process, format, and distribute data in
near real time or faster requires a huge investment in data
architecture.
The investment can be justified on the principle that data must
be integrated, processed, analyzed, and formatted into
“actionable information.”
Chapter 3
Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
Database Management Systems
Data Life Cycle
Chapter 3
Figure 3.11 Data life cycle.
Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
Database Management Systems
Chapter 3
Figure 3.12 An enterprise has transactional, master, and
analytical data.
Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
Database Management Systems
Describe a database and a database management system
(DBMS).
Explain what an online transaction-processing (OLTP) system
does.
Why are data in databases volatile?
Explain what processes DBMSs are optimized to perform.
What are the business costs or risks of poor data quality?
Describe the data life cycle.
What is the function of master data management (MDM)?
Chapter 3
Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
Suggested Answers:
1. A database is a collection of data sets or records stored in a
systematic way. A database stores data generated by business
apps, sensors, and transaction processing systems. Databases
can provide access to all of the organization’s data collected for
a particular function or enterprise-wide, alleviating many of the
problems associated with data file environments. Central
storage of data in a database reduces data redundancy, data
isolation, and data inconsistency and allows for data to be
shared among users of the data. In addition, security and data
integrity are easier to control, and applications are independent
of the data they process. There are two basic types of databases:
centralized and distributed.
A database management system (DBMS) is software used to
manage the additions, updates, and deletions of data as
transactions occur; and support data queries and reporting.
DBMSs integrate with data collection systems such as TPS and
business applications; store the data in an organized way; and
provide facilities for accessing and managing that data.
2. OLTP is a database design that breaks down complex
information into simple data tables in order to be efficient for
capturing transactional data, including additions, updates, or
deletions. OLTP databases are capable of processing millions of
transactions every second.
3. Data in databases are volatile because they can be updated
millions of times every second, especially if they are
transaction processing systems (TPS).
4. Data filtering and profiling: Inspecting the data for errors,
inconsistencies, redundancies, and incomplete information.
Data integrity and maintenance: Correcting, standardizing, and
verifying the consistency and integrity of the data.
Data synchronization: Integrating, matching, or linking data
from disparate sources.
Data security: Checking and controlling data integrity over
time.
Data access: Providing authorized access to data in both
planned and ad hoc ways within acceptable time.
5. Poor quality data cannot be trusted and may result in the
inability to make intelligent business decisions. Poor data may
lead to lost business opportunities, increased time and effort
trying to prevent errors, increased time and effort trying to
correct errors, misallocation of resources, flawed strategies,
incorrect orders, and customers becoming frustrated and driven
away.
The cost of poor quality data spreads throughout the company
affecting systems from shipping and receiving to accounting and
customer services. Errors can be difficult, time-consuming, and
expensive to correct, and the impacts of errors can be
unpredictable or serious.
6. Three general data principles relate to the data life cycle
perspective and help to guide IT investment decisions.
Principle of diminishing data value. Viewing data in terms of a
life cycle focuses attention on how the value of data diminishes
as the data age. The more recent the data, the more valuable
they are. This is a simple, yet powerful, principle. Most
organizations cannot operate at peak performance with blind
spots (lack of data availability) of 30 days or longer.
Principle of 90/90 data use. Being able to act on real-time or
near real-time operational data can have significant advantages.
According to the 90/90 data-use principle, a majority of stored
data, as high as 90 percent, is seldom accessed after 90 days
(except for auditing purposes). Put another way, roughly 90
percent of data lose most of their value after three months.
Principle of data in context. The capability to capture, process,
format, and distribute data in near real-time or faster requires a
huge investment in data management architecture and
infrastructure to link remote POS systems to data storage, data
analysis systems, and reporting applications. The investment
can be justified on the principle that data must be integrated,
processed, analyzed, and formatted into “actionable
information.” End users need to see data in a meaningful format
and context if the data are to guide their decisions and plans.
7. Master data management (MDM) is a process whereby
companies integrate data from various sources or enterprise
applications to provide a more complete or unified view of an
entity (customer, product, etc.). Although vendors may claim
that their MDM solution creates “a single version of the truth,”
this claim probably is not true. In reality, MDM cannot create a
single unified version of the data because constructing a
completely unified view of all master data simply is not
possible. Realistically, MDM consolidates data from various
data sources into a master reference file, which then feeds data
back to the applications, thereby creating accurate and
consistent data across the enterprise.
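The consolidation step described above can be sketched as merging records from several applications into one master record per entity. The source systems (a CRM list and a billing list), the customer_id key, and the rule that the first non-empty value wins are assumptions chosen for illustration; real MDM tools apply far richer matching and survivorship rules.

```python
def build_master(sources):
    """Merge records from several sources into one master record per customer."""
    master = {}
    for records in sources:
        for record in records:
            entry = master.setdefault(record["customer_id"], {})
            for field, value in record.items():
                # Keep the first non-empty value seen; earlier sources take priority.
                if value and not entry.get(field):
                    entry[field] = value
    return master


# Hypothetical records for the same customer held in two applications.
crm = [{"customer_id": "C1", "name": "Ann Lee", "phone": ""}]
billing = [{"customer_id": "C1", "name": "A. Lee", "phone": "555-0100"}]
master = build_master([crm, billing])
print(master)
```

The resulting master reference record takes the name from the CRM system and fills the missing phone number from billing, which is the kind of consistent view that would then be fed back to the applications.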
Data Warehouse and Big Data Analytics
Market share
Percentage of total sales in a market captured by a brand,
product, or company.
Operating Margin
A measure of the percent of a company’s revenue left over after
paying variable costs: wages, raw materials, etc.
Increased margins mean earning more per dollar of sales.
The higher the operating margin, the better.
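Both measures reduce to simple ratios. The sketch below follows the definitions on this slide, treating operating margin as the percent of revenue left after variable costs; the function names and the dollar figures are made up for illustration.

```python
def market_share(company_sales, total_market_sales):
    """Percentage of total sales in a market captured by the company."""
    return 100 * company_sales / total_market_sales


def operating_margin(revenue, variable_costs):
    """Percentage of revenue left over after paying variable costs."""
    return 100 * (revenue - variable_costs) / revenue


# Hypothetical figures: $2M of an $8M market; $500K revenue, $400K variable costs.
share = market_share(2_000_000, 8_000_000)
margin = operating_margin(500_000, 400_000)
print(share, margin)
```

With these figures the company holds a quarter of the market and keeps 20 cents of every sales dollar after variable costs.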
Chapter 3
Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
Data Warehouse and Big Data Analytics
TORTURE DATA LONG ENOUGH AND IT WILL CONFESS . . . BUT MAY NOT TELL THE TRUTH
Chapter 3
Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
Data Warehouse and Big Data Analytics
Human Expertise and Judgment Required
Data are worthless if you cannot analyze, interpret, understand,
and apply the results in context.
Data need to be prepared for analysis.
Dirty data degrade the value of analytics.
Data must be put into meaningful context.
Chapter 3
Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
Data Warehouse and Big Data Analytics
Enterprise data warehouses (EDW)
Data warehouses that pull together data from disparate sources
and databases across an entire enterprise.
Warehouses are the primary source of cleansed data for
analysis, reporting, and Business Intelligence (BI).
Their high costs can be offset by using data marts.
Chapter 3
Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
Data Warehouse and Big Data Analytics
Procedures to Prepare EDW Data for Analytics
Extract from designated databases.
Transform by standardizing formats, cleaning the data, and
integrating them.
Load into a data warehouse.
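The three procedures can be sketched as three small functions chained together. Everything concrete here is an assumption for illustration: the source is a plain list of dictionaries standing in for a database, the warehouse is a dictionary keyed by id, and the cleaning rule simply drops rows with a missing amount.

```python
def extract(source_rows):
    """Extract: copy rows out of the designated source (here, a list of dicts)."""
    return [dict(row) for row in source_rows]


def transform(rows):
    """Transform: standardize formats and clean out incomplete rows."""
    cleaned = []
    for row in rows:
        if row["amount"] is None:
            continue  # cleaning step: skip rows with no amount
        cleaned.append({
            "id": row["id"],
            "region": row["region"].strip().upper(),  # standardize region format
            "amount": float(row["amount"]),            # standardize numeric type
        })
    return cleaned


def load(warehouse, rows):
    """Load: store the prepared rows in the warehouse, keyed by id."""
    for row in rows:
        warehouse[row["id"]] = row


# Hypothetical source rows with inconsistent formatting and a missing value.
source = [
    {"id": 1, "region": " east ", "amount": "19.5"},
    {"id": 2, "region": "West", "amount": None},
    {"id": 3, "region": "north", "amount": "7"},
]
warehouse = {}
load(warehouse, transform(extract(source)))
print(warehouse)
```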
Chapter 3
Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
Data Warehouse and Big Data Analytics
Active Data Warehouse (ADW)
Real-time data warehousing and analytics.
Transform by standardizing formats, cleaning the data, and
integrating them.
They Provide
Interaction with a customer to provide superior customer
service.
Respond to business events in near real time.
Share up-to-date status data among merchants, vendors,
customers, and associates.
Chapter 3
Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
Data Warehouse and Big Data Analytics
Supporting Actions as well as Decisions
Marketing and Sales
Pricing and Contracts
Forecasting
Sales
Financial
Chapter 3
Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
Data Warehouse and Big Data Analytics
Really Big Data
Low-cost sensors collect data in real time in all types of
physical things (machine-generated sensor data):
Regulate temperature and climate
Detect air particles for contamination
Machinery conditions/failures
Engine wear/maintenance
Chapter 3
Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
Data Warehouse and Big Data Analytics
Chapter 3
Figure 3.16 Machine generated data from physical objects are
becoming a much larger portion of big data and analytics.
Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
Data Warehouse and Big Data Analytics
Hadoop and MapReduce
Hadoop is an Apache processing platform that places no
conditions on the processed data structure.
MapReduce provides a reliable, fault-tolerant software
framework to write applications easily that process vast
amounts of data (multi-terabyte data-sets) in-parallel on large
clusters (thousands of nodes) of commodity hardware.
Map stage: breaks up huge data into subsets
Reduce stage: recombines partial results
Chapter 3
Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
Data Warehouse and Big Data Analytics
Why are human expertise and judgment important to data
analytics? Give an example.
What is the relationship between data quality and the value of
analytics?
Why do data need to be put into a meaningful context?
What are the differences between databases and data
warehouses?
Explain ETL and CDC.
What is an advantage of an active data warehouse (ADW)?
Why might a company invest in a data mart?
How can manufacturers and health care benefit from data
analytics?
Explain how Hadoop implements MapReduce in two stages.
Chapter 3
Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
Suggested Answers
1. Human expertise and judgment are needed to interpret the
output of analytics (refer to Figure 3.1). Data are worthless if
you cannot analyze, interpret, understand, and apply the results
in context.
For example, some believe that Super Bowl results in February
predict whether the stock market will go up or down that year.
If the National Football Conference (NFC) wins, the market
goes up; otherwise, stocks take a dive. Looking at results over
the past 30 years, most often the NFC has won the Super Bowl
and the market has gone up. Does this mean anything? No.
2. Dirty data degrade the value of analytics. The “cleanliness”
of data is very important to data mining and analysis projects.
3. Managers need context in order to understand how to
interpret traditional and big data. If the wrong analysis or
datasets are used, the output would be nonsense, as in the
example of the Super Bowl winners and stock market
performance.
4. Databases are:
Designed and optimized to ensure that every transaction gets
recorded and stored immediately.
Volatile because data are constantly being updated, added, or
edited.
OLTP systems.
Medium and large enterprises typically have many databases of
various types.
Data warehouses are:
Designed and optimized for analysis and quick response to
queries.
Nonvolatile. This stability is important to being able to analyze
the data and make comparisons. When data are stored, they
might never be changed or deleted in order to do trend analysis
or make comparisons with newer data.
OLAP systems.
Subject-oriented, which means that the data captured are
organized to have similar data linked together.
Data warehouses integrate data collected over long time periods
from various source systems, including multiple databases and
data silos.
5. ETL refers to three procedures – Extract, Transform, and
Load – used in moving data from databases to a data warehouse.
Data are extracted from designated databases, transformed by
standardizing formats, cleaning the data, integrating them, and
loaded into a data warehouse.
CDC, the acronym for Change Data Capture, refers to processes
which capture the changes made at data sources and then apply
those changes throughout enterprise data stores to keep data
synchronized. CDC minimizes the resources required for ETL
processes by only dealing with data changes.
6. An ADW provides real-time data warehousing and analytics,
not for executive strategic decision making, but rather to
support operations. Some advantages for a company of using an
ADW might be interacting with a customer to provide superior
customer service, responding to business events in near real
time, or sharing up-to-date status data among merchants,
vendors, customers, and associates.
7. The high cost of data warehouses can make them too
expensive for a company to implement. Data marts are lower-cost,
scaled-down versions that can be implemented in a much
shorter time, for example, in less than 90 days. Data marts serve
a specific department or function, such as finance, marketing, or
operations. Since they store smaller amounts of data, they are
faster, and easier to use and navigate.
8. Machine-generated sensor data are becoming a larger
proportion of big data (Figure 3.16). Analyzing them can lead to
optimizing cost savings and productivity gains. Manufacturers
can track the condition of operating machinery and predict the
probability of failure, as well as track wear and determine when
preventive maintenance is needed.
Federal health reform efforts have pushed health-care
organizations toward big data and analytics. These
organizations are planning to use big data analytics to support
revenue cycle management, resource utilization, fraud
prevention, health management, and quality improvement, in
addition to reducing operational expenses.
9. Apache Hadoop is a widely used processing platform which
places no conditions on the structure of the data it can process.
Hadoop implements MapReduce in two stages:
Map stage: MapReduce breaks up the huge dataset into smaller
subsets; then distributes the subsets among multiple servers
where they are partially processed.
Reduce stage: The partial results from the map stage are then
recombined and made available for analytic tools.
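The two stages can be imitated on a single machine to show the idea. The sketch below is plain Python, not the Hadoop API; the function names and the word-count task are assumptions chosen for illustration, and the "servers" are simply list slices.

```python
from collections import Counter
from functools import reduce

def split_into_subsets(records, n_workers):
    """Break the dataset into n_workers roughly equal subsets."""
    return [records[i::n_workers] for i in range(n_workers)]

def map_stage(subset):
    """Partially process one subset: count the words in its lines."""
    counts = Counter()
    for line in subset:
        counts.update(line.lower().split())
    return counts

def reduce_stage(partials):
    """Recombine the partial results into one overall result."""
    return reduce(lambda a, b: a + b, partials, Counter())

lines = ["big data", "data warehouse", "big data analytics"]
partials = [map_stage(s) for s in split_into_subsets(lines, n_workers=2)]
word_counts = reduce_stage(partials)
```

In a real cluster each call to `map_stage` would run on a different server, and only the small partial counts would travel over the network to be combined.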
Learning Objectives
Data Warehouse and Big Data Analytics
Data and Text Mining
Business Intelligence
Electronic Records Management
Database Management Systems
Data and Text Mining
Creating Business Value
Business Analytics: the entire function of applying
technologies, algorithms, human expertise, and judgment.
Data Mining: software that enables users to analyze data from
various dimensions or angles, categorize them, and find
correlative patterns among fields in the data warehouse.
Text Mining: broad category that involves interpreting words and
concepts in context.
Sentiment Analysis: trying to understand consumer intent.
Data and Text Mining
Text Analytics (Mining) Procedure
Exploration
Simple word counts
Topics consolidation
Preprocessing
Standardization
May be 80% of processing time
Grammar and spell checking
Categorizing and Modelling
Create business rules and train models for accuracy and
precision
Data and Text Mining
Describe data mining.
How does data mining generate or provide value? Give an
example.
What is text mining?
Explain the text mining procedure.
Suggested Answers:
1. Data mining is the process of analyzing data from various
dimensions or angles, categorizing them, and finding
correlations or patterns among fields in the data warehouse.
2. Data mining is used to discover knowledge that you did not
know existed in the databases.
Answers may vary. A data mining example: The mega-retailer
Walmart wanted its online shoppers to find what they were
looking for faster. Walmart analyzed clickstream data from its
45 million monthly online shoppers, then combined that data
with product- and category-related popularity scores, which were
generated by text mining the retailer’s social media streams.
Lessons learned from the analysis were integrated into the
Polaris search engine used by customers on the company’s
website. Polaris has yielded a 10 to 15 percent increase in
online shoppers completing a purchase, which equals roughly $1
billion in incremental online sales.
3. Up to 75 percent of an organization’s data are non-structured
word processing documents, social media, text messages, audio,
video, images and diagrams, fax and memos, call center or
claims notes, and so on. Text mining is a broad category that
involves interpreting words and concepts in context. Then the
text is organized, explored, and analyzed to provide actionable
insights for managers. With text analytics, information is
extracted out of large quantities of various types of textual
information. It can be combined with structured data within an
automated process. Innovative companies know they could be
more successful in meeting their customers’ needs if they just
understood them better. Text analytics is proving to be an
invaluable tool in doing this.
4. The basic steps involved in text mining/analytics include:
Exploration. First, documents are explored. This might be in the
form of simple word counts in a document collection, or
manually creating topic areas to categorize documents by
reading a sample of them. For example, what are the major
types of issues (brake or engine failure) that have been
identified in recent automobile warranty claims? A challenge of
the exploration effort is misspelled or abbreviated words,
acronyms, or slang.
Preprocessing. Before analysis or the automated categorization
of the content, the text may need to be preprocessed to
standardize it to the extent possible. As in traditional analysis,
up to 80 percent of the time can be spent preparing and
standardizing the data. Misspelled words, abbreviations, and
slang may need to be transformed into a consistent term. For
instance, BTW would be standardized to “by the way” and “left
voice message” could be tagged as “lvm.”
Categorizing and Modeling. Content is then ready to be
categorized. Categorizing messages or documents from
information contained within them can be achieved using
statistical models and business rules. As with traditional model
development, sample documents are examined to train the
models. Additional documents are then processed to validate the
accuracy and precision of the model, and finally new documents
are evaluated using the final model (scored). Models then can
be put into production for automated processing of new
documents as they arrive.
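The three steps above can be sketched on a toy collection. The sample notes and the keyword rules are hypothetical; the abbreviation map reuses the BTW and "left voice message" examples from the text. Only the business-rule half of categorizing is shown, and training a statistical model is left out.

```python
import re
from collections import Counter

# Hypothetical warranty-claim notes.
notes = [
    "BTW customer reports brake failure",
    "left voice message about engine failure",
    "brake noise, btw left voice message",
]
ABBREVIATIONS = {"btw": "by the way", "left voice message": "lvm"}
RULES = {"brake": "brake", "engine": "engine"}  # keyword -> category

def explore(docs):
    """Exploration: simple word counts across the collection."""
    return Counter(w for d in docs for w in re.findall(r"[a-z]+", d.lower()))

def preprocess(text):
    """Preprocessing: lowercase and map known variants to one consistent term."""
    text = text.lower()
    for variant, standard in ABBREVIATIONS.items():
        text = re.sub(r"\b" + re.escape(variant) + r"\b", standard, text)
    return text

def categorize(text):
    """Categorizing: apply business rules; 'other' if no rule matches."""
    for keyword, category in RULES.items():
        if keyword in text:
            return category
    return "other"

raw_counts = explore(notes)
cleaned = [preprocess(n) for n in notes]
categories = [categorize(c) for c in cleaned]
```

Exploration on the raw notes surfaces the abbreviations that preprocessing then has to standardize, which is one reason preparation takes so much of the total time.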
Learning Objectives
Data Warehouse and Big Data Analytics
Data and Text Mining
Business Intelligence
Electronic Records Management
Database Management Systems
Business Intelligence
Key to competitive advantage
Across industries in all size enterprises
Used in operational management, business process, and decision
making
Provides moment of value to decision makers
Unites data, technology, analytics, & human knowledge to
optimize decisions
Business Intelligence
Challenges
Data Selection & Quality
Alignment with Business Strategy and BI Strategy
Alignment
Clearly articulates business strategy
Deconstructs business strategy into targets
Identifies KPIs
Prioritizes KPIs
Creates a plan based on priorities
Transform based on strategic results and changes
Business Intelligence
Smart Devices Everywhere have created demand for effortless
24/7 access to insights.
Data is Big Business when they provide insight that supports
decisions and action.
Advanced BI and Analytics help to ask questions that were
previously unknown and unanswerable.
Cloud Enabled BI and Analytics are providing low-cost and
flexible solutions.
Figure 3.20 Four factors contributing to increased use of BI.
Business Intelligence
BI Architecture and Analytics
Advances in response to big data and end-user performance
demands.
Hosted on public or private clouds.
Limits IT staff and controls costs
May slow response time, add security and backup risks
Business Intelligence
How has BI improved performance management at Quicken
Loans?
What are the business benefits of BI?
What are two data-related challenges that must be resolved for
BI to produce meaningful insight?
What are the steps in a BI governance program?
What is a business-driven development approach?
What does it mean to drill down, and why is it important?
What four factors are contributing to increased use of BI?
How did BI help CarMax achieve record-setting revenue
growth?
Suggested Answers:
1. Using BI, the company has increased the speed from loan
application to close, which allows it to meet client needs as
thoroughly and quickly as possible. Over almost a decade,
performance management has evolved from a manual process of
report generation to BI-driven dashboards and user-defined
alerts that allow business leaders to proactively deal with
obstacles and identify opportunities for growth and
improvement.
2. BI provides data at the moment of value to a decision
maker—enabling it to extract crucial facts from enterprise data
in real time or near real time. BI solutions help an organization
to know what questions to ask and to find answers to those
questions. BI tools integrate and consolidate data from various
internal and external sources and then process them into
information to make smart decisions. According to The Data
Warehousing Institute (TDWI), BI “unites data, technology,
analytics, and human knowledge to optimize business decisions
and ultimately drive an enterprise’s success. BI programs…
transform data into usable, actionable business information”
(TDWI, 2012).
Managers use business analytics to make better-informed
decisions that can give them a competitive
advantage. BI is used to analyze past performance and identify
opportunities to improve future performance.
3. Data selection and data quality.
Information overload is a major problem for executives and for
employees. Another common challenge is data quality,
particularly with regard to online information, because the
source and accuracy might not be verifiable.
4. The mission of a BI governance program is to achieve the
following:
Clearly articulate business strategies.
Deconstruct the business strategies into a set of specific goals
and objectives—the targets.
Identify the key performance indicators (KPIs) that will be used
to measure progress toward each target.
Prioritize the list of KPIs.
Create a plan to achieve goals and objectives based on the
priorities.
Estimate the costs needed to implement the BI plan.
Assess and update the priorities based on business results and
changes in business strategy.
5. A business-driven development approach starts with a
business strategy and works backward to identify data sources
and the data that need to be acquired and analyzed.
6. Drilling down into the data is going from highly consolidated
or summarized figures into the detail numbers from which they
were derived. Sometimes a summarized view of the data is all
that is needed; however, drilling down into the data, from which
the summary came, provides the ability to do more in-depth
analyses.
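Drilling down can be shown with a small in-memory example. The regions, stores, and amounts below are hypothetical; the point is only that the summary figure and the detail rows come from the same data.

```python
from collections import defaultdict

# Hypothetical detail rows: (region, store, sales amount).
detail = [
    ("East", "Store A", 100),
    ("East", "Store B", 40),
    ("West", "Store C", 70),
]

def summarize(rows):
    """Consolidated view: total sales per region."""
    totals = defaultdict(int)
    for region, _store, amount in rows:
        totals[region] += amount
    return dict(totals)

def drill_down(rows, region):
    """Detail view: the rows from which one region's total was derived."""
    return [(store, amount) for r, store, amount in rows if r == region]

summary = summarize(detail)
east_detail = drill_down(detail, "East")
```

A manager who sees the East total in `summary` can ask for `east_detail` to find which store accounts for most of it.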
7. Smart Devices Everywhere are creating demand for effortless
24/7 access to insights.
Data is Big Business when they provide insight that supports
decisions and action.
Advanced BI and Analytics help to ask questions that were
previously unknown and unanswerable.
Cloud Enabled BI and Analytics are providing low-cost and
flexible solutions.
8. The ISs that helped CarMax include:
A proprietary IS that captures, analyzes, interprets, and
distributes data about the cars CarMax sells and buys.
Data analytics applications that track every purchase; number of
test drives and credit applications per car; color preference in
every demographic and region.
Proprietary store technology that provides management with
real-time data about every aspect of store operations, such as
inventory management, pricing, vehicle transfers, wholesale
auctions, and sales consultant productivity.
An advanced inventory management system helps management
anticipate future inventory needs and manage pricing.
Learning Objectives
Data Warehouse and Big Data Analytics
Data and Text Mining
Business Intelligence
Electronic Records Management
Database Management Systems
Electronic Records Management
Business Records
Documentation of a business event, action, decision, or
transaction.
Electronic Records Management (ERM)
Workflow software, authoring tools, scanners, and databases
that manage and archive electronic documents and image paper
documents.
Index and store documents according to company policy or legal
compliance.
Success depends on partnership of key players.
Electronic Records Management
Best Practices
Effective systems capture all business data.
Input from online forms, bar codes, sensors, websites, social
sites, copiers, e-mails, and more.
Industry Standards
Association for Information and Image Management (AIIM;
www.aiim.org)
National Archives and Records Administration (NARA;
www.archives.gov)
ARMA International (formerly the Association of Records
Managers and Administrators; www.arma.org)
Electronic Records Management
Primary Benefits
Access and use the content contained in documents.
Cut labor costs by automating business processes.
Reduce time and effort to locate required information for
decision making.
Improve content security, thereby reducing intellectual property
theft risks.
Minimize content printing, storing, and searching costs.
Electronic Records Management
DISASTER RECOVERY, BUSINESS CONTINUITY, AND
COMPLIANCE
Does the software meet the organization’s needs? For example,
can the DMS be installed on the existing network? Can it be
purchased as a service?
Is the software easy to use and accessible from Web browsers,
office applications, and e-mail applications? If not, people will
not use it.
Does the software have lightweight, modern Web and graphical
user interfaces that effectively support remote users?
Before selecting a vendor, it is important to examine workflows
and how data, documents, and communications flow throughout
the company.
Electronic Records Management
What are business records?
Why is ERM a strategic issue rather than simply an IT issue?
Why might a company have a legal duty to retain records? Give
an example.
Why is creating backups an insufficient way to manage an
organization’s documents?
What are the benefits of ERM?
Suggested Answers:
1. All organizations create and retain business records. A record
is documentation of a business event, action, decision, or
transaction. Examples are contracts, research and development,
accounting source documents, memos, customer/client
communications, hiring and promotion decisions, meeting
minutes, social posts, texts, e-mails, website content, database
records, and paper and electronic files. Business documents
such as spreadsheets, e-mail messages, and word-processing
documents are one type of record. Most records are kept in
electronic format and maintained throughout their life cycle—
from creation to final archiving or destruction by an electronic
records management (ERM) system.
2. Because senior management must ensure that their companies
comply with legal and regulatory duties, managing electronic
records (e-records) is a strategic issue for organizations in both
the public and private sectors. The success of ERM depends
greatly on a partnership of many key players, namely, senior
management, users, records managers, archivists,
administrators, and most importantly, IT personnel. Properly
managed, records are strategic assets. Improperly managed or
destroyed, they become liabilities.
3. Companies need to be prepared to respond to an audit, federal
investigation, lawsuit, or any other legal action against them.
Types of lawsuits against companies include patent violations,
product safety negligence, theft of intellectual property, breach
of contract, wrongful termination, harassment, discrimination,
and many more.
4. Simply creating backups of records is not sufficient because
the content would not be organized and indexed to retrieve them
accurately and easily. The requirement to manage records—
regardless of whether they are physical or digital—is not new.
ERM systems consist of hardware and software that manage and
archive electronic documents and image paper documents; then
index and store them according to company policy. Properly
managed, records are strategic assets. Improperly managed or
destroyed, they become liabilities.
5. Departments or companies whose employees spend most of
their day filing or retrieving documents or warehousing paper
records can reduce costs significantly with ERM. These systems
minimize the inefficiencies and frustration associated with
managing paper documents and workflows. However, they do
not create a paperless office as had been predicted.
An ERM can help a business to become more efficient and
productive by:
Enabling the company to access and use the content contained
in documents.
Cutting labor costs by automating business processes.
Reducing the time and effort required to locate information the
business needs to support decision making.
Improving the security of content, thereby reducing the risk of
intellectual property theft.
Minimizing the costs associated with printing, storing, and
searching for content.
When workflows are digital, productivity increases, costs
decrease, compliance obligations are easier to verify, and green
computing becomes possible.
49

Mais conteúdo relacionado

Semelhante a Chapter 2Data Governance and IT Architecture Support Long-Term

Data Management Trends 2022_Shailendra Mruthyunjayappa.pdf
Data Management Trends 2022_Shailendra Mruthyunjayappa.pdfData Management Trends 2022_Shailendra Mruthyunjayappa.pdf
Data Management Trends 2022_Shailendra Mruthyunjayappa.pdfShailendra Mruthyunjayappa
 
GLOBAL ASSET, INC. (GAI) Global Asset, Inc. (GAI) is a fin.docx
GLOBAL ASSET, INC. (GAI) Global Asset, Inc. (GAI) is a fin.docxGLOBAL ASSET, INC. (GAI) Global Asset, Inc. (GAI) is a fin.docx
GLOBAL ASSET, INC. (GAI) Global Asset, Inc. (GAI) is a fin.docxbudbarber38650
 
MIS for LOGISTICS B.com Logistics unit 1.pptx
MIS for LOGISTICS B.com Logistics unit 1.pptxMIS for LOGISTICS B.com Logistics unit 1.pptx
MIS for LOGISTICS B.com Logistics unit 1.pptxPranavRaythatha1
 
GLOBAL FINANCE, INC. (GFI) Global Finance, Inc. (GFI) is a.docx
GLOBAL FINANCE, INC. (GFI) Global Finance, Inc. (GFI) is a.docxGLOBAL FINANCE, INC. (GFI) Global Finance, Inc. (GFI) is a.docx
GLOBAL FINANCE, INC. (GFI) Global Finance, Inc. (GFI) is a.docxbudbarber38650
 
Organizational Data And Management System Essay
Organizational Data And Management System EssayOrganizational Data And Management System Essay
Organizational Data And Management System EssayEbony Bates
 
Running head Database and Data Warehousing design1Database and.docx
Running head Database and Data Warehousing design1Database and.docxRunning head Database and Data Warehousing design1Database and.docx
Running head Database and Data Warehousing design1Database and.docxhealdkathaleen
 
Running head Database and Data Warehousing design1Database and.docx
Running head Database and Data Warehousing design1Database and.docxRunning head Database and Data Warehousing design1Database and.docx
Running head Database and Data Warehousing design1Database and.docxtodd271
 
BRIDGING DATA SILOS USING BIG DATA INTEGRATION
BRIDGING DATA SILOS USING BIG DATA INTEGRATIONBRIDGING DATA SILOS USING BIG DATA INTEGRATION
BRIDGING DATA SILOS USING BIG DATA INTEGRATIONijmnct
 
comparision between IT and Information system
comparision between IT and Information systemcomparision between IT and Information system
comparision between IT and Information systemtayyab3052
 
Information-Systems-and-Technology.pptx
Information-Systems-and-Technology.pptxInformation-Systems-and-Technology.pptx
Information-Systems-and-Technology.pptxAhimsaBhardwaj
 
IT for Management On-Demand Strategies for Performance, Growth,.docx
IT for Management On-Demand Strategies for Performance, Growth,.docxIT for Management On-Demand Strategies for Performance, Growth,.docx
IT for Management On-Demand Strategies for Performance, Growth,.docxvrickens
 
Global_Technology_Services - Technical_Support_Services_White_Paper_External_...
Global_Technology_Services - Technical_Support_Services_White_Paper_External_...Global_Technology_Services - Technical_Support_Services_White_Paper_External_...
Global_Technology_Services - Technical_Support_Services_White_Paper_External_...Jim Mason
 
Discovery: The Backbone of Digital Enterprise Management
Discovery:  The Backbone of Digital Enterprise ManagementDiscovery:  The Backbone of Digital Enterprise Management
Discovery: The Backbone of Digital Enterprise ManagementMichelle Kerby
 
The top trends changing the landscape of Information Management
The top trends changing the landscape of Information ManagementThe top trends changing the landscape of Information Management
The top trends changing the landscape of Information ManagementVelrada
 
8 Strategies for IT Transformation
8 Strategies for IT Transformation8 Strategies for IT Transformation
8 Strategies for IT Transformationkenaibarbosa
 

Semelhante a Chapter 2Data Governance and IT Architecture Support Long-Term (20)

Data Management Trends 2022_Shailendra Mruthyunjayappa.pdf
Data Management Trends 2022_Shailendra Mruthyunjayappa.pdfData Management Trends 2022_Shailendra Mruthyunjayappa.pdf
Data Management Trends 2022_Shailendra Mruthyunjayappa.pdf
 
GLOBAL ASSET, INC. (GAI) Global Asset, Inc. (GAI) is a fin.docx
GLOBAL ASSET, INC. (GAI) Global Asset, Inc. (GAI) is a fin.docxGLOBAL ASSET, INC. (GAI) Global Asset, Inc. (GAI) is a fin.docx
GLOBAL ASSET, INC. (GAI) Global Asset, Inc. (GAI) is a fin.docx
 
MIS for LOGISTICS B.com Logistics unit 1.pptx
MIS for LOGISTICS B.com Logistics unit 1.pptxMIS for LOGISTICS B.com Logistics unit 1.pptx
MIS for LOGISTICS B.com Logistics unit 1.pptx
 
GLOBAL FINANCE, INC. (GFI) Global Finance, Inc. (GFI) is a.docx
GLOBAL FINANCE, INC. (GFI) Global Finance, Inc. (GFI) is a.docxGLOBAL FINANCE, INC. (GFI) Global Finance, Inc. (GFI) is a.docx
GLOBAL FINANCE, INC. (GFI) Global Finance, Inc. (GFI) is a.docx
 
Embracing BYOD
Embracing BYODEmbracing BYOD
Embracing BYOD
 
Organizational Data And Management System Essay
Organizational Data And Management System EssayOrganizational Data And Management System Essay
Organizational Data And Management System Essay
 
ch02.pdf
ch02.pdfch02.pdf
ch02.pdf
 
Running head Database and Data Warehousing design1Database and.docx
Running head Database and Data Warehousing design1Database and.docxRunning head Database and Data Warehousing design1Database and.docx
Running head Database and Data Warehousing design1Database and.docx
 
Running head Database and Data Warehousing design1Database and.docx
Running head Database and Data Warehousing design1Database and.docxRunning head Database and Data Warehousing design1Database and.docx
Running head Database and Data Warehousing design1Database and.docx
 
BRIDGING DATA SILOS USING BIG DATA INTEGRATION
BRIDGING DATA SILOS USING BIG DATA INTEGRATIONBRIDGING DATA SILOS USING BIG DATA INTEGRATION
BRIDGING DATA SILOS USING BIG DATA INTEGRATION
 
comparision between IT and Information system
comparision between IT and Information systemcomparision between IT and Information system
comparision between IT and Information system
 
Information-Systems-and-Technology.pptx
Information-Systems-and-Technology.pptxInformation-Systems-and-Technology.pptx
Information-Systems-and-Technology.pptx
 
IT for Management On-Demand Strategies for Performance, Growth,.docx
IT for Management On-Demand Strategies for Performance, Growth,.docxIT for Management On-Demand Strategies for Performance, Growth,.docx
IT for Management On-Demand Strategies for Performance, Growth,.docx
 
Global_Technology_Services - Technical_Support_Services_White_Paper_External_...
Global_Technology_Services - Technical_Support_Services_White_Paper_External_...Global_Technology_Services - Technical_Support_Services_White_Paper_External_...
Global_Technology_Services - Technical_Support_Services_White_Paper_External_...
 
Discovery: The Backbone of Digital Enterprise Management
Discovery:  The Backbone of Digital Enterprise ManagementDiscovery:  The Backbone of Digital Enterprise Management
Discovery: The Backbone of Digital Enterprise Management
 
MTW03011USEN.PDF
MTW03011USEN.PDFMTW03011USEN.PDF
MTW03011USEN.PDF
 
Gr 12 Difference Between IT an Information Systems
Gr 12 Difference Between IT an Information SystemsGr 12 Difference Between IT an Information Systems
Gr 12 Difference Between IT an Information Systems
 
The top trends changing the landscape of Information Management
The top trends changing the landscape of Information ManagementThe top trends changing the landscape of Information Management
The top trends changing the landscape of Information Management
 
Big data baddata-gooddata
Big data baddata-gooddataBig data baddata-gooddata
Big data baddata-gooddata
 
8 Strategies for IT Transformation
8 Strategies for IT Transformation8 Strategies for IT Transformation
8 Strategies for IT Transformation
 

Mais de EstelaJeffery653

Individual Project Sampling Tue, 3717Numeric 2000 Re.docx
Individual Project Sampling Tue, 3717Numeric 2000 Re.docxIndividual Project Sampling Tue, 3717Numeric 2000 Re.docx
Individual Project Sampling Tue, 3717Numeric 2000 Re.docxEstelaJeffery653
 
Individual ProjectMedical TechnologyWed, 9617Num.docx
Individual ProjectMedical TechnologyWed, 9617Num.docxIndividual ProjectMedical TechnologyWed, 9617Num.docx
Individual ProjectMedical TechnologyWed, 9617Num.docxEstelaJeffery653
 
Individual Paper The individual paper or project is intended to demo.docx
Individual Paper The individual paper or project is intended to demo.docxIndividual Paper The individual paper or project is intended to demo.docx
Individual Paper The individual paper or project is intended to demo.docxEstelaJeffery653
 
Individual Process ModelingDueAug 28, 1159 PMResourc.docx
Individual Process ModelingDueAug 28, 1159 PMResourc.docxIndividual Process ModelingDueAug 28, 1159 PMResourc.docx
Individual Process ModelingDueAug 28, 1159 PMResourc.docxEstelaJeffery653
 
Individual Reflection PaperAt the end of the course, you must al.docx
Individual Reflection PaperAt the end of the course, you must al.docxIndividual Reflection PaperAt the end of the course, you must al.docx
Individual Reflection PaperAt the end of the course, you must al.docxEstelaJeffery653
 
Individual Multilingualism Guidelines1)Where did the a.docx
Individual Multilingualism Guidelines1)Where did the a.docxIndividual Multilingualism Guidelines1)Where did the a.docx
Individual Multilingualism Guidelines1)Where did the a.docxEstelaJeffery653
 
Individual Implementation Strategiesno new messagesObjectives.docx
Individual Implementation Strategiesno new messagesObjectives.docxIndividual Implementation Strategiesno new messagesObjectives.docx
Individual Implementation Strategiesno new messagesObjectives.docxEstelaJeffery653
 
Individual Learning Project InstructionsThis project will allow yo.docx
Individual Learning Project InstructionsThis project will allow yo.docxIndividual Learning Project InstructionsThis project will allow yo.docx
Individual Learning Project InstructionsThis project will allow yo.docxEstelaJeffery653
 
Individual Network Analysis PaperWrite a 3- to 4-page paper.docx
Individual Network Analysis PaperWrite a 3- to 4-page paper.docxIndividual Network Analysis PaperWrite a 3- to 4-page paper.docx
Individual Network Analysis PaperWrite a 3- to 4-page paper.docxEstelaJeffery653
 
Individual Refine and Finalize WebsiteDueJul 02View m.docx
Individual Refine and Finalize WebsiteDueJul 02View m.docxIndividual Refine and Finalize WebsiteDueJul 02View m.docx
Individual Refine and Finalize WebsiteDueJul 02View m.docxEstelaJeffery653
 
Individual Cultural Communication Written Assignment  (Worth 20 of .docx
Individual Cultural Communication Written Assignment  (Worth 20 of .docxIndividual Cultural Communication Written Assignment  (Worth 20 of .docx
Individual Cultural Communication Written Assignment  (Worth 20 of .docxEstelaJeffery653
 
Individual Expansion of the Mayberry SatelliteDueJul 24, 11.docx
Individual Expansion of the Mayberry SatelliteDueJul 24, 11.docxIndividual Expansion of the Mayberry SatelliteDueJul 24, 11.docx
Individual Expansion of the Mayberry SatelliteDueJul 24, 11.docxEstelaJeffery653
 
Individual ProjectVirtual Lab Demonstrating the Scientific Metho.docx
Individual ProjectVirtual Lab Demonstrating the Scientific Metho.docxIndividual ProjectVirtual Lab Demonstrating the Scientific Metho.docx
Individual ProjectVirtual Lab Demonstrating the Scientific Metho.docxEstelaJeffery653
 
Individual ProjectLab to Determine the Outcome of HeredityTu.docx
Individual ProjectLab to Determine the Outcome of HeredityTu.docxIndividual ProjectLab to Determine the Outcome of HeredityTu.docx
Individual ProjectLab to Determine the Outcome of HeredityTu.docxEstelaJeffery653
 
Individual ProjectLinking Strategy to Your PlanWed, 1181.docx
Individual ProjectLinking Strategy to Your PlanWed, 1181.docxIndividual ProjectLinking Strategy to Your PlanWed, 1181.docx
Individual ProjectLinking Strategy to Your PlanWed, 1181.docxEstelaJeffery653
 
Individual ProjectCytology Lab to Demonstrate Cell Differences.docx
Individual ProjectCytology Lab to Demonstrate Cell Differences.docxIndividual ProjectCytology Lab to Demonstrate Cell Differences.docx
Individual ProjectCytology Lab to Demonstrate Cell Differences.docxEstelaJeffery653
 
Individual Expanded Website PlanView more »Expand view.docx
Individual Expanded Website PlanView more  »Expand view.docxIndividual Expanded Website PlanView more  »Expand view.docx
Individual Expanded Website PlanView more »Expand view.docxEstelaJeffery653
 
Individual Ethical Issues Facing IT ProfessionalsView more .docx
Individual Ethical Issues Facing IT ProfessionalsView more .docxIndividual Ethical Issues Facing IT ProfessionalsView more .docx
Individual Ethical Issues Facing IT ProfessionalsView more .docxEstelaJeffery653
 
Individual Designing FormsDueSep 04, 1159 Resource .docx
Individual Designing FormsDueSep 04, 1159 Resource .docxIndividual Designing FormsDueSep 04, 1159 Resource .docx
Individual Designing FormsDueSep 04, 1159 Resource .docxEstelaJeffery653
 
Individual Expanded Website PlanDueJul 02View more .docx
Individual Expanded Website PlanDueJul 02View more .docxIndividual Expanded Website PlanDueJul 02View more .docx
Individual Expanded Website PlanDueJul 02View more .docxEstelaJeffery653
 


Chapter 2 Data Governance and IT Architecture Support Long-Term

  • 1. Chapter 2 Data Governance and IT Architecture Support Long-Term Performance
       Prepared by Dr. Derek Sedlack, South University
       Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
       Learning Objectives
       Enterprise Architecture and Data Governance
       Information Systems: The Basics
       Data Centers, Cloud Computing, and Virtualization
       Cloud Services Add Agility
       Information Management
  • 2. Information Management
       INFORMATION MANAGEMENT HARNESSES SCATTERED DATA
       Information Management
       Information Management
       The use of IT tools and methods to collect, process, consolidate, store, and secure data from sources that are often fragmented and inconsistent.
       Why a continuous plan is needed to guide, control, and govern IT growth.
       Information management is critical to data security and compliance with continually evolving regulatory requirements, such as the Sarbanes-Oxley Act, Basel III, the Computer Fraud and Abuse Act (CFAA), the USA PATRIOT Act, and the Health Insurance Portability and Accountability Act (HIPAA).
       Information Management
       Data Silos
       Stand-alone data stores that are not accessible by other information systems that need the data and that cannot be consistently updated.
  • 3. Data silos exist because of a lack of IT architecture, support only single functions, and do not support cross-functional needs.
       Information Management
       Key Performance Indicators (KPIs)
       These measures demonstrate the effectiveness of a business process at achieving organizational goals.
       Present data in easy-to-comprehend and comparison-ready formats.
       KPI examples: current ratio; accounts payable turnover; net profit margin; new followers per week; cost per lead; order status.
       Information Management
       Figure 2.4 Data (or information) silos are ISs that do not have the capability to exchange data with other ISs, making timely coordination and communication across functions or departments difficult.
       Information Management
       Reasons information deficiencies are still a problem
  • 4. Data silos
       Lost or bypassed data
       Poorly designed interfaces
       Nonstandardized data formats
       Cannot hit moving targets
       Information Management
       Figure 2.5 Factors that are increasing demand for collaboration technology.
       Global, mobile workforce: 62% of the workforce works outside an office at some point. This number is increasing.
       Mobility-driven consumerization: Growing number of cloud collaboration services.
       Principle of “any”: Growing need to connect anybody, anytime, anywhere on any device.
  • 5. Information Management
       Obvious benefits of information management
       Improves decision quality
       Improves the accuracy and reliability of management predictions
       Reduces the risk of noncompliance
       Reduces time and cost
       Information Management
       Explain information management.
       Why do organizations still have information deficiency problems?
       What is a data silo?
       Explain KPIs and give an example.
       What three factors are driving collaboration and information sharing?
       What are the business benefits of information management?
       Suggested Answers:
       1. Information management is the use of IT tools and methods to collect, process, consolidate, store, and secure data from sources that are often fragmented and inconsistent. A modern organization needs to manage a variety of information that goes beyond structured types such as numbers and text to include semi-structured and unstructured content such as video
  • 6. and sound. The digital library includes content from social media, texts, photos, videos, music, documents, address books, events, and downloads. Maintaining (updating, expanding, porting) the contents of an organization’s digital library on a variety of platforms is the task of Information Management. Specifically, Information Management deals with how information is organized, stored, and secured, and the speed and ease with which it is captured, analyzed, and reported.
       2. Over many decades, changes in technology and in the information companies require, along with different management teams, changing priorities, and increases or decreases in IT investments as they compete with other demands on an organization’s budget, have all contributed to information deficiencies. Other common reasons include: data silos (information trapped in departments’ databases), data lost or bypassed during transit, poorly designed user interfaces requiring extra effort from users, non-standardized data formats, and fast-moving changes in the type of information desired, particularly unstructured content, requiring expensive investments.
       3. A data silo is one of the data deficiencies that can be addressed. It refers to the situation where the databases belonging to different functional units (e.g., departments) in an organization are not shared between the units because of a lack of integration. Data silos support a single function and therefore do not support the cross-functional needs of an organization. The lack of sharing and exchange of data between functional units raises issues regarding the reliability and currency of data, which then require extensive verification to be trusted. Data silos exist when there is no overall IT architecture to guide IS investments, data coordination, and communication.
       4. KPIs are performance measurements. These measures demonstrate the effectiveness of a business process at achieving organizational goals. KPIs present data in easy-to-comprehend
  • 7. and comparison-ready formats. KPIs help reduce the complex nature of organizational performance to a small number of understandable measures. Examples of key comparisons are actual vs. budget, actual vs. forecasted, and this year vs. prior years.
       5. Forrester (forrester.com) identified three factors driving the trend toward collaboration and information sharing technology. These are:
          Global, mobile workforce (a growing number of employees telecommute)
          Mobility-driven consumerization (cloud-based collaboration solutions are on the rise)
          Principle of any (there is a growing need to connect anybody, anytime, anywhere, and on any device)
       6. The following four benefits have been identified:
          Improves decision quality (due to timely response using reliable data)
          Improves the accuracy and reliability of management predictions (“what is going to happen” as opposed to financial reporting on “what has happened”)
          Reduces the risk of noncompliance (due to improved compliance with regulation resulting from better information quality and governance)
          Reduces the time and cost of locating relevant information (due to savings in time and effort through integration and optimization of repositories)
       Learning Objectives
  • 8. Enterprise Architecture and Data Governance
       Information Systems: The Basics
       Data Centers, Cloud Computing, and Virtualization
       Cloud Services Add Agility
       Information Management
       Enterprise Architecture and Data Governance
       Enterprise architecture (EA)
       The way IT systems and processes are structured.
       Helps or impedes day-to-day operations and efforts to execute business strategy.
       Solves two critical challenges: where are we going; how do we get there?
  • 9. Enterprise Architecture and Data Governance
       Strategic Focus
       IT systems’ complexity
       Poor business alignment
       Business and IT Benefits of EA
       Cuts IT costs; increases productivity with information, insight, and ideas
       Determines competitiveness, flexibility, and IT economics
       Aligns IT capabilities with business strategy to grow, innovate, and respond to market demands
       Reduces risk of buying or building systems and enterprise apps
       Enterprise Architecture and Data Governance
       EA Components
       Business Architecture
       Application Architecture
       Data Architecture
       Technical Architecture
       Enterprise Architecture and Data Governance
       Enterprise-wide Data Governance
       Data cross boundaries and are used by people throughout the enterprise.
  • 10. Increased importance through new regulations and pressure to reduce costs.
        Reduces legal risks associated with unmanaged or inconsistently managed information.
        Dependent on Governance
        Food Industry
        Financial Services Industry
        Health-care Industry
        Enterprise Architecture and Data Governance
        Master Data Management (MDM)
        Creates high-quality, trustworthy data for:
        Running the business with transactional or operational use
        Improving the business with analytic use
        Requires strong data governance to manage availability, usability, integrity, and security.
        Enterprise Architecture and Data Governance
        Politics: The People Conflict
        Cultures of distrust between technology and employees may exist.
        Genuine commitment to change can bridge the divide with support from senior management.
        Methodologies can only provide a framework, not solve people
  • 11. problems.
        Enterprise Architecture and Data Governance
        Explain the relationship between complexity and planning. Give an example.
        Explain enterprise architecture.
        What are the four components of EA?
        What are the business benefits of EA?
        How can EA maintain alignment between IT and business strategy?
        What are the two ways that data are used in an organization?
        What is the function of data governance?
        Why has interest in data governance and MDM increased?
        What role does personal conflict or politics play in the success of data governance?
        Suggested Answers:
        1. As enterprise information systems become more complex, the importance of long-range IT planning increases dramatically. Companies cannot simply add storage, new apps, or data analytics on an as-needed basis and expect those additions to work with the existing systems. The relationship between complexity and planning is easier to see in physical things such as skyscrapers and transportation systems. If you are constructing a simple cabin in a remote area, you do not need a detailed plan for expansion or to make sure that the cabin fits into its environment. If you are building a simple, single-user, non-distributed system, you would not need
  • 12. a well-thought-out growth plan either. For complex enterprise systems, however, it is no longer feasible to manage big data, content from mobiles and social networks, and data in the cloud without the well-designed set of plans, or blueprint, provided by EA. The EA guides and controls software add-ons and upgrades, hardware, systems, networks, cloud services, and other digital technology investments.
        2. Enterprise architecture (EA) is the way IT systems and processes are structured. EA is an ongoing process of creating, maintaining, and leveraging IT. It helps to solve two critical challenges: where an organization is going and how it will get there. EA helps, or impedes, day-to-day operations and efforts to execute business strategy. There are two problems that the EA is designed to address:
           IT systems’ complexity. IT systems have become unmanageably complex and expensive to maintain.
           Poor business alignment. Organizations find it difficult to keep their increasingly expensive IT systems aligned with business needs.
        EA is the roadmap that is used for controlling the direction of IT investments, and it is a significant item in long-range planning. It is the blueprint that guides the build-out of overall IT capabilities consisting of four sub-architectures (see question 3). EA defines the vision, standards, and plan that guide the priorities, operations, and management of the IT systems supporting the business.
        3. The four components are:
           Business architecture (the processes the business uses to meet its goals)
           Application architecture (design of IS applications and their interactions)
           Data architecture (organization and access of enterprise data)
           Technical architecture (the hardware and software infrastructure that supports applications and their interactions)
  • 13. 4. EA cuts IT costs and increases productivity by giving decision makers access to information, insights, and ideas where and when they need them. EA determines an organization’s competitiveness, flexibility, and IT economics for the next decade and beyond. That is, it provides a long-term view of a company’s processes, systems, and technologies so that IT investments do not simply fulfill immediate needs. EA helps align IT capabilities with business strategy (to grow, innovate, and respond to market demands), supported by an IT practice that is 100 percent in accord with business objectives. EA can reduce the risk of buying or building systems and enterprise apps that are incompatible or unnecessarily expensive to maintain and integrate.
        5. EA starts with the organization’s target (where it is going), not with where it is. Once an organization identifies the strategic direction in which it is heading and the business drivers to which it is responding, this shared vision of the future will dictate changes in the business, technical, information, and solutions architectures of the enterprise, assign priorities to those changes, and keep those changes grounded in business value. EA guides and controls software add-ons and upgrades, hardware, systems, networks, cloud services, and other digital technology investments that are aligned with the business strategy.
        6. Data are used in an organization for running the business (transactional or operational use) and for improving the business (analytic use).
        7. Data governance is the process of creating and agreeing to standards and requirements for the collection, identification, storage, and use of data. The success of every data-driven
  • 14. strategy or marketing effort depends on data governance. Data governance policies must address structured, semi-structured, and unstructured data (discussed in Section 2.3) to ensure that insights can be trusted. Data governance allows managers to determine where their data originate, who owns them, and who is responsible for what, so that they know they can trust the available data when needed. Data governance is an enterprise-wide project because data cross boundaries and are used by people throughout the enterprise.
        8. As data sources and volumes continue to increase, so does the need to manage data as a strategic asset in order to extract its full value. Making business data consistent, trusted, and accessible across the enterprise is a critical first step in customer-centric business models. With appropriate data governance and MDM, managers are able to extract maximum value from their data, specifically by making better use of opportunities that are buried within behavioral data. Strong data governance is needed to manage the availability, usability, integrity, and security of the data used throughout the enterprise so that data are of sufficient quality to meet business needs.
        9. There may be a culture of distrust between technology and employees in an organization. To overcome this, there must be a genuine commitment to change. Such a commitment must come from senior management. A methodology, such as data governance, cannot solve people problems. It only provides a framework in which such problems can be solved.
  • 15. Learning Objectives
        Enterprise Architecture and Data Governance
        Information Systems: The Basics
        Data Centers, Cloud Computing, and Virtualization
        Cloud Services Add Agility
        Information Management
        Information Systems: The Basics
        DATA, INFORMATION, & KNOWLEDGE
        Raw data describes products, customers, events, activities, and transactions that are recorded, classified, and stored.
  • 16. Information is data that has been processed, organized, or put into context so that it has meaning and value to the recipient.
        Knowledge is information that conveys understanding and expertise as applied to a current problem or activity.
        Information Systems: The Basics
        DATA, INFORMATION, & KNOWLEDGE
        Data, Information, Knowledge
        Information Systems: The Basics
        Figure 2.8 Input-processing-output model.
        Information Systems: The Basics
        Transaction Processing Systems (TPS)
        Internal transactions: originate or occur within the organization (payroll, purchases, etc.).
        External transactions: originate outside the organization (customers, suppliers, etc.).
        Improve sales and customer satisfaction, and reduce many other
  • 17. types of data errors with financial impacts.
        Information Systems: The Basics
        Batch vs. Online Real-Time Processing
        Batch processing: collects all transactions for a time period, then processes the data and updates the data store.
        Online transaction processing (OLTP): processes each transaction as it occurs (real time).
        Batch processing costs less than OLTP, but its data may be out of date between updates.
        Information Systems: The Basics
        Management Information Systems (MIS)
        General-purpose reporting systems that provide reports to managers for tracking operations, monitoring, and control.
        Periodic: reports created or run according to a pre-set schedule.
        Exception: generated only when something is outside designated parameters.
        Ad hoc, or on demand: unplanned, generated as needed.
        Information Systems: The Basics
        Decision Support Systems (DSS)
        Interactive applications that support decision making.
  • 18. Support unstructured and semi-structured decisions with the following characteristics:
        Easy-to-use interactive interface
        Models or formulas that enable sensitivity analysis
        Data from multiple sources
        Information Systems: The Basics
        Transaction Issues
        Huge volumes of database transactions cause volatility (constant use or updates).
        This makes operational databases unsuitable for complex decision-making and problem-solving tasks.
        Data are loaded into a data warehouse through ETL (extract, transform, and load), where they are better suited for analysis.
        Business Process Management and Improvement
        Contrast data, information, and knowledge.
        Define TPS and give an example.
        When is batch processing used?
        When are real-time processing capabilities needed?
        Explain why TPSs need to process incoming data before they are stored.
        Define MIS and DSS and give an example of each.
  • 19. Why are databases inappropriate for doing data analysis? Chapter 2 Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Suggested Answers: 1. Data, or raw data, refers to a basic description of products, customers, events, activities, and transactions that are recorded, classified, and stored. Data are the raw material from which information is produced and the quality, reliability and integrity of the data must be maintained for the information to be useful. Information is data that has been processed, organized, or put into context so that it has meaning and value to the person receiving it. Knowledge consists of data and/or information that have been processed, organized, and put into context to be meaningful, and to convey understanding, experience, accumulated learning, and expertise as they apply to a current problem or activity. Define TPS and give an example. 2. Transaction processing systems are designed to process specific types of data input from ongoing transactions. TPSs can be manual, as when data are typed into a form on a screen, or automated by using scanners or sensors to capture data. Organizational data are processed by a TPS--sales orders, payroll, accounting, financial, marketing, purchasing, inventory control, etc. Transactions are either: Internal transactions: Transactions that originate from within the organization or that occur within the organization. Examples are payroll, purchases, budget transfers, and payments (in accounting terms, they’re referred to as accounts payable).
  • 20. External transactions: Transactions that originate from outside the organization, e.g., from customers, suppliers, regulators, distributors, and financing institutions. TPSs are essential systems. Transactions that do not get captured can result in lost sales, dissatisfied customers, and many other types of data errors having financial impact. For example, if accounting issues a check as payment for an invoice (bill) and that check is cashed, if that transaction is not captured, the amount of cash on the financial statements is overstated, the invoice continues to show as unpaid, and the invoice may be paid a second time. Or if services are provided, but not recorded, the company loses that service revenue. 3. Batch processing is used when there are multiple transactions which can be accumulated and processed at one time. These transactions are not as time sensitive as those that need to be processed in real time. The transactions may be collected for a day, a shift, or over another period of time, and then they are processed. Batch processing often is used to process payroll in a weekly or bi-weekly manner. Batch processing is less costly than real-time processing. 4. Online transaction processing (OLTP), or real-time processing, is used when a system must be updated as each transaction occurs. The input device or website for entering transactions must be directly linked to the transaction processing system (TPS). This type of entry is used for more time sensitive data, such as reservation systems in which the user must know how many seats or rooms are available. 5. Processing improves data quality, which is important because reports and decisions are only as good as the data they are based on. As data is collected or captured, it is validated to detect and correct obvious errors and omissions.
  • 21. Data errors detected later may be difficult or time-consuming to correct. You can better understand the difficulty of detecting and correcting errors by considering identity theft. Victims of identity theft face enormous challenges and frustration trying to correct data about them.
    6. General-purpose reporting systems are referred to as management information systems (MIS). Their objective is to provide reports to managers for tracking operations, monitoring, and control. MIS is used by middle managers in functional areas and provides routine information for planning, organizing, and controlling operations. Types of reports include:
    Periodic: reports created or run according to a pre-set schedule, such as daily, weekly, or quarterly.
    Exception: reports generated only when something is outside the norm, either higher or lower than expected. An example might be increased sales in a hardware store prior to a hurricane.
    Ad hoc, or on demand: unplanned reports generated as needed.
    Decision support systems (DSS) are interactive applications that support decision making. Configurations of a DSS range from relatively simple applications that support a single user to complex enterprise-wide systems. A DSS can support the analysis and solution of a specific problem, the evaluation of a strategic opportunity, or ongoing operations. These systems support unstructured and semi-structured decisions, such as whether to make, buy, or outsource products, or what new products to develop and introduce into existing markets. Decision support systems are used by decision makers and managers to combine models and data to solve semi-structured and unstructured problems with user involvement.
  • 22. To provide such support, DSSs have certain characteristics to support the decision maker and the decision making process. Three defining characteristics of DSSs are: an easy-to-use interactive interface models that enable sensitivity analysis, what if analysis, goal seeking, and risk analysis data from multiple sources - internal and external sources plus data added by the decision maker who may have insights relevant to the decision situation. Having models is what distinguishes DSS from MIS. Some models are developed by end users through an interactive and iterative process. Decision makers can manipulate models to conduct experiments and sensitivity analyses, such as what-if, and goal-seeking. What-if analysis refers to changing assumptions or data in the model to see the impacts of the changes on the outcome. For example, if sales forecasts are based on a 5 percent increase in customer demand, a what if analysis would replace the 5 percent with higher and/or lower demand estimates to determine what would happen to sales if the demands were different. With goal seeking, the decision maker has a specific outcome in mind and needs to figure out how that outcome could be achieved and whether it’s feasible to achieve that desired outcome. A DSS also can estimate the risk of alternative strategies or actions. California Pizza Kitchen (CPK) uses a DSS to support inventory decisions. CPK has 77 restaurants located in various states in the U.S. Maintaining inventory of all restaurants at optimal levels was challenging and time-consuming. A DSS has made it easy for the managers to keep records updated and make decisions. Many CPK restaurants increased sales by 5 percent after implementing a DSS. 7. Databases are used for recording and processing transactions.
  • 23. Due to the number of transactions, the data in the databases are constantly in a state of change, making them difficult to use for complex decision making.
    Learning Objectives: Enterprise Architecture and Data Governance; Information Systems: The Basics; Data Centers, Cloud Computing, and Virtualization; Cloud Services Add Agility; Information Management
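    The what-if and goal-seeking analyses described in the DSS answer above can be illustrated with a very small model. This is a minimal sketch, not how any particular DSS product works: the baseline units, the unit price, and the function names are hypothetical values invented for the example. It re-runs a sales forecast under demand assumptions other than 5 percent (what-if), then inverts the same model to find the demand growth needed for a target (goal seeking).

```python
# Hypothetical baseline figures, chosen only for illustration.
BASELINE_UNITS = 10_000   # units sold last period (assumed)
UNIT_PRICE = 25.0         # price per unit (assumed)


def forecast_sales(demand_growth: float) -> float:
    """Forecast sales revenue for a fractional change in customer demand."""
    return BASELINE_UNITS * (1 + demand_growth) * UNIT_PRICE


def what_if(growth_rates):
    """What-if analysis: re-run the model under alternative demand assumptions."""
    return {rate: forecast_sales(rate) for rate in growth_rates}


def goal_seek(target_revenue: float) -> float:
    """Goal seeking: solve the model for the growth that hits a target revenue."""
    return target_revenue / (BASELINE_UNITS * UNIT_PRICE) - 1


if __name__ == "__main__":
    # Replace the 5 percent assumption with lower and higher estimates.
    for rate, revenue in what_if([0.03, 0.05, 0.08]).items():
        print(f"demand {rate:+.0%} -> forecast sales {revenue:,.2f}")
    print(f"growth needed for 275,000 in sales: {goal_seek(275_000):.1%}")
```

    A real DSS model would have many more inputs, but the pattern is the same: the decision maker changes an assumption and observes the effect on the outcome.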
  • 24. Data Centers, Cloud Computing, and Virtualization. IT Infrastructures: on-premises data centers; virtualization; cloud computing.
    Data Centers: Large numbers of network servers used for the storage, processing, management, distribution, and archiving of data, systems, Web traffic, services, and enterprise applications. National Climatic Data Center; U.S. National Security Agency; Apple.
    Business Is Reliant Upon Data. Uber (car-hailing service): users flooded social media with complaints. WhatsApp (smartphone text-messaging service): the competition added 2 million new registered users within 24 hours of the WhatsApp outage (a record).
  • 25. Data Centers, Cloud Computing, and Virtualization. Unified Data Center: Cisco’s single solution integrating computing, storage, networking, virtualization, and management into a single (unified) platform. Virtualization gives greater IT flexibility and cuts costs: instant access to data any time in any format; faster response to changing data analytic needs; reduced complexity and cost.
    Unified Data Center compared to traditional data integration and replication methods: greater agility; streamlined approach; better insight.
    What is “The Cloud”? A general term for infrastructure that uses the Internet and private networks to access, share, and deliver computing resources. Scalable delivery as a service to end-users over a network. As a new technology, it should be approached with greater diligence than other IT decisions, including vendor management and service-level agreements.
  • 26. Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Data Centers, Cloud Computing, and Virtualization Service-Level Agreements A negotiated agreement between a company and service provider that can be a legally binding contract or an informal contract. The goal is not building the best SLA terms, but getting the terms that are most meaningful to the business. Chapter 2 Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Data Centers, Cloud Computing, and Virtualization Types of Clouds Private Cloud: Single-tenant environments with stronger security and control (retained) for regulated industries and critical data. Public Cloud: Multiple-tenant virtualized services utilizing the same pool of servers across a public network (distributed). Chapter 2 Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Data Centers, Cloud Computing, and Virtualization Cloud Infrastructure Provided on demand for storage virtualization, network virtualization, and hardware virtualization. Software or virtualization layer creates virtual machines (VMs) where the CPU, RAM, HD, NIC, and other components behave as hardware, but are created with software. Chapter 2
  • 27. Data Centers, Cloud Computing, and Virtualization. Virtualization: A virtual machine is created by a software layer (the virtualization layer) and contains its own operating system and applications, as if it were a physical computer.
    [Figure 2.17: Virtual machines running on a simple computer hardware layer. Labels: Infrastructure as a Service; Platform as a Service; Software as a Service.]
    Characteristics & Benefits. Memory-intensive: huge amounts of RAM due to massive processing requirements. Energy-efficient: up to 95% reduction in energy use per server through less physical hardware. Scalability and load balancing: handles dynamic demand, such as during the Super Bowl or World Series.
  • 28. Data Centers, Cloud Computing, and Virtualization What is a data center? Describe cloud computing. What is the difference between data centers and cloud computing? What are the benefits of cloud computing? How can cloud computing solve the problems of managing software licenses? What is an SLA? Why are SLAs important? What factors should be considered when selecting a cloud vendor or provider? When are private clouds used instead of public clouds? Explain three issues that need to be addressed when moving to cloud computing or services. How does a virtual machine (VM) function? Explain virtualization. What are the characteristics and benefits of virtualization? When is load balancing important? Chapter 2 Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Suggested Answers: 1. A data center consists of a large number of network servers (Figure 2.13) used for the storage, processing, management, distribution, and archiving of data, systems, Web traffic, services, and enterprise applications. Data center also refers to the building or facility that houses the servers and equipment. 2. Cloud computing is the general term for infrastructures that use the Internet and private networks to access, share, and deliver computing resources.
  • 29. 3. A main difference between a cloud and a data center is that a cloud is an off-premises form of computing that stores data on the Internet. In contrast, a data center refers to on-premises hardware and equipment that store data within an organization’s local network. Cloud services are outsourced to a third-party cloud provider who manages the updates, security, and ongoing maintenance. Data centers are typically run by an in-house IT department. A data center is owned by the company. Since only the company owns the infrastructure, a data center is more suitable for organizations that run many different types of applications and have complex workloads. A data center, like a factory, has limited capacity. Once it is built, the amount of storage and the workload the center can handle do not change without purchasing and installing more equipment. A data center is physically connected to a local network, which makes it easier to restrict access to apps and information to authorized, company-approved people and equipment. However, the cloud is accessible by anyone with the proper credentials and an Internet connection. This accessibility exposes company data at many more entry and exit points. Cloud computing is the delivery of computing and storage resources as a service to end-users over a network. With cloud computing, shared resources (such as hard drives for storage) and software apps are provided to computers and other devices on demand, like a public utility. That is, it is similar to electricity: a utility that companies have available on demand and pay for based on usage. Cloud systems are scalable. That is, they can be adjusted to meet changes in business needs. A drawback of the cloud is control because a third party
  • 30. manages it. Companies do not have as much control as they do with a data center. 4. Answers may vary. Many IT infrastructures are extremely expensive to manage and too complex to easily adapt. Because cloud computing resources are scalable “on demand”, this increases IT agility and responsiveness. In a business world where first movers gain the advantage, IT responsiveness and agility provide a competitive edge. Access to data in the cloud is possible via any device that can access the Internet, allowing users to be more responsive and productive. Cloud services are outsourced to a third-party cloud provider who manages the updates, security, and ongoing maintenance, including backups and disaster recovery, relieving this burden from the business. The business saves the costs of increased staff, power consumption, and disposal of discontinued hardware. Additionally, cloud services significantly reduce IT costs and complexity through improved workload optimization and service delivery. 5. Cloud computing makes it more affordable for companies to use services that in the past would have been packaged as software and required buying, installing and maintaining on any number of individual machines. A major type of service available via the cloud is called software as a service, or SaaS. Because applications are hosted by vendors and provided on demand, rather than via physical installations or seat licenses (a key characteristic of cloud computing), applications are accessed online through a Web browser instead of stored on a computer. Companies pay only for the computing resources or services they use. Vendors handle the upgrades and companies do not purchase or manage software licenses. They simply pay
  • 31. for the number of concurrent users. 6. An SLA is a negotiated agreement between a company and service provider that can be a legally binding contract or an informal contract. An SLA serves “as a means of formally documenting the service(s), performance expectations, responsibilities, and limits between cloud service providers and their users. A typical SLA describes levels of service using various attributes such as: availability, serviceability, performance, operations, billing, and penalties associated with violations of such attributes.” (Cloud Standards Customer Council, 2012, pp. 5–6.) 7. See Table 2.5: 8. Companies or government agencies set up their own private clouds when they need stronger security and control for regulated industries and critical data. 9. Issues that need to be addressed when moving to public cloud computing or services include: Infrastructure issues – Cloud computing runs on a shared infrastructure so there is less customization for a company’s specific requirements. The network and WAN (wide area network) become more critical in the IT infrastructure. Network bandwidth is also an issue as enough is needed to support the increase in network traffic. With cloud computing, it may be more difficult to get to the root of performance problems, like the unplanned outages that occurred with Google’s Gmail and Workday’s human resources apps. The trade-off is cost vs. control. Disruption issues – There is a risk of disrupting operations or customers in the process of moving operations to the cloud.
  • 32. Management issues – Putting part of the IT architecture or workload into the cloud requires different management approaches, different IT skills, and knowing how to manage vendor relationships and contracts. (The astute student may also describe the following: Strategic issues such as deciding which workloads to export to the cloud; which set of standards to follow for cloud computing; how to resolve privacy and security issues; and how departments or business units will get new IT resources.)
    10. A virtual machine (VM) is a software layer that runs its own operating system (OS) and apps as if it were a physical computer. A VM behaves exactly like a physical computer and contains its own virtual (software-based) CPU, RAM, hard drive, and network interface card. An OS cannot tell the difference between a VM and a physical machine, nor can apps or other computers on a network. (See Fig 2.13 for details.)
    11. Virtualization is a concept that has several meanings in IT and therefore several definitions. The major type of virtualization is hardware virtualization, which remains popular and widely used. Virtualization is often a key part of an enterprise’s disaster recovery plan. In general, virtualization separates business applications and data from hardware resources. This separation allows companies to pool hardware resources, rather than dedicate servers to applications, and assign those resources to applications as needed. The major types of virtualization are the following: Storage virtualization is the pooling of physical storage from multiple network storage devices into what appears to be a single storage device that is managed from a central console. Network virtualization combines the available resources in a network by splitting the network load into manageable parts,
  • 33. each of which can be assigned (or reassigned) to a particular server on the network. Hardware virtualization is the use of software to emulate hardware or a total computer environment other than the one the software is actually running in. It allows a piece of hardware to run multiple operating system images at once. This kind of software is sometimes known as a virtual machine. Virtualization increases the flexibility of IT assets, allowing companies to consolidate IT infrastructure, reduce maintenance and administration costs, and prepare for strategic IT initiatives. Virtualization is not primarily about cost-cutting, which is a tactical reason. More importantly, for strategic reasons, virtualization is used because it enables flexible sourcing and cloud computing.
    12. Memory-intensive: VMs need a huge amount of RAM (random access memory, or primary memory) because of their massive processing requirements. Energy-efficient: VMs minimize energy consumed running and cooling servers in the data center, representing up to a 95 percent reduction in energy use per server. Scalability and load balancing: Virtualization provides load balancing to handle the demand for requests to the site. The VMware infrastructure automatically distributes the load across a cluster of physical servers to ensure the maximum performance of all running VMs.
    13. When a big event happens, such as the Super Bowl, millions of people hit a Web site at the same time. Virtualization provides load balancing to handle the demand for requests to the site. Load balancing is key to solving many of today’s IT challenges.
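    The idea of load balancing in answers 12 and 13 can be sketched in a few lines. This is only an illustration of one simple policy (send each request to the least-loaded server); it is not VMware's algorithm, and the server names, request IDs, and cost numbers are hypothetical.

```python
import heapq


def balance(requests, servers):
    """Assign each (request_id, cost) pair to the currently least-loaded server."""
    # Heap entries are (accumulated_load, server_name); the smallest load pops first.
    heap = [(0, name) for name in servers]
    heapq.heapify(heap)
    assignment = {name: [] for name in servers}
    for request_id, cost in requests:
        load, name = heapq.heappop(heap)
        assignment[name].append(request_id)
        # Put the server back with its updated load.
        heapq.heappush(heap, (load + cost, name))
    return assignment


if __name__ == "__main__":
    # One expensive request followed by three cheap ones (made-up costs).
    incoming = [("r1", 5), ("r2", 1), ("r3", 1), ("r4", 1)]
    print(balance(incoming, ["server-a", "server-b"]))
```

    With these inputs the expensive request occupies one server while the cheap requests are routed to the other, which is the behavior a spike such as a Super Bowl audience relies on.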
  • 34. Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Learning Objectives Enterprise Architecture and Data Governance Information Systems: The Basics Data Centers, Cloud Computing, and Virtualization Cloud Services Add Agility Information Management Cloud Services Add Agility Software as a Service (SaaS) End-user apps, like SalesForce
  • 35. Platform as a Service (PaaS): tools and services making coding and deployment faster and more efficient, like Google App Engine. Infrastructure as a Service (IaaS): hardware and software that power computing resources, like EC2 & S3 (Amazon Web Services). Data as a Service (DaaS): data shared among clouds, systems, and apps, regardless of the data source or storage location.
    Cloud Services Add Agility. Data as a Service (DaaS): Easier for data architects to select data from different pools, filter out sensitive data, and make the remaining data available on demand. Shifts the risks and burdens of data management to a third-party cloud provider.
    Cloudy Weather Ahead? Companies using the various as-a-service models (such as CRM and HR management) are still responsible for regulatory compliance. Legal departments become involved due to high stakes around legal and compliance issues. Cost cutting, flexibility, and improved responsiveness require IT, legal, and senior management oversight.
  • 36. Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Cloud Services Add Agility What is SaaS? Describe the cloud computing stack. What is PaaS? What is IaaS? Why is DaaS growing in popularity? How might companies risk violating regulation or compliance requirements with cloud services? Chapter 2 Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Suggested Answers: 1. Any software that is provided on demand is referred to as software as a service, or SaaS. SaaS is a widely used model in which software is available to users as needed. Specifically, in SaaS, a service provider hosts the application at its data center and customers access it via a standard Web browser. Other terms for SaaS are on-demand computing and hosted services. The idea is basically the same: Instead of buying and installing expensive packaged enterprise applications, users can access software apps over a network, with an Internet browser being the only necessity. A SaaS provider licenses an application to customers either on-demand, through a subscription, based on usage (pay-as-you-go), or increasingly at no cost when the opportunity exists to generate revenue from advertisements or through other methods. 2. The cloud computing stack consists of the following three categories: SaaS apps are designed for end-users.
  • 37. PaaS is a set of tools and services that make coding and deploying these apps faster and more efficient. IaaS consists of hardware and software that power computing resources: servers, storage, operating systems, and networks. See Figure 2.19 for a graphical representation.
    3. PaaS provides a standard unified platform for app development, testing, and deployment, thus benefiting software development. This computing platform allows the creation of Web applications quickly and easily without the complexity of buying and maintaining the underlying infrastructure. Without PaaS, the cost of developing some apps would be prohibitive. The trend is for PaaS to be combined with IaaS.
    4. Infrastructure as a service (IaaS) is a way of delivering cloud computing infrastructure as an on-demand service. Rather than purchasing servers, software, data center space, or networks, companies instead buy all computing resources as a fully outsourced service.
    5. The DaaS model is growing in popularity as data become more complex, difficult, and expensive to maintain. Data as a service (DaaS) enables data to be shared among clouds, systems, apps, and so on regardless of the data source or where they are stored. DaaS makes it easier for data architects to select data from different pools, filter out sensitive data, and make the remaining data available on demand. A key benefit of DaaS is that the risks and burdens of data management shift to a third-party cloud provider.
    6. Companies are frequently adopting software, platform, infrastructure, and data management as a service, and are starting to embrace mobility as a service and big data as a service, because they typically no longer have to worry about the costs of buying,
  • 38. maintaining, or updating their own data servers. Regulations mandate that confidential data be protected regardless of whether the data are on-premises or in the cloud. Therefore, a company’s legal department needs to get involved in these IT decisions. Put simply, moving to cloud services is not purely an IT decision because the stakes around legal and compliance issues are very high.
    Chapter 3 Data Management, Big Data Analytics, and Records Management Prepared by Dr. Derek Sedlack, South University Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
    Learning Objectives: Data Warehouse and Big Data Analytics; Data and Text Mining; Business Intelligence; Electronic Records Management
  • 39. Database Management Systems Database Management Systems Databases Collections of data sets or records stored in a systematic way. Stores data generated by business apps, sensors, operations, and transaction-processing systems (TPS). The data in databases are extremely volatile. Medium and large enterprises typically have many databases of various types. Volatile data changes frequently Chapter 3 Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Database Management Systems Data Warehouses Integrate data from multiple databases and data silos, and organize them for complex analysis, knowledge discovery, and
  • 40. to support decision making. May require formatting, processing, and/or standardization. Loaded at specific times, making them non-volatile and ready for analysis.
    Database Management Systems. Data Marts: Small-scale data warehouses that support a single function or one department. Enterprises that cannot afford to invest in data warehousing may start with one or more data marts.
    Business Intelligence (BI): Tools and techniques that process data and conduct statistical analysis for insight and discovery. Used to discover meaningful relationships in the data, stay informed in real time, detect trends, and identify opportunities and risks.
    Database Management System (DBMS): Integrates with data collection systems such as TPS and business applications.
  • 41. Stores data in an organized way. Provides facilities for accessing and managing data. Standard database model adopted by most enterprises. Stores data in tables consisting of columns and rows, similar to the format of a spreadsheet.
    Database Management Systems. Relational Database Management System (RDBMS): Provides access to data using a declarative language. Declarative Language: Simplifies data access by requiring that users specify only what data they want to access, without defining how the data will be retrieved. Structured Query Language (SQL) is an example of a declarative language:
    SELECT column_name(s) FROM table_name WHERE condition
    DBMS Functions: data filtering and profiling; data integrity and maintenance; data synchronization; data security; data access.
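    The declarative SELECT ... FROM ... WHERE pattern on the Declarative Language slide can be tried with Python's built-in sqlite3 module. This is a minimal sketch: the orders table, its columns, and the sample rows are hypothetical and exist only for the example. The query states what rows are wanted; the database engine decides how to find them.

```python
import sqlite3


def large_orders(min_amount):
    """Return (customer, amount) rows above min_amount from a small demo table."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        conn.execute(
            "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
        )
        # Hypothetical sample data.
        conn.executemany(
            "INSERT INTO orders (customer, amount) VALUES (?, ?)",
            [("Ada", 120.0), ("Ben", 45.5), ("Ada", 300.0)],
        )
        # Declarative query: says WHAT is wanted, not HOW to scan the table.
        cursor = conn.execute(
            "SELECT customer, amount FROM orders WHERE amount > ? ORDER BY amount",
            (min_amount,),
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print(large_orders(100))
```

    The `?` placeholders pass values separately from the SQL text, which is the usual way to avoid building queries by string concatenation.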
  • 42. Database Management Systems. Online Transaction Processing and Online Analytics Processing. Online Transaction Processing (OLTP): Designed to manage transaction data, which are volatile. Breaks down complex information into simpler data tables to strike a balance between transaction-processing efficiency and query efficiency. Cannot be optimized for data mining. Online Analytics Processing (OLAP): A means of organizing large business databases. Divided into one or more cubes that fit the way business is conducted.
    DBMSs (mid-2014): Oracle’s MySQL; Microsoft’s SQL Server; PostgreSQL; IBM’s DB2; Teradata Database.
    Trend Toward NoSQL Systems: higher performance; easy distribution of data on different nodes
  • 43. enables scalability and fault tolerance Greater flexibility Simpler administration Chapter 3 Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Database Management Systems Centralized and Distributed Database Architecture Centralized Database Architecture Better control of data quality. Better IT security. Distributed Database Architecture Allow both local and remote access. Use client/server architecture to process requests. Chapter 3 Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Database Management Systems Garbage In, Garbage Out Dirty Data Lacks integrity/validation and reduces user trust. Incomplete, out of context, outdated, inaccurate, inaccessible, or overwhelming. Chapter 3 Cost of Poor Quality Data Lost Business Cost to Prevent Errors Cost to Correct Errors
  • 44. Database Management Systems. Principle of Diminishing Data Value: The value of data diminishes as they age. Blind spots (lack of data availability) of 30 days or longer inhibit peak performance. Global financial services institutions rely on near-real-time data for peak performance. Principle of 90/90 Data Use: A majority of stored data, as high as 90 percent, is seldom accessed after 90 days (except for auditing purposes). Roughly 90 percent of data lose most of their value after 3 months.
    Principle of Data in Context: The capability to capture, process, format, and distribute data in near real time or faster requires a huge investment in data architecture. The investment can be justified on the principle that data must be integrated, processed, analyzed, and formatted into “actionable information.”
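    One way to check how closely a data store follows the 90/90 data-use principle above is to measure the share of records that have not been accessed within the last 90 days. The sketch below assumes that a last-access date is available for each record; the function name and the sample dates are hypothetical.

```python
from datetime import date, timedelta


def share_stale(last_access_dates, today, window_days=90):
    """Return the fraction of records not accessed within the last window_days."""
    if not last_access_dates:
        return 0.0
    cutoff = today - timedelta(days=window_days)
    # A record is stale if its last access is strictly before the cutoff date.
    stale = sum(1 for accessed in last_access_dates if accessed < cutoff)
    return stale / len(last_access_dates)


if __name__ == "__main__":
    # Made-up last-access dates for four records.
    sample = [date(2015, 6, 1), date(2015, 1, 1), date(2014, 12, 1), date(2015, 3, 1)]
    print(f"{share_stale(sample, today=date(2015, 6, 30)):.0%} not accessed in 90 days")
```

    A share near 90 percent would match the principle; a much lower share suggests the data are still in active use.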
  • 45. Database Management Systems. Data Life Cycle. [Figure 3.11: Data life cycle.]
    [Figure 3.12: An enterprise has transactional, master, and analytical data.]
    Describe a database and a database management system (DBMS). Explain what an online transaction-processing (OLTP) system does. Why are data in databases volatile? Explain what processes DBMSs are optimized to perform. What are the business costs or risks of poor data quality? Describe the data life cycle. What is the function of master data management (MDM)?
  • 46. Suggested Answers: 1. A database is a collection of data sets or records stored in a systematic way. A database stores data generated by business apps, sensors, and transaction processing systems. Databases can provide access to all of the organization’s data collected for a particular function or enterprise-wide, alleviating many of the problems associated with data file environments. Central storage of data in a database reduces data redundancy, data isolation, and data inconsistency and allows for data to be shared among users of the data. In addition, security and data integrity are easier to control, and applications are independent of the data they process. There are two basic types of databases: centralized and distributed. A database management system (DBMS) is software used to manage the additions, updates, and deletions of data as transactions occur; and support data queries and reporting. DBMSs integrate with data collection systems such as TPS and business applications; store the data in an organized way; and provide facilities for accessing and managing that data. 2. OLTP is a database design that breaks down complex information into simple data tables in order to be efficient for capturing transactional data, including additions, updates, or deletions. OLTP databases are capable of processing millions of transactions every second. 3. Data in databases are volatile because they can be updated millions of times every second, especially if they are transaction processing systems (TPS). 4. Data filtering and profiling: Inspecting the data for errors, inconsistencies, redundancies, and incomplete information. Data integrity and maintenance: Correcting, standardizing, and verifying the consistency and integrity of the data.
  • 47. Data synchronization: Integrating, matching, or linking data from disparate sources. Data security: Checking and controlling data integrity over time. Data access: Providing authorized access to data in both planned and ad hoc ways within acceptable time. 5. Poor quality data cannot be trusted and may result in the inability to make intelligent business decisions. Poor data may lead to lost business opportunities, increased time and effort spent trying to prevent errors, increased time and effort spent trying to correct errors, misallocation of resources, flawed strategies, incorrect orders, and customers becoming frustrated and driven away. The cost of poor quality data spreads throughout the company, affecting systems from shipping and receiving to accounting and customer services. Errors can be difficult, time-consuming, and expensive to correct, and the impacts of errors can be unpredictable or serious. 6. Three general data principles relate to the data life cycle perspective and help to guide IT investment decisions. Principle of diminishing data value. Viewing data in terms of a life cycle focuses attention on how the value of data diminishes as the data age. The more recent the data, the more valuable they are. This is a simple, yet powerful, principle. Most organizations cannot operate at peak performance with blind spots (lack of data availability) of 30 days or longer. Principle of 90/90 data use. Being able to act on real-time or near real-time operational data can have significant advantages. According to the 90/90 data-use principle, a majority of stored data, as high as 90 percent, is seldom accessed after 90 days (except for auditing purposes). Put another way, roughly 90 percent of data lose most of their value after three months. Principle of data in context. The capability to capture, process, format, and distribute data in near real-time or faster requires a
  • 48. huge investment in data management architecture and infrastructure to link remote POS systems to data storage, data analysis systems, and reporting applications. The investment can be justified on the principle that data must be integrated, processed, analyzed, and formatted into “actionable information.” End users need to see data in a meaningful format and context if the data are to guide their decisions and plans. 7. Master data management (MDM) is a process whereby companies integrate data from various sources or enterprise applications to provide a more complete or unified view of an entity (customer, product, etc.). Although vendors may claim that their MDM solution creates “a single version of the truth,” this claim probably is not true. In reality, MDM cannot create a single unified version of the data because constructing a completely unified view of all master data simply is not possible. Realistically, MDM consolidates data from various data sources into a master reference file, which then feeds data back to the applications, thereby creating accurate and consistent data across the enterprise. Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Learning Objectives Data Warehouse and Big Data Analytics Data and Text Mining Business Intelligence
  • 49. Electronic Records Management Database Management Systems Data Warehouse and Big Data Analytics Market share Percentage of total sales in a market captured by a brand, product, or company. Operating Margin A measure of the percent of a company’s revenue left over after paying variable costs: wages, raw materials, etc. Increased margins mean earning more per dollar of sales. The higher the operating margin, the better. Chapter 3 Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Data Warehouse and Big Data Analytics TORTURE DATA LONG ENOUGH AND IT WILL CONFESS . . .
  • 50. BUT MAY NOT TELL THE TRUTH Chapter 3 Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Data Warehouse and Big Data Analytics Human Expertise and Judgment Required Data are worthless if you cannot analyze, interpret, understand, and apply the results in context. Data need to be prepared for analysis. Dirty data degrade the value of analytics. Data must be put into meaningful context. Chapter 3 Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Data Warehouse and Big Data Analytics Enterprise data warehouses (EDW) Data warehouses that pull together data from disparate sources and databases across an entire enterprise. Warehouses are the primary source of cleansed data for analysis, reporting, and Business Intelligence (BI). Their high costs can be offset by using lower-cost data marts. Chapter 3
  • 51. Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Data Warehouse and Big Data Analytics Procedures to Prepare EDW Data for Analytics Extract from designated databases. Transform by standardizing formats, cleaning the data, and integrating them. Load into a data warehouse. Chapter 3 Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Data Warehouse and Big Data Analytics Chapter 3 Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Data Warehouse and Big Data Analytics Active Data Warehouse (ADW) Real-time data warehousing and analytics. An ADW lets a company: Interact with a customer to provide superior customer service. Respond to business events in near real time. Share up-to-date status data among merchants, vendors, customers, and associates. Chapter 3 Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
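The extract, transform, and load procedure on the slide above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production ETL tool: the two source lists, the field names, and the `sales_fact` table are all hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3
from datetime import datetime

# Hypothetical rows from two source systems with inconsistent formats.
SALES_DB = [
    {"cust": " alice ", "amount": "120.50", "date": "01/05/2015"},
    {"cust": "BOB", "amount": "80", "date": "01/06/2015"},
]
WEB_DB = [
    {"cust": "Alice", "amount": "19.99", "date": "2015-01-07"},
    {"cust": "", "amount": "5.00", "date": "2015-01-08"},  # dirty row: no customer
]


def extract():
    """Extract: pull rows from the designated sources, noting each date format."""
    return [(row, "%m/%d/%Y") for row in SALES_DB] + [
        (row, "%Y-%m-%d") for row in WEB_DB
    ]


def transform(extracted):
    """Transform: standardize formats, drop dirty rows, integrate into one list."""
    clean = []
    for row, date_format in extracted:
        name = row["cust"].strip().title()
        if not name:  # cleaning step: skip rows that fail validation
            continue
        day = datetime.strptime(row["date"], date_format).date().isoformat()
        clean.append((name, float(row["amount"]), day))
    return clean


def load(rows):
    """Load: write the cleaned rows into the (in-memory) warehouse table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales_fact (customer TEXT, amount REAL, sale_date TEXT)"
    )
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


warehouse = load(transform(extract()))
totals = warehouse.execute(
    "SELECT customer, SUM(amount) FROM sales_fact "
    "GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)
```

Once loaded, the warehouse table can be queried for analysis without touching the operational sources, which is the point of separating the two.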
  • 52. Data Warehouse and Big Data Analytics Supporting Actions as well as Decisions Marketing and Sales Pricing and Contracts Forecasting Sales Financial Chapter 3 Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Data Warehouse and Big Data Analytics Really Big Data Low-cost sensors collect data in real time in all types of physical things (machine-generated sensor data): Regulate temperature and climate Detect air particles for contamination Machinery conditions/failures Engine wear/maintenance Chapter 3 Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Data Warehouse and Big Data Analytics Chapter 3 Figure 3.16 Machine-generated data from physical objects are becoming a much larger portion of big data and analytics. Copyright © 2015 John Wiley & Sons, Inc. All rights reserved.
  • 53. Data Warehouse and Big Data Analytics Hadoop and MapReduce Hadoop is an Apache processing platform that places no conditions on the processed data structure. MapReduce provides a reliable, fault-tolerant software framework to write applications easily that process vast amounts of data (multi-terabyte data-sets) in-parallel on large clusters (thousands of nodes) of commodity hardware. Map stage: breaks up huge data into subsets Reduce stage: recombines partial results Chapter 3 Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Data Warehouse and Big Data Analytics Why are human expertise and judgment important to data analytics? Give an example. What is the relationship between data quality and the value of analytics? Why do data need to be put into a meaningful context? What are the differences between databases and data warehouses? Explain ETL and CDC. What is an advantage of an active data warehouse (ADW)? Why might a company invest in a data mart? How can manufacturers and health care benefit from data analytics? Explain how Hadoop implements MapReduce in two stages. Chapter 3 Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Suggested Answers
  • 54. 1. Human expertise and judgment are needed to interpret the output of analytics (refer to Figure 3.1). Data are worthless if you cannot analyze, interpret, understand, and apply the results in context. For example, some believe that Super Bowl results in February predict whether the stock market will go up or down that year. If the National Football Conference (NFC) wins, the market goes up; otherwise, stocks take a dive. Looking at results over the past 30 years, most often the NFC has won the Super Bowl and the market has gone up. Does this mean anything? No. 2. Dirty data degrade the value of analytics. The “cleanliness” of data is very important to data mining and analysis projects. 3. Managers need context in order to understand how to interpret traditional and big data. If the wrong analysis or datasets are used, the output would be nonsense, as in the example of the Super Bowl winners and stock market performance. 4. Databases are: Designed and optimized to ensure that every transaction gets recorded and stored immediately. Volatile because data are constantly being updated, added, or edited. OLTP systems. Medium and large enterprises typically have many databases of various types. Data warehouses are: Designed and optimized for analysis and quick response to queries. Nonvolatile. This stability is important to being able to analyze the data and make comparisons. When data are stored, they might never be changed or deleted in order to do trend analysis
  • 55. or make comparisons with newer data. OLAP systems. Subject-oriented, which means that the data captured are organized to have similar data linked together. Data warehouses integrate data collected over long time periods from various source systems, including multiple databases and data silos. 5. ETL refers to three procedures – Extract, Transform, and Load – used in moving data from databases to a data warehouse. Data are extracted from designated databases, transformed by standardizing formats, cleaning the data, integrating them, and loaded into a data warehouse. CDC, the acronym for Change Data Capture, refers to processes which capture the changes made at data sources and then apply those changes throughout enterprise data stores to keep data synchronized. CDC minimizes the resources required for ETL processes by only dealing with data changes. 6. An ADW provides real-time data warehousing and analytics, not for executive strategic decision making, but rather to support operations. Some advantages for a company of using an ADW might be interacting with a customer to provide superior customer service, responding to business events in near real time, or sharing up-to-date status data among merchants, vendors, customers, and associates. 7. The high cost of data warehouses can make them too expensive for a company to implement. Data marts are lower-cost, scaled-down versions that can be implemented in a much shorter time, for example, in less than 90 days. Data marts serve a specific department or function, such as finance, marketing, or operations. Since they store smaller amounts of data, they are faster and easier to use and navigate. 8. Machine-generated sensor data are becoming a larger
  • 56. proportion of big data (Figure 3.16). Analyzing them can lead to optimizing cost savings and productivity gains. Manufacturers can track the condition of operating machinery and predict the probability of failure, as well as track wear and determine when preventive maintenance is needed. Federal health reform efforts have pushed health-care organizations toward big data and analytics. These organizations are planning to use big data analytics to support revenue cycle management, resource utilization, fraud prevention, health management, and quality improvement, in addition to reducing operational expenses. 9. Apache Hadoop is a widely used processing platform which places no conditions on the structure of the data it can process. Hadoop implements MapReduce in two stages: Map stage: MapReduce breaks up the huge dataset into smaller subsets; then distributes the subsets among multiple servers where they are partially processed. Reduce stage: The partial results from the map stage are then recombined and made available for analytic tools. Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Learning Objectives Data Warehouse and Big Data Analytics Data and Text Mining Business Intelligence
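The two MapReduce stages described in answer 9 can be illustrated with a small word count. This is a plain-Python sketch of the idea only; it does not use the Hadoop API, the subsets are processed one after another in a single process where a real cluster would spread them across servers, and the sensor-log lines and function names are hypothetical.

```python
from collections import Counter
from functools import reduce


def split_into_subsets(records, n_subsets):
    """Break the dataset into smaller subsets (one per hypothetical server)."""
    return [records[i::n_subsets] for i in range(n_subsets)]


def map_stage(subset):
    """Map stage: partially process one subset into word counts."""
    counts = Counter()
    for line in subset:
        counts.update(line.lower().split())
    return counts


def reduce_stage(partial_results):
    """Reduce stage: recombine the partial results into one final result."""
    return reduce(lambda left, right: left + right, partial_results, Counter())


# Illustrative machine-generated log lines (made-up data).
records = [
    "engine wear detected",
    "engine temperature normal",
    "wear on engine belt",
]

partials = [map_stage(subset) for subset in split_into_subsets(records, 2)]
totals = reduce_stage(partials)
print(totals["engine"], totals["wear"])
```

Because each subset is counted independently, the map work can run in parallel; only the small partial results need to be brought together in the reduce stage.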
  • 57. Electronic Records Management Database Management Systems Data and Text Mining Creating Business Value Business Analytics: the entire function of applying technologies, algorithms, human expertise, and judgment. Data Mining: software that enables users to analyze data from various dimensions or angles, categorize them, and find correlative patterns among fields in the data warehouse. Text Mining: broad category involving interpreting words and concepts in context. Sentiment Analysis: trying to understand consumer intent. Chapter 3
  • 58. Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Data and Text Mining Text Analytics (Mining) Procedure Exploration Simple word counts Topics consolidation Preprocessing Standardization May be 80% of processing time Grammar and spell checking Categorizing and Modelling Create business rules and train models for accuracy and precision
  • 59. Chapter 3 Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Data and Text Mining Describe data mining. How does data mining generate or provide value? Give an example. What is text mining? Explain the text mining procedure. Chapter 3 Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Suggested Answers: 1. Data mining is the process of analyzing data from various dimensions or angles, categorizing them, and finding correlations or patterns among fields in the data warehouse. 2. Data mining is used to discover knowledge that you did not know existed in the databases. Answers may vary. A data mining example: The mega-retailer Walmart wanted its online shoppers to find what they were looking for faster. Walmart analyzed clickstream data from its 45 million monthly online shoppers then combined that data with product and category related popularity scores which were generated by text mining the retailer’s social media streams. Lessons learned from the analysis were integrated into the Polaris search engine used by customers on the company’s website. Polaris has yielded a 10 to 15 percent increase in online shoppers completing a purchase, which equals roughly $1 billion in incremental online sales.
  • 60. 3. Up to 75 percent of an organization’s data are non-structured word processing documents, social media, text messages, audio, video, images and diagrams, fax and memos, call center or claims notes, and so on. Text mining is a broad category that involves interpreting words and concepts in context. Then the text is organized, explored, and analyzed to provide actionable insights for managers. With text analytics, information is extracted out of large quantities of various types of textual information. It can be combined with structured data within an automated process. Innovative companies know they could be more successful in meeting their customers’ needs if they just understood them better. Text analytics is proving to be an invaluable tool in doing this. 4. The basic steps involved in text mining/analytics include: Exploration. First, documents are explored. This might be in the form of simple word counts in a document collection, or manually creating topic areas to categorize documents by reading a sample of them. For example, what are the major types of issues (brake or engine failure) that have been identified in recent automobile warranty claims? A challenge of the exploration effort is misspelled or abbreviated words, acronyms, or slang. Preprocessing. Before analysis or the automated categorization of the content, the text may need to be preprocessed to standardize it to the extent possible. As in traditional analysis, up to 80 percent of the time can be spent preparing and standardizing the data. Misspelled words, abbreviations, and slang may need to be transformed into a consistent term. For instance, BTW would be standardized to “by the way” and “left voice message” could be tagged as “lvm.” Categorizing and Modeling. Content is then ready to be categorized. Categorizing messages or documents from information contained within them can be achieved using statistical models and business rules. As with traditional model
  • 61. development, sample documents are examined to train the models. Additional documents are then processed to validate the accuracy and precision of the model, and finally new documents are evaluated using the final model (scored). Models then can be put into production for automated processing of new documents as they arrive. Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Learning Objectives Data Warehouse and Big Data Analytics Data and Text Mining Business Intelligence Electronic Records Management Database Management Systems
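The text-mining steps from answer 4 (exploration, preprocessing, categorizing) can be sketched as follows. This is a simplified illustration under stated assumptions: the warranty-claim notes are made up, the keyword rules are hypothetical business rules, and no statistical model is trained, so the training and validation steps described in the answer are not shown. The two standardizations (BTW and "left voice message") are the ones given in the answer.

```python
from collections import Counter

# Standardizations taken from the answer's own examples.
ABBREVIATIONS = {"btw": "by the way"}
PHRASE_TAGS = {"left voice message": "lvm"}

# Hypothetical business rules: category label -> trigger keywords.
RULES = {
    "brake": {"brake", "brakes"},
    "engine": {"engine", "motor"},
}


def explore(docs):
    """Exploration: simple word counts across the document collection."""
    return Counter(word for doc in docs for word in doc.lower().split())


def preprocess(text):
    """Preprocessing: standardize case, phrases, abbreviations, punctuation."""
    text = text.lower()
    for phrase, tag in PHRASE_TAGS.items():
        text = text.replace(phrase, tag)
    words = [word.strip(".,;!?") for word in text.split()]
    return " ".join(ABBREVIATIONS.get(word, word) for word in words)


def categorize(doc):
    """Categorizing: apply the keyword rules to a preprocessed document."""
    words = set(doc.split())
    labels = [label for label, keywords in RULES.items() if words & keywords]
    return labels or ["uncategorized"]


claims = [
    "BTW the brakes squeal.",
    "Engine stalls; left voice message",
    "Paint is chipped",
]

word_counts = explore(claims)
clean = [preprocess(claim) for claim in claims]
labels = [categorize(doc) for doc in clean]
print(clean)
print(labels)
```

In practice the rules would be refined against sample documents and checked on additional ones before being used on new claims, as the answer describes.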
  • 62. Business Intelligence Key to competitive advantage Across industries in all size enterprises Used in operational management, business process, and decision making Provides moment of value to decision makers Unites data, technology, analytics, & human knowledge to optimize decisions Chapter 3 Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Business Intelligence Challenges Data Selection & Quality Alignment with Business Strategy and BI Strategy Alignment Clearly articulates business strategy Deconstructs business strategy into targets Identifies KPIs Prioritizes KPIs Creates a plan based on priorities Transforms based on strategic results and changes Chapter 3 Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Business Intelligence Chapter 3
  • 63. Smart Devices Everywhere have created demand for effortless 24/7 access to insights. Data is Big Business when they provide insight that supports decisions and action. Advanced BI and Analytics help to ask questions that were previously unknown and unanswerable. Cloud Enabled BI and Analytics are providing low-cost and flexible solutions. Figure 3.20 Four factors contributing to increased use of BI. Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Business Intelligence BI Architecture and Analytics Advances in response to big data and end-user performance demands. Hosted on public or private clouds. Limits IT staff and controls costs May slow response time, add security and backup risks Chapter 3 Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Business Intelligence How has BI improved performance management at Quicken Loans? What are the business benefits of BI? What are two data-related challenges that must be resolved for
  • 64. BI to produce meaningful insight? What are the steps in a BI governance program? What is a business-driven development approach? What does it mean to drill down, and why is it important? What four factors are contributing to increased use of BI? How did BI help CarMax achieve record-setting revenue growth? Chapter 3 Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Suggested Answers: 1. Using BI, the company has increased the speed from loan application to close, which allows it to meet client needs as thoroughly and quickly as possible. Over almost a decade, performance management has evolved from a manual process of report generation to BI-driven dashboards and user-defined alerts that allow business leaders to proactively deal with obstacles and identify opportunities for growth and improvement. 2. BI provides data at the moment of value to a decision maker—enabling it to extract crucial facts from enterprise data in real time or near real time. BI solutions help an organization to know what questions to ask and to find answers to those questions. BI tools integrate and consolidate data from various internal and external sources and then process them into information to make smart decisions. According to The Data Warehousing Institute (TDWI), BI “unites data, technology, analytics, and human knowledge to optimize business decisions and ultimately drive an enterprise’s success. BI programs… transform data into usable, actionable business information” (TDWI, 2012). Managers use business analytics to make better-informed
  • 65. decisions and hopefully provide them with a competitive advantage. BI is used to analyze past performance and identify opportunities to improve future performance. 3. Data selection and data quality. Information overload is a major problem for executives and for employees. Another common challenge is data quality, particularly with regard to online information, because the source and accuracy might not be verifiable. 4. The mission of a BI governance program is to achieve the following: Clearly articulate business strategies. Deconstruct the business strategies into a set of specific goals and objectives (the targets). Identify the key performance indicators (KPIs) that will be used to measure progress toward each target. Prioritize the list of KPIs. Create a plan to achieve goals and objectives based on the priorities. Estimate the costs needed to implement the BI plan. Assess and update the priorities based on business results and changes in business strategy. 5. A business-driven development approach starts with a business strategy and works backward to identify data sources and the data that need to be acquired and analyzed. 6. Drilling down into the data is going from highly consolidated or summarized figures into the detail numbers from which they were derived. Sometimes a summarized view of the data is all that is needed; however, drilling down into the data from which the summary came provides the ability to do more in-depth analyses.
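Drilling down, as described in answer 6, can be shown with a tiny example: a summarized figure per region, and the detail rows from which one of those figures was derived. The sales records, region names, and function names below are hypothetical.

```python
from collections import defaultdict

# Hypothetical detail records: (region, store, sales amount).
SALES = [
    ("East", "Store 1", 100),
    ("East", "Store 2", 250),
    ("West", "Store 3", 300),
]


def summarize(rows):
    """Consolidated view: total sales per region."""
    totals = defaultdict(int)
    for region, _store, amount in rows:
        totals[region] += amount
    return dict(totals)


def drill_down(rows, region):
    """Detail view: the store-level numbers behind one region's total."""
    return [(store, amount) for r, store, amount in rows if r == region]


summary = summarize(SALES)
east_detail = drill_down(SALES, "East")
print(summary)
print(east_detail)
```

The summary answers "how much did each region sell?", while the drill-down shows which stores account for a given region's figure.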
  • 66. 7. Smart Devices Everywhere have created demand for effortless 24/7 access to insights. Data is Big Business when they provide insight that supports decisions and action. Advanced BI and Analytics help to ask questions that were previously unknown and unanswerable. Cloud Enabled BI and Analytics are providing low-cost and flexible solutions. 8. The ISs that helped CarMax include: A proprietary IS that captures, analyzes, interprets, and distributes data about the cars CarMax sells and buys. Data analytics applications that track every purchase; number of test drives and credit applications per car; color preference in every demographic and region. Proprietary store technology that provides management with real-time data about every aspect of store operations, such as inventory management, pricing, vehicle transfers, wholesale auctions, and sales consultant productivity. An advanced inventory management system helps management anticipate future inventory needs and manage pricing. Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Learning Objectives Data Warehouse and Big Data Analytics Data and Text Mining
  • 67. Business Intelligence Electronic Records Management Database Management Systems Electronic Records Management Business Records Documentation of a business event, action, decision, or transaction. Electronic Records Management (ERM) Workflow software, authoring tools, scanners, and databases that manage and archive electronic documents and image paper documents. Index and store documents according to company policy or legal compliance. Success depends on partnership of key players. Chapter 3
  • 68. Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Electronic Records Management Best Practices Effective systems capture all business data. Input from online forms, bar codes, sensors, websites, social sites, copiers, e-mails, and more. Industry Standards Association for Information and Image Management (AIIM; www.aiim.org) National Archives and Records Administration (NARA; www.archives.gov) ARMA International (formerly the Association of Records Managers and Administrators; www.arma.org) Chapter 3 Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Electronic Records Management Primary Benefits Access and use the content contained in documents. Cut labor costs by automating business processes. Reduce time and effort to locate required information for decision making. Improve content security, thereby reducing intellectual property theft risks. Minimizes content printing, storing, and searching costs. Chapter 3 Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Electronic Records Management DISASTER RECOVERY, BUSINESS CONTINUITY, AND
  • 69. COMPLIANCE Does the software meet the organization’s needs? For example, can the DMS be installed on the existing network? Can it be purchased as a service? Is the software easy to use and accessible from Web browsers, office applications, and e-mail applications? If not, people will not use it. Does the software have lightweight, modern Web and graphical user interfaces that effectively support remote users? Before selecting a vendor, it is important to examine workflows and how data, documents, and communications flow throughout the company. Chapter 3 Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Electronic Records Management What are business records? Why is ERM a strategic issue rather than simply an IT issue? Why might a company have a legal duty to retain records? Give an example. Why is creating backups an insufficient way to manage an organization’s documents? What are the benefits of ERM? Chapter 3 Copyright © 2015 John Wiley & Sons, Inc. All rights reserved. Suggested Answers: 1. All organizations create and retain business records. A record is documentation of a business event, action, decision, or transaction. Examples are contracts, research and development, accounting source documents, memos, customer/client
  • 70. communications, hiring and promotion decisions, meeting minutes, social posts, texts, e-mails, website content, database records, and paper and electronic files. Business documents such as spreadsheets, e-mail messages, and word-processing documents are a type of records. Most records are kept in electronic format and maintained throughout their life cycle— from creation to final archiving or destruction by an electronic records management (ERM) system. 2. Because senior management must ensure that their companies comply with legal and regulatory duties, managing electronic records (e-records) is a strategic issue for organizations in both the public and private sectors. The success of ERM depends greatly on a partnership of many key players, namely, senior management, users, records managers, archivists, administrators, and most importantly, IT personnel. Properly managed, records are strategic assets. Improperly managed or destroyed, they become liabilities. 3. Companies need to be prepared to respond to an audit, federal investigation, lawsuit, or any other legal action against them. Types of lawsuits against companies include patent violations, product safety negligence, theft of intellectual property, breach of contract, wrongful termination, harassment, discrimination, and many more. 4. Simply creating backups of records is not sufficient because the content would not be organized and indexed to retrieve them accurately and easily. The requirement to manage records— regardless of whether they are physical or digital—is not new. ERM systems consist of hardware and software that manage and archive electronic documents and image paper documents; then index and store them according to company policy. Properly managed, records are strategic assets. Improperly managed or destroyed, they become liabilities.
  • 71. 5. Departments or companies whose employees spend most of their day filing or retrieving documents or warehousing paper records can reduce costs significantly with ERM. These systems minimize the inefficiencies and frustration associated with managing paper documents and workflows. However, they do not create a paperless office as had been predicted. An ERM can help a business to become more efficient and productive by: Enabling the company to access and use the content contained in documents. Cutting labor costs by automating business processes. Reducing the time and effort required to locate information the business needs to support decision making. Improving the security of content, thereby reducing the risk of intellectual property theft. Minimizing the costs associated with printing, storing, and searching for content. When workflows are digital, productivity increases, costs decrease, compliance obligations are easier to verify, and green computing becomes possible.