Outlines an Approach to Describing the Organisation Data Landscape to Assist with Data Transformation Analysis and Planning
The Data Landscape is a representation of the organisation’s data entities and their relationships, interfaces and data flows. Data entities are data asset components that perform data-related functions, from data storage to data transfer and data processing within the Data Landscape.
The objective of developing a Data Landscape model is to define an approach for formally and exactly defining the operation and use of data at a high-level within the organisation and to plan for future changes. It allows the enterprise data fabric to be defined and modelled.
Creating a data landscape view is important as data underpins the operation of information technology solutions and business processes. Data breathes life into solutions as it flows through the organisation. The optimum and most cost-effective design of the data landscape is therefore important. Similarly, solutions that are developed or acquired and deployed on the data landscape
The nature of the organisation data landscape is changing as organisations are undergoing a data transformation.
Describing the Organisation Data Landscape
Alan McSweeney
http://ie.linkedin.com/in/alanmcsweeney
Contents
Introduction, Purpose and Scope
What is the Organisation Data Landscape?
Data Landscape Definition Principles
Benefits and Uses of Data Landscape Approach
Data Landscape Concepts
  Data Zones
  Data Entity Types
  Data Entity Relationships
  Data Interfaces and Data Flows
  Levels of Descriptive Detail
Data Landscape Data Model
  Core Data Landscape Data Model
  Extended Data Landscape Data Model
    Data Entity Components and Functions
    Data Entity Attributes
    Data Security and Data Transformation
    Data Entity Contents
  Application Group or Service
Data Processes and Capabilities
  Data Process Framework
  Using a Data Process Framework to Assess the Health of the Data Landscape
  Data Maturity Models
Business Functions and Business Processes
Data Landscape Model and Enterprise Data Model
Using the Data Landscape Model Approach – Some Sample Scenarios
  Move Data Entities to the Cloud
  Implement Cloud-based Data Analytics Capability
  Move Test Environments to the Cloud
  Outsource IT Infrastructure
  Implement Backup and Recovery as a Service
  Implement Disaster Recovery as a Service
List of Figures
Figure 1 – Changing Organisation Data Landscape
Figure 2 – Information and Data Architecture in a Wider Information Technology Architecture Context
Figure 3 – Layered View of Data Landscape Data Zones
Figure 4 – Island View of Data Landscape Data Zones
Figure 5 – Additional Data Zones
Figure 6 – Data Entity Types in Data Zones
Figure 7 – Data Zone and Data Entity Type Example with Cloud and Outsourced Service Providers
Figure 8 – Sample Data Entity Relationships
Figure 9 – Direct Source Push Data Flow
Figure 10 – Direct Target Pull Data Flow
Figure 11 – Indirect Source Push Target Pull Data Flow
Figure 12 – Indirect Source Push Target Pull Data Flow
Figure 13 – Indirect Source Pull Target Push Data Flow
Figure 14 – Indirect Source Push Target Push Data Flow
Figure 15 – Transformation Source Pull Target Pull Data Flow
Figure 16 – Transformation Source Push Target Pull Data Flow
Figure 17 – Transformation Source Pull Target Push Data Flow
Figure 18 – Transformation Source Push Target Push Data Flow
Figure 19 – Transformation With Multiple Target Pushes
Figure 20 – Sample Data Interfaces and Data Flows for Data Entity Types
Figure 21 – Example of Single Extended Data Flow across a Number of Data Entities
Figure 22 – Splitting Sample Extended Data Flow into Two Separate Data Flows
Figure 23 – Simplified Representation of Data Flows
Figure 24 – Relationships Between Levels of Data Landscape Model Details
Figure 25 – Data Landscape Core Data Model
Figure 26 – Core and Extended Data Landscape Models
Figure 27 – Data Component Level of Detail Extension to Core Data Model
Figure 28 – Data Attribute Extensions to Core Data Model
Figure 29 – Data Entity Future Options
Figure 30 – Financial Impact View of Data Landscape
Figure 31 – Security Implications of Data Transformation
Figure 32 – Data Content Extensions to Core Data Model
Figure 33 – Application/Service Group Extensions to Core Data Model
Figure 34 – Application View of Data Entities
Figure 35 – Data Entities Shared Between Applications
Figure 36 – Application Environments
Figure 37 – Data Management Processes Views
Figure 38 – Connections Between Data Capability Areas
Figure 39 – Data Process Implementation, Operation and Use Aspects
Figure 40 – Data Capability Process Assessment Framework
Figure 41 – Generic Maturity Model Structure
Figure 42 – Improving Process Maturity
Figure 43 – Data Maturity Models
Figure 44 – Data Entity Data Capability Process Health View
Figure 45 – Business Processes and Interactions with Data Entities
Figure 46 – Structure of the Enterprise Data Model (EDM)
Figure 47 – Sample Subject Area Model
Figure 48 – Sample Data Landscape with Overlaid Subject Area Model Data
Figure 49 – Sample Organisation Data Landscape
Figure 50 – Data Entity Sample Representation
Figure 51 – Data Landscape: Move Data Entities to the Cloud
Figure 52 – Data Landscape: Implement Cloud-based Data Analytics Capability
Figure 53 – Data Landscape: Move Test Environments to the Cloud
Figure 54 – Data Landscape: Outsource IT Infrastructure
Figure 55 – Data Landscape: Implement Backup and Recovery as a Service
Figure 56 – Data Landscape: Implement Disaster Recovery as a Service
Introduction, Purpose and Scope
The Data Landscape is a representation of the organisation’s data entities and their relationships,
interfaces and data flows. Data entities are data asset components that perform data-related functions,
from data storage to data transfer and data processing within the Data Landscape.
Supporting the Data Landscape is a database structure to allow information to be stored, managed,
extracted and reported on. This is described on page 49 onwards. This means that the Data Landscape is
not a static representation of a current or desired future state. It is a dynamic model that can be updated
and maintained. It can be used to assess change scenarios.
The objective of developing a Data Landscape model is to define an approach for formally and exactly
defining the operation and use of data at a high-level within the organisation and to plan for future
changes.
The outputs from the data landscape model creation process are aimed at both technical and non-
technical audiences.
The model needs to be sufficiently flexible to include different levels of detail, from a high-level view of
data entities and their relationships to detailed descriptive attributes on the contents and processing
performed by data entities and their underlying technology components.
This approach does not use a formal modelling language other than a relational model (as a form of
database) for the constructs underlying the data landscape. The material contained here represents a set
of conceptual and dynamic models designed to allow insights to be obtained on the design, construction,
operation and use of the data landscape. As the data landscape model is itself data driven, it can be
changed easily and quickly.
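As an illustration of what a relational, data-driven landscape model could look like, the following is a minimal sketch using an in-memory SQLite database. The table names, column names and sample entities are hypothetical and chosen only for this example; they are not the data model defined later in this note. The zone names are taken from the data zones described in this note.

```python
import sqlite3

# Hypothetical schema: table and column names are assumptions for illustration only.
SCHEMA = """
CREATE TABLE data_zone (
    zone_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE data_entity (
    entity_id   INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE,
    entity_type TEXT NOT NULL,
    zone_id     INTEGER NOT NULL REFERENCES data_zone(zone_id)
);
CREATE TABLE data_flow (
    flow_id   INTEGER PRIMARY KEY,
    source_id INTEGER NOT NULL REFERENCES data_entity(entity_id),
    target_id INTEGER NOT NULL REFERENCES data_entity(entity_id)
);
"""


def build_landscape() -> sqlite3.Connection:
    """Create an in-memory landscape with two zones, two entities and one flow."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO data_zone (zone_id, name) VALUES (?, ?)",
        [
            (1, "Central Data Infrastructure"),
            (2, "Cloud Service Provider Data Infrastructure"),
        ],
    )
    # Sample entities are invented for the example.
    conn.executemany(
        "INSERT INTO data_entity (entity_id, name, entity_type, zone_id) "
        "VALUES (?, ?, ?, ?)",
        [
            (1, "Sales Application", "Application", 1),
            (2, "Reporting Store", "Data Store", 1),
        ],
    )
    conn.execute("INSERT INTO data_flow (source_id, target_id) VALUES (1, 2)")
    conn.commit()
    return conn


def entities_in_zone(conn: sqlite3.Connection, zone_name: str) -> list[str]:
    """Return the names of the entities located in a zone, sorted by name."""
    rows = conn.execute(
        "SELECT e.name FROM data_entity e "
        "JOIN data_zone z ON z.zone_id = e.zone_id "
        "WHERE z.name = ? ORDER BY e.name",
        (zone_name,),
    ).fetchall()
    return [row[0] for row in rows]


def move_entity(conn: sqlite3.Connection, entity_name: str, zone_name: str) -> None:
    """Model a change scenario by relocating an entity to another zone."""
    conn.execute(
        "UPDATE data_entity SET zone_id = "
        "(SELECT zone_id FROM data_zone WHERE name = ?) WHERE name = ?",
        (zone_name, entity_name),
    )
    conn.commit()
```

Because the model is held as data, a change scenario such as moving an entity to a cloud zone is a single update rather than a redrawn diagram, which is the sense in which the model can be changed easily and quickly.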
This note contains the following sections:
What is the Organisation Data Landscape? on page 6 – this outlines the concept and objectives
of the data landscape, lists the data-related drivers and the linkages to other information technology
architecture practices.
Data Landscape Definition Principles on page 9 – this lists some principles to apply to the
creation of the data landscape model.
Benefits and Uses of Data Landscape Approach on page 10 – this lists some of the benefits of
using the data landscape approach.
Data Landscape Concepts on page 10 – this introduces the concepts that underpin the data
landscape approach.
Data Landscape Data Model on page 35 – this describes the core and extended elements of the
data landscape model.
Data Processes and Capabilities on page 53 – this describes data processes, capabilities and data
life stages and their possible use to assess the health and status of the data landscape.
Business Functions and Business Processes on page 66 – this discusses an extension of the data
landscape model to incorporate details on business processes associated with data processing.
Data Landscape Model and Enterprise Data Model on page 67 – this outlines an extension to
the data landscape model to include elements of an Enterprise Data Model such as the Subject
Area Model.
Using the Data Landscape Model Approach – Some Sample Scenarios on page 72 – this
contains some examples of using the data landscape model for planning data-related changes.
What is the Organisation Data Landscape?
The organisation data landscape is a representation of the static data entities and the dynamic
relationships and data flows across a wide view of the organisation, including external interacting data
components and parties, both current and future.
Creating a data landscape view is important as data underpins the operation of information technology
solutions and business processes. Data breathes life into solutions as it flows through the organisation.
The optimum and most cost-effective design of the data landscape is therefore important. Similarly,
solutions that are developed or acquired and deployed on the data landscape
The nature of the organisation data landscape is changing as organisations are undergoing a data
transformation:
The data landscape has been broadened and there are more data entities that form part of the
extended organisation data landscape as more applications are moved to cloud service providers and
as cloud platforms are used for providing additional facilities not currently present in organisations
such as data analytics and machine learning
There is a wider range of data entities as the data landscape increases in complexity
There are more data entity types and data-related capabilities, especially in the areas of advanced
data analytics
There are more data demands within the organisation especially in the areas of analytics
These developments co-exist with other more general data-related trends that include:
Greater volumes of operational data from increasing numbers of different sources and providers
Greater volumes of derived data
More data sources both internal and external to the organisation
Data in larger numbers of different formats
Data with a wider range of contents
Data being generated at different rates
Data being generated at different times
Data being generated with varying degrees of accuracy and reliability, and with greater fuzziness
Data that changes constantly
Data that is of different utility and value
Figure 1 – Changing Organisation Data Landscape
The data landscape approach aims to understand and handle these complexities in order to enable the
organisation to move from its current state to a target future state. It allows options to be explored and
understood. It facilitates the planning and organisation required to achieve this change.
The creation of an organisation data landscape is not an end in itself. It is constructed to add value to
data architecture-related activities, provide insight, assist with resolving issues and in planning data-
related changes.
The organisation’s data landscape and the work of the data architect in developing it evolve in line with
other information technology architectural practices within the organisation that can involve some or all
of the following logical roles:
Enterprise Architecture that defines, develops, extends and manages the implementation and
operation of the overall IT delivery and operation framework including standards and solution
development and acquisition.
Solution Architecture that designs and oversees the implementation of a portfolio of IT solutions
that translate business needs into operable and usable systems that comply with standards.
Application Architecture that defines application architectures including development, sourcing,
deployment and integration.
Business Architecture that defines and manages the implementation of IT solutions and related
organisation changes needed to implement business strategy and objectives.
Service Architecture that designs and oversees the implementation of service processes and
supporting technologies and systems to ensure the successful operation of IT solutions including
outsourced supplier management framework.
Security Architecture that designs data and system security processes and systems to ensure the
security of information and systems across the entire IT landscape.
Technical Architecture that translates solution designs into technical delivery, acting as a bridge
between solution architecture and the delivery function and designing new delivery approaches.
Infrastructure Architecture that designs application, communication and data infrastructures to
operate the portfolio of IT solutions.
The data architect does not work in isolation from these other architectural disciplines. While the data
architect needs to focus on the core work of data architecture, the work should be part of the wider
overall organisation’s IT architecture. The data architect needs to balance a narrow (and selfish)
focus on pure data work with the broader needs of other IT architecture disciplines. The
results of the data landscape model should be shared with other members of the wider IT architecture
team.
Figure 2 – Information and Data Architecture in a Wider Information Technology Architecture Context
The data landscape is an integrated view of all data entities within and outside the organisation. It
captures a larger and deeper view of data and the data technologies, processes and capabilities within
and outside the organisation. This approach is designed as one tool to allow the data architect to perform
the role of ensuring the success of current data operations while planning for adopting changes and new
technologies.
This data operations view captures key data entities, their relationships, data flows and the associated
data capabilities and their supporting processes.
The objective is not to represent all information technology components or applications, but just those
that are explicitly related to the processing of data in its widest sense. Server infrastructure used to host
data processing applications is not explicitly represented. Similarly, security infrastructure such as web
proxies, firewalls, security appliances and user directories need not be shown unless doing so adds value
to describing, understanding and planning the data landscape.
Data Landscape Definition Principles
The data landscape model creation process must be governed by a number of principles:
Less is More – create a model that is just detailed and complex enough to allow results to be
generated. The more detail that is added to the model, the greater its complexity becomes. Usability
and the ability to interpret the model to generate insight and value are reduced. While the amount of
detail that constitutes too much is undefined and subjective, the model should nonetheless be kept
simple. Too much detail, especially at the early stage, will kill the data landscape creation process.
Self-Descriptive – the data landscape model should be as self-descriptive as possible. The model
should be easy to understand and require as little additional knowledge as possible.
Simplicity of Representation – the meaning should be immediately obvious. Time spent
explaining how to interpret the model is a waste of time that could be spent on using the model for
planning and decision-making. Information can be filtered to control complexity.
Consistency – the information representation approach should be consistent across all presentation
instances and types.
Utility – one measure of the usability of the model is that it is useful and is used. One objective of
modelling is to aid understanding, insight, planning and problem determination and resolution.
Results-Focussed – the model is not an end in itself but a means to an end. Too much analysis is
implicitly inward- and backward-looking. The model should be forward-looking, looking to assist
with the resolution of problems and in planning and defining the future data landscape. Time and
effort spent on gathering detail that is not useful or relevant means that less will be
available to devote to value-adding activities. Documenting the existing data entity landscape can
be useful to determine the gaps that must be filled.
Relevance – the model should only contain what is relevant. Irrelevant detail should not be added.
These principles are inexact but they should nonetheless be considered when creating any data landscape
model.
Benefits and Uses of Data Landscape Approach
The landscape is only as useful as the information it contains and the accuracy and currency of that
information and the ability to present and use the information. The level of detail that is gathered about
the data landscape governs the type of detailed processing and analysis that can be performed. More
information means more maintenance of that information. The usefulness and usage will be reduced if
the information is not current.
Information should be collected at a high level initially. More detail can be added later. Only sufficient
information that is needed to add value should be collected and input into the model. The objective is to
describe the present in order to map, plan and understand the future, identify gaps, consider options and
optimise future configurations.
Creating the data landscape view requires an audit of the existing data entities and their static and dynamic data
relationships and flows. It can be a once-off or a continuous engagement: once-off to assist with specific
planning activities or continuous to allow the state of the data landscape to be constantly reviewed.
The data landscape is a representation of the way in which the organisation currently generates, receives,
processes, uses and disseminates data, and of how it would like to do so in the future.
The approach will allow changes to be planned and their requirements and impacts to be understood. It
will allow data selection, design and deployment options to be explored and opportunities to be assessed,
their scope understood, their impacts identified and data architecture and technology alternatives to be
explored.
It can be used to assist with understanding, mapping and planning an organisation’s data
transformation and assist in moving to a more data operations-oriented organisation. It can be used to
identify opportunities for improvement, simplification and automation. Gaps and missing data
capabilities and facilities can be identified.
Data Landscape Concepts
The data landscape model incorporates a number of concepts:
Data Zones – these are groupings of physically closely located data entities. The data zones
represent major clusters of or containers for data entities that are physically and/or logically close to
one another. Data zones do not represent objects that perform processing. The data entities located
within data zones perform the data related activities.
Data Entity Types – these are types of data source, storage, transformation, processing or transfer
components. Data entities perform data-related work across the spectrum of actions and events.
Essentially a data entity is a hardware or software technology component involved in any form of
data processing. Data entity types are associated with data zones.
Data Entities – these are data assets that are involved in the storage, processing and transfer of
organisation data, in the widest sense. Data entities have a type and are located in Data Zones.
Data Entity Relationships – these are connections or associations between data entities. These
relationships can be loosely or exactly defined.
Data Interfaces and Data Flows – a data interface is a specific way a data entity can provide or
accept data. A data flow is a link between two (or more) data entity interfaces where there are at
least two endpoints: a source and a target.
Levels of Descriptive Detail – these define types and amounts of information to be provided
ranging from initial foundational information to details on individual data elements within a data
entity.
Data Entity Type Attributes – these are entity-specific attributes that contain descriptive
information.
Data Capabilities and Processes – these are sets of activities commonly and repeatedly performed
to generate specific results within the context of the data landscape.
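The concepts listed above can be sketched as a small set of Python types. This is only an illustrative reading of the definitions, not the data model defined later in this note: the class and field names are assumptions, interfaces are reduced to plain names, and only the simplest case of a data flow with exactly one source and one target is covered.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataZone:
    """A grouping of data entities; zones themselves perform no processing."""
    name: str


@dataclass(frozen=True)
class DataEntityType:
    """A type of data source, storage, transformation, processing or transfer component."""
    name: str


@dataclass
class DataEntity:
    """A data asset that has a type and is located in a data zone."""
    name: str
    entity_type: DataEntityType
    zone: DataZone
    # Hypothetical simplification: each interface is identified by a name only.
    interfaces: set[str] = field(default_factory=set)


@dataclass
class DataFlow:
    """A link between a source interface and a target interface."""
    source: DataEntity
    source_interface: str
    target: DataEntity
    target_interface: str

    def __post_init__(self) -> None:
        # A flow must connect interfaces that the two entities actually expose.
        if self.source_interface not in self.source.interfaces:
            raise ValueError(
                f"{self.source.name} has no interface {self.source_interface!r}"
            )
        if self.target_interface not in self.target.interfaces:
            raise ValueError(
                f"{self.target.name} has no interface {self.target_interface!r}"
            )
```

For example, a flow from an "export" interface on one entity to a "load" interface on another is accepted, while a flow that names an interface the source does not expose raises a `ValueError`. Flows with more than two endpoints would need a list of endpoints rather than a single source and target.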
Data Zones
The data landscape model contains a number of data zones, such as:
Insecure External Organisation Presentation And Access – this represents a location where
publicly accessible data entities reside. These entities are regarded as insecure and/or untrusted.
Secure External Organisation Participation and Collaboration – this is a location outside the
physical organisation boundary where data entities that are provided by or to trusted external
parties reside.
Secure External Organisation Access – this zone contains data entities that enable secure access
from outside the organisation.
Organisation – this data zone represents the entire organisation and it contains all the locations
and business units or functions within the organisation.
Central Data Infrastructure – this contains the central data applications and their associated
data.
Business Unit/Location Data Infrastructure – this is an individual organisation business unit or
location and the data entities it contains.
These are shown in the following diagram.
Figure 3 – Layered View of Data Landscape Data Zones
In this diagram, higher-level data zones are shown as encompassing and surrounding lower-level ones.
This is just one possible representation of the logical layering of these data zones, from central data
infrastructure to wider zones that ring the organisation.
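One way to hold this layering as data is a simple parent lookup, sketched below. The nesting order used here is an assumption read from the order of the zone descriptions above; the actual layering in the diagram may differ, in particular for how the Central and Business Unit/Location zones sit within the Organisation zone.

```python
from typing import Optional

# Assumed nesting (inner zone -> enclosing zone). This ordering is an
# interpretation of the zone list, not a definitive reading of the diagram.
PARENT_ZONE: dict[str, Optional[str]] = {
    "Central Data Infrastructure": "Organisation",
    "Business Unit/Location Data Infrastructure": "Organisation",
    "Organisation": "Secure External Organisation Access",
    "Secure External Organisation Access":
        "Secure External Organisation Participation and Collaboration",
    "Secure External Organisation Participation and Collaboration":
        "Insecure External Organisation Presentation And Access",
    "Insecure External Organisation Presentation And Access": None,
}


def enclosing_zones(zone: str) -> list[str]:
    """Return the zones that surround the given zone, innermost first."""
    chain: list[str] = []
    parent = PARENT_ZONE[zone]
    while parent is not None:
        chain.append(parent)
        parent = PARENT_ZONE[parent]
    return chain
```

Under this assumed nesting, `enclosing_zones("Central Data Infrastructure")` lists the Organisation zone first and the insecure external zone last. An island-style representation would simply drop the parent links and treat each zone as a peer.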
The data zones could also be represented as islands in the following view:
Figure 4 – Island View of Data Landscape Data Zones
The data landscape model can be extended to include further data zones if necessary. The following
diagram shows additional data zones explicitly represented.
Figure 5 – Additional Data Zones
These additional data zones are overt representations of locations for organisation data entities located
outside the core organisation but effectively part of a stretched data landscape. These further data zones
are:
Co-Located Data Infrastructure – this represents organisation data entities that are logically part
of the organisation but physically located within a co-located facility.
Outsourced Service Provider Data Infrastructure – this represents data entities that are used
by the organisation but are provided by, or use technology infrastructure provided by, an outsourcing
service provider.
Cloud Service Provider Data Infrastructure – this represents organisation data entities of
various types (applications and their associated data stores, data technology infrastructure and data
entities implemented on it or platforms used to create data applications and store their data)
provided by cloud service providers.
There could be several of each of these zones, one for each service provider the organisation uses.
In the diagram above, the data entities that are represented in the Central Data Infrastructure zone
level can exist at the Organisation zone. Central organisation data entities can also be located in an
Organisation Location/Unit zone defined as a container for that purpose.
Data Entity Types
These represent types of data entities that can reside in data zones. Data entities are assigned a type.
The following diagram shows one view of a possible list of data entity types located in the concentric
data zone view shown in Figure 3 on page 12.
This diagram shows the data entity types using the concentric data zone view shown in Figure 3 on
page 12. This list is neither complete nor definitive. There are many ways of representing data entity
types, of which this is one. The objective here is to have a consistent way of representing key data entities.
Once this is done, the relationships, interactions and data flows between data entities can be specified.
There will be many actual instances of each of these data entity types.
The initial set of data entity types in this view are:
Data Zone: Insecure External Organisation Presentation And Access
1. External Public Mail – The organisation may use the services of external mail providers, either formally or informally (through some form of shadow IT).
2. Public Web Site – The organisation will have a public web site that will, at the very least, contain static information about the organisation. Content may be managed and changes and updates published using a content management solution. It may contain extracts of information contained in operational systems. It can contain links to applications that can process data. The web site may also accept interactions, either one-way or two-way, such as commercial transactions. The organisation may collect web site usage information to understand how users are interacting with it.
3. Social Media Platforms – The organisation may utilise some of the many social media platforms to perform functions such as presenting data to the public, interacting and transacting with the public and customers and sharing data with customers, employees and partners. These interactions and transactions may require access to internal operational systems. Content may be managed and changes and updates published using a content management solution provided by the social media platform. These platforms may also be a source of interaction metadata that the organisation may access and process.
4. Public Mobile Apps – The organisation may publish apps that allow the public, customers, suppliers and other parties to view information and to interact and transact with it. These interactions and transactions may require access to internal operational systems. Content may be managed and changes and updates published using a content management solution. The apps will need to be updated and these updates will need to be pushed to the application platform and from there to user devices. These platforms may also be a source of interaction metadata that the organisation may access and process.
5. External Organisation Public Data Sources – These represent public and insecure sources of data, structured or unstructured, either being supplied to the organisation (PUSH) or available for the organisation to retrieve (PULL).
6. External Organisation Public Data Targets – These represent public and insecure targets to which the organisation supplies or transmits data, structured or unstructured, either being supplied to the target (PUSH) or available to the target to be retrieved (PULL).
7. External Organisation Public Data Sharing – The organisation may share data with third parties or among its own personnel using facilities provided by public data sharing platforms.
8. Public Data Stores – The organisation may store data on public data storage platforms.
9. Public External Data Devices – The organisation may send data to or receive data from externally located public (not secured) devices.
Data Zone: Secure External Organisation Participation and Collaboration
10. External Private Mail – These represent private mail services hosted securely outside the organisation.
11. External Data Sources – These represent private and secure sources of data, structured or unstructured, either being supplied to the organisation (PUSH) or available for the organisation to retrieve (PULL).
12. External Data Targets – These represent private and secure targets to which the organisation supplies or transmits data, structured or unstructured, either being supplied to the target (PUSH) or available to the target to be retrieved (PULL).
13. External Secure Data Sharing – These represent private and secure data sharing facilities.
14. External Secure Data Stores – These represent private and secure data storage facilities.
15. Collaboration Solutions – The organisation may collaborate internally and with external parties securely using the services of collaboration solutions, either developed or acquired or hosted within the organisation or externally.
16. Transaction Partners – The organisation may transact with or use the services of providers of transaction services. These can include suppliers such as payment service providers, information service providers, process outsourcing or supply chain service providers. These transactions will involve the exchange of data.
17. Private External Data Devices – The organisation may send data to or receive data from externally located private secure devices.
Data Zone: Secure External Organisation Access
18. Secure External Access to Data – This represents the facility through which the external secure data entities are accessed and along which data passes.
19. External Data Communications – This represents a channel through which the external data entities are accessed and along which data passes.
20. Edge Devices – These receive data from other data entities such as Private External Data Devices and Public External Data Devices and optionally process, concentrate or aggregate it before transmitting that data onwards.
21. Hosted Data Infrastructure – This represents hosted (cloud-based) data infrastructure. This is a high-level representation of what could be a set of data entities that would be expanded using a structure such as that shown in Figure 7 on page 21, where the constituent data entity types are shown explicitly in a separate hosted provider data zone.
22. Externally Co-Located Data Infrastructure – This represents data infrastructure located in a co-location facility. This is a high-level representation of what could be a set of data entities that would be expanded using a structure such as that shown in Figure 7 on page 21, where the constituent data entity types are shown explicitly in a separate co-location provider data zone.
23. Externally Co-Located Outsourced Data Infrastructure – This is a placeholder for data infrastructure that the organisation has chosen to place in a co-location facility. This can be expanded to include more details on the data entities using a structure such as that shown in Figure 7 on page 21, where the constituent data entity types are shown explicitly in a separate outsourcing provider data zone.
24. Data Input to Externally Hosted Data Processing Applications – This represents data manually entered into applications that are hosted externally. Such data inputs can be explicitly represented or left implicit.
25. Externally Hosted Data Processing Applications – This is a placeholder for applications that the organisation uses that are hosted externally by service providers. This can be expanded to include more details on the data entities using a structure such as that shown in Figure 7 on page 21.
26. Externally Hosted Data Processing Applications Data Stores – This is a placeholder for an explicit representation of the data stores used by applications that the organisation uses and that are hosted externally by service providers. Such data stores can be left implicit. Given that many service providers have a charging model that includes a data storage/data usage component, their explicit representation allows this to be expressed. This can be expanded to include more details on the data entities using a structure such as that shown in Figure 7 on page 21.
27. Backup and Recovery as a Service – This is a placeholder for an explicit representation of the specific externally-provided service for backup and recovery of data entities that are located on the organisation’s premises and other locations.
28. Disaster Recovery as a Service – This is a placeholder for an explicit representation of the specific externally-provided service for disaster recovery of data entities that are located on the organisation’s premises and other locations.
29. External Data Analytics Services – This is a placeholder for any applications that the organisation uses that perform data reporting and analysis functions and that are hosted externally by service providers. This can be expanded to include more details on the data entities using a structure such as that shown in Figure 7 on page 21.
Data Zone: Organisation
30. External Secure Data Communications – This denotes the data communications links required to allow data entities that are co-located or hosted externally to be accessed from within the organisation and for data to be transferred between the data zones.
31. Inter Site/Unit Data Management – This represents the possible set of management procedures and tools to enable data access and movement between organisation locations that are physically separated from each other.
32. Inter Site/Unit Data Communications – This signifies the data communications links between organisation locations that are physically separated from each other and other data communications-related facilities (such as WAN data compression).
33. Inter Site/Unit Data Replication – This represents the capability to replicate some or all data entities to a second site. This could be represented separately or be a separate instance of the data entity type Inter Site/Unit Data Management.
Data Zone: Central Data Infrastructure
34. Document Sharing and Collaboration – This denotes a type of data application that is used to share and allow collaboration on documents.
35. External Content Data Management and Publication – This denotes a type of data entity that provides facilities to create, manage and publish content to externally-facing data entities such as Public Web Site or Public Mobile Apps. Organisations may take a COPE (Create Once Publish Everywhere) approach to the management of content and its publication across multiple channels and platforms.
36. Data Input to Data Processing Business Applications – This denotes data manually entered into on-premises business applications. Such data inputs can be explicitly represented or left implicit. Explicit representations are useful in that they can indicate problems with such data entry, such as poor data quality checks in the associated business applications.
37. External Data Receipt And Access Control – This signifies data entities that are the target for data being received from external entities. Data resides here before it is transmitted to or retrieved by its processing entities.
38. External Data Transmission And Access Control – This signifies data entities that are the target for data being sent to external entities. Data resides here before it is transmitted to or retrieved by its target external entities.
39. Document and Record Management Systems – These represent a type of data entity that provides document and records management facilities.
40. Internal Email Solution – This represents an on-premises email application.
41. Data Processing Business Applications – This signifies a type of data entity that performs line of business data processing.
42. Integration/Service Bus – This represents a type of data entity that enables communication between mutually interacting data entities in a service oriented model.
43. External Data Synchronisation – This denotes a type of entity that synchronises data held or shared across data zones.
44. Semantic Data – This signifies a type of data entity that stores information on and enables the processing of the meaning and structure of data held in other data entities.
45. Master Data – This denotes a type of data entity that stores master data, performs master data management functions, or both.
46. Data Processing Application Operational Data Stores – This denotes data entities that represent data stores used by line of business data entities.
47. Audit Log and Usage and Performance Data – This refers to a type of data entity that holds logs, audit and usage data on other data entities.
48. Data Storage Infrastructure – This signifies the range of physical data storage infrastructure used by other data entities.
49. Metadata – This represents a type of data entity that stores metadata, performs metadata management functions, or both.
50. Reference Data – This represents a type of data entity that stores reference data vocabularies, performs reference data management functions, or both.
51. ETL – This denotes a data entity type that extracts data from one or more source systems and other data sources, transforms the data and then loads the transformed data into a target.
52. Data Stores – This represents a generic data storage data entity.
53. Data Service Management Tools – This data entity provides facilities to implement and operate service management processes that relate to data operations.
54. Unstructured Data Stores/File Shares – These data entities enable unstructured document-oriented data to be stored and shared.
55. Data Warehouse – This refers to an integrated data store that takes data from many operational systems and other sources, cleans, transforms, normalises and standardises it, adds a time dimension and that is used for reporting and analysis.
56. Data Reporting – This denotes a data entity type that provides a range of tools and facilities that enable data to be reported on, visualised and explored.
57. Data Management Tools – This refers to a type of data entity that provides tools and facilities to perform data management and housekeeping functions such as backup and recovery and replication.
58. External Data Analytics Co-ordination and Management – This denotes a data entity type that allows data analytics functions and work requests to be allocated to external analytics tools and platforms and that co-ordinates the distribution of work and data and that collects and assembles the results.
59. Document Creation – This denotes data entities that are sources of new documents and changes to existing documents.
60. Data Marts – This represents a subset of data warehouse data aimed at presenting a specific subject-oriented set of data for reporting and analysis.
61. Data Analytics – This refers to a data entity type that provides a range of tools and facilities that enable the discovery and interpretation of patterns in data and that provides capabilities such as data modelling.
62. Data Mining – This represents a data entity type that provides a range of tools and facilities that apply statistical techniques to transform data into knowledge and extract meaning from data.
63. Data Entity Infrastructure – This signifies the range of physical infrastructure (such as processing, network and others) used by other data entities.
The data entity types that are in the Central Data Infrastructure data zone could also be located in other
data zones such as:
Co-Located Data Infrastructure.
Outsourced Service Provider Data Infrastructure
Cloud Service Provider Data Infrastructure
These are data entity types and not actual data entities. The scenario analysis section uses a simple data
landscape with data entities of some of these types shown on page 72.
The following diagram shows some of the placeholder data entity types, such as Hosted Data Infrastructure,
Externally Co-Located Data Infrastructure and Externally Co-Located Outsourced Data Infrastructure,
being replaced by explicit references to their constituent data entity types.
Figure 7 – Data Zone and Data Entity Type Example with Cloud and Outsourced Service Providers
These data entity types are generic and independent of any specific technology or set of facilities they
provide other than that which is implied by their type.
For example, a finance and accounting solution or an HR solution will have a type of Data Processing
Business Applications. Such a solution can be located in a data zone such as Central Data Infrastructure
or Secure External Organisation Access.
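The distinction between a generic data entity type and an actual data entity can be sketched in Python. This is a minimal illustration only: the class names DataZone, DataEntityType and DataEntity and the entity names "Finance System" and "HR System" are assumptions made for the example, not part of the model described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataZone:
    name: str


@dataclass(frozen=True)
class DataEntityType:
    name: str


@dataclass(frozen=True)
class DataEntity:
    """An actual data entity: one instance of a type, placed in one zone."""
    name: str
    entity_type: DataEntityType
    zone: DataZone


# Type and zone names come from the text; the entity names are hypothetical.
business_app = DataEntityType("Data Processing Business Applications")
central = DataZone("Central Data Infrastructure")
external_access = DataZone("Secure External Organisation Access")

finance = DataEntity("Finance System", business_app, central)
hr = DataEntity("HR System", business_app, external_access)

# Two different entities share one generic type but sit in different zones.
same_type = finance.entity_type == hr.entity_type
same_zone = finance.zone == hr.zone
```

Keeping the type separate from the entity means an entity can change zone without any change to the list of types.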
Data Entity Relationships
Entities can have (many) relationships with other entities. These can be one-way – from a source to a
target entity – or two-way – between two entities. Relationships can be expressed in active or passive
terms: A acts on B or B is acted upon by A. Relationships are not necessarily definitive. They can be used
to denote informal associations.
There can be many different types of entity relationships. These relationships can be characterised in
different ways.
Relationship Type – Description
Uses – Entity A uses the facilities provided by Entity B
Updates – Entity A updates data stored or managed by Entity B
Creates – Entity A creates data that is stored or managed by Entity B
Processes – Entity A processes data that is supplied by Entity B
Transfers From – Entity A transfers data from Entity B to Entity C
Transfers To – Entity A transfers data to Entity B from Entity C
Manages – Entity A manages Entity B
Administers – Entity A administers Entity B
Transforms – Entity A performs transformation actions on data from Entity B
Stores – Entity A stores data in Entity B
Reads From – Entity A reads data from Entity B
Writes To – Entity A writes data to Entity B
Publishes To – Entity A publishes data to Entity B
Replicates Data To – Entity A replicates data to Entity B
Aggregates Data From – Entity A aggregates data from Entity B
Collects Information On – Entity A collects audit, usage and performance data on Entity B
Backs Up – Entity A backs up data on Entity B
Recovers – Entity A recovers data for Entity B
Loads – Entity A loads data from Entity B
Entity relationships are intended to represent connections between entities. Changes in those entities –
movement to a different zone as a result of movement to a cloud service provider or an outsourcing
arrangement, new entities added, entities aggregated or split, new functionality added – impact the
relationships. Understanding the entity relationships means the impact of entity changes can be
understood.
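As a rough sketch of this kind of impact analysis, the relationships can be held as simple source/type/target records and queried for everything that touches a changed entity. The entity names and the function impacted_by are hypothetical; only the relationship type names are taken from the list of relationship types.

```python
from typing import List, NamedTuple


class Relationship(NamedTuple):
    source: str
    rel_type: str
    target: str


# Hypothetical entities; relationship types are those listed in the text.
RELATIONSHIPS = [
    Relationship("Finance Application", "Stores", "Finance Data Store"),
    Relationship("ETL Tool", "Reads From", "Finance Data Store"),
    Relationship("ETL Tool", "Writes To", "Data Warehouse"),
    Relationship("Reporting Tool", "Reads From", "Data Warehouse"),
]


def impacted_by(entity: str, relationships: List[Relationship]) -> List[Relationship]:
    """Return every relationship that has the changed entity at either end."""
    return [r for r in relationships if entity in (r.source, r.target)]


# Example: the finance data store moves to a cloud service provider data zone.
changed = "Finance Data Store"
impacted = impacted_by(changed, RELATIONSHIPS)

# The entities at the other end of each impacted relationship.
neighbours = sorted(
    {r.source if r.target == changed else r.target for r in impacted}
)
```

Here the move affects the two relationships that end at the data store, so the finance application and the ETL tool are the entities to review.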
The following diagram shows some of the possible relationships between entity types.
Entity relationships can be defined at varying levels of detail and complexity. The amount of definition
needs to be directly related to the benefit that will be derived. Entity relationships allow the likely
consequences of data landscape changes to be identified.
This diagram violates the design principles listed on page 9 because its level of detail confuses
rather than adds insight. Such a diagram obscures rather than enlightens. However, once the data entity
relationship information has been entered into the data landscape, it can be filtered to show relationship
types or just a subset of relationships.
The absence of defined relationships between entities can be used to identify potential problems such as
underused or redundant entities and the absence of information that needs to be collected.
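Both uses of the relationship information, filtering by relationship type and spotting entities with no defined relationships, can be sketched in a few lines of Python. The entity names (including "Legacy Archive") and the two helper functions are assumptions for illustration.

```python
# Hypothetical entities and (source, relationship type, target) records.
ENTITIES = {
    "Finance Application",
    "Finance Data Store",
    "ETL Tool",
    "Data Warehouse",
    "Legacy Archive",
}

RELATIONSHIPS = [
    ("Finance Application", "Stores", "Finance Data Store"),
    ("ETL Tool", "Reads From", "Finance Data Store"),
    ("ETL Tool", "Writes To", "Data Warehouse"),
]


def filter_by_type(relationships, rel_type):
    """Keep only the relationships of one type, for a less cluttered view."""
    return [r for r in relationships if r[1] == rel_type]


def unrelated_entities(entities, relationships):
    """Entities that appear in no relationship at all."""
    related = {r[0] for r in relationships} | {r[2] for r in relationships}
    return sorted(entities - related)


reads = filter_by_type(RELATIONSHIPS, "Reads From")
orphans = unrelated_entities(ENTITIES, RELATIONSHIPS)
```

An entity returned by unrelated_entities is a candidate for investigation only: it may be redundant, or its relationships may simply not have been recorded yet.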
Data Interfaces and Data Flows
A data interface is a method by which a data entity can accept or provide data. Interfaces can be
PUSH, where the source data provider entity pushes the data to the target, or PULL, where the target
data entity pulls the data from the source.
Data interfaces can have properties such as:
Parameters supported that affect the nature of the data being sent or received
The data transmission or receipt protocols supported or used
The type of security, authentication and encryption used
The data formats supported or required
Restrictions on data volumes
A data flow is a path from a data source and its associated data interface to a data target and its
associated interface. So a data flow involves two (or more) interfaces.
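A data interface and a data flow between two interfaces could be recorded as follows. This is a sketch under stated assumptions: the field names, the example protocol and format values, and the idea of checking for a shared data format are illustrative choices, not requirements of the approach.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Optional


class Mode(Enum):
    PUSH = "PUSH"
    PULL = "PULL"


@dataclass(frozen=True)
class DataInterface:
    entity: str                          # data entity that owns the interface
    mode: Mode                           # PUSH or PULL
    protocol: str                        # transmission or receipt protocol
    security: str                        # security, authentication, encryption
    data_formats: FrozenSet[str]         # formats supported or required
    max_volume_mb: Optional[int] = None  # None means no stated restriction


@dataclass(frozen=True)
class DataFlow:
    """A path from a source interface to a target interface."""
    source: DataInterface
    target: DataInterface

    def common_formats(self) -> FrozenSet[str]:
        # Assumption: a usable flow needs a format both interfaces support.
        return self.source.data_formats & self.target.data_formats


# Hypothetical property values for two entity types named in the text.
edge_out = DataInterface(
    "Edge Devices", Mode.PUSH, "SFTP", "key-based", frozenset({"CSV", "JSON"})
)
receipt_in = DataInterface(
    "External Data Receipt And Access Control",
    Mode.PUSH, "SFTP", "key-based", frozenset({"JSON"}), max_volume_mb=500,
)

flow = DataFlow(edge_out, receipt_in)
shared = flow.common_formats()
```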
Data flows can be direct – from the source data entity interface to the target data entity interface – or
indirect – by way of an interim data entity (such as an (S)FTP server, service bus or data storage
location acting as a mailbox). The data flows involved in an ETL tool moving data from one data entity
to another could be viewed as two data flows or one.
Data flows can also involve a transformation, where the source data is modified before it reaches the
target.
At a very high level, based on the combinations of these options, there can be ten major types of data
flow:
Data Flow – Description
Direct Source Push – The source data entity pushes the data to the target entity.
Direct Target Pull – The target data entity pulls data from the source entity.
Indirect Source Pull Target Pull – The source data entity handles a pull request from an interim data entity that then provides the data in response to a pull request from the target.
Indirect Source Push Target Pull – The source data interface pushes the data to an interim location where it remains until the target data interface retrieves it.
Indirect Source Pull Target Push – Data is pulled from the source data interface by the interim data entity. The data is then pushed to the target.
Indirect Source Push Target Push – The source data interface pushes the data to an interim location where it is then pushed to the target data interface.
Transformation Source Pull Target Pull – The source data entity handles a pull request from a transformation data entity that then provides the transformed data in response to a pull request from the target.
Transformation Source Push Target Pull – The source data interface pushes the data to a transformation data entity, where the transformed data remains until the target data interface retrieves it.
Transformation Source Pull Target Push – Data is pulled from the source data interface by the transformation data entity. The transformed data is then pushed to the target.
Transformation Source Push Target Push – The source data interface pushes the data to a transformation data entity where the transformed data is then pushed to the target data interface.
These types of data flows are concerned with the transfer of data. They exclude details on the
handshaking required to initiate the data flow such as authentication and generation and use of
temporary session keys.
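The count of ten follows from the options described: two direct flows, plus an interim or transformation entity combined with a pull or push on the source side and a pull or push on the target side (2 + 2 × 2 × 2 = 10). A short Python sketch of that arithmetic, with illustrative variable names:

```python
from itertools import product

# The two direct flows involve no interim entity.
direct = ["Direct Source Push", "Direct Target Pull"]

# Flows through an interim or transformation entity combine a source-side
# option with a target-side option.
mediated = [
    f"{kind} Source {source_side} Target {target_side}"
    for kind, source_side, target_side in product(
        ["Indirect", "Transformation"], ["Pull", "Push"], ["Pull", "Push"]
    )
]

flow_types = direct + mediated
```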
Data flows can have other properties such as:
Data transfer type such as file transfer, message, API or other
Data format
Scheduled or unscheduled
Triggers or events
Frequency if scheduled
The type of transformation(s) performed, if any
Protocol used
The following diagram represents a Direct Source Push data flow.
Figure 9 – Direct Source Push Data Flow
The following diagram represents a Direct Target Pull data flow.
Figure 10 – Direct Target Pull Data Flow
The following diagram represents an Indirect Source Pull Target Pull data flow.
Figure 11 – Indirect Source Pull Target Pull Data Flow
The following diagram represents an Indirect Source Push Target Pull data flow.
Figure 12 – Indirect Source Push Target Pull Data Flow
The following diagram represents an Indirect Source Pull Target Push data flow.
Figure 13 – Indirect Source Pull Target Push Data Flow
The following diagram represents an Indirect Source Push Target Push data flow.
Figure 14 – Indirect Source Push Target Push Data Flow
The following diagram represents a Transformation Source Pull Target Pull data flow.
Figure 15 – Transformation Source Pull Target Pull Data Flow
The following diagram represents a Transformation Source Push Target Pull data flow.
Figure 16 – Transformation Source Push Target Pull Data Flow
The following diagram represents a Transformation Source Pull Target Push data flow.
Figure 17 – Transformation Source Pull Target Push Data Flow
The following diagram represents a Transformation Source Push Target Push data flow.
Figure 18 – Transformation Source Push Target Push Data Flow
These types of data flows have a single start and single end point in a single data entity. Data flows can
be more complex with, for example, data being sent to multiple targets.
Figure 19 – Transformation With Multiple Target Pushes
Transformations can consist of multiple data processing steps. For the purposes of documenting the data
landscape, this additional information increases complexity while not necessarily adding value in terms
of understanding the existing landscape and planning for data transformation changes.
The following diagram shows a number of possible data flows across a number of interfaces for a subset
of the data entity types shown on page 15.
Figure 20 – Sample Data Interfaces and Data Flows for Data Entity Types
The next diagram shows an example of a single extended data flow extracted from the previous diagram.
The example relates to the flow of data from its generation by external devices through to its analysis,
across a number of interfaces and spanning a number of data zones.
In this example, there are 12 data entity types and their interfaces involved in the extended data flow:
1. Public External Data Devices – these collect or provide measurement or telemetry data. The
collected data is pushed to a data concentrator.
2. Edge Device – this acts as a data concentrator, receiving data from multiple external data sources
such as meters or telemetry units. The data is then pushed to a data access data entity type.
3. External Data Receipt And Access Control – this is a generic data entity type that represents the
entry portal for incoming data. The edge device pushes aggregated edge device data to this.
4. Integration/Service Bus – this represents a data entity type that implements or provides service
oriented data integration facilities.
5. Data Processing Business Applications – this denotes data entity types that represent the
business applications that receive the data from the Integration/Service Bus data entity and process
it.
6. ETL – the ETL data entity type may be involved in the extended data flow in a number of ways:
It can receive data from the Integration/Service Bus and pass it to the Data Processing Application
Operational Data Stores data entity type that represents the data stores of the Data Processing
Business Applications data entity type.
It can extract data from the Data Processing Application Operational Data Stores and move it
to the Data Warehouse and Data Marts data entity types, after transformation to convert
operational data into the subject-oriented data format with a time dimension that these entity
types typically require.
7. Data Processing Application Operational Data Stores – these data entities are the functional
(rather than infrastructural) data storage component of the corresponding Data Processing Business
Applications entity types. The ETL data entity type pulls data from the stored data, transforms it
and loads it into the Data Warehouse and Data Mart entity types.
8. Data Warehouse – this represents the data entity type that holds long-term data from operational
systems.
9. Data Marts – this signifies data entity types that contain specific subsets of transformed
operational data used for specific reporting and analysis purposes.
10. Data Analytics – this denotes a data analytics data entity.
11. External Data Analytics Co-ordination and Management – this represents a data entity that
manages the allocation of data analytics requests to external (cloud-based) data analytics services
and the retrieval of results.
12. External Data Analytics Services – these data entities provide external (cloud-based) data
analytics facilities.
Figure 21 – Example of Single Extended Data Flow across a Number of Data Entities
This single extended data flow can (and really should) be broken down into a number of specific data
flows using data entity interfaces that exist and are used for each distinct purpose.
The following diagram shows this sample extended data flow divided into three separate data flows:
Figure 22 – Splitting Sample Extended Data Flow into Three Separate Data Flows
The three data flows represent separate elements of work:
Data Flow 1 – the collection of data from external data sensors
Data Flow 2 – the population of data stores of different types with sensor data
Data Flow 3 – the analysis of sensor data
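The split can be sketched in Python by holding the extended data flow as an ordered list of the twelve data entity types and slicing it into the three flows. The slice boundaries below are an assumption made for illustration; the text does not state exactly which entity types belong to which of the three data flows.

```python
# The twelve entity types of the extended data flow, in the order listed.
EXTENDED_FLOW = [
    "Public External Data Devices",
    "Edge Device",
    "External Data Receipt And Access Control",
    "Integration/Service Bus",
    "Data Processing Business Applications",
    "ETL",
    "Data Processing Application Operational Data Stores",
    "Data Warehouse",
    "Data Marts",
    "Data Analytics",
    "External Data Analytics Co-ordination and Management",
    "External Data Analytics Services",
]

# Assumed boundaries (start index, end index) for each data flow.
BOUNDARIES = {
    "Data Flow 1": (0, 3),   # collection of data from external data sensors
    "Data Flow 2": (3, 9),   # population of data stores with sensor data
    "Data Flow 3": (9, 12),  # analysis of sensor data
}

data_flows = {
    name: EXTENDED_FLOW[start:end] for name, (start, end) in BOUNDARIES.items()
}

# Re-joining the parts in order should give back the extended flow unchanged.
rejoined = [step for name in sorted(data_flows) for step in data_flows[name]]
```

Checking that the parts re-join into the original list is a simple way to confirm that the decomposition has neither dropped nor duplicated a step.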
The level of detail and the amount of process decomposition applied to a data flow depends on factors
such as:
The amount of detail to be represented and the value to be derived from that detail
The need to include details on the data handoffs between each interface and to describe any data
transformation that occurs
The value and utility that can be obtained from the level of detail
The following diagram shows a simplified representation of these previous sample data flows. In this case,
just the main data entities and their interfaces involved in the data flows are shown.
Figure 23 – Simplified Representation of Data Flows
This version contains a reduced amount of detail when compared with the previous more detailed
illustration.
Levels of Descriptive Detail
The data landscape model could be used to hold information at different levels of detail:
Level 1 – Data zone and entities, their types, relationships and interfaces. This is the
foundational definition of the data landscape. It identifies the major data entities in each data zone.
Level 2 – Additional details about the data entities, their constituent components, attributes
and characteristics. This includes platform details, technologies used including versions and
products used including versions.
Level 3 – Assessment of the capability, maturity of data management and service
management processes across the landscape. This can contain details on the data management
capabilities and processes and the related service management processes and an assessment of their
application to and how well they have been implemented and operate across data zones and data
entities.
Level 4 – Description of data contents and data processing. This level can contain additional
details on the data that is within the scope of the data entity such as datasets, files, tables or other
data constructs or data processing steps and activities.
Level 5 – Individual details of data contents. This can contain further levels of detail down to the
individual data field level.
These levels are not necessarily incremental.
Figure 24 – Relationships Between Levels of Data Landscape Model Details
The purpose of the data landscape view is not to become or replace any existing data dictionary or
semantic data function within the organisation by adding a parallel set of information. At its core it is a
data architecture planning approach.
Data Landscape Data Model
Core Data Landscape Data Model
The data model required to describe the core data landscape is quite simple. The following shows it
expressed as a simple Entity-Relationship Diagram.
Figure 25 – Data Landscape Core Data Model
The core data model is sufficient to provide a helicopter view of the data landscape.
This core data model contains the following data elements:
Data Landscape Data Model Element Description
1 Data Entity This is used to hold details on the data entities in the
data landscape
2 Data Entity Type This holds the types of the data entities
3 Data Zone This holds the data zone where the data entities are
located
4 Data Zone Type This holds the types of the data zones
5 Data Entity Relationships This holds the relationships between entities
6 Data Relationship Type This holds the types of the data relationships
7 Data Interface This holds details on data interfaces
8 Data Interface Type This holds the types of the data interfaces
9 Data Entity Data Interface This links data interfaces to data entities
10 Data Flow This holds details on data flows
11 Data Flow Type This holds the types of data flow
12 Data Flow Steps This holds details on steps within data flows
13 Data Flow Step Type This holds the types of the data flow step
The core model is sufficient to describe the primary components of the organisation data landscape and
to perform the analysis and planning described above.
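The core elements listed above could be sketched in code along the following lines. This is a minimal illustration only, not a definitive schema: the class and field names are assumptions derived from the element names in the table, the example zone, interface and flow names are hypothetical, and the link between interfaces and entities is simplified to one entity per interface.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataZone:
    name: str
    zone_type: str  # Data Zone Type


@dataclass(frozen=True)
class DataEntity:
    name: str
    entity_type: str  # Data Entity Type
    zone: DataZone    # Data Zone where the entity is located


@dataclass(frozen=True)
class DataEntityRelationship:
    source: DataEntity
    target: DataEntity
    relationship_type: str  # Data Relationship Type


@dataclass(frozen=True)
class DataInterface:
    name: str
    interface_type: str  # Data Interface Type
    entity: DataEntity   # simplified Data Entity Data Interface link


@dataclass(frozen=True)
class DataFlowStep:
    step_type: str  # Data Flow Step Type
    source: DataInterface
    target: DataInterface


@dataclass
class DataFlow:
    name: str
    flow_type: str  # Data Flow Type
    steps: list[DataFlowStep] = field(default_factory=list)

    def entities(self) -> list[str]:
        """Return the names of the entities the flow touches, in order."""
        seen: list[str] = []
        for step in self.steps:
            for interface in (step.source, step.target):
                if interface.entity.name not in seen:
                    seen.append(interface.entity.name)
        return seen


# Hypothetical example values, used only to exercise the structure.
zone = DataZone("Core Operational Zone", "Internal")
finance_db = DataEntity("Financial System Database", "Operational Data Store", zone)
warehouse = DataEntity("Data Warehouse", "Analytic Data Store", zone)
export = DataInterface("Nightly Export", "Batch File", finance_db)
load = DataInterface("Warehouse Load", "Batch File", warehouse)
flow = DataFlow("Finance to Warehouse", "Batch", [DataFlowStep("Transfer", export, load)])
print(flow.entities())
```

Keeping the type elements as simple strings reflects the intent that the core model stays small; in a database implementation each type would more likely be its own lookup table.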
Extended Data Landscape Data Model
The core data landscape model can be extended to allow for the inclusion of other information such as:
Components of data entities that could be used to provide more granular information on their
constituents – this is described on page 38.
Attributes of data entities and data zones to describe their characteristics – this is described on page
39.
Data contents that describe details on the data associated with a data entity – this is described on
page 45.
Application group that links several data entities into a wider application or service – this is
described on page 49.
Events and activities relating to data entities.
Data management and operations processes as they apply to data entities and their status – this is
described in more detail on page 53.
Subject area model data concepts (part of the Enterprise Data Model) and which data entities are
involved in their processing – this is described in more detail on page 67.
These are just examples of the types of extensions that can be performed. Such extensions must add
value, utility and insight to justify their use and the amount of work required to populate the data
structures.
These extensions are not sequential. They can be applied in any order to the core data model.
At a high-level, the core and extended data landscape model can be represented as follows:
Figure 26 – Core and Extended Data Landscape Models
Data Landscape Data Model Element Description
1 Data Entities These are instances of data entity types that perform data
functions within the data landscape
2 Data Zones These are logical groups of data entities
3 Entities Relationships Entities can be related to each other
4 Data Interfaces Interfaces are data entry or exit points within data
entities
5 Data Flows Flows represent movement of data between entity
interfaces
6 Data Attributes Attributes are characteristics of zones, entities, interfaces
and flows
7 Data Components Entities can be divided into their constituent components
8 Data Capability and Management
Processes
The data landscape and its constituent entities are
implemented and operated using data processes
9 Data Entity Events Events can be recorded against entities
10 Data Entity Data Contents The contents of data entities can be recorded
11 Application or Service Group Entities can be grouped into applications
12 Data Subjects Data entities can be associated with the processing of
data subjects defined in the Subject Area Model
13 Data Security Data entities can have security requirements or
implications
The next four sections show possible extensions to the core data model to describe:
Data entity components and data entity functions – see below
Data entity attributes – see page 39
Data entity contents – see page 45
Application or service groups – see page 49
Data Entity Components and Functions
Data components are intended to hold an additional level of detail on the contents and composition of
data entities and the functions those components perform.
Figure 27 – Data Component Level of Detail Extension to Core Data Model
These extended data model elements are:
Data Landscape Data Model Element Description
14 Data Component This holds details on data components
15 Data Component Type This holds the types of the data components
16 Data Component Data Entity This contains the data components that a data entity
contains
17 Data Function This contains details on data functions
18 Data Function Type This holds the types of the data functions
19 Data Component Data Function This contains the data functions that a data component
performs
Not all these data elements are required for this data model extension.
Data Entity Attributes
Data entity attributes can be used to store extended details on data zones, entities, interfaces and flows.
For example, an attribute called Database Platform could be defined.
This can be assigned a list of possible values such as:
IBM DB2
Informix
Microsoft Azure SQL Database
Microsoft SQL Server
MySQL
Oracle
PostgreSQL
The Database Platform attribute could then be associated with Data Entity Type of Data
Processing Application Operational Data Stores.
The Data Entity of Financial System Database can be assigned a Data Entity Type of Data
Processing Application Operational Data Stores.
The Database Platform attribute of the Data Entity of Financial System Database can then be
assigned a value from the list of possible values.
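The Database Platform example above can be sketched as follows. This is an illustrative sketch only: the dictionary names and the assign_attribute function are assumptions introduced here, while the attribute name, list of values, data entity type and data entity come from the example in the text.

```python
# Allowed values for each attribute (the list given in the text).
ATTRIBUTE_VALUES = {
    "Database Platform": [
        "IBM DB2",
        "Informix",
        "Microsoft Azure SQL Database",
        "Microsoft SQL Server",
        "MySQL",
        "Oracle",
        "PostgreSQL",
    ],
}

# Attributes that are associated with each data entity type.
ENTITY_TYPE_ATTRIBUTES = {
    "Data Processing Application Operational Data Stores": ["Database Platform"],
}

# Data entity type assigned to each data entity.
ENTITY_TYPES = {
    "Financial System Database": "Data Processing Application Operational Data Stores",
}

# Attribute values held per (data entity, attribute) pair.
entity_attribute_values = {}


def assign_attribute(entity: str, attribute: str, value: str) -> None:
    """Assign a value to an entity's attribute, checking it against the model."""
    entity_type = ENTITY_TYPES[entity]
    if attribute not in ENTITY_TYPE_ATTRIBUTES.get(entity_type, []):
        raise ValueError(f"{attribute!r} is not defined for entity type {entity_type!r}")
    if value not in ATTRIBUTE_VALUES[attribute]:
        raise ValueError(f"{value!r} is not an allowed value for {attribute!r}")
    entity_attribute_values[(entity, attribute)] = value


# The choice of PostgreSQL here is arbitrary and purely illustrative.
assign_attribute("Financial System Database", "Database Platform", "PostgreSQL")
```

Validating against the list of possible values keeps the attribute useful for filtering and reporting, which is the stated purpose, rather than turning it into free-text configuration detail.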
The objective of allowing data entity attributes is not to store and manage detailed configuration
information. The DMTF (Distributed Management Task Force) maintains and publishes a Common
Diagnostic Model https://www.dmtf.org/standards/cdm that contains details on a possible set of IT
infrastructure-specific attributes.
Developers of CMDB (Configuration Management Database) software also publish data models that
contain examples of detailed data entity attributes. Some examples of CMDB data models are:
BMC Atrium Common Data Model – https://docs.bmc.com/docs/ac1902/common-data-model-842265110.html
IBM Tivoli Common Data Model – http://www.redbooks.ibm.com/redpapers/pdfs/redp4389.pdf
Microsoft Operations Manager Data Model – https://blogs.technet.microsoft.com/drewfs/2014/08/17/general-purpose-data-model-for-scom-data-warehouse/
ServiceNow CMDB Schema Model – https://docs.servicenow.com/bundle/newyork-servicenow-platform/page/product/configuration-management/concept/c_ConfigurationManagementDatabase.html
These details are included for information only. The data landscape data model does not need to include
this level of detail.
These extended data model elements are:
Data Landscape Data Model Element Description
20 Data Attribute This holds details on data attributes that can be assigned
to data entities and that will be assigned data entity-
specific values
21 Data Attribute Type This holds the types of the data attribute in terms of the
type of values the attribute can hold
22 Data Attribute Values This contains the values associated with the data
attribute
23 Data Attribute Data Entity Type This holds the data attributes that are linked to
specific data entity types and to which values can be
assigned
24 Data Attribute Data Entity Value This holds the value of the data attribute for each data entity
25 Data Attribute Data Zone Type This holds the data attributes that are linked to
specific data zone types and to which values can be
assigned
26 Data Attribute Data Zone Value This holds the value of the data attribute for each data zone
27 Data Attribute Data Interface Type This holds the data attributes that are linked to
specific data interface types and to which values can be
assigned
28 Data Attribute Data Interface Value This holds the value of the data attribute for each data
interface
29 Data Attribute Data Flow Type This holds the data attributes that are linked to
specific data flow types and to which values can be
assigned
30 Data Attribute Data Flow Value This holds the value of the data attribute for each data flow
Not all these data elements are required for this data model extension.
Data entity attributes can be used to hold status and planning information about data entities. These
could include:
Future plans for the data entity
Status of the underlying technology
Process status and health
Support status
Issues with the data entity
End-of-life date
For example, the future plans for data entities could include some or all of the following values that
indicate the corresponding actions:
1. Reassemble – combine functionality of solution with other solutions to create new combined
solution
2. Redevelop – redevelop the custom application and retain its functionality
3. Reduce – stop using functional elements of the current application while retaining it
4. Refactor – change the internal application structure, design and implementation without changing
the external appearance and functionality
5. Rehost – move application to new platform without change
6. Relocate – move the data contained in the data entity to another platform
7. Repair – resolve problems and issues with the current application while retaining it on the same
platform
8. Replace – replace the application with a functionally similar one
9. Replatform – move application to new platform with some limited changes to enable the
application to run on the new target platform
10. Research – this represents data technologies that are emerging and are being researched and piloted
for possible production application
11. Reserve – retain the application but encapsulate access to its functionality via some form of
interface
12. Retain – retain the application entirely in its current form
13. Retire – stop using an obsolete application without explicitly replacing it
The following diagram shows the rough general location of these options arranged along two axes of
future location of the data entity – from existing location to an external cloud or hosted one – and the
level of change involved to the data entity – from none to significant.
Figure 29 – Data Entity Future Options
When this additional status information is available, data entities could then be filtered based on factors
such as their future plans.
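Filtering data entities by future plan, as described above, could look like the following sketch. The FuturePlan enumeration mirrors the thirteen values listed in the text; the entity names other than Financial System Database, the plans assigned to them and the entities_with_plan helper are hypothetical and shown only for illustration.

```python
from enum import Enum


class FuturePlan(Enum):
    """The thirteen future plan values listed in the text."""
    REASSEMBLE = 1
    REDEVELOP = 2
    REDUCE = 3
    REFACTOR = 4
    REHOST = 5
    RELOCATE = 6
    REPAIR = 7
    REPLACE = 8
    REPLATFORM = 9
    RESEARCH = 10
    RESERVE = 11
    RETAIN = 12
    RETIRE = 13


# Hypothetical data entities and the future plan attribute value assigned to each.
entity_plans = {
    "Financial System Database": FuturePlan.REPLATFORM,
    "Legacy Reporting Store": FuturePlan.RETIRE,
    "Customer Data Warehouse": FuturePlan.RETAIN,
    "Archive File Share": FuturePlan.RETIRE,
}


def entities_with_plan(plans, plan):
    """Return the sorted names of data entities that have the given future plan."""
    return sorted(name for name, assigned in plans.items() if assigned is plan)


print(entities_with_plan(entity_plans, FuturePlan.RETIRE))
# prints ['Archive File Share', 'Legacy Reporting Store']
```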
There may be a temptation to create lots of data entity type-specific attributes that can be used to
record information about data entities. However, unless these attributes add value to the data landscape
model, they should not be added.
One area that could add value is using data attributes to track the cost or financial impact of data
entities. This information can then be used to assess the financial impact of various data transformation
options. The following diagram shows a possible view of the financial impact of data entities superimposed on
the sample data landscape on page 72.
Data Security and Data Transformation
Data will have security characteristics and requirements in terms of its sensitivity and confidentiality
and the impact on the organisation of its loss, from regulatory to financial and reputational.
The data attribute extension to the data model can be used to hold security profile information
regarding data entities or data subjects processed by those data entities (see details on the subject area
model on page 67).
In planning data transformations such as the examples listed on page 72, the security
implications can be identified and assessed if the security attributes have been defined.
The following diagram illustrates this.
Data Entity Contents
Data entity contents are intended to hold an additional level of detail on the data contents of data
entities. This is separate from data entity components that are intended to represent functional elements
of a data entity.
These extended data model elements are:
Data Landscape Data Model Element Description
31 Data Content Type This holds details on the types of data content
32 Data Content Type Data Entity Type This contains details on data content types that can be
assigned to data entity types
33 Data Entity Data Content Value This holds the data content values assigned to data
content types for specified data entities
Not all these data elements are required for this data model extension.
Application Group or Service
A set of data entities can belong to an application or service. The purpose of this extension to the core
data landscape model is to allow data entities to be assigned to applications.
Figure 33 – Application/Service Group Extensions to Core Data Model
For example, an application that allows external users to interact with it may consist of the following data
entities:
Figure 34 – Application View of Data Entities
Common data entities such as those providing infrastructural-related data services can be shared
between applications or services.
Figure 35 – Data Entities Shared Between Applications
Being able to group data entities to reflect their involvement and the role they perform in an application
or service means that the impact on that application or service can be determined if any of its
component data entities change.
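The impact analysis described above can be sketched as a lookup over the links between applications and data entities. The application names, entity names and roles below are hypothetical, and applications_affected_by is an illustrative helper rather than part of the data landscape model itself.

```python
from collections import defaultdict

# Hypothetical (application or service, data entity, role) assignments.
# A data entity can appear under more than one application.
APPLICATION_ENTITIES = [
    ("Customer Portal", "Web Front End", "Presentation"),
    ("Customer Portal", "Customer Database", "Storage"),
    ("Customer Portal", "Authentication Service", "Security"),
    ("Finance System", "Financial System Database", "Storage"),
    ("Finance System", "Authentication Service", "Security"),
]


def applications_affected_by(entity: str) -> dict:
    """Map each application that uses the entity to the roles the entity plays in it."""
    affected = defaultdict(list)
    for application, data_entity, role in APPLICATION_ENTITIES:
        if data_entity == entity:
            affected[application].append(role)
    return dict(affected)


# A change to a shared infrastructural entity affects both applications.
print(applications_affected_by("Authentication Service"))
```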
These extended data model elements are:
Data Landscape Data Model Element Description
34 Application/Service This holds details on applications or services that
contain data entities
35 Application/Service Type This holds details on an optional set of types of
application or service
36 Application/Service Data Entity This links applications or services to data entities. A
data entity can belong to more than one application
or service
37 Application/Service Role This holds details on the roles that can be assigned to
data entities within applications or services
38 Application/Service Role Type This holds details on an optional set of application or
service role types
39 Application/Service Data Entity Roles This assigns roles to application or service data
entities. A data entity can have more than one role
for an application or service
40 Application/Service Environment Type This holds details on an optional set of environment
types that can be assigned to applications or services
41 Application/Service Data Entity Environment Types This assigns environment types to application or
service data entities
The environment type data element can be used to identify separate environments for an application.
Environment values can be defined such as:
Production
Pre-Production
Operations Acceptance Test
User Acceptance Test
System/Integration Test
Development/Unit Test
Figure 36 – Application Environments
Not all these data elements are required for this data model extension.
Data Processes and Capabilities
Data Process Framework
The data landscape is neither passive nor static. It must be designed, implemented, managed,
administered and operated through the development and application of a range of processes. The extent
of their implementation and application should be part of any description and assessment of the state of
the data landscape. The state of these processes and the state of their application to specific entities is one
measure of the state of the data landscape.
If a means is required to assess the health of the data landscape with respect to its operational state then
a structured operational process framework is required against which that assessment can be performed.
This section contains notes on defining such an operational process definition and thus assessment
framework.
The objective here is not to define a complete, exact and rigorous process definition assessment
framework. The modelling principles listed on page 9 should be applied here. Complexity is the enemy of
quick and useful results.
These data-related processes can be grouped and viewed in a number of ways:
Data Service Management Processes – these are data landscape-specific elements of what should
be more general information technology service management processes. They are sets of activities
performed to create a result. These are concerned with looking after the pure operational aspects of
the data landscape (as part of a wider information technology landscape). This is just one view of the
key service management processes that apply to the data landscape.
Data Capability Process Areas – these are data landscape-specific capabilities and the associated
processes that actualise their use. These represent skills that will be of varying degrees of importance
to each organisation.
Data Life Stages – this is a view of the stages through which data moves as it is being processed by data
entities. Not all of these stages apply to all entities. The entire set of stages may span a number of
data entities. The stage view applies to an individual data instance, a set of data processed by a
specific solution that may use the facilities of multiple data entities.
Each of these views describes a different aspect of the processes associated with the data landscape. The
service management process view describes how well these general service management processes have
been implemented and are operated for entities within the data landscape. The data capability process areas are sets
of skills and abilities that must exist and be applied to the design and implementation of data entities.
The data life stage view takes a cross-functional perspective on data processing and movement through
its life stages.
The service management processes and their applicability to the data landscape are:
Incident Management – handle and manage unplanned interruptions to or reduction in the quality
of a data service and to restore normal operation as quickly as possible, minimising the negative
impact on business operations.
Problem Management – analyse and determine the root causes of incidents to stop incidents from
happening, to eliminate recurring incidents and to minimise the impact of incidents that cannot be
stopped.
Event and Alert Management – detect events and alerts that represent significant occurrences to
entities, identify them and determine the appropriate actions to take and to collect data for analysis.
Performance and Capacity Management – analyse resource consumption, determine patterns
and trends and ensure there are sufficient resources to support the current and projected future
operation of data entities.
Service Level Management – agree and define service targets and then ensure these targets are
met for data entities.
Asset Management – track data entities through their lifecycle to identify ownership, cost of
operation and use and manage upgrade and replacement cycles.
Resilience, Availability and Continuity Management – ensure that data entities are available,
can resist failure and recover quickly from failure and ensure continuity of operations in the event of
the loss of data entities.
Risk Management – identify, assess and control occurrences that could cause loss of or damage to
data entities.
Change Management – manage modifications to, additions to or removal of data entities in a
controlled manner to avoid disruption to services.
People Management – manage people resources required to administer, manage and operate data
entities from a service operations view.
Supplier Management – manage suppliers of services across the duration of the product or service
supply contract.
Knowledge Management – manage information and knowledge systems so that personnel have
access to the knowledge needed to effectively perform their work and identify the knowledge needed
for service delivery.
Ideally, there will already be a service management framework in operation within the organisation that
will have implemented these service processes more generally. These can then be applied specifically to
the data landscape.
The data capabilities and their associated processes are:
Data Governance – planning, supervision and control over data management and use, developing
data management and use standards and ensuring compliance with data management processes and
standards.
Data Architecture Management – defining data technology standards, defining the approach to
managing data assets, use and reuse of and compliance with existing data technology standards, use
and reuse of data infrastructural technology solutions.
Data Model Management – creating standard data models of the data that will be collected,
created, stored and processed that formally describe the data contents and structures, including
metadata and semantic data; integrating, controlling and providing metadata (descriptive data
about the underlying operational data); creating data description standards; and the collection,
categorisation, maintenance, integration, application, use and management of data descriptions.
Data Security Management – ensuring data privacy, confidentiality and appropriate access to
data, managing and implementing data classification and preventing data loss.
Data Solution Design and Implementation Management – ensuring that all the data aspects of
the design of information technology solutions are performed to a suitable standard and incorporated
into subsequent solution implementation.
Data Operations Management – providing data storage, data operations and service management
support from data acquisition to purging. The service management processes listed above could be
subsumed into this capability.
Data Master, Reference and Quality Management – managing master versions and replicas,
management of master versions of shared data resources to reduce redundancy and maintain data
quality through standardised data definitions and use of common data values, defining, monitoring
and improving data quality.
Data Audit, Control and Lifecycle Management – managing the definition, collection and
analysis of data audit information, using audit information to develop data controls.
Data Movement, Integration and Transformation Management – data resource integration,
data interfaces and flows, extraction, transformation, movement, delivery, replication, transfer,
sharing, federation, virtualisation and operational support, business solution data interfaces and
integrations.
Data Location, Synchronisation and Access Management – managing data across storage
locations and platforms, both internal and external, synchronising data across platforms and
managing and controlling access to data across platforms.
Data Usability Management – ensuring usability across all elements of the data landscape, including
utility, accuracy, consistency and ease of interpretation.
Data Project Management – supporting and managing the data aspects of projects and solution
delivery and handover to production and support.
Data Insight and Presentation Management – creating data warehouses and data marts,
implementing reporting, data visualisation and analytics, defining data metrics and performance and
results indicators, implementing and operating processes to ensure action is taken based on data
insights.
These data capabilities are not isolated silos. They are interlinked. This does not mean that they cannot
and should not be assessed and evaluated separately.
The following diagram illustrates some of these data capability linkages.
Figure 38 – Connections Between Data Capability Areas
The interconnectedness of the data capabilities and their underpinning implementation and operating
processes illustrates the difficulty of assessing one capability independently of others. It is almost certain
that if an organisation is good at any one of these capabilities, it will be good at all of them.
There will be three implementation-related aspects to each of these process areas:
1. How well the processes are defined and implemented
2. How well the defined processes are applied, implemented and operated
3. How important and relevant the process is
These aspects apply to both the process in general and to its application for a specific data entity.
Figure 39 – Data Process Implementation, Operation and Use Aspects
A process measurement and assessment framework that includes all of these elements would be very
complex to use.
Gaps in process definition and operation in the data landscape may indicate potential problem areas that
may require or would benefit from remediation.
Using a Data Process Framework to Assess the Health of the Data Landscape
The following approach could be used to assess data capabilities across the data landscape. For each data
capability, rate the overall importance, implementation status and operation and use status. Then for
each data entity, rate each data capability in the same way. This would result in a measurement
structure along the following lines:
Figure 40 – Data Capability Process Assessment Framework
This is a very complex measurement structure as well as being time-consuming to create, maintain and use.
This approach breaches the design principles listed on page 9. While some form of data capability
process assessment would be useful in detecting potential problems and areas requiring
remediation, this approach, without simplification, would be too complex to be usable.
In terms of the extended data model, the data entity attribute approach described on page 39 could be
used to hold the process status/health information. For the purposes of identifying issues at a high-level,
this should be sufficient.
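A simplified form of this assessment could be sketched as follows. This is one possible simplification under stated assumptions, not a prescribed method: the 1 to 5 rating scale, the threshold of 3, the CapabilityRating class and the sample ratings are all hypothetical. Only the three rated aspects (importance, implementation status, operation and use status) and the capability names come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapabilityRating:
    """Ratings for one data capability as applied to one data entity."""
    importance: int      # 1 (low) to 5 (high); the scale is an assumption
    implementation: int  # how well the process is defined and implemented
    operation: int       # how well the process is applied and operated

    def has_gap(self, threshold: int = 3) -> bool:
        """Flag capabilities that matter but are weakly implemented or operated."""
        weakest = min(self.implementation, self.operation)
        return self.importance >= threshold and weakest < threshold


# Hypothetical ratings keyed by (data entity, data capability).
ratings = {
    ("Financial System Database", "Data Security Management"): CapabilityRating(5, 4, 2),
    ("Financial System Database", "Data Usability Management"): CapabilityRating(2, 1, 1),
    ("Data Warehouse", "Data Governance"): CapabilityRating(4, 4, 4),
}

# Only the combinations that may need remediation are reported.
gaps = sorted(key for key, rating in ratings.items() if rating.has_gap())
print(gaps)
```

Reporting only the gaps, rather than the full matrix of ratings, is one way to keep the result at the high level the text recommends.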
Data Maturity Models
It is not possible to discuss the topic of (data) processes and their assessment without the subjects of
their maturity and the use of maturity models being raised. The purpose of this document is not to
discuss data maturity models in detail. This section covers the topic briefly. There has been a growth in
the number of informal and ad hoc maturity models across different aspects of data processes. These
models lack the rigour, validation and detailed assessment framework needed to support their use.
All these maturity models (should) have a common structure:
There is a set of maturity levels on an ascending scale, typically from 1 to 5:
5 - Optimising process
4 - Predictable process
3 - Established process
2 - Managed process
1 - Initial process
Each maturity level has a number of process areas/categories/groupings. Maturity relates to
embedding these processes within the organisation.
Each process area has a number of processes.
Each process has generic and specific goals and practices.
Specific goals describe the unique features that must be present to satisfy the process area.
Generic goals apply to multiple process areas.
Generic practices are applicable to multiple processes and represent the activities needed to manage a
process and improve its capability to perform.
Specific practices are activities that contribute to the achievement of the specific goals of a
process area.
Figure 41 – Generic Maturity Model Structure
Maturity levels are intended to define a path for progressively improving the processes
associated with what is being measured.
Figure 42 – Improving Process Maturity
These data maturity models have different areas of focus, as shown in the diagram below.
Figure 43 – Data Maturity Models
The data maturity model groups are:
Data Capability Maturity Model – these define a set of general data capabilities that should
encompass all the required data competencies and that can be used to measure the organisation’s
overall data process maturity.
Data Governance Capability Model – these apply maturity to the subset of data capabilities
relating to data governance.
Data Stewardship Capability Model – these apply to the further subset of data governance
processes relating to data stewardship – the fitness, quality and usability of data and its metadata.
Data Analytics Maturity Model – these apply to the subset of processes that apply to data
analytics activities.
Big Data Maturity Model – these apply to the subset of processes that apply to big data and by
association data analytics activities.
The following table lists some of the maturity models currently available in these areas. Maturity models
come and go with great regularity in the data domain. There are a large number of obsolete and
unmaintained data maturity models. Many of the models are developed by vendors who use them to sell
their products and services rather than the models being independent assessments of actual and relevant
organisation data maturity.
Data Maturity Model Type – Example – Link
Data Capability Maturity Model – CMMI Institute Data Management Maturity (DMM) – https://cmmiinstitute.com/data-management-maturity
Data Capability Maturity Model – DAMA International Data Management Body of Knowledge (DMBOK) – https://dama.org/content/body-knowledge
Data Capability Maturity Model – EDM Council DCAM Data Management Capability Assessment Model – https://edmcouncil.org/page/aboutdcamreview
Data Capability Maturity Model – Federal Government Data Maturity Model – https://www.ntis.gov/assets/FDMM.pdf
Data Capability Maturity Model – MIKE2.0 (Method for an Integrated Knowledge Environment) Information Maturity (IM) QuickScan – http://mike2.openmethodology.org/wiki/Information_Maturity_QuickScan
Data Governance Capability Model – NASCIO Data Governance – https://www.nascio.org/EA/ArtMID/572/ArticleID/198/Data-Governance-Managing-Information-As-An-Enterprise-Asset-Part-I-An-Introduction
Data Governance Capability Model – ARMA The Information Governance Maturity Model – https://www.arma.org/page/IGMaturityModel
Data Stewardship Capability Model – NOAA Data Stewardship Capability Model – https://geo-ide.noaa.gov/wiki/index.php?title=Data_Stewardship_Maturity_Questionnaire_(DSMQ)_User%E2%80%99s_Guide
Data Analytics Maturity Model – Gartner Data Analytics Maturity Model – https://www.gartner.com/smarterwithgartner/take-your-analytics-maturity-to-the-next-level/
Data Analytics Maturity Model – Data Science Maturity Model – https://blogs.oracle.com/r/a-data-science-maturity-model-for-enterprise-assessment-part-1
Big Data Maturity Model – CSC – http://csc.bigdatamaturity.com/
Big Data Maturity Model – Hortonworks – http://hortonworks.com/wp-content/uploads/2016/04/Hortonworks-Big-Data-Maturity-Assessment.pdf
Big Data Maturity Model – IBM – https://www.ibmbigdatahub.com/blog/big-data-analytics-maturity-model
Big Data Maturity Model – Info-Tech – https://www.infotech.com/research/ss/leverage-big-data-by-starting-small/it-big-data-maturity-assessment-tool
Big Data Maturity Model – TDWI – https://tdwi.org/pages/maturity-model/big-data-maturity-model-assessment-tool.aspx?m=1
Such maturity models may be useful for specific assessment engagements. But in terms of the overall
data landscape a considerably simpler approach is needed. The maturity models listed above are all quite
different, have different areas of focus and are quite detailed, yet they do not cover the full scope of
the data landscape and the processes required to support and operate it.
The set of data capabilities listed on page 56 could form the basis of a maturity model. This could then
be used to create a data landscape view of data capability process health using a simple traffic light
display as shown in the following diagram.
Figure 44 – Data Entity Data Capability Process Health View
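A traffic light view of this kind could be derived from maturity levels as in the sketch below. The five level names are those listed earlier in this section; the thresholds that map levels to Red, Amber and Green, the function names and the sample scores are assumptions chosen for illustration.

```python
# Maturity levels as listed in the generic maturity model structure.
MATURITY_LEVELS = {
    1: "Initial process",
    2: "Managed process",
    3: "Established process",
    4: "Predictable process",
    5: "Optimising process",
}


def traffic_light(level: int) -> str:
    """Map a maturity level to a traffic light colour (thresholds are assumed)."""
    if level not in MATURITY_LEVELS:
        raise ValueError(f"Unknown maturity level: {level}")
    if level <= 2:
        return "Red"
    if level == 3:
        return "Amber"
    return "Green"


def health_view(scores: dict) -> dict:
    """Convert per-entity, per-capability maturity levels into traffic light colours."""
    return {
        entity: {capability: traffic_light(level) for capability, level in capabilities.items()}
        for entity, capabilities in scores.items()
    }


# Hypothetical maturity levels for one data entity.
sample_scores = {
    "Financial System Database": {
        "Data Governance": 4,
        "Data Security Management": 2,
        "Data Operations Management": 3,
    },
}
view = health_view(sample_scores)
```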
Business Functions and Business Processes
The data entities within the organisation data landscape, grouped into applications, are used to operate
business processes. The data that flows into and out of the data entities is used by these business
processes.
Simplistically, business processes and their interactions with data entities can be represented as follows:
Figure 45 – Business Processes and Interactions with Data Entities
The elements of this are:
1. The business process consists of a series of tasks performed in a sequence.
2. Business applications are used to assist with the performance of these tasks. Data is entered into
those applications, it is modified, new data is generated, data is output and some or all of this data is
stored.
3. The business applications consist of sets of data entities that combine to comprise those applications.
In the same way as individual entities can be grouped to comprise applications or services as described
on page 49, the business processes associated with data entities could be defined. This would then allow a
business process view of data entities to be created.
Within the data landscape model, such business process information could be useful but it strays from
the core purpose of understanding the operation and use of data at a high level within the organisation
and planning for future changes.