Adopting a Canonical Data Model - how to apply to an existing environment with web services (SOA and REST)
1. ‘How to implement a canonical data model in an existing SOA
estate’
19/05/2014 (slide 1)
phil@mp3monster.org
www.mp3monster.org
‘How to implement a canonical data
model in an existing SOA estate’
Introduction
• The following deck attempts to address the question: ‘How to implement a canonical data model in an existing SOA estate’
• To address this we need to understand a number of things:
– Assumptions on the current state of affairs
– The value proposition of adopting a canonical model: there is no point in an adoption approach that doesn’t deliver value (whether as tangible or intangible benefits)
– The strategies best suited to delivering the goal
– The risks we may expose
Assumptions
• By ‘SOA environment’ we mean capability-centric services primarily built with SOAP/WSDL/XSD and REST/JSON technologies
• By ‘data model’ we mean the data definitions used in middleware, rather than those of the underlying applications and data warehouses/marts
• The existing estate doesn’t have an interface versioning strategy applied across the board
• Services are woven together by an ESB to deliver larger capabilities
• The approach should be vendor agnostic
What do we mean by a Canonical Data Model?
• The following definition comes from Forrester, 2010 (part of a blog post on a modelling conference [1]):
A canonical information model is a model of the semantics and structure of information that adheres to a set of rules agreed upon within a defined context for communicating among a set of applications or parties.
• The essence of the various definitions is:
– Internally consistent description of data
– Standard terminology and meaning
– Commonly accepted by all providers & consumers involved in orchestrating interactions of any form
– Definitions are largely technology agnostic (although typically not free of the underpinning representation, e.g. XML/XSD or SQL)
[1] http://blogs.forrester.com/mike_gilpin/10-03-15-field_first_annual_canonical_model_management_forum
Value of a Canonical Model
• Semantic Consistency – interactions have a common meaning, so no problems of ‘your gadget is my widget’
– Mapping data from an event payload or for WS invocations is easier & less prone to mapping errors
• Structural Consistency – the definition of common data items is always the same
– Eliminates risks of transformation errors
– Potential to reduce transformations in an orchestrated sequence of operations, meaning greater throughput
• Reduced Design Effort – choose appropriate definitions rather than create them
– Picking data definitions from a set of models is easier & less error prone than designing from scratch
• Increased chance of Information Rich integration
– A predefined data definition increases the chances of providing information rich events, as you’re just populating objects
– Information rich events raise the chances of plug and play integration (event types match, and the data shared is less likely to need changes to carry more data)
Look at a hypothetical integration and how
Canonical Model adoption can change it
Key (diagram legend): Aggregate, Split, Transform, Store, Tx Endpoint, Endpoint, Pipe (+Filter), Content Route, De/Normalize, Enrich, Canonical Data, App Data
Icons from Hohpe & Woolf, EAIPatterns.com
Organic Growth & Non Canonical Model
• Organic Growth
• No canonical model
• Creating need for multiple transforms & related operations
• Some operations may undo previous operation
• Canonical application models excluded here
Same Systems with Canonical Model
(systems not canonical conversant)
• Greatly simplified, as each system is fronted by a transform between its local representation and the canonical one
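The idea of fronting each system with a transform to/from the canonical form can be sketched roughly as below. This is a minimal illustration only: `CanonicalItem`, the "gadget" and "widget" field names and the two adapter functions are all hypothetical and do not come from any particular product or industry model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CanonicalItem:
    """Hypothetical canonical definition shared by every integration."""
    item_id: str
    description: str
    unit_price_pence: int


def gadget_to_canonical(gadget: dict) -> CanonicalItem:
    """Inbound adapter: System A's local 'gadget' record -> canonical."""
    return CanonicalItem(
        item_id=gadget["gadgetRef"],
        description=gadget["name"],
        unit_price_pence=round(gadget["price_gbp"] * 100),
    )


def canonical_to_widget(item: CanonicalItem) -> dict:
    """Outbound adapter: canonical -> System B's local 'widget' record."""
    return {
        "widget_code": item.item_id,
        "label": item.description,
        "cost": item.unit_price_pence,
    }


# System A publishes a gadget; System B receives it as a widget.
# Neither system needs to know the other's local representation.
gadget = {"gadgetRef": "G-100", "name": "Lens cloth", "price_gbp": 2.5}
widget = canonical_to_widget(gadget_to_canonical(gadget))
```

Each system only needs its own pair of adapters, so the middleware between them can work purely on the canonical form.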
Same Systems with Canonical Model
(some systems canonical conversant)
• Number of transformations reduced
• Middleware becomes purely routing / pub-sub delivery
Select or Create own Canonical Model?
• Industry standard models cover a wider number of domains, but will not provide a 100% fit all the time
• Creating works …
– In a closed, non-SaaS/COTS environment, creating a custom canonical model can deliver benefits
• Closer match to service implementation – performance gain
• Alignment to business language
– Creating needs to take into account lessons from designing enterprise application/DB data models
• Selecting a standard model means
– Leveraging accumulated good practice, lessons learnt and data needs for interoperability
– Industry models are likely to be an 80/20 fit; you will need your own definitions for business specific concepts, e.g.
• in optical retail, the standard definition of Item needs extending with the Vision Council of America definitions of clinical elements such as lens shape & cut
– Model selection(s) need to be done with care
– Make sure model(s) are sufficiently mature
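As a rough sketch of the 80/20 point above, a standard definition can be extended rather than replaced. The classes and fields here are hypothetical stand-ins: `StandardItem` is not taken from any real industry model, and `lens_shape` / `lens_cut` are illustrative names only.

```python
from dataclasses import asdict, dataclass


@dataclass
class StandardItem:
    """Stand-in for an Item definition taken from an industry model."""
    item_id: str
    description: str


@dataclass
class OpticalItem(StandardItem):
    """Business-specific extension: adds fields, changes none."""
    lens_shape: str
    lens_cut: str


lens = OpticalItem("L-1", "Single vision lens", lens_shape="oval", lens_cut="edged")

# Consumers that only understand the standard definition can still
# treat the extended object as a StandardItem.
standard_view = {k: v for k, v in asdict(lens).items()
                 if k in ("item_id", "description")}
```

Keeping extensions additive means the standard part of the model remains usable by anything that only knows the industry definition.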
Technical Strategies / Decisions Needed
Interface/Payload Versioning
• Interfaces or the payload need a versioning strategy as they will change over time. Versioning can be applied by
– URI – works very well for REST
– XSD schema versioning – common, but a problem for REST+JSON
• Need to know how many versions to actively support
– Common to keep current + 1
– Factors to account for: rate of change & interface users’ ability to accommodate that rate of change
• Determine the handling approach
– Common URL + ESB conditional logic
– Separate URLs + ESB logic re-use
Versioning Existing Interfaces – Some Options
• If the existing interface/payload has no version
– ESB can use the absence as an implicit version 0
– Requires conditional routing logic
• If interface versioning does exist, then
– If the versioning uses the same strategy, then recommend new interface URIs
– If different, can share the URI and use the ESB to determine the version
– Requires conditional routing logic
• Simply create a slightly different URI for replacement services
– Increased volume of code
– Endpoint user is aware of the change (less desirable)
– Conditional routing is implicit in the URI
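The conditional routing described above, including treating a missing version as an implicit version 0, might look something like the following. It is a sketch under assumptions: the `/v<N>/` path convention, the `schemaVersion` payload field and the two handler functions are invented for illustration, and a real ESB would express this in its own routing configuration rather than Python.

```python
import re

# Hypothetical handlers standing in for ESB routes to service versions.
def handle_v0(payload: dict) -> dict:
    return {"handledBy": "legacy", "order": payload}


def handle_v1(payload: dict) -> dict:
    return {"handledBy": "canonical-v1", "order": payload}


ROUTES = {0: handle_v0, 1: handle_v1}
VERSION_IN_PATH = re.compile(r"^/v(\d+)/")


def resolve_version(path: str, payload: dict) -> int:
    """URI version wins; otherwise a payload field; otherwise implicit 0."""
    match = VERSION_IN_PATH.match(path)
    if match:
        return int(match.group(1))
    return int(payload.get("schemaVersion", 0))


def route(path: str, payload: dict) -> dict:
    """Dispatch to the handler for the resolved interface version."""
    version = resolve_version(path, payload)
    try:
        handler = ROUTES[version]
    except KeyError:
        raise ValueError(f"unsupported interface version: {version}") from None
    return handler(payload)
```

Existing unversioned callers keep working (they resolve to version 0), while new callers opt in through the URI or the payload.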
Transition States
• There will always be a transitional period when adopting a major change such as a canonical model
– Therefore, when passing on or starting a sequence of event(s), what do I assume about downstream capability?
• This can be addressed by one of several strategies:
– Late binding: use UDDI or equivalent to discover the service and the version of the interface available; great if the overhead is not a problem
– Assume latest version (predicated on the ability to transform to the previous version); the programme of work provides a proxy to the legacy interface which transforms down
– Software controlled switching of output (not desirable as it embeds knowledge of consumers into a service)
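The ‘assume latest version’ strategy, with a proxy in front of the legacy interface that transforms down, can be illustrated with the sketch below. The v1 and v2 message shapes, `downgrade_v2_to_v1` and `LegacyProxy` are all hypothetical; the point is only that the producer always emits the latest version and never needs to know which consumers are still on the old one.

```python
def downgrade_v2_to_v1(message: dict) -> dict:
    """Transform a hypothetical v2 customer message into the v1 shape."""
    name = message["name"]
    return {
        "version": 1,
        "customerId": message["customerId"],
        "fullName": f'{name["given"]} {name["family"]}',
    }


class LegacyProxy:
    """Fronts a legacy (v1-only) consumer so producers can send v2."""

    def __init__(self, legacy_consumer):
        self._legacy_consumer = legacy_consumer

    def send(self, message: dict):
        # Only v2 messages need transforming down; v1 passes through.
        if message.get("version") == 2:
            message = downgrade_v2_to_v1(message)
        return self._legacy_consumer(message)


received = []                      # stands in for the legacy service
proxy = LegacyProxy(received.append)
proxy.send({
    "version": 2,
    "customerId": "C42",
    "name": {"given": "Ada", "family": "Lovelace"},
})
```

When the legacy consumer is eventually upgraded, the proxy is removed and nothing changes on the producer side.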
REST+JSON Question
• Canonical models in support of the middleware are typically XSD based today: good for SOAP/WSDL services, but REST + JSON is more challenging as no schema is needed
• However, organisations are starting to offer JSON models, e.g. as part of OASIS, OAGIS
• Could use REST + XML (more like RPC than proper REST)
• Could publish a JSON mapped representation (tooling available) with description via JSON Schema (IETF draft)
– Safest when offering a special custom service
– Still delivers benefit for internal services
• Remember REST is built around resources, and ideally you want resources to be consistent in definition
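To make the JSON Schema option more concrete, here is a small sketch of a canonical resource described as a JSON Schema document, plus a deliberately tiny check. The `Item` schema and its fields are invented for illustration, and `check` is not a JSON Schema validator: it only understands `required` and simple `type` keywords, standing in for whichever real validation tooling is chosen.

```python
import json

# Hypothetical canonical 'Item' resource described with JSON Schema keywords.
ITEM_SCHEMA = {
    "$schema": "http://json-schema.org/draft-04/schema#",
    "title": "Item",
    "type": "object",
    "properties": {
        "itemId": {"type": "string"},
        "description": {"type": "string"},
        "quantity": {"type": "integer"},
    },
    "required": ["itemId", "description"],
}

_PYTHON_TYPES = {"string": str, "integer": int, "object": dict}


def check(instance: dict, schema: dict) -> list:
    """Return a list of problems; handles only 'required' and simple 'type'."""
    problems = [f"missing: {name}"
                for name in schema.get("required", [])
                if name not in instance]
    for name, rule in schema.get("properties", {}).items():
        expected = _PYTHON_TYPES[rule["type"]]
        if name in instance and not isinstance(instance[name], expected):
            problems.append(f"wrong type: {name}")
    return problems


payload = json.loads('{"itemId": "I-7", "quantity": "two"}')
problems = check(payload, ITEM_SCHEMA)
```

Publishing the schema alongside the REST resource gives JSON consumers the same shared definition that XSD gives SOAP consumers.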
Challenges of Abstraction vs Endpoint Needs
• Current interfaces may be geared towards supporting specific platforms, e.g.
– Phone, thick client, IoT (Internet of Things, i.e. agent devices such as smart meters)
– E.g. strip a generic message down to only the necessary elements as the device can only handle small payloads
• Strategy for this is to add an endpoint-aware transformation layer between the endpoint and the core canonical & ESB layer
– Means core routing / business aspects of the ESB are not impacted
– So changes to routing etc. are not impacted
– Clients not needing adaptation can talk directly to the canonical layer
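An endpoint-aware transformation of this kind could be as simple as a per-endpoint projection of the canonical message. In this sketch the endpoint types, the field names and `ENDPOINT_PROFILES` are assumptions made for illustration.

```python
# Hypothetical profiles: the canonical fields each constrained endpoint needs.
ENDPOINT_PROFILES = {
    "smart_meter": ("meterId", "reading"),
    "phone": ("meterId", "reading", "tariff"),
}


def adapt_for_endpoint(canonical: dict, endpoint_type: str) -> dict:
    """Strip a canonical message down to what the endpoint can handle."""
    fields = ENDPOINT_PROFILES.get(endpoint_type)
    if fields is None:
        # No profile: the client talks to the canonical layer directly.
        return dict(canonical)
    return {key: canonical[key] for key in fields if key in canonical}


canonical_reading = {
    "meterId": "M-9",
    "reading": 1234,
    "tariff": "E7",
    "customer": {"id": "C42", "address": "1 High Street"},
}
meter_view = adapt_for_endpoint(canonical_reading, "smart_meter")
```

Because the projection lives in its own layer, adding a new device type means adding a profile, not touching the routing.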
Governance
• All of the previous considerations need to be factored into any Governance processes
– Design Time
• Assurance that the correct approaches identified are being adopted
• Adoption is for the right reasons
– Execution Time
• Ability to ascertain the adoption, efficacy etc.
• We started out with the declaration that we’re working in a SOA context, which should mean
– SOA Governance is in place and can support these Governance goals
Image: Open Group, http://www.opengroup.org/soa/source-book/gov/sgvm_artifacts.jpg
Understanding Why & How To Adopt a Canonical Model, we can now look at execution
Implementation Strategy
• Define architectural strategies (i.e. engage with the previous rationale and challenges)
• Need to ensure groundwork is in place to enable correct development; could be delivered by
– Reference implementation
– Documentation set with policy & practice
– Very detailed requirements (including test definitions)
– Support through an architect involved in pair programming
– Workshops
Implementation Strategy (1)
• Start small & grow as …
– Knowledge & understanding develops
– Principles, ideas and approach are refined
– Helps manage risk & impact
– Can ensure initial work is ‘referenceable’
• Assess & Measure
– Helps build cost/benefit and ROI insights (both hard and soft factors)
– Informs planning & estimation downstream
– Ensures implementation quality & sustainability
TOGAF View of realising Canonical Model
(Diagram annotations)
• Defined objective for how and why a Canonical Model should be adopted (assuming other principles etc. already set)
• Establish business direction of travel so we can identify suitable model(s), opportunities for a pilot
• Determine key business data structures
• Build the Tech Ref Model & Stds Information Base
• Look at opportunities for piloting canonical adoption
• As not greenfield, a transition strategy is needed
• Hands on support – key to identify lessons and approaches to accelerate & ease adoption
• Apply refinements to pilot. Depending upon scope, plan next cycle
• Set direction of travel, scope for change
Activities from an Execution Sequence Perspective
Identify Canonical model(s) & Strategies
• Which model(s) to use
• Approaches/impact to handling transition
Develop Model Knowledge
• Interpretation/Language
• Versioning
• Ensure guidance & supporting information is ready
Determine non Critical integration programme
• Scope to cover various patterns of use and impact
• Agree benchmarks to establish value
• Develop detailed implementation plan
Start Development
• Ensure tests of existing interfaces are in place so we can assure there is no impact
• Develop initial interfaces inc. interface & e2e tests
Regression Testing
• Ensure that different message types & versions are exercised
• Check for changes in type within end to end execution
• Deployment strategy included
Performance & Other PreProd Tests
• Canonical models can be heavier, therefore ensure performance is considered
• Assess value of approach
Assuming Success…
• Expand adoption
• KT to wider team etc.
• Programme of full adoption
Establish Monitoring
• Need to determine when legacy interfaces stop being used
• Retire interfaces at appropriate time
Iterate development process
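For the ‘Establish Monitoring’ activity, one possible way to decide when a legacy interface has stopped being used is to record the last date each interface was called and flag those that have been quiet for a while. This is a sketch only: `InterfaceUsageMonitor`, the interface names and the 30 day quiet period are assumptions, and in practice the data would more likely come from ESB or gateway logs.

```python
from collections import Counter
from datetime import date, timedelta


class InterfaceUsageMonitor:
    """Tracks call counts and the last day each interface was used."""

    def __init__(self):
        self.calls = Counter()
        self.last_seen = {}

    def record(self, interface: str, day: date) -> None:
        self.calls[interface] += 1
        previous = self.last_seen.get(interface)
        if previous is None or day > previous:
            self.last_seen[interface] = day

    def retirement_candidates(self, interfaces, today: date, quiet_days: int = 30):
        """Interfaces not called within the quiet period (or never called)."""
        cutoff = today - timedelta(days=quiet_days)
        return sorted(name for name in interfaces
                      if self.last_seen.get(name, date.min) < cutoff)


monitor = InterfaceUsageMonitor()
monitor.record("orders-v0", date(2014, 3, 1))
monitor.record("orders-v1", date(2014, 5, 18))
candidates = monitor.retirement_candidates(
    ["orders-v0", "orders-v1"], today=date(2014, 5, 19))
```

A list like `candidates` is a prompt for a retirement decision rather than an automatic trigger; some interfaces are legitimately called only at month or year end.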
Reminder & Questions
Remember!
• This has been done before, so make sure you’re considering best practice recommendations (particularly from preferred vendors)
Questions
Thank you
Editor’s Notes
Notes:
Approach:
Need uptake enablement
Tech considerations
- fit within larger dev lifecycle
On the basis that this presentation has been requested, keep the personal introduction minimal
If the audience size is small then take questions as we go
SOA can mean many things to many people, so let’s declare the interpretation
Differentiate the data model from a DB perspective and from a middleware perspective
Some assumptions are made on the premise that it’s an excuse to illustrate some useful points / thinking
This quote comes from Forrester blog by Mike Gilpin on March 15 2010 : http://blogs.forrester.com/mike_gilpin/10-03-15-field_first_annual_canonical_model_management_forum
Alternate definitions:
http://www.theintegrationengineer.com/canonical-data/
http://www.information-management.com/issues/2007_50/10001733-1.html
http://www.eai-ideas.com/architecture-ideas/soa-and-canonical-data-model-cdm
http://soapatterns.org/design_patterns/canonical_schema
http://www.soa-probe.com/2010/09/canonical-data-model.html#more
http://blogs.msdn.com/b/nickmalik/archive/2007/06/12/canonical-model-canonical-schema-and-event-driven-soa.aspx
http://xml.fido.gov/documents/completed/oagi/oagis.htm
To understand what value a canonical data model can bring, it is necessary to determine what changes, and therefore how best to go about adoption
Erl p. 62, Service-Oriented Architecture: Concepts, Technology, and Design, section 3.4.5: leverage XML capabilities to richly define the data, laying the groundwork for intrinsic interoperability
Cost and effort of application design is reduced after proliferation of standardized XML data
http://blog.digitalml.com/canonical-models-should-be-a-core-component-of-your-api-strategy/
http://blogs.forrester.com/mike_gilpin/12-05-29-canonical_information_models_play_important_role_in_api_layers_increasing_service_reuse
Next couple of slides illustrate the potential value of adopting a canonical model
Integrations added in a fairly unstructured, patch-things-on manner
The sort of thing that can happen as an evolution on from point to point connectivity
Visualisation of the canonical data structures not shown
All end points are abstracted by a transformation that converts the local data structures to/from a canonical representation (not shown here)
As the diagram shows, the bulk of activity is transformation inbound/outbound at the end points, and then just routing
Eliminating transforms and counter transforms
Risk of modifying routing/sequencing greatly reduced
Many COTS products can handle canonical data models.
Custom developed solutions can be made conversant with the canonical data model, eliminating transforms
Middleware moves towards routing considerations
E2e more efficient
Source a model from OAGI / OASIS / eTOM or develop your own
Developing your own model will be time consuming and challenging; an extensibility strategy is critical
As we are changing an existing estate, we need to handle changes to the interface
Canonical model typically based on XSD, but REST favours a JSON payload
Tools options to assist:
http://javaoraclesoa.blogspot.co.uk/2012/12/a-reusable-solution-for-conversion.html
http://www.balisage.net/Proceedings/vol7/html/Lee01/BalisageVol7-Lee01.html
https://www.oasis-open.org/resources/topics/rest-json
http://www.jsonschema.net/index.html
Canonical Data Model management should be subject to some form of governance framework
If you’re practising full SOA governance then this will already be the case
As we’ve focused on the why and the benefits, all the architectural decisions should have been made (a pre-requisite)
We can focus on the mechanics of converting architecture into a reality
In terms of groundwork, some form of documentation is necessary to enable growth and informed decision making by others in the future. Remember even Agile says:
We value … Working software over comprehensive documentation (http://agilemanifesto.org/)
You’ll note that the proposed sequence doesn’t perfectly follow TOGAF phases A to E, but does cover all the bases. Phases F to H are followed more tightly
TOGAF recognises it may need to be iterative