I gave this 30-minute presentation in April 2011 at Enterprise Data World in Chicago, United States.
I also have a copy of the script (also PowerPoint) I used for the PowerDesigner demonstration.
Generating XML schemas from a Logical Data Model (EDW 2011)
1. Generating XML Schemas from a canonical model - a practical example. George McGeachie, Metadata Matters Limited. Enterprise Data World, Chicago, April 5th, 2011
2. The ‘C’ word; Managing XSDs using a master (canonical) XML schema; The flexibility of PowerDesigner; The XML Schema Model in PowerDesigner; Which Canonical Model Should You Use? Factors To Be Considered; Generating XML Models in PowerDesigner. Enterprise Data World, Chicago, April 5th 2011
3. About Me: I’ve been involved with modelling and managing data and metadata for longer than I care to remember. When I first came across XML Schemas, I was struck by their simplicity and versatility. When I saw how some people use them, I realised that the days of unmanaged COBOL copybooks had returned to haunt me. This is my attempt to exorcise the XML demon.
4. The ‘C’ word: It’s difficult to avoid the word ‘Canonical’ when discussing standard XML messages. To some, the message schemas are the ‘canonical’. To others, the underlying model is the ‘canonical’. When I use the word, I mean both of them.
5. Managing XSDs using a master (canonical) XML schema: An XML master schema must be hierarchical. It cannot easily show all the complex relationships between the data concepts. It cannot be flexible enough to support multiple views of the data. You need to understand XML to create, edit or read the model.
6. Managing XSDs using a master (canonical) XML schema: How would we integrate the metadata represented by a schema with our other metadata? Reverse-engineer it into a repository or modelling tool? How would we ensure consistency of XSDs with each other, enable impact analysis, and manage the variations necessary between dependent schemas?
7. The flexibility of PowerDesigner: PowerDesigner has a dedicated XML Schema Model. Several different approaches are supported out of the box: generating XSDs from a class model; generating an XML Schema model from a class model or a physical data model, then generating XSDs; reverse-engineering existing XML Schemas into a class model or XML Schema Model (you can then generate a logical or conceptual view).
9. The XML Schema Model in PowerDesigner: This can link to an XML model, or an XSD; Referenced Elements (in another XSD); Complex Type; Note the XML-specific object types.
10. XML-specific Model Objects: Here’s the detail of one of the sequences in the schema; Filter the displayed properties; Add additional items to the sequence.
11. The XML Model: Dedicated to modelling XML Schema; 1 schema per XML model, traceable to source models; Mapping Editor: drag and drop mappings between models; Supports XSD, XDR and DTD; Multiple Namespaces supported; Can use 1 XML model for multiple schemas in the same target namespace.
12. The flexibility of PowerDesigner: Linkages are automatically maintained between models as you generate them. Additional mappings can be added between any two models. The generation process can be customised. You have control over the naming standards and data types used in each model. Impact and Lineage analysis is enabled by the generation and mapping links.
13. Generation Links: For each model, you can trace links to other models in the generation sequence.
15. Impact of deleting XML Attribute - diagram: Dependencies between PDM tables; PDM Tables mapped to the Element; Our attribute; XML Element; Simple Type (no longer required). Remember: this diagram traces back to the model the XML attribute was generated from; in this case, the XML model was generated from a PDM, so the impact analysis traces to the PDM.
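As a rough illustration of the impact analysis on this slide, the generation and mapping links can be treated as a directed graph and walked from the object being changed. This is only a sketch in Python, not PowerDesigner's own mechanism; the object names, the `LINKS` dictionary and the helper functions are all hypothetical.

```python
from collections import deque

# Hypothetical generation/mapping links: each object points to the
# objects that were generated from it or are mapped to it.
LINKS = {
    "PDM.Employee.Salary": ["XSM.Employee@salary"],
    "XSM.Employee@salary": ["XSD.TeamMembers", "XSM.SalaryType"],
    "XSM.SalaryType": [],
    "XSD.TeamMembers": [],
}


def reachable(start, links):
    """Breadth-first walk returning every object reachable from start."""
    seen = {start}
    queue = deque([start])
    found = []
    while queue:
        node = queue.popleft()
        for target in links.get(node, []):
            if target not in seen:
                seen.add(target)
                found.append(target)
                queue.append(target)
    return found


def invert(links):
    """Reverse every link, so the same walk gives lineage instead of impact."""
    reverse = {node: [] for node in links}
    for source, targets in links.items():
        for target in targets:
            reverse.setdefault(target, []).append(source)
    return reverse


# Impact: what is affected if the PDM column changes?
impact = reachable("PDM.Employee.Salary", LINKS)
# Lineage: where did the generated schema come from?
lineage = reachable("XSD.TeamMembers", invert(LINKS))
```

Walking the links forwards answers the impact question and walking them backwards answers the lineage question, which is why the same set of links can support both kinds of analysis.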
16. Visible Links Between Models: Using the ‘project’ feature, we can see the links between models. We can also include external objects, and create our own links to them (an XSD and an HTML file in this case).
17. Which canonical model? One (or more) that you can sell within your organisation. Standard models: Standard Universal Data Models, e.g. IBM, Teradata, Oracle, EWS Solutions, Len Silverston’s Universal Data Models; Industry Universal Data Models, e.g. standard messaging structures, such as from the OAG (see www.industrydatamodels.com). Some organisations use several of these, which will need to be cross-referenced. Your own model(s), possibly used to extend one of the above.
18. A simple example of a relational canonical model: This is an amended version of the ‘Project Management’ PDM supplied with PowerDesigner.
19. The ‘Team Members’ schema
21. Factors To Be Considered: Generating a Type Library vs. generating Schemas; Physical Design Challenges; Model Management Challenges.
22. Generating a type library: often straightforward to generate; no structure; manual control of re-use in schemas; "let the service people design the schemas"; no impact analysis across schemas; may be the only acceptable first step; low impact on work patterns; gains initial acceptance of the role of a canonical model.
23. Generating Individual Schemas: A 'container' for each schema (Model, Submodel, Subject Area, Package etc); defines scope; provides documentation; facilitates governance. Complete generation; nothing changed post-generation; requires high degree of control over generation process.
24. Generating Individual Schemas: all types can be local; no need to 'include' or 'import' a base schema; the 'standard' types are used to generate the schema, rather than being referred to by the schema; the standard types are in the canonical data model; potential for impact analysis; can choose when to update each individual schema.
25. Sample Message Schema: This was generated from the same data model as the type library; it only contains what the message actually needs.
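A minimal sketch of the idea, assuming a toy canonical model held as a Python dictionary. The model contents, `CANONICAL_MODEL` and `build_message_schema` are all hypothetical, and PowerDesigner's own generation works differently. The message schema picks only the attributes it needs and declares them locally, so nothing has to be included or imported from a base schema.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)

# Hypothetical canonical model: entity -> {attribute: XSD built-in type}.
CANONICAL_MODEL = {
    "Employee": {
        "EmployeeId": "xs:int",
        "Name": "xs:string",
        "Salary": "xs:decimal",
    },
    "Project": {"ProjectId": "xs:int", "Title": "xs:string"},
}


def xs(tag):
    """Qualify a tag name with the XML Schema namespace."""
    return f"{{{XS}}}{tag}"


def build_message_schema(root_name, selection):
    """Build an XSD holding only the selected entities and attributes.

    selection maps an entity name to the attribute names the message needs.
    Every element is declared locally, using types copied from the model.
    """
    schema = ET.Element(xs("schema"))
    root = ET.SubElement(schema, xs("element"), name=root_name)
    root_type = ET.SubElement(root, xs("complexType"))
    root_seq = ET.SubElement(root_type, xs("sequence"))
    for entity, attributes in selection.items():
        element = ET.SubElement(root_seq, xs("element"), name=entity)
        entity_type = ET.SubElement(element, xs("complexType"))
        entity_seq = ET.SubElement(entity_type, xs("sequence"))
        for attribute in attributes:
            ET.SubElement(
                entity_seq,
                xs("element"),
                name=attribute,
                type=CANONICAL_MODEL[entity][attribute],
            )
    return schema


# The message needs two Employee attributes and nothing from Project.
team_members = build_message_schema(
    "TeamMembers", {"Employee": ["EmployeeId", "Name"]}
)
xsd_text = ET.tostring(team_members, encoding="unicode")
```

Because the types are copied into each schema at generation time, each schema can be regenerated on its own schedule when the canonical model changes.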
26. Physical Design Challenges: Some of these challenges are similar to those we face when we use a single Logical Data Model for designing multiple database schemas. Differences between Schemas; Handling Sub-types and inherited attributes; Denormalising attributes and entities; Data Type and Name conversions; Scalability of Process; Automating & future-proofing the generation process.
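Data type and name conversions are among the easier challenges to illustrate. The sketch below assumes a small, hypothetical mapping table from SQL-style types to XSD built-in types and an UpperCamelCase naming standard for elements; neither is taken from the presentation or from PowerDesigner, and the fallback to `xs:string` is an arbitrary choice.

```python
import re

# Assumed mapping from physical (SQL-style) types to XSD built-in types.
SQL_TO_XSD = {
    "INTEGER": "xs:int",
    "VARCHAR": "xs:string",
    "CHAR": "xs:string",
    "DECIMAL": "xs:decimal",
    "DATE": "xs:date",
    "TIMESTAMP": "xs:dateTime",
}

DEFAULT_XSD_TYPE = "xs:string"  # assumption: unknown types become strings


def to_xsd_type(sql_type):
    """Convert e.g. 'VARCHAR(40)' to 'xs:string', ignoring length/precision."""
    match = re.match(r"[A-Za-z]+", sql_type.strip())
    if match is None:
        return DEFAULT_XSD_TYPE
    return SQL_TO_XSD.get(match.group(0).upper(), DEFAULT_XSD_TYPE)


def to_element_name(column_name):
    """Convert e.g. 'EMP_START_DATE' to 'EmpStartDate'."""
    parts = re.split(r"[_\s]+", column_name.strip())
    return "".join(part.capitalize() for part in parts if part)


converted = {
    to_element_name(column): to_xsd_type(sql_type)
    for column, sql_type in [
        ("EMP_ID", "INTEGER"),
        ("EMP_NAME", "VARCHAR(40)"),
        ("EMP_START_DATE", "date"),
    ]
}
```

Keeping the conversion rules in one place like this is one way to apply the same naming standards and data types to every generated schema.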
27. Model Management Challenges: Multiple reference models require careful consideration. E.g. using the OAG messaging standard as a reference model, extending it where required. Provides a quick-start for defining schemas. Be careful what you remove; perhaps remove nothing. How do you identify the parts that are actually in use? How do you know what extensions are needed? Have to map the standard to your own models.
28. Generating XML Models in PowerDesigner: Two options. (1) Generate the XML Model using default options, from selected tables, and tinker with the XML afterwards. (2) Use XML Builder to design and generate almost exactly what is required, and then tinker with the XML model.
29. Contact Me: Telephone: +44 (0) 20 8123 8756 (forwarded to mobile); UK mobile: +44 (0) 794 293 0648; Skype: gmcgeachie; Twitter: metadatajunkie; Email: George.McGeachie @ MetadataMatters.com; Blog: http://metadatajunkie.wordpress.com/
I’m a modeller at heart, not an XML Guru, and not a developer.
Why use a canonical data model? In this presentation, I assume that the use of a canonical model is a given; we all see the need to manage our understanding and metadata. See Designing Canonicals for SOA: Bridging ER and XML Worlds, Mehmet Orun & Jeff Pekrul, Genentech: http://sfdama.org/Presentations/2007-02-07_Canonical_for_SOA_%20ER_XML.pdf
It’s possible to manage the content of XML Schemas by using a single master (canonical) XML Schema, which defines all the allowed types. The hierarchical nature of an XML Schema makes it impossible for a master schema to do any more than specify the possible building blocks for schemas; it cannot govern the ways in which those building blocks are assembled. A canonical data model managed as an XSD can only be a type library; it cannot possibly control the ways in which the types can be assembled. Many of the challenges (see later) we face when generating XSDs also apply to an XML Master Schema.
A stand-alone canonical XML schema is an island of metadata, providing no impact and lineage analysis capabilities.
The dedicated XML model in PowerDesigner enables us to manage XML schema designs in an integrated manner; have a look at the options in the next slide
You can see how the XML Schema Model forms an integral part of the PowerDesigner model management philosophy. This slide only shows the ‘downward’ generation options; it is also possible for a model to generate another model of the same type, or another model of a ‘higher’ type. E.g. a PDM can be generated from an XSM.
This is an OAGi XML Schema, reverse-engineered into PowerDesigner. The only change I’ve made is to move some symbols closer to each other. This schema inherits all the simple types from other schemas, which is why there aren’t any here.
The arrows show the navigation path through the model required to create the schema; note that some relationships are traversed from parent to child, others from child to parent. Also note that no attributes are extracted from the ‘Team Member’ entity; we use it as a route to the details of the Employee. The grey parts of the model are not used in this schema.
A more complex schema. The arrows show the navigation path through the model required to create the schema; note that some relationships are traversed from parent to child, others from child to parent. Also note that no attributes are extracted from the ‘Team Member’ entity; we use it as a route to the details of the Employee.
Differences between Schemas: attribute and relationship optionality and cardinality; excluding attributes; fewer or more enumerations. Denormalising attributes and entities: attributes may appear in more than one element; one element may include attributes from multiple entities. Scalability of Process: we can handle 5 XSDs, but what about 500 XSDs?