Presented by Eliot Kimber at Documentation and Training East 2008,
October 29-November 1, 2008 in Burlington, MA.
XML applications for publishers have largely failed to realize the
full potential inherent in the technology. While larger publishers
could make the investment necessary to realize significant return on
the use of XML technology, smaller enterprises simply could not, for a
number of reasons, but fundamentally because the startup costs and
ongoing costs of ownership were simply too high. The DITA standard
fundamentally changes the equation, bringing several unique features
that, together, serve to lower both the startup cost and ongoing
costs, making the use of XML for publishers much more affordable than
it has ever been. At the same time, advances in supporting
technologies important to publishers, such as improved support for XML
in Adobe Creative Suite and Microsoft Office, powerful new XML search
and retrieval systems such as MarkLogic, and a new generation of
lower-cost XML editors, also serve to make the use of XML for publishing
applications more attractive than it has ever been.
3. Who Is This Talk For?
Publishers who want to implement XML-based
solutions
Publishers who have XML-based solutions that need
to be enhanced, refined, or upgraded
Creators of XML-aware tools applicable to publishing
use cases
Service providers who support the development and
use of XML-based solutions for Publishers:
Integrators
Data conversion houses
Consultants
DITA for Publishers DocTrain East 2008
4. About Me
Senior Concept Prover at Really Strategies Inc.
20+ years of experience with generalized markup
(GML, SGML, XML, etc.)
Career focus on large-scale hyperdocument creation
and management
Focus for last 8+ years on Publishing use cases
around XML-based publishing workflows
Active member of the DITA Technical Committee
Founding member of the XML Working Group
Long-time member of the XSL-FO Working Group
Co-editor of the ISO/IEC HyTime standard
5. Audience Survey
Who is here?
Publishing for profit?
Publishing as a cost?
Technical Documentors?
Service providers?
People just interested in DITA?
DITA knowledge:
No idea what DITA is?
Know about DITA a little?
Familiar with DITA concepts and details?
Using DITA now or implementing DITA-based solution?
7. What Is DITA?
OASIS Open Standard: Darwin Information Typing
Architecture
An XML architecture standard for representing
human-consumed information
Some distinguishing aspects of DITA as an XML
architecture:
Formal mechanism for controlled definition of new
vocabularies (“specialization”)
Optimized for information modularity (“topics”, “maps”) and
blind interchange
Standardized document type implementation design patterns
Growing off-the-shelf processing infrastructure
Sophisticated hyperlinking features (“relationship tables”)
Currently at version 1.1, version 1.2 in final stages of
review and approval
8. Key DITA Concepts Briefly Explained
Topics
Topic content is paragraphs and stuff
Standalone units of information
Topics may directly contain other topics
Maps
Hierarchical sets of links to topics
Establish organizational hierarchies for sets of topics
May have many maps over the same topics
Can impose metadata onto topics
Can impose topic-to-topic hyperlinks (relationship tables)
Specialization
New element types are “subclasses” of existing types
9. Oooh, A Picture
[Figure: two maps, Map One and Map Two, drawn beside a shared pool of
topics, Topic A through Topic F]
10. Output Results
Map One, mapped to PDF:
I. Topic C
  1.1 Topic B
    1.1.1 Topic A
    1.1.2 Topic D
  1.2 Topic F
Map Two, mapped to PDF:
I. Topic F
  1.1 Topic B
  1.2 Heading
    1.1.1 Topic A
    1.1.2 Topic E
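A minimal sketch of how a map drives the output hierarchy, assuming plain DITA-style `<map>` and `<topicref>` markup. The file names (`topic-a.dita` and so on) and the `outline` helper are hypothetical, and the second map is a simplified illustration rather than a transcription of the slide. The point is only that the same topics get a different position and number depending on which map references them.

```python
import xml.etree.ElementTree as ET

# Two hypothetical maps over one shared pool of topics (file names are made up).
MAP_ONE = """
<map>
  <title>Map One</title>
  <topicref href="topic-c.dita">
    <topicref href="topic-b.dita">
      <topicref href="topic-a.dita"/>
      <topicref href="topic-d.dita"/>
    </topicref>
    <topicref href="topic-f.dita"/>
  </topicref>
</map>
"""

MAP_TWO = """
<map>
  <title>Map Two</title>
  <topicref href="topic-f.dita">
    <topicref href="topic-b.dita"/>
    <topicref href="topic-e.dita"/>
  </topicref>
</map>
"""


def outline(map_xml):
    """Return (number, href) pairs, in document order, for a map's topicrefs."""
    result = []

    def walk(parent, prefix):
        # Only direct topicref children count as the next level of hierarchy.
        for index, ref in enumerate(parent.findall("topicref"), start=1):
            number = f"{prefix}.{index}" if prefix else str(index)
            result.append((number, ref.get("href")))
            walk(ref, number)

    walk(ET.fromstring(map_xml), "")
    return result


if __name__ == "__main__":
    for name, source in (("Map One", MAP_ONE), ("Map Two", MAP_TWO)):
        print(name)
        for number, href in outline(source):
            print(f"  {number} {href}")
```

Note that `topic-b.dita` and `topic-f.dita` appear in both outlines at different positions; neither topic file has to change for that to happen.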
11. Specialization
DITA standard defines a set of base element types:
Topic, map, section, paragraph, figure, table, phrase, data
All other elements based on these base types
Establishes a formal class hierarchy for all element
types in any DITA document
Every element type maps back to some standard-
defined base type
Declaration mechanism is formal and simple:
Uses element attributes (“class=”)
Can be processed by almost any XML tool, including CSS
selectors
Even works for DTD-less documents
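A rough sketch of what class-based processing can look like, assuming the usual DITA convention that each element's class attribute lists its ancestry as space-separated `module/element` tokens. The `chapter` and `sidebar` element names and the `is_a` and `find_all` helpers are hypothetical. The class values are written out explicitly here, as they would need to be in a document with no DTD to supply them as defaults.

```python
import xml.etree.ElementTree as ET

# A hypothetical specialized document; "chapter" and "sidebar" are made-up
# element types that declare standard base types in their class attributes.
DOC = """
<chapter class="- topic/topic chapter/chapter ">
  <title class="- topic/title ">Getting Started</title>
  <body class="- topic/body ">
    <sidebar class="- topic/section sidebar/sidebar ">
      <p class="- topic/p ">Specialized content.</p>
    </sidebar>
  </body>
</chapter>
"""


def is_a(element, base_type):
    """True if the element's class attribute names base_type as a whole token."""
    return base_type in element.get("class", "").split()


def find_all(root, base_type):
    """All elements (root included) that are, or specialize, base_type."""
    return [el for el in root.iter() if is_a(el, base_type)]


if __name__ == "__main__":
    root = ET.fromstring(DOC)
    # A processor that only knows about base sections still finds the sidebar.
    print([el.tag for el in find_all(root, "topic/section")])
```

The same whole-token match is what a CSS attribute selector such as `*[class~="topic/section"]` performs, which is why generic tools can handle specialized elements without knowing their names.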
13. Compare DITA With…
DocBook
Book-focused (not inherently modular)
Can use XInclude to manage information in modular fashion
No facility comparable to DITA maps
Mature standard
Very large tag set reflecting union of wide set of requirements
No formal vocabulary extension mechanism
Blind interchange not really possible
Deep off-the-shelf infrastructure
14. Compare DITA With…
NLM
Optimized for journals not books
No formal vocabulary extension mechanism
Little off-the-shelf infrastructure
15. Compare DITA With…
PRISM/PAM
Essentially XHTML with sophisticated metadata
Optimized for serialization, not authoring and archiving
Little off-the-shelf processing infrastructure
16. Compare DITA With…
Custom XML application
Expensive to develop and maintain
Can be optimized for local requirements
Processing infrastructure must be built from scratch
Content management
Authoring tool configuration and customization
Publishing pipelines
Interchange transforms
No blind interchange possible
17. About That…XML Application Development Costs
Information requirements analysis is always required
Using a standard XML application still requires that
you determine how to apply it to your requirements
All useful standard XML applications…
…Provide more stuff than you need
…Fail to provide some things specific to your requirements
Amount of analysis required reflects your business
problem, not standard chosen
Thus: cost of analysis is essentially invariant
regardless of implementation choice
Main variable is cost of system implementation:
Implementation of XML document types (DTDs)
Implementation of management and processing
18. XML System Cost Analysis
Three distinct cost domains:
Initial system development
Cost of use (training, skills required, cost of tools)
Maintenance and refinement over long time scale
Ideal implementation base minimizes all costs:
Low cost of acquisition and implementation
Low cost of use, skills and knowledge common in user
population, tools are appropriately priced
Low cost of refinement, extension, interchange, management
Cost evaluated in terms of value:
Short term: ability to meet immediate requirements with
lowest initial cost
Long term: ability to support new requirements with lowest
cost of maintenance and extension
19. DITA Largely Meets the Ideal
Lowest possible cost of initial solution development
Implementing custom doctypes very low cost
Many off-the-shelf tools “just work” with little or no
customization or configuration
Large and growing body of use-case-specific DITA modules
Large and growing body of DITA knowledge
Standard is well written
Many service providers with solid DITA knowledge
Growing body of published DITA how-to information
Controlled extension (“specialization”) means:
Knowledge about one DITA application transfers to all other DITA applications
Extensibility and interchange are optimized
Implementations can optimize their own modularity and flexibility
20. My Assertion: DITA Is Almost Always Best Fit
DITA can be easily and practically applied to almost all
documentation use cases (not just tech docs)
DITA’s unique features minimize initial cost of
ownership and implementation
DITA’s unique features optimize interchange of
content
DITA’s unique features maximize flexibility and
stability of supporting tools
Therefore:
DITA provides maximum value compared with other
alternatives
Main cost is acceptance of a few constraints that enable
DITA’s value
22. DITA Myth One: DITA Is Only For Tech Docs
DITA is a layered, flexible standard
Originally driven by technical documentation
requirements…
…but, core features are completely generic
DITA has been used for:
Government reports
Financial standards
Test preparation books
Travel guides
No inherent restrictions on the kind of publications
DITA will work well for
23. Forest for the Trees: It’s Still Just XML
DITA has lots of cool features, some quite
sophisticated
This sophistication can be scary
But...
…It’s still just XML
You don’t have to use any particular feature of DITA
Users don’t necessarily need to know it’s DITA
If being DITA-based doesn’t help at the moment,
don’t talk about it
To the non-DITA-aware it looks like any other custom
XML application
24. DITA Myth Two: DITA Requires Topic-Based
Writing
DITA standard is optimized for modularity
But it does not require that content be stored or written as
modules
Use of DITA maps is entirely optional
Topics can physically contain other topics
An entire book could be marked up as a single XML document
consisting of one root topic and many child topics
Such a topic would be indistinguishable from any other similar
XML document (e.g., an NLM article, a DocBook document)
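As an illustration of the single-document case, here is a small sketch assuming a book marked up as one root `<topic>` with nested child topics and no map at all. The ids, titles, and the `table_of_contents` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# A hypothetical book stored as one XML document: a root topic containing
# child topics, with no map and no separate topic files.
BOOK = """
<topic id="book">
  <title>A Short Book</title>
  <body><p>Front matter.</p></body>
  <topic id="ch1">
    <title>Chapter One</title>
    <body><p>First chapter text.</p></body>
    <topic id="ch1-s1">
      <title>A Section</title>
      <body><p>Section text.</p></body>
    </topic>
  </topic>
  <topic id="ch2">
    <title>Chapter Two</title>
    <body><p>Second chapter text.</p></body>
  </topic>
</topic>
"""


def table_of_contents(topic, depth=0):
    """Return (depth, title) pairs for a topic and all of its nested topics."""
    entries = [(depth, topic.findtext("title"))]
    for child in topic.findall("topic"):
        entries.extend(table_of_contents(child, depth + 1))
    return entries


if __name__ == "__main__":
    for depth, title in table_of_contents(ET.fromstring(BOOK)):
        print("  " * depth + title)
```

Nothing in this sketch depends on modular storage; the hierarchy comes entirely from element nesting, as it would in any other book-oriented document type.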
25. DITA Myth Three: DITA Is Hard
DITA has lots of features, some quite sophisticated
Making full use of all these features requires understanding
those features, of course
But at its simplest, DITA is just like any other XML document
type for publications: sections, paragraphs, lists, figures,
tables, and inlines.
Thus, a DITA application need only be as sophisticated as you
need it to be to satisfy your specific requirements
Complexity and “difficulty” of DITA is concentrated in the data
processing requirements, not in authoring
Ability to easily define custom vocabularies means you can
optimize markup names and structures to reflect local culture
and practice
26. In Short: Why DITA? Why Not DITA?
DITA can be applied where any other applicable
XML standard can be applied
At lower absolute cost
With greater flexibility
With greater potential value
Cost of using DITA at worst no greater than using
XML generally
So why not use it?
27. Yeah, But…
I’ve said a lot of stuff
What do you need from me to be convinced?
29. Publishing-Specific Challenges
Existing vendor solutions and community knowledge
focused on tech doc requirements
Many vendors don’t really get DITA
Many tools don’t yet fully support specialization
Many older tools limited by architecture and
implementation choices made years ago
Many service providers still building understanding of DITA
Publishing requirement for high-quality composition
always a challenge for any XML-based solution
Publishers have different business drivers from tech
doc
30. Note to Vendors and Service Providers
Potential market for DITA as a publishing solution
orders of magnitude larger than potential market for
DITA as a tech doc solution
Many more publishers and units of publication than
tech doc producers
Tech doc is a cost center
Publishing is a profit center
In many ways, the value of DITA to publishing is
more compelling than it is for tech doc
Just saying…
31. Publishing: Open Toolkit Alone Won't Cut It
Pages are and will always be important
Need a path from DITA XML to publishing tools
InDesign
Quark
Etc.
No technical barrier to a generic DITA-to-InDesign
process
Products like Typefi could add significant value
Several advantages for Publishers:
Uses existing layout design skills, tools, and methods
Can be 100% automatic or include human tweaking
Can leverage Toolkit for preprocessing
32. Where Could Publishers Go From Here?
DITA as specialized in the DITA spec not always
appropriate for Publishers
Too constrained in some areas
Needs more ways to capture format intent
May not match existing publishing practice or conventions
well
Might be useful to define a separate publishing-
specific specialization family rooted at DITA topic
rather than at concept/task/reference
Can a novel be single topic or small set of topics?
[Yes]
Does that help? Does it hurt?
33. Business Process Improvement Implications
Base cost of using XML is essentially unchanged
Same challenges for legacy conversion
Applying XML at end of process
Using XML as input to revision process
Cost of developing initial markup design can be
significantly lower
Can have more generic, reusable processing
components
DITA encourages and enables small modules
Makes recombination at small granularity possible and
manageable
Adapts well to delivery to portable devices.
34. Business Improvement Implications (Cont.)
Incremental cost of DITA-based systems should go
down over time as infrastructure accretes
Enables local optimization of markup without
impeding interchange within an enterprise or across
enterprises
Provides controlled, formal framework for defining
common components used across parts of enterprise
or communities of interest
Enables use of more sophisticated features as
needed
35. Potential Information Economy Improvements
Reduce tight coupling between suppliers and
consumers (aggregators, republishers, etc.)
No need to agree on rigid, overly-general standards
to enable interchange
Supplier need not have full publishing infrastructure
in order to supply high-quality content
Information consumers can apply a generic DITA
processing infrastructure to content from many
suppliers
Increases value of information in DITA form
Reduces impedance of interchange
37. In Conclusion
DITA has lots of unique goodness of direct and
compelling value to publishers
DITA can be used in simple ways to good effect with
low cost of entry
DITA’s low cost and strong features represent a
compelling value for almost all XML-based publishing
use cases
Full DITA infrastructure still being developed
Off-the-shelf DITA-to-InDesign processes
Standard publishing-specific DITA modules
Community knowledge of how best to apply DITA to
publishing use cases
38. Questions?
39. Thank You
Eliot Kimber
Really Strategies
ekimber@reallysi.com