Introduction to XML
What is XML?
XML stands for EXtensible Markup Language
XML is a markup language much like HTML
XML was designed to describe data
XML tags are not predefined. You must define your
own tags
XML uses a Document Type Definition (DTD) or an
XML Schema to describe the data
XML with a DTD or XML Schema is designed to be self-
descriptive
XML is a W3C Recommendation
XML was designed to describe
data and to focus on what data
is.
HTML was designed to display
data and to focus on how data
looks.
XML is a W3C
Recommendation
The Extensible Markup Language
(XML) became a W3C
Recommendation on 10 February
1998.
The Main Difference Between XML and HTML
XML was designed to carry data.
XML is not a replacement for HTML.
XML and HTML were designed with different goals:
XML was designed to describe data and to focus on
what data is.
HTML was designed to display data and to focus on how
data looks.
HTML is about displaying information, while XML is
about describing information
XML Does not DO Anything
XML was not designed to DO anything.
Maybe it is a little hard to understand, but XML does
not DO anything. XML was created to structure, store
and to send information.
The following example is a note to Tove from Jani,
stored as XML:
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this
weekend!</body>
</note>
The note has a header and a
message body. It also has sender
and receiver information. But still,
this XML document does not DO
anything. It is just pure
information wrapped in XML tags.
Someone must write a piece of
software to send, receive or
display it.
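As a rough illustration of "a piece of software" that produces such a note, here is a minimal Python sketch using the standard library's xml.etree.ElementTree module. The function name build_note is an assumption made for this example; it is not part of the original slides.

```python
import xml.etree.ElementTree as ET

def build_note(to, sender, heading, body):
    """Build a <note> element from plain strings (hypothetical helper)."""
    note = ET.Element("note")
    ET.SubElement(note, "to").text = to
    ET.SubElement(note, "from").text = sender
    ET.SubElement(note, "heading").text = heading
    ET.SubElement(note, "body").text = body
    return note

note = build_note("Tove", "Jani", "Reminder", "Don't forget me this weekend!")

# Serialize the element tree to text so it can be stored or sent.
xml_text = ET.tostring(note, encoding="unicode")
print(xml_text)
```

The XML itself still does nothing; the program around it does the building, sending, or displaying.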
XML is Free and Extensible
XML tags are not predefined. You must "invent"
your own tags.
The tags used to mark up HTML documents and the
structure of HTML documents are predefined. The
author of HTML documents can only use tags that are
defined in the HTML standard (like <p>, <h1>, etc.).
XML allows the author to define his own tags and his
own document structure.
The tags in the example above (like <to> and <from>)
are not defined in any XML standard. These tags are
"invented" by the author of the XML document.
XML is a Complement to HTML
XML is not a replacement for HTML.
It is important to understand that XML is not a
replacement for HTML. In future Web development it is
most likely that XML will be used to describe the data,
while HTML will be used to format and display the same
data.
My best description of XML is this: XML is a cross-platform,
software- and hardware-independent
tool for transmitting information.
XML in Future Web Development
XML is going to be everywhere.
We have been participating in XML development since
its creation. It has been amazing to see how quickly the
XML standard has been developed and how quickly a
large number of software vendors have adopted the
standard.
We strongly believe that XML will be as important to the
future of the Web as HTML has been to the foundation
of the Web and that XML will be the most common tool
for all data manipulation and data transmission.
How can XML be Used?
XML can Separate Data from HTML
With XML, your data is stored outside your HTML.
When HTML is used to display data, the data is stored
inside your HTML. With XML, data can be stored in
separate XML files. This way you can concentrate on
using HTML for data layout and display, and be sure
that changes in the underlying data will not require any
changes to your HTML.
XML data can also be stored inside HTML pages as
"Data Islands". You can still concentrate on using HTML
only for formatting and displaying the data.
XML is Used to Exchange Data
With XML, data can be exchanged between
incompatible systems.
In the real world, computer systems and databases
contain data in incompatible formats. One of the most
time-consuming challenges for developers has been to
exchange data between such systems over the
Internet.
Converting the data to XML can greatly reduce this
complexity and create data that can be read by many
different types of applications
XML and B2B
With XML, financial information can be exchanged
over the Internet.
Expect to see a lot about XML and B2B (Business To
Business) in the near future.
XML is going to be the main language for exchanging
financial information between businesses over the
Internet. A lot of interesting B2B applications are under
development
XML Can be Used to Share Data
With XML, plain text files can be used to share
data.
Since XML data is stored in plain text format, XML
provides a software- and hardware-independent way of
sharing data.
This makes it much easier to create data that different
applications can work with. It also makes it easier to
expand or upgrade a system to new operating systems,
servers, applications, and new browsers.
XML Can be Used to Store
Data
With XML, plain text files can
be used to store data.
XML can also be used to store
data in files or in databases.
Applications can be written to
store and retrieve information
from the store, and generic
applications can be used to display
the data.
XML Can Make your Data More Useful
With XML, your data is available to more users.
Since XML is independent of hardware, software and
application, you can make your data available to more
than just standard HTML browsers.
Other clients and applications can access your XML files
as data sources, just as they access databases.
Your data can be made available to all kinds of "reading
machines" (agents), and it is easier to make your data
available for blind people, or people with other
disabilities.
XML Can be Used to Create
New Languages
XML is the mother of WAP and
WML.
The Wireless Markup Language
(WML), used to mark up Internet
applications for handheld devices
like mobile phones, is written in
XML.
XML Syntax Rules
The syntax rules of XML are
very simple and very strict.
The rules are very easy to
learn, and very easy to use.
Because of this, creating
software that can read and
manipulate XML is very easy.
An Example XML Document
XML documents use a self-describing and simple
syntax.
<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
The first line in the document - the XML declaration - defines the XML
version and the character encoding used in the
document. In this case the document conforms to the
1.0 specification of XML and uses the ISO-8859-1
(Latin-1/West European) character set.
The next line describes the root element of the
document (like it was saying: "this document is a
note"):
<note>
The next 4 lines describe 4 child elements of the
root (to, from, heading, and body):
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
And finally the last line defines the end of the root element:
</note>
All XML Elements Must Have a Closing Tag
With XML, it is illegal to omit the closing tag.
In HTML some elements do not have to have a closing
tag. The following code is legal in HTML:
<p>This is a paragraph
<p>This is another paragraph
In XML all elements must have a closing tag,
like this:
<p>This is a paragraph</p>
<p>This is another paragraph</p>
XML Tags are Case Sensitive
Unlike HTML, XML tags are case sensitive.
With XML, the tag <Letter> is different from the tag
<letter>.
Opening and closing tags must therefore be written
with the same case:
<Message>This is incorrect</message>
<message>This is correct</message>
XML Elements Must be Properly Nested
Improper nesting of tags makes no sense to XML.
In HTML some elements can be improperly nested
within each other like this:
<b><i>This text is bold and italic</b></i>
In XML all elements must be properly nested within each other like
this:
<b><i>This text is bold and italic</i></b>
XML Documents Must Have a Root Element
All XML documents must contain a single tag pair
to define a root element.
All other elements must be within this root element.
All elements can have sub elements (child elements).
Sub elements must be correctly nested within their
parent element:
<root>
<child>
<subchild>.....</subchild>
</child>
</root>
XML Attribute Values Must be
Quoted
With XML, it is illegal to omit quotation marks
around attribute values.
XML elements can have attributes in name/value pairs
just like in HTML. In XML the attribute value must
always be quoted. Study the two XML documents
below. The first one is incorrect, the second is correct:
<?xml version="1.0" encoding="ISO-8859-1"?>
<note date=12/11/2002>
<to>Tove</to>
<from>Jani</from>
</note>
<?xml version="1.0" encoding="ISO-8859-1"?>
<note date="12/11/2002">
<to>Tove</to>
<from>Jani</from>
</note>
The error in the first document is that the date attribute
in the note element is not quoted. This is correct:
date="12/11/2002". This is incorrect:
date=12/11/2002.
With XML, White Space is Preserved
With XML, the white space in your document is
not truncated.
This is unlike HTML. With HTML, a sentence like this:
Hello           my name is Tove,
will be displayed like this:
Hello my name is Tove,
because HTML reduces multiple, consecutive white
space characters to a single white space.
Comments in XML
The syntax for writing comments in XML is similar to that of
HTML.
<!-- This is a comment -->
There is Nothing Special About XML
There is nothing special about XML. It is just plain text with
the addition of some XML tags enclosed in angle brackets.
Software that can handle plain text can also handle XML. In a
simple text editor, the XML tags will be visible and will not be
handled specially.
In an XML-aware application however, the XML tags can be
handled specially. The tags may or may not be visible, or
have a functional meaning, depending on the nature of the
application.
XML Elements
XML Elements are Extensible
XML documents can be extended to carry more information.
Look at the following XML NOTE example:
<note>
<to>Tove</to>
<from>Jani</from>
<body>Don't forget me this weekend!</body>
</note>
Let's imagine that we created an application that extracted
the <to>, <from>, and <body> elements from the XML
document to produce this output:
MESSAGE To: Tove
From: Jani
Don't forget me this weekend!
Imagine that the author of the XML document added
some extra information to it:
<note>
<date>2002-08-01</date>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
Should the application break or crash?
No. The application should still be able to find the <to>,
<from>, and <body> elements in the XML document
and produce the same output.
XML documents are Extensible
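The imagined application can be sketched in a few lines of Python. This is only one possible way to write it, assuming the standard library's xml.etree.ElementTree; the function name render_message is invented for the example. Because it looks elements up by name, the added <date> and <heading> elements do not disturb it.

```python
import xml.etree.ElementTree as ET

ORIGINAL = """<note>
<to>Tove</to>
<from>Jani</from>
<body>Don't forget me this weekend!</body>
</note>"""

EXTENDED = """<note>
<date>2002-08-01</date>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>"""

def render_message(xml_text):
    """Extract <to>, <from> and <body> by name and format them."""
    note = ET.fromstring(xml_text)
    return "MESSAGE To: {}\nFrom: {}\n{}".format(
        note.findtext("to"), note.findtext("from"), note.findtext("body"))

print(render_message(ORIGINAL))
# The extended document produces exactly the same output.
print(render_message(ORIGINAL) == render_message(EXTENDED))
```

An application that relied on element positions (for example "the first child is <to>") would not survive the extension, which is why lookup by name is the safer design.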
XML Elements have
Relationships
Elements are related as parents and
children.
To understand XML terminology, you have to
know how relationships between XML elements
are named, and how element content is
described.
Imagine that this is a description of a book:
My First XML
Introduction to XML
What is HTML
What is XML
XML Syntax
Elements must have a closing tag
Elements must be properly nested
Imagine that this XML document describes the
book:
<book>
<title>My First XML</title>
<prod id="33-657" media="paper"></prod>
<chapter>Introduction to XML
<para>What is HTML</para>
<para>What is XML</para>
</chapter>
<chapter>XML Syntax
<para>Elements must have a closing tag</para>
<para>Elements must be properly nested</para>
</chapter>
</book>
Book is the root element. Title,
prod, and chapter are child
elements of book. Book is the
parent element of title, prod,
and chapter. Title, prod, and
chapter are siblings (or sister
elements) because they have the
same parent
Elements have Content
Elements can have different content types.
An XML element is everything from (including) the
element's start tag to (including) the element's end tag.
An element can have element content, mixed content,
simple content, or empty content. An element can
also have attributes.
In the example above, book has element content,
because it contains other elements. Chapter has mixed
content because it contains both text and other
elements. Para has simple content (or text content)
because it contains only text. Prod has empty content,
because it carries no information.
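The content types and attributes of the book example can be inspected with a short Python sketch. It assumes the standard library's xml.etree.ElementTree; the variable names are chosen for illustration only.

```python
import xml.etree.ElementTree as ET

BOOK = """<book>
<title>My First XML</title>
<prod id="33-657" media="paper"></prod>
<chapter>Introduction to XML
<para>What is HTML</para>
<para>What is XML</para>
</chapter>
<chapter>XML Syntax
<para>Elements must have a closing tag</para>
<para>Elements must be properly nested</para>
</chapter>
</book>"""

book = ET.fromstring(BOOK)

# Element content: book contains only other elements.
children = [child.tag for child in book]

# Empty content plus attributes: prod has no text and no children.
prod = book.find("prod")
print(prod.attrib, prod.text, len(prod))

# Mixed content: a chapter holds text and para child elements.
first_chapter = book.find("chapter")
chapter_title = first_chapter.text.strip()

# Simple content: each para holds only text.
paras = [para.text for para in first_chapter.findall("para")]
print(children, chapter_title, paras)
```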
In the example above only the prod element has
attributes. The attribute named id has the value
"33-657". The attribute named media has the value
"paper".
Element Naming
XML elements must follow these naming rules:
Names can contain letters, numbers, and other
characters
Names must not start with a number or punctuation
character
Names must not start with the letters xml (or XML, or
Xml, etc)
Names cannot contain spaces
Take care when you "invent" element names and follow
these simple rules:
Any name can be used, no words are reserved, but the
idea is to make names descriptive. Names with an
underscore separator are nice.
Examples: <first_name>, <last_name>.
Avoid "-" and "." in names. For example, if you name something
"first-name," your software may try to subtract
name from first. Or if you name something "first.name," your
software may think that "name" is a property of the object "first."
Element names can be as long as you like, but don't exaggerate.
Names should be short and simple, like this: <book_title> not like
this: <the_title_of_the_book>.
XML documents often have a corresponding database, in which fields
exist corresponding to elements in the XML document. A good
practice is to use the naming rules of your database for the elements
in the XML documents.
Non-English letters like éòá are perfectly legal in XML element
names, but watch out for problems if your software vendor doesn't
support them.
The ":" should not be used in element names because it is reserved
to be used for something called namespaces (more later).
XML Attributes
XML elements can have attributes.
From HTML you will remember this: <IMG
SRC="computer.gif">. The SRC attribute
provides additional information about the IMG
element.
In HTML (and in XML) attributes provide
additional information about elements:
<img src="computer.gif">
<a href="demo.asp">
Attributes often provide
information that is not a part of the data. In
the example below, the file type is irrelevant to
the data, but important to the software that
wants to manipulate the element:
<file type="gif">computer.gif</file>
Quote Styles, "female" or 'female'?
Attribute values must always be enclosed in
quotes, but either single or double quotes can
be used. For a person's sex, the person tag
can be written like this:
<person sex="female">
or like this:
<person sex='female'>
Note: If the attribute
value itself contains double quotes it is
necessary to use single quotes, like in this
example:
<gangster name='George "Shotgun" Ziegler'>
Note: If the attribute value itself
contains single quotes it is
necessary to use double quotes,
like in this example:
<gangster name="George 'Shotgun' Ziegler">
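The same choice of quote style can be left to a library. The sketch below uses quoteattr from Python's standard xml.sax.saxutils module, which returns the value already wrapped in whichever quote character does not clash with the content.

```python
from xml.sax.saxutils import quoteattr

# Value contains double quotes, so single quotes are used around it.
with_double = quoteattr('George "Shotgun" Ziegler')

# Value contains single quotes, so double quotes are used around it.
with_single = quoteattr("George 'Shotgun' Ziegler")

print("<gangster name=" + with_double + " />")
print("<gangster name=" + with_single + " />")
```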
Use of Elements vs. Attributes
Data can be stored in child
elements or in attributes.
Take a look at these examples:
<person sex="female">
<firstname>Anna</firstname>
<lastname>Smith</lastname>
</person>
<person>
<sex>female</sex>
<firstname>Anna</firstname>
<lastname>Smith</lastname>
</person>
In the first example sex is an attribute.
In the last, sex is a child element. Both
examples provide the same
information.
There are no rules about when to use
attributes, and when to use child
elements. My experience is that
attributes are handy in HTML, but in
XML you should try to avoid them. Use
child elements if the information feels
like data.
I like to store data in child elements.
The following three XML documents contain
exactly the same information:
A date attribute is used in the first example:
<note date="12/11/2002">
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
A date element is used in the
second example:
<note>
<date>12/11/2002</date>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
An expanded date element is used
in the third: (THIS IS MY
FAVORITE):
<note>
<date>
<day>12</day>
<month>11</month>
<year>2002</year>
</date>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
Avoid using attributes?
Should you avoid using attributes?
Some of the problems with using attributes are:
attributes cannot contain multiple values (child elements
can)
attributes are not easily expandable (for future changes)
attributes cannot describe structures (child elements can)
attributes are more difficult to manipulate by program
code
attribute values are not easy to test against a Document
Type Definition (DTD) - which is used to define the legal
elements of an XML document
If you use attributes as containers for data,
you end up with documents that are difficult to
read and maintain. Try to use elements to
describe data. Use attributes only to provide
information that is not relevant to the data.
Don't end up like this (this is not how XML
should be used):
<note day="12" month="11" year="2002"
to="Tove" from="Jani" heading="Reminder"
body="Don't forget me this weekend!">
</note>
An Exception to my Attribute
Rule
Rules always have exceptions.
My rule about attributes has one exception:
Sometimes I assign ID references to elements. These
ID references can be used to access XML elements in
much the same way as the NAME or ID attributes in
HTML. This example demonstrates this:
<messages>
<note id="p501">
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
<note id="p502">
<to>Jani</to>
<from>Tove</from>
<heading>Re: Reminder</heading>
<body>I will not!</body>
</note>
</messages>
The ID in these examples is just a
counter, or a unique identifier, to
identify the different notes in the
XML file, and not a part of the
note data.
What I am trying to say here is
that metadata (data about data)
should be stored as attributes,
and that data itself should be
stored as elements.
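To show how such an id attribute can be used to access a particular note, here is a minimal Python sketch. It assumes the standard library's xml.etree.ElementTree and its limited XPath support; the helper name find_note is invented for the example.

```python
import xml.etree.ElementTree as ET

MESSAGES = """<messages>
<note id="p501">
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
<note id="p502">
<to>Jani</to>
<from>Tove</from>
<heading>Re: Reminder</heading>
<body>I will not!</body>
</note>
</messages>"""

root = ET.fromstring(MESSAGES)

def find_note(messages, note_id):
    """Return the <note> whose id attribute equals note_id, or None.

    Assumes note_id contains no quote characters.
    """
    return messages.find("note[@id='{}']".format(note_id))

reply = find_note(root, "p502")
print(reply.findtext("heading"), "/", reply.findtext("body"))
```

The id never appears in the rendered message; it only selects which note to read, which is the metadata role described above.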
XML Validation
Well Formed XML Documents
A "Well Formed" XML document has
correct XML syntax.
A "Well Formed" XML document is a document
that conforms to the XML syntax rules that
were described in the previous chapters:
XML documents must have a root element
XML elements must have a closing tag
XML tags are case sensitive
XML elements must be properly nested
XML attribute values must always be quoted
<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
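The well-formedness rules can be checked mechanically: any conforming XML parser rejects a document that breaks them. The Python sketch below assumes the standard library's xml.etree.ElementTree; the function name is_well_formed is invented for the example, and the test strings reuse the incorrect snippets from the syntax rules.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True

checks = {
    "correct note": "<note><to>Tove</to><from>Jani</from></note>",
    "missing closing tag": "<p>This is a paragraph",
    "case mismatch": "<Message>This is incorrect</message>",
    "improper nesting": "<b><i>This text is bold and italic</b></i>",
    "unquoted attribute": "<note date=12/11/2002></note>",
    "two root elements": "<note></note><note></note>",
}

for label, text in checks.items():
    print(label, "->", is_well_formed(text))
```

Only the first entry passes. Note that this checks well-formedness only; it says nothing about validity against a DTD.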
Valid XML Documents
A "Valid" XML document also
conforms to a DTD.
A "Valid" XML document is a "Well
Formed" XML document, which also
conforms to the rules of a Document
Type Definition (DTD):
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE note SYSTEM "InternalNote.dtd">
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
XML DTD
A DTD defines the legal
elements of an XML document.
The purpose of a DTD is to define
the legal building blocks of an XML
document. It defines the
document structure with a list of
legal elements
Introduction to DTD
Internal DOCTYPE declaration
If the DTD is included in your XML
source file, it should be wrapped
in a DOCTYPE definition with the
following syntax:
<!DOCTYPE root-element
[element-declarations]>
<?xml version="1.0"?>
<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend</body>
</note>
The DTD above is interpreted like this:
!DOCTYPE note (in line 2) defines that this is a document of
the type note.
!ELEMENT note (in line 3) defines the note element as
having four elements: "to,from,heading,body".
!ELEMENT to (in line 4) defines the to element to be of the
type "#PCDATA".
!ELEMENT from (in line 5) defines the from element to be of
the type "#PCDATA"
and so on.....
External DOCTYPE declaration
If the DTD is external to your XML
source file, it should be wrapped
in a DOCTYPE definition with the
following syntax:
<!DOCTYPE root-element SYSTEM
"filename">
<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "note.dtd">
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
And this is a copy of the file "note.dtd" containing the DTD:
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
Why use a DTD?
With a DTD, each of your XML files can
carry a description of its own format
with it.
With a DTD, independent groups of
people can agree to use a common
DTD for interchanging data.
Your application can use a standard
DTD to verify that the data you receive
from the outside world is valid.
You can also use a DTD to verify your
own data.
DTD - XML building blocks
The building blocks of XML
documents
Seen from a DTD point of view, all
XML documents (and HTML
documents) are made up of the
following simple building blocks:
Elements
Attributes
Entities
PCDATA
CDATA
Elements
Elements are the main building
blocks of both XML and HTML
documents.
Examples of HTML elements are "body"
and "table". Examples of XML elements
could be "note" and "message".
Elements can contain text, other
elements, or be empty. Examples of
empty HTML elements are "hr", "br"
and "img".
Examples:
<body>body text in between</body>
<message>some message in between</message>
Attributes
Attributes provide extra information
about elements.
Attributes are always placed inside the
starting tag of an element. Attributes
always come in name/value pairs. The
following "img" element has additional
information about a source file:
<img src="computer.gif" />
The name of
the element is "img". The name of the
attribute is "src". The value of the attribute
is "computer.gif". Since the element itself
is empty it is closed by a " /".
Entities
Entities are variables used to define
common text. Entity references are
references to entities.
Most of you will know the HTML entity
reference: "&nbsp;". This "non-breaking space"
entity is used in HTML
to insert an extra space in a document.
Entities are expanded when a
document is parsed by an XML parser.
The following entities are
predefined in XML:
Entity References Character
&lt; <
&gt; >
&amp; &
&quot; "
&apos; '
PCDATA
PCDATA means parsed character data.
Think of character data as the text found between the
start tag and the end tag of an XML element.
PCDATA is text that will be parsed by a parser. Tags
inside the text will be treated as markup and entities
will be expanded.
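Because PCDATA is parsed, a literal < or & inside element text has to be written as one of the predefined entity references. The Python sketch below uses escape and unescape from the standard xml.sax.saxutils module and the xml.etree.ElementTree parser; the element name m and the sample strings are made up for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, unescape

raw = "if a < b & b > c"

# escape() replaces &, < and > with &amp;, &lt; and &gt;.
escaped = escape(raw)
print(escaped)

# unescape() reverses the replacement.
restored = unescape(escaped)

# Quotes are only replaced when asked for via the extra mapping.
quoted = escape("\"Tove\" & 'Jani'", {'"': "&quot;", "'": "&apos;"})
print(quoted)

# A parser expands all five predefined entities while reading PCDATA.
expanded = ET.fromstring("<m>&lt;&gt;&amp;&quot;&apos;</m>").text
print(expanded)
```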
CDATA
CDATA also means character data.
CDATA is text that will NOT be parsed by a parser. Tags
inside the text will NOT be treated as markup and
entities will not be expanded.
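The difference between parsed and unparsed character data can be seen by parsing the same kind of text twice. This Python sketch assumes the standard library's xml.etree.ElementTree and uses a CDATA section (the <![CDATA[ ... ]]> construct, which the slides do not show) inside an invented element named sample.

```python
import xml.etree.ElementTree as ET

# PCDATA: entity references are expanded by the parser.
parsed = ET.fromstring("<sample>1 &lt; 2 &amp; 3 &gt; 2</sample>")
print(parsed.text)

# CDATA section: tags and entity references are kept as literal text.
literal = ET.fromstring("<sample><![CDATA[<b>1 &lt; 2</b>]]></sample>")
print(literal.text)

# The <b> inside the CDATA section did not become a child element.
print(len(literal))
```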
DTD - Elements
Declaring an Element
In the DTD, XML elements are
declared with an element
declaration. An element
declaration has the following
syntax:
<!ELEMENT element-name category>
or
<!ELEMENT element-name (element-content)>
Empty elements
Empty elements are declared with
the category keyword EMPTY:
<!ELEMENT element-name EMPTY>
example:
<!ELEMENT br EMPTY>
XML example:
<br />
Elements with only character
data
Elements with only character data
are declared with #PCDATA inside
parentheses:
<!ELEMENT element-name (#PCDATA)>
example:
<!ELEMENT from (#PCDATA)>
Elements with any contents
Elements declared with the
category keyword ANY, can
contain any combination of
parsable data:
<!ELEMENT element-name ANY>
example:
<!ELEMENT note ANY>
Elements with children
(sequences)
Elements with one or more children
are defined with the name of the
children elements inside parentheses:
<!ELEMENT element-name (child-element-name)>
or
<!ELEMENT element-name (child-element-name,child-element-name,.....)>
example:
<!ELEMENT note (to,from,heading,body)>
When children are declared in a
sequence separated by commas, the
children must appear in the same
sequence in the document. In a full
declaration, the children must also be
declared, and the children can also
have children. The full declaration of
the "note" element will be:
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
Declaring only one occurrence
of the same element
<!ELEMENT element-name (child-name)>
example:
<!ELEMENT note (message)>
The example
declaration above declares that
the child element message must
occur once, and only once inside
the "note" element.
Declaring minimum one
occurrence of the same
element
<!ELEMENT element-name (child-name+)>
example:
<!ELEMENT note (message+)>
The + sign in
the example above declares that
the child element message must
occur one or more times inside
the "note" element.
Declaring zero or more
occurrences of the same
element
<!ELEMENT element-name (child-name*)>
example:
<!ELEMENT note (message*)>
The * sign in
the example above declares that
the child element message can
occur zero or more times inside
the "note" element.
Declaring zero or one
occurrences of the same
element
<!ELEMENT element-name (child-name?)>
example:
<!ELEMENT note (message?)>
The ? sign in
the example above declares that
the child element message can
occur zero or one times inside the
"note" element.
Declaring either/or content
example:
<!ELEMENT note (to,from,header,(message|body))>
The example above
declares that the "note" element
must contain a "to" element, a
"from" element, a "header"
element, and either a "message"
or a "body" element.
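The occurrence indicators and the either/or bar behave much like the same symbols in regular expressions. The Python sketch below exploits that similarity to test an element's children against a simple content model. It is an illustration only, not a DTD validator: the function names are invented, and #PCDATA, EMPTY, ANY and attribute declarations are not handled.

```python
import re
import xml.etree.ElementTree as ET

# Matches an element name, a comma, or white space inside a content model.
NAME_OR_SEPARATOR = re.compile(r"[A-Za-z_][\w.-]*|,|\s+")

def model_to_regex(model):
    """Turn a model such as '(to,from,(message|body))' into a regex."""
    def translate(match):
        token = match.group(0)
        if token == "," or token.isspace():
            return ""  # a sequence is plain concatenation in a regex
        # Each name must be followed by one space, see children_match().
        return "(?:" + re.escape(token) + " )"
    # Parentheses, |, ?, * and + are left as they are.
    return NAME_OR_SEPARATOR.sub(translate, model)

def children_match(element, model):
    """Check the child element names of element against the model."""
    child_names = "".join(child.tag + " " for child in element)
    return re.fullmatch(model_to_regex(model), child_names) is not None

two_messages = ET.fromstring("<note><message/><message/></note>")
for model in ["(message)", "(message+)", "(message*)", "(message?)"]:
    print(model, children_match(two_messages, model))

full = ET.fromstring("<note><to/><from/><header/><body/></note>")
print(children_match(full, "(to,from,header,(message|body))"))
```

With two message children, only the + and * models accept the note; the either/or model accepts a note that ends in body or in message, but not both.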
Declaring mixed content
example:
<!ELEMENT note (#PCDATA|to|from|header|message)*>
The example above
declares that the "note" element
can contain zero or more
occurrences of parsed character data,
"to", "from", "header", or
"message" elements.
DTD - Attributes
Declaring Attributes
An attribute declaration has the
following syntax:
<!ATTLIST element-name attribute-name attribute-type default-value>
DTD example:
<!ATTLIST payment type CDATA "check">
XML example:
<payment type="check" />
The attribute-type can have the
following values:
Value          Explanation
CDATA          The value is character data
(en1|en2|..)   The value must be one from an enumerated list
ID             The value is a unique id
IDREF          The value is the id of another element
IDREFS         The value is a list of other ids
NMTOKEN        The value is a valid XML name token
NMTOKENS       The value is a list of valid XML name tokens
ENTITY         The value is an entity
ENTITIES       The value is a list of entities
NOTATION       The value is a name of a notation
xml:           The value is a predefined xml value
The default-value can have the following values:
Value          Explanation
value          The default value of the attribute
#REQUIRED      The attribute value must be included in the element
#IMPLIED       The attribute does not have to be included
#FIXED value   The attribute value is fixed
Specifying a Default attribute
value
DTD:
<!ELEMENT square EMPTY>
<!ATTLIST square width CDATA "0">
Valid XML:
<square width="100" />
In the example above,
the "square" element is defined to be
an empty element with a "width"
attribute of type CDATA. If no width is
specified, it has a default value of 0.
#IMPLIED
Syntax
<!ATTLIST element-name attribute-name attribute-type #IMPLIED>
Example
DTD: <!ATTLIST contact fax CDATA #IMPLIED>
Valid XML: <contact fax="555-667788" />
Valid XML: <contact />
#REQUIRED
Syntax
<!ATTLIST element-name attribute-name attribute-type #REQUIRED>
Example
DTD: <!ATTLIST person number CDATA #REQUIRED>
Valid XML: <person number="5677" />
Invalid XML: <person />
Use the #REQUIRED
keyword if you don't have an option for
a default value, but still want to force
the attribute to be present.
#FIXED
Syntax
<!ATTLIST element-name attribute-name attribute-type #FIXED "value">
Example
DTD: <!ATTLIST sender company CDATA #FIXED "Microsoft">
Valid XML: <sender company="Microsoft" />
Invalid XML: <sender company="abc" />
Use the
#FIXED keyword when you want an
attribute to have a fixed value without
allowing the author to change it. If an
author includes another value, the XML
parser will return an error.
Enumerated attribute values
Syntax: <!ATTLIST element-name attribute-name (en1|en2|..) default-value>
DTD example:
<!ATTLIST payment type (check|cash) "cash">
XML example:
<payment type="check" /> or <payment type="cash" />
Use enumerated
attribute values when you want the attribute
values to be one of a fixed set of legal values.
DTD - Entities
Entities are variables used to define shortcuts
to common text.
- Entity references are references to entities.
- Entities can be declared internal, or external
Internal Entity Declaration
Syntax: <!ENTITY entity-name "entity-value">
DTD Example:
<!ENTITY writer "Donald Duck.">
<!ENTITY copyright "Copyright imsec">
XML example:
<author>&writer;&copyright;</author>
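The expansion of internal entities can be observed with an ordinary XML parser. The sketch below assumes Python's standard xml.etree.ElementTree, which expands entities declared in the internal DTD subset while parsing; the declarations are placed in an internal DOCTYPE so that the example is self-contained.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE author [
<!ENTITY writer "Donald Duck.">
<!ENTITY copyright "Copyright imsec">
]>
<author>&writer;&copyright;</author>"""

author = ET.fromstring(DOCUMENT)

# Both entity references have been replaced by their declared values.
print(author.text)
```

Entity expansion is also a known attack surface, so a parser used on untrusted input is usually configured more strictly than in this sketch.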
External Entity Declaration
Syntax: <!ENTITY entity-name SYSTEM "URI/URL">
DTD Example:
<!ENTITY writer SYSTEM "http://www.abc.com/dtd/entities.dtd">
<!ENTITY copyright SYSTEM "http://www.imsec.com/dtd/entities.dtd">
XML example:
<author>&writer;&copyright;</author>
DTD Summary
This tutorial has taught you how to describe
the structure of an XML document.
You have learned how to use a DTD to define
the legal elements of an XML document, and
how the DTD can be declared inside your XML
document, or as an external reference.
You have learned how to declare the legal
elements, attributes, entities, and CDATA
sections for XML documents.
You have also seen how to validate an XML
document against a DTD.
Introduction to XML Schema
What is an XML Schema?
The purpose of an XML Schema is to define the legal
building blocks of an XML document, just like a DTD.
An XML Schema:
defines elements that can appear in a document
defines attributes that can appear in a document
defines which elements are child elements
defines the order of child elements
defines the number of child elements
defines whether an element is empty or can include
text
defines data types for elements and attributes
defines default and fixed values for elements and
attributes
XML Schemas are the Successors
of DTDs
We think that very soon XML Schemas
will be used in most Web applications
as a replacement for DTDs. Here are
some reasons:
XML Schemas are extensible to future
additions
XML Schemas are richer and more
powerful than DTDs
XML Schemas are written in XML
XML Schemas support data types
XML Schemas support namespaces
Why Use XML Schemas?
XML Schemas Support Data Types
One of the greatest strengths of XML Schemas
is the support for data types.
With support for data types:
It is easier to describe allowable document
content
It is easier to validate the correctness of data
It is easier to work with data from a database
It is easier to define data facets (restrictions
on data)
It is easier to define data patterns (data
formats)
It is easier to convert data between different
data types
XML Schemas use XML Syntax
Another great strength of XML Schemas is
that they are written in XML.
Some benefits of XML Schemas being
written in XML:
You don't have to learn a new language
You can use your XML editor to edit your
Schema files
You can use your XML parser to parse your
Schema files
You can manipulate your Schema with the XML
DOM
You can transform your Schema with XSLT
XML Schemas Secure Data Communication
When sending data from a sender to a receiver, it is
essential that both parties have the same "expectations"
about the content.
With XML Schemas, the sender can describe the data in
a way that the receiver will understand.
A date like: "03-11-2004" will, in some countries, be
interpreted as 3 November and in other countries as
11 March.
However, an XML element with a data type like this:
<date type="date">2004-03-11</date>
ensures a mutual understanding of the content,
because the XML data type "date" requires the format
"YYYY-MM-DD".
XML Schemas are Extensible
XML Schemas are extensible, because
they are written in XML.
With an extensible Schema definition you
can:
Reuse your Schema in other Schemas
Create your own data types derived from
the standard types
Reference multiple schemas in the same
document
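The first two points can be sketched in a single schema. This is only an illustration, not part of the original slides: the file name "common.xsd" and the type name "headingType" are hypothetical, and the included file is assumed to use the same target namespace.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.w3schools.com"
           xmlns="http://www.w3schools.com"
           elementFormDefault="qualified">

  <!-- Reuse: pull in the declarations of another schema file
       (common.xsd is a hypothetical file name) -->
  <xs:include schemaLocation="common.xsd"/>

  <!-- Derive a custom type from the built-in xs:string type -->
  <xs:simpleType name="headingType">
    <xs:restriction base="xs:string">
      <xs:maxLength value="80"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Use the derived type like any built-in type -->
  <xs:element name="heading" type="headingType"/>
</xs:schema>
```

Because the schema is itself XML, the include and the derived type are ordinary elements that any XML editor or parser can read.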
Well-Formed is not Enough
A well-formed XML document is a document that conforms to
the XML syntax rules, like:
it should begin with the XML declaration (recommended, but optional in XML 1.0)
it must have one unique root element
start-tags must have matching end-tags
element names are case sensitive
all elements must be closed
all elements must be properly nested
all attribute values must be quoted
entities must be used for special characters
Even if documents are well-formed they can still contain
errors, and those errors can have serious consequences.
Think of the following situation: you order 5 gross of laser
printers, instead of 5 laser printers. With XML Schemas, most
of these errors can be caught by your validating software.
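A minimal sketch of how a schema could catch the order mistake above. The element names "order", "item" and "quantity" and the upper limit of 100 are assumptions made for this illustration only.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="item" type="xs:string"/>
        <!-- quantity must be a whole number from 1 to 100 -->
        <xs:element name="quantity">
          <xs:simpleType>
            <xs:restriction base="xs:positiveInteger">
              <xs:maxInclusive value="100"/>
            </xs:restriction>
          </xs:simpleType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

With this schema, <quantity>5</quantity> validates. Both <quantity>5 gross</quantity> (not a number) and <quantity>720</quantity> (above the limit) are well-formed XML, but a validating parser rejects them.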
A Simple XML Document
Look at this simple XML document
called "note.xml":
<?xml version="1.0"?>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
A DTD File
The following example is a DTD file called "note.dtd"
that defines the elements of the XML document above
("note.xml"):
<!ELEMENT note (to, from, heading, body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
The first line defines the note element to have four child elements:
"to, from, heading, body".
Lines 2-5 define the to, from, heading, and body elements
to be of type "#PCDATA".
An XML Schema
The following example is an XML
Schema file called "note.xsd" that
defines the elements of the XML
document above ("note.xml"):
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.w3schools.com"
           xmlns="http://www.w3schools.com"
           elementFormDefault="qualified">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="heading" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
A Reference to a DTD
This XML document has a
reference to a DTD:
<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "http://www.w3schools.com/dtd/note.dtd">
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
A Reference to an XML Schema
This XML document has a reference to
an XML Schema:
<?xml version="1.0"?>
<note xmlns="http://www.w3schools.com"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3schools.com note.xsd">
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
XSD - The <schema> Element
The <schema> Element
The <schema> element is the
root element of every XML
Schema:
<?xml version="1.0"?>
<xs:schema>
...
...
</xs:schema>
The <schema> element may contain
some attributes. A schema declaration
often looks something like this:
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.w3schools.com"
           xmlns="http://www.w3schools.com"
           elementFormDefault="qualified">
...
...
</xs:schema>
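The role of each attribute in such a declaration can be summarised as in the annotated sketch below. The comment text is an explanatory addition rather than part of the original slides, and the single "note" element is only a placeholder body.

```xml
<?xml version="1.0"?>
<!--
  xmlns:xs            binds the xs: prefix to the XML Schema namespace, so
                      xs:element, xs:string and so on are schema vocabulary
  targetNamespace     the namespace that the elements declared in this
                      schema belong to
  xmlns               makes that same namespace the default for unprefixed
                      names used inside the schema
  elementFormDefault  "qualified" means locally declared elements must be
                      namespace-qualified in instance documents
-->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.w3schools.com"
           xmlns="http://www.w3schools.com"
           elementFormDefault="qualified">
  <!-- placeholder content so the schema is complete -->
  <xs:element name="note" type="xs:string"/>
</xs:schema>
```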
XSD Simple Elements
What is a Simple Element?
A simple element is an XML element that
can contain only text. It cannot contain any
other elements or attributes.
However, the "only text" restriction is quite
misleading. The text can be of many
different types. It can be one of the types
included in the XML Schema definition
(boolean, string, date, etc.), or it can be a
custom type that you can define yourself.
You can also add restrictions (facets) to a
data type in order to limit its content, or
you can require the data to match a
specific pattern.
Defining a Simple Element
The syntax for defining a simple element is:
<xs:element name="xxx" type="yyy"/>
where xxx is the name of the element and yyy
is the data type of the element. XML Schema
has a lot of built-in data types. The most
common types are:
xs:string
xs:decimal
xs:integer
xs:boolean
xs:date
xs:time
Example
Here are some XML elements:
<lastname>Refsnes</lastname>
<age>36</age>
<dateborn>1970-03-27</dateborn>
And here are the corresponding simple element definitions:
<xs:element name="lastname" type="xs:string"/>
<xs:element name="age" type="xs:integer"/>
<xs:element name="dateborn" type="xs:date"/>
Default and Fixed Values for Simple Elements
Simple elements may have a default value OR a fixed
value specified.
A default value is automatically assigned to the element
when no other value is specified.
In the following example the default value is "red":
<xs:element name="color" type="xs:string" default="red"/>
A fixed value is also automatically assigned to the element, and you
cannot specify another value.
In the following example the fixed value is "red":
<xs:element name="color" type="xs:string" fixed="red"/>
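To see how the two settings behave during validation, here is a small sketch. The "car" and "wheels" names are hypothetical and only serve to put a default and a fixed value side by side.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="car">
    <xs:complexType>
      <xs:sequence>
        <!-- an empty <color/> is read as "red" -->
        <xs:element name="color" type="xs:string" default="red"/>
        <!-- <wheels> may only be empty or contain 4 -->
        <xs:element name="wheels" type="xs:integer" fixed="4"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

An instance document that validates against this sketch:

```xml
<?xml version="1.0"?>
<car>
  <color/>            <!-- empty, so the value is taken to be "red" -->
  <wheels>4</wheels>  <!-- matches the fixed value; 5 would be rejected -->
</car>
```

Note that for elements the default or fixed value is applied when the element is present but empty.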
XSD Attributes
What is an Attribute?
Simple elements cannot have
attributes. If an element has
attributes, it is considered to be of
a complex type. But the attribute
itself is always declared as a
simple type.
How to Define an Attribute?
The syntax for defining an attribute is:
<xs:attribute name="xxx" type="yyy"/>
where xxx is the name of the attribute and yyy
specifies the data type of the attribute. XML
Schema has a lot of built-in data types. The
most common types are:
xs:string
xs:decimal
xs:integer
xs:boolean
xs:date
xs:time
Example
Here is an XML element with an
attribute:
<lastname lang="EN">Smith</lastname>
And here is the corresponding attribute definition:
<xs:attribute name="lang" type="xs:string"/>
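An attribute definition does not stand alone; it sits inside the complex type of the element that carries it. A minimal sketch of one way to declare the lastname element above, with text content plus the lang attribute (the surrounding schema is an illustration, not taken from the slides):

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="lastname">
    <xs:complexType>
      <!-- simpleContent: the element holds text only, plus attributes -->
      <xs:simpleContent>
        <xs:extension base="xs:string">
          <xs:attribute name="lang" type="xs:string"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
</xs:schema>
```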
Default and Fixed Values for Attributes
Attributes may have a default value OR a fixed value
specified.
A default value is automatically assigned to the
attribute when no other value is specified.
In the following example the default value is "EN":
<xs:attribute name="lang" type="xs:string" default="EN"/>
A fixed value is also automatically assigned to the attribute, and you
cannot specify another value.
In the following example the fixed value is "EN":
<xs:attribute name="lang" type="xs:string" fixed="EN"/>
Optional and Required
Attributes
Attributes are optional by default.
To specify that the attribute is
required, use the "use" attribute:
<xs:attribute name="lang" type="xs:string" use="required"/>
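A short sketch of optional and required attributes side by side. It borrows the prod element (with its id and media attributes) from the book example earlier in the deck; treating id as required and media as optional is an assumption made for illustration.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="prod">
    <xs:complexType>
      <!-- id must always be present -->
      <xs:attribute name="id" type="xs:string" use="required"/>
      <!-- media may be omitted (use="optional" is the default) -->
      <xs:attribute name="media" type="xs:string"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

With this schema, <prod id="33-657" media="paper"/> and <prod id="33-657"/> both validate, while <prod media="paper"/> is rejected because the required id attribute is missing.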
Restrictions on Content
When an XML element or attribute has a data
type defined, it puts restrictions on the
element's or attribute's content.
If an XML element is of type "xs:date" and
contains a string like "Hello World", the
element will not validate.
With XML Schemas, you can also add your own
restrictions to your XML elements and
attributes. These restrictions are called facets.
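A minimal sketch of three common facets. The age and color element names come from the earlier examples; the "initials" element and all the limits chosen here are hypothetical.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Range facets: age must be an integer from 0 to 120 -->
  <xs:element name="age">
    <xs:simpleType>
      <xs:restriction base="xs:integer">
        <xs:minInclusive value="0"/>
        <xs:maxInclusive value="120"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>

  <!-- Enumeration facet: color must be one of the listed values -->
  <xs:element name="color">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:enumeration value="red"/>
        <xs:enumeration value="green"/>
        <xs:enumeration value="blue"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>

  <!-- Pattern facet: initials must be exactly two uppercase letters -->
  <xs:element name="initials">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:pattern value="[A-Z]{2}"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>

</xs:schema>
```

Under this sketch <age>36</age> validates and <age>150</age> does not; <color>yellow</color> and <initials>abc</initials> are likewise rejected.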
XML Browser Support
Mozilla Firefox
As of version 1.0.2, Firefox has support for XML and
XSLT (and CSS).
Mozilla
Mozilla includes Expat for XML parsing and has support
to display XML + CSS. Mozilla also has some support for
Namespaces.
Mozilla is available with an XSLT implementation.
Netscape
As of version 8, Netscape uses the Mozilla engine, and
therefore it has the same XML / XSLT support as
Mozilla.
Opera
As of version 9, Opera has support for XML and XSLT
(and CSS). Version 8 supports only XML + CSS.
Internet Explorer
As of version 6, Internet Explorer supports XML,
Namespaces, CSS, XSLT, and XPath
Displaying your XML Files with
CSS?
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/css" href="cd_catalog.css"?>
<CATALOG>
  <CD>
    <TITLE>Empire Burlesque</TITLE>
    <ARTIST>Bob Dylan</ARTIST>
    <COUNTRY>USA</COUNTRY>
    <COMPANY>Columbia</COMPANY>
    <PRICE>10.90</PRICE>
    <YEAR>1985</YEAR>
  </CD>
  <CD>
    <TITLE>Hide your heart</TITLE>
    <ARTIST>Bonnie Tyler</ARTIST>
    <COUNTRY>UK</COUNTRY>
    <COMPANY>CBS Records</COMPANY>
    <PRICE>9.90</PRICE>
    <YEAR>1988</YEAR>
  </CD>
  . . . .
</CATALOG>
XML Data Island
XML Data Embedded in HTML
An XML data island is XML data embedded into
an HTML page.
Here is how it works; assume we have the
following XML document ("note.xml"):
<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
Then, in an HTML document, you
can embed the XML file above
with the <xml> tag. The id
attribute of the <xml> tag defines
an ID for the data island, and the
src attribute points to the XML file
to embed:
<html>
<body>
<xml id="note" src="note.xml"></xml>
</body>
</html>
However, the embedded XML data is, up to this point, not
visible to the user.
Bind Data Island to HTML
Elements
The HTML file looks like this:
<html>
<body>
<xml id="cdcat" src="cd_catalog.xml"></xml>
<table border="1" datasrc="#cdcat">
  <tr>
    <td><span datafld="ARTIST"></span></td>
    <td><span datafld="TITLE"></span></td>
  </tr>
</table>
</body>
</html>
Example explained:
The datasrc attribute of the <table> tag binds the HTML
table element to the XML data island. The datasrc
attribute refers to the id attribute of the data island.
<td> tags cannot be bound to data, so we are using
<span> tags. The <span> tag allows the datafld
attribute to refer to the XML element to be displayed. In
this case, it is datafld="ARTIST" for the <ARTIST>
element and datafld="TITLE" for the <TITLE> element
in the XML file. As the XML is read, additional rows are
created for each <CD> element.
XML Namespaces
Name Conflicts
Since element names in XML are not
predefined, a name conflict will occur
when two different documents use the
same element names.
This XML document carries information
in a table:
<table>
  <tr>
    <td>Apples</td>
    <td>Bananas</td>
  </tr>
</table>
This XML document carries information about a table (a piece of
furniture):
<table>
  <name>African Coffee Table</name>
  <width>80</width>
  <length>120</length>
</table>
If these two XML documents were added together, there would be an
element name conflict because both documents contain a <table>
element with different content and definition.
Solving Name Conflicts Using a Prefix
This XML document carries information in a table:
<h:table>
  <h:tr>
    <h:td>Apples</h:td>
    <h:td>Bananas</h:td>
  </h:tr>
</h:table>
This XML document carries information about a piece of furniture:
<f:table>
  <f:name>African Coffee Table</f:name>
  <f:width>80</f:width>
  <f:length>120</f:length>
</f:table>
Now there will be no name conflict because the two
documents use a different name for their <table> element
(<h:table> and <f:table>).
By using a prefix, we have created two different types of
<table> elements.
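For a namespace-aware parser, each prefix also has to be bound to a namespace name with an xmlns attribute. A minimal sketch that combines both tables in one document; the wrapping "root" element and both namespace URIs are assumptions chosen for illustration (the URIs only need to be unique names, they are not looked up).

```xml
<?xml version="1.0"?>
<root xmlns:h="http://www.w3.org/TR/html4/"
      xmlns:f="http://www.example.com/furniture">
  <!-- h: elements belong to the namespace bound to the h prefix -->
  <h:table>
    <h:tr>
      <h:td>Apples</h:td>
      <h:td>Bananas</h:td>
    </h:tr>
  </h:table>
  <!-- f: elements belong to a different namespace -->
  <f:table>
    <f:name>African Coffee Table</f:name>
    <f:width>80</f:width>
    <f:length>120</f:length>
  </f:table>
</root>
```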
XML Parsers
The two basic approaches followed by
parsers are SAX (Simple API for
XML) and DOM (Document Object
Model).
SAX
Sequential, event-based
Cannot move laterally between
elements
Tough to use for complex
structures
Saves memory space
Better choice for quick, less
intensive parsing and processing
Streams the document as it is read,
so it can be used for larger files
DOM
In-memory tree representation
Lets you move back and forth, up
and down
Easy to use and has a clean
interface
Memory intensive for larger XML
documents
Better choice for complex XML
structures
Best suited to smaller files, because
it is memory intensive

  • 2. What is XML? XML stands for EXtensible Markup Language XML is a markup language much like HTML XML was designed to describe data XML tags are not predefined. You must define your own tags XML uses a Document Type Definition (DTD) or an XML Schema to describe the data XML with a DTD or XML Schema is designed to be self- descriptive XML is a W3C Recommendation
  • 3. XML was designed to describe data and to focus on what data is. HTML was designed to display data and to focus on how data looks.
  • 4. XML is a W3C Recommendation The Extensible Markup Language (XML) became a W3C Recommendation 10. February 1998
  • 5. The Main Difference Between XML and HTML XML was designed to carry data. XML is not a replacement for HTML. XML and HTML were designed with different goals: XML was designed to describe data and to focus on what data is. HTML was designed to display data and to focus on how data looks. HTML is about displaying information, while XML is about describing information
  • 6. XML Does not DO Anything XML was not designed to DO anything. Maybe it is a little hard to understand, but XML does not DO anything. XML was created to structure, store and to send information. The following example is a note to Tove from Jani, stored as XML:
  • 8. The note has a header and a message body. It also has sender and receiver information. But still, this XML document does not DO anything. It is just pure information wrapped in XML tags. Someone must write a piece of software to send, receive or display it.
  • 9. XML is Free and Extensible XML tags are not predefined. You must "invent" your own tags. The tags used to mark up HTML documents and the structure of HTML documents are predefined. The author of HTML documents can only use tags that are defined in the HTML standard (like <p>, <h1>, etc.). XML allows the author to define his own tags and his own document structure. The tags in the example above (like <to> and <from>) are not defined in any XML standard. These tags are "invented" by the author of the XML document.
  • 10. XML is a Complement to HTML XML is not a replacement for HTML. It is important to understand that XML is not a replacement for HTML. In future Web development it is most likely that XML will be used to describe the data, while HTML will be used to format and display the same data. My best description of XML is this: XML is a cross- platform, software and hardware independent tool for transmitting information
  • 11. XML in Future Web Development XML is going to be everywhere. We have been participating in XML development since its creation. It has been amazing to see how quickly the XML standard has been developed and how quickly a large number of software vendors have adopted the standard. We strongly believe that XML will be as important to the future of the Web as HTML has been to the foundation of the Web and that XML will be the most common tool for all data manipulation and data transmission.
  • 12. How can XML be Used?
  • 13. XML can Separate Data from HTML
  • 14. With XML, your data is stored outside your HTML. When HTML is used to display data, the data is stored inside your HTML. With XML, data can be stored in separate XML files. This way you can concentrate on using HTML for data layout and display, and be sure that changes in the underlying data will not require any changes to your HTML. XML data can also be stored inside HTML pages as "Data Islands". You can still concentrate on using HTML only for formatting and displaying the data.
  • 15. XML is Used to Exchange Data
  • 16. With XML, data can be exchanged between incompatible systems. In the real world, computer systems and databases contain data in incompatible formats. One of the most time-consuming challenges for developers has been to exchange data between such systems over the Internet. Converting the data to XML can greatly reduce this complexity and create data that can be read by many different types of applications
  • 18. With XML, financial information can be exchanged over the Internet. Expect to see a lot about XML and B2B (Business To Business) in the near future. XML is going to be the main language for exchanging financial information between businesses over the Internet. A lot of interesting B2B applications are under development
  • 19. XML Can be Used to Share Data
  • 20. With XML, plain text files can be used to share data. Since XML data is stored in plain text format, XML provides a software- and hardware-independent way of sharing data. This makes it much easier to create data that different applications can work with. It also makes it easier to expand or upgrade a system to new operating systems, servers, applications, and new browsers.
  • 21. XML Can be Used to Store Data
  • 22. With XML, plain text files can be used to store data. XML can also be used to store data in files or in databases. Applications can be written to store and retrieve information from the store, and generic applications can be used to display the data.
  • 23. XML Can Make your Data More Useful
  • 24. With XML, your data is available to more users. Since XML is independent of hardware, software and application, you can make your data available to other than only standard HTML browsers. Other clients and applications can access your XML files as data sources, like they are accessing databases. Your data can be made available to all kinds of "reading machines" (agents), and it is easier to make your data available for blind people, or people with other disabilities.
  • 25. XML Can be Used to Create New Languages
  • 26. XML is the mother of WAP and WML. The Wireless Markup Language (WML), used to markup Internet applications for handheld devices like mobile phones, is written in XML.
  • 28. The syntax rules of XML are very simple and very strict. The rules are very easy to learn, and very easy to use. Because of this, creating software that can read and manipulate XML is very easy.
  • 29. An Example XML Document XML documents use a self-describing and simple syntax. <?xml version="1.0" encoding="ISO-8859-1"?> <note> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>The first line in the document - the XML declaration - defines the XML version and the character encoding used in the document. In this case the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set
  • 30. The next line describes the root element of the document (like it was saying: "this document is a note"): <note>The next 4 lines describe 4 child elements of the root (to, from, heading, and body): <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body>And finally the last line defines the end of the root element: </note>
  • 31. All XML Elements Must Have a Closing Tag With XML, it is illegal to omit the closing tag. In HTML some elements do not have to have a closing tag. The following code is legal in HTML: <p>This is a paragraph <p>This is another paragraphIn XML all elements must have a closing tag, like this: <p>This is a paragraph</p> <p>This is another paragraph</p>
  • 32. XML Tags are Case Sensitive Unlike HTML, XML tags are case sensitive. With XML, the tag <Letter> is different from the tag <letter>. Opening and closing tags must therefore be written with the same case: <Message>This is incorrect</message> <message>This is correct</message>
  • 33. XML Elements Must be Properly Nested Improper nesting of tags makes no sense to XML. In HTML some elements can be improperly nested within each other like this: <b><i>This text is bold and italic</b></i>In XML all elements must be properly nested within each other like this: <b><i>This text is bold and italic</i></b>
  • 34. XML Documents Must Have a Root Element All XML documents must contain a single tag pair to define a root element. All other elements must be within this root element. All elements can have sub elements (child elements). Sub elements must be correctly nested within their parent element: <root> <child> <subchild>.....</subchild> </child> </root>
  • 35. XML Attribute Values Must be Quoted
  • 36. With XML, it is illegal to omit quotation marks around attribute values. XML elements can have attributes in name/value pairs just like in HTML. In XML the attribute value must always be quoted. Study the two XML documents below. The first one is incorrect, the second is correct: <?xml version="1.0" encoding="ISO-8859-1"?> <note date=12/11/2002> <to>Tove</to> <from>Jani</from> </note>
  • 37. <?xml version="1.0" encoding="ISO-8859-1"?> <note date="12/11/2002"> <to>Tove</to> <from>Jani</from> </note> The error in the first document is that the date attribute in the note element is not quoted. This is correct: date="12/11/2002". This is incorrect: date=12/11/2002.
  • 38. With XML, White Space is Preserved With XML, the white space in your document is not truncated. This is unlike HTML. With HTML, a sentence like this: Hello my name is Tove, will be displayed like this: Hello my name is Tove, because HTML reduces multiple, consecutive white space characters to a single white space.
  • 39. Comments in XML The syntax for writing comments in XML is similar to that of HTML. <!-- This is a comment --> There is Nothing Special About XML There is nothing special about XML. It is just plain text with the addition of some XML tags enclosed in angle brackets. Software that can handle plain text can also handle XML. In a simple text editor, the XML tags will be visible and will not be handled specially. In an XML-aware application however, the XML tags can be handled specially. The tags may or may not be visible, or have a functional meaning, depending on the nature of the application.
  • 41. XML Elements are Extensible
  • 42. XML documents can be extended to carry more information. Look at the following XML NOTE example: <note> <to>Tove</to> <from>Jani</from> <body>Don't forget me this weekend!</body> </note> Let's imagine that we created an application that extracted the <to>, <from>, and <body> elements from the XML document to produce this output: MESSAGE To: Tove From: Jani Don't forget me this weekend!
  • 43.
  • 44.
  • 45. Imagine that the author of the XML document added some extra information to it: <note> <date>2002-08-01</date> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>Should the application break or crash? No. The application should still be able to find the <to>, <from>, and <body> elements in the XML document and produce the same output. XML documents are Extensible
  • 47. Elements are related as parents and children. To understand XML terminology, you have to know how relationships between XML elements are named, and how element content is described. Imagine that this is a description of a book: My First XMLIntroduction to XML What is HTML What is XML XML Syntax Elements must have a closing tag Elements must be properly nested
  • 48. Imagine that this XML document describes the book: <book> <title>My First XML</title> <prod id="33-657" media="paper"></prod> <chapter>Introduction to XML <para>What is HTML</para> <para>What is XML</para> </chapter> <chapter>XML Syntax <para>Elements must have a closing tag</para> <para>Elements must be properly nested</para> </chapter> </book>
  • 49. Book is the root element. Title, prod, and chapter are child elements of book. Book is the parent element of title, prod, and chapter. Title, prod, and chapter are siblings (or sister elements) because they have the same parent
  • 51. Elements can have different content types. An XML element is everything from (including) the element's start tag to (including) the element's end tag. An element can have element content, mixed content, simple content, or empty content. An element can also have attributes. In the example above, book has element content, because it contains other elements. Chapter has mixed content because it contains both text and other elements. Para has simple content (or text content) because it contains only text. Prod has empty content, because it carries no information. In the example above only the prod element has attributes. The attribute named id has the value "33-657". The attribute named media has the value "paper".
  • 53. XML elements must follow these naming rules: Names can contain letters, numbers, and other characters Names must not start with a number or punctuation character Names must not start with the letters xml (or XML, or Xml, etc) Names cannot contain spaces Take care when you "invent" element names and follow these simple rules: Any name can be used, no words are reserved, but the idea is to make names descriptive. Names with an underscore separator are nice. Examples: <first_name>, <last_name>.
  • 54. name from first. Or if you name something "first.name," your software may think that "name" is a property of the object "first." Element names can be as long as you like, but don't exaggerate. Names should be short and simple, like this: <book_title> not like this: <the_title_of_the_book>. XML documents often have a corresponding database, in which fields exist corresponding to elements in the XML document. A good practice is to use the naming rules of your database for the elements in the XML documents. Non-English letters like éòá are perfectly legal in XML element names, but watch out for problems if your software vendor doesn't support them. The ":" should not be used in element names because it is reserved to be used for something called namespaces (more later).
  • 56. XML elements can have attributes. From HTML you will remember this: <IMG SRC="computer.gif">. The SRC attribute provides additional information about the IMG element. In HTML (and in XML) attributes provide additional information about elements: <img src="computer.gif"> <a href="demo.asp">Attributes often provide information that is not a part of the data. In the example below, the file type is irrelevant to the data, but important to the software that wants to manipulate the element: <file type="gif">computer.gif</file>
  • 57. Quote Styles, "female" or 'female'? Attribute values must always be enclosed in quotes, but either single or double quotes can be used. For a person's sex, the person tag can be written like this: <person sex="female">or like this: <person sex='female'>Note: If the attribute value itself contains double quotes it is necessary to use single quotes, like in this example: <gangster name='George "Shotgun" Ziegler'>
  • 58. Note: If the attribute value itself contains single quotes it is necessary to use double quotes, like in this example: <gangster name="George 'Shotgun' Ziegler">
  • 59. Use of Elements vs. Attributes Data can be stored in child elements or in attributes. Take a look at these examples: <person sex="female"> <firstname>Anna</firstname> <lastname>Smith</lastname> </person> <person> <sex>female</sex> <firstname>Anna</firstname> <lastname>Smith</lastname> </person>
  • 60. In the first example sex is an attribute. In the last, sex is a child element. Both examples provide the same information. There are no rules about when to use attributes, and when to use child elements. My experience is that attributes are handy in HTML, but in XML you should try to avoid them. Use child elements if the information feels like data.
  • 61. I like to store data in child elements. The following three XML documents contain exactly the same information: A date attribute is used in the first example: <note date="12/11/2002"> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>
  • 62. A date element is used in the second example: <note> <date>12/11/2002</date> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>
  • 63. An expanded date element is used in the third: (THIS IS MY FAVORITE): <note> <date> <day>12</day> <month>11</month> <year>2002</year> </date> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>
  • 64. Avoid using attributes? Should you avoid using attributes? Some of the problems with using attributes are: attributes cannot contain multiple values (child elements can) attributes are not easily expandable (for future changes) attributes cannot describe structures (child elements can) attributes are more difficult to manipulate by program code attribute values are not easy to test against a Document Type Definition (DTD) - which is used to define the legal elements of an XML document
  • 65. If you use attributes as containers for data, you end up with documents that are difficult to read and maintain. Try to use elements to describe data. Use attributes only to provide information that is not relevant to the data. Don't end up like this (this is not how XML should be used): <note day="12" month="11" year="2002" to="Tove" from="Jani" heading="Reminder" body="Don't forget me this weekend!"> </note>
  • 66. An Exception to my Attribute Rule
  • 67. Rules always have exceptions. My rule about attributes has one exception: Sometimes I assign ID references to elements. These ID references can be used to access XML elements in much the same way as the NAME or ID attributes in HTML. This example demonstrates this: <messages> <note id="p501"> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note> <note id="p502"> <to>Jani</to> <from>Tove</from> <heading>Re: Reminder</heading> <body>I will not!</body> </note> </messages>
  • 68. The ID in these examples is just a counter, or a unique identifier, to identify the different notes in the XML file, and not a part of the note data. What I am trying to say here is that metadata (data about data) should be stored as attributes, and that data itself should be stored as elements.
  • 70. Well Formed XML Documents A "Well Formed" XML document has correct XML syntax. A "Well Formed" XML document is a document that conforms to the XML syntax rules that were described in the previous chapters: XML documents must have a root element; XML elements must have a closing tag; XML tags are case sensitive; XML elements must be properly nested; XML attribute values must always be quoted.
  • 73. A "Valid" XML document also conforms to a DTD. A "Valid" XML document is a "Well Formed" XML document, which also conforms to the rules of a Document Type Definition (DTD): <?xml version="1.0" encoding="ISO-8859-1"?> <!DOCTYPE note SYSTEM "InternalNote.dtd"> <note> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>
  • 74. XML DTD A DTD defines the legal elements of an XML document. The purpose of a DTD is to define the legal building blocks of an XML document. It defines the document structure with a list of legal elements
  • 76. Internal DOCTYPE declaration If the DTD is included in your XML source file, it should be wrapped in a DOCTYPE definition with the following syntax: <!DOCTYPE root-element [element-declarations]>
  • 77. <?xml version="1.0"?> <!DOCTYPE note [ <!ELEMENT note (to,from,heading,body)> <!ELEMENT to (#PCDATA)> <!ELEMENT from (#PCDATA)> <!ELEMENT heading (#PCDATA)> <!ELEMENT body (#PCDATA)> ]> <note> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend</body> </note>
  • 78. The DTD above is interpreted like this: !DOCTYPE note (in line 2) defines that this is a document of the type note. !ELEMENT note (in line 3) defines the note element as having four elements: "to,from,heading,body". !ELEMENT to (in line 4) defines the to element to be of the type "#PCDATA". !ELEMENT from (in line 5) defines the from element to be of the type "#PCDATA" and so on.....
  • 79. External DOCTYPE declaration If the DTD is external to your XML source file, it should be wrapped in a DOCTYPE definition with the following syntax: <!DOCTYPE root-element SYSTEM "filename">
  • 80. <?xml version="1.0"?> <!DOCTYPE note SYSTEM "note.dtd"> <note> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>
  • 81. And this is a copy of the file "note.dtd" containing the DTD: <!ELEMENT note (to,from,heading,body)> <!ELEMENT to (#PCDATA)> <!ELEMENT from (#PCDATA)> <!ELEMENT heading (#PCDATA)> <!ELEMENT body (#PCDATA)>
  • 82. Why use a DTD? With DTD, each of your XML files can carry a description of its own format with it. With a DTD, independent groups of people can agree to use a common DTD for interchanging data. Your application can use a standard DTD to verify that the data you receive from the outside world is valid. You can also use a DTD to verify your own data.
  • 83. DTD - XML building blocks
  • 84. The building blocks of XML documents Seen from a DTD point of view, all XML documents (and HTML documents) are made up by the following simple building blocks: Elements Attributes Entities PCDATA CDATA
  • 85. Elements Elements are the main building blocks of both XML and HTML documents. Examples of HTML elements are "body" and "table". Examples of XML elements could be "note" and "message". Elements can contain text, other elements, or be empty. Examples of empty HTML elements are "hr", "br" and "img". Examples: <body>body text in between</body><message>some message in between</message>
  • 86. Attributes Attributes provide extra information about elements. Attributes are always placed inside the starting tag of an element. Attributes always come in name/value pairs. The following "img" element has additional information about a source file: <img src="computer.gif" />The name of the element is "img". The name of the attribute is "src". The value of the attribute is "computer.gif". Since the element itself is empty it is closed by a " /".
  • 87. Entities Entities are variables used to define common text. Entity references are references to entities. Most of you will know the HTML entity reference "&nbsp;". This "non-breaking space" entity is used in HTML to insert an extra space in a document. Entities are expanded when a document is parsed by an XML parser.
  • 88. The following entities are predefined in XML (entity reference, then the character it stands for): &lt; is <, &gt; is >, &amp; is &, &quot; is ", &apos; is '
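As a rough illustration of how the predefined entities behave, the sketch below uses Python's standard library: xml.sax.saxutils.escape produces the references when writing text, and the parser expands them again when reading. The sample strings are invented for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Text containing characters that are special in XML markup.
raw = "5 < 6 & 7 > 2"

# escape() replaces &, < and > with their predefined entity references.
escaped = escape(raw)
print(escaped)

# The parser expands the references back into the original characters.
element = ET.fromstring(f"<expr>{escaped}</expr>")
print(element.text)

# All five predefined entities are expanded by the parser.
all_five = ET.fromstring("<t>&lt;&gt;&amp;&quot;&apos;</t>").text
print(all_five)
```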
  • 89. PCDATA PCDATA means parsed character data. Think of character data as the text found between the start tag and the end tag of an XML element. PCDATA is text that will be parsed by a parser. Tags inside the text will be treated as markup and entities will be expanded. CDATA CDATA also means character data. CDATA is text that will NOT be parsed by a parser. Tags inside the text will NOT be treated as markup and entities will not be expanded.
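The difference between parsed and unparsed character data can be observed with a small experiment. This sketch assumes Python's xml.etree.ElementTree and uses a CDATA section in the document (rather than a DTD declaration) to show text that the parser leaves alone. The element name code is arbitrary.

```python
import xml.etree.ElementTree as ET

# Parsed character data: <b> is treated as markup and &amp; is expanded.
parsed = ET.fromstring("<code><b>x</b> &amp; y</code>")
print(len(parsed))              # number of child elements
print(parsed.find("b").text)    # text inside the child element
print(parsed.find("b").tail)    # text following the child element

# CDATA section: the same characters are kept as literal text.
literal = ET.fromstring("<code><![CDATA[<b>x</b> &amp; y]]></code>")
print(len(literal))             # no child elements were created
print(literal.text)
```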
  • 91. Declaring an Element In the DTD, XML elements are declared with an element declaration. An element declaration has the following syntax: <!ELEMENT element-name category> or <!ELEMENT element-name (element-content)>
  • 92. Empty elements Empty elements are declared with the category keyword EMPTY: <!ELEMENT element-name EMPTY> example:<!ELEMENT br EMPTY>XML example:<br />
  • 93. Elements with only character data Elements with only character data are declared with #PCDATA inside parentheses: <!ELEMENT element-name (#PCDATA)> example:<!ELEMENT from (#PCDATA)>
  • 94. Elements with any contents Elements declared with the category keyword ANY, can contain any combination of parsable data: <!ELEMENT element-name ANY>example:<!ELEMENT note ANY>
  • 95. Elements with children (sequences) Elements with one or more children are defined with the names of the child elements inside parentheses: <!ELEMENT element-name (child-element-name)> or <!ELEMENT element-name (child-element-name,child-element-name,.....)> example: <!ELEMENT note (to,from,heading,body)>
  • 96. When children are declared in a sequence separated by commas, the children must appear in the same sequence in the document. In a full declaration, the children must also be declared, and the children can also have children. The full declaration of the "note" element will be: <!ELEMENT note (to,from,heading,body)> <!ELEMENT to (#PCDATA)> <!ELEMENT from (#PCDATA)> <!ELEMENT heading (#PCDATA)> <!ELEMENT body (#PCDATA)>
  • 97. Declaring only one occurrence of the same element <!ELEMENT element-name (child-name)> example: <!ELEMENT note (message)> The example declaration above declares that the child element message must occur once, and only once, inside the "note" element.
  • 98. Declaring minimum one occurrence of the same element <!ELEMENT element-name (child-name+)> example: <!ELEMENT note (message+)> The + sign in the example above declares that the child element message must occur one or more times inside the "note" element.
  • 99. Declaring zero or more occurrences of the same element <!ELEMENT element-name (child-name*)> example: <!ELEMENT note (message*)> The * sign in the example above declares that the child element message can occur zero or more times inside the "note" element.
  • 100. Declaring zero or one occurrences of the same element <!ELEMENT element-name (child-name?)> example: <!ELEMENT note (message?)> The ? sign in the example above declares that the child element message can occur zero or one times inside the "note" element.
  • 101. Declaring either/or content example: <!ELEMENT note (to,from,header,(message|body))> The example above declares that the "note" element must contain a "to" element, a "from" element, a "header" element, and either a "message" or a "body" element.
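The occurrence indicators (none, +, *, ?) and the | choice behave much like regular-expression operators. The following Python sketch makes that analogy concrete by translating an element-content model into a regex over child names. It is a teaching aid under simplifying assumptions, not how a validating parser is implemented, and it does not handle #PCDATA, EMPTY, or ANY. The function names are hypothetical.

```python
import re


def content_model_to_regex(model):
    """Translate a DTD element-content model into a compiled regex.

    Every child name becomes the token 'name ' (name plus a space), so the
    model's own operators ( | ? * + and parentheses ) can be reused as-is.
    """
    compact = re.sub(r"\s+", "", model)
    tokens = re.sub(
        r"[A-Za-z_][\w.-]*",
        lambda m: "(?:" + re.escape(m.group(0)) + " )",
        compact,
    )
    # A comma means "followed by", which in a regex is plain concatenation.
    return re.compile(tokens.replace(",", ""))


def children_match(model, children):
    """Return True if the list of child names satisfies the content model."""
    subject = "".join(name + " " for name in children)
    return content_model_to_regex(model).fullmatch(subject) is not None


print(children_match("(message+)", ["message", "message"]))
print(children_match("(message?)", ["message", "message"]))
print(children_match("(to,from,header,(message|body))",
                     ["to", "from", "header", "body"]))
```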
  • 102. Declaring mixed content example: <!ELEMENT note (#PCDATA|to|from|header|message)*> The example above declares that the "note" element can contain zero or more occurrences of parsed character data, "to", "from", "header", or "message" elements, in any order.
  • 104. Declaring Attributes An attribute declaration has the following syntax: <!ATTLIST element-name attribute-name attribute-type default-value> DTD example: <!ATTLIST payment type CDATA "check"> XML example: <payment type="check" />
  • 105. The attribute-type can have the following values (value: explanation): CDATA: the value is character data; (en1|en2|..): the value must be one from an enumerated list; ID: the value is a unique id; IDREF: the value is the id of another element; IDREFS: the value is a list of other ids; NMTOKEN: the value is a valid XML name token; NMTOKENS: the value is a list of valid XML name tokens; ENTITY: the value is an entity; ENTITIES: the value is a list of entities; NOTATION: the value is a name of a notation; xml: the value is a predefined xml value
  • 106. The default-value can have the following values (value: explanation): value: the default value of the attribute; #REQUIRED: the attribute value must be included in the element; #IMPLIED: the attribute does not have to be included; #FIXED value: the attribute value is fixed
  • 107. Specifying a Default attribute value DTD: <!ELEMENT square EMPTY> <!ATTLIST square width CDATA "0"> Valid XML: <square width="100" /> In the example above, the "square" element is defined to be an empty element with a "width" attribute of type CDATA. If no width is specified, it has a default value of 0.
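A short sketch of the default value in action, assuming Python's xml.etree.ElementTree (which is built on the Expat parser) and a DTD placed in the document's internal subset. The parser does not validate, but it does supply attribute defaults that are declared in the internal subset.

```python
import xml.etree.ElementTree as ET

# The DTD from the slide, embedded as an internal subset.
DTD = """<!DOCTYPE square [
  <!ELEMENT square EMPTY>
  <!ATTLIST square width CDATA "0">
]>"""

# No width given: the declared default is reported by the parser.
defaulted = ET.fromstring(DTD + "\n<square />")
print(defaulted.get("width"))

# Width given explicitly: the document's own value wins.
explicit = ET.fromstring(DTD + '\n<square width="100" />')
print(explicit.get("width"))
```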
  • 108. #IMPLIED Syntax <!ATTLIST element-name attribute-name attribute-type #IMPLIED> Example DTD: <!ATTLIST contact fax CDATA #IMPLIED> Valid XML: <contact fax="555-667788" /> Valid XML: <contact />
  • 109. #REQUIRED Syntax <!ATTLIST element-name attribute-name attribute-type #REQUIRED> Example DTD: <!ATTLIST person number CDATA #REQUIRED> Valid XML: <person number="5677" /> Invalid XML: <person /> Use the #REQUIRED keyword if you don't have an option for a default value, but still want to force the attribute to be present.
  • 110. #FIXED Syntax <!ATTLIST element-name attribute-name attribute-type #FIXED "value"> Example DTD: <!ATTLIST sender company CDATA #FIXED "Microsoft"> Valid XML: <sender company="Microsoft" /> Invalid XML: <sender company="abc" /> Use the #FIXED keyword when you want an attribute to have a fixed value without allowing the author to change it. If an author includes another value, a validating XML parser will return an error.
  • 111. Enumerated attribute values Syntax: <!ATTLIST element-name attribute-name (en1|en2|..) default-value> DTD example: <!ATTLIST payment type (check|cash) "cash"> XML example: <payment type="check" /> or <payment type="cash" /> Use enumerated attribute values when you want the attribute values to be one of a fixed set of legal values.
  • 113. Entities are variables used to define shortcuts to common text. - Entity references are references to entities. - Entities can be declared internal, or external
  • 114. Internal Entity Declaration Syntax: <!ENTITY entity-name "entity-value"> DTD Example: <!ENTITY writer "Donald Duck."> <!ENTITY copyright "Copyright imsec"> XML example: <author>&writer;&copyright;</author>
  • 115. External Entity Declaration Syntax: <!ENTITY entity-name SYSTEM "URI/URL"> DTD Example: <!ENTITY writer SYSTEM "http://www.abc.com/dtd/entities.dtd"> <!ENTITY copyright SYSTEM "http://www.imsec.com/dtd/entities.dtd"> XML example: <author>&writer;&copyright;</author>
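The internal entity example can be tried out with Python's xml.etree.ElementTree, which expands entities declared in the internal DTD subset. This is a sketch for trusted input only: entity expansion is a known attack surface, so documents from untrusted sources need a hardened parser.

```python
import xml.etree.ElementTree as ET

# Internal entity declarations from the slide, placed in the internal subset.
DOC = """<?xml version="1.0"?>
<!DOCTYPE author [
  <!ENTITY writer "Donald Duck.">
  <!ENTITY copyright "Copyright imsec">
]>
<author>&writer;&copyright;</author>"""

author = ET.fromstring(DOC)
# The two entity references are replaced by their declared values.
print(author.text)
```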
  • 116. DTD Summary This tutorial has taught you how to describe the structure of an XML document. You have learned how to use a DTD to define the legal elements of an XML document, and how the DTD can be declared inside your XML document, or as an external reference. You have learned how to declare the legal elements, attributes, entities, and CDATA sections for XML documents. You have also seen how to validate an XML document against a DTD.
  • 118. What is an XML Schema? The purpose of an XML Schema is to define the legal building blocks of an XML document, just like a DTD. An XML Schema: defines elements that can appear in a document defines attributes that can appear in a document defines which elements are child elements defines the order of child elements defines the number of child elements defines whether an element is empty or can include text defines data types for elements and attributes defines default and fixed values for elements and attributes
  • 119. XML Schemas are the Successors of DTDs We think that very soon XML Schemas will be used in most Web applications as a replacement for DTDs. Here are some reasons: XML Schemas are extensible to future additions XML Schemas are richer and more powerful than DTDs XML Schemas are written in XML XML Schemas support data types XML Schemas support namespaces
  • 120. Why Use XML Schemas?
  • 121. XML Schemas Support Data Types One of the greatest strengths of XML Schemas is the support for data types. With support for data types: it is easier to describe allowable document content; it is easier to validate the correctness of data; it is easier to work with data from a database; it is easier to define data facets (restrictions on data); it is easier to define data patterns (data formats); it is easier to convert data between different data types.
  • 122. XML Schemas use XML Syntax Another great strength of XML Schemas is that they are written in XML. Some benefits of XML Schemas being written in XML: you don't have to learn a new language; you can use your XML editor to edit your Schema files; you can use your XML parser to parse your Schema files; you can manipulate your Schema with the XML DOM; you can transform your Schema with XSLT.
  • 123. XML Schemas Secure Data Communication When sending data from a sender to a receiver, it is essential that both parties have the same "expectations" about the content. With XML Schemas, the sender can describe the data in a way that the receiver will understand. A date like "03-11-2004" will, in some countries, be interpreted as 3 November and in other countries as 11 March. However, an XML element with a data type like this: <date type="date">2004-03-11</date> ensures a mutual understanding of the content, because the XML data type "date" requires the format "YYYY-MM-DD".
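To see why a fixed "YYYY-MM-DD" format removes the ambiguity, here is a small Python sketch that accepts only that layout. The helper name parse_xs_date is an assumption for this example, and it covers only the plain form of the schema date type (no time zone suffix). Note that strptime is slightly more lenient than a schema validator, for instance about single-digit months.

```python
from datetime import date, datetime


def parse_xs_date(text):
    """Parse a date written as YYYY-MM-DD and return a datetime.date.

    Raises ValueError for any other layout, such as DD-MM-YYYY.
    """
    return datetime.strptime(text, "%Y-%m-%d").date()


# Year, month, and day are unambiguous in this format.
parsed = parse_xs_date("2004-03-11")
print(parsed.year, parsed.month, parsed.day)

# The ambiguous layout from the slide is rejected instead of being guessed.
try:
    parse_xs_date("03-11-2004")
except ValueError:
    print("rejected: not in YYYY-MM-DD format")
```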
  • 124. XML Schemas are Extensible XML Schemas are extensible, because they are written in XML. With an extensible Schema definition you can: Reuse your Schema in other Schemas Create your own data types derived from the standard types Reference multiple schemas in the same document
  • 125. Well-Formed is not Enough A well-formed XML document is a document that conforms to the XML syntax rules, like: it should begin with the XML declaration (optional in XML 1.0, but recommended); it must have one unique root element; start-tags must have matching end-tags; element names are case sensitive; all elements must be closed; all elements must be properly nested; all attribute values must be quoted; entities must be used for special characters. Even if documents are well-formed they can still contain errors, and those errors can have serious consequences. Think of the following situation: you order 5 gross of laser printers, instead of 5 laser printers. With XML Schemas, most of these errors can be caught by your validating software.
  • 126. A Simple XML Document Look at this simple XML document called "note.xml": <?xml version="1.0"?> <note> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>
  • 127. A DTD File The following example is a DTD file called "note.dtd" that defines the elements of the XML document above ("note.xml"): <!ELEMENT note (to, from, heading, body)> <!ELEMENT to (#PCDATA)> <!ELEMENT from (#PCDATA)> <!ELEMENT heading (#PCDATA)> <!ELEMENT body (#PCDATA)> The first line defines the note element to have four child elements: "to, from, heading, body". Lines 2-5 define the to, from, heading, and body elements to be of type "#PCDATA".
  • 128. An XML Schema The following example is an XML Schema file called "note.xsd" that defines the elements of the XML document above ("note.xml"):
  • 129. <?xml version="1.0"?> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.w3schools.com" xmlns="http://www.w3schools.com" elementFormDefault="qualified"><xs:element name="note"> <xs:complexType> <xs:sequence> <xs:element name="to" type="xs:string"/> <xs:element name="from" type="xs:string"/> <xs:element name="heading" type="xs:string"/> <xs:element name="body" type="xs:string"/> </xs:sequence> </xs:complexType> </xs:element></xs:schema>
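Because a schema is itself an XML document, it can be read with an ordinary XML parser. The sketch below, assuming Python's xml.etree.ElementTree, loads the note.xsd text shown above and lists the child elements it declares. It only inspects the schema and does not validate a document against it, which needs a schema-aware library.

```python
import xml.etree.ElementTree as ET

# The note.xsd schema from the slide.
NOTE_XSD = """<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.w3schools.com"
           xmlns="http://www.w3schools.com"
           elementFormDefault="qualified">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="heading" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

# Map the xs prefix used in the search paths to the XML Schema namespace.
XS = {"xs": "http://www.w3.org/2001/XMLSchema"}

schema = ET.fromstring(NOTE_XSD)
note_decl = schema.find("xs:element[@name='note']", XS)

# Collect (name, type) for each child declared in the sequence, in order.
children = [
    (decl.get("name"), decl.get("type"))
    for decl in note_decl.findall("./xs:complexType/xs:sequence/xs:element", XS)
]
for name, type_name in children:
    print(name, type_name)
```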
  • 130. A Reference to a DTD This XML document has a reference to a DTD: <?xml version="1.0"?> <!DOCTYPE note SYSTEM "http://www.w3schools.com/dtd/note.dtd"> <note> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>
  • 131. A Reference to an XML Schema This XML document has a reference to an XML Schema: <?xml version="1.0"?> <note xmlns="http://www.w3schools.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3schools.com note.xsd"> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>
  • 132. XSD - The <schema> Element
  • 133. The <schema> Element The <schema> element is the root element of every XML Schema: <?xml version="1.0"?> <xs:schema> ... ... </xs:schema>
  • 134. The <schema> element may contain some attributes. A schema declaration often looks something like this: <?xml version="1.0"?> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.w3schools.com" xmlns="http://www.w3schools.com" elementFormDefault="qualified"> ... ... </xs:schema>
  • 136. What is a Simple Element? A simple element is an XML element that can contain only text. It cannot contain any other elements or attributes. However, the "only text" restriction is quite misleading. The text can be of many different types. It can be one of the types included in the XML Schema definition (boolean, string, date, etc.), or it can be a custom type that you can define yourself. You can also add restrictions (facets) to a data type in order to limit its content, or you can require the data to match a specific pattern
  • 137. Defining a Simple Element The syntax for defining a simple element is: <xs:element name="xxx" type="yyy"/> where xxx is the name of the element and yyy is the data type of the element. XML Schema has a lot of built-in data types. The most common types are: xs:string xs:decimal xs:integer xs:boolean xs:date xs:time
  • 138. Example Here are some XML elements: <lastname>Refsnes</lastname> <age>36</age> <dateborn>1970-03-27</dateborn> And here are the corresponding simple element definitions: <xs:element name="lastname" type="xs:string"/> <xs:element name="age" type="xs:integer"/> <xs:element name="dateborn" type="xs:date"/>
  • 139. Default and Fixed Values for Simple Elements Simple elements may have a default value OR a fixed value specified. A default value is automatically assigned to the element when no other value is specified. In the following example the default value is "red": <xs:element name="color" type="xs:string" default="red"/>A fixed value is also automatically assigned to the element, and you cannot specify another value. In the following example the fixed value is "red": <xs:element name="color" type="xs:string" fixed="red"/>
  • 141. What is an Attribute? Simple elements cannot have attributes. If an element has attributes, it is considered to be of a complex type. But the attribute itself is always declared as a simple type.
  • 142. How to Define an Attribute? The syntax for defining an attribute is: <xs:attribute name="xxx" type="yyy"/> where xxx is the name of the attribute and yyy specifies the data type of the attribute. XML Schema has a lot of built-in data types. The most common types are: xs:string xs:decimal xs:integer xs:boolean xs:date xs:time
  • 143. Example Here is an XML element with an attribute: <lastname lang="EN">Smith</lastname> And here is the corresponding attribute definition: <xs:attribute name="lang" type="xs:string"/>
  • 144. Default and Fixed Values for Attributes Attributes may have a default value OR a fixed value specified. A default value is automatically assigned to the attribute when no other value is specified. In the following example the default value is "EN": <xs:attribute name="lang" type="xs:string" default="EN"/>A fixed value is also automatically assigned to the attribute, and you cannot specify another value. In the following example the fixed value is "EN": <xs:attribute name="lang" type="xs:string" fixed="EN"/>
  • 145. Optional and Required Attributes Attributes are optional by default. To specify that the attribute is required, use the "use" attribute: <xs:attribute name="lang" type="xs:string" use="required"/>
  • 146. Restrictions on Content When an XML element or attribute has a data type defined, it puts restrictions on the element's or attribute's content. If an XML element is of type "xs:date" and contains a string like "Hello World", the element will not validate. With XML Schemas, you can also add your own restrictions to your XML elements and attributes. These restrictions are called facets
  • 148. Mozilla Firefox As of version 1.0.2, Firefox has support for XML and XSLT (and CSS). Mozilla Mozilla includes Expat for XML parsing and has support to display XML + CSS. Mozilla also has some support for Namespaces. Mozilla is available with an XSLT implementation. Netscape As of version 8, Netscape uses the Mozilla engine, and therefore it has the same XML / XSLT support as Mozilla. Opera As of version 9, Opera has support for XML and XSLT (and CSS). Version 8 supports only XML + CSS. Internet Explorer As of version 6, Internet Explorer supports XML, Namespaces, CSS, XSLT, and XPath
  • 149. Displaying your XML Files with CSS?
  • 150. <?xml version="1.0" encoding="ISO-8859-1"?> <?xml-stylesheet type="text/css" href="cd_catalog.css"?> <CATALOG> <CD> <TITLE>Empire Burlesque</TITLE> <ARTIST>Bob Dylan</ARTIST> <COUNTRY>USA</COUNTRY> <COMPANY>Columbia</COMPANY> <PRICE>10.90</PRICE> <YEAR>1985</YEAR> </CD> <CD> <TITLE>Hide your heart</TITLE> <ARTIST>Bonnie Tyler</ARTIST> <COUNTRY>UK</COUNTRY> <COMPANY>CBS Records</COMPANY> <PRICE>9.90</PRICE> <YEAR>1988</YEAR> </CD> . . . . </CATALOG>
  • 152. XML Data Embedded in HTML An XML data island is XML data embedded into an HTML page. Here is how it works; assume we have the following XML document ("note.xml"): <?xml version="1.0" encoding="ISO-8859-1"?> <note> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>
  • 153. Then, in an HTML document, you can embed the XML file above with the <xml> tag. The id attribute of the <xml> tag defines an ID for the data island, and the src attribute points to the XML file to embed:
  • 154. <html> <body> <xml id="note" src="note.xml"></xml> </body> </html> However, the embedded XML data is, up to this point, not visible to the user.
  • 155. Bind Data Island to HTML Elements
  • 156. The HTML file looks like this: <html> <body> <xml id="cdcat" src="cd_catalog.xml"></xml> <table border="1" datasrc="#cdcat"> <tr> <td><span datafld="ARTIST"></span></td> <td><span datafld="TITLE"></span></td> </tr> </table> </body> </html>
  • 157. Example explained: The datasrc attribute of the <table> tag binds the HTML table element to the XML data island. The datasrc attribute refers to the id attribute of the data island. <td> tags cannot be bound to data, so we are using <span> tags. The <span> tag allows the datafld attribute to refer to the XML element to be displayed. In this case, it is datafld="ARTIST" for the <ARTIST> element and datafld="TITLE" for the <TITLE> element in the XML file. As the XML is read, additional rows are created for each <CD> element.
  • 159. Name Conflicts Since element names in XML are not predefined, a name conflict will occur when two different documents use the same element names. This XML document carries information in a table: <table> <tr> <td>Apples</td> <td>Bananas</td> </tr> </table>This XML document carries information about a table (a piece of furniture):
  • 160. <table> <name>African Coffee Table</name> <width>80</width> <length>120</length> </table>If these two XML documents were added together, there would be an element name conflict because both documents contain a <table> element with different content and definition.
  • 161. Solving Name Conflicts Using a Prefix This XML document carries information in a table: <h:table> <h:tr> <h:td>Apples</h:td> <h:td>Bananas</h:td> </h:tr> </h:table>This XML document carries information about a piece of furniture: <f:table> <f:name>African Coffee Table</f:name> <f:width>80</f:width> <f:length>120</f:length> </f:table>Now there will be no name conflict because the two documents use a different name for their <table> element (<h:table> and <f:table>). By using a prefix, we have created two different types of <table> elements.
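A prefix only works once it is bound to a namespace with an xmlns declaration, which the two fragments above leave out. The Python sketch below adds such declarations to show how a parser then tells the two <table> elements apart. The namespace URIs and the wrapping <root> element are placeholders invented for this example.

```python
import xml.etree.ElementTree as ET

# Both fragments in one document; the two URIs are hypothetical identifiers.
DOC = """<root xmlns:h="http://example.com/html"
      xmlns:f="http://example.com/furniture">
  <h:table>
    <h:tr><h:td>Apples</h:td><h:td>Bananas</h:td></h:tr>
  </h:table>
  <f:table>
    <f:name>African Coffee Table</f:name>
    <f:width>80</f:width>
    <f:length>120</f:length>
  </f:table>
</root>"""

root = ET.fromstring(DOC)

# ElementTree reports each tag as {namespace-uri}local-name.
tags = [child.tag for child in root]
for tag in tags:
    print(tag)

# Searching by namespace finds only the furniture table.
ns = {"f": "http://example.com/furniture"}
print(root.findtext("f:table/f:name", namespaces=ns))

# A prefix with no xmlns declaration is rejected by the parser.
try:
    ET.fromstring("<h:table></h:table>")
    unbound_ok = True
except ET.ParseError:
    unbound_ok = False
print(unbound_ok)
```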
  • 162. XML PARSERS: The two basic approaches followed by parsers are SAX (Simple API for XML) and DOM (Document Object Model).
  • 163. SAX Sequential, event based; cannot move laterally between elements; tough to use for complex structures; saves memory space; better choice for quick, less intensive parsing and processing; incremental, so it can be used for larger files.
  • 164. DOM In-memory tree representation; lets you move back and forth, up and down; easy to use and has a clean interface; memory intensive for larger XML documents; better choice for complex XML structures; best suited to smaller files, because it is memory intensive.
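The two approaches can be compared side by side with Python's standard library: xml.sax for the event-based style and xml.dom.minidom for the tree style. This is a minimal sketch. The class and function names are made up for the example, and the catalog is a shortened version of the CD catalog shown earlier.

```python
import xml.sax
from xml.dom import minidom

CATALOG = """<CATALOG>
  <CD><TITLE>Empire Burlesque</TITLE><ARTIST>Bob Dylan</ARTIST></CD>
  <CD><TITLE>Hide your heart</TITLE><ARTIST>Bonnie Tyler</ARTIST></CD>
</CATALOG>"""


class TitleHandler(xml.sax.ContentHandler):
    """SAX: react to events as the parser streams through the document."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "TITLE":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so collect it until the end tag.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "TITLE":
            self.titles.append("".join(self._buffer))
            self._in_title = False


def titles_with_sax(text):
    handler = TitleHandler()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.titles


def titles_with_dom(text):
    """DOM: build the whole tree in memory, then navigate it freely."""
    document = minidom.parseString(text)
    return [node.firstChild.data
            for node in document.getElementsByTagName("TITLE")]


print(titles_with_sax(CATALOG))
print(titles_with_dom(CATALOG))
```

The SAX version keeps only the current title in memory, while the DOM version holds the full tree, which is the memory trade-off described on the two slides.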