XML
Published:
This lesson covers XML
XML
<?xml version="1.0" encoding="UTF-8"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
- XML stands for eXtensible Markup Language.
- XML was designed to store and transport data.
- XML was designed to be both human- and machine-readable.
- XML is often used for distributing data over the Internet.
<?xml version="1.0" encoding="UTF-8"?>
<breakfast_menu>
<food>
<name>Belgian Waffles</name>
<price>$5.95</price>
<description>
Two of our famous Belgian Waffles with plenty of real maple syrup
</description>
<calories>650</calories>
</food>
<food>
<name>Strawberry Belgian Waffles</name>
<price>$7.95</price>
<description>
Light Belgian waffles covered with strawberries and whipped cream
</description>
<calories>900</calories>
</food>
<food>
<name>Berry-Berry Belgian Waffles</name>
<price>$8.95</price>
<description>
Belgian waffles covered with assorted fresh berries and whipped cream
</description>
<calories>900</calories>
</food>
<food>
<name>French Toast</name>
<price>$4.50</price>
<description>
Thick slices made from our homemade sourdough bread
</description>
<calories>600</calories>
</food>
<food>
<name>Homestyle Breakfast</name>
<price>$6.95</price>
<description>
Two eggs, bacon or sausage, toast, and our ever-popular hash browns
</description>
<calories>950</calories>
</food>
</breakfast_menu>
XML vs HTML
- XML was designed to carry data - with focus on what data is
- HTML was designed to display data - with focus on how data looks
- XML tags are not predefined like HTML tags are
- XML Does Not Use Predefined Tags
- The XML language has no predefined tags.
- The tags in the example above (like
and ) are not defined in any XML standard. These tags are "invented" by the author of the XML document.
- HTML works with predefined tags like <p>, <h1>, <table>, etc.
- With XML, the author must define both the tags and the document structure.
- XML is Extensible
- Most XML applications will work as expected even if new data is added (or removed).
- Imagine an application designed to display the original version of note.xml (
<body>). - Then imagine a newer version of note.xml with added
and elements, and a removed . - The way XML is constructed, older version of the application can still work
- XML Simplifies Things
- It simplifies data sharing
- It simplifies data transport
- It simplifies platform changes
- It simplifies data availability
- Many computer systems contain data in incompatible formats. Exchanging data between incompatible systems (or upgraded systems) is a time-consuming task for web developers. Large amounts of data must be converted, and incompatible data is often lost.
- XML stores data in plain text format. This provides a software- and hardware-independent way of storing, transporting, and sharing data.
- XML also makes it easier to expand or upgrade to new operating systems, new applications, or new browsers, without losing data.
- With XML, data can be available to all kinds of “reading machines” like people, computers, voice machines, news feeds, etc.
XML became a W3C Recommendation as early as in February 1998
XML Uses
- XML Separates Data from Presentation
- XML does not carry any information about how to be displayed.
- The same XML data can be used in many different presentation scenarios.
- Because of this, with XML, there is a full separation between data and presentation.
- In many HTML applications, XML is used to store or transport data, while HTML is used to format and display the same data.
- When displaying data in HTML, you should not have to edit the HTML file when the data changes.
- With XML, the data can be stored in separate XML files.
- With a few lines of JavaScript code, you can read an XML file and update the data content of any HTML page.
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
<book category="cooking">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="children">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="web">
<title lang="en">XQuery Kick Start</title>
<author>James McGovern</author>
<author>Per Bothner</author>
<author>Kurt Cagle</author>
<author>James Linn</author>
<author>Vaidyanathan Nagarajan</author>
<year>2003</year>
<price>49.99</price>
</book>
<book category="web" cover="paperback">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>
Example
XML News
<?xml version="1.0" encoding="UTF-8"?>
<nitf>
<head>
<title>Colombia Earthquake</title>
</head>
<body>
<headline>
<hl1>143 Dead in Colombia Earthquake</hl1>
</headline>
<byline>
<bytag>By Jared Kotler, Associated Press Writer</bytag>
</byline>
<dateline>
<location>Bogota, Colombia</location>
<date>Monday January 25 1999 7:28 ET</date>
</dateline>
</body>
</nitf>
XML Weather Service
<?xml version="1.0" encoding="UTF-8"?>
<current_observation>
<credit>NOAA's National Weather Service</credit>
<credit_URL>http://weather.gov/</credit_URL>
<image>
<url>http://weather.gov/images/xml_logo.gif</url>
<title>NOAA's National Weather Service</title>
<link>http://weather.gov</link>
</image>
<location>New York/John F. Kennedy Intl Airport, NY</location>
<station_id>KJFK</station_id>
<latitude>40.66</latitude>
<longitude>-73.78</longitude>
<observation_time_rfc822>Mon, 11 Feb 2008 06:51:00 -0500 EST
</observation_time_rfc822>
<weather>A Few Clouds</weather>
<temp_f>11</temp_f>
<temp_c>-12</temp_c>
<relative_humidity>36</relative_humidity>
<wind_dir>West</wind_dir>
<wind_degrees>280</wind_degrees>
<wind_mph>18.4</wind_mph>
<wind_gust_mph>29</wind_gust_mph>
<pressure_mb>1023.6</pressure_mb>
<pressure_in>30.23</pressure_in>
<dewpoint_f>-11</dewpoint_f>
<dewpoint_c>-24</dewpoint_c>
<windchill_f>-7</windchill_f>
<windchill_c>-22</windchill_c>
<visibility_mi>10.00</visibility_mi>
<icon_url_base>http://weather.gov/weather/images/fcicons/</icon_url_base>
<icon_url_name>nfew.jpg</icon_url_name>
<disclaimer_url>http://weather.gov/disclaimer.html</disclaimer_url>
<copyright_url>http://weather.gov/disclaimer.html</copyright_url>
</current_observation>
XML Tree
XML documents are formed as element trees.
An XML tree starts at a root element and branches from the root to child elements.
All elements can have sub elements (child elements)
The terms parent, child, and sibling are used to describe the relationships between elements.
Parents have children. Children have parents. Siblings are children on the same level (brothers and sisters).
All elements can have text content (Harry Potter) and attributes (
or ). <root> <child> <subchild>.....</subchild> </child> </root>
The XML Prolog
The XML prolog is optional. If it exists, it must come first in the document.
XML documents can contain international characters, like Norwegian øæå or French êèé.
To avoid errors, you should specify the encoding used, or save your XML files as UTF-8.
UTF-8 is the default character encoding for XML documents.
<?xml version="1.0" encoding="UTF-8**"**?>
All elements must have a closing tag
XML tags are case sensitive. The tag
is different from the tag . XML Elements Must be Properly Nested
XML Attribute Values Must Always be Quoted
<note date="12/11/2007"> <to>Tove</to> <from>Jani</from> </note>
replace the “<” character with an entity reference
Only < and & are strictly illegal in XML, but it is a good habit to replace > with > as well.
<message>salary < 1000</message>
Comments in XML
<!-- This is a comment -->
White-space is Preserved in XML
- XML does not truncate multiple white-spaces (HTML truncates multiple white-spaces to one single white-space):
- XML: Hello World
- XML does not truncate multiple white-spaces (HTML truncates multiple white-spaces to one single white-space):
XML Stores New Line as LF
Windows applications store a new line as: carriage return and line feed (CR+LF).
Unix and Mac OSX use LF.
Old Mac systems use CR.
XML stores a new line as LF.
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
<book category="cooking">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="children">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="web">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>
![]() |
---|
https://www.w3schools.com/xml/nodetree.gif |
XML Element
An XML element is everything from (including) the element’s start tag to (including) the element’s end tag.
<price>29.99</price>
An element can contain:
- text
- attributes
- other elements
- or a mix of the above
<bookstore> <book category="children"> <title>Harry Potter</title> <author>J K. Rowling</author> <year>2005</year> <price>29.99</price> </book> <book category="web"> <title>Learning XML</title> <author>Erik T. Ray</author> <year>2003</year> <price>39.95</price> </book> </bookstore>
Text content:
, , , and Element contents:
and Attribute:
has an **attribute** (category="children") Empty XML Elements
<element></element> <element />
- Empty elements can have attributes
XML Naming Rules
- Element names are case-sensitive
- Element names must start with a letter or underscore
- Element names cannot start with the letters xml (or XML, or Xml, etc)
- Element names can contain letters, digits, hyphens, underscores, and periods
- Element names cannot contain spaces
- Any name can be used, no words are reserved (except xml).
XML Elements are Extensible
<note> <to>Tove</to> <from>Jani</from> <body>Don't forget me this weekend!</body> </note>
can be improved later without breaking existing applications
<note> <date>2008-01-10</date> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>
XML Attributes
XML elements can have attributes, just like HTML.
Attributes are designed to contain data related to a specific element.
XML Attributes Must be Quoted
Attribute values must always be quoted. Either single or double quotes can be used.
<person gender="female"> <person gender='male'> <gangster name='George "Shotgun" Ziegler'> <gangster name="George "Shotgun" Ziegler">
XML Elements vs. Attributes
<note date="2008-01-10"> <to>Tove</to> <from>Jani</from> </note> <note> <date>2008-01-10</date> <to>Tove</to> <from>Jani</from> </note> <note> <date> <year>2008</year> <month>01</month> <day>10</day> </date> <to>Tove</to> <from>Jani</from> </note>
Avoid XML Attributes?
- attributes cannot contain multiple values (elements can)
- attributes cannot contain tree structures (elements can)
- attributes are not easily expandable (for future changes)
- Incorrect
-
XML Attributes for Metadata
Sometimes ID references are assigned to elements. These IDs can be used to identify XML elements in much the same way as the id attribute in HTML
metadata should be stored as attributes, and the data itself should be stored as elements.
<messages> <note id="501"> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note> <note id="502"> <to>Jani</to> <from>Tove</from> <heading>Re: Reminder</heading> <body>I will not</body> </note> </messages>