Susan B. Farmer sfarmer at GOLDSWORD.COM
Sat Feb 12 11:23:36 CET 2000

>>>From the HTML Writers Guild Newsletter

----------     Begin Included Text

1.  XHTML:  Bridging the Gap between HTML and XML
    (Kynn Bartlett, HWG President, kynn at hwg.org)

On 26 January 2000, the World Wide Web Consortium released the newest
flavor of HTML -- XHTML.  The HTML Writers Guild is proud to have
played a role in the development of this specification, and we would
like to thank Ann Navarro and Frank Boumphrey for their dedication and
hard work in the W3C's HTML Working Group.

But what _is_ XHTML, and what does it mean to members of the Guild?


To understand that, first we need to review what HTML is.  Hypertext
Markup Language was created as an "application" of SGML (Standard
Generalized Markup Language), which is a "meta-language" for describing
other markup languages.

Extensible Hypertext Markup Language (XHTML) is simply a version of
HTML but defined as an application of XML (Extensible Markup Language).
XML was developed as another "meta-language", not as complex as SGML
but powerful enough for use in web environments.


Why an XML-based HTML, if we already have "normal" HTML?  By using
XHTML, we have access to the strengths of XML -- extensibility and
interoperability.  Developers can extend XHTML through modules, DTDs,
schemas, and incorporation of other XML-based languages.  XSLT
(Extensible Stylesheet Language Transformations), which allows XML
documents to be easily changed from one representation to another,
will make it easy to adapt XHTML pages for a wide variety of browsers
and user agents.


If you're already familiar with HTML 4.01, you'll find that XHTML
1.0 offers the same features and abilities you've always had.  (Future
versions of XHTML will expand beyond simply "translating" the HTML
specification into XML.)

There are some differences in syntax, however; the most notable ones are:

    * Documents must be "well formed" -- an XML term meaning that
      tags must match and be nested properly
    * Element and attribute names must be in lower case
    * For non-empty elements, end tags are required -- no more
      optional </P> or </LI> tags
    * Attribute values must always be enclosed in quotes
    * Empty elements must either have an end tag, or the start tag
      must end with />
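
As a rough illustration of the rules above (not part of the original
newsletter), here is a minimal Python sketch that runs a few fragments
through the standard library's XML parser.  The helper name
is_well_formed and the sample fragments are made up for this example.
Note that the parser only checks well-formedness; the lower-case rule
comes from the XHTML DTD and is not tested here.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as well-formed XML."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# HTML-style markup: unquoted attribute value, no </li>, bare <br>.
html_style = "<ul class=menu><li>One<li>Two<br></ul>"

# The same list following the XHTML rules listed above.
xhtml_style = '<ul class="menu"><li>One</li><li>Two<br /></li></ul>'

# Tags that match but are not nested properly.
badly_nested = "<p><b>bold</p></b>"

for name, fragment in [
    ("html_style", html_style),
    ("xhtml_style", xhtml_style),
    ("badly_nested", badly_nested),
]:
    print(name, is_well_formed(fragment))
```

Only the XHTML-style fragment should pass; the other two break the
quoting, end-tag, and nesting rules respectively.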

If you're like at least half of the web designers out there, the
second bullet point above might stand out, and you may be asking why.


In HTML, tags and attributes are not case sensitive.  This means
that <BODY>, <body>, <Body>, and even <bOdY> refer to the same thing;
you can have your opening tag be <body> and your closing tag </BODY>,
and it all works fine.

XML, the language used to build XHTML, is case sensitive.  <Body>
is considered a different tag from <BODY>, <body>, or <bOdY>, in XML.
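
To see the case sensitivity in practice, here is a small hedged sketch
using Python's standard XML parser (an illustration only; the helper
name parses and the sample strings are assumptions for this example).
An opening <body> closed by </BODY> is rejected, and <BODY> and <body>
come back as two different element names.

```python
import xml.etree.ElementTree as ET


def parses(fragment: str) -> bool:
    """Return True if the XML parser accepts the fragment."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Fine in HTML, but in XML the end tag does not match the start tag.
mixed_case = "<body><p>Hello</p></BODY>"
same_case = "<body><p>Hello</p></body>"

print(parses(mixed_case))
print(parses(same_case))

# Each spelling is well formed on its own, but the names differ.
upper = ET.fromstring("<BODY></BODY>")
lower = ET.fromstring("<body></body>")
print(upper.tag == lower.tag)
```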

Because of this, a standard for XHTML tags had to be set -- and the
most likely choices were "all uppercase" (such as <BODY>), or "all
lowercase" (such as <body>).  In the end, lowercase won out, although
it really could have gone either way; a semi-arbitrary choice had to
be made.


No and yes.  Current browser releases do not speak XHTML; they speak
HTML.  However, it is possible to write XHTML in a way that is
compatible with HTML and will not break on existing HTML-based
browsers.  The guidelines for doing so are described in Appendix C of
the XHTML specification.

As an example, the "hr" (horizontal rule) tag must be closed, and
would normally be written in XHTML as either <hr/> or <hr></hr>.
However, this will confuse many existing HTML-based browsers; instead,
you can write the tag as <hr /> -- note the space.  This is legitimate
XHTML and is compatible with HTML browsers.
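
As a quick check that the three spellings mean the same thing to an XML
parser, here is a minimal Python sketch (an illustration under the
assumption that Python's standard parser is a fair stand-in for an XML
processor; the variable names are made up).  It says nothing about how
any particular HTML browser handles them.

```python
import xml.etree.ElementTree as ET

# The three ways of writing an empty "hr" element mentioned above.
spellings = ["<hr/>", "<hr></hr>", "<hr />"]

parsed = [ET.fromstring(s) for s in spellings]

# Reduce each element to (name, text content, number of children).
summary = [(e.tag, e.text or "", len(e)) for e in parsed]
all_equivalent = all(item == ("hr", "", 0) for item in summary)
print(all_equivalent)


def parses(fragment: str) -> bool:
    """Return True if the XML parser accepts the fragment."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# The HTML habit of leaving <hr> unclosed is not well formed.
print(parses("<div><hr></div>"))
```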


The complete spec for XHTML 1.0 is available via the web from the W3C's
web site:


For a chance to experience XHTML and XML first-hand -- or perhaps just
view the source of some web pages written in XHTML -- read on to the
next article!

---------  End Included Text


Susan Farmer
sfarmer at goldsword.com
Botany Department, University of Tennessee
