Thanks for the many responses I've had to my post yesterday concerning Canada - some very good thoughts and suggestions, including the primary things one needs to remember when in Canada:
"Moving to canada is no big deal, as long as you learn that re is better than er and eh is a verb, noun, and adjective all in one."
The political statement out of the way, it's time I got back to why I write this column in the first place ... GOSSIP! Er..., no, XML. Same difference.
The XML Bertie Bott's
Watching the W3C at work is very much like shaking out Bertie Bott's Jelly Beans (the kind from Harry Potter that my 11-year-old daughter so loves torturing her old father with). Sometimes the specs released by the W3C are very good - a vanilla or taffy flavored jelly bean that's surprisingly delicious. Sometimes the specs are more like "vomit" or "dirt", though with annotations.
Over the last couple of weeks, I'd have to say there has been a delightfully high number of "blueberry delight" and "banana split" beans, and the worst to come out was still on the order of "grass" - edible, just not all that exciting. We're past the OWL specifications (there's another Harry Potter column coming on if you're not very careful), which quite frankly read as if Noam Chomsky had just engaged in a serious argument with Richard Feynman. We then had a drought where for a while all that was brewing were small specs like Accessibility, important in their own right but like listening to a lecture about going on dates from your spinster great aunt. Just when I was beginning to despair that the entire friggin' box was filled with "library paste" jelly beans, along came a bunch of very tasty treats indeed.
- 9 November 2004: XML Binary Characterization Use Cases
- 9 November 2004: xml:id Version 1.0 - Last Call Ends 13 December 2004
- 8 November 2004: Semantic Interpretation for Speech Recognition - Last Call Ends 5 December 2004
- 5 November 2004: XSL Transformations (XSLT) Version 2.0
- 2 November 2004: Assigning Media Types to Binary Data in XML - Last Call Ends 24 November 2004
- 1 November 2004: Timed Text (TT) Authoring Format 1.0 – Distribution Format Exchange Profile (DFXP)
- 29 October 2004: Pronunciation Lexicon Specification (PLS) Version 1.0 Requirements
- 29 October 2004: XQuery 1.0 and XPath 2.0 Data Model
- 29 October 2004: XQuery 1.0 and XPath 2.0 Functions and Operators
- 29 October 2004: XML Path Language (XPath) 2.0
- 29 October 2004: XQuery 1.0: An XML Query Language
- 29 October 2004: XSLT 2.0 and XQuery 1.0 Serialization
- 27 October 2004: Scalable Vector Graphics (SVG) 1.2 - Last Call Ends 24 November 2004
Even more is hidden in the details here. For instance, I note that in the XSLT Serialization specification, there's not one editor but three: Michael Kay (Saxonica), Norman Walsh (Sun), Henry Zongaro (IBM). A lot of my job entails reading between the lines and trying to understand the significance, but to me this is a pretty strong indication that both Sun and IBM are watching XSLT2 VERY carefully; you don't put a powerhouse hitter like Norman Walsh (the author of the DocBook specification) on a specification unless you think it'll be important.
That you'd need a specification just for the serialization aspect of XSLT2 may seem a little odd as well, until you understand that one of the key new features in XSLT2 is the <xsl:result-document> element. This makes it possible for an XSLT2 stylesheet to generate more than one output document. The most obvious use of this capability is to generate secondary XML documents, but much like input and output streams this is just one of many applications. Generation of SOAP messages, creation of non-XML source code from XML documentation files (especially with many of the other new features in that specification), and rerouting of messages to databases all fall within the province of this new feature. Combine that with the elimination of the result tree fragment such that intermediate XML can be created within a transformation and manipulated directly (no more node-set() function!), regular expression support, and the ability to import text, and XSLT2 is beginning to look pretty damn brawny.
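XSLT2 itself is the natural place to see this, but the one-input, many-outputs idea can be sketched in a few lines of Python. This is only an analogy under stated assumptions, not an XSLT processor: the sample book, the element names, and the split_into_documents function are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: one book holding several chapters.
SOURCE = """<book>
  <chapter id="intro"><title>Introduction</title></chapter>
  <chapter id="usage"><title>Usage</title></chapter>
</book>"""


def split_into_documents(source_xml):
    """Return {output name: serialized XML}, one entry per chapter.

    Each dictionary entry stands in for one result document that a
    single transformation run would write out.
    """
    root = ET.fromstring(source_xml)
    outputs = {}
    for chapter in root.findall("chapter"):
        name = chapter.get("id") + ".xml"
        page = ET.Element("page")
        heading = ET.SubElement(page, "h1")
        heading.text = chapter.findtext("title")
        outputs[name] = ET.tostring(page, encoding="unicode")
    return outputs


documents = split_into_documents(SOURCE)
for name, text in documents.items():
    print(name, text)
```

A real stylesheet would name each output through an attribute on the instruction rather than a dictionary key; the dictionary just keeps the sketch free of file system side effects.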
Serialization gives you the ability to do things such as setting the output encoding of a document, making it much easier to handle Unicode UTF-16 encoding. The specification also introduces character mappings, which give you a way of avoiding the need to incorporate DTDs within source code (or the transformations themselves) in order to define entities. While there are some features of DTDs which continue to be useful, XML is slowly losing its reliance upon them.
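To make those two ideas concrete, here is a rough Python imitation of what a serializer does with an encoding choice and a character map. The CHARACTER_MAP table and the serialize function are hypothetical stand-ins; in XSLT2 the map would be declared in the stylesheet itself, and this sketch assumes the map is applied after ordinary XML escaping.

```python
import xml.etree.ElementTree as ET

# Hypothetical character map: output character -> replacement string.
CHARACTER_MAP = {
    "\u00a0": "&nbsp;",  # no-break space
    "\u00a9": "&copy;",  # copyright sign
}


def serialize(element, encoding="utf-16", character_map=None):
    """Serialize to bytes, applying the map after normal XML escaping."""
    text = ET.tostring(element, encoding="unicode")
    if character_map:
        # Mapped characters are swapped for their replacement strings verbatim.
        text = text.translate(str.maketrans(character_map))
    declaration = '<?xml version="1.0" encoding="%s"?>\n' % encoding
    return (declaration + text).encode(encoding)


note = ET.Element("note")
note.text = "\u00a9 2004\u00a0W3C & friends"
data = serialize(note, "utf-16", CHARACTER_MAP)
print(data.decode("utf-16"))
```

Note that the ampersand in the text is still escaped as usual, while the mapped characters come out as entity references with no DTD in the source. The output is only useful to a consumer that already knows what those entities mean, such as an HTML browser.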
The XPath 2.0 specification is similarly an important document, and the news on this one was perhaps not quite as sweet as I'd hoped. XPath is a major specification because it underlies not only XSLT but also XQuery and XForms, and SVG's sXBL may have some dependencies on it in its next draft (see below). The previous draft was labelled Last Call, which is usually considered one step before a Candidate Recommendation -- it is an opportunity to put the specification in front of people and get feedback. There was a LOT of feedback this time around, enough that after some consideration, XPath has been taken out of Last Call status and is once again strictly a working draft.
The issues that have necessitated this are fairly broad, due in part to the need to specify the character model more precisely (part of the serialization above) and in part to the need to rectify some of the thornier issues with type conversions. Another common complaint (one that I personally agree with) concerns the huge number of date-time functions, many of which feel redundant or readily derivable, especially given the extension mechanisms that exist in both XQuery and XSLT. There is also some indication that one of the original goals of XPath 2 - providing data-aware type objects - has receded considerably as people have thought more about what specifically the specification was intended to accomplish. Finally, collations, which affect string order, have been put on the table as things which need to be more clearly defined before the specification can be published.
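If the collation point seems abstract, a tiny Python sketch shows why string order needs pinning down. The word list is made up, and the "caseless" ordering here is only a crude stand-in for a real collation, which would typically be language-aware.

```python
words = ["apple", "Zebra", "banana", "Cherry"]

# Code point order: every uppercase ASCII letter sorts before any lowercase one.
codepoint_order = sorted(words)

# A crude, hypothetical "case-insensitive collation": compare casefolded keys.
caseless_order = sorted(words, key=str.casefold)

print(codepoint_order)  # ['Cherry', 'Zebra', 'apple', 'banana']
print(caseless_order)   # ['apple', 'banana', 'Cherry', 'Zebra']
```

Same strings, two defensible orders, which is why a sort or comparison has to say which rules it is using.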
Regardless, this means that most of the specifications involved will likely be in limbo for at least another six months. The one positive aspect is that the specifications are not likely to change considerably, so as long as you stay clear of the more problematic functionality, you can actually use the beta XSLT2 processors that are beginning to surface with some reliability.
SVG 1.2 Goes Into Last Call
After the last paragraph, I'll be cautious about saying too much about this one, though because it is for me a little closer to home I've been watching this particular battle rage for the last couple of weeks. SVG 1.2 is, to put it bluntly, what SVG should have been four years ago:
- SVG 1.2 defines a mechanism for dealing with wrapping text in paragraphs, and does so in a spectacular fashion by making it possible to flow text through irregular shapes, and from one shape to another.
- SVG 1.2 defines video and audio tags for creating robust multimedia.
- SVG 1.2 includes a binding mechanism called sXBL (SVG XML Binding Language) that makes it possible to use XML to build complex components.
- SVG 1.2 includes an editable attribute to make it possible to change text on the fly.
- SVG 1.2 incorporates a mechanism to create filter vector effects.
- SVG 1.2 includes sockets and HTTP support for building distributed applications.
The SVG 1.1 specification was revisionary - it clarified a few features that were not completely specified in 1.0. SVG 1.2, on the other hand, is meant more as an addendum, shuffling new chapters into the thing that we call SVG. It is also a very controversial specification, as SVG is now beginning to seriously encroach into other areas such as HTML and XSL-FO. This is not so much true in terms of the specifications - HTML is a logical description of a web page while SVG is a lower level graphics description language - but when you combine SVG with sXBL you have what amounts to a mechanism by which you can use SVG to render that HTML in a pretty fair representation of a web page. In this way what SVG serves to do is replace not so much the specifications, but at least SOME functionality of the browsers themselves.
X.Org Does SVG
This has some very interesting implications, especially in conjunction with what is happening in Linux. A year ago, the XFree86 organization, the group responsible for the X graphical system on Unix and later Linux, split over changes, desired by certain members, that made the license incompatible with the GPL. The GPL faction formed a new organization, X.org, and immediately set out to solve another problem with XFree86 - its glacially slow development pace. X is ancient technology - the first X implementation was created in 1984, twenty years ago, and although the framework was remarkably flexible there were several places in the underlying graphics model that were proving very constrictive.
X.org used this break as a chance to rebuild many of the more troublesome aspects of X. While this effort is still ongoing, already there are a number of new features in X which are making developers salivate:
- Moving from a 24 bit to a 32 bit color space - the RGBA space - to provide 8 bits (256 levels) of alpha channel support. This makes it possible to push transparency directly into the chips, whereas before transformation had to be done much higher up the stack at the application level, resulting in far poorer performance.
- Establishing much finer invalid region support into the graphics layer itself, resulting in much faster screen refreshes - an absolute must for animation.
- More tightly integrating the 2D and 3D engines used by Unix-based systems, a critical feature for the Java Glass window initiative (and X3D).
- Creating more robust events that can be captured at the X layer, rather than needing to be placed up in the application stack.
- Incorporating hooks for SVG integration (yeah!!).
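The arithmetic behind the 32-bit RGBA item in the list above is simple enough to sketch. What follows is an illustrative Python version of packing four 8-bit channels into one pixel and blending it over a background; the function names and the byte order (red in the high byte) are my own assumptions, not anything taken from X.org's code.

```python
def pack_rgba(r, g, b, a):
    """Pack four 8-bit channels into one 32-bit integer (red in the high byte)."""
    for channel in (r, g, b, a):
        if not 0 <= channel <= 255:
            raise ValueError("channels must be in 0..255")
    return (r << 24) | (g << 16) | (b << 8) | a


def unpack_rgba(pixel):
    """Split a 32-bit pixel back into its (r, g, b, a) channels."""
    return ((pixel >> 24) & 0xFF, (pixel >> 16) & 0xFF,
            (pixel >> 8) & 0xFF, pixel & 0xFF)


def blend_over(src_pixel, dst_rgb):
    """Composite an RGBA source over an opaque RGB background.

    Integer arithmetic with rounding: alpha 255 means fully opaque.
    """
    r, g, b, a = unpack_rgba(src_pixel)
    return tuple((s * a + d * (255 - a) + 127) // 255
                 for s, d in zip((r, g, b), dst_rgb))


half_red = pack_rgba(255, 0, 0, 128)
print(hex(half_red))                      # 0xff000080
print(blend_over(half_red, (0, 0, 255)))  # (128, 0, 127)
```

Doing this multiply-and-add per channel for every pixel in application code is exactly the work that gets cheaper when the graphics hardware handles it.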
New Compound Document Formats Group Formed
One of the more intriguing ideas behind many of the presentation oriented specifications is the concept that you can embed SVG and MathML code within XHTML, possibly combining it with XForms content or other related specifications. Some browsers are already doing this, but when theory hits reality, inconsistencies between models can cause more headaches than expected.
In mid-October, the W3C announced the formation of the Compound Document Formats Working Group (CDF) specifically to provide ways to smooth that integration. The formation of this group sends a signal that the W3C is shifting from the development of base technology specifications to the interchange between the various specs, something which has needed to happen for a while. I suspect that interoperability will be the name of the game for the next couple of years, especially once the big specifications like XSLT2 and XPath2 finally go golden.
There has been a tendency in the past for specifications to occasionally develop potentially competing and certainly conflicting aspects with other specifications issued by the W3C, meaning that if you wished to use one technology you might potentially be locked out of using another, even though both are conformant with the W3C process. By getting this resolved, user agents of the future will be considerably more integrated with all of the W3C specifications, and the notion that all of these objects can effectively play within the same DOM-space is particularly exciting (your HTML controls the MathML, which in turn controls the SVG, populates the XForm data, and passes the results up to a pipe using the XML DOM).
I've been busy on other projects and am beginning to wrap those up, freeing up more time for Metaphorical Web columns - so expect the activity to pick up again here. Is there something that your company or organization is doing in the XML space that you think is worth shouting about? If there is, contact me at firstname.lastname@example.org with the inside scoop. Until next time, enjoy!