October 24, 2004

Conferences and Google

The joy of being a consultant is that frequently you end up being very, very busy. The downside is that you frequently end up very, very busy, and the other things that you are working on tend to get short shrift. I have been crazy busy, which is good for the pocketbook but bad for things such as Metaphorical Web. I'm taking a brief break here, though, to catch up on the web and post what I've been up to the last couple of weeks.

Chris Sells' Applied XML Development Conference

I was invited to Chris Sells' Applied XML Development Conference, considered by many to be one of the best conferences for covering advanced XML techniques (the other being the Extreme XML conferences), to present a paper on XSLT2. Others will have blogged this conference much better than myself, but I do have a few notes to present as well.

Geek Dinner

Chris Sells put on a good conference, something that isn't always easy (as a habitual conference speaker, something analogous to a drug addict but with presentation tools as the opiate of choice, I should know). It was definitely a Microsoft-oriented event, and I do have to admit that I found the Microsoft propaganda at the show tedious and heavy-handed, but at the same time I knew going in that it was not likely to be an agnostic crowd.

The accommodations, the Skamania Lodge in Stevenson, Washington, were remote (about 40 minutes from Portland) but otherwise quite spectacular, showing off southern Washington State at its best with scenic vistas of mountains and the Columbia Gorge. The lodge was the epitome of rustic - I keep wondering whether there is a "rustic" design specialty out there, with classes in how to properly design fireplaces and log walls. I drove down with M. David Peterson, the developer of the Saxon.NET XSLT2 processor and my co-conspirator in putting together the presentation, and we spent the three-hour trip down riffing on the nature of meta-data and models.

We didn't go directly to the lodge, however, instead making our way to downtown Portland for the Geek's Dinner. This has apparently become a time-honored tradition at the Sells Cons, in which many of the attendees show up at a dinner hosted by Don Demsiak (DonXML) at an appropriate venue - in this case the food court at Lloyd Center Mall. While most of the Microsoft big-players didn't show up, just about everyone else did. It was fun; we got a chance to just chat and make jokes before going into the more formal venue of the lodge itself.

Once we got to the lodge and settled ourselves, I went down to the bar and had the privilege of having a beer with Tim Bray, the main architect of XML and the driving force right now of the Atom specification for syndication. While I've met Tim before, this was the first time that I had a chance to really talk with him, and we covered everything from SVG and XUL to the best micro-breweries on the West Coast (he recommends the Yaletown Brewing Company, just down the way from his offices in the Yaletown district of Vancouver). I had a lot of respect for Mr. Bray before I met him, and after this conference my respect has gone up immeasurably.

Blue Pill, Red Pill, Purple Pill

The conference itself got underway around nine with Tim giving the keynote address. Not unexpectedly, it was about blogs and blogging (and the exponential growth of same) but he did manage to get in more than a few digs about a number of standards that were in the works, spread his frustrations with Microsoft and the W3C evenly, and also indicated that the world was not just Microsoft. This was a challenge to subsequent speakers, many of whom were talking the MS line.

Chris Anderson, a key Avalon developer at Microsoft, spoke next about Everyone Hates XML, a play no doubt on "Everybody Loves Raymond" but one that oddly continued a theme that was both a little depressing and fairly antagonistic throughout the conference. Anderson's talk actually focused more on XAML and the design decisions used in its development. I'll have to confess that some of the examples that he used, such as

<button id="foo" value="Button">
  <solidcolor value="Blue"/>
</button>

(note, may not be completely valid, as I don't have the example handy)

rub me very much the wrong way, as they place stylistic attributes into a position where they create a potentially huge combinatorial set of elements; it's not validatable, and as a consequence requires a very complex runtime in the background to handle. This very quickly jumped into a free-for-all with the audience asking what exactly a tag was, and I noticed after a while that it was only the Microsoft regulars who were trying to push it into a non-issue. A prediction - the lack of XML validation will prove to be the undoing of XAML in the long run. The talk was otherwise quite interesting, however.

Patrick Cauldwell and Scott Hanselman presented Bringing Strongly Typed Business Objects to Legacy Financial Systems with XML Schema, a mouthful which nonetheless proved to be a useful talk. One of the central ideas in their work was the use of XML Schemas to generate Word documents describing a financial system, something which could in turn be used as the basis of legal documents. This is a very interesting idea, one that I've actually pushed before myself. To wit, a legal document serves two purposes -- defining the language (services) upon which two or more parties agree to specify the terms of a contract, and the penalties for failure to complete these services on either side. A schema in that regard can effectively perform at least one part of that process, stipulating the terms of the agreement. As a consequence, I suspect that over time, schema documents will begin to take on a quasi-legal status.
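
To make the schema-as-documentation idea a little more concrete, here is a minimal Python sketch. It is not Cauldwell and Hanselman's implementation (I don't have their code); the sample schema, its element names, and the describe_schema function are all made up for illustration. It simply reads a tiny XSD and turns each typed element declaration into a plain-English sentence, the sort of raw material a generated document might start from.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema, invented for this example only.
SAMPLE_XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Payment">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Amount" type="xs:decimal"/>
        <xs:element name="DueDate" type="xs:date"/>
        <xs:element name="Memo" type="xs:string" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def describe_schema(xsd_text):
    """Return one plain-English sentence per typed element declaration."""
    root = ET.fromstring(xsd_text)
    lines = []
    for el in root.iter("{%s}element" % XS):
        type_name = el.get("type")
        if type_name is None:
            # Container element with an inline type; its children are
            # described individually instead.
            continue
        required = el.get("minOccurs", "1") != "0"
        lines.append("%s is %s and must be of type %s." % (
            el.get("name"),
            "required" if required else "optional",
            type_name,
        ))
    return lines


if __name__ == "__main__":
    for line in describe_schema(SAMPLE_XSD):
        print(line)
```

A real generator would have to handle named types, choices, and nesting; the point here is only that the schema already carries the "terms" in machine-readable form.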

Don Box and I have sparred in the past, though I also took advantage of what he thought was a "safe" venue to make some observations about the WS-* initiatives, then in their infancy, that he felt (with some legitimacy) to be unfair. Don's grown more confident in his own role at Microsoft, and while I'm still not completely wild about the WS-* concepts I will concede that they have a certain utility. His talk WS-Why? went over many of these specifications, using a (somewhat strained) island metaphor to go over the latest Microsoft initiatives in this area. My biggest gripe came from his tendency to talk around the purposes of some of the initiatives without explaining what they are intended to be used for - I think even many who were ardent Microsoft proponents in the audience felt a little lost on this. I did note with amusement that UDDI had been relegated to the status of an "oops" kind of idea, one that had some lofty goals but bad implementation, with Don citing many of the same reasons that I've had for disliking that standard (the overbearing use of "business metaphors" for instance). For all that, the talk was a good one, and while I didn't get a chance to talk with him in detail, I do hope to do so in more favorable circumstances in the future.

Then there was the absolute, no question greatest talk of the entire conference: Using XML for Navy Missile Systems, by Whit Kemmey of the Department of Defense. Nuclear missiles. Many submarine shots. Discussions about coding practices aboard ultra-secure environments. The Hunt for Red October could only wish it had been this cool. Whit was very definitely a military programmer, very clean cut, the only one in the entire auditorium wearing a suit, but for all of that he had a wry and deadpan sense of humor that caught some of the brightest programming minds on the planet off-guard more than once. Some favorite quotes:
  • I'm a trenches guy. I'm not a vendor. This stuff isn't for sale.
  • Our software has never been used for its intended purpose, and hopefully never will be.
  • Localization is not a big problem for us.
The Navy apparently uses a customized Unix solution on hardware that is running about five years behind the curve, largely because of the need to stress-test everything to ensure that there are no unexpected gotchas (I'll leave it to you, gentle reader, to figure out the consequences of a system crash on a boatload of nuclear weapons). XML was used here to handle the generation and implementation of Standard Operating Procedures (SOPs) using LibXML as the central processor. There was a certain amount of astonishment among many of the Windows-oriented types at the way that it was being used, though the descriptions for actions made sense given the need to specify in great detail every single step of a process where something going wrong has such incredible consequences.

I was a little irritated by the tone used by more than a few of the commercial XML developers, a kind of mocking astonishment about the process of software development in the military, but having been in the Navy myself in the 1980s, I have to say that more than a few of these people could have afforded to spend some time in the service to understand what writing mission-critical software is REALLY about. When lives could be lost due to a software error, conscientiousness in coding is not only a nice-to-have but a necessity, a lesson that a few of those writing operating systems would be wise to consider.

Sam Ruby is an unheralded genius. Working for IBM, he also helps put together the Atom specification, and is involved with both the KDE and Gnome groups. He was also easily one of the most depressing speakers there, chastising the crowd in his talk XML is an Attractive Nuisance over the dangers inherent in many common web practices, from Unicode violations and errors to URL mis-encodings. These were all things that he's had to deal with when coding for Atom, but in the process of giving the talk he kept pointing out fundamental problems about the Internet itself. He did try to use the Matrix metaphor, though the relative disasters of the latter two movies meant that his demos didn't have quite the oomph I think he would have liked. I enjoyed the talk, and managed to identify many (though not all) of the erroneous examples he provided, but given his Matrix references I think more than a few of us were wondering not whether we should take the blue pill or the red pill, but who had a stash of The Purple Pill (Paxil, a commonly prescribed anti-depressant).

Daniel Cuzzolino (evangelist for Schematron) not surprisingly did a talk All About Schematron. For those of you unfamiliar with it, Schematron is a schema validation tool that uses XSLT and XPath to provide more sophisticated validation than can be handled with XML Schema Definition (XSD). Significantly, he brought up a number of features that he wanted to be able to incorporate into Schematron but couldn't, such as the use of regular expressions in his validation. After I gave my talk, we discussed Schematron in person in much more detail, with an eye toward using XSLT2 for handling the next generation of the application.
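
For readers who haven't seen rule-based validation, here is a rough Python sketch of the general idea behind Schematron: pick a set of context nodes with a path, then assert that some test holds for each one. This is not Schematron itself. Real Schematron uses full XPath and is typically compiled to XSLT, whereas this sketch leans on the limited XPath subset in Python's standard xml.etree.ElementTree, and the rules, messages, and sample document are all invented for illustration.

```python
import xml.etree.ElementTree as ET

# Each rule is (context path, test path relative to the context, message).
# A rule passes for a context node when the test path matches something.
# Hypothetical rules, made up for this example.
RULES = [
    (".//order", "item", "An order must contain at least one item."),
    (".//order", "customer[@id]", "An order must name a customer with an id."),
    (".//item", "price", "Every item needs a price."),
]

# Hypothetical document: the second order breaks two of the rules.
DOC = """
<orders>
  <order>
    <customer id="c1"/>
    <item><price>9.99</price></item>
  </order>
  <order>
    <customer/>
    <item/>
  </order>
</orders>
"""


def validate(xml_text, rules=RULES):
    """Return the messages of all failed assertions, in rule order."""
    root = ET.fromstring(xml_text)
    failures = []
    for context, test, message in rules:
        for node in root.findall(context):
            # find() returns None when nothing matches the test path.
            if node.find(test) is None:
                failures.append(message)
    return failures


if __name__ == "__main__":
    for message in validate(DOC):
        print(message)
```

Notice that the second rule checks a co-occurrence (a customer element that also carries an id), which is the kind of constraint that rule-based validation expresses directly.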

My talk was next, and I will be discussing it in much more detail in the next issue of this blog. I'd put it at a middling success - I had too much material and being immediately before dinner, the attendees were more than a little distracted. If you can avoid it, try not to compete with food - empty stomachs make for wandering minds.

The Case Against XSD

After dinner, the speakers all headed up to the front of the dining hall for a round-table, and almost out of the chute the topic was the inadequacy of the XML Schema Definition Language (XSD).

A little background is in order here. While most of the W3C standards have had a certain degree of controversy associated with them, far and away the most controversial of the specifications has been XSD. It was begun about the same time that the XML standard itself was in its earliest stages, but even given that it took more than five years of very contentious meetings to finally agree to it, with the most vocal proponents being the database vendors of Microsoft, Oracle, and IBM. The difficulty is not in the simple type definitions - how one defines primitives such as integers, floating point numbers, and so forth. Instead, the challenges have come in defining more complex types, such as elements with subordinate children in them. There is a lot of ambiguity in the specification, especially when it comes to these complex data types, and what makes matters worse is that the underlying data model of XSD favors a one tag = one object model that is increasingly being shown to be inaccurate in describing data models.

A subtle revolution is going on, and parts of it emerged in this conference. While there are many (especially those vendors) who want to declare the matter closed and have effectively turned a deaf ear to the plea for re-examining schemas, the most promising alternative candidate, Relax-NG (also given as RNG) has quietly been showing up in all sorts of interesting places. For instance, last year the SVG 1.2 specification was published using RNG and not XSD as the schema. Tools such as Oxygen are making RNG available as their primary validation scheme, and content management engineers, who have known for some time the deficiencies inherent in XSD, have been switching over DTDs to RNG and bypassing the XSD spec altogether.

Consequently, it was interesting to hear the number of people who began by saying that they didn't like XSD but would support it because it's already baked into so many other specifications. In the .NET world this is probably true - Microsoft has taken to schemas with a vengeance, perhaps too much so in some respects, even though there are certain core technologies such as XAML that would be much more accurately described (with less ambiguity) in RNG. The Relax-NG schema works at a little lower level than XSD does, and is more descriptive of contents; right now it is facing a VHS vs. Beta type struggle, but I don't think it's necessarily a good idea to write it off at this stage.

The rancor which this topic brought up raises a more subtle spectre that vendors especially should be mindful of - is it possible that people are beginning to react to XML technologies that they feel were forced down their throat by bypassing those XML structures in favor of more ad-hoc ones? There is little in the way of true empirical evidence, but there is a lot of anecdotal evidence that suggests this. SOAP has definitely achieved success in the hard business sphere, but is being adopted much less readily by many companies that are dealing with document content. WSDL (and the implied RPC model that it brings to the table) has likewise been successful in a much smaller niche than the proponents of these specs had hoped.

A word to such companies -- your customers are looking for the best solution to their dilemmas, not yours. While it is necessary to place a stake in the ground over a particular technology periodically, making the argument that it's too late to make changes will only make the foundation that these technologies are building just that much more fragile. It also may make your technologies out of synch with the rest of the world, and in a world that is becoming increasingly heterogeneous, this can result in some serious discontinuities to the bottom line when a critical mass of users of the alternative standard is reached.

Is RNG that much better than XSD? I'm still trying to ascertain that myself, though early indications are that it is a better schema language. RNG was designed from the bottom up to be a compelling language for defining schemas; XSD was designed from the top down as a wish-list for vendor support of certain features. Having written a couple of books on XSD over the years, I think I can speak with some authority on this -- it has some REAL problems.

I think that right now there IS a window of opportunity to make changes in the underlying schema language. SOAP and Web Services have not taken off as explosively as the pundits would have liked, in part because of the schema issue, and even the WS-* standards being promulgated by Microsoft are still largely in a prototype and deployment stage.

To me, the best solution, admittedly one of the more complex, is to separate schema from such things as the WS-* standards, making the specific mechanism for validating dependent upon a user-defined schema attribute. This approach is already being used with a lot of success in the XSLT space (where transformations beyond XSLT 1.0 can be used for processing just by changing an appropriate selectionNamespace attribute). Schemas, which in general should be much lighter weight than transformations, could easily adopt this approach.
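
A minimal sketch of what that separation might look like, in Python. Everything here is an assumption for illustration: the schemaLanguage attribute name is hypothetical (no standard defines it), and the validator functions are stubs that only report which language was chosen rather than doing any real validation. The two namespace URIs are the actual ones for XSD and Relax-NG.

```python
import xml.etree.ElementTree as ET

# Hypothetical attribute name; no specification defines this.
SCHEMA_LANG_ATTR = "schemaLanguage"

XSD_NS = "http://www.w3.org/2001/XMLSchema"
RELAXNG_NS = "http://relaxng.org/ns/structure/1.0"


def validate_with_xsd(root):
    # Stub: a real implementation would hand off to an XSD validator.
    return "xsd"


def validate_with_relaxng(root):
    # Stub: a real implementation would hand off to a Relax-NG validator.
    return "relaxng"


# Registry keyed by schema-language namespace. Supporting another schema
# language means adding an entry, not changing the message format.
VALIDATORS = {
    XSD_NS: validate_with_xsd,
    RELAXNG_NS: validate_with_relaxng,
}


def validate(xml_text):
    """Dispatch to whichever validator the document itself asks for."""
    root = ET.fromstring(xml_text)
    lang = root.get(SCHEMA_LANG_ATTR)
    validator = VALIDATORS.get(lang)
    if validator is None:
        raise ValueError("no validator registered for %r" % (lang,))
    return validator(root)


if __name__ == "__main__":
    message = '<msg schemaLanguage="%s"/>' % RELAXNG_NS
    print(validate(message))
```

The design choice is that the envelope stays agnostic: the message declares its schema language, and the receiver looks up a matching validator.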

There's a storm brewing there, and because it concerns something so fundamental - the language to be used for defining things - all parties should spend some serious time asking themselves whether the disinclination to change the default schema lies as much in a level of intellectual laziness as it does in worrying about cost. I would rather my children have the best schema language possible when their Internet comes along, rather than something that's a dark horse candidate proposed because everyone hates it equally.

Microsoft Propaganda

I enjoyed the controversies of the first day, even if they were more than a little contentious. That's what healthy scientific debate is all about, and at the level of this conference, I think the word scientific can be legitimately used. There were multiple viewpoints, and the most refreshing ones were the ones that weren't peddling the next API or service pack upgrade.

To me, the conference slid into propaganda on the second day. Doug Purdy presented a piece on versioning with web services that was intriguing, even though I have to admit that it was a little too cut-and-paste-ish for my tastes. If this had been all of his talk, I think I would have enjoyed it immensely.

What I didn't enjoy was the ten-minute "movie" thrown at us promoting Microsoft Development Services, based upon a take-off of the TV show Queer Eye for the Straight Guy, and portraying Linux and Java developers as idiots who would be far more productive if they would only drink the Microsoft Kool-Aid. To those of the faithful in the crowd, it was a funny piece, but to those of us who tend to straddle both worlds (and more, have moved away from Microsoft technology because of some of its inherent problems) the ad was insulting, and more, its placement within the Sells conference announced like nothing else that this was no longer a technology-neutral forum but was instead a wholly owned subsidiary of Microsoft. Until that point, I had been enjoying the conference a great deal, but the movie soured it for me. Can the PR, Microsoft! The people who went to this conference wanted ideas and the give and take of ideas for fixing problems within XML, not captive advertising.


On the other hand, one of my personal victories at this conference occurred when Neetu Rajpal, the Program Manager for Microsoft's XML program, was covering changes that would be introduced into VS.NET 2005 in the XML space. Much of it was welcome and overdue - better XML editing, a semi-decent XSLT editor (I still prefer OxygenXML or Stylus Studio, but what they had was a fair sight better than what currently passes for editors for these technologies in Visual Studio). Inferring schemas is also a nice feature, though again one that has already made its way into most commercial XML editors.

What was of course NOT in there was any mention of XSLT 2.0. Microsoft has chosen not to support XSLT 2.0, not now at least, and when a member of the audience (not me, I promise) asked when it would, Neetu said that it wasn't planned for development any time soon. My talk, on XSLT 2.0, apparently did affect more people than I'd thought, because another member of the audience stood up and asked for a show of hands: who there wanted to see XSLT 2.0 and who wanted to see XQuery 1.0. Half of the audience raised their hands for XSLT, maybe 35% raised them for XQuery. That there are a lot more people using the technology than Microsoft wanted to acknowledge may make the bizDev people think twice about not wanting to adopt it.

Amazing Amazon APIs

Jeff Barr, of Amazon, regaled the audience with the full range of Amazon web services APIs that are currently being rolled out for business developers; there are some very interesting things going on in this space. Through these APIs (currently available freely to anyone who signs up with the Amazon development program) it is possible to query Amazon listings and customer contents, and to provide certain post-back information for tracking purposes with sales, in both their publishing and general retail divisions. They've also managed to partner with Google in creating a generalized search framework that is similarly scriptable through web services, making it possible to build very elaborate web queries that can then be tied into applications, not just web pages. Finally, they now provide a way to create interface bindings on their pages to build branded store presences, something that I predict will be VERY big.

I have to admit, after looking at this, I felt a strong urge to start creating my own store-fronts (and actually will likely do that with Metaphorical Web Publishing, a venture I will discuss in a subsequent blog). It's cool technology.

Conference Wrap Up

I needed to get an early start back, so I missed a couple of the sessions. I have heard from other participants that a few of those proved to be just as provocative as the ones I was able to attend, and if I can I will try to attend the conference again next year. I'd prefer to see the Microsoft aspect downplayed -- yes, Chris Sells is now a Microsoft employee, but I think that this conference would have more credibility if it were less hostile to alternative viewpoints; additional sponsorship by Novell, IBM, Oracle, or the Linux center (which was located, ironically, less than an hour's drive from the conference) would also serve to make it more balanced and appealing to all XML developers, and may even lure a few of those hardy souls off in Mono land back into Microsoft's embrace.

Chris himself is a gentleman and a scholar, and he has a truly wonderful family of an age with my own. I wish him luck next year with the next Applied XML Developer's Conference, though this year will be hard to beat.

Google in Kirkland

I just learned this week that Google will be setting up a new development center in Kirkland, Washington, within a couple of miles of where I live. The idea of having a Google outpost in the area makes me giddy -- in addition to possibly applying for a position there myself (okay, admit it, who wouldn't want to work at Google right now?) I get to see the fireworks as yet another major Internet player moves into the Puget Sound.

There's an interesting confluence going on right now. With Google in the neighborhood, the Silicon Forest of the Pacific Northwest is beginning to take on a presence not unlike that of San Francisco in the 1990s. Portland, once a major hub for hardware manufacturers, has quietly been evolving into a Linux powerhouse, with Linus Torvalds's move there a few months ago adding the crown jewel for a region that is boasting a renewed IBM and Novell presence as well as bunches of Linux start-ups. Microsoft's influence is still strong in Seattle, but Amazon and now Google are going to change that dynamic considerably, with a shift away from desktop systems to the next generation Internet. Vancouver is becoming a mecca not only for games and movie production but also for UI technologies - several SVG companies (many also sporting XAML arms) are located between Yaletown and Belltown in Vancouver.

Makes me proud of my adopted home. Rain is good for developers.

Signing Off

Don Demsiak has called me the master of the long blog (thanks .. I think) but to keep this from turning into War and Peace, I'm going to wrap it up here. Expect things to pick up on Metaphorical Web, as both this conference and my latest development efforts have provided LOTS of fodder for me for the next few weeks. Until then, enjoy!

-- Kurt Cagle