November 28, 2004

Firefox: Why Microsoft Should Be Worried

I inadvertently posted this earlier while it was still under development. Sorry for the confusion.

Firefox turned 1.0 last week, and in the process managed to hit 1,000,000 downloads in one day. Put that into perspective - Firefox is in the neighborhood of 5 MB, which means that the mozilla.org servers had something in the neighborhood of 5 terabytes of data streaming over their pipes. From a pure networking standpoint, that's pretty amazing, not to mention the indication about how major a release Firefox has become.
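
As a back-of-the-envelope check on those figures, the sketch below multiplies them out and also spreads the total evenly across the day. The even spread is a simplifying assumption (real download traffic is far burstier), and decimal units (1 MB = 10^6 bytes) are assumed throughout.

```javascript
// Rough figures from the paragraph above; both are approximations.
var downloads = 1000000;          // downloads in one day
var installerBytes = 5 * 1e6;     // ~5 MB installer, decimal megabytes

var totalBytes = downloads * installerBytes;
var totalTerabytes = totalBytes / 1e12;

// Assumption: traffic spread evenly over 24 hours.
var secondsPerDay = 24 * 60 * 60;
var averageMegabitsPerSecond = (totalBytes * 8) / secondsPerDay / 1e6;

console.log(totalTerabytes + " TB served in the day");
console.log(Math.round(averageMegabitsPerSecond) + " Mbit/s sustained on average");
```

Even averaged out, that works out to a few hundred megabits per second around the clock, and the peaks would have been a good deal higher.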

A few Netcraft statistics are perhaps just as revealing. Netcraft measures browser usage on the web, and according to its data, Firefox has managed to capture about 4% of the browser share just since October, when the 1.0 preliminary version was released. Given that the movement of any given browser usually tends to be in the neighborhood of tenths of a point from month to month, this jump was phenomenal. Firefox and Mozilla combined now occupy roughly 7% of the browser market, most of it at the expense of Microsoft's Internet Explorer. IE dropped below 90% of the market for the first time in several years.

I've seen a number of articles on the web asking whether Microsoft should be worried about the rise of Firefox, especially given its own market share of around 90%. After all, ASP.NET increasingly puts the orientation of web pages on the server regardless of which browser is aimed at it, a server-centric philosophy that seems to be consistent with the stance that the company took after moving away from a rich client model in the late 1990s.

Personally, I would contend that Microsoft does need to worry ... a great deal. Internet Explorer is more than just a browser ... it is a critical piece of infrastructure that is used in any number of applications, including applications that don't necessarily talk to the web. The ability to create dynamic interfaces is not something to take lightly, as such systems are far easier to update, more readily customizable than precompiled binaries, and are often simpler to write applications around, for the vast majority of all such applications. Given that a significant proportion of such applications are written not for the home user but for the enterprise, Internet Explorer may in fact anchor Windows in businesses even more than Microsoft Office.

Given that, Firefox represents a significant threat to Microsoft. I have been working with Firefox and XUL for roughly three months now, building a number of tools for a content management system including a customized WYSIWYG XML editor and a versioning system monitor. With some work, I've managed to make these tools work across Windows, the Macintosh and Linux, using a combination of Mozilla's XUL, Javascript, and XSLT. The editor, as one example, overrides the Firefox menu, making it possible for me to actually piggyback on top of a user's version of Firefox to get the functionality that I need.

I am writing this blog using the editor I built on the Firefox XUL library and API, with the editor actually running as a tabbed pane within the browser itself. While it is certainly possible (and I'll discuss in more detail how it can be done) to set up such core functions as cut, copy and paste, text searching, undo and redo, and so forth through Javascript code, by building on top of the web browser itself I was able to effectively get all of this for free, leaving me with more time to implement functionality specific to the company's requirements. Perhaps the closest analogy I can think of as to the power of this would be as if you had access to the source code for Internet Explorer, could make changes to the interface using XML and Javascript, and could then run it on any platform without complex recompilation. The XAML model comes closest, but XAML is also still at least two years out, and it's unlikely that you'll actually get a chance to manipulate (or even see) the source code for the XAML rendering engine.

Yet for all this, perhaps the most intriguing aspect of Firefox is its ability to integrate multiple extensions. It's worth considering that most of Firefox is in fact an extension of some sort - some extensions are just bundled more tightly with the original package. Third party extensions exist to do everything from translating or speaking selected text to showing the weather for the next few days. Some, such as the Web Developer's Toolkit, can actually work very nicely with an editor to show the dimensions and paths of images, outline the boundaries of tables and divs, and activate or deactivate Javascript and Java components on the fly. These extensions can in fact be utilized in conjunction with your own applications as well -- I use a number of them with the editing suite I've developed, again letting me concentrate on the relevant business logic on my end rather than trying to reimplement everything from scratch. This capability will also increase considerably by the middle of next year, when SVG and XForms are integrated into the mix - making it possible to generate rich, intelligent forms and interactive multimedia using SVG, XBL bindings and data-aware form components.

The Anatomy of Mozilla

I've been rather blithely throwing around terms and acronyms here that may be familiar to the XUL developers among you but may otherwise be somewhat mysterious to the rest of you. Consequently, digging into the innards of Mozilla may both end up explaining some of this and giving you a better understanding of what exactly applications such as Firefox can do.

Conceptually, Mozilla (and by extension Firefox and Thunderbird) can be broken down into two parts: Gecko and SeaMonkey. Gecko is a set of core objects, written primarily in C++, that handle the detailed rendering and memory management of web-based applications. Gecko is perhaps the oldest part of the Mozilla project, started from scratch to better perform the drawing of web pages than the older Netscape browsers did. You can thank Gecko for Firefox's surprisingly fast speed in rendering. Gecko serves as the interface between the application and the native graphical rendering system (such as GDI on Windows or XFree86 on Linux and Unix based systems), freeing up developers from having to explicitly access this layer directly.

Gecko, however, is a largely invisible layer from the application developer standpoint. If you're writing an application, you are much more likely to be interfacing with it through SeaMonkey (you can probably begin to detect the direction of the Mozilla Foundation's code name strategy at work here). SeaMonkey provides the code interface layer that makes it possible for us ordinary mortals to write applications, and even to take over the Mozilla browser in order to create our own. SeaMonkey exposes an XML language called the XML User-interface Language (or XUL) that provides a set of building blocks that control various components - textboxes, formatting boxes, status bars, menus, lists, trees, and so forth, along with abstractions for creating key bindings, event observers and referential commands. This set is fairly rich (there are more than one hundred such tags), but it can also be extended with the HTML element set (useful for creating formatted markup within applications) and will further be augmented with the SVG tag-set by March 2005, and XForms by early 2006.

It is possible to put together applications with nothing but XUL, but they are generally trivial applications at best. As with any other application framework, the structural elements usually need to be bound together with some kind of procedural code. SeaMonkey borrowed a page from Internet Explorer here (as well as .NET) - rather than building one language inextricably into the interface, SeaMonkey breaks the process up into two distinct technologies - XPCOM and XPConnect. XPCOM performs the same role for Mozilla that COM does for pre-.NET Windows applications - it queries and binds object interfaces and makes them available for other coding applications to utilize. This cuts down on the requirement of maintaining a static API, and provides a vehicle for writing binary extensions as XPCOM objects. While the two layers are not identical, there is enough similarity between XPCOM and COM that an ActiveX container for Mozilla should soon be supported, making it possible for Firefox applications to run ActiveX controls while at the same time providing a layer of security that prevents them from being the threat they've become under Internet Explorer.

To get around coding a specific language to SeaMonkey, XPCOM is designed to be accessed through XPConnect, a binding layer that maps XPCOM to a specific language's interfaces. Currently the primary such language is Javascript 1.5, though plans are in the works to incorporate Javascript 2.0 once that language goes through its final development phase and is approved by the ECMA (a body, incidentally, that has quietly become the de facto holder in trust of programming languages in general). I've covered some of the features of Javascript 1.5 before, including the use of setters and getters, robust regular expression support, the use of constants, and multiple try-catch statement support. However, bindings for other languages, including Python and Perl, are available, and a much more complete Java binder is also under development. Because of the open nature of Mozilla, I would not be at all surprised to see a C# implementation in the near future as well.
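
For readers who haven't met those Javascript features, here is a minimal sketch of three of them: a getter/setter pair, a constant, and a regular expression with capture groups. The object, the pattern, and the "Firefox/1.0" product string are illustrative names chosen for this example only. Mozilla's Javascript 1.5 also accepted several conditional catch clauses on one try block, but that form is non-standard, so the sketch falls back on a single catch with an instanceof test.

```javascript
// Getter and setter declared directly in an object literal.
var temperature = {
  celsius: 20,
  get fahrenheit() { return this.celsius * 9 / 5 + 32; },
  set fahrenheit(value) { this.celsius = (value - 32) * 5 / 9; }
};

// A constant holding a regular expression with three capture groups.
const PRODUCT_PATTERN = /^([A-Za-z]+)\/(\d+)\.(\d+)$/;

// Splits a product token such as "Firefox/1.0" into its parts.
function parseProduct(token) {
  var match = PRODUCT_PATTERN.exec(token);
  if (!match) {
    throw new SyntaxError("Not a product token: " + token);
  }
  return { name: match[1], major: Number(match[2]), minor: Number(match[3]) };
}

// One catch block that dispatches on the error type.
function describeProduct(token) {
  try {
    var product = parseProduct(token);
    return product.name + " " + product.major + "." + product.minor;
  } catch (e) {
    if (e instanceof SyntaxError) {
      return "invalid: " + e.message;
    }
    throw e; // anything unexpected is passed along untouched
  }
}

temperature.fahrenheit = 212;       // runs the setter
console.log(temperature.celsius);   // the value the setter stored
console.log(describeProduct("Firefox/1.0"));
```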

The list of XPCOM objects is quite impressive. A partial list includes the following functionality:
  • Core Functionality (See below)
  • Accessibility Components
  • Address Book Support
  • Clipboard and Selection
  • Content and Layout Managers
  • Cookies
  • HTML and XML DOM Support
  • HTML Editors
  • File and Stream Interfaces
  • Graphics Creation and Manipulation
  • Interprocess Communication (IPC)
  • LDAP
  • Localization
  • Mail Support
  • Network Support (Sockets, et al)
  • News Support
  • Preferences Objects
  • Security
  • Web Browser control
  • Web Services (SOAP/WSDL based)
  • Window Management
  • XML Support (Schema, XSLT, XPath)
  • XUL
The Core functionality provides a number of useful data structures (including dictionaries, arrays, property bags and enumerations) and language type support, along with threading libraries (and pools), timers, event resources, and exception management. While some of these are not necessarily that useful in Javascript, they do have definite utility in other languages such as C++. The graphics library includes interfaces for actually drawing on surfaces within the various objects, though accessing these services can be a little convoluted. The mail, LDAP and news support point out a subtle but important fact about Firefox and Thunderbird - they are simply applications that both sit on the same API - meaning that you could in fact build integrated mail services directly into Firefox if you wanted to.

XPCOM exposes these services and objects via a contract ID, something analogous to the classid used by Microsoft tools. The following, for instance, illustrates how you could create a new local File object:
var file = Components.classes["@mozilla.org/file/local;1"]
    .createInstance(Components.interfaces.nsILocalFile);
The first part of the expression,
Components.classes["@mozilla.org/file/local;1"]
creates a reference to the local file class defined by the contract ID, "@mozilla.org/file/local;1". This is a class reference, not an instance reference (it points to a particular class definition, rather than one specific instance of the class). The createInstance() function in turn creates an instance of this object, using the Components.interfaces.nsILocalFile interface to expose that particular interface on the instance. A given object may conceivably have more than one interface; this code makes it easier (and more cost efficient in terms of computing) to get the specific interface properties. Once this object is retrieved, you can use its properties and methods in exactly the same manner you would do so in any other language.
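
To show what using such an instance looks like, here is a minimal sketch that binds the file object to a path and inspects it through initWithPath(), exists(), isDirectory() and leafName. The function name, the idea of passing the Components object in as a parameter, and the example path are assumptions made for illustration; inside Mozilla chrome you would simply use the global Components object.

```javascript
// "components" is the global Components object when run inside Mozilla chrome.
// It is a parameter here only so the function can also be exercised outside
// the browser with a stand-in object.
function describeLocalFile(components, path) {
  var file = components.classes["@mozilla.org/file/local;1"]
    .createInstance(components.interfaces.nsILocalFile);

  file.initWithPath(path);   // bind the instance to a native file path

  if (!file.exists()) {      // is there anything at that path?
    return path + " does not exist";
  }
  return file.leafName + (file.isDirectory() ? " is a directory" : " is a file");
}

// From a chrome-privileged script the call would look like this
// (the path is a made-up example):
//   describeLocalFile(Components, "/tmp/notes.xml");
```

Note that this only works from privileged code such as an extension or a chrome application; an ordinary web page is not allowed to reach the file system this way.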

The final piece of the SeaMonkey language is the XML Binding Language (or XBL). This XML-based language provides a transformation mechanism that will take user-defined tags written in XUL files and convert them into an internal XUL representation, complete with properties, methods, and event hooks. XBL provides a way of creating more sophisticated elements, and is in fact used within XUL itself for the definition of things such as tab-browsers, which combine tab boxes and browsers into a single component.

A very simple XBL binding, one that builds a box with OK and Cancel buttons, might look something like this, together with the XUL and CSS files that put it to use:


XUL (example.xul):

<?xml version="1.0"?>
<?xml-stylesheet href="chrome://global/skin/" type="text/css"?>
<?xml-stylesheet href="chrome://example/skin/example.css" type="text/css"?>

<window
xmlns="http://www.mozilla.org/keymaster/gatekeeper/there.is.only.xul">
<box class="okcancelbuttons"/>
</window>

CSS (example.css):

box.okcancelbuttons {
-moz-binding: url('chrome://example/skin/example.xml#okcancel');
}

XBL (example.xml):

<?xml version="1.0"?>
<bindings xmlns="http://www.mozilla.org/xbl"
xmlns:xul="http://www.mozilla.org/keymaster/gatekeeper/there.is.only.xul">
<binding id="okcancel">
<content>
<xul:button label="OK"/>
<xul:button label="Cancel"/>
</content>
</binding>
</bindings>

The XUL file creates a reference to a CSS file, which in turn uses CSS selector and rule syntax to define the binding between a given class (okcancelbuttons) and the "okcancel" binding item within an XBL file. Real XBL can become much more complex than this, of course, but this is a topic for a different article.

As expected, SeaMonkey also handles the bindings between CSS and the XUL applications, with XUL heavily utilizing CSS not just for simple "styling" but for the actual creation of complex components through XBL. The CSS support that exists as a consequence is VERY impressive, including certain features that have been floating around as proposals for a while, such as support for multiple flow columns.

The final piece of any XUL application is the use of overlays. An overlay is a XUL file that changes the XUL (and associated scripts) of a given application. By overwriting or extending (as a form of inheritance) you can do such things as create overlays on Firefox or Thunderbird itself. I do this myself to override the default load and save menu items and replace them with my own, making it possible for me to save to a custom XML schema and load from that schema later.
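
An overlay usually brings a script along with its XUL. As a rough sketch of the script side of such an override, the function below looks up an existing menu item and rebinds it to a custom handler. The function name, the "menu_save" id and the saveAsCustomXml handler are hypothetical; the real ids have to be read out of the host application's own XUL files, and in many cases the same effect can be had declaratively by reusing the element's id in the overlay file.

```javascript
// Rebinds one menu item to a custom handler.
// Returns false if no element with that id exists in the document.
function overrideMenuCommand(doc, menuItemId, handler) {
  var item = doc.getElementById(menuItemId);
  if (!item) {
    return false;
  }
  // Detach the item from whatever the host application wired up.
  item.removeAttribute("command");
  item.removeAttribute("oncommand");
  // XUL menu items fire a "command" event when they are activated.
  item.addEventListener("command", handler, false);
  return true;
}

// Inside an overlay script (the id and the handler are hypothetical):
//   overrideMenuCommand(document, "menu_save", saveAsCustomXml);
```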

Firefox is an example of all of these principles in action, by the way. If you have Firefox running on your system (and maintain the Java SDK on your system), create a copy of the browser.jar file located in the chrome directory of your Firefox distribution somewhere outside of the firefox application folder. You can use the Jar file extractor from the SDK to convert this into a directory:

jar xvf browser.jar

This will create a folder called content, which in turn will hold the various XUL, XML, and CSS files for the Firefox browser. It is worth spending the time looking at these closely. One of the things you should realize pretty quickly is that almost all of Firefox is contained within these XUL files, not in some form of C++ application, and that a significant portion of the coding for Firefox is handled by Javascript.

Firefox is remarkable to me in that it is one of a new breed of applications, built around an XML interface and scripting yet fully capable of handling some of the most serious challenges that any "formal" application written in C++ or even Java can handle. It is also eminently accessible, in a way that a lot of other applications aren't. It is this model, as much as any widgets or features of the Firefox application, that is the real story here.

Implications

I'm looking at the November 22, 2004 issue of eWeek on my desk, at an article entitled "Browser wars back on." I think that sums it up pretty well. Firefox is not just a shot across the bow to Microsoft -- stealing 5% of market share is much like taking out the yard-arm on your ship with that cannon-shot. A major portion of Microsoft's control over that market share has come from the fact that you could access other components from within Internet Explorer, turning the browser shell into a general purpose shell for hosting any kind of application. No other browser out there has really managed to pull that off and still maintain its quality as a browser.

Firefox opens up that possibility. It's not that much of a stretch to envision OpenOffice creating wrappers around their UNO wrapper class to make them work within Firefox, and as the editor application I've written myself illustrates, you can actually go a long way toward building commercially viable enterprise-level applications just using the core components from Firefox. The addition of SVG support and XForms provides another point of attack against both PowerPoint and InfoPath, and it's not hard to envision data-access tools appearing in the next year (perhaps powered by XQuery?) that will give Access a run for its money. Such applications could run on multiple platforms with little or no modification, and would be a menu-item away from normal browsing (they could easily run in one browser tab while you maintain your mail in a second tab and surf the web in a third).

None of this will happen overnight, of course, but it's easy enough to see the general trendline. Already it is prompting Microsoft to come back with a number of new extensions and innovations on its own browser, though in most cases these extensions still rely upon the existing ActiveX architecture. The biggest danger that Microsoft faces from this comes from its tendency to pick and choose which standards it complies with; a truly standards compliant development system is likely to be far more politically attractive than one that is closed and proprietary, especially where it counts -- not in the big enterprise settings, where the adoption of any new technology usually takes place only after that technology has become very settled, but in the spare bedrooms and coffeehouses and garage work-stations of the individual developers who are learning (and in many cases developing) the technology of the future. For them, Firefox and the Mozilla Application Suite represent a huge step forward, one that will have reverberations for the next decade and beyond.
