April 16, 2011

A New Job and New Ramblings, or Information Architecture 101

Hi! Chances are, if you came from links for the MarkLogic site, you're probably looking for XMLToday.org (I run both), but if you're just here to see some rambling thoughts about information architecture from a graybeard, then you are in fact in the right place.

About the new job - about a year ago, I came out to Annapolis, Maryland to work as a contract data architect for Lockheed on the US National Archives ERA project. NARA was fun to work with. Lockheed, not so much. In January, in a cost-cutting move, several consultants were let go. The usual scramble for work began, aided for once by a fairly healthy bank balance and by being in a place where there is a hunger for XML architects. Eight weeks later, after a number of interesting encounters, I landed at Avalon Consulting. They had what I was looking for in a job - challenging work, decent pay, sane ideas about remote working and a lot of very bright people who like to do good things in the information space. That it gave me a chance to work with MarkLogic again was a big plus, as I've become rather addicted to the platform.


I even have a nice shiny new title - Information Architect. I seem to be making a career out of taking jobs with titles that didn't even exist five years ago, which is why I've always found it amusing when interviewers ask me what I want to do five years from now. I think next time I'm in that position, I'm going to tell the interviewer that I hope to be a Paradigm Architect. What does a paradigm architect do? I have no frickin clue, but it definitely sounds cool.

So what about that Information Architect role? I have a theory about this. Humans have this obsession with building. Anthropologists seem to miss the point about humans being tool users - plenty of animals use tools, but only humans use those tools for the purpose of constructing buildings ... you don't see beavers out there with trowels spreading mortar on their dams, you don't see birds construct suspended baskets and scaffolding in order to build their nests. What's more, in every single animal except humans, the building of nests is something that seems to be specific to one overriding objective - to ensure that the female and her young are protected while gestating and nursing.

Walk along the Mall in Washington DC or Fifth Avenue in New York and ask yourself whether in fact the role of all of those huge imposing edifices is to ensure that someone's female and her young are protected. Uh huh. Your answer likely was the same as mine. We build because we can, because it makes a statement about the builder, because it serves to house complex entities - governments, businesses, universities, marketplaces and so forth. It is our attempt to impose upon the natural world a sense of constraint, orderliness and predictability, in great part because our sense of reality is predicated upon the fact that, when reality is orderly, it means you can concentrate upon other things, but when reality is disordered, all of your attention is perforce placed just upon threat analysis.

The idea that information can be similarly ordered is hardly a new one. Large-scale relational databases are a testament to such a philosophy - there are known schemas and relationships that exist within any given domain, and one can consequently order those schemas and relationships and describe them precisely, down to data type and exceptions. The problem with this is similar to the problem that I see at the Madison Building of the Library of Congress, where I've been doing some consulting. The outside is a massive marble monument, a testament to the power of the written word and the making of informed decisions, and this is carried into the main atrium.

Yet beyond the power foyer, much of the building is occupied with dense arrays of small cubicles, and people who aren't essential to the day-to-day activities of the library are actually encouraged to work from home because it frees up space. The architecture of the building effectively limits the flexibility of the activities, because when the building was conceived, its evolution was never really taken into account (this is a common problem with office space in DC in general; many of the buildings were created at a time when government was roughly a fifth the size it is today).

Many years ago, I wrote an essay entitled The Architect and the Gardener, and while my perspective has (hopefully) grown since then, I find that my belief in the validity of the underlying thesis really hasn't changed. Organizations evolve over time, but they build their information systems as if the organization is always static. The reality that they see is a snapshot of a changing culture, but just like the buildings which seem massive on the outside but are in fact cramped and limited in space on the inside, the information structures that they create also tend to reflect the business reality of the time, with no thought to the evolution of that business.

My own belief is that in many respects the information systems for an organization are a vital, living part of that organization. They follow cycles of creation, maturation, senescence and destruction. They evolve - both in terms of the nature of the data that passes through them and in terms of the ambient technological framework which dictates new content. Additionally, just as an organization has different types of culture depending upon where it is in its own cycle, so too do the information needs of that organization change based upon factors as diverse as current economic climate, stability and size of the company, regulatory regimes and so forth.

Taken in that context, the idea of "building" such an information system begins to look suspect. Rather, a good information designer should think more like a landscape architect than a building architect. From the outset there should be a recognition that information systems have a definite life cycle - growing seasons, if you will - and that a rich ecosystem will have everything from long term data systems that are much like the trees in a garden and that serve as the archival backbone of an organization, to intermediate systems that support working data management and act as perennials, to localized data systems that are brought up on an application basis then brought down when that application is no longer needed.

Yet even beyond that are transients - data that streams through the organization, that arrives from and goes to external services. Until comparatively recently these were seen as being more in the domain of the network manager, but while the network manager is concerned with throughput and access, the information architect is more focused on the content of those data streams, and on ensuring that the organization doesn't build dependencies upon those streams that are more than transient in nature themselves.

Like a gardener, the information manager has to spend a great deal of time not only planning but pruning - ensuring that legacy data systems are either migrated into archives or brought down gracefully, such that interdependencies can be reduced, and providing growth avenues for data flows when they become a larger part of an organization's operational information.

One additional responsibility that an information architect has comes in helping to determine the shape or type of that data. An IA isn't a data modeler or ontologist (though he or she may wear that hat as well), but typically the architect will be called upon to set the ground rules upon which the ontologists build, as well as to determine which standards will be followed when working with content.

A big reason for that is that information systems, like gardens, generally will evolve on their own, but not necessarily in a way that helps facilitate the useful or relevant information flow for the organization. Again, in the garden analogy - if you have good soil, sunlight and rain, you can get anything to grow in a garden, but it's more than likely that what will take root are weeds - information that takes up space in your physical systems but provides little benefit, and reduces the availability of resources for the information that is important. Ontologies run rampant can make accessing information difficult; ad-hoc or poorly designed schemas (or no schemas at all) can make interoperability difficult; too high a degree of coupling can cause siloization and make the sharing of physical resources problematic; and so forth.

In many respects an information architect performs for the information in an organization the same role that a systems architect performs for the physical hardware and a software architect performs for processing systems. The three roles are complementary and still largely technical in nature, and together they represent the technical systems designers for an organization. In particularly large information-centric businesses, information architects may manage data designers, ontologists and search engineers, something I suspect will increasingly be the norm for organizations as they shift from process development to information management.

Granted, a lot of these are just my own observations. Roles emerge because there is a niche that needs to be filled in an organization, and given the increasingly dominant role that information organization and management (information gardening?) has in business today, it only stands to reason that the role of information architect is likely to be a critical one moving forward.

Now, about that paradigm architect position ...

Kurt Cagle is an Information Architect specializing in XML data systems, and works as a consultant for Avalon Consulting. He can be reached at caglek@avalonconsult.com
