Strategic View of The Application of XML: Optimizing the Information Base of the Firm
The capabilities we gain from XML will impact the structure, boundaries and success of the "firm" - whether the firm is a commercial or non-commercial entity. Below is an overview of the general impact of XML-enabled capabilities, expressed in the context of Nobel laureate economist Ronald Coase's views regarding the "institutional structure of production." Coase saw the firm as needing to optimize its boundaries based on certain transaction costs. Although the transaction costs he described are literally academic subjects, today's very hot topics - outsourcing, channel management, national intelligence organizational design and many others - are very much about the impact of shifts in Coase's transaction costs.


As Ronald Coase described, the boundary between what is optimally "inside" a firm and what is better done outside is governed by three sets of "transaction" costs: 1) search and information costs, 2) bargaining and decision costs, and 3) compliance and enforcement costs. If moving a function or process outside the firm increases those costs, then the likely optimal solution is to keep the functions in question inside the firm.

He did not describe a fourth (or further) set of transaction costs - the actual, often physical transaction costs (making, shipping, receiving, etc.) - because at the time of his study their impact on the extent and character of the firm had already been well characterized in microeconomics.

Also, there are important human considerations influencing the proper boundary of the firm - particularly the quality referred to as "esprit de corps" or, to use a more pedestrian and perhaps less explanatory term, teamwork. Effective, motivated teams can decrease transaction costs. Therefore, a motivated, empowered workforce can expand the optimal "footprint" of the firm. Of course an external firm's empowered, motivated team can have the opposite effect of making outsourcing more attractive, so the equilibrium "door" swings both ways.

Beyond these broad considerations, XML and related standards will impact all of the transaction costs and factors cited above and, perhaps, redefine the optimal boundary of your firm. Indeed, the efforts of XML standards-makers map directly to the transaction costs Coase cited. For example, ebXML standards-setting regarding company profiles and registries assist in the reduction of "search and information" and "bargaining and decision" costs.

Note that to optimize the firm as conditions change over time, inter-enterprise and intra-enterprise use of technology and technology standards should be synchronized.

Treating "inside" (internal systems) and "outside" (externally facing systems) differently creates another, needless "transaction" cost. Also, because most economic activity is "internal," the benefits of eBusiness are likely to be greatest if eBusiness processes and standards are used internally as well as externally. As Ronald Coase described, "The firm in mainstream economic theory has often been described as a 'black box'... most resources in a modern economic system are employed within firms, with how these resources are used dependent on administrative decisions and not directly on the operation of a market."

Going beyond the optimization of the individual firm, for the macro economy to operate well, the micro economy of the firm must support the fluid shifting of roles and information resources between "inside" and "outside," as facilitated by standards and processes.

Options For Employing XML In Data Exchanges

Essentially, a firm has three options - not mutually exclusive - for exploiting XML in acquiring, managing and using the data needed to run the business successfully. These three exhibit gradations of "insideness."

1. "Internalize" knowledge and information - i.e., enlarge the firm's information assets by bringing information "inside" the firm's boundary.

The firm can accomplish this by using XML to facilitate machine-to-machine data exchanges, typically large-scale, periodic batch transfers. In this effort to internalize, "exchange" is focused primarily on information "import," perhaps leavened with some export. In such a mode, the firm is seeking information self-sufficiency through its own internal information "manufacture" and its information "imports." A consequence (or perhaps a "symptom") is that employees of the firm and the firm's processes can function principally from information assets wholly within the firm's ERP or other internal databases.
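The internalization mode can be sketched in a few lines. The feed layout below (a `<catalog>` of `<item>` elements carrying a SKU, description and price) is a hypothetical example, not a real schema; an actual batch transfer would follow whatever XML vocabulary the trading partners have agreed on.

```python
# Sketch of Option 1: a periodic batch import that parses an external XML
# feed in full and "internalizes" every item into the firm's own tables.
import xml.etree.ElementTree as ET

SUPPLIER_FEED = """<catalog>
  <item sku="A100"><description>Widget</description><price>9.95</price></item>
  <item sku="A200"><description>Gadget</description><price>4.50</price></item>
</catalog>"""

def import_catalog(feed_xml, internal_table):
    """Parse the whole feed and copy each item inside the firm's boundary."""
    root = ET.fromstring(feed_xml)
    for item in root.findall("item"):
        internal_table[item.get("sku")] = {
            "description": item.findtext("description"),
            "price": item.findtext("price"),
        }
    return internal_table

# After the batch run, the entire external catalog lives internally.
catalog = import_catalog(SUPPLIER_FEED, {})
```

The firm's users then work entirely from `catalog` (in practice, an ERP table), with no runtime dependency on the supplier's systems.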

Note that you, as a provider firm, may also find yourself dealing with other firms that want you to perform large-scale exports of your information to them - symptomatic of that firm's efforts to internalize information.

2.  Create virtual extensions of the firm's knowledge assets through dynamic "calls" to external data resources, typically by using "Web Services."

To use the obvious example of the catalog: in virtual mode the firm would not, as in the first option, import entire catalogs; instead its systems would import data situationally, in response to specific need. The importing system (e.g., the firm's ERP) would initiate the necessary web services call on a just-in-time basis and, after retrieving the data, would integrate it with data from the information consumer's own internal systems. For example, after retrieving pricing data through a "call," the internal system might then present the user with a unified view combining line-item descriptions taken from an internal data source with "fresh" unit prices retrieved dynamically with a web services "call."
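The unified-view step can be sketched as follows. Here `fetch_unit_price` is a stand-in for the real web services client (a SOAP or REST call to the supplier); its name, the SKUs and the prices are illustrative assumptions.

```python
# Sketch of Option 2: a just-in-time "call" for fresh unit prices, merged
# with line-item descriptions the firm already holds internally.

INTERNAL_ITEMS = {  # the firm's own line-item descriptions
    "A100": "Widget, stainless",
    "A200": "Gadget, 3-pack",
}

def fetch_unit_price(sku):
    """Stand-in for the external pricing service; a real implementation
    would make a network call to the supplier's system."""
    external_prices = {"A100": "10.25", "A200": "4.40"}  # lives outside the firm
    return external_prices[sku]

def unified_view(sku):
    """Combine internal description with a freshly 'called' price."""
    return {
        "sku": sku,
        "description": INTERNAL_ITEMS[sku],   # internal data source
        "unit_price": fetch_unit_price(sku),  # fetched on demand
    }

row = unified_view("A100")
```

Nothing from the supplier's catalog is stored locally; only the single row the user needs is assembled, at the moment of need.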

From a metadata management perspective, both Options 1 and 2 require that the internal systems (and those who define, build and maintain them) put in place suitable ways of using and storing the imported data. For example, if one is importing information about a commodity commercially priced to four decimal positions (e.g., sometimes the case for electric power) or that is taxed or categorized in some distinctive way, the internal system needs to have implemented the metadata infrastructure to deal with those information characteristics. The internal systems would also need to know where to go for information, have a "put away" strategy, etc.
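The four-decimal pricing case makes the metadata point concrete. If the internal system stores imported prices as binary floating point, precision can be silently lost; a decimal type preserves the price exactly as received. The per-kWh rate below is a made-up illustration.

```python
# Sketch of the metadata point above: a commodity price carried to four
# decimal places (as with some electric-power tariffs) must be stored
# without losing precision. Python's decimal module preserves it exactly.
from decimal import Decimal

imported_price = "0.0835"           # price per kWh, as received in the XML feed

as_float = float(imported_price)    # binary float; may pick up rounding error
as_decimal = Decimal(imported_price)  # exact decimal representation

# Extended to 1,000,000 kWh, the decimal total keeps all four places:
exact_total = as_decimal * 1_000_000
inexact_total = as_float * 1_000_000  # may drift from 83500 in the low bits
```

The broader lesson: the receiving system's column types, rounding rules and tax categories are metadata decisions that must be made before the import can be trusted.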

With respect to advantages and disadvantages, Option 1 - the batch import option - offers the advantages and disadvantages of "fat" inventory management. In a fat inventory environment, the internal user is likely to find what is needed "on the shelf," but of course, as with physical inventory, the data inventory will run up storage and spoilage costs.

In comparison, Option 2 - the "virtual" mode using web services calls to address ad hoc needs - has many of the characteristics of "lean" manufacturing, offering the advantages of low to no inventory but also the disadvantages of real-time supply chain dependency (what is the fallback if the "called" system is "asleep"?). It also is the most likely mode for EDI-like transactions - that is, the exchange of quasi-documents that are event-driven, such as orders, shipping advices, invoices, etc.
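The "asleep" question deserves a concrete answer. One common pattern is to fall back to the last successfully retrieved value and flag it as stale; the service stub, cache layout and prices below are assumptions for illustration, not a prescribed design.

```python
# Sketch of a fallback for Option 2's real-time dependency: if the live
# "call" times out, serve the cached last-known price, marked as stale.

PRICE_CACHE = {"A100": "10.25"}  # last price successfully retrieved

def call_pricing_service(sku):
    """Stand-in for the live web services call; here the remote system
    is 'asleep' and the call times out."""
    raise TimeoutError("pricing service did not respond")

def price_with_fallback(sku):
    try:
        price = call_pricing_service(sku)
        PRICE_CACHE[sku] = price            # refresh the cache on success
        return {"price": price, "stale": False}
    except TimeoutError:
        # Lean mode's supply-chain risk made explicit: fall back to the
        # small local "inventory" rather than fail the transaction.
        return {"price": PRICE_CACHE[sku], "stale": True}

quote = price_with_fallback("A100")
```

The cache is, of course, a small dose of "fat" inventory reintroduced deliberately - the hybrid nature of real architectures.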

3.  Adopt the OBI (Open Buying on the Internet) or "punchout" model.

OBI actively engages people's capabilities in the data selection and retrieval process, and it therefore builds on the sometimes overlooked fact that people are smart whereas machines are stupid. (See the OBI dialog discussion for introductory information on OBI.)

In this option, users work within a cross-entity browser session. Typically the user accesses some internal server, punches out to an external source for information, picks what is needed, and then the OBI process returns both the user and the selected data to the internal system. The internal system (e.g., an eProcurement platform) would process the retrieved data through internal workflow as appropriate.
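The final step of the round trip - turning the returned data into an internal workflow item - can be sketched as below. The message format is a simplified, hypothetical XML structure loosely modeled on punchout order messages; a real deployment would use the exact vocabulary the trading partners support (e.g., cXML).

```python
# Sketch of the OBI/punchout hand-back: the external site returns a
# formatted transaction, which the internal eProcurement system maps
# into a requisition that enters the approval workflow.
import xml.etree.ElementTree as ET

RETURNED_MESSAGE = """<PunchOutOrderMessage>
  <ItemIn quantity="2">
    <ItemID><SupplierPartID>BOOK-4411</SupplierPartID></ItemID>
    <UnitPrice>29.00</UnitPrice>
  </ItemIn>
</PunchOutOrderMessage>"""

def to_requisition(message_xml):
    """Map the returned transaction into an internal requisition record."""
    root = ET.fromstring(message_xml)
    lines = []
    for item in root.findall("ItemIn"):
        lines.append({
            "part": item.findtext("ItemID/SupplierPartID"),
            "qty": int(item.get("quantity")),
            "unit_price": item.findtext("UnitPrice"),
        })
    # The record now enters the firm's own approval workflow.
    return {"status": "pending-approval", "lines": lines}

req = to_requisition(RETURNED_MESSAGE)
```

Note how little metadata burden the receiving system carries: the human did the searching and configuring on the external site, and only a clean, formatted transaction came back.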

Today, this process is most commonly used for "indirect" or MRO buying - e.g., an Ariba Technologies buy-side platform supporting a dialog with an external book distributor - but it is readily adaptable to any information exchange (e.g., medical records, whether physical or digital in form).

Note that the essential differentiator of Option 3 from both Options 1 and 2 is the active engagement of the user's judgment and decision-making. Rather than bringing external information to the user through an internal intermediary system (again, perhaps through an ERP), the process temporarily redirects the user's browser interaction to where the information already exists. When the user is working with the external information source, the dialog may be complex - e.g., the user probably will exercise the source system's search capabilities, configuration capabilities, multimedia views, etc. However, the end data retrieved and brought back to the internal system would typically be a formatted transaction that does not pose major "metadata" problems to the receiving internal system.

In contrast to OBI, machine-to-machine batch data transfer and web services "calls" - however well architected and standardized - are both fundamentally limited by the stupidity of machines and the impracticability of adding enough functionality to internal software to deal with the richness of life outside the "home" enterprise. On the other hand, the cost of having a person involved in the OBI dialog makes the per-transaction cost higher.

Options 1-3 all have their proper place in the firm's architecture, and none is "better" than another in any absolute sense. The tradeoffs depend on the intensity, criticality and variability of information use. There are also infrastructure constraints - e.g., you cannot make "Web Services" calls to look up, say, sales tax rates if no one offers to support such a lookup service.

Implications for optimizing the boundaries and capability of the firm

In rationalizing the physical scope of a firm, the management of, say, a hotel chain or an automobile manufacturer might explicitly decide how many facilities it will operate, in what geographies, and of what size and complexity.

For example, if someone suggested that the hotel chain or the automobile OEM own and run only twenty physical facilities worldwide, management of such a firm probably would deem twenty to be too few and too sparse, while twenty thousand probably would be deemed too many and too dense.  At the high end, diminishing returns set in - because managing a lot of facilities consumes too many resources and because subdividing demand and proliferating comparatively small facilities creates negative economies of scale.

Of course, the answer might not be to reject physical expansion entirely, but to franchise, set up joint ventures or set up strategic relationships that avoid the downside effects of "bricks and mortar" expansion.

Firm-level Information Scope Planning

Just as management needs to decide the scope of the "physical" firm, there is a need to define the scope of the firm's information base, even though the notion of too little or too much is hard to characterize. The costs cited by Coase - of discovery, of decision-making and of compliance - all play a role in the scoping process.

In optimizing the information boundary of the firm, the choices range from full vertical integration and autarky on the one hand to the other extreme of complete virtualization - meaning that the firm transactionally "calls" on others for all data. Clearly the realistic choices are somewhere in between.

One trend is that as technology and technology standards drive transaction costs down, the need for the firm to own everything also goes down. On the other hand, as XML makes it feasible to enrich the firm's knowledge base, knowledge functions might tend to shift inwards. The firm has to evaluate such tradeoffs in its own context.

Note that any firm - unless headed toward data autarky - needs to take data standards very seriously (meaning standards for metadata as well as the data itself), because those standards reduce transaction costs and enhance the firm's ability to rebalance as needs change.

If one considers the sorts of information (typically referred to as "data") managed in tables, columns and rows, one could and should decide how much of the database should be populated via XML imports (Option 1 above), how much using Web Services calls (Option 2), and how much should be acquired in a judgment-filtered way through OBI dialogs (Option 3). Of course, few management teams have explicitly decided "we are going to be a twenty-five-thousand-table company, and here's how we get the data," but management may be making other decisions that unduly expand or shrink the firm's information assets and saddle the firm with a database it cannot effectively populate.

Technological change increases the risk that your firm is on track to exceed the optimum by a long distance in "easy" categories while shortchanging some harder ones. Bloated databases are often symptoms of a high-fat, low-fiber intake. However, the risk is not just the amount of data that exists today, but also today's increasing ease of data proliferation, with XML contributing in part to that ease.

Note that the question of limits also pertains to documents (many concealing transactions and events that should be extracted), images and, a very special information category, "rules." If your firm has more than, say, 25,000 rules, well ...

Also, today's widespread concern over the quality of data is often symptomatic of data having proliferated beyond the capability of the firm to support it. Indeed, data of uneven quality stimulates further data proliferation as people defensively create new data collection and retention processes. Data proliferation is further discussed in the material on bucket reduction.

To optimize the data boundaries of the firm, the answer might not be to reject sets of information entirely, but as with physical operations, to franchise, set up joint ventures or otherwise outsource. As described above, the various modes of XML can be adapted to any of those models as well as hybrids, particularly if internal standards are based on external standards.

Bottom Line

The emergence of XML can be thought of as the icing on the technology cake - it is important both for its own sake and because it rides on top of very powerful, ubiquitous computing and networking technology.

The decision facing every firm (or organization) is its boundaries, and that decision involves a balancing of the tradeoffs that define the optimal boundaries of that firm. To what degree does management want the firm to make itself informationally "larger" by internalizing more information, or is "larger" synonymous with being informationally bloated? Does that enlargement increase or decrease the three sets of "transaction costs" identified by the economist Ronald Coase - search and information acquisition, bargaining and decision, and compliance? Given that the firm has limited resources to add more tables, rows and columns, is there prioritization to be sure that the ones being added are helpful in reducing transaction costs? If the firm needs to interoperate with others, has it already adopted the data standards and related processes to facilitate interoperability?

Although many readers will, quite correctly, be indifferent as to the architecture of their information base or information flows, what all responsible managers need to address are the practical impacts on the firm itself. What are its new optimal boundaries? Are the right people being asked to address the potentially transforming impact of XML on transaction costs that will shape your firm in the future?

Also, your firm has to deal with other firms, and they face the same choices and, as they make their choices, you may have to deal with the consequences. Early adopters have the advantage of influencing later ones, so in many cases it is beneficial to be an early mover - presuming you move consistently with the technology winds and tides.