The SGML community has had increasing interest in the capability of assembling document displays from multiple referenced parts. Elements in such documents reference other elements or objects that are retrieved on the fly and transparently displayed much as if their content had actually occurred inline. This is commonly called ``dynamic document assembly'' or ``boilerplating'' and traces back to the older notion of transclusion (dynamic data inclusion). Transclusion has previously been very theoretical, but is becoming less so daily, especially with the coming use of XML. This presentation looks at reasons for transclusion, possible problems, and some proposed solutions.
Most text processing systems provide a way to ``include'' parts of one document, or independent fragments, into another. The methods, however, vary widely. Many systems provide a ``mail merge'' that can embed small parts repeatedly, or a raw copy-when-referenced facility like that of most programming languages, for pulling in prerequisite files, macro packages, and so on. This is very useful, for a variety of basic reasons:
Avoiding duplication of data that is often re-used, such as abbreviations or stock phrases.
Ensuring that such duplications stay the same after editing, by ensuring that there really is only one ``normative'' copy.
Always seeing the latest copy of referenced data, even though it may be getting updated on an entirely different schedule than the referencing data, or even created on-the-fly (such as a stock quote or inventory count).
This mechanism has been around since long before SGML, having been standard equipment in programming and macro languages for ages. It was also prominent in the pioneering hypertext and hypermedia systems developed in the late 1960s:
Brown University's Andries van Dam built the FRESS system, which had extensive facilities for including remote content and markup by reference; and for controlling the sub-parts, in terms of re-use, conditional inclusion based on arbitrarily complex attributes and expressions, and so on.
SRI's Douglas Engelbart built NLS and Augment, which introduced fine-grained ways of referring to hierarchical sub-parts of structured documents, and of maintaining persistent names so that the links and quotations would survive later editing.
Ted Nelson, however, most clearly saw the potential of dynamic, on-the-fly inclusion on a world scale, and coined the term ``transclusion'' for it. He focused on the case of quotation, assuming that there would eventually be a worldwide collection of accessible information, which people would explore and link.
 At the time, few people believed him.
More recently we have seen it in less dynamic forms; what we now call ``dynamic document assembly'' is simply a great deal of transclusion, except that it is typically done either as a batch process, or by a heavy server assembling things for a client (rather than being woven right into the thread of every transaction). Much specialized web-server processing is also faking transclusion (since HTML didn't contemplate it except for one case described below). So transclusion is becoming quite widespread; but only in certain specific guises, and with many unnecessary constraints. Here we discuss some of the general principles and processes underlying all these different guises of transclusion, and how those can be addressed in SGML and XML systems.
Transclusion is the dynamic inclusion of data from one document in another. In essence this is quotation, but with a wide range of advantages (and challenges) that arise from making it dynamic and online.
We call the document which references other data the ``referring document,'' and the place in it where the reference occurs the ``referring context.'' Technically, there is no need for any kind of reference to be expressed inline, and many hypertext systems (including HyTime, TEI, Microcosm, and XLL) provide ways to specify links externally; we ignore this distinction here for simplicity.
We call the data, which is referenced, accessed, and then shown largely as if it had occurred in the referring context, the ``referenced data.'' The place where it originally occurs is the ``referenced'' or ``original'' context, which may or may not bear much resemblance to the referencing context.
The definition of transclusion is not limited to including entire documents or to including documents of similar type. In its original sense, the inclusion is meant to be real-time, and to be able to accommodate issues of dynamic updates and versioning. It is, however, crucial to transclusion that the referenced data retain its identity: the reader can tell that something is a quotation instead of the work of the referring document's author; and moreover can readily access the referenced information in its referenced/original context.
For example, an abbreviation (such as an SGML entity used to expand an oft-repeated acronym such as NASA) is not properly a transclusion: it does not achieve the required level of identity and referenced-context meaning. In contrast, a quotation element typically does represent a transclusion if the quotation is done by live reference rather than data-copying. If done by copying but with an accompanying link to the referenced context, it is a marginal case of transclusion: the identity is expressed, but the quote isn't really live; this renders moot some of the semantic issues, but introduces its own problems such as undetectable obsolescence of the copy.
Until recently, HTML did not contemplate dynamic inclusion. There was no provision for including anything on the fly, with one exception: graphics. That is, the behavior of these two HTML constructs is radically different, although both are links:
<A HREF="foo.htm"> <IMG SRC="bar.gif">
The A element in HTML has built-in semantics that include a requirement that the reader actuate the element's content before the referenced data is retrieved. Of course a program might choose to access the data in anticipation as a way of optimizing performance, but logically the link is not followed until the reader requests it. Therefore, this is not transclusion as we have defined it.
In contrast, the target of an IMG element is typically retrieved as soon as the element is encountered, with no user intervention. This is essentially a transclusion, except that there is no (standard) way of transcluding only a portion of a graphic, or any kind of data other than a graphic (making the SRC attribute of IMG point to HTML does not have the effect one might wish for).
So both transclusion and on-demand inclusion exist in HTML, but the distinction is conflated with data type: referenced HTML can only be included on demand, whereas images can only be transcluded, leaving some cases hard to achieve. Since IMG transfers can be slow, most browsers provide a global option setting to change IMG's behavior to resemble that of A; but only globally, rather than on a per-link or per-link-type basis. This leaves some useful functionality unachievable, and is one of the reasons the XLL linking specification adds these particular semantics directly.
IFRAME elements are another way to achieve transclusion. These function somewhat like SGML SUBDOC entities: they are able to include only entire documents, and they create a separate address space, making it difficult to link into transcluded content. (See the further discussion of SUBDOC entities below.)
SGML can do parts of transclusion in several ways. The most obvious is direct entity references, which can be used to pull in almost any data. However, entities are too powerful in some ways, and not powerful enough in others:
A referenced SGML portion need not be a whole element, but might consist of the end of one and the start of another (for example, foo</P><![ RCDATA [...). There are quite complex constraints on just where you can break entities in relation to elements, tags, quoted literals, marked sections of various kinds, and so on. This can lead to referenced data being interpreted to contain different tags in its referencing context than in its original context. One can imagine scenarios that make that useful, but they are extraordinarily rare, and completely opaque to most normal users. This is one reason XML rules out this kind of ``asynchronous'' entity reference.
Entities do not have a standard way of referring to parts of a data object. Since SGML does not entirely constrain the form or meaning of system identifiers, anyone can build such a mechanism, but SGML does not provide it. This makes using entities for quotation nonportable. XLL introduces portable conventions for internal references of this kind.
Since a referenced entity (at least, an SGML one) is parsed as part of the referencing context, it can have long-lasting effects there, and therefore the referencing document must be re-parsed whenever the referenced data is to be updated. Given the difficulty of incremental SGML parsing, this makes truly dynamic transclusion using entities quite difficult.
A second mechanism SGML provides is the
SUBDOC entity. This has the advantage of being
synchronous, and of isolating the parsing contexts of the referring
and referenced documents (thus an unmatched delimiter in one
doesn't radically modify the other's parse). However, a
SUBDOC entity can only be an entire
SGML document, making quotation again difficult;
promoting sub-elements to function as entire documents is sometimes
easy, but sometimes requires re-coding the SGML to
avoid all inclusion exceptions,
USEMAPs, and other non-local features. Since
SUBDOC also separates the
ID spaces, linking
across the transclusion boundary requires more work.
In the end, the best way to do transclusion with SGML is to treat it like other semantics of hypertext: build it as specific applications rather than using intrinsic SGML features.
XML rules out most of the constructs that make it hard to implement transclusion with SGML, such as asynchronous entities and document portions that can parse differently depending on where they are referenced. This makes truly dynamic transclusion processors more feasible. Also, XLL provides a powerful convention for referring to subtrees of documents on the fly (based on TEI extended pointer notation), making it very easy to transclude very specific document portions as required for quotation. XLL also provides transclusion semantics as a specific property that can be set for individual links: a link can be declared to require on-the-fly retrieval and display, or inline display on demand, or new-window display on demand, and so on, thus allowing all the needed combinations of behavior.
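To make the idea of addressing a subtree concrete, here is a minimal Python sketch that resolves a pointer of the form CHILD(3,CHAPTER)(2,SECTION), the style used in the examples later in this paper. It is only an illustration under stated assumptions: that each (n,GI) step selects the n-th child element of type GI, counting from 1 and starting at the document's root element; the function name resolve_child_pointer and the sample document are hypothetical, and no other XLL location terms are handled.

```python
import re
import xml.etree.ElementTree as ET


def resolve_child_pointer(root, pointer):
    """Resolve a pointer such as 'CHILD(3,CHAPTER)(2,SECTION)' against root.

    Assumption: each (n,GI) step selects the n-th child element (1-based)
    whose element type is GI, starting from the document's root element.
    """
    if not pointer.startswith("CHILD"):
        raise ValueError("only CHILD pointers are handled in this sketch")
    steps = re.findall(r"\(\s*(\d+)\s*,\s*([A-Za-z][\w.-]*)\s*\)", pointer)
    node = root
    for count, gi in steps:
        n = int(count)
        # Collect only the children of the requested element type.
        matches = [child for child in node if child.tag == gi]
        if not 1 <= n <= len(matches):
            raise LookupError(f"no child number {n} of type {gi}")
        node = matches[n - 1]
    return node


# Hypothetical sample document, used only to exercise the function.
doc = ET.fromstring(
    "<BOOK>"
    "<CHAPTER><SECTION><TITLE>One</TITLE></SECTION></CHAPTER>"
    "<CHAPTER>"
    "<SECTION><TITLE>Two A</TITLE></SECTION>"
    "<SECTION><TITLE>Two B</TITLE></SECTION>"
    "</CHAPTER>"
    "</BOOK>"
)
target = resolve_child_pointer(doc, "CHILD(2,CHAPTER)(2,SECTION)")
print(target.find("TITLE").text)
```

Because the result is an element of the referenced document rather than a copy of its text, a processor built this way can still report where the data came from, which the definition of transclusion requires.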
Transclusion introduces a number of problems and design decisions. Most of them arise from a single characteristic: one piece of data exists in multiple contexts. In SGML, an entity referenced from several different contexts has no real ``identity'' or structure apart from each particular context. If it is referenced once in a normal parsing context, once in an RCDATA element context, once in a marked section context, once inside an attribute value, and once in a context where there happen to be five start-tags still pending, it might parse to totally different element structures in each case. But in SGML terms this is not a problem: the entity is defined to have no structure apart from a given context, and no rule says it must have similar meanings in different contexts.
The issue with transclusion is that the referenced data's meaning, structure, and content from its original context must be maintained in the referencing context; remember that our definition included maintaining the data object's identity, and making the original context available on demand. Also, it is clearly not a fair quotation if the parsing or even content of the quoted material can be changed to an arbitrary degree, merely by the act of quotation.
Below, we discuss several examples of issues raised by transclusion. These and, we believe, any problem involving transcluded data can be considered as various combinations of no more than three classes of problems: presentational problems, addressing problems, and modification of the transcluded data.
A simple but pervasive challenge in handling transclusions involves stylesheets. The ancestors of the transcluded element are typically quite different in the two contexts.
Why is the ancestry important? Mainly because stylesheet mechanisms for SGML (though not for low-end word processors) use inheritance down through ancestors to determine many formatting characteristics. For example, a basic formatting property is usually not set on P at all, but inherited from a distant ancestor such as BOOK. This is also typically true of the font size for P, except that some kinds of intervening ancestors change it when present: the text of a P inside a FOOTNOTE may be smaller because the FOOTNOTE sets the smaller size, which is then inherited.
Under the hood, what goes on is commonly that the formatter
calculates fonts, geometry, colors, and linking semantics for each
ancestor in turn, working down from the root element to the element
being formatted. As each ancestor's specification is calculated,
it becomes the ``basis'' for the next one down: most or all
properties get re-applied to the next descendant, unless
overridden. This is not a process specific to SGML
applications; Brian Reid's Scribe system included it from the
beginning. Others also described similar algorithms. But what happens when there are two sets of ancestors? Authors frequently
transclude a paragraph that was not (in its originating context)
buried within a
FOOTNOTE or other special construct, to a
referencing context where it is so
buried. Which context wins?
 For retrospective information about Scribe see ``A high-level approach to computer document formatting,'' in the Conference Record of the Seventh Annual ACM Symposium on Principles of Programming Languages, January, 1980. In addition, Reid's dissertation is an absolute must-read: Scribe: A Document Specification Language and its Compiler, Ph.D. thesis, Carnegie-Mellon University, Pittsburgh, PA. Available as Technical Report CMU-CS-81-100.
 For example, John B. Smith and Stephen F. Weiss, ``Formatting Texts Accessed Randomly,'' in Software - Practice and Experience (SPE) 17(1), 1987.
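The root-to-element cascade described above can be sketched in a few lines of Python. This is an illustration only, not the algorithm of any particular formatter: the stylesheet is assumed to be a plain mapping from element type to property settings, the property names are hypothetical, and every property is treated as inherited.

```python
def effective_style(fqgi, stylesheet, defaults=None):
    """Compute an element's style by walking its FQGI from the root down.

    Each ancestor's result becomes the basis for the next element down;
    a setting on a lower element overrides what it inherited.
    """
    style = dict(defaults or {})
    for gi in fqgi.split("/"):
        style.update(stylesheet.get(gi, {}))
    return style


# Hypothetical stylesheet: P sets no font properties of its own.
stylesheet = {
    "BOOK": {"font-family": "serif", "font-size": 12},
    "FOOTNOTE": {"font-size": 9},
    "P": {"space-before": 6},
}

print(effective_style("BOOK/CHAP/SEC/P", stylesheet))
print(effective_style("BOOK/CHAP/SEC/P/FOOTNOTE/P", stylesheet))
```

The two calls differ only in ancestry, yet the paragraph inside the FOOTNOTE comes out smaller. A transcluded paragraph has two such ancestries, so the function would need to be told which path, or which combination of paths, to walk.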
For example, consider this transclusion:
In the referenced context, the STANZA has an FQGI (fully-qualified generic identifier, or the list of element type names for all its ancestors in order) of COLL/POET/POEM/STANZA. In the referencing context that STANZA ends up with quite a different FQGI: BOOK/CHAP/SEC/P/FOOTNOTE/QUO/STANZA (though see below for more details on this). Any such difference may lead to formatting differences, given typical state-of-the-art style mechanisms.
To calculate the ``right'' font and other layout parameters, we must decide how to relate the two contexts and use each to contribute to the appropriate style. From the reader's point of view several requirements must be met in the referencing display context, even though they may not always fit well together. Formatting that indicates logical qualities of the referenced data must be accurate for the referenced context, but the formatting must reflect the referenced data's status as a transcluded object within a certain context.
Any difference in appearance from the appearance in the referenced context must not significantly affect the meaning (unless of course that is the referencer's specific intent!). For example, you might change the font but certainly would not interpret the content as though it were in another character encoding.
The display must indicate clearly that the information is transcluded, and not the referencing author's own work. For example, a quote should not be run in with no sign, but must be marked by quotation marks, layout as an indented block, movement to a separate note area, differing color, or some other conventional distinction.
The display must fit in smoothly with the style of the referencing document. For example, if one happens to quote a large-print edition, it would be silly to display it in that same large size if the referencing context is not also a large-print edition. Although that may seem a marginal case, the same principle arises every day in another form: quoting text from a Web page designed with fonts sized for low-resolution screens, which print out awkwardly big on paper.
For the moment, make the simplifying assumption that an element's FQGI is the only thing given to a stylesheet to determine the element's formatting. That is, the stylesheet can set up layout parameters with knowledge of an element's ancestry, but no knowledge of its siblings, attributes, or other information. This is clearly insufficient for some applications, even ones so simple as auto-numbering; but it will do for the moment. Given that, the question reduces to ``what is the effective (or virtual) FQGI of a transcluded element?'' For the example above, we see at least these possibilities:
COLL/POET/POEM/STANZA (the referenced context)
BOOK/CHAP/SEC/P/FOOTNOTE/QUO (the referencing context)
BOOK/CHAP/SEC/P/FOOTNOTE/QUO/COLL/POET/POEM/STANZA (the contexts end-to-end)
BOOK/CHAP/SEC/P/FOOTNOTE/QUO/STANZA (the referencing context plus just the referenced element's own type information but not its ancestry; if the referenced element has its own descendants, they would continue the pattern)
contexts end-to-end, but minus the referencing
(the contexts end-to-end, but referenced first, which seems quite
In general we believe the goals can be best achieved by combining the two contexts intelligently rather than choosing one or the other (the first two choices), since otherwise, depending on the author's intent, something important may be lost.
For example, any formatting associated with STANZA elements probably should still apply, since the transcluded data, after all, is one. On the other hand, the last-stated goal above, meshing formatting with that of the referencing document, cannot be achieved at all unless at least some of that document's context is applied.
Other combinations are possible, such as trying to combine things at a finer granularity, but this leads to many odd cases: any proposal involving taking some parameters (say, font) from one context, but others (such as color) from the other seems destined to become far too complex to explain. It would also share the known problems of proposals to combine stylesheets on a piecemeal basis.
On the whole, it seems the best compromise may be the short combination shown fourth above: calculate the style as usual in the referencing context, and then allow the referenced data to override it based on the GI at the root of the transcluded data (and of course internal ones if any), but not to inherit from the referenced context outside of the referenced data itself.
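As a rough illustration of this compromise, the Python sketch below builds the virtual FQGI from the full referencing context plus only the root GI of the transcluded data, then cascades styles along it. The stylesheet mapping, its property names, and the function names are assumptions made for the example; a real formatter would also continue the pattern for descendants of the transcluded element.

```python
def virtual_fqgi(referencing, referenced):
    """Referencing context plus the transcluded element's own GI only."""
    root_gi = referenced.split("/")[-1]
    return referencing.split("/") + [root_gi]


def transcluded_style(referencing, referenced, stylesheet):
    """Cascade along the virtual FQGI; the transcluded GI is applied last."""
    style = {}
    for gi in virtual_fqgi(referencing, referenced):
        style.update(stylesheet.get(gi, {}))
    return style


# Hypothetical combined stylesheet keyed by element type.
stylesheet = {
    "BOOK": {"font-size": 12},
    "FOOTNOTE": {"font-size": 9},
    "POEM": {"font-family": "decorative"},  # referenced ancestor: ignored
    "STANZA": {"margin-left": 3},
}

referencing = "BOOK/CHAP/SEC/P/FOOTNOTE/QUO"
referenced = "COLL/POET/POEM/STANZA"
print("/".join(virtual_fqgi(referencing, referenced)))
print(transcluded_style(referencing, referenced, stylesheet))
```

The STANZA keeps its own margin, picks up the footnote size from its new home, and inherits nothing from POEM, which lies outside the transcluded data.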
Before proceeding, we note that there are some cases where an extreme solution may be necessary: these are akin to other cases where formatting is critical for a particular purpose, and the usual strengths of descriptive markup systems such as SGML and XML may be overshadowed by other concerns. First, some documents have format as a part of their intrinsic meaning, such as concrete poetry where the layout is indeed part of the poem. Second, certain legal environments may require absolute format fidelity, such as mandatory warnings in aircraft manuals, where fine details of icon placement, geometry, font size, and the like may be regulated. In such settings of course the designer's options are more limited.
Apart from the issue of having two element contexts, there is a second issue. Typically, there are two stylesheets active: one applicable to the referencing document, and one to the referenced document. It may be that the STANZA element has no style definition in the referencing context's applicable stylesheet, in which case it seems clear that the referenced context's should be used (there isn't any other obvious choice). But what if STANZA is defined in both? A couple of issues arise:
Does STANZA signify the same concept in both documents? Two DTDs can easily use the same tag name for quite different meanings.
Which stylesheet's definition applies? Referencing, referenced, or some combination?
The first issue, name conflict, is insoluble in principle without some universal namespace to which local names such as STANZA can relate. XML is currently developing such mechanisms by providing syntax to declare a name as belonging to a particular ``namespace'' defined elsewhere. An alternative for the transclusion case is to modify the ``foreign'' names somehow, so that the referencing stylesheet's own name usage cannot conflict: if it wants to style foreign STANZA elements, it would have to do it explicitly, such as by defining a style for the modified name.
On the second issue, one can hardly ``average'' the
colors, indents, or fonts specified, although some properties could be
combined in a manner akin to Cascading Style Sheets. Assuming it can
be determined that both definitions do refer to the same
STANZA type conceptually, the easiest solution is to say
that the referencing stylesheet wins. This allows local override, and
is analogous to the usual solution for variable-name scoping in other
contexts such as hierarchically scoped programming languages.
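The scoping analogy can be shown directly with Python's collections.ChainMap, which searches a list of mappings in order. This is a sketch under the same assumption the text makes, namely that both stylesheets mean the same thing by a given element type; the stylesheet contents and the function name resolve_rule are hypothetical.

```python
from collections import ChainMap


def resolve_rule(gi, referencing_sheet, referenced_sheet):
    """Look up a style rule: the referencing stylesheet wins, and the
    referenced stylesheet is consulted only when it has no definition."""
    rules = ChainMap(referencing_sheet, referenced_sheet)
    return rules.get(gi)


# Hypothetical stylesheets, each mapping element type to properties.
referencing_sheet = {
    "P": {"font-size": 12},
    "STANZA": {"font-style": "italic"},
}
referenced_sheet = {
    "STANZA": {"font-style": "normal"},
    "POEM": {"margin-left": 2},
}

print(resolve_rule("STANZA", referencing_sheet, referenced_sheet))
print(resolve_rule("POEM", referencing_sheet, referenced_sheet))
```

This replaces a whole rule rather than merging individual properties; merging property by property, in the manner of Cascading Style Sheets, would be a different design.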
A general solution for determining the virtual FQGI like the one above can work for many cases, but there will still be occasions in which a document designer will want to control interpretation of transcluded data on a case-by-case basis.
As one example, an author may transclude the same element into two locations, expecting two different presentations. One transclusion may be a cross-reference, and the author may desire that only the section's title be presented:
<PARA>In <XREF HREF="some.url#CHILD(3,CHAPTER)(2,SECTION)"/>, the author suggests that pigs can fly.</PARA>
(This is very common in SGML systems.) But on another occasion, the author may actually desire that the entire section be presented to his reader:
<PARA>Here, see for yourself:</PARA> <TRANSCLUDE HREF="some.url#CHILD(3,CHAPTER)(2,SECTION)"/>
Both of these examples are transclusions, since in both instances the author desires to include an object in another document as it exists at the time of the reader's access.
Similarly, a list item may be referred to in its original context, and the author may wish to have the item's original number preserved (``See step 5.''). At another point, the author may wish to include that list item in her own list, numbered in the context of its new home. (These cases and those in the preceding paragraph are examples of the philosophical ``use vs. mention'' distinction.)
A stylesheet language designed with transclusion in mind can give the designer control over the nature of transcluded data's presentation. For the section reference above, a designer might specify that XREF elements should display a canonical object number (such as ``Section 3.2'') followed by the TITLE child of the transcluded object, and that the text should serve as a link to that object; whereas TRANSCLUDE elements should simply display the entire transcluded object.
A stylesheet mechanism like this can also address problems like those of the STANZA above. Instead of an application providing a single rule for determining the virtual FQGI of transcluded data, it might permit the stylesheet to state what aspects of presentation should come from each context. XREF might be styled as, ``Fetch the TITLE of the object; style it in the current font; change the slant; but then use the object's stylesheet for any children.'' For TRANSCLUDE, it might say, ``Increase the margins by three picas and use the current font; reduce the size by two points; and then use the transcluded object's stylesheet for any children.'' STANZA might be handled as, ``Increase the margins by three picas and use a sans serif font, but otherwise give STANZA complete control over its presentation.''
Another interesting problem presented by transclusion is addressing - linking to documents that use transclusion, or even transcluding parts thereof. Let us consider what happens if a user selects a range in a document that uses transclusion. Let's take an example in which the user selects a range beginning outside transcluded information and ending within the transclusion, and marks that text for copying or exporting.
Exporting the text of the selection is not an interesting problem. But what if the selection is to be copied into a hypertext document?
The first case we consider is that an author makes the selection in another's document, and pastes the information into her own. The most likely behavior in this situation is to simply transclude the selection within the new document; recursive transclusions are no more complicated than a single one.
If the pasting operation is within a single document, transclusion would still be an option, but a more likely desired effect would be to copy the content and markup to the new location. The information native to the document could be copied as-is; the transcluded information, if selected in its entirety, could be referenced from the new location; if the selection is partial, the transclusion reference can be copied and then modified by the application, such that the reference now reflects the desired extent of the target information.
We also consider the special case in which the user has selected exactly the extent of a transcluded object (modulo some whitespace). The application's behavior here presents a potential field for market differentiation. One behavior would be to dumbly transclude the user's selection, creating a two-level transclusion. A more intelligent approach would be to recognize that the selection was composed solely of a transclusion, and create a transclusion in the new document directly to the original source of the information. In the latter case, the author might prefer that the reference actually be to the ``younger'' document; an application might prompt the user for the intended source of the information.
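The ``more intelligent'' behavior for this special case amounts to following a chain of references back to data that is not itself a transclusion. The Python sketch below assumes a hypothetical table, transclusions, that records for each address consisting solely of a transclusion the address it points to; the addresses shown are invented for the example.

```python
def original_source(address, transclusions):
    """Follow transclusion references until reaching non-transcluded data.

    transclusions maps an address whose content is exactly one
    transclusion to the address that transclusion references.
    """
    seen = set()
    while address in transclusions:
        if address in seen:
            raise ValueError(f"transclusion cycle at {address}")
        seen.add(address)
        address = transclusions[address]
    return address


# Hypothetical addresses: the "younger" document quotes the older one.
transclusions = {
    "younger.xml#quote1": "older.xml#para3",
    "review.xml#excerpt": "younger.xml#quote1",
}

print(original_source("review.xml#excerpt", transclusions))
print(original_source("older.xml#para3", transclusions))
```

An application that prompts the author, as suggested above, would offer both the address the author actually selected and the address this function returns.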
A still more interesting case is the modification of transcluded objects. It is not at all uncommon to wish to edit quoted texts: to elide unnecessary verbiage, or to replace ambiguous pronouns. Since such modifications will typically be different for every instance of modified transclusion, we suggest that instructions for modifying the transcluded information could be contained directly within the transcluding document.
For instance, DSSSL expression syntax or a similar syntax might be used:
<PARA>Then, the author ludicrously suggests:
<TRANSCLUDE HREF="some.url#CHILD(3,PARA)">
<REPLACE>
(list (list (node-list-first (children (current-node))) "[The author]")
      (list (node-list-tail (children (current-node)) 89)
            (string #\[ #\horizontal-ellipsis #\])))
</REPLACE>
</TRANSCLUDE>
</PARA>
which would replace the first character with the string ``[The author]'', and everything after a certain point with ``[...]''.
A hypertext language might also supply architectural forms or known semantics for a simple patching instruction set:
<TRANSCLUDE HREF="some.url#CHILD(3,PARA)">
<REPLACE HREF="some.url#CHILD(3,PARA)STRING(1,'I',0)"> [The author]</REPLACE>
<REPLACE HREF="some.url#CHILD(3,PARA)STRING(23,'t',0)..DITTO()STRING(-1,'.',0)"> […]</REPLACE>
</TRANSCLUDE>
which performs the same substitution with XLL addressing syntax.
 As of 31 July 1997.
Most modification operations would be covered by a simple set of three operators: insert, delete, and replace, applicable to elements or strings therein. A hypertext language would provide the means of addressing regions to be modified.
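A minimal Python sketch of such an operator set, applied to strings only, follows. It assumes that regions are addressed by character offsets into the original text and that patches do not overlap; the tuple layout and the sample sentence are hypothetical, and a real system would address regions with the hypertext language's own pointers instead.

```python
def apply_patches(text, patches):
    """Apply (op, start, end, new) patches to a private copy of text.

    Offsets refer to the original text, so patches are applied from the
    rightmost to the leftmost, which keeps earlier offsets valid.
    Overlapping patches are not handled in this sketch.
    """
    result = text
    for op, start, end, new in sorted(patches, key=lambda p: p[1], reverse=True):
        if op == "insert":
            result = result[:start] + new + result[start:]
        elif op == "delete":
            result = result[:start] + result[end:]
        elif op == "replace":
            result = result[:start] + new + result[end:]
        else:
            raise ValueError(f"unknown operator: {op}")
    return result


# Hypothetical quoted sentence; the referenced data itself is never changed.
quoted = "I believe that pigs can fly, as everyone who has visited a farm knows."
cut = quoted.index(",")
patches = [
    ("replace", 0, 1, "[The author]"),
    ("replace", cut, len(quoted), " [...]"),
]
print(apply_patches(quoted, patches))
```

Because the patches live with the referring document and are applied to a copy at display time, the same referenced data can be modified differently by each document that transcludes it.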
Modification of transcluded information opens a variety of issues in user interfaces. Editing tools can distinguish themselves in how efficiently they can record the author's modifications. One particularly interesting area is that of reader modification. If a reader has permission to edit both the referring document and the referenced data, and changes transcluded information, which context should reflect the change? It is our suggestion that in this ambiguous case, a user interface should ask the user which was intended, since it seems unlikely that any general rule could anticipate the user's intention.
We believe that any case of transclusion can be analyzed by considering these three aspects (display, addressing, modification) separately. Market differentiation can occur based on how an application analyzes a transclusion problem. For instance, if a modified transclusion is copied to another document, an application might simply transclude the referenced section; it might copy the transclusion and modification information; or it might prompt the user to decide between those two options and the option to modify the original information again.
Some actions present complications of these aspects. For instance, out-of-line presentational aspects, such as footnotes, can pose a challenge to a user interface. Should the footnotes be presented along with those of the referring document, merged into the numbering? Should the original document's footnotes be displayed directly after the transcluded content - and if so, should they be numbered from 1, or should their original numbers be used? Should they be included at all?
There is also the thorny issue of information ownership. If someone is determined to violate the law or simply good taste by stealing information, there is little way to prevent it. With the current state of the Web, the pirate can simply copy from a document's source; unauthorized transclusion can at least let a curious reader determine the actual source of the information. It is important that user agents retain the original context of transcluded objects, so that information about the object can be determined, such as the ownership declared in metadata associated with the document's root element, cataloging information, and associated alternative stylesheets.
Financial issues must also be considered carefully. First, if transcluded information is only available to paying customers, a user agent should prompt the user before fetching it. (One would hope that a user agent would prompt before initiating any financial transaction.) And what if the transcluded information is subject to royalties? The creator of a transclusion would do well to examine the subject of her links; if her page proves to be popular, she may well become responsible for large sums to the information's owner.
In the highly non-static arena of the Web, there is also the issue of moving targets. It relates closely to ambiguities we encounter in speech: If someone says ``I want to meet the President,'' it seems obvious enough what they mean. But in fact they may mean either of two things, and we cannot tell which: they may wish to meet the particular person who held the office when they said it (even if that person is no longer President later), or, instead, whoever happens to hold the office when the meeting occurs. In much the same way, a reference to given data may have different intents, which are hard to tease apart until something changes later (but then may become quite important).
We believe that no one solution can be proposed that will handle all forms of transclusion, but that there exists an excellent opportunity for user agents to distinguish themselves, as XLL makes transclusion an integral part of our Internet lives.
Steve DeRose has worked with document and hypermedia systems since 1979 when he joined the FRESS project. In 1989 he completed his Ph.D. at Brown University and co-founded Electronic Book Technologies (now part of Inso Corporation), developing DynaText and other products. In his role as Chief Scientist he is active in many standards committees such as TEI and XML, and is co-editor of XLL. He is a frequent speaker in industry and academe, and has written many papers and two books: Making Hypermedia Work: A User's Guide to HyTime (with David Durand), and The SGML FAQ Book. He lives, works, and ice-skates in Rhode Island with his wife Laurie and sons Todd and Brian.
Chris Maden holds a degree in electrical engineering from Brown University, where he first became interested in semantic markup. He began his career as a Technical Support Representative with Electronic Book Technologies, now a part of Inso Corporation, and now works heavily with SGML and DSSSL for O'Reilly, publishers of the popular Nutshell series of computer books. He was an active participant in the development of HTML 2.0 and of XML, and now sits on the XSL Working Group. He is writing a book on XML for O'Reilly, and lives in Newton, Massachusetts with his wife, Ellie.