Before reading this, you may wish to download and print out the formatted RTF output of the sample SGML and stylesheet.
A common problem with mathematical markup schemes is that it is difficult to reconcile the presentational and semantic aspects of math. While it is desirable to have semantic markup for access by modeling programs, mathematical tradition relies heavily on presentational features to communicate information to readers.
Another problem is that mathematicians and scientists are constantly inventing new notations to explain new ideas. Any fixed mathematical DTD is doomed to disuse because of this.
This poster presents an informal suggestion for a method of using extensible semantic markup together with a stylesheet to control presentational features. I'll use Maxwell's equations of electrodynamics for a demonstration.
All functions are represented as single elements whose children are their arguments. At the first level, we introduce generic functions: multiplication, negation, fractions, addition, dot products, and cross products. (See the first <eqns> element in the sample SGML.)
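As a rough illustration of this first level, here is how Faraday's law (curl E = -dB/dt) might be marked up with generic functions only. This is a sketch, not a copy of the sample SGML: every element name except <eqns> is an assumption made for the example, including <eq> for equality, <nabla> and <partial> as empty symbol elements, and <vector> and <var> for operands.

```sgml
<!-- Hypothetical first-level markup; only eqns is named in the text. -->
<eqns>
  <eq>
    <!-- left side: cross product of nabla and the vector E -->
    <cross><nabla><vector>E</vector></cross>
    <!-- right side: negated fraction, (partial B) over (partial t) -->
    <neg>
      <frac>
        <mult><partial><vector>B</vector></mult>
        <mult><partial><var>t</var></mult>
      </frac>
    </neg>
  </eq>
</eqns>
```

Each function is one element, and its arguments appear as children in order, so a program reading the markup never has to parse operator symbols.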
As a further level of abstraction, we introduce more specific operations; for instance, instead of representing the curl of a vector as a cross of nabla and the vector, we introduce a curl element which takes the vector as its single child. (See the second <eqns> element in the SGML.)
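A sketch of the same equation at this second level might look like the following. Only <curl> and <eqns> come from the text; the other element names are assumptions carried over for illustration.

```sgml
<!-- Hypothetical second-level markup: curl replaces cross-of-nabla. -->
<eqns>
  <eq>
    <!-- the nabla symbol is no longer in the markup at all -->
    <curl><vector>E</vector></curl>
    <neg>
      <frac>
        <mult><partial><vector>B</vector></mult>
        <mult><partial><var>t</var></mult>
      </frac>
    </neg>
  </eq>
</eqns>
```

The markup now states what operation is meant, and the choice of symbols used to display it is left entirely to the stylesheet.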
One immediately apparent benefit is that because the nabla is generated by the semantics of the <curl> element, it is no longer italicized like other vectors, giving a more appropriate appearance. (Compare the first and second equation groups in the output.)
The third level of abstraction does not introduce any new element types or stylesheet dependencies, but leads to even more semantic markup. I have defined entities for each variable; by referring to the entity instead of marking up a letter, the SGML is more easily interpreted even in the absence of a stylesheet. (See the third <eqns> element and the internal subset in the SGML.)
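To make the entity idea concrete, a minimal sketch of an internal subset and an equation that uses it could look like this. The entity names (E-field, B-field, time), the system identifier math.dtd, and all element names other than <eqns> and <curl> are hypothetical.

```sgml
<!DOCTYPE eqns SYSTEM "math.dtd" [
  <!-- Hypothetical entities: one per variable, named for its meaning. -->
  <!ENTITY E-field "<vector>E</vector>">
  <!ENTITY B-field "<vector>B</vector>">
  <!ENTITY time    "<var>t</var>">
]>
<eqns>
  <eq>
    <curl>&E-field;</curl>
    <neg>
      <frac>
        <mult><partial>&B-field;</mult>
        <mult><partial>&time;</mult>
      </frac>
    </neg>
  </eq>
</eqns>
```

After entity expansion the parser sees the same elements as before, which is why no new element types or stylesheet rules are needed.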
The DTD is simple. Elements are grouped by how many arguments they can take. This list can and should be extended by users as necessary for their needs. The stylesheet provides examples for inserting the mathematical symbols corresponding to the markup: addition places a plus sign between each pair of adjacent children; dot product places a dot between its two children.
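One way to group elements by argument count is with parameter entities, as in the sketch below. This is an assumed layout, not the actual DTD: the parameter entity names and every element name other than <eqns> and <curl> are invented for illustration.

```sgml
<!-- Hypothetical DTD fragment: element types grouped by arity. -->
<!ENTITY % nullary "nabla | partial">
<!ENTITY % unary   "neg | curl">
<!ENTITY % binary  "frac | dot | cross">
<!ENTITY % nary    "add | mult">
<!ENTITY % leaf    "var | vector">
<!ENTITY % term    "%nullary; | %unary; | %binary; | %nary; | %leaf;">

<!ELEMENT eqns        - - (eq+)>
<!ELEMENT eq          - - ((%term;), (%term;))>
<!ELEMENT (%nullary;) - O EMPTY>
<!ELEMENT (%unary;)   - - (%term;)>
<!ELEMENT (%binary;)  - - ((%term;), (%term;))>
<!ELEMENT (%nary;)    - - ((%term;), (%term;)+)>
<!ELEMENT (%leaf;)    - - (#PCDATA)>
```

With this arrangement, a user who needs a new operation (a divergence element, say) would add its name to the entity for the right arity and write a matching stylesheet rule; no content models need to change.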
Grouping is not dealt with here. In theory, DSSSL is powerful enough that the stylesheet could calculate where parentheses are necessary. However, that would be extremely complex, and I suggest simply using a <group> or <paren> element to contain anything that needs to be grouped.
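For instance, a cross product whose second argument is a sum, a × (b + c), could be grouped explicitly as sketched here. The <paren> element is the one suggested above; <cross>, <add>, and <vector> are assumed names.

```sgml
<!-- Hypothetical markup for a cross (b plus c), grouped by hand. -->
<cross>
  <vector>a</vector>
  <paren>
    <add><vector>b</vector><vector>c</vector></add>
  </paren>
</cross>
```

The stylesheet then only has to draw parentheses around the content of <paren>, with no precedence logic of its own.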
This stylesheet uses tables to format groups of equations. This is only because DSSSL's math flow objects were not yet supported by Jade when this was developed. The math flow objects would obviously be a much better choice to use in the stylesheet.
For comparison, I've marked up the same equations in MathML, both presentational and semantic. Here is a critique of MathML in particular, and math DTD efforts in general.
All files discussed here, plus an additional document instance and formatted output with some examples from Donald Knuth's The TeXbook, can be found in this zip file.
Last updated and validated 19 February 1998 by Chris Maden.