
MathML

Request Title 

Add a subset of the MathML tags
Request Submitted By  PDF/UA
Executive Summary Add the Presentation MathML 2.0 tags and attributes and the MathML math and semantics tags.
Rationale MathML is an existing well-supported standard for encoding mathematical equations that would give assistive-technology users and search engines access to the formulae.
Use Case(s) Non-linear math
Details of Requested Change

Below are the steps suggested.  Two new subsections (one for MathML elements and one for their attributes) need to be added.  These both introduce tables that need to be added. The text refers to table numbers.  In the text below, they are labeled as mmm and nnn and values need to be given to them based on the numbering in the spec.

Note:  It appears that elements and attributes in the PDF spec are "CamelCased" (capital letters for each new word as in "ColSpan").  In MathML, all lowercase letters are used.  In the proposal below, all lowercase letters are used, but nothing is broken if they are switched to the PDF convention.  It would be slightly simpler for exporters to XHTML if lowercase were used, but I doubt it makes any real difference.

Here are the recommendations for what to change:

  1. Table 340 (Standard structure types for illustration elements) should have the "Formula" row modified to read:
    Structure Type  Description
     Formula (Formula) A mathematical formula.
    This structure type is used to identify an entire content element as a
    formula. Formula can only directly contain the math element.  See table mmm.
  2. The last paragraph of 14.8.4.5 (Illustration Elements) should be modified to read:

    "For accessibility to users with disabilities and other text extraction purposes, an illustration element should have an Alt entry or an ActualText entry (or both) in its structure element dictionary (see 14.9.3, “Alternate Descriptions,” and 14.9.4, “Replacement Text”). The Formula structure element should contain MathML structure elements corresponding to the subexpressions for maximum accessibility. Alt is a description of the illustration, whereas ActualText gives the exact text equivalent of a graphical illustration that has the appearance of text."
  3. A new subsection of 14.8.4.5 (Illustration Elements) should be added at the end of 14.8.4.5.  The text of that section is shown below.

  4. The following row should be added to the bottom of table 341:
    Owner  Description
     MathML-2.0 Attributes associated with MathML structure elements (Table mmm).

    MathML   Attributes governing the layout of mathematical expressions.
  5. A new section should be added after 14.8.5.7 (Table Attributes) that lists the MathML attributes in Table nnn. The text of that section is shown below.

New subsection of 14.8.4.5 (Illustration Elements)

The structure types described in table mmm, "MathML Elements", are used to describe formulas. The math element should only appear inside of a Formula element, and all of the other elements in the table should only appear inside of other MathML elements.

Note:  Strictly speaking, the elements listed in Table mmm are neither BLSEs nor ILSEs.
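
The nesting rule stated above can be sketched as a small check. This is only an illustration in Python, not proposed spec text: the (structure type, children) tuple model of the structure tree and the function name nesting_errors are hypothetical, chosen for the example.

```python
# Hypothetical, simplified model of a structure tree node:
# (structure_type, [child nodes]). Not a real PDF library API.
MATHML_TYPES = frozenset("""
    math mi mn mo mtext mspace ms mglyph mrow mfrac msqrt mroot mstyle
    merror mpadded mphantom mfenced menclose msub msup msubsup munder
    mover munderover mmultiscripts none mprescripts mtable mtr
    mlabeledtr mtd maligngroup malignmark maction semantics annotation
    annotation-xml
""".split())


def nesting_errors(node, parent_type=None):
    """Return a list of messages for violations of the nesting rule."""
    stype, children = node
    errors = []
    if stype == "math":
        # The math element should only appear inside a Formula element.
        if parent_type != "Formula":
            errors.append(f"math inside {parent_type}, expected Formula")
    elif stype in MATHML_TYPES:
        # Every other MathML element should sit inside a MathML element.
        if parent_type not in MATHML_TYPES:
            errors.append(f"{stype} inside {parent_type}, expected a MathML element")
    for child in children:
        errors.extend(nesting_errors(child, stype))
    return errors


good = ("Formula", [("math", [("mrow", [("mi", []), ("mo", []), ("mn", [])])])])
bad = ("P", [("math", [("mi", [])]), ("mi", [])])
print(nesting_errors(good))
print(nesting_errors(bad))
```

The first tree satisfies the rule; the second reports both the misplaced math element and the stray mi element.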

The elements in Table mmm are specified in detail in the MathML 2.0 recommendation and are summarized below. With the exception of the math, semantics, annotation, and annotation-xml elements, these elements are part of chapter 3 (Presentation Markup) of the MathML 2.0 recommendation.

Table mmm  MathML Elements

Structure Type  Description
math The root element of the MathML.
mi Leaf element whose content is an identifier.
mn Leaf element whose content is a number.
mo Leaf element whose content is an operator.
mtext Leaf element whose content is arbitrary text.
mspace Leaf element whose content is a space whose width and height are given by attributes.  If this element is used, a physical whitespace character should be in the document.
ms Leaf element whose content is a string.  The delimiters of the string are given by the lquote and rquote attributes.  The delimiters should not be part of the content of the element.
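mglyph Leaf element that displays a glyph for a non-standard character.  The glyph is identified by the fontfamily and index attributes, and the alt attribute gives an alternate name for it.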
mrow Group any number of horizontally laid out elements together.
mfrac A vertical or beveled fraction with exactly two children.
msqrt A square root with exactly one child.
mroot A radical with exactly two children:  the index and the radicand.
mstyle Change the style of how the child is displayed by changing attributes that are inherited by the children.
merror Enclose a syntax error message from a preprocessor or otherwise indicate an error.
mpadded Adjust the vertical or horizontal space around the child.
mphantom Make the child invisible but preserve its size.  If this element is used, either a white space character must be used or what is drawn should match the background so that it is invisible.  It should not be spoken.
mfenced Surround the children with "fences" (e.g., parentheses) and add separators as specified by the open, close, and separators attributes.  Neither the fences nor the separator(s) should be children of this element.
menclose Enclose the children with lines, circles, cross-outs, or other decorations as specified by the notation attribute.
msub An expression with a subscript.  Both the base and the subscript are children.
msup An expression with a superscript.  Both the base and the superscript are children.
msubsup An expression with a subscript and a superscript.  The base, subscript and superscript are children.
munder An expression with underscript or lower limit.
mover An expression with overscript or upper limit.
munderover An expression with both an underscript/lower limit and an overscript/upper limit.
mmultiscripts An expression with prescripts (sub/superscripts to the left of the base) or tensor indices.  The base of the multiscript should be the first child, followed by pairs of lower and upper indices (subscripts/superscripts).  Missing scripts are indicated using the none element.  Pairs of prescripts follow the postscripts and must be preceded by an mprescripts element.
none Valid as a child of mmultiscripts.  none is used to indicate an unused subscript or superscript as part of a subscript/superscript pair.  In MathML, this is an empty element but because of the requirements for structure elements in PDF, it must point to some content.  It is recommended that applications insert a whitespace character in the empty position so this element can refer to some content.
mprescripts Valid as a child of mmultiscripts.  mprescripts is used to indicate the start of prescript subscript/superscript pairs.  In MathML, this is an empty element but because of the requirements for structure elements in PDF, it must point to some content.  It is recommended that applications insert a whitespace character immediately before or after the notation so this element can point to some content.
mtable A matrix or other tabular mathematical layout.  MathML tables are similar to HTML tables and consist of one or more table rows (mtr or mlabeledtr).  Unlike PDF tables, MathML tables have no headers or captions because headers are not mathematical expressions.
mtr A row in an mtable.  Its parent must be mtable.
mlabeledtr A row in a table that has a label on either the left or right side, as determined by the side attribute.  The label is the first child of mlabeledtr.  The rest of the children represent the contents of the row and are identical to those used for mtr; all of the children except the first must be mtd elements.  Like mtr, its parent must be mtable.
mtd One entry, or cell, in a table or matrix. An mtd element is only allowed as a direct child of an mtr or an mlabeledtr element.
maligngroup An alignment marker that is used to help vertically align specified points within a column of MathML expressions.  maligngroup is a space-like leaf element that is used to divide a column up into groups; see the MathML recommendation for more details.  In MathML, this is an empty element.  It is recommended that applications insert a whitespace character that corresponds to the maligngroup element.
malignmark An alignment marker that is used to help vertically align specified points within a column of MathML expressions.  It specifies a specific alignment point within a maligngroup.  Like maligngroup, it is a space-like leaf element; see the MathML recommendation for more details.  In MathML, this is an empty element.  It is recommended that applications insert a whitespace character that corresponds to the malignmark element.
maction In MathML, maction is used to bind actions to expressions.  In a PDF document, no action is associated with this element, although a plug-in could be written to enliven the expression.  maction is mainly provided for compatibility.  maction takes an arbitrary number of children, although only one child is displayed.  Children that are not rendered cannot be part of the structure tree, and their representation in PDF is currently not specified.
semantics Associates a specific notation with a notation-independent representation that carries more semantic information.  For PDF, the first child must be the MathML for the notation being displayed.  In MathML, subsequent elements (annotation, annotation-xml) specify alternative encodings and are not rendered.  Children that are not rendered cannot be part of the structure tree, and their representation in PDF is currently not specified.
annotation A child element of semantics whose child provides an alternative non-XML representation of the contents of the semantics element. This element cannot currently be part of the structure tree.  For more information, see semantics.
annotation-xml A child element of semantics whose child provides an alternative XML-based representation of the contents of the semantics element. This element cannot currently be part of the structure tree.  For more information, see semantics.

Note:  MathML contains a set of "content" elements that are notation-independent semantic elements. These elements do not have a specific layout and so are not meaningful in PDF unless they are inside of a semantics element.
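
Several rows of Table mmm state structural constraints (child counts and required parents) that a producer or checker could verify mechanically. The following is a minimal sketch in Python using the standard xml.etree.ElementTree module. It covers only the constraints stated explicitly or implied by the descriptions above (the counts for msub, msup, and msubsup are inferred from the children those rows list), and the names CHILD_COUNT, ALLOWED_PARENTS, and check are hypothetical.

```python
import xml.etree.ElementTree as ET

# Child counts taken from the descriptions in Table mmm (partial list).
CHILD_COUNT = {"mfrac": 2, "msqrt": 1, "mroot": 2,
               "msub": 2, "msup": 2, "msubsup": 3}

# Parent requirements taken from the descriptions in Table mmm.
ALLOWED_PARENTS = {
    "mtr": {"mtable"},
    "mlabeledtr": {"mtable"},
    "mtd": {"mtr", "mlabeledtr"},
    "none": {"mmultiscripts"},
    "mprescripts": {"mmultiscripts"},
}


def local_name(tag):
    """Strip an XML namespace prefix such as '{http://...}mfrac'."""
    return tag.rsplit("}", 1)[-1]


def check(elem, parent=None):
    """Return a list of structural problems found in elem and its subtree."""
    name = local_name(elem.tag)
    problems = []
    expected = CHILD_COUNT.get(name)
    if expected is not None and len(elem) != expected:
        problems.append(f"{name}: expected {expected} children, found {len(elem)}")
    allowed = ALLOWED_PARENTS.get(name)
    if allowed is not None and parent not in allowed:
        problems.append(f"{name}: parent must be one of {sorted(allowed)}, found {parent}")
    for child in elem:
        problems.extend(check(child, name))
    return problems


valid = "<math><mfrac><mi>a</mi><msqrt><mi>b</mi></msqrt></mfrac></math>"
broken = "<math><mfrac><mi>a</mi></mfrac><mtd><mn>1</mn></mtd></math>"
print(check(ET.fromstring(valid)))
print(check(ET.fromstring(broken)))
```

The second input is flagged twice: once for the mfrac with a single child and once for the mtd that is not inside an mtr or mlabeledtr.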


New Section after 14.8.5.7 (Table Attributes)

The attribute owner "MathML-2.0" shall be associated with the attributes listed below.

Table nnn lists all of the attributes associated with the MathML elements listed in Table mmm. The description of each attribute is given in the MathML 2.0 recommendation. They are listed below for completeness.

Unless otherwise noted:

  • these attributes are not inherited (i.e., they apply only to the current element of the structure tree, not the children of the element);
  • these attributes are optional;
  • the "type" for each value is "string";
  • "class", "id", "style", "xref", and "xlink:href" are legal attributes for all MathML elements.

The attributes exist for compatibility with MathML generation tools and to allow translation of the math to an XML dialect.

Table nnn  Standard math attributes

Structure Elements  Attributes
math display, altimg, alttext
mi, mn, mtext mathvariant, mathsize, mathcolor, mathbackground
mo mathvariant, mathsize, mathcolor, mathbackground, form, fence, separator, lspace, rspace, stretchy, symmetric, maxsize, minsize, largeop, movablelimits, accent
mspace mathvariant, mathsize, mathcolor, mathbackground, width, height, depth, linebreak
ms mathvariant, mathsize, mathcolor, mathbackground, lquote, rquote
mglyph mathvariant, mathsize, mathcolor, mathbackground, alt (required), fontfamily (required), index (required)
mfrac linethickness, numalign, denomalign, bevelled
mstyle All optional attributes listed for the other elements in this table.  In addition, the following are valid:  scriptlevel, displaystyle, scriptsizemultiplier, scriptminsize, background, veryverythinmathspace, verythinmathspace, thinmathspace, mediummathspace, thickmathspace, verythickmathspace, veryverythickmathspace.  All values are inherited.
mpadded width, lspace, height, depth
mfenced open, close, separators
menclose notation
msub subscriptshift
msup superscriptshift
msubsup subscriptshift, superscriptshift
munder accentunder
mover accent
munderover accentunder, accent
mmultiscripts subscriptshift, superscriptshift
mtable align, rowalign, columnalign, groupalign, alignmentscope, columnwidth, width, rowspacing, columnspacing, rowlines, columnlines, frame, framespacing, equalrows, equalcolumns, displaystyle, side, minlabelspacing
mtr, mlabeledtr rowalign, columnalign, groupalign
mtd rowspan, columnspan, rowalign, columnalign, groupalign
maligngroup groupalign
malignmark edge
maction actiontype (required), selection
semantics, annotation, annotation-xml definitionURL, encoding
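
As an illustration of how an exporter might use Table nnn, the sketch below checks attribute names against a few rows of the table and writes an attribute object with the owner "MathML-2.0" and string-typed values. It is an assumption that each MathML attribute name is used directly as a key in the attribute object; the proposal does not spell this out. The names ALLOWED, REQUIRED, pdf_string, and attribute_object are hypothetical, and only a subset of the table is encoded.

```python
# Attributes that are legal on all MathML elements, per the list above.
COMMON = {"class", "id", "style", "xref", "xlink:href"}
TOKEN = {"mathvariant", "mathsize", "mathcolor", "mathbackground"}

# A partial copy of Table nnn; other rows would be added the same way.
ALLOWED = {
    "math": {"display", "altimg", "alttext"},
    "mi": TOKEN, "mn": TOKEN, "mtext": TOKEN,
    "mfrac": {"linethickness", "numalign", "denomalign", "bevelled"},
    "mfenced": {"open", "close", "separators"},
    "menclose": {"notation"},
    "maction": {"actiontype", "selection"},
}
REQUIRED = {"maction": {"actiontype"}}


def pdf_string(value):
    """Write a PDF literal string, escaping backslashes and parentheses."""
    escaped = value.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    return f"({escaped})"


def attribute_object(element, attrs):
    """Validate attrs for element and return an attribute dictionary in PDF syntax."""
    allowed = ALLOWED[element] | COMMON  # KeyError for rows not encoded here
    unknown = set(attrs) - allowed
    if unknown:
        raise ValueError(f"{element}: unknown attributes {sorted(unknown)}")
    missing = REQUIRED.get(element, set()) - set(attrs)
    if missing:
        raise ValueError(f"{element}: missing required attributes {sorted(missing)}")
    parts = ["/O /MathML-2.0"]
    parts += [f"/{name} {pdf_string(value)}" for name, value in sorted(attrs.items())]
    return "<< " + " ".join(parts) + " >>"


print(attribute_object("mfrac", {"linethickness": "2", "bevelled": "true"}))
print(attribute_object("mfenced", {"open": "(", "close": ")"}))
```

Keeping every value as a string matches the default type stated above and lets the attributes be copied unchanged when the math is translated back to an XML dialect.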
 

 

 
