TY - JOUR
T1 - The validation of SGML content models
AU - Brüggemann-Klein, A.
AU - Wood, D.
N1 - Funding Information: This work was partially supported under the Natural Sciences and Engineering Research Council of Canada and the Information Technology Research Center of Ontario grants of the second author.
PY - 1997/2
Y1 - 1997/2
AB - The Standard Generalized Markup Language (SGML) is an ISO standard that provides a syntactic meta-language for the definition of textual markup systems, which are used to indicate the structure of documents so that they can be electronically typeset, searched, and communicated. We address only one problem raised by the standard, namely: in SGML, the right-hand sides of context-free productions are regular expressions, called content models, that are restricted to be what the standard calls 'unambiguous,' but what is more appropriately called deterministic. We solve the problem of how to define determinism precisely, how to recognize deterministic regular expressions efficiently, and how to recognize deterministic regular languages. Any SGML parser must check that a given document grammar conforms to the standard; that is, it must validate it. Hence, our results are an important step in the clarification of the standard and in the efficient implementation of an SGML parser for SGML document grammars.
KW - Ambiguity
KW - Conformance
KW - Content model
KW - Regular expressions
KW - SGML
UR - http://www.scopus.com/inward/record.url?scp=0343990720&partnerID=8YFLogxK
U2 - 10.1016/S0895-7177(97)00025-3
DO - 10.1016/S0895-7177(97)00025-3
M3 - Article
AN - SCOPUS:0343990720
SN - 0895-7177
VL - 25
SP - 73
EP - 84
JO - Mathematical and Computer Modelling
JF - Mathematical and Computer Modelling
IS - 4
ER -