The validation of SGML content models

Research output: Contribution to journal › Article › peer-review

10 Scopus citations

Abstract

The Standard Generalized Markup Language (SGML) is an ISO standard that provides a syntactic meta-language for the definition of textual markup systems, which are used to indicate the structure of documents so that they can be electronically typeset, searched, and communicated. We address only one problem raised by the standard, namely: in SGML, the right-hand sides of context-free productions are regular expressions, called content models, that are restricted to be what the standard calls 'unambiguous,' but what is more appropriately called deterministic. We solve the problem of how to define determinism precisely, how to recognize deterministic regular expressions efficiently, and how to recognize deterministic regular languages. Any SGML parser must check that a given document grammar conforms to the standard; that is, it must validate it. Hence, our results are an important step in the clarification of the standard and in the efficient implementation of an SGML parser for SGML document grammars.
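One common way to make "deterministic" concrete is through the positions of symbol occurrences in a content model: the model is deterministic if, reading left to right, the next input symbol always identifies exactly one occurrence that can match it, with no lookahead. The sketch below illustrates that idea and is not necessarily the algorithm developed in the paper. It assumes a hypothetical tuple encoding of content models (`("sym", name)`, `("cat", l, r)`, `("alt", l, r)`, `("star", e)`, `("opt", e)`), omits the SGML `&` and `+` operators, and the function name `is_deterministic` is introduced here for illustration only.

```python
from itertools import count


def is_deterministic(expr):
    """Return True if no two competing positions carry the same symbol.

    expr uses an illustrative tuple encoding (an assumption, not SGML syntax):
    ("sym", name), ("cat", l, r), ("alt", l, r), ("star", e), ("opt", e).
    """
    ids = count()
    symbol_of = {}  # position -> symbol name
    follow = {}     # position -> positions that may come directly after it

    def walk(node):
        # Returns (nullable, first positions, last positions) for the subtree.
        kind = node[0]
        if kind == "sym":
            p = next(ids)
            symbol_of[p] = node[1]
            follow[p] = set()
            return False, {p}, {p}
        if kind == "alt":
            n1, f1, l1 = walk(node[1])
            n2, f2, l2 = walk(node[2])
            return n1 or n2, f1 | f2, l1 | l2
        if kind == "cat":
            n1, f1, l1 = walk(node[1])
            n2, f2, l2 = walk(node[2])
            for p in l1:  # the right operand may start after the left one ends
                follow[p] |= f2
            first = (f1 | f2) if n1 else f1
            last = (l1 | l2) if n2 else l2
            return n1 and n2, first, last
        if kind == "star":
            _, f, l = walk(node[1])
            for p in l:  # the body may repeat
                follow[p] |= f
            return True, f, l
        if kind == "opt":
            _, f, l = walk(node[1])
            return True, f, l
        raise ValueError(f"unknown node kind: {kind!r}")

    def clash(positions):
        # True if two distinct positions in the set share a symbol.
        symbols = [symbol_of[p] for p in positions]
        return len(symbols) != len(set(symbols))

    _, first, _ = walk(expr)
    return not clash(first) and not any(clash(f) for f in follow.values())


a, b, c = ("sym", "a"), ("sym", "b"), ("sym", "c")

# (a | b)*, a : after reading "a" it is unclear which occurrence of a matched.
ambiguous = ("cat", ("star", ("alt", a, b)), a)

# b*, a, (b*, a)* : the same language, written deterministically.
rewritten = ("cat", ("cat", ("star", b), a), ("star", ("cat", ("star", b), a)))

# (a, b) | (a, c) versus the factored form a, (b | c).
unfactored = ("alt", ("cat", a, b), ("cat", a, c))
factored = ("cat", a, ("alt", b, c))
```

Under this encoding, `is_deterministic(ambiguous)` and `is_deterministic(unfactored)` are false, while `is_deterministic(rewritten)` and `is_deterministic(factored)` are true. The check concerns the expression, not the language: `ambiguous` and `rewritten` denote the same language, which is why recognizing deterministic regular languages is a separate question from recognizing deterministic expressions.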

Original language: English
Pages (from-to): 73-84
Number of pages: 12
Journal: Mathematical and Computer Modelling
Volume: 25
Issue number: 4
State: Published - Feb 1997

Keywords

  • Ambiguity
  • Conformance
  • Content model
  • Regular expressions
  • SGML
