Unambiguity of extended regular expressions in SGML document grammars

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

18 Scopus citations

Abstract

In the Standard Generalized Markup Language (SGML), document types are defined by context-free graznmars in an extended Backus-Naur form. The right-hand side of a production is called a content model. Content models are extended regular expressions that have to be unambiguous in the sense that “an element… that occurs in the document instance must be able to satisfy only one primitive content token without looking ahead in the document instance.” In this paper, we present a linear-time algorithm that decides whether a given content model is unambiguous. A similar result has previously been obtained not for content models but for the smaller class of standard regular expressions. It reties on the fact that the languages of marked regular expressions are local—a property that does not hold any more for content models that contain the new &-operator. Therefore, it is necessary to develop new techniques for content models. Besides solving an interesting problem in formal language theory, our results are relevant for developers of SGML systems. In fact, our definitions are causing changes to the revised edition of the SGML standard, and the algorithm to test content models for unambiguity has been implemented in an SGML parser.

Original languageEnglish
Title of host publicationAlgorithms ESA 1993 – 1st Annual European Symposium, Proceedings
EditorsThomas Lengauer
PublisherSpringer Verlag
Pages73-84
Number of pages12
ISBN (Print)9783540572732
DOIs
StatePublished - 1993
Externally publishedYes
Event1st Annual European Symposium on Algorithms, ESA 1993 - Bad Honnef, Germany
Duration: 30 Sep 19932 Oct 1993

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume726 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference1st Annual European Symposium on Algorithms, ESA 1993
Country/TerritoryGermany
CityBad Honnef
Period30/09/932/10/93

Fingerprint

Dive into the research topics of 'Unambiguity of extended regular expressions in SGML document grammars'. Together they form a unique fingerprint.

Cite this