How would you say it? Eliciting lexically diverse data for supervised semantic parsing

Abhilasha Ravichander, Thomas Manzini, Matthias Grabmair, Graham Neubig, Jonathan Francis, Eric Nyberg

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

12 Scopus citations

Abstract

Building dialogue interfaces for real-world scenarios often entails training semantic parsers starting from zero examples. How can we build datasets that better capture the variety of ways users might phrase their queries, and what queries are actually realistic? Wang et al. (2015) proposed a method to build semantic parsing datasets by generating canonical utterances using a grammar and having crowdworkers paraphrase them into natural wording. A limitation of this approach is that it induces bias towards using language similar to the canonical utterances. In this work, we present a methodology that elicits meaningful and lexically diverse queries from users for semantic parsing tasks. Starting from a seed lexicon and a generative grammar, we pair logical forms with mixed text-image representations and ask crowdworkers to paraphrase and confirm the plausibility of the queries that they generated. We use this method to build a semantic parsing dataset from scratch for a dialogue agent in a smart-home simulation. We find evidence that this dataset, which we have named SMARTHOME, is demonstrably more lexically diverse and difficult to parse than existing domain-specific semantic parsing datasets.

Original language: English
Title of host publication: SIGDIAL 2017 - 18th Annual Meeting of the Special Interest Group on Discourse and Dialogue, Proceedings of the Conference
Publisher: Association for Computational Linguistics (ACL)
Pages: 374-383
Number of pages: 10
ISBN (Electronic): 9781945626821
State: Published - 2017
Externally published: Yes
Event: 18th Annual Meeting of the Special Interest Group on Discourse and Dialogue, SIGDIAL 2017 - Saarbrucken, Germany
Duration: 15 Aug 2017 - 17 Aug 2017

Publication series

Name: SIGDIAL 2017 - 18th Annual Meeting of the Special Interest Group on Discourse and Dialogue, Proceedings of the Conference

Conference

Conference: 18th Annual Meeting of the Special Interest Group on Discourse and Dialogue, SIGDIAL 2017
Country/Territory: Germany
City: Saarbrucken
Period: 15/08/17 - 17/08/17
