TY - JOUR
T1 - Text-to-structure interpretation of user requests in BIM interaction
AU - Wei, Yinyi
AU - Li, Xiao
AU - Petzold, Frank
N1 - Publisher Copyright: © 2025 Elsevier B.V.
PY - 2025/6
Y1 - 2025/6
N2 - Numerous efforts have been devoted to utilizing a natural language-based interface for BIM interaction. These interfaces require extracting the user's intent (i.e., the operation type) and slots (i.e., the targeted elements and properties). However, there is a lack of a fine-grained approach for extracting intent and slot information simultaneously. This paper introduces a text-to-structure approach based on language models to interpret user requests for BIM interaction (T2S4BIM). It proposes a synthetic data generation method and a curated dataset as data support. Employing Transformer-based models, T2S4BIM converts unstructured user requests into a structured format with intent and slot information. Experiments demonstrated that T2S4BIM outperformed existing approaches, with encoder-decoder models like T5 and FLAN-T5 achieving performance comparable to larger, decoder-only models such as Llama3.1-8B and Qwen2.5-7B, while improving efficiency. The practical applicability of T2S4BIM was illustrated through a Revit plug-in that interprets user requests and executes corresponding actions (e.g., manipulating object properties).
AB - Numerous efforts have been devoted to utilizing a natural language-based interface for BIM interaction. These interfaces require extracting the user's intent (i.e., the operation type) and slots (i.e., the targeted elements and properties). However, there is a lack of a fine-grained approach for extracting intent and slot information simultaneously. This paper introduces a text-to-structure approach based on language models to interpret user requests for BIM interaction (T2S4BIM). It proposes a synthetic data generation method and a curated dataset as data support. Employing Transformer-based models, T2S4BIM converts unstructured user requests into a structured format with intent and slot information. Experiments demonstrated that T2S4BIM outperformed existing approaches, with encoder-decoder models like T5 and FLAN-T5 achieving performance comparable to larger, decoder-only models such as Llama3.1-8B and Qwen2.5-7B, while improving efficiency. The practical applicability of T2S4BIM was illustrated through a Revit plug-in that interprets user requests and executes corresponding actions (e.g., manipulating object properties).
KW - BIM interaction
KW - Building information modeling
KW - Language models
KW - Natural language processing
KW - User request understanding
UR - http://www.scopus.com/inward/record.url?scp=86000735536&partnerID=8YFLogxK
U2 - 10.1016/j.autcon.2025.106119
DO - 10.1016/j.autcon.2025.106119
M3 - Article
AN - SCOPUS:86000735536
SN - 0926-5805
VL - 174
JO - Automation in Construction
JF - Automation in Construction
M1 - 106119
ER -