TY - GEN
T1 - From Language to Pixels
T2 - 2nd Workshop on Generalisation (Benchmarking) in NLP, GenBench 2024
AU - Falkenstein, Janek
AU - Schuster, Carolin
AU - Berger, Alex
AU - Groh, Georg
N1 - Publisher Copyright: © 2024 Association for Computational Linguistics.
PY - 2024
Y1 - 2024
N2 - Large language models (LLMs) can perform unseen tasks by learning from a few in-context examples. How in-context learning works is still uncertain. We investigate the mechanisms of in-context learning on a challenging non-language task. The task requires the LLM to generate pixel matrices representing images of basic shapes. We introduce a framework to analyze if this task is solved by recognizing similar formats from the training data (task recognition) or by understanding the instructions and learning the skill de novo during inference (task learning). Our experiments demonstrate that LLMs generate meaningful pixel matrices with task recognition and fail to learn such tasks when encountering unfamiliar formats. Our findings offer insights into LLMs’ learning mechanisms to guide future research on their seemingly human-like behavior.
UR - http://www.scopus.com/inward/record.url?scp=85216530046&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85216530046
T3 - GenBench 2024 - GenBench: 2nd Workshop on Generalisation (Benchmarking) in NLP, Proceedings of the Workshop
SP - 27
EP - 41
BT - GenBench 2024 - GenBench: 2nd Workshop on Generalisation (Benchmarking) in NLP, Proceedings of the Workshop
A2 - Hupkes, Dieuwke
A2 - Dankers, Verna
A2 - Batsuren, Khuyagbaatar
A2 - Kazemnejad, Amirhossein
A2 - Christodoulopoulos, Christos
A2 - Giulianelli, Mario
A2 - Cotterell, Ryan
PB - Association for Computational Linguistics (ACL)
Y2 - 16 November 2024
ER -