CuSINeS: Curriculum-driven Structure Induced Negative Sampling for Statutory Article Retrieval

T. Y.S.S. Santosh, Kristina Kaiser, Matthias Grabmair

Publication: Contribution to book/report/conference proceedings, conference contribution, peer-reviewed

Abstract

In this paper, we introduce CuSINeS, a negative sampling approach to enhance the performance of Statutory Article Retrieval (SAR). CuSINeS offers three key contributions. First, it employs a curriculum-based negative sampling strategy that guides the model to focus on easier negatives initially and progressively tackle more difficult ones. Second, it leverages the hierarchical and sequential information derived from the structural organization of statutes to evaluate the difficulty of samples. Third, it introduces a dynamic semantic difficulty assessment that uses the model being trained, surpassing conventional static methods such as BM25 by adapting the negatives to the model's evolving competence. Experimental results on a real-world expert-annotated SAR dataset validate the effectiveness of CuSINeS across four different baselines, demonstrating its versatility.
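
The three ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Article` fields, the equal 0.5 weights, the `max_gap` cap, the linear pacing controlled by `min_frac`, and the `semantic_score` callable (standing in for a similarity score in [0, 1] produced by the model being trained) are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Article:
    article_id: int          # assumed: position in the statute's reading order
    path: Tuple[str, ...]    # assumed hierarchy, e.g. (book, title, chapter)


def structural_difficulty(pos: Article, neg: Article, max_gap: int = 50) -> float:
    """Score in [0, 1]; negatives structurally closer to the positive count as harder."""
    shared = 0
    for a, b in zip(pos.path, neg.path):
        if a != b:
            break
        shared += 1
    depth = max(len(pos.path), len(neg.path), 1)
    hierarchical = shared / depth                      # shared ancestors in the hierarchy
    gap = min(abs(pos.article_id - neg.article_id), max_gap)
    sequential = 1.0 - gap / max_gap                   # nearby articles are harder
    return 0.5 * hierarchical + 0.5 * sequential       # assumed equal weighting


def sample_negatives(
    pos: Article,
    candidates: Sequence[Article],
    semantic_score: Callable[[Article], float],
    progress: float,
    k: int,
    max_gap: int = 50,
    min_frac: float = 0.2,
    rng: Optional[random.Random] = None,
) -> List[Article]:
    """Draw k negatives from the easiest part of the pool; the pool grows with progress."""
    rng = rng or random.Random()
    # Combine structural and (dynamic) semantic difficulty, then sort easiest first.
    ranked = sorted(
        candidates,
        key=lambda c: 0.5 * structural_difficulty(pos, c, max_gap) + 0.5 * semantic_score(c),
    )
    # Assumed linear pacing: start with min_frac of the pool, end with all of it.
    progress = min(max(progress, 0.0), 1.0)
    frac = min_frac + (1.0 - min_frac) * progress
    pool = ranked[: max(k, int(round(frac * len(ranked))))]
    return rng.sample(pool, min(k, len(pool)))
```

Calling `sample_negatives` with `progress` near 0 restricts sampling to the easiest candidates, while `progress` near 1 exposes the full pool, including negatives from the same chapter as the positive. Because `semantic_score` is re-evaluated on each call, the ranking can follow the model as it improves, in contrast to a fixed BM25 ranking computed once.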

Original language: English
Title: 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings
Editors: Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Publisher: European Language Resources Association (ELRA)
Pages: 4266-4272
Number of pages: 7
ISBN (electronic): 9782493814104
Publication status: Published - 2024
Event: Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024 - Hybrid, Torino, Italy
Duration: 20 May 2024 to 25 May 2024

Publication series

Name: 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings

Conference

Conference: Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024
Country/Territory: Italy
City: Hybrid, Torino
Period: 20/05/24 to 25/05/24
