Publications - TexPrax

Wang, Yuxi; Longard, Lukas; Hertle, Christian; Metternich, Joachim

Beyond Pareto Analysis: A Decision Support Model for the Prioritization of Deviations with Natural Language Processing Article

In: 2023.

Stangier, Lorenz; Lee, Ji-Ung; Wang, Yuxi; Müller, Marvin; Frick, Nicholas; Metternich, Joachim; Gurevych, Iryna

TexPrax: A Messaging Application for Ethical, Real-time Data Collection and Annotation Proceedings Article

In: Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing: System Demonstrations, pp. 9–16, Association for Computational Linguistics, Taipei, Taiwan, 2022.

@inproceedings{stangier-etal-2022-texprax,

title = {TexPrax: A Messaging Application for Ethical, Real-time Data Collection and Annotation},

author = {Lorenz Stangier and Ji-Ung Lee and Yuxi Wang and Marvin Müller and Nicholas Frick and Joachim Metternich and Iryna Gurevych},

url = {https://aclanthology.org/2022.aacl-demo.2},

year  = {2022},

date = {2022-11-01},

booktitle = {Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing: System Demonstrations},

pages = {9--16},

publisher = {Association for Computational Linguistics},

address = {Taipei, Taiwan},

abstract = {Collecting and annotating task-oriented dialog data is difficult, especially for highly specific domains that require expert knowledge. At the same time, informal communication channels such as instant messengers are increasingly being used at work. This has led to a lot of work-relevant information that is disseminated through those channels and needs to be post-processed manually by the employees. To alleviate this problem, we present TexPrax, a messaging system to collect and annotate _problems_, _causes_, and _solutions_ that occur in work-related chats. TexPrax uses a chatbot to directly engage the employees to provide lightweight annotations on their conversation and ease their documentation work. To comply with data privacy and security regulations, we use an end-to-end message encryption and give our users full control over their data which has various advantages over conventional annotation tools. We evaluate TexPrax in a user-study with German factory employees who ask their colleagues for solutions on problems that arise during their daily work. Overall, we collect 202 task-oriented German dialogues containing 1,027 sentences with sentence-level expert annotations. Our data analysis also reveals that real-world conversations frequently contain instances with code-switching, varying abbreviations for the same entity, and dialects which NLP systems should be able to handle.},

keywords = {},

pubstate = {published},

tppubtype = {inproceedings}

}

Müller, Marvin; Longard, Lukas; Metternich, Joachim

Comparison of preprocessing approaches for text data in digital shop floor management systems Article

In: Procedia CIRP, Vol. 107, pp. 179-184, 2022, ISSN: 2212-8271 (Leading manufacturing systems transformation – Proceedings of the 55th CIRP Conference on Manufacturing Systems 2022).

Keywords: text mining, data quality improvement, natural language processing

Lee, Ji-Ung; Klie, Jan-Christoph; Gurevych, Iryna

Annotation Curricula to Implicitly Train Non-Expert Annotators Article

In: Computational Linguistics, Vol. 48, No. 2, pp. 343-373, 2022, ISSN: 0891-2017.

@article{10.1162/coli_a_00436,

title = {Annotation Curricula to Implicitly Train Non-Expert Annotators},

author = {Ji-Ung Lee and Jan-Christoph Klie and Iryna Gurevych},

url = {https://doi.org/10.1162/coli_a_00436},

doi = {10.1162/coli_a_00436},

issn = {0891-2017},

year  = {2022},

date = {2022-01-01},

journal = {Computational Linguistics},

volume = {48},

number = {2},

pages = {343--373},

abstract = {Annotation studies often require annotators to familiarize themselves with the task, its annotation scheme, and the data domain. This can be overwhelming in the beginning, mentally taxing, and induce errors into the resulting annotations; especially in citizen science or crowdsourcing scenarios where domain expertise is not required. To alleviate these issues, this work proposes annotation curricula, a novel approach to implicitly train annotators. The goal is to gradually introduce annotators into the task by ordering instances to be annotated according to a learning curriculum. To do so, this work formalizes annotation curricula for sentence- and paragraph-level annotation tasks, defines an ordering strategy, and identifies well-performing heuristics and interactively trained models on three existing English datasets. Finally, we provide a proof of concept for annotation curricula in a carefully designed user study with 40 voluntary participants who are asked to identify the most fitting misconception for English tweets about the Covid-19 pandemic. The results indicate that using a simple heuristic to order instances can already significantly reduce the total annotation time while preserving a high annotation quality. Annotation curricula thus can be a promising research direction to improve data collection. To facilitate future research---for instance, to adapt annotation curricula to specific tasks and expert annotation scenarios---all code and data from the user study consisting of 2,400 annotations is made available.},

keywords = {},

pubstate = {published},

tppubtype = {article}

}

Glockner, Max; Hou, Yufang; Gurevych, Iryna

Missing Counter-Evidence Renders NLP Fact-Checking Unrealistic for Misinformation Article

In: arXiv preprint arXiv:2210.13865, 2022.

Bigoulaeva, Irina; Sachdeva, Rachneet; Madabushi, Harish Tayyar; Villavicencio, Aline; Gurevych, Iryna

Effective Cross-Task Transfer Learning for Explainable Natural Language Inference with T5 Article

In: arXiv preprint arXiv:2210.17301, 2022.

Baumgärtner, Tim; Wang, Kexin; Sachdeva, Rachneet; Eichler, Max; Geigle, Gregor; Poth, Clifton; Sterz, Hannah; Puerto, Haritz; Ribeiro, Leonardo F. R.; Pfeiffer, Jonas; et al.

UKP-SQUARE: An Online Platform for Question Answering Research Article

In: arXiv preprint arXiv:2203.13693, 2022.

Beck, Tilman; Lee, Ji-Ung; Viehmann, Christina; Maurer, Marcus; Quiring, Oliver; Gurevych, Iryna

Investigating label suggestions for opinion mining in German Covid-19 social media Proceedings Article

In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp. 1–13, Association for Computational Linguistics, Online, 2021.
