2023
Wang, Yuxi; Longard, Lukas; Hertle, Christian; Metternich, Joachim. Beyond Pareto Analysis: A Decision Support Model for the Prioritization of Deviations with Natural Language Processing. Article. In: 2023.
2022
Stangier, Lorenz; Lee, Ji-Ung; Wang, Yuxi; Müller, Marvin; Frick, Nicholas; Metternich, Joachim; Gurevych, Iryna. TexPrax: A Messaging Application for Ethical, Real-time Data Collection and Annotation. Proceedings Article. In: Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing: System Demonstrations, pp. 9–16, Association for Computational Linguistics, Taipei, Taiwan, 2022.
Abstract: Collecting and annotating task-oriented dialog data is difficult, especially for highly specific domains that require expert knowledge. At the same time, informal communication channels such as instant messengers are increasingly being used at work. This has led to a lot of work-relevant information that is disseminated through those channels and needs to be post-processed manually by the employees. To alleviate this problem, we present TexPrax, a messaging system to collect and annotate _problems_, _causes_, and _solutions_ that occur in work-related chats. TexPrax uses a chatbot to directly engage the employees to provide lightweight annotations on their conversation and ease their documentation work. To comply with data privacy and security regulations, we use an end-to-end message encryption and give our users full control over their data which has various advantages over conventional annotation tools. We evaluate TexPrax in a user-study with German factory employees who ask their colleagues for solutions on problems that arise during their daily work. Overall, we collect 202 task-oriented German dialogues containing 1,027 sentences with sentence-level expert annotations. Our data analysis also reveals that real-world conversations frequently contain instances with code-switching, varying abbreviations for the same entity, and dialects which NLP systems should be able to handle.
Müller, Marvin; Longard, Lukas; Metternich, Joachim. Comparison of preprocessing approaches for text data in digital shop floor management systems. Article. In: Procedia CIRP, Vol. 107, pp. 179–184, 2022, ISSN: 2212-8271 (Leading manufacturing systems transformation – Proceedings of the 55th CIRP Conference on Manufacturing Systems 2022). Keywords: text mining, data quality improvement, natural language processing.
Abstract: In an increasing number of production companies shop floor management (SFM) is supported by digital systems. The data generated while working with these systems can be used for assistance systems to further enhance the value of digital SFM. Several assistance systems using text data from problem-solving processes have been suggested but had limited quality due to the domain specific language characteristics: short texts with spelling errors and the usage of synonyms. This research aims to quantify the improvement potentials of different preprocessing approaches on the quality of the assistance systems. For that and for comparison in the research community a public, labeled data set is needed. This paper introduces such a data set based on the characteristics identified in three real industry data sets. To overcome the problems in text processing of shop floor data (e.g. domain specific synonyms), several approaches are suggested, tested, and compared to a generic approach for text clustering. The study identifies best practices for the handling of shop floor text data and provides a data set with the goal of simplifying and stimulating research on this topic.
Lee, Ji-Ung; Klie, Jan-Christoph; Gurevych, Iryna. Annotation Curricula to Implicitly Train Non-Expert Annotators. Article. In: Computational Linguistics, Vol. 48, No. 2, pp. 343–373, 2022, ISSN: 0891-2017.
Abstract: Annotation studies often require annotators to familiarize themselves with the task, its annotation scheme, and the data domain. This can be overwhelming in the beginning, mentally taxing, and induce errors into the resulting annotations; especially in citizen science or crowdsourcing scenarios where domain expertise is not required. To alleviate these issues, this work proposes annotation curricula, a novel approach to implicitly train annotators. The goal is to gradually introduce annotators into the task by ordering instances to be annotated according to a learning curriculum. To do so, this work formalizes annotation curricula for sentence- and paragraph-level annotation tasks, defines an ordering strategy, and identifies well-performing heuristics and interactively trained models on three existing English datasets. Finally, we provide a proof of concept for annotation curricula in a carefully designed user study with 40 voluntary participants who are asked to identify the most fitting misconception for English tweets about the Covid-19 pandemic. The results indicate that using a simple heuristic to order instances can already significantly reduce the total annotation time while preserving a high annotation quality. Annotation curricula thus can be a promising research direction to improve data collection. To facilitate future research, for instance to adapt annotation curricula to specific tasks and expert annotation scenarios, all code and data from the user study consisting of 2,400 annotations is made available.
Glockner, Max; Hou, Yufang; Gurevych, Iryna. Missing Counter-Evidence Renders NLP Fact-Checking Unrealistic for Misinformation. Article. In: arXiv preprint arXiv:2210.13865, 2022.
Bigoulaeva, Irina; Sachdeva, Rachneet; Madabushi, Harish Tayyar; Villavicencio, Aline; Gurevych, Iryna. Effective Cross-Task Transfer Learning for Explainable Natural Language Inference with T5. Article. In: arXiv preprint arXiv:2210.17301, 2022.
Baumgärtner, Tim; Wang, Kexin; Sachdeva, Rachneet; Eichler, Max; Geigle, Gregor; Poth, Clifton; Sterz, Hannah; Puerto, Haritz; Ribeiro, Leonardo FR; Pfeiffer, Jonas; et al. UKP-SQUARE: An Online Platform for Question Answering Research. Article. In: arXiv preprint arXiv:2203.13693, 2022.
2021
Beck, Tilman; Lee, Ji-Ung; Viehmann, Christina; Maurer, Marcus; Quiring, Oliver; Gurevych, Iryna. Investigating label suggestions for opinion mining in German Covid-19 social media. Proceedings Article. In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp. 1–13, Association for Computational Linguistics, Online, 2021.
Abstract: This work investigates the use of interactively updated label suggestions to improve upon the efficiency of gathering annotations on the task of opinion mining in German Covid-19 social media data. We develop guidelines to conduct a controlled annotation study with social science students and find that suggestions from a model trained on a small, expert-annotated dataset already lead to a substantial improvement -- in terms of inter-annotator agreement (+.14 Fleiss' κ) and annotation quality -- compared to students that do not receive any label suggestions. We further find that label suggestions from interactively trained models do not lead to an improvement over suggestions from a static model. Nonetheless, our analysis of suggestion bias shows that annotators remain capable of reflecting upon the suggested label in general. Finally, we confirm the quality of the annotated data in transfer learning experiments between different annotator groups. To facilitate further research in opinion mining on social media data, we release our collected data consisting of 200 expert and 2,785 student annotations.
Müller, Marvin; Metternich, Joachim. Assistenzsysteme durch Natural Language Processing - Umsetzungsstrategien für den Shopfloor. Article. In: Industrie 4.0 Management, Vol. 2021, pp. 11–14, 2021.
Müller, Marvin; Metternich, Joachim. Production specific language characteristics to improve NLP applications on the shop floor. Article. In: Procedia CIRP, Vol. 104, pp. 1890–1895, 2021.
Müller, Marvin; Lee, Ji-Ung; Frick, Nicholas; Stangier, Lorenz; Gurevych, Iryna; Metternich, Joachim. In: Procedia CIRP, Vol. 103, pp. 231–236, 2021.
Müller, Marvin; Alexandi, Emanuel; Metternich, Joachim. Digital shop floor management enhanced by natural language processing. Article. In: Procedia CIRP, Vol. 96, pp. 21–26, 2021.
2020
Müller, Marvin; Frick, Nicholas; Lee, Ji-Ung; Metternich, Joachim; Gurevych, Iryna. Chats als Datengrundlage für KI-Anwendungen in der Produktion. Article. In: Vol. 115, No. 7-8, pp. 520–523, 2020, ISSN: 2511-0896. Keywords: General Engineering, Management Science and Operations Research, Strategy and Management.
Müller, Marvin; Terziev, Georgi; Metternich, Joachim; Landmann, Nils. Knowledge management on the shop floor through recommender engines. Article. In: Procedia Manufacturing, Vol. 52, pp. 344–349, 2020.