"A taxonomy, data set, and benchmark for detecting and classifying malevolent dialogue responses"

Conversational interfaces are increasingly popular as a way of connecting people to information. With the increased generative capacity of corpus‐based conversational agents comes the need to classify and filter out malevolent responses that are inappropriate in terms of content and dialogue acts. Previous studies on the topic of detecting and classifying inappropriate content are mostly focused on a specific category of malevolence or on single sentences instead of an entire dialogue. We make three contributions to advance research on the malevolent dialogue response detection and classification (MDRDC) task. First, we define the task and present a hierarchical malevolent dialogue taxonomy. Second, we create a labeled multiturn dialogue data set and formulate the MDRDC task as a hierarchical classification task. Last, we apply state‐of‐the‐art text classification methods to the MDRDC task, and report on experiments aimed at assessing the performance of these approaches.

Keywords: dialogue; malevolent dialogue; MDRDC task; data set; detecting and classifying

Journal Title: Journal of the Association for Information Science and Technology
Year Published: 2021
