We successfully implemented a fully automated conversational agent that detects abusive tweets on Twitter and starts a conversation with the author of the respective tweet.
We have determined that our classification system performs well: the instances it classified as abusive in our test case were indeed judged to be abusive. However, because we put effort into minimizing the number of false positives, we have limited our bot's ability to capture all instances of abuse; so far it interacts with only a few of the cases occurring in the entire stream of tweets. For a more in-depth assessment of the classification system's quality, a larger, independent dataset should be labeled for testing the classifier's performance.
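The trade-off described above can be made concrete with a small sketch. The scores, labels, and thresholds below are hypothetical, not taken from our system; they only illustrate how raising the decision threshold suppresses false positives at the cost of recall.

```python
# Illustrative sketch (not our actual pipeline): raising the decision
# threshold increases precision (fewer false positives) but lowers recall,
# which is why the bot currently captures only a few abusive tweets.

def precision_recall(scores, labels, threshold):
    """Precision and recall for the abusive class at a given threshold.
    `scores` are classifier confidences; `labels` are 1 for abusive."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical confidences for eight tweets (1 = truly abusive).
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.10]
labels = [1,    1,    0,    1,    1,    0,    1,    0]

for t in (0.5, 0.85):
    p, r = precision_recall(scores, labels, t)
    print(f"threshold={t}: precision={p:.2f}, recall={r:.2f}")
```

On this toy data, moving the threshold from 0.5 to 0.85 eliminates the false positive but halves the number of abusive tweets caught.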
It would also be interesting to analyze the inter-rater agreement among the classifiers and their concordance with the ground-truth labels, to understand which classifier agrees most closely with the ground truth. Moreover, future work could investigate how the individual “expert” classifiers can be combined to achieve more accurate performance and increase the recall of the classification, for example by building an ensemble classifier that learns how to optimally weight the scores of each expert classifier. The potential of the meta-classification sentiment scores can also be explored further by identifying the scores that best predict the abusiveness of a text.
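The ensemble idea could be realized as a stacking meta-classifier. The sketch below is one possible implementation, not the paper's: a logistic-regression stacker trained by gradient descent learns one weight per expert from the experts' scores; the data is invented for illustration.

```python
# Minimal stacking sketch (an assumption, not our implemented system):
# a logistic-regression meta-classifier learns weights over the scores
# of the individual "expert" classifiers.
import math

def train_stacker(expert_scores, labels, lr=0.5, epochs=2000):
    """expert_scores: one row of per-expert scores per tweet.
    Returns learned per-expert weights and a bias term."""
    w = [0.0] * len(expert_scores[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(expert_scores, labels):
            z = b + sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - y  # gradient of the log-loss
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    z = b + sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

# Toy data: expert 1 tracks the true label; experts 2 and 3 are noisier.
X = [[0.9, 0.4, 0.6], [0.8, 0.7, 0.2], [0.2, 0.6, 0.5], [0.1, 0.3, 0.7]]
y = [1, 1, 0, 0]
w, b = train_stacker(X, y)
```

After training, the stacker assigns the largest weight to the most informative expert, which is exactly the "optimal weighting" behavior described above.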
With respect to the conversational agent performing counter speech, the results of our investigations indicate that the empathy strategy, reminding the author that the receiver of their statement might be harmed by it, is likely to receive more engagement than the other strategies we compared. We identified a number of cases in which the justification for intervening is questionable, such as self-harm, reclaimed racial slurs used among peers, expressions of righteous anger, and abusive language directed at unknown third parties, public figures, or institutions. Deciding how each of these cases should be treated requires a broader ethical debate.
We suggest further developing the conversational agent's dialogues and introducing a measure of conversation health, in order to examine whether continued dialogue can effectively decrease the level of aggression in the moment. We encourage further research into the long-term impact of different counter speech strategies, to identify which is most suitable for preventing repeated abusive behavior.
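One possible operationalization of such a conversation-health measure, sketched here as an assumption rather than an established metric, is to score each turn for aggression with any per-message scorer and track the trend over the dialogue: a negative slope suggests the exchange is de-escalating.

```python
# Hedged sketch of a "conversation health" signal: the least-squares
# slope of per-turn aggression scores over turn index. The scores below
# are hypothetical; any per-message abusiveness scorer could supply them.

def aggression_trend(turn_scores):
    """Least-squares slope of aggression scores over turn index.
    Negative slope = de-escalation, positive = escalation."""
    n = len(turn_scores)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(turn_scores) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, turn_scores))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

# A five-turn exchange that cools down yields a negative slope.
print(aggression_trend([0.9, 0.7, 0.6, 0.4, 0.2]))
```

A measure of this shape would let the agent evaluate, per conversation, whether continued dialogue is actually reducing aggression or whether it should disengage.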
We should also explore the effect of different profile types for the conversational agent, as well as different communication styles, such as positive or negative tone and humor, and the use of images and other media to support the counter speech.