Abusive language proliferates on Twitter and other digital discussion platforms. It harms users and challenges the businesses that run those platforms to moderate user activity. The development of machine learning presents an opportunity: models can be trained to detect abusive language and respond to it by engaging the “abuser” to deter their behavior.
We explore the capabilities of current machine learning frameworks for developing a conversational agent that can detect and respond to abusive language on Twitter. Our goal is a product that performs adequately at deterring abusive behavior online, reducing reliance on systems in which people are paid to moderate abusive and harmful content, with the occupational harms that exposure entails.
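To make the detect-and-respond idea concrete, here is a minimal sketch of the detection half, assuming a standard scikit-learn text-classification pipeline. The toy tweets, labels, and the 0/1 abuse encoding are illustrative assumptions, not the paper's actual data or model.

```python
# Minimal abusive-language classifier sketch (illustrative only):
# TF-IDF features feeding a logistic regression, via scikit-learn.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training tweets labeled 1 (abusive) or 0 (benign).
tweets = [
    "you are worthless and stupid",
    "shut up you idiot",
    "nobody likes you, loser",
    "what a lovely day for a walk",
    "congrats on the new job!",
    "great game last night",
]
labels = [1, 1, 1, 0, 0, 0]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(tweets, labels)

# A conversational agent would score each incoming tweet and, when the
# classifier flags it as abusive, reply with a counter-speech message.
pred = model.predict(["you are such an idiot"])[0]
if pred == 1:
    reply = "Hey, let's keep this conversation respectful."
```

In a deployed agent, this classifier would sit behind the Twitter API, and the reply strategy would be informed by the counter-speech theories discussed below; the threshold and response wording are design choices, not fixed here.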
We are primarily concerned with understanding the motivations of online abusers, so we ground our approach in theories of counter speech and social learning, within the affordances of Twitter as a platform.
We are aware of our roles as active bystanders in a networked public sphere, wherein community-led investigation and inquiry are encouraged.