Polite Warnings On Twitter Might Just Reduce Some Online Hate

A team of researchers recently ran an experiment and found that warning Twitter users in advance about punitive action for using hateful language lowers the likelihood that they post hateful content by up to 20 percent. Twitter has struggled with the problem for a while and has recently stepped up its efforts to make the platform less toxic. One of the most prominent casualties of Twitter’s content rules has been former U.S. President Donald Trump, who was permanently banned from the platform.

Over the past few quarters, Twitter has taken numerous measures to tackle toxic interactions and misinformation. In May this year, Twitter started rolling out prompts that ask users to reconsider before posting something offensive or hurtful to an individual or a group. And to ensure that users have context and don’t help spread misinformation or other harmful content, Twitter began asking users to read an article before retweeting it. The issues are persistent, however, and tackling them has proved tricky for Twitter, especially in markets outside the U.S.

A team of researchers from New York University conducted an experiment that involved warning users that their accounts could be suspended if they posted hateful content. Published by Cambridge University Press, the paper, titled “Short of Suspension: How Suspension Warnings Can Reduce Hate Speech on Twitter,” studied the effectiveness of warning users about suspension, as opposed to outright suspension, and how each affects the likelihood of hateful posts. The team concluded that the warnings do work, and that when the warning messages are phrased more politely, the likelihood of posting hateful content drops further. As part of the research, the team collected over 600,000 hateful tweets posted in the week before July 21 last year and isolated a group of roughly 4,300 followers of accounts that had been suspended for violating the platform’s content policy.
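The sampling step described above can be pictured as a simple two-stage filter: keep tweets that contain hateful terms, then keep only authors who follow a suspended account. Below is a minimal Python sketch of that idea, using made-up data and a placeholder term list rather than the researchers’ actual dataset, lexicon, or the Twitter API:

```python
import pandas as pd

# Placeholder inputs; the real study worked with ~600,000 tweets and
# Twitter's follower graph, not this toy data.
HATEFUL_TERMS = {"slur1", "slur2"}           # hypothetical terms, not the real lexicon

tweets = pd.DataFrame({
    "author": ["user_a", "user_b", "user_c", "user_d"],
    "text":   ["hello world", "slur1 rant", "nice day", "slur2 again"],
})

suspended_accounts = {"big_account_x"}        # accounts already suspended by Twitter
followers_of = {                              # hypothetical follower lists
    "big_account_x": {"user_b", "user_d", "user_e"},
}

# Stage 1: keep only tweets containing at least one hateful term.
def is_hateful(text: str) -> bool:
    return bool(set(text.lower().split()) & HATEFUL_TERMS)

hateful_tweets = tweets[tweets["text"].apply(is_hateful)]

# Stage 2: restrict to authors who follow at least one suspended account,
# mirroring the paper's pool of roughly 4,300 candidate users.
followers_of_suspended = set().union(*followers_of.values())
candidates = set(hateful_tweets["author"]) & followers_of_suspended

print(candidates)   # e.g. {'user_b', 'user_d'}
```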

Twitter Can Take Some Constructive Lessons

The team then sent a warning message starting with the line, “The user [@account] you follow was suspended, and I suspect that this was because of hateful language.” The language that followed varied: it could be something straightforward, such as “If you continue to use hate speech, you might get suspended temporarily,” or a tad more politely phrased, such as “If you continue to use hate speech, you might lose your posts, friends and followers, and not get your account back.” The goal was to deliver an effective warning, make the prompt appear legitimate, and drive home the idea that the punitive action of suspension would follow if the recipient posted problematic content.
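In practice, a design like this amounts to randomly assigning each candidate user to one of several message variants (or to a control group that receives nothing). Here is a rough Python sketch of such an assignment, with the variant names and the exact split being my own illustrative assumptions rather than the paper’s procedure:

```python
import random

# Hypothetical variant labels and wording; the study's actual messages varied
# along dimensions such as politeness, and the real assignment was done by the
# researchers, not by this script.
VARIANTS = {
    "control": None,
    "plain": ("The user [@account] you follow was suspended, and I suspect that "
              "this was because of hateful language. If you continue to use hate "
              "speech, you might get suspended temporarily."),
    "polite": ("The user [@account] you follow was suspended, and I suspect that "
               "this was because of hateful language. If you continue to use hate "
               "speech, you might lose your posts, friends and followers, and not "
               "get your account back."),
}

def assign_variants(user_ids, seed=42):
    """Randomly assign each user to one message variant (or control)."""
    rng = random.Random(seed)
    return {user: rng.choice(list(VARIANTS)) for user in user_ids}

assignments = assign_variants(["user_b", "user_d", "user_e"])
for user, variant in assignments.items():
    print(user, "->", variant)
```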

The team noted that sending a warning prompt reduced the ratio of hateful tweets by 0.007, or by up to 10 percent, over the following week. In scenarios where the message was more politely phrased, the reduction in hateful tweets climbed higher, to between 15 and 20 percent. And even though the effects of such warnings lasted only up to a month, even a temporary reprieve from toxic tweets is an encouraging sign, and one the team over at Twitter could build upon.
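To see how the two headline figures fit together: an absolute drop of 0.007 in the hateful-tweet ratio corresponds to roughly a 10 percent relative reduction only if the baseline ratio sits around 0.07, which is an assumption on my part for illustration rather than a number quoted here. A quick sanity check in Python:

```python
# Assumed baseline: if roughly 7% of a warned user's tweets were hateful before
# the warning (0.07), then a 0.007 absolute drop is about a 10% relative reduction.
baseline_ratio = 0.07            # assumption for illustration, not from the article
absolute_drop = 0.007            # reported effect of receiving a warning

relative_reduction = absolute_drop / baseline_ratio
print(f"{relative_reduction:.0%}")   # prints: 10%
```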