As hateful and extremist content proliferates online, 'counterspeech' is gaining currency as a means of diminishing it. No wonder: counterspeech doesn't impinge on freedom of expression and can be practiced by almost anyone, requiring neither law nor institutions. The idea that 'more speech' is a remedy for harmful speech has been familiar in liberal democratic thought at least since U.S. Supreme Court Justice Louis Brandeis declared it in 1927. We are still without evidence, however, that counterspeech actually diminishes harmful speech or its effects. Such effects would be very hard to measure offline but are somewhat easier to measure online, where speech and responses to it are recorded. In this paper we make a modest start. Specifically, we ask: in what forms and circumstances does counterspeech, which we define as a direct response to hateful or dangerous speech, favorably influence discourse and perhaps even behavior?
To our knowledge, this is the first study of Internet users (not a government or organization) counterspeaking spontaneously on a public platform like Twitter. Our findings are qualitative and anecdotal, since reliable quantitative detection of hateful speech or counterspeech is not yet a fully solved problem, owing to the wide variation in the language employed. We did make progress on detection, however, as reported in an earlier paper that was part of this project (Saleem, Dillon, Benesch, & Ruths, 2016).
We have identified four categories, or "vectors," in each of which counterspeech, like hateful speech, functions quite differently: one-to-one, many-to-one, one-to-many, and many-to-many exchanges. We also present a set of counterspeech strategies extrapolated from our data, with examples of tweets that illustrate those strategies at work, and suggestions for which ones may be successful.