Artificial Intelligence Automatically Reports Hate on the Internet

BoTox | Robert Klatt | March 9, 2024, 2:43 p.m.




An AI developed in Germany is able to automatically detect illegal statements on the Internet and forward them to a public information office.


Darmstadt (Germany). Insults, incitement to hatred, hate speech and threats of violence are common on the Internet. A study by the Potsdam Institute for Climate Impact Research (PIK) recently showed that these unwanted and often illegal hateful comments increase at high temperatures. Researchers at the Darmstadt University of Applied Sciences (h_da) have therefore developed the artificial intelligence (AI) BoTox (bot and context detection in the field of hateful comments), which can automatically recognize and report hate speech.


According to computational linguist Melanie Siegel, the AI not only recognizes illegal statements but also forwards them to the reporting office HesseAgainstHetze, so that the authority can take appropriate legal measures.

“We have identified twelve different criminal offenses.”


Distinction between humans and robots

The BoTox AI can also distinguish whether criminal hate comments come from a human or a bot. According to the researchers, this is particularly difficult with generative AI, because these systems write in an increasingly human-like way. According to a study from the University of Memphis, even experienced linguists often cannot tell whether a text was written by a human or by ChatGPT.

Training data from social networks

To train the AI, the researchers used data from social networks, including Facebook and YouTube, which they were able to access through the operators' interfaces. Data from X, formerly Twitter, was not used to train the AI because Elon Musk recently raised the price of research licenses significantly.


The researchers were therefore only able to use old Twitter data that they had already collected for the previous project Detox (detection of toxicity and aggressiveness in posts and comments on the Internet). This old training data does not cover many current topics, including the war in Ukraine. According to Siegel, methods therefore had to be developed to transfer the data to new topics.

“But we are working on making the data transferable. For example, the same choice of words was used about Chancellor Merkel back then as is used today about the traffic light coalition.”

However, a large portion of hateful comments focuses on topics such as xenophobia, discrimination against minorities, anti-Semitism and migration. To classify the comments, three student assistants divided them into acceptable insults and criminally relevant statements. Based on this training data, the AI was then able to decide with a high degree of accuracy which type of comment was present.
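The supervised setup described above, hand-labelled comments used to train a classifier that separates acceptable insults from criminally relevant statements, can be illustrated with a minimal sketch. All example comments, labels and the model choice (a from-scratch naive Bayes text classifier) are illustrative assumptions; the article does not describe BoTox's actual architecture or data.

```python
# Toy sketch of a supervised hate-speech classifier: labelled
# comments train a naive Bayes model that assigns new comments to
# one of two classes. Placeholder data, not the BoTox system.
import math
from collections import Counter, defaultdict

def train(samples):
    """samples: list of (text, label) pairs.
    Returns per-label word counts and label frequencies."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in samples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    """Pick the label with the highest log-posterior score."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(label_counts.values())
    scores = {}
    for label, doc_count in label_counts.items():
        score = math.log(doc_count / total_docs)  # class prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Laplace smoothing so unseen words don't zero out the score
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        scores[label] = score
    return max(scores, key=scores.get)

# Invented placeholder training data with two label classes
training_data = [
    ("you are an idiot", "acceptable"),
    ("your opinion is stupid", "acceptable"),
    ("people like you should be attacked", "criminal"),
    ("we will hurt you", "criminal"),
]
wc, lc = train(training_data)
print(classify("you should be attacked", wc, lc))  # -> criminal
```

Real systems replace the bag-of-words model with large pretrained language models, but the training loop is the same in principle: human annotators provide the labels, and the model generalizes from them to unseen comments.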
