Multilingual abusive language detection
Computer Science Major (School of Informatics, Computing, & Engineering)
Sandra Kuebler (School of Informatics, Computing, & Engineering)
Abusive language and hate speech are a growing problem on social media. Because of the large volume of data posted every day, automatic methods are needed to detect abusive language. We will work on solutions to this problem in a multilingual setting. Students with a programming background will be involved in developing the automatic methods. Students without a programming background but with knowledge of a language other than English will help characterize abusive language in tweets.
Technology or Computational Component
The automatic methods for detecting abusive language are based on machine learning. Students should ideally know some Python to help develop these approaches. Students without such a background will initially help identify the linguistic characteristics of abusive language. All students will be introduced to the machine learning component and will run experiments after a step-by-step introduction.
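To give a sense of what a machine learning experiment of this kind might look like in Python, here is a minimal sketch. It is not the project's actual method, which this description does not specify. It assumes a simple Naive Bayes classifier over character trigrams, chosen here only because character features need no language-specific tokenizer. The class name, function names, labels, and the tiny training set are all hypothetical placeholders invented for illustration.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Return overlapping character n-grams of a lowercased, space-padded string."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes over character n-grams with add-one smoothing.

    Hypothetical illustration only; not the approach used in the project.
    """

    def fit(self, texts, labels):
        self.feature_counts = defaultdict(Counter)  # label -> n-gram counts
        self.label_counts = Counter(labels)         # label -> number of texts
        self.vocabulary = set()
        for text, label in zip(texts, labels):
            grams = char_ngrams(text)
            self.feature_counts[label].update(grams)
            self.vocabulary.update(grams)
        return self

    def predict(self, text):
        total_texts = sum(self.label_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, n_texts in self.label_counts.items():
            counts = self.feature_counts[label]
            total = sum(counts.values())
            # Log prior plus smoothed log likelihood of each n-gram.
            score = math.log(n_texts / total_texts)
            for gram in char_ngrams(text):
                score += math.log((counts[gram] + 1) / (total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy examples (mild English and German placeholders, not real tweets).
train_texts = [
    "you are an idiot",
    "what a stupid idiot",
    "du bist ein idiot",
    "have a nice day",
    "thank you, my friend",
    "danke, mein freund",
]
train_labels = ["abusive", "abusive", "abusive", "ok", "ok", "ok"]

model = NaiveBayesTextClassifier().fit(train_texts, train_labels)
print(model.predict("such an idiot"))
print(model.predict("thank you friend"))
```

A real experiment would use a much larger annotated dataset, a held-out test set, and most likely an established machine learning library rather than a hand-written classifier.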