E-Community Health and Toxicity

Online communities abound today, arising on social networking sites, on the websites of real-world communities like schools or clubs, on web discussion forums, on the discussion boards of videogames, and even on the comment pages of news sites and blogs. Some of these communities are “healthy” and foster polite discussion between respectful members, but others are “toxic” and devolve into vitriolic fights, trolling, cyber-bullying, fraud, or, even worse, incitement to suicide, radicalization, or the sexual predation and grooming of minors.

Reddit Community Analysis

People are social creatures, and interacting with others is a fundamental human behaviour. As people spend more time online, a growing part of their social lives moves online as well, which has made online forums such as Reddit very popular. In this project, we want to define what health means for an online forum: is it the quality of the comments, their average length, the frequency of posts, the number of users, the language of the posts, or a combination of these? We also hope to determine which factors affect the health of an online forum.
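As a concrete starting point, here is a minimal sketch (pandas assumed; the column names and toy data are illustrative) that computes several of these candidate health signals for each subreddit:

    import pandas as pd

    # Toy comment data; in practice this would come from a Reddit dump or the API.
    comments = pd.DataFrame({
        "subreddit":   ["askscience", "askscience", "gaming"],
        "author":      ["u1", "u2", "u3"],
        "body":        ["Great question!", "Here is a detailed answer ...", "lol"],
        "created_utc": [1_500_000_000, 1_500_003_600, 1_500_000_500],
    })

    comments["length"] = comments["body"].str.split().str.len()
    health = comments.groupby("subreddit").agg(
        n_comments=("body", "size"),            # posting activity
        n_users=("author", "nunique"),          # community size
        avg_length=("length", "mean"),          # average comment length in words
        span_hours=("created_utc", lambda t: (t.max() - t.min()) / 3600),
    )
    print(health)

Signals such as comment quality and language would require an additional text classifier on top of these simple counts.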

Computer Vision and Deep Learning for Moderating Visual Content

Two Hat Security is a company that develops next-generation moderation tools for social networking apps. Since visual content (e.g., images and videos) is one of the most important types of data shared on social networking apps, an important problem for the company is identifying images and videos that are offensive or inappropriate. For example, certain images or videos might contain violence, nudity, or objects (a knife, a gun, a bikini, etc.) that are considered offensive.
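One plausible approach, sketched below under the assumption that a recent PyTorch/torchvision is installed, is to fine-tune a pretrained CNN as a binary safe/unsafe image classifier; the two-way head and the file name are placeholders, and the head would need to be trained on labelled moderation data before real use:

    import torch
    import torch.nn as nn
    from torchvision import models, transforms
    from PIL import Image

    # Pretrained backbone with a fresh two-way head (safe vs. unsafe).
    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    model.fc = nn.Linear(model.fc.in_features, 2)
    model.eval()

    # Standard ImageNet preprocessing for the backbone.
    preprocess = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])

    img = preprocess(Image.open("upload.jpg").convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        p_unsafe = torch.softmax(model(img), dim=1)[0, 1].item()
    print(f"P(unsafe) = {p_unsafe:.2f}")

Videos can be handled by sampling frames and applying the same classifier frame by frame.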

Automatic Image Filtering Using Deep Learning

Two Hat Security is a company that develops next-generation moderation tools for social networking apps. Since images are among the most important types of data shared on social networking apps, an important problem for the company is to identify images that are unsafe or inappropriate. In particular, images containing certain objects (e.g., a knife, a gun, a bikini) are considered unsafe, and it is clearly impractical to sift through all the images manually to find the unsafe ones.
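Because the unsafe categories are defined by specific objects, a natural sketch (again assuming a recent torchvision; the score threshold and unsafe-object list below are illustrative) is to run a pretrained object detector and flag any image containing a listed object:

    import torch
    import torchvision.transforms.functional as TF
    from torchvision import models
    from PIL import Image

    weights = models.detection.FasterRCNN_ResNet50_FPN_Weights.DEFAULT
    model = models.detection.fasterrcnn_resnet50_fpn(weights=weights)
    model.eval()

    # The COCO label set happens to include "knife"; a production system would
    # train a detector on the company's own unsafe categories (gun, bikini, ...).
    CATEGORIES = weights.meta["categories"]
    UNSAFE = {"knife"}

    img = TF.to_tensor(Image.open("upload.jpg").convert("RGB"))
    with torch.no_grad():
        out = model([img])[0]
    flagged = [CATEGORIES[i] for i, s in zip(out["labels"], out["scores"])
               if s > 0.8 and CATEGORIES[i] in UNSAFE]
    print("unsafe objects found:", flagged)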

Modeling User Behaviour Over Time from Chat Messages

As online communication among children and young adults grows in popularity, concerns about online safety are receiving growing attention. The Two Hat Security (2HS) company has a rule-based filtering system that detects malicious chat messages and identifies abusive users. The proposed project will improve predictions of each user's trustworthiness (trust level), which can change over time and influences what the user is allowed to say in chat messages.
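As a minimal sketch of the idea (the names and constants here are illustrative, not 2HS's actual system), a trust level can be kept as an exponentially smoothed score that drifts up with clean messages and down with flagged ones:

    from dataclasses import dataclass

    @dataclass
    class TrustState:
        score: float = 0.5  # start neutral; 0 = untrusted, 1 = fully trusted

        def update(self, message_was_clean: bool, alpha: float = 0.05) -> float:
            """Nudge the score toward 1 for clean messages, toward 0 for flagged ones."""
            target = 1.0 if message_was_clean else 0.0
            self.score += alpha * (target - self.score)
            return self.score

    user = TrustState()
    for clean in [True, True, False, True]:
        print(round(user.update(clean), 3))

The smoothing constant alpha controls how quickly the trust level reacts to recent behaviour; richer sequence models (e.g. hidden Markov models or recurrent networks) are the natural next step.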

Spelling Correction for Improved Detection of Malicious Chat Messages

Cyberbullying, sexting, profanity, and other forms of malicious chat messages have become increasingly common in online virtual worlds and social networks used by children and teenagers, and these conversations put young users at risk. The partner organization has already implemented a rule-based filtering system to filter out malicious messages. However, not all malicious messages can be filtered out, since people invent ever subtler forms of malicious messages in an effort to subvert such filtering systems.
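A minimal sketch of such a correction step (the leet-speak map and lexicon below are illustrative placeholders, not the partner's actual lists) normalizes common character substitutions and then snaps tokens to a blocklist entry when the edit distance is small:

    # Map common character substitutions back to letters.
    LEET = str.maketrans({"1": "i", "3": "e", "4": "a", "0": "o", "$": "s", "@": "a"})
    LEXICON = {"idiot", "stupid"}  # placeholder for the real blocklist

    def edit_distance(a: str, b: str) -> int:
        """Classic dynamic-programming Levenshtein distance."""
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            cur = [i]
            for j, cb in enumerate(b, 1):
                cur.append(min(prev[j] + 1, cur[-1] + 1, prev[j - 1] + (ca != cb)))
            prev = cur
        return prev[-1]

    def normalize(token: str) -> str:
        t = token.lower().translate(LEET)
        for word in LEXICON:
            if edit_distance(t, word) <= 1:
                return word
        return t

    print([normalize(w) for w in "ur an 1d1ot".split()])  # -> ['ur', 'an', 'idiot']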

Word Representation Learning for Detecting Malicious Chat Messages

Communicating with peers online is an increasingly popular activity for children and teenagers, which has led to growing concerns about bullying and other forms of malicious behaviour in online chat rooms and virtual worlds. Although some online forums are patrolled by human moderators, the amount of text being generated is typically too large even for a dedicated team of humans to process. This research project will adapt machine learning techniques for the task of automatically detecting and filtering malicious messages in online chat rooms and virtual worlds.
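As a hedged illustration of the pipeline (gensim and scikit-learn assumed; the toy data is far too small to learn useful vectors), one can train word embeddings, average them per message, and fit a simple classifier on top:

    import numpy as np
    from gensim.models import Word2Vec
    from sklearn.linear_model import LogisticRegression

    # Toy labelled messages: 0 = benign, 1 = malicious.
    messages = [("you are awesome", 0), ("have a nice day", 0),
                ("you are an idiot", 1), ("shut up idiot", 1)]
    tokens = [m.split() for m, _ in messages]

    w2v = Word2Vec(tokens, vector_size=16, min_count=1, seed=0)

    def embed(words):
        # Average the word vectors to get one fixed-size message vector.
        return np.mean([w2v.wv[w] for w in words if w in w2v.wv], axis=0)

    X = np.stack([embed(t) for t in tokens])
    y = [label for _, label in messages]
    clf = LogisticRegression().fit(X, y)
    print(clf.predict([embed("you idiot".split())]))

In practice the embeddings would be trained, or fine-tuned, on large chat corpora so that obfuscated or slang variants land near their standard counterparts.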