Curated Issues and Project Proposals

Overview

The Uli project works to denormalize the violence experienced online by marginalised communities in India, and provides tools to protect and enable a collective response for users that belong to these communities.

Uli presents an alternative imagination for the current structure of platform moderation: Where should moderation take place? Whose values are encoded in moderation logics? The immediate intervention that Uli provides is in making the social media experience less gruelling for people belonging to marginalised communities.

Purpose of this document

This document was created during HacktoberFest 2023. The tasks in the next section are a mix of easy and hard. Some are small, self-contained tasks, while others are features that need some planning and design (3-5 point). We also have some open-ended project proposal ideas that are not fully scoped yet; once someone shows interest in one, we can work with them to flesh out the details.

Note

The goal with this document is to encourage beginners to improve Uli or use Uli resources in ways they see fit to address Online Gender Based Violence in the Indian Context. If this goal resonates with you but you don't find an exact match, let us know in the discussion here.

Open Issues and Proposals

Open Issues

Approximate Matching of Slurs
A lot of slurs on social media are variations of other root words. For example, the word fuck is spelt fk, fcuk, fukkkkk, fuccckkk. How do you take a static list of slurs and use it to detect all possible variations of each slur? We have explored this to some extent, and the exploration is documented in tattle-made/Uli#110.
One could approach this using classical techniques like RegEx and Levenshtein distance, use knowledge of linguistics, or use machine learning. One could also evaluate solutions in terms of their requirements: can they work in a web browser, or do they need a heavy server to run? As part of this task, participants can choose any approach they like, make a case for why they think it is useful or impactful, and develop a working demonstration of it.
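As a starting point for the classical approach, here is a minimal sketch in Python that combines two of the techniques named above: a RegEx that collapses repeated characters, and Levenshtein distance against each root word. The function names (`collapse_repeats`, `levenshtein`, `matches_slur`) and the `max_distance` threshold are hypothetical and are not part of the Uli codebase. It is one possible baseline, not a recommended solution.

```python
import re
from typing import Iterable, Optional


def collapse_repeats(word: str) -> str:
    """Lowercase a word and collapse runs of the same character ('fukkkkk' -> 'fuk')."""
    return re.sub(r"(.)\1+", r"\1", word.lower())


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def matches_slur(token: str, slur_list: Iterable[str],
                 max_distance: int = 1) -> Optional[str]:
    """Return the first root word within max_distance edits of the token, else None.

    Both the token and the root are normalised with collapse_repeats first,
    so stretched spellings compare equal to the root.
    """
    normalised = collapse_repeats(token)
    for slur in slur_list:
        if levenshtein(normalised, collapse_repeats(slur)) <= max_distance:
            return slur
    return None
```

Using the example above with fuck as the only root word, the default threshold of 1 catches fukkkkk and fuccckkk but not fcuk or fk, which are two edits away from the root. Raising `max_distance` to 2 catches them, at the cost of more false positives on short words, so any real solution would need to be evaluated against actual data. The character-collapsing step works on individual Unicode code points and may need adjusting for Devanagari or Tamil text.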
Localize for Malayalam Language
The Uli website and extension can be navigated in Hindi, Tamil and English. The most up-to-date version of the website, however, is in English. Translate the website, the tool interface and the annotation guidelines to Malayalam. As part of Uli we have also created resources on reporting problematic content on social media platforms. These resources could also be translated.
GitHub Issue - https://github.com/tattle-made/Uli/issues/422
Improve UX for Crowdsourcing Slurs
In our latest release, we pushed a new feature that allows users to crowdsource slurs and their relevant metadata using Uli. When you right click on a word in your browser, it shows two options:
1. Add Slur to Uli: Words added through this option are redacted from your local browser
2. Crowdsource Slur Word: Words added through this option are submitted to the list of slurs on the server for anyone to see.

The two separate options create confusion for users. This task involves merging the two features, so that a single right-click option both submits the word to the server and redacts it from your browser.

GitHub Issue - https://github.com/tattle-made/Uli/issues/423
Improve Annotation Experience
For every slur that contributors add, they also have the option to fill in certain metadata.

While we have a detailed annotation guideline that explains what each field means, we should also make it very simple for a new user to access these guidelines while they are annotating. The scope of this issue is to make the annotation guideline easy to understand and available within the extension itself. This is a UX research, copywriting and frontend engineering issue.
GitHub Issue - https://github.com/tattle-made/Uli/issues/424

Curate #gender-based-violence on GitHub: tattle-made/Uli#280
Automated e2e testing of browser extension on Brave and Edge: tattle-made/Uli#383
Upgrade Gatsby for Uli website: tattle-made/Uli#384
Improve Visual Design for Slur Crowdsource Feature: tattle-made/Uli#409

Project Proposals

Red Teaming ML Models against Indian language slurs:
The term red teaming has been used to encompass a broad range of risk assessment methods for AI systems, including qualitative capability discovery, stress testing of mitigations, automated red teaming using language models, and providing feedback on the scale of risk for a particular vulnerability.

Given the lack of representation of Indian languages in the data, as well as in the Trust and Safety teams working on Generative AI, these models might be more easily abused in Indian languages. The following tasks are focused on surfacing vulnerabilities in prominent models.

Red Team ChatGPT
Red Team Stable Diffusion
Red Team LLAMA
Red Team Bloom

You can see this as an example of Red Teaming with the Uli Slur List: Tattle -

Expand the slur list using LLMs:
The slur list is expansive but it might not cover every possible word. Expand on the slur list using ChatGPT. This will require creative prompt engineering. One example of this task is:

Uli Slur List Application for Tracking Online Abuse:
The Uli slur list can be used to understand the vitriol targeted at people in public life. This could include, for example, women and LGBTQI people in sports and politics. This task focuses on using the slur list to track the online abuse received by women and LGBTQI people.

The slur list is a crowdsourced list of slur terms in Hindi, Tamil and Indian English, collected from gender rights activists and researchers. The list can be found at browser-extension/plugin/scripts/slur-list.txt on the main branch of tattle-made/Uli.

One of the assumptions driving the creation of this list was that it would aid in tracking and research on hate speech in Indian languages. This task will help verify whether that assumption is true.

To track online harassment and abuse on the topics above, existing datasets could be used. New datasets specifically tailored to these subjects could also be created.

This is a good dataset to start with - YouTube Comments on Wrestlers Protest
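A first pass at this kind of tracking could be as simple as counting how often terms from the slur list appear in a set of comments. The Python sketch below assumes the slur list file holds one term per line and that the comments are available as plain strings. The function names and the placeholder terms in the usage note are hypothetical. Matching is exact on whitespace-separated tokens, so it misses multi-word terms and spelling variations.

```python
import string
from collections import Counter
from typing import Iterable, Set


def load_slur_list(text: str) -> Set[str]:
    """Parse the slur list, assuming one term per line; blank lines are skipped."""
    return {line.strip().lower() for line in text.splitlines() if line.strip()}


def count_slur_hits(comments: Iterable[str], slurs: Set[str]) -> Counter:
    """Count how many times each listed term appears across all comments."""
    hits: Counter = Counter()
    for comment in comments:
        for token in comment.lower().split():
            # Strip leading/trailing ASCII punctuation only; punctuation
            # specific to other scripts is left in place.
            token = token.strip(string.punctuation)
            if token in slurs:
                hits[token] += 1
    return hits
```

For example, with a list containing the placeholder terms `badword` and `uglyterm`, calling `count_slur_hits(comments, load_slur_list(list_text))` returns a `Counter` whose `most_common()` method ranks the terms by frequency. Comparing these counts across accounts, topics or time periods is one way to test whether the list is useful for research.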

Expand Uli to Other platforms like Discord, Reddit etc
The Uli browser extension is an attempt to use the slur list for moderating a user's web browsing experience. We also want to find other ways to make the slur list useful.

One such attempt is this UI, which makes it possible to copy our slur list and add it to Instagram to mute slurs in a user's comment section - Uli [demo]

Some other possibilities are:

Build bots to make Uli available on platforms like Reddit, Slack or Discord.
Make npm and PyPI packages to make Uli available to developers as ready-to-use packages in their language of choice. [inspiration project]
Bring Uli to mobile. An incomplete exploration of this is documented here

Note

Discuss this document here
