To combat rampant online abuse, a research team in Australia has come up with an algorithm that can flag misogynistic content, with the underlying message: bring more women to the table during the development process.
In an Ipsos MORI poll commissioned by Amnesty International, nearly a quarter (23%) of the women surveyed across eight countries – Denmark, Italy, New Zealand, Poland, Spain, Sweden, the UK and USA – shared that they had experienced online abuse or harassment at least once.
Another landmark survey, Plan International's State of the World's Girls report, covered 14,071 teenagers and young women aged 15 to 25 across 22 countries, including Australia, Canada, Brazil, Japan, Zambia and the US. It found that almost one in four (24%) of those surveyed said they or a friend had been left fearing for their physical safety due to online harm. Additionally, 42% reported low self-esteem and another 42% reported that this harassment caused mental and emotional stress.
The message is pretty clear: online abuse has the potential to impair women's and girls' freedom of expression, belittle their expertise, demean their self-worth, and intimidate them into silence.
One could argue that the online ecosystem is just a microcosm of the real world, mirroring the threats that afflict us in physical spaces. Social media platforms have, over the years, tried to strengthen their reporting processes and safety policies. But the young women surveyed can tell you that these measures fall short of addressing or countering the toxic messages that are often hurled at them online.
Alka Lamba, a politician from India, tells Amnesty International, “When you are on social media, you face trolls, threats, abuses and challenges 100% of the time. Their purpose is to silence you. It makes you want to cry. They talk about your personal life, your looks, and your family.” The same study goes on to validate how the social media platform Twitter has turned into a ‘battlefield’ for women, who are constantly subjected to a barrage of physical or sexual threats, degrading content, profanity, slurs, and insulting epithets. And yet, it seems that online harassment is an overt expression of the violence that exists offline, too. The good news is that technology can help create safeguards in online spaces.
Women are now taking part in the solution as agents of change. In light of how rampant – and distressing – the situation of online harassment is, Dr. Richi Nayak, a professor at the Queensland University of Technology, Australia, along with her team, has developed a deep learning algorithm aimed at making the virtual world, specifically Twitter, safer for women through the detection of misogynistic tweets. She spoke with me about her team's research and development process.
At their core, machine learning algorithms perform pattern-driven decision-making: in this case, determining whether or not a tweet is misogynistic. Nayak explains that the now-published algorithm does more than just look out for the occurrence of abusive words: “There’s a lot more nuance to it, especially in the understanding of the context in which a word is used. The mere presence of a word doesn’t correlate with intent. Sometimes, profane words are used as an expression of sarcasm or friendly badgering. We train the algorithm not to flag those as problematic.”
In clarifying this nuance, Nayak continues, “We need to understand that we are dealing with written text, and we lose one very crucial identifier of context, that is, the tone.” The team found a solution in a deep learning approach called Long Short-Term Memory (LSTM) with transfer learning, whereby the model can keep referring to its previously acquired understanding of terminology and continue developing its contextual and semantic framework over time.
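To make the "memory" idea concrete, here is a minimal sketch of a single LSTM cell in plain Python. It is an illustration of the general technique only, not the team's model: the scalar (one-number) state, the parameter names, and the `REMEMBER_PARAMS` values are all assumptions chosen for readability, and the transfer learning part is not shown.

```python
import math


def sigmoid(x):
    """Squash a number into the range (0, 1); used for the three gates."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, p):
    """One step of a scalar LSTM cell.

    x is the current input, h_prev the previous output, c_prev the
    previous cell state (the "memory"). p is a dict of hypothetical
    weights (w* for the input, u* for the previous output, b* for biases).
    """
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate memory
    c = f * c_prev + i * g   # keep part of the old memory, add part of the new
    h = o * math.tanh(c)     # expose a gated view of the memory
    return h, c


def run_sequence(xs, p, h0=0.0, c0=0.0):
    """Feed a sequence through the cell, carrying the state from step to step."""
    h, c = h0, c0
    for x in xs:
        h, c = lstm_step(x, h, c, p)
    return h, c


# Illustrative parameters (an assumption, not learned values): a large forget
# bias and a very negative input bias make the cell hold on to its memory.
REMEMBER_PARAMS = {
    "wf": 0.0, "uf": 0.0, "bf": 10.0,
    "wi": 0.0, "ui": 0.0, "bi": -10.0,
    "wo": 0.0, "uo": 0.0, "bo": 0.0,
    "wg": 0.0, "ug": 0.0, "bg": 0.0,
}

h, c = run_sequence([1.0, -2.0, 3.0], REMEMBER_PARAMS, c0=0.5)
```

With these hand-picked values the cell state stays very close to its starting value of 0.5 after three inputs, which is the behaviour the gates are designed to allow: earlier context is carried forward instead of being overwritten by each new word. In a trained model the gate weights are learned from data rather than set by hand.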
In other words, the algorithm “learns” to distinguish an abusive tweet from a non-abusive one by looking at the content and intent of a tweet, as well as its context and extent. It picks up these patterns solely from the reams of examples provided to it. The training process reinforces the old research adage, “The algorithm is only as good as the data we feed it.” The evolving nature of the algorithm makes for a highly iterative development process.
“Designing the algorithm from scratch is where the innovation of this project lies,” Nayak shares. “We started with a blank network, taught the model standard language by training with datasets from Wikipedia. Next, we trained it to learn somewhat abusive language through content related to user reviews of movies, products etc. Finally, we trained the model on a large dataset of tweets. The language and semantics are very different on Twitter – there are slang terms, abbreviations – all of which was taught to the algorithm to develop its linguistic capability.”
Once the algorithm had developed its linguistic prowess, it was taught to classify and differentiate between misogynistic and non-misogynistic content. In the first instance, tweets were filtered for the presence of one of three abusive keywords (these keywords were informed by a body of literature). Then the algorithm’s job was to flag tweets with an underlying abusive context using the semantics training it had received. Nayak tells me that, for the algorithm to have a representative sample to learn from, the team settled on a 40:60 ratio of abusive to non-abusive tweets as the optimum. The tweet mining stopped once they reached this proportion. Ultimately, the team mined over one million tweets.
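The keyword filter and the 40:60 stopping rule described above can be sketched roughly as follows. This is an assumption-laden illustration, not the team's pipeline: the placeholder keywords, the function names, and the idea that each tweet already carries an abusive/non-abusive label are all hypothetical.

```python
import re

# Placeholder tokens only; the article does not name the three keywords.
HYPOTHETICAL_KEYWORDS = {"slur_a", "slur_b", "slur_c"}


def contains_keyword(text, keywords=HYPOTHETICAL_KEYWORDS):
    """Return True if any keyword appears as a whole word in the text."""
    words = re.findall(r"[\w']+", text.lower())
    return any(word in keywords for word in words)


def collect_balanced(stream, total, abusive_share=0.4):
    """Collect labelled tweets until a fixed abusive/non-abusive split is met.

    stream yields (text, is_abusive) pairs; the labels are assumed to exist
    already. Collection stops as soon as both quotas are full.
    """
    quota = {True: round(total * abusive_share)}
    quota[False] = total - quota[True]
    kept = {True: [], False: []}
    for text, is_abusive in stream:
        if len(kept[is_abusive]) < quota[is_abusive]:
            kept[is_abusive].append(text)
        if all(len(kept[label]) == quota[label] for label in kept):
            break  # target proportion reached, stop mining
    return kept[True], kept[False]


# Made-up stream for illustration: labels simply alternate.
demo_stream = [(f"tweet {n}", n % 2 == 0) for n in range(20)]
abusive, non_abusive = collect_balanced(demo_stream, total=10)
```

Here `abusive` ends up with 4 items and `non_abusive` with 6, matching the 40:60 split. Fixing the quotas in advance is one simple way to stop collection at a target proportion; the team may well have balanced their data differently.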
In the next stage, the process was further refined: researchers manually evaluated which patterns were being learned and which were being missed. Nayak throws light on the gradations here: “Some tweets emerged clear-cut as either misogynistic or not. The others, however, were fuzzier in their meaning. So the idea is to train the algorithms on these fuzzy bits where the boundaries are not so clear. The more examples of ambiguous patterns it trains on, the better it learns to identify the categories.” One such ambiguous phrase was “Go to the kitchen”, which when looked at in isolation seems innocuous, but once put in context, has sexist subtext.
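One common way to find such "fuzzy" examples for manual review is to rank tweets by how close the model's score sits to the decision boundary. The sketch below shows that idea, often called uncertainty sampling. It is an assumption that something like this could serve the purpose described; the article does not say how the team selected ambiguous tweets, and the scores below are invented for illustration.

```python
def most_ambiguous(scored, k):
    """Return the k (text, score) pairs whose score is closest to 0.5.

    score is assumed to be a model's estimated probability that the
    text is misogynistic; values near 0.5 mean the model is unsure.
    """
    return sorted(scored, key=lambda pair: abs(pair[1] - 0.5))[:k]


# Invented scores, for illustration only.
demo_scores = [
    ("clear-cut abusive example", 0.97),
    ("go to the kitchen", 0.52),
    ("lovely day today", 0.03),
    ("friendly banter example", 0.40),
]

to_review = most_ambiguous(demo_scores, k=2)
```

In this toy input, `to_review` contains "go to the kitchen" and "friendly banter example", the two items a human reviewer would most usefully label next.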
It’s in these cracks and crevices that there is scope for subjectivity and interpretation: who gets to decide if a tweet is misogynistic or not? A nasty comment made online might not strike a man going through the data as threatening, but a woman may be better able to pick up on the subverted tropes. This is when having a woman’s voice at the table becomes crucial. And that opens up the conversation to the importance of incorporating lived experiences, particularly the lived experiences of those at the receiving end of sexist verbal abuse. Even more vital is having that conversation at the forefront of the development process rather than as an afterthought.
According to HR consulting firm Mercer, the Diversity & Inclusion (D&I) technology market – encompassing tools that prevent and address bias, harassment, and discrimination – is worth approximately $100 million. Such tools are gaining momentum because of realizations including, as mentioned by Ellen Pao in her book Reset, the failure of the rabid yet ineffectual one-off bias training, the whole tech system having exclusion built into its design, and the “diversity solutions” that up until 2016 were PR-oriented gimmicks.
These tools come with reams of benefits in the form of better D&I outcomes, scalability, and bias elimination. At the same time, there is potential for bias to creep into artificial intelligence (AI) algorithms. Yet again, our algorithms are only as good as the data we feed them. Or as Nayak explains, “The learning occurs on the basis of the examples you provide to the system; it doesn’t have learning or assimilation capabilities of its own. It’s important for the algorithm to be exposed to as much diversity as possible in these examples.” In short, the data behind the algorithm must be shaped by those it’s designed to protect.
The algorithm is an evolving tool that emulates the human experience of lifelong learning in that it will keep brushing up on its context-identifying skills. It can potentially also be modelled and adapted to identify other brackets of abusive content – racist, homophobic, ableist, classist, transphobic content – and nip it in the bud. There’s scope to expand it to accommodate linguistic specificities across a plethora of global and regional languages. The development of each of these should, therefore, involve the voices of people from marginalized communities for a safer online experience.
The algorithm is a classic case study of using data science for social good. But the team will substitute the tech industry’s obsession with reckless scalability with conscious expansion. Nayak shares their compassionate vision: “The algorithm will find application in social media monitoring firms, but not until initial trials are done to strengthen the robustness of the juggernaut, as well as diverse voices, are included in its development.”
Machine learning and AI are an integral part of our world online, as we all seek to see ourselves in the content we browse and interact with. The tech industry faces a giant lack of diversity that is beginning to show up in ways that can harm and disrupt marginalized and underrepresented people. Because of this, and because of the deep connection between the people creating algorithms and the people experiencing them as end users, diversity and representation matter now more than ever.