Improving the detection and analysis of disinformation to alleviate threats to democracy: interview with Patricia Martín-Rodilla

By Maria Campins, Newtral

Fighting disinformation and harassment online is a challenging task. Patricia Martín-Rodilla's research is aimed at this goal, and she is contributing to the HYBRIDS project with the same objective.

Through the HYBRIDS initiative, whose name stands for Hybrid Intelligence to monitor, promote and analyse transformations in good democracy practices, 14 institutions will collaborate over the coming years to promote different approaches based on an exhaustive analysis of public discourse about crucial global issues, such as health, the climate crisis, Euroscepticism or immigration. The analysis will take into account both traditional media and content published on social networks.

Martín-Rodilla is a computer engineer and a member of HYBRIDS, representing the Universidade da Coruña (Spain), where she researches information processing and knowledge extraction in humanistic domains, which present great challenges and opportunities for software engineering and information systems. She is also an assistant professor at the Universidade da Coruña, and her main interests are the modeling, processing, and exploitation of information.

What do you aim to achieve with HYBRIDS?

The overall goal of the project is to train Ph.D. students for a new professional profile at the confluence of computer engineering, language, and social problems related to disinformation (organized in domains such as health, Euroscepticism, the climate emergency, and immigration). Creating this new professional profile, which breaks down the boundaries between disciplines (linguists, engineers, sociologists, political scientists, etc.), is also one of my personal objectives in HYBRIDS, since I am particularly devoted to the training of hybrid profiles in Digital Humanities.

In parallel with the training goals, the scientific goals of the project revolve around improving the detection and analysis of disinformation in its multiple forms to alleviate threats to democracy. For example, closer to my own line of research, one objective is to test the current natural language processing technology needed for the analysis of disinformation and to inject domain and symbolic information into current language models for the detection of certain types of disinformation (fake news or hate speech). These models currently work without this information, relying only on machine learning. Above all, I am interested in the linguistic level of discourse, which is the level that tells us the most about the content of a text and its possible connection with disinformation.

Why are you interested in participating in the project?

This project was born out of innumerable conversations between the project coordinator (Dr. Pablo Gamallo) and me, and I have been involved in it from the beginning. The interest is twofold: research and pedagogy. At the pedagogical level, as I mentioned earlier, I believe that combining knowledge and skills in the humanities and in technology in the same person creates a professional profile that we need in order to address and solve many of the problems that we currently have in this area and that HYBRIDS deals with (such as fake news, harassment, etc.), as well as those that will come.

At the research level, HYBRIDS tests the hybrid intelligence paradigm, combining structural and rule-based approaches with machine learning approaches in the treatment and analysis of natural language to deal with disinformation problems. On a personal level, my main interest is in how these approaches respond when we analyze natural language at the discursive and argumentative levels. For example, if we analyze political discourse: could we detect misinformation? In what forms (for example, inconsistencies, false cause-effect attributions, etc.)? I also study these discursive and argumentative levels in social networks for the detection of mental illness or harassment.

What kind of research are you going to pursue as part of it? Which lines of research are the most promising to include in the project?

Regarding the most promising lines, I believe that part of the project's success lies in combining such diverse professional profiles and institutions around such an ambitious goal. It is always interesting, and a challenge in research, to see how far experts from humanistic and technological areas can go together.

In our case, we are somewhat specialized in the connection between disinformation, natural language (especially discourse) and social networks. In this context, the research that we will carry out focuses on technological and algorithmic improvements in processing the linguistic discourse level of social network content, which has its own particularities: short but interconnected texts and formats that vary from one social network to another. We will do this in various case studies and on topics such as harassment related to immigration, climate change, etc.

What are some of the challenges you face in your research? And what are the challenges for hybrid intelligence?

As I mentioned earlier, working at the linguistic level of discourse with short texts that are related to each other in various ways, and that involve several speakers or participants in each case, is in itself a challenge for current discourse processing algorithms. Most of the existing technology can analyze long texts such as political speeches at a discursive and argumentative level, but it has problems with conversational texts and threads from social networks. Improving this technology will allow us to determine, with more symbolic and semantic knowledge, which contributions on social networks involve disinformation and how they relate to it.

At the same time, we must analyze the social side of the harassment case studies that we deal with: who promotes it? What impact does it have, and how does it spread on social networks? What arguments are used on issues that pose threats to European democracies, such as immigration or climate change?

Which problems do you want to solve with the Ph.D. project that will be under your supervision?

In research, you always know what problems you have at the beginning of your project, but not which ones will arise along the way. At the beginning, we will have to build datasets for the case studies that we deal with. There, we will need to ensure representativeness across profiles, discourses, native languages and sources (social networks). Europe is diverse, and our starting datasets must be diverse. The biases in them are also diverse, and we must consider them and try to mitigate them.

Later, we will face more technological and algorithmic problems, especially how to embed symbolic information in existing discourse algorithms, data input and output formats, etc.

I am confident that we will learn a lot by addressing all these problems from a completely hybrid perspective, and that we will be able to achieve the proposed objectives by offering solutions.