Disinfo Demasked by Narrative Detection!

Collaboration

The project was initiated by DW Akademie, Deutsche Welle’s center for international media development. It was implemented in collaboration with Deutsche Welle’s Research & Cooperations Unit (DW ReCo), and SocialLab as technology partner.

Researchers

The driving force behind the project's research, methodology and findings is a group of experts in data science and AI, geopolitics, disinformation (especially Russian disinformation spread in the Global South), as well as journalism and fact-checking. This interdisciplinary group comes from several countries across the Middle East, Asia, Africa and the Americas.

Findings & Data

On this website, you can follow the whole project process step by step and access all research findings and project outputs. PLEASE NOTE: the datasets produced in the project can be accessed only via the contact field after sharing key information with us.

Project Flow

1- Project Kick-off: Setting the scope, milestones and goals

The “Disinfo Demasked” project is part of DW Akademie’s global initiative to tackle disinformation in the Global South, which aims to support journalists and fact-checkers and to increase the Media and Information Literacy (MIL) of media audiences. Due to rapid advancements in both the quality and spread of AI-generated content, there is an urgent need to explore how AI technologies can in turn be used to counter disinformation. For this purpose, good datasets that can be used to train LLM models are crucial, even more so for low-resource languages like Arabic. This project draws on previous projects that explored the potential of social listening technologies to strengthen reliable and independent quality journalism. It was developed in close collaboration with the tech expertise of Deutsche Welle’s Research & Cooperations Unit (DW ReCo).
Research has highlighted the involvement of Russian state-sponsored media outlets in orchestrating recent disinformation campaigns (see Literature Review). Given the geopolitical significance of the Arabic-speaking world and Russia's growing influence in global politics, understanding the nature of Russian state-sponsored media strategies in this region is crucial. Media outlets like Russia Today (RT) and Sputnik have faced accusations of information manipulation and have even lost their licences in certain locations. Still, they enjoy strong and even increasing popularity in Arab countries. To date, there is very limited research on the scope and dynamics of Russian disinformation campaigns targeting Arabic-speaking online spaces.
Social Listening Technologies help monitor content on social media and can analyse specific content, sources, or accounts. LLM models as part of AI technologies have the potential to conduct such content analyses on big amounts of data. By uncovering and disseminating information about state-sponsored propaganda and disinformation strategies, the project serves as an educational tool. It raises public awareness about the nuances and complexities of media consumption and encourages a more critical and discerning approach to news and information, which is essential for media literacy in the modern age.

2- Defining the research scope

It was decided to use the official channels of the two largest Russian news outlets, Sputnik and Russia Today, as sources of data. The timeframe was set from 17 May 2018 until the point of data collection, 30 November 2023, covering events both before and after the onset of the war in Ukraine. Telegram's significance stems from its growing popularity in the Middle East and its Russian roots, making it a pivotal platform for our study. While not widely used in Western Europe, Telegram has gained traction in the Middle East due to its user-friendly interface, robust privacy features, and less restrictive content policies compared to other social media platforms. Its Russian origins are particularly relevant given the focus on Russian state-sponsored media. Compared to other messenger services and social media platforms, Telegram also imposes fewer restrictions on access to data.

3- Data Collection

Data from the Telegram channels of Sputnik Arabic (https://t.me/Sputnik_Arabic) and Russia Today Arabic (https://t.me/rtarabictelegram) published in the time frame from 17.05.2018 to 30.11.2023.
The data collection was done with a proprietary tool of the technology partner SocialLab. To replicate this step with open-source tools, check the technical report. Number of records: ~170,000 for Sputnik and ~270,000 for Russia Today.
- Dataset 1: 170K Records from Telegram Channel of Sputnik Arabic (AR)
- Dataset 2: 270K Records from Telegram Channel of Russia Today Arabic (AR)

4- Data Selection and Preparation

Datasets 1 and 2 contained more than 440,000 records. To narrow this down for further research, 2,000 posts were selected from each dataset: the 1,000 most viewed and the 1,000 most forwarded posts of each media channel. This resulted in a new dataset of 4,000 unique records (timeframe: 17.05.2018 to 30.11.2023).
The Arabic text was machine-translated into English to make the data accessible to a wider audience (using free Google Translate within Google Sheets). New columns were added to the data frame to prepare the dataset for human and machine labelling (see the next step and the research methodology for a more detailed explanation).

- The posts for human labelling were selected from the dataset of 4000 posts based on top views and forwards.
    ->The 500 posts with the most engagements were selected and equally divided among the researchers.
1- Dataset 3: 4000 unique records from Sputnik and RT – top viewed and forwarded – AR
2- Dataset 4: 4000 translated unique records from Sputnik and RT – top viewed and forwarded – ENG
3- Dataset 5: 500 posts with the most engagements prepared for human labelling (combined dataset Sputnik and RT)
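The selection step above (top viewed plus top forwarded, de-duplicated) can be sketched with pandas. This is a minimal illustration, not the project's actual script; the column names `id`, `views` and `forwards` are assumptions.

```python
import pandas as pd

def select_top_posts(df: pd.DataFrame, n: int = 1000) -> pd.DataFrame:
    """Return the union of the n most viewed and n most forwarded posts,
    with duplicates (posts in both top lists) removed."""
    top_viewed = df.nlargest(n, "views")
    top_forwarded = df.nlargest(n, "forwards")
    combined = pd.concat([top_viewed, top_forwarded])
    return combined.drop_duplicates(subset="id")
```

Because a highly viewed post is often also highly forwarded, the union of the two top-1000 lists per channel yields fewer than 2,000 unique rows, which is why the combined dataset is described in terms of unique records.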

5- Development of the labelling system (Narrative Coding Methodology)

To develop a labelling system for analysing dataset 5 and gain a deeper understanding of Russian state-sponsored propaganda and disinformation and its impact on Arabic-speaking audiences, the researchers drew on academic studies (for a selection, see the literature review) as well as findings from previous fact-checking activities, e.g. narratives identified in debunked fake news and disinformation campaigns.
The researchers agreed on a list of 7 main themes with corresponding sub-themes and produced a code book and coding instructions based on them, i.e. a list of codes linked to the expected themes. For more details, consult the research methodology.
- List of Themes and Subthemes (Narratives)
- Labelling System and Coding Instructions
- See all details in the Research Methodology
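A code book of this kind maps numeric codes to themes and sub-themes so that coders record compact codes instead of free text. The sketch below illustrates the structure only; the theme names are invented placeholders, not the project's actual 7 themes, which are listed in the research methodology.

```python
# Illustrative code book structure: top-level codes for main themes,
# two-digit codes for their sub-themes. Names are placeholders.
CODEBOOK = {
    1: {"theme": "Military strength",
        "subthemes": {11: "Successful operations", 12: "Advanced weaponry"}},
    2: {"theme": "West as aggressor",
        "subthemes": {21: "Sanctions as warfare", 22: "NATO expansion"}},
}

def decode(theme_code: int, subtheme_code: int) -> tuple[str, str]:
    """Translate a (theme, sub-theme) code pair back to readable labels."""
    theme = CODEBOOK[theme_code]
    return theme["theme"], theme["subthemes"][subtheme_code]
```

Keeping the code book as structured data like this also makes it straightforward to paste the same definitions into an LLM prompt later, which matters for the machine-labelling experiment in step 8.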

6- Human Labelling and Research findings: Top Russian Narratives targeting Arabic-speaking audiences

The 275 most forwarded messages from dataset 5 were selected and divided among the researchers for manual labelling. The themes and sub-themes used for coding each individual Telegram post were provided as part of the instructions to the coders. In a pilot phase, each coder coded 50 posts, and the themes were revisited to ensure high consistency and inter-coder reliability. Following this pilot phase, additional values were added to the coding instructions. The researchers then manually coded the 275 most forwarded messages from the channels (in a combined dataset). The focus was on textual content only; no audio or video content was included in the research.
From the 275 manually coded posts, 21 narratives were extracted, each provided with an exemplary post for reference. This manual process involved an in-depth examination of each piece of content to identify and label the underlying themes and sub-themes.
- Dataset 6: 275 posts manually labelled with themes, sub-themes, a confidence score and comments
- Top 20 narratives with the most impact on Arabic-speaking audiences spread via Telegram Channels of Sputnik and RT (Check Narrative)

7- Feedback-Loop: Expert Meet-up in Berlin

The intermediate results of the project were discussed with a group of disinformation and AI experts from the US, Europe, Africa, South America and the MENA region during a DW Akademie event in Berlin in March 2024. The aim of the event was to collect feedback from peers about the methodology, potential use cases for the database, and its insights and potential for researchers, data scientists and journalists, as well as overall as an approach to counter disinformation. The Expert Meetup sought to promote the exchange of lessons learned in developing LLM models for low-resource languages of countries in the Global South.
The group confirmed the importance of developing AI models for narrative detection as a promising approach to tackle the issue of disinformation or FIMI (Foreign Information Manipulation and Interference).
Check out this LinkedIn post!

8- Machine Learning Experiment: automated labelling of 2K Posts

- Preparation of two datasets:
  * The 275 human-annotated posts were prepared for five-fold cross-validation by generating five 80%/20% splits.
  * Another ~1500 posts were selected from the unlabeled remainder of the previously selected 4000 posts.

The posts were prepared by removing all information other than the English translation of the text and the empty fields for labelling.
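The five 80%/20% splits for cross-validating the 275 annotated posts can be generated as follows. This is a sketch using only the standard library; the exact split procedure used in the experiment is in the technical report.

```python
import random

def five_fold_splits(n_items: int, seed: int = 0):
    """Shuffle item indices once, then carve out five disjoint 20% test
    folds; the remaining 80% of each round forms the training set."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)  # fixed seed for reproducibility
    fold = n_items // 5  # 275 // 5 = 55 test items per fold
    splits = []
    for k in range(5):
        test = idx[k * fold:(k + 1) * fold]
        train = idx[:k * fold] + idx[(k + 1) * fold:]
        splits.append((train, test))
    return splits
```

Because 275 divides evenly by 5, every post appears in exactly one test fold, so the five evaluations together cover the whole human-annotated dataset.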
- The method used was few-shot prompting. The prompt contained the coding instructions explaining the meaning of each theme and sub-theme.

- Unseen posts were labelled in blocks of 10, using 15 examples from the human-annotated data to illustrate the coding methodology.
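A few-shot prompt of the kind described above can be assembled as follows. This is an illustrative sketch, not the project's actual prompt (which is published with the code and prompts below); the field names `text`, `theme` and `subtheme` are assumptions.

```python
def build_prompt(instructions: str, examples: list[dict], unseen_block: list[str]) -> str:
    """Assemble a few-shot prompt: coding instructions, up to 15 labelled
    examples, then a numbered block of up to 10 unseen posts to label."""
    parts = [instructions, "\nExamples:"]
    for ex in examples[:15]:
        parts.append(
            f'Post: {ex["text"]}\n'
            f'Theme: {ex["theme"]} | Sub-theme: {ex["subtheme"]}'
        )
    parts.append("\nLabel the following posts:")
    for i, post in enumerate(unseen_block[:10], 1):
        parts.append(f"{i}. {post}")
    return "\n".join(parts)
```

Batching 10 posts per prompt with 15 shared examples keeps the context window manageable while still showing the model the coding methodology on every call.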
There are two outputs that were evaluated differently:

1. The quality of the machine labelling of the 275 human-labelled data points was estimated by comparing the predicted themes and sub-themes to the human answers. The scores used for evaluation are accuracy, precision, recall and F1 score.
2. The previously unseen, machine labelled data points were manually validated by the research team. The validation was performed in a qualitative way, assigning labels from (partially) correct to (partially) incorrect and giving explanations on what went wrong.
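The quantitative comparison in point 1 can be computed as sketched below. This is an illustration, not the project's exact evaluation script: it treats each post's labels as a set, micro-averages precision, recall and F1 over all posts, and reports accuracy as the exact-match rate.

```python
def evaluate(gold: list[list[str]], predicted: list[list[str]]) -> dict:
    """Micro-averaged precision/recall/F1 over per-post label sets,
    plus exact-match accuracy."""
    tp = fp = fn = exact = 0
    for g, p in zip(gold, predicted):
        g, p = set(g), set(p)
        tp += len(g & p)      # labels both human and machine assigned
        fp += len(p - g)      # machine labels the human did not assign
        fn += len(g - p)      # human labels the machine missed
        exact += g == p       # post counted correct only on full agreement
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": exact / len(gold),
            "precision": precision, "recall": recall, "f1": f1}
```

Whether to micro- or macro-average, and whether themes and sub-themes are scored jointly or separately, are design choices; the technical report documents which variants were used.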
- Dataset 7: Machine labelling of 275 posts from human dataset for comparison and calculation of evaluation metrics
- Dataset 8: Machine labelling of ~1500 additional posts for human evaluation
- Technical report, including a detailed explanation of how to reproduce the experiment
- Code and prompts used in the ML experimentation

Research Conclusion

Russian narratives targeting Arabic-speaking audiences

The findings indicate a strategic use of Sputnik and Russia Today to disseminate narratives among Arabic-speaking audiences that favour Russian perspectives, especially in geopolitical conflicts.

Russia Today: Themes ranged from promoting RT Arabic's live broadcasts to discussions on Russia's historical interactions with global figures like Saddam Hussein. Messages often portrayed Russia in a positive or defensive light, particularly in the context of international criticism or conflict.

Sputnik: consistent focus on military and geopolitical themes. Several messages featured footage from the Russian Ministry of Defense, suggesting an intent to showcase Russian military operations positively.

In both channels: Emphasis on military successes and the portrayal of opposing forces, particularly in the context of the Ukraine conflict, and a recurrent portrayal of Russia as a resilient force and a beacon of stability.

The in-depth data analysis highlighted the 20 top narratives with the most impact on the audiences of those channels (see the narratives here).

Potential of ML models for narrative detection

  • Confirmation that structured, labelled databases are critical to unlocking the potential of AI for narrative detection.
  • Early experiments show that LLMs have the potential to recognize narratives in texts, even when guided only by instructions and a few examples (few-shot prompting).
  • The potential for fine-tuning may be even higher, but this remains to be determined.
  • The first experiment highlights the potential to perform narrative recognition with clear instructions and relatively little data, without the need for very large datasets.
  • It is important to include all relevant information in the prompt, such as the time frame, to ensure that the model is able to place the text in the correct context.

Contact us
