
Learning From State-of-the-Art Procedures and Tools for Fact Checking

In our endeavours to train an AI to engage in a Socratic dialogue with people who seek to fact check information, TITAN has sought to learn lessons from the procedures that fact checkers already use. Some of our initial findings from desktop research and interviews are outlined below. This content is extracted from our soon-to-be-published 'Socio-technical Framework and User Needs Analysis' report.



Available Tools

Existing tools and methods used for fact-checking social media include:


  • CrowdTangle - a social media monitoring and analysis tool. Not all fact checkers use CrowdTangle heavily. The reasons for this include lack of confidence or mastery, lack of local language capability (for example CrowdTangle not recognizing Turkish letters like ç), or seeing CrowdTangle as a less useful tool for identifying online misinformation in their specific country context.  

  • Manual searches - Many fact checkers also manually monitor, following specific people and pages, searching for links and misspellings of names, or following private Instagram accounts to monitor content that is appearing beyond the reach of CrowdTangle, which only covers public content. Some fact checkers also monitor topics to spot possible claims – for example a keyword search for “coronavirus” could highlight claims that the virus originated in a lab in Wuhan, or that 5G causes the virus. 

  • Reader suggestions - Many fact checkers encourage reader suggestions via WhatsApp, Messenger, email, Telegram and custom-built platforms. 

  • Facebook’s fact checking product - Facebook’s program provides partners with a tool to carry out their work: the “fact checking product”. This allows fact checkers to see a queue of user-submitted and AI-surfaced content which may be false or misleading (known colloquially as “the queue”). Ideas about how to funnel down the content in the queue included:  

    • Connect Full Fact's claim detection tool to the queue to sift checkable claims from general viral content. This uses a machine learning model to determine whether or not a piece of text contains a factual claim, and could cut the queue down to something more manageable and useful (a minimal illustrative sketch of this kind of classifier follows this list). 

    • Expand community reviewers to sift through the queue and narrow down what is there based on criteria agreed among fact checkers.  

    • Feed whitelisted sites such as genuine news websites into the algorithm.  

    • Integrate speech to text software for video content. 

  • While most fact checkers’ preferred monitoring tools are CrowdTangle and Facebook’s fact checking product, others include Google Alerts, Brand24, Twitter advanced search, Buzzsumo and (paid-for tool) Trendolizer. 
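
To give a sense of what claim detection involves, below is a minimal, illustrative sketch of a classifier that labels text as containing a checkable factual claim or not. It is not Full Fact's model; the training examples and the scikit-learn pipeline are assumptions chosen purely for demonstration.

# Illustrative sketch only: a toy claim-detection classifier, not Full Fact's actual model.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny hypothetical training set: 1 = contains a checkable factual claim, 0 = opinion/other.
train_texts = [
    "Unemployment fell by 2% last year",
    "The vaccine contains a microchip",
    "5G towers cause coronavirus",
    "The minister visited the flood-hit region on Monday",
    "I think the government is doing a terrible job",
    "What a beautiful sunset tonight",
    "We should all be kinder to each other",
    "Happy birthday to my best friend!",
]
train_labels = [1, 1, 1, 1, 0, 0, 0, 0]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(train_texts, train_labels)

# Sift a queue of user-submitted posts: keep only items likely to contain a claim.
queue = [
    "The virus was created in a lab in Wuhan",
    "Good morning everyone, have a great day",
    "Voter turnout was 90% in the capital",
]
checkable = [post for post in queue if model.predict([post])[0] == 1]
print(checkable)

In practice a production system would be trained on many thousands of labelled sentences and handle multiple languages, but the shape of the task - text in, claim or no claim out - is the same.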


What to investigate?

It is one thing to monitor what is being claimed on social media; it is another to choose which claims should be checked by AI. The obvious answer is 'the non-factual ones', but identifying these without error is easier said than done. Besides, TITAN's ambition is to upskill humans to make this decision, not the AI. To decide what content should be investigated, we can learn from how fact checkers select claims to investigate: 

 

  • The first and most obvious consideration is whether a claim can be checked. Within Facebook’s fact checking product and among reader requests are a lot of opinions or commentary articles that fact checkers can’t verify. Sometimes data or evidence isn’t available for certain topics, or is of such low quality that it isn’t usable.

  • Another factor could be to prioritize potential for harm. Questions fact checkers ask themselves about harm include:  

    • If someone believed a claim, what damage could it cause to their and others’ health, lives or finances?  

    • Could the claim threaten democratic processes or minority groups?  

    • What is the implication for public discourse and national security?  

    • Who will this claim hurt, and how, if people believe it?  

    • Is life at risk?  

    • Does the claim relate to an urgent situation (e.g. floods, bombings) and require a quick response to stop the misinformation from exacerbating that situation?  

  • Virality and reach are also important, but almost all the fact checkers we interviewed had a sceptical view of virality and how to define it. Here are a few of the ways fact checkers consider reach and virality as part of their selection criteria (a simple triage sketch combining these criteria follows the list):  

    • Number and speed of shares on social platforms  

    • A piece of content has reached a threshold of engagement (e.g. 3,000 retweets, 5,000 shares, 1,000+ reactions or comments)  

    • A claim is getting reported to the WhatsApp tip line multiple times  

    • The publisher (e.g. page or account) has lots of followers  
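
The criteria above can be combined into a simple triage rule. The sketch below scores a post on checkability, harm and reach, reusing the example engagement thresholds quoted above; the field names and the tip-line and follower thresholds are our assumptions rather than any fact checker's policy.

# Illustrative triage sketch; field names and the tip-line/follower thresholds are assumptions.
from dataclasses import dataclass

@dataclass
class Post:
    text: str
    retweets: int = 0
    shares: int = 0
    reactions: int = 0
    tip_line_reports: int = 0
    publisher_followers: int = 0

def is_viral(post: Post) -> bool:
    """Apply the example engagement thresholds mentioned by fact checkers."""
    return (post.retweets >= 3000
            or post.shares >= 5000
            or post.reactions >= 1000
            or post.tip_line_reports >= 3            # reported to the tip line multiple times
            or post.publisher_followers >= 100_000)  # publisher with lots of followers

def triage(post: Post, checkable: bool, harmful: bool) -> str:
    """Rank a post for investigation: checkability first, then harm, then reach."""
    if not checkable:
        return "skip (opinion, or no usable evidence available)"
    if harmful and is_viral(post):
        return "investigate urgently"
    if harmful or is_viral(post):
        return "investigate"
    return "monitor"

example = Post("5G causes coronavirus", retweets=4200, tip_line_reports=12)
print(triage(example, checkable=True, harmful=True))  # -> investigate urgently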


Learning from fact-checkers to research, write, and review a fact check  


i) Overview of the research, writing and review process 

Every fact checker has a review process involving at least one other editor checking the quality of evidence, logic of argument, clarity of prose and political balance. Organisations have different types and levels of checks in place, ranging from up to six layers of editing to a voting system where a minimum of four editors must approve a draft. Some organisations' directors are involved in the editing process daily, whereas in others the director is only involved for controversial topics or tricky fact checks. Usually, the following steps are used to arrive at a fact check (a small sketch encoding them appears after the list):  


  • Identify claim  

  • Check for existing fact checks of the same claim  

  • Identify source of claim  

  • Consider motivation  

  • Attempt to contact claimant  

  • Look for evidence  

  • Assess quality of primary sources  

  • Contact press offices and data institutes  

  • Write draft  

  • First edit  

  • Editors read and vote on draft  

  • Second edit  

  • Writers and editors discuss rating  

  • Final review  
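
To make this workflow concrete, the sketch below encodes the steps as an ordered checklist, together with the four-editor voting rule mentioned above; the data structure is our illustration, not any organisation's actual system.

# Illustration of the editorial workflow above; the structure is ours, not any organisation's system.
FACT_CHECK_STEPS = [
    "Identify claim",
    "Check for existing fact checks of the same claim",
    "Identify source of claim",
    "Consider motivation",
    "Attempt to contact claimant",
    "Look for evidence",
    "Assess quality of primary sources",
    "Contact press offices and data institutes",
    "Write draft",
    "First edit",
    "Editors read and vote on draft",
    "Second edit",
    "Writers and editors discuss rating",
    "Final review",
]

def draft_approved(votes_in_favour: int, minimum_approvals: int = 4) -> bool:
    """Voting rule used by some organisations: at least four editors must approve a draft."""
    return votes_in_favour >= minimum_approvals

for number, step in enumerate(FACT_CHECK_STEPS, start=1):
    print(f"{number:2d}. {step}")
print("Draft approved:", draft_approved(votes_in_favour=5))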

Extra steps taken when needed include using freedom of information requests, contacting international colleagues, or seeking expert input. We can look at the process in a bit more detail: 


ii) Source of an online claim 

Almost all fact checkers see identifying the source of a claim as the key to fact checking it and to understanding intention. Intention is hard to prove, but fact checkers do look at what motivation someone might have for sharing or creating a post.  


Tracing the origin also reveals where a post has travelled and how it propagated, which can add useful context for both fact checkers and readers. Some fact checkers, such as Animal Politico and Rappler, follow the rabbit hole down to establish whether there is a coordinated network behind a piece of misinformation.  


iii) Review process 

Senior or managing editors in most fact checking organizations review the credibility, quality and sufficiency of evidence used in a fact check. The editor looks at the draft from multiple perspectives and removes any bias they see, and considers possible misinterpretations to avoid backlash.  


Editors also review language and writing style. Some, like Africa Check, have a style guide. Others make an effort to break away from journalistic norms within their country. 


iv) Editorial Materials  

Many fact checkers have an established structure or template for drafting articles.  


Fact Crescendo, the leading fact checking website in India, says, “We don’t want people left guessing or wrongly assuming that a claim is true, so we use an inverted pyramid, starting with what’s being spread, what is being claimed, and why it’s wrong. We keep our methodology simple, saying how we searched it on Google, and what keywords we used – then a clear conclusion.”  

 

PesaCheck, Africa's largest indigenous fact checking organisation, asks writers to answer six questions:  

  1. What is the claim?  

  2. Where was it published?  

  3. Who was it made by, or to whom do we attribute it?  

  4. Why is it deserving of a full fact check?  

  5. Is it something that could lead to real world harm that can be avoided by fact checking it? And what impact will fact checking have on public conversation – will it just create more buzz and confusion?  

  6. What is the verdict?  

 

Chequeado has an eight-step methodology for fact checking misinformation online, developed in collaboration with other Latin American fact checkers and First Draft during a conference in 2019:  

  1. Select suspicious content from the social networks that are monitored  

  2. Weigh its relevance  

  3. Consult, when identifiable, the original source  

  4. Consult, if identifiable, those involved in/affected by the misinformation  

  5. Consult the official source  

  6. Consult alternative sources  

  7. Give context  

  8. Confirm or deny the content  

 

US-based PolitiFact has five standard questions that a fact checker answers as part of the research and writing process:  

  1. What is the claim?  

  2. Where was it published?  

  3. Who made the claim?  

  4. What’s the significance? (This covers potential for real world harm as well as the possible impact on public conversation of fact checking the claim – sometimes a fact check might just be adding to buzz and confusion.)  

  5. What is our verdict?  

 

v) Skills 

Social media claims vary in terms of topic, format and source. From one day to another, a fact checker might check text posts, videos, images or audio clips presenting fabricated quotes, bogus cures or overblown claims about the performance of the government. This requires knowledge of a wide range of tools and sources as well as human judgement, curiosity and determination.  


In general, journalism schools do not teach students basic online verification techniques such as reverse image searching. Experienced journalists do not necessarily have the skills to do online fact checking, either.  


Here’s what many fact checkers will be able to do after several months on the job:  

  • Identify text-based and non-text-based claims – e.g. which part of a meme is being checked.  

  • Spot when an image has been fabricated or manipulated and find the original.  

  • Identify edited videos. 

  • Search screenshots from videos to identify the original source (see the sketch after this list). 

  • Construct effective keyword searches.  

  • Find and use basic statistics such as international population figures or voter registration data.  

  • Spot psychological tricks that attempt to elicit certain audience reactions.  

  • Look beyond individual claims to spot patterns and learn how misinformers operate.  

  • Quickly interpret new online environments and judge their credibility.   
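
Several of these skills have tooling counterparts. As a hedged illustration of the screenshot-and-search workflow referenced above, the sketch below samples a frame every few seconds from a video with OpenCV and computes a perceptual hash with the imagehash library; the hashes can then be compared against previously seen images or used to drive a reverse image search. The file name and sampling interval are assumptions for the example.

# Illustrative sketch: sample frames from a video and compute perceptual hashes
# so they can be matched against known images or used in a reverse image search.
# "suspect_clip.mp4" and the 5-second interval are assumptions for this example.
import cv2
import imagehash
from PIL import Image

def frame_hashes(video_path: str, every_seconds: float = 5.0):
    capture = cv2.VideoCapture(video_path)
    fps = capture.get(cv2.CAP_PROP_FPS) or 25.0
    step = max(1, int(fps * every_seconds))
    hashes, index = [], 0
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        if index % step == 0:
            # OpenCV returns BGR arrays; convert to RGB for PIL before hashing.
            rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            hashes.append((index, imagehash.phash(Image.fromarray(rgb))))
        index += 1
    capture.release()
    return hashes

for frame_index, phash in frame_hashes("suspect_clip.mp4"):
    print(frame_index, phash)  # a small Hamming distance between two hashes suggests a visual match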

vi) Training

Teyit is an independent fact checking organisation based in Turkey. Its training for interns begins with reading translated research, Teyit's own reports, and experts' articles on information disorder and misinformation. Then, interns give a presentation on what they understand about misinformation, how Teyit tackles it, and their own ideas for tackling it.


After a week, interns are assigned simple fact checks, graduating to harder and more complex claims which require more than simple reverse image searches. Teyit has a database of over 200 tools, with descriptions of their uses, which is shared with new staff and interns when they join, along with internal training videos. Interns are taught how to use WordPress and encouraged to try a wide variety of tools. Teyit also encourages people to pick up the phone.  


Fact Crescendo gives new editorial staff a fifteen-day structured training induction. It covers:  

  • Tips on how to spot fake news: critical thinking, looking out for emotional appeals, incomplete details, etc.  

  • Brief introduction to International Fact-Checking Network’s code of conduct and policies.  

  • Basic tools to analyze content: simple tools such as reverse image search, advanced Google search, Twitter search, translation tools, etc.  

  • Monitoring tools such as CrowdTangle and TweetDeck.  

  • Fact Crescendo maintains a list of tools which is regularly updated and is used by new and existing fact checkers.  


Arabian Fatabyyano has five volunteer team members. People must enter a competition to join the team. Such competitions can involve up to 50 people checking information as quickly as possible. “They have to be able to read English, use basic tools such as reverse image search, and write a short draft of an article. Later we continue to train people depending on which team they join. But everyone joins with basic skills.”  


vii) Evidence

Fact checkers mentioned a wide range of tools that they use primarily in research, rather than monitoring. These were the most frequently mentioned in interviews:  


  • Searching:  

    • Baidu search  

    • Bing search  

    • Google advanced search  

    • Twitter advanced search  

    • Facebook Graph Search (not currently operating - fact checkers want this to be reinstated)  

  • Video and image verification:  

    • Amnesty video verification tool  

    • InVid  

    • Google reverse image search  

    • RevEye  

    • TinEye  

    • Yandex reverse image search  

    • FotoForensics  

  • Archiving pages or locating previously-archived pages:  

    • Internet Archive  

    • Archive.is 

    • eyewitness  

  • Evaluating web pages:  

    • Website Informer  

    • Who.is 

    • CrowdTangle Chrome extension  

  • Other:  

    • Google Dataset Search  

    • Google Translate  

    • News agencies  

    • Newspaper archives  

Fact checkers use a wide range of data to check online claims, for example international sources such as the World Bank and the World Health Organization, national statistics bureaus, data produced by NGOs, data obtained through freedom of information laws, archives and legal documents. Access to data, quality of data and publication formats vary from country to country. 
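
As an example of pulling one of these international sources programmatically, the sketch below queries the World Bank's public indicators API for a country's total population. The indicator code and country are illustrative choices, and the response layout should be checked against the API documentation before relying on it.

# Illustrative sketch: fetch a statistic (total population of Kenya, 2020) from the
# World Bank indicators API. Indicator code and country are example choices; check the
# API documentation for the exact response format before relying on this.
import requests

url = "https://api.worldbank.org/v2/country/KEN/indicator/SP.POP.TOTL"
response = requests.get(url, params={"format": "json", "date": "2020"}, timeout=30)
response.raise_for_status()

payload = response.json()
# The API typically returns [metadata, [data points]]; guard against an empty result.
if len(payload) > 1 and payload[1]:
    point = payload[1][0]
    print(point["country"]["value"], point["date"], point["value"])
else:
    print("No data returned for this query")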


viii) Publication and distribution of online fact checks 

Fact checkers publish their online fact checks in multiple places, including their own websites, and promote them through their own social media channels.  

 

Most also distribute their online fact checks more widely to reach a bigger audience, for example through media partnerships: some media partners pay fees, while others have arrangements to republish and reuse fact checks free of charge.  

 

Fact checkers sometimes run online advertisements, and many have received ad credits from internet companies, particularly during elections and during the coronavirus pandemic.  

 

A third distribution method is technology set up by internet companies, such as Facebook’s Third-Party Fact-Checking programme, which shows fact checks and fact checkers’ ratings to Facebook users, and ClaimReview, which enables Google, YouTube, Bing and others to highlight fact checks in search results and in apps.  
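
ClaimReview itself is structured markup based on schema.org that fact checkers embed in their article pages. The sketch below assembles an illustrative ClaimReview record in Python and prints it as JSON-LD; the URLs, organisation name and rating are invented for the example, and the exact required fields should be confirmed against the schema.org and Google Fact Check documentation.

# Illustrative sketch of ClaimReview markup (schema.org); all URLs, names and the rating
# are invented for this example - consult schema.org/ClaimReview for the current spec.
import json

claim_review = {
    "@context": "https://schema.org",
    "@type": "ClaimReview",
    "url": "https://example-factchecker.org/fact-checks/5g-coronavirus",   # hypothetical
    "datePublished": "2021-03-01",
    "author": {"@type": "Organization", "name": "Example Fact Checker"},   # hypothetical
    "claimReviewed": "5G mobile networks spread the coronavirus",
    "itemReviewed": {
        "@type": "Claim",
        "appearance": {"@type": "CreativeWork", "url": "https://example.com/viral-post"},
    },
    "reviewRating": {
        "@type": "Rating",
        "ratingValue": 1,
        "bestRating": 5,
        "worstRating": 1,
        "alternateName": "False",
    },
}

# Embedding this JSON-LD in a <script type="application/ld+json"> tag on the article page
# is what lets search engines surface the fact check alongside results.
print(json.dumps(claim_review, indent=2))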

 

The main challenges fact checkers experience in this part of the process include:  

  • getting set up on new social media channels with limited staff resources,  

  • presentation of fact checks,  

  • knowing too little about audiences,  

  • managing media partnerships,  

  • internet censorship,  

  • online harassment,  

  • getting clear answers and support about how to use ClaimReview and how internet companies use it in their products.  

 

ix) Technology (wish list)

The most popular proposition was a tool which identifies claims and provides virality metrics alongside them.  Based on interviews, the ideal – though perhaps unrealistic – monitoring tool would:  

  • Identify claims in a wide range of languages and alphabets.  

  • Take in data about previously-checked accounts and pages (helping to identify repeat offenders).  

  • Capture virality and predict the performance of a post.  

  • Work across Facebook, Twitter, Instagram and YouTube.  

  • Have video and image search functionality.  

  • Auto-generate keyword searches based on live data.  

  • Transcribe speech to text in a wide range of languages.  

  • Have detailed transparency documentation.  

 

Beyond this, organisations’ wish lists were more specific, either to their workflow or country:  

  • Improving natural language processing in specific languages, e.g. Arabic.  

  • “Editorial Checklist” WordPress plugin.  

  • Crowdsourcing platform for micro-research tasks (e.g. converting PDFs to raw data).  

  • PDF to Excel converter.  

  • Auto-generating parts of articles, e.g. the CMS suggests a link to a frequently used dataset on the topic you are writing about, or auto-fills a sentence about the share count of the claim you're checking, based on the claim's URL.  

  • Software that flags whether videos are likely to have been altered.  

  • Instagram Stories monitoring tool.  

 

Technology for automatic identification of claims, crowdsourcing reader tips, and search trends:  

  • Lead Stories’ Trendolizer: identifying emerging viral posts and connecting the dots between known misinformers  

  • Rappler’s shark tank: monitoring which ingests accounts previously identified as spreading misinformation  

  • Full Fact: claim detection and claim matching  

  • Tech and Check Cooperative: identifying claims and disseminating fact checks via an app  

  • Chequeado’s Chequeabot: identifying claims in online media outlets  

  • Aos Fatos’s Radar: disinformation monitoring in real time  

  • RMIT ABC Fact Check: identifying bushfire misinformation on Twitter  

  • Teyit’s crowd-powered website: educating users as they participate in monitoring and research  

  • Africa Check’s WhatsApp chat bot: crowdsourcing WhatsApp misinformation via reader request  

  • First Draft’s coronavirus search trends: briefings on trending coronavirus searches  

Technology to help fact checkers with research:  

  • Forensia: authenticity scores for audio files  

  • Maldita.es: superpowered community of experts to advise and contribute to research  

  • Full Fact’s robochecking prototype  

Technology to help fact checkers publish and distribute their work:  

  • Aos Fatos’ Fátima: replying to users in Facebook Messenger, and challenging sharing of false information on Twitter  

  • Coronavirus alliance: searchable global database of coronavirus fact checks  

 

Challenges for technologists  

  • Social listening tools that combine virality with claim identification  

  • Claim spotting and matching  

  • YouTube monitoring tool  

  • Improving natural language processing in smaller languages  

  • Searchable image and video misinformation database  

  • Database for fact checks of claims that go across borders, with internal translation capability  

  • Speech-to-text transcription for YouTube content that can be connected with claim spotting tools  

 

Summary of challenges  

  • Monitoring:  

    • Volume and relevance  

    • Overemphasis on virality from social listening tools  

    • Inundation with audience requests  

    • No monitoring tool for YouTube  

    • Image and video searching  

  • Research:  

    • Repetitive claims and time-consuming or repetitive tasks  

    • Accessibility of information and transparency of authorities  

    • Training editorial staff  

    • Difficulty of finding a source for claims originating from closed platforms  

  • Publication and distribution:  

    • Setting up new social media channels  

    • Sustaining media partnerships  

    • Presenting fact checks with limited space and design resources  

    • Internet shutdowns  

    • Online harassment  

  • Working with internet companies:  

    • Financial dependency on internet company funding  

    • Transparency: both in terms of the full scope and nature of internet companies' responses to online misinformation, and of detailed impact metrics for partnerships with fact checkers or products powered by fact checks  

    • The need for more investment by more internet companies in partnerships and engagement with fact checkers  

    • Testing and feedback  

    • Variation in the fact check data requirements of different internet companies' products   


More information from our (TITAN) 'Socio-technical Framework and User Needs Analysis', from which this content was extracted, will be published soon. Join our mailing list by subscribing at the bottom of this page to receive a copy of the report.



