TLDR

Social media is increasingly used by victims, volunteers, and relief organizations to report and respond to large‑scale events, yet current mining approaches focus on search or aggregate trends rather than narrative construction, and Twitter’s short, often nonstandard messages demand novel extraction techniques. The authors introduce CrisisTracker, an online system that captures real‑time, distributed situation‑awareness reports from social media during large‑scale events such as natural disasters. CrisisTracker tracks keyword sets on Twitter, clusters related tweets by lexical similarity to build stories, and employs crowdsourcing to enable users to verify and analyze those stories. During an eight‑day pilot on the Syrian civil war, CrisisTracker processed an average of 446,000 tweets per day, distilled them into consumable stories, and received positive feedback from 48 domain experts and volunteer curators.

Abstract

Victims, volunteers, and relief organizations are increasingly using social media to report and act on large-scale events, as witnessed in the extensive coverage of the 2010–2012 Arab Spring uprisings and the 2011 Japanese tsunami and nuclear disasters. Twitter® feeds consist of short messages, often in a nonstandard local language, requiring novel techniques to extract relevant situation awareness data. Existing approaches to mining social media are aimed at searching for specific information, or identifying aggregate trends, rather than providing narratives. We present CrisisTracker, an online system that efficiently captures, in real time, distributed situation awareness reports based on social media activity during large-scale events, such as natural disasters. CrisisTracker automatically tracks sets of keywords on Twitter and constructs stories by clustering related tweets on the basis of their lexical similarity. It integrates crowdsourcing techniques, enabling users to verify and analyze stories. We report our experiences from an 8-day CrisisTracker pilot deployment during 2012 focused on the Syrian civil war, which processed, on average, 446,000 tweets daily and reduced them to consumable stories through analytics and crowdsourcing. We discuss the effectiveness of CrisisTracker based on the usage and feedback from 48 domain experts and volunteer curators.
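
To make the idea of building stories from lexically similar tweets concrete, here is a minimal Python sketch. It is not CrisisTracker's actual algorithm, which the abstract does not specify. It assumes a simple greedy scheme: each tweet is reduced to a set of lowercase tokens and joins the existing story whose seed tweet it overlaps most by Jaccard similarity, or starts a new story if no overlap reaches a threshold. The function names, the tokenizer, and the 0.5 threshold are all illustrative assumptions.

```python
import re


def tokenize(text):
    """Lowercase a tweet and return its set of word tokens (hashtags and mentions kept)."""
    return set(re.findall(r"[#@]?\w+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_tweets(tweets, threshold=0.5):
    """Greedily group tweets into stories (hypothetical scheme, not the paper's).

    Each story keeps the token set of its first (seed) tweet. A new tweet joins
    the most similar story if the similarity reaches `threshold`; otherwise it
    seeds a new story.
    """
    stories = []  # each story: {"tokens": set of seed tokens, "tweets": list of str}
    for tweet in tweets:
        tokens = tokenize(tweet)
        best, best_sim = None, 0.0
        for story in stories:
            sim = jaccard(tokens, story["tokens"])
            if sim > best_sim:
                best, best_sim = story, sim
        if best is not None and best_sim >= threshold:
            best["tweets"].append(tweet)
        else:
            stories.append({"tokens": tokens, "tweets": [tweet]})
    return stories


if __name__ == "__main__":
    sample = [
        "Explosion reported near Aleppo market",
        "explosion reported near aleppo market #Syria",
        "Aid convoy reaches Homs today",
    ]
    for i, story in enumerate(cluster_tweets(sample), start=1):
        print(i, story["tweets"])
```

In this sketch the first two sample tweets share five of six distinct tokens and land in one story, while the third shares none and starts its own. Comparing every tweet against every story is quadratic in the worst case, so a stream of hundreds of thousands of tweets per day would need an approximate or indexed similarity search instead.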
