Applying Authorship Analysis to Extremist-Group Web Forum Messages

TLDR

The speed, ubiquity, and potential anonymity of Internet media make them ideal channels for militant groups, so intelligence agencies increasingly analyze Web content. Authorship analysis, although rooted in work with literary texts, can help detect patterns of terrorist communication. The study adapts an existing online authorship framework to Arabic and English extremist forum messages to explore the problems that online communication poses. The authors built a multilingual model with algorithms and features geared toward Arabic, and added a message-extraction component that supports a more comprehensive feature set tailored to online messages. Comparing the linguistic features of Web messages against known writing styles offers intelligence agencies a tool for identifying patterns of terrorist communication.

Abstract

The speed, ubiquity, and potential anonymity of Internet media - email, Web sites, and Internet forums - make them ideal communication channels for militant groups and terrorist organizations. Analyzing Web content has therefore become increasingly important to the intelligence and security agencies that monitor these groups. Authorship analysis can assist this activity by automatically extracting linguistic features from online messages and evaluating stylistic details for patterns of terrorist communication. However, authorship analysis techniques are rooted in work with literary texts, which differ significantly from online communication. To explore these problems, we modified an existing framework for analyzing online authorship and applied it to Arabic and English Web forum messages associated with known extremist groups. We developed a special multilingual model - the set of algorithms and related features - to identify Arabic messages, gearing this model toward the language's unique characteristics. Furthermore, we incorporated a complex message extraction component to allow the use of a more comprehensive set of features tailored specifically toward online messages. Evaluating the linguistic features of Web messages and comparing them to known writing styles offers the intelligence community a tool for identifying patterns of terrorist communication.
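The general idea of extracting linguistic features from a message and comparing them to known writing styles can be illustrated with a minimal sketch. It is not the authors' model: the feature set, the English function-word list, the function names, and the nearest-profile rule are all assumptions chosen for brevity, and the abstract does not specify the algorithms actually used.

```python
import math
import re
from collections import Counter

# Hypothetical, English-only function-word list; a real feature set would be
# far larger and would need language-specific entries (e.g. for Arabic).
FUNCTION_WORDS = ("the", "of", "and", "to", "in", "that", "is", "for")


def extract_features(message: str) -> list[float]:
    """Turn one message into a small vector of stylistic features."""
    # \w matches Unicode word characters, so Arabic script is tokenized too.
    words = re.findall(r"\w+", message.lower())
    n_words = max(len(words), 1)   # avoid division by zero on empty input
    n_chars = max(len(message), 1)
    counts = Counter(words)
    features = [
        sum(len(w) for w in words) / n_words,           # average word length
        len(counts) / n_words,                          # vocabulary richness
        sum(c in ".,;:!?" for c in message) / n_chars,  # punctuation rate
        sum(c.isdigit() for c in message) / n_chars,    # digit rate
    ]
    # Relative frequency of each function word.
    features += [counts[w] / n_words for w in FUNCTION_WORDS]
    return features


def build_profiles(known: dict[str, list[str]]) -> dict[str, list[float]]:
    """Average the feature vectors of each author's known messages."""
    profiles = {}
    for author, messages in known.items():
        vectors = [extract_features(m) for m in messages]
        profiles[author] = [sum(col) / len(vectors) for col in zip(*vectors)]
    return profiles


def attribute(message: str, profiles: dict[str, list[float]]) -> str:
    """Return the author whose profile is closest in Euclidean distance."""
    vec = extract_features(message)
    return min(profiles, key=lambda author: math.dist(vec, profiles[author]))
```

In this sketch the features are left unscaled, so average word length dominates the distance; a practical system would normalize each feature and would likely replace the nearest-profile rule with a trained classifier.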
