TLDR

The authors propose an entity annotator built on TagME’s algorithmic framework, engineered to be modular and more efficient, and plan to release it as an open-source library. Its three main modules (spotting, disambiguation, and pruning) are redesigned with new algorithms, which were tested on the public AIDA, IITB, MSN, and AQUAINT datasets and on the ERD Challenge dataset. The best combination reached an F1 of 74.8% on the ERD development set and 67.2% on the test set, the latter resulting from 87.6% precision and 54.5% recall. On the development dataset it improved on classic TagME by 1% to 9% on the D2W benchmark, depending on the disambiguation algorithm used.
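The reported test-set figures can be cross-checked with a short calculation. This is a minimal sketch that assumes F1 here is the harmonic mean of the reported precision and recall; the source does not state the formula, and the name `f1_score` is illustrative only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed definition of F1)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported for the ERD test dataset.
test_f1 = f1_score(0.876, 0.545)
print(f"{test_f1:.1%}")  # 67.2%
```

Under that assumption the two values reproduce the reported 67.2% test F1, and the calculation shows how the low recall pulls the score well below the 87.6% precision.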

Abstract

In this paper we propose a novel entity annotator for texts which hinges on TagME's algorithmic technology, currently the best one available. The novelty is twofold: on the one hand, we have engineered the software to be modular and more efficient; on the other hand, we have improved the annotation pipeline by redesigning all three of its main modules: spotting, disambiguation and pruning. In particular, the redesign involved a detailed inspection of the performance of these modules and the development of new algorithms, which were in turn tested over all publicly available datasets (i.e. AIDA, IITB, MSN, AQUAINT, and the one of the ERD Challenge). This extensive experimentation allowed us to derive the best combination, which achieved an F1 score of 74.8% on the ERD development dataset and 67.2% on the test dataset. This final result was due to an impressive precision of 87.6% but a very low recall of 54.5%. With respect to classic TagME, the improvement on the development dataset ranged from 1% to 9% on the D2W benchmark, depending on the disambiguation algorithm being used. As a side result, the final software can be interpreted as a flexible library of several parsing/disambiguation and pruning modules that can be used to build new and more sophisticated entity annotators. We plan to release our library to the public as an open-source project.
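The idea of a library of interchangeable spotting, disambiguation and pruning modules can be illustrated with a small sketch. Everything below is hypothetical: the names (`Annotation`, `build_annotator`, `dictionary_spotter`, `best_score_disambiguator`, `threshold_pruner`), the toy candidate dictionary and its scores are invented for illustration and are not the authors' API or algorithms. In particular, the disambiguator simply picks the highest-scoring candidate, a stand-in for whichever disambiguation algorithm is plugged in.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets of a mention


@dataclass(frozen=True)
class Annotation:
    start: int
    end: int
    entity: str
    score: float


# Each stage is a plain callable, so any stage can be swapped independently.
Spotter = Callable[[str], List[Span]]
Disambiguator = Callable[[str, List[Span]], List[Annotation]]
Pruner = Callable[[List[Annotation]], List[Annotation]]

# Hypothetical toy data: mention -> candidate entities with made-up scores.
TOY_CANDIDATES: Dict[str, List[Tuple[str, float]]] = {
    "pisa": [("Pisa", 0.9), ("Leaning_Tower_of_Pisa", 0.4)],
    "paper": [("Paper", 0.2)],
}


def dictionary_spotter(text: str) -> List[Span]:
    """Spotting: return spans of words found in the toy dictionary."""
    return [m.span() for m in re.finditer(r"\w+", text)
            if m.group().lower() in TOY_CANDIDATES]


def best_score_disambiguator(text: str, spans: List[Span]) -> List[Annotation]:
    """Disambiguation stand-in: keep the highest-scoring candidate per span."""
    annotations = []
    for start, end in spans:
        candidates = TOY_CANDIDATES[text[start:end].lower()]
        entity, score = max(candidates, key=lambda c: c[1])
        annotations.append(Annotation(start, end, entity, score))
    return annotations


def threshold_pruner(min_score: float) -> Pruner:
    """Pruning: drop annotations whose score falls below min_score."""
    def prune(annotations: List[Annotation]) -> List[Annotation]:
        return [a for a in annotations if a.score >= min_score]
    return prune


def build_annotator(spot: Spotter, disambiguate: Disambiguator,
                    prune: Pruner) -> Callable[[str], List[Annotation]]:
    """Compose the three stages into a single annotator function."""
    def annotate(text: str) -> List[Annotation]:
        return prune(disambiguate(text, spot(text)))
    return annotate


annotate = build_annotator(dictionary_spotter, best_score_disambiguator,
                           threshold_pruner(0.5))
result = annotate("A paper written in Pisa")
print(result)
```

In this toy run both "paper" and "Pisa" are spotted, but only the "Pisa" annotation survives pruning. Raising the pruning threshold favours precision over recall, the same kind of trade-off visible in the reported test-set figures. Replacing any one of the three callables yields a different annotator without touching the other two stages.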
