Publication | Closed Access
Taghreed
Citations: 68
References: 35
Year: 2014
Unknown Venue
Cluster Computing, Engineering, Information Retrieval, Data Science, Taghreed Visualizer, Streaming Engine, Management, Query Processor, Data Integration, Computer Science, Distributed Query Processing, Search Engine Indexing, Big Data, Data Stream Management, Data Management, Full-fledged System, Text Mining, Query Optimization
This paper presents Taghreed, a full-fledged system for efficient and scalable querying, analysis, and visualization of geotagged microblogs, e.g., tweets. Taghreed supports arbitrary queries on a large number (billions) of microblogs that go back several months. It consists of four main components: (1) indexer, (2) query engine, (3) recovery manager, and (4) visualizer. The Taghreed indexer efficiently digests incoming microblogs at high arrival rates into lightweight memory-resident indexes. When memory becomes full, a flushing policy manager transfers the memory contents to disk indexes, which manage billions of microblogs spanning several months. On memory failure, the recovery manager restores the system status from replicated copies of the main-memory content. The Taghreed query engine consists of two modules: a query optimizer and a query processor. The query optimizer generates an optimal query plan, which the query processor executes using efficient retrieval techniques to provide low query response times, i.e., on the order of milliseconds. The Taghreed visualizer allows end users to issue a wide variety of spatio-temporal queries; it then presents the answers graphically and allows interactive exploration of them. Taghreed is the first system that addresses all these challenges collectively for microblog data. Each system component is described in detail in the paper.
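The two-tier indexing idea in the abstract (a memory-resident index that is flushed to disk indexes when memory fills, with queries consulting both) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the names `Microblog`, `grid_cell`, and `TwoTierIndex`, the one-degree grid partitioning, the flush-everything policy, and the use of an in-process dictionary as a stand-in for disk.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Microblog:
    timestamp: int  # seconds since epoch
    lat: float
    lon: float
    text: str


def grid_cell(lat, lon, cell_deg=1.0):
    """Map a coordinate to a coarse grid cell (hypothetical partitioning)."""
    return (int(lat // cell_deg), int(lon // cell_deg))


class TwoTierIndex:
    """Toy spatial index with a bounded memory tier and an unbounded 'disk' tier."""

    def __init__(self, memory_capacity):
        self.memory_capacity = memory_capacity
        self.memory = defaultdict(list)  # recent microblogs, keyed by grid cell
        self.disk = defaultdict(list)    # stand-in for disk-resident indexes
        self.memory_count = 0

    def insert(self, blog):
        self.memory[grid_cell(blog.lat, blog.lon)].append(blog)
        self.memory_count += 1
        if self.memory_count >= self.memory_capacity:
            self.flush()

    def flush(self):
        """Assumed policy: move the entire memory tier to the disk tier."""
        for cell, blogs in self.memory.items():
            self.disk[cell].extend(blogs)
        self.memory.clear()
        self.memory_count = 0

    def query(self, lat, lon, t_start, t_end):
        """Return microblogs in the cell containing (lat, lon) within the
        time range [t_start, t_end], most recent first."""
        cell = grid_cell(lat, lon)
        hits = [
            blog
            for tier in (self.memory, self.disk)
            for blog in tier.get(cell, [])
            if t_start <= blog.timestamp <= t_end
        ]
        return sorted(hits, key=lambda blog: blog.timestamp, reverse=True)
```

In this sketch a query always checks the memory tier first and then the disk tier; a real query optimizer would presumably decide which indexes to touch based on the query's temporal range, which this toy version does not attempt.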