Publication | Open Access
Siphoning Hidden-Web Data through Keyword-Based Interfaces
Citations: 107 | References: 16 | Year: 2004
In this paper, we study the problem of automating the retrieval of data hidden behind simple search interfaces that accept keyword-based queries. Our goal is to automatically retrieve all available results (or as many as possible). We propose a new approach to siphon hidden data that automatically generates a small set of representative keywords and builds queries which lead to high coverage. We evaluate our algorithms over several real Web sites. Preliminary results indicate that our approach is effective: coverage of over 90% is obtained for most of the sites considered.
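To make the idea of keyword-driven siphoning concrete, the following is a minimal sketch, not the algorithm from the paper, whose details are not given in this abstract. It assumes a toy in-memory corpus standing in for a real search form, and a simple greedy heuristic: issue a seed keyword, then repeatedly pick the unused word that appears in the most retrieved documents. The names `make_search`, `siphon`, and `coverage` are hypothetical and introduced only for illustration.

```python
from collections import Counter


def make_search(documents):
    """Build a toy keyword search over an in-memory corpus.

    Assumption: this stands in for a site's keyword interface and
    returns every document containing the keyword as a whole word.
    """
    index = {}
    for doc_id, text in documents.items():
        for word in set(text.lower().split()):
            index.setdefault(word, set()).add(doc_id)

    def search(keyword):
        hits = index.get(keyword.lower(), ())
        return {doc_id: documents[doc_id] for doc_id in hits}

    return search


def siphon(search, seed, max_queries=10):
    """Greedily issue keyword queries and collect the results.

    Each new keyword is the unused word seen in the largest number of
    documents retrieved so far (ties broken alphabetically).
    Returns (retrieved documents by id, list of keywords issued).
    """
    retrieved = {}
    issued = []
    candidates = Counter()
    keyword = seed
    while keyword is not None and len(issued) < max_queries:
        issued.append(keyword)
        for doc_id, text in search(keyword).items():
            if doc_id not in retrieved:
                retrieved[doc_id] = text
                # Count each word once per newly retrieved document.
                candidates.update(set(text.lower().split()))
        unused = [(-count, word) for word, count in candidates.items()
                  if word not in issued]
        keyword = min(unused)[1] if unused else None
    return retrieved, issued


def coverage(retrieved, total_documents):
    """Fraction of the corpus retrieved (known here only because the corpus is a toy)."""
    return len(retrieved) / total_documents
```

On a real site the total number of hidden documents is usually unknown, so coverage would have to be estimated rather than computed directly as it is here. The sketch also ignores practical limits such as result caps per query and stopword handling.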