Publication | Closed Access
Crawler Detection: A Bayesian Approach
Citations: 21
References: 7
Year: 2006
Venue: Unknown
Artificial Intelligence, Search Engine Optimization, Engineering, Intelligent Systems, Web Agent, Information Retrieval, Data Science, Data Mining, Robot Learning, Probabilistic Model, Bayesian Network, Computer Science, Search Engine Design, Access Log Analysis, Web Mining, Bayesian Statistics, Log Analysis, Statistical Inference, Probabilistic Modeling Approach, Robotics, Crawler Detection, Distributed Search Engine
In this paper, we introduce a probabilistic modeling approach to the problem of detecting Web robots from Web-server access logs. More specifically, we construct a Bayesian network that automatically classifies access-log sessions as crawler- or human-induced by combining various pieces of evidence proven to characterize crawler and human behavior. Our approach uses machine learning techniques to determine the parameters of the probabilistic model. We apply our method to real Web-server logs and obtain results that demonstrate the robustness and effectiveness of probabilistic reasoning for crawler detection.
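As a rough illustration of combining several pieces of session evidence into a single crawler-versus-human posterior, the sketch below uses a naive Bayes classifier, which is the simplest special case of a Bayesian network. It is not the network described in the paper: the three binary features, the function names `train` and `posterior`, the Laplace smoothing constant, and the toy labelled sessions are all assumptions made up for this example.

```python
import math

# Hypothetical binary session features (assumed for illustration, not from the paper).
FEATURES = ("requested_robots_txt", "no_image_requests", "empty_referrer")

def train(sessions, labels, alpha=1.0):
    """Learn class priors and P(feature=1 | class) with Laplace smoothing."""
    prior, cond = {}, {}
    for c in sorted(set(labels)):
        rows = [s for s, y in zip(sessions, labels) if y == c]
        prior[c] = len(rows) / len(labels)
        cond[c] = {
            f: (sum(r[f] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for f in FEATURES
        }
    return prior, cond

def posterior(session, prior, cond):
    """Return P(class | evidence) for one session via Bayes' rule."""
    log_joint = {}
    for c in prior:
        lp = math.log(prior[c])
        for f in FEATURES:
            p = cond[c][f]
            lp += math.log(p if session[f] else 1.0 - p)
        log_joint[c] = lp
    # Normalise in log space for numerical stability.
    m = max(log_joint.values())
    z = sum(math.exp(v - m) for v in log_joint.values())
    return {c: math.exp(v - m) / z for c, v in log_joint.items()}

def make(robots, no_img, no_ref):
    """Build one session's feature dictionary from three 0/1 flags."""
    return dict(zip(FEATURES, (robots, no_img, no_ref)))

# Made-up toy training data, purely to exercise the code.
TOY_SESSIONS = [make(1, 1, 1), make(1, 1, 0), make(0, 1, 1),
                make(0, 0, 0), make(0, 0, 1), make(0, 0, 0)]
TOY_LABELS = ["crawler"] * 3 + ["human"] * 3

PRIOR, COND = train(TOY_SESSIONS, TOY_LABELS)
```

Calling `posterior(make(1, 1, 1), PRIOR, COND)` returns a dictionary of class probabilities that sum to one. A full Bayesian network would additionally model dependencies between the evidence variables, which naive Bayes assumes away.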