Publication | Closed Access

Segmenting Webpage with Gomory-Hu Tree Based Clustering

Citations: 23

References: 15

Year: 2011

Abstract

We propose a novel web page segmentation algorithm based on finding the Gomory-Hu tree in a planar graph. The algorithm first distills vision and structure information from a web page to construct a weighted undirected graph, whose vertices are the leaf nodes of the DOM tree and whose edges represent the visible position relationship between vertices. Then it partitions the graph with the Gomory-Hu tree based clustering algorithm. Experimental results show that our algorithm achieves much higher precision and recall than both VIPS and Chakrabarti et al.’s graph-theoretic algorithm, and that its running time is far lower than that of Chakrabarti et al.’s algorithm.
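The two steps described in the abstract (build a weighted undirected graph, then cluster it through its Gomory-Hu tree) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The node names and edge weights are invented for illustration, the tree is built with Gusfield's general-purpose method rather than anything that exploits planarity, and the clustering rule (drop the k-1 lightest tree edges) is an assumption, since the abstract does not say how clusters are read off the tree.

```python
from collections import defaultdict, deque


def build_graph(edges):
    """Store an undirected weighted graph as a symmetric capacity map."""
    cap = defaultdict(dict)
    for u, v, w in edges:
        cap[u][v] = w
        cap[v][u] = w
    return cap


def min_cut(cap, s, t):
    """Edmonds-Karp max-flow. Returns (cut value, nodes on the s side)."""
    res = {u: dict(nbrs) for u, nbrs in cap.items()}  # residual capacities
    flow = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            # t is unreachable: the visited nodes form the s side of a min cut
            return flow, set(parent)
        bottleneck = float("inf")
        v = t
        while parent[v] is not None:
            bottleneck = min(bottleneck, res[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            res[u][v] -= bottleneck
            res[v][u] += bottleneck
            v = u
        flow += bottleneck


def gomory_hu_tree(cap):
    """Gusfield's method: n-1 min-cut computations, no graph contraction."""
    nodes = sorted(cap)
    parent = {v: nodes[0] for v in nodes}
    weight = {}
    for i in range(1, len(nodes)):
        s, t = nodes[i], parent[nodes[i]]
        weight[s], side = min_cut(cap, s, t)
        for v in nodes[i + 1:]:
            if v in side and parent[v] == t:
                parent[v] = s
    return [(v, parent[v], weight[v]) for v in nodes[1:]]


def cluster(cap, k):
    """Assumed rule: drop the k-1 lightest tree edges, keep the components."""
    tree = sorted(gomory_hu_tree(cap), key=lambda e: e[2], reverse=True)
    kept = tree[:len(tree) - (k - 1)]
    root = {v: v for v in cap}

    def find(v):
        while root[v] != v:
            root[v] = root[root[v]]
            v = root[v]
        return v

    for u, v, _ in kept:
        root[find(u)] = find(v)
    groups = defaultdict(list)
    for v in cap:
        groups[find(v)].append(v)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical DOM leaf nodes: two visually tight blocks joined by a weak link.
edges = [
    ("nav1", "nav2", 5), ("nav2", "nav3", 5), ("nav1", "nav3", 5),
    ("body1", "body2", 5), ("body2", "body3", 5), ("body1", "body3", 5),
    ("nav3", "body1", 1),
]
graph = build_graph(edges)
print(cluster(graph, 2))
# [['body1', 'body2', 'body3'], ['nav1', 'nav2', 'nav3']]
```

Each tree edge carries the minimum cut value between its endpoints in the original graph, so the lightest tree edge marks the weakest link between two groups of nodes. In this toy input that is the single weight-1 edge between the navigation block and the body block.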
