Publication | Closed Access
Rate of change and other metrics: a live study of the world wide web
Citations: 318
References: 15
Year: 1997
Venue: Unknown
Background: Caching in the World Wide Web rests on two critical assumptions: that a significant fraction of requests reaccess resources that have already been retrieved, and that those resources do not change between accesses.
Purpose: The study tested the validity of these assumptions and their dependence on characteristics of Web resources (access rate, age at time of reference, content type, resource size, and Internet top-level domain). It also measured the rate at which resources change and the prevalence of duplicate copies in the Web.
Mechanism: Using traces collected at the Internet connection points of two large corporations, the authors quantified the potential benefit of a shared proxy-caching server in a large environment. They also examined semantic differences between versions, such as the insertion or deletion of anchors, phone numbers, and email addresses.
Findings: Only 22% of the referenced resources were accessed more than once, but about half of all references were to those multiply-referenced resources. Of that half, 13% were to a resource that had been modified since the previous traced reference. Content type and rate of access strongly influence these metrics, the domain has a moderate influence, and size has little effect.
Caching in the World Wide Web is based on two critical assumptions: that a significant fraction of requests reaccess resources that have already been retrieved; and that those resources do not change between accesses. We tested the validity of these assumptions, and their dependence on characteristics of Web resources, including access rate, age at time of reference, content type, resource size, and Internet top-level domain. We also measured the rate at which resources change, and the prevalence of duplicate copies in the Web. We quantified the potential benefit of a shared proxy-caching server in a large environment by using traces that were collected at the Internet connection points for two large corporations, representing significant numbers of references. Only 22% of the resources referenced in the traces we analyzed were accessed more than once, but about half of the references were to those multiply-referenced resources. Of this half, 13% were to a resource that had been modified since the previous traced reference to it. We found that the content type and rate of access have a strong influence on these metrics, the domain has a moderate influence, and size has little effect. In addition, we studied other aspects of the rate of change, including semantic differences such as the insertion or deletion of anchors, phone numbers, and email addresses.
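The three headline figures in the abstract (share of resources accessed more than once, share of references going to those resources, and share of those references that found the resource changed) can be computed from any request trace. The following is a minimal sketch, not the authors' method: the trace format (a list of `(url, version)` pairs, where `version` is some content fingerprint such as a checksum or a Last-Modified value), the function name `reuse_metrics`, and the choice of denominator for the third metric are all assumptions made for illustration.

```python
from collections import Counter


def reuse_metrics(trace):
    """Compute reuse and change metrics for a request trace.

    `trace` is a sequence of (url, version) pairs in request order.
    This format is hypothetical; `version` stands for any hashable
    fingerprint of the content returned for that request.
    """
    trace = list(trace)
    if not trace:
        raise ValueError("trace must contain at least one reference")

    # Number of references per distinct resource.
    counts = Counter(url for url, _ in trace)
    # Resources that were referenced more than once.
    multi = {url for url, n in counts.items() if n > 1}
    # References (including first ones) to multiply-referenced resources.
    refs_to_multi = sum(1 for url, _ in trace if url in multi)

    # Count references whose version differs from the previous
    # traced reference to the same resource.
    last_seen = {}
    changed = 0
    for url, version in trace:
        if url in last_seen and last_seen[url] != version:
            changed += 1
        last_seen[url] = version

    return {
        "resources_reaccessed": len(multi) / len(counts),
        "refs_to_reaccessed": refs_to_multi / len(trace),
        # Assumed denominator: all references to multiply-referenced
        # resources, which is one reading of "of this half".
        "changed_among_reaccessed_refs": (
            changed / refs_to_multi if refs_to_multi else 0.0
        ),
    }


# Toy trace, invented for illustration only.
example = [("a", 1), ("b", 1), ("a", 1), ("a", 2), ("c", 1)]
metrics = reuse_metrics(example)
```

Taking the abstract's figures at face value, roughly 0.5 × 0.13 ≈ 6.5% of all traced references were to a resource that had changed since its previous traced reference.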