Google Scholar

Syntactic clustering of the web

AZ Broder, SC Glassman, MS Manasse… - Computer networks and …, 1997 - Elsevier

We have developed an efficient way to determine the syntactic similarity of files and have
applied it to every document on the World Wide Web. Using this mechanism, we built a
clustering of all the documents that are syntactically similar. Possible applications include a
“Lost and Found” service, filtering the results of Web searches, updating widely distributed
web-pages, and identifying violations of intellectual property rights.

Cited by 2172