Abstract
In this paper, we describe a case study in developing an access cost model for WebSources in the context of a wrapper mediator architecture. We document our experience in validating this model, noting successes and lessons learned. Using experimental query feedback data from several WebSources, we develop a Catalog and an Access Cost model. We identify characteristics of the query feedback that reflect the behavior of a particular WebSource, and we group WebSources based on these characteristics. We also characterize the Access Cost model as having High or Low Prediction Accuracy with respect to its ability to predict access costs for the WebSources. We then correlate the WebSource characteristics and groupings with the High or Low Prediction Accuracy of the model.
This research has been partially supported by the Defense Advanced Research Projects Agency under grant 01-5-28838, and the National Science Foundation under grant IRI9630102.
© 2001 Springer-Verlag Berlin Heidelberg
Cite this paper
Zadorozhny, V., Raschid, L., Zhan, T., Bright, L. (2001). Validating an Access Cost Model for Wide Area Applications. In: Batini, C., Giunchiglia, F., Giorgini, P., Mecella, M. (eds) Cooperative Information Systems. CoopIS 2001. Lecture Notes in Computer Science, vol 2172. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44751-2_28
DOI: https://doi.org/10.1007/3-540-44751-2_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42524-3
Online ISBN: 978-3-540-44751-1
eBook Packages: Springer Book Archive