Abstract
For many years, companies and laboratories have had a pressing need to process large amounts of data in areas such as astronomy, medicine, and social networks. Cloud computing provides users with a virtually infinite amount of computing resources. Cloud performance can usually be scaled up by using more nodes, more powerful nodes, or both. However, this leads to high costs and to the use of more resources than necessary. In the area of databases, caching and query rewriting are two important ways to improve performance. This paper proposes rewriting rules for aggregate queries using semantic caching in the cloud. We have implemented our proposal in the Pig system and conducted experiments in a private cloud.
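As a rough illustration of the idea of answering an aggregate query from a cached result rather than from the base data, the following Python sketch rolls up a cached per-city sum into a per-country sum. It is not the paper's rule set or its Pig implementation: the `sales` rows, the `city_to_country` mapping (standing in for a functional dependency city → country), and the helper names are all hypothetical and chosen only for this example.

```python
from collections import defaultdict

# Hypothetical base data: (city, amount) rows of a sales relation.
sales = [("Paris", 100), ("Paris", 20), ("Lyon", 80), ("Rome", 50)]

# Hypothetical functional dependency city -> country, stored as a lookup.
city_to_country = {"Paris": "France", "Lyon": "France", "Rome": "Italy"}


def sum_by(rows, key_of):
    """Compute SUM(amount) grouped by key_of(city) directly from the rows."""
    totals = defaultdict(int)
    for city, amount in rows:
        totals[key_of(city)] += amount
    return dict(totals)


def rollup_from_cache(cached_by_city, fd):
    """Answer SUM(amount) GROUP BY country using only cached per-city sums."""
    totals = defaultdict(int)
    for city, subtotal in cached_by_city.items():
        totals[fd[city]] += subtotal
    return dict(totals)


# Assume an earlier query left this per-city aggregate in the cache.
cached_by_city = sum_by(sales, lambda city: city)

# The new query is rewritten against the cache; the base rows are not rescanned.
rewritten = rollup_from_cache(cached_by_city, city_to_country)

# Reference answer computed from the base data, for comparison only.
direct = sum_by(sales, lambda city: city_to_country[city])
```

The rewrite is valid here because SUM is distributive and every city maps to exactly one country, so each cached partial sum contributes to exactly one group of the new query. For an aggregate such as AVG, the cache would need to hold both sums and counts for the same roll-up to work.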
Acknowledgments
This work is partially supported by the STIC Asia project GOD (http://home.isima.fr/god/godwiki/). We would like to sincerely thank all the colleagues at ETIS, LIMOS and LRI laboratories for the interesting discussions.
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Perriot, R., d’Orazio, L., Laurent, D., Spyratos, N. (2014). Rewriting Aggregate Queries Using Functional Dependencies within the Cloud. In: Kawtrakul, A., Laurent, D., Spyratos, N., Tanaka, Y. (eds) Information Search, Integration, and Personalization. ISIP 2013. Communications in Computer and Information Science, vol 421. Springer, Cham. https://doi.org/10.1007/978-3-319-08732-0_3
DOI: https://doi.org/10.1007/978-3-319-08732-0_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08731-3
Online ISBN: 978-3-319-08732-0
eBook Packages: Computer Science (R0)