Rewriting Aggregate Queries Using Functional Dependencies within the Cloud

  • Conference paper
Information Search, Integration, and Personalization (ISIP 2013)

Abstract

For many years, companies and laboratories have had a pressing need to process large amounts of data in areas such as astronomy, medicine, and social networks. Cloud computing provides users with a virtually infinite amount of computing resources. Cloud performance can usually be scaled up by using more nodes, more powerful nodes, or both. However, this approach incurs high costs and can consume more resources than necessary. In the area of databases, caching and query rewriting are two important ways to improve performance. This paper proposes rewriting rules for aggregate queries using semantic caching in the cloud. We have implemented our proposal in the Pig system and conducted experiments in a private cloud.
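
As a rough illustration of the general idea only (not the paper's actual rewriting rules, which are not given in this abstract), the sketch below answers a coarser aggregate query from a cached finer-grained result by exploiting a functional dependency. The names `cached_sum_by_city`, `city_to_country`, and `rewrite_sum_by_country`, as well as the data, are hypothetical and chosen purely for the example.

```python
from collections import defaultdict

# Hypothetical cache entry: the result of an earlier query equivalent to
#   SELECT city, SUM(sales) FROM facts GROUP BY city
cached_sum_by_city = {"Paris": 120, "Lyon": 80, "Rome": 50}

# Assumed functional dependency city -> country:
# every city determines exactly one country.
city_to_country = {"Paris": "France", "Lyon": "France", "Rome": "Italy"}


def rewrite_sum_by_country(cached, fd):
    """Answer SUM(sales) GROUP BY country from cached per-city sums.

    Because each city belongs to exactly one country, per-city subtotals
    can be summed per country without rescanning the base data.
    """
    totals = defaultdict(int)
    for city, subtotal in cached.items():
        totals[fd[city]] += subtotal
    return dict(totals)


result = rewrite_sum_by_country(cached_sum_by_city, city_to_country)
print(result)
```

This re-aggregation works for SUM because partial sums can be combined. An aggregate such as AVG would need extra cached information, for example per-group counts.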

Acknowledgments

This work is partially supported by the STIC Asia project GOD (http://home.isima.fr/god/godwiki/). We would like to sincerely thank all the colleagues at ETIS, LIMOS and LRI laboratories for the interesting discussions.

Author information

Corresponding author

Correspondence to Romain Perriot.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Perriot, R., d’Orazio, L., Laurent, D., Spyratos, N. (2014). Rewriting Aggregate Queries Using Functional Dependencies within the Cloud. In: Kawtrakul, A., Laurent, D., Spyratos, N., Tanaka, Y. (eds) Information Search, Integration, and Personalization. ISIP 2013. Communications in Computer and Information Science, vol 421. Springer, Cham. https://doi.org/10.1007/978-3-319-08732-0_3

  • DOI: https://doi.org/10.1007/978-3-319-08732-0_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-08731-3

  • Online ISBN: 978-3-319-08732-0

  • eBook Packages: Computer Science, Computer Science (R0)
