
MapReduce

  • Reference work entry
Encyclopedia of GIS

Synonyms

Hadoop; Hadoop MapReduce

Definition

MapReduce refers to a parallel programming model and an associated implementation for processing and generating large datasets (Dean and Ghemawat 2008). It is built on two simple concepts: mapping, which filters and sorts data (e.g., sorting words into queues, one queue for each distinct word), and reducing, which condenses the data through summary operations (e.g., counting the number of words in each queue to produce word frequencies). MapReduce has become widely used in today’s big data processing for the following reasons (Dean and Ghemawat 2008):

  • It hides the details of parallelization, fault tolerance, data locality optimization, and load balancing, and therefore relieves programmers of the burden of distributed programming;

  • A wide range of computing problems can be addressed by the MapReduce model, e.g., generating data for Google’s production web search service, sorting, data mining, machine learning, ...
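
The word-frequency example in the definition can be sketched in a few lines of plain Python. This is a single-process illustration of the model only, not Hadoop's or Google's implementation; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and chosen for this sketch. A real MapReduce framework would run the map and reduce steps in parallel across many machines and handle the grouping step itself.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit one (word, 1) pair for every word in a document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Group the emitted values by key: one queue per distinct word."""
    queues = defaultdict(list)
    for word, count in pairs:
        queues[word].append(count)
    return queues


def reduce_phase(queues):
    """Reduce: summarize each queue by summing its counts."""
    return {word: sum(counts) for word, counts in sorted(queues.items())}


def word_count(documents):
    """Run map, shuffle, and reduce sequentially over all documents."""
    pairs = []
    for document in documents:
        pairs.extend(map_phase(document))
    return reduce_phase(shuffle_phase(pairs))


if __name__ == "__main__":
    docs = ["the quick brown fox", "the lazy dog", "The fox"]
    print(word_count(docs))
```

Because each call to `map_phase` depends only on its own document, and each queue is reduced independently of the others, both steps can in principle be distributed without changing the user-written functions. That independence is what allows a framework to hide parallelization from the programmer.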


References

  • Abouzeid A, Bajda-Pawlikowski K, Abadi D, Silberschatz A, Rasin A (2009) HadoopDB: an architectural hybrid of MapReduce and DBMS technologies for analytical workloads. Proc VLDB Endow 2(1):922–933. doi:10.14778/1687627.1687731

  • Aji A, Wang F, Vo H, Lee R, Liu Q, Zhang X, Saltz J (2013) Hadoop GIS: a high performance spatial data warehousing system over mapreduce. Proc VLDB Endow 6(11):1009–1020


  • Almeer MH (2012) Cloud Hadoop map reduce for remote sensing image analysis. J Emerg Trends Comput Inf Sci 3(4):637–644


  • Cao G, Wang S, Hwang M, Padmanabhan A, Zhang Z, Soltani K (2015) A scalable framework for spatiotemporal analysis of location-based social media data. Comput Environ Urban Syst 51:70–82


  • Chen Q, Wang L, Shang Z (2008) MRGIS: a MapReduce-enabled high performance workflow system for GIS. In: IEEE fourth international conference on eScience, eScience’08, Indianapolis, 7–12 Dec 2008. IEEE, pp 646–651


  • Dean J, Ghemawat S (2004) MapReduce: simplified data processing on large clusters. In: Proceedings of the sixth symposium on operating system design and implementation, San Francisco, Dec 2004, pp 137–150


  • Eldawy A, Mokbel MF (2013) A demonstration of spatialhadoop: an efficient mapreduce framework for spatial data. Proc VLDB Endow 6(12):1230–1233


  • Ghemawat S, Gobioff H, Leung ST (2003) The Google file system. ACM SIGOPS Oper Syst Rev 37(5):29–43


  • Golpayegani N, Halem M (2009) Cloud computing for satellite data processing on high end compute clusters. In: Proceedings of IEEE 2009 international conference on cloud computing, 21–25 Sept 2009, Bangalore, pp 88–92


  • Gu R, Yang X, Yan J, Sun Y, Wang B, Yuan C, Huang Y (2014) SHadoop: improving MapReduce performance by optimizing job execution mechanism in Hadoop clusters. J Parallel Distrib Comput 74(3):2166–2179


  • Hadoop (2014) Apache Hadoop. Acquired from http://hadoop.apache.org/

  • Hive (2014) Apache Hive. Acquired from http://hive.apache.org/

  • Huang Q, Yang C (2011) Optimizing grid configuration to support geospatial processing – an example with DEM interpolation. Comput Geosci 37(2):165–176


  • Huang Q, Yang C, Benedict K, Rezgui A, Xie J, Xia J, Chen S (2013) Using adaptively coupled models and high-performance computing for enabling the computability of dust storm forecasting. Int J Geogr Inf Sci 27(4):765–784


  • Huang Q, Cervone G, Jing D, Chang C (2015) DisasterMapper: a CyberGIS framework for disaster management using social media data. In: ACM SIGSPATIAL international workshop on analytics for big geospatial data, Seattle. ACM


  • Jiang H, Chen Y, Qiao Z, Weng T-H, Li K-C (2015) Scaling up mapreduce-based big data processing on multi-GPU systems. Clust Comput 18(1):369–383


  • Li J, Jiang Y, Yang C, Huang Q, Rice M (2013) Visualizing 3D/4D environmental data using many-core graphics processing units (GPUs) and multi-core central processing units (CPUs). Comput Geosci 59:78–89. doi:10.1016/j.cageo.2013.04.029


  • Lv Z, Hu Y, Zhong H, Wu J, Li B, Zhao H (2010) Parallel K-means clustering of remote sensing images based on mapreduce. In: Web information systems and mining. Springer, Berlin/Heidelberg, pp 162–170


  • Nickolls J, Dally WJ (2010) The GPU computing era. IEEE Micro 30(2):56–69


Recommended Reading

  • Audenino P, Rognant L, Chassery JM, Planes JG (2001) Fusion strategies for high resolution urban DEM. In: Proceedings of the IEEE/ISPRS Joint Workshop on Remote Sensing and Data Fusion over Urban Areas (Cat. No.01EX482), Rome, Italy, pp. 90-94. IEEE: New Jersey


  • Dean J, Ghemawat S (2008) MapReduce: simplified data processing on large clusters. Communications of the ACM, 51(1):107–113


  • Zhao J, Tao J, Streit A (2014) Enabling collaborative MapReduce on the cloud with a single-sign-on mechanism. Computing, 98(1-2):55–72



Author information


Corresponding author

Correspondence to Qunying Huang.


Copyright information

© 2017 Springer International Publishing AG

About this entry

Cite this entry

Huang, Q. (2017). MapReduce. In: Shekhar, S., Xiong, H., Zhou, X. (eds) Encyclopedia of GIS. Springer, Cham. https://doi.org/10.1007/978-3-319-17885-1_1608
