Mariposa: a wide-area distributed database system

Published: 01 January 1996

Abstract

The requirements of wide-area distributed database systems differ dramatically from those of local-area network systems. In a wide-area network (WAN) configuration, individual sites usually report to different system administrators, have different access and charging algorithms, install site-specific data type extensions, and have different constraints on servicing remote requests. Typical of the last point are production transaction environments, which are fully engaged during normal business hours, and cannot take on additional load. Finally, there may be many sites participating in a WAN distributed DBMS.

In this world, a single program performing global query optimization using a cost-based optimizer will not work well. Cost-based optimization does not respond well to site-specific type extension, access constraints, charging algorithms, and time-of-day constraints. Furthermore, traditional cost-based distributed optimizers do not scale well to a large number of possible processing sites. Since traditional distributed DBMSs have all used cost-based optimizers, they are not appropriate in a WAN environment, and a new architecture is required.

We have proposed and implemented an economic paradigm as the solution to these issues in a new distributed DBMS called Mariposa. In this paper, we present the architecture and implementation of Mariposa and discuss early feedback on its operating characteristics.

Published In

The VLDB Journal — The International Journal on Very Large Data Bases, Volume 5, Issue 1
January 1996
95 pages

Publisher

Springer-Verlag

Berlin, Heidelberg

Author Tags

  1. Autonomy
  2. Databases
  3. Distributed systems
  4. Economic site
  5. Name service
  6. Wide-area network

Qualifiers

  • Article
