Abstract
Object-based cloud storage systems play an important role in handling big data. Available cloud storage systems address scalability, reliability or durability issues. However, there is a lack of work addressing data variety. In a previous paper, we proposed a basic architecture for an object-based, schema-oriented data storage system that stores data in an encapsulated way. The system comprises an account layer, a container layer, an object layer, a database layer and a schema layer. In this paper, that architecture is elaborated; for example, the communication protocols of the proposed system are explained. Moreover, the architecture is implemented to test its effectiveness on health data in terms of query execution performance and flexibility, on the basis of four different database queries (append, read, aggregate and delete). The results are collected on three types of datasets (table, document, file) taken from a healthcare scenario. Each type of dataset consists of four different sets of data records. The performance is compared with Amazon S3 (a bucket-oriented object-based data storage system) and Microsoft Azure (an account-container-oriented object-based data storage system). Flexibility is also analyzed against Amazon S3 with respect to three database operations (READ, WRITE and DELETE) on the three types of experimental datasets (table, document, file).
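The layering named in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the class names, the nesting of the layers (account, then container, then object, with records and a field list standing in for the database and schema layers), and the in-memory storage are hypothetical and are not taken from the paper's implementation. It only shows how the four queries (append, read, aggregate, delete) might act on data encapsulated together with its schema.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


@dataclass
class StoredObject:
    """Hypothetical object layer: records encapsulated with their schema."""
    schema: List[str]  # stand-in for the schema layer (allowed field names)
    records: List[Record] = field(default_factory=list)  # stand-in for the database layer

    def append(self, record: Record) -> None:
        # Reject fields that the schema does not declare.
        unknown = set(record) - set(self.schema)
        if unknown:
            raise ValueError(f"fields not in schema: {sorted(unknown)}")
        self.records.append(record)

    def read(self, predicate: Callable[[Record], bool] = lambda r: True) -> List[Record]:
        return [r for r in self.records if predicate(r)]

    def aggregate(self, field_name: str, fn: Callable[[List[Any]], Any]) -> Any:
        # Apply fn to the values of one field across all records that have it.
        return fn([r[field_name] for r in self.records if field_name in r])

    def delete(self, predicate: Callable[[Record], bool]) -> int:
        # Remove matching records and report how many were removed.
        before = len(self.records)
        self.records = [r for r in self.records if not predicate(r)]
        return before - len(self.records)


@dataclass
class Container:
    """Hypothetical container layer: named objects."""
    objects: Dict[str, StoredObject] = field(default_factory=dict)


@dataclass
class Account:
    """Hypothetical account layer: named containers."""
    containers: Dict[str, Container] = field(default_factory=dict)


# Usage with made-up health records (not data from the paper).
account = Account()
account.containers["clinic"] = Container()
vitals = StoredObject(schema=["patient_id", "heart_rate"])
account.containers["clinic"].objects["vitals"] = vitals

vitals.append({"patient_id": 1, "heart_rate": 72})
vitals.append({"patient_id": 2, "heart_rate": 88})
mean_hr = vitals.aggregate("heart_rate", lambda xs: sum(xs) / len(xs))
removed = vitals.delete(lambda r: r["patient_id"] == 1)
```

Keeping the schema next to the records is what lets the same four operations be applied uniformly; how the actual system maps table, document and file data onto these layers is described in the full paper, not here.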
Acknowledgements
This work was carried out under the project ‘Remote Health: A framework of healthcare services using mobile and sensor-cloud technologies.’ The project is funded by the Information Technology Research Academy (ITRA), Government of India, under ITRA-Mobile Grant ITRA/15(59)/Mobile/RemoteHealth/01, Media Lab of Asia.
Cite this article
Mondal, A.S., Neogy, S., Mukherjee, N. et al. Performance analysis of an efficient object-based schema oriented data storage system handling health data. Innovations Syst Softw Eng 16, 63–77 (2020). https://doi.org/10.1007/s11334-019-00354-2
DOI: https://doi.org/10.1007/s11334-019-00354-2