A Methodology to Evaluate Important Dimensions of Information Quality in Systems

Published: 03 June 2015

Abstract

Assessing the quality of the information proposed by an information system has become one of the major research topics in the last two decades. A quick literature survey shows that a significant number of information quality frameworks are proposed in different domains of application: management information systems, web information systems, information fusion systems, and so forth. Unfortunately, they do not provide a feasible methodology that is both simple and intuitive to be implemented in practice. In order to address this need, we present in this article a new information quality methodology. Our methodology makes use of existing frameworks and proposes a three-step process capable of tracking the quality changes through the system. In the first step and as a novelty compared to existing studies, we propose decomposing the information system into its elementary modules. Having access to each module allows us to locally define the information quality. Then, in the second step, we model each processing module by a quality transfer function, capturing the module’s influence over the information quality. In the third step, we make use of the previous two steps in order to estimate the quality of the entire information system. Thus, our methodology allows informing the end-user on both output quality and local quality. The proof of concept of our methodology has been carried out considering two applications: an automatic target recognition system and a diagnosis coding support system.
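As an illustration only, the three steps can be sketched in a few lines of Python. The abstract does not give the form of the quality transfer functions or the rule used to combine them, so the serial chain of modules, the `Quality` dictionary, the `Module` class, the `propagate` function, and every numeric value below are assumptions made for this sketch, not the authors' formulation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Quality is modeled here as criterion name -> score in [0, 1] (an assumption).
Quality = Dict[str, float]


@dataclass
class Module:
    """One elementary processing module of the decomposed system (step 1)."""
    name: str
    # Hypothetical quality transfer function (step 2): input quality -> output quality.
    transfer: Callable[[Quality], Quality]


def propagate(modules: List[Module], input_quality: Quality) -> List[Quality]:
    """Step 3 for a serial chain: return the local quality after each module."""
    trace: List[Quality] = []
    quality = input_quality
    for module in modules:
        quality = module.transfer(quality)
        trace.append(quality)
    return trace


# Illustrative modules with made-up transfer functions.
chain = [
    Module("filter", lambda q: {"accuracy": min(1.0, q["accuracy"] + 0.1),
                                "completeness": q["completeness"] * 0.5}),
    Module("classifier", lambda q: {"accuracy": q["accuracy"] * 0.5,
                                    "completeness": q["completeness"]}),
]

trace = propagate(chain, {"accuracy": 0.5, "completeness": 1.0})
for module, quality in zip(chain, trace):
    print(module.name, quality)
```

The last entry of `trace` plays the role of the output quality reported to the end-user, while the earlier entries correspond to the local quality at each module, which is the distinction the abstract draws.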

Reviews

Brian D. Goodman

As the world begins to surf down the "trough of disillusionment" with big data [1], on the journey to sorting out what can be real and what has value, we rediscover the discipline that data and information scientists have been nurturing for quite some time: assessing the quality of what has been created and displayed. This domain of interest is front and center the minute you successfully integrate a variety of data sources, enriching and producing a novel set of analytics that was never before possible. After stepping back and marveling at this creation, someone rightfully asks, "How do we measure the quality of the information we are working with?" This is at the heart of the paper. The challenge with data quality frameworks is that they tend not to be practical. Todoran et al. highlight this, and it would seem that, in practice, these frameworks are often not needed for the simplicity of what many systems are aspiring to do. However, enter the world of high velocity, high volume, and high variety, and you actually need to understand entropy and the resulting quality; this is where Todoran et al. are brilliant. Even if the specific examples are not relevant to you, they offer a three-step framework that is portable to approaching your specific situation. They have included tables for quality criteria and their measures for data and information, all of which provoke critical thinking about how to assess data quality as it flows through a system. One could even argue that their formula, well documented and exemplified, is not required in order to structure an information quality measurement strategy. If you are involved in a new or existing data mashup, where it is not enough to have just any answer but an answer that comes with statistical transparency, the authors' methodology will prove useful at a variety of levels.
Online Computing Reviews Service

Gökhan Kul

With evolving requirements and technology, databases hold large amounts of heterogeneous data, collected by different types of sensors, systems, and agents. Even though data quality evaluation research dates back many years, the big data concept has raised many issues in this field. This work presents a new methodology for assessing information quality in a system while also making use of existing frameworks. The methodology that is proposed in the paper has three phases. The first phase is a divide-and-conquer approach that decomposes a system into smaller modules to locally define the information quality. In the second phase, the authors identify a module's influence over the information quality of the overall system with a quality transfer function. The third phase estimates the entire information quality of the system by bringing together the locally defined quality of modules and their influence over the system. The strongest point of this paper is that the solid features the authors identify in the study can help other methodologies arise, and the methodology proposed in the paper is also very comprehensive and mature. Another advantage of the proposed methodology is that it computes both local and overall quality, which should be useful in cases that give users the opportunity to evaluate smaller parts of their data sources if they wish to increase the quality. The weakest point of this paper is that the authors don't provide a comparison between their method and other existing methods while claiming that the other methods treat a system as a black box without citing which methods they considered. Also, they don't consider that some systems may have only one module dealing with data, which cannot be evaluated like the bigger systems. However, these are very minor points that can be disproved or refuted. Overall, the authors have great insights on the subject matter and this paper should attract significant attention from the research community. 
The methodology proposed in the paper is very well defined and should be beneficial for all who are involved in this track of research.
Online Computing Reviews Service

Published In

Journal of Data and Information Quality, Volume 6, Issue 2-3
July 2015
60 pages
ISSN: 1936-1955
EISSN: 1936-1963
DOI: 10.1145/2788681
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 03 June 2015
Accepted: 01 March 2015
Revised: 01 January 2015
Received: 01 March 2014
Published in JDIQ Volume 6, Issue 2-3

Author Tags

  1. Information quality
  2. complex information system
  3. quality measures
  4. quality transfer function

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Direction Générale de l'Armement (French MoD)
  • Brittany Council, France

Article Metrics

  • Downloads (Last 12 months): 42
  • Downloads (Last 6 weeks): 3
Reflects downloads up to 01 Sep 2024

Cited By

  • (2024) Use of Context in Data Quality Management: a Systematic Literature Review. Journal of Data and Information Quality. DOI: 10.1145/3672082. Online publication date: 17-Jun-2024
  • (2023) Data Quality Assessment through a Preference Model. Journal of Data and Information Quality 16:1 (1-21). DOI: 10.1145/3632407. Online publication date: 29-Nov-2023
  • (2023) Why is instant messaging not instant? Understanding users’ negative use behavior of instant messaging software. Computers in Human Behavior 142 (107655). DOI: 10.1016/j.chb.2023.107655. Online publication date: May-2023
  • (2022) Towards interactive event log forensics. Information Systems 109:C. DOI: 10.1016/j.is.2022.102039. Online publication date: 1-Nov-2022
  • (2022) Modeling Context for Data Quality Management. Conceptual Modeling (325-335). DOI: 10.1007/978-3-031-17995-2_23. Online publication date: 10-Oct-2022
  • (2021) Information Quality Assessment for Data Fusion Systems. Data 6:6 (60). DOI: 10.3390/data6060060. Online publication date: 8-Jun-2021
  • (2021) Exploring big data traits and data quality dimensions for big data analytics application using partial least squares structural equation modelling. Journal of Big Data 8:1. DOI: 10.1186/s40537-021-00439-5. Online publication date: 23-Mar-2021
  • (2019) Discovering Patterns for Fact Checking in Knowledge Graphs. Journal of Data and Information Quality 11:3 (1-27). DOI: 10.1145/3286488. Online publication date: 7-May-2019
  • (2019) QUALM: Ganzheitliche Messung und Verbesserung der Datenqualität in der Textanalyse. Datenbank-Spektrum. DOI: 10.1007/s13222-019-00318-7. Online publication date: 6-Jun-2019
  • (2019) Analytics and Quality in Medical Encoding Systems. Interface Development for Learning Environments (455-470). DOI: 10.1007/978-3-030-03643-0_19. Online publication date: 2-Apr-2019
