Data management support for statistical data editing and subset selection

Published: 02 December 1981

Abstract

Statistical analysis of large data sets often involves an initial data editing and preparation phase to check the validity of individual data items, check for consistency among related data, correct erroneous data, and supply (impute) values for missing data where possible. During this preparatory phase of analysis, it is often necessary to partition the data set into a number of subsets by logical selection and/or random sampling techniques for purposes of hypothesis testing. This paper examines the data management support required by these editing and subsetting operations in terms of data descriptions, data manipulation functions, and logical and physical data structures. The design of a data management system which seeks to meet these requirements is described in detail. The system, called SDB, is built around a self-describing transposed file structure and supporting data access software. SDB representations of some logical data structures which are commonly encountered in statistical databases are also described. Experiences with a partial implementation of the system and its application in an interactive data editor have been encouraging.
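As a rough illustration of the ideas in the abstract, the following sketch stores each attribute as its own column (a transposed layout), keeps a description alongside the values, and supports a validity check, imputation, logical selection, and random sampling. It is not the SDB design: the abstract gives no details of SDB's file format or interfaces, so every name here (`Column`, `TransposedTable`, `impute_mean`, and the rest) is hypothetical, and mean imputation is only one assumed strategy among many.

```python
import random
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Column:
    """One attribute stored as its own sequence, carrying its own description."""
    name: str
    description: str                      # self-describing metadata (assumed form)
    is_valid: Callable[[float], bool]     # hypothetical per-item validity rule
    values: list = field(default_factory=list)  # None marks a missing item


class TransposedTable:
    """Hypothetical column-wise table; an illustration, not the SDB format."""

    def __init__(self, columns):
        self.columns = {c.name: c for c in columns}
        self.n_rows = 0

    def append(self, record):
        # Split the record so each value lands in its attribute's column.
        for name, col in self.columns.items():
            col.values.append(record.get(name))
        self.n_rows += 1

    def describe(self):
        # The table can report its own structure without an external schema.
        return {name: col.description for name, col in self.columns.items()}

    def invalid_rows(self, name):
        # Validity check: touches only the one column being edited.
        col = self.columns[name]
        return [i for i, v in enumerate(col.values)
                if v is not None and not col.is_valid(v)]

    def impute_mean(self, name):
        # Assumed strategy: fill missing items with the mean of present ones.
        col = self.columns[name]
        present = [v for v in col.values if v is not None]
        if not present:
            return 0
        mean = sum(present) / len(present)
        missing = [i for i, v in enumerate(col.values) if v is None]
        for i in missing:
            col.values[i] = mean
        return len(missing)

    def select(self, name, predicate):
        # Logical selection: row indices whose value satisfies the predicate.
        return [i for i, v in enumerate(self.columns[name].values)
                if v is not None and predicate(v)]

    def sample(self, k, seed=None):
        # Random sampling of k distinct row indices.
        rng = random.Random(seed)
        return sorted(rng.sample(range(self.n_rows), k))


table = TransposedTable([
    Column("age", "age in years", lambda v: 0 <= v <= 120),
    Column("income", "annual income", lambda v: v >= 0),
])
for rec in [{"age": 34, "income": 52000.0},
            {"age": -5},
            {"age": 61, "income": 48000.0}]:
    table.append(rec)
```

With this layout, `table.invalid_rows("age")` flags the record with the negative age, and `table.impute_mean("income")` fills the one missing income, each by scanning a single column. That per-attribute access pattern is the usual motivation for transposed storage in editing workloads.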

Published In

SSDBM'81: Proceedings of the 1st LBL Workshop on Statistical database management
December 1981
402 pages
ISBN: 155555222X

Publisher

Lawrence Berkeley Laboratory

Berkeley, United States

Cited By

  • (1985) A decomposition storage model. ACM SIGMOD Record 14:4 (268-279). DOI: 10.1145/971699.318923. Online publication date: 1-May-1985.
  • (1985) A decomposition storage model. Proceedings of the 1985 ACM SIGMOD international conference on Management of data (268-279). DOI: 10.1145/318898.318923. Online publication date: 1-May-1985.
  • (1983) An Analytic Approach to Statistical Databases. Proceedings of the 9th International Conference on Very Large Data Bases (260-274). DOI: 10.5555/645911.673617. Online publication date: 31-Oct-1983.
  • (1982) Statistical Databases. Proceedings of the 8th International Conference on Very Large Data Bases (208-222). DOI: 10.5555/645910.673595. Online publication date: 8-Sep-1982.
  • (1982) Metadata Management for Large Statistical Databases. Proceedings of the 8th International Conference on Very Large Data Bases (234-243). DOI: 10.5555/645910.673459. Online publication date: 8-Sep-1982.
  • (1982) A framework for research in database management for statistical analysis or a primer on statistical database management problems for computer scientists. Proceedings of the 1982 ACM SIGMOD international conference on Management of data (69-78). DOI: 10.1145/582353.582365. Online publication date: 2-Jun-1982.
