Sequentiality and prefetching in database systems

Published: 01 September 1978

Abstract

Sequentiality of access is an inherent characteristic of many database systems. We use this observation to develop an algorithm which selectively prefetches data blocks ahead of the point of reference. The number of blocks prefetched is chosen by using the empirical run length distribution and conditioning on the observed number of sequential block references immediately preceding reference to the current block. The optimal number of blocks to prefetch is estimated as a function of a number of “costs,” including the cost of accessing a block not resident in the buffer (a miss), the cost of fetching additional data blocks at fault times, and the cost of fetching blocks that are never referenced. We estimate this last cost, described as memory pollution, in two ways. We also consider whether the replacement algorithm should treat prefetched blocks as referenced or not, and find that the choice makes very little difference. Trace data taken from an operational IMS database system is analyzed and the results are presented. We show how to determine optimal block sizes. We find that anticipatory fetching of data can lead to significant improvements in system operation.
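
To make the idea of conditioning on the observed run length concrete, here is a minimal Python sketch. It is not the paper's algorithm: the function names, the one-shot cost model (a per-block fetch cost, a per-block pollution cost for unused blocks, and a saved miss cost for used blocks), and the `max_prefetch` cap are all assumptions introduced for illustration. It takes a sample of completed run lengths, restricts it to runs that reached the currently observed length `k`, and picks the prefetch count with the lowest expected net cost.

```python
from collections import Counter


def conditional_remaining(run_lengths, k):
    """Empirical distribution of the remaining run length, given that the
    current run has already reached k sequential block references.
    (Hypothetical helper; run_lengths is a sample of completed run lengths.)"""
    survivors = [length - k for length in run_lengths if length >= k]
    if not survivors:
        return {}
    counts = Counter(survivors)
    total = len(survivors)
    return {remaining: count / total for remaining, count in counts.items()}


def expected_cost(n, remaining_dist, miss_cost, fetch_cost, pollution_cost):
    """Expected net cost of prefetching n blocks under an assumed cost model:
    every prefetched block costs fetch_cost, every unused block additionally
    costs pollution_cost, and every used block saves one miss_cost."""
    cost = n * fetch_cost
    for remaining, prob in remaining_dist.items():
        used = min(n, remaining)
        wasted = n - used
        cost += prob * (wasted * pollution_cost - used * miss_cost)
    return cost


def best_prefetch(run_lengths, k, miss_cost, fetch_cost, pollution_cost,
                  max_prefetch=8):
    """Return the prefetch count in 0..max_prefetch with the lowest expected
    cost; ties go to the smaller count. Returns 0 if no sampled run reached k."""
    dist = conditional_remaining(run_lengths, k)
    if not dist:
        return 0
    return min(
        range(max_prefetch + 1),
        key=lambda n: expected_cost(n, dist, miss_cost, fetch_cost,
                                    pollution_cost),
    )


if __name__ == "__main__":
    sample_runs = [1, 1, 1, 1, 5, 5]  # made-up run lengths, not trace data
    for k in (1, 2):
        n = best_prefetch(sample_runs, k, miss_cost=10, fetch_cost=1,
                          pollution_cost=2)
        print(f"observed run length {k}: prefetch {n} blocks")
```

In this toy sample, a longer observed run shifts the conditional distribution toward long runs, so the decision changes with `k`. The sketch evaluates a single prefetch decision in isolation and ignores what happens at later faults within the same run.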

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Published in TODS Volume 3, Issue 3

Author Tags

1. IMS
2. buffer management
3. database systems
4. dynamic programming
5. paging
6. prefetching
7. sequentiality
