A fast string searching algorithm

Published: 01 October 1977

Abstract

An algorithm is presented that searches for the location, “i,” of the first occurrence of a character string, “pat,” in another string, “string.” During the search operation, the characters of pat are matched starting with the last character of pat. The information gained by starting the match at the end of the pattern often allows the algorithm to proceed in large jumps through the text being searched. Thus the algorithm has the unusual property that, in most cases, not all of the first i characters of string are inspected. The number of characters actually inspected (on the average) decreases as a function of the length of pat. For a random English pattern of length 5, the algorithm will typically inspect i/4 characters of string before finding a match at i. Furthermore, the algorithm has been implemented so that (on the average) fewer than i + patlen machine instructions are executed. These conclusions are supported with empirical evidence and a theoretical analysis of the average behavior of the algorithm. The worst case behavior of the algorithm is linear in i + patlen, assuming the availability of array space for tables linear in patlen plus the size of the alphabet.
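The right-to-left matching and the jumps described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's full algorithm: it uses only a single table indexed by character (the simplification often called the Horspool variant), so it does not carry the linear worst-case bound stated above. The names `find_first`, `shift`, and `comparisons` are assumptions made for this illustration.

```python
def find_first(pat: str, string: str) -> tuple[int, int]:
    """Return (index of the first occurrence of pat in string, or -1,
    number of character comparisons performed)."""
    patlen, n = len(pat), len(string)
    if patlen == 0:
        return 0, 0

    # For every character of pat except the one in the last position,
    # record the distance from its rightmost occurrence to the end of pat.
    # Characters absent from the table allow a jump of patlen.
    shift = {ch: patlen - 1 - k for k, ch in enumerate(pat[:-1])}

    comparisons = 0
    start = 0  # current alignment of pat[0] within string
    while start <= n - patlen:
        # Match starting with the last character of pat, moving left.
        j = patlen - 1
        while j >= 0:
            comparisons += 1
            if string[start + j] != pat[j]:
                break
            j -= 1
        if j < 0:
            return start, comparisons
        # Jump according to the text character under the last pattern position.
        start += shift.get(string[start + patlen - 1], patlen)
    return -1, comparisons


if __name__ == "__main__":
    text = "WHICH-FINALLY-HALTS.--AT-THAT-POINT"
    index, comparisons = find_first("AT-THAT", text)
    print(index, comparisons)
```

On this sample text the match is found at index 22 after 13 comparisons, fewer than the 29 characters that precede the end of the match, which is the behavior the abstract describes as inspecting fewer than the first i characters.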

Published In

Communications of the ACM, Volume 20, Issue 10
Oct. 1977
93 pages
ISSN:0001-0782
EISSN:1557-7317
DOI:10.1145/359842
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. bibliographic search
  2. computational complexity
  3. information retrieval
  4. linear time bound
  5. pattern matching
  6. text editing

Qualifiers

  • Article
