DOI: 10.5555/938980.939477

Best Practices for Convolutional Neural Networks Applied to Visual Document Analysis

Published: 03 August 2003

Abstract

Neural networks are a powerful technology for classification of visual inputs arising from documents. However, there is a confusing plethora of different neural network methods that are used in the literature and in industry. This paper describes a set of concrete best practices that document analysis researchers can use to get good results with neural networks. The most important practice is getting a training set as large as possible: we expand the training set by adding a new form of distorted data. The next most important practice is that convolutional neural networks are better suited for visual document tasks than fully connected networks. We propose that a simple "do-it-yourself" implementation of convolution with a flexible architecture is suitable for many visual document problems. This simple convolutional neural network does not require complex methods, such as momentum, weight decay, structure-dependent learning rates, averaging layers, tangent prop, or even finely-tuning the architecture. The end result is a very simple yet general architecture which can yield state-of-the-art performance for document analysis. We illustrate our claims on the MNIST set of English digit images.
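As a rough illustration of expanding a training set with distorted data, the sketch below adds warped copies of each labeled image. The abstract does not say which distortions are applied, so the choice here is an assumption: a random displacement field, smoothed with a box blur so that neighbouring pixels move together, followed by bilinear resampling. The names `smooth`, `sample`, `distort`, and `expand`, and the parameters `strength` and `radius`, are hypothetical and chosen only for this example.

```python
import math
import random

def smooth(field, radius):
    """Box-blur a 2D list so neighbouring displacements vary gradually."""
    h, w = len(field), len(field[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    total += field[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out

def sample(image, y, x):
    """Bilinear interpolation; pixels outside the image count as background (0)."""
    h, w = len(image), len(image[0])
    y0, x0 = math.floor(y), math.floor(x)
    fy, fx = y - y0, x - x0

    def px(r, c):
        return image[r][c] if 0 <= r < h and 0 <= c < w else 0.0

    top = px(y0, x0) * (1 - fx) + px(y0, x0 + 1) * fx
    bottom = px(y0 + 1, x0) * (1 - fx) + px(y0 + 1, x0 + 1) * fx
    return top * (1 - fy) + bottom * fy

def distort(image, strength=2.0, radius=2, rng=random):
    """Return a warped copy of `image` (assumed distortion, not the paper's exact recipe)."""
    h, w = len(image), len(image[0])
    # Independent random offsets per pixel, then smoothed into a coherent field.
    dx = smooth([[rng.uniform(-1, 1) for _ in range(w)] for _ in range(h)], radius)
    dy = smooth([[rng.uniform(-1, 1) for _ in range(w)] for _ in range(h)], radius)
    return [[sample(image, y + strength * dy[y][x], x + strength * dx[y][x])
             for x in range(w)] for y in range(h)]

def expand(dataset, copies, seed=0):
    """Keep every (image, label) pair and append `copies` distorted variants of each."""
    rng = random.Random(seed)
    out = list(dataset)
    for image, label in dataset:
        for _ in range(copies):
            out.append((distort(image, rng=rng), label))
    return out
```

With `copies=3`, a set of N labeled images grows to 4N, and each distorted image keeps the label of its source. Larger `strength` values move pixels further, and larger `radius` values make the warp smoother; both would need tuning so that distorted digits remain recognizable.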

Published In

ICDAR '03: Proceedings of the Seventh International Conference on Document Analysis and Recognition - Volume 2
August 2003
ISBN: 0769519601

Publisher

IEEE Computer Society, United States
