Best Practices for Convolutional Neural Networks Applied to Visual Document Analysis

Authors: Patrice Y. Simard, Dave Steinkraus, John C. Platt

ICDAR '03: Proceedings of the Seventh International Conference on Document Analysis and Recognition - Volume 2

Page 958

Published: 03 August 2003

Abstract

Neural networks are a powerful technology for classification of visual inputs arising from documents. However, there is a confusing plethora of different neural network methods that are used in the literature and in industry. This paper describes a set of concrete best practices that document analysis researchers can use to get good results with neural networks. The most important practice is getting a training set as large as possible: we expand the training set by adding a new form of distorted data. The next most important practice is that convolutional neural networks are better suited for visual document tasks than fully connected networks. We propose that a simple "do-it-yourself" implementation of convolution with a flexible architecture is suitable for many visual document problems. This simple convolutional neural network does not require complex methods, such as momentum, weight decay, structure-dependent learning rates, averaging layers, tangent prop, or even finely-tuning the architecture. The end result is a very simple yet general architecture which can yield state-of-the-art performance for document analysis. We illustrate our claims on the MNIST set of English digit images.
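The idea of enlarging a training set with distorted copies can be illustrated with a minimal sketch. The abstract does not say which distortion the authors use, so the example below substitutes plain random pixel shifts as a stand-in. The function names `shift_image` and `expand_training_set`, the parameters `copies` and `max_shift`, and the toy image are all assumptions made for illustration and are not taken from the paper.

```python
import random

def shift_image(image, dx, dy, fill=0):
    """Return a copy of `image` (a list of rows) shifted by (dx, dy) pixels.

    Positive dx moves content right, positive dy moves it down.
    Pixels shifted in from outside the frame are set to `fill`.
    """
    height, width = len(image), len(image[0])
    shifted = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            src_y, src_x = y - dy, x - dx
            if 0 <= src_y < height and 0 <= src_x < width:
                shifted[y][x] = image[src_y][src_x]
    return shifted

def expand_training_set(images, labels, copies=4, max_shift=2, seed=0):
    """Keep the originals and append `copies` shifted variants of each image.

    Assumption: a small shift does not change the class, so each variant
    reuses the label of the image it was derived from.
    """
    rng = random.Random(seed)
    out_images, out_labels = list(images), list(labels)
    for image, label in zip(images, labels):
        for _ in range(copies):
            dx = rng.randint(-max_shift, max_shift)
            dy = rng.randint(-max_shift, max_shift)
            out_images.append(shift_image(image, dx, dy))
            out_labels.append(label)
    return out_images, out_labels

# Toy 4x4 "image" with a single bright pixel (hypothetical data, not MNIST).
toy = [[0, 0, 0, 0],
       [0, 9, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0]]
aug_images, aug_labels = expand_training_set([toy], [7], copies=4, max_shift=1)
```

Here one labeled image becomes five labeled examples. A richer distortion could replace `shift_image` without changing the surrounding loop, provided it still preserves the class of the image.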

Information

ISBN: 0769519601

Publisher: IEEE Computer Society, United States
