Syntactic Pattern Recognition in Computer Vision: A Systematic Review

Authors: Gilberto Astolfi, Fábio Prestes Cesar Rezende, João Vitor De Andrade Porto, Edson Takashi Matsubara, Hemerson Pistori

ACM Computing Surveys (CSUR), Volume 54, Issue 3

Article No.: 65, Pages 1 - 35

https://doi.org/10.1145/3447241

Published: 17 April 2021

Abstract

Techniques derived from syntactic methods for visual pattern recognition are not new; they were explored extensively in the area known as syntactic or structural pattern recognition. Syntactic methods have been useful because they are intuitively simple to understand and have transparent, interpretable, and elegant representations. Their capacity to represent patterns in a semantic, hierarchical, compositional, spatial, and temporal way has made them very popular in the research community. In this article, we give an overview of how syntactic methods have been employed for computer vision tasks. We conduct a systematic literature review to survey the most relevant studies that use syntactic methods for pattern recognition tasks in images and videos. Our search returned 597 papers, of which 71 were selected for analysis. The results indicate that, in most of the studies surveyed, syntactic methods were used as a high-level structure that captures the hierarchical or semantic relationships among objects or actions in order to perform a wide variety of tasks.

Cited By

Lande K. (2024). Pictorial syntax. Mind & Language. Online publication date: 2-Jan-2024.
https://doi.org/10.1111/mila.12497

Shi Z., Jia Y., Shi G., Zhang K., Ji L., Wang D., and Wu Y. (2024). Design of Motor Skill Recognition and Hierarchical Evaluation System for Table Tennis Players. IEEE Sensors Journal 24, 4, 5303-5315. Online publication date: 15-Feb-2024.
https://doi.org/10.1109/JSEN.2023.3346880

Wang B., Li M., Peng Z., and Lu W. (2024). Hierarchical attributed graph-based generative façade parsing for high-rise residential buildings. Automation in Construction 164, 105471. Online publication date: Aug-2024.
https://doi.org/10.1016/j.autcon.2024.105471

Index Terms

1. Mathematics of computing
  1. Mathematical software
    1. Mathematical software performance

Published In

ACM Computing Surveys Volume 54, Issue 3

April 2022

836 pages

ISSN:0360-0300

EISSN:1557-7341

DOI:10.1145/3461619

Editor:
Albert Zomaya
University of Sydney, Australia

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 17 April 2021

Accepted: 01 January 2021

Revised: 01 November 2020

Received: 01 April 2020

Published in CSUR Volume 54, Issue 3

Qualifiers

Research-article
Research
Refereed

Funding Sources

Foundation for the Support and Development of Education, Science and Technology from the State of Mato Grosso do Sul, FUNDECT
Brazilian National Council of Technological and Scientific Development, CNPq
Coordination for the Improvement of Higher Education Personnel, CAPES

Article Metrics

Total Citations: 11
Total Downloads: 669
Downloads (Last 12 months): 120
Downloads (Last 6 weeks): 8

Citations

Cited By

Yuan H, Liu W (2024). Research on Pedestrian Intrusion Detection Method in Coal Mine Based on Deep Learning. Multimedia Technology and Enhanced Learning (169-183). Online publication date: 21-Feb-2024.
https://doi.org/10.1007/978-3-031-50577-5_13
Serey J, Alfaro M, Fuertes G, Vargas M, Durán C, Ternero R, Rivera R, Sabattin J (2023). Pattern Recognition and Deep Learning Technologies, Enablers of Industry 4.0, and Their Role in Engineering Research. Symmetry 15:2 (535). Online publication date: 17-Feb-2023.
https://doi.org/10.3390/sym15020535
Mou C, Xie Z, Li Y, Liu H, Yang S, Cui X (2023). Urban Carbon Price Forecasting by Fusing Remote Sensing Images and Historical Price Data. Forests 14:10 (1989). Online publication date: 3-Oct-2023.
https://doi.org/10.3390/f14101989
Yadav T, Sachdeo R (2023). Development of Optimal Hyperparameter Tuning-Cycle GAN for Photo-realistic Face Age Progression Model. International Journal on Artificial Intelligence Tools 32:07. Online publication date: 28-Nov-2023.
https://doi.org/10.1142/S0218213023500689
N C, Jha J, Sayal A, Gupta V, Gupta A (2023). A Paradigm Shift towards Computer Vision. 2023 International Conference on Device Intelligence, Computing and Communication Technologies (DICCT) (54-58). Online publication date: 17-Mar-2023.
https://doi.org/10.1109/DICCT56244.2023.10110300
Xexeo G, Santos L (2022). A Spatial Lexical Analyzer and 3D Grammars that Recognize Voxel Based Structures Using Linear Positional Grammars in Minecraft. 2022 21st Brazilian Symposium on Computer Games and Digital Entertainment (SBGames) (1-6). Online publication date: 24-Oct-2022.
https://doi.org/10.1109/SBGAMES56371.2022.9961122
Bartusiak E, Hao H, Jacobs M, Nguyen N, Chan M, Comer M, Delp E (2022). A Stochastic Grammar Approach to Predict Flight Phases of a Hypersonic Glide Vehicle. 2022 IEEE Aerospace Conference (AERO) (01-15). Online publication date: 5-Mar-2022.
https://doi.org/10.1109/AERO53065.2022.9843362
