Abstract
A higher order recurrent neural network architecture learns to recognize and generate languages after being "trained" on categorized exemplars. Studying these networks from the perspective of dynamical systems yields two interesting discoveries. First, a longitudinal examination of the learning process illustrates a new form of mechanical inference: induction by phase transition. A small weight adjustment causes a "bifurcation" in the limit behavior of the network, and this phase transition corresponds to the onset of the network's capacity for generalizing to arbitrary-length strings. Second, a study of the automata resulting from the acquisition of previously published training sets indicates that while the architecture is not guaranteed to find a minimal finite automaton consistent with the given exemplars (which is an NP-hard problem), it does appear capable of generating non-regular languages by exploiting fractal and chaotic dynamics. I end the paper with a hypothesis relating linguistic generative capacity to the behavioral regimes of non-linear dynamical systems.
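To make the dynamical-systems view concrete, the following is a minimal sketch (not the paper's trained model) of a second-order, i.e. "higher order", recurrent recognizer: each symbol of the input string selects a weight matrix that maps the current state vector to the next, so processing a string iterates a nonlinear map. All dimensions, the weight tensor `W`, the initial state, and the acceptance threshold here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 4 state units, 2 input symbols ("0" and "1", one-hot).
N_STATE, N_SYMBOLS = 4, 2

# Second-order weight tensor W[i, j, k]: the next value of state unit i
# depends multiplicatively on current state unit j and input bit k.
W = rng.normal(scale=1.0, size=(N_STATE, N_STATE, N_SYMBOLS))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def run(string, w):
    """Iterate the state map over a binary string.

    The update s <- sigmoid(sum_{j,k} w[i,j,k] * s[j] * x[k]) is a
    discrete-time dynamical system on the unit hypercube; each input
    symbol picks out one slice of w and hence one map to apply.
    """
    s = np.full(N_STATE, 0.5)           # fixed, arbitrary initial state
    for ch in string:
        x = np.eye(N_SYMBOLS)[int(ch)]  # one-hot encoding of the symbol
        s = sigmoid(np.einsum('ijk,j,k->i', w, s, x))
    return s

def accepts(string, w):
    # Accept iff a designated state unit exceeds 0.5 after the whole string.
    return bool(run(string, w)[0] > 0.5)

print(accepts("0110", W))
```

With random weights this machine accepts an arbitrary language; the paper's point is that gradient training of `W` on categorized exemplars can push these iterated maps through a bifurcation, after which the limit behavior of the dynamics, rather than a finite state table, decides arbitrarily long strings.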
Pollack, J.B. The induction of dynamical recognizers. Mach Learn 7, 227–252 (1991). https://doi.org/10.1007/BF00114845