Addressing Ravens
Abstract
The Raven's Progressive Matrices (RPM) test is a commonly used test of general human intelligence. The RPM is unusual among general intelligence tests in that it focuses on visual problem solving, and in particular, on visual similarity and analogy. We are developing a small set of methods for problem solving in the RPM which use propositional, imagistic, and multimodal representations, respectively, to investigate how different representations can contribute to visual problem solving and how the effects of their use might emerge in behavior.
Introduction
The Raven's Progressive Matrices (RPM) test¹ is a standardized intelligence test that consists of visually presented, geometric-analogy-like problems in which a matrix of geometric figures is presented with one entry missing, and the correct missing entry must be selected from a set of answer choices. Figure 1 shows an example of a problem that is similar to one of the problems in the Standard Progressive Matrices (SPM). Although the test is supposed to measure only eductive ability, or the ability to extract and understand information from a complex situation (Raven, Raven, & Court 1998), the RPM's high level of correlation with other multidomain intelligence tests has given it a position of centrality in the space of psychometric measures (Snow, Kyllonen, & Marshalek 1984), and it is therefore often used as a test of general intelligence. Using the RPM as a measure of general intelligence, though it consists only of problems in a single, nonverbal format, stands in contrast to using broader tests like the Wechsler scales, which are composed of subtests across several different verbal and nonverbal domains.
Figure 1. Example problem similar to one in the Standard Progressive Matrices (SPM) test.
Despite its widespread use, neither the computational nor the cognitive characteristics of the process of solving the RPM are well understood. Hunt (1974) gives a theoretical account of the information processing demands of certain problems from the Advanced Progressive Matrices (APM), in which he proposes two qualitatively different solution algorithms that could yield identical results on at least portions of the test: Gestalt, which uses visual representations and perceptually based operations, and Analytic, which uses feature-based representations and logical operations. Our work expands on Hunt's idea by asking whether qualitatively different systems of representation can lead to identical performance on the RPM. Our question is theoretically interesting for the study of cognition and AI because the fact that the RPM correlates so well with broader tests of intelligence suggests that the specific information processing capacities tapped by the RPM may be central to domain-general processes of reasoning used across a variety of cognitive tasks. If different schemes of representation can provide equivalent performance, then this raises the issues of 1) whether these broader reasoning processes can (or must) also be instantiated with different types of representations, 2) to what extent a single individual may (or must) draw on these different representations for reasoning tasks, and 3) to what degree individual differences may (or must) result from variations in the underlying representations being used. Our central question is also of immense practical import because the RPM family of tests is used extensively in clinical, educational, vocational, and scientific settings as an accurate assessment of intelligence. Therefore, interpretations of RPM scores should be made only with a thorough understanding of what cognitive implications these scores do and do not provide.
Copyright 2009, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
¹ A note on the Raven's family of tests: We use RPM to denote the Raven's Progressive Matrices paradigm without referring to any particular version of the test. Specific versions that we discuss include the original Standard Progressive Matrices (SPM) and the Advanced Progressive Matrices (APM), which was developed as a more difficult test to reduce the ceiling effects sometimes found with the SPM (Raven, Raven, & Court 1998). Colored Progressive Matrices (CPM) is a simpler version of the test often used with children or other less mentally able individuals; we do not address the CPM in this work.
are much higher than their Wechsler scores (Mottron 2004; Dawson et al. 2007). Individuals with Asperger's syndrome show a similar pattern (Hayashi et al. 2008). One possible explanation for these results is that individuals with autism might be predisposed towards reasoning visually (Kunda & Goel 2008a, 2008b). In this case, they might find the RPM amenable to a visual reasoning solution but the verbal Wechsler subtests very difficult. Recent neuroimaging evidence is consistent with this possibility (Soulières et al. 2009).
Figure 2. Illustration of four image transformations implicit in a two-by-two RPM problem matrix.
solve a problem by determining which of the candidate solutions yields the most analogous transformations. To this end, we explored the fractal encoding of one image in terms of another as a representational basis for calculating and discovering the underlying analogies. The mathematical derivation of fractal encoding expressly depends upon the notion of real world images, i.e. images that are two dimensional and continuous. A key observation is that all naturally occurring images we perceive appear to have similar, repeating patterns. Another observation is that no matter how closely you examine the real world, you find instances of similar structures and repeating patterns. This suggests that it is possible to describe the real world in terms other than those of shapes or traditional graphical elements, and in particular in terms which capture the similarity and repetition alone. The theorem at the heart of the fractal encoding algorithm can be stated concisely: For any particular real world image D, there exists a finite set of affine transformations T which, if applied repeatedly and indefinitely to any other real world image S, will result in the convergence of S into D.
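In symbols, the theorem can be sketched as follows. The notation here is an assumption introduced for illustration: T = {t_1, ..., t_n} is the finite set of affine maps, T(S) is the union of the images of S under each t_i, and T^{∘k} is T applied k times. Standard statements of this result (e.g. Barnsley & Hurd 1992) additionally require each t_i to be contractive, a condition the informal statement leaves implicit.

```latex
\[
T(S) = \bigcup_{i=1}^{n} t_i(S),
\qquad
\lim_{k \to \infty} T^{\circ k}(S) = D
\quad \text{for any starting image } S .
\]
```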
manipulate it geometrically (copy, rotation, or flip) and photometrically (altering the block's luminosity). While it is tempting to treat contiguous subsets of these transformations as features, note that their derivation does not follow strictly Cartesian notions (e.g. adjacent material in the destination might arise from strongly non-adjacent source material). With this in mind, we consider (in our present implementations) each of these block-level transformations to be independent of one another, and we only construct candidate fractal features for matching from single block-level transformations. Each such transform yields a very small finite set of fractal features. We generate fractal solutions to RPM problems by examining all possible pairwise transforms and calculating a measure of similarity for each pair. This metric reflects similarity as a comparison of the number of fractal features shared between candidate pairs taken in contrast to the joint number of fractal features found in each pair member (Tversky 1977). The solution is chosen as the answer that results in the highest measured similarity for both row and column pairs in the RPM problem.
to test on the matrix. The algorithm proceeds as follows, in the case of a two-by-two RPM matrix problem:
1) For each transformation Ti in memory:
   a) For the top row, check to see if Ti holds.
      i) If so, apply Ti to the bottom row to generate a guess for the missing image. Go to step 2.
      ii) If not, continue to step 1b.
   b) For the left column, check to see if Ti holds.
      i) If so, apply Ti to the right column to generate a guess for the missing image. Go to step 2.
      ii) If not, repeat step 1 with the next Ti.
      iii) If there are no more transformations in memory, then the algorithm halts.
2) For each answer choice Ai:
   a) Compute the sum-squared-difference (SSD) between the predicted image and Ai.
3) Choose the answer choice Ai that minimizes the SSD value.
Unlike the fractal method, the affine-extended method will not always produce a solution. The memory set is restricted to a finite set of possible relations in order to prevent the algorithm from stalling indefinitely on a single problem. Despite this limitation, we hypothesize that the affine-extended method will in fact be able to solve a large fraction of the SPM problems.
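The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the implementation itself: images are small lists of pixel rows, the memory holds only four simple transformations, and the function names (`ssd`, `holds`, `solve_2x2`) and the default threshold of zero are hypothetical choices made for the example.

```python
from typing import Callable, List, Optional, Sequence

Image = List[List[float]]            # grayscale image as rows of pixel intensities
Transform = Callable[[Image], Image]


def identity(img: Image) -> Image:
    return [row[:] for row in img]


def flip_horizontal(img: Image) -> Image:
    # Reflection about a vertical axis.
    return [row[::-1] for row in img]


def flip_vertical(img: Image) -> Image:
    # Reflection about a horizontal axis.
    return [row[:] for row in img[::-1]]


def rotate_90_cw(img: Image) -> Image:
    # Rotate a quarter turn clockwise.
    return [list(col) for col in zip(*img[::-1])]


def ssd(a: Image, b: Image) -> float:
    """Sum-squared-difference of pixel values; infinite if shapes differ."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        return float("inf")
    return sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def holds(t: Transform, src: Image, dst: Image, threshold: float) -> bool:
    """A transformation 'holds' if applying it to src lands within threshold of dst."""
    return ssd(t(src), dst) <= threshold


def solve_2x2(a: Image, b: Image, c: Image, answers: Sequence[Image],
              memory: Sequence[Transform], threshold: float = 0.0) -> Optional[int]:
    """Solve the matrix [[a, b], [c, ?]]; return the index of the chosen answer,
    or None if no transformation in memory holds (the algorithm halts)."""
    for t in memory:                      # step 1
        if holds(t, a, b, threshold):     # 1a: top row a -> b
            guess = t(c)                  # apply to the bottom row
        elif holds(t, a, c, threshold):   # 1b: left column a -> c
            guess = t(b)                  # apply to the right column
        else:
            continue                      # try the next transformation
        scores = [ssd(guess, ans) for ans in answers]   # step 2
        return scores.index(min(scores))                # step 3
    return None


MEMORY = [identity, flip_horizontal, flip_vertical, rotate_90_cw]

a = [[1, 2], [3, 4]]
b = flip_horizontal(a)
c = [[5, 6], [7, 8]]
answers = [c, [[6, 5], [8, 7]], [[7, 8], [5, 6]]]
chosen = solve_2x2(a, b, c, answers, MEMORY)
```

Because the loop returns on the first transformation that holds, the order of `MEMORY` acts as an implicit bias over transformations.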
or if a transformation holds for both rows and columns, the algorithm will use the first one that crosses its path, which may or may not be the correct one. Figure 3 gives an example of a problem with this kind of ambiguity; this problem is analogous to one in the actual SPM test. One transformation that could hold in this problem would be, across the top row, reflection about a vertical axis. Using this transformation on the bottom row, the predicted answer would be identical to answer #5. However, one could also have a transformation across the top row that consists of a rotation of 90° to the left and then a scaling along the vertical axis (i.e. squishing the figure down vertically). Applying this transformation to the bottom row would predict an image identical to answer #6. So which is correct? There are several ways, which we are currently investigating, in which this conflict could be resolved. First, the algorithm could just stick with its first prediction, and sometimes its biases would result in the correct answer being chosen, and sometimes not. Another option is to explicitly assign biases to the transformations in memory, where, for example, a transformation comprised of a single operation (e.g. reflection) would be examined before transformations comprised of multiple operations (e.g. rotation plus scaling). This approach puts additional requirements on the memory store and assumes that such rankings of transformations are somehow justified. A third approach is to add an additional step to the algorithm that examines the final matrix (with the predicted answer choice in place) for qualities of symmetry or other global measures. If symmetry were the measure, then the algorithm would choose answer #5 over answer #6 in the example problem shown in Figure 3. This approach raises questions as to what precisely the RPM tests are measuring, if such problem ambiguities are resolved by a fixed, global scale of optimal answers.
It could be that aspects of symmetry play a key role, in which case it would be interesting to compare the demands of computing symmetry on the RPM to the requirements for recognizing symmetry in other cognitive tasks.
Figure 3. Example problem similar to one in the Standard Progressive Matrices (SPM), illustrating ambiguity in possible solutions when using the affine-extended method.
described in their paper, the model operates on a hand-coded, symbolic description of each matrix entry. Thus, visual information is encoded implicitly in their coded matrix entries and also in the set of production rules but is nowhere explicitly accessed or used by the model.
The answer with the highest calculated similarity is deemed correct. For the arrow problem, using a 32 x 32 block size, the similarity measures for each answer are:
S(T,C1) = 21 / (21+18+24) ≈ 0.333333
S(T,C2) = 15 / (15+24+30) ≈ 0.217391
S(T,C3) = 16 / (16+23+11) = 0.32
S(T,C4) = 14 / (14+25+29) ≈ 0.205882
Therefore, the fractal method chooses #1 as its answer.
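The arithmetic behind these four values can be reproduced with a short sketch. It assumes the ratio form of Tversky's (1977) measure with unit weights, and it assumes that the three counts in each denominator are, in order, the number of shared fractal features and the numbers of features found in only one or the other member of the pair; the source gives only the numbers, so the function and variable names are hypothetical.

```python
def ratio_similarity(shared: int, only_first: int, only_second: int) -> float:
    """Ratio-model similarity: shared features over all features in either set.

    Assumes unit weights on the two sets of distinctive features.
    """
    total = shared + only_first + only_second
    return shared / total if total else 0.0


# (shared, only_first, only_second) per candidate answer, taken from the
# denominators reported for the arrow problem at a 32 x 32 block size.
counts = {
    1: (21, 18, 24),
    2: (15, 24, 30),
    3: (16, 23, 11),
    4: (14, 25, 29),
}

scores = {answer: ratio_similarity(*c) for answer, c in counts.items()}
best = max(scores, key=scores.get)   # answer with the highest similarity

for answer, score in scores.items():
    print(f"S(T,C{answer}) = {score:.6f}")
print("chosen answer:", best)
```

Under these assumptions the highest score belongs to answer 1, matching the choice reported for the fractal method.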
Preliminary Results
In this section, we will describe solutions based upon each of our two novel computational methods.
These transformations are successively tested against the images in the top row and then left column to see if they hold. To check each transformation, the first entry in the row or column is transformed accordingly and compared to the second entry using the sum-squared-difference (SSD) of the pixel intensity values. A threshold value determines whether the transformation holds, and the algorithm stops testing transformations once a valid one has been found. For the example shown in Figure 4, the SSD values for each of the first three transformations were around 150 million, which fell well above the threshold, and the SSD for the fourth transformation was exactly zero (due to the noiselessness of the input images). So, the fourth transformation was chosen as the correct one. Once a valid transformation is found, the first entry in the remaining row or column is transformed according to this rule and compared to the answer choices, again using the sum-squared-difference of the pixel values. The answer with minimum difference is chosen as correct. In this example problem, the SSD values calculated for each answer choice were as follows:
1) 0.0
2) 3.19 × 10^8
3) 3.57 × 10^8
4) 3.54 × 10^8
Discussion
Figure 4. Example problem for illustrating the execution of the fractal and affine-extended methods.
Presently, we are engaged in defining, implementing, and testing our three systems of representation on subsets of the RPM. We expect to answer these questions concerning accuracy, efficiency, and behavioral effects: Can these different methods produce the same (correct) behavior on RPM problems, and if so, on which of the problems and why? Are there intrinsic aspects of the various problems that make them amenable to solution using particular methods? Do these different methods confer processing speed advantages for certain RPM problems? What behavioral markers can we determine to distinguish among these three methods, and given these markers, how might we test for their presence in human subjects?
With the answers to these questions in hand, we aim to establish the first firm cognitive and computational account of the Raven's Progressive Matrices test using multiple-representation methods.
Acknowledgments
This research has been partially supported by an NSF grant (IIS Award #0534266) entitled "Multimodal Case-Based Reasoning in Modeling and Design"; by the Office of Naval Research through an NDSEG fellowship; and by the NSF GRFP fellowship program. We also thank Gregory Abowd for his encouragement of this work.
References
American Psychiatric Association. 2000. Diagnostic and Statistical Manual of Mental Disorders, Revised 4th Ed. Washington, DC: American Psychiatric Association.
Barnsley, M. F., & Hurd, L. P. 1992. Fractal Image Compression. Boston: A. K. Peters.
Carpenter, P. A., Just, M. A., & Shell, P. 1990. What one intelligence test measures: A theoretical account of the processing in the Raven Progressive Matrices Test. Psychological Review 97 (3): 404-431.
Davies, J., & Goel, A.K. 2001. Visual analogy in problem solving. In Proceedings of the 17th International Joint Conference on Artificial Intelligence, 377-382. Seattle, WA: Morgan Kaufmann.
Davies, J., Goel, A.K., & Yaner, P.W. 2008. Proteus: A theory of visual analogies in problem solving. Knowledge-Based Systems 21 (7): 636-654.
Dawson, M., Soulières, I., Gernsbacher, M. A., & Mottron, L. 2007. The level and nature of autistic intelligence. Psychological Science 18 (8): 657-662.
DeShon, R. P., Chan, D., & Weissbein, D. A. 1995. Verbal overshadowing effects on Raven's Advanced Progressive Matrices: Evidence for multidimensional performance determinants. Intelligence 21 (2): 135-155.
Dillon, R. F., Pohlmann, J. T., & Lohman, D. F. 1981. A factor analysis of Raven's Advanced Progressive Matrices freed of difficulty factors. Educational and Psychological Measurement 41: 1295-1302.
Hayashi, M., Kato, M., Igarashi, K., & Kashima, H. 2008. Superior fluid intelligence in children with Asperger's disorder. Brain and Cognition 66 (3): 306-310.
Hunt, E. 1974. Quote the raven? Nevermore! In Gregg, L. W. ed. Knowledge and cognition, 129-158. Hillsdale, NJ: Erlbaum.
Jolliffe, T., & Baron-Cohen, S. 1997. Are people with autism and Asperger syndrome faster than normal on the embedded figures test? Journal of Child Psychology and Psychiatry and Allied Disciplines 38 (5): 527-534.
Kosslyn, S. M., Thompson, W. L., & Ganis, G. 2006. The Case for Mental Imagery. New York, NY: Oxford University Press.
Kunda, M., & Goel, A.K. 2008a. How Thinking in Pictures can explain many characteristic behaviors of autism. In Proceedings of the 7th IEEE International Conference on Development and Learning, 304-309. Monterrey, CA.
Kunda, M., & Goel, A.K. 2008b. What can pictorial representations reveal about the cognitive characteristics of autism? In Stapleton, G., Howse, J., and Lee, J. eds. Proceedings of the 5th International Conference on the Theory and Application of Diagrams, LNAI 5223: 103-117.
Lovett, A., Forbus, K., & Usher, J. 2007. Analogy with qualitative spatial representations can simulate solving Raven's Progressive Matrices. In Proceedings from the 29th Annual Conference of the Cognitive Science Society, 449-454. Nashville, TN: Cognitive Science Society, Inc.
Mottron, L. 2004. Matching strategies in cognitive research with individuals with high-functioning autism: Current practices, instrument biases, and recommendations. Journal of Autism and Developmental Disorders 34: 19-27.
Raven, J., Raven, J. C., & Court, J. H. 1998. Manual for Raven's Progressive Matrices and Vocabulary Scales, Section 1: General Overview. San Antonio, TX: Harcourt Assessment.
Snow, R., Kyllonen, P., & Marshalek, B. 1984. The topography of ability and learning correlations. In Sternberg, R. ed. Advances in the Psychology of Human Intelligence 2: 47-103. Hillsdale, NJ: Erlbaum.
Soulières, I., Dawson, M., Samson, F., Barbeau, E. B., Sahyoun, C., Strangman, G. E., et al. 2009. Enhanced visual processing contributes to matrix reasoning in autism. Human Brain Mapping. Forthcoming.
Tversky, A. 1977. Features of similarity. Psychological Review 84 (4): 327-352.