
1 Introduction

Interactive tasks allow learners to construct knowledge in a student-centered, dynamic process that commonly involves hands-on, personal inquiry with concrete problems, and metacognitive support [1]. In mathematics education, such environments (e.g., GeoGebra, The Geometer’s Sketchpad, and Desmos) have been used in numerous educational processes and settings, overall with favorable outcomes [2].

Interactive tasks enable the learning process to be monitored not only at its end but also while it unfolds, and hence may support learners through formative assessment. One way of doing so is to provide learners with real-time feedback that relates to their performance or understanding [3]. It is therefore crucial to identify students’ strategies while they solve problems in an interactive task, in order to design a feedback system that responds to those strategies. To meet this goal, we set up the following research questions:

  1) Impact of feedback on success

    a) To what extent is immediate feedback noticed?

    b) Does noticing immediate feedback lead to success?

  2) Relations between shape categorization and success

    a) What are the differences in correct classification on first attempt between examples and non-examples for reflective symmetry?

    b) What are the differences in correct classification on first attempt between intuitive and non-intuitive examples of reflective symmetry?

  3) Relationships between solving strategies and success

    a) What is the relationship between the number of steps in which each shape was chosen to be classified for the first time and overall student success?

    b) What is the relationship between the number of steps between similar shapes being classified and overall student success?

2 Background

2.1 Concept Image

The term concept image is used “to describe the total cognitive structure that is associated with the concept, which includes all the mental pictures and associated properties and processes” [4]. This definition highlights the cognitive structure formed by the information an individual links to a concept. That information does not have to be correct and may be erroneous. Three elements can be derived from this definition: mental pictures, associated properties, and processes. In this study, we examine reflective symmetry as a topic from Euclidean geometry.

Visual information in the form of images is what we understand as mental pictures, within the framework of concept images. Learners are often presented with examples of certain concepts [5], which serves as one of the first steps for the development of geometric thought according to Van Hiele’s model [6]. Regarding reflective symmetry, there may be different kinds of plane figures, which serve as prototypical examples.

Mental pictures as prototypical (visual) examples can only represent a concept (e.g., reflective symmetry) to a certain extent and can be problematic for applying or recalling aspects of a concept [7]. This may be related to previous findings according to which learners struggle with lines of symmetry being inclined and neither horizontally nor vertically oriented [8]. This indicates a lack of concept images containing prototypes with non-horizontal/-vertical aligned lines of symmetry.

Information about properties is also part of the concept image and can be of use when mental (prototypical) images fail. Such information may include definitions on a rather formal level. Within the topic of reflective symmetry, a link to processes is likely, since younger students are often introduced to this concept by folding a plane figure on a piece of paper or by using a mirror as the line of symmetry. Only later are more formal aspects of reflective symmetry taught, extending the ideas of folding and mirroring by determining distances between points of a plane shape and the line of symmetry. Therefore, it is likely that students’ concept images consist not only of formal information but also of less formal information about actions associated with symmetry.

A valid method for investigating a concept image as a cognitive structure is the sorting task, in which learners are asked to sort items into pre-defined categories, or to come up with their own categories according to which subsets of items could be arranged. Sorting tasks serve as an effective tool to elicit issues of organization and context for investigating cognitive structures [9]. Compared with other types of tasks, such as writing or even recalling, sorting tasks are more easily accessible, and hence are better able to tap into a learner’s knowledge structure, which can be distorted by difficult production tasks [10]. Therefore, this kind of task has been used for learning and assessment in various disciplines [10,11,12]. Sorting tasks with predefined categories are relatively easy to implement as interactive tasks that support the learner with immediate feedback, as objects can be tagged a priori with their correct classification.

2.2 Intuitive and Non-intuitive (Non-) Examples

There are multiple ways of determining the (non-)reflective symmetry of an object. Keeping in mind how points are reflected, it is possible to follow rules for constructing reflections with the necessary construction tools [13]. In the case of polygons, it is sufficient to reflect the corner points of a figure, since the straight lines connecting these points are unambiguous. To do so, it is necessary to define a line across which the points are reflected (what we refer to as the line of symmetry) and to compare the resulting halves.
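The corner-point approach can be illustrated with a minimal Python sketch. It is not part of the study's applet. It assumes a convex polygon given as a list of distinct (x, y) vertices, and it tries every line through two of the polygon's vertices or edge midpoints as a candidate line of symmetry. The function names and the tolerance value are hypothetical.

```python
from itertools import combinations

def reflect(point, a, b):
    """Reflect `point` across the line through the distinct points a and b."""
    (px, py), (ax, ay), (bx, by) = point, a, b
    dx, dy = bx - ax, by - ay
    # Project the point onto the line to find the foot of the perpendicular.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    fx, fy = ax + t * dx, ay + t * dy
    return (2 * fx - px, 2 * fy - py)

def close(p, q, tol):
    """True if the two points coincide within the tolerance."""
    return abs(p[0] - q[0]) <= tol and abs(p[1] - q[1]) <= tol

def has_line_of_symmetry(vertices, tol=1e-9):
    """Check a convex polygon (assumption) for at least one line of symmetry."""
    n = len(vertices)
    midpoints = [((vertices[i][0] + vertices[(i + 1) % n][0]) / 2,
                  (vertices[i][1] + vertices[(i + 1) % n][1]) / 2)
                 for i in range(n)]
    candidates = list(vertices) + midpoints
    for a, b in combinations(candidates, 2):
        if close(a, b, tol):
            continue  # two coinciding points do not define a line
        image = [reflect(v, a, b) for v in vertices]
        # A reflection is one-to-one, so it suffices that every reflected
        # corner point lands on some original corner point.
        if all(any(close(p, q, tol) for q in vertices) for p in image):
            return True
    return False

square = [(0, 0), (2, 0), (2, 2), (0, 2)]
parallelogram = [(0, 0), (3, 0), (4, 1), (1, 1)]
print(has_line_of_symmetry(square), has_line_of_symmetry(parallelogram))
```

For convex polygons, comparing reflected corner points is enough because the polygon is fully determined by its vertices; non-convex figures would additionally require checking that edges map to edges.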

These approaches consider properties of reflective symmetry without necessarily using prototypical examples. Such examples may serve as reference objects for deriving implications of reflective symmetry and properties for other objects (e.g., a rectangle is symmetrical, therefore a square is symmetrical as this is a special case of a rectangle). This raises the question whether learners argue reflectiveness for objects individually or also make use of relationships between shapes.

Considering the different possibilities for arguing reflective symmetry, it is necessary to also take a closer look at the different types of examples to be examined. Since (prototypical) examples are part of the concept image, the question of their nature arises. We found the taxonomy developed by Tsamir et al. [14] extremely useful for our study. This taxonomy uses a 2*2 categorization of objects regarding a given property, based on them being examples/non-examples and intuitive/non-intuitive. Tsamir et al. found that about 80% of children’s and prospective elementary school teachers’ first non-example for a triangle was a circle. This led them to the assumption that some shapes are more likely than others to be stated as (non-)examples for a given geometric property. Therefore, it is assumed that intuitive (non-)examples are more easily recalled from the concept image than non-intuitive (non-)examples. Furthermore, there seems to be no need to justify intuitive (non-)examples, as their properties appear self-evident [15]. Notably, intuitive as well as non-intuitive examples and non-examples should be included in the teaching of geometric concepts [14]. Therefore, we made use of this categorization in constructing our task within the digital environment (see Sect. 3.3).

2.3 Problem-Solving Strategies in Interactive Tasks

The use of online learning environments allows for a continuous monitoring of the learning throughout the learning process. Most relevant to the current study are studies of how students are engaged with online learning environments [16]. Even more specifically, we are interested in the ways by which students interact with open-ended tasks that allow for dynamic exploration in a trial-and-error manner rather than with close-ended tasks that simply require answering questions.

One prominent example of researching problem-solving strategies in open-ended learning environments is the study of science inquiry skills, which enables science researchers and educators to detect higher-order skills like designing controlled experiments [17]. Detecting cognitive and meta-cognitive skills while learning opens a gate for further explorations of relationships between such skills and learning outcomes [18].

In the context of mathematics education, attempts have been made to analyze learners’ interaction with online learning environments in a nuanced way, using log-based measures like page visits, time on task, or repetition [19,20,21]. These attempts have succeeded, at least to some degree, in demonstrating relationships between interaction patterns and outcome measures, which may be seen as a validation of this approach. That is, real-time interaction patterns may indeed serve as a good proxy for learning.

3 Methodology

Students at elementary schools are the target audience for this interactive task. For this exploratory study, we chose a convenience sample of N = 29 elementary school students (9–12 years old) from both Israel and Germany. In Israel, n1 = 12 students were recruited through personal and professional networks of the research team. In Germany, a 4th-grade classroom was recruited, with n2 = 17 participants. In both countries, the sample contained students of different skill levels.

3.1 Research Field

Reflective symmetry in two-dimensional geometry is studied early in the elementary school grades of both countries. By 4th-grade, students in both countries are expected to understand this concept, to identify lines of symmetry, and to correctly classify shapes based on reflective symmetry-related characteristics.

3.2 Research Population

The participants in our study were 9–12 years old (M = 10, SD = 0.9), with a gender distribution of 12 female to 17 male participants. We are aware of a statistically significant difference in age between the two country-based groups (Mann-Whitney's W-value = 38, at p < 0.01, with a rank-biserial correlation of 0.63); however, none of the research variables showed a difference between these two groups. In addition, there were no gender differences between the country-based groups, with χ²(1) = 2.26, at p = 0.13. Therefore, we treated the whole population as one group.

3.3 Research Tool and Process

Our main research tool was an applet designed and developed using GeoGebra. We chose GeoGebra because it allows designing a task with great flexibility at no cost, as well as logging students’ actions (for future use). The applet presents users with seven quadrilaterals, which they are asked to classify (see Fig. 1). The quadrilaterals comprise intuitive as well as non-intuitive examples and non-examples (see Table 1). The classification is based on the existence of at least one line of symmetry and is done by dragging the shapes into one of two regions. Immediate feedback is available in the form of an updated cumulative count of correct and incorrect classifications. Users can keep dragging shapes from anywhere to anywhere on the screen. We ran the applet on either tablets or touch-screen laptops.

Fig. 1. The GeoGebra applet used in this study.

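Table 1. 2x2 shape classification [14].

                 Example          Non-example
Intuitive        Square, kite     Irregular quadrilateral, trapezoid
Non-intuitive    Tilted square    Parallelogram, rotated parallelogram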

3.4 Data Collection, Preparation, Analysis

Data Collection.

The data collection took place in early March 2022. Members of the research team met with each of the participants individually. In Israel, these meetings took place in the students’ homes, after obtaining approval from their parents; in Germany, they took place at school, after obtaining approval from the parents, the responsible teachers, and the school management. First, the researcher made sure, by asking directly, that the participant was familiar with the concept of reflective symmetry and was able to classify shapes based on this property. Then, the researcher presented the participant with a similar applet, which focused on a non-symmetry-related classification task and had the very same graphical interface, and made sure that the participant became familiar with the interface, with how to engage with the applet, and with the feedback mechanism. Finally, the researcher presented the participant with the symmetry applet and made sure the instructions were clear. The participant was then left to use the applet on their own until they stated that they were done. Each meeting lasted a few minutes (up to approx. 5). While the participant used the applet, we captured the screen, and we used these recordings for our analysis. In addition, the researcher instructed the participant to think aloud while using the applet and to continuously comment on their reasoning; we audio-recorded the participants for future analyses.

Data Preparation.

The videos were manually coded, with the basic unit of analysis being a shape-movement, that is, dragging and dropping a shape from one place on the screen to another. Overall, we had 266 shape-movements, with the number of shape-movements per participant ranging from 7 to 21 (M = 9.2, SD = 3.0). The most common movements were from the “pool” (the area where all the shapes are located when the applet is initiated) to either the symmetry area (107, 40% of all shape-movements) or the no-symmetry area (109, 41%). There were 37 shape-movements (14%) from either the symmetry or no-symmetry area to the other area, and 13 shape-movements (5%) from either of these areas back to the pool. For each shape-movement, we documented the following fields: action ID (across the whole population, to make each movement distinguishable), user ID (so that movements can be linked to the corresponding student), user-action number (count of actions for each user ID), country, object dragged, area from which the object was dragged [pool, symmetry, no symmetry], area in which the object was dropped [pool, symmetry, no symmetry], correct classification [yes, no, N/A (in case of dropping at the pool)]. These fields were used for calculating the variables.
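One possible way to represent a coded shape-movement is sketched below in Python. The coding in the study was done manually, so this is only an illustration: the class name, field names, and country codes are hypothetical, and the set of symmetric shapes follows the examples named in Sect. 4.2.

```python
from dataclasses import dataclass
from typing import Optional

AREAS = ("pool", "symmetry", "no symmetry")

# Shapes with at least one line of symmetry (the examples named in the text).
SYMMETRIC_SHAPES = {"square", "tilted square", "kite"}

@dataclass(frozen=True)
class ShapeMovement:
    action_id: int           # unique across the whole population
    user_id: str             # links the movement to a student
    user_action_number: int  # running count of actions for this student
    country: str
    shape: str               # object dragged
    from_area: str           # one of AREAS
    to_area: str             # one of AREAS

    @property
    def correct(self) -> Optional[bool]:
        """True/False for a classification, None (N/A) for a drop at the pool."""
        if self.to_area == "pool":
            return None
        placed_as_symmetric = self.to_area == "symmetry"
        return placed_as_symmetric == (self.shape in SYMMETRIC_SHAPES)

move = ShapeMovement(1, "s01", 1, "DE", "parallelogram", "pool", "symmetry")
print(move.correct)
```

Deriving the correctness field from the shape and the drop area, rather than storing it separately, keeps the record consistent with the tagged ground truth.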

Data Analysis.

Due to the relatively small population size, we used non-parametric statistical tests: the Mann-Whitney test for differences between independent groups, Wilcoxon’s Signed-Rank test for differences between paired samples, and Spearman’s rho for correlations. Analyses were conducted in JASP 0.16.
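To illustrate what Spearman’s rho measures, the following Python sketch computes it as the Pearson correlation of average ranks. The study's analyses were run in JASP, so this is not the authors' procedure; the function names and the sample values are made up for illustration.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Made-up values: step at which a shape was first moved vs. a success score.
steps = [1, 2, 3, 4, 5]
success = [0.5, 0.6, 0.7, 0.8, 0.7]
print(round(spearman_rho(steps, success), 3))
```

Because only ranks enter the computation, the coefficient is insensitive to the scale of either variable, which suits small samples with skewed step counts.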

3.5 Research Variables

Independent Variables.

The independent variables serve as proxies for students’ feedback noticing and strategies.

Feedback Noticing After [Correct/Incorrect] Classification.

For each incorrect object classification, we checked whether the student had immediately moved that object to another area [True/False]. Similarly, for each correct object classification, we checked whether the student had not moved that object in the immediate next step [True/False]. These are proxies for feedback noticing.
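A minimal Python sketch of how these two proxies could be computed from one student's ordered actions is given below. The tuple layout and function name are hypothetical, and the sketch assumes that a classification counts as an instance only if the student took another action after it.

```python
def feedback_noticing(actions):
    """Compute feedback-noticing proxies for one student.

    `actions` is the student's ordered list of (shape, to_area, correct)
    tuples, where `correct` is True, False, or None (drop at the pool).
    Returns (after_incorrect, after_correct), two lists of booleans.
    """
    after_incorrect, after_correct = [], []
    # Pair every action with the one that immediately follows it;
    # the last action has no successor and is therefore skipped.
    for (shape, _area, correct), nxt in zip(actions, actions[1:]):
        moved_again = nxt[0] == shape
        if correct is False:
            after_incorrect.append(moved_again)    # noticed = moved it again
        elif correct is True:
            after_correct.append(not moved_again)  # noticed = left it in place
    return after_incorrect, after_correct

log = [
    ("square", "symmetry", True),
    ("parallelogram", "symmetry", False),
    ("parallelogram", "no symmetry", True),
    ("kite", "no symmetry", False),
    ("trapezoid", "no symmetry", True),
]
print(feedback_noticing(log))
```

Averaging each list per student would give student-level values of the kind reported in the findings.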

First Time [Shape] Moved. For each student, each of these seven variables (one per shape) holds the serial number of the action in which that student moved this shape from the pool area for the first time (whether it was correctly or incorrectly classified).

Steps Between [Squares/Parallelograms] First Moves.

Our two pairs of similar shapes denote two different cases: the two squares are both examples for reflective symmetry, with one being intuitive and the other being non-intuitive; the two parallelograms are both non-intuitive non-examples. These two variables measure to what extent students recognized the similarity within each of these pairs of shapes, and to what extent they understood that this similarity implies that the property of symmetry (or non-symmetry) is preserved. For each student and for each of these pairs, we calculated the number of steps between first attempts to classify the two shapes within the pair.
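The two order-related variables can be sketched in Python as follows. The function names and tuple layout are hypothetical, and the sketch assumes that "number of steps between" means the absolute difference between the two step numbers, which the text does not state explicitly.

```python
def first_time_moved(actions):
    """Map each shape to the 1-based step at which it first left the pool.

    `actions` is one student's ordered list of (shape, from_area, to_area).
    """
    first = {}
    for step, (shape, from_area, _to_area) in enumerate(actions, start=1):
        if from_area == "pool" and shape not in first:
            first[shape] = step
    return first

def steps_between(first, shape_a, shape_b):
    """Steps between the first moves of two shapes (assumed: absolute difference)."""
    if shape_a not in first or shape_b not in first:
        return None  # at least one of the shapes was never moved
    return abs(first[shape_a] - first[shape_b])

log = [
    ("square", "pool", "symmetry"),
    ("kite", "pool", "no symmetry"),
    ("kite", "no symmetry", "symmetry"),
    ("tilted square", "pool", "symmetry"),
]
first = first_time_moved(log)
print(first, steps_between(first, "square", "tilted square"))
```

Under this reading, two shapes classified in consecutive steps have a distance of 1.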

Dependent Variables.

Success was a dependent variable at the student-level.

Normalized Final Correct Classifications (M = 0.79, SD = 0.19).

We calculated the ratio of final correct score to the total number of shape-movements, because not all participants achieved a perfect score of 7 correct classifications before declaring that they had finished. Otherwise, the number of shape-movements as a proxy to knowledge of (reflective) symmetry would be sufficient. This ratio normalized the final score. Note that there were no significant differences in success between the two country-based groups, with Mann-Whitney’s W-value = 119.5, at p = 0.45.
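As a small worked illustration of this normalization, the Python sketch below applies the ratio to three made-up students; the function name and the numbers are hypothetical.

```python
def normalized_final_correct(final_correct, total_movements):
    """Ratio of the final number of correct classifications to all shape-movements."""
    if total_movements <= 0:
        raise ValueError("a student must have made at least one shape-movement")
    return final_correct / total_movements

# Three made-up students: (final correct score, total shape-movements).
students = {"A": (7, 7), "B": (7, 10), "C": (5, 7)}
for name, (score, moves) in students.items():
    print(name, round(normalized_final_correct(score, moves), 2))
```

Student A reaches a perfect score without any correction and gets 1.0, while student B needs three extra movements to reach the same final score and gets 0.7. Student C stops with only five correct classifications, which the raw number of movements alone would not reveal.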

4 Findings

4.1 Feedback Noticing (RQ1)

Noticing Immediate Feedback (RQ1a).

Overall, we had 50 instances relevant to identifying feedback noticing after incorrect classification, that is, cases in which a student incorrectly classified an item and took another action after it; of these, in 18 cases (36%) the feedback was unnoticed, that is, the next action did not consist of moving the incorrectly classified object. Conversely, of the 174 instances relevant to identifying feedback noticing after correct classification, students were noticing feedback, that is, did not move the correctly classified object, in 172 cases (99%).

Success and Feedback Noticing (RQ1b).

Averaging at the student level, Feedback Noticing After Incorrect Classification had a mean of 0.38 (SD = 0.38, N = 21) and was not significantly correlated with Normalized Final Correct Classifications, with ρ = 0.12, at p = 0.61. Feedback Noticing After Correct Classification was not relevant for this test, due to a ceiling effect.

4.2 Empirically Validating the Research Framework (RQ2)

This research question aims at validating our theoretical framework. We did so by testing for correct classification on first attempt, taking into consideration whether each shape is an example or non-example for reflective symmetry, and whether it is an intuitive or non-intuitive (non-)example (see Table 2). Overall, the square (intuitive example) and the irregular quadrilateral (intuitive non-example) were the easiest to classify, with all but one of the participants succeeding on their first attempt (however, at different stages of the process, as reported below in Sect. 4.3). The parallelogram and rotated parallelogram (non-intuitive non-examples) were the most difficult to classify, with only 18 and 16 of the participants (respectively) succeeding on their first attempt.

Table 2. Number (%) of participants who correctly classified each of the quadrilaterals on first attempt, and the average step number in which each shape was first chosen for classification.

Differences in Difficulty Between Examples and Non-Examples (RQ2a).

We counted the cases in which each of the example shapes (square, tilted square, kite) and non-example shapes (irregular, trapezoid, parallelogram, rotated parallelogram) was correctly classified on first attempt. The examples were correctly classified in 85% of the cases (74 of 87), while the non-examples were correctly classified in only 76% of the cases (88 of 116). This difference is, however, not statistically significant, with χ² = 2.6, at p = 0.11. Findings are summarized in Table 3 (right).

Differences in Difficulty Between Intuitive and Non-Intuitive (RQ2b).

We counted the cases in which each of the intuitive (square, kite, irregular, trapezoid) and non-intuitive (tilted square, parallelogram, rotated parallelogram) shapes was correctly classified on first attempt. The intuitive shapes were correctly classified in 88% of the cases (102 of 116), while the non-intuitive shapes were correctly classified in only 69% of the cases (60 of 87). This difference is statistically significant, with χ² = 11.1, at p < 0.05. Findings are summarized in Table 3 (left).

Table 3. Correctness in first attempt, by (non-)intuitive (left) and (non-)example (right)
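The two comparisons above can be recomputed from the reported counts with a short Python sketch. It uses Pearson's chi-square for a 2x2 table without continuity correction, which is an assumption about the test variant, although it appears to reproduce the reported statistics. The function name is hypothetical.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (chi2, p), where p is the upper-tail probability with 1 degree
    of freedom, computed as erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Rows: shape category; columns: correct / incorrect on first attempt.
intuitive_vs_non = chi_square_2x2(102, 116 - 102, 60, 87 - 60)
example_vs_non = chi_square_2x2(74, 87 - 74, 88, 116 - 88)
print("intuitive vs. non-intuitive: chi2 = %.1f, p = %.4f" % intuitive_vs_non)
print("example vs. non-example:     chi2 = %.1f, p = %.2f" % example_vs_non)
```

The incorrect counts are obtained by subtracting the correct counts from the totals given in the text (116 and 87 first attempts, respectively).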

4.3 Relationships Between Solving Strategies and Success (RQ3)

First Shape Chosen to be Classified (RQ3a).

Notably, the square was by far the most popular shape to be classified on the very first attempt, with 69% of the participants (20 of 29) doing so. Other shapes were each chosen by only 1–3 of the participants to be classified on first attempt. Therefore, the seven variables accounting for the step number in which each of the shapes was first chosen to be classified are more indicative of students’ order strategy than the first shape alone. From these variables, we observed that, on average, the tilted square was chosen for classification relatively early in the process, while the trapezoid and the kite were chosen for classification relatively late. Findings are summarized in Table 2.

When testing for correlations between each of these seven variables and Normalized Final Correct Classifications, we found statistically significant relationships for three of them. The later the parallelogram was first chosen to be classified, the higher was student success, with ρ = 0.39, at p < 0.05. The earlier the kite or the tilted square was first chosen to be classified, the higher was student success, with ρ = -0.41 and ρ = -0.38, respectively, both at p < 0.05.

Importantly, we observed that the German participants tended to classify the kite earlier in the process than the Israeli participants (M = 4.5, SD = 1.7 and M = 8.0, SD = 4.4, respectively). This could be explained by the fact that the kite, an intuitive example, was located on the left-hand side of the pool area, given that German is a left-to-right language while Hebrew is a right-to-left language. However, in-depth studies on cultural differences are still pending.

Other shapes showed no such difference; this included the other intuitive example, i.e., the square, probably due to a ceiling effect, as it was the shape most frequently classified on the very first attempt.

Steps Between Attempts to Classify Pairs of Similar Shapes (RQ3b).

On average, Steps Between Squares First Moves was 2.6 (SD = 1.9), and Steps Between Parallelograms First Moves was 1.9 (SD = 1.3). This difference is marginally significant, with Wilcoxon’s Signed-Rank W-value = 38.5, at p = 0.07. Note that the distances between each of the pairs—as they appeared on the screen—were relatively similar (see Fig. 1), hence this parameter cannot explain the difference in steps.

We found that Steps Between Squares First Moves was negatively correlated with Normalized Final Correct Classification, with ρ = -0.40, at p < 0.05, that is, the closer a student attempted to classify both squares, the higher was their success. There was no relationship between Steps Between Parallelograms First Moves and Normalized Final Correct Classification (ρ = -0.10, at p = 0.36).

These findings indicate the importance of recognizing similarity between shapes of different types. Note that the two parallelograms are both non-intuitive non-examples, while the two squares comprise one intuitive example and one non-intuitive example.

5 Discussion

We used visual information, along with immediate feedback, to study cognitive processes required for building a concept image. Using an interactive sorting task that included both intuitive and non-intuitive shapes, serving as either examples or non-examples, we explored the process of analyzing and justifying (to self) symmetry-related properties. Overall, our findings point to the importance of non-intuitive (non-)examples in this process (RQ2b). Contrary to previous findings, we found no evidence for differences in the identification of examples or non-examples (RQ2a) [14, 22]. Therefore, non-intuitive shapes should be presented to students more often. Importantly, our findings help in designing effective feedback for sorting tasks.

5.1 Recommended Strategies for Problem-Solving

Indeed, we observed a few strategies of completing the sorting task, which—based on the very nature of the problem, i.e., classifying—are overall governed by the order by which shapes were chosen to be classified. Based on our findings of the relationships between such strategies and success, we can derive a few recommendations for learners engaged in similar tasks. Note that these recommendations are relevant for different phases of the task-solving process.

Start with the Easiest Sub-Task.

Many students started by classifying the square, which at the same time was also one of the easiest shapes to classify (and therefore in practice this strategy was not associated with success, due to a ceiling effect). Notably, we found that an indication for success was relatively early attempts to classify the kite, an intuitive example (RQ3a), as well as relatively-late attempts to classify the parallelogram, a non-intuitive non-example, which led us to this recommendation.

Once a Sub-task is Correctly Accomplished, Look for Similar Sub-tasks.

Another indication for success was relatively close attempts to classify both squares (RQ3b), which also has to do with the benefit of relatively early attempts to classify the tilted square (RQ3a; as the square was very frequently chosen in the first step). Hence this recommendation, which echoes the importance of using analogies while solving problems [23]. Recently, Palmér and van Bommel [24] emphasized the importance of explicitly using similar problems as part of young children’s learning. Note that this strategy is enabled by the immediate feedback the system gives to learners. This strategy, however, is not always linked to success, as is the case with the pair of parallelograms; note that the two parallelograms are both non-intuitive non-examples, which may have made them more difficult to classify in the first place.

When Left with the More Difficult Sub-Tasks, Compare them with what has Already Been Accomplished.

Implementing the two previous strategies will leave students with the more difficult subtasks, however with a set of examples and non-examples, both intuitive and non-intuitive. This set of worked examples could help them to identify more complex similarities or dissimilarities [23].

5.2 Contribution to the Development of Automatic Feedback

The identification of beneficial strategies is of great importance for designing an automatic elaborate feedback system that would support learners in real-time, throughout the learning process. Here lies another important advantage of using interactive tasks, as they can automatically and continuously store learner actions in log files, and this data can be used to detect various behaviors, to respond to them, and to help in assessment. This is indeed our next step following the findings of this study.

While there are already designs providing elaborate feedback for similar tasks [25], our approach differs in that it provides (non-)examples instead of asking learners to elicit (non-)examples. This allows for presenting learners with non-intuitive (non-)examples, which they are unlikely to produce themselves. Additionally, it is not restricted to inner-mathematical objects, so that learners can be asked to classify extra-mathematical objects as well [26]. Often, the feedback function, even though it could immediately lead students to the correct answer, was not (immediately) used in cases of incorrect classification (RQ1a), which highlights the need to make the feedback more prominent so that it can serve as a learning opportunity. Note, however, that in our sample feedback noticing after incorrect classification was not significantly correlated with success (RQ1b). Correctly classified objects serve as reference for identifying similarities regarding the given property.

5.3 Possible Application in Intelligent Tutoring Systems

Intelligent Tutoring Systems not only provide feedback to learners but also assign further tasks according to a learner’s performance. The 2*2 framework of intuitive/non-intuitive examples/non-examples and our findings can be of use for assigning tasks. Following our finding that intuitive (non-)examples are easier to classify than non-intuitive ones, learners should be presented with intuitive objects first. Once these are classified correctly, tasks including non-intuitive objects should be provided. In case of an incorrect classification of a non-intuitive object, a similar intuitive object (from a previous task) can be presented for comparison, serving as a learning opportunity for learners to analyze their mistake. This way, learners can be systematically confronted with examples demonstrating boundaries and unusual cases representing properties of a given concept. Using the notions of (non-)intuitiveness for task assignment can also be helpful to systematically test for boundaries of learners’ knowledge.
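The assignment logic described above could be sketched in Python as follows. This is a hypothetical illustration rather than an existing tutoring system; the function name, the item dictionary layout, and the `similar_to` link are all assumptions.

```python
def choose_next(items, solved, last=None, last_correct=None):
    """Pick the tutor's next step as an (action, item_name) pair.

    `items` maps an item name to {"intuitive": bool, "similar_to": name},
    where "similar_to" is optional and names a related intuitive item.
    `solved` is the set of items the learner has classified correctly.
    """
    # After a mistake on a non-intuitive item, offer a comparison with a
    # similar intuitive item that the learner has already classified correctly.
    if last is not None and last_correct is False and not items[last]["intuitive"]:
        reference = items[last].get("similar_to")
        if reference in solved:
            return ("compare", reference)
    # Otherwise present unsolved intuitive items first, then non-intuitive ones.
    for want_intuitive in (True, False):
        for name, info in items.items():
            if info["intuitive"] == want_intuitive and name not in solved:
                return ("classify", name)
    return ("done", None)

items = {
    "square": {"intuitive": True},
    "tilted square": {"intuitive": False, "similar_to": "square"},
}
print(choose_next(items, set()))
print(choose_next(items, {"square"}, last="tilted square", last_correct=False))
```

Keeping the comparison step separate from the ordering rule makes it possible to change either policy independently, for example to probe the boundaries of a learner's knowledge with non-intuitive items earlier.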

5.4 Implications for the Design of Digital Tasks

Our findings are of importance for designing digital tasks as well. We point to the importance of using both examples and non-examples for supporting the enhancement of children’s concept image. We also emphasize the importance of incorporating both intuitive and non-intuitive (non-)examples, as they help distinguish between levels of proficiency and support beneficial feedback. Importantly, the 2*2 framework of intuitive/non-intuitive examples/non-examples can be implemented in numerous educational contexts and settings, which is one of its strengths.

5.5 Conclusions and Limitations

We demonstrated how students’ engagement with an interactive sorting task in mathematics can reveal various strategies, and that some of these strategies were beneficial for solving the task successfully. The findings of this study could serve as foundations for the development of an elaborate feedback mechanism, as well as a basis for task assignment loops in intelligent tutoring systems. Of course, this study has limitations, chiefly the rather small sample size and the examination of only one applet in a specific educational context. Despite these limitations, we believe that our findings, based on the approach we took, are meaningful.