PSLC Integrated Project Summary


Page with list of projects: PSLC Year 5 Projects

Computational Modeling and Data Mining THRUST [Ken]

Knowledge Analysis: Developing Cognitive Models of Domain-Specific Content

Abstract. Knowledge transfer is a core assumption built into the pedagogy of most educational programs from K-12 to college. It is assumed that the material learned in the fourth week of the course is retained and transfers to material taught in the eighth week of the course. This is particularly true for highly structured courses such as physics; however, the empirical literature on learning suggests that far transfer is much more difficult than traditional pedagogy assumes (for reviews, see Bransford, Brown, & Cocking, 2000; Bransford & Schwartz, 1999; Gick & Holyoak, 1983). The goal of the present paper is to reconcile these apparently incompatible beliefs. Toward that end, we will use a repository of data, taken from the Physics LearnLab, to argue that the level of granularity of the constituent knowledge components affects the detection of transfer from one domain to another.

Introduction: In well-structured domains, such as math or science, teachers often presume that the contents of one unit will transfer to units taught later in the semester; however, the learning literature is replete with evidence suggesting that transfer, especially far transfer, is difficult to achieve (Detterman, 1993). Do teachers have unrealistic expectations of their students, or are scientists looking in the wrong places to find evidence of far transfer? The primary goal of the present paper is to seek a resolution to this potential contradiction. Toward that end, we will define learning at multiple levels of granularity and show how different levels of knowledge disaggregation reveal different conclusions about the existence or non-existence of far transfer.

Results. The data analyzed for this project were taken from three semesters (Fall 2005-07) of college physics taught at the United States Naval Academy (USNA). Students used the Andes Physics Tutor to solve their homework assignments, and the data are stored in the PSLC's DataShop. For the analyses reported below (i.e., translational kinematics, translational dynamics, and rotational kinematics), the sample consisted of two hundred and twenty-one students (n = 221), who generated 76,891 transactions.

At the unit level, transfer was not identified. Translational dynamics occurred after translational kinematics, but before rotational kinematics. Therefore, we would expect the learning curve for translational dynamics to fall somewhere between those for translational and rotational kinematics. Instead, translational kinematics was the easiest of the three units, with a lower assistance score (on the first opportunity) than rotational kinematics (p = .01) and translational dynamics (p = .35).


In the introduction, we pointed out the observation that there is an apparent contradiction between the empirical results investigating far transfer and the assumptions that teachers make within their own classroom. Teachers expect that their students should retain the knowledge components over several weeks, often with many other intervening units of instruction. However, the learning literature on far transfer seems to suggest that it is a rare occasion when knowledge lasts over long retention intervals.

To resolve the discrepancy between theory and practice, we introduced the hypothesis that the granularity of the assessed knowledge plays a large role in whether transfer is observed or not. For example, when the unit was taken as the knowledge component, there was no evidence of transfer. The assistance scores associated with translational kinematics were initially lower (i.e., on the first opportunity) than those for both the translational dynamics and rotational kinematics units. This initial advantage was maintained over fourteen of the sixteen opportunities.

Because there was no evidence of any sort of transfer, we decomposed the large, unit-size knowledge components into three smaller knowledge components that corresponded to the three broad categories of user-interface elements. We repeated this process for the user interface elements that were vectors because the learning curves suggested that there was a drift toward increasing assistance score values. For the most part, the equations and scalar definitions were decreasing as the semester advanced. The vectors were disaggregated into acceleration, velocity, and displacement. These categories were more sensible because they finally corresponded to the concepts that are taught in the physics textbook.

Future work will include better understanding why the displacement vector showed such a steep learning curve. At first, students were asking for lots of help and committing many mistakes. However, after making those initial attempts, they seemed to learn how to apply this knowledge component fairly quickly. We also plan to extend our analyses to include the equations that were written. From the student's perspective, writing equations is the most important part of the course.

Learning Analysis: Developing Models of Domain-General Learning and Motivational Processes

Abstract. The purpose of this project is to study how students learn errors from examples. We apply a computational model of learning, called SimStudent, that learns cognitive skills inductively either from worked-out examples or by being tutored. In this study, we use SimStudent to study how and when erroneous skills (the skills that produce errors when applied) would be learned.

We are particularly interested in studying how differences in prior knowledge affect the nature and rate of learning. We hypothesize that when students rely on shallow, domain-general features (which we call "weak" features) as opposed to deep, more domain-specific features ("strong" features), they are more likely to make induction errors.

To test this hypothesis, we give SimStudent different sets of prior knowledge and analyze learning outcomes.

Background and Significance. This project explores how differences in prior knowledge affect the nature of student learning, particularly how students progress from knowing little, to making reasonable errors, to mostly correct performance.

We hypothesize that incorrect generalizations are more likely when students have weaker, more general prior knowledge for encoding incoming information. This knowledge is typically perceptually grounded and is in contrast to deeper or more abstract encoding knowledge. An example of such perceptually grounded prior knowledge is to recognize 3 in x/3 simply as a number instead of as a denominator. Such an interpretation might lead students to learn an inappropriate generalization such as "multiply both sides by a number in the left hand side of the equation" after observing x/3=5 gets x=15. If this generalization gets applied to an equation like 4x=2, the error of multiplying both sides by 4 is produced.

We call this type of perceptually grounded prior knowledge "weak" prior knowledge in a similar sense as Newell and Simon’s weak reasoning methods (1972). Weak knowledge can apply across domains and can yield successful results prior to domain-specific instruction. However, in contrast to "strong" domain-specific knowledge, weak knowledge is more likely to lead to incorrect conclusions.

In general, a particular example can be modeled with both weak and strong operators. For example, suppose the step from x/3=5 is demonstrated as "multiply by 3." Such a step can be explained by a strong operator getDenominator(x/3), which returns the denominator of a given fraction term; both sides are then multiplied by that number. On the other hand, the same step can be explained by a weak operator getNumberStr(x/3), which returns the left-most number in a given expression. This weak operator produces errors on problems like 3x=5, where it will multiply both sides by 3.
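The contrast between the two operators can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names mirror getDenominator and getNumberStr from the text, but the regular-expression implementations are hypothetical and are not SimStudent's actual operators.

```python
import re
from typing import Optional


def get_denominator(term: str) -> Optional[int]:
    """Strong operator (hypothetical implementation).

    Returns the denominator only when the term really is a fraction
    such as 'x/3'; otherwise it does not apply and returns None.
    """
    match = re.fullmatch(r"\s*[^/]+/\s*(\d+)\s*", term)
    return int(match.group(1)) if match else None


def get_number_str(term: str) -> Optional[int]:
    """Weak operator (hypothetical implementation).

    Returns the left-most number in the expression, regardless of
    whether that number is a denominator, a coefficient, or a constant.
    """
    match = re.search(r"\d+", term)
    return int(match.group(0)) if match else None


# Both operators explain the demonstrated step on x/3, but only the weak
# operator also fires on 3x, which yields the "multiply by 3" error.
for lhs in ("x/3", "3x"):
    print(lhs, "strong:", get_denominator(lhs), "weak:", get_number_str(lhs))
```

Because the weak operator cannot tell a coefficient from a denominator, a skill induced with it overgeneralizes from x/3=5 to 3x=5, which is the kind of induction error the hypothesis predicts.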

Learning Curves

Figure 1 shows the average step score, aggregated across the test problems and student conditions. The X-axis shows the number of training iterations.

The Weak-PK and Strong-PK conditions had similar success rates on test problems after the first 8 training problems. After that, the performance of the two conditions began to diverge. On the final test after 20 training problems, the Strong-PK condition was 82% correct while the Weak-PK was 66%, a large and statistically significant difference (t = 4.00, p < .001).

A simple fit of power law functions to the learning curves (converting success rate to log-odds) showed that the slope (or rate) of the Weak-PK learning curve (.78) is smaller (or slower) than that of the Strong-PK learning curve (.82). We then subtracted the two functions in their log-log form and verified in a linear regression analysis that the coefficient of the number of training problems (which predicts the difference in rate) is significantly greater than 0 (p < .05).
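One possible reading of this fitting procedure is sketched below: success rates are converted to log-odds, and a power law, log-odds = a * n^b, is fitted by ordinary least squares in log-log form. The functional form, the function names, and the synthetic curve are all assumptions made for illustration; the values are not the study's data, and the original analysis may have differed in its details.

```python
import math


def log_odds(p: float) -> float:
    """Convert a success rate in (0, 1) to log-odds."""
    return math.log(p / (1.0 - p))


def fit_power_law(trials, success_rates):
    """Fit log-odds = a * n**b by least squares on the log-log form.

    Assumes every success rate is above 0.5 so that the log-odds are
    positive and their logarithm is defined.
    Returns the pair (a, b), where b is the slope (learning rate).
    """
    xs = [math.log(n) for n in trials]
    ys = [math.log(log_odds(p)) for p in success_rates]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), slope


def synthetic_curve(a: float, b: float, trials):
    """Generate noise-free success rates from a known power law (synthetic)."""
    return [1.0 / (1.0 + math.exp(-a * n ** b)) for n in trials]


trials = list(range(1, 21))  # 20 training problems, as in the text
rates = synthetic_curve(0.3, 0.5, trials)  # arbitrary illustrative parameters
a_fit, b_fit = fit_power_law(trials, rates)
print(round(a_fit, 3), round(b_fit, 3))
```

Fitting each condition this way gives one slope per condition; subtracting the two fitted lines in log-log form leaves a line whose coefficient is the difference in rate, which is the quantity tested in the regression described above.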


Figure 1: Average step score after each of the 20 training problems for SimStudents with either strong or weak prior knowledge.

Error Prediction

Figure 2 shows the number of true negative predictions made on the test problems for each of the training iterations.

Surprisingly, the Weak-PK condition made as many as 22 human-like errors on the 11 test problems. The Strong-PK condition, on the other hand, made hardly any human-like errors.


Figure 2: Number of true negative predictions, which are the same errors made both by SimStudent and human students on the same step in the test problems.

Publications

  • Matsuda, N., Lee, A., Cohen, W. W., & Koedinger, K. R. (2009; to appear). A Computational Model of How Learner Errors Arise from Weak Prior Knowledge. In Conference of the Cognitive Science Society.
  • Cross referencing projects in other thrusts:
    • Mayer? Baker?

Instructional Analysis: Developing Predictive Engineering Models to Inform Instructional Event Design

Metacognition & Motivation Thrust

The work in this thrust builds on prior work started before the renewal, particularly work in the Coordinative Learning Cluster.


Past work within the Coordinative Learning Cluster emphasized two broad themes: Example-Rule Coordination and Visual-Verbal Coordination. These themes involve instruction that provides students with multiple input sources and/or prompts for multiple lines of reasoning. A good self-regulated learner needs to have the metacognitive strategies to coordinate information coming from different sources and lines of reasoning. We summarize Year 5 project results within these two themes as they address both whether providing multiple sources or reasoning prompts enhances student learning and whether metacognitive coordination processes can be supported or improved.

Example-Rule Coordination

Much of academic learning, particularly in Science, Math, Engineering, and Technology (SMET) domains but also in language learning, involves the acquisition of concepts and skills that must generalize across many situations if robust learning is to be achieved. Often instruction expresses such generalizations explicitly to students with verbal descriptions, which we call "rules" (see the top-left cell in Figure XX). It may also communicate these generalizations by providing examples (bottom-left cell). Because "learning by doing" is recognized as critical to concept and skill acquisition, typical instruction also includes opportunities for students to practice application of the rules in "problems" (bottom-right cell). All too rarely, students are asked to generate rules themselves from examples of worked-out problem solutions -- prompting students to do so is called "self-explanation" (top-right cell). The optimal combination of these four kinds of instruction (or instructional events) has been the focus of many projects that cut across math, science, and language domains. While typical instruction tends to focus on rules and practice opportunities (the main diagonal in Figure XX), these studies have now consistently demonstrated that a more balanced approach that includes at least as many examples and self-explanation opportunities leads to more robust learning.

PSLC studies in math, science, and language learning domains have been exploring the combination of worked examples and self-explanation with computer-based tutoring during problem-solving practice. These studies bring together different research traditions: 1) studies of worked examples and cognitive load theory from Educational Psychology, particularly in Europe, 2) self-explanation studies primarily from cognitive science and psychology, and 3) intelligent tutoring systems primarily from Computer Science.

As discussed in Schwonke, Renkl, Krieg, Wittwer, Aleven, & Salden (2009), past studies of worked example effects had compared worked examples against a control condition involving unsupported problem solving. This award-winning project has demonstrated the benefit of adding worked examples even in the context of a stronger control condition, namely, problem solving with the instructional support of an intelligent tutor. Students take 20% less time in the example condition and learn as much or more on a variety of robust learning measures. The project has further demonstrated that a computer tutor that automatically adapts the transition from worked examples to problem solving leads to even further gains in robust student learning (Salden, Aleven, Renkl, & Schwonke, 2009).

Reflecting the benefits of a center in general and of the PSLC infrastructure in particular, this line of research has involved 5 laboratory studies and 3 in vivo studies run in labs and classrooms in Freiburg, Germany and Pittsburgh. These studies were all run in the context of the Geometry Cognitive Tutor, which automates delivery of complex instruction, ensures reliable implementation of experimental differences, and provides rich process data (every 10 seconds) over hours of instruction. These studies involved more than 900 students and an average of 4 hours of learning time per student.

While the in vivo studies demonstrate that these substantial effects are robust to the high variability in real classroom studies, the associated lab studies allow more in-depth investigation of learning processes and learning theory. In particular, recent results from one of the lab studies reported in Schwonke et al. (2009) enhance theoretical understanding of complex human learning processes, particularly how and how deeply students choose to reason about instructional examples.

Table XX. Categories and examples of students' self-explanations

  • Principle-based explanation: The learner verbalizes and elaborates on a mathematical principle. Mentioning a principle without some elaboration would not be coded as a principle-based explanation. Example: “Oh, that is major-minor arc, that means I’ve to subtract the minor arc from 360°”
  • Visual mapping: The learner tries to relate content organized in different external representations and/or different visual tools (verbally as well as supported by gestures). Example: “Where is angle ETF. Ah, has to be this one” (learner is pointing at corresponding spots in the graphic)
  • Goal–operator combination: The learner verbalizes a (sub-)goal together with operators that help to accomplish this (sub-)goal. Example: “You can calculate this arc of a circle by subtracting 33° from 360°”

The example condition provided both more principle-based self-explanations and more visual mapping explanations, whereas the problem condition provided many more goal–operator combination explanations. Principle-based and visual mapping self-explanations are consistent with deeper processing that places more attention on the geometry rules and the non-trivial mapping of the rules to specific situations. These explanations suggest greater attention to the if-part or retrieval features of relevant knowledge components. Goal–operator explanations attend more to the arithmetic, that which must be done in problem solving (the then-part). The arithmetic processing may be strengthening prerequisite knowledge, but is not directly relevant to the target geometric content. The greater number of principle-based and visual mapping self-explanations in the example group is consistent with the theory that example study frees cognitive resources so learners can engage in deeper processing. While problem solving is beneficial later in learning (and thus the fading approach), in early learning it not only wastes time but may also put students in a performance-oriented mode (Dweck) whereby they do not process tutor instruction, which is equivalent to an example, as deeply.

Investigations of correlations between self-explanation behavior and learning outcomes further support this explanation, at least in part. Visual mapping explanations were significantly correlated with conceptual transfer, but principle-based explanations were not. Both principle-based and goal–operator explanations were significantly correlated with procedural transfer. Greater use of examples enhances deeper processing, and that processing (at least of the mapping type) leads to greater conceptual understanding and transfer. Recall that this achievement benefit was observed on top of a substantial efficiency benefit in which example students needed 20% less instructional time.

  • Schwonke, R., Renkl, A., Krieg, C., Wittwer, J., Aleven, V., & Salden, R. J. C. M. (2009). The Worked-example Effect: Not an Artefact of Lousy Control Conditions. Computers in Human Behavior, 25, 258-266.
  • Salden, R. J. C. M., Aleven, V. A. W. M. M., Renkl, A., & Schwonke, R. (2009). Worked examples and tutored problem solving: Redundant or synergistic forms of support? Topics in Cognitive Science, 1, 203-213.

The Center-mode focus on the issues of worked examples and self-explanation has allowed for cross-domain investigations of how these principles may further enhance student learning beyond already effective intelligent tutoring systems. These studies have been performed in Chemistry, Algebra, Physics, and English as a Second Language. The Chemistry studies replicated the result from Geometry that replacing half of the problems in a tutoring system with worked examples leads to more efficient learning. Across three studies, one at the college level and two at the high school level, McLaren et al. (2008) found students learned just as much but in about 20% less time. If scaled to a semester-long course, students using an example-based approach would have more than 3 free weeks!

  • McLaren, B.M., Lim, S., & Koedinger, K.R. (2008). When and How Often Should Worked Examples be Given to Students? New Results and a Summary of the Current State of Research. In B. C. Love, K. McRae, & V. M. Sloutsky (Eds.), Proceedings of the 30th Annual Conference of the Cognitive Science Society (pp. 2176-2181). Austin, TX: Cognitive Science Society.

Other PSLC studies have identified benefits of worked examples and self-explanation in Physics (Nokes ref) and Algebra (Anthony ref).

A study in English is exploring whether the benefits of worked examples and self-explanation extend from math and science domains to language learning. This study is summarized in the Cognitive Factors section below. Interestingly, while there is some evidence that self-explanation helps in early learning, it does not appear to have as strong a benefit overall. This provides an important theoretical puzzle: Under what conditions is prompted self-explanation a productive strategy? Ongoing theoretical and empirical work is investigating this question.

More Direct Support for MetaCognition

One of the main challenges of education is to help students reach meaningful and robust learning. The assistance dilemma raises the question of what form (and ‘amount’) of assistance is most effective with different learners in different stages of the learning process (Koedinger & Aleven, 2007). Instruction followed by practice is known to be very efficient for teaching novices (e.g., Koedinger, Anderson, Hadley & Mark, 1997); yet, students often acquire shallow procedural skills, and fail to acquire conceptual understanding (Aleven & Koedinger, 2002). This can be attributed, at least in part, to students using superficial features and not encoding the deep features of the domain (Chi, Feltovich & Glaser, 1981). One approach to getting students to attend to and encode the deep features is to add an invention phase prior to instruction. Invention as preparation for learning (IPL) was shown to help students better cope with novel situations that require learning (Schwartz & Martin, 2004; Sears, 2006). In this process students are presented with a dilemma in the form of contrasting cases, and attempt to invent a mathematical model to resolve this dilemma. For example, Figure 1 shows four possible pitching machines. Students are asked to invent a method that will allow them to pick the most reliable machine. The concept of contrasting cases comes from the perceptual learning literature, since these cases, when appropriately designed, emphasize differences in the deep structure of the examples (Gibson & Gibson, 1955). The invention process includes designing a model, applying it to the given set of contrasting cases, evaluating the result, and debugging the model. This iterative process is very similar to the debugging process as described by Klahr and Carver (1988; Figure 2). Unlike other inquiry-based manipulations (cf. Lehrer et al., 2001; de Jong & van Joolingen, 1998), the goal of the IPL process is not for students to discover the correct model, but to prepare them for subsequent instruction.
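The design, apply, and evaluate steps of the invention process can be illustrated with a small sketch. Everything in it is hypothetical: the four machines and their pitch positions are made-up contrasting cases (not the materials in Figure 1), and the two candidate "inventions", range and mean absolute deviation, are simply plausible methods a student might propose.

```python
from statistics import mean

# Hypothetical contrasting cases: where each machine's pitches landed,
# measured as signed distance from the target.
machines = {
    "A": [-1, 1, -1, 1, -1, 1],
    "B": [0, 0, 0, 0, 0, 3],
    "C": [-4, -2, 0, 2, 4, 0],
    "D": [-3, 3, -3, 3, -3, 3],
}


def value_range(positions):
    """Candidate invention 1: spread between the two most extreme pitches."""
    return max(positions) - min(positions)


def mean_abs_deviation(positions):
    """Candidate invention 2: average distance of a pitch from the mean."""
    center = mean(positions)
    return mean([abs(p - center) for p in positions])


def pick_most_reliable(method):
    """Apply a candidate method to every case; the smallest score wins."""
    return min(machines, key=lambda name: method(machines[name]))


# Evaluate: the two inventions disagree on these cases, which is what
# prompts a student to debug the model.
print("range picks:", pick_most_reliable(value_range))
print("mean absolute deviation picks:", pick_most_reliable(mean_abs_deviation))
```

The cases are chosen so that a single outlier in machine B separates the two methods. That disagreement exposes a deep feature (how much each pitch matters), which is the purpose of contrasting cases; as the text notes, arriving at the canonical model is not the goal.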

1) Last year we completed a study with 7 classes at Steel Valley Middle School. The results were positive, with both cognitive and motivational benefits. There is also a CogSci paper, which will be the basis for the updated Wiki page.

Domain knowledge:

  • IPL students in advanced classes were more capable of solving new-strategy items without a learning resource. In fact, in the absence of a learning resource, direct instruction students performed at floor, while IPL students performed as well as they did with the resource.
  • This effect holds when controlling for simple domain knowledge (performance on normal items in the same test).
  • This was found in multiple new-strategy items. However, all results were found on a single topic (central tendency and graphing). The single test item on the topic of variability failed to capture a difference between the groups.

Motivation: [Put below in motivation section or use as bridge to that section]

  • IPL students reported having benefited more (F=3.3, p<.07).
  • There was a significant interaction between condition and test anxiety. Test anxiety was assessed using the MSLQ (Pintrich 1999) before the study began. Students who reported lower test anxiety also reported having benefited more from IPL instruction, compared to high-anxiety students in the No Design condition.
  • IPL students stayed more often in class to work during breaks (IPL: 16%; No Design: 3%).
  • Furthermore, they did so during invention activities and not show-and-practice activities, suggesting that it is the activities that are motivating.

2) Over the year since then, we built the Invention Lab, an intelligent tutoring system for IPL. To give intelligent feedback, it uses two models:

  • A meta-cognitive model of the invention process
  • A cognitive model of the main concepts in the domain

We began this project at the beginning of the center to explore whether the benefits of tutoring could be achieved at the meta-cognitive level. Recent analyses of log files have revealed an exciting new finding. Past analysis had shown that the Help Tutor reduces students' help-seeking errors while it is in place and giving immediate feedback, but we wanted to explore whether it would have a lasting impact and reduce such errors later in the course. The second of two in vivo studies involved two different units of the Geometry Cognitive Tutor. The two units of study 2 were spread apart by one month. We collected data on students' behavior in the months between the two units and following the study. During these months students repeated previous material in preparation for statewide exams. As seen in the table below, the main effects of the help-seeking environment persisted even once students moved on to working with the native Cognitive Tutor! Overall, students who received help-seeking support during the study took more time for their actions following the study, especially for reading hints: their hint reading time before asking for an additional hint was longer by almost one effect size in the month following the study (12 vs. 8 seconds, t(44)=3.0, p<.01). Also, students who were in the Help condition did not drill down through the hints as often (average hint level: 2.2 vs. 2.6, t(51)=1.9, p=.06). These effects are more consistently significant after both units, suggesting that having the study stretched across two units provided enough time for students to better acquire domain-independent help-seeking skills.

Visual-Verbal Coordination

Overall, student progress was slower than anticipated by the experimenters or the classroom teacher. Of the 83 students working in the intelligent tutor, 31 students (11 Control, 10 Visual Highlighting, 10 Visual Cueing) reached the last instructional unit (unit 3) during the experiment. For these students, results show that the benefits of visual self-explanation for problem solving change over the course of tutoring practice (see the figure below). In the first instructional unit, students provided with visual cueing by the tutor were most accurate in their problem-solving answers (M = .89, SD = .05) compared to students in the control condition (M = .83, SD = .06) or the visual self-explanation condition (M = .85, SD = .05). Results demonstrated an overall effect of condition (F(2, 27) = 4.01, p = .03); post-hoc Bonferroni comparisons demonstrated that visual cueing significantly outperformed the control condition (p = .03) but not the visual self-explanation condition (p = .15), which fell between the two other groups. In contrast, by unit 3, students who visually self-explained the geometry principles (M = .86, SD = .08) were most accurate in their problem-solving answers, followed by the visual cueing condition (M = .83, SD = .11), and then the control condition (M = .73, SD = .10). Results again demonstrated an overall effect of condition (F(2, 27) = 4.84, p = .016); post-hoc Bonferroni comparisons showed that the control condition was outperformed by the visual self-explanation (p = .03) and the visual cueing (p = .05) conditions.

We analyzed overall posttest and delayed posttest results for students who had also taken the pretest. Posttest results demonstrated an overall improvement from pre- to posttest (F(1, 65) = 9.68, p = .03), but no significant condition differences (F<1). At delayed posttest, results suggested a test time (pretest vs. delayed posttest) by condition interaction (F(2, 37) = 2.87, p = .07). At delayed posttest (see Figure 2), students in the visual self-explanation condition outperformed students from the visual cueing condition and the control (interactive diagram) condition.


  • Butcher, K. R., & Aleven, V. A. (in press 2009). Visual self-explanation during intelligent tutoring? More than attentional focus? European Association for Research on Learning and Instruction, 13th Biennial Conference. August 25-29, 2009: Amsterdam, the Netherlands.

An ongoing challenge that educators continually face is helping their students recognize connections between related information while also appreciating distinctions between only seemingly related information. Learners who fail to recognize connections across different contexts or representations demonstrate overly specific knowledge, failing to generalize what they have learned to new circumstances. Learners who fail to distinguish between superficially related information demonstrate overly general knowledge, failing to discriminate between subtle but important features of the problems before them. In the former case, learners should refine their knowledge by removing irrelevant features from overly specific knowledge components; in the latter, the desired knowledge refinement requires adding relevant features to overly general knowledge components. One promising approach for addressing this challenge for both situations is to ask students to compare similar problem situations that highlight key commonalities and differences. In particular, comparisons of different representations of related concepts may prove especially valuable for helping students make sense of the complex visualizations often employed in introductory chemistry lessons. Carefully designed comparisons may help students coordinate information between and across multiple representations, enabling them to refine their knowledge by drawing critical distinctions and recognizing fundamental commonalities between the concepts represented. The research proposed here will examine how structured comparisons of multiple representations of chemical reactions may facilitate students’ abilities to generalize their understanding of common concepts across different representations and to discriminate between different concepts in superficially similar representations.

This project investigates the possible combined strengths of graphically-oriented (Animation Tutor) and procedurally-oriented (Cognitive Tutor) instructional software. Students in Algebra II Cognitive Tutor classrooms are randomly assigned to one of four instructional groups on constructing equations for mixture problems. Three of the instructional groups study worked examples in which they receive a verbal explanation of the solution with quantities represented by either (1) a table of values, (2) a static bar graph, or (3) a dynamic bar graph linked to the equation. The fourth group solves the same example problems using the Algebra Cognitive Tutor. Each of the example problems is followed by a test problem on the Algebra Cognitive Tutor that serves both to evaluate the different instructional conditions and to provide additional opportunities for learning. Students enter quantities into a table and use these quantities (following feedback) to construct an equation to represent the problem. Two days of instruction are followed, a week later, by (1) a Model Analysis Cognitive Tutor activity in which students are tested (with feedback) on their ability to construct equations for different problem structures and interpret the meaning of the terms in the equations; and (2) a paper-and-pencil test to measure retention and transfer. This variety of evaluation measures will help identify how the different instructional formats help students learn the various knowledge components needed to construct equations for problems represented by general linear models.

An extended summary of the study design is in this pdf ...Media:ReedHoffmanCorbettWorkExSummary.pdf

We investigate a key issue in coordinative learning, namely, how learning with multiple external representations (MERs) should be sequenced to effectively support students’ conceptual understanding. In order to benefit from MERs, learners must attain some level of fluency in interpreting and manipulating the individual representations, and must also engage in sense making across the representations to relate them and abstract underlying concepts. The question arises how tasks involving different representations should be sequenced so that both these aspects of robust learning are realized. In particular, how frequently should students switch between representations? We focus on fractions as a challenging topic area for students in which multiple representations are often used and likely to support robust learning. This research will contribute to the literature on early mathematics learning, learning with multiple representations, and learning with intelligent tutoring systems. It will also add to the portfolio of studies in the PSLC’s coordinative learning cluster.

Background & Significance

A quintessential form of coordinative learning occurs when learners work with multiple external representations (MERs) of subject matter. Accumulating evidence points towards the promise of learning with MERs (Ainsworth, Bibby, & Wood, 2002; Larkin & Simon, 1987; Seufert, 2003), and also to the need for students to make sense out of the different representations by connecting and abstracting from them (Ainsworth, 1999). This research focuses on a difficult area of early mathematics learning: fractions. Both teachers’ experiences and research in educational psychology show that students have difficulties with fraction arithmetic and with the various representations for fractions (e.g. Brinker, 1997; Callingham & Watson, 2004; Caney & Watson, 2003; Person et al., 2004; Pitta-Pantazi, Gray & Christou, 2004). Coordinating between MERs is regarded as a key process for learning across areas of mathematics (Kilpatrick, Swafford, & Findell, 2001; NCTM, 2000), including fractions (e.g. Kieren, 1993; Moss & Case, 1999; Martinie & Bay-Williams, 2003; Thompson & Saldanha, 2003). A number of authors have argued, based on observational studies, that MERs can lead to deeper conceptual understanding of fractions (Corwin et al., 1990; Cramer et al., 1997a, 1997b; Steiner & Stoeckling, 1997). However, we know of no experimental studies that have investigated the advantages of instruction with multiple (graphical) fraction representations over instruction that focuses on a single representation, with one exception: an in vivo experiment, in which 132 6th-grade students used four versions of CTAT-built tutors (Rau, Aleven, & Rummel, 2009). Students learning with MERs and prompted to self-explain performed best on a posttest and delayed posttest assessing procedural and conceptual knowledge of fractions.
At this point, however, we do not know enough about the circumstances that may influence the effectiveness of learning with multiple representations of fractions, a criticism that has been leveled against the existing body of research on learning with MERs more generally (Ainsworth, 2006; Goldman, 2003). The proposed research looks at how the development of fluency with any given representation interacts with sense making across representations. First, as Ainsworth (2006) points out, being able to interpret a particular type of representation is a prerequisite for learning from it. However, such ‘representational fluency’ does not just emerge by itself, but requires practice. Second, it is important that students engage in sense making across the different representations to relate them and integrate the information they provide (Ainsworth, 2006; Brinker, 1997; Paik, 2005; Uttal et al., 1997). According to cognitive flexibility theory (Spiro & Jehng, 1990), being presented with MERs challenges the learner to switch between different perspectives on the same concepts. Under this perspective, learning with MERs supports the development of robust – flexible and transferable – knowledge (Kaput, 1989), to the extent that learners coordinate between the representations, that is, cognitively link the information the MERs provide and abstract underlying conceptual knowledge. A key question is therefore whether learners should build up fluency with each representation first, before they engage in sense-making activities aimed at coordinating representations, or whether they develop more flexible knowledge when they become familiar with the different representations in parallel and continuously engage in sense making across representations. This potential conflict is inherent in designing instruction with MERs.


Consistent with the goals of the new Metacognition and Motivation Thrust, which will officially begin in Year 6, past PSLC projects have begun investigating motivational issues. We summarize results of projects

We are investigating what factors lead students to make specific path choices in the learning space, focusing specifically on the shallow strategy known as gaming the system, and on Off-Task Behavior. Prior PSLC research has shown that a variety of motivations, attitudes, and affective states are associated with the choice to game the system (Baker et al, 2004; Baker, 2007b; Rodrigo et al, 2007) and the choice of off-task behavior (Baker, 2007b) within intelligent tutoring systems. However, other recent research has found that differences between lessons are on the whole better predictors of gaming than differences between students (Baker, 2007), suggesting that contextual factors associated with a specific tutor unit may be the most important reason why students game the system. Hence, this project is investigating how the content and presentational/interface aspects of a learning environment influence whether students tend to choose a gaming the system strategy. An extension to this project in 2008-2009 also investigated how the content and presentational/interface aspects of a learning environment influence whether students tend to choose a gaming the system strategy.

To this end, we have annotated a large proportion of the learning events/transactions in a set of twenty units in the Algebra LearnLab with descriptions of each unit's content and interface features, using a combination of human coding and educational data mining. We then used data mining to predict gaming and off-task behavior with the content and interface features of the units they occur in. This gives us new insight into why students make specific path choices in the learning space, and explains the prior finding that path choices differ considerably between tutor units.

Findings and Explanation

The text below is taken from (Baker, 2007b; Baker et al, in press a, accepted).

The difference between lessons is a significantly better predictor than the difference between students in determining how much gaming behavior a student will engage in, in a given lesson. Put more simply, knowing which lesson a student is using is a better predictor of how much gaming will occur, than knowing which student it is.

In the Middle School Tutor, lesson has 35 parameters and achieves an r-squared of 0.55. Student has 240 parameters and achieves an r-squared of 0.16. In the Algebra Tutor, lesson has 21 parameters and achieves an r-squared of 0.18. Student achieves an equal r-squared, but with 58 parameters (one per student); hence, lesson is a statistically better predictor because it achieves equal or significantly better fit with considerably fewer parameters.
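One way to see why equal fit with fewer parameters counts as a better model is to penalize fit by parameter count. The sketch below uses adjusted r-squared for that purpose; this is an illustrative choice, not necessarily the criterion used in the original analysis. It takes the Algebra Tutor figures above (an r-squared of 0.18 with 21 parameters for lesson versus 58 for student). The number of observations, `N_OBS`, is a hypothetical value, since this summary does not report it.

```python
def adjusted_r2(r2: float, n_obs: int, n_params: int) -> float:
    """Adjusted r-squared: discounts r2 by the number of predictors used."""
    if n_obs - n_params - 1 <= 0:
        raise ValueError("need more observations than parameters")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_params - 1)


# Hypothetical number of observations; not reported in the text above.
N_OBS = 500

# Algebra Tutor figures from the text: equal r2, different parameter counts.
lesson_fit = adjusted_r2(0.18, N_OBS, 21)
student_fit = adjusted_r2(0.18, N_OBS, 58)

print(f"lesson:  {lesson_fit:.3f}")
print(f"student: {student_fit:.3f}")
```

When the raw r-squared values are equal, the model with fewer parameters receives the higher adjusted value for any admissible choice of `N_OBS`, which matches the direction of the argument in the text.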

We empirically grouped the 79 features of the CTLVS1.1 into 6 factors using Principal Component Analysis (PCA). We then analyzed whether the correlation between any of these 6 factors and the frequency of gaming the system was significant.

Of these 6 factors, one was statistically significantly associated with the choice to game the system, r2 = 0.29 (i.e. accounting for 29% of the variance in gaming), F(1,19)=7.84, p=0.01. The factor loaded strongly on eight features associated with more gaming:

   * 14: The same number being used for multiple constructs
   * 23-inverse-direction: Reading hints does not positively influence performance on future opportunities to use skill
   * 27: Proportion of hints in each hint sequence that refer to abstract principles
   * 40: Not immediately apparent what icons in toolbar mean
   * 53-inverse-direction: Lack of text in problem statements not directly related to the problem-solving task, generally there to increase interest
   * 63-inverse-direction: Hints do not give directional feedback such as “try a larger number”
   * 71-inverse-direction: Lack of implementation flaw in hint message, where there is a reference to a non-existent interface component
   * 75: Hint requests that student perform some action 

In general, several of the features in this factor appear to correspond to a lack of clarity in the presentation of the content or task (23-inverse, 40, 63-inverse), as well as abstractness (27) and ambiguity (14). Curiously, feature 71-inverse (the lack of a specific type of implementation flaw in hint messages, which would make things very unclear) appears to point in the opposite direction – however, this implementation flaw was only common in a single rarely gamed lesson, so this result is probably a statistical artifact.

Feature 53-inverse appears to represent a different construct – interestingness (or the attempt to increase interestingness). The fact that feature 53 was associated with less gaming whereas more specific interest-increasing features (features 46-52) were not so strongly related may suggest that it is less important exactly how a problem scenario attempts to increase interest, than it is important that the problem scenario has some content in it that is not strictly mathematical.

Taken individually, two of the constructs in this factor were significantly (or marginally significantly) associated with gaming. Feature 53-inverse (text in the problem statement not directly related to the problem-solving task) was associated with significantly less gaming, r2 = 0.19, F(1,19)=4.59, p=0.04. Feature 40 (when it is not immediately apparent what icons in toolbar mean) was marginally significantly associated with more gaming, r2 = 0.15, F(1,19)=3.52, p=0.08. The fact that other top features in the factor were not independently associated with gaming, while the factor as a whole was fairly strongly associated with gaming, suggests that gaming may occur primarily when more than one of these features is present.
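For a regression with a single predictor, the r2 and F values reported above are tied together by a standard identity: F = r2 * df / (1 - r2), where df is the error degrees of freedom (19 here). The following sketch, with function names chosen only for illustration, recovers r2 from each reported F as a consistency check.

```python
def r2_from_f(f_stat: float, df_error: int) -> float:
    """For a regression with one predictor, r2 = F / (F + df_error)."""
    return f_stat / (f_stat + df_error)


def f_from_r2(r2: float, df_error: int) -> float:
    """Inverse relation: F = r2 * df_error / (1 - r2)."""
    return r2 * df_error / (1.0 - r2)


# (label, F, error df) as reported in the text above.
reported = [
    ("gaming-related factor", 7.84, 19),
    ("Feature 53-inverse", 4.59, 19),
    ("Feature 40", 3.52, 19),
]

for label, f_stat, df_error in reported:
    print(f"{label}: r2 = {r2_from_f(f_stat, df_error):.3f}")
```

The recovered values fall within about 0.01 of the reported r2 figures, a gap consistent with rounding in the reported statistics.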

Two features that were not present in the significant factor were statistically significantly associated with gaming. The first was Feature 36, where the location of the first problem step does not follow conventions (such as being the top-left cell of a worksheet) and is not directly indicated, r2 = 0.20, F(1,19)=4.97, p=0.04. This feature, like many of those in the gaming-related factor, represents an unclear or confusing lesson. The second was Feature 79, whether or not the lesson was an equation solver unit, which was statistically significantly better than chance, r2 = 0.30, F(1,19)=8.55, p<0.01. Note, however, that although a lower amount of interesting text is generally associated with more gaming (Feature 53), equation-solver units (which have no text) have less gaming in general (Feature 79). This result may suggest that interest-increasing text is only beneficial (for reducing gaming) above a certain threshold -- alternatively, other aspects of the equation-solver units may have reduced gaming even though the lack of interest-increasing text would generally be expected to increase it.

When the gaming-related factor, Feature 36, and Feature 79 were included in a model together, all remained statistically significant, and the combined model explains 56% of the variance in gaming (i.e. r2 = 0.55).

Five other features that were not strongly loaded in the significant factor were marginally associated with gaming. None of these other features is statistically significant in a model that already includes the gaming-related cluster and Feature 36. Due to the non-conclusiveness of the evidence relevant to these features, we will not discuss all of these features in detail, but will briefly mention one that has appeared in prior discussions of gaming. Lessons where a higher proportion of hint sequences told students what to do on the last hint (Feature 61) had marginally significantly more gaming, r2 = 0.14, F(1,19)=3.28, p=0.09. This result is unsurprising, as drilling through hints and typing in a bottom-out hint is one of the easiest and most frequently reported types of gaming the system.

The off-task behavior model achieved similar predictive power, but was a much less complex model. None of the 6 factors was statistically significantly associated with off-task behavior. Only one of the features was individually statistically significantly associated with off-task behavior: Feature 79, whether or not the lesson was an equation solver unit. Equation solver units had significantly less off-task behavior, just as they had significantly less gaming the system, and the effect was large in magnitude, r2 = 0.55, F(1,21)=27.29, p<0.001, Bonferroni adjusted p<0.001.

To put this relationship into better context, we can look at the proportion of time students spent off-task in equation-solver lessons as compared to other lessons. On average, students spent 4.4% of their time off-task within the equation-solver lessons, much lower than is generally seen in intelligent tutor classrooms or, for that matter, in traditional classrooms. By contrast, students spent 14.1% of their time off-task within the other lessons, a proportion of time off-task which is much more in line with previous observations. The difference in time spent off-task per type of lesson is, as would be expected, statistically significant, t(22)=4.48, p<0.001.

The goal of this project is to examine how student learning is affected by social cues in computer-based learning environments, such as the conversational style of online cognitive tutors. In particular, students will learn how to solve stoichiometry problems in the Chemistry LearnLab, using a cognitive tutor that provides hints and feedback in direct style or in polite style (McLaren, Lim, Yaron, & Koedinger, 2007). The stoichiometry tutor has been used for other PSLC studies, in particular those by McLaren et al that have investigated personalization, politeness, and worked examples.

Our study is based on Brown and Levinson’s (1987) theory of politeness, which specifies how people create polite requests; Reeves and Nass’ (1996, 2005) media equation theory, which specifies the conditions under which people accept computers as conversational partners; and Mayer’s (2005) personalization principle in which people work harder to learn when they feel they are in a conversation with a tutor. Our working hypothesis is that learners work harder to make sense of lessons when they work with polite rather than direct tutors, because learners are more likely to accept polite tutors as conversational partners (Mayer, 2005; Wang, Johnson, Mayer, Rizzo, Shaw, & Collins, 2008).

Findings: A lab study with over 100 subjects was run in early 2009 at the University of California with the above conditions. College students learned to solve chemistry stoichiometry problems with the stoichiometry tutor through hints and feedback, either polite or direct, as described above. There was a pattern in which students with low prior knowledge of chemistry performed better on subsequent problem-solving tests if they learned from the polite tutor rather than the direct tutor (d = .73 on an immediate test, d = .46 on a delayed test), whereas students with high prior knowledge showed the reverse trend (d = -.49 for an immediate test; d = -.13 for a delayed test). On the other hand, the high school study, also run in early 2009 with over 100 subjects, produced different results. In particular, the high school students did not show a pattern in which students with low prior knowledge of chemistry performed better on subsequent tests. We are still analyzing the audio feature of the study, i.e., the comparison of audio to text hints and messages, but preliminary results indicate that adding audio hurt the performance of high knowledge learners and helped low knowledge learners on the delayed test.
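The d values above are standardized effect sizes. As a minimal sketch of how such a value is typically computed, the function below implements Cohen's d with a pooled standard deviation; whether the study used exactly this variant is an assumption. The scores are hypothetical and serve only to illustrate the arithmetic.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(treatment: list[float], control: list[float]) -> float:
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n_t, n_c = len(treatment), len(control)
    pooled_var = (
        (n_t - 1) * variance(treatment) + (n_c - 1) * variance(control)
    ) / (n_t + n_c - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)


# Hypothetical test scores, for illustration only (not data from the study).
polite_scores = [2.0, 4.0, 6.0]
direct_scores = [1.0, 3.0, 5.0]

print(cohens_d(polite_scores, direct_scores))  # prints 0.5
```

With the polite group passed first, a positive d favors the polite tutor and a negative d favors the direct tutor, which appears to be the sign convention of the values reported above.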

  • **New** Aleven- Improving student affect through adding game elements to mathematics LearnLabs Math_Game_Elements


There is much evidence that games are fun. Can we incorporate some of the features that make games fun into intelligent tutors, in a way that improves motivation, generates positive affect, and improves the robustness of student learning? Specifically, what happens if we take game elements known to be effective, such as fantasy, competition, and trivial choice, and embed them into tutors already known to promote learning, using principles in the PSLC theoretical framework? In the current project (which started in Year 5 of the PSLC), we are investigating the effect of adding game elements to an existing set of fractions tutors developed by Martina Rau (Rau, Aleven, & Rummel, in press) for a different PSLC project. The game elements comprise a fantasy soccer game, where success in the soccer game depends on learning progress in the fractions tutors.

Background & Significance

Games and tutors appear to have complementary strengths. The current project is an attempt to determine if we can develop learning environments that leverage the strength of each type of learning environment, creating learning software that is as motivationally effective as games but promotes robust learning as well as intelligent tutors. We investigate one particular way of integrating game elements and learning content, building a game around an intelligent tutor engine.

How best to integrate learning content and game elements has been the subject of much theorizing, with some authors arguing that optimal learning requires that the learning content and game world be mutually dependent (Lepper & Malone, 1987), and others arguing that the learning should be embedded in the game’s core mechanic (Habgood, 2007). However, such theories ignore the real-world success (at least in anecdotal reports from teachers and students) of environments that feature a much looser integration between learning and motivational embellishments (e.g., FirstInMath). Given that a loose coupling between game elements and learning activities is far easier to implement (since it avoids the difficult problem of embedding math problems in a storyline or game context, which is especially hard if the learning content is to be adapted to individual students’ learning results), it is reasonable to investigate this option first. It may well be that as long as the game features are “cool,” the degree of integration is not really a strong factor. As mentioned, the success of, for example, the motivational embellishments in FirstInMath certainly suggests so.

Planned Experiments

We will conduct two in-vivo experiments comparing the tutor with game features against the regular tutor.

The first experiment, to be conducted in LearnLab schools in Fall 2009, will control for time. Students will be randomly selected to use either the game or the unmodified tutor (for equity, all students will receive access to the game over the web after the study). Motivation and liking will be assessed by pre-test and post-test questionnaires, and affect will be assessed by quantitative field observations during usage. Robust learning will be measured by pre-test and post-test.

The second experiment, to be conducted in LearnLab schools in Spring 2010, will allow time to vary. Students will be randomly selected to use either the game or the unmodified tutor (for equity, all students will receive access to the game over the web after the study). Students will be required to use their assigned condition for one class period, and then in two subsequent class periods will be given the choice to switch conditions or use an alternate piece of educational software covering the same material. Motivation will be assessed by students' time allocation once they are given the option of switching tasks.

Bringing it Together: Exploring Effects of Combining Principles

(Perhaps this should be saved for a cross-thrust section as there is CF, CMDM, and M&M involved.)

The main idea in the current project is to combine instructional interventions derived from four instructional principles. Each of these interventions has been shown to be effective in separate (PSLC) studies, and can be expected on theoretical grounds to be synergistic (or complementary). We hypothesize that instruction that simultaneously implements several principles will be dramatically more effective than instruction that does not implement any of the targeted principles (e.g. current common practice), especially if the principles are tied to different learning mechanisms. This project will test this hypothesis, focusing on the following four principles:

   * Visual-verbal integration principle
   * Worked example principle
   * Prompted self-explanation principle
   * Accurate knowledge estimates principle 

Building on our prior work that tested these principles individually, we have created a new version of the Geometry Cognitive Tutor that implements these four principles. We have conducted an in vivo experiment, and will conduct a lab experiment, to test the hypothesis that the combination of these principles produces a large effect size compared to the standard Cognitive Tutor, which does not support these principles or supports them less strongly. Analysis of the in vivo experiment is in progress.