Koedinger - Discovery of Domain-Specific Cognitive Models
Ken Koedinger and John Stamper
This project will address goal 1 of the CMDM thrust and, in particular, use DataShop data sets (90 in 5 years) to produce better cognitive models and verify the models with in vivo experiments. Cognitive models drive the many instructional decisions that automated tutors currently make, whether it is how to organize instructional messages, sequence topics and problems in a curriculum, adapt pacing to student needs, or select materials and tasks appropriate to those needs. Cognitive models also appear critical to accurate assessment of self-regulated learning skills or motivational states. Multiple algorithms have been developed for automated discovery of the attributes or factors that make up a cognitive model (or a "Q matrix"), including Rule Space, Knowledge Spaces, Learning Factors Analysis (LFA), and exponential-family PCA. This project will create an infrastructure for automatically applying such algorithms to data sets in the DataShop, discovering better cognitive models, and evaluating whether such models improve tutors.
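To make the "Q matrix" idea concrete, here is a minimal sketch, not the project's implementation. It assumes the logistic form commonly used with LFA, in which the log-odds of a correct response is a student proficiency term plus, for each knowledge component (KC) the step requires, an easiness term and a learning-rate term multiplied by the number of prior practice opportunities. The KC names, step names, and parameter values are all hypothetical.

```python
import math

# Hypothetical Q matrix: rows are problem steps, columns are knowledge components (KCs).
# A 1 means the step is assumed to require that KC. Names are illustrative only.
KCS = ["circle-area", "decompose-figure"]
Q_MATRIX = {
    "step_a": [1, 0],  # needs only the circle-area KC
    "step_b": [1, 1],  # also needs the problem-decomposition KC
}


def predicted_correct(theta, betas, gammas, q_row, opportunities):
    """Probability of a correct response under an additive-factors-style model.

    logit(p) = theta + sum_k q_k * (beta_k + gamma_k * T_k)
    theta: student proficiency; beta_k: KC easiness; gamma_k: KC learning rate;
    T_k: number of prior practice opportunities on KC k.
    """
    logit = theta
    for q, beta, gamma, t in zip(q_row, betas, gammas, opportunities):
        logit += q * (beta + gamma * t)
    return 1.0 / (1.0 + math.exp(-logit))


# Made-up parameters, chosen only to illustrate the calculation.
theta = 0.0
betas = [0.5, -1.0]
gammas = [0.2, 0.1]

p_first = predicted_correct(theta, betas, gammas, Q_MATRIX["step_b"], [0, 0])
p_later = predicted_correct(theta, betas, gammas, Q_MATRIX["step_b"], [5, 5])
print(f"step_b, first attempt: {p_first:.3f}")
print(f"step_b, after 5 practice opportunities per KC: {p_later:.3f}")
```

Changing a row of the Q matrix (for example, splitting one KC into two) changes which parameters apply to a step. That is the sense in which a discovery algorithm searches over alternative cognitive models.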
Planned accomplishments for PSLC Year 6 (Oct 09 to Oct 10)
1. Develop code and human-computer interfaces for applying, comparing and interpreting cognitive model discovery algorithms across multiple data sets in DataShop. We will document processes for how the algorithms, like LFA, combine automation and human input to discover or improve cognitive models of specific learning domains.
2. Demonstrate the use of the model discovery infrastructure (#1) for at least two discovery algorithms applied to at least four DataShop data sets. We will target at least one math (Geometry area and/or Algebra equation solving), one science (Physics kinematics), and one language (English articles) domain.
3. For at least one of these data sets, work with associated researchers to perform a "close the loop" experiment whereby we demonstrate that a better cognitive model leads to better or more efficient student learning.
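As a rough illustration of what comparing cognitive models could involve, the sketch below ranks two candidate models fit to the same data set by AIC and BIC. The choice of these criteria is an assumption; the plan above does not name a comparison metric. The model names, log-likelihoods, parameter counts, and observation count are made up.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical fit summaries for two cognitive models of the same data set.
N_OBS = 5000  # assumed number of student-step observations
candidates = {
    "original_model": {"log_likelihood": -2600.0, "n_params": 40},
    "refined_model": {"log_likelihood": -2550.0, "n_params": 44},
}

scores = {
    name: {
        "aic": aic(fit["log_likelihood"], fit["n_params"]),
        "bic": bic(fit["log_likelihood"], fit["n_params"], N_OBS),
    }
    for name, fit in candidates.items()
}

# The model with the lowest BIC balances fit against the number of parameters.
best_by_bic = min(scores, key=lambda name: scores[name]["bic"])
for name, s in scores.items():
    print(f"{name}: AIC={s['aic']:.1f} BIC={s['bic']:.1f}")
print("preferred by BIC:", best_by_bic)
```

Both criteria penalize extra parameters, so a model with more knowledge components is preferred only if it improves the fit enough to offset the added complexity.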
Integrated Research Results and High Profile Publication
Establishing that cognitive models of academic domain knowledge in math, science, and language can be discovered from data would be an important scientific achievement. The achievement will be greater to the extent that the discovered models involve deep or integrative knowledge components not directly apparent in surface task structure (e.g., model discovery in the Geometry area domain isolated a problem decomposition skill). The statistical model structure of competing discovery algorithms promises to shed new light on the nature or extent of regularities or laws of learning, such as the power or exponential shape of learning curves, whether the complexity of task behavior is due to human or domain characteristics (the ant on the beach question), and whether there are systematic individual differences in student learning rates. We expect that integrative results of this project can be published in high-profile general journals (Science or Nature), in more specialized technical journals (e.g., Machine Learning), or in psychological journals (e.g., Cognitive Science or Learning Science).