Baker - Building Generalizable Fine-grained Detectors

From LearnLab
Revision as of 11:30, 4 December 2009 by Ryan (Talk | contribs)


Building Generalizable Fine-grained Detectors

Summary Table

Study 1

PIs Ryan Baker, Vincent Aleven
Other Contributors Sidney D'Mello (Consultant, University of Memphis), Ma. Mercedes T. Rodrigo (Consultant, Ateneo de Manila University)
Study Start Date February, 2010
Study End Date February, 2011
LearnLab Site TBD
LearnLab Course Algebra, Geometry, Chemistry, Chinese
Number of Students TBD
Total Participant Hours TBD
DataShop TBD


This project, a joint effort between the Metacognition and Motivation (M&M) and Computational Modeling and Data Mining (CMDM) thrusts, will create a set of fine-grained detectors of affect and M&M behaviors. Future projects in these two thrusts will be able to use these detectors to study the impact of learning interventions on these dimensions of students’ learning experiences, and to study the inter-relationships between these constructs and other key PSLC constructs (such as measures of robust learning and motivational questionnaire data). It will also be possible to apply the detectors retrospectively to existing PSLC data in DataShop, in order to re-interpret prior work in light of evidence on students’ affect and M&M behaviors.

Background & Significance


Metacognition and Motivation

Computational Modeling and Data Mining

Gaming the system

Off-Task Behavior





Engaged Concentration


H1: We hypothesize that it will be possible to develop reasonably accurate detectors of student affect for four LearnLabs, using only data from the student’s keyboard and mouse interactions with the learning software.

H2: We hypothesize that models of behaviors such as gaming the system and off-task behavior, in combination with models of affect/behavior dynamics, can make affect detectors more accurate.

H3: We hypothesize that these affect models will become a valuable component of future research in the M&M and CMDM thrusts.

Research Plan

We will develop detectors of the following M&M (metacognitive & motivational) behaviors: gaming the system, off-task behavior, proper help use, on-task conversation, help avoidance, and self-explanation without scaffolding. This set of behaviors has already been effectively detected in mathematics LearnLabs. We will model the dynamics between these behaviors and student affect (following on work in the PSLC and at Memphis), so that the behavior detectors can be leveraged to create detectors of the affective states of flow, boredom, confusion, and frustration. Specifically, the dynamics models will enable us to set Bayesian priors for how likely an affective state is at a given time.
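As a rough illustration of how a dynamics model could supply Bayesian priors, the sketch below propagates the previous belief about a student's affect through a transition table and then combines the resulting prior with a detector's likelihoods. The function names, the transition probabilities, and the likelihood values are all hypothetical and chosen only for illustration; the project does not specify this formulation.

```python
# Minimal sketch (assumed formulation, not the project's actual model).
AFFECT_STATES = ["flow", "boredom", "confusion", "frustration"]

def prior_from_dynamics(prev_belief, transition):
    """Prior for the current state: sum over s of P(prev = s) * P(current | s)."""
    return {
        nxt: sum(prev_belief[s] * transition[s][nxt] for s in AFFECT_STATES)
        for nxt in AFFECT_STATES
    }

def posterior(prior, likelihood):
    """Bayes' rule: posterior is proportional to prior * likelihood."""
    unnormalized = {s: prior[s] * likelihood[s] for s in AFFECT_STATES}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("prior and likelihood have no overlapping support")
    return {s: v / total for s, v in unnormalized.items()}

# Hypothetical transition probabilities P(current | previous); made-up numbers.
transition = {
    "flow":        {"flow": 0.70, "boredom": 0.10, "confusion": 0.15, "frustration": 0.05},
    "boredom":     {"flow": 0.10, "boredom": 0.70, "confusion": 0.10, "frustration": 0.10},
    "confusion":   {"flow": 0.30, "boredom": 0.10, "confusion": 0.40, "frustration": 0.20},
    "frustration": {"flow": 0.10, "boredom": 0.20, "confusion": 0.20, "frustration": 0.50},
}

# The student was judged to be in flow at the previous time step.
prev_belief = {"flow": 1.0, "boredom": 0.0, "confusion": 0.0, "frustration": 0.0}

# Hypothetical likelihood of the current interaction features under each state.
likelihood = {"flow": 0.2, "boredom": 0.2, "confusion": 0.5, "frustration": 0.1}

prior = prior_from_dynamics(prev_belief, transition)
belief = posterior(prior, likelihood)
print(belief)
```

In this made-up example the interaction evidence favors confusion, but the prior from the dynamics model keeps flow as the most probable state, which is the kind of smoothing effect a dynamics-based prior is intended to provide.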

These detectors will be developed for multiple LearnLabs, and the generalizability of detectors across LearnLabs will be one of the focuses of study during this project. We anticipate developing detectors for Algebra and Geometry, Chinese/FaCT, and the Chemistry Virtual Lab. Each of these learning environments presents a context where complex learning occurs, fine-grained interaction behavior is logged, and the outputs of the detectors will provide leverage on a number of research questions of interest.

“Ground truth” for the M&M behavior categories will be established through quantitative field observations. “Ground truth” for the affect categories will be established by field observations and infrequent pop-up questions. Work will be conducted to increase the reliability of quantitative field observations of affect to a standard considered appropriate by psychology journals. This will involve repeated coding and discussion sessions, and the development of a detailed coding manual based on prior work on coding affect in field settings and on coding emotions from facial expressions. A limited amount of video will be used during the training process (but not during the main coding of affect for data mining, due to the relatively high cost of obtaining and coding video data in school settings).
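The text does not say which reliability statistic will be used. One common choice for two observers coding the same observations is Cohen's kappa, which corrects raw agreement for agreement expected by chance. The sketch below is a plain-Python version under that assumption; the observer codes are invented for illustration.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders labelling the same sequence of observations."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of observations")
    n = len(coder_a)
    # Proportion of observations on which the two coders agree.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's marginal label frequencies.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both coders used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)

# Invented codes from two hypothetical observers.
observer_1 = ["flow", "boredom", "flow", "confusion", "flow", "frustration"]
observer_2 = ["flow", "boredom", "confusion", "confusion", "flow", "boredom"]
print(round(cohens_kappa(observer_1, observer_2), 3))
```

A kappa of 1 indicates perfect agreement and 0 indicates agreement no better than chance; the threshold treated as acceptable is a matter for the coding protocol and is not specified here.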

Models will be developed solely using distilled log file data of the sort currently collected in DataShop (more sophisticated sensors will NOT be included in this project). The models will be built with a combination of machine learning and knowledge engineering (specifically, by leveraging and adapting existing knowledge-engineered models such as Aleven et al.’s help-seeking model and Shih et al.’s self-explanation model). Generalization of models across learning environments will involve expectation maximization to adapt models to new data sets, and/or leveraging the CTLVS1 taxonomy to develop meta-models that relate prediction features to design features. We will first develop models for individual learning environments and then extend them across environments.
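The plan does not spell out how expectation maximization would be applied. One simple possibility, sketched here purely as an assumption, is to re-estimate the base rate of a behavior in a new learning environment from a detector's probability outputs on unlabeled data, then rescale those outputs accordingly. The function name and the example numbers are hypothetical.

```python
def adapt_prior_em(probs, train_prior, iterations=50, tol=1e-8):
    """Re-estimate a binary class prior on unlabeled data with EM.

    probs: detector outputs P(behavior | features) under the training prior,
           each strictly between 0 and 1.
    train_prior: base rate of the behavior in the original training data.
    Returns (estimated new prior, probabilities adjusted to that prior).
    """
    if not probs:
        raise ValueError("probs must not be empty")
    if not 0.0 < train_prior < 1.0:
        raise ValueError("train_prior must be strictly between 0 and 1")
    if any(not 0.0 < p < 1.0 for p in probs):
        raise ValueError("each probability must be strictly between 0 and 1")

    new_prior = train_prior
    adjusted = list(probs)
    for _ in range(iterations):
        # E-step: rescale each output by the ratio of new to training prior.
        adjusted = []
        for p in probs:
            pos = (new_prior / train_prior) * p
            neg = ((1.0 - new_prior) / (1.0 - train_prior)) * (1.0 - p)
            adjusted.append(pos / (pos + neg))
        # M-step: the new prior is the mean adjusted probability.
        updated = sum(adjusted) / len(adjusted)
        converged = abs(updated - new_prior) < tol
        new_prior = updated
        if converged:
            break
    return new_prior, adjusted

# Hypothetical detector outputs on a new environment; training base rate of 0.2.
new_prior, adjusted = adapt_prior_em([0.9, 0.1, 0.8, 0.2, 0.7, 0.15], train_prior=0.2)
print(new_prior, adjusted)
```

This adapts only the base rate and leaves the detector's features and weights untouched, so it would address just one of the ways behavior can differ between learning environments.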

Independent Variables

n/a (see Research Plan)

Dependent Variables

n/a (see Research Plan)

Planned Studies

In 2010 and 2011, data will be collected in the Algebra, Geometry, Chemistry, and Chinese LearnLabs.


Further Information


Annotated Bibliography


Future Plans