REAP Study on the Correlation Between Automatically and Manually Generated Reading Comprehension Questions (Summer 2007)

From LearnLab
Revision as of 18:36, 25 July 2007 by Feeney


Logistical Information

Contributors: Christine M. Feeney and Michael Heilman
Study Start Date: June 2007
Study End Date: July 2007
LearnLab Courses: none
Number of Students: 30
Total Participant Hours (est.): 300
Data in DataShop: no


Abstract

Previous psychological research has identified two types of comprehension: shallow, in which people can reproduce information, and deep, in which people can understand its meaning. Researchers have also found these two types of comprehension to be distinct. The PSLC currently employs REAP, an English as a Second Language vocabulary tutor that uses shallow, automatically generated comprehension questions to check that students are actively reading practice passages rather than just skimming them. The purpose of the current study was to compare performance on these REAP-style questions with performance on manually authored, deeper reading comprehension questions. Participants were thirty undergraduate students (11 male) in summer research programs at Carnegie Mellon University. The researchers predicted a positive correlation between the two question types, and the data supported this prediction (r = .366, p < 0.0005, one-tailed t-test).


Glossary

Shallow Comprehension: Processing directed towards reproducing the learning material without necessarily understanding it.

Deep Comprehension: Processing directed towards comprehending the intended meaning of the learning material.

Research question

Does performance on the shallow, automatically-generated questions correlate with performance on more sophisticated, deep reading comprehension questions?

Dependent variables

Percentage of Shallow Comprehension Questions Answered Correctly

Percentage of Deep Comprehension Questions Answered Correctly

Independent variables

Five reading passages, each of which was followed by four shallow reading comprehension questions and three to five deep reading comprehension questions.
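As background on the materials, the shallow questions REAP generates are typically multiple-choice cloze items whose distracters come from automatic thesaurus extraction (cf. the Curran & Moens and Heilman & Eskenazi entries below). A minimal illustrative sketch of such an item, using a hypothetical sentence and hypothetical distracters rather than the study's actual materials:

```python
import random

def make_cloze_item(sentence, target, distracters, seed=0):
    """Build a shallow multiple-choice cloze question by blanking out the
    target word and shuffling it among thesaurus-style distracters."""
    stem = sentence.replace(target, "_____", 1)
    options = distracters + [target]
    random.Random(seed).shuffle(options)  # fixed seed for reproducibility
    return {"stem": stem, "options": options, "answer": target}

# Hypothetical example item (not taken from the study's passages).
item = make_cloze_item(
    "The committee reached a unanimous decision.",
    "unanimous",
    ["reluctant", "preliminary", "formal"],
)
```

The "giveaway words" mentioned in the findings would correspond, in a sketch like this, to distracters that are too obviously wrong for the blanked context.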


Hypothesis

The researchers hypothesized a significant positive correlation between performance on the shallow and the deep reading comprehension questions.


Findings

There was a significant positive correlation between performance on the two question types (automatically generated shallow reading comprehension questions and manually authored deep comprehension questions; r = .366, p < 0.0005, one-tailed test), which suggests that the automatically generated questions are an adequate check of reading comprehension for the REAP program. Further studies could examine different types of deep reading comprehension questions, remove the "giveaway words" from the automatically generated distracters, and investigate whether ESL students approach these questions differently than native English speakers.
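The reported statistic is a Pearson product-moment correlation between each student's percentage scores on the two question types, with significance assessed by a one-tailed t-test against zero. A minimal sketch of that computation, using hypothetical per-student scores (the data and function names below are illustrative, not the study's):

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two paired score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def t_statistic(r, n):
    """t value for testing H0: rho = 0 with n paired observations."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Hypothetical percentage-correct scores for ten students.
shallow = [75, 80, 60, 90, 70, 85, 65, 95, 55, 80]
deep    = [70, 72, 55, 85, 68, 80, 50, 90, 60, 74]

r = pearson_r(shallow, deep)
t = t_statistic(r, len(shallow))
```

The one-tailed p value is then read from the t distribution with n - 2 degrees of freedom (e.g., via `scipy.stats.t.sf(t, n - 2)`), the one-tailed form matching the study's directional prediction of a positive correlation.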



Explanation

The researchers chose five passages of between five hundred and one thousand words because this length is comparable to reading comprehension passages that appear on standardized tests, such as the GRE, and to REAP texts. The researchers drew on a variety of sources so that the passages would vary in difficulty.

Annotated bibliography

Curran, J. R. & Moens, M. (2002). Improvements in automatic thesaurus extraction. Proceedings of the Workshop of the ACL Special Interest Group on the Lexicon (SIGLEX), 59-66.

Davoudi, M. (2005). Inference Generation Skill and Text Comprehension. The Reading Matrix, 5(1), 106-123.

Heilman, M., Collins-Thompson, K., Callan, J. & Eskenazi, M. (2006). Classroom success of an Intelligent Tutoring System for lexical practice and reading comprehension. Proceedings of the Ninth International Conference on Spoken Language Processing.

Heilman, M. & Eskenazi, M. (in press). Application of Automatic Thesaurus Extraction for Computer Generation of Vocabulary Questions. Proceedings of the SLaTE Workshop on Speech and Language Technology in Education.