REAP Comparison to Classroom Instruction (Fall 2006)

From LearnLab
Revision as of 13:50, 24 April 2008 by Juffs (Talk | contribs)


Logistical Information

Contributors: Alan Juffs, Lois Wilson, Maxine Eskenazi, Michael Heilman
Study Start Date: September 2006
Study End Date: April 2007
LearnLab Courses: English Language Institute Reading 4 (ESL LearnLab)
Number of Students: ~72
Total Participant Hours (est.): approximately 360
Data in DataShop: no



This paper focuses on the long-term retention and production of instructed vocabulary in an intensive English program (IEP). It draws on the practical framework of Coxhead (2001) and Nation (2005) and the theoretical perspectives of Laufer & Hulstijn (2001). The project collected data from over 72 ESL learners over two semesters. The vocabulary instruction took place in an intermediate-level reading class (intermediate = TOEFL 450 or iBT TOEFL 45; average MTELP score, 58). All learners spent 40 minutes per week for 9 weeks reading texts containing words from the Academic Word List. In fall 2006, topic interest was manipulated in the CALL condition. In spring 2007, all of the students received texts that were in line with their interests. For the classroom comparison, a subset of the learners' normal in-class vocabulary instruction was tracked for two semesters. Pre-, post-, and delayed post-test data were collected for both the CALL vocabulary learning and the in-class learning. In addition, all of the students' writing assignments during this period were collected online. From this database of written output, each student’s texts were analyzed to determine which words seen during computer training and regular reading class had transferred to spontaneous output in compositions for their writing class. Results indicate that although the CALL practice led to recognition one semester later, only the words practiced during regular reading class vocabulary instruction transferred to spontaneous writing. This transfer effect is attributed to the output practice and deeper processing that occurred during the regular vocabulary instruction. The data also showed that the production of words seen in the CALL condition alone suffered from errors in word recognition (‘clang’ associations) and morphological form errors (Schmitt & Meara, 1997).
We conclude that some negative views on output practice held by Folse (2006) and Barcroft (2005) must be modified to accommodate these data. As a result of this comparison, we are amending REAP to create more interactive production opportunities.
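The transfer analysis described above (checking which trained words later appear in students' compositions) can be sketched roughly as follows. This is only an illustration, not the procedure used in the study: the function names, word lists, and essays are all hypothetical, and the matching is on exact word forms, so it would not credit inflected or derived forms and would not detect the morphological errors mentioned above.

```python
import re


def words_used(text):
    """Return the set of lowercase word tokens found in one composition."""
    return set(re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower()))


def transferred(target_words, compositions):
    """Return the target words that appear in at least one composition."""
    used = set()
    for text in compositions:
        used |= words_used(text)
    return sorted(w.lower() for w in target_words if w.lower() in used)


# Hypothetical example data, not taken from the study.
call_words = ["analyze", "derive", "context"]    # words seen only in CALL
class_words = ["benefit", "factor", "require"]   # words taught in class
essays = [
    "One benefit of living abroad is the new context.",
    "Schools require many hours of study.",
]

print(transferred(call_words, essays))
print(transferred(class_words, essays))
```

A fuller analysis would need to group word-family members (for example, "require" and "requirement") before matching, which this sketch deliberately leaves out.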


In the fall of 2006 and the spring of 2007, we tracked learners using REAP [1] and also tracked the in-class vocabulary instruction. This study therefore focused on a comparison of REAP with what normally happens in classrooms. In that sense, it is not a very tightly controlled study, but it is ecologically valid because it reflects what actually happens in classes.

It is important to note that the REAP treatments in fall 2006 and spring 2007 were slightly different. In the fall of 2006, participants were introduced to the personalization of the texts that they read: some students received texts on topics that interested them, while others received random texts that contained their focus words. All of the students had their focus words highlighted, but not all students received texts that were of interest to them all the time. Details can be read here.[2] In the spring of 2007, all students were able to select their topics of interest and also had their focus words highlighted. [See the study in level 5 for a comparison of highlighted versus non-highlighted words.]

In the classroom condition, the Reading 4 curriculum supervisor decided on a list of Academic Word List vocabulary [3] items that had been excluded from the tests that the students took to establish their focus word lists. This list included 58 items that, in her view, the students should know.

In this report, we compare learning gains in REAP in the fall of 2006 with in-class learning. For the spring of 2007, we again compare learning gains in REAP with in-class instruction.
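Because the in-class list had 58 items while REAP focus lists differ from student to student, one way to put gains on a common footing is to express them as a proportion of the items tested. The sketch below assumes that approach; it is not the scoring method reported for this study, the function names are hypothetical, and every score in it (including the 30-word REAP list size) is invented purely for illustration.

```python
def proportion_gain(pre_correct, post_correct, n_items):
    """Gain between two tests as a proportion of the items tested."""
    if n_items <= 0:
        raise ValueError("n_items must be positive")
    return (post_correct - pre_correct) / n_items


def mean_gain(records):
    """Average proportion gain over (pre, post, n_items) records."""
    gains = [proportion_gain(pre, post, n) for pre, post, n in records]
    return sum(gains) / len(gains)


# Invented scores for illustration only; these are not study results.
in_class = [(10, 30, 58), (20, 40, 58)]  # tested on the 58-item class list
reap = [(0, 6, 30), (0, 9, 30)]          # hypothetical 30-word focus lists

print(round(mean_gain(in_class), 3))
print(round(mean_gain(reap), 3))
```

The same function can be applied to delayed post-test scores by passing them in place of the post-test scores.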


Recent research in vocabulary acquisition has suggested that the time taken for written output practice may not be well spent (Folse, 2006). Folse (2006) has suggested that fill-in-the-blank exercises are more efficient than having learners use new words to create new meaningful texts during practice.

However, Hulstijn and Laufer (2001) (H&L) have proposed the involvement load hypothesis for vocabulary acquisition, which holds that deeper processing leads to better learning. 'Depth of processing' during learning has not otherwise been clearly defined, so Hulstijn and Laufer (2001) suggest a definition that contains three parts. Each part is assigned a processing load weight, from 0 to 2. Although H&L do not fully flesh out their model, one can infer the different processing weights for each component of processing. The first is 'need', which they controversially label 'non-cognitive'. No need would have a weight of '0', an externally imposed need would be '1', and a self-generated need would be '2'. The second component of processing is 'search'. No search would have a weight of '0', teacher- or class-provided information would be '1', and self look-up in a dictionary or online would be '2'. Finally, there is the evaluation stage. A weight of '0' would be a non-linguistic response such as choosing from a multiple choice list, i.e. no output at all. A weight of '1' would be an activity such as 'fill-in-the-blank'. A weight of '2' would be free production, such as in-class writing activities.

The problem with both Folse (2006) and H&L is that they selected a very limited number of words for in-class study (15 and 10, respectively). Moreover, the post-test results in both studies showed surprisingly low retention scores. For example, H&L reported an average delayed post-test gain of 1.6 or 1.7 out of 10 for practice with reading and fill-in-the-blank exercises, and only 2.6 or 3.7 out of 10 for production.

However, we know that learners need to master many more than 10-15 words to be able to comprehend and produce academic discourse effectively. What is needed is a much closer focus on how time can be used effectively to achieve a gain of 50-150 words (word families) over time. This goal has been the focus in REAP. However, the time on task for each word in REAP is limited, and we have observed limited gains.


1. How does REAP vocabulary learning differ from the ecological control, 'normal' in-class instruction?

2. How do the learning outcomes differ in the immediate number of words learned, and in the number of words the learners transfer to other contexts?

3. If there are differences, what might the source of those differences be in terms of Hulstijn and Laufer's involvement load hypothesis?

4. Can 'deeper processing' through writing be ‘skipped’ by using a CALL program?

5. What can teachers and computer scientists learn from each other?

The two treatments, REAP and in-class instruction, are summarized in the following table. From the point of view of the involvement load hypothesis, there are three major differences. First, in REAP the need is created by the student in his or her self-generated list, whereas in class the teacher decides on the words to be learned. Second, in terms of search, the student is responsible for looking up words in the computer program, whereas in class the teacher assigns group activities for discovery of words. Finally, in terms of 'evaluation', REAP only requires a non-linguistic response to a multiple choice question, whereas in-class activities require evaluation in which meaningful output is created.

The hypothesis, then, is that the in-class learning gains will be far greater than those for REAP. The question is how much better the in-class activities will be. We explore the involvement load hypothesis scores for REAP and in-class activities in the methodology section.



The participants in the two semesters are typical of the heterogeneous classes in intensive English programs in the United States. Recently, the ELI at the University of Pittsburgh has had larger numbers of Arabic-speaking and Korean-speaking learners. Overall, the Arabic-speaking students score lower on overall proficiency tests than the Korean speakers. However, the Arabic speakers are placed in level 4 classes because they have passed level 3 adequately.


The variation in MTELP scores is important. This point will be raised in the discussion section.


The table below summarizes the differences between REAP and the in-class treatment of vocabulary.


In terms of Hulstijn and Laufer's involvement load hypothesis, we can estimate their level of depth of processing 'quantitatively'.

Condition   Need   Search   Evaluation   Total
REAP        2      2        0            4
IN CLASS    1      1        2            4

Hence, both sets of activities have the same total involvement load. However, REAP's involvement load is concentrated in need and search, whereas the in-class load is spread across need and search, with a very strong evaluation component. Hulstijn and Laufer (2001) make no predictions about where the involvement load will fall, but common sense suggests that involvement needs to be spread across the different factors. For this reason, and based on H&L's results, we anticipate that the learners in class will do much better in retaining their words.
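The scoring used for these estimates can be written out as a short sketch. The weights follow the need, search, and evaluation levels inferred from H&L in the background discussion; the dictionary keys and the function name are labels invented here for illustration and do not come from H&L or from REAP.

```python
# Weights (0-2) for each component, as inferred from Hulstijn and Laufer (2001).
# The level names are hypothetical labels chosen for this sketch.
NEED = {"none": 0, "external": 1, "self": 2}
SEARCH = {"none": 0, "provided": 1, "self_lookup": 2}
EVALUATION = {"non_linguistic": 0, "fill_in_blank": 1, "free_production": 2}


def involvement_load(need, search, evaluation):
    """Sum the three component weights into a total involvement load."""
    return NEED[need] + SEARCH[search] + EVALUATION[evaluation]


# REAP: self-generated word list, self look-up, multiple choice response.
reap = involvement_load("self", "self_lookup", "non_linguistic")

# In class: teacher-chosen words, group discovery activities, free production.
in_class = involvement_load("external", "provided", "free_production")

print(reap, in_class)  # 4 4
```

Both conditions sum to 4, which is why the total alone cannot distinguish them; the difference lies in how the load is distributed across the three components.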


The descriptive results are provided below. As can be observed, the supervisor chose some words that were already known to the students.

[Figures: results for Fall 2006 (Fall2006.jpg) and Spring 2007 (Spring2007.jpg)]


Selected References

Allum, P. (2002). CALL and the classroom: the case for comparative research. ReCALL, 14, 146-166.

Barcroft, J. (2004). Effects of sentence writing in second language lexical acquisition. Second Language Research, 20, 303-334.

Barcroft, J. (2006). Negative effects of forced output on vocabulary learning. Second Language Research, 22, 487-497.

Folse, K. S. (2006). The effect of type of written exercise on L2 vocabulary retention. TESOL Quarterly, 40, 273-293.

Hulstijn, J., & Laufer, B. (2001). Some empirical evidence for the involvement load hypothesis in vocabulary acquisition. Language Learning, 51, 539-558.

Juffs, A., Friedline, B. F., Eskenazi, M., Wilson, L., & Heilman, M. (in review). Activity theory and computer-assisted learning of English vocabulary. Applied Linguistics.

Stanovich, K. E. (1986). Matthew effects in reading: Some consequences of individual differences in the acquisition of literacy. Reading Research Quarterly, 21, 360-407.