Difference between revisions of "REAP Study on Word Sense Disambiguation (Summer 2007)"

(Annotated bibliography)
(Findings)
 
(5 intermediate revisions by 2 users not shown)
Line 23: Line 23:
 
=== Abstract ===
 
  
In previous REAP studies, there has been no control over the sense of the word being taught to students. This lack of control could be a problem for the many words that are polysemous (e.g., "bank", "initial", "labor", "prime"). Word sense disambiguation (WSD) techniques may be a means of providing control over word senses in REAP readings, practice exercises, and assessments. However, current WSD technologies have a significant error rate, correctly identifying around two-thirds of senses on average.
+
In previous REAP studies, there was no control over which sense of a word was taught to students and then tested. Although many words convey only a single meaning, many others do not; such words are termed ambiguous words. It is important for REAP to operate at the level of the word-meaning pairs being learned, and not just the words, for several reasons. The most important is to be able to assess learning of the particular meanings of a word that the student was exposed to. The second is to personalize and adapt the tutoring material in order to expose the student to all, or a particular set, of the meanings of a word. These observations motivate the study of word sense disambiguation (WSD) for supporting vocabulary learning in a tutoring system.
  
In this study, students will be randomly assigned to control or treatment conditions. Students in the treatment condition will receive readings and practice materials that are matched, using WSD, with the target senses of vocabulary words. These senses will be chosen by course teachers (for example, a course teacher would likely want the sense of "prime" that means, roughly, "of great importance or first in rank" rather than the sense related to mathematics, as in "prime numbers"). The set of words on which REAP will provide instruction will be constructed from words that have multiple senses. Data from previous studies will also be examined to constrain the set to words that students have found difficult (i.e., on which they have performed poorly even after receiving instruction).
+
Ambiguous words can be categorized as polysemes or homonyms. Polysemes are words that can convey multiple related meanings (e.g., "branch"), whereas homonyms are words that can convey multiple distinct meanings (e.g., "bark"). In this study we concentrate on homonyms for two reasons. First, distinguishing between related senses of a word is a highly subjective task: it has been shown that agreement between human annotators is very low on this task. Second, we believe that ESL students can transfer their knowledge of one sense of a word to another related sense without much difficulty, especially in a context-based learning setup. However, we hypothesize that learners are not able to do so for homonyms, and thus assistance should improve learning.
 +
 
 +
Thus, this user study will test whether automatic disambiguation of homonyms can have a positive effect on ESL vocabulary learning.
  
 
=== Glossary ===
 
Line 32: Line 35:
 
=== Research question ===
 
  
Does automatically matching the sense of vocabulary words to the target sense chosen by teachers improve learning?
+
What is the effect on ESL vocabulary learning of automatically disambiguating homonyms that occur in the reading material?
  
 
=== Dependent variables ===
 
Line 44: Line 47:
 
=== Hypotheses ===
 
  
For certain polysemous words with distinct senses (e.g., "prime"), matching of training materials to target senses will improve robust learning measures.
+
For certain ambiguous words with distinct senses, i.e., homonyms (e.g., "bark"), matching of training materials to target senses will improve robust learning measures.
  
 
=== Findings ===
 
  
 +
The null hypothesis that learning outcomes with and without word sense disambiguation are equivalent was rejected with a p-value of 0.001. The means and standard deviations for the experimental group (students using the WSD-enabled version of REAP) and the control group are (M = 0.8, SD = 0.404) and (M = 0.5, SD = 0.505), respectively.
  
 
=== Explanation ===
 
Line 56: Line 60:
 
=== Annotated bibliography ===
 
  
Kulkarni, A., Heilman, M., Eskenazi, M., and Callan, J. (accepted). Word Sense Disambiguation for Vocabulary Learning. Ninth International Conference on Intelligent Tutoring Systems.
+
Kulkarni, A., Heilman, M., Eskenazi, M., and Callan, J. (2008). Word Sense Disambiguation for Vocabulary Learning. Ninth International Conference on Intelligent Tutoring Systems.

Latest revision as of 13:57, 24 April 2008

REAP Study on Word Sense Disambiguation (Summer 2007)

Logistical Information

Contributors: Maxine Eskenazi, Alan Juffs, Anagha Kulkarni, Jamie Callan, Michael Heilman
Study Start Date: May 2007
Study End Date: July 2007
LearnLab Courses: English Language Institute Reading 4 & 5 (ESL LearnLab)
Number of Students: ~45
Total Participant Hours (est.): ~250
Data in DataShop: no

Abstract

In previous REAP studies, there was no control over which sense of a word was taught to students and then tested. Although many words convey only a single meaning, many others do not; such words are termed ambiguous words. It is important for REAP to operate at the level of the word-meaning pairs being learned, and not just the words, for several reasons. The most important is to be able to assess learning of the particular meanings of a word that the student was exposed to. The second is to personalize and adapt the tutoring material in order to expose the student to all, or a particular set, of the meanings of a word. These observations motivate the study of word sense disambiguation (WSD) for supporting vocabulary learning in a tutoring system.

Ambiguous words can be categorized as polysemes or homonyms. Polysemes are words that can convey multiple related meanings (e.g., "branch"), whereas homonyms are words that can convey multiple distinct meanings (e.g., "bark"). In this study we concentrate on homonyms for two reasons. First, distinguishing between related senses of a word is a highly subjective task: it has been shown that agreement between human annotators is very low on this task. Second, we believe that ESL students can transfer their knowledge of one sense of a word to another related sense without much difficulty, especially in a context-based learning setup. However, we hypothesize that learners are not able to do so for homonyms, and thus assistance should improve learning.

Thus, this user study will test whether automatic disambiguation of homonyms can have a positive effect on ESL vocabulary learning.

Glossary

Research question

What is the effect on ESL vocabulary learning of automatically disambiguating homonyms that occur in the reading material?

Dependent variables

Independent variables

Hypotheses

For certain ambiguous words with distinct senses, i.e., homonyms (e.g., "bark"), matching of training materials to target senses will improve robust learning measures.

Findings

The null hypothesis that learning outcomes with and without word sense disambiguation are equivalent was rejected with a p-value of 0.001. The means and standard deviations for the experimental group (students using the WSD-enabled version of REAP) and the control group are (M = 0.8, SD = 0.404) and (M = 0.5, SD = 0.505), respectively.
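
The page does not say which statistical test was used, how many observations were in each group, or what the unit of observation was. As a rough illustration only, the sketch below shows how a two-group comparison could be computed from summary statistics like the ones reported, using Welch's t statistic and a normal approximation for the p-value. The group size `N_PER_GROUP` is a hypothetical placeholder, not a figure from the study, and the function names are made up for this example. Note also that the reported standard deviations are close to sqrt(p(1 - p)) for a 0/1 outcome with mean p (0.4 for p = 0.8, 0.5 for p = 0.5), which may suggest that scores were binary; that reading is an inference, not something the page states.

```python
import math
from statistics import NormalDist


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return Welch's t statistic and its approximate degrees of freedom,
    computed from per-group means, standard deviations, and sizes."""
    var_a = sd_a ** 2 / n_a  # squared standard error of group A's mean
    var_b = sd_b ** 2 / n_b  # squared standard error of group B's mean
    t = (mean_a - mean_b) / math.sqrt(var_a + var_b)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1)
    )
    return t, df


def two_sided_p_normal(t):
    """Two-sided p-value using a normal approximation to the t distribution
    (reasonable only when the degrees of freedom are fairly large)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


# Summary statistics as reported in the Findings section.
M_EXP, SD_EXP = 0.8, 0.404  # experimental group (WSD-enabled REAP)
M_CTL, SD_CTL = 0.5, 0.505  # control group

# HYPOTHETICAL: the page does not report group sizes; this value is an
# assumption chosen only so that the example can run.
N_PER_GROUP = 50

t, df = welch_t(M_EXP, SD_EXP, N_PER_GROUP, M_CTL, SD_CTL, N_PER_GROUP)
p = two_sided_p_normal(t)
print(f"t = {t:.2f}, df = {df:.1f}, approximate two-sided p = {p:.4f}")
```

Because the group size is assumed, the printed values only indicate the kind of calculation involved; they should not be read as a reproduction of the study's analysis.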

Explanation

Descendants

Annotated bibliography

Kulkarni, A., Heilman, M., Eskenazi, M., and Callan, J. (2008). Word Sense Disambiguation for Vocabulary Learning. Ninth International Conference on Intelligent Tutoring Systems.