Providing optimal support for robust learning of syntactic constructions in ESL

PIs: Levin, Frishkoff, De Jong, Pavlik
Faculty: Levin
Postdocs: Frishkoff, De Jong, Pavlik
Others with > 160 hours: n/a

Study 1
Goals: Calibrate linguistic model; explicit instruction
Start date: March 22, 2007
End date: April 6, 2007
Learnlab: ESL
Number of participants: 17 ELI; 17 native English
Total Participant Hours: 100
Datashop: expected date 7/15

Study 2
Goals: Calibrate linguistic model; no explicit instruction
Start date: April 2, 2007
End date (est.): April 29, 2007
Learnlab: ESL
Number of participants (est.): 20 ELI; 20 native English
Total Participant Hours (est.): 100
Datashop: expected date 8/15


The goal of this project is to examine how second-language learners acquire context-appropriate use of syntactic constructions. In some cases, learning when to use a syntactic construction is straightforward. In other cases, use of a syntactic construction is seldom mastered, even by advanced students. The main challenge, in such cases, is to learn which contextual cues predict the occurrence of a particular form (for example, I am here for two years vs. I have been here for two years). That is, students must learn the “meanings” or functions of grammatical constructions, in order to use them in appropriate contexts.

This project focuses on acquisition of the dative alternation (give someone a book vs. give a book to someone) by ESL students. Recently, Bresnan and associates (Bresnan & Hay, 2006; Bresnan & Nikitina, 2003; Bresnan, Cueni, Nikitina, & Baayen, 2005) identified 14 features that combine to form a logistic regression model of the dative alternation for native English speakers. Using this model as a blueprint, we propose to develop a student-centered, cognitive–linguistic model that will determine optimal scheduling of example texts to support acquisition of the dative alternation in ESL. We propose to use a constraint-based model (the Bresnan model), combined with a model of learning (the Pavlik model), to select training examples and present them in a way that is hypothesized to result in optimal refinement, transfer, and retention of proficiency with the dative alternation. This approach will comprise the following steps:

(A) The native-speaker model will constitute the target for ESL acquisition of the dative alternation.
(B) We will calculate distances between student performance and the native speaker model.
(C) We will select training items that maximize learning, i.e., that reduce the distance between the ESL student models and the native-speaker model.
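Steps (B) and (C) can be illustrated with a minimal sketch. It assumes that both the native-speaker model and a student model are summarized as one weight per factor, that distance is Euclidean, and that the next training item targets the most discrepant factor. The weight values, the factor labels, and both function names are hypothetical and are not taken from the project.

```python
import math

# Hypothetical per-factor cue weights; the numbers are illustrative, not fitted values.
native_weights = {"theme": 1.2, "recipient": -0.9, "length": 0.7, "concreteness": 0.4}
student_weights = {"theme": 0.3, "recipient": -0.8, "length": 0.1, "concreteness": 0.4}

def distance(student, native):
    """Euclidean distance between two weight vectors (step B, assumed metric)."""
    return math.sqrt(sum((student[k] - native[k]) ** 2 for k in native))

def most_discrepant_factor(student, native):
    """Factor whose weight is farthest from the native target (crude stand-in for step C)."""
    return max(native, key=lambda k: abs(student[k] - native[k]))

current_distance = distance(student_weights, native_weights)
next_factor = most_discrepant_factor(student_weights, native_weights)
print(round(current_distance, 3), next_factor)
```

In the project itself, item selection is meant to come from the learning model rather than from a single largest-gap rule; the sketch only shows the shape of the comparison.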


The stimuli for the experiment are based on Switchboard, which is a corpus of telephone conversations between native speakers of English. Switchboard was recorded by the Linguistic Data Consortium in the early 1990s for the purpose of speech research. The Bresnan team selected 2360 sentences from the Switchboard corpus that contain agent, theme, and recipient arguments. For each sentence, they annotated the values of fourteen features (is the action concrete or abstract, is the theme a pronoun, is the recipient a pronoun, etc.). A logistic regression model trained on the features has 92% accuracy in predicting whether the native speaker chose the double object or prepositional option in each sentence. We want to know whether we can get non-native speakers to make the same choice that the native speaker made. However, we can't show the original dialogue to the non-native speaker because it contains disfluencies, culture-specific information, and advanced vocabulary. We are therefore adapting the Switchboard sentences for ESL students, taking care not to change the values of any of the 14 features. Following is an example of a Switchboard segment and the adapted version for our experiment. The subjects make a forced choice between the double object and prepositional options.


Original Switchboard segment: Well, the thing I think that annoys me the most is, I have, I have young children, a baby in the house and, and inevitably as soon as they're asleep, someone calls on the phone trying to sell me something.


Adapted version: I have young children at home. As soon as they are asleep at night, someone calls on the phone, trying to sell

  • me something. (NP-NP or "Double Object" Construction)
  • something to me. (NP-PP or "Prepositional" Construction)


Novelty of the project: Almost all previous studies of the dative alternation in ESL have focused on the form of the dative alternation, rather than whether it is used appropriately in context. Our study will be the first to:

  • Teach the use (meaning or function) of the dative alternation;
  • Provide an implemented model of how the dative alternation is learned;
  • Test whether models of native speaker use can serve as the basis for effective instruction; and
  • Augment the native speaker models with a model of how the dative alternation is learned.

In the course of this study, the Pavlik and Anderson model will be applied to a new area: the interaction of multiple features in second language acquisition. We will build a framework (theory, model, and tools) that can be re-used in subsequent studies of syntactic constructions that involve multiple features.

Relevance for Second-Language Learning Research: While the dative alternation itself represents a relatively small part of the English grammar, it is one instance of a broader (and relatively productive) class of constructions known as resultatives, which include non-prototypical uses of verbs like “sneeze” to express change of state (e.g., “He sneezed the letter across the table”; cf. Goldberg, 1995). Resultative constructions have received considerable attention among linguists (see Goldberg & Jackendoff, 2004 for a recent review), particularly in recent research on grammar acquisition (e.g., Goldberg & Casenhiser, 2004, 2005). Still more broadly, the dative alternation involves transitivity relations (also referred to as voice, or diathesis), which are represented cross-linguistically using several important syntactic devices (Givón, 1984). Voice markers are often associated with complex meanings and functions (cf. Frishkoff, 1997), and acquisition of grammatical voice in some languages occupies a large part of the curriculum. Therefore, although our selection of the dative may suggest a narrow focus, our studies should have widespread implications for acquisition of transitivity and diathesis relations in many (perhaps all) languages.

We view this project as seeking to establish “proof of concept,” which will justify work in other domains of grammar learning using this same approach. Our goal is to tune our procedures to our findings, particularly with respect to the degree to which training items affect student behavior, and the proportional effects of student performance vs. experience. After we tune our procedures for the dative alternation, we will extend our methods to address errors in the use of articles and tense-aspect constructions.

Contribution to the theory of robust learning: This project relates to refinement and fluency in the learning process. Our hypothesis is that we can strengthen and refine feature representations to approximate native speaker competence. The instructional goals of this project focus on long-term retention and transfer. Following the Pavlik and Anderson model, training items are selected and spaced to optimize long-term gain. In addition, we will test transfer from trained to untrained items, from comprehension to production, from prototypical to less prototypical exemplars, and from correct use of the dative alternation to correct use of other syntactic constructions that rely on the same or similar linguistic cues.


Glossary

Alternation Pair
A pair of sentences with the same verb, agent, recipient, and theme, where one sentence of the pair is a double object construction and the other is a prepositional dative construction:
  • Gretchen sent her a form.
  • Gretchen sent a form to her.
Argument
A participant in the action described by the verb. The sentence I gave a book to him has three arguments: I, book, and him.
Bresnan Model
A logistic regression model including fourteen features that predicts whether native English speakers will use the double object or prepositional variant of a dative sentence.
Dative Alternation
There are two ways to express sentences that contain agent, recipient, and theme arguments. In the double object construction, the recipient comes first, followed by the theme, with no prepositions. In the prepositional dative construction, the theme comes first, followed by the preposition to and then the recipient.
  • Gretchen sent her a form. (double object)
  • Gretchen sent a form to her. (prepositional dative)
Double object construction or NP NP construction
In the double object construction, the recipient argument comes first, followed by the theme argument with no prepositions.
  • Gretchen sent her a form.
  • This music gives me a headache.
  • The light gave her features a healthy glow.
NP NP Construction
See Double object construction
NP PP Construction
See Prepositional dative construction
Prepositional dative construction or NP PP construction
In the prepositional dative construction, the theme comes first, followed by the preposition 'to' and then the recipient.
  • Gretchen sent a form to her.
  • The teacher told a story to the children.

The following sentence is also an instance of the prepositional dative construction, but it has undergone an alternation called Heavy NP Shift. The recipient retains its preposition and the theme moves to the end of the sentence because it is long (heavy).

  • Gretchen sent to her a long form containing many confusing questions.
Recipient
The argument that receives something in an abstract or concrete way. In these examples, the recipient argument is her:
  • Gretchen sent her a form.
  • Gretchen sent a form to her.
Syntactic Construction
A recognizable configuration of words and morphemes (prefixes and suffixes). Constructions may contain fixed expressions (What a ADJ NOUN!, What a nice dress!). They may also be normal parts of the syntax of the language, such as using a noun phrase before a verb phrase to make a predication (The girl ran). Constructions have a form and a meaning or use. For example, ESL textbooks may have lessons on when to use the present perfect (have Verb-ed, I have eaten).
Theme
The argument that is moved, given, or communicated to the recipient in an abstract or concrete way. In these examples, the theme argument is a form:
  • Gretchen sent her a form.
  • Gretchen sent a form to her.

Research Questions

1. Our central research question concerns the learning of complex form-meaning mappings. For example, the dative shift involves fourteen meaning-related features (knowledge components) with different weights (cue strengths). We want to know whether a model of native speaker behavior can be used as a target for non-native speaker behavior and what instructional interventions (learning events) will bring non-native speakers closer to that target.
2. Our second question concerns the effects of explicit instruction in model-based grammar learning. DeKeyser (1994) has suggested that adult grammar learning is critically dependent on explicit instruction and attentional cueing (cf. Morris & Ortega, 2000; Hulstijn, 1989). The present research uses a between-subjects design to examine the effect of explicit instruction in the use of the dative alternation. Study 1 includes two types of explicit instruction: block-level rule instruction and explanatory (rule-based, corrective) feedback after incorrect responses. By contrast, in Study 2, we are examining whether adult language learners will learn the "rules" for correct use of the dative alternation in the absence of explicit instruction.


We propose the following hypotheses:

(A) Learning will be more effective when training examples are selected to maximize the strength of model cues (high-contrast versus low-contrast examples).
(B) An algorithm that selects training items on the basis of student performance, as well as on the basis of training history, will lead to better performance than one that selects training items based on the history of training alone (optimized scheduling).
(C) Selection of examples based on student performance will be superior in:
  • transfer from trained to untrained items
  • transfer from comprehension to production
  • transfer from prototypical to less prototypical examples, provided they are presented at the right times, with the right frequencies
  • transfer to acquisition of new syntactic constructions, which share certain "rules" or "regularities" with the dative alternation

Studies 1-2 will calibrate the learning model. The effects of learning will be measured in terms of feature weights that characterize the student's responses. We will measure the size of the effect that is caused by exposure to specific types of examples. These two studies will also measure medium-term forgetting between blocks of stimuli.

Comparison of Studies 1 and 2 will reveal effects of explicit instruction (Study 1 -- presentation of "rules" for cue-response mappings, explanatory feedback on incorrect trials), versus implicit pattern learning (Study 2 -- no grammar explanations or explanatory feedback).

Study 3 will test whether trial selection by the learning model leads to native-like behavior. The learning model uses information about which trials have been presented as well as the students' performance. In the control condition, trials are selected based on the history of practice only, not on student performance (i.e., the performance sensitivity in the model will be turned off). In other words, in the control condition, the trial-selection is model-based, but not personalized, as in a classroom where all students see examples in the same order. It is expected that both conditions will result in learning, but that there will be higher gains in the experimental condition than in the control condition.

A follow-up session will test several types of transfer. Transfer from trained to untrained items is intrinsic to each of our study designs. The second type of transfer (from comprehension to production) will be tested by including a posttest for production, in which students put sentence constituents into the preferred word order. The third type of transfer will be tested by including a number of test items with non-prototypical features. Some features are binary, such as pronominality (the book is non-pronominal, whereas it is pronominal), while other features can be prototypical or non-prototypical. For example, in a context where only a house and a man are mentioned, the house is highly accessible in subsequent context, whereas, for instance, the clouds is not. However, in the same context, the kitchen and his wife have intermediate accessibility because they are implied by the mention of a house and a man. Finally, in a future study, we plan to test how learning correct usage of the dative alternation will affect usage of other grammatical constructions, including closely related constructions such as the Benefactive (e.g., "I fixed him a plate of spaghetti" vs. "I fixed a plate of spaghetti for him"), and more dissimilar alternations, such as the Resultative ("bag the groceries" vs. "put the groceries in a bag"; "spray paint on the wall" vs. "spray the wall with paint").

Experiment Methods

Dative Model
The logistic regression model proposed by Bresnan and Nikitina (2003) includes 14 linguistic (syntactic, semantic, and discourse-pragmatic) variables that account for native English speaker use of the dative alternation (model accuracy, ~92%). We applied Principal Components Analysis (PCA) to obtain a smaller set of variables that would be more amenable to experimental manipulation. The input to the PCA consisted of 2360 rows × 14 columns, where columns represent the 14 linguistic variables in the original Bresnan model, and rows are speech samples from the Switchboard corpus that include examples of the dative alternation (either an NP-NP or NP-PP construction for each sample). The data were transformed into a 14 × 14 correlation matrix, which was decomposed using PCA with varimax rotation. The resulting Pattern Factor Matrix showed a sensible clustering of variables. Givenness, Definiteness, and Pronominality of the Theme loaded on Factor 1 (variance accounted for ~23%). Givenness, Definiteness, and Pronominality of the Recipient loaded on Factor 2 (variance accounted for ~15%). Concreteness (vs. abstractness) of the Theme and verb semantics loaded on Factor 3 (~9% variance). Relative length of the Theme and Recipient split across Factors 1 and 2 in the first analysis. The four variables that had the smallest contribution were dropped from the second analysis, resulting in a new 5-factor structure, where length loaded separately on Factor 4, and grammatical Person of the Recipient loaded uniquely on Factor 5. The first four factors were selected for manipulation in Studies 1 and 2.
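The decomposition described above (correlation matrix, principal components, varimax rotation) can be sketched roughly as follows. This is an illustration under stated assumptions, not the analysis script used in the project: the data are synthetic (500 random rows standing in for the 2360 annotated samples), the number of retained components (3) is arbitrary, and the function name varimax and all variable names are our own.

```python
import numpy as np

def varimax(loadings, gamma=1.0, max_iter=100, tol=1e-6):
    """Orthogonal varimax rotation of a (variables x components) loading matrix."""
    p, k = loadings.shape
    rotation = np.eye(k)
    previous = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        target = rotated ** 3 - (gamma / p) * rotated @ np.diag(np.sum(rotated ** 2, axis=0))
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        criterion = s.sum()
        if criterion < previous * (1 + tol):
            break  # rotation criterion has stopped improving
        previous = criterion
    return loadings @ rotation

# Synthetic stand-in for the annotated corpus: 14 variables driven by 3 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3))
mixing = rng.normal(size=(3, 14))
data = latent @ mixing + rng.normal(scale=0.5, size=(500, 14))

corr = np.corrcoef(data, rowvar=False)       # 14 x 14 correlation matrix
eigvals, eigvecs = np.linalg.eigh(corr)      # eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]            # re-sort, largest first
n_keep = 3                                   # arbitrary choice for this sketch
loadings = eigvecs[:, order[:n_keep]] * np.sqrt(eigvals[order[:n_keep]])
rotated = varimax(loadings)
explained = eigvals[order[:n_keep]] / corr.shape[0]  # proportion of variance per component

print(np.round(explained, 2))
print(np.round(rotated, 2))
```

Variables that load strongly on the same rotated column are the ones that would be grouped into a single factor, which is how clusters such as the givenness, definiteness, and pronominality of the theme would show up.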
Stimulus development
The Bresnan corpus consists of 2360 examples from the Switchboard corpus, which have been annotated for each of the 14 Bresnan model variables. For development of experiment stimuli, we selected 12 samples for each of 16 experiment conditions (see Independent Variables for details). The original 192 speech samples were modified (shortened, corrected, simplified in grammar and word choice), taking care not to affect linguistic variables, such as givenness and length.
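The stimulus count can be checked with a short enumeration. The sketch assumes that the 16 conditions are the full crossing of 4 factors, 2 contrast levels, and 2 response types, as the Study 1 design description suggests; the factor labels are shorthand introduced here.

```python
from itertools import product

factors = ["theme", "recipient", "length", "concreteness"]  # shorthand labels (assumed)
contrasts = ["high", "low"]
responses = ["NP-NP", "NP-PP"]

# Every combination of factor, contrast level, and preferred response.
conditions = list(product(factors, contrasts, responses))
samples_per_condition = 12
total_samples = len(conditions) * samples_per_condition

print(len(conditions), total_samples)
```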
Study Participants
Study participants are volunteers, recruited from Levels 3-5 (intermediate level) courses at the English Language Institute (ELI), University of Pittsburgh and native English speaking subjects, recruited from the Reading & Language Lab database. Native English speakers are included to test the accuracy of the reduced (4-Factor) model. Also, consistent with Hypothesis (A), we expected native-speaker task performance to be higher and less variable in the high-contrast vs. low-contrast condition.
Experiment Design & Protocol
Participants completed two sessions, scheduled one week apart. Subjects were paid for their participation ($15/hour plus an additional amount that was contingent on task performance, averaging ~$5-8).
Session I. Prior to Session I, participants completed a Language History Questionnaire and read and signed a consent form. They then completed a sequence of 8 blocks (16 trials per block). On each trial, they were presented with a context (one or two short sentences), which ended with a ditransitive verb, followed by two alternative completions (an NP-NP and an NP-PP structure). Subjects selected the best completion by pressing the '1' or '2' key on the keyboard (response mapping randomized). A trial counter at the top of each screen displayed the subject's running accuracy (number correct/number of trials completed). ELI subjects took approximately 1.5-2 hours to complete Session I, and native English speakers took ~1-1.5 hours.
Session II. In Session II (one week later), subjects completed another 4 blocks (64 trials). At the end of the task, they completed a 3-page questionnaire that was designed to test learning and retention of the grammar rules that were introduced in the first session. Subjects completed Session II in 1-2 hours.
(A) Study 1: Explicit Instruction
Study 1 is in progress. The design for Study 1 uses two sessions separated by a long-term interval of a week. During these sessions we mix practice for the 4 dative factors, 2 contrast levels, and 2 responses, using 16 trials per block. Session 1 is composed of 8 blocks and Session 2 is composed of 4 blocks.
Session 1, Blocks 1-2: No prior introduction to rules; accuracy feedback only on each trial
Session 1, Blocks 3-4: 4 rules introduced prior to Block 3; accuracy feedback only on each trial
Session 1, Blocks 5-6: Accuracy feedback on each trial; explanatory (rule-based) feedback after incorrect trials
Session 1, Blocks 7-8: Accuracy feedback only on each trial
Session 2, Blocks 1-2: Accuracy feedback only on each trial
Session 2, Blocks 3-4: Accuracy feedback on each trial; explanatory (rule-based) feedback after incorrect trials
(B) Study 2: Implicit Pattern Learning (No Explicit Instruction)
Study 2 is also in progress. The design for this study is the same as for Study 1, with two differences: (1) there is no explicit, rule-based instruction, and (2) there is no explanatory (rule-based) feedback on incorrect trials. Thus, all 8 blocks are the same, with accuracy feedback alone given on each trial. This design supports implicit pattern learning.
Learning Model
The model of learning that will be used to interpret these data is already being used for optimized scheduling in other projects (e.g., Optimizing the practice schedule). This model is a version of the ACT-R declarative memory model, a mathematical model consisting of a system of equations that describe expected performance as a function of the history of practice. Applying this model to our data is a multi-step process:
  • Describe a structural knowledge component model (e.g., 4 knowledge components, one for each factor).
  • Use this structural model to determine how the history maps to performance for each item (e.g., performance for each item may be a compensatory function of the item features specifying the 4 factors and the history of practice with each factor).
  • Given this model structure and mapping to history, optimize parameters such as:
    • learning rate for implicit items
    • learning rate for explicit items
    • learning from instructions
  • Analyse the fitted model to determine how to optimize the long-term learning gain per second of practice.
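As a rough illustration of the kind of equations involved, the sketch below follows the activation-based account in Pavlik and Anderson (2005): each practice leaves a trace that decays as a power function of time, the decay rate of a trace depends on the activation at the moment of that practice, and recall probability is a logistic function of activation. The parameter values (c, a, tau, s) are placeholders rather than fitted values, the published model's scaling of time between sessions is omitted, and the function names are our own.

```python
import math

def activation(practice_times, now, decays):
    """Log of summed power-law traces; -inf if nothing has been practiced yet."""
    total = sum((now - t) ** -d for t, d in zip(practice_times, decays) if now > t)
    return math.log(total) if total > 0 else float("-inf")

def decay_rates(practice_times, c=0.2, a=0.18):
    """Decay for each practice: d_i = c * exp(m_{i-1}) + a (placeholder c and a)."""
    decays = []
    for i, t in enumerate(practice_times):
        m_prev = activation(practice_times[:i], t, decays)  # activation just before practice i
        decays.append(c * math.exp(m_prev) + a)
    return decays

def recall_probability(m, tau=-0.7, s=0.25):
    """Logistic mapping from activation to probability correct (placeholder tau and s)."""
    x = (tau - m) / s
    if x > 700:  # avoid overflow in exp for very low activation
        return 0.0
    return 1.0 / (1.0 + math.exp(x))

# Three practices, massed (10 s apart) versus spaced (200 s apart), tested one day later.
massed = [0.0, 10.0, 20.0]
spaced = [0.0, 200.0, 400.0]
delay = 86400.0
m_massed = activation(massed, massed[-1] + delay, decay_rates(massed))
m_spaced = activation(spaced, spaced[-1] + delay, decay_rates(spaced))

print(m_massed, recall_probability(m_massed))
print(m_spaced, recall_probability(m_spaced))
```

With these placeholder parameters the spaced schedule ends with higher activation after the delay, which is the spacing effect that the scheduling algorithm exploits. Extending this to the dative task would require the structural step listed above, for example one activation per factor combined into a single item-level prediction.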

Independent variables

For Studies 1-2, the main between-subjects variable is language background. ELI students are recruited from Levels 3-5 (Grammar courses). Native-English speaking subjects are recruited from the Perfetti Reading & Language Lab (RLL) database.

Within-subjects variables are factors, contrast, and time (session, half of session, and half of block).

  • Factor: Using principal components analysis, the fourteen features of the Bresnan model were reduced to four factors (see Methods for details).
    • Factor 1: Definiteness, pronominality, and discourse accessibility (givenness) of the theme.
    • Factor 2: Definiteness, pronominality, and discourse accessibility (givenness) of the recipient.
    • Factor 3: Ratio of lengths of the theme and recipient noun phrases.
    • Factor 4: Abstractness vs. concreteness of the action and of the theme argument.
Each factor has two values, which we can call plus and minus: e.g., the action and the theme are abstract (pay attention to someone) or concrete (give something to someone).
  • Contrast: If the Bresnan model assigns a score close to zero, both the double object and prepositional variants of the sentence are generally acceptable (send American Express a check vs. send a check to American Express). If the Bresnan model assigns a score farther from zero, one of the two options will be highly preferable (give it to anyone who comes in vs. give anyone who comes in it).
  • Time (session, half of session, and half of block): The first experiment consists of two sessions (eight blocks in the first session and four in the second), with each block having two halves.
    • First half of first session: One block for each factor, with the plus value for the factor in one half of the block and the minus value of the factor in the other half.
    • Second half of first session: An additional block for each factor (medium-term forgetting).
    • Second session: One more block for each factor (longer-term forgetting).
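The Contrast variable can be made concrete with a small sketch. It assumes that the model score is the log-odds from the logistic regression, that positive scores favor the prepositional (NP-PP) variant, and that an absolute score of 2.0 separates high from low contrast. The sign convention, the threshold, and the function names are all assumptions made for illustration.

```python
import math

def probability_np_pp(score):
    """Convert a log-odds score to a probability via the logistic function."""
    return 1.0 / (1.0 + math.exp(-score))

def contrast_level(score, threshold=2.0):
    """Label an item by how far its score lies from zero (hypothetical threshold)."""
    return "high" if abs(score) >= threshold else "low"

for score in (0.2, -3.5, 4.0):
    print(score, round(probability_np_pp(score), 3), contrast_level(score))
```

A score near zero corresponds to a probability near 0.5, where both variants are acceptable; a score far from zero in either direction corresponds to a strong preference for one variant.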

Study 1 (but not Study 2) includes two variables that reflect explicit instruction at the Block level (Explicit Rule-Based Instructions) and at the Trial level (Explanatory Feedback on incorrect trials).

  • Explicit (Rule-Based) Instructions: Relevant concepts, such as theme, receiver, and concreteness, are introduced. In addition, for each Factor (1-4), subjects are presented with rules that determine when the THEME comes before the RECEIVER, and when the RECEIVER comes before the THEME.
  • Explanatory (Rule-Based) Feedback: When subjects select the incorrect response, they are told the correct response (Accuracy Feedback). In addition, they are reminded of the relevant rule:
'If the RECEIVER is longer than the THEME, then the RECEIVER often comes LAST'

Study 3 will include an additional manipulation: some training examples will be chosen based on student performance (the student's feature weights), and some training stimuli will be based on history only (i.e., which stimuli have already been presented).

Example screenshot of instructional event presentation: Examplescreen2FaCT.JPG

Dependent variables

  • Accuracy on a forced-choice (NP-NP vs. NP-PP) task
    • Learning (improvement in accuracy from early to later trials)
    • Long-term retention (accuracy at the beginning of Session 2 compared with accuracy at the beginning of Session 1)
    • Transfer: comprehension to production; prototypical to nonprototypical
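The learning and retention measures listed above reduce to differences in accuracy between sets of trials. A minimal sketch, using made-up response vectors (True for a correct choice) and a hypothetical helper named accuracy:

```python
def accuracy(trials):
    """Proportion of correct forced-choice responses."""
    return sum(trials) / len(trials)

# Made-up responses for one participant; real data would hold 16 trials per block.
session1_early = [True, False, False, True]
session1_late = [True, True, True, False]
session2_early = [True, True, False, True]

learning = accuracy(session1_late) - accuracy(session1_early)
retention = accuracy(session2_early) - accuracy(session1_early)

print(learning, retention)
```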


Findings

  • Pilot study
In this study we tested the stimuli adapted from the Bresnan corpus with native English speakers. The goal was to confirm that there were no problems with specific stimuli and to verify that native English speaker preferences for NP-NP or NP-PP constructions were consistent with the Bresnan model categorization of these items. Nineteen subjects completed this test. A repeated measures comparison (factor by contrast) revealed significant differences as a function of factor (F=7.5, p<.001) and contrast (F=110, p<.001). The very strong contrast effects in the data show that the stimulus selection and creation procedures resulted in stimuli that retain the differences the model predicts should occur.
  • Study 1 (Model Calibration, Explicit Instruction)
Study 1 data collection and analyses are in progress. Results from 16 NS and 12 ESL participants are shown in Figure 1 (below).
Study 1 Mean Accuracy
Figure 1. Study 1 (Explicit Instruction). Dotted line separating Blocks 2 and 3 indicates presentation of grammar rules. Transparent green shading overlaying Blocks 4-5 and 9-10 indicates use of explanatory (rule-based) feedback on incorrect trials. Solid black line indicates 1-week delay between Sessions 1 and 2.
  • Study 2 (Model Calibration, Implicit Learning)
Study 2 data collection and analyses are also in progress. Results from 12 NS and 10 ESL participants are shown in Figure 2 (below).

Study 2 Mean Accuracy
Figure 2. Study 2 (Implicit Instruction). Solid black line indicates 1-week delay between Sessions 1 and 2.
  • Study 3
Study 3 will be completed after development of the Pavlik learning model, based on the results from Studies 1-2.

Explanation & Discussion

This research program seeks to increase fluency by adjusting cue strengths (weights of the fourteen features of the Bresnan model). If the model is an accurate predictor of native-speaker performance in our task, then model-based presentation of stimuli (i.e., high-contrast examples) should lead to improved learning and long-term retention. We are testing these predictions in Studies 1-2, where the main goal is to calibrate the linguistic model, allowing us to fit the model parameters separately for ELI learners. The Pavlik-Anderson model predicts the number and spacing of training items needed to change cue strength to achieve long-term retention. In Study 3, we will test the efficacy of this model for determining optimal presentation of stimuli to support ESL grammar acquisition.

Pilot results validated the accuracy of the Bresnan model for predicting native-speaker use of the dative alternation as a function of the 4 factors: high-contrast stimuli, selected to bias native-speaker judgments towards a particular response, elicited faster and more accurate responses than low-contrast stimuli. Based on these data, we were also able to filter out examples that were problematic for various reasons.

Results from Studies 1-2 (Figs. 1-2) are also promising: native English speaking (NS) subjects performed close to ceiling, whereas English language learners (ESL subjects) showed an increase in performance across blocks. In addition, task performance was influenced by Contrast, consistent with the model predictions. Interestingly, there was a marked decrease in ESL task performance in Block 5, with the introduction of explanatory (rule-based) feedback. There are several possible explanations for this pattern. One possibility is that feedback cued learners to focus on a particular cue or cues. However, the next trial would be likely to represent a different cue (Factor), requiring participants to switch attention. Mixed designs, in general, may promote this kind of attentional switching. Ongoing analyses are examining evidence for attentional switching. In general, an important issue for future research may be to understand effects of mixing versus blocking of like stimuli. It is possible that while blocked designs eliminate attentional "switch costs," mixed designs may promote more flexible and robust learning. Contingent on funding, future studies will examine these questions more directly.

In Study 2 our goal was to measure learning in the absence of explicit instruction and rule-based feedback. Analysis of results from Studies 1-2 together suggests there was a beneficial effect of providing explicit grammar instructions prior to blocks 3-4 (Fig. 3). Note the benefit is only observed for low-contrast examples, possibly because performance on high-contrast examples was approaching ceiling.

Study 1-2 results combined
Figure 3. Blocks 3-4, mean accuracy (ESL participants only). Results for Studies 1-2 combined.

In a follow-up study (in development) we will implement a new design that will allow us to test the hypothesis that training on blocks of examples for one factor will benefit performance on the same factor more than it will benefit performance on a different factor. To test this idea, we will compare results across consecutive blocks representing the same versus different factors. The motivation for this study is to refine our understanding of what is learned across blocks (e.g., knowledge of specific rules versus something more general related to English grammar or task demands). We will also be developing more fine-grained assessments, including pre- and post-test measures to evaluate skill in making grammaticality judgments and specific knowledge of rules and principles that are relevant for use of the dative alternation. Finally, Study 3 (anticipated completion by end of July 2007) will implement the Pavlik-Anderson model, to test the prediction that training items selected on the basis of student performance, as well as on the basis of training history, will lead to better performance than training items based on the history of training alone (optimized scheduling).

Annotated Bibliography

Bresnan, J. & Hay, J. (2006). Gradient grammar: An effect of animacy on the syntax of give in varieties of English. [draft downloaded from]

Bresnan, J. & Nikitina, T. (2003). On the gradience of the dative alternation. [draft downloaded from]

Bresnan, J., Cueni, A., Nikitina, T., & Baayen, R. H. (2005). Predicting the dative alternation. Paper presented at the KNAW Academy Colloquium: Cognitive Foundations of Interpretation, Amsterdam.

DeKeyser, R. M. (1994). How implicit can adult second language learning be? AILA Review, 11, 83–96.

Dienes, Z. & Perner, J. (1999). A theory of implicit and explicit knowledge. Behavioral and Brain Sciences, 22(5), 735–755.

Goldberg, A. E. (1995). Constructions: A construction grammar approach to argument structure. Chicago: University of Chicago Press.

Goldberg, A. E. & Jackendoff, R. (2004, to appear). The English resultative as a family of constructions. Language.

Hulstijn, J. (1989). Implicit and incidental second language learning: Experiments in the processing of natural and partly artificial input. In H. W. Dechert & M. Raupach (Eds.), Interlingual processes. Tübingen: Gunter Narr Verlag.

Inagaki, S. (1997). Japanese and Chinese learners' acquisition of the narrow-range rules for the dative alternation in English. Language Learning, 47, 637-669.

Marefat, H. (2005). The impact of information structure as a discourse factor on the acquisition of the dative alternation by L2 learners. Studia Linguistica, 59(1), 66-82.

Morris, C. D., Bransford, J. D., & Franks, J. J. (1977). Levels of processing versus transfer appropriate processing. Journal of Verbal Learning and Verbal Behavior, 16, 519-533.

Pavlik, P. I., Jr. & Anderson, J. R. (2005). Practice and forgetting effects on vocabulary memory: An activation-based model of the spacing effect. Cognitive Science, 29, 559-586.