A Semi-Automatic Tool That Facilitates Reliable Content Analysis of Corpus Data
Funded by: NSF through the Pittsburgh Science of Learning Center
PI: Carolyn Rose
Co-PI: William Cohen
Students and Staff: Pinar Donmez, Cammie Williams
The goal of our research is to develop text classification technology that addresses the specific demands of classifying sentences with coding schemes developed for behavioral research. Behavioral researchers, including social scientists, psychologists, learning scientists, and education researchers, collect, code, and analyze large quantities of natural language corpus data as an important part of their work. A particular focus of our work is text classification that performs well on highly skewed data sets, in which some coding categories occur far less often than others. Learning from such data is an active area of machine learning research.