See my past and current research projects.
Combining Word Embeddings for Binary Classification Tasks
This article explores the potential of combining two word-level vector representations, or word embeddings, for binary classification tasks in Natural Language Processing. The research considers GloVe, ELMo, and BERT embeddings and, on a single data set, compares the classification performance of one embedding used alone with that of the same embedding combined with a second one. Drafted, not published.
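As a rough illustration of what "combining" two embeddings can mean, the sketch below concatenates two vectors for the same token and mean-pools the result over a sentence to get one feature vector for a classifier. Concatenation and mean pooling are assumptions made for this example, not necessarily the method used in the article. The lookup tables `toy_a` and `toy_b` and the function names are hypothetical, and a static lookup table is a simplification, since ELMo and BERT produce context-dependent vectors.

```python
from typing import Dict, List, Sequence


def combine_embeddings(first: Sequence[float], second: Sequence[float]) -> List[float]:
    """Concatenate two vectors for the same token into one longer vector."""
    return list(first) + list(second)


def sentence_features(
    tokens: Sequence[str],
    table_a: Dict[str, List[float]],
    table_b: Dict[str, List[float]],
) -> List[float]:
    """Mean-pool the concatenated per-token vectors over a sentence."""
    combined = [combine_embeddings(table_a[t], table_b[t]) for t in tokens]
    dim = len(combined[0])
    return [sum(vec[i] for vec in combined) / len(combined) for i in range(dim)]


# Toy lookup tables with made-up values, standing in for real embeddings.
toy_a = {"good": [1.0, 0.0], "bad": [-1.0, 0.0]}
toy_b = {"good": [0.5, 0.5, 0.0], "bad": [-0.5, 0.5, 0.0]}

# One feature vector of length 2 + 3 = 5, ready for any binary classifier.
features = sentence_features(["good", "bad"], toy_a, toy_b)
```

The single-embedding baseline corresponds to building the features from one table only, so the comparison is between a classifier trained on the shorter vectors and one trained on the concatenated vectors.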
Political Emails Data Set
In January 2020, I began collecting campaign emails from over 600 U.S. House of Representatives campaigns to build a corpus of political emails for use in the Natural Language Processing community. I hope this data set will help answer questions such as: “Do Democrats and Republicans speak differently?” “What issues do political candidates focus on in campaign correspondence?” “Does <email attribute> correlate with winning a political race?” and “Does one party speak more positively than another?” I anticipate that the data set will contain close to 25,000 emails by November 2020, when I intend to end the collection phase of this research. The code is available on GitHub. Not drafted, not published.