Data Scientist, Natural Language Processing
University of Washington ’21, Georgia Tech ’18. Full-stack and natural language processing skill set.
Extracted syllabus keywords (as indicators of course topic and level) and review sentiment from legacy site data (syllabi, reviews, course equivalencies, and professor ratings) with PyTorch, Selenium, and NLTK. Wrote frontend and APIs in React and Django to take the initial product to a 500-user pilot, including the mission-critical login system.
Worked on two tech stacks: S3, API Gateway, and DynamoDB for large-scale workloads, and EC2 for small-scale ones. Communicated details of NLP applications, large-scale data operations, and recommendation engines to nontechnical stakeholders, including issues of fairness, bias, and privacy.
Taught introductory Python to adults from around the world through a selective (10%) volunteer program during COVID-19. Checked understanding of the material and well-being within a group of students. Led special-topic weekend sessions on language, computing, and ML.
Automated grade calculations and created concrete chat transcript metrics for LING 566 (Syntax for Computational Linguists), a class moved online due to COVID-19, using Python (pandas and NLTK), increasing grading efficiency and enabling more granular final grade calculations.
Built an ELT pipeline for an aggregated corpus of scraped Yelp reviews to score locations on relevance to various sightseeing and tourism categories, using Word2Vec embeddings from Python's Gensim library, with an emphasis on speed. Cached predictions to improve model performance on mobile devices.
Worked on a project to rewrite EarSketch coding lessons to be more accessible to middle school students reading below the target level. Wrote a script to calculate Flesch-Kincaid reading levels and speed up rewrites, using Python, bash, regexes, and NumPy. Drove increase in enrollment and student retention.
GPA: 3.82
Relevant Coursework: Syntax, Semantics, Phonetics, Phonology, Analyzing Neural Language Models, Deep/Shallow NLP, NLP Systems and Applications (Project)
Societies + Activities: Research Computing Club; Computation, Language, and Meaning Band of Researchers (CLMBR); Huskies for Suicide Prevention and Awareness (HSPA)
Relevant Project: Hate Speech Detection with BERT (Python, PyTorch, HTCondor, Jupyter Notebooks)
Georgia Institute of Technology, Atlanta - B.S. Computer Science, June 2016 – December 2018
Relevant Coursework: Automata and Complexity, Machine Learning, Statistics, Linguistics Intro, Multivariable Calculus
Societies + Activities: Concert Band (Clarinet + Bass Clarinet); STEM outreach (Junior STEM)