SENIOR DATA ANNOTATOR at TRACKMIND SYSTEMS (2025-07 – 2026-05)
Client – S&P Global | Project: Structured Heards | Tools: Prodigy (spaCy), MS Excel, Kensho Spark
- Performed structured NLP annotation and entity labelling, categorizing complex commodity market text into predefined taxonomy attributes per detailed project guidelines; consistently maintained 100% data accuracy across high-volume structured and unstructured datasets directly supporting LLM training and generative AI validation pipelines for S&P Global's financial intelligence platforms.
- Conducted rigorous document and content review, identifying and labelling relevant entities, attributes, specifications, grades, certifications, contractual terms, geographical locations, and shipping trade information (incoterms, delivery dates, laycan periods) from unstructured commodity market text, ensuring data completeness and correctness for downstream model consumption.
- Proactively identified day-to-day operational friction points and SOP inconsistencies in annotation tooling and workflows, proposing actionable corrective changes that were implemented to unblock operations and improve team-wide annotation efficiency and output quality.
- Built an AI-driven annotation validation plugin using prompt engineering and SOP-based instruction design, automating entity extraction and verification of market headline data, which improved team annotation accuracy from 85% to 95%.
- Collaborated with operational leaders including product managers and commodity price reporters to clarify data scenarios, streamline annotation workflows, and improve reporting efficiency, demonstrating strong stakeholder communication and cross-functional coordination skills.
- Organized daily annotation team operations, including task scheduling, progress monitoring, quality checks, and coordination with product managers to resolve annotation-related queries and maintain data quality standards; delivered weekly status reports to leadership.
AI DATA ANNOTATOR at ENTERPRET (2022-05 – 2025-07)
Clients – Notion, Zoom, Mixpanel, Figma | Tools: Label Studio, Braintrust Dev, Google Sheets, KPI Dashboards
- Reviewed, validated, and structured large volumes of customer support conversations, emails, and feedback data across text and audio modalities for enterprise SaaS clients including Zoom, Mixpanel, and Notion, generating high-quality human insight data that directly informed ML model training and product intelligence decisions.
- Collaborated with the Engineering team to evaluate in-house ML models, assessing the authenticity, factual accuracy, and hallucination rates of LLM outputs via reinforcement learning from human feedback (RLHF), contributing directly to iterative generative AI quality improvement and responsible AI deployment practices.
- Generated performance and quality reports for enterprise clients and program leadership using MS Excel and Google Docs, tracking KPI metrics, productivity benchmarks, and annotation quality outcomes across multiple concurrent task types, demonstrating strong business writing and analytical reporting capabilities.
- Identified and resolved data errors, missing information, and labelling inconsistencies during document and data review, performing root cause analysis on recurring error patterns and proposing process-level solutions to enhance labelling task quality and output consistency.
- Onboarded and trained new team members and led a team of 10 associates, providing detailed process guidelines, conducting peer quality verification, and sharing SOP knowledge to maintain consistent annotation standards, demonstrating floor-support ownership and the ability to scale team productivity without compromising output quality.
- Demonstrated the ability to rapidly pivot between multiple task categories and annotation modalities based on shifting business requirements, maintaining individual productivity targets while supporting overall team operational deliverables.
DATA ASSOCIATE at AMAZON DEVELOPMENT CENTER (2019-07 – 2022-05)
Organization – Alexa Data Services | Tools: Mercury, Quip Sheets
- Processed and annotated large-scale audio and text datasets generated from Alexa smart speaker user interactions, executing transcription, entity labelling, intent classification, and sentiment identification across Amazon Shopping, Music, Weather, and General Knowledge query datasets, directly contributing to NLU model training pipelines at production scale.
- Performed metadata annotation of audio files, labelling speech segments, speaker identities, emotional tone, and keyword occurrences for speech recognition and audio classification model training, demonstrating proficiency in generating high-quality human insight data across speech and audio modalities.
- Achieved consistent throughput of 1,000+ annotated data items per day while sustaining approximately 95% accuracy, meeting strict productivity and quality benchmarks across both short-form and long-form data files (1+ hour audio duration), demonstrating exceptional concentration, multitasking, and attention to detail in a high-volume environment.
- Analyzed root causes of data quality issues, identified error patterns in annotation outputs, and applied rigorous quality control measures to ensure annotated data met the highest standards required for training accurate and robust machine learning models.
- Recognized as a top-performing annotator in Amazon's US and UK Alexa Data Services division for exemplary precision and reliability on pilot AI training projects "Dylan" and "MSLFT"; based on these quality metrics, was assigned a mentorship role to guide peers on annotation guidelines and improve team output quality.
- Maintained strict adherence to Amazon's customer data privacy, confidentiality, and compliance policies across all data handling activities, upholding the principle that protecting customer privacy is non-negotiable.