Summary
We are looking for a Linguist to help us develop language components for a variety of voice-enabled technologies and products. We are seeking candidates with native fluency in Telugu and strong experience in linguistic data analysis and language technology to manage data collection, LLM-powered data synthesis, data annotation, prompt engineering, localization, and quality evaluation.
Job Responsibilities
• Provide linguistic expertise in the areas of syntax, semantics, pragmatics and sociolinguistics
• Collaborate with other linguists and data operations teams in data collection, data curation, translation, localization and annotation efforts
• Evaluate and curate data sets for ML models using LLM solutions
• Assess model and data quality
• Perform prompt engineering
• Collaboratively develop complex and consistent linguistic analyses
Required Qualifications
• Master’s degree in general Linguistics or Linguistics with an emphasis on Romance languages, Computational Linguistics, Speech Science, or a related field
• Native or near-native fluency in Telugu and at least one other language
• Awareness of Indian languages and their linguistic, cultural, and local nuances
• Knowledge of syntax, semantics, pragmatics, sociolinguistics, corpus linguistics, and other areas of linguistics
• Experience working with speech and text data in multiple languages
• Familiarity with Large Language Models (LLMs), prompt engineering, and their applications
• Comfortable working in a fast-paced, highly collaborative, dynamic work environment
• Experience with database queries and data analysis processes (e.g., SQL, RegEx, spreadsheets, R, Unix, or others)
• Strong organizational skills and attention to detail
• Excellent verbal and written communication skills
Preferred (additional) Qualifications
• PhD in Linguistics or Romance languages, language technologies, computational linguistics, speech science, or a related field
• Proficiency in Python
• Experience with machine learning frameworks and NLP libraries and tools