Sr. Lead AI Research Scientist, AI Evaluation and Reliability

The AI Foundations team leads core research and development across the training, evaluation, and deployment of AI systems that power Uma, Upwork’s flagship AI model, and other customer-facing generative AI capabilities. As a Sr. Lead AI Research Scientist focused on AI Evaluation and Reliability, you will drive high-impact research initiatives that improve the trustworthiness, robustness, and real-world performance of AI systems operating at marketplace scale.

At the Sr. Lead level, this role combines deep technical expertise with cross‑functional leadership. You will identify and lead research efforts that address systemic reliability challenges, partner closely with engineering and product teams to translate research into production outcomes, and help shape how Upwork evaluates AI performance in real work scenarios. Your work will support AI systems embedded in retrieval‑based workflows, agentic architectures, and human plus AI collaboration patterns, while contributing to Upwork’s broader AI research strategy and external presence.

Responsibilities:
- Lead applied research initiatives focused on AI evaluation, reliability, and robustness, defining success metrics tied to customer impact and production readiness.
- Design and validate methods to measure and mitigate AI reliability risks, including uncertainty estimation, hallucination detection, and identification of model failure modes.
- Partner cross‑functionally with engineering, data science, and product teams to integrate research outcomes into customer‑facing AI systems and workflows.
- Own research projects end to end, from problem framing and hypothesis development through experimentation, prototyping, and synthesis of results.
- Influence technical direction across teams by surfacing insights, proposing scalable solutions, and aligning stakeholders on priorities and trade‑offs.
- Mentor researchers and engineers through technical guidance, feedback, and collaborative leadership on shared initiatives.
- Contribute to Upwork’s external research footprint through publications, presentations, and engagement with the broader AI research community.

What it takes to catch our eye:
- Proven experience leading applied AI research that balances scientific rigor with real‑world deployment constraints and business impact.
- A strong record of research contribution through publications, internal innovation, or demonstrable influence on production AI systems.
- Deep proficiency with Python and modern deep learning frameworks such as PyTorch, with hands‑on experience evaluating and improving large‑scale models.
- An adaptive approach to integrating AI tools into research and development workflows to accelerate experimentation, improve evaluation quality, and share best practices with others.
- A collaborative, growth‑oriented mindset with the ability to mentor peers, communicate complex ideas clearly, and thrive in a fast‑evolving, bottom‑up environment.

This position will initially be employed through a partner to ensure a seamless hiring process while we establish the hub. Once the hub is established, there may be opportunities to transition to employment with Upwork depending on business needs and other requirements. While employed by the partner, you’ll work as part of Upwork’s team, with access to our resources, culture, and growth opportunities.

Upwork is an Equal Opportunity Employer committed to recruiting and retaining a diverse and inclusive workforce. We do not discriminate based on race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, genetic information, or other legally protected characteristics under federal, state, or local law.
Please note that a criminal background check may be required once a conditional job offer is made. Qualified applicants with arrest or conviction records will be considered in accordance with applicable law, including the California Fair Chance Act and local Fair Chance ordinances. The Company is committed to conducting an individualized assessment and giving all individuals a fair opportunity to provide relevant information or context before making any final employment decision.

Individuals in this role must be within reasonable commuting distance of Lisbon or Toronto, and will be required to report to an office 3 days per week.
Sr. AI Research Scientist, AI Evaluation and Reliability, Toronto, Ontario, Canada
UPWORK ENTERPRISE
Published 27 days ago