Contract Duration: May 4, 2026 to November 4, 2026

Overview

We are seeking an experienced AI / ML QA Engineer to support testing and validation of advanced machine learning and generative AI systems. This role focuses on ensuring model quality, reliability, and responsible AI practices across a range of applications including NLP, classification, recommendation systems, and large language models.

Key Responsibilities

Functional and Model Testing
- Design and execute test cases for AI and ML models including NLP, classification, recommendation, and generative AI systems
- Validate outputs for accuracy, consistency, and expected behavior across diverse datasets and inputs
- Perform regression testing following model updates, retraining cycles, or data changes

Prompt and LLM Testing
- Develop and maintain prompt test libraries for LLM-based features
- Evaluate responses for relevance, tone, factual accuracy, and alignment with business rules
- Identify hallucinations, response drift, and edge-case failures

AI Safety and Red Teaming
- Conduct adversarial testing to uncover vulnerabilities such as prompt injection and jailbreak scenarios
- Assess outputs for bias, fairness, and compliance with responsible AI standards
- Document risks and collaborate with model teams on remediation strategies

Test Automation and Tooling
- Build and maintain automated test suites using Python-based frameworks
- Track and report quality metrics, model performance benchmarks, and defect trends

Collaboration and Documentation
- Work closely with data scientists, ML engineers, and product teams
- Develop clear test plans, defect reports, and evaluation summaries
- Contribute to QA standards and best practices for AI systems

Required Skills
- Hands-on experience testing AI and ML models including NLP, classification, and generative AI
- Strong experience with LLM testing including prompt engineering and evaluation of hallucinations, bias, and response quality
- Proficiency in Python and test automation frameworks
- Experience working with CI/CD pipelines
- Solid understanding of regression testing

Nice to Have
- Experience in banking or financial services
- Exposure to AI safety and red teaming techniques such as prompt injection and jailbreak testing
- Knowledge of responsible AI and model governance principles
- Familiarity with LLM evaluation tools and frameworks
- Experience testing recommendation or ranking systems
- Exposure to ML pipelines and MLOps workflows
- Understanding of bias and fairness metrics

Note: We use AI tools to obtain basic information, detect plagiarism and false employment history or references, categorize your skills, and perform an initial match with the job posting.
AI / ML QA Engineer
INFOTEK CONSULTING INC.
Toronto
Published 27 days ago