Katalyze AI is looking for a Senior ML Scientist to design, build, and deploy the core intelligence layer of our platform. You'll work across two interconnected domains: RAG and knowledge retrieval (making our platform reason accurately over large scientific document corpora) and agentic AI systems (building the multi-step reasoning pipelines that autonomously complete complex tasks for our biopharma and manufacturing customers).

This is a research-forward role with direct impact on production. You'll go from reading a paper to shipping a production system, sometimes in the same week.

Design and build advanced RAG pipelines for scientific knowledge retrieval: chunking strategies, embedding model selection, hybrid search, re-ranking, and rigorous retrieval evaluation
Develop and maintain Knowledge Graph architectures (Neo4j, ontologies, semantic structures) that capture domain relationships and give agents deep understanding of biopharma and manufacturing workflows
Architect agentic workflows using LangChain/LangGraph or custom orchestration: designing autonomous, multi-step reasoning pipelines for complex enterprise tasks
Build the "skills layer" that allows agents to execute domain-specific tasks reliably, with proper validation, auditability, and error handling for high-stakes regulated environments
Advance entity extraction and knowledge representation: building systems that turn unstructured scientific documents into structured, queryable domain knowledge
Design and run rigorous evaluation frameworks to benchmark agent reliability, RAG accuracy, and model consistency, and define what "good enough to ship" looks like
Stay current with ML research (NeurIPS, ICML, ICLR, ACL) and identify applicable advances; translate them from paper to production
Collaborate with the Data Science, Engineering, and Product teams to integrate ML components into customer-facing features

What We're Looking For

4+ years of applied ML research or engineering experience, with production deployments under your belt
Deep RAG expertise: chunking, embedding models, vector databases (Pinecone, Weaviate, pgvector), hybrid retrieval, context window optimization, and evaluation methodology
Hands-on experience with Knowledge Graph construction (Neo4j, RDF/OWL, property graphs) and graph-based reasoning
Proficiency with agent frameworks (LangChain, LangGraph, AutoGPT, CrewAI, or custom orchestration) and real opinions on their limitations
Experience with PyTorch or JAX; familiarity with the Hugging Face ecosystem
PhD or Master's in Machine Learning, NLP, Computer Science, or a related field preferred
Domain knowledge in life sciences, biopharma, chemistry, or industrial processes is a significant advantage

LLM Providers: Anthropic Claude, OpenAI, AWS Bedrock