Be at the forefront of infrastructure reliability as an AI Infrastructure Site Reliability Engineer. Focus on maintaining system performance, security, and incident management to support our growing platform.

You'll collaborate with a small yet passionate infrastructure team, working closely with DevOps and leadership to enhance the reliability of AI systems. This hands-on role demands your proactive approach in automating processes, improving observability, and ensuring services run cost-efficiently in production.

Key Responsibilities:
• Sustain platform uptime and availability metrics
• Optimize and secure infrastructure
• Resolve scaling issues proactively
• Collaborate on troubleshooting with product engineers
• Build and maintain observability systems

Requirements:
• Proven experience in Site Reliability Engineering or related field
• Familiarity with Elixir desirable
• Operating experience with Kubernetes clusters
• Competence with Terraform
• Expertise in AWS services, including EKS and RDS

Play an integral role in enhancing our platform while fostering a vibrant engineering culture focused on reliability.

AI Infrastructure Site Reliability Engineer

HIIVE
