Join a dynamic infrastructure team as a Site Reliability Engineer. Focus on enhancing platform reliability, ensuring availability, and supporting AI workloads for improved system performance.

In this role, you'll directly impact platform operational performance and reliability. Collaborating with DevOps and engineering teams, you will help build scalable infrastructure and respond to incidents. You'll play a key role in implementing security measures and improving observability for AI systems.

Key Responsibilities:
• Maintain platform reliability and availability
• Optimize and secure infrastructure systems
• Proactively address scaling and reliability challenges
• Configure monitoring and incident response strategies
• Support AI/ML infrastructure and workloads

Requirements:
• Experience in Site Reliability Engineering or a similar role
• Proven skills with AWS, particularly EKS and RDS
• Familiarity with Kubernetes in production environments
• Proficiency in Terraform for infrastructure development
• Strong background in PostgreSQL and observability tools

Enhance system performance and contribute to a vibrant engineering culture while supporting AI innovations.

Site Reliability Engineer In Growing Team

HIIVE
