Welcome to TELUS Digital, where innovation drives impact at a global scale. As an award-winning digital product consultancy and the digital division of TELUS, one of Canada’s largest telecommunications providers, we design and deliver transformative customer experiences through cutting-edge technology, agile thinking, and a people-first culture.

With a global team across North America, South America, Central America, Europe, and APAC, we offer end-to-end expertise across eight core service areas: Digital Product Consulting, Digital Marketing Services, Data & AI, Strategy Consulting, Business Operations Modernization, Enterprise Applications, Cloud Engineering, and QA & Test Engineering. From mobile apps and websites to voice UI, chatbots, AI, customer service, and in-store solutions, TELUS Digital enables seamless, trusted, and digitally powered experiences that meet customers wherever they are, all backed by the secure infrastructure and scale of our multi-billion-dollar parent company.

Location & Flexibility
This is a work-from-home role (Canada).

Opportunity: Site Reliability Engineer (SRE) – EMR

About the Role
We are seeking a Site Reliability Engineer (SRE) with strong cloud, automation, and infrastructure expertise to support our mission-critical electronic medical record (EMR) system. In this role, you will ensure the reliability, security, and performance of systems that handle sensitive patient information and clinical workflows. You’ll work across GCP, AWS, and modern DevOps tooling to build resilient infrastructure that meets healthcare-grade compliance and uptime requirements.

Key Responsibilities

Cloud Infrastructure & Platform Reliability
- Architect, deploy, and maintain secure, scalable infrastructure across GCP and AWS for EMR/medical data workloads.
- Implement and manage Terraform-based Infrastructure-as-Code to support consistent, compliant environment provisioning.
- Ensure systems meet healthcare reliability standards, including high availability, disaster recovery, and data durability.

Automation & Operational Excellence
- Build automation tools and scripts using Python to reduce manual operations and improve system consistency.
- Enhance the release deployment process using GitHub Actions to support safe, traceable, and compliant deployments.
- Develop runbooks, operational workflows, and automated remediation for common reliability issues.

Monitoring, Observability & Incident Response
- Design and maintain monitoring, alerting, and observability systems for EMR/clinical applications.
- Lead incident response, root-cause analysis, and post-incident reviews with a focus on long-term reliability improvements.
- Define and track SRE metrics such as SLIs, SLOs, and error budgets tailored to clinical system uptime requirements.

Collaboration & Process Management
- Work closely with engineering, data, and clinical operations teams to design reliable architectures for medical data systems.
- Use JIRA to manage tasks, incidents, and sprint workflows.
- Document infrastructure, processes, and compliance artifacts in Confluence.

Required Skills & Experience
- Hands-on experience with GCP and AWS cloud platforms.
- Strong proficiency with Terraform and Infrastructure-as-Code methodologies.
- Solid understanding of Linux systems, networking, and distributed systems.
- Experience with GitHub and GitHub Actions for version control and deployment automation.
- Working knowledge of Python for scripting and automation.
- Familiarity with healthcare data environments, EMR/EHR systems, or regulated data workflows.
- Experience with JIRA and Confluence in an Agile environment.
- Understanding of SRE principles: SLIs, SLOs, error budgets, incident management.

Nice-to-Have
- Experience with containerization (Docker, Kubernetes).
- Knowledge of healthcare interoperability standards (HL7, FHIR).
- Experience with monitoring tools (Cloud Monitoring, CloudWatch).
- Background in security engineering or compliance frameworks.

What You’ll Bring
You’re someone who thrives in high-impact environments where reliability truly matters. You care deeply about automation, secure design, and building systems that clinicians and patients can depend on. You’re collaborative, curious, and committed to operational excellence.
Site Reliability Engineer
TELUS DIGITAL
Canada
Published 27 days ago