Job Overview

SRE is part of a global organization that leverages the latest technology to communicate with our colleagues across the globe. We organize ourselves into distributed teams; SRE teams are anchored to iManage offices worldwide. Tuesdays and Thursdays are dedicated to in‑office collaboration, rapid innovation, and developing a sense of belonging. Mondays and Fridays are reserved for remote‑friendly focus time to get things done. Our workplace intentionally balances collaboration and accomplishment.

As a Senior Site Reliability Engineer, you are an engineer, a builder, and a systems thinker. You’ll create middleware and platform guardrails that empower developers to innovate quickly and reliably. You combine deep technical judgment with empathy to eliminate customer pain, especially when working with enthusiastic teams stewarding the world’s most privileged data. You uplift those around you, act as a subject‑matter expert, mentor others, and drive change. You chase contributing factors over root causes, value code over documentation, and documentation over process. You’ll engage in, and often lead, architectural discussions, reduce toil, and deliver scalable, resilient platforms that support our customers and organization.

As a Senior SRE, you’ll help scale our cloud platform, collaborate across teams to promote standardization and resiliency, and participate in on‑call rotations. You’ll be a key voice in observability, change management, and service scalability, providing guidance during complex technical decisions and high‑impact events.

Responsibilities

- Eliminate toil through automation and software development.
- Partner cross‑functionally with application teams and internal stakeholders.
- Create a modern, cloud‑native platform that is resilient, cost‑effective, and secure by default.
- Scale cloud infrastructure to support our Kubernetes‑based ecosystem.
- Maintain the freshness and utility of platform services.
- Improve the security posture of our products.
- Design automation, orchestration, observability, and disaster readiness into our products.
- Participate in production support and on‑call rotations, providing senior‑level guidance during critical events.
- Lead incident management and post‑incident retrospectives, coaching teams in these practices.

Qualifications

- Experience writing design documents and postmortems, and refactoring application code.
- Experience building automation to reduce operational burden or developing internal SaaS tools.
- Ability to advocate for SRE principles (e.g., SLOs vs SLAs) and introduce them effectively.
- Experience in public cloud or hosted datacenter environments (Azure and AKS preferred).
- A passion for collaborative teamwork and influencing reliability best practices across teams.

Bonus Points

- Hands‑on experience with Linux server stacks (Ubuntu/Debian preferred).
- Knowledge of cloud provisioning platforms (Terraform preferred).
- Exposure to configuration management tools (Chef preferred).
- Experience with containerization/clustering technologies (Docker preferred).
- Familiarity with observability and alerting tools (Prometheus/Grafana or ELK/EFK).
- Practical experience with CI/CD pipelines and rollout strategies.
- A bachelor’s degree (or equivalent experience) in Computer Engineering or a related field.
- Proficiency in one or more programming languages (e.g., Java, Python, Golang).
- Familiarity with scripting languages (e.g., PowerShell, Bash, Python, Ruby).

Benefits

- An inclusive environment where you can help shape the culture.
- Market‑competitive salary applied through a consistent, equitable process.
- Annual performance‑based bonus.
- Comprehensive Health, Vision, Dental, and Life insurance.
- Registered Retirement Savings Plan with company match up to 5%.
- Enhanced leave for expecting parents: 20 weeks of 100% paid primary leave, 10 weeks of 100% paid secondary leave.
- Flexible time‑off policy for vacation, volunteering, holidays, family, or recharge.
- Multiple company wellness days each year.
- Access to a global behavioral health platform that enhances personal well‑being.

Equal Opportunity Statement

iManage provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.
Senior Site Reliability Engineer
iManage
Toronto
Published 24 days ago