Senior Data Center Operations Engineer

Job Overview

The Senior Data Center Operations Engineer plays a critical, hands‑on role in supporting the build‑out and long‑term operation of a high‑performance, enterprise‑scale data center environment supporting advanced compute and large‑scale infrastructure deployments.

This position is designed for an experienced engineer with deep expertise in server hardware, Linux systems, and data center operations, operating within environments that demand high availability, precision, and performance. You will contribute during the initial deployment phase, supporting infrastructure bring‑up, validation, and hardware readiness. As the environment transitions into steady‑state operations, you will take ownership of ongoing reliability, advanced troubleshooting, and continuous improvement initiatives.

This role requires a strong operator mindset: someone who thrives in complex, production‑critical environments and takes pride in resolving issues at their root. You will serve as a primary technical escalation point, working closely with engineering and infrastructure teams to maintain system stability and performance.

You will collaborate with cross‑functional teams, making clear and professional communication in English (written and verbal) essential for success in this role.

This role offers continuity across both deployment and operational phases and provides exposure to large‑scale, modern infrastructure environments, with a clear path for progression into advanced technical or engineering roles.

Key Responsibilities

Advanced Hardware Troubleshooting & Repair
- Diagnose and resolve complex hardware failures across server platforms (motherboards, CPUs, memory, storage)
- Perform component‑level repairs and replacements on servers and data center hardware
- Execute break/fix processes with a focus on minimizing downtime and meeting SLAs
- Conduct root cause analysis (RCA) of hardware failures and implement preventative improvements
- Identify recurring failure trends and contribute to tooling, automation, and process enhancements

Linux Systems & Platform Support
- Utilize Linux command‑line tools for system monitoring, diagnostics, and troubleshooting
- Support provisioning and deployment of servers across Linux distributions (RHEL, Ubuntu, etc.)
- Troubleshoot boot‑level and OS‑level issues in production environments
- Collaborate with engineering teams to resolve complex hardware/software interaction issues

Data Center Operations
- Support hardware installation, structured cabling, and infrastructure validation
- Maintain accurate inventory of spare parts, assets, and retired equipment
- Document repairs, changes, and configurations in ITSM/DCIM systems
- Ensure adherence to safety, security, and operational protocols
- Serve as a primary escalation point for complex infrastructure issues
- Participate in on‑call rotation supporting 24x7 operations

Collaboration & Mentorship
- Provide guidance and mentorship to technicians on hardware troubleshooting and best practices
- Collaborate with network, storage, and infrastructure teams to resolve cross‑functional issues
- Contribute to knowledge sharing, documentation, and operational excellence initiatives
- Support continuous improvement efforts across processes, tooling, and operational workflows

Required Skills
- Strong English communication skills (written and verbal) are required for coordination with cross‑functional teams
- Expert‑level knowledge of server hardware architecture and component‑level troubleshooting
- Strong proficiency with Linux systems and command‑line diagnostics
- Solid understanding of networking fundamentals and infrastructure components
- Experience working within structured operational environments (SOPs, SLAs, ticketing systems)
- Familiarity with ITSM/DCIM tools (ServiceNow, Jira, or similar)
- Experience with structured cabling and fiber optic connectivity
- Strong analytical and problem‑solving skills with attention to detail
- Ability to operate effectively in high‑pressure, high‑availability environments
- Strong organizational and documentation skills

Required Experience
- 5+ years of experience in data center operations or similar infrastructure environments
- Significant hands‑on experience with server hardware troubleshooting and repair
- Minimum of 2 years of experience working with Linux operating systems in production environments
- Experience supporting enterprise server platforms and infrastructure environments
- Demonstrated experience performing root cause analysis and resolving complex hardware issues
- Experience working within ticketing systems and operational workflows
- Exposure to data center build‑outs, deployments, or infrastructure upgrades (preferred)

Preferred Certifications
- CompTIA A+, Server+, or Linux+
- LPI certification or equivalent
- Vendor‑specific hardware certifications

Physical Requirements
- Ability to lift and move equipment up to 50 lbs
- Ability to work in a temperature‑controlled environment with moderate noise levels
- Ability to perform physical tasks such as standing, walking, bending, and kneeling for extended periods

Sr Data Center Operations Engineer
COVESTIC INC
saint jérôme, saint jérôme
Published 18 days ago