Intuitive AI is dedicated to addressing the global waste crisis by creating innovative solutions that make waste management more effective and impactful. Our mission is to inspire changes in behavior and foster sustainability through advanced technology and user-friendly systems. With a focus on environmental stewardship, we aim to empower individuals and organizations to make responsible waste disposal choices. At Intuitive AI, we believe in making a lasting impact and creating a cleaner planet for future generations.

The Role

We're hiring a Fleet Reliability Engineer to help us keep our fleet healthy, observable, and improving. You'll work directly alongside our Senior Engineers to maintain, monitor, and scale the thousands of Oscar Sort and Pixel units deployed worldwide - catching issues before they happen, automating the boring stuff, and shipping the systems that let a small team operate a fleet far bigger than it has any right to.

You'll have a senior engineer in your corner from day one, and real ownership from week one. The work is hands-on, the feedback loop is fast, and what you build will be running on devices in real customer sites within days.

What You'll Work On

- Diagnosing and resolving issues on Linux-based edge devices in the field: logs, crashes, network drops, hardware quirks
- Building and maintaining our fleet management, monitoring, and alerting systems
- Owning OTA updates and release rollouts - staged deployments, rollbacks, canary fleets
- Writing scripts and tooling to automate diagnostics, remediation, and routine maintenance
- Improving system reliability - driving down recurring issue classes, reducing manual support load, raising fleet uptime
- Partnering with Customer Success when client-facing issues need engineering depth

What We're Looking For

Required

- 2–4 years working with Linux systems in a production or fleet context
- Strong command line fluency and systems-level debugging chops
- Solid scripting in Python and Bash
- Networking fundamentals - SSH, TCP/IP, firewalls, VPNs, basic troubleshooting
- Experience with monitoring or observability tooling (Prometheus, Grafana, Datadog, Loki, or similar)
- Hands-on experience with embedded Linux, IoT, or edge devices - this is the core of the role, not a side interest
- Experience with OTA / fleet update systems (Mender, RAUC, balena, AWS IoT Greengrass, or similar)
- Comfortable with Docker and containerized workflows
- A genuine problem-solving instinct - you don't stop at the symptom

Nice to Have

- Cloud infrastructure (AWS preferred)
- CI/CD pipelines and infrastructure-as-code
- Hands-on hardware experience (replacing components, debugging physical units remotely)
- Experience operating fleets at thousands-of-devices scale

Who You Are

- Curious, hands-on, and allergic to "we've always done it that way"
- Comfortable owning a problem end-to-end, from the log line to the fix to the postmortem
- Detail-oriented - you trust the data, not the vibe
- Energized by real-world AI systems and devices people actually touch
- Genuinely interested in why something broke, not just getting it back up

Why Join

- Real fleet, real customers, real impact - every change you make ships to devices in the field
- Direct mentorship from senior engineers, no layers between you and the systems
- High ownership in a small team where your work is visible end-to-end

Fleet Reliability Engineer
INTUITIVE AI
Vancouver, Vancouver
Published 18 days ago