We're looking for an experienced, technical Staff-level Technical Program Manager (TPM) to lead our Quality and Reliability efforts across critical systems and services. This is a high-impact, individual contributor role for someone who has done this before, who can build structure where needed, navigate ambiguity, and drive outcomes across multiple teams and domains.

You’ll be responsible for leading cross-functional programs to improve system reliability, scalability, and operational quality, from improving incident response and production readiness to redefining the ways we test and deploy software. This is not a generalist role: we’re looking for a TPM with deep technical fluency and a track record of shaping system-level quality and delivery culture at scale.

What You’ll Do

Own the Quality & Reliability Program: Define and drive the vision for quality across proactive practices (testing, deployment, observability), reactive processes (incident response, external communications), and cultural expectations (quality ownership, readiness).
Lead Cross-Functional Programs: Drive reliability and quality initiatives across Engineering, Product, Operations, and Customer Success.
Production Readiness: Own the Production Readiness Review (PRR) process; ensure all releases meet reliability standards before they go live.
Define and Drive SLOs: Establish and track Service Level Objectives (SLOs). Build visibility into reliability metrics and lead efforts to meet or exceed targets.
Improve Incident Management: Streamline incident response and postmortems. Drive structural improvements in tooling, communication, and ownership.
Scale Tooling & Automation: Collaborate across teams to enhance observability, alerting, testing automation, and response tooling.
Mitigate System Risk: Identify risk vectors early, build mitigation plans, and drive resolution with urgency.
Drive Alignment: Influence across Eng, Product, Ops, and GTM teams to prioritize reliability and integrate quality into every initiative.
Track Progress: Use tools like Atlas, Jira, and internal dashboards to maintain clarity on goals, risks, and outcomes.
Embed Continuous Learning: Build programs that ensure we learn from every incident, test edge cases, and continuously harden our systems.

Job Details
ID | #54435722 |
State | California |
City | Santa Clara |
Job type | Full-time |
Salary | USD TBD |
Source | PayNearMe |
Shown | 2025-09-03 |
Date | 2025-09-03 |
Deadline | 2025-11-02 |
Category | Miscellaneous |
Staff Technical Program Manager (Reliability and Quality) - Remote
Santa Clara, California 95050, USA