Job Details

ID #53382762
State California
City Santa Clara
Job type Full-time
Source PayNearMe
Showed 2025-02-01
Date 2025-02-01
Deadline 2025-04-02
Category Other

Senior Data Reliability Engineer - Remote

California, Santa Clara

We’re looking to add a dynamic Senior Data Reliability Engineer, reporting to our Manager of Data Operations.

About our Data Stack:
Cloud Provider: AWS
Database: MySQL, PostgreSQL
Extract/Load: Fivetran
Transform: dbt
Data Warehouse: Snowflake
BI Visualization: Looker
Code versioning: GitLab
Preferred languages: SQL, Python
Infrastructure as Code: Terraform/OpenTofu
Observability: Monte Carlo, Datadog

As our Senior Data Reliability Engineer, you will design, build, and maintain the data infrastructure that powers our data platform, ensuring reliability, scalability, and performance. You will bring a Site Reliability Engineer (SRE) approach to data operations, automating workflows and continuously improving the data infrastructure and tools to support our business needs.

What you’ll do:

Infrastructure Management: Design, build, and maintain scalable and reliable data infrastructure using CI/CD and continuous improvement practices, with IaC to manage both SaaS and cloud platform infrastructure.

Automation & Enabling Self-Service: Automate manual data operations tasks to enhance efficiency and ensure consistent, repeatable processes. Include self-service as a core tenet of the infrastructure designs and architecture.

Observability: Develop and implement observability solutions to ensure data platform reliability. Build metrics, SLIs, and SLOs to measure build success/failures, infrastructure stability, capacity, etc.

Data Reliability Engineering: Drive expansion of SRE practices, such as failure analysis, redundancy, automated QA and security integration, and improvement as a tool, with the goal of minimizing toil and enhancing the uptime and performance of our data platform.

Data Platform Team Support: Partner with members of the Data Platform Team to design and deliver solutions that align with their requirements, fostering collaboration to ensure all infrastructure and systems meet their functional and technical needs.

Collaboration Across Teams: Coordinate with data engineers, analysts, and platform teams to ensure the usability, reliability, and scalability of data solutions.

Data Platform Optimization: Optimize the performance and scalability of the data platform to manage costs and reliability.

Security and Compliance: Partner with security teams to implement industry best practices and standards like PCI or SOC 2.

On-Call and Maintenance: Ensure data platform uptime by actively monitoring and responding to data platform issues.
  Data Platform On-Call: Participate in on-call rotations to address data platform issues. Manage incidents impacting the data platform, perform root cause analysis, and implement solutions to prevent recurrence.
  Incident Response: Take ownership of and continuously improve the incident response processes for the data platform.
