Vacancy expired!
We are looking for a Site Reliability Engineer - Remote / Telecommute for our client in Houston, TX.
Job Title: Site Reliability Engineer - Remote / Telecommute
Job Location: Houston, TX
Job Type: Contract
Job Description:
Responsibilities:
- Reduce Lead Time: The time from code being written to entering production.
- Increase Deployment Frequency: How often deploys happen.
- Shorten Mean-Time-To-Recover (MTTR): How quickly teams can restore service after outages.
- Lessen Change Fail Rate: What percentage of deploys result in service impairment or an outage.
- Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve.
- Provide primary operational support and engineering for multiple large, distributed software applications.
- Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement.
- Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning, and launch reviews.
- Maintain services once they are live by measuring and monitoring availability, latency, and overall system health.
- Scale systems sustainably through mechanisms like automation; evolve systems by pushing for changes that improve reliability and velocity.
- Practice sustainable incident response and blameless postmortems.
Requirements:
- A proactive approach to spotting problems, areas for improvement, and performance bottlenecks.
- Passion for problem-solving, continuous improvement, and optimization.
- Experience programming in at least one of the following languages: C#, Java, Python, or Go.
- Ability to debug, optimize code, and automate routine tasks.
- Experience with Azure resources such as VNets, Resource Groups, Functions, App Service, Azure VMs, NSGs (Network Security Groups), ExpressRoute, and RBAC (Role-Based Access Control).
- Experience with software deployment and orchestration technologies such as Helm, Docker, Kubernetes, Kubernetes Operators, and Service Mesh (Istio).
- Understanding of testing principles in the context of IaC.
- Strong communication skills with the ability to communicate complex technical concepts and align the organization on decisions.
- 7+ years of experience.