Job Details

ID #46059376
State Minnesota
City Remote
Job type Contract
Salary USD Market
Source York Solutions, LLC
Shown 2022-09-28
Date 2022-09-27
Deadline 2022-11-25
Category Et cetera

Site Reliability Engineer (SRE) - Remote!

Minnesota, Remote 00000 Remote USA

Vacancy expired!

Position Overview: We are looking for a Lead Site Reliability Engineer (SRE) to join our team, someone who can help define Site Reliability, establish best practices, and build a dedicated SRE team. As a Lead SRE, you should have experience with infrastructure-focused software development, along with a deep understanding of monitoring, alerting, reliability, infrastructure (cloud/on-premises), debugging, product engineering, and security. SREs work with our DevOps teams to introduce and define SRE principles, establish reliability goals, and develop tooling for operational observability.
Site Reliability Engineers are responsible for influencing systems reliability and scalability practices across the enterprise. In this role, there is a strong focus on building the tooling and integrations necessary to easily onboard services. There will be a mix of platform and application-level work to support out-of-the-box visibility, monitoring, and dashboarding. The lead is not only responsible for hands-on work establishing tooling and principles, but also serves as an advocate for SRE. The lead will help others understand Site Reliability Engineering as a practice, how it can benefit our company as a whole, and how it can support individual teams.

Responsibilities:
Research new and inventive ways to improve the overall reliability of sites, services, applications, and infrastructure using customer-focused, data-driven, and metrics-based software engineering approaches.
Evangelize, design, and deploy SRE (Site Reliability Engineering) concepts, methodologies, and practices using automation, targeted engagements, and other light-touch consulting with engineering, product, infrastructure, and other teams.
Deploy, configure, and consult on application/infrastructure monitoring, dashboards, observability, telemetry, logging, tracing, alerting, platform integrations, and other technology as required and recommended.
Define, code, and publish standards for modern KPIs and velocity/reliability control mechanisms using SLOs, SLIs, error budgets, and other OKRs recommended by SRE principles, and as agreed to by key stakeholders.
Proactively recommend design changes to new and existing applications or infrastructure to increase reliability.
Proactively manage performance of applications and other systems within the environment.
Troubleshoot high-visibility issues in production and other environments, applying SRE, debugging, and problem-solving techniques.
Promote and drive a blameless culture across the company to obtain factual data and information to continuously improve overall reliability and performance of company technology assets.
Qualifications:
5+ years of IT experience.
Strong ability to act as a technology evangelist, driving innovative engineering solutions.
Proven software engineering background and/or ability to produce quality code in one or more programming languages.
Exceptional time management and organizational skills, with the ability to manage shifting priorities in a fast-paced, Agile and DevOps-enabled environment.
Effective communication, interpersonal, and persuasive skills; adaptable to technical and non-technical audiences to build consensus.
Strong analytical, statistical, and problem-solving skills.
Proven leader across one or more technology areas.

Preferred Job Qualifications:
Banking / Finance / Insurance experience.
Bachelor's degree or equivalent experience in computer science or related fields.
Deep experience with infrastructure-focused software development, along with a deep understanding of monitoring, alerting, reliability, infrastructure (cloud/on-premises), networking, debugging, product engineering, and security.
Direct experience with tools and platforms such as Grafana, ELK Stack, Datadog, Dynatrace, Splunk, SCOM, Oracle Enterprise Manager, ServiceNow, Azure Monitoring, and AWS CloudWatch.

This role can sit 100% remote.
