Vacancy expired!
Role Description: Strong knowledge of observability and site reliability engineering (SRE), with experience automating and proactively monitoring DevOps platforms and a passion for developing and architecting automation solutions. Should be able to handle first-point escalation for all technical and process issues. Provide technical subject matter expertise wherever required. Ensure proper communication and quick resolution as a crisis manager. Plan and schedule changes, coordinating with different stakeholders. Perform RCA for major incidents related to his/her tower. Follow the quality and security processes defined for the engagement. Perform trend analysis, identify the top few incidents, and work with the respective teams or individuals to minimize them. Handle hardware troubleshooting and vendor coordination. Prepare weekly and monthly status reports. Participate in business meetings with various stakeholders on a need basis. Take corrective actions based on customer satisfaction surveys. Work on service improvement programs. Provide effort estimation and reviews on a need basis for new projects. Train new team members. Able to work on knowledge acquisition and updates to related documents.