Lead Site Reliability Engineer - AWS/Azure Cloud Services
Mumbai
Neemtree
...reliable, scalable, and fault-tolerant systems, including infrastructure, monitoring, and alerting.
- Incident Management: Manage incident response processes, including root cause analysis, post-mortem reviews, and proactive mitigation strategies to minimize system downtime and impact.
- Monitoring & Alerting: Develop and maintain [...]
Category: IT & Telecommunications
29 days ago in Hirist