A. Observations
The E2E monitoring team received a large number of server-unreachable alerts.
B. Immediate Actions Taken ( * )
• What action did we take to identify the reported issue?
The monitoring team immediately checked the alert details and identified that the issue was limited to specific pools. They reported it to the network team right away. The network team verified the status and logs of all switches, found no internal network issue, and escalated the issue to the ISP team.
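The pool-level triage described above can be sketched roughly as follows. This is only an illustration, not the monitoring team's actual tooling: the alert fields (`server`, `pool`, `status`), the function name `pools_with_alert_spike`, the threshold, and the sample data are all hypothetical.

```python
from collections import Counter


def pools_with_alert_spike(alerts, threshold=3):
    """Return the pools with at least `threshold` unreachable alerts, sorted by name."""
    # Count only "unreachable" alerts, grouped by the pool they belong to.
    counts = Counter(a["pool"] for a in alerts if a["status"] == "unreachable")
    return sorted(pool for pool, n in counts.items() if n >= threshold)


# Hypothetical sample alerts; real alert payloads will differ.
sample_alerts = [
    {"server": "srv-01", "pool": "pool-a", "status": "unreachable"},
    {"server": "srv-02", "pool": "pool-a", "status": "unreachable"},
    {"server": "srv-03", "pool": "pool-a", "status": "unreachable"},
    {"server": "srv-04", "pool": "pool-b", "status": "unreachable"},
    {"server": "srv-05", "pool": "pool-c", "status": "ok"},
]

print(pools_with_alert_spike(sample_alerts))
```

Grouping alerts by pool in this way makes it easier to see whether a burst of alerts is concentrated in a few pools, which may point to a shared upstream dependency rather than individual server faults.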
ANALYSIS & ROOT CAUSE:
The ISP team confirmed that they were observing a drop in traffic and investigated further. They traced the issue to one of the ISPs and opened a ticket with that ISP. The ISP confirmed an internal outage on their side and will share the final RCA within the next 1-2 days.
Actions Taken to Resolve the Issue:
We manually disabled the identified ISP in our network, so all traffic automatically shifted to another ISP. Customers confirmed that the issue was resolved for them after the ISP was disabled.