On the 4th of July, we experienced issues dispatching immediate rides for our Taxi/VTC customers. All automatic dispatches performed by our system failed.
Our dispatch system relies on one of our providers to compute driver ETAs when an immediate ride is created. When the provider cannot answer, the dispatch system fails and the ride is not dispatched automatically to a driver, so a back office agent has to dispatch the ride manually. This behaviour applies to both VTC and Taxi companies.
Because of our provider's outage, every call from our back end system to the provider failed, which in turn made the dispatch system fail.
For more information on the outage, see:
Steps taken to diagnose, assess, and resolve:
Even though our providers have proven to be very reliable, we cannot afford to let our dispatch system fail when it does not get an answer. Our monitoring system alerted us to issues with a part of our back end, but we had to investigate further before discovering that our provider was experiencing an outage.
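To illustrate the kind of behaviour we want, here is a minimal sketch of an ETA lookup that tries several providers in turn and degrades to manual dispatch instead of failing. It is not our actual implementation: the names `ProviderUnavailable`, `compute_eta`, `dispatch_mode`, and the two example providers are hypothetical and exist only for this illustration.

```python
from typing import Callable, Optional, Sequence

# Hypothetical shape of a provider: takes a driver position and a pickup
# point, returns an ETA in seconds, or raises ProviderUnavailable.
EtaProvider = Callable[[str, str], float]


class ProviderUnavailable(Exception):
    """Raised when a provider cannot answer (outage, timeout, ...)."""


def compute_eta(
    providers: Sequence[EtaProvider], driver: str, pickup: str
) -> Optional[float]:
    """Return the first ETA obtained, or None if every provider fails."""
    for provider in providers:
        try:
            return provider(driver, pickup)
        except ProviderUnavailable:
            # Try the next provider instead of failing the whole dispatch.
            continue
    return None


def dispatch_mode(
    providers: Sequence[EtaProvider], driver: str, pickup: str
) -> str:
    """Return 'automatic' when an ETA is available, 'manual' otherwise."""
    eta = compute_eta(providers, driver, pickup)
    return "automatic" if eta is not None else "manual"


# Example providers, for illustration only.
def failing_provider(driver: str, pickup: str) -> float:
    raise ProviderUnavailable("simulated outage")


def backup_provider(driver: str, pickup: str) -> float:
    return 420.0  # fixed ETA in seconds, standing in for a real answer
```

With these definitions, `dispatch_mode([failing_provider, backup_provider], "driver-1", "pickup-1")` still yields an automatic dispatch, and only a list containing nothing but failing providers falls back to manual dispatch.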
When we first switched from one provider to another, we were confident that the dispatch system would operate normally again. However, it appears that a regression had been introduced and had changed the behaviour of the source code. This should not happen.
We therefore have three tasks to enhance our systems: