Issue
Every user accessing the platform's community pages was greeted with an error message instead of the requested community page.
As a result, no community functions could be performed and no content could be viewed, which amounted to a complete global outage across both regions.
Cause
A code change to the platform, made to address a separate issue, was deployed and caused this issue.
In short, a code comment was not parsed correctly by the production server, which produced a fatal front-end error for every customer and community across the EU and US regions.
Why did our automated tests not pick this up?
Resolution
The issue was swiftly mitigated by reverting the code deploy that caused it. However, the communities were unavailable for a 27-minute period, during which the error message was displayed to all users instead.
Prevention steps
Add health checks for the most critical staging resources
Reinforce consistently checking staging before approving a deployment to production
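
As an illustration of the first prevention step, a staging health check could be sketched roughly as follows. This is a minimal sketch and not the platform's actual tooling: the names `evaluate_page` and `check_url`, the error marker text, and the staging URL in the usage note are all hypothetical. The idea is that a page only counts as healthy if it returns HTTP 200 and does not contain a known error message, since a front-end error page might otherwise slip past a check that looks at the status code alone.

```python
from dataclasses import dataclass
from urllib.error import HTTPError
from urllib.request import urlopen

# Hypothetical marker text: the real error message is not stated in this report.
ERROR_MARKERS = ("Something went wrong",)


@dataclass
class CheckResult:
    healthy: bool
    reason: str


def evaluate_page(status: int, body: str, markers=ERROR_MARKERS) -> CheckResult:
    """Decide whether a fetched page looks healthy.

    A page is treated as unhealthy if the status is not 200 or if the
    body contains any known error marker.
    """
    if status != 200:
        return CheckResult(False, f"unexpected HTTP status {status}")
    for marker in markers:
        if marker in body:
            return CheckResult(False, f"error marker found: {marker!r}")
    return CheckResult(True, "ok")


def check_url(url: str, timeout: float = 5.0) -> CheckResult:
    """Fetch a page and evaluate it; network failures count as unhealthy."""
    try:
        with urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            return evaluate_page(response.status, body)
    except HTTPError as exc:
        # Raised by urlopen for 4xx/5xx responses.
        return CheckResult(False, f"unexpected HTTP status {exc.code}")
    except OSError as exc:
        # Covers URLError, connection errors, and timeouts.
        return CheckResult(False, f"request failed: {exc}")
```

A deployment pipeline could call `check_url("https://staging.example.com/community")` (a placeholder URL) after each staging deploy and block promotion to production whenever `healthy` is false.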
Timeline