I encountered a bug last year that was a bit tricky because it wasn't something that we would ever run into during development but was a legitimate problem in production. I found the bug because I set up Sentry for our client application and it was reporting that there were unhandled exceptions being thrown with the name ChunkLoadError. It would happen consistently in production but rarely, if ever, in the master and stage environments.

Because of the name, I immediately understood that this was probably being thrown by the client-side Webpack logic which was failing to load a "page" in our application. Pages are split into separate bundles (also known as chunks) so that users only need to download them when they navigate to a route for that individual page.

At first we thought that the error was caused by mobile users with spotty cell service. The thinking was that a user would try to navigate to a new page, the request would fail because of a weak or interrupted connection, Webpack would throw the error when the request failed, and Sentry would store and report the error when the connection had been re-established. This assumption turned out to be incorrect and lulled us into a false sense of security that the error wasn't one we could fix.

A little later on we enabled the Release Tracking feature in Sentry, which allowed us to correlate exceptions to individual releases, and I noticed that the error reports seemed to cluster around new releases.

By trying to replicate a simplified production release locally, I was able to figure out that the problem was one we could fix. The error occurred because each chunk's filename contains a hash based on its content so that if the content changes, the application retrieves the updated chunk instead of using the old one stored in the browser's cache.
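To make the filename behaviour concrete, here is a minimal sketch of content-based naming. The FNV-1a hash and the chunkFilename helper are stand-ins I made up for illustration; they are not what Webpack uses internally, but the property that matters is the same: change the content and the filename changes with it.

```typescript
// Illustrative only: FNV-1a is a stand-in for whatever hash the bundler uses.
function fnv1a(content: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < content.length; i++) {
    hash ^= content.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  // Force an unsigned 32-bit value and left-pad to 8 hex digits.
  return ("0000000" + (hash >>> 0).toString(16)).slice(-8);
}

// Hypothetical helper: builds "<page>.<hash>.js" from the chunk's content.
function chunkFilename(page: string, content: string): string {
  return page + "." + fnv1a(content) + ".js";
}

const beforeRelease = chunkFilename("settings", "export const version = 1;");
const afterRelease = chunkFilename("settings", "export const version = 2;");
// The two names differ, so a browser that cached the first file
// will fetch the second one instead of reusing the stale copy.
```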

The manifest of chunk filenames that the application uses is only loaded on a full page load. So if a user opens the application, we release a new version with updated pages, and the user then navigates to one of those pages without doing a full page load, the client application requests the old chunk filename contained in its manifest. At the time, we weren't using a CDN and were instead serving static files directly from the server, so requests for the old chunks would fail because those files no longer existed on the application server's filesystem.
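The sequence of events is easier to see as a small simulation. Everything here is hypothetical (the Manifest type, the deploy and requestChunk helpers, and the filenames); it just models a server that only holds the files from the latest release and a browser tab that keeps the manifest from its last full page load.

```typescript
// Maps a page name to the hashed chunk filename for one release.
type Manifest = { [page: string]: string };

// A deploy replaces the static files, so only the new release's chunks exist.
function deploy(manifest: Manifest): string[] {
  return Object.keys(manifest).map((page) => manifest[page]);
}

// The client asks for whatever filename its own manifest lists.
function requestChunk(serverFiles: string[], clientManifest: Manifest, page: string): number {
  const filename = clientManifest[page];
  return serverFiles.indexOf(filename) !== -1 ? 200 : 404;
}

const releaseA: Manifest = { settings: "settings.aaaa1111.js" };
const releaseB: Manifest = { settings: "settings.bbbb2222.js" };

let serverFiles = deploy(releaseA);
const openTabManifest = releaseA; // loaded once, on the full page load

const statusBeforeRelease = requestChunk(serverFiles, openTabManifest, "settings");

serverFiles = deploy(releaseB); // new release ships while the tab stays open

// The open tab still asks for the old filename, which is gone.
const statusAfterRelease = requestChunk(serverFiles, openTabManifest, "settings");

// A full page load picks up the new manifest and the request succeeds again.
const statusAfterReload = requestChunk(serverFiles, releaseB, "settings");
```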

Using a CDN prevented the error from being thrown, since the old chunks would still be cached for a certain period of time, but it didn't solve the core issue: users who were using our application during a new release would keep running the old client application for as long as they kept the browser tab open.

I solved this issue by adding a try-catch around the navigation and page import logic so that if a page chunk request fails, the app will do a full page request via window.location instead of just a request for the individual chunk.
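A rough sketch of that fallback, not our exact code: the loadPage function, the PageModule type, and the injected fullPageLoad callback are names I've invented for the example. Passing the importer and the reload function in as arguments keeps the sketch runnable outside a browser.

```typescript
// Hypothetical shape of a lazily loaded page module.
type PageModule = { render: () => string };

// Tries to load the page chunk; if the import fails for any reason
// (for example, the chunk file no longer exists after a release),
// falls back to a full page load of the target URL.
async function loadPage(
  importPage: () => Promise<PageModule>,
  targetUrl: string,
  fullPageLoad: (url: string) => void
): Promise<PageModule | null> {
  try {
    return await importPage();
  } catch (error) {
    // The full page load fetches the new client and its new manifest.
    fullPageLoad(targetUrl);
    return null;
  }
}
```

In the browser, the call would look something like loadPage(() => import("./pages/Settings"), "/settings", (url) => window.location.assign(url)), where the module path and route are placeholders. One thing to watch for with this approach: if the full page load itself lands on a route whose chunk is also missing, you could end up reloading repeatedly, so it may be worth guarding against more than one reload in a row.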