2025-10-14: Goodbye, goo.gl/0R8XX6
[Figure: goo.gl/0R8XX6 is now 404 on the live web.]
It's been nearly two months since Google stopped redirecting some of its goo.gl short URLs (August 25, 2025). In previous posts, I looked at how many of these goo.gl URLs the Wayback Machine had archived (at least 1.3M at the time), and estimated that 4,000 goo.gl short links used in scholarly publications would be lost. In early September, I verified that Google had indeed implemented this terrible, horrible, no good, very bad idea. It's now mid-October, and I'm just now getting a chance to write up this behavior. While the redirections are missing from the live web, many (most? all?) have been archived, and it also turns out they leave a live web tombstone.
First, let's look at the live web goo.gl URLs. The ones that were not lucky enough to receive traffic in "late 2024" no longer return a sunset message, and now return a garden variety HTTP 404 (image above, HTTP response below):
% curl -Is https://goo.gl/0R8XX6 | head -7
HTTP/2 404
content-type: text/html; charset=utf-8
cache-control: no-cache, no-store, max-age=0, must-revalidate
pragma: no-cache
expires: Mon, 01 Jan 1990 00:00:00 GMT
date: Tue, 14 Oct 2025 22:31:25 GMT
content-length: 0
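When scripting this check, the status code can be pulled out of the header dump instead of read by eye. The following is a minimal sketch: the function name `status_from_headers` is hypothetical, and it assumes the status line is the first line of the `curl -Is` output, as in the response above.

```shell
# Hypothetical helper: print the numeric status code from `curl -Is` output
# read on stdin. Assumes the first line is the status line, e.g. "HTTP/2 404".
status_from_headers() {
  # curl prints header lines with CRLF endings, so drop the carriage returns
  # before splitting the status line on whitespace.
  tr -d '\r' | awk 'NR == 1 { print $2 }'
}
```

It would be used as `curl -Is https://goo.gl/0R8XX6 | status_from_headers`.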
Recall that goo.gl/0R8XX6 was the one of the 26 shortened URLs from a 2017 survey of data sets for self-driving cars that was not lucky enough to have received traffic during late 2024, and thus was no longer going to redirect (the other 25 shortened URLs are still redirecting). One reason I had put off posting about this finding is that, other than saying that they did the thing they said they were going to do, there wasn't a surprise or interesting outcome. But it turns out I was wrong: it appears that you can look at the HTML entity to determine if there was ever a redirection at the now-404 shortened URL.
I wanted to test whether goo.gl would return a 410 Gone response for URLs that no longer redirect. The semantics of a 410 are slightly stronger than those of a 404, in that a 410 allows you to infer that there used to be a resource identified by this URL, but there isn't now. A regular 404 doesn't allow you to distinguish between something that used to be 200 (or 302*, in the case of goo.gl) and something that was never 200 (or 301, 302, etc.). Unfortunately, 410s are rare in the live web, but goo.gl deprecating some of its URLs seemed like a perfect opportunity to use them. But in my testing of shortened URLs, I discovered that you get a different HTML entity depending on whether the goo.gl URL ever existed or not.
Let's take a look at the HTML entity that comes back via curl (I've created a gist with the full responses, but here I'll just show byte count):
% curl -s https://goo.gl/0R8XX6 | wc -c
1652
Doing the same thing for a shortened URL that presumably never existed, we get a response that's about 5X bigger (9,237 bytes vs. 1,652 bytes), even though it's still an HTTP 404:
% curl -s https://goo.gl/asdkfljlsdjfljasdljfl | wc -c
9237
% curl -Is https://goo.gl/asdkfljlsdjfljasdljfl | head -7
HTTP/2 404
content-type: text/html; charset=utf-8
vary: Sec-Fetch-Dest, Sec-Fetch-Mode, Sec-Fetch-Site
cache-control: no-cache, no-store, max-age=0, must-revalidate
pragma: no-cache
expires: Mon, 01 Jan 1990 00:00:00 GMT
date: Wed, 15 Oct 2025 00:00:10 GMT
We can see that goo.gl/asdkfljlsdjfljasdljfl produces a completely different (Firebase-branded**) HTML page:
[Figure: goo.gl/asdkfljlsdjfljasdljfl -- still 404, but a different HTML entity]
Note that the 404 page shown in the top image is the same Google-branded 404 page that one gets from google.com; for example google.com/asdkfljlsdjfljasdljfl.
[Figure: https://google.com/asdkfljlsdjfljasdljfl]
It's possible there's a regular expression that checks for goo.gl style hashes in the URLs and "asdkfljlsdjfljasdljfl" was handled differently. So next I tested a pair of six-character hashes, goo.gl/111111 vs. goo.gl/111112, and got the same behavior: both 404, but 111112's HTML was 5X bigger than 111111's HTML:
% curl -Is https://goo.gl/111111 | head -7
HTTP/2 404
content-type: text/html; charset=utf-8
cache-control: no-cache, no-store, max-age=0, must-revalidate
pragma: no-cache
expires: Mon, 01 Jan 1990 00:00:00 GMT
date: Wed, 15 Oct 2025 00:05:28 GMT
content-length: 0
% curl -Is https://goo.gl/111112 | head -7
HTTP/2 404
content-type: text/html; charset=utf-8
vary: Sec-Fetch-Dest, Sec-Fetch-Mode, Sec-Fetch-Site
cache-control: no-cache, no-store, max-age=0, must-revalidate
pragma: no-cache
expires: Mon, 01 Jan 1990 00:00:00 GMT
date: Wed, 15 Oct 2025 00:05:34 GMT
% curl -s https://goo.gl/111111 | wc -c
1652
% curl -s https://goo.gl/111112 | wc -c
9222
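The "5X" figure can be checked directly from the `wc -c` counts above. This is just a quick arithmetic sketch; the function name `size_ratio` is hypothetical.

```shell
# Hypothetical helper: ratio of two body sizes, rounded to one decimal place.
# Arguments: larger byte count, smaller byte count.
size_ratio() {
  awk -v big="$1" -v small="$2" 'BEGIN { printf "%.1f\n", big / small }'
}

size_ratio 9222 1652   # goo.gl/111112 vs. goo.gl/111111
size_ratio 9237 1652   # goo.gl/asdkfljlsdjfljasdljfl vs. goo.gl/0R8XX6
```

Both pairs come out a little above 5.5, so "5X" is a slight understatement.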
[Figure: https://goo.gl/111111]
[Figure: https://goo.gl/111112]
Turns out that I was lucky with my first pair of random strings: goo.gl/111111 has an archived redirection and goo.gl/111112 does not, with 111111 also not being popular in "late 2024". While the archived redirection proves that there was a redirection for 111111, the lack of an archived redirection for 111112 technically does not prove that there was never a redirection (there could have been one and it wasn't archived). While I could spend more time trying to reverse engineer goo.gl and Firebase, I will be satisfied with my initial guess and trust my intuition, which says that the different returned HTML entities allow you to determine which goo.gl URLs used to redirect (i.e., de facto HTTP 410s) vs. which goo.gl URLs never redirected (i.e., correct HTTP 404s).
[Figure: https://web.archive.org/web/*/https://goo.gl/111111]
[Figure: https://web.archive.org/web/*/https://goo.gl/111112]
So the current status of goo.gl is even crazier than it first seems: rather than simply have all the goo.gl URLs redirect, they are keeping a separate list of goo.gl URLs that do not redirect. We now have:
- goo.gl URLs that still redirect correctly
- goo.gl URLs that no longer redirect, but goo.gl knows they used to redirect, because they return a Google-branded 404 page
- goo.gl URLs that never redirected (i.e., were never really goo.gl shortened URLs), for which goo.gl returns a Firebase-branded 404 page
I suppose we should be happy that they did not deprecate all of the goo.gl URLs, but surely keeping all of them would have been easier.
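The three cases above can be sketched as a small classifier. This is only an illustration of my guess, not documented goo.gl behavior: the function name `classify_googl` and the 5000-byte cutoff are assumptions, with the cutoff picked arbitrarily to sit between the 1,652-byte and roughly 9,200-byte pages observed above.

```shell
# Hypothetical classifier for a goo.gl response, based on the observations in
# this post. Arguments: HTTP status code, response body size in bytes.
# ASSUMPTION: 5000 bytes is an arbitrary cutoff between the 1,652-byte
# Google-branded page and the ~9,200-byte Firebase-branded page.
classify_googl() {
  status=$1
  bytes=$2
  case "$status" in
    301|302) echo "redirects" ;;
    404)
      if [ "$bytes" -lt 5000 ]; then
        echo "formerly-redirected"   # small, Google-branded 404 (de facto 410)
      else
        echo "never-existed"         # large, Firebase-branded 404
      fi
      ;;
    *) echo "unknown" ;;
  esac
}
```

The two arguments could come from something like `curl -s -o /dev/null -w '%{http_code} %{size_download}' https://goo.gl/0R8XX6`. Since the page sizes could change at any time, the cutoff should be re-checked before relying on it.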
Fortunately, web archives, specifically IA's Wayback Machine in this case, have archived these redirections. The Wayback Machine is especially important in the case of goo.gl/0R8XX6, since its redirection target, 3dvis.ri.cmu.edu/data-sets/localization/, no longer resolves, and the page is not unambiguously discoverable via a Google search. In this case, we need the Wayback Machine to get both the goo.gl URL and the cmu.edu URL.
[Figure: https://web.archive.org/web/*/https://goo.gl/0R8XX6]
[Figure: https://web.archive.org/web/20231125001435/https://goo.gl/0R8XX6 and https://web.archive.org/web/20190107062345/http://3dvis.ri.cmu.edu/data-sets/localization/]
So there is a possible, but admittedly unlikely, use case for this bit of knowledge. If you're resolving goo.gl URLs and get a 404 instead of a 302, then check the Wayback Machine; it probably has the redirect archived. If the Wayback Machine doesn't have the redirect archived, you can check the HTML entity returned in the goo.gl 404 response: Google-branded 404s (deprecated goo.gl URLs) are much smaller than Firebase-branded 404s (never valid goo.gl URLs). A small, Google-branded 404 page is a good indicator that there used to be a redirection, and if the Wayback Machine doesn't have it archived, maybe another web archive does.
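That decision procedure can be written down as a rough sketch. The function names `wayback_calendar_url` and `next_step`, the labels they print, and the 5000-byte cutoff are all hypothetical; the cutoff is simply a value between the two page sizes observed above. Whether the Wayback Machine has the redirect is passed in as "yes" or "no" and is left to a manual check.

```shell
# Wayback Machine calendar URL for a given URL, using the same
# https://web.archive.org/web/*/<url> pattern as the captures shown above.
wayback_calendar_url() {
  printf 'https://web.archive.org/web/*/%s\n' "$1"
}

# Hypothetical decision helper. Arguments: HTTP status from goo.gl, whether the
# Wayback Machine has the redirect archived ("yes"/"no"), and the size in bytes
# of the goo.gl response body.
# ASSUMPTION: 5000 bytes is an arbitrary cutoff between the small
# Google-branded 404 page and the larger Firebase-branded 404 page.
next_step() {
  status=$1
  archived=$2
  bytes=$3
  if [ "$status" != "404" ]; then
    echo "no-lookup-needed"          # e.g., the 302 still works
  elif [ "$archived" = "yes" ]; then
    echo "use-wayback-capture"
  elif [ "$bytes" -lt 5000 ]; then
    echo "try-other-archives"        # probably used to redirect
  else
    echo "probably-never-existed"
  fi
}
```

For example, `wayback_calendar_url https://goo.gl/0R8XX6` prints the calendar page to inspect, and `next_step 404 no 1652` suggests trying other web archives.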
So goodbye, goo.gl/0R8XX6. Happily, you were in archive.org a full two years before goo.gl knew you were dead.
--Michael
* While we're at it, why did goo.gl use 302s and not the more standard practice of 301s?!
** At some point, goo.gl URLs were implemented as "Firebase Dynamic Links", which were also deprecated on 2025-08-25.