2022-07-19: Review of "Facebook has started to encrypt links to counter privacy-improving URL Stripping"

 

This figure, from the original blog post, does not concern "fbclid" for external links.  The top URL is a deep link to this post on facebook.com and the URL shown in the dev tools is for the @ghacksnet Facebook account page.  Both URLs still have tracking URL query parameters of the form: "?__cft__[0]=...".

Yesterday I saw "Facebook Is Now Encrypting Links to Prevent URL Stripping", tweeted by @schneierblog, which was a summary of the original blog post, "Facebook has started to encrypt links to counter privacy-improving URL Stripping," by Martin Brinkmann (@ghacks).  I read it with interest because of the implications for web archiving, since lookup by URL is almost exclusively the manner of accessing web archives.  In summary, I cannot replicate any of the statements in the original blog post by Brinkmann.

The blog post begins with a discussion about Firefox's recent update which allows the user to specify stripping of common URL query parameters used for tracking.  For Facebook, this means the "?fbclid=..." parameter that it adds to external links shared in Facebook posts.  For example, in this Facebook post:

 

https://www.facebook.com/ghacksnet/posts/pfbid0247HbwuBJApJ9rJGJtNRHcvca8rFFquwFqya6pwjbDGYuvv6ksEvbc7R8rLjxqQjbl

The shared URL previewed in the post is to:

https://l.facebook.com/l.php?u=https%3A%2F%2Fwww.ghacks.net%2F2022%2F05%2F31%2Fmicrosoft-hopes-to-improve-edge-with-permanent-imports-from-chrome%2F%3Ffbclid%3DIwAR2nk5I-kMFdXjv_6keGMWDuL5lau4a14FdTfIVQnW7MEUhqJJD6hKvIhgY&h=AT1iOk8AV6dU8Q_hSkuk4DlNlooduZZPPfJ5kW7LMyNCZ_W_9T-RNuONvT2J6Qac7MvTubm5GjI4rfdSVoorjhK_wNeqpYvOl5Udczn16MCvSEwaaSene3T5zHHZw7xdnIrZZgskmpcelUvEbstmMw&__tn__=-UK-R&c[0]=AT1qGhyn4zni_fnlQA3hNhy95LFw7YpPWfXAHfBPKPFu808C3CwKpiNwlVDgPKvkQPmgorV9y2VmSNN-b4C6130qELqIef618k18AS9nHOZ1M73dOclcMuQVN3OPJ-NflB2h4bd1nsXW9-SJ29VFRkVcMkGriArcf-FQe65dexO3NMk

Which ultimately redirects to:

https://www.ghacks.net/2022/05/31/microsoft-hopes-to-improve-edge-with-permanent-imports-from-chrome/?fbclid=IwAR2F0RNcg3Rje6ALNS4iuZzumdDUUx4f2UdIN9JFX13GvwGiBbOCxSf7OvY

The "?fbclid=..." parameter is still present; it is just applied only to external links, in this case the link from facebook.com to ghacks.net.
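The relationship between the wrapped l.facebook.com link and its target can be sketched in a few lines of Python. This is only an illustration using the standard library: the function names are my own, the "h=EXAMPLE" value is a placeholder rather than a real token, and the sketch decodes the "u" parameter locally instead of following the redirect that Facebook actually performs.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def unwrap_facebook_redirect(url):
    """Return the target carried in the 'u' parameter of an l.facebook.com/l.php link.

    Hypothetical helper: it only decodes the URL and makes no network request.
    """
    parts = urlsplit(url)
    if parts.netloc == "l.facebook.com" and parts.path == "/l.php":
        for key, value in parse_qsl(parts.query):
            if key == "u":
                return value  # parse_qsl has already percent-decoded the value
    return url


def strip_params(url, names=("fbclid",)):
    """Drop the named query parameters and keep everything else."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k not in names]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# "h=EXAMPLE" is a placeholder; the real link carries longer opaque values.
wrapped = (
    "https://l.facebook.com/l.php?u=https%3A%2F%2Fwww.ghacks.net%2F2022%2F05%2F31%2F"
    "microsoft-hopes-to-improve-edge-with-permanent-imports-from-chrome%2F"
    "%3Ffbclid%3DIwAR2nk5I-kMFdXjv_6keGMWDuL5lau4a14FdTfIVQnW7MEUhqJJD6hKvIhgY"
    "&h=EXAMPLE"
)
target = unwrap_facebook_redirect(wrapped)
print(target)
print(strip_params(target))
```

Note that urlencode re-encodes any parameters that are kept, so a URL with unusual escaping may not round-trip byte for byte; for archive lookups that difference can matter.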

The Brinkmann blog post also says:

Previously, Facebook used the parameter fbclid for tracking purposes. Now, it uses URLs such as https://www.facebook.com/ghacksnet/posts/pfbid0RjTS7KpBAGt9FHp5vCNmRJsnmBudyqRsPC7ovp8sh2EWFxve1Mk2HaGTKoRSuVKpl?__cft__[0]=AZXT7WeYMEs7icO80N5ynjE2WpFuQK61pIv4kMN-dnAz27-UrYqrkv52_hQlS_TuPd8dGUNLawATILFs55sMUJvH7SFRqb_WcD6CCOX_zYdsebOW0TWyJ9gT2vxBJPZiAaEaac_zQBShE-UEJfatT-JMQT5-bvmrLz7NlgwSeL6fGKH9oY9uepTio0BHyCmoY1A&__tn__=%2CO%2CP-R instead.
The URL shown is not an external link; it is a link to an internal Facebook post page, the base URL of which is: https://www.facebook.com/ghacksnet/posts/pfbid0RjTS7KpBAGt9FHp5vCNmRJsnmBudyqRsPC7ovp8sh2EWFxve1Mk2HaGTKoRSuVKpl.

https://www.facebook.com/ghacksnet/posts/pfbid0RjTS7KpBAGt9FHp5vCNmRJsnmBudyqRsPC7ovp8sh2EWFxve1Mk2HaGTKoRSuVKpl

There is tracking information in the link, but it is with the "?__cft__[0]=..." parameter.  Regarding the above URL, Brinkmann also says:

The main issue here is that there it is no longer possible to remove the tracking part of the URL, as Facebook merged it with part of the required web address. Removing the entire construct after the ? would open the main Facebook page of Ghacks Technology News, but it won't open the linked post.

I cannot replicate this either; as of this writing, the following URLs lead to the same page:

https://www.facebook.com/ghacksnet/posts/pfbid0RjTS7KpBAGt9FHp5vCNmRJsnmBudyqRsPC7ovp8sh2EWFxve1Mk2HaGTKoRSuVKpl?__cft__[0]=AZXT7WeYMEs7icO80N5ynjE2WpFuQK61pIv4kMN-dnAz27-UrYqrkv52_hQlS_TuPd8dGUNLawATILFs55sMUJvH7SFRqb_WcD6CCOX_zYdsebOW0TWyJ9gT2vxBJPZiAaEaac_zQBShE-UEJfatT-JMQT5-bvmrLz7NlgwSeL6fGKH9oY9uepTio0BHyCmoY1A&__tn__=%2CO%2CP-R

https://www.facebook.com/ghacksnet/posts/pfbid0RjTS7KpBAGt9FHp5vCNmRJsnmBudyqRsPC7ovp8sh2EWFxve1Mk2HaGTKoRSuVKpl
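Reducing the first URL to the second can be sketched as follows. This is a minimal illustration, not Firefox's actual stripping logic: the helper names are hypothetical, treating "__tn__" as a tracking parameter alongside "__cft__[0]" is my assumption, and the "EXAMPLE" value stands in for the long opaque value in the real link. The code only rewrites strings; that the two URLs lead to the same page is something I checked by hand.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def is_tracking_param(name):
    # Assumption: only the two parameter names seen in this post are treated
    # as tracking. "__cft__[0]" is matched by prefix so other indexes are
    # covered as well.
    return name.startswith("__cft__") or name == "__tn__"


def strip_internal_tracking(url):
    """Remove the assumed tracking parameters from a facebook.com URL."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if not is_tracking_param(k)]
    return urlunsplit(parts._replace(query=urlencode(kept)))


base = ("https://www.facebook.com/ghacksnet/posts/"
        "pfbid0RjTS7KpBAGt9FHp5vCNmRJsnmBudyqRsPC7ovp8sh2EWFxve1Mk2HaGTKoRSuVKpl")
# "EXAMPLE" is a placeholder for the long opaque value in the real link.
with_params = base + "?__cft__[0]=EXAMPLE&__tn__=%2CO%2CP-R"

print(strip_internal_tracking(with_params) == base)
```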

Facebook has changed how they create links to internal posts.  At least as seen by the Internet Archive crawler, the change happened somewhere between 2022-04-04 (/posts/\d+) and 2022-04-05 (/posts/pfbid\w+).  The following page has two URLs:

This page is available at: 1) https://www.facebook.com/ghacksnet/posts/4950751111633509 and 2) https://www.facebook.com/ghacksnet/posts/pfbid0RjTS7KpBAGt9FHp5vCNmRJsnmBudyqRsPC7ovp8sh2EWFxve1Mk2HaGTKoRSuVKpl

https://www.facebook.com/ghacksnet/posts/4950751111633509 

https://www.facebook.com/ghacksnet/posts/pfbid0RjTS7KpBAGt9FHp5vCNmRJsnmBudyqRsPC7ovp8sh2EWFxve1Mk2HaGTKoRSuVKpl
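The two URL styles can be told apart with the same regular expressions used above to describe them. The sketch below is only a classifier for the path of a URL; the function name and the "numeric" / "pfbid" / "other" labels are my own, and nothing here can tell you that two differently styled URLs identify the same post.

```python
import re
from urllib.parse import urlsplit

# Patterns follow the /posts/\d+ and /posts/pfbid\w+ shapes described above,
# anchored to "/<account>/posts/<id>" with an optional trailing slash.
NUMERIC_POST = re.compile(r"^/[^/]+/posts/\d+/?$")
PFBID_POST = re.compile(r"^/[^/]+/posts/pfbid\w+/?$")


def post_url_style(url):
    """Classify a Facebook post URL by the shape of its path (query ignored)."""
    path = urlsplit(url).path
    if NUMERIC_POST.match(path):
        return "numeric"
    if PFBID_POST.match(path):
        return "pfbid"
    return "other"


print(post_url_style("https://www.facebook.com/ghacksnet/posts/4950751111633509"))
print(post_url_style(
    "https://www.facebook.com/ghacksnet/posts/"
    "pfbid0RjTS7KpBAGt9FHp5vCNmRJsnmBudyqRsPC7ovp8sh2EWFxve1Mk2HaGTKoRSuVKpl"
))
```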

The latter URL could have tracking information in the "...pfbid..." portion, but I verified with Sawood Alam that when we both went to facebook.com/ghacksnet/ and looked at the URL of the first story in the feed (@ ~5:20pm, EDT), we both got:

https://www.facebook.com/ghacksnet/posts/pfbid0PqbHsuq6uATTnJkZLsM4X89bXPWmFPV81vNuYqWgbrCn5mNv7NCbigTFAZFhuTRfl

We both saw that as the base URL, but with differing values in the "?__cft__[0]=..." parameter, which confirms the suspicion that the tracking information is still in the query parameter, even for internal links.  We were on different computers and different networks, and we used different Facebook accounts.

Facebook does appear to change the "/posts/pfbid/" URLs over time, though I'm not sure how or why.  All three of these URLs point to the same page:

https://www.facebook.com/ghacksnet/posts/pfbid023yPyyVYBSF6qHH7uwrZRBRYErxJgxnFX9PBQ1yDTyEaRjs4K3spKfbkRJHxvky46l

https://www.facebook.com/ghacksnet/posts/pfbid0247HbwuBJApJ9rJGJtNRHcvca8rFFquwFqya6pwjbDGYuvv6ksEvbc7R8rLjxqQjbl

https://www.facebook.com/ghacksnet/posts/pfbid037UmnZvXvCfTGaCvbJPTAbpDU8CraGmfzjkCk8cQCVuqi9qHkFfP9svtbtgjUixvrl

Perhaps these changing values encode the time and method of discovery: the first URL I discovered via an archived page (2022-05-31), the third URL I discovered via search, and I've lost track of how I discovered the second URL. 

The fact that Facebook changes the URLs of deep links to individual post pages does have implications for web archiving: URL aliases are the enemy of web archives, and in the example above there are at least three different (and opaque) URLs that lead to the same content.  However, the base of these URLs appears to be the same for different accounts when discovered at the same time and in the same manner.  Brinkmann's blog post is about stripping tracking URL query parameters, and those seem to still be in place: "?fbclid=..." for external links and "?__cft__[0]=..." for internal links.


--Michael

 

Thanks to Dr. Sawood Alam for his help in verifying Facebook behavior.

