2024-03-20: The Curious Case of Twitter's Server-Side UI

Web archives are essential for researchers studying Twitter, offering a historical perspective on evolving conversations and trends. Archives ensure data integrity and preservation, which is crucial on a platform where content can be deleted or even, in some cases, modified. Additionally, web archives allow technical analysis of social media changes, enable comparative studies across time and platforms, and provide cultural insights, making them an essential resource for researchers in various fields. However, archiving Twitter has always been a challenge, not only because of the massive volume and dynamic nature of tweets but also because of Twitter's continually evolving user interface (UI). Studying the history of Twitter through web archives can lead to confusion, especially when encountering mementos that display different Twitter UIs from the same period.


This blog post delves into the nuances of two types of Twitter UIs, client-side and server-side, and how they coexist in web archives (Figure 1). Because these UIs overlap in web archives, researchers must navigate different interfaces that were in use at the same time. Understanding the distinction and interaction between these UIs is crucial for accurately interpreting the archived Twitter data.


Different Twitter UIs and their overlapping existence in web archives

Figure 1. Overlapping timelines of mementos with three different Twitter UIs in web archives.



New Client-side UI 

Twitter introduced a significant redesign of its website in July 2019, ultimately discontinuing the old server-side UI on June 1st, 2020, and prompting users to switch to the new client-side UI. The client-side UI is created and rendered in the user's browser, with JavaScript making API calls, allowing for a more dynamic, interactive experience that can adapt to the user's actions and preferences. In our paper "Challenges in replaying archived Twitter pages," we shed light on the difficulties web archives were experiencing in archiving the new UI, as there were vast differences between the old and new UI infrastructure. In Twitter's server-side UI, the tweets were embedded in the root HTML by the server. In the new UI, tweets are populated through asynchronous JSON responses and rendered on the client side. Because of this change, archives must make multiple calls to Twitter’s rate-limited API to archive a single Twitter page, increasing the complexity of preserving Twitter. We have covered this change in UI and its impact on web archives in a series of blog posts (1, 2, 3, 4).


Old Server-side UI 

Prior to 2020, Twitter underwent several UI changes, all of which were server-side. For simplicity, we’ll refer to all these interfaces as the old server-side UI. A server-side UI is generated and rendered on the server before being sent to the user's browser. Twitter shut down the old server-side UI on June 1st, 2020, requiring all desktop users to use the new client-side UI. However, the old server-side UI was still accessible to Googlebot.


Archiving Twitter's server-side UI is much easier because the content is embedded in the root HTML. Due to difficulties archiving the new UI, most web archives continued to archive the old UI by pretending to be Googlebot. As a result, replayed Twitter pages looked different from those on the live web: from June 2020 to June 2022, users saw the new client-side UI on the live web and the old server-side UI on archival replay (Figure 2).


Figure 2: Vast difference in the look of the archived server-side old UI (archived 2022-06-23) on the left and the live web version with client-side UI on the right.



New Server-side UI 

In late June 2022, Twitter updated the look of the server-side UI and made it similar to Twitter's client-side UI (Figure 3). We refer to this interface as the new server-side UI. If you archived a Twitter page between June 2022 and November 2023, you might not have noticed any differences between the archived and live web versions (Figure 3). We archived a Twitter page using the Internet Archive's (IA) Save Page Now (SPN) in October 2023. The archived version looks like the new Twitter UI without the sidebars. However, when we looked in our browser's Developer Console, we noticed the tweet content was embedded in the root HTML, as in Twitter’s old server-side UI. This update was good for web archives because it lessened the difference between the live Twitter page and its replay in the web archives.


Figure 3: Similarity between the memento of the new server-side UI on the left (archived 2023-10-19) and the live web version.


Slight disparities exist between the new client-side UI and the new server-side UI (Figure 4). The new server-side UI lacks the sidebars showing the sign-in options and the "You might like" section. It also does not have a Trends section that shows the latest trends on Twitter, although on the live web the trends information is only available if you are logged in. The login and sign-up bar appears at the bottom of the page in the new client-side UI, while it appears at the top in the new server-side UI. These small disparities can help you identify the mementos of the new server-side UI. Additionally, the new server-side UI appears to have been designed to simplify indexing for bots such as Googlebot. It primarily showcases the bio and the tweets, omitting extraneous details such as trending topics and the "You might like" section, which makes it more efficient for crawlers to index.


Figure 4: Small differences in the appearance of the memento of the new server-side UI on the left (archived 2023-10-19) and the live web version on the right.



We reviewed multiple mementos to investigate when the old server-side UI was last archived and when new server-side UI mementos started appearing. We discovered that this change happened to Elon Musk’s account much earlier than others (Figure 5): its first new server-side UI memento was on January 5, 2022, and its last old UI memento was on June 23, 2022. For other accounts, the change happened in late June 2022. We collected the account page mementos of five Twitter accounts: @NASA, @narendramodi, @KamalaHarris, @JoeBiden, and @elonmusk. In Figure 5, we colored their mementos based on the UI: new client-side UI (blue), old server-side UI (orange), and new server-side UI (red). The red dots start appearing for @elonmusk in January 2022, while for the other accounts they start appearing in late June 2022. The old server-side UI mementos (orange) stop appearing for all the accounts after June 2022. New client-side UI mementos (blue) appear occasionally.


Figure 5: Visualizing the mementos of five well-archived Twitter accounts. Before June 23, 2022 (dotted line), we could see primarily old server-side UI mementos (orange), and after, we could see new server-side UI mementos (red). @elonmusk account is an exception where new server-side UI mementos (red) started appearing early. We also occasionally see the new client-side UI mementos (blue).


The new server-side UI was short-lived. In late November 2023, we observed that Twitter discontinued the server-side UI for Googlebot. Now, if you set your user-agent to "Googlebot" and attempt to access any Twitter page, Twitter responds with a 404 status code. For instance, when we issued a curl request to 'https://twitter.com/elonmusk', we received an HTTP 200 OK response. However, when we issued the same request with the user agent set to Googlebot, we received an HTTP 404 Not Found response.


$ curl -iLs 'https://twitter.com/elonmusk' -H 'user-agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)' | head -2

HTTP/2 404

date: Fri, 23 Feb 2024 19:55:05 GMT


$ curl -iLs 'https://twitter.com/elonmusk' | head -2

HTTP/2 200

date: Fri, 23 Feb 2024 19:57:09 GMT


Although Googlebot encounters 404 errors, it is plausible to speculate that server-side support is still maintained for search engines, evidenced by the ongoing indexing of tweets (Figure 6). It's likely that Twitter grants special access to a select set of IP addresses, enabling them to access the server-side UI for indexing tweets. Consequently, entities attempting to crawl tweets while posing as Googlebot, such as web archives, are now unable to do so.


Figure 6: Google indexing @elonmusk tweets (archived on 2024-02-29)


Distinguishing Twitter Memento UIs by Comparing Their Content Sizes


We can distinguish among old server-side UI, new server-side UI, and new client-side UI Twitter mementos by looking at the size of the root HTML page. The content sizes of the three UIs seem to follow this pattern:


 Size of Old Server-Side UI > Size of New Server-Side UI > Size of New Client-Side UI    


To confirm this, we used the same URI-R (https://twitter.com/elonmusk) and dereferenced three different URI-Ms, one for each corresponding time frame for the respective UIs. We modified the URI-Ms by appending the 'id_' flag to the datetime field, ensuring that the page is retrieved exactly as it was archived, without any additional content from the archive. We then used the Unix "wc -c" command to determine the exact size of a file's content in bytes.


Old server-side UI memento: 

$ curl -s "https://web.archive.org/web/20220623150916id_/https://twitter.com/elonmusk" | wc -c

  720174


New server-side UI memento:

$ curl -s "https://web.archive.org/web/20220105032111id_/https://twitter.com/elonmusk" | wc -c

   51914


New client-side UI memento:

$ curl -s "https://web.archive.org/web/20220623163904id_/https://twitter.com/elonmusk" | wc -c

   21917
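The URI-M rewriting used in these commands can be wrapped in a small shell helper. This is a minimal sketch rather than part of our original analysis: the function name raw_urim is hypothetical, and it assumes the Wayback Machine URI-M layout seen above (https://web.archive.org/web/, then a 14-digit datetime, then the URI-R).

```shell
# Hypothetical helper (not from the original post): build a Wayback Machine
# URI-M that returns the raw archived bytes by appending the id_ flag to the
# 14-digit datetime. Assumes the URI-M layout used in the examples above.
# Usage: raw_urim <14-digit datetime> <URI-R>
raw_urim() {
  datetime=$1
  uri_r=$2
  # Reject empty values and anything containing a non-digit character.
  case $datetime in
    ''|*[!0-9]*)
      echo "raw_urim: datetime must be 14 digits" >&2
      return 1 ;;
  esac
  # Reject datetimes that are not exactly YYYYMMDDhhmmss long.
  if [ "${#datetime}" -ne 14 ]; then
    echo "raw_urim: datetime must be 14 digits" >&2
    return 1
  fi
  printf 'https://web.archive.org/web/%sid_/%s\n' "$datetime" "$uri_r"
}
```

With the helper defined, the first measurement above could be written as: curl -s "$(raw_urim 20220623150916 https://twitter.com/elonmusk)" | wc -c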


As we can see in the above example, the old server-side UI memento’s size (720174 bytes) is much larger than that of the new server-side UI memento (51914 bytes). Similarly, the new server-side UI memento’s size is more than twice that of the new client-side UI memento (21917 bytes). We used this technique to distinguish between the old and new server-side UI mementos of five Twitter accounts, as shown in Figure 5. Because Twitter account pages display content of varying lengths, the sizes change from one account to another, which makes it difficult to define a specific size range for each UI.
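The size comparison can be sketched as a rough shell heuristic. The function name classify_by_size is hypothetical, and the cutoffs of 35,000 and 100,000 bytes are assumptions loosely based on the size ranges in the summary table at the end of this post (reading "K" as 1,000 bytes). Since sizes vary by account, treat the result as a hint, not a definitive label.

```shell
# Hypothetical heuristic (not from the original post): guess the Twitter UI
# from the byte size of the raw root HTML, e.g. the output of `wc -c`.
# The 35000 and 100000 byte cutoffs are assumptions; tune them per account.
classify_by_size() {
  # Strip the padding that some `wc -c` implementations print.
  size=$(printf '%s' "$1" | tr -d '[:space:]')
  case $size in
    ''|*[!0-9]*)
      echo "classify_by_size: expected a byte count" >&2
      return 1 ;;
  esac
  if [ "$size" -gt 100000 ]; then
    echo "old server-side UI"
  elif [ "$size" -gt 35000 ]; then
    echo "new server-side UI"
  else
    echo "new client-side UI"
  fi
}
```

Applied to the three sizes measured above, this sketch labels 720174 as the old server-side UI, 51914 as the new server-side UI, and 21917 as the new client-side UI.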


You can also leverage certain differences in the root HTML of the three UIs to distinguish between them:

  • The old server-side UI root HTML contains a unique data-scribe-reduced-action-queue="true" attribute in the <html> tag.

  • The new server-side UI and client-side UI contain meta tags such as meta http-equiv="origin-trial".

  • The new client-side UI root HTML contains an error message: “Something went wrong, but don’t fret—let’s give it another shot.”
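These markers can be combined into a simple check on the raw root HTML. The sketch below is illustrative only, and the function name classify_by_markers is hypothetical. It assumes, following the summary table at the end of this post, that the "Something went wrong" message appears in the client-side root HTML, where no tweets are embedded. It matches only the start of that message to avoid depending on the exact apostrophe and dash characters.

```shell
# Hypothetical classifier (not from the original post): read raw root HTML
# on stdin and print a best guess of the UI based on the markers listed above.
# Assumption: the error message marks the client-side UI, as in the summary table.
classify_by_markers() {
  html=$(cat)
  case $html in
    *'data-scribe-reduced-action-queue="true"'*)
      echo "old server-side UI" ;;
    *'http-equiv="origin-trial"'*)
      # Both new UIs carry the origin-trial meta tag; the error message
      # is what separates them in this sketch.
      case $html in
        *'Something went wrong, but don'*) echo "new client-side UI" ;;
        *) echo "new server-side UI" ;;
      esac ;;
    *)
      echo "unknown" ;;
  esac
}
```

For example, the output of curl -s on a URI-M that uses the 'id_' flag can be piped into classify_by_markers.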


Conclusion

We explored the differences among the three user interfaces (UIs) and explained how to identify them. The table below summarizes key features to help you determine which UI is being displayed in the archive.


| Feature | Old Server-side UI | New Client-side UI | New Server-side UI |
|---|---|---|---|
| Time Period | Before June 2022 | After July 2019 | June 2022 - November 2023 (except for @elonmusk, which started in Jan 2022) |
| Look and Feel | Mostly static | Dynamic and modern | Static page, looks similar to client-side |
| Sidebars and Sections | Includes sign-in options, Media-timeline, Trends | Includes sign-in options, "You might like," Trends | N/A |
| Root HTML | Tweets, bio embedded in root HTML | The root HTML contains an error message: “Something went wrong, but don’t fret—let’s give it another shot.” | Tweets, bio embedded in root HTML |
| <html> Tag & Attribute | Attribute in the <html> tag: data-scribe-reduced-action-queue="true" | meta http-equiv="origin-trial" | meta http-equiv="origin-trial" |
| Twitter Account Page Size Range | Typically ranges from 100-100K | Typically ranges from 20-35K | Typically ranges from 35-100K |



In this blog post, we highlighted the significant challenges in archiving Twitter's evolving user interface, from the old server-side UI to the new client-side UI. The overlapping presence of different UIs in web archives presents unique challenges for researchers. Recent updates further complicate archiving strategies, underscoring the importance of understanding and differentiating between these versions for accurate historical analysis. This discussion emphasizes the dynamic nature of web content and the essential role of web archives in preserving the digital history of platforms like Twitter.




- Kritika Garg (@kritika_garg)

