2020-11-04: New Twitter UI: Replaying Archived Twitter Pages That Never Existed

 

Figure 1: Multiple Temporal Violations in an archived page with the new Twitter interface. 


When you visit web archives to go back in time and look at a web page, you naturally expect it to display the content exactly as it appeared on the live web at that particular datetime. That expectation assumes that all of the resources on the page were captured at or near the datetime displayed in the banner for the root HTML page. However, we noticed that this is not always the case: problems with archiving Twitter's new UI can result in replaying Twitter profile pages that never existed on the live web. In our previous blog post, we talked about how difficult it is to archive Twitter's new UI, and in this blog post, we uncover how the new Twitter UI mementos in the Internet Archive are vulnerable to temporal violations.


On Aug 18, 2020, we stumbled upon a recently archived memento (Figure 1) of Donald Trump’s Twitter profile page in the Internet Archive’s Wayback Machine and noticed that its tweets differed from what we saw on the live web: the latest tweet in the archived page was from Aug 16, 2020, more than a day earlier than the archival time. By comparing the live web to the memento, we counted 69 tweets posted between 8:48 PM on Aug 16 (2020-08-17T00:48Z) and 1:52 AM on Aug 18 (2020-08-18T05:52:23Z) that were missing from the memento. To verify that count, we used Twitter’s API and the Trump Twitter Archive. We found that there were actually 71 missing tweets: 69 tweets are still live and the other two are no longer available. Those two missing tweets were retweets by Donald Trump; one was deleted by the original author, and the other came from an account that was later suspended (effectively deleting that tweet). As those two tweets are gone from the live web, we would not have had any record of him retweeting those tweets/accounts if it were not for two mementos from 2020-08-17T23:05:12Z and 2020-08-17T01:27:16Z. This highlights the importance of archiving Donald Trump’s Twitter page frequently and, more importantly, of being able to replay those mementos coherently.


Intrigued by this, we kept digging further into this behavior and noted that the tweets in the main memento (Figure 1) are from a different datetime.  We used the Chrome developer tools to confirm that the tweets were indeed coming from 29 hours earlier than the memento datetime (2020-08-18T05:52:23Z). This is an example of a temporal violation. Focusing on the Internet Archive (IA), the archival datetime of a web page can be determined from the URI and the banner, but the archival datetime of different sections of a complex web page (composite memento) may or may not be temporally aligned. When archived embedded resources are incorrectly combined with an archived root HTML page to replay a web page that never existed on the live web, the memento is temporally violative (see “Evaluating the Temporal Coherence of Composite Mementos” by Scott Ainsworth for a formal discussion).


To our surprise, this memento from 2020-08-18T05:52:23Z is of Twitter's new UI. In our previous blog post, we described how the new UI is not easily archivable by most web archives, primarily because Twitter’s new UI talks to api.twitter.com, which imposes aggressive rate limiting. In the same blog post, we talked about problems related to archiving the new Twitter UI after Twitter shut down its "legacy" UI on 2020-06-01 and explored the differences between the UIs and their effects on web archiving. Previously, in the Internet Archive, these new UI mementos displayed Twitter’s error messages during replay. We also observed that Twitter returned an HTTP 429 “Too Many Requests” status code (Figure 2) while archiving using Conifer.


Figure 2: Conifer receiving an HTTP 429 response (2020-06-29T14:44:37)


We went back and looked at some of these error mementos referenced in our previous blog post and noticed that they were fixed retroactively and now display the new UI successfully (Figure 4). Figure 3 shows how a new UI memento from 2020-06-18T20:45:47Z is displaying the “Something went wrong, but don’t fret — let’s give it another shot” error, and Figure 4 shows the same memento successfully replaying Twitter’s new UI. We witnessed a temporal violation in Figure 4 where the 2020-06-18T20:45:47Z root HTML page is displaying content from one day in the past (2020-06-17). Although the error message is gone, it is now temporally violative: the memento in Figure 4 could not have existed on the live web as replayed.


Figure 3:  Twitter's Try again error page (Archived on 2020-06-18T20:45:47Z, Screenshot at 2020-07-02T07:30:32Z)



Figure 4: The same memento in Figure 3, but now replaying Twitter's new UI successfully (Screenshot on 2020-08-05T14:58:18Z)


This left us with a TimeMap containing a mix of new UI mementos (showing either the error message or the completely rendered new UI) and old UI mementos. For example, on Aug 7, 2020, we noticed that there were three new UI mementos, six old UI mementos, and one redirect to an old UI memento (Figure 5).


Figure 5: TimeMap of Donald Trump’s Twitter page displaying the 3 new UI mementos (N), 6 old UI mementos (O), and a redirect (R) for 2020-08-07. 

Why are the error pages now showing content? 


We investigated the cause of this issue by looking closely at how pages in the new Twitter UI are built. From the source code, we learned that the new UI provides a placeholder template and then builds the page from various JSON responses, as opposed to the old UI, where most of the content was embedded in the root HTML. The template includes the error messages that are triggered if a JSON response cannot be obtained. The mementos that previously showed errors were unable to obtain from the web archive the JSON responses necessary to build the page. However, these mementos are now displaying content with the help of archived JSON responses. For example, in Figure 4, the root HTML was archived on 2020-06-18T20:45:47Z, and later the necessary JSON was archived. This implies that a memento may not look the same each time you replay it. In this case, the memento improved from displaying Twitter’s error page on 2020-07-02 (Figure 3) to displaying content on 2020-08-05 (Figure 4).


We looked at the HTTP requests and responses of the archived page with the new UI to see how the different components are built. The Internet Archive redirects memento requests to the archived copy that is closest to the requested datetime. In this case, replay tries to populate the content by requesting the particular JSON, and when no JSON copy exists at the requested datetime, the closest copy is used to build the page. If the content in the closest JSON copy differs from the content on the live page at the time the root HTML memento was captured, the result is a memento that never existed on the live web. This is a temporal violation.
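The closest-copy behavior can be illustrated with a minimal sketch. It assumes that replay simply picks the capture with the smallest absolute time difference from the requested datetime; the function names and all capture lists below are hypothetical and only serve to show that the chosen copy can come from either the past or the future.

```python
from datetime import datetime
from typing import List


def parse_ts(ts: str) -> datetime:
    """Parse a 14-digit archive timestamp (YYYYMMDDhhmmss)."""
    return datetime.strptime(ts, "%Y%m%d%H%M%S")


def closest_capture(requested: str, captures: List[str]) -> str:
    """Return the capture nearest in time to the requested datetime.

    Assumption: 'closest' means smallest absolute time difference.
    """
    target = parse_ts(requested)
    return min(captures, key=lambda ts: abs(parse_ts(ts) - target))


# Hypothetical capture lists for a single JSON URI.
past_case = ["20200810120000", "20200817004843", "20200901120000"]
future_case = ["20200401000000", "20200519012531"]

# Root HTML from 2020-08-18 ends up paired with JSON from the day before.
print(closest_capture("20200818055223", past_case))
# Root HTML from 2020-05-12 ends up paired with JSON from the future.
print(closest_capture("20200512193932", future_case))
```

Nothing in this selection rule checks whether the chosen JSON could have coexisted with the root HTML, which is why the replayed page can be one that never existed.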


 

Screen capture of the 2020-09-17T22:31:49Z memento displaying the content from 2020-09-14T06:27:08Z.

Different Components in the Twitter Profile Page


We analyzed the HTTP requests and responses of the new UI memento to see how the different components in Twitter's new UI profile page are built. We looked at the following sections of the Twitter page:


  1. Profile timeline/Tweet feed

  2. Bio section

  3. Sidebar

    1. Media timeline

    2. “You might like” section 

    3. “What's happening” section


Figure 6: Five JSON responses are required to build the profile page in the new UI

  1. Profile Timeline/Tweet Feed

Figure 7: Profile timeline/ tweet feed


The tweet feed in your profile page displays all the tweets/retweets made by you. Twitter's new mobile-inspired UI is built differently from the old UI. In the old UI, the HTML itself had the first 20 tweets embedded in it. In the new UI, there are no embedded tweets; the tweets shown in Figure 7 are populated from a "timeline/profile" JSON, which requires an asynchronous request to api.twitter.com. We noticed that the tweet section in the page archived on 2020-08-18T05:52:23Z is populated with the "timeline/profile" JSON archived on 2020-08-17T00:48:43Z. The following curl command shows how the request for the 20200818055223 copy of the “timeline/profile” JSON receives an HTTP 302 redirect to a copy archived on 2020-08-17T00:48:43Z.


$ curl -ILs "http://web.archive.org/web/20200818055223/https://api.twitter.com/2/timeline/profile/25073877.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&send_error_codes=true&simple_quoted_tweet=true&include_tweet_replies=false&userId=25073877&count=20&ext=mediaStats%2ChighlightedLabel"| grep "^HTTP\|Location\|Date"

HTTP/1.1 302 FOUND

Date: Thu, 29 Oct 2020 02:35:31 GMT

Location: http://web.archive.org/web/20200817004843/https://api.twitter.com/2/timeline/profile/25073877.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&send_error_codes=true&simple_quoted_tweet=true&include_tweet_replies=false&userId=25073877&count=20&ext=mediaStats%2ChighlightedLabel

HTTP/1.1 200 OK

Date: Thu, 29 Oct 2020 02:35:31 GMT

Memento-Datetime: Mon, 17 Aug 2020 00:48:43 GMT
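The size of the jump in a redirect like this can be read directly from the two URIs. The sketch below assumes the Wayback Machine URI pattern seen in the curl output (a 14-digit timestamp after `/web/`); the helper name is ours, and the query strings are omitted from the example URIs for brevity.

```python
import re
from datetime import datetime, timedelta

# 14-digit timestamp that follows "/web/" in the URIs shown above.
WAYBACK_TS = re.compile(r"/web/(\d{14})/")


def memento_datetime(uri_m: str) -> datetime:
    """Extract the datetime embedded in a Wayback Machine URI."""
    match = WAYBACK_TS.search(uri_m)
    if match is None:
        raise ValueError("no 14-digit timestamp found in " + uri_m)
    return datetime.strptime(match.group(1), "%Y%m%d%H%M%S")


# Query strings dropped for readability; only the timestamps matter here.
requested = ("http://web.archive.org/web/20200818055223/"
             "https://api.twitter.com/2/timeline/profile/25073877.json")
redirected = ("http://web.archive.org/web/20200817004843/"
              "https://api.twitter.com/2/timeline/profile/25073877.json")

drift = memento_datetime(requested) - memento_datetime(redirected)
print(drift)  # 1 day, 5:03:40
```

A positive value means the JSON is older than the root HTML; a negative value would mean it comes from the future.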

  

In the old UI, the most recent 20 tweets are embedded in the HTML, making the memento temporally coherent (prima facie coherent) at that time. After a certain period of time, there is an XHR request to load 20 new tweets. Clicking on “See 20 new Tweets” populates the Twitter timeline from a newly requested archived JSON response. Sawood Alam has discussed the anatomy of Twitter's old UI timeline in his “Cookie Violations Cause Archived Twitter Pages to Simultaneously Replay in Multiple Languages” blog post. These newly loaded tweets can be from a different datetime, making this old UI memento temporally violative. For example, as shown in Figure 8, the old UI memento from 2020-08-17T09:02:05Z shows a "See 20 new Tweets" link. Selecting "See 20 new Tweets" gives us tweets from 2020-09-23T20:04:53Z, which is over a month in the future from when the memento was captured. This shows how both new UI and old UI mementos are susceptible to temporal violations.


Figure 8: Temporal violation in old UI memento captured on 2020-08-17T09:02:05Z. The memento showing “See 20 new Tweets” after 2 minutes (left), selecting which populates the tweets from over one month in the future (2020-09-23T20:04:53Z) (right).  


We noticed that in the new UI the content can be populated with JSON from far in the future as well. For example (Figure 9), in the new UI memento from 2020-05-12T19:39:32Z, the tweets are populated with the closest archived JSON from 2020-05-19T01:25:31Z, which means tweets from more than 6 days in the future can be seen in this archived Twitter page. This shows that temporal violations can happen from both the past and the future depending on the archived JSON closest to the memento datetime.


Figure 9: Memento captured on 2020-05-12T19:39:32Z is displaying tweets from more than 6 days in the future (2020-05-19T01:25:31Z)


Redirecting to the temporally closest available memento also means that replayed web pages can change when rendered later. For example (Figure 10), we verified on 2020-09-17 that the memento from 2020-09-17T22:31:49Z is being populated by tweets from 2020-09-14T06:27:08Z. When the same memento was opened on the next day (2020-09-18), we noticed that now the tweets were coming from 2020-09-16T20:49:40Z instead of 2020-09-14T06:27:08Z.


Figure 10: Content on memento from 2020-09-17T22:31:49Z varies depending on when it was replayed. The same memento was displaying tweets from 2020-09-14T06:27:08Z when replayed on 2020-09-17 (left), and tweets from 2020-09-16T20:49:40Z when replayed on 2020-09-18 (right).
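One possible explanation, which is an assumption on our part rather than something the observations establish, is that a closer JSON capture became available in the index between the two replays. The sketch below uses the timestamps from Figure 10; the two index states and the function names are hypothetical.

```python
from datetime import datetime


def to_dt(ts):
    """Parse a 14-digit archive timestamp (YYYYMMDDhhmmss)."""
    return datetime.strptime(ts, "%Y%m%d%H%M%S")


def replay_source(root_ts, indexed_captures):
    """Pick the indexed JSON capture nearest in time to the root HTML."""
    root = to_dt(root_ts)
    return min(indexed_captures, key=lambda ts: abs(to_dt(ts) - root))


root_html = "20200917223149"

# Assumed state of the index at the first replay (2020-09-17).
index_day_one = ["20200914062708"]
# Assumed state a day later, after a closer capture became available.
index_day_two = index_day_one + ["20200916204940"]

print(replay_source(root_html, index_day_one))  # 20200914062708
print(replay_source(root_html, index_day_two))  # 20200916204940
```

Under this assumption the root HTML never changes, yet the page it renders depends on what the index contains at replay time.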

  2. Bio Section


Figure 11: Bio section of the Twitter profile page


Twitter’s bio section is where you can introduce yourself and convey a message to people about your personality and/or interests as a part of customizing your profile as you wish. In the old UI, the bio section was pre-populated in the HTML itself and not loaded asynchronously, hence the bio section in the old UI is not temporally violative. We inspected an archived copy to see how this section is built in the new Twitter UI. We noticed that the root HTML page archived on 2020-08-18T05:52:23Z is populating the bio section with JSON archived on 2020-08-17T00:48:42Z. There is a potential temporal violation, but since most people edit their bio information infrequently, the section is possibly coherent. In simple terms, a temporal violation in the bio section has less impact since Trump’s bio information on 2020-08-17 is likely the same as what he had on 2020-08-18.


$ curl -ILs "http://web.archive.org/web/20200818055223/https://api.twitter.com/graphql/-xfUfZsnR_zqjFd-IfrN5A/UserByScreenName?variables=%7B%22screen_name%22%3A%22realdonaldtrump%22%2C%22withHighlightedLabel%22%3Atrue%7D" | grep "^HTTP\|Location\|Date"

HTTP/1.1 302 FOUND

Date: Thu, 29 Oct 2020 02:37:54 GMT

Location: http://web.archive.org/web/20200817004842/https://api.twitter.com/graphql/-xfUfZsnR_zqjFd-IfrN5A/UserByScreenName?variables=%7B%22screen_name%22%3A%22realdonaldtrump%22%2C%22withHighlightedLabel%22%3Atrue%7D

HTTP/1.1 200 OK

Date: Thu, 29 Oct 2020 02:37:54 GMT

Memento-Datetime: Mon, 17 Aug 2020 00:48:42 GMT
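The same check can be made from the response headers instead of the URI, since the Memento-Datetime header carries the capture time of the resource that was actually served. The sketch below parses the header text shown above with Python's standard library; the helper name `header_value` is ours.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# Final response headers copied from the curl output above.
raw_headers = """HTTP/1.1 200 OK
Date: Thu, 29 Oct 2020 02:37:54 GMT
Memento-Datetime: Mon, 17 Aug 2020 00:48:42 GMT"""


def header_value(raw, name):
    """Return the value of the first header called `name`, or None."""
    for line in raw.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == name.lower():
            return value.strip()
    return None


# Datetime of the root HTML memento that issued the request.
root_html = datetime(2020, 8, 18, 5, 52, 23, tzinfo=timezone.utc)
served = parsedate_to_datetime(header_value(raw_headers, "Memento-Datetime"))

gap = root_html - served
print(gap)  # 1 day, 5:03:41
```

Note that the Date header reflects when curl was run (2020-10-29), not when the resource was captured, so only Memento-Datetime is useful for this comparison.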


  3. Sidebar

3.1 Media Timeline

Figure 12: Media Timeline


Twitter’s media timeline section displays a few of the most recent images and videos from one's timeline. We noticed that the root HTML page archived on 2020-08-18T05:52:23Z, when it requests the “/timeline/media/25073877” JSON, is redirected to the 2020-08-17T00:33:32Z JSON memento, so the media timeline in the 2020-08-18 root HTML page shows the media from one day in the past (2020-08-17).  


$ curl -ILs "http://web.archive.org/web/20200818055223/https://api.twitter.com/2/timeline/media/25073877.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&send_error_codes=true&simple_quoted_tweet=true&count=20&ext=mediaStats%2ChighlightedLabel"  | grep "^HTTP\|Location\|Date"

HTTP/1.1 302 FOUND

Date: Thu, 29 Oct 2020 02:39:19 GMT

Location: http://web.archive.org/web/20200817003332/https://api.twitter.com/2/timeline/media/25073877.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&send_error_codes=true&simple_quoted_tweet=true&count=20&ext=mediaStats%2ChighlightedLabel

HTTP/1.1 200 OK

Date: Thu, 29 Oct 2020 02:39:20 GMT

Memento-Datetime: Mon, 17 Aug 2020 00:33:32 GMT


3.2 “You might like” Section

Figure 13: “You might like” section


In the “You might like” section on your profile page, you will see personalized suggestions/recommendations made by Twitter on who to follow. We noticed that the root HTML page archived on 2020-08-18T05:52:23Z, when it requests the “/users/recommendations” JSON, is redirected to the 2020-08-17T00:33:34Z JSON memento, so the 2020-08-18 root HTML page shows the “You might like” section from one day in the past (2020-08-17).


$ curl -ILs "http://web.archive.org/web/20200818055223/https://api.twitter.com/1.1/users/recommendations.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&skip_status=1&&pc=true&display_location=profile_accounts_sidebar&limit=4&user_id=25073877&ext=mediaStats%2ChighlightedLabel"  | grep "^HTTP\|Location\|Date"

HTTP/1.1 302 FOUND

Date: Thu, 29 Oct 2020 02:40:25 GMT

Location: http://web.archive.org/web/20200817003334/https://api.twitter.com/1.1/users/recommendations.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&skip_status=1&&pc=true&display_location=profile_accounts_sidebar&limit=4&user_id=25073877&ext=mediaStats%2ChighlightedLabel

HTTP/1.1 200 OK

Date: Thu, 29 Oct 2020 02:40:25 GMT

Memento-Datetime: Mon, 17 Aug 2020 00:33:34 GMT

 

 

 3.3 “What's happening” Section


Figure 14: “What's happening” section

 

In the “What’s happening” section of a profile page, one expects to see what topics are popular right this moment based on several factors like who you are following, your interests, and your geographical location. We noticed that the root HTML page archived on 2020-08-18T05:52:23Z, when it requests the “/guide” JSON, is redirected to the 2020-08-17T00:33:36Z JSON memento, so the “what’s happening” section in the 2020-08-18 root HTML page shows the content from one day in the past (2020-08-17, Figure 15 - left).

 

$ curl -ILs "http://web.archive.org/web/20200818055223/https://api.twitter.com/2/guide.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&send_error_codes=true&simple_quoted_tweet=true&display_location=web_sidebar&include_page_configuration=false&profile_user_id=25073877&entity_tokens=false&count=20&ext=mediaStats%2ChighlightedLabel"  | grep "^HTTP\|Location\|Date"

HTTP/1.1 302 FOUND

Date: Thu, 29 Oct 2020 02:41:41 GMT

Location: http://web.archive.org/web/20200817003336/https://api.twitter.com/2/guide.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&send_error_codes=true&simple_quoted_tweet=true&display_location=web_sidebar&include_page_configuration=false&profile_user_id=25073877&entity_tokens=false&count=20&ext=mediaStats%2ChighlightedLabel

HTTP/1.1 200 OK

Date: Thu, 29 Oct 2020 02:41:42 GMT

Memento-Datetime: Mon, 17 Aug 2020 00:33:36 GMT

 

After a certain amount of time (5 minutes), the JavaScript requests guide.json again to check for the most recent trends, but this time there is a slight change in the request parameters (count=20 becomes count=40, and cursor=DefaultTopCursorValue is added). Frequently, additional parameters in a URL are not significant, but this is a scenario where a change in query parameters plays an important role in triggering these temporal violations. We noticed that the root HTML page archived on 2020-08-18T05:52:23Z, when it requests the “/guide” JSON again, is redirected to the 2020-07-24T08:23:33Z JSON memento (this differs from what occurred earlier because of the change in parameters in the request URL), so the “what’s happening” section in the 2020-08-18 root HTML page shows content from over 3 weeks in the past (2020-07-24, Figure 15 - right).

 

$ curl -ILs "http://web.archive.org/web/20200818055223/https://api.twitter.com/2/guide.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&send_error_codes=true&simple_quoted_tweet=true&display_location=web_sidebar&include_page_configuration=false&profile_user_id=25073877&entity_tokens=false&count=40&cursor=DefaultTopCursorValue&ext=mediaStats%2ChighlightedLabel"  | grep "^HTTP\|Location\|Date"

HTTP/1.1 302 FOUND

Date: Thu, 29 Oct 2020 02:44:02 GMT

Location: http://web.archive.org/web/20200724082333/https://api.twitter.com/2/guide.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&send_error_codes=true&simple_quoted_tweet=true&display_location=web_sidebar&include_page_configuration=false&profile_user_id=25073877&entity_tokens=false&count=40&cursor=DefaultTopCursorValue&ext=mediaStats%2ChighlightedLabel

HTTP/1.1 200 OK

Date: Thu, 29 Oct 2020 02:44:03 GMT

Memento-Datetime: Fri, 24 Jul 2020 08:23:33 GMT
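To see exactly which parameters separate the two guide.json requests, the query strings can be compared key by key. The sketch below is illustrative only: the two URLs are shortened to a handful of the parameters from the curl commands above, and `query_diff` is a hypothetical helper.

```python
from urllib.parse import parse_qsl, urlsplit


def query_diff(url_a, url_b):
    """Map each differing query parameter to its (first, second) values."""
    a = dict(parse_qsl(urlsplit(url_a).query))
    b = dict(parse_qsl(urlsplit(url_b).query))
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# Shortened versions of the two requests (most parameters omitted).
first = ("https://api.twitter.com/2/guide.json?display_location=web_sidebar"
         "&profile_user_id=25073877&count=20"
         "&ext=mediaStats%2ChighlightedLabel")
second = ("https://api.twitter.com/2/guide.json?display_location=web_sidebar"
          "&profile_user_id=25073877&count=40&cursor=DefaultTopCursorValue"
          "&ext=mediaStats%2ChighlightedLabel")

print(query_diff(first, second))
# {'count': ('20', '40'), 'cursor': (None, 'DefaultTopCursorValue')}
```

Because the two requests are different URIs, they are matched against different sets of captures, which is consistent with the second request landing on a memento from 2020-07-24 rather than 2020-08-17.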

 

Unlike the "bio" section, the "What's happening" section changes frequently and reflects current events. For example, the topmost trend in the “What’s happening” section in the second stage is “Taylor Swift’s brand new album, Folklore, is here”. The album was indeed released on 2020-07-23 at midnight, over 3 weeks prior to the memento datetime of the archived copy, which displays it as just released and trending. One thing to note here is that these trends are non-personalized since the crawler is not logged in, but the mechanics would not change if it were logged in.

 

Figure 15 - What’s happening section: Content from 2020-08-17 (left) and content from 2020-07-24 (right) present in the archived copy of 2020-08-18.

 

This shows how the archived root HTML page continues to change after the initial replay as the JavaScript keeps running, which makes the temporal violations even worse. This behavior was first described in “"Refresh" For Zombies, Time Jumps” by Michael L. Nelson, but in this case, because the violative resource is JSON and the root HTML page does not change, it is harder to detect.


How far back does this problem go?

The earliest memento that we could find which displays this behavior is from 2019-10-25T16:10:49Z (Figure 16).  Here, the tweets are coming from one day in the future (2019-10-26T18:47:23Z). 

 

Figure 16: The earliest memento that displays temporal violations caused by similar behavior is from 2019-10-25T16:10:49Z

 

Last year, the Internet Archive published a blog post about the new and improved version of Save Page Now (SPN). Also, in our previous blog post, we talked about different versions of SPN, where one version was archiving the new Twitter UI and the other version was archiving the old Twitter UI. Since some time in early October 2020, the IA has been using only the newer version of SPN, which archives the old Twitter UI. However, we can still witness new UI mementos in IA because of the heterogeneity of the Wayback Machine’s holdings from many different sources.


When replaying mementos of Twitter profile pages, you need to take note if the memento is of the old UI or the new UI. The old UI will be temporally coherent with respect to the initial 20 tweets shown, but the “See 20 new Tweets” option is likely to be a temporal violation. With the new UI, any page that does not display an error (e.g., Figure 1) will likely have temporal violations and the 20 tweets shown are likely to not be the 20 tweets on the root HTML page at the time of archiving.    

Conclusion

Initially, the archived copies of Twitter's new UI pages were displayed as error pages during replay, but most of them are now replaying with temporal violations, which is arguably even worse because the violations are not necessarily easily detected. In our main example of the root HTML page archived on 2020-08-18T05:52:23Z, the JSON components for tweets, bio, media timeline, “You might like”, and “What’s happening” (after its second request) are coming from 1 day 5 hrs 4 mins, 1 day 5 hrs 4 mins, 1 day 5 hrs 19 mins, 1 day 5 hrs 19 mins, and 24 days 21 hrs 29 mins in the past, respectively.
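These offsets follow from the Memento-Datetime values in the curl outputs earlier in the post. As a worked check, the sketch below recomputes them, rounding to the nearest minute; the dictionary labels and the `describe` helper are ours, and the “What’s happening” entry uses the capture served for the second guide.json request.

```python
from datetime import datetime

FMT = "%Y-%m-%dT%H:%M:%SZ"
root = datetime.strptime("2020-08-18T05:52:23Z", FMT)

# Memento-Datetime of the JSON served for each component.
components = {
    "tweets": "2020-08-17T00:48:43Z",
    "bio": "2020-08-17T00:48:42Z",
    "media timeline": "2020-08-17T00:33:32Z",
    "You might like": "2020-08-17T00:33:34Z",
    "What's happening (second request)": "2020-07-24T08:23:33Z",
}


def describe(delta):
    """Format a timedelta as days, hours, minutes (nearest minute)."""
    total_minutes = round(delta.total_seconds() / 60)
    days, remainder = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(remainder, 60)
    return f"{days}d {hours}h {minutes}m"


offsets = {
    name: describe(root - datetime.strptime(ts, FMT))
    for name, ts in components.items()
}
for name, offset in offsets.items():
    print(f"{name}: {offset} in the past")
```

For the first guide.json request, which was served from 2020-08-17T00:33:36Z, the same calculation gives roughly 1 day 5 hrs 19 mins.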


In summary:

  • New UI mementos are vulnerable to temporal violations from both the past and the future, depending on the archived JSON closest to the memento datetime.

  • These new UI mementos are not just subject to temporal violations, but they can vary depending on when they are replayed. 


As the new Twitter UI serves many objects as JSON (including tweets and users), replaying archived Twitter pages continues to get worse. In this case, primary content (i.e., tweets) is subject to temporal violations whereas, in the past, at least the initial 20 tweets were temporally coherent. Looking back at our examples we can conclude that pages built with JSON responses are subject to temporal violations. This has implications for the historical record: replayed pages with Twitter's new UI are likely to have significant temporal violations that will be all but impossible for regular users to detect. Perhaps web archives should try to detect such common issues and acknowledge them to provide necessary context.


Acknowledgments

We are grateful for the guidance given by Dr. Michael Nelson, Dr. Michele Weigle, and Sawood Alam in compiling this blog post.


--

Himarsha Jayanetti (@HimarshaJ) and Kritika Garg (@kritika_garg)
