2021-01-09: Embedded Tweets From @realDonaldTrump: They Won’t Break, But They Can Be Faked

On Friday, Twitter suspended Donald Trump's account due to concerns that his current and future tweets might continue to foment violence in the United States. Hayes Brown from MSNBC and Marshall Cohen from CNN echoed a concern I had when developing MementoEmbed: what happens to the embed if the source material or a service managing the embed goes away?

Are the embeds really dead? What can we do to provide evidence that the tweet actually existed?

Are Trump's tweet embeds dead?

No. As explained by Michael Nelson in the Twitter thread below, the embeds still display the tweet text because the tweet contents are stored in the embedding page by-value and thus the content is still accessible on the embedding page.

If we visit one of Trump's tweets directly on Twitter, we get the following:


If we embed the same tweet in a page, we get the following:

The embed code itself is quite simple. As seen below, an embed of a tweet contains HTML with the tweet text, author, author's Twitter handle, and date. The last element loads the JavaScript library widgets.js that applies formatting, labels, and other content to enhance this HTML so that the embed looks more like something from the Twitter website.

<blockquote class="twitter-tweet" data-lang="en">
    <p dir="ltr" lang="en">To all of those who have asked, I will not be going to the Inauguration on January 20th.</p>— Donald J. Trump (@realDonaldTrump) 
    <a href="https://twitter.com/realDonaldTrump/status/1347569870578266115?ref_src=twsrc%5Etfw">January 8, 2021</a>
</blockquote>
<script async="" charset="utf-8" src="https://platform.twitter.com/widgets.js"></script>

By including the tweet text, author, author's Twitter handle, date, and a link to the tweet, the HTML allows the embed to gracefully degrade if the browser cannot load widgets.js. The widgets.js JavaScript does not modify this HTML for suspended accounts, deleted tweets, or invalid tweets in general; it just doesn't bother formatting them.
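To illustrate how much survives by-value in the embed, here is a minimal Python sketch that recovers those fields from the HTML alone, without contacting Twitter or loading widgets.js. The names `TweetEmbedParser`, `parse_embed`, and `EMBED` are hypothetical and exist only for this example, and the sample markup writes the dash before the author's name as `&mdash;`.

```python
from html.parser import HTMLParser


class TweetEmbedParser(HTMLParser):
    """Collect the fallback fields stored by-value in a tweet embed."""

    def __init__(self):
        super().__init__()
        self.in_embed = False   # inside <blockquote class="twitter-tweet">
        self.in_text = False    # inside the <p> holding the tweet text
        self.in_link = False    # inside the trailing <a> holding the date
        self.text = ""
        self.byline = ""
        self.date = ""
        self.url = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        if tag == "blockquote" and "twitter-tweet" in classes:
            self.in_embed = True
        elif self.in_embed and tag == "p":
            self.in_text = True
        elif self.in_embed and tag == "a" and not self.in_text:
            # Links inside <p> belong to the tweet text; the link after it
            # points at the tweet itself.
            self.in_link = True
            self.url = attributes.get("href")

    def handle_endtag(self, tag):
        if tag == "blockquote":
            self.in_embed = False
        elif tag == "p":
            self.in_text = False
        elif tag == "a":
            self.in_link = False

    def handle_data(self, data):
        if not self.in_embed:
            return
        if self.in_text:
            self.text += data
        elif self.in_link:
            self.date += data
        else:
            self.byline += data


def parse_embed(html):
    """Return the tweet text, byline, date, and URL found in embed HTML."""
    parser = TweetEmbedParser()
    parser.feed(html)
    parser.close()
    return {
        "text": parser.text.strip(),
        # Drop the leading dash (U+2014) that precedes the author's name.
        "byline": parser.byline.strip().lstrip("\u2014").strip(),
        "date": parser.date.strip(),
        "url": parser.url,
    }


# Sample markup modeled on the embed shown above.
EMBED = """
<blockquote class="twitter-tweet" data-lang="en">
    <p dir="ltr" lang="en">To all of those who have asked, I will not be going to the Inauguration on January 20th.</p>&mdash; Donald J. Trump (@realDonaldTrump)
    <a href="https://twitter.com/realDonaldTrump/status/1347569870578266115?ref_src=twsrc%5Etfw">January 8, 2021</a>
</blockquote>
"""

fields = parse_embed(EMBED)
print(fields)
```

Everything the parser returns comes from the embedding page itself, which is exactly why the text keeps displaying after a suspension, and also why the page author fully controls it.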

What can we do to provide evidence that the tweet actually existed?

From the perspective of a web page author, tweet embeds exist to serve two purposes:

  1. They provide a visualization of the tweet and its contents so that authors can demonstrate to readers what someone tweeted.
  2. They link to the actual tweet to provide evidence to readers of its existence.

Some web page authors forgo embedding and include a screenshot of the tweet instead. This satisfies the first purpose, but not the second. While I do not question the integrity of any author who includes screenshots, it is possible to modify or otherwise forge a screenshot. A link to the actual tweet could allow readers to visit the tweet and verify its existence for themselves. In this case, we are considering Twitter, through widgets.js, the trustworthy source for proving existence. A degraded embed does include the link, but the link goes to an unavailable tweet. So, how does one establish that a degraded embed refers to a tweet that truly existed? Let's start with creating some fake tweets.

How would fake tweet embeds work?

Some know that Michael Nelson and I have differing opinions when it comes to pineapple on pizza. The animation below demonstrates two @phonedude_mln tweets: one with a pro-pineapple argument and a second with an anti-pineapple argument (page used in this animation).

Note how the HTML displays first before widgets.js renders the embed.

Faking scenario #1: changing the text of the tweet in the embed

In the animation below (link to page), Michael changed the text in the embed HTML of the first tweet to "@shawnmjones pineapple on pizza is great!" and the text of the second to "@shawnmjones pineapple on pizza is great! anyone who disagrees should be exiled." In the first case, we changed a single word. In the second, we added an entire sentence to the tweet in an attempt to see if an "overflow" of the tweet text alters how it is rendered.

When the browser runs widgets.js to render these embeds, the JavaScript ignores what is in the HTML of the embed and loads the correct tweet text. This further demonstrates that widgets.js serves as an integrity check on embeds.



Faking scenario #2: creating an embed for a deleted tweet with the incorrect text

Here we deleted the second tweet from Twitter, but we left the embed with the incorrect text on the page. Note how widgets.js does nothing to indicate that the second embed refers to a deleted tweet. It allows the embed to gracefully degrade to the incorrect text.



Thus, a malicious author could apply a legitimate deleted tweet URL to an embed that has incorrect text, claim that the tweet once existed, and we would have no evidence to disprove their claim.

Faking scenario #3: creating an embed for a completely fake tweet that never existed

Let's start with the embed above of the tweet that was deleted from scenario #2. This embed has the wrong text, but the link does refer to a real tweet. Because the tweet was deleted, widgets.js will not render anything for it. Now, consider the tweet below that never existed.

If embeds for deleted and fake tweets look the same, then how does someone distinguish between a deleted tweet and a fake one? Maybe deleted tweets act differently if we try to visit their URLs?

If we remove the query string text of ?ref_src=twsrc%5Etfw from the tweet URL and paste the rest into our browser, we get the following messages for the deleted (left) and fake (right) tweets.

visiting deleted tweet in browser
visiting fake tweet in browser





We did not reuse the same image. These are two different screenshots. How do we distinguish fake tweets from deleted tweets?
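For readers who want to repeat this check on other embeds, the URL cleanup described above can be sketched in a few lines of Python. The function name `strip_query` is an assumption made for this example; the URL is the one from the embed code shown earlier.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url):
    """Return the URL without its query string or fragment."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


embed_href = "https://twitter.com/realDonaldTrump/status/1347569870578266115?ref_src=twsrc%5Etfw"
tweet_url = strip_query(embed_href)
print(tweet_url)
```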

Some potential solutions

Web archives could provide evidence that a tweet existed. We could reformat the embed to point to a memento. Consider the embedded tweet from Michael Nelson above. Unexpectedly, if we replace the URL in the embed's href with the URI-M from the Internet Archive of the same tweet (https://web.archive.org/web/20210109205943/https://twitter.com/phonedude_mln/status/1347970115766259715), then we get the following.

It still renders, but it links to the live web resource rather than the memento. This is likely an unintended effect from the widgets.js code. The code in widgets.js is extracting the URI-R from the URI-M by pattern matching. If we try using a URI-M with Trump's tweet embed, again from a suspended account, it degrades to just the content in the HTML.

To further demonstrate this pattern-matching behavior, we replaced the tweet URL in the embed code of Nelson's tweet with its URI-M https://archive.is/Kc7NZ from Archive.Today. The widgets.js library does not format it at all. This establishes that widgets.js is not really archive-aware, but the first example accidentally worked due to some pattern matching that searches the URL for a tweet ID. In fact, we've conducted additional experiments and found that patterns like XXXXXXXXXXhttp://twitter.com/acct/status/1347569870578266115 also work in the href of the embed.
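We have not inspected the expression that widgets.js actually uses, so the following Python sketch is only a guess at the kind of pattern matching that would be consistent with the three observations above. The regular expression `TWEET_URL_PATTERN` and the function `find_tweet` are hypothetical and are not taken from the widgets.js source.

```python
import re

# Assumed pattern: a Twitter status URL appearing anywhere in the href.
TWEET_URL_PATTERN = re.compile(
    r"https?://(?:www\.)?twitter\.com/(\w+)/status/(\d+)"
)


def find_tweet(href):
    """Return the account, tweet ID, and URI-R found in href, or None."""
    match = TWEET_URL_PATTERN.search(href)
    if match is None:
        return None
    account, tweet_id = match.groups()
    return {
        "account": account,
        "tweet_id": tweet_id,
        "uri_r": f"https://twitter.com/{account}/status/{tweet_id}",
    }


# Internet Archive URI-M: the original tweet URL is embedded in the URI-M.
ia = find_tweet(
    "https://web.archive.org/web/20210109205943/"
    "https://twitter.com/phonedude_mln/status/1347970115766259715"
)

# Archive.Today short URI-M: no Twitter URL to find.
at = find_tweet("https://archive.is/Kc7NZ")

# Junk prefix followed by a tweet URL.
junk = find_tweet("XXXXXXXXXXhttp://twitter.com/acct/status/1347569870578266115")

print(ia, at, junk)
```

Under this assumption, the Internet Archive URI-M resolves to the live tweet only because the URI-R happens to appear inside it, while the Archive.Today short URI-M offers nothing to match.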

We could not find Trump's latest tweet in Archive.Today, but we did find an Archive.Today memento of an earlier @realDonaldTrump tweet. The widgets.js library does not format it either. This further supports the hypothesis that the lack of formatting is caused not by the account's suspended status, but by the fact that widgets.js cannot find a Twitter URL within the embed.

A better approach than replacing the tweet's link with its URI-M would include Robust Links in the process. This way a reader can visit the tweet at its live URL, see that the account has been disabled, but still visit evidence of the tweet's existence in a web archive. Robust Links include additional attributes with the a element that provide links to the live web resource, a memento of that resource, and the date that the author viewed the resource. Below is an annotated example of a robust link for Michael Nelson's tweet.
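Separately from the annotated figure, here is a rough Python sketch that assembles such a link. The attribute names `data-originalurl`, `data-versionurl`, and `data-versiondate` come from the Robust Links specification. The helper name `robust_link`, the link text, and the choice of the live URL for `href` are illustrative assumptions, and the version date is simply read off the timestamp in the URI-M quoted earlier.

```python
from html import escape


def robust_link(original_url, memento_url, version_date, link_text):
    """Build an <a> element carrying the Robust Links data attributes."""
    return (
        f'<a href="{escape(original_url, quote=True)}" '
        f'data-originalurl="{escape(original_url, quote=True)}" '
        f'data-versionurl="{escape(memento_url, quote=True)}" '
        f'data-versiondate="{escape(version_date, quote=True)}">'
        f"{escape(link_text)}</a>"
    )


link = robust_link(
    original_url="https://twitter.com/phonedude_mln/status/1347970115766259715",
    memento_url=(
        "https://web.archive.org/web/20210109205943/"
        "https://twitter.com/phonedude_mln/status/1347970115766259715"
    ),
    version_date="2021-01-09",  # assumed: the date in the URI-M timestamp
    link_text="January 9, 2021",  # assumed link text
)
print(link)
```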



If the web page author loads robustlinks.js and robustlinks.css into the page, then the Robust Links dropdown menu symbol appears next to the robust link text. The reader can click this symbol and get a menu that provides access to the current version of the page, the memento, and other mementos for the same link without having to install additional browser extensions or other software. We show this menu in the screenshot below.

We use a Robust Link in the degraded embed of Trump's latest tweet below.

It would be nice if we could combine the Robust Links code with the widgets.js code, but the widgets.js code ceases to function if we include the Robust Links data attributes inside the link portion of the embed. We see this below with an example of Michael Nelson's still present tweet, but with Robust Links markup added. Fortunately, the Robust Links dropdown menu still works so readers can navigate to the tweet's live version or its mementos.

Can we combine some of these ideas? Seen below, authors could take the screenshot (this one from Yahoo! News), but also wrap the image in the robust link. This way they can convey what they saw in their own browser while also providing one or more links to evidence from third parties.

Thus, an author can convey what they saw in their own browser while also allowing readers to verify the existence of the tweet both on the live web and in web archives. Additionally, the reader can leverage the robust link to view mementos of this tweet in multiple web archives and compare different mementos to each other.

If we want to still use embeds, what can MementoEmbed do for us? Fortunately, MementoEmbed still works in this scenario. In the case of mementos of tweets, the tweet text is available in the og:description field of the meta element on the page. This field exists so that card services like MementoEmbed know which description to render in the card. Because MementoEmbed already applies the additional attributes of Robust Links to its HTML, as shown below, it can satisfy both needs of the web page author.

ARCHIVE.ORG   Preserved by ARCHIVE.ORG

“To all of those who have asked, I will not be going to the Inauguration on January 20th.”
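The og:description lookup described above can be sketched in Python as follows. This is not MementoEmbed's own code. The class `OGDescriptionParser`, the function `og_description`, and the sample page are hypothetical: the sample is a small stand-in for an archived tweet page, not a copy of the real memento, and its og:title value is invented.

```python
from html.parser import HTMLParser


class OGDescriptionParser(HTMLParser):
    """Remember the content of the first og:description meta element."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if (
            tag == "meta"
            and attributes.get("property") == "og:description"
            and self.description is None
        ):
            self.description = attributes.get("content")


def og_description(html):
    """Return the og:description value from a page, or None if absent."""
    parser = OGDescriptionParser()
    parser.feed(html)
    parser.close()
    return parser.description


# Hypothetical stand-in for the HTML of an archived tweet page.
SAMPLE_MEMENTO_HTML = """
<html><head>
<meta property="og:title" content="Donald J. Trump on Twitter">
<meta property="og:description" content="To all of those who have asked, I will not be going to the Inauguration on January 20th.">
</head><body></body></html>
"""

description = og_description(SAMPLE_MEMENTO_HTML)
print(description)
```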

Unfortunately, while MementoEmbed is Robust Links compatible and archive-aware, it is not Twitter-aware. More complex tweets that contain additional tweets, their own cards, videos, and images do not render as expected. The example below contains a quote tweet. MementoEmbed only sees the quoted tweet URL (https://t.co/3fs1oPVnAx) as text and does nothing to expand it because MementoEmbed is just relaying text from og:description.

ARCHIVE.ORG   Preserved by ARCHIVE.ORG

“Get smart Republicans. FIGHT! https://t.co/3fs1oPVnAx”
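A Twitter-aware card service would at least need to notice shortened links like this one before it could expand them. As a small sketch of that first step only, the Python below finds t.co links in a description string. The pattern and the function name `find_short_links` are assumptions made for this example, and the sketch does not resolve the links or render the quoted tweet.

```python
import re

# Assumed pattern for Twitter's t.co shortened links.
SHORT_LINK_PATTERN = re.compile(r"https?://t\.co/[A-Za-z0-9]+")


def find_short_links(description):
    """Return every t.co link that appears in the description text."""
    return SHORT_LINK_PATTERN.findall(description)


links = find_short_links("Get smart Republicans. FIGHT! https://t.co/3fs1oPVnAx")
print(links)
```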

Since 2015, Trump has posted more than 57,000 tweets. If the webmaster for cnn.com, thehill.com, or another news website wants to sustain the existing embeds for these tweets, because the effort to replace them manually would be too great, they could create their own widgets.js. It would be even better for readers if it were both Twitter-aware and archive-aware. It could still provide an integrity check on the embed while also providing Robust Links.

Summary

Trump's Twitter account suspension will not result in broken embeds in news articles. The embeds will degrade and continue to display the content of the tweet. This graceful degradation preserves the content, which may be a relief to page authors who embedded these tweets, but it also demonstrates how a web page author could insert fake tweets into a web page and then claim that the tweets have since been deleted or are otherwise unavailable. Could embeds be an additional front for the circulation of disinformation?

Web archives can help us demonstrate that a tweet truly existed with the content cited by the citing web page author, but links alone are not enough. In the case of tweets, embeds help readers view what the author saw while providing a link to the tweet for verification. If the tweeting account is suspended or the tweet is otherwise not available, the embed fails to support the author's argument that the tweet even existed. Embeds alone do not link to web archives, but by incorporating Robust Links, we can help readers easily verify the evidence presented by web page authors. We have outlined several potential solutions to address this problem, like adding Robust Links to screenshots or generating embeds of archived tweets with MementoEmbed. The ideal solution should be both archive-aware and Twitter-aware and incorporate Robust Links to help readers find other mementos as supporting evidence.

By providing embeds and robust links to mementos from multiple web archives, we have the tools to defend ourselves on the embed front of the disinformation war.

--Shawn M. Jones and Michael L. Nelson

Acknowledgements

I would like to thank Kelly Durkin Ruth (United States Naval Academy) and Herbert Van de Sompel for bringing these tweets to my attention. Kelly started me thinking in this direction with an early conversation about Cohen's tweet. Valentina Neblitt-Jones, Martin Klein, and Herbert further helped me discuss and flesh out some of the ideas present in this blog post. I noticed that Michael Nelson was explaining this in a Twitter thread while I was beginning this blog post, so we started to coordinate on a joint post. - Shawn
