Thursday, April 17, 2014

2014-04-17: TimeGate Design Options For MediaWiki

We've been working on the development, testing, and improvement of the Memento MediaWiki Extension.  One of our principle concerns is performance.

The Memento MediaWiki Extension supports all Memento concepts:
  • Original Resource (URI-R) - in MediaWiki parlance referred to as a "topic URI"
  • Memento (URI-M) - called "oldid page" in MediaWiki
  • TimeMap (URI-T) - analogous to the MediaWiki history page, but in a machine readable format
  • TimeGate (URI-G) - no native equivalent in MediaWiki; acquires a datetime from the Memento client, supplies back the appropriate URI-M for the client to render
This article will focus primarily on the TimeGate (URI-G), specifically the analysis of two different alternatives in the implementation of TimeGate.  In this article we use the following terms to refer to these two alternatives:
  • Special:TimeGate - where we use a MediaWiki Special Page to act as a URI-G explicitly
  • URI-R=URI-G - where a URI-R acts as a URI-G if it detects an Accept-Datetime header in the request
Originally, the Memento MediaWiki Extension used Special:TimeGate.
A Special:TimeGate datetime negotiation session would proceed as follows, also as described as Pattern 2.1 in Section 4.2.1 of RFC 7089:
  1. HEAD request is sent with Accept-Datetime header to the URI-R*; URI-R responds with a Link header containing the location of the URI-G
  2. GET request is sent with Accept-Datetime header to the URI-G; URI-G responds with a 302 response header containing the location of the URI-M
  3. GET request is sent to the URI-M; URI-M responds with a 200 response header and Memento content
Obviously, this consists of 3 separate round trips between the client and server.  This URI-G architecture is referred to as Special:TimeGate.
The duration for Special:TimeGate is represented by:
dstg = a + RTT + b + RTT + M + RTT
dstg = 3RTT + a + b + M                            (1)
  • a - time to generate the initial URI-R response in step 1
  • b - time to generate the URI-G response in step 2
  • M - time to generate the URI-M response in step 3
  • RTT - round trip time for each request
Based on a conversation with the Wikimedia team, we chose to optimize this exchange by reducing the number of round trips, effectively implementing Pattern 1.1 in Section 4.1.1 of RFC 7089:
  1. HEAD request is sent with Accept-Datetime header to the URI-R; URI-R response with a 302 response header containing the location of the URI-M
  2. GET request is sent to the URI-M; URI-M responds with a 200 response header and Memento content
This URI-G architecture is referred to as URI-R=URI-G.

The duration for URI-R=URI-G is represented by:
drg = B + RTT + M + RTT
drg = 2RTT + B + M                                 (2)
  • B - time to generate the URI-G response in step 1
  • M - time to generate the URI-M response in step 2
  • RTT - round trip time for each request
Intuitively, URI-R=URI-G should be faster.  It has fewer round trips to make between client and server.

For URI-R=URI-G to be the better choice, drg < dstg, which is the same as the following derived relationship:
2RTT + B + M < 3RTT + a + b + M
2RTT - 2RTT + B + M < 3RTT - 2RTT + a + b + M
B + M - M < RTT + a + b + M - M


B < RTT + a + b                                       (3)
First, let's try to acquire the value of the term a.

After review of the Wikimedia architecture, it also became apparent that caching was an important aspect of our design and architecture plans.  Because the initial architecture utilized a Special:TimeGate URI and 302 responses are not supposed to be cached, caching was not of much concern.  Now that we've decided to pursue URI-R=URI-G, it becomes even more important.
Experiments with Varnish (the caching server used by Wikimedia) indicate that the Vary header correctly indicates what representations of the resource are to be cached.  If the URI-R contains a Vary header with the value Accept-Datetime, this indicates to Varnish that it should cache each
URI-R representation in response to an Accept-Datetime in the request for that URI-R.  Other values of the Vary header have a finite number of values, but Accept-Datetime can have a near-infinite number of values, making caching near useless for URI-R=URI-G.

Those visitors of a URI-R that don't use Accept-Datetime in the request header will be able to reap the benefits of caching readily.  Memento users of a URI-R=URI-G system will never reap this benefit, because Memento clients send an initial Accept-Datetime with every initial request.
Caching is important to our duration equations because a good caching server returns a cached URI-R in a matter of milliseconds, meaning our value of a in (3) above is incredibly small, on the order of 0.1 seconds on average.

Next we attempt to get the values of b and B in (3) above.

To get a good range of values, we conducted testing using the benchmarking tool Siege on our demonstration wiki.  The test machine is running an Apache HTTP Server 2.2.15 on top of Red Hat Enterprise Linux 6.5.  This server is a virtual machine consisting of two 2.4 GHz Intel Xeon CPUs and 2 GB of RAM.  The test machine consists of two installs of MediaWiki containing the Memento MediaWiki Extension: one with Special:TimeGate implemented, and a second using URI-R=URI-G.

Both TimeGate implementations use the same function for datetime negotiation.  The only major difference being whether they are called from a topic page (URI-R) or a Special page.

Tests were performed against localhost to avoid benefits to using the installed Varnish caching server.

The output from siege looks like the following:

This output was processed to extract the 302 responses, which correspond to those instances of datetime negotiation (the 200 responses are just siege dutifully following the 302 redirect). The URI then indicates which version of the Memento MediaWiki Extension is installed. URIs beginning with /demo-special use the Special:TimeGate design option. URIs beginning with /demo use the URI-R=URI-G design option. From these lines we can compare the amount of time it takes to perform datetime negotiation using each design option.

The date of Mon, 30 Jun 2011 00:00:00 GMT was used for datetime negotiation, because the test wiki contains fan-created data for the popular book series A Song of Ice And Fire (aka Game of Thrones), and this date corresponds to a book released during the wiki's use.

Figure 1: Differences in URI-G performance between URI-R=URI-G and Special:TimeGate
Figure 1 shows the results of performing datetime negotiation against 6304 different wiki pages.  The plot shows the difference between the URI-R=URI-G durations and the Special:TimeGate durations. Seeing as most values are above 0, there is a marked benefit to using Special:TimeGate.

Why the big difference?  It turns out that the earliest MediaWiki hook in the chain that we can use for URI-R=URI-G is ArticleViewHeader, because we needed something that provides an object that allows access to both the request (for finding Accept-Datetime) and response (for providing a 302) at the same time.  This hook is called once all of the data for a page has been loaded, leading to a lot of extra processing that is not incurred by the Special:TimeGate implementation.

Figure 2:  Histogram showing the range of URI-R=URI-G values
Figure 2 shows a histogram with 12 buckets containing the values for the durations of URI-R=URI-G.  The minimum value is 0.56 seconds.  The maximum value is 12.06 seconds.  The mean is 1.24 seconds. The median is 0.77 seconds. The biggest bucket spans 0 and 1.0.
Figure 3: Histogram showing the range of Special:TimeGate values
Figure 3 shows a histogram also with 12 buckets (for comparison) containing the values for the duration of Special:TimeGate.  The Special:TimeGate values only stretch between 0.22 and 1.75 seconds. The mean is 0.6 seconds. The median is 0.59 seconds. The biggest bucket spans 0.5 and 0.6.

Using this data, we can derive a solution for (3).  The values for B range from 0.56 to 12.06.  The values for b range from 0.22 to 1.75 seconds.

Now, the values of RTT can be considered.

The round trip time (RTT) is a function of the transmission delay (dt) and propagation delay (dp):
RTT = dt + dp                                                (4)
And transmission delay is a function of the number of bits (N) divided by the rate of transmission (R)
dt = N / R                                                      (5)
The average TimeGate request-response pair consists of a 300 Byte HTTP HEAD request header + 600 Byte HTTP 302 response header + 20 Byte TCP header + 20 Byte IP header = 940 Byte  = 7520 bit payload.

For 1G wireless telephony (28,800 bps), the end user would experience a transmission delay of
dt = 7520 b / 28800 bps
dt = 0.26 s
So, in our average case for both URI-G implementations (using a = 0.1 for a cached URI-R in (3)):
B < RTT + a + b
B < dp + dt + a + b
1.24 s < dp + dt + 0.1 s + 0.6 s
we replace RTT with our value for 1G wireless telephony:
1.24 s < dp + 0.26 s + 0.1 s + 0.6 s
1.24 s < dp + 0.96 s
So, an end user with 1 G wireless telephony would need to experience an additional 0.22 s of propagation delay in order for URI-R=URI-G to even be comparable to Special:TimeGate.

Propagation delay is a function of distance and propagation speed:
dp = d / sp                                                (6)
Seeing as 1G wireless telephony travels at the speed of light, the distance one would need to transmit a signal to make URI-R=URI-G viable becomes
0.22 s = d / (299,792,458 m/s)
(0.22 s) (299,792,458 m/s) = d
d = 65,954,340.76 m = 65,954 km = 40,981 miles
This is more than the circumference of the Earth.  Even if we used copper wire (which is worse) rather than radio waves, the order of magnitude is still the same.  Considering the amount of redundancy on the Internet, the probability of hitting this distance is quite low, so let's ignore propagation delay for the rest of this article.

That brings us back to transmission delay.  At what transmission delay, and essentially what bandwidth, does URI-R=URI-G win out over Special:TimeGate using our average values for the generation of the 302 response?
B < dt + a + b from (1) and (4), dropping dp
1.24 s < dt + 0.1 s + 0.6 s
1.24 s < dt + 0.7 s
0.54 s < dt

dt = N / R
0.54 s = 7520 b / R
(0.54 s) R = 7520 b
R = 7520 b / 0.54 s
R = 13,925 bps = 13 kbps
Thus, those MediaWiki sites with users using something slower than a 14.4 modem will benefit from the URI-R=URI-G implementation for TimeGate using our average values for the generation of TimeGate responses.

Therefore, we have decided that Special:TimeGate provides the best performance in spite of the extra request needed between the client and server.  The reason that the intuitive choice did not work out in most cases is due to idiosyncrasies in the MediaWiki architecture, rather than network concerns, as originally assumed.

--Shawn M. Jones

* It is not necessary for a client to send an Accept-Datetime header to a URI-R.  Most Memento clients do (and RFC 7089 Section 3.1 demonstrates this), in hopes that they encounter a URI-R=URI-G pattern and can save on an extra request.

No comments:

Post a Comment