2013-09-09: MS Thesis: HTTP Mailbox - Asynchronous RESTful Communication

It is my pleasure to report the successful completion of my Master's thesis, entitled "HTTP Mailbox - Asynchronous RESTful Communication". I defended the thesis on July 11th, and the written thesis was accepted on August 23rd, 2013. In this blog post I will briefly describe the problem the thesis targets, followed by the solution we proposed and implemented. I will then walk through an example that illustrates the usage of the HTTP Mailbox and provide various links and resources to explore it further.

Traditionally, general web services used only the GET and POST methods of HTTP, while other methods such as PUT, PATCH, and DELETE were rarely utilized. Additionally, the Web was mainly navigated by humans using web browsers, clicking on hyperlinks or submitting HTML forms. Clicking on a link always issues a GET request, while HTML forms allow only the GET and POST methods. Recently, several web frameworks and libraries have started supporting RESTful web services through APIs. To support HTTP methods other than GET and POST in browsers, these frameworks use hidden HTML form fields as a workaround to convey the desired HTTP method to the server application. In such cases, the web server is unaware of the intended HTTP method because it receives the request as a POST; middleware between the web server and the application may then override the HTTP method based on the value of the special hidden form field. Unavailability of servers is another factor that affects communication. Because of the stateless and synchronous nature of HTTP, a client must wait for the server to be available to perform the task and respond to the request. Browser-based communication is also subject to cross-origin restrictions imposed for security reasons.
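
The hidden-field workaround described above can be sketched as a small WSGI middleware. This is only an illustration under stated assumptions: the field name `_method` is a convention used by several frameworks, and the class and function names here are hypothetical, not taken from any particular framework or from the HTTP Mailbox code.

```python
import io
from urllib.parse import parse_qs

# Methods a form is allowed to request through the hidden field (assumption).
ALLOWED_OVERRIDES = {"PUT", "PATCH", "DELETE"}


class MethodOverrideMiddleware:
    """Rewrite REQUEST_METHOD for form POSTs that carry a hidden method field."""

    def __init__(self, app, field="_method"):
        self.app = app
        self.field = field  # hypothetical default; frameworks differ

    def __call__(self, environ, start_response):
        content_type = environ.get("CONTENT_TYPE", "")
        is_form_post = (
            environ.get("REQUEST_METHOD") == "POST"
            and content_type.startswith("application/x-www-form-urlencoded")
        )
        if is_form_post:
            length = int(environ.get("CONTENT_LENGTH") or 0)
            body = environ["wsgi.input"].read(length)
            form = parse_qs(body.decode("utf-8", errors="replace"))
            override = form.get(self.field, [""])[0].upper()
            if override in ALLOWED_OVERRIDES:
                environ["REQUEST_METHOD"] = override
            # Put the body back so the wrapped application can still read it.
            environ["wsgi.input"] = io.BytesIO(body)
        return self.app(environ, start_response)


def echo_method_app(environ, start_response):
    """Toy application that reports the method it believes it received."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [environ["REQUEST_METHOD"].encode("ascii")]
```

Note that the web server in front of such a middleware still logs and treats the request as a POST; only the application behind the middleware sees the overridden method.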

We describe HTTP Mailbox, a mechanism to enable RESTful HTTP communication in an asynchronous mode with a full range of HTTP methods otherwise unavailable to standard clients and servers. HTTP Mailbox also allows for multicast semantics via HTTP. We evaluate a reference implementation using ApacheBench (a server stress testing tool) demonstrating high throughput (on 1,000 concurrent requests) and a systemic error rate of 0.01%. Finally, we demonstrate our HTTP Mailbox implementation in a human-assisted Web preservation application called "Preserve Me!" and a visualization application called "Preserve Me! Viz".

The HTTP Mailbox is inspired by the pre-Web distributed computing model Linda and the modern Web-scale distributed computing architecture REST. It tunnels HTTP traffic over HTTP using the message/http (or application/http) MIME type and stores the HTTP messages (requests and responses) along with some extra metadata for later retrieval. The HTTP Mailbox provides a RESTful API to send and retrieve asynchronous HTTP messages. For a quick walk-through of the thesis, please refer to the oral presentation slides (HTML) or access them on SlideShare. A complete copy of the thesis (PDF) is also publicly available:
Sawood Alam, HTTP Mailbox - Asynchronous RESTful Communication, MS Thesis, Computer Science Department, Old Dominion University, August 2013.


Our preliminary implementation code can be found on GitHub. We have also deployed an instance of our implementation on Heroku for public use. This instance internally uses the Fluidinfo service for message storage. Let us have a look at the deployed service to illustrate its usage.

Let us assume that we want to check the HTTP Mailbox to see if there are any messages for http://example.com/all. Our HTTP Mailbox API endpoint is located at http://httpmailbox.herokuapp.com/hm/. Hence, we make a GET request as illustrated below.

$ curl -i http://httpmailbox.herokuapp.com/hm/http://example.com/all
HTTP/1.1 404 Not Found
Content-Type: message/http
Date: Mon, 09 Sep 2013 16:59:13 GMT
Server: HTTP Mailbox
Content-Length: 0
Connection: keep-alive

This indicates that there are no messages for the given URI. Now let us POST something to that URI first. We have an example file named "welcome.txt" containing a valid HTTP message that we want to send to http://example.com/all.

$ cat welcome.txt
POST /all HTTP/1.1
Host: example.com
Content-Type: text/plain
Content-Length: 32

Welcome to the HTTP Mailbox! :-)
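
A file like this can also be generated rather than written by hand, which keeps the Content-Length header in sync with the body. The following is a minimal sketch; `build_enclosed_request` and its parameters are hypothetical names introduced for illustration and are not part of the HTTP Mailbox code.

```python
def build_enclosed_request(method, target, host, body,
                           content_type="text/plain", newline="\r\n"):
    """Serialize a simple HTTP/1.1 request to be enclosed as message/http.

    Content-Length is computed from the encoded body instead of typed by hand.
    """
    payload = body.encode("utf-8")
    head = newline.join([
        f"{method} {target} HTTP/1.1",
        f"Host: {host}",
        f"Content-Type: {content_type}",
        f"Content-Length: {len(payload)}",
        "",  # terminates the last header line
        "",  # blank line separating the headers from the body
    ])
    return head.encode("ascii") + payload


message = build_enclosed_request(
    "POST", "/all", "example.com", "Welcome to the HTTP Mailbox! :-)"
)
```

The sketch defaults to CRLF line endings, as the HTTP specification calls for. With bare LF line endings (`newline="\n"`) the same message comes to 114 bytes, which matches the Content-Length reported in the retrieval response later in this post.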

Now let us POST this message to the given URI.

$ curl -i -X POST --data-binary @welcome.txt \
> -H "Sender: hm-deployer" \
> -H "Content-Type: message/http" \
> http://httpmailbox.herokuapp.com/hm/http://example.com/all
HTTP/1.1 201 Created
Content-Type: message/http
Date: Mon, 09 Sep 2013 17:13:02 GMT
Location: http://httpmailbox.herokuapp.com/hm/id/ab3defce-dfa9-4d09-a72d-cac267531ca6
Server: HTTP Mailbox
Content-Length: 0
Connection: keep-alive
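
For clients that do not shell out to curl, the same send request can be prepared with Python's standard library. This is a sketch under assumptions: `build_send_request` is a hypothetical helper, and the request is only constructed here, not sent, since the public instance may not always be reachable.

```python
from urllib.request import Request

# Endpoint of the public instance used in this post.
MAILBOX_ENDPOINT = "http://httpmailbox.herokuapp.com/hm/"


def build_send_request(recipient_uri, enclosed_message, sender,
                       endpoint=MAILBOX_ENDPOINT):
    """Prepare a POST that drops an enclosed HTTP message into a mailbox.

    The recipient URI is appended to the endpoint, as in the curl example.
    """
    return Request(
        endpoint + recipient_uri,
        data=enclosed_message,
        headers={"Content-Type": "message/http", "Sender": sender},
        method="POST",
    )


request = build_send_request(
    "http://example.com/all",
    b"POST /all HTTP/1.1\nHost: example.com\n\n",
    "hm-deployer",
)
```

Passing the prepared object to `urllib.request.urlopen` would actually send it; a 201 Created response with a Location header, as in the transcript above, would indicate success.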

Now that we have POSTed the message, we can retrieve it anytime later.

$ curl -i http://httpmailbox.herokuapp.com/hm/http://example.com/all
HTTP/1.1 200 OK
Content-Type: message/http
Date: Mon, 09 Sep 2013 17:15:33 GMT
Link: <http://httpmailbox.herokuapp.com/hm/http://example.com/all>; rel="current",
 <http://httpmailbox.herokuapp.com/hm/id/ab3defce-dfa9-4d09-a72d-cac267531ca6>; rel="self",
 <http://httpmailbox.herokuapp.com/hm/id/ab3defce-dfa9-4d09-a72d-cac267531ca6>; rel="first",
 <http://httpmailbox.herokuapp.com/hm/id/ab3defce-dfa9-4d09-a72d-cac267531ca6>; rel="last"
Memento-Datetime: Mon, 09 Sep 2013 17:13:01 GMT
Server: HTTP Mailbox
Via: sent by 128.82.4.75 on behalf of hm-deployer, delivered by http://httpmailbox.herokuapp.com/hm/
Content-Length: 114
Connection: keep-alive

POST /all HTTP/1.1
Host: example.com
Content-Type: text/plain
Content-Length: 32

Welcome to the HTTP Mailbox! :-)
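
The body of this response is itself an HTTP request, so a recipient has to unpack it before acting on it. A minimal sketch of that step follows, assuming a simple request with no folded headers; `parse_enclosed_request` is a hypothetical helper, not the parser used in our implementation.

```python
import re

# The header section ends at the first blank line (CRLF or bare LF).
HEAD_END = re.compile(rb"\r?\n\r?\n")


def parse_enclosed_request(raw):
    """Split an enclosed HTTP request into request line, headers, and body."""
    match = HEAD_END.search(raw)
    head = raw[:match.start()] if match else raw
    body = raw[match.end():] if match else b""
    lines = [line.decode("iso-8859-1") for line in head.splitlines()]
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # names are case-insensitive
    return {
        "method": method,
        "target": target,
        "version": version,
        "headers": headers,
        "body": body,
    }


enclosed = parse_enclosed_request(
    b"POST /all HTTP/1.1\n"
    b"Host: example.com\n"
    b"Content-Type: text/plain\n"
    b"Content-Length: 32\n"
    b"\n"
    b"Welcome to the HTTP Mailbox! :-)"
)
```

Combining the Host header with the request target recovers the intended recipient, http://example.com/all, and the enclosed method (POST here) is preserved end to end, whatever method was used to talk to the mailbox.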

So far, there is only one message for the given URI. If more messages are posted to the same URI, the retrieval request above will retrieve only the last message of the chain. From there, the "Link" header can be used to navigate through the message chain.
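
To navigate the chain programmatically, a client first needs to turn the "Link" header into a lookup from relation name to URI. The sketch below handles the simple `<uri>; rel="name"` form seen in the response above; `parse_link_header` is a hypothetical helper and does not cover every parameter the Web Linking syntax allows.

```python
import re

# Matches one link-value of the form: <target>; rel="relation names"
LINK_PATTERN = re.compile(r'<([^>]*)>\s*;\s*rel="([^"]*)"')


def parse_link_header(value):
    """Return a dict mapping each relation name to its target URI."""
    links = {}
    for target, rels in LINK_PATTERN.findall(value):
        for rel in rels.split():  # rel may list several space-separated names
            links[rel] = target
    return links


link_header = (
    '<http://httpmailbox.herokuapp.com/hm/http://example.com/all>; rel="current", '
    '<http://httpmailbox.herokuapp.com/hm/id/ab3defce-dfa9-4d09-a72d-cac267531ca6>; rel="self", '
    '<http://httpmailbox.herokuapp.com/hm/id/ab3defce-dfa9-4d09-a72d-cac267531ca6>; rel="first", '
    '<http://httpmailbox.herokuapp.com/hm/id/ab3defce-dfa9-4d09-a72d-cac267531ca6>; rel="last"'
)
links = parse_link_header(link_header)
```

With a single message in the mailbox, the "self", "first", and "last" relations all point to the same message URI, as the transcript shows; a client would issue a GET on whichever relation it wants to follow.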

We have been using the HTTP Mailbox service in various applications, including "Preserve Me!" and "Preserve Me! Viz". The following screenshot illustrates its usage in "Preserve Me!".


We would like to thank GitHub for hosting our code, Heroku for running our HTTP Mailbox instance on their cloud infrastructure, and Fluidinfo for storing messages in their "tag and value" style RESTful storage system.

I am grateful to my advisor Dr. Michael L. Nelson, committee members Dr. Michele C. Weigle and Dr. Ravi Mukkamala, my colleagues, and everyone else who helped me in the process of getting my Master's degree. I am now continuing my research under the guidance of Dr. Michael L. Nelson at Old Dominion University.

Resources

--
Sawood Alam
