Monthly Archives: June 2009

Basics of HTTP

This is a refresher on the HTTP protocol (citation: RESTful Web Services, 2007, L. Richardson, S. Ruby, O’Reilly Media Inc).

What is HTTP?

HTTP stands for Hypertext Transfer Protocol.

Using the analogy of a document and envelope…

HTTP is a document-based protocol:

  1. A client puts a document into an envelope and sends it to a server.
  2. The server puts a response document into an envelope and sends it to the client.

HTTP has strict standards about the type of envelope but not what goes inside it.
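
To make the envelope idea concrete, here is a minimal sketch of how a request envelope could be assembled by hand. The `build_request` helper and the `www.example.com` host are made up for illustration; real clients do this for you.

```python
def build_request(method, path, headers, body=b""):
    """Assemble a raw HTTP/1.1 request: request line, headers, blank line, body."""
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # The envelope (request line + headers) ends with an empty line.
    head = "\r\n".join(lines) + "\r\n\r\n"
    # Whatever follows is the document; HTTP does not care what it contains.
    return head.encode("ascii") + body


# Hypothetical host, used only for illustration.
raw = build_request("GET", "/", {"Host": "www.example.com", "Accept": "text/html"})
print(raw.decode("ascii"))
```

The format of the first part is fixed by the standard, while the body is just bytes appended after the blank line.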

HTTP Request (Client)

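The HTTP Method

The method tells the server what the client wants done with the envelope, for example GET to fetch a document or POST to submit one.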

The Path

The part of the URI to the right of the hostname – this becomes the address on the envelope.
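
As a rough illustration, Python's standard `urllib.parse` module can split a URI into the hostname and the part that goes on the envelope. The URL below is a made-up example, not one from this post.

```python
from urllib.parse import urlsplit

# Hypothetical URL, used only for illustration.
parts = urlsplit("http://www.example.com/news/technology?page=2")

host = parts.netloc            # goes into the Host header
target = parts.path or "/"     # an empty path is sent as "/"
if parts.query:
    target += "?" + parts.query

print(host)
print(target)
```

The `target` value is what ends up on the request line, right after the method.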

Request Headers

These are key-value pairs that specify information about the envelope. There are 46 in all, including Host, User-Agent, Accept, Keep-Alive…

The Entity Body/Document/Representation

The document inside the envelope.  GET requests normally carry no entity body; all the necessary information is included in the path and request headers.

In POST requests the entity body could be structured XML that passes over a substantial amount of complex data.  Some web services take this data, build objects from the XML and feed them into a database.  However, you do not necessarily have to use XML (plain text works too), and you don’t have to put the data in a database: you can do what you want with the data once you receive it on the server.
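
Here is a small sketch of a POST request carrying an XML entity body, built with Python's `urllib.request.Request`. The URL and the XML payload are invented for illustration, and nothing is actually sent over the network.

```python
from urllib.request import Request

# Hypothetical payload; any bytes would do, XML is just one option.
xml_body = b"<order><item>book</item><quantity>2</quantity></order>"

req = Request(
    "http://www.example.com/orders",            # hypothetical endpoint
    data=xml_body,                               # the document inside the envelope
    headers={"Content-Type": "application/xml"},
    method="POST",
)

print(req.get_method())
print(req.get_header("Content-type"))  # Request stores header names capitalised this way
print(req.data)
```

Swapping the body for plain text would only mean changing `data` and setting Content-Type to `text/plain`.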

The following shows an example of a GET request to the BBC:

(Request-Line)    GET / HTTP/1.1
User-Agent    Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv: Gecko/2009060215 Firefox/3.0.11
Accept    text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language    en-gb,en;q=0.5
Accept-Encoding    gzip,deflate
Accept-Charset    ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive    300
Proxy-Connection    keep-alive
Cookie    AMOS_PREF=sac%3Dg4; BBC-UID=e4c73e81d4fc55653460b7caa03036f31b6ba10990a0b114b4ffb881cf0ba6450Mozilla%2f5%2e0%20%28Windows%3b%20U%3b%20Windows%20NT%205%2e1%3b%20en%2dGB%3b%20rv%3a1%2e8%2e1%2e12%29%20Gecko%2f20080201%20Firefox%2f2%2e0%2e0%2e12; BBCNewsAudience=Domestic; BBCNewsAudcWght=-99; BBCMediaSelector=m%3Arm&b%3Abb&st%3Ah&ms3%3A4; hp=+acv+ba+neaj+hj+oab*+c1+g1ab+mc2+rad*+da+f1a7b7c7d7+i+kca+la
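
The listing above is in the name/value layout of a browser tool. On the wire each header is written as `Name: value` and the header block ends with an empty line. The sketch below parses a shortened wire-format version of that request; `parse_request` is a made-up helper, and the Host value is an assumption because the listing does not show one.

```python
def parse_request(raw: bytes):
    """Split a raw HTTP request into method, path, version, headers and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    method, path, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return method, path, version, headers, body


raw_request = (
    b"GET / HTTP/1.1\r\n"
    b"Host: www.bbc.co.uk\r\n"              # assumed; not shown in the listing
    b"Accept-Language: en-gb,en;q=0.5\r\n"
    b"Accept-Encoding: gzip,deflate\r\n"
    b"\r\n"
)

method, path, version, headers, body = parse_request(raw_request)
print(method, path, version)
print(headers["accept-encoding"])
print(body)  # empty for this GET request
```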

HTTP Response (Server)

HTTP Response Codes

The code tells the client what happened to its request: did it succeed or fail?  There are lots of codes; the one you will most often have seen on the web is 404, which means “Not Found”, i.e. the web page was not found on the server.  A success code will usually be 200, i.e. you got to see the web page you asked for.
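
For a quick look at how codes map to meanings, Python ships the standard codes in `http.HTTPStatus`. The `describe` helper and the `CLASSES` table below are my own names; the grouping by first digit follows the HTTP standard.

```python
from http import HTTPStatus

# The first digit of a status code gives its general class.
CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code: int) -> str:
    """Return the code, its standard reason phrase and its class."""
    phrase = HTTPStatus(code).phrase
    return f"{code} {phrase} ({CLASSES[code // 100]})"


print(describe(200))  # 200 OK (success)
print(describe(404))  # 404 Not Found (client error)
```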

Response Headers

Pretty much the same as request headers: key-value pairs such as Date, Server, ETag, Content-Type, Last-Modified…

The Entity Body/Document/Representation

In response to a GET request you actually get an entity body sent to you (or at least you should do!)   The response headers also make an important contribution here: Content-Type tells the client which media type to expect (there are many of these), and in the case of a browser rendering an HTML web page it will commonly be text/html.

N.B. – media type can also be referred to as MIME type, content type or data type.
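
As a small sketch of how a client might read the media type, Python's `email.message.Message` can parse a Content-Type value. The `charset=utf-8` parameter is added here for illustration and does not come from the BBC response.

```python
from email.message import Message

msg = Message()
# Illustrative header value; the charset parameter is an assumption.
msg["Content-Type"] = "text/html; charset=utf-8"

media_type = msg.get_content_type()     # "text/html"
main_type = msg.get_content_maintype()  # "text"
charset = msg.get_content_charset()     # "utf-8"

print(media_type, main_type, charset)
```

A browser does much the same thing before deciding whether to render the body as HTML, show it as an image, or offer it as a download.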

The following is an example of a response from the BBC:

(Status-Line)    HTTP/1.1 200 OK
Via    1.1 BLADEWIN17
Connection    Keep-Alive
Proxy-Connection    Keep-Alive
Content-Length    110652
Date    Wed, 24 Jun 2009 16:00:01 GMT
Age    87
Content-Type    text/html
Server    Apache
Accept-Ranges    bytes
Keep-Alive    timeout=4, max=194
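
To round things off, here is a rough sketch that reads the status line and a few of the headers above in their wire format (`Name: value`). `parse_response_head` is a made-up helper, and only the head of the response is parsed because the 110652-byte body is not reproduced here.

```python
from email.utils import parsedate_to_datetime


def parse_response_head(head: str):
    """Parse the status line and headers of an HTTP response."""
    lines = head.strip().split("\r\n")
    version, code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only, since values such as Date contain colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers


head = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Length: 110652\r\n"
    "Date: Wed, 24 Jun 2009 16:00:01 GMT\r\n"
    "Content-Type: text/html\r\n"
    "Server: Apache\r\n"
)

version, code, reason, headers = parse_response_head(head)
body_size = int(headers["content-length"])        # bytes the client should expect
sent_at = parsedate_to_datetime(headers["date"])  # HTTP dates use the email date format

print(code, reason)
print(body_size)
print(sent_at.isoformat())
```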