May 29, 2014

Introduction to REST - Representational state transfer

Joe Gregorio goes over the basic principles behind REST.
RFC 2616: http://www.ietf.org/rfc/rfc2616.txt
RFC 3986: http://www.rfc-editor.org/rfc/rfc3986.txt

Representational state transfer is a software architectural style consisting of a coordinated set of architectural constraints applied to components, connectors, and data elements, within a distributed hypermedia system.
 

0:10
Gregorio: Hi. I'm Joe Gregorio,
0:12
and I work in Developer Relations at Google.
0:14
This talk is on REST and, in the talk,
0:16
I presume you're familiar with the Atom Publishing Protocol.
0:19
If you're not, you can watch my other video
0:21
"An Introduction to the Atom Publishing Protocol,"
0:23
and then come back and watch this one.
0:25
So let's begin.
0:26
You may have heard the term REST,
0:28
and a lot of protocols these days
0:29
are advertising themselves as REST.
0:32
REST comes from Roy Fielding's thesis
0:34
and stands for Representational State Transfer.
0:36
It's an architectural style.
0:38
Now, an architectural style is an abstraction
0:41
as opposed to a concrete thing.
0:43
For example, this Shaker house
0:45
is different from the Shaker architectural style.
0:48
The architectural style of Shaker
0:50
defines the attributes or characteristics
0:53
you would see in a house built in that style.
0:56
In the same way, the REST architectural style
0:59
is a set of architectural constraints
1:01
you would see in a protocol built in that style.
1:05
HTTP is one such protocol.
1:07
And, for the remainder of this talk,
1:09
we're just going to talk about HTTP.
1:11
And I'll refer back
1:12
to the architectural constraints of REST
1:14
as we work through that example.
1:16
Now, it's simply not possible to cover every aspect of HTTP,
1:20
so at the end of this presentation
1:22
there will be a further reading list,
1:23
if you'd like to learn more.
1:25
So why should you care about REST?
1:28
Well, it's the architecture of the Web as it works today.
1:31
And if you're going to be building applications
1:33
on the Web, shouldn't you be working
1:35
with the architecture instead of against it?
1:38
And, hopefully, as you see us go through this video,
1:41
there will be many opportunities
1:43
for increasing the performance
1:44
and scalability of your application,
1:47
and solving some traditionally tricky problems
1:49
by working with HTTP
1:50
and taking full advantage of its capabilities.
1:53
Let's get some of the basics down,
1:55
some nomenclature and the operation of HTTP.
1:58
At its simplest,
1:59
HTTP is a request-response protocol.
2:02
Your browser makes a request to the server,
2:05
the Web server gives you a response.
2:07
The beauty of the Web is that it appears very simple,
2:09
as if your browser is talking directly to the server.
2:13
So, let's look in detail
2:14
at a specific request and response.
2:17
Here is a GET request
2:19
to the URL http://example.org/news
2:25
and here's what the response looks like.
2:27
It's a 200 response
2:28
and what you're seeing here are the headers
2:31
and a little bit of the response body.
2:33
The request is to a resource identified by a URI,
2:36
in this case like I said, http://example.org/news.
2:42
Resources or addressability is very important.
2:45
The request is to a resource identified by a URI.
2:48
In this case, http://example.org/news.
2:53
The URI is broken down into two pieces.
2:56
The path goes into the request line,
2:59
and you can see the host shows up in the host header.
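
Editor's sketch: the split described above can be reproduced with Python's standard `urllib.parse`. The helper name `build_request` is hypothetical, and the output is a bare-bones HTTP/1.1 request head, not the exact request shown on the slide.

```python
from urllib.parse import urlsplit


def build_request(method: str, uri: str) -> str:
    """Build a minimal HTTP/1.1 request head for the given URI."""
    parts = urlsplit(uri)
    # The path (plus any query string) goes into the request line.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # The host part of the URI goes into the Host header.
    lines = [f"{method} {target} HTTP/1.1", f"Host: {parts.netloc}", "", ""]
    return "\r\n".join(lines)


request = build_request("GET", "http://example.org/news")
print(request)
```

The request line carries only `/news`, while `example.org` travels separately in the `Host` header, which is what lets one server answer for several host names.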
3:04
There is a method
3:06
and that's the action to perform on the resource.
3:08
There are actually several different methods
3:10
that can be used,
3:11
GET, PUT, DELETE, HEAD, and POST among others,
3:15
and each of those methods
3:16
has particular characteristics about them.
3:19
For example, GET is safe, idempotent, and cacheable.
3:23
Cacheable means the response can be cached
3:25
by an intermediary along the way,
3:27
idempotent means the request can be repeated multiple times with the same effect as doing it once,
3:31
and safe means there are no side effects
3:33
from performing that action.
3:35
So PUT is also idempotent,
3:38
but not safe, and not cacheable.
3:40
Same with DELETE, it is idempotent.
3:43
HEAD is safe and idempotent.
3:45
POST has none of those characteristics.
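
Editor's sketch: the method properties listed above can be written down as a small lookup table. The table records only what the talk states (the HTTP specifications add some nuance), and the helper names `can_retry` and `can_prefetch` are hypothetical.

```python
# Method properties exactly as stated in the talk.
METHOD_PROPERTIES = {
    "GET": {"safe", "idempotent", "cacheable"},
    "PUT": {"idempotent"},
    "DELETE": {"idempotent"},
    "HEAD": {"safe", "idempotent"},
    "POST": set(),
}


def can_retry(method: str) -> bool:
    """A client or intermediary may repeat an idempotent request."""
    return "idempotent" in METHOD_PROPERTIES[method.upper()]


def can_prefetch(method: str) -> bool:
    """Only safe requests can be issued speculatively (no side effects)."""
    return "safe" in METHOD_PROPERTIES[method.upper()]


for name in METHOD_PROPERTIES:
    print(name, "retry:", can_retry(name), "prefetch:", can_prefetch(name))
```

Because these properties belong to the method itself, any component on the path can apply them without knowing anything about the application.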
3:48
Also returned in that response
3:50
was the representation of that resource,
3:53
what lives at that URI.
3:55
The representation is the body
3:56
and, in this case, it was an HTML document.
3:59
HTML is a form of hypertext,
4:02
which means it has links to other resources.
4:05
Here is a traditional link that you would click on
4:07
to go to another page,
4:09
but there's more than one kind of link.
4:11
Here is a link to a CSS document
4:12
that the browser will call and include to style the page.
4:16
There are also other kinds of links.
4:17
Here's one to a JavaScript document
4:20
that will get pulled in.
4:21
This is a particularly important kind of hypertext
4:24
or document that's pulled in.
4:25
This is called Code on Demand,
4:27
the ability to load code into the browser
4:29
and execute it on the client.
4:32
The response headers show control data,
4:34
such as this header which controls how long
4:36
the response can be cached.
4:39
So now that we've looked
4:40
at a simple HTTP request and response,
4:43
let's go back and look at some of the characteristics
4:45
that a RESTful protocol is supposed to have.
4:48
Application state and functionality
4:50
are divided into resources.
4:52
Those resources are uniquely addressable
4:55
using a universal syntax for use in hypermedia links.
4:59
All resources share a uniform interface
5:01
for transferring the state
5:03
between the client and the server
5:04
consisting of a constrained set of well-defined operations,
5:08
a constrained set of content types
5:11
optionally supporting Code on Demand,
5:13
and a protocol which is client-server,
5:15
stateless, layered, and cacheable.
5:19
Now that we've already talked about
5:20
many of these aspects with HTTP,
5:22
we can see that we already have resources
5:25
that are identified by URIs,
5:27
and those resources have a uniform interface
5:30
understanding a limited set of methods
5:32
such as GET, PUT, POST, HEAD, and DELETE,
5:35
and that the representations are self-identified,
5:38
a constrained set of content types
5:39
that might not only be hypertext,
5:42
but could also include Code on Demand
5:44
such as the example we saw with JavaScript.
5:47
And we've even seen that HTTP is a client-server protocol.
5:51
To discuss the remainder of the characteristics
5:52
of the protocol,
5:54
we need to look at the underlying structure
5:55
of the Web.
5:57
We originally started out with a simplified example
5:59
of how the Web appears to a client.
6:01
Let's switch to using the right names
6:03
for each of those pieces.
6:05
They're the user agent and the origin server.
6:08
The reality is that the connections
6:11
between these pieces could be a lot more complicated.
6:14
There can be many intermediaries between you and the server
6:17
you're connecting to.
6:19
By intermediaries, we mean HTTP intermediaries,
6:22
which doesn't include devices at lower levels
6:24
such as routers, modems, and access points.
6:27
Those intermediaries are
6:29
the layered part of the protocol,
6:31
and that layering allows intermediaries to be added
6:34
at various points in the request-response path
6:36
without changing the interfaces between components
6:39
where they can do things to passing messages,
6:41
such as translation or improving performance with caching.
6:44
Intermediaries include proxies and gateways.
6:48
Proxies are chosen by the client,
6:49
while gateways are chosen by the origin server.
6:52
Despite the slide showing only one proxy and one gateway,
6:55
realize there may be several proxies and gateways
6:58
between your user agent and origin server,
7:01
or there may actually be none.
7:03
Finally, every actor in the chain,
7:05
from the user agent through the proxies
7:07
and the gateways to the origin server,
7:09
may have a cache associated with them.
7:11
If an intermediary does caching
7:14
and a response indicates that the response can be cached,
7:16
in this case for an hour,
7:18
then if a new request for that resource
7:20
comes within an hour,
7:22
then the cached response will be returned.
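
Editor's sketch: a toy model of the one-hour caching behaviour just described. The class `TinyCache`, the `origin` function, and the explicit `now` argument (seconds, passed in so the example needs no real clock) are all assumptions made for illustration; a real HTTP cache handles many more directives than `max-age`.

```python
import re
from typing import Callable, Dict, Optional, Tuple

Response = Tuple[Dict[str, str], bytes]


def max_age(headers: Dict[str, str]) -> Optional[int]:
    """Extract max-age (in seconds) from a Cache-Control header, if present."""
    match = re.search(r"max-age=(\d+)", headers.get("Cache-Control", ""))
    return int(match.group(1)) if match else None


class TinyCache:
    """Toy intermediary cache keyed by URI; understands only GET and max-age."""

    def __init__(self, fetch: Callable[[str], Response]) -> None:
        self._fetch = fetch
        self._store: Dict[str, Tuple[float, Response]] = {}

    def get(self, uri: str, now: float) -> Response:
        entry = self._store.get(uri)
        if entry is not None and now < entry[0]:
            return entry[1]  # still fresh: answer without contacting the origin
        response = self._fetch(uri)
        age = max_age(response[0])
        if age is not None:
            self._store[uri] = (now + age, response)
        return response


origin_hits = []


def origin(uri: str) -> Response:
    """Stand-in origin server that allows caching for one hour."""
    origin_hits.append(uri)
    return {"Cache-Control": "max-age=3600"}, b"<html>news</html>"


cache = TinyCache(origin)
cache.get("http://example.org/news", now=0)     # miss: goes to the origin
cache.get("http://example.org/news", now=1800)  # within the hour: served from cache
cache.get("http://example.org/news", now=4000)  # more than an hour later: refetched
print(len(origin_hits))
```

Only the first and third requests reach the origin; the second is answered by the intermediary.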
7:24
These caches finish out the major characteristics
7:27
of our REST protocol.
7:29
Now, we said this architecture had benefits.
7:32
What are some of those?
7:33
Let's first look at some of the performance benefits,
7:35
which include efficiency, scalability,
7:37
and user-perceived performance.
7:40
For efficiency,
7:41
all of those caches help along the way.
7:44
Your request may not have to reach all the way back
7:46
to the origin server
7:47
or, in the case of a local user agent cache,
7:50
you may never even hit the network at all.
7:52
Control data allows the signaling of compression,
7:55
so a response can be gzipped before being sent
7:57
to user agents that can handle it.
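
Editor's sketch: how control data can signal compression. The function name `encode_body` is hypothetical, and the `Accept-Encoding` parsing is deliberately simplified (it ignores quality values such as `gzip;q=0`).

```python
import gzip
from typing import Dict, Tuple


def encode_body(request_headers: Dict[str, str], body: bytes) -> Tuple[Dict[str, str], bytes]:
    """Compress the body only if the client advertised gzip support."""
    accepted = [
        token.split(";")[0].strip().lower()
        for token in request_headers.get("Accept-Encoding", "").split(",")
    ]
    if "gzip" in accepted:
        # Content-Encoding tells the user agent how to decode the payload.
        return {"Content-Encoding": "gzip", "Vary": "Accept-Encoding"}, gzip.compress(body)
    return {"Vary": "Accept-Encoding"}, body


body = b"<p>news</p>" * 200
headers, payload = encode_body({"Accept-Encoding": "gzip, deflate"}, body)
plain_headers, plain_payload = encode_body({}, body)
print(headers.get("Content-Encoding"), len(body), len(payload))
```

A client that never sent `Accept-Encoding: gzip` gets the body unchanged, so older user agents keep working.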
8:00
Scalability comes from many areas.
8:02
The use of gateways allows you to distribute traffic
8:04
among a large set of origin servers
8:06
based on method, URI, content type,
8:09
or any of the other headers coming in from the request.
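
Editor's sketch: a gateway choosing among origin-server pools using only the request head. The pool names and the routing rules are invented for illustration.

```python
from typing import Dict


def choose_pool(method: str, path: str, headers: Dict[str, str]) -> str:
    """Pick an origin-server pool from the method, URI path, and headers alone."""
    if method in ("PUT", "POST", "DELETE"):
        return "write-pool"
    if path.startswith("/images/"):
        return "static-pool"
    if "application/atom+xml" in headers.get("Accept", ""):
        return "feed-pool"
    return "default-pool"


print(choose_pool("GET", "/images/big-image.png", {}))
print(choose_pool("POST", "/entries/", {"Content-Type": "application/atom+xml"}))
print(choose_pool("GET", "/news", {"Accept": "application/atom+xml"}))
```

The gateway never has to open the message body to make this decision.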
8:12
Caching helps scalability also
8:13
as it reduces the actual number of requests
8:15
that make it all the way back to the origin server.
8:18
And statelessness allows a request to be routed
8:20
through different gateways and proxies,
8:22
thus avoiding introducing bottlenecks
8:23
and allowing more intermediaries to be added as needed.
8:28
Finally, user-perceived performance is increased
8:30
by having a reduced set of known media types
8:32
that allows browsers to handle known types much faster.
8:36
For example, partial rendering of HTML documents
8:38
as they download.
8:40
Also, Code on Demand allows computations
8:42
to be moved closer to the client
8:44
or closer to the server,
8:45
depending on where the work can be done fastest.
8:48
For example, having JavaScript to do form validation
8:51
before a request is even made to the origin server
8:54
is obviously faster
8:55
than round-tripping the form values to the server
8:58
and having the server return any validation errors.
9:01
Similarly, caching helps here, as requests may not need
9:04
to go completely back to the origin server.
9:06
Also, since GET is idempotent and safe,
9:09
a user agent could pre-fetch results before they're needed,
9:12
thus increasing user-perceived performance.
9:15
Lots of other benefits we won't cover,
9:17
but these are outlined in Roy's thesis.
9:20
But all these benefits aren't free.
9:22
You actually have to structure your application
9:24
or service to take advantage of them.
9:26
If you do, then you will get the benefits.
9:29
And if you don't, you won't get them.
9:31
To see how structuring helps, let's look at two protocols:
9:34
XML-RPC and the Atom Publishing Protocol.
9:52
So this is what an XML-RPC request looks like,
9:56
and here's an example response.
10:01
All of the requests in XML-RPC are POSTs.
10:04
So what do the intermediaries see of this request and response?
10:08
Is it safe? No.
10:09
Is it idempotent? No.
10:11
Or is it cacheable? No.
10:12
Even if they are, the intermediaries would never know that.
10:16
All the requests go to the same URI,
10:18
which means that if you're going to distribute many such calls
10:20
among a group of origin servers,
10:22
you would have to look inside the body
10:24
for the method name.
10:25
This gives the least amount of information to the Web,
10:29
and thus it doesn't get any help from intermediaries
10:31
and doesn't scale with off-the-shelf parts.
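
Editor's sketch: why XML-RPC calls look identical to intermediaries. It uses the standard library's `xmlrpc.client`; the method names `news.getLatest` and `news.deleteItem` and the `/RPC2` endpoint path are assumptions, not taken from the slides.

```python
import xmlrpc.client

# Two very different calls: one reads, one deletes. Both names are made up.
read_call = xmlrpc.client.dumps((10,), methodname="news.getLatest")
write_call = xmlrpc.client.dumps((42,), methodname="news.deleteItem")


def visible_to_intermediary(body: str) -> tuple:
    """What a proxy sees without parsing the body: the same method and URI every time."""
    return ("POST", "/RPC2")  # assumed endpoint path; the body is deliberately ignored


def route_by_body(body: str) -> str:
    """Routing by operation requires parsing the XML payload for the method name."""
    _params, method_name = xmlrpc.client.loads(body)
    return method_name


print(visible_to_intermediary(read_call) == visible_to_intermediary(write_call))
print(route_by_body(read_call), route_by_body(write_call))
```

From the outside, the harmless read and the destructive delete are the same `POST` to the same URI, so nothing along the path can cache, retry, or route them differently without parsing XML.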
10:33
So let's take a look at the Atom Publishing Protocol.
10:37
So for authoring to begin in the Atom Publishing Protocol,
10:39
a client needs to discover the capabilities and locations
10:42
of the available collections.
10:44
Service documents are designed
10:45
to support this discovery service.
10:48
To retrieve a service document, we send a GET to its URI.
10:52
GET is safe, idempotent, cacheable, and gzippable.
10:56
The response type is self-identifying.
10:58
As you can see, there's a content type header
11:00
of application/atomsvc+xml
11:03
that self-identifies what the content is specifically,
11:07
and the response itself is hypertext.
11:10
It contains URIs for each of the collections.
11:12
That's what's highlighted, in this slide here,
11:14
is the relative URI for the collection.
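
Editor's sketch: discovering collection URIs from a service document. The sample document, its titles, and its hrefs are invented; the namespaces are the ones published in the final Atom Publishing Protocol and Atom specifications, which may differ from the draft shown on the slide.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

NS = {"app": "http://www.w3.org/2007/app", "atom": "http://www.w3.org/2005/Atom"}

# Hypothetical service document for illustration only.
SERVICE_DOC = """
<service xmlns="http://www.w3.org/2007/app" xmlns:atom="http://www.w3.org/2005/Atom">
  <workspace>
    <atom:title>Main Site</atom:title>
    <collection href="entries/">
      <atom:title>Blog Entries</atom:title>
    </collection>
    <collection href="pics/">
      <atom:title>Pictures</atom:title>
    </collection>
  </workspace>
</service>
"""


def collection_uris(service_xml: str, base_uri: str) -> dict:
    """Map each collection title to its absolute URI."""
    root = ET.fromstring(service_xml.strip())
    found = {}
    for collection in root.findall("app:workspace/app:collection", NS):
        title = collection.findtext("atom:title", default="", namespaces=NS)
        # Relative hrefs are resolved against the service document's own URI.
        found[title] = urljoin(base_uri, collection.get("href"))
    return found


uris = collection_uris(SERVICE_DOC, "http://example.org/service/")
print(uris)
```

The client only needs the URI of the service document up front; everything else is found by following links in the responses.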
11:17
Once we have a collection URI,
11:19
we can POST an entry to create a new member,
11:22
and then GET, PUT, or DELETE the members at their own URIs.
11:25
So here's an example of a GET to a collection document.
11:29
Again, this is safe, idempotent, cacheable, and gzippable.
11:35
The response is also self-identifying here
11:38
as you have another content type,
11:40
application/atom+xml.
11:42
And again, the response is hypertext.
11:46
Lastly, the edit URI identifies
11:49
where the entry can actually be modified.
11:51
You can send a GET to that URI to retrieve the entry,
11:54
you can send a PUT to update the resource,
11:57
or you can send a DELETE to remove it
11:58
from the collection.
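
Editor's sketch: the three operations on an edit URI expressed as `urllib.request.Request` objects. The edit URI and the entry body are hypothetical, and nothing is sent over the network; the requests are only constructed and inspected.

```python
from urllib.request import Request

# Hypothetical edit URI, as it might appear in an entry's rel="edit" link.
EDIT_URI = "http://example.org/entries/1.atom"
ATOM_ENTRY = b"<entry xmlns='http://www.w3.org/2005/Atom'><title>Updated</title></entry>"

retrieve = Request(EDIT_URI, method="GET")
update = Request(
    EDIT_URI,
    data=ATOM_ENTRY,
    headers={"Content-Type": "application/atom+xml;type=entry"},
    method="PUT",
)
remove = Request(EDIT_URI, method="DELETE")

# Passing any of these to urllib.request.urlopen() would actually send it.
for req in (retrieve, update, remove):
    print(req.get_method(), req.full_url)
```

One URI, three standard methods: the operation is visible in the request line, which is exactly the information the XML-RPC style hides.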
11:59
So as you can see, the Atom Publishing Protocol
12:02
is designed with RESTful characteristics in mind
12:05
and gets many advantages
12:07
from intermediaries and the network itself
12:10
as those messages transfer back and forth.
12:12
So, let's look at some of the other idioms
12:15
that you can use in building your RESTful protocol
12:18
to get some of the advantages.
12:19
For example, long-lived images.
12:22
If you have large images
12:24
that need to be transferred back and forth
12:26
as part of your Web page, what you should do is
12:29
set the cache for those images to be very long.
12:32
If you need to update those images,
12:33
upload a new image to a new URI
12:35
and change the HTML to point to that new URI.
12:39
Here's an example where I have big-image.png.
12:43
And, if we retrieve that image,
12:45
you'll see that the cache control header
12:47
has been set to a very long time.
12:49
In this case, 30 days.
12:51
If we made a mistake, or we'd like to update that image,
12:54
what I need to do is upload a new image, big-image-2,
12:58
set the cache control for that to be very long,
13:01
and then update the HTML.
13:03
The idea here is that you keep the HTML
13:06
with the short cache lifetime,
13:07
and thus you can update that easily.
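
Editor's sketch: one way to automate the "new image, new URI" idiom. The talk renames the file by hand (big-image-2); deriving the name from a content hash is a swapped-in technique, and the short HTML lifetime of 300 seconds is an assumed value, since the talk only says "short".

```python
import hashlib

LONG_LIVED = "max-age=2592000"  # 30 days, the lifetime mentioned in the talk
SHORT_LIVED = "max-age=300"     # assumed value for the HTML


def versioned_name(filename: str, content: bytes) -> str:
    """Derive a new file name, and so a new URI, whenever the image bytes change."""
    stem, dot, ext = filename.rpartition(".")
    digest = hashlib.sha256(content).hexdigest()[:8]
    return f"{stem}-{digest}{dot}{ext}" if dot else f"{filename}-{digest}"


def cache_control_for(path: str) -> str:
    """Images get a long lifetime; the HTML that links to them stays short-lived."""
    return LONG_LIVED if path.endswith((".png", ".jpg", ".gif")) else SHORT_LIVED


old = versioned_name("big-image.png", b"first version")
new = versioned_name("big-image.png", b"corrected version")
print(old, new, cache_control_for(new), cache_control_for("index.html"))
```

Caches can hold each image for the full 30 days because a given URI never changes its content; only the short-lived HTML needs to be refetched to pick up the new image.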
13:11
So there you go,
13:12
a high level view of REST and how it relates to HTTP.
13:16
Here's the list of further reading
13:17
that I had promised you.
13:19
"RFC 2616" actually outlines what HTTP is.
13:23
"RFC 3986" outlines the URI standard.
13:27
You can read Roy Fielding's thesis,
13:29
"Architectural Styles and the Design
13:31
of Network-based Software Architectures."
13:33
And there's also this "Caching Tutorial"
13:35
by Mark Nottingham which covers in detail
13:38
many of the things we just talked about.
13:39
Thanks and have fun.

About the Author

Iwebslog Labs

Author & Editor

