Want to find your data? Use REST

I think the most important feature of the web is the fact that it’s based on open standards, so everybody can participate. It brings the whole world together. But another and just as important feature of the web is that all the information is out in the open, and linked together using hyperlinks. This enables search engines to crawl the web and find all information. And this enables you to find all information.

In my opinion the web would be a lot less useful if you couldn’t find so much information through search engines. Imagine that you had to access all information through human-created directories like the Yahoo of 12 years ago, or DMOZ. You wouldn’t be able to find 1% of all the information you can find right now. Sites would be listed under one or two categories, and you’d never find them if you were looking for something else, even though a site might also have some useful information on that topic.

Being able to auto-discover information by crawling it is extremely useful. With the rise of Ajax and of RIAs implemented using Flex and Silverlight, I was afraid that we would go back in time and not be able to find our information anymore. Back to the desktop and the enterprise, where information is pretty much hidden inside applications. If you want to find something, most of the time you have to start the appropriate application and look for it there.

Enterprise search products have tried to solve this, but they have a serious limitation: it’s almost impossible to crawl the information in an enterprise. Instead, these enterprise search solutions need a plugin for every application that they have to index. This might work inside an enterprise, but it’s costly if you need to create a large number of plugins for custom applications and databases. And it certainly doesn’t scale to the size of the internet.

Now that we are moving towards RIAs with SOA underneath, chances are big that information will be locked up again. These applications are hard to crawl, and data made available through WS-* services is hard to discover and crawl.

But there is a solution: REST, or RESTful HTTP. Services and data built by applying the REST architecture will be crawlable and thus discoverable. RESTful HTTP has some very important aspects that enable this:

  • Standardized API. The API consists of GET, PUT, POST and DELETE. Every resource can be retrieved using a GET operation. You do not have to extend search engines with new plugins for new applications, as every application uses the same operations.
  • Links. REST requires resources to have links to other resources. This enables data crawling and discovery of data.
  • Different representations. Resources can be fetched in the representation most appropriate for the search engine. This could be XML, JSON or something else.
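
To make these three aspects a bit more concrete, here is a minimal sketch of a crawler, not a real search engine. Everything in it is an assumption for illustration: the `RESOURCES` dictionary stands in for an HTTP server, the URLs and the `title` and `links` fields are hypothetical, and `get` plays the role of an HTTP GET that asks for a JSON representation.

```python
import json
from collections import deque

# Hypothetical in-memory "server": URL -> JSON representation of a resource.
# In a real setup these would be fetched over HTTP with GET.
RESOURCES = {
    "/customers": json.dumps({"title": "Customers", "links": ["/customers/1", "/customers/2"]}),
    "/customers/1": json.dumps({"title": "Alice", "links": ["/orders/7"]}),
    "/customers/2": json.dumps({"title": "Bob", "links": ["/customers"]}),
    "/orders/7": json.dumps({"title": "Order 7", "links": ["/customers/1"]}),
}


def get(url):
    """Stand-in for an HTTP GET; returns JSON text, or None if not found."""
    return RESOURCES.get(url)


def crawl(start):
    """Breadth-first crawl from `start`, using only GET and the links in each resource."""
    index = {}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        if url in index:
            continue  # already visited; links may form cycles
        body = get(url)
        if body is None:
            continue  # broken link, nothing to index
        doc = json.loads(body)
        index[url] = doc["title"]
        queue.extend(doc.get("links", []))
    return index


index = crawl("/customers")
```

The crawler knows nothing about customers or orders. It only needs the uniform GET operation, the links between resources, and a representation it can parse, which is why no per-application plugin is required.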

By implementing SOA based on the REST principles, data in the enterprise will be a lot more findable than it is today!
