
Where do you find the cache files to delete? Not surprisingly, this is configurable. Page cache files are by default stored in the public directory of your application. They’ll be named after the URL they are caching, with a .html extension. For example, the page cache file for content/show/1 will be in

public/content/show/1.html

This naming scheme is no coincidence; it allows the web server to find the cache files automatically. You can, however, override the defaults using

config.action_controller.page_cache_directory = "dir/name"
config.action_controller.page_cache_extension = ".html"
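
If you expire cached pages by deleting their files, a short script can do the sweeping. Here’s a minimal sketch, assuming the default public cache directory and an illustrative content controller; run it inside the application’s environment (for example via script/runner):

# Sweep all cached pages for the (hypothetical) content controller.
# Assumes the default page_cache_directory, public/.
require 'fileutils'

cache_dir = File.join(RAILS_ROOT, "public", "content")

# Removes files such as public/content/show/1.html; the next request
# falls through to Rails, which regenerates and re-caches the page.
FileUtils.rm_rf(Dir.glob(File.join(cache_dir, "*")))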

Action cache files are not by default stored in the regular filesystem directory structure and cannot be expired using this technique.

21.6 The Problem with GET Requests

At the time this book was written, a debate was raging about the way web applications use links to trigger actions.

Here’s the issue. Almost since HTTP was invented, it has been recognized that there is a fundamental difference between HTTP GET and HTTP POST requests. Tim Berners-Lee wrote about it back in 1996.[15] Use GET requests to retrieve information from the server, and use POST requests to request a change of state on the server.

The problem is that this rule has been widely ignored by web developers. Every time you see an application with an Add To Cart link, you’re seeing a violation, because clicking that link generates a GET request that changes the state of the application (it adds something to the cart in this example). Up until now, we’ve gotten away with it.

This changed in the spring of 2005 when Google released its Google Web Accelerator (GWA), a piece of client-side code that sped up end users’ browsing. It did this in part by precaching pages. While the user reads the current page, the accelerator software scans it for links and arranges for the corresponding pages to be read and cached in the background.

Now imagine that you’re looking at an online store containing Add To Cart links. While you’re deciding between the maroon hot pants and the purple tank top, the accelerator is busy following links. Each link followed adds a new item to your cart.

The problem has always been there. Search engines and other spiders constantly follow links on public web pages. Normally, though, these links that invoke state-changing actions in applications (such as our Add To Cart link) are not exposed until the user has started some kind of transaction, so the spider won’t see or follow them. The fact that the GWA runs on the client side of the equation suddenly exposed all these links.

In an ideal world, every request that has a side effect would be a POST,[16] not a GET. Rather than using links, web pages would use forms and buttons whenever they want the server to do something active. The world, though, isn’t ideal, and there are thousands (millions?) of pages out there that break the rules when it comes to GET requests.

The default link_to method in Rails generates a regular link, which when clicked creates a GET request. But this certainly isn’t a Rails-specific problem. Many large and successful sites do the same.
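
For example, a store view might contain something like this (a sketch; the add_to_cart action and product variable are placeholders):

<%= link_to "Add To Cart", :action => :add_to_cart, :id => product %>

The generated anchor is an ordinary <a href="..."> pointing at a URL that changes state, so anything that follows links, whether spider or precacher, will trigger the action.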

Is this really a problem? As always, the answer is “It depends.” If you code applications with dangerous links (such as Delete Order, Fire Employee, or Fire Missile), there’s the risk that these links will be followed unintentionally and your application will dutifully perform the requested action.

Fixing the GET Problem

Following a simple rule can effectively eliminate the risk associated with dangerous links. The underlying axiom is straightforward: never allow a straight <a href="..."> link that does something dangerous to be followed without some kind of human intervention. Here are some techniques for making this work in practice.

Use forms and buttons, rather than hyperlinks, to perform actions that change state on the server. Forms are submitted using POST requests, which means that they will not be submitted by spiders following links, and browsers will warn you before resubmitting a form when you reload the page.

Within Rails, this means using the button_to helper to point to dangerous actions. However, you’ll need to design your web pages with care. HTML does not allow forms to be nested, so you can’t use button_to within another form.
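
For example, the Add To Cart link shown earlier could become a button (again a sketch with placeholder names):

<%= button_to "Add To Cart", :action => :add_to_cart, :id => product %>

button_to wraps the button in its own small form that submits via POST, which is also why it can’t be nested inside another form.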

Use confirmation pages. For cases where you can’t use a form, create a link that references a page that asks for confirmation. This confirmation should be triggered by the submit button of a form; hence, the destructive action won’t be triggered automatically.
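
Here’s a sketch of this pattern, with hypothetical action and view names. The listing links, harmlessly via GET, to a confirmation page, and only the POST from that page does the destructive work:

<%= link_to "Delete", :action => :confirm_delete, :id => order %>

The confirm_delete.rhtml template then asks for explicit consent:

<p>Really delete order <%= @order.id %>?</p>
<%= button_to "Yes, delete it", :action => :destroy, :id => @order %>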

Some folks also use the following techniques, hoping they’ll prevent the problem. They don’t work.

Don’t think your actions are protected just because you’ve installed a JavaScript confirmation box on the link. For example, Rails lets you write


link_to("Delete", { :action => :delete }, :confirm => "Are you sure?")

This will stop users from accidentally doing damage by clicking the link, but only if they have JavaScript enabled in their browsers. It also does nothing to prevent spiders and automated tools from blindly following the link anyway.
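
To see why, look at roughly what that helper generates (a sketch; the exact markup and URL vary with the Rails version and your routes):

<a href="/orders/delete" onclick="return confirm('Are you sure?');">Delete</a>

The href is still a plain GET URL. Anything that ignores the onclick handler, as spiders and precachers do, follows it straight to the destructive action.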

Don’t think your actions are protected if they appear only in a portion of your web site that requires users to log in. Although this does prevent global spiders (such as those employed by the search engines) from getting to them, it does not stop client-side technologies (such as Google Web Accelerator).

Don’t think your actions are protected if you use a robots.txt file to control which pages are spidered. This will not protect you from client-side technologies.

All this might sound fairly bleak. The real situation isn’t that bad. Just follow one simple rule when you design your site, and you’ll avoid all these issues.

Web Health Warning: Put All Destructive Actions Behind a POST Request

15. http://www.w3.org/DesignIssues/Axioms
16. Or a rarer PUT or DELETE request.