- Introduction
- Rails Is Agile
- Finding Your Way Around
- Acknowledgments
- Getting Started
- Models, Views, and Controllers
- Installing Rails
- Installing on Windows
- Installing on Mac OS X
- Installing on Unix/Linux
- Rails and Databases
- Keeping Up-to-Date
- Rails and ISPs
- Creating a New Application
- Hello, Rails!
- Linking Pages Together
- What We Just Did
- Building an Application
- The Depot Application
- Incremental Development
- What Depot Does
- Task A: Product Maintenance
- Iteration A1: Get Something Running
- Iteration A2: Add a Missing Column
- Iteration A4: Prettier Listings
- Task B: Catalog Display
- Iteration B1: Create the Catalog Listing
- Iteration B2: Add Page Decorations
- Task C: Cart Creation
- Sessions
- More Tables, More Models
- Iteration C1: Creating a Cart
- Iteration C3: Finishing the Cart
- Task D: Checkout!
- Iteration D2: Show Cart Contents on Checkout
- Task E: Shipping
- Iteration E1: Basic Shipping
- Task F: Administrivia
- Iteration F1: Adding Users
- Iteration F2: Logging In
- Iteration F3: Limiting Access
- Finishing Up
- More Icing on the Cake
- Task T: Testing
- Tests Baked Right In
- Testing Models
- Testing Controllers
- Using Mock Objects
- Test-Driven Development
- Running Tests with Rake
- Performance Testing
- The Rails Framework
- Rails in Depth
- Directory Structure
- Naming Conventions
- Active Support
- Logging in Rails
- Debugging Hints
- Active Record Basics
- Tables and Classes
- Primary Keys and IDs
- Connecting to the Database
- Relationships between Tables
- Transactions
- More Active Record
- Acts As
- Aggregation
- Single Table Inheritance
- Validation
- Callbacks
- Advanced Attributes
- Miscellany
- Action Controller and Rails
- Context and Dependencies
- The Basics
- Routing Requests
- Action Methods
- Caching, Part One
- The Problem with GET Requests
- Action View
- Templates
- Builder templates
- RHTML Templates
- Helpers
- Formatting Helpers
- Linking to Other Pages and Resources
- Pagination
- Form Helpers
- Layouts and Components
- Adding New Templating Systems
- Introducing AJAX
- The Rails Way
- Advanced Techniques
- Action Mailer
- Sending E-mail
- Receiving E-mail
- Testing E-mail
- Web Services on Rails
- Dispatching Modes
- Using Alternate Dispatching
- Method Invocation Interception
- Testing Web Services
- Protocol Clients
- Securing Your Rails Application
- SQL Injection
- Cross-Site Scripting (CSS/XSS)
- Avoid Session Fixation Attacks
- Creating Records Directly from Form Parameters
- Knowing That It Works
- Deployment and Scaling
- Picking a Production Platform
- A Trinity of Environments
- Iterating in the Wild
- Maintenance
- Finding and Dealing with Bottlenecks
- Case Studies: Rails Running Daily
- Appendices
- Introduction to Ruby
- Ruby Names
- Regular Expressions
- Source Code
- Cross-Reference of Code Samples
- Resources
- Index
FINDING AND DEALING WITH BOTTLENECKS | 459
The caching store system is available only for caching actions and fragments. Full-page caches need to be kept on the filesystem in the public directory. In this case, you will have to go the network drive route if you want to use page caching across multiple web servers. You can then symlink either the entire public directory (but that will also cause your images, stylesheets, and JavaScript to be passed over the network, which may be a problem) or just the individual directories that are needed for your page caches. In the latter case, you would, for example, symlink public/products to your network drive to keep page caches for your products controller.
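The symlink approach described above can be sketched in Ruby. This is an illustrative sketch only: the real paths (a network mount such as /mnt/shared and your application's public directory) are simulated here with temporary directories so the sketch is self-contained.

```ruby
require "fileutils"
require "tmpdir"

# Stand-ins for the real locations; in production these would be the
# network drive and RAILS_ROOT/public.
shared  = Dir.mktmpdir("shared")   # simulates the network drive
app_pub = Dir.mktmpdir("public")   # simulates the app's public directory

FileUtils.mkdir_p(File.join(shared, "cache", "products"))

# Link only public/products, so just the page caches cross the network,
# not images, stylesheets, or JavaScript.
FileUtils.ln_s(File.join(shared, "cache", "products"),
               File.join(app_pub, "products"))
```

Every web server in the cluster would carry the same symlink, so a page cached by one server is served by all of them.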
22.6 Finding and Dealing with Bottlenecks
With your shared-nothing architecture in place, you can add more servers as load increases. That's usually a prudent way of dealing with scaling needs. Hardware is cheap and getting cheaper; developers' time is not. Thus, it's important not to treat performance as a problem before it actually becomes one.
While that might appear to be a naïve truism, it's surprisingly common for developers to succumb to the lure of tuning for its own sake. Besides wasting valuable time (time that could be spent improving the application in ways that deliver business value), such tuning commonly has a detrimental effect on the quality of the application's code and design. The fastest application is rarely the prettiest.
You should guard the aesthetics of your code dearly; don't lightly sacrifice them on the altar of performance. But sometimes you must make the sacrifice, either because a bottleneck is simply too gross to ignore or because the economics of the project don't allow time and money to be interchanged the way they are in a regular commercial environment (as might be the case in an open-source hobby project). That's when you reach for optimizations.
The first (and most important) rule of performance tuning is: do not act without measurements. The second rule of performance tuning is: do not act without measurements. This is generally true for any type of application, but even more so for web applications. Interactions with databases, web services, mail daemons, and payment gateways are often orders of magnitude slower than looping over 100 objects in memory to build a list of orders.
Thus, improving the runtime performance of a loop from 0.005 to 0.001 seconds doesn't really turn a cricket into a black stallion if it's also saddled with an expensive database query weighing in at 0.5 seconds. In that case, the query is the bottleneck. When it comes to dealing with performance, all you care about is bottlenecks.
How do you find the bottlenecks in Rails? By measurement, of course. But measuring the performance of a Rails application is about more than being quick on the stopwatch as you click Refresh in the browser. Your choice of measurements ranges from black-box timing, through external benchmarkers that test the entire runtime of a request (or requests) under load, to profiling the computations involved in a single method.
Staying Alert with a Tail
The easiest and least intrusive way to spot bottlenecks is to run a continuous tail on your production.log while interacting with the application.13 Do this, and you'll have the internals of your application yelling left and right about how much work they're doing and how long it takes each time you refresh your browser.
This running report will alert you to things such as the infamous N+1 query problem (where a loop over 100 models might cause 101, 201, or 301 queries to be triggered, as each of the model objects goes out to fetch one or more associated model objects by itself). It’ll also give you a rough idea of the performance of each action and whether time is being spent in rendering or in the database.
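The arithmetic behind the N+1 problem can be illustrated with a small simulation. This is not real Active Record code, just a toy query counter showing how the numbers add up for a page listing 100 orders:

```ruby
# Toy stand-in for a database connection that simply counts queries.
class FakeDB
  attr_reader :queries

  def initialize
    @queries = 0
  end

  def select(_sql)
    @queries += 1
  end
end

naive = FakeDB.new
naive.select("SELECT * FROM orders")   # one query for the list itself...
100.times do
  # ...plus one per order, as each model lazily fetches its customer
  naive.select("SELECT * FROM customers WHERE id = ?")
end

eager = FakeDB.new
# Eager loading fetches orders and customers in a single joined query
eager.select("SELECT * FROM orders LEFT JOIN customers ON ...")

puts "naive: #{naive.queries} queries, eager: #{eager.queries} query"
```

The naive version issues 101 queries where the eager-loaded version issues one, which is exactly the kind of difference that jumps out of a tailed log.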
13 In OS X, Linux, or Cygwin, that's done with tail -f log/production.log. The tail command with the -f option shows new content as it is added to a file.
The emphasis is on rough estimation, though. Creating the log files slows down the rendering of the action quite a lot on most machines. The results will also vary wildly between your development and production machine. It’s not at all uncommon to see a page take five seconds to load in development with a tail running on your laptop while the same action takes only 0.2 seconds in production mode on that beefy dual Xeon production server of yours.
Going Beyond the Tail
Once you’ve identified that a given action (or entire segment) of your application needs tuning, you need to establish a reproducible baseline. The baseline is most commonly expressed in requests per second (RPS), a measure of the maximum number of successful requests the application was able to handle per second.
You can then start tweaking things (introducing caching or eager loading, perhaps) and rerun the test against the baseline. If a significant improvement appears, congratulate yourself and leave the change in. If the change makes no difference, take it back out and keep looking for other ways to improve performance.
But while a tail on the log file will give you a rough RPS figure, it’s better to find a way of getting the performance numbers automatically. You want to run a large number of requests against a given action, or set of actions, and then have the benchmarker calculate a mean time per request and use that as the baseline.
On Unix, there are two great tools for doing just that. The first is called ab14 (short for Apache HTTP server benchmarking tool). It can bombard a single URL from a number of concurrent clients (the -c option), issuing a specified total number of requests (the -n option). If your action runs fairly quickly, run something such as
myapp> ab -c 9 -n 1000 http://www.example.com/controller/action
However, if your actions run at less than 20 RPS, you’ll need to cut this down.
myapp> ab -c 4 -n 200 http://www.example.com/controller/action
This is ApacheBench, Version 1.3d <$Revision: 1.73 $> apache-1.3
Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright (c) 1998-2002 The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient)
Completed 100 requests
Finished 200 requests

Server Software:        WEBrick/1.3.1
Server Hostname:        localhost
Server Port:            3000

Document Path:          /action
Document Length:        3477 bytes

Concurrency Level:      4
Time taken for tests:   15.150 seconds
Complete requests:      200
Failed requests:        99
   (Connect: 0, Length: 99, Exceptions: 0)
Broken pipe errors:     0
Non-2xx responses:      99
Total transferred:      952431 bytes
HTML transferred:       898350 bytes
Requests per second:    13.20 [#/sec] (mean)
Time per request:       303.00 [ms] (mean)
Time per request:       75.75 [ms] (mean, across all concurrent requests)
Transfer rate:          62.87 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0     0    0.0      0     0
Processing:   246   300   56.6    298   608
Waiting:      246   300   56.6    298   608
Total:        246   300   56.6    298   608

Percentage of the requests served within a certain time (ms)
  50%    298
  66%    307
  75%    312
  80%    317
  90%    340
  95%    446
  98%    512
  99%    517
 100%    608 (last request)

14 ab is installed alongside Apache, so most *nix setups already have it.
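The summary lines in ab's report are related to one another, so it's worth knowing how to cross-check them. Using the numbers from the run above:

```ruby
# Cross-checking ab's summary lines against each other.
total_time  = 15.150   # "Time taken for tests", in seconds
requests    = 200      # "Complete requests"
concurrency = 4        # "Concurrency Level"

rps = requests / total_time                  # "Requests per second"
mean_ms = concurrency / rps * 1000           # "Time per request" (mean)
mean_all_ms = total_time / requests * 1000   # "...across all concurrent requests"

puts rps.round(2)         # 13.2
puts mean_ms.round(2)     # 303.0
puts mean_all_ms.round(2) # 75.75
```

The "mean" time per request counts each concurrent client separately, which is why it is the concurrency level times the "across all concurrent requests" figure.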
On some systems with really fast actions, it may well be the session system that's the bottleneck. When you use vanilla ab, every request starts a new session. That's not a particularly realistic test, since users normally perform more than one action per session. To make ab reuse a single session, there's a neat trick using a call to curl.15
myapp> curl -I http://example.com/login
HTTP/1.1 200 OK
Date: Thu, 12 May 2005 16:32:21 GMT
Server: Apache/1.3.33 (Darwin) mod_fastcgi/2.4.2
Set-Cookie: _session_id=a94c090f0895aefba381ca5974fbddd9; path=/
Cache-Control: no-cache
Content-Type: text/html; charset=utf-8
As you can see, the server returned a cookie called _session_id that represents the session created. We can grab this cookie and reuse it for our ab calls like this:
myapp> ab -c 4 -n 200 -C "_session_id=a94c090f0895aefba381ca5974fbddd9" \
    http://www.example.com/controller/action
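If you'd rather script the cookie hand-off than copy it by hand, the parsing step can be sketched in Ruby. The Set-Cookie value below is the one from the curl output above; in practice you'd capture it fresh from a live response.

```ruby
# Header value as printed by the curl -I call above.
set_cookie = "_session_id=a94c090f0895aefba381ca5974fbddd9; path=/"

# Grab just the name=value pair for the session cookie (\h matches a
# hex digit, which is what Rails session ids are made of).
session = set_cookie[/_session_id=\h+/]

# Build the ab invocation that replays requests against that session.
ab_cmd = %(ab -c 4 -n 200 -C "#{session}" ) +
         "http://www.example.com/controller/action"

puts ab_cmd
```

This keeps the benchmark hitting one session rather than creating 200 of them, which is the behavior you'd see from a real logged-in user.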
15 curl is a command-line utility that can (among other things) fetch web pages. If you don't have curl on your system, you might find its cousin, wget.