Agile Web Development With Rails, 1st Edition (2005).pdf

MAINTENANCE 454

But you can also interact with them directly, which gives you all the object-oriented goodness, Rails query generation, and much more right at your fingertips. The gateway to this world is the console script. It’s launched in production mode with

myapp> ruby ./script/console production
Loading production environment.
irb(main):001:0> p = Product.find_by_title("Pragmatic Version Control")
=> #<Product:0x24797b4 @attributes={. . .}
irb(main):002:0> p.price = 32.95
=> 32.95
irb(main):003:0> p.save
=> true

You can use the console for much more than just fixing problems. It’s also an easy administrative interface for parts of the application that you may not want to deal with explicitly by designing controllers and methods up front. You can also use it to generate statistics and look for correlations.

22.4 Maintenance

Keeping the machinery of your application well-oiled over long periods of time means dealing with the artifacts produced by its operation. The two concerns that all Rails maintainers must deal with in production are log files and sessions.

Log Files

By default, Rails uses the Logger class that’s included with the Ruby standard library. This is convenient: it’s easy to set up, and there are no dependencies. You pay for this with reduced flexibility: message formatting, log file rollover, and level handling are all a bit anemic.
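
To make the trade-off concrete, here is a minimal sketch of what the standard library Logger offers on its own: a severity threshold, a custom formatter, and size-based rollover. The file location, the limits, and the message format are illustrative assumptions, not Rails defaults.

```ruby
require 'logger'
require 'tmpdir'

# Hypothetical log location; a Rails application would log under its log/ directory.
log_path = File.join(Dir.mktmpdir, 'example.log')

# Keep a small number of old files, rolling over once the current
# file grows past roughly 1 KB.
logger = Logger.new(log_path, 3, 1024)
logger.level = Logger::INFO # drop DEBUG messages

# The formatter receives severity, timestamp, program name, and message.
logger.formatter = proc do |severity, time, _progname, msg|
  "#{time.strftime('%Y-%m-%d %H:%M:%S')} #{severity}: #{msg}\n"
end

logger.debug('not written: below the INFO threshold')
100.times { |i| logger.info("processed request #{i}") }
logger.close
```

After this runs, the directory holds the current log plus numbered older files (example.log.0 and so on). That is about as far as the built-in class goes.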

If you need more sophisticated logging capabilities, such as logging to multiple files depending on levels, you should look into Log4R[7] or (on BSD systems) SyslogLogger.[8] It’s easy to move from Logger to these alternatives, as they are API compatible. All you need to do is replace the log object assigned to RAILS_DEFAULT_LOGGER in config/environment.rb.

Dealing with Growing Log Files

As an application runs, it constantly appends to its log file. Eventually, this file will grow uncomfortably large. To overcome this, most logging solutions feature rollover. When some specified criteria are met, the logger will close the current log file, rename it, and open a new, empty file. You’ll end up with a progression of log files of increasing age. It’s then easy to write a periodic script that archives and/or deletes the oldest of these files.

[7] http://rubyforge.org/projects/log4r

[8] http://rails-analyzer.rubyforge.org/classes/SyslogLogger.html

Prepared exclusively for Rida Al Barazi

The Logger class supports rollover. However, each FastCGI process has its own Logger instance. This sometimes causes problems, as each logger tries to roll over the same file. You can deal with it by setting up your own periodic script (triggered by cron or the like) to first copy the contents of the current log to a different file and then truncate it. This ensures that only one process, the cron-powered one, is responsible for handling the rollover and can thus do so without fear of a clash.
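
The copy-then-truncate approach might be sketched as follows. The method name and the timestamped archive suffix are hypothetical choices. The sketch assumes the application processes open the log in append mode, as Logger does, so they keep writing to the same file after it has been emptied.

```ruby
require 'fileutils'

# Copy the live log to a timestamped archive, then empty the original in place.
# Truncating (rather than renaming) keeps the file the running processes
# already have open, so they simply continue appending to it.
def rotate_log(log_path, now = Time.now)
  archive_path = "#{log_path}.#{now.strftime('%Y%m%d%H%M%S')}"
  FileUtils.cp(log_path, archive_path)
  File.truncate(log_path, 0)
  archive_path
end
```

A cron entry would run a small script that calls rotate_log with the path to the production log. Anything written between the copy and the truncate is lost, so this is best scheduled for a quiet period.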

Clearing Out Sessions

People are often surprised that Ruby’s session handler, which Rails uses, doesn’t do automated housekeeping. With the default file-based session handler, this can quickly spell trouble.[9] Files accumulate and are never removed. The same problem exists with the database session store, albeit to a lesser degree. Endless numbers of session rows are created.[10]

As Ruby isn’t cleaning up after itself, we have to do it ourselves. The easiest way is to run a periodic script. If you keep your sessions in files, the script should look at when those files were last touched and delete those older than some value. For example, the following script, which could be invoked by cron, uses the Unix find command to delete files that haven’t been touched in 12 hours.

find /tmp/ -name 'ruby_sess*' -ctime +12h -delete
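
The same sweep can be written in Ruby, which may be convenient if the cleanup needs to live alongside other maintenance code. This is only a sketch: the method name is hypothetical, it checks the modification time rather than the inode change time that -ctime inspects, and it looks only at the top level of the directory rather than recursing as find does.

```ruby
# Delete session files in dir whose modification time is more than
# max_age seconds before now. Returns the paths that were removed.
def sweep_session_files(dir, max_age, now = Time.now)
  stale = Dir.glob(File.join(dir, 'ruby_sess*')).select do |path|
    File.file?(path) && File.mtime(path) < now - max_age
  end
  stale.each { |path| File.delete(path) }
  stale
end
```

Calling sweep_session_files('/tmp', 12 * 3600) from a cron-driven script corresponds to the find command above.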

If your application keeps session data in the database, your script can look at the updated_at column and delete rows accordingly. We can use script/runner to execute this command.

> RAILS_ENV=production ./script/runner \
    'ActiveRecord::Base.connection.delete(
       "DELETE FROM sessions WHERE updated_at < now() - INTERVAL 12 HOUR")'

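A single large DELETE can hold the sessions table for a long time, which is the trap the author describes falling into with 2.5 million rows. One way to soften this is to delete in batches, sketched below. The method name, batch size, and pause are hypothetical. The sketch assumes MySQL (for DELETE ... LIMIT and the INTERVAL syntax) and assumes the connection's delete method returns the number of rows removed.

```ruby
# Delete expired session rows in small batches so that no single statement
# holds the table for long. `connection` is anything that responds to
# delete(sql) and returns the number of rows removed, as
# ActiveRecord::Base.connection is assumed to do here.
def purge_sessions(connection, batch_size = 1000, pause = 0.5)
  sql = 'DELETE FROM sessions ' \
        'WHERE updated_at < now() - INTERVAL 12 HOUR ' \
        "LIMIT #{Integer(batch_size)}"
  total = 0
  loop do
    deleted = connection.delete(sql)
    total += deleted
    break if deleted < batch_size # a short batch means nothing is left
    sleep(pause)                  # give waiting queries a chance to run
  end
  total
end
```

If the method is defined somewhere the application loads, it could be invoked through script/runner in the same way as the one-line delete.
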
[9] I learned that lesson the hard way when 200,000+ session files broke the limit on the number of files a single directory can hold under FreeBSD.

[10] I also learned that lesson the hard way when I tried to empty 2.5 million rows from the sessions table during rush hour, which locked up the table and brought the site to a screeching halt.


22.5 Scaling: The Share-Nothing Architecture

Now that your application is properly deployed, it’s time to examine how we can make it scale. Scaling means different things to different people, but we’ll stick to the somewhat loose definition of “coping with increasing load by adding hardware.” That’s not the full story, of course, and we’ll shortly have a look at how you can delay the introduction of more hardware through optimizations. But for now, let’s look at the “more hardware” solution.

When it comes to scaling Rails applications, the most important concept is the share-nothing architecture. Share-nothing removes the burden of maintaining state from the web and application tier and pushes it down to a shared integration point, such as the database or a network drive. This means that it doesn’t matter which server a user initiates his session on and what server the next request is handled by. Nothing is shared from one request to another at the web/application server layer.

Using this architecture, it’s possible to run an application on a pool of servers, each indifferent to the requests it handles. Increasing capacity means adding new web and application server hardware. At the integration point (database, network drive, or caching server) you use techniques honed from years of experience scaling with those technologies. This means that it’s no longer your problem to cope with mass concurrency; it’s handled by MySQL, Oracle, memcached, and so on. Figure 22.3 shows a conceptual model of this setup.
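
The idea can be illustrated with a toy model in plain Ruby. The class names are hypothetical and nothing here is Rails code: a hash stands in for the shared integration point, and two stateless server objects take turns handling requests for the same session.

```ruby
# Stands in for the shared integration point (database, network drive, memcached).
class SharedSessionStore
  def initialize
    @sessions = {}
  end

  def load(session_id)
    @sessions[session_id] ||= {}
  end
end

# An application server that keeps no per-user state of its own.
class AppServer
  attr_reader :name

  def initialize(name, store)
    @name = name
    @store = store
  end

  def add_to_cart(session_id, item)
    cart = (@store.load(session_id)[:cart] ||= [])
    cart << item
    "#{name}: cart is now #{cart.inspect}"
  end
end

store   = SharedSessionStore.new
servers = [AppServer.new('app1', store), AppServer.new('app2', store)]

# A crude round-robin load balancer: consecutive requests hit different servers.
responses = %w[book pen].each_with_index.map do |item, i|
  servers[i % servers.size].add_to_cart('session-abc', item)
end
puts responses
```

The second request lands on a different server, yet it sees the item added by the first, because the only state lives in the shared store. Adding a third AppServer would require no other change.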

This deployment style has some venerable precedents. PHP as used by Yahoo, Perl as used by LiveJournal, and many, many other big applications have scaled high and large on the same principles. Rails is sitting on top of a tool chain that has already proven its worth.

Getting Rails to a Share-Nothing Environment

While Rails has been built from the ground up to be ready for a share-nothing architecture, it doesn’t necessarily ship with the best configuration for that out of the box. The key areas to configure are sessions, caching, and assets (such as uploaded files).

Picking a Session Store

As we saw when we looked at sessions back on page 302, session data is by default kept in files in the operating system’s temporary directory (normally /tmp). This is done through the FileStore, which requires no configuration or other arrangements to get started with. But it’s not necessarily a great model for scaling. The problem is that every server needs to access the same set of session data, as a session could have its requests handled by multiple servers in turn. While you could potentially place the sessions files on a shared network drive, there are better alternatives.

Figure 22.3: Share-Nothing Setup (a load balancer in front of a pool of servers, which reach the database/network drive over the network)

The most commonly used alternative is the ActiveRecordStore, which uses the database to store session data, keeping a single row per session. You need to create a session table, as follows.

File 21

create table sessions (
  id         int(11)      not null auto_increment,
  sessid     varchar(255),
  data       text,
  updated_at datetime     default NULL,
  primary key(id),
  index session_index (sessid)
);

As with any model, the sessions table uses an autoincrementing id, but it’s really driven by the sessid, which is created by the Ruby session system. Since this is the main retrieval parameter, it’s also very important to keep it indexed. Session tables fill up fast, and searching 50,000 rows for the relevant one on every action can take down any database in no time.

The database store is enabled by putting the following piece of configuration either in the file config/environment.rb, if you want it to serve for all environments, or possibly just in config/environments/production.rb.


ActionController::CgiRequest::DEFAULT_SESSION_OPTIONS[:database_manager] =
  CGI::Session::ActiveRecordStore

In addition to the database store, there’s also the choice of using a DRb[11] store or a memcached[12] store. Unless you already need such services for other parts of the application, it’s probably best to hold off moving the sessions to these stores until the scaling demands grow very high. Fewer moving parts is better.

Picking a Caching Store

As with sessions, caching also features a set of stores. You can keep the fragments in files, in a database, in a DRb server, or in memcached servers. But whereas sessions usually contain small amounts of data and require only one row per user, fragment caching can easily create sizeable amounts of data, and you can have many per user. This makes database storage a poor fit.

For many setups, it’s easiest to keep cache files on the filesystem. But you can’t keep these cached files locally on each server, as expiring a cache on one server would not expire it on the rest. You therefore need to set up a network drive that all the servers can share for their caching.
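
The expiry problem is easy to reproduce in miniature. In this sketch a hypothetical FragmentStore class reduces a cache to a hash. First each server gets its own store, then both share one.

```ruby
# A fragment cache reduced to a hash; each instance stands in for one storage location.
class FragmentStore
  def initialize
    @fragments = {}
  end

  # Return the cached fragment, rendering and storing it on a miss.
  def fetch(key)
    @fragments[key] ||= yield
  end

  def expire(key)
    @fragments.delete(key)
  end
end

price  = 29.95
render = lambda { "<p>Price: #{price}</p>" }

# Per-server caches: expiring on one server leaves the other stale.
local_a = FragmentStore.new
local_b = FragmentStore.new
local_a.fetch('product/1', &render)
local_b.fetch('product/1', &render)
price = 32.95
local_a.expire('product/1')
stale = local_b.fetch('product/1', &render) # still the old price

# One shared cache: a single expiry is seen by every server.
shared = FragmentStore.new
price = 29.95
shared.fetch('product/1', &render)
price = 32.95
shared.expire('product/1')
fresh = shared.fetch('product/1', &render) # re-rendered with the new price
```

With separate stores, the second server keeps serving the fragment rendered before the price change. With a shared store, the next fetch after the expiry re-renders it.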

As with session configuration, you can configure a file-based caching store globally in environment.rb or in a specific environment’s file.

ActionController::Base.fragment_cache_store =
  ActionController::Caching::Fragments::FileStore.new("#{RAILS_ROOT}/cache")

This configuration assumes that a directory named cache is available in the root of the application and that the web server has full read and write access to it. This directory can easily be symlinked to the path on the server that represents the network drive.
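
Setting up that symlink could be scripted as part of deployment. The following is a sketch only: the method name is hypothetical, both paths are supplied by the caller, and it assumes a Unix-like system where the shared drive is already mounted.

```ruby
require 'fileutils'

# Point <app_root>/cache at a directory on the shared drive.
# Leaves things alone if something already exists at that path.
def link_cache_dir(app_root, shared_dir)
  FileUtils.mkdir_p(shared_dir)
  link = File.join(app_root, 'cache')
  FileUtils.ln_s(shared_dir, link) unless File.symlink?(link) || File.exist?(link)
  link
end
```

Each application server would run this once against its own checkout, passing the same mount point, so that every server reads and expires the same set of fragment files.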

Regardless of which store you pick for caching fragments, you should be aware that network bottlenecks can quickly become a problem. If your site depends heavily on fragment caching, every request will need a lot of data transferred from the network drive to the specific server before it’s sent on to the user. To use this on a high-profile site, you really need a high-bandwidth internal network between your servers, or you will see slowdowns.

[11] A library for creating network services in Ruby.

[12] A service for creating a caching cluster spanning multiple machines. See the memcached page at http://www.danga.com/memcached/.
