Agile Web Development With Rails, 2nd Edition (2006).pdf

WITH_OPTIONS 255

How does this wizardry work? It relies on the fact that the & notation for parameter passing expects a Proc object. If it doesn’t get one, Ruby tries to convert whatever it does get by invoking its to_proc method. Here we’re passing it a symbol (:author_id). And Rails has conveniently defined a to_proc method in class Symbol. Here’s the implementation—figuring it out is left as an exercise to the reader.

class Symbol
  def to_proc
    Proc.new { |obj, *args| obj.send(self, *args) }
  end
end
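To see the conversion in action without Rails, the sketch below compares the & shorthand with the block it stands for. It assumes a Ruby interpreter where Symbol#to_proc is already built in (as it is in current Ruby releases), so it does not reopen class Symbol. The sample data is purely illustrative.

```ruby
# Illustrative data only; not taken from the book's sample application.
names = ["dave", "andy", "mike"]

# The & prefix asks Ruby to call :upcase.to_proc and use the result as the block.
shorthand = names.map(&:upcase)

# The equivalent explicit block: send the :upcase message to each element.
explicit = names.map { |name| name.send(:upcase) }

# The proc can also be built and called by hand.
upcase_proc = :upcase.to_proc
by_hand = upcase_proc.call("dave")

puts shorthand.inspect
puts by_hand
```

Both forms produce the same array; the shorthand simply saves writing the block parameter.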

15.7 with_options

Many Rails methods take a hash of options as their last parameter. You’ll sometimes find yourself calling several of these methods in a row, where each call has one or more options in common. For example, you might be defining some routes.

ActionController::Routing::Routes.draw do |map|
  map.connect "/shop/summary",   :controller => "store",
                                 :action     => "summary"
  map.connect "/titles/buy/:id", :controller => "store",
                                 :action     => "add_to_cart"
  map.connect "/cart",           :controller => "store",
                                 :action     => "display_cart"
end

The with_options method lets you specify these common options just once.

ActionController::Routing::Routes.draw do |map|
  map.with_options(:controller => "store") do |store_map|
    store_map.connect "/shop/summary",   :action => "summary"
    store_map.connect "/titles/buy/:id", :action => "add_to_cart"
    store_map.connect "/cart",           :action => "display_cart"
  end
end

In this example, store_map acts just like a map object, but the option :controller => "store" will be added to its option list every time it is called.

The with_options method can be used with any API calls where the last parameter is a hash.
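One way to picture this behavior is as a small proxy object that forwards each call to the real receiver after merging the shared options into the trailing hash. The following is a minimal sketch of that idea in plain Ruby, not the Rails source: the class names OptionMerger and FakeMap are hypothetical, and letting explicitly passed options override the shared ones is a choice made for this sketch.

```ruby
# Hypothetical proxy illustrating the option-merging idea; not the Rails code.
class OptionMerger
  def initialize(target, defaults)
    @target = target
    @defaults = defaults
  end

  # Forward every unknown call, merging the shared options into the
  # trailing hash (or supplying one if the caller passed none).
  def method_missing(name, *args, &block)
    options = args.last.is_a?(Hash) ? args.pop : {}
    @target.send(name, *args, @defaults.merge(options), &block)
  end

  def respond_to_missing?(name, include_private = false)
    @target.respond_to?(name, include_private) || super
  end
end

# Hypothetical stand-in for the routing map: it just records each call.
class FakeMap
  attr_reader :routes

  def initialize
    @routes = []
  end

  def connect(path, options = {})
    @routes << [path, options]
  end
end

map = FakeMap.new
store_map = OptionMerger.new(map, :controller => "store")
store_map.connect "/cart", :action => "display_cart"
store_map.connect "/admin", :controller => "admin", :action => "index"

map.routes.each { |path, options| puts "#{path} #{options.inspect}" }
```

Because the proxy only looks at the last argument, the same trick applies to any method whose final parameter is an options hash.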

Report erratum

UNICODE SUPPORT 256

15.8 Unicode Support

In the old days, characters were represented by sequences of 6, 7, or 8 bits. Each computer manufacturer decided its own mapping between these bit patterns and their character representations. Eventually, standards started to emerge, and encodings such as ASCII and EBCDIC became common. However, even in these standards, you couldn’t be sure that a given bit pattern would display a particular character: the 7-bit ASCII character 0b0100011 would display as # on terminals in the United States and £ on those in the United Kingdom. Hacks such as code pages, which overlaid multiple characters onto the same bit patterns, solved the problems locally but compounded them globally.

At the same time, it quickly became apparent that 8 bits just wasn’t enough to encode the characters needed for many languages. The Unicode Consortium was formed to address this issue.1

Unicode defines a number of different encoding schemes that allow for up to 32 bits for the representation of each character. Unicode is generally stored using one of three encoding forms. In one of these, UTF-32, every character (technically a code point) is represented as a 32-bit value. In the other two (UTF-16 and UTF-8), characters are represented as one or more 16- or 8-bit values. When Rails stores strings in Unicode, it uses UTF-8.
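The difference between the three encoding forms is easy to see by counting bytes. The sketch below assumes a Ruby 1.9 or later interpreter, whose String#encode and String#bytesize postdate the Ruby version this chapter describes. The four sample characters are chosen only for illustration.

```ruby
# Sample characters chosen for illustration: ASCII, an accented Latin letter,
# a CJK character, and a code point outside the Basic Multilingual Plane.
samples = ["A", "\u00fc", "\u65e5", "\u{1F600}"]

sizes = samples.map do |char|
  {
    :codepoint => format("U+%04X", char.ord),
    :utf8      => char.encode("UTF-8").bytesize,
    :utf16     => char.encode("UTF-16BE").bytesize,
    :utf32     => char.encode("UTF-32BE").bytesize
  }
end

sizes.each { |row| puts row.inspect }

# The UTF-8 bytes of a single character, shown in hex.
u_umlaut_bytes = "\u00fc".bytes.map { |b| format("%02x", b) }
puts u_umlaut_bytes.join(" ")
```

UTF-32 always uses 4 bytes per code point, while UTF-8 and UTF-16 grow with the code point's value.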

The Ruby language that underlies Rails originated in Japan. And it turns out that historically Japanese programmers have had issues with the encoding of their language into Unicode. This means that, although Ruby supports strings encoded in Unicode, it doesn’t really support Unicode in its libraries. For example, the UTF-8 representation of ü is the 2-byte sequence c3 bc (we’re now using hex to show the binary values). But if you give Ruby a string containing ü, its library methods won’t know about the fact that 2 bytes are used to represent a single character.

dave> irb
irb(main):001:0> name = "Günter"
=> "G\303\274nter"
irb(main):002:0> name.length
=> 7

Although Günter has six characters, its representation uses 7 bytes, and that’s the number Ruby reports.

However, Rails 1.2 includes a fix for this. It isn’t a replacement for Ruby’s libraries, so there are still areas where unexpected things happen. But even so, the new Rails Multibyte library, added to Active Support in September 2006, goes a long way toward making Unicode processing easy in Rails applications.

1. http://www.unicode.org


Rather than replace the Ruby built-in string library methods with Unicode-aware versions, the Multibyte library defines a new class, called Chars. This class defines the same methods as the built-in String class, but those methods are aware of the underlying encoding of the string.

The rule for using Multibyte strings is easy: whenever you need to work with strings that are encoded using UTF-8, convert those strings into Chars objects first. The library adds a chars method to all strings to make this easy.

Let’s play with this in script/console.

Line 1   dave> script/console
     -   Loading development environment.
     -   >> name = "G\303\274nter"
     -   => "Günter"
     5   >> name.length
     -   => 7
     -   >> name.chars.length
     -   => 6
     -   >> name.reverse
    10   => "retn\274?G"
     -   >> name.chars.reverse
     -   => #<ActiveSupport::Multibyte::Chars:0x2c4cdf4 @string="retnüG">

We start by storing a string containing UTF-8 characters into the variable name.

On line 5 we ask Ruby for the length of the string. It returns 7, the number of bytes in the representation. But then, on line 7, we use the chars method to create a Chars object that wraps the underlying string. Asking that new object for its length, we get 6, the number of characters in the string.

Similarly, reversing the raw string produces gibberish; it simply reverses the order of the bytes. Reversing the Chars object, on the other hand, produces the expected result.
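The idea behind such a wrapper can be sketched in a few lines of plain Ruby: decode the UTF-8 bytes into code points, then operate on those instead of on the raw bytes. The class below is a hypothetical miniature written for illustration, not the Rails Chars implementation. It handles only length and reverse, returns a plain string from reverse rather than a wrapper, and ignores subtleties such as combining characters. It assumes a Ruby that provides String#bytesize.

```ruby
# Hypothetical miniature of the wrapper idea; not the Rails implementation.
class SimpleChars
  def initialize(string)
    # "U*" decodes the UTF-8 bytes into an array of code points.
    @codepoints = string.unpack("U*")
  end

  # Number of code points rather than number of bytes.
  def length
    @codepoints.length
  end

  # Reverse the code points, then re-encode them as a UTF-8 string.
  def reverse
    @codepoints.reverse.pack("U*")
  end
end

name = "G\u00fcnter"
wrapped = SimpleChars.new(name)

puts name.bytesize
puts wrapped.length
puts wrapped.reverse
```

Working on code points is what keeps the 2 bytes of ü together when the string is reversed.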

In theory, all the Rails internal libraries are now Unicode clean, meaning that (for example) validates_length_of will correctly check the length of UTF-8 strings if you enable UTF-8 support in your application.

However, having string handling that honors encoding is not enough to ensure your application works with Unicode characters. You’ll need to make sure the entire data path, from browser to database, agrees on a common encoding. To explore this, let’s write a simple application that builds a list of names.

The Unicode Names Application

We’re going to write a simple application that displays a list of names on a page. An entry field on that same page lets you add new names to the list. The full list of names is stored in a database table.


We’ll create a regular Rails application.

dave> rails namelist
dave> cd namelist
namelist> ruby script/server

We next need to create our database. However, we also need to ensure that the default character set for this database is UTF-8. Just how you do this is database dependent. Here’s what you do for MySQL.2

namelist> mysql -u root
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 85 to server version: 5.0.22
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
mysql> create database namelist_development character set utf8;
Query OK, 1 row affected (0.00 sec)

That told the database what character encoding to use. Perhaps surprisingly, we also have to tell each MySQL connection what encoding it should use. We do this with the encoding option in database.yml. (We show only the development stanza here: you’ll need to do the same for test and production.)

Download e1/namelist/config/database.yml

development:
  adapter: mysql
  database: namelist_development
  username: root
  password:
  host: localhost
  encoding: utf8

Now we’ll create a model for our names.

namelist> script/generate model person

And we’ll populate the migration.

Download e1/namelist/db/migrate/001_create_people.rb

class CreatePeople < ActiveRecord::Migration
  def self.up
    create_table :people do |t|
      t.column :name, :string
    end
  end

  def self.down
    drop_table :people
  end
end

2. Normally we’d use mysqladmin to create databases. However, its --default-character-set option doesn’t seem to work.


Because we set the default character set of the whole database to UTF-8, we don’t need to do anything special in the migration file. If we hadn’t been able to set this option at the database level, we could have instead done it on a per-table basis in the migration.

create_table :people, :options => 'default charset=utf8' do |t|
  t.column :name, :string
end

However, this makes the migration MySQL specific. As a result, the table options will not be copied across into the test database unless you change the default schema_format in environment.rb to :sql. This hassle is a gentle suggestion that making the character set choice at the database level is the way to go.

Now we’ll write our controller and our view. We’ll keep the controller simple by using a single action.

Download e1/namelist/app/controllers/people_controller.rb

class PeopleController < ApplicationController
  def index
    @person = Person.new(params[:person])
    @person.save! if request.post?
    @people = Person.find(:all)
  end
end

We’ve made the database Unicode-aware. Now we just need to do the same thing on the browser side.

As of Rails 1.2, the default content-type header is

Content-Type: text/html; charset=UTF-8

However, just to be sure, we’ll also add a <meta> tag to the page header to enforce this. This also means that if a user saves a page to a local file, it will display correctly later. Our layout file is

Download e1/namelist/app/views/layouts/people.rhtml

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <meta http-equiv="content-type" content="text/html; charset=UTF-8"></meta>
  <title>My Name List</title>
</head>
<body>
  <%= yield :layout %>
</body>
</html>


In our index view, we’ll show the full list of names in the database and provide a simple form to let folks enter new ones. In the list, we’ll display the name and its size in bytes and characters, and, just to show off, we’ll reverse it.

Download e1/namelist/app/views/people/index.rhtml

<table border="1">
  <tr>
    <th>Name</th><th>bytes</th><th>chars</th><th>reversed</th>
  </tr>
  <% for person in @people %>
    <tr>
      <td><%= h(person.name) %></td>
      <td><%= person.name.length %></td>
      <td><%= person.name.chars.length %></td>
      <td><%= h(person.name.chars.reverse) %></td>
    </tr>
  <% end %>
</table>

<% form_for :person do |form| %>
  New name: <%= form.text_field :name %>
  <%= submit_tag "Add" %>
<% end %>

When we point our browser at our people controller, we’ll see an empty table. Let’s start by entering “Dave” in the name field.

When we hit the Add button, we see that the string “Dave” contains both 4 bytes and 4 characters—normal ASCII characters take 1 byte in UTF-8.

When we hit Add after typing Günter, we see something different.


Because the ü character takes 2 bytes to represent in UTF-8, we see that the string has a byte length of 7 and a character length of 6. Notice that the reversed form displays correctly.

Finally, we’ll add some Japanese text.

Now the disparity between the byte and character lengths is even greater. However, the string still reverses correctly, on a character-by-character basis.
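The figures in the table can be reproduced outside the browser. The sketch below assumes Ruby 1.9 or later, where String itself is encoding-aware, so length counts characters and bytesize counts bytes without any wrapper. The Japanese entry is a hypothetical stand-in, not the text entered in the original example.

```ruby
# Hypothetical sample names; the third is a stand-in for the Japanese text.
names = ["Dave", "G\u00fcnter", "\u65e5\u672c\u8a9e"]

# One row per name: the name, its byte count, its character count,
# and the name reversed character by character.
rows = names.map do |name|
  [name, name.bytesize, name.length, name.reverse]
end

rows.each do |name, bytes, chars, reversed|
  puts "#{name}\t#{bytes}\t#{chars}\t#{reversed}"
end
```

Each of the three Japanese characters in this sample occupies 3 bytes in UTF-8, which is why the gap between the two counts widens.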
