Добавил:
Upload Опубликованный материал нарушает ваши авторские права? Сообщите нам.
Вуз: Предмет: Файл:
Clojure.pdf
Скачиваний:
17
Добавлен:
09.05.2015
Размер:
12.92 Mб
Скачать

LAZIER THAN LAZY 160

You could define the sequence directly as a top-level var:

Download examples/functional.clj

; holds the head (avoid!)

(def head-fibo (lazy-cat [0 1] (map + head-fibo (rest head-fibo))))

This definition uses lazy-cat, which is like concat except that the arguments are evaluated only when needed. This is a very pretty definition in that it defines the recursion by mapping a sum over (each element of the Fibonaccis) and (each element of the rest of the Fibonaccis).

head-fibo works great for small Fibonacci numbers:

(take 10 head-fibo)

(0 1 1 2 3 5 8 13 21 34)

but not so well for huge ones:

(nth head-fibo 1000000)

java.lang.OutOfMemoryError: Java heap space

The problem is that the top-level var head-fibo holds the head of the collection. This prevents the garbage collector from reclaiming elements of the sequence after you have moved past those elements. So, any part of the Fibonacci sequence that you actually use gets cached for the life of the value referenced by head-fibo, which is likely to be the life of the program.

Unless you want to cache a sequence as you traverse it, you must be careful not to keep a reference to the head of the sequence. As the headfibo example demonstrates, you should normally expose lazy sequences as a function that returns the sequence, not as a var that contains the sequence. If a caller of your function wants an explicit cache, the caller can always create its own var.

With lazy sequences, losing your head is often a good idea.

5.3 Lazier Than Lazy

Clojure’s lazy sequences are a great form of laziness at the language level. As a programmer, you can be even lazier by finding solutions that do not require explicit sequence manipulation at all. You can often combine existing sequence functions to solve a problem, without having to get your hands dirty at the level of recur or lazy sequences at all.

Prepared exclusively for WG Custom Motorcycles

Report erratum

this copy is (P1.0 printing, May 2009)

LAZIER THAN LAZY 161

As an example of this, you will implement several solutions to the following problem.7 You are given a sequence of coin toss results, where heads is :h and tails is :t:

[:h :t :t :h :h :h]

How many times in the sequence does heads come up twice in a row? In the previous example, the answer is two. Toss 3 and 4 are both heads, and toss 4 and 5 are both heads.

The sequence of coin tosses might be very large, but it will be finite. Since you are looking for a scalar answer (a count), by rule 2 it is acceptable to use recur:

Download examples/functional.clj

Line 1 (defn count-heads-pairs [coll]

2(loop [cnt 0 coll coll]

3(if (empty? coll)

4cnt

5(recur (if (= :h (first coll) (second coll))

6

(inc cnt)

7

cnt)

8

(rest coll)))))

Since the purpose of the function is to count something, the loop introduces a cnt binding, initially zero on line 2. The loop also introduces its own binding for the coll so that we can shrink the coll each time through the recur.

Line 3 provides the basis for the recurrence. If a sequence of coins tosses is empty, it certainly has zero runs of two heads in a row.

Line 5 is the meat of the function, incrementing the cnt by one if the first and second items of coll are both heads (:h).

Try a few inputs to see that count-heads-pairs works as advertised:

(count-heads-pairs [:h :h :h :t :h])

2

(count-heads-pairs [:h :t :h :t :h])

0

Although count-heads-pairs works, it fails as code prose. The key notion of “two in a rowness” is completely obscured by the boilerplate for

7. Hat tip to Jeff Brown, who posed this problem over breakfast at a No Fluff, Just Stuff symposium.

Prepared exclusively for WG Custom Motorcycles

Report erratum

this copy is (P1.0 printing, May 2009)

LAZIER THAN LAZY 162

loop/recur. To fix this, you will need to use rules 5 and 6, subdividing the problem to take advantage of Clojure’s sequence library.

The first problem you will encounter is that almost all the sequence functions do something to each element in a sequence in turn. This doesn’t help us at all, since we want to look at each element in the context of its immediate neighbors. So, let’s transform the sequence. When you see this:

[:h :t :t :h :h :h]

you should mentally translate that into a sequence of every adjacent pair:

[[:h :t] [:t :t] [:t :h] [:h :h] [:h :h]]

Write a function named by-pairs that performs this transformation. Because the output of by-pairs varies based on the size of its input, by rule 3 you should build this sequence lazily:

Download examples/functional.clj

Line 1 ; overly complex, better approaches follow...

2(defn by-pairs [coll]

3(let [take-pair (fn [c]

4

(when (next c) (take 2 c)))]

5(lazy-seq

6(when-let [pair (seq (take-pair coll))]

7

(cons pair (by-pairs (rest coll)))))))

Line 3 defines a function that takes the first pair of elements from the collection. Line 5 ensures that the recursion is evaluated lazily. Line 6 is a conditional: if the next pair does not actually contain two elements, we must be (almost) at the end of the list, and we implicitly terminate. If we did get two elements, then on line 7 we continue building the sequence by consing our pair onto the pairs to be had from the rest of the collection.

Check that by-pairs works:

(by-pairs [:h :t :t :h :h :h])

((:h :t) (:t :t) (:t :h) (:h :h) (:h :h))

Now that you can think of the coin tosses as a sequence of pairs of results, it is easy to describe count-heads-pairs in English:

“Count the pairs of results that are all heads.”

Prepared exclusively for WG Custom Motorcycles

Report erratum

this copy is (P1.0 printing, May 2009)

LAZIER THAN LAZY 163

This English description translates directly into existing sequence library functions: “Count” is count, of course, and “that are all heads” suggests a filter:

Download examples/functional.clj

(defn count-heads-pairs [coll]

(count (filter (fn [pair] (every? #(= :h %) pair)) (by-pairs coll))))

This is much more expressive than the recur-based implementation, and it makes clear that we are counting all the adjacent pairs of heads. But we can make things even simpler. Clojure already has a more general version of by-pairs named partition:

(partition size step? coll)

partition breaks a collection into chunks of size size. So, you could break a heads/tails vector into a sequence of pairs:

(partition 2 [:h :t :t :h :h :h])

((:h :t) (:t :h) (:h :h))

That isn’t quite the same as by-pairs, which yields overlapping pairs. But partition can do overlaps too. The optional step argument determines how far partition moves down the collection before starting its next chunk. If not specified, step is the same as size. To make partition work like by-pairs, set size to 2 and step to 1:

(partition 2 1 [:h :t :t :h :h :h])

((:h :t) (:t :t) (:t :h) (:h :h) (:h :h))

(by-pairs [:h :t :t :h :h :h])

((:h :t) (:t :t) (:t :h) (:h :h) (:h :h))

Another possible area of improvement is the count/filter idiom used to count the pairs that are both heads. This combination comes up often enough that it is worth encapsulating in a count-if function:

Download examples/functional.clj

(use '[clojure.contrib.def :only (defvar)])

(defvar count-if (comp count filter) "Count items matching a filter" )

The definition of count-if introduces two new forms. defvar is a convenience wrapper around def and is described in the sidebar on the following page. comp is used to compose two or more functions:

(comp f & fs)

The composed function is a new function that applies the rightmost function to its arguments, the next-rightmost function to that result,

Prepared exclusively for WG Custom Motorcycles

Report erratum

this copy is (P1.0 printing, May 2009)

LAZIER THAN LAZY 164

Faces of def

Throughout the book you will use various def forms to create vars, such as defn, defmacro, and defmulti. These forms are all eventually wrappers around the def special form.

A number of other def variants are less often used but still worth knowing. defvar provides a convenient way to add a documentation string to a var:

(clojure.contrib.def/defvar a-symbol initial-value? docstring?)

defonce ensures that a var exists and sets the root binding for the var only if it is not already set :

(defonce a-symbol initial-value?)

defhinted will create a var and set its type information based on the initial value of the var:

(clojure.contrib.def/defhinted a-symbol initial-value?)

defn- works just like defn but yields a private function that is accessible only in the namespace where it was defined.

(defnname & args-as-for-defn)

Many other def forms also have dash-suffixed variants that are private.

For more def variants, see the source code for clojure.contrib.def.

and so on. So, count-if will first filter and then count the results of the

filter:

(count-if odd? [1 2 3 4 5])

3

Finally, you can use count-if and partition to create a count-runs function that is more general than count-heads-pairs:

Download examples/functional.clj

(defn count-runs

"Count runs of length n where pred is true in coll."

[n pred coll]

(count-if #(every? pred %) (partition n 1 coll)))

Prepared exclusively for WG Custom Motorcycles

Report erratum

this copy is (P1.0 printing, May 2009)

LAZIER THAN LAZY 165

count-runs is a winning combination: both simpler and more general than the previous versions of count-heads-pairs. You can use it to count pairs of heads:

(count-runs 2 #(= % :h) [:h :t :t :h :h :h])

2

But you can just as easily use it to count pairs of tails:

(count-runs 2 #(= % :t) [:h :t :t :h :h :h])

1

Or, instead of pairs, how about runs of three heads in a row?

(count-runs 3 #(= % :h) [:h :t :t :h :h :h])

1

If you still want to have a function named count-heads-pairs, you can implement it in terms of count-runs:

Download examples/functional.clj

(defvar count-heads-pairs (partial count-runs 2 #(= % :h))

"Count runs of length two that are both heads" )

This version of count-heads-pairs builds a new function using partial:

(partial f & partial-args)

partial performs a partial application of a function. You specify a function f and part of the argument list when you perform the partial. You specify the remainder of the argument list later, when you call the function created by partial. So, the following:

(partial count-runs 1 #(= % :h))

is a more expressive way of saying this:

(fn [coll] (count-runs 1 #(= % :h) coll))

Partial application is similar but not identical to currying.

Currying and Partial Application

When you curry a function, you get a new function that takes one argument and returns the original function with that one argument fixed. (Curry is named for Haskell Curry, an American logician best known for his work in combinatory logic.) If Clojure had a curry, it might be implemented like this:

; almost a curry

(defn faux-curry [& args] (apply partial partial args))

Prepared exclusively for WG Custom Motorcycles

Report erratum

this copy is (P1.0 printing, May 2009)

LAZIER THAN LAZY 166

One use of curry is partial application. Here is partial application in Clojure:

(def add-3 (partial + 3)) (add-3 7)

10

And here is partial application using our faux-curry:

(def add-3 ((faux-curry +) 3)) (add-3 7)

10

If all you want is partial application, currying is just an intermediate step. Our faux-curry is not a real curry. A real curry would return a result, not a function of no arguments, once all the arguments were fixed. You can see the difference here, using the function true?, which takes only one argument:

; faux curry

((faux-curry true?) (= 1 1))

#<... mangled function name ...>

; if the curry were real ((curry true?) (= 1 1))

true

Since Clojure functions can have variable-length argument lists, Clojure cannot know when all the arguments are fixed. But you, the programmer, do know when you are done adding arguments. Once you have curried as many arguments as you want, just invoke the function. That amounts to adding an extra set of parentheses around the earlier expression:

(((faux-curry true?) (= 1 1)))

true

The absence of curry from Clojure is not a major problem, since partial is available, and that is what people generally want out of curry anyway. In fact, many programmers use the terms currying and partial application interchangeably.

You have seen a lot of new forms in this section. Do not let all the details obscure the key idea: by combining existing functions from the sequence library, you were able to create a solution that was both simpler and more general than the direct approach. And, you did not have to worry about laziness or recursion at all. Instead, you worked at a higher level of abstraction and let Clojure deal with laziness and recursion for you.

Prepared exclusively for WG Custom Motorcycles

Report erratum

this copy is (P1.0 printing, May 2009)