- •Guide to Elliptic Curve Cryptography
- •Contents
- •List of Algorithms
- •List of Tables
- •List of Figures
- •Acronyms
- •Preface
- •1 Introduction and Overview
- •1.1 Cryptography basics
- •1.2.3 Elliptic curve systems
- •1.3 Why elliptic curve cryptography?
- •1.4 Roadmap
- •2 Finite Field Arithmetic
- •2.2.1 Addition and subtraction
- •2.2.2 Integer multiplication
- •2.2.3 Integer squaring
- •2.2.4 Reduction
- •2.2.5 Inversion
- •2.3.1 Addition
- •2.3.2 Multiplication
- •2.3.3 Polynomial multiplication
- •2.3.4 Polynomial squaring
- •2.3.5 Reduction
- •2.4.1 Addition and subtraction
- •2.4.2 Multiplication and reduction
- •2.4.3 Inversion
- •3 Elliptic Curve Arithmetic
- •3.1 Introduction to elliptic curves
- •3.1.2 Group law
- •3.1.3 Group order
- •3.1.4 Group structure
- •3.2.1 Projective coordinates
- •3.3 Point multiplication
- •3.3.1 Unknown point
- •3.3.2 Fixed point
- •3.3.3 Multiple point multiplication
- •3.4 Koblitz curves
- •3.4.1 The Frobenius map and the ring Z[τ ]
- •3.4.2 Point multiplication
- •3.6 Point multiplication using halving
- •3.6.1 Point halving
- •3.6.3 Point multiplication
- •3.7 Point multiplication costs
- •4 Cryptographic Protocols
- •4.1 The elliptic curve discrete logarithm problem
- •4.2.3 Determining the number of points on an elliptic curve
- •4.4 Signature schemes
- •4.4.1 ECDSA
- •4.4.2 EC-KCDSA
- •4.5.1 ECIES
- •4.5.2 PSEC
- •4.6.1 Station-to-station
- •4.6.2 ECMQV
- •5 Implementation Issues
- •5.1 Software implementation
- •5.1.1 Integer arithmetic
- •5.1.5 Timings
- •5.2 Hardware implementation
- •5.3 Secure implementation
- •5.3.1 Power analysis attacks
- •5.3.2 Electromagnetic analysis attacks
- •5.3.4 Fault analysis attacks
- •5.3.5 Timing attacks
- •A.1 Irreducible polynomials
- •A.2 Elliptic curves
- •A.2.2 Random elliptic curves over F2m
- •A.2.3 Koblitz elliptic curves over F2m
- •C.1 General-purpose tools
- •C.2 Libraries
- •Bibliography
- •Index
3.3. Point multiplication |
95 |
Algorithm 3.25 Point addition (y2+x y=x3+ax2+b, a {0, 1}, LD-affine coordinates)
INPUT: P = (X1 : Y1 : Z1) in LD coordinates, Q = (x2, y2) in affine coordinates on
E/K : y2 + x y = x3 + ax2 + b.
OUTPUT: P + Q = (X3 : Y3 : Z3) in LD coordinates.
1.If Q = ∞ then return(P).
2.If P = ∞ then return(x2 : y2 : 1).
3. |
1 |
← |
2 |
· |
x2. |
{ |
T1 |
← |
2 |
|
} |
T |
|
Z1 |
|
|
|
X2 Z1 |
|
||||
4. |
T2 ← Z1 . |
|
{T2 ← Z1 |
} |
|
||||||
5. |
X3 ← X1 + T1. |
{X3 ← B |
= X2 Z1 + X1} |
||||||||
6. |
T1 ← Z1 · X3. |
{T1 ← C =2Z1 B} |
|||||||||
7. |
T3 ← T2 · y2. |
{T3 ← Y2 Z1 } |
|||||||||
8. |
Y3 ← Y1 + T3. |
{Y3 ← A = Y2 Z12 + Y1} |
9.If X3 = 0 then
9.1If Y3 = 0 then use Algorithm 3.24 to compute
(X3 : Y3 : Z3) = 2(x2 : y2 : 1) and return(X3 : Y3 : Z3).
9.2Else return(∞).
10. |
Z3 ← T12. |
|
{Z3 ← C2} |
2} |
|
|
|
||||||||
11. |
3 ← T1 |
· Y3. |
{T3 |
← E = |
|
|
|
||||||||
T |
|
|
|
|
|
|
|
|
|
AC |
|
|
|
||
12. |
If a = 1 then T1 ← T1 + T2. |
{T1 |
← C + a Z1 } |
|
|
|
|||||||||
13. |
T2 |
← X32. |
|
{T2 |
← B2} |
|
|
|
|
|
|||||
14. |
X3 ← T2 · T1. |
{X3 ← D = B2(C + a Z12)} |
|||||||||||||
15. |
T2 |
← Y32. |
|
{T2 |
← A2} |
|
|
|
|
|
|||||
|
X3 ← X3 + T2. |
|
|
|
2 |
+ D} |
|
|
|
||||||
16. |
{X3 ← A2 |
|
|
|
|||||||||||
17. |
X3 ← X3 + T3. |
{X3 ← A |
+ D + E} |
|
|
||||||||||
18. |
T2 |
← x2 · Z3. |
{T2 |
← X2 Z3} |
+ |
|
} |
|
|||||||
19. |
2 |
← |
2+ |
X3. |
{ |
T2 |
← |
2= |
X3 |
X2 Z3 |
|
||||
T |
|
T2 |
|
|
|
F |
|
|
|
|
|||||
20. |
T1 |
← Z3 . |
|
{T1 |
← Z3 } |
|
|
|
|
|
|||||
21. |
T3 |
← T3 |
+ Z3. |
{T3 |
← E + Z3} |
|
|
|
|||||||
22. |
Y3 |
← T3 · T2. |
{Y3 |
← (E + Z3)F} |
|
|
|||||||||
23. |
T2 |
← x2 + y2. |
{T2 |
← X2 + Y2} |
|
2 |
} |
||||||||
24. |
T3 |
← T1 · T2. |
{T3 |
← G = (X2 + Y2)Z3 |
|||||||||||
25. |
Y3 |
← Y3 + T3. |
{Y3 |
← (E + Z3)F + G} |
|
||||||||||
26. |
Return(X3 : Y3 : Z3). |
|
|
|
|
|
|
|
|
|
|
The field operation counts for point addition and doubling in various coordinate systems are listed in Table 3.4.
3.3Point multiplication
This section considers methods for computing k P, where k is an integer and P is a point on an elliptic curve E defined over a field Fq . This operation is called point mul-
96 |
3. Elliptic Curve Arithmetic |
|
|
|
|
|||
|
|
|
|
|
|
|
|
|
|
Coordinate system |
|
General addition |
|
General addition |
Doubling |
|
|
|
|
|
|
|
(mixed coordinates) |
|
|
|
|
|
|
|
|
|
|
|
|
|
Affine |
|
V + M |
|
— |
V + M |
|
|
|
Standard projective |
|
13M |
|
12M |
7M |
|
|
|
Jacobian projective |
|
14M |
|
10M |
5M |
|
|
|
Lopez´-Dahab projective |
|
14M |
|
8M |
4M |
|
|
|
|
|
|
|
|
|
|
|
Table 3.4. Operation counts for point addition and |
doubling on y2 + x y = x3 + ax2 + b. |
|||||||
M = multiplication, V = division (see §2.3.6). |
|
|
|
|
tiplication or scalar multiplication, and dominates the execution time of elliptic curve cryptographic schemes (see Chapter 4). The techniques presented do not exploit any special structure of the curve. Point multiplication methods that take advantage of efficiently computable endomorphisms on some special curves are considered in §3.4, §3.5, and §3.6. §3.3.1 covers the case where P is not known a priori. In instances where P is fixed, for example in ECDSA signature generation (see §4.4.1), point multiplication algorithms can exploit precomputed data that depends only on P (and not on k); algorithms of this kind are presented in §3.3.2. Efficient techniques for computing k P + l Q are considered in §3.3.3. This operation, called multiple point multiplication, dominates the execution time of some elliptic curve cryptographic schemes such as ECDSA signature verification (see §4.4.1).
We will assume that #E(Fq ) = nh where n is prime and h is small (so n ≈ q), P and Q have order n, and multipliers such as k are randomly selected integers from the interval [1, n − 1]. The binary representation of k is denoted (kt−1, . . . , k2, k1, k0)2, where t ≈ m = log2 q .
3.3.1Unknown point
Algorithms 3.26 and 3.27 are the additive versions of the basic repeated-square-and- multiply methods for exponentiation. Algorithm 3.26 processes the bits of k from right to left, while Algorithm 3.27 processes the bits from left to right.
Algorithm 3.26 Right-to-left binary method for point multiplication
INPUT: k = (kt−1, . . . , k1, k0)2, P E(Fq ).
OUTPUT: k P.
1.Q ← ∞.
2.For i from 0 to t − 1 do
2.1If ki = 1 then Q ← Q + P.
2.2P ← 2P.
3.Return(Q).
3.3. Point multiplication |
97 |
Algorithm 3.27 Left-to-right binary method for point multiplication
INPUT: k = (kt−1, . . . , k1, k0)2, P E(Fq ).
OUTPUT: k P.
1.Q ← ∞.
2.For i from t − 1 downto 0 do
2.1Q ← 2Q.
2.2If ki = 1 then Q ← Q + P.
3.Return(Q).
The expected number of ones in the binary representation of k is t/2 ≈ m/2, whence the expected running time of Algorithm 3.27 is approximately m/2 point additions and m point doublings, denoted
m |
A + m D. |
(3.15) |
2 |
Let M denote a field multiplication, S a field squaring, and I a field inversion. If affine coordinates (see §3.1.2) are used, then the running time expressed in terms of field operations is
2.5m S + 3m M + 1.5m I |
(3.16) |
if Fq has characteristic > 3, and |
|
3m M + 1.5m I |
(3.17) |
if Fq is a binary field.
Suppose that Fq has characteristic > 3. If mixed coordinates (see §3.2.2) are used, then Q is stored in Jacobian coordinates, while P is stored in affine coordinates. Thus the doubling in step 2.1 can be performed using Algorithm 3.21, while the addition in step 2.2 can be performed using Algorithm 3.22. The field operation count of
Algorithm 3.27 is then |
|
8m M + 5.5m S + (1I + 3M + 1S) |
(3.18) |
(one inversion, three multiplications and one squaring are required to convert back to affine coordinates).
Suppose now that Fq is a binary field. If mixed coordinates (see §3.2.3) are used, then Q is stored in LD projective coordinates, while P can be stored in affine coordinates. Thus the doubling in step 2.1 can be performed using Algorithm 3.24, and the addition in step 2.2 can be performed using Algorithm 3.25. The field operation count of Algorithm 3.27 is then
8.5m M + (2M + 1I ) |
(3.19) |
(one inversion and two multiplications are required to convert back to affine coordinates).
98 3. Elliptic Curve Arithmetic
Non-adjacent form (NAF)
If P = (x, y) E(Fq ) then − P = (x, x + y) if Fq is a binary field, and − P = (x, −y) if Fq has characteristic > 3. Thus subtraction of points on an elliptic curve is just as
efficient as addition. This motivates using a signed digit representation k = l−1 k 2i ,
i=0 i
where ki {0, ±1}. A particularly useful signed digit representation is the non-adjacent form (NAF).
Definition 3.28 A non-adjacent form (NAF) of a positive integer k is an expression |
||||||||||
k |
= |
|
2i where k |
i |
{ |
± } |
l−1 = |
0, and no two consecutive digits k |
i |
are |
i=0 i |
||||||||||
|
l−1 k |
|
0, |
1 , k |
|
|
nonzero. The length of the NAF is l.
Theorem 3.29 (properties of NAFs) Let k be a positive integer.
(i)k has a unique NAF denoted NAF(k).
(ii)NAF(k) has the fewest nonzero digits of any signed digit representation of k.
(iii)The length of NAF(k) is at most one more than the length of the binary representation of k.
(iv)If the length of NAF(k) is l, then 2l /3 < k < 2l+1/3.
(v)The average density of nonzero digits among all NAFs of length l is approximately 1/3.
NAF(k) can be efficiently computed using Algorithm 3.30. The digits of NAF(k) are generated by repeatedly dividing k by 2, allowing remainders of 0 or ±1. If k is odd, then the remainder r {−1, 1} is chosen so that the quotient (k − r )/2 is even—this ensures that the next NAF digit is 0.
Algorithm 3.30 Computing the NAF of a positive integer
INPUT: A positive integer k.
OUTPUT: NAF(k).
1.i ← 0.
2.While k ≥ 1 do
2.1If k is odd then: ki ← 2 − (k mod 4), k ← k − ki ;
2.2Else: ki ← 0.
2.3k ← k/2, i ← i + 1.
3.Return(ki−1 , ki−2, . . . , k1, k0).
Algorithm 3.31 modifies the left-to-right binary method for point multiplication (Algorithm 3.27) by using NAF(k) instead of the binary representation of k. It follows from (iii) and (v) of Theorem 3.29 that the expected running time of Algorithm 3.31 is approximately
m |
A + m D. |
(3.20) |
|
3
3.3. Point multiplication |
99 |
Algorithm 3.31 Binary NAF method for point multiplication
INPUT: Positive integer k, P E(Fq ).
OUTPUT: k P.
1. Use Algorithm 3.30 to compute NAF(k) = l−1 k 2i .
i=0 i
2.Q ← ∞.
3.For i from l − 1 downto 0 do
3.1Q ← 2Q.
3.2If ki = 1 then Q ← Q + P.
3.3If ki = −1 then Q ← Q − P.
4.Return(Q).
Window methods
If some extra memory is available, the running time of Algorithm 3.31 can be decreased by using a window method which processes w digits of k at a time.
Definition 3.32 Let w |
≥ |
2 be a positive integer. A width-w NAF of a positive integer k |
|||||||||||
is an expression k |
= |
|
|
l |
|
1 |
ki 2i where each nonzero coefficient ki |
is odd, |
| |
ki |
| |
< 2w−1, |
|
|
|
i=0 |
|||||||||||
|
|
|
|
− |
|
|
|
||||||
k |
most one of any w consecutive digits is nonzero. The length of the |
||||||||||||
l−1 = 0, and at |
|
|
|
|
|
|
|
|
|
|
|
|
width-w NAF is l.
Theorem 3.33 (properties of width-w NAFs) Let k be a positive integer.
(i)k has a unique width-w NAF denoted NAFw(k).
(ii)NAF2(k) = NAF(k).
(iii)The length of NAFw (k) is at most one more than the length of the binary representation of k.
(iv)The average density of nonzero digits among all width-w NAFs of length l is approximately 1/(w + 1).
Example 3.34 (width-w NAFs) Let k = 1122334455. We denote a negative integer −c by c. The binary representation of k and the width-w NAFs of k for 2 ≤ w ≤ 6 are:
(k)2 = |
1 0 0 0 |
0 1 0 1 1 |
1 0 0 1 0 1 |
0 1 1 1 0 1 1 0 1 1 1 1 0 1 1 1 |
|||||||||||||||||||
NAF2(k) = |
1 0 0 0 |
1 0 |
1 |
0 0 |
|
1 |
0 1 0 |
1 |
0 |
|
1 |
0 0 0 |
1 |
0 0 |
1 |
0 0 0 0 |
|
1 |
0 0 |
1 |
|||
NAF3 |
(k) = |
1 0 0 0 |
0 0 3 0 0 |
|
1 |
0 0 1 0 0 |
3 0 0 0 |
1 |
0 0 |
1 |
0 0 0 0 |
|
1 |
0 0 |
1 |
||||||||
NAF4 |
(k) = |
1 0 0 0 |
0 1 0 0 0 |
7 0 0 0 0 5 |
0 0 0 7 0 0 0 7 0 0 0 |
1 |
0 0 0 7 |
||||||||||||||||
NAF5 |
(k) = 1 0 0 0 0 |
15 |
0 0 0 0 |
|
9 |
0 0 0 0 0 |
11 0 0 0 0 0 0 |
9 |
0 0 0 0 0 0 0 |
9 |
|||||||||||||
NAF6 |
(k) = |
1 0 0 0 |
0 0 0 0 0 |
23 0 0 0 0 0 |
11 0 0 0 0 0 0 |
9 |
0 0 0 0 0 0 0 |
9 |
NAFw (k) can be efficiently computed using Algorithm 3.35, where k mods 2w denotes the integer u satisfying u ≡ k (mod 2w ) and −2w−1 ≤ u < 2w−1. The digits
100 3. Elliptic Curve Arithmetic
of NAFw (k) are obtained by repeatedly dividing k by 2, allowing remainders r in [−2w−1, 2w−1 − 1]. If k is odd and the remainder r = k mods 2w is chosen, then (k − r )/2 will be divisible by 2w−1, ensuring that the next w − 1 digits are zero.
Algorithm 3.35 Computing the width-w NAF of a positive integer
INPUT: Window width w, positive integer k.
OUTPUT: NAFw (k).
1.i ← 0.
2.While k ≥ 1 do
2.1If k is odd then: ki ← k mods 2w , k ← k − ki ;
2.2Else: ki ← 0.
2.3k ← k/2, i ← i + 1.
3.Return(ki−1 , ki−2, . . . , k1, k0).
Algorithm 3.36 generalizes the binary NAF method (Algorithm 3.31) by using NAFw (k) instead of NAF(k). If follows from (iii) and (iv) of Theorem 3.33 that the expected running time of Algorithm 3.36 is approximately
( |
) |
+ |
|
A + m D+. |
|
|
|
1D + (2w−2 − 1) A + |
* |
m |
|
(3.21) |
|
|
w |
1 |
Algorithm 3.36 Window NAF method for point multiplication
INPUT: Window width w, positive integer k, P E(Fq ).
OUTPUT: k P.
1. Use Algorithm 3.35 to compute NAF (k) = l−1 k 2i ,
w i=0 i
2.Compute Pi = i P for i {1, 3, 5, . . . , 2w−1 − 1}.
3.Q ← ∞.
4.For i from l − 1 downto 0 do
4.1Q ← 2Q.
4.2If ki = 0 then:
If ki > 0 then Q ← Q + Pki ; Else Q ← Q − P−ki .
5. Return(Q).
Note 3.37 (selection of coordinates) The number of field inversions required can be reduced by use of projective coordinates for the accumulator Q. If inversion is sufficiently expensive relative to field multiplication, then projective coordinates may also be effective for Pi . Chudnovsky coordinates (§3.2.2) for curves over prime fields eliminate inversions in precomputation at the cost of less-efficient Jacobian-Chudnovsky mixed additions in the evaluation phase.
3.3. Point multiplication |
101 |
The window NAF method employs a “sliding window” in the sense that Algorithm 3.35 has a width-w window, moving right-to-left, skipping consecutive zero entries after a nonzero digit ki is processed. As an alternative, a sliding window can be used on the NAF of k, leading to Algorithm 3.38. The window (which has width at most w) moves left-to-right over the digits in NAF(k), with placement so that the value in the window is odd (to reduce the required precomputation).
Algorithm 3.38 Sliding window method for point multiplication
INPUT: Window width w, positive integer k, P E(Fq ).
OUTPUT: k P.
1. Use Algorithm 3.30 to compute NAF(k) = l−1 k 2i .
i=0 i
2.Compute Pi = i P for i {1, 3, . . . , 2(2w − (−1)w )/3 − 1}.
3.Q ← ∞, i ←l − 1.
4.While i ≥ 0 do
4.1If ki = 0 then t ← 1, u ← 0;
4.2Else: find the largest t ≤ w such that u ← (ki , . . . , ki−t+1) is odd.
4.3Q ← 2t Q.
4.4If u > 0 then Q ← Q + Pu ; else if u < 0 then Q ← Q − P−u .
4.5i ← i − t.
5.Return(Q).
The average length of a run of zeros between windows in the sliding window method
is
4 (−1)w
ν(w) = 3 − 3 · 2w−2 .
It follows that the expected running time of Algorithm 3.38 is approximately
*1D + |
w |
− |
( 1)w |
− 1 A+ |
|
m |
|
|
2 |
− |
+ |
|
A + m D. |
(3.22) |
|||
|
|
3 |
w + ν(w) |
Note 3.39 (comparing sliding window and window NAF methods) For a given w, the sliding window method allows larger values in a window compared with those appearing in a width-w NAF. This translates to a higher cost for precomputation (roughly 2w /3 in step 2 of Algorithm 3.38 versus 2w /4 point operations in step 2 of Algorithm 3.36) in the sliding window method, but fewer point operations in the main loop (m/(w + ν(w)) versus m/(w + 1)). If the comparison is on point operations, then the window NAF method will usually result in fewer point additions (when the optimum w is selected for each method) for m of interest. To make a more precise comparison, the coordinate representations (driven by the cost of field inversion versus multiplication) must be considered.
As an example, consider the NIST binary curves and suppose that the inverse to multiplication ratio is I/M = 8. Affine coordinates are used in precomputation, while the
102 3. Elliptic Curve Arithmetic |
|
|
|
|
|
|
|
|
||||||||
|
|
|
|
|
|
|
|
|
|
|
||||||
|
|
|
points |
|
m = 163 |
|
m = 233 |
m = 283 |
m = 409 |
m = 571 |
||||||
|
w |
|
WN SW |
|
WN SW |
WN SW |
WN SW |
WN |
SW |
WN |
SW |
|||||
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
2 |
|
1 |
1 |
|
442 |
442 |
|
626 |
626 |
762 |
762 |
1098 |
1098 |
1530 |
1530 |
|
3 |
|
2 |
3 |
|
340 |
318 |
|
484 |
438 |
580 |
526 |
836 |
750 |
1156 |
1038 |
|
4 |
|
4 |
5 |
|
296 |
298 |
|
408 |
402 |
488 |
474 |
688 |
666 |
952 |
914 |
|
5 |
|
8 |
11 |
|
296 |
310 |
|
384 |
398 |
456 |
462 |
624 |
622 |
840 |
822 |
|
6 |
|
16 |
21 |
|
344 |
386 |
|
424 |
458 |
480 |
514 |
624 |
650 |
808 |
834 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Table 3.5. Point addition cost in sliding versus window NAF methods, when I/M = 8. “points” denotes the number the points stored in the precomputation stage. “WN” denotes the window NAF method (Algorithm 3.36). “SW” denotes the sliding window method (Algorithm 3.38).
main loop uses mixed projective-affine additions. Table 3.5 shows the expected cost of point additions in each method. Note that there will also be m point doublings with each method, so the difference in times for point multiplication will be even smaller than Table 3.5 suggests. If there are constraints on the number of points that can be stored at the precomputation phase, then the difference in precomputation may decide the best method. For example, if only three points can be stored, then the sliding window method will be preferred, while storage for four points will favour the window NAF method. The differences are fairly small however; in the example, use of w = 3 (two and three points of precomputation, respectively) for both methods will favour sliding window, but gives only 7–10% reduction in point addition cost over window NAF.
Montgomery’s method
Algorithm 3.40 for non-supersingular elliptic curves y2 + x y = x3 + ax2 + b over binary fields is due to L´opez and Dahab, and is based on an idea of Montgomery. Let Q1 = (x1, y1) and Q2 = (x2, y2) with Q1 = ±Q2. Let Q1 + Q2 = (x3, y3) and Q1 − Q2 = (x4, y4). Then using the addition formulas (see §3.1.2), it can be verified that
x3 |
= x4 |
+ |
x2 |
+ |
|
x2 |
2 . |
(3.23) |
x1 x2 |
x1 x2 |
|||||||
|
|
+ |
|
+ |
|
|
Thus, the x-coordinate of Q1 + Q2 can be computed from the x-coordinates of Q1, Q2 and Q1 − Q2. Iteration j of Algorithm 3.40 for determining k P computes the x- coordinates only of Tj = [l P, (l + 1) P], where l is the integer represented by the j leftmost bits of k. Then Tj+1 = [2l P, (2l + 1) P] or [(2l + 1) P, (2l + 2) P] if the ( j + 1)st leftmost bit of k is 0 or 1, respectively, as illustrated in Figure 3.4. Each iteration requires one doubling and one addition using (3.23). After the last iteration, having computed the x-coordinates of k P = (x1, y1) and (k + 1) P = (x2, y2), the y-coordinate of k P can be recovered as:
y1 = x−1(x1 + x)[(x1 + x)(x2 + x) + x2 + y] + y. |
(3.24) |