MRABET Amine
Montgomery Algorithm for Modular Multiplication
with Systolic Architecture
LIASD Paris 8
ENIT-TUNIS EL MANAR University
SAS - CMP - Gardanne
SPACE 2016
Plan
1. Introduction for pairing
2. Montgomery Multiplication (CIOS)
3. Architecture
4. Results
5. Conclusion and Perspectives
General Context
This work is part of the hardware implementation of
asymmetric cryptography primitives, such as the Optimal Ate
pairing on elliptic curves, elliptic-curve cryptosystems
and RSA, which are the best-known methods in asymmetric encryption.
Definition
Let $G_1$ and $G_2$ be two additive groups and let $G_3$ be a
multiplicative group.
A pairing is a map
$e : G_1 \times G_2 \to G_3$ with the following properties:
Non-degeneracy:
for every $P \in G_1$ with $P \neq 0$ there exists $Q \in G_2$ such that $e(P, Q) \neq 1$,
and
for every $Q \in G_2$ with $Q \neq 0$ there exists $P \in G_1$ such that $e(P, Q) \neq 1$.
Bilinearity:
$e(xP, yQ) = e(P, Q)^{xy}$,
$e(xP, yQ)^{z} = e(yP, zQ)^{x} = e(zP, xQ)^{y} = e(P, Q)^{xyz}$
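The two properties in the definition can be checked numerically on a toy bilinear map. The sketch below is only an illustration: it is not an elliptic-curve pairing and offers no security. It assumes G1 = G2 = the integers modulo a small prime `ORDER` (written additively) and G3 = the subgroup of that order in the multiplicative group modulo `PRIME`; the constants and function names are hypothetical choices made for this example.

```python
# Toy bilinear map for illustration only (NOT secure, NOT an elliptic-curve pairing).
# Assumption: G1 = G2 = (Z_ORDER, +), G3 = subgroup of order ORDER in Z_PRIME^*.
PRIME = 2039   # small prime, PRIME = 2 * ORDER + 1
ORDER = 1019   # prime order r of all three groups
GEN = 4        # 4 is a square modulo PRIME, so it generates the order-ORDER subgroup


def scalar_mul(k, point):
    """Compute k*point in the additive group Z_ORDER."""
    return (k * point) % ORDER


def pairing(p_point, q_point):
    """e(P, Q) = GEN^(P*Q) mod PRIME, which is bilinear by construction."""
    return pow(GEN, (p_point * q_point) % ORDER, PRIME)


P, Q = 5, 7
x, y, z = 12, 34, 56

# Bilinearity: e(xP, yQ) = e(P, Q)^(xy)
bilinear_ok = pairing(scalar_mul(x, P), scalar_mul(y, Q)) == pow(pairing(P, Q), x * y, PRIME)

# Second identity: e(xP, yQ)^z = e(P, Q)^(xyz)
triple_ok = pow(pairing(scalar_mul(x, P), scalar_mul(y, Q)), z, PRIME) == pow(
    pairing(P, Q), x * y * z, PRIME
)

# Non-degeneracy: every non-zero P pairs non-trivially with Q = 1
non_degenerate = all(pairing(p, 1) != 1 for p in range(1, ORDER))
```

In a real pairing the groups are elliptic-curve subgroups and the map is computed with the Miller algorithm; only the algebraic identities carry over from this toy model.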
Pairing protocols
The bilinearity of pairings has made it possible to construct new
protocols:
Diffie–Hellman key exchange (Joux 2001)
Identity-based cryptography (Boneh and Franklin)
Short signature schemes (Boneh, Lynn, Shacham)
Pairing protocols
Example of Identity-Based Cryptography
[Diagram: a trusted authority issues private keys to Alice (identity $I_A$) and Bob (identity $I_B$).]
$s$: the secret of the trusted authority.
The public keys are the identities of the users.
The private keys are constructed by the trusted authority and
transmitted to the users:
$P_A = s \cdot I_A$, $P_B = s \cdot I_B$
$e(P_A, I_B) = e(I_A, I_B)^{s}$ and $e(P_B, I_A) = e(I_A, I_B)^{s}$
Pairing protocols
Example of Identity-Based Cryptography
Encryption of the plaintext message M
Alice wants to send a message to Bob:
She chooses a random integer $a$,
She retrieves Bob's public key $I_B$,
She computes the pairing value $e(I_B, Q_0)^{a}$,
She sends to Bob: $[\,aP,\ M \oplus H_2(e(I_B, Q_0)^{a})\,] = [U, V]$
Pairing protocols
Example of Identity-Based Cryptography
Decryption of the encrypted message
Bob proceeds as follows:
He contacts the trusted authority to retrieve his private key
$P_B = s I_B$,
He recovers the message M by computing $V \oplus H_2(e(P_B, U))$.
This works because of the bilinearity of the pairing:
$e(P_B, U) = e(s I_B, aP) = e(I_B, P)^{as} = e(I_B, sP)^{a}$
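The encryption and decryption steps above can be traced end to end with a small sketch. It is an illustration under stated assumptions, not the scheme as deployed: the pairing is a toy bilinear map on integers (insecure), `h1` and `h2` are hypothetical hash functions built from `hashlib`, and the authority's public value is taken to be Q0 = sP, which is what the decryption identity e(PB, U) = e(IB, sP)^a requires.

```python
import hashlib
import secrets

# Toy bilinear map (insecure): G1 = G2 = (Z_ORDER, +), G3 = subgroup of Z_PRIME^*.
PRIME, ORDER, GEN = 2039, 1019, 4


def pairing(x, y):
    return pow(GEN, (x * y) % ORDER, PRIME)


def h1(identity: str) -> int:
    """Hypothetical H1: map an identity string to a non-zero element of Z_ORDER."""
    digest = hashlib.sha256(identity.encode()).digest()
    return 1 + int.from_bytes(digest, "big") % (ORDER - 1)


def h2(g3_element: int, length: int) -> bytes:
    """Hypothetical H2: derive a mask of `length` bytes from a G3 element."""
    return hashlib.shake_256(str(g3_element).encode()).digest(length)


def xor(left: bytes, right: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(left, right))


P = 1                                    # public generator (toy value)
s = 1 + secrets.randbelow(ORDER - 1)     # secret of the trusted authority
Q0 = (s * P) % ORDER                     # assumed public key of the authority, Q0 = sP


def extract(identity: str) -> int:
    """Trusted authority: private key PB = s * IB."""
    return (s * h1(identity)) % ORDER


def encrypt(identity: str, message: bytes):
    """Alice: returns [U, V] = [aP, M xor H2(e(IB, Q0)^a)]."""
    a = 1 + secrets.randbelow(ORDER - 1)
    u = (a * P) % ORDER
    mask = h2(pow(pairing(h1(identity), Q0), a, PRIME), len(message))
    return u, xor(message, mask)


def decrypt(private_key: int, u: int, v: bytes) -> bytes:
    """Bob: M = V xor H2(e(PB, U))."""
    return xor(v, h2(pairing(private_key, u), len(v)))


U, V = encrypt("bob@example.org", b"hello Bob")
recovered = decrypt(extract("bob@example.org"), U, V)
```

Both sides derive the same mask because e(s·IB, aP) and e(IB, sP)^a are equal by bilinearity, so Alice never needs Bob's private key.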
Different pairings
Weil pairing
$e_W : E(\mathbb{F}_p)[r] \times E(\mathbb{F}_{p^k})/rE(\mathbb{F}_{p^k}) \to \mathbb{F}_{p^k}^{*}$
$(P, Q) \mapsto (-1)^{r} \, f_{r,P}(Q) / f_{r,Q}(P)$
Its computation requires a Miller Lite evaluation $f_{r,P}(Q)$, a Miller Full evaluation $f_{r,Q}(P)$, an inversion and a multiplication.
Tate pairing
$e_T : E(\mathbb{F}_p)[r] \times E(\mathbb{F}_{p^k})/rE(\mathbb{F}_{p^k}) \to \mathbb{F}_{p^k}^{*}$
$(P, Q) \mapsto f_{r,P}(Q)^{(p^k - 1)/r}$
The Tate pairing is defined with the same parameters $E$, $\mathbb{F}_p$, $r$, $k$
as the Weil pairing.
Computing the Tate pairing takes $\log_2(r)$ iterations of
the Miller algorithm, where $r$ is the order of the subgroups used.
Different pairings
Ate pairing
$G_1 = E[r] \cap \mathrm{Ker}(\pi_p - [1]) = E(\mathbb{F}_p)[r]$, $G_2 = E[r] \cap \mathrm{Ker}(\pi_p - [p])$, where $\pi_p$ is the Frobenius endomorphism
$e_A : G_1 \times G_2 \to \mathbb{F}_{p^k}^{*}$
$(P, Q) \mapsto f_{T,Q}(P)^{(p^k - 1)/r}$
The main advantage over the Tate pairing is the reduced number of
iterations of the Miller algorithm:
$\log_2(T)$, where $T = t - 1$ and $t$ is the trace of Frobenius on $E(\mathbb{F}_p)$.
The disadvantage of the Ate pairing is that it corresponds to a Miller Full evaluation.
Different pairings
Twisted Ate pairing
$G_1 = E[r] \cap \mathrm{Ker}(\pi_p - [1]) = E(\mathbb{F}_p)[r]$, $G_2 = E[r] \cap \mathrm{Ker}(\pi_p - [p])$
$e_{TA} : G_1 \times G_2 \to \mathbb{F}_{p^k}^{*}$
$(P, Q) \mapsto f_{T,P}(Q)^{(p^k - 1)/r}$
The computation uses a single Miller Lite evaluation, which reduces the
complexity of the calculation.
Different pairings
Optimal Ate (OATE) pairing
The Optimal Ate pairing improves on the Ate pairing by reducing the number of iterations
of the Miller algorithm used to compute the Miller function $f_{\cdot,Q}(P)$.
In the case of BN curves, the OATE pairing is defined by:
[Formula shown as an image in the original slides.]
where the Miller loop parameter is $6t + 2$ ($t$ being the parameter of the BN curve).
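To make the iteration counts quoted for the Tate, Ate and Optimal Ate pairings concrete, the following sketch compares the bit lengths of the three Miller loop parameters on a BN curve. It uses the standard BN polynomial parameterization; the BN parameter is named `u` here (the slides call it t) to avoid a clash with the Frobenius trace, and the sample value of `u` is only an example of the size used for curves of about 254 bits.

```python
def bn_parameters(u: int):
    """Standard BN parameterization: field prime p, group order r, Frobenius trace."""
    p = 36 * u**4 + 36 * u**3 + 24 * u**2 + 6 * u + 1
    r = 36 * u**4 + 36 * u**3 + 18 * u**2 + 6 * u + 1
    trace = 6 * u**2 + 1
    return p, r, trace


# Example BN parameter (assumption, chosen only for its size).
u = -(2**62 + 2**55 + 1)
p, r, trace = bn_parameters(u)

# Approximate Miller loop lengths, in bits of the loop parameter.
tate_bits = r.bit_length()               # Tate: log2(r)
ate_bits = (trace - 1).bit_length()      # Ate: log2(T), with T = t - 1
optimal_ate_bits = abs(6 * u + 2).bit_length()  # Optimal Ate on BN: log2(6u + 2)

print(f"Tate: {tate_bits} bits, Ate: {ate_bits} bits, Optimal Ate: {optimal_ate_bits} bits")
```

The loop parameter shrinks from the size of r to roughly half of it for Ate and roughly a quarter for Optimal Ate, which is the reduction in Miller iterations that the slides refer to.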
Basic operations
The basic operations in the finite field:
Addition
Subtraction
Multiplication
Inversion
account for most of the computation time of a pairing.
That is why optimizing these operations matters most.
Reminder: Montgomery algorithm
Conversion between the ordinary domain and the Montgomery domain

| Ordinary domain | Montgomery domain |
|---|---|
| a | M(a) = a·R mod p |
| b | M(b) = b·R mod p |
| a·b | M(a·b) = a·b·R mod p |
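The conversion table can be checked with a few lines of Python. This is a reference sketch using big-integer arithmetic rather than a word-level algorithm; the modulus `p` and the choice R = 2^256 (8 words of 32 bits) are assumptions made for the example, and `pow(R, -1, p)` requires Python 3.8 or later.

```python
# Assumptions for this example: an odd modulus p and R = 2^(s*w) with s = 8, w = 32.
p = 2**255 - 19
R = 2**256
R_inv = pow(R, -1, p)  # modular inverse of R, exists because p is odd


def to_montgomery(a: int) -> int:
    """M(a) = a * R mod p."""
    return (a * R) % p


def from_montgomery(x: int) -> int:
    """Back to the ordinary domain: x * R^-1 mod p."""
    return (x * R_inv) % p


def montgomery_product(x: int, y: int) -> int:
    """Reference definition of the Montgomery product: x * y * R^-1 mod p."""
    return (x * y * R_inv) % p


a, b = 1234567, 7654321
product_in_domain = montgomery_product(to_montgomery(a), to_montgomery(b))
```

The Montgomery product of M(a) and M(b) is M(a·b), so a whole chain of multiplications can stay in the Montgomery domain and only the final result needs to be converted back.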
The Coarsely Integrated Operand Scanning (CIOS) method [1]
The CIOS method improves the Montgomery algorithm by
integrating multiplication and reduction.
How?
Instead of computing the full product a × b and then performing the reduction, it
alternates between iterations of multiplication
and iterations of reduction.
[1] Çetin Kaya Koç, Tolga Acar and Burton S. Kaliski Jr., "Analyzing and Comparing Montgomery Multiplication Algorithms", IEEE Micro, June 1996.
What is a systolic architecture?
It is a network composed of a large number of cells. Each
cell receives data from its neighboring cells, performs a
simple calculation, and then transmits the results, again only to
neighboring cells.
A systolic architecture is built from very simple elementary
cells. Therefore, this architecture reduces resource
requirements in hardware implementations.
Our contribution in this work is to combine a systolic
architecture, which is regarded as the best solution for
FPGA implementations, with the CIOS method of
Montgomery modular multiplication.
Coarsely Integrated Operand Scanning
[Algorithm listing shown as an image in the original slides.]
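Since the algorithm listing is not part of this transcript, here is a minimal Python sketch of CIOS Montgomery multiplication, following the word-level structure described by Koç, Acar and Kaliski. It is not the authors' hardware description, and its line numbering does not match the slide's listing. The word size w = 32 and the helper names are assumptions; the slides only fix s = 8 words.

```python
def to_words(x: int, s: int, w: int):
    """Split x into s words of w bits, least significant word first."""
    mask = (1 << w) - 1
    return [(x >> (w * k)) & mask for k in range(s)]


def cios_montmul(a: int, b: int, p: int, s: int = 8, w: int = 32) -> int:
    """Return a * b * R^-1 mod p with R = 2^(s*w), for odd p and 0 <= a, b < p."""
    mask = (1 << w) - 1
    p0_inv = (-pow(p, -1, 1 << w)) % (1 << w)   # -p^-1 mod 2^w (Python 3.8+)
    A, B, P = to_words(a, s, w), to_words(b, s, w), to_words(p, s, w)
    t = [0] * (s + 2)
    for i in range(s):
        # Multiplication step: t += a_i * b
        carry = 0
        for j in range(s):
            cs = t[j] + A[i] * B[j] + carry
            t[j], carry = cs & mask, cs >> w
        cs = t[s] + carry
        t[s], t[s + 1] = cs & mask, cs >> w
        # Reduction step: add m * p so the low word becomes zero, then shift one word
        m = (t[0] * p0_inv) & mask
        carry = (t[0] + m * P[0]) >> w
        for j in range(1, s):
            cs = t[j] + m * P[j] + carry
            t[j - 1], carry = cs & mask, cs >> w
        cs = t[s] + carry
        t[s - 1], carry = cs & mask, cs >> w
        t[s] = t[s + 1] + carry
    result = sum(word << (w * k) for k, word in enumerate(t[: s + 1]))
    return result - p if result >= p else result   # final conditional subtraction


p = 2**255 - 19          # example odd modulus below 2^256 (assumption)
a, b = 3**150 % p, 5**100 % p
expected = (a * b * pow(2**256, -1, p)) % p
```

Each outer iteration performs one multiplication pass and one reduction pass on the same accumulator `t`, which is the alternation between multiplication and reduction that distinguishes CIOS from multiplying first and reducing afterwards.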
Cutting the CIOS algorithm
(line numbers refer to the CIOS listing on the slide)
alpha: lines 5 and 6
alpha_2: lines 7, 8 and 9
beta: lines 11 and 12
gamma: lines 14 and 15
gamma_2: lines 16, 17 and 18
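The cutting into cells can be sketched in software by giving each group of lines its own small function. The mapping below is an interpretation, since the slide's listing is not in the transcript: `alpha` and `gamma` are taken to be the inner-loop bodies of the multiplication and reduction steps, `alpha_f` and `gamma_f` their closing steps (the cells with index 2 on the slide), and `beta` the computation of the quotient word m. The word size of 32 bits is also an assumption.

```python
W_BITS = 32                      # word size w (assumption; the slides only fix s = 8)
MASK = (1 << W_BITS) - 1


def split(cs):
    """Split an intermediate value into (carry C, sum word S)."""
    return cs >> W_BITS, cs & MASK


def alpha(t_j, a_i, b_j, c):
    """Multiplication cell: (C, S) = t[j] + a_i * b_j + C."""
    return split(t_j + a_i * b_j + c)


def alpha_f(t_s, c):
    """Closing multiplication cell: returns the new (t[s], t[s+1])."""
    c_out, s_word = split(t_s + c)
    return s_word, c_out


def beta(t_0, p_0, p0_inv):
    """Quotient cell: m = t[0] * p' mod 2^w, plus the first carry of the reduction."""
    m = (t_0 * p0_inv) & MASK
    return m, (t_0 + m * p_0) >> W_BITS


def gamma(t_j, m, p_j, c):
    """Reduction cell: (C, S) = t[j] + m * p_j + C."""
    return split(t_j + m * p_j + c)


def gamma_f(t_s, t_s1, c):
    """Closing reduction cell: returns the new (t[s-1], t[s])."""
    c_out, s_word = split(t_s + c)
    return s_word, t_s1 + c_out


def montmul_cells(a, b, p, s=8):
    """a * b * R^-1 mod p with R = 2^(s*W_BITS), built only from the cells above."""
    def words(x):
        return [(x >> (W_BITS * k)) & MASK for k in range(s)]

    A, B, P = words(a), words(b), words(p)
    p0_inv = (-pow(p, -1, 1 << W_BITS)) % (1 << W_BITS)
    t = [0] * (s + 2)
    for i in range(s):
        c = 0
        for j in range(s):
            c, t[j] = alpha(t[j], A[i], B[j], c)
        t[s], t[s + 1] = alpha_f(t[s], c)
        m, c = beta(t[0], P[0], p0_inv)
        for j in range(1, s):
            c, t[j - 1] = gamma(t[j], m, P[j], c)
        t[s - 1], t[s] = gamma_f(t[s], t[s + 1], c)
    r = sum(word << (W_BITS * k) for k, word in enumerate(t[: s + 1]))
    return r - p if r >= p else r


p = 2**255 - 19                  # example odd modulus (assumption)
a, b = 7**80 % p, 11**70 % p
reference = (a * b * pow(2**256, -1, p)) % p
```

Each cell only consumes a word and a carry from its neighbor and produces a word and a carry, which is what makes the decomposition suitable for a systolic array.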
CIOS in Systolic for s=8
[Figure: systolic array for s = 8. Each iteration i = 0, ..., 7 of the outer loop is a row made of a multiplication step (products ai·bj for j = 0, ..., 7) followed by a reduction step. The last row outputs a × b × R^-1 mod p.]
In this architecture, the successive iterations of the loop on i
are also interleaved.
In our case, 3 iterations of i can be
executed at the same time.
CIOS in Systolic for s=8
[Figure: rows i = 0, ..., 7, each alternating a multiplication step and a reduction step. Multiplication cells take ai and bj, reduction cells take m and pj, and neighboring cells exchange a sum S and a carry C. The final output is a × b × R^-1 mod p.]
a0
a1
.
.
.
.
.
.
.
a7
b0 b1 b2 b3 b4 b5 b6 b7
1 1 1
1
2 2 2
2 2 2
3 3
3 3
_
2
3_
2
1 1 1
1
2 2 2
2 2 2
3 3
3 3
_
2
3_
2
B
i=0
i=1
A
p0 p1 p2 p3 p4 p5 p6 p7P
Data Flow
1 1 1
1
2 2 2
2 2 2
i=2
21
. . . . . . . . .. . . . . . . .
. . . . . . . .
![Page 48: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/48.jpg)
b0 b1 b2 b3 b4 b5 b6 b7
1 1 1
1
2 2 2
2 2 2
3 3
3 3
_
2
3_
2
1 1 1
1
2 2 2
2 2 2
3 3
3 3
_
2
3_
2
b0 b1 b2 b3 b4 b5 b6 b7
B
B1 B2 B3
i=0
i=1
p0 p1 p2 p3 p4 p5 p6 p7P
a0
a1
.
.
.
.
.
.
.
a7
A
1 1 1
1
2 2 2
2 2 2
Data Flow
i=2
21
. . . . . . . . .. . . . . . . .
. . . . . . . .
![Page 49: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/49.jpg)
b0 b1 b2 b3 b4 b5 b6 b7
1 1 1
1
2 2 2
2 2 2
3 3
3 3
_
2
3_
2
1 1 1
1
2 2 2
2 2 2
3 3
3 3
_
2
3_
2
b0 b1 b2 b3 b4 b5 b6 b7
B
B1 B2 B3
i=0
i=1
p0 p1 p2 p3 p4 p5 p6 p7P
p0 p1 p2 p3 p4 p5 p6 p7
P2 P3
a0
a1
.
.
.
.
.
.
.
a7
A
1 1 1
1
2 2 2
2 2 2
Data Flow
i=2
21
. . . . . . . . .. . . . . . . .
. . . . . . . .
![Page 50: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/50.jpg)
b0 b1 b2 b3 b4 b5 b6 b7
1 1 1
1
2 2 2
2 2 2
3 3
3 3
_f
3 _f
1 1 1
1
2 2 2
2 2 2
3 3
3 3
_f
3 _f
b0 b1 b2 b3 b4 b5 b6 b7
B
B1 B2 B3
i=0
i=1
p0 p1 p2 p3 p4 p5 p6 p7P
p0 p1 p2 p3 p4 p5 p6 p7
a0
a1
.
.
.
.
.
.
.
a7
A
1 1 1
1
2 2 2
2 2 2
Data Flow
i=2
21
. . . . . . . . .. . . . . . . .
. . . . . . . .
S
C
P2 P3
![Page 51: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/51.jpg)
b0 b1 b2 b3 b4 b5 b6 b7
1 1 1
1
2 2 2
2 2 2
3 3
3 3
_f
3 _f
1 1 1
1
2 2 2
2 2 2
3 3
3 3
_f
3 _f
b1 b2 b0 b3 b4 b5 b6 b7
B
B1 B2 B3
i=0
i=1
p0 p1 p2 p3 p4 p5 p6 p7P
p0 p1 p2 p3 p4 p5 p6 p7
P2 P3
a0
a1
.
.
.
.
.
.
.
a7
A
1 1 1
1
2 2 2
2 2 2
Data Flow
i=2
21
. . . . . . . . .. . . . . . . .
. . . . . . . .
S
C
SC
C
![Page 52: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/52.jpg)
b0 b1 b2 b3 b4 b5 b6 b7
1 1 1
1
2 2 2
2 2 2
3 3
3 3
_f
3 _f
1 1 1
1
2 2 2
2 2 2
3 3
3 3
_f
3 _f
b2 b0 b1 b3 b4 b5 b6 b7
B
B1 B2 B3
i=0
i=1
p0 p1 p2 p3 p4 p5 p6 p7P
p0 p1 p2 p3 p4 p5 p6 p7
P2 P3
a0
a1
.
.
.
.
.
.
.
a7
A
1 1 1
1
2 2 2
2 2 2
Data Flow
i=2
21
. . . . . . . . .. . . . . . . .
. . . . . . . .
S
C
S
C
S
C
C
S
C
![Page 53: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/53.jpg)
b0 b1 b2 b3 b4 b5 b6 b7
1 1 1
1
2 2 2
2 2 2
3 3
3 3
_f
3 _f
1 1 1
1
2 2 2
2 2 2
3 3
3 3
_f
3 _f
b0 b1 b2 b3 b4 b5 b6 b7
B
B1 B2 B3
i=0
i=1
p0 p1 p2 p3 p4 p5 p6 p7P
p0 p1 p2 p3 p4 p5 p6 p7
P2 P3
a0
a1
.
.
.
.
.
.
.
a7
A
1 1 1
1
2 2 2
2 2 2
Data Flow
i=2
21
. . . . . . . . .. . . . . . . .
. . . . . . . .
S
C
S
C
S
C
S
C
S
CC
S
C
![Page 54: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/54.jpg)
b0 b1 b2 b3 b4 b5 b6 b7
1 1 1
1
2 2 2
2 2 2
3 3
3 3
_f
3 _f
1 1 1
1
2 2 2
2 2 2
3 3
3 3
_f
3 _f
b1 b2 b0 b4 b5 b3 b6 b7
B
B1 B2 B3
i=0
i=1
p0 p1 p2 p3 p4 p5 p6 p7P
p0 p1 p3 p4 p2 p5 p6 p7
P2 P3
a0
a1
.
.
.
.
.
.
.
a7
A
1 1 1
1
2 2 2
2 2 2
Data Flow
i=2
21
. . . . . . . . .. . . . . . . .
. . . . . . . .
S
C
S
C
S
C
S
C
S
C
S
CC C
S
C
S
![Page 55: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/55.jpg)
b0 b1 b2 b3 b4 b5 b6 b7
1 1 1
1
2 2 2
2 2 2
3 3
3 3
_f
3 _f
1 1 1
1
2 2 2
2 2 2
3 3
3 3
_f
3 _f
b2 b0 b1 b5 b3 b4 b6 b7
B
B1 B2 B3
i=0
i=1
p0 p1 p2 p3 p4 p5 p6 p7P
p0 p1 p4 p2 p3 p5 p6 p7
P2 P3
a0
a1
.
.
.
.
.
.
.
a7
A
1 1 1
1
2 2 2
2 2 2
Data Flow
i=2
21
. . . . . . . . .. . . . . . . .
. . . . . . . .
S
C
S
C
S
C
S
C
S
C
S
C
S
CC C
S
C
S S
C
![Page 56: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/56.jpg)
b0 b1 b2 b3 b4 b5 b6 b7
1 1 1
1
2 2 2
2 2 2
3 3
3 3
_f
3 _f
1 1 1
1
2 2 2
2 2 2
3 3
3 3
_f
3 _f
b0 b1 b2 b3 b4 b5 b6 b7
B
B1 B2 B3
i=0
i=1
p0 p1 p2 p3 p4 p5 p6 p7P
p0 p1 p2 p3 p4 p5 p6 p7
P2 P3
a0
a1
a2
.
.
.
.
.
.
a7
A
1 1 1
1
2 2 2
2 2 2
i=2
Data Flow
21
. . . . . . . . .. . . . . . . .
. . . . . . . .
S
C
S
C
S
C
S
C
S
C
S
C
S
C
S
CC C
S
C
S S
C
S
C
![Page 57: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/57.jpg)
b0 b1 b2 b3 b4 b5 b6 b7
1 1 1
1
2 2 2
2 2 2
3 3
3 3
_f
3 _f
1 1 1
1
2 2 2
2 2 2
3 3
3 3
_f
3 _f
b1 b2 b0 b4 b5 b3 b7 b6
B
B1 B2 B3
i=0
i=1
p0 p1 p2 p3 p4 p5 p6 p7P
p0 p1 p3 p4 p2 p6 p7 p5
P2 P3
a0
a1
a2
.
.
.
.
.
.
a7
A
1 1 1
1
2 2 2
2 2 2
i=2
Data Flow
21
. . . . . . . . .. . . . . . . .
. . . . . . . .
S
C
S
C
S
C
S
C
S
C
S
C
S
C
S
C
S
CC C
S
C
S S
C
S
C
S
C
![Page 58: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/58.jpg)
Data Flow
[Figure: data flow for s=8, last build step, with the same cells (1, 2, 3 and _f) and S/C signals for i=0, i=1, i=2. Word order in this step: B = b2 b0 b1 b5 b3 b4 b7 b6; P = p0 p1 p4 p2 p3 p7 p5 p6.]
![Page 59: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/59.jpg)
CIOS in Systolic for s=8
[Figure: schedule of the loop iterations i=0 to i=7, each made of a Multiplication Step and a Reduction Step, producing a x b x R-1 mod p.]
During execution of this algorithm, three iterations of the loop 'i' are always in progress at the same time, so at most three alphas and three gammas execute in parallel.
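As a software cross-check of the Multiplication Step and Reduction Step named on this slide, here is a minimal Python sketch of word-level CIOS Montgomery multiplication. It is the textbook CIOS loop, not the hardware description behind these slides. The function name `montgomery_cios`, the variable names, and the mapping of the three commented operations onto the alpha, beta and gamma cells are assumptions made for illustration. It needs Python 3.8 or later for `pow(p, -1, m)`.

```python
def montgomery_cios(a, b, p, w=32, s=8):
    """Return a*b*R^-1 mod p with R = 2**(w*s), for odd p and 0 <= a, b < p."""
    mask = (1 << w) - 1
    # p' = -p^-1 mod 2^w, the per-word Montgomery constant
    p_prime = (-pow(p, -1, 1 << w)) & mask
    # Split the operands into s words of w bits, least significant first
    A = [(a >> (w * i)) & mask for i in range(s)]
    B = [(b >> (w * j)) & mask for j in range(s)]
    P = [(p >> (w * j)) & mask for j in range(s)]
    t = [0] * (s + 2)
    for i in range(s):
        # Multiplication step (assumed to be the alpha cells): t += a[i] * B
        c = 0
        for j in range(s):
            cs = t[j] + A[i] * B[j] + c
            t[j], c = cs & mask, cs >> w
        cs = t[s] + c
        t[s], t[s + 1] = cs & mask, cs >> w
        # Quotient word (assumed to be the beta cell): m = t[0] * p' mod 2^w
        m = (t[0] * p_prime) & mask
        # Reduction step (assumed to be the gamma cells): t = (t + m * P) / 2^w
        c = (t[0] + m * P[0]) >> w
        for j in range(1, s):
            cs = t[j] + m * P[j] + c
            t[j - 1], c = cs & mask, cs >> w
        cs = t[s] + c
        t[s - 1], c = cs & mask, cs >> w
        t[s] = t[s + 1] + c
    result = sum(word << (w * k) for k, word in enumerate(t[:s + 1]))
    # The loop leaves a value below 2p, so one conditional subtraction suffices
    return result - p if result >= p else result


# Illustrative check with K=256, w=32, s=8; the modulus is only an example
p = (1 << 255) - 19
a, b = pow(3, 200, p), pow(5, 150, p)
R = 1 << 256
assert montgomery_cios(a, b, p) == (a * b * pow(R, -1, p)) % p
```

The sketch runs the s iterations one after another; the systolic design on the slide overlaps them, which changes the schedule but not the value computed.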
![Page 60: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/60.jpg)
CIOS in Systolic for s=8
[Figure: the same schedule for i=0 to i=7 (Multiplication Step and Reduction Step, result a x b x R-1 mod p), with each row labelled by the repeating FSM states S0, S1, S2.]
Because the same blocks repeat, we modeled our FSM with 3 states, which lets us perform the whole multiplication in just 33 cycles:
(8+3)*3=33
![Page 61: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/61.jpg)
CIOS in Systolic for s=16
[Figure: iteration i=0 for s=16. Cells labelled 1, 2, ..., 6 process the products a0 b0, a0 b1, a0 b2, ..., a0 b15 for j=0 to j=15.]
![Page 62: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/62.jpg)
CIOS in Systolic for s=16
[Figure: the s=16 schedule extended over the iterations i=0, i=2, i=3, ..., i=15. Each row uses cells labelled 1, 2, ..., 6 on the products a0 b0 ... a0 b15 (j=0 to j=15), and the last row produces a x b x R-1 mod p.]
![Page 63: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/63.jpg)
CIOS in Systolic for s=16
B = b0 b1 ... b15 is split into six blocks that feed cells 1 to 6:
B1: b0 b1 b2
B2: b3 b4 b5
B3: b6 b7 b8
B4: b9 b10 b11
B5: b12 b13 b14
B6: b15
![Page 64: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/64.jpg)
CIOS in Systolic for s=16
B = b0 b1 ... b15 is split into six blocks that feed cells 1 to 6:
B1: b0 b1 b2
B2: b3 b4 b5
B3: b6 b7 b8
B4: b9 b10 b11
B5: b12 b13 b14
B6: b15
P = p0 p1 ... p15 is split the same way; p0 and p1 are drawn separately (the label P1 appears with them):
P2: p2 p3 p4
P3: p5 p6 p7
P4: p8 p9 p10
P5: p11 p12 p13
P6: p14 p15
![Page 65: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/65.jpg)
CIOS in Systolic for s=8
[Figure: state diagram looping over the cells alpha (1), alpha (2), alpha (3), gamma (1), gamma (2), gamma (3), beta, alpha_2 and gamma_2, with i++ on each pass.]
K=256, w=32, s=8
K=512, w=64, s=8
33 clock cycles
![Page 66: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/66.jpg)
[Figure: state diagram for s=16, looping over the cells alpha (1) to alpha (6), gamma (1) to gamma (6), beta, alpha_f and gamma_f, with i++ on each pass.]
K=256, w=16, s=16
K=512, w=32, s=16
66 clock cycles
CIOS in Systolic for s=8
[Figure: the s=8 state diagram shown alongside, with the cells alpha (1) to alpha (3), gamma (1) to gamma (3), beta, alpha_f and gamma_f.]
K=256, w=32, s=8
K=512, w=64, s=8
33 clock cycles
![Page 67: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/67.jpg)
Comparison
s     Cells     Clock cycles
8     6 + 3     33
16    12 + 3    66
32    24 + 3    132
64    48 + 3    264
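The four rows of the comparison follow a simple pattern: each extra group of 8 words adds 6 cells and 33 cycles, while the "+3" stays constant. The Python sketch below only restates that pattern. It is fitted to the four rows above, not taken from a formula in the slides, and the function name `systolic_cost` and the restriction to multiples of 8 are assumptions.

```python
def systolic_cost(s):
    """Return (main_cells, extra_cells, cycles) for s words per operand.

    Fitted to the rows s = 8, 16, 32, 64 of the comparison; other values
    of s are an extrapolation.
    """
    if s <= 0 or s % 8 != 0:
        raise ValueError("pattern only checked for positive multiples of 8")
    groups = s // 8
    main_cells = 6 * groups   # the first number in the 'Cells' column
    extra_cells = 3           # the constant '+ 3' in the 'Cells' column
    cycles = 33 * groups      # the 'Clock cycles' column
    return main_cells, extra_cells, cycles


for s in (8, 16, 32, 64):
    main_cells, extra_cells, cycles = systolic_cost(s)
    print(f"s={s}: {main_cells} + {extra_cells} cells, {cycles} clock cycles")
```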
![Page 68: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/68.jpg)
The interest of each architecture
Word size w (in bits) for each key size K and number of words s:
                   s=8    s=16   s=32
K=256              32     16     8
K=512              64     32     16
K=1024             128    64     32
Number of cycles   33     66     132
Which architecture is preferable depends on our needs:
Security level
Resources
Speed
The method used
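The word sizes in this table are consistent with w = K / s, which also matches the (K, w, s) triples given on the earlier state-diagram slides. A short Python sketch that rebuilds the grid under that reading; the helper name `word_width` is hypothetical.

```python
def word_width(k_bits, s_words):
    """Word size w in bits when a K-bit operand is cut into s equal words."""
    if k_bits % s_words != 0:
        raise ValueError("K must be a multiple of s")
    return k_bits // s_words


# Rebuild the grid: one row per key size, one column per number of words
table = {k: [word_width(k, s) for s in (8, 16, 32)] for k in (256, 512, 1024)}
for k, widths in table.items():
    print(f"K={k}: {widths}")
```

Reading the grid this way, a larger s trades narrower multipliers against a longer schedule.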
![Page 69: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/69.jpg)
Architectures: Digital signal processing (DSP)
Modern FPGAs are equipped with hardware extensions for
arithmetic calculation.
![Page 70: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/70.jpg)
These DSP blocks perform basic arithmetic calculations: multiplication, addition and
subtraction of unsigned integers.
![Page 71: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/71.jpg)
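Internal architectures - cells
The arithmetic operations of each cell are designed to make maximum use of the DSPs.
[Figure: alpha cell. Inputs a[i], b[j], a sum input (S In) and a carry input (C In); one multiplier followed by two adders. The low w bits of the result are registered as the sum output (S Out) and the high w bits as the carry output (C Out).]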
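One way to read the alpha cell is as a multiply-accumulate whose 2w-bit result is split into a low word (sum out) and a high word (carry out). The Python sketch below models that reading and checks that the result can never exceed 2w bits. The function name `mac_cell` and the exact operation `x*y + s_in + c_in` are assumptions inferred from the multiplier and the two adders in the figure.

```python
def mac_cell(x, y, s_in, c_in, w):
    """Model of a multiply-accumulate cell on w-bit words.

    Returns (s_out, c_out): the low and the high w bits of x*y + s_in + c_in.
    """
    mask = (1 << w) - 1
    total = x * y + s_in + c_in
    return total & mask, total >> w


# Worst case: every input is the all-ones word 2^w - 1.
# (2^w - 1)^2 + 2*(2^w - 1) = 2^(2w) - 1, so the result still fits in 2w bits.
w = 16
top = (1 << w) - 1
s_out, c_out = mac_cell(top, top, top, top, w)
assert (s_out, c_out) == (top, top)
```

Because the worst case fills exactly 2w bits, two w-bit registers are enough for the cell outputs and no extra carry bit is needed.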
![Page 72: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/72.jpg)
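Internal architectures - cells
[Figure: beta cell. Inputs p’, P[0] and a sum input (S In); two multipliers and one adder. The word m is held in a register (REG m), and the cell also produces a carry output (C Out). The alpha cell from the previous slide is shown next to it.]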
![Page 73: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/73.jpg)
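Internal architectures - cells
[Figure: gamma cell. Inputs m, p[j], a sum input (S In) and a carry input (C In); one multiplier followed by two adders. The low w bits of the result are registered as the sum output (S Out) and the high w bits as the carry output (C Out).]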
![Page 74: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/74.jpg)
Internal architectures - cells
[Figure: gamma_2 cell. Inputs S1 In, S2 In and a carry C; two adders; registered outputs S1 Out (LSB, w bits) and S2 Out (MSB, w bits). The gamma cell from the previous slide is shown next to it.]
![Page 75: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/75.jpg)
Internal architectures - cells
[Figure: alpha_2 cell. Inputs S In and a carry C; one adder; registered outputs S1 Out (LSB, w bits) and S2 Out (MSB, w bits). The gamma_2 and gamma cells from the previous slides are shown next to it.]
![Page 76: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/76.jpg)
Internal architectures - Rotation
[Figure: ROTATION block with a Mux, holding A (K bits).]
![Page 77: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/77.jpg)
Internal architectures - Rotation
[Figure: ROTATION blocks, each with a Mux: one for A (K bits), two for B (3 w bits each) and one for B (2 w bits).]
![Page 78: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/78.jpg)
Internal architectures - Rotation
[Figure: ROTATION blocks, each with a Mux: one for A (K bits), two for B (3 w bits each), two for P (3 w bits each) and one for B (2 w bits).]
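A rotation register of this kind can be modelled in software as an integer of n words that is rotated by one w-bit word at a time, so that a fixed port always sees the next word. The Python sketch below illustrates the idea only. The direction of rotation, the one-word step and the names `rotate_words` and `low_word` are assumptions; the slide shows nothing beyond the ROTATION blocks, their Mux and the register widths.

```python
def low_word(value, w):
    """Word currently visible at the low end of the register."""
    return value & ((1 << w) - 1)


def rotate_words(value, n_words, w):
    """Rotate an (n_words * w)-bit register by one w-bit word toward the low end."""
    total_bits = n_words * w
    mask = (1 << total_bits) - 1
    wrapped = low_word(value, w) << (total_bits - w)  # low word re-enters at the top
    return ((value >> w) | wrapped) & mask


# A 3-word block (for example B1 = b0 b1 b2) with w = 8 and words 1, 2, 3
w, n = 8, 3
reg = 1 | (2 << w) | (3 << (2 * w))
seen = []
for _ in range(n):
    seen.append(low_word(reg, w))
    reg = rotate_words(reg, n, w)
assert seen == [1, 2, 3]
assert reg == 1 | (2 << w) | (3 << (2 * w))  # back to the start after n rotations
```

With n = s and w-bit words, the same helper would present a[0], a[1], ... in turn from a K-bit register for A.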
![Page 79: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/79.jpg)
Architectures
[Figure: A - alpha1. Processing element (PE) alpha (1) with two multiplexers controlled by sig_state. Signals shown: C_1_In, C_1_Out, S_1_In, S_1_Out, S_2_Out and a constant zero.]
![Page 80: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/80.jpg)
Architectures
[Figure: A - alpha1 and B - alpha2. PE alpha (1) as on the previous slide, followed by PE alpha (2) with two multiplexers controlled by sig_state. Signals shown for alpha (2): C_2_In, C_2_Out, C_1_Out, S_2_In, S_2_Out and S_3_Out.]
![Page 81: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/81.jpg)
Architectures
[Figure: A - alpha1, B - alpha2 and C - alpha3. PE alpha (1) and PE alpha (2) as on the previous slides, followed by PE alpha (3) with two multiplexers controlled by sig_state. Signals shown for alpha (3): C_3_In, C_3_Out, C_2_Out, S_3_In, S_3_Out and the S1 output of the alpha_2 cell.]
![Page 82: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/82.jpg)
Architectures
[Figure: D - gamma1. PE gamma (1) with inputs m and p[0], carry ports (C 1 In, C 1 Out) and sum ports (S 1 In, S 1 Out).]
![Page 83: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/83.jpg)
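Architectures
[Figure: D - gamma1 and E - gamma2. PE gamma (1) as on the previous slide, followed by PE gamma (2) with inputs m and p[j] and two multiplexers controlled by sig_state. Signals shown for gamma (2): carry ports (C 2 In, C 2 Out), sum ports (S 2 In, S 2 Out), and the C 1 Out and S 1 Out outputs of gamma (1).]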
![Page 84: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/84.jpg)
Architectures
[Figure: D - gamma1, E - gamma2 and F - gamma3. PE gamma (1) and PE gamma (2) as on the previous slides, followed by PE gamma (3) with inputs m and p[j] and two multiplexers controlled by sig_state. Signals shown for gamma (3): carry ports (C 3 In, C 3 Out), sum ports (S 3 In, S 3 Out), and the C 2 Out and S 2 Out outputs of gamma (2).]
![Page 85: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/85.jpg)
Architectures
[Figure: G - alpha_2, H - gamma_2 and I - beta. PE alpha_2 (input S In and a carry C; outputs S1 Out and S2 Out), PE gamma_2 (inputs S1 In, S2 In and a carry C; outputs S1 Out and S2 Out) and PE beta (inputs p’, P[0] and S In; outputs m and C Out).]
![Page 86: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/86.jpg)
Plan
1. Introduction
2. Montgomery Multiplication (CIOS)
3. Architecture
4. Results
5. Conclusion and Perspectives
![Page 87: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/87.jpg)
Results
Nexys 4            DSP   Frequency (MHz)   Cycles
MMM (s=8/K=256)    31    105.275           33
Alpha              4     291.023           1
Gamma              4     291.023           1
Beta               4     388.350           1
Alpha_2            1     459.918           1
Gamma_2            2     442.811           1
![Page 88: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/88.jpg)
Results
Nexys 4            DSP   LUTs   Reg    Occupied slices   Frequency (MHz)   Cycles
MMM s=8/K=256      31    809    870    352               105.275           33
MMM s=16/K=256     33    846    1123   402               145.892           66
MMM s=8/K=512      87    2650   1614   878               64.825            33
MMM s=16/K=512     57    1789   2164   798               105.594           66
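The table gives a clock frequency and a cycle count for each design, so a time per multiplication can be derived as cycles divided by frequency. The Python sketch below does that arithmetic. These latencies are derived here, not figures reported in the slides, and they assume the frequencies are in MHz (as in the first results table) and that one multiplication takes exactly the listed number of cycles. The names `results` and `latency_us` are hypothetical.

```python
# (s, K) -> (frequency in MHz, clock cycles), copied from the results table
results = {
    (8, 256): (105.275, 33),
    (16, 256): (145.892, 66),
    (8, 512): (64.825, 33),
    (16, 512): (105.594, 66),
}


def latency_us(freq_mhz, cycles):
    """Time for one multiplication in microseconds: cycles / (cycles per microsecond)."""
    return cycles / freq_mhz


for (s, k), (freq, cycles) in results.items():
    print(f"s={s}, K={k}: {latency_us(freq, cycles):.3f} us per multiplication")
```

Under these assumptions the s=8 designs have the lower latency for both key sizes, even though the s=16 designs reach a higher clock frequency.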
![Page 89: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/89.jpg)
Plan
1. Introduction
2. Montgomery Multiplication (CIOS)
3. Architecture
4. Results
5. Conclusion and Perspectives
![Page 90: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/90.jpg)
Conclusion
We have implemented Montgomery multiplication with a systolic architecture that runs in a fixed number of clock cycles.
We designed it to make maximum use of the DSPs on the FPGA board.
We implemented two architectures (s=8 and s=16).
We used these two designs to implement scalar multiplication at the 128-bit security level.
![Page 91: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/91.jpg)
Perspectives
Perform a mixed software/hardware implementation (co-design) of the
Optimal-Ate pairing on BN curves in Jacobian coordinates
using this multiplication algorithm.
Finalize the hardware implementation of the designs for:
s=32
s=64
![Page 92: Montgomery Algorithm for Modular Multiplication with ...math-sa-sara0050/space16/... · A systolic architecture provides very simplified elementary cells. Therefore, this architecture](https://reader034.vdocuments.net/reader034/viewer/2022052612/5f0afa7d7e708231d42e44aa/html5/thumbnails/92.jpg)