gauss sieve on gpus - rsa conference agenda motivation lattice-based cryptography and cryptanalysis...

Post on 26-Apr-2018

#RSAC

Gauss Sieve on GPUs

Shang-Yi Yang1, Po-Chun Kuo1, Bo-Yin Yang2, and Chen-Mou Cheng1

1 Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan

{ilway25,kbj,doug}@crypto.tw

2 Institute of Information Science, Academia Sinica, Taipei, Taiwan, by@crypto.tw

Agenda

Motivation

Lattice-based Cryptography and Cryptanalysis

Sieve Algorithm

Lifting Technique

Parallel Method

Results

Motivation

NIST announces the Post-Quantum Standard competition

Lattice-based cryptography provides many efficient cryptosystems

But what about their security?

Security models are all based on lattice enumeration

What security estimate does the sieve algorithm give?

Lattice based cryptography

Resistance to quantum attack

Provable security

Average-case to worst-case reduction [Ajtai’96, MicReg’05]

Most crypto-primitives can be constructed from lattice problems:

FHE [Gen’09, BV’11a]
PKC secure under CCA [PW’08, Pei’09]
OT [PVW’08]
IBE [GPV’08, CHKP’10, ABBF’10]
FE [GKP+’13]
One-way functions [Ajt’96]
Digital signature schemes [GPV’08, CHKP’10]
Zero-knowledge proofs [MV’03, KTX’08]

Lattice Problems

Central Problem: Shortest Vector Problem (SVP)

Related problems: CVP, BDDP, GapSVP, the short basis problem, the covering radius problem, etc.

Relations between these problems: cf. [LyuMic’09, AhaReg’05]

SVP: given a basis, find the shortest nonzero lattice vector
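To make the SVP statement concrete, here is a minimal Python sketch that searches small integer combinations of a basis by brute force. The function name, the coefficient bound, and the toy basis are assumptions made for illustration. A bounded search like this is only feasible in tiny dimensions and is not guaranteed to find the true shortest vector in general.

```python
from itertools import product

def shortest_vector_bruteforce(basis, bound=3):
    """Try every nonzero coefficient vector in [-bound, bound]^k (hypothetical helper)."""
    dim = len(basis[0])
    best, best_norm2 = None, None
    for coeffs in product(range(-bound, bound + 1), repeat=len(basis)):
        if not any(coeffs):
            continue  # SVP asks for a nonzero vector
        v = [sum(c * b[i] for c, b in zip(coeffs, basis)) for i in range(dim)]
        norm2 = sum(x * x for x in v)
        if best_norm2 is None or norm2 < best_norm2:
            best, best_norm2 = v, norm2
    return best, best_norm2

# Toy basis with determinant 1, so the lattice is all of Z^2 although
# both basis vectors are long.
basis = [[5, 8], [8, 13]]
vec, norm2 = shortest_vector_bruteforce(basis, bound=20)
print(vec, norm2)
```

The basis vectors have squared norms 89 and 233, yet the lattice contains vectors of squared norm 1; finding them from a "bad" basis is exactly what gets hard as the dimension grows.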

How to estimate the hardness of lattice problem

Lattice basis reduction
  Shorter vectors
  More orthogonal

Approximation algorithms
  LLL
  BKZ

Exact algorithms
  Enumeration
  Sieve
  Voronoi cell computation

Algorithm            Time            Space     Type
Enumeration          2^O(n log n)    Poly(n)   Deterministic
Voronoi cell-based   2^O(cn)         2^O(cn)   Deterministic
Sieve                2^O(cn)         2^O(cn)   Randomized

Sieve algorithm

Proposed by Ajtai, Kumar, and Sivakumar in 2001

Main idea
  Sample 2^(cn) points
  Cover the samples with spheres of sufficient radius centered at samples
  Obtain shorter vectors by subtracting the centers

Pros
  Time complexity is bounded by 2^O(cn)

Cons
  Space complexity is 2^O(cn)

Algorithm                  Time               Space
AKS’2001                   2^O(n)             2^O(n)
Regev’04                   2^(16n+o(n))       2^(8n+o(n))
NV’08                      2^(5.9n+o(n))      2^(2.95n+o(n))
ListSieve’10               2^(3.2n+o(n))      2^(1.6n+o(n))
ListSieve’10 (birthday)    2^(2.465n+o(n))    2^(1.233n+o(n))

Heuristic Version of Sieve

Algorithm        Time                Space
NV’08            2^(0.415n+o(n))     2^(0.2075n+o(n))
Three-Level      2^(0.3778n+o(n))    2^(0.2833n+o(n))
GaussSieve’10    2^(0.52n+o(n))      2^(0.41n+o(n))

Gauss Sieve

Algorithm described by Micciancio and Voulgaris in 2009

All vectors in the list are pairwise Gauss-reduced

||a ± b|| ≥ max(||a||, ||b||)

Algorithm: subtract from a the nearest-integer multiple of its projection onto b, swap a and b, and repeat until neither vector gets shorter

a := a – round(<a,b>/<b,b>)•b

Swap(a,b) and go to the previous step
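The pairwise reduction described on this slide can be sketched in Python as follows. This is a minimal illustration, assuming two linearly independent integer vectors; the helper names `dot`, `round_div`, and `gauss_reduce_pair` are made up for this sketch and do not come from the authors' code.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def round_div(n, d):
    """Nearest integer to n/d for d > 0, using integer arithmetic only."""
    return (2 * n + d) // (2 * d)

def gauss_reduce_pair(a, b):
    """Gauss (Lagrange) reduction of two linearly independent integer vectors."""
    a, b = list(a), list(b)
    if dot(a, a) < dot(b, b):
        a, b = b, a  # keep a as the longer vector
    while True:
        m = round_div(dot(a, b), dot(b, b))
        a = [x - m * y for x, y in zip(a, b)]  # a := a - round(<a,b>/<b,b>) * b
        if dot(a, a) >= dot(b, b):
            return b, a  # shorter vector first; the pair is now Gauss-reduced
        a, b = b, a  # a became shorter than b: swap and repeat

s, t = gauss_reduce_pair([5, 8], [8, 13])
print(s, t)
```

On exit, |<s,t>| ≤ <s,s>/2 and ||s|| ≤ ||t||, which implies ||s ± t|| ≥ max(||s||, ||t||). The loop terminates because the squared norm of the shorter vector is a positive integer that strictly decreases at every swap.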

Gauss Sieve

Repeat steps 1-3 below until a short vector is found.

1. Take a Vector from the Stack or sample one with the GaussianSampler

2. Reduce the Vector by the List and reduce the List vectors by the Vector; if a List vector is reduced, move it into the Stack

3. Move the Vector into the List

[Figure, repeated for each step: data flow between Sampler, Stack, Vector, and List]
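The three steps can be sketched as a small sequential Python loop. Several things here are assumptions made for illustration and are not taken from the talk: the sampler is a uniform stand-in rather than a discrete Gaussian sampler, the stack is seeded with the basis vectors, and the loop stops after a fixed number of collisions (samples that reduce to zero) or iterations.

```python
import random

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def reduce_by(p, v):
    """Return p minus the nearest-integer multiple of v if strictly shorter, else None."""
    m = (2 * dot(p, v) + dot(v, v)) // (2 * dot(v, v))
    if m == 0:
        return None
    q = tuple(x - m * y for x, y in zip(p, v))
    return q if dot(q, q) < dot(p, p) else None

def sample(basis, rng, spread=3):
    # Stand-in sampler (assumption): small random integer combination of the basis.
    coeffs = [rng.randint(-spread, spread) for _ in basis]
    return tuple(sum(c * b[i] for c, b in zip(coeffs, basis))
                 for i in range(len(basis[0])))

def gauss_sieve(basis, max_collisions=20, max_iters=10_000, seed=0):
    rng = random.Random(seed)
    lst, stack, collisions = [], [tuple(b) for b in basis], 0
    for _ in range(max_iters):
        if collisions >= max_collisions:
            break
        # Step 1: take a vector from the stack, or sample a fresh one.
        v = stack.pop() if stack else sample(basis, rng)
        # Step 2a: reduce v by the list until no list vector shortens it.
        changed = True
        while changed:
            changed = False
            for w in lst:
                q = reduce_by(v, w)
                if q is not None:
                    v, changed = q, True
        if not any(v):
            collisions += 1  # v reduced to zero
            continue
        # Step 2b: reduce list vectors by v; reduced ones move to the stack.
        kept = []
        for w in lst:
            q = reduce_by(w, v)
            if q is None:
                kept.append(w)
            elif any(q):
                stack.append(q)
        # Step 3: move v into the list.
        lst = kept + [v]
    return min(lst + stack, key=lambda u: dot(u, u))

best = gauss_sieve([[5, 8], [8, 13]])
print(best, dot(best, best))
```

The list stays pairwise reduced because every vector is reduced against the whole list before it is inserted, and any list vector it shortens is pushed back to the stack to be processed again.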

Gauss Sieve Implementation

[IKMT, PKC’14]: first massively parallel implementation

Parallelized on CPUs with MPI

Our work [Kuo, Yang, Cheng, Yang]: 50 times faster than the previous implementation on a single CPU core

Lifting computations in prime-cyclotomic ideals

Reduces the complexity of inner products of ideals from O(n^3) to O(n^2)

Parallelizes Gauss Sieve on a single GPU using the framework of Ishiguro et al.

Parallelizes Gauss Sieve on multiple GPUs using the framework of Bos et al.

lifting computations in prime-cyclotomic ideals

ℒ: polynomials mod x^n + x^(n-1) + … + 1 (prime-cyclotomic ideals)

ℒ̄: polynomials mod x^(n+1) − 1 = (x^n + x^(n-1) + … + 1)(x − 1) (cyclic ideals)

Prime-cyclotomic ideals: knowing F(x) mod x^n + x^(n-1) + … + 1, compute xF(x) mod x^n + x^(n-1) + … + 1. O(n), not very easy.

Cyclic ideals: knowing F(x) mod x^n − 1, compute xF(x) mod x^n − 1. O(1), very easy.

Problems: wasted space; how to compute the inner product in ℒ̄?
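The difference between the two rings can be seen in a few lines of Python. This is a sketch with hypothetical helper names; coefficient lists are stored lowest degree first. Multiplying by x in the cyclic ring is a rotation, while in the prime-cyclotomic ring the overflowing top coefficient has to be subtracted from every position, because x^n = −(x^(n-1) + … + 1) there.

```python
def mul_x_cyclic(f):
    """x*F(x) mod x^m - 1 with len(f) == m: rotate the coefficients."""
    return [f[-1]] + f[:-1]

def mul_x_prime_cyclotomic(f):
    """x*F(x) mod x^n + x^(n-1) + ... + 1 with len(f) == n."""
    top = f[-1]               # coefficient that overflows to x^n
    shifted = [0] + f[:-1]
    return [c - top for c in shifted]  # x^n = -(x^(n-1) + ... + 1)

def lift(f):
    """Embed a length-n vector into the ring mod x^(n+1) - 1."""
    return f + [0]

def project(g):
    """Reduce a length-(n+1) vector mod x^n + ... + 1."""
    top = g[-1]
    return [c - top for c in g[:-1]]

f = [1, 2, 3, 4]  # n = 4, modulus x^4 + x^3 + x^2 + x + 1
direct = mul_x_prime_cyclotomic(f)
via_lift = project(mul_x_cyclic(lift(f)))
print(direct, via_lift)
```

Both routes give the same element, so the rotation can be used during the computation and the projection applied only when needed. In a real implementation the rotation can be an index offset rather than a copy, which is where the O(1) cost comes from.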

lifting computations in prime-cyclotomic ideals

• Use the freedom to make vector components sum to zero

• Reduces the complexity of inner products over ideal lattices from O(n^3) to O(n^2)

• Fastest in prime-cyclotomic ideal lattices, to the best of our knowledge

<u, v> = <ū, v̄> − p·ū − q·v̄ + (n + 1)·p·q

(1, 2, 3, 4, 5) mod x^4 + x^3 + x^2 + x + 1

(1, 2, 3, 4, 5, 0) mod x^5 − 1 = (x^4 + x^3 + x^2 + x + 1)(x − 1)

(2, 3, 4, 5, 6, 1) mod x^5 − 1

(1−p, 2−p, 3−p, 4−p, 5−p, p) mod x^5 − 1

p is the norm of this vector

Parallel Gauss Sieve (inner layer, in GPU)

1. Sample Vectors from the Stack or the GaussianSampler

2. Reduce the Vectors by the List

3. Reduce the Vectors by themselves; if a vector is reduced, move it into the Stack

4. Reduce the List vectors by the Vectors; if a List vector is reduced, move it into the Stack; then move the Vectors into the List

[Figure, repeated for each step: data flow between Sampler, Stack, Vectors, and List]

Parallel Gauss Sieve (outer layer, between GPUs)

[Figure: the Sampler, Stack, and Vectors feed the List partitions List_0, List_1, …, List_(n-1); each partition receives the Vectors and has its own Stack]

Update the minimal List_i

Our record

Implementation Results

CUDA version 7.5

8x NVIDIA GeForce GTX TITAN X

4 in the main machine

4 in a PCIe extension box.

SVP instances are from Darmstadt’s Ideal Lattice Challenge

All inputs are preprocessed by BKZ with blocksize=30 and delta=0.99

Parallel Efficiency (in GPU)

Comparing 1 GPU to a single CPU core

In dimension 96, [IKMT14] requires 200 CPU-hours; our single-GPU implementation requires 9.6 GPU-hours

21.5x faster

Hardness comparison between general and ideal lattices

Using the same model as [IKMT14] for the speed-up ratio in ideal lattices

[IKMT14]: 600x speedup in anti-cyclic ideal lattices in dimension 128

This work: 300x speedup in prime-cyclotomic ideal lattices in dimension 130

Prime-cyclotomic is “1/2 as nice” as anti-cyclic

Parallel Efficiency (between GPUs)

Parallel efficiency = (runtime for 1 GPU) / (N × runtime for N GPUs)

In dimension 108, the parallel efficiency is 74%, 72%, 55% and 45%, respectively

In dimension 112, relative to a 2-GPU baseline, the parallel efficiency is 86%, 81% and 74%, respectively
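As a small worked example, parallel efficiency in its usual sense (single-device runtime divided by N times the N-device runtime) can be computed as below. The function names and the runtimes are hypothetical placeholders, not measurements from the talk. The second helper covers the case where the baseline is a multi-GPU run, as in the dimension-112 figures.

```python
def parallel_efficiency(t_single, n_gpus, t_parallel):
    """Efficiency of an n_gpus run relative to a 1-GPU run (1.0 is ideal scaling)."""
    return t_single / (n_gpus * t_parallel)

def relative_efficiency(base_gpus, t_base, n_gpus, t_parallel):
    """Efficiency relative to a baseline that already uses base_gpus devices."""
    return (base_gpus * t_base) / (n_gpus * t_parallel)

# Hypothetical runtimes in hours, chosen only to illustrate the arithmetic.
print(parallel_efficiency(100.0, 4, 50.0))      # 4 GPUs only halve the runtime
print(relative_efficiency(2, 60.0, 8, 20.0))    # 8 GPUs versus a 2-GPU baseline
```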

Experiment Results

[Figure: bar chart on a log_2 scale (axis from 0 to 30) for dimensions 112, 126, and 130. Series: Running Time [GPU-second] and Number of vectors. Data labels: 16.81, 23.23, 24.5 and 18.76, 21.4, 22.1]

Hardness Estimation Model from Sieve

conservative model of SVP hardness in ideal lattices, with approximation

Conclusion

We propose the first implementation on GPUs

Both inner- and outer-layer parallelism

We solve a 130-dimensional SVP instance over an ideal lattice

Specifically, a prime-cyclotomic ideal lattice

We propose the first hardness estimation model for (ideal-) SVP based on the sieve algorithm

Thank you!

Any questions?

Atsushi Takayasu and Noboru Kunihiro

A Tool Kit for Partial Key Exposure Attacks on RSA

The University of Tokyo, Japan

Background

RSA

Public key: 𝑁, 𝑒

Secret key: (𝑝, 𝑞, 𝑑)

Key generation: 𝑁 = 𝑝𝑞 and 𝑒𝑑 = 1 mod (𝑝 − 1)(𝑞 − 1)

The security relies on the hardness of factoring 𝑁.

Several attacks with partial information of the secret key have been studied using lattice-based Coppersmith’s method.
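For readers who want to see the key equation in numbers, here is a toy Python sketch with deliberately tiny primes; the specific values are assumptions chosen for illustration and offer no security. It uses the built-in three-argument `pow` with exponent −1 for the modular inverse, which requires Python 3.8 or later.

```python
# Toy RSA key; real keys use primes of 1024 bits or more.
p, q = 1009, 1013
N = p * q
phi = (p - 1) * (q - 1)
e = 65537
d = pow(e, -1, phi)  # modular inverse of e modulo (p-1)(q-1)

# The key equation written over the integers: e*d = 1 + k*(N - p - q + 1).
k = (e * d - 1) // phi

message = 42
ciphertext = pow(message, e, N)
recovered = pow(ciphertext, d, N)
print(N, d, k, recovered)
```

The integer form e·d = 1 + k(N − p − q + 1), with the unknown multiplier k, is the starting point for the lattice attacks discussed in the rest of the talk.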

Partial Key Exposure Attacks

Partial information of 𝑝, 𝑞: 𝑝 = 100101010?????????

MSBs of 𝑑: 𝑑 = 111011010?????????

LSBs of 𝑑: 𝑑 = ?????????101101001

Multi-Prime RSA

Public key: 𝑁, 𝑒

Secret key: (p_1, ⋯, p_r, d)

Key generation: N = ∏_{i=1}^{r} p_i and ed = 1 mod ∏_{i=1}^{r} (p_i − 1)

The standard RSA is the special case r = 2.

Analogous attacks have been studied.

Previous Works

MSBs/LSBs of 𝑑 for RSA [BDF98],[BM03],[EJMW@EC’05],[SGM10],[TK@SAC’14]

Small prime differences for RSA [Weg02]

MSBs/LSBs of 𝑑 for Multi-Prime RSA [Hin08]

MSBs of 𝑝, 𝑞 for RSA [SMS08]

MSBs/LSBs of 𝑑 and MSBs of 𝑝, 𝑞 for RSA [SM08]

𝑝, 𝑞 for RSA sharing the LSBs [SWS+08]

Small prime differences for Multi-Prime RSA [ZT13],[ZT14],[TK@ICISC’14]

Are all the papers valuable?

Our Contributions

General Exposure Scenarios

Public key: (N, e = N^α)

Secret key: (p_1, ⋯, p_r, d = N^β)

Key generation: N = ∏_{i=1}^{r} p_i and ed = 1 mod Φ(N)

(α, β, γ, δ)-Partial Key Exposure Attacks

Attacks with d̃ and Φ̃(N) such that d̃ is the (β − γ) log N bits of MSBs/LSBs of d, and |Φ̃(N) − Φ(N)| ≤ N^δ

Our proposed Attacks

We propose attacks for general exposure scenarios.

Our attacks contain all the currently known best attacks [EJMW@EC’05],[TK@SAC’14],[TK@ICISC’14] as special cases.

Special cases of our attacks improve attacks with

the MSBs/LSBs of 𝑑 for Multi-Prime RSA [Hin08],

the MSBs/LSBs of 𝑑 and MSBs of 𝑝, 𝑞 for RSA [SM08].

The result can be viewed as a tool kit for partial key exposure attacks on RSA.

MSBs of 𝑑 and MSBs of 𝑝, 𝑞 for 𝛾 = 5/16

[Figure: x-axis: Sizes of Secret Exponents; y-axis: Portions of Partial Information for 𝑑]

MSBs of 𝑑 and LSBs of 𝑝, 𝑞 for 𝛾 = 5/16

[Figure: x-axis: Sizes of Secret Exponents; y-axis: Portions of Partial Information for 𝑑]

MSBs of 𝑑 for Multi-Prime RSA for 𝑟 = 3

[Figure: x-axis: Sizes of Secret Exponents; y-axis: Portions of Partial Information for 𝑑]

LSBs of 𝑑 for Multi-Prime RSA for 𝑟 = 3

[Figure: x-axis: Sizes of Secret Exponents; y-axis: Portions of Partial Information for 𝑑]

Coppersmith’s Methods

Coppersmith’s Method

Coppersmith’s method uses the LLL lattice reduction algorithm to solve modular/integer equations with small roots.

Formulate RSA key generation as appropriate equations.

f(x, y) = x(N + y) + 1 − e d_1 = 0 mod eM

Construct a matrix whose rows are the coefficient vectors of polynomials that have the same roots as the original equations.

Since the short vectors of the lattice generated by the matrix carry information about the roots, recover these vectors by applying LLL reduction.

The matrix construction is crucial to obtain the best bounds.

Our General Formulations

Spirit of Our General Formulations

Partial information of 𝑑

Partial information of 𝑝, 𝑞

Multi-Prime RSA

Partial information of 𝑝, 𝑞 and Multi-Prime RSA: essentially the same information

Example

Given d_1, the LSBs of d for RSA (d = d_0 M + d_1):

e(d_0 M + d_1) = 1 + k(N − p − q + 1)

f(x, y) = x(N + y) + 1 − e d_1 = 0 mod eM

The root (x, y) = (k, −p − q + 1) is bounded in absolute value by (X, Y).
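The claim that (k, −p − q + 1) is a root of f modulo eM can be checked numerically. The sketch below uses a toy key and assumes the attacker knows the 10 least significant bits of d; all concrete values are illustrative assumptions. It needs Python 3.8 or later for `pow(e, -1, phi)` and `math.isqrt`.

```python
import math

# Toy RSA key (assumption: tiny primes, for illustration only).
p, q, e = 1009, 1013, 65537
N = p * q
phi = (p - 1) * (q - 1)
d = pow(e, -1, phi)

M = 2 ** 10              # the attacker knows d mod M
d1, d0 = d % M, d // M   # d = d0*M + d1, with d1 exposed and d0 unknown
k = (e * d - 1) // phi   # unknown multiplier in e*d = 1 + k*phi

def f(x, y):
    return x * (N + y) + 1 - e * d1

x0, y0 = k, -p - q + 1
print(f(x0, y0) % (e * M))
```

The root is small compared with the modulus: k is below e, and |y0| = p + q − 1 is on the order of N^(1/2) for balanced primes. Root-size bounds of this kind are what Coppersmith-type methods work with.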

Example

Given d_1, the LSBs of d, and p_1, q_1, the MSBs of p, q for RSA:

e(d_0 M + d_1) = 1 + k(N − p − q + 1)

f(x, y) = x(N − p_1 − q_1 + y) + 1 − e d_1 = 0 mod eM

The root (x, y) = (k, −p + p_1 − q + q_1 + 1) is bounded in absolute value by (X, Y).
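A quick numerical check shows how knowing the MSBs of p and q shrinks the second root component. This is a sketch under assumptions: toy primes, 10 known LSBs of d, and p_1, q_1 taken to be p and q with their 5 lowest bits cleared. It requires Python 3.8 or later for `pow(e, -1, phi)`.

```python
# Toy RSA key (assumption: tiny primes, for illustration only).
p, q, e = 1009, 1013, 65537
N = p * q
phi = (p - 1) * (q - 1)
d = pow(e, -1, phi)

M = 2 ** 10
d1 = d % M               # exposed LSBs of d
k = (e * d - 1) // phi

unknown_bits = 5         # low bits of p and q that stay unknown
p1 = (p >> unknown_bits) << unknown_bits   # known MSB part of p
q1 = (q >> unknown_bits) << unknown_bits   # known MSB part of q

def f(x, y):
    return x * (N - p1 - q1 + y) + 1 - e * d1

x0, y0 = k, -p + p1 - q + q1 + 1
print(f(x0, y0) % (e * M), abs(y0), p + q - 1)
```

Without the MSBs the second component is p + q − 1 (2021 for this toy key); with them it is below 2^(unknown_bits + 1), so the bound Y is much smaller.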

Example

Given d_1, the LSBs of d for Multi-Prime RSA:

e(d_0 M + d_1) = 1 + k ∏_{i=1}^{r} (p_i − 1)

f(x, y) = x(N + y) + 1 − e d_1 = 0 mod eM

The root (x, y) = (k, ∏_{i=1}^{r} (p_i − 1) − N) is bounded in absolute value by (X, Y).
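The same check works for Multi-Prime RSA, and it also shows why the bound Y is larger there. The sketch below uses r = 3 toy primes and 10 exposed LSBs of d; all values are illustrative assumptions, and `math.prod` requires Python 3.8 or later.

```python
import math

# Toy 3-prime RSA key (assumption: tiny primes, for illustration only).
primes = [1009, 1013, 1019]
e = 65537
N = math.prod(primes)
phi = math.prod(pi - 1 for pi in primes)
d = pow(e, -1, phi)

M = 2 ** 10
d1 = d % M               # exposed LSBs of d
k = (e * d - 1) // phi

def f(x, y):
    return x * (N + y) + 1 - e * d1

x0, y0 = k, phi - N
print(f(x0, y0) % (e * M), abs(y0), math.isqrt(N))
```

Here |y0| = N − ∏(p_i − 1) is roughly 3·N^(2/3), compared with about 2·N^(1/2) for two-prime RSA, so the second root component takes up a larger fraction of the modulus size.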

Spirit of Our General Formulations

Partial information of 𝑝, 𝑞: smaller 𝑌

Multi-Prime RSA: larger 𝑌

Our Lattice Constructions

Our Lattice Constructions

Applying the previous best strategies appropriately

• [EJMW@EC’05]: the basic attacks, which solve integer equations by simple lattice constructions.

• [TK@SAC’14]: the attacks solve modular equations and are better for small d̃.

• [TK@ICISC’14]: the attacks solve modular equations and are better for large Φ̃(N).

Summary

We defined general exposure scenarios that include several partial key exposure attacks on (Multi-Prime) RSA as special cases.

For the general scenarios, we propose several attacks.

The attacks contain all the state-of-the-art attacks as special cases.

The attacks improve previous ones in two scenarios.

Our result enables beginners of Coppersmith’s methods to analyze the security of RSA.
