Privacy-preserving Efficient Subset of Features Selection for Regression Models

N. Gama, M. Georgieva

December 10, 2018

GWAS (find the best additional feature)

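[Figure: data layout. Covariate matrix X (columns: intercept, age, weight, gender; rows: patient 1 to patient n), target vector Y, and SNP matrix S whose m > 10000 columns are the candidate features s_i (SNP_i).]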

Question: is the new feature important?

Naive method: compute stat_i for each i, which means computing more than 10000 logistic regressions.

Description of iDASH 2018 Task 2

Goal: develop a secure parallel outsourcing solution to compute Genome-Wide Association Studies (GWAS) based on linear/logistic regression using homomorphically encrypted data.

Challenge (informally): currently, a logreg model on 250 patients with 3 or 4 physical features. Which new feature, among 10000 possible genes (SNPs), would improve the model?

Semi-parallel approach: don't do 10000 logregs...

Logreg, IRLS, relevance of a feature

[Figure: covariate matrix X (columns: intercept, age, weight, gender; rows: patient 1 to patient n) and target vector Y.]

Single logistic regression: find $\theta$ such that $Y = \mathrm{sign}(X\theta)$.

IRLS:
- Compute grad = $X^t (Y - p)$, with $p = \sigma(X\theta)$
- Compute Hessian = $X^t \,\mathrm{diag}(p(1 - p))\, X$

Importance of the i-th feature:
- the i-th coefficient is big: $\theta_i$ (numerator)
- the i-th error term is small: $(\mathrm{Hess}^{-1})_{i,i}$ (denominator)
- stat = ratio
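
The IRLS formulas on this slide can be tried out in the clear with a few lines of NumPy. This is only a minimal sketch on synthetic data: the helper names (`irls_logreg`, `feature_stat`), the data, and the iteration count are hypothetical, and the square root in the denominator (the usual Wald z-statistic) is an assumption, since the slide only says "ratio".

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def irls_logreg(X, y, n_iter=25):
    """Fit a logistic regression with Newton/IRLS steps; return theta and the final Hessian."""
    theta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ theta)
        grad = X.T @ (y - p)                        # X^t (Y - p)
        hess = X.T @ (X * (p * (1 - p))[:, None])   # X^t diag(p(1-p)) X
        theta = theta + np.linalg.solve(hess, grad)
    p = sigmoid(X @ theta)
    hess = X.T @ (X * (p * (1 - p))[:, None])
    return theta, hess

def feature_stat(theta, hess, i):
    """theta_i divided by the square root of (Hess^-1)_{i,i} (assumed form of the ratio)."""
    return theta[i] / np.sqrt(np.linalg.inv(hess)[i, i])

# Synthetic data (hypothetical): intercept plus 3 covariates for 250 patients.
rng = np.random.default_rng(0)
n = 250
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
true_theta = np.array([-0.3, 1.0, 0.0, -0.5])
y = (rng.random(n) < sigmoid(X @ true_theta)).astype(float)

theta, hess = irls_logreg(X, y)
stats = [feature_stat(theta, hess, i) for i in range(X.shape[1])]
```

At convergence the gradient is numerically zero, which is the property the semi-parallel approach relies on when a new column is appended to X.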

Semi-parallel GWAS (high level idea)

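Semi-parallel GWAS (optimized):
1 Do logreg(X, y) without S
2 Once the model has converged, add s_i

Gradient: the covariate-only model has converged, so $X^t (Y - p) = 0$ and only the new coordinate is non-zero:

$$[X \mid s_i]^t \,(Y - p) = \begin{pmatrix} 0 \\ \langle s_i, Y - p \rangle \end{pmatrix}$$

These inner products can be batch-computed for all i at once: $S^t (Y - p)$.

Hessian: the top-left block is the old Hessian, and only the last row and column are new:

$$[X \mid s_i]^t \,\mathrm{diag}(p(1-p))\, [X \mid s_i] = \begin{pmatrix} \text{Old Hess} & X^t \,\mathrm{diag}(p(1-p))\, s_i \\ s_i^t \,\mathrm{diag}(p(1-p))\, X & s_i^t \,\mathrm{diag}(p(1-p))\, s_i \end{pmatrix}$$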
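
To make the batching concrete, here is a plaintext sketch of one Newton step per SNP, started from the converged covariate-only model and computed for all columns of S at once through the Schur complement of the old Hessian. The function names and synthetic data are hypothetical, and this is one possible reading of the slide, not the authors' implementation.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def fit_logreg(X, y, n_iter=25):
    """Covariate-only logistic regression by Newton steps."""
    theta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ theta)
        w = p * (1 - p)
        theta += np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (y - p))
    return theta

def semi_parallel_stats(X, y, S):
    """One Newton step per SNP from the converged model, batched over all columns of S."""
    theta = fit_logreg(X, y)                  # step 1: logreg(X, y) without S
    p = sigmoid(X @ theta)
    w = p * (1 - p)
    g = S.T @ (y - p)                         # all <s_i, Y - p> at once
    A = X.T @ (X * w[:, None])                # old Hessian
    B = X.T @ (S * w[:, None])                # X^t W s_i, one column per SNP
    c = np.einsum("ni,n,ni->i", S, w, S)      # s_i^t W s_i
    schur = c - np.einsum("ki,ki->i", B, np.linalg.solve(A, B))
    beta = g / schur                          # coefficient of s_i after one step
    return beta / np.sqrt(1.0 / schur)        # divided by sqrt of its inverse-Hessian entry

def one_snp_stat(X, y, s):
    """Reference: the same Newton step done explicitly on the augmented matrix [X | s]."""
    theta = np.append(fit_logreg(X, y), 0.0)
    Xs = np.column_stack([X, s])
    p = sigmoid(Xs @ theta)
    grad = Xs.T @ (y - p)
    hess = Xs.T @ (Xs * (p * (1 - p))[:, None])
    step = np.linalg.solve(hess, grad)
    return step[-1] / np.sqrt(np.linalg.inv(hess)[-1, -1])

# Synthetic data (hypothetical sizes, much smaller than m > 10000).
rng = np.random.default_rng(1)
n, m = 250, 50
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
y = (rng.random(n) < sigmoid(X @ np.array([0.2, 0.8, -0.4, 0.1]))).astype(float)
S = rng.integers(0, 2, size=(n, m)).astype(float)

stats = semi_parallel_stats(X, y, S)
```

The batched version only needs one small linear solve with the old Hessian plus matrix products against S, instead of m separate regressions.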

MPC versus FHE

FHE:
- Long-term storage
- Single cloud
- Slower and consumes more memory

MPC:
- Faster than FHE
- More accurate
- All data owners must participate

Fixed points versus Floating point

Floating point:
- $x = m \cdot 2^{\tau}$, with $m \in 2^{-\rho}\mathbb{Z}$ and $\tfrac{1}{2} \le |m| < 1$
- $\tau = \lceil \log_2 |x| \rceil$ is data dependent and not public (not FHE-friendly)
- The exponent is always in sync with the data
- ex: $(1.23 \cdot 10^{-4}) * (7.24 \cdot 10^{-4}) = (8.90 \cdot 10^{-8})$

Fixed point:
- $x = m \cdot 2^{\tau}$, with $m \in 2^{-\rho}\mathbb{Z}$ and $0 \le |m| < 1$
- $\tau$ is public, thus FHE-friendly
- Risk of overflow ($\tau$ too small)
- Risk of underflow ($\tau$ too large)
- ex: $(0.000123 \cdot 10^{0}) * (0.000724 \cdot 10^{0}) = (0.000000 \cdot 10^{0})$

Plaintext parameters:
- $\rho \in \mathbb{N}$: bits of precision of the plaintext (≈ 15 bits)
- $\tau \in \mathbb{Z}$: slot exponent (order of magnitude of the complex values in each slot)
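
The underflow and overflow risks can be reproduced in plain Python. The sketch below quantizes a real number to the form m·2^τ with ρ bits of precision. The helper `to_fixed` and the value τ = -23 are hypothetical and chosen only for illustration, and the example works in base 2 where the slide's example uses base 10.

```python
def to_fixed(x, rho, tau):
    """Quantize x as m * 2**tau, with m a multiple of 2**-rho and |m| < 1."""
    m = round(x / 2.0**tau * 2**rho) / 2**rho
    if abs(m) >= 1:
        raise OverflowError(f"|x| = {abs(x)} does not fit below 2**{tau}")
    return m * 2.0**tau

RHO = 15                                    # about 15 bits of precision, as on the slide
product = 0.000123 * 0.000724               # about 8.9e-8

underflow = to_fixed(product, RHO, tau=0)   # tau too large: the result rounds to 0
good = to_fixed(product, RHO, tau=-23)      # tau matched to the magnitude of the product
```

With τ = 0 the product is lost entirely, while a value such as 3.0 would raise an overflow error. With τ matched to the data, the relative error stays around 2^-ρ.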

Choice of slot exponent

The slot exponent τ that defines the plaintext interval must be carefully estimated.

variable      avg         stdev       min         max
p             0.440816    0.0975715   0.176397    0.853487
w             0.236977    0.0201871   0.125047    0.25
z*_i          -3.33092    7.36068     -30.9426    31.2008
G             0.0577846   0.0953495   -0.011997   0.236977
A             0.0621965   0.301255    -0.317312   2.236
(s*_i)^2      2.44243     4.11085     0.111961    14.5044
log(stat_i)   0.200039    1.84459     -13.7207    4.36158
p-value       0.310218    0.24083     0           0.999163

[dist column: one histogram per variable (p, w, zStar, G, A, sStar2, ri, pval).]
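
As a rough illustration of how such statistics could feed the choice of τ, the sketch below picks the smallest exponent whose interval strictly contains the observed range. This rule and the helper `slot_exponent` are assumptions made for the example. The slides do not state the estimation procedure, and a real deployment would presumably add a safety margin.

```python
import math

def slot_exponent(lo, hi):
    """Smallest integer tau such that every value in [lo, hi] satisfies |x| < 2**tau."""
    bound = max(abs(lo), abs(hi))
    return math.floor(math.log2(bound)) + 1

# (min, max) pairs copied from the table above.
ranges = {
    "p": (0.176397, 0.853487),
    "w": (0.125047, 0.25),
    "zStar": (-30.9426, 31.2008),
    "sStar2": (0.111961, 14.5044),
}
taus = {name: slot_exponent(lo, hi) for name, (lo, hi) in ranges.items()}
```

For instance, z*_i stays below 32 in absolute value, so this rule gives τ = 5, while w reaches exactly 0.25 and therefore needs τ = -1 rather than -2.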

Numerical stability

Not stable:
- Increase the precision of the algorithm, but that implies bigger parameters

Stable:
- Use stable computation with negative feedback (e.g. gradient descent)
- Smaller parameters
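
The effect of negative feedback can be seen on a toy problem. The sketch below runs gradient descent on (x - c)^2 / 2 and injects bounded noise at every step, as a stand-in for FHE noise. The function name, the noise model, and all constants are hypothetical. Because each step contracts the previous error by (1 - η), the final error stays below δ/η however many noisy steps are taken.

```python
import random

def noisy_gradient_descent(c, x0, eta, delta, steps, seed=0):
    """Minimize (x - c)**2 / 2 by gradient descent, adding noise in [-delta, delta] at each step."""
    rng = random.Random(seed)
    x = x0
    for _ in range(steps):
        x = x - eta * (x - c) + rng.uniform(-delta, delta)
    return x

c, eta, delta = 2.0, 0.5, 1e-4
x = noisy_gradient_descent(c, x0=0.0, eta=eta, delta=delta, steps=100)
error = abs(x - c)   # bounded by (1 - eta)**steps * |x0 - c| + delta / eta
```

The noise does not accumulate across iterations, which is why such a computation can tolerate smaller precision parameters.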

FHE Solution

FHE parameters:
- $L \in \mathbb{N}$: level exponent of the ciphertext ($\alpha = 2^{-(L+\rho)}$: noise rate)
- $N = f(\lambda, \alpha)$: key size, with $\lambda$ the security parameter

The lwe-estimator script was used to assess the security (conforming to the HE security standardization white paper).

Plaintext algorithm in FHE solution

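[Figure: data layout. Covariate matrix X (columns: intercept, age, weight, gender; rows: patient 1 to patient n), target vector Y, and SNP matrix S whose m > 10000 columns are the candidate features s_i.]

Input:
- $X \in M_{n,k+1}(\mathbb{R})$: input matrix
- $y \in \mathbb{B}^n$: binary vector
- $S \in M_{n,m}(\mathbb{R})$: assumed binary

Output:
- $\mathrm{stat} \in \mathbb{R}^m$ with $\mathrm{stat}_i = \dfrac{z^*_i}{\sqrt{s^{*2}_i}}$

Key points of our solution:
- Make the plaintext algorithm FHE friendly
- Use hybrid homomorphic encryption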

Optimization of plaintext algorithm

Make the plaintext algorithm FHE friendly:
- Find simple geometric equivalents of the formula
- Find an approximation with lower multiplicative depth
- Replace feature scaling of X with orthogonalization
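
As an illustration of the last point, one way to orthogonalize the covariates in the clear is a reduced QR decomposition. Using QR here is an assumption, since the slides do not say which orthogonalization the authors use, and the synthetic covariates below are hypothetical. The columns of Q span the same space as those of X, so any linear predictor Xθ can be rewritten as Q(Rθ), while Q^t Q is the identity regardless of the original scales.

```python
import numpy as np

# Synthetic covariates (hypothetical): intercept, age, weight, gender for 250 patients.
rng = np.random.default_rng(0)
n = 250
X = np.column_stack([
    np.ones(n),
    rng.normal(50, 10, n),
    rng.normal(70, 15, n),
    rng.integers(0, 2, n),
])

Q, R = np.linalg.qr(X)            # reduced QR: Q is n x 4 with orthonormal columns

theta = np.array([0.1, -0.02, 0.01, 0.5])
same_predictor = np.allclose(X @ theta, Q @ (R @ theta))
```

Working with Q instead of a rescaled X keeps all entries bounded by 1 in absolute value, which fits the fixed-point setting described earlier.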

Algorithm in plaintext

- Continuous non-polynomial functions (approximate numbers, or lookup tables)
- For loops (better with fast bootstrapping)
- Individual non-linear operations in small dimension (lookup tables)
- Multiplication with fresh ciphertexts (better with TFHE's external product)
- Continuous function batched on a large vector, very large dimension (fully packed SIMD)

Which fully homomorphic scheme should we choose?

Each library has its own strengths

Strengths of HE libraries:
- BGV/HElib: SIMD finite field arithmetic
- B/FV, SEAL: SIMD vector mod t
- HEAAN: SIMD fixed point arithmetic
- TFHE: single evaluation, boolean logic, comparison, threshold, complex circuits
- etc...

How to get all the benefits without the limitations?

Solution: Chimera

Idea:
- Unified plaintext space over the torus
- Switch between ciphertext representations
- Implement bridges between TFHE, B/FV and HEAAN

For this use case, we use the switch between TFHE and HEAAN!

Chimera solution

1 Initial logreg on matrix X and vector y: adapt lib TFHE + logreg

2 Mass linear algebra computations: implement Chimera (version 2 of TFHE)

3 Batch logarithm computation: adapt lib HEAAN

Benchmarks (iDASH Bootstrapped)

Steps               Timing (4 cores)   Timing (96 cores)   RAM
KeyGen              5.5 mins           2.0 mins            4.4 GB
Encryption          7.2 mins           1.3 mins            8.6 GB
Cloud Computation   3h06               10.2 mins           7.8 GB

Input ciphertext: 5 GB (enc X, y, S)
Final ciphertext: 640 KB (enc numerator + denominator)

Benchmarks (with new optimizations)

k = 3, n = 250, m = 10000

Steps               Timing (4 cores)   Timing (96 cores)   RAM
KeyGen              5.5 mins           2.0 mins            4.4 GB
Encryption          7.2 mins           1.3 mins            8.6 GB
Cloud Computation   35 mins            3 mins              7.8 GB

k = 7, n = 250, m = 10000

Steps               Timing (4 cores)   Timing (96 cores)   RAM
KeyGen              5.5 mins           2.0 mins            4.4 GB
Encryption          7.2 mins           1.3 mins            8.6 GB
Cloud Computation   41 mins            3.1 mins            7.8 GB

Initial ciphertext: 5 GB (enc X, y, S)
Final ciphertext: 640 KB (enc numerator + denominator)

Numerical Accuracy (FHE has noise)

[Figure: scatter plot of actual vs. computed values against the line y = x; both axes range from -10 to 10.]

Questions?
