Protecting Enterprise Data in Apache Hadoop

© Hortonworks Inc. 2015 Protecting Enterprise Data in Apache Hadoop June 2015 Page 1 Owen O’Malley [email protected] @owen_omalley

Upload: owen-omalley

Post on 09-Aug-2015


TRANSCRIPT

Page 1: Protecting Enterprise Data in Apache Hadoop

© Hortonworks Inc. 2015

Protecting Enterprise Data in Apache Hadoop

June 2015

Owen O’Malley
[email protected]
@owen_omalley

Page 2: Protecting Enterprise Data in Apache Hadoop

Security

• What are the important threats?
• What are the attack vectors?
• Minimize attack surfaces.

Page 3: Protecting Enterprise Data in Apache Hadoop

Security Architecture

Page 4: Protecting Enterprise Data in Apache Hadoop

Attack Vectors

• Physical Access to Machines
• Remote Access
• Gateway Machines
• Slave Machines
• Master Machines

Page 5: Protecting Enterprise Data in Apache Hadoop

Attack Vectors

• Root
  • Has complete access on machine

• Hadoop Administrator
  • Has complete Hadoop access

• User
  • Limited by Hadoop permissions

Page 6: Protecting Enterprise Data in Apache Hadoop

Threat: Accidental Damage

• User deletes files
  • Need HDFS permissions
• User kills other users’ jobs
  • Need Linux Container Executor
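A minimal sketch of the owner/group/other check behind "Need HDFS permissions", assuming a POSIX-style mode like the one HDFS uses. The function name, users, groups, and modes are hypothetical, and superuser bypass is not modeled; this is an illustration, not Hadoop code. Deleting a file needs write permission on its parent directory, which is what keeps one user from removing another user's files.

```python
# Toy model of a POSIX-style permission check (illustrative, not Hadoop code).
READ, WRITE, EXECUTE = 4, 2, 1

def allowed(user, groups, owner, group, mode, want):
    """Return True if `user` holds every bit in `want` under octal `mode`."""
    if user == owner:
        bits = (mode >> 6) & 7      # owner class
    elif group in groups:
        bits = (mode >> 3) & 7      # group class
    else:
        bits = mode & 7             # other class
    return (bits & want) == want

# Hypothetical directory owned by alice:sales with mode 750 (rwxr-x---).
# Deleting a file inside it requires WRITE on the directory itself.
print(allowed("alice", {"sales"}, "alice", "sales", 0o750, WRITE))   # True
print(allowed("bob", {"sales"}, "alice", "sales", 0o750, WRITE))     # False
print(allowed("bob", {"sales"}, "alice", "sales", 0o750, READ))      # True
print(allowed("carol", {"hr"}, "alice", "sales", 0o750, READ))       # False
```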

Page 7: Protecting Enterprise Data in Apache Hadoop

Threat: Remote Access

• Need Service Level Authorization
  • Which users can use each service
  • Apache Ranger simplifies this!

• Need firewall around cluster
  • Control attack surface
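To make "which users can use each service" concrete, here is a small sketch of how a Hadoop service-level ACL string is read. The format (comma-separated users, a space, comma-separated groups, with "*" meaning everyone) is the one used in hadoop-policy.xml once hadoop.security.authorization is enabled. The helper functions and the sample users and groups are hypothetical.

```python
# Toy parser for the Hadoop ACL string format: "user1,user2 group1,group2".
# "*" allows everyone; a value that starts with a space lists groups only.

def parse_acl(value):
    """Return (allow_all, users, groups) for one ACL string."""
    if value.strip() == "*":
        return True, set(), set()
    users_part, _, groups_part = value.partition(" ")
    users = {u for u in users_part.split(",") if u}
    groups = {g for g in groups_part.strip().split(",") if g}
    return False, users, groups

def is_authorized(value, user, user_groups):
    """Check one caller against one ACL string."""
    allow_all, users, groups = parse_acl(value)
    return allow_all or user in users or bool(groups & set(user_groups))

# Hypothetical value for security.client.protocol.acl in hadoop-policy.xml.
acl = "alice,bob analysts"
print(is_authorized(acl, "alice", []))             # True
print(is_authorized(acl, "carol", ["analysts"]))   # True
print(is_authorized(acl, "mallory", ["guests"]))   # False
print(is_authorized("*", "mallory", []))           # True
print(is_authorized(" admins", "admins", []))      # False: "admins" is a group here
```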

Page 8: Protecting Enterprise Data in Apache Hadoop

Threat: Eavesdropping

• Root can watch network traffic
• Need wire encryption

• HTTP SSL encryption
• Shuffle
• Data transfer
• RPC encryption
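The four channels above map onto a handful of Hadoop 2.x configuration properties. The sketch below renders them as Hadoop-style configuration XML. The property names are standard settings; the grouping by file and the render helper are choices made for this illustration, and SSL additionally needs keystore settings that are not shown.

```python
# Render Hadoop-style <configuration> XML for wire encryption settings.
import xml.etree.ElementTree as ET

# Values shown turn encryption on for the channel named in each comment.
WIRE_ENCRYPTION = {
    "core-site.xml": {"hadoop.rpc.protection": "privacy"},          # RPC
    "hdfs-site.xml": {
        "dfs.encrypt.data.transfer": "true",                        # data transfer
        "dfs.http.policy": "HTTPS_ONLY",                            # HTTP over SSL
    },
    "mapred-site.xml": {"mapreduce.shuffle.ssl.enabled": "true"},   # shuffle
}

def render(props):
    """Build a <configuration> element with one <property> per setting."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")

for filename, props in WIRE_ENCRYPTION.items():
    print(filename)
    print(render(props))
```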

Page 9: Protecting Enterprise Data in Apache Hadoop

Threat: User accesses private data

• Need Kerberos
• HDFS permissions are critical

• ACLs add additional flexibility
• Define user-to-group mapping

• Need group directories
• Minimize user or global spaces
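A sketch of how an ACL adds flexibility on top of the basic mode, assuming the POSIX-style evaluation order that HDFS ACLs are modeled on: owner first, then named users, then any matching group entry, then other, with the mask capping named and group entries. The dictionary layout, names, and directory are hypothetical; the sample entry corresponds to something like `hdfs dfs -setfacl -m group:auditors:r-x` on a finance group directory.

```python
# Toy model of POSIX-style ACL evaluation with a mask (not Hadoop code).
READ, WRITE, EXECUTE = 4, 2, 1

def acl_allows(user, groups, acl, want):
    """Permissions are 3-bit integers; `acl` is a plain dict (see below)."""
    mask = acl["mask"]
    if user == acl["owner"]:
        return (acl["owner_perm"] & want) == want       # owner ignores the mask
    if user in acl["named_users"]:
        return (acl["named_users"][user] & mask & want) == want
    # Collect every group entry that applies to this user.
    matching = [perm for g, perm in acl["named_groups"].items() if g in groups]
    if acl["group"] in groups:
        matching.append(acl["group_perm"])
    if matching:
        return any((perm & mask & want) == want for perm in matching)
    return (acl["other_perm"] & want) == want

# Hypothetical group directory: alice:finance, rwxr-x---, plus group:auditors:r-x.
finance_dir = {
    "owner": "alice", "group": "finance",
    "owner_perm": 7, "group_perm": 5, "other_perm": 0,
    "named_users": {}, "named_groups": {"auditors": 5}, "mask": 5,
}
print(acl_allows("dave", {"auditors"}, finance_dir, READ))    # True
print(acl_allows("dave", {"auditors"}, finance_dir, WRITE))   # False
print(acl_allows("eve", {"interns"}, finance_dir, READ))      # False
```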

Page 10: Protecting Enterprise Data in Apache Hadoop

Threat: Physical access

• Very Rare
• Attacker can get to physical box

• Hopeless
• Can remove hard drives

• Includes access to retired drives
• Need raw file system encryption

Page 11: Protecting Enterprise Data in Apache Hadoop

Threat: Hadoop Admin in Cluster

• Attacker is a Hadoop Admin
  • But not root

• Need HDFS Encryption Zones
  • Directory sub-tree encryption
  • Each file gets unique key
  • Client decrypts data

Page 12: Protecting Enterprise Data in Apache Hadoop

HDFS Encryption

• Each zone has master key
• Each file gets unique sub-key
• HDFS stores sub-key encrypted with master key
• Client uses sub-key to decrypt file
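The key hierarchy above is a form of envelope encryption. The sketch below walks through it end to end under loud assumptions: the XOR keystream built from SHA-256 is only a stand-in for AES and is not secure, all names are hypothetical, and everything runs in one process, whereas in a real deployment the zone master key would sit behind a key management service rather than in the client.

```python
# Toy envelope encryption: a zone master key wraps a per-file sub-key.
# The XOR keystream below is a stand-in for AES and is NOT secure.
import hashlib
import os

def xor_stream(key, nonce, data):
    """XOR `data` with a SHA-256 counter keystream derived from key and nonce."""
    out = bytearray()
    for start in range(0, len(data), 32):
        counter = (start // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + nonce + counter).digest()
        out.extend(b ^ p for b, p in zip(data[start:start + 32], pad))
    return bytes(out)

zone_master_key = os.urandom(32)          # one per encryption zone

def create_file(plaintext):
    file_key = os.urandom(32)             # unique sub-key for this file
    nonce = os.urandom(16)
    wrapped = xor_stream(zone_master_key, nonce, file_key)   # stored with the file
    return {"wrapped_key": wrapped, "nonce": nonce,
            "data": xor_stream(file_key, nonce, plaintext)}

def read_file(stored):
    # Unwrap the sub-key with the zone master key, then decrypt the file data.
    file_key = xor_stream(zone_master_key, stored["nonce"], stored["wrapped_key"])
    return xor_stream(file_key, stored["nonce"], stored["data"])

stored = create_file(b"quarterly numbers")
print(read_file(stored) == b"quarterly numbers")   # True
```

Someone who can read the stored record but not the zone master key sees only the wrapped sub-key and ciphertext, which is the property the slide relies on.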

Page 13: Protecting Enterprise Data in Apache Hadoop

KeyProvider API

• Key management
• Allows 3rd party plugins
• Named keys
• Key versions and key rolling
• Key Management Server is alpha
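As an illustration of named keys, key versions, and key rolling, here is a toy in-memory key store. It is a hypothetical class written for this transcript, not the Hadoop KeyProvider API, and the `name@N` version labels are an assumption of the sketch.

```python
# Toy in-memory key store with named keys, versions, and rolling.
# Hypothetical class for illustration; it is not the Hadoop KeyProvider API.
import os

class ToyKeyStore:
    def __init__(self):
        self._versions = {}                 # key name -> list of key material

    def create_key(self, name, bits=128):
        if name in self._versions:
            raise ValueError(f"key {name!r} already exists")
        self._versions[name] = [os.urandom(bits // 8)]
        return f"{name}@0"

    def roll_key(self, name):
        """Add a new version; older versions stay readable for existing data."""
        versions = self._versions[name]
        versions.append(os.urandom(len(versions[0])))
        return f"{name}@{len(versions) - 1}"

    def current_version(self, name):
        return f"{name}@{len(self._versions[name]) - 1}"

    def get(self, version_name):
        name, _, index = version_name.rpartition("@")
        return self._versions[name][int(index)]

store = ToyKeyStore()
print(store.create_key("zone1"))        # zone1@0
print(store.roll_key("zone1"))          # zone1@1
print(store.current_version("zone1"))   # zone1@1
```

Keeping old versions addressable is what lets data written before a roll remain decryptable while new data picks up the current version.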

Page 14: Protecting Enterprise Data in Apache Hadoop

Encryption Scheme

• AES/CTR
• Supports append and seek
• Does not pad files
• Uses Initialization Vector
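Why counter mode gives seek, append, and no padding can be seen in a few lines. This sketch uses SHA-256 as a stand-in for the AES block cipher so that it runs with the standard library alone; it is not AES and not secure, and the function names, key, and IV are hypothetical. The point is structural: the keystream for byte N depends only on the key, the IV, and N.

```python
# Counter-mode sketch: SHA-256 stands in for the AES block cipher (NOT secure).
import hashlib

BLOCK = 32   # keystream block size of this stand-in; AES uses 16-byte blocks

def keystream_block(key, iv, counter):
    return hashlib.sha256(key + iv + counter.to_bytes(8, "big")).digest()

def ctr_xor(key, iv, data, offset=0):
    """Encrypt or decrypt `data` that starts at byte `offset` of the file."""
    out = bytearray()
    for i, byte in enumerate(data):
        pos = offset + i
        block = keystream_block(key, iv, pos // BLOCK)
        out.append(byte ^ block[pos % BLOCK])
    return bytes(out)

key, iv = b"k" * 32, b"i" * 16
plain = b"hello hadoop, this is a seekable stream"
cipher = ctr_xor(key, iv, plain)

print(len(cipher) == len(plain))                                  # True: no padding
print(ctr_xor(key, iv, cipher[10:], offset=10) == plain[10:])     # True: seek
appended = cipher + ctr_xor(key, iv, b" more", offset=len(plain))
print(appended == ctr_xor(key, iv, plain + b" more"))             # True: append
```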

Page 15: Protecting Enterprise Data in Apache Hadoop

Threat: User Deletes Hive tables

• Configure Storage-Based Auth
• Table permissions match HDFS

• Change owner and group on HDFS
• Modify permissions on HDFS

Page 16: Protecting Enterprise Data in Apache Hadoop

Threat: User reads private columns

• Some parts of file have different protection requirements.

• Configure SQL Standard Auth
• Supports grant permissions

• Restrict permissions in HDFS
• All normal users use Hive Server 2

Page 17: Protecting Enterprise Data in Apache Hadoop

Threat: User reads private columns

• Some columns are more sensitive
• But need custom file formats & UDF
• Need Column Encryption

• Strong encryption of entire column
• File format specific
• Working on ORC in HIVE-4227

Page 18: Protecting Enterprise Data in Apache Hadoop

ORC File Layout

Page 19: Protecting Enterprise Data in Apache Hadoop

Threat: User reads hidden values

• Want to hide values in some columns.

• Need value encryption
• Provided by 3rd parties
• Typically done using UDFs

Page 20: Protecting Enterprise Data in Apache Hadoop

Threat: Shadow Security

• Be very wary of non-public encryption schemes

• Cryptosystems need deep analysis
• Need to use AES or stronger.
• Avoid masking or format-preserving encryption that is not based on AES

Page 21: Protecting Enterprise Data in Apache Hadoop

Resources

• Use the developer lists:
  • [email protected]
  • [email protected]
• Report security holes:
  • [email protected]
  • [email protected]

Page 22: Protecting Enterprise Data in Apache Hadoop

Thank You!

Owen O’Malley

@owen_omalley

[email protected]