Protecting Enterprise Data in Apache Hadoop

Post on 09-Aug-2015

© Hortonworks Inc. 2015

Protecting Enterprise Data in Apache Hadoop

June 2015

Page 1

Owen O’Malley

owen@hortonworks.com

@owen_omalley

Security

Page 2

• What are the important threats?
• What are the attack vectors?
• Minimize attack surfaces.

Security Architecture

Page 3

Attack Vectors

Page 4

• Physical Access to Machines
• Remote Access
• Gateway Machines
• Slave Machines
• Master Machines

Attack Vectors

Page 5

• Root
  • Has complete access on machine
• Hadoop Administrator
  • Has complete Hadoop access
• User
  • Limited by Hadoop permissions

Threat: Accidental Damage

Page 6

• User deletes files
  • Need HDFS permissions
• User kills other users’ jobs
  • Need Linux Container Executor

Threat: Remote Access

Page 7

• Need Service Level Authorization
  • Which users can use each service
  • Apache Ranger simplifies this!
• Need firewall around cluster
  • Control attack surface
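To make "which users can use each service" concrete, here is a minimal sketch of a per-service access check. It is not Hadoop's implementation. The ACL string layout (comma-separated users, a space, comma-separated groups, with "*" meaning everyone) is an assumption of this sketch, and the service names and the deny-by-default rule for unknown services are hypothetical.

```python
# Hypothetical per-service ACL table. Assumed layout: "users groups",
# each a comma-separated list; "*" alone means everyone is allowed.
SERVICE_ACLS = {
    "hdfs-client": "alice,bob analysts",
    "job-submit": " etl",          # leading space: no users, only the etl group
    "web-ui": "*",
}


def parse_acl(acl):
    """Return (users, groups, allow_all) for one ACL string."""
    if acl.strip() == "*":
        return set(), set(), True
    parts = acl.split(" ", 1)
    users = {u.strip() for u in parts[0].split(",") if u.strip()}
    groups = set()
    if len(parts) > 1:
        groups = {g.strip() for g in parts[1].split(",") if g.strip()}
    return users, groups, False


def is_authorized(service, user, user_groups):
    """Allow the call if the user or one of the user's groups is listed."""
    acl = SERVICE_ACLS.get(service)
    if acl is None:
        return False               # assumption: unknown services are denied
    users, groups, allow_all = parse_acl(acl)
    return allow_all or user in users or bool(groups & set(user_groups))


print(is_authorized("hdfs-client", "carol", {"analysts"}))
print(is_authorized("hdfs-client", "mallory", {"interns"}))
```

The check runs before any file permission is consulted, which is why it shrinks the attack surface: a user who is not listed for a service never reaches it.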

Threat: Eavesdropping

Page 8

• Root can watch network traffic
  • Need wire encryption

• HTTP SSL encryption
• Shuffle
• Data transfer
• RPC encryption

Threat: User accesses private data

Page 9

• Need Kerberos
• HDFS permissions are critical
  • ACLs add flexibility
  • Define user-to-group mapping
• Need group directories
  • Minimize user or global spaces
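The interplay of permissions, ACLs, the user-to-group mapping, and group directories can be sketched as follows. This is a simplified model, not HDFS code: it omits the ACL mask and default ACLs, and every name in it (`Inode`, `GROUPS_OF`, the users and groups) is hypothetical.

```python
from dataclasses import dataclass, field

READ, WRITE, EXECUTE = 4, 2, 1


@dataclass
class Inode:
    owner: str
    group: str
    mode: int                                        # e.g. 0o770
    acl_users: dict = field(default_factory=dict)    # named user -> bits
    acl_groups: dict = field(default_factory=dict)   # named group -> bits


# Hypothetical user-to-group mapping.
GROUPS_OF = {"alice": {"finance"}, "bob": {"marketing"}, "carol": {"audit"}}


def can_access(inode, user, want):
    """Simplified check: owner, then named user, then groups, then other."""
    owner_bits = (inode.mode >> 6) & 7
    group_bits = (inode.mode >> 3) & 7
    other_bits = inode.mode & 7
    if user == inode.owner:
        return (owner_bits & want) == want
    if user in inode.acl_users:
        return (inode.acl_users[user] & want) == want
    groups = GROUPS_OF.get(user, set())
    candidates = []
    if inode.group in groups:
        candidates.append(group_bits)
    candidates += [bits for g, bits in inode.acl_groups.items() if g in groups]
    if candidates:
        # Any matching group entry that grants the access is enough.
        return any((bits & want) == want for bits in candidates)
    return (other_bits & want) == want


# A group directory: closed to "other", with a read-only ACL entry for auditors.
finance_dir = Inode(owner="etl", group="finance", mode=0o770,
                    acl_groups={"audit": READ | EXECUTE})

print(can_access(finance_dir, "alice", WRITE))   # member of the owning group
print(can_access(finance_dir, "carol", WRITE))   # auditor, read-only entry
print(can_access(finance_dir, "bob", READ))      # falls through to "other"
```

Keeping data in group directories with no "other" bits means access depends entirely on the user-to-group mapping, so that mapping needs the same care as the permissions themselves.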

Threat: Physical access

Page 10

• Very Rare
  • Attacker can get to physical box
• Hopeless
  • Can remove hard drives
• Includes access to retired drives
  • Need raw file system encryption

Threat: Hadoop Admin in Cluster

Page 11

• Attacker is a Hadoop Admin
  • But not root
• Need HDFS Encryption Zones
  • Directory sub-tree encryption
  • Each file gets unique key
  • Client decrypts data

HDFS Encryption

Page 12

• Each zone has master key
• Each file gets unique sub-key
• HDFS stores sub-key encrypted with master key
• Client uses sub-key to decrypt file
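The key hierarchy above can be sketched as a small envelope-encryption model. This is an illustration only, not the HDFS code path. To stay within the Python standard library, the cipher is a stand-in keystream built from HMAC-SHA256, not AES, and must not be used for real data. The function names, the record layout, and the zone name are all hypothetical.

```python
import hashlib
import hmac
import os


def xor_keystream(key, iv, data):
    """Stand-in stream cipher (NOT AES): XOR data with an HMAC-based keystream."""
    out = bytearray()
    for start in range(0, len(data), 32):
        counter = (start // 32).to_bytes(8, "big")
        pad = hmac.new(key, iv + counter, hashlib.sha256).digest()
        out.extend(b ^ p for b, p in zip(data[start:start + 32], pad))
    return bytes(out)


# Master keys live with the key provider, never in the file system metadata.
_ZONE_MASTER_KEYS = {"finance-zone": os.urandom(32)}


def provider_wrap(zone, file_key):
    """Key provider encrypts a per-file sub-key with the zone master key."""
    iv = os.urandom(16)
    return iv, xor_keystream(_ZONE_MASTER_KEYS[zone], iv, file_key)


def provider_unwrap(zone, iv, wrapped_key):
    """Key provider decrypts the sub-key for an authorized client."""
    return xor_keystream(_ZONE_MASTER_KEYS[zone], iv, wrapped_key)


def write_file(zone, plaintext):
    """Create a unique sub-key, encrypt the data, store only the wrapped key."""
    file_key = os.urandom(32)
    key_iv, wrapped_key = provider_wrap(zone, file_key)
    data_iv = os.urandom(16)
    return {
        "zone": zone,
        "key_iv": key_iv,
        "wrapped_key": wrapped_key,      # what the file system stores
        "data_iv": data_iv,
        "ciphertext": xor_keystream(file_key, data_iv, plaintext),
    }


def read_file(record):
    """Client obtains the sub-key via the provider, then decrypts locally."""
    file_key = provider_unwrap(record["zone"], record["key_iv"], record["wrapped_key"])
    return xor_keystream(file_key, record["data_iv"], record["ciphertext"])


record = write_file("finance-zone", b"quarterly numbers")
print(read_file(record))
```

Because the stored record holds only the wrapped sub-key and ciphertext, someone with full access to the file system but not to the key provider sees nothing readable, which is the property the Hadoop-admin threat calls for.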

KeyProvider API

Page 13

• Key management
• Allows 3rd party plugins
• Named keys
• Key versions and key rolling
• Key Management Server is alpha
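Named keys, versions, and rolling can be illustrated with a tiny in-memory store. This is a sketch of the idea only and does not reproduce the KeyProvider API. The class, its method names, and the "name@N" version labels are assumptions made for this example.

```python
import os


class InMemoryKeyStore:
    """Hypothetical key store: each named key keeps every version ever created."""

    def __init__(self):
        self._versions = {}            # key name -> list of key material

    def create_key(self, name):
        if name in self._versions:
            raise ValueError(f"key {name!r} already exists")
        self._versions[name] = [os.urandom(16)]
        return f"{name}@0"

    def roll_key(self, name):
        """Add a new version; older versions stay readable for existing data."""
        self._versions[name].append(os.urandom(16))
        return self.current_version(name)

    def current_version(self, name):
        return f"{name}@{len(self._versions[name]) - 1}"

    def get_key_version(self, version_name):
        name, _, index = version_name.rpartition("@")
        return self._versions[name][int(index)]


store = InMemoryKeyStore()
first = store.create_key("finance-zone")
second = store.roll_key("finance-zone")
print(first, second)
```

Recording the version label next to each wrapped sub-key lets old files keep decrypting after a roll, while new files pick up the current version.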

Encryption Scheme

Page 14

• AES/CTR
• Supports append and seek
• Does not pad files
• Uses Initialization Vector
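Why a counter mode gives seek, append, and unpadded files can be shown with a short sketch. The keystream here is an HMAC-SHA256 stand-in chosen so the example runs on the standard library alone. It is not AES and is for illustration only. The function names and the 32-byte block size are assumptions of the sketch.

```python
import hashlib
import hmac

BLOCK = 32  # bytes of keystream per counter value in this stand-in


def keystream_block(key, iv, index):
    """Keystream for one counter value (stand-in for encrypting IV+counter with AES)."""
    return hmac.new(key, iv + index.to_bytes(8, "big"), hashlib.sha256).digest()


def crypt_at(key, iv, offset, data):
    """Encrypt or decrypt bytes that sit at `offset` in the file.

    Each byte depends only on key, IV, and its own position, so no earlier
    data has to be processed first and no padding is ever needed.
    """
    out = bytearray()
    for i, byte in enumerate(data):
        pos = offset + i
        pad = keystream_block(key, iv, pos // BLOCK)
        out.append(byte ^ pad[pos % BLOCK])
    return bytes(out)


key, iv = b"k" * 32, b"i" * 16
plain = b"0123456789" * 10
cipher = crypt_at(key, iv, 0, plain)

# Seek: decrypt ten bytes from the middle without touching the rest.
print(crypt_at(key, iv, 40, cipher[40:50]))

# Append: continue the keystream at the current end of file.
cipher += crypt_at(key, iv, len(plain), b" appended")
print(len(cipher) == len(plain) + len(b" appended"))
```

The same property is what makes the IV matter: reusing a key and IV pair for different data at the same offsets would reuse the keystream.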

Threat: User Deletes Hive tables

Page 15

• Configure Storage-Based Auth
  • Table permissions match HDFS
• Change owner and group on HDFS
  • Modify permissions on HDFS

Threat: User reads private columns

Page 16

• Some parts of file have different protection requirements.

• Configure SQL Standard Auth
  • Supports grant permissions
• Restrict permissions in HDFS
  • All normal users use Hive Server 2

Threat: User reads private columns

Page 17

• Some columns are more sensitive
  • But need custom file formats & UDF
• Need Column Encryption
  • Strong encryption of entire column
  • File format specific
  • Working on ORC in HIVE-4227

ORC File Layout

Page 18

Threat: User reads hidden values

Page 19

• Want to hide values in some columns.

• Need value encryption
  • Provided by 3rd parties
  • Typically done using UDFs

Threat: Shadow Security

Page 20

• Be very wary of non-public encryption schemes

• Cryptosystems need deep analysis
• Need to use AES or stronger.
• Avoid masking or format-preserving encryption that is not based on AES

Resources

Page 21

• Use the developer lists:
  • common-dev@hadoop.apache.org
  • dev@hive.apache.org
• Report security holes:
  • security@hadoop.apache.org
  • security@hive.apache.org

Thank You!

Page 22

Owen O’Malley

@owen_omalley

owen@hortonworks.com
