Issues Securing Big Data Mike Pluta, Sr Technical Architect | April 23, 2015

Upload: mike-pluta

Post on 16-Jul-2015


Disclaimer

The enclosed materials are highly sensitive, proprietary and confidential. Please use every effort to safeguard the confidentiality of these materials. Please do not copy, distribute, use, share or otherwise provide access to these materials to any person inside or outside DST Systems, Inc. without prior written approval.

This proprietary, confidential presentation is for general informational purposes only and does not constitute an agreement. By making this presentation available to you, we are not granting any express or implied rights or licenses under any intellectual property right.

If we permit your printing, copying or transmitting of content in this presentation, it is under a non-exclusive, non-transferable, limited license, and you must include or refer to the copyright notice contained in this document. You may not create derivative works of this presentation or its content without our prior written permission. Any reference in this presentation to another entity or its products or services is provided for convenience only and does not constitute an offer to sell, or the solicitation of an offer to buy, any products or services offered by such entity, nor does such reference constitute our endorsement, referral, or recommendation.

Our trademarks and service marks and those of third parties used in this presentation are the property of their respective owners.

© 2015 DST Systems, Inc. All rights reserved.

Big (or not) Data Security

• DST has established internal rules around the use of Big Data

• Data flowing into our data lake is partitioned by what we call Data Domains

• Each DST business unit is, in essence, at least one Data Domain

• Data Domains serve as the primary method of organizing our permissioning

What This Means

• By default, one Business Unit is not granted access to another’s data

• Business units make agreements with each other to access data for a specific purpose

• Internal Data Scientists are given cross-Business Unit access to data

• Management mandate: secure all data to which access has not been explicitly granted

Complexity

• These rules result in a very complex matrix of permissions

• Example below
  • Data Domain ‘Business Unit A’ may be accessed by Business Unit A and Business Unit D. Business Units B and C may not access this Data Domain

Data Domain         BU A   BU B   BU C   BU D
Business Unit A      X                    X
Business Unit B             X             X
Business Unit C             X      X      X
Third Party Data                   X      X
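The matrix can also be written down as a small lookup, which makes it easy to sanity-check before any permissions are touched. This is only a sketch: the function name `can_access` and the short labels (A, B, C, D, TPD) are made up for illustration, and the rows other than 'Business Unit A' are inferred from the ACL commands later in the deck.

```shell
#!/usr/bin/env bash
# Hypothetical helper: can_access DOMAIN UNIT
# Returns 0 if business unit UNIT may access Data Domain DOMAIN, 1 if not,
# and 2 for an unknown Data Domain. Labels are illustrative shorthand.
can_access() {
  local domain=$1 unit=$2 allowed
  case "$domain" in
    A)   allowed="A D" ;;
    B)   allowed="B D" ;;
    C)   allowed="B C D" ;;
    TPD) allowed="C D" ;;
    *)   return 2 ;;
  esac
  # Pad with spaces so that only whole labels match.
  case " $allowed " in
    *" $unit "*) return 0 ;;
    *)           return 1 ;;
  esac
}

for unit in A B C D; do
  if can_access A "$unit"; then
    echo "BU $unit: may access Data Domain 'Business Unit A'"
  else
    echo "BU $unit: denied"
  fi
done
```

Keeping the rules in one place like this gives you something to diff against the ACLs that actually end up on disk.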

Scenario

• Let’s deal with just text data on a file system on a Linux server

• The logical approach is to arrange directories to track the Data Domains

• For permissioning, create a group and a directory for each Data Domain
  • Assign the group ownership as appropriate
  • Set umask to 007 so that new files get u:rw-, g:rw-, o:--- permissions
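A quick way to convince yourself of what umask 007 does is to try it in a scratch directory. The sketch below assumes a Linux box with GNU `stat`; the names `workdir`, `sample_file` and `sample_dir` are placeholders, not part of the setup in this deck.

```shell
#!/usr/bin/env bash
# Scratch area so nothing under $HOME is touched.
workdir=$(mktemp -d)

# Note: this changes the umask for the rest of the current shell session.
umask 007

touch "$workdir/sample_file"   # files start from 666, minus the umask bits
mkdir "$workdir/sample_dir"    # directories start from 777, minus the umask bits

file_mode=$(stat -c '%a' "$workdir/sample_file")
dir_mode=$(stat -c '%a' "$workdir/sample_dir")
echo "file mode: $file_mode, directory mode: $dir_mode"

rm -rf "$workdir"
```

On a typical system this should report 660 for the file and 770 for the directory, i.e. nothing at all for "other".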

Details – Setup Users and Groups

sudo useradd buaadm
sudo passwd -d buaadm

sudo useradd bubadm
sudo passwd -d bubadm

sudo useradd bucadm
sudo passwd -d bucadm

sudo useradd budadm
sudo passwd -d budadm

sudo useradd tpdadm
sudo passwd -d tpdadm

sudo groupadd buag
sudo usermod -G buag buaadm

sudo groupadd bubg
sudo usermod -G bubg bubadm

sudo groupadd bucg
sudo usermod -G bucg bucadm

sudo groupadd budg
sudo usermod -G budg budadm

sudo groupadd tpdg
sudo usermod -G tpdg tpdadm

sudo usermod -a -G buag,bubg,bucg,budg,tpdg dt206031
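Since every Data Domain follows the same naming pattern, the user and group setup can be collapsed into a loop. The version below is a sketch with a dry-run switch: `DRY_RUN`, `run` and `setup_domain_accounts` are names invented here, and by default it only prints the commands so you can review them before anything is created.

```shell
#!/usr/bin/env bash
# DRY_RUN=1 (the default in this sketch) prints commands; DRY_RUN=0 executes them.
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

setup_domain_accounts() {
  local prefix
  for prefix in bua bub buc bud tpd; do
    run sudo useradd "${prefix}adm"                   # admin user for the domain
    run sudo passwd -d "${prefix}adm"                 # remove its password, as on the slide
    run sudo groupadd "${prefix}g"                    # group for the domain
    run sudo usermod -G "${prefix}g" "${prefix}adm"   # put the admin in that group
  done
  # The cross-domain user from the slide keeps its existing groups (-a).
  run sudo usermod -a -G buag,bubg,bucg,budg,tpdg dt206031
}

setup_domain_accounts
```

The per-domain ordering differs slightly from the slide (user and group are created together), which should not matter since each group exists before it is assigned.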

Details – Setup Files

umask 007

cd $HOME
mkdir data

cd data
mkdir bua
mkdir bub
mkdir buc
mkdir tpd

cd $HOME/data/bua
touch bua_file_1
touch bua_file_2
touch bua_file_3
touch bua_file_4
touch bua_file_5
sudo chown buaadm:buag *

cd $HOME/data/bub
touch bub_file_1
touch bub_file_2
touch bub_file_3
touch bub_file_4
touch bub_file_5
sudo chown bubadm:bubg *

cd $HOME/data/buc
touch buc_file_1
touch buc_file_2
touch buc_file_3
touch buc_file_4
touch buc_file_5
sudo chown bucadm:bucg *

cd $HOME/data/tpd
touch tpd_file_1
touch tpd_file_2
touch tpd_file_3
touch tpd_file_4
touch tpd_file_5
sudo chown tpdadm:tpdg *

cd $HOME/data
sudo chown buaadm:buag bua
sudo chown bubadm:bubg bub
sudo chown bucadm:bucg buc
sudo chown tpdadm:tpdg tpd
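The same file layout can be produced with two nested loops. This sketch builds the tree under a throwaway directory instead of $HOME/data (the `DATA_ROOT` variable is an assumption added here so it can be tried safely), and it prints the chown commands rather than running them because they need root.

```shell
#!/usr/bin/env bash
# DATA_ROOT is a hypothetical override; the slides use $HOME/data.
DATA_ROOT=${DATA_ROOT:-$(mktemp -d)/data}

umask 007

for domain in bua bub buc tpd; do
  mkdir -p "$DATA_ROOT/$domain"
  for i in 1 2 3 4 5; do
    touch "$DATA_ROOT/$domain/${domain}_file_$i"
  done
  # Ownership changes need root, so they are only printed in this sketch.
  echo "sudo chown -R ${domain}adm:${domain}g $DATA_ROOT/$domain"
done

file_count=$(find "$DATA_ROOT" -type f | wc -l)
echo "created $file_count files under $DATA_ROOT"
```

Using `chown -R` on each directory covers both the files and the directory itself, which the slide does in two separate steps.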

What It Looks Like


Complexity Redux

• The directory for the Data Domain ‘Business Unit A’ can be accessed by members of the ‘buag’ group

• How can we grant additional access to the ‘budg’ group, but still restrict other groups?

Data Domain         BU A   BU B   BU C   BU D
Business Unit A      X                    X
Business Unit B             X             X
Business Unit C             X      X      X
Third Party Data                   X      X

The Secret Sauce

• POSIX Access Control Lists (ACLs) are the answer to our dilemma
  • May not be enabled by default; needs to be enabled at the filesystem level
  • mount with the remount and acl options can enable them
  • mount -o remount,acl /dev/sda5 /home
  • See your system administrator to make the change permanent
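If you are not sure whether ACLs are already active on a filesystem, one pragmatic check is simply to try setting one on a scratch file. The function below is a sketch of that idea; `acl_probe` and its status strings are invented for this example, and it probes the temp directory by default rather than /home so that it can be run without special permissions.

```shell
#!/usr/bin/env bash
# acl_probe DIR: prints one of
#   enabled     an ACL entry could be set on a file in DIR
#   disabled    setfacl failed (e.g. filesystem mounted without ACL support)
#   no-setfacl  the setfacl tool is not installed
#   unknown     no scratch file could be created in DIR
acl_probe() {
  local dir=${1:-.} probe
  if ! command -v setfacl >/dev/null 2>&1; then
    echo "no-setfacl"
    return 0
  fi
  probe=$(mktemp "$dir/.acl_probe.XXXXXX" 2>/dev/null) || { echo "unknown"; return 0; }
  # A named-user entry forces a real ACL rather than plain mode bits.
  if setfacl -m "u:$(id -u):rw-" "$probe" 2>/dev/null; then
    echo "enabled"
  else
    echo "disabled"
  fi
  rm -f "$probe"
}

acl_status=$(acl_probe "${TMPDIR:-/tmp}")
echo "ACL support in ${TMPDIR:-/tmp}: $acl_status"
```

To check the filesystem from the slide, call it as `acl_probe /home` from an account that can write there.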

The Tools

• setfacl is used to set the ACL for a file or directory

• getfacl is used to query and list the ACL of a file or directory

• Our specific need:
  • In addition to rwx permissions for the group ‘buag’, add rwx permissions for the group ‘budg’ to the directory ‘bua’
  • In addition to rwx permissions for the group ‘bubg’, add rwx permissions for the group ‘budg’ to the directory ‘bub’
  • In addition to rwx permissions for the group ‘bucg’, add rwx permissions for the groups ‘bubg’ and ‘budg’ to the directory ‘buc’
  • In addition to rwx permissions for the group ‘tpdg’, add rwx permissions for the groups ‘bucg’ and ‘budg’ to the directory ‘tpd’

The Commands

• In addition to rwx permissions for the group ‘buag’, add rwx permissions for the group ‘budg’ to the directory and contents of ‘bua’
  • setfacl -R --set u::rwx,g::rwx,o::-,g:budg:rwx bua

• In addition to rwx permissions for the group ‘bubg’, add rwx permissions for the group ‘budg’ to the directory and contents of ‘bub’
  • setfacl -R --set u::rwx,g::rwx,o::-,g:budg:rwx bub

• In addition to rwx permissions for the group ‘bucg’, add rwx permissions for the groups ‘bubg’ and ‘budg’ to the directory and contents of ‘buc’
  • setfacl -R --set u::rwx,g::rwx,o::-,g:bubg:rwx,g:budg:rwx buc

• In addition to rwx permissions for the group ‘tpdg’, add rwx permissions for the groups ‘bucg’ and ‘budg’ to the directory and contents of ‘tpd’
  • setfacl -R --set u::rwx,g::rwx,o::-,g:bucg:rwx,g:budg:rwx tpd
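The four ACL specifications differ only in the list of extra groups, so they can be generated instead of typed by hand. A minimal sketch, with `build_acl_spec` and `print_acl_commands` as made-up helper names; it prints the setfacl commands for review rather than running them.

```shell
#!/usr/bin/env bash
# build_acl_spec [GROUP...]
# Starts from the base entries used on the slide (owner rwx, owning group rwx,
# other nothing) and appends one rwx entry per extra group.
build_acl_spec() {
  local spec="u::rwx,g::rwx,o::-" group
  for group in "$@"; do
    spec="$spec,g:$group:rwx"
  done
  printf '%s\n' "$spec"
}

# Print the command for each Data Domain directory.
print_acl_commands() {
  echo "setfacl -R --set $(build_acl_spec budg) bua"
  echo "setfacl -R --set $(build_acl_spec budg) bub"
  echo "setfacl -R --set $(build_acl_spec bubg budg) buc"
  echo "setfacl -R --set $(build_acl_spec bucg budg) tpd"
}

print_acl_commands
```

Adding a fifth Data Domain then becomes one more line, and the base entries stay consistent across all of them.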

Results


How To Hadoop It

• Hadoop HDFS supports POSIX-style ACLs (available since v2.4)

• Make sure to turn it on first, in hdfs-site.xml

<property>
  <name>dfs.namenode.acls.enabled</name>
  <value>true</value>
</property>

• Restart the NameNode

• Set an ACL (HDFS expects the long entry names and full three-character permissions)
hdfs dfs -setfacl -m user::rwx,group::rwx,other::---,group:budg:rwx /bua

• See the ACLs
hdfs dfs -getfacl /bua
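Before setting ACLs it can save some head-scratching to confirm that the property is actually picked up. One way, sketched below, is `hdfs getconf -confKey`; note that this reads the configuration visible to the client where you run it, which is an assumption about your setup and not a guarantee of what the running NameNode has loaded. The variable name `acls_enabled` is just for this example.

```shell
#!/usr/bin/env bash
# Report the configured value of dfs.namenode.acls.enabled, if the hdfs CLI is available.
if command -v hdfs >/dev/null 2>&1; then
  acls_enabled=$(hdfs getconf -confKey dfs.namenode.acls.enabled 2>/dev/null || echo "unknown")
else
  acls_enabled="hdfs-cli-not-found"
fi
# Fall back to "unknown" if the command produced no output at all.
acls_enabled=${acls_enabled:-unknown}

echo "dfs.namenode.acls.enabled = $acls_enabled"
```

If this prints anything other than true, setfacl calls against HDFS should be expected to fail until the setting is fixed and the NameNode restarted.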

Other Goodies

• Use a default ACL so that it is applied automatically to new children

sudo setfacl -d --set u::rwx,g::rwx,o::-,g:budg:rwx bua

sudo setfacl -d --set u::rwx,g::rwx,o::-,g:budg:rwx bub

sudo setfacl -d --set u::rwx,g::rwx,o::-,g:bubg:rwx,g:budg:rwx buc

sudo setfacl -d --set u::rwx,g::rwx,o::-,g:bucg:rwx,g:budg:rwx tpd

• And in Hadoop… (long entry names, full three-character permissions)

hadoop fs -setfacl --set default:user::rwx,default:group::rwx,default:other::---,default:group:budg:rwx bua

hadoop fs -setfacl --set default:user::rwx,default:group::rwx,default:other::---,default:group:budg:rwx bub

hadoop fs -setfacl --set default:user::rwx,default:group::rwx,default:other::---,default:group:bubg:rwx,default:group:budg:rwx buc

hadoop fs -setfacl --set default:user::rwx,default:group::rwx,default:other::---,default:group:bucg:rwx,default:group:budg:rwx tpd
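A default ACL specification for Hadoop is just the access specification with a default prefix on every entry, so it can be derived mechanically. The sketch below uses the long-form entry names (user, group, other, default) and three-character permissions as they appear in the Hadoop documentation; treating the abbreviated forms as unsupported there is a cautious assumption. `to_default_spec` is a made-up helper name, and the command is printed, not executed.

```shell
#!/usr/bin/env bash
# to_default_spec SPEC
# Prefix every comma-separated ACL entry in SPEC with "default:".
to_default_spec() {
  printf '%s\n' "$1" | sed -e 's/^/default:/' -e 's/,/,default:/g'
}

access_spec="user::rwx,group::rwx,other::---,group:budg:rwx"
default_spec=$(to_default_spec "$access_spec")

echo "hadoop fs -setfacl --set $default_spec bua"
```

Deriving the default entries from the access entries keeps the two from drifting apart when a group is added later.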

Results With Default ACLs


Last Extra Bits

• Don’t forget about the sticky bit
  • Makes it so that only root, the directory owner, or a file’s own owner can delete that file

sudo chmod +t bua

• Use the setgid bit so that new files in a directory get the same group owner as the directory
  • Very handy when paired with default ACLs

sudo chmod g+s bua
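Both bits are easy to try out on a scratch directory first. The sketch below assumes Linux with GNU `stat`; `workdir` and `new_file` are placeholder names, and no sudo is needed because the directory belongs to whoever runs it.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
umask 007

mkdir "$workdir/bua"          # starts out as mode 770
chmod +t "$workdir/bua"       # sticky bit: restricts who may delete entries
chmod g+s "$workdir/bua"      # setgid bit: new entries inherit the directory's group

dir_mode=$(stat -c '%a' "$workdir/bua")

touch "$workdir/bua/new_file"
dir_group=$(stat -c '%G' "$workdir/bua")
file_group=$(stat -c '%G' "$workdir/bua/new_file")

echo "directory mode: $dir_mode"
echo "directory group: $dir_group, new file group: $file_group"

rm -rf "$workdir"
```

The leading 3 in the reported mode is the sticky bit (1) plus the setgid bit (2); the new file should show the same group as its directory.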
