Using Classification for Data Security and Data Management Clyde Law Software Design Engineer Microsoft Corporation SVR02


Page 1: Using Classification for Data Security and Data Management

Using Classification for Data Security and Data Management
Clyde Law
Software Design Engineer
Microsoft Corporation

SVR02

Page 2: Using Classification for Data Security and Data Management

Agenda
> Motivation
> File Classification Infrastructure (FCI) Overview and Demo
> FCI Architecture
> Retrieving Properties from Files
> Custom File Management Tasks
> FCI Extensions
> Extensibility Demo

Page 3: Using Classification for Data Security and Data Management

Data Management Challenges

Replication

Backup

HSM

Security

Archive

Encryption

Expiration

Storage Growth

Storage Costs

Compliance

Security and Information Leakage

Increasing data management needs with disparate data management products

Page 4: Using Classification for Data Security and Data Management

Managing Data by Location

Business

IT

Need per-project file share

Ensure business secret files do not leak out

Back up files with personal information to encrypted store

Expire low business impact files created over three years ago and not touched in the past year

Page 5: Using Classification for Data Security and Data Management

Managing Data using Classification

Mitigate costs and risks

Manage data based on business value

Classify data

Apply policy

File Classification Infrastructure

Classify Manage Report Extend

Page 6: Using Classification for Data Security and Data Management

Introducing the File Classification Infrastructure

Clyde Law
Software Design Engineer
File Server Management Team

demo

Page 7: Using Classification for Data Security and Data Management

Benefits of Classification

Manage Risk
> Find sensitive files on public servers
> Watermark documents with confidential data
> Encrypt backups of files with personal information
> Apply rights management to high-secrecy files
> Comply with retention policies

Reduce Cost
> Optimize backup SLAs
> Replicate only business-related documents
> Expire files to reduce storage purchasing needs
> Move files to less expensive storage

Available in Windows

Extend through IT or ISV solutions

Page 8: Using Classification for Data Security and Data Management

FCI Architecture
Classification Pipeline
> Designed to enable an ecosystem around classification
> Comprehensive API for solutions
> Extensible classification infrastructure

Pipeline stages:
> Discover Data
> Extract Existing Classification Properties
> Classify Data
> Store Classification Properties
> Apply Policies Based on Classification

File Classification Extensibility Points

Get/Set Property API for external applications

Page 9: Using Classification for Data Security and Data Management

Get/Set Property API
> Consume properties by specifying files
> Automation-compatible COM API
  > Works with native code, managed code, or scripts
> Available through classification manager object
> Set is meant for manual classification
  > Use extensibility modules instead to extend rule-based automatic classification

Page 10: Using Classification for Data Security and Data Management

Get/Set Property API
Using PowerShell

# Get an instance of the Classification Manager
$cm = New-Object -ComObject Fsrm.FsrmClassificationManager

# Enumerate and display all properties associated with a file
$props = $cm.EnumFileProperties("P:\foo\bar.txt", 0)
foreach ($prop in $props) {
    Write-Host $prop.Name = $prop.Value
}

# Get and display the value of the "Secrecy" property
$secrecyProp = $cm.GetFileProperty("P:\foo\bar.txt", "Secrecy", 0)
Write-Host $secrecyProp.Value

# Set the value of the "Secrecy" property to "High"
$cm.SetFileProperty("P:\foo\bar.txt", "Secrecy", "High")

Page 11: Using Classification for Data Security and Data Management

Get/Set Property API
Using native C++

// Get an instance of the Classification Manager
CComPtr<IFsrmClassificationManager> spClassMgr;
HRESULT hr = CoCreateInstance(CLSID_FsrmClassificationManager,
                              NULL,
                              CLSCTX_LOCAL_SERVER,
                              __uuidof(IFsrmClassificationManager),
                              reinterpret_cast<void**>(&spClassMgr));  // cast needed: T** does not convert to void**

// Get the "PII" property
CComBSTR bstrFilename(L"P:\\foo\\bar.txt");
CComBSTR bstrPropName(L"PII");
CComPtr<IFsrmProperty> spPIIProp;
hr = spClassMgr->GetFileProperty(bstrFilename,
                                 bstrPropName,
                                 FsrmGetFilePropertyOptions_None,  // same as 0, but typed for C++
                                 &spPIIProp);

Page 12: Using Classification for Data Security and Data Management

Custom File Management Tasks
> Apply policies by running custom commands on files that match specified criteria
  > Faster than scanning and retrieving properties yourself
> No control on file order
> Task runs command in new process per file
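The criteria a task matches on can be pictured with the expiration policy from the motivation slide (low business impact, created over three years ago, untouched for the past year). The following is only a portable sketch of that selection step, not the FSRM API: `FileRecord`, `matchesExpirationPolicy`, `selectForTask`, and the "BusinessImpact" property name are all hypothetical names chosen for illustration.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified view of a classified file (not an FSRM type).
struct FileRecord {
    std::string path;
    int daysSinceCreation;    // age of the file in days
    int daysSinceLastAccess;  // days since the file was last touched
    std::map<std::string, std::string> properties;  // classification properties
};

// Returns true when the file is marked low business impact, was created more
// than three years ago, and has not been accessed in the past year.
// "BusinessImpact" is an assumed property name.
bool matchesExpirationPolicy(const FileRecord& file) {
    auto it = file.properties.find("BusinessImpact");
    if (it == file.properties.end() || it->second != "Low") {
        return false;
    }
    return file.daysSinceCreation > 3 * 365 && file.daysSinceLastAccess > 365;
}

// Collects the paths a task would act on. In FCI the task then runs the
// configured command once per matching file, in no guaranteed order.
std::vector<std::string> selectForTask(const std::vector<FileRecord>& files) {
    std::vector<std::string> selected;
    for (const FileRecord& file : files) {
        if (matchesExpirationPolicy(file)) {
            selected.push_back(file.path);
        }
    }
    return selected;
}
```

In this model the caller would hand each selected path to its own command. The point made on the slide is that FCI does the scan and the property retrieval for you, so only the command itself has to be written.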

Page 13: Using Classification for Data Security and Data Management

FCI Extensions
> Classification modules
  > Determine values of properties to apply to files
  > Available in Windows:
    > Folder classifier – assigns properties based on file location
    > Content classifier – assigns properties based on string and regular expression matches in file content
> Storage modules
  > Supply and persist properties associated with files
  > Available in Windows:
    > System storage module for all file types
      > Uses NTFS named stream to store properties
      > Functions as a cache for fast retrieval
    > Office 97-2003 and Office 2007 in-file storage
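The two built-in classifier styles can be pictured as two kinds of rule checks. This is a rough sketch in portable C++ and not how the Windows classifiers are implemented: `folderRuleApplies` and `contentRuleApplies` are hypothetical helpers, and the path comparison is case-sensitive here only to keep the example short.

```cpp
#include <regex>
#include <string>

// Folder-style rule: the property applies when the file lives under a folder.
// Assumes backslash-separated paths and a folder without a trailing separator.
bool folderRuleApplies(const std::string& filePath, const std::string& folder) {
    if (filePath.size() <= folder.size()) {
        return false;
    }
    return filePath.compare(0, folder.size(), folder) == 0 &&
           filePath[folder.size()] == '\\';
}

// Content-style rule: the property applies when the file content contains a
// match for the given regular expression (ECMAScript syntax by default).
bool contentRuleApplies(const std::string& content, const std::string& pattern) {
    return std::regex_search(content, std::regex(pattern));
}
```

For example, a folder rule on `P:\hr` would match `P:\hr\salaries.txt` but not `P:\hrx\a.txt`, and a content rule with the pattern `[0-9]{3}-[0-9]{2}-[0-9]{4}` would flag text containing a number shaped like 123-45-6789.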

Page 14: Using Classification for Data Security and Data Management

Pipeline Anatomy

[Diagram: a Classification Runtime Process and separate Hosting Processes]

Modules, in pipeline order:
> Scanner: gets basic file properties
> Office Storage [Load]: loads embedded properties
> Folder Classifier: classifies based on location
> Content Classifier: classifies based on content
> Office Storage [Save]: saves embedded properties
> Reporting Engine: adds files to report

Pipeline stages:
> Discover Data
> Extract Properties
> Classify Data
> Store Properties
> Apply Policies

> Streams can cross processes
  > Security checks are performed on cross-process data transfers
> Most modules are hosted within a separate process
> Each module passes streams of property bags to the next one
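The hand-off between modules can be modeled as a chain of functions that each receive a property bag, optionally assign properties, and pass it on. The sketch below is a deliberately simplified, single-process model: `PropertyBag`, `Module`, and `runPipeline` are hypothetical names, and it pushes one bag at a time through every module rather than streaming bags between hosting processes as FCI does.

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a property bag: file path plus property values.
struct PropertyBag {
    std::string path;
    std::map<std::string, std::string> properties;
};

// A module reads the bag, may assign properties, and hands it on.
using Module = std::function<void(PropertyBag&)>;

// Runs every bag through the modules in order, so later modules can see
// the properties that earlier modules assigned.
void runPipeline(std::vector<PropertyBag>& bags,
                 const std::vector<Module>& modules) {
    for (PropertyBag& bag : bags) {
        for (const Module& module : modules) {
            module(bag);
        }
    }
}
```

A location-based module placed first could set a "PII" property, and a later module could then raise "Secrecy" only for bags that already carry it, which mirrors how downstream stages build on values assigned upstream.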

Page 15: Using Classification for Data Security and Data Management

Custom Pipeline Modules
> Register module by creating a module definition through the Classification Manager
  > Typically once during installation
> Module is a COM server that implements IFsrmClassifierModuleImplementation or IFsrmStorageModuleImplementation
  > Both native and managed are supported
> Pipeline calls OnLoad to initialize module
  > Module needs to return connector object to connect hosting process
  > Instructions in MSDN documentation

Page 16: Using Classification for Data Security and Data Management

Classifier Modules
Models for classification
> Yes/no
  > Pipeline asks module whether or not a property value applies to the file
> Explicit value
  > Pipeline asks module what value to assign to a specified property
> Controlled by NeedsExplicitValue flag in module definition

Page 17: Using Classification for Data Security and Data Management

Classifier Modules
Classification session call sequence
> UseRulesAndDefinitions called at start of session
  > Module can choose to cache these rules
> For each file:
  > OnBeginFile – specifies the property bag of the file to classify and the rules to classify it with
    > Module can choose to process file right away
  > For each rule:
    > Yes/no – DoesPropertyValueApply
      > Return TRUE or FALSE
    > Explicit value – GetPropertyValueToApply
      > Return value to apply, or return error code FSRM_E_NO_PROPERTY_VALUE if no value should be applied
  > OnEndFile – indicates end of file processing
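The call sequence above can be traced with a small stand-in class. This is a sketch only: the real module is a COM server implementing IFsrmClassifierModuleImplementation, whereas `KeywordClassifier`, `Rule`, `FileBag`, and `runSession` are hypothetical plain C++ types with simplified signatures, and the keyword test is an invented classification rule.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical simplified rule: assigns `value` to `property` when the file
// content contains `keyword`.
struct Rule {
    std::string property;
    std::string value;
    std::string keyword;
};

// Hypothetical stand-in for the property bag of the file being classified.
struct FileBag {
    std::string path;
    std::string content;
};

class KeywordClassifier {
public:
    // Called once at the start of a session; the module may cache the rules.
    void useRulesAndDefinitions(const std::vector<Rule>& rules) { rules_ = rules; }

    // Called per file; remembers which file subsequent rule calls refer to.
    void onBeginFile(const FileBag& bag) { current_ = &bag; }

    // Yes/no model: does the rule's property value apply to the current file?
    bool doesPropertyValueApply(const Rule& rule) const {
        return current_ != nullptr &&
               current_->content.find(rule.keyword) != std::string::npos;
    }

    // Explicit value model: writes the value to apply and returns true, or
    // returns false for the FSRM_E_NO_PROPERTY_VALUE case.
    bool getPropertyValueToApply(const Rule& rule, std::string& value) const {
        if (!doesPropertyValueApply(rule)) {
            return false;
        }
        value = rule.value;
        return true;
    }

    // Called when the pipeline is done with the current file.
    void onEndFile() { current_ = nullptr; }

private:
    std::vector<Rule> rules_;
    const FileBag* current_ = nullptr;
};

// Drives one session in the order listed above, using the yes/no model.
// Returns, per file path, the properties that ended up assigned.
std::map<std::string, std::map<std::string, std::string>> runSession(
    KeywordClassifier& module, const std::vector<Rule>& rules,
    const std::vector<FileBag>& files) {
    std::map<std::string, std::map<std::string, std::string>> result;
    module.useRulesAndDefinitions(rules);
    for (const FileBag& file : files) {
        module.onBeginFile(file);
        for (const Rule& rule : rules) {
            if (module.doesPropertyValueApply(rule)) {
                result[file.path][rule.property] = rule.value;
            }
        }
        module.onEndFile();
    }
    return result;
}
```

Which of the two per-rule calls the pipeline makes depends on the NeedsExplicitValue flag in the module definition. The sketch exposes both only so the difference in return shape is visible side by side.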

Page 18: Using Classification for Data Security and Data Management

Storage Modules
> Supply or persist properties associated with file
> Two types supported: InFile and Database
  > Cache is reserved for the built-in System Cache Module
> Capabilities field in module definition determines whether module is instantiated for loading and/or saving properties
  > Separate instances created for load and save
> LoadProperties – provide property values by calling SetFileProperty in the property bag
> SaveProperties – retrieve properties in the property bag and persist them
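The load and save halves of a storage module can be sketched with an in-memory map standing in for a database. Everything here is an illustrative assumption rather than the FSRM interface: `Bag`, `InMemoryStorageModule`, and the lower-case method names are hypothetical, and a single object serves both directions even though FCI creates separate instances for load and save.

```cpp
#include <map>
#include <string>

using Properties = std::map<std::string, std::string>;

// Hypothetical stand-in for the property bag handed to a storage module.
struct Bag {
    std::string path;
    Properties properties;

    // Mirrors the idea of a module supplying a value to the bag.
    void setFileProperty(const std::string& name, const std::string& value) {
        properties[name] = value;
    }
};

// Sketch of a database-type storage module backed by an in-memory map.
class InMemoryStorageModule {
public:
    // Load: supply stored values by setting them in the property bag.
    void loadProperties(Bag& bag) const {
        auto it = store_.find(bag.path);
        if (it == store_.end()) {
            return;  // nothing stored for this file
        }
        for (const auto& entry : it->second) {
            bag.setFileProperty(entry.first, entry.second);
        }
    }

    // Save: read the properties in the bag and persist them.
    void saveProperties(const Bag& bag) { store_[bag.path] = bag.properties; }

private:
    std::map<std::string, Properties> store_;  // keyed by file path
};
```

Saving a bag for a path and then loading into a fresh bag for the same path returns the same properties, which is the round trip a real module would provide against its own backing store.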

Page 19: Using Classification for Data Security and Data Management

Accessing File Contents
> Modules should never open files directly
  > May not have proper permissions
  > Stream state may not be consistent with metadata
> Use GetFileStreamInterface in the property bag
  > Supports ILockBytes and IStream interfaces
  > Takes care of getting the right permissions
  > Ensures last access and last modified times are unchanged
  > Ensures changes are properly committed (for storage modules)

Page 20: Using Classification for Data Security and Data Management

PowerShell Host Classifier
> Included in Windows SDK
> Presents itself as a classifier to FCI that hosts PowerShell scripts to do the actual classification
> Create custom classifiers without compiling and registering your own modules
> Simpler to build, but has slower performance
> Intended for in-house IT solutions and prototyping
> More information at http://blogs.technet.com/filecab/archive/2009/08/14/using-windows-powershell-scripts-for-file-classification.aspx

Page 21: Using Classification for Data Security and Data Management

Putting it all together

Clyde Law
Software Design Engineer
File Server Management Team

demo

Page 22: Using Classification for Data Security and Data Management

Developer Opportunities
Call to action
> FCI provides many avenues to be part of end-to-end data lifecycle management solutions
  > Classifiers – provide classification based on content, identity, regulations, etc.
  > Data management products – leverage classification in solutions for backup, archival, leakage prevention, etc.
  > Storage modules – provide property storage for new file formats
> Flexible COM API
  > Native code, managed code, or scripting
  > PowerShell support enables fast deployment of solutions

Page 23: Using Classification for Data Security and Data Management

Additional Resources
> FCI Overview
  > http://microsoft.com/fci/
> Microsoft TechNet
  > http://technet.microsoft.com/en-us/library/dd758765%28WS.10%29.aspx
  > http://technet.microsoft.com/en-us/library/dd758756%28WS.10%29.aspx
> Developing for FCI
  > Windows SDK
    > http://msdn.microsoft.com/en-us/windows/bb980924.aspx
  > FSRM API Documentation on MSDN
    > http://msdn.microsoft.com/en-us/library/bb972746%28VS.85%29.aspx
  > FCI Code Gallery
    > http://code.msdn.microsoft.com/fci/

Page 24: Using Classification for Data Security and Data Management

Contact Us
> Storage Team Blog
  > http://blogs.technet.com/filecab/default.aspx
> E-mail
  > FCI Team
    > [email protected]
  > Clyde Law, Developer
    > [email protected]
  > Matthias Wollnik, Program Manager
    > [email protected]

Page 25: Using Classification for Data Security and Data Management

YOUR FEEDBACK IS IMPORTANT TO US! Please fill out session evaluation forms online at MicrosoftPDC.com

Page 26: Using Classification for Data Security and Data Management

Learn More On Channel 9
> Expand your PDC experience through Channel 9
> Explore videos, hands-on labs, sample code and demos through the new Channel 9 training courses

channel9.msdn.com/learn

Built by Developers for Developers….

Page 27: Using Classification for Data Security and Data Management

© 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

Page 28: Using Classification for Data Security and Data Management

Appendix
Property aggregation and conflict resolution
> Values from Storage:
  > In-file > Database > Cache
> Values from Classification Rules:
  > Default values applied once if not already present
  > Can also choose to explicitly aggregate or overwrite existing values
> Ordered lists, Booleans, Multi-choice lists, and Multi-strings can be aggregated

[Default] Apply only if there is no value stored in the file

[Consider Existing] Apply but aggregate with values from Storage and Default rules

[Ignore Existing] Apply and ignore (replace) values from Storage and Default rules
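The three rule modes can be sketched as a single function over a set of values. This is a simplified model under stated assumptions: `RuleMode`, `Values`, and `applyRule` are hypothetical names, the property is treated as a multi-choice list, and "aggregate" is modeled as a set union. The slides do not say how each property type is actually combined, so the union is an assumption.

```cpp
#include <set>
#include <string>

// The three modes named in the legend above.
enum class RuleMode { Default, ConsiderExisting, IgnoreExisting };

// Values of a multi-choice property, modeled as a set of strings.
using Values = std::set<std::string>;

// Applies one rule's values to the values already known for a property
// and returns the resulting value set.
Values applyRule(const Values& existing, const Values& fromRule, RuleMode mode) {
    switch (mode) {
    case RuleMode::Default:
        // Apply only if no value is present yet.
        return existing.empty() ? fromRule : existing;
    case RuleMode::ConsiderExisting: {
        // Aggregate with what is already there (assumed here to be a union).
        Values merged = existing;
        merged.insert(fromRule.begin(), fromRule.end());
        return merged;
    }
    case RuleMode::IgnoreExisting:
        // Replace whatever was there.
        return fromRule;
    }
    return existing;  // unreachable for valid modes
}
```

With a stored value of {"HR"} and a rule supplying {"Finance"}, the Default mode keeps {"HR"}, Consider Existing yields both values, and Ignore Existing yields only {"Finance"}.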

Page 29: Using Classification for Data Security and Data Management

Appendix
Property bags
> Property bag object holds the metadata of a file being classified
> The object flows through the classification pipeline
> Each pipeline module can assign property values

Property Bag
> File System Info: Relative Path, Creation Time, etc.
> Properties
> Messages
> Read Stream, Write Stream
> Current Context: Module Type, Rule, etc.

Property
> Name
> Type
> Assigned Values and Sources
  > From Storage Modules
  > From Default and CE Rules
  > From IE Rules
> Aggregated Value
> Aggregated Sources

Page 30: Using Classification for Data Security and Data Management

Appendix
Connecting a module to the pipeline

STDMETHODIMP CCustomModule::OnLoad(
    __in IFsrmPipelineModuleDefinition *pDefinition,
    __deref_out IFsrmPipelineModuleConnector **ppModuleConnector
    )
{
    HRESULT hr = S_OK;

    ...perform module initialization...

    // Create the connector
    CComPtr<IFsrmPipelineModuleConnector> spConnector;
    hr = CoCreateInstance(CLSID_FsrmPipelineModuleConnector,
                          NULL,
                          CLSCTX_LOCAL_SERVER,
                          __uuidof(IFsrmPipelineModuleConnector),
                          reinterpret_cast<void**>(&spConnector));
    ...handle any errors...

    // Get this module's implementation interface
    CComQIPtr<IFsrmPipelineModuleImplementation> spModuleImpl = GetControllingUnknown();
    if (spModuleImpl == NULL) ...handle error...

    // Bind the connector to the module
    hr = spConnector->Bind(pDefinition, spModuleImpl);
    ...handle any errors...

    // Return the connector; Detach hands ownership to the caller
    *ppModuleConnector = spConnector.Detach();

    return hr;
}

Page 31: Using Classification for Data Security and Data Management