Using Classification for Data Security and Data Management
SVR02. Clyde Law, Software Design Engineer, Microsoft Corporation.
![Page 1: Using Classification for Data Security and Data Management](https://reader035.vdocuments.net/reader035/viewer/2022081502/568168ba550346895ddfa5ec/html5/thumbnails/1.jpg)
Using Classification for Data Security and Data Management
Clyde Law
Software Design Engineer
Microsoft Corporation
SVR02
![Page 2: Using Classification for Data Security and Data Management](https://reader035.vdocuments.net/reader035/viewer/2022081502/568168ba550346895ddfa5ec/html5/thumbnails/2.jpg)
Agenda
> Motivation
> File Classification Infrastructure (FCI) Overview and Demo
> FCI Architecture
> Retrieving Properties from Files
> Custom File Management Tasks
> FCI Extensions
> Extensibility Demo
![Page 3: Using Classification for Data Security and Data Management](https://reader035.vdocuments.net/reader035/viewer/2022081502/568168ba550346895ddfa5ec/html5/thumbnails/3.jpg)
Data Management Challenges
Replication
Backup
HSM
Security
Archive
Encryption
Expiration
Storage Growth
Storage Costs
Compliance
Security and Information Leakage
Increasing data management needs with disparate data management products
![Page 4: Using Classification for Data Security and Data Management](https://reader035.vdocuments.net/reader035/viewer/2022081502/568168ba550346895ddfa5ec/html5/thumbnails/4.jpg)
Managing Data by Location
Business
IT
Need per-project file share
Ensure business secret files do not leak out
Back up files with personal information to encrypted store
Expire low business impact files created over three years ago and not touched in the past year
![Page 5: Using Classification for Data Security and Data Management](https://reader035.vdocuments.net/reader035/viewer/2022081502/568168ba550346895ddfa5ec/html5/thumbnails/5.jpg)
Managing Data using Classification
Mitigate costs and risks
Manage data based on business value
Classify data
Apply policy
File Classification Infrastructure
Classify Manage Report Extend
![Page 6: Using Classification for Data Security and Data Management](https://reader035.vdocuments.net/reader035/viewer/2022081502/568168ba550346895ddfa5ec/html5/thumbnails/6.jpg)
Introducing the File Classification Infrastructure
Clyde Law, Software Design Engineer, File Server Management Team
demo
![Page 7: Using Classification for Data Security and Data Management](https://reader035.vdocuments.net/reader035/viewer/2022081502/568168ba550346895ddfa5ec/html5/thumbnails/7.jpg)
Benefits of Classification
Manage Risk
> Find sensitive files on public servers
> Watermark documents with confidential data
> Encrypt backups of files with personal information
> Apply rights management to high-secrecy files
> Comply with retention policies
Reduce Cost
> Optimize backup SLAs
> Replicate only business-related documents
> Expire files to reduce storage purchasing needs
> Move files to less expensive storage
Available in Windows
Extend through IT or ISV solutions
![Page 8: Using Classification for Data Security and Data Management](https://reader035.vdocuments.net/reader035/viewer/2022081502/568168ba550346895ddfa5ec/html5/thumbnails/8.jpg)
FCI Architecture
Classification Pipeline
> Designed to enable an ecosystem around classification
> Comprehensive API for solutions
> Extensible classification infrastructure
Discover Data → Extract Existing Classification Properties → Classify Data → Store Classification Properties → Apply Policies Based on Classification
File Classification Extensibility Points
Get/Set Property API for external applications
![Page 9: Using Classification for Data Security and Data Management](https://reader035.vdocuments.net/reader035/viewer/2022081502/568168ba550346895ddfa5ec/html5/thumbnails/9.jpg)
Get/Set Property API
> Consume properties by specifying files
> Automation-compatible COM API
  > Works with native code, managed code, or scripts
> Available through classification manager object
> Set is meant for manual classification
  > Use extensibility modules instead to extend rule-based automatic classification
![Page 10: Using Classification for Data Security and Data Management](https://reader035.vdocuments.net/reader035/viewer/2022081502/568168ba550346895ddfa5ec/html5/thumbnails/10.jpg)
Get/Set Property API
Using PowerShell

    # Get an instance of the Classification Manager
    $cm = New-Object -ComObject Fsrm.FsrmClassificationManager

    # Enumerate and display all properties associated with a file
    $props = $cm.EnumFileProperties("P:\foo\bar.txt", 0)
    foreach ($prop in $props) {
        Write-Host $prop.Name = $prop.Value
    }

    # Get and display the value of the "Secrecy" property
    $secrecyProp = $cm.GetFileProperty("P:\foo\bar.txt", "Secrecy", 0)
    Write-Host $secrecyProp.Value

    # Set the value of the "Secrecy" property to "High"
    $cm.SetFileProperty("P:\foo\bar.txt", "Secrecy", "High")
![Page 11: Using Classification for Data Security and Data Management](https://reader035.vdocuments.net/reader035/viewer/2022081502/568168ba550346895ddfa5ec/html5/thumbnails/11.jpg)
Get/Set Property API
Using native C++

    // Get an instance of the Classification Manager
    CComPtr<IFsrmClassificationManager> spClassMgr;
    HRESULT hr = CoCreateInstance(CLSID_FsrmClassificationManager,
                                  NULL,
                                  CLSCTX_LOCAL_SERVER,
                                  __uuidof(IFsrmClassificationManager),
                                  reinterpret_cast<void**>(&spClassMgr));
    // Check FAILED(hr) before using spClassMgr

    // Get the "PII" property
    CComBSTR bstrFilename(L"P:\\foo\\bar.txt");
    CComBSTR bstrPropName(L"PII");
    CComPtr<IFsrmProperty> spPIIProp;
    hr = spClassMgr->GetFileProperty(bstrFilename,
                                     bstrPropName,
                                     FsrmGetFilePropertyOptions_None,
                                     &spPIIProp);
![Page 12: Using Classification for Data Security and Data Management](https://reader035.vdocuments.net/reader035/viewer/2022081502/568168ba550346895ddfa5ec/html5/thumbnails/12.jpg)
Custom File Management Tasks
> Apply policies by running custom commands on files that match specified criteria
> Faster than scanning and retrieving properties yourself
> No control on file order
> Task runs command in new process per file
![Page 13: Using Classification for Data Security and Data Management](https://reader035.vdocuments.net/reader035/viewer/2022081502/568168ba550346895ddfa5ec/html5/thumbnails/13.jpg)
FCI Extensions
> Classification modules
  > Determine values of properties to apply to files
  > Available in Windows:
    > Folder classifier – assigns properties based on file location
    > Content classifier – assigns properties based on string and regular expression matches in file content
> Storage modules
  > Supply and persist properties associated with files
  > Available in Windows:
    > System storage module for all file types
      > Uses NTFS named stream to store properties
      > Functions as a cache for fast retrieval
    > Office 97-2003 and Office 2007 in-file storage
![Page 14: Using Classification for Data Security and Data Management](https://reader035.vdocuments.net/reader035/viewer/2022081502/568168ba550346895ddfa5ec/html5/thumbnails/14.jpg)
Pipeline Anatomy
Classification Runtime Process and Hosting Processes
> Scanner: gets basic file properties
> Office Storage [Load]: loads embedded properties
> Folder Classifier: classifies based on location
> Content Classifier: classifies based on content
> Office Storage [Save]: saves embedded properties
> Reporting Engine: adds files to report
Discover Data → Extract Properties → Classify Data → Store Properties → Apply Policies
> Most modules are hosted within a separate process
> Each module passes streams of property bags to the next one
> Streams can cross processes
  > Security checks are performed on cross-process data transfers
![Page 15: Using Classification for Data Security and Data Management](https://reader035.vdocuments.net/reader035/viewer/2022081502/568168ba550346895ddfa5ec/html5/thumbnails/15.jpg)
Custom Pipeline Modules
> Register module by creating a module definition through the Classification Manager
  > Typically once during installation
> Module is a COM server that implements IFsrmClassifierModuleImplementation or IFsrmStorageModuleImplementation
  > Both native and managed are supported
> Pipeline calls OnLoad to initialize module
  > Module needs to return connector object to connect hosting process
  > Instructions in MSDN documentation
![Page 16: Using Classification for Data Security and Data Management](https://reader035.vdocuments.net/reader035/viewer/2022081502/568168ba550346895ddfa5ec/html5/thumbnails/16.jpg)
Classifier Modules
Models for classification
> Yes/no
  > Pipeline asks module whether or not a property value applies to the file
> Explicit value
  > Pipeline asks module what value to assign to a specified property
> Controlled by NeedsExplicitValue flag in module definition
![Page 17: Using Classification for Data Security and Data Management](https://reader035.vdocuments.net/reader035/viewer/2022081502/568168ba550346895ddfa5ec/html5/thumbnails/17.jpg)
Classifier Modules
Classification session call sequence
> UseRulesAndDefinitions called at start of session
  > Module can choose to cache these rules
> For each file:
  > OnBeginFile – specifies the property bag of the file to classify and the rules to classify it with
    > Module can choose to process file right away
  > For each rule:
    > Yes/no – DoesPropertyValueApply
      > Return TRUE or FALSE
    > Explicit value – GetPropertyValueToApply
      > Return value to apply, or return error code FSRM_E_NO_PROPERTY_VALUE if no value should be applied
  > OnEndFile – indicates end of file processing
![Page 18: Using Classification for Data Security and Data Management](https://reader035.vdocuments.net/reader035/viewer/2022081502/568168ba550346895ddfa5ec/html5/thumbnails/18.jpg)
Storage Modules
> Supply or persist properties associated with file
> Two types supported: InFile and Database
  > Cache is reserved for the built-in System Cache Module
> Capabilities field in module definition determines whether module is instantiated for loading and/or saving properties
  > Separate instances created for load and save
> LoadProperties – provide property values by calling SetFileProperty in the property bag
> SaveProperties – retrieve properties in the property bag and persist them
![Page 19: Using Classification for Data Security and Data Management](https://reader035.vdocuments.net/reader035/viewer/2022081502/568168ba550346895ddfa5ec/html5/thumbnails/19.jpg)
Accessing File Contents
> Modules should never open files directly
  > May not have proper permissions
  > Stream state may not be consistent with metadata
> Use GetFileStreamInterface in the property bag
  > Supports ILockBytes and IStream interfaces
  > Takes care of getting the right permissions
  > Ensures last access and last modified times are unchanged
  > Ensures changes are properly committed (for storage modules)
![Page 20: Using Classification for Data Security and Data Management](https://reader035.vdocuments.net/reader035/viewer/2022081502/568168ba550346895ddfa5ec/html5/thumbnails/20.jpg)
PowerShell Host Classifier
> Included in Windows SDK
> Presents itself as a classifier to FCI that hosts PowerShell scripts to do the actual classification
> Create custom classifiers without compiling and registering your own modules
> Simpler to build, but has slower performance
> Intended for in-house IT solutions and prototyping
> More information at http://blogs.technet.com/filecab/archive/2009/08/14/using-windows-powershell-scripts-for-file-classification.aspx
![Page 21: Using Classification for Data Security and Data Management](https://reader035.vdocuments.net/reader035/viewer/2022081502/568168ba550346895ddfa5ec/html5/thumbnails/21.jpg)
Putting it all together
Clyde Law, Software Design Engineer, File Server Management Team
demo
![Page 22: Using Classification for Data Security and Data Management](https://reader035.vdocuments.net/reader035/viewer/2022081502/568168ba550346895ddfa5ec/html5/thumbnails/22.jpg)
Developer Opportunities
Call to action
> FCI provides many avenues to be part of end-to-end data lifecycle management solutions
  > Classifiers – provide classification based on content, identity, regulations, etc.
  > Data management products – leverage classification in solutions to backup, archival, leakage-prevention, etc.
  > Storage modules – provide property storage for new file formats
> Flexible COM API
  > Native code, managed code, or scripting
  > PowerShell support enables fast deployment of solutions
![Page 23: Using Classification for Data Security and Data Management](https://reader035.vdocuments.net/reader035/viewer/2022081502/568168ba550346895ddfa5ec/html5/thumbnails/23.jpg)
Additional Resources
> FCI Overview
  > http://microsoft.com/fci/
> Microsoft TechNet
  > http://technet.microsoft.com/en-us/library/dd758765%28WS.10%29.aspx
  > http://technet.microsoft.com/en-us/library/dd758756%28WS.10%29.aspx
> Developing for FCI
  > Windows SDK
    > http://msdn.microsoft.com/en-us/windows/bb980924.aspx
  > FSRM API Documentation on MSDN
    > http://msdn.microsoft.com/en-us/library/bb972746%28VS.85%29.aspx
  > FCI Code Gallery
    > http://code.msdn.microsoft.com/fci/
![Page 24: Using Classification for Data Security and Data Management](https://reader035.vdocuments.net/reader035/viewer/2022081502/568168ba550346895ddfa5ec/html5/thumbnails/24.jpg)
Contact Us
> Storage Team Blog
  > http://blogs.technet.com/filecab/default.aspx
> E-mail
  > FCI Team
    > [email protected]
  > Clyde Law, Developer
    > [email protected]
  > Matthias Wollnik, Program Manager
    > [email protected]
![Page 25: Using Classification for Data Security and Data Management](https://reader035.vdocuments.net/reader035/viewer/2022081502/568168ba550346895ddfa5ec/html5/thumbnails/25.jpg)
YOUR FEEDBACK IS IMPORTANT TO US! Please fill out session evaluation forms online at MicrosoftPDC.com
![Page 26: Using Classification for Data Security and Data Management](https://reader035.vdocuments.net/reader035/viewer/2022081502/568168ba550346895ddfa5ec/html5/thumbnails/26.jpg)
Learn More On Channel 9
> Expand your PDC experience through Channel 9
> Explore videos, hands-on labs, sample code and demos through the new Channel 9 training courses
channel9.msdn.com/learn
Built by Developers for Developers….
![Page 27: Using Classification for Data Security and Data Management](https://reader035.vdocuments.net/reader035/viewer/2022081502/568168ba550346895ddfa5ec/html5/thumbnails/27.jpg)
© 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
![Page 28: Using Classification for Data Security and Data Management](https://reader035.vdocuments.net/reader035/viewer/2022081502/568168ba550346895ddfa5ec/html5/thumbnails/28.jpg)
Appendix
Property aggregation and conflict resolution
> Values from Storage:
  > In-file > Database > Cache
> Values from Classification Rules:
  > Default values applied once if not already present
  > Can also choose to explicitly aggregate or overwrite existing values
> Ordered lists, Booleans, Multi-choice lists, and Multi-strings can be aggregated
[Default] Apply only if there is no value stored in the file
[Consider Existing] Apply but aggregate with values from Storage and Default rules
[Ignore Existing] Apply and ignore (replace) values from Storage and Default rules
![Page 29: Using Classification for Data Security and Data Management](https://reader035.vdocuments.net/reader035/viewer/2022081502/568168ba550346895ddfa5ec/html5/thumbnails/29.jpg)
Appendix
Property bags
> Property bag object holds the metadata of a file being classified
> The object flows through the classification pipeline
> Each pipeline module can assign property values
Property Bag
> File System Info: Relative Path, Creation Time, etc.
> Properties
> Messages
> Read Stream, Write Stream
> Current Context: Module Type, Rule, etc.
Property
> Name
> Type
> Assigned Values and Sources
  > From Storage Modules
  > From Default and CE Rules
  > From IE Rules
> Aggregated Value
> Aggregated Sources
![Page 30: Using Classification for Data Security and Data Management](https://reader035.vdocuments.net/reader035/viewer/2022081502/568168ba550346895ddfa5ec/html5/thumbnails/30.jpg)
Appendix
Connecting a module to the pipeline

    STDMETHODIMP CCustomModule::OnLoad(
        __in IFsrmPipelineModuleDefinition *pDefinition,
        __deref_out IFsrmPipelineModuleConnector **ppModuleConnector
        )
    {
        HRESULT hr = S_OK;
        // ...perform module initialization...

        // Create the connector
        CComPtr<IFsrmPipelineModuleConnector> spConnector;
        hr = CoCreateInstance(CLSID_FsrmPipelineModuleConnector,
                              NULL,
                              CLSCTX_LOCAL_SERVER,
                              __uuidof(IFsrmPipelineModuleConnector),
                              reinterpret_cast<void**>(&spConnector));
        // ...handle any errors...

        // Get this module's implementation interface
        CComQIPtr<IFsrmPipelineModuleImplementation> spModuleImpl = GetControllingUnknown();
        if (spModuleImpl == NULL)
            // ...handle error...

        // Bind the connector to the module
        hr = spConnector->Bind(pDefinition, spModuleImpl);
        // ...handle any errors...

        // Return the connector
        *ppModuleConnector = spConnector.Detach();

        return hr;
    }
![Page 31: Using Classification for Data Security and Data Management](https://reader035.vdocuments.net/reader035/viewer/2022081502/568168ba550346895ddfa5ec/html5/thumbnails/31.jpg)