hp integrated archive platform

©2010 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.
Twitter hashtag: #HPSWU. Session ID: IM-TH-1000

Upload: hp-software-solutions

Post on 12-Jun-2015

DESCRIPTION

A detailed overview of HP Integrated Archive Platform and its role in Information Management

TRANSCRIPT

Page 1: HP Integrated Archive Platform

Page 2: HP Integrated Archive Platform

HP Integrated Archive Platform
A Technical Overview

Speaker Name: Jaap van Kleef
Date: December 2010
Session ID: IM-TH-1000

Page 3: HP Integrated Archive Platform

Agenda
• Information Lifecycle Management
• HP Integrated Archiving Platform (IAP)
– Concepts & Architecture
• HP E-Mail Archiving Software for Exchange (EAsE)

Page 4: HP Integrated Archive Platform

HP Information Management solutions

Business outcomes and the IM solution sets that address them:

• E-Discovery and compliance: Information Retention solution set
– Chief Compliance Officer, VP Risk Management, General Counsel, Dir. of Records/Messaging
• Business continuity and availability: Information Availability solution set
– CIO, VP IT Operations, Application Directors, Backup Administration
• Storage optimization: Storage Optimization solution set
– Chief Storage Officer, VP/Director of Storage, Storage Administrator

Page 5: HP Integrated Archive Platform

HP Information Management solution sets

Information Availability solution set
• HP Data Protector software
• HP Database Archiving software

Information Retention solution set
• HP Integrated Archive Platform
• HP Email Archiving software for Microsoft Exchange
• HP Email Archiving software for IBM Lotus Domino
• HP File Archiving software
• HP Medical Archive solution

Page 6: HP Integrated Archive Platform

HP Information Management

ILM is a set of solutions and services to capture, manage, retain, and deliver information according to its business relevance.

Reduce the cost of managing ever-increasing amounts of data while simultaneously transforming it into accessible, relevant business information.

− Comply with ever-changing business needs

− Leverage information for better business performance

− Automate the management of your business information

Capitalize on business information for competitive advantage

Page 7: HP Integrated Archive Platform

ILM addresses the three major information management challenges

Reference Information Management
• Reference information (static content) is underutilized, and the ability to tap into it has potential business value
• When you need it, reference information is of great value

Retention Management
• Corporate and government regulations require retention policies
• Companies are placed under subpoena to produce email and documents in legal actions taken against them

Data Management
• Information growth continues at an accelerated rate
• Need to significantly reduce management costs while maintaining service levels
• Increase performance on file servers
• Reduce backup time

Page 8: HP Integrated Archive Platform

HP Integrated Archive Platform
Concepts and Architecture

Page 9: HP Integrated Archive Platform

Change - Simplicity, Agility, Value

Traditional Total Cost of Ownership

[Diagram: cost comparison with the labels "€ 1", "€ 5", "Storage" and "Maintenance".]

[Diagram: the traditional "Lego" approach to archiving (non-integrated, single points of failure, unable to scale) stacks Application, Application Middleware, Access Software, Storage Software, File system, CAS SW, Storage & archiving middleware, HSM, Database, Search engine, Servers, Tape Library, CAS HW, and DAS, SAN or NAS storage. It is contrasted with an integrated approach on industry-standard hardware.]

Page 10: HP Integrated Archive Platform

HP Integrated Archive Platform (IAP)

• Compliance
• Anti-Tampering
• Retention
• Indexing
• Search
• Policy Management

ECM, Information, e-Discovery

Page 11: HP Integrated Archive Platform

IAP

IAP is:
• Integrated storage
• Built with grid technology
– provides scalability to billions of objects
– enables mixed HW or eases refresh
• Built on standard HW components
• Full content indexing
• Capacity optimization via single-instancing
• Fast web search and retrieval
• WORM
• Retention management
• Low TCO via single administration

Connectivity: IP LAN
Protocols: IAP API, HTTP, SMTP
Base unit: starts with 5 TB

HP IAP: large-scale storage, access and content retrieval for reference information

Page 12: HP Integrated Archive Platform

IAP Scalability
• Simply start with a base unit
• And grow by one-cell increments (5 TB)!
• Up to 250 cells (today)!
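The scaling figures above lend themselves to a quick sizing calculation. The following is a minimal sketch, assuming every cell contributes exactly 5 TB and that the 250-cell limit is a hard ceiling; the slide does not say whether the 5 TB is raw or usable capacity after mirroring, and the function name is hypothetical.

```python
import math

CELL_CAPACITY_TB = 5   # per-cell increment quoted on the slide
MAX_CELLS = 250        # upper limit quoted on the slide ("today")

# Largest configuration under these assumptions: 5 TB * 250 cells.
MAX_CAPACITY_TB = CELL_CAPACITY_TB * MAX_CELLS


def cells_needed(required_tb: float) -> int:
    """Return how many 5 TB cells are needed to hold required_tb."""
    if required_tb <= 0:
        raise ValueError("required_tb must be positive")
    cells = math.ceil(required_tb / CELL_CAPACITY_TB)
    if cells > MAX_CELLS:
        raise ValueError(f"{required_tb} TB exceeds {MAX_CAPACITY_TB} TB")
    return cells
```

Under these assumptions an archive of 12 TB needs three cells, and the largest configuration holds 1250 TB.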

Page 13: HP Integrated Archive Platform

IAP advantages

• An “all-in-one” complete solution – reduces complexity of integration & management

• Full content & attribute indexing & search

• Scalable grid computing architecture

• All content stored within the IAP non-tamperable architecture.

• Single Instances – storage optimization

• Disaster Tolerance enabled
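Single-instancing, listed above as a storage optimization, can be illustrated with a content-addressed store. This is only a sketch of the general technique: the slides do not describe how IAP detects duplicates, so the SHA-256 digest and the `SingleInstanceStore` class are assumptions made for illustration.

```python
import hashlib


class SingleInstanceStore:
    """Keeps one physical copy per distinct content, however often it is stored."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}   # digest -> content
        self._refs: dict[str, int] = {}      # digest -> number of logical copies

    def put(self, content: bytes) -> str:
        """Store content and return its digest; duplicates add a reference only."""
        digest = hashlib.sha256(content).hexdigest()
        if digest not in self._blobs:
            self._blobs[digest] = content
        self._refs[digest] = self._refs.get(digest, 0) + 1
        return digest

    def get(self, digest: str) -> bytes:
        return self._blobs[digest]

    def stored_bytes(self) -> int:
        """Physical bytes held, counting each distinct content once."""
        return sum(len(blob) for blob in self._blobs.values())
```

An attachment mailed to many recipients would then occupy the space of one copy, with one reference per recipient.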

Page 14: HP Integrated Archive Platform

IAP concepts

Page 15: HP Integrated Archive Platform

IAP Grid Computing Architecture

Flexible grid approach
• enables technology refresh
• allows performance upgrades
• provides scalability

SmartCell (storage) fabric: distributed computer system of self-contained, all-inclusive data repositories (data grid). Each SmartCell provides storage, content indexing and processing power.

Backbone fabric
• HTTP portals
• SMTP portals
• Administration system
• Mirroring for fault tolerance

[Diagram: a grid of SmartCells behind HTTP and SMTP portals and a firewall (FW).]

Page 16: HP Integrated Archive Platform

Smart Cell grid architecture
Domains and repositories

– Domains
• Subgroups of Smart Cells within a larger grid
• Distinct policies
• Might be used in a large organization
– Repositories
• Logical entity spanning one or more Smart Cells
• Multiple repositories in a domain
• Might represent a single user’s mailbox

Page 17: HP Integrated Archive Platform

Domains and Repositories

• Domains
– Provide physical isolation of data via physically separate SmartCells
– Consist of repositories and users
– Attributes: retention, replication, backup, audit, audit log
– Note: usually one domain per IAP system
• Repositories
– Logical entities that span SmartCells inside a domain
– Typically correspond to a user’s mailbox
– Routing rules determine which data goes to which repository
– Access is governed by access control lists (ACLs)
– Information pertaining to one repository can never be stored on media belonging to another repository
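The relationship between domains, repositories, routing rules and ACLs can be sketched as a small data model. This is an illustration only: the class and field names are hypothetical and do not reflect the IAP API, and the routing rule is reduced to a recipient-to-repository lookup.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    """Logical container, typically one user's mailbox."""
    name: str
    acl: set[str] = field(default_factory=set)        # users allowed to read
    objects: list[bytes] = field(default_factory=list)


@dataclass
class Domain:
    """Group of repositories that shares policy attributes such as retention."""
    name: str
    retention_days: int
    repositories: dict[str, Repository] = field(default_factory=dict)
    routing: dict[str, str] = field(default_factory=dict)  # recipient -> repository

    def route(self, recipient: str, payload: bytes) -> Repository:
        """Apply the routing rule and store the payload in exactly one repository."""
        repo = self.repositories[self.routing[recipient]]
        repo.objects.append(payload)
        return repo

    def read(self, user: str, repo_name: str) -> list[bytes]:
        """Return a repository's objects if the ACL lists the user."""
        repo = self.repositories[repo_name]
        if user not in repo.acl:
            raise PermissionError(f"{user} may not read {repo_name}")
        return list(repo.objects)
```

In this model an auditor would simply appear in the ACL of every repository, while an ordinary user appears only in the ACL of their own mailbox repository.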

Page 18: HP Integrated Archive Platform

Data flow
Store path
1. Mail message or document enters the system via the network.
2. SMTP portal creates a digital signature (CRC of message plus date/time of receipt).
3. SMTP portal encrypts the digital signature using 128-bit triple-DES encryption.
4. Portal contacts the Metaserver and addresses the mail to the Smart Cell associated with the recipient’s repository.
5. Router passes the message and digital signature on to the selected Smart Cells (Primary and Secondary).
6. If both Smart Cells agree, they send an acknowledgment back to the sending application.
7. Smart Cells index and compress the object, then store the index, digital signature, and compressed object on disk.
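The store path can be sketched in a few lines. This is a simplified illustration, not the product's implementation: the signature format, the `SmartCell` class and the dictionary standing in for the Metaserver are assumptions, the triple-DES encryption of step 3 is omitted because the Python standard library has no 3DES, and the "index" is a bare set of words.

```python
import zlib
from datetime import datetime, timezone


def make_signature(message: bytes, received_at: datetime) -> str:
    """Step 2: CRC of the message plus date/time of receipt."""
    return f"{zlib.crc32(message):08x}:{received_at.isoformat()}"


class SmartCell:
    def __init__(self, name: str) -> None:
        self.name = name
        self.disk: dict[str, tuple[bytes, set[str]]] = {}

    def accept(self, message: bytes, signature: str) -> bool:
        """Agree only if the received bytes match the CRC in the signature."""
        expected_crc = signature.split(":", 1)[0]
        return f"{zlib.crc32(message):08x}" == expected_crc

    def commit(self, message: bytes, signature: str) -> None:
        """Step 7: index, compress, and store under the signature."""
        index = set(message.decode("utf-8", "ignore").lower().split())
        self.disk[signature] = (zlib.compress(message), index)


def store(message, metaserver, recipient, received_at=None):
    """Return the signature as acknowledgment, or None if the cells disagree."""
    received_at = received_at or datetime.now(timezone.utc)
    signature = make_signature(message, received_at)
    primary, secondary = metaserver[recipient]          # step 4: lookup
    if not (primary.accept(message, signature)
            and secondary.accept(message, signature)):  # steps 5 and 6
        return None
    # In this synchronous sketch both cells commit before the caller sees the ack.
    for cell in (primary, secondary):
        cell.commit(message, signature)
    return signature
```

The point of the sketch is the ordering: nothing is acknowledged to the sender until both the primary and the secondary cell have accepted the same bytes.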

Page 19: HP Integrated Archive Platform

Data flow
Query and retrieval path

• Query
1. Query is submitted from the browser through the firewall.
2. HTTP portal formats the query and uses information from the Metaserver to determine which Smart Cells will return data.
3. Smart Cells receive the multicast request in parallel; those with data perform a local query.

• Retrieval
1. Each Smart Cell returns the results from its search, along with the results’ digital signatures.
2. Metaserver receives the results, time-orders them, and passes them back to the HTTP portal.
3. HTTP portal computes the digital signature and validates it against the stored digital signature before returning data to the querying application.
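The query and retrieval path amounts to a scatter-gather search followed by an integrity check. The sketch below makes several simplifying assumptions: each cell is a plain list, the query is a substring match, the stored signature is reduced to a CRC-32 value, and objects that fail validation are silently dropped (the slides do not say how a mismatch is handled). All names are hypothetical.

```python
import zlib
from dataclasses import dataclass
from datetime import datetime


@dataclass
class StoredObject:
    body: bytes
    received_at: datetime
    crc: int   # CRC recorded when the object was stored


def query_cell(cell: list[StoredObject], term: str) -> list[StoredObject]:
    """Local query run by one cell."""
    needle = term.encode()
    return [obj for obj in cell if needle in obj.body]


def search(cells: list[list[StoredObject]], term: str) -> list[StoredObject]:
    # Query steps 2-3: fan the request out to every cell and gather the hits.
    hits = [obj for cell in cells for obj in query_cell(cell, term)]
    # Retrieval step 2: time-order the merged results.
    hits.sort(key=lambda obj: obj.received_at)
    # Retrieval step 3: recompute the CRC and compare it with the stored one.
    return [obj for obj in hits if zlib.crc32(obj.body) == obj.crc]
```

A real deployment would run the per-cell queries in parallel; the list comprehension here only shows the data flow.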

Revision 09.09a. 2008 – HP Restricted

Page 20: HP Integrated Archive Platform

Security - Physical

• Domain vs. repository
– Domains provide physical isolation of data
– Repositories are logical entities that span across SmartCells within a domain
– Access to repositories is controlled by ACLs
• Grid architecture
– Allows immediate isolation of information subsets without interruption
• Software installation controlled via MAC address
– All software components “locked down” to authorized hardware via the MAC to prevent unapproved hardware substitution
• Locked racks
– Strongly recommended to secure physical rack components

Page 21: HP Integrated Archive Platform

Security - Network

• Private internal subnet not exposed to the outside
– Built-in NAT for production network (via Firewall)
– Built-in NAT for operations network (via PCC)
• Built-in firewall (limited exposure)
– 80: HTTP (can be disabled by customer)
– 25: SMTP (no relay mechanism and protected via ‘IP to domain map’, preventing unauthorized protocol transport)
– 443: HTTPS (DoD server certificate support)
– SSH (only if technical support access required by customer)
– SNMP (can be enabled/disabled by customer)
– Insight Manager
• Other
– Software installation/update controlled via MAC address
– SSL access: 128-bit encryption supported for all external HTTP access

Page 22: HP Integrated Archive Platform

Security - Other

– Firewall Server prevents unauthorized access into the appliance from the customer network
– File access is restricted to authorized users
– All traffic between the end-user and the appliance is protected against ‘snooping’
– URLs for retrieved data are encrypted
– User passwords are encrypted within the appliance to provide internal security
– Digital signatures identify if a file has been tampered with

Page 23: HP Integrated Archive Platform

Data Integrity

• Mirroring
– Motto: “Never less than 2 copies of information”
– Synchronous mirroring of all data to Primary & Secondary SmartCells
– WRITE ACK is a lock-step process with CRC check
• Digital signature for detection of tampering

Page 24: HP Integrated Archive Platform

Administration and Management
• Single web-based built-in admin interface
• SNMP traps
• E-mail notifications
• No need for a dedicated database administrator
• SIM Manager
– (SIM agents in a future release)

Page 25: HP Integrated Archive Platform

IAP Replication Overview
• Building a mirror of the IAP system at a remote location
• Replica site provides the same functionality as the primary site (search, data retrieval, …)
• Store can read-failover to the replica site when the primary is down
• Domain-based management: each IAP domain can be replicated to a different site

[Diagram: Site 1 (primary site) sends replicated data (compressed) to Site 2 (replica site); end-user queries/retrievals are shown against both sites.]

Page 26: HP Integrated Archive Platform

Replication - Components

• Primary site:
– Each SmartCell provides a list of files to replicate
– Replication manager
  • Polls SmartCells for data to replicate
  • Retrieves data and sends it to the secondary as an email attachment
– Database Server is used for controlling the replication flow
  • Access to SmartCells
  • Suspend or resume replication for a given domain
• Secondary (replica) site:
– SMTP servers receive data
– Replica SmartCells store data and index (optionally mirror)

Page 27: HP Integrated Archive Platform

Replication - Data Transfer

– Uses SMTP to send the data
– Mail envelope carries each stored file (actually a zip file)
– Header provides the information the SMTP server needs to figure out where to store the file (group, path, file name, etc.)
– Header’s data is encrypted
– Emails are sent individually but batched 100 at a time
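The transfer format described above (one zipped file per mail, placement information in headers, batches of 100) can be sketched with Python's standard `email` and `zipfile` modules. The header names (`X-Replica-Group` and so on) and the addresses are invented for illustration, and the headers are left in plain text here even though the slide states that the real header data is encrypted.

```python
import io
import zipfile
from email.message import EmailMessage

BATCH_SIZE = 100  # batch size quoted on the slide


def build_replication_mail(group: str, path: str, file_name: str,
                           payload: bytes) -> EmailMessage:
    """Wrap one stored file as a zip attachment with placement headers."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(file_name, payload)

    msg = EmailMessage()
    msg["From"] = "replication@primary.example"   # hypothetical addresses
    msg["To"] = "replica@secondary.example"
    msg["Subject"] = "replication"
    msg["X-Replica-Group"] = group                # hypothetical header names
    msg["X-Replica-Path"] = path
    msg["X-Replica-File"] = file_name
    msg.add_attachment(buf.getvalue(), maintype="application",
                       subtype="zip", filename=file_name + ".zip")
    return msg


def batches(items: list, size: int = BATCH_SIZE):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

Each message in a batch would then be handed to an SMTP client such as `smtplib.SMTP.send_message`; that step is left out so the sketch runs without a mail server.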

Page 28: HP Integrated Archive Platform

HP Integrated Archive Platform
E-Mail Archiving

Page 29: HP Integrated Archive Platform

IAP for messaging management: example with base IAP

• 2 email modes
– Policy-based data movement: messages are archived via policy settings (e.g. message age > 90 days) and are replaced with a reference link
– Compliance: all messages are archived to the IAP (Exchange journaling)
• Results (for policy-based archival)
– Increased scalability of individual email servers
– Fewer email servers eases administration
– Reduced storage requirements of email and file servers

Capture & index everything: all to, from, subject, body and attachments

[Diagram: Exchange or Domino email environment, EAsE, and Outlook clients with the IAP Outlook Plugin.]

Page 30: HP Integrated Archive Platform

Two email archive modes

Compliance Archiving
• All messages are automatically forwarded from the mail server (using journal mailbox) to the archive
• Provides a secure copy of all emails for compliance
• IT centrally sets retention policy
• Messages retrieved via Web UI
• Individuals can access their own messages
• Auditors can access the entire archive
• Designed for Legal Discovery & Compliance

Selective Archiving
• Messages are “mined” out of user mailboxes based on policies, e.g. age of message, size of message
• A “stub” is left behind in the user’s mailbox
• User double-clicks the stub to retrieve the archived message
• User can also search for and retrieve messages via Web UI
• Designed for Email Data Management

Page 31: HP Integrated Archive Platform

HP EAsE
Email example

[Diagram: Outlook clients with the IAP Outlook Plugin, the mail infrastructure (Exchange / Domino), MAPI, EAsE, and the IAP archive as extended reference data storage.]

• IAP Archive
– Content searchable
– Scalable to billions of objects
– Retention management
• Outlook plugin provides transparent access to archived messages
• Automatic message archival
– e.g. based on size and age
– e.g. keep all incoming mails
• Mailbox store can be reduced due to archived messages

Page 32: HP Integrated Archive Platform

Selective Archive Migration Rules

What are the options?
• Bodies and attachments are archived; attachments are replaced by a pointer (optionally email)
• Rules are defined per user or group of users and messaging servers
• Migration criteria: age, size, contents, to, from, …

Best practices: keep it simple
• Archive messages older than 90 days with attachment size > 150 KB: trim attachment
• Archive messages older than 1 year with message size > 50 KB: trim attachment + body
• Some conditions are rarely used, such as mailbox status/quota
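The two best-practice rules above can be written as a small policy function. This is a sketch under stated assumptions: the `Message` fields and the `Action` names are hypothetical, one year is taken as 365 days, and the stricter rule is checked first so that an old message matching both rules gets the more aggressive trim (the slide does not specify rule precedence).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum


class Action(Enum):
    KEEP = "keep in mailbox"
    TRIM_ATTACHMENT = "archive, trim attachment"
    TRIM_ATTACHMENT_AND_BODY = "archive, trim attachment + body"


@dataclass
class Message:
    received: datetime
    size_kb: int          # total message size
    attachment_kb: int    # combined attachment size


def decide(msg: Message, now: datetime) -> Action:
    """Apply the two example rules; anything unmatched stays in the mailbox."""
    age = now - msg.received
    if age > timedelta(days=365) and msg.size_kb > 50:
        return Action.TRIM_ATTACHMENT_AND_BODY
    if age > timedelta(days=90) and msg.attachment_kb > 150:
        return Action.TRIM_ATTACHMENT
    return Action.KEEP
```

Keeping the policy to two thresholds, as the slide recommends, makes the outcome for any given message easy to predict and explain to users.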

Page 33: HP Integrated Archive Platform

EAsE-Outlook Plug-In

• Provides end-user with seamless integration to IAP
• Needed on client PC that uses Outlook to retrieve tombstoned messages
• Not needed when using OWA
• COM Add-In (.msi file) available on the IAP Utilities CD
• Can be automatically deployed with AD Group Policies

Page 34: HP Integrated Archive Platform

Transparent to end-users (selective archive)
Messages in the extended storage are “stubbed” in mailbox

[Screenshot: “stubbed” message]

Page 35: HP Integrated Archive Platform

Offline Exchange Users

Offline Cache Synchronization
• Continuously synchronizes messages between IAP & Outlook 2003 cache
• Messages are cached based on Outlook plugin policy settings
• Offline plugin cache size dynamically configurable
• Cache FIFO managed per retention rules & cache space available

Cached Message Access
• Tombstone message accesses from Outlook always check the local offline cache first
− Cached messages satisfied locally
− Un-cached messages retrieved from IAP
• All retrievals from IAP locally cached, subject to cache space availability

[Diagram: Outlook 2003/2000 with the HP RIM Outlook Plugin and offline cache, Exchange 2K/2003, and IAP.]
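The cached-message access pattern (local cache first, IAP on a miss, FIFO eviction within a size limit) can be sketched as follows. The class is hypothetical, `fetch_from_iap` stands in for whatever retrieval call the plugin actually makes, and only the space limit is modeled; the retention rules mentioned on the slide are left out.

```python
from collections import OrderedDict
from typing import Callable


class OfflineCache:
    """Size-bounded FIFO cache that falls back to the archive on a miss."""

    def __init__(self, capacity_bytes: int,
                 fetch_from_iap: Callable[[str], bytes]) -> None:
        self.capacity = capacity_bytes
        self.fetch = fetch_from_iap
        self.items: "OrderedDict[str, bytes]" = OrderedDict()
        self.used = 0

    def get(self, message_id: str) -> bytes:
        if message_id in self.items:
            # FIFO, not LRU: a hit does not change the eviction order.
            return self.items[message_id]
        body = self.fetch(message_id)   # un-cached message: go to the archive
        self._add(message_id, body)
        return body

    def _add(self, message_id: str, body: bytes) -> None:
        if len(body) > self.capacity:
            return                      # larger than the whole cache: skip
        while self.used + len(body) > self.capacity:
            _, oldest = self.items.popitem(last=False)   # evict oldest entry
            self.used -= len(oldest)
        self.items[message_id] = body
        self.used += len(body)
```

Changing `self.capacity` at run time would mirror the "dynamically configurable" cache size, with eviction catching up on the next insert.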

Page 36: HP Integrated Archive Platform

EAsE Outlook Web Access (OWA) Support

OWA Support for IAP
• No loss of OWA functionality
• Enables OWA users to retrieve archived messages
• Supports Exchange 2000, 2003
• IE is the recommended browser
– Recommended by MS for OWA “Premium Service”

[Diagram: OWA, OWA Server with the IAP OWA Plugin, IAP, and Exchange.]

Page 37: HP Integrated Archive Platform

Bringing it all together: unified, web-based search

All search functions are available via a free API

Page 38: HP Integrated Archive Platform

IBM Lotus Notes & Domino Archiving
• Flexible, robust, & “Lotus-like” interface
• NSF Tools Export / Bulk Import
• Friendly administration using Notes-based tools
• O/S platform support: Windows, AIX, Solaris, Linux

Page 39: HP Integrated Archive Platform

HP Integrated Archive Platform
Beyond E-Mail

Page 40: HP Integrated Archive Platform

HP IAP - the central information store

[Diagram: mail servers (messaging), file servers (HP FMA), print-outs, Sharepoint, Tower Software and any other supported ISV feed the IAP archive; clients reach it through IAP APIs, web search and retrieval.]

IAP Archive
• Reduction in TCO
• Content searchable
• Reduced number of file servers
• Eases administration
• Reduced storage requirements

Page 41: HP Integrated Archive Platform

File archival/move to IAP

• Migrate inactive files from Windows file servers to multiple targets
• Migration based on age and/or size of the file
• Cluster support
• Advantages for file servers:
– Faster backup and recovery
– File server capacity does not need to grow
• Failure-tolerant data management software (no HSM DB needed)

File migration & recall via LAN and IAP API, CIFS, FTP

[Diagram labels: Windows file servers running FMA (MSCS cluster; E Series, MSA, EVA); targets: IAP, NAS/disk system, FSE archive (FSE servers with cache disks).]
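Selecting files for migration by age and/or size can be sketched with the standard library. This is not how FMA is implemented; it is a minimal illustration that assumes "inactive" means an old modification time and that, when both thresholds are given, a file must meet both. The function name and parameters are hypothetical.

```python
import time
from pathlib import Path
from typing import Optional


def migration_candidates(root: str,
                         min_age_days: Optional[float] = None,
                         min_size_bytes: Optional[int] = None,
                         now: Optional[float] = None) -> list[Path]:
    """List files under root that meet every threshold that was supplied."""
    now = time.time() if now is None else now
    selected = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        info = path.stat()
        old_enough = (min_age_days is None
                      or now - info.st_mtime >= min_age_days * 86400)
        big_enough = (min_size_bytes is None
                      or info.st_size >= min_size_bytes)
        if old_enough and big_enough:
            selected.append(path)
    return selected
```

Passing only one of the two thresholds gives the "age or size" variants; a migration tool would then move each selected file to the archive and leave a stub or link behind.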

Page 42: HP Integrated Archive Platform

File archival/move to IAP

Page 43: HP Integrated Archive Platform

Continue the conversation with your peers at the HP Software Community hp.com/go/swcommunity