Architecting for Scale in SharePoint 2010 Russ Houberg Senior Technical Architect, MCM KnowledgeLake, Inc.

Post on 30-Mar-2015

Page 1

Architecting for Scale in SharePoint 2010

Russ Houberg
Senior Technical Architect, MCM
KnowledgeLake, Inc.

Page 2

Scaling SP2010 from the Ground Up

- Storage Architecture
- SQL Tuning Tidbits
- Remote Blob Storage (Demo)
- Performance and Control
- Scalable Taxonomy Design (Demo)
- Search… A Complete Story
- The Big Picture: 10 million, 100 million, 1 BILLION Documents?

Page 3

Storage Architecture

Storage Architecture can make or break SharePoint Performance

Poor storage performance can tank the whole SharePoint farm!

Can Be Tough to Estimate
- Use an extendable storage platform if possible

Wider is Better
- More spindles are always better than higher GB
- Avoid using a small number of large disks for increasing storage capacity

Page 4

Storage Architecture

TempDB, Search DBs, Content DBs
- Multiple Data Files in Primary File Group
  • # Files = ½ to ¼ of CPU Cores | <= CPU Cores
  • Separate to unique spindle sets if possible

Pre-Allocate all Data Files, Including TempDB
- Estimate Projected DB Size and divide by # Files to get the pre-allocation size for each file

Leave “AutoGrow” enabled, but don’t rely on it
- Pre-allocate to prevent AutoGrow
- Set AutoGrow to 10% or a logical MB/GB value based on projected database size
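
The pre-allocation arithmetic on this slide can be sketched in a few lines. This is only an illustration: the function name, the `files_per_core` parameter, and the 400 GB / 16-core figures are assumptions made for the example, not values from the talk.

```python
import math


def preallocation_plan(projected_db_gb, cpu_cores, files_per_core=0.25):
    """Return (file_count, size_per_file_gb) for one database.

    files_per_core is a hypothetical knob covering the slide's
    "1/4 to 1/2 of CPU cores" guidance; the count is capped at cpu_cores.
    """
    file_count = math.ceil(cpu_cores * files_per_core)
    file_count = max(1, min(cpu_cores, file_count))
    # Projected DB size divided by the number of files gives the
    # pre-allocation size for each file.
    size_per_file_gb = projected_db_gb / file_count
    return file_count, size_per_file_gb


# Hypothetical example: a 400 GB content DB on a 16-core SQL Server.
files, size_gb = preallocation_plan(projected_db_gb=400, cpu_cores=16)
print(f"{files} data files, {size_gb} GB each")  # 4 data files, 100.0 GB each
```

Passing `files_per_core=0.5` gives the upper end of the range (8 files of 50 GB each for the same inputs).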

Page 5

Storage Architecture

Data / Log File Spindle Priority

Priority  DB File                    RAID       IOPS           Optimization
1         TempDB Data                RAID 10    2 IOPS/GB      Write
2         TempDB Log                 RAID 10    2 IOPS/GB      Write
3         Content DB Log             RAID 10    2 IOPS/GB      Write
4         Crawl DB Log               RAID 10    2 IOPS/GB      Write
5         Crawl DB Data              RAID 10    2 IOPS/GB      Read/Write
6         Property DB Log            RAID 10    2 IOPS/GB      Write
7         Property DB Data           RAID 10    2 IOPS/GB      Read/Write
8         Services DB Log            RAID 10    2 IOPS/GB      Write
9         Services DB Data           [Depends]  [Depends]      [Depends]
10        Content DB Data (Collab)   RAID 10    0.75 IOPS/GB   Read/Write
11        Content DB Data (Archive)  RAID 5     0.75 IOPS/GB   Read
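
As a rough worked example of the IOPS/GB column, the target for a volume is the file size multiplied by the per-GB figure. The dictionary keys, the function name, and the sizes below are hypothetical; only the 2 and 0.75 IOPS/GB figures come from the table.

```python
# IOPS-per-GB targets taken from the priority table; key names are made up
# for this sketch. Services DB Data is omitted because the table lists it
# as "[Depends]".
IOPS_PER_GB = {
    "tempdb_data": 2.0,
    "tempdb_log": 2.0,
    "content_db_log": 2.0,
    "crawl_db_log": 2.0,
    "crawl_db_data": 2.0,
    "property_db_log": 2.0,
    "property_db_data": 2.0,
    "services_db_log": 2.0,
    "content_db_data_collab": 0.75,
    "content_db_data_archive": 0.75,
}


def target_iops(db_file, size_gb):
    """Return the IOPS a volume should sustain for a file of size_gb."""
    return IOPS_PER_GB[db_file] * size_gb


# Hypothetical sizes: a 500 GB collaboration content DB and a 50 GB TempDB.
print(target_iops("content_db_data_collab", 500))  # 375.0
print(target_iops("tempdb_data", 50))              # 100.0
```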

Page 6

SQL Tuning Tidbits

SQL Instant File Initialization
- Run SQL as a Domain User with either…
  • Local Admin
  • Grant “Perform Volume Maintenance Tasks”
TempDB Pre-Allocation to 10% of Largest DB
SAN vs DAS vs NAS (Don’t Overshare!)
Host Bus Adapter (HBA) Configuration
NTFS Allocation Unit Size: 64K
Enable Locked Pages in Memory (SQL Std.)
Don’t skimp on RAM!

Page 7

Remote BLOB Storage

SharePoint 2003
What’s this ECM thing?
- Interesting workarounds
  • API access was problematic

SharePoint 2007
SP1 Brings us EBS Provider
- BLOBs are orphaned during edit/save
- Orphan cleanup is resource intensive
- Externalization happens on the WFE (reduced RPS)
- Future support of EBS API is not guaranteed

SharePoint 2010
Long Live RBS
- Transactional consistency supports “VETO”
- Transactional consistency allows for UPDATE
- Orphan cleanup uses SQL Indexes
- Transparent to the SharePoint API
- RBS is the best option for future support

Page 8

Remote BLOB Storage

[Diagram: document save path. Components shown: SharePoint WFE (SharePoint Object Model, RBS Client Library, BLOB Store Provider Library), Blob Store, and SQL Server (ContentDB, ConfigDB, reached through relational access).]

1. Save Request
2. Enforce Business Logic
3. Save Blob
4. Write Blob
5. Return BLOB ID
6. Save Metadata & BLOB ID
7. Back to User
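
The seven numbered steps in the diagram can be walked through with a toy in-memory model. Everything here is hypothetical (the `BlobStore` and `ContentDb` classes, `save_document`, and the validation rule); it is not the RBS client library API, only a sketch of the ordering the diagram describes.

```python
import uuid


class BlobStore:
    """Stand-in for the external blob store behind the provider library."""

    def __init__(self):
        self._blobs = {}

    def write(self, data):
        blob_id = uuid.uuid4().hex  # step 5: an ID is handed back
        self._blobs[blob_id] = data
        return blob_id

    def read(self, blob_id):
        return self._blobs[blob_id]


class ContentDb:
    """Stand-in for the content database: metadata plus a blob reference."""

    def __init__(self):
        self.rows = {}

    def save(self, name, metadata, blob_id):
        self.rows[name] = {"metadata": metadata, "blob_id": blob_id}


def save_document(name, data, metadata, blob_store, content_db):
    # 1. Save request arrives at the WFE.
    # 2. Enforce business logic (a made-up rule for this sketch).
    if not name or not data:
        raise ValueError("a document needs a name and content")
    # 3-4. Save the blob through the provider; it is written to the store.
    # 5. The store returns a blob ID.
    blob_id = blob_store.write(data)
    # 6. Save the metadata and the blob ID in the content DB.
    content_db.save(name, metadata, blob_id)
    # 7. Control goes back to the user.
    return blob_id


store, db = BlobStore(), ContentDb()
doc_id = save_document("invoice.pdf", b"%PDF...", {"year": 2009}, store, db)
```

In this model the content database row holds only the metadata and the blob ID, while the bytes live in the blob store.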

Page 9

RBS Requirements

SQL Server 2008 R2 November CTP
- Any Version, even SQL Express

FILESTREAM RBS Provider
- Updated version dated November 1!
- http://go.microsoft.com/fwlink/?LinkId=177388

Page 10

Remote BLOB Storage

demo…

Page 11

Performance and Control

SharePoint 2003
- Column Indexes were not possible
- Database Indexes were not supported

SharePoint 2007
- Column Indexes (10) could be configured via the UI
- End users could impact performance with poor performing list views

SharePoint 2010
- Database optimizations allow far more items in a list
- Support for (20) Multi-Column Indexes
- Resource intensive operations can be limited or disallowed during production hours
  • Large query thresholds
  • Blocking Operations
  • Can be overridden via the Object Model
  • Can configure an unblocked “window”
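
The SharePoint 2010 controls listed above (a large query threshold, an Object Model override, and an unblocked window) combine roughly as in the sketch below. It is a toy model, not SharePoint code: the function name, the 22:00 to 06:00 window, and the default of 5000 are assumptions, the last borrowed from the 5000 Item View/Query Result Size limit quoted elsewhere in the deck.

```python
from datetime import time


def query_allowed(rows_touched, now, threshold=5000,
                  window=(time(22, 0), time(6, 0)), override=False):
    """Decide whether a list query may run.

    threshold and window are hypothetical defaults for this sketch.
    """
    # Small queries always run; an explicit override also bypasses the limit.
    if override or rows_touched <= threshold:
        return True
    start, end = window
    if start <= end:
        # Window within a single day, e.g. 01:00 to 05:00.
        return start <= now < end
    # Window that wraps past midnight, e.g. 22:00 to 06:00.
    return now >= start or now < end


print(query_allowed(20000, time(12, 0)))                 # False
print(query_allowed(20000, time(23, 30)))                # True
print(query_allowed(20000, time(12, 0), override=True))  # True
```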

Page 12

Scalable Taxonomy Design

Targeted Limits
- Tens of Millions of Documents/Items in a List
- 5000 Item View/Query Result Size
- 100 Million Items per SP2010 Search Index
- 1 BILLION Items in FAST For SharePoint Index
- 150,000 Site Collections per Web Application
- 50,000 Site Collections per Content DB
- 100GB Content DB Size is SOFT LIMIT!
  • Recommended for Collab or Fast Backup/Restore SLA
  • Some archival type Content DBs exist at near 1TB!

Page 13

Scalable Taxonomy Design

Enabling 100 Million
- Place large Collaboration Site Collections (20GB+) in their own content database
- Break Up Archive/Records Site Collections by Year or, if necessary, Content Type and Year
- AVOID Item Level ACLs!!!
  • Release to Metadata Based Folder Structures as a workaround
- Use Content Type Syndication to facilitate multiple Site Collections of the same type
- Use Content Organizer as a “Drop Zone”
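
Breaking archive site collections up by year, or by content type and year, amounts to a routing rule applied at the drop zone. The sketch below shows one way to express it; the function name and the `/sites/archive-...` URL pattern are invented for illustration and are not a Content Organizer API.

```python
def target_site_collection(content_type, year, split_by_content_type=False):
    """Pick the archive site collection a document should land in.

    The URL scheme is a hypothetical naming convention for this sketch.
    """
    base = "/sites/archive"
    if split_by_content_type:
        # Finer split for when one site collection per year is too large.
        slug = content_type.lower().replace(" ", "-")
        return f"{base}-{slug}-{year}"
    return f"{base}-{year}"


print(target_site_collection("Invoice", 2009))
# /sites/archive-2009
print(target_site_collection("Purchase Order", 2009, split_by_content_type=True))
# /sites/archive-purchase-order-2009
```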

Page 14

Content Organization

demo…

Page 15

Search… A Complete Story

SharePoint 2003
- WSS CAML Only
- SPS Shared Services yielded decent full text results

SharePoint 2007
- WSS 3.0 SiteDataQuery allowed search across lists/sites
- MOSS Search added Managed Properties
- FAST ESP for SharePoint was a late player

SharePoint 2010
- Microsoft SharePoint Foundation Search
  • Site Collection Scope | No Redundancy | 10 Million
- Microsoft Search Server Express 2010
  • Extended Features | No Redundancy | 10 Million
- Microsoft SharePoint 2010 Search / Search Server
  • Extended Features | Scale Out | Redundancy | 100 Million
- Microsoft FAST Search Server 2010 for SharePoint
  • Extreme Scale | Redundancy | Doc Processing Pipeline
  • BILLIONS of documents!
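
The SharePoint 2010 search options above can be read as a lookup from corpus size and redundancy needs to candidate products. The sketch below encodes only the item counts and redundancy flags from this slide; the `candidates` function is hypothetical, and FAST is modelled with no fixed ceiling because the slide gives only "BILLIONS".

```python
# (product, item ceiling or None, supports redundancy), as listed on the slide.
OPTIONS = [
    ("Microsoft SharePoint Foundation Search", 10_000_000, False),
    ("Microsoft Search Server Express 2010", 10_000_000, False),
    ("Microsoft SharePoint 2010 Search / Search Server", 100_000_000, True),
    ("Microsoft FAST Search Server 2010 for SharePoint", None, True),
]


def candidates(item_count, need_redundancy=False):
    """Return the products whose stated scale covers item_count."""
    return [
        name
        for name, ceiling, redundant in OPTIONS
        if (ceiling is None or item_count <= ceiling)
        and (redundant or not need_redundancy)
    ]


# A 50 million item corpus rules out the two 10 million item options.
for name in candidates(50_000_000):
    print(name)
```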

Page 16

Search… A Complete Story

SharePoint Server 2010 / Search Server

- Multiple Crawl Servers (Scale Out/Redundancy)
- Crawl Servers comprised of stateless Crawlers
- Multiple Crawlers improve crawl performance
- Multiple Crawl DBs support more Crawlers
- Crawl DB is separated from Property DB
- Index is comprised of multiple Index Partitions that can be mirrored on different Query Servers
- Multiple Index Partitions improve Query Performance

Page 17

Search… A Complete Story

Cool… What can it do?

Page 18

Search… A Complete Story

FAST Search Server 2010 for SharePoint

- Extreme Scale and Performance
- Custom Relevancy and Navigation Tuning
- Tune Performance for content volume, query volume, crawl pipeline performance and query speed
- Uses SharePoint 2010 Query Servers
- Bolts on FAST Servers for additional processing
- Add server ROWS for query performance or COLUMNS for crawl performance
- Can scale to support BILLIONS of items!

Page 19

10 million, 100 million, 1 Billion?

Page 20

In Review…

- Storage is the KEY to Performance
- RBS reduces Content DB Size and facilitates large repositories
- SharePoint governs end-user operations
- Content Type Publishing and Content Organization help balance database loading
- Search solutions now handle the entire range of corpus possibilities
- 10 million is easy, 100 million can be done, 1 BILLION is theoretically possible!

Page 21

More…

http://www.houberg.net

@rhouberg

http://www.knowledgelake.com/whitepaper

Page 22

Thank you sponsors!!

Page 23

Remember to fill out your evaluations for your chance to win cool prizes
- Kodak Zi8 HD Pocket Video Camera
- 2 HP Netbooks

Also tons of books
- 2 ThinkGeek gift cards for $100
- Telerik RAD controls set
- 2 licenses of Essential User Interface Studio
- 1 webcast from Critical Path
- Microsoft Zune