Repository Scalability - Comparing SharePoint 2010 with Oracle UCM 11g
Report on testing comparing the performance and scalability of SharePoint 2010 and Oracle UCM 11g, presented at Collaborate 2011
Repository Scalability: Comparing Microsoft SharePoint 2010 and Oracle UCM 11g
Raoul Miller and Brent Seaman, TEAM Informatics, Inc.
Outline
• Our ingestion rate experiments
• Hardware and Software setup
• Experimental design
• Observations and Conclusions from these tests
• Implications for Repository Sizing and Organization in SharePoint and UCM
• Lessons Learned and Recommendations
• Q and A
Overall aims of this research
• Apply real-world scenarios to ingestion testing
  – Rather than ultra high performance / ultra high cost
• Determine actual ingestion rates for different scenarios on identical hardware
• Expose weaknesses / issues in large imports
• Derive recommendations for best practices in importing existing content into new CMS repositories
Experimental Approach
• Import existing files from file system into newly-installed CMS
  – Standard configurations
  – Commodity hardware
  – No specialized tuning or optimizations
  – Vendor recommended OS and databases
• Four scenarios
  – 20,000 files @ 40kB
  – 20,000 files @ 100kB
  – 1,000,000 files @ 40kB
  – 1,000,000 files @ 100kB
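For a sense of scale, the total data volume behind each of the four scenarios can be worked out directly from the file counts and sizes above. This is a minimal sketch; the function name is illustrative and decimal units (1 GB = 1,000,000 kB) are an assumption, since the slides do not say which convention was used.

```python
# Scenario definitions taken from the slide: (file count, file size in kB).
SCENARIOS = [
    (20_000, 40),
    (20_000, 100),
    (1_000_000, 40),
    (1_000_000, 100),
]


def corpus_size_gb(file_count: int, file_size_kb: int) -> float:
    """Return the total corpus size in GB, assuming 1 GB = 1,000,000 kB."""
    return file_count * file_size_kb / 1_000_000


for count, size_kb in SCENARIOS:
    total = corpus_size_gb(count, size_kb)
    print(f"{count:>9,} files @ {size_kb:>3} kB = {total:g} GB")
```

Under that assumption the scenarios range from under 1 GB to 100 GB of source content.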
Are these Scenarios Realistic?
• >80% of single instance CMS repositories contain 50-200,000+ items
• Average “document” size in most industries is ~100kB.
• Most projects need to import existing content from file shares or other systems
Commodity Hardware
• Dell PowerEdge R710s server
• Dual Intel Xeon 5560 CPUs (each quad core) running at 2.8 GHz
• 16GB RAM
• Eight 146GB 10K RPM SAS drives
UCM Installation
• Operating System: Red Hat Enterprise Linux 5 (specifically the CentOS 5 build)
• Database: Oracle 11g Standard Edition database
• Web / Application Server: WebLogic 11gR1 (10.3.3)
• Content Management System: UCM 11gR1 (11.1.1.4.0)
• Java Runtime Environment: Sun HotSpot SDK (1.6.0_11) & JRockit R28
• File storage: File system (default) and JDBC (SecureFiles)
SharePoint Installation
• Operating System: Windows Server 2008 Std Edition for Partners
• Database: Microsoft SQL Server 2008 R2 Enterprise
• Web Server: IIS7 (Standard with Windows Server 2008 - specifically v 7.5.76)
• Content Management System: SharePoint Server 2010 Enterprise for Partners
• File storage: Database Storage in SQL server
Ingestion Approaches
• UCM
  – Used BatchBuilder and BatchLoader
• SharePoint
  – Had to use third-party tool (UploadZen by Roxority)
  – Need to organize content before import
  – Limited flexibility in directory size
Supported SharePoint 2010 bulk import strategies
• Multiple file upload applet
  – Silverlight; supports up to 100 docs, does not support subdirectories
• Windows Explorer view
  – Extension of WebDAV
  – Limited performance
• SharePoint Workspace
  – Client integration
  – Only supports up to 500 documents
Differences between Import Strategies
• BatchLoader
  – Supported system tool
  – Allows automated file system crawl (BatchBuilder)
  – Storage / browse location in repository unrelated to source location
  – Supports high volume
• UploadZen
  – Third-party application
  – Requires organization and sizing of import directories
  – Organization within repository reflects import location
  – Major challenges with high volume imports
Considerations for Repository Sizing
1. Should be primarily driven by business / infosec needs
2. Practicality
  – Import / migrate
  – Search / organize
  – Backup / DR
3. Flexibility
  – Growth in content volume / size
  – Leverage HSM / partitioning
  – Provide options for storage strategies
Ingestion Rate Testing
• Major things to test:
  – Overall rate of ingestion with different sized files and different sized collections
  – Ease of use of import tools
  – Flexibility in organization of content during / after import
20,000 files – each 40kB
• First set of tests
• Single directory for SharePoint source
• UCM – File System storage – 198,000 docs/hr
• UCM – JDBC storage – 156,000 docs/hr
• SharePoint – 153,000 docs/hr
20,000 files – each 100kB
• UCM – File System storage – 171,000 docs/hr
• SharePoint – 138,000 docs/hr
• Ingestion rates fell 10-15% for larger file size
• SharePoint RAM usage higher, primarily in database
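The 10-15% figure can be checked against the reported 20,000-file rates for 40kB and 100kB files. A minimal sketch, using only the docs/hr numbers from the two 20,000-file slides; the function and variable names are illustrative.

```python
def percent_decrease(before: float, after: float) -> float:
    """Return the relative drop from `before` to `after`, as a percentage."""
    return (before - after) / before * 100


# (rate at 40kB, rate at 100kB) in docs/hr, from the 20,000-file tests.
RATES = {
    "UCM (file system storage)": (198_000, 171_000),
    "SharePoint": (153_000, 138_000),
}

for system, (rate_40kb, rate_100kb) in RATES.items():
    drop = percent_decrease(rate_40kb, rate_100kb)
    print(f"{system}: {drop:.1f}% slower at 100kB")
```

This gives roughly 13.6% for UCM and 9.8% for SharePoint, in line with the range quoted on the slide.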
1,000,000 files – each 40kB
• Need to organize files in directories for SharePoint
  – 50 folders each with 20,000 items – failed
  – 2,000 folders each with 500 items – succeeded
• UCM – FS storage & Sun JRE – 205,000 docs/hr
• UCM – FS storage & JRockit JRE – 212,000 docs/hr
• UCM – JDBC storage & Sun JRE – 171,000 docs/hr
• SharePoint w/ 50 import folders – failed
• SharePoint w/ 2,000 import folders – 217,000 docs/hr
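The pre-import organization step for SharePoint (splitting a flat source directory into folders of at most 500 items) could be scripted along these lines. This is a hypothetical sketch, not the procedure used in the tests: the function names, the `batch_NNNNN` folder naming, and the choice to copy rather than move files are all assumptions.

```python
import shutil
from pathlib import Path


def folders_needed(file_count: int, max_items: int) -> int:
    """Return how many folders are needed to hold file_count files (ceiling division)."""
    return -(-file_count // max_items)


def partition_into_folders(source_dir: Path, target_dir: Path, max_items: int = 500) -> int:
    """Copy files from source_dir into numbered subfolders of target_dir.

    Each subfolder receives at most max_items files. Returns the number of
    folders created. Folder names (batch_00000, ...) are illustrative only.
    """
    files = sorted(p for p in source_dir.iterdir() if p.is_file())
    folder_count = 0
    for start in range(0, len(files), max_items):
        folder = target_dir / f"batch_{folder_count:05d}"
        folder.mkdir(parents=True, exist_ok=True)
        for path in files[start:start + max_items]:
            shutil.copy2(path, folder / path.name)
        folder_count += 1
    return folder_count
```

With 1,000,000 source files and `max_items=500`, `folders_needed` returns 2,000, matching the layout that succeeded in the test.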
1,000,000 files – each 40kB (contd.)
• Substantial work to organize content for SharePoint import
• SharePoint much more RAM intensive
  – Primarily with database process
• UCM more CPU intensive
  – Much more linear response
1,000,000 files – each 100kB
• UCM – FS storage & Sun JRE – 179,000 docs/hr
  – 15% decrease in rate due to file size
• Unable to complete test with SharePoint
Conclusions
• SharePoint requires 3rd party tools and substantial work before import
• SharePoint has limited flexibility in terms of repository sizing, content organization, and import strategies
• With optimized import, SharePoint ingestion rates are comparable to UCM
• UCM has much more flexibility in import strategies
• UCM has consistent import rates between 156,000 and 212,000 docs/hr (OOTB)
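To translate the reported UCM rate range into wall-clock time, the arithmetic below estimates how long a 1,000,000-item import would take at each end of the range. A rough sketch only: it assumes the rate stays constant for the whole import, and the function name is illustrative.

```python
def hours_to_ingest(doc_count: int, docs_per_hour: float) -> float:
    """Return the estimated import time in hours at a constant ingestion rate."""
    return doc_count / docs_per_hour


DOC_COUNT = 1_000_000

# Slowest and fastest out-of-the-box UCM rates quoted in the conclusions.
for rate in (156_000, 212_000):
    hours = hours_to_ingest(DOC_COUNT, rate)
    print(f"{rate:,} docs/hr -> {hours:.2f} hours for {DOC_COUNT:,} documents")
```

At these rates a million-document import lands between roughly 4.7 and 6.4 hours.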
Conclusions (contd.)
• Ingestion rates are dependent on average file size (10-15% decrease in rate between 40kB and 100kB file size)
• UCM can be deployed on commodity hardware for repositories of 1,000,000 items
• SharePoint has challenges importing 1,000,000 files on commodity hardware
• Both systems function well on this hardware after import
• SharePoint import is much more RAM intensive whereas UCM import is CPU intensive
Q&A