transactional content management (tcm) optimizing sharepoint for

Transactional Content Management (TCM) Optimizing SharePoint for...

Upload: gervase-turner

Post on 22-Dec-2015



TRANSCRIPT

Page 1: Transactional Content Management (TCM) Optimizing SharePoint for

Transactional Content Management (TCM)

Optimizing SharePoint for...

Page 2: Transactional Content Management (TCM) Optimizing SharePoint for

About Hershey Technologies…

• Founded in 1991

• Microsoft Partner

• Specialists in:

• Document Imaging / Scanning

• OCR (data and document capture)

• ECM

• BPM / workflow

• End to End SharePoint Consulting Services

• Follow us on Twitter: @HersheyTech

Page 3: Transactional Content Management (TCM) Optimizing SharePoint for

About Tom Castiglia…

• Principal at Hershey Technologies

• Twitter: @tomcastiglia

• Email: [email protected]

• Joined Hershey Tech in 1998

•Director of Hershey’s professional services team since 2001

Page 4: Transactional Content Management (TCM) Optimizing SharePoint for

Agenda

• Explanation of “Transactional Content Management” (TCM)

• Comparison of transactional content management requirements to collaborative document management in SharePoint

• Overview of SharePoint features that are relevant to TCM

• Compare strengths and weaknesses of SharePoint related to TCM

• How to make SharePoint support TCM

• Demo of solutions that fill the feature gaps to ensure SharePoint is successful for your transactional content management project:

• Ad-hoc scanning / document capture into SharePoint

• Optimizing SharePoint search for large scale TCM deployments

• Enable collaboration of static, transactional documents

• Make scanned images and PDF documents 1st class citizens within SharePoint

Page 5: Transactional Content Management (TCM) Optimizing SharePoint for

Topics not covered in this presentation

• Assumptions - I presume that you understand:

• Columns (document metadata)

• Content Types

• Document Libraries

• Other topics not covered (just not enough time to include):

• Automated Data Capture/OCR

• Records Management

• Workflow

• RBS (Remote BLOB Storage)

Page 6: Transactional Content Management (TCM) Optimizing SharePoint for

Enterprise Content Management in SharePoint

• Web Content: SharePoint rocks at this!

• Document Collaboration: SharePoint rocks at this!

• Transactional Documents: SharePoint needs a little help here

Page 7: Transactional Content Management (TCM) Optimizing SharePoint for

What is “Transactional Content Management”?

“high-volume throughput of relatively static documents”

“content which typically originates outside an organization from external parties – customers or partners – and relies on workflow or business process management (BPM) to drive transactional, back-office business processes.”

-Forrester Research

Page 8: Transactional Content Management (TCM) Optimizing SharePoint for

Typical types of documents

TRANSACTIONAL DOCUMENTS

• Purchase Orders

• Vendor Invoices

• Application Forms

• Insurance claims

• Student Records

• Enrollment Forms

• (Not project based)

COLLABORATIVE DOCUMENTS

• Proposals, reports, spreadsheets, presentations and other documents created and edited by knowledge worker users

• Office docs (Word, Excel, PowerPoint)

• PDF files

• Created and uploaded on an ad-hoc basis to support day to day operations

• (Often project based)

Page 9: Transactional Content Management (TCM) Optimizing SharePoint for

How documents are typically received

TRANSACTIONAL DOCUMENTS

Fax Server

[email protected]@mycompany.com

OCR / Form Processing

External Systems

(AP, claims, etc.)

Page 10: Transactional Content Management (TCM) Optimizing SharePoint for

Information Architecture

TRANSACTIONAL CONTENT

• Centralized

• Often isolated to just one or a few site collections

• Document Center or Records Center

• Thousands to millions of documents per library

COLLABORATIVE CONTENT

• Decentralized

• Documents are often spread throughout many site collections, sub-sites, libraries and content types

• Typically under 5K documents per library.

Page 11: Transactional Content Management (TCM) Optimizing SharePoint for

How users find documents

TRANSACTIONAL DOCUMENTS

• Navigation doesn’t work - too many documents per library

• Search via metadata queries only

• Ignore document content

• Ignore social based algorithms like ratings

• Users expect intuitive, graphical query builders to specify precise search conditions against one or more metadata fields.

COLLABORATION SCENARIOS

• Navigation

• Site → Sub-Site → Library → Folder → Document

• Keyword Search

• Searches both metadata and document content

• Use of social algorithms improves search results (e.g. highly rated documents are returned above other documents)

Page 12: Transactional Content Management (TCM) Optimizing SharePoint for

How users find documents

TRANSACTIONAL DOCUMENT SEARCH

TYPICAL SHAREPOINT SEARCH

Page 13: Transactional Content Management (TCM) Optimizing SharePoint for

Why Keyword Search doesn’t work for Transactional Docs

KEYWORD SEARCH

Returns 16 items, only 6 of which are related to what I wanted.

Included other documents that happen to contain the StudentId value either as text in the document or in some other field (like an Invoice Number, or something else)

METADATA DRIVEN QUERY

Returns only the 6 correct items
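The difference between the two result sets can be reproduced with a toy, in-memory example. This is only a sketch of the idea, not SharePoint's search engine: the `Doc` class, the sample documents and both search functions are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a library item: metadata fields plus body text."""
    name: str
    metadata: dict[str, str] = field(default_factory=dict)
    body: str = ""


def keyword_search(docs, term):
    """Match the term anywhere: in any metadata value or in the document text."""
    term = term.lower()
    return [
        d for d in docs
        if term in d.body.lower()
        or any(term in value.lower() for value in d.metadata.values())
    ]


def metadata_query(docs, field_name, value):
    """Match only documents whose named metadata field equals the value."""
    return [d for d in docs if d.metadata.get(field_name) == value]


docs = [
    Doc("transcript.pdf", {"StudentId": "10482"}),
    Doc("enrollment.pdf", {"StudentId": "10482"}),
    Doc("invoice.pdf", {"InvoiceNumber": "10482"}),  # same value, different field
    Doc("memo.docx", {"StudentId": "20011"}, body="Refer to case 10482 for details."),
]

keyword_hits = keyword_search(docs, "10482")
metadata_hits = metadata_query(docs, "StudentId", "10482")

print([d.name for d in keyword_hits])
print([d.name for d in metadata_hits])
```

The keyword search also picks up the invoice and the memo, because the value appears in another field or in the text; the metadata query returns only the two student records.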

Page 14: Transactional Content Management (TCM) Optimizing SharePoint for

Four Challenges to Transactional Content Management in SharePoint

• Configuring Managed Properties in SharePoint Search is more complex than it needs to be.

• SharePoint does not provide a robust query builder for users to intuitively query documents (other ECM solutions offer this OOB)

• SharePoint formats Search results like a search engine, not like a document management product.

• SharePoint treats PDF documents and scanned images as 2nd class citizens.

Page 15: Transactional Content Management (TCM) Optimizing SharePoint for

Integrating Metadata with Search

Metadata Columns → Crawled Properties → Managed Properties → Search Results

Page 16: Transactional Content Management (TCM) Optimizing SharePoint for

Crawled Properties

• Crawled properties are metadata (such as author, title, or subject) that are extracted from SharePoint columns during crawls.

• However, this is the internal representation of the metadata. To enable users to search on this metadata, we need to use managed properties that are mapped to the crawled properties.

Page 17: Transactional Content Management (TCM) Optimizing SharePoint for

Crawled Properties

• A new crawled property is created for each new custom column, after…

• The column is added to at least one list or library

• The column is populated with a value in at least one item

• A Full Crawl is performed

Page 18: Transactional Content Management (TCM) Optimizing SharePoint for

Crawled Properties - Categories

• All crawled properties are grouped into various categories.

• For Transactional Content Management solutions, we generally care about the “SharePoint” Category, which contains crawled properties that are tied to list columns in SharePoint.

• Accessible from Search Service Application: Metadata Properties>Categories

Page 19: Transactional Content Management (TCM) Optimizing SharePoint for

Crawled Properties

• The naming convention is fully controlled by SharePoint, using this convention: ows_[internal name of column]

• However, spaces or other symbols (.-!@#$%^, etc.) within the internal column name are escaped, such as:

Column Internal Name | Crawled Property Name

InvoiceNumber | ows_InvoiceNumber

Invoice Number | ows_Invoice_x0020_Number

Invoice.Number | ows_Invoice_x002e_Number

Invoice-Number | ows_Invoice_x002d_Number
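The escaping pattern in these examples can be sketched in a few lines. This is a minimal approximation inferred only from the rows above (each escaped character becomes `_x` plus its four-digit lowercase hex code plus `_`); the function name is hypothetical, and SharePoint's real rules may differ for characters not shown here.

```python
def crawled_property_name(internal_name: str) -> str:
    """Approximate the crawled property name for a column internal name.

    Assumption: ASCII letters, digits and underscores pass through unchanged;
    every other character is escaped as _xHHHH_ (four hex digits, lowercase).
    """
    escaped = "".join(
        ch if (ch.isascii() and ch.isalnum()) or ch == "_" else f"_x{ord(ch):04x}_"
        for ch in internal_name
    )
    return "ows_" + escaped


for name in ["InvoiceNumber", "Invoice Number", "Invoice.Number", "Invoice-Number"]:
    print(name, "->", crawled_property_name(name))
```

Always confirm the real crawled property name in the Search Service Application before mapping it, rather than relying on a derived name.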

Page 20: Transactional Content Management (TCM) Optimizing SharePoint for

Crawled Properties

• In SP2010, most SharePoint columns get one crawled property

• Managed Metadata columns get a 2nd crawled property, with a prefix of “ows_taxid”

• This extra crawled property is used to store the internal GUID value that is associated with the managed metadata term. For example:

Column Name: CostCenter

Normal Crawled Property: ows_CostCenter

MM Id Crawled Property: ows_taxid_CostCenter

Page 21: Transactional Content Management (TCM) Optimizing SharePoint for

Managed Properties…

•…Allow you to enable standardization in the terms used for searching SharePoint.

•…Represent the end-user’s vision of the SP taxonomy (at least with regards to Search)

• So the name of your managed properties should normally be something intuitive to your end-users

Page 22: Transactional Content Management (TCM) Optimizing SharePoint for

Managed Properties

• One managed property may be mapped to one or more crawled properties.

• Useful in low governance situations where multiple site owners or site collection admins have duplicated site columns using different names (e.g. InvoiceNumber vs ‘Invoice Number’)

• One crawled property may be mapped to one or more managed properties

• Useful if different applications create their own managed properties, and need to reference the same crawled property.
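The many-to-many relationship is easy to picture as a plain data structure. In this sketch the mapping is kept as a dictionary from managed property to crawled properties and then inverted; the managed property `APInvoiceNo` is a hypothetical name standing in for a property created by a second application.

```python
from collections import defaultdict

# Managed property -> crawled properties it is mapped to (illustrative values).
managed_to_crawled = {
    # One managed property covering two duplicated site columns.
    "InvoiceNumber": ["ows_InvoiceNumber", "ows_Invoice_x0020_Number"],
    # A second application's own managed property reusing a crawled property.
    "APInvoiceNo": ["ows_InvoiceNumber"],
}


def invert(mapping):
    """Build the reverse view: crawled property -> managed properties using it."""
    crawled_to_managed = defaultdict(list)
    for managed, crawled_names in mapping.items():
        for crawled in crawled_names:
            crawled_to_managed[crawled].append(managed)
    return dict(crawled_to_managed)


crawled_to_managed = invert(managed_to_crawled)
for crawled, managed_names in crawled_to_managed.items():
    print(crawled, "->", managed_names)
```

The reverse view is handy when auditing a search schema: it shows at a glance which crawled properties feed more than one managed property.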

Page 23: Transactional Content Management (TCM) Optimizing SharePoint for

Using Managed Properties

WITHOUT MANAGED PROPERTIES

Returns 16 items, only 6 of which are related to what I wanted.

Included other documents that happen to contain the StudentId value either as text in the document or in some other field (like an Invoice Number, or something else)

WITH MANAGED PROPERTIES

Returns only the 6 correct items

Page 24: Transactional Content Management (TCM) Optimizing SharePoint for

Advanced Search Web Part

Provides an OOB search interface that allows users to select a Managed Property from a drop down list, rather than having to type out the managed property name (e.g. “StudentID:” or “StudentID=”)
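What the drop-down spares users from typing is a property restriction in the search query text. The sketch below only builds such query strings, using the `property:value` (contains) and `property=value` (equals) forms mentioned above; it does not call SharePoint, and the helper names and the `DocumentType` property are hypothetical.

```python
def kql_condition(prop: str, value: str, exact: bool = False) -> str:
    """Build one property restriction, e.g. StudentID="10482"."""
    if '"' in value:
        # Quote escaping is deliberately left out of this sketch.
        raise ValueError("double quotes in values are not handled by this sketch")
    operator = "=" if exact else ":"
    return f'{prop}{operator}"{value}"'


def build_query(conditions) -> str:
    """Join (property, value, exact) tuples into a single query string."""
    return " AND ".join(kql_condition(p, v, exact) for p, v, exact in conditions)


query = build_query([
    ("StudentID", "10482", True),         # exact match on the managed property
    ("DocumentType", "Transcript", False)  # "contains" match
])
print(query)
```

A query builder user interface, whether the native web part or a third-party one, is essentially generating strings like this from the fields and values the user picks.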

Page 25: Transactional Content Management (TCM) Optimizing SharePoint for

Configuring Advanced Search Web Part

Use your favorite XML editor (VS 2012)

Page 26: Transactional Content Management (TCM) Optimizing SharePoint for

Using Advanced Search Web Part

Page 27: Transactional Content Management (TCM) Optimizing SharePoint for

Creating Managed Properties

Unlike Crawled Properties (which are always auto-generated by SharePoint), managed properties can be created in one of three ways…

Page 28: Transactional Content Management (TCM) Optimizing SharePoint for

Creating Managed Properties (Option 1)

Managed Properties can be created manually by a SharePoint Administrator from the Search Service Application configuration.

• SP2010: “Metadata Properties” link

• SP2013: “Search Schema” link

Page 29: Transactional Content Management (TCM) Optimizing SharePoint for

Creating Managed Property (SP2010)

• Click “New Managed Property” link from Metadata Property Mappings

• Property Name can contain most characters, except for spaces (but please don’t use special characters)

• Based on the selected type, this managed property can only be mapped to crawled properties with the same type.

• Add Mapping – Select 1 or more crawled properties to map to this managed property.

• If multiple are selected, decide whether to include all values or just the first one found

• Scopes – preset filter on content – like a global where clause

• Reduce storage requirements (“hash”) – option actually works in reverse to what is stated.
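The “include all values or just the first one found” choice for multiple mappings can be pictured with a small function. This is an illustration of the concept only: the function, the sample item, the `ows_InvNo` property and the rule that “first” means first in mapping order are all assumptions, not SharePoint's implementation.

```python
def resolve_managed_value(item, crawled_names, include_all):
    """Collect values for a managed property from its mapped crawled properties.

    item          -- dict of crawled property name -> value for one document
    crawled_names -- mapped crawled properties, in mapping order
    include_all   -- True: keep every non-empty value; False: keep only the first
    """
    values = [
        item[name] for name in crawled_names
        if item.get(name) not in (None, "")
    ]
    return values if include_all else values[:1]


item = {
    "ows_InvoiceNumber": "",                # column exists but is empty
    "ows_Invoice_x0020_Number": "INV-77",
    "ows_InvNo": "INV-77A",                 # hypothetical third duplicate column
}
mapping = ["ows_InvoiceNumber", "ows_Invoice_x0020_Number", "ows_InvNo"]

print(resolve_managed_value(item, mapping, include_all=True))
print(resolve_managed_value(item, mapping, include_all=False))
```

Keeping only the first value avoids conflicting duplicates in results; including all values is safer when the duplicated columns may each hold valid data.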

Page 30: Transactional Content Management (TCM) Optimizing SharePoint for

Creating Managed Property (SP2013)

• Property Name - Same as SP2010

• Add Mapping - same as in SP2010

• Reduce storage requirements (“hash”) option - No longer exists in SP2013

• Many additional settings:

• Searchable – Enables querying against the content of the managed property

• Queryable – Enables querying against the specific managed property

• Retrievable – Enable this setting for managed properties that are relevant to present in search results.

• Refinable – Can be used as a search refiner

• Sortable – Can be used to sort search results

• Token Normalization

• Complete Matching

Page 31: Transactional Content Management (TCM) Optimizing SharePoint for

Creating Managed Properties (Option 2)

Automatically generated by custom code or a 3rd party application

• For example, Hershey’s XenDocs ECM for SharePoint will validate that a managed property is properly configured or automatically create a managed property for each column when our web part is configured.

Page 32: Transactional Content Management (TCM) Optimizing SharePoint for

Creating Managed Properties (Option 3)

Let SharePoint Auto-Generate new managed properties when it crawls

Page 33: Transactional Content Management (TCM) Optimizing SharePoint for

Auto-Generating Managed Properties

• In SharePoint 2010…

• This feature is off by default, but it can be enabled in your Search Service Application

From the Categories list, hover over the SharePoint category, click the drop down arrow and then select the Edit Category option.

Select the option to “automatically generate a new managed property for each crawled property…”

Page 34: Transactional Content Management (TCM) Optimizing SharePoint for

Auto-Generating Managed Properties

• In SharePoint 2013…

• All site columns that contain data will have a managed property auto-generated upon a full crawl

• This does not happen for list columns

• This feature cannot be turned off and is not configurable (as far as I can tell)

http://technet.microsoft.com/en-us/library/jj613136.aspx

Page 35: Transactional Content Management (TCM) Optimizing SharePoint for

Comparison of Naming conventions for Crawled and Auto-Generated Managed Properties

Column Name | SP2010 Crawled Property | SP2010 Managed Property | SP2013 Crawled Properties | SP2013 Managed Property

FooBar | ows_FooBar | owsFooBar1 | ows_FooBar; ows_q_TEXT_FooBar | Not mapped; FooBarOWSTEXT

Foo Bar | ows_Foo_x0020_Bar | owsFoox0020Bar | ows_Foo_x0020_Bar | FooBarOWSTEXT

Foo_Bar | ows_Foo_Bar | owsFooBar | ows_Foo_Bar; ows_q_TEXT_Foo_Bar | Not mapped; FooBarOWSTEXT

Foo-Bar | ows_Foo_x002d_Bar | owsFoox002dBar | ows_Foo-Bar; ows_q_TEXT_Foo-Bar | Not mapped; Foo-BarOWSTEXT

Foo.Bar | ows_Foo_x002e_Bar | owsFoox002eBar | ows_Foo.Bar; ows_q_TEXT_Foo.Bar | Not mapped; Foo.BarOWSTEXT

The auto-generated names for managed properties are not “end-user friendly” !

Page 36: Transactional Content Management (TCM) Optimizing SharePoint for

Enhancing the User Experience…

Page 37: Transactional Content Management (TCM) Optimizing SharePoint for

Hershey’s XenDocs ECM for SharePointA vast improvement compared to the native Advanced Search Web Part

Page 38: Transactional Content Management (TCM) Optimizing SharePoint for

Viewing PDF files and scanned images

• MS Office documents are 1st class citizens in SharePoint

• When Office files are opened in Office 2007, 2010 or 2013, users can perform many SharePoint functions on those documents:

• Edit document content

• Check in/out/discard

• See version history

• Edit metadata

• Preview Thumbnails in SP 2013

• Most other file types, especially PDF files and scanned images, are 2nd class citizens

• Read only view of document

Page 39: Transactional Content Management (TCM) Optimizing SharePoint for

Viewing Scanned Images and/or PDF files

Files typically open in native apps such as Windows Photo Gallery or Adobe Reader

• Users cannot edit metadata

• If user rotates, re-orders or deletes a page, the changes cannot be saved to SP

• User cannot annotate pages (e.g. sticky notes, redactions, etc.)

Page 40: Transactional Content Management (TCM) Optimizing SharePoint for

Vizit Essential™ - Integrated viewing of scanned images and/or PDF documents in SharePoint

A powerful, low cost PDF and imaging viewer for SharePoint

Page 41: Transactional Content Management (TCM) Optimizing SharePoint for

Vizit Essential - Integrated viewing of scanned images and/or PDF documents in SharePoint

Visually search documents with thumbnails and quick previews

Page 42: Transactional Content Management (TCM) Optimizing SharePoint for

Vizit Essential - Integrated viewing of scanned images and/or PDF documents in SharePoint

Search for text within a PDF file (just like Adobe Reader/Acrobat)

Page 43: Transactional Content Management (TCM) Optimizing SharePoint for

Vizit Essential - Integrated viewing of scanned images and/or PDF documents in SharePoint

Edit SharePoint metadata within the viewer for PDF documents and scanned images

Page 44: Transactional Content Management (TCM) Optimizing SharePoint for

Vizit Pro™ - Integrated viewing of scanned images and/or PDF documents in SharePoint

Adds robust image editing features – annotations, re-order, rotate or delete pages, image cleanup

Page 45: Transactional Content Management (TCM) Optimizing SharePoint for

Conclusion

• To leverage SharePoint’s native features for transactional document management, plan on:

• Extensive upfront planning

• Complex configuration (many more steps to configure SP compared to most dedicated document management products)

• To make the overall user experience in SharePoint comparable with dedicated Document Management products, plan on:

• Lots of custom code ... OR …

• 3rd party solutions