FAST for SharePoint 2010: How and Why?
C D H
C D H FAST Search Server 2010
May 2011
C D H Quick Facts
About Us
• 21st Year
• Grand Rapids & Royal Oak
• 30 Staff
Approach
• Vendor Neutral
• Non-reseller
• Professional Services Only
Partnerships
• Microsoft Gold
– Central Region Client Experience Award Winner
• VMware Enterprise
• Cisco Premier
• Novell Gold
• Citrix Silver
C D H
Expertise
• Infrastructure
• Access & Identity Management
• Project Management
• Collaboration
C D H Talks Tech
C D H About David Tappan
David Tappan
IOAp | MCITP:EA | MCP:[email protected]
C D H Agenda
• Introducing FAST Search Server 2010 for SharePoint
• Customizing search to meet your business needs
• Building search-driven applications
• FAST Search Server 2010 for SharePoint architecture
• Q&A
C D H
People & Expertise
My Work
Business Data
Information Services
Enterprise Content
Search connects people with information
Making people more productive and driving business outcomes
C D H
Search solutions can be broad and deep
Productivity Search Experience
Connect all of your people with a broad set of information
• Enterprise content and data
• Intranet sites
• Team workspaces
• People and expertise
Search-Driven Applications
Drive measurable ROI by helping a business group make the most of a specific set of information
• 360° customer insight
• Competitive intelligence
• Research portals
C D H
The Enterprise Search challenge
Many businesses invest in multiple solutions, with unsatisfactory results
Software Costs
People Costs
Opportunity Costs
Need to be less complex & costly to use company-wide
Need to be more interactive & personal to satisfy expectations
Productivity Search Experiences
C D H Introducing FAST Search for SharePoint
A new choice for enterprise search that eliminates compromise
Productivity Search Experience
Search-Driven Applications
A Single, Cost-effective Infrastructure
C D H
User Interface is visual and actionable
Visual and conversational interaction with precise control
Built on SharePoint Search Center
Leverages all of the innovations in SharePoint: Open Web Parts, Federation, query suggestions, related queries, "Did you mean?"
Visual results connect users with content
Thumbnails for Word and PowerPoint. Visual Best Bets highlight premium content. Preview in browser without leaving the results.
Deep Refinement
Thumbnails
Previews
Sort on any field
Similar Results
C D H
Visual Best Bets
Visual cues to highlight essential content
Built on SharePoint Keywords
Matches keywords and synonyms that are contextually relevant to users. Include banners, videos, external websites.
Easy and quick to set up
Point-and-click setup for site admins. Set and forget with content expiration dates. Web Parts allow for easy page customization.
Visual Notification Web Part Flexibility
C D H
Deep Refinement
Enables precise control of results
Contains exact counts
Leads to discovering non-obvious relationships, key data trends, and deep analysis of your content
Enables conversational experience
You will never miss any content; enables better findability and exploration across the entire result set
Provides a sorted view
Each refiner is sorted by frequency, from highest to lowest, indicating the importance of each term
Exact Counts
Sorted by frequency
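The idea of a refiner with exact counts, sorted by frequency, can be illustrated with a few lines of Python. This is only a sketch of the concept, not how FAST Search computes refiners internally; the result set, the `company` property, and the `refiner_counts` helper are all made up for illustration.

```python
from collections import Counter

# Hypothetical result set: each hit carries a "company" metadata value.
results = [
    {"title": "Q1 report", "company": "Microsoft"},
    {"title": "Press release", "company": "FAST"},
    {"title": "Roadmap", "company": "Microsoft"},
    {"title": "Case study", "company": "Microsoft"},
    {"title": "Whitepaper", "company": "FAST"},
    {"title": "Memo", "company": "Contoso"},
]


def refiner_counts(hits, prop):
    """Count every value of `prop` over the whole result set, most frequent first."""
    counts = Counter(hit[prop] for hit in hits if prop in hit)
    return counts.most_common()


company_refiner = refiner_counts(results, "company")
for value, count in company_refiner:
    print(f"{value} ({count})")
```

Because the counts are taken over the entire result set rather than the first page of hits, the numbers shown next to each refiner value are exact.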
C D H Customize search to meet your business needs
Key ingredients to a great customized search experience
Search in the language of your business
Deliver results that are contextually relevant
Tune relevancy to meet diverse needs
Process content with advanced linguistics
Customize the user experience to build engaging applications
C D H Search in the language of your business
Use terms and language that are unique to your organization
– Automatically generate metadata from your content
– Entity extraction with the FAST Content Processing Pipeline
– Metadata is the key to findability
• Leverage corporate knowledge assets to determine metadata to extract
– Corporate taxonomies
– Business terminology
– Product names
– Acronyms
• Incorporate your language in the search experience
– Users can quickly refine content using familiar terms
– Build confidence that you found the correct answers the first time
Word cloud of business terms: Profit, Taxonomy, risk, best practices, Strategy Development, customer relations, revenue, brand management, compliance, SOX, supply chain, Disaster Recovery, merger, acquisition, target markets, cloud computing, mobile workforce, quality, cost savings, market share, Productivity, Social Media, IP Telephony, communications, Competition, part numbers, Global presence, direct mail, storage, archive, audit, XML
The Content Pipeline
Processing & refinement
Stages: …, Format Conversion, Language Detection, Entity Extraction, Configurable Stages, Mapper
REDMOND, Wash., and OSLO, Norway — Jan. 8, 2008
Microsoft Corp. (Nasdaq “MSFT”) today announced that it will make an offer to acquire Fast Search & Transfer ASA (OSE: “FAST”), a leading provider of enterprise search solutions, through a cash tender offer for 19.00 Norwegian kroner (NOK) per share. This offer represents a 42 percent premium to the closing share price on Jan. 4, 2008 (the last trading day prior to this announcement), and values the fully diluted equity of FAST at 6.6 billion NOK (or approximately $1.2 billion U.S.). FAST’s board of directors has unanimously recommended that its shareholders accept the offer.
Extracted entity types: Concept, Money, Date, Company, Location
Introducing the FAST Content Processing Pipeline
Processing & refinement
Any term can be extracted and converted to metadata
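A minimal sketch of dictionary-based entity extraction, assuming a simple substring match against per-category term lists plus a date pattern. The real pipeline stage is far more sophisticated; the `DICTIONARIES` table, the `extract_entities` function, and the shortened press-release excerpt are illustrative assumptions only.

```python
import re

# Hypothetical extraction dictionaries: category -> known terms.
DICTIONARIES = {
    "Company": ["Microsoft Corp.", "Fast Search & Transfer ASA"],
    "Location": ["REDMOND, Wash.", "OSLO, Norway"],
}

# Matches dates written like "Jan. 8, 2008".
DATE_PATTERN = re.compile(
    r"(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\.? \d{1,2}, \d{4}"
)

# Shortened excerpt of the press release shown on the slide.
TEXT = (
    "REDMOND, Wash., and OSLO, Norway, Jan. 8, 2008: Microsoft Corp. today "
    "announced that it will make an offer to acquire Fast Search & Transfer ASA. "
    "This offer represents a 42 percent premium to the closing share price "
    "on Jan. 4, 2008."
)


def extract_entities(text):
    """Return {category: [terms found]} using case-insensitive dictionary lookups."""
    found = {}
    lowered = text.lower()
    for category, terms in DICTIONARIES.items():
        hits = [term for term in terms if term.lower() in lowered]
        if hits:
            found[category] = hits
    dates = DATE_PATTERN.findall(text)
    if dates:
        found["Date"] = dates
    return found


entities = extract_entities(TEXT)
for category, terms in entities.items():
    print(category, "->", terms)
```

Each category found this way becomes a candidate piece of metadata for the item.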
C D H Crawled Properties
Standard document metadata discovered by the crawler or extracted from the full text by the FAST Content Processing Pipeline.
Location: Redmond, WA; Oslo, Norway
Company: Microsoft; FAST
Date: January 8, 2008; January 4, 2008
Concept: Cash tender; Share price
Managed Properties
Map one or more Crawled Properties to a single field. Enables sorting, refinement, relevance tuning, and fielded searching.
Crawled Properties
Any data can be found!
Maps automatically or through Central Administration or PowerShell
Type DocId Title Author Date Size Location Company Concept Body
123 PressRelease … 01/08/2008 26K Redmond Microsoft Cash Tender …
345 … … … … … … … …
Index Profile Managed Properties
Map metadata to Managed Properties
Automatic association of metadata to content
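The relationship between crawled and managed properties can be sketched as a many-to-one mapping. The property names and the `to_managed` helper below are hypothetical; in the product the mapping is configured through Central Administration or PowerShell, not written in Python.

```python
# Hypothetical mapping: managed property -> crawled properties that feed it.
MAPPING = {
    "Company": ["companies_extracted", "ows_Company"],
    "Location": ["locations_extracted"],
    "Concept": ["concepts_extracted"],
}


def to_managed(crawled):
    """Merge the values of all mapped crawled properties into one managed field."""
    managed = {}
    for managed_name, crawled_names in MAPPING.items():
        values = []
        for name in crawled_names:
            for value in crawled.get(name, []):
                if value not in values:  # keep first occurrence, drop duplicates
                    values.append(value)
        if values:
            managed[managed_name] = values
    return managed


crawled_item = {
    "companies_extracted": ["Microsoft", "FAST"],
    "ows_Company": ["Microsoft"],
    "locations_extracted": ["Redmond", "Oslo"],
}
managed_item = to_managed(crawled_item)
print(managed_item)
```

Once several crawled properties land in a single managed field, that field can be used for sorting, refinement, and fielded search.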
C D H
What can I do with a Managed Property?
Metadata quality is critical to a good search experience
Enables deep refinement
Makes search conversational, guiding users to navigate and refine, while summarizing the results that are found
Enables precision relevancy
Managed properties are also used for relevancy tuning & ranking, multi-level sorting, advanced (or fielded) search
Enables Advanced Searching and Sorting
File Formats
Region
Industry
Expertise
C D H How does it work?
Example: Create a custom entity extractor
• Put your terms in the out-of-the-box extraction dictionaries by modifying an XML file
• Map the crawled property to a managed property
• Index your content
• Modify the refinement panel Web Part
Customized Extraction Dictionary
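As a rough illustration of the first step, the snippet below generates a small XML term dictionary from a list of business terms. The element and attribute names (`dictionary`, `entry`, `key`) are placeholders assumed for this sketch; check the product documentation for the exact schema your extraction dictionaries require.

```python
import xml.etree.ElementTree as ET


def build_dictionary_xml(terms):
    """Serialize unique terms into a simple XML dictionary (assumed schema)."""
    root = ET.Element("dictionary")  # placeholder element name
    for term in sorted(set(terms)):
        ET.SubElement(root, "entry", {"key": term})  # placeholder entry format
    return ET.tostring(root, encoding="unicode")


# Terms taken from the business-term examples earlier in the deck.
xml_text = build_dictionary_xml(["SOX", "supply chain", "IP Telephony", "SOX"])
print(xml_text)
```

Generating the file from a maintained term list (for example, a taxonomy export) keeps the dictionary in step with the business vocabulary.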
C D H How does it work?
Add refiners to user interface
• Built on a SharePoint List or custom extractor
• Edit the Search Center Results Page
• Modify the shared Web Part by adding tags to the refinement panel XML
• Create your own labels
• Save and Publish
Custom Collections
C D H Context matters
Users need to access multiple types of content
Depends on role, location, responsibility and task. This can change day to day, or hour to hour.
HR, Legal, Finance, Marketing, Sales, R&D, Customer Support, Professional Services, Manufacturing Operations, . . .
Enterprise Content
C D H
Deliver results that are contextually relevant
With search that understands your business and role
"What should I know about selling ERP?" - Alan Brewer, Sales Lead
"What should I know about implementing ERP?" - Renee Lo, Consultant
Role-specific relevance
Business-driven refinement
Targeted Best Bets / Visual Best Bets
C D H Quickly build a contextual experience
User-based tools for creating results that are relevant to your users
Pick the right ingredients
Match the proper terms and contexts to boost relevancy for targeted users, so that they always find the right content
• One-way synonyms: keywords map to other terms
• Two-way synonyms: keywords become equivalent to other terms
• Best Bets: highlight key resources that are always relevant to a keyword
• Visual Best Bets: extend Best Bets with pictures, video, Silverlight controls
• Document Promotion / Demotion: tailor specific document relevancy
Create new user contexts
Site administrators create contexts based on user profiles to deliver relevant results to the right audiences
Create new keywords
Site administrators have powerful and simple tools to configure the search experience for groups of users
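The difference between one-way and two-way synonyms is easy to show with a toy query expander. The synonym tables and the `expand` function are invented for this sketch and say nothing about how the product stores keywords.

```python
# One-way: the keyword expands to extra terms, but not the other way round.
ONE_WAY = {"erp": ["enterprise resource planning"]}

# Two-way: every term in a group is treated as equivalent to the others.
TWO_WAY = [{"sow", "statement of work"}]


def expand(term):
    """Return the sorted set of terms a query term should match."""
    term = term.lower()
    expanded = {term}
    expanded.update(ONE_WAY.get(term, []))
    for group in TWO_WAY:
        if term in group:
            expanded |= group
    return sorted(expanded)


print(expand("ERP"))                           # keyword pulls in its synonym
print(expand("enterprise resource planning"))  # synonym does not pull in the keyword
print(expand("SOW"))                           # two-way works in both directions
print(expand("statement of work"))
```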
C D H Tune relevancy to meet diverse needs
A flexible solution for your organization, groups and individuals
Optimize relevancy for broad intranet use
Query results with the default relevancy
"I want to know about my customer Woodgrove Bank and customers in Financial Services" - Alan Brewer, Sales Lead
New Default Sorting
Promotes relevant results
Quickly tailor relevancy models
Deliver the right results to the right people by creating new Rank Profiles
"I want to get right to the technical documents" - Renee Lo, Consultant
Documentation, RFPs and SOWs are now promoted
Same results, different order
Users can select rank profiles in the sort-by box or create their own default views by modifying the Web Part
C D H Rank Profiles
Tune relevancy without impacting the default algorithm
Quality: Also known as static rank; consists of multiple managed properties including site, URL depth (preference for shorter URLs), and relative importance of links to this document.
Authority: Applies when the query word falls in the link or anchor text.
Query Authority: Maps the popularity of a document, or the click-through rate when documents are clicked as a result of a query.
Freshness: Increases the relevancy if a document was recently created or modified, based on the last modified property.
Proximity: Applies to where query terms fall and how close they are to each other within a document.
Context: Increases the rank of a document if the query term is a managed property associated with that document.
Managed Property: Affects relevancy when a managed property contains a specific value, such as Woodgrove Bank or Financial Services.
Out-of-the-box relevancy
Tuned for a great general productivity experience; relevancy improves with click-throughs and link text analysis.
Extend the default algorithms
Create new default relevancy models. Blend static and dynamic ranking parameters to instantly improve search results.
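One way to picture a rank profile is as a set of weights applied to the ranking components listed above. The sketch below assumes a plain weighted sum, which is a simplification and not the actual FAST ranking formula; the component scores, weights, and document names are invented.

```python
from dataclasses import dataclass


@dataclass
class RankProfile:
    name: str
    weights: dict  # component name -> weight

    def score(self, components):
        """Weighted sum of a document's component scores."""
        return sum(w * components.get(c, 0.0) for c, w in self.weights.items())


DEFAULT = RankProfile(
    "default",
    {"quality": 1.0, "authority": 1.0, "freshness": 1.0, "proximity": 1.0, "context": 1.0},
)
# A new profile extends the default and boosts a managed-property match
# (for example, document type = technical documentation).
TECHNICAL = RankProfile("technical", {**DEFAULT.weights, "managed_property": 5.0})

DOCS = {
    "Woodgrove overview": {"quality": 0.9, "authority": 0.8, "freshness": 0.5,
                           "proximity": 0.6, "context": 0.4, "managed_property": 0.0},
    "Woodgrove SOW": {"quality": 0.4, "authority": 0.3, "freshness": 0.6,
                      "proximity": 0.6, "context": 0.4, "managed_property": 1.0},
}


def rank(profile, docs):
    """Order document names by descending score under the given profile."""
    return sorted(docs, key=lambda name: profile.score(docs[name]), reverse=True)


print(rank(DEFAULT, DOCS))
print(rank(TECHNICAL, DOCS))
```

The two profiles return the same documents in a different order, and the default profile itself is left untouched.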
C D H How to create a Rank Profile
IT Pros are empowered to create new profiles quickly
Rank Profiles are created in PowerShell by extending the default relevancy algorithm…
… and are exposed in the user interface by modifying the sorting Web Part.
C D H Process content with advanced linguistics
Automatic and detailed analysis creates a great search experience
– Breaks down content to the smallest addressable chunks to build meaning
– Understands file encoding, data formats, and written languages
– Supports 400+ file formats, 80+ languages
FAST Content Processing Pipeline
Map Crawled Properties: Maps all of the metadata that was discovered by the various pipeline stages.
Web Link Analysis: Analyzes documents for hyperlinks, extracting anchor text, which reinforces the authority ranking of a document.
Document Vector: Creates a unique representation of a document that reflects important terms and frequency of occurrence. Used to find similar documents.
Date and Time Normalization: Converts dates and times to a standard representation, to handle locale-specific representations. For example, knows that 14-Mar-10 is equivalent to March 14, 2010.
Entity Extraction: Finds terms in the content and maps them to predefined categories. Out-of-the-box support for People, Companies and Locations, but can be extended to any category.
Lemmatization: Finds the root of a word for a given language. For English it maps run, runs, running and ran back to a single lemma. Understands language-specific grammar and context.
Tokenization: Applies the language-specific rules for identifying words, concepts, idioms and phrases. Also applies custom word breakers found in part numbers or telephone numbers.
Language Encoding and Detection: Identifies the native written language and locale-specific encoding so that the proper dictionaries can be used by the tokenization and lemmatization stages.
Format Conversion: Extracts plain text from multiple file formats, encodings, and applications.
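Two of the stages above, date normalization and document vectors, can be sketched in a few lines. The code assumes only two input date formats and a bare term-frequency vector compared with cosine similarity; the actual pipeline stages handle far more cases, and all function names here are illustrative.

```python
import math
import re
from collections import Counter
from datetime import datetime

# Assumed input formats for this sketch: "14-Mar-10" and "March 14, 2010".
DATE_FORMATS = ["%d-%b-%y", "%B %d, %Y"]


def normalize_date(text):
    """Return an ISO 8601 date string, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def document_vector(text):
    """Very simple document vector: term -> frequency."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


print(normalize_date("14-Mar-10"), normalize_date("March 14, 2010"))

offer = document_vector("cash tender offer for FAST shares")
similar = document_vector("tender offer for shares of FAST")
unrelated = document_vector("quarterly hiring plan")
print(cosine_similarity(offer, similar) > cosine_similarity(offer, unrelated))
```

Both spellings of the date normalize to the same value, and the two documents about the tender offer score as more similar to each other than to the unrelated one.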
C D H Extending Pipeline capabilities
Safely add additional analysis and processing
Configure Optional Processing Steps
• XML Properties mapper
• Offensive Content Filter
• Pipeline Extensibility
– Calls external applications for custom item processing
• Field Collapsing
• Add Custom Processing
– Content classification
– Geo-tagging
– Machine translation
– Sentiment Analysis
What is Custom Processing?
Pipeline Extensibility is a specially defined stage that takes a set of crawled properties as flat text input and maps the output to another crawled property.
Custom Processing is Safe
Executable arguments and temporary files are automatically handled in a sandbox with timeouts. Runs before the Crawled Property mapping stage, making new metadata accessible in SharePoint.
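Conceptually, a custom processing step is a function from a set of input crawled properties to one new crawled property. The sketch below models that contract with plain dictionaries and a toy keyword-based sentiment check; it does not reproduce the real stage's file-based interface, and the property names and word lists are assumptions.

```python
# Toy lexicons for the sketch; a real classifier would be far richer.
POSITIVE = {"great", "easy", "powerful", "flexible"}
NEGATIVE = {"costly", "complex", "unsatisfactory"}


def sentiment_stage(crawled):
    """Read the 'body' crawled property and return a new 'sentiment' property."""
    words = crawled.get("body", "").lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        label = "positive"
    elif score < 0:
        label = "negative"
    else:
        label = "neutral"
    return {"sentiment": label}


def run_custom_stage(item, stage):
    """Apply a stage and merge its output without modifying the input item."""
    enriched = dict(item)
    enriched.update(stage(item))
    return enriched


item = {"title": "Search review", "body": "Setup was easy and the tools are powerful"}
processed = run_custom_stage(item, sentiment_stage)
print(processed["sentiment"])
```

Because the new value is emitted as a crawled property before the mapping stage, it can then be mapped to a managed property like any other metadata.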
C D H Customize the user experience
Extend the interface by modifying Web Parts or creating new ones
• Create custom interactive Web Parts
– Primary search results Web Parts are extendable
– Change the default query and result rendering behavior
– All Web Parts communicate through the Federation Object Model
• Use the Federation Object Model to:
– Search multiple data sources simultaneously: SharePoint Search, FAST Search for SharePoint, OpenSearch (both synchronous and asynchronous)
– Build customized locations: connect to new data sources (e.g., Exchange) and combine results from multiple locations
Extend Refinement
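The idea of searching several locations at once and combining the results can be sketched with a thread pool. The three location functions below are stand-ins that return canned hits; they are not calls into the SharePoint Federation Object Model, and the scores are invented.

```python
from concurrent.futures import ThreadPoolExecutor


# Stand-in "locations": each takes a query and returns a list of hits.
def sharepoint_search(query):
    return [{"source": "SharePoint Search", "title": f"{query} team site", "score": 0.6}]


def fast_search(query):
    return [{"source": "FAST Search for SharePoint", "title": f"{query} proposal", "score": 0.9}]


def opensearch_location(query):
    return [{"source": "OpenSearch", "title": f"{query} news", "score": 0.4}]


def federated_search(query, locations):
    """Query all locations concurrently and merge hits by descending score."""
    with ThreadPoolExecutor(max_workers=len(locations)) as pool:
        result_lists = list(pool.map(lambda location: location(query), locations))
    merged = [hit for hits in result_lists for hit in hits]
    return sorted(merged, key=lambda hit: hit["score"], reverse=True)


hits = federated_search("ERP", [sharepoint_search, fast_search, opensearch_location])
for hit in hits:
    print(hit["source"], "-", hit["title"])
```

Merging by a single score is a design choice of this sketch; a results page could equally show each location in its own Web Part.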
C D H Robust query language
Use FAST Query Language (FQL) for precise query development
• FQL provides a robust and expressive query language
– Wildcard support: *, ?
– Numeric data types (Integer, Float, Decimal, Datetime)
• Operators
– Direct field access (e.g., title:othello, author:shakespeare)
– Numeric (COUNT, RANGE, <, <=, >, >=)
– Boolean (AND, OR, ANY, NOT)
– Rank (RANK, XRANK)
– Proximity (NEAR, ONEAR)
• String (operator support for strings)
– Boundary (starts-with, ends-with, equals)
– Filter
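FQL queries are text expressions, so they are often assembled programmatically. The helpers below are a hypothetical sketch that builds query strings in the functional style FQL uses for the operators listed above; the helper names are invented, and the exact operator syntax and parameters should be confirmed against the FQL reference before use.

```python
def fql_string(text):
    """Wrap text in an FQL string() expression, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'string("{escaped}")'


def field(name, expression):
    """Scope an expression to a managed property (direct field access)."""
    return f"{name}:{expression}"


def fql_and(*operands):
    return "and(" + ", ".join(operands) + ")"


def fql_or(*operands):
    return "or(" + ", ".join(operands) + ")"


def near(*terms, n=4):
    """Proximity: terms within n tokens of each other."""
    return "near(" + ", ".join(fql_string(t) for t in terms) + f", N={n})"


def xrank(query, boost_expression, boost=100):
    """Boost hits that also match boost_expression without changing the result set."""
    return f"xrank({query}, {boost_expression}, boost={boost})"


base = fql_and(field("title", fql_string("othello")),
               field("author", fql_string("shakespeare")))
boosted = xrank(base, field("body", near("moor", "venice")), boost=500)
print(base)
print(boosted)
```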
C D H
C D H Demo – FAST Search for SharePoint
User Interface and Site Administration
C D H
FAST Search Server Architecture
Common platform scales with your business need
User Experience
SharePoint Search
FAST Search for SharePoint
Information
C D H FAST Search Server 2010
Summary of architectural components
C D H Secure, unified access to information
Index or federate with content, applications, and services
OpenSearch Federation
Indexing Connectors
Enterprise Content
Business Applications
Information Services
User Experience
Search Index
Facebook
Bing
flickr
LinkedIn
Wikipedia
Dow Jones Factiva
The New York Times
C D H
Connect to all of your enterprise content
Extend your reach with Business Connectivity Services
Build custom connectors
Use SharePoint Designer to configure the data model and connect to SharePoint. Connect to WCF services, or create your own .NET assembly connector with Visual Studio.
Quickly connect to content
Use a consistent framework to quickly connect both inside and outside of SharePoint, including content management systems, web services, databases, and line-of-business systems.
C D H Simplified, powerful administration
A high-end enterprise search solution that’s easy to deploy and manage
C D H
FAST Search for SharePoint Scale-out
Back end with extreme and flexible scale-out options
Scale out in multiple "dimensions"
• Query Volume
• Content Volume
• Indexing freshness
Redundancy options
• Search
• Indexing
Performance targets*
• 15M docs/node
• 25 QPS/node
• 50 docs/sec
*Depends on content and hardware specifics
Search and Indexing
Crawling and Content Processing
Query and Result Processing
No theoretical upper bounds!
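The performance targets lend themselves to a back-of-the-envelope sizing calculation. The sketch below assumes content is partitioned across "columns" of 15M documents each and query load is spread across "rows" that each serve 25 QPS; that layout and the example workload are assumptions of this sketch, and as the slide notes, real numbers depend on content and hardware.

```python
import math

DOCS_PER_NODE = 15_000_000  # target from the slide
QPS_PER_NODE = 25           # target from the slide


def estimate_nodes(total_docs, peak_qps):
    """Rough node count: columns scale content volume, rows scale query volume."""
    columns = max(1, math.ceil(total_docs / DOCS_PER_NODE))
    rows = max(1, math.ceil(peak_qps / QPS_PER_NODE))
    return {"columns": columns, "rows": rows, "nodes": columns * rows}


# Hypothetical workload: 40 million documents, 60 queries per second at peak.
estimate = estimate_nodes(40_000_000, 60)
print(estimate)
```

For this workload the arithmetic is ceil(40M / 15M) = 3 columns and ceil(60 / 25) = 3 rows, or 9 search nodes before adding any extra redundancy.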
C D H
Enterprise Search from Microsoft
A new choice for enterprise search that eliminates compromise
Productivity Search Experience
Search-Driven Applications
A Single, Cost-effective Infrastructure
C D H
Thank You
Royal Oak
306 S. Washington Ave., Suite 212
Royal Oak, MI 48067
p: (248) 546-1800
Grand Rapids
15 Ionia SW, Suite 270
Grand Rapids, MI 49503
p: (616) 776-1600
(c) C/D/H 2007. All rights reserved
www.cdh.com