integrating google search appliance with mura cms

Integrating Google Search Appliance with Mura CMS Ajay Sathuluri @sathuluri


Post on 16-Jan-2015


DESCRIPTION

An overview of integrating Google Search Appliance with Mura CMS. Presented at MuraCon 2012 by Ajay Sathuluri.

TRANSCRIPT

Page 1: Integrating Google Search Appliance with Mura CMS

Integrating Google Search Appliance with Mura CMS

Ajay Sathuluri @sathuluri

Page 2: Integrating Google Search Appliance with Mura CMS

Ajay Sathuluri
Sr. Architect at ICF International
Using ColdFusion since ’98
Server Tuning, Administration, Load Testing
I like spending time with my kids and wife.

About Me

Page 3: Integrating Google Search Appliance with Mura CMS

Google Search Appliance: Configuring a Crawl, Control Access to Content, Configuring Database Crawl, Collections / Front Ends, Crawl Diagnostics

Configuring GSA with Mura CMS: Plugin (FW/1), Search, Search Results

What are we covering?

Page 4: Integrating Google Search Appliance with Mura CMS

Google Search Appliance - Home

Page 5: Integrating Google Search Appliance with Mura CMS

Before starting a crawl, you must configure the crawl path so that it includes only the information that you want to make available in search results.

Use the Crawl and Index > Crawl URLs page in the Admin Console to enter URLs.

URLs are case-sensitive.

Configure your network to disallow search appliance connectivity outside of your intranet.

Configuring a Crawl

Page 6: Integrating Google Search Appliance with Mura CMS

Google Search Appliance – Crawl URL

Page 7: Integrating Google Search Appliance with Mura CMS

Demo

Configuring a Crawl

Page 8: Integrating Google Search Appliance with Mura CMS

robots.txt, meta tag, no-crawl directories

Control Access to Content

Page 9: Integrating Google Search Appliance with Mura CMS

robots.txt

The Google Search Appliance always obeys the rules in robots.txt, and it is not possible to override this feature.

A robots.txt file is not mandatory.

It is located in the Web server's root directory.

For the search appliance to be able to access the robots.txt file, the file must be public.

The file includes one or more Disallow: or Allow: rules:

User-agent: gsa-crawler
Disallow: /personal_records/
Disallow: /admin/
Allow: /
Allow: /personal_records/mypersonal.doc

Control Access to Content (2)
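The rules on this slide can be checked offline before a crawl is started. The following is a minimal sketch using Python's standard `urllib.robotparser`, not the appliance's own logic. The host name `intranet.example.com` and the helper names are hypothetical. Python's parser applies the first matching rule, which may differ from how the search appliance resolves overlapping Allow and Disallow lines, so only non-overlapping paths are checked here.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the slide.
ROBOTS_TXT = """\
User-agent: gsa-crawler
Disallow: /personal_records/
Disallow: /admin/
Allow: /
Allow: /personal_records/mypersonal.doc
"""


def build_parser(text):
    """Parse robots.txt content held in a string (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser, path, agent="gsa-crawler",
               host="http://intranet.example.com"):
    """Return True if `agent` may fetch `path` under the parsed rules.

    `host` is a hypothetical intranet host used only to form a full URL.
    """
    return parser.can_fetch(agent, host + path)


parser = build_parser(ROBOTS_TXT)
print(is_allowed(parser, "/index.cfm"))
print(is_allowed(parser, "/admin/settings.cfm"))
```

A check like this is only a sanity test of the file's syntax and intent; the appliance's Crawl Diagnostics page remains the authority on what was actually crawled.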

Page 10: Integrating Google Search Appliance with Mura CMS

meta tag

Prevent the search appliance crawler (as well as other crawlers) from indexing or following links in a specific HTML page.

Embed a robots meta tag in the head of the HTML page.

The search appliance crawler obeys the index, noindex, follow, and nofollow values in meta tags.

<meta name="robots" content="index, nofollow">
<meta name="robots" content="noindex, nofollow">

Control Access to Content (3)
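To see which directives a page actually sends to crawlers, the robots meta tag can be read back out of the rendered HTML. This is a small sketch using Python's standard `html.parser`; the class and function names are illustrative, and it only interprets the four values named on the slide.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        for token in attr.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_directives(html):
    """Return the set of robots directives found in the page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def may_index(html):
    """A page may be indexed unless it declares noindex."""
    return "noindex" not in robots_directives(html)


def may_follow(html):
    """Links may be followed unless the page declares nofollow."""
    return "nofollow" not in robots_directives(html)


page = '<html><head><meta name="robots" content="index, nofollow"></head></html>'
print(may_index(page), may_follow(page))
```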

Page 11: Integrating Google Search Appliance with Mura CMS

no-crawl Directories

The Google Search Appliance does not crawl any directories named "no_crawl."

You can prevent the search appliance from crawling files and directories by:

Creating a directory called "no_crawl."

Putting the files and subdirectories you do not want crawled under the no_crawl directory.

Control Access to Content (4)
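When auditing a site, it can help to list which URLs fall under a no_crawl directory. The sketch below assumes the directory name must match exactly and case-sensitively as a whole path segment; that matching rule is an assumption for illustration, and the function name and example host are hypothetical.

```python
from urllib.parse import urlparse


def in_no_crawl_directory(url):
    """Return True if a directory segment of the URL path is exactly 'no_crawl'.

    Assumption: the match is exact and case-sensitive. A path with no
    trailing slash treats its last segment as a file name, not a directory.
    """
    segments = urlparse(url).path.split("/")
    # Drop the final segment (file name or empty string after a trailing slash).
    return "no_crawl" in segments[:-1]


print(in_no_crawl_directory("http://intranet.example.com/docs/no_crawl/draft.pdf"))
print(in_no_crawl_directory("http://intranet.example.com/docs/no_crawl_notes.pdf"))
```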

Page 12: Integrating Google Search Appliance with Mura CMS

Database data source information enables the search appliance to access content stored in a database.

To configure a database crawl, provide database data source information on the Crawl and Index > Databases page in the Admin Console.

After you create a new database data source, click the Sync link to start a database crawl.

Configuring Database Crawl

Page 13: Integrating Google Search Appliance with Mura CMS

Google Search Appliance – Databases

Page 14: Integrating Google Search Appliance with Mura CMS

A collection lets you search over a specific part of the index.

For example, you may want to create a products collection or a faq collection that supports searches that are only within the products or faqs part of your index.

The maximum number of collections for a search appliance is 200.

Use the Crawl and Index > Collections page to create one: in the Collection Name text box, type a name for the new collection.

Manage a collection by: Editing a Collection, Exporting and Importing a Collection Configuration, Deleting a Collection.

Collections

Page 15: Integrating Google Search Appliance with Mura CMS

Google Search Appliance – Collections

Page 16: Integrating Google Search Appliance with Mura CMS

A front end enables you to change the look and feel of the search and search result pages your users access.

You can customize these pages to display your organization's colors, fonts, and design. If you have multiple collections, you can make each front end appear in a different format, and have its own configuration options.

Use the Serving > Front Ends page to create one: in the Front End Name field, enter a name for the new front end.

Manage a front end by: Editing a Front End, Deleting a Front End.

Front Ends
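A collection and a front end are normally chosen per search request. In the GSA search protocol the `site` parameter names the collection and `client` names the front end; `output=xml_no_dtd` asks for raw XML results. The sketch below builds such a request URL in Python. The host name is hypothetical, the function name is illustrative, and the default collection and front end names should be replaced with whatever is configured on the appliance.

```python
from urllib.parse import urlencode


def build_search_url(host, query, collection="default_collection",
                     front_end="default_frontend", start=0, num=10):
    """Build a GSA search request URL.

    `host` is the appliance's host name (hypothetical in this example).
    `collection` maps to the `site` parameter, `front_end` to `client`.
    """
    params = {
        "q": query,
        "site": collection,
        "client": front_end,
        "output": "xml_no_dtd",  # raw XML instead of a rendered page
        "start": start,          # zero-based offset of the first result
        "num": num,              # number of results per page
    }
    return f"http://{host}/search?{urlencode(params)}"


url = build_search_url("gsa.example.com", "mura cms", collection="products")
print(url)
```

Keeping the collection and front end as arguments makes it straightforward to offer, for example, a products-only search and a site-wide search from the same code.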

Page 17: Integrating Google Search Appliance with Mura CMS

Google Search Appliance – Front Ends

Page 18: Integrating Google Search Appliance with Mura CMS

Crawl diagnostics provide detailed information about appliance crawl status for a domain, host, directory, or URL.

Crawl Diagnostics

Page 19: Integrating Google Search Appliance with Mura CMS

Google Search Appliance - Crawl Diagnostics

Page 20: Integrating Google Search Appliance with Mura CMS

Google Search Appliance – Secret Recipe

"The appliance uses a sophisticated algorithm to generate the results

bla… bla ..."

Page 21: Integrating Google Search Appliance with Mura CMS

Deploy Mura Plugin

Mura – Plugin

Page 22: Integrating Google Search Appliance with Mura CMS

Search Code

GSA Plugin - Search

Page 23: Integrating Google Search Appliance with Mura CMS

Search results code

GSA Plugin - Results
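The code shown on this slide is not part of the transcript. As a rough illustration of what handling results involves, and not the plugin's actual code, the Python sketch below reads a GSA XML response: each `R` element under `RES` is one result, with `U` for the URL, `T` for the title, and `S` for the snippet, and `M` holds the estimated total. The sample XML is made up for the example, and treating a missing `RES` element as "no results" is an assumption.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the shape of a GSA XML response.
SAMPLE = """<GSP VER="3.2">
  <Q>mura</Q>
  <RES SN="1" EN="2">
    <M>2</M>
    <R N="1">
      <U>http://intranet.example.com/a</U>
      <T>Page A</T>
      <S>First snippet</S>
    </R>
    <R N="2">
      <U>http://intranet.example.com/b</U>
      <T>Page B</T>
      <S>Second snippet</S>
    </R>
  </RES>
</GSP>"""


def parse_results(xml_text):
    """Extract the estimated total and the result list from GSA XML."""
    root = ET.fromstring(xml_text)
    res = root.find("RES")
    if res is None:
        # Assumption: no RES element means the query returned nothing.
        return {"total": 0, "results": []}
    results = [
        {
            "url": r.findtext("U", default=""),
            "title": r.findtext("T", default=""),
            "snippet": r.findtext("S", default=""),
        }
        for r in res.findall("R")
    ]
    return {"total": int(res.findtext("M", default="0")), "results": results}


parsed = parse_results(SAMPLE)
for item in parsed["results"]:
    print(item["title"], item["url"])
```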

Page 24: Integrating Google Search Appliance with Mura CMS

DEMO

GSA Plugin – DEMO

Page 25: Integrating Google Search Appliance with Mura CMS

Google Search Appliance – Secret Recipe

Page 26: Integrating Google Search Appliance with Mura CMS

http://docs.getmura.com/
http://www.getmura.com/marketplace/apps/fw1-plugin-template/
https://developers.google.com/search-appliance/documentation/614/
https://developers.google.com/search-appliance/documentation/614/xml_reference
http://www.robotstxt.org/meta.html
http://muracms.com/forum/

Resources

Page 27: Integrating Google Search Appliance with Mura CMS

Thanks to Oğuz Demirkapi for helping to prepare the presentation.

Acknowledgements

Page 28: Integrating Google Search Appliance with Mura CMS

Q & A

?