opensearch installation guide

Upload: arif-nusabdi-p-m-74

Post on 05-Apr-2018




  • 7/31/2019 OpenSearch Installation Guide


    Scbl OpenSearch build 27012012

    OpenSearch Installation Guide

    For Developers

    1. Cygwin Linux Shell Emulator
    2. Installing Solr server in Eclipse
    3. Running Nutch

    Cygwin Linux Shell Emulator

    To run the Nutch crawler in our local environment, we need Cygwin, which provides a Linux-like shell on Windows.

    The following steps install Cygwin in our local environment.

    Cygwin Installation

    Step-1: Download the Cygwin installer from the Cygwin online repository or from the local repository

    Step-2: Run the installer and select Install from Internet, or Install from Local Directory if you already have the setup files or have downloaded them from the local repository

    Step-3: Select the Root Directory for Cygwin and finish the installation

    Nutch & Solr Preparation

    Extract the nutch-solr-latest-cimsa-build archive to any folder below the Cygwin Root Directory, e.g.:

    C:\cygwin\home\{your_user_name} or C:\cygwin\home\{your_user_name}\nutch-solr
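
Inside the Cygwin shell, the Cygwin Root Directory appears as /, which is why a folder such as C:\cygwin\home\{your_user_name} is reached as /home/{your_user_name}. The sketch below illustrates that mapping in plain shell. The function name to_cygwin_path, the CYGWIN_ROOT value and the user name alice are assumptions chosen for illustration; on a real installation Cygwin's own cygpath utility is the more robust choice.

```shell
# Assumed Cygwin Root Directory (the one used in the examples of this guide).
CYGWIN_ROOT='C:\cygwin'

# Convert a Windows path below CYGWIN_ROOT into the path seen inside Cygwin.
to_cygwin_path() {
  win_path=$1
  case $win_path in
    "$CYGWIN_ROOT"\\*)
      # Drop the root prefix, keeping the leading backslash of the remainder.
      rel=${win_path#"$CYGWIN_ROOT"}
      ;;
    *)
      printf 'not under %s: %s\n' "$CYGWIN_ROOT" "$win_path" >&2
      return 1
      ;;
  esac
  # Turn every backslash into a forward slash.
  printf '%s\n' "$rel" | tr '\\' '/'
}
```

For example, to_cygwin_path 'C:\cygwin\home\alice\nutch-solr' prints /home/alice/nutch-solr.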

    Installing Solr server in Eclipse

    Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project.

    Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering,

    database integration, rich document (e.g., Word, PDF) handling, and geospatial search. Solr is highly

    scalable, providing distributed search and index replication, and it powers the search and navigation

    features of many of the world's largest internet sites.

    We can easily access Solr from our development environment by adding the Solr Tomcat server to the Eclipse Servers view.


    Step-1: Extract apache-ant into C:\Development\tools\

    Step-2: Add C:\Development\tools\apache-ant-1.8.2\bin to the PATH environment variable

    See: more information
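
After changing PATH, it can help to confirm from a freshly opened shell that the ant command is actually found. This is a minimal sketch, assuming a POSIX shell such as the one Cygwin provides; the helper name on_path is hypothetical.

```shell
# Report whether a command can be found on the current PATH.
on_path() {
  if command -v "$1" >/dev/null 2>&1; then
    printf '%s: found at %s\n' "$1" "$(command -v "$1")"
  else
    printf '%s: not found on PATH\n' "$1" >&2
    return 1
  fi
}
```

A typical use would be: on_path ant && ant -version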

    Importing Solr source

    Step-1: Extract the Solr source into any location on your hard drive

    Step-2: Open a command-line tool and go to the Solr source home directory

    Step-3: Run ant eclipse in the root of the Solr source

    Step-4: Import the source folder into the Eclipse IDE

    Step-5: Test-build Solr using the build.xml inside the solr subfolder (not the build.xml in the root project folder) by running its default target, usage, in the Ant window

    Step-6: Run the run-example target to check that Solr runs correctly
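
Steps 3, 5 and 6 can also be driven from a command line instead of the Eclipse Ant window. The sketch below only prints the Ant invocations by default (DRY_RUN=1). The helper names, the DRY_RUN switch and the source path passed in are assumptions; the eclipse, usage and run-example targets are the ones named in the steps.

```shell
# Print commands instead of running them unless DRY_RUN is set to 0.
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = 1 ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# $1 = folder where the Solr source was extracted (hypothetical path)
build_solr_from_source() {
  src_root=$1
  run ant -f "$src_root/build.xml" eclipse            # Step-3: root build file
  run ant -f "$src_root/solr/build.xml" usage         # Step-5: build file in the solr subfolder
  run ant -f "$src_root/solr/build.xml" run-example   # Step-6: start the example server
}
```

Calling build_solr_from_source with the extraction folder lists the three invocations; setting DRY_RUN=0 beforehand would execute them, assuming ant is on the PATH.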

    Preparing Tomcat

    Step-1: Download apache-tomcat and extract the content into C:\Development\servers; {your_tomcat_home} will be C:\Development\servers\apache-tomcat-6.xx.xx

    Step-2: Change the port numbers in {your_tomcat_home}/conf/server.xml by adding 100 to every default port number (for example, 8080 becomes 8180) to avoid conflicts with other servers. Or download server.xml and put it into
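
The port change in Step-2 can be done by hand, but a small filter makes it repeatable. This is a sketch only: it assumes ports appear as port="NNNN" or redirectPort="NNNN" attributes, as in a stock Tomcat server.xml, and it also touches any other attribute whose name ends in port or Port. The function name shift_ports is hypothetical.

```shell
# Add an offset (default 100) to every port="..." style attribute read from
# standard input and write the result to standard output.
shift_ports() {
  awk -v offset="${1:-100}" '
  {
    out = ""
    rest = $0
    # Repeatedly find the next port attribute on the line.
    while (match(rest, /[Pp]ort="[0-9]+"/)) {
      attr = substr(rest, RSTART, RLENGTH)              # e.g. port="8080"
      q = index(attr, "\"")                             # position of the opening quote
      num = substr(attr, q + 1, length(attr) - q - 1)   # digits between the quotes
      out = out substr(rest, 1, RSTART - 1) substr(attr, 1, q) (num + offset) "\""
      rest = substr(rest, RSTART + RLENGTH)
    }
    print out rest
  }'
}
```

One way to use it is to run shift_ports < server.xml > server.xml.new from {your_tomcat_home}/conf, compare the two files, and replace the original only once the result looks right.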


    Configure Tomcat in Eclipse

    Step-1: Go to Window > Preferences and then go to Server > Runtime Environments

    Step-2: Click Add, select Apache > Apache Tomcat v6.0 and click Next

    Step-3: Click Browse, select the {your_tomcat_home} folder and click Finish

    Step-4: Go to Window > Show View > Servers

    Step-5: Right-click in the empty space and select New > Server

    Step-6: Select Apache > Tomcat v6.0 Server and click Finish

    Step-7: Double-click the server, set Server Locations to Use Tomcat Installation, and save

    Step-8: Test-run the server

    Build and Deploy in Tomcat

    Step-0: Make sure the server is not running (stop the Apache Tomcat v6.0 server within Eclipse)

    Step-1: Download solr-tomcat-config and put it inside {your_tomcat_home}/ so you will have


    Step-2: Download the solr.xml Tomcat context and put it in {your_tomcat_home}/conf/Catalina/localhost/


    Step-3a (optional): Run ant dist from solr/build.xml within Eclipse to build a new Solr from source

    Step-3b (optional): Copy {your_solr_project_location}/solr/dist/apache-solr-*.war and rename it to


    Step-4: Test http://localhost:8180/solr/pengkajian/admin/ and http://localhost:8180/solr/taplai/admin/
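
The two admin pages in Step-4 can also be checked from a shell rather than a browser. A minimal sketch, assuming curl is installed in the Cygwin environment; the helper names admin_url and check_core are hypothetical, and the base URL is the one given in Step-4.

```shell
# Base URL taken from Step-4.
SOLR_BASE="http://localhost:8180/solr"

# Print the admin URL of one Solr core.
admin_url() {
  printf '%s/%s/admin/\n' "$SOLR_BASE" "$1"
}

# Fetch the admin page and report whether it answered with HTTP 200.
check_core() {
  url=$(admin_url "$1")
  # -s: quiet, -o /dev/null: discard the body, -w: print only the status code
  status=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url")
  if [ "$status" = 200 ]; then
    printf 'OK   %s\n' "$url"
  else
    printf 'FAIL %s (HTTP status %s)\n' "$url" "$status" >&2
    return 1
  fi
}
```

With Tomcat running, check_core pengkajian and check_core taplai would each report OK or FAIL for the corresponding admin page.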

    Running Nutch

    Apache Nutch is an open source web-search software project. Stemming from Apache Lucene, it now

    builds on Apache Solr adding web-specifics, such as a crawler, a link-graph database and parsing support

    handled by Apache Tika for HTML and an array of other document formats.

    The following steps run Nutch using Cygwin.

    Before running Nutch, please make sure that Solr has been started correctly.

    Step-1: Open another instance of Cygwin

    Step-2: Go to the Nutch home directory (the location of Nutch comes from the previously extracted nutch-solr-latest-cimsa-build). For example, if you extracted the file to C:\cygwin\home\{your-user-name}\, then you should type: cd /home/{your-user-name}/nutch-1.3

    Step-3: Go to the runtime/local directory of Nutch: cd runtime/local

    Step-4: Set the Java home environment variable for Nutch: export


    Step-5a: Run Nutch to fill the pengkajian Solr database: bin/nutch crawl urlspengkajian -solr http://localhost:8180/solr/pengkajian/ -depth 3 -topN 5

    The next screen will tell you that it has updated the pengkajian database correctly:

    Step-5b: Run Nutch to fill the taplai Solr database: bin/nutch crawl urlstaplai -solr http://localhost:8180/solr/taplai/ -depth 3 -topN 5

    The next screen will tell you that it has updated the taplai database correctly:
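
The crawl commands in Step-5a and Step-5b differ only in the seed directory and the Solr core name, and both depend on the Java home variable from Step-4. The sketch below captures that pattern. It assumes the variable is JAVA_HOME, which is the one the bin/nutch script reads; the exact value used in Step-4 is not shown in this guide, so any path you export is your own. The helper names require_java_home and crawl_cmd are hypothetical.

```shell
# Fail early when JAVA_HOME is unset or does not point at a directory.
require_java_home() {
  if [ -z "${JAVA_HOME:-}" ] || [ ! -d "$JAVA_HOME" ]; then
    echo 'JAVA_HOME is not set or is not a directory' >&2
    return 1
  fi
}

# Print the crawl command for one Solr core.
# $1 = seed URL directory, $2 = Solr core name
crawl_cmd() {
  printf 'bin/nutch crawl %s -solr http://localhost:8180/solr/%s/ -depth 3 -topN 5\n' "$1" "$2"
}
```

For example, crawl_cmd urlspengkajian pengkajian prints the command from Step-5a, and crawl_cmd urlstaplai taplai prints the one from Step-5b. From runtime/local, require_java_home && eval "$(crawl_cmd urlstaplai taplai)" would run the second crawl only when the Java home check passes.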

    Appendix A: Test Running Solr in Cygwin

    The following steps run Solr using Cygwin.

    Step-1: Open the Cygwin Terminal

    Step-2: Go to the Solr home directory (the location of Solr comes from the previously extracted nutch-solr-latest-cimsa-build). For example, if you extracted the file to C:\cygwin\home\{your-user-name}\, then you should type: cd /home/{your-user-name}/solr-3.4.0

    Tip: You can list the directory and confirm that you are in the right folder by typing ls

    Step-3: Go to the example directory: cd example

    Step-4: Start the Solr Jetty-based server to test that it runs correctly: java -jar start.jar

    The next screenshot shows that Solr has been started correctly (note that this server listens on port 8983):
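
Because java -jar start.jar keeps its terminal busy, a second Cygwin window can be used to confirm that the server answers on port 8983. A minimal sketch, assuming curl is installed; the helper name wait_for_url is hypothetical, and the admin URL in the usage note is an assumption based on the default Solr example layout.

```shell
# Poll a URL until it answers or the attempts run out.
# $1 = URL, $2 = number of attempts (default 30), $3 = seconds between attempts (default 1)
wait_for_url() {
  url=$1
  attempts=${2:-30}
  delay=${3:-1}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl fail on HTTP errors; -s keeps it quiet
    if curl -sf -o /dev/null --max-time 5 "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

A possible use is: wait_for_url http://localhost:8983/solr/admin/ && echo "Solr is up"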