full text search poc

Upload: disha-virk

Post on 08-Jul-2018


TRANSCRIPT

  • 8/19/2019 Full Text Search POC

    1/19

     

    Salon 3.0 System

    FULL TEXT SEARCH - SOLUTIONS EVALUATION

    Submitted by

     

    www.Patni.com

    Version: 1.0

    Date: 07 November 2008


    TABLE OF CONTENTS

    1 INTRODUCTION

    1.1 Salon 3.0 Requirements

    2 POSSIBLE SOLUTIONS

    2.1 File System Based Search - Using Windows Indexing Service

    2.1.1 Steps for configuring this service

    2.1.2 Security Points for Windows Indexing Service

    2.1.3 Pros and Cons

    2.2 Database Based Search - Using SQL Server 2005 Full Text Search

    2.2.1 Pros and Cons

    3 POC DETAILS

    3.1 POC Implementation Details

    3.1.1 Using Windows Indexing Service

    3.1.2 Using SQL Server 2005 - Full-Text Search

    4 PATNI RECOMMENDATION

    5 APPENDIX

    5.1 Reference


    DOCUMENT CONTROL:

    Security Classification: Patni Confidential

    Issue Date: 07 November 2008

    Author(s):
    Name             Title
    Sameer M         Technical Designer

    Reviewer(s):
    Archana Kamat    Technical Architect

    Document History:
    Date           Revision   Change
    17 Sep 2008    0.01D      Initial Draft.
    07 Nov 2008    1.0        Patni recommended solution, "File System based storage and search", has been finalized for Salon 3.0 system.


    1 INTRODUCTION

    This document provides details on the Full Text Search requirement of the Salon 3.0 system. It also explores possible solutions for implementing it.

    1.1 Salon 3.0 Requirements

    In general, full text search means searching data in the database and files on the file system.

    Stated below is the Full Text Search requirement from "The Future State" document for the Salon 3.0 system.

    1. Full text search "Search For" - with the entry of a search key, a full text search will be executed relying on one or more attributes.

    2 POSSIBLE SOLUTIONS

    These requirements for Full Text Search can be achieved in the following ways:

    2.1 File System Based Search - Using Windows Indexing Service

    Windows Indexing Service is a base service for Microsoft® Windows® 2000 and later. It supports creation of an indexed catalog by extracting content from one or more selected files. This indexed catalog enables efficient and rapid searching on the file system.

    2.1.1 Steps for configuring this service:

    1) Open the Computer Management tool available in Administrative Tools.

    2) In the tree view, under the "Services and Applications" node, click on Indexing Service.

    3) A list of existing catalogs is displayed in the right panel.


    4) Right-click on "Indexing Service" and select "New" > "Catalog" from the list that appears. This will present the New Catalog dialogue box.


    5) Enter a name for the catalog, such as "Search", and specify the location where the catalog will be stored.

    6) Press "OK" to continue.

    7) On the catalog created, select the directory folder. Right-click and select the new directory menu option. In the displayed window, give the path of the directory that needs to be included in the search operation.


    8) Repeat step 7 to include more directories.

    9) Stop the indexing service and then restart it. The service will start scanning and indexing the directories defined for the catalog.


    2.1.2 Security Points for Windows Indexing Service:

    The Indexing Service runs under the local System account. It cannot be configured to run in any other context.

    On a local computer, the Indexing Service uses the System account to operate. If the System account does not have access to documents or directories, the Indexing Service will not be able to index those documents.

    An authenticated local or remote user can issue Indexing Service queries.

    2.1.3 Pros and Cons

    Pros

    Supports querying on the contents of a file as well as its properties.

    Supports searching within office documents, HTML files, plain text files and Multipurpose Internet Mail Extension (MIME) messages.

    Searching within PDF files is also supported by installing the Adobe IFilter.

    The full-text search service is part of the operating system, hence no additional software installation is needed.

    Cons

    No search support for XML documents.

    Utilizes disk space and resources on the web server for file storage and catalogs.

    In a clustered environment, storage of data (files) on the file system is not recommended, as it causes server affinity.

    In case of a disk crash, the service configuration and catalog are lost.

    If not secured properly, files saved on the file system can be tampered with.

    2.2 Database Based Search - Using SQL Server 2005 Full Text Search

    Steps for using this service:

    1) Open Microsoft SQL Server Management Studio and connect to the SQL Server 2005 database instance where the full-text catalog needs to be created.

    2) Create a table for storing files. For example:

    CREATE TABLE [dbo].[Documents](
        [documentid] [int] IDENTITY(1,1) NOT NULL,
        [FileName] [nvarchar](50) NULL,
        [FileSize] [int] NULL,
        [ContentType] [nvarchar](50) NULL,
        [full_Text_bin] [varbinary](max) NULL,
        [Extention] [nchar](10) NULL,
        -- The constraint name is partly illegible in the source; the PK prefix is assumed.
        CONSTRAINT [PK_documents] PRIMARY KEY CLUSTERED
        (
            [documentid] ASC
        )
    )


    3) Ensure that full-text search is enabled on the selected database. Open the database properties screen, then select the Files page. This window has a "Use full-text indexing" checkbox for enabling or disabling full-text search on this database. If the option is disabled, enable it by checking the checkbox.

    4) On the selected database, go to the Storage -> Full Text Catalogs folder. Right-click and select the option "New Full Text Catalog". This will bring up a new window.


    5) In the window, enter the catalog name, such as "Search", its location and other details, and click the "OK" button. This creates the catalog for the database.

    6) Select the catalog, right-click, and on the displayed menu select the "Properties" option. A new window will be displayed.


    7) In the displayed window, select the "Tables/Views" option.


    8) Assign the "Documents" table from the displayed list to the catalog. In the Selected object properties section, under "Available Columns", tick the check box for the "full_text_bin" field. Under the "Data Type Column", for the full_text_bin field, select the "Extension" field from the dropdown. Click the "OK" button to save the changes.

    9) This completes the setup for creating a catalog on SQL Server 2005.
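
    Once the catalog is in place, documents have to be loaded into the Documents table before they can be searched. The following is a minimal C# sketch of one way to do that with the ADO.NET SqlClient provider; it is not taken from the POC source. The names DocumentStore, NormalizeExtension and Save are hypothetical, the column names follow the CREATE TABLE statement in step 2 (including its Extention spelling), and it assumes the type column should hold the file extension with its leading dot.

```csharp
using System.Data;
using System.Data.SqlClient;
using System.IO;

public static class DocumentStore
{
    // Assumption: the full-text type column stores the extension with its
    // leading dot, in lower case (for example ".doc").
    public static string NormalizeExtension(string path)
    {
        return Path.GetExtension(path).ToLowerInvariant();
    }

    // Reads the file from disk and inserts it as one row of dbo.Documents.
    public static void Save(string connectionString, string path, string contentType)
    {
        byte[] content = File.ReadAllBytes(path);
        const string sql =
            "INSERT INTO dbo.Documents (FileName, FileSize, ContentType, full_Text_bin, Extention) " +
            "VALUES (@name, @size, @type, @content, @ext)";

        using (SqlConnection connection = new SqlConnection(connectionString))
        using (SqlCommand command = new SqlCommand(sql, connection))
        {
            command.Parameters.Add("@name", SqlDbType.NVarChar, 50).Value = Path.GetFileName(path);
            command.Parameters.Add("@size", SqlDbType.Int).Value = content.Length;
            command.Parameters.Add("@type", SqlDbType.NVarChar, 50).Value = contentType;
            // A size of -1 maps the parameter to varbinary(max).
            command.Parameters.Add("@content", SqlDbType.VarBinary, -1).Value = content;
            command.Parameters.Add("@ext", SqlDbType.NChar, 10).Value = NormalizeExtension(path);

            connection.Open();
            command.ExecuteNonQuery();
        }
    }
}
```

    Passing the file bytes as a parameter avoids building the INSERT statement by string concatenation, which would not work for binary content in any case.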

    2.2.1 Pros and Cons

    Pros

    The full-text search service is part of SQL Server 2005. No additional service needs to be created.

    Supports search for different file types such as office documents, HTML files, plain text files, Multipurpose Internet Mail Extension (MIME) messages, PDF files and XML files too.

    A backup of the catalog can be taken along with the database. In case data is lost, it can be recovered from the backup.

    Data is more secure than when stored on the file system.

    Cons

    From a programming point of view, storing files in and retrieving them from the database is more tedious than doing so on the file system.

    SQL Server 2005 imposes a restriction on the size of the file to be stored in the DB. Max file size allowed is 2 MB.

    3 POC DETAILS

    A POC has been done to test both solutions for Full Text Search. The details of the POC are as follows.

    The files below were considered for the full text search:

    POC for Salon.xls (20 KB)

    Analysis Report Screen.htm (3 KB)

    License.txt (3 KB)

    proposal-expectations.doc (8 KB)

    SalonSystem3.0_Request_Module_UCS.doc (170 KB)

    abcdf.pdf (33 KB)

    NET Memory Profiler.pdf (1.23 MB)

    A web application was created with a text box for entering search criteria and two submit buttons, one for File System Full Text Search and the other for Database Full Text Search.

    The user can enter the search string in the text box and click one of the submit buttons.

    On clicking the "Search In File System" button, a search is performed on the catalog created on the IIS machine. The search results and the turn-around time taken for the search operation are recorded and displayed on the screen.


    On clicking the "Search In Database" button, a search for the entered string is done on the catalog created on the database. The search results and the turn-around time taken for the search operation are recorded and displayed on the screen.
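
    How the turn-around time was measured is not shown in this document. One possible approach, sketched in C# under the assumption that a System.Diagnostics.Stopwatch wraps each search call, is given below. SearchTimer and Measure are hypothetical names, not part of the POC source.

```csharp
using System;
using System.Diagnostics;

public static class SearchTimer
{
    // Runs the supplied search delegate, returns its result, and reports
    // the elapsed wall-clock time through the out parameter.
    public static T Measure<T>(Func<T> search, out TimeSpan elapsed)
    {
        Stopwatch watch = Stopwatch.StartNew();
        T result = search();
        watch.Stop();
        elapsed = watch.Elapsed;
        return result;
    }
}
```

    The same helper could wrap both the file system search and the database search, so that the two timings are taken in the same way and are comparable.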


    3.1 POC Implementation Details

    3.1.1 Using Windows Indexing Service

    A new catalog named "Salon" is created on the application server machine.

    The ADO .NET OleDB provider (System.Data.OleDb) is used for the full text search on the file system. A Source Provider tag is added to Web.Config, and a select query is used for the search.

    Web.config setting


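    if (/* condition missing from the source */)
        query = @"SELECT FileName FROM scope() WHERE CONTAINS (Contents, '" + '"' + strsearchstring + '"' + "')";
    else
        query = @"SELECT FileName FROM scope() WHERE FREETEXT(Contents, '" + '"' + strsearchstring + '"' + "')";

    // Run the query against the Indexing Service catalog and load the matches.
    OleDbDataAdapter obAdp = new OleDbDataAdapter(query, connection);
    obAdp.Fill(ds, "FileName");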
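
    The snippet above concatenates the search string straight into the query text, so a quote typed by the user can break the query. The following C# sketch shows one way the same search could be written with the quotes handled first. It is an illustration, not the POC code: IndexingServiceSearch, BuildQuery, Search and the exactPhrase flag are hypothetical (the flag stands in for the branch condition that cannot be read in the source), the connection string assumes the MSIDXS OLE DB provider for Indexing Service with the catalog name as data source, and doubling single quotes is assumed to be the escaping rule for string literals.

```csharp
using System.Data;
using System.Data.OleDb;

public static class IndexingServiceSearch
{
    // Builds the query text. Double quotes are dropped because they delimit
    // the phrase; single quotes are doubled so they cannot end the literal.
    public static string BuildQuery(string searchText, bool exactPhrase)
    {
        string cleaned = searchText.Replace("\"", "").Replace("'", "''");
        string predicate = exactPhrase ? "CONTAINS" : "FREETEXT";
        string term = exactPhrase ? "\"" + cleaned + "\"" : cleaned;
        return "SELECT FileName FROM SCOPE() WHERE " + predicate + "(Contents, '" + term + "')";
    }

    // Runs the query against the named catalog and returns the matching files.
    public static DataSet Search(string catalogName, string searchText, bool exactPhrase)
    {
        // Assumption: MSIDXS provider, catalog on the local machine.
        string connectionString = "Provider=MSIDXS;Data Source=" + catalogName;
        DataSet results = new DataSet();
        using (OleDbConnection connection = new OleDbConnection(connectionString))
        using (OleDbDataAdapter adapter =
                   new OleDbDataAdapter(BuildQuery(searchText, exactPhrase), connection))
        {
            // Fill opens and closes the connection itself when it is not already open.
            adapter.Fill(results, "FileName");
        }
        return results;
    }
}
```

    For the catalog created in this POC, a call might look like IndexingServiceSearch.Search("Salon", "request module", true).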

    Security settings for POC

    The application is hosted on IIS with the security setting set to "Basic Authentication" and no anonymous access.

    Corresponding settings are made in Web.config.

    Full-text search on the file system was tested for local and remote users who are successfully authenticated and do not have admin access on the file system of the application server.

    3.1.2 Using SQL Server 2005 - Full-Text Search

    Implementation of the search is similar to any record search in the database tables. The ADO .NET SqlClient provider is used for database access.

    Code snippet for data search on the database using the SqlClient provider of ADO .NET:

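    DataSet ds = new DataSet();
    string query = "";
    try
    {
        string strconnection = ConfigurationManager.ConnectionStrings["Databaseconnection"].ToString();

        // CONTAINS matches the quoted phrase; FREETEXT matches on meaning.
        if (strsearchstring.IndexOf(/* character illegible in the source */) /* comparison illegible in the source */)
            query = "SELECT FileName FROM Documents WHERE CONTAINS(full_Text_bin,N'" + '"' + strsearchstring + '"' + "')";
        else
            query = "SELECT FileName FROM Documents WHERE FREETEXT(full_Text_bin,N'" + '"' + strsearchstring + '"' + "')";

        SqlConnection dbConn = new SqlConnection(strconnection);
        dbConn.Open();

        // Run the query and load the matching file names into the DataSet.
        SqlDataAdapter obAdp = new SqlDataAdapter(query, dbConn);
        obAdp.Fill(ds, "Documents");
    }
    // The catch/finally part of the snippet is not shown in the source.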
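
    Because the snippet above builds the SQL text from user input, a parameterized form is worth considering. The sketch below is one possible C# variant, assuming that CONTAINS and FREETEXT accept the search condition as an nvarchar parameter. DatabaseSearch, ToPhrase, Search and the exactPhrase flag are hypothetical names introduced for illustration; the table and column names are those used in this document.

```csharp
using System.Data;
using System.Data.SqlClient;

public static class DatabaseSearch
{
    // Wraps the text in double quotes so CONTAINS treats it as one phrase;
    // embedded double quotes are removed because they delimit the phrase.
    public static string ToPhrase(string searchText)
    {
        return "\"" + searchText.Replace("\"", "") + "\"";
    }

    // Searches the full-text indexed column and returns the matching file names.
    public static DataSet Search(string connectionString, string searchText, bool exactPhrase)
    {
        string sql = exactPhrase
            ? "SELECT FileName FROM Documents WHERE CONTAINS(full_Text_bin, @search)"
            : "SELECT FileName FROM Documents WHERE FREETEXT(full_Text_bin, @search)";

        DataSet results = new DataSet();
        using (SqlConnection connection = new SqlConnection(connectionString))
        using (SqlCommand command = new SqlCommand(sql, connection))
        using (SqlDataAdapter adapter = new SqlDataAdapter(command))
        {
            // The search text travels as a parameter, never as SQL text.
            command.Parameters.Add("@search", SqlDbType.NVarChar, 4000).Value =
                exactPhrase ? ToPhrase(searchText) : searchText;
            // Fill opens and closes the connection itself when it is not already open.
            adapter.Fill(results, "Documents");
        }
        return results;
    }
}
```

    The using blocks also dispose of the connection when the search finishes, which the original snippet leaves to code that is not shown.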


    4 PATNI RECOMMENDATION

    On analyzing the possible-solutions details and the POC results, we feel both solutions are suitable for the Salon 3.0 system requirements. That is, either

    File-system based document storage and full-text search can be used, OR

    Database based document storage and full-text search can be used.

    Patni recommends file system based document storage and full text search. The reasons are as below:

    1. All pros of "File System based search", as detailed in the prior section, are in favor of the Salon 3.0 system requirements.

    2. The cons of "File System based search", as detailed in the prior section, do not have a major impact on the Salon 3.0 system requirements.

    For example: cons related to security and data backup can be taken care of with appropriate operational and administrative solutions.

    Regarding no support for XML files, we assume XML is not a required file type to be supported for the Salon 3.0 system. File types supported by file system based full text search are MS Office files, MIME messages, HTML, text files and PDFs.

    3. As per the proposed infrastructure requirements for the Salon 3.0 system, the application server will be a dedicated one. And as per the %P infrastructure standards, no clustered environment is available for a dedicated application server. Hence the server affinity issue is not anticipated in case of file-system based storage.

    Even if a clustered environment is applied in future, a solution can be devised by having a central server for file storage.

    4. For file system based search to support PDF files, the Adobe IFilter (free, downloadable) has to be installed on the application server. Since the Salon 3.0 application server is a dedicated one, this installation should not be a problem.

    5. Having DB based file storage and search may call for more tables and stored procedures, and also some special settings on the database. As per the proposed infrastructure requirements for the Salon 3.0 system, the database server will be in a shared environment. As per the %P infrastructure standards, there are restrictions on the DB size, number of tables, stored procedures etc. for a DB hosted on a shared server.

    6. The SQL Server 2005 database has a 2 MB file size restriction. But there is no pre-defined min-size or max-size for the files/documents to be stored in the Salon 3.0 system.

    5 APPENDIX

    5.1 Reference

    http://msdn.microsoft.com/en-us/library/ms142571(SQL.90).aspx

    http://msdn.microsoft.com/en-us/library/aa163263.aspx

    POC Source Code - In Salon 3.0 VSS