SmartDiscovery and VizServer from Inxight

Bloor Research


Fast facts

Inxight SmartDiscovery is a tool designed for the retrieval and management of unstructured data and, specifically, for the analysis, summarisation, categorisation, taxonomy management, information extraction and visualisation of text files and documents of every description. It is Inxight VizServer that provides this visualisation and this is available as an independent product that will run with third-party repositories.

Key findings

In the opinion of Bloor Research the following represent the key facts of which prospective users should be aware:

• Inxight SmartDiscovery is based upon the use of natural language processing. Compared to language independent processing (the other commonly used approach) natural language processing has a number of advantages. In particular, it can derive meaning (context) from a document, which language independent methods cannot. This means that when documents are retrieved as the result of a search, you can see a précis or summary of the document, tailored specifically to the user’s query, rather than just the first few lines of the document. The downside of this approach is that you have to use language-specific versions of the product. Currently there are 27 languages supported by Inxight.

• On top of this natural language processing, Inxight provides the ability to automatically recognise and extract specified entities. A large number of these entity types (which include people, places, dates and so on) have been pre-defined and you can customise these and define your own if necessary. It should be noted that this represents a significant advantage for Inxight. Very few other companies have this capability and, of those that do, none have such advanced facilities as Inxight.

• A major feature of the latest release is a fact extraction capability that extends the entity extraction described in the previous paragraph by automatically recognising any associations and relationships that may exist between entities. An authoring workbench is provided to support the definition of appropriate rules and templates that describe these relationships.

• In addition to summarisation, Inxight also provides taxonomy creation and management, categorisation, and search indexing by full-text, concept and similarity. Further, entity extraction capabilities are provided that have the ability to recognise, extract and present information about entities (such as people, places, organisations and currencies) that are referenced within the documents under consideration.

• Categorisation is achieved by example, a method that combines the use of associations, rules and example documents. Rules (both in this case and for fact extraction) are optional and therefore there is no mandatory rules maintenance requirement.

© Bloor Research 2004

• Inxight’s VizServer product offers particularly powerful and innovative visualisation capabilities and, in our opinion, the company is clearly the market leader in this area. Most notable is the patented StarTree technology (in all, SmartDiscovery and VizServer are based on more than 70 patents). Its TableLens technology can also be useful under the right circumstances and the company has also recently introduced a new facility called TimeWall, which is intended to aid in the identification of trends that occur over time. As a three-dimensional representation of time-based events this is significantly more intuitive than any other such product that we have seen.

• Nearly all of SmartDiscovery’s capabilities are also available as stand-alone software development kits so that OEMs may embed these facilities within their own products.

The bottom line

It is self-evident that different languages have different structures and grammars, even leaving aside the issue of varying alphabets. As a result, it is not surprising that statistical techniques (as used in language independent approaches) for analysing text are inadequate when it comes to meaning. But, of course, it is meaning that makes all the difference when it comes to summarising and categorising documents, not to mention when the end-user has to select which one to use.

However, it is not just meaning that makes a difference, it is also the ability to easily traverse what may be many thousands of documents, in an organised way, so that you can get to the point of choosing what to reference. As we have already noted, Inxight provides outstanding visualisation capabilities for textual data and, when these facilities are combined with the extraction and classification capabilities of SmartDiscovery, it is not surprising that Inxight is in a market-leading position.


Vendor Information

Background information

Inxight markets a number of products that were originally developed at Xerox PARC, from which Inxight was spun off in 1996. The company’s emphasis, in terms of its products, is on managing and retrieving precise information from unstructured data.

Inxight has two sets of products, one aimed at end-users and the other at OEMs and partners, where the latter are really extrapolated versions of the former for software developers. In fact, when the company was started it was the OEM market on which the company focused and it has some notable customers in this space, including SAS Institute (SAS Text Miner), Verity (its search engine), IBM, Comshare (now part of Geac), Oracle, Microsoft, and others.

More recently (from around the beginning of this century) the company has moved into direct sales, where it targets Fortune 2000 customers, particularly in the pharmaceutical and publishing industries, as well as government organisations. It has also established an indirect sales channel. The combined effect of these initiatives has been significant, with the revenues from these sources now having overtaken OEM revenues.

Inxight web address: www.inxight.com

Product availability

Inxight markets two products to end users: Inxight SmartDiscovery and Inxight VizServer, where SmartDiscovery is about the search, extraction and classification of text data and VizServer is about the visualisation of that data. The former uses much of the technology provided by VizServer (especially the StarTree), though VizServer can also be used with third-party repositories as well as that which is provided within SmartDiscovery.

In addition to these two products, a number of SDKs are available for OEMs, as follows:

• Inxight LinguistX Platform: the natural language processing engine within SmartDiscovery. It currently supports 27 languages (this is changing all the time) and includes different versions of English (US vs British) and Portuguese (European vs Brazilian) and even exotica such as the different dialects of Norwegian (Bokmål and Nynorsk). Other languages supported include Chinese, Danish, Dutch, Farsi, Finnish, French, German, Italian, Japanese, Korean, Spanish, Swedish, Russian, Polish, Hebrew and Arabic. The Platform includes a built-in API.

• Inxight ThingFinder: a text analysis application that automatically identifies, tags and indexes named entities.

• Inxight StarTree: a visualisation and navigation technique sometimes known as fish-eye navigation. An example of this method is illustrated in Figure 4.


• Inxight Summarizer: for generating abstracts of a document.

• Inxight TableLens: another visualisation technique, which is also discussed later in this report.

The company’s most recent release is Inxight TimeWall. This was announced in March 2004, along with the introduction of version 4.0 of Inxight SmartDiscovery, and will be generally available by the time this report is published. Initially, this is only available as a part of Inxight VizServer but it is likely that it will be available as an SDK, and will be deployed within SmartDiscovery, in due course.

Inxight products run on Windows 2000 and 2003, Windows XP and Sun Solaris, and they support the Tomcat and WebSphere application servers. The Inxight repository is delivered with the open source MySQL database to support it, though you can implement this on top of Oracle or SQL Server if that is preferred.

Inxight supports more than 200 different file formats and the company has pre-built adapters at both the back-end, to support document and content management systems for environments such as Documentum, Lotus Notes, SQL Server, Microsoft Exchange, IMAP enabled e-mail systems, and JDBC compliant databases; and at the front-end for third-party enterprise information portals such as those provided by Plumtree, Microsoft (SharePoint Portal Server) and Hummingbird. Java, C++, COM and XML interfaces are all supported as well as a newly introduced SOAP interface to support Web services.

Financial results

Inxight is a privately owned company backed by a number of banks and venture capitalists, as well as Xerox itself. The company has its headquarters in the United States with overseas offices in Germany, Belgium and the UK which, between them, cover sales throughout Europe, Africa, the Middle East, Russia and the former USSR. Inxight has approximately 120 staff.


Product description

Introduction

Inxight uses natural language processing in order to understand the various documents and other details that it parses. However, natural language processing is not the only possible approach to these issues and, before we discuss Inxight’s capabilities in detail, it will be useful to compare natural language processing with language independent processing, which is its principal rival.

Language independence has the obvious advantage that you don’t have to have a different version of your product for every single different language. However, this sort of statistical approach also has disadvantages. For example, take the two statements: “San Francisco Giants take over National League West” and “Giant SUVs take over western San Francisco”.

If you do a statistical analysis on these two sentences then you will get a result as indicated in the box to the left of Figure 1. This isn’t entirely useful. Natural language processing, at least as far as Inxight is concerned, will recognise that San Francisco is an entity in its own right, as are the San Francisco Giants. This is illustrated in the right hand box in Figure 1. It is not difficult to see how natural language processing preserves context while language independent processing loses it.
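The contrast between the two boxes can be sketched in a few lines of Python. This is only an illustration and not Inxight's implementation: the crude stem table and the three-entry entity lexicon (`CRUDE_STEMS`, `KNOWN_ENTITIES`) are hypothetical stand-ins for a real stemmer and a real entity dictionary.

```python
import re
from collections import Counter

SENTENCES = [
    "San Francisco Giants take over National League West",
    "Giant SUVs take over western San Francisco",
]

# Hypothetical stand-ins for a real stemmer and a real entity lexicon.
CRUDE_STEMS = {"giants": "giant", "western": "west"}
KNOWN_ENTITIES = {
    "San Francisco Giants": "Organization",
    "National League West": "Organization",
    "San Francisco": "City",
}


def word_counts(sentences):
    """Statistical view: lower-case every word, stem it, and count it."""
    counts = Counter()
    for sentence in sentences:
        for word in re.findall(r"[A-Za-z]+", sentence.lower()):
            counts[CRUDE_STEMS.get(word, word)] += 1
    return counts


def entity_aware_counts(sentences):
    """Entity-aware view: match known multi-word names first, longest first."""
    longest_first = sorted(KNOWN_ENTITIES, key=len, reverse=True)
    alternatives = "|".join(re.escape(name) for name in longest_first)
    pattern = re.compile(rf"\b(?:{alternatives})\b|[A-Za-z]+")
    counts = Counter()
    for sentence in sentences:
        for token in pattern.findall(sentence):
            key = token if token in KNOWN_ENTITIES else token.lower()
            counts[key] += 1
    return counts


if __name__ == "__main__":
    print(word_counts(SENTENCES))
    print(entity_aware_counts(SENTENCES))
```

In the first function the baseball team and the city collapse into the same three words, whereas the second keeps them apart as separate entries, which is the loss of context described above.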

A corollary to this fact is that language independent processing doesn’t understand the text that it is processing. So, when you do a search based on such a system all it can do is to bring back the first few lines of any relevant document (à la Google). Using natural language processing, on the other hand, you can summarise documents (see later) so that you get a couple of meaningful sentences that make looking for what you actually need much easier. For example, CNN uses Inxight’s technology for the news summaries that flash across the bottom of the screen.

Figure 1: Statistical Analysis vs Natural Language processing on two phrases using different methods

Left box (statistical analysis):

Word        Count
san         2
francisco   2
giant       2
take        2
over        2
west        2
national    1
league      1
SUVs        1

Right box (natural language processing):

Token                  Part of speech        Entity type    Count
San Francisco Giants   Proper Noun Group     Organization   1
take                   verb, present tense                  2
over                   preposition                          2
National League West   Proper Noun Group     Organization   1
giant                  adjective                            1
SUV                    noun                                 1
western                adjective                            1
San Francisco          Proper Noun Group     City           1

Another problem with language independent processing is that it can have difficulty getting back to the roots or stems of words. This is easy enough in English, where flight, flying, flew and so forth can all be stemmed back to fly. However, as a counter-example, stemming a German word such as lebensversicherungsgesellschaftsangestellter, which means “life insurance company employee”, is just a trifle more difficult. In practice, you really need different stemming algorithms for different languages.
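To give a feel for why the German example needs language-specific handling, the following is a minimal sketch of dictionary-based compound splitting. It is an illustration only, not the algorithm used by Inxight: the four-word lexicon and the assumption that parts are joined by an optional linking "s" are simplifications introduced here.

```python
def split_compound(word, lexicon, linkers=("s", "")):
    """Split a compound into known lexicon words, longest head first.

    Returns a list of parts, or None if no complete split is found.
    The linking elements in `linkers` are an illustrative simplification.
    """
    word = word.lower()
    for end in range(len(word), 0, -1):
        head = word[:end]
        if head not in lexicon:
            continue
        rest = word[end:]
        if not rest:
            return [head]
        for link in linkers:
            if rest.startswith(link):
                tail = split_compound(rest[len(link):], lexicon, linkers)
                if tail:
                    return [head] + tail
    return None


if __name__ == "__main__":
    # Hypothetical miniature lexicon; a real one would hold many thousands of entries.
    lexicon = {"leben", "versicherung", "gesellschaft", "angestellter"}
    print(split_compound("lebensversicherungsgesellschaftsangestellter", lexicon))
```

An English stemmer has no need for this step at all, which is one reason a single statistical routine does not carry over from one language to another.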

Architecture

The basic process in SmartDiscovery is that you start with information analysis. That is, you identify and extract entities such as people, places, companies and other ‘things’ (such as the San Francisco Giants) from the data. Then, using a new facility called Fact Extraction that has been introduced with SmartDiscovery 4.0, the software will automatically identify any relationships that exist between the various entities that have been discovered.

Following that you can create a taxonomy, categorise the data (and from that point create summaries), create concept indexes of the data, and group associated documents, all of which leads to information discovery. The results are stored in Inxight’s repository, which supports additional linguistic capabilities such as the ability to support aliases, as well as business references that can identify key people, functions and so forth, and which can be used to support personalisation. It should go without saying that sophisticated search capabilities are also provided. Finally, SmartDiscovery Collection Explorer (which works in conjunction with the Collection Manager) is the company’s portal-like interface, which supports the company’s various visualisation techniques. Alternatively, you can plug the data into a third-party portal or application.

The actual architecture of the SmartDiscovery environment is illustrated in Figure 2.

Figure 2: Architecture of SmartDiscovery

Information Discovery

The actual process of information discovery (linguistic analysis) consists of five steps:

• Language identification: French, German or whatever.

• Tokenisation: splitting the text into individual words and setting aside those that do not add to meaning, such as ‘the’, ‘an’ and so forth.

• Part of speech tagging. This is important because you may want to extract references to the word “award” used as a verb, say, but exclude such things as an Academy Award.

• Entity extraction: such as, but not limited to, the proper noun groups in the example above. Other entities that can be automatically identified (there are 27 of these) include addresses, cities, companies, countries, currencies, dates, days, financial indices, percentages, people, phone numbers, products, social security numbers, time periods, and vehicles, amongst others. Note that aliases are supported, as are different formats for temporal and numeric expressions. While the pre-defined categories that are available may differ by language there are also facilities provided to customise these and to define your own.

• Stemming. An extension to this facility that has been provided in the latest release allows you to specify a particular tense that you would like to use in conjunction with entity extraction. Thus, for example, you could specify that you wanted only the past participle of award (awarded) to be used.
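The five steps listed above can be pictured as a simple pipeline. The sketch below is a toy under stated assumptions, not a description of LinguistX: every lexicon in it (`STOPWORDS`, `POS_LEXICON`, `IRREGULAR_STEMS`) is a tiny hypothetical table, and the only entity type recognised is a percentage.

```python
import re

# Tiny hypothetical tables; a real engine uses full per-language resources.
STOPWORDS = {
    "en": {"the", "a", "an", "of", "and", "to", "in"},
    "fr": {"le", "la", "les", "un", "une", "et", "de"},
}
POS_LEXICON = {"awarded": "verb", "award": "noun", "contract": "noun"}
IRREGULAR_STEMS = {"awarded": "award", "flew": "fly", "flying": "fly"}


def identify_language(text):
    """Step 1: pick the language whose stop words occur most often."""
    words = re.findall(r"[^\W\d_]+", text.lower())
    return max(STOPWORDS, key=lambda lang: sum(w in STOPWORDS[lang] for w in words))


def tokenise(text, lang):
    """Step 2: split into tokens and set aside stop words."""
    tokens = re.findall(r"\d+%|\w+", text)
    return [t for t in tokens if t.lower() not in STOPWORDS[lang]]


def tag(tokens):
    """Step 3: look up a part of speech for each token."""
    return [(t, POS_LEXICON.get(t.lower(), "unknown")) for t in tokens]


def extract_entities(tokens):
    """Step 4: recognise entities (here only percentages)."""
    return [(t, "percentage") for t in tokens if re.fullmatch(r"\d+%", t)]


def stem(token):
    """Step 5: reduce a token to its stem via a lookup table."""
    return IRREGULAR_STEMS.get(token.lower(), token.lower())


def analyse(text):
    lang = identify_language(text)
    tokens = tokenise(text, lang)
    return {
        "language": lang,
        "tagged": tag(tokens),
        "entities": extract_entities(tokens),
        "stems": [stem(t) for t in tokens],
    }


if __name__ == "__main__":
    print(analyse("The board awarded a contract worth 12% of the budget"))
```

The point of the sketch is the ordering: each later step depends on the output of the earlier ones, so a wrong language guess would undermine everything that follows.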

Once entities have been extracted you can then proceed to fact extraction. While the actual extraction is automatic you need to tell the software about the relationships that you are interested in. This is done through the Fact Finder Workbench, which allows you (typically domain experts) to create templates and rules for processing facts. As an example of the use of fact extraction you might create a template called “Executive Change” which identified all personnel that were executives (CIO, CEO, CFO, president, vice-presidents, chairman, directors, and so forth) together with terms such as “recruited”, “hired”, “employed” and so on. Note that SmartDiscovery also has inferencing capabilities. For example, if A relates to B and B relates to C then the software can automatically detect the inferred relationship that A relates to C. The actual process of creating these rules is via a drag-and-drop interface that has been designed for business users. There are facilities for running results in test mode, as well as debugging tools, so that you can check your definitions prior to going into production.
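As a rough illustration of the two ideas in this paragraph, the sketch below matches an “Executive Change” style template against a sentence and then derives inferred relationships by transitive closure. It is an assumption-laden toy: the word lists and function names are invented for the example, and the real product defines templates through its workbench rather than in code.

```python
import re
from itertools import product

# Hypothetical word lists standing in for a template definition.
EXEC_TITLES = ["vice-president", "president", "chairman", "director", "CIO", "CEO", "CFO"]
CHANGE_TERMS = ["recruited", "hired", "employed"]


def executive_change(sentence):
    """Return (title, term) if the sentence mentions an executive title and a hiring term."""
    low = sentence.lower()
    title = next((t for t in EXEC_TITLES
                  if re.search(rf"\b{re.escape(t.lower())}\b", low)), None)
    term = next((c for c in CHANGE_TERMS if re.search(rf"\b{c}\b", low)), None)
    return (title, term) if title and term else None


def infer(relations):
    """Add (a, c) whenever (a, b) and (b, c) are present, until nothing changes."""
    closure = set(relations)
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and a != d and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


if __name__ == "__main__":
    print(executive_change("Acme hired Jane Doe as CFO."))
    print(sorted(infer({("A", "B"), ("B", "C")})))
```

Note that the title list puts "vice-president" ahead of "president" so that the longer title wins when both would match.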

Next, the text can be summarised. This is parameter based (for example, you might want just 4 sentences or, as in the case of CNN, a single sentence no longer than so many words). Summarisation can also include preference by company, department, individual and so forth. Further, a head of department or particular individual can then add their own emphases into the mix, which will result in a re-summarisation. This means that you can have a hierarchy of summaries that are appropriate at different levels within the same organisation.
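A parameter-driven summary of this kind can be approximated with a simple extractive approach: score each sentence, keep the best few, and let a list of emphasis terms shift the scores. The sketch below is a generic technique chosen for illustration, not Inxight Summarizer's method; the function name, the scoring formula and the boost value of 10 are all assumptions.

```python
import re
from collections import Counter


def summarise(text, max_sentences=4, emphasis=()):
    """Return up to max_sentences sentences, in original order, ranked by score.

    Score = average word frequency in the text, plus a fixed boost for
    every word that appears in the caller's emphasis list.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"[a-z]+", text.lower()))
    boost = {w.lower() for w in emphasis}

    def score(sentence):
        words = re.findall(r"[a-z]+", sentence.lower())
        if not words:
            return 0.0
        base = sum(freq[w] for w in words) / len(words)
        return base + 10 * sum(w in boost for w in words)

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:max_sentences])]


if __name__ == "__main__":
    report = "Sales rose in Europe. The weather was mild. Drug trials finished early."
    print(summarise(report, max_sentences=1, emphasis=["trials"]))
```

Calling the function again with a different emphasis list gives the re-summarisation effect described above: the same document yields a different summary for a different reader.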

Finally, when searching, Inxight supports the input of filters so that you can limit the result set by concept, category or entity. In the case of the last of these, what entities allow you to do is to drill down from a document to an individual person’s details (in the case of a people entity). In association with this there are three further new features in the latest release that are worth mentioning. The first is the performance enhancements that have been implemented to support entity extraction, the second is the introduction of a new relevancy ranking option that can be used to determine an entity’s importance to the document as a whole, and the third is a set of enhancements to the query language that is provided.

The query language is used both in conjunction with searching and, in this release, by the Taxonomy Editor for classification purposes. Where it has been extended is that it now supports linguistic operators (such as the ‘awarded’ option discussed previously), in addition to the standard Boolean capabilities that it had previously.

Categorisation

Inxight uses a process called Categorization By Example (CBE) in order to create subject categorisation. This method combines both linguistic and algorithmic analysis and works in a similar fashion to some data mining methods (in the sense that you have to train the system). That is, you start with a set of training documents, which are categorised by the Inxight engine, and it is only after these have been analysed to your satisfaction that you are ready to proceed to live categorisation. The basic principle is illustrated in Figure 3.

What is happening here is that the processing of the training documents has established four major subject categories, with overlaps between the categories identified as “XY”, “XYZ” and so forth. When new documents are processed they are automatically allocated to one or other of these categories. Note, however, that one of the new documents falls outside the subject categorisation derived from the training set. This will need manual allocation. In fact, this might occur during training as well, where it is unclear as to the categorisation of a training document. If this occurs too often during training, then it may be appropriate to update the training set.
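The train-then-categorise cycle described above can be sketched with a very simple nearest-centroid classifier. This is a generic stand-in chosen for illustration and is not how CBE works internally; the bag-of-words vectors, the cosine measure and the `min_similarity` cut-off are all assumptions.

```python
import math
import re
from collections import Counter


def vectorise(text):
    """Represent a document as a bag of lower-cased words."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def train(examples):
    """Build one word-count centroid per category from (category, text) pairs."""
    centroids = {}
    for category, text in examples:
        centroids.setdefault(category, Counter()).update(vectorise(text))
    return centroids


def categorise(text, centroids, min_similarity=0.2):
    """Return the closest category, or None when the document needs manual allocation."""
    vec = vectorise(text)
    best, score = max(((c, cosine(vec, cen)) for c, cen in centroids.items()),
                      key=lambda pair: pair[1])
    return best if score >= min_similarity else None


if __name__ == "__main__":
    model = train([
        ("finance", "shares profits bank market"),
        ("sport", "match goal team league"),
    ])
    print(categorise("the bank reported profits", model))
    print(categorise("recipe for bombe surprise", model))
```

A `None` result corresponds to the document that falls outside every category in Figure 3; if that happens often, the training examples probably need extending.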

Another consideration in categorisation is the trade-off between precision on the one hand, and being comprehensive on the other. For example, a customer subscribing to a news feed does not want a recipe for “Bombe Surprise” appearing in a category about “war and conflict”. On the other hand, there are other environments where it is important not to miss any relevant detail even if that means having the occasional spurious piece of information. Inxight allows you to set thresholds about the percentage of potentially relevant documents to be included in any particular category. That is, how sure do you want to be that this document is relevant to this category? Note that this tuning can be done by category.
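The effect of such a threshold is easy to show with made-up numbers. In the sketch below the scores, the threshold values and both function names are hypothetical; the code simply demonstrates that raising a threshold trades recall for precision and that thresholds can differ per category.

```python
def assign_categories(scores, thresholds, default_threshold=0.5):
    """Keep each category whose score meets that category's own threshold."""
    return [c for c, s in sorted(scores.items())
            if s >= thresholds.get(c, default_threshold)]


def precision_recall(scored_docs, threshold):
    """scored_docs is a list of (score, is_relevant) pairs for one category."""
    selected = [relevant for score, relevant in scored_docs if score >= threshold]
    relevant_total = sum(relevant for _, relevant in scored_docs)
    true_positives = sum(selected)
    precision = true_positives / len(selected) if selected else 1.0
    recall = true_positives / relevant_total if relevant_total else 1.0
    return precision, recall


if __name__ == "__main__":
    docs = [(0.9, True), (0.7, True), (0.6, False), (0.4, True), (0.2, False)]
    print(precision_recall(docs, 0.8))  # strict: nothing spurious, but items are missed
    print(precision_recall(docs, 0.3))  # lenient: nothing missed, but one spurious item
    print(assign_categories({"war and conflict": 0.55, "food": 0.9},
                            {"war and conflict": 0.8}))
```

A news feed would sit at the strict end of this range, while an environment that must not miss anything would accept the lenient setting.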

Apart from categorisation per se, Inxight provides a number of other facilities that exploit the information discovered, once it has been categorised. These include both concept linking and similarity capabilities.

Visualisation

Figures 4 and 5 are two screenshots from SmartDiscovery Collection Explorer, which highlight a number of relevant points.

In Figure 4, an Inxight StarTree is shown. This is used to navigate and explore the News Navigator taxonomy tree. Here we can see that News is broken down into eight main areas and that education, for example, has separate entries for adult education and entrance examinations. Note that different taxonomy hierarchies can be defined for use by different departments, even if this consists of the same data. This is done by means of Inxight’s Taxonomy Manager.

It is also pertinent to note the instructions for using the StarTree on screen. One feature that is not mentioned is the ability to move both up and down through the hierarchy, either expanding (see Figure 5) or aggregating depending upon whether you are proceeding top-down or bottom-up.

Figure 3: Inxight categorisation by example

Figure 4: Inxight StarTree shown in the SmartDiscovery Collection Explorer

Figure 5 is the result of traversing the hierarchy shown in Figure 4, first into “economy, business and finance”, and then into “company information”. The document summaries shown on the right are the results obtained from drilling down a step further into the highlighted area.
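The traversal just described amounts to walking a tree, either expanding downwards or aggregating upwards. The sketch below models only the handful of taxonomy nodes named in the text as a nested dictionary; the document counts and function names are invented for illustration and say nothing about how StarTree stores or renders its data.

```python
# Only the nodes named in the text; the real taxonomy has eight main areas.
TAXONOMY = {
    "News": {
        "education": {"adult education": {}, "entrance examinations": {}},
        "economy, business and finance": {"company information": {}},
    }
}


def drill_down(tree, path):
    """Top-down: follow a path of node names and list the children found there."""
    node = tree
    for name in path:
        node = node[name]
    return sorted(node)


def aggregate(name, children, leaf_counts, totals):
    """Bottom-up: total the document counts beneath each node."""
    total = leaf_counts.get(name, 0) + sum(
        aggregate(child, sub, leaf_counts, totals) for child, sub in children.items()
    )
    totals[name] = total
    return total


if __name__ == "__main__":
    print(drill_down(TAXONOMY, ["News", "education"]))
    counts = {"adult education": 3, "entrance examinations": 2, "company information": 5}
    totals = {}
    aggregate("News", TAXONOMY["News"], counts, totals)
    print(totals)
```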

Other options provided by StarTrees include the ability to define prioritised link colours and to modify line widths in a similar way.

It is worth pointing out at this stage just how good the StarTree technology is. This may not be immediately clear from these screenshots. While it might take a moment or two to get used to, it is by far and away the most useful visualisation technique we have seen for examining large amounts of hierarchical data.

The second innovative technique used by Inxight for viewing data is what it calls a TableLens. This is used to visually spot patterns and outliers in very large tables. Basically, it represents data as lines of different lengths, so that the results look rather like a vertical spectroscopic analysis. You can move columns within a table and sort columns according to appropriate attributes so that you might achieve an effect such as the one shown in Figure 6. In this example, you can see that there is a generally decreasing trend (based probably upon some sorting criteria), but the fourth line represents an outlier, that is, an exception.
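The idea of drawing each value as a line and letting the eye catch the one that breaks the trend can be mimicked in text. The following sketch uses invented figures and a simple local-median test; neither the data nor the outlier rule comes from TableLens itself.

```python
from statistics import median


def bars(values, width=40):
    """Render each value as a row of '#' characters scaled to the largest value."""
    top = max(values)
    return ["#" * round(width * v / top) for v in values]


def outliers(values, window=2, tolerance=0.25):
    """Flag positions that differ from the median of their neighbourhood by more than tolerance."""
    flagged = []
    for i, value in enumerate(values):
        lo, hi = max(0, i - window), min(len(values), i + window + 1)
        local = median(values[lo:hi])
        if local and abs(value - local) / abs(local) > tolerance:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    column = [100, 90, 80, 30, 60, 50, 40]  # invented, sorted on some other attribute
    for row, bar in enumerate(bars(column), start=1):
        print(f"{row:2d} {bar}")
    print("outlier rows:", [i + 1 for i in outliers(column)])
```

With these figures the fourth row is visibly shorter than its neighbours, which matches the kind of exception described for Figure 6.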

By integrating SmartDiscovery and VizServer, within SmartDiscovery’s Collection Explorer you can embed TableLens facilities within a StarTree node, which can be useful for things such as clinical trials, where you start with textual data but then want to drill down into numerical data. Another typical application for this is sales analysis.

Finally, the company has just (June 2004) introduced Inxight TimeWall. This is only available with VizServer at present, and is not a part of SmartDiscovery, though this is likely to change in the future. TimeWall is probably best explained by means of example, and Figure 7 illustrates the technology. In this screenshot, previous history is shown on the wall receding to the left and the future (with respect to the time period we are looking at) is to the right.

As can be seen, the present can be divided into various categories (in this case, countries) and ‘cards’ (people and events) are pinned to the wall, displayed relative to the countries that they are associated with, and in the right place in the timeline. As may be imagined, you can click on any point on the wall to change your focus, you can colour code the cards, and you can apply filters to the data that you want to look at. We must say that this technology is significantly more advanced and intuitive than any other comparable product that we have seen; such products typically do not go beyond two-dimensional representations, making the identification of trends very difficult.

Figure 5: Expanding the StarTree

Figure 7: Inxight TimeWall
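The data behind a display of this kind can be thought of as cards grouped into lanes and placed before or after a focus date. The sketch below is a hypothetical data model written for illustration; the `Card` class, the lane names and the sample events are invented and do not reflect TimeWall's actual interfaces.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Card:
    when: date
    lane: str   # the category the card is pinned against, e.g. a country
    label: str


def build_wall(cards, focus, keep=lambda card: True):
    """Group cards by lane, in date order, split into past and future of the focus date."""
    wall = defaultdict(lambda: {"past": [], "future": []})
    for card in sorted(cards, key=lambda c: c.when):
        if keep(card):
            side = "past" if card.when <= focus else "future"
            wall[card.lane][side].append(card.label)
    return dict(wall)


if __name__ == "__main__":
    cards = [
        Card(date(2003, 1, 5), "France", "Summit"),
        Card(date(2004, 9, 1), "France", "Election"),
        Card(date(2004, 3, 1), "Spain", "Treaty"),
    ]
    print(build_wall(cards, focus=date(2004, 6, 1)))
    print(build_wall(cards, focus=date(2004, 6, 1), keep=lambda c: c.lane == "Spain"))
```

Changing `focus` corresponds to clicking on a different point of the wall, and the `keep` argument plays the part of a filter.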


Summary

Because it initially concentrated on developing an OEM market for its products, Inxight is less well-known than might otherwise be the case. If you are a user of the Verity search engine, for example, then you are using Inxight’s natural language processor, even though you may not know it. The same is true if you use SAS Text Miner, while Inxight’s StarTree technology has also been embedded in a number of third-party products. The issue for Inxight, therefore, is gaining more exposure as a supplier of end-user solutions. While it has had some success in this area in the last couple of years, and the company has become fairly well-known within its target sectors of pharmaceuticals, publishing and government, there are still a great many companies in other sectors that could benefit from Inxight’s technology. Whether Inxight can get its message out to this broader market remains to be seen, but SmartDiscovery and VizServer certainly merit a closer inspection by any company with concerns about investigating the buried detail within their document or content management systems.


Copyright & Disclaimer

This document is subject to copyright. No part of this publication may be reproduced by any method whatsoever without the prior consent of Bloor Research.

Due to the nature of this material, numerous hardware and software products have been mentioned by name. In the majority, if not all, of the cases, these product names are claimed as trademarks by the companies that manufacture the products. It is not Bloor Research’s intent to claim these names or trademarks as our own.

Whilst every care has been taken in the preparation of this document to ensure that the information is correct, the publishers cannot accept responsibility for any errors or omissions.


Bloor Research

Suite 6, Challenge House, Sherwood Drive, Bletchley, Milton Keynes, MK3 6DP, United Kingdom

Tel: +44 (0) 1908 625100 - Fax: +44 (0) 1908 625124

Web: www.bloor-research.com - email: info@bloor-research.com