20080204 documill visual search
DESCRIPTION
Documill provides a solution for visual search over PDF and MS Office document archives with page-level accuracy and instant browser-based pre-visualization.
TRANSCRIPT
Disclaimer: All information included in this presentation is company confidential, and should not be redistributed in any media or form without prior written permission from authorized Documill personnel. © 1996-2008 Documill, Inc. All rights reserved.
Documill, Inc.
Visual Search
2008-02-04
(c) 2008 Documill, Inc. All rights reserved. 2
Vision
Fundamentally change search engine end-user experience
Enhance discoverability of multi-page documents
Improve document data mining experience through instant previews and page-level results
Enable faster visual verification of search result relevancy
Documill Visual Search enables unified, browser-driven access to MSOffice and PDF files as part of the content discovery experience
You should only need one application – a web browser!
Traditional Search
Difficult to verify if the results really meet the criteria
Need to download, open and view documents 1 by 1
Need to do another search within the document
Keywords are highlighted in contextual summary, visual context missing
Search results (and their accuracy) are at document level, treating short HTML pages and documents with tens of pages equally
Viewing documents requires 3rd party viewers/tools
Documill Visual Search
Visual Search User Experience
No need to have content specific viewers/tools for initial screening of the document.
Selected document can be found visually without downloading and opening it.
Keywords are highlighted even in the thumbnails
Simultaneous view of multiple documents instead of just one – faster relevancy verification
Thumbnail view provides immediate overview of the search results
Instantly zoomable previews from thumbnails
Only pages/slides with matching content are shown
Preview whole document with browser
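The page-level behaviour described on this slide (only pages with matching content are shown, with keywords highlighted) can be illustrated with a minimal sketch. This is not Documill's implementation: the function name `find_matching_pages`, the `[[...]]` highlight markers, and the plain list of page texts are all assumptions made for illustration.

```python
import re
from typing import List, Tuple


def find_matching_pages(pages: List[str], keyword: str) -> List[Tuple[int, str]]:
    """Return (1-based page number, highlighted text) for pages containing keyword.

    Hypothetical helper: a real system would highlight hits on rendered
    thumbnails rather than in plain text.
    """
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    results = []
    for number, text in enumerate(pages, start=1):
        if pattern.search(text):
            # Wrap each hit in [[...]] as a stand-in for visual highlighting.
            highlighted = pattern.sub(lambda m: f"[[{m.group(0)}]]", text)
            results.append((number, highlighted))
    return results


pages = [
    "Quarterly overview",
    "Revenue grew in the Nordic market",
    "Appendix: revenue by region",
]
hits = find_matching_pages(pages, "revenue")
```

Here `hits` contains entries for pages 2 and 3 only, so a result list built from it would skip the non-matching first page instead of returning the whole document as one hit.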
Documill Visual Search
Scalable
  Server-side solution, only browser is needed. Both IE and FF supported.
  Supports on-demand or background preview creation and caching
  Stateless preview engine for maximum scalability
Supported formats today
  MSOffice up to Office 2003
  PDF up to v1.6, selected features
Integration alternatives
  Deploy with any search solution having an open API available
  SOAP/XML API integration
  HTML proxying
Enhanced user experience
  Enterprise search solutions
  Enterprise document management systems
  Intranet document repositories
High visual quality
  Preserve original document layout to the greatest detail
  Multiple output format alternatives
  Optimize the preview file size to minimize download time
Easy deployment
  Hosted SaaS, or customer on-site server deployment
  Flexibility to choose scalable and affordable hardware platform
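The combination of on-demand or background preview creation, caching, and a stateless preview engine can be sketched roughly as follows. This is an illustration under stated assumptions, not the product's design: the `PreviewCache` class, the `Renderer` signature, and the in-memory dictionary store are hypothetical.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

# Hypothetical renderer signature: (document bytes, page number) -> preview bytes.
Renderer = Callable[[bytes, int], bytes]


class PreviewCache:
    """Caches previews keyed by document content hash and page number."""

    def __init__(self, render: Renderer) -> None:
        self._render = render
        self._store: Dict[Tuple[str, int], bytes] = {}

    def _key(self, document: bytes, page: int) -> Tuple[str, int]:
        return hashlib.sha256(document).hexdigest(), page

    def get(self, document: bytes, page: int) -> bytes:
        """On-demand path: render on first request, then serve from cache."""
        key = self._key(document, page)
        if key not in self._store:
            self._store[key] = self._render(document, page)
        return self._store[key]

    def warm(self, document: bytes, page_count: int) -> None:
        """Background path: pre-render every page ahead of any request."""
        for page in range(1, page_count + 1):
            self.get(document, page)


calls: List[int] = []


def fake_render(document: bytes, page: int) -> bytes:
    # Stand-in renderer: its output depends only on its inputs.
    # The calls list merely records invocations for this demo.
    calls.append(page)
    return b"preview:%d:%d" % (len(document), page)


cache = PreviewCache(fake_render)
first = cache.get(b"example document", 1)
second = cache.get(b"example document", 1)  # served from cache, no re-render
cache.warm(b"example document", 3)          # renders only pages 2 and 3
```

Because the renderer's output depends only on the document and page number, any server instance could fill or read such a cache, which is one way a stateless engine can help with horizontal scaling.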
Potential Application Areas
Horizontal search
  Intra/extra/internet
e-discovery
  Evidence material for civil or criminal legal cases
Archive search
  Commonstore for email
  Commonstore for SAP
Forms search & review
  Public services like IRS e-Office documents
Cross industry solution
  Financial services
  Legal services
  Professional services
  Government & Defence
  Manufacturing & Engineering
  Medical & Health Care
  Education & Academic Research
  Utilities
  Telecom
  Retail & Distribution
Scalable and Open Technology
Technology based on scalable J2EE server solutions
  Full Java solution for both standalone and server use
  No third party software licenses needed
  Ability to process documents on a large scale
State-of-the-art rendering process preserving the visual presentation of the original document in finest detail
  import proprietary formats like MSOffice, convert files into structural XML for further processing
  export processed XML into Web-friendly content formats: (X)HTML, PDF, bitmaps, SVG
Flexible content optimization capabilities allow true automated multi-channel publishing processes
  modify (search/replace) document data, content filtering
  embed 3rd party content like advertisements
Open Java APIs allow fast integration into 3rd party search engines and content management systems
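The export step of the rendering process (structural XML converted into a Web-friendly format) might look roughly like the sketch below. The XML layout (`document`, `page`, `para` elements) and the `export_html` function are invented for illustration. The slides do not describe Documill's actual intermediate format or its Java APIs, and Python is used here only for brevity.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical structural XML for a two-page document.
STRUCTURAL_XML = """
<document title="Sample">
  <page number="1"><para>First page text</para></page>
  <page number="2"><para>Second &amp; last page</para></page>
</document>
"""


def export_html(structural_xml: str) -> str:
    """Export the assumed structural XML to a simple HTML fragment."""
    root = ET.fromstring(structural_xml)
    parts = [f"<h1>{escape(root.get('title', ''))}</h1>"]
    for page in root.findall("page"):
        # One <section> per page keeps page-level addressing in the output.
        parts.append(f'<section id="page-{escape(page.get("number", ""))}">')
        for para in page.findall("para"):
            parts.append(f"<p>{escape(para.text or '')}</p>")
        parts.append("</section>")
    return "\n".join(parts)


html_output = export_html(STRUCTURAL_XML)
```

Keeping a format-neutral intermediate representation means other exporters (for bitmaps or SVG, say) could consume the same XML without re-parsing the original file.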
About Documill
Documill is an independent software vendor (ISV) providing browser-based access to MSOffice and PDF documents and server-side content processing solutions
Documill's core personnel have worked together in various digital media ventures over the past ten years
Our core competency is server-side enterprise document processing
We are located at Innopoli 2:
Tekniikantie 14
FI-02151 Espoo
FINLAND