Data Harmony Version 3.9 Features Update
DESCRIPTION
Marjorie M.K. Hlava, President and founder of Access Innovations, Inc., unveils the newest version and module updates of the Data Harmony indexing software suite.
TRANSCRIPT
Marjorie M.K. Hlava
Access Innovations, Inc.
www.accessinn.com
Leveraging your content semantically
10th Annual Data Harmony User Group Meeting
DH Technical Support Team
Development programming team Lamine Idjeraoui ** Allexander Lyons Daniel Vasicek Scott Roberts Doug Vendcat
Customer support Mary Garcia ** Jack Bruce Gabe Carr Samantha Lewis
Documentation Jack Bruce ** Kirk Sanders Gena San Nicolas Barbara Gilles
Systems Tom Peterson** SWCP
DH Customer Support Team
Sales and Licensing Marjorie Hlava Janice McIntyre Bill Richardson Jay Ven Eman ** Leland Yates
Blog and Web team Barbara Gilles Melody Smith ** Timothy Soholt **
Marketing Heather Kotula ** Ashley Beard
Editorial Team Taxonomy and Rule Building
Gabe Carr Jack Bruce Kathy Brown Barbara Gilles Bob Kasenchak **
Samantha Lewis Kirk Sanders Tim Soholt Gena San Nicolas Alice Redmond-Neal Eric Ziecker
Access Integrity
Kathy Brown Jerry Jorgeson John Kuranz** Leland Yates Access Rule Building Team Access Programming Team
Who’s Who?
Introduce yourself Relationship to Data Harmony Where do you use Data Harmony Project Name(s)
Access Innovations
What do we do?
Four Divisions Database Services Data Harmony
NewsIndexer National Information Center for
Educational Media (NICEM) MediaSleuth
Access Integrity Medical Claims Compliance Integracoder
Database Services
Database Design Consulting DTD / Metadata Schemas Workflow Scheduling
Editorial Services Metadata capture and creation Tagging – XML, SGML Abstracting Indexing Author disambiguation
Database Services - 2
Taxonomy Construction Thesaurus Vocabulary Ontology Data Linking (linked data) Authority Files – pick lists Rule Bases
Semantic Enrichment Data Format Conversion Database Applications Retrospective metadata tagging Author disambiguation
Database Services - 3
Applications development Search – Lucene and Solr Search Harmony interface Web services layer
Link to user experience or user interface Web calls
API setup and linking www.accessinn.com
Data Harmony
Built for our use starting in 1987
Visual Basic, C++, Java
Aid to the editorial and indexing processes
Alleviate the clerical aspects
Speed the tagging process
Guarantee accuracy, consistency, and depth of indexing
Data Harmony Suite – Main Modules
M.A.I. Thesaurus Master XIS
XML Intranet System Administrative configuration module “The Data Harmony Suite”
Tech stuff Downloadable Documentation revised 2014 APIs for client server versions Internet accessible Cloud and SaaS Full multilingual display Unicode - Accepts ASCII data Entification tables converted Drivers for display and print
For most languages
Data Harmony
Java Platform independent Applet modules Web services APIs
XML TCP/IP JSON and SSL on WEB Start GlassFish for extension support www.dataharmony.com
Full multilingual display
Data Harmony Machine Aided Indexing (M.A.I.)
Semantic, syntactic, morphological, etc. layer
Rule Builder for users
Concept Extractor for text
Statistics for Machine Learning
Use in automatic, batch, or assisted mode
Thesaurus Master
For creating taxonomies, thesauri, ontologies, and authority files
MAIstro
Thesaurus Master and M.A.I. combined
Data Harmony Extensions
Inline Tagging
Metadata Extractor
MAIChem
Search Harmony
SharePoint integration
Recommender
New
DH Author Submission System
Author / Name Disambiguation
MAIBatch GUI
Semantic Fingerprinting
Web Start
Sneak Peek at “Ontology Master”
Retiring
Automatic Summarizer
WebThes
ThesViewer
TaxoDiary
Daily blog Weekly feature 3 + items per day Big archive Launched in June 2010
DH Bulletin Board Exchange
http://dhd.accessinn.com
Data Harmony Forum
Discussion threads
Solutions to reported problems
Access to the newest documentation
Announcements of features
Bug reports
Enhancement requests
Data Harmony Partners
EJ Press MarkLogic
Really Strategies (RSuite) Yuxi Xquire
Publishing Technology More ….
Some DH Connectors & Exports…
ACD/Labs’
Lucene (org. & Solr)
Perfect Search
Oracle/Stellent Universal Content Management
Jive Software’s Clearspace
EJ Press
Publishing Technology
OpenOffice
Mark Logic’s MarkLogic Server
Microsoft’s SharePoint
NorthPlains
Temis
Synaptica
and more…
Other DH offerings
Off-the-shelf taxonomy Term records Browseable list Rule bases
Consulting Information architecture DTD and schema creation
Search implementation
Knowledge Domains in over 40 subject areas.
• Agriculture
• Applied Technologies
• Business (popular)
• Business and Finance
• Communications
• Computer and Information Science (popular)
• Computer Science
• Consumer and Homemaking Education
• Corporate Names
• Counseling and Guidance
• Economics
• Education
• Engineering
• Environment
• Geography (subject)
• Geographical Place Names
• Health and Safety
• History
• Language Arts
• Languages
• Literature and Drama
• Mathematics
• News
• Occupations
• Organizational Names
• Personal Names
• Physical Education and Recreation
• Political Science
• Psychology
• Religion and Philosophy
• Science (popular)
• Science, Technology, and Medicine (STM)
• Society
• Sports
• Technology
• Visual and Performing Arts
• US Industrial Codes (NAICS)
• US Zip Codes and Places
Go to TaxoBank for more!
NewsIndexer
Automatic indexing of newspapers
8 topical areas
Maps to IPTC, NAICS, ICB, and GICS codes
Popular, automatic, and fast
Remote submission / ASP
13 levels
Filter to 3
License and augment
www.newsindexer.com
National Information Center for Educational Media - NICEM
667,000 records for non-print educational media
23,000 producers and distributors Based on school curriculum needs Online and CD-ROMs MARC cataloging Thesaurus Print www.nicem.com
MediaSleuth
Online ordering of media from NICEM Search Harmony implementation Full e-commerce platform for ordering Educational and popular materials
www.mediasleuth.com
Access Integrity, Inc. (AI2)
Medical Claims Compliance
Automatic ICD-9 suggestions
CPT rule base
HCPCS rule base
ICD-9 V 3 Hospitals
ICD-10
Accurate, deep, consistent coding
Making medical billing efficient
Corporate Information
Closely held
Financed by
Sweat and Persistence
Good Cash Flow and Management
Since 1978 - 35 years in business
Marjorie M.K. Hlava
Jay Ven Eman
Joanna Ginter
www.accessinn.com
Woman Owned Small Business
UPDATE
Data Harmony Users Group Meeting
February 10-14, 2014
The 15 modules + extensions
What’s new
Admin Module
Author Submission System
Author / Name Disambiguation
Inline Tagging
Metadata Extractor
M.A.I.
MAIBatch GUI
MAIChem
Ontology Master
Thesaurus Master
Search Harmony
SharePoint
Recommender
Web Start
XIS
Data Harmony 2013 Stack
Rule Base
TermKeyRecord
Concept Extractor
Statistics Module
M.A.I.
Taxonomy, Authority files, All terms, Alphabetic, Permuted view
XML (Extensible Markup Language) - Unicode
Java Virtual Machine
TCP/IP (Transmission Control Protocol / Internet Protocol)
Thesaurus Master
Native XML Content Creation Repository
OWL, Zthes, SKOS, XML, MARC, etc.
Administrative modules
DH Extensions
XIS
Search Harmony
NavTree
Auto Completion
Narrow Search - NT
Expanded Search - RT
Auto Sum
Metadata Extractor
MAIChem
Data Harmony 2014 Stack
Rule Base
TermKeyRecord
Concept Extractor
Statistics Module
Taxonomy, Authority files, All terms, Alphabetic, Permuted view
XML (Extensible Markup Language) - Unicode
Java Virtual Machine
TCP/IP (Transmission Control Protocol / Internet Protocol)
Thesaurus Master
Native XML Content Creation Repository
OWL, Zthes, SKOS, XML, MARC, etc.
Administrative modules
Web Start, APIs, Web services and connectors
XIS
Search Harmony
NavTree
Auto Completion
Narrow Search - NT
Expanded Search - RT
Metadata Extractor
MAIChem
Inline Tagging
Author Disambiguation
Recommender
M.A.I.
Automatic Summarizer
Author Submission System
SharePoint Connector
Ontology Master
MAIBatch
Configuration of Thesaurus Master, M.A.I., MAIstro
Separate Admin Module for XIS
MAIBatch added to MAIstro Admin Module
The author pastes the data into the document template, attaching images, graphs, etc. as necessary:
Copyright © 2013 Access Innovations, Inc.
Author Submission Module
The author fills in the data to the document template, attaching images and graphs as necessary.
An API calls Data Harmony and generates a list of indexing terms based on the content.
Authors review the indexing and may change it.
Content is stored into a data repository as HTML, XML, etc.
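The submission flow above can be sketched in a few lines of Python. This is an illustration only: the slides do not show the Data Harmony API, so `suggest_terms` is a hypothetical stand-in that does naive phrase matching against a tiny made-up vocabulary, and `submit` and the JSON record layout are likewise assumptions.

```python
import json

# Hypothetical controlled vocabulary; a real deployment would use the thesaurus.
VOCABULARY = ["Indexing", "Taxonomies", "Records management"]


def suggest_terms(text, vocabulary=VOCABULARY):
    """Stand-in for the indexing API call: naive case-insensitive phrase match."""
    lowered = text.lower()
    return [term for term in vocabulary if term.lower() in lowered]


def submit(title, body, review=None):
    """Index a submission, let the author review the terms, return a stored record."""
    suggested = suggest_terms(title + " " + body)
    # The author may accept the suggestions or change them after review.
    final_terms = review(suggested) if review else suggested
    record = {"title": title, "body": body, "terms": final_terms}
    return json.dumps(record)  # serialized form handed to the repository


# Example: the author removes one suggested term before the record is stored.
stored = submit(
    "Building taxonomies",
    "Notes on indexing practice.",
    review=lambda terms: [t for t in terms if t != "Indexing"],
)
```

The `review` callback models the step where authors review the indexing and may change it; passing no callback stores the suggestions unchanged.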
DH Author Submission System
Leveraging Records Management with Documentum, Author Submission, and MAIstro
Marjorie M.K. Hlava and Leland Yates, Access Innovations, Inc.
Admin Module
DH Author Submission System
Configure any field
Index on any field
XML or XHTML
Link to the CMS
Author Submission
System Configuration Module
Author Disambiguation
Build a file of authors
Name: first, second, surname
DOIs published
Publication rank (first author, etc.)
Keywords for those DOIs
Affiliation(s)
Location(s): city, state, country, etc.
Co-authors (inferred by DOI)
Etc.
Affiliation Disambiguation
Build a file of affiliations
Name
Lab, institute, etc. name
DOI
Location
Full address
Keywords
Etc.
Author Disambiguation
Link the two databases
Build a web service to accept files
Auto-disambiguate incoming files
Review new or non-match to ensure accuracy
Leveraging Semantic Fingerprinting for Building Author Networks
Bob Kasenchak, Wednesday @ 9:30 AM
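As a rough illustration of the author file and the auto-disambiguation step, the sketch below models an author record with the fields listed on the slides and scores an incoming record against known authors. It is not the Data Harmony method: the `AuthorRecord` layout, the Jaccard overlap, the equal weights, and the 0.3 threshold are all assumptions chosen for the example, and the names are made up.

```python
from dataclasses import dataclass, field


@dataclass
class AuthorRecord:
    """Fields follow the slide; the layout itself is an assumption."""
    first: str
    surname: str
    dois: set = field(default_factory=set)
    keywords: set = field(default_factory=set)
    affiliations: set = field(default_factory=set)
    coauthors: set = field(default_factory=set)


def overlap(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_score(known, incoming):
    """Illustrative score in [0, 1]; the equal weights are arbitrary."""
    if known.surname.lower() != incoming.surname.lower():
        return 0.0
    if known.first[:1].lower() != incoming.first[:1].lower():
        return 0.0
    parts = [
        overlap(known.keywords, incoming.keywords),
        overlap(known.affiliations, incoming.affiliations),
        overlap(known.coauthors, incoming.coauthors),
    ]
    return sum(parts) / len(parts)


def disambiguate(known_authors, incoming, threshold=0.3):
    """Return the best match, or None so the record goes to human review."""
    best = max(known_authors, key=lambda k: match_score(k, incoming), default=None)
    if best is not None and match_score(best, incoming) >= threshold:
        return best
    return None


known = [
    AuthorRecord("Jane", "Doe", keywords={"taxonomy", "indexing"},
                 affiliations={"Univ A"}, coauthors={"R. Roe"}),
    AuthorRecord("John", "Doe", keywords={"geology"}, affiliations={"Univ B"}),
]
incoming = AuthorRecord("J.", "Doe", keywords={"taxonomy"}, affiliations={"Univ A"})
match = disambiguate(known, incoming)
```

Returning `None` for weak matches corresponds to the review step on the slide: new or non-matching records are routed to a person rather than merged automatically.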
Inline Tagging
Full text tagging
Send search query directly to the place in the document where the concept is mentioned.
Flexible in XML and HTML views
Inline Tagging and Dictionary Connection
Gena San Nicolas, Wednesday @ 2:15
Inline tagging Web service
Use M.A.I. to put terms in context for high-precision indexing
Inline Tagging
Shows the exact point where the concept is mentioned
Mouse over to view the term record
Statistical summary, showing the number of times each term is mentioned in the article
XML View for Inline Tagging
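The idea of marking the exact point where a concept is mentioned, together with a per-term mention count, can be sketched as follows. This is a simplified illustration using plain string matching; M.A.I. applies rules rather than literal matching, and the `<term>` element name and the `tag_inline` function are assumptions made for the example.

```python
import re
from collections import Counter
from xml.sax.saxutils import escape


def tag_inline(text, terms):
    """Wrap each thesaurus term found in text in <term> tags and count mentions.

    The <term> element name is an assumption for this sketch.
    """
    counts = Counter()
    if not terms:
        return escape(text), counts
    # Longest terms first so "Machine aided indexing" wins over "Indexing".
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in ordered) + r")\b", re.IGNORECASE
    )
    canonical = {t.lower(): t for t in terms}
    pieces, last = [], 0
    for m in pattern.finditer(text):
        counts[canonical[m.group(1).lower()]] += 1
        pieces.append(escape(text[last:m.start()]))
        pieces.append("<term>%s</term>" % escape(m.group(1)))
        last = m.end()
    pieces.append(escape(text[last:]))
    return "".join(pieces), counts


tagged, counts = tag_inline(
    "Indexing helps. Machine aided indexing helps indexing.",
    ["Indexing", "Machine aided indexing"],
)
```

Here `counts` plays the role of the statistical summary (mentions per term), and `tagged` is the text with each mention marked in place.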
Metadata Extractor
Automatic creation from PDF digital layer
Position training needed
Dublin Core metadata
Bibliographic citation created
Automatic summarization added
Uses M.A.I. on full text
Can be linked to Author Disambiguation
Input file
Source file PDF digital layer
Metadata Extractor Full Record Display
Output in XML
Or use with HTML Pages
<document>
  <title>Access Innovations - Knowledge Management Professionals</title>
  <document-type>Web Page</document-type>
  <copyright>© 2007 Access Innovations, Inc.</copyright>
  <address>
    <street>131 Adams NE</street>
    <city>Albuquerque</city>
    <state>New Mexico</state>
  </address>
  <subject-terms>
    <term>Data Harmony</term>
    <term>Indexing</term>
    <term>Taxonomies</term>
  </subject-terms>
</document>
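A downstream system could read a record like this one with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` on a shortened copy of the sample; the element names come from the sample itself, while `read_record` and the returned dictionary layout are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample record shown on the slide.
RECORD = """<document>
  <title>Access Innovations - Knowledge Management Professionals</title>
  <document-type>Web Page</document-type>
  <subject-terms>
    <term>Data Harmony</term>
    <term>Indexing</term>
    <term>Taxonomies</term>
  </subject-terms>
</document>"""


def read_record(xml_text):
    """Pull the title, document type, and subject terms out of one record."""
    root = ET.fromstring(xml_text)
    return {
        "title": root.findtext("title"),
        "type": root.findtext("document-type"),
        "terms": [t.text for t in root.findall("subject-terms/term")],
    }


record = read_record(RECORD)
```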
M.A.I.
M.A.I. is used to describe or categorize items by matching text to controlled vocabulary terms
Rule Builder
Concept Extractor
Statistics Collector
Test MAI
M.A.I. 2014
Find in Test MAI
Export Fields function
Expanded warning and information labels
Expanded print functions
Rule error details
Emphasis tags
MAIBatch GUI
Find Function In Test MAI
Export with fields selection
Expanded warning and information labels
Delete term warning
Term warnings
Term with multiple Broader Terms warning Remove relationship warning message
Move term functions
Move a single term
Expanded print functions
Test the syntax of a rule
View information about a thesaurus term
MAIBatch GUI
MAIBatch input format
PDF
XML, nXML
Web content (HTML, HTM)
Plain text (TXT), rich text (RTF)
MS Word documents (DOC, DOCX)
Full window with suggested AND used terms
Select all or just some files to process
MAIBatch XML
Add Custom tags
Click on “XML tags” in the Settings menu.
MAIBatch - Adding files Viewing results
Upload File/Directory
Row of asterisks separates each document
file path of a document
suggested thesaurus terms
Log Statistics
From source data to compare accuracy
By human editors assigning values: HIT, MISS, NOISE
From source file data
<USEDTERMS><TERM>Term 1</TERM><TERM>Term 2</TERM></USEDTERMS>
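The comparison behind HIT, MISS, and NOISE can be sketched by reading the used terms from a `<USEDTERMS>` block and comparing them with the suggested terms. The definitions used here are assumptions, since the slides do not spell them out: a HIT is a suggested term that was also used, a MISS is a used term that was not suggested, and NOISE is a suggested term that was not used. The function names and the precision and recall figures are likewise additions for illustration.

```python
import xml.etree.ElementTree as ET


def used_terms(xml_text):
    """Read editor-approved terms from a <USEDTERMS> block as shown on the slide."""
    return {t.text.strip() for t in ET.fromstring(xml_text).findall("TERM") if t.text}


def score(suggested, used):
    """Assumed definitions: HIT = suggested and used, MISS = used but not
    suggested, NOISE = suggested but not used."""
    suggested, used = set(suggested), set(used)
    hits = suggested & used
    misses = used - suggested
    noise = suggested - used
    return {
        "hit": len(hits),
        "miss": len(misses),
        "noise": len(noise),
        "precision": len(hits) / len(suggested) if suggested else 0.0,
        "recall": len(hits) / len(used) if used else 0.0,
    }


source = "<USEDTERMS><TERM>Term 1</TERM><TERM>Term 2</TERM></USEDTERMS>"
stats = score(["Term 1", "Term 3"], used_terms(source))
```

In this example one suggested term matches the source data, one used term was never suggested, and one suggestion was unused, so each of the three counts is 1.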
M.A.I. Statistics Module
Exporting MAIBatch results
Save as .txt file through export menu
Save to Log Spreadsheet .xls
MAIChem
Dictionaries Full terms Beginners Enders
M.A.I. Concept Extractor Links to graphical displays
Ontology Master
Sneak Peek
Built on Thesaurus Master
Full OWL and SKOS exports
Full directional relationships
Same extensive functionality
Bob Kasenchak – Wednesday @ 1:15 PM
Recommender
More Like This - Recommender
Search Harmony
Built to leverage semantically enriched text
Uses the thesaurus sections
BT-NT relationships for taxonomy tree
Type ahead from tab, permuted index
Related terms
Narrower terms
Copyright © 2005 - Access Innovations, Inc.
Taxonomy view
Thesaurus Term Record view
Search Presentation Layer
Automatic completion and type ahead from thesaurus
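One way to picture type ahead driven by a thesaurus is a permuted (rotated) index, in which every word of every term is an entry point. The sketch below is an illustration under that assumption, not a description of how Search Harmony is built; `build_permuted_index` and `type_ahead` are hypothetical names.

```python
from bisect import bisect_left


def build_permuted_index(terms):
    """Map every word of every term to the term, sorted for prefix lookup."""
    entries = set()
    for term in terms:
        for word in term.lower().split():
            entries.add((word, term))
    return sorted(entries)


def type_ahead(index, prefix, limit=10):
    """Return terms that contain a word starting with prefix."""
    prefix = prefix.lower()
    # Binary search to the first entry whose word is >= the prefix.
    start = bisect_left(index, (prefix, ""))
    found = []
    for word, term in index[start:]:
        if not word.startswith(prefix):
            break
        if term not in found:
            found.append(term)
        if len(found) == limit:
            break
    return found


index = build_permuted_index(["Machine aided indexing", "Indexing", "Inline tagging"])
suggestions = type_ahead(index, "in")
```

Because the index is keyed on every word, typing "tag" also surfaces "Inline tagging" even though the term does not begin with that word.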
Search Presentation Layer
Related
Narrower
Search Presentation Layer
The Hierarchical view of the thesaurus is also a browseable view of the content.
The numbers include the number of hits
1. For the term
2. For the branch
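The two numbers shown beside each node can be computed by walking the hierarchy. The sketch below assumes a simple tree of broader and narrower terms and made-up hit counts; `branch_counts` is a hypothetical helper, and the branch total simply sums term counts, so a document indexed with two terms in the same branch is counted twice.

```python
def branch_counts(narrower, term_hits, root):
    """Return {term: (hits for the term, hits for the whole branch)}.

    narrower maps a term to its narrower terms (NT); term_hits maps a term to
    the number of documents indexed with it. Assumes each term has one broader
    term and the hierarchy has no cycles.
    """
    result = {}

    def walk(term):
        own = term_hits.get(term, 0)
        total = own + sum(walk(child) for child in narrower.get(term, []))
        result[term] = (own, total)
        return total

    walk(root)
    return result


# Made-up hierarchy and counts for illustration.
narrower = {"Science": ["Physics", "Chemistry"], "Physics": ["Optics"]}
hits = {"Science": 2, "Physics": 5, "Optics": 3, "Chemistry": 4}
counts = branch_counts(narrower, hits, "Science")
```

A production system would more likely count distinct documents per branch; the simple sum keeps the example short.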
Semantic Fingerprinting
People / Authors
Articles
Medical records
Organizations and affiliations
Point ads to users
Related to author disambiguation
Thesaurus Master
Machine Aided Indexer (M.A.I.™)
Repository
90% accuracy
Search Presentation: Browse by Subject, Auto-completion, Broader Terms, Narrower Terms, Related Terms
Client Taxonomy
Inline Tagging
Metadata and Entity Extractor
Automatic Summarization
Search Software
Client Data: Full Text, HTML, PDF, Data Feeds, etc.
Fully integrated SharePoint
[Data Harmony fully integrated with MOSS.]
Select term store management located under Site Administration
Edit term sets to accurately reflect your document libraries and content types. Term sets can be individual taxonomies or flat controlled vocabulary lists.
Thesaurus Master - 2014
Built for vocabulary control
Taxonomy
Thesaurus
Entities
Full standards compliance
ISO 25964 Parts 1 and 2
NISO Z39.19 – 2010
Emphasis Is Available for Preferred Terms: bold, italics, or underline
Term with emphasized words
Term with enriched words
Change Term dialog with enhancement buttons
XML Emphasis Export
Full Path Export
Data Harmony Custom Features as Implemented for Triumph Learning
Kirk Sanders Wednesday @ 11:00
Emphasis Full path export
Thesaurus Master 2014
Emphasis tags – more Wednesday @ 11:00
Data Harmony Custom Features as Implemented for Triumph Learning
Kirk Sanders, Access Innovations, Inc.
Pattern analysis: Domain associations
Pattern analysis: Component gaps
Web Start
Replacing WebThes and ThesViewer
Allows auto-start from the browser
Full featured
Password access control
Everything from view only to full access
XIS
A XIS project consists of the following:
Folders that XIS uses. These are the “project folders.”
A schema (configuration file) called projects.MyProject.xml.
A XIS DTD, called “projects.dtd.”
XIS links to Thesaurus Master and M.A.I.
XIS and Lucene
Search within a search (recursive search)
New Lucene search
Using Lucene for Search within XIS
Allexander Lyons, Wednesday @ 11:45
DHUG 2015
Albuquerque
February 16 – 20
Call for papers is now open
Ideas for what to do better and differently VERY welcome
We Apply Imagination
Keep the System Flexible
Make the Applications Fun
Thank you!
Marjorie M.K. Hlava, President,
Access Innovations
505-998-0800