Discover new value from unstructured data
DESCRIPTION
Presented at Semantic Garage Meetup San Francisco 2011. Unstructured data comes at a high cost: $37,000 per year per person in information industries. By using tools to automatically add metadata, enterprises can improve search results, speed up e-discovery and risk assessment, summarize content, and extract entities from files. Unstructured and semi-structured data represent a large component of big data. By turning unstructured content into business intelligence, enterprises can shorten time to information.
TRANSCRIPT
Pingar SharePoint NZ Idol
For Wave to incorporate into Peter’s presentation
Time spent on information tasks (avg. hours per week)
Emails: 14.5
Creating docs: 13.3
Analyzing info: 9.6
Searching: 9.5
Reviewing: 8.8
Gathering info: 8.3
Organizing docs: 6.8
Creating presentations: 6.7
Creating images: 5.6
Data entry: 5.6
Doc approval: 4.3
Publishing: 4.2
Translating: 1
Source: IDC, Hidden Cost of Information (2005)
= $37K per year per person
…can be rescued!
Redaction example is from dysonology.wordpress.com
New Pingar API
Rapid Discovery: Related searches, Dynamic facets, Document preview
HCIR Workshop, 20 October 2011
Google, Mountain View
Entity Extraction: Named entity extraction, Taxonomy mapping, Linked Data connectors, Address detection, Invoice analysis
New Pingar API
Mining Custom Taxonomies, Sept 2010 – Feb 2012
NZ Ministry of Science and Innovation
University of Waikato & Pingar
Content Analysis: Sanitization and redaction, Offensive content filtering, Summarization, Report generation
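To make the idea of redaction concrete, here is a minimal sketch in Python. It is not the Pingar API, whose calls are not shown in these slides. The `redact` function, the two regular expressions, and the `[REDACTED]` marker are all assumptions chosen for illustration. A production sanitizer would need far broader coverage.

```python
import re

# Hypothetical patterns for illustration only; real sanitization needs much
# broader coverage (names, postal addresses, account numbers, and so on).
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(text: str, marker: str = "[REDACTED]") -> str:
    """Replace every email address and phone-like number with a marker."""
    text = EMAIL.sub(marker, text)
    return PHONE.sub(marker, text)


if __name__ == "__main__":
    sample = "Contact Jane at jane.doe@example.com or +64 9 555 0100."
    print(redact(sample))
```

Pattern matching like this only catches values with a predictable shape. Redacting names or organizations would normally build on entity extraction.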
New Pingar API
query
Link to download an auto-generated PDF report
Exploring verticals: Legal, Bioscience, Education, Government
Demo time