Building Database-backended Multilingual, Multimedia Data Repositories: The aAQUA Experience

Upload: harry-franklin

Post on 18-Jan-2018


Page 1: Building Database-backended Multilingual, Multimedia Data Repositories: The aAQUA Experience

Building Database-backended Multilingual, Multimedia Data Repositories: The aAQUA Experience

Page 2: Building Database-backended Multilingual, Multimedia Data Repositories: The aAQUA Experience

Developmental Informatics Lab

Introduction

aAqua (almost All questions answered): an online forum in which experts in the field answer questions from the grassroots.

Bridges gaps in the use of ICT:
– Usability
– Availability
– Multi-linguality
– Multimedia support
– Multi-lingual storage and retrieval
– Reusability

Page 3: Building Database-backended Multilingual, Multimedia Data Repositories: The aAQUA Experience


Usability

Page 4: Building Database-backended Multilingual, Multimedia Data Repositories: The aAQUA Experience


A Sample Thread

Page 5: Building Database-backended Multilingual, Multimedia Data Repositories: The aAQUA Experience


aAqua in Operation

Page 6: Building Database-backended Multilingual, Multimedia Data Repositories: The aAQUA Experience


aAqua Server

[Architecture diagram. Labels: aAqua Server (Crop Doctor, Crop Recommendation, Keyword Browser, BhavPuchiye, aAqua); Internet (HTTP); aAquaOffline; Mobile network; aAquaMobile; Gateway (SMS)]

Page 7: Building Database-backended Multilingual, Multimedia Data Repositories: The aAQUA Experience

aAqua Demo

Page 8: Building Database-backended Multilingual, Multimedia Data Repositories: The aAQUA Experience


aAQUA: a technical perspective

– Employs a three-tier web architecture.
– Uses mvnForum, which is based on the MVC architecture.
– Uses Lucene as the search engine.
– Compatible with any servlet container that supports JSP 1.2 and Servlet 2.3.
– Runs on Tomcat.
– Works with Unicode (UTF-8) compliant Oracle 9i as well as MySQL databases.
– Is integrated with open-source digital library software.

Page 9: Building Database-backended Multilingual, Multimedia Data Repositories: The aAQUA Experience


Multi-Linguality

Page 10: Building Database-backended Multilingual, Multimedia Data Repositories: The aAQUA Experience


Multi-lingual Storage and Retrieval

[Diagram labels: Query in Hindi ("flowers scorch"); UNL Document; Inforepository (UNL Documents); Result in Hindi]

Example sentence: "…The plants blossom but the flowers scorch…"

UNL expression:

and(blossom(icl>develop(obj>thing)):0S.@entry.@custom, scorch(icl>dry(obj>thing)):2E.@contrast.@custom)
obj(blossom(icl>develop(obj>thing)):0S.@entry.@custom, plant(icl>organism):04.@def.@pl)
obj(scorch(icl>dry(obj>thing)):2E.@contrast.@custom, flower(icl>reproductive structure):1P.@pl.@def)

[Figure: UNL graph]
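The three UNL relations for the example sentence can be read as labelled edges of a graph. The sketch below is only an illustration of that reading, not the parser used in aAqua: the helper names (`split_top_level`, `headword`, `parse_relation`) are hypothetical, and it assumes every relation has the shape `relation(node1, node2)` shown on this slide.

```python
def split_top_level(text, sep=","):
    """Split text on sep, ignoring separators nested inside parentheses."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == sep and depth == 0:
            parts.append(text[start:i].strip())
            start = i + 1
    parts.append(text[start:].strip())
    return parts


def headword(node):
    """Return the headword of a UNL node, e.g. 'plant' from 'plant(icl>organism):04.@def.@pl'."""
    return node.split("(", 1)[0].split(":", 1)[0].strip()


def parse_relation(line):
    """Parse 'rel(node1, node2)' into the tuple (rel, headword1, headword2)."""
    name, rest = line.strip().split("(", 1)
    body = rest.rsplit(")", 1)[0]  # drop the relation's closing parenthesis
    source, target = split_top_level(body)
    return name, headword(source), headword(target)


unl_lines = [
    "and(blossom(icl>develop(obj>thing)):0S.@entry.@custom, scorch(icl>dry(obj>thing)):2E.@contrast.@custom)",
    "obj(blossom(icl>develop(obj>thing)):0S.@entry.@custom, plant(icl>organism):04.@def.@pl)",
    "obj(scorch(icl>dry(obj>thing)):2E.@contrast.@custom, flower(icl>reproductive structure):1P.@pl.@def)",
]

edges = [parse_relation(line) for line in unl_lines]
for relation, source, target in edges:
    print(f"{source} --{relation}--> {target}")
```

Because the edges carry concepts rather than words of a particular language, a query such as "flowers scorch" can in principle be matched against the `obj(scorch, flower)` edge regardless of the language the question was typed in.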

Page 11: Building Database-backended Multilingual, Multimedia Data Repositories: The aAQUA Experience


Unicode

– Computers store letters and other characters by assigning a number to each.
– Hundreds of different encoding systems exist for assigning these numbers.
– Before Unicode, no single encoding could contain enough characters.
– Unicode is a universal encoded character set:
  – Enables information from any language to be stored using a single character set.
  – Provides a unique code value for every character, regardless of the platform, program, or language.
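As a small illustration of "a unique code value for every character", the sketch below prints the Unicode code point of a few characters from different scripts. The sample characters and the helper name `code_point` are chosen here for illustration and do not come from the slides.

```python
def code_point(ch):
    """Return the Unicode code point of a single character in U+XXXX notation."""
    return f"U+{ord(ch):04X}"


# Latin, accented Latin, Devanagari and Cyrillic characters share one code space.
for ch in ["A", "á", "क", "Ф"]:
    print(ch, code_point(ch))
```

The same number identifies the character on every platform; how that number is turned into bytes is the job of an encoding such as UTF-8 or UTF-16.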

Page 12: Building Database-backended Multilingual, Multimedia Data Repositories: The aAQUA Experience


Unicode standard

UTF-8 encoding
– Popular with HTML.
– A way of transforming all Unicode characters into a variable-length encoding of bytes.
– The Unicode characters corresponding to the familiar ASCII set have the same byte values as in ASCII.
– UTF-8 can be used with much existing software without extensive rewrites.

UTF-16 encoding
– Used when efficient access to characters is needed with economical use of storage.
– Most of the heavily used characters fit into a single 16-bit code unit, while all other characters are accessible via pairs of 16-bit code units.
– Better compatibility with Java.
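The difference between the two encodings can be checked directly. The sketch below, with a hypothetical helper `encoded_sizes` and sample characters picked for illustration, counts the UTF-8 bytes and the 16-bit UTF-16 code units needed for one character, and confirms that ASCII text is byte-for-byte identical in UTF-8.

```python
def encoded_sizes(ch):
    """Return (number of UTF-8 bytes, number of 16-bit UTF-16 code units) for ch."""
    utf8_bytes = len(ch.encode("utf-8"))
    # "utf-16-be" emits no byte-order mark, so bytes // 2 is the code-unit count.
    utf16_units = len(ch.encode("utf-16-be")) // 2
    return utf8_bytes, utf16_units


for ch in ["A", "ö", "क", "\U00010402"]:
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))

# ASCII text has the same byte values in UTF-8 as in ASCII.
assert "aAqua".encode("utf-8") == "aAqua".encode("ascii")
```

A Devanagari letter takes three bytes in UTF-8 but a single code unit in UTF-16, while a character outside the Basic Multilingual Plane, such as U+10402, needs four UTF-8 bytes or a pair of UTF-16 code units.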

Page 13: Building Database-backended Multilingual, Multimedia Data Repositories: The aAQUA Experience


Unicode Encodings

Character    UTF-8          UTF-16
c            63             0063
á            C3 A1          00E1
t            74             0074
愀           E6 84 80       6100
d            64             0064
ö            C3 B6          00F6
Ф            D0 A4          0424
𐐂           F0 90 90 82    D801 DC02
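The byte values on this slide can be reproduced with any Unicode-aware language. The following sketch uses hypothetical helpers `hex_bytes` and `encoding_row` to print the UTF-8 bytes and UTF-16 code units of the characters whose UTF-16 values appear above (0063, 00E1, 0074, 6100, 0064, 00F6, 0424, and the pair D801 DC02).

```python
def hex_bytes(data, group):
    """Format bytes as upper-case hex, `group` bytes per space-separated chunk."""
    return " ".join(data[i:i + group].hex().upper() for i in range(0, len(data), group))


def encoding_row(ch):
    """Return (UTF-8 bytes, UTF-16 code units) of ch as hex strings."""
    return hex_bytes(ch.encode("utf-8"), 1), hex_bytes(ch.encode("utf-16-be"), 2)


code_points = [0x0063, 0x00E1, 0x0074, 0x6100, 0x0064, 0x00F6, 0x0424, 0x10402]
for cp in code_points:
    utf8, utf16 = encoding_row(chr(cp))
    print(f"U+{cp:04X}  UTF-8: {utf8:<12} UTF-16: {utf16}")
```

Note that the pair D801 DC02 encodes the single character U+10402; in well-formed UTF-8 it is written as one four-byte sequence rather than as two separately encoded surrogates.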

Page 14: Building Database-backended Multilingual, Multimedia Data Repositories: The aAQUA Experience


Unicode and the Web

– The preferred encoding form for Unicode characters on the web is UTF-8.
– The HTTP header of a document should contain the line:
  – Content-Type: text/html; charset=utf-8 (for HTML files)
  – Content-Type: text/plain; charset=utf-8 (for text files)
– Or, in an HTML document, add the following line under the HEAD element:
  <META http-equiv="Content-Type" content="text/html; charset=UTF-8">
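A minimal sketch of both techniques in one place, written as a Python WSGI application rather than the JSP/servlet stack aAqua itself runs on: the page body, the `app` function and the sample Hindi greeting are assumptions made for illustration. The response declares UTF-8 in the HTTP header, the document repeats it in a META element, and the body is actually encoded as UTF-8 so that the declaration and the bytes agree.

```python
PAGE = """<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>UTF-8 demo</title>
</head>
<body><p>नमस्ते</p></body>
</html>
"""


def app(environ, start_response):
    """WSGI application that serves PAGE as UTF-8 and says so in the HTTP header."""
    body = PAGE.encode("utf-8")  # the bytes must match the declared charset
    start_response("200 OK", [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),  # length in bytes, not characters
    ])
    return [body]


# To try it locally with the standard library's reference server:
#   from wsgiref.simple_server import make_server
#   make_server("", 8000, app).serve_forever()
```

Content-Length is computed from the encoded bytes because multi-byte characters make the byte count larger than the character count.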

Page 15: Building Database-backended Multilingual, Multimedia Data Repositories: The aAQUA Experience


Creating Unicode databases

MySQL/Oracle
– CREATE DATABASE database_name CHARACTER SET character_set;
– CREATE DATABASE confluence CHARACTER SET utf8;
– Oracle 9i also supports UTF-16 (AL16UTF16, available as the national character set).

PostgreSQL
– CREATE DATABASE database_name WITH ENCODING 'UTF8';
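Once the database stores text as Unicode, multilingual content should survive a write and read unchanged. The sketch below checks that round trip using SQLite from the Python standard library purely as a stand-in (aAqua itself uses Oracle 9i or MySQL); SQLite stores text as UTF-8 by default, and the table name `posts`, the function `round_trip` and the sample strings are hypothetical.

```python
import sqlite3


def round_trip(text):
    """Store text in an in-memory SQLite table and read it back."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT)")
        # Parameter binding passes the string as Unicode; no manual encoding is needed.
        conn.execute("INSERT INTO posts (body) VALUES (?)", (text,))
        (stored,) = conn.execute("SELECT body FROM posts WHERE id = 1").fetchone()
        return stored
    finally:
        conn.close()


for sample in ["flowers scorch", "नमस्ते", "Фёдор"]:
    assert round_trip(sample) == sample
print("all samples survived the round trip")
```

With MySQL or Oracle the same check is worth running through the real driver, since the connection's character set has to match the database's for the round trip to hold.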

Page 16: Building Database-backended Multilingual, Multimedia Data Repositories: The aAQUA Experience

Thank You