Digitizing Transmuter

Upload: emory-preston

Post on 29-Dec-2015

• Extracting relevant information from the electronic media into digitized form and accumulating the information bank for further processing based on the user needs.

Problem statement

Solution

• “Digitizing Transmuter” is the answer to this unanswered question.

• Unlike general transmuters, Digitizing Transmuter controls the input and renders it into an output that can in turn be used as an input.

Electronic Media → Digitizing Transmuter → Digitized Information Bank

• A common barrier for the transmuters currently on the market is the PDF format.

• PDF exposes information through only a few parameters, viz. text objects, each of which is one or more glyph shapes representing characters of text, and, similarly, image objects.

• These parameters make it difficult for other transmuters to use the information for any further processing.

• Digitizing Transmuter makes it happen in seconds. It transmutes all the necessary data into more concise and processable data.

Process

• Input PDF file

• Read parameters like height, weight

• Read lines and rectangles based on start and end points

• Get the co-ordinate space from the transformation matrix

• Detect regions or tables based on connected lines

• Decision: Is it a parent rectangle? (Yes / No)

• Decision: Does the region contain literals? (Yes / No)

• Form rows and tables based on the literals

• Divide rows into columns

• Retrieve headers

• Decision: Does the header match the mapping structure? (Yes / No)

• Fetch data under the header

• Alternative branch: grow the region till the next rectangle
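The geometric steps of the flow above can be illustrated with a minimal sketch. It assumes that text literals arrive as (x, y, text) tuples and that the transformation matrix uses the standard six-number PDF form [a b c d e f]. The function names, the `y_tolerance` parameter and the sample data are hypothetical and are not taken from Digitizing Transmuter itself.

```python
from typing import List, Tuple

Literal = Tuple[float, float, str]  # (x, y, text): hypothetical literal record


def apply_matrix(m: Tuple[float, float, float, float, float, float],
                 x: float, y: float) -> Tuple[float, float]:
    """Map a point through a PDF-style matrix [a b c d e f]."""
    a, b, c, d, e, f = m
    return (a * x + c * y + e, b * x + d * y + f)


def form_rows(literals: List[Literal], y_tolerance: float = 2.0) -> List[List[str]]:
    """Group literals into rows by similar y, then order each row by x."""
    rows: List[List[Literal]] = []
    # PDF user space has y growing upward, so visit literals by descending y.
    for lit in sorted(literals, key=lambda l: -l[1]):
        if rows and abs(rows[-1][0][1] - lit[1]) <= y_tolerance:
            rows[-1].append(lit)   # close enough to the current row
        else:
            rows.append([lit])     # start a new row
    # Dividing rows into columns: sort each row left to right.
    return [[text for _, _, text in sorted(row, key=lambda l: l[0])] for row in rows]


# Sample data (invented for illustration): scale by 2 and shift by (10, 20).
matrix = (2.0, 0.0, 0.0, 2.0, 10.0, 20.0)
raw = [(50, 100, "Balance"), (0, 100, "Deal"), (0, 90, "ABX-1"), (50, 90.5, "1200")]
placed = [(*apply_matrix(matrix, x, y), text) for x, y, text in raw]
table = form_rows(placed)
```

Here the first row of `table` holds the header literals and the second row holds the data beneath them; a real document would also need the rectangle and connected-line checks, which this sketch leaves out.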

• The mapping file decides which data is fetched from an input file.

• The input string in the mapping file is the header under which the necessary data resides.

• Other parameters, such as the data type and the names to appear in the output file, are also specified.
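As a rough illustration of how such a mapping could drive extraction, the sketch below models one mapping entry with the three pieces of information named above (input header, data type, output name). The `MappingEntry` class, the `fetch_mapped_columns` function, the case-insensitive header comparison and the sample table are all assumptions made for this example; the actual mapping file format is not described in the slides.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class MappingEntry:
    input_header: str   # header text to look for in the input file
    data_type: str      # declared type of the values, e.g. "float" or "str"
    output_name: str    # name the column should carry in the output file


def fetch_mapped_columns(table: List[List[str]],
                         mapping: List[MappingEntry]) -> Dict[str, List[str]]:
    """Return the data under each header that appears in the mapping."""
    headers, data_rows = table[0], table[1:]
    # Assumption: headers are matched ignoring case and surrounding spaces.
    wanted = {m.input_header.strip().lower(): m for m in mapping}
    result: Dict[str, List[str]] = {}
    for col, header in enumerate(headers):
        entry = wanted.get(header.strip().lower())
        if entry is None:
            continue  # header is not in the mapping structure: skip it
        result[entry.output_name] = [row[col] for row in data_rows if col < len(row)]
    return result


mapping = [MappingEntry("Deal Name", "str", "deal"),
           MappingEntry("Balance", "float", "balance")]
table = [["Deal Name", "Coupon", "Balance"],
         ["ABX-1", "5.5", "1200.00"],
         ["ABX-2", "6.0", "950.50"]]
columns = fetch_mapped_columns(table, mapping)
```

The unmapped "Coupon" column is ignored, and the remaining columns come back under their output names.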

Overview

• Process

• Submit data in an electronic form, viz. PDF or Excel, to DT.

• Instruct DT on what is needed and what to fetch.

• DT performs the parsing as instructed by the user and validates the parsed data.

• DT presents the output in digital form, viz. DB or XML.

• Components

• Settings

• Parser

• Validation Engine

• Report Generation

How it works

• Settings

• This component is used to select the input file, locate the mapping file and set the path where the output file is produced.

• Parser

• Select the parser, viz. PDF, Excel, List etc.

• Select the file or folder of files to parse.

• Validation

• Validation is performed internally after parsing is completed.
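The slides do not say which checks the validation engine runs. One plausible reading, given that the mapping file declares a data type for each field, is a type check on the parsed values. The sketch below shows that idea only; the `CONVERTERS` table, the `validate_column` function and the error message format are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical converters keyed by the data type named in the mapping file.
CONVERTERS: Dict[str, Callable[[str], object]] = {
    "int": int,
    "float": float,
    "str": str,
}


def validate_column(values: List[str], data_type: str) -> Tuple[List[object], List[str]]:
    """Convert values to data_type; collect an error message for each failure."""
    convert = CONVERTERS[data_type]
    converted: List[object] = []
    errors: List[str] = []
    for index, raw in enumerate(values):
        try:
            converted.append(convert(raw.strip()))
        except ValueError:
            errors.append(f"row {index}: {raw!r} is not a valid {data_type}")
    return converted, errors


good, problems = validate_column(["1200.00", "n/a", "950.5"], "float")
```

Collecting errors instead of stopping at the first one lets a whole file be parsed in one pass and reported on afterwards.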

• Report Generation

• Reports are generated in 3 forms:

XML form

CSV form

HTML form
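To make the three report forms concrete, here is a minimal sketch that renders the same parsed rows as CSV, XML and HTML using only the Python standard library. The function names, the `report` and `row` element names and the sample row are assumptions for illustration; the layout of the real reports is not given in the slides.

```python
import csv
import html
import io
import xml.etree.ElementTree as ET
from typing import Dict, List

Row = Dict[str, str]


def to_csv(rows: List[Row], fields: List[str]) -> str:
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def to_xml(rows: List[Row], fields: List[str]) -> str:
    # Field names are used as element names, so they must be valid XML names.
    root = ET.Element("report")
    for row in rows:
        item = ET.SubElement(root, "row")
        for name in fields:
            ET.SubElement(item, name).text = row[name]
    return ET.tostring(root, encoding="unicode")


def to_html(rows: List[Row], fields: List[str]) -> str:
    head = "".join(f"<th>{html.escape(name)}</th>" for name in fields)
    body = "".join(
        "<tr>" + "".join(f"<td>{html.escape(row[name])}</td>" for name in fields) + "</tr>"
        for row in rows
    )
    return f"<table><tr>{head}</tr>{body}</table>"


rows = [{"deal": "ABX-1", "balance": "1200.00"}]
fields = ["deal", "balance"]
csv_report = to_csv(rows, fields)
xml_report = to_xml(rows, fields)
html_report = to_html(rows, fields)
```

Keeping the rows in one neutral structure and writing a small renderer per format means a new output form can be added without touching the parser.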

Achievements

• Commercial use of Digitizing Transmuter

• RiskSpan, which serves the MBS and ABS marketplace and also offers integrated solutions that combine powerful analytics, data and expert advisory services, has been using Digitizing Transmuter for the last 12 months.

• Digitizing Transmuter handles monthly ABX deals for RiskSpan.

• A few hundred users access the ABX analysis data on a monthly basis.

• It is estimated that Digitizing Transmuter can parse and output data from 7000 deals in 3 working days (24 hours).

• Digitizing Transmuter has parsed and rendered 3000 deals in .lst format in under 3 minutes.