PSI 2005 N. Ramangaseheno
Unformatting SVG Documents
● Presentation of the Project
– Environment of Work
– Introducing the Subject
– Unformat Process Applied to Graphics Indexing
● Contributions
● Applications
Environment of Work
Laboratories:
• DI (Document Interaction) team of the PSI (Perception, Systèmes, Information) laboratory, University of Rouen, France
• IPI (Image Processing and Interpretation) research group of SCSIT (School of Computer Science and Information Technology), University of Nottingham, England
Tutors:
• Eric Trupin, Research Director ("Directeur de Recherche") within the PSI
• Tony Pridmore, Senior Lecturer, member of the IPI research group
Collaborations:
• Mathieu Delalandre, post-doc within the IPI group
• Karim Zouba, master's trainee within the PSI
Integrated project: Indexation de graphiques vectoriels (indexing of vector graphics)
Programming language: Java
Training period: April 2005 - September 2005
Vector Graphics
[Figure: miyamoto.ai, panels (a) and (b)]
<rect x="400" y="100" width="400" height="200" fill="yellow" stroke="navy" stroke-width="10" />
Common vector formats:
• AI (Adobe Illustrator)
• SVG (Scalable Vector Graphics)
• WMF (Windows Metafile)
• EPS (Encapsulated PostScript)
• DXF (AutoCAD)
[Figure captions: a vector graphic; an SVG rectangle; a WMF pen; an EPS plane]
Image Indexing
Graphics types:
- raster images or bitmaps
- vector images
Indexing: automatic extraction of informative characteristics from multimedia containers to aid retrieval and browsing through large databases.
[Figure: Architecture of an image indexing system. Blocks: graphic document; design of visual descriptors; image signature; image database; measure of similarity; classification]
Diagram of the Indexing System
[Figure: indexing system. Blocks: vector graphic; unformat process; analyzer; generator of synthetic documents; low-level representation; high-level representation]
Problem with Unformatting
Why unformat before analysing?
[Figure: an SVG document and the corresponding unformatted document, with regions R1, R2, R3]
[Figure: SVG document, unformatted document, and region graph (regions R1, R2, R3); stages: acquiring, modelling, filtering]
● Presentation of the Project
● Contributions
– Diagram of the Unformat System
– The SVG Format
– Parsing
– Modelling
– Filtering
– Intersection Search
● Applications
Diagram of the Unformat System
[Figure: unformat system. Input: SVG document; stages: parsing, modelling, filtering, intersection search]
The SVG Format
Standard and advantages:
● W3C standard for describing 2D graphics => open standard
● Growing format => growing number of viewers and users
● Vector description of graphical objects => scalability
● Based on XML (described by a DTD); compatible with XLink, XPointer, CSS/XSL, and SMIL (animation language) => textual, separation between semantics and presentation
● Scripts and animations triggered by associated events => interactive
Drawback:
● Lack of realism
Suited for:
● Interactive geographic maps
● Technical drawings
● XML accounts
<rect x="400" y="100" width="400" height="200" fill="yellow" stroke="navy" stroke-width="10" />
[Figure: an SVG rectangle, panels (a) and (b)]
Structure of an SVG Document
Example:
<?xml version="1.0" encoding="iso-8859-1" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.0//EN"
  "http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.dtd">
<svg width="5cm" height="4cm">
  <desc>A pretty rectangle</desc>
  <rect x="3cm" y="0.5cm" width="1.5cm" height="2cm"/>
</svg>
SVG tag                         Corresponding declaration
<svg>                           SVG document
<g>                             group of objects
<'symbol'>                      geometrical shape
<text>, <tspan>, or <tref>      text
<image>                         image
<defs>                          definition of links
<use>                           link to an internal graphical object
Geometrical Shapes
Common shapes
● Ellipse: <ellipse cx="400" cy="300" rx="72" ry="50" />
● Rectangle: <rect x="150" y="50" width="135" height="100" />
● Circle: <circle cx="70" cy="100" r="50" />
● Line: <line x1="375" y1="50" x2="425" y2="150" />
● Polyline: <polyline points="50,250,75,350,100,250,125,350" />
● Polygon: <polygon points="250,250,297,284,279,340,220,340" />
Complex shape
● Path: <path d="M 50 250 L 100 250 L 150 300"/>
SAX Parser
● Any XML processing needs a parser
– a parser is a syntactic analyzer; it sits between the XML file and the application
– a parser can be used:
  ● from a program (script, Java, C++)
  ● from a browser
● SAX, an event-driven parser
– handler methods are called when specific events occur
– the file is analyzed sequentially before being passed to the application
[Figure: Architecture of a SAX application. The handler methods startDocument(), startElement(), endElement(), endDocument() process the events]
SVG Parsing
[Figure: the SAX parser reads the SVG document (SVG code) and, on the events met (startDocument(), startElement(), characters(), endElement(), endDocument()), produces graphical objects (vector representation)]
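As an illustration of the event-driven parsing described above, here is a minimal sketch using the standard Java SAX API. The class name SvgLineHandler, the parse helper, and the choice to collect only <line> elements as {x1, y1, x2, y2} arrays are assumptions made for this example, not the project's actual code. It also assumes unitless numeric attributes that are all present.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects every <line> element as {x1, y1, x2, y2}.
class SvgLineHandler extends DefaultHandler {
    final List<double[]> lines = new ArrayList<>();

    // Called by the SAX parser each time an opening tag is met.
    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("line".equals(qName)) {
            lines.add(new double[] {
                Double.parseDouble(atts.getValue("x1")),
                Double.parseDouble(atts.getValue("y1")),
                Double.parseDouble(atts.getValue("x2")),
                Double.parseDouble(atts.getValue("y2"))
            });
        }
    }

    // Parses SVG text sequentially and returns the lines found.
    static List<double[]> parse(String svg) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            SvgLineHandler handler = new SvgLineHandler();
            parser.parse(new ByteArrayInputStream(svg.getBytes(StandardCharsets.UTF_8)), handler);
            return handler.lines;
        } catch (Exception e) {
            throw new IllegalStateException("SVG parsing failed", e);
        }
    }
}
```

Other shapes (rect, polyline, path) would need their own branches in startElement; they are left out to keep the sketch short.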
Graphical Objects Modelling
GOMLib [Delalandre 2004]:
• Graphical Objects Modelling Library
• XML and SVG export
• Multi-model: different representations possible
[Figure: class hierarchy. OGraphic; OGraphicImpl; OPoint, OLine, OHL; OExtremity, OJunction]
[Figure: representation models. Line graph; hierarchical list; hierarchical graph; linked-squares; point list; line list]
Filtering superfluous lines
[Figure: the same lines as they appear visually and as they are in reality]
Filtering is needed to respect the constraint:
1 point of the 2D plane = 1 single representation
Preliminary Tests (1/2)
Given:
- two lines L1 and L2
- b1(xb1,yb1) begin point of L1; b2(xb2,yb2) begin point of L2
- e1(xe1,ye1) end point of L1; e2(xe2,ye2) end point of L2
• L1 isEqual L2: L1 and L2 are equal if
xb1 = xb2; yb1 = yb2; xe1 = xe2; ye1 = ye2
• L1 isParallel L2: L1 and L2 are parallel if
(xe1 - xb1) * (ye2 - yb2) - (ye1 - yb1) * (xe2 - xb2) = 0
• L1 isColinear p: a point p(x,y) is colinear to L1 if
y = t * x + o
(t = (ye1 - yb1) / (xe1 - xb1) and o = yb1 - t * xb1; when L1 is vertical, xe1 = xb1 and the test becomes x = xb1)
• L1 isColinear L2: L2 is colinear to L1 if
L1 isColinear b2 and L1 isColinear e2
[Figure: line L1 from b1 to e1 and line L2 from b2 to e2]
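The isEqual, isParallel, and isColinear tests can be sketched in Java as follows. The Segment class and its field names are hypothetical and only mirror the slide notation. Two choices here are assumptions rather than the project's code: the colinearity test uses a cross product instead of y = t * x + o, so that vertical lines need no special case, and coordinates are compared exactly, which suits integer-valued SVG coordinates but would need a tolerance for arbitrary floating-point data.

```java
// Hypothetical segment type: begin point b(xb, yb), end point e(xe, ye).
final class Segment {
    final double xb, yb, xe, ye;

    Segment(double xb, double yb, double xe, double ye) {
        this.xb = xb; this.yb = yb; this.xe = xe; this.ye = ye;
    }

    // Same begin point and same end point.
    boolean isEqual(Segment o) {
        return xb == o.xb && yb == o.yb && xe == o.xe && ye == o.ye;
    }

    // Cross product of the two direction vectors is zero.
    boolean isParallel(Segment o) {
        return (xe - xb) * (o.ye - o.yb) - (ye - yb) * (o.xe - o.xb) == 0;
    }

    // Point (x, y) lies on the infinite line carrying this segment.
    boolean isColinear(double x, double y) {
        return (xe - xb) * (y - yb) - (ye - yb) * (x - xb) == 0;
    }

    // Both end points of o lie on the line carrying this segment.
    boolean isColinear(Segment o) {
        return isColinear(o.xb, o.yb) && isColinear(o.xe, o.ye);
    }
}
```

For example, new Segment(0, 0, 4, 0) is parallel to new Segment(0, 1, 4, 1) but not colinear with it.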
Preliminary Tests (2/2)
• L1 overlaps p: L1 overlaps a point p(x,y) if
((x - xb1) * (x - xe1)) < 0 or ((y - yb1) * (y - ye1)) < 0
(the test is meaningful when p is colinear to L1)
• L1 overlaps L2: L1 overlaps L2 if
L1 overlaps b2 or L1 overlaps e2
• L1 isConnected p: L1 is connected to the point p(x,y) if
b1 = p
or e1 = p
• L1 isConnected L2: L1 is connected to L2 if
L1 isConnected b2
or L1 isConnected e2
[Figure: l1 is connected to l2]
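A minimal Java sketch of the overlaps and isConnected tests is given below. The class name OverlapTests and the representation of a segment as a double[] {xb, yb, xe, ye} are assumptions made for the example. As in the slide, overlaps only checks that the point falls strictly between the end points on at least one axis, so it should be applied to points already known to lie on the line.

```java
// Hypothetical helper; a segment is a double[] {xb, yb, xe, ye}.
final class OverlapTests {

    // Point (x, y) is strictly between the end points on the x axis or on the y axis.
    static boolean overlaps(double[] l, double x, double y) {
        return (x - l[0]) * (x - l[2]) < 0 || (y - l[1]) * (y - l[3]) < 0;
    }

    // At least one end point of l2 is overlapped by l1.
    static boolean overlaps(double[] l1, double[] l2) {
        return overlaps(l1, l2[0], l2[1]) || overlaps(l1, l2[2], l2[3]);
    }

    // Point (x, y) coincides with the begin point or the end point of l.
    static boolean isConnected(double[] l, double x, double y) {
        return (l[0] == x && l[1] == y) || (l[2] == x && l[3] == y);
    }

    // l1 shares an end point with the begin point or the end point of l2.
    static boolean isConnected(double[] l1, double[] l2) {
        return isConnected(l1, l2[0], l2[1]) || isConnected(l1, l2[2], l2[3]);
    }
}
```

Because the inequality is strict, an end point of L1 is connected to L1 but never overlapped by it.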
Filtering Tests (1/3)
• L1 sameAs L2: L1 and L2 are the same if
L1 isEqual L2
or xb1 = xe2; yb1 = ye2; xe1 = xb2; ye1 = yb2 (L2 is L1 with begin and end points swapped)
In this case, line L2 is filtered (erased).
[Figure: l1 same as l2]
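The sameAs rule can be turned into a simple duplicate filter, sketched here in Java. The class name SameAsFilter, the double[] {xb, yb, xe, ye} segment representation, and the policy of keeping the first occurrence are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical filter; a segment is a double[] {xb, yb, xe, ye}.
final class SameAsFilter {

    // True if b has the same end points as a, in the same or in reversed order.
    static boolean sameAs(double[] a, double[] b) {
        boolean equal = a[0] == b[0] && a[1] == b[1] && a[2] == b[2] && a[3] == b[3];
        boolean reversed = a[0] == b[2] && a[1] == b[3] && a[2] == b[0] && a[3] == b[1];
        return equal || reversed;
    }

    // Keeps the first occurrence of each line and erases later duplicates.
    static List<double[]> filter(List<double[]> lines) {
        List<double[]> kept = new ArrayList<>();
        for (double[] candidate : lines) {
            boolean duplicate = false;
            for (double[] k : kept) {
                if (sameAs(k, candidate)) {
                    duplicate = true;
                    break;
                }
            }
            if (!duplicate) {
                kept.add(candidate);
            }
        }
        return kept;
    }
}
```

Each candidate is compared with the lines already kept, which gives the pairwise comparison pattern of the filtering step.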
Filtering Tests (2/3)
• L1 includes L2: L1 includes L2
case (a): L2 totally included inside L1
if L1 isColinear L2
and L1 overlaps b2
and L1 overlaps e2
case (b): L2 included inside and connected to L1
or L1 isConnected L2
and L1 isParallel L2
and [ L1 overlaps b2 or L1 overlaps e2 ]
[Figure: l1 includes l2, cases (a) and (b)]
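The two cases of the includes test combine the preliminary tests, as in the following Java sketch. The class name IncludeTests and the double[] {xb, yb, xe, ye} segment representation are hypothetical, and the parallel and colinear checks use cross products with exact comparison, which is an assumption suited to integer-valued coordinates.

```java
// Hypothetical helper; a segment is a double[] {xb, yb, xe, ye}.
final class IncludeTests {

    static double cross(double ax, double ay, double bx, double by) {
        return ax * by - ay * bx;
    }

    static boolean isParallel(double[] a, double[] b) {
        return cross(a[2] - a[0], a[3] - a[1], b[2] - b[0], b[3] - b[1]) == 0;
    }

    static boolean isColinear(double[] l, double x, double y) {
        return cross(l[2] - l[0], l[3] - l[1], x - l[0], y - l[1]) == 0;
    }

    static boolean overlaps(double[] l, double x, double y) {
        return (x - l[0]) * (x - l[2]) < 0 || (y - l[1]) * (y - l[3]) < 0;
    }

    static boolean isConnected(double[] l, double x, double y) {
        return (l[0] == x && l[1] == y) || (l[2] == x && l[3] == y);
    }

    // Case (a): l2 lies strictly inside l1. Case (b): l2 starts or ends at an
    // end point of l1, is parallel to it, and its other end point is inside l1.
    static boolean includes(double[] l1, double[] l2) {
        boolean b2Inside = overlaps(l1, l2[0], l2[1]);
        boolean e2Inside = overlaps(l1, l2[2], l2[3]);
        boolean colinear = isColinear(l1, l2[0], l2[1]) && isColinear(l1, l2[2], l2[3]);
        boolean connected = isConnected(l1, l2[0], l2[1]) || isConnected(l1, l2[2], l2[3]);
        boolean caseA = colinear && b2Inside && e2Inside;
        boolean caseB = connected && isParallel(l1, l2) && (b2Inside || e2Inside);
        return caseA || caseB;
    }
}
```

With l1 = {0, 0, 10, 0}, the segment {2, 0, 5, 0} falls under case (a) and {0, 0, 5, 0} under case (b), while {0, 0, 15, 0} is not included because it extends beyond l1.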
Filtering Tests (3/3)
• L1 isJoined L2: L1 and L2 join together
case (a): L1 is extended by L2, without overlapping
if L1 isConnected L2
and L1 isParallel L2
and "L1 overlaps e2" is false
case (b): L1 is extended by L2, with overlapping
or L1 isColinear L2
and L1 overlaps L2
and L2 overlaps L1
[Figure: l1 and l2 join together, cases (a) and (b)]
Get Line Intersection (1/3)
[Figure: junction types (X junction, T junction, multi-degree junction) and segment separation]
Intersections processing algorithm
• Search and list all junctions of the document
• For each line, test if it contains the junction
• If yes, break the line in two at the junction point
Tests used
• L1 isIntersected L2: L1 and L2 intersect at a point p(x,y), with p <-- L1 getIntersection L2
case (a) : X junction
if p is not null
case (b) : T junction
if p is null
L1 overlaps p and L2 overlaps p
or L1 isConnected p and L2 overlaps p
or L2 isConnected p and L1 overlaps p
In both cases, add junction p(x,y) to the junctions list
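The last two steps of the algorithm (test whether a line contains a junction, then break it in two) can be sketched in Java as follows. The class name JunctionSplitter, the double[] representations ({xb, yb, xe, ye} for a segment, {x, y} for a junction), and the exact-comparison colinearity check are assumptions for illustration; the junction list is taken as given.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical splitter; segment = {xb, yb, xe, ye}, junction = {x, y}.
final class JunctionSplitter {

    // True if p lies on the line carrying l and strictly between its end points.
    static boolean contains(double[] l, double[] p) {
        double cross = (l[2] - l[0]) * (p[1] - l[1]) - (l[3] - l[1]) * (p[0] - l[0]);
        boolean between = (p[0] - l[0]) * (p[0] - l[2]) < 0
                       || (p[1] - l[1]) * (p[1] - l[3]) < 0;
        return cross == 0 && between;
    }

    // Breaks every line in two at each junction it contains.
    static List<double[]> split(List<double[]> lines, List<double[]> junctions) {
        List<double[]> work = new ArrayList<>(lines);
        for (double[] p : junctions) {
            List<double[]> next = new ArrayList<>();
            for (double[] l : work) {
                if (contains(l, p)) {
                    next.add(new double[] {l[0], l[1], p[0], p[1]});
                    next.add(new double[] {p[0], p[1], l[2], l[3]});
                } else {
                    next.add(l);
                }
            }
            work = next;
        }
        return work;
    }
}
```

At an X junction both lines are broken, giving four segments; at a T junction only the line that is crossed in its interior is broken, giving three.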
Get Line Intersection (2/3)
L1 isIntersected L2 returns:
• the intersection point p(xc,yc) between lines L1 and L2
• null if the lines are parallel or colinear
• null if x < 0, y < 0
Four cases to take into account:
1. L1 and L2 are regular (neither horizontal nor vertical)
y1 = a1 * x1 + b1; y2 = a2 * x2 + b2
yc = y1 = y2; xc = x1 = x2
xc = (b2 - b1) / (a1 - a2); yc = a1 * xc + b1 = a2 * xc + b2
4. L1 and L2 are irregular (see case 2). Two cases:
• L1 is vertical, L2 horizontal
• L2 is vertical, L1 horizontal
Get Line Intersection (3/3)
2. L1 is regular, L2 irregular. Two cases:
• L2 is horizontal
y1 = a1 * x1 + b1; y2 = c for any x2
yc = c; xc = (c - b1) / a1
• L2 is vertical
y1 = a1 * x1 + b1; x2 = c for any y2
xc = c; yc = a1 * c + b1
3. L1 is irregular, L2 regular (see case 2). Two cases:
• L1 is horizontal
• L1 is vertical
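The four cases can be put together in a short Java sketch. The class name LineIntersection, the helper regularWithIrregular, and the double[] {xb, yb, xe, ye} segment representation are hypothetical. The method returns the intersection of the infinite lines carrying the two segments, or null when they are parallel or colinear; the rule on negative coordinates is deliberately left out.

```java
// Hypothetical helper; a segment is a double[] {xb, yb, xe, ye}.
final class LineIntersection {

    // Returns {xc, yc}, or null if the lines are parallel or colinear.
    static double[] getIntersection(double[] l1, double[] l2) {
        double dx1 = l1[2] - l1[0], dy1 = l1[3] - l1[1];
        double dx2 = l2[2] - l2[0], dy2 = l2[3] - l2[1];
        if (dx1 * dy2 - dy1 * dx2 == 0) {
            return null;                       // parallel or colinear
        }
        boolean v1 = dx1 == 0, h1 = dy1 == 0;  // L1 vertical / horizontal
        boolean v2 = dx2 == 0, h2 = dy2 == 0;  // L2 vertical / horizontal
        if (v1 && h2) return new double[] {l1[0], l2[1]};           // case 4
        if (h1 && v2) return new double[] {l2[0], l1[1]};           // case 4
        if (v1 || h1) return regularWithIrregular(l2, l1, v1);      // case 3
        if (v2 || h2) return regularWithIrregular(l1, l2, v2);      // case 2
        // Case 1: both regular, y = a * x + b on each line.
        double a1 = dy1 / dx1, b1 = l1[1] - a1 * l1[0];
        double a2 = dy2 / dx2, b2 = l2[1] - a2 * l2[0];
        double xc = (b2 - b1) / (a1 - a2);
        return new double[] {xc, a1 * xc + b1};
    }

    // reg is regular (y = a * x + b); irr is vertical (x = c) or horizontal (y = c).
    private static double[] regularWithIrregular(double[] reg, double[] irr, boolean vertical) {
        double a = (reg[3] - reg[1]) / (reg[2] - reg[0]);
        double b = reg[1] - a * reg[0];
        if (vertical) {
            double c = irr[0];
            return new double[] {c, a * c + b};
        }
        double c = irr[1];
        return new double[] {(c - b) / a, c};
    }
}
```

Whether the returned point is an actual junction still has to be decided with the overlaps and isConnected tests, since the point may lie outside one of the segments.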
Experiments & Results (1/3)
Unformatting results on SVG documents created with the 2gT system ("graphic ground Truth").
Experiments & Results (2/3)
• Colour index
– black vectors: no change
– red vectors: filtered vectors
– blue vectors: "broken" vectors
• Visually, all original lines are retrieved
• After reduction, we can see that all intersections have been erased
[Figure: original SVG document; unformatted document; after reduction of 20%]
Experiments & Results (3/3)
• Algorithm complexity: n(n-1)
• Filtering: n + (n-1) + (n-2) + … + 1 comparisons
• Intersections retrieval: n + (n-1) + (n-2) + … + 1 comparisons
• Runtime: about 1.5 min for 100 documents (2500 vectors and 100 intersections per document)
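As a worked check of the comparison count, the sum n + (n-1) + … + 1 is an arithmetic series. Taking n = 2500, the per-document vector count quoted for the runtime, as an illustrative value:

```latex
\[
\sum_{k=1}^{n} k \;=\; n + (n-1) + \dots + 1 \;=\; \frac{n(n+1)}{2},
\qquad
n = 2500 \;\Rightarrow\; \frac{2500 \cdot 2501}{2} = 3\,126\,250 .
\]
```

The count grows quadratically with the number of vectors, in line with the n(n-1) complexity stated for the algorithm.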
Conclusion
Outcome
• Technical achievement
• Unformatting system effective and functional
Perspectives
• Use upstream of pattern recognition tools