interpreter for visually impaired

Upload: arun-kumar

Post on 14-Apr-2018


  • 7/30/2019 interpreter for visually impaired

    1/40

    Interpreter/Smart Glass for Visually Impaired

    Guided by: Mrs J Jeyarani, M.E., A.P. (Se.G)

    Processed by: S Anbarasan, S Arunraj, G Bharanidharan, J Jesuraj

    INTRODUCTION

    The only text format available to the visually impaired is the Braille letter code.

    This restricts blind readers from reading common texts (newspapers, documents, etc.) as sighted readers do.

    Our proposal is to help the visually impaired cross this barrier.

    Abstract

    This project provides an aid for the visually impaired by synthesising speech from the textual contents of a document.

    This is made possible by an OCR (optical character recognition) engine and TTS (text-to-speech) technology.

    Block Diagram

    [Figure: block diagram. Box labels: Input image; Pre-processing; Conversion to gray scale; Image enhancement; Conversion to binary image; OCR; Image Segmentation; Correlation (Yes / No, Discarded); Save in text file; TTS; Generate speech; Wait for user instruction.]

    Conversion to gray scale

    rgb2gray converts RGB values to gray scale values by forming a weighted sum of the R, G, and B components:

    0.2989 * R + 0.5870 * G + 0.1140 * B
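The weighted sum above can be sketched in a few lines. This is a plain-Python illustration rather than the MATLAB rgb2gray function itself; the function names and the list-of-tuples image layout are assumptions made for the example.

```python
# Weights quoted on the slide for the R, G and B components.
R_WEIGHT, G_WEIGHT, B_WEIGHT = 0.2989, 0.5870, 0.1140


def rgb_to_gray(pixel):
    """Return the gray level of one (R, G, B) pixel as a weighted sum."""
    r, g, b = pixel
    return R_WEIGHT * r + G_WEIGHT * g + B_WEIGHT * b


def image_to_gray(image):
    """Convert rows of (R, G, B) tuples to rows of rounded gray levels.

    The row-of-tuples layout is an assumption of this sketch.
    """
    return [[int(rgb_to_gray(p) + 0.5) for p in row] for row in image]


# Usage: a 1 x 3 image holding a pure red, a pure green and a pure blue pixel.
gray_row = image_to_gray([[(255, 0, 0), (0, 255, 0), (0, 0, 255)]])[0]
```

Green carries the largest weight, so a pure green pixel maps to a brighter gray than pure red or pure blue.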

    Histogram Equalization

    This is done to adjust the contrast of the gray scale image.

    The general formula of histogram equalization is given by,

    $$h(v) = \mathrm{round}\left( \frac{cdf(v) - cdf_{min}}{(M \times N) - cdf_{min}} \times (L - 1) \right)$$

    where,

    cdf - cumulative distribution function

    M X N - number of pixels in the image

    L - number of gray levels used

    Gray scale image and its equivalent matrix:

     94  78  76  65  68  69  79  87
     83  65  61  55  59  64  71  85
     75  58  68  77  70  60  65  79
     70  68  88 126 104  68  61  67
     69  70 106 154 122  71  58  63
     73  66 104 144 113  68  59  62
     72  69  85 109  90  55  59  63
     73  64  61  70  66  61  55  52

    Value  Count    Value  Count
    52     1        79     2
    55     3        83     1
    58     2        85     2
    59     3        87     1
    60     1        88     1
    61     4        90     1
    62     1        94     1
    63     2        104    2
    64     2        106    1
    65     3        109    1
    66     2        113    1
    67     1        122    1
    68     5        126    1
    69     3        144    1
    70     4        154    1
    71     2
    72     1
    73     2
    75     1
    76     1
    77     1
    78     1

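    1st value (v = 52),

    cdf(v) = 1, cdfmin = 1, M X N = 8 X 8 = 64, L = 256

    Then, h(v) = round((1 - 1) / (64 - 1) X 255) = 0

    2nd value (v = 55),

    cdf(v) = 1 + 3 = 4, cdfmin = 1, M X N = 8 X 8 = 64, L = 256

    Then, h(v) = round((4 - 1) / (64 - 1) X 255) = round(12.14) = 12

    $$h(v) = \mathrm{round}\left( \frac{cdf(v) - cdf_{min}}{(M \times N) - cdf_{min}} \times (L - 1) \right)$$

    By substituting cdf(v), the corresponding h(v) values are found; the resulting values are given in the equalized matrix that follows.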
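The substitution described above can be sketched in plain Python (the slides use MATLAB). The function name, the list-of-rows layout and the GRAY constant are assumptions of this sketch; GRAY holds the 8 X 8 gray scale matrix of the worked example.

```python
GRAY = [
    [94, 78, 76, 65, 68, 69, 79, 87],
    [83, 65, 61, 55, 59, 64, 71, 85],
    [75, 58, 68, 77, 70, 60, 65, 79],
    [70, 68, 88, 126, 104, 68, 61, 67],
    [69, 70, 106, 154, 122, 71, 58, 63],
    [73, 66, 104, 144, 113, 68, 59, 62],
    [72, 69, 85, 109, 90, 55, 59, 63],
    [73, 64, 61, 70, 66, 61, 55, 52],
]


def equalize(image, levels=256):
    """Histogram-equalize a gray scale image given as a list of rows."""
    pixels = [v for row in image for v in row]
    total = len(pixels)                       # M x N

    # Histogram: number of pixels at each gray level.
    hist = [0] * levels
    for v in pixels:
        hist[v] += 1

    # Cumulative distribution function of the histogram.
    cdf, running = [0] * levels, 0
    for level, count in enumerate(hist):
        running += count
        cdf[level] = running

    cdf_min = min(c for c in cdf if c > 0)    # cdf of the darkest level present
    if total == cdf_min:                      # single gray level: nothing to stretch
        return [row[:] for row in image]

    def h(v):
        scaled = (cdf[v] - cdf_min) / (total - cdf_min) * (levels - 1)
        return int(scaled + 0.5)              # round to the nearest integer

    return [[h(v) for v in row] for row in image]


equalized = equalize(GRAY)
```

The darkest value present (52) maps to 0 and the brightest (154) maps to 255, so the equalized image uses the full gray range.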

    Equalized image and its equivalent matrix:

    219 182 174  85 117 130 190 206
    194  85  53  12  32  73 154 202
    170  20 117 178 146  36  85 190
    146 117 210 247 227 117  53  97
    130 146 231 255 243 154  20  65
    166  93 227 251 239 117  32  57
    158 130 202 235 215  12  32  65
    166  73  53 146  93  53  12   0

    CALCULATION OF THRESHOLD VALUE

    Reshape the 2 dimensional gray scale image to 1 dimensional.

    Find the histogram of the image using the hist function.

    Initialize a matrix with values from 0 to 255.

    For each candidate threshold, find the weight, mean and variance of the foreground and of the background.

    Calculate weight of foreground * variance of foreground + weight of background * variance of background.

    Find the intensity level at which this sum is minimum; that level is the threshold.

    Reshape the 2 dimensional gray scale image to 1 dimensional.

    8 X 8 matrix of an image:

    219 182 174  85 117 130 190 206
    194  85  53  12  32  73 154 202
    170  20 117 178 146  36  85 190
    146 117 210 247 227 117  53  97
    130 146 231 255 243 154  20  65
    166  93 227 251 239 117  32  57
    158 130 202 235 215  12  32  65
    166  73  53 146  93  53  12   0

    64 X 1 matrix of the same image (columns stacked one after another):

    [219; 194; . . . . . ; 206; 202; 190; 97; 65; 57; 65; 0]

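    Value count, 1 x 256 (one count per intensity level from 0 to 255):

    Intensity  0  1  2  3  4  5  6  7  8  9  10  11  12  13  14  . . . .  255
    Count      1  0  0  0  0  0  0  0  0  0  0   0   3   0   0   . . . .  0

    Re-shaping to 256 x 1 (index 1 holds the count of intensity 0, so the count of intensity 12 sits at index 13):

    [1; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 3; 0; . . . ; 0]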
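The reshaping and value-count steps can be sketched as follows. This is plain Python rather than the MATLAB reshape and hist calls used in the project, and both function names are made up for the illustration.

```python
def reshape_to_column(image):
    """Stack the columns of a 2-D image into one flat list (column-major order)."""
    rows, cols = len(image), len(image[0])
    return [image[r][c] for c in range(cols) for r in range(rows)]


def value_counts(pixels, levels=256):
    """Count how many pixels fall on each intensity level 0..levels-1."""
    counts = [0] * levels
    for v in pixels:
        counts[v] += 1
    return counts


# Usage on a tiny 2 x 2 image; Python lists are 0-based, so counts[12]
# is the count of intensity 12 (index 13 in the 1-based MATLAB vector).
flat = reshape_to_column([[0, 12], [12, 12]])
counts = value_counts(flat)
```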

    Creating a matrix for intensity level from 0 to 255 (1 x 256):

    Index      1  2  3  4  5  6  7  8  9  10  11  12  13  14  . . . .  256
    Intensity  0  1  2  3  4  5  6  7  8  9   10  11  12  13  . . . .  255

    Re-shaping the matrix for calculation simplicity (256 x 1):

    [0; 1; 2; 3; 4; 5; 6; 7; 8; 9; 10; 11; 12; 13; . . . ; 255]

    weight=sum(h(1:1))/sum(h);   % share of all pixels that fall in bins 1 to 1

    weight=1/64=0.0156

    Here h is the 256 x 1 value-count vector, so h(1)=1 and sum(h)=64.

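    Mean Calculation

    value=H(m:n).*Index(m:n);      % count x intensity for bins m to n
    total=sum(value);
    mean=total/sum(H(m:n));        % mean intensity of the class

    Ex:

    H(1:1) .* Index(1:1)

    Variance calculation

    value2=(Index(m:n)-mean).^2;   % squared distance of each intensity from the mean
    number=sum(value2.*H(m:n));    % weighted by the pixel counts
    var=number/sum(H(m:n));

    Ex:

    value2= (Index(1:1)-mean).^2;

    Threshold Calculation

    [wbk,varbk]=calculate(1,i);        % background: bins 1 to i
    [wfg,varfg]=calculate(i+1,255);    % foreground: bins above i

    After calculating the weights and the variances, the weighted sum is stored in the array result.

    result(i+1)=(wbk*varbk)+(wfg*varfg);

    The threshold is given by the position of the smallest entry of result, not by the smallest entry itself:

    [~,idx]=min(result);   % idx-1 is the value of i that gives the minimum
    Thr=(idx-1)/256;       % normalised to the range [0,1]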
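The whole threshold search can be sketched in plain Python. Minimising the weighted sum of the two class variances is commonly known as Otsu's method; the function names, the 0-based bins and the choice of returning a gray level (rather than a value in [0,1]) are assumptions of this sketch, not the project's calculate function.

```python
def class_stats(hist, start, stop):
    """Weight and variance of the gray levels start..stop-1 of a histogram."""
    count = sum(hist[start:stop])
    if count == 0:                      # empty class contributes nothing
        return 0.0, 0.0
    weight = count / sum(hist)
    mean = sum(level * hist[level] for level in range(start, stop)) / count
    variance = sum((level - mean) ** 2 * hist[level]
                   for level in range(start, stop)) / count
    return weight, variance


def threshold_level(hist):
    """Gray level t that minimises the weighted within-class variance.

    Background is levels 0..t, foreground is levels t+1 and above.
    """
    best_level, best_score = 0, float("inf")
    for t in range(len(hist) - 1):
        w_bk, var_bk = class_stats(hist, 0, t + 1)
        w_fg, var_fg = class_stats(hist, t + 1, len(hist))
        score = w_bk * var_bk + w_fg * var_fg
        if score < best_score:          # keep the first minimum found
            best_level, best_score = t, score
    return best_level


# Usage: ten dark pixels at level 50 and ten bright pixels at level 200.
demo_hist = [0] * 256
demo_hist[50] = 10
demo_hist[200] = 10
level = threshold_level(demo_hist)
```

Dividing the returned level by 255 would give a normalised threshold in the range [0,1].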

    Conversion to binary image

    BW = im2bw(I, level) converts the gray scale image I to a binary image. The output image BW replaces all pixels in the input image with luminance greater than level with the value 1 (white) and replaces all other pixels with the value 0 (black).

    Specify level in the range [0,1]. This range is relative to the signal levels possible for the image's class. Therefore, a level value of 0.5 is midway between black and white, regardless of class.

    To compute the level argument, you can use the function graythresh. If you do not specify level, im2bw uses the value 0.5.
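The thresholding rule described above can be sketched in plain Python. This is an illustration of the rule, not the MATLAB im2bw function; the function name and the assumption of 8-bit gray levels (maximum 255) are made up for the example.

```python
def to_binary(image, level=0.5, max_value=255):
    """Map pixels brighter than level * max_value to 1 (white), all others to 0 (black).

    level is in the range [0, 1]; max_value=255 assumes an 8-bit image.
    """
    cutoff = level * max_value
    return [[1 if v > cutoff else 0 for v in row] for row in image]


# Usage: with the default level 0.5 the cutoff is 127.5.
binary_row = to_binary([[0, 127, 128, 255]])[0]
```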

    CLIPPING

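    User-defined function for clipping:

    [r, c] = find(g);                       % positions of the non-zero pixels
    h = g(min(r):max(r), min(c):max(c));    % smallest rectangle that contains them

    where,

    r = row indices of the non-zero pixels

    c = column indices of the non-zero pixels

    g = input image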

    Input image matrix:

    00000000
    00111100
    00111100
    00111100
    01111110
    00111100
    00011100
    00000000

    Non-zeros found in the r and c matrices (25 entries, listed column by column):

    r = [5 2 3 4 5 6 2 3 4 5 6 7 2 3 4 5 6 7 2 3 4 5 6 7 5]
    c = [2 3 3 3 3 3 4 4 4 4 4 4 5 5 5 5 5 5 6 6 6 6 6 6 7]

    The first entry (5th row, 2nd column) is the leftmost non-zero pixel and the last entry (5th row, 7th column) is the rightmost one. Rows 2 to 7 and columns 2 to 7 are therefore kept.

    After applying the clipping function:

    011110
    011110
    011110
    111111
    011110
    001110
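The clipping step can be sketched in plain Python as a crop to the bounding box of the non-zero pixels. The function name and the list-of-rows layout are assumptions of this sketch; the MATLAB version relies on find, min and max.

```python
def clip(image):
    """Crop a binary image to the smallest rectangle holding all non-zero pixels."""
    rows = [r for r, row in enumerate(image) if any(row)]
    cols = [c for c in range(len(image[0])) if any(row[c] for row in image)]
    if not rows:                        # blank image: nothing to keep
        return []
    return [row[cols[0]:cols[-1] + 1] for row in image[rows[0]:rows[-1] + 1]]


def parse(lines):
    """Turn strings of 0/1 characters into rows of integers."""
    return [[int(ch) for ch in line] for line in lines]


# Usage on the 8 x 8 example matrix.
g = parse(["00000000", "00111100", "00111100", "00111100",
           "01111110", "00111100", "00011100", "00000000"])
clipped = clip(g)
```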

    LINE CROPPING

    The sum of each row is calculated; a row whose sum is 0 lies between two text lines.

    111011 = 5  }  1st LINE
    011011 = 4  }
    000000 = 0
    110110 = 4  }
    111010 = 4  }  2nd LINE
    011011 = 4  }
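Line cropping by row sums can be sketched in plain Python as follows. The function name and the example matrix layout are assumptions of this sketch, which splits the page wherever a row contains no set pixels.

```python
def split_lines(image):
    """Split a binary page image into text lines at rows whose sum is 0."""
    lines, current = [], []
    for row in image:
        if sum(row) > 0:                # row contains text pixels
            current.append(row)
        elif current:                   # blank row closes the current line
            lines.append(current)
            current = []
    if current:                         # last line may reach the bottom edge
        lines.append(current)
    return lines


# Usage on a 6 x 6 example with one blank row between two lines.
page = [[int(ch) for ch in row] for row in
        ["111011", "011011", "000000", "110110", "111010", "011011"]]
text_lines = split_lines(page)
```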

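    LETTER CROPPING

    The sum of each column is calculated; a column whose sum is 0 lies between two letters.

    1 0 1 0 0 1 0 0 1 0 1 0
    1 0 1 0 1 0 1 0 1 0 1 0
    0 1 0 0 1 0 1 0 1 0 1 0
    0 1 0 0 0 1 0 0 1 1 1 0

    2 2 2 0 2 2 2 0 4 1 4 0   (column sums)

      Y       O       U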
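Letter cropping by column sums can be sketched in plain Python in the same way. The function name is an assumption of this sketch, and the example line is the 4 x 12 matrix spelling Y, O and U.

```python
def split_letters(line):
    """Split a binary text-line image into letters at columns whose sum is 0."""
    width = len(line[0])
    column_sums = [sum(row[c] for row in line) for c in range(width)]
    letters, start = [], None
    # The extra trailing 0 closes a letter that touches the right edge.
    for c, total in enumerate(column_sums + [0]):
        if total > 0 and start is None:
            start = c                   # first column of a new letter
        elif total == 0 and start is not None:
            letters.append([row[start:c] for row in line])
            start = None
    return letters


# Usage on the example line.
line = [[int(ch) for ch in row] for row in
        ["101001001010", "101010101010", "010010101010", "010001001110"]]
letters = split_letters(line)
```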

    Image Re-sizing

    Re-size the cropped letter to 42 X 24 (the size of the database images).

    [Figure: the cropped letter and the resulting 42 X 24 image]
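A re-sizing step can be sketched in plain Python as below. The slides do not say which interpolation is used, so nearest-neighbour sampling is an assumption of this sketch, as are the function name and the reading of 42 X 24 as 42 rows by 24 columns.

```python
def resize_nearest(image, new_rows=42, new_cols=24):
    """Re-size an image by nearest-neighbour sampling (assumed method).

    Each output pixel copies the input pixel at the proportional position.
    """
    rows, cols = len(image), len(image[0])
    return [[image[r * rows // new_rows][c * cols // new_cols]
             for c in range(new_cols)]
            for r in range(new_rows)]


# Usage: a 2 x 2 checker pattern blown up to the template size.
resized = resize_nearest([[1, 0], [0, 1]])
```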

    Correlation

    In this step the re-sized image is correlated with the images in the database.

    The database image with the highest correlation value is assumed to be the letter, and that letter is saved in a text file.

    Cropped letter:

    101
    101
    010
    010

    Each template in the database is correlated with the cropped letter:

    sem=corr2(templates{1,n},imagn);   % correlation of the n-th template with the letter
    comp=[comp sem];                   % append the score to comp

    Index     1      2    3   25      26   27     35   37   52   61     62
    Template  [A]    [B]  .   [Y]     .    [a]    .    .    .    [9]    [0]
    Comp =    0.146  .    .   0.9877  .    -0.44  .    .    .    0.567  0.111

    vd=find(comp==max(comp));          % index of the highest correlation (here 25, the letter Y)
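The matching step can be sketched in plain Python with a 2-D correlation coefficient of the kind computed by corr2. The function names, the dictionary of templates and the tiny 4 x 3 letters are assumptions of this sketch; returning 0.0 for a flat image is a choice made here, not MATLAB behaviour.

```python
import math


def corr2(a, b):
    """2-D correlation coefficient of two equally sized images."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mean_x) ** 2 for x in xs) *
                    sum((y - mean_y) ** 2 for y in ys))
    return num / den if den else 0.0    # flat image: treated as no correlation


def best_match(letter, templates):
    """Return the name of the template with the highest correlation, plus all scores."""
    scores = {name: corr2(template, letter) for name, template in templates.items()}
    return max(scores, key=scores.get), scores


# Usage with two hypothetical 4 x 3 templates.
Y = [[1, 0, 1], [1, 0, 1], [0, 1, 0], [0, 1, 0]]
O = [[0, 1, 0], [1, 0, 1], [1, 0, 1], [0, 1, 0]]
name, scores = best_match(Y, {"Y": Y, "O": O})
```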

    Calculating space between letters

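    After a letter (here Y) is cropped, the remaining line (the line without Y) is passed through the clipping function.

    The difference in width between these two images, the line without Y before and after clipping, gives the space value.

    Here value=4.

    These space values are stored in the space vector array.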

    Word Matrix:   [Y] [O] [U] [A] [R] [E] . . . . .

    Space Matrix:  4 4 3 14 3 2 . . . . .

    Word Matrix:   [Y] [O] [U] [A] [R] [E] . . . .
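One way the space values could be used to separate words is sketched below in plain Python. The slides do not state the rule, so two things are assumptions of this sketch: each space value is taken to be the gap in front of the letter at the same position, and a gap of at least word_gap (a hypothetical parameter, set to 10 here) is taken to start a new word.

```python
def join_letters(letters, spaces, word_gap=10):
    """Build a text string from recognised letters and their space values.

    spaces[i] is assumed to be the gap in front of letters[i];
    word_gap is a hypothetical cutoff, not a value from the slides.
    """
    text = ""
    for i, (letter, gap) in enumerate(zip(letters, spaces)):
        if i > 0 and gap >= word_gap:   # large gap: start a new word
            text += " "
        text += letter
    return text


# Usage with the word and space matrices of the example.
sentence = join_letters(["Y", "O", "U", "A", "R", "E"], [4, 4, 3, 14, 3, 2])
```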

    Saved in Text File

    TTS

    A number of TTS engines are available.

    Some parameters have to be checked before choosing an algorithm:

    Accuracy

    Processing speed, etc.

    Speech Application Programming Interface

    The Microsoft text-to-speech voices are speech synthesizers provided for use with applications that use the Microsoft Speech API (SAPI) or the Microsoft Speech Server Platform.

    In general the Speech API is a freely-redistributable component which can be shipped with any Windows application that wishes to use speech technology.

    There are both SAPI 4 and SAPI 5 versions of these text-to-speech voices.

    OUTPUT

    Conclusion

    For precise recognition of the text, the template database must be trained well.

    The machine-generated speech must maintain phonetic rhythm to come close to a natural voice. For this, linguistic analysis must be done.

    Escape sequences also have to be developed so that abbreviations such as Mr., St., etc. are pronounced properly.