

Long range horizontal connectivity: Cost-effective circuit for natural image perception

Seungdae Baek1*, Youngjin Park1* & Se-Bum Paik1,2

1 Department of Bio and Brain Engineering, 2 Program of Brain and Cognitive Engineering, Korea Advanced Institute of Science and Technology

* Contributed Equally

* This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science and ICT (NRF-2016R1C1B2016039, NRF-2017R1E1A2A02080940)(to S.P.).

Results Introduction

Task: image classification

- Dataset: CIFAR-10 (natural images)

Model network of the visual pathway

1) Feedforward: convergent connections

2) Lateral: intra-layer connections

- Basic structure: 3-layer MLP

[Diagram: RGC to V1 network. Labels: feedforward convergence, receptive field (RF), LRC]

LRC range > 2*RF (RF non-overlapping range)

- Training: minimizing the error of the network’s prediction by updating weights
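The LRC criterion above (lateral range > 2*RF) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 4 x 4 grid of V1 units, and the RF size of 1 pixel are hypothetical choices made only for illustration, and connection length is assumed to be the Euclidean distance between unit positions.

```python
import math
from itertools import combinations


def connection_length(a, b):
    """Euclidean distance in pixels between two unit positions (row, col)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def split_lateral_connections(positions, rf_size):
    """Split all unit pairs into local and long-range lateral connections.

    A pair counts as long-range (LRC) when its length exceeds 2 * rf_size,
    i.e. the "RF non-overlapping range" stated on the poster.
    """
    threshold = 2 * rf_size
    local, long_range = [], []
    for a, b in combinations(positions, 2):
        if connection_length(a, b) > threshold:
            long_range.append((a, b))
        else:
            local.append((a, b))
    return local, long_range


# Hypothetical example: a 4 x 4 sheet of V1 units with an RF size of 1 pixel.
grid = [(r, c) for r in range(4) for c in range(4)]
local, lrc = split_lateral_connections(grid, rf_size=1)
print(len(local), "local pairs,", len(lrc), "long-range pairs")
```

With this split, a model could keep, add, or remove the two groups independently, which is the kind of comparison (LRC added vs. local added) the poster reports.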

LRCs dramatically improve natural image classification performance

LRCs contribute particularly to encoding the global structure of the input image

LRCs can emerge spontaneously in the model network when total connection length is limited

LRCs are applicable to conventional deep neural networks for natural image perception

Summary

Methods

[Figure: Development of lateral connection. Axes: epoch, length sum, accuracy, length (pixel), # of connections, LRC/lateral ratio, # of deleted connections. Labels: length penalty, w/o length penalty; 0 Epoch, 3000; Expectation (Bosking 1997); Cifar10, permuted Cifar10; lateral connections, LRC threshold, optimized lateral connections; random image, natural image; disconnect LRC, disconnect random conn.]

[Figure: CNN vs. CNN + LRC (same # of conn.)]

[Figure: Local, Global, and Local+Global datasets]

(Local feature: shape of number; global feature: spatial pattern of a pair)

Long-range horizontal connections may allow cost-effective perception of natural images by integrating global information efficiently

Hypothesis

[Figure: Cifar10. Axes: # of parameters, accuracy. Labels: 4 layer, 4 layer + LRC, 5 layer]

The combination of LRCs and local connections dramatically enhances visual perception

LRCs contribute to image perception by integrating low-frequency global information

LRCs can emerge spontaneously when total connection length is limited

LRCs can be applied to conventional DNNs for cost-efficient perception of natural images

The combination of long-range horizontal connections and local convergent connections enables efficient coding of natural images

To examine the effect of LRCs on local and global features separately, we designed three datasets (local, global, and local + global):

FF connection: local features

LRC connection: global features

Most initial connections were pruned through learning, but a few long connections survived until the end of learning, even with the length penalty

An LRC distribution did not form when the training data were not natural images

Adding LRCs uses resources more efficiently than adding a layer

With the length penalty, total length decreased as accuracy increased

Long-range connection (LRC)

[Figure: lateral connections in V1 (Bosking 1997); scale bar 1 mm. Axes: cortical length (mm), length probability. Labels: LRC, Local]

[Figure: average power vs. spatial frequency (cyc/image). Labels: local structure, global structure, local+global structure]

Spatial structure of natural image: local + global

[Figure: accuracy (%) vs. added connections (number). Labels: LRC added, local added, max performance; Plane, Cat]

[Figure: normalized accuracy vs. ratio of LRC. Labels: dense network + no LRC, sparse network + few LRC, sparse network + many LRC]

[Figure: accuracy by dataset. Labels: local dataset, global dataset, local + global]

Developed LRCs are indeed more important than other connections

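Optimization

$$\underset{W}{\arg\min}\ \frac{1}{N}\sum_{i=1}^{N} L_i\big(f(x_i, W),\, y_i\big) + \frac{\lambda}{2}\sum_{k}\sum_{l}\big(\mathrm{Len}_{k,l} \otimes W_{k,l}\big)^{2}$$

(First term: error minimization; second term: length penalty)

[Figure: initial network and optimized connectivity. Labels: ‘?’, ‘Plane’]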
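The optimization objective (error minimization plus a length penalty) can be sketched as follows. This is only an illustrative reading of the formula, not the authors' implementation: the ⊗ operator is assumed to be an element-wise product of connection length and weight, and the function names and the numbers in the example are hypothetical.

```python
def length_penalty(weights, lengths, lam):
    """Length penalty: (lam / 2) * sum over k, l of (Len[k][l] * W[k][l]) ** 2.

    `weights` and `lengths` are same-shaped nested lists indexed [k][l];
    the element-wise product is an assumed reading of the poster's notation.
    """
    return 0.5 * lam * sum(
        (length * w) ** 2
        for w_row, len_row in zip(weights, lengths)
        for w, length in zip(w_row, len_row)
    )


def objective(per_sample_losses, weights, lengths, lam):
    """Mean prediction error over N samples plus the length penalty."""
    mean_error = sum(per_sample_losses) / len(per_sample_losses)
    return mean_error + length_penalty(weights, lengths, lam)


# Hypothetical example: the same weights cost more on longer connections.
weights = [[0.0, 0.5], [0.5, 0.0]]
short_lengths = [[0.0, 1.0], [1.0, 0.0]]
long_lengths = [[0.0, 4.0], [4.0, 0.0]]

short_cost = length_penalty(weights, short_lengths, lam=0.1)
long_cost = length_penalty(weights, long_lengths, lam=0.1)
total = objective([0.2, 0.4], weights, short_lengths, lam=0.1)
print(short_cost, long_cost, total)
```

Because the penalty grows with the square of length times weight, a long connection survives training only if the error reduction it buys outweighs its cost, which is consistent with the poster's observation that most connections are pruned while a few long ones remain.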

Adding only a small number of LRCs to the FF network dramatically increased its performance

The combination of a few LRCs with local convergent connections maximizes the cost-effectiveness of connections

Starting from random initial connectivity, the network changed in the direction of minimizing total connection length while still recognizing images correctly

The visual cortex and deep neural networks (DNNs) both excel at natural image perception

The visual cortex consists of fewer layers than DNNs, presumably due to the limited volume of the brain

[Figure: natural image classification (Cat, Dog, Mug). Classification accuracy (%) on ImageNet for DNN 1, DNN 2, and DNN 3, with human level marked. DNN1: AlexNet, Krizhevsky et al. 2012; DNN2: VGG, Simonyan et al. 2014; DNN3: ResNet, He et al. 2015]

[Figure: brain (visual cortex) vs. DNN, each labeling an image “Cat”. Brain: RGC, LGN, V1, V2, V4, IT; limited volume (6 layers) [1]. DNN: no volume limit (152 layers) [2]. [1]: DiCarlo & Cox 2007; [2]: ResNet, He et al. 2015]

Q. What structure allows the brain to recognize natural images efficiently?

Encoding spatial structure of natural images

[Figure labels: LRC, Local]

Adding a few LRCs

LRCs were implemented as skip connections covering the RF non-overlapping range via a dilated convolution layer
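One way to picture this is a local convolution branch summed with a parallel dilated branch whose taps are spaced beyond the RF non-overlapping range. The sketch below is a 1-D toy version under stated assumptions, not the authors' CNN: the function names, the kernels, the choice of dilation as `2 * rf_size + 1`, and the use of a simple sum to merge the two branches are all hypothetical.

```python
def conv1d_same(signal, kernel, dilation=1):
    """Zero-padded 1-D cross-correlation with kernel taps `dilation` apart.

    Assumes an odd-length kernel so the output is centered on the input.
    """
    half = (len(kernel) // 2) * dilation
    n = len(signal)
    out = []
    for i in range(n):
        total = 0.0
        for j, k in enumerate(kernel):
            idx = i - half + j * dilation
            if 0 <= idx < n:  # positions outside the signal count as zero
                total += k * signal[idx]
        out.append(total)
    return out


def local_plus_lrc(signal, local_kernel, lrc_kernel, rf_size):
    """Sum a local branch and a dilated (long-range) branch.

    The dilation is the smallest integer tap spacing above 2 * rf_size,
    an assumed reading of "LRC range > 2*RF".
    """
    dilation = 2 * rf_size + 1
    local = conv1d_same(signal, local_kernel)
    long_range = conv1d_same(signal, lrc_kernel, dilation)
    return [a + b for a, b in zip(local, long_range)]


# Hypothetical example: a single active pixel in a 17-pixel row.
signal = [0.0] * 17
signal[8] = 1.0
out = local_plus_lrc(signal, [1.0, 1.0, 1.0], [1.0, 0.0, 1.0], rf_size=1)
active = [i for i, v in enumerate(out) if v != 0]
print(active)
```

In this toy case the local branch spreads the input to its immediate neighbors, while the dilated branch reaches positions three pixels away without adding another layer.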

Implementation of LRCs on CNN (Cifar10)

Cost effectiveness: LRC vs. layer

[Figure: performance per cost. Axes: Δ performance / length (pixel⁻¹), length of added connections (pixel). Labels: LRC added, LRC 5 %]

[Figure: accuracy (%). Labels: LRC added (10%), local added (10%), no lateral. Two-sample t-test: * p < 0.001]

Performance in various network architectures