CSCI 497P/597P: Computer Vision
Lecture 3: Convolution and its Properties
Announcements
• HW1 is out today
• Covers filtering and convolution
• Due in 1 week (10pm next Tuesday)
Goals
• Understand the distinction between cross-correlation and convolution.
• Know the properties of cross-correlation and convolution:
  • Linearity and shift-invariance (both)
  • Associativity and commutativity (convolution only)
• Understand the design of several common image filters:
• Box blur and Gaussian blur
• Sharpening
• Understand the limitations of linear filtering
Computing Cross-Correlation
g = f ⊗ w
• f: input image
• w: weights, or filter, or kernel
• g: output image
for x = 0 to width - 1:
  for y = 0 to height - 1:
    for i in -k to k:
      for j in -k to k:
        out[x,y] += w[i,j] * in[x+i, y+j]
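A minimal runnable sketch of the loop above, assuming NumPy, a square (2k+1)x(2k+1) kernel, and zero-padding at the borders. The function name `cross_correlate` is illustrative, and arrays are indexed `[row, column]`, i.e. `[y, x]`, rather than `[x, y]` as in the pseudocode.

```python
import numpy as np

def cross_correlate(f, w):
    """Cross-correlate image f with a (2k+1)x(2k+1) kernel w.

    The output has the same size as f; out-of-bounds pixels count as zero
    (an assumption made here for the borders).
    """
    f = np.asarray(f, dtype=float)
    w = np.asarray(w, dtype=float)
    k = w.shape[0] // 2
    height, width = f.shape
    padded = np.pad(f, k)  # zero border of width k on every side
    out = np.zeros_like(f)
    for y in range(height):
        for x in range(width):
            for j in range(-k, k + 1):
                for i in range(-k, k + 1):
                    # the math indexes w from -k..k; the array from 0..2k
                    out[y, x] += w[j + k, i + k] * padded[y + j + k, x + i + k]
    return out

image = np.arange(9.0).reshape(3, 3)
summed = cross_correlate(image, np.ones((3, 3)))  # sum of each 3x3 neighborhood
```

The four nested loops mirror the slide directly; they are slow in Python, so real code would normally vectorize or call a library routine.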
A bit of practice
In groups: work on Problems #1-4
• Problems are linked from the course webpage on the Schedule table
• Write answers in your Google Doc (pinned in group Discord channels)
Questions remain
• What happens at the edges?
• What properties does this operator have?
• What can and can't this operator do?
A shift filter
Cross-correlate the image f with the kernel w. Use "same" output size, with zero-padding for out-of-bounds values.
g = f ⊗ w
f =
1 2 1
1 3 1
0 1 3
w =
0 0 0
0 0 1
0 0 0
• Cross-correlation:
• Convolution:
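One way to check an answer to this exercise numerically. This is a sketch assuming NumPy, with x as the column index and y as the row index; `filter_same` is a hypothetical helper that evaluates either sum directly, with `sign=+1` for cross-correlation and `sign=-1` for convolution.

```python
import numpy as np

f = np.array([[1, 2, 1],
              [1, 3, 1],
              [0, 1, 3]], dtype=float)
w = np.array([[0, 0, 0],
              [0, 0, 1],
              [0, 0, 0]], dtype=float)

def filter_same(f, w, sign):
    """sign=+1 sums w(i,j) f(x+i, y+j); sign=-1 sums w(i,j) f(x-i, y-j)."""
    k = w.shape[0] // 2
    rows, cols = f.shape
    padded = np.pad(f, k)  # zero-padding for out-of-bounds values
    out = np.zeros_like(f)
    for j in range(-k, k + 1):
        for i in range(-k, k + 1):
            r0, c0 = k + sign * j, k + sign * i
            # add the whole image, displaced by (sign*j, sign*i), weighted by w
            out += w[j + k, i + k] * padded[r0:r0 + rows, c0:c0 + cols]
    return out

corr = filter_same(f, w, sign=+1)  # image content moves one pixel left
conv = filter_same(f, w, sign=-1)  # image content moves one pixel right
```

The same kernel shifts the image in opposite directions under the two operators, and the column that slides in from outside the image is filled with zeros.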
Cross-correlation vs Convolution
Cross-correlation, g = f ⊗ w:
$$g(x, y) = \sum_{i=-k}^{k} \sum_{j=-k}^{k} w(i, j)\, f(x + i, y + j)$$
Convolution, g = f * w:
$$g(x, y) = \sum_{i=-k}^{k} \sum_{j=-k}^{k} w(i, j)\, f(x - i, y - j)$$
These are related:
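One standard way to state the relation: convolving with w is the same as cross-correlating with w flipped both horizontally and vertically (rotated by 180 degrees). The sketch below assumes NumPy and zero-padded "same" output; the helper names are illustrative.

```python
import numpy as np

def cross_correlate(f, w):
    """Zero-padded, same-size cross-correlation (vectorized over pixels)."""
    k = w.shape[0] // 2
    rows, cols = f.shape
    padded = np.pad(f, k)
    out = np.zeros_like(f, dtype=float)
    for j in range(2 * k + 1):
        for i in range(2 * k + 1):
            out += w[j, i] * padded[j:j + rows, i:i + cols]
    return out

def convolve(f, w):
    # convolution = cross-correlation with the kernel rotated by 180 degrees
    return cross_correlate(f, w[::-1, ::-1])

rng = np.random.default_rng(0)
f = rng.random((6, 7))
w = rng.random((3, 3))

# Evaluate the convolution sum directly at one interior pixel for comparison
y, x = 3, 4
direct = sum(w[j + 1, i + 1] * f[y - j, x - i]
             for j in (-1, 0, 1) for i in (-1, 0, 1))

# For a kernel that is symmetric under the flip, the two operators agree
box = np.ones((3, 3)) / 9.0
same_for_symmetric = np.allclose(convolve(f, box), cross_correlate(f, box))
```

This is why the distinction disappears for symmetric kernels such as the box and Gaussian filters, and only matters for asymmetric ones such as the shift filter.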
Properties
Assume: f is an image; w and v are filters; s, t are scalars.
• Shift invariance (both): shifting the input shifts the output by the same amount. If f'(x, y) = f(x − s, y − t), then (f' ⊗ w)(x, y) = (f ⊗ w)(x − s, y − t)
• Linearity (both): (f ⊗ w) + (f ⊗ v) = f ⊗ (w + v) and f ⊗ (sw) = s(f ⊗ w)
• Commutativity (conv only): f * w = w * f
• Associativity (conv only): (f * w) * v = f * (w * v)
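A quick numerical check of linearity, commutativity, and associativity on random data. This sketch assumes NumPy and uses "full" output size (the output grows by the kernel size minus one) so that border handling does not interfere; `convolve_full` and `correlate_full` are illustrative helpers.

```python
import numpy as np

def convolve_full(f, w):
    """Full 2D convolution: each kernel entry adds a shifted, scaled copy of f."""
    H, W = f.shape
    kh, kw = w.shape
    out = np.zeros((H + kh - 1, W + kw - 1))
    for j in range(kh):
        for i in range(kw):
            out[j:j + H, i:i + W] += w[j, i] * f
    return out

def correlate_full(f, w):
    # cross-correlation expressed as convolution with the flipped kernel
    return convolve_full(f, w[::-1, ::-1])

rng = np.random.default_rng(0)
f = rng.random((5, 5))
w = rng.random((3, 3))
v = rng.random((3, 3))
s = 2.5

linear_add = np.allclose(correlate_full(f, w) + correlate_full(f, v),
                         correlate_full(f, w + v))
linear_scale = np.allclose(correlate_full(f, s * w), s * correlate_full(f, w))
commutative = np.allclose(convolve_full(f, w), convolve_full(w, f))
associative = np.allclose(convolve_full(convolve_full(f, w), v),
                          convolve_full(f, convolve_full(w, v)))
# Cross-correlation fails commutativity for a generic (asymmetric) kernel
corr_commutative = np.allclose(correlate_full(f, w), correlate_full(w, f))
```

Swapping the arguments of cross-correlation yields the 180-degree rotation of the original result, which is why it is not commutative in general.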
What can we do with this?
0 0 0
0 1 0
0 0 0
Identity filter: output = input
0 0 0
0 0 1
0 0 0
Left shift (under cross-correlation: g(x, y) = f(x + 1, y); convolving with the same kernel shifts right instead)
1 1 1
1 1 1
1 1 1
Mean filter, or box blur (scale the kernel by 1/9 so the weights sum to 1)
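A small sketch of the identity and mean filters in action, assuming NumPy and zero-padded "same" output. The helper `cross_correlate` and the 5x5 test image are illustrative, not from the slides.

```python
import numpy as np

def cross_correlate(f, w):
    """Zero-padded, same-size cross-correlation."""
    k = w.shape[0] // 2
    rows, cols = f.shape
    padded = np.pad(f, k)
    out = np.zeros(f.shape)
    for j in range(2 * k + 1):
        for i in range(2 * k + 1):
            out += w[j, i] * padded[j:j + rows, i:i + cols]
    return out

identity = np.array([[0, 0, 0],
                     [0, 1, 0],
                     [0, 0, 0]], dtype=float)
box = np.ones((3, 3)) / 9.0  # mean filter: weights sum to 1

# A single bright pixel in the middle of a dark image
f = np.zeros((5, 5))
f[2, 2] = 9.0

unchanged = cross_correlate(f, identity)  # same as f
blurred = cross_correlate(f, box)         # the 9 is spread over a 3x3 block of 1s
```

Because the box weights sum to 1, the total brightness of the image is preserved; only its spatial distribution changes.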
Blurring: Another example
[Figure: image * kernel = blurred image]
• Notice: lattice-like texture
• What motivated the mean filter?
• Idea: the closer the pixel, the more likely it is to be similar
Gaussian Blur
• Idea: weight closer pixels more heavily using a Gaussian kernel
[Figure: a bivariate (2D) Gaussian function and its 3x3 approximation]
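A minimal sketch of building a Gaussian kernel, assuming NumPy and the usual zero-mean isotropic form G_σ(x, y) = (1 / (2πσ²)) exp(−(x² + y²) / (2σ²)), sampled on an integer grid and renormalized. The function name, the default 3σ truncation radius, and the binomial 3x3 kernel are assumptions for illustration; the binomial kernel is a common approximation and not necessarily the one on the slide.

```python
import numpy as np

def gaussian_kernel(sigma, radius=None):
    """Sampled, normalized 2D Gaussian on a (2r+1)x(2r+1) grid."""
    if radius is None:
        radius = int(np.ceil(3 * sigma))  # assumption: truncate near 3 sigma
    ax = np.arange(-radius, radius + 1)
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    # Normalizing by the sum replaces the 1/(2*pi*sigma^2) constant and
    # keeps overall image brightness unchanged after truncation.
    return kernel / kernel.sum()

g3 = gaussian_kernel(sigma=1.0, radius=1)   # sampled 3x3 Gaussian
g_wide = gaussian_kernel(sigma=2.0)         # 13x13 with the default radius

# A common integer-weight 3x3 approximation (outer product of [1, 2, 1] / 4)
binomial_3x3 = np.array([[1, 2, 1],
                         [2, 4, 2],
                         [1, 2, 1]]) / 16.0
```

Unlike the box filter, the weights fall off with distance from the center, so nearby pixels contribute more to each output value.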
Gaussian Filters
[Figure: Gaussian blur with σ = 1 pixel, σ = 5 pixels, σ = 10 pixels, σ = 30 pixels]
Mean vs. Gaussian
Composing Filters
• Recall associativity: (f * w) * v = f * (w * v)
• For Gaussians: $G_\sigma * G_\sigma = G_{\sqrt{2}\sigma}$
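A numerical sanity check of the Gaussian composition rule, assuming NumPy and sampled, truncated kernels, so agreement is approximate rather than exact. The helpers `gaussian_kernel` and `convolve_full`, the value σ = 2, and the radii are illustrative choices.

```python
import numpy as np

def gaussian_kernel(sigma, radius):
    """Sampled, normalized 2D Gaussian on a (2r+1)x(2r+1) grid."""
    ax = np.arange(-radius, radius + 1)
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return kernel / kernel.sum()

def convolve_full(f, w):
    """Full 2D convolution: each kernel entry adds a shifted, scaled copy of f."""
    H, W = f.shape
    kh, kw = w.shape
    out = np.zeros((H + kh - 1, W + kw - 1))
    for j in range(kh):
        for i in range(kw):
            out[j:j + H, i:i + W] += w[j, i] * f
    return out

sigma = 2.0
g = gaussian_kernel(sigma, radius=10)                    # 21 x 21
g_twice = convolve_full(g, g)                            # 41 x 41
g_wide = gaussian_kernel(np.sqrt(2) * sigma, radius=20)  # 41 x 41

max_abs_diff = np.abs(g_twice - g_wide).max()  # small, but not exactly zero
```

Combined with associativity, this means blurring an image twice with G_σ can be replaced by a single blur with the wider kernel.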
Sharpening!?
• What gets removed when we blur?
[Figure: original − blurred = detail]
[Figure: original + α · detail = sharpened]
Sharpening
[Figure: before and after sharpening]
Sharpening: once more with mathing
[Figure: image + α (image − blurred image) = sharpened image; as a single kernel: scaled impulse − blur kernel, which resembles a Laplacian of Gaussian]
Possible 3x3 approximation:
 0 -1  0
-1  5 -1
 0 -1  0
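A sketch of why sharpening collapses into a single kernel, assuming NumPy and zero-padded "same" cross-correlation. By linearity, f + α(f − f ⊗ b) equals f ⊗ ((1 + α)δ − αb), where δ is the unit impulse and b is a blur kernel. The box blur, α = 1.5, and the helper name are illustrative assumptions.

```python
import numpy as np

def cross_correlate(f, w):
    """Zero-padded, same-size cross-correlation."""
    k = w.shape[0] // 2
    rows, cols = f.shape
    padded = np.pad(f, k)
    out = np.zeros(f.shape)
    for j in range(2 * k + 1):
        for i in range(2 * k + 1):
            out += w[j, i] * padded[j:j + rows, i:i + cols]
    return out

rng = np.random.default_rng(0)
f = rng.random((8, 8))
alpha = 1.5

blur = np.ones((3, 3)) / 9.0  # any blur kernel works; box chosen for simplicity
impulse = np.zeros((3, 3))
impulse[1, 1] = 1.0

# Two steps: extract the detail removed by blurring, then add it back scaled
two_step = f + alpha * (f - cross_correlate(f, blur))

# One step: scaled impulse minus scaled blur kernel
sharpen_kernel = (1 + alpha) * impulse - alpha * blur
one_step = cross_correlate(f, sharpen_kernel)

# The 3x3 approximation from the slide; its weights also sum to 1
slide_kernel = np.array([[0, -1, 0],
                         [-1, 5, -1],
                         [0, -1, 0]], dtype=float)
sharpened = cross_correlate(f, slide_kernel)
```

Both kernels sum to 1, so flat regions away from the border keep their brightness while differences between a pixel and its neighbors are amplified.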
Effects of Sharpening
Limitations: What can't convolution do?
• Maximum filter?
• Threshold?
• Partial derivative with respect to y?
Problem #5 - discuss in groups.
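One way to probe these questions numerically: a convolution is linear, so any operation that fails op(a + b) = op(a) + op(b) cannot be written as one. The sketch below assumes NumPy; the helper names, the test images, the 0.5 threshold, and the central-difference kernel are illustrative choices. Passing the additivity check on one pair of inputs is necessary but not sufficient for linearity.

```python
import numpy as np

def cross_correlate(f, w):
    """Zero-padded, same-size cross-correlation."""
    k = w.shape[0] // 2
    rows, cols = f.shape
    padded = np.pad(f, k)
    out = np.zeros(f.shape)
    for j in range(2 * k + 1):
        for i in range(2 * k + 1):
            out += w[j, i] * padded[j:j + rows, i:i + cols]
    return out

def max_filter(f):
    """3x3 maximum filter with zero padding."""
    rows, cols = f.shape
    padded = np.pad(f, 1)
    windows = [padded[j:j + rows, i:i + cols] for j in range(3) for i in range(3)]
    return np.max(windows, axis=0)

def threshold(f, t=0.5):
    return (f > t).astype(float)

# Central difference in y: g(x, y) = (f(x, y + 1) - f(x, y - 1)) / 2
dy_kernel = np.array([[0, -1, 0],
                      [0, 0, 0],
                      [0, 1, 0]], dtype=float) / 2.0

def d_dy(f):
    return cross_correlate(f, dy_kernel)

def is_additive(op, a, b):
    return np.allclose(op(a + b), op(a) + op(b))

a = np.array([[0.3, 0, 0], [0, 0, 0], [0, 0, 1.0]])
b = np.array([[0.3, 0, 0], [0, 0, 0], [1.0, 0, 0]])

max_ok = is_additive(max_filter, a, b)        # False: not linear
threshold_ok = is_additive(threshold, a, b)   # False: not linear
derivative_ok = is_additive(d_dy, a, b)       # True: a linear filter

# On a ramp f(x, y) = 3y the central difference recovers the slope
ramp = 3.0 * np.arange(5.0)[:, None] * np.ones((1, 5))
slope = d_dy(ramp)[1:-1, :]  # interior rows only; borders see the zero padding
```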