UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA SCHOOL OF SOFTWARE ENGINEERING OF USTC
Matplotlib: A tutorial
Devert Alexandre, School of Software Engineering of USTC
30 November 2012
Table of Contents
1 First steps
2 Curve plots
3 Scatter plots
4 Boxplots
5 Histograms
6 Usage example
Curve plot
Let’s plot a curve
import math
import matplotlib.pyplot as plt

# Generate a sinusoid
nbSamples = 256
xRange = (-math.pi, math.pi)

x, y = [], []
for n in range(nbSamples):
    k = (n + 0.5) / nbSamples
    x.append(xRange[0] + (xRange[1] - xRange[0]) * k)
    y.append(math.sin(x[-1]))

# Plot the sinusoid
plt.plot(x, y)
plt.show()
This will show you something like this:
[Figure: a sine curve, x axis from -4 to 4, y axis from -1.0 to 1.0]
numpy
matplotlib can work with numpy arrays
import math
import numpy
import matplotlib.pyplot as plt

# Generate a sinusoid
nbSamples = 256
xRange = (-math.pi, math.pi)

x, y = numpy.zeros(nbSamples), numpy.zeros(nbSamples)
for n in range(nbSamples):
    k = (n + 0.5) / nbSamples
    x[n] = xRange[0] + (xRange[1] - xRange[0]) * k
    y[n] = math.sin(x[n])

# Plot the sinusoid
plt.plot(x, y)
plt.show()

numpy provides a lot of functions and is efficient
numpy
• zeros builds arrays filled with 0
• linspace builds arrays filled with an arithmetic sequence

import math
import numpy
import matplotlib.pyplot as plt

# Generate a sinusoid
nbSamples = 256
x = numpy.linspace(-math.pi, math.pi, num=nbSamples)
y = numpy.zeros(nbSamples)
for n in range(nbSamples):
    y[n] = math.sin(x[n])

# Plot the sinusoid
plt.plot(x, y)
plt.show()
numpy
numpy functions can work on entire arrays
import math
import numpy
import matplotlib.pyplot as plt

# Generate a sinusoid
x = numpy.linspace(-math.pi, math.pi, num=256)
y = numpy.sin(x)

# Plot the sinusoid
plt.plot(x, y)
plt.show()
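To check that the whole-array call gives the same values as the element-by-element loop, here is a small sketch. The names y_loop, y_vec and same exist only for this illustration.

```python
import math
import numpy

x = numpy.linspace(-math.pi, math.pi, num=256)

# Element-by-element version, with an explicit Python loop
y_loop = numpy.zeros(len(x))
for n in range(len(x)):
    y_loop[n] = math.sin(x[n])

# Whole-array version: one call, no explicit loop
y_vec = numpy.sin(x)

# Both versions agree up to floating-point rounding
same = numpy.allclose(y_loop, y_vec)
print(same)
```

The whole-array form is shorter, and the loop runs inside numpy rather than in the Python interpreter.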
PDF output
Exporting to a PDF file is just one change
import math
import numpy
import matplotlib.pyplot as plt

# Generate a sinusoid
x = numpy.linspace(-math.pi, math.pi, num=256)
y = numpy.sin(x)

# Plot the sinusoid
plt.plot(x, y)
plt.savefig('sin-plot.pdf', transparent=True)
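savefig picks the file format from the file name extension, so other formats follow the same pattern. A minimal sketch, assuming a temporary output directory and the non-interactive Agg backend so that it also runs without a display; the dpi value is an arbitrary choice.

```python
import os
import tempfile

import numpy
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt

x = numpy.linspace(-numpy.pi, numpy.pi, num=256)
plt.plot(x, numpy.sin(x))

# Write the same figure twice: vector output (PDF) and raster output (PNG)
out_dir = tempfile.mkdtemp()
pdf_path = os.path.join(out_dir, "sin-plot.pdf")
png_path = os.path.join(out_dir, "sin-plot.png")
plt.savefig(pdf_path, transparent=True)
plt.savefig(png_path, dpi=150)  # dpi only matters for raster formats
plt.close()
```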
2 Curve plots
Multiple curves
It’s often convenient to show several curves in one figure
import math
import numpy
import matplotlib.pyplot as plt

# Generate a sine and a cosine curve
x = numpy.linspace(-math.pi, math.pi, num=256)
y = numpy.sin(x)
z = numpy.cos(x)

# Plot the curves
plt.plot(x, y)
plt.plot(x, z)
plt.show()
[Figure: the sine and cosine curves drawn in the same plot]
Custom colors
Changing colors can help to make nice documents
import math
import numpy
import matplotlib.pyplot as plt

# Generate a sine and a cosine curve
x = numpy.linspace(-math.pi, math.pi, num=256)
y = numpy.sin(x)
z = numpy.cos(x)

# Plot the curves
plt.plot(x, y, c='#FF4500')
plt.plot(x, z, c='#4682B4')
plt.show()
[Figure: the sine curve in orange-red (#FF4500) and the cosine curve in steel blue (#4682B4)]
Line thickness
Line thickness can be changed as well
import math
import numpy
import matplotlib.pyplot as plt

# Generate a sine and a cosine curve
x = numpy.linspace(-math.pi, math.pi, num=256)
y = numpy.sin(x)
z = numpy.cos(x)

# Plot the curves
plt.plot(x, y, linewidth=3, c='#FF4500')
plt.plot(x, z, c='#4682B4')
plt.show()
[Figure: the same two curves, with a thicker line for the sine]
Line patterns
For printed documents, colors can be replaced by line patterns

import math
import numpy
import matplotlib.pyplot as plt

# Line styles can be '-', '--', '-.', ':'

# Generate a sine and a cosine curve
x = numpy.linspace(-math.pi, math.pi, num=256)
y = numpy.sin(x)
z = numpy.cos(x)

# Plot the curves
plt.plot(x, y, linestyle='--', c='#000000')
plt.plot(x, z, c='#808080')
plt.show()
[Figure: the sine as a black dashed line and the cosine as a gray solid line]
Markers
It is sometimes relevant to show the data points

import math
import numpy
import matplotlib.pyplot as plt

# Markers can be '.', ',', 'o', '1' and more

# Generate a sine and a cosine curve
x = numpy.linspace(-math.pi, math.pi, num=64)
y = numpy.sin(x)
z = numpy.cos(x)

# Plot the curves
plt.plot(x, y, marker='1', markersize=15, c='#000000')
plt.plot(x, z, c='#000000')
plt.show()
[Figure: the sine curve with a large marker on each of its 64 data points, and the cosine curve without markers]
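Color, marker and line style can also be packed into a short format string given as the third argument of plot. A minimal sketch of that shorthand, using the Agg backend so it runs without a display; the variable names sin_line and cos_line are only for illustration.

```python
import numpy
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt

x = numpy.linspace(-numpy.pi, numpy.pi, num=64)

# 'k--' : black ('k'), dashed line ('--')
sin_line, = plt.plot(x, numpy.sin(x), 'k--')

# 'ko-' : black ('k'), circle markers ('o'), solid line ('-')
cos_line, = plt.plot(x, numpy.cos(x), 'ko-')
```

The keyword form (linestyle=, marker=, c=) is more explicit; the format string is convenient for quick plots.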
Legend
A legend can help to make self-explanatory figures

import math
import numpy
import matplotlib.pyplot as plt

# legend location can be 'best', 'center', 'upper left', 'right', etc.

# Generate a sine and a cosine curve
x = numpy.linspace(-math.pi, math.pi, num=256)
y = numpy.sin(x)
z = numpy.cos(x)

# Plot the curves
plt.plot(x, y, c='#FF4500', label='sin(x)')
plt.plot(x, z, c='#4682B4', label='cos(x)')
plt.legend(loc='best')
plt.show()
[Figure: the sine and cosine curves with a legend box labelled sin(x) and cos(x)]
Custom axis scale
Changing the axis scale can improve readability

import math
import numpy
import matplotlib.pyplot as plt

# legend location can be 'best', 'center', 'upper left', 'right', etc.

# Generate a sine and a cosine curve
x = numpy.linspace(-math.pi, math.pi, num=256)
y = numpy.sin(x)
z = numpy.cos(x)

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)
axis.set_ylim(-0.5 * math.pi, 0.5 * math.pi)

# Plot the curves
plt.plot(x, y, c='#FF4500', label='sin(x)')
plt.plot(x, z, c='#4682B4', label='cos(x)')
plt.legend(loc='best')
plt.show()
[Figure: the sine and cosine curves with their legend, the y axis now spanning roughly -1.5 to 1.5]
Grid
The same goes for a grid, which can be helpful

import math
import numpy
import matplotlib.pyplot as plt

# legend location can be 'best', 'center', 'upper left', 'right', etc.

# Generate a sine and a cosine curve
x = numpy.linspace(-math.pi, math.pi, num=256)
y = numpy.sin(x)
z = numpy.cos(x)

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)
axis.set_ylim(-0.5 * math.pi, 0.5 * math.pi)
axis.grid(True)

# Plot the curves
plt.plot(x, y, c='#FF4500', label='sin(x)')
plt.plot(x, z, c='#4682B4', label='cos(x)')
plt.legend(loc='best')
plt.show()
[Figure: the sine and cosine curves with their legend, drawn over a grid]
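The axis setup slides mix calls on the axis object with calls on plt. The same figure can be built by calling everything on the axes object, which makes it explicit where each curve goes. A minimal sketch, assuming plt.subplots() as the way to create the figure and the Agg backend so it runs without a display; the names ax and ylim are only for illustration.

```python
import math

import numpy
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt

x = numpy.linspace(-math.pi, math.pi, num=256)

# One figure holding one axes object
fig, ax = plt.subplots()

# Axis setup: fixed y range and a grid
ax.set_ylim(-0.5 * math.pi, 0.5 * math.pi)
ax.grid(True)

# Plot the curves on that axes object
ax.plot(x, numpy.sin(x), c='#FF4500', label='sin(x)')
ax.plot(x, numpy.cos(x), c='#4682B4', label='cos(x)')
ax.legend(loc='best')

# The y range set above is kept after plotting
ylim = ax.get_ylim()
```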
Error bars
Your data might come with a known measurement error

import math
import numpy
import numpy.random
import matplotlib.pyplot as plt

# Generate a noisy sinusoid
x = numpy.linspace(-math.pi, math.pi, num=48)
y = numpy.sin(x + 0.05 * numpy.random.standard_normal(len(x)))
# Error magnitudes must not be negative, hence the absolute value
yerror = numpy.abs(0.1 * numpy.random.standard_normal(len(x)))

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)
axis.set_ylim(-0.5 * math.pi, 0.5 * math.pi)

# Plot the curves
plt.plot(x, y, c='#FF4500')
plt.errorbar(x, y, yerr=yerror)
plt.show()
[Figure: a noisy sine curve with a vertical error bar at each point]
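Measurement errors are not always symmetric. errorbar also accepts yerr as two rows, the first for the part below each point and the second for the part above. A minimal sketch with made-up error magnitudes, using the Agg backend so it runs without a display; lower, upper and container are illustrative names.

```python
import numpy
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt

numpy.random.seed(0)  # fixed seed, only to make the sketch repeatable
x = numpy.linspace(-numpy.pi, numpy.pi, num=12)
y = numpy.sin(x)

# Hypothetical error magnitudes; both arrays must be non-negative
lower = 0.05 + 0.1 * numpy.random.random_sample(len(x))
upper = 0.05 + 0.3 * numpy.random.random_sample(len(x))

# yerr with two rows: first row below the point, second row above
container = plt.errorbar(x, y, yerr=[lower, upper], fmt='o', color='#FF4500')
```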
3 Scatter plots
Scatter plot
A scatter plot just shows one point for each dataset entry
import numpy
import numpy.random
import matplotlib.pyplot as plt

# Generate a 2d normal distribution
nbPoints = 100
x = numpy.random.standard_normal(nbPoints)
y = numpy.random.standard_normal(nbPoints)

# Plot the points
plt.scatter(x, y)
plt.show()
[Figure: scatter plot of 100 normally distributed points]
Aspect ratio
It can be very important to have the same scale on both axes

import numpy
import numpy.random
import matplotlib.pyplot as plt

# Generate a 2d normal distribution
nbPoints = 100
x = numpy.random.standard_normal(nbPoints)
y = 0.1 * numpy.random.standard_normal(nbPoints)

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111, aspect='equal')

# Plot the points
plt.scatter(x, y, c='#FF4500')
plt.show()
[Figure: scatter plot with equal scales, x axis from -3 to 3 and y axis from -0.3 to 0.3, giving a wide, flat plot]
Aspect ratio
Alternative way to keep the same scale on both axes

import numpy
import numpy.random
import matplotlib.pyplot as plt

# Generate a 2d normal distribution
nbPoints = 100
x = numpy.random.standard_normal(nbPoints)
y = 0.1 * numpy.random.standard_normal(nbPoints)

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)

# Common limits for both axes, with a 5% margin on each side
cmin, cmax = min(min(x), min(y)), max(max(x), max(y))
margin = 0.05 * (cmax - cmin)
cmin -= margin
cmax += margin

axis.set_xlim(cmin, cmax)
axis.set_ylim(cmin, cmax)

# Plot the points
plt.scatter(x, y, c='#FF4500')
plt.show()
[Figure: the same kind of data with both axes spanning about -2 to 2]
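The limit computation can be wrapped in a small helper so that it is reusable. This is only a sketch: square_limits is a hypothetical name, not a matplotlib function. Note that identical limits give both axes the same data range; the units have the same length on screen only if the plotting area is square or aspect='equal' is also set.

```python
import numpy
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt


def square_limits(x, y, margin=0.05):
    """Return (cmin, cmax) covering both x and y, padded by a relative margin."""
    cmin = min(numpy.min(x), numpy.min(y))
    cmax = max(numpy.max(x), numpy.max(y))
    pad = margin * (cmax - cmin)
    return cmin - pad, cmax + pad


numpy.random.seed(0)  # fixed seed, only to make the sketch repeatable
x = numpy.random.standard_normal(100)
y = 0.1 * numpy.random.standard_normal(100)

fig, ax = plt.subplots()
lo, hi = square_limits(x, y)
ax.set_xlim(lo, hi)
ax.set_ylim(lo, hi)
ax.scatter(x, y, c='#FF4500')
```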
Multiple scatter plots
As for curves, you can show 2 datasets on one figure

import numpy
import numpy.random
import matplotlib.pyplot as plt

colors = ('#FF4500', '#3CB371', '#4682B4', '#DB7093', '#FFD700')

# Generate two 2d normal distributions
nbPoints = 100

x, y = [], []

x += [numpy.random.standard_normal(nbPoints)]
y += [0.25 * numpy.random.standard_normal(nbPoints)]

x += [0.5 * numpy.random.standard_normal(nbPoints) + 3.0]
y += [2 * numpy.random.standard_normal(nbPoints) + 2.0]

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111, aspect='equal')

# Plot the points
for i in range(len(x)):
    plt.scatter(x[i], y[i], c=colors[i % len(colors)])

plt.show()
[Figure: two point clouds, each with its own color]
Showing centers
It can help to see the centers or the median points

import numpy
import numpy.random
import matplotlib.pyplot as plt

colors = ('#FF4500', '#3CB371', '#4682B4', '#DB7093', '#FFD700')

# Generate two 2d normal distributions
nbPoints = 100

x, y = [], []

x += [numpy.random.standard_normal(nbPoints)]
y += [0.25 * numpy.random.standard_normal(nbPoints)]

x += [0.5 * numpy.random.standard_normal(nbPoints) + 3.0]
y += [2 * numpy.random.standard_normal(nbPoints) + 2.0]

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111, aspect='equal')

# Plot the points, then a large dot at the median point of each dataset
for i in range(len(x)):
    col = colors[i % len(colors)]
    plt.scatter(x[i], y[i], c=col)
    plt.scatter([numpy.median(x[i])], [numpy.median(y[i])], c=col, s=250)

plt.show()
[Figure: the two point clouds, each with a large dot at its median point]
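The median is one possible choice of center; the mean is another. A small sketch with made-up values shows how they differ when a dataset contains an outlier.

```python
import numpy

# Made-up values with one outlier (100.0)
values = numpy.array([1.0, 2.0, 3.0, 4.0, 100.0])

mean = numpy.mean(values)      # pulled towards the outlier
median = numpy.median(values)  # stays in the middle of the bulk of the data

print(mean, median)  # 22.0 3.0
```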
Marker styles
You can use different marker styles

import numpy
import numpy.random
import matplotlib.pyplot as plt

markers = ('+', '^', '.')

# Generate two 2d normal distributions
nbPoints = 100

x, y = [], []

x += [numpy.random.standard_normal(nbPoints)]
y += [0.25 * numpy.random.standard_normal(nbPoints)]

x += [0.5 * numpy.random.standard_normal(nbPoints) + 3.0]
y += [2 * numpy.random.standard_normal(nbPoints) + 2.0]

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111, aspect='equal')

# Plot the points
for i in range(len(x)):
    m = markers[i % len(markers)]
    plt.scatter(x[i], y[i], marker=m, c='#000000')
    plt.scatter([numpy.median(x[i])], [numpy.median(y[i])], marker=m, s=250, c='#000000')

plt.show()
[Figure: the two point clouds in black, told apart by their marker shapes]
4 Boxplots
Boxplots
Let’s do a simple boxplot
import numpy
import numpy.random
import matplotlib.pyplot as plt

# Generate normal distribution data
x = numpy.random.standard_normal(256)

# Show a boxplot of the data
plt.boxplot(x)
plt.show()
[Figure: a single boxplot, y axis from -3 to 3]
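With default settings, the box spans the first to the third quartile and the line inside it marks the median. A sketch that computes these values with numpy and compares the median with the one drawn by boxplot, using the Agg backend so it runs without a display; bp and drawn_median are illustrative names.

```python
import numpy
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt

numpy.random.seed(0)  # fixed seed, only to make the sketch repeatable
x = numpy.random.standard_normal(256)

# Quartiles and median computed directly
q1, median, q3 = numpy.percentile(x, [25, 50, 75])

# boxplot returns a dictionary of the drawn lines
bp = plt.boxplot(x)
drawn_median = bp['medians'][0].get_ydata()[0]

print(q1, median, q3, drawn_median)
```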
Boxplots
You might want to show the original data at the same time

import numpy
import numpy.random
import matplotlib.pyplot as plt

# Generate normal distribution data
x = numpy.random.standard_normal(256)

# Show the data points and a boxplot of the data
plt.scatter([0] * len(x), x, c='#4682B4')
plt.boxplot(x, positions=[0])
plt.show()
[Figure: the boxplot drawn over the data points, at x = 0]
Multiple boxplots
Boxplots are often used to show various distributions side by side

import numpy
import numpy.random
import matplotlib.pyplot as plt

# Generate normal distribution data
data = []
for i in range(5):
    mu = 10 * numpy.random.random_sample()
    sigma = 2 * numpy.random.random_sample() + 0.1
    data.append(numpy.random.normal(mu, sigma, 256))

# Show a boxplot of the data
plt.boxplot(data)
plt.show()
[Figure: five boxplots side by side, numbered 1 to 5]
Orientation
Changing the orientation is easily done
import numpy
import numpy.random
import matplotlib.pyplot as plt

# Generate normal distribution data
data = []
for i in range(5):
    mu = 10 * numpy.random.random_sample()
    sigma = 2 * numpy.random.random_sample() + 0.1
    data.append(numpy.random.normal(mu, sigma, 256))

# Show a horizontal boxplot of the data
plt.boxplot(data, vert=False)
plt.show()
[Figure: the five boxplots drawn horizontally]
Legend
Good graphics have a proper legend

import numpy
import numpy.random
import matplotlib.pyplot as plt

# Generate normal distribution data
labels = ['mercury', 'lead', 'lithium', 'tungsten', 'cadmium']
data = []
for i in range(len(labels)):
    mu = 10 * numpy.random.random_sample() + 100
    sigma = 2 * numpy.random.random_sample() + 0.1
    data.append(numpy.random.normal(mu, sigma, 256))

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)
axis.set_title('Alien nodules composition')
axis.set_ylabel('concentration (ppm)')

# Show a boxplot of the data
plt.boxplot(data)
# Name the boxes after drawing them, so that boxplot cannot overwrite the tick labels
plt.setp(axis, xticklabels=labels)
plt.show()
[Figure: five boxplots, each named after its element; y axis: concentration (ppm), from 100 to 112; title: Alien nodules composition]
5 Histograms
Histograms
Histograms are convenient to sum up results

import numpy
import numpy.random
import matplotlib.pyplot as plt

# Some data
data = numpy.abs(numpy.random.standard_normal(30))

# Show a histogram
plt.barh(range(len(data)), data, color='#4682B4')
plt.show()
[Figure: 30 horizontal bars, one per value]
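The bars in this section show one bar per data value. When the goal is instead to count how many values fall into each interval, numpy.histogram can compute the counts first. A minimal sketch, assuming 10 bins (an arbitrary choice) and the Agg backend so it runs without a display.

```python
import numpy
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt

numpy.random.seed(0)  # fixed seed, only to make the sketch repeatable
data = numpy.random.standard_normal(256)

# counts[i] is the number of values between edges[i] and edges[i + 1]
counts, edges = numpy.histogram(data, bins=10)

# One bar per bin, starting at the left edge of the bin
plt.bar(edges[:-1], counts, width=numpy.diff(edges), align='edge', color='#4682B4')
```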
Histograms
A variant to show 2 quantities per item on 1 figure
import numpy
import numpy.random
import matplotlib.pyplot as plt

# Some data
data = [[], []]
for i in range(len(data)):
    data[i] = numpy.abs(numpy.random.standard_normal(30))

# Show a histogram
labels = range(len(data[0]))

plt.barh(labels, data[0], color='#4682B4')
plt.barh(labels, -1 * data[1], color='#FF4500')
plt.show()
[Figure: 30 pairs of horizontal bars, the first quantity extending to the right of zero and the second to the left]
Labels
Very often, we need to name items on a histogram

import numpy
import numpy.random
import matplotlib.pyplot as plt

# Some data
names = ['Wang Bu', 'Cheng Cao', 'Zhang Xue Li', 'Tang Wei', 'Sun Wu Kong']
marks = 7 * numpy.random.random_sample(len(names)) + 3

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)
axis.set_xlim(0, 10)

# One tick per bar; the bars are centered on these positions (align='center' below)
plt.yticks(numpy.arange(len(marks)), names)

axis.set_title('Data-mining marks')
axis.set_xlabel('mark')

# Show a histogram
plt.barh(range(len(marks)), marks, align='center', color='#4682B4')
plt.show()
[Figure: one horizontal bar per student name; x axis: mark, from 0 to 10; title: Data-mining marks]
Error bars
Error bars, to indicate the accuracy of values

import numpy
import numpy.random
import matplotlib.pyplot as plt

# Some data
names = ['6809', '6502', '8086', 'Z80', 'RCA1802']
speed = 70 * numpy.random.random_sample(len(names)) + 30
error = 9 * numpy.random.random_sample(len(names)) + 1

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)

# One tick per bar; the bars are centered on these positions (align='center' below)
plt.yticks(numpy.arange(len(names)), names)

axis.set_title('8 bits CPU benchmark - string test')
axis.set_xlabel('score')

# Show a histogram
plt.barh(range(len(names)), speed, xerr=error, align='center', color='#4682B4')
plt.show()
[Figure: one horizontal bar with an error bar per CPU (6809, 6502, 8086, Z80, RCA1802); x axis: score, from 0 to 100; title: 8 bits CPU benchmark - string test]
6 Usage example
Old Faithful
Let’s display some real data: Old Faithful geyser
Old Faithful
This way works, but it is a good example of a half-done job

import numpy
import matplotlib.pyplot as plt

# Read the data
data = numpy.loadtxt('./datasets/geyser.dat')

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111, aspect='equal')

# Plot the points
plt.scatter(data[:, 0], data[:, 1], c='#FF4500')
plt.show()
[Figure: scatter plot of the geyser data with equal scales on both axes, x from 30 to 100 and y from 1.5 to 5.5, without title or axis labels]
Old Faithful
Let’s make a more readable figure
import numpy
import matplotlib.pyplot as plt

# Read the data
data = numpy.loadtxt('./datasets/geyser.dat')

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)
axis.set_title('Old Faithful geyser dataset')
axis.set_xlabel('waiting time (mn.)')
axis.set_ylabel('eruption duration (mn.)')

# Plot the points
plt.scatter(data[:, 0], data[:, 1], c='#FF4500')
plt.show()
[Figure: scatter plot; x axis: waiting time (mn.), from 30 to 100; y axis: eruption duration (mn.), from 1.5 to 5.5; title: Old Faithful geyser dataset]
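The file ./datasets/geyser.dat is not part of this transcript. Assuming it holds two whitespace-separated columns (waiting time, then eruption duration, as the axis labels suggest), this sketch shows what numpy.loadtxt returns. The rows below are made-up sample values, not the real dataset, and are read from memory instead of from a file.

```python
import io

import numpy

# Made-up sample rows, NOT the real dataset: waiting time, eruption duration
text = u"""# waiting duration
79 3.6
54 1.8
74 3.3
62 2.3
"""

# loadtxt accepts a file name or a file-like object; lines starting with '#' are skipped
data = numpy.loadtxt(io.StringIO(text))

waiting = data[:, 0]   # first column
duration = data[:, 1]  # second column

print(data.shape)  # (4, 2)
```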
Mercury & fishes
Let’s display more complex data: fishes and mercury poisoning
Mercury & fishes
A first try
import numpy
import matplotlib.pyplot as plt

# Read the data
data = numpy.loadtxt('./datasets/fish.dat')

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)
axis.set_title('North Carolina fishes mercury concentration')
axis.set_xlabel('weight (g.)')
axis.set_ylabel('mercury concentration (ppm)')

# Plot the points
plt.scatter(data[:, 3], data[:, 4], c='#FF4500')
plt.show()
[Figure: scatter plot; x axis: weight (g.); y axis: mercury concentration (ppm); title: North Carolina fishes mercury concentration]
Mercury & fishes
Show more information with subplots
[Figure: three scatter plots under the title North Carolina fishes mercury concentration: weight (g.) against length (cm.), mercury concentration (ppm) against weight (g.), and mercury concentration (ppm) against length (cm.)]
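The slides show the result with subplots but not the code. A possible sketch follows, with loud assumptions: the data is synthetic (the column layout of fish.dat beyond the weight and mercury columns is not shown), the panels are laid out in one row with plt.subplots, and the Agg backend is used so it runs without a display.

```python
import numpy
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt

# Synthetic stand-in data, NOT the real dataset
numpy.random.seed(0)
length = numpy.random.uniform(20.0, 70.0, 50)                      # cm
weight = 0.015 * length ** 3 * numpy.random.uniform(0.8, 1.2, 50)  # g
mercury = 0.05 * length * numpy.random.uniform(0.5, 1.5, 50)       # ppm

# One row of three axes sharing one figure title
fig, axes = plt.subplots(1, 3, figsize=(12, 4))
fig.suptitle('North Carolina fishes mercury concentration')

# (x data, y data, x label, y label) for each panel
panels = [
    (length, weight, 'length (cm.)', 'weight (g.)'),
    (weight, mercury, 'weight (g.)', 'mercury concentration (ppm)'),
    (length, mercury, 'length (cm.)', 'mercury concentration (ppm)'),
]
for ax, (px, py, xlabel, ylabel) in zip(axes, panels):
    ax.scatter(px, py, c='#FF4500')
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
```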
Mercury & fishes
Data from one river with its own color
[Figure: the same three scatter plots, with the data from one river drawn in its own color]
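One way to give the data from one river its own color is to split the points with a boolean mask and call scatter once per group. This is only a sketch under stated assumptions: the data is synthetic, and the river column (an integer id per fish) is hypothetical since the slides do not show how rivers are encoded. The Agg backend is used so it runs without a display.

```python
import numpy
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt

# Synthetic stand-in data, NOT the real dataset
numpy.random.seed(0)
length = numpy.random.uniform(20.0, 70.0, 60)                 # cm
mercury = 0.05 * length * numpy.random.uniform(0.5, 1.5, 60)  # ppm
river = numpy.random.randint(0, 2, 60)  # hypothetical river id, 0 or 1

# Boolean mask selecting the fish of the highlighted river
highlight = (river == 0)

fig, ax = plt.subplots()
ax.scatter(length[~highlight], mercury[~highlight], c='#4682B4', label='other river')
ax.scatter(length[highlight], mercury[highlight], c='#FF4500', label='highlighted river')
ax.set_xlabel('length (cm.)')
ax.set_ylabel('mercury concentration (ppm)')
ax.legend(loc='best')
```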