SDS PODCAST EPISODE 271: MAKING THE PUBLIC … · 2019-06-19
TRANSCRIPT
Kirill Eremenko: This is episode number 271 with the legend of visual
journalism, Alberto Cairo.
Kirill Eremenko: Welcome to the SuperDataScience podcast. My name
is Kirill Eremenko, Data Science Coach and Lifestyle
Entrepreneur, and each week we bring you inspiring
people and ideas to help you build your successful
career in data science. Thanks for being here today
and now let's make the complex simple.
Kirill Eremenko: This episode is brought to you by SuperDataScience,
our online membership platform for learning data
science at any level. We've got over two and a half
thousand video tutorials, over 200 hours of content
and 30 plus courses with new courses being added on
average once per month. You can get access to all of
this today just by becoming a SuperDataScience
member. There are no strings attached. You just need to
go to superdatascience.com and sign up there, cancel at
any time. In addition with your membership, you get
access to any new courses that we release plus all the
bonuses associated with them. Of course there are
many additional features that are in place or are being
put in place as we speak, such as a Slack channel for
members where you can already today connect with
other data scientists all over the world or in your
location, and discuss different topics such as artificial
intelligence, machine learning, data science,
visualization and more, or just hang out in the pizza
room and have random chats with fellow data
scientists.
Kirill Eremenko: Also, another feature of the SuperDataScience
platform is the office hours, where every week we invite
valuable guests in the space of data science and
interrogate them about their techniques, about their
methodologies in the space of data science, and you
actually get a presentation from the guest and you get
an opportunity to ask Q&A at the end. In some of our
office hours, we just present some of the most valuable
techniques that our hosts think are going to be
valuable to you. All of that and more you get as part of
your membership at SuperDataScience, so don't hold
off, sign up today at www.superdatascience.com,
secure your membership and take your data science
skills to the next level.
Kirill Eremenko: Welcome back to the SuperDataScience podcast, ladies
and gentlemen. Super, super pumped to have you
back here on the show today because the guest for
today, I've been hunting this man down for months.
We've been inviting Alberto or trying to get a spot in
Alberto's super busy schedule for months now, and
finally it's happened. I just got off the phone with
none other than Alberto Cairo, and we had an
amazing, amazing chat about data visualization. If
you're not familiar with who Alberto is, Alberto is a
journalist, he's a speaker, an author. He's also the
Knight Chair in Visual Journalism at the University of
Miami. The Knight Chair means that he's endowed by
the Knight Foundation, which recognizes and puts
certain journalists into leading positions as tenured
professors in academia. There's only a handful of
Knight chairs in the US, maybe a couple of dozen, and
Alberto Cairo is one of them.
Kirill Eremenko: All of these credentials should speak for themselves as
to what kind of calibre of a journalist and data
visualization expert Alberto is. He's presented at
numerous conferences and he's actually published two
books already. You might be actually familiar with
them. The first one is called The Functional Art: An
Introduction to Information Graphics and
Visualization, came out in 2012. The second one is
The Truthful Art: Data, Charts, and Maps for
Communication, came out in 2016. What's exciting is
that Alberto's third book is coming out, it's called How
Charts Lie: Getting Smarter about Visual Information.
It's coming out in October this year, October 2019,
and you can actually already pick it up on pre-order.
We talked about Alberto's book and you get some very
useful insights from this book for your visualization
practices, and also for understanding visualizations
better.
Kirill Eremenko: Plus, we talked about plenty of other things on this
podcast. Here's a couple of teasers of what you're
about to experience. Why do people misinterpret
visualizations? Simpson's paradox, the ecological
fallacy, four kinds of literacy, being conscious about
visualizations, exploratory data analysis versus
communicating results, how to design effective
visualizations, and ethics in data visualization. Those
are just a few topics that we touched on. As you can
imagine, it's going to be a value packed podcast.
Without further ado, I bring to you Alberto Cairo, the
legend of data visualization.
Kirill Eremenko: Welcome back to the SuperDataScience podcast, ladies
and gentlemen. Today I'm super excited because I've
got a legendary guest on the show, Alberto Cairo
calling in from Miami. Alberto, how are you going
today?
Alberto Cairo: Hey, doing good. How are you?
Kirill Eremenko: I'm doing very well, and super pumped to be talking to
you. I watched your presentation at Microsoft
yesterday as we were chatting just before the podcast,
and my God, you have some very interesting
approaches to visualization. I'm very excited to dig into
these today.
Alberto Cairo: Likewise. Thanks for having me.
Kirill Eremenko: Yeah. No, pleasure's mine. How is Miami this time of
the year? I saw on your Twitter feed that you're
spending ... you're finally taking some time away from
all the presentations and conferences, and I guess
spend some time with family. Are you looking forward
to that? How's that going to be?
Alberto Cairo: Oh yeah, I'm so super looking forward to that. One
thing that I usually joke about Miami is that I am
originally from Spain, from Northwestern Spain, a
region called Galicia, and Galicia is very rainy and
dark and windy and cold, and Miami can be rainy
sometimes particularly during the summer because
clouds build up during the day and you get a
downpour at the end of the day, but most of the time
is warm and sunny. I got used to this weather very
quickly and I love it here, and I'm looking forward to
those three months of staying at home, no trouble. But
I will have tons of work. I mean I'm not planning to
basically rest, so will be working on tons of stuff. It's
only that I can do it in my backyard, next to the swing,
which is the luxury that I have.
Kirill Eremenko: Yeah, no, that's very exciting. But it doesn't get too hot
in ... I've only been in Miami briefly and then I went to
Florida Keys. I was wondering, it doesn't get too hot?
Because in Spain, for instance, in summer, last year, I
think it was like 37 degrees Celsius or something like
that.
Alberto Cairo: Oh yeah. If you go to the south of Spain, you can get
to 40 degrees Celsius or even more, 40, 42. Miami
doesn't get that warm. However, what happens is that
you have crazy humidity. You need to hydrate all day
basically. But if you do that, you're fine. I mean, if you
always carry water with you, which is advisable, then
you're fine. But you need to like this kind of weather. I
mean, if you are a cold weather person, you will suffer
mightily, mightily here. But I'm a warm weather
person, so I really enjoy Miami.
Kirill Eremenko: Yeah, yeah. I understand. Indeed, it's really humid. As
soon as you get out the plane, you start sweating like
crazy.
Alberto Cairo: Yeah. Exactly, yeah.
Kirill Eremenko: Which part of Miami?
Alberto Cairo: I live in a neighborhood called Kendall, which is in
Southwestern Miami. I am not close to the coast, to
Miami beach. I'm closer to the Everglades, which is the
large natural park, the swamp. It's over here. I usually
joke that I'm closer to the alligators than I am to the
dolphins.
Kirill Eremenko: Or the sharks.
Alberto Cairo: Or the sharks, yes.
Kirill Eremenko: Okay, got you. Okay. Well, very cool. Very excited for
your time in Miami for the next few months to have a
rest. A well deserved rest because as we were chatting
before the podcast, you've got your third book coming
out in October. Once that happens, you're going to be
on the move going to conferences pretty much every
day. As you said, you can see it as a problem or as a
huge opportunity.
Alberto Cairo: Yeah. It's a problem-
Kirill Eremenko: How are you feeling?
Alberto Cairo: Yeah, it's a problem or an opportunity. Yeah, the book
that comes out in October, it's actually my first book
for the general public. The title is How Charts Lie,
although perhaps a more appropriate title would be
how we lie to ourselves with charts. The way that it is
written, it's very informal, very nontechnical. It's an
introduction to how to become a better reader of
charts. Not a better designer, but a better reader
because it's for the general public, it's not for
designers. It's how to correctly interpret all the line
charts and bar graphs and data maps that we see
every day in social media and the news media, how to
extract the right meaning from them. I don't know.
Perhaps it will ... I don't know. It will sell well, it will
attract lots of attention. Who knows? Yeah. I already
have several speaking engagements lined up for the
fall in relationship to it, just to help with the
promotional efforts.
Kirill Eremenko: No, that's very exciting, and totally agree that a book
for the general public, well, especially from somebody
of your level in the space of visualization, it's
necessary because there's people who want to hear
from you, but maybe they're not technical, they don't
have the technical background to understand certain
concepts or to keep up with certain concepts. A book
for the general public I think is a great idea. What are
some of the main things that you cover in this
book? What are some of the main themes?
Alberto Cairo: Yeah. What I did in the new book was to basically ask
myself, if I had not learned anything myself, about
data visualization by studying or practicing it, what
are the most elementary skills or pieces of knowledge
that I need to have in order to be a critical, not
designer, but a critical reader of these kinds of
products in news media? Right? Obviously, I cover
things such as the main principles of data
visualization that you can read about in any more
technical books, like the ones that I wrote in the past,
such as the Truthful Art, for example, right?
Alberto Cairo: Principles such as visual encoding, what is visual
encoding? Right? Visual encoding basically is getting
your data and then mapping your data onto objects,
and then changing some properties of those objects in
proportion to the data that you're trying to represent.
It could be the length of the object or the height of the
object or the color of the object and so on and so forth.
Those properties we call them encodings. Right?
Kirill Eremenko: Mm-hmm (affirmative)
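To make the idea of visual encoding described above concrete, here is a minimal Python sketch that is not from the episode. It maps each data value onto an object (a text bar) and varies one property of that object, its length, in proportion to the value. The function name `encode_as_bars`, the `max_width` parameter, and the sample numbers are all hypothetical and chosen only for illustration.

```python
def encode_as_bars(data, max_width=40):
    """Map each value to a bar whose length is proportional to the value.

    `data` is a dict of label -> non-negative number. Length is the
    encoding here; height or color would be alternative encodings.
    """
    largest = max(data.values())
    if largest <= 0:
        raise ValueError("need at least one positive value to scale bars")
    bars = {}
    for label, value in data.items():
        # Zero-based scale: a value twice as large gets a bar twice as long.
        length = round(value / largest * max_width)
        bars[label] = "#" * length
    return bars


# Hypothetical values, invented for this illustration only.
sample = {"A": 10, "B": 20, "C": 40}
for label, bar in encode_as_bars(sample).items():
    print(f"{label} {bar}")
```

Because the scale starts at zero, the ratio between bar lengths matches the ratio between the underlying values, which is what lets a reader decode the chart back into data.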
Alberto Cairo: In the past I taught these skills to people who wanted
to work in data visualization. What I do in the new
book is to try to explain these very elementary
principles to people who are not going to be graphic
designers or visualization designers or data scientists,
but who are going to be consumers of those kinds of
products. So they need to be prepared to read them
correctly, and in order to read them correctly, you
need to understand data visualization at the
symbolical level, so understanding the principle of
mapping data onto objects at the grammatical level,
meaning that you need to learn about encodings. In
the third level, which is the core of the book actually,
it's the semantics level.
Alberto Cairo: Once you are able to understand the mechanics of a
graphic, how to read it, right? Then you need to be
able to interpret it, right? It's at the semantics level.
What is the information that that graphic is carrying,
how to extract the right insights, or the right
inferences from the chart that you are seeing. I think
that these skills are of greater value for anybody.
Right? The problem is that the literature about data
visualization, and this includes my own previous
books, they are aimed too much at people who want to
specialize in the field. We don't really share some
knowledge, right? We have basically the same, similar
levels of knowledge. Right? There are challenges that
... Basically what is happening is that there is an
increase in the sophistication that visualization
designers have, but there is not the same increase in
sophistication in the readers who consume these types
of data visualizations, right?
Alberto Cairo: There's a growing gap between, let's say, the
communities made of visualization designers, data
scientists, statisticians, et cetera. We are developing
new methods every day, we are making all these fields
advance very quickly and improve very quickly and
create new tools and so on and so forth, but the
general public is falling behind, right? My interest in
the past few years has been, how can we help the
general public bring themselves up to speed with all
these new techniques? Obviously, I cannot write about
data science. I'm not a statistician, I'm not a data
scientist, but I'm a visualization designer. I asked
myself, what can I do to help my dad, for example,
who's a medical doctor, not trained in statistics, not
trained in data visualization, what can I do to help my
dad bring himself up to speed with data visualization?
Alberto Cairo: I wrote the book that way. If I had to explain to a
nontechnical person what data visualization is about,
why it is so important, why it can be so powerful, but
at the same time how dangerous it can be as well, if
you don't use it correctly. How would I write that
book? That's the frame of mind that I put myself into
to write this new book.
Kirill Eremenko: I totally understand. I like how you say in your talks,
that good data visualizations have two really powerful
qualities, that they're persuasive, and they're memorable,
right? If you see a good visualization, not only
do you understand, hopefully understand
properly, if it's communicated properly, what the
underlying insights are, but also you're able to
memorize it because it's an image and you can see it in
your head and you can maybe describe those insights
later to somebody else. I think perhaps those are the
two reasons why more and more publications, such as
the Wall Street Journal, New York Times and so on,
they're moving to visualization.
Kirill Eremenko: The amount of infographics and visual
representations of information, whether it's about
elections or about population statistics or about crime
rates and things like that. The amount of infographics
out there is crazy, and now they're getting interactive
and they're getting more and more exciting and
interesting on these publications. That's very
interesting.
Alberto Cairo: There is a reason. There is a reason for this increase,
which is that if you ask people who work in data
journalism departments or graphics departments in
news publications such as the ones that you
mentioned, Wall Street Journal or 538 or the New York
Times or ProPublica or many others, the Financial
Times, all of these publications are considered the gold
standard in using data visualization in the news. They
will all tell you the same thing, which is that if a
data visualization is well designed, and it covers a
topic that the public is interested in obviously, it will
become extremely, extremely, extremely popular. I
mean, some of the most popular pieces of content
published in the past decade by some of these media
publications have been data visualization.
Alberto Cairo: The most popular, and this is a factoid that I usually
talk about in some of the talks in relationship to the
new book, How Charts Lie, one of the things I say is
that the most popular piece of content ever published
by the newyorktimes.com, the New York Times is the
most important, most serious newspaper in the United
States, and one of the most important newspapers in
the world, the most popular piece of content ever
published by the New York Times online is a data
visualization.
Kirill Eremenko: Oh wow.
Alberto Cairo: It's a data visualization that is commonly called ...
Yeah, it's commonly called the dialect map. You can
Google it up. The dialect map, New York Times. The
actual title is How You, Y'all and Youse Guys Talk, or
something like that. I don't remember exactly what the
title is, but everybody knows it as the dialect map.
Basically it's a tool that asks you several questions.
How do you pronounce this word in English? Or how
do you refer to this particular phenomenon or this
particular animal in English? What word do you use
for that? Based on your responses to the questions
that are posed to you, basically what you start seeing
is a bunch of maps that predict where you'll probably
live or where you're from, right? Based on some of
your-
Kirill Eremenko: In the United States, right?
Alberto Cairo: In the United States, although recently they created a
version for the United Kingdom.
Kirill Eremenko: Oh wow.
Alberto Cairo: Yeah. It's a lot of fun. That project is the ... The
reasons why this project is so popular or viral it has to
do with how interesting the topic is, but also because
it's a visually ... it's a visual tool, right? And it is so
well designed and so well done, and it's the most
popular piece of content ever published by the
Newyorktimes.com.
Kirill Eremenko: Why would you say people like visualization so much?
Alberto Cairo: Well, I mean it appeals to us, visualization, because
first of all, it's visual and we are visual creatures. We
prefer to see things rather than to read things. We've
basically evolved to be visual creatures. I mean, a huge
part of our brain is devoted to processing visual
information. Then another virtue of data
visualization is that, as I mentioned before, I mean, it's
persuasive and it's memorable when it is very well
designed. The way that I usually put this in talks and
in the new book is that if I did a visualization which
was well designed and it reveals certain insights
coming from the data, once you see those insights, you
cannot unsee them anymore. Basically they stick to
your brain. It's like they are very memorable. That's
another reason. Visualization is much more
memorable if it is well designed, right? Sometimes
than text alone, right?
Alberto Cairo: By the way, visualization is not just visualizing things,
visualization is very often the combination of visuals
with words that supplement those visuals, right? The
best data visualizations are usually combinations of
visual objects with words that reinforce each other. We
call that the annotation layer in the world of data
visualization. They are also beautiful objects, right?
We human beings like beauty, beautiful, to see
beautiful objects and enjoy, right? Good visualizations
are highly enjoyable. There's maybe a bunch of
reasons. This may be just some of them, a few of them.
Kirill Eremenko: They trigger an emotion, right? Like that example-
Alberto Cairo: Yeah, yeah, absolutely.
Kirill Eremenko: ... that you gave about the hockey stick, right?
Alberto Cairo: They can be joyful, right? As very common ... we say
commonly these days, they may spark joy, right?
Kirill Eremenko: Yeah.
Alberto Cairo: They-
Kirill Eremenko: Or they can terrify, right? You have that-
Alberto Cairo: They can terrify you. I mean, they can terrify you, they
can surprise you, they can ... I don't know. They can
be emotional. The same way that a good text can be,
right? Texts can also elicit emotion sometimes, but
there is something more visceral, something more
direct in the use of visual objects to do that.
Kirill Eremenko: Yeah. Therefore, because a visual creates this imprint
and creates ... As you said, if you see it, you cannot
unsee it, it's a bit dangerous or sometimes sad when
visuals are, as you put it, either misused or
misinterpreted, and people see the wrong thing or are
shown the wrong thing and therefore now they cannot
unsee the wrong thing, and that creates a whole
[crosstalk 00:20:46]
Alberto Cairo: It's so persuasive, so powerful that they can overpower
you, all right? They can basically become means that
controls your thoughts. That's a whole other reason
why I wrote this new book, right? To basically warn
people about how careful we need to be when reading
visualizations, right? There are many examples of that.
One example that I have in the book is ... which I use
by the way to explain one of the core principles of
reading data visualization, which is that when you see
a data visualization, one of the key things that you
need to do is to come up with the right description of
what you are seeing, right? I do this with a scatter plot,
which I borrow from a friend of mine, Heather Krause,
who is a statistician. It's a scatter plot that shows the
positive association, a positive correlation, between
cigarette consumption and life expectancy, country by
country. When you take a look at the country level
data, the association between cigarette consumption
per capita and life expectancy is positive, right?
Kirill Eremenko: Wow.
Alberto Cairo: Imagine this scatter plot. Now the way that I, that you
would describe, that we commonly describe that kind
of chart, and I know this because I have done this
myself, is to say if you see the x-axis, cigarette
consumption per capita and the y axis, the vertical
axis, life expectancy per capita, and you see that one
of them is positively correlated with the other, the way
that we usually describe that kind of chart is, the more
cigarettes we consume the longer we live, right? But
that is not the right description. If you describe the
chart like that, you are biasing people's perception of
that chart and you are biasing your own perception of
that chart. Because what you're only maybe
considering is that you're looking at the data
aggregated at the national level, and that can be very
dangerous because it could be an example of
Simpson's paradox, right?
Alberto Cairo: The phenomenon that data that gets aggregated at
a certain level may display patterns that may disappear
or reverse completely once you disaggregate the data
at lower levels of aggregation. It's a perfect example to
explain these phenomena. I do this in the book.
Because once you disaggregate the data at the regional
level, at the local level, and you go down to the
individual level, you will see that the relationship that
was positive before, more cigarettes more life
expectancy, reverses completely; more cigarettes, less
life expectancy. Why the reversal? The reversal is
related to wealth, right? The wealthier a country is, the
more cigarettes people in that country can consume.
The wealthier a country is, the more cigarettes per
capita you have. But at the same time, the wealthier a
country is, the higher the life expectancy is as well,
because people can pay for better health care, right?
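The reversal described above can be reproduced with a small Python sketch that is not from the episode. All numbers below are invented for illustration and are not real statistics: three hypothetical countries of increasing wealth, where richer countries smoke more and live longer on average, while within each country the people who smoke more live shorter lives. The names `pearson`, `countries`, `within_country`, and `between_countries` are assumptions of this sketch.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical individuals per country: cigarettes per day and lifespan.
# Within every country, each extra cigarette lowers lifespan.
countries = {
    "Low income":  {"cigs": [0, 2, 4],   "life": [60, 59, 58]},
    "Mid income":  {"cigs": [4, 6, 8],   "life": [70, 69, 68]},
    "High income": {"cigs": [8, 10, 12], "life": [80, 79, 78]},
}

# Disaggregated view: one correlation per country.
within_country = {
    name: pearson(d["cigs"], d["life"]) for name, d in countries.items()
}

# Aggregated view: one point per country (the national averages).
avg_cigs = [sum(d["cigs"]) / len(d["cigs"]) for d in countries.values()]
avg_life = [sum(d["life"]) / len(d["life"]) for d in countries.values()]
between_countries = pearson(avg_cigs, avg_life)

print("country-level correlation:", round(between_countries, 2))
for name, r in within_country.items():
    print(f"within {name}: {round(r, 2)}")
```

In this constructed data the country-level correlation is positive and every within-country correlation is negative, so the sign of the association depends entirely on the level of aggregation being plotted.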
Alberto Cairo: Basically what you're seeing there is a spurious
correlation between the ... Well, it's not really
spurious. The correlation really exists, but it only
exists at the national level, not at the individual level,
which is the level that you are interested in. If you
want to know, for example, whether smoking cigarettes
is good for you, you should not look at data at the
national level because the correlation that you see at
the national level may not reproduce at the individual
level.
Kirill Eremenko: Got you. At the national level, every point on the chart
is a country, at the individual level-
Alberto Cairo: Exactly. Every [inaudible 00:24:10] at individual level.
So the x-axis is cigarette consumption, the y-axis will
be life expectancy, and just see a positive association.
The more cigarette ... the bigger the cigarette
consumption is, the further to the right a point is, the
further up the point needs to be as well.
Kirill Eremenko: Yeah, yeah. No, that's very interesting. Or that other ...
There was another example in one of your talks I had
in my mind just now, that had the same thing that if
you ... it depends on how you interpret it, right? How
you ... Oh, the chocolate and Nobel prize winners. That's
the example.
Alberto Cairo: Yeah, the chocolate and [crosstalk 00:24:51]
Kirill Eremenko: Can you tell us about that?
Alberto Cairo: Right.
Kirill Eremenko: I love that example.
Alberto Cairo: Yeah, that's an example that I don't use in How Charts
Lie. I use it in the previous book, in The Truthful Art.
Basically it's like, if you take a look at a scatter plot,
it's very similar. Imagine a scatter plot at the
national level, each dot is a country. Then on the x-axis,
you plot chocolate consumption per capita. So
the farther to the right a country is, the more
chocolate per capita that country consumes, and then
the further up on the y-scale, on the vertical axis, the
larger the number of Nobel prizes per ten million
people you have. There is a very strong positive
association, a correlation. It's linear. It's a linear
association between chocolate consumption per capita
and Nobel prizes per capita ... per ten million people,
right? The more chocolate consumption ... The bigger
the chocolate consumption, the bigger, the larger the
number of Nobel prizes.
Alberto Cairo: But obviously you cannot infer that there's a
relationship between those two things. That's the first
thing, right? The classic correlation is not causation,
right? But we need to go beyond that, right? The
correlation is not causation is a mantra that we have
been repeating for decades now, and it's basic
knowledge, it's an elementary knowledge. We need to
keep repeating it because it's very easy to infer
causation based on some mere correlation, but we
need to go beyond that, and that's what I need to ... I
try to do in the new book; explaining concepts again
such as Simpson's paradox or the ecological fallacy,
right? The ecological fallacy being inferring
something about yourself, for example, based on data
that is aggregated at the national level or the regional
level, right?
Alberto Cairo: You cannot infer something about yourself, whether
cigarette consumption is good for you, right?
Individually, based on data that you're seeing at the
national level, right? Because there may be
confounding variables that you're not taking into
consideration, for example, wealth in this particular
case. I am emphasizing all of these examples so much
in our conversation today and also in the new book,
because this is a mistake that I have made myself,
because I was careless about data, right? Describing
the cigarette consumption chart or life expectancy
chart as the more we smoke, the longer we live. Well,
that's not true. The way to describe a scatter plot
showing the positive correlation between cigarette
consumption and life expectancy would be to say that
there is a positive association between cigarette
consumption and life expectancy, but that doesn't
mean that one of the variables causes the other, and
this relationship may disappear once we start
disaggregating the data.
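The aggregation point can be made concrete with a small numerical sketch. All figures below are invented for illustration and do not come from the episode or the book: across three hypothetical countries, average cigarette consumption and life expectancy rise together, while inside each country the people who smoke more live shorter lives. Wealth plays the role of the confounding variable.

```python
from statistics import mean

# Hypothetical data, invented for illustration only.
# Each country holds (cigarettes_per_day, life_expectancy) pairs for individuals.
countries = {
    "poor":   [(0, 62), (5, 60), (10, 58)],
    "middle": [(5, 72), (10, 70), (15, 68)],
    "rich":   [(10, 82), (15, 80), (20, 78)],
}

def pearson(pairs):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Aggregated view: one point per country (the national averages).
national_averages = [
    (mean(x for x, _ in people), mean(y for _, y in people))
    for people in countries.values()
]
aggregated_r = pearson(national_averages)

# Disaggregated view: the correlation inside each country.
within_r = {name: pearson(people) for name, people in countries.items()}

print("aggregated:", round(aggregated_r, 2))
print("within:", {name: round(r, 2) for name, r in within_r.items()})
```

In this toy data the national-level association is strongly positive and every within-country association is strongly negative, which is why a country-level chart says nothing reliable about an individual.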
Alberto Cairo: We need to warn people about these kinds of
phenomena when we present it to them. At the same
time, readers of charts need to be prepared not to just
look at the graphic and move away, but to read the
graphic carefully and think about the chart because if
you don't pay attention to the chart, right, you will
probably be misled by the chart, you will draw the
wrong inferences from it. Charts, maps, graphs, et
cetera, they are not meant to be seen, they are meant
to be read like a piece of text. You need to read them
and think about them carefully. Right? Otherwise, you
would probably be misled by them.
Kirill Eremenko: Got you, and I really like what you say about why
people misinterpret charts and how we can ... what is
missing in that puzzle. When you talk about the four
kinds of literacy, so the normal literacy as in reading,
the one we're used to-
Alberto Cairo: Reading and writing.
Kirill Eremenko: Yeah. Articulacy, numeracy-
Alberto Cairo: Articulacy.
Kirill Eremenko: Numeracy and the graphicacy. Do you mind telling us
a bit about those?
Alberto Cairo: Sure. Sure, sure.
Kirill Eremenko: What are the last two?
Alberto Cairo: Yeah. These are not terms that I have invented. They
have been around for many, many years. I learned
about all these in books such as Innumeracy, which is
a very famous book about how to interpret numbers
correctly, and also a book called Mapping It Out, by a
cartographer called Mark Monmonier. In Mapping It
Out, Monmonier says that, and I agree with that, that
in order to consider ourselves educated citizens
nowadays, we need to be able to do more than just
merely read and write. That's basic literacy, right? We
need that obviously.
Alberto Cairo: We cannot abandon that obviously. But we also need
articulacy, which is the ability to express ourselves
correctly through spoken words. On top of that, we
need numeracy. Numeracy is basically the elementary
skill, being able to think critically about numbers. I
usually equate it, compare it to some sort of sixth
sense in the back of your brain, that it starts ringing
when you see a number in news media that doesn't
sound right.
Kirill Eremenko: Like a BS meter, a bullshit meter.
Alberto Cairo: Yes. Yeah, but it's not conscious. It's sort of a sixth
sense, that you see a number in the media and say,
"There is something dubious about this number.
There's something wrong about it. I don't know what it
is, but it doesn't sound right." That's numeracy at
work. Numeracy is a skill that can be developed. You
can be educated in that, right? You don't need to
become a statistician or data scientist to have
elementary numeracy. Right? Obviously if you want to
become really, really numerate, it is better if you
formally study statistics and data science. But I've
come to believe that any regular citizen, like myself,
I'm not a statistician, I'm a journalist and a graphic
designer. I have come to believe that any citizen can
educate themselves, herself or himself in basic
numeracy.
Alberto Cairo: Then on top of that, you have graphicacy, which is
graphical literacy, right? The ability to interpret, to
read and interpret correctly maps and charts and
graphs and any sort of visual that represents the
numbers, right? How to extract the right meaning from
them, and it all begins with attention. You need to
basically put yourself in the frame of mind that says
that what you're seeing is not an illustration, it's a
visual argument. In order to understand that visual
argument, you need to pay attention to it, right? Then
you need to apply some elementary principles of chart
reading that I explain in the book and in talks, et
cetera, such as don't read too much into a chart. A
chart shows only what it shows and nothing else,
right? Because we tend to project what we want to
believe onto the charts that we see every day in news
media and that's very, very dangerous, right?
Alberto Cairo: Double check the sources. Where did the data come
from? Right? You need to ask yourself whether the
numbers that are displayed on the chart are
measuring what they say that they are measuring.
This is another critical thing to do sometimes, right?
So is it measuring the right thing, and what methods
were used to measure these particular phenomena?
Right? These things don't take longer than five or 10
minutes, and they can take you a long way to avoid
most of the cases in which you can be misled by a
chart that you see in news media.
Kirill Eremenko: Yeah. I really liked a lot your principles of graphical
literacy. Definitely, it's not something that is taught
at school. If you don't mind, let's go over them. I think
they'll get a lot of value. Maybe starting with the
foundational one that you call as number zero, is your
data measuring what you think it's measuring?
Alberto Cairo: Yeah, [crosstalk 00:32:14] measuring what you think
they're measuring. Yes.
Kirill Eremenko: That's a very important question, right? Have you seen
examples of when charts are created-
Alberto Cairo: Oh, yeah.
Kirill Eremenko: ... with the wrong data?
Alberto Cairo: Yeah, I have seen. I have seen examples of charts
measuring the wrong thing and saying that they are
measuring the right thing. Yeah. I don't know. But, for
example ... I don't know, not adjusting for inflation, for
example. Right? How many times have we seen stories
in news media saying, "The latest Marvel movie, the
latest superhero movie, is the highest grossing movie
of all time." Right? Then you take a look at the data
and you realize that the data is not adjusted for
inflation. That statement is not true obviously,
because you're basically using the absolute values,
when you should be using the adjusted values in order
to make that comparison. That happens all the time,
and sometimes we don't pay enough attention, and
therefore we are misled by those charts. Right? I have
plenty of examples of this in the book. The one that is
most popular with people in conferences is that I
once saw a map displaying the number of heavy metal
bands all over Europe-
Kirill Eremenko: Oh, yes. That one.
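The inflation point made a moment earlier can be sketched in a few lines. Everything here is hypothetical: the movie titles, the grosses, and the price index values are invented, and a real comparison would use an official consumer price index.

```python
# Hypothetical figures: titles, grosses, and price index values are invented.
movies = [
    {"title": "Old Classic", "year": 1980, "gross_millions": 400},
    {"title": "New Superhero", "year": 2019, "gross_millions": 900},
]

# Hypothetical price index (arbitrary base); stands in for a real CPI series.
price_index = {1980: 100, 2019: 310}

def adjust(gross, year, target_year=2019):
    """Express a gross from `year` in `target_year` money."""
    return gross * price_index[target_year] / price_index[year]

# Ranking by absolute (nominal) values.
nominal_top = max(movies, key=lambda m: m["gross_millions"])["title"]

# Ranking by inflation-adjusted values.
real_top = max(
    movies, key=lambda m: adjust(m["gross_millions"], m["year"])
)["title"]

print("highest nominal gross:", nominal_top)
print("highest adjusted gross:", real_top)
```

With these made-up numbers the newer film wins on absolute values, but the older one wins once both are expressed in the same year's money, so "highest grossing of all time" depends entirely on which measure the chart uses.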
Alberto Cairo: Yeah, you saw that in the talk. That's a good chart by
the way. It's not a bad chart. But it's an example of
how to double check the source, because I actually
double checked the source in that particular case,
because when I saw the map, number of heavy metal
bands per million people per country, I asked myself,
"Well, what is the source of this chart calling heavy
metal? Are all the bands that they are counting really
heavy metal, or do they belong to other musical genres,
et cetera?" Before tweeting the map and popularizing
the map in social media, I actually went to the
source and made sure that they are actually counting
... that they had a more or less strict definition of what
heavy metal is. Obviously it's very hard to define, but
you can set some boundaries in there and basically
assess whether they are counting heavy metal bands,
or they are also including ... I don't know, pop or rock
bands or hard rock bands that are not really, really
heavy metal.
Alberto Cairo: I took a look at the source. I use these fun examples
in talks and in the book to explain to people how
important it is to spend at least one minute or a
couple of minutes double checking that, verifying that,
before you put that chart that you have seen in social
media in your own feed, for example. Because the
chart may be wrong, and if the chart is wrong, then
what you're doing is spreading misinformation, right?
We should ... We all have a responsibility as citizens
not to spread misinformation, or at least try not to
spread misinformation. We all make mistakes, right?
We all spread misinformation, but if we only spent one
minute or two thinking about what we are seeing, it
will be less likely that we will spread misinformation
among our peers, or family or friends in social media.
Kirill Eremenko: Yeah. That's a common problem these days in the
world we live in, where people just catch onto
something they hear and they start spreading it. It's
very evident, for instance, in the political space where
something happens and people think it's really bad,
they start spreading, and they don't know the full
story, they don't know what actually happened. Then
when the full story emerges, it's completely different,
and now all this defamation has already
happened. People are calling each other-
Alberto Cairo: Look, it happens to all of us. This is something that I
make very clear in talks and in the book, it happens to
all of us. It has happened to me, it will keep happening
to me in the future. However, it is less likely that it will
happen to me today than it was, say, five years ago or
10 years ago. Right? It was more likely before just
because now I'm a little bit more conscious about how
I consume media, how prone I am to be misled by
numbers or by stories or by charts. I try to be a little
bit more careful, and if we all try to be a little bit more
careful, we would not be able to avoid 100% of
problems or cases in which we may be misled by a
number or by a chart, but if we only avoid, say half of
them, that means half less misinformation around
there, right?
Kirill Eremenko: Yeah. With the hard rock bands, as far as I remember
from your talk, they had Bon Jovi in that ...
Alberto Cairo: No, they didn't. No, they didn't. That's the key thing.
That's what I explained in the talk and also in the
book, that the reason why I double checked the source
of that chart is that if you look into the literature
about the history of heavy metal or even if you go to
the Wikipedia page about heavy metal, you will see
that there are some bands mentioned in there for which
it is a little bit dubious that they are heavy metal.
For example, I think that the Wikipedia page mentions
Poison, which is a glam rock band from the '80s and
'90s.
Alberto Cairo: I doubt that that band can be really called heavy
metal. It's like if you ... I mean, heavy metal, what is ...
Heavy metal is Metallica, or ... I don't know, Slayer
or Judas Priest and all these bands, or Iron Maiden,
right? Poison is a fine rock band, but it's certainly not
heavy metal, I would say. They don't mention Bon
Jovi. None of these bands that I have sometimes seen
being categorized as heavy metal appear in
the source. I mean, the source only counts all the
subgenres of heavy metal.
Kirill Eremenko: Yeah. I guess that's your journalistic investigative
mind. It's interesting to see you coming from a
journalism background because then you can apply
this curiosity, this investigative approach to digging in
and being ... double checking all the facts. How would
you say that somebody can just develop that without
being a journalist, without the background that you
have?
Alberto Cairo: Through practice. It's also practice. As I said before, I
mean, I am a little bit better at doing this today than I
was say 10 years ago. The way that I wrote both How
Charts Lie and my previous book, The Truthful Art,
was trying to remember how I was 10 years ago or 15
years ago. What didn't I know 10 or 15 years ago that I
should have known? I try to basically summarize all
that into some key principles. Take a look at the
source, ask yourself whether the source is counting
what they said that they're counting, make sure that
the data is displayed in correct scales, that they are
not distorting the scales of the chart. Ask yourself
whether the chart that you're seeing is showing
sufficient or insufficient information, right?
Alberto Cairo: Is it showing the right amount of detail in order for you
to figure out what's going on, right? Try not to
project your own beliefs onto the chart that you are
seeing because a chart shows only what it shows and
nothing else. Be really, really careful because we are
prone, all prone to doing that, right? Try to curb your
own impulses a little bit to see your own views
confirmed by the data that you are seeing. Take a look
at whether the patterns that you are ... that the chart
is displaying are really there or not, right? You'll ask
yourself, be a little bit more attentive. Only by doing
that, as I said before, you will not be able to avoid all
cases in which you may be misled by a chart, but you
will avoid many, and by doing that you will become a
better chart reader.
Kirill Eremenko: Or creator, right? That's-
Alberto Cairo: Or a creator, right.
Kirill Eremenko: ... very important as well.
Alberto Cairo: Yeah. Yeah. It's very important as well, because many
of these problems or many of the mistakes that we
make when reading charts, they're very common, even
among practitioners like myself, like journalists or
graphic designers, et cetera, that sometimes we are a
little bit careless with the data that we handle. I speak
based on my own experience. I mean, I take a look
back, 10, 15 years ago and I see some charts that if it
were today, I would have never had [inaudible
00:40:23] such as pie charts in 3D and with shadows
and shades or highlights and things like that that
totally distorted the data, or scatter plot, the one that I
mentioned before, which I described as the more
cigarettes we consume the longer we live. But no,
that's the wrong description for that chart. That's not
how to describe that chart, because that's not what
the chart is showing and so on and so forth.
Kirill Eremenko: Got you. I'll probably, here, jump to your fifth
principle of graphical literacy because it fits in really
well. When you build visualizations, you recommend to
build narratives and text into visualization. Specifically, I
really liked what you said about beginning with the text,
have ... rather than just starting to throw visualization
together, once you know what you want to display,
think of a long sentence that will describe
visualization, and then break it down into pieces and
visualize that. Could you tell us a bit more about this
approach, please?
Alberto Cairo: Sure. Sure, sure. But before I do that, I need to also
emphasize that visualization can be used with multiple
purposes in mind. When you take a look, for example,
at the classical cycle of data science diagram, right?
That you can read about in books such as Hadley
Wickham's R for Data Science and many others.
Visualization comes in two different steps in that cycle,
because visualization can be used to either explore
data and discover things from the data, and we call
that exploratory data analysis, obviously. Right? It can
also be used to communicate your findings, right?
What I specialize in is the second use of
visualization. I'm not an expert in exploratory data
analysis, right? There are many people who work in
these fields, people who work in scientific
visualization, and in data science, and specialize in
visualization for exploration.
Alberto Cairo: What I specialize in is in helping scientists and other
kinds of experts in communicating the results. When
you already know what you want to say, once you have
come out with the conclusions of your study, and you
want to communicate those conclusions, how you do
it. Then when I teach these principles to specialists, I
describe that technique that you have just mentioned,
that this is a little trick that I learned throughout the
years, to never begin with the visualization itself but
always begin with a very long description of what you
want to say, right? An elevator speech, or what you
want to describe.
Alberto Cairo: This is not a technique that I have invented. I need to
credit the sources for this technique because I
shamelessly stole it from some friends of mine. I heard
about this technique from Juan Velasco, who used to
be the graphics director at National Geographic
Magazine, he's a friend of mine, and also Javier
Zarracina, who is the graphics director at vox.com,
both long time visualization designers. Very, very
talented, very nice people.
Kirill Eremenko: Both from Spain, right?
Alberto Cairo: Both from Spain, yeah. There's some sort of Spanish
Mafia in the world of visualization in journalism. They
are both from Spain, yes. Anyway, they both described
this technique one day in a conference that I attended,
a couple of conferences that I attended, and it all
begins by writing a very long sentence of what you
want to say. What is the story? What is the narrative
that you're trying to convey? Right? Begin always with
that, begin with a very long sentence. Oh, my study
focuses on this and that, I discovered this and that, the
exceptions are this and that, the limitations are this
and that, and you write a very long sentence about
that, and my conclusions are such and such, and
possible alternative explanations may be such and
such.
Alberto Cairo: You'll begin with a very long sentence, and then what
you do is to split up that sentence into its natural
components. You try to find the natural breaks in that
sentence, and then you split it up into four, five, six
different components. Each one of those components
may become the headline of a different section in your
visualization or in your scientific poster or in your
whatever it is that you're writing, your article, right?
Those will be the main themes, the main topics in your
design, and they may become the titles of the sections
for your design. Then what you do is to design the
visualizations that support the assertions that you're
making in those pieces of the sentence, right? You put
your visualizations underneath each one of the pieces
of the sentence. By doing that, you're basically, first of
all, providing the elevator speech itself.
Alberto Cairo: If people don't want to really dig very deeply into your
visualizations, they can still read the long sentence
because the long sentence is after all the headlines
over your sections, so they can get away ... they can
just read that, right? And get the gist of your story.
But then, if they want to really double check whether
what you are saying is right or not, they can take a
look at your visualizations, at your charts or graphs,
your maps, whatever visualizations that you're
designing.
Kirill Eremenko: That's a very powerful approach. On top of that, I
would like to probably talk a little bit about building
narrative into visualization. In this day and age, one
thing is just to create one image, which can be very
useful and insightful, but sometimes and more often
we see these infographics that combine multiple
images and a whole story behind them. In one of your
talks, I really enjoyed that whole story you built
around the population of Brazil as you were doing
some research or visualization on how the population
of Brazil has changed from 2000 to 2010. But then
once you added additional charts about the fertility
rate, you were able to tell a much clearer story. If you
don't mind, could you tell us a bit about that and how
that played out and the whole thing-
Alberto Cairo: Yes.
Kirill Eremenko: ... behind that?
Alberto Cairo: Yeah, [inaudible 00:46:15] It's actually quite weird to
do a podcast about visualization because you need to
verbally describe the chart. But this is an example that
appears in my first book, The Functional Art, and it's a
story that I published when I was working for a media
organization in Brazil. I lived in Brazil for a few years.
We published this very large poster about population
pattern changes in Brazil. It's a story made of several
graphics, and the first thing that you see is basically a
map and a bunch of bar graphs that show you the
population increase, between 2000 and 2010, right?
Basically, the population of Brazil increased
everywhere, right? At the national level, at the regional
level, at the local level, with some exceptions. There
are several regions that lost population rather than
gaining population. But in general, the population of
Brazil grew between the two years.
Alberto Cairo: Well, that's interesting per se, right? But we decided to
start, in collaboration with demographers ... I rarely do
these kinds of projects alone because I'm not an expert
on anything, right? In collaboration with
demographers and some political scientists, we started
digging a little bit deeper into the data provided by the
Brazilian Census Bureau. One critical piece of data
that appeared in the news releases that we were getting
and the data that we were getting, is that Brazil's
fertility rate, which is the number of children per
woman in a country, was strangely or surprisingly
different from what was expected, right? When you
think about fertility rates, when you think about rich
nations, for example, rich nations tend to have low
fertility rates, right? If you think about Germany or
Spain or whatever, western nations, relatively high
income in general, they tend to have fertility rates that
are around 1.5 children per woman, 1.8 children per
woman and so on and so forth.
Alberto Cairo: They are relatively low. If you go to very poor nations,
right? For example, Afghanistan or Yemen, fertility
rates are very high, five children per woman, six
children per woman. Some African nations also have
very high fertility rates. I think that Nigeria is around
four right now. That's the average, right? Then if you
go to the middle of the spectrum, middle of the income
spectrum countries such as Brazil for example, right?
Fertility rates are usually between 2.5 and three point
something children per woman. That's the benchmark
of these kinds of nations, right? But when you take a
look at the data, that is not true. I mean, the fertility
rate of Brazil, if you ask Brazilians themselves, right? I
know this because I did it. If you ask Brazilian
journalists, what do you think is the current
fertility rate of Brazil, you will get numbers such as 2.5
or three children per woman.
Alberto Cairo: Just because we have this idea of Brazil in mind as a
nation that is still in development, right? Or a nation
that is still very poor, and certainly there's a high
degree of poverty in Brazil, but that is not true over the
entirety of the country. Brazil is a continent, right?
When you take a look at the data, you will discover
that fertility rates in Brazil have dropped very
dramatically in the past 50 years, and the current
fertility rate of Brazil is around 1.8 children per
woman. That was a second piece of content that we
put in that poster that we designed. Because
obviously, if you have such a low fertility rate, 1.8,
that's below the replacement rate. The replacement
rate is the minimum number of children per woman, or
fertility rate, that a country needs to have in order to
keep the population stable.
Alberto Cairo: If your fertility rate drops below 2.1, which is this
magical number, right? Your population will become
older, and will start shrinking in the future, just
because you do not have enough children. If your
fertility rate drops below that number, your population
will become older, and in the future will start
shrinking. If you ask Brazilian demographers about
future population patterns in Brazil, they will tell you
that, that Brazil's population is predicted to become
older and to start shrinking around 2030 or something
like that. That's a problem. Why? Because well, Brazil
has a public health care system, it has retirement,
obviously public social security like the United States.
These population patterns would put a lot of pressure
on Brazil's public finances. How can you face that?
Well, there are several things that political scientists
have recommended to face this future situation.
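The replacement-rate reasoning can be illustrated with a deliberately crude toy model. It assumes that each generation's size scales by the ratio of the fertility rate to the 2.1 replacement rate quoted above, and it ignores migration, mortality changes, and age structure (which is why a real population, as in Brazil's case, can keep growing for a while after fertility drops). The function name and the starting size of 100 are hypothetical.

```python
REPLACEMENT_RATE = 2.1  # children per woman, the figure quoted in the conversation

def project_generations(start_size, fertility_rate, generations):
    """Toy model: successive generation sizes under a constant fertility rate."""
    sizes = [start_size]
    for _ in range(generations):
        # Each generation is scaled by how far fertility sits from replacement.
        sizes.append(sizes[-1] * fertility_rate / REPLACEMENT_RATE)
    return sizes

below = project_generations(100.0, 1.8, 3)   # the rate cited for Brazil
above = project_generations(100.0, 2.5, 3)   # the rate people tend to guess

print([round(s, 1) for s in below])
print([round(s, 1) for s in above])
```

Under these assumptions a rate of 1.8 shrinks every successive generation and a rate of 2.5 grows it, which is the qualitative point of the 2.1 threshold.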
Alberto Cairo: If you think about it, what I have done over here is
basically to use the technique that I explained before.
My very long sentence would be, "Brazil's population
has grown bigger but fertility rate is way below
expected. As a consequence of this, Brazil's
population will become older, and it will start
shrinking in the future. This will be a problem. Here's
how to face these problems." That's a very long
sentence. You split it up into its components, and then
you pair each one of these headlines, these little
titles, with the graphics that show the evidence for the
assertion that you're making. What we did was to use
maps and bar graphs to show population change, a line
chart to show the drop of fertility rates in Brazil in
comparison to other countries all over the world.
Alberto Cairo: We used a population pyramid to compare Brazil's
population today versus Brazil's population based on
age groups in 2050. A line chart to show Brazil's
population growing but then starting to shrink in
2030, and so on and so forth. Basically, it's a good
example, I believe, to illustrate how this narrative
principle works, right? It doesn't work always, but
when it does, when you can structure your
information this way, it can be really, really powerful.
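The long-sentence technique described above can be sketched as a small data structure. The headlines and chart types come from the Brazil example in the conversation; the `Section` class, the function names, and the exact places where the sentence is broken are assumptions made for illustration. No chart is named in the episode for the last two pieces, so they are left empty.

```python
from dataclasses import dataclass, field

@dataclass
class Section:
    headline: str                               # one piece of the long sentence
    charts: list = field(default_factory=list)  # evidence placed under it

sections = [
    Section("Brazil's population has grown bigger",
            ["maps and bar graphs of population change, 2000 to 2010"]),
    Section("but fertility rate is way below expected.",
            ["line chart of fertility rates versus other countries"]),
    Section("As a consequence of this, Brazil's population will become older,",
            ["population pyramid by age group, today versus 2050"]),
    Section("and it will start shrinking in the future.",
            ["line chart of population growing, then shrinking around 2030"]),
    Section("This will be a problem."),
    Section("Here's how to face these problems."),
]

def elevator_speech(parts):
    """Reading only the headlines gives back the long sentence."""
    return " ".join(p.headline for p in parts)

def outline(parts):
    """Headlines with their supporting charts listed underneath."""
    lines = []
    for p in parts:
        lines.append(p.headline)
        lines.extend("  - " + chart for chart in p.charts)
    return lines

print(elevator_speech(sections))
print("\n".join(outline(sections)))
```

A reader who skims only the headlines still gets the whole argument, and a reader who wants to verify a claim finds the chart directly beneath it.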
Kirill Eremenko: It also takes care of the audience, because if you just
showed a chart where you're showing how the
population of Brazil grows from 2000 to 2010, people
might ... even though the chart's showing the correct
insights, people might misinterpret it and extrapolate
that the population is going to keep growing, and by
2020, it's going to-
Alberto Cairo: Or they may miss important features of the data,
right? That's why I emphasized before, the importance
of using text in data visualization. Again, we call this
the annotation layer in data visualization. Let's say
that you are doing a line chart showing progress in
sales in your company, and there is a sudden spike in
a particular point in time, you better put an
annotation in there because otherwise people will
wonder, why is there this spike over here? What's
going on? Because you need to try to explain it. Put an
annotation in there, right? That annotation layer is
really, really relevant in data visualization. Pairing,
again. Pairing the visuals with the copy, with the texts
that you can write to emphasize the important points
in the data, to supplement the data a little bit, to
reinforce the main messages that you're trying to
convey, or to avoid misinterpretations, right? Also, to
avoid misinterpretations of the data that you're
presenting.
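The sales-spike example can be sketched with a plain-text chart. The monthly figures and the explanation attached to the spike are hypothetical. In a plotting library the same idea would be expressed with that library's annotation facility (matplotlib's `Axes.annotate` is one such call); plain Python is used here only to keep the sketch self-contained.

```python
# Hypothetical monthly sales with one sudden spike in April.
sales = {"Jan": 10, "Feb": 11, "Mar": 12, "Apr": 30, "May": 13, "Jun": 14}

# The annotation layer: short notes keyed by the data point they explain.
annotations = {"Apr": "one-off bulk order (hypothetical cause)"}

def render(series, notes):
    """Draw a text bar chart and attach any note next to its bar."""
    lines = []
    for label, value in series.items():
        bar = "#" * value
        note = f"  <- {notes[label]}" if label in notes else ""
        lines.append(f"{label} {bar} {value}{note}")
    return lines

chart = render(sales, annotations)
print("\n".join(chart))
```

Without the note, a reader is left to guess why April stands out; with it, the chart answers the question at the point where it arises.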
Kirill Eremenko: In that sense, I really like the grammar of graphics,
how they describe the multiple layers of
visualization. Multiple, starting from the axes all the
way to different colors and including annotation. Once
you understand, basically as they call it in the book,
the grammar of graphics, it really helps-
Alberto Cairo: The layer-
Kirill Eremenko: Layers.
Alberto Cairo: [inaudible 00:53:45] grammar of graphics. Yeah. This
is another one of those concepts that I try to explain to
the general public in the new book, in How Charts Lie.
I talk about the grammar of graphics. Obviously, I do it
in a much less technical way than Leland Wilkinson
did in his famous book, The Grammar of Graphics, or
Hadley Wickham does when talking about ggplot2, but
I still describe it. I still teach this principle in the new
book.
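To make the idea of layers a little more concrete, here is a toy Python sketch. It is not the API of ggplot2 or of Wilkinson's system. The `Chart` class and the layer names are hypothetical and only mirror the layers mentioned above (axes, colors, annotation); the sketch shows a chart being built by adding one layer at a time.

```python
from dataclasses import dataclass, field


@dataclass
class Chart:
    """A chart described as an ordered stack of (kind, detail) layers."""
    layers: list = field(default_factory=list)

    def __add__(self, layer):
        # Adding a layer returns a new Chart, leaving the original untouched.
        return Chart(self.layers + [layer])

    def describe(self):
        """Render the stack as one readable line, in the order layers were added."""
        return " + ".join(f"{kind}({detail})" for kind, detail in self.layers)


# Hypothetical layers, loosely following the ones named in the conversation.
chart = (
    Chart()
    + ("data", "sales_by_month")
    + ("axes", "x=month, y=sales")
    + ("geometry", "line")
    + ("color", "by region")
    + ("annotation", "note on a sales spike")
)
print(chart.describe())
```

Treating each layer as a separate, addable piece is the design choice worth noticing: the annotation is just one more layer on the stack, so it can be added or changed without touching the data or the axes.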
Kirill Eremenko: Definitely. That's very interesting. Unfortunately, we
won't have time to go into the rest of the principles of
graphical literacy. For our listeners, if you'd like to
learn more about them, I highly recommend picking
up Alberto's book, which is available on pre-order,
right Alberto?
Alberto Cairo: Yeah, it's already available everywhere for pre-order;
Amazon, Barnes & Noble, independent bookstores. It's
basically everywhere. [crosstalk 00:54:38] Yeah. Yeah,
it comes out on October the 15th, but yeah, you can
order it now.
Kirill Eremenko: Guys, girls, go get that book. It's going to be epic. I'm
definitely going to pick up a copy. In the remaining five
or so minutes, I wanted to just quickly touch on
something I'd love to get your opinion on, and that is
ethics in visualization. We already spoke a little bit
about being conscious about what you reshare, how
you read charts and double check the data behind
that, and I think with how we're moving more into a
technological world, with more and more screens
around us, with soon wearable devices and things like
that, ethics is going to be super important. What is
your stance on ethics in visualization? What
recommendations can you give to practitioners
listening to this?
Alberto Cairo: Oh wow. That would take another entire book to talk
about. I may write about that in the future. I have that
in the pipeline, to write a book about how to handle
data, and particularly when you are going to visualize
it. I don't have fully formed thoughts at the moment
because again, I may use this new book to think
clearly about these sorts of principles. But there's lots
of people writing about these things already, not from
the point of view of visualization but more from the
point of view of data science in general. I'm a
bookworm, I would like to recommend books. I would
recommend, for example, Cathy O'Neil's Weapons of
Math Destruction. I think that is a good introduction
to thinking about the implications of the data that we
handle every day, how to handle it carefully, clearly,
and ethically. I think that is a good introduction to
that.
Alberto Cairo: If you like something a little bit more controversial and
aggressive, which I really, really enjoy and that makes
you think, even if you disagree with the book
sometimes, because it's so aggressive, I would really,
really, really recommend Mike Monteiro's new book.
His new book, I believe, is called Ruined by Design. It
has the word design in the title, but it's a book about
data science. It's a book about technologists,
how technologists gather data, how the data is
handled or mishandled, how careful we need to be
with the tools that we create and think about the
possible consequences of the tools that we create and
that we put out for the public to use, and so on and so
forth. Mike is a very passionate speaker. He's also a
very passionate writer.
Alberto Cairo: Again, you may not agree with everything that he says
in the book, but it's one of those books that even if you
disagree with it sometimes, it makes you think deeply,
and it makes you stop and think, "Is this guy right?
Am I doing things correctly?" Ethics begins with that;
with doubt. With doubting your own decisions
and making ... have a dialogue with the book itself.
The book makes you think clearly. Those are two of my
favorite books to start thinking about how to use data
ethically, and visualization as an extension of that.
But there are many others. For example, Meredith
Broussard, she has a book titled Artificial
Unintelligence, which I really enjoyed. This is by MIT
Press if I'm not wrong.
Alberto Cairo: Virginia Eubanks, she has another book titled
Automating Inequality, which is about how algorithms
may promote or may perpetuate societal inequality.
That's another book that made me think. Again, none
of them covers visualization or graphics in general, but
you cannot understand visualizations separately from
the data that visualization is representing. Any book or
any thoughts about the ethics in data visualization,
necessarily needs to begin with thinking about the
data themselves.
Kirill Eremenko: Well, totally, I love it. You're definitely a bookworm.
That's so many interesting books that I've just been
writing down. Yeah, now I'm very curious about this
one, Ruined by Design by Mike Monteiro.
Alberto Cairo: You should really read it. I mean, it will make you feel
angry sometimes, I think, but for a very, very good
reason. I think that he makes a very good case. I
think.
Kirill Eremenko: That's wonderful. Well, on that note, Alberto, thank
you so much for coming on the show, sharing all your
insights. It's been a huge pleasure. Before I let you go,
what are some of the best ways to get in touch with your
work? Of course, in addition, or apart from purchasing
your book, which I highly recommend to everybody if
you love this podcast, go and get Alberto's new book,
How Charts Lie. In addition to that, what are some
other ways that people can follow you and get access
to all these great things that you're creating?
Alberto Cairo: Sure. The best ways, I use Twitter quite a lot. My
handle is very easy to remember, it's my first name and
last name. So it's Alberto Cairo, @AlbertoCairo. You
can find me on Twitter. I'm also on Facebook and on
LinkedIn. I'm on both LinkedIn and Facebook, but I
use Twitter most of the time, as a way to promote
things that other people do, graphics that other people
design, articles that I read, papers that I have
discovered, books that I'm reading, whatever. I use it
as a platform to share, basically, things that I see and
that I enjoy. I also have a web blog. The web blog is the
title of my first book, The Functional Art. It's
thefunctionalart.com. That's my web blog, and that's
the platform that I use to write a little bit more
extensively about things that I see or so. Those are the
best ways, I would say.
Kirill Eremenko: Got you.
Alberto Cairo: My email address is very easy to find on any of these
platforms.
Kirill Eremenko: Fantastic. Also, everybody listening, Alberto, you have
a huge 45 and a half thousand followers on Twitter.
Yeah, it's a great community to be part of, I guess, to
follow-
Alberto Cairo: Yeah, and it's a-
Kirill Eremenko: [crosstalk 01:00:36] his insights.
Alberto Cairo: It's a fun community, as well. There is one virtue that
the visualization community has, which is that it's
very welcoming to newcomers. If you want to get
started in data visualization, you just need to basically
get started. Start designing your graphics, putting it
out there, asking people for advice, asking people for
feedback, and most people, or 99.9% of the people who
I know in the visualization community are very
constructive, welcoming, friendly, and it's a great
community to work in.
Kirill Eremenko: For sure. I find that to be true across all of data
science. It's surprisingly, and inspiringly so, such
a wonderful community of helpful-
Alberto Cairo: Yeah, absolutely.
Kirill Eremenko: ... people.
Alberto Cairo: The [inaudible 01:01:21] community is very similar to
the visualization, as far as I have seen. Yeah.
Kirill Eremenko: Fantastic. Well, once again, Alberto, thank you so
much for coming on the show and sharing all these
amazing insights. Super, super excited to chat, and
good luck with the book launch and with all the touring that
you're going to do in a couple of months from now.
Alberto Cairo: Thank you so much for having me again. It was a
pleasure.
Kirill Eremenko: There you have it, ladies and gentlemen. Thank you so
much for being part of today's episode of the
SuperDataScience podcast. That was Alberto Cairo.
What an epic person. What an epic expert in the space
of data visualization. I got a ton from this podcast, got
so many takeaways, and I hope you did too. Just from
this conversation, you can tell the depth of thinking
that Alberto puts into his visualizations. You're going
to find all of the infographics that we talked about in
the show notes for this episode at
www.superdatascience.com/271. That's
superdatascience.com/271, and just have a look
through them. Look at, for instance, the cigarettes
versus life expectancy, or the Brazil visualization that
we were talking about, or the Nobel Prize and
chocolates visualizations.
Kirill Eremenko: Just look at all of these different visualizations that
you'll find there, and notice the depth of thinking that
went into creating them, and you will recognize a lot of
the things that Alberto was actually talking about on
this podcast, from understanding if your data is
measuring the thing that you want it to be
measuring and that you think it's measuring, to
building narratives and creating a narrative structure
in your visualization and conveying those insights in a
certain way so that people can better understand
them. Also, if you see Alberto's visualizations on the
Internet, you'll find that they're definitely very
persuasive and very memorable. Of course, if you enjoyed
this podcast, make sure to pick up Alberto's new book,
which is called How Charts Lie: Getting Smarter about
Visual Information. It's coming out in October 2019,
but you can already pick up a copy now. You can pre-
order a copy on Amazon or Barnes & Noble, on
Amazon UK, or wherever you're shopping for your
books.
Kirill Eremenko: Highly recommend putting in a pre-order so that you
get it fresh once it's live. What I really like about
this book, as Alberto described it, is that it's for the
general public, and that means if you're not that
deep into data visualization, you're going to get a great
head start. But if you're already a data scientist, and
you're already visualizing a lot of things and you're
pretty experienced in this space, it will help you see
visualization through the eyes of your audience, and
understand what kind of issues they're going through,
what kind of challenges they're facing. I think it's a
very valuable skill to empathize with the people that
you're creating this for, for your audience. That can be
very, very powerful.
Kirill Eremenko: Of course, as usual, if you know anybody who can
benefit from this podcast, somebody who's interested
in visualization, somebody who's a fan of Alberto
Cairo, or somebody who's dabbling on the verge of
getting into visualization or not, send them this
podcast, give them this gift of insights into what the
world of data visualization's all about, and you might
even help them change their lives, change their
careers and progress forward. Share the love, share
this link, superdatascience.com/271, with anybody
who you think could benefit from it. On that note,
thank you so much for being here today, make sure to
follow Alberto on Twitter and any other social media,
and I look forward to seeing you back here next time.
Until then, happy analyzing.