Data Visualisation
A Handbook for Data Driven Design
Andy Kirk
SAGE Publications Ltd
1 Oliver’s Yard
55 City Road
London EC1Y 1SP
SAGE Publications Inc.
2455 Teller Road
Thousand Oaks, California 91320
SAGE Publications India Pvt Ltd
B 1/I 1 Mohan Cooperative Industrial Area
Mathura Road
New Delhi 110 044
SAGE Publications Asia-Pacific Pte Ltd
3 Church Street
#10-04 Samsung Hub
Singapore 049483
© Andy Kirk 2016
First published 2016
Apart from any fair dealing for the purposes of research or private study,
or criticism or review, as permitted under the Copyright, Designs and
Patents Act, 1988, this publication may be reproduced, stored or
transmitted in any form, or by any means, only with the prior permission
in writing of the publishers, or in the case of reprographic reproduction, in
accordance with the terms of licences issued by the Copyright Licensing
Agency. Enquiries concerning reproduction outside those terms should be
sent to the publishers.
Library of Congress Control Number: 2015957322
British Library Cataloguing in Publication data
A catalogue record for this book is available from the British Library
ISBN 978-1-4739-1213-7
ISBN 978-1-4739-1214-4 (pbk)
Editor: Mila Steele
Editorial assistant: Alysha Owen
Production editor: Ian Antcliff
Marketing manager: Sally Ransom
Cover design: Shaun Mercier
Typeset by: C&M Digitals (P) Ltd, Chennai, India
Printed and bound in Great Britain by Bell and Bain Ltd, Glasgow
Contents
List of Figures with Source Notes
Acknowledgements
About the Author
INTRODUCTION
PART A FOUNDATIONS
1 Defining Data Visualisation
2 Visualisation Workflow
PART B THE HIDDEN THINKING
3 Formulating Your Brief
4 Working With Data
5 Establishing Your Editorial Thinking
PART C DEVELOPING YOUR DESIGN SOLUTION
6 Data Representation
7 Interactivity
8 Annotation
9 Colour
10 Composition
PART D DEVELOPING YOUR CAPABILITIES
11 Visualisation Literacy
References
Index
List of Figures with Source Notes
1.1 A Definition for Data Visualisation 19
1.2 Per Capita Cheese Consumption in the U.S., by Sarah Slobin
(Fortune magazine) 20
1.3 The Three Stages of Understanding 22
1.4–6 Demonstrating the Process of Understanding 24–27
1.7 The Three Principles of Good Visualisation Design 30
1.8 Housing and Home Ownership in the UK, by ONS Digital
Content Team 33
1.9 Falling Number of Young Homeowners, by the Daily Mail 33
1.10 Gun Deaths in Florida (Reuters Graphics) 34
1.11 Iraq’s Bloody Toll, by Simon Scarr (South China Morning Post)
34
1.12 Gun Deaths in Florida Redesign, by Peter A. Fedewa
(@pfedewa) 35
1.13 If Vienna would be an Apartment, by NZZ (Neue Zürcher
Zeitung) [Translated] 45
1.14 Asia Loses Its Sweet Tooth for Chocolate, by Graphics
Department (Wall Street Journal) 45
2.1 The Four Stages of the Visualisation Workflow 54
3.1 The ‘Purpose Map’ 76
3.2 Mizzou’s Racial Gap Is Typical On College Campuses, by
FiveThirtyEight 77
3.3 Image taken from ‘Wealth Inequality in America’, by YouTube
user ‘Politizane’ (www.youtube.com/watch?v=QPKKQnijnsM) 78
3.4 Dimensional Changes in Wood, by Luis Carli (luiscarli.com) 79
3.5 How Y’all, Youse and You Guys Talk, by Josh Katz (The New
York Times) 80
3.6 Spotlight on Profitability, by Krisztina Szücs 81
3.7 Countries with the Most Land Neighbours 83
3.8 Buying Power: The Families Funding the 2016 Presidential
Election, by Wilson Andrews, Amanda Cox, Alicia DeSantis, Evan
Grothjan, Yuliya Parshina-Kottas, Graham Roberts, Derek Watkins
and Karen Yourish (The New York Times) 84
3.9 Image taken from ‘Texas Department of Criminal Justice’
Website
(www.tdcj.state.tx.us/death_row/dr_executed_offenders.html) 86
3.10 OECD Better Life Index, by Moritz Stefaner, Dominikus Baur,
Raureif GmbH 89
3.11 Losing Ground, by Bob Marshall, The Lens, Brian Jacobs and
Al Shaw (ProPublica) 89
3.12 Grape Expectations, by S. Scarr, C. Chan, and F. Foo (Reuters
Graphics) 91
3.13 Keywords and Colour Swatch Ideas from Project about
Psychotherapy Treatment in the Arctic 92
3.14 An Example of a Concept Sketch, by Giorgia Lupi of Accurat 92
4.1 Example of a Normalised Dataset 99
4.2 Example of a Cross-tabulated Dataset 100
4.3 Graphic Language: The Curse of the CEO, by David Ingold and
Keith Collins (Bloomberg Visual Data), Jeff Green (Bloomberg
News) 101
4.4 US Presidents by Ethnicity (1789 to 2015) 114
4.5 OECD Better Life Index, by Moritz Stefaner, Dominikus Baur,
Raureif GmbH 116
4.6 Spotlight on Profitability, by Krisztina Szücs 117
4.7 Example of ‘Transforming to Convert’ Data 119
4.8 Making Sense of the Known Knowns 123
4.9 What Good Marathons and Bad Investments Have in Common,
by Justin Wolfers (The New York Times) 124
5.1 The Fall and Rise of U.S. Inequality, in Two Graphs Source:
World Top Incomes Database; Design credit: Quoctrung Bui (NPR)
136
5.2–4 Why Peyton Manning’s Record Will Be Hard to Beat, by
Gregor Aisch and Kevin Quealy (The New York Times) 138–140
C.1 Mockup Designs for ‘Poppy Field’, by Valentina D’Efilippo
(design); Nicolas Pigelet (code); Data source: The Polynational War
Memorial, 2014 (poppyfield.org) 146
6.1 Mapping Records and Variables on to Marks and Attributes 152
6.2 List of Mark Encodings 153
6.3 List of Attribute Encodings 153
6.4 Bloomberg Billionaires, by Bloomberg Visual Data (Design and
development), Lina Chen and Anita Rundles (Illustration) 155
6.5 Lionel Messi: Games and Goals for FC Barcelona 156
6.6 Image from the Home page of visualisingdata.com 156
6.7 How the Insane Amount of Rain in Texas Could Turn Rhode
Island Into a Lake, by Christopher Ingraham (The Washington Post)
156
6.8 The 10 Actors with the Most Oscar Nominations but No Wins
161
6.9 The 10 Actors who have Received the Most Oscar Nominations
162
6.10 How Nations Fare in PhDs by Sex Interactive, by Periscopic;
Research by Amanda Hobbs; Published in Scientific American 163
6.11 Gender Pay Gap US, by David McCandless, Miriam Quick
(Research) and Philippa Thomas (Design) 164
6.12 Who Wins the Stanley Cup of Playoff Beards? by Graphics
Department (Wall Street Journal) 165
6.13 For These 55 Marijuana Companies, Every Day is 4/20, by Alex
Tribou and Adam Pearce (Bloomberg Visual Data) 166
6.14 UK Public Sector Capital Expenditure, 2014/15 167
6.15 Global Competitiveness Report 2014–2015, by Bocoup and the
World Economic Forum 168
6.16 Excerpt from a Rugby Union Player Dashboard 169
6.17 Range of Temperatures (°F) Recorded in the Top 10 Most
Populated Cities During 2015 170
6.18 This Chart Shows How Much More Ivy League Grads Make
Than You, by Christopher Ingraham (The Washington Post) 171
6.19 Comparing Critics Scores (Rotten Tomatoes) for Major Movie
Franchises 172
6.20 A Career in Numbers: Movies Starring Michael Caine 173
6.21 Comparing the Frequency of Words Used in Chapter 1 of this
Book 174
6.22 Summary of Eligible Votes in the UK General Election 2015
175
6.23 The Changing Fortunes of Internet Explorer and Google Chrome
176
6.24 Literacy Proficiency: Adult Levels by Country 177
6.25 ‘Political Polarization in the American Public’, Pew Research
Center, Washington, DC (February, 2015) (http://www.people-press.org/2014/06/12/political-polarization-in-the-american-public/)
178
6.26 Finviz (www.finviz.com) 179
6.27 This Venn Diagram Shows Where You Can Both Smoke Weed
and Get a Same-Sex Marriage, by Philip Bump (The Washington
Post) 180
6.28 The 200+ Beer Brands of SAB InBev, by Maarten Lambrechts
for Mediafin: www.tijd.be/sabinbev (Dutch),
www.lecho.be/service/sabinbev (French) 181
6.29 Which Fossil Fuel Companies are Most Responsible for Climate
Change? by Duncan Clark and Robin Houston (Kiln), published in
the Guardian, drawing on work by Mike Bostock and Jason Davies
182
6.30 How Long Will We Live – And How Well? by Bonnie
Berkowitz, Emily Chow and Todd Lindeman (The Washington Post)
183
6.31 Crime Rates by State, by Nathan Yau 184
6.32 Nutrient Contents – Parallel Coordinates, by Kai Chang
(@syntagmatic) 185
6.33 How the ‘Avengers’ Line-up Has Changed Over the Years, by
Jon Keegan (Wall Street Journal) 186
6.34 Interactive Fixture Molecules, by @experimental361 and
@bootifulgame 187
6.35 The Rise of Partisanship and Super-cooperators in the U.S.
House of Representatives. Visualisation by Mauro Martino, authored
by Clio Andris, David Lee, Marcus J. Hamilton, Mauro Martino,
Christian E. Gunning, and John Armistead Selde 188
6.36 The Global Flow of People, by Nikola Sander, Guy J. Abel and
Ramon Bauer 189
6.37 UK Election Results by Political Party, 2010 vs 2015 190
6.38 The Fall and Rise of U.S. Inequality, in Two Graphs. Source:
World Top Incomes Database; Design credit: Quoctrung Bui (NPR)
191
6.39 Census Bump: Rank of the Most Populous Cities at Each
Census, 1790–1890, by Jim Vallandingham 192
6.40 Coal, Gas, Nuclear, Hydro? How Your State Generates Power.
Source: U.S. Energy Information Administration, Credit: Christopher
Groskopf, Alyson Hurt and Avie Schneider (NPR) 193
6.41 Holdouts Find Cheapest Super Bowl Tickets Late in the Game,
by Alex Tribou, David Ingold and Jeremy Diamond (Bloomberg
Visual Data) 194
6.42 Crude Oil Prices (West Texas Intermediate), 1985–2015 195
6.43 Percentage Change in Price for Select Food Items, Since 1990,
by Nathan Yau 196
6.44 The Ebb and Flow of Movies: Box Office Receipts 1986–2008,
by Matthew Bloch, Lee Byron, Shan Carter and Amanda Cox (The
New York Times) 197
6.45 Tracing the History of N.C.A.A. Conferences, by Mike Bostock,
Shan Carter and Kevin Quealy (The New York Times) 198
6.46 A Presidential Gantt Chart, by Ben Jones 199
6.47 How the ‘Avengers’ Line-up Has Changed Over the Years, by
Jon Keegan (Wall Street Journal) 200
6.48 Native and New Berliners – How the S-Bahn Ring Divides the
City, by Julius Tröger, André Pätzold, David Wendler (Berliner
Morgenpost) and Moritz Klack (webkid.io) 201
6.49 How Y’all, Youse and You Guys Talk, by Josh Katz (The New
York Times) 202
6.50 Here’s Exactly Where the Candidates Cash Came From, by Zach
Mider, Christopher Cannon, and Adam Pearce (Bloomberg Visual
Data) 203
6.51 Trillions of Trees, by Jan Willem Tulp 204
6.52 The Racial Dot Map. Image Copyright, 2013, Weldon Cooper
Center for Public Service, Rector and Visitors of the University of
Virginia (Dustin A. Cable, creator) 205
6.53 Arteries of the City, by Simon Scarr (South China Morning
Post) 206
6.54 The Carbon Map, by Duncan Clark and Robin Houston (Kiln)
207
6.55 Election Dashboard, by Jay Boice, Aaron Bycoffe and Andrei
Scheinkman (Huffington Post). Statistical model created by Simon
Jackman 208
6.56 London is Rubbish at Recycling and Many Boroughs are Getting
Worse, by URBS London using London Squared Map © 2015
www.aftertheflood.co 209
6.57 Automating the Design of Graphical Presentations of Relational
Information. Adapted from Mackinlay, J. D. (1986). ACM
Transactions on Graphics, 5(2), 110–141. 213
6.58 Comparison of Judging Line Size vs Area Size 213
6.59 Comparison of Judging Related Items Using Variation in Colour
(Hue) vs Variation in Shape 214
6.60 Illustrating the Correct and Incorrect Circle Size Encoding 216
6.61 Illustrating the Distortions Created by 3D Decoration 217
6.62 Example of a Bullet Chart using Banding Overlays 218
6.63 Excerpt from What’s Really Warming the World? by Eric
Roston and Blacki Migliozzi (Bloomberg Visual Data) 218
6.64 Example of Using Markers Overlays 219
6.65 Why Is Her Paycheck Smaller? by Hannah Fairfield and Graham
Roberts (The New York Times) 219
6.66 Inside the Powerful Lobby Fighting for Your Right to Eat Pizza,
by Andrew Martin and Bloomberg Visual Data 220
6.67 Excerpt from ‘Razor Sales Move Online, Away From Gillette’,
by Graphics Department (Wall Street Journal) 220
7.1 US Gun Deaths, by Periscopic 225
7.2 Finviz (www.finviz.com) 226
7.3 The Racial Dot Map: Image Copyright, 2013, Weldon Cooper
Center for Public Service, Rector and Visitors of the University of
Virginia (Dustin A. Cable, creator) 227
7.4 Obesity Around the World, by Jeff Clark 228
7.5 Excerpt from ‘Social Progress Index 2015’, by Social Progress
Imperative, 2015 228
7.6 NFL Players: Height & Weight Over Time, by Noah Veltman
(noahveltman.com) 229
7.7 Excerpt from ‘How Americans Die’, by Matthew C. Klein and
Bloomberg Visual Data 230
7.8 Model Projections of Maximum Air Temperatures Near the
Ocean and Land Surface on the June Solstice in 2014 and 2099:
NASA Earth Observatory maps, by Joshua Stevens 231
7.9 Excerpt from ‘A Swing of Beauty’, by Sohail Al-Jamea, Wilson
Andrews, Bonnie Berkowitz and Todd Lindeman (The Washington
Post) 231
7.10 How Well Do You Know Your Area? by ONS Digital Content
team 232
7.11 Excerpt from ‘Who Old Are You?’, by David McCandless and
Tom Evans 233
7.12 512 Paths to the White House, by Mike Bostock and Shan Carter
(The New York Times) 233
7.13 OECD Better Life Index, by Moritz Stefaner, Dominikus Baur,
Raureif GmbH 233
7.14 Nobel Laureates, by Matthew Weber (Reuters Graphics) 234
7.15 Geography of a Recession, by Graphics Department (The New
York Times) 234
7.16 How Big Will the UK Population be in 25 Years Time? by ONS
Digital Content team 234
7.17 Excerpt from ‘Workers’ Compensation Reforms by State’, by
Yue Qiu and Michael Grabell (ProPublica) 235
7.18 Excerpt from ‘ECB Bank Test Results’, by Monica Ulmanu,
Laura Noonan and Vincent Flasseur (Reuters Graphics) 236
7.19 History Through the President’s Words, by Kennedy Elliott, Ted
Mellnik and Richard Johnson (The Washington Post) 237
7.20 Excerpt from ‘How Americans Die’, by Matthew C. Klein and
Bloomberg Visual Data 237
7.21 Twitter NYC: A Multilingual Social City, by James Cheshire,
Ed Manley, John Barratt, and Oliver O’Brien 238
7.22 Killing the Colorado: Explore the Robot River, by Abrahm
Lustgarten, Al Shaw, Jeff Larson, Amanda Zamora and Lauren
Kirchner (ProPublica) and John Grimwade 238
7.23 Losing Ground, by Bob Marshall, The Lens, Brian Jacobs and
Al Shaw (ProPublica) 239
7.24 Excerpt from ‘History Through the President’s Words’, by
Kennedy Elliott, Ted Mellnik and Richard Johnson (The Washington
Post) 240
7.25 Plow, by Derek Watkins 242
7.26 The Horse in Motion, by Eadweard Muybridge. Source: United
States Library of Congress’s Prints and Photographs division, digital
ID cph.3a45870. 243
8.1 Titles Taken from Projects Published and Credited Elsewhere in
This Book 248
8.2 Excerpt from ‘The Color of Debt: The Black Neighborhoods
Where Collection Suits Hit Hardest’, by Al Shaw, Annie Waldman
and Paul Kiel (ProPublica) 249
8.3 Excerpt from ‘Kindred Britain’ version 1.0 © 2013 Nicholas
Jenkins – designed by Scott Murray, powered by SUL-CIDR 249
8.4 Excerpt from ‘The Color of Debt: The Black Neighborhoods
Where Collection Suits Hit Hardest’, by Al Shaw, Annie Waldman
and Paul Kiel (ProPublica) 250
8.5 Excerpt from ‘Bloomberg Billionaires’, by Bloomberg Visual
Data (Design and development), Lina Chen and Anita Rundles
(Illustration) 251
8.6 Excerpt from ‘Gender Pay Gap US?’, by David McCandless,
Miriam Quick (Research) and Philippa Thomas (Design) 251
8.7 Excerpt from ‘Holdouts Find Cheapest Super Bowl Tickets Late
in the Game’, by Alex Tribou, David Ingold and Jeremy Diamond
(Bloomberg Visual Data) 252
8.8 Excerpt from ‘The Life Cycle of Ideas’, by Accurat 252
8.9 Mizzou’s Racial Gap Is Typical On College Campuses, by
FiveThirtyEight 253
8.10 Excerpt from ‘The Infographic History of the World’, Harper
Collins (2013); by Valentina D’Efilippo (co-author and designer);
James Ball (co-author and writer); Data source: The Polynational War
Memorial, 2012 254
8.11 Twitter NYC: A Multilingual Social City, by James Cheshire,
Ed Manley, John Barratt, and Oliver O’Brien 255
8.12 Excerpt from ‘US Gun Deaths’, by Periscopic 255
8.13 Image taken from Wealth Inequality in America, by YouTube
user ‘Politizane’ (www.youtube.com/watch?v=QPKKQnijnsM) 256
9.1 HSL Colour Cylinder: Image from Wikimedia Commons
published under the Creative Commons Attribution-Share Alike 3.0
Unported license 265
9.2 Colour Hue Spectrum 265
9.3 Colour Saturation Spectrum 266
9.4 Colour Lightness Spectrum 266
9.5 Excerpt from ‘Executive Pay by the Numbers’, by Karl Russell
(The New York Times) 267
9.6 How Nations Fare in PhDs by Sex Interactive, by Periscopic;
Research by Amanda Hobbs; Published in Scientific American 268
9.7 How Long Will We Live – And How Well? by Bonnie
Berkowitz, Emily Chow and Todd Lindeman (The Washington Post)
268
9.8 Charting the Beatles: Song Structure, by Michael Deal 269
9.9 Photograph of MyCuppa mug, by Suck UK
(www.suck.uk.com/products/mycuppamugs/) 269
9.10 Example of a Stacked Bar Chart Based on Ordinal Data 270
9.11 Rim Fire – The Extent of Fire in the Sierra Nevada Range and
Yosemite National Park, 2013: NASA Earth Observatory images, by
Robert Simmon 270
9.12 What are the Current Electricity Prices in Switzerland
[Translated], by Interactive Things for NZZ (the Neue Zürcher
Zeitung) 271
9.13 Excerpt from ‘Obama’s Health Law: Who Was Helped Most’,
by Kevin Quealy and Margot Sanger-Katz (The New York Times) 272
9.14 Daily Indego Bike Share Station Usage, by Randy Olson
(@randal_olson)
(http://www.randalolson.com/2015/09/05/visualizing-indego-bikeshare-usage-patterns-in-philadelphia-part-2/) 272
9.15 Battling Infectious Diseases in the 20th Century: The Impact of
Vaccines, by Graphics Department (Wall Street Journal) 273
9.16 Highest Max Temperatures in Australia (1st to 14th January
2013), Produced by the Australian Government Bureau of
Meteorology 274
9.17 State of the Polar Bear, by Periscopic 275
9.18 Excerpt from Geography of a Recession by Graphics
Department (The New York Times) 275
9.19 Fewer Women Run Big Companies Than Men Named John, by
Justin Wolfers (The New York Times) 276
9.20 NYPD, Council Spar Over More Officers by Graphics
Department (Wall Street Journal) 277
9.21 Excerpt from a Football Player Dashboard 277
9.22 Elections Performance Index, The Pew Charitable Trusts © 2014
278
9.23 Art in the Age of Mechanical Reproduction: Walter Benjamin by
Stefanie Posavec 279
9.24 Casualties, by Stamen, published by CNN 279
9.25 First Fatal Accident in Spain on a High-speed Line [Translated],
by Rodrigo Silva, Antonio Alonso, Mariano Zafra, Yolanda Clemente
and Thomas Ondarra (El Pais) 280
9.26 Lunge Feeding, by Jonathan Corum (The New York Times);
whale illustration by Nicholas D. Pyenson 281
9.27 Examples of Common Background Colour Tones 281
9.28 Excerpt from NYC Street Trees by Species, by Jill Hubley 284
9.29 Demonstrating the Impact of Red-green Colour Blindness
(deuteranopia) 286
9.30 Colour-blind Friendly Alternatives to Green and Red 287
9.31 Excerpt from ‘Psychotherapy in The Arctic’, by Andy Kirk 289
9.32 Wind Map, by Fernanda Viégas and Martin Wattenberg 289
10.1 City of Anarchy, by Simon Scarr (South China Morning Post)
294
10.2 Wireframe Sketch, by Giorgia Lupi for ‘Nobels no degree’ by
Accurat 295
10.3 Example of the Small Multiples Technique 296
10.4 The Glass Ceiling Persists Redesign, by Francis Gagnon
(ChezVoila.com) based on original by S. Culp (Reuters Graphics)
297
10.5 Fast-food Purchasers Report More Demands on Their Time, by
Economic Research Service (USDA) 297
10.6 Stalemate, by Graphics Department (Wall Street Journal) 297
10.7 Nobels No Degrees, by Accurat 298
10.8 Kasich Could Be The GOP’s Moderate Backstop, by
FiveThirtyEight 298
10.9 On Broadway, by Daniel Goddemeyer, Moritz Stefaner,
Dominikus Baur, and Lev Manovich 299
10.10 ER Wait Watcher: Which Emergency Room Will See You the
Fastest? by Lena Groeger, Mike Tigas and Sisi Wei (ProPublica) 300
10.11 Rain Patterns, by Jane Pong (South China Morning Post) 300
10.12 Excerpt from ‘Psychotherapy in The Arctic’, by Andy Kirk 301
10.13 Gender Pay Gap US, by David McCandless, Miriam Quick
(Research) and Philippa Thomas (Design) 301
10.14 The Worst Board Games Ever Invented, by FiveThirtyEight
303
10.15 From Millions, Billions, Trillions: Letters from Zimbabwe,
2005−2009, a book written and published by Catherine Buckle
(2014), table design by Graham van de Ruit (pg. 193) 303
10.16 List of Chart Structures 304
10.17 Illustrating the Effect of Truncated Bar Axis Scales 305
10.18 Excerpt from ‘Doping under the Microscope’, by S. Scarr and
W. Foo (Reuters Graphics) 306
10.19 Record-high 60% of Americans Support Same-sex Marriage,
by Gallup 306
10.20 Images from Wikimedia Commons, published under the
Creative Commons Attribution-Share Alike 3.0 Unported license 308
11.1–7 ‘The Pursuit of Faster’, by Andy Kirk and Andrew Witherley
318–324
Acknowledgements
This book has been made possible thanks to the unwavering support of my
incredible wife, Ellie, and the endless encouragement from my Mum and
Dad, the rest of my brilliant family and my super group of friends.
From a professional standpoint I also need to acknowledge the
fundamental role played by the hundreds of visualisation practitioners (no
matter under what title you ply your trade) who have created such a wealth
of brilliant work from which I have developed so many of my convictions
and formed the basis of so much of the content in this book. The people
and organisations who have provided me with permission to use their work
are heroes and I hope this book does their rich talent justice.
About the Author
Andy Kirk is a freelance data visualisation specialist based in Yorkshire,
UK. He is a visualisation design consultant, training provider, teacher,
researcher, author, speaker and editor of the award-winning website
visualisingdata.com.
After graduating from Lancaster University in 1999 with a BSc
(hons) in Operational Research, Andy held a variety of business
analysis and information management positions at organisations
including West Yorkshire Police and the University of Leeds.
He discovered data visualisation in early 2007 just at the time when
he was shaping up his proposal for a Master’s (MA) Research
Programme designed for members of staff at the University of Leeds.
On completing this programme with distinction, Andy’s passion for
the subject was unleashed. Following his graduation in December
2009, to continue the process of discovering and learning the subject
he launched visualisingdata.com, a blogging platform that would
chart the ongoing development of the data visualisation field. Over
time, as the field has continued to grow, the site too has reflected this,
becoming one of the most popular in the field. It features a wide
range of fresh content profiling the latest projects and contemporary
techniques, discourse about practical and theoretical matters,
commentary about key issues, and collections of valuable references
and resources.
In 2011 Andy became a freelance professional focusing on data
visualisation consultancy and training workshops. Some of his clients
include CERN, Arsenal FC, PepsiCo, Intel, Hershey, the WHO and
McKinsey. At the time of writing he has delivered over 160 public
and private training events across the UK, Europe, North America,
Asia, South Africa and Australia, reaching well over 3000 delegates.
In addition to training workshops Andy also has two academic
teaching positions. He joined the highly respected Maryland Institute
College of Art (MICA) as a visiting lecturer in 2013 and has been
teaching a module on the Information Visualisation Master’s
Programme since its inception. In January 2016, he began teaching a
data visualisation module as part of the MSc in Business Analytics at
the Imperial College Business School in London.
Between 2014 and 2015 Andy was an external consultant on a
research project called ‘Seeing Data’, funded by the Arts &
Humanities Research Council and hosted by the University of
Sheffield. This study explored the issues of data visualisation literacy
among the general public and, among many things, helped to shape
an understanding of the human factors that affect visualisation
literacy and the effectiveness of design.
Introduction
I.1 The Quest Begins
In his book The Seven Basic Plots, author Christopher Booker investigated
the history of telling stories. He examined the structures used in biblical
teachings and historical myths through to contemporary storytelling
devices used in movies and TV. From this study he found seven common
themes that, he argues, can be identified in any form of story.
One of these themes was ‘The Quest’. Booker describes this as revolving
around a main protagonist who embarks on a journey to acquire a
treasured object or reach an important destination, but faces many
obstacles and temptations along the way. It is a theme that I feel shares
many characteristics with the structure of this book and the nature of data
visualisation.
You are the central protagonist in this story in the role of the data
visualiser. The journey you are embarking on involves a route along a
design workflow where you will be faced with a wide range of different
conceptual, practical and technical challenges. The start of this journey
will be triggered by curiosity, which you will need to define in order to
accomplish your goals. From this origin you will move forward to
initiating and planning your work, defining the dimensions of your
challenge. Next, you will begin the heavy lifting of working with data,
determining what qualities it contains and how you might share these with
others. Only then will you be ready to take on the design stage. Here you
will be faced with the prospect of handling a spectrum of different design
options that will require creative and rational thinking to resolve most
effectively.
The multidisciplinary nature of this field offers a unique opportunity and
challenge. Data visualisation is not an especially difficult capability to
acquire; it is largely a game of decisions. Making better decisions will be
your goal but sometimes clear decisions will feel elusive. There will be
occasions when the best choice is not at all visible and others when there
will be many seemingly equally viable choices. Which one to go with? This
book aims to be your guide, helping you navigate efficiently through these
difficult stages of your journey.
You will need to learn to be flexible and adaptable, capable of shifting
your approach to suit the circumstances. This is important because there
are plenty of potential villains lying in wait looking to derail progress.
These are the forces that manifest through the imposition of restrictive
creative constraints and the pressure created by the relentless ticking clock
of timescales. Stakeholders and audiences will present complex human
factors through the diversity of their needs and personal traits. These will
need to be astutely accommodated. Data, the critical raw material of this
process, will dominate your attention. It will frustrate and even disappoint
at times, as promises of its treasures fail to materialise irrespective of the
hard work, love and attention lavished upon it.
Your own characteristics will also contribute to a certain amount of the
villainy. At times, you will find yourself wrestling with internal creative
and analytical voices pulling against each other in opposite directions.
Your excitably formed initial ideas will be embraced but will need taming.
Your inherent tastes, experiences and comforts will divert you away from
the ideal path, so you will need to maintain clarity and focus.
The central conflict you will have to deal with is the notion that there is no
perfect in data visualisation. It is a field with very few ‘always’ and
‘nevers’. Singular solutions rarely exist. The comfort offered by the rules
that instruct what is right and wrong, good and evil, has its limits. You can
find small but legitimate breaking points with many of them. While you
can rightly aspire to reach as close to perfect as possible, the attitude of
aiming for good enough will often indeed be good enough and
fundamentally necessary.
In accomplishing the quest you will be rewarded with competency in data
visualisation, developing confidence in being able to judge the most
effective analytical and design solutions in the most efficient way. It will
take time and it will need more than just reading this book. It will also
require your ongoing effort to learn, apply, reflect and develop. Each new
data visualisation opportunity poses a new, unique challenge. However, if
you keep persevering with this journey the possibility of a happy ending
will increase all the time.
I.2 Who is this Book Aimed at?
The primary challenge one faces when writing a book about data
visualisation is to determine what to leave in and what to leave out. Data
visualisation is big. It is too big a subject even to attempt to cover it all, in
detail, in one book. There is no single book to rule them all because there
is no one book that can cover it all. Each and every one of the topics
covered by the chapters in this book could (and, in several cases, do) exist
as whole books in their own right.
The secondary challenge when writing a book about data visualisation is to
decide how to weave all the content together. Data visualisation is not
rocket science; it is not an especially complicated discipline. Lots of it, as
you will see, is rooted in common sense. It is, however, certainly a
complex subject, a semantic distinction that will be revisited later. There
are lots of things to think about and decide on, as well as many things to
do and make. Creative and analytical sensibilities blend with artistic and
scientific judgments. In one moment you might be checking the statistical
rigour of your calculations, in the next deciding which tone of orange most
elegantly contrasts with an 80% black. The complexity of data
visualisation manifests itself through how these different ingredients, and
many more, interact, influence and intersect to form the whole.
The decisions I have made in formulating this book’s content have been
shaped by my own process of learning about, writing about and practising
data visualisation for, at the time of writing, nearly a decade. Significantly
– from the perspective of my own development – I have been fortunate to
have had extensive experience designing and delivering training
workshops and postgraduate teaching. I believe you only truly learn about
your own knowledge of a subject when you have to explain it and teach it
to others.
I have arrived at what I believe to be an effective and proven pedagogy
that successfully translates the complexities of this subject into accessible,
practical and valuable form. I feel well qualified to bridge the gap between
the large population of everyday practitioners, who might identify
themselves as beginners, and the superstar technical, creative and
academic minds that are constantly pushing forward our understanding of
the potential of data visualisation. I am not going to claim to belong to that
latter cohort, but I have certainly been the former – a beginner – and most
of my working hours are spent helping other beginners start their journey.
I know the things that I would have valued when I was starting out and I
know how I would have wished them to be articulated and presented for
me to develop my skills most efficiently.
There is a large and growing library of fantastic books offering many
different theoretical and practical viewpoints on the subject of data
visualisation. My aim is to bring value to this existing collection of work
by taking on a particular perspective that is perhaps under-represented in
other texts – exploring the notion and practice of a visualisation design
process. As I have alluded to in the opening, the central premise of this
book is that the path to mastering data visualisation is achieved by making
better decisions: effective choices, efficiently made. The book’s central
goal is to help develop your capability and confidence in facing these
decisions.
Just as a single book cannot cover the whole of this subject, it stands to reason that a
single book cannot aim to address directly the needs of all people doing
data visualisation. In this section I am going to run through some of the
characteristics that shape the readers to whom this book is primarily
targeted. I will also put into context the content the book will and will not
cover, and why. This will help manage your expectations as the reader and
establish its value proposition compared with other titles.
Domain and Duties
The core audiences for whom this book has been primarily written are
undergraduate and postgraduate-level students and early career researchers
from social science subjects. This reflects a growing number of people in
higher education who are interested in and need to learn about data
visualisation.
Although aimed at social sciences, the content will also be relevant across
the spectrum of academic disciplines, from the arts and humanities right
through to the formal and natural sciences: any academic duty where there
is an emphasis on the use of quantitative and qualitative methods in studies
will require an appreciation of good data visualisation practices. Where
statistical capabilities are relevant so too is data visualisation.
Beyond academia, data visualisation is a discipline that has reached
mainstream consciousness with an increasing number of professionals and
organisations, across all industry types and sizes, recognising the
importance of doing it well for both internal and external benefit. You
might be a market researcher, a librarian or a data analyst looking to
enhance your data capabilities. Perhaps you are a skilled graphic designer
or web developer looking to take your portfolio of work into a more data-driven direction. Maybe you are in a managerial position and not directly
involved in the creation of visualisation work, but you need to coordinate
or commission others who will be. You require awareness of the most
efficient approaches, the range of options and the different key decision
points. You might be seeking generally to improve the sophistication of
the language you use around commissioning visualisation work and to
have a better way of expressing and evaluating work created for you.
Basically, anyone who is involved in whatever capacity with the analysis
and visual communication of data as part of their professional duties will
need to grasp the demands of data visualisation and this book will go some
way to supporting these needs.
Subject Neutrality
One of the important aspects of the book will be to emphasise that data
visualisation is a portable practice. You will see a broad array of examples
of work from different industries, covering very different topics. What will
become apparent is that visualisation techniques are largely subject-matter
neutral: a line chart that displays the ebb and flow of favourable opinion
towards a politician involves the same techniques as using a line chart to
show how a stock has changed in value over time or how peak
temperatures have changed across a season in a given location. A line
chart is a line chart, regardless of the subject matter. The context of the
viewers (such as their needs and their knowledge) and the specific
meaning that can be drawn will inevitably be unique to each setting, but
the role of visualisation itself is adaptable and portable across all subject
areas.
Data visualisation is an entirely global concern, not focused on any defined
geographic region. Although the English language dominates the written
discourse (books, websites) about this subject, the interest in it and visible
output from across the globe are increasing at a pace. There are cultural
matters that influence certain decisions throughout the design process,
especially around the choices made for colour usage, but otherwise it is a
discipline common to all.
Level and Prerequisites
The coverage of this book is intended to serve the needs of beginners and
those with intermediate capability. For most people, this is likely to be as
far as they might ever need to go. It will offer an accessible route for
novices to start their learning journey and, for those already familiar with
the basics, there will be content that will hopefully contribute to fine-tuning their approaches.
For context, I believe the only distinction between beginner and
intermediate is one of breadth and depth of critical thinking rather than any
degree of difficulty. The more advanced techniques in visualisation tend to
be associated with the use of specific technologies for handling larger,
complex datasets and/or producing more bespoke and feature-rich outputs.
This book is therefore not aimed at experienced or established
visualisation practitioners. There may be some new perspectives to enrich
their thinking, some content that will confirm and other content that might
constructively challenge their convictions. Otherwise, the coverage in this
book should really echo the practices they are likely to be already
observing.
As I have already touched on, data visualisation is a genuinely
multidisciplinary field. The people who are active in this field or
profession come from all backgrounds – everyone has a different entry
point and nobody arrives with all constituent capabilities. It is therefore
quite difficult to define just what are the right type and level of pre-existing knowledge, skills or experiences for those learning about data
visualisation. As each year passes, the savvy-ness of the type of audience
this book targets will increase, especially as the subject penetrates more
into the mainstream. What were seen as bewilderingly new techniques
several years ago are now commonplace to more people.
That said, I think the following would be a fair outline of the type and
shape of some of the most important prerequisite attributes for getting the
most out of this book:
Strong numeracy is necessary as well as a familiarity with basic
statistics.
While it is reasonable to assume limited prior knowledge of data
visualisation, there should be a strong desire to want to learn it. The
demands of learning a craft like data visualisation take time and
effort; the capabilities will need nurturing through ongoing learning
and practice. They are not going to be achieved overnight or acquired
alone from reading this book. Any book that claims to be able
magically to inject mastery through just reading it cover to cover is
over-promising and likely to under-deliver.
The best data visualisers possess inherent curiosity. You should be
the type of person who is naturally disposed to question the world
around them or can imagine what questions others have. Your instinct
for discovering and sharing answers will be at the heart of this
activity.
There are no expectations of your having any prior familiarity with
design principles, but a desire to embrace some of the creative aspects
presented in this book will heighten the impact of your work. Unlock
your artistry!
If you are somebody with a strong creative flair you are very
fortunate. This book will guide you through when and, crucially, when
not to tap into this sensibility. You should be willing to increase the
rigour of your analytical decision making and be prepared to have
your creative thinking informed more fundamentally by data rather
than just instinct.
A range of technical skills covering different software applications,
tools and programming languages is not expected for this book, as I
will explain next, but you will ideally have some knowledge of basic
Excel and some experience of working with data.
I.3 Getting the Balance
Handbook vs Tutorial Book
The description of this book as being a ‘handbook’ positions it as being of
practical help and presented in accessible form. It offers direction with
comprehensive reference – more of a city guidebook for a tourist than an
instruction manual to fix a washing machine. It will help you to know what
things to think about, when to think about them, what options exist and
how best to resolve all the choices involved in any data-driven design.
Technology is the key enabler for working with data and creating
visualisation design outputs. Indeed, apart from a small proportion of
artisan visualisation work that is drawn by hand, the reliance on
technology to create visualisation work is an inseparable necessity. For
many there is an understandable appetite for step-by-step tutorials that help
them immediately to implement data visualisation techniques via existing
and new tools.
However, writing about data visualisation through the lens of selected
tools is a bit of a minefield, given the diversity of technical options out
there and the mixed range of skills, access and needs. I greatly admire
those people who have authored tutorial-based texts because they require
astute judgement about what is the right level, structure and scope.
The technology space around visualisation is characterised by flux. There
are the ongoing changes with the enhancement of established tools as well
as a relatively high frequency of new entrants offset by the decline of
others. Some tools are proprietary, others are open source; some are easier
to learn, others require a great deal of understanding before you can even
consider embarking on your first chart. There are many recent cases of
applications or services that have enjoyed fleeting exposure before
reaching a plateau: development and support decline, the community of
users disperses and there is a certain expiry of value. Deprecation of
syntax and functions in programming languages requires the perennial
updating of skills.
All of this perhaps paints a rather more chaotic picture than is necessarily
the case but it justifies the reasons why this book does not offer teaching in
the use of any tools. While tutorials may be invaluable to some, they may
also only be mildly interesting to others and possibly of no value to most.
Tools come and go but the craft remains. I believe that creating a practical,
rather than necessarily a technical, text that focuses on the underlying craft
of data visualisation with a tool-agnostic approach offers an effective way
to begin learning about the subject in appropriate depth. The content
should be appealing to readers irrespective of the extent of their technical
knowledge (novice to advanced technicians) and specific tool experiences
(e.g. knowledge of Excel, Tableau, Adobe Illustrator).
There is a role for all book types. Different people want different sources
of insight at different stages in their development. If you are seeking a text
that provides in-depth tutorials on a range of tools or pages of
programmatic instruction, this one will not be the best choice. However, if
you consult only tutorial-related books, the chances are you will likely fall
short on the fundamental critical thinking that will be needed in the longer
term to get the most out of the tools with which you develop strong skills.
To substantiate the book’s value, the digital companion resources to this
book will offer a curated, up-to-date collection of visualisation technology
resources that will guide you through the most common and valuable tools,
helping you to gain a sense of what their roles are and where these fit into
the design workflow. Additionally, there will be recommended exercises
and many further related digital materials available for exploring.
Useful vs Beautiful
Another important distinction to make is that this book is not intended to
be seen as a beauty pageant. I love flicking through those glossy ‘coffee
table’ books as much as the next person; such books offer great inspiration
and demonstrate some of the finest work in the field. This book serves a
very different purpose. I believe that, as a beginner or relative beginner on
this learning journey, the inspiration you need comes more from
understanding what is behind the thinking that makes these amazing works
succeed and others not.
My desire is to make this the most useful text available, a reference that
will spend more time on your desk than on your bookshelf. To be useful is
to be used. I want the pages to be dog-eared. I want to see scribbles and
annotated notes made across its pages and key passages underlined. I want
to see sticky labels peering out above identified pages of note. I want to
see creases where pages have been folded back or a double-page spread
that has been weighed down to keep it open. In time I even want its cover
reinforced with wallpaper or wrapping paper to ensure its contents remain
bound together. There is every intention of making this an elegantly
presented and packaged book but it should not be something that invites
you to ‘look, but don’t touch’.
Pragmatic vs Theoretical
The content of this book has been formed through many years of absorbing
knowledge from all manner of books, generations of academic papers,
thousands of web articles, hundreds of conference talks, endless online and
personal discussions, and lots of personal practice. What I present here is a
pragmatic translation and distillation of what I have learned down the
years.
It is not a deeply academic or theoretical book. Where theoretical context
and reference is relevant it will be signposted as I do want to ground this
book in as much evidence-based content as possible; it is about judging
what is going to add most value. Experienced practitioners will likely have
an appetite for delving deeper into theoretical discourse and the underlying
sciences that intersect in this field but that is beyond the scope of this
particular text.
Take the science of visual perception, for example. There is no value in
attempting to emulate what has already been covered by other books in
greater depth and quality than I could achieve. Once you start peeling back
the many different layers of topics like visual and cognitive science the
boundaries of your interest and their relevance to data visualisation never
seem to arrive. You get swallowed up by the depth of these subjects. You
realise that you have found yourself learning about what the very concept
of light and sight is and at that point your brain begins to ache (well, mine
does at least), especially when all you set out to discover was if a bar chart
would be better than a pie chart.
An important reason for giving greater weight to pragmatism is because of
people: people are the makers, the stakeholders, the audiences and the
critics in data visualisation. Although there are a great many valuable
research-driven concepts concerning data visualisation, their practical
application can be occasionally at odds with the somewhat sanitised and
artificial context of the research methods employed. To translate them into
real-world circumstances can sometimes be easier said than done as the
influence of human factors can easily distort the significance of otherwise
robust ideas.
I want to remove the burden from you as a reader having to translate
relevant theoretical discourse into applicable practice. Critical thinking
will therefore be the watchword, equipping you with the independence of
thought to decide rationally for yourself what the solutions are that best fit
your context, your data, your message and your audience. To do this you
will need an appreciation of all the options available to you (the different
things you could do) and a reliable approach for critically determining
what choices you should make (the things you will do and why).
Contemporary vs Historical
This book is not going to look too far back into the past. We all respect the
ancestors of this field, the great names who, despite primitive means,
pioneered new concepts in the visual display of statistics to shape the
foundations of the field being practised today. The field’s lineage is
decorated by the influence of William Playfair’s first ever bar chart,
Charles Joseph Minard’s famous graphic about Napoleon’s Russian
campaign, Florence Nightingale’s Coxcomb plot and John Snow’s cholera
map. These are some of the totemic names and classic examples that will
always be held up as the ‘firsts’. Of course, to many beginners in the field,
this historical context is of huge interest. However, again, this kind of
content has already been superbly covered by other texts on more than
enough occasions. Time to move on.
I am not going to spend time attempting to enlighten you about how we
live in the age of ‘Big Data’ and how occupations related to data are or
will be the ‘sexiest jobs’ of our time. The former is no longer news, the
latter claim emerged from a single source. I do not want to bloat this book
with the unnecessary reprising of topics that have been covered at length
elsewhere. There is more valuable and useful content I want you to focus
your time on.
The subject matter, the ideas and the practices presented here will
hopefully not date a great deal. Of course, many of the graphic examples
included in the book will be surpassed by newer work demonstrating
similar concepts as the field continues to develop. However, their worth as
exhibits of a particular perspective covered in the text should prove
timeless. As more research is conducted in the subject, without question
there will be new techniques, new concepts, new empirically evidenced
principles that emerge. Maybe even new rules. There will be new thought-leaders, new sources of reference, new visualisers to draw insight from.
New tools will be created, existing tools will expire. Some things that are
done and can only be done by hand as of today may become seamlessly
automated in the near future. That is simply the nature of a fast-growing
field. This book can only be a line in the sand.
Analysis vs Communication
A further important distinction to make concerns the subtle but significant
difference between visualisations which are used for analysis and
visualisations used for communication.
Before a visualiser can confidently decide what to communicate to others,
he or she needs to have developed an intimate understanding of the
qualities and potential of the data. This is largely achieved through
exploratory data analysis. Here, the visualiser and the viewer are the same
person. Through visual exploration, different interrogations can be pursued
‘on the fly’ to unearth confirmatory or enlightening discoveries about what
insights exist.
Visualisation techniques used for analysis will be a key component of the
journey towards creating visualisation for communication but the practices
involved differ. Unlike visualisation for communication, the techniques
used for visual analysis do not have to be visually polished or necessarily
appealing. They are only serving the purpose of helping you to truly learn
about your data. When a data visualisation is being created to
communicate to others, many careful considerations come into play about
the requirements and interests of the intended or expected audience. This
has a significant influence on many of the design decisions you make, decisions that
do not arise with visual analysis alone.
Exploratory data analysis is a huge and specialist subject in and of itself. In
its most advanced form, working efficiently and effectively with large
complex data, topics like ‘machine learning’, using self-learning
algorithms to help automate and assist in the discovery of patterns in data,
become increasingly relevant. For the scope of this book the content is
weighted more towards methods and concerns about communicating data
visually to others. If your role is in pure data science or statistical analysis
you will likely require a deeper treatment of the exploratory data analysis
topic than this book can reasonably offer. However, Chapter 4 will cover
the essential elements in sufficient depth for the practical needs of most
people working with data.
Print vs Digital
The opportunity to supplement the print version of this book with an e-book and further digital companion resources helps to cushion the
agonising decisions about what to leave out. This text is therefore
enhanced by access to further digital resources, some of which are newly
created, while others are curated references from the endless well of
visualisation content on the Web. Included online
(book.visualisingdata.com) will be:
a completed case-study project that demonstrates the workflow
activities covered in this book, including full write-ups and all related
digital materials;
an extensive and up-to-date catalogue of over 300 data visualisation
tools;
a curated collection of tutorials and resources to help develop your
confidence with some of the most common and valuable tools;
practical exercises designed to embed the learning from each chapter;
further reading resources to continue learning about the subjects
covered in each chapter.
I.4 Objectives
Before moving on to an outline of the book’s contents, I want to share four
key objectives that I hope to accomplish for you by the final chapter.
These are themes that will run through the entire text: challenge, enlighten,
equip and inspire.
To challenge you I will be encouraging you to recognise that your current
thinking about visualisation may need to be reconsidered, both as a creator
and as a consumer. We all arrive in visualisation from different subject and
domain origins and with that comes certain baggage and prior sensibilities
that can distort our perspectives. I will not be looking to eliminate these,
rather to help you harness and align them with other traits and viewpoints.
I will ask you to relentlessly consider the diverse decisions involved in this
process. I will challenge your convictions about what you perceive to be
good or bad, effective or ineffective visualisation choices: arbitrary
choices will be eliminated from your thinking. Even if you are not
necessarily a beginner, I believe the content you read in this book will
make you question some of your own perspectives and assumptions. I will
encourage you to reflect on your previous work, asking you to consider
how and why you have designed visualisations in the way that you have:
where do you need to improve? What can you do better?
It is not just about creating visualisations: I will also challenge your
approach to reading visualisations. This is not something you might
usually think much about, but there is an important role for more tactical
approaches to consuming visualisations with greater efficiency and
effectiveness.
To enlighten you will be to increase your awareness of the possibilities in
data visualisation. As you begin your discovery of data visualisation you
might not be aware of the whole: you do not entirely know what options
exist, how they are connected and how to make good choices. Until you
know, you don’t know – that is what the objective of enlightening is all
about.
As you will discover, there is a lot on your plate, much to work through. It
is not just about the visible end-product design decisions. Hidden beneath
the surface are many contextual circumstances to weigh up, decisions
about how best to prepare your data, choices around the multitude of
viable ways of slicing those data up into different angles of analysis. That
is all before you even reach the design stage, where you will begin to
consider the repertoire of techniques for visually portraying your data – the
charts, the interactive features, the colours and much more besides.
This book will broaden your visual vocabulary to give you more ways of
expressing your data visually. It will enhance the sophistication of your
decision making and of visual language for any of the challenges you may
face.
To equip is to ensure you have robust tactics for managing your way
through the myriad options that exist in data visualisation. The variety it
offers makes for a wonderful prospect but, equally, introduces the burden
of choice. This book aims to make the challenge of undertaking data
visualisation far less overwhelming, breaking down the overall prospect
into smaller, more manageable task chunks.
The structure of this book will offer a reliable and flexible framework for
thinking, rather than rules for learning. It will lead to better decisions.
With an emphasis on critical thinking you will move away from an over-reliance on gut feeling and taste. To echo what I mentioned earlier, its role
as a handbook will help you know what things to think about, when to
think about them and how best to resolve all the thinking involved in any
data-driven design challenge you meet.
To inspire is to give you more than just a book to read. It is the opening of
a door into a subject to inspire you to step further inside. It is about helping
you to want to continue to learn about it and expose yourself to as much
positive influence as possible. It should elevate your ambition and broaden
your capability.
It is a book underpinned by theory but dominated by practical and
accessible advice, including input from some of the best visualisers in the
field today. The range of print and digital resources will offer lots of
supplementary material including tutorials, further reading materials and
suggested exercises. Collectively this will hopefully make it one of the
most comprehensive, valuable and inspiring titles out there.
I.5 Chapter Contents
The book is organised into four main parts (A, B, C and D) comprising
eleven chapters and preceded by the ‘Introduction’ sections you are
reading now.
Each chapter opens with an introductory outline that previews the content
to be covered and provides a bridge between consecutive chapters. In the
closing sections of each chapter the most salient learning points will be
summarised and some important, practical tips and tactics shared. As
mentioned, online there will be collections of practical exercises and
further reading resources recommended to substantiate the learning from
the chapter.
Throughout the book you will see sidebar captions that will offer relevant
references, aphorisms, good habits and practical tips from some of the
most influential people in the field today.
Introduction
This introduction explains how I have attempted to make sense of the
complexity of the subject, outlining the nature of the audience I am trying
to reach, the key objectives, what topics the book will be covering and not
covering, and how the content has been organised.
Part A: Foundations
Part A establishes the foundation knowledge and sets up a key reference of
understanding that aids your thinking across the rest of the book. Chapter 1
will be the logical starting point for many of you who are new to the field
to help you understand more about the definitions and attributes of data
visualisation. Even if you are not a complete beginner, the content of the
chapter forms the terms of reference that much of the remaining content is
based on. Chapter 2 prepares you for the journey through the rest of the
book by introducing the key design workflow that you will be following.
Chapter 1: Defining Data Visualisation
Defining data visualisation: outlining the components of thinking
that make up the proposed definition for data visualisation.
The importance of conviction: presenting three guiding principles of
good visualisation design: trustworthy, accessible and elegant.
Distinctions and glossary: explaining the distinctions and overlaps
with other related disciplines and providing a glossary of terms used
in this book to establish consistency of language.
Chapter 2: Visualisation Workflow
The importance of process: describing the data visualisation design
workflow, what it involves and why a process approach is required.
The process in practice: providing some useful tips, tactics and
habits that transcend any particular stage of the process but will best
prepare you for success with this activity.
Part B: The Hidden Thinking
Part B discusses the first three preparatory stages of the data visualisation
design workflow. ‘The hidden thinking’ title refers to how these vital
activities, that have a huge influence over the eventual design solution, are
somewhat out of sight in the final output; they are hidden beneath the
surface but completely shape what is visible. These stages represent the
often neglected contextual definitions, data wrangling and editorial
challenges that are so critical to the success or otherwise of any
visualisation work – they require a great deal of care and attention before
you switch your attention to the design stage.
Chapter 3: Formulating Your Brief
What is a brief?: describing the value of compiling a brief to help
initiate, define and plan the requirements of your work.
Establishing your project’s context: defining the origin curiosity or
motivation, identifying all the key factors and circumstances that
surround your work, and defining the core purpose of your
visualisation.
Establishing your project’s vision: early considerations about the
type of visualisation solution needed to achieve your aims and
harnessing initial ideas about what this solution might look like.
Chapter 4: Working With Data
Data literacy: establishing a basic understanding with this critical
literacy, providing some foundation understanding about datasets and
data types and some observations about statistical literacy.
Data acquisition: outlining the different origins of and methods for
accessing your data.
Data examination: approaches for acquainting yourself with the
physical characteristics and meaning of your data.
Data transformation: optimising the condition, content and form of
your data fully to prepare it for its analytical purpose.
Data exploration: developing deeper intimacy with the potential
qualities and insights contained, and potentially hidden, within your
data.
Chapter 5: Establishing Your Editorial Thinking
What is editorial thinking?: defining the role of editorial thinking in
data visualisation.
The influence of editorial thinking: explaining how the different
dimensions of editorial thinking influence design choices.
Part C: Developing Your Design Solution
Part C is the main part of the book and covers progression through the data
visualisation design and production stage. This is where your concerns
switch from hidden thinking to visible thinking. The individual chapters in
this part of the book cover each of the five layers of the data visualisation
anatomy. They are treated as separate affairs to aid the clarity and
organisation of your thinking, but they are entirely interrelated matters and
the chapter sequences support this. Within each chapter there is a
consistent structure beginning with an introduction to each design layer, an
overview of the many different possible design options, followed by
detailed guidance on the factors that influence your choices.
The production cycle: describing the cycle of development activities
that take place during this stage, giving a context for how to work
through the subsequent chapters in this part.
Chapter 6: Data Representation
Introducing visual encoding: an overview of the essentials of data
representation looking at the differences and relationships between
visual encoding and chart types.
Chart types: a detailed repertoire of 49 different chart types, profiled
in depth and organised by a taxonomy of chart families: categorical,
hierarchical, relational, temporal, and spatial.
Influencing factors and considerations: presenting the factors that
will influence the suitability of your data representation choices.
Chapter 7: Interactivity
The features of interactivity:
Data adjustments: a profile of the options for interactively
interrogating and manipulating data.
View adjustments: a profile of the options for interactively
configuring the presentation of data.
Influencing factors and considerations: presenting the factors that will
influence the suitability of your interactivity choices.
Chapter 8: Annotation
The features of annotation:
Project annotation: a profile of the options for helping to provide
viewers with general explanations about your project.
Chart annotation: a profile of the annotation options for helping to
optimise viewers’ understanding of your charts.
Influencing factors and considerations: presenting the factors that will
influence the suitability of your annotation choices.
Chapter 9: Colour
The features of colour:
Data legibility: a profile of the options for using colour to represent
data.
Editorial salience: a profile of the options for using colour to direct
the eye towards the most relevant features of your data.
Functional harmony: a profile of the options for using colour most
effectively across the entire visualisation design.
Influencing factors and considerations: presenting the factors that will
influence the suitability of your colour choices.
Chapter 10: Composition
The features of composition:
Project composition: a profile of the options for the overall layout and
hierarchy of your visualisation design.
Chart composition: a profile of the options for the layout and
hierarchy of the components of your charts.
Influencing factors and considerations: presenting the factors that will
influence the suitability of your composition choices.
Part D: Developing Your Capabilities
Part D wraps up the book’s content by reflecting on the range of
capabilities required to develop confidence and competence with data
visualisation. Following completion of the design process, the
multidisciplinary nature of this subject will now be clearly established.
This final part assesses the two sides of visualisation literacy – your role as
a creator and your role as a viewer – and what you need to enhance your
skills with both.
Chapter 11: Visualisation Literacy
Viewing: Learning to see: learning about the most effective strategy
for understanding visualisations in your role as a viewer rather than a
creator.
Creating: The capabilities of the visualiser: profiling the skill sets,
mindsets and general attributes needed to master data visualisation
design as a creator.
Part A Foundations
1 Defining Data Visualisation
This opening chapter will introduce you to the subject of data
visualisation, defining what data visualisation is and is not. It will outline
the different ingredients that make it such an interesting recipe and
establish a foundation of understanding that will form a key reference for
all of the decision making you are faced with.
Three core principles of good visualisation design will be presented that
offer guiding ideals to help mould your convictions about distinguishing
between the effective and the ineffective in data visualisation.
You will also see how data visualisation sits alongside or overlaps with
other related disciplines, and some definitions about the use of language in
this book will be established to ensure consistency in meaning across all
chapters.
1.1 The Components of Understanding
To set the scene for what is about to follow, I think it is important to start
this book with a proposed definition for data visualisation (Figure 1.1).
This definition offers a critical term of reference because its components
and their meaning will touch on every element of content that follows in
this book. Furthermore, as a subject that has many different proposed
definitions, I believe it is worth clarifying my own view before going
further:
Figure 1.1 A Definition for Data Visualisation
At first glance this might appear to be a surprisingly short definition: isn’t
there more to data visualisation than that, you might ask? Can nine words
sufficiently articulate what has already been introduced as an eminently
complex and diverse discipline?
I have arrived at this after many years of iterations attempting to improve
the elegance of my definition. In the past I have tried to force too many
words and too many clauses into one statement, making it cumbersome
and rather undermining its value. Over time, as I have developed greater
clarity in my own convictions, I have in turn managed to establish greater
clarity about what I feel is the real essence of this subject. The definition
above is, I believe, a succinct and practically useful description of what the
pursuit of visualisation is truly about. It is a definition that largely informs
the contents of this book. Each chapter will aim to enlighten you about
different aspects of the roles of and relationships between each component
expressed. Let me introduce and briefly examine each of these one by one,
explaining where and how they will be discussed in the book.
Firstly, data, our critical raw material. It might appear a formality to
mention data in the definition for, after all, we are talking about data
visualisation as opposed to, let’s say, cheese visualisation (though
visualisation of data using cheese has happened, see Figure 1.2), but the
core role that data has in the design process needs to be made clear.
Without data there is no visualisation; indeed there is no need for one.
Data plays the fundamental role in this work, so you will need to give it
your undivided attention and respect. You will discover in Chapter 4 the
importance of developing an intimacy with your data to acquaint yourself
with its physical properties, its meaning and its potential qualities.
Figure 1.2 Per Capita Cheese Consumption in the US
Data is names, amounts, groups, statistical values, dates, comments,
locations. Data is textual and numeric in format, typically held in datasets
in table form, with rows of records and columns of different variables.
This tabular form of data is what we will be considering as the raw form of
data. Through tables, we can look at the values contained to precisely read
them as individual data points. We can look up values quite efficiently,
scanning across many variables for the different records held. However,
we cannot easily establish the comparative size and relationship between
multiple data points. Our eyes and mind are not equipped to translate
easily the textual and numeric values into quantitative and qualitative
meaning. We can look at the data but we cannot really see it without the
context of relationships that help us compare and contrast them effectively
with other values. To derive understanding from data we need to see it
represented in a different, visual form. This is the act of data
representation.
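As a rough illustration of the step from table to representation, the sketch below holds a tiny dataset as rows of records and then redraws one variable as text bars. The records, the field names and the `text_bars` helper are all invented for this example, and Python is used only as a convenient language for the sketch.

```python
# Invented records: each row is one record, each key is a variable.
records = [
    {"category": "A", "amount": 12},
    {"category": "B", "amount": 30},
    {"category": "C", "amount": 21},
]


def text_bars(rows, label_key, value_key, width=20):
    """Return one line per record, with bar length proportional to the value."""
    largest = max(row[value_key] for row in rows)
    lines = []
    for row in rows:
        # Scale each value against the largest so the longest bar fills `width`.
        length = round(row[value_key] / largest * width)
        lines.append(f"{row[label_key]} {'#' * length} {row[value_key]}")
    return lines


for line in text_bars(records, "category", "amount"):
    print(line)
```

The numbers printed at the end of each line can still be looked up precisely, as in the table, while the bar lengths make the comparative sizes visible at a glance.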
This word representation is deliberately positioned near the front of the
definition because it is the quintessential activity of data visualisation
design. Representation concerns the choices made about the form in which
your data will be visually portrayed: in lay terms, what chart or charts you
will use to exploit the brain’s visual perception capabilities most
effectively.
When data visualisers create a visualisation they are representing the data
they wish to show visually through combinations of marks and attributes.
Marks are points, lines and areas. Attributes are the appearance properties
of these marks, such as the size, colour and position. The recipe of these
marks and their attributes, along with other components of apparatus, such
as axes and gridlines, form the anatomy of a chart.
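One way to picture the pairing of marks and attributes is as a small data structure. The sketch below is only an assumption about how this might be modelled: the `Mark` class, the `encode_as_point` helper and the record's variable names are hypothetical and do not come from any charting library.

```python
from dataclasses import dataclass


@dataclass
class Mark:
    """One visual mark plus the attributes that control its appearance."""
    kind: str      # "point", "line" or "area"
    x: float       # horizontal position
    y: float       # vertical position
    size: float    # for a point, how large it is drawn
    colour: str


def encode_as_point(record, x_key, y_key, size_key, colour):
    """Map three variables of a record onto position (x, y) and size."""
    return Mark("point", record[x_key], record[y_key], record[size_key], colour)


# Invented record with hypothetical variable names.
record = {"year": 2014, "value": 42.0, "weight": 3.5}
mark = encode_as_point(record, "year", "value", "weight", "purple")
print(mark)
```

Here each variable in the record is assigned to one attribute of a point mark. Choosing different assignments, or a different kind of mark, is what leads to a different chart.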
In Chapter 6 you will gain a deeper and more sophisticated appreciation of
the range of different charts that are in common usage today, broadening
your visual vocabulary. These charts will vary in complexity and
composition, with each capable of accommodating different types of data
and portraying different angles of analysis. You will learn about the key
ingredients that shape your data representation decisions, explaining the
factors that distinguish the effective from the ineffective choices.
Beyond representation choices, the presentation of data concerns all the
other visible design decisions that make up the overall visualisation
anatomy. This includes choices about the possible applications of
interactivity, features of annotation, colour usage and the composition of
your work. During the early stages of learning this subject it is sensible to
partition your thinking about these matters, treating them as isolated
design layers. This will aid your initial critical thinking. Chapters 7–10
will explore each of these layers in depth, profiling the options available
and the factors that influence your decisions.
However, as you gain in experience, the interrelated nature of visualisation
will become much more apparent and you will see how the overall design
anatomy is entirely connected. For instance, the selection of a chart type
intrinsically leads to decisions about the space and place it will occupy; an
interactive control may be included to reveal an annotated caption; for any
design property to be even visible to the eye it must possess a colour that is
different from that of its background.
The goal expressed in this definition states that data visualisation is about
facilitating understanding. This is very important and some extra time is
required to emphasise why it is such an influential component in our
thinking. You might think you know what understanding means, but when
you peel back the surface you realise there are many subtleties that need to
be acknowledged about this term and their impact on your data
visualisation choices. Understanding ‘understanding’ (still with me?) in
the context of data visualisation is of elementary significance.
When consuming a visualisation, the viewer will go through a process of
understanding involving three stages: perceiving, interpreting and
comprehending (Figure 1.3). Each stage is dependent on the previous one
and in your role as a data visualiser you will have influence but not full
control over these. You are largely at the mercy of the viewer – what they
know and do not know, what they are interested in knowing and what
might be meaningful to them – and this introduces many variables outside
of your control: where your control diminishes the influence and reliance
on the viewer increases. Achieving an outcome of understanding is
therefore a collective responsibility between visualiser and viewer.
Perceiving, interpreting and comprehending are not just synonyms for the
same word; rather, they carry important distinctions that need
appreciating. As you will see throughout this book, the subtleties and
semantics of language in data visualisation will be a recurring concern.
Figure 1.3 The Three Stages of Understanding
Let’s look at the characteristics of the different stages that form the process
of understanding to help explain their respective differences and mutual
dependencies.
Firstly, perceiving. This concerns the act of simply being able to read a
chart. What is the chart showing you? How easily can you get a sense of
the values of the data being portrayed?
Where are the largest, middle-sized and smallest values?
What proportion of the total does that value hold?
How do these values compare in ranking terms?
To which other values does this have a connected relationship?
The notion of understanding here concerns our attempts as viewers to
efficiently decode the representations of the data (the shapes, the sizes and
the colours) as displayed through a chart, and then convert them into
perceived values: estimates of quantities and their relationships to other
values.
Interpreting is the next stage of understanding following on from
perceiving. Having read the charts the viewer now seeks to convert these
perceived values into some form of meaning:
Is it good to be big or better to be small?
What does it mean to go up or go down?
Is that relationship meaningful or insignificant?
Is the decline of that category especially surprising?
The viewer’s ability to form such interpretations is influenced by their
pre-existing knowledge about the portrayed subject and their capacity to utilise
that knowledge to frame the implications of what has been read. Where a
viewer does not possess that knowledge it may be that the visualiser has to
address this deficit. They will need to make suitable design choices that
help to make clear what meaning can or should be drawn from the display
of data. Captions, headlines, colours and other annotated devices, in
particular, can all be used to achieve this.
Comprehending involves reasoning the consequence of the perceiving and
interpreting stages to arrive at a personal reflection of what all this means
to them, the viewer. How does this information make a difference to what
was known about the subject previously?
Why is this relevant? What wants or needs does it serve?
Has it confirmed what I knew or possibly suspected beforehand or
enlightened me with new knowledge?
Has this experience impacted me in an emotional way or left me
feeling somewhat indifferent as a consequence?
Does the context of what understanding I have acquired lead me to
take action – such as make a decision or fundamentally change my
behaviour – or do I simply have an extra grain of knowledge the
consequence of which may not materialise until much later?
Over the page is a simple demonstration to further illustrate this process of
understanding. In this example I play the role of a viewer working with a
sample isolated chart (Figure 1.4). As you will learn throughout the design
chapters, a chart would not normally just exist floating in isolation like this
one does, but it will serve a purpose for this demonstration.
Figure 1.4 shows a clustered bar chart that presents a breakdown of the
career statistics for the footballer Lionel Messi during his career with FC
Barcelona.
The process commences with perceiving the chart. I begin by establishing
what chart type is being used. I am familiar with this clustered bar chart
approach and so I quickly feel at ease with the prospect of reading its
display: there is no learning for me to have to go through on this occasion,
which is not always the case as we will see.
I can quickly assimilate what the axes are showing by examining the labels
along the x- and y-axes and by taking the assistance provided by the colour
legend at the top. I move on to scanning, detecting and observing the
general physical properties of the data being represented. The eyes and
brain are working in harmony, conducting this activity quite instinctively
without awareness or delay, noting the most prominent features of
variation in the attributes of size, shape, colour and position.
Figure 1.4 Demonstrating the Process of Understanding
I look across the entire chart, identifying the big, small and medium values
(these are known as stepped magnitude judgements), and form an overall
sense of the general value rankings (global comparison judgements). I am
instinctively drawn to the dominant bars towards the middle/right of the
chart, especially as I know this side of the chart concerns the most recent
career performances. I can determine that the purple bar – showing goals –
has been rising pretty much year-on-year towards a peak in 2011/12 and
then there is a dip before recovery in his most recent season.
My visual system is now working hard to decode these properties into
estimations of quantities (amounts of things) and relationships (how
different things compare with each other). I focus on judging the absolute
magnitudes of individual bars (one bar at a time). The assistance offered
by the chart apparatus, such as the vertical axis (or y-axis) values and the
inclusion of gridlines, is helping me more quickly estimate the quantities
with greater assurance of accuracy, such as discovering that the highest
number of goals scored was around 73.
I then look to conduct some relative higher/lower comparisons. In
comparing the games and goals pairings I can see that three out of the last
four years have seen the purple bar higher than the blue bar, in contrast to
all the rest. Finally I look to establish proportional relationships between
neighbouring bars, i.e. by how much larger one is compared with the next.
In 2006/07 I can see the blue bar is more than twice as tall as the purple
one, whereas in 2011/12 the purple bar is about 15% taller.
By reading this chart I now have a good appreciation of the quantities
displayed and some sense of the relationship between the two measures,
games and goals.
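The relative and proportional judgements described above amount to simple arithmetic on pairs of values. The sketch below makes that explicit. The two seasons and their numbers are invented to mirror the kinds of comparison described and are not the figures behind Figure 1.4, and the `compare` helper is hypothetical.

```python
# Invented values for two seasons; these are NOT the figures behind Figure 1.4.
seasons = {
    "early season": {"games": 36, "goals": 17},
    "peak season": {"games": 60, "goals": 69},
}


def compare(games, goals):
    """Return the higher measure, the ratio between the two, and the % gap."""
    higher, lower = max(games, goals), min(games, goals)
    label = "goals" if goals > games else "games"
    ratio = higher / lower                 # proportional relationship
    percent_larger = (ratio - 1) * 100     # how much taller the higher bar is
    return label, ratio, percent_larger


for name, values in seasons.items():
    label, ratio, percent = compare(values["games"], values["goals"])
    print(f"{name}: {label} is {ratio:.2f}x the other ({percent:.0f}% larger)")
```

A viewer reading a chart makes these estimates by eye rather than by calculation, which is why axis values and gridlines matter: they improve the accuracy of the mental version of this arithmetic.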
The second part of the understanding process is interpreting. In reality, it
is not so consciously consecutive or delayed in relationship to the
perceiving stage but you cannot get here without having already done the
perceiving. Interpreting, as you will recall, is about converting perceived
‘reading’ into meaning. Interpreting is essentially about orientating your
assessment of what you’ve read against what you know about the subject.
As I mentioned earlier, often a data visualiser will choose to – or have the
opportunity to – share such insights via captions, chart overlays or
summary headlines. As you will learn in Chapter 3, the visualisations that
present this type of interpretation assistance are commonly described as
offering an ‘explanatory’ experience. In this particular demonstration it is
an example of an ‘exhibitory’ experience, characterised by the absence of
any explanatory features. It relies on the viewer to handle the demands of
interpretation without any assistance.
As you will read about later, many factors influence how well different
viewers will be able to interpret a visualisation. Some of the most critical
include the level of interest shown towards the subject matter, its relevance
and the general inclination, in that moment, of a viewer to want to read
about that subject through a visualisation. It is also influenced by the
knowledge held about a subject or the capacity to derive meaning from a
subject even if a knowledge gap exists.
Returning to the sample chart, in order to translate the quantities and
relationships I extracted from the perceiving stage into meaning, I am
effectively converting the reading of value sizes into notions of good or
bad and comparative relationships into worse than or better than etc. To
interpret the meaning of this data about Lionel Messi I can tap into my
passion for and knowledge of football. I know that for a player to score
over 25 goals in a season is very good. To score over 35 is exceptional. To
score over 70 goals is frankly preposterous, especially at the highest level
of the game (you might find plenty of players achieving these statistics
playing for the Dog and Duck pub team, but these numbers have been
achieved for Barcelona in La Liga, the Champions League and other
domestic cup competitions). I know from watching the sport, and poring
over statistics like this for 30 years, that it is very rare for a player to score
remotely close to a ratio of one goal per game played. Those purple bars
that exceed the height of the blue bars are therefore remarkable. Beyond
the information presented in the chart I bring knowledge about the periods
when different managers were in charge of Barcelona, how they played the
game, and how some organised their teams entirely around Messi’s talents.
I know which other players were teammates across different seasons and
who might have assisted or hindered his achievements. I also know his age
and can mentally compare his achievements with the traditional football
career arcs that will normally show a steady rise, peak, plateau, and then
decline.
Therefore, in this example, I am not just interested in the subject but can
bring a lot of knowledge to aid me in interpreting this analysis. That helps
me understand a lot more about what this data means. Other people
might be passingly interested in football and know how to read what
is being presented, but they might not possess the domain knowledge to go
deeper into the interpretation. They also just might not care. Now imagine
this was analysis of, let’s say, an NHL ice hockey player (Figure 1.5) –
that would present an entirely different challenge for me.
The numbers in this chart are irrelevant: it is simply the same chart as before
with different labels. Assuming this was real analysis, as a sports fan in
general I would have the capacity to understand the notion of a
sportsperson’s career statistics in terms of games played and goals scored:
I can read the chart (perceiving) that shows me this data and catch the gist
of the angle of analysis it is portraying. However, I do not have sufficient
domain knowledge of ice hockey to determine the real meaning and
significance of the big–small, higher–lower value relationships. I cannot
confidently convert ‘small’ into ‘unusual’ or ‘greater than’ into
‘remarkable’. My capacity to interpret is therefore limited, and besides I
have no connection to the subject matter, so I am insufficiently interested
to put in the effort to spend much time with any in-depth attempts at
interpretation.
Figure 1.5 Demonstrating the Process of Understanding
Imagine this is now no longer analysis about sport but about the sightings
in the wild of Winglets and Spungles (completely made up words). Once
again I can still read the chart shown in Figure 1.6 but now I have
absolutely no connection to the subject whatsoever. No knowledge and no
interest. I have no idea what these things are, no understanding about the
sense of scale that should be expected for these sightings, I don’t know
what is good or bad. And I genuinely don’t care either. In contrast, for
those who do have a knowledge of and interest in the subject, the meaning
of this data will be much more relevant. They will be able to read the chart
and make some sense of the meaning of the quantities and relationships
displayed.
To help with perceiving, viewers need the context of scale. To help with
interpreting, viewers need the context of subject, whether that is provided
by the visualiser or the viewer themself. The challenge for you and me as
data visualisers is to determine what our audience will know already and
what they will need to know in order to possibly assist them in interpreting
the meaning. The use of explanatory captions, perhaps positioned in that
big white space top left, could assist those lacking the knowledge of the
subject, possibly offering a short narrative to make the interpretations – the
meaning – clearer and immediately accessible.
We are not quite finished: there is one stage left. The third part of the
understanding process is comprehending. This is where I attempt to form
some concluding reasoning that translates into what this analysis means for
me. What can I infer from the display of data I have read? How do I relate
and respond to the insights I have drawn out through interpretation?
Does what I’ve learnt make a difference to me? Do I know something
more than I did before? Do I need to act or decide on anything? How does
it make me feel emotionally?
Figure 1.6 Demonstrating the Process of Understanding
Through consuming the Messi chart, I have been able to form an even
greater appreciation of his amazing career. It has surprised me just how
prolific he has been, especially having seen his ratio of goals to games, and
I am particularly intrigued to see whether the dip in 2013/14 was a
temporary blip or whether the bounce back in 2014/15 was the blip. And
as he reaches his late 20s, will injuries start to creep in as they seem to do
for many other similarly prodigious young talents, especially as he has
been playing relentlessly at the highest level since his late teens?
My comprehension is not a dramatic discovery. There is no sudden
inclination to act nor any need – based on what I have learnt. I just feel a
heightened impression, formed through the data, about just how good and
prolific Lionel Messi has been. Barcelona fanatics who watch him play
every week will likely have already formed this understanding. This
kind of experience would only have reaffirmed what they already probably
knew.
And that is important to recognise when it comes to managing
expectations about what we hope to achieve amongst our viewers in terms
of their final comprehending. One person’s ‘I knew that already’ is another
person’s ‘wow’. For every ‘wow, I need to make some changes’ type of
reflection there might be another ‘doesn’t affect me’. A compelling
visualisation about climate change presented to Sylvie might affect her
significantly about the changes she might need to make in her lifestyle
choices that might reduce her carbon footprint. For Robert, who is already
familiar with the significance of this situation, it might have substantially
less immediate impact – not indifference to the meaning of the data, just
nothing new, a shrug of the shoulders. For James, the hardened sceptic,
even the most indisputable evidence may have no effect; he might just not
be receptive to altering his views regardless.
What these scenarios try to explain is that, from your perspective as the
visualiser, this final stage of understanding is something you will have
relatively little control over because viewers are people and people are
complex. People are different and as such they introduce inconsistencies.
You can lead a horse to water but you cannot make it drink: you cannot
force a viewer to be interested in your work, to understand the meaning of
a subject or get that person to react exactly how you would wish.
Visualising data is just an agent of communication and not a guarantor for
what a viewer does with the opportunity for understanding that is
presented. There are different flavours of comprehension, different
consequences of understanding formed through this final stage. Many
visualisations will be created with the ambition to simply inform, like the
Messi graphic achieved for me, perhaps to add just an extra grain to the
pile of knowledge a viewer has about a subject. Not every visualisation
results in a Hollywood moment of grand discoveries, surprising insights or
life-saving decisions. But that is OK, so long as the outcome fits with the
intended purpose, something we will discuss in more depth in Chapter 3.
Furthermore, there is the complexity of human behaviour in how people
make decisions in life. You might create the most compelling
visualisation, demonstrating proven effective design choices, carefully
constructed with a very specific audience type and need in mind. This
might clearly show how a certain decision really needs to be taken by
those in the audience. However, you cannot guarantee that the decision
maker in question, while possibly recognising that there is a need to act,
will be in a position to act, and indeed will know how to act.
It is at this point that one must recognise the ambitions and – more
importantly – realise the limits of what data visualisation can achieve.
Going back again, finally, to the components of the definition, all the
reasons outlined above show why the term to facilitate is the most a
visualiser can reasonably aspire to achieve.
It might feel like a rather tepid and unambitious aim, something of a cop-out
that avoids scrutiny over the outcomes of our work: why not aim to
‘deliver’, ‘accomplish’, or do something more earnest than just ‘facilitate’?
I deliberately use ‘facilitate’ because as we have seen we can only control
so much. Design cannot change the world, it can only make it run a little
smoother. Visualisers can control the output but not the outcome: at best
we can expect to have only some influence on it.
1.2 The Importance of Conviction
The key structure running through this book is a data visualisation design
process. By following this process you will be able to decrease the size of
the challenge involved in making good decisions about your design
solution. The sequencing of the stages presented will help reduce the
myriad options you have to consider, which makes the prospect of arriving
at the best possible solution much more likely to occur.
Often, the design choices you need to make will be clear cut. As you will
learn, the preparatory nature of the first three stages goes a long way to
securing that clarity later in the design stage. On other occasions, plain old
common sense is a more than sufficient guide. However, for more nuanced
situations, where there are several potentially viable options presenting
themselves, you need to rely on the guiding value of good design
principles.
‘I say begin by learning about data visualisation’s “black and whites”,
the rules, then start looking for the greys. It really then becomes quite a
personal journey of developing your conviction.’ Jorge Camoes, Data
Visualization Consultant
For many people setting out on their journey in data visualisation, their
early beliefs about data visualisation design tend to be shaped by the
first authors they come across. Names
like Edward Tufte, unquestionably one of the most important figures in
this field whose ideas are still pervasive, represent a common entry point
into the field, as do people like Stephen Few, David McCandless, Alberto
Cairo, and Tamara Munzner, to name but a few. These are authors of
prominent works that typically represent the first books purchased and
read by many beginners.
Where you go from there – from whom you draw your most valuable
enduring guidance – will be shaped by many different factors: taste, the
industry you are working in, the topics on which you work, the types of
audiences you produce for. I still value much of what Tufte extols, for
example, but find I can now more confidently filter out some of his ideals
that veer towards impractical ideology or that do not necessarily hold up
against contemporary technology and the maturing expectations of people.
‘My key guiding principle? Know the rules, before you break them.’
Gregor Aisch, Graphics Editor, The New York Times
The key guidance that now most helpfully shapes and supports my
convictions comes from ideas outside the boundaries of visualisation
design in the shape of the work of Dieter Rams. Rams was a German
industrial and product designer who was most famously associated with
the Braun company.
In the late 1970s or early 1980s, Rams was becoming concerned about the
state and direction of design thinking and, given his prominent role in the
industry, felt a responsibility to challenge himself, his own work and his
own thinking against a simple question: ‘Is my design good design?’. By
dissecting his response to this question he conceived 10 principles that
expressed the most important characteristics of what he considered to be
good design. They read as follows:
1. Good design is innovative.
2. Good design makes a product useful.
3. Good design is aesthetic.
4. Good design makes a product understandable.
5. Good design is unobtrusive.
6. Good design is honest.
7. Good design is long lasting.
8. Good design is thorough down to the last detail.
9. Good design is environmentally friendly.
10. Good design is as little design as possible.
Inspired by the essence of these principles, and considering their
applicability to data visualisation design, I have translated them into three
high-level principles that similarly help me to answer my own question: ‘Is
my visualisation design good visualisation design?’ These principles offer
me a guiding voice when I need to resolve some of the more seemingly
intangible decisions I am faced with (Figure 1.7).
Figure 1.7 The Three Principles of Good Visualisation Design
In the book Will it Make the Boat Go Faster?, co-author Ben Hunt-Davis
provides details of the strategies employed by him and his team that led to
their achieving gold medal success in the Men’s Rowing Eight event at the
Sydney Olympics in 2000. As the title suggests, each decision taken had to
pass the ‘will it make the boat go faster?’ test. Going back to the goal of
data visualisation as defined earlier, these design principles help me judge
whether any decision I make will better aid the facilitation of
understanding: the equivalent of ‘making the boat go faster’.
I will describe in detail the thinking behind each of these principles and
explain how Rams’ principles map onto them. Before that, let me briefly
explain why there are three principles of Rams’ original ten that do not
entirely fit, in my view, as universal principles for data visualisation.
‘I’m always the fool looking at the sky who falls off the cliff. In other
words, I tend to seize on ideas because I’m excited about them without
thinking through the consequences of the amount of work they will
entail. I find tight deadlines energizing. Answering the question of
“what is the graphic trying to do?” is always helpful. At minimum the
work I create needs to speak to this. Innovation doesn’t have to be a
wholesale out-of-the box approach. Iterating on a previous idea, moving
it forward, is innovation.’ Sarah Slobin, Visual Journalist
Good design is innovative: Data visualisation does not always need
to be innovative. On most occasions the solutions being
created call upon the tried and tested approaches that have been used
for generations. Visualisers are not conceiving new forms of
representation or implementing new design techniques in every
project. Of course, there are times when innovation is required to
overcome a particular challenge; innovation generally materialises
when visualisers face problems that current solutions fail to overcome.
Your own desire for innovation may stem from personal goals for
developing your skills, or from reflecting on previous projects and
recognising a wish to rethink a solution. It is not that
data visualisation is never about innovation, just that it is not always
and only about innovation.
Good design is long lasting: The translation of this principle to the
context of data visualisation can be taken in different ways. ‘Long
lasting’ could be related to the desire to preserve the ongoing
functionality of a digital project, for example. It is quite demoralising
how many historic links you visit online only to find a project has
now expired through a lack of sustained support or is no longer
functionally supported on modern browsers.
Another way to interpret ‘long lasting’ is in the durability of the
technique. Bar charts, for example, are the old reliables of the field –
always useful, always being used, always there when you need them
(author wipes away a respectful tear). ‘Long lasting’ can also relate to
avoiding the temptation of fashion or current gimmickry and having a
timeless approach to design. Consider the recent design trend moving
away from skeuomorphism and the emergence of so-called flat
design. By the time this book is published there will likely be a new
movement. ‘Long lasting’ could apply to the subject matter. Expiry in
the relevance of certain angles of analysis or out-of-date data is
inevitable in most of our work, particularly with subjects that concern
current matters. Analysis about the loss of life during the Second
World War is timeless because nothing is now going to change the
nature or extent of the underlying data (unless new discoveries
emerge). Analysis of the highest grossing movies today will change
as soon as new big movies are released and time elapses. So, once
again, this idea of long lasting is very context specific, rather than
being a universal goal for data visualisation.
Good design is environmentally friendly: This is, of course, a noble
aim but the relevance of this principle has to be positioned again at
the contextual level, based on the specific circumstances of a given
project. If your work is to be printed, the ink and paper usage
immediately removes the notion that it is an environmentally friendly
activity. Developin...