Learning Data Visualisation
To understand and communicate data, you must visualise it – for yourself and for other people – says Nathan Yau
Data grows more easily accessible and ubiquitous – stored in spreadsheets, databases, and text files – and with this new resource comes a need and desire to understand it and communicate with it. Visualisation is a great way to do this, and because of all the data available, now is a great time to learn how.
Pick a purpose
A lot of people tell me they’re interested in visualisation and ask the best way to get started and what tools and software to use. The answer depends on what you want to use visualisation for.
Are you interested in visualisation as an exploratory tool to understand your data better? As an analysis tool? Maybe you want to design a business dashboard that keeps you updated on sales, spending, or productivity, or maybe you’re a researcher with complex data, and you need to explore on multiple dimensions.
Tools like Tableau Software can help you look at your data beyond what Microsoft Excel offers. My personal favorite, R, is a statistical computing language that offers a wide variety of open source packages and can help you visualise data in different ways.
On the other hand, if you’re more interested in visualisation for presentation – to communicate ideas to a wider audience – you might want to learn Adobe Illustrator or Inkscape. If your graphics are intended for the Web, the JavaScript library Data-Driven Documents (D3) could come in handy.
A combination of all of the above might be what you’re after.
The key is to figure out what you want out of visualisation (and what data you have), and you can take it from there.
Understand data
Software is only a small part of the equation though. You can have the best program for the job, but if you don’t know how to work with, analyse, and, most importantly, understand data, the software is useless.
Formal statistics coursework can come in handy at this stage in the game (and I highly recommend it, because it’s awesome), but you can still get a lot out of data through charts and graphs. The key is to think like a statistician: Pay attention to detail, see the big picture, avoid your own biases, look for context, and always ask why.
Does the trend, pattern, or relationship you see make sense? How does it relate to the real world?
A lot of people skip this step, essentially reducing visualisation into a design exercise. This is fine for practice or learning how to use a tool, but ultimately, you need to understand your data, regardless of purpose.
With visualisation as an exploratory tool, you are, of course, trying to understand your data. That’s the point.
However, to present data to others, you first must understand the data yourself. This is my favorite step in the visualisation process. What you find here is what makes your final graphic most interesting.
Design accordingly
Once you’ve explored, picked, and prodded at your data, it’s time to design for your audience. Of course, if you’re using visualisation only for analysis (i.e. the only audience is you), you don’t have to do much here. However, once your audience grows, you must remember that not everyone is familiar with your data. It’s your job to explain and highlight what is interesting.
Design around these interesting points with annotation, color, and contrast. Ask yourself who your audience is, what they know already, and what they might want to know.
Again, this comes back to purpose. A visualisation to make people laugh is designed differently than one that goes into an annual report, so design accordingly.
Practice
At the end of the day, learning data visualisation comes down to practice. When I got started in the field, I read all the books I could and went through lots of tutorials, and this provided great background. However, you have to apply what you learn to your own data.
Analyse it, visualise it, and design. Show your work to other people and see how they interpret your data. Their reactions might surprise you, and you’ll learn a lot from them.
Most importantly, have fun with it. Data is an abstraction of the real world, and visualisation gives you a peek into how things work and how people live. That’s pretty awesome.
Nathan Yau writes the popular FlowingData blog and is the author of Data Points: Visualization That Means Something, published by Wiley. He has a Ph.D. in statistics from UCLA, with a focus on data visualisation. He is most interested in personal data collection, data for non-professionals, and information design.
This week’s Big Data coverage is sponsored by publisher John Wiley & Sons