Making People Understand Your Data: A Data Visualization Tutorial

Daffa Sadek
6 min read · Jun 19, 2021


Creating effective yet simple visualizations from scratch. (Source: Photomix)

Data visualization is the practice of creating a visual representation of data so that readers can understand it more easily. Its aim is to help users make better data-driven decisions. Whether deliberately or not, you are already doing data visualization, or soon will be, as part of a job or a school project. Because it is so common, many people underestimate the importance of creating good visualizations.

Creating a graph or a chart is simple enough, but creating a compelling one that is easy to understand and looks good is an art in and of itself. In this article, find out how to create graphs the right way so that your data comes across clearly.

The key points that you need to understand about Data Visualization are:

  • Figure out what kind of information you want to present.
  • Tailor your visualization to the audience you will be presenting to.
  • Use effects such as color, lines, and emphasis as needed.
  • Lastly, Less is More!

What do you want your data to say?

The first step in creating beautiful visualizations is determining what you are trying to present. If you choose the chart first and the data second, you will end up confusing your audience or, even worse, misleading them.

So you have to fit the chart to the data, not the other way around. Here are some data types that are commonly used in industries and projects and what chart is best to represent them:

For trends, use line charts

Line charts are among the most commonly used charts because they demonstrate an overall trend with little chance of misinterpretation. Specifically, they are good for depicting changes in values over a period of time. Quantitative data such as demand forecasts, products sold, and population growth are typical use cases for line charts.

Example of a simple line chart. (Source: BBC News)
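As a minimal sketch of the line chart described above, the Python snippet below plots a small trend with matplotlib. The monthly figures, the function name plot_trend, and the output file name are all made up for illustration and do not come from any real dataset.

```python
# Hypothetical data: units sold per month (invented for illustration).
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
UNITS_SOLD = [120, 135, 150, 145, 170, 190]


def plot_trend(months, values, path="trend.png"):
    """Draw a simple line chart of values over time and save it to path."""
    # Imported lazily so the data above can be inspected without matplotlib.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(months, values, marker="o")  # one line, one trend
    ax.set_xlabel("Month")
    ax.set_ylabel("Units sold")
    ax.set_title("Units sold per month")
    fig.savefig(path)
    plt.close(fig)
```

Calling plot_trend(MONTHS, UNITS_SOLD) writes the chart to trend.png. Time runs along the horizontal axis, so the reader's eye follows the trend from left to right.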

For showing proportions or comparing values, use bar or pie charts

Both bar and pie charts are great for showing the differences between categorical values. They can also show a side-by-side comparison across different categories. Both show how an individual category fares, either against other categories (bar charts) or against the total (pie charts).

Example of a bar chart that shows two different values for a category side by side. The goal for this chart is to show the difference between online and in-store purchases. (Source: Statista)

For bar charts, it’s important to use 0 as the baseline; otherwise, your chart will mislead users into thinking that a certain category is significantly better than the rest. You can see this in the following example:

A non-zero basis bar chart (left) can heavily mislead the audience into thinking that a certain category is far better than the other, as opposed to a zero-basis bar chart (right). (Source: Claus Wilke)
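The effect of a non-zero baseline can be checked with a little arithmetic. The sketch below compares how tall two bars look relative to each other depending on where the axis starts. The scores and the helper name drawn_ratio are hypothetical, chosen only to illustrate the point.

```python
def drawn_ratio(a, b, baseline=0.0):
    """Ratio of bar A's drawn height to bar B's when the axis starts at baseline."""
    if min(a, b) <= baseline:
        raise ValueError("baseline must be below both values")
    return (a - baseline) / (b - baseline)


# Hypothetical scores for two categories (invented for illustration).
score_a, score_b = 52, 50

zero_based = drawn_ratio(score_a, score_b)      # axis starts at 0
truncated = drawn_ratio(score_a, score_b, 48)   # axis starts at 48

print(f"axis from 0:  bar A looks {zero_based:.2f}x as tall as bar B")  # 1.04x
print(f"axis from 48: bar A looks {truncated:.2f}x as tall as bar B")   # 2.00x
```

The underlying values differ by only 4%, yet the truncated axis makes one bar look twice as tall as the other. In matplotlib, calling ax.set_ylim(bottom=0) on the axes keeps the bars anchored at zero.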

For cases like these, where the differences in value between categories are small, it’s better to use a pie chart to show them, as in the example below. It’s important not to show more than 6 categories in a pie chart: as the number of categories increases, the difference between each slice becomes harder to see.

A pie chart (left) is used to compare one category to the whole, while a bar chart (right) is used to compare one category to other categories. (Source: corona.jakarta.go.id)
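One way to stay within the six-category limit for pie charts is to keep the largest categories and merge the rest into a single slice. The sketch below shows this idea. The helper limit_slices, the "Other" label, and the sample counts are assumptions made for illustration, not part of any library.

```python
def limit_slices(counts, max_slices=6, other_label="Other"):
    """Keep the largest categories and merge the rest so at most max_slices remain."""
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) <= max_slices:
        return dict(ranked)
    kept = ranked[: max_slices - 1]
    merged = sum(value for _, value in ranked[max_slices - 1:])
    return dict(kept + [(other_label, merged)])


def plot_pie(slices, path="pie.png"):
    """Draw the slices as a pie chart with percentage labels and save it."""
    # Imported lazily so limit_slices works without matplotlib installed.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.pie(list(slices.values()), labels=list(slices.keys()), autopct="%1.0f%%")
    fig.savefig(path)
    plt.close(fig)


# Hypothetical category counts (invented for illustration).
cases = {"A": 40, "B": 25, "C": 15, "D": 8, "E": 5, "F": 4, "G": 2, "H": 1}
slices = limit_slices(cases)
print(slices)  # {'A': 40, 'B': 25, 'C': 15, 'D': 8, 'E': 5, 'Other': 7}
```

The total is unchanged by the merge, so each remaining slice still represents the same share of the whole.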

For comparing proportions, use area charts

This chart is rarely used in everyday work because it’s quite difficult to understand. Area charts show the overall volume as well as the proportion taken up by each category. They are usually used to show proportions over a period of time, such as the revenue vs. cost chart in the example below.

An example of an area chart used to plot revenue and cost. (Source: Sisense)

The chart above shows how much revenue overlaps cost. It shows that at certain times of the year, cash flow is much tighter than at others.
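A revenue vs. cost area chart like this one can be sketched in Python as follows. The monthly figures, the function names, and the min_margin threshold are hypothetical and chosen only for illustration. The helper tight_months lists the months in which revenue barely exceeds cost, which are the narrow gaps a reader would spot in the chart.

```python
# Hypothetical monthly figures, in thousands (invented for illustration).
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
REVENUE = [50, 55, 48, 70, 80, 62]
COST = [40, 52, 45, 50, 55, 58]


def tight_months(months, revenue, cost, min_margin):
    """Return the months where revenue exceeds cost by less than min_margin."""
    return [m for m, r, c in zip(months, revenue, cost) if r - c < min_margin]


def plot_revenue_vs_cost(months, revenue, cost, path="area.png"):
    """Draw revenue and cost as two overlapping filled areas and save the chart."""
    # Imported lazily so tight_months works without matplotlib installed.
    import matplotlib.pyplot as plt

    x = list(range(len(months)))
    fig, ax = plt.subplots()
    ax.fill_between(x, revenue, alpha=0.5, label="Revenue")
    ax.fill_between(x, cost, alpha=0.5, label="Cost")
    ax.set_xticks(x)
    ax.set_xticklabels(months)
    ax.legend()
    fig.savefig(path)
    plt.close(fig)


print(tight_months(MONTHS, REVENUE, COST, min_margin=5))  # ['Feb', 'Mar', 'Jun']
```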

For showing distributions, use histograms

Another chart that is commonly used in industry is the histogram, which shows how often each value occurs in a dataset. Examples include population or income distributions, where each ‘bar’ shows the number of data points that fall inside a category. The difference between histograms and bar charts lies in the categories used. For histograms, the categories must be ordered (such as age groups), whereas for bar charts they don’t have to be (such as brand names).

An example of a histogram. (Source: Chegg)
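To make the idea of ordered categories concrete, the sketch below counts how many values fall into each age group and then draws the same data as a histogram. The ages, the bin edges, and the function names are made up for illustration.

```python
def bin_counts(values, edges):
    """Count how many values fall into each bin [edges[i], edges[i + 1])."""
    counts = [0] * (len(edges) - 1)
    for value in values:
        for i in range(len(counts)):
            if edges[i] <= value < edges[i + 1]:
                counts[i] += 1
                break
    return counts


def plot_histogram(values, edges, path="histogram.png"):
    """Draw a histogram of values using the given bin edges and save it."""
    # Imported lazily so bin_counts works without matplotlib installed.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.hist(values, bins=edges, edgecolor="white")
    ax.set_xlabel("Age")
    ax.set_ylabel("Number of people")
    fig.savefig(path)
    plt.close(fig)


# Hypothetical ages and age-group boundaries (invented for illustration).
ages = [3, 8, 15, 22, 27, 31, 34, 38, 45, 52, 58, 63, 71]
edges = [0, 20, 40, 60, 80]
print(bin_counts(ages, edges))  # [3, 5, 3, 2]
```

Because the bins are ranges of a number, their order on the axis is fixed. Shuffling them, which would be harmless in a bar chart of brand names, would destroy the shape of the distribution.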

For showing relationships, use scatter plots

Scatter plots show the relationship between two variables. They show whether or not there is a correlation between them. The important thing here is to ensure that both variables are numerical.

Example of a scatter plot of the kind typically used in research journals. (Source: Chartio)
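A scatter plot is often paired with a correlation coefficient that summarizes the relationship in one number. The sketch below computes the Pearson correlation by hand and plots the points. The study-hours data and the function names are hypothetical, and Pearson's r is only one of several possible measures of correlation.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)


def plot_scatter(xs, ys, path="scatter.png"):
    """Draw one point per (x, y) pair and save the chart."""
    # Imported lazily so pearson_r works without matplotlib installed.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(xs, ys)
    ax.set_xlabel("Hours studied")
    ax.set_ylabel("Test score")
    fig.savefig(path)
    plt.close(fig)


# Hypothetical data: hours studied vs. test score (invented for illustration).
hours = [1, 2, 3, 4, 5, 6]
scores = [52, 58, 61, 70, 74, 83]
print(f"r = {pearson_r(hours, scores):.2f}")
```

A value of r close to 1 or -1 indicates a strong linear relationship, while a value close to 0 indicates little or none.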

Who will be seeing your visualizations?

How you present your visualizations depends greatly on who your target audience is. What do they intend to do with this data? What cultural, domain, or industry needs require them to see the data? How you show your data should differ depending on whether you’re presenting to executives, a lead data engineer, the sales team, or a group of high school students.

How do you style your data to make it more effective?

Visualizations often use styling to make them easier to understand. Styling can even be influenced by branding, as certain publications tend to use a specific color palette to reinforce their brand. For more information, you can look at the respective brand manuals. Examples of brand manuals that people commonly use as guidelines are the BBC Global Experience Language and the Cato Institute Data Visualization Guidelines.

The elements that you can tweak to make your data visualization complete are usually these:

  • Shapes
  • Color
  • Typography
  • Iconography
  • Legends

It’s important not to dwell on how attractive your visualizations look, but rather to focus on whether each element helps you achieve the goal you had in mind when making the chart.

Styling shapes

The shapes of your figures can be adjusted according to the required level of precision. Data that is used for comparison, or for functions that require a certain level of precision, should use sharp, defined edges, whereas data that is used to convey a general idea can use shapes with less detail.

This is a bad example of a shape for a bar chart, because the imprecise edges make the bars difficult to read. (Source: Google Material Design)

Using colors

Color is used to differentiate your data in several ways. As mentioned above, certain brands tend to use specific colors in specific ways to express their branding. In general, colors are used to differentiate categories, represent a specific quantity, highlight an issue, and express a certain meaning.

The first use of color is to distinguish different data types. Categorical palettes are used to distinguish chunks of data that do not have any specific order, while a gradient is commonly used for showing data along a range.

Examples of using a categorical color palette (left) to show different categories and a gradient (right) to show a quantity range from low to high. (Source: Google Material Design)
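The difference between the two palette types can be sketched in a few lines of Python. A categorical palette is a list of clearly distinct hues, while a gradient can be built by blending step by step from a light color to a dark one. The specific colors and the helper names sequential_palette and to_hex are illustrative assumptions, not part of any style guide.

```python
# A categorical palette: distinct hues with no implied order (illustrative colors).
CATEGORICAL = ["#1f77b4", "#ff7f0e", "#2ca02c"]


def sequential_palette(start, end, steps):
    """Blend linearly from one RGB color (0-255 per channel) to another."""
    if steps < 2:
        raise ValueError("a gradient needs at least 2 steps")
    palette = []
    for i in range(steps):
        t = i / (steps - 1)  # 0.0 at the first color, 1.0 at the last
        palette.append(tuple(round(s + (e - s) * t) for s, e in zip(start, end)))
    return palette


def to_hex(rgb):
    """Format an (r, g, b) tuple as a hex color string such as '#788cb4'."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


# A light-to-dark blue ramp for low-to-high values (illustrative colors).
ramp = sequential_palette((240, 240, 250), (0, 40, 110), steps=3)
print([to_hex(color) for color in ramp])  # ['#f0f0fa', '#788cb4', '#00286e']
```

The order of colors in the ramp carries meaning (lighter is lower, darker is higher), whereas the order of the categorical list does not.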

Color can also be used to highlight certain data. Use one primary color for the data that the audience should focus on, and gray for everything else.

Using the color orange to highlight a specific group of people that deviates from the trend in the revenue analysis. (Source: Google Material Design)
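The highlighting technique comes down to assigning one accent color to the item of interest and a muted gray to the rest. The sketch below shows one way to do that for a bar chart. The age groups, the revenue figures, and the helper name highlight_colors are hypothetical, while "tab:orange" and "lightgray" are color names that matplotlib recognizes.

```python
def highlight_colors(labels, focus, accent="tab:orange", muted="lightgray"):
    """Return one color per label: the accent for the focus label, gray otherwise."""
    return [accent if label == focus else muted for label in labels]


def plot_highlighted_bars(labels, values, focus, path="highlight.png"):
    """Draw a bar chart in which only the focus bar is colored."""
    # Imported lazily so highlight_colors works without matplotlib installed.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.bar(labels, values, color=highlight_colors(labels, focus))
    ax.set_ylim(bottom=0)  # keep the zero baseline
    fig.savefig(path)
    plt.close(fig)


# Hypothetical revenue by age group (invented for illustration).
groups = ["18-24", "25-34", "35-44", "45-54", "55+"]
revenue = [12, 18, 9, 17, 15]
print(highlight_colors(groups, focus="35-44"))
```

Calling plot_highlighted_bars(groups, revenue, focus="35-44") draws every bar in gray except the one group that deviates from the others.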

Lastly, Less is More!

You should avoid using too many elements or flashy designs in your visualizations. Keep them as simple as possible and use the bare minimum needed to get your point across to the audience. James Cheshire wrote a great article on what not to do in data visualization, which is worth a read.

If you are interested in reading more, Claus Wilke created a great and comprehensive guide on making visualizations in his book, ‘Fundamentals of Data Visualization’.

I hope this short and simple tutorial is enough to get you started on the right foot in Data Visualization and avoid the mistakes most people make in visualizing data. Stay curious!
