Tableau Tunnel #2 — Improving on scatter-plots and building basic bar graphs

Ninad Barbadikar
Jan 11, 2021


Hey everybody, welcome back to the Tableau Tunnel.

I trust that you found the first tutorial helpful, and I’m grateful for all the support you showed. It truly does mean a lot.

In today’s tutorial, I’ll show you how to improve on the scatter-plot you made in the first tutorial, and by the end you will have learnt how to make basic bar graphs.

Now, this is where we stopped the last time. What are the problems with this scatter-plot?

1. Problematic colour scheme: some of the colour shades look too similar to tell apart, so this palette isn’t ideal.

2. Overlapping circle marks: some of the circle marks on the plot are so close to each other that we don’t see the year labels on some of them.

Let’s try to tackle these problems and figure out a way around them in today’s scatter-plot.

Today, I’ll introduce a different data set. This is from Football Reference.com and contains some standard statistics.

Access the data set from this drive link here —

https://drive.google.com/drive/folders/1RmzP5qTuepSpQQuNLEctekqah5RNp1TS?usp=sharing

Improving Scatter-plots

These days, a lot of us prefer to have a dark-themed scatter-plot and all graphics in a darker theme in general. Why? Well, apart from the coolness factor, darker visualisations are easier on the eyes as well.

The colours come out looking better as well.

So let’s try to build an improved scatter-plot today.

Open up your Tableau Public and connect to the Premier League data set that you downloaded from the above drive link.

This is how your screen should look.

You’ll see that a lot of the column names default to F1, F2 and so forth, and that there are a lot of null values as well. To fix that, tick the Use Data Interpreter option.

After you do that, the null values should be gone and the column names should be cleaned up and will be in accordance with your data set.

Now let’s go to Sheet 1.

Let’s look at some basic stuff: who has scored the most goals and had the most assists so far this season?

In Tutorial #1, I told you that you’ll need to drag your measure or dimension to column or row. An alternate method is to just double click on the two measures you’re trying to compare.

Here’s how —

After double-clicking on Performance Ast

You’ll notice that Tableau has summed up all the values in the data set and generated a bar graph. Notice how this immediately changes to a scatter-plot when you double-click on Performance Gls.

This happens because when you’re comparing two Measures, Tableau generates a scatter-plot by default. Once again, you notice that all the values have been summed up and a single dot has been generated.

When you hover your cursor over the dot, you will see that 441 goals have been scored and 301 assists have been made. That little box you see the information in is called the Tooltip. Keep this in mind, it’ll come in handy later.

In Tutorial #1, I told you that to generate different marks on the scatter-plot, you have to go to Analysis and uncheck Aggregate Measures.

Another way to generate all the player marks is to drag the Player dimension onto the Detail box in the Marks section.
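If it helps to see the idea outside Tableau, here is a rough Python sketch of the difference between the single summed dot and one mark per player. The three rows are made up for illustration (they are not from the data set), and only the column names are borrowed from it.

```python
# Made-up sample rows; the real data set has one row per player.
players = [
    {"Player": "Player A", "Performance Gls": 5, "Performance Ast": 2},
    {"Player": "Player B", "Performance Gls": 3, "Performance Ast": 4},
    {"Player": "Player C", "Performance Gls": 0, "Performance Ast": 1},
]

# With measures aggregated and nothing on Detail, every row is summed,
# so the whole table collapses into one (goals, assists) point.
total_goals = sum(row["Performance Gls"] for row in players)
total_assists = sum(row["Performance Ast"] for row in players)
single_mark = (total_goals, total_assists)

# With Player on Detail, the sum is taken per player instead,
# which gives one point for each player (each appears once here).
marks_per_player = [
    (row["Player"], row["Performance Gls"], row["Performance Ast"])
    for row in players
]

print(single_mark)            # (8, 7)
print(len(marks_per_player))  # 3
```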

Let me quickly go over what the Marks section is and why it is important.

The Marks section is the element on Tableau that you interact with the most and one of the most important elements as well.

It basically lets you manipulate the graph you’re generating by adding or removing things such as the colour of the marks, the size of the marks, the labels on the marks, the level of detail in the calculation, and the tooltip.

Now, when the user doesn’t specify any particular type of mark, you’ll notice that the scatter-plot usually generates circles that aren’t filled.

Case in point —

After dragging the player dimension onto detail, this is what it looks like.

Each of the marks generated represents the values of different players. Now, if you’d like to change the type of marks being used, you can do that in the marks section.

You’ll see here the different types of marks that you can play around with, for now, we’ll stick with circles.

Now wait a minute, the Premier League obviously has a lot more players than the number of marks on this scatter-plot, so why are we seeing so few marks? That is because a lot of players have the same number of goals and assists, so their marks land on exactly the same spot and stack on top of each other.
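Here is a small sketch of why that happens, using five invented players rather than the real data. Players who share the same goals and assists share the same coordinates, so you see fewer distinct dots than there are players.

```python
from collections import defaultdict

# Invented (player, goals, assists) rows, for illustration only.
players = [
    ("Player A", 5, 2),
    ("Player B", 0, 0),
    ("Player C", 0, 0),
    ("Player D", 5, 2),
    ("Player E", 1, 0),
]

# Group players by the position their mark would occupy on the plot.
positions = defaultdict(list)
for name, goals, assists in players:
    positions[(goals, assists)].append(name)

print(len(players))    # 5 players
print(len(positions))  # 3 visible positions
print(positions[(0, 0)])  # ['Player B', 'Player C']
```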

Another thing to keep in mind is that these are the values of all players in all positions in the Premier League who have spent even a minute on the pitch. This is why, when it comes to making basic visualisations like this, it is important to adjust for minutes played, more on that later.
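To make "adjusting for minutes played" concrete, here is a minimal sketch of the usual per-90 calculation, which scales a total to a rate per 90 minutes on the pitch. The function name `per_90` and the two example players are my own, not something from the data set.

```python
def per_90(total, minutes):
    """Scale a season total to a rate per 90 minutes played."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return total / minutes * 90


# Hypothetical players: a regular starter and a late substitute.
regular_starter = per_90(6, 1350)  # 6 goals in 1350 minutes
substitute = per_90(2, 180)        # 2 goals in 180 minutes

print(round(regular_starter, 2))  # 0.4
print(round(substitute, 2))       # 1.0
```

The substitute looks far better per 90 even though the sample is only two matches' worth of football, which is one reason a minimum-minutes filter tends to go hand in hand with per-90 numbers.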

Let’s get back to our scatter-plot and the labelling.

Once again, drag the player dimension and drop it onto the label box in the marks section.

This is what you should see —

Now, these labels look problematic and the repetition of names is not ideal, so how do we overcome this? We simply split the player dimension.

To do that, right-click on the Player dimension under Tables.

Click on Split under Transform.

When you do that, the single player dimension splits into three.

After splitting
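Conceptually, a split just cuts each value at a separator and keeps the pieces as new fields. The sketch below assumes the raw values repeat the name after a backslash, which is a guess on my part about how the export looks; the value and the helper `split_field` are hypothetical.

```python
def split_field(value, separator="\\"):
    """Return the pieces of value cut at separator, without stray spaces."""
    return [part.strip() for part in value.split(separator)]


# Hypothetical raw value; check your own Player column for the real format.
raw_player = "Player Name\\Player-Name"
parts = split_field(raw_player)
split_1 = parts[0]  # the piece you would use as the new label

print(parts)    # ['Player Name', 'Player-Name']
print(split_1)  # Player Name
```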

Now, before dragging one of those and making them the new label, you’ll have to remove the previous label from the marks on the plot.

Simply right-click on the Player label under the Marks section, the one with the little letter T beside it (that’s the symbol for labels), and click on Remove.

Replace that with player split 1.

Looks better, right?

Now do the same thing and remove the Player field on Detail, right below Player split 1. That cleans up your tooltip.

Now let’s talk Filters.

Introducing Filters

Now, why are filters important?

Whenever you’re making player comparisons like this, there is one important thing to keep in mind. I’ll throw back to this very important point from Jay Socik, AKA BA Analytics:

Always keep this in mind.

So let’s learn how to do that!

In Tableau, you can filter using both dimensions and measures. Let’s filter by positions and compare Forwards in the Premier League. You’ll see the Filters section right above the Marks section.

Drag the dimension named Pos into that Filters section.

A dialog box will appear out of nowhere.

Here, you will be able to filter for all the positions in the dataset. Let’s check all the options for forwards.

Now you’ll see that some of the names will change and the no. of marks will reduce.
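In plain Python, a dimension filter like this is just "keep the rows whose value is one of the ticked boxes". The position codes below (FW, MF, FWMF, DF) are assumed for the sake of the example, so check the actual values in your Pos column.

```python
# Invented rows with assumed position codes.
players = [
    {"Player": "Player A", "Pos": "FW"},
    {"Player": "Player B", "Pos": "MF"},
    {"Player": "Player C", "Pos": "FWMF"},
    {"Player": "Player D", "Pos": "DF"},
]

# The boxes you would tick: every position value that mentions forwards.
all_positions = {p["Pos"] for p in players}
checked_positions = {pos for pos in all_positions if "FW" in pos}

# The filter keeps only rows whose Pos is one of the ticked values.
forwards = [p for p in players if p["Pos"] in checked_positions]

print(sorted(checked_positions))         # ['FW', 'FWMF']
print([p["Player"] for p in forwards])   # ['Player A', 'Player C']
```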

Let’s go one step further and adjust for minutes played. Just as you dragged Pos into Filters, scroll down the list of measures and drag Playing Time Min into Filters.

Here, Tableau will ask how you wish to filter this field, in case there’s a particular calculation you’d like to filter by. Simply click on All values.

Now drag the lower limit of minutes played and set it as you like. I’ll look at players who have played at least 900 minutes.
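The same filter, sketched in Python with made-up minutes. I am assuming the lower limit is inclusive, so a player on exactly 900 minutes stays in.

```python
MIN_MINUTES = 900  # the lower limit chosen on the slider

# Invented rows; only the column name comes from the data set.
players = [
    {"Player": "Player A", "Playing Time Min": 1260},
    {"Player": "Player B", "Playing Time Min": 450},
    {"Player": "Player C", "Playing Time Min": 900},
]

# Keep everyone at or above the lower limit.
kept = [p["Player"] for p in players if p["Playing Time Min"] >= MIN_MINUTES]

print(kept)  # ['Player A', 'Player C']
```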

Again, the graph becomes a little bit cleaner and better-looking.

Now since many of you requested a quick intro on dark vizzes, let’s do that —

Right-click anywhere on the white region on your scatter-plot and click on Format.

The Format panel will open on the left-hand side on your Tableau window.

Here you can change the font, the background, change the lines in the background and essentially transform the look and the aesthetic of your visualisation. There are broadly five elements to it —

Font, Alignment, Shading, Borders and Lines.

Let’s go to Shading.

Now under that panel, you’ll see this —

Select the drop-down menu and choose the colour of your choice.

You’ll notice that the labels are visible and looking alright in white. However, the title of the worksheet and the axis labels are dark grey in color and are a bit difficult to see, so let’s change that by going to the first formatting panel under Font.

Similar to changing the background color, you can change the type, color and size of the font on the viz as well.

Let’s change the font color to white in the dropdown menu under Worksheet and Title.

The grid lines are a bit distracting, aren’t they? Let’s get rid of them.

Go to the Lines section in Format.

Under the Grid Lines dropdown menu, select None. Then remove the slightly dotted zero line by doing the same under Zero Lines.

The viz looks much cleaner now. Let’s do something about the color.

How about we color them using a gradient of Expected Goals+Expected Assists per90? Let’s try that.

Drag the field Per 90 Minutes xG+xA onto Color in the Marks section.

You’ll see that each player has been assigned a color from the gradient according to the value of their expected contributions per90.
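For the curious, a gradient like this boils down to placing each value between the lowest and highest value and blending two colours by that fraction. The sketch below is a plain linear blend with RGB values I picked myself. It is not Tableau's actual palette logic, and `blend_colour` is a hypothetical helper.

```python
def blend_colour(value, low, high, start_rgb, end_rgb):
    """Linearly blend start_rgb to end_rgb as value goes from low to high."""
    if high == low:
        return start_rgb
    fraction = (value - low) / (high - low)
    fraction = max(0.0, min(1.0, fraction))  # clamp to the 0..1 range
    return tuple(
        round(start + (end - start) * fraction)
        for start, end in zip(start_rgb, end_rgb)
    )


# Invented xG+xA per 90 values.
values = {"Player A": 0.20, "Player B": 0.55, "Player C": 0.90}
low, high = min(values.values()), max(values.values())

light_blue, dark_blue = (198, 219, 239), (8, 48, 107)  # assumed colours
colours = {
    name: blend_colour(v, low, high, light_blue, dark_blue)
    for name, v in values.items()
}

print(colours["Player A"])  # (198, 219, 239), the lowest value
print(colours["Player C"])  # (8, 48, 107), the highest value
```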

This is how your viz should look now —

Oh, another thing, I changed the view of the viz from Standard to Entire View, so that it looks less congested.

You’ll notice that this graph emphasises those players who have more assists since assists are along the Y-axis. You can change that by swapping the axes using the shortcut Ctrl+W or by clicking on the icon as shown here

Now let’s quickly put this into a dashboard

This is how it’ll first look —

Now, change the size of this dashboard to generic desktop dimensions (1366x768)

Follow the screenshots below to see how you can do that —

Here, select Fixed size
Now select Generic desktop
This is how it should look

Now, you’ll see that the worksheet title and the colour legend have blended into the white background of the dashboard, so let’s change that.

Click on Dashboard in the toolbar at the top and in the menu, select Format.

Similar to how you formatted the worksheet, now format the dashboard’s elements. For now, simply change the background shading to black.

This is how it should look now -

The colour legend isn’t very clear, but you can fix that by adjusting the size of the legend box itself, which is relatively simple.

So there you go, this is how you create a dark-themed scatterplot.

Now let me quickly go over how you can create a basic bar graph.

Remember earlier when I told you that dropping a single measure as a field generates a standalone bar graph? Well, when you add a dimension to that, the bar graph becomes complete. Let’s explore how that works.

Go back to sheet 1, and right-click on the sheet 1 tab and click on duplicate, so that you don’t have to repeat all the filters and the rest of the steps again.

Now you should have another sheet named Sheet 1 (2).

Here, remove both the goal and assists measures from columns and rows.

Drop the field Per 90 Minutes G+A onto columns and player split-1 onto rows.

This is how it’ll look now

Change the marks from circle to Bar.

Remove player-split 1 from label.

Remember the icon I showed you that exchanges the rows and columns? In the same line of functions, there is an option to sort the graph values in ascending or descending order. Let’s sort in descending order.

This is how it looks now —

Now, this is too many players on a single bar graph, so let’s just look at the top 10 in the league by filtering on the same field we are using, i.e. Per 90 Minutes G+A.

Drag that measure from the left-hand side panel into Filters and set the minimum value to 0.71, as Dominic Calvert-Lewin is the 10th highest-ranking player for this metric.
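If you are wondering where a cut-off like that comes from, it is simply the value of the Nth highest player. Here is a small sketch with five invented players and a top 3 instead of a top 10. The helper `cutoff_for_top_n` is hypothetical.

```python
def cutoff_for_top_n(values, n):
    """Return the n-th highest value, to use as the filter's minimum."""
    ranked = sorted(values, reverse=True)
    if n < 1 or n > len(ranked):
        raise ValueError("n must be between 1 and the number of values")
    return ranked[n - 1]


# Invented G+A per 90 values.
g_plus_a_per_90 = {
    "Player A": 1.10,
    "Player B": 0.45,
    "Player C": 0.82,
    "Player D": 0.70,
    "Player E": 0.30,
}

minimum = cutoff_for_top_n(g_plus_a_per_90.values(), 3)

# Keep players at or above the cut-off, sorted in descending order.
top = sorted(
    (name for name, v in g_plus_a_per_90.items() if v >= minimum),
    key=g_plus_a_per_90.get,
    reverse=True,
)

print(minimum)  # 0.7
print(top)      # ['Player A', 'Player C', 'Player D']
```

Keep in mind that a hand-typed minimum goes stale as soon as the data is refreshed, and ties at the cut-off can let more than N players through.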

Then use the same measure to add labels for the values on the bar graph. Drag Per 90 Minutes G+A onto Label in the Marks section.

This is how it should look —

Now, let’s create a dashboard for this as well. Remember how you duplicated Sheet 1? You can duplicate Dashboard 1 in the same way.

You’ll see the same scatter-plot on Dashboard 1 (2), so you’ll need to remove Sheet 1 and replace it with Sheet 1 (2).

This is how you add or remove sheets

So after replacing it with the sheet of the bar graph, let’s clean up the rest of the details.

Change the title of the worksheet, add your social media handle, and make sure you credit Statsbomb and Football Reference for the data.

The final bar graph

So here’s your basic bar graph! We’ve listed the top 10 players in the Premier League for Goals+Assists per90 and shaded their colours according to their Expected tally per90.

Let’s summarise

Today we learnt how to improve and polish our scatter-plots, and we corrected some errors from previous plots.

We learnt how to make a dark background scatter-plot and a bar graph as well.

We learnt how to use filters and why they’re important.

That’s about it for today!

Let me give you a few assignments —

  1. Generate a scatter-plot assessing midfielders who have either overperformed or underperformed relative to their xG.
  2. Generate a bar graph displaying the top 10 defenders for Expected goals+assists per90.

Good luck!

Thank you for following the second tutorial of my Tableau Tunnel series. I hope you’ve enjoyed today’s guide. If you have any trouble following any of the parts, feel free to reach me on Twitter at @NINADB_06 or by email at ninad066@gmail.com

Happy vizzing!

Take care and stay safe!
