Tableau Tunnel #4 — Making Beeswarm plots

Ninad Barbadikar
Feb 22, 2021

Hello again, welcome back to the Tableau Tunnel.

Today, we’re doing beeswarm plots.

Hey everyone, hope you’ve been well and that things are okay with you and yours.

Thank you so much for all the support you have shown for the previous three tutorials. I really appreciate it.

Today’s tutorial is a quick guide to making beeswarm plots in Tableau.

So let’s get started.

Data prep

First of all, the data set we will be using today is from the Top Five Leagues across Europe, courtesy of the amazing people at Football Reference. Find the dataset here — https://fbref.com/en/comps/Big5/passing/players/Big-5-European-Leagues-Stats

Follow the steps in my player dashboards tutorial to convert the CSV data you’ve copied into Excel format, so that you can feed it into Tableau Public.

We will be looking at passing data per90 from across the Top Five Leagues.

The dataset is available on this link here — https://drive.google.com/file/d/16TbDgQiN6KcJNEgBE__yvEEQ7oM0hKR_/view?usp=sharing

Making Beeswarm plots

Fire up your Tableau Public.

Let’s connect to the Excel file that you’ve just converted from CSV to XLSX using Convertio.co.

Now, why Beeswarm plots?

Well, for one thing, they’re a nice way to present vizzes in a minimalistic format, provided that you can explain how they work in an understandable manner.

So essentially, you read beeswarm plots from left to right. If a player’s position on the graph is towards the right and closer to the end, they’re doing fairly well with respect to that metric.

In the above example, I’ve taken İlkay Gündoğan and Kevin De Bruyne into consideration. The former has been in excellent form of late and has somewhat eased the burden of creativity on De Bruyne.

There are improvements you can make to this of course, in terms of aesthetics and functionality. But this is how I’ve made them so far and hopefully, you find it useful.
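If you want to put a number on "towards the right", one option is a percentile rank for the metric. The snippet below is only a sketch in Python, outside Tableau: the function name percentile_rank and the key passes values are made up for illustration and are not taken from the dataset.

```python
def percentile_rank(value, population):
    """Percentage of the population with a value at or below `value`."""
    at_or_below = sum(1 for v in population if v <= value)
    return 100 * at_or_below / len(population)

# Hypothetical key passes per 90 for eight players (not real data).
kp_per90 = [0.4, 0.9, 1.1, 1.5, 2.0, 2.2, 2.8, 3.1]

# A player at 2.8 sits near the right-hand end of the swarm.
print(percentile_rank(2.8, kp_per90))  # 87.5
```

The further right a mark sits, the closer this number gets to 100.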

Using INDEX() function

I use the INDEX() function on Tableau to create the beeswarm plots.

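Normally, a scatter plot has a measure on both the X and Y axes. A beeswarm plot has only one measure, so what we are essentially looking to achieve is a jittered scatter plot, where the second axis exists only to spread the marks apart.

Jittering means adding a small random or arbitrary offset to each point so that marks with similar values don’t sit on top of each other. Read here to understand more.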
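To make the idea concrete, here is a minimal sketch of random jitter in Python. It is not what Tableau does internally, and this tutorial builds its offset from INDEX() rather than from random noise. The function name jitter, the spread of 0.5 and the sample values are all assumptions for illustration.

```python
import random

def jitter(values, spread=0.5, seed=42):
    """Pair each value with a random vertical offset in [-spread, spread]."""
    rng = random.Random(seed)  # seeded so the layout is reproducible
    return [(v, rng.uniform(-spread, spread)) for v in values]

# Hypothetical key passes per 90; identical values would overlap without jitter.
key_passes = [1.2, 1.2, 1.2, 2.5, 2.5, 3.1]
points = jitter(key_passes)

for x, y in points:
    print(f"x={x:.1f}  y={y:+.2f}")
```

The x value is untouched, so the plot still reads correctly from left to right. Only the vertical position changes.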

First, let’s generate a bee swarm plot for Key Passes per90.

Drag and drop key passes onto columns.

Now add the INDEX() function to the Rows box.

When you double-click on the Rows box, a text box appears.

In the box, type index, and you will see INDEX() appear. Hit enter and the function will be placed on your rows box.

INDEX function

Now, you will see a single dot generated, with SUM(KP) on your Columns and INDEX() on your Rows.

Generating the bee swarm plot

Follow these steps to do it —

  1. Drop the player name field onto detail under the Marks section.
  2. Change the mark type to circles.
  3. You should get a scatter-plot generated of key passes vs INDEX() values.
  4. Right-click on the INDEX() axis and unselect show header. This keeps only the marks and your key passes axis in view.

You should have something like this on your screen by now.

Now add your filters for minutes played and positions included. I’ve taken a minimum of 6 90s played for comparison. You can take more or less, entirely up to you of course. Also, another good idea would be to exclude GKs from the data set by filtering them out.
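The two filters boil down to a simple rule per player. Here is a rough Python sketch of that rule. The field names (player, pos, nineties, kp_per90) and the four rows are hypothetical and will not match the real export exactly.

```python
# Hypothetical rows; field names and values are assumptions for illustration.
players = [
    {"player": "Player A", "pos": "MF", "nineties": 20.1, "kp_per90": 2.4},
    {"player": "Player B", "pos": "GK", "nineties": 22.0, "kp_per90": 0.0},
    {"player": "Player C", "pos": "FW", "nineties": 3.5, "kp_per90": 3.9},
    {"player": "Player D", "pos": "DF", "nineties": 6.0, "kp_per90": 0.8},
]

MIN_NINETIES = 6  # the threshold used in this tutorial; adjust as you like

def keep(row):
    """Keep outfield players with at least MIN_NINETIES 90s played."""
    return row["nineties"] >= MIN_NINETIES and row["pos"] != "GK"

filtered = [row for row in players if keep(row)]
print([row["player"] for row in filtered])  # ['Player A', 'Player D']
```

Player C drops out because a high per90 number over 3.5 matches isn't a fair comparison, and Player B drops out as a goalkeeper.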

Read here to understand how filters work — https://ninad06.medium.com/tableau-tunnel-2-improving-on-scatter-plots-and-building-basic-bar-graphs-67dd16a2bb35

Now you should have fewer marks on your plot thanks to the filters, and a better picture is starting to take shape.

You’ll notice that the INDEX() has a small triangle beside it. What does that indicate? It indicates that it is a table calculation, which is a special type of calculated field.

Introducing Calculated fields

To put it in simple words, calculated fields are fields that are created and generated by the user by using existing fields in the data set. You are in control of how the measure is calculated and what type of measure it ends up being and the format of it as well.

For bee-swarm plots, there are a few calculated fields involved with some very basic code, which I will explain in a bit.

To create a calculated field, click on the small downward-facing triangle above Tables and click on create calculated field.

Then a dialogue box will appear for you to create the field itself. You can give it whichever name you like.

Calculated fields can use functions of different types, and you can read how each one works within the box itself.

Under the types of functions available, I’ve selected logical since that’s what we’ll work with for now. You can also perform simple mathematical operations to create the fields you want.
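As an example of a simple mathematical calculated field, a per90 rate is just a total divided by the number of 90s played. The sketch below shows that arithmetic in Python. The function name per90 and the numbers are made up, and this dataset may already provide the per90 values you need.

```python
def per90(total, nineties):
    """Rate per 90 minutes; returns None when no 90s were played."""
    if nineties == 0:
        return None  # avoid dividing by zero for unused players
    return total / nineties

# Hypothetical player: 45 key passes across 18 full 90s.
print(per90(45, 18.0))  # 2.5
```

In Tableau the same idea would be a one-line calculated field that divides one field by another.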

You can see how the AND operator is used, and there are examples showing how you should use those operators as well.

The first calculated field we’ll work with is called a better indexer or simply BI.

This is what you need to use —

IF INDEX()%2=0 THEN INDEX()-1 ELSE INDEX()*-1+1 END

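You can see that Tableau has deemed the calculation valid because it falls within its defined usage of the IF statement.

Why are we using this calculated field though? To put it simply: for every even-numbered mark, it subtracts one from the index value, which gives a positive offset (1, 3, 5 and so on). For every odd-numbered mark, it takes the negative of the index and adds one, which gives zero or a negative offset (0, -2, -4 and so on). The marks therefore fan out alternately above and below the middle line instead of stacking up in one direction.

Now, coming back to our dialogue box, click apply and then replace INDEX() in the rows box with BI.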
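If you’d like to check the arithmetic, here is the same formula mirrored in Python. This is only a sketch for checking the numbers. The function name better_index is made up, and the index starts at 1 as it does with Tableau’s INDEX().

```python
def better_index(index):
    """Mirror of: IF INDEX()%2=0 THEN INDEX()-1 ELSE INDEX()*-1+1 END"""
    if index % 2 == 0:
        return index - 1      # even positions go above the middle line
    return index * -1 + 1     # odd positions sit on or below it

# INDEX() starts at 1 in Tableau, so start the range at 1 as well.
offsets = [better_index(i) for i in range(1, 8)]
print(offsets)  # [0, 1, -2, 3, -4, 5, -6]
```

The first mark lands on 0, and each later mark lands one step further out on alternating sides.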

If you’re seeing this, don’t worry. This is happening because you haven’t edited the table calculation to consider players as well. So let’s change that.

Right-click on BI in the rows box and click on edit table calculation.

We’ll tell Tableau to compute using specific dimensions, which for now, is the player name —

Now close the table calculation box, and you’ll see a better plot starting to emerge.

Once again, uncheck the show header on the axis for BI before proceeding.

To add the icing on the cake before proceeding to the next part, right-click on the key passes measure in the Tables list and under create, select bins.

A dialogue box will appear for you to change the size of bins but for now, we will not be messing with that.

Click OK and the bin is created as a discrete field. Drag and drop it onto detail under the marks section.
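Why does the bin matter? One way to picture it, assuming the table calculation is computed using player name while the bin sits on detail, is that the index restarts inside every bin. Players with similar values then get spread around the middle line instead of overlapping. The Python sketch below imitates that behaviour. The function names, the bin size of 0.5 and the player values are all assumptions for illustration.

```python
import math
from collections import defaultdict

def better_index(index):
    """Same alternating offset as the BI calculated field."""
    return index - 1 if index % 2 == 0 else -index + 1

def beeswarm_offsets(values, bin_size):
    """Return {name: (bin_start, offset)} with the index restarting per bin."""
    by_bin = defaultdict(list)
    for name, value in values.items():
        bin_start = math.floor(value / bin_size) * bin_size
        by_bin[bin_start].append(name)

    layout = {}
    for bin_start, names in by_bin.items():
        # Number the players inside this bin from 1, like INDEX() would.
        for index, name in enumerate(sorted(names), start=1):
            layout[name] = (bin_start, better_index(index))
    return layout

# Hypothetical key passes per 90 for five players.
kp = {"A": 1.1, "B": 1.3, "C": 1.4, "D": 2.2, "E": 2.3}
layout = beeswarm_offsets(kp, bin_size=0.5)
print(layout)
```

In the actual plot the horizontal position is still the player’s real value. The bin only decides which marks share a counter, so a smaller bin size gives narrower columns of marks and a larger one gives taller stacks.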

Pretty cool isn’t it? This is your basic bee-swarm plot without any aesthetic modifications.

Let’s improve on this.

Since we’re looking to highlight Gundogan and de Bruyne, let’s create a calculated field that will help us do just that.

For lack of a better title, I’ve just called the field a color loop.

Copy-paste this —

IF [Player - Split 1] = 'İlkay Gündoğan' THEN 'Blue'
ELSEIF [Player - Split 1] = 'Kevin de Bruyne' THEN 'Dark Blue'
ELSE 'Grey OUT'
END

Remember that the names you specify here have to match the format in the data set exactly. Since Gündoğan’s name contains special letters, you’ll have to type it exactly as it appears in the data in any calculated field you use.
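The exact-match point is easy to see if you mirror the color loop in Python. This is just a sketch of the same logic, not something Tableau needs, and the function name color_loop is made up.

```python
def color_loop(player):
    """Return a color label for highlighted players, 'Grey OUT' otherwise."""
    if player == "İlkay Gündoğan":
        return "Blue"
    elif player == "Kevin de Bruyne":
        return "Dark Blue"
    return "Grey OUT"

print(color_loop("İlkay Gündoğan"))  # Blue
print(color_loop("Ilkay Gundogan"))  # Grey OUT, because the spelling differs
```

The comparison is character by character, so a name typed without the special letters silently falls through to the grey group.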

Let’s drag and drop color loop onto Color under marks.

You’ll see that Grey OUT, Blue, and Dark Blue each have a different color. These are colors I’ve assigned to those marks using edit colors under colors, choosing from the default Tableau color palettes. If you feel the need to use something else, just double-click on the name of the color under select data item. There you can change the color as you wish.

You can also use the hex code of the color to find the exact shade you want. Just Google the name of the club and its color code. For example, Googling Manchester City color code will give you this website, among many others: https://teamcolorcodes.com/manchester-city-fc-colors/

Man City’s hex code looks something like this — #6CABDD

All you have to do is copy and paste the hex code into your Tableau edit color dialog box.
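A hex code is simply three pairs of hexadecimal digits for red, green and blue. If you ever need the RGB values instead, the conversion is short. The function name hex_to_rgb below is made up for this sketch.

```python
def hex_to_rgb(code):
    """Convert a 6-digit hex color such as '#6CABDD' to an (R, G, B) tuple."""
    code = code.lstrip("#")
    if len(code) != 6:
        raise ValueError("expected a 6-digit hex code such as #6CABDD")
    # Read two hex digits at a time: red, green, blue.
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))

print(hex_to_rgb("#6CABDD"))  # (108, 171, 221)
```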

The next improvement is the size. For the players you wish to highlight, you obviously want their circles to be bigger than the rest, and one easy way to do that is to group the players in a set.

Once again, click on your player name field and under create, select set.

Tableau will ask you to specify which players you wish to include in the set — Tick the names of the players or enter their names in the search bar of the create set dialog box.

Gundogan is a bit trickier so I had to manually scroll down to the I letter names and find him.

Okay, good, now you’ve created your set. You’ll see the field in the screenshot above: player split 1 set.

Drag and drop it onto size, under marks.

This is how it’ll look before you edit the sizes, just as you edited the colors.

You’ll see the size card on the right-hand side of your Tableau. Hover over the title of the card and you’ll see the triangle for the dropdown menu, which contains edit sizes...

Now you can adjust the size as you wish. The easiest thing to do is to just reverse the default sizing.
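Conceptually, a set just splits players into two groups, in and out, and each group gets its own size. Here is a small Python sketch of that idea. The size values 12 and 4 are arbitrary, and the names highlight and mark_size are made up for illustration.

```python
# Players in the set get the bigger circle; everyone else gets the small one.
highlight = {"İlkay Gündoğan", "Kevin de Bruyne"}

SIZES = {True: 12, False: 4}  # arbitrary sizes: in the set vs out of it

def mark_size(player):
    """Look up the circle size from set membership."""
    return SIZES[player in highlight]

print(mark_size("Kevin de Bruyne"))  # 12
print(mark_size("Someone Else"))     # 4
```

If the highlighted players come out smaller than the rest in your plot, that is the mapping being the wrong way round, which is what reversing the sizing fixes.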

I’ve changed the background color and font color by formatting them. Go back to tutorial #3 to understand how you can do that.

Now, what else can I do? Let’s clean it up a little bit by removing the name from the axis.

Right-click on the KP axis and select edit axis.

Just leave the title under Axis titles empty and exit the dialog box. Now there will only be values at the bottom and you still have the title of the worksheet at the top.

You have to repeat the same process for the rest of the fields that you wish to look at.

I would recommend simply duplicating this current worksheet as many times as you need in order to make your work easier.

In the duplicated worksheet, replace key passes with the measure you wish to compare the players on.

Drag and drop it onto the SUM(KP) in columns.

Don’t forget to remove the previous measure’s bin field from detail and replace it with the current field’s created bin.

Duplicated worksheet changed for progressive passes.

I’ve done this two more times as I also wanted to look at passes into the penalty area as well as final third passes, all fields being per90 of course.

Now, once you’ve done all of that, put it together in a dashboard.

Before placing them together, you need to have containers to fit them into, so first drag and drop a horizontal container onto your empty dashboard.

Then drag a vertical container to divide the dashboard into two equal halves.

Now, within the vertical container on the right side, drop a horizontal container onto either the upper or lower part of the container.

Now, do the same for the vertical container on the left side.

You should now have four containers for you to drop the worksheets onto.

This is how it’ll look, depending on how you’ve formatted your colors and fonts.

Now, our text is there but isn’t visible because of the dashboard’s white background. You can change that by clicking on dashboard on the topmost toolbar and selecting format under that.

I’ve changed the color of the background and then removed the color and size legend cards above the final third passes worksheet.

Finally, this is what we’ve ended up with.

And that’s how I do bee swarm plots. This is just one of a few ways to do it.

Thanks for checking out today’s tutorial. If you like, do practise it and share your work on your socials or send it over to me on DMs. I’d love to see what you guys come up with.

Once again, a big thanks to Football Reference for the data.

Stay safe and be well guys, until next time. Take care.

Happy vizzing!

This is me bzzzing myself out. (sorry, I had to, lol)