Heatmap Tables with Excel – Revisited

We’ve revisited one of our more popular guides Heatmap Tables with Excel as they can be a very effective way of presenting data on a dashboard, and have now updated it for Excel 2010…

This Heatmap Table is designed to show you the revenues and the discounts of a company over the course of one year per product group. The size of a bubble shows the revenue made in a particular month and the bubble color shows the discount rate given. The discount rate has been encoded as a range of green colors, ranging from a light green, for low discounts to a dark green for high discounts. The years and product totals are shown at the right and bottom as an integrated part of the table.

Tufte often talks about the integration of numbers, images and words; I think he’s quite right. A way to achieve this in Excel is to integrate charts into tables, so called graphical tables, a very effective means to show “More Information Per Pixel“.

The heatmap table is based on a regular Excel bubble chart. To integrate a bubble chart into a table the bubbles are positioned in a matrix that has the same row and column layout as our table.

 

 

 

 

 

 

 

 

 

 

 

 

 

In our case we generate a data series table with one column for the X-Series going from 1-12 for January – December and one column for our Y-Series going from 1-8 for our 8 product groups and one column for revenue.

In the sample spreadsheet we’ve setup some simple excel formula to translate data from the classic grid layout:

to the required format:

Now we can insert the bubble chart:

 

To ensure that the charts fit exactly into the table grid we set Min/Max for the X axis to 0.5/12.5 and for the Y axis to 0.5/8.5. Excel would calculate much larger auto scales otherwise. Also set the Major units to 1 so we can use that later to set some grid lines.

 

Now we remove the legend, the X and Y axis, maximize the plot area and align the chart with the Excel table. As the bubbles are initially too large we have to make them smaller. To control the bubble size go to Data Series Options and scale the bubble size to 50%:

 

This already makes a nice bubble table you could use to reproduce the Twitter Charts.

For the grid lines format your table headers and grid lines with light gray grid lines. Resize the plot area, remove the border and re-position the chart so that the chart and the table grid lines align.

To create the heatmap with different colored bubbles we use the fact that by default Excel does not plot data points for #NA values.  For the heatmap we overlay 8 bubble series, one  series per green shade, and show a revenue bubble only if the value fits into the value range that corresponds with a green shade of our color ramp, otherwise we show #NA.

We divide the range MAX(Discount)..0 into 8 groups to define the colours.

The data series columns use the following formula to test if a discount value corresponds with an interval / colour shade:

=IF(AND($E7>I$6-Step,$E7<=I$6),$D7,NA())

The formula returns the revenue, if the discount values is in the interval defined in the column header I$5.

 

 

Now create the eight data series so that the bubble size refers to the eight columns in the data table:

 

And use the Excel chart styles to pick a colour range – make sure you  remove the border from the chart area.

 

 

And you could use the chart styles to quickly switch between different colours – or customise each series to refine the colors.

You can download a starting point for these files here: HeatmapSample.xlsx. Most of the formulae should adapt to data values that you can feed into the data sheets, including data straight from Analysis Services if using XLCubed grids or formulae.

You can see an interactive version of the Heatmap here – we added a link to some cube data, some Slicers for driving the parameters and then published to XLCubedWeb.

 

 

Flexible time-series graphing from a slicer

We are often asked how to drive a chart from a slicer in XLCubed and how to plot days/months for a month or year. Base case this is fairly straightforward, you can set up a grid which is based on the previous ‘x’ months of a slicer selection for example. The difficulty can be where you want to vary the behaviour depending on which level of the hierarchy the user chooses. This is particularly true where the hierarchy contains semesters or quarters.

The example below shows a technique to handle this complexity and display the chart in a way meaningful to the user in each case. The report is based on a slicer that allows the user to switch between showing the graph data based on quarters, months or days.

You can download the Excel spreadsheet that is used in the example here TimeSeriesGraphFromSlicer

This connects to the Adventureworks demo database which ships with Analysis Services.

The diagram below shows the flow of data from each worksheet showing the final result in the sheet Chart.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Workbook Sheet – Chart

This sheet shows the graph based on the data chosen in slicer above it. This switches the graph data between quarters, months and days depending on the slicer selection.

 

Workbook Sheet – GridForChart

This shows the data that will be graphed, depending on the choice made by the slicer selection. In this example it is months July 2001 – June 2002. FY2002 has been selected by the user (in this example Financial Year 2002 runs from July 2001 – June 2002).

Note that cells A10 – A21 contain the value ‘TRUE’ – these cells contain an XL3RowVisible statement as follows:

=XL3RowVisible(B10<>””)

This statement hides rows with no data so that they are not plotted on the graph.

Workbook Sheet – SlicerToMonthDay

This sheet contains the data that is returned by the choice of the slicer in workbook sheet Chart.

User selects a month

The data will be graphed as days. For example, if the user selects July 2002 then the graph will be displayed with each day in July along the x-axis. These are defined in XLCubed as ‘Children of’ the slicer.

User selects a quarter year

The data will be graphed as months in a three month period. For example, the user selects Q1 FY 2003 and the data displayed is for three months from July 2002 – September 2002 as below. These are defined in XLCubed as ‘Descendants of’ the slicer at month. This will be the same when the user picks year, semester or quarter.

User selects a half-year

The data will be graphed as months in a six-month period. For example, the user selects H1 FY 2003. The screenshot below shows the data that will be graphed.

However, it can be seen that the values Q1 FY 2003 and Q2 FY 2003 should not appear on the graph.

Using the Edit Member functionality it is possible to remove these so that they do not appear as points on the graph.

To do this, edit the Date.Fiscal member and click on Advanced tab.

Click on the drop down next to first member – that member set is the resulting data when the user selects H1 FY 2003 and shows the data that is in cells B10 – B43 in sheet SlicerToMonthDay.

 

The screenshot below shows the data that will be subtracted – it is in effect the actual value selected by the user via the slicer alongside the two Fiscal Semester values Q1 FY 2003 and Q2 FY 2003.

 

The GridForChart sheet now shows just the six months that should be graphed. As explained earlier further manipulation using the XL3RowVisible functionality removes blank rows.

 

The screenshot above shows the graph with six months of data for H1 FY 2003 for months July 2002 – December 2002, and the quarters have been dynamically excluded.

The end result is a flexible time selector where the user can choose dates at different levels in the hierarchy, and will always get a meaningful and in-context time series chart.

 

 

Data Visualization – a real world example

In the following example we work through a real world example of a data visualization. We’ve chosen an example that involves Operations data – this is fairly non-domain specific so hopefully it can demonstrate some important points. The first, and most important point is that you have to define your audience.

We receive many questions about “what is the best chart for this situation” or “what colour should I use for emphasis”. These questions are usually attacking the problem from the wrong angle. The one question you need to ask before anything else is “who is this visualization going to be seen by and how?” Is it in a boardroom on a printed sheet or across a trading floor on a plasma screen. Are the consumers domain experts?

This example features data about an investment bank’s operations processing, the audience being the clients of the Operations department.

Starting Point

Initially the project started out as simply trying to record what operational problems were encountered on a daily basis across different product lines. A reporting system was built and various generic reports produced:

DVBlog1

Unfortunately the reports either didn’t contain data at a granular enough level or it was difficult for the product managers to see where the issues were occurring and what the trends were. In reality the report showed what the major problems had been – unfortunately this was already known, as when something major goes wrong you remember getting shouted at!

What was requested

The client wanted a report that showed where the problems were occurring across business lines (rather than operational units) and how they were doing historically in a single page that could be included in a weekly MIS pack (they currently had four pages per product line (8) so a total of 32 pages. As a first pass they simply wanted an Excel worksheet they could update manually:

DVBlog2

We felt this solution lacked clarity and it was very difficult to spot trends across products.

What we proposed

We designed a solution using MicroCharts to allow small multiples of charts to show a variety of views:

DVBlog3

This solution allowed the user to view the data simply as a cumulative set of data by Product (top line) or by Root Cause (vertically) and then look deeper into historical trends in the centre of the chart. For example, its fairly easy to see spikes in the Root Cause data historically and see that the overall trend has improved over time. By ranking the Products and Root Causes you immediately give some sense of scale to the data. For example you can see that there are many more Application failures than any other type of problem, but the majority of root causes are otherwise fairly evenly distributed.

One other point worth noting was that the original colour scheme was much more muted, but the client got very upset that it looked like a competitor’s corporate colour and wanted it to be “louder”.

What was the user reaction…

Ecstatic, 1 page replaced 34 and they could see at a glance how the entire (large) organisation was working but also quickly find out detail for a particular area and identify trends.

(de)Faults in Excel Charting

I recently spoke at SQLBitsIII, and an aspect which went down well was a simple overview on how to make the most important aspect of a graph, namely the underlying data, the prime focus and clear and easy to read. I also had the opportunity to attend Stephen Few’s Information Visualisation Workshops in London, which I’d thoroughly recommend. Stephen also spent some time, as part of a much more detailed overall agenda, on how a typical default chart can be morphed into an effective display.

So it’s back to basics this week, and how to improve the standard, out of the box Excel chart. Unfortunately, despite it’s pervasiveness, the default chart settings, which many users will never stray from, are in the case of Office 2007 not ideal, and in earlier versions, pretty awful. In this piece I’ll outline a few simple steps which can turn the default visual delights of the Excel graph into something you need not be embarrassed to put on the projector.

I’m using ‘classic’ Excel as my start point, because it’s still the incumbent in most organisations (and also because it’s worse). The example is for column charts, but the majority of the tips are valid for any chart type. As our start point we have the unit sales data for 3 products across 6 countries, as a default Excel Column chart, below.

BadChart

 

 

 

 

 

Nice. It’s wrong in a lot of ways, but how many hundred times have you seen this or a version of this? It’s well trodden ground if you have read Tufte, Few at al, but the key recommendations to improve things are surprisingly simple, and quick to implement.

1) Remove the Clutter and noise

The purpose of the chart is to display the data of interest clearly and concisely. It’s not to distract the user with pretty shading or 3D effects etc. Although the default chart is no-frills, there are a number of items which are adding nothing, or have undue prominence, and in doing so detract from the overall goal.

  • The Plot Area
    • The grey background to the plot area adds nothing, so we remove it
    • The border on the plot area – remove it also (numerous studies have shown we only need two axes to effectively group and visualise data)
  • Gridlines
    • The default gridlines are black, too visually intense. They are there for reference when required, not the prime focus, so are best muted – set them to a light grey.

2) Axes and Legend

The axes frame the chart, and are a key point of reference; however they should not draw the focus from the chart itself. As with the gridlines, they should be toned down.

  • Change the default black font colour to charcoal / dark grey
  • Change the default axis colour from black to charcoal / dark grey
  • Typically reduce the font size to 8

Rules for the legend are similar to those for the axis

  • Change font from black to grey
  • Remove border or change it’s colour to very light grey
  • Typically reduce font size to 8
  • For a clustered column, my preference is for the Legend positioned at the bottom, and reading across in the display order of the columns.

3) Columns and colour

The black column borders add nothing, and as such should be removed, they are another form of Tufte’s ‘non-data ink’.

On to colour, and unfortunately Excel’s default chart fills are heavily saturated plum and wine with a light cream..  So I’d strongly suggest changing the chart colour palette. For column charts, there is typically a reasonable block of colour for each series, so the colour scheme shouldn’t be too bold, or it becomes an eyesore. You should aim for mid-intensity colours of similar saturation (unless one is intended to stand out), pastels tend to work well.

 

All the steps above are simple and fairly fast to action, with one exception, the colour scheme. Unless you already have pre-prepared palettes it’s possible to spend an age trying to get the ideal combination – remember the 80/20 rule!

GoodChart2

 

 

 

 

 

 

In my example above, which hopefully you’ll agree is an improvement, I’ve used the colour palette from our upcoming ‘Chart Tamer’ product. Chart Tamer is a lot more than just a colour palette, but that aspect has benefited from minds with much more expertise in colour than mine, and I’ll go with their choice over mine every time!

Household Income Distribution 1967 – 2005 As Small Multiples Chart

In my last post I tied to fix an overloaded line chart Jorge presented in a recent post about loss aversion:

image
Jorge asked "does it make any sense to add those nine series to a single chart?
My attempt to fix the chart by using some color coding, has its shortcomings that caused quite some discussion.

image

So again, how can you give the users all the data they expect while keeping the chart clean and readable?

D Kelly O’Day pointed out "More data or better colors won’t help a poor chart type selection" and presented a dot plot

image

Lets try to select the right chart type. In Chart Rules, As Simple as Possible, But Not Any Simpler! I presented an easy to learn set of rules to determine the best chart type .


1. Determine the relationship you want to display

In our case a we have a Distribution Relationship, we want to show the Distribution of the Income Levels


2. Determine if you want to emphasize individual values or the overall pattern and

emphasize individual values or the overall pattern  and Determine the chart type

As we want to emphasize individual values a column chart works best.

image

This chart already gives us a good feel for the income distribution in 1967- Looks like a almost perfect bell distribution with a belly for the mid income levels. But how did things change from 1967 to 2005? Lets create a set of small multiples to show the situation in 1967, 2005 and the increase from 1967 to 2005.

image

Using Color to Group and Label in Charts

Jorge wrote in a recent post about loss aversion, the fact that “people strongly prefer avoiding losses than acquiring gains”.

Loss aversion […]: Translated to chart-making, it means that there is a “tendency to avoid losing data at any cost”. The chart below shows you the Money Income Of Households as published.

image

Take the above chart, for instance: does it make any sense to add those nine series to a single chart?

Remove irrelevant data series and you risk a mutiny on the Bounty, even if relevant trends are easier to detect. It is absurd, but very human.

So, how can you give the users all the data they expect while keeping the chart clean and readable?

Continue reading “Using Color to Group and Label in Charts”

Heatmap Tables with Excel

The following article shows how to create Heatmaps in Excel 2003, for Excel 2007 & 2010 please see the updated article “Heatmap Tables with Excel – Revisited

This Heatmap Table shows you the revenues and the discounts of a company over the course of one year per product group. The size of a bubble shows the revenue made in a particular month and the bubble color shows the discount rate given. The discount rate has been encoded as a range of green colors, ranging from a light green, for low discounts to a dark green for high discounts. The years and product totals are shown at the right and bottom as an integrated part of the table.

image

Tufte often talks about the integration of number, images, and word, and I think he’s quite right. A way archive this in Excel is to integrate charts into tables, so called graphical tables, a very effective means to archive More Information Per Pixel. I already wrote about graphical tables here, here, and here.

 

Continue reading “Heatmap Tables with Excel”

Interactive English League Bumps Chart

This bumps chart shows you the English Premier League 2007 – 2008. It allows you to highlight and compare two teams by clicking a team in in the table or a line in the bumps charts.

image

 

Interactively highlighting a data series in a large data set is very powerful. All lines in the data set are set to light Gray to show you the big picture and the patterns and general tends in the data set. Once you have identified an interesting data series you click it and we highlight it with a bright, saturated color. This has the effect of bringing it into the foreground and allows the user to see the details in the context of the other data series.

As a bumps chart has the lines equally spaced on the value axis you can put an Excel table next to the chart that serves as a legend on one side, and as a detailed ranking table on the other side.

 

Continue reading “Interactive English League Bumps Chart”

Hyperlink Legends to Highlight a Series

So I picked up Jon’s idea and tried to combine it with ParamLink. In  The Missing Link I introduced the (free) ParamLink Add-In. It implements a new hyperlink formula ParamLink(). When the hyperlink is followed the formula can set cell values or define names.

Continue reading “Hyperlink Legends to Highlight a Series”

In-Cell Variance Charts

In financial report we are constantly comparing the actual numbers with our projections using variances.

A quick reminder

Absolute variance = Actual – Budget

Relative variance in % = (Actual – Budget) / Budget

Doing this over a report of even 10 lines requires row by row number comparison, and is not something which can be easily & quickly scanned. Showing the variances in a classical Excel charts is problematic, as you constantly have to go back and forth between the table and the chart to scan exact numbers and the variance tend in the chart.

image

 

 

In-Cell variance charts are a compact, data rich and efficient alternative.

image

 

To create in-cell variance charts you have basically three alternatives:

1. In-Cell Chart Using the REPT Function

Rolf Hichert presents a couple of years ago on his web site (German) a character based approach. The method got quite popular when it was first presented on the  Juice Blog. The main idea is to create the bars repeating a character using the REPT function. The function REPT(“●”,5) returns “●●●●●”. The disadvantage of this approach is that you have to set up all the formulas and that don’t have a continuous scale. E.g. to show 15% you either show “●” or “●●”, but there is no way creating fractions of a character. One of the advantages of the rept approach is that the bars are automatically aligned in the grid. When you increase the row height, bars are still perfectly aligned with the numbers.

image

 

Click here to download the Excel file

2. In-Cell Chart Using MicroCharts

Another smart way to overcome the setup and scale problems of REPT is using MicroCharts. Charts are created from MicroCharts fonts including bars, line segments and pies. A chart is represented as text that is formatted with the MicroCharts fonts. Similar to the rept fuction charts, you can utilize all the rich Excel capabilities for MicroCharts that you would for normal text!:

  • Text alignment
  • Text orientation
  • Font size
  • Conditional formatting
  • Automatic alignment when the row height changes

image

 

 

Unlike the rept function MicroCharts support continues scales value axis. The chart formatting is very similar to the options with standard Excel charts:

image

 

Hitesh Patel Winner of 2008 Excel Dashboard Competition used the Micro Bar Charts to visualize variances over areas and territories in a very cleanly presented table in his Pharmaceutical Sales Dashboard .

image

 

3. In-Cell Charts Aligning a regular Excel Chart in the Grid

A third approach is to use regular Excel bar charts:

image

 

 

The trick is to set chart area to transparent and to manually align them to the grid. The main problem with this method is that the setup of the chart can be very tedious. You will have to fiddle quite a while till the bars align perfrectly with the grid, and each time you change the row or column height width you have to readjust the chart area:

image

/* Style Definitions */
table.MsoNormalTable
{mso-style-name:”Table Normal”;
mso-tstyle-rowband-size:0;
mso-tstyle-colband-size:0;
mso-style-noshow:yes;
mso-style-parent:””;
mso-padding-alt:0in 5.4pt 0in 5.4pt;
mso-para-margin:0in;
mso-para-margin-bottom:.0001pt;
mso-pagination:widow-orphan;
font-size:10.0pt;
font-family:”Times New Roman”;
mso-ansi-language:#0400;
mso-fareast-language:#0400;
mso-bidi-language:#0400;}

Update 06/28/2008: Jon added some insights how to handle variable rows and fourth approach for incell charts using graphical objects:

One problem arises when the number of rows covered by the chart is variable. When you have an update, you need to rely on VBA code to realign the chart, and this is not always reliable. To counter this (and to apply some “interesting” formatting requested by the client, I developed some VBA code that read values from each row and built chart using a series of rectangles and other shapes. These were aligned appropriately with the grid more simply and reliably than a chart would have been. A sample of this is shown in In Cell Charting with Shapes.

I ran into additional issues with another client who insisted that his worksheets be viewed at a zoom that filled the entire width of the screen. This has all kinds of negative issues, especially with charts. (And on my widescreen laptop, it needed a zoom of around 150%, so I had minimal rows visible. But a client is a client.) So I built some additional graphs based on Excel shapes, as illustrated in In Cell Bar Charts.

You could therefore add a fourth approach, though it isn’t accessible without code, and that is “Graphs using Graphical Objects”. This is after all how we all started with graphing data in second grade, with paper and crayons and perhaps a straightedge.

Extra tip: To further enrich your variance report with some historical context, provide the historical performance in an extra column as a sparkline:

image