Bandlines in XLCubed

In early January this year Stephen Few introduced the concept of Bandlines. He identified a useful extension to Sparklines, making use of shaded or coloured horizontal bands to provide more information on the context of the trend line itself. See Stephen’s article on Bandlines and the thinking behind them for a detail description.

The Sparklines are ideal for showing individual trends in a small amount of screen real estate, and we use them extensively in dashboards, typically in a ‘visual table’. By definition Sparklines are small, and to make the trend easily readable, they are typically scaled individually so that each Sparkline uses the whole vertical axis. This means they do not give any impression of the scale of the numbers involved across different rows. It’s possible to use a common scale, and while sometimes that works more often it means many of the rows with smaller values are excessively flattened.

Bandlines address this by introducing horizontal shaded areas depicting the lower, middle and upper quartiles and the median represented by a line. The user can determine the context of the bands. The two most common examples would be plotting recent trend in the context of a longer period, or plotting individual rows in the context of the overall set of data being displayed.

We think Bandlines add real value, so hats off once again to Stephen, and we’re pleased to say that Bandlines are now available in the current version of XLCubed (see here for more detail).

The screenshots below show two examples, displayed in two colour schemes.

 

Bandlines3

The charts depict historic margin by store. The ‘Banding across all stores’ charts show the 30-day trend for the individual store, set in the quartile context of the data for all 11 stores in the table. We can see that for the Gilroy store in row 1, while the margin has varied, it remained in the top quartile when set against all stores for almost the whole period.

The ‘Banding by store, 90 days’ charts show the individual 30 day trend, set in the context of the previous 90 days for the individual store. This helps provide much more historical context, but the line itself still focuses on the more recent trend. Stockton is probably most noteworty here as across the 30 day period it has dropped from the top quartile into the 1st quartile across the whole 90 day period.

We’d love to hear your thoughts (and also which colour scheme works best!), we will also be adding Sparkstrips in the near future so watch this space.

 

 

 

 

Excel Pareto Charts the XLCubed way!

V7.2 of XLCubed is released soon and we thought we’d take the opportunity to run through one of the new features that you’ll be seeing, Pareto Charts.

The Pareto Principle is often referred to as the 80-20 rule, that 80% of outcomes are attributable to 20% of causes. They are named after Vilfredo Pareto who lived in Italy in the 19thcentury and observed that 80% of the land was owned by 20% of the people.   Pareto charts have both bar charts and a line graph where the bars represent individual values and the line represents the cumulative total.

So how do you use Pareto Charts from XLCubed?  Very simply, within a grid you right-click on the column header to access XLCubed’s right-click menu, Grid Charts and Add Pareto Analysis.

Take this simple grid showing Reseller Sales for Product Model Categories for Canadian cities:

Right-clicking on All Products to Add Pareto Analysis brings up this window:

Click OK to return to the workbook – you will see that we have a chart showing that the top 9 cities provide some 80% of the sales.

You could also include the rolling total and percentage in your Pareto Chart.

Notice that we now also have some extra columns on the grid showing the cumulative total of all sales, the sales percentage per category and the cumulative percentage.

 

 

So that’s Pareto Charts – in a nutshell, an easy to use graphical tool which ties directly into dynamic XLCubed grids.

Olympics Treemap

The 2012 London Olympics have now finished, and as a UK company we were pleased to see the games were such a success, and of course that team GB did so spectacularly well! We’re looking forward now to the Paralympics in a couple of weeks, and once the dust has settled there we’ll be shipping a new point release of XLCubed in September.

We’ll keep most of the changes under wraps for now, but one item which we are introducing is treemaps. The Olympic medal table gives us a nice opportunity to better understand the medal breakdown through the  new chart type. In XLCubed, treemaps can be produced directly from a cube or from a table held in Excel, as is the case here. The first example below shows the medals split by country and sport. The size of the rectangle depicts the total number of medals, and the colour shows the number of gold medals, the darker the colour the more gold. The numeric values list the total number of medals, then the number of golds. We can see the USA at the top, and that over half their medals came from swimming and athletics, with a bigger percentage of golds in the pool.

Any of the countries can be drilled into for a large view on their medal breakdown, not that we’re partisan of course… , but the view below is for Great Britain (GBR) where the particularly good showing by the cycling team stands out.

Taking a look at the same data split first by sport and then country, it’s easy to see the countries dominating the medals in each sport, and to delve into more detail by sport where required.


 Drilling into Athletics we can see that USA won most medals, and also most gold. Great Britain had just the 6 medals, but 4 were gold and hence the darker colour on their tile.

We’ll be making an interactive version of this available over the next few days.

 

 

Small Multiples on River Quality

The phrase small multiple was popularised by Edward Tufte, and has become a generic term for a visual display using the same chart or graphic to display different slices of a data set. Their close positioning and shared scale make comparisons very easy and shared trends or outliers can be quickly spotted. Various other terms are also used to describe this charting approach, or specific aspects of it, including Trellis Charts, Lattice Charts, Grid Charts and Panel Charts.

The most common use case for small multiples is separate line charts to compare trend across a large number of varying elements. Placing them all within one chart would cause either a ‘spaghetti chart’ , or lots of occlusion as shown in the comparison below. Here we use a standard Excel line chart, and an XLCubed small multiple to chart the same data. Separating the charts while keeping a consistent axis scale makes for a much easier comparison than in the single chart.

We took a slightly different approach when using small multiples to take a look at differences in river water quality across regions of the UK. Our source data was not absolute numeric values, but 14 years of results categorised into four bandings (bad, poor, fair and good). We wanted to provide a ‘one-pager’ which gave a feel for the trend within each region, but also access to the annual breakdown of the different water qualities.

In the end we settled on a Small Multiple display of 100% stacked columns as shown below.

A percentage base seemed a sensible way to approach the data, as different regions will have differing numbers of rivers and of samples taken. Using this approach we’re able to see a comparison of the relative water quality rather than dealing in absolutes.

The user selects a geographic area of the country to view the regional breakdown within the selected area. The water quality for a particular year can be analysed by locating the region, and the specific year to see the percentage breakdown for each of the four categories.

The colouring of the 4 categories was chosen to aid ‘at a glance’ recognition of the overall water quality by region, and also of the trend. Dark blue signifies bad quality water (opaque), and light blue signifies good quality (think ‘you can see right through it….’).

So to read the display overall, or for trend:
• Dark colour signifies water quality problems.
• Light colour signifies good quality water.
• Reading left to right, increasing colour saturation shows declining quality over time.
• Reading left to right, decreasing colour saturation shows improving quality over time.
• Any region can be zoomed in on to see a larger chart and understand the breakdown in more detail.

Fairly quickly, and from just this one display we can draw a number of conclusions as below:
• Across the region, as a broad brush summary, water quality has improved since 1992.
• Doncaster has shown strong and steady improvement.
• Kingston upon Hull has the worst quality overall in the region, and varies significantly year on year.
• If you’re off for a swim in a Yorkshire river, Richmondshire looks a good bet!

We’ve designed a pre-set view in this case to work for the data in question, but the small multiple concept is also very powerful when interactively exploring data. A picture can tell a thousand words as they say – take a look at our youtube videos on small multiples: Video1 Video2

 

Data Visualization – a real world example

In the following example we work through a real world example of a data visualization. We’ve chosen an example that involves Operations data – this is fairly non-domain specific so hopefully it can demonstrate some important points. The first, and most important point is that you have to define your audience.

We receive many questions about “what is the best chart for this situation” or “what colour should I use for emphasis”. These questions are usually attacking the problem from the wrong angle. The one question you need to ask before anything else is “who is this visualization going to be seen by and how?” Is it in a boardroom on a printed sheet or across a trading floor on a plasma screen. Are the consumers domain experts?

This example features data about an investment bank’s operations processing, the audience being the clients of the Operations department.

Starting Point

Initially the project started out as simply trying to record what operational problems were encountered on a daily basis across different product lines. A reporting system was built and various generic reports produced:

DVBlog1

Unfortunately the reports either didn’t contain data at a granular enough level or it was difficult for the product managers to see where the issues were occurring and what the trends were. In reality the report showed what the major problems had been – unfortunately this was already known, as when something major goes wrong you remember getting shouted at!

What was requested

The client wanted a report that showed where the problems were occurring across business lines (rather than operational units) and how they were doing historically in a single page that could be included in a weekly MIS pack (they currently had four pages per product line (8) so a total of 32 pages. As a first pass they simply wanted an Excel worksheet they could update manually:

DVBlog2

We felt this solution lacked clarity and it was very difficult to spot trends across products.

What we proposed

We designed a solution using MicroCharts to allow small multiples of charts to show a variety of views:

DVBlog3

This solution allowed the user to view the data simply as a cumulative set of data by Product (top line) or by Root Cause (vertically) and then look deeper into historical trends in the centre of the chart. For example, its fairly easy to see spikes in the Root Cause data historically and see that the overall trend has improved over time. By ranking the Products and Root Causes you immediately give some sense of scale to the data. For example you can see that there are many more Application failures than any other type of problem, but the majority of root causes are otherwise fairly evenly distributed.

One other point worth noting was that the original colour scheme was much more muted, but the client got very upset that it looked like a competitor’s corporate colour and wanted it to be “louder”.

What was the user reaction…

Ecstatic, 1 page replaced 34 and they could see at a glance how the entire (large) organisation was working but also quickly find out detail for a particular area and identify trends.

(de)Faults in Excel Charting

I recently spoke at SQLBitsIII, and an aspect which went down well was a simple overview on how to make the most important aspect of a graph, namely the underlying data, the prime focus and clear and easy to read. I also had the opportunity to attend Stephen Few’s Information Visualisation Workshops in London, which I’d thoroughly recommend. Stephen also spent some time, as part of a much more detailed overall agenda, on how a typical default chart can be morphed into an effective display.

So it’s back to basics this week, and how to improve the standard, out of the box Excel chart. Unfortunately, despite it’s pervasiveness, the default chart settings, which many users will never stray from, are in the case of Office 2007 not ideal, and in earlier versions, pretty awful. In this piece I’ll outline a few simple steps which can turn the default visual delights of the Excel graph into something you need not be embarrassed to put on the projector.

I’m using ‘classic’ Excel as my start point, because it’s still the incumbent in most organisations (and also because it’s worse). The example is for column charts, but the majority of the tips are valid for any chart type. As our start point we have the unit sales data for 3 products across 6 countries, as a default Excel Column chart, below.

BadChart

 

 

 

 

 

Nice. It’s wrong in a lot of ways, but how many hundred times have you seen this or a version of this? It’s well trodden ground if you have read Tufte, Few at al, but the key recommendations to improve things are surprisingly simple, and quick to implement.

1) Remove the Clutter and noise

The purpose of the chart is to display the data of interest clearly and concisely. It’s not to distract the user with pretty shading or 3D effects etc. Although the default chart is no-frills, there are a number of items which are adding nothing, or have undue prominence, and in doing so detract from the overall goal.

  • The Plot Area
    • The grey background to the plot area adds nothing, so we remove it
    • The border on the plot area – remove it also (numerous studies have shown we only need two axes to effectively group and visualise data)
  • Gridlines
    • The default gridlines are black, too visually intense. They are there for reference when required, not the prime focus, so are best muted – set them to a light grey.

2) Axes and Legend

The axes frame the chart, and are a key point of reference; however they should not draw the focus from the chart itself. As with the gridlines, they should be toned down.

  • Change the default black font colour to charcoal / dark grey
  • Change the default axis colour from black to charcoal / dark grey
  • Typically reduce the font size to 8

Rules for the legend are similar to those for the axis

  • Change font from black to grey
  • Remove border or change it’s colour to very light grey
  • Typically reduce font size to 8
  • For a clustered column, my preference is for the Legend positioned at the bottom, and reading across in the display order of the columns.

3) Columns and colour

The black column borders add nothing, and as such should be removed, they are another form of Tufte’s ‘non-data ink’.

On to colour, and unfortunately Excel’s default chart fills are heavily saturated plum and wine with a light cream..  So I’d strongly suggest changing the chart colour palette. For column charts, there is typically a reasonable block of colour for each series, so the colour scheme shouldn’t be too bold, or it becomes an eyesore. You should aim for mid-intensity colours of similar saturation (unless one is intended to stand out), pastels tend to work well.

 

All the steps above are simple and fairly fast to action, with one exception, the colour scheme. Unless you already have pre-prepared palettes it’s possible to spend an age trying to get the ideal combination – remember the 80/20 rule!

GoodChart2

 

 

 

 

 

 

In my example above, which hopefully you’ll agree is an improvement, I’ve used the colour palette from our upcoming ‘Chart Tamer’ product. Chart Tamer is a lot more than just a colour palette, but that aspect has benefited from minds with much more expertise in colour than mine, and I’ll go with their choice over mine every time!

Household Income Distribution 1967 – 2005 As Small Multiples Chart

In my last post I tied to fix an overloaded line chart Jorge presented in a recent post about loss aversion:

image
Jorge asked "does it make any sense to add those nine series to a single chart?
My attempt to fix the chart by using some color coding, has its shortcomings that caused quite some discussion.

image

So again, how can you give the users all the data they expect while keeping the chart clean and readable?

D Kelly O’Day pointed out "More data or better colors won’t help a poor chart type selection" and presented a dot plot

image

Lets try to select the right chart type. In Chart Rules, As Simple as Possible, But Not Any Simpler! I presented an easy to learn set of rules to determine the best chart type .


1. Determine the relationship you want to display

In our case a we have a Distribution Relationship, we want to show the Distribution of the Income Levels


2. Determine if you want to emphasize individual values or the overall pattern and

emphasize individual values or the overall pattern  and Determine the chart type

As we want to emphasize individual values a column chart works best.

image

This chart already gives us a good feel for the income distribution in 1967- Looks like a almost perfect bell distribution with a belly for the mid income levels. But how did things change from 1967 to 2005? Lets create a set of small multiples to show the situation in 1967, 2005 and the increase from 1967 to 2005.

image

Using Color to Group and Label in Charts

Jorge wrote in a recent post about loss aversion, the fact that “people strongly prefer avoiding losses than acquiring gains”.

Loss aversion […]: Translated to chart-making, it means that there is a “tendency to avoid losing data at any cost”. The chart below shows you the Money Income Of Households as published.

image

Take the above chart, for instance: does it make any sense to add those nine series to a single chart?

Remove irrelevant data series and you risk a mutiny on the Bounty, even if relevant trends are easier to detect. It is absurd, but very human.

So, how can you give the users all the data they expect while keeping the chart clean and readable?

Continue reading “Using Color to Group and Label in Charts”

Heatmap Tables with Excel

The following article shows how to create Heatmaps in Excel 2003, for Excel 2007 & 2010 please see the updated article “Heatmap Tables with Excel – Revisited

This Heatmap Table shows you the revenues and the discounts of a company over the course of one year per product group. The size of a bubble shows the revenue made in a particular month and the bubble color shows the discount rate given. The discount rate has been encoded as a range of green colors, ranging from a light green, for low discounts to a dark green for high discounts. The years and product totals are shown at the right and bottom as an integrated part of the table.

image

Tufte often talks about the integration of number, images, and word, and I think he’s quite right. A way archive this in Excel is to integrate charts into tables, so called graphical tables, a very effective means to archive More Information Per Pixel. I already wrote about graphical tables here, here, and here.

 

Continue reading “Heatmap Tables with Excel”

Interactive English League Bumps Chart

This bumps chart shows you the English Premier League 2007 – 2008. It allows you to highlight and compare two teams by clicking a team in in the table or a line in the bumps charts.

image

 

Interactively highlighting a data series in a large data set is very powerful. All lines in the data set are set to light Gray to show you the big picture and the patterns and general tends in the data set. Once you have identified an interesting data series you click it and we highlight it with a bright, saturated color. This has the effect of bringing it into the foreground and allows the user to see the details in the context of the other data series.

As a bumps chart has the lines equally spaced on the value axis you can put an Excel table next to the chart that serves as a legend on one side, and as a detailed ranking table on the other side.

 

Continue reading “Interactive English League Bumps Chart”