Bandlines in XLCubed

In early January this year Stephen Few introduced the concept of Bandlines. He identified a useful extension to Sparklines, making use of shaded or coloured horizontal bands to provide more information on the context of the trend line itself. See Stephen’s article on Bandlines and the thinking behind them for a detail description.

The Sparklines are ideal for showing individual trends in a small amount of screen real estate, and we use them extensively in dashboards, typically in a ‘visual table’. By definition Sparklines are small, and to make the trend easily readable, they are typically scaled individually so that each Sparkline uses the whole vertical axis. This means they do not give any impression of the scale of the numbers involved across different rows. It’s possible to use a common scale, and while sometimes that works more often it means many of the rows with smaller values are excessively flattened.

Bandlines address this by introducing horizontal shaded areas depicting the lower, middle and upper quartiles and the median represented by a line. The user can determine the context of the bands. The two most common examples would be plotting recent trend in the context of a longer period, or plotting individual rows in the context of the overall set of data being displayed.

We think Bandlines add real value, so hats off once again to Stephen, and we’re pleased to say that Bandlines are now available in the current version of XLCubed (see here for more detail).

The screenshots below show two examples, displayed in two colour schemes.

 

Bandlines3

The charts depict historic margin by store. The ‘Banding across all stores’ charts show the 30-day trend for the individual store, set in the quartile context of the data for all 11 stores in the table. We can see that for the Gilroy store in row 1, while the margin has varied, it remained in the top quartile when set against all stores for almost the whole period.

The ‘Banding by store, 90 days’ charts show the individual 30 day trend, set in the context of the previous 90 days for the individual store. This helps provide much more historical context, but the line itself still focuses on the more recent trend. Stockton is probably most noteworty here as across the 30 day period it has dropped from the top quartile into the 1st quartile across the whole 90 day period.

We’d love to hear your thoughts (and also which colour scheme works best!), we will also be adding Sparkstrips in the near future so watch this space.

 

 

 

 

Excel Pareto Charts the XLCubed way!

V7.2 of XLCubed is released soon and we thought we’d take the opportunity to run through one of the new features that you’ll be seeing, Pareto Charts.

The Pareto Principle is often referred to as the 80-20 rule, that 80% of outcomes are attributable to 20% of causes. They are named after Vilfredo Pareto who lived in Italy in the 19thcentury and observed that 80% of the land was owned by 20% of the people.   Pareto charts have both bar charts and a line graph where the bars represent individual values and the line represents the cumulative total.

So how do you use Pareto Charts from XLCubed?  Very simply, within a grid you right-click on the column header to access XLCubed’s right-click menu, Grid Charts and Add Pareto Analysis.

Take this simple grid showing Reseller Sales for Product Model Categories for Canadian cities:

Right-clicking on All Products to Add Pareto Analysis brings up this window:

Click OK to return to the workbook – you will see that we have a chart showing that the top 9 cities provide some 80% of the sales.

You could also include the rolling total and percentage in your Pareto Chart.

Notice that we now also have some extra columns on the grid showing the cumulative total of all sales, the sales percentage per category and the cumulative percentage.

 

 

So that’s Pareto Charts – in a nutshell, an easy to use graphical tool which ties directly into dynamic XLCubed grids.

Small Multiples on River Quality

The phrase small multiple was popularised by Edward Tufte, and has become a generic term for a visual display using the same chart or graphic to display different slices of a data set. Their close positioning and shared scale make comparisons very easy and shared trends or outliers can be quickly spotted. Various other terms are also used to describe this charting approach, or specific aspects of it, including Trellis Charts, Lattice Charts, Grid Charts and Panel Charts.

The most common use case for small multiples is separate line charts to compare trend across a large number of varying elements. Placing them all within one chart would cause either a ‘spaghetti chart’ , or lots of occlusion as shown in the comparison below. Here we use a standard Excel line chart, and an XLCubed small multiple to chart the same data. Separating the charts while keeping a consistent axis scale makes for a much easier comparison than in the single chart.

We took a slightly different approach when using small multiples to take a look at differences in river water quality across regions of the UK. Our source data was not absolute numeric values, but 14 years of results categorised into four bandings (bad, poor, fair and good). We wanted to provide a ‘one-pager’ which gave a feel for the trend within each region, but also access to the annual breakdown of the different water qualities.

In the end we settled on a Small Multiple display of 100% stacked columns as shown below.

A percentage base seemed a sensible way to approach the data, as different regions will have differing numbers of rivers and of samples taken. Using this approach we’re able to see a comparison of the relative water quality rather than dealing in absolutes.

The user selects a geographic area of the country to view the regional breakdown within the selected area. The water quality for a particular year can be analysed by locating the region, and the specific year to see the percentage breakdown for each of the four categories.

The colouring of the 4 categories was chosen to aid ‘at a glance’ recognition of the overall water quality by region, and also of the trend. Dark blue signifies bad quality water (opaque), and light blue signifies good quality (think ‘you can see right through it….’).

So to read the display overall, or for trend:
• Dark colour signifies water quality problems.
• Light colour signifies good quality water.
• Reading left to right, increasing colour saturation shows declining quality over time.
• Reading left to right, decreasing colour saturation shows improving quality over time.
• Any region can be zoomed in on to see a larger chart and understand the breakdown in more detail.

Fairly quickly, and from just this one display we can draw a number of conclusions as below:
• Across the region, as a broad brush summary, water quality has improved since 1992.
• Doncaster has shown strong and steady improvement.
• Kingston upon Hull has the worst quality overall in the region, and varies significantly year on year.
• If you’re off for a swim in a Yorkshire river, Richmondshire looks a good bet!

We’ve designed a pre-set view in this case to work for the data in question, but the small multiple concept is also very powerful when interactively exploring data. A picture can tell a thousand words as they say – take a look at our youtube videos on small multiples: Video1 Video2

 

Something on the Horizon

We had an interesting scenario while helping a customer extend an existing Excel dashboard.

We had recently performed some work to solve some performance and design issues they had with their existing Analysis Services cubes. They now had more of their underlying data available and the ability to query longer periods without the performance hit (a year’s worth of data vs 28-days).

They wanted to make the most of this by charting changes in daily sales data over the previous 12 months, broken down by their four main business groups. Ideally the chart would become part of the existing Management Report, the difficulty was the lack of report real estate to add the extra information. This is something we have all come across previously and of course typically solved by using In-Cell charts.

Plotting the data on an Excel chart in the space available would give us this:

 

 

Converting to Sparklines gave us a slightly better view, but given the number of data items being plotted still not ideal.

 

 

Luckily our customer had recently upgraded to V6.1 of XLCubed so we were able to use one of our newest incell chart types: SparkHorizons. There is a good explanation of Horizon charts as part of the research paper: Sizing the Horizon: The Effects of Chart Size and Layering on the Graphical Perception of Time Series Visualizations and Stephen Few has covered them previously.

Essentially a line chart is split into colored bands – degrees of blue for positive numbers and degrees of red for negative numbers. In XLCubed this is 3 bands of each colour. The separation of the vertical scale means that horizon charts can be a lot more effective than standard sparklines where the scale of the numbers vary significantly, but you still want to retain a common scale view.

In this case plotting the same data as horizon charts makes things a lot clearer:

It now becomes quite clear when sales a trending up vs down. It’s also possible to flip the negative values so they appear on the same direction as the positive values:

 

We are always looking at ways of developing and extending XLCubed, SparkHorizons were added because they looked like they had the potential to be useful where the data suited them, so it was pleasing to be able to use them in a real-world situation.

It’s also worth mentioning that although, in this case the data came from Analysis Services Cubes, because they are available as Excel formula they can be used to plot any Excel data, here’s an example of the formula:

=XL3SparkHorizon(Sheet1!$V$2:$V$262,Sheet1!E10)

This will plot the data from Sheet1!$V$2:$V$262 as a SparkHorizon graph in Sheet1!E10.

 


Heatmap Tables with Excel – Revisited

We’ve revisited one of our more popular guides Heatmap Tables with Excel as they can be a very effective way of presenting data on a dashboard, and have now updated it for Excel 2010…

This Heatmap Table is designed to show you the revenues and the discounts of a company over the course of one year per product group. The size of a bubble shows the revenue made in a particular month and the bubble color shows the discount rate given. The discount rate has been encoded as a range of green colors, ranging from a light green, for low discounts to a dark green for high discounts. The years and product totals are shown at the right and bottom as an integrated part of the table.

Tufte often talks about the integration of numbers, images and words; I think he’s quite right. A way to achieve this in Excel is to integrate charts into tables, so called graphical tables, a very effective means to show “More Information Per Pixel“.

The heatmap table is based on a regular Excel bubble chart. To integrate a bubble chart into a table the bubbles are positioned in a matrix that has the same row and column layout as our table.

 

 

 

 

 

 

 

 

 

 

 

 

 

In our case we generate a data series table with one column for the X-Series going from 1-12 for January – December and one column for our Y-Series going from 1-8 for our 8 product groups and one column for revenue.

In the sample spreadsheet we’ve setup some simple excel formula to translate data from the classic grid layout:

to the required format:

Now we can insert the bubble chart:

 

To ensure that the charts fit exactly into the table grid we set Min/Max for the X axis to 0.5/12.5 and for the Y axis to 0.5/8.5. Excel would calculate much larger auto scales otherwise. Also set the Major units to 1 so we can use that later to set some grid lines.

 

Now we remove the legend, the X and Y axis, maximize the plot area and align the chart with the Excel table. As the bubbles are initially too large we have to make them smaller. To control the bubble size go to Data Series Options and scale the bubble size to 50%:

 

This already makes a nice bubble table you could use to reproduce the Twitter Charts.

For the grid lines format your table headers and grid lines with light gray grid lines. Resize the plot area, remove the border and re-position the chart so that the chart and the table grid lines align.

To create the heatmap with different colored bubbles we use the fact that by default Excel does not plot data points for #NA values.  For the heatmap we overlay 8 bubble series, one  series per green shade, and show a revenue bubble only if the value fits into the value range that corresponds with a green shade of our color ramp, otherwise we show #NA.

We divide the range MAX(Discount)..0 into 8 groups to define the colours.

The data series columns use the following formula to test if a discount value corresponds with an interval / colour shade:

=IF(AND($E7>I$6-Step,$E7<=I$6),$D7,NA())

The formula returns the revenue, if the discount values is in the interval defined in the column header I$5.

 

 

Now create the eight data series so that the bubble size refers to the eight columns in the data table:

 

And use the Excel chart styles to pick a colour range – make sure you  remove the border from the chart area.

 

 

And you could use the chart styles to quickly switch between different colours – or customise each series to refine the colors.

You can download a starting point for these files here: HeatmapSample.xlsx. Most of the formulae should adapt to data values that you can feed into the data sheets, including data straight from Analysis Services if using XLCubed grids or formulae.

You can see an interactive version of the Heatmap here – we added a link to some cube data, some Slicers for driving the parameters and then published to XLCubedWeb.

 

 

Flexible time-series graphing from a slicer

We are often asked how to drive a chart from a slicer in XLCubed and how to plot days/months for a month or year. Base case this is fairly straightforward, you can set up a grid which is based on the previous ‘x’ months of a slicer selection for example. The difficulty can be where you want to vary the behaviour depending on which level of the hierarchy the user chooses. This is particularly true where the hierarchy contains semesters or quarters.

The example below shows a technique to handle this complexity and display the chart in a way meaningful to the user in each case. The report is based on a slicer that allows the user to switch between showing the graph data based on quarters, months or days.

You can download the Excel spreadsheet that is used in the example here TimeSeriesGraphFromSlicer

This connects to the Adventureworks demo database which ships with Analysis Services.

The diagram below shows the flow of data from each worksheet showing the final result in the sheet Chart.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Workbook Sheet – Chart

This sheet shows the graph based on the data chosen in slicer above it. This switches the graph data between quarters, months and days depending on the slicer selection.

 

Workbook Sheet – GridForChart

This shows the data that will be graphed, depending on the choice made by the slicer selection. In this example it is months July 2001 – June 2002. FY2002 has been selected by the user (in this example Financial Year 2002 runs from July 2001 – June 2002).

Note that cells A10 – A21 contain the value ‘TRUE’ – these cells contain an XL3RowVisible statement as follows:

=XL3RowVisible(B10<>””)

This statement hides rows with no data so that they are not plotted on the graph.

Workbook Sheet – SlicerToMonthDay

This sheet contains the data that is returned by the choice of the slicer in workbook sheet Chart.

User selects a month

The data will be graphed as days. For example, if the user selects July 2002 then the graph will be displayed with each day in July along the x-axis. These are defined in XLCubed as ‘Children of’ the slicer.

User selects a quarter year

The data will be graphed as months in a three month period. For example, the user selects Q1 FY 2003 and the data displayed is for three months from July 2002 – September 2002 as below. These are defined in XLCubed as ‘Descendants of’ the slicer at month. This will be the same when the user picks year, semester or quarter.

User selects a half-year

The data will be graphed as months in a six-month period. For example, the user selects H1 FY 2003. The screenshot below shows the data that will be graphed.

However, it can be seen that the values Q1 FY 2003 and Q2 FY 2003 should not appear on the graph.

Using the Edit Member functionality it is possible to remove these so that they do not appear as points on the graph.

To do this, edit the Date.Fiscal member and click on Advanced tab.

Click on the drop down next to first member – that member set is the resulting data when the user selects H1 FY 2003 and shows the data that is in cells B10 – B43 in sheet SlicerToMonthDay.

 

The screenshot below shows the data that will be subtracted – it is in effect the actual value selected by the user via the slicer alongside the two Fiscal Semester values Q1 FY 2003 and Q2 FY 2003.

 

The GridForChart sheet now shows just the six months that should be graphed. As explained earlier further manipulation using the XL3RowVisible functionality removes blank rows.

 

The screenshot above shows the graph with six months of data for H1 FY 2003 for months July 2002 – December 2002, and the quarters have been dynamically excluded.

The end result is a flexible time selector where the user can choose dates at different levels in the hierarchy, and will always get a meaningful and in-context time series chart.

 

 

Data Visualization – a real world example

In the following example we work through a real world example of a data visualization. We’ve chosen an example that involves Operations data – this is fairly non-domain specific so hopefully it can demonstrate some important points. The first, and most important point is that you have to define your audience.

We receive many questions about “what is the best chart for this situation” or “what colour should I use for emphasis”. These questions are usually attacking the problem from the wrong angle. The one question you need to ask before anything else is “who is this visualization going to be seen by and how?” Is it in a boardroom on a printed sheet or across a trading floor on a plasma screen. Are the consumers domain experts?

This example features data about an investment bank’s operations processing, the audience being the clients of the Operations department.

Starting Point

Initially the project started out as simply trying to record what operational problems were encountered on a daily basis across different product lines. A reporting system was built and various generic reports produced:

DVBlog1

Unfortunately the reports either didn’t contain data at a granular enough level or it was difficult for the product managers to see where the issues were occurring and what the trends were. In reality the report showed what the major problems had been – unfortunately this was already known, as when something major goes wrong you remember getting shouted at!

What was requested

The client wanted a report that showed where the problems were occurring across business lines (rather than operational units) and how they were doing historically in a single page that could be included in a weekly MIS pack (they currently had four pages per product line (8) so a total of 32 pages. As a first pass they simply wanted an Excel worksheet they could update manually:

DVBlog2

We felt this solution lacked clarity and it was very difficult to spot trends across products.

What we proposed

We designed a solution using MicroCharts to allow small multiples of charts to show a variety of views:

DVBlog3

This solution allowed the user to view the data simply as a cumulative set of data by Product (top line) or by Root Cause (vertically) and then look deeper into historical trends in the centre of the chart. For example, its fairly easy to see spikes in the Root Cause data historically and see that the overall trend has improved over time. By ranking the Products and Root Causes you immediately give some sense of scale to the data. For example you can see that there are many more Application failures than any other type of problem, but the majority of root causes are otherwise fairly evenly distributed.

One other point worth noting was that the original colour scheme was much more muted, but the client got very upset that it looked like a competitor’s corporate colour and wanted it to be “louder”.

What was the user reaction…

Ecstatic, 1 page replaced 34 and they could see at a glance how the entire (large) organisation was working but also quickly find out detail for a particular area and identify trends.

Household Income Distribution 1967 – 2005 As Small Multiples Chart

In my last post I tied to fix an overloaded line chart Jorge presented in a recent post about loss aversion:

image
Jorge asked "does it make any sense to add those nine series to a single chart?
My attempt to fix the chart by using some color coding, has its shortcomings that caused quite some discussion.

image

So again, how can you give the users all the data they expect while keeping the chart clean and readable?

D Kelly O’Day pointed out "More data or better colors won’t help a poor chart type selection" and presented a dot plot

image

Lets try to select the right chart type. In Chart Rules, As Simple as Possible, But Not Any Simpler! I presented an easy to learn set of rules to determine the best chart type .


1. Determine the relationship you want to display

In our case a we have a Distribution Relationship, we want to show the Distribution of the Income Levels


2. Determine if you want to emphasize individual values or the overall pattern and

emphasize individual values or the overall pattern  and Determine the chart type

As we want to emphasize individual values a column chart works best.

image

This chart already gives us a good feel for the income distribution in 1967- Looks like a almost perfect bell distribution with a belly for the mid income levels. But how did things change from 1967 to 2005? Lets create a set of small multiples to show the situation in 1967, 2005 and the increase from 1967 to 2005.

image

Using Color to Group and Label in Charts

Jorge wrote in a recent post about loss aversion, the fact that “people strongly prefer avoiding losses than acquiring gains”.

Loss aversion […]: Translated to chart-making, it means that there is a “tendency to avoid losing data at any cost”. The chart below shows you the Money Income Of Households as published.

image

Take the above chart, for instance: does it make any sense to add those nine series to a single chart?

Remove irrelevant data series and you risk a mutiny on the Bounty, even if relevant trends are easier to detect. It is absurd, but very human.

So, how can you give the users all the data they expect while keeping the chart clean and readable?

Continue reading “Using Color to Group and Label in Charts”

Interactive English League Bumps Chart

This bumps chart shows you the English Premier League 2007 – 2008. It allows you to highlight and compare two teams by clicking a team in in the table or a line in the bumps charts.

image

 

Interactively highlighting a data series in a large data set is very powerful. All lines in the data set are set to light Gray to show you the big picture and the patterns and general tends in the data set. Once you have identified an interesting data series you click it and we highlight it with a bright, saturated color. This has the effect of bringing it into the foreground and allows the user to see the details in the context of the other data series.

As a bumps chart has the lines equally spaced on the value axis you can put an Excel table next to the chart that serves as a legend on one side, and as a detailed ranking table on the other side.

 

Continue reading “Interactive English League Bumps Chart”