## Hyperlink Legends to Highlight a Series

So I picked up Jon’s idea and tried to combine it with ParamLink. In  The Missing Link I introduced the (free) ParamLink Add-In. It implements a new hyperlink formula ParamLink(). When the hyperlink is followed the formula can set cell values or define names.

/

## In-Cell Variance Charts

In financial report we are constantly comparing the actual numbers with our projections using variances.

A quick reminder

Absolute variance = Actual – Budget

Relative variance in % = (Actual – Budget) / Budget

Doing this over a report of even 10 lines requires row by row number comparison, and is not something which can be easily & quickly scanned. Showing the variances in a classical Excel charts is problematic, as you constantly have to go back and forth between the table and the chart to scan exact numbers and the variance tend in the chart.

In-Cell variance charts are a compact, data rich and efficient alternative.

To create in-cell variance charts you have basically three alternatives:

1. In-Cell Chart Using the REPT Function

Rolf Hichert presents a couple of years ago on his web site (German) a character based approach. The method got quite popular when it was first presented on the  Juice Blog. The main idea is to create the bars repeating a character using the REPT function. The function REPT(“●”,5) returns “●●●●●”. The disadvantage of this approach is that you have to set up all the formulas and that don’t have a continuous scale. E.g. to show 15% you either show “●” or “●●”, but there is no way creating fractions of a character. One of the advantages of the rept approach is that the bars are automatically aligned in the grid. When you increase the row height, bars are still perfectly aligned with the numbers.

Click here to download the Excel file

2. In-Cell Chart Using MicroCharts

Another smart way to overcome the setup and scale problems of REPT is using MicroCharts. Charts are created from MicroCharts fonts including bars, line segments and pies. A chart is represented as text that is formatted with the MicroCharts fonts. Similar to the rept fuction charts, you can utilize all the rich Excel capabilities for MicroCharts that you would for normal text!:

• Text alignment
• Text orientation
• Font size
• Conditional formatting
• Automatic alignment when the row height changes

Unlike the rept function MicroCharts support continues scales value axis. The chart formatting is very similar to the options with standard Excel charts:

Hitesh Patel Winner of 2008 Excel Dashboard Competition used the Micro Bar Charts to visualize variances over areas and territories in a very cleanly presented table in his Pharmaceutical Sales Dashboard .

3. In-Cell Charts Aligning a regular Excel Chart in the Grid

A third approach is to use regular Excel bar charts:

The trick is to set chart area to transparent and to manually align them to the grid. The main problem with this method is that the setup of the chart can be very tedious. You will have to fiddle quite a while till the bars align perfrectly with the grid, and each time you change the row or column height width you have to readjust the chart area:

/* Style Definitions */
table.MsoNormalTable
{mso-style-name:”Table Normal”;
mso-tstyle-rowband-size:0;
mso-tstyle-colband-size:0;
mso-style-noshow:yes;
mso-style-parent:””;
mso-padding-alt:0in 5.4pt 0in 5.4pt;
mso-para-margin:0in;
mso-para-margin-bottom:.0001pt;
mso-pagination:widow-orphan;
font-size:10.0pt;
font-family:”Times New Roman”;
mso-ansi-language:#0400;
mso-fareast-language:#0400;
mso-bidi-language:#0400;}

Update 06/28/2008: Jon added some insights how to handle variable rows and fourth approach for incell charts using graphical objects:

One problem arises when the number of rows covered by the chart is variable. When you have an update, you need to rely on VBA code to realign the chart, and this is not always reliable. To counter this (and to apply some “interesting” formatting requested by the client, I developed some VBA code that read values from each row and built chart using a series of rectangles and other shapes. These were aligned appropriately with the grid more simply and reliably than a chart would have been. A sample of this is shown in In Cell Charting with Shapes.

I ran into additional issues with another client who insisted that his worksheets be viewed at a zoom that filled the entire width of the screen. This has all kinds of negative issues, especially with charts. (And on my widescreen laptop, it needed a zoom of around 150%, so I had minimal rows visible. But a client is a client.) So I built some additional graphs based on Excel shapes, as illustrated in In Cell Bar Charts.

You could therefore add a fourth approach, though it isn’t accessible without code, and that is “Graphs using Graphical Objects”. This is after all how we all started with graphing data in second grade, with paper and crayons and perhaps a straightedge.

Extra tip: To further enrich your variance report with some historical context, provide the historical performance in an extra column as a sparkline:

/

## Graphical Table – Abortion Data 1980 – 2003

Here the history of the Abortion data set

1) Jorge used some U.S. Census Bureau data (original Excel file) to visualize the abortion ratio as a small multiples chart,

2) In Small Multiples – Abortion Data 1980-2003, I have written about Jorge’s panel chart. I like the panel (small multiples) chart, but felt that it can be improved.

3) Jon Peltier posted on Re: Abortion Ratios 1980-2003 and Interactive Multiple Line Chart some nice Excel techniques to create interactive charts to analyze the Abortion data set.

Here is a different view on the data set as a Graphical Table using MicroCharts. The Graphical Table is a hybrid between the panel (small multiples) chart and a plain table of numbers. It nicely integrates the numbers with small charts, so-called sparklines, in a table. The sparklines show you the trend of the abortions over the years and the in-cell bar charts give you a feel for the distribution over the age groups:

/

## Small Multiples – Abortion Data 1980-2003

Jorge used some U.S. Census Bureau data (original Excel file) to visualize the abortion ratio as a small multiples chart.

I like this chart as I am a big fan of small multiples displays. They help us to understand the nature of multi-dimensional data.

Jorge asked for suggestions to improve his chart and I came up with these:

• The abortion ratio is calculated as the number of pregnancies that end in an abortion per 1000 pregnancies. I would put the definition of the ratio in the title or a footnote (Number of abortions per 1,000 live births). Most places where I have seen this data presented tend to do it in this way too, although my preference would be for a % format.
• I think that the data for Race, Marital Status and Age Group are actually 3 different stories. It took me several scans of the whole chart set to realize this. I would add some white space between the Age group charts, the Race and Martial Status charts, to make it clear that are comparing different variables.
• Jorge shows an integrated line chart for Marital State and for Race but a small multiples chart for Age Groups. For a more consistent chart reading I would use small multiples charts for all variables.
• The axes of the Percent Distribution bars are not labeled, and have no scale. This coupled with varying meaning of the color encoding is confusing. The percentage distribution data for the Age < 15 lies around the 1% level and so cannot be seen on this scale even though data does exist. We also cannot encode the way this measure also varies with time over the same period. In fact we miss the fact that for women over 25 years the percentage distribution is actually increasing.
• I’m not sure if you need to show the all age groups in the Age group charts as light Grey lines. I know that you want the reader to compare the current age group in the context of the other age groups but this exactly what the small multiples are for anyway.

I like Jorge’s idea of integrating the abortion ratio and abortion percentages, but I found having another group of small multiples provides a more consistent view. If we don’t have the percentage distribution numbers then we can easily draw the wrong conclusions about the data.

Here my small multiples chart for the Abortion by Age Groups:

Update 6/22/2008: Jon Peltier posted on Re: Abortion Ratios 1980-2003 and Interactive Multiple Line Chart some nice Excel techniques to create interactive charts to analyze the Abortion data set.

/

## More Information per Pixel!

In my last post I suggested some chart selection rules as an alternative to Godin’s Silly Rules for Great Graphs. Jerome commented:

[…] on a slide, you want to convey one message. your graph must NOT carry any information that can be interpreted differently than the point you are trying to make. the corollary is that in virtually all cases, you should display as little data points as possible: 1 if possible, 2 but no more than 3. If you need more than 3 data points, use handouts. […]

which is very much in line with what Seth Godin said in his famous post about Chart Rules:

No, the reason you put a chart in a presentation is to tell a story. A single story, one story per chart

Why should a presentation display as little data as possible? Why should a slide contain only one chart? I demand More Information Per Pixel. Why not have a data-rich chart in a slide – no, even a couple of charts to support my message?

My friend Rolf Hichert has a totally different design philosophy.

Components of good presentations slides:

• A clear message
• A clear title (should be a complete sentence, including units like K\$)
• Each slide to conveys only one message
• More tables and charts to support the message
• Choose the right chart type
• Use arrows, color etc. to highlight the message

Look at this sample taken from Rolf’s web site:

The slide has a clear title that conveys one message: "Further positive Development in Frankfurt, Vienna and Graz – Action needed in Lausanne and Linz".

The slide contains small multiple charts to support the message, where Rolf has chosen a line chart to emphasize trends or patterns. The problematic regions mentioned in the title are colored in red; those that made the CEO happy in green.

/

## Chart Rules, As Simple as Possible, But Not Any Simpler!

In chart design it’s good to make things simple, but you certainly should avoid oversimplification. As Einstein said:

"Things should be made as simple as possible, but not any simpler"

Seth Godin presented in his blog The three laws of great graphs:

1. One Story
2. No Bar Charts
3. Motion

Effective chart design rules are simple, but reducing it to this set of 3 rules certainly is an over-oversimplification.

Particularly rule 2 is flawed. Seth details rule 2:

"NO BAR CHARTS
Bar charts are dramatically overrated, primarily because they’re the first choice in many graphing programs.

The problem with bar charts is that they should either be line/area charts (when graphing a change over time, like unemployment rates) or they should be a simple pie chart (when comparing two or three items at the same scale).

Jorge, Kaiser and Jon already wrote some critical posts about this rule, where Jon suggested to replace rule 2 with

Choose Chart Types Intelligently

We are working tightly together with Stephen Few on a new product that helps business users creating effective charts with Excel and are therefore we are quite familiar with Stephen’s design principles.

Its an easy to learn set of rules

1. Determine the relationship you want to display

 Relationship Sample Value Comparison Sales in different regions Ranking Best selling products Time-Series Sales in the last 12 months Part-to-Whole Market shares Deviation Revenue Actual vs Budget in the last 12 months Distribution Support response times Correlation Relationship between employee’s heights in inches and their salary

2. Determine if you want to emphasize individual values or the overall pattern
3. Determine the chart type

 Relationship Encoding Method Value Comparison Bars and Columns Ranking Bars and Columns Time-Series Lines to emphasize the overall trends or pattern Points connected by lines to slightly emphasize individual values Columns to emphasize and support comparisons between individual values Part-to-Whole Bars and Columns Deviation Lines to emphasize the overall shape of the data Points connected by lines to slightly emphasize individual data points Bars and Columns to emphasize individual values Distribution Columns to emphasize individual values Lines to emphasize the overall shape of he data Correlation Points and a trend line in the form of a scatter plot

Armed with this set of rules you would rule out Seth’s pie chart, and use the bar chart in the appropriated business context.

/

## Win Lose Charts – English Premier League 2007 – 2008

With hindsight, it’s easy to look at it with hindsight” – Glen Hoddle

Little did we suspect midway through the season that the Premier League would turn into a 2 horse race. Poor starts by both Man Utd. and Chelsea were transformed into a consistent run of good form which would extend the drama to the very last day of the season.

With one game left to play, the top two were level on points with Manchester having a superior goal difference. If Chelsea could achieve a better result than Manchester, then they would clinch the league in the final game…

The English Premier league currently has 20 teams requiring a total of 380 games per season. The results of sporting leagues are usually displayed in a league table format. During the course of a season we see teams occupy different table positions. The standard table format however leaves out the important historic story of the league.

This is how soccerstats.com shows the final league table.

The final 8 games are summarized as a colors encoded Win Lose chart. The table found at the web site above provides helpful links to each team’s performance and a snapshot in time regarding table position for each month of the league. It is also possible to rank the teams with respect to their home or away performance and overall is a very powerful tool for analyzing statistics of the league. Some historical information is also given with the results for the last 8 games being displayed in the last column.

Statistics are like miniskirts; they give you good ideas but hide the important things.” – Ebbe Skovdahl

However, there are a number of things about the table in that I think could be improved and all of them are centred on the “Last 8” column. The table author is attempting to describe win-lose information graphically and has dedicated approximately one quarter of the width of the table to it. With a quarter of the table dedicated to it, we should expect more than 8 out of 38 games to be described. The author has chosen to invent a chart type which encodes the results using colored squares with more empty space than data. The whole season can easily fit into this width if we choose an appropriate display technique such as a win-lose chart.

My least favorite aspect of the “Last 8” column is the non-standard use of a 1 dimensional plot. I come from a physical science background and so I am used to diagrams telling me certain things in certain ways. One of these things is (in the absence of an axis or some other visual guide) that time should go from left to right! In this 1D plot the knowledge of which way time is going is fundamental. The links at the top of the same table used to break the table down into months, go in chronological order from left to right. This sets up my brain to expect the rest of the table to behave in the same way. So why don’t the last 8 games do the same? Unless you actually have some knowledge about how the season ended, you might not actually realize that time is going from right to left. I happened to know that Liverpool and Arsenal didn’t lose their final games of the season. This made me double check what the table was actually telling me. Without some knowledge of how the Premiership ended I would have interpreted the information presented wrongly. The fact that time is going from right to left isn’t wrong. The fact that the rest of the table is telling me to expect it to go from left to right is.

“Well, Clive, it’s all about the two M’s. Movement and positioning” – Ron Atkinson

So what improvements could we make to the standard table format to get more information into the table? To tell the story of the 2007-2008 league we need to include the historical context.

The Excel table below uses sparklines to summarizes the season for each team in terms of relative positions in the table and a win lose chart for the entire season.

Now we get a feeling for the true drama that occurred during the season. The Position column shows how the each teams position changed over the course of the league. Those of us that followed the 2007-2008 Premiership will be able to look at the performance curves and remember the situation for a given time. For example:

• The lowest point of the season for Man Utd ,in every sense, was when they lost to Man City
• The departure of Jose Mourinho after 8 games was a catalyst for change in Chelsea’s fortunes

“I never make predictions, and I never will” – Paul Gascoigne

It is interesting to directly compare the performance of teams together. Adding an interactive Bumps Chart in allows us to see how Man Utd and Chelsea faired over the season. To compare two team click the sparklines in the ranking table or click the data label on the Bumps Charts.

In the next post I will talk about the techniques used to produce the ineractive ranking table within Excel. Until then why not check out a live web version, published from Excel to our website.

/

## Business Visualization, Recession and Miscast Charts

When the economy is growing and consumers happily spend all their money, “eye-catching”, dubious “professional-looking” charts seem to make sense in the grand scheme of a resource-wasting economy. But under recession, and on the threshold of a long economic Winter, those miscast charts are just bad jokes that should be banished from every book, every magazine, every meeting room.

In a recession, “do more with less” is everyone’s motto. Cut jobs. Cut travel expenses. Make people work harder.

Make people work better. In an information economy, “better” is better than “harder”. Make sure people use their primary tools efficiently. I wrote about how inefficient a beginner Excel user can be. This is one of those hidden costs that no one seems to care about, except perhaps Toyota Motor Corporation CEO Katsuaki Watanabe, who sees PowerPoint as an example of waste.

That’s why Tufte is basically right, and so is Stephen Few and everyone who believes that information visualization is not a futile exercise of impression management with the sole purpose of showing off canned effects in a PowerPoint presentation. Tufte advocates a simple set of rules for better information visualization – the corporate world loves to do exactly the opposite. Until now. But can organizations afford to be inefficient (= lower return on investment) when dealing with expensive data? Can a lean organization leave junk in its management reports and presentations?

I know, the economy will not grow again just because of better business information visualization. But take a visualization rich report like these ones, imagine an eye-catching-flying-3D-pie-charts version and answer this simple question: when in recession, which one would you choose?

/

## Graphical Tables – An Alternative to Treemaps

Sean blogged the other day about using a treemap to visualize the drivers of the Australian Inflation. He got inspired to create a treemap by an NYT article that used an interactive version of the following treemap:

This chart looks nice on the first view. It makes nice use of muted colors, the shapes look well balanced and certainly the graphic designer did a good job. However, from a data visualization perspective this chart has a couple of flaws.

Ben Shneiderman designed Treemaps to visualize deep directory tree structures.

Ben explained treemaps in an article as:

“Among the growing family of visual analytic tools, treemap are flourishing in organizations that require daily monitoring of complex activities with thousands of products, projects, or salespeople. Tabular reports, bar charts, line graphs, and scattergrams are important tools, but for complex activities where there are numerous sales regions, manufacturing plants, or product lines the hierarchical structures provided by treemaps can be helpful. While tabular displays and spreadsheets can show 30-60 rows at a time on typical displays, the colorful presentations in treemaps can accommodate hundreds or thousands of items in a meaningfully organized display that allows patterns and exceptions to be spotted in seconds.[…] Treemaps are a space-filling approach to showing hierarchies in which the rectangular screen space is divided into regions, and then each region is divided again for each level in the hierarchy.”

The first problem the NYT chart has is that it does not visualize the hierarchy as rectangular areas. The inflation drivers are visualized as asymmetric round shapes. It is difficult to compare the relative size of rectangular shapes but it gets almost impossible for asymmetric shapes. Also does this treemap lack labels for the smaller inflation drivers.

Sean published in his blog post a treemap which does not have the problems mentioned above:

Ben designed treemaps to visualize thousands of regions, products, etc ; but the Inflation chart only comprises 20 Inflation Drivers grouped into 7 categories. A simple sorted table would do a better job communicating the numbers as Kaiser Fung from Junk Junks wrote in his post.

Inspired by this post and my comment Sean came up with this graphical sparkline table designed with Excel and MicroCharts.

This is already quite an improvement on the treemap, as we can see increasing and decreasing inflation trends and sparklines rather than traffic light colors as in the tree map version. Also it is much easier to read for non expert users.

Some minor things we can improve in Sean’s chart are:

• We can sort the inflation drivers by Weight, to have the most important ones at the top
• Changing the area to the sparkline puts emphasis on the trend rather than the absolute value of the values (as the area chart does)
• Inline deviation charts allow us to visualize the MoM and YoY % changes

/

## Small charts are beautiful

Well, since you are reading this, I’ll assume that you took the red pill, so let’s keep moving and find out how deep the rabbit hole goes.

We saw that people usually design charts larger than they need to be. Why? Is it because we can’t fit the data into a smaller space? No, it isn’t. It is because in smaller charts there is no room for non-data elements, like title, legend, grid lines. In the dominant “Excel chart defaults” school of thinking data is not a priority.

This is a simple exercise that you can try safely at home and demonstrates it clearly. Start by creating a line chart in Excel, like this one:

You can see the data, right? Now make the chart smaller:

Here is a fierce territorial competition, and guess who’s winning? Make the chart a little smaller:

The title and the legend win, as usual. The data must be here somewhere, but who cares?

This chart size is not large enough. Or so it seems. But what happens when we remove some non-data elements? Since we don’t need the legend, and we can put the title somewhere else, we can remove both:

We are getting our data back! Let’s just leave the data and a simple grid line:

I used MicroCharts to display the same data using both a line and a column chart:

With MicroCharts, you can add a “normal band” or a reference line that helps you to understand how the data departs from the expected values.

The above charts show percentage change on previous period GDP at market prices in the US (1980-2009). Here is the same data for some selected countries in the EU:

Michelangelo said: “I saw the angel in the marble and carved until I set him free“. Like him, keep carving your chart until you set your data free. The essence of a chart is the patterns you discover, buried under all the junk. By making your charts smaller you are force to remove that junk.

Finding this “essence” is what sparklines is all about.

/