Bump Charts in XLCubed

Today’s blog is about adding Bump Charts in Excel using XLCubed v8.

At first glance a Bump Chart looks the same as a line chart; the difference is that it plots the rank position rather than the actual value.

Let’s imagine that I sell a product in a marketplace with 10 other competitors. I may like to see how the rank position of my product and the competition changes over time to check if I’m gaining or losing market position. It’s a common scenario in pharma, where we have a good customer base.

You will usually want dates on the category axis so the trends are shown across time. The series then holds the items to be compared, in this case the products.

Our example has been set up with Measures on Headers, Product Categories on Series and Date Calendar on Categories.  For more information on using Small Multiples in XLCubed please visit Small Multiple Charts.

The currently selected measure is Reseller Order Quantities (selected through the Measures slicer), for the eleven months prior to April 2008 (selected through the Date slicer), for a subset of products.

Looking at the bump chart you can see that I’ve selected Road Bikes and Mountain Bikes for easy comparison.  You can quickly see that the rank position for Road Bikes dropped quite dramatically from May 2007, picked up again in September before dropping again in November and rising in December through to February 2008.  The change for Mountain Bikes, on the other hand, was less dramatic, rising and falling slightly, steadying in February 2008 before dropping again the following month.

To create a bump chart just select Line – Bump as the Chart Type on your Small Multiple chart. The neat part is that all the rankings are worked out for you behind the scenes, without the need for lots of complex Excel gymnastics trying to work through the full result set month by month.
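To give a feel for the work being done behind the scenes, here is a minimal Python sketch of the ranking step. It is not XLCubed's implementation: the product names and quantities are made up, rank 1 is assumed to mean the largest value, and ties are simply left in sort order.

```python
def rank_positions(values_by_period):
    """Map {period: {item: value}} to {period: {item: rank}}; rank 1 is the largest value."""
    ranks = {}
    for period, values in values_by_period.items():
        # Order the items from highest to lowest value within this period only.
        ordered = sorted(values, key=values.get, reverse=True)
        ranks[period] = {item: pos for pos, item in enumerate(ordered, start=1)}
    return ranks


# Hypothetical order quantities for two months.
sales = {
    "2007-05": {"Road Bikes": 900, "Mountain Bikes": 700, "Helmets": 800},
    "2007-06": {"Road Bikes": 600, "Mountain Bikes": 750, "Helmets": 650},
}
ranks = rank_positions(sales)
```

A bump chart then plots `ranks` rather than `sales`, with one line per product across the periods.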

Workbook slicers – all for one and one for all!

So this is our second blog on the new features of XLCubed v8 – today we’re going to run through workbook slicers.

Workbook slicers allow the user to create the slicers at the workbook level so that they can be displayed for any/all sheets.

There’s a slicer pane which can be arranged horizontally or vertically and stays in place when you navigate to another sheet.  This means that if you have a multi-sheet workbook you only need to define one set of slicers.  These can then be configured to be shown or hidden for individual sheets as required.

Turn the slicer pane on by selecting Workbook slicers from the Slicers tab of the XLCubed ribbon.

Within the slicer pane there’s an Add Slicer button – this brings up the standard design form for adding slicers.

The Edit layout button brings up the window below.  It allows you to configure the order in which slicers will appear on the pane, which sheets they will be visible on and the padding between individual slicers.  You can also set a background fill colour from here.

The screenshot above shows that the Date.Calendar slicer is available on a number of sheets.  Selecting a slicer choice on one sheet will refresh the other sheets where the slicer is also available:

Once added, you link workbook slicers to your report in the same way as embedded slicers.  You can link directly to grids and other XLCubed objects and output their selection to Excel cell locations for use by formulae.

Their positioning on the web is fixed but if you find the slicers are taking up too much screen space you can make your slicer selections and then use this icon to toggle the Slicer Pane off:

XLCubed as an alternative to ProClarity

With the launch of version 7.1 of XLCubed Excel Edition we introduced the ability to import ProClarity Briefing Books. With support for ProClarity ending this year and many customers looking for a replacement, now is a great time to show you how the import works and how it helps move users from ProClarity to an alternative solution.

Importing

Let’s start with importing from ProClarity. We’ve built a simple example briefing book based on the usual AdventureWorks sample cube; it includes a sample grid, a performance map and a chart.

To get to the import option we load Excel and select XLCubed -> Extras -> Import -> Import ProClarity Briefing book. After selecting the file to import we are given a summary of each item that is going to be imported:

At this point you can control the resulting worksheet name, as well as switching the type of XLCubed object you’ll end up with. Clicking “Import” will now give us 1 sheet for each briefing page:

You’ll notice that the import process has created any required slicers so the report is good to go. You could now also spend a bit more time adding any extra XLCubed functionality to the report such as Incell charts or Excel calculations to leverage the power of Excel or publish to XLCubedWeb for consumption by a wider audience.

The import process is very straightforward, and we have had some great feedback from our customers regarding the speed and ease with which they have been able to migrate users’ reports into XLCubed.

Look out for some more blogs showing other features of XLCubed that will help users transition from ProClarity!

XLCubed V7 & SQL Server 2012

SQL Server 2012 has recently been released to manufacturing, and at XLCubed we’re well placed to take advantage of everything that is new in 2012.

SQL 2012 delivers Business Intelligence under the ‘BISM’ umbrella (Business Intelligence Semantic Model). BISM comes in different flavours though:

  • BISM Multi-dimensional
    • (Latest version of Analysis Services as we know it)
  • BISM Tabular
    • In-Memory Vertipaq
    • Direct Query

For client tools, BISM Multi-dimensional is largely the same as connecting to existing versions of Analysis Services, with MDX being the query language. For XLCubed we can leverage what we already have in that respect, and the transition is seamless.

BISM tabular is different though. If you choose to deploy in-memory to Vertipaq, client tools can still use MDX, and as such don’t need significant change, other than to handle the tabular rather than hierarchical data environment. However if the deployment is Direct Query (for example for real-time BI), the only available query language is DAX.

There are best use cases for the different deployment options, but it’s fair to say there is a degree of confusion in the space at the moment about the relative merits of each. We’ll try to shed some light and guidance here over the next weeks and months. As a product though, it’s important for us to support and extend the full range of 2012 BI deployment options, and to make these available and accessible to our customers. That’s exactly what we’ve done for version 7.

XLCubed v7, which releases next month, is a client for both MDX and DAX, and as such provides one consistent client interface in Excel and on the Web which can access any of the SQL 2012 deployment models for BI. We are also adding a much richer relational SQL reporting environment.

We are really pleased with some of the beta feedback we’ve had to date, and if you’d like to trial the beta version contact us at beta@xlcubed.com .

We’re looking forward to releasing the product next month, and will be previewing it at SQL Server Connections next week in Vegas.

2011 Dashboard Competition

A slight departure from the normal blogging to let everyone know about the latest developments in XLCubed and to talk about a new dashboard competition with the chance to win an iPad 2!

Dashboard Design Competition

XLCubed are sponsoring Dashboard Insight’s first dashboard design contest. The competition is based on a provided data set, and we’d encourage as many as can to enter.

We believe that XLCubed offers a class-leading dashboard development environment, with fine grained control over chart and table sizing, and we’re looking forward to seeing some great dashboards. Take a look at some of our previous winners for inspiration.

Don’t forget that this blog also contains lots of helpful information that should help you come up with a great dashboard design.

We’ll provide entrants with the sample data set in a local cube format to fully exploit the strengths of XLCubed. Entry is open to customers and non-customers alike, and your dashboard skills can win you a shiny new iPad 2. Good luck if you choose to enter.

XLCubed v6.5

Version 6.5 is due for release in early October. Originally scheduled as 6.2, we decided it contains so much over the current version that it deserved a bigger billing. New for 6.5 are:

  • iPad / iPhone app – XLCubed web reports have always worked on smartphones and tablets. However our app brings an intuitive iPad optimised user experience to report navigation and selection.
  • Mapping – Integrated point and shape based mapping in Excel and on the web.
  • Scheduling – email delivery of XLCubed web reports by pdf or Excel. Schedules can be controlled by period, or by data exception.
  • SharePoint WebPart – customers have been using XLCubed Web reports in SharePoint for a number of years, but we now introduce a dedicated WebPart to make the process simpler and provide greater flexibility and depth of integration.

Away from the headline items there are a number of significant smaller enhancements which make 6.5 another big step forward for us. We’re looking forward to bringing it to market. For an early test drive, contact us along with your specific area of interest at support@xlcubed.com.

Lastly we’d like to welcome Cardinal Solutions Group to our partner program. Cardinal operate in North Carolina and Ohio and are one of a select few Microsoft Managed Partners in the U.S. East and Central Regions. We look forward to working together with new and existing customers.

Warning: Excel can get Volatile

Excel is a great tool for dashboard/report delivery and design (it’s why we created our addin in the first place), but there is a hidden performance trap:

Offset, Now, Today, Cell, Indirect, Info and Rand

If you’ve ever used any of these formulae, you may have noticed that whenever you change a cell, or collapse/expand a data grouping, Excel recalculates. That is because these are VOLATILE formulae: as soon as you use one, Excel recalculates it, along with everything that depends on it, every time it recalculates, and for good reason.

Offset & Now are the formulae we see used most often. Let’s look at each of these in turn and talk about some alternate approaches to avoid this issue.

Offset

This is by far the most common of these danger formulae that we see in use. Here’s the formula definition:

=Offset(reference,rows,cols,height,width)
Returns a reference to a range that is a given number of rows and columns 
from a given reference.

We typically see these as part of a named range definition for driving chart source data – it allows the number of rows/columns driving the chart data to change automatically; a not unusual requirement when it comes to building reports (especially when a report contains some user defined filters or slicers). Here’s an example:

A very simple spreadsheet: we can type the number of months to display in the chart. In reality the number of months to display will probably be driven by the data available for the criteria selected. The screenshot already shows the issue we have: the chart is set up to display a maximum of 12 months, but we only have 3 months of data available.

The most obvious approach is to use the Offset formula to pick the chart area automatically. We could create a named range such as:

Now we just change the chart data source to be the named range:

The chart is now plotting 3 months, but will automatically update to show the required number of months.

BUT we have now used a volatile formula. Although this is a simple workbook, Excel is now going to have to recalculate the chart data every time anything changes. It’s probably a good time to look at why Excel is going to do that. Let’s have a look at a very simple formula to understand how Excel recalculates things.

Consider the formula:

C1    =A1 + B1

We can see that C1 is dependent upon A1 & B1, so whenever a value in either of these cells changes C1 will need to be recalculated to show the correct answer. Excel knows about this dependency because it maintains a dependency tree; it knows which cells need to be recalculated whenever any other cell changes. This is a very efficient way of working: if a workbook has thousands of formulae but only one value changes, and that change only requires 10 of those formulae to recalculate, then only 10 will be calculated.
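The idea of a dependency tree can be sketched in a few lines of Python. This is only an illustration of the principle, not how Excel is actually implemented; the cells D1 and E1 are extra, hypothetical cells added to show that unrelated formulae are left alone.

```python
from collections import defaultdict, deque


def build_dependents(precedents):
    """Invert {cell: [cells it reads]} into {cell: set of cells that read it}."""
    dependents = defaultdict(set)
    for cell, reads in precedents.items():
        for source in reads:
            dependents[source].add(cell)
    return dependents


def cells_to_recalculate(changed, dependents):
    """Return every cell that directly or indirectly depends on `changed`."""
    dirty, queue = set(), deque([changed])
    while queue:
        for cell in dependents.get(queue.popleft(), ()):
            if cell not in dirty:
                dirty.add(cell)
                queue.append(cell)
    return dirty


# C1 = A1 + B1, D1 = C1 * 2 (hypothetical), E1 = 5 (hypothetical, reads nothing)
precedents = {"C1": ["A1", "B1"], "D1": ["C1"], "E1": []}
dependents = build_dependents(precedents)
dirty = cells_to_recalculate("A1", dependents)
```

Changing A1 marks only C1 and D1 for recalculation; E1 is never touched. A volatile formula defeats this bookkeeping because it is treated as dirty on every recalculation.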

If C1 contained:

C1    =Sum(A1:A20)

We know that C1 depends upon any of the cells A1:A20, and so does Excel. But what if C1 was:

C1    =Sum(Offset(A1,0,0,B1,1))

Which cells is C1 dependent upon? At a glance you could say A1 & B1, but B1 contains the number 20, so C1 is actually dependent upon A1:A20 and B1 (I’ve highlighted the additional cells that are dependent):

Just as we can’t see at a glance which cells C1 needs, Excel can’t easily decide that either. Offset is therefore volatile: if it weren’t, Excel could take so long working out whether the formula needs to be recalculated that it might as well always recalculate it.

There is an easy solution to this: INDEX. Here’s the formula definition (be careful, there are 2 ways to use Index, we want the REFERENCE one):

=Index(reference,row_num,column_num,area_num)
Returns a value or reference of the cell at the intersection of a 
particular row and column, in a given range

The big difference, compared to Offset, is that Index returns a single cell reference, so you need to use it as part of a range selection A1:Index(…). Here’s the same “Offset” Sum redefined as an “Index”:

C1    =SUM(A1:INDEX(A1:A20,B1,0))

The formula is simply saying the range we want starts at A1 and goes down the number of rows set in B1. The crucial difference is that the Index function knows that A1:A20 is the maximum range we are likely to look at, and therefore the dependencies are known just by looking at the formula itself.

We can now update the Named Range to use the Index function instead:

=Sheet1!$C$6:INDEX(Sheet1!$C$6:$C$17,Sheet1!$D$2,0)

Now/Today

The Now and Today functions return the current date to a cell – this is generally used so that when a report is loaded it will always show the data based on “Today”. Whilst this is not an unreasonable thing to want to do,  in reality what most people want is for the report to run for the most recent data, which could actually mean a number of things:

  • Yesterday (if the data is built in a nightly process)
  • The last working day (if the source transactional system is only used during office hours)
  • Current month etc.

The easiest solution is to let the data determine the date to use. If we use an XLCubed Grid or Query Table to retrieve the data, we can simply set up a grid to retrieve the days/months where there is data:

And use the Sort option “Reverse” to display the most recent data first:

With the grid set to “Refresh on Open”  we know that A6 will always have the most recent date available in the cube and can base the rest of the report off that cell.
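The logic of letting the data pick the date, rather than the clock, can be sketched in Python as follows. This is only an illustration of the idea: the dates and values are made up, and a value of `None` stands in for a day with no data in the cube.

```python
from datetime import date


def most_recent_date_with_data(values_by_date):
    """Return the latest date that has a value, or None if nothing has data."""
    dates_with_data = [d for d, value in values_by_date.items() if value is not None]
    # Mirror the grid: sort descending ("Reverse") and take the first row.
    dates_with_data.sort(reverse=True)
    return dates_with_data[0] if dates_with_data else None


# Hypothetical cube contents: nothing has been loaded yet for 3 April.
sales = {
    date(2008, 4, 1): 120,
    date(2008, 4, 2): 95,
    date(2008, 4, 3): None,
}
report_date = most_recent_date_with_data(sales)
```

Here `report_date` is 2 April even if the report is opened on 3 April, which is usually what the report author actually wants.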

Incidentally, V6.2 of XLCubed introduces a new option to Slicers to automatically select the most recent date member when the report is loaded:

Sql Server “Denali” CTP3 – first impressions…

Microsoft recently released the third CTP of Denali, the upcoming SQL Server release (SQL Server 2011), so here are some initial thoughts now that it’s more widely available.

The first thing to look at is the new Tabular mode for Analysis Services (as opposed to the traditional multi-dimensional mode, which is still available). This is the server version of the VertiPaq engine first seen in the PowerPivot add-in, and moves the engine from being a personal/team tool to an organisation/enterprise level affair.

This means IT are going to get involved (and people can disagree about how they feel about that!), but that report sharing should be easier as data is held centrally. In the past the report contained all the data, which could make for very large workbooks, or you published to SharePoint, which not everyone was set up to do.

Cubes can be queried using MDX, which is great for a front-end vendor like us, and XLCubed works out of the box against the CTP. Existing functionality is working smoothly, and as Microsoft Gold Partners we’re working closely with the releases to utilise all the functionality for the RTM.

We have ported a few existing cubes to the new architecture, and one first impression is that removing columns or using perspectives is going to be needed to keep things sensible for end-users; you can quickly end up with hundreds of attributes.

The ability to create hierarchies was something that was often asked for in PowerPivot, and thankfully that’s there now. This should simplify many cubes.

The intricacies of MDX put most business users off trying to use it directly, whereas DAX’s similarities with Excel functions mean there is more scope to have users create formulae on the fly. Examining how best to expose that to users is something we’ll be spending some time on in the coming months.

Easier distinct counts and the built in date calculations are the obvious candidates, but there are a number of others which we feel we can make more accessible for the majority of users.

It’s certainly an interesting move, and thinking in Tables and Columns instead of the Multidimensional model takes some getting used to; conversely, for some people it’s more natural.

It’ll also be interesting to see how MDX and DAX are integrated. The Tabular server supports both languages for query. Currently, using MDX, you can use the “With Member” syntax to create members sent to the Tabular server. Could you declare a DAX calculation in a similar manner?

Heatmap Tables with Excel – Revisited

We’ve revisited one of our more popular guides, Heatmap Tables with Excel, as heatmap tables can be a very effective way of presenting data on a dashboard, and have now updated it for Excel 2010…

This Heatmap Table is designed to show you the revenues and the discounts of a company over the course of one year per product group. The size of a bubble shows the revenue made in a particular month and the bubble color shows the discount rate given. The discount rate has been encoded as a range of green colors, ranging from a light green, for low discounts to a dark green for high discounts. The years and product totals are shown at the right and bottom as an integrated part of the table.

Tufte often talks about the integration of numbers, images and words; I think he’s quite right. One way to achieve this in Excel is to integrate charts into tables, so-called graphical tables, a very effective means to show “More Information Per Pixel”.

The heatmap table is based on a regular Excel bubble chart. To integrate a bubble chart into a table the bubbles are positioned in a matrix that has the same row and column layout as our table.

In our case we generate a data series table with one column for the X-Series going from 1-12 for January – December and one column for our Y-Series going from 1-8 for our 8 product groups and one column for revenue.

In the sample spreadsheet we’ve set up some simple Excel formulae to translate data from the classic grid layout to the required format.
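The same translation, from a grid of product groups by months into one (X, Y, size) row per bubble, can be sketched in Python. The sample spreadsheet does this with Excel formulae; the function below is only an illustration, and it assumes the first table row should get the highest Y value so that it plots at the top of the chart (flip this if your Y axis is reversed).

```python
def grid_to_bubble_series(grid):
    """Turn grid[row][col] = revenue into a list of (x, y, revenue) bubble rows.

    x runs 1..number of columns (months); y runs from the number of rows
    down to 1, so the first table row ends up at the top of the chart.
    """
    row_count = len(grid)
    series = []
    for row_index, row in enumerate(grid):
        for col_index, revenue in enumerate(row):
            series.append((col_index + 1, row_count - row_index, revenue))
    return series


# Hypothetical 2 product groups x 3 months of revenue.
grid = [
    [10, 20, 30],
    [40, 50, 60],
]
series = grid_to_bubble_series(grid)
```

With 8 product groups and 12 months the X values run from 1 to 12 and the Y values from 1 to 8, which is why the axis limits of 0.5/12.5 and 0.5/8.5 centre each bubble in its table cell.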

Now we can insert the bubble chart:

To ensure that the charts fit exactly into the table grid we set Min/Max for the X axis to 0.5/12.5 and for the Y axis to 0.5/8.5. Excel would calculate much larger auto scales otherwise. Also set the Major units to 1 so we can use that later to set some grid lines.

Now we remove the legend, the X and Y axis, maximize the plot area and align the chart with the Excel table. As the bubbles are initially too large we have to make them smaller. To control the bubble size go to Data Series Options and scale the bubble size to 50%:

This already makes a nice bubble table you could use to reproduce the Twitter Charts.

For the grid lines, format your table headers and grid lines in light gray. Resize the plot area, remove the border and re-position the chart so that the chart and the table grid lines align.

To create the heatmap with different colored bubbles we use the fact that by default Excel does not plot data points for #N/A values. For the heatmap we overlay 8 bubble series, one series per green shade, and show a revenue bubble only if the value fits into the value range that corresponds with a green shade of our color ramp; otherwise we show #N/A.

We divide the range MAX(Discount)..0 into 8 groups to define the colours.

The data series columns use the following formula to test if a discount value corresponds with an interval / colour shade:

=IF(AND($E7>I$6-Step,$E7<=I$6),$D7,NA())

The formula returns the revenue if the discount value is in the interval defined in the column header I$6.
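To make the bucketing explicit, here is a small Python sketch of the same test applied across all eight series columns. It is an illustration only: `None` stands in for #N/A, and the discounts are written as whole percentages (a made-up maximum of 16%, so a step of 2) to keep the arithmetic exact.

```python
def bucket_series(revenue, discount, upper_bounds, step):
    """One entry per colour series: the revenue if the discount falls in
    (upper - step, upper], otherwise None (standing in for #N/A)."""
    return [
        revenue if upper - step < discount <= upper else None
        for upper in upper_bounds
    ]


# Hypothetical colour ramp: 8 buckets covering discounts up to 16%.
upper_bounds = [2, 4, 6, 8, 10, 12, 14, 16]
row = bucket_series(1000, 5, upper_bounds, step=2)
```

Each data row therefore has a value in at most one series, so exactly one coloured bubble is drawn per cell. Note that with this test, as with the formula as written, a discount of exactly 0 falls in no bucket and would not be plotted.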

Now create the eight data series so that the bubble size refers to the eight columns in the data table:

And use the Excel chart styles to pick a colour range – make sure you  remove the border from the chart area.

And you could use the chart styles to quickly switch between different colours – or customise each series to refine the colors.

You can download a starting point for these files here: HeatmapSample.xlsx. Most of the formulae should adapt to data values that you can feed into the data sheets, including data straight from Analysis Services if using XLCubed grids or formulae.

You can see an interactive version of the Heatmap here – we added a link to some cube data, some Slicers for driving the parameters and then published to XLCubedWeb.

Flexible time-series graphing from a slicer

We are often asked how to drive a chart from a slicer in XLCubed and how to plot days/months for a month or year. In the base case this is fairly straightforward: you can set up a grid based on the previous ‘x’ months of a slicer selection, for example. The difficulty comes where you want to vary the behaviour depending on which level of the hierarchy the user chooses. This is particularly true where the hierarchy contains semesters or quarters.

The example below shows a technique to handle this complexity and display the chart in a way meaningful to the user in each case. The report is based on a slicer that allows the user to switch between showing the graph data based on quarters, months or days.

You can download the Excel spreadsheet that is used in the example here TimeSeriesGraphFromSlicer

This connects to the Adventureworks demo database which ships with Analysis Services.

The diagram below shows the flow of data from each worksheet showing the final result in the sheet Chart.

Workbook Sheet – Chart

This sheet shows the graph based on the data chosen in the slicer above it. This switches the graph data between quarters, months and days depending on the slicer selection.

Workbook Sheet – GridForChart

This shows the data that will be graphed, depending on the choice made by the slicer selection. In this example it is months July 2001 – June 2002. FY2002 has been selected by the user (in this example Financial Year 2002 runs from July 2001 – June 2002).

Note that cells A10 – A21 contain the value ‘TRUE’ – these cells contain an XL3RowVisible statement as follows:

=XL3RowVisible(B10<>"")

This statement hides rows with no data so that they are not plotted on the graph.

Workbook Sheet – SlicerToMonthDay

This sheet contains the data that is returned by the choice of the slicer in workbook sheet Chart.

User selects a month

The data will be graphed as days. For example, if the user selects July 2002 then the graph will be displayed with each day in July along the x-axis. These are defined in XLCubed as ‘Children of’ the slicer.

User selects a quarter year

The data will be graphed as months in a three month period. For example, the user selects Q1 FY 2003 and the data displayed is for three months from July 2002 – September 2002 as below. These are defined in XLCubed as ‘Descendants of’ the slicer at month. This will be the same when the user picks year, semester or quarter.

User selects a half-year

The data will be graphed as months in a six-month period. For example, the user selects H1 FY 2003. The screenshot below shows the data that will be graphed.

However, it can be seen that the values Q1 FY 2003 and Q2 FY 2003 should not appear on the graph.

Using the Edit Member functionality it is possible to remove these so that they do not appear as points on the graph.

To do this, edit the Date.Fiscal member and click on Advanced tab.

Click on the drop down next to first member – that member set is the resulting data when the user selects H1 FY 2003 and shows the data that is in cells B10 – B43 in sheet SlicerToMonthDay.

The screenshot below shows the data that will be subtracted: in effect the actual value selected by the user via the slicer, alongside the two fiscal quarter values Q1 FY 2003 and Q2 FY 2003.

The GridForChart sheet now shows just the six months that should be graphed. As explained earlier further manipulation using the XL3RowVisible functionality removes blank rows.
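The subtraction of one member set from another can be illustrated with a short Python sketch. This is not XLCubed code: the hierarchy below is a toy version that stops at month level (the real Date.Fiscal hierarchy continues down to days), and "anything that has children" is used as a stand-in for the members to subtract.

```python
# Toy fiscal hierarchy that stops at month level (an assumption for illustration).
CHILDREN = {
    "H1 FY 2003": ["Q1 FY 2003", "Q2 FY 2003"],
    "Q1 FY 2003": ["July 2002", "August 2002", "September 2002"],
    "Q2 FY 2003": ["October 2002", "November 2002", "December 2002"],
}


def descendants(member, children=CHILDREN):
    """Return the member followed by all of its descendants, in hierarchy order."""
    result = [member]
    for child in children.get(member, []):
        result.extend(descendants(child, children))
    return result


def members_to_plot(selected, children=CHILDREN):
    """First member set minus the members to subtract (the selection and any quarters)."""
    first_set = descendants(selected, children)
    to_subtract = {m for m in first_set if m in children}  # anything above month level
    return [m for m in first_set if m not in to_subtract]


half_year = members_to_plot("H1 FY 2003")
quarter = members_to_plot("Q1 FY 2003")
```

Selecting the half-year leaves the six months July 2002 to December 2002, and selecting a quarter leaves its three months, so the chart never receives a quarter or half-year total as a data point.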

The screenshot above shows the graph with six months of data for H1 FY 2003 for months July 2002 – December 2002, and the quarters have been dynamically excluded.

The end result is a flexible time selector where the user can choose dates at different levels in the hierarchy, and will always get a meaningful and in-context time series chart.

Parent-Child Dimensions in Analysis Services – Performance Walkthrough

Parent-child hierarchies are a good fit for many data structures such as accounts or employees, and while they can speed development in some cases, they can also cause performance problems in large cubes.

We often see customers with this type of performance issue, and thought it worth sharing a simple technique for altering the dimension structure to improve query speed.

The problem

Often parent-child hierarchies are created because this is the structure used in the relational source, so they seem a good fit to model the members. In many cases, though, data is only at the leaf level of the hierarchy, meaning parent-child isn’t really needed.

Performance problems occur because no aggregates are created for parent-child dimensions, as detailed in the Analysis Services performance guide:

Parent-child hierarchies

Parent-child hierarchies are hierarchies with a variable number of levels, as determined by a recursive relationship between a child attribute and a parent attribute. Parent-child hierarchies are typically used to represent a financial chart of accounts or an organizational chart. In parent-child hierarchies, aggregations are created only for the key attribute and the top attribute, i.e., the All attribute unless it is disabled. As such, refrain from using parent-child hierarchies that contain large numbers of members at intermediate levels of the hierarchy. Additionally, you should limit the number of parent-child hierarchies in your cube.

If you are in a design scenario with a large parent-child hierarchy (greater than 250,000 members), you may want to consider altering the source schema to re-organize part or all of the hierarchy into a user hierarchy with a fixed number of levels. Once the data has been reorganized into the user hierarchy, you can use the Hide Member If property of each level to hide the redundant or missing members.

The performance guide hints at re-organizing the hierarchy to improve performance, but doesn’t say how.

The solution

This article will walk through the steps needed to change your parent-child hierarchy structure to have real levels, so that aggregations work, and your performance is as good as you expect with normal hierarchies.

This process is known as flattening or naturalizing the parent-child hierarchy.

Firstly, let’s look at the data in our relational source.

Code: Sql Create Script

Not a large dimension, but enough to demonstrate the technique. As you can see my real products are all at the leaf level.

The strategy is quite simple:

  • Create a view to separate the members into different levels.
  • Create a new dimension using these real levels.
  • Configure the dimension to appear like the original parent-child dimension, but with the performance of a normal dimension.

Create the view

We want to create a denormalised view of the data. To do this we join the Product to itself once for each level. This does mean we need to know the maximum depth of the hierarchy, but often this is fixed, and we’ll build in some extra levels for safety.

The tricks here are:

  • Use coalesce() so that we always get the lowest level ID below the leaves, never a NULL. This allows us to join to the fact table at the bottom level of our hierarchy.
  • Leave Name columns null below the leaves, this will allow us to stop the hierarchy at the correct leaf level in each part of the hierarchy.

Code: Sql View Script

Running this we get:

Obviously we can update this view to create more levels as required, but 5 are enough for now.
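As a rough illustration of the self-join technique, here is a minimal sketch run against an in-memory SQLite database from Python. It is not the script used for this article: the Product table layout (ProductID, ParentID, Name), the sample rows and the view name ProductFlat are all assumptions made for the example, and only the LevelNID / LevelNName column naming follows the article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical parent-child source table; a NULL ParentID marks the root.
conn.execute("CREATE TABLE Product (ProductID INTEGER PRIMARY KEY, ParentID INTEGER, Name TEXT)")
conn.executemany(
    "INSERT INTO Product VALUES (?, ?, ?)",
    [
        (1, None, "All Products"),
        (2, 1, "Bikes"),
        (3, 1, "Accessories"),
        (4, 2, "Road Bikes"),
        (5, 2, "Mountain Bikes"),
        (6, 3, "Helmets"),
    ],
)

# Join Product to itself once per level. COALESCE carries the lowest real ID
# down to Level5ID, while the Name columns stay NULL below the leaves.
conn.execute("""
CREATE VIEW ProductFlat AS
SELECT
    l1.ProductID AS Level1ID, l1.Name AS Level1Name,
    COALESCE(l2.ProductID, l1.ProductID) AS Level2ID, l2.Name AS Level2Name,
    COALESCE(l3.ProductID, l2.ProductID, l1.ProductID) AS Level3ID, l3.Name AS Level3Name,
    COALESCE(l4.ProductID, l3.ProductID, l2.ProductID, l1.ProductID) AS Level4ID,
    l4.Name AS Level4Name,
    COALESCE(l5.ProductID, l4.ProductID, l3.ProductID, l2.ProductID, l1.ProductID) AS Level5ID,
    l5.Name AS Level5Name
FROM Product l1
LEFT JOIN Product l2 ON l2.ParentID = l1.ProductID
LEFT JOIN Product l3 ON l3.ParentID = l2.ProductID
LEFT JOIN Product l4 ON l4.ParentID = l3.ProductID
LEFT JOIN Product l5 ON l5.ParentID = l4.ProductID
WHERE l1.ParentID IS NULL
""")

rows = conn.execute(
    "SELECT Level3Name, Level4Name, Level5ID FROM ProductFlat ORDER BY Level5ID"
).fetchall()
```

The view returns one row per leaf product. In this sample the real products sit at level 3, so Level4Name and Level5Name are NULL while Level5ID still holds the leaf's ProductID, which is what lets the fact table join at the bottom level.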

The Dimension

Next we go to BIDS, and add the view to our Data Source View, and then add a new Dimension based on the view.

The key steps to creating the dimension correctly are:

  • Set the key attribute to Level5ID, and the name to Level5Name.
  • Create an attribute for each Level ID, and on each set the Name Column appropriately.
  • Create a hierarchy using these attributes in order.
  • On each attribute set AttributeHierarchyVisible to False.
  • On each level of the hierarchy set HideMemberIf to NoName.
  • Set up the Attribute Relationships between the levels.

You should end up with the following:

Dimension Structure

Attribute Relationships

If you browse the dimension you’ll see that it never goes as far as level 5, even though it exists. This is because we set up the member hiding option, and returned NULLs in our view.

Conclusion

And that’s it done, you can now join to your fact tables at the lowest level, build your cube as normal and get the performance benefits of aggregation!

See also

A tool to achieve the same result is available from Codeplex; we’ve not personally tried it, but it may well be a timesaver. It works in a similar way to the example above, but it’s often useful to understand how something works, even if you choose to automate it.