Breakout and Propagate – Oldies but Goodies

We’ve been on-site with various customers in the last few months, it’s always good to see how the product is being used, and we value customer feedback, which often feeds into our development cycle.

With two long-term customers we found that they weren’t aware of two very useful pieces of functionality which have been in the product for many years. In case there are others in the same boat…..

1) Breakout

Breakout is available on the right-click menu of any XLCubed report, grid or formula. It’s a way to understand how the number is split into elements of another hierarchy. On a right-click, you can switch to the breakdown of the number by any other hierarchy in the cube. It’s particularly useful where a reported number seems too high or too low, and needs further investigation. In the example below, we’d like to understand how the June 2004 number for the US is made up in terms of Products. We’ve chosen breakout on the right-click menu, specified the Product Model Categories hierarchy and the level to run the breakout at. The breakout result is shown on the bottom right, and we can quickly see that in June Touring bikes were contributing almost 33% of the US revenues. In this example we’ve also included a sparkline to give a feel for trend over the last 2 years (previous 23 months, plus the current month).

The breakout result is still linked to the report selection criteria, so can be used as a dynamic part of the report ongoing.

 

2) Propagate across Sheets

Propagate across sheets is a way to quickly replicate a report onto additional sheets, where just one variable is changed. Typical-use cases for this are entity-based reporting, where for example there is a standard P&L template across the business, and the user wants to quickly generate a P&L for each legal entity. You can build the report as normal, and then when done, right-click on the selected member for the hierarchy you want to propagate, and choose the elements which you want to create additional sheets for. On the new sheets, the formatting and print layout are identical, with the only change being the selected member on the propagated hierarchy.

This example shows the income statement about to be propagated for the four selected departments.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Solve order shenanigans

Today I’m going to blog about a problem we recently solved in a client’s cube, an error in the Mdx script that’s very easy to make if you aren’t careful.

We’ll run a simple example in AdventureWorks (what else?) to demonstrate the issue.

The client had already added a calculation to their cube to show year-on-year growth. The formula is:

Create Member CurrentCube.[Measures].[Delta to PrevYear] as
(
    ([Measures].[Internet Sales Amount])
    -
    ([Measures].[Internet Sales Amount],
        ParallelPeriod(
            [Date].[Calendar].[Calendar Year],
            1,
            [Date].[Calendar].CurrentMember
        )
    )
)
/
    ([Measures].[Internet Sales Amount],
        ParallelPeriod(
            [Date].[Calendar].[Calendar Year],
            1,
            [Date].[Calendar].CurrentMember
        )
    )
, Format_String = "0.00%";

(some error checking removed for clarity)

This screenshot shows a couple of simple XLCubed Grids showing the real value, and below the percentage change. I have added in an Excel calculation to show the results are as expected.

Later during the cube development, the client added a calculated member in their Product dimension, one that gives a total excluding one of the product categories.

To replicate this I’ll add a calculation for “All Ex Bikes”:

Create Member 
CurrentCube.[Product].[Product Model Categories].[All Products].[All Ex Bikes]
as
(
    ([Product].[Product Model Categories].[All Products])
    -
    ([Product].[Product Model Categories].[Category].&[1])
);

And if we run the report again we get the following.

Notice the cell I’ve highlighted. The “All Ex Bikes” calculation works fine on the normal measure, but it gives totally the wrong number for the percentage calculation. What’s going on?

The problem is that in the cell highlighted Analysis Services has two calculations to think about when working out the result.

  • Compare this year to last year
  • Get the “Grand Total”, and subtract “Bikes”

As the number returned is 1.85% we can see that Analysis Services has chosen the second option, “Grand Total” – “Bikes”.

What we really want is for the calculation to be done by getting the subtotal, and then doing the percentage change based on that.

Fortunately the fix was a simple one. Analysis Services will run the calculations in the order they are found in the Mdx Script, so to fix the issue we simply moved the new “All Ex Bikes” definition up above the percentage calculation.

Now the number returned matches our expectations.

Pass/Solve Order can be a complex topic, so you may need to be careful.

In this case the number is totally wrong, so it was easy to spot, but some bugs will be much more subtle, so watch out!

Ranking, Sorting and Filtering

Once we have returned cube members into a grid report we often need to exclude or change the order of the result set to provide more meaningful information. MDX (Multidimensional Expressions) language includes some very useful operators to provide filtering (FILTER), sorting (ORDER) and ranking (TOPCOUNT/BOTTOMCOUNT) of dimension members. These can be quite overwhelming even for power users of XLCubed.  So, in V6, we have introduced a new feature “Advanced Member Selections” to provide easy access to this powerful part of Microsoft Analysis Services.

Using this new functionality we can nest and combine these operations to answer complex business questions (for simpler operations you can right-click on a member in the grid and use the “Apply” menu to perform simple ranking, filters and sorting).

Filtering

So let’s go through a simple filtering example.  Say, for example, that we want to find the products at Product Key level that sold more than 25 units in 2003, Quarter 1 and show the sales figures for those subcategories during 2003 and its quarters.

  1. Start by clicking the Grid ribbon item (or the XLCubed > Design Grid menu item in Excel 2003 and below), and selecting the Internet Sales cube file
  2. Drag Calendar Period to Columns and Product to Rows. You can also drag any other hierarchies to Headers. In the example image below, Measures and Customer have been added there.

  1. Click on the Product hierarchy so that its details appear in the bottom-right panel.
  2. Drag the Product key level over to the right of the dialog. You can switch between the members view and levels view by clicking on the Show Levels icon ().
  3. Click the Advanced tab to show the advanced selection pane:

  1. Click the Members drop down and choose Filter result:


  1. Click the Calendar Period edit control in the grid to change its selection to the desired member (2003, Quarter 1):

  1. Select the This measure radio button, and select Order Quantity as the desired measure.
  2. Change the Operation to >, and type 25 in the edit field on the right:

  1. Click OK. The new filter is displayed in the advanced selections tab:

  1. Click OK again to run the Report – the Grid shows the members that fit our criteria:

 

So we can see the results, filtering by 2003 Q1, but displaying the values for All Time (or any other period we wish to use). We could have also used the Range selector:    to drive the period selecting from an Excel Range and our grid would automatically refresh whenever the driving value changes.

Ranking

Now let’s add a ranking to find the bottom 8 selling products at the Product Key level that have sold more than 25 units inQ1:

  1. Display the Product Hierarchy Editor dialog
  2. Click the Rank result icon () on the advanced selections tab to display the Edit Ranking dialog
  3. Select the Bottom radio button, and type 8 into the edit field
  4. Select 2003, Quarter 1 for the Calendar Period hierarchy in the grid below:

We now have the filter, following by the ranking:

 

Run the Grid: only the lowest 8 members are returned

 

Sorting

Now let’s sort the report on a different dimension – for example, descending order of the Q1 sales.

  1. Display the Hierarchy Editor for the Product hierarchy by double-clicking on the Product label in the Grid
  2. If it’s not already visible, select the Advanced tab
  3. Click the Sort result toolbar button ()
  4. Change the Calendar Period selection to 2003, Quarter 1:

  1. Click the Sort Descending (9-1) radio button
  2. Click OK. The new sort is displayed in the advanced selections tab
Click OK again to run the Report

 

Joining Results

It’s also possible to join different results together: combining both sets (UNION), excluding members (EXCEPT) and returning common members (INTERSECT).

So we could also add the top 10 products  along side the bottom 8 products to the grid. Begin by adding another member selection using the “Add Member List” tool-bar button:

As before, we select the list of members to rank (in this case the Product Key level) and then select the operation we want to perform, a Top 10:

There are various options to decide how to combine the lists, we’ll stick with Add:

 

 

And we get both results combined:


So the “Advanced Member Selections” feature provides lots of the power of Analysis Services in a simplified way  – to try this feature for yourself you can begin by downloading XLCubed.

Small Multiples on River Quality

The phrase small multiple was popularised by Edward Tufte, and has become a generic term for a visual display using the same chart or graphic to display different slices of a data set. Their close positioning and shared scale make comparisons very easy and shared trends or outliers can be quickly spotted. Various other terms are also used to describe this charting approach, or specific aspects of it, including Trellis Charts, Lattice Charts, Grid Charts and Panel Charts.

The most common use case for small multiples is separate line charts to compare trend across a large number of varying elements. Placing them all within one chart would cause either a ‘spaghetti chart’ , or lots of occlusion as shown in the comparison below. Here we use a standard Excel line chart, and an XLCubed small multiple to chart the same data. Separating the charts while keeping a consistent axis scale makes for a much easier comparison than in the single chart.

We took a slightly different approach when using small multiples to take a look at differences in river water quality across regions of the UK. Our source data was not absolute numeric values, but 14 years of results categorised into four bandings (bad, poor, fair and good). We wanted to provide a ‘one-pager’ which gave a feel for the trend within each region, but also access to the annual breakdown of the different water qualities.

In the end we settled on a Small Multiple display of 100% stacked columns as shown below.

A percentage base seemed a sensible way to approach the data, as different regions will have differing numbers of rivers and of samples taken. Using this approach we’re able to see a comparison of the relative water quality rather than dealing in absolutes.

The user selects a geographic area of the country to view the regional breakdown within the selected area. The water quality for a particular year can be analysed by locating the region, and the specific year to see the percentage breakdown for each of the four categories.

The colouring of the 4 categories was chosen to aid ‘at a glance’ recognition of the overall water quality by region, and also of the trend. Dark blue signifies bad quality water (opaque), and light blue signifies good quality (think ‘you can see right through it….’).

So to read the display overall, or for trend:
• Dark colour signifies water quality problems.
• Light colour signifies good quality water.
• Reading left to right, increasing colour saturation shows declining quality over time.
• Reading left to right, decreasing colour saturation shows improving quality over time.
• Any region can be zoomed in on to see a larger chart and understand the breakdown in more detail.

Fairly quickly, and from just this one display we can draw a number of conclusions as below:
• Across the region, as a broad brush summary, water quality has improved since 1992.
• Doncaster has shown strong and steady improvement.
• Kingston upon Hull has the worst quality overall in the region, and varies significantly year on year.
• If you’re off for a swim in a Yorkshire river, Richmondshire looks a good bet!

We’ve designed a pre-set view in this case to work for the data in question, but the small multiple concept is also very powerful when interactively exploring data. A picture can tell a thousand words as they say – take a look at our youtube videos on small multiples: Video1 Video2

 

Sql Server “Denali” CTP3 – first impressions…

Microsoft recently released their third CTP of Denali the upcoming SQL Server release (SQL Server 2011), so here are some initial thoughts now it’s more widely available.:

The first thing to look at is the new Tabular mode for Analysis Services (as opposed to the traditional multi-dimensional mode, which is still available). This is the server version of the VertiPaq engine first seen in the PowerPivot add-in, and moves the engine from being a personal/team tool to an organisation/enterprise level affair.

This means IT are going to get involved (and people can disagree about how they feel about that!), but that report sharing should be easier as data is held centrally. In the past the report contained all the data, which could make for very large workbooks, or you published to SharePoint, which not everyone was set up to do.

Cubes can be queried using MDX, which is great for a front-end vendor like us, and XLCubed works out of the box against the CTP. Existing functionality is working smoothly, and as Microsoft Gold Partners we’re working closely with the releases to utilise all the functionality for the RTM.

We have ported a few existing cubes to the new architecture and one first impression is that removing columns or using perspectives is going to be needed to keep things sensible for end-users, you can quickly end up with hundreds of attributes.

The ability to create hierarchies was something that was often asked for in PowerPivot, and thankfully that’s there now. This should simplify many cubes.

Attribute-tastic

 

The intricacies of MDX put most business users off trying to use it directly, whereas DAX’s similarities with Excel functions means there is more scope to have users create formulae on the fly. Examining how best to expose that to users is something we’ll be spending some time on in the coming months..

Easier distinct counts and the built in date calculations are the obvious candidates, but there are a number of others which we feel we can make more accessible for the majority of users.

It’s certainly an interesting move, and thinking in Tables and Columns instead of the Multidimensional model takes some getting used to, conversely for some people its more natural.

It’ll also be interesting to see how MDX and DAX are integrated. The Tabular server supports both languages for query. Currently using MDX you can use the “With Member” syntax to create members sent to the Tabular server, could you declare a DAX calculation in a similar manner?

Parent-Child Dimensions in Analysis Services – Performance Walkthrough

Parent-child hierarchies are a good fit for many data structures such as accounts or employees, and while they can speed development in some cases, they can also cause performance problems in large cubes.

We often see customers with these type of performance issues, and thought it worth sharing a simple technique for altering the dimension structure to improve query speed.

The problem

Often parent-child hierarchies are created as this is the structure used in the relational source, so they seem a good fit to model the members. In many cases though data is only at the leaf level of the hierachy, meaning parent-child isn’t really needed.

Performance problems occur because no aggregates are created for parent-child dimensions, as detailed in the Analysis Services performance guide:

Parent-child hierarchies

Parent-child hierarchies are hierarchies with a variable number of levels, as determined by a recursive relationship between a child attribute and a parent attribute. Parent-child hierarchies are typically used to represent a financial chart of accounts or an organizational chart. In parent-child hierarchies, aggregations are created only for the key attribute and the top attribute, i.e., the All attribute unless it is disabled. As such, refrain from using parent-child hierarchies that contain large numbers of members at intermediate levels of the hierarchy. Additionally, you should limit the number of parent-child hierarchies in your cube.

If you are in a design scenario with a large parent-child hierarchy (greater than 250,000 members), you may want to consider altering the source schema to re-organize part or all of the hierarchy into a user hierarchy with a fixed number of levels. Once the data has been reorganized into the user hierarchy, you can use the Hide Member If property of each level to hide the redundant or missing members.

 

The performance guide hints at re-organizing the hierarchy to improve perfomance, but doesn’t say how.

The solution

This article will walkthrough the steps needed to change your parent-child hierarchy structure to have real levels, so that aggregations work, and your performance is as good as you expect with normal hierarchies.

This process is known as flattening or normalizing the parent-child hierarchy.

Firstly, let’s look at the data in our relational source.

Code: Sql Create ScriptSelectShow

Not a large dimension, but enough to demonstrate the technique. As you can see my real products are all at the leaf level.

The strategy is quite simple:

  • Create a view to seperate the members into different levels.
  • Create a new dimension using these real levels.
  • Configure the dimension to appear like the original parent-child dimension, but with the performance of a normal dimension.

Create the view

We want to create a denormalised view of the data. To do this we join the Product to itself once for each level. This does mean we need to know the maximum depth of the hierarchy, but often this is fixed, and we’ll build in some extra levels for safety.

The tricks here are:

  • Use coalesce() so that we always get the lowest level ID below the leaves, never a NULL. This allows us to join to the fact table at the bottom level of our hierarchy.
  • Leave Name columns null below the leaves, this will allow us to stop the hierarchy at the correct leaf level in each part of the hierarchy.

Code: Sql View ScriptSelectShow

Running this we get:

Obviously we can update this view to create more levels as required, but 5 are enough for now.

The Dimension

Next we go to BIDS, and add the view to our Data Source View, and then add a new Dimension based on the view.

The key steps to creating the dimension correctly are:

  • Set the key attribute to Level5ID, and the name to Level5Name.
  • Create an attribute for each Level ID, and on each set the Name Column appropriately.
  • Create a hierarchy using these attributes in order.
  • On each attribute set AttributeHierarchyVisible to False.
  • On each level of the hierarchy set HideMemberIf to NoName.
  • Set up the Attribute Relationships between the levels.

You should end up with the following:

Dimension Structure

 

Attribute Relationships

 

If you browse the dimension you’ll see that it never goes as far as level 5, even though it exists. This is becuase we set up the member hiding option, and returned NULLs in our view.

Conclusion

And that’s it done, you can now join to your fact tables at the lowest level, build your cube as normal and get the performance benefits of aggregation!

See also

A tool to achieve the same result is available from Codeplex, we’ve not personally tried it but may well be a timesaver. This works in a similar way to the example above, but it’s often useful to understand how something works, even if you choose to automate it.

Microsoft Gold Partner Renewal, & resulting questions answered

We’ve just completed the process of renewing our Gold partner status with Microsoft. Among other things, this ensures we have early access to upcoming software through the CTP and beta programs, as has been the case with office 2010 over the last year.

The Gold level now has a requirement for completion of an independently run Customer Satisfaction survey, and thanks very much to everyone who completed this. The scoring and comments are much appreciated, and it was good to see the consistently high scores. We’re of course reviewing the areas where we didn’t score perfectly as we strive to further improve the service we offer.

The survey also highlighted a number of questions which appeared a few times:

1) Relational Database support

Our prime focus is on cube based reporting and analytics, but we do also support querying of relational databases in both Excel and on the Web, and have a number of customers using the product purely for relational reporting.

We extend the native Microsoft functionality in excel, and add support for this in the Web product. The connection string and query string can both be formula driven so you can construct parameter driven reports with ease.

– Search for ‘relational database access’ in the help file for an overview.

2) Writeback & Planning applications

XLCubed supports both grid and formula based writeback against Analysis Services, in Excel and on the Web. As such it lends itself well to planning and budgeting applications, and it’s an arena in which we have a lot of experience, from the straightforward through to the highly complex.

3) Documentation and User Guides

Documentation is now online and regularly being updated, you can access it here: http://www.xlcubed.com/help

4) YouTube videos

We had two streams of comment here, broadly summarised as:

a) They’re really useful – thanks!

b) Youtube access is blocked by our corporate internet policy!

If your access is blocked, you can download the videos as mp4 from

http://www.xlcubed.com/downloads/xlcubedv6_youtube.zip

PowerPivot, SQL R2, Sharepoint 2010, Office 2010.

So we’ve been using PowerPivot for a while now, and Office 2010 has been part of our lives for some time. I’ll use this blog to answer some of the questions that keep cropping up in conversation with our customers:

1. Does XLCubed work with Excel (Office 14) 2010?

a. Yes, we’ve been using it since the first CTP release and each release since then.

2. Can I use XLCubed Web with SharePoint 2010?

a. Yes, publishing to the web and embedding the reports within your SharePoint site works in exactly the same way as with previous versions.

3. Does XLCubed connect to PowerPivot?

a. Yes, XLCubed connects to the PowerPivot published cubes, and our client tools can be used to build reports and dashboards from them.

4. Can I build reports from SQL Server R2 using XLCubed?

a. Yes this will work just fine, just as you can build reports from previous version of SQL or other relational sources. (here is an example)

PowerPivot in the real world

The services team have been working on migrating some of our internal models and sample databases across to a PowerPivot environment – looking at the pros and cons, using DAX rather than MDX to perform some calculations. Results have been varied, its been interesting to see some features that we’ve had for a while (like cube formulas, slicers and web parameters) appear in a similar way in PowerPivot.

Quite clearly PowerPivot isn’t the be all and end all or anything like a replacement for Analysis Services, but it certainly has a role for tactical solutions, some power user analysis, and we think likely also for RAD prototypes of larger scale AS implementations. It doesn’t venture into the gap left by PerformancePoint Planning (as many thought it would in early 2009) – we’ve moved to address this area with the XLCubed PM suite that uses in memory OLAP cubes and/or Analysis Services.

Trying out some of the tools

Here’s a few download sets for you to try, take careful note of the hardware spec and requirements for the MS ones though:

The 2010 Information Worker Virtual machine

Register and Download Office 2010

PowerPivot 32Bit, 64Bit

XLCubed Evaluation

If you would like to evaluate against your own data – contact the XLCubed Product team for evaluation editions or if you want to try a no risk proof of concept or prototype contact the XLCubed consulting team.

Data Visualization – a real world example

In the following example we work through a real world example of a data visualization. We’ve chosen an example that involves Operations data – this is fairly non-domain specific so hopefully it can demonstrate some important points. The first, and most important point is that you have to define your audience.

We receive many questions about “what is the best chart for this situation” or “what colour should I use for emphasis”. These questions are usually attacking the problem from the wrong angle. The one question you need to ask before anything else is “who is this visualization going to be seen by and how?” Is it in a boardroom on a printed sheet or across a trading floor on a plasma screen. Are the consumers domain experts?

This example features data about an investment bank’s operations processing, the audience being the clients of the Operations department.

Starting Point

Initially the project started out as simply trying to record what operational problems were encountered on a daily basis across different product lines. A reporting system was built and various generic reports produced:

DVBlog1

Unfortunately the reports either didn’t contain data at a granular enough level or it was difficult for the product managers to see where the issues were occurring and what the trends were. In reality the report showed what the major problems had been – unfortunately this was already known, as when something major goes wrong you remember getting shouted at!

What was requested

The client wanted a report that showed where the problems were occurring across business lines (rather than operational units) and how they were doing historically in a single page that could be included in a weekly MIS pack (they currently had four pages per product line (8) so a total of 32 pages. As a first pass they simply wanted an Excel worksheet they could update manually:

DVBlog2

We felt this solution lacked clarity and it was very difficult to spot trends across products.

What we proposed

We designed a solution using MicroCharts to allow small multiples of charts to show a variety of views:

DVBlog3

This solution allowed the user to view the data simply as a cumulative set of data by Product (top line) or by Root Cause (vertically) and then look deeper into historical trends in the centre of the chart. For example, its fairly easy to see spikes in the Root Cause data historically and see that the overall trend has improved over time. By ranking the Products and Root Causes you immediately give some sense of scale to the data. For example you can see that there are many more Application failures than any other type of problem, but the majority of root causes are otherwise fairly evenly distributed.

One other point worth noting was that the original colour scheme was much more muted, but the client got very upset that it looked like a competitor’s corporate colour and wanted it to be “louder”.

What was the user reaction…

Ecstatic, 1 page replaced 34 and they could see at a glance how the entire (large) organisation was working but also quickly find out detail for a particular area and identify trends.

Cube Design – meeting the business needs

Following on from our previous blog post on a couple of the common cube performance issues we’ve seen this last month, I thought I’d mention some of the non-technical issues we see quite often. In one case, once we’d made a few teaks and sorted out the cube performance issues we had to ask – Is the cube doing what it needs to? (Of course we did ask this first but the priority was sorting out the current cube performance!) Does it meet the business requirement? There’s no point in having the most complex cube that uses all the greatest features if it can’t answer the users queries.

In reports, we’ve seen examples where clients have nested four or five attributes to build up the effects of a hierarchy or run huge queries then vlookups on them to get the data they need, or bring back 12 columns of data and manually work out year to date, or not have any hierarchies that reflected commonly used groupings of members, or not have member names formatted in the way the business needs. To us this just isn’t right.

The users might not seem to care too much if they don’t know how the cube could work or if it runs fast enough to bring back huge result sets they can manipulate themselves – but doesn’t that negate the point of having a cube and your investment in it? Consumers of the cube should have fast, timely, accurate and importantly appropriate data made available to them in a manner that makes sense.

Cube design and build is about understanding the business and users needs and then building the cube and associated processes, that’s before even starting to build the reports and conveying the information using good data visualisation practices.

All too often we’re seeing a drive to use the latest tech, the flashiest widgets, cool looking 3D and shading effects on reports through to cubes and databases with every conceivable hierarchy or type of measure thought possible but not bearing much resemblance to what the users need to see.

I won’t hide the fact that we’re very proud of our skills and experience in ensuring our clients get not just a technically excellent system but also one that fits their needs. If you want to talk to one of the team about how they can help, you can find our contact details here.