Streamlining writeback with XL3DoWriteback

We were recently asked by one of our customers to help them improve their forecasting process. They had originally been using a solution developed with XLCubed Excel Edition v6.0 and our XL3LookupRW formula. The system had been working, but the combination of an intricate data model and a cube server that was slow to perform writebacks meant the process was taking much longer than necessary.

As an example, one of the workbooks in use contained nearly 7,000 XL3LookupRW formulae, and another contained over 1,000. Many of these lookups could have been replaced by a simple Excel formula, such as a sum or a product of other values, but as the workbook was built, the customer had to type these values into the cells individually: a tedious, time-consuming and error-prone task.

The process before XL3DoWriteback

In the screenshot above, the price, percentage and production figures would be typed in, and a formula would then calculate their product (in the white cells). This result would then be copied and pasted individually into the corresponding cell in the revenue row.

What the customer wanted was a couple of changes to streamline the process:
* the ability to use Excel formulae in the workbook to obtain the final values, without the subsequent copying of values
* the ability to line all the calculations up and then submit them at once – this would make the poor server performance much less of an issue, since instead of waiting before entering the next value, that time could usefully be spent on other tasks.

What we offered was a different writeback method, which has been available in its current form since XLCubed v6.5: the XL3DoWriteback formula.

Unlike XL3LookupRW, XL3DoWriteback is geared towards the kind of batch writeback approach that the customer had envisioned. Once set up, Excel formulae can be used to do the actual work of calculating the numbers, and the XL3DoWriteback formulae remain dormant until all the values are ready, then are activated in one transaction.

If this sounds useful to you, here’s how to set it up.

The XL3DoWriteback Formula

In addition to the member list required by the XL3LookupRW formula, the XL3DoWriteback formula requires two extra parameters:

  • PerformWriteback: this parameter tells the formula whether it should be in active writeback mode, or should remain dormant
  • Value: this parameter gives the new value that should be written back to the tuple

Following these two parameters are the connection number and the hierarchy-member pairs that will be familiar from the XL3Lookup and XL3LookupRW formulae.
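Putting these together, the general shape of the formula is as follows (the parameter names here are descriptive placeholders rather than official argument names):

=XL3DoWriteback(PerformWriteback, Value, Connection, "Hierarchy1", "Member1", "Hierarchy2", "Member2", ...)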

The PerformWriteback parameter is a bit special. If it refers to a cell containing the boolean value TRUE, then once the formula has finished sending its value, it will set that cell back to FALSE. This makes periods of writing and non-writing very easy to define. To make the most of this, we usually point all the XL3DoWriteback formulae at a single PerformWriteback cell, which we can switch using an XL3Link formula. For example:

A1: =XL3Link(XL3Address($A$1),"Write changes",,XL3Address($B$1),TRUE)
B1: FALSE
C3: 1,000
C4: 0.85
C5: 20,132
C6: =C3*C4*C5
D3: =XL3DoWriteback($B$1,C3,1,"[Account]","[Account].[Production]",
     "[Date]","[Date].[Calendar].[January 2011]")
D4: =XL3DoWriteback($B$1,C4,1,"[Account]","[Account].[Our %age]",
     "[Date]","[Date].[Calendar].[January 2011]")
D5: =XL3DoWriteback($B$1,C5,1,"[Account]","[Account].[Price]",
     "[Date]","[Date].[Calendar].[January 2011]")
D6: =XL3DoWriteback($B$1,C6,1,"[Account]","[Account].[Forecast Revenue]",
     "[Date]","[Date].[Calendar].[January 2011]")

In this example, C3, C4 and C5 are cells containing the raw values. Since the forecast revenue is the product of the production, the percentage and the price per unit, C6 is simply the product of those three cells. The four XL3DoWriteback formulae in column D refer to these value cells, but because the value in cell B1 is FALSE, nothing is written back yet.

In cell A1 is an XL3Link formula that, when clicked, changes B1 to TRUE. This immediately signals to the XL3DoWriteback formulae that they should gather and write back their values. Once that transaction has been sent to the cube, the XL3DoWriteback formulae set cell B1 back to FALSE, and the workbook returns to the ready state.

The Setup

To make it as easy and efficient as possible, we used:

  • one section for values. These were a mix of XL3Lookup formulae, typed-in values and standard Excel formulae
  • one section for XL3DoWriteback formulae. We pared away any excess XL3DoWriteback formulae, leaving only those cells that we were sure we wanted to be writeable
  • a single cell with the boolean value, set to FALSE
  • an XL3Link in a highly visible place, to switch the boolean cell. In this case, the cell containing the boolean value was B1:
=XL3Link(XL3Address($A$1),"Write changes",,XL3Address($B$1),TRUE)

The final workbook looked a little like this (except, of course, much larger!):

A section from the finished workbook

The customer could then enter all the necessary values in the left-hand section, using whatever combination of Excel formulae, cube lookups and typed-in values they needed, without any wait between entries. A single click of the XL3Link then wrote the values back in a single batch, leaving the customer free to get on with other jobs.

The revised model allows the user to update entries quickly and efficiently, without any ‘write’ delay. The numbers to be written back can be calculated using Excel formulae as needed based on the raw input numbers. When the input process is done and checked in Excel, everything can be committed to the cube with one button press. The end result – a happy customer, with more time to plan and analyse the budget, rather than just input it.

Further reading

XL3DoWriteback formula reference

Solve order shenanigans

Today I’m going to blog about a problem we recently solved in a client’s cube: an error in the MDX script that’s very easy to make if you aren’t careful.

We’ll run a simple example in AdventureWorks (what else?) to demonstrate the issue.

The client had already added a calculation to their cube to show year-on-year growth. The formula is:

Create Member CurrentCube.[Measures].[Delta to PrevYear] as
(
    ([Measures].[Internet Sales Amount])
    -
    ([Measures].[Internet Sales Amount],
        ParallelPeriod(
            [Date].[Calendar].[Calendar Year],
            1,
            [Date].[Calendar].CurrentMember
        )
    )
)
/
    ([Measures].[Internet Sales Amount],
        ParallelPeriod(
            [Date].[Calendar].[Calendar Year],
            1,
            [Date].[Calendar].CurrentMember
        )
    )
, Format_String = "0.00%";

(some error checking removed for clarity)

This screenshot shows a couple of simple XLCubed Grids: the real values, with the percentage change below. I have added an Excel calculation to show that the results are as expected.
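For reference, the Excel check is simply the usual year-on-year growth calculation. Assuming last year’s sales are in B2 and this year’s in B3 (the cell references are illustrative), it is just:

=(B3-B2)/B2

formatted as a percentage, which should agree with [Delta to PrevYear].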

Later during the cube development, the client added a calculated member in their Product dimension, one that gives a total excluding one of the product categories.

To replicate this I’ll add a calculation for “All Ex Bikes”:

Create Member 
CurrentCube.[Product].[Product Model Categories].[All Products].[All Ex Bikes]
as
(
    ([Product].[Product Model Categories].[All Products])
    -
    ([Product].[Product Model Categories].[Category].&[1])
);

And if we run the report again we get the following.

Notice the cell I’ve highlighted. The “All Ex Bikes” calculation works fine on the normal measure, but it gives totally the wrong number for the percentage calculation. What’s going on?

The problem is that, in the highlighted cell, Analysis Services has two calculations to consider when working out the result:

  • Compare this year to last year
  • Get the “Grand Total”, and subtract “Bikes”

As the number returned is 1.85%, we can see that Analysis Services has chosen the second option: it has subtracted the Bikes growth percentage from the Grand Total growth percentage, which is not a meaningful growth figure for the subtotal.

What we really want is for the calculation to get the “All Ex Bikes” subtotal first, and then work out the percentage change based on that subtotal.

Fortunately the fix was a simple one. Analysis Services runs calculations in the order they appear in the MDX script, so to fix the issue we simply moved the new “All Ex Bikes” definition above the percentage calculation.
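After the change, the script reads in outline like this (the calculation bodies are abbreviated):

// The dimension calculation now comes first in the script...
Create Member CurrentCube.[Product].[Product Model Categories].[All Products].[All Ex Bikes]
as ... ;

// ...and the measure follows, so it is applied at the intersection of the two
Create Member CurrentCube.[Measures].[Delta to PrevYear]
as ... , Format_String = "0.00%";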

Now the number returned matches our expectations.

Pass and solve order can be a complex topic, so take care.
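If you would rather not rely on the physical ordering of the script, you can also state the precedence explicitly with the SOLVE_ORDER property on each calculated member; the member with the higher value is evaluated last, and so “wins” at an intersection. A minimal sketch, with the calculation bodies abbreviated as above:

Create Member CurrentCube.[Product].[Product Model Categories].[All Products].[All Ex Bikes]
as ... , SOLVE_ORDER = 1;

Create Member CurrentCube.[Measures].[Delta to PrevYear]
as ... , SOLVE_ORDER = 2, Format_String = "0.00%";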

In this case the number is totally wrong, so it was easy to spot, but some bugs will be much more subtle, so watch out!

Something on the Horizon

We had an interesting scenario while helping a customer extend an existing Excel dashboard.

We had recently done some work to solve performance and design issues with their existing Analysis Services cubes. They now had more of their underlying data available, and the ability to query longer periods without the performance hit (a year’s worth of data versus 28 days).

They wanted to make the most of this by charting changes in daily sales data over the previous 12 months, broken down by their four main business groups. Ideally the chart would become part of the existing Management Report; the difficulty was the lack of report real estate for the extra information. This is something we have all come across before, and it is typically solved by using in-cell charts.

Plotting the data on an Excel chart in the space available would give us this:

Converting to sparklines gave us a slightly better view but, given the number of data items being plotted, it was still not ideal.

Luckily our customer had recently upgraded to v6.1 of XLCubed, so we were able to use one of our newest in-cell chart types: SparkHorizons. There is a good explanation of horizon charts in the research paper Sizing the Horizon: The Effects of Chart Size and Layering on the Graphical Perception of Time Series Visualizations, and Stephen Few has also covered them previously.

Essentially a line chart is split into coloured bands – degrees of blue for positive numbers and degrees of red for negative numbers; in XLCubed there are three bands of each colour. This separation of the vertical scale means that horizon charts can be much more effective than standard sparklines where the scale of the numbers varies significantly but you still want to retain a common scale view.

In this case plotting the same data as horizon charts makes things a lot clearer:

It now becomes quite clear when sales are trending up versus down. It’s also possible to flip the negative values so that they appear in the same direction as the positive values:

We are always looking at ways of developing and extending XLCubed. SparkHorizons were added because they had the potential to be useful where the data suited them, so it was pleasing to be able to use them in a real-world situation.

It’s also worth mentioning that although in this case the data came from Analysis Services cubes, because SparkHorizons are available as an Excel formula they can be used to plot any Excel data. Here’s an example of the formula:

=XL3SparkHorizon(Sheet1!$V$2:$V$262,Sheet1!E10)

This will plot the data from Sheet1!$V$2:$V$262 as a SparkHorizon graph in Sheet1!E10.

Data Visualization – a real world example

In the following we work through a real-world example of data visualization. We’ve chosen an example that involves Operations data – this is fairly non-domain-specific, so hopefully it can demonstrate some important points. The first, and most important, point is that you have to define your audience.

We receive many questions about “what is the best chart for this situation” or “what colour should I use for emphasis”. These questions usually attack the problem from the wrong angle. The one question you need to ask before anything else is “who is going to see this visualization, and how?” Is it in a boardroom on a printed sheet, or across a trading floor on a plasma screen? Are the consumers domain experts?

This example features data about an investment bank’s operations processing, the audience being the clients of the Operations department.

Starting Point

The project started out as a simple attempt to record the operational problems encountered on a daily basis across different product lines. A reporting system was built and various generic reports were produced:

[Screenshot: one of the original generic reports]

Unfortunately the reports either didn’t contain data at a granular enough level, or made it difficult for the product managers to see where the issues were occurring and what the trends were. In reality the report showed what the major problems had been – unfortunately this was already known, because when something major goes wrong, you remember getting shouted at!

What was requested

The client wanted a report that showed where the problems were occurring across business lines (rather than operational units), and how they were doing historically, on a single page that could be included in a weekly MIS pack (they currently had four pages for each of eight product lines, 32 pages in total). As a first pass they simply wanted an Excel worksheet they could update manually:

[Screenshot: the manually updated Excel worksheet]

We felt this solution lacked clarity and it was very difficult to spot trends across products.

What we proposed

We designed a solution using MicroCharts to allow small multiples of charts to show a variety of views:

[Screenshot: the proposed MicroCharts small-multiples view]

This solution allowed the user to view the data simply as a cumulative set by Product (top line) or by Root Cause (vertically), and then look deeper into the historical trends in the centre of the chart. For example, it’s fairly easy to see spikes in the Root Cause data historically, and to see that the overall trend has improved over time. By ranking the Products and Root Causes you immediately give some sense of scale to the data: you can see that there are many more Application failures than any other type of problem, but that the remaining root causes are fairly evenly distributed.

One other point worth noting was that the original colour scheme was much more muted, but the client got very upset that it looked like a competitor’s corporate colour and wanted it to be “louder”.

What was the user reaction…

Ecstatic. One page replaced 32, and they could see at a glance how the entire (large) organisation was working, while still being able to quickly find the detail for a particular area and identify trends.

Cube Design – meeting the business needs

Following on from our previous blog post on a couple of the common cube performance issues we’ve seen this last month, I thought I’d mention some of the non-technical issues we see quite often. In one case, once we’d made a few tweaks and sorted out the cube performance issues, we had to ask: is the cube doing what it needs to do? (Of course we asked this first, but the priority was sorting out the current cube performance!) Does it meet the business requirement? There’s no point in having the most complex cube that uses all the greatest features if it can’t answer the users’ queries.

In reports, we’ve seen examples where clients have nested four or five attributes to build up the effect of a hierarchy, or run huge queries and then VLOOKUPs on them to get the data they need, or brought back 12 columns of data and manually worked out year to date, or had no hierarchies reflecting commonly used groupings of members, or had member names that weren’t formatted in the way the business needs. To us this just isn’t right.

The users might not seem to care too much if they don’t know how the cube could work, or if it runs fast enough to bring back the huge result sets they manipulate themselves – but doesn’t that negate the point of having a cube, and your investment in it? Consumers of the cube should have fast, timely, accurate and, importantly, appropriate data made available to them in a manner that makes sense.

Cube design and build is about understanding the business and its users’ needs, and then building the cube and associated processes – and that’s before even starting to build the reports and convey the information using good data visualisation practices.

All too often we’re seeing a drive to use the latest tech, the flashiest widgets, and cool-looking 3D and shading effects on reports, through to cubes and databases with every conceivable hierarchy and type of measure, but without much resemblance to what the users actually need to see.

I won’t hide the fact that we’re very proud of our skills and experience in ensuring our clients get not just a technically excellent system but also one that fits their needs. If you want to talk to one of the team about how they can help, you can find our contact details here.

Common Analysis Services Performance Issues

A quick blog post from the Services team here at XLCubed on some performance problems with SSAS that we’ve seen again recently. With the processing power and memory available, it’s pretty easy to build a fast cube, both for query performance and processing time. It is also easy to be lax in cube design, ignore the warnings and best-practice guidelines, and end up with a cube that looks concise, neat and clever but performs terribly for end users.

We’ve come across a couple of examples of this at client sites in the last month, and there are some common issues that always seem to jump out – rectifying these normally has a very positive impact. The three most common culprits we see are:

Parent-child dimensions – these are nice and easy to build and use. However, because you can’t build aggregations that include a parent-child dimension, they can make for a badly performing cube! Try to flatten dimensions out, and evaluate exactly why a parent-child dimension is required and being used. They are not the only option.

Unary operators and custom rollups – we’ve seen cases where these have been included in every dimension in a cube by default. If there isn’t a need for them, leave them out! If you can avoid a custom rollup or unary operator with some simple work in the ETL process, it may be better to do that first.

If your query performance is bad, try removing all the unary operators and custom rollups, then re-test the cube. How’s the performance now? It should be significantly faster – evaluate and review the need for the unary operators and custom rollups, and see if the same effect can be achieved differently (e.g. in the ETL layer).

Cache vs. non-cache data – basically, is the cube recalculating and re-querying numbers over and over again, or can it reuse results? Use Profiler to check for cache or non-cache data events while your queries are running. Many times we’ve seen queries bypassing the cache entirely, either because Analysis Services hasn’t been given enough memory or because volatile functions such as Now() have been used in MDX calculations.
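As an illustration of the Now() problem, a calculation like the first one below can never be answered from the calculation cache, because Now() changes on every evaluation; the member names here are hypothetical:

// Volatile: Now() is re-evaluated constantly, so results cannot be cached
Create Member CurrentCube.[Measures].[Days Since Load] as
    DateDiff("d", [Measures].[Last Load Date], Now());

// Cache-friendly: load a static "as of" date into the cube during processing
// (e.g. in the ETL) and reference that instead
Create Member CurrentCube.[Measures].[Days Since Load (Cached)] as
    DateDiff("d", [Measures].[Last Load Date], [Measures].[As Of Date]);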

Resolving the issues above had a massive impact: reports that had taken up to three minutes to run were down to a few seconds, and users could begin to use the application properly for the first time. However, fixing the performance may be only part of the task. The cube of course needs to have been designed to meet the business requirements – but that’s another blog.