Common Analysis Services Performance Issues

A quick blog post from the Services team here at XLCubed on some performance problems with SSAS that we’ve seen again recently. With the processing power and memory available it’s pretty easy to build a fast cube – both for query performance and processing time. It is also easy to be lax in cube design, ignore the warnings and best practice guidelines, and end up with a cube that’s looks concise, is neat and clever but performs terribly for end users.

We’ve come across a couple of examples of this at client sites in the last month, and there are some common issues that always seem to jump out – rectifying these normally has a very positive impact. The three most common culprits we see are:

Parent-Child dimensions – Parent-Child dimensions are nice and easy to build and use. However, as you can’t build aggregations that include a parent-child dimension it can make for a badly performing cube! Try to flatten dimensions out and evaluate exactly why a parent-child dimension is required and being used. They are not the only option..

Unary operators, Custom-roll ups – we’ve seen cases where these have been included in every dimension in a cube by default. If there isn’t a need for them – leave them out! If you can get around using a custom rollup or unary operator by some simple work in the ETL process it may be better to do that first.

If your query performance is bad – try removing all unary operators and custom rollups then re-test the cube. How’s the performance now? It should be significantly faster – evaluate and review the need for the unary operators and custom rollups and see if the same effect can be achieved differently (e.g. in the ETL layer)

Cache vs. Non-Cache Data – Basically is the cube recalculating and re-querying numbers over and over again or can it re-use results? Use profiler to check for cache or non-cache data when your queries are running. So many times we’ve seen all queries not using the cache because AS hasn’t been given enough available memory or volatile operators such as now() have been used in mdx calcs.

Resolving the issues above had a massive impact – reports taking up to 3 minutes to run were down to a few seconds, users could begin to use the application properly for the first time, however fixing the performance may be only part of the task. The cube of course needs to have been designed to meet the business requirements, but that’s another blog..

So, what’s an OLAP Cube, anyway?

As "An Excel User in a Cubed Kingdom" I’m starting my exploration of this new found land with a simple question: what is an OLAP cube? In plain English, please…

I like this simple, non-technical definition: a cube is a set of predefined answers. It’s up to you to select the right questions.

OK, let’s detail this.

Imagine that you have a very large database with the usual business data: orders, customers, sales representatives… Now you want to know how much a customer category ordered over the last year. You query the database and you get the answer. Then you want monthly sales. Query it again. Dig a little deeper to see what products that category ordered. Query it once more.

What is happening behind the curtain? Each time you enter a new query the system looks at each transaction (or a subset) and performs the necessary calculations to answer your query. You’ll get your answers, but it will be painfully slow: depending on your query and the database size, it may take hours. It is not an option.

But you don’t really need to see each individual order, do you? If you only need to know monthly sales, why should your system go through each transaction? If you pre-aggregate that data, you’ll get your answers much, much, faster, because there aren’t five million records, just 100.000. You’ll be able to actually work, instead of staring at your monitor, waiting for an answer.

This is what a cube does. It provides faster answers by eliminating the unnecessary detail for the task at hand. You shouldn’t look at a cube as an unique, condensed version of the database. While you have a virtually infinite number of questions that the database can answer, a cube focus on providing answers to a small set of questions. That’s why you can have different cubes (marketing, fin, sales), all of them getting data from the same source. They all answer different sets of questions.

When designing a cube (it’s your job, not IT’s), resist the temptation of a one-size-fits-all cube. Clearly define a coherent set of questions for your fundamental business needs and make sure they are answered once the cube becomes available.

If you are exploring your data you will not want to wait one hour each time you make a change. On the other end, a fast cube with no data to explore is useless. There is  a fine balance between maximum flexibility and maximum performance.

An OLAP cube not only ensures that you retrieve the right data from the database but also allows you to explore it efficiently. Two good reasons to add OLAP cubes to your toolbox.

Next time we’ll see how plain English can describe the structure of a basic OLAP cube.

An Excel User in a Cubed Kingdom

I’ve been using Excel for my entire professional career, most of the time in large corporations where adding a piece of software to the standard IT structure would be some kind of heresy. When I have a business need that can be solve by an out-of-the-box Excel installation that’s the path I follow (I’m also a power user of the company’s formal BI tool, so I know where to draw the line).

Over time, I’ve developed a framework that helps me to solve problems from a very specific point of view: how to minimize file size, how to minimize calculation time, how to deploy, how to update, and so on. This is the logic that you often must follow, and you tend to believe it is the best one. At that point you must start a conversation (a very fashionable word nowadays..) with someone that doesn’t share that logic.

Take OLAP cubes, for example. I never use them. I use the corporate BI tool or I create a 100% pure Excel application. But then Andreas told me about XLCubed and how you could deploy online your always-updated-file. That was a turning point because for me those are two killer features that I’ve been yearning for a long time.

I started to play with the tool, but still using the same logic. And was plain wrong. Sure I could use all my Excel background, but I needed to adjust it to a different logic. Using an OLAP cube you get a new set of functions that simplifies much of your work and you need to reevaluate some Excel functions because some of them will perform better under this new environment. You’ll leverage your Excel background to create a a new logic at a higher level.

I’m an experienced Excel user, probably like you, and this is just Excel on steroids. I’m leaving my comfort zone, one foot at a time, and I’ll document and share with you my learning curve. So come with me to the Cubed Kingdom and we’ll walk through this together.

First stop, next post: what is a cube, anyway?

Gemini – Smarter Excel Dashboards with End-User BI

I attended beginning this week the Microsoft BI Conference in Seattle where Microsoft presented an interesting new product that might change the way you build your Excel dashboards in the future: “Gemini”, an End-User Business Intelligence plug in for Excel.

End-User Business Intelligence has been around for quite a while with some of the most prominent Applix TM/1 , PALO, PowerOLAP and MIS Alea. These tools where tailored for non-IT proficient Excel users to allow them to build real Business Intelligence solutions with Excel, without leaving Excel or having to learn complicated BI or Data Warehousing techniques. All you have to learn is that a cube is like a very powerful multidimensional spreadsheet that can easily aggregate and hold large amounts of data. To pull out data from the cube you use a formula like this…

=LookupCube(“Sales”, “Units”, “Hats”, “Store 19”, “Oct-2007”)

…returns unit sales information consolidated by product line and region for the month shown. And this formula…

= LookupCube(“Sales”, “Sales USD”, “Hats”, “Store 19”, “Oct-2007 YTD”)

…returns year-to-date sales in US dollars for the same product line and region

Continue reading “Gemini – Smarter Excel Dashboards with End-User BI”