Parent-Child Dimensions in Analysis Services – Performance Walkthrough

Parent-child hierarchies are a good fit for many data structures such as accounts or employees, and while they can speed development in some cases, they can also cause performance problems in large cubes.

We often see customers with these types of performance issues, and thought it worth sharing a simple technique for altering the dimension structure to improve query speed.

The problem

Parent-child hierarchies are often created because that is the structure used in the relational source, so they seem a natural way to model the members. In many cases, though, data exists only at the leaf level of the hierarchy, meaning a parent-child structure isn’t really needed.

Performance problems occur because aggregations are created only for the key attribute and the top (All) attribute of a parent-child hierarchy, never for the intermediate levels, as detailed in the Analysis Services performance guide:

Parent-child hierarchies

Parent-child hierarchies are hierarchies with a variable number of levels, as determined by a recursive relationship between a child attribute and a parent attribute. Parent-child hierarchies are typically used to represent a financial chart of accounts or an organizational chart. In parent-child hierarchies, aggregations are created only for the key attribute and the top attribute, i.e., the All attribute unless it is disabled. As such, refrain from using parent-child hierarchies that contain large numbers of members at intermediate levels of the hierarchy. Additionally, you should limit the number of parent-child hierarchies in your cube.

If you are in a design scenario with a large parent-child hierarchy (greater than 250,000 members), you may want to consider altering the source schema to re-organize part or all of the hierarchy into a user hierarchy with a fixed number of levels. Once the data has been reorganized into the user hierarchy, you can use the Hide Member If property of each level to hide the redundant or missing members.

 

The performance guide hints at re-organizing the hierarchy to improve performance, but doesn’t say how.

The solution

This article will walk through the steps needed to change your parent-child hierarchy structure to have real levels, so that aggregations work and your performance is as good as you would expect with normal hierarchies.

This process is known as flattening or naturalizing the parent-child hierarchy.

Firstly, let’s look at the data in our relational source.

[spoiler intro=”Code” title=”Sql Create Script”]

CREATE TABLE [dbo].[Products](
 [ProductID] [int] NOT NULL,
 [ParentID] [int] NULL,
 [Name] [varchar](50) NOT NULL,
 CONSTRAINT [PK_Products] PRIMARY KEY CLUSTERED ([ProductID] ASC)
)

GO

insert into Products(ProductID, ParentID, Name) values(1, NULL, 'All')
insert into Products(ProductID, ParentID, Name) values(2, 1, 'Fruit')
insert into Products(ProductID, ParentID, Name) values(3, 2, 'Red')
insert into Products(ProductID, ParentID, Name) values(4, 3, 'Cherry')
insert into Products(ProductID, ParentID, Name) values(5, 3, 'Strawberry')
insert into Products(ProductID, ParentID, Name) values(6, 2, 'Yellow')
insert into Products(ProductID, ParentID, Name) values(7, 6, 'Banana')
insert into Products(ProductID, ParentID, Name) values(8, 6, 'Lemon')
insert into Products(ProductID, ParentID, Name) values(9, 1, 'Meat')
insert into Products(ProductID, ParentID, Name) values(10, 9, 'Beef')
insert into Products(ProductID, ParentID, Name) values(11, 9, 'Pork')

[/spoiler]

Not a large dimension, but enough to demonstrate the technique. As you can see my real products are all at the leaf level.
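A quick way to check this against your own source is to list the members that have no children. This is only a sketch against the sample Products table above; the second query assumes a hypothetical fact table, dbo.FactSales with a ProductID column, which is not part of the original example.

```sql
-- Leaf members: products that never appear as another row's ParentID.
select P.ProductID, P.Name
from dbo.Products P
where not exists (select 1 from dbo.Products C where C.ParentID = P.ProductID)
order by P.ProductID
-- For the sample data this returns Cherry, Strawberry, Banana, Lemon, Beef and Pork.

-- Assumption: dbo.FactSales(ProductID, ...) is a hypothetical fact table.
-- Any rows returned here hold data against a non-leaf member.
select distinct F.ProductID
from dbo.FactSales F
where exists (select 1 from dbo.Products C where C.ParentID = F.ProductID)
```

If the second query returns rows, some facts sit on parent members, and the technique in this article would need extra handling for them.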

The strategy is quite simple:

  • Create a view to separate the members into different levels.
  • Create a new dimension using these real levels.
  • Configure the dimension to appear like the original parent-child dimension, but with the performance of a normal dimension.

Create the view

We want to create a denormalised view of the data. To do this we join the Products table to itself once for each level. This does mean we need to know the maximum depth of the hierarchy, but often this is fixed, and we’ll build in some extra levels for safety.
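If you are unsure how deep the hierarchy currently goes, a recursive common table expression can measure it. This is a minimal sketch against the sample Products table, not part of the original technique; the names Depths, Depth and MaxDepth are made up for illustration.

```sql
-- Walk the hierarchy from the root (ParentID is null) downwards,
-- counting one extra level for each parent-to-child step.
with Depths as (
    select ProductID, 0 as Depth
    from dbo.Products
    where ParentID is null

    union all

    select C.ProductID, D.Depth + 1
    from dbo.Products C
    join Depths D
        on C.ParentID = D.ProductID
)
select max(Depth) as MaxDepth
from Depths
-- For the sample data MaxDepth is 3: All (0) > Fruit (1) > Red (2) > Cherry (3).
```

Because the root 'All' row sits at depth 0 and is not given a level of its own in the view, MaxDepth is the minimum number of LevelN columns needed; anything beyond that is the safety margin.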

The tricks here are:

  • Use coalesce() so that every level ID below a leaf repeats the leaf’s own ID and is never NULL. This allows us to join to the fact table at the bottom level of our hierarchy.
  • Leave the Name columns NULL below the leaves. This lets us stop the hierarchy at the correct leaf level in each branch.

[spoiler intro=”Code” title=”Sql View Script”]

create view dbo.ProductsFlattened
as
-- One row per path from the root down to a leaf, spread across fixed level columns.
select
    P1.ProductID as Level1ID,
    P1.Name as Level1Name,
    -- Below a leaf, coalesce() repeats the leaf's ID so the LevelNID is never NULL;
    -- the matching LevelNName stays NULL so the level can be hidden.
    coalesce(P2.ProductID, P1.ProductID) as Level2ID,
    P2.Name as Level2Name,
    coalesce(P3.ProductID, P2.ProductID, P1.ProductID) as Level3ID,
    P3.Name as Level3Name,
    coalesce(P4.ProductID, P3.ProductID, P2.ProductID, P1.ProductID) as Level4ID,
    P4.Name as Level4Name,
    coalesce(P5.ProductID, P4.ProductID, P3.ProductID, P2.ProductID, P1.ProductID) as Level5ID,
    P5.Name as Level5Name
from dbo.Products P0            -- P0 is the root ('All') row
left join dbo.Products P1
    on P0.ProductID = P1.ParentID
left join dbo.Products P2
    on P1.ProductID = P2.ParentID
left join dbo.Products P3
    on P2.ProductID = P3.ParentID
left join dbo.Products P4
    on P3.ProductID = P4.ParentID
left join dbo.Products P5
    on P4.ProductID = P5.ParentID
where P0.ParentID is null       -- start from the root only

[/spoiler]

Running this we get:
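With the sample rows inserted earlier, the view should return one row per leaf product. The query below is a sketch of that check; the order by is only there to make the output stable, and the expected rows are worked out by hand from the sample data.

```sql
select Level1ID, Level1Name, Level2ID, Level2Name, Level3ID, Level3Name,
       Level4ID, Level4Name, Level5ID, Level5Name
from dbo.ProductsFlattened
order by Level1ID, Level2ID, Level3ID

-- Expected result for the sample data:
-- Level1ID Level1Name Level2ID Level2Name Level3ID Level3Name Level4ID Level4Name Level5ID Level5Name
-- 2        Fruit      3        Red        4        Cherry     4        NULL       4        NULL
-- 2        Fruit      3        Red        5        Strawberry 5        NULL       5        NULL
-- 2        Fruit      6        Yellow     7        Banana     7        NULL       7        NULL
-- 2        Fruit      6        Yellow     8        Lemon      8        NULL       8        NULL
-- 9        Meat       10       Beef       10       NULL       10       NULL       10       NULL
-- 9        Meat       11       Pork       11       NULL       11       NULL       11       NULL
```

Note how the leaf ID is repeated down to Level5ID while the names below each leaf are NULL.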

Obviously we can update this view to create more levels as required, but 5 are enough for now.

The Dimension

Next we go to BIDS, and add the view to our Data Source View, and then add a new Dimension based on the view.

The key steps to creating the dimension correctly are:

  • Set the key attribute to Level5ID, and the name to Level5Name.
  • Create an attribute for each Level ID, and on each set the Name Column appropriately.
  • Create a hierarchy using these attributes in order.
  • On each attribute set AttributeHierarchyVisible to False.
  • On each level of the hierarchy set HideMemberIf to NoName.
  • Set up the Attribute Relationships between the levels.

You should end up with the following:

Dimension Structure

 

Attribute Relationships

 

If you browse the dimension you’ll see that it never goes as far as level 5, even though it exists. This is because we set up the member hiding option, and returned NULLs in our view.

Conclusion

And that’s it. You can now join to your fact tables at the lowest level, build your cube as normal and get the performance benefits of aggregations!
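As a rough relational illustration of that join, the sketch below assumes a hypothetical fact table, dbo.FactSales, with ProductID and Amount columns. Neither the table nor its columns appear in the original example. Because Level5ID always carries the leaf product’s ID, the fact rows join to it directly.

```sql
-- Assumption: dbo.FactSales is a hypothetical fact table keyed on the leaf ProductID.
create table dbo.FactSales (
    ProductID int not null,
    Amount money not null
)

GO

-- Roll the facts up through the flattened levels.
select F.Level1Name, F.Level2Name, F.Level3Name, sum(S.Amount) as Amount
from dbo.FactSales S
join dbo.ProductsFlattened F
    on S.ProductID = F.Level5ID
group by F.Level1Name, F.Level2Name, F.Level3Name
```

In the cube itself the equivalent step is relating the measure group to the dimension on its key attribute, Level5ID.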

See also

A tool to achieve the same result is available from Codeplex. We’ve not personally tried it, but it may well be a timesaver. It works in a similar way to the example above, but it’s often useful to understand how something works, even if you choose to automate it.

Microsoft Gold Partner Renewal, & resulting questions answered

We’ve just completed the process of renewing our Gold partner status with Microsoft. Among other things, this ensures we have early access to upcoming software through the CTP and beta programs, as has been the case with Office 2010 over the last year.

The Gold level now has a requirement for completion of an independently run Customer Satisfaction survey, and thanks very much to everyone who completed this. The scoring and comments are much appreciated, and it was good to see the consistently high scores. We’re of course reviewing the areas where we didn’t score perfectly as we strive to further improve the service we offer.

The survey also highlighted a number of questions which appeared a few times:

1) Relational Database support

Our prime focus is on cube based reporting and analytics, but we do also support querying of relational databases in both Excel and on the Web, and have a number of customers using the product purely for relational reporting.

We extend the native Microsoft functionality in Excel, and add support for this in the Web product. The connection string and query string can both be formula driven, so you can construct parameter driven reports with ease.

– Search for ‘relational database access’ in the help file for an overview.

2) Writeback & Planning applications

XLCubed supports both grid and formula based writeback against Analysis Services, in Excel and on the Web. As such it lends itself well to planning and budgeting applications, and it’s an arena in which we have a lot of experience, from the straightforward through to the highly complex.

3) Documentation and User Guides

Documentation is now online and regularly updated. You can access it here: http://www.xlcubed.com/help

4) YouTube videos

We had two streams of comment here, broadly summarised as:

a) They’re really useful – thanks!

b) YouTube access is blocked by our corporate internet policy!

If your access is blocked, you can download the videos as mp4 from

http://www.xlcubed.com/downloads/xlcubedv6_youtube.zip

PowerPivot, SQL Server 2008 R2, SharePoint 2010, Office 2010

So we’ve been using PowerPivot for a while now, and Office 2010 has been part of our lives for some time. I’ll use this blog to answer some of the questions that keep cropping up in conversation with our customers:

1. Does XLCubed work with Excel (Office 14) 2010?

a. Yes, we’ve been using it since the first CTP release and each release since then.

2. Can I use XLCubed Web with SharePoint 2010?

a. Yes, publishing to the web and embedding the reports within your SharePoint site works in exactly the same way as with previous versions.

3. Does XLCubed connect to PowerPivot?

a. Yes, XLCubed connects to the PowerPivot published cubes, and our client tools can be used to build reports and dashboards from them.

4. Can I build reports from SQL Server 2008 R2 using XLCubed?

a. Yes, this will work just fine, just as you can build reports from previous versions of SQL Server or other relational sources. (here is an example)

PowerPivot in the real world

The services team have been working on migrating some of our internal models and sample databases across to a PowerPivot environment – looking at the pros and cons, and using DAX rather than MDX to perform some calculations. Results have been varied, and it’s been interesting to see some features that we’ve had for a while (like cube formulas, slicers and web parameters) appear in a similar way in PowerPivot.

Quite clearly PowerPivot isn’t the be all and end all or anything like a replacement for Analysis Services, but it certainly has a role for tactical solutions, some power user analysis, and we think likely also for RAD prototypes of larger scale AS implementations. It doesn’t venture into the gap left by PerformancePoint Planning (as many thought it would in early 2009) – we’ve moved to address this area with the XLCubed PM suite that uses in memory OLAP cubes and/or Analysis Services.

Trying out some of the tools

Here are a few download sets for you to try. Take careful note of the hardware spec and requirements for the MS ones though:

The 2010 Information Worker Virtual machine

Register and Download Office 2010

PowerPivot 32Bit, 64Bit

XLCubed Evaluation

If you would like to evaluate against your own data, contact the XLCubed Product team for evaluation editions. If you want to try a no-risk proof of concept or prototype, contact the XLCubed consulting team.

Data Visualization – a real world example

Here we work through a real world example of a data visualization. We’ve chosen an example that involves Operations data – this is fairly non-domain specific, so hopefully it can demonstrate some important points. The first, and most important, point is that you have to define your audience.

We receive many questions about “what is the best chart for this situation” or “what colour should I use for emphasis”. These questions are usually attacking the problem from the wrong angle. The one question you need to ask before anything else is “who is this visualization going to be seen by and how?” Is it in a boardroom on a printed sheet or across a trading floor on a plasma screen? Are the consumers domain experts?

This example features data about an investment bank’s operations processing, the audience being the clients of the Operations department.

Starting Point

Initially the project started out as simply trying to record what operational problems were encountered on a daily basis across different product lines. A reporting system was built and various generic reports produced:

DVBlog1

Unfortunately the reports either didn’t contain data at a granular enough level or it was difficult for the product managers to see where the issues were occurring and what the trends were. In reality the report showed what the major problems had been – unfortunately this was already known, as when something major goes wrong you remember getting shouted at!

What was requested

The client wanted a report that showed where the problems were occurring across business lines (rather than operational units) and how they were doing historically, in a single page that could be included in a weekly MIS pack (they currently had four pages for each of the 8 product lines, a total of 32 pages). As a first pass they simply wanted an Excel worksheet they could update manually:

DVBlog2

We felt this solution lacked clarity and it was very difficult to spot trends across products.

What we proposed

We designed a solution using MicroCharts to allow small multiples of charts to show a variety of views:

DVBlog3

This solution allowed the user to view the data simply as a cumulative set of data by Product (top line) or by Root Cause (vertically) and then look deeper into historical trends in the centre of the chart. For example, it’s fairly easy to see spikes in the Root Cause data historically and see that the overall trend has improved over time. By ranking the Products and Root Causes you immediately give some sense of scale to the data. For example, you can see that there are many more Application failures than any other type of problem, but the majority of root causes are otherwise fairly evenly distributed.

One other point worth noting was that the original colour scheme was much more muted, but the client got very upset that it looked like a competitor’s corporate colour and wanted it to be “louder”.

What was the user reaction…

Ecstatic: 1 page replaced 34, and they could see at a glance how the entire (large) organisation was working, while still being able to quickly find detail for a particular area and identify trends.

Cube Design – meeting the business needs

 

Following on from our previous blog post on a couple of the common cube performance issues we’ve seen this last month, I thought I’d mention some of the non-technical issues we see quite often. In one case, once we’d made a few tweaks and sorted out the cube performance issues, we had to ask – is the cube doing what it needs to? (Of course we did ask this first, but the priority was sorting out the current cube performance!) Does it meet the business requirement? There’s no point in having the most complex cube that uses all the greatest features if it can’t answer the users’ queries.

In reports, we’ve seen examples where clients have nested four or five attributes to build up the effects of a hierarchy or run huge queries then vlookups on them to get the data they need, or bring back 12 columns of data and manually work out year to date, or not have any hierarchies that reflected commonly used groupings of members, or not have member names formatted in the way the business needs. To us this just isn’t right.

The users might not seem to care too much if they don’t know how the cube could work or if it runs fast enough to bring back huge result sets they can manipulate themselves – but doesn’t that negate the point of having a cube and your investment in it? Consumers of the cube should have fast, timely, accurate and importantly appropriate data made available to them in a manner that makes sense.

Cube design and build is about understanding the business and users’ needs and then building the cube and associated processes. That’s before even starting to build the reports and conveying the information using good data visualisation practices.

All too often we’re seeing a drive to use the latest tech, the flashiest widgets, cool looking 3D and shading effects on reports through to cubes and databases with every conceivable hierarchy or type of measure thought possible but not bearing much resemblance to what the users need to see.

I won’t hide the fact that we’re very proud of our skills and experience in ensuring our clients get not just a technically excellent system but also one that fits their needs. If you want to talk to one of the team about how they can help, you can find our contact details here.

Common Analysis Services Performance Issues

A quick blog post from the Services team here at XLCubed on some performance problems with SSAS that we’ve seen again recently. With the processing power and memory available it’s pretty easy to build a fast cube – both for query performance and processing time. It is also easy to be lax in cube design, ignore the warnings and best practice guidelines, and end up with a cube that looks concise, is neat and clever, but performs terribly for end users.

We’ve come across a couple of examples of this at client sites in the last month, and there are some common issues that always seem to jump out – rectifying these normally has a very positive impact. The three most common culprits we see are:

Parent-Child dimensions – Parent-Child dimensions are nice and easy to build and use. However, as aggregations can’t be built on the intermediate levels of a parent-child dimension, they can make for a badly performing cube! Evaluate exactly why a parent-child dimension is required, and try to flatten dimensions out where you can. They are not the only option.

Unary operators, Custom-roll ups – we’ve seen cases where these have been included in every dimension in a cube by default. If there isn’t a need for them – leave them out! If you can get around using a custom rollup or unary operator by some simple work in the ETL process it may be better to do that first.

If your query performance is bad, try removing all unary operators and custom rollups, then re-test the cube. How’s the performance now? It should be significantly faster. Evaluate and review the need for the unary operators and custom rollups, and see if the same effect can be achieved differently (e.g. in the ETL layer).

Cache vs. Non-Cache Data – Basically, is the cube recalculating and re-querying numbers over and over again, or can it re-use results? Use Profiler to check for cache or non-cache data when your queries are running. So many times we’ve seen all queries not using the cache because AS hasn’t been given enough available memory, or because volatile functions such as now() have been used in MDX calcs.

Resolving the issues above had a massive impact – reports taking up to 3 minutes to run were down to a few seconds, and users could begin to use the application properly for the first time. However, fixing the performance may be only part of the task. The cube of course needs to have been designed to meet the business requirements, but that’s another blog.

2009 Excel Dashboard Competition Winners

Thanks to everyone who entered this year’s competition. Again the standard was very high, and it’s always great to see the product being used so effectively. The entries were extremely varied in both their style and subject matter, and made for a difficult decision. However, I’m pleased to be able to announce the winners:

1) Ajay V Singh – Operations Dashboard for a Debt Collections Company.

The target audience are the CXO level execs of the business, aiming to provide a view of all the nerve points of the organization in a single unified interface that is portable and yet comprehensive.

The dashboard layout is dense but uncluttered and well thought through. Colours are well balanced, and allow the reds to draw the reader’s attention as intended.

Ajay’s background summary of the dashboard, with larger screen shots, will be available on our web site in the coming week.

Collections Dashboard Screenshot

2) John Munoz – Insights into Unemployment in the United States.

Using data from the Bureau of Labor Statistics, the dashboard gives a deep glimpse into the unemployment situation in the US. A large volume of disparate and tabular information is brought together in a single concise view, which aids understanding and adds real insight. The trends and demographic splits come through very well, and make for easy comparison.

John’s background summary of the dashboard, with larger screen shots, will be available on our web site in the coming week.

unemployment_dashboard_munoz

3) Lisa Cunningham – Anti-Social Behaviour Dashboard

The dashboard is produced by the Research and Information Team at Leicestershire County Council as part of a suite of dashboards produced for the Crime and Disorder Reduction Partnerships. It is available to the public through the local web portal, which makes readability, and also the contact information provided, vital. The dashboard aims to provide an at-a-glance view of the level and trend of ASB, and does an excellent job.

Lisa’s background summary of the dashboard, with larger screen shots, will be available on our web site in the coming week.

ASBDashboard

Excel Dashboard Competition – deadline extended

We have decided to extend the entry deadline through the holiday period, to 28th August.

As a reminder, the competition is for real world solutions (no sample data set), and judging criteria include:

  • Clean and clear organization
  • Effective table and chart design
  • A single-screen display, properly designed for the web, screen or print outs

See the competition page for more detail.

Thanks to all of you who have already entered. The quality has again been good, and will doubtless lead to an interesting debate when it comes to choosing the winners. As we’ve extended the deadline, if there are any additional tweaks you’d like to incorporate you can of course send revised versions.

Microsoft Business Intelligence roundup

There have been a few key announcements in the Microsoft BI world recently. We’ve gathered them up and summarised them below in case you missed any.

SQL Server 2008 R2 – (CTP Summer 09)

“SQL Server 2008 R2 expands on the value delivered in SQL Server 2008 by providing a wealth of new features and capabilities that can benefit your entire organization. This release will further improve IT Efficiency with new and enhanced management capabilities and empower business users to access, integrate, analyze and share information using business intelligence tools they already know.” Read more here.

So what does this mean for you? In R2 there will be a number of new features from Gemini to Master Data Services, support for more than 64 processors to extended functionality in the Management Studio. We’re all looking forward to Gemini and the potential that has to offer – rest assured the XLCubed development team are working closely to ensure that the product is compatible straight out of the box. If you have any questions contact support@xlcubed.com

Service Pack 1 for SQL Server 2008 Available  (April 09)

Microsoft announced the release of SP1 for SQL Server 2008 earlier this month. For many this marks the psychological point at which they’ll take an interest in the product and investigate it in depth. With a large uptake of the product already in the marketplace and the fastest OLAP engine we’ve seen from Microsoft, there is now no excuse not to evaluate upgrading or migrating to SQL Server.

Contact our services team for more information or how we can help you with SQL Server 2008.

SQL Server 2008 SP1

Service Pack 1 for SQL Server 2008 is now available for customers. The Service pack is available via download here and is primarily a roll-up of cumulative updates 1 to 3, quick fix engineering updates and minor fixes made in response to requests reported through the SQL Server community. While there are no new features in this service pack, customers running SQL Server 2008 should download and install SP1 to take advantage of the fixes which increase supportability and stability of SQL Server 2008.

Customers have no reason to wait to upgrade to SQL Server 2008 and many are already taking advantage of SQL Server 2008 as a smart IT investment. In fact, there have been over 3 million downloads of SQL Server 2008 since the RTM in August. With this Service pack, Microsoft is introducing 80% fewer changes to customer configurations compared to previous SQL Server Service Pack releases. This remarkable decrease is a testament to a revised product development process and updated servicing strategy that is focused on ease of deployment while keeping customer environments stable.

Microsoft BI Conference moves to every second year

The MS BI Conference, last held in October 2008 in Seattle, WA, will now be held every second year. Citing global economic constraints on travel budgets worldwide, Microsoft has moved the conference scheduled for October 2009 to October 2010 in Seattle, WA, and all further BI Conferences will be held every second year on an ongoing basis. Content until then will be covered at the SQL PASS Summit, TDWI and SharePoint conferences.

If you were looking forward to seeing the XLCubed product team at the BI Conference this year, don’t worry: you can still contact them at xlsales@xlcubed.com

SQL Server Fast Track Data Warehouse (Feb 09)

Microsoft announced SQL Server® Fast Track Data Warehouse, a new set of Reference Architectures for SQL Server 2008 that enables customers to accelerate their Data Warehouse deployments and reduce cost.  In addition, customers can further jump start their Data Warehouse design with new industry solution templates provided by System Integrators – Avanade, Hitachi Consulting, Cognizant and HP.

Seven new Reference Architectures with storage capacities from 4 to 32 TB were unveiled in partnership with HP, Dell and Bull.  Developed and tested by Microsoft, these architectures use balanced hardware optimized for Data Warehousing.  As a result customers will get

  • Better price performance than competitive solutions.  Fast Track Data Warehouse offers similar performance to the competition at 1/5th the price
  • Faster time to value and lower cost to setup and configure
  • Better performance out of box through pre-tested hardware. 

clip_image001

Customers can also choose the right Fast Track Data Warehouse with the right performance, storage capacity and pricing to suit their business needs.   Unlike Appliance Vendors with proprietary solutions, the new reference configurations use industry standard hardware from Dell, HP and Bull giving flexibility and cost savings to customers.

Fast Track Data Warehouse is available from today: customers will buy their SQL Server 2008 licenses through their preferred Microsoft Partner and the hardware from Dell, HP or Bull. If you’re looking to implement a data warehouse, contact the services team to see how we can help.

Demise of PerformancePoint Planning (Jan 09)

It’s been a few months now since Microsoft announced the demise of PerformancePoint Planning and the rebranding of the Monitoring and Analytics elements as PerformancePoint Services. The announcement back in January (09) caught many by surprise; for us, however, it’s provided a useful segue into the new XLCubed Planning application. Many customers were waiting to see what was coming next and when PerformancePoint would be ready to compete with existing players with proven planning technology (i.e. in-memory OLAP), and the tempting announcements around Gemini certainly added confusion. Looking back now at the conversations we had in Seattle and at the Microsoft presentations, perhaps the announcement isn’t as big a surprise as it felt at the time.

As above our long term commitment to an Excel front ended planning application continues, the demise of PerformancePoint Planning has simply increased the market for us and in many ways freed clients from the constraints of using purely Microsoft technology. Augmenting the Microsoft toolkit and providing our clients with the functionality they need to build effective planning, budgeting and forecasting applications remains at the forefront of our product set and services.

If you want to know more about our products and services (consulting team) just send an email to services@xlcubed.com and someone in your region and market sector will get back to you straight away.

Augmenting the MS BI Stack

Here at XLCubed we’re often asked how the product sits in relation to the Microsoft Business Intelligence tools.
The answer is that we add to and augment the features and functionality that Microsoft has to offer. Excel is a fantastically powerful and flexible spreadsheet engine, and this is exactly what it should be used for. However, all too often Excel is used as a database, with linked spreadsheets and huge data extracts.

XLCubed have a number of products designed to take advantage of the functionality available with Microsoft Business Intelligence tools, these include XLCubed Excel edition, MicroCharts, and XLCubed

2009 Excel Dashboard Competition

We are pleased to announce the 2009 Excel Dashboard Competition:

The Competition

Like last year, the competition is for real world solutions, we are not providing a sample data set, and we’re looking forward to seeing some great examples of reports, charts and dashboards.

The dashboards are judged on the clarity and effectiveness of their design, particularly

  • Clean and clear organization
  • Effective table and chart design
  • A single-screen display, properly designed for the web, screen or print outs

We’ll also consider technical aspects of the dashboard. Did it use effective techniques for:

  • The Dashboard layout
  • Data management, data logic and calculation : YTD figures, variances, etc….
  • Dashboard delivery: Sharing the dashboard via PDF, the web or as an Excel Workbook

There will be prizes for the top 3 entries, with the winner having first choice of prize from:

The Rules
We’ve kept the rules simple:

  • The solution must be in Excel 2000 or more recent, and not require additional software other than Excel, Chart Tamer, MicroCharts and XLCubed.
  • Entries can use any combination of tables, Excel charts, bullet graphs and MicroCharts (sparklines). Each has its strengths and role to play in an effective dashboard.
  • We will publish the top 3 dashboards on our website, so please ensure this is not problematic for any of your submissions.
  • Please change names and data as appropriate in the dashboards to protect the innocent.
  • Final entries by 19 July 2009. The judges’ decision is final!