Does Excel Power Pivot Replace the Data Warehouse?

RSS

Excel Power Pitvot Model

In my consulting and product management travels over the years, I have heard more than one group mention that they were sold on the idea that Excel Power Pivot can eliminate the need for a data mart or data warehouse. Giving the sales team credit, I imagine that a few Power Pivot benefits were taken too far out of context. Sure, Power Pivot can be used to mash up 80 to 100 million rows of data, possibly more, from different data sources including, but not limited to SQL Server, Excel files, Reporting Services data feeds, text files, and so on.

That's similar to the capability of a data mart or warehouse where many data sources are combined into a single version of the truth. However, Power Pivot is NOT a replacement for a true data mart or data warehouse. The two solutions have completely different magnitudes of capability and appropriate use cases. I'd even argue that Power Pivot is more successful when used with a data mart or data warehouse.

Related: Understanding Power Pivot and Power View in Microsoft Excel 2013

Power Pivot Can Make Positive Impact When Properly Used

Excel Power Pivot is targeted for Personal and Team Business Intelligence (BI) solution use cases. Power Pivot also is excellent for quick prototypes and proofs-of-concept. You can combine enterprise data with personal data for rapid, agile analytics, freeing yourself from long waits for IT or BI professional development time—and that rapid total solution time to insight is amazing and why Power Pivot is popular. Power Pivot truly can make a significant, positive impact in an organization when properly used.

Excel Power Pivot is not always appropriate for enterprise scale BI today due to smaller data volume limits, file size restrictions when used with SharePoint and Office 365 Power BI deployment, no row level security, and other common enterprise BI needs. The Power Pivot upgrade lifecycle does allow for easy graduation of personal BI solutions to enterprise BI solutions that are not found in many other personal BI tools today. The Power Pivot upgrade lifecycle was not intended to replace a data warehouse—it was designed to allow for fast, easy, in-place upgrades that retain all the developed business logic and also provide additional IT/BI professional level administration, management and deployment features. The more advanced features include partitioning for large-scale data sources and role based security. The front-end Excel Power Pivot designer, by intentional design, does not include those features to keep enterprise level capabilities in the enterprise IT/BI professional realm.  

Blessed, Single Version of the Truth

A data mart or data warehouse is often the blessed, single version of the truth since it uses governed, controlled data loading and ETL processes to combine disparate data sources, applies extensive business logic and proven data modeling design patterns that can securely, accurately and efficiently report data changes over time periods. Data marts and data warehouses are typically designed to store massive volumes or data and can be terabytes or petabytes in size. In the world of exploding data volumes that we live in today, a data warehouse and even a Hadoop cluster may be used to store these huge masses of data that go far beyond what a Power Pivot model in Excel can handle. Note that Power Pivot does store a copy of the loaded data in the Excel file. Although Excel 2013 brings improvements in Power Pivot scalability, there are still realistic Power Pivot limits to keep in mind as you choose a reporting solution.

Related: Power Pivot and Small Data: A Self-Service Approach to Business Intelligence

A data mart or data warehouse also uses data model designs that are optimized for slide, dice, and drill down reporting by dimensions and measures. Usually in Microsoft data warehouses, the Kimball dimensional design approach is used. In addition to combined data source reporting ease via dimensional design models, concepts like slowly changing dimensions are implemented to ensure accurate reporting across changing entities. For example, company business structures, managers, field teams and even names change over time. To be able to correctly compare year over year metrics, you can't only use the current entity values. You should be able to report with both the current and historical entity values. In regulated industries or in legal environments, properly reporting the right data is mission critical. I'm still amazed by just how many groups today, including large businesses, still don't use critical slowly changing dimension design patterns in reporting data models.

Slowing Changing Dimensions

With regards to Excel Power Pivot, yes, you can technically apply slowly changing dimension patterns but few business users understand Kimball design pattern concepts or develop Power Pivot models with those proven patterns. I find many BI professionals building Power Pivot models right now just like they used to develop Analysis Services cubes. In my experience, I have found many business users skip Power Pivot training since the demos look easy and the data loading wizards are indeed quite simple to use. Power Pivot is a bit deceptive. The initial ramp up to get started is easy, but there is a steep learning curve. Users can get a lot done quickly, but also may be doing it incorrectly if they don't have a foundation of Power Pivot and dimensional design knowledge. Most often, when users have no training, they build Power Pivot with one, flattened table and do not have any time intelligence. They only realize something is wrong when time-based calculations don't work or they get recurring warnings of missing relationships with strange results.

Related: SQL Server 2012 BI with SQL System Center Data Warehouse

Trying to Replicate a Data Warehouse

Another common mistake that I have seen is a business user putting too many unrelated sources in one Excel Power Pivot model—basically trying to replicate a data warehouse. In these scenarios, users literally copy tables from various data sources with no transformations (no ETL) and end up trying to develop reports on a mess of duplicated data that's difficult to join and develop calculations upon. For example, vendor data may be in the billing system, CRM, ERP, and a few other places. If you simply copy four or more different tables and load them "as-is" into Power Pivot, you'll have a mess that's difficult to report upon. That mess turns into total chaos when you add numerous tables for product, location, and so on. In these situations, it's often better to use a data mart or data warehouse approach and keep Power Pivot as the short-term only, rapid prototype reporting tool for a smaller subject area.

I have also seen the Pivotstream/Rob Collie, concept of "thin-workbooks" that use a professional quality, dimensionally designed Excel Power Pivot data source hosted on SharePoint with user Excel reports that reference but can't change the Power Pivot data source. In that case, Power Pivot acts and is used like a tiny data mart. It is a good practice. Even in that case, an underlying real SQL Server data mart and an upgrade of that Power Pivot model to an Analysis Services Tabular model is often used to deliver enterprise quality BI solutions.

The Bottom Line

Power Pivot and many in-memory, self-service, data discovery tools in the market today do fail when they are inappropriately used as a data mart, data warehouse, or even as an ad-hoc, operational data store (ODS). Don't be fooled by a savvy sales pitch or misuse the Excel Power Pivot solution. Unlike a true data mart or data warehouse, these tools do have a point of diminishing returns. Power Pivot and other self-service BI tools are NOT a replacement for a data mart or data warehouse. It's important to roll out self-service BI tools with training and governance to avoid common misunderstandings and mistakes that can be expensive and time consuming to fix down the road. When used correctly for Personal BI, Team BI, prototypes or proofs-of-concept, you can achieve truly agile, rapid reporting success with Excel Power Pivot.

Related Image Gallery: Microsoft Business Intelligence Power Tools

Discuss this Blog Entry 2

on Mar 12, 2014

What a fantastic article.

Our experiences at Pivotstream echo this 100%. We strive to reach that nirvana where there is a well documented and designed data model for our clients to use and modify, coupled with the appropriate level of 'author' rights for folks to create their own reports without endangering the model integrity or impeding true measure governance. Jen - you hit the nail on the head by observing that the performance is manyfold better when Power Pivot is used appropriately. Great for ad hoc mashups, yes... but extraordinary when used with tabular and an enterprise preparation and governance mindset.

We see, everyday, two distinct types of challenges. First, we see questionably architected Power Pivot models impacting solution quality, extendability, and speed... Second, when our role is to provide a platform only (rather than an end-to-end solution), we see instances of folks struggling to assemble the ingredients of the BI stack effectively, again, with unnecesary implications to deployment speed and solution effectiveness. There are LOTS of moving pieces to connect and details to herd, so if you endeavor to go it alone, I heartily encourage self-service deployments to have a qualified IT pro ride 'shotgun' over the deployment instance for continuity sake, and to derive the well articulated benefits illustrated in Jen's article.

on Mar 13, 2014

Always saw PowerPivot as good for ad hoc and prototyping for production, not a DW replacement. It has grown into a much more usable product than the first release. Lack of security to a great degree in ad hoc user hands limits it greatly as a solution in many cases. Which is where a DW and real cube shine.

Please or Register to post comments.

What's SQL Server BI Blog?

Derek Comingore’s, Mark Kromer's, and Jen Underwood's candid look at SQL Server’s Business Intelligence features.

Contributors

Mark Kromer

Mark Kromer has been a technical product manager & solution architect in business intelligence, data warheouse and Big Data world for over 20 years for Microsoft, Oracle and AT&T, currently...

Jen Underwood

Jen Underwood, Founder and Principal Consultant at Impact Analytix, has almost 20 years of experience in data warehousing, business intelligence, and predictive analytics. She is a former Global...
Blog Archive

Sponsored Introduction Continue on to (or wait seconds) ×