It’s been a little while here on the SQL Server Pro Mag BI Blog since I’ve provided any updates on the state of Cloud BI in the world of Microsoft BI. So I thought I’d put together a small sample Microsoft Cloud BI solution for you today that demonstrates the following aspects of the Microsoft BI product stack in the Cloud:
In reality, this is still a somewhat “hybrid” model when you look at such a solution holistically. And to accomplish this, I also incorporated these on-premises traditional tools:
Now, it’s important to remember that I am talking about Microsoft Cloud BI as the Microsoft platform, hosted on Azure (Paas or Platform as a Service) and not Infrastructure as a Service (IaaS), which is the traditional VM model hosted in a hosting provider’s data centers using virtualization ala Amazon. It should be not, though, that Microsoft has just announced (today, in fact, as I am writing this blog post!) that Azure will now also offer VM on Hyper-V for IaaS. So, for example, you could conceivably run SSAS in the Cloud via a hosted VM. But that is not a fully PaaS option. Instead, IaaS such as SaaS on a hosted VM, really only moves the burden of the infrastructure your of your own and into Microsoft’s or Amazon’s. PaaS, on the other hand, is the native Microsoft platform running on Microsoft assets in a Microsoft data center where all of the infrastructure costs including backup, patching, HA, DR, replication, etc. are all handled for you by the provider.
OK, enough of that, let’s get started.
1. We’ll start in the Microsoft Azure Data Market. This is a marketplace that Microsoft hosts were syndicated data providers can resell their collected data assets to customers. You can subscribe to their data sets and use them in your solutions. Microsoft has been investing heavily over the past couple of years in a number of Cloud-based initiatives and one of those investment areas that seems to fly under the radar is the Marketplace. Discover all kinds of data sets from different fields, industry, etc. For this demo, I chose crimes statistics for Pennsylvania, where I currently reside, from the U.S. government Data.gov data provider. This is a small 100-record free sample data set. There are a number of free data sets like this that you can access on the Azure Data Market for your prototypes.
2. Now that we have Cloud-based dataset, we need to perform traditional ETL to get it into a format that we can work with. Microsoft has a new preview Cloud-based integration tool called “Data Explorer.” Note that this is a “Labs” project currently for preview. Microsoft references this tool as a “data mashup” tool. I don’t see it currently as an industrial-strength traditional ETL tool for data warehouses, but it does a very nice job for smaller data sets to create “mashups” from multiple sources. And, it has both a local client and cloud-based user tool available.
Once you have signed-up or downloaded the tool, you can start building mashups and then publish those as OData in the Azure cloud. For this sample, I’m only going to work with the single Data.gov Crimes dataset from above for this simple example and just focus on transformations of the data set so that I can easily consume it in reporting tools. The assumption here is that I am a data analyst and the end-users authoring and viewing reports won’t have enough insights into the raw data to make sense of it, so the consumed data needs to be well-defined summarized entities.
3. To stick with the Cloud theme, I am going to use the Web-based Data Explorer mash-up tool and consume the raw stats for PA crimes in 2006 from the Azure Data Market via RSS OData.
4. Now, I can transform and massage the raw data, making it ready to be published as a model for reporting. In the screenshots below, you will see that I hide unnecessary data fields and made 2 new custom fields. Each of these will generate a percentage of violent and non-violent crimes based on population. This is a better measure for the end-user to report on rather than raw #s which will be skewed without the context of the overall population for each city in PA.
5. When you are done with your mash-up, you can save it on your workspace and then publish it to the cloud. For this demo, I have published it as an RSS OData feed so that I can bring it into a SQL Server in-memory tabular model, which I will do in Excel PowerPivot. So far, we’ve been completely 100 percent Cloud BI. Now we’re going to use a few on-premises tools, which is where the solution jumps over to a “hybrid” architecture.
6. I want to put that cleansed version of the dataset into an in-memory analytical model to report against, so I’ve created a new PowerPivot model in Excel. Once again, OData comes to the rescue as I just point to the OData data source that I just created.
Here, I can give the fields more friendly names in PowerPivot and also generate an in-memory analytical cube that I can now store in SQL Server Analysis Services (SSAS) Tabular mode for reporting. If I needed to, here I can also use the PowerPivot DAX formula language to create custom fields in the model:
7. To publish that cube into SSAS, I went over to SSMS, selected my Tabular SSAS instance and chose “Restore from PowerPivot” which will import the data model into the server and I can now maintain the cube from SSAS just as I would any traditional SSAS cube, no longer maintaining the model in Excel on my laptop:
8. Lastly, how will report users access the model and build their own reports? They can use Excel like above by connecting to the cube on the SSAS server. Or, use reporting tools like Report Builder or Power View (see below). Report Builder will require a client tool download via Click-Once, while Power View is purely Web browser-based. To make the cube accessible to Power View, I just made a BISM connection file in SharePoint to the SSAS cube that was generated from PowerPivot:
9. If you built those reports in Report Builder or Visual Studio, you can jump back into Cloud BI by storing that RDL in SQL Azure Reporting Services, which is SSRS in the Cloud. This will make the reports accessible to your users directly from a Web browser or mobile browser, anywhere that they have Internet access without needing to have any client tools. To do this, store the data in SQL Azure instead of SSAS, since SQL Azure, not SSAS, has a PaaS offering in Azure. Make SQL Azure the data source for these reports instead of SSAS.
And, so, there you have it! Microsoft Cloud BI with end-user Web-based reports generated from a Cloud-based syndicated data source in the Cloud (Azure Data Market), mashed-up and cleansed in the Cloud with Data Explorer and produced as SSRS reports. To publish the SSRS reports in SQL Azure Reporting Services, you will need to store the data in SQL Azure.