Use SSRS and SSIS to Create a DBA Repository

Enhancements to the original solution

Downloads
97840.zip

 

Executive Summary:
The author's original solution used SSRS and SSIS to create a DBA Repository. This update adds support for SQL Server 2005 and SQL Server 2000 queries to the msdb database, support for SQL Server authenticated domains via multiple data sources and duplicate objects, support for server location, information for reports and ad-hoc queries, and an enhanced SQL Server Agent job schedule.


I work as the primary DBA on a network of more than 60 SQL Server machines. When I began work on a repository project to centrally store information about each of those machines, I had two goals in mind. First, I wanted to be able to use T-SQL or a SQL Server 2005 Reporting Services (SSRS) report to quickly answer any questions about the machines. This turned out to be the loftier of my two goals. Second, I wanted a solution that I could enhance over time, by modifying the schema if necessary, to accommodate new categories of information. This second goal led me to combine the twin powers of SSRS and SQL Server Integration Services (SSIS). I wrote about my solution in “SQL Server Reporting Services” (June 2007, InstantDoc ID 95745) and “SQL Server Integration Services” (May 2007, InstantDoc ID 95385).

Readers responded so well to these articles, and my original solution has progressed enough since then, that I wrote this follow-up article. In this article, I discuss the enhancements I’ve made to the original SSIS package, as well as the new SSRS reports that I use daily.

Background

My original solution includes a core set of tables in a single SQL Server 2005 database repository called DBA_Rep. With the exception of the table called ServerList_SSIS, the DBA_Rep database is populated via an SSIS package, called DBA_ Server_Load. You can populate the ServerList_SSIS table manually, which I recommend for initial testing. Alternatively, you can use a tool such as Sqlcmd. For more information about using Sqlcmd, see “Sqlcmd” (April 2007, InstantDoc ID 95387). Figure 1 shows the Server List_SSIS table that I used SQL Server Management Studio (SSMS) to populate.

Although you might find it helpful to refer to my original articles to understand the enhancements I describe in this article, my approach here is to assume that you’re not familiar with the original solution. Thus, the code for the enhanced projects and for the DBA_Rep database schema changes includes all the code from the original solution. The downloadable code for the new solution also includes detailed instructions for configuring the updated DBA_Rep database and for deploying the SSIS and SSRS projects. (To download the new solution, go to www.sqlmag.com, InstantDoc ID 97840, and click the 97840.zip hotlink.) Within the article, I also provide tips to help you easily implement the solution. Once you have the new project up and running, you can download and use the previously published reports with the new solution. I made the following enhancements to the DBA Repository solution:

  • Added support for SQL Server 2005 and SQL Server 2000 queries to the msdb database
  • Added support for SQL Server authenticated domains via multiple data sources and duplicate objects
  • Added support for server location
  • Added information for reports and ad-hoc queries: ° SQL Server users ° SQL Server logins ° SQL Server group memberships ° Disk space allocation (the most useful enhancement by far)
  • Made enhancements to the SQL Server Agent job schedule

Support for SQL Server 2005 and SQL Server 2000

My original SSIS package was limited because it wasn’t designed to take full advantage of both SQL Server 2005 and SQL Server 2000. This limitation was most evident when it came to the schema changes in the msdb database that occurred with the release of SQL Server 2005.

Among other things, the msdb database stores information about all SQL Server Agent jobs. Previous queries of the msdb database for the DBA Repository solution worked fine for SQL Server 2005 and SQL Server 2000. But I ran into a problem when I tried to include information about job schedules. In SQL Server 2005, adding the sysschedules table to query the msdb database required a change to the SSIS package. Because this table doesn’t exist in SQL Server 2000, additional steps were necessary for me to acquire the schedule information that was crucial to my DBA analysis. I had to use two separate queries that returned equivalent data to the Jobs repository table. Specifically, these queries gather job frequency information.

The two queries return the same number of columns. I was most interested in the columns regarding the schedule frequency, such as freq_type and freq_interval. As I explain later, this information is vital for producing reports that show when the jobs are scheduled to run. In addition, combining this information with the data captured from the last_run_duration field lets you isolate jobs that aren’t performing as expected for their scheduled times.

In order to segregate the SQL Server 2005 and SQL Server 2000 servers so that the queries would execute against the correct version, I had to make a few crucial changes to the flow of the SSIS package. First, I set up multiple additional data sources to support SQL Server 2005 (version 9.X) and SQL Server 2000 (version 8.X) in demilitarized zone (DMZ) and non- DMZ environments. I also set up two additional variables to support SQL Server 2005 connections. These variables, as you might recall from the original solution, are populated by a SELECT query that reads the server name from a table (ServerList_SSIS) stored in the repository database. The result is stored in an object data type variable. This variable is then converted to a string data type variable so that it can be used with the ForEach Loop container objects as well as the ServerName expression for each data source. Adding support for SQL Server 2005 and SQL Server 2000 was simply a matter of repopulating the variables at several stages during the package’s execution. The following code contains the SELECT queries that repopulate the variables:

--Run for SQL Server 2000 (version 8)
SELECT LTRIM(RTRIM(Server)) AS servername
FROM ServerList_SSIS
WHERE Connect = 1) AND (Version = 8) AND
(DMZ = 0)

--Run for SQL Server 2005 (version 9)
SELECT LTRIM(RTRIM(Server)) AS servername
FROM ServerList_SSIS
WHERE (Connect = 1) AND (Version = 9) AND
(DMZ = 0)

Next is the code to update the SQL Server version :

Update Serverlist_SSIS set version =
cast(left(dbo.SQL_Servers
.ProductVersion,1) as smallint)
From ServerList_SSIS Inner Join SQL_
Servers on ServerList_SSIS.Server = SQL_Servers
.Server

This code takes advantage of an additional column, called “version,” added to the ServerList_SSIS table. This change allows automatic qualification for the version of newly added servers.

Now let’s look at a report that uses the accumulated data for the SQL Server Agent jobs gathered from the preceding package enhancements. This report, called Job Interval, is an enhanced version of a previously published report. The new report shows job schedule information and job status (e.g., whether the job is scheduled to run, whether the job failed or succeeded). I use parameters in the actual report to filter jobs based on this type of information. For example, in Figure 2, page 16, the report shows only jobs that are scheduled to run and only those of the type backup. The term “backup” here is a subjective value; in my organization, all our backup jobs are named “backup.” Thus, my report will contain jobs that are titled with the word “backup” in their names. Using runtime parameter values to apply a filter to a report is fairly straightforward. (For information about this procedure, see “SQL Server Reporting Services,” June 2007, InstantDoc ID 95745.) The end result is that I now have one location to query all SQL Server Agent job schedules and status information, regardless of the SQL Server version. The downloadable code includes this updated report.

Continue to page 2

DMZ and SQL Server Authentication

My next enhancement (i.e., support for SQL Server authentication for non-Windows domain accounts in DMZ environments in which Windows credentials might not pass through) was a rather simple addition. I added two data sources that would use SQL Server authentication to connect to the DMZ and duplicate the existing Control Flow tasks.

When the updated package runs, the first phase truncates the local repository tables (as in the original solution). Next, the Windows-authenticated objects execute and complete. Then, the package populates the same variable that was used for the Windows-authenticated server list, but now only the servers in the DMZ require SQL Server authentication. The following code contains the query that populates the DMZ servers:

SELECT LTRIM(RTRIM(Server)) AS Servername,
FROM ServerList_SSIS
WHERE 1(Connect = 1) AND (DMZ = 1)

Notice the DMZ flag is set to 1 for true.

After you load the DBA_Server_Load SSIS project that populates the DBA_Rep database, you must configure the appropriate DMZ SQL Server-authenticated credentials for this enhancement to work in your environment. Because the Connection Manager Server name property is dynamically configured at runtime, you can overwrite the two Connection Manger sources (i.e., MultiServer_DMZ and Multi- Server9_DMZ) with valid SQL Server account names at design time simply to save the SQL Server user account password. In fact, doing so is actually necessary the first time you run the SSIS package.

You need to enter a valid server name and SQL Server authenticated user name and password to initiate the first connection. I chose ‘sa’ because it’s a known, usable SQL Server administrative account. Your account might be a different SQL Server authenticated administrative account. After you save this account information, it will be used at runtime. However, the server is dynamically set based on the variables you use for the Server name connection property. Web Figure 1 shows the entire DBA_Rep SSIS package with annotations for the package flow.

Location

I’ve been hoping for a long time that future versions of SQL Server would include an area for descriptive data about the installation—information that can’t easily be gleaned elsewhere, such as a server’s geographical location. DBAs typically work around the unknown location problem by adhering to a naming standard that includes OS version, application, and location. For example, a Windows Server 2003 SQL Server machine located in Denver might be called W2K3SQLDEN1. Some quick string parsing would reveal DEN in the server name and could be used as a location trigger.

The problem with this solution is that even though you might have 90 percent compliance with the nomenclature, you’ll always have a few stragglers that lack a location-defining name. So until SQL Server includes this description or categorical storage area, you must manually track servers’ locations. Thus, I added a location field to the updated DBA_Rep SSIS package and repository database so that you can use locations in reports or queries.

The Server_Location table consists only of a LocationID and Location (string) column, which refers to the city in which the server resides. The LocationID values are stored in the ServerList_SSIS table—and although this table is initially populated manually, the package will update the SQL_Servers table after the package completes execution with the values from the ServerList_SSIS table. The following code contains the query that runs as the last step of the package to update the SQL_Servers table from the LocationID value stored in the ServerList_SSIS table:

UPDATE dbo.SQL_Servers
SET dbo.SQL_Servers.LocationID = dbo
.serverlist_ssis.LocationID
FROM dbo.serverlist_ssis
INNER JOIN dbo.SQL_Servers ON dbo
.serverlist_ssis.Server = dbo.SQL_
Servers.Server

New Data to Play With

In addition to the schema changes necessary to provide SQL Server 2005 and SQL Server 2000 support and DMZ connectivity, I added several new tables to the DBA_Rep repository database. These tables categorically store two of the most frequently requested pieces of information: users’ permissions and disk space utilization.

In any large organization, especially one that has a lot of turnover, quickly ascertaining database or server access rights at the user or group level is essential to safeguarding data. Large companies with many users typically have numerous SQL Server machines throughout the network. And even when preventive measures are implemented, unmitigated database and log file growth can occur because of poor planning, unexpected large data load volumes, or failed maintenance tasks to reduce file sizes.

To address both security and disk space management, I added several key reports to the repository. But before I discuss these reports, let’s look at how one of the tables, SQL_Drives, is populated within the SSIS package. The same technical methods are used to populate the new SQL Server user tables.

As you can see in Figure 3, two tasks are encapsulated in a ForEach Loop container. This container rolls through the tasks for each server derived from the Server name variables, and in this case initiates a SQL Server execution task called Fixed Drives to populate a temporary table, then runs a Data Flow task called Fixed Drives Load to pull the data from the temporary table to the DBA_Rep table called SQL_Drives. The reason I had to use a temporary table was because of the need to execute an extended stored procedure, xp_fixeddrives, for which there wasn’t a valid result set without a temporary table.

It wasn’t possible to execute the xp_fixeddrives stored procedure directly in the Data Flow task and provide all the data I wanted—specifically, the Server, Disk_Drive, and MBFree columns. The xp_ fixeddrives stored procedure returns only the latter two pieces of information. But because I had no way to tie a server name to the repository, this information would be meaningless. My solution, which Web Listing 1 shows, was to create a temporary table in an Execute SQL task and populate it with the required data. Next, I fed the results of this query as input columns to the Data Flow task called Fixed Drives Load, which populates the SQL_Drives table in the repository. I included the code that executes the xp_fixeddrives stored procedure into the temporary table, RESULT_STRING_FD, then uses SERVERPROPERTY(‘Servername’) to update the server information. The results of the query in Web Listing 1 are fed as input to the OLE DB destination, which is the SQL_Drives table in the DBA_ Rep database, via the Data Flow task in the DBA_Server_Load SSIS package.

After the package executes, you can use the following code to query the results from the SQL_Drives table:

select SD.Server,
Drive_Letter,
MbFree
from sql_drives SD

 

shows the results of this query. With the data already gathered for database information (which was part of the original solution), you can now tie the database sizes to the space remaining and create a threshold value that you can monitor.

Putting the Data to Use with Reporting Services

I needed an at-a-glance report that would tell me (1) which of the drives on each server was potentially getting low on free space and (2) which server was hitting a threshold of data and log file sizes as it pertained to the free space. The driving query for this report, which Web Listing 2 shows, returns only a few columns, mostly calculations of the free space and database file sizes per drive per server.

Note that the query includes the Size field from the Database_Info table. This size is shown in 8KB pages that require division by 128 so that the size matches the megabyte value of the MBFree column. The percentage calculation is derived by dividing the sum of the database size by the total of MBFree, plus the sum of the database size again, times 100. Note, also, that the outer join between the two tables, Database_Info and SQL_Drives, is accomplished by linking the Drive_ Letter field from SQL_Drives with the first character of the FileName field in Database_Info. The first letter, derived with the LEFT function, is the drive letter where the database or log file is located.

Continue to page 3

The SQL Drive Space report incorporates a couple of interesting uses for report parameters and expressions to display a bar chart and table containing the drive space inormation. The High_Usage parameter lets users select all servers or only the servers that have a threshold event (which I arbitrarily set to a disk file usage above 30 percent). If you select the value High Usage from the parameter dropdown list, you’ll see only those servers that have potential drive space issues. Otherwise, you see all servers by default if you don’t select a value. As Figure 4 shows, color coding helps you determine at a glance which of the servers has a threshold value exceeded.

The SQL User Permission report has saved me hours of time investigating where access was granted—especially when a user was placed in a group and thereby hidden from the DBA. Even with access to Active Directory (AD), analyzing Windows domain group membership is tedious. Instead, I wanted one big result set that returned all the user information for all the SQL Server machines being reported on (especially production servers). The resultant query, which Web Listing 3 contains, and the subsequent report let you select a user by name or part of a name, as well as by group or by server.

The report, although not pretty, is quite useful. Web Figure 2 shows a report run for Location 4 (from the Location field), with a breakdown of permissions for each database. You can analyze Windows group and account names, although the report in Web Figure 2 doesn’t list any. Finally, the user parameter supports wildcard lookups. For example, you could look for a combination of all the servers in Location 4 and users with a name like “gue,” to return only the “guest” user. This option is handy when you know only part of a name.

Reader Kudos

Thanks to everyone who read my original articles and provided feedback to help improve my solution. I’ll continue to make enhancements to the tool (which I’ve dubbed SQL Snapper, for snapshot data analysis— plus, I live near the beach and I like to fish!). Although I created the tool for straight reporting, it’s useful for many other projects as well. Have fun tinkering with the code (in a test environment of course), and feel free to email me with any ideas you have for modifying or expanding it.

Discuss this Article 35

skeleton
on Feb 20, 2008
mrweaver59, yeah, you're right - free speech lives, and it's often worth exactly what you pay for it. It doesn't necessarily indicate that reasonable thinking is involved. Get over it!
focasio
on Jan 28, 2008
Not for nothing, but where's the link?
medicis
on May 4, 2010
I really enjoyed the webinar the other day and have been looking forward to downloading the code but I'm having an issue. I'm receiving HTTP Error 404 - The page cannot be found.

- Neal
rlandrum
on Jan 28, 2008
Good Catch, dguillory. I will let the editors know this morning. Rodney Landrum - author.
sboorman
on Mar 6, 2009
Do you have a SQL 2008 version of this. I thought it was extremely useful.
rlandrum
on Apr 26, 2008
gsimmons31, I definitely understand how frustrating it can be to deal with case issues with different collations. I am fixing up these issues in code for myself, as I now have at 5 servers that are of case sensitive collations. Rodney Landrum ~ author
dguillory
on Feb 1, 2008
Code link still not working."Sorry the page you are trying to reach is temporarily unavailable or the page no longer exists." Good thing this is not a Tech site.
djdevelop
on Feb 1, 2008
Great article. Too bad the site is letting us (subscribers) down... :(
Brett (not verified)
on Feb 5, 2008
Fix link soon please......Then usefulness will be higher
markinmon
on Jun 8, 2010
Trying to add more logic to the solution and running into an error. All I am doing is copying one item like fixed drives and building a new query and new table in DBA_REP to match. I am using tempdb for the process just like the item I copied.

Error 18 Validation error. Populate_DBA_REP Connection manager "MultiServer": SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80004005. An OLE DB record is available. Source: "Microsoft SQL Server Native Client 10.0" Hresult: 0x80004005 Description: "Login timeout expired". An OLE DB record is available. Source: "Microsoft SQL Server Native Client 10.0" Hresult: 0x80004005 Description: "A network-related or instance-specific error has occurred while establishing a connection to SQL Server. Server is not found or not accessible. Check if instance name is correct and if SQL Server is configured to allow remote connections. For more information see SQL Server Books Online.". An OLE DB record is available. Source: "Microsoft SQL Server Native Client 10.0" Hresult: 0x80004005 Description: "Named Pipes Provider: Could not open a connection to SQL Server [2]. ". Populate_DBA_REP.dtsx 0 0
stefbauer
on Jan 28, 2008
Also the "97840.zip" referenced in the begining of the article is not attached to the article to download.
focasio
on Jan 31, 2008
Any idea on when http://www.winnetmag.com/Files/09/97840/97840.zip will be available? Thanks
rlandrum
on Jan 31, 2008
I have brought this to the their attention. I have several emails from readers asking for the code directly. I will send that along as soon as I can. In the mean time please keep checking back as they are working on the code link. Rodney Landrum - author.
Benoit (not verified)
on Feb 5, 2008
Link is broken.
skeleton
on Feb 22, 2008
Caution! in the raw jobs data we're collecting here, Microsoft seems to have changed the MEANING of the values for last_run_outcome between 2000 and 2005. Checking BOL verifies this. So if you are basing reporting or analysis on job failures, be sure you handle this difference. Apparently the values for Fail and Succeed were reversed, and the value for Cancel was changed.
Barry (not verified)
on Oct 23, 2009
Excellent - have been planning to do something like this myself for some time. Now I can see how much work it would have been and I'd have screwed it up!! Commercial SQL Server documentation tools are very much developer focussed and focus on the nitty gritty of the database schemas. Your soultion focusses mar more clearly on the SLQ Server environment. The (natural) use of SSRS is also excellent. Well done. I suspect there's a commercial opportunity there for somebody - a SQL Server documentor for the DBA, not the developer
skeleton
on Feb 22, 2008
More about the last_run_outcome value: Maybe the quick fix for this is to use the CASE operation in the 2000 data collection to convert to 2005 values, since 2000 should go away first.
Mike (not verified)
on Feb 3, 2008
Link is Broken
skeleton
on Feb 20, 2008
I believe we identified a bug in the code for LoadJobsInfo in the 2000 version. active_end_time is missing in the query result set, and thus is not mapped to active_end_time in the field mappings. Is this really a bug, or did we delete it accidentally? I think I'm looking at the original code. We've added more archive tables, added a table that is first updated with collection datetime to join as a common timestamp for all tables, added a step at the end to update archive tables from the collection tables using the common timestamp, and can now write sprocs that will compare to earlier collected data so we can report CHANGES. Rodney, great job on the development of this and thanks for sharing with us all.
coriehl
on Mar 4, 2008
Please contact Customer Service 800-793-5697 8AM to 5PM MDT if you are unable to downoad the code. Thank you, Customer Service
spenumatsa
on Feb 29, 2008
I implemented the solution at work and it is excellent. The only issue that i have is that i dont the job details in the Job interval report. FYI my SQL Server is in a clustered environment. I tried every other option but nothing gives me the job interval reports. Just wondering if i am doing something wrong that i dont know of. Any help will be appreciated.
rlandrum
on Dec 6, 2008
sueboo, A couple of things jump out at me based on your issue. You say that this is Windows 2000 and hosts both a SQL 2000 and 2005 instance. I would be curious to know where you are running the package from, meaning are you running this from a SQL Agent job or directly in Visual Studio/BIDS? If you run the package in design (BIDS) where in the package does the failure occur? You can mail me directly and I would be happy to work out the details with you. My e-mail is posted in my profie on this site. Rodney Landrum (author)
dguillory
on Feb 12, 2008
Oh yea.. I got the multi script tool... way cool.
gsimmons31
on Apr 25, 2008
Very good stuff. Just an FYI, I had to spend a good bit of time troubleshooting some of the steps in the package. I'm one of those DBAs that has the pleasure of supporting SQL Servers that use Binary Collation. Not all of my servers are Binary so it took me a good while during testing to figure out why the package would die at several points. It turns out that some of the code inside the steps would fail on a Binary Collated machine because variable names differed in case within a script or the case used for a system table column name was incorrect. Other than that, thanks for sharing!
meganbearly (not verified)
on Feb 5, 2008
Thanks so much for letting us know about this problem. I have passed this information on to our Web team, and we hope to have the problem fixed soon. Megan Bearly Associate Editor, SQL Server Magazine mbearly@sqlmag.com
skeleton
on Feb 6, 2008
You know, the more I think about all the above comments beating up on folks for broken links from all of you, the hotter I get. I just have to say if it is that critical for you, go out and do the usual spending of your companys money for a tool instead, or write your own.
Mike (not verified)
on Feb 4, 2008
The Article is very good but without looking at the code it’s kind of hard to make a final rating I am sure it good, because I have been doing the same thing for sometime now just want to see the differences. Like performance and footprint size on the server when running. And still a broken link. Does this Magazine care about quality? This not the 1st time they have had broken links. Seems to be the norm for a professional magazine I would expect better results to my paying customers. If I published a web site that did not work I would lose my job.
kpijnenb
on Jan 31, 2008
Very interesting article with very relevant information. This makes it doubly unfortunate that the download is not available!
skeleton
on Feb 6, 2008
OK, all of you perfect DBA's out there. Lay off the broken links. It ain't the end of the &^&%$ world. This is good stuff, and another day isn't going to hurt you a bit. If at first you don't succeed, remove all evidence you ever tried....
Mike (not verified)
on Feb 12, 2008
The Borken Link is noreflection of Rodney Landrum This is a very good arctcile. But if you are paying for something and does not work you should have the right to say something. Free speach lives.
dguillory
on Jan 28, 2008
fyi.. the links(to gifs) in the article are broken
dguillory
on Feb 12, 2008
Great article Rodney! This was a big help and eyeopener for me. I am working on adding sp_databases, serverproperty and xp_msver. Thanks for contributing. ;)
rlandrum
on Mar 3, 2008
spenumatsa ~ I have noticed that if you are unable to connect to an instance, say because of permissions, the next server in the list is skipped for this step. I have added errror handling for many events and will be adding much more via custom code. For, now, I would run the query from the data flow step directly against the servers to make sure there are no errors. Rodney Landrum
sboorman
on Dec 3, 2008
This program is great but I keep getting an error of "Could not obtain information about windows NT group/user 'NT Authority\System' for 1 server. This is Windows 2000 server and Sql 2000 (sp4). I get no errors with a SQL 2005 instance on the same server.
gnanau
on Apr 4, 2008
Everything looks good, but I tried to make a change to the OLEDB data source used for 2005 job info: I tried to add some field for current date, something like getdate() as TodaysDate. When I press OK I get message Invalid object name sys.sysschedules. I cannot get around this problem with that Multi_Server_9 connection manager. Anybody had the same problem?If so, how did you get over it?

Please or Register to post comments.

IT/Dev Connections

Las Vegas
September 30th - October 4th

Paul ThurottOur Experts will show you:
• Common SQL Server
Problems
• Best Practices for T-SQL
• SQL Server Integration
Services
• Database Development

Come See Mike Otey & Tim Ford in Person!

Early Registration Now Open

From the Blogs
May 9, 2013
blog

My ISO 8601-Compliant Signature 2

My family recently just "officially" announced that we're in the process of adopting a child from South Africa. We're quite excited, of course, but there's a ton of paperwork to do—along with the need for gobs of signatures....More
May 8, 2013
blog

Use SSIS for ETL from Hadoop

In this blog post, Mark Kromer walks you through using SSIS as a way to use ETL techniques using Microsoft's Hadoop on Windows (HDInsight) as a source using Hive connectors...More
Vision road sign
May 6, 2013
blog

Cheaters Never Win, Even in TPC Benchmarks

In this portion of the series on database benchmarking, I want to tell you about one of my favorite aspects of the TPC benchmarks – CHEATING....More
SQL Server Pro Forums

Get answers to questions, share tips, and engage with the SQL Server community in our Forums.