SQL Server Integration Services

Gather information automatically

Downloads
95385.zip

Suppose that you hire a new DBA and assign him or her to a large-scale SQL Server infrastructure that consists of more than 100 servers divided not only by purpose and criticality but also by geography. This new DBA would likely spend the entire first week getting to know the lay of the land—using SQL Server Management Studio (SSMS) to connect to each SQL Server machine, one by one, to gather essential information such as version, edition, server configuration, existing databases, and scheduled backup jobs. A daunting task indeed.

I learned early in my career that spending time up front to automate otherwise manual and time-consuming tasks can preserve your sanity. I therefore developed a fairly simple solution that connects to each available SQL Server machine, pulls information into a central repository database, and feeds the combined data to a report for DBAs and other IT staff to use. In this article I describe the solution I used. (Note that the solution I offer is intended to enhance, not replace, a preexisting full monitoring and alerting solution.)

Tool Time

Several tools are available to the intrepid DBA setting out to discover the uncharted server landscape. For example, you can use the Microsoft SQL Server Health and History Tool to populate a repository database. However, this tool is outdated and not very flexible. (For more information about SQLH2, see "SQL Health and History Tool," November 2006.)

For my project, I employed the command-line tool Sqlcmd, SQL Server Integration Services (SSIS), a SQL Server database for the repository called DBA_ Rep, and SQL Server 2005 Reporting Services (SSRS). (For more information about Sqlcmd, see "Sqlcmd"; for related articles about SQL Server 2005 and SSIS, see the Learning Path below.) In this article, I explain how to use Sqlcmd and SSIS to construct and populate the DBA_Rep repository database. In a later article I'll explain how to build and deploy the three SSRS reports designed to deliver the data from this repository. (Note: This article makes several nonstandard recommendations, such as querying system tables directly, employing pseudo temp tables, and using xp_cmdshell to run commands. If you stringently adhere to best practices and use only supported techniques, you'll need to withhold your judgment temporarily—until you see that the nonstandard methods I use are efficient and aren't detrimental.)

Create the Repository Database

Now that you have a list of servers to use as input, you might wonder how you can use that input directly in an SSIS package. But don't get ahead of yourself—first, you must store the information somewhere. As most DBAs know, the best place to store a list of data is in a table. Before we examine the SSIS package, let's take a look at the database that will be the repository for the SSIS load.

The table that will store the list of servers on the network from the Sqlcmd command is called ServerList_SSIS. In addition to this table, six other base tables in the DBA_Rep database store data to give a DBA enough basic information at a glance to answer some fundamental questions about the state of each server. These tables are SQL_Servers, Database_Info, Databases, Backup_History, Jobs, and Jobs_Archive. Each of these six tables holds specific information about each SQL Server instance. Web Listing 1 contains each table's schema; this listing also serves as the script to build the database for the SSIS package to populate the database. To run this script successfully, you need to create a blank database called DBA_Rep in your SQL Server 2005 instance. After you create the DBA_Rep database, the script that Web Listing 1 contains will create the necessary tables in the database.

The table ServerList_SSIS is empty, waiting to be filled with server names. Although I didn't automate this process, you can use SSIS techniques similar to the following techniques that I discuss to easily do so. I used the special stored procedure xp_ cmdshell to run the Sqlcmd shell to return the server list. Assuming that you've run the script to create the DBA_Rep database that contains the ServerList_SSIS table, you can easily use an Insert Into statement with the Sqlcmd /Lc command to insert the server data. In SSMS, open a new query window and enter the following command:

USE DBA_Rep
   GO
Insert Into ServerList_SSIS
  Exec xp_cmdshell ‘sqlcmd /Lc'

At this point you need to consider the following caveats. First, the result set returns NULL records. The table can accommodate NULL records, and the SSIS package's logic will in turn filter out these rows. You could build in logic to take care of the NULL values before the insertion, but I chose to do it as part of the SSIS package. Because this table has no defined indexes that require unique values, truncating the table also ensures that no duplicate rows occur each time the Sqlcmd /Lc command loads the table. You also need to ensure that xp_cmdshell is configured with the proper permissions to execute. By default, xp_cmdshell is disabled in SQL Server 2005; you can use the SQL Server Surface Area Configuration tool to enable xp_cmdshell.

Obain the Data

The information that's most important to DBAs on a day-to-day basis is server information, such as version (e.g., SQL Server 2005 or 2000), service pack level, edition (i.e., Standard or Enterprise), and default collation. Fortunately, SQL Server 2005 and 2000 store this information in handy system tables. My solution queries each server and stores this information in the SQL_Servers table.

Database information is just as important or even more important than server information; crucial database information includes the two tables Databases and Database_Info. The Databases table simply holds the name of the server and the name of the database. Database_Info holds details, such as size, file location, and recovery model for each database. You can use the Server field to join the Database_Info table to the Databases table. (For ease and simplicity, I used the Server field for joins rather than a key field or a server ID field that I would need to manually update. Although this solution wasn't ideal, it was sufficient for my purposes.)

The final three tables store information about SQL Server Agent jobs and database backups, which I believe is the most important information for any DBA to have. For example, knowing which jobs are failing and need attention is imperative when you're working with hundreds or even tens of database servers and databases. Jobs often fail—and because most jobs perform routine full and transaction log backups if they fail, response must be swift. The Backup_History table holds detailed information about backups that occurred in the past n days. I've found that 5 is a good number of days of history to analyze.

Populate the Repository


Now that you have the list of SQL Server machines, and the repository database is defined to store information about these servers for databases, jobs, and backup history, it's time to use SSIS to populate this simple repository. SSIS might seem like a complex design environment if you've never used it. Many DBAs use DTS for SQL Server 2000 to cut their teeth for extraction, transformation, and loading (ETL).

Figure 1 shows the full package that I used to populate the repository database. This package consists of three areas: (1) migrating and/or truncating repository tables to maintain the repository, (2) populating a variable with an ADO record set of server names derived from the commandline utility Sqlcmd, and (3) using this variable to programmatically connect to each server, one by one, and pull information into the repository.

Truncating tables and migrating data to archive tables occur first in the SSIS package. The Execute SQL Task objects that run the Truncate Table statements are grouped together in a sequence container at the top of the package. When the package runs, all the tables are initially truncated; the only exception is the Job stable. Before truncating the Jobs table, a Data Flow task moves the data from the Jobs table to the Jobs_Archive table. I wanted to maintain a history of job successes and failures to analyze over time. The other tables need the most current data—and in my opinion, starting fresh each time for this semi-static information is cleaner. As Mentioned previously, I'm pulling 5 days' worth of backup history that will repopulate with each run. Figure 2 shows the dialog box to configure the Execute SQL Task object to truncate the SQL_Servers table. When all the objects in the Truncate Tables and Populate Archives sequence container complete successfully, the package moves to the second phase, which is to populate a variable with an ADO record set.

Before I explain how to populate a variable from a SQL Server query to use within an SSIS package, let me explain why you might want to do so. If you have fewer than 10 servers, creating a separate connection to each server might be feasible. In SSIS you'd have 10 Connection Managers, each pointing to one SQL Server machine. More than 10 servers is problematic, but a tenacious DBA might be willing to create separate connections for, say, 20 servers—as long as no additional servers are expected, or the DBA is willing to manually add servers and maintain the package indefinitely. In my case, I had more than 100 servers, so I needed a better solution.

The task to populate the variable uses a simple SELECT statement to query the ServerList_SSIS table containing the server names previously derived from Sqlcmd. The query is:

SELECT RTRIM(Server) AS
<servername>
FROM
ServerList_SSIS

Figure 3 shows the dialog box to configure the Execute SQL Task object for the Populate ADO Variable task, including the Connection, SQLSourceType, and SQLStatement property values. A direct input query to the DBA_Rep database resides on the local SQL Server machine named QASRV, which is also where the package will run. To use the query results to populate a variable, you need to have a variable already set up.

For my solution, I needed to set up two variables: one for the Execute SQL Task object, and one for the final third of the package, which uses Foreach Loop container objects.The five Foreach Loop containers (i.e., one for each database table in the DBA_Rep database) employ a combination of techniques to meet the defined goal of programmatically changing a connection string iteratively for each server and executing a Data Flow object to retrieve server-specific information.

To open the Variables toolbar, right-click anywhere in a blank area of the package and select Variables from the list. Figure 4 shows the Variables toolbar with two defined variables of two different data types. The first, SRV_Conn, is a simple string variable. The second, SQL_RS, is an object data type. These distinctions are important. Because the result set from the SELECT statement contains multiple records, a string variable doesn't work. I needed to use the SQL_RS object with a value of System. Object to hold the results, then map the two variables, object to string, in the Foreach Loop container. I used the following four simple steps to accomplish this task.

  1. Under Result Set in the Populate ADO Variable task, set the Result Name to 0 and the Variable Name to User::SQL_RS, as Figure 5 shows.
  2. In each Foreach Loop container, set the enumerator in the Collection area to Foreach ADO Enumerator, and set the ADO object source variable to User:: SQL_RS, as Figure 6 shows.
  3. Under Variable Mappings in the Foreach Loop containers, set the Variable to User::SRV_Conn and the Index value to 0, as Figure 7 shows.
  4. Use a property expression in the Connection Manager object to assign the string variable to one Connection Manager. When the Foreach Loop container executes the Data Flow tasks it contains, the connection string is dynamically built with each enumeration of servername. In this package, the Connection Manager called MultiServer serves this purpose. Setting the variable to the ServerName property, as Figure 8 shows, causes the connections to set themselves correctly for each server.

After the variable mappings are in place, you can use Data Flow objects within each Foreach Loop container to load the tables. You can place Data Flow objects on the SSIS package's Control Flow tab, but these special objects have their own tab on which you can define their properties and sequencing. In general, a Data Flow task consists of a source and a destination object. In my solution, both the source and destination are OLE DB connections to a SQL Server machine. I configured the source to use the MultiServer connection that would enumerate through the list of servers, and I configured the local DBA_Rep connection as the destination to hold the data. Source and destination columns are mapped together. The source can be an object, such as a table or view, or as in my package, it can be a SQL query to be used as a derived table.

For four of the five tables, I sent one query to select values and loaded the results from the remote servers into the local DBA_Rep database. To examine the tables' source queries, right-click the Data Flow object within the Foreach Loop container and select Edit. Then, on the Data Flow table, right-click the OLE Data Source object and select Edit again to display the source query. Web Listing 2 contains the query to load the Databases table.

I needed to handle the Database_Info table, which loads detailed information about each database, a bit differently. Because the SQL Server 2000 Master database stores basic information about each database in the sysdatabases tables, and each database stores the remaining important information, I needed to query each database individually. I could have used the stored procedure sp_MSForEachDB, which has been available since SQL Server 7.0, to easily accomplish this task. However, using this stored procedure doesn't return a solitary result set. I needed a full result set, so I considered other alternatives. Using temp tables or table variables would have given me the full result set I needed, but setting up and maintaining temp tables is difficult and requires special considerations. For example, you need to create the temp table beforehand, and you must set a value to retain the connection.

My solution was to query each database to return database-specific information in a table that I created in the TempDB database. The table I created wasn't a true temporary table with a # or ## prefix. Although the table resides in the TempDB database, its size and location have minimal effect on the source server. Web Listing 3 contains the code to create and populate this table, called HoldForEachDB. Notice the syntax of the sp_MSForEachDB stored procedure, which substitutes a question mark for the database name. This command is fairly useful, without requiring you to wrap logic into cursors to provide similar functionality. (Note: When you save the task, you might receive a message such as "The DROP table construct statement is not supported." However, the query will still delete the table and repopulate it with crucial database information such as status, recovery model, size, and updateability.)

Troubleshooting

The package runs on both SQL Server 2005 and 2000 servers. If no errors generate, the package will run in less than 2 seconds for 2 servers and in less than 2 minutes for 30 servers with varying amounts of data. If the package runs successfully and each package object turns green, you're ready to query the repository with custom queries and reports. Even if the package objects don't all turn green, don't worry—each task has a maximum error value that you can set to let the package continue running even if it can't connect to an individual server, for example.

Servers fail to connect for various reasons—for example, they might not be online, or the user running the package doesn't have access. You can use event handling to send an email message or write to a log to report server failures. Event handling, logging, and scheduling the SSIS package for automatic runs are beyond the scope of this article. For more information about these tasks, see SQL Server Books Online (BOL).

After you load the database, you can run a simple query such as the following:

SELECT \[Server\]
,\[ProductVersion\]
,\[ProductLevel\]
,\[IsIntegratedSecurityOnly\]
,\[EngineEdition\]
,\[Collation\]
,\[IsClustered\]
,\[IsFullTextInstalled\]
,\[SqlCharSet\]
FROM \[DBA_Rep\].\[dbo\].\[SQL_
Servers\]

This query provides the data that Table 1 shows, which includes useful information about the two servers I used to test the package.

You've got the Power

The package I created is just one example of what you can do with SSIS. Combining the variable and expression values gives you unprecedented control and efficiency in designing packages. For my package, I used variables and expressions to programmatically control a connection string to query multiple servers without needing to manually define each connection.

Discuss this Article 22

Waqar (not verified)
on Jun 11, 2007
I keep getting several errors, seemingly the largest is the problem with the Load Servers task. It always turns red with the OLE DB Source turning red too. I have tried the "delay validation" but it doesn't seem to work. What could be some possible problems and solutions?
rlandrum
on May 10, 2007
Haywood, The connection to QASRV.DBA_REP will be the connection to the repository database itself created from a script provided in the code download. I assume you have created this DBA_Rep database and the QASRV.DBA_REP is pointing to that database. Also, make sure that you can connect to the servers via Management Studio that are in the ADO recordset populated via sqlcmd /Lc in the ServerList_SSIS table. Let me know if I can be of more help getting this working in your environment. (Also, was their a specific SQL error such as "Server does not exist or access denied" message?) Rodney
David (not verified)
on Jan 15, 2008
at a minimum the article provides further examples of SSIS implementations. This should be good. Thanks dmb
rlandrum
on Jun 12, 2007
Note from author Make sure that you can connect to each of the server individually via SSMS with the same name as what is in the Serverlist_SSIS table that feeds the variable. Next, make sure you have permissions to access each of the servers in the list. Finally, check the Progress tab for the detailed error and also the Error List and Output windows. Rodney Landrum
rlandrum
on Sep 18, 2007
sgreene - I see your point. Though I stated on page 2 (web) how to set the Connection Manager object expression, I had the crucial details @[User::SRV_Conn] in a screen shot (Figure 8) - not in text. I am glad you figured it out. Yes..all SSIS users unite. I am working on a follow up article right at this moment that has many enhancements to this original SSIS package. Look for it in the upcoming few months. Thanks for the kind words. Rodney Landrum - author.
vonpato
on Feb 12, 2008
I was that DBA... ;) For me was better to get the list of the Available Sql Servers using (c#), to avoid use link servers or MultiServer connection to get the information: using Microsoft.SqlServer.Management.Smo.RegisteredServers; SmoApplication.EnumAvailableSqlServers(false) returns a DataSet Now I want to improve my project. References: http://msdn2.microsoft.com/en-us/library/microsoft.sqlserver.management.smo.smoapplication.enumavailablesqlser vers.aspx Thanks.
rlandrum
on Sep 18, 2007
rsqlmag, This could be caused by a number of things. Here are a couple to check out. Named Pipes is not enabled by default. If for whatever reason a Named Pipe connection is being tried for the 2005 servers you can enable it via the SQL Server Configuration Manager under SQL Server 2005 Network Configuration. Also, make sure that remote connections are allowed. This can be enabled with the Surface Area Configuration for Services and Connections. Set Remote Connections to use both, or if local connections only is checked select, allow local and remote connections. On SQL Server Express, Local connections only is the default if I am not mistaken. Hope this helps. Rodney Landrum - author.
skeleton
on Nov 21, 2007
Don't know if anyone has pointed this out, but we seemed to have trouble with named instances on servers with more than the default. Discovered that we had to use Servername and port instead of Servername\Instance. That seems to work.
BikeBoy
on May 26, 2007
one issue I see with sqlcmd /Lc is it will only return sql servers listening in its immidiate subnet, which are likely to only be a small subset if all sql servers.
rlandrum
on Jun 14, 2007
sqlmast, I would be willing to discuss the article, SSIS, Reporting Services et al with you and your DBA team. My e-mail is posted in the bio if you like to contact me directly. Thank You, Rodney Landrum - author.
rlandrum
on May 25, 2007
salagheh, Weblisting 3 contains the DDL to create the HoldForEachDB table in TempDB that is required for the validation to complete and for the package to execute locally. Also, after each restart of SQL Server, this table will be deleted because of it's reseidence in TempDB. This only needs to happen on machine where the SSIS package is executing. It will be dynamically created on each target server at runtime. To avoid this, you can also set each objects in the package to delay validation. I have found that this corrects the issue permanently with no need to recreate the HoldForEachDB table. Rodney Landrum (author)
sboorman
on Jun 7, 2007
I get an error of .....An OLE DB record is available. Source: "Microsoft SQL Native Client" Hresult: 0x800004005 Description: "Login timeout expired"
samuel (not verified)
on Sep 12, 2007
Great article, this is going to be immensly useful! One bit of info, not explicitly stated, that I was having trouble finding, is how the connection is dynamically set. You have the variable set, but never mention that. For those who might be having the same problem finding this vital bit of info(ssis beginners unite!), it is set in the connection manager as an expression: @[User::SRV_Conn].
rsqlmag
on Sep 6, 2007
Great Article...and appreciate the generosity for the code....:) I did a similar solution and my SSIS package just gets the backup info....But when I connect to SQL2005 servers I get this message...I am executing the SSIS packages from a JOB in SQL2005, the agent account is a domain account and it has access to all servers...I am able to get information for all 2000 servers but 2005 servers have this issue: Any input will be appreciated: Error: 2007-09-06 02:26:48.54 Code: 0xC0202009 Source: DatabaseBackupsForEachServer Connection manager "Source Conn" Description: SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80004005. An OLE DB record is available. Source: "Microsoft SQL Native Client" Hresult: 0x80004005 Description: "An error has occurred while establishing a connection to the server. When connecting to SQL Server 2005, this failure may be caused by the fact that under the default settings SQL Server does not allow remote connections.". An OLE DB record is available. Source: "Microsoft SQL Native Client" Hresult: 0x80004005 Description: "Communication link failure". An OLE DB record is available. Source: "Microsoft SQL Native Client" Hresult: 0x80004005 Description: "Named Pipes Provider: No process is on the other end of the pipe. ". End Error Error: 2007-09-06 02:26:48.55 Code: 0xC020801C Source: Data Flow Task Source [1] Description: SSIS Error Code DTS_E_CANNOTACQUIRECONNECTIONFROMCONNECTIONMANAGER. The AcquireConnection method call to the connection manager "Source Conn" failed with error code 0xC0202009. There may be error messages posted before this with more information on why the AcquireConnection method call failed. End Error
Waqar (not verified)
on Jun 13, 2007
I know that I can connect to each server and I know that I have the rights to the servers. The two top errors that I get are 1) [OLE DB Source [1]] Error: SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80040E14. An OLE DB record is available. Source: "Microsoft SQL Native Client" Hresult: 0x80040E14 Description: "Statement(s) could not be prepared.". An OLE DB record is available. Source: "Microsoft SQL Native Client" Hresult: 0x80040E14 Description: "'SERVERPROPERTY' is not a recognized function name.". 2) [Connection manager "MultiServer"] Error: SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80040E4D. An OLE DB record is available. Source: "Microsoft SQL Native Client" Hresult: 0x80040E4D Description: "Communication link failure". An OLE DB record is available. Source: "Microsoft SQL Native Client" Hresult: 0x80040E4D Description: "TCP Provider: An existing connection was forcibly closed by the remote host. ". An OLE DB record is available. Source: "Microsoft SQL Native Client" Hresult: 0x80040E4D Description: "Login failed for user '**********\***DBA'.". -I think that the main problem may be the connection manager in the second part. Maybe the login is wrong. I have a system where I can connect to the servers via windows authentication, but when I remote desktop out to the server, I need to use a special account called ***DBA. I have used '*' to denote company information. Could this be the problem with multiserver? If not I am open to suggestions. Finally, if the author would be interested, I would love to have a discussion, aka learn from the author, about SSIS and the configuration database. I know that it would most likely be a treat for my team to be able to arrange this.
abates100
on Jul 18, 2007
Awesome! Have been wanting something like this for quite awhile since we have been adding SQL instances at an alarming rate. Seemed like such a huge task because I haven't used SSIS much, but your instructions were excellent. Added steps to audit logins, roles and database users. In order to do that I did have to add a default value to the SVR_Conn variable. On to the reports! Thanks again.
salagheh
on May 25, 2007
I get the following error on some of my servers, is there a work around for this ? also is there a way to find out which server on the list is generating this error ? Error 1 Validation error. Load Databases: OLE DB Source [1]: SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80040E14. An OLE DB record is available. Source: "Microsoft SQL Native Client" Hresult: 0x80040E14 Description: "Statement(s) could not be prepared.". An OLE DB record is available. Source: "Microsoft SQL Native Client" Hresult: 0x80040E14 Description: "Invalid object name 'tempdb.dbo.HoldforEachDB'.". Populate_DBA_REP.dtsx 0 0
dmasterson
on Jan 6, 2009
Great article, thank you for sharing the code
rlandrum
on May 16, 2007
*****Note from Author**** Another possible resolution to the issue described in the first reader comment "Error at Populate_DBA_REP [Connection manager "MultiServer"]: SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80004005." There is a property setting on each of the objects in the package that will delay validation until runtime. I suspect that what is happening is that the package is trying to establish the connection to the MultiServer data source server, which is actually a variable set at runtime. If you set the delay validation object property to TRUE in the object properties (Load Servers Task for example), you should get past the validation issues. Set this property on each object that shows invalid and the package should execute.
andersos
on Nov 21, 2007
What do I do if the stored procedure I am running returns more than one recordset? I want to run a system stored procedure on many server (SQL Server 2000 and 2005). I want the data to go into tables that I can analyze on one server.
GORDON (not verified)
on May 8, 2007
I can't seem to get this to run. It complains about the MultiServer connection. I followd the article and everything _should_ work, but it isn't. Am I the only one who can't get it to work? I updated the reference to QASRV.DBA_REP to a valid server/db & table on one of my servers. I can populate the ADO recordset successfully, it errors on the ForEach loop... Error at Populate_DBA_REP [Connection manager "MultiServer"]: SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80004005. Also, the ForEach tasks that populate the tables fail validation, I assume because the MultiServer obejct hasn't been populated with a value yet...so it may be acceptable?
AnneG_editor
on Jan 29, 2008
dmbranscome, I'm glad you find the article valuable. Check out Rodney's follow-up article on his SSIS/SSRS solution in the Feb. issue of SQL Server Magazine, online at http://www.sqlmag.com/Article/ArticleID/97840/sql_server_97840.html.

Please or Register to post comments.

IT/Dev Connections

Las Vegas
September 30th - October 4th

Paul ThurottOur Experts will show you:
• Common SQL Server
Problems
• Best Practices for T-SQL
• SQL Server Integration
Services
• Database Development

Come See Mike Otey & Tim Ford in Person!

Early Registration Now Open

From the Blogs
May 21, 2013
blog

A Common Misconception about MAXDOP

Out of the box, SQL Server is (and has been) able to take advantage of multiple processors/cores without any effort on behalf of administrators....More
May 9, 2013
blog

My ISO 8601-Compliant Signature 2

My family recently just "officially" announced that we're in the process of adopting a child from South Africa. We're quite excited, of course, but there's a ton of paperwork to do—along with the need for gobs of signatures....More
May 8, 2013
blog

Use SSIS for ETL from Hadoop

In this blog post, Mark Kromer walks you through using SSIS as a way to use ETL techniques using Microsoft's Hadoop on Windows (HDInsight) as a source using Hive connectors...More
SQL Server Pro Forums

Get answers to questions, share tips, and engage with the SQL Server community in our Forums.