Melissa Data provides mailing lists, mailing and data management software, and other products that can help companies improve the quality of their data. It now offers the Data Quality Components for SSIS, a suite of components that integrate into SQL Server Integration Services (SSIS). I recently had a chance to try the company’s Contact Verification Component, which you use with custom data (also provided by Melissa Data). With the component, you can validate and correct contact information, parse and reformat the data in intelligent ways, and enhance the information by providing additional data and removing questionable data.
Melissa Data recommends installing the component on a 64-bit Windows machine (Windows XP SP2 or later) running the 64-bit version ofR2 and Microsoft .NET Framework 3.5. Installation is pretty easy. It took me about 10 minutes on my Windows 7 machine, which has the 64-bit version of SQL Server 2008 R2 Enterprise Edition installed.
The first step is to install Setup.exe, which starts the installation wizard. The wizard installs the task and the data files that it relies on. Once installed, you must add the Data Flow task to the Toolbox in SSIS. If you have never done this before, don’t worry—it’s not hard and you only need to do it once. To add the task, open an SSIS project inDevelopment Studio (BIDS) and right-click anywhere in the Toolbox. From the drop-down menu, select Choose Items and a dialog box will open with multiple tabs along the top. On the SSIS Data Flow Items tab, select the Melissa Data Contact Verification check box. To finish the process, click OK. The Melissa Data Contact Verification task is now available for use.
To test the new task, I built a sample package using an OLE DB source. The first thing I tested was name parsing. On the Name tab, I selected FullName from the drop-down list in the Input Name section to indicate the field (i.e., database column) containing the names to parse. I then specified the names of the fields in which to place the output. The task not only parsed the full names into the five fields (prefix, first name, middle name, last name, and suffix) but also enhanced the data by adding two new fields containing the person’s gender and a salutation. You can even separate out a second name from the input name. An example of this would be turning the name John and Jane Doe into John Doe and Jane Doe.
Address verification works in a similar way to name parsing. On the Address tab, which Figure 1 shows, I selected the address1, address2, city, state, and zip fields in my database as the input and assigned field names for each possible output. The task did the rest. It parsed the suite and apartment numbers into their own fields, added the nine-digit ZIP code, added latitude and longitude coordinates, and even told me which addresses were private post-office boxes, like those rented from The UPS Store.
Figure 1: Verifying addresses in the Contact Verification Component
The Contact Verification task also verifies and enriches email addresses and phone numbers. A phone number can yield a lot of information, which you can use to enhance your contact information. For example, you can use the task to add the type of phone number (landline, cellular, or Voice over IP—VoIP), country, time zone, and a lot more.
When addresses and phone numbers are verified, the component provides result codes, which add an additional layer of information to each line of data. You can use these codes to define data as valid or invalid based on your business rules. For example, you could use the result codes to remove addresses that contain vulgarities or questionable names such as Donald Duck.
If you need to validate and enhance your company’s contact information, the Contact Verification component makes it almost effortless. The component is easy to install. More important, it’s easy to use, thanks to the integration with SSIS.
Contact Verification Component