Methods for collecting digital evidence
Contemporary information systems, such as eLearning, eGovernment, eUniversity, eVoting, and eHealth, are frequently used and misused for irregular data changes (data tampering). Those facts force us to reconsider our security measures and find a way to improve them. Proving a computer crime act occurred requires very complicated processes that are based on digital evidence collecting, forensic analysis, and an investigation process. Forensic analysis of database systems is very specific and demanding task, and it was main inspiration for writing this article. In this article you will find information about what digital forensic is and what kind of methods you can use for collecting digital evidence on SQL Server. Some of them are efficient and some are less efficient. Also, I will cover SQL Server Audit feature
Business processes produce a large amount of data in government agencies, universities, and enterprises on daily basis. Therefore, having a secure environment for storing a data is imperative. Cases in which data is maliciously modified (e.g., data tampering, data fraud, unauthorized data gathering) can produce serious, long-term consequences. Data tampering can be done with unauthorized access, and in some cases through authorized users. Results of that action can be unpleasant for both businesses and their clients.
For example, a highly “motivated” candidate for data tempering can be a medical person. A physician gives a diagnosis and prescribes therapy with or without the use of medications. Unintentional or intentional mistakes in that process can produce serious complications and can even result in the death of a patient. In order to cover his actions, the physician might try to modify a patient’s medical record and add some extra notes or prescription.
In this case, malicious data modifications could result in police actions and involve justice. The final goal of the investigation process is to reveal the identity of the criminal.
The first steps taken in the investigation of a crime is collecting and analyzing the evidence. Products of that process are facts that can help in solving a crime. From the aspect of “classic” crime actions that process is based on forensics. Computer crime is also a crime, but with different range of consequences. In those cases, we use digital or computer forensics to collect and scientifically examine information systems from all aspects. The outcome of that process is to determine the details about digital criminal activity.
Digital forensics is the most important part of investigation process because the collected facts need to be presented in a court of law. It’s based on the confiscation of digital devices, including PCs, laptops, cell phones, hard drives, and USB memory modules. The goal is to preserve, overview, analyze, and report the facts, therefore, the process of collecting, analyzing, and preserving digital data is based on scientific methods. Only evidence collected in this way is valid. Figure 1 shows the four phases of digital forensics. Now let’s look the types of problems you might encounter during the process of collecting data.
In order to have all relevant data that can be used in a digital forensics investigation, the system with the database backend needs to provide answer to three questions: who, when, and what (WWW). The only way to do this is via a strong audit-logging subsystem. Depending on the application architecture there’s a wide range of data that needs to be collected to satisfy the WWW principle.R2 and SQL Server 2008 have really good mechanisms and methods for collecting audit data. Only problem is defining the requirements and needs in the information system environment. To do so, let’s look at triggers, Log file backups, replication, Change Data Capture (CDC), and SQL Server Audit in detail.
Triggers. A trigger is a special type of stored procedure that automatically takes effect when a language event is executed. SQL Server includes three general types of triggers: DML triggers, DDL triggers, and logon triggers. They’re executed automatically based on a defined event. An event can be local (database level) or global (server level). Data Manipulation Language (DML) can be executed on any data modification event (e.g., INSERT, UPDATE, DELETE) on database tables. Some of the most interesting DML triggers are AFTER and INSTEAD OF groups. From the aspect of collecting digital evidence and the “what” question, AFTER triggers are very useful.
A special group of triggers are Data Definition Language (DDL). They can be executed on specific events when someone tries to change or create a database object (e.g., CREATE, ALTER, DROP statements). It’s very rare that a user application can modify database structure. But what if the administrator tries to destroy the evidence of a certain activity? A DDL trigger can provide evidence of that activity. In Listing 1 you can find an example of DDL trigger usage.
Logon triggers fire stored procedures in response to a LOGON event. This event is raised when a user session is established with an instance of SQL Server. Logon triggers fire after the authentication phase of logging is finished, but before the user session is actually established. You can use logon triggers to audit and control server sessions, such as by tracking login activity, restricting logins to SQL Server, or limiting the number of sessions for a specific login.
Log file backups. Each operation in database is transaction. To keep data integrity, the database engine first creates a record in the transaction log. After committing part of a transaction, data changes are entered into database. Figure 2 shows the basic concept of log files on SQL Server.
Log backups can be used on regular basis for collecting and keeping digital evidence of users’ activity. All data or object modification from the application level or database level is recorded in the transaction log and it can be used to prove criminal or illegal activity on the system.
Modifying log backups is very difficult to do without destroying them. Even DBAs aren’t able to do that without consequences.
The weak point of this method is the latency period between log backups. In this time frame, while modifications are still in the log and not backed up, a malicious person can edit the log file with special application tools. Because that backup doesn’t exist, deleting information from the active log file is irreversible.
Replication. Replication is a set of technologies for copying and distributing data and database objects from one environment to another and then synchronize between databases to maintain consistency (it can be heterogeneous). Basically, replication lets you have the same or just a subset of data from one database system in another. Most commercial database systems have built-in replication technology, and they can be divided into merge replication, snapshot replication, and transactional replication.
Merge replication lets various clients and devices work autonomously and later merge updates into a single, uniform result. Because updates are made at more than one node, the same data might have been updated by the Publisher and by more than one Subscriber. Because data changing is possible in either direction, this type of replication isn’t good enough for collecting digital evidence.
Snapshot replication distributes data exactly as it appears at a specific moment in time and does not monitor for updates to the data. For example, if a relatively small table is updated at the Publisher during the day, but some latency is acceptable, changes can be delivered nightly as a snapshot. This latency period of time is very problematic from the aspect of collecting digital evidence. A malicious user can modify and hide evidence of that activity during the latency period.
Transactional replication typically starts with a snapshot of the publication database objects and data. As soon as the initial snapshot is taken, subsequent data changes and schema modifications made at the Publisher are usually delivered to the Subscriber as they occur (in near real-time). This fact marks transactional replication as the best candidate, in replication technology, for collecting digital evidence. But the bottom line is that replication isn’t good enough or reliable enough to collect digital evidence.
Change Data Capture (CDC). This is new technology and it’s presented in SQL Server 2008. This feature was one of my favorites during the process of testing CTP versions. It’s designed to capture insert, update, and delete activity applied to SQL Server tables, and to make the details of the changes available in an easily consumed relational format. In principle, everything that can be done with triggers is now built into the SQL Server engine. However, CDC wasn’t primarily developed for the purpose of audit logging and collecting digital evidence. A good example of the target market is an extraction, transformation, and loading (ETL) application, which loads all the incremental changes in the tables in the data warehouse. In this way, large firms have the ability to monitor and analyze changes in the data, and by doing so, don’t affect production systems. However, this does not mean that the CDC can’t be used as a tool for collecting digital evidence of user activity. I recommend that CDC is a serious candidate for the collection of digital evidence. The SQL Server Books Online (BOL) image shown in Figure 3 shows the CDC process.
SQL Server Audit. SQL Server Audit is also a new technology featured in SQL Server 2008 and it’s in the Enterprise Edition only. It introduces an important new feature that provides a true auditing solution for enterprise customers. It offers centralized storage of audit logs, as well as noticeably better performance. Perhaps most significantly, SQL Server Audit permits fine-grained auditing whereby an audit can be targeted to specific actions by a principal against a particular object. For an example audit all members of db_datawriter role when executing UPDATE statement on students_grades table.
The ability to track who has accessed, or attempted to access, your data makes this feature one of the best candidates for collecting digital evidence. Also, it offers the ability to detect unauthorized access attempts or, if necessary, to piece together the actions of malicious insiders who misused their legitimate access.
Figure 5 shows the relationship between these elements in more detail.
Now let’s run through an example of how to activate the auditing feature on AdventureWorks.Person.Person object. We want to collect all data about who is viewing and modifying the data.
First, create an audit and define the initial parameters, such as the log file location, type of logging, and maximum size, as shown in Figure 6. Then, enabling auditing by right-clicking the audit and selecting Enable Audit, as Figure 7 shows. Next, select New Database Audit Specification, as shown in Figure 8. Now you’ll want to specify the elements that will be tracked and logged in the audit, as Figure 9 shows. Finally, select Enable Database Audit Specification, as shown in Figure 10.
Now you can work with the Person.Person table from AdventureWorks database. I recommend selecting some records and changing some data. Figure 11 shows you how you can view audit logs, and Figure 12 shows you how to use Log File Viewer to audit logs on the Person.Person table.
As you can see, digital evidence collecting can be a demanding and responsible job for any DBA. Your task is to evaluate audit-logging boundaries and try to keep record of what’s going on in your environment. When something goes wrong, you need to be able to respond quickly. SQL Server offers many features for collecting digital evidence, you just need to pick right one. In this article, I didn’t discuss the problem of audit log tampering, which is a big problem from aspect of relying on the accuracy of digital evidence, because that’s a whole other story.