Three MySpace executives met with SQL Server Magazine to provide a candid look at their decision to use SQL Server for MySpace’s enterprise database platform and how the implementation has unfolded over time. SQL Server Magazine Technical Director Michael Otey, and Editors Sheila Molnar and Jeff James interviewed the leaders who spearhead the MySpace data effort—Hala Al-Adwan, vice president for data, Christa Stelzmuller, chief data architect, and George Tevelde, director of database administration—about how they chose SQL Server for rapid development, what it’s like to run a massive SQL Server implementation, and what tips and techniques they can pass along to organizations of any size.

 

SQL Server Magazine:

We’re interested in how MySpace made the decision to go with SQL Server over other enterprise database platforms.

 

Al-Adwan: I always get asked, "Why SQL Server?" MySpace chose to go with a Windows platform and SQL Server because it provides for a rapid development environment. That's something that doesn't get much attention, but compared to the open-source competitors, it's really easy to get up and running and developing rapidly on SQL Server. So that was why MySpace launched on a Microsoft platform, specifically on SQL Server 2000.

When we started in 2003, we had one instance of SQL Server running on one server. Everything was fine until we experienced more and more growth. Our first approach to scaling was a master/slave model, using transactional replication. We replicated our master read/write node to all of the slaves, which worked fine for us until we reached about two million users.

Around our two million user mark, we experienced latency issues with replication. That caused problems with the data being presented to users. We were still a two-tier architecture; we had web servers hitting the database directly. That was a key consideration when we talked about scaling.

We went with a vertical partitioning approach: We separated our features onto different servers; that worked until we hit the four million user mark. Then we had issues with users on multiple servers: How do you combine them when you're trying to build a single user experience? So we asked, "What’s the ideal way to combine everything we’ve learned using replication and vertical partitioning to come up with a more scalable model that allows us to keep a single user view on a single server?" We went with a feature-level partitioning model across all of our servers within MySpace, but within each one of those features, we try to be very thoughtful about the best way to scale horizontally.

The decision to scale horizontally happened early on, around the two million user range, when we decided to go to a model where we can add servers horizontally, rather than trying to scale out to a really large server. When you look at our user base, we have profiles and we have messaging and communication. We partition those functionally into different groups, and within each of those groups we partition again by user ranges. So we have databases for every million users, and we add more databases for every new set of million users. The application tier is aware of that and routes activity to the appropriate database depending on the user that is requesting it. So that, in a nutshell, is the evolution of SQL Server in the MySpace environment.

 

Stelzmuller: Today we either implement things as a federation (horizontally, partitioned across servers), or we implement them as load-balanced clusters, which are master/slave configurations, and we have standalone servers—many of our more critical features are running on high-availability clusters. So that's kind of how it maps out after all these years.

 

SQL Server Magazine:

You started on SQL Server 2000—over the years, which versions of SQL Server have you moved to?

 

Al-Adwan: We're currently running on SQL Server 2005 and are planning a migration to SQL Server 2008 Enterprise Edition.

Stelzmuller: We're currently on Standard Edition for SQL Server 2005, but our high-availability clusters are on Enterprise Edition.

 

SQL Server Magazine:

You mentioned that you have about a database per million users. So how many databases and servers are you using?

 

Al-Adwan: Our server footprint is about 450 servers. When you add up our user profile databases, with all the other databases that we have that support our various features sets, the load balance databases, and our messaging infrastructure databases, we're looking at over 1,200 databases. These are total generated accounts. Our monthly active user base is about 130 million. And we have about 250 or 270 million active user accounts.

 

SQL Server Magazine:

Are you using virtualization?

 

Al-Adwan: Our databases are not virtualized. We use virtualization very heavily in our development and staging environments. Currently, we don't have virtualization in production.

I worked in the SQL Server world for many years prior to MySpace—I'm naturally a data head—and a lot of issues we faced in smaller implementations are similar to the challenges we face at MySpace. We talk about our scale and it's pretty huge, but in my previous job, managing messaging and transactional integrity across multiple servers was a challenge, and we had a small number of servers. We still have the same issue even with 450 servers, a single user profile, and trying to manage the integrity of the transaction across all those systems. That affected us pretty heavily when we went to the federated model, and we tried to work around it by creating some level of data duplication, but that didn't help very much. We tried to move to a place in the middle tier where we did some rudimentary transaction-level management, specifically on the user level because they touch so many databases. We have a much more elegant solution right now, utilizing Service Broker within the database.

We were trying to provide a single user experience, but to deal with the same kind of data living in multiple places we utilized SQL Server replication. Certain types of data were replicated to every single one of the user profile databases. This worked well for us, until the 100 million user range. SQL Server replication works great when it works great, but when you have failures they tend to be dramatic and have cascading effects. We worked closely with the SQL Server team and gave them feedback on our issues. They worked closely with us to solve a lot of the bugs that we discovered within the replication system. We built an interim solution around the 100 million user range that helped us handle the load that a single distributor database would have to manage and eventually moved to a completely non-replication–based solution.

Stelzmuller: Because we started with the 2-tier system, we had a lot of read/write volatility. But the data integrity issue hit us pretty hard early on; the nature of the application environment wasn’t one where transactions were being managed across different physical destinations for the data. Data transactions weren’t atomic—they would succeed in one location and not succeed in another, which led to a really bad end-user experience.

In the early days of MySpace, a lot of those data issues manifested themselves as errors on the site. There were big red X's showing up when data wasn't where it was supposed to be. So we started looking at Service Broker. I’m pretty sure MySpace is handling one of the most extensive Service Broker implementations in SQL Server right now.

We use Service Broker to handle asynchronous rights in the database. The early issues we had with scale were simple time outs because of the massive number of connections coming into such a small number of servers. We loved Service Broker because it gave us the opportunity to write data in a way that we could handle it after the fact. It also allowed us the opportunity to start managing transactions across databases. That was critical because of the partition environment. A big drawback was that the initial version was only a unicast version of Service Broker, which meant that a physical route had to be established between each service and each database. When you're talking about an environment where a single database could potentially have to talk to several hundred databases, this created a maintenance problem.

 

We explored Service Broker options, such as dynamic route management, that address maintenance issues while allowing only unicast messaging. But it wasn't going to be good enough for us. We decided to build our own product called Service Dispatcher, which extends the Service Broker functionality. It’s really important to see that you can evolve the feature set of SQL Server. This is big in open source—that you can build in what you don't have—but we were able to build into SQL Server what we didn't have. We tackled centralizing route management and enabling multicast messaging. We also extracted a lot of the complex Service Broker components away from the developers to allow for a more rapid implementation of Service Broker–based applications. This has become the hub for most of the communication in our environment. It has completely replaced replication. In addition to managing transactions that come from the front end, it also handles transactions initiated from the back end. Plus, it's used in transactions that communicate to other tiers in the infrastructure. For us, a database isn’t just a point to receive data; it's also a living, breathing part of the architecture that communicates back with the middle tier when necessary.

To talk to the middle tier, we built Tier Hopper on top of Service Dispatcher. We started with Service Broker, built Dispatcher, and then built products on top of Dispatcher. We hope to share it with the community at large.

 

SQL Server Magazine:

What is it about Service Broker that allowed you to replace replication? Was it the asynchronous nature that allowed you to get away from the time outs?

 

Stelzmuller: Yes, it's the asynchronous nature of it. It’s more fault tolerant in the sense of recovery than anything else. It still uses a disk, but it's not using the log.

Tevelde: Recovery time is cut down. Several times we had to rebuild replication; it took six hours. Meanwhile, transactions were still happening on the database; we ran out of transaction log space, and we nearly lost data.

Al-Adwan: It’s difficult to recover from replication failure. We used to implement homegrown solutions, so we were super excited when SQL Server 2005 came out with Service Broker.

Stelzmuller: Service Broker is decentralized; every database in the environment acts as an initiator. When we centralize route management to all the other databases, that's also a farm of servers that we can easily add to, whereas adding to the distributor farm was not as easy as adding a new farm in our route management.

 

SQL Server Magazine:

What products from third-party vendors are you using to help you monitor, manage, and perform other tasks on all these databases?

 

Tevelde: We tried various types of monitoring software, but they couldn't handle our load, and couldn't handle the amount of data. Everything we have is homegrown. We looked at packages out there, saw what they did, and then reverse-engineered them into what we wanted them to be. We streamlined them—we don't have a UI at all, we just throw everything into a table and run our own queries on top of it.

We started extremely light: give us the process that uses the most CPU, the most memory, the most file I/O. Those are three of the biggest problems that we had. We found the procedures that had these problems, we solved them, and we found other bottlenecks. We have scripts that are constantly evolving.

Al-Adwan: The initial evolution of these scripts was a reaction to a problem, and trying to reduce the amount of time to respond to the problem. Over time, we've turned them into more predictive scripts: They try to identify the problem and a solution before involving a human.

 

Stelzmuller: Over the years that George's team has been developing these scripts, the amount of calls he’s gotten in response to database issues has gone from several times a night to once a week.

Tevelde: About three times a week. The scripts solve about 40 percent of the issues we have. The scripts detect which procedure it is and recompile it. Next, the scripts identify the issue—if it's a table, they'll get updated stats on the table, they'll go through another fragmentation (a very heavily fragmented index is common), and they'll automatically kick off a reorganization of the index. This all happens in real time—no human involvement. The script just goes through and says, "The easiest thing to do is recompile the stored procedures; the box is already having issues." It samples the CPU again—if that doesn't work, it updates statistics on the core table inside the stored procedure. If that doesn't work, it looks at the most fragmented \[table\] to see if it's over a certain threshold. Then it rebuilds all the indexes to make sure the table is running clean. If that doesn't work, the mod that monitors the site 24 hours a day gets an alert that there's an issue with a certain stored procedure; they call up the DBA, and he's able to go in there and look. We have our own data warehouse with a history of the last three months of the site. We can say, "OK over the last three months, here's a trend in the CPU, and something changed in the last three hours." Then we look at the last three hours. We can roll back the change, fix it, or call the developer and say, "We need to get together and work on this."

Al-Adwan: Trending is huge for us. We've invested in it over the past year, trying to not just identify when something has changed dramatically, but to also be able to detect levels of change in the trend we have. We use trending extensively when we talk about disk utilization, to understand the regular growth pattern for our data tables. As soon as that pattern varies off of the trend, alerts go out to our storage team; our storage team contacts the DBA or the developer directly working on those tables. They address the issue: Either we have to support more space for them or there's something wrong that has to be fixed. So the data developers, the DBAs, and the storage admins work very closely together. They all are part of my organization, they all participate in the design of the system, they all are involved in everything that goes on, and they are all held accountable to each other.

Stelzmuller: We looked at tools in the past for basic operations, such as deployment, because we roll out so many databases at once. No tool actually serves the need the way that we needed it to, so we built our own mechanism—ExecuteSQL. It's a tool that executes SQL as if it were running on a single database, except it's running against 500 databases. You can control the level of parallelization; you can control error behaviors; stop and kill behaviors inside this tool. We also allow for data movement inside this tool. We’ve integrated it with Dispatcher; we can run longer-term asynchronous processes against these databases, which we normally do if we're going to do a migration because we're refactoring a particular data architecture, or if we need to introduce intradatabase partitioning for any of our larger tables. The same tool will allow the parallelization of longer-running processes against our federation; we're able to control that from a single hub.

 

SQL Server Magazine:

What platforms are you running on? Are you mainly running SQL Server on 32-bit? Do you have some on 64-bit?

 

Tevelde: No, we're 64-bit all the way through. The 4GB limitation in 32-bit was the reason we needed to move on; the site was crashing. We switched to Windows Server 2003, SQL Server 2005 64-bit, and the CPUs dropped from 100 to 60 percent with nothing but upgrading the box.

 

SQL Server Magazine:

What size boxes are you running?

 

Tevelde: They're HP DL585s. The ones most frequently used are eight ways, four processors dual-core, and they run on 64GB of RAM.

 

SQL Server Magazine:

Those are some pretty capable boxes.

 

Tevelde: Yeah, we bring them to their knees. We're currently going to be moving on, as soon as we upgrade to Windows Server 2008 and SQL Server 2008. We're going to be using HP DL585, the G6 model, and it has 128GB of RAM and quad quad, so 16 ways.

Al-Adwan: We play around with how many databases we can run on a single server. We've been doing an analysis on various versions of the HP DL585 to see what our bottleneck is going to be depending on available hardware. We vary the number of databases we run on a server based on the class of server we're dealing with.

 

SQL Server Magazine:

What’s the size of the development team that adds all the extensions to SQL Server that you mentioned?

 

Stelzmuller: Within our data development team we have a data infrastructure team of three people. There are two data architects and one senior developer, but they are all capable developers. They do all of the development for these extensions that we build, and they're also responsible for the deployment tools and custom monitoring solutions for anything that has to do with Service Broker.

Al-Adwan: The database development team is about 24 people total, but they're mostly working on product-related features. We started with a two-tier architecture. We’ve evolved; we have an extensive caching tier that sits between our databases and our web servers. In some cases, it's a logical three-tier system, in some cases it's actually a physical three-tier system. We try very hard to move our databases out of the presentation logic, and we try to optimize our database code for delivery into cache, and optimize the cache to be the presentation and business logic tier. Tier Hopper communicates back to our middle tier around the caching of data.

Stelzmuller: It addressed our read/write volatility as well, because we focus primarily on serving writes, and not on serving reads. We leave that to our caching tier.

Tevelde: When we put on the latest service packs and hot fixes it's painful because any time we take an outage on a database, users aren't able to log on to the site and generate traffic. No traffic, no money. The biggest outage is patching, which takes up to two hours, and we do it two or three times a year. Each server, at 400, is down for six hours a year, so 2,400 hours a year—that's quite a lot. Microsoft said, "In Windows Server 2008/SQL Server 2008, you can set up a clustering environment. You're able to patch everything, including SQL Server on a passive node, reboot the box without upsetting any other server, then you can take one of the active nodes, sail it over to that passive node, and everything will run fine and you have another box you can patch." That's basically for high availability, but for us it was incredible. In Windows Server 2008/SQL Server 2008, all of our boxes are going to be clustered. We're going to set up a 10 active, one passive node structure. Our main goal is uptime. That small change brings another 2,000 hours of uptime to our site, just based on that clustering solution.

 

Al-Adwan: One of our key enablers to do this was our ability to move to Enterprise Edition.

 

SQL Server Magazine:

Were there other aspects of SQL Server that you had to augment or improve? Were there problems you ran into that you had to address that you haven't mentioned already?

 

Stelzmuller: Yeah, we run into all the problems that are common to the community. We make extremely heavy use of XML, and XQuery gets us very far but it doesn't get us far enough. It chokes eventually if documents get too large. It's necessary to create your own shredding CLRs to run in the environment; we've done things along those lines.

Al-Adwan: Anybody who's worked with SQL Server replication on some scale has had issues with reliability and with bottlenecks with a single distributor with replication. We went to a multilevel distributor system. We have one source database that replicates to four middle databases, but also has a distributor at each of those master databases targeting a specific number of databases. So rather than having one master go to 500 databases, you have one master going to 10 middle masters, each one going to its own subset of databases.

 

SQL Server Magazine:

You've mentioned that you've worked with Microsoft pretty closely to expand some of the features of SQL Server.

 

Al-Adwan: When I first started at MySpace and we had issues scaling with SQL Server, part of it was getting on the same page with Microsoft. Getting them to understand the level of experience and expertise we had and being able to identify our bottlenecks, and then being able to trust that we knew what we were talking about. We eventually got connected with the SQLCAT team, the SQL Server Customer Advisory Team—people who aren't just customer-facing, but who understand the SQL Server engine very well. They became our key contacts; it was the first step in forging a really strong partnership with Microsoft. We also have access to a lot of the SQL Server developers. They're interested in understanding how we use their databases and their tools, because our scale is obviously very exciting. So whenever we have labs at Microsoft, we meet developers and different groups in the SQL Server organization, so they can get our feedback on the tools and understand the things we want.

Al-Adwan: The SQLCAT team is one of Ted Kummert’s \[senior vice president of the business and platform division at Microsoft\] very strong investments in the SQL Server system. It's been a success case for us. We have a much better relationship on the SQL Server side than on the Windows platform side, specifically because the SQLCAT team facilitates that. The key with the SQLCAT team is that we have a technology relationship. They understand our implementation. It feels much more like a partnership than a vendor coming in to talk about their product.

Al-Adwan: And we do look at our relationship with the SQL Server team as the blueprint to build on the rest of our relationships with Microsoft.

Tevelde: One helpful thing is trending. We sample all the information we’ve accumulated every 15 minutes and keep the samples in a data warehouse on a separate server for three months. Even though the data is being taken in a reactive state, we can mark the trends and flip to be more proactive. The only issues that cause a serious outage now are hardware issues—where a hard drive goes bad inside the box, or where something you can't predict goes bad. Based on our scripts, if a box goes down, we move the logs to another box, and get up and running in about 20 minutes.

Al-Adwan: These scripts are all automated. Hardware failures are our biggest problem because we're on Standard Edition and not all of our infrastructures are clustered, but once we move to the clustering model, that's going to remove the tight coupling that we have right now with a database in a specific server, and allow us the flexibility to recover faster. We're really excited about that.

 

SQL Server Magazine:

How many companies would you say are at this level, using as many servers as MySpace?

 

Stelzmuller: Even in smaller implementations you encounter the same problems, just not as often as you do when they’re at the scale of servers that we have. We have to work hard to build all of these solutions because the problems happen often. It's a resource issue. But the tools we have would be useful to smaller implementations.

Tevelde: Take our DBA monitoring scripts: Whether you have 100 servers or one server, all these scripts are installed on every single box. So if all of our boxes were to drop offline, we'd still use these monitoring scripts to do exactly what we're doing for predictive analysis and problem solving.

Stelzmuller: Whether you're dealing with five DL585s or 100 DL585s, you're still going to face the same type of limitations in a single server or implementation. We just happen to have more iterations.

 

SQL Server Magazine:

So the scale you're operating at causes problems to surface more often than at a small-to-midsized company?

 

Tevelde: Right, if you have one or two boxes and there's 10 people using it, you're going to run into these problems maybe once a year. But when you have 400 boxes and they're being pushed to their limits, once or twice a week we'll see four or five of the same major issues. But because of these scripts, we're able to identify what these issues are. After we find the workaround, we add it to our monitoring system and baseline knowledge.

 

SQL Server Magazine:

It will be interesting to see how much of the give and take you have with the SQLCAT team ends up in the next version of SQL Server itself.

 

Stelzmuller: We encounter some bugs that the average implementation might not get around to. But the bugs are there for everyone, whether or not you hit them today or a year from now.

 

SQL Server Magazine:

What do you do to back up so many servers, and with related data on different servers, how do you handle it?

 

Al-Adwan: We use 3PAR \[3PAR.com\] for our SAN storage solution. We have about 14 production 3PAR frames to power all of our databases. We have 2 colos \[collocated servers\] where our data lives, and we have 14 production frames distributed across those colos. We have a set of four backup frames; we call those "near lines," and they're all running SATA \[Serial ATA\] disks. We have our near lines distributed in both colos. So from a backup perspective, within each one of our 3PAR frames, we take a snapshot of our database once daily. We maintain three versions of that snapshot on our live production databases, and then we replicate those snapshots to the other colo on our SATA drive. So we have three hot snapshots of the data on the production frames, and three backup copies on our near lines and office colo. We were very excited about that design—actually 3PAR liked it so much that they're thinking of marketing it to their other customers.

Stelzmuller: We heard from Microsoft that we’re the largest implementation of SQL Server from a transaction and data volume standpoint. It's really important to show what SQL Server is capable of handling. I know a lot of people tend to underestimate it as a platform, and I think that speaks to the limits of their imagination.

 

SQL Server Magazine:

A lot of people think of SQL Server as a medium-sized business play, but I think you're showing that it can be a pretty large business play.

 

Al-Adwan: We have a motto here—it's not about platform, it's about architecture. It's platform agnostic—you just have to know how to set it up.