Recently, I've been working a lot with VSTS DB Edition, in the hopes of fully-integrating database projects in my day-to-day development and unit-testing activities. As a whole, VSTS DB is an incredible tool, with a lot of promise. However, there are a few highly-noticeable gaps in this version 1.0 product, that make it slightly challenging to use. One critical missing feature is the ability to use data generation with a table intended to contain specific static lookup values. I found that the following use case was not possible using the out-of-the-box data generators:
-
Build and deploy the database locally from scratch
-
Load all lookup tables with static values
-
Load all other tables with randomly-generated data
-
Run all unit tests, which exercise code that ultimately expects certain values to be in lookup tables (e.g. an enumeration may map to values in a lookup table.)
The solution to this was to build a custom data generator, to handle static lookup values. To demonstrate this, I'll use a simple database schema with four tables:
-
User, Product, Category — all considered non-lookup tables (I'm including the Category table in this list, as this will be populated with dynamic information through normal operation of the app, and no code will be written that expects certain values to be in it).
-
UserType — a lookup table that must be pre-loaded with specific static values for the app to work properly.

To get setup, I import the database schema shown above into a SQL Server 2005 database project inside Visual Studio. I then set the following project properties, under the Build tab:
These settings ensure that I can re-build the entire database from scratch using only the SQL Server 2005 project, and that when deploying a local copy of the target database, it will always have a consistent name. I will be using this database as the target, when I run unit tests later.
Data Generation out of the Box
Data Generation Plans are an excellent feature in VSTS DB, that allow you quickly fill tables up with random data. This is accomplished using several built-in data generators, that can load columns with various numeric types or strings, and can even make use of regular expressions. Also, the developers who created the generators solved many gotchas that immediately spring to mind, by gracefully accommodating foreign key and unique constraints, nullable columns, and identity columns. So the first thing to do, is create a new data generation plan, select all non-lookup tables, and configure data generation for each column.
The screenshot below shows a data generation plan against this particular database schema. Demonstrated here are two things:
-
The plan correctly determined that the User.Username column has a Unique constraint, and will generate data accordingly.
-
It is impossible to completely disable data generation for the UserType table, due to the foreign key constraint from User. Because of this constraint, the data generation engine will fill the UserType table with random values, even though it has not been marked explicitly for generation.

Workarounds that don't Work so Well
There are a couple of workarounds available using the tools at hand. One of these is to pre-load the lookup tables with the desired values using one of the database projects scripts, such as the post-deploy script. Simply add insert statements to a post-deploy script, right-click the project, and select "Deploy". Now the database has been recreated, and pre-populated with lookup values.

The problem with this approach is that when you run data generation against the UserType table, there is no way to keep just those values in the table. If we look at the different options, none of them quite does the job:
-
Set the "Rows to Insert" property for UserType to 0, since it has already been pre-loaded with static values. In this case, data generation will fail for the User table, because the data generation engine is not intelligent enough to recognize the existing rows in UserType. Thus, it tries to insert null for User.UserType, and fails.
-
Leave the default Integer generator for UserType.Id and String generator for UserType.Description, and change "Rows to Insert" to a positive integer. This time, when running data generation, every value stored in User.UserType will correspond with one of the generated values — it will never select one of the pre-created values.
The next approach is look at using the supplied Data Bound Generator. With this generator, you can populate a column based on a connection string and select statement, by pointing it to some other table where your lookup values are stored. The problem is...how do you make it choose each row in that target table exactly one time? One thing you can do is change the "seed" property value, to affect which row the random generator chooses each time — if you painstakingly try every possible value (start at 1, then try 2, then 3, ...) you may be able to get it to choose each lookup value exactly one time. However, this only works with small lookup tables—I'm not sure of the magic threshold, but let's say something like 10 rows. I have tried working around this in myriad ways. For pre-loading, I tried using global and local temp tables, as well as a separate database, pre-populated with scripts. For selection, I tried selecting directly from a table, as well as running through UDFs and stored procedures, moving values in and out of different temp tables, to try and select each value only one time. At the end of the day, all of these methods failed me—it was a battle against the IDE, and the IDE won, often crashing in the process when entering anything non-standard as the select statement value for the data bound generator.
Custom Data Generation
To solve the issues with lookup tables, I built a custom data generator, for providing strings that map to lookup values (you could modify this to generate numeric types as well). The values are generated in the right order by specifying a select statement in the designer to pull them out ordered by the pre-inserted IDs (more on this later). You can download the source code, and binary here. There's not too much to creating one of these:
-
Inherit from Microsoft.VisualStudio.TeamSystem.Data.DataGenerator.Generator
-
If you intend to generate unique values (probably so), mark the class with [GeneratorStyles(DesignerStyles=GeneratorDesignerStyles.CanProduceUniqueValues)]
-
Mark one property with [Microsoft.VisualStudio.TeamSystem.Data.DataGenerator.Output]. This is where the unique values are sent for the data generation engine to pick up.
-
Add other properties as needed for input through the designer, marking these with the corresponding [Input] attribute. In my case, I added a SelectStatement and ConnectionString property to direct the generator to the right table source.
-
Override the OnInitialize method to retrieve the lookup values. In my case, I stored these to a field-level Queue.
-
Override the OnGenerateNextValue method to set the output property with the next value.
After building the custom generator, it needs to be installed. This is a bit of a pain, but here are the steps:
-
Sign the assembly with a strong name. Create an extensions XML file that references the fully-qualifed assembly name (I have included the sn keypair and xml file that I used in the download)
-
Copy the assembly to:
-
the PrivateAssemblies directory, default location = "C:\Program Files\Microsoft Visual Studio 8\Common7\IDE\PrivateAssemblies\"
-
Sometimes I have added it to the Extensions directory for good measure, not sure if this matters, default location = "C:\Program Files\Microsoft Visual Studio 8\DBPro\Extensions\"
-
Copy the CustomGenerators.Extensions.xml file to the DBPro directory, default location = "C:\Program Files\Microsoft Visual Studio 8\DBPro\"
Another option is registering the assembly in the GAC.
You will know the custom generator is installed correctly when it shows up in the list of available generators for character data columns.

Now, for it to actually work, we need to put all of the lookup values into a safe place, where we can pull them back out at data generation time. I favor storing these values in a separate database, in normal tables. I use scripting to pre-create this "support database", along with all tables and lookup data. So going back to the LoadStaticValues.sql script that is being run in the post-deployment, phase, it can be edited to perform these functions as shown below. This script is available in the download.

Finally, we are almost done. Going back to the data generation plan, we can set the generator for UserType.Description to a StaticLookupValue generator, with the following property settings:
Note the ordering clause on the SelectStatement property is being utilized to generate the values in the same order that they were inserted into the support database. Note also that the data preview will fail in the designer until the support database exists, meaning you have to deploy it first to get a working preview. Finally, be sure and set the "Rows to Insert" on the UserType table to the exact number of lookup values in the table (in our case, 3). Otherwise, it will run out of values (in the StaticLookupValue generator class, I handled this to degrade somewhat gracefully by having it output RanOutOfValues-{ordinal} once all values have been exhausted).
Run the data generation plan:
-
Click the "Generate Data" button
-
Select the deployed database connection string for the target (not the support database). In this case, the connection string is: server=(local);database=VstsTdd.Database.Deployed;Trusted_connection=true;
-
When asked if you want to clear all rows in selected tables, answer "Yes". This is because all necessary values will now be handled by the data generation plan.
When complete, each table should show a successful status, and selecting from the table will bring back the exact lookup values previously specified, in the right order.

Next Steps
Now that data generation is working properly with lookup tables, the next logical step is to have this particular data generation plan run automatically when you start a unit test project. Using the database unit test features that come with VSTS Database Edition, one is able to configure a unit test project to do exactly that, thereby assuring that the database has been reset to pre-defined known state, prior to running the first unit test.
