Automating of SQL Diff-scripts

Introduction
While developing application which use SQL databases, one standard question appears - how keep DB in sync with the code and update it across dev-team, QA, Production?

Possible answers:
  1. Manually sync DBs by a tool. DBDiff, DB Compare, and so on.
  2. Write a diff scripts after each db change, and ship that scripts with an application. Than manually/automatically apply them one-by-one to a target database.
The option #1 if pretty simple, but its error-prone.
The option #2 is widely used in development.


Below, I will try to describe a example of option#2. 

Automating of SQL Diff-scripts
1. You may create a directory in you application for sql diff scripts, like: "myapplication\diff\". While development, you should add into that directory a new sql file each time you/team change the db. Also you need have a naming conventions about naming.
Example of some diff scripts for some virtual application:
 myapplication\Create.sql
    • CREATE TABLE dbversion("number" character varying(255) NOT NULL);
    • INSERT INTO dbversion('1900.01.01');
myapplication\diff\
  • 2012.01.13.sql
    • CREATE TABLE users ("id" int NOT NULL, "name" character varying(255));
    • UPDATE dbversion SET "number" = '2012.01.13';
  • 2012.01.14.sql
    • ALTER TABLE users ADD COLUMN "icon" character varying(255) NULL;
    • ALTER TABLE users ADD COLUMN "description" character varying(4000) NULL;
    • UPDATE dbversion SET "number" = '2012.01.14' ; 
  • 2012.01.28.sql
    • INSERT  INTO users ("id","name","icon","description") VALUES (1,'Administrator','admin.gif','Application administrator') ;
    • UPDATE dbversion SET "number" = '2012.01.28'; 

As you see, in this example I've named each file as a day when db was changed by a developer. Also, I've updated one row in some "dbversion" table which should match the diff file name.


2. You may create a separate tool which will automatically update any given Db or create the DB from scratch.
Creating of DB algorithm:
  1. Execute "myapplication\Create.sql" file.
  2. Execute all files in myapplication\diff\ one-by-one using diff name as order.
Updating of DB algorithm:
  1. Read the "actual DB version": "number" column of  "dbversion" table.
    • If table do not exists- call "Creating of DB algorithm".
  2. Execute only those files form myapplication\diff\ which have a name grater then "actual DB version" from step#1.
3. You may include a your update DB tool into your application. So, application will have own functionality to update/create DB.  Like, your application can read the config params and show error page if actual DB version is older then the code, then provide a link to the "upgrade page" on which administrator can start auto-update procedure or create db procedure.

Tips
  • For .Net:
    • You may need to include diff.sql files inside a dll. To do this - simple set file properties as "Internal Resource". Then use Reflection to browse diffs folder.
    • You may want to include DB Version checking on ApplicationLoad event. Be aware that if you throw any exception inside OnApplicationLoad event - application still be loaded! So users still can access to all you pages.Overcome, save check results in property and throw some "InvalidDbVersionException" in each OnRequestStarted event.
  • For Java
    • You may just inсlude diff.sql into JAR/WAR. But java reflection/resources can not browse folders inside zipped file. To solve this you need additionally add a file with diffs file names, read this file as resource, and then read the diff one-by-one as resources. To simplify this, you may improve build.xml file to do this automatically.
    • You may want to include DB Version checking on contextInitialized event (need implement ServletContextListener interface and mark @WebListener() or configure it in web.xml as  listener-class). Throwing of any exception inside contextInitialized will prevent deploying of the application in the application server (but you still can deploy it with "disabled"=true).


Conclusion.

Using manually written diff scripts have following advantages/disadvantages:
Advantages
  • Perfect for dev teams. Many developers can work with yours dev DBs and prepare and commit new diffs in shared code repository (svn, git).
    • Note in many developers work together you may need more advanced diff naming conventions, like: 2012.01.28_<file number per day>.sql:  2012.01.28_0.sql, 2012.01.28_1.sql, 2012.01.28_3.sql.
  • QA and Production can be simply configured to upgrade DBs automatically from applications folder (or other delivery method). This remove any mistakes while deployments.
  • In some cases its dramatically simplify application delivery/installations.
Disadvantages
  • Need some efforts to adopt this in your team.
  • Need manually write all DB diffs. 
    • Thus, each developer can generate it by using Db comparing tools.
  • Some times, there can be a huge number of diff files. 
    • This may be solved by aggregating many diffs into one huge like for each release. Aggregation can be by simple concatenation or by comparing different versions of DBs using a tool and producing one diff - in this case final diff can smaller, than by simple concatenation.
 I may opinion, any team "must-have" solution similar to diff scripts. Trues, some enterprise ORMs already have a features to achieve this automatically (give me a note if you know one).



Comments