Pages

Friday, June 18, 2010

Ensuring Data Quality and Governance Using Model Templates

In an ideal world the contents and purpose of any database table could be intuited solely on its name. Wouldn’t it help the development process to understand the content of a column based purely on its label? Wouldn’t data analysis be easily accomplished if system reports always implemented naming standards for all business objects on different systems? Wouldn't it facilitate data warehousing if similar data was always stored with similar datatypes regardless of the system in question? The answer, in every case, is yes. But this is easier said than done.

Whether you are a data architect designing a set of system tables, the database administrator who maintains that system, the development team who writes applications against an existing system, or even the business analyst that relies on report data to analyze current and future trends; the standardization of your design across your enterprise lends a level of transparency that facilitates work flow.

Data makes the world go round and the success of our business decisions, in large part, depends on the quality of our data. This has brought Master Data Management (MDM) to the forefront of every business discussion. As data stores continue to grow, the question becomes how to best ensure the transparency and quality of this data to all levels of the organization. The cornerstone of any MDM initiative should be a well defined data model.

A powerful data modeling tool can greatly facilitate this process by allowing you to define, maintain and share a set of corporate standards. Once these standards are defined, their reusability and convenient maintenance can save time and money.

The CA ERwin Data Modeler has an especially robust number of reusable features that can easily be shared as standalone template files as well as to the shared Model Manager repository. With that in mind, I am launching a series of articles relating to the reusability of design to ensure design integrity and data quality.

If you would like to try some of the exercises in this series, CA offers a limited Community edition of Erwin (available here) as well as a full version of the tool with a 15 day limited trial (available here).

In this first post I will be looking at the ability to define a reusable Domain object to define a column that changes depending on its context. Note that you can click on any image for a larger view.

First, we need to create a new model. We can define any physical database as our target server in this particular example. We create a new Domain in the Logical model and call it ID. Define the Domain Parent as Number. In the Datatype tab, define select INTEGER from the list.
In the general tab, notice that the default Name Inherited by Attribute is %AttDomain. This simply means that the domain name will be the domain name in a new occurrence of the attribute.
We modify this to read %OwnerEntity_%AttDomain:
This will prefix the domain name with the Owner entity. Note that this is not the Entity Name but rather the entity in which the column first occurs. This allows the name to remain constant as it migrates to other entities. So let us see the results.

Click OK to exit the editor. Now create two entities called Parent and Child.
Drag the ID domain into the parent column from the Model Explorer and release it. The result is:
Draw an identifying relationship from Parent to Child and note the migrating column name:
Now drag the ID domain into the Child table:
In this way, we can create domains that automatically follow a corporate standard to facilitate naming conventions. The Domain can further be customized to contain Defaults and Constraints and also to create UDPs and even to define default Data Warehouse rules on each column.
We will be covering the use of UDPs and implementation of Data Warehousing concepts in future entries in this series so stay tuned.

Tuesday, June 1, 2010

Filtering COLLATE Syntax using the CA ERwin FET


Welcome to the inaugural Maximum Data Modeling blog post. I look forward to using this blog to provide some best practices regarding the use of the CA ERwin Modeling Suite to maximum effect and hope to learn some new tricks, from the community, along the way. Without further delay, let’s get onto our topic.
Often, after reverse engineering a SQL Server 2005 or 2008 database environment, users find that the tool has collected the COLLATE property on many of the columns. Case in point, I have Reverse engineered the default ‘master’ database on a SQL Server 2005 server (NOTE: Click any image for better detail):

If I preview the SQL generated for the table spt_fallback_dev it looks like:
CREATE TABLE spt_fallback_dev
(
xserver_name varchar(30) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL , xdttm_ins datetime NOT NULL ,
xdttm_last_ins_upd datetime NOT NULL ,
xfallback_low int NULL ,
xfallback_drive char(2) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
low int NOT NULL ,
high int NOT NULL ,
status smallint NOT NULL ,
name varchar(30) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
phyname varchar(127) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL
)
go
The COLLATE syntax is valid and does exist on these objects. However, usually the COLLATE is defined as a server default and it is unnecessary and even improper to include it in the table definition since it may conflict with the environment to which we will be generating the script.
We will now create a simple solution for this using a Model Level UDP and a handful of added lines of code in the FET editor.
First, we define the UDP at the model level. It will be a List type UDP with values of No and Yes.
Next we will delve into the Forward Engineer Template editor. If you are not familiar with this editor, there is a demo video in the Videos folder of the Erwin Install path. This video is also available On Demand on the Erwin.com site.
In the FET editor select your sample Context to be the same table, spt_fallback_dev. Select the Create Entity from the list of templates. Notice that the Expanded Text includes the Collate statements.
Next, we are going to define a Global Flag depending on whether the UDP is set to yes or no. We do this by entering the following text into the Template Source editor right between ShouldGenerate and the beginning of the first code comment.
PushOwner
@if (Equal(Property("Model.Physical.COL COLLATION"),"Yes")){SetGlobalFlag("CollateFlag")}
Pop
Now our FET code looks like…
So what does that all mean?
First of all, we want to get to the model level but we are in the Entity so we PushOwner. The FET works like any stack and we push each time we want to climb a level. If you have ever used the ErwinSpy (and it is a must if you plan to do anything with the FET or the API) you can see exactly how many layers in the stack one object is from another.
Next, we have an if condition checking for equality between the UDP and the string “Yes”. If they match we set the global flag variable “CollateFlag” (the name is arbitrary).
The trickier part of the syntax is the UDP name. In general all model properties are simply referenced like Property(“”). But in the case of the UDPs the syntax is (1).(2).(3). So we end up with Model.Physical.COL COLLATION
We finish off this clause with a Pop statement to move back from our push position down to the Entity level.
That is the worst bit of it. We have just one step to go. Use the Find in the Template editor to locate the key word COLLATE. If you use a case sensitive match you will find this:
This is the code that is generating the COLLATE syntax. We need to make it conditional. We are going to do this by placing the entire thing within the if condition…
@if(IsGlobalFlagSet("CollateFlag")){ … }
Now our editor looks like:
Make sure to enclose the entire collate syntax within the {}. The IsGlobalFlagSet is a default macro that does exactly what you expect, it returns TRUE if the specific value is defined.
Notice that the table definition has already lost its COLLATION syntax (provided you left the UDP set to No).
Save the template file to your local drive. You cannot override the Default but you can save this as your standard FE template file once you have saved it to your drive.
Now, in your Forward Engineer editor use the Browse button to specify this new FET file:
Preview your good work: