Pages

Thursday, January 27, 2011

Model Lineage in ERwin Data Modeler

A fundamental issue with documenting and defining our data warehouse is to have a true documentation of the relationships between the different models within our design. There are powerful dimensional modeling documentation tools within CA ERwin Data Modeler. Aside from allowing users to document data sources, there is the ability to define linked models. These linkages can occur when a model is derived from another or explicitly linking models or adding models as sources.

What I have often heard, however, is that it would be nice to be able to see a report of all the related models. Well, you can. The caveat is that you need to be storing the models in the Model Manager. If you are, then this is simply a case of reporting on the objects. Here is a SQL Server query that generates the report:

SELECT oLib.ObjectName                   "Library",
       oChild.ObjectName                 "Derived Model",
       opdrv1.StringValue                "Source Model",
       CASE CHARINDEX ('?',opdrv.StringValue)
           WHEN  0 THEN opdrv.StringValue
           ELSE  SUBSTRING(opdrv.StringValue,0,CHARINDEX ('?',opdrv.StringValue))
       END
                "Source Model Path"
FROM   m7Object              obj
        INNER JOIN m7Library             oLib
            ON     obj.ContextId              =  oLib.ObjectId
        INNER JOIN m7Library             oChild
            ON     obj.ObjectId               =  oChild.ObjectId
        INNER JOIN m7Object              odrv
            ON     obj.ObjectId               =  odrv.ContextId
            AND    odrv.ClassId              =  1075839045
        LEFT OUTER JOIN m7ObjectProperty      opdrv
            ON     odrv.ObjectId              =  opdrv.ObjectId
            AND    opdrv.PropertyId          =  1075849184      
        LEFT OUTER JOIN m7ObjectProperty      opdrv1
            ON     odrv.ObjectId              =  opdrv1.ObjectId
            AND    opdrv1.PropertyId         =  1073742126      
ORDER BY oLib.ObjectName,
         obj.ObjectId
        
This query's results will look something like this:
 
OK, the trickiest part of the query is the CASE clause and I did not even need it. I used it since my initial query included the unique Id and version of the model in the Model Manager, such as...
ModelMart://MM73/source test/Source?lid={26105FEB-73E6-4C97-8693-307BD1BD5193}+00000000&mid={B3FEAD4E-61E3-420F-A99A-640507B1FC94}+00000000&ver=1

While this is necessary for the Model Manager, it is probably more than we need to see for our report. So, I needed some way of stripping off the '?' character and everything after it.

By the way, Oracle offers a very similar syntax using INSTR and SUBSTR. So logic similar to 


SUBSTR (StringValue,0,INSTR(StringValue,'?')) 

would give a similar result for an Oracle repository.

Next Steps:

Is there a way of generating a report that can show you related objects on an object by object basis? The answer is yes... but it would be very difficult. The problem is that there is an array that contains the Long Id (a large string, in hex) that contains a list of each related object and there is another array that contains the mapped objects. With very complex queries or using the API it would be possible to open two models and start loading the items on either side. I, personally, think this is a waste of time. Once I know what models are related I can open them in ERwin and use the Sync with Model Source editor along with the built in reporting tools to generate an Excel or HTML report of the linked objects. Let ERwin do the heavy lifting.

My personal next action step is to use the API to build a relational model that shows the relationships between the linked models. The models will be defined as Entities. Relationship lines will indicate related models. The path to each source and target model location will be held in an Entity UDP. 

I think that would be a more useful way to see the relationships between the models. Don't you? Tell me what you think.