Thursday 15 July 2010

Table Comparison Signature

We've been working on a data archiving project to move old data from the production schema in one database into an archive schema in a different tablespace, on a remote server.

We copy data from a production table (let's call it PROD.FINANCE_TAB into a pre-existing table in an archive schema (e.g. ARCH.FINANCE_TAB).  The fact that they are in a different database is actually irrelevant to the problem at hand.

If  PROD.FINANCE_TAB were to change structure (due to an added column or a datatype change), the archive transaction may fail.  So we need to test whether the table structure is the same in both schemas.

I Googled to take a look at how people usually do this and the result isn't pretty.  It usually takes three separate queries: one to find tables with the same columns, but different datatypes or sizes and two others to find tables with extra or missing columns.

This is cumbersome, but at least shows you what the differences are. In our case, though, we don't care what's different.  Just knowing that there is a difference is enough for us to stop processing the archive run and return control to the DBA.

So we came up with the following query which builds a quasi-hash value for the table structure:
select sum(
          ora_hash(COLUMN_NAME||
                   DATA_TYPE||
                   to_char(NVL(DATA_LENGTH,0))||
                   to_char(NVL(DATA_SCALE,0))||
                   to_char(NVL(DATA_PRECISION,0))
                  )
          ) t_hash
from all_tab_columns
where table_name = '&table_name'
and owner = '&owner';
           t_signature
----------------------
          141887801613
This signature is unique to the table structure.  It returns a different value if a column is added or dropped, or if its datatype, size or scale changes.

We based a function on the query which we could execute for each archive table in both PROD and ARCHIVE schema.  If the signatures matched, we could be confident that the table structure had not changed.