|
One of the more
annoying features of database table
design for the end user is the case where multiple records of the same
data is added to a table (with a unique number - usually a version
number) rather than updating the original record. Although the multiple
versions of records in the one table provides an audit trail of all
changes to the record, it does cause problems when joining tables when
you are only interested in the latest version.
For example:
EG_TABLE
DATE HH VERSION VALUE LAST CHANGED ------------------- --- ------- ----- ------------------- Dec 15 1998 12:30AM 1 1 22.76 Dec 16 1998 04:35AM Dec 15 1998 1:00AM 2 1 23.44 Dec 16 1998 04:35AM Dec 15 1998 1:30AM 3 1 34.55 Dec 16 1998 04:35AM Dec 15 1998 1:30AM 3 2 34.55 Dec 22 1998 04:24PM Dec 15 1998 1:30AM 3 3 34.55 Dec 30 1998 11:54AM Dec 15 1998 2:00AM 4 1 23.43 Dec 16 1998 04:35AM Dec 15 1998 2:30AM 5 1 34.23 Dec 16 1998 04:35AM
For this fictitious
data we can see the values listed
for five time periods and that the value for half hour 3 has been
revised two times. If we were to link with this table without
accounting for the multiple revisions we will not get what we are
expecting if we only wanted the latest value for each time period.
The issue of
multiple revisions in a single table is a
very common problem with the database design as presently implemented
in the Australian National Electricity Market (NEM). Although NEMMCO
have attempted to alleviate the situation by implementing a number of
summary tables (usually designated with an "_D" at the end of the table
name) the problem still arises in many places.
The general
solution to the problem is to use a
sub-query to select the latest version of each record and then to use
the results of this sub-query in the join. For example with our
fictitious table listed, if we simply wish to list the latest versions
of the records in EG_TABLE, we could write the query as:
select * from EG_TABLE T0 where T0.VERSION = (select max(VERSION) from EG_TABLE T1 where T1.DATE = T0.DATE)
Resulting in the
following output:
DATE HH VERSION VALUE LAST CHANGED ------------------- --- ------- ----- ------------------- Dec 15 1998 12:30AM 1 1 22.76 Dec 16 1998 04:35AM Dec 15 1998 1:00AM 2 1 23.44 Dec 16 1998 04:35AM Dec 15 1998 1:30AM 3 3 34.55 Dec 30 1998 11:54AM Dec 15 1998 2:00AM 4 1 23.43 Dec 16 1998 04:35AM Dec 15 1998 2:30AM 5 1 34.23 Dec 16 1998 04:35AM
As a real world
example of joining this type of complex
table design as implemented in the NEM infoserver, a common requirement
for many applications is to join the various participant, station and
dispatch unit tables together to provide the basic dispatch and
ownership information for each unit in the market. Unfortunately, the
STATIONOWNER, STADUALLOC and DUDETAIL tables are all of the multiple
version in the one table design and must be treated accordingly. Thus,
to get each dispatched generator in the market from the NEM tables
directly we can then use the following query:
select P0.PARTICIPANTID, P0.NAME as PARTICIPANTNAME, S0.STATIONID, ST0.STATIONNAME, SD0.DUID, D0.REGISTEREDCAPACITY, D0.MAXCAPACITY, D0.CONNECTIONPOINTID from PARTICIPANT P0, STATIONOWNER S0, STATION ST0, STADUALLOC SD0, DUDETAIL D0 where S0.VERSIONNO = (select max(VERSIONNO) from STATIONOWNER S1 where S1.STATIONID = S0.STATIONID) and P0.PARTICIPANTID = S0.PARTICIPANTID and S0.STATIONID = ST0.STATIONID and S0.STATIONID = SD0.STATIONID and SD0.VERSIONNO = (select max(VERSIONNO) from STADUALLOC SD1 where SD1.DUID = SD0.DUID) and SD0.DUID = D0.DUID and D0.VERSIONNO = (select max(VERSIONNO) from DUDETAIL D1 where D1.DUID = D0.DUID) and D0.DISPATCHTYPE = 'GENERATOR' and not D0.STARTTYPE = 'NOT DISPATCHED' order by P0.PARTICIPANTID, S0.STATIONID, SD0.DUID
Although this query
works, it is not simple and is a
messy implementation. Fortunately NEMMCO have now provided a summary
table with the most commonly used information in the table
DUDETAILSUMMARY (to be implemented as part of the REPLICA 2.2 release)
to negate the need to perform this type of query for dispatch unit
information suitable for most applications.
DUDETAILSUMMARY Field name Data type Description
DUID VARCHAR2 Dispatchable Unit Identifier STARTDATE DATE Start date for effective record ENDDATE DATE End date for effective record CONNECTIONPOINTID VARCHAR2 Country wide - Unique id of a connection point REGIONID VARCHAR2 Region identifier that unit is in STATIONIDID VARCHAR2 Station or site that unit is in PARTICIPANTID VARCHAR2 Participant that owns unit during effective record period LASTCHANGED DATE Last date and time record changed
[Loss factors are also to be added to the table]
Unfortunately
tables with the multiple versions of
records in the one table design will persist in the NEM database design
and so the sub-query technique described above will still be useful
when you need to join to one of these tables to get the latest market
information.
|