Click to return to the HARD software home page Click to found out more about HARD software Click for HARD software's technical solutions products Click for the free HARD software downloads Click to visit the HARD software store Click for the HARD software recommended links
HARD software newsletter highlights HARD software newsletter highlights and useful information
       
Index
HARD software newsletter
Click to subscribe or unsubscribe to the HARD software newsletter to be kept up to date with HARD software developments and products.
Download any of the newsletters
Click to download the full text of any of the published HARD software newsletters.

Here are the technical tips from the previously published HARD software newsletters:

   

July 1999 edition

 

Joining tables with multiple versions of records

One of the more annoying features of database table design for the end user is the case where multiple records of the same data is added to a table (with a unique number - usually a version number) rather than updating the original record. Although the multiple versions of records in the one table provides an audit trail of all changes to the record, it does cause problems when joining tables when you are only interested in the latest version.

For example:

EG_TABLE

DATE HH VERSION VALUE LAST CHANGED
------------------- --- ------- ----- -------------------
Dec 15 1998 12:30AM 1 1 22.76 Dec 16 1998 04:35AM
Dec 15 1998 1:00AM 2 1 23.44 Dec 16 1998 04:35AM
Dec 15 1998 1:30AM 3 1 34.55 Dec 16 1998 04:35AM
Dec 15 1998 1:30AM 3 2 34.55 Dec 22 1998 04:24PM
Dec 15 1998 1:30AM 3 3 34.55 Dec 30 1998 11:54AM
Dec 15 1998 2:00AM 4 1 23.43 Dec 16 1998 04:35AM
Dec 15 1998 2:30AM 5 1 34.23 Dec 16 1998 04:35AM

For this fictitious data we can see the values listed for five time periods and that the value for half hour 3 has been revised two times. If we were to link with this table without accounting for the multiple revisions we will not get what we are expecting if we only wanted the latest value for each time period.

The issue of multiple revisions in a single table is a very common problem with the database design as presently implemented in the Australian National Electricity Market (NEM). Although NEMMCO have attempted to alleviate the situation by implementing a number of summary tables (usually designated with an "_D" at the end of the table name) the problem still arises in many places.

The general solution to the problem is to use a sub-query to select the latest version of each record and then to use the results of this sub-query in the join. For example with our fictitious table listed, if we simply wish to list the latest versions of the records in EG_TABLE, we could write the query as:

select * from EG_TABLE T0
where T0.VERSION = (select max(VERSION) from EG_TABLE T1
where T1.DATE = T0.DATE)

Resulting in the following output:

DATE HH VERSION VALUE LAST CHANGED
------------------- --- ------- ----- -------------------
Dec 15 1998 12:30AM 1 1 22.76 Dec 16 1998 04:35AM
Dec 15 1998 1:00AM 2 1 23.44 Dec 16 1998 04:35AM
Dec 15 1998 1:30AM 3 3 34.55 Dec 30 1998 11:54AM
Dec 15 1998 2:00AM 4 1 23.43 Dec 16 1998 04:35AM
Dec 15 1998 2:30AM 5 1 34.23 Dec 16 1998 04:35AM

As a real world example of joining this type of complex table design as implemented in the NEM infoserver, a common requirement for many applications is to join the various participant, station and dispatch unit tables together to provide the basic dispatch and ownership information for each unit in the market. Unfortunately, the STATIONOWNER, STADUALLOC and DUDETAIL tables are all of the multiple version in the one table design and must be treated accordingly. Thus, to get each dispatched generator in the market from the NEM tables directly we can then use the following query:

select P0.PARTICIPANTID, P0.NAME as PARTICIPANTNAME, S0.STATIONID, ST0.STATIONNAME,
SD0.DUID, D0.REGISTEREDCAPACITY, D0.MAXCAPACITY, D0.CONNECTIONPOINTID
from PARTICIPANT P0, STATIONOWNER S0, STATION ST0, STADUALLOC SD0, DUDETAIL D0
where S0.VERSIONNO =
(select max(VERSIONNO) from STATIONOWNER S1 where S1.STATIONID = S0.STATIONID)
and P0.PARTICIPANTID = S0.PARTICIPANTID
and S0.STATIONID = ST0.STATIONID
and S0.STATIONID = SD0.STATIONID
and SD0.VERSIONNO =
(select max(VERSIONNO) from STADUALLOC SD1 where SD1.DUID = SD0.DUID)
and SD0.DUID = D0.DUID
and D0.VERSIONNO =
(select max(VERSIONNO) from DUDETAIL D1 where D1.DUID = D0.DUID)
and D0.DISPATCHTYPE = 'GENERATOR'
and not D0.STARTTYPE = 'NOT DISPATCHED'
order by P0.PARTICIPANTID, S0.STATIONID, SD0.DUID

Although this query works, it is not simple and is a messy implementation. Fortunately NEMMCO have now provided a summary table with the most commonly used information in the table DUDETAILSUMMARY (to be implemented as part of the REPLICA 2.2 release) to negate the need to perform this type of query for dispatch unit information suitable for most applications.

DUDETAILSUMMARY

Field name Data type Description

DUID VARCHAR2 Dispatchable Unit Identifier
STARTDATE DATE Start date for effective record
ENDDATE DATE End date for effective record
CONNECTIONPOINTID VARCHAR2 Country wide - Unique id of a connection point
REGIONID VARCHAR2 Region identifier that unit is in
STATIONIDID VARCHAR2 Station or site that unit is in
PARTICIPANTID VARCHAR2 Participant that owns unit during effective record period
LASTCHANGED DATE Last date and time record changed

[Loss factors are also to be added to the table]

Unfortunately tables with the multiple versions of records in the one table design will persist in the NEM database design and so the sub-query technique described above will still be useful when you need to join to one of these tables to get the latest market information.

 
Return to top of page

February 1999 edition

 

Implementing SQL running total queries

A common reporting requirement for database queries in the energy industry is to show a running total of a quantity such as energy in a report. This can be easily achieved in a single SQL query without requiring an external program to sum the quantity for each time period.

Suppose that we have want to show the running energy results in MWh of dispatch unit MYUNIT for the 14 December 1998. We can write a simple SQL query that provides the running total energy (and will correct for that fact that each period is a half hour) from the Australian National Electricity Market (NEM) infoserver TRADINGLOAD table.

For those readers who are unfamiliar with the NEM infoserver TRADINGLOAD table, the table contains the average total MW dispatched over the half hour trading interval for a dispatchable unit, such as a single gas generator, identified by the DUID (the table also contains other data not considered here). An abbreviated version of the SQL table structure is then:

CREATE TABLE TRADINGLOAD
SETTLEMENTDATE datetime NOT NULL,
DUID varchar (10) NOT NULL,
PERIODID numeric(3, 0) NOT NULL,
TOTALCLEARED numeric(15, 5) NULL,
LASTCHANGED datetime NULL)

The following SQL query sums all of the TOTALCLEARED columns where the SETTLEMENTDATE date and time is less than or equal to the period's date and time and divides by two to get the energy in MWh units as required. Then for the settlement day of the 15 December 1998:

SELECT settlementdate AS 'Date', periodid AS 'Period', 
totalcleared AS 'Output',
(SELECT SUM(totalcleared) / 2
FROM tradingload AS D1
WHERE settlementdate > '15-dec-1998'
AND settlementdate <= '16-dec-1998'
AND D1.settlementdate <= D0.settlementdate
AND duid = 'MYUNIT') AS 'Energy MWh'
FROM tradingload AS D0
WHERE settlementdate > '15-dec-1998'
AND settlementdate <= '16-dec-1998'
AND duid = 'MYUNIT'

This SQL query produces the following output:

Date Period Output Energy MWh
-------------------- ------- ---------- ------------
Dec 15 1998 12:30AM 1 437.50000 218.750000
Dec 15 1998 1:00AM 2 429.89833 433.699165
Dec 15 1998 1:30AM 3 434.82667 651.112500
Dec 15 1998 2:00AM 4 425.00000 863.612500
Dec 15 1998 2:30AM 5 412.77333 1069.999165
Dec 15 1998 3:00AM 6 387.03333 1263.515830
Dec 15 1998 3:30AM 7 375.00000 1451.015830
. . . .
. . . .
Dec 15 1998 7:30PM 39 425.00000 8406.847495
Dec 15 1998 8:00PM 40 438.78500 8626.239995
Dec 15 1998 8:30PM 41 433.97500 8843.227495
Dec 15 1998 9:00PM 42 428.29667 9057.375830
Dec 15 1998 9:30PM 43 436.91667 9275.834165
Dec 15 1998 10:00PM 44 425.00000 9488.334165
Dec 15 1998 10:30PM 45 425.00000 9700.834165
Dec 15 1998 11:00PM 46 435.00000 9918.334165
Dec 15 1998 11:30PM 47 446.66667 10141.667500
Dec 16 1998 12:00AM 48 438.56833 10360.951665

(48 row(s) affected)

The same running total technique can be used for many useful quantities in energy markets, such as half hourly pool revenue, any form of energy data or a running count of the occurrence of events.

 
Return to top of page
 

Implementing SQL crosstab queries

Cross tabs are used to 'flatten out' data that appears in a table in a linear fashion. A good example of where this technique is useful is again with the TRADINGLOAD table of the Australian NEM market infoserver that was previously considered.

It is often useful to list the generation units for an organization in a single query including the total for each period. The following SQL query achieves this through the use of sub-queries to create a useful summary table for generation by each unit and the total generation (in MW) for the settlement day of the 15 December 1998:

SELECT DISTINCT settlementdate AS 'Date', periodid AS 'Period',
(SELECT SUM(totalcleared) FROM tradingload AS D1
WHERE settlementdate > '15-dec-1998'
AND settlementdate <= '16-dec-1998'
AND duid = 'UNIT1'
AND D0.periodid = D1.periodid) AS 'UNIT1',
(SELECT SUM(totalcleared) FROM tradingload AS D2
WHERE settlementdate > '15-dec-1998'
AND settlementdate <= '16-dec-1998'
AND duid = 'UNIT2'
AND D0.periodid = D2.periodid) AS 'UNIT2',
(SELECT SUM(totalcleared) FROM tradingload AS D3
WHERE settlementdate > '15-dec-1998'
AND settlementdate <= '16-dec-1998'
AND duid = 'UNIT3'
AND D0.periodid = D3.periodid) AS 'UNIT3',
(SELECT SUM(totalcleared) FROM tradingload AS D4
WHERE settlementdate > '15-dec-1998'
AND settlementdate <= '16-dec-1998'
AND duid = 'UNIT4'
AND D0.periodid = D4.periodid) AS 'UNIT4',
(SELECT SUM(totalcleared) FROM tradingload AS D5
WHERE settlementdate > '15-dec-1998'
AND settlementdate <= '16-dec-1998'
AND duid LIKE 'UNIT_'
AND D0.periodid = D5.periodid) AS 'TOTAL'
FROM tradingload as D0
WHERE settlementdate > '15-dec-1998' AND settlementdate <= '16-dec-1998'

This query then produces the following output:

Date Period UNIT1 UNIT2 UNIT3 UNIT4 TOTAL
--------------------------- ------- ---------- ---------- ---------- ---------- ----------
Dec 15 1998 12:30AM 1 437.50000 435.00000 437.50000 435.00000 1745.00000
Dec 15 1998 1:00AM 2 429.89833 403.37333 420.14500 398.12167 1651.53833
Dec 15 1998 1:30AM 3 434.82667 428.44500 431.92833 430.19500 1725.39500
Dec 15 1998 2:00AM 4 425.00000 425.00000 425.00000 425.00000 1700.00000
Dec 15 1998 2:30AM 5 412.77333 404.74833 414.21333 399.39833 1631.13332
Dec 15 1998 3:00AM 6 387.03333 375.00000 375.21500 375.00000 1512.24833
Dec 15 1998 3:30AM 7 375.00000 375.00000 375.00000 375.00000 1500.00000
. . . . . . .
. . . . . . .
Dec 15 1998 7:30PM 39 425.00000 425.00000 425.00000 425.00000 1700.00000
Dec 15 1998 8:00PM 40 438.78500 436.26833 438.78500 436.26833 1750.10666
Dec 15 1998 8:30PM 41 433.97500 426.51333 433.97500 426.51333 1720.97666
Dec 15 1998 9:00PM 42 428.29667 425.00000 428.29667 425.00000 1706.59334
Dec 15 1998 9:30PM 43 436.91667 428.49167 436.91667 428.49167 1730.81668
Dec 15 1998 10:00PM 44 425.00000 425.00000 425.00000 425.00000 1700.00000
Dec 15 1998 10:30PM 45 425.00000 425.00000 425.00000 425.00000 1700.00000
Dec 15 1998 11:00PM 46 435.00000 426.97333 435.00000 426.97333 1723.94666
Dec 15 1998 11:30PM 47 446.66667 446.37167 446.66667 446.37167 1786.07668
Dec 16 1998 12:00AM 48 438.56833 435.00000 438.56833 435.00000 1747.13666

(48 row(s) affected)

The above cross-tab technique can be used for any table where the data is organized linearly but and needs to be displayed across the page. However, each sub-query adds complexity to the query execution and so this approach may not be very efficient.

James Briggs from Mark Gurry & Associates contributed the following comment specifically for ORACLE users:

I saw your cross-tab query and realized in Oracle, with the use of the DECODE function I could reduce the query to the following statement:

SELECT
settlementdate "Date",
periodid "Period",
SUM( DECODE( duid, 'UNIT1', totalcleared, 0) ) "UNIT1",
SUM( DECODE( duid, 'UNIT2', totalcleared, 0) ) "UNIT2",
SUM( DECODE( duid, 'UNIT3', totalcleared, 0) ) "UNIT3",
SUM( DECODE( duid, 'UNIT4', totalcleared, 0) ) "UNIT4",
SUM(totalcleared) "TOTAL"
FROM tradingload
WHERE settlementdate > TO_DATE('15-dec-1998','DD-MON-YYYY')
AND settlementdate <= TO_DATE('16-dec-1998','DD-MON-YYYY')
GROUP BY settlementdate, periodid;

This would be spectacularly faster as it only has to run through the TRADINGLOAD records once, and not perform any joins.

 
Return to top of page
 
HOME | Company | Products | Downloads | Store | Links
Copyright © 2006 HARD software Pty. Ltd. ACN 087 953 839 - Email us!
Last changed: $Date: 2006/03/23 06:26:41 $