Hi all,
I'm facing a problem which I've spent a great deal of time trying to address, and although I have a solution, it's clunky and involves pl/sql processing, and I was wondering what others might come up with.
I'm working with a dataset that creates a new row every time a record is changed, thus maintaining a history. The most up-to-date version is then displayed in our application. Consider a table with the following data:
Person ID Last_Name Address_line1 Effective_Start_Date Effective_End_Date 4913 Jones 1 First Street 03-aug-02 31-dec-12 4913 Cross 1 First Street 01-feb-02 02-aug-02 4913 Cross 86 Green Avenue 01-mar-01 31-jan-02 4913 Cross 87 Devonshire Road 01-jan-90 28-feb-02
As part of a report, I need to extract the details which have changed between a given set of dates. For instance, say I want to extract the current address_line1
and the previous address_line1
along with the date of change (effective_start_date
when the new address was added). The caveat is that if other column data changes, this will create a new row too. For instance, in the example above, the last_name
changed after the address changed.
Unfortunatey, the query must be generic so that it can be run as part of a report,i.e. not having to specify explicitly the effective start and end dates.
Hope that all makes sense. Hopefully, you're all still with me. So, given the data-set above, I would expect to see the following results in my report:
Person ID Surname Address_line1 Prev_Address_line1 Effective Start date of New Address Line 1 4913 Jones 1 First Street 86 Green Avenue 01-feb-02
My approach involves processing with pl/sql and looping over a considerable number of records but I was wondering if this can be done in a single sql query.
Does anyone have any ideas on whether this can be done using only sql?
-
SELECT personID, surname, address_line1, LAG(address_line1) OVER (PARTITION BY personID ORDER BY effectiveDate) AS prev_address_line1 FROM mytable WHERE personID = :myid ORDER BY effectiveDate
To return the last effective value, use:
SELECT * FROM ( SELECT personID, surname, address_line1, LAG(address_line1) OVER (PARTITION BY personID ORDER BY effectiveDate) AS prev_address_line1, ROW_NUMBER() OVER (PARTITION BY personID ORDER BY effectiveDate DESC) AS rn, FROM mytable WHERE personID = :myid ) WHERE rn = 1
-
Don't Completely Understand your Question.
As part of a report, I need to extract the details which have changed between a given set of dates.
Versus
Unfortunatey, the query must be generic so that it can be run as part of a report,i.e. not having to specify explicitly the effective start and end dates.
1) Does your Report provide a start date and End date, and look for changes within the range?. 2) Or are you looking for a general report for all changes with no constraints on date?
Also
Assuming the Last Name Changed within the set of dates for which you are running the report.
1) In your expected result are you completely ignoring the Fact that the Last_Name Changed
2) Or will that be another row in your Report.
of the type
Person ID Surname Prev_Surname Effective_Start_date_of_New Surname 4913 Jones Cross 03-aug-02
3) Or is it even more complicated that you will capture all changes in one row.
Person ID Surname Prev_Surname Effective_Start_date_of_New Surname Address_line1 Prev_Address_line1 Effective_Start_date_of_New_Address_Line_1 4913 Jones Cross 03-aug-02 1 First Street 86 Green Avenue 01-feb-02
For a Report which shows each change, with the previous values for name and address. Try this. You can pick the last change by an outer select and choosing Change_rank =1
SELECT PERSON_ID, CHANGED, NEW_VAL, OLD_VAL, EFFECTIVE_START_DATE -- The Rank just lets you pick by order of change incase you only want the last change, or last 2 changes ,RANK () OVER (ORDER BY EFFECTIVE_START_DATE DESC) CHANGE_RANK FROM ( -- A select over Partition for each column where Old_val and New_Val need to be retrieved. -- In this Select the Address Column and Changed Address Are retrieved. SELECT PERSON_ID, 'ADDRESS_LINE1 CHANGE' CHANGED, ADDRESS_LINE1 NEW_VAL ,LEAD (ADDRESS_LINE1) OVER (PARTITION BY PERSON_ID ORDER BY EFFECTIVE_START_DATE DESC) AS OLD_VAL ,EFFECTIVE_START_DATE FROM TMP_RM_TEST -- Usually You want to run a report for a period, Or remove this entire filter WHERE EFFECTIVE_START_DATE BETWEEN TO_DATE (:REPORT_START_DATE, 'YYYYMMDD') AND TO_DATE (:REPORT_END_DATE, 'YYYYMMDD') UNION -- In this Select the Name Column and Changed Name Are retrieved. SELECT PERSON_ID, 'LAST_NAME CHANGE' CHANGED, LAST_NAME NEW_VAL ,LEAD (LAST_NAME) OVER (PARTITION BY PERSON_ID ORDER BY EFFECTIVE_START_DATE DESC) AS OLD_VAL ,EFFECTIVE_START_DATE FROM TMP_RM_TEST WHERE EFFECTIVE_START_DATE BETWEEN TO_DATE (:REPORT_START_DATE, 'YYYYMMDD') AND TO_DATE (:REPORT_END_DATE, 'YYYYMMDD') ) -- Since the Partition By lists all changes over Person_ID, Irrespective of whether the Column changed, Filter to Remove Lines where no change was made to the Coloumn monitored WHERE NEW_VAL <> OLD_VAL ORDER BY CHANGE_RANK
0 comments:
Post a Comment