Part of our testing mechanism is to verify that a bank account and sort-code combination is valid and we retrieve the bank details for that combination (using a commercial account lookup service). So we couldn't just replace the sort-code and/or account number with random numbers.
Conversely, replacing all account numbers with a handful of known, "safe" working values would lead to skewed data spreads, so that wouldn't work.
A better solution is to shuffle the existing data, moving the sort-code and account number pairs to a different, unrelated client record.
We found a description of the Fisher-Yates Shuffle (one variety of which is also known as the Knuth Shuffle) which seemed like a good way to achieve our aim. It seems to be used quite widely to do exactly what we wanted, but I couldn't find any examples of implementing the shuffle in PL/SQL.
So, here's our solution:
set serveroutput on size unlimited
DECLARE
jswap NUMBER ;
jint NUMBER := 17 ;
lcount NUMBER := 0 ;
lrowcount NUMBER := 0 ;
TYPE tab_ba IS TABLE OF client_bank_accts%ROWTYPE INDEX BY BINARY_INTEGER;
array_ba tab_ba ;
CURSOR r IS
SELECT *
FROM client_bank_accts;
sav_banksortcd client_bank_accts.banksortcd%TYPE;
sav_bankacctnbr client_bank_accts.bankacctnbr%TYPE;
BEGIN
OPEN r;
LOOP
FETCH r
BULK COLLECT
INTO array_ba LIMIT 5000;
lrowcount := array_ba.COUNT;
FOR i IN REVERSE 1 .. array_ba.COUNT LOOP
jswap := trunc(dbms_random.value(1, i+1));
sav_banksortcd := array_ba(i).banksortcd;
sav_bankacctnbr := array_ba(i).bankacctnbr;
UPDATE client_bank_accts
SET banksortcd = array_ba(jswap).banksortcd,
bankacctnbr = array_ba(jswap).bankacctnbr
WHERE clientbankacctseq = array_ba(i).clientbankacctseq;
UPDATE client_bank_accts
SET banksortcd = sav_banksortcd,
bankacctnbr = sav_bankacctnbr
WHERE clientbankacctseq = array_ba(jswap).clientbankacctseq;
lcount := lcount + 2;
END LOOP;
EXIT WHEN r%NOTFOUND;
COMMIT;
END LOOP;
CLOSE r;
dbms_output.put_line('Completed. '||to_char(lcount)||' rows updated!');
END;
/
No comments:
Post a Comment