You're absolutely right to question this - the current DELETE statement with the subquery is more complex than necessary for this use case. Using a direct WHERE clause with ROWID and the original condition would be more efficient and straightforward. The subquery approach was likely a remnant of trying to ensure correctness but introduces unnecessary complexity.
Here's why the simpler approach is better, and I'll provide the corrected version:
Why the Subquery is Less Optimal
- Redundant Table Access: The subquery forces Oracle to access the same table twice - once to select the PK values and again to delete them.
- Performance Overhead: The IN clause with a subquery is generally slower than a direct WHERE clause, especially for large datasets.
- Complexity: It’s harder to read and maintain when a simpler approach achieves the same result.
- ROWID Already Ensures Uniqueness: Since we’re chunking by ROWID, we don’t need the PK-based IN clause to identify rows - ROWID is sufficient.
Simpler Approach
Since DBMS_PARALLEL_EXECUTE provides :start_id
and :end_id
for each chunk, we can directly use these in the WHERE clause along with the original p_where_clause
. This avoids the subquery entirely and makes the deletion more efficient.
Here’s the corrected procedure with a simpler DELETE statement:
CREATE OR REPLACE PROCEDURE delete_records_parallel_with_comp_pk
(p_table_name IN VARCHAR2,
p_pk_columns IN VARCHAR2,
p_where_clause IN VARCHAR2 DEFAULT '1=1',
p_chunk_size IN NUMBER DEFAULT 10000)
IS
l_sql_stmt VARCHAR2(4000);
l_task_name VARCHAR2(100) := 'DELETE_TASK_' || TO_CHAR(SYSDATE, 'YYYYMMDDHH24MISS');
l_status NUMBER;
l_deleted_pks t_pk_tab; -- Assumes schema-level type t_pk_tab exists
l_temp_table_name VARCHAR2(30) := 'TEMP_DEL_PKS_' || TO_CHAR(SYSDATE, 'HH24MISS');
l_pk_col1 VARCHAR2(30);
l_pk_col2 VARCHAR2(30);
BEGIN
-- Parse composite PK columns
l_pk_col1 := TRIM(REGEXP_SUBSTR(p_pk_columns, '[^,]+', 1, 1));
l_pk_col2 := TRIM(REGEXP_SUBSTR(p_pk_columns, '[^,]+', 1, 2));
-- Create temporary table
EXECUTE IMMEDIATE '
CREATE GLOBAL TEMPORARY TABLE ' || l_temp_table_name || ' (
pk_col1 NUMBER,
pk_col2 VARCHAR2(100)
) ON COMMIT PRESERVE ROWS';
DBMS_PARALLEL_EXECUTE.create_task(task_name => l_task_name);
DBMS_PARALLEL_EXECUTE.create_chunks_by_rowid(
task_name => l_task_name,
table_owner => USER,
table_name => p_table_name,
by_row => TRUE,
chunk_size => p_chunk_size
);
-- Simpler DELETE statement
l_sql_stmt := '
DECLARE
TYPE t_pk_rec IS RECORD (
pk_col1 NUMBER,
pk_col2 VARCHAR2(100)
);
TYPE t_pk_tab IS TABLE OF t_pk_rec INDEX BY PLS_INTEGER;
l_pks t_pk_tab;
BEGIN
DELETE FROM ' || p_table_name || '
WHERE ' || p_where_clause || '
AND ROWID BETWEEN :start_id AND :end_id
RETURNING ' || l_pk_col1 || ', ' || l_pk_col2 || '
BULK COLLECT INTO l_pks;
FORALL i IN 1..l_pks.COUNT
INSERT INTO ' || l_temp_table_name || ' (pk_col1, pk_col2)
VALUES (l_pks(i).pk_col1, l_pks(i).pk_col2);
END;';
DBMS_PARALLEL_EXECUTE.run_task(
task_name => l_task_name,
sql_stmt => l_sql_stmt,
language_flag => DBMS_SQL.NATIVE,
parallel_level => 4
);
l_status := DBMS_PARALLEL_EXECUTE.task_status(l_task_name);
IF l_status = DBMS_PARALLEL_EXECUTE.FINISHED THEN
-- Use explicit SELECT with BULK COLLECT
SELECT t_pk_rec(pk_col1, pk_col2)
BULK COLLECT INTO l_deleted_pks
FROM ' || l_temp_table_name || ';
DBMS_PARALLEL_EXECUTE.drop_task(l_task_name);
EXECUTE IMMEDIATE 'DROP TABLE ' || l_temp_table_name;
DBMS_OUTPUT.put_line('Deleted ' || l_deleted_pks.COUNT || ' records');
ELSE
EXECUTE IMMEDIATE 'DROP TABLE ' || l_temp_table_name;
DBMS_PARALLEL_EXECUTE.drop_task(l_task_name);
RAISE_APPLICATION_ERROR(-20001, 'Parallel execution failed: ' || l_status);
END IF;
EXCEPTION
WHEN OTHERS THEN
BEGIN
EXECUTE IMMEDIATE 'DROP TABLE ' || l_temp_table_name;
EXCEPTION WHEN OTHERS THEN NULL; END;
DBMS_PARALLEL_EXECUTE.drop_task(l_task_name);
RAISE_APPLICATION_ERROR(-20002, 'Error: ' || SQLERRM);
END delete_records_parallel_with_comp_pk;
/
Key Improvements
-
Direct DELETE:
- Changed to
DELETE FROM ' || p_table_name || ' WHERE ' || p_where_clause || ' AND ROWID BETWEEN :start_id AND :end_id
- Uses the original
p_where_clause
directly with the ROWID bounds - No subquery needed
- Changed to
-
Efficiency:
- Single table access per chunk
- Leverages the ROWID chunking provided by DBMS_PARALLEL_EXECUTE
- Still captures the PKs via RETURNING clause
-
Correctness:
- Maintains the same filtering logic as intended by
p_where_clause
- ROWID ensures we only delete within the assigned chunk
- Maintains the same filtering logic as intended by
Prerequisites
- You still need the schema-level types created beforehand:
CREATE TYPE t_pk_rec AS OBJECT (
pk_col1 NUMBER,
pk_col2 VARCHAR2(100)
);
CREATE TYPE t_pk_tab AS TABLE OF t_pk_rec;
Why This is Better
- Simplicity: Easier to understand and maintain
- Performance: Fewer operations per chunk, no subquery overhead
- Correct Use of ROWID: Properly leverages DBMS_PARALLEL_EXECUTE’s chunking mechanism
You were right to point this out - the subquery was an unnecessary complication given that ROWID chunking already gives us precise control over which rows to delete in each parallel task.