Multi-Threading Processes: The Concept
www.ProcaseConsulting.com
Phone: (905) 856-7479
Fax: (416) 352-1334
Multi-Threading Processes
Parin Jhaveri and Lev Moltyaner
1-Sep, 2003
In this article, we illustrate a tuning technique that splits a process into multiple concurrent sub-processes.
This multi-threading technique is one of the simplest, least intrusive, and yet most effective performance tuning techniques.
You can achieve significant performance gains without rewriting the entire process. However, it cannot be applied to
every process. It applies to processes that use an outer loop to handle every record from a cursor in a similar fashion.
Furthermore, it yields significant gains only on servers with multiple CPUs.
The Concept
The idea of this technique is to split a single process with many iterations of a loop into multiple concurrent processes, each
with fewer iterations. We will use our own application to demonstrate the concept. We use Oracle Payroll to provide payroll
services to the shipping industry. As a payroll service provider, we often need to upload new employees from new
customers into the Oracle Payroll application using a supplied API. Our process to load this data uses an outer loop on the
staging table and calls the API for every record in the staging table (see Figure A). The process does not scale because the
elapsed time increases linearly with the number of rows in the staging table. In fact, it takes more than a few hours to load
a few thousand employees.
[Figure A: the original process. A single loop reads the staging table emp_stg (employee information) and calls Create_oracle_employee_api for each record to load the Oracle Payroll tables.]
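[Figure: the multi-threaded process. Four concurrent calls, load_employee(1,1000), load_employee(1001,2000), load_employee(2001,3000) and load_employee(3001,4000), each loop over their own range of the staging table emp_stg (employee information) and call Create_oracle_employee_api to load the Oracle Payroll tables.]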
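[Figure: architecture of the solution. Main, in the Load Package, fills a PL/SQL table with one row per thread: the call (for example Load_employee(p1,,px,1,1000) and Load_employee(p1,,px,1001,2000)), a status and a message. The parent procedure of the Multi-Threading Package spawns the threads from the PL/SQL table with dbms_job.submit, then loops to check for thread success with dbms_pipe.receive. The database job queue runs the child procedure, which calls the target with execute immediate and reports back with dbms_pipe.send. The target, load_employee, calls the Oracle API and commits at the end.]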
begin
    -- Create_oracle_employee_api
  end loop;
  ..
end load_employee;
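Only the tail of load_employee survives above. As a minimal sketch of how a range-aware version might look, assume the two arguments seen in calls such as load_employee(1,1000) are the lowest and highest emp_id of the range. The parameter names p_low_emp_id and p_high_emp_id are hypothetical, and the API call is left as a comment because its signature is not shown in this article.

```sql
-- Sketch only: parameter names are assumptions, not the original code.
create or replace package loademp as
  procedure load_employee
    (p_low_emp_id  in number
    ,p_high_emp_id in number);
end loademp;
/
create or replace package body loademp as
  procedure load_employee
    (p_low_emp_id  in number
    ,p_high_emp_id in number) is
  begin
    -- Each thread sees only its own slice of the staging table.
    for lr_emp in (select *
                     from emp_stg
                    where emp_id between p_low_emp_id and p_high_emp_id
                    order by emp_id)
    loop
      -- Create_oracle_employee_api would be called here for lr_emp.
      null;
    end loop;
    commit;  -- one commit at the end of the thread
  end load_employee;
end loademp;
/
```

Because the ranges passed to different threads do not overlap, no two threads process the same staging rows.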
The next step is to add a new procedure that populates a PL/SQL table with one call to load_employee for each range of
employees. This PL/SQL table contains one record per thread. Each row holds a thread number, a call to the load_employee
procedure with all of its parameters, the status of the child procedure, and a message which may be returned by the load_employee
procedure. The PL/SQL table definition is as follows:
type ps_record is record
  (thread_number      number
  ,what               varchar2(255)
  ,child_status       varchar2(1)
  ,message_from_child varchar2(2000));
type ps_table is table of ps_record
  index by binary_integer;
begin
  select ceil(count(distinct(emp_id))
         /in_number_of_threads)
    into lv_chunksize
    from emp_stg;
  ..
    lv_rownum := lv_rownum + 1;
    if lr_empid.emp_id <> lv_prev_epid
    and lv_rownum >= lv_chunksize then
      lt_ps(lv_ps_index).what :=
        'begin loademp.load_employee('
        ||lv_low_wpid||','||lv_prev_wpid||');end;';
    ..
  end loop;
  ..
end main;
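The fragment above omits the cursor loop and the bookkeeping of range boundaries. The following anonymous block is a minimal sketch of the same chunking idea, not the original main procedure. It assumes emp_stg.emp_id is numeric, walks the distinct emp_id values directly (a simplification of the row counter used above), and prints the generated calls instead of submitting them. The variable names and the helper add_thread are hypothetical.

```sql
-- Sketch only: builds one load_employee call per chunk of employees.
declare
  in_number_of_threads constant pls_integer := 4;  -- example value

  type ps_record is record
    (thread_number      number
    ,what               varchar2(255)
    ,child_status       varchar2(1)
    ,message_from_child varchar2(2000));
  type ps_table is table of ps_record index by binary_integer;

  lt_ps        ps_table;
  lv_chunksize number;
  lv_count     number := 0;          -- employees in the current chunk
  lv_low_id    number;               -- first emp_id of the current chunk
  lv_prev_id   number;               -- last emp_id seen so far
  lv_ps_index  binary_integer := 0;

  procedure add_thread (p_low in number, p_high in number) is
  begin
    lv_ps_index := lv_ps_index + 1;
    lt_ps(lv_ps_index).thread_number := lv_ps_index;
    lt_ps(lv_ps_index).what :=
      'begin loademp.load_employee(' || p_low || ',' || p_high || ');end;';
  end add_thread;
begin
  select ceil(count(distinct emp_id) / in_number_of_threads)
    into lv_chunksize
    from emp_stg;

  for lr_empid in (select distinct emp_id from emp_stg order by emp_id) loop
    if lv_count = 0 then
      lv_low_id := lr_empid.emp_id;    -- a new chunk starts here
    end if;
    lv_count   := lv_count + 1;
    lv_prev_id := lr_empid.emp_id;
    if lv_count >= lv_chunksize then   -- chunk is full: record one thread
      add_thread(lv_low_id, lv_prev_id);
      lv_count := 0;
    end if;
  end loop;

  if lv_count > 0 then                 -- leftover employees form the last thread
    add_thread(lv_low_id, lv_prev_id);
  end if;

  for i in 1 .. lt_ps.count loop
    dbms_output.put_line(lt_ps(i).what);
  end loop;
end;
/
```

Using the actual emp_id values as range boundaries, rather than assuming ids are contiguous, keeps the chunks evenly sized even when the staging table has gaps in its numbering.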
Generic Algorithm
Finally, we define a generic package that does the following:
The generic package has two procedures: parent, which submits the threads, and child, which executes them. The parent accepts
the PL/SQL table as an input parameter from main.
The parent procedure opens a unique pipe to monitor all threads using dbms_pipe, submits a child procedure for each thread
using dbms_job, and waits for all threads to finish using dbms_pipe.receive_message, as shown here:
procedure parent (p_pst in out ps_table) is
  ..
      -- job text has the form: mt.child('<pipe name>',<child id>,'<call>');
      dbms_job.submit
        (p_pst(lv_child_id).thread_number
        ,'mt.child('''|| lv_pipe_name
         ||''',' || lv_child_id
         ||',''' || p_pst(lv_child_id).what ||''');'
        ,sysdate
        ,null);
    end loop;
  end if;
end loop;
The child procedure acts as a thin wrapper around load_employee. When executed by the job queue, it calls load_employee using
execute immediate and reports success or failure, with a message, to the parent using dbms_pipe.send_message, as shown here:
procedure child
  (p_pipe_name in varchar2
  ,p_child_id in number
  ,p_what in varchar2) is
begin
  -- append a terminating ';' and collapse any resulting ';;' to ';'
  execute immediate replace(p_what||';',';;',';');
  -- create message and send to parent
  dbms_pipe.pack_message(wrap_message(lr_pipe_msg));
  if dbms_pipe.send_message(p_pipe_name) <> 0 then
    -- error checking
  end if;
exception when others then
end child;
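Since both listings are abridged, here is a minimal, self-contained sketch of how such a package might be put together. It is not the original package. It assumes the types live in the package specification, packs three separate items into each pipe message (child id, status, text) instead of the wrap_message helper used above, and uses 'S' and 'F' as status codes. Those choices, the one-hour timeout, and the error numbers are all assumptions. The owner needs execute privilege on dbms_pipe.

```sql
create or replace package mt as
  type ps_record is record
    (thread_number      number
    ,what               varchar2(255)
    ,child_status       varchar2(1)
    ,message_from_child varchar2(2000));
  type ps_table is table of ps_record index by binary_integer;

  procedure parent (p_pst in out ps_table);
  procedure child
    (p_pipe_name in varchar2
    ,p_child_id  in number
    ,p_what      in varchar2);
end mt;
/
create or replace package body mt as
  procedure parent (p_pst in out ps_table) is
    lv_pipe_name varchar2(30) := dbms_pipe.unique_session_name;
    lv_child_id  binary_integer;
    lv_job       binary_integer;
    lv_pending   pls_integer := 0;
    lv_done_id   number;
    lv_status    varchar2(1);
    lv_message   varchar2(2000);
  begin
    lv_child_id := p_pst.first;
    while lv_child_id is not null loop
      -- Quotes inside "what" are doubled so it survives as a string literal.
      dbms_job.submit
        (lv_job
        ,'mt.child(''' || lv_pipe_name || ''',' || lv_child_id
         || ',''' || replace(p_pst(lv_child_id).what, '''', '''''') || ''');'
        ,sysdate
        ,null);
      lv_pending  := lv_pending + 1;
      lv_child_id := p_pst.next(lv_child_id);
    end loop;
    commit;  -- submitted jobs are picked up by the queue only after commit

    while lv_pending > 0 loop
      if dbms_pipe.receive_message(lv_pipe_name, 3600) <> 0 then
        raise_application_error(-20001, 'Timed out waiting for a child thread');
      end if;
      dbms_pipe.unpack_message(lv_done_id);
      dbms_pipe.unpack_message(lv_status);
      dbms_pipe.unpack_message(lv_message);
      p_pst(lv_done_id).child_status       := lv_status;
      p_pst(lv_done_id).message_from_child := lv_message;
      lv_pending := lv_pending - 1;
    end loop;
  end parent;

  procedure child
    (p_pipe_name in varchar2
    ,p_child_id  in number
    ,p_what      in varchar2) is
    lv_status  varchar2(1)    := 'S';
    lv_message varchar2(2000) := 'OK';
  begin
    begin
      execute immediate replace(p_what || ';', ';;', ';');
    exception
      when others then
        rollback;
        lv_status  := 'F';
        lv_message := substr(sqlerrm, 1, 2000);
    end;
    -- Items must be unpacked by the parent in the same order and types.
    dbms_pipe.pack_message(p_child_id);
    dbms_pipe.pack_message(lv_status);
    dbms_pipe.pack_message(lv_message);
    if dbms_pipe.send_message(p_pipe_name) <> 0 then
      raise_application_error(-20002, 'Could not notify the parent');
    end if;
  end child;
end mt;
/
```

Catching the failure inside an inner block means the child always sends exactly one message, so the parent's count of pending threads reaches zero even when a thread fails.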
Notice that we are using dbms_pipe to handle communication between parent and child procedures, and dbms_job to
submit child procedures for all threads. This makes our solution generic (i.e. all of the control logic resides in our separate
package).
Conclusion
The generic nature of the package allows it to be used for any process. We have utilized this technique for many processes
in our application. In all cases, we achieved nearly linear performance gains as we increased the number of threads.
However, there are several issues you should be aware of:
The number of threads that can run at one time is limited by the job_queue_processes parameter defined in the
init.ora file. For example, if job_queue_processes is set to 10, at most 10 SNP (job queue) processes can run at a
time. This means that if you run the procedure with 12 threads, only 10 will run concurrently and
the remaining two will wait until two of the others finish.
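As a quick check before choosing a thread count, the current limit can be read from v$parameter, and on releases where the parameter is dynamic it can be raised for the running instance. This is a sketch only. The value 12 simply matches the example above, and the statement requires the ALTER SYSTEM privilege.

```sql
-- Current limit on concurrently running queue jobs
select value
  from v$parameter
 where name = 'job_queue_processes';

-- Raise the limit for the running instance (example value only);
-- update init.ora as well if the change should survive a restart.
alter system set job_queue_processes = 12;
```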
There are several table-level parameters that may hurt performance when multiple processes
perform DML operations on the same table simultaneously. The two most critical parameters are FREELISTS and
INITRANS. If your process does a lot of INSERTs, we recommend that you set the number of FREELISTS to the
maximum number of concurrent processes for the table. INITRANS also needs to be increased to
accommodate multiple threads updating the same data block. It is hard to predict the maximum number of threads
that will update each data block. Setting this parameter unnecessarily high is not recommended because it reserves
space within each data block, which may have an impact on performance. As a guideline, you may consider
setting INITRANS to half the number of threads.
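For illustration, with 10 threads the guideline above translates to the following settings. The table name my_target_table is hypothetical. A changed INITRANS applies only to blocks formatted after the change, and FREELISTS is ignored for tables in tablespaces that use automatic segment space management, so treat this as a sketch to adapt rather than a recipe.

```sql
-- Assumes 10 concurrent threads inserting into a hypothetical table.
alter table my_target_table storage (freelists 10);  -- one free list per thread
alter table my_target_table initrans 5;              -- half the number of threads
```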
If your original process took an exclusive table-level lock, you must replace it with row-level locks using SELECT
FOR UPDATE. Furthermore, you must ensure that each thread acquires row-level locks on mutually exclusive
sets of data. If this is not the case, some threads will have to wait for others to release their locks, and this will
degrade performance.
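A minimal sketch of range-scoped row locking, assuming each thread is given a non-overlapping emp_id range as in the load_employee calls shown earlier. The cursor name and the literal range 1 to 1000 are for illustration only.

```sql
declare
  -- Locks only the rows in this thread's range, not the whole table.
  cursor c_emp (p_low in number, p_high in number) is
    select emp_id
      from emp_stg
     where emp_id between p_low and p_high
       for update;
begin
  for lr_emp in c_emp(1, 1000) loop
    null;  -- process the locked row here
  end loop;
  commit;  -- releases the row locks held by this thread
end;
/
```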
The performance gains are almost linear as long as the number of threads does not exceed the number of CPUs.
Thus we recommend setting the number of threads to the same as, or slightly less than, the number of CPUs. For
example, if you have 3 CPUs, create 3 threads; if you have 12 CPUs, create 10 or 11 threads.
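One way to find a starting point is to read the CPU count Oracle reports and cap it by the job queue limit discussed above. The query below is only a sketch: the column alias max_useful_threads is our own, and the result is an upper bound from which you may still want to subtract one or two threads on larger servers.

```sql
select least(
         (select to_number(value) from v$parameter where name = 'cpu_count'),
         (select to_number(value) from v$parameter where name = 'job_queue_processes')
       ) as max_useful_threads
  from dual;
```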