20 WHITE PAPER / Best Practices for Gathering Optimizer Statistics with Oracle Database 19c
The SYSAUX tablespace should be online, because the Job Scheduler stores its internal tables and views in the SYSAUX tablespace. Finally, the
JOB_QUEUE_PROCESSES parameter should be set to fully utilize all of the system resources available (or allocated) for the statistics
gathering process. If you don't plan to use parallel execution, set JOB_QUEUE_PROCESSES to 2 x the total number of CPU
cores (this is a per-node parameter in a RAC environment). Make sure that you set this parameter system-wide (using ALTER
SYSTEM ... or in the init.ora file) rather than at the session level (ALTER SESSION).
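For example, on a server with 12 CPU cores (an illustrative count), the parameter could be set system-wide as follows:

```sql
-- 2 x 12 CPU cores = 24 job queue processes (core count is illustrative)
ALTER SYSTEM SET JOB_QUEUE_PROCESSES=24 SCOPE=BOTH;
```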
If you are going to use parallel execution as part of concurrent statistics gathering, you should disable parallel adaptive multi-user:
ALTER SYSTEM SET parallel_adaptive_multi_user=false;
The Resource Manager must be activated using, for example:
ALTER SYSTEM SET resource_manager_plan = 'DEFAULT_PLAN';
It is also recommended that you enable parallel statement queuing. This requires the Resource Manager to be activated and the creation of
a temporary resource plan in which the consumer group "OTHER_GROUPS" has queuing enabled. By default, the Resource Manager
is activated only during the maintenance windows. The following script illustrates one way of creating a temporary resource plan
(pqq_test) and enabling the Resource Manager with this plan.
-- connect as a user with dba privileges
begin
  dbms_resource_manager.create_pending_area();
  dbms_resource_manager.create_plan('pqq_test', 'pqq_test');
  dbms_resource_manager.create_plan_directive(
    'pqq_test',
    'OTHER_GROUPS',
    'OTHER_GROUPS directive for pqq',
    parallel_target_percentage => 90);
  dbms_resource_manager.submit_pending_area();
end;
/
ALTER SYSTEM SET RESOURCE_MANAGER_PLAN = 'pqq_test' SID='*';
If you want the automated statistics gathering task to take advantage of concurrency, set the CONCURRENT preference to either AUTOMATIC or ALL. A
new ORA$AUTOTASK consumer group has been added to the Resource Manager plan used during the maintenance window to ensure that
concurrent statistics gathering does not consume too many system resources.
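The preference can be set and verified with the DBMS_STATS package; for example:

```sql
-- Allow the maintenance window task to gather statistics concurrently
EXEC DBMS_STATS.SET_GLOBAL_PREFS('CONCURRENT', 'AUTOMATIC');

-- Confirm the current setting
SELECT DBMS_STATS.GET_PREFS('CONCURRENT') FROM DUAL;
```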
WHEN NOT TO GATHER STATISTICS
Although the optimizer needs accurate statistics to select an optimal plan, there are scenarios where gathering statistics can be difficult,
too costly, or cannot be accomplished in a timely manner and an alternative strategy is required.
Volatile Tables
A volatile table is one where the volume of data changes dramatically over time. An orders queue table is a good example: at the start
of the day the table is empty, and as the day progresses and orders are placed it begins to fill up. Once each order is processed, it is
deleted from the table, so by the end of the day the table is empty again. If you relied on the automatic statistics gathering job to maintain
statistics on such a table, the statistics would always show that the table was empty, because it was empty overnight when the job ran.
During the course of the day, however, the table may contain hundreds of thousands of rows.
In scenarios such as this, you may choose to rely on high-frequency statistics gathering instead (see above), assuming your platform
and database version support it. Alternatively, it is often appropriate to gather a representative set of statistics for the table during the
day, when the table is populated, and then lock them. Locking the statistics prevents both high-frequency statistics gathering and the
automated statistics gathering job from overwriting them. This approach is especially applicable to highly volatile tables where the
number of rows grows and shrinks significantly during the course of the day.
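For example, statistics could be gathered while the table is representatively populated and then locked (the schema and table names here are illustrative):

```sql
-- Gather statistics at a representative point in the day, then lock them
-- so neither the automated job nor high-frequency gathering overwrites them
begin
  dbms_stats.gather_table_stats(ownname => 'SH', tabname => 'ORDERS_QUEUE');
  dbms_stats.lock_table_stats(ownname => 'SH', tabname => 'ORDERS_QUEUE');
end;
/
```

The statistics can later be unlocked with DBMS_STATS.UNLOCK_TABLE_STATS if they need to be refreshed.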
Alternatively, you can rely on dynamic sampling to gather statistics on these tables. The optimizer uses dynamic sampling during the
compilation of a SQL statement to gather basic statistics on the tables before optimizing the statement. Although the statistics gathered
by dynamic sampling are not as high quality or as complete as the statistics gathered using the DBMS_STATS package, they are often
good enough.
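One way to make the optimizer fall back on dynamic sampling for a volatile table is to delete its statistics and then lock the (now empty) statistics, so that no gathering job re-creates them; the optimizer dynamically samples tables that have no statistics. The names below are illustrative:

```sql
-- Remove the statistics and lock the empty state so the table
-- is dynamically sampled at statement compilation time
begin
  dbms_stats.delete_table_stats(ownname => 'SH', tabname => 'ORDERS_QUEUE');
  dbms_stats.lock_table_stats(ownname => 'SH', tabname => 'ORDERS_QUEUE');
end;
/

-- The sampling level can also be influenced per statement with a hint
SELECT /*+ DYNAMIC_SAMPLING(o 4) */ COUNT(*) FROM orders_queue o;
```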