|
Oracle 10g AWR (Automatic Workload Repository) -Automatic SQL Tuning Feature-
|
Hi, everyone. We have talked about the issue to specify the problematic SQL by OEM (Priority SQL). However, since data is taken into v$active_session_history once a second, the correctness of wait time is questionable.
We would like to observe, from a different point of view, the SQL which is in the status of wait. The target tables contain the history of metric data about waits (the value measured in a certain period of time), and the value is not the sum but the difference.
V$EVENTMETRIC
The wait time by each event in the past 60 seconds is stored.
V$WAITCLASSMETRIC
The wait time by each event class in the past 60 seconds is stored.
V$WAITCLASSMETRIC_HISTORY
The wait time by each event class in the past 1 hour is stored every 60 seconds.
By the way, these tables are all based on the same table x$kewmevmv. Let's start with V$EVENTMETRIC. The test environment is the same as the one used last week.
OS :HP-UX hp11i B.11.11
Oracle:Release 10.1.0.2.0
Execute certain OLTP transactions in this environment
ŠThe same transactions as those we used last week
|
By the way, v$eventmetric does not keep the data for more than 60 seconds, probably because the amount of data keeps increasing. It is better to take the log manually periodically.
<Select the information on memory about waits event in the past 60 seconds from v$eventmetric>
select
n.wait_class
, n.name
, e.TIME_WAITED/100 TIME_WAITED -- alter the unit to "second"
, e.WAIT_COUNT WAIT_COUNT
from v$eventmetric e , v$event_name n
where e.event# = n.event#
and e.event_id = n.event_id
and e.wait_count > 0
and n.wait_class <> 'Idle' -- except for Idle event
order by e.time_waited desc
WAIT_CLASS NAME TIME_WAITED WAIT_COUNT
-------------------- ------------------------------ ----------- ----------
Commit log file sync .67 955
Application enq: TX - row lock contention .25 1
System I/O control file parallel write .11 26
Concurrency latch: library cache .1 55
Network SQL*Net message to client .06 33447
User I/O db file sequential read .05 19
Application SQL*Net break/reset to client .03 260
Other latch free 0 1
System I/O control file sequential read 0 10
Network SQL*Net more data from client 0 7
Network SQL*Net more data to client 0 4
|
Judging from the accumulated wait time for one minute, log file sync due to commit takes the longest time, 0.67 second. The waits occurs 955 times, and each wait takes about 0.0067 second. The lock wait (enq: TX - row lock contention), on the hand, occurs just once,
but it takes much longer than log file sync does (0.25 second).
The key problem of the whole system seems to be log file sync due to commit
Is that really?? If the process is patch process, in order to shorten the process time of the patch process, the tuning should focus on decreasing executing commit. However, if it is OLTP process, the key problem is that some user hase been kept waiting for 0.25 second due to lock wait.
In other words, it is necessary to consider the characteristics of transactions.
Simply speaking, in the case of OLTP process we should first deal with the SQL whose own wait is long, and in the case of patch process we should pay attention to the accumulated wait time first. Of course for real tuning, both aspects are important.
If the wait time at a certain instant is determined, it is possible to specify the wait event to identify SQL from v$active_session_history, under the condition that it was sampled.
Check CPU process time instead of wait time
We have only paid attention to the wait events, but now we would like to check how the CPU is used. v$eventmetric contains only data for 1 minute, so we check v$waitclassmetric_history and v$sysmetric_history. These tables keep the one-hour histories by each minute.
<CPU operating time and wait time per minute in the past 1 hour>
select
to_char(w.begin_time,'yyyymmdd hh24:mi') BEGIN_TIME
, n.wait_class WAIT_CLASS
, w.time_waited/100 TIME
from
v$waitclassmetric_history w,
(select distinct wait_class#, wait_class from v$event_name) n
where w.wait_class# = n.wait_class#
and n.wait_class <> 'Idle'
UNION ALL
select
to_char(begin_time,'yyyymmdd hh24:mi') BEGINE_TIME
, 'CPU Usage' WAIT_CLASS
, value*60/100 TIME
from v$sysmetric_history
where metric_name = 'CPU Usage Per Sec' and group_id = 2
order by BEGIN_TIME, TIME desc
BEGIN_TIME WAIT_CLASS TIME
-------------- -------------------- ----------
20040827 17:02 CPU Usage 24 ©CPU
20040827 17:02 Commit .9
20040827 17:02 System I/O .18
20040827 17:02 User I/O .1
20040827 17:02 Network .06
20040827 17:02 Application .03
20040827 17:02 Configuration .02
20040827 17:02 Other 0
20040827 17:02 Concurrency 0
20040827 17:02 Scheduler 0
20040827 17:02 Cluster 0
20040827 17:02 Administrative 0
20040827 17:03 CPU Usage 24 ©another minute
20040827 17:03 Commit .69
20040827 17:03 System I/O .08
20040827 17:03 Network .07
20040827 17:03 User I/O .04
20040827 17:03 Application .02
20040827 17:03 Concurrency .01
(eliminated)E
|
The CPU process time of 25 approximately corresponds to wait time 1. It seems like there is no serious problem. In an environment where a lot of disk I/O occur and disk busy rate remains 100%, the proportion of I/O wait time and CPU process time will be 1:1. In such an environment, tuning may help to accelerate the response.
Yoshihiro Uratsuji
|
|