|
Real Application Clusters
|
Last time, I talked about the fact that if you moved an application that is optimized in a
single instance to RAC, performance of the application was not scalable
but even slower.
We need to identify a root cause of this problem.
We can query a dynamic performance view V$SYSTEM_EVENT to look at system wait event.
In versions earlier than Oracle9i, you need to set the initialization parameter
TIMED_STATISTICS to TRUE to get system wait event. In Oracle9i Release2, TIMED_STATISTICS is
set to TRUE by default.
This V$VIEW also indicates total wait time, which enables you calculating the difference
between times of before and after the process is performed.
Now, I examine the value I got from the last time.
********************************************************
(Vol.137)
CASE INSTANCE NO SESSIONs EPS(*1) TPS(*2)
=======================================================
01 1 237 16,083 732.3
-------------------------------------------------------
02 1 256 5,453
2 256 5,362
----------------------------------------------
Total 512 10,815 489.4
-------------------------------------------------------
V$SYSTEM_EVENT
CASE INSTANCE NO EVENT WAIT WAIT time(sec)
=================================================================
01 1 log file sync 1,763,899 422,800 << (1)
latch free 456,638 38,245
buffer busy waits 244,583 16,821
enqueue 122,109 12,221
-----------------------------------------------------------------
02 1 enqueue 2,367,703 410,303 << (2)
buffer busy waits 3,025,078 314,120 << (3)
buffer busy global
cache 1,736,016 206,586 << (3)
global cache busy 1,042,919 95,160
latch free 4,551,655 75,086
--------------------------------------------------------
2 enqueue 2,341,917 408,503 << (2)
buffer busy waits 3,034,419 321,680 << (3)
buffer busy global
cache 1,748,311 213,001 << (3)
global cache busy 968,529 85,302
latch free 4,638,862 73,170
-----------------------------------------------------------------
********************************************************
|
Regarding CASE01, wait time of log file sync event marks high value (1).
log file sync event writes update data from log buffer to
online REDO log file when DBWR writes dirty blocks from database buffer
to a disk. This event occurs frequently when the user session does not commit sufficiently.
If you encounter large amount of log file sync events, check the commit process
What about CASE02 where data is updated in RAC?
Wait time of enqueue event marks high value (2).
We can identify which enqueue marks high wait time by examining V$ENQUEUE_STAT.
According to the V$ENQUEUE_STAT, TX enqueue marks high wait time.
********************************************************
EQ_TYPE TOTAL_REQ TOTAL_WAITS WAITTIME (sec)
--------------------------------------------------------
TX 4,682,405 1,892,191 419,433
HW 30,123 27,898 3,011
SQ 283 172 14
.
.
.
********************************************************
|
It is not only enqueue whose wait time is extremely long,
but also buffer busy wait and buffer global cache indicate
longer wait time (3).
As I have done to check enqueue, I identify the buffer whose wait time is long.
In Oracle9i Release 2, you can get the statistics of each object
from V$SEGMENT_STATISTICS view.
With this V$ view, you can get information about buffer busy waits
and transfer ratio via interconnect global cache current blocks served
in RAC.
I identify the object with heavy buffer waits by viewing the V$SEGMENT_STATISTICS.
Following is the result of CASE02.
********************************************************
(buffer busy waits)
OBJECT TYPE # of WAIT
---------------------------------------
IDX_LOGTBL1 INDEX 2,445,422
IDX_LOGTBL2 INDEX 2,074,474
LOGTBL2 TABLE 76,123
LOGTBL1 TABLE 37,154
PK_SESINFO INDEX 32,102
********************************************************
|
Wait occurs many times in log-type index.
Log type table above actually keeps transaction log.
However, if you index date and order, update process is centered on a single block.
Especially, you may index the data as a key in log-type table.
Now, take a look at the columns of index in this table.
As I have expected, date is inserted as index which causes block contention.
I view V$SEGMENT_STATISTICS to get the object information and get ITL waits
this time. (ITL = Transaction entry.)
********************************************************
(ITL waits)
OBJECT TYPE # of WAIT
---------------------------------------
IDX_LOGTBL1 INDEX 1,532
IDX_LOGTBL2 INDEX 1,210
PK_SESINFO INDEX 33
IDX_LOGTBL3 INDEX 1
PK_MSINFO INDEX 1
********************************************************
|
Transaction entry wait occurs in log type table. In other words, if contention occurs in
index transaction entry, wait occurs.
Actually, transaction entry wait occurs in CASE01 (six times in IDX_LOGTBL1) but this
is not a considerable value. You notice that total number of sessions is different between CASE01
and CASE02. So, I let 128 sessions connect to the data from both instances to let the total
concurrent connections be the same amount and analyze them again.
The result is as follows.
********************************************************
(ITL waits)
OBJECT TYPE # of WAIT
---------------------------------------
IDX_LOGTBL1 INDEX 524
IDX_LOGTBL2 INDEX 124
PK_SESINFO INDEX 16
********************************************************
|
Compared with CASE02, transaction entry wait reduces.
Compared with CASE01, on the other hand, transaction entry wait increases by 90 times.
Why buffer busy waits and ITL waits increase in CASE02?
I will talk about this next week.
That's it for today.
Tadashi Yamashita
|
|