Recently I faced a peculiar issue of high consistent gets on the Active Data Guard side for a query.
The query had the exact same plan on both the sides - primary and standby.
I verified - the consistent gets for both the sides using autotrace.
The peculiar thing I noticed, whenever I verify consistent gets for ADG site - they keep fluctuating, they are definitely high from primary, but yes keep on fluctuating.
Active Data Guard site -
set autotrace traceonly explain statistics
Execution Plan
----------------------------------------------------------
Plan hash value: 4229718575
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
1596127 consistent gets
0 physical reads
0 redo size
14215 bytes sent via SQL*Net to client
535 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
Primary site -
Execution Plan
----------------------------------------------------------
Plan hash value: 4229718575
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
946319 consistent gets
0 physical reads
0 redo size
14215 bytes sent via SQL*Net to client
535 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
Elapsed: 00:00:02.19
I took a trace of the sessions and used tkprof to verify it -
Active Data Guard site -
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.01 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 9.84 9.86 0 1585661 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 9.85 9.87 0 1585661 0 1
Primary site -
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.01 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 2.43 2.45 0 946318 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 2.44 2.46 0 946318 0 1
Finally, I had to trace the session level statistics to understand, what was the key component to consistent gets...
I trimmed down my analysis to undo related statistics.
Active Data Guard site -
SELECT a.username, c.name, b.value
FROM v$session a, v$sesstat b, v$statname c
WHERE a.sid = b.sid
AND b.statistic# = c.statistic#
AND b.value != 0
AND a.sid = 3605
AND c.name in ('session logical reads','consistent gets','consistent gets from cache','consistent gets - examination','consistent changes','CR blocks created','data blocks consistent reads - undo records applied','IMU CR rollbacks');
Before Running Query
USERNAME NAME VALUE
------------ ------------------------------- ----------
SYS session logical reads 49
SYS consistent gets 49
SYS consistent gets from cache 49
SYS consistent gets - examination 19
Post Running Query
USERNAME NAME VALUE
------------- ---------------------------------------------------- ----------
SYS session logical reads 1869022
SYS consistent gets 1869022
SYS consistent gets from cache 1869022
SYS consistent gets - examination 922675
SYS consistent changes 556329
SYS CR blocks created 556139
SYS data blocks consistent reads - undo records applied 43
Primary site -
Before Running Query
USERNAME NAME VALUE
------------ ------------------------------- ----------
SYS session logical reads 3
SYS consistent gets 3
SYS consistent gets from cache 3
Post Running Query
USERNAME NAME VALUE
------------- ---------------------------------------------------- ----------
SYS session logical reads 946382
SYS consistent gets 946382
SYS consistent gets from cache 946382
SYS consistent gets - examination 24
SYS consistent changes 5
SYS CR blocks created 4
SYS data blocks consistent reads - undo records applied 5
If you see a lot of undo block applied and consistent gets examinations are there on the standby site.
Finally with due - diligence and working with Oracle we set the parameter _serial_direct_read=false on the Active Data Guard site
Remember this is default on primary site still.
This parameter is more related to use on Exadata for disabling smart scans, however it does have a use in non-Exa environments as well.
Session Level Statistic - ('table scans (direct read)')
This increments only by 1 for 1 query execution
USERNAME NAME VALUE
------------------------------ ---------------------------------------------------------------- ----------
SYS table scans (direct read) 1
The explanation is as below -
Oracle will first issue fast checkpoint at object level to write all dirty buffers of that object to the disk and perform direct path reads in to the PGA of shadow process.
This is default behavior based on 11G databases based on size of SGA, in order to use direct reads for read of data.
This behavior is based on size of SGA, size of object, buffer cache size of the instance.
Prior to 10G direct reads went via buffer cache and from 11G direct reads go directly via PGA bypassing Buffer Cache.
You can see below from the slow session, I could see what was happening -
Fast Object Check point and direct path reads.
The fast object check point in itself is taking around 2.7 seconds and finally there is time taken for direct path read.
The case of primary - the data is completely read from Buffer Cache and no data is via direct path reads.
*** - Below output is from Slow side - ADG site - ***
PARSE
WAIT #140527153926352: nam='reliable message' ela= 1452 channel context=322148229376 channel handle=323221996856 broadcast message=322692222808 obj#=-1 tim=1490721285371785
WAIT #140527153926352: nam='enq: KO - fast object checkpoint' ela= 2581 name|mode=1263468550 2=65610 0=1 obj#=-1 tim=1490721285374623
WAIT #140527153926352: nam='enq: KO - fast object checkpoint' ela= 134 name|mode=1263468545 2=65610 0=2 obj#=-1 tim=1490721285374858
WAIT #140527153926352: nam='Disk file operations I/O' ela= 345 FileOperation=2 fileno=11 filetype=2 obj#=40842 tim=1490721285375461
WAIT #140527153926352: nam='Disk file operations I/O' ela= 171 FileOperation=2 fileno=0 filetype=15 obj#=40842 tim=1490721285375749
WAIT #140527153926352: nam='Disk file operations I/O' ela= 20 FileOperation=2 fileno=0 filetype=15 obj#=40842 tim=1490721285376553
WAIT #140527153926352: nam='direct path read' ela= 58 file number=11 first dba=3184329 block cnt=15 obj#=40842 tim=1490721285377218
The query had the exact same plan on both the sides - primary and standby.
I verified - the consistent gets for both the sides using autotrace.
The peculiar thing I noticed, whenever I verify consistent gets for ADG site - they keep fluctuating, they are definitely high from primary, but yes keep on fluctuating.
Active Data Guard site -
set autotrace traceonly explain statistics
Execution Plan
----------------------------------------------------------
Plan hash value: 4229718575
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
1596127 consistent gets
0 physical reads
0 redo size
14215 bytes sent via SQL*Net to client
535 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
Primary site -
Execution Plan
----------------------------------------------------------
Plan hash value: 4229718575
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
946319 consistent gets
0 physical reads
0 redo size
14215 bytes sent via SQL*Net to client
535 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
Elapsed: 00:00:02.19
I took a trace of the sessions and used tkprof to verify it -
Active Data Guard site -
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.01 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 9.84 9.86 0 1585661 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 9.85 9.87 0 1585661 0 1
Primary site -
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.01 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 2.43 2.45 0 946318 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 2.44 2.46 0 946318 0 1
Finally, I had to trace the session level statistics to understand, what was the key component to consistent gets...
I trimmed down my analysis to undo related statistics.
Active Data Guard site -
SELECT a.username, c.name, b.value
FROM v$session a, v$sesstat b, v$statname c
WHERE a.sid = b.sid
AND b.statistic# = c.statistic#
AND b.value != 0
AND a.sid = 3605
AND c.name in ('session logical reads','consistent gets','consistent gets from cache','consistent gets - examination','consistent changes','CR blocks created','data blocks consistent reads - undo records applied','IMU CR rollbacks');
Before Running Query
USERNAME NAME VALUE
------------ ------------------------------- ----------
SYS session logical reads 49
SYS consistent gets 49
SYS consistent gets from cache 49
SYS consistent gets - examination 19
Post Running Query
USERNAME NAME VALUE
------------- ---------------------------------------------------- ----------
SYS session logical reads 1869022
SYS consistent gets 1869022
SYS consistent gets from cache 1869022
SYS consistent gets - examination 922675
SYS consistent changes 556329
SYS CR blocks created 556139
SYS data blocks consistent reads - undo records applied 43
Primary site -
Before Running Query
USERNAME NAME VALUE
------------ ------------------------------- ----------
SYS session logical reads 3
SYS consistent gets 3
SYS consistent gets from cache 3
Post Running Query
USERNAME NAME VALUE
------------- ---------------------------------------------------- ----------
SYS session logical reads 946382
SYS consistent gets 946382
SYS consistent gets from cache 946382
SYS consistent gets - examination 24
SYS consistent changes 5
SYS CR blocks created 4
SYS data blocks consistent reads - undo records applied 5
If you see a lot of undo block applied and consistent gets examinations are there on the standby site.
Finally with due - diligence and working with Oracle we set the parameter _serial_direct_read=false on the Active Data Guard site
Remember this is default on primary site still.
This parameter is more related to use on Exadata for disabling smart scans, however it does have a use in non-Exa environments as well.
Session Level Statistic - ('table scans (direct read)')
This increments only by 1 for 1 query execution
USERNAME NAME VALUE
------------------------------ ---------------------------------------------------------------- ----------
SYS table scans (direct read) 1
The explanation is as below -
Oracle will first issue fast checkpoint at object level to write all dirty buffers of that object to the disk and perform direct path reads in to the PGA of shadow process.
This is default behavior based on 11G databases based on size of SGA, in order to use direct reads for read of data.
This behavior is based on size of SGA, size of object, buffer cache size of the instance.
Prior to 10G direct reads went via buffer cache and from 11G direct reads go directly via PGA bypassing Buffer Cache.
You can see below from the slow session, I could see what was happening -
Fast Object Check point and direct path reads.
The fast object check point in itself is taking around 2.7 seconds and finally there is time taken for direct path read.
The case of primary - the data is completely read from Buffer Cache and no data is via direct path reads.
*** - Below output is from Slow side - ADG site - ***
PARSE
WAIT #140527153926352: nam='reliable message' ela= 1452 channel context=322148229376 channel handle=323221996856 broadcast message=322692222808 obj#=-1 tim=1490721285371785
WAIT #140527153926352: nam='enq: KO - fast object checkpoint' ela= 2581 name|mode=1263468550 2=65610 0=1 obj#=-1 tim=1490721285374623
WAIT #140527153926352: nam='enq: KO - fast object checkpoint' ela= 134 name|mode=1263468545 2=65610 0=2 obj#=-1 tim=1490721285374858
WAIT #140527153926352: nam='Disk file operations I/O' ela= 345 FileOperation=2 fileno=11 filetype=2 obj#=40842 tim=1490721285375461
WAIT #140527153926352: nam='Disk file operations I/O' ela= 171 FileOperation=2 fileno=0 filetype=15 obj#=40842 tim=1490721285375749
WAIT #140527153926352: nam='Disk file operations I/O' ela= 20 FileOperation=2 fileno=0 filetype=15 obj#=40842 tim=1490721285376553
WAIT #140527153926352: nam='direct path read' ela= 58 file number=11 first dba=3184329 block cnt=15 obj#=40842 tim=1490721285377218
No comments:
Write comments