Oracle Internal Table X$BH Buffer Hash
See Also: V$bh
Description
Buffer Hash(according to J. Morle's Scaling Oracle 8i)
Another explaining is Buffer Header.
Columns
ver 10.2.0.4 > x$bh Name Type ------------ ----------- ADDR RAW(8) INDX NUMBER INST_ID NUMBER HLADDR RAW(8) BLSIZ NUMBER NXT_HASH RAW(8) PRV_HASH RAW(8) NXT_REPL RAW(8) PRV_REPL RAW(8) FLAG NUMBER RFLAG NUMBER SFLAG NUMBER LRU_FLAG NUMBER TS# NUMBER FILE# NUMBER DBARFIL NUMBER DBABLK NUMBER CLASS NUMBER STATE NUMBER MODE_HELD NUMBER CHANGES NUMBER CSTATE NUMBER LE_ADDR RAW(8) DIRTY_QUEUE NUMBER SET_DS RAW(8) OBJ NUMBER BA RAW(8) CR_SCN_BAS NUMBER CR_SCN_WRP NUMBER CR_XID_USN NUMBER CR_XID_SLT NUMBER CR_XID_SQN NUMBER CR_UBA_FIL NUMBER CR_UBA_BLK NUMBER CR_UBA_SEQ NUMBER CR_UBA_REC NUMBER CR_SFL NUMBER CR_CLS_BAS NUMBER CR_CLS_WRP NUMBER LRBA_SEQ NUMBER LRBA_BNO NUMBER HSCN_BAS NUMBER HSCN_WRP NUMBER HSUB_SCN NUMBER US_NXT RAW(8) US_PRV RAW(8) WA_NXT RAW(8) WA_PRV RAW(8) OBJ_FLAG NUMBER TCH NUMBER TIM NUMBER
ver 11.1.0.6.0 Name Null? Type -------------------- -------- --------------- FILE# NUMBER BLOCK# NUMBER CLASS# NUMBER STATUS VARCHAR2(10) XNC NUMBER FORCED_READS NUMBER FORCED_WRITES NUMBER LOCK_ELEMENT_ADDR RAW(4) LOCK_ELEMENT_NAME NUMBER LOCK_ELEMENT_CLASS NUMBER DIRTY VARCHAR2(1) TEMP VARCHAR2(1) PING VARCHAR2(1) STALE VARCHAR2(1) DIRECT VARCHAR2(1) NEW CHAR(1) OBJD NUMBER matches dba_objects.data_object_id, not object_id TS# NUMBER
Frequent Used SQL
- Cache Buffer Chain Latch Contention
With x$bh table you can find the object and the file#.block# of its header when there's high cache buffers chains latch contention for the said latch (whose sleeps you think are too high).
select obj, dbarfil, dbablk from x$bh bh, v$latch_children lc where bh.hladdr = lc.addr
Example:
19:16:58 SQL> select obj, dbarfil, dbablk 19:17:02 2 from x$bh bh, v$latch_children lc 19:17:02 3 where bh.hladdr = lc.addr and rownum <= 10; OBJ DBARFIL DBABLK ---------- ---------- ---------- 5932 2 15488 6091 2 41749 12752 2 72348 5938 2 37411 12806 2 68010 4294967295 3 10218 12688 2 63672 2 1 47252 4294967295 3 1542 12578 2 54996 10 rows selected.
- Cache Buffer Clones
You can also use this table to see if a specific buffer has too many clones:
select dbarfil, dbablk, count(*) from x$bh group by dbarfil, dbablk having count(*) > 2
Example
19:17:13 SQL> select dbarfil, dbablk, count(*) 19:21:34 2 from x$bh 19:21:34 3 group by dbarfil, dbablk having count(*) > 2; DBARFIL DBABLK COUNT(*) ---------- ---------- ---------- 1 10087 3 1 10032 3 1 8626 5 1 36756 3 1 36405 4 1 10356 3 1 44133 5 1 10054 3 1 10155 5 1 36738 3 1 117707 4 1 36819 4 1 10152 3 1 105370 5 ... 95 rows selected.
Note that obj column matches dba_objects.data_object_id, not object_id.
For performance reason, don't merge dba_extents with the query of x$bh that has a group by, unless you use in-line view and no_merge hint (see J. Lewis' Practical Oracle8i, p.215) The tch column, touch count, records how many times a particular buffer has been accessed. Its flag column is explained by J. Lewis' Buffer Flag; explanation of state, mode and indx can be found in Anjo Kolk's paper session wait events.
RAC and v$bh
The internal machinations of Oracle buffer cache management are not published and many Oracle professionals infer their behavior from experimentation and tidbits from the Oracle documentation. For a complete description of the v$ event views, get the free 10g poster by UNISYS.
19:31:03 SQL> desc v$bh Name Null? Type ----------------------------------------------------------------------------------- -------- -------------------------------------------------------- FILE# NUMBER BLOCK# NUMBER CLASS# NUMBER STATUS VARCHAR2(10) XNC NUMBER FORCED_READS NUMBER FORCED_WRITES NUMBER LOCK_ELEMENT_ADDR RAW(4) LOCK_ELEMENT_NAME NUMBER LOCK_ELEMENT_CLASS NUMBER DIRTY VARCHAR2(1) TEMP VARCHAR2(1) PING VARCHAR2(1) STALE VARCHAR2(1) DIRECT VARCHAR2(1) NEW CHAR(1) OBJD NUMBER TS# NUMBER
For RAC Instance
The v$bh view of x$bh has the all-important “status” column that indicates the lock mode for each data block in the data buffer. You can use v$bh to display the data buffer contents, and here is Burleson's great script for this. For non-OPS and non-RAC databases, you will see these common lock modes:
xcur - This is a RAM block that has been acquired in exclusive current mode. According the Oracle documentation, if a buffer state is exclusive current (XCUR), an instance owns the resource in exclusive mode.
cr - This mode indicates a "cloned" RAM block (a "stale" block), that was once in xcur mode. The instance has shared access to the block and can only perform reads. The cr state means the owning instance can perform a consistent read of the block, if the instance holds an older version of the data.
free - This is an “available” RAM block. It might contain data, but it is not currently in-use by Oracle.
read – The buffer is reserved for a block that is currently being read from disk.
During database recovery we might also see these values for v$bh status column:
mrec – Indicates a block in media recovery mode
irec – This is a block in instance (crash) recovery mode
For RAC database we also see this v$bh.status values:
scur - a current mode block, shared with other instances
A normal database will have filled all the buffer blocks in just a few minutes, and the database buffer will normally remain full until shutdown time. These "free" RAM blocks still contain data. They are just marked as free because they are eligible to be overlaid by a new incoming data block.
Multi-versioning of blocks in the buffer
The Oracle row-level locking mechanism is well understood by Oracle experts, but here is some question about how row-level locking translated into data buffer block status changes.
In a MetaLink forum discussion titled "Multiple buffer versions flooding the buffer cache", we see a DBA on Oracle 8.1.6.2 who claims to have 50 versions of a single data block in his data buffer. This query was used to identify multiple versions of a data block within the buffer cache:
SELECT dbarfil, dbablk, class, count(*) FROM x$bh GROUP BY dbarfil, dbablk, class HAVING count(*) > 40;
And also a query to identify the specific segments that are experiencing multiple data buffer versions:
SELECT segment_name, segment_type, owner, tablespace_name FROM sys.dba_extents WHERE file_id = <DBAFIL> AND <DBABLK> BETWEEN block_id and block_id + blocks -1;
"If this is a table or index block it is advisable to check the application to see why the block is being frequently changed as this may represent a point of contention."
"In general, you would see a number of CR blocks in the buffer cache proportional to the number of updates on that block."
In this case we see an argument for re-setting the pctfree object parameter when we have a table (or index) with many "hot update" rows. Re-setting pctfree to a high-value will spread the data rows across more data blocks, thereby relieving the possibility of having dozens of versions of the data block in the data buffer.
For example, if you have avg_row_len = 80 and 2k data blocks, setting pctfree=90 would remove the data block from the freelist (marking it as logically full) after only three rows are inserted. This approach wastes disk space but reduced the multiple versions of "cr" RAM blocks. This high pctfree technique is also used to reduce row fragmentation in tables with VARCHAR data types where a row is initially stored small with huge row expansion from subsequent UPDATE statements.
Mark Bobak, a member of the respected Oaktable group, professes some knowledge of buffer management internals, and published this excellent speculation on the behavior of buffer block status:
I'm fairly certain, but not 100% sure, that Oracle creates a CR block from a XCUR (or SCUR? I'm not a RAC expert), by cloning the XCUR block, and then referring to rollback, rolls back the block till it's sufficiently old, to be read consistent to the point in time required.
So, if a query's snapshot SCN is N, but a particular DBA (data block address) is only available in current mode, and the SCN is N+100, then Oracle will clone it, and mark it as CR, and start rolling back, using the information it the block's ITL slot(s) to backtrack where the various before images are.
Eventually, one of two things will happen, either the block will be sufficiently old (with an SCN of N or less) or, Oracle will not be able to find the required information in the rollback segment, which would result in ORA-1555.
And yes, I believe the XCUR block is kept in the buffer cache after a transaction completes. (Again, there may be some gotchas in the RAC case, I'm not sure about that case.)
Consider also, that a block in XCUR mode can be flushed to disk, even if there is a transaction open on the block. Oracle doesn't care. If you think about how the locking model works, and how read consistency works, it really doesn't matter. If a block with one or more row level locks is flushed to disk, and then the transaction commits, what happens?
Well, Oracle certainly won't re-read the block just to update the row lock and ITL slot. So, what does it do? It marks the transaction as committed in the rollback segment slot header, and moves on. If another process comes along and reads or updates that block, delayed block cleanout will kick in to clean up the mess that was left behind.
There is also a question about buffer multi-versioning when using Oracle 10g Automatic Storage Memory Management (ASMM). Here is an actual example from an ASMM 10g database showing only one free block in the data buffer:
STATUS NUMBER_BUFFERS ------- -------------- cr 616 free 1 xcur 14790 Top 5 Timed Events ~~~~~~~~~~~~~~~~~~ % Total Event Waits Time (s) DB Time Wait Class ------------------------------ ------------ ----------- --------- -------------- log file parallel write 9,670 291 55.67 System I/O log file sync 9,293 278 53.12 Commit CPU time 225 43.12 db file parallel write 4,922 201 38.53 System I/O control file parallel write 1,282 65 12.42 System I/O
Here is a sample AWR report from the system as BC first found it.
Note that all xcur and cr blocks in the RAM data buffers can be de-allocated (with the alter system flush buffer_pool command) to make their status “free” and they can then be reassigned to other SGA regions. Here is a simple script to display counts of each v$bh status:
column c1 heading 'Status' format a10 column c2 heading 'Number|of Data|Buffers' format 999,999,999 select status c1, count(1) c2 from v$bh group by status order by count(1) desc;
Here is the output from a high-DML database.
Number of Data Status Buffers ---------- ------------ xcur 311,967 free 270,731 cr 17,302
The number of blocks in v$bh where status="free" is critical to high-update databases because there must be enough buffers to hold all data blocks that are being updated (in xcur status). The Oracle 10g Automatic Storage Memory Management (ASMM) feature (the default on Oracle10g) should anticipate high updates and allocate additional data buffers during high update periods. For example, here is an actual output from an Oracle10g database where it appears that ASMM is not allocating enough free blocks to accommodate concurrent updates:
STATUS NUMBER_BUFFERS ------- -------------- cr 616 free 1 xcur 14790
For more on ASMM, click here.
An Online Demonstration
This example below shows how Oracle marks un-locked block buffers as “free” and how non-free buffer blocks (where status <> 0) and we see the cur and xcur blocks being flushed from the buffer cache when we issue the alter system flush buffer_cache command. In this example there are 50,000 rows in the POLICYREC table.
SQL> update POLICYREC set sum_assured = sum_assured + 15; 50000 rows updated. SQL> commit; Commit complete. SQL> select o.OBJECT_TYPE, substr(o.OBJECT_NAME,1,10) objname , b.objd , b.status, count(b.objd) from v$bh b, dba_objects o where b.objd = o.data_object_id and o.owner = 'NYUSER' group by o.object_type, o.object_name,b.objd, b.status ; OBJECT_TYPE OBJNAME OBJD STATUS COUNT(B.OBJD) ----------- ------------- ------ ------ ------- TABLE TEST1 43058 free 6 TABLE POLICYREC 43061 cr 47 TABLE POLICYREC 43061 free 238 TABLE POLICYREC 43061 xcur 376 SQL> alter system flush buffer_cache; System altered. SQL> select o.OBJECT_TYPE, substr(o.OBJECT_NAME,1,10) objname , b.objd , b.status, count(b.objd) from v$bh b, dba_objects o where b.objd = o.data_object_id and o.owner = 'NYUSER' group by o.object_type, o.object_name,b.objd, b.status ; OBJECT_TYPE OBJNAME OBJD STATUS COUNT(B.OBJD) ---------- -------------- ----- ----- ------------- TABLE TEST1 43058 free 6 TABLE POLICYREC 43061 free 660
Reference
English
By Burleson Consulting's The mysteries of Oracle buffer block management
http://www.stormloader.com/yonghuang/computer/x$table.html
http://www.ixora.com.au/ (lots of valuable documents are deleted …)
http://www.ixora.com.au/scripts/x$tables.htm
http://www.fors.com/velpuri2/X$/List%20of%20X$%20Tables
http://ordba.net/Ora10g/x$tab.htm
http://www.orafaq.com/faqdbain.htm#X$TAB
http://www.oracle-training.cc/oracle_tips_x$.htm
http://www.adp-gmbh.ch/ora/misc/x.html
http://www.petefinnigan.com/Storing_Data_Directly_From_Oracle_SGA.pdf
http://oraperf.sourceforge.net/presentations/SGA_access.ppt
http://www.eygle.com/refer/Oracle_x$table_list.htm
http://imergegroup.blogspot.com/2007/03/what-is-inside-xtables-in-oracle.html
http://www.oracle-internals.com/?p=11
http://www.jlcomp.demon.co.uk/faq/find_dist.html
https://twiki.cern.ch/twiki/bin/view/PSSGroup/ASM_Internals
22241.1 List of X$ Tables and how the names are derived
Chinese
http://www.dbanotes.net/X$tables.htm
http://www.eygle.com/archives/2004/11/oracle_x_table.html
http://ysjxjf.itpub.net/post/7246/68230
http://chrysie.itpub.net/post/20977/276884
http://soft.zdnet.com.cn/software_zone/2007/0811/452643.shtml
http://fantastechnol.blogspot.com/search/label/Oracle