Daily Archive for 1月 14th, 2009

Diagnosing and troubleshooting truncate

我们都知道truncate操作会做什么:移动HWM,释放extent,重新生成data_object_id。truncate之所以比比delete快,是因为它根本不需要删除数据。但是在实际中我们经常碰到truncate很慢的情况,实际上它比我们想象的要复杂。

1.在truncate之前,CKPT必须搜索整个buffer cache,把这个object的dirty buffer全部写回磁盘(这个操作在10g有了改进,我们后面再说)。在这个过程中,系统很可能会出现DBWR的等待事件,如果dirty buffer对应的redo还没有被写入,也有可能会出现LGWR的等待事件,比如log file sync。有人说truncate之前要做一次checkpoint,并不是特别确切。

When a truncate is issued, the checkpoint process does a complete scan (till 9204) of the buffer cache.  All of the dirty buffers of the object in the buffer cache are written down to disk.  All of the clean buffers are are invalidated.

2.是否使用了reuse storage,因为需要释放空间,尤其对于字典管理的表空间。很可能会出现Waits on ST enqueue的等待事件。

3.Waits on “RO enqueue” or “CI enqueue”

The RO enqueue known as “Multiple object resue” enqueue, is used to synchronise operations
between foreground process and a background process such as DBWR or CKPT.
It is typically used when you are dropping objects or truncating tables.

When a truncate/drop is issued,

1.the foreground first acquires the “RO” enqueue in exclusive mode,
2.then cross instance calls (or one call if it is a single object) are issued
(the “CI” enqueue is acquired for cross instance call)
3.The CKPT processes on each of instances requests the DBWR to write
the dirty buffers to the disk and invalidate all the clean buffers.
4. After DBWR finishes writing, the foreground process releases the RO
enqueue.

In effect this enqueue serializes the truncate/drop operations given concurrently.

前面说过,9i在truncate之前需要扫描整个SGA找到这个object的dirty buffer,但是10g作了改进:

In Oracle 10g, a new feature was introduced in order to improve performance of operations such as drop, truncate, shrink object. This feature uses ‘object queues’ in the buffer cache infrastructure which points to a list of buffers owned by an object. In 9i, we might scan the entire buffer cache to see which buffers belong to the object is getting truncated and write its dirty buffers to disk. On environments with huge buffer caches, buffer invalidation will take long time. In 10g, the buffer invalidation process only looks at the ‘object queues’ and writes out the dirty buffers to disk. Since ‘object queues’ are short, truncate/drop/shrink operations are very fast in 10g.

这个新特性用参数 “_db_fast_obj_truncate” 来控制,这个新特性有已知的bug,必要时可以关闭。

–EOF–High School Musical 3: Senior Year movieArmy of the Dead rip

Oracle archive Behavior

我们一般认为oracle的归档行为是:按照LOG_ARCHIVE_DEST_n的设置,先归档第一个路径(local destination),然后再归档另一个路径(remote destination) The Devil’s Own movie download

实际上并不是这样,Oracle 9iR2及其以前的版本中,Oracle总是一次得到所有的归档路径列表,然后依次从online redo中读取1M数据,依次发送到各归档目标路径。这样,整个归档的时间就取决于最慢的路径,比如远程的standby。如果当网络发生问题时,很可能出现归档进程hang住,导致日志无法切换。

The ARCH process sits in a very tight loop waiting for an update to the controlfile that states an online log needs to be archived.  Once the update occurs the ARCH process builds a list of archive destinations that need to be serviced.  Once this list is complete, the ARCH process will read a one megabyte chunk of data from the online log that is to be archived.  This one megabyte chunk is then sent to the first destination in the list.  When the write has completed, the same one megabyte chunk is written to the second destination.  This continues until all of the data from the online log being archived has been written to all destinations.  So it can be said that archiving is only as fast as the slowest destination.

A common misconception is that if the LOG_ARCHIVE_DEST_n parameter for a particular destination has the OPTIONAL attribute set, then that destination will not impede local archiving. This is true during error situations while archiving to that destination - e.g. a network disconnect error, but not during an archival over a slow network, which is not an error situation. In error situations, whether the destination is marked OPTIONAL or MANDATORY, Data Guard will close that destination and continue transmitting to all other valid destinations. Transmitting to the closed destination will be attempted again only after the time specified in the REOPEN attribute has expired and a log switch has occurred.  This process will continue for the number of times specified by the MAX_FAILURE attribute. During this time, it is possible that the log writer process recycles through the available online redo log groups and tries to use the online redo log file which has not yet been transmitted successfully to the remote destination. If the destination is marked OPTIONAL, log writer will reuse the online redo log file for the next set of redo. If the destination is marked MANDATORY,  log writer will not be able to reuse that online redo log file, and the primary database will delay processing until that online redo log file has been successfully transmitted to the remote destination.

However, the situation is very different if the transmission is being done over a slow network. In this case, no error is encountered and the destination is not closed. Transmission continues, but is very slow. Ultimately, with the unavailability of any more online redo log groups, Log writer may suspend because the archive process is taking a long time to complete its archival, including local archival.

在9.2.0.5之后,Oracle引入了一个隐含参数_LOG_ARCHIVE_CALLOUT=’LOCAL_FIRST=TRUE’,设置这个参数,Oracle会首先归档本地路径。Oracle 10g,引入了参数是log_archive_local_first,默认为true.

The following underscore parameter was introduced as of 9.2.0.5 to allow the DBA to change this default behavior:

_LOG_ARCHIVE_CALLOUT=’LOCAL_FIRST=TRUE’

If the above parameter is set then the ARCH process will begin archiving to the local destination first.  Once the redo log has been completely and successfully archived to at least one local destination, it will then be transmitted to the remote destination. This is the default behavior beginning with Oracle Database 10g Release 1.

Starting in 9.2.0.7 patchsets, one ARCH process will begin acting as a ‘dedicated’ archiver, handling only local archival duties. It will not perform remote log shipping or service FAL requests. This is a backport of behavior from 10gR1 to 9iR2.

–EOF–