Guenadi N Jilevski's Oracle BLOG

Oracle RAC, DG, EBS, DR and HA DBA BLOG

Oracle 11g cache fusion

A RAC database system has two important services: the Global Cache Service (GCS) and the Global Enqueue Service (GES). Each is basically a collection of background processes. Together, these two services cover and manage the whole Cache Fusion process, resource transfers, and resource escalations among the instances.

Global Resource Directory

GES and GCS together maintain a Global Resource Directory (GRD) to record the information about the resources and the enqueues. GRD remains in the memory and is stored on all the instances. Each instance manages a portion of the directory. This distributed nature is a key point for fault tolerance of the RAC.
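As a rough illustration of "each instance manages a portion of the directory", the sketch below assigns every resource to a master instance by hashing a block identifier. The function name master_instance, the (file, block) key, and the modulo scheme are assumptions made for illustration only; Oracle's actual mastering algorithm is internal and is not described in this post.

```python
import zlib


def master_instance(file_no: int, block_no: int, instance_count: int) -> int:
    """Return the 1-based instance that masters a resource (hypothetical scheme)."""
    if instance_count < 1:
        raise ValueError("instance_count must be at least 1")
    key = f"{file_no}:{block_no}".encode()
    # crc32 is deterministic across runs, unlike Python's built-in hash() for str.
    return zlib.crc32(key) % instance_count + 1


if __name__ == "__main__":
    # Spread a handful of blocks from file 4 over a 3-instance cluster.
    for block in range(100, 105):
        print(block, master_instance(4, block, instance_count=3))
```

Because every instance can compute the same mapping, any node can work out which peer to ask about a block without a central coordinator, which is the fault-tolerance point made above.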

The Global Resource Directory (GRD) is the internal database that records and stores the current status of the data blocks. Whenever a block is transferred out of a local cache to another instance’s cache, the GRD is updated. The following resource information is available in the GRD:

* Data Block Identifiers (DBI)

* Location of most current version

* Modes of the data blocks: (N)Null, (S)Shared, (X)Exclusive

* The Roles of the data blocks (local or global) held by each instance

* Buffer caches on multiple nodes in the cluster
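The items listed above can be pictured as one record per data block. The following is a minimal sketch of such a record, not Oracle's internal layout: the class names GrdEntry and Holding, the field names, and the record_grant method are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class Mode(Enum):
    NULL = "N"
    SHARED = "S"
    EXCLUSIVE = "X"


class Role(Enum):
    LOCAL = "local"
    GLOBAL = "global"


@dataclass
class Holding:
    """Mode and role in which one instance holds a block."""
    mode: Mode
    role: Role


@dataclass
class GrdEntry:
    """One hypothetical directory record for a single data block."""
    file_no: int                           # data block identifier, part 1
    block_no: int                          # data block identifier, part 2
    current_holder: Optional[int] = None   # instance with the most current version
    holders: Dict[int, Holding] = field(default_factory=dict)

    def record_grant(self, instance: int, mode: Mode, role: Role,
                     has_current: bool = True) -> None:
        """Update the entry after a block is granted or shipped to an instance."""
        self.holders[instance] = Holding(mode, role)
        if has_current:
            self.current_holder = instance


if __name__ == "__main__":
    entry = GrdEntry(file_no=4, block_no=100)
    entry.record_grant(1, Mode.EXCLUSIVE, Role.LOCAL)
    print(entry)
```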

Global Cache Service (LMSx)

The GCS tracks the location, status (mode and role) of data blocks, and the access privileges of all instances. Oracle uses the GCS for cache coherency when the current version of a data block is in one instance’s buffer cache, and another instance requests that block for modification. It is also used for reading blocks from other instances.

Following the first read of exclusive resources, multiple transactions running in a single RAC instance can share access to a set of data blocks without involvement of the GCS as long as the block is not transferred out of the local cache. This is similar to non-RAC Oracle. If the block has to be transferred out of the local cache, then the GCS updates the Global Resource Directory in the shared pool.

GCS and Cache Coherency

The GCS manages all types of data blocks. Cache coherency is maintained through the GCS by requiring that instances acquire a resource (lock or enqueue on a block) cluster-wide before modifying or reading a database block. The GCS is used to synchronize global cache access, allowing only one instance to modify a block at any single point in time. The GCS, through the RAC wide Global Resource Directory, ensures that the status of data blocks cached in any mode in the cluster is globally visible and maintained.

Oracle RAC has a multi-versioning architecture, which distinguishes between current data blocks and one or more consistent read (CR) versions of a block. A current block contains changes for all committed and yet-to-be-committed transactions. A consistent read (CR) version of a block represents a consistent snapshot of the data at a previous point in time. A data block can reside in many buffer caches under the auspices of shared resources.

In Oracle RAC, applying rollback segment information to current blocks produces consistent read versions of a block. Both the current and consistent read blocks are managed by the GCS.
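The idea of producing a consistent read version from a current block can be sketched in a few lines. This is a toy model under stated assumptions: a block is a plain dictionary of rows, each undo record stores the SCN of a change and the value the row had before it, and the names UndoRecord and consistent_read are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class UndoRecord:
    scn: int                   # SCN of the change this record can undo
    row_id: str
    old_value: Optional[str]   # None means the row did not exist before the change


def consistent_read(current: Dict[str, str], undo: List[UndoRecord],
                    snapshot_scn: int) -> Dict[str, str]:
    """Build a CR copy of a block as it looked at snapshot_scn."""
    cr = dict(current)  # work on a copy; the current block is left untouched
    # Roll back the newest changes first, stopping at the snapshot.
    for rec in sorted(undo, key=lambda r: r.scn, reverse=True):
        if rec.scn <= snapshot_scn:
            break
        if rec.old_value is None:
            cr.pop(rec.row_id, None)
        else:
            cr[rec.row_id] = rec.old_value
    return cr


if __name__ == "__main__":
    current_block = {"r1": "c", "r2": "x"}
    undo_log = [
        UndoRecord(scn=10, row_id="r1", old_value="a"),
        UndoRecord(scn=20, row_id="r1", old_value="b"),
        UndoRecord(scn=30, row_id="r2", old_value=None),
    ]
    print(consistent_read(current_block, undo_log, snapshot_scn=15))
```

The current block is never modified here, which mirrors the distinction drawn above between the current version and the CR versions derived from it.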

To transfer data blocks among database caches, buffers are shipped by means of the high speed IPC interconnect. Disk writes are only required for cache replacement. A past image (PI) of a block is kept in memory before the block is sent if it is a dirty (modified) block. In the event of failure, Oracle reconstructs the current version of the block by reading the PI blocks.
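A small simulation may help to picture the past image behaviour described above. It is only a sketch: the Buffer and InstanceCache classes and the ship_block function are hypothetical names, the "interconnect" is a plain Python assignment, and what happens to a clean copy on the sender is deliberately not modelled because it depends on the requested mode.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Buffer:
    data: str
    dirty: bool = False
    past_image: bool = False


@dataclass
class InstanceCache:
    name: str
    buffers: Dict[int, Buffer] = field(default_factory=dict)


def ship_block(sender: InstanceCache, receiver: InstanceCache, block_id: int) -> None:
    """Copy a block from one cache to another without any disk write."""
    buf = sender.buffers[block_id]
    # The receiver gets its own copy, which becomes the current version there.
    receiver.buffers[block_id] = Buffer(data=buf.data, dirty=buf.dirty)
    if buf.dirty:
        # The sender keeps its modified copy in memory as a past image (PI).
        buf.past_image = True


if __name__ == "__main__":
    node_a = InstanceCache("A", {7: Buffer("v2", dirty=True)})
    node_b = InstanceCache("B")
    ship_block(node_a, node_b, 7)
    print(node_a.buffers[7])
    print(node_b.buffers[7])
```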

GCS Resource Modes and Roles

The Global Resource Directory maintained by the GCS tracks the blocks transmitted throughout the RAC system. The same block can exist in multiple caches as a result of block transfers. The block is held in different modes depending on whether a resource holder (instance) intends to modify the data or merely read it.

It is important to understand that a RAC resource is identified by two factors:

Mode – The modes are null, shared, and exclusive.

Role – The roles are local and global

Resource modes are generally set by the holder, as part of a request for a resource. The resource modes determine whether the holder can modify the block. The modes of a RAC resource are defined as:

Null – Identified with an N. Holding a resource at this level conveys no access rights.

Shared – Identified with an S. This signifies a protected read. When a resource is held at this level, a process cannot modify it and multiple processes can read the same resource.

Exclusive – Identified with an X. This grants the holding process exclusive access. Other processes cannot write to the resource, but consistent reads of older versions of the block are still available through consistent read (CR) copies.

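Resource roles are set by type of access: local or global. A resource starts in the local role when a block is first read into one instance’s cache and no other instance holds a modified copy of it. It takes the global role once the block has been modified in one instance and then shipped to another, so that dirty copies or past images of it can exist in more than one cache.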
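The three modes defined above imply a simple compatibility rule: any number of shared holders can coexist, an exclusive holder excludes shared and exclusive holders elsewhere, and null conflicts with nothing. The sketch below encodes only that rule. The names COMPATIBLE and can_coexist are hypothetical, and a real GCS would ask the current holders to downgrade or ship the block rather than simply refuse a request.

```python
from typing import Iterable

# (requested mode, mode held by another instance) -> can both be held at once?
COMPATIBLE = {
    ("N", "N"): True,  ("N", "S"): True,  ("N", "X"): True,
    ("S", "N"): True,  ("S", "S"): True,  ("S", "X"): False,
    ("X", "N"): True,  ("X", "S"): False, ("X", "X"): False,
}


def can_coexist(requested: str, held_by_others: Iterable[str]) -> bool:
    """True if the requested mode conflicts with none of the modes already held."""
    return all(COMPATIBLE[(requested, held)] for held in held_by_others)


if __name__ == "__main__":
    print(can_coexist("S", ["S", "N"]))  # shared readers alongside a null holder
    print(can_coexist("X", ["S"]))       # a writer while another instance reads
```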

Some Performance metrics specific to the RAC environment

Cache Coherency

What exactly do we mean by Cache Coherency? An Oracle RAC environment needs some additional sets of metrics compared with a regular single-instance Oracle installation, which I sometimes refer to as a “Single-Node RAC”. I call it a Single-Node RAC because someday that Oracle application will also grow and need a ticket to “RACdom”. A typical production DBA, responsible for the uptime and upkeep of a RAC database, needs more than just some AWR runs: he will need to measure the health of the HSI (High Speed Interconnect) network interfaces, and he will have to monitor and diagnose traffic volume and response times across the nodes. A typical high-intensity OLTP environment can keep you pretty busy. To measure the traffic we will concentrate on two categories:

GCS (Global Cache Services)

GES (Global Enqueue Services)

So what are they?

Global Cache Service

The service that implements Cache Fusion. It maintains the block mode for blocks in the global role and is responsible for block transfers between instances. The Global Cache Service employs various background processes such as the Global Cache Service Processes (LMSn) and the Global Enqueue Service Daemon (LMD).

It is more or less like your buffer cache, but it acts globally across the nodes, and it is integral to the Cache Fusion concept. So what does a buffer hold? Data blocks, obviously. Simply put, coherency in the global buffer cache is maintained by making sure that a global lock is acquired whenever an attempt is made to modify a database block. The instance that asked for the lock will then hold both the past copy of the block (for redo purposes) and the current version of the block, which contains both committed and uncommitted changes. Should another node come asking for that block, it is the GCS’s responsibility to do a “Block Version Lookup” at the node that currently holds the global lock on the block. The LMSn processes are crucial to the GCS: they perform the block version lookup, track the block mode, and so on.

Global Enqueue Service

A service that coordinates enqueues that are shared globally.

The blocks in your RAC environment do most of the work themselves, but there is a crucial area where the GES, or Global Enqueue Service, comes in. Seamless coordination across the nodes is crucial for RAC’s operation. The GES is primarily responsible for maintaining coherency in the dictionary and library caches. The dictionary cache holds data dictionary information in the SGA (System Global Area) of each node, primarily for quicker lookup and access. Any change to dictionary objects made from one node (DDL, for example) needs to be synchronized across the dictionary caches on all nodes of the RAC environment. The GES makes sure that the changes remain consistent across the nodes and that there are no discrepancies. By the same token, locks must be created and maintained across the nodes, and the GES must ensure that there are no deadlocks between requesting nodes over access to the same objects. The LMON, LCK and LMD processes work in tandem to make the GES operate in a smooth and seamless fashion.

GV$ Views

Obviously, the meaty part of the whole equation is, “Where are my RAC views?” A RAC environment has additional views known as Global Views. A typical view for a single-node installation is V$, but for RAC you have GV$ views. These views carry an additional INST_ID column that identifies the instance each row came from. So on a typical 4-node RAC, querying a GV$ view returns rows from all four instances, each with its own data. Obviously, you can query the data of any individual node from any node. To get started, try this:

SQL> select * from gv$sysstat where name like '%gcs %';

This will give you a result set focused on the GCS messages sent across the nodes. If these values are inconsistent across nodes, or if huge differences are apparent, then it might be time to investigate.
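Once the rows are fetched, spotting "huge differences" between instances can be automated. The sketch below assumes the rows have already been retrieved as (INST_ID, NAME, VALUE) tuples by whatever client you use. The function name uneven_stats, the factor-of-two threshold, and the sample numbers are all made up for illustration and are not Oracle recommendations.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Row = Tuple[int, str, int]  # (INST_ID, NAME, VALUE) as returned from gv$sysstat


def uneven_stats(rows: Iterable[Row], ratio: float = 2.0) -> Dict[str, Dict[int, int]]:
    """Return statistics whose highest per-instance value exceeds ratio x the lowest."""
    by_name: Dict[str, Dict[int, int]] = defaultdict(dict)
    for inst_id, name, value in rows:
        by_name[name][inst_id] = value
    flagged = {}
    for name, per_inst in by_name.items():
        lowest, highest = min(per_inst.values()), max(per_inst.values())
        # max(lowest, 1) avoids flagging everything when one instance reports 0.
        if highest > ratio * max(lowest, 1):
            flagged[name] = per_inst
    return flagged


if __name__ == "__main__":
    sample_rows = [  # invented values for a 2-node cluster
        (1, "gcs messages sent", 1000),
        (2, "gcs messages sent", 1100),
        (1, "gcs msgs received", 500),
        (2, "gcs msgs received", 5000),
    ]
    for stat, values in uneven_stats(sample_rows).items():
        print(stat, values)
```

Since these statistics are cumulative since instance startup, instances that started at very different times will look uneven for that reason alone, so a flagged statistic is a prompt to look closer rather than a diagnosis.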

The Oracle RAC processes and their identifiers are as follows:

ACMS: Atomic Controlfile to Memory Service

In an Oracle RAC environment, the ACMS per-instance process is an agent that contributes to ensuring a distributed SGA memory update is either globally committed on success or globally aborted if a failure occurs.

GTX0-j: Global Transaction Process

The GTX0-j process provides transparent support for XA global transactions in a RAC environment. The database autotunes the number of these processes based on the workload of XA global transactions.

LMON: Global Enqueue Service Monitor

The LMON process monitors global enqueues and resources across the cluster and performs global enqueue recovery operations.

LMD: Global Enqueue Service Daemon

The LMD process manages incoming remote resource requests within each instance.

LMS: Global Cache Service Process

The LMS process maintains records of the data file statuses and each cached block by recording information in a Global Resource Directory (GRD). The LMS process also controls the flow of messages to remote instances and manages global data block access and transmits block images between the buffer caches of different instances. This processing is part of the Cache Fusion feature.

LCK0: Instance Enqueue Process

The LCK0 process manages non-Cache Fusion resource requests such as library and row cache requests.

RMSn: Oracle RAC Management Processes

The RMSn processes perform manageability tasks for Oracle RAC. Tasks accomplished by an RMSn process include creation of resources related to Oracle RAC when new instances are added to the clusters.

RSMN: Remote Slave Monitor

The RSMN process manages background slave process creation and communication on remote instances. These background slave processes perform tasks on behalf of a coordinating process running in another instance.

December 13, 2009 - Posted by | oracle

2 Comments »

  1. This is very helpful, thanks. Any other info on RAC 11g?

    Comment by Venkat B | April 27, 2012 | Reply

    • Hi,

      There is no conceptual difference in 11g compared to 10g.

      Regards,

      Comment by gjilevski | April 27, 2012 | Reply

