Oracle ACFS in Oracle 11.2
In this article we look at the concepts, architecture, and utilities used to manage and monitor ACFS, along with ACFS features such as snapshots and replication. ACFS is a general-purpose, portable cluster file system that can run on many operating systems and is installed as part of the Grid Infrastructure installation in Oracle 11.2 and later. ACFS was initially available on Linux and Windows with Oracle 11.2.0.1, and on Solaris and AIX with Oracle 11.2.0.2. On Linux/Unix, ACFS is POSIX and X/OPEN compliant, and it can be accessed remotely through NAS protocols such as NFS and CIFS.
ACFS introduction, architecture and concepts
ACFS is built as an extension on top of ASM. ASM was introduced in Oracle 10g as an alternative to third-party clustered volume managers and cluster file systems for storing Oracle data files, control files, redo logs, archive logs, flashback logs, backup sets, backup pieces, and FRA files. As a volume manager, ASM implements striping at the file level, spreading file extents across the disks of a disk group to balance both space allocation and I/O. As disks are added or removed, ASM automatically performs dynamic rebalancing. ASM implements mirroring based on the redundancy defined on the disk group: files in a disk group with normal or high redundancy have a primary extent plus one or two mirrored extents, spread and load-balanced across the disk group.
ACFS introduces, and is built on, an ASM dynamic volume created in an ASM disk group. ASM dynamic volumes are themselves ASM files, so they inherit the properties of the disk group and behave like any other ASM file. Once a dynamic volume is created successfully, a volume device is created with the naming convention /dev/asm/volumename-volume_number. The ASM Dynamic Volume Manager (ADVM) exposes and presents the ASM dynamic volume to the OS, making it appear as a block device on Linux. The volume device can then be used to build an ACFS or a third-party file system with the traditional Unix/Linux utilities such as mkfs, and the file system can be mounted with the usual mount utilities as well.
In a cluster setup, Oracle Grid Infrastructure installs and starts the kernel modules oracleoks, oracleadvm, and oracleacfs on Linux to support ACFS functionality. The ASM instance introduces new background processes (VDBG, VBGn, VMB) that facilitate communication with the OS kernel dynamic volume manager drivers and synchronize requests between those drivers and the ASM instance.
In Oracle Restart, the drivers must be started manually by running the following command as root from the bin directory of the Grid Infrastructure home:
# ./acfsload start -s
ACFS has the following restrictions:
- ACFS volumes cannot be used for root or boot devices.
- ACFS cannot be used for the Grid Infrastructure installation.
- ACFS volumes cannot be mapped to a raw device.
- ACFS volumes cannot be used for multipathing.
- ACFS volumes cannot be used with ASMLib; layering ASM on top of ACFS volumes is not supported.
- ACFS volumes should not be partitioned with fdisk or a similar utility.
Volume stripe columns and volume stripe width
Stripe columns and stripe width are two important volume attributes. They determine how space is allocated for the volume itself, and how space is allocated within the volume when a file is created or extended on the file system (ACFS or third-party) built on that volume. Both attributes are specified at volume creation time and cannot be changed later; if no value is given, a default is used.
- Stripe columns – the number of stripes; a value from 1 to 8. The default is 4.
- Stripe width – the size of each stripe; one of 4K, 8K, 16K, 32K, 64K, 128K, 256K, 512K, or 1M. The default is 128K.
The Volume Allocation Unit (VAU) is the smallest allocation for a volume: whenever a volume is created or extended, space is allocated in multiples of the VAU. The VAU size is determined by the Volume Extent (VE) and the stripe columns; it is the product of the VE size and the number of stripe columns.
The Volume Extent size is based on the Allocation Unit (AU) of the disk group; for an AU of 1MB the VE is 64MB. Whenever a VAU is allocated, its VEs are allocated round-robin across the disks in the disk group. The volume size is always a multiple of the VAU. For example, if a 400MB volume is requested with 4 stripe columns and a 1MB AU, two 256MB VAUs are allocated and the volume size becomes 512MB.
Whenever a file is created or resized on a file system built on an ASM volume, space is allocated in chunks the size of the VAU. Within each VAU, space is allocated in chunks the size of the stripe width, round-robin across all the VEs of the VAU.
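The allocation arithmetic above can be illustrated with a few lines of shell. This is a sketch only, assuming a 1MB AU (for which the VE is 64MB) and the 400MB example from the text:

```shell
# Volume size rounding, assuming AU = 1MB so that VE = 64MB.
au_mb=1
ve_mb=$((64 * au_mb))                  # volume extent size in MB
stripe_columns=4
requested_mb=400

vau_mb=$((ve_mb * stripe_columns))     # VAU = VE x stripe columns = 256MB
# A volume is always a whole number of VAUs: round the request up.
vaus=$(( (requested_mb + vau_mb - 1) / vau_mb ))
actual_mb=$(( vaus * vau_mb ))

echo "VAU=${vau_mb}MB, VAUs=${vaus}, volume size=${actual_mb}MB"
```

The 256MB VAU also matches the Resize Unit (MB): 256 reported by volinfo for the 2GB volumes created below.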
Steps to create ACFS
The following is an ordered list of steps to create an ACFS file system; the order matters.
- Make sure the prerequisites are met. For an ASM disk group to support dynamic volumes, the disk group attributes must satisfy: compatible.asm >= 11.2, compatible.advm >= 11.2, and compatible.asm >= compatible.advm. For ACFS replication: compatible.asm >= 11.2.0.2, compatible.advm >= 11.2.0.2, and compatible.asm >= compatible.advm.
- Create an ASM volume.
- Create a mount point directory
- Make the file system (mkfs) as root, or as the Grid Infrastructure owner for ACFS only.
- Mount the file system as root, unless you register it with /sbin/acfsutil, in which case the ACFS is mounted automatically on registration.
- Register the file system using /sbin/acfsutil. Registration mounts the ACFS automatically and registers it as an OCR resource for auto-start.
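The steps above can be condensed into a short script. This is a sketch only: the volume device and mount point are hypothetical (borrowed from the example later in the article), and since mkfs and acfsutil require root and a running ASM stack, the script prints each command instead of executing it; set DRYRUN=false to actually run them.

```shell
# Sketch of the ACFS creation steps; the device name and mount point are hypothetical.
DRYRUN=true
run() { if $DRYRUN; then echo "+ $*"; else "$@"; fi; }

# The volume itself is created beforehand, e.g. in ASMCMD:
#   volcreate -G DATA -s 2G ACFS_VOLA
VOLDEV=/dev/asm/acfs_vola-239
MNT=/u01a

run mkdir -p "$MNT"                              # create the mount point directory
run /sbin/mkfs -t acfs "$VOLDEV"                 # make the file system
run /sbin/acfsutil registry -a "$VOLDEV" "$MNT"  # register it; this also mounts the ACFS
```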
Utilities for ACFS management
The following utilities can be used for ACFS management and monitoring. An ASM volume or ACFS file system created with one utility can be monitored and managed with another; the utilities can be used interchangeably as long as they are not used concurrently.
- ASMCA – the ASM Configuration Assistant is a Java-based GUI used to manage ASM, ASM disk groups, ASM dynamic volumes, and ACFS. Create the ASM volume and then the ACFS by pressing the Create button on the ASM Volumes and ACFS tabs. ASMCA can be used to create, mount, dismount, and drop disk groups, volumes, and ACFS. It is intuitive and will not be covered here.
- OEM Database Control/Grid Control – OEM can be used to create, mount, dismount, and drop disk groups, volumes, and ACFS. It is also intuitive and will not be covered here.
- ASMCMD – ASMCMD offers a command-line interface to create, monitor, and drop volumes. Relevant commands are: volcreate, voldelete, voldisable, volenable, volinfo, volresize, volset, and volstat. To create a volume, issue:
volcreate -G diskgroup -s size [ --column number ] [ --width stripe_width ] [ --redundancy {high|mirror|unprotected} ] [ --primary {hot|cold} ] [ --secondary {hot|cold} ] volume
To verify that the volume was created, issue:
volinfo { -a | -G diskgroup -a | -G diskgroup volume }
- SQL – From SQL*Plus, standard SQL can be used to create, manage, resize, and drop volumes. Useful ASM views are:
- V$ASM_VOLUME – one row for each ASM dynamic volume; use it to verify that a volume was created successfully.
- V$ASM_ACFSVOLUMES – one row for each volume mounted as ACFS.
- V$ASM_FILESYSTEM – one row for each mounted ACFS file system.
- /sbin/acfsutil – manages an already created ACFS. The list of activities is long, but some are given below. For a complete list, issue /sbin/acfsutil -h.
- Register an ACFS with OCR
- Obtain information about an ACFS
- Create an ACFS snapshot
- Manage ACFS replication
Let’s create two ACFS file systems. We will create two volumes, ACFS_VOLA and ACFS_VOLB, and mount them on /u01a and /u01b respectively.
From ASMCMD issue:
ASMCMD> volcreate -G data -s 2G --column 4 --width 128K --redundancy unprotected ACFS_VOLB
ASMCMD>
ASMCMD> volinfo -G data ACFS_VOLB
Diskgroup Name: DATA
Volume Name: ACFS_VOLB
Volume Device: /dev/asm/acfs_volb-239
State: ENABLED
Size (MB): 2048
Resize Unit (MB): 256
Redundancy: UNPROT
Stripe Columns: 4
Stripe Width (K): 128
Usage:
Mountpath:
ASMCMD>
From SQLPLUS issue:
SQL> alter diskgroup data add volume ACFS_VOLA size 2G stripe_width 128K stripe_columns 4;
Diskgroup altered.
SQL>
SQL> select volume_name, volume_device from v$asm_volume;
VOLUME_NAME VOLUME_DEVICE
-------------------- ----------------------------------------
DATAVOL /dev/asm/datavol-239
DATAVOL1 /dev/asm/datavol1-239
ACFS_VOLB /dev/asm/acfs_volb-239
ACFS_VOLA /dev/asm/acfs_vola-239
PRIM /dev/asm/prim-481
SEC /dev/asm/sec-351
6 rows selected.
SQL>
Make and format the file systems:
[oracle@raclinux1 ~]$ /sbin/mkfs -t acfs -b 4K /dev/asm/acfs_vola-239
mkfs.acfs: version = 11.2.0.2.0
mkfs.acfs: on-disk version = 39.0
mkfs.acfs: volume = /dev/asm/acfs_vola-239
mkfs.acfs: volume size = 2147483648
mkfs.acfs: Format complete.
[oracle@raclinux1 ~]$ /sbin/mkfs -t acfs -b 4K /dev/asm/acfs_volb-239
mkfs.acfs: version = 11.2.0.2.0
mkfs.acfs: on-disk version = 39.0
mkfs.acfs: volume = /dev/asm/acfs_volb-239
mkfs.acfs: volume size = 2147483648
mkfs.acfs: Format complete.
[oracle@raclinux1 ~]$
Register the file systems. Make sure that the ora.registry.acfs resource is running.
./crsctl stat res -t
<snipped>
ora.registry.acfs
ONLINE ONLINE raclinux1
ONLINE ONLINE raclinux2
<snipped>
Use /sbin/acfsutil to register the file systems.
[oracle@raclinux1 asm]$ /sbin/acfsutil registry -a -f /dev/asm/acfs_vola-239 /u01a
acfsutil registry: mount point /u01a successfully added to Oracle Registry
[oracle@raclinux1 asm]$ /sbin/acfsutil registry -a -f /dev/asm/acfs_volb-239 /u01b
acfsutil registry: mount point /u01b successfully added to Oracle Registry
[oracle@raclinux1 asm]$
Verify that the ACFS file systems were created successfully. That concludes ACFS creation. Further information about an ACFS can be obtained with the acfsutil utility.
ACFS utility
The ACFS utility, /sbin/acfsutil, is used for managing ACFS. The following is a list of some of the tasks that can be performed.
- Register an ACFS, by adding a mount point, with OCR for management by CRS; delete a mount point from the registry.
- View information and statistics about an ACFS file system.
- Remove an unmounted ACFS.
- Create and delete snapshots.
- Set up ACFS replication.
- Dynamically resize an ACFS file system.
Create an ACFS Snapshot
You can make a read-only, point-in-time copy of an ACFS file system using a snapshot. ACFS stores metadata in the ASM volume for the files, directories, and pointers to file blocks. After the snapshot is taken, file blocks are copied into the snapshot before they are first modified, preserving the point-in-time image. You can have up to 63 snapshots per volume, and a snapshot can be used to recover data as of the snapshot time. When you create a snapshot using /sbin/acfsutil snap create, a .ACFS directory is created at the root of the ACFS file system with two subdirectories:
- snaps : stores the snapshot data.
- repl : used for replication.
You can delete a snapshot using acfsutil as follows: /sbin/acfsutil snap delete snap_name mount_point
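A minimal usage sketch, with a hypothetical snapshot name (before_upgrade) and the /u01a mount point from earlier. The commands need appropriate privileges and a mounted ACFS, so the sketch prints them instead of executing them:

```shell
# Hypothetical mount point and snapshot name; DRYRUN prints instead of executing.
DRYRUN=true
run() { if $DRYRUN; then echo "+ $*"; else "$@"; fi; }

MNT=/u01a
SNAP=before_upgrade

run /sbin/acfsutil snap create "$SNAP" "$MNT"            # snapshot appears under $MNT/.ACFS/snaps/$SNAP
run cp "$MNT/.ACFS/snaps/$SNAP/somefile" "$MNT/somefile" # e.g. restore a single file as of snapshot time
run /sbin/acfsutil snap delete "$SNAP" "$MNT"            # drop the snapshot when no longer needed
```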
ACFS Replication
Oracle extends the Data Guard concept to ACFS, letting you designate a primary ACFS file system that asynchronously replicates changes to a standby ACFS file system on a disaster recovery (DR) site. Changes are captured in a change log file on the primary ACFS, transferred over Oracle Net to a similar change log file on the DR ACFS, and purged after they are applied on the DR side. Make sure that you:
- Appropriately size the ACFS to accommodate changes.
- Have sufficient bandwidth for the transfer between the primary and standby site.
As of Oracle 11.2 the following limitations exist:
- Only one standby site is supported for a given primary file system.
- Only up to 8 nodes in a cluster can mount a file system.
- There is no support for ACFS file systems with encryption or ACFS security.
Prerequisites for configuring ACFS replication:
- compatible.asm >= 11.2.0.2
- compatible.advm >= 11.2.0.2
Let’s set up an example. During the configuration phase, each ACFS must be mounted on only one node (the primary ACFS site). The primary ACFS file system is based on volume prim, mounted on /u05, and the standby (DR) ACFS file system is based on volume sec, mounted on /u06. Server raclinux1 hosts the primary site and raclinux2 the DR site.
- Create an ASM user with SYSASM and SYSDBA privileges (oracle in this example).
- Create an Oracle Net service for the primary site (prim) and the standby site (sec). A service name of the form +ASMn will not work.
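The article does not show the Oracle Net side of this step; a minimal sketch of what the two tnsnames.ora aliases might look like is below. The hostnames and port are assumptions, and the service names must match those registered with the ASM listeners:

```
PRIM =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = raclinux1)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = prim))
  )

SEC =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = raclinux2)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = sec))
  )
```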
- Initiate the standby ACFS:
[root@raclinux1 bin]# /sbin/acfsutil repl init standby -p oracle/oracle@prim -c sec /u06
Here prim is the service created in the second bullet and oracle is the user created in the first. Note that before initiating the standby and primary sites, each ACFS must be mounted on only one node; that is, you need to dismount both /u05 and /u06 on the DR node.
- Initiate the primary ACFS as root (in case of failure, re-initiate the standby and start from the beginning):
[root@raclinux1 bin]# /sbin/acfsutil repl init primary -s oracle/oracle@sec -m /u06 -c prim /u05
validating the remote connection
validating the remote connection
validating the remote connection
acfsutil repl init: ACFS-05050: remote connection cannot be established
acfsutil repl init: ACFS-05052: standby replication site requires reinitialization
[root@raclinux1 bin]# /sbin/acfsutil repl init standby -p oracle/oracle@prim -c sec /u06
[root@raclinux1 bin]#
[root@raclinux1 bin]# /sbin/acfsutil repl init primary -s oracle/oracle@sec -m /u06 -c prim /u05
remote connection has been established
Registering with user specified service name-prim
waiting for the standby replication site to initialize
waiting for the standby replication site to initialize
The standby replication site is initialized. ACFS replication will begin.
[root@raclinux1 bin]#
- Step 4 starts the ACFS replication processes, and the replication can already be used. It can be validated and the configuration checked with the following commands.
- Validation
[root@raclinux1 bin]# /sbin/acfsutil repl info -c -v /u06
Site: Standby
Standby status: Online
Standby mount point: /u06
Standby Oracle Net service name: sec
Primary mount point: /u05
Primary Oracle Net service name: PRIM
Primary Oracle Net alias: oracle/****@prim
Replicated tags:
Log compression: Off
Debug log level: 0
[root@raclinux1 bin]#
[root@raclinux1 bin]# /sbin/acfsutil repl bg info /u06
Resource: ora.repl.transport.sec.sec.acfs
Target State: ONLINE
Current State: ONLINE on raclinux1
Resource: ora.repl.main.sec.sec.acfs
Target State: ONLINE
Current State: ONLINE on raclinux1
Resource: ora.repl.apply.sec.sec.acfs
Target State: ONLINE
Current State: ONLINE on raclinux1
[root@raclinux1 bin]#
- Mount the primary and standby ACFS on all nodes of the cluster:
/bin/umount /dev/asm/prim-481 # Unmount on raclinux1,raclinux2
/bin/mount -t acfs /dev/asm/prim-481 /u05 # Mount on raclinux1,raclinux2
/bin/mount -t acfs /dev/asm/sec-351 /u06 # Mount on raclinux1,raclinux2
/bin/umount /dev/asm/sec-351 # Unmount on raclinux1,raclinux2
/sbin/mount.acfs -o all # Mount all on raclinux1, raclinux2
Managing Replication
- Check replication configuration and statistics:
# /sbin/acfsutil repl info -c -v /u06
# /sbin/acfsutil repl info -c -v /u05
# /sbin/acfsutil repl info -s -v /u05
- Start and stop replication: although ACFS replication is started automatically after initiation and registered with Grid Infrastructure as a resource for automatic restart, the acfsutil repl bg command can be used to stop and start the background processes and daemons that implement the replication.
# /sbin/acfsutil repl bg stop /u06
# /sbin/acfsutil repl bg start /u06
- Suspending and resuming ACFS replication. Replication can be manually paused and resumed using the acfsutil repl [pause | resume] standby_fs command, run against the standby file system; prior to pausing, execute sync. In the example below we pause replication, create a 1GB file on the primary ACFS, resume, and synchronize with the standby. The file gets replicated, and the statistics reflect the additional 1GB transferred since replication resumed.
# sync
# /sbin/acfsutil repl pause /u06
# /sbin/acfsutil repl info -s -v /u05
-------------------------------------------------------
Fri Dec 3 14:49:52 2010 - Fri Dec 3 16:44:18 2010
-------------------------------------------------------
Data replicated: 5.01GB
From writes: 5.01GB
From memory mapped updates: 0.00GB
File operations replicated: 11
………………………………..
#
# /sbin/acfsutil repl resume /u06
## Create another 1GB file on /u05
# /sbin/acfsutil repl sync /u05
# /sbin/acfsutil repl sync apply /u05
# /sbin/acfsutil repl info -s -v /u05
-------------------------------------------------------
Fri Dec 3 14:49:52 2010 - Fri Dec 3 16:56:55 2010
-------------------------------------------------------
Data replicated: 6.01GB
From writes: 6.01GB
From memory mapped updates: 0.00GB
File operations replicated: 19
………………………………….
#
Summary
In this article you looked at ACFS architecture and concepts, glimpsed the utilities for ACFS management and monitoring, and examined ACFS replication in detail.
Hi,
This was my exam question, still not sure about the answer.
Any help is much appreciated.
Which three fragments will complete this statement correctly?
In a cluster environment, an ACFS volume:
a) will be automatically mounted by a node on reboot by default
b) must be manually mounted after a node reboot
c) will be automatically mounted by a node at cluster stack startup if it is included in the ACFS mount registry
d) will be automatically mounted on all nodes if it is defined as a cluster resource, when a dependent cluster resource requires access
e) will be automatically mounted on all nodes in the cluster when the file system is registered
f) must be mounted before it can be registered
Hi,
The way I look at it, off the top of my head:
e, c, d look OK to me.
e is obvious; c is the objective of the mount registry; d if an RDBMS is dependent on an ACFS $OH.
f is wrong, since you can register without mounting.
a is wrong, since you may not have registered the volume and might want to mount it manually.
b is wrong, since you may elect to do so, but you may also register it.
What did you specify?
Regards,
Thanks a lot for your help.
One last question,
Can we mount ACFS volumes using the asmca or asmcmd utility?
Hi,
For asmcmd see
ASMCMD> help
asmcmd [-V] [-v] [--privilege] [-p] [command]
asmcmd_no_conn_str
Starts asmcmd or executes the command
asmcmd [-V] [-v] [--privilege] [-p] [command]
The environment variables ORACLE_HOME and ORACLE_SID determine the
instance to which the program connects, and ASMCMD establishes a
bequeath connection to it, in the same manner as a SQLPLUS / AS
SYSASM. The user must be a member of the OSASM group.
Specifying the -V option prints the asmcmd version number and
exits immediately.
Specifying the -v option prints extra information that can help
advanced users diagnose problems.
Specify the --privilege option to choose the type of connection. There are
only two possibilities: connecting as SYSASM or as SYSDBA.
The default value if this option is unspecified is SYSASM.
Specifying the -p option allows the current directory to be displayed
in the command prompt, like so:
ASMCMD [+DATA/ORCL/CONTROLFILE] >
[command] specifies one of the following commands, along with its
parameters.
Type “help [command]” to get help on a specific ASMCMD command.
commands:
——–
md_backup, md_restore
lsattr, setattr
cd, cp, du, find, help, ls, lsct, lsdg, lsof, mkalias
mkdir, pwd, rm, rmalias
chdg, chkdg, dropdg, iostat, lsdsk, lsod, mkdg, mount
offline, online, rebal, remap, umount
dsget, dsset, lsop, shutdown, spbackup, spcopy, spget
spmove, spset, startup
chtmpl, lstmpl, mktmpl, rmtmpl
chgrp, chmod, chown, groups, grpmod, lsgrp, lspwusr, lsusr
mkgrp, mkusr, orapwusr, passwd, rmgrp, rmusr
volcreate, voldelete, voldisable, volenable, volinfo
volresize, volset, volstat
ASMCMD>
For asmca, see the Show Command button in the GUI.
Thanks gilevski.
From a quick investigation, I think it is impossible to mount ACFS volumes using the asmcmd utility.
I can mount ACFS volumes using asmca, Oracle Enterprise Manager, or the standard Linux/Unix mount command, provided the acfs type is specified (mount -t acfs).
Can you confirm whether my understanding is right or wrong?
Correct.
How about Windows?
Do not blame me if you do not pass.
Regards,
Many thanks.
I know that I can use the acfsmountvol command for Windows 🙂
I do have one last incorrect answer and still have doubts.
Which three actions are required to create a general purpose ASM cluster file system (ACFS) to be automatically mounted by Oracle Clusterware?
a) Format an ASM volume with an ASM cluster file system
b) Create mount points on all cluster nodes where the ASM cluster file system will be mounted
c) Manually add an entry to /etc/fstab defining the volume and mount point on each node in the cluster
d) Register the mount point
I think c is incorrect; the answer should be a, b, d.
What do you think ?
Hi,
I also think a,b,d
Regards,
thanks a lot for your assistance gilevski
Hi Gilevski,
Apologies for bothering you again.
I did have a rac exam last week and still trying to find the incorrect answers.
Any idea about below question. Not sure whether “a” or “c” ?
Examine the following details from the AWR report of your three-instance RAC database:
Top 5 Timed Events
Event                    Waits     Time(s)  Avg Wait(ms)  % Total Call Time  Wait Class
CPU time                           4,580                  65.4
log file sync            276,281   1,501    5             21.4               Commit
log file parallel write  298,045   923      3             13.2               System I/O
gc current block 3-way   605,628   631      1             9.0                Cluster
gc cr block 3-way        514,218   533      1             7.6                Cluster
a) There are a large number of requests for cr blocks or current blocks currently in progress
b) Global cache access is optimal without any significant delay
c) The log file sync waits are due to cluster interconnect latency
d) To determine the frequency of two-way block requests, you must examine other events in the report
Hi,
I do not think that log file sync waits have anything to do with interconnect latency. Writing to the redo log happens within an instance.
c is incorrect to me.
This is very artificial. Based on observation, only a looks correct; for the rest you do not have any information. c is incorrect for the reason mentioned. For d, the event is not in the top list, and two-way transfers simply mean nothing is wrong.
It seems you are excluding b. Why? The average wait time is 1 ms!
Regards,
Thanks again, Gjilevski. I will do more reading on log file sync.
I do have another question, about ACFS snapshots.
I believe "d" is definitely correct, but I am not sure about a or b.
I couldn't find the answer in the Oracle manuals.
Your thoughts?
With regards to ACFS snapshots, which two are true:
a) They can be created for ACFS file systems only if the ASM disk group hosting the ADVM volume file used by the file system has free space available
b) They can be created for ACFS file systems only if the ADVM volume file used by the file system has free space available
c) They can be created only if the ASM disk group hosting the ADVM volume used by the file system has no other ASM files contained in the disk group
d) They can be created when ACFS is used both on clusters and standalone servers
e) They are accessible only on the cluster node that was used when creating the snapshot
Hi,
Did you pass? Are you trying to get answers from some dumps?
Do you know how to create an ADVM volume? Can an ADVM volume be extended by itself? Where are snapshots created? ….
Regards,
No I failed 😦
Yes, I know how to create an ADVM volume, how to take a snapshot, and how to restore it.
With regards to the above question, "d" must be correct, since ACFS can be used both on clusters and standalone servers.
I have no idea whether a or b is correct.
Hi,
I think it is b, since the snapshot is created in the ADVM volume and the ADVM volume is created with a fixed size. Therefore, you are limited by the ADVM volume, not the disk group.
Also, d) "They can be created when ACFS is used both on clusters and standalone servers" looks correct to me.
c is wrong, as you do not need a dedicated disk group for ADVM volumes.
e is wrong, as snapshots are cluster-wide.
So, by reasoning and eliminating the impossible, b & d look correct.
Double-check; this is ONLY an OPINION.
Many thanks, Gjilevski. (I really appreciate your help.)
By the way, thanks a lot for sharing this useful blog, as I can find lots of useful information here.
What do you reckon about this one?
I guess d is correct; again, not sure about the others.
You are managing a policy-managed three-instance RAC database.
You ran Database ADDM for the database and noticed gc current block congested and gc cr block congested waits.
What are two possible reasons for these wait events?
a) The wait events indicate a delay in processing has occurred in the Global Cache Service (GCS), which is usually caused by high load
b) The wait times indicate that blocks must wait, after initiating a gc block request, for the round trip from the start of the wait until the blocks arrive
c) The wait events indicate that there is block contention resulting in multiple requests for access to local blocks
d) The wait events indicate that the local instance making the request for current or consistent-read blocks was waiting for logical I/O from its own buffer cache at the same time
Hi,
I would suggest to look at https://gjilevski.wordpress.com/2011/08/02/oracle-cache-fusion-private-inter-connects-and-practical-performance-management-considerations-in-oracle-rac/
What do gc current block congested and gc cr block congested represent? Are they Oracle-related, such as concurrency or contention?
Can you look at placeholder events and substitute events after the outcome of a block request resolution is clear?
Any further thoughts on why a and b can be valid, or why the others are not correct?
Regards,
I will definitely read that post in detail.
Do you think a and b are correct?
I found below explanation from the link you posted:
gc [current/cr] [block/grant] congested – means that it has been received eventually, but with a delay, because of intensive CPU consumption, lack of memory, LMS overload due to too much work in the queues, paging, or swapping. This is worth investigating as it provides room for improvement: it indicates that LMS could not dequeue messages fast enough.
Hi,
Could be…
What do you think……
Is it not obvious?
Regards,
I thought “d” might be correct as well, but I still have doubts. I guess I need to do more reading 🙂
What do you think about this one?
The high availability services provided by Oracle Clusterware are used to protect Oracle resources, such as RAC database instances, RAC database services, and other components of the Oracle infrastructure, as well as non-Oracle resources.
Which two statements are true about the high availability capabilities of Oracle HA services?
a) RAC databases may have their instances failed over in some cases
b) ASM instances may be failed over if fewer than three nodes remain in the cluster, so that there are always at least three ASM instances available
c) If a node fails, then all resources that were active on that node will be failed over to a surviving node, if any exists
d) If a node fails, then cluster resources that were active on that node may be failed over to a surviving node, if any exists, but local resources are not failed over
e) HA services will only fail over a resource upon failure of the node where the resource was active
In my opinion a and c are correct, but I am not sure about d.
Hi,
I would suggest reading the manuals and distinguishing between local and cluster resources. Note the difference between RAC/ASM resources and generic cluster resources.
A resource with TYPE=cluster_resource fails over; one with TYPE=local_resource does not. It is all in the manuals: resource types, their specifics, etc.
[root@raclinux1 bin]# ./crsctl stat res MyTest -f
NAME=MyTest
TYPE=cluster_resource
STATE=ONLINE
TARGET=ONLINE
ACL=owner:root:rwx,pgrp:root:r-x,other::r--
ACTION_FAILURE_TEMPLATE=
ACTION_SCRIPT=/u01/app/11.2.0.3/grid/crs/public/myTest_actionScript.pl
ACTIVE_PLACEMENT=1
AGENT_FILENAME=%CRS_HOME%/bin/scriptagent
AUTO_START=always
CARDINALITY=1
CARDINALITY_ID=0
CHECK_INTERVAL=10
CREATION_SEED=99
DEFAULT_TEMPLATE=
DEGREE=1
DESCRIPTION=
ENABLED=1
FAILOVER_DELAY=0
FAILURE_INTERVAL=0
FAILURE_THRESHOLD=0
HOSTING_MEMBERS=
ID=MyTest
LOAD=1
LOGGING_LEVEL=1
NOT_RESTARTING_TEMPLATE=
OFFLINE_CHECK_INTERVAL=0
PLACEMENT=balanced
PROFILE_CHANGE_TEMPLATE=
RESTART_ATTEMPTS=2
SCRIPT_TIMEOUT=60
SERVER_POOLS=*
START_DEPENDENCIES=hard(MyTestVIP) pullup(MyTestVIP)
START_TIMEOUT=0
STATE_CHANGE_TEMPLATE=
STOP_DEPENDENCIES=hard(intermediate:MyTestVIP)
STOP_TIMEOUT=0
UPTIME_THRESHOLD=1h
[root@raclinux1 bin]#
[root@raclinux1 bin]# ./crsctl stat res ora.asm -f
NAME=ora.asm
TYPE=ora.asm.type
STATE=ONLINE
TARGET=ONLINE
ACL=owner:grid:rwx,pgrp:oinstall:rwx,other::r--
ACTION_FAILURE_TEMPLATE=
ACTION_SCRIPT=
AGENT_FILENAME=%CRS_HOME%/bin/oraagent%CRS_EXE_SUFFIX%
ALIAS_NAME=ora.%CRS_CSS_NODENAME%.ASM%CRS_CSS_NODENUMBER%.asm
AUTO_START=never
CHECK_INTERVAL=60
CHECK_TIMEOUT=30
CREATION_SEED=134
DEFAULT_TEMPLATE=PROPERTY(RESOURCE_CLASS=asm) ELEMENT(INSTANCE_NAME= %GEN_USR_ORA_INST_NAME%)
DEGREE=1
DESCRIPTION=Oracle ASM resource
ENABLED=1
GEN_USR_ORA_INST_NAME=
GEN_USR_ORA_INST_NAME@SERVERNAME(raclinux1)=+ASM1
GEN_USR_ORA_INST_NAME@SERVERNAME(raclinux2)=+ASM2
GEN_USR_ORA_INST_NAME@SERVERNAME(raclinux3)=+ASM3
ID=ora.asm
LOAD=1
LOGGING_LEVEL=1
NLS_LANG=
NOT_RESTARTING_TEMPLATE=
OFFLINE_CHECK_INTERVAL=0
PROFILE_CHANGE_TEMPLATE=
RESTART_ATTEMPTS=5
SCRIPT_TIMEOUT=60
START_DEPENDENCIES=weak(ora.LISTENER.lsnr)
START_TIMEOUT=900
STATE_CHANGE_TEMPLATE=
STOP_DEPENDENCIES=
STOP_TIMEOUT=600
TYPE_VERSION=1.2
UPTIME_THRESHOLD=1d
USR_ORA_ENV=
USR_ORA_INST_NAME=+ASM%CRS_CSS_NODENUMBER%
USR_ORA_OPEN_MODE=mount
USR_ORA_OPI=false
USR_ORA_STOP_MODE=immediate
VERSION=11.2.0.2.0
[root@raclinux1 bin]#
[root@raclinux1 bin]# ./crsctl stat res ora.racdb.db -f
NAME=ora.racdb.db
TYPE=ora.database.type
STATE=ONLINE
TARGET=ONLINE
ACL=owner:oracle:rwx,pgrp:oinstall:rwx,other::r--
ACTION_FAILURE_TEMPLATE=
ACTION_SCRIPT=
ACTIVE_PLACEMENT=1
AGENT_FILENAME=%CRS_HOME%/bin/oraagent%CRS_EXE_SUFFIX%
AUTO_START=restore
CARDINALITY=3
CARDINALITY_ID=0
CHECK_INTERVAL=1
CHECK_TIMEOUT=30
CLUSTER_DATABASE=true
CREATION_SEED=132
DATABASE_TYPE=RAC
DB_UNIQUE_NAME=RACDB
DEFAULT_TEMPLATE=PROPERTY(RESOURCE_CLASS=database) PROPERTY(DB_UNIQUE_NAME= CONCAT(PARSE(%NAME%, ., 2), %USR_ORA_DOMAIN%, .)) ELEMENT(INSTANCE_NAME= %GEN_USR_ORA_INST_NAME%) ELEMENT(DATABASE_TYPE= %DATABASE_TYPE%)
DEGREE=1
DESCRIPTION=Oracle Database resource
ENABLED=1
FAILOVER_DELAY=0
FAILURE_INTERVAL=60
FAILURE_THRESHOLD=1
GEN_AUDIT_FILE_DEST=/u01/app/oracle/admin/RACDB/adump
GEN_START_OPTIONS=
GEN_START_OPTIONS@SERVERNAME(raclinux1)=open
GEN_START_OPTIONS@SERVERNAME(raclinux2)=open
GEN_START_OPTIONS@SERVERNAME(raclinux3)=open
GEN_USR_ORA_INST_NAME=
GEN_USR_ORA_INST_NAME@SERVERNAME(raclinux1)=RACDB1
GEN_USR_ORA_INST_NAME@SERVERNAME(raclinux2)=RACDB2
GEN_USR_ORA_INST_NAME@SERVERNAME(raclinux3)=RACDB3
HOSTING_MEMBERS=
ID=ora.racdb.db
INSTANCE_FAILOVER=0
LOAD=1
LOGGING_LEVEL=1
MANAGEMENT_POLICY=AUTOMATIC
NLS_LANG=
NOT_RESTARTING_TEMPLATE=
OFFLINE_CHECK_INTERVAL=0
ONLINE_RELOCATION_TIMEOUT=0
ORACLE_HOME=/u01/app/oracle/product/11.2.0/db_3
PLACEMENT=restricted
PROFILE_CHANGE_TEMPLATE=
RESTART_ATTEMPTS=2
ROLE=PRIMARY
SCRIPT_TIMEOUT=60
SERVER_POOLS=ora.RACDB
SPFILE=+DATA/racdb/spfileracdb.ora
START_DEPENDENCIES=hard(ora.DATA.dg,ora.DATADG.dg) weak(type:ora.listener.type,global:type:ora.scan_listener.type,uniform:ora.ons,global:ora.gns) pullup(ora.DATA.dg,ora.DATADG.dg)
START_TIMEOUT=600
STATE_CHANGE_TEMPLATE=
STOP_DEPENDENCIES=hard(intermediate:ora.asm,shutdown:ora.DATA.dg,shutdown:ora.DATADG.dg)
STOP_TIMEOUT=600
TYPE_VERSION=2.2
UPTIME_THRESHOLD=1h
USR_ORA_DB_NAME=RACDB
USR_ORA_DOMAIN=
USR_ORA_ENV=
USR_ORA_FLAGS=
USR_ORA_INST_NAME=
USR_ORA_INST_NAME@SERVERNAME(raclinux1)=RACDB1
USR_ORA_INST_NAME@SERVERNAME(raclinux2)=RACDB2
USR_ORA_INST_NAME@SERVERNAME(raclinux3)=RACDB3
USR_ORA_OPEN_MODE=open
USR_ORA_OPI=false
USR_ORA_STOP_MODE=immediate
VERSION=11.2.0.3.0
[root@raclinux1 bin]#
Thanks, I am new to the RAC environment and need to do more reading on cluster resources.
Since local resources do not fail, "c" is wrong.
I think a and d are correct.
Thanks a lot.
Hi,
I am not going to entertain questions any more!!!!
Read the manual and do some testing and thinking. Does an RDBMS instance or an ASM instance fail? Why would you want two instances or two ASMs on one node?
Do two RDBMS instances or two ASM instances on one node serve any purpose???????
Why do you have load balancing or client side failover or TAF?
Test & Practice!!!!!!
Good luck!!!!
Regards,
Thanks a lot
Focus on the manuals and any Oracle University training. Set up RAC and practice. Look at the scope here http://www.oracle.com/partners/en/knowledge-zone/database/rac11g-exam-330024.html
Let me know how it goes next time.
Appreciate your time.
By the way, from a quick investigation,
D and E seem to be the correct answers for the last question :-)
Hi Gjilevski
I promise this is the last question I am not sure about.
Appreciate your assistance.
If you don't want to help, I totally understand, as you have helped me a lot.
{code}
The cluster originally consisted of four nodes: RACNODE1, RACNODE2, RACNODE3 and RACNODE4.
Now two nodes called RACNODE5 and RACNODE6 have been installed and connected to the cluster.
Which three actions should be performed to check whether the new nodes are ready for running addNode.sh, and to help correct problems?
a) cluvfy stage -pre crsinst -n RACNODE5,RACNODE6 -C +DATA -q +VOTE -orainv
b) -fixup -verbose
c) cluvfy stage -post hwos -n RACNODE5,RACNODE6 -verbose
d) cluvfy comp peer -refnode RACNODE1 -n RACNODE5,RACNODE6 -orainv -osdba -verbose
e) cluvfy stage -post hwos -n all -verbose
f) cluvfy stage -pre nodeadd -n RACNODE5,RACNODE6 -fixup
g) cluvfy comp peer -refnode RACNODE5 -n RACNODE6 -orainv -osdba -verbose
{code}
Seems e, f, a, but not sure.
Hi.
Look at
1. Oracle manual – Oracle® Clusterware Administration and Deployment Guide 11g Release 2 (11.2) E16794-09 here -> http://download.oracle.com/docs/cd/E11882_01/rac.112/e16794/cvu.htm#BABBJHHH
or
2.
https://gjilevski.wordpress.com/2011/11/17/clone-gi-and-rdbms-homes-in-oracle-rac-11-2-0-3-with-clone-pl/
https://gjilevski.wordpress.com/2011/11/17/adding-and-deleting-a-node-from-oracle-rac-11-2-0-3/
It is WRITTEN there. Tell me what you think!
Regards,
seems a,g,f
Hi,
Run each one and see what happens, if you have not figured it out yet.
Do not waste your time with dumps. They will mislead you.
Regards,
According to https://gjilevski.wordpress.com/2011/11/17/adding-and-deleting-a-node-from-oracle-rac-11-2-0-3/
d, c, f are correct.
Thanks for your help again, and thanks for sharing your knowledge with the public.
Hi,
This is what you think and your opinion.
All YOU NEED TO DO IS VERIFY!
Regards,
I don't have a cluster environment at the moment, as I will rebuild it next week.
Can you let me know if my opinion is right or wrong?
Thanks again
Hi,
Can you convince yourself that you are right? Do this on any question!
Regards,
I found it:
c) cluvfy stage -post hwos -n RACNODE5,RACNODE6 -verbose
e) cluvfy stage -post hwos -n all -verbose
f) cluvfy stage -pre nodeadd -n RACNODE5,RACNODE6 -fixup
No more rubbish questions.
Thanks
Always keep in mind “Imagination is more important than knowledge. For knowledge is limited to all we now know and understand, while imagination embraces the entire world, and all there ever will be to know and understand.” – Albert Einstein
Respected Sir,
Suppose I want to do maintenance on one of the RAC nodes.
When we restart the RAC node, do Clusterware and the database shut down gracefully, or do they shut down abort and need recovery upon startup?
In other words, is it recommended to shut down Clusterware (crsctl stop crs) before we reboot the RAC server?
Hi,
In general, a database performs instance recovery on startup if it was not shut down cleanly. You can implement a graceful database shutdown before shutting down GI/the node.
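For example, a minimal sketch of a graceful sequence before a node reboot (the database name RACDB and instance name RACDB1 are assumptions for illustration; run each command as the appropriate user):

```shell
# As the oracle user: stop the local instance gracefully
srvctl stop instance -d RACDB -i RACDB1 -o immediate

# As root: stop Clusterware and Oracle High Availability Services on this node
crsctl stop crs

# The node can now be rebooted without forcing instance recovery on startup
reboot
```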
Regards,
Hi
I have got two questions. Appreciate your assistance.
1) When I issue 'crsctl stop crs' in a cluster environment, I noticed that the instances were shut down with abort. Is this normal?
2) When I want to add a new resource in a cluster environment, is it enough to run the "crsctl add resource" command on one node, or do I need to run it across all nodes?
Hi,
1. Any particular reason for not using 'crsctl stop cluster' on the node?
2. You can issue the "crsctl add resource" command on any node where the GI is up and running.
Look at here. http://docs.oracle.com/cd/E11882_01/rac.112/e16794/crsref.htm#CHEHGGAA
This command attempts to gracefully stop resources managed by Oracle Clusterware while attempting to stop Oracle High Availability Services on the local server.
If any resources that Oracle Clusterware manages are still running after you run the crsctl stop crs command, then the command fails. Use the -f option to unconditionally stop all resources and stop Oracle High Availability Services on the local server.
If you intend to stop Oracle Clusterware on all or a list of nodes, then use the crsctl stop cluster command, because it prevents certain resources from being relocated to other servers in the cluster before the Oracle Clusterware stack is stopped on a particular server. If you must stop the Oracle High Availability Services on one or more nodes, then wait until the crsctl stop cluster command completes and then run the crsctl stop crs command on any particular nodes, as necessary.
In Oracle Clusterware 11g release 2 (11.2.0.3), when you run this command in Solaris Sparc and Solaris X64 environments, drivers remain loaded on shutdown and subsequent startup. This does not happen in Linux environments.
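A hedged sketch of the sequence described above, assuming the goal is to bring down the whole stack cluster-wide (run as root on one node, then per node as needed):

```shell
# Stop the Clusterware stack on all nodes; this prevents resources from
# being relocated to other servers while the stack comes down
crsctl stop cluster -all

# After it completes, stop Oracle High Availability Services on each
# node where a full shutdown is required
crsctl stop crs
```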
Best Regards,
Hi,
Is there a difference between using 'crsctl stop crs' and 'crsctl stop cluster'?
Hi,
Look at http://docs.oracle.com/cd/E11882_01/rac.112/e16794/crsref.htm#CHEHGGAA
If you intend to stop Oracle Clusterware on all or a list of nodes, then use the crsctl stop cluster command, because it prevents certain resources from being relocated to other servers in the cluster before the Oracle Clusterware stack is stopped on a particular server. If you must stop the Oracle High Availability Services on one or more nodes, then wait until the crsctl stop cluster command completes and then run the crsctl stop crs command on any particular nodes, as necessary.
Best Regards,
Hi
I want to test adding a cluster resource on RAC.
The agent.sh script exists only on node 1, and I just want to run the resource on node 1.
I ran the following to add the new resource, which was successful:
crsctl add resource agent -type cluster_resource -attr "ACTION_SCRIPT='/home/oraagent/agent.sh',HOSTING_MEMBERS=padclor0001,CHECK_INTERVAL='30',RESTART_ATTEMPTS='2'"
[grid@padclor0001 ~]$ crsctl start resource agent
CRS-2672: Attempting to start 'agent' on 'padclor0003'
CRS-5809: Failed to execute 'ACTION_SCRIPT' value of '/home/oraagent/agent.sh' for 'agent'. Error information 'cmd '/home/oraagent/agent.sh' not found'
CRS-5809: Failed to execute 'ACTION_SCRIPT' value of '/home/oraagent/agent.sh' for 'agent'. Error information 'cmd '/home/oraagent/agent.sh' not found'
CRS-2674: Start of 'agent' on 'padclor0003' failed
CRS-2679: Attempting to clean 'agent' on 'padclor0003'
Do you have any idea why Oracle tries to start the resource on other nodes? I've specified only node 1 as a hosting member.
Hi,
I would suggest verifying the configuration of the resource. Most likely it is a configuration issue.
Check the status after creating the resource:
./crsctl stat res db11gr2 -v
./crsctl stat res db11gr2 -f
See what I have about the topic for an idea…
https://gjilevski.com/2011/11/13/build-ha-for-third-party-application-with-oracle-gi-11-2-0-3/
https://gjilevski.com/2012/01/09/build-active-passive-ha-configuration-for-single-instance-database-with-oracle-gi-11-2-0-3/
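One likely cause, offered as a hedged note: HOSTING_MEMBERS is honored only when PLACEMENT is set to 'restricted' (or 'favored'); with the default 'balanced' placement, Clusterware is free to start the resource on any node, as seen in the output above. A sketch of the registration with an explicit placement policy, reusing the script path and server name from the comment (both assumptions of that setup):

```shell
# Re-register the resource with restricted placement so that only the
# listed hosting member is eligible to run it
crsctl delete resource agent
crsctl add resource agent -type cluster_resource \
  -attr "ACTION_SCRIPT='/home/oraagent/agent.sh',PLACEMENT='restricted',HOSTING_MEMBERS='padclor0001',CHECK_INTERVAL='30',RESTART_ATTEMPTS='2'"
```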
Best Regards,
Hi,
If you are researching, you could use OEM to configure the resource per your needs, and then research why it is configured that way.
Best Regards,
After installing Oracle 11.2.0.1 Enterprise (only the software),
when configuring a database with DBCA I am getting errors which mention ACFS:
ORA-28000: the account is locked
ERROR>/bin/sh:/sbin.acfsutil: No such file or directory
This is on a SUSE SLES11 SP1 virtual machine. Do I need ACFS? I have selected a standalone (not cluster) installation, but I still see messages about ACFS and clustering.