My.Sys.Admin: 2009

Wednesday 30 September 2009

How to list contents of an rpm in Red Hat Linux

server0103 # rpm -q -lp autosys-client-ALD-lonos-1.0-2.noarch.rpm
warning: GEOSautosys-client-ALD-lon-1.0-2.noarch.rpm: V3 DSA signature: NOKEY, key ID b4ac8093
/etc/auto.profile
/etc/xinetd.d/auto_remote_ALD
/opt/CA/sybase
/opt/CA/sybase/interfaces
/opt/CA/uajm
/opt/CA/uajm/ald
/opt/CA/uajm/ald/autosys
/opt/CA/uajm/ald/autosys/bin
/opt/CA/uajm/ald/autosys/bin/auto_remote
/opt/CA/uajm/ald/autosys/bin/autorep
/opt/CA/uajm/ald/autosys/bin/jil
/opt/CA/uajm/ald/autosys/bin/sendevent
/opt/CA/uajm/ald/autouser
/opt/CA/uajm/ald/autouser/autosys.csh
/opt/CA/uajm/ald/autouser/autosys.env
/opt/CA/uajm/ald/autouser/autosys.ksh
/opt/CA/uajm/ald/autouser/autosys.sh
/opt/ixp/agent
/opt/ixp/agent/conf
/opt/ixp/bin
/opt/ixp/bin/ixagent
/opt/ixp/bin/ixautorep
/opt/ixp/bin/ixflags
/opt/ixp/bin/ixjil
/opt/ixp/bin/ixsendevent
/opt/ixp/etc
/opt/ixp/etc/ixp.conf
/opt/ixp/lib
/opt/ixp/lib/ixp.jar
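
A long listing like this can be narrowed down to just the binaries. The following is a minimal sketch, not part of the original notes: the function name rpm_bin_files is made up for illustration, and it assumes the listing is piped in on stdin.

```shell
# Hypothetical helper: keep only paths that sit directly inside a bin directory.
# Reads an `rpm -q -lp` style listing on stdin; the signature warning line does
# not start with "/" so the pattern drops it as well.
rpm_bin_files() {
    grep -E '^/.*/bin/[^/]+$'
}

# Usage (package name taken from the example above):
#   rpm -q -lp autosys-client-ALD-lonos-1.0-2.noarch.rpm 2>/dev/null | rpm_bin_files
```

Directories such as /opt/ixp/bin are skipped because the pattern requires a file name after /bin/.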

Monday 28 September 2009

vxvm troubleshooting - claim ownership of disk for new diskgroup

this is the issue

I was trying to add a disk to a diskgroup via vxdiskadm option 1. The disk (an EMC Symmetrix device) used to belong to a Solaris host and is still owned by a diskgroup on eupr0037 (now decommissioned); I want to add it to eupr0201.

when I do a vxdisk list cxtxdx it shows the disk as owned by dgprapdcd01 (a dg on the decommissioned eupr0037)

the solution is simple: run the following command

vxdisk -f init cxtxdx

and then run vxdisk list cxtxdx again and you will see different ownership for the disk and no diskgroup assigned to it

you can then run vxdiskadm > option 1 and re-add the disk to the diskgroup you want

Friday 11 September 2009

mount an nfs mount and add it to fstab in linux

example of how to mount an nfs mount

mount -t nfs eslat001:/opt/ims /app/ims
mount -t nfs eslat001:/opt/rims /app/rims

and then add to the /etc/fstab with the following entries;

servername:/pathofdir /pathonserver nfs options 0 0

where;

servername corresponds to the hostname, IP address, or fully qualified domain name of the server exporting the file system.

/pathofdir is the path to the exported directory.

/pathonserver specifies where on the local file system to mount the exported directory. This mount point must exist before /etc/fstab is read or the mount will fail.

The nfs option specifies the type of file system being mounted.

The options area specifies mount options for the file system. For example, if the options area states rw,suid, the exported file system will be mounted read-write and the user and group ID set by the server will be used. Note that parentheses are not to be used here.
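
To tie the fields together, here is a small sketch that prints an fstab line from its parts. The function name fstab_nfs_line is hypothetical, and the default of "defaults" for a missing options argument is my assumption, not something from the notes above.

```shell
# Hypothetical helper: print an /etc/fstab line for an NFS mount.
# Arguments: server, exported path, local mount point, mount options.
fstab_nfs_line() {
    printf '%s:%s %s nfs %s 0 0\n' "$1" "$2" "$3" "${4:-defaults}"
}

# Example using the first mount shown earlier; review the line before
# appending it to /etc/fstab by hand.
fstab_nfs_line eslat001 /opt/ims /app/ims rw,suid
```

The function only prints the line, so nothing is written to /etc/fstab until you choose to paste it in.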

Wednesday 9 September 2009

vpar & npar

nPar: hard partitioning, partitioned at SBA level. A hardware failure won't affect the other partition.

vPar: soft partitioning, partitioned at LBA level. A hardware failure can affect the other partition; a software failure won't affect the other partition.

Various HP superdome stuff

useful links;

http://docs.hp.com/en/B2355-90702/ch05s05.html#babbhcbb

and

http://my.safaribooksonline.com/0131927590/ch17lev1sec4

to get to the MP>CM stuff, you need to be on the separate node that runs the MP stuff

once on that node, you get presented with the following;

(c) Copyright 2005 Hewlett-Packard Development Company, L.P.

Welcome to the
Management Processor
HP 9000 and Integrity Superdome Server SD64B

Supported firmware-updateable entity combination: Recipe SD 6.20c

MP MAIN MENU:

CO: Consoles
VFP: Virtual Front Panel
CM: Command Menu
CL: Console Logs
SL: Show Event Logs
FW: Firmware Update
HE: Help
X: Exit Connection

[eupr0200] MP>

The VFP allows you to boot and reboot partitions etc. We will come to the CM shortly; CL (console logs) is self-explanatory, as are the others.

to verify if the IP address of the console you are connected to is public or private, run MP>CM>LS and you will get confirmation

Enter HE to get a list of available commands

(Use to return to main menu.)

[euisjd0] MP:CM> LS

Current configuration of MP customer LAN interface
MAC address : 01:1a:4b:we:27:c7
IP address : 10.157.73.122 0x0aa94sdd96
Name : eupr0200
Subnet mask : 255.255.255.0 0xffffff00
Gateway : 10.157.73.122 0x0adff901
Status : UP and RUNNING

Current configuration of MP private LAN interface
MAC address : 00:22:f0:02:4d:qs
IP address : 192.128.2.11 0xds328020a
Name : priv-00
Subnet mask : 255.255.255.0 0xffffff00
Gateway : 192.128.2.11 0xdsar20a
Status : UP and RUNNING

[eudsasd0] MP:CM>

you can get lots of output from the CM

[edfsddfds0] MP:CM> sysrev

Supported firmware-updateable entity combination: Recipe SD 6.20c

         | Cab 0           | Cab 1           | Cab 8          | Cab 9          |
---------+-----------------+-----------------+----------------+----------------|
MP       | 024.004.001     |                 |                |                |
ED       | 024.015.000     |                 |                |                |
CLU      | 024.000.005     | 024.000.005     | 024.000.005    |                |
PM       | 022.001.002     | 022.001.002     | 022.001.002    |                |
RPM0     | 002.002.000     | 002.002.000     |                |                |
RPM1     | 002.002.000     | 002.002.000     |                |                |
RPM2     | 002.002.000     | 002.002.000     |                |                |
OSP      | 000.011.000     | 000.011.000     |                |                |

Key: * - Not at the highest entity revision supported by this combination.

Please type to continue, Q to quit, or S to stream: q
[efsdsffd0] MP:CM> cp

-------------------------------+
Cabinet | 0 | 1 |
-------------+--------+--------+
Slot |01234567|01234567|
-------------+--------+--------+
Partition 0 |........|XXX.....|
Partition 1 |........|.....X..|
Partition 2 |........|...X....|
Partition 3 |XXXX....|........|
Partition 4 |........|....X.X.|
Partition 5 |....XXX.|........|

[eufdsfewg] MP:CM> io

--------------------------+
Cabinet | 0 | 1 |
--------+--------+--------+
Slot |01234567|01234567|
--------+--------+--------+
Cell |XXXXXXX.|XXXXXXX.|
IO Cab |0.0.080.|1.81118.|
IO Bay |1.1.010.|1.01000.|
IO Chas |3.1.113.|3.11313.|

[asdsafv] MP:CM>
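
If you save the cp output to a file, the cell map can be summarised per partition. This is only a sketch under the assumption that the layout matches the output above (lines starting with "Partition", one X per assigned cell); the function name cells_per_partition is made up.

```shell
# Hypothetical helper: count the cells (X marks) assigned to each partition
# in saved output of the MP "cp" command, read from stdin.
cells_per_partition() {
    awk '/^Partition/ {
        n = gsub(/X/, "X")        # gsub returns the number of X marks on the line
        print "partition " $2 ": " n " cell(s)"
    }'
}

# Usage, assuming the console output was saved to cp.out:
#   cells_per_partition < cp.out
```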

Friday 14 August 2009

ILO issues and HOT KEYS

If you get an iLO issue with a remote console, as I recently did, where the server is stuck and not booting but is offering function key options (for example, F10 to enter system maintenance mode, always a good idea), and the function keys are not set up, you need to set them up. Luckily, setting up hot keys is simple: go to Remote Console > Remote Console Hot Keys and define your function key hot keys there.

FYI, I pressed F10, got into the GRUB menu and could then boot into Linux, as opposed to Windows (which is what the console was telling me it was configured as, even though I knew the OS was Linux).

Thursday 13 August 2009

powermt check to clean up dead powerpath paths

server005:root / > ./sbin/powermt display dev=all
Symmetrix ID=000287751748
Logical device ID=03E7
state=alive; policy=SymmOpt; priority=0; queued-IOs=0

Symmetrix ID=000290103418
Logical device ID=03BB
state=alive; policy=SymmOpt; priority=0; queued-IOs=0
==============================================================================
---------------- Host --------------- - Stor - -- I/O Path - -- Stats ---
### HW Path I/O Paths Interf. Mode State Q-IOs Errors
==============================================================================
18 6/0/2/1/0.97.5.19.0.13.1 c18t13d1 FA 9aA active alive 0 1
21 6/0/13/1/0.97.5.19.0.13.1 c21t13d1 FA 8aA active alive 0 1

Symmetrix ID=000290103418
Logical device ID=03CF
state=alive; policy=SymmOpt; priority=0; queued-IOs=0
==============================================================================
---------------- Host --------------- - Stor - -- I/O Path - -- Stats ---
### HW Path I/O Paths Interf. Mode State Q-IOs Errors
==============================================================================
18 6/0/2/1/0.97.5.19.0.13.6 c18t13d6 FA 9aA active dead 0 1
21 6/0/13/1/0.97.5.19.0.13.6 c21t13d6 FA 8aA active dead 0 1

server005:root / > ./sbin/powermt check
Warning: Symmetrix device path c18t13d6 is currently dead.
Do you want to remove it (y/n/a/q)? y
Warning: Symmetrix device path c21t13d6 is currently dead.
Do you want to remove it (y/n/a/q)? y

server005:root/ > ./sbin/powermt display dev=all

Symmetrix ID=000290103418
Logical device ID=03BB
state=alive; policy=SymmOpt; priority=0; queued-IOs=0
==============================================================================
---------------- Host --------------- - Stor - -- I/O Path - -- Stats ---
### HW Path I/O Paths Interf. Mode State Q-IOs Errors
==============================================================================
18 6/0/2/1/0.97.5.19.0.13.1 c18t13d1 FA 9aA active alive 0 1
21 6/0/13/1/0.97.5.19.0.13.1 c21t13d1 FA 8aA active alive 0 1
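
Before running powermt check it can help to see which paths are dead. The sketch below assumes the column layout shown in the output above (nine fields per path line, device name third, state seventh); the function name dead_paths is hypothetical.

```shell
# Hypothetical helper: print the device names of paths whose state is "dead".
# Reads `powermt display dev=all` output on stdin and relies on the column
# layout shown above (device name is the 3rd field, state the 7th).
dead_paths() {
    awk 'NF == 9 && $7 == "dead" { print $3 }'
}

# Usage:
#   /sbin/powermt display dev=all | dead_paths
```

If your PowerPath version prints the columns differently, the field numbers will need adjusting.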

Thursday 30 July 2009

How to give a user su root access with sudoers

How to add an entry in a local sudoers file to enable an individual user su root access

These are the types of entries to add. Edit the file with visudo rather than plain vi if you can, since visudo checks the syntax before saving.

#Temp access for 10 days from 200709 - itask 1374893
gbsmnt eudt0076-dt = SUROOT

#Temp access for 10 days from 300709 - itask 1392545
gbchgd eudt0076-dt = SUROOT

Thursday 23 July 2009

vxdmp - veritas dynamic multipathing

The vxdmpadm utility is a command line administrative interface to the DMP feature of VxVM.

You can use the vxdmpadm utility to perform the following tasks.

Retrieve the name of the DMP device corresponding to a particular path

List all paths under a DMP device

List all controllers connected to disks attached to the host

List all the paths connected to a particular controller

Enable or disable a host controller on the system

Rename an enclosure

Control the operation of the DMP restore daemon

The following sections cover these tasks in detail along with sample output. For more information, see the vxdmpadm(1M) manual page.

Retrieving Information About a DMP Node

The following command displays the DMP node that controls a particular physical path:

# vxdmpadm getdmpnode nodename=c3t2d1

The physical path can be specified as the nodename attribute, which must be a valid path listed in the /dev/rdsk directory.

Use the enclosure attribute with getdmpnode to obtain a list of all DMP nodes for the specified enclosure.

# vxdmpadm getdmpnode enclosure=enc0

Displaying All Paths Controlled by a DMP Node

The following command displays the paths controlled by the specified DMP node:

# vxdmpadm getsubpaths dmpnodename=c2t1d0s2

The specified DMP node must be a valid node in the /dev/vx/rdmp directory.

You can also use getsubpaths to obtain all paths through a particular host disk controller:

# vxdmpadm getsubpaths ctlr=c2

Listing Information About Host I/O Controllers

The following command lists attributes of all host I/O controllers on the system:

# vxdmpadm listctlr all

This form of the command lists controllers belonging to a specified enclosure and enclosure type:

# vxdmpadm listctlr enclosure=enc0 type=X1

Disabling a Controller

Disabling I/O to a host disk controller prevents DMP from issuing I/O through the specified controller. The command blocks until all pending I/O issued through the specified disk controller are completed.

To disable a controller, use the following command:

# vxdmpadm disable ctlr=ctlr

Enabling a Controller

Enabling a controller allows a previously disabled host disk controller to accept I/O. This operation succeeds only if the controller is accessible to the host and I/O can be performed on it. When connecting active/passive disk arrays in a non-clustered environment, the enable operation results in failback of I/O to the primary path. The enable operation can also be used to allow I/O to the controllers on a system board that was previously detached.

To enable a controller, use the following command:

# vxdmpadm enable ctlr=ctlr

Listing Information About Enclosures

To display the attributes of a specified enclosure, use the following command:

# vxdmpadm listenclosure enc0

The following command lists attributes for all enclosures in a system:

# vxdmpadm listenclosure all

Renaming an Enclosure

The vxdmpadm setattr command can be used to assign a meaningful name to an existing enclosure, for example:

# vxdmpadm setattr enclosure enc0 name=GRP_1

This example changes the name of enclosure from enc0 to GRP_1.

NOTE: The maximum length of an enclosure name is 25 characters.

Starting the DMP Restore Daemon

The DMP restore daemon re-examines the condition of paths at a specified interval. The type of analysis it performs on the paths depends on the specified checking policy.

Use the start restore command to start the restore daemon and specify the policy:

# vxdmpadm start restore policy=check_disabled

The check_disabled policy (the default) checks the condition of paths that were previously disabled due to hardware failures, and revives them if they are back online. If the policy is set to check_all, the restore daemon analyzes all paths in the system and revives the paths that are back online, as well as disabling the paths that are inaccessible.

NOTE: The DMP restore daemon does not change the disabled state of the path through a controller that you have disabled using vxdmpadm disable.

The command vxdmpadm start restore is used to set the interval of polling. For example, the polling interval is set to 400 seconds using the following command:

# vxdmpadm start restore interval=400

The default interval is 300 seconds. Decreasing this interval can adversely affect system performance. To change the interval or policy, you must stop the restore daemon and restart it with new attributes.

Stopping the DMP Restore Daemon

Use the following command to stop the DMP restore daemon:

# vxdmpadm stop restore

NOTE: Automatic path failback stops if the restore daemon is stopped.

Displaying the Status of the DMP Restore Daemon

Use the following command to display the status of the automatic path restoration daemon, its polling interval, and the policy that it uses to check the condition of paths:

# vxdmpadm stat restored

This produces output such as the following:

The number of daemons running : 1
The interval of daemon: 300
The policy of daemon: check_disabled
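
For scripted checks, the interval and policy can be pulled out of that output. This is a sketch that assumes the three-line format shown above; the function names restore_interval and restore_policy are made up for illustration.

```shell
# Hypothetical helpers: extract values from `vxdmpadm stat restored` output
# read on stdin. They split each line at the colon and print the second part.
restore_interval() {
    awk -F': *' '/interval of daemon/ { print $2 }'
}

restore_policy() {
    awk -F': *' '/policy of daemon/ { print $2 }'
}

# Usage:
#   vxdmpadm stat restored | restore_interval
```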

Displaying Information About the DMP Error Daemons

To display the number of error daemons that are running, use the following command:

# vxdmpadm stat errord

Tuesday 21 July 2009

Quick Crib for HPUX troubleshooting

uname -a
model
bdf
setboot
ioscan -fn
ioscan -funC disk
swlist -l product
strings /etc/lvmtab
vgdisplay -v
lvdisplay -v /dev/vg*/lv*
top
UNIX95=1 ps -efH

HPUX equivalent of ptree

I sorely miss the proc tools found on Solaris and Linux each time I have to deal with HP-UX: pfiles, ptree, pkill and pgrep are just too valuable to go without. Here is one hack I found that does roughly the same job as ptree (well, sort of):

bash-3.00# UNIX95=1 ps -Hef

The output lists all processes on the system grouped by parent process ID, with child processes indented below their parents. It is not nearly as good as a genuine ptree utility, since you may have to browse a longish list to find all the parents of the process you're interested in, but it is still useful. Remember not to export the UNIX95 variable: once exported, it will affect the way other programs you start from the command line behave from that point on.
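
The difference between prefixing a single command and exporting can be seen in any POSIX shell. This is a small demonstration, not HP-UX specific; it uses sh -c as a stand-in for ps so it runs anywhere.

```shell
unset UNIX95   # start from a clean state for the demonstration

# Prefix form: UNIX95 is set only in the environment of this one command.
during=$(UNIX95=1 sh -c 'echo "${UNIX95:-unset}"')

# The next command started from the same shell does not see it.
after=$(sh -c 'echo "${UNIX95:-unset}"')

echo "during: $during, after: $after"
```

This prints "during: 1, after: unset", which is why the prefix form is the safe way to run ps with UNIX95 behaviour.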

Wednesday 1 July 2009

use getfacl to show file permissions and ownership

especially good if used recursively;

getfacl -R /aa/aaa/aaaa

Tuesday 30 June 2009

hummingbird exceed

Below is the text of a document I wrote to setup Exceed.
****************************
Using Exceed X Server with SSH X11 Tunneling

Step 1: Install and configure Exceed on your PC

Step 2: Configure Exceed for Passive Mode and Multiple Windows

Exceed’s passive mode allows you to start the X Server on your PC without it making any initial attempt to connect to a specific remote host.

Set Exceed to use Passive Mode and Multiple Windows Mode. Both of these settings are Exceed defaults, but check the settings if Exceed has been used before.

1. Start -> Programs -> Hummingbird Connectivity 9.0 -> Exceed -> Xconfig
2. Set Passive Communications:
a. Click the Network and Communication link. This will open the Communications dialog box.
b. Select Passive from the mode field’s drop-down list.
c. Validate and apply changes (green checkmark on toolbar).
3. Set Multiple Windows Screen Definition:
a. Click the Display and Video link. This will open the Screen 0 dialog box.
b. Click the radio button beside Multiple in the Window Mode box in the upper left.
c. Validate and apply changes (green checkmark on toolbar).

Step 3: Configure Localhost Security

When using SSH X11 tunneling, the only host that Exceed will ever talk to is your own PC – the localhost. Thus, regardless of which or how many machines or accounts you’re going to use Exceed with, you only have to tell Exceed to answer to one machine – your local host.

1. Click the Security, Access Control and System Administration link. This will open the Security dialog box.
2. In the Host Access Control List section of the Security dialog box, click the radio button that is to the immediate left of the word File.
3. Click the Edit box to the right of the name xhost.txt. A NotePad editing session will be initiated, editing the xhost.txt file.
4. Type localhost on a new line in the file.
a. If your xhost.txt file already has other specific hosts listed, delete those lines.
5. Save your changes to xhost.txt by clicking File in the menu bar, then selecting Save.
6. Leave NotePad by clicking File in the menu bar, then selecting Exit.
7. Validate and apply changes (green checkmark on toolbar).

Step 4: Unconfigure your Unix Account

If you have set your account to talk to your X Server, you have to remove these settings before you can use it with SSH X11 tunneling.

For Korn/Bourne shell users, check your .profile file, and remove any lines that look like this:
export DISPLAY=xxx.xxx.xxx.xxx:0

After that, an X-Windows window will automatically open whenever you start an X-Windows program on any remote Unix host.

A good X-Windows program to test with when you first setup Exceed is xclock. On your Unix host using your Unix account, enter: /usr/X/bin/xclock &
and a small X-Windows window containing a clock will open on your PC’s screen.

Using Exceed X Server with SSH

When using telnet, passwords will be sent across the network in plain text and can be viewed with other network traffic using a sniffer or other methods. This is an example of the password being intercepted using the Solaris snoop utility:

158 12.50892 xxx.xx.xx.xxx -> myhost TELNET C port=665 X&d2k7GG\r

Where “X&d2k7GG\r” is your password with a return at the end. Using the Secure Shell start method will eliminate sending your password in plain text across the network.

1. Start -> Programs -> Hummingbird Connectivity 9.0 -> Exceed -> Xstart
2. In the Start Method box in the upper left, select Secure Shell (Set Display).
3. Enter the host in the box to the right of the word Host.
4. Enter the host type in the box to the right of the words Host Type by selecting Sun in the drop-down box. Then select the command to use by clicking on the ellipsis button and selecting XTerm. This will populate the command line with a predefined xterm command line.
5. To the right of the Information prompt, select the radio button labeled None.
6. In the Secure Shell Profile: field in the upper section of the xstart session, click on the ellipsis button. This will launch the Open Tunnel application.
7. In the lower left corner, click on the Add New Tunnel button. In the Tunnel name: box enter the name for your new tunnel.
8. Using the drop-down list in the Host name: field, select the host for your new tunnel.
9. Enter your Unix account login in the User name: field.
10. In the TCP port: field leave the default of 22.
11. After entering the information for the new tunnel, click OK.
12. This will add the new tunnel to the list. With your new tunnel highlighted, click Open in the lower right. The Secure Shell Profile: will now use this tunnel when opening a connection to your Unix host.

When you start Unix client connection using the Xstart Client Startup Application, you will be prompted for a Password Authentication. Enter your Unix account password and you will receive a secure connection to your Unix host.

Monday 29 June 2009

If you cannot find an emulex or qlogic HBA driver on Red hat Linux

well, this driver might be the answer;

cpq_cciss 2.4.60 Jun 26 2006 HP cciss 2.4.60 HBA driver RPM

Friday 26 June 2009

How to resize a volume within a diskgroup and add that storage to another volume in that diskgroup

Say you added space to a volume and it was the wrong volume. Luckily, both volumes are in the same diskgroup. Here is how you can move the storage from one vol to another. Very simple really;

you can do a vxdg free to see how much free space you have prior to this, then you can proceed;

# vxresize -F vxfs -g your_dg your_vol -100g

# vxresize -F vxfs -g your_dg your_other_vol +100g

How to resize a filesystem with specific disks under vxvm

A VxVM question:

I have a volume/filesystem spread over 4*146G disks. Now I want to shrink the filesystem - which I can do using vxresize. However, I want to shrink so that two of the four disks that the filesystem occupies are removed from the volume. Can I do that?

Yes, you can.

Assuming disks called disk1, disk2, disk3, disk4, volume "vol" and diskgroup "mydg", and that you want to free disks 3 and 4:

vxassist -g mydg move vol disk1 disk2 \!disk3 \!disk4

So I should run vxresize to resize the filesystem to, say, 200G and then run vxassist? I just want to know exactly what I am going to have to do, because there is a netbackup master server running on that filesystem and obviously we do not want any data loss...

Yes, that's exactly it.

Do the resize to make sure you are under the capacity of two of the four disks, then the vxassist. I had to do this recently on a system where the second plex of a mirror had been detached and disassociated, and someone then grew the volume with all of the disks in the diskgroup marked as available. As a result, additional subdisks were allocated on one of the disks that should only have been assigned to the mirror plex.

I'll see if I have a machine handy and I'll post a screen dump of this scenario, and the exact sequence.

Thursday 25 June 2009

Search for installed linux packages

rpm -qa | grep whateveryouarelookingfor

EMC Client installation and checking for linux and solaris

This web page is a quick guide on what to install and how to check that EMC SAN is attached and working

Solaris

Installing
==========================================================
Install Emulex driver/firmware, san packages (SANinfo, HBAinfo, lputil), EMC powerpath
Use lputil to update firmware
Use lputil to disable boot bios
Update /kernel/drv/lpfc.conf
Update /kernel/drv/sd.conf
Reboot
Install ECC agent

Note: when adding disks on different FA had to reboot server?

Task                      Command
------------------------  ------------------------------------------------------------
List HBA's                /usr/sbin/hbanyware/hbacmd listHBAS (use to get WWN's)
                          /opt/HBAinfo/bin/gethbainfo (script wrapped around hbainfo)
                          grep 'WWN' /var/adm/messages
HBA attributes            /opt/EMLXemlxu/bin/emlxadm
                          /usr/sbin/hbanyware/hbacmd HBAAttrib 10:00:00:00:c9:49:28:47
HBA port                  /opt/EMLXemlxu/bin/emlxadm
                          /usr/sbin/hbanyware/hbacmd PortAttrib 10:00:00:00:c9:49:28:47
HBA firmware              /opt/EMLXemlxu/bin/emlxadm
Fabric login              /opt/HBAinfo/bin/gethbainfo (script wrapped around hbainfo)
Adding Additional Disks   cfgadm -c configure c2
Disk available            cfgadm -al -o show_SCSI_lun
                          echo|format
                          inq (use to get serial numbers)
Labelling                 format
Partitioning              vxdiskadm
                          format
Filesystem                newfs or mkfs

Linux

Installing
==========================================================
Install Emulex driver, san packages (saninfo, hbanyware), firmware (lputil)
Configure /etc/modprobe.conf
Use lputil to update firmware
Use lputil to disable boot bios
Create new ram disk so changes to modprobe.conf can take effect.
Reboot
Install ECC agent

Task             Command
---------------  ------------------------------------------------------------
List HBA's       /usr/sbin/hbanyware/hbacmd listHBAS (use to get WWN's)
                 cat /proc/scsi/lpfc/*
HBA attributes   /usr/sbin/hbanyware/hbacmd HBAAttrib 10:00:00:00:c9:49:28:47
                 cat /sys/class/scsi_host/host*/info
HBA port         /usr/sbin/hbanyware/hbacmd PortAttrib 10:00:00:00:c9:49:28:47
HBA firmware     lputil
Fabric login     cat /sys/class/scsi_host/host*/state
Disk available   cat /proc/scsi/scsi
                 fdisk -l |grep -i Disk |grep sd
                 inq (use to get serial numbers)
Labelling        parted -s /dev/sda mklabel msdos (like labelling in solaris)
                 parted -s /dev/sda print
Partitioning     fdisk
                 parted
Filesystem       mkfs -j -L /dev/vx/dsk/datadg/vol01

PowerPath

Task                                    Command
--------------------------------------  ----------------------------------
HBA Info                                /etc/powermt display
Disk Info                               /etc/powermt display dev=all
Rebuild /kernel/drv/emcp.conf           /etc/powercf -q
Reconfigure powerpath using emcp.conf   /etc/powermt config
Save the configuration                  /etc/powermt save
Enable and Disable HBA cards            /etc/powermt display (get card ID)
(used for testing)                      /etc/powermt disable hba=3072
                                        /etc/powermt enable hba=3072

Thursday 11 June 2009

vxvm - what should you see if the disk is not under Vxvm control

cxtxdx auto:none online invalid

That is what you should see when a disk has been uninitialised;

you can uninitialise a disk from vxvm with the following

vxdiskunsetup -C cxtxdx
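
To check several disks at once, the status line above can be matched in a script. A minimal sketch, assuming vxdisk list prints the device first and the type second as in the line shown; the function name uninitialised_disks is hypothetical.

```shell
# Hypothetical helper: print devices that VxVM reports as not initialised.
# Reads `vxdisk list` output on stdin and matches the status shown above
# (type auto:none, status "online invalid").
uninitialised_disks() {
    awk '$2 == "auto:none" && /online invalid/ { print $1 }'
}

# Usage:
#   vxdisk list | uninitialised_disks
```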

Wednesday 10 June 2009

restoring a file with networker

in this example, I restore an oracle cron file, and then later on show how to restore an earlier oracle cron file
so - first thing I did was invoke the recover command, specifying the networker server (cluster) and client;

eudt0201:root /tmp/cronoratab > /opt/networker/bin/recover -s ukprbknws001 -c eudt0201-bk

/tmp/cronoratab/ not in index

will exit.

and then entered the directory I needed to recover the file from;

Enter directory to browse: /var/spool/cron/crontabs/

recover: Current working directory is /var/spool/cron/crontabs/

help is our friend!

recover> help

Available commands are:

add [-q] [filename] - add `filename' to list of files to be recovered

cd [dir] - change directory to dir

changetime [date] - change the time that you are browsing

debug

delete [filename] - delete `filename' from the recover list

destination - print destination location for recovered files

dir [/w] [filename...] - list filename

exit - immediately exit program

force - overwrite existing files

help or `?' - print this list

lf [-aAcCdfFgilLqrRsStu1] [filename...] - list filename type

list [-c | -l] - list the files marked for recover

ll [-aAcCdfFgilLqrRsStu1] [filename...] - long list filename

ls [-aAcCdfFgilLqrRsStu1] [filename...] - list filename

noforce - do not overwrite existing files

pwd - print current directory

quit - immediately exit program

recover - recover requested files

relocate [dir] - specify new location for recovered files

verbose - toggle verbose mode; feedback about what is going on

versions [filename] - report on each version of file `filename

volumes [filename] - report volumes needed to recover marked files

`filename' can be either a file or a directory

I used 'destination' to check where the file was going to be recovered to;

recover> destination

recover files into their original location

and then changed the destination using relocate;

recover> relocate

will recover files into their original location

New destination directory: /tmp/cronoratab

and confirmed that with the destination command;

recover> destination

recover files into /tmp/cronoratab

listed the files in the directory I want to restore from

recover> ls

adm oracle root sys

recover> ls -lrt

total 32

-r-------- root 61 Jun 07 2007 sys

-r-------- root 771 Apr 30 2008 adm

-r-------- root 1157 Jul 04 2008 root

-r-------- root 1 May 21 09:27 oracle

and then added the file I want to recover with the add command

recover> add oracle

1 file(s) marked for recovery

and then recovered it with the recover command

recover> recover

recover: Total estimated disk space needed for recover is 8 KB.

Recovering 1 file from /var/spool/cron/crontabs/ into /tmp/cronoratab

Volumes needed (all on-line):

AD0085 at ACSLS01_D

Requesting 1 file(s), this may take a while...

./oracle

Received 1 file(s) from NSR server `ukprbknws001'

Recover completion time: Wed Jun 10 11:00:28 2009

recover> quit

eudt0201:root /tmp/cronoratab > ls

oracle

and then there is this other example where I needed an earlier file;

recover> ls -lrt

total 32

-r-------- root 61 Jun 07 2007 sys

-r-------- root 771 Apr 30 2008 adm

-r-------- root 1157 Jul 04 2008 root

-r-------- root 1 May 21 09:27 oracle

by using the changetime command and specifying a date in May

recover> changetime May 19

time changed to Tue May 19 23:59:59 2009

recover> ls

adm oracle root sys

recover> ls -lrt

total 32

-r-------- root 61 Jun 07 2007 sys

-r-------- root 771 Apr 30 2008 adm

-r-------- root 1157 Jul 04 2008 root

-r-------- root 3377 Mar 31 07:45 oracle

recover> add oracle

1 file(s) marked for recovery

recover> destination

recover files into /tmp/cronoratab

recover> recover

recover: Total estimated disk space needed for recover is 8 KB.

Recovering 1 file from /var/spool/cron/crontabs/ into /tmp/cronoratab

Volumes needed (all on-line):

AD0630 at ACSLS01_D

Requesting 1 file(s), this may take a while...

./oracle

./oracle file exists, overwrite (n, y, N, Y) or rename (r, R) [n]? y

overwriting ./oracle

Received 1 file(s) from NSR server `ukprbknws001'

Recover completion time: Wed Jun 10 11:17:46 2009

recover> quit

Hope this helps!

Darren

Monday 8 June 2009

Quick example of resizing a fs with vxvm

NOTE - this is on a HPUX server

Before

eupr0201:root /root > bdf |grep -i pwdcli01
/dev/vx/dsk/dgpwdcli01/oracledev01
6291456 3398357 2719643 56% /opt/oracle/PWDCLI01
/dev/vx/dsk/dgpwdcli01/oradatadev01
35651584 34456224 1186032 97% /oradata/PWDCLI01
/dev/vx/dsk/dgpwdcli01/archivedev01
12582912 4325 11792561 0% /archive/PWDCLI01

check what you can increase it to;

eupr0201:root /root > vxdg -g dgpwdcli01 free

DISK DEVICE TAG OFFSET LENGTH FLAGS

dgpwdcl02 c14t11d3 c14t11d3 18835744 16854464 -

after addition of 10Gb on oradata filesystem for pwdcli01 on the eupr0201/eupr0202

eupr0201:root /root > /etc/vx/bin/vxresize -F vxfs -g dgpwdcli01 oradatadev01 +10485760

eupr0201:root /root > bdf |grep -i pwdcli01
/dev/vx/dsk/dgpwdcli01/oracledev01
6291456 3398358 2719642 56% /opt/oracle/PWDCLI01
/dev/vx/dsk/dgpwdcli01/oradatadev01
46137344 34456544 11589552 75% /oradata/PWDCLI01
/dev/vx/dsk/dgpwdcli01/archivedev01
12582912 4325 11792561 0% /archive/PWDCLI01
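
The numbers above can be sanity-checked with a bit of shell arithmetic. This sketch assumes that the value passed to vxresize on this HP-UX box is in 1 KB units (the bdf sizes before and after differ by exactly that amount) and that the LENGTH column of vxdg free uses the same unit; the function name gb_to_kb is made up.

```shell
# Hypothetical helper: convert gigabytes to 1 KB units.
gb_to_kb() {
    echo $(( $1 * 1024 * 1024 ))
}

before_kb=35651584    # oradatadev01 size in the first bdf listing
after_kb=46137344     # oradatadev01 size in the second bdf listing
free_units=16854464   # LENGTH reported by vxdg free (unit assumed to match)

grow=$(gb_to_kb 10)
echo "requested growth: $grow"
echo "observed growth:  $(( after_kb - before_kb ))"
[ "$grow" -le "$free_units" ] && echo "fits in free space"
```

Both growth figures come out as 10485760, matching the +10485760 given to vxresize.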

Saturday 6 June 2009

hp ilo trouble shooting

can you ping its ip address
can someone physically attach to it locally
is the cable attached? is it a good cable?
are the network settings correct?
is the activity light on (the blue light)
does a reboot shift the issue
does a cold reset shift the issue

Shutting down the linux system

If you look at the scripts in runlevel 0, you'll find a number of services being shut down, followed by the killing of all active processes, and finally, the halt script in /etc/rc.d/init.d directory executing the shutdown.

The halt script is used to either halt or reboot your system, depending on how it is called. What happens during a shutdown? If you're familiar with other operating systems (such as DOS), you remember that all you had to do was close any active application and then turn off the computer. Although Linux is easy to use, shutting down your computer is not as simple as turning it off. (You can try this, but you do so at your own risk.) A number of processes must take place before you or Linux turns off your computer. The following sections take a look at some of the commands involved.

shutdown
Although many people use Linux as single users on a single computer, many of us use computers on either a distributed or shared network. If you've ever been working under a tight deadline in a networked environment, you know the dreadful experience of seeing a System is going down in 5 minutes! message from the system administrator. You might also know the frustration of working on a system on which the system administrator is trying to perform maintenance, suffering seemingly random downtimes or frozen tasks.

Luckily for most users, maintenance jobs are performed during off hours, when most people are home with their loved ones or fast asleep in bed. Unluckily for sysadmins, this is the perfect time for system administration or backups, and one of the top reasons for the alt.sysadmin.recovery newsgroup.

The primary command to stop Linux is the shutdown command. Like most UNIX commands, shutdown has a number of options. A man page for the shutdown command is included with Red Hat Linux, but you can quickly read its command-line syntax if you use an illegal option, such as -z. Thanks to the programmer, here it is:

Usage: shutdown [-akrhfnc] [-t secs] time [warning message]
-a: use /etc/shutdown.allow
-k: don't really shutdown, only warn.
-r: reboot after shutdown.
-h: halt after shutdown.
-f: do a 'fast'reboot (skip fsck).
-F: Force fsck on reboot.
-n: do not go through "init" but go down real fast.
-c: cancel a running shutdown.
-t secs: delay between warning and kill signal.
** the "time" argument is mandatory! (try "now") **
To properly shut down your system immediately, use the -h option, followed by the word now or the numeral 0:

# shutdown -h now

or

# shutdown -h 0

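If you want to wait for a while before the shutdown or reboot happens, put the delay in the time argument instead of now: +m for a number of minutes, or hh:mm for a clock time. The -t option does something different: it sets the number of seconds between the warning and the kill signal sent to running processes. If you want to restart your computer, use the -r option, along with the word now or the numeral 0: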

# shutdown -r now

or

# shutdown -r 0
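
As a rough sketch of how the time argument and the warning message fit together, the helper below only prints a shutdown command line for review and never runs it. The function name shutdown_cmd and its argument order are made up for this example; the -h, -r and -c options and the mandatory time argument come from the usage text above.

```shell
# Print (do not run) a shutdown command line so it can be checked first.
# Hypothetical helper: mode is h (halt) or r (reboot), minutes is the delay
# before the shutdown starts, and the remaining words are the warning message.
shutdown_cmd() {
    mode=$1; minutes=$2; shift 2
    case $mode in
        h|r) ;;
        *) echo "mode must be h or r" >&2; return 1 ;;
    esac
    # The delay goes in the time argument: "now" or "+minutes".
    if [ "$minutes" -eq 0 ]; then when=now; else when="+$minutes"; fi
    printf 'shutdown -%s %s "%s"\n' "$mode" "$when" "$*"
}

shutdown_cmd r 5 System is going down for maintenance
# A shutdown that is already counting down can be cancelled with: shutdown -c
```

Printing the command first is just a safety habit on a shared box; paste the line into a root shell once it looks right.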

NOTE

You'll find two curious text strings embedded in the shutdown program:

"You don't exist. Go away."
"Oh hello Mr. Tyler - going DOWN?"
Both are found by executing this:

# strings /sbin/shutdown

To find out about "You don't exist. Go away", see Robert Kiesling's Linux Frequently Asked Questions with Answers. You should be able to find a copy at http://www.linuxdoc.org/FAQ.

You can also use linuxconf to shut down your computer. If you're logged in as the root operator, enter the following from the command line of your console or an X11 terminal window:

# linuxconf --shutdown

linuxconf presents a shutdown dialog box, as shown in Figure 9.4. To restart your system, press the Tab key until you highlight the Accept button, and then press the Enter key. You can also enter a time delay or halt your system immediately, and specify a message to broadcast to all your users when you execute the shutdown.

Figure 9.4 The linuxconf command will perform a system reboot or shutdown.

halt and reboot
Two other commands also stop or restart your system: halt and reboot. reboot is a symbolic link to halt, which notifies the kernel of a shutdown or reboot. Although you should always use shutdown to restart your system, you can use the "Vulcan neck pinch": Ctrl+Alt+Del.

If you use the keyboard form of this command, you'll find that Linux uses the following command:

# shutdown -t3 -r now

NOTE

This command is defined in your system's initialization table, /etc/inittab.

Restarting your computer with the shutdown command calls the sync command, which flushes buffered filesystem data to disk, including the inodes (the structures that describe each of your files). If you exit Linux without updating this information, Linux could lose track of your files on disk, and that spells disaster!

NOTE

The only time you'll want to risk shutting down Linux through a hard reset or the power-off switch on your computer is if you can't quickly kill a destructive process, such as an accidental rm -fr /*. Yet another reason to never run Linux as the root operator all the time!

By now you should know that exiting Linux properly can help you avoid problems with your system. What happens if something goes wrong? In the next section you learn preventive measures, how to maintain your filesystem, and how to recover and overcome problems.

Thursday 4 June 2009

cancelling an at job

In Unix, how do I cancel a batch job?

In Unix, if you scheduled a job with at or batch, you can cancel it at the Unix prompt by entering:

at -r jobnumber
Replace jobnumber with the number of the job that at or batch reported when you submitted the job. On some systems, you may use atrm instead of at -r.

If you don't remember the job number, you can get a listing of your jobs by entering:

at -l
Each job will be listed with its job number, queue, and the time it was originally scheduled to execute.

On some systems, the atq command is available to list all the jobs on the system. To use this command, at the Unix prompt, enter:

atq
If your job is already running, you will need to find the process ID and kill it. On System V implementations (including all UITS central systems at Indiana University), list all running processes by entering:

ps -fu username
Replace username with your username. The equivalent BSD command is:

ps x
Once you have the process ID, enter:

kill pid
Replace pid with the process ID. If it still will not terminate, try entering:

kill -9 pid

how to run an at job

state the at command with the time and press return
test0029:root /opt/sysadmin/ttt > at 10:00am tomorrow

then specify the command to execute at that time

cp -p /opt/sysadmin/bcv/PWDSLI02BCVbackup.ksh.amended.jun042009 PWDSLI02BCVbackup.ksh

then CTRL-D to finish

warning: commands will be executed using /usr/bin/sh
job 1244192400.a at Fri Jun 5 10:00:00 2009
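
If you want to cancel the job later you need the job identifier from that last line. A small sketch for pulling it out, assuming the confirmation always has the form "job <id> at <date>" as in the output above; the function name at_job_id is made up for this example.

```shell
# Read the text that at prints after you submit a job (on stdin) and
# print just the job identifier, e.g. 1244192400.a
# Assumes a confirmation line of the form: job <id> at <date>
at_job_id() {
    awk '$1 == "job" && $3 == "at" { print $2 }'
}

# Intended use. This assumes at writes its confirmation to stderr, which is
# the behaviour I have seen but may differ on your platform:
#   id=$(echo "cp -p file.new file" | at 10:00am tomorrow 2>&1 | at_job_id)
#   at -r "$id"        # or: atrm "$id"
```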

listing available disks on a linux rhel system

dmesg | grep disk
fdisk -l
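
fdisk -l prints a lot, so here is a rough filter that keeps only the whole-disk lines. It assumes header lines of the form "Disk /dev/sda: 146.8 GB, 146815733760 bytes" (the device and size here are invented sample values) and the function name list_disks is just for this example.

```shell
# Read `fdisk -l` output on stdin and print "device size_in_bytes"
# for each whole disk. Partition lines and geometry lines are ignored.
list_disks() {
    awk '/^Disk \/dev\// {
        dev = $2; sub(/:$/, "", dev)          # strip the trailing colon
        for (i = 3; i < NF; i++)              # the byte count sits just before "bytes"
            if ($(i + 1) ~ /^bytes/) { print dev, $i; break }
    }'
}

# Intended use (as root):
#   fdisk -l 2>/dev/null | list_disks
```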

Tuesday 26 May 2009

Things not to do on an ILO

unless you need to take down the server in a hurry, do not choose the CTRL-ALT-DEL function button on the ilo

it will reboot the server

understanding the emc syminq command and what it tells you about devices

so here is a syminq command output

eudt0206:root / > ./var/tmp/SYMCLI.depot/SYMCLI/SYMCLI/SYMCLI/V6.4.0/bin/syminq

Device Product Device
---------------------------- --------------------------- ---------------------
Name Type Vendor ID Rev Ser Num Cap (KB)
---------------------------- --------------------------- ---------------------
/dev/rdsk/c0t0d0 COMPAQ BF1468A4CC HPB5 3KN2TWTK 143374744
/dev/rdsk/c0t0d0s1 COMPAQ BF1468A4CC HPB5 3KN2TWTK 143374744
/dev/rdsk/c0t0d0s2 COMPAQ BF1468A4CC HPB5 3KN2TWTK 143374744
/dev/rdsk/c0t0d0s3 COMPAQ BF1468A4CC HPB5 3KN2TWTK 143374744
/dev/rdsk/c0t1d0 COMPAQ BF1468A4CC HPB5 3KN2VX2A 143374744
/dev/rdsk/c2t0d0 COMPAQ BF1468A4CC HPB5 3KN2THV5 143374744
/dev/rdsk/c2t0d0s1 COMPAQ BF1468A4CC HPB5 3KN2THV5 143374744
/dev/rdsk/c2t0d0s2 COMPAQ BF1468A4CC HPB5 3KN2THV5 143374744
/dev/rdsk/c2t0d0s3 COMPAQ BF1468A4CC HPB5 3KN2THV5 143374744
/dev/rdsk/c2t1d0 COMPAQ BF1468A4CC HPB5 3KN2STSA 143374744
/dev/rdsk/c3t2d0 Optiarc DVD RW AD-5* 1.31 Nov20,2 N/A
/dev/rdsk/c4t0d0 GK EMC SYMMETRIX 5771 0400040000 2880
/dev/rdsk/c4t9d4 M(4) EMC SYMMETRIX 5771 040071D000 35692800
/dev/rdsk/c4t9d5 M(4) EMC SYMMETRIX 5771 0400721000 35692800
/dev/rdsk/c4t9d6 M(4) EMC SYMMETRIX 5771 0400725000 35692800
/dev/rdsk/c4t9d7 M(4) EMC SYMMETRIX 5771 0400729000 35692800
/dev/rdsk/c4t13d7 M(4) EMC SYMMETRIX 5771 040072D000 35692800
/dev/rdsk/c7t0d0 GK EMC SYMMETRIX 5771 0400040000 2880
/dev/rdsk/c7t9d4 M(4) EMC SYMMETRIX 5771 040071D000 35692800
/dev/rdsk/c7t9d5 M(4) EMC SYMMETRIX 5771 0400721000 35692800
/dev/rdsk/c7t9d6 M(4) EMC SYMMETRIX 5771 0400725000 35692800
/dev/rdsk/c7t9d7 M(4) EMC SYMMETRIX 5771 0400729000 35692800
/dev/rdsk/c7t13d7 M(4) EMC SYMMETRIX 5771 040072D000 35692800
/dev/vx/rdmp/c0t0d0s2 COMPAQ BF1468A4CC HPB5 3KN2TWTK 143374744
/dev/vx/rdmp/c0t1d0 COMPAQ BF1468A4CC HPB5 3KN2VX2A 143374744
/dev/vx/rdmp/c2t0d0s2 COMPAQ BF1468A4CC HPB5 3KN2THV5 143374744
/dev/vx/rdmp/c2t1d0 COMPAQ BF1468A4CC HPB5 3KN2STSA 143374744
/dev/vx/rdmp/c4t9d4 M(4) EMC SYMMETRIX 5771 040071D000 35692800
/dev/vx/rdmp/c4t9d5 M(4) EMC SYMMETRIX 5771 0400721000 35692800
/dev/vx/rdmp/c4t9d6 M(4) EMC SYMMETRIX 5771 0400725000 35692800
/dev/vx/rdmp/c4t9d7 M(4) EMC SYMMETRIX 5771 0400729000 35692800
/dev/vx/rdmp/c4t13d7 M(4) EMC SYMMETRIX 5771 040072D000 35692800

let us break it down further - here we have a total of 5 EMC symmetrix disks, with multipathing;

/dev/rdsk/c4t0d0 GK EMC SYMMETRIX 5771 0400040000 2880
/dev/rdsk/c4t9d4 M(4) EMC SYMMETRIX 5771 040071D000 35692800
/dev/rdsk/c4t9d5 M(4) EMC SYMMETRIX 5771 0400721000 35692800
/dev/rdsk/c4t9d6 M(4) EMC SYMMETRIX 5771 0400725000 35692800
/dev/rdsk/c4t9d7 M(4) EMC SYMMETRIX 5771 0400729000 35692800
/dev/rdsk/c4t13d7 M(4) EMC SYMMETRIX 5771 040072D000 35692800
/dev/rdsk/c7t0d0 GK EMC SYMMETRIX 5771 0400040000 2880
/dev/rdsk/c7t9d4 M(4) EMC SYMMETRIX 5771 040071D000 35692800
/dev/rdsk/c7t9d5 M(4) EMC SYMMETRIX 5771 0400721000 35692800
/dev/rdsk/c7t9d6 M(4) EMC SYMMETRIX 5771 0400725000 35692800
/dev/rdsk/c7t9d7 M(4) EMC SYMMETRIX 5771 0400729000 35692800
/dev/rdsk/c7t13d7 M(4) EMC SYMMETRIX 5771 040072D000 35692800

that are also being used under vxvm;

/dev/vx/rdmp/c4t9d4 M(4) EMC SYMMETRIX 5771 040071D000 35692800
/dev/vx/rdmp/c4t9d5 M(4) EMC SYMMETRIX 5771 0400721000 35692800
/dev/vx/rdmp/c4t9d6 M(4) EMC SYMMETRIX 5771 0400725000 35692800
/dev/vx/rdmp/c4t9d7 M(4) EMC SYMMETRIX 5771 0400729000 35692800
/dev/vx/rdmp/c4t13d7 M(4) EMC SYMMETRIX 5771 040072D000 35692800

to find out the device ID, look at the 6th column (the Ser Num field);

/dev/vx/rdmp/c4t9d4 M(4) EMC SYMMETRIX 5771 040071D000 35692800

040071D000 breaks down as
04 are the last 2 digits of the serial number of the Symmetrix array/frame
0071D is the Symmetrix device ID number
000 is always 000 and doesn't have meaning

Also - GK stands for Gatekeeper devices, and the M stands for a meta device (the number in brackets is the count of member devices that make up the meta, here 4)
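
The serial-number breakdown above can be sketched in shell. This assumes the 10-character layout described (2 characters of frame serial, 5 of device ID, 3 of padding) and that the EMC rows have the serial in the 6th whitespace-separated field; the function names are made up for this example.

```shell
# Split a syminq "Ser Num" value into its parts (2 + 5 + 3 characters).
sym_frame_suffix() { printf '%s\n' "$1" | cut -c1-2; }
sym_dev_id()       { printf '%s\n' "$1" | cut -c3-7; }

# Read syminq output on stdin and list each Symmetrix device ID once.
# Gatekeepers (type GK) are skipped; the same device seen down several
# paths (c4..., c7..., /dev/vx/rdmp/...) collapses to a single line.
sym_unique_devs() {
    awk '$3 == "EMC" && $4 == "SYMMETRIX" && $2 != "GK" { print substr($6, 3, 5) }' | sort -u
}

sym_dev_id 040071D000

# Intended use on a host with SYMCLI installed:
#   syminq | sym_unique_devs
```

On the listing above this should come back with five device IDs, matching the five disks counted by hand.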

VxVm - Show total used or occupied disk space in a particular disk group

Since the basic unit of allocation is the subdisk, use vxprint and total up
the lengths of all the subdisks in a particular diskgroup, i.e.

vxprint -g diskgroup_name -s
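
Totalling by hand gets tedious, so here is a sketch that does the sum with awk. It assumes the default vxprint record layout, where subdisk lines start with "sd" and LENGTH is the 5th field; check that against your own output before trusting the number. The function name sum_subdisks is made up for this example.

```shell
# Read `vxprint -g <dg> -s` output on stdin and print the total length of
# all subdisk ("sd") records, in 512-byte sectors and in KB.
# Assumes LENGTH is field 5 of each sd line.
sum_subdisks() {
    awk '$1 == "sd" { total += $5 }
         END { printf "%.0f sectors (%.0f KB)\n", total, int(total / 2) }'
}

# Intended use on a VxVM host (diskgroup_name is a placeholder):
#   vxprint -g diskgroup_name -s | sum_subdisks
```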

vxdg free

The Veritas volume manager (VxVM) provides logical volume management capabilities across a variety of platforms. As you create new volumes, it is often helpful to know how much free space is available. You can find free space using two methods. The first method utilizes vxdg’s “free” option:

$ vxdg -g oradg free

GROUP DISK DEVICE TAG OFFSET LENGTH FLAGS
oradg c3t20d1 c3t20d1s2 c3t20d1 104848640 1536 -
oradg c3t20d3 c3t20d3s2 c3t20d3 104848640 1536 -
oradg c3t20d5 c3t20d5s2 c3t20d5 104848640 1536 -
oradg c3t20d7 c3t20d7s2 c3t20d7 104848640 1536 -
oradg c3t20d9 c3t20d9s2 c3t20d9 104848640 1536 -

The “LENGTH” column displays the number of 512-byte blocks available on each disk drive in the disk group “oradg”. The OFFSET column shows where on the disk that free region starts. To calculate the size of a volume, use vxprint and look for the "length". The volume length is in sectors; convert to kilobytes by dividing by 2.
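
To get one free-space figure for the whole disk group, the LENGTH values can be added up. A minimal sketch, assuming the column layout shown in the vxdg output above (LENGTH as the 6th field, in 512-byte blocks); the function name free_in_dg is made up for this example.

```shell
# Read `vxdg -g <dg> free` output on stdin and print the total free space.
# Only lines whose 6th field is a number are counted, so the header line
# and any blank lines are skipped.
free_in_dg() {
    awk '$6 ~ /^[0-9]+$/ { blocks += $6 }
         END { printf "%.0f blocks free (%.0f KB)\n", blocks, int(blocks / 2) }'
}

# Intended use on a VxVM host:
#   vxdg -g oradg free | free_in_dg
```

For the five disks listed above that works out to 5 x 1536 = 7680 blocks, or 3840 KB.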

Friday 22 May 2009

sftp logging on linux

try
/var/log/secure
or
/var/log/auth.log
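
Which of the two files exists depends on the distribution, so a small sketch that picks whichever one is readable. The function name first_existing is made up for this example, and whether sftp activity is actually logged there depends on how sshd and its sftp subsystem are configured, which I am assuming rather than stating.

```shell
# Print the first readable file from the argument list; fail if none is.
first_existing() {
    for f in "$@"; do
        if [ -r "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    return 1
}

# Intended use (as root):
#   log=$(first_existing /var/log/secure /var/log/auth.log) && grep -i sftp "$log"
```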

Linux printing

Setting up network printers in linux

Make sure the printer name resolves (use nslookup and also do a ping test) – if that is the case you can assume that the initial printer config has been set up and the printer is on the network (ie the initial jetadmin setup)

Confirm that there is /etc/printcap present

Su to root

Run the following;

redhat-config-printer

or

system-config-printer

you get this GUI

you can then send a test print and then you can run an

lpq

to confirm that the queue is active

also do a ps -ef | grep cupsd

Wednesday 20 May 2009

If you get a VCS service group offline and faulted

Got an issue where the hastatus summary showed the service group as OFFLINE|FAULTED;

-- SYSTEM STATE
-- System State Frozen

A eupr0001 RUNNING 0
A eupr0002 RUNNING 0

-- GROUP STATE
-- Group System Probed AutoDisabled State

B commonsg eupr0001 Y N ONLINE
B commonsg eupr0002 Y N ONLINE
B nsr01_sg eupr0001 Y N OFFLINE|FAULTED
B nsr01_sg eupr0002 Y N ONLINE

-- RESOURCES FAILED
-- Group Type Resource System

C nsr01_sg Application NetWorker eup

so tried the following;

hares -display
For each resource that is faulted run:

hares -clear resource-name -sys faulted-system

so in this case

hares -clear NetWorker -sys eupr0001

If all of these clear, then run hastatus -summary and make sure that these are clear. You should then see the FAULTED removed, and just be left with the ONLINE status
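
With more than one failed resource it is easy to mistype a name, so here is a sketch that builds the hares -clear lines from the summary output. It only prints the commands and runs nothing. It assumes failed-resource rows start with "C" and carry group, type, resource and system in fields 2 to 5, as in the RESOURCES FAILED section of the summary; the function name print_clear_cmds is made up for this example.

```shell
# Read `hastatus -summary` output on stdin and print one `hares -clear`
# command per failed resource. Rows with fewer than 5 fields are skipped.
print_clear_cmds() {
    awk '$1 == "C" && NF >= 5 { printf "hares -clear %s -sys %s\n", $4, $5 }'
}

# Intended use on a VCS node; review the printed lines before running them:
#   hastatus -summary | print_clear_cmds
```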

-------------------------------------------------------------------------
nsr01_proxy eupr0001 ONLINE
nsr01_proxy eupr0002 ONLINE
^Ceupr0001> hastatus -sum

-- SYSTEM STATE
-- System State Frozen

A eupr0001 RUNNING 0
A eupr0002 RUNNING 0

-- GROUP STATE
-- Group System Probed AutoDisabled State

B commonsg eupr0001 Y N ONLINE
B commonsg eupr0002 Y N ONLINE
B nsr01_sg eupr0001 Y N OFFLINE
B nsr01_sg eupr0002 Y N ONLINE
eupr0001>

If some don't clear you MAY be able to clear them on the group level. Only do this as last resort:

hagrp -disableresources groupname
hagrp -flush group -sys sysname
hagrp -enableresources groupname

To get a group to go online:

hagrp -online group -sys desired-system

However, if this was not the issue (because, despite doing flushes, clears and onlines, VCS was stating there was no problem), it could be something else, such as licenses. I ran nsrwatch on the server and could see an issue with the NetWorker licenses;

Server: ukprbklas001-mn.emea.abnamro-net.com Wed May 20 19:34:30 2009

Up since: Mon Jul 7 18:43:09 2008 Version: NetWorker nw_7_3_2_jumbo.Build.394 Base enabler disabled
Saves: 0 session(s) Recovers: 0 session(s)
Device type volume
/tmp/fd1 adv_file D.001
1/_AF_readonly adv_file D.001.RO read only
/tmp/fd2 adv_file W.001
2/_AF_readonly adv_file W.001.RO read only
/tmp/fd3 adv_file ukprbklas001.001
3/_AF_readonly adv_file ukprbklas001.001.RO read only

Sessions:

Messages:
Sat 00:13:00 registration info event: Server is disabled: Install base enabler
Sun 00:13:00 registration info: License enabler #none (NetWorker/10 Eval) has expired.
Sun 00:13:00 registration info event: Server is disabled: Install base enabler
Mon 00:13:00 registration info: License enabler #none (NetWorker/10 Eval) has expired.
Mon 00:13:00 registration info event: Server is disabled: Install base enabler
Tue 00:13:00 registration info: License enabler #none (NetWorker/10 Eval) has expired.
Tue 00:13:00 registration info event: Server is disabled: Install base enabler
Wed 00:13:00 registration info: License enabler #none (NetWorker/10 Eval) has expired.
Wed 00:13:00 registration info event: Server is disabled: Install base enabler
Thu 00:13:00 registration info: License enabler #none (NetWorker/10 Eval) has expired.
Thu 00:13:00 registration info event: Server is disabled: Install base enabler
Fri 00:13:00 registration info: License enabler #none (NetWorker/10 Eval) has expired.
Fri 00:13:00 registration info event: Server is disabled: Install base enabler
Sat 00:13:00 registration info: License enabler #none (NetWorker/10 Eval) has expired.
Sat 00:13:00 registration info event: Server is disabled: Install base enabler
Sun 00:13:00 registration info: License enabler #none (NetWorker/10 Eval) has expired.
Sun 00:13:00 registration info event: Server is disabled: Install base enabler
Mon 00:13:00 registration info: License enabler #none (NetWorker/10 Eval) has expired.
Mon 00:13:00 registration info event: Server is disabled: Install base enabler
Tue 00:13:00 registration info: License enabler #none (NetWorker/10 Eval) has expired.
Tue 00:13:00 registration info event: Server is disabled: Install base enabler
Wed 00:13:00 registration info: License enabler #none (NetWorker/10 Eval) has expired.
Wed 00:13:00 registration info event: Server is disabled: Install base enabler

Pending:
Mon 18:43:08 registration info: Server is disabled: Install base enabler

Tuesday 19 May 2009

VxVm various

Veritas Volume Manager

Tidbits

by sebastin on Apr.27, 2009, under Veritas Volume Manager

rootdg is a mandatory requirement for the older versions of VxVM. It is optional in newer versions, since 4.1.

sliced type is used for the rootdg. Slices 3 and 4 are created to hold separate private and public region partitions, with all other slices (apart from slice 2) zeroed out.
simple uses slice 3 to hold both the public and private regions; all other slices (apart from slice 2) are zeroed out.
cdsdisk (Cross-platform Data Sharing format) is for cross-platform data migration. Slice 7 holds the private and public regions and all other slices are zeroed out. This format type is not suitable for a root disk.
none type is unformatted. This cannot be set as a valid format.

When you move data from older versions to a newer one, and you have ivc-snap or metromirror technology replicating data on a regular basis, upgrading Veritas from 3.5 to 5.0 may cause problems when trying to keep compatibility with disk layouts.

This can possibly be fixed by passing the -T version_number option to the vxdg init command.

If you want to force -T 90 on VxVM 5.0MP1, one of the following disk initialisations might be required to force the simple format:

vxdisksetup -i disk_name format=sliced|simple|cdsdisk
vxdisk -f init disk_name type=auto format=sliced|simple|cdsdisk

vxdg -T 90 init new_dg_name disk_name

The private region length is 32MB. The maximum possible size is 524288 blocks (version 5.0).
The public region length is the size of the disk minus the private area of the disk.

VxVM 5.0+ - How to import a disk group of cloned disks on the same host as the originals

by sebastin on Feb.24, 2009, under Veritas Volume Manager

Description

Cloned disks are created typically by hardware specific copying, such as
the Hitachi ‘Shadow Image’ feature, or EMC’s TimeFinder, or any of the
so-called “in-the-box data replication” features of some SUN arrays.

Prior to the 5.0 version of Veritas Volume Manager (VxVM), those
replicated (cloned) disks were mapped to a secondary host and the
diskgroup using those disks had to be imported there. This restriction
was because the disk’s VxVM private region contained identical information
as the original disk, causing VxVM confusion as to which was the original
and which was the replicated disk.

However, beginning with VxVM 5.0, additional features have been added
to allow VxVM to easily identify which disk(s) are originals and which
are clones, as well as commands to easily import a diskgroup made up of
cloned disks, therefore allowing these operations to take place even on
the same host as the original diskgroup.

This tutorial shows a simplified example of importing a diskgroup made up
of cloned disks on the same host as the original diskgroup.

Steps to Follow

We start with a simple 2-disk diskgroup called ‘ckdg’.

# vxdisk list
DEVICE TYPE DISK GROUP STATUS
c0t0d0s2 auto:sliced rootdisk rootdg online
c1t41d2s2 auto:sliced ckdg01 ckdg online
c1t41d3s2 auto:sliced ckdg02 ckdg online

We use some sort of cloning operation to create two new copies of these
two disks after deporting the diskgroup (to ensure consistency), and then
make those disks available to this host. Once those two new disks are
seen by solaris,

# vxdctl enable

will bring them into the vxvm configuration. We can now see them using:

# vxdisk list
DEVICE TYPE DISK GROUP STATUS
c0t0d0s2 auto:sliced rootdisk rootdg online
c1t41d2s2 auto:sliced - - online
c1t41d3s2 auto:sliced - - online
c1t41d4s2 auto:sliced - - online udid_mismatch
c1t41d5s2 auto:sliced - - online udid_mismatch

The ‘udid_mismatch’ indicates that vxvm knows this disk is a copy of some
other disk. The ‘udid_mismatch’ flag is further documented in the
‘vxdisk(1M)’ manpage.
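
With a long disk list it helps to pull out just the devices carrying that flag. A minimal sketch, assuming the flag appears as the literal text udid_mismatch in the STATUS column as shown above; the function name clone_candidates is made up for this example.

```shell
# Read `vxdisk list` (or `vxdisk -o alldgs list`) output on stdin and print
# the DEVICE name of every disk flagged udid_mismatch.
clone_candidates() {
    awk '/udid_mismatch/ { print $1 }'
}

# Intended use on a VxVM host:
#   vxdisk list | clone_candidates
```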

So here, you have a choice as to how to proceed. You must remember that
VxVM will never allow you to have two diskgroups with the same name
imported at the same time. So, if you want both of these diskgroups (the
original and the cloned) to be imported simultaneously, you will have to
change the name of one of them.

For this example, we will leave the original diskgroup named ‘ckdg’,
and change the cloned diskgroup to ‘newckdg’.

First, import the original diskgroup:

# vxdg import ckdg

# vxdisk list
DEVICE TYPE DISK GROUP STATUS
c0t0d0s2 auto:sliced rootdisk rootdg online
c1t41d2s2 auto:sliced ckdg01 ckdg online
c1t41d3s2 auto:sliced ckdg02 ckdg online
c1t41d4s2 auto:sliced - - online udid_mismatch
c1t41d5s2 auto:sliced - - online udid_mismatch

The command to import the diskgroup using the cloned disks while renaming the
diskgroup is:

# vxdg -n newckdg -o useclonedev=on -o updateid import ckdg

The ‘useclonedev’ flag instructs vxvm to use ONLY cloned disks, not the
originals. You must use the ‘updateid’ flag as well, because we need to
validate these disks, and make sure that they are no longer mismatched
with regards to their UDID. This will update the UDID stored in the
disk’s vxvm private region to the actual disk’s UDID. Finally, the ‘-n’
flag specifies the new diskgroup name.

What we’re left with now is BOTH diskgroups imported simultaneously:

# vxdisk list
DEVICE TYPE DISK GROUP STATUS
c0t0d0s2 auto:sliced rootdisk rootdg online
c1t41d2s2 auto:sliced ckdg01 ckdg online
c1t41d3s2 auto:sliced ckdg02 ckdg online
c1t41d4s2 auto:sliced ckdg01 newckdg online clone_disk
c1t41d5s2 auto:sliced ckdg02 newckdg online clone_disk

Whether or not you choose to leave that ‘clone_disk’ flag turned on is up
to you. Since at this point, the ‘newckdg’ diskgroup is a full-fledged
diskgroup, there’s really no need to leave those flags on, so the commands

# vxdisk set c1t41d4s2 clone=off
# vxdisk set c1t41d5s2 clone=off

will turn off that flag, leaving us with:

# vxdisk list
DEVICE TYPE DISK GROUP STATUS
c0t0d0s2 auto:sliced rootdisk rootdg online
c1t41d2s2 auto:sliced ckdg01 ckdg online
c1t41d3s2 auto:sliced ckdg02 ckdg online
c1t41d4s2 auto:sliced ckdg01 newckdg online
c1t41d5s2 auto:sliced ckdg02 newckdg online

How to get VxVM to recognize that a hardware RAID LUN has been grown

by sebastin on Feb.24, 2009, under Veritas Volume Manager

For VxVM 4.0 and above

Beginning with VxVM 4.0, the vxdisk(1M) command has a new option
( resize ) that is provided to support dynamic LUN expansion
(DLE). This command option is available only if a Storage Foundation
license has been installed; the normal VxVM license key is not enough to
unlock this new functionality.

DLE should only be performed on arrays that preserve data. VxVM makes no
attempt to verify the validity of pre-existing data on the LUN, so the
user must validate with the array’s vendor whether or not the array
preserves the data already on the LUN when the LUN is grown.

This ‘vxdisk resize’ command updates the VTOC of the disk automatically.
The user does NOT need to run the ‘format’ utility to change the length of
partition 2 of the disk. In fact, the user doesn’t have to run ‘format’
at all!

Also, DLE can be done on-the-fly, even when there are VxVM volumes on
that disk/LUN, and while these volumes are mounted. There is a
requirement that there be at least one other disk in the same diskgroup,
because during the resize operation, the disk is temporarily/quietly
removed from the disk group, and it is not possible to remove the last
disk from a disk group.

Here is an example procedure illustrating the usage and results of this
command:

We will start with disk c1t1d0s2, which is currently 20965120 sectors
(approx 10GB) in size, and the volume (vol01) on that disk is the entire
size of the disk:

# vxdisk list
DEVICE TYPE DISK GROUP STATUS
c0t0d0s2 auto:none - - online invalid
c0t1d0s2 auto:sliced - - online
c1t0d0s2 auto:sliced - - online
c1t1d0s2 auto:cdsdisk disk01 newdg online
c1t2d0s2 auto:cdsdisk disk02 newdg online
c1t3d0s2 auto:cdsdisk disk03 newdg online

# vxprint -ht
dm disk01 c1t1d0s2 auto 2048 20965120 -
dm disk02 c1t2d0s2 auto 2048 428530240 -
dm disk03 c1t3d0s2 auto 2048 428530240 -

v vol01 - ENABLED ACTIVE 20965120 SELECT - fsgen
pl vol01-01 vol01 ENABLED ACTIVE 20965120 CONCAT - RW
sd disk01-01 vol01-01 disk01 0 20965120 0 c1t1d0 ENA

# prtvtoc /dev/rdsk/c1t1d0s2
* Partition Tag Flags Sector Count Sector Mount Directory
2 5 01 0 20967424 20967423
7 15 01 0 20967424 20967423

The first step is to actually grow the LUN on the array according to the
documentation for your array. For this example, let’s assume the LUN
was grown to approximately 20GB (i.e., doubled in size).
Nothing needs to be done on the host before performing this step.
The disk can remain in the diskgroup, and the volumes can remain mounted.

After the LUN has been grown on the array, nothing on the host will appear
different; the ‘format’ command and the ‘prtvtoc’ command will both show
the old (10GB) size, as will the ‘vxprint -ht’ command.

To get Solaris and VxVM to recognize the new disk size, we simply have
to run the command

# vxdisk resize c1t1d0s2

This command queries the LUN to determine the new size, and then updates
the disk’s VTOC as well as the data structures in the VxVM private region
on the disk to reflect the new size. There is no need to run any other
command. This command typically takes less than a few seconds to complete,
since there is no data to move.

At this point, we can rerun the commands we ran before and see the
differences (the ‘vxdisk list’ output will remain the same):

# vxprint -ht
dm disk01 c1t1d0s2 auto 2048 41936640 -
dm disk02 c1t2d0s2 auto 2048 428530240 -
dm disk03 c1t3d0s2 auto 2048 428530240 -

v vol01 - ENABLED ACTIVE 20965120 SELECT - fsgen
pl vol01-01 vol01 ENABLED ACTIVE 20965120 CONCAT - RW
sd disk01-01 vol01-01 disk01 0 20965120 0 c1t1d0 ENA

# prtvtoc /dev/rdsk/c1t1d0s2
* First Sector Last
* Partition Tag Flags Sector Count Sector Mount Directory
2 5 01 0 41938944 41938943
7 15 01 0 41938944 41938943

We can see in these outputs that the VTOC of the disk is now showing its
new (20GB) size (41938944 sectors) and the ‘vxprint -ht’ is now showing
the disk ‘disk01’ with a larger public region (41936640 sectors). Of
course, the volume ‘vol01’ has NOT been grown - that part is left to the
administrator to use the ‘vxresize’ command, if that is desired.
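
The sector counts in these listings are easier to sanity-check once converted. A quick sketch, assuming 512-byte sectors (so 2048 sectors per MB); the function name sectors_to_mb is made up for this example and the result is rounded down.

```shell
# Convert a length in 512-byte sectors to whole megabytes (rounded down).
sectors_to_mb() {
    echo $(( $1 / 2048 ))
}

sectors_to_mb 20965120   # disk01 public region before the resize
sectors_to_mb 41936640   # disk01 public region after the resize
```

The two values come out at roughly 10GB and 20GB, which matches the sizes quoted in the walkthrough.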

Move a single volume to another diskgroup

by sebastin on Jan.27, 2009, under Veritas Volume Manager

1) Add a new mirror (a third plex) to the volume on a new free disk

# vxprint -ht app_vol
Disk group: rootdg

v app_vol - ENABLED ACTIVE 8388608 SELECT - fsgen
pl app_vol-01 app_vol ENABLED ACTIVE 8395200 CONCAT - RW
sd rootdisk-04 app_vol-01 rootdisk 41955647 8395200 0 c0t0d0 ENA
pl app_vol-02 app_vol ENABLED ACTIVE 8395200 CONCAT - RW
sd mirrdisk-04 app_vol-02 mirrdisk 41955648 8395200 0 c2t0d0 ENA

# vxdisksetup -if c3t1d0
# vxdg -g rootdg adddisk c3t1d0
# vxassist mirror app_vol alloc=c3t1d0

2) Once the mirror has finished synchronising, dump the volume configuration to a file and edit it:
# vxprint -hmQqrL -g rootdg app_vol > /tmp/kasper
# vi /tmp/kasper
In the file, remove all references to plexes app_vol-01 and app_vol-02, keeping only the volume app_vol and the plex app_vol-03. Then rename app_vol to app_voltransfer and app_vol-03 to app_voltransfer-03.

3) Dissociate and remove the new plex app_vol-03:
# vxplex dis app_vol-03
# vxedit -rf rm app_vol-03

4) Create new framework for the volume (dg, disk …)
# vxdg -g rootdg rmdisk c3t1d0
# vxdg init appvoldg c3t1d0

5) create new volume on new dg
# vxmake -g appvoldg -d /tmp/kasper
# vxvol start app_voltransfer

# vxprint -g appvoldg -ht

dg appvoldg default default 107000 1170153481.1251.omis379

dm c3t1d0 c3t1d0s2 sliced 9919 143328960 -

v app_voltransfer - ENABLED ACTIVE 8388608 SELECT - fsgen
pl app_voltransfer-03 app_voltransfer ENABLED ACTIVE 8395200 CONCAT - RW
sd c3t1d0-01 app_voltransfer-03 c3t1d0 0 8395200 0 c3t1d0 ENA

6) mount new fs:
mount -F vxfs /dev/vx/dsk/appvoldg/app_voltransfer /apptransfer

7) check old and new fs:
# df -k |grep app
/dev/vx/dsk/rootdg/app_vol 4194304 2806182 1301407 69% /app
/dev/vx/dsk/appvoldg/app_voltransfer 4194304 2806182 1301407 69% /apptransfer

Running ‘vxdisk -g updateudid’ on an imported disk

by sebastin on Jan.27, 2009, under Solaris, Veritas Volume Manager

Document ID: 293237

Running ‘vxdisk -g updateudid’ on an imported disk group disk rendered the disk group unimportable.
Details:
The workaround is not to run ‘vxdisk -g updateudid’ on a disk that is part of an imported disk group. Deport the associated disk group first (if it is imported), and then clear the udid_mismatch flag. Please note that the “udid_mismatch” notation is merely a status flag and not an indication of a problem condition.

Do not run the vxdisk command with the usage:

“vxdisk -g dg_name updateudid dm_name”
The “dm_name” is the name of the disk as it was named within the disk group and listed by the “vxdisk list” command under the DISK column.

Instead, run following command after deporting the disk group to clear the udid_mismatch flag on a disk.

“vxdisk updateudid da_name”
The “da_name” is the name of the disk as listed by the “vxdisk list” command under the DEVICE column.

Importing EMC BCV devices:

The following procedure can be used to import a cloned disk (BCV device) from an EMC Symmetrix array.

To import an EMC BCV device

1. Verify that the cloned disk, EMC0_27, is in the “error udid_mismatch” state:

# vxdisk -o alldgs list
DEVICE TYPE DISK GROUP STATUS
EMC0_1 auto:cdsdisk EMC0_1 mydg online
EMC0_27 auto - - error udid_mismatch

In this example, the device EMC0_27 is a clone of EMC0_1.

2. Split the BCV device that corresponds to EMC0_27 from the disk group mydg:

# /usr/symcli/bin/symmir -g mydg split DEV001

3. Update the information that VxVM holds about the device:

# vxdisk scandisks

4. Check that the cloned disk is now in the “online udid_mismatch” state:

# vxdisk -o alldgs list
DEVICE TYPE DISK GROUP STATUS
EMC0_1 auto:cdsdisk EMC0_1 mydg online
EMC0_27 auto:cdsdisk - - online udid_mismatch

5. Import the cloned disk into the new disk group newdg, and update the disk’s UDID:

# vxdg -n newdg -o useclonedev=on -o updateid import mydg

6. Check that the state of the cloned disk is now shown as “online clone_disk”:

# vxdisk -o alldgs list
DEVICE TYPE DISK GROUP STATUS
EMC0_1 auto:cdsdisk EMC0_1 mydg online
EMC0_27 auto:cdsdisk EMC0_1 newdg online clone_disk

How to recover from a serial split brain

by sebastin on Jan.27, 2009, under Solaris, Veritas Volume Manager

Document ID: 269233

Exact Error Message
VxVM vxdg ERROR V-5-1-10127 associating disk-media with :
Serial Split Brain detected. Run vxsplitlines

Details:
Background:
The Serial Split Brain condition arises because VERITAS Volume Manager ™ increments the serial ID in the disk media record of each imported disk in all the disk group configurations on those disks. A new serial (SSB) ID has been included as part of the new disk group version=110 in Volume Manager 4 to assist with recovery of the disk group from this condition. The value that is stored in the configuration database represents the serial ID that the disk group expects a disk to have. The serial ID that is stored in a disk’s private region is considered to be its actual value.
If some disks went missing from the disk group (due to physical disconnection or power failure) and those disks were imported by another host, the serial IDs for the disks in their copies of the configuration database, and also in each disk’s private region, are updated separately on that host. When the disks are subsequently reimported into the original shared disk group, the actual serial IDs on the disks do not agree with the expected values from the configuration copies on other disks in the disk group.
The disk group cannot be reimported because the databases do not agree on the actual and expected serial IDs. You must choose which configuration database to use. This is a true serial split brain condition, which Volume Manager cannot correct automatically. In this case, the disk group import fails, and the vxdg utility outputs error messages similar to the following before exiting:
VxVM vxconfigd NOTICE V-5-0-33 Split Brain. da id is 0.1, while dm id is 0.0 for DM
VxVM vxdg ERROR V-5-1-587 Disk group : import failed: Serial Split Brain detected. Run vxsplitlines
The import does not succeed even if you specify the -f flag to vxdg.
Although it is usually possible to resolve this conflict by choosing the version of the configuration database with the highest configuration ID (shown as config_tid in the output from vxprivutil dumpconfig), this may not be the correct thing to do in all circumstances.
To resolve conflicting configuration information, you must decide which disk contains the correct version of the disk group configuration database. To assist you in doing this, you can run the vxsplitlines command to show the actual serial ID on each disk in the disk group and the serial ID that was expected from the configuration database. For each disk, the command also shows the vxdg command that you must run to select the configuration database copy on that disk as being the definitive copy to use for importing the disk group.
The following example shows the result of a JBOD losing access to one of the four disks in the disk group:
# vxdisk -o alldgs list
DEVICE TYPE DISK GROUP STATUS
c2t1d0s2 auto:cdsdisk - (dgD280silo1) online
c2t2d0s2 auto:cdsdisk d2 dgD280silo1 online
c2t3d0s2 auto:cdsdisk d3 dgD280silo1 online
c2t9d0s2 auto:cdsdisk d4 dgD280silo1 online
- - d1 dgD280silo1 failed was:c2t1d0s2

# vxreattach -c c2t1d0s2
dgD280silo1 d1

# vxreattach -br c2t1d0s2
VxVM vxdg ERROR V-5-1-10127 associating disk-media d1 with c2t1d0s2:
Serial Split Brain detected. Run vxsplitlines

# vxsplitlines -g dgD280silo1

VxVM vxsplitlines NOTICE V-5-2-2708 There are 1 pools.
The Following are the disks in each pool. Each disk in the same pool
has config copies that are similar.
VxVM vxsplitlines INFO V-5-2-2707 Pool 0.
c2t1d0s2 d1

To see the configuration copy from this disk issue /etc/vx/diag.d/vxprivutil dumpconfig /dev/vx/dmp/c2t1d0s2
To import the diskgroup with config copy from this disk use the following command;

/usr/sbin/vxdg -o selectcp=1092974296.21.gopal import dgD280silo1

The following are the disks whose serial split brain (SSB) IDs don’t match in this configuration copy:
d2

Before running the recommended command, build confidence in the choice by generating the following outputs:
In this example, the disk group split so that one disk (d1) appears to be on one side of the split. You can specify the -c option to vxsplitlines to print detailed information about each of the disk IDs from the configuration copy on a disk specified by its disk access name:

# vxsplitlines -g dgD280silo1 -c c2t3d0s2

VxVM vxsplitlines INFO V-5-2-2701 DANAME(DMNAME) || Actual SSB || Expected SSB
VxVM vxsplitlines INFO V-5-2-2700 c2t1d0s2( d1 ) || 0.0 || 0.0 ssb ids match
VxVM vxsplitlines INFO V-5-2-2700 c2t2d0s2( d2 ) || 0.1 || 0.0 ssb ids don’t match
VxVM vxsplitlines INFO V-5-2-2700 c2t3d0s2( d3 ) || 0.1 || 0.0 ssb ids don’t match
VxVM vxsplitlines INFO V-5-2-2700 c2t9d0s2( d4 ) || 0.1 || 0.0 ssb ids don’t match
VxVM vxsplitlines INFO V-5-2-2706
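
With more than a handful of disks, the mismatches are easier to spot with a small filter over this output. The following is a minimal sketch, assuming the V-5-2-2700 line layout shown above; the function name ssb_mismatches is made up for illustration and is not part of VxVM. On Solaris, substitute nawk for awk, because the old /usr/bin/awk lacks sub().

```shell
# Read `vxsplitlines -g <dg> -c <daname>` output on stdin and print every
# disk whose actual and expected SSB IDs differ.
# Assumed line layout (taken from the example output above):
#   VxVM vxsplitlines INFO V-5-2-2700 c2t2d0s2( d2 ) || 0.1 || 0.0 ...
ssb_mismatches() {
    awk '$4 == "V-5-2-2700" {
        # Locate the first "||" separator; the actual ID follows it and
        # the expected ID follows the second separator.
        for (i = 5; i <= NF; i++) if ($i == "||") break
        if (i + 3 > NF) next
        da = $5
        sub(/\(.*/, "", da)              # "c2t2d0s2(" -> "c2t2d0s2"
        # Append "" to force a string comparison, so 0.10 and 0.1 differ.
        if (($(i + 1) "") != ($(i + 3) ""))
            print da, "actual=" $(i + 1), "expected=" $(i + 3)
    }'
}
```

On a live system this would be used as: vxsplitlines -g dgD280silo1 -c c2t3d0s2 | ssb_mismatches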

This output can be verified by using vxdisk list on each disk. A summary is shown below:

# vxdisk list c2t1d0s2
Device: c2t1d0s2
disk: name= id=1092974296.21.gopal
group: name=dgD280silo1 id=1095738111.20.gopal
ssb: actual_seqno=0.0

# vxdisk list c2t2d0s2
Device: c2t2d0s2
disk: name=d2 id=1092974302.22.gopal
group: name=dgD280silo1 id=1095738111.20.gopal
ssb: actual_seqno=0.1

# vxdisk list c2t3d0s2
Device: c2t3d0s2
disk: name=d3 id=1092974311.23.gopal
group: name=dgD280silo1 id=1095738111.20.gopal
ssb: actual_seqno=0.1

# vxdisk list c2t9d0s2
Device: c2t9d0s2
disk: name=d4 id=1092974318.24.gopal
group: name=dgD280silo1 id=1095738111.20.gopal
ssb: actual_seqno=0.1

Note that even if some disks’ SSB IDs match, that does not necessarily mean that those disks’ configuration copies contain all the changes. Judged from some other configuration copy, the same disks’ SSB IDs might not match. To see the configuration from this disk, run:
/etc/vx/diag.d/vxprivutil dumpconfig /dev/rdsk/c2t3d0s2 > dumpconfig_c2t3d0s2

If the other disks in the disk group were not imported on another host, Volume Manager resolves the conflicting values of the serial IDs by using the version of the configuration database from the disk with the greatest value for the updated ID (shown as update_tid in the output from /etc/vx/diag.d/vxprivutil dumpconfig /dev/rdsk/).

In this example, looking through the dumpconfig output, there are the following update_tid and ssbid values:

                dumpconfig c2t3d0s2     dumpconfig c2t9d0s2
                config:tid=0.1058       config:tid=0.1059
dm d1           update_tid=0.1038       update_tid=0.1059
                ssbid=0.0               ssbid=0.0
dm d2           update_tid=0.1038       update_tid=0.1038
                ssbid=0.0               ssbid=0.0
dm d3           update_tid=0.1053       update_tid=0.1053
                ssbid=0.0               ssbid=0.0
dm d4           update_tid=0.1053       update_tid=0.1059
                ssbid=0.0               ssbid=0.1
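
Pulling these values out of a saved dump by hand gets tedious when there are many disks. Here is a rough sketch of a filter that does it, assuming each disk media record in the dump starts with a line such as "dm d1" and is followed by attribute lines containing update_tid= and ssbid=. That layout is an assumption based on the summary above, and the function name dm_tids is made up for illustration. On Solaris, use nawk in place of awk.

```shell
# Read a saved `vxprivutil dumpconfig` file on stdin and print one line
# per disk media (dm) record with its update_tid and ssbid.
# Assumed layout: a record starts with "<type> <name>" (no "=" on the
# line), followed by attribute lines of the form "key=value".
dm_tids() {
    awk '
        function emit() {
            if (name != "") print name, "update_tid=" tid, "ssbid=" ssb
        }
        !/=/ && NF == 2 {
            # A new record begins: report the previous dm record, if any.
            emit()
            name = ($1 == "dm") ? $2 : ""
            tid = ssb = ""
            next
        }
        name == "" { next }              # ignore attributes of non-dm records
        {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^update_tid=/) tid = substr($i, 12)
                if ($i ~ /^ssbid=/)      ssb = substr($i, 7)
            }
        }
        END { emit() }
    '
}
```

For example, dm_tids < dumpconfig_c2t3d0s2 gives one line per disk, and running it against the dump from each disk makes differences such as the d4 values above easy to compare.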

Using the dumpconfig output saved for each disk, decide which configuration copy to use. Each saved copy can be displayed in vxprint format by running a command such as:

# cat dumpconfig_c2t3d0s2 | vxprint -D - -ht

Before deciding which option to use for the import, ensure that the disk group is currently in a valid deported state:

# vxdisk -o alldgs list
DEVICE TYPE DISK GROUP STATUS
c2t1d0s2 auto:cdsdisk - (dgD280silo1) online
c2t2d0s2 auto:cdsdisk - (dgD280silo1) online
c2t3d0s2 auto:cdsdisk - (dgD280silo1) online
c2t9d0s2 auto:cdsdisk - (dgD280silo1) online

At this stage, your knowledge of how the serial split brain condition came about may be a little clearer and you should have chosen a configuration from one disk to be used to import the disk group. In this example, the following command imports the disk group using the configuration copy from d2:
# /usr/sbin/vxdg -o selectcp=1092974302.22.gopal import dgD280silo1
Once the disk group has been imported, Volume Manager resets the serial IDs to 0 for the imported disks. The actual and expected serial IDs for any disks in the disk group that are not imported at this time remain unchanged.
# vxprint -htg dgD280silo1
dg dgD280silo1 default default 26000 1095738111.20.gopal
dm d1 c2t1d0s2 auto 2048 35838448 -
dm d2 c2t2d0s2 auto 2048 35838448 -
dm d3 c2t3d0s2 auto 2048 35838448 -
dm d4 c2t9d0s2 auto 2048 35838448 -

v SNAP-vol_db2silo1.1 - DISABLED ACTIVE 1024000 SELECT - fsgen
pl SNAP-vol_db2silo1.1-01 SNAP-vol_db2silo1.1 DISABLED ACTIVE 1024000 STRIPE 2/1024 RW
sd d3-01 SNAP-vol_db2silo1.1-01 d3 0 512000 0/0 c2t3d0 ENA
sd d4-01 SNAP-vol_db2silo1.1-01 d4 0 512000 1/0 c2t9d0 ENA
dc SNAP-vol_db2silo1.1_dco SNAP-vol_db2silo1.1 SNAP-vol_db2silo1.1_dcl
v SNAP-vol_db2silo1.1_dcl - DISABLED ACTIVE 544 SELECT - gen
pl SNAP-vol_db2silo1.1_dcl-01 SNAP-vol_db2silo1.1_dcl DISABLED ACTIVE 544 CONCAT - RW
sd d3-02 SNAP-vol_db2silo1.1_dcl-01 d3 512000 544 0 c2t3d0 ENA

v orgvol - DISABLED ACTIVE 1024000 SELECT - fsgen
pl orgvol-01 orgvol DISABLED ACTIVE 1024000 STRIPE 2/128 RW
sd d1-01 orgvol-01 d1 0 512000 0/0 c2t1d0 ENA
sd d2-01 orgvol-01 d2 0 512000 1/0 c2t2d0 ENA

# vxrecover -g dgD280silo1 -sb

# mount -F vxfs /dev/vx/dsk/dgD280silo1/orgvol /orgvol

UX:vxfs mount: ERROR: V-3-21268: /dev/vx/dsk/dgD280silo1/orgvol is corrupted. needs checking

# fsck -F vxfs /dev/vx/rdsk/dgD280silo1/orgvol
log replay in progress
replay complete - marking super-block as CLEAN

# mount -F vxfs /dev/vx/dsk/dgD280silo1/orgvol /orgvol

# df /orgvol

/orgvol (/dev/vx/dsk/dgD280silo1/orgvol): 1019102 blocks 127386 files

# vxdisk -o alldgs list

DEVICE TYPE DISK GROUP STATUS
c2t1d0s2 auto:cdsdisk d1 dgD280silo1 online
c2t2d0s2 auto:cdsdisk d2 dgD280silo1 online
c2t3d0s2 auto:cdsdisk d3 dgD280silo1 online
c2t9d0s2 auto:cdsdisk d4 dgD280silo1 online

# vxprint -htg dgD280silo1

dg dgD280silo1 default default 26000 1095738111.20.gopal

dm d1 c2t1d0s2 auto 2048 35838448 -
dm d2 c2t2d0s2 auto 2048 35838448 -
dm d3 c2t3d0s2 auto 2048 35838448 -
dm d4 c2t9d0s2 auto 2048 35838448 -

v SNAP-vol_db2silo1.1 - ENABLED ACTIVE 1024000 SELECT SNAP-vol_db2silo1.1-01 fsgen
pl SNAP-vol_db2silo1.1-01 SNAP-vol_db2silo1.1 ENABLED ACTIVE 1024000 STRIPE 2/1024 RW
sd d3-01 SNAP-vol_db2silo1.1-01 d3 0 512000 0/0 c2t3d0 ENA
sd d4-01 SNAP-vol_db2silo1.1-01 d4 0 512000 1/0 c2t9d0 ENA
dc SNAP-vol_db2silo1.1_dco SNAP-vol_db2silo1.1 SNAP-vol_db2silo1.1_dcl
v SNAP-vol_db2silo1.1_dcl - ENABLED ACTIVE 544 SELECT - gen
pl SNAP-vol_db2silo1.1_dcl-01 SNAP-vol_db2silo1.1_dcl ENABLED ACTIVE 544 CONCAT - RW
sd d3-02 SNAP-vol_db2silo1.1_dcl-01 d3 512000 544 0 c2t3d0 ENA

v orgvol - ENABLED ACTIVE 1024000 SELECT orgvol-01 fsgen
pl orgvol-01 orgvol ENABLED ACTIVE 1024000 STRIPE 2/128 RW
sd d1-01 orgvol-01 d1 0 512000 0/0 c2t1d0 ENA
sd d2-01 orgvol-01 d2 0 512000 1/0 c2t2d0 ENA


VxVM References

by sebastin on Jan.19, 2009, under Solaris, Veritas Volume Manager

VxVM References


adding luns to JNIC 5.3.x non-gui method

by sebastin on Sep.28, 2008, under Solaris, Veritas Volume Manager

In a cluster configuration, find the master node using
#vxdctl -c mode
Run the following commands on the master node.
#vxdmpadm listctlr all ## list the controllers and their status; cross-check with the /kernel/drv/jnic146x.conf entries.
CTLR-NAME ENCLR-TYPE STATE ENCLR-NAME
====================================================
c1 Disk ENABLED Disk
c2 SAN_VC ENABLED SAN_VC0
c3 SAN_VC ENABLED SAN_VC0

Disable the first path, c2:
#vxdmpadm disable ctlr=c2
Confirm the controller status.

The jnic146x_update_drv command under /opt/JNIC146X/ may be available only with the 5.3.x drivers.
/opt/JNIC146X/jnic146x_busy shows you the driver status.
#/opt/JNIC146X/jnic146x_update_drv -u -a ## -u updates the driver and performs LUN rediscovery; -a applies it to all instances.
Check your messages file for updates, then enable the controller:
#vxdmpadm enable ctlr=c2 ## confirm the controller status and do the same for the other controllers.
#devfsadm
#format ## to label all the new LUNs
#vxdctl enable
#vxdisk list ## shows the new LUNs with error status
Label the LUNs in Veritas:
#vxdisksetup -i cNtNdNs2 ## for all the new LUNs
Add the LUNs to the disk group:
#vxdg -g disk_group adddisk racdgsvc32=c2t1d32s2
Repeat for all LUNs.
Resize the requested volumes:
#vxresize -g disk_group vol_name +20g ## add 20 GB to vol_name
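
The per-controller part of the procedure (disable, rediscover, enable) can be wrapped in a small helper so that it is repeated identically for c2 and c3. This is only a sketch: the function names run and rescan_ctlr and the DRY_RUN variable are made up for illustration, and by default the helper only prints the commands so they can be reviewed first.

```shell
# Print the command when DRY_RUN is 1 (the default); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Take one controller out of DMP, rediscover LUNs through the JNI driver,
# then put the controller back. Stops at the first failing step.
rescan_ctlr() {
    ctlr=$1
    run vxdmpadm disable ctlr="$ctlr" || return 1
    run /opt/JNIC146X/jnic146x_update_drv -u -a || return 1
    run vxdmpadm enable ctlr="$ctlr"
}
```

Running rescan_ctlr c2 prints the three commands for c2. Setting DRY_RUN=0 executes them instead, in which case it is still worth checking the messages file and the controller status between controllers, as described above.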


Off-Host Backup Processing with Veritas FlashSnap

by sebastin on Apr.29, 2008, under Solaris, Veritas Volume Manager

Borislav Stoichkov

Backup times and the resources associated with them are becoming more and more important in the evolving model of 24/7 application development and content management. Developers all over the world collaborate on the same projects and access the same resources that must be 100% available during the business hours of their respective time zones. This gives systems administrators very little room to completely satisfy their customers — the developers.

Source code and content repositories contain hundreds of projects and millions of small files that require considerable amounts of time and system resources to back up. Also, data protection is a top priority that presents system and backup engineers with the question of how to effectively ensure data protection and availability in case of a disaster and, at the same time, minimize the duration and resource overhead of the process.

The Problem

My organization was faced with these very issues. One of our high-profile customers was using Interwoven’s flagship product TeamSite installed on a Sun Solaris 8 server. Interwoven Teamsite’s key features are content management and services, code and media versioning, collaboration and parallel development, branching of projects, transactional content deployment, etc. Developers all over the world were using the system’s resources throughout the day for mission-critical tasks. As the number of projects and branches increased so did the number of files and the amount of data that needed to be backed up and protected.

Suddenly, the application was managing millions of small files and hundreds of branches and projects. Backup times were reaching 7-8 hours with the bottleneck caused by the sheer amount of files and data. Complaints were received that during the backup window the application as well as the system were becoming difficult to use and that system performance was becoming unacceptable. The customer requested a solution that would be as cheap as possible and would not require a change in their development and content management model.

From a storage perspective, the server had two internal mirrored drives for the operating system file systems under Veritas Volume Manager control. An external Fibre Channel array was attached to the machine presenting a single LUN on which the Interwoven Teamsite application was installed. The LUN had a 143-GB Veritas file system and was under Veritas Volume Manager control as well.

The idea for the solution was to take a snapshot of the application file system and use that snapshot for a backup to tape on another host. Thus, the backup window could be extended as much as needed without affecting the performance or usability of the server. File system snapshots, however, do not allow off-host processing. Given that Veritas Volume Manager was already installed and active on the machine, using its built-in volume snapshot features seemed natural. The only problems remaining were to break off the volume with the snapshot in a manner that was supported by the vendor and did not present any risks, and to minimize the time needed for the reverse operation — syncing the snapped off mirror without mirroring a 143-GB file system from scratch, which is a long and tedious process.

Implementing FlashSnap

The resolutions to both problems are found in the Veritas FlashSnap product. FlashSnap is a license key-enabled option of the Veritas Foundation Suite solutions. The license enables the use of the FastResync and Dynamic Split and Join features of Veritas Volume Manager. With FastResync enabled on a volume, Volume Manager uses a map to keep track of which blocks are updated in the volume and in the snapshot. In time the data on the original volume changes, and the data on the snapshot volume becomes outdated.

The presence of a FastResync map ensures that in an operation where the snapshot is resynchronized with the primary volume only the modified blocks (dirty blocks) are applied. Full mirror synchronization is no longer necessary. The map is persistent across reboots because it is stored in a data change object (DCO) log volume associated with the original volume. Dynamic Split and Join allow for the volume snapshots to be placed into a separate disk group, which can be deported and imported on another host for off-host processing. The only requirement is for the disks to be visible to the designated host. At a later stage, the disk group can be re-imported on the original host and joined with the original disk group or, if necessary, with a different one.

For the implementation, additional storage was required on the storage array equal to the original amount of 143 GB. The added storage was configured into a new LUN. A new low-end server running Sun Solaris 8 (host2) was attached to the array as well. The newly added LUN (LUN1) was presented to both hosts, while the original LUN (LUN0) was only made visible on the original host (host1).

DCO Logging

Persistent FastResync requires a DCO log to be associated with the original volume. That option has been available only since Veritas Volume Manager 3.2 and disk group version 90, so the volume management software was upgraded to the latest version. The existing disk group was upgraded to the latest version as well. The FlashSnap license obtained from Veritas was installed on both hosts. For verification that the newly added license is functional, the following command can be issued:

# vxdctl license
All features are available:
........
FastResync
DGSJ
A small problem arose from the fact that there was no room for a DCO log on LUN0 because all of its space was allocated for the application volume. Luckily, the file system on it was VxFS, so the volume and the file system could be shrunk together:

host1# vxresize -F vxfs -g DG1 Vol01 -20M
With that fixed, a DCO (data change object) log volume was associated with the original volume:

host1# vxprint -g DG1
.............
dm DG101 c4t9d0s2 - 286676992 - - - -

v Vol01 fsgen ENABLED 286636032 - ACTIVE - -
pl Vol01-01 Vol01 ENABLED 286656000 - ACTIVE - -
sd DG101-01 Vol01-01 ENABLED 286656000 0 - - -
host1# vxassist -g DG1 addlog Vol01 logtype=dco dcologlen=1056 ndcolog=1 DG101
host1# vxprint -g DG1
..............
dm DG101 c4t9d0s2 - 286676992 - - - -

v Vol01 fsgen ENABLED 286636032 - ACTIVE - -
pl Vol01-01 Vol01 ENABLED 286656000 - ACTIVE - -
sd DG101-01 Vol01-01 ENABLED 286656000 0 - - -
dc Vol01_dco Vol01 - - - - - -
v Vol01_dcl gen ENABLED 1056 - ACTIVE - -
pl Vol01_dcl-01 Vol01_dcl ENABLED 1056 - ACTIVE - -
sd DG101-02 Vol01_dcl-01 ENABLED 1056 0 - - -
The length of the DCO log determines the level at which changes are tracked. A longer DCO log will trigger more in-depth tracking and will require less time for the snapshot to resynchronize. Increasing the log too much may cause performance overhead on the system. The default number of plexes in the mirrored DCO log volume is 2. It is recommended that the number of DCO log plexes configured equal the number of data plexes in the volume — in our case, one. The default size for a DCO plex is 133 blocks. A different number can be specified, but it must be a number from 33 up to 2112 blocks in multiples of 33. If the snapshot volumes are to be moved to a different disk group, the administrator must ensure that the disks containing the DCO plexes can accompany them.
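
The size rule for an explicitly specified dcologlen (33 to 2112 blocks, in multiples of 33) is easy to check in a script before calling vxassist. A minimal sketch for bash or ksh follows; the function name valid_dcologlen is made up for illustration. Note that the 133-block default is not itself a multiple of 33, so the check applies only to values passed explicitly.

```shell
# Return success if the argument is an acceptable explicit dcologlen:
# a whole number of blocks from 33 to 2112 that is a multiple of 33.
valid_dcologlen() {
    len=$1
    case $len in
        ''|*[!0-9]*|0*) return 1 ;;   # empty, non-numeric, or leading zero
    esac
    [ "$len" -ge 33 ] && [ "$len" -le 2112 ] && [ $((len % 33)) -eq 0 ]
}
```

The value used above passes, since 1056 = 32 x 33, whereas valid_dcologlen 1000 fails.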

Establishing a Snapshot Mirror

The next step is to enable persistent FastResync on the volume, so that subsequent re-mirroring operations take considerably less time than establishing a full mirror and are applied from the DCO log:

host1# vxvol -g DG1 set fastresync=on Vol01
host1# vxprint -g DG1 -m Vol01 | grep fastresync
fastresync=on
The addition of LUN1 to the DG1 disk group as disk DG102 completes our preparation phase, so now we are ready to establish our snapshot:

host1# vxassist -g DG1 snapstart Vol01 alloc=DG102
This operation will establish a mirror of volume Vol01 and will add an additional DCO log object that will be in a DISABLED and DCOSNP state for use by the snapshot. The snapstart process takes a considerable amount of time, because it is a full mirror creation. The vxassist command will block until the snapshot mirror is complete. It can be placed in the background by using the -b argument to vxassist. During the snapstart phase, disk group DG1 will look like this:

v Vol01 fsgen ENABLED 286636032 - ACTIVE ATT1 -
pl Vol01-01 Vol01 ENABLED 286656000 - ACTIVE - -
sd DG101-01 Vol01-01 ENABLED 286656000 0 - - -
pl Vol01-02 Vol01 ENABLED 286656000 - SNAPATT ATT -
sd DG102-01 Vol01-02 ENABLED 286656000 0 - - -
dc Vol01_dco Vol01 - - - - - -
v Vol01_dcl gen ENABLED 1056 - ACTIVE - -
pl Vol01_dcl-01 Vol01_dcl ENABLED 1056 - ACTIVE - -
sd DG101-02 Vol01_dcl-01 ENABLED 1056 0 - - -
pl Vol01_dcl-02 Vol01_dcl DISABLED 1056 - DCOSNP - -
sd DG102-02 Vol01_dcl-02 ENABLED 1056 0 - - -
Once the mirror is established, the plex on disk DG102 will be in a SNAPDONE state ready to be separated from the original volume. If the snapshot is attempted before the snapshot plex is in a SNAPDONE state, the command will fail. If snapstart is placed in the background with the -b switch, the vxassist snapwait command will wait until the snapstart command is done and can be used in scripts to ensure that no other commands are issued before the completion of snapstart:

v Vol01 fsgen ENABLED 286636032 - ACTIVE - -
pl Vol01-01 Vol01 ENABLED 286656000 - ACTIVE - -
sd DG101-01 Vol01-01 ENABLED 286656000 0 - - -
pl Vol01-02 Vol01 ENABLED 286656000 - SNAPDONE - -
sd DG102-01 Vol01-02 ENABLED 286656000 0 - - -
dc Vol01_dco Vol01 - - - - - -
v Vol01_dcl gen ENABLED 1056 - ACTIVE - -
pl Vol01_dcl-01 Vol01_dcl ENABLED 1056 - ACTIVE - -
sd DG101-02 Vol01_dcl-01 ENABLED 1056 0 - - -
pl Vol01_dcl-02 Vol01_dcl DISABLED 1056 - DCOSNP - -
sd DG102-02 Vol01_dcl-02 ENABLED 1056 0 - - -
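
In a script, the background snapstart and the snapwait described above can be paired as follows. This is a sketch only: the function names run and start_snapshot_mirror and the DRY_RUN variable are made up for illustration, and by default the commands are printed rather than executed.

```shell
# Print the command when DRY_RUN is 1 (the default); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Start the snapshot mirror in the background, then block until the
# snapshot plex is ready, so that later steps cannot run too early.
start_snapshot_mirror() {
    dg=$1
    vol=$2
    disk=$3
    run vxassist -b -g "$dg" snapstart "$vol" alloc="$disk" || return 1
    run vxassist -g "$dg" snapwait "$vol"
}
```

For the volume in this article the call would be start_snapshot_mirror DG1 Vol01 DG102, with DRY_RUN=0 to actually run the commands.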
To execute the actual snapshot:

host1# vxassist -g DG1 snapshot Vol01 SNAP-Vol01
host1# vxprint -g DG1

v SNAP-Vol01 fsgen ENABLED 286636032 - ACTIVE - -
pl Vol01-02 SNAP-Vol01 ENABLED 286656000 - ACTIVE - -
sd DG102-01 Vol01-02 ENABLED 286656000 0 - - -
dc SNAP-Vol01_dco SNAP-Vol01 - - - - - -
v SNAP-Vol01_dcl gen ENABLED 1056 - ACTIVE - -
pl Vol01_dcl-02 SNAP-Vol01_dcl ENABLED 1056 - ACTIVE - -
sd DG102-02 Vol01_dcl-02 ENABLED 1056 0 - - -
sp Vol01_snp SNAP-Vol01 - - - - - -

v Vol01 fsgen ENABLED 286636032 - ACTIVE - -
pl Vol01-01 Vol01 ENABLED 286656000 - ACTIVE - -
sd DG101-01 Vol01-01 ENABLED 286656000 0 - - -
dc Vol01_dco Vol01 - - - - - -
v Vol01_dcl gen ENABLED 1056 - ACTIVE - -
pl Vol01_dcl-01 Vol01_dcl ENABLED 1056 - ACTIVE - -
sd DG101-02 Vol01_dcl-01 ENABLED 1056 0 - - -
sp SNAP-Vol01_snp Vol01 - - - - - -
Now the disk group can be split so that the disk containing the snapshot volume is placed in a different group:

host1# vxdg split DG1 SNAPDG1 SNAP-Vol01
The new disk group SNAPDG1 containing SNAP-Vol01 and its DCO log volume can be deported and imported on the alternate host:

host1# vxdg deport SNAPDG1
host2# vxdg import SNAPDG1
Following the split, the snapshot volume is disabled. The following commands can be used to recover and start the volume:

host2# vxrecover -g SNAPDG1 -m SNAP-Vol01
host2# vxvol -g SNAPDG1 start SNAP-Vol01
A consistency check can be performed on the volume’s file system, and it can be mounted for backup processing or any other type of data manipulation:

host2# fsck -F vxfs /dev/vx/rdsk/SNAPDG1/SNAP-Vol01
host2# mount -F vxfs /dev/vx/dsk/SNAPDG1/SNAP-Vol01 /data
Before the backup window kicks in, or in case the snapshot needs to be refreshed, the file system can be unmounted and the volume deported and imported again on the original host:

host2# umount /data
host2# vxvol -g SNAPDG1 stop SNAP-Vol01
host2# vxdg deport SNAPDG1
host1# vxdg import SNAPDG1
Now the disk(s) in disk group SNAPDG1 can be joined into disk group DG1:

host1# vxdg join SNAPDG1 DG1
host1# vxrecover -g DG1 -m SNAP-Vol01
Once the snapshot volume is back into its original disk group, we can perform the snapback operation:

host1# vxassist -g DG1 snapback SNAP-Vol01
In some cases when there is data corruption on the original volume, the data on the snapshot volume can be used for the synchronization. This is achieved by using the resyncfromreplica argument to the vxassist -o option with snapback. A normal snapback does not take long to execute. If performed within hours of the first snapshot, the process may take less than a minute, depending on the amount of file system changes. In our environment, a snapback process that is executed approximately 24 hours after the previous one takes no longer than 12 minutes. Effectively, from the developers’ and system users’ point of view, we have decreased the time it takes to back up the application from hours to less than 15 minutes.

Automating the Process

The last challenge in this project was to automate the process so that it would occur transparently on a daily basis before the backup window with no manual intervention required, as well as ensure that anything that went wrong with the process would be caught in a timely manner and resolved before the actual backup to tape. Remote syslog logging and log file scraping had been implemented in the environment for a while, and this gave us the option to log all errors to a remote syslog server. The string used to log errors was submitted to the monitoring department and was added to the list of strings that triggered an alert with the control center. The alert automatically generated a trouble ticket and dispatched it to an administrator. The whole process needed to be synchronized on both servers.

After some debate, we chose a solution utilizing SSH with public key authentication. The password-less OpenSSH private and public keys were generated on host1, and the public key was imported into the authorized_keys file for root on host2. Root logins through SSH were allowed on host2, and logins via SSH using the public key generated on host1 were only allowed from that server. Another aspect to the solution was that the same shell script, vxfsnap, would be used on both sides with a switch instructing it to execute in local or remote mode.
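
One way to tie a key to a single source host is the from= option that OpenSSH accepts at the start of an authorized_keys entry. Whether that exact mechanism was used here is an assumption. The sketch below simply builds such an entry; the function name restricted_key_line is made up for illustration.

```shell
# Build an authorized_keys entry that only accepts the given public key
# when the connection comes from the named host.
#   $1 = host name or pattern allowed to connect
#   $2 = the public key line (for example, the contents of id_rsa.pub)
restricted_key_line() {
    host=$1
    pubkey=$2
    printf 'from="%s" %s\n' "$host" "$pubkey"
}
```

The resulting line would be appended to root's authorized_keys file on host2, with host1 as the first argument and the public key generated on host1 as the second.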

The vxfsnap Script

The vxfsnap script (see Listing 1) accepts the following arguments: original disk group, original volume name, name for the snapshot volume, name for the snapshot disk group, hostname/IP of the host processing the snapshot, and mount point for the snapshot volume. It has four modes of operation:

deport — Unmounts the file system and deports the snapshot disk group.
join — Imports the snapshot disk group and joins it into the target disk group executing a snapback.
snap — Performs the snapshot and deports the disk group.
import — Imports the snapshot disk group and mounts the file system.
Another optional switch can be used to freeze the Interwoven Teamsite application for the duration of snapback improving the consistency of the data used for the backup. This shell script was designed with re-usability in mind so that it can be implemented with little or no effort in a similar solution. It can be executed on one of the hosts, and it can control the process from a central location.

This command:

host1# vxfsnap -r -h host2 -G SNAPDG1 -V SNAP-Vol01 -m /data -e deport
would unmount the /data file system on host2 and deport the SNAPDG1 disk group. This can be followed by:

host1# vxfsnap -g DG1 -v Vol01 -G SNAPDG1 -V SNAP-Vol01 -e join -f
to import the SNAPDG1 disk group on host1 and perform every step up to, but not including, the next snapshot: the join and the snapback, with the Interwoven TeamSite backing store frozen beforehand and unfrozen after the snapback is complete. The snapshot itself is then taken with:

host1# vxfsnap -g DG1 -v Vol01 -G SNAPDG1 -V SNAP-Vol01 -e snap
Snap mode will take the snapshot. The separated volume will be split off into a new disk group that remains deported ready for other interested hosts to import. Finally, we can make use of the data on host2:

host1# vxfsnap -r -h host2 -G SNAPDG1 -V SNAP-Vol01 -m /data -e import
The vxfsnap script utility also can be used in other scripts that can be executed as cron jobs shortly before the backup window:

#!/bin/bash

DG=DG1
VOL=Vol01
SDG=SNAPDG1
SVOL=SNAP-Vol01
MNT="/backup"
RHOST="host2"

FLASHSNAP="/usr/local/vxfsnap/vxfsnap"

$FLASHSNAP -r -h $RHOST -G $SDG -V $SVOL -m $MNT -e deport && \
$FLASHSNAP -g $DG -v $VOL -G $SDG -V $SVOL -e join -f && \
$FLASHSNAP -g $DG -v $VOL -G $SDG -V $SVOL -e snap && \
$FLASHSNAP -r -h $RHOST -G $SDG -V $SVOL -m $MNT -e import
With the Veritas FlashSnap solution in place, the file system containing the Interwoven Teamsite application was added to the exclude list for the backup client software. Rebooting the server used for backup processing can potentially break the configuration, because the cron job requires that the file system be mounted on host2. This can be solved with a startup script that checks whether the designated disk group is recognized as local at boot time and mounts the volume under a specified mount point, or by adding the file system in the /etc/vfstab configuration file and expecting a failure if the disk group or volume are unavailable.

Conclusions

This solution achieved a little more than it was designed to. Effectively, the copy of the data used for backup to tape is available all the time in case a file, directory, or the whole volume becomes corrupted and needs to be restored. Recovering from any of the mentioned disasters is a simple process that takes minutes, requires no special backup infrastructure resources, and adds further to the value of the solution.

Veritas FlashSnap is a technology that can help both users and administrators in their quest to better utilize the resources of a system. It can be used in simple scenarios with machines directly attached to the storage media or in more complex configurations in Storage Area Networks as a host-controlled solution. It can also be used with a number of applications for point-in-time backup copies at the volume level that can be used for anything from off-host backup processing to disaster recovery.

Borislav Stoichkov has an MS degree in Computer Science with a focus on cryptography as well as certifications from Sun Microsystems and Red Hat. He has been engineering and implementing solutions and managing Linux and Solaris systems in large enterprise environments for the past 5 years. Interests include secure communication and data storage, high performance computing. Currently, he works as a Unix consultant in the Washington DC area and can be reached at: borislav.stoichkov@meanstream.org.


Database Migrations the VxVM Way

by sebastin on Apr.29, 2008, under Solaris, Veritas Volume Manager

Migration Methods

The term database migration can mean a variety of things. It can refer to the movement from one database to another where the data is moved between the databases, such as moving from an Informix IDS version 7.31 to a new IDS version 9.40 database using Informix dbexport/dbimport utilities. Or, it can refer to the movement of the database to an entirely new platform, such as moving from Solaris 2.6 on SPARC to Linux 2.6 or Windows Server 2003 on Intel x86. It can refer to the movement of the database from one server to another, such as moving from a Sun Enterprise E450 to a Sun Fire 4800 system. Or, it can simply mean moving the database from one disk array to another.

The operative word here is “move”. An in-place upgrade of a database, from one version to another, is not considered a database migration. For example, upgrading an Informix IDS version 7.31 to IDS version 9.30 would not be considered a database migration.

Database migrations are initiated for a variety of reasons. Sometimes they are done for increased performance. If database loads or nightly database refreshes are taking too long, then a new server with more or faster CPUs may help. Sometimes they are done for data reorganization. Perhaps disk hot spots are leading to poor performance. Sometimes migrations are done as part of server consolidation projects, where entire departments are asked to move their databases to a single server. Often, it’s just a simple matter of economics. The original environment may have become too costly due to high maintenance costs, or the new environment may offer lower software licensing costs.

A number of database migration methods can be employed. The DBA can unload the data to a transportable file format and recreate the database from scratch on the destination system. If the new environment supports the database’s data files, the files can be archived and copied to the target system using a slew of Unix utilities (e.g., tar, cpio, pax, etc.). If the database data files are stored on raw devices, the Unix dd command can be used to pipe the data to the target system. If the raw devices are managed by a logical volume manager (LVM), such as Veritas Volume Manager (VxVM), then the data may be mirrored to new devices on a new array and then physically moved to the target system. I’ll demonstrate this last method, using Veritas Volume Manager, to quickly and reliably migrate a database.

VxVM Migration Prerequisites

Veritas Volume Manager, available for most Unix platforms, has become the de facto LVM in many shops because of its advanced features and standardized command set across platforms. To provide a brief overview, VxVM allows the creation of volumes, which are logical devices that appear to the operating system as a type of hard disk or disk partition (i.e., a virtual disk). Volumes can be constructed from one disk to many disks supporting various RAID levels (RAID-0, RAID-1, RAID-5, etc.) as well as simple disk concatenation. The advantages of volumes include increased storage capacity beyond single disk, various degrees of data protection, increased read/write performance, and ease of storage management to name a few.

Successful database migrations with VxVM require careful planning and preparation, but the reward is well worth the effort. Before the migration can begin, the DBA must determine the feasibility of the migration, since not all migrations can be performed with Veritas Volume Manager. There are several prerequisites for performing a migration with VxVM.

First, the database must be built on raw device volumes. If it isn’t, forget about using VxVM for the migration. Instead, use any of the supplied database utilities or one of the aforementioned Unix archive/copy utilities. Second, the target database must support the original database’s data files. If the migration is to a minor version upgrade of the database on the same platform, then this is most likely the case. However, if the migration is to a new major version of the database, then the DBA may need to consult with the database vendor first. In any event, if the new version of the database doesn’t directly support the database files, VxVM may still be used for the migration and an in-place upgrade can be performed on the migrated database. Unfortunately, a VxVM database migration to a new platform, such as from Solaris to Windows, or to a new database platform, such as from Informix to Oracle, is probably not possible.

If you’ve satisfied the VxVM database migration prerequisites, then a database migration the VxVM way might just be for you.

Setting the Stage

So, you’ve outgrown your current server and decided to purchase a new, more powerful replacement. Your current server is hosting a multi-terabyte database on an old disk array and you need additional disk space for growth, so you’ve decided to purchase a new disk array as well. By purchasing both a new server and new storage, you’ve set the stage for a database migration. By performing the migration to a new server, you can continue to host the database on the old server while copying/moving the data to a new, more powerful server with more storage and without much downtime.

You’ve also consulted with the DBA, and he has satisfied all of the VxVM migration prerequisites and has a game plan. The DBA is going to stick with the same major version of the database software, but with a minor upgrade. You’ve decided that the new server will be running a new version of the operating system, and the DBA has confirmed that the database software is supported. The plan is to copy the database to the new server and bring it online with the same name, allowing for two copies of the database to exist at once. This will make migrating to the new database easier and transparent to the users.

Mirroring the Volumes

With some careful planning, you’ve attached the new disk array to the old server and configured the storage for use by the operating system. Because the new disk array has more capacity than the old array, disk space will not be an issue. To copy the data from the old array to the new, you must add a new disk (LUN) from the new array to each of the original VxVM disk groups. Because the new LUNs are much larger, you should initialize the LUNs, soon to be christened VM disks by VxVM, with a large VxVM private region using the vxdisksetup command:

vxdisksetup -i <device> privlen=8192
example: vxdisksetup -i c3t8d0 privlen=8192
The default private region length is 2048 (sectors), which I think is too small for today’s larger capacity disks. By increasing the private region, VxVM can keep track of more objects (e.g., you can create more volumes without worrying about running into a volume limit). After initializing the disks, add the disks to the VxVM disk groups with the vxdg command:

vxdg -g <disk group> adddisk <VM disk name>=<device>
example: vxdg -g idsdg1 adddisk d2_lun1=c3t8d0
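
When several LUNs are involved, it can help to generate the initialization and adddisk commands first and review them before running anything. The following is a minimal sketch, not part of the original procedure: the function name plan_adddisk and the "disk group, VM disk name, device" input layout are assumptions made for illustration, and the function only prints commands.

```shell
#!/bin/sh
# Sketch only: print (never run) the vxdisksetup and vxdg adddisk commands
# for each "disk-group VM-disk-name device" triple read from stdin.
# The input layout and the function name are assumptions for illustration.
plan_adddisk() {
    privlen=${1:-8192}          # private region length in sectors
    while read -r dg vmdisk device; do
        # Skip blank lines and comment lines.
        case "$dg" in ''|'#'*) continue ;; esac
        echo "vxdisksetup -i $device privlen=$privlen"
        echo "vxdg -g $dg adddisk $vmdisk=$device"
    done
}

# Example: one new LUN for disk group idsdg1.
plan_adddisk <<EOF
idsdg1 d2_lun1 c3t8d0
EOF
```

The printed lines match the two example commands shown above, so the output can be pasted into a terminal one command at a time once it has been checked.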
Be sure to add enough new disks to each of the original disk groups to allow the volumes to be mirrored to the new disks. If the volumes to be mirrored are simple volumes, you can use the vxmirror command:

vxmirror -g <disk group> <old VM disk> <new VM disk>
example: vxmirror -g idsdg1 s1_lun1 d2_lun1
The vxmirror command will mirror every volume in the disk group from the old VM disk to the new VM disk. Perform this operation for all of the disk groups until all the volumes have been mirrored from the old array to the new. If your volumes are complex (e.g., VxVM RAID-0, RAID-5, etc.), use vxassist or vxmake to create the mirrors instead.

Breaking the Mirrors

When all of the volumes have been successfully mirrored, the next step is to split or “break” them into two. It’s a good idea to get the DBA involved to schedule a database outage and shut down the database before breaking the mirrors. You don’t want changes to be made to the database after you have broken the mirrors. You could break the mirrors while the databases are online, but then you would have to keep track of the changes and apply them manually later.

Breaking the mirrors is a cumbersome process because you need to run the vxplex command for each of the mirrored volumes:

vxplex -g <disk group> dis <plex>
example: vxplex -g idsdg1 dis pdf11282004-02
I wrote a ksh function to automate this process. You can copy and paste this into your own script. I don’t like to automate VxVM tasks too much, because there are many things that can go wrong if you let a script take full control:

function make_vols {
    dg=idsdg1
    metadata=/var/tmp/$dg.config

    # Find every plex that has a subdisk on the new VM disk (d2_lun1).
    plexes=$(vxprint -g $dg -mte '"d2_lun1" in (sd_disk)'|grep pl_name|awk -F= '{print $2}')
    for plex in $plexes; do
        echo "Disassociating $plex from disk group $dg"
        #vxplex -g $dg dis $plex
        # Strip the plex suffix (e.g., -02) and append _d2 for the new volume name.
        volume_tmp=$(echo $plex|sed -e 's/-0[0-9]*$//')
        volume="${volume_tmp}_d2"
        echo "Creating new volume $volume using plex $plex"
        #vxmake -g $dg -U gen vol $volume plex=$plex
        echo "Extracting volume $volume metadata and appending it to $metadata"
        #vxprint -hmvpsQqr -g $dg $volume >> $metadata
        echo " "
    done
}
Set the dg variable to the disk group with the mirrors you want to break. The “d2_lun1” reference in the function is the name of the new VM disk you added to the disk group (from the new array). Change this value to your own VM disk. I’ve commented out the VxVM commands to protect you from accidentally running this function without understanding what’s really going on. Since every VxVM environment is different, it’s difficult to write scripts that will work in every situation. I recommend using this function as a template for your own script. Note that the function not only breaks the mirrors, but also creates new volumes from the disassociated plexes on the new disk (a “_d2” suffix is added to the volume names to avoid conflicts with the existing volumes). It will become apparent later why we needed to create new volumes in the first place. The function also extracts all of the VxVM volume metadata to a flat file, which will be used later. Run the function for each of your disk groups, until all of the mirrors have been broken.
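
To see what names make_vols will produce before uncommenting anything, the name derivation can be tried on its own. This is a small sketch: plex_to_volume is a hypothetical helper, while the sed expression and the default suffix are taken from the function above.

```shell
#!/bin/sh
# Sketch: show how make_vols turns a plex name into the temporary volume name.
# plex_to_volume is a hypothetical helper; the sed expression and the "_d2"
# default suffix are the ones used in make_vols.
plex_to_volume() {
    suffix=${2:-_d2}
    base=$(echo "$1" | sed -e 's/-0[0-9]*$//')   # strip the plex suffix, e.g. "-02"
    echo "${base}${suffix}"
}

plex_to_volume pdf11282004-02     # prints pdf11282004_d2
```

Note that the pattern only strips suffixes that begin with -0, so a plex named with -10 or higher would keep its suffix. Check the generated names against your own plex naming before relying on them.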

There is a caveat with extracting the metadata. I’ve noticed that the permissions on the volumes do not get preserved. I’ll present a ksh function to correct this problem later, before the volumes are started on the new server.

Deporting the Disk Groups

When all of the mirrors have been successfully split, the next step is to delete the newly created volumes. Don’t be alarmed; we’ll restore the volumes later from the metadata we extracted earlier. This step is necessary because we must create new disk groups in which to store our new volumes. We can then deport the new disk groups and move them to our new server, leaving the old disk groups untouched.

Use this ksh function to remove the volumes:

function remove_vols {
    dg=idsdg1
    metadata=/var/tmp/$dg.config

    # Take the volume names from the metadata file written by make_vols.
    volumes=$(grep "^vol " $metadata|awk '{print $2}')
    for volume in $volumes; do
        echo "Removing volume $volume from $dg"
        #vxedit -g $dg -r rm $volume
        echo " "
    done
}
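
Because remove_vols deletes whatever the metadata file lists, it is worth previewing that list first. The sketch below uses the same grep/awk pattern as remove_vols; list_metadata_vols is a hypothetical helper, and the sample file is a made-up, heavily simplified imitation of the record headers in vxprint -m output, not real output.

```shell
#!/bin/sh
# Sketch: list the volume records in a metadata file without touching VxVM.
# list_metadata_vols is a hypothetical helper; it uses the same
# grep/awk pattern as remove_vols.
list_metadata_vols() {
    grep "^vol " "$1" | awk '{print $2}'
}

# Made-up, simplified sample: only the record header lines matter here.
sample=$(mktemp)
cat > "$sample" <<'EOF'
vol rootdbs_prod1_d2
plex rootdbs_prod1-02
sd d2_lun1-01
vol datadbs_prod1_d2
plex datadbs_prod1-02
sd d2_lun1-02
EOF

list_metadata_vols "$sample"
rm -f "$sample"
```

Every name printed should end in the “_d2” suffix. If an original volume name shows up, the metadata file contains more than the new volumes and should not be fed to remove_vols.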
Once you’ve removed all of the new volumes, remove the new VM disks too:

vxdg -g <disk group> rmdisk <VM disk name>
example: vxdg -g idsdg1 rmdisk d2_lun1
Now you are ready to create new disk groups from the VM disks that you just removed:

vxdg init <new disk group> <VM disk name>=<device>
example: vxdg init idsdg1_d2 d2_lun1=c3t8d0
Each new disk group should have a name similar to its old counterpart, and the VM disk should keep the name it had before, because the extracted metadata refers to it. Once you’ve created the new disk groups (there should be the same number of new groups as old), restore into each new disk group the volume metadata that was extracted earlier for the new volumes:

vxmake -g <new disk group> -d <metadata file>
example: vxmake -g idsdg1_d2 -d /var/tmp/idsdg1.config
Once you restore the metadata, all of the volumes that you originally removed from the old disk group will be restored like magic to the new disk group. Do this for each new disk group. After you’ve created the new disk groups with the restored volumes, all that’s left is to deport the new disk groups to the new server:

vxdg -n <original disk group> deport <new disk group>
example: vxdg -n idsdg1 deport idsdg1_d2
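
The rmdisk, init, vxmake, and deport steps have to be repeated in the same order for every disk group, so a printed plan can serve as a checklist. This is a sketch under stated assumptions: plan_split and its argument order are hypothetical, the “_d2” disk group suffix and /var/tmp metadata path follow the examples above, and the function only prints commands.

```shell
#!/bin/sh
# Sketch only: print the rmdisk / init / vxmake / deport sequence for one
# disk group so it can be reviewed before anything is run.
# plan_split and its argument order are assumptions for illustration.
plan_split() {
    dg=$1
    vmdisk=$2
    device=$3
    new_dg="${dg}_d2"                 # temporary name for the new disk group
    metadata="/var/tmp/$dg.config"    # file written while breaking the mirrors
    echo "vxdg -g $dg rmdisk $vmdisk"
    echo "vxdg init $new_dg $vmdisk=$device"
    echo "vxmake -g $new_dg -d $metadata"
    echo "vxdg -n $dg deport $new_dg"
}

plan_split idsdg1 d2_lun1 c3t8d0
```

The plan assumes the new volumes have already been removed from the old disk group; rmdisk will refuse to remove a VM disk that still holds subdisks.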
Use the -n option to rename the disk groups back to the original disk group name during the deport operation. It’s convenient to perform the disk group renaming during the deport so that, when you import the disk groups on the new server, the disk group names are the same as on the old server. Deport all of the new disk groups.

Importing the Disk Groups

Once all of the new disk groups have been successfully deported, disconnect the new array from the old server and attach it to the new server. You’ll have to go through the motions of making the disk array visible on the new server. Don’t fret about the integrity of the data. It’s safely stored on the VxVM disks. Don’t worry about device renumbering either (e.g., the cxtxdx name changing on the new server), because VxVM tracks disks by the information stored in the private region on the disk and not by the operating system device name.

Once you feel confident that all of the LUNs on the disk array are accounted for and visible to the operating system, import the disk groups:

vxdg import <disk group>
example: vxdg import idsdg1
If you like, you can also use the menu-driven vxdiskadm command to import disk groups (menu item 8: Enable access to (import) a disk group). It conveniently lists all the disk groups that can be imported. Import all of the formerly deported disk groups, using the original name of the disk group on the old server (remember we deported the disk groups with the rename option). Once all of the disk groups have been imported, don’t forget to rename the volumes back to their original names. If you used the make_vols function, it appended a “_d2” (or whatever value you chose) to the end of the new volume names. Use this ksh function to rename the volumes:

function rename_vols {
    dg=idsdg1
    volumes=$(vxprint -g $dg -mte v_kstate|grep "^vol "|awk '{print $2}')
    for volume in $volumes; do
        # Strip the temporary _d2 suffix to get the original volume name.
        new_volume=$(echo $volume|sed -e 's/_d2$//')
        echo "Renaming volume $volume to $new_volume in disk group $dg"
        #vxedit -g $dg rename $volume $new_volume
        echo " "
    done
}
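
One thing to watch in rename_vols: a volume that does not carry the suffix comes out of the sed step unchanged, so the function would try to rename it to its own name. A guarded variant might look like the following sketch. plan_renames is a hypothetical helper that reads volume names from stdin and only prints the vxedit commands it would run.

```shell
#!/bin/sh
# Sketch: print rename commands only for volumes that carry the suffix.
# plan_renames is a hypothetical helper that reads volume names from stdin.
plan_renames() {
    dg=$1
    suffix=${2:-_d2}
    while read -r volume; do
        case "$volume" in
            *"$suffix")
                new_volume=${volume%"$suffix"}    # drop the trailing suffix
                echo "vxedit -g $dg rename $volume $new_volume"
                ;;
            *)
                echo "# skipping $volume (no $suffix suffix)"
                ;;
        esac
    done
}

printf '%s\n' rootdbs_prod1_d2 logvol | plan_renames idsdg1
```

On a live system the input would come from the same vxprint pipeline that rename_vols uses; here two made-up names stand in for it.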
Modify rename_vols if you used something other than “_d2” for the new volumes. Do this for all of the disk groups. Before starting the volumes, make sure that the permissions on the volumes are correct. I’ve noticed that VxVM is not consistent in restoring the owner and group id from the metadata. This is critical for a database because the volumes must be owned by the database user id (e.g., informix or oracle). Use this ksh function to correct the problem:

function fix_vols {
    dg=idsdg1
    volumes=$(vxprint -g $dg -mte v_kstate|grep "^vol "|awk '{print $2}')
    for volume in $volumes; do
        echo "Changing ownership and mode on $volume in disk group $dg"
        #vxedit -g $dg set mode=0660 user=informix group=informix $volume
        echo " "
    done
}
Set “mode=”, “user=”, and “group=” to the correct values. Double-check that the permissions/ownerships on the volumes match those of the old server before starting the volumes:

vxrecover -g <disk group> -Esb
example: vxrecover -g idsdg1 -Esb
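
The ownership comparison recommended before starting the volumes is easier with a listing from each server that can be diffed. The following is a minimal sketch: list_perms is a hypothetical helper, and it assumes the volume device nodes live under /dev/vx/rdsk/<disk group>, as in the Informix example later in this article.

```shell
#!/bin/sh
# Sketch: print "mode owner group name" for every entry in a directory so the
# listing from the old server can be diffed against the new one.
# list_perms is a hypothetical helper; on a VxVM host the directory would be
# something like /dev/vx/rdsk/idsdg1.
list_perms() {
    # -L follows symbolic links; the "total" line of ls -l is dropped.
    LC_ALL=C ls -lL "$1" | awk '$1 != "total" {print $1, $3, $4, $NF}'
}

# On the old server:
#   list_perms /dev/vx/rdsk/idsdg1 > /var/tmp/idsdg1.perms.old
# On the new server, after copying that file over:
#   list_perms /dev/vx/rdsk/idsdg1 | diff /var/tmp/idsdg1.perms.old -
```

An empty diff means mode, owner, and group match for every volume. Any differences point at the volumes that fix_vols still needs to correct.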
When the volumes have all been started, it’s again time to get the DBA involved. If the database uses symbolic links to the raw device volumes, rather than referencing the devices directly, you will need to recreate the symbolic links. For example, for the following Informix IDS root dbspace (rootdbs), the path /ids_prod1/rootdbs is really a symbolic link to the raw volume:

# onstat -d|grep "/rootdbs"
2ac06928 1 1 0 1000000 997130 PO- /ids_prod1/rootdbs

# ls -la /ids_prod1/rootdbs
lrwxrwxrwx 1 informix informix 33 Jun 10 2003 /ids_prod1/rootdbs -> /dev/vx/rdsk/idsdg1/rootdbs_prod1
The easiest way to recreate the symbolic links is to tar up the links on the old server, copy the tarball to the new server, and extract it there. Once the links have been created on the new server, make sure that they point to the correct volumes. They should, because we used the same disk group names as the old server during the disk group imports, and we renamed the volumes back to their original names too. The links must be recreated exactly. VxVM preserves the device names and consistently stores the devices (volumes) in the /dev/vx directory (even across platforms), so links copied correctly from the old server will point to the right volumes on the new one.
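
Checking the links by eye gets tedious when a database has dozens of them. The sketch below walks one directory and reports whether each symbolic link resolves to something that exists. check_links is a hypothetical helper, and /ids_prod1 in the usage line is the link directory from the example above.

```shell
#!/bin/sh
# Sketch: report each symbolic link in a directory and whether its target exists.
# check_links is a hypothetical helper; it returns non-zero if any link is broken.
check_links() {
    status=0
    for link in "$1"/*; do
        [ -L "$link" ] || continue          # only look at symbolic links
        if [ -e "$link" ]; then             # -e follows the link to its target
            echo "ok      $link"
        else
            echo "BROKEN  $link"
            status=1
        fi
    done
    return $status
}

# Usage on the new server: check_links /ids_prod1
```

A link reported as BROKEN usually means the disk group or volume name on the new server differs from the old one, or that the volume has not been started yet.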

Once the symbolic links have been verified as correct, the DBA can install the database software and copy over any configuration files needed from the old server to bring the database online. Once the database is online and the DBA is satisfied with the results, you can put another feather in your cap and call it a day.

Conclusion

It has come to my attention that Veritas Volume Manager for Unix (versions 3.2 and above) includes several new features that automatically perform some of the VxVM functions/commands I presented. Specifically, the vxdg command has been enhanced to allow you to move, join, and split disk groups. These enhancements will allow you to more easily perform a database migration, but they do require an extra license. It’s probably worth it to purchase the license, but it doesn’t hurt to know the sordid details anyway.

The database migration method I presented using Veritas Volume Manager is one that I have successfully used several times in the past. It may not be as efficient for migrating smaller databases, since there are many steps to perform, but it is well worth the effort if you have a very large database to migrate. The methods presented can be applied to other types of data migration, not just databases. Perhaps you will find some new uses and pass them along.

References

Rockwood, Ben. The Cuddletech Veritas Volume Manager Series: Advanced Veritas Theory. August 10, 2002. http://www.cuddletech.com/veritas/advx/index.html (March 28, 2004)

Veritas Software Corporation. Veritas Volume Manager 3.1.1 Administrator’s Guide. February 2001.

Veritas Software Corporation. How to move a disk between disk groups: TechNote ID: 182857. October 9, 2002. http://seer.support.veritas.com/docs/182857.htm (March 28, 2004)