Tuesday, April 19, 2011

HACMP Short Notes

HACMP

HACMP : High Availability Cluster Multi-Processing

High Availability : Elimination of both planned and unplanned system and application downtime. This is achieved through elimination of H/W and S/W single points of failure.

Cluster Topology : The nodes, networks, storage, clients, and persistent node IP labels/devices

Cluster resources: Components that HACMP can move from one node to another, e.g. service IP labels, file systems and applications

RSCT Version: 2.4.2

SDD Version: 1.3.1.3

HA Configuration :

  • Define the cluster and nodes
  • Define the networks and disks
  • Define the topology
  • Verify and synchronize
  • Define the resources and resource groups
  • Verify and synchronize

Files changed after installation : /etc/inittab, /etc/rc.net, /etc/services, /etc/snmpd.conf, /etc/snmpd.peers, /etc/syslog.conf,

/etc/trcfmt, /var/spool/cron/crontabs/root, /etc/hosts; the hacmp group is also added

Software Components:

Application server

HACMP Layer

RSCT Layer

AIX Layer

LVM Layer

TCP/IP Layer

HACMP Services :

Cluster communication daemon(clcomdES)

Cluster Manager (clstrmgrES)

Cluster information daemon(clinfoES)

Cluster lock manager (cllockd)

Cluster SMUX peer daemon (clsmuxpd)

HACMP Daemons: clstrmgr, clinfo, clsmuxpd, cllockd.

HA supports up to 32 nodes

HA supports up to 48 networks

HA supports up to 64 resource groups per cluster

HA supports up to 128 cluster resources

IP Label : The label that is associated with a particular IP address as defined by the DNS (/etc/hosts)

Base IP label : The default IP address that is set on the interface by AIX at startup.

Service IP label: The label over which a service is provided; it may be bound to a single node or to multiple nodes. These are the addresses that HACMP keeps highly available.

IP alias: An IP alias is an IP address that is added to an interface, rather than replacing its base IP address.

RSCT monitors the state of the network interfaces and devices.

IPAT via replacement : The service IP label will replace the boot IP address on the interface.

IPAT via aliasing: The service IP label will be added as an alias on the interface.

Persistent IP address: this can be assigned to a network for a particular node.

In HACMP the NFS export file is : /usr/es/sbin/cluster/etc/exports

Shared LVM:

  • Shared volume group is a volume group that resides entirely on the external disks shared by cluster nodes
  • Shared LVM can be made available on Non concurrent access mode, Concurrent Access mode, Enhanced concurrent access mode.

NON concurrent access mode: This environment typically uses journaled file systems to manage data.

Create a non concurrent shared volume group: smitty mkvg → give the VG name, No for automatically available after system restart, Yes for activate VG after it is created, give the VG major number

Create a non concurrent shared file system: smitty crjfs → rename the FS names, No to mount automatically at system restart, test the newly created FS by mounting and unmounting it.
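
A command-line equivalent of the two steps above (a minimal sketch; the VG name, major number, disk and mount point are illustrative):

  mkvg -y app1vg -s 64 -V 99 -n hdisk4               # -n: do not varyon automatically at restart
  varyonvg app1vg
  crfs -v jfs2 -g app1vg -m /app1 -a size=1G -A no   # -A no: do not mount automatically at restart
  mount /app1; umount /app1                          # test the new file system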

Importing a volume group to a fallover node (a command sketch follows these steps):

· Varyoff the volume group

· Run discover process

· Import a volume group
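
On the fallover node those steps look roughly like this (a sketch; the VG name, major number and disk are illustrative):

  varyoffvg app1vg                   # on the node that currently owns the VG
  importvg -V 99 -y app1vg hdisk4    # on the fallover node, keeping the same major number
  chvg -a n app1vg                   # disable auto-varyon; HACMP controls activation
  varyoffvg app1vg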

Concurrent Access Mode: File systems are not supported; raw LVs and physical disks must be used instead.

Creating concurrent access volume group:

· Verify the disk status using lsdev –Cc disk

· smitty cl_convg → Create a concurrent volume group → Enter

· Import the volume group using importvg –C –y vg_name physical_volume_name

· varyonvg vgname

Create LV’s on the concurrent VG: smitty cl_conlv.

Enhanced concurrent mode VG’s: This can be used for both concurrent and non concurrent access. The VG is varied on by all nodes in the cluster; access for modifying the data is granted only to the node that has the resource group active.

Active or passive mode:

Active varyon: all high level operations permitted.

Passive varyon: Read only permissions on the VG.

Create an enhanced concurrent mode VG: mkvg –n –s 32 –C –y myvg hdisk11 hdisk12

Resource group behaviour:

Cascading: Fallover using dynamic node priority. Online on first available node

Rotating : Failover to next priority node in the list. Never fallback. Online using distribution policy.

Concurrent : Online on all available nodes . never fallback

RG dependencies: clrgdependency –t

/etc/hosts : /etc/hosts for name resolution. All cluster node IP interfaces must be added on this file.

/etc/inittab : hacmp:2:once:/usr/es/sbin/cluster/etc/rc.init >/dev/console 2>&1 ; this entry starts clcomdES and clstrmgrES.

The /etc/rc.net file is called by cfgmgr to configure and start TCP/IP during the boot process.

C-SPOC uses clcomdES to execute commands on remote nodes.

C-SPOC commands are located in /usr/es/sbin/cluster/cspoc

You should not stop a node with the forced option on more than one node at a time, and the forced option should not be used when the resource group is in concurrent mode.

Cluster commands are in /usr/es/sbin/cluster

User Administration : cl_usergroup

Create a concurrent VG -- > smitty cl_convg

To find the resource group information: clrginfo –P

HACMP Planning:

Maximum no.of nodes in a cluster is 32

In an HACMP Cluster, the heartbeat messages are exchanged via IP networks and Point-to-Point networks

IP Label represents the name associated with a specific IP address

Service IP label/address: The service IP address is an IP address used for client access.

2 types of service IP addresses:

Shared Service IP address: It can be active only on one node at a time.

Node bound service IP address: An IP address that can be configured on only one node

Method of providing high availability service IP addresses:

IP address takeover via IP aliases

IPAT via IP replacement

IP alias is an IP address that is configured on a communication interface in addition to the base IP address. IP alias is an AIX function that is supported by HACMP. AIX supports multiple IP aliases on each communication interface. Each IP alias can be on a different subnet.

Network Interface:

Service Interface: This interface is used for providing access to the application running on that node. The service IP address is monitored by HACMP via RSCT heartbeat.

Boot Interface: This is a communication interface. With IPAT via aliasing, during failover the service IP label is aliased onto the boot interface

Persistent node IP label: It is useful for administrative purposes.

When an application is started or moved to another node together with its associated resource group, the service IP address can be configured in two ways.

  • Replacing the base IP address of a communication interface. The service IP label and boot IP label must be on the same subnet.
  • Configuring one communication interface with an additional IP address on top of the existing one. This method is IP aliasing. All IP addresses/labels must be on different subnets.

Default method is IP aliasing.
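
For IPAT via aliasing, a typical addressing plan therefore looks something like this (a sketch; all addresses are illustrative):

  node1 boot IP (en0):   192.168.10.1/24   # base subnet 1
  node1 boot IP (en1):   192.168.20.1/24   # base subnet 2
  service IP label:      10.10.10.10/24    # different subnet, aliased onto en0 or en1 by HACMP
  persistent IP (node1): 10.10.10.1/24     # stays on node1 for administration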

HACMP Security: Implemented directly by clcomdES; it uses HACMP ODM classes and the /usr/es/sbin/cluster/rhosts file to determine partners.

Resource Group Takeover relationship:

Resource Group: It’s a logical entity containing the resources to be made highly available by HACMP.

Resources: Filesystems, NFS, Raw logical volumes, Raw physical disks, Service IP addresses/Labels, Application servers, startup/stop scripts.

To be made highly available by HACMP, each resource should be included in a resource group.

Resource group takeover relationship:

  1. Cascading
  2. Rotating
  3. Concurrent
  4. Custom

Cascading:

    • Cascading resource group is activated on its home node by default.
    • The resource group can be activated on a lower priority node if the highest priority node is not available at cluster startup.
    • On node failure, the resource group falls over to the available node with the next priority.
    • Upon node reintegration into the cluster, a cascading resource group falls back to its home node by default.
    • Attributes:

1. Inactive takeover(IT): Initial acquisition of a resource group in case the home node is not available.

2. Fallover priority can be configured in default node priority list.

3. Cascading without fallback (CWOF) is an attribute that modifies the fallback behavior. If the CWOF flag is set to true, the resource group will not fall back to any node joining the cluster; when the flag is false, the resource group falls back to the higher priority node.

Rotating:

    • At cluster startup first available node in the node priority list will activate the resource group.
    • If the resource group is on a takeover node, it will never fall back to a higher priority node when one becomes available.
    • Rotating resource groups require the use of IP address takeover. The nodes in the resource chain must all share the same network connection to the resource group.

Concurrent:

    • A concurrent RG can be active on multiple nodes at the same time.

Custom:

    • Users have to explicitly specify the desired startup, fallover and fallback procedures.
    • Custom resource groups support only IPAT via aliasing service IP addresses.

Startup Options:

  • Online on home node only
  • Online on first available node
  • Online on all available nodes
  • Online using distribution policy → the resource group will only be brought online if the node has no other resource group online. You can verify this with lssrc –ls clstrmgrES

Fallover Options:

  • Fallover to next priority node in list
  • Fallover using dynamic node priority → the fallover node can be selected on the basis of its available CPU, its available memory or the lowest disk usage. HACMP uses RSCT to gather this information; the resource group then falls over to the node that best meets the chosen criterion.
  • Bring offline → the resource group will be brought offline in the event of an error. This option is designed for resource groups that are online on all available nodes.

Fallback Options:

  • Fallback to higher priority node in the list
  • Never fallback

Basic Steps to implement an HACMP cluster:

  • Planning
  • Install and connect the hardware
  • Configure shared storage
  • Installing and configuring application software
  • Install HACMP software and reboot each node
  • Define the cluster topology
  • Synchronize the cluster topology
  • Configure cluster resources
  • Configure cluster resource group and shared storage
  • Synchronize the cluster
  • Test the cluster

HACMP installation and configuration:

HACMP release notes : /usr/es/lpp/cluster/doc

smitty install_all → fast path for installation

Cluster.es and cluster.cspoc images must be installed on all servers

Start the cluster communication daemon → startsrc –s clcomdES

Upgrading the cluster options: node by node migration and snapshot conversion

Steps for migration:

  • Stop cluster services on all nodes
  • Upgrade the HACMP software on each node
  • Start cluster services on one node at a time

Convert from supported version of HAS to hacmp

  • Current s/w should be committed
  • Save snapshot
  • Remove the old version
  • Install HA 5.1 and verify

Check previous version of cluster: lslpp –h “cluster”

To save your HACMP configuration, create a snapshot in HACMP

Remove old version of HACMP: smitty install_remove ( select software name cluster*)

lppchk –v and lppchk –c cluster* both run clean if the installation is OK.

After you have installed HA on the cluster nodes you need to convert and apply the snapshot. Converting the snapshot must be performed before rebooting the cluster nodes.

clconvert_snapshot –C –v version –s <snapshot file> → converts an HA snapshot from the old version to the new version
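
For example (a sketch; the version number and snapshot name here are illustrative):

  clconvert_snapshot -C -v 4.5 -s mycluster_snapshot   # converts the saved snapshot to the new HACMP format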

After installation, restarting cluster services is required in order to activate the new cluster manager.

Verification and synchronization : smitty hacmp → extended configuration → extended verification and configuration → verify changes only

Perform Node-by-Node Migration:

  • Save the current configuration in snapshot.
  • Stop cluster services on one node using graceful with takeover
  • Verify the cluster services
  • Install hacmp latest version.
  • Check the installed software using lppchk
  • Reboot the node.
  • Restart the HACMP software ( smitty hacmp → System Management → Manage cluster services → start cluster services )
  • Repeat the above steps on all nodes
  • Logs are written to /tmp/hacmp.out, /tmp/cm.log and /tmp/clstrmgr.debug
  • The config_too_long message appears when the cluster manager detects that an event has been processing for more than the specified time. To change the time interval: smitty hacmp → extended configuration → extended event configuration → change/show time until warning

Cluster snapshots are saved in the /usr/es/sbin/cluster/snapshots directory.

Synchronization process will fail when migration is incomplete. To back out from the change you must restore the active ODM. (smitty hacmp → Problem determination tools → Restore HACMP configuration database from active configuration.)

Upgrading HACMP new version involves converting the ODM from previous release to the current release. That is done by /usr/es/sbin/cluster/conversion/cl_convert –F –v 5.1

The log file for the conversion is /tmp/clconvert.log.

Clean-up process once an installation is interrupted: smitty install → Software maintenance and installation → Clean up after an interrupted installation

Network Configuration:

Physical Networks: TCP/IP based, such as Ethernet and token ring; device based, such as RS232 and target mode SSA (tmssa)

Configuring cluster Topology:

Standard and Extended configuration

smitty hacmp → Initialization and standard configuration

IP aliasing is used as the default mechanism for service IP label/address assignment to a network interface.

  • Configure nodes : smitty hacmp → Initialization and standard configuration → Configure nodes to an HACMP cluster → (give the cluster name and node names)
  • Configure resources: Use "Configure resources to make highly available" (configure IP address/label, application server, volume groups, logical volumes, file systems)
  • Configure resource groups: Use "Configure HACMP resource groups". You can choose cascading, rotating, custom or concurrent.
  • Assign resources to each resource group: Configure HACMP resource groups → Change/show resources for a resource group.
  • Verify and synchronize the cluster configuration
  • Display the cluster configuration

Steps for cluster configuration using extended path:

  • Run discovery: Running discovery retrieves current AIX configuration information from all cluster nodes.
  • Configuring an HA cluster: smitty hacmp → extended configuration → extended topology configuration → configure an HACMP cluster → Add/change/show an HA cluster
  • Defining a node: smitty hacmp → extended configuration → extended topology configuration → configure HACMP nodes → Add a node to the HACMP cluster
  • Defining sites: This is optional.
  • Defining networks: Run discovery before network configuration.
    1. IP based networks: smitty hacmp → extended configuration → extended topology configuration → configure HACMP networks → Add a network to the HACMP cluster → select the type of network → (enter network name, type, netmask, enable IP takeover via IP aliases (default is true), IP address offset for heartbeating over IP aliases)
  • Defining communication interfaces: smitty hacmp → extended configuration → extended topology configuration → HACMP communication interfaces/devices → Select communication interfaces → add node name, network name, network interface, IP label/address, network type
  • Defining communication devices: smitty hacmp → extended configuration → extended topology configuration → configure HACMP communication interfaces/devices → select communication devices
  • To see boot IP labels on a node use netstat –in
  • Defining persistent IP labels: It always stays on the same node, does not require installing an additional physical interface, and is not part of any resource group. smitty hacmp → extended topology configuration → configure persistent node IP labels/addresses → add a persistent node IP label (enter node name, network name, node IP label/address)

Resource Group Configuration

  • smitty hacmp → Initialization and standard configuration → Configure HACMP resource groups → Add a standard resource group → Select cascading/rotating/concurrent/custom (enter the resource group name and participating node names)
  • Assigning resources to the RG: smitty hacmp → Initialization and standard configuration → Configure HACMP resource groups → Change/show resources for a standard resource group (add service IP label/address, VG, FS, application servers)

Resource group and application management:

  • Bring a resource group offline: smitty cl_admin → select HACMP resource group and application management → Bring a resource group offline.
  • Bring a resource group online: smitty hacmp → select HACMP resource group and application management → Bring a resource group online.
  • Move a resource group: smitty hacmp → select HACMP resource group and application management → Move a resource group to another node

C-SPOC: Under smitty cl_admin

  • Manage HACMP services
  • HACMP Communication interface management
  • HACMP resource group and application manipulation
  • HACMP log viewing and management
  • HACMP file collection management
  • HACMP security and users management
  • HACMP LVM
  • HACMP concurrent LVM
  • HACMP physical volume management

Post Implementation and administration:

C-Spoc commands are located in the /usr/es/sbin/cluster/cspoc directory.

HACMP for AIX ODM object classes are stored in /etc/es/objrepos.

User group administration in hacmp is smitty cl_usergroup

Problem Determination:

To verify the cluster configuration use smitty clverify.dialog

Log file to store output: /var/hacmp/clverify/clverify.log

HACMP Log Files:

/usr/es/adm/cluster.log: Generated by HACMP scripts and daemons.

/tmp/hacmp.out: This log file contains a line-by-line record of every command executed by scripts.

/usr/es/sbin/cluster/history/cluster.mmddyyyy: The system creates a cluster history file every day.

/tmp/clstrmgr.debug: Messages generated by clstrmgrES activity.

/tmp/cspoc.log: generated by hacmp c-spoc commands

/tmp/dms_loads.out: stores log messages every time hacmp triggers the deadman switch

/var/hacmp/clverify/clverify.log: cluster verification log.

/var/ha/log/grpsvcs, /var/ha/log/topsvcs, /var/ha/log/grpglsm: daemon logs.

Snapshots: The primary information saved in a cluster snapshot is the data stored in the HACMP ODM classes(HACMPcluster, HACMPnode, HACMPnetwork, HACMPdaemons).

The cluster snapshot utility stores the data it saves in two separate files:

ODM data file(.odm), Cluster state information file(.info)

To create a cluster snapshot: smitty hacmp → HACMP extended configuration → HACMP snapshot configuration → Add a cluster snapshot

Cluster Verification and testing:

High and Low water mark values are 33 and 24

The default value for syncd is 60.

Before starting the cluster, the clcomd daemon is added to /etc/inittab and started by init.

Verify the status of the cluster services: lssrc –g cluster (the cluster manager daemon (clstrmgrES), cluster SMUX peer daemon (clsmuxpd) and cluster topology services daemon (topsvcd) should be running).

Status of different cluster subsystems: lssrc –g topsvcs and lssrc –g emsvcs.

In /tmp/hacmp.out file look for the node_up and node_up_complete events.
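
A quick post-start health check can string the commands above together (a sketch; it uses only commands already mentioned in these notes):

  lssrc -g cluster                        # clstrmgrES and clsmuxpd should be active
  lssrc -g topsvcs; lssrc -g emsvcs       # RSCT topology and event services
  grep node_up_complete /tmp/hacmp.out    # confirm the node_up events completed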

To check the HACMP cluster status: /usr/sbin/cluster/clstat. To use this command you should have started the clinfo daemon.

To change the snmp version : /usr/sbin/snmpv3_ssw -1.

Stop the cluster services by using smitty clstop : graceful, takeover, forced. In the log file /tmp/hacmp.out search for node_down and node_down_complete.

Graceful: Node will be released, but will not be acquired by other nodes.

Graceful with takeover: Node will be released and acquired by other nodes.

Forced: Cluster services will be stopped but resource group will not be released.

Resource group states: online, offline, acquiring, releasing, error, temporary error, or unknown.

Find the resource group status: /usr/es/sbin/cluster/utilities/clfindres or clRGinfo.

Options: -t displays the settling time; -p displays the priority override locations

To review cluster topology: /usr/es/sbin/cluster/utilities/cltopinfo.

Different type of NFS mounts: hard and soft

Hard mount is default choice.

NFS export file: /usr/es/sbin/cluster/etc/exports.

If the adapter was configured with a service IP address : verify in /tmp/hacmp.out that the swap_adapter event has occurred, and check that the service IP address has been moved using the command netstat –in.
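
For example (a sketch):

  grep swap_adapter /tmp/hacmp.out   # the adapter swap event should appear in the log
  netstat -in                        # the service IP should now be listed on the surviving interface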

You can implement RS232 heartbeat network between any 2 nodes.

To test a serial connection: lsdev –Cc tty; the baud rate is set to 38400, parity to none, bits per character to 8

Test to see RSCT is functioning or not : lssrc –ls topsvcs

RSCT verification: lssrc –ls topsvcs. To check RSCT group services: lssrc –ls grpsvcs

Monitor heartbeats over all the defined networks: cllsif.log from /var/ha/run/topsvcs.clustername.

Prerequisites:

PowerHA Version 5.5 → AIX 5300-09 → RSCT level 2.4.10

BOS components: bos.rte.*, bos.adt.*, bos.net.tcp.*,

bos.clvm.enh (when using the enhanced concurrent resource manager access)

The cluster.es.nfs fileset, which comes on the PowerHA installation medium, installs NFSv4 support. From AIX BOS, bos.net.nfs.server 5.3.7.0 and bos.net.nfs.client 5.3.7.0 are required.

Check that all nodes have the same version of RSCT using lslpp –l rsct*

Installing powerHA: release notes: /usr/es/sbin/cluster/release_notes

Enter smitty install_all → select input device → press F4 for a software listing → Enter

Steps to increase the size of a shared LUN (a consolidated command sketch follows this list):

  • Stop the cluster on all nodes
  • Run cfgmgr
  • varyonvg vgname
  • lsattr –El hdisk#
  • chvg –g vgname
  • lsvg vgname
  • varyoffvg vgname
  • On subsequent cluster nodes that share the VG, run cfgmgr, lsattr –El hdisk#, importvg –L vgname hdisk#
  • Synchronize
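
Put together, on the node that owns the VG the sequence looks roughly like this (a sketch; the VG and hdisk names are illustrative):

  varyonvg app1vg
  lsattr -El hdisk4          # confirm the larger LUN size is visible
  chvg -g app1vg             # grow the VG to use the new size
  lsvg app1vg                # verify the additional PPs
  varyoffvg app1vg
  # on every other node sharing the VG:
  cfgmgr; lsattr -El hdisk4; importvg -L app1vg hdisk4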

PowerHA creates a backup copy of the modified files during synchronization on all nodes. These backups are stored in /var/hacmp/filebackup directory.

The file collection logs are stored in /var/hacmp/log/clutils.log file.

User and group Administration:

Adding a user: smitty cl_usergroup → select Users in an HACMP cluster → Add a user to the cluster (also: list users, change/show characteristics of a user in the cluster, remove a user from the cluster)

Adding a group: smitty cl_usergroup → select Groups in an HACMP cluster → Add a group to the cluster (also: list groups, change/show characteristics of a group in the cluster, remove a group from the cluster)

Command is used to change password on all cluster nodes: /usr/es/sbin/cluster/utilities/clpasswd

smitty cl_usergroup → Users in an HACMP cluster

  • Add a user to the cluster
  • List users in the cluster
  • Change/show characteristics of a user in the cluster
  • Remove a user from the cluster

smitty cl_usergroup → Groups in an HACMP cluster

  • Add a group to the cluster
  • List groups to the cluster
  • Change a group in the cluster
  • Remove a group

smitty cl_usergroup → Passwords in an HACMP cluster

Importing VGs automatically: smitty hacmp → Extended configuration → HACMP extended resource configuration → Change/show resources and attributes for a resource group → set Automatically import volume groups to true

C-SPOC LVM: smitty cl_admin → HACMP Logical Volume Management

  • Shared Volume groups
  • Shared Logical volumes
  • Shared File systems
  • Synchronize shared LVM mirrors (Synchronize by VG/Synchronize by LV)
  • Synchronize a shared VG definition

C-SPOC concurrent LVM: smitty cl_admin → HACMP concurrent LVM

  • Concurrent volume groups
  • Concurrent Logical volumes
  • Synchronize concurrent LVM mirrors

C-SPOC Physical volume management: smitty cl_admin → HACMP physical volume management

  • Add a disk to the cluster
  • Remove a disk from the cluster
  • Cluster disk replacement
  • Cluster datapath device management

Cluster Verification: smitty hacmp → Extended verification → Extended verification and synchronization. Verification log files are stored in /var/hacmp/clverify.

/var/hacmp/clverify/clverify.log → verification log

/var/hacmp/clverify/pass/nodename → if verification succeeds

/var/hacmp/clverify/fail/nodename → if verification fails

Automatic cluster verification: Each time you start cluster services and every 24 hours.

Configure automatic cluster verification: smitty hacmp → Problem determination tools → HACMP verification → Automatic cluster configuration monitoring.

Cluster status monitoring: /usr/es/sbin/cluster/clstat –a and –o.

/usr/es/sbin/cluster/utilities/cldump → provides a snapshot of the key cluster status components.

clshowsrv: displays the status of the HACMP subsystems.

Disk Heartbeat:

  • It’s a non-IP heartbeat
  • It’s use dedicated disk/LUN
  • It’s a point to point network
  • If more than 2 nodes exist in your cluster, you will need a minimum of n non-IP heartbeat networks for n nodes (they are point-to-point, so they are usually arranged in a ring).
  • Disk heartbeating will typically require 4 seeks/second; that is, each of the two nodes will write to the disk and read from the disk once per second. The filemon tool can monitor the seeks (see the sketch after this list).
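
To observe that disk I/O with filemon (a sketch; the output file and hdisk name are illustrative, and filemon keeps tracing until trcstop is run):

  filemon -o /tmp/fmon.out -O pv    # collect physical-volume level I/O statistics
  sleep 60; trcstop                 # stop the trace and write the report
  grep -p hdisk8 /tmp/fmon.out      # check seeks/second on the heartbeat disk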

Configuring disk heartbeat:

  • Vpaths are configured as member disks of an enhanced concurrent volume group: smitty lvm → Volume groups → Add a volume group → give the VG name, PV names, VG major number, and set Create VG concurrent capable to enhanced concurrent.
  • Import the new VG on all nodes using smitty importvg or importvg –V 53 –y c23vg vpath5
  • Create the diskhb network → smitty hacmp → extended configuration → extended topology configuration → configure HACMP networks → Add a network to the HACMP cluster → choose diskhb
  • Add 2 communication devices → smitty hacmp → extended configuration → extended topology configuration → Configure HACMP communication interfaces/devices → Add communication interfaces/devices → Add pre-defined communication interfaces and devices → communication devices → choose the diskhb
  • Create one communication device for other node also

Testing disk heartbeat connectivity: /usr/sbin/rsct/dhb_read is used to test the validity of a diskhb connection.

dhb_read –p vpath0 –r receives data over the diskhb network

dhb_read –p vpath3 –t transmits data over the diskhb network.

Monitoring disk heartbeat: monitor the activity of the disk heartbeats via lssrc –ls topsvcs; watch the Missed HBs field.

Configure HACMP Application Monitoring: smitty cm_cfg_appmon → Add a process application monitor → give the process names and application startup/stop scripts

Application availability analysis tool: smitty hacmp → System management → Resource group and application management → Application availability analysis

Commands:

List the cluster topology : /usr/es/sbin/cluster/utilities/cllsif

/usr/es/sbin/cluster/clstat

Start cluster : smitty clstart. Monitor with /tmp/hacmp.out and check for node_up_complete.

Stop the cluster : smitty cl_stop → monitor with /tmp/hacmp.out and check for node_down_complete.

Determine the state of cluster: /usr/es/sbin/cluster/utilities/clcheck_server

Display the status of HACMP subsystems: clshowsrv –v/-a

Display the topology information: cltopinfo –c/-n/-w/-i

Monitor the heartbeat activity: lssrc –ls topsvcs [ check for dropped, errors]

Display resource group attributes: clrginfo –v, -p, -t, -c, -a OR clfindres

Difference between AIX, VIO, HMC Versions

Differences

Enhancement in 6.1

  • In AIX 6.1 ability to create snapshots within the source FS. It’s 2 types: External and Internal. Create snapshot using snapshot –o command. Snapshots will be stored in /fsmountpoint/.snapshot/snap01
  • Encrypted FS: It provides more protection for sensitive data. Commands efsmgr, efskeymgr
  • WPAR technology
  • Hardware performance monitors: PM and HPM
  • NIMSH(service handler) was introduced.
  • /admin/tmp : On this directory privileged processes can securely create temporary files.
  • AIX graphical installer

Enhancement in AIX 5.3

  • Scalable VG introduced
  • It supports P5 HW features(Micro Partitioning, Vscsi, Virtual Ethernet, Shared Ethernet, SEA, IVM)
  • New performance commands like lparstat, mpstat, topas enhanced for micro partitions
  • Concurrent IO implemented
  • NFSv4 (replication features; DIO and CIO are supported)
  • Login license limit has increased to 32767
  • ps –f, find, alt_disk_mksysb, alt_rootvg_op, alt_disk_copy
  • chcore, lscore

Enhancement in HACMP 5.4

  • web based GUI
  • NFSv4 support environments
  • New GLVM monitoring

Enhancement in HACMP 5.3

  • Additional resource and resource group management features(cluster wide RG location dependencies, Distribution preference for IP service aliases
  • Clone a cluster from existing live configuration
  • OEM VG’s and FS has been added to VFS and VVm
  • Performance and usability has been improved
  • New smart assist features

Enhancement in VIO 2.1

  • N_Port ID Virtualization ( Simplifies FC San LUN administration, Enables access to other SAN devices like tape)
  • Virtual Tape (it simplifies backup & restore with shared devices)
  • Dynamic heterogeneous MPIO
  • Partition mobility between HMC
  • Active memory sharing( like shared cpu but for memory)
  • PowerVM Lx86 1.3( New higher performance)

Enhancement in HMC 7

  • It supports new power6 technology features
  • Websm is no longer required. Standard web browser is enough

AIX Commands

To display if the kernel is 32-bit enabled or 64-bit enabled: bootinfo –k

How do I know if I am running a uniprocessor kernel or a multiprocessor kernel: ls –l /unix

To find which /dev/hdiskxx contains the boot logical volume /dev/hd5 : lslv –m hd5

How would I know if my machine is capable of running AIX 5L Version 5.3: AIX 5L Version 5.3 runs on all currently supported CHRP (Common Hardware Reference Platform)-based POWER hardware.

How would I know if my machine is CHRP-based: Run the prtconf command. If it's a CHRP machine, the string chrp appears on the Model Architecture line.
To display if the hardware is 32-bit or 64-bit, type: bootinfo –y

How much real memory does my machine have: bootinfo –r, lsattr –El sys0 –a realmem

To display the number of processors on your system: lscfg |grep proc

Detailed configuration of my system: lscfg –p(platform specific device information) –v(VPD)

Displays the chip type of the system. For example, PowerPC: uname –p

Displays the release number of the operating system: uname –r

Displays the system name. For example, AIX: uname –s

Displays the name of the node: uname –n

Displays the system name, nodename, version, machine ID.: uname –a

Displays the system model name. For example, IBM, 9114-275: uname –M

Displays the operating system version.: uname –v

Displays the machine ID number of the hardware running the system: uname –m

Displays the system ID number: uname –u

What version, release, and maintenance level of AIX is running on my system: oslevel –r

To determine which fileset updates are missing from 5300-04, for example, run the following command:
oslevel –rl 5300-04

What SP (Service Pack) is installed on my system? oslevel –s

information about installed filesets on my system: lslpp –l

To show bos.acct contains /usr/bin/vmstat: lslpp –w /usr/bin/vmstat or which_fileset vmstat

To show which filesets need to be installed or corrected: lppchk –v

How do I get a dump of the header of the loader section and the symbol entries in symbolic representation: dump –Htv

To find out whether a hard drive is bootable: ipl_varyon -i

How do I replace a disk?
1. #extendvg VolumeGroupName hdisk_new

2. #migratepv hdisk_bad hdisk_new

3. #reducevg -d VolumeGroupName hdisk_bad

How can I clone (make a copy of ) the rootvg: alt_disk_copy -d hdisk1

How do I identify the network interfaces on my server: lsdev –Cc if

To get information about one specific network interface: ifconfig tr0

AIX Short Notes

AIX

LVM:

VG: One or more PVs can make up a VG.

Within each volume group one or more logical volumes can be defined.

VGDA(Volume group descriptor area) is an area on the disk that contains information pertinent to the vg that the PV belongs to. It also includes information about the properties and status of all physical and logical volumes that are part of the vg.

VGSA(Volume group status area) is used to describe the state of all PPs from all physical volumes within a volume group. VGSA indicates if a physical partition contains accurate or stale information.

LVCB(Logical volume control block) contains important information about the logical volume, such as the no. of logical partitions or disk allocation policy.

VG type     Max PVs   Max LVs   Max PPs/VG   Max PP size

Normal      32        256       32512        1G

Big         128       512       130048       1G

Scalable    1024      4096      2097152      128G

PVIDs stored in ODM.

Creating PVID : chdev –l hdisk3 –a pv=yes

Clear the PVID : chdev –l hdisk3 –a pv=clear.

Display the allocation PP’s to LV’s : lspv –p hdisk0

Display the layout of a PV: lspv –M hdisk0

Disabling partition allocation for a physical volume : chpv –an hdisk2 : Allocatable=no

Enabling partition allocation for a physical volume : chpv –ay hdisk2 : Allocatable = yes

Change the disk to unavailable : chpv –vr hdisk2 : PV state = removed

Change the disk to available : chpv –va hdisk2 : PV state = active

Clean the boot record : chpv –c hdisk1

To define hdsik3 as a hotspare : chpv –hy hdisk3

To remove hdisk3 as a hotspare : chpv –hn hdisk3

Migrating two disks : migratepv hdisk1 hdisk2

Migrate only PPS that belongs to particular LV : migratepv –l testlv hdisk1 hdisk5

Move data from one partition located on a physical disk to another physical partition on a different disk: migratelp testlv/1/2 hdisk5/123

Logical track group (LTG) size is the maximum allowed transfer size for a disk I/O operation. Display it with lquerypv –M hdisk0

VOLUME GROUPS

For each VG, two device driver files are created under /dev.

Creating VG : mkvg –y vg1 –s64 –v99 hdisk4

Creating the Big VG : mkvg –B –y vg1 –s 128 –f –n –V 101 hdisk2

Creating a scalable VG: mkvg –S –y vg1 –s 128 –f hdisk3 hdisk4 hdisk5

Adding disks that require more than 1016 PPs/PV using chvg –t 2 VG1

Information about a VG read from a VGDA located on a disk: lsvg –n VG1

Change the auto vary on flag for VG : chvg –ay newvg

Change the auto vary off flag for VG: chvg –an newvg

Quorum ensures data integrity in the event of disk failure. A quorum is a state in which 51 percent or more of the PVs in a VG accessible. When quorum is lost, the VG varies itself off.

Turn off the quorum : chvg –Qn testvg

Turn on the quorum : chvg –Qy testvg

To change the maximum no of PPs per PV : chvg –t 16 testvg.

To change the Normal VG to scalable vg : 1. Varyoffvg ttt 2. chvg –G ttt 3. varyonvg ttt

Change the LTG size : chvg –L 128 testvg → VGs are created with a variable logical track group size.

Hot Spare: All PPs on the hot-spare physical volume should be free. A PP located on a failing disk will be copied from its mirror copy to one or more disks from the hot spare pool.

Designate hdisk4 as hot spare: chpv –hy hdisk4

Migrate data from a failing disk to a spare disk: chvg –hy vgname

Change synchronization policy : chvg –sy testvg; synchronization policy controls automatic synchronization of stale partitions within the VG.

Change the maximum no. of pps within a VG: chvg –P 2048 testvg

Change maximum no.of LVs/VG : chvg –v 4096 testvg.

How to remove the VG lock : chvg –u

Extending a volume group : extendvg testvg hdisk3; If PVID is available use extendvg –f testvg hdisk3

Reducing the disk from vg : reducevg testvg hdisk3

Synchronize the ODM information : synclvodm testvg

To move the data from one system to another use the exportvg command. The exportvg command only removes VG definition from the ODM and does not delete any data from physical disk. : exportvg testvg

Importvg : Recreates the reference to the VG data and makes that data available. This command reads the VGDA of one of the PVs that are part of the VG. It uses redefinevg to find all other disks that belong to the VG. It adds the corresponding entries into the ODM database and updates /etc/filesystems with the new values. importvg –y testvg hdisk7

  • Server A: lsvg –l app1vg
  • Server A: umount /app1
  • Server A: Varyoffvg app1vg
  • Server B: lspv|grep app1vg
  • Server B: exportvg app1vg
  • Server B: importvg –y app1vg –n –V 90 vpath0
  • Chvg –a n app1vg
  • Varyoffvg app1vg

Varying on a volume group : varyonvg testvg

Varying off a volume group : varyoffvg testvg

Reorganizing a volume group : This command is used to reorganize physical partitions within a VG. The PPs will be rearranged on the disks according to the intra-physical and inter-physical volume allocation policies. reorgvg testvg.

Synchronize the VG : syncvg –v testvg ; syncvg –p hdisk4 hdisk5

Mirroring a volume group : lsvg –p rootvg; extendvg rootvg hdisk1; mirrorvg rootvg; bosboot –ad /dev/hdisk1; bootlist –m normal hdisk0 hdisk1

Splitting a volume group : splitvg –y newvg –c 1 testvg

Rejoin the two copies : joinvg testvg

Logical Volumes:

Create LV : mklv –y lv3 –t jfs2 –a im testvg 10 hdisk5

Remove LV : umount /fs1, rmlv lv1

Delete all data belonging to logical volume lv1 on physical volume hdisk7: rmlv –p hdisk7 lv1

Display the no. of logical partitions and their corresponding physical partitions: lslv –m lv1

Display information about logical volume testlv read from VGDA located on hdisk6: lslv –n hdisk6 testlv

Display the LVCB : getlvcb –AT lv1

Increasing the size of LV : extendlv –a ie –ex lv1 3 hdisk5 hdisk6

Copying a LV : cplv –v dumpvg –y lv8 lv1

Creating copies of LV : mklvcopy –k lv1 3 hdisk7 &

Splitting a LV : umount /fs1; splitlvcopy –y copylv testlv 2

Removing a copy of LV : rmlvcopy testlv 2 hdisk6

Changing maximum no.of logical partitions to 1000: chlv –x 1000 lv1

Installation :

New and complete overwrite installation : For a new machine, to overwrite an existing installation, or to reassign your hard disks

Migration: upgrade AIX versions from 5.2 to 5.3. This method preserves most file systems, including root volume group.

Preservation installation : If you want to preserve the user data, use /etc/preserve.list. This installation overwrites the /usr, /tmp, /var and / file systems by default. The /etc/filesystems file is listed in /etc/preserve.list by default.

TCB:

  • To check the tcb is installed or not: /usr/bin/tcbck.
  • By installing a system with the TCB option, you enable the trusted path, trusted shell, trusted processes and system integrity checking.
  • Every device is part of the TCB and every file in the /dev directory is monitored by the TCB.
  • Critical information about many files is stored in the /etc/security/sysck.cfg file
  • You can enable TCB only at installation time

Installation steps : Through the HMC → Activate → override the boot mode to SMS.

Without an HMC → after POST → hear the 2 beeps → press 1.

Insert AIX 5L CD1 → select Boot options (No. 5) → Select install/boot device (option 1) → select CD/DVD → select SCSI → select normal boot → exit SMS → the system boots from the media → choose the language → Change/show installation settings → New and complete overwrite → select the hard disk → install options → Enter to confirm → after installation the system reboots automatically

Erase a hard disk → using the diag command

Alternate Disk Installation:

  • Cloning the current running rootvg to an alternate disk
  • Installing a mksysb image on another disk.

alt_disk_copy: Creates copies of rootvg on an alternate set of disks.

alt_disk_mksysb: Installs an existing mksysb on an alternate set of disks.

alt_rootvg_op: Performs wake, sleep and customize operations.

Alternate mksysb installation: smitty alt_mksysb

Alternate rootvg cloning: smitty alt_clone.

Cloning AIX :

  • To have an online backup, in case of a disk crash.
  • When applying new maintenance levels, a copy of the rootvg is made to an alternate disk, then updates are applied to that copy

To view the BOS installation logs : cd /var/adm/ras → cat devinst.log, or alog –o –f bosinstlog, or smit alog_show

Installation Packages:

Fileset : A fileset is smallest installable unit. Ex: bos.net.uucp

Package : A group of installable filesets Ex: bos.net

Licensed program products : A complete s/w product Ex: BOS

Bundle : A bundle is a list of software that contain filesets, packages and LPPs. Install the software bundle using smitty update_all.

PTF:Program temporary fix. It’s an updated fileset or a new fileset that fixes a previous system problem. PTF’s installed through installp.

APAR: Authorised program analysis report. APAR’s applied to the system through instfix.

Fileset revision level identification : version:release:modification:fixlevel

The filesets that are below level 4.1.2.0, type: oslevel –l 4.1.2.0

The filesets at levels later than the current maintenance level, type: oslevel -g

To list all known recommended maintenance levels on the system, type: oslevel –rq

oslevel –s for the SP level

Current maintenance level: oslevel -r

Installing S/W: Applied and commited

Applied: In applied state the previous version is stored in /usr/lpp/packagename.

Committed : The saved previous version is removed; a committed update cannot be rejected.

To install filesets within the bos.net software package in /usr/sys/inst.images directory in the applied state: installp –avx –d /usr/sys/inst.images bos.net

Install S/W in commited state: installp –acpX –d/usr/sys/inst.images bos.net

Record of the installp output stored in /var/adm/sw/installp.summary

Commit all updates: installp –cgX all

List all installable S/W : installp –L –d /dev/cd0

Cleaning up after failed installation : installp –C

Removing installed software: installp –ugp

Software Installation: smitty install_latest

Committing applied updates: smitty install_commit

Rejecting applied updates: smitty install_reject

Removing installed software: smitty install_remove

To find what maintenance level your filesets are currently on : lslpp –l

To list the individual files that are installed with a particular fileset : lslpp –f bos.net

To list the installation and update history of filesets : lslpp –h

To list fixes that are on a CDROM in /dev/cd0 – instfix –T –d /dev/cd0

To determine if APAR is installed or not : instfix –iK IY737478

To list what maintenance levels installed : instfix –i |grep ML

To install APAR : instfix –K IY75645 –d /dev/cd0

Installing individual fix by APAR: smitty update_by_fix

To install new fixes available from IBM : smitty update_all

Verifying the integrity of OS : lppchk –v

Creating installation images on disk: smitty bffcreate

Verify whether the software installed on your system is in a consistent state: lppchk

To install RPM packages use geninstall → geninstall –d Media all

Uninstall software: geninstall –u –f file

List installable software on device: geninstall –L –d media.

AIX Boot Process:

  1. When the server is Powered on Power on self test(POST) is run and checks the hardware
  2. On successful completion on POST Boot logical volume is searched by seeing the bootlist
  3. The AIX boot logical contains AIX kernel, rc.boot, reduced ODM & BOOT commands. AIX kernel is loaded in the RAM.
  4. Kernel takes control and creates a RAM file system.
  5. Kernel starts /etc/init from the RAM file system
  6. init runs the rc.boot 1 ( rc.boot phase one) which configures the base devices.
  7. rc.boot1 calls restbase command which copies the ODM files from Boot Logical Volume to RAM file system
  8. rc.boot1 calls cfgmgr –f command to configure the base devices
  9. rc.boot1 calls bootinfo –b command to determine the last boot device
  10. Then init starts rc.boot2 which activates rootvg
  11. rc.boot2 calls ipl_varyon command to activate rootvg
  12. rc.boot2 runs fsck –f /dev/hd4 and mount the partition on / of RAM file system
  13. rc.boot2 runs fsck –f /dev/hd2 and mounts /usr file system
  14. rc.boot2 runs fsck –f /dev/hd9var and mount /var file system and runs copy core command to copy the core dump if available from /dev/hd6 to /var/adm/ras/vmcore.0 file. And unmounts /var file system
  15. rc.boot2 runs swapon /dev/hd6 and activates paging space
  16. rc.boot2 runs migratedev and copies the device files from RAM file system to /file system
  17. rc.boot2 runs cp /../etc/objrepos/Cu* /etc/objrepos and copies the ODM files from RAM file system to / filesystem
  18. rc.boot2 runs mount /dev/hd9var and mounts /var filesystem
  19. rc.boot2 copies the boot log messages to alog
  20. rc.boot2 removes the RAM file system
  21. Kernel starts /etc/init process from / file system
  22. The /etc/init points /etc/inittab file and rc.boot3 is started. Rc.boot3 configures rest of the devices
  23. rc.boot3 runs fsck –f /dev/hd3 and mount /tmp file system
  24. rc.boot3 runs syncvg rootvg &
  25. rc.boot3 runs cfgmgr –p2 or cfgmgr –p3 to configure rest of the devices. Cfgmgr –p2 is used when the physical key on MCA architecture is on normal mode and cfgmgr –p3 is used when the physical key on MCA architecture is on service mode.
  26. rc.boot3 runs cfgcon command to configure the console
  27. rc.boot3 runs savebase command to copy the ODM files from /dev/hd4 to /dev/hd5
  28. rc.boot3 starts syncd 60 & errordaemon
  29. rc.boot3 turn off LED’s
  30. rc.boot3 removes /etc/nologin file
  31. rc.boot3 checks the CuDv for chgstatus=3 and displays the missing devices on the console
  32. The next line of /etc/inittab is executed

/etc/inittab file format: identifier:runlevel:action:command

mkitab → adds records to the /etc/inittab file

lsitab → lists records in the /etc/inittab file

chitab → changes records in the /etc/inittab file

rmitab → removes records from the /etc/inittab file

To display a boot list: bootlist –m normal –o

To change a boot list: bootlist –m normal cd0 hdisk0

Troubleshooting on boot process:

Accessing a system that will not boot: Press F5 on a PCI based system to boot from the tape/CDROM → insert volume 1 of the installation media → select the maintenance mode for system recovery → Access a root volume group → select the volume group.

Damaged boot image: Access a system that will not boot → check the / and /tmp file system sizes → determine the boot disk using lslv –m hd5 → recreate the boot image using bosboot –a –d /dev/hdiskn → check for CHECKSTOP errors in the error log (if such errors are found the hardware is probably failing) → shut down and restart the system

Corrupted file system, corrupted jfs log: Access a system that will not boot → run fsck on all file systems → format the jfs log using /usr/sbin/logform /dev/hd8 → recreate the boot image using bosboot –a –d /dev/hdiskn

Super block corrupted: If fsck indicates that block 8 is corrupted, the super block for the file system is corrupted and needs to be repaired (dd count=1 bs=4k skip=31 seek=1 if=/dev/hdn of=/dev/hdn) → rebuild the jfslog using /usr/sbin/logform /dev/hd8 → mount the root and usr file systems (mount /dev/hd4 /mnt, mount /usr) → copy the system configuration to a backup directory (cp /mnt/etc/objrepos* /mnt/etc/objrepos/backup) → copy the configuration from the RAM fs (cp /etc/objrepos/Cu* /mnt/etc/objrepos) → unmount all file systems → save the clean ODM to the BLV using savebase –d /dev/hdisk → reboot

Corrupted /etc/inittab file: check for an empty or missing inittab file. Check for problems with /etc/environment, /bin/sh, /bin/bsh, /etc/fsck, /etc/profile → reboot

Runlevel → a selected group of processes. 2 is multi-user and the default runlevel. S, s, M, m are for maintenance mode

Identifying the current run level → cat /etc/.init.state

Displaying history of previous run levels: /usr/lib/acct/fwtmp < /var/adm/wtmp |grep run-level

Changing system run levels: telinit M

Run level scripts allow users to start and stop selected applications while changing the run level. Scripts beginning with K are stop scripts and scripts beginning with S are start scripts.

Go to maintenance mode by using shutdown -m

Rc.boot file: The /sbin/rc.boot file is a shell script that is called by init. The rc.boot file configures devices, boots from disk, varies on the root volume group, enables file systems and calls the BOS installation programs.

/etc/rc file: It performs normal startup initialization. It varies on all VGs, activates all paging spaces (swapon –a), configures all dump devices (sysdumpdev –q), performs file system checks (fsck –fp) and mounts all file systems that are flagged to be mounted at boot.

/etc/rc.net: It contains network configuration information.

/etc/rc.tcpip: It starts all network related daemons (inetd, gated, routed, timed, rwhod)

Backups:

MKSYSB : Creates a bootable image of all mounted file systems on the rootvg. This command is used to restore a system to its original state.

Tape Format : BOS boot image(kernel device drivers), BOS install image(tapeblksz, image.data, bosinst.data), dummy table of contents, rootvg backup

Exclude file systems using mksysb –ie /dev/rmt0

cat /etc/exclude.rootvg
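
For example (a sketch; the excluded path is illustrative; /etc/exclude.rootvg takes grep-style patterns):

  echo '^/tmp/' >> /etc/exclude.rootvg   # exclude the contents of /tmp from the backup
  mksysb -ie /dev/rmt0                   # -e honours /etc/exclude.rootvg, -i regenerates image.data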

List content of MKSYSB image smitty lsmksysb

Restore a mksysb image : smitty restmksysb

Savevg command finds and backs up all files belonging to the specified volume group. Ex: savevg –ivf /dev/rmt0 uservg.

Restvg command restores the user volume group

Backup command backs up all files and file systems. Restore command extracts files from archives created with the backup command.

Verify the content of a backup media → tcopy /dev/rmt0

Daily Management :

/etc/security/environ : Contains the environment attributes for a user.

/etc/security/lastlog : An ASCII file that contains last-login attributes (time of the last unsuccessful login, unsuccessful login count, time of the last login)

/etc/security/limits : It specify the process resource limits for each user

/etc/security/user : Contains the extended attributes of users

/usr/lib/security/mkuser.default : It contains the default attributes for a new user.

The /etc/utmp file contains a record of users logged into the system. Command : who –a

/var/adm/wtmp file contains connect-time accounting records

/etc/security/failedlogin contains record of unsuccessful login attempts.

/etc/environment contains variables specifying the basic environment for all processes.

/etc/profile file is first file that the OS uses at login time.

To enable user smith to access this system remotely : chuser rlogin=true smith

Remove the user rmuser smith

Remove the user with remove the authentication information rmuser –p smith

Display the current run level : who –r

How to display the active processes : who –p

Changing the current shell : chsh

Change the prompt : export PS1=”Ready.”

To list all the 64-bit processes : ps –M

To change the priority of a process : nice and renice

SUID –set user id – This attribute sets the effective and saved user ids of the process to the owner id of the file on execution

SGID – set group id -- This attribute sets the effective and saved group ids of the process to the group id of the file on execution

CRON daemon runs shell commands at specified dates and times.

AT command to submit commands that are to be run only once.

System Planning:

RAID: Redundant array of independent disks.

RAID 0: Striping. Data is split into blocks of equal size and stored on different disks.

RAID 1: Mirroring. Duplicate copies are kept on separate physical disks.

RAID 5: Striping with Parity. Data is split into blocks of equal size. Additional data block containing parity information.

RAID 10: It is a combination of mirroring and striping.

AIX 5.3 requires at least 2.2 GB of physical space.

Configuration:

ODM: ODM is a repository in which the OS keeps information about your system, such as devices, software, TCP/IP configuration.

Basic Components of ODM: object classes, objects, descriptors

ODM directories: /usr/lib/objrepos, /usr/share/lib/objrepos, /etc/objrepos

Following steps for NFS implementation:

· NFS daemons should be running on both server and client

· The file systems that need to be remotely available will have to be exported(smitty mknfsexp, exportfs –a , showmount –e myserver)

· The exported file system need to be mounted on the remote systems

NFS services: /usr/sbin/rpc.mountd, /usr/sbin/nfsd, /usr/sbin/biod,rpc.statd, rpc.lockd

Changing an exported file system: smitty chnfsexp

TCP/IP daemons: inetd, gated, routed, named

Configuration:

ODM: ODM(Object data manager) is a repository in which the OS keeps information regarding your system, such as devices, software or TCP/IP information.

ODM information is stored in /usr/lib/objrepos, /usr/share/lib/objrepos, /etc/objrepos.

ODM commands: odmadd, odmchange, odmcreate, odmshow, odmdelete, odmdrop, odmget,

To start smit in graphical mode: smit –m

Creating alias: alias rm=/usr/sbin/linux/rm

export PATH=/usr/linux/bin:$PATH; print $PATH

Network File System:

Daemons: Server side(/usr/sbin/rpc.mountd, /usr/sbin/nfsd, portmap, rpc.statd, rpc.lockd) Client side ( /usr/sbin/biod)

Start the NFS daemons using mknfs –N. To start all NFS daemons use startsrc –g nfs.

Exporting nfs directories:

  • Verify nfs is running or not using lssrc –g nfs
  • Smitty mknfsexp
  • Specify path name, set the mode(rw,ro). It updates /etc/exports file.
  • /usr/sbin/exportfs –a → sends all the information in /etc/exports to the kernel.
  • Verify that all file systems are exported using showmount –e Myserver (a command-line sketch follows this list)
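
A command-line version of the same flow (a sketch; the directory and server names are illustrative):

  echo "/export/data" >> /etc/exports    # export with default options (rw)
  exportfs -a                            # push everything in /etc/exports to the kernel
  showmount -e nfsserver                 # verify the export list
  # on the client:
  mount nfsserver:/export/data /mnt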

Exporting an nfs directory temporarily using exportfs –i /dirname

Un exporting an nfs directory using smitty rmnfsexp

Establishing NFS mounts using smitty mknfsmnt

Changing an exported file system using smitty chnfsexp

Network configuration:

Stopping TCP IP daemons using /etc/tcp.clean script.

/etc/services file contains information about the known services

Add network routes using smitty mkroute or route add –net 192.168.1 –netmask 255.255.255.0

Traceroute command shows the route taken

Changing IP address smitty mktcpip

Identifying network interfaces : lsdev –Cc if

Activating network interface: ifconfig interface address netmask up

Deactivating network interface: ifconfig tr0 down

Deleting an address: ifconfig tr0 delete

Detaching network interface: ifconfig tr0 detach

Creating an IP alias: ifconfig interface address netmask alias
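
For instance (a sketch; the interface and addresses are illustrative):

  ifconfig en0 10.10.10.5 netmask 255.255.255.0 alias   # add an alias to en0
  ifconfig en0 10.10.10.5 delete                        # remove the alias again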

To determine MTU size of a network interface using lsattr –El interface.

Paging Space: A page is a unit of virtual memory that holds 4 KB of data.

Increasing paging space: chps –s 3 hd6 (adds 3 LPs)

Reducing paging space: chps –d 1 hd6

Moving a paging space within the VG: migratepv –l hd6 hdisk0 hdisk1

Removing a paging space: swapoff /dev/paging03; rmps paging03

Device configuration:

lscfg → details about devices, e.g. lscfg –vpl rmt0

To show more about a particular processor: lsattr –El proc0

To discover how much memory is installed: lsattr –El sys0 | grep realmem.

To show processor details: lscfg |grep proc or lsdev –Cc processor

To show available processors: bindprocessor –q

To turn on SMT using smtctl –m on –w boot

To turn off SMT : smtctl –m off –w now

Modifying an existing device configuration using chdev. The device can be in defined,stopped,available state.

To change maxuproc value: chdev –l sys0 –a maxuproc=100

Remove a device configuration: rmdev –Rdl rmt0

The bootinfo –y command → returns 32-bit or 64-bit.

Commands to enable the 64-bit kernel: ln –sf /usr/lib/boot/unix_64 /unix → ln –sf /usr/lib/boot/unix_64 /usr/lib/boot/unix → bosboot –ad /dev/ipldevice → shutdown –r → ls –al /unix

File Systems:

Types: Journaled, Enhanced journaled, CDROM, NFS

FS Structure: Super block, allocation groups, inodes, blocks, fragments, and device logs

Super block: It contains control information about the file system, such as the overall file system size in 512-byte blocks, FS name, FS log device, version number, number of inodes, list of free inodes, list of free data blocks, date and time of creation, and FS state.

This data is stored in the first block of the file system and a backup copy is kept in block 31.

Allocation group: It consists of inodes and the corresponding data blocks.

Inodes: An inode contains control information about the file, such as type, size, owner, and the date and time when the file was created, modified and last accessed. It contains pointers to the data blocks that store the actual data. For JFS the maximum number of inodes and files is determined by the number of bytes per inode (NBPI). For JFS 16MB inode. For JFS2 there is no NBPI.

Data Blocks: actual data. The default value is 4KB.

Device logs: JFS log stores transactional information. This data can be used to roll back incomplete operations if the machine crashes. Rootvg use LV hd8 as a common log.

FS differences:

Function        JFS      JFS2

Max FS size     1TB      4PB

Max file size   64G      4PB

No. of inodes   Fixed    Dynamic

Inode size      128B     512B

Fragment size   512      512

Block size      4KB      4KB

Creating an FS: crfs –v jfs2 –g testvg –a size=10M –m /fs1

Display mounted FS: mount

Display characteristics of FS: lsfs

Initialize log device: logform /dev/loglv01

Display information about inodes: istat /etc/passwd

Monitoring and Performance Tuning:

The quotaon command enables disk quotas for one or more file systems

The quotaoff command disables disk quotas for one or more file systems

Enable user quotas on /home: chfs –a "quota=userquota,groupquota" /home

To check the consistency of the quota files use quotacheck

The edquota command creates each user's or group's soft and hard limits for allowable disk space and maximum number of files
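
Putting those together (a sketch; the file system and user name are illustrative):

  chfs -a "quota=userquota,groupquota" /home   # enable quota support on /home
  quotacheck /home                             # build/verify the quota files
  edquota smith                                # set soft/hard limits for user smith
  quotaon /home                                # turn quotas on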

Error logging is automatically started by the rc.boot script

Errstop command stops the error logging

The daemon for errlog is errdemon

The path to your system’s error log file: /usr/lib/errdemon –l

Change the maximum size of the error log: errdemon –s 2000000

Display all the errors which have an specific error id: errpt –j 8527F6F4

Display all the errors logged in a specific time: errpt –s 1122164405 –e 1123100405

To delete all the entries: errclear 0

Delete all the entries classified as software errors: errclear –d s 0

VMSTAT: It reports kernel threads, virtual memory, disks, traps and cpu activity.

To display 5 summaries at 1 second intervals use vmstat 1 5

Kthr (kernel thread state): r → average number of runnable kernel threads. b → average number of kernel threads placed in the VMM wait queue

Memory (usage of virtual and real memory): avm → active virtual pages, the total number of pages allocated in page space. A high value is not an indicator of poor performance. fre → size of the free list. A large portion of real memory is utilized as a cache for file system data.

Page (information about page faults and page activity): re → pager input/output list, pi → pages paged in from paging space, po → pages paged out to paging space, fr → pages freed, sr → pages scanned by the page replacement algorithm, cy → clock cycles used by the page replacement algorithm

Faults (trap and interrupt rate averages per second): in → device interrupts, sy → system calls, cs → kernel thread context switches

CPU (breakdown of percentage usage of CPU time): us → user time, sy → system time, id → cpu idle time, wa → waiting for request, pc → number of physical processors consumed, ec → the percentage of entitled capacity consumed.

Disks(provides number of transfers per second)

SAR: sar 2 5(%usr, %sys, %wio, %idle, physc)

To report activity for the first 2 processors for each second for next 5 times: sar –u –P 0,1 1 5

Topas:

Tuning Parameters:

/etc/tunables directory centralizes the tunable files.

nextboot: this file is automatically applied at boot time.

lastboot: it contains the tunable parameters with their values after the last boot.

lastboot.log: it contains logging of the creation of the lastboot file.

Saturday, May 15, 2010

Shared Ethernet Adapter Redundancy

Shared Ethernet adapter: It can be used to connect a physical network to a virtual Ethernet network. It allows several client partitions to share one physical adapter.

Shared Ethernet Redundancy: This protects against temporary failure of communication with external networks. Approaches to achieve continuous availability:

  • Shared Ethernet adapter failover
  • Network interface backup

Shared Ethernet adapter failover: It offers Ethernet redundancy. In a SEA failover configuration 2 VIO servers have the bridging functionality of the SEA. They use a control channel to determine which of them is supplying the Ethernet service to the client. The client partition gets one virtual Ethernet adapter bridged by 2 VIO servers.

Requirements for configuring SEA failover:

  • One SEA on one VIOs acts as the primary adapter and the second SEA on the second VIOs acts as a backup adapter.
  • Each SEA must have at least one virtual Ethernet adapter with the "access external network" flag (trunk flag) checked. This enables the SEA to provide bridging functionality between the 2 VIO servers.
  • This adapter on both the SEAs has the same PVID.
  • The priority value defines which of the 2 SEAs will be the primary and which the secondary. An adapter with priority 1 has the highest priority.

Procedure for configuring SEA failover (a consolidated command sketch follows this list):

  • Configure a virtual Ethernet adapter via DLPAR. (ent2)
    • Select the VIOS → click the Tasks button → choose DLPAR → Virtual adapters
    • Click Actions → Create → Ethernet adapter
    • Enter Slot number for the virtual Ethernet adapter into adapter ID
    • Enter the Port virtual Lan ID(PVID). The PVID allows the virtual Ethernet adapter to communicate with other virtual Ethernet adapters that have the same PVID.
    • Select IEEE 802.1
    • Check the box “access external network”
    • Give the virtual adapter a low trunk priority
    • Click OK.
  • Create another virtual adapter to be used as a control channel on VIOS1 (give it another VLAN ID, and do not check the box "access external network") (ent3)
  • Create the SEA on VIOS1 with the failover attribute: mkvdev –sea ent0 –vadapter ent2 –default ent2 –defaultid 1 –attr ha_mode=auto ctl_chan=ent3 (Ex: ent4)
  • Create a VLAN Ethernet adapter on the SEA to communicate with the external VLAN-tagged network: mkvdev –vlan ent4 –tagid 222 (Ex: ent5)
  • Assign an IP address to SEA VLAN adapter on VIOS1. using mktcpip
  • Same steps to VIO2 also. ( give the higher trunk priority:2)
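
Consolidated, the VIOS-side commands look roughly like this (a sketch; the adapter names, IP address and VLAN tag follow the example above and are illustrative):

  On VIOS1 (trunk priority 1):
    mkvdev -sea ent0 -vadapter ent2 -default ent2 -defaultid 1 -attr ha_mode=auto ctl_chan=ent3   # creates ent4
    mkvdev -vlan ent4 -tagid 222                                                                  # creates ent5
    mktcpip -hostname vios1 -inetaddr 10.10.10.11 -interface en5 -netmask 255.255.255.0           # IP on the VLAN adapter
  On VIOS2, repeat the same commands with its own adapters and trunk priority 2.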