Shared data based xCAT MN HA

Yuan Bai edited this page May 9, 2018 · 10 revisions

Draft of the xCAT MN HA use case

As an xCAT user, I have two hosts, host1 and host2, that share the same data directory, and I want to configure xCAT management node (MN) HA quickly.

Prerequisites for the user:

Two hosts, host1 and host2
Shared data directory
The NIC that the virtual IP address attaches to
Virtual IP address
Netmask
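
The inputs above can be collected and sanity-checked before any HA step runs. The following is a minimal sketch, not part of xCAT: the `HAInputs` class, its field names, and the sample values are all hypothetical. It uses Python's standard `ipaddress` module to combine the virtual IP and netmask into address/prefix form, which raises `ValueError` if either value is malformed.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class HAInputs:
    """Hypothetical container for the values the user must supply."""
    primary: str      # host that starts as the active MN, e.g. host1
    standby: str      # host that starts as the standby MN, e.g. host2
    shared_dir: str   # shared data directory visible to both hosts
    nic: str          # NIC that the virtual IP address attaches to
    vip: str          # virtual IP address
    netmask: str      # netmask for the virtual IP

    def vip_cidr(self) -> str:
        """Return the virtual IP in address/prefix form."""
        # ip_interface accepts "address/netmask" and rejects invalid input.
        return ipaddress.ip_interface(f"{self.vip}/{self.netmask}").with_prefixlen


# Sample values are placeholders, not taken from a real cluster.
inputs = HAInputs(
    primary="host1",
    standby="host2",
    shared_dir="/HA-data",
    nic="eth0",
    vip="10.0.0.100",
    netmask="255.255.0.0",
)
print(inputs.vip_cidr())
```

The address/prefix form is what a command such as `ip addr add` expects, so validating it once up front avoids failing halfway through an activation.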

1. Set up two xCAT MN nodes

There are 2 scenarios:

Scenario 1: xCAT is not installed on the two xCAT MN nodes

  1. Install xCAT on both nodes
  2. Switch DB to required type as needed
  3. Configure host1 as the active (primary) xCAT MN and host2 as the standby xCAT MN. On both nodes, configure all related services not to start on reboot

Scenario 2: xCAT is already installed on the xCAT MN nodes

  1. Configure host1 as the active (primary) xCAT MN and host2 as the standby xCAT MN. On both nodes, configure all related services not to start on reboot
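
Keeping the related services from starting on reboot matters because a rebooted node must not come back as a second active MN. The sketch below only builds the commands; it does not run them. The function name `boot_disable_commands` is hypothetical, and the systemd unit names are assumptions that vary by distribution and installation (for example, the MySQL unit may be `mysqld` rather than `mariadb`).

```python
from typing import Dict, List, Optional

# Assumed systemd unit names; verify them on the actual nodes.
BASE_UNITS = ["conserver", "dhcpd", "named", "xcatd"]
DB_UNITS: Dict[str, Optional[str]] = {
    "mysql": "mariadb",
    "postgresql": "postgresql",
    "sqlite": None,  # SQLite is file-based, so there is no daemon to disable
}


def boot_disable_commands(db_type: str) -> List[List[str]]:
    """Build the commands that keep HA-managed services from starting on reboot."""
    if db_type not in DB_UNITS:
        raise ValueError(f"unsupported database type: {db_type}")
    units = list(BASE_UNITS)
    db_unit = DB_UNITS[db_type]
    if db_unit is not None:
        units.append(db_unit)
    return [["systemctl", "disable", unit] for unit in units]


# Print the plan so it can be reviewed before running it on both nodes.
for cmd in boot_disable_commands("postgresql"):
    print(" ".join(cmd))
```

Returning argument lists rather than shell strings keeps the plan easy to inspect and lets a caller pass each entry to `subprocess.run` without shell quoting concerns.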

2. Different failover scenarios:

Failover scenario 1

When the active xCAT MN host1 is broken but we can still access its OS:

  1. I use deactivate-xcatmn to make host1 a non-active xCAT MN node
    1. make sure all of the following related services are stopped and are configured not to start on reboot
      1. console service
      2. DHCP service
      3. named service
      4. xcatd
      5. database (mysql/postgresql/sqlite type)
    2. unmount or unlink the shared data directories on host1
    3. change the hostname if needed
    4. remove the virtual IP
  2. I use activate-xcatmn to configure host2 as the active xCAT MN node
    1. make sure the virtual IP is not already in use (ping it); if it is, exit
    2. add the virtual IP to its NIC
    3. set the hostname to the virtual IP hostname
    4. check whether the current DB type matches; if it does not, clean up the environment and exit
    5. create symbolic links to the shared data directories, for example:
      /install -> /HA-data/install
      /etc/xcat -> /HA-data/etc/xcat
      /root/.xcat -> /HA-data/root/.xcat
      /var/lib/pgsql -> /HA-data/var/lib/pgsql
      /tftpboot -> /HA-data/tftpboot
      
    6. start or reconfigure all of the following related services:
      1. database (mysql/postgresql/sqlite type)
      2. xcatd
      3. named service (makedns -n)
      4. DHCP service (makedhcp -n, makedhcp -a)
      5. console service
      6. ... ...
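
The activation steps above can be sketched as an ordered command plan. This is an illustration under stated assumptions, not the actual activate-xcatmn implementation: the function names are hypothetical, the `systemctl` unit names (including `conserver`) are assumptions, and the DB type check in step 4 is left out because the source does not say how it is done. The `makedns` and `makedhcp` invocations are the ones named in the steps. Nothing is executed; the functions only return argument lists.

```python
import posixpath
from typing import List

# Shared paths and the /HA-data root come from the symbolic link example above.
SHARED_ROOT = "/HA-data"
SHARED_PATHS = ["/install", "/etc/xcat", "/root/.xcat", "/var/lib/pgsql", "/tftpboot"]


def vip_probe_command(vip: str) -> List[str]:
    """Step 1: a single ping; exit status 0 means the VIP is in use, so stop."""
    return ["ping", "-c", "1", "-W", "1", vip]


def link_commands(shared_root: str, paths: List[str]) -> List[List[str]]:
    """Step 5: one 'ln -s <target> <link>' command per shared path."""
    # Each link path must not already exist as a real directory on the node.
    return [["ln", "-s", posixpath.join(shared_root, p.lstrip("/")), p] for p in paths]


def activate_plan(vip_cidr: str, nic: str, vip_hostname: str, db_unit: str) -> List[List[str]]:
    """Steps 2, 3, 5 and 6 in order; run only after the VIP probe fails."""
    plan = [
        ["ip", "addr", "add", vip_cidr, "dev", nic],       # step 2
        ["hostnamectl", "set-hostname", vip_hostname],     # step 3
    ]
    plan += link_commands(SHARED_ROOT, SHARED_PATHS)       # step 5
    plan += [                                              # step 6, in listed order
        ["systemctl", "start", db_unit],
        ["systemctl", "start", "xcatd"],
        ["makedns", "-n"],
        ["makedhcp", "-n"],
        ["makedhcp", "-a"],
        ["systemctl", "start", "conserver"],
    ]
    return plan


# Placeholder values for illustration only.
for cmd in activate_plan("10.0.0.100/24", "eth0", "xcatmn", "postgresql"):
    print(" ".join(cmd))
```

The order matters: the database must be reachable through the symbolic links before xcatd starts, and xcatd must be running before `makedns` and `makedhcp` can read the xCAT tables. Deactivation on the old node is roughly the reverse: stop the services, remove the links, then remove the virtual IP.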

Failover scenario 2

When the active xCAT MN host1 is broken and we cannot access its OS, restart host1. After it reboots:

  1. if we can access its OS

    1. follow the same steps as in failover scenario 1
  2. if we still cannot access the host1 OS, unplug its network cable

    1. use activate-xcatmn to configure host2 as the active xCAT MN node, the same as step 2 in failover scenario 1
    2. fixing the host1 OS or hardware afterwards is recommended
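
The decision in failover scenario 2 can be summarized as a small function. This is a sketch only: `failover_actions` is a hypothetical name, and the returned strings are descriptions of the steps above rather than real commands. The comment about why the cable is unplugged is an interpretation; the source does not state the reason.

```python
from typing import List


def failover_actions(os_accessible_after_reboot: bool) -> List[str]:
    """Map the outcome of rebooting host1 to the ordered actions in scenario 2."""
    if os_accessible_after_reboot:
        # Same as failover scenario 1: clean deactivation, then activation.
        return ["deactivate-xcatmn on host1", "activate-xcatmn on host2"]
    # Assumed rationale: host1 cannot be deactivated cleanly, so it is
    # isolated from the network before host2 takes over the virtual IP.
    return [
        "unplug host1 network cable",
        "activate-xcatmn on host2",
        "fix host1 OS or hardware",
    ]


for reachable in (True, False):
    print(reachable, failover_actions(reachable))
```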
