The minidesign for diskdiscover and configraid

bybai edited this page Nov 23, 2015 · 1 revision

Overall:

This mini-design covers two new RAID support features. Both can be shipped in xCAT-genesis-scripts and run under the xCAT genesis system:

  1. Discover disk devices under the xCAT genesis system;
  2. Configure RAID, including creating and deleting RAID arrays.

Part1: Discover disk devices

Command: diskdiscover

Input parameter: PCI_ID or nothing

Description: Use this command to get an overview of the disks and RAID arrays on a compute node; the output is useful input for configuring RAID. The optional input parameter PCI_ID consists of the PCI vendor ID and device ID. For example, the Power8 SAS adapter is listed at http://pci-ids.ucw.cz/read/PC/1014/034a, where 1014 is the vendor ID and 034a identifies the PCI-E IPR SAS Adapter.

The framework process:

If input parameter PCI_ID is not null:
diskdiscover reads PCI_ID and finds the related PCI_SLOT_NAME; it then uses separate functions to get the disk devices, their Resource_Path, their status and descriptions, and an overview of the RAID arrays; it combines this output into a matrix;
else:
diskdiscover finds all advanced function disks and their related information, including PCI_ID, PCI_SLOT_NAME, ...; it also shows the RAID arrays;
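
The first step of the PCI_ID branch, mapping a PCI_ID to its PCI_SLOT_NAME values, could be sketched as follows. This is only an illustration, not the diskdiscover implementation: it assumes the Linux sysfs layout in which each device directory under /sys/bus/pci/devices holds a uevent file with PCI_ID and PCI_SLOT_NAME entries, and the function names read_uevent and find_pci_slots are hypothetical.

```python
from pathlib import Path


def read_uevent(path):
    """Parse a sysfs uevent file into a dict of KEY=VALUE pairs."""
    props = {}
    for line in Path(path).read_text().splitlines():
        key, sep, value = line.partition("=")
        if sep:
            props[key] = value
    return props


def find_pci_slots(pci_id, sysfs_root="/sys/bus/pci/devices"):
    """Return the PCI_SLOT_NAME of every device whose PCI_ID matches.

    pci_id is "vendor:device", for example "1014:034a". The comparison
    ignores case because sysfs reports the hex digits in upper case.
    Hypothetical helper, not part of xCAT.
    """
    wanted = pci_id.lower()
    slots = []
    for dev in sorted(Path(sysfs_root).iterdir()):
        uevent = dev / "uevent"
        if not uevent.is_file():
            continue
        props = read_uevent(uevent)
        if props.get("PCI_ID", "").lower() == wanted:
            # Fall back to the directory name, which is the slot address.
            slots.append(props.get("PCI_SLOT_NAME", dev.name))
    return slots
```

On a node with one matching adapter, find_pci_slots("1014:034a") would return a single-element list holding that adapter's slot name.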

Work process:

  1. Start the xCAT genesis system on the P8 system so that the P8 system enters the xCAT genesis shell.
  2. From the xCAT management node, execute xdsh nodename "diskdiscover 1014:034a" or xdsh nodename "diskdiscover".
  3. The output is formatted as a matrix; the columns can include PCI_ID, PCI_Address, Resource_Path, devices, status, Descriptions.

Reference example:

PCI_ID : 1014:034a
----------------
PCI_SLOT_NAME    Resource_Path  disk_Devices   Status     Descriptions
0001:08:00.0     00-01           sg0           Active     Advanced Function Disk
0001:08:00.0     00-00           sg1           Active     RAID 0 Array Member

RAID overview:
-------------
Name    PCI/SCSI Location     Description                 Status
sda     0001:08:00.0/0:2:0:0  RAID 10 Disk Array          1% Rebuilt
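
One way to produce an aligned matrix like the reference example is sketched below. It assumes the disk records have already been collected as lists of strings with one entry per column; format_matrix is a hypothetical helper name, and the column gap is an arbitrary choice.

```python
def format_matrix(headers, rows, gap=2):
    """Render headers and rows as left-aligned, space-padded columns.

    Every row is assumed to have the same number of cells as headers.
    """
    widths = [len(h) for h in headers]
    for row in rows:
        for i, cell in enumerate(row):
            widths[i] = max(widths[i], len(str(cell)))
    lines = []
    for row in [headers] + list(rows):
        cells = [str(c).ljust(widths[i] + gap) for i, c in enumerate(row)]
        # Drop the padding after the last column.
        lines.append("".join(cells).rstrip())
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_matrix(
        ["PCI_SLOT_NAME", "Resource_Path", "disk_Devices", "Status", "Descriptions"],
        [["0001:08:00.0", "00-01", "sg0", "Active", "Advanced Function Disk"],
         ["0001:08:00.0", "00-00", "sg1", "Active", "RAID 0 Array Member"]],
    ))
```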

Part2: Configure RAID

Script: configraid

Function: delete RAID arrays, create RAID arrays

Command format:
::
configraid delete_raid=[all|raid_array_list|null] stripe_size_in_kb=[16|64|256] create_raid=rl#[0,10,5,6]|[PCI_ID#<num>|PCI_SLOT_name#<pci_slot_name>]|disk_num#<number of disks for one raid> create_raid=rl#[0,10,5,6]|[PCI_ID#<num>|PCI_SLOT_name#<pci_slot_name>]|disk_num#<number of disks for one raid> ...

Description:

  1. Input parameters:

    delete_raid:
    1. delete_raid lists the RAID arrays that should be removed. If its value is all, every detected RAID array is deleted.
    2. If its value is a list of RAID array names, those RAID arrays are deleted. RAID array names should be separated by #. If its value is null, no RAID array is deleted. If delete_raid is not given, the default value is null.
    3. The format is: delete_raid=[all|raid_array_list|null]
    4. Example: ::
      delete_raid=sda#sdd

    create_raid:

    1. To create a RAID array, add a line beginning with create_raid.
    2. The format is: create_raid="rl#<raidlevel>|[pci_id#<num>|pci_slot_name#<pci_slot_name>|disk_names#<sg0>#..#<sgn>]|disk_num#<number>" ...
    3. rl means RAID level. It can be any RAID level supported by the given adapter, such as 0, 10, 5, 6.
    4. pci_id is the PCI vendor and device ID; refer to http://pci-ids.ucw.cz/read/PC/1014/034a;
    5. disk_num is the number of disks this RAID array will contain;
    6. pci_slot_name is the specified PCI location. If pci_slot_name is specified, the RAID array is created using disks from this PCI slot;
    7. If pci_id is specified, configraid detects all disks under pci_id.
    8. If disk_names is specified, configraid configures the RAID array using the specified disks.

    stripe_size_in_kb:
    1. Currently supported stripe sizes in KB are 16, 64, and 256.
    2. If the stripe size is not specified, it defaults to the recommended stripe size for the selected RAID level.
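
The parameter formats above can be illustrated with a small parsing sketch. It is written under stated assumptions rather than taken from configraid: field names are matched case-insensitively, disk_names keeps everything after its first # as a list, and the names parse_create_raid and parse_args are hypothetical.

```python
SUPPORTED_STRIPE_KB = {16, 64, 256}


def parse_create_raid(spec):
    """Parse one create_raid value such as 'rl#10|pci_slot_name#0001:08:00.0|disk_num#2'."""
    result = {}
    for field in spec.split("|"):
        key, sep, value = field.partition("#")
        key = key.strip().lower()
        if not sep or not value:
            raise ValueError("malformed create_raid field: %r" % field)
        if key == "disk_names":
            result[key] = value.split("#")   # sg0#sg1#... becomes a list
        elif key == "disk_num":
            result[key] = int(value)
        else:
            result[key] = value              # rl, pci_id, pci_slot_name
    if "rl" not in result:
        raise ValueError("create_raid needs an rl#<raidlevel> field")
    return result


def parse_args(argv):
    """Turn key=value arguments into a config dict; delete_raid defaults to None (null)."""
    config = {"delete_raid": None, "stripe_size_in_kb": None, "create_raid": []}
    for arg in argv:
        key, _, value = arg.partition("=")
        key, value = key.strip(), value.strip().strip('"')
        if key == "delete_raid":
            if value in ("", "null"):
                config[key] = None
            elif value == "all":
                config[key] = "all"
            else:
                config[key] = value.split("#")
        elif key == "stripe_size_in_kb":
            size = int(value)
            if size not in SUPPORTED_STRIPE_KB:
                raise ValueError("unsupported stripe size: %d" % size)
            config[key] = size
        elif key == "create_raid":
            config[key].append(parse_create_raid(value))
        else:
            raise ValueError("unknown parameter: %r" % key)
    return config
```

Leaving stripe_size_in_kb as None when it is absent lets a later step substitute the recommended stripe size for the selected RAID level.
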
  2. Framework main process:

    a. Load the input parameters.

    b. If delete_raid is not null:

    collect all the RAID arrays; delete all RAID arrays (when delete_raid is all) or the RAID arrays specified in the delete_raid list;

    else:

    report that no RAID arrays need to be deleted;

    c. If there are several create_raid input parameters, handle each create_raid in turn: find the target number of disks and create the RAID array;

      If there is PCI_ID and there is no PCI_SLOT_name:

      find all IPR IOA PCI locations based on PCI_ID; reorder the IPR IOA PCI locations by their primary and secondary state; return the ordered group of PCI_SLOT_name values, named pci_slot_group;

      If there is PCI_SLOT_name, or there are both PCI_SLOT_name and PCI_ID:

      use PCI_SLOT_name: empty pci_slot_group and put PCI_SLOT_name into pci_slot_group;

      If there is neither PCI_ID nor PCI_SLOT_name:

      find all advanced function disks and save them as af_disks_group;

      If there is pci_slot_group:

      for each PCI_SLOT_name in pci_slot_group: find all member disks on the IPR RAID adapter; this yields multiple lines, saved as sloc_of_disks_group, such as

      <pciloc_of_ioa1>=<sloc_of_disk1>,<sloc_of_disk2>......

      <pciloc_of_ioa2>=<sloc_of_disk1>,<sloc_of_disk2>......

      for each create_raid:
      if sloc_of_disks_group is not null:

      find its <pciloc_of_ioa1>=<sloc_of_disk1>,<sloc_of_disk2> line; sort the disk devices by resource path; find the required number of disks (disk_num is the number of disks); these disks are af disks, saved as af_disks_group;

      if af_disks_group is not null:
      if disk_num is not null:

      if there are not enough disks on the adapter, break the current loop; if all target disks are already in use, reuse this array and break the current loop;

      else:

      disk_num is the number of all disks in af_disks_group;

      pick af disks to form a list; create an array using the specified raid_level, stripe_size_in_kb, and list of af disks; check whether the array is ready.

    d. Log files are saved in /tmp/.
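
The slot and disk selection rules in the framework process can be condensed into the sketch below. It is an illustration only: resolve_slot_group and pick_disks are hypothetical names, slots_for_pci_id stands in for whatever routine returns the IPR IOA PCI locations already ordered by primary and secondary state, and af_disks is assumed to be sorted by resource path beforehand.

```python
def resolve_slot_group(pci_id, pci_slot_name, slots_for_pci_id):
    """Build pci_slot_group for one create_raid entry.

    An explicit PCI_SLOT_name wins, even when PCI_ID is also given.
    With only PCI_ID, the group comes from slots_for_pci_id (a
    hypothetical callable). With neither, the group is empty and the
    caller falls back to all advanced function disks.
    """
    if pci_slot_name:
        return [pci_slot_name]
    if pci_id:
        return list(slots_for_pci_id(pci_id))
    return []


def pick_disks(af_disks, disk_num=None):
    """Choose the disks for one array from af_disks_group.

    Returns None when the group is empty or holds fewer disks than
    disk_num, which corresponds to breaking out of the current loop.
    When disk_num is not given, every disk in the group is used.
    """
    if not af_disks:
        return None
    if disk_num is None:
        return list(af_disks)
    if len(af_disks) < disk_num:
        return None
    return list(af_disks[:disk_num])
```

The check for target disks that are already members of an array is left out because it depends on adapter state that this sketch does not model.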

  3. Other common functions called by main process:

    1. load configuration file
    2. log utils
    3. delete ipr arrays
    4. create ipr arrays
    5. check ipr device status
    6. wait for ipr device status
    7. check disk format, af or jbod
    8. order resource path
    9. convert between disk scsi location and device_name
    10. sort disk devices by resource path
    11. reorder ipr ioa pci locations by its primary and secondary state
    12. format jbod disks into advanced format
    13. pick up specified number of disks from af_disks_group
    14. other
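
As an illustration of the "sort disk devices by resource path" helper, the sketch below orders (device, resource path) pairs. It assumes that a resource path is a dash-separated list of hexadecimal fields, as in the 00-01 and 00-00 values of the diskdiscover reference example; the function names are hypothetical.

```python
def resource_path_key(path):
    """Turn a resource path such as '00-0A' into a tuple of integers.

    Assumption: every dash-separated field is a hexadecimal number.
    """
    return tuple(int(part, 16) for part in path.split("-"))


def sort_by_resource_path(disks):
    """Sort an iterable of (device_name, resource_path) pairs by path."""
    return sorted(disks, key=lambda disk: resource_path_key(disk[1]))
```

Comparing numeric tuples rather than raw strings keeps the order stable if the fields are not zero-padded to the same width.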

Work process:

  1. Execute the commands on the xCAT MN. For example, on ppc64, to delete all RAID arrays and create a RAID 10 array from the first 2 disks on PCI slot 0001:08:00.0:
::
nodeset <compute_node> cmd="configRAID delete_raid=all create_raid=rl#10|pci_slot_name#0001:08:00.0|disk_num#2",shell
rpower <compute_node> reset
  2. Use xdsh to monitor the RAID build process.
  3. At the current stage the focus is on the framework process; monitoring of the RAID build process can be considered in the future.
