This repository has been archived by the owner on Jul 22, 2024. It is now read-only.

CSM BDS: python scripts TypeError with --state reverted #990

Open
thanh-lam opened this issue Dec 4, 2020 · 7 comments

Comments

@thanh-lam
Member

thanh-lam commented Dec 4, 2020

Describe the bug
CSM provides Python scripts in /opt/ibm/csm/bigdata/python/ for querying allocation data. One example is "findUserJobs.py", which lists allocation information such as the "state" of a job. It produces the following error when run with --state reverted. Other states (running, failed, complete) are listed without error.

# ./findUserJobs.py -u tlam --state reverted
     State |   AID | P Job ID | S Job ID | Begin Time                 | End Time                  
Traceback (most recent call last):
  File "./findUserJobs.py", line 167, in <module>
    sys.exit(main(sys.argv))
  File "./findUserJobs.py", line 135, in main
    data.get("state")))
TypeError: unsupported format string passed to NoneType.__format__

To Reproduce
Steps to reproduce the behavior:

  1. Login to CSM master or BDS node as root.
  2. Change to /opt/ibm/csm/bigdata/python/ then run the command:
# ./findUserJobs.py -u <userid> --state reverted
  3. See the error.

Expected behavior
The command should not produce the error, which looks like an internal condition that is not handled for the reverted state. Example of good command output:

# ./findUserJobs.py -u root
     State |   AID | P Job ID | S Job ID | Begin Time                 | End Time                  
  complete |     1 |      526 | 0        | 2020-09-17 12:06:01.183794 | 2020-09-17 12:06:58.382907
  complete |     2 |      527 | 0        | 2020-09-17 12:06:02.483204 | 2020-09-17 12:06:56.800503
  complete |     3 |      528 | 0        | 2020-09-17 12:10:25.346929 | 2020-09-17 12:10:26.374556
  complete |     4 |      530 | 0        | 2020-09-17 12:12:52.830307 | 2020-09-17 12:12:53.838649
  complete |     5 |      531 | 0        | 2020-09-17 12:18:24.195737 | 2020-09-17 12:18:25.275275
  complete |     6 |      532 | 0        | 2020-09-17 12:20:02.431532 | 2020-09-17 12:20:27.420373
  complete |     7 |      533 | 0        | 2020-09-17 12:22:12.25542  | 2020-09-17 12:22:33.375314
  complete |     8 |      534 | 0        | 2020-09-17 12:27:51.261522 | 2020-09-17 12:28:11.101704
  complete |     9 |      535 | 0        | 2020-09-17 12:28:01.331114 | 2020-09-17 12:28:11.447308
  complete |    10 |        1 | 0        | 2020-09-17 12:30:36.073652 | 2020-09-17 14:55:39.055421
  complete |    11 |        2 | 0        | 2020-09-17 12:30:36.137917 | 2020-09-17 14:55:40.508644
    failed |    24 |        1 | 0        | 2020-09-23 16:41:58.746747 | 2020-09-23 16:41:58.951312
  complete |   168 |      557 | 0        | 2020-10-28 10:35:17.335562 | 2020-10-28 10:41:13.049669
  complete |   169 |      661 | 0        | 2020-10-28 10:51:58.674377 | 2020-10-28 11:52:02.579073

Environment (please complete the following information):

  • IST BDS cluster, CSM dev cluster.
  • Version: CSM 1.8.2-3577

Additional context
The TypeError could be caused by an empty field in the data record of a job in the reverted state.
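
A minimal sketch of how an empty field can trigger this error, independent of CSM. The format string `row_fmt` and the helper `format_row` are hypothetical stand-ins for illustration; the real `print_fmt` in findUserJobs.py may differ. It assumes the empty field reaches the formatter as `None` and is formatted with a width or alignment spec.

```python
# Hypothetical format string; the real print_fmt in findUserJobs.py may differ.
row_fmt = "{0:>10} | {1:<26}"


def format_row(state, end_time):
    """Format one row; a non-empty format spec is applied to every field."""
    return row_fmt.format(state, end_time)


# A string value formats without trouble.
ok_row = format_row("complete", "2020-09-17 12:06:58.382907")
print(ok_row)

# None does not accept a format spec such as "<26", so this raises TypeError.
try:
    format_row("reverted", None)
except TypeError as err:
    error_message = str(err)
    print("TypeError:", error_message)
```

Python's `None` only supports an empty format spec, which is why a plain `print(None)` works while a padded column does not.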

Issue Source:
CSM regression tests.

@thanh-lam thanh-lam changed the title CSM BDS: python scripts produced Typeerror when --state reverted CSM BDS: python scripts TypeError with --state reverted Dec 4, 2020
@williammorrison2 williammorrison2 self-assigned this Dec 4, 2020
@thanh-lam
Member Author

The script prints the list of user jobs without trouble until it reaches a job with state = reverted, where it hits the TypeError. Bill found from the database and indices that "reverted" jobs have an empty "end_time". Python 3 raises a TypeError when the script tries to format that job record in this print statement:

            print( print_fmt.format(
                data.get("allocation_id"), data.get("primary_job_id"), data.get("secondary_job_id"),
                data.get("begin_time"), cast.deep_get(data,"history","end_time"),
                data.get("state")))

To fix that, we need to check the field 'cast.deep_get(data,"history","end_time")' and print a blank if it is empty. This is the most direct fix, and it works as intended.

            condition = cast.deep_get(data, "history","end_time")
            print( print_fmt.format(
                data.get("allocation_id"), data.get("primary_job_id"), data.get("secondary_job_id"),
                data.get("begin_time"), condition if condition is not None else " ",
                data.get("state")))

The line "condition = ..." is added to make the "if ... else ..." check on the field more readable.
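
A self-contained sketch of the same idea, for trying the fix outside a CSM installation. Everything here is an assumption made for illustration: `deep_get` is a stand-in for `cast.deep_get` (whose real behavior may differ), while `or_blank`, `row_fmt`, and `format_job` are hypothetical names that are not part of the CSM scripts.

```python
def deep_get(record, *keys):
    """Stand-in for cast.deep_get: walk nested dicts, return None if a key is missing."""
    for key in keys:
        if not isinstance(record, dict):
            return None
        record = record.get(key)
    return record


def or_blank(value, blank=" "):
    """Return a printable blank when the field is None, else the value itself."""
    return blank if value is None else value


# Hypothetical format string; the real print_fmt may use other widths.
row_fmt = "{0:>10} | {1:>5} | {2:<26} | {3:<26}"


def format_job(data):
    """Format one job record, tolerating missing or None fields."""
    return row_fmt.format(
        or_blank(data.get("state")),
        or_blank(data.get("allocation_id")),
        or_blank(data.get("begin_time")),
        or_blank(deep_get(data, "history", "end_time")),
    )


reverted = {
    "state": "reverted",
    "allocation_id": 6,
    "begin_time": "2021-02-23 14:01:39.697209",
    "history": {"end_time": None},
}
line = format_job(reverted)
print(line)
```

Wrapping every field rather than only "end_time" is a design choice of this sketch, not something the original fix does; it guards against other fields being empty in the same way.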

@thanh-lam
Member Author

A similar fix can also be applied to another script, "findJobsRunning.py".

                condition = cast.deep_get(data, "history","end_time")
                print(print_fmt.format(
                    data.get("allocation_id"), data.get("primary_job_id"), data.get("secondary_job_id"),
                    data.get("begin_time"), condition if condition is not None else " "))

@williammorrison2
Contributor

Thanks @thanh-lam for working with me and writing this up. I'm in the process of reviewing some of the other scripts to ensure we catch similar cases. I will add the details to this issue.

@williammorrison2
Contributor

A similar fix can also be applied to another script, findJobsInRange.py.

            if data:
                condition = cast.deep_get(data, "history","end_time")
                print(print_fmt.format(
                    data.get("allocation_id"), data.get("primary_job_id"), data.get("secondary_job_id"),
                    data.get("begin_time"), condition if condition is not None else " ",
                    data.get("user_name")))

@williammorrison2
Contributor

@thanh-lam Here are some example queries run after the fix was implemented.

[root@c650f99p06 python]# ./findUserJobs.py -u tlam --state reverted
     State |   AID | P Job ID | S Job ID | Begin Time                 | End Time
[root@c650f99p06 python]# ./findUserJobs.py -u wcmorris --state reverted
     State |   AID | P Job ID | S Job ID | Begin Time                 | End Time
[root@c650f99p06 python]# ./findUserJobs.py -u root --state reverted
     State |   AID | P Job ID | S Job ID | Begin Time                 | End Time
  reverted |     6 |        1 | 0        | 2021-02-23 14:01:39.697209 |

[root@c650f99p06 python]# ./findUserJobs.py -u root
     State |   AID | P Job ID | S Job ID | Begin Time                 | End Time
  complete |     1 |        1 | 0        | 2021-02-23 12:04:34.828635 | 2021-02-23 12:04:39.513245
  complete |     2 |        1 | 0        | 2021-02-23 12:04:40.983847 | 2021-02-23 12:04:43.556549
  complete |     3 |        1 | 0        | 2021-02-23 12:05:01.829019 | 2021-02-23 12:05:02.492537
  complete |     4 |        1 | 0        | 2021-02-23 13:48:52.624415 | 2021-02-23 13:48:53.528137
  complete |     5 |        1 | 0        | 2021-02-23 14:00:14.318896 | 2021-02-23 14:03:32.978141
  reverted |     6 |        1 | 0        | 2021-02-23 14:01:39.697209 |
  complete |     7 |        1 | 0        | 2021-02-23 14:05:37.494822 | 2021-02-23 14:05:38.328102
  complete |     8 |        1 | 0        | 2021-02-23 14:08:06.726752 | 2021-02-23 14:08:07.399833
  complete |     9 |        1 | 0        | 2021-02-23 14:09:41.859691 | 2021-02-23 14:09:42.559594
  complete |    10 |        1 | 0        | 2021-02-23 14:16:08.829438 | 2021-02-23 14:16:09.533021
  complete |    11 |        1 | 0        | 2021-02-23 14:17:05.743261 | 2021-02-23 14:17:06.379795
  complete |    12 |        1 | 0        | 2021-02-23 14:18:37.053626 | 2021-02-23 14:18:37.73513
   running |    13 |        1 | 0        | 2021-02-23 14:26:50.28166  | 2021-02-23 14:26:50.970676
  complete |    14 |        1 | 0        | 2021-02-23 14:28:38.323487 | 2021-02-23 14:28:38.998807
  complete |    15 |        1 | 0        | 2021-02-23 14:35:32.508862 | 2021-02-23 14:35:33.167389

@besawn
Contributor

besawn commented Mar 8, 2021

Fixed by PR #994.

@besawn besawn closed this as completed Mar 8, 2021
@besawn
Contributor

besawn commented Mar 8, 2021

@thanh-lam I'm going to leave this issue open until you have a chance to verify the fix in the next CAST build.

@besawn besawn reopened this Mar 8, 2021