Question: Why the ICD-10 codes did not get properly handled? #11

Open
XuejingJiang opened this issue Apr 13, 2020 · 2 comments

XuejingJiang commented Apr 13, 2020

Dear Adam,

I am trying to apply the cat_trauma function to a wide-format, properly prefixed dataset that contains only ICD-10 codes. However, none of the ICD-10 codes returns a valid issbr, so RISS is zero for every observation (patient).

For example, the following code was applied to my dataset:
dad_01 <- cat_trauma(df = temp, dx_pre = "DIAG_CODE_", calc_method = 1, icd10 = TRUE, i10_iss_method = "roc_max")

Note that all the ICD-10 codes in my dataset are formatted without a decimal point and without the letter "A" in the eighth position. An example of the codes in my data is shown in the following picture.
image

To troubleshoot this issue, I tried the following:

  • Manually formatted the ICD-10 codes to have a decimal point in the 4th position and "A" in the 8th position, then re-ran the cat_trauma function on the altered dataset. This did not resolve the problem (as shown below: rows 3 and 4 contain exactly the same information as rows 1 and 2, but with the ICD-10 codes formatted as described).
    image

  • Made sure that all the ICD-10 codes are stored as character (not factor).

  • Tried different parameters for the cat_trauma function (e.g. i10_iss_method = "roc_max", "gen_min", or "gen_max"); again, nothing helped.
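
The formatting checks in the list above can be sketched in base R. This is only an illustration: `normalize_dx_cols()` is a hypothetical helper (not part of icdpicr), the toy data frame is made up apart from the `DIAG_CODE_` prefix, and it assumes the undotted form of the codes is the one wanted.

```r
# Hypothetical helper (not part of icdpicr): coerce every column whose name
# starts with `prefix` to character, then trim, upper-case and drop dots.
normalize_dx_cols <- function(df, prefix) {
  dx_cols <- names(df)[startsWith(names(df), prefix)]
  for (col in dx_cols) {
    x <- as.character(df[[col]])          # factors become their labels
    x <- toupper(trimws(x))               # remove stray whitespace, unify case
    x <- gsub(".", "", x, fixed = TRUE)   # "S52.600" -> "S52600"
    df[[col]] <- x
  }
  df
}

# Toy wide-format data; the column prefix mirrors the call in this thread.
temp_demo <- data.frame(
  id = 1:2,
  DIAG_CODE_1 = factor(c("s52.600", " S82810 ")),
  DIAG_CODE_2 = c("V8658", NA),
  stringsAsFactors = FALSE
)

clean_demo <- normalize_dx_cols(temp_demo, "DIAG_CODE_")

# Confirm that every diagnosis column is now character, not factor.
dx_is_char <- vapply(
  clean_demo[startsWith(names(clean_demo), "DIAG_CODE_")],
  is.character,
  logical(1)
)
```

If every element of `dx_is_char` is TRUE and the problem persists, the column types and punctuation are probably not the cause, which matches what is reported above.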

Could you please help me figure out what I missed that causes this issue, or what steps I can take to resolve it?

Thank you very much,
Xuejing

Owner

ablack3 commented Apr 14, 2020

Hi Xuejing,

Thanks for the detailed description of your error. My apologies to you and all the ICDPIC users for the issues people have had with this package. I am committed to fixing them. It is my first R package and I still have more to learn about developing and maintaining R packages. Take a look at this code:

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(icdpicr)
library(icd)

df <- tibble::tribble(
~diag_code1, ~diag_code2, ~diag_code3, ~diag_code4, ~diag_code5, ~diag_code6,
"S52600",  "V8658", NA, NA, NA, NA,
"S52600",  "V8568", NA, NA, NA, NA,
"S82810",  "V8658", NA, NA, NA, NA,
"T791",    "S82100", "V8658", "U999", "T791", "Z530",
"S27000",  "S27300", "S22400", "V8658", NA, NA,
"S32400",  "S52500", "R104", "M2546", "V8668", NA,
"S8180",   "S22010", "S32000", "V8668", NA, NA,
"S828300", "T796", "V8658", NA, NA, NA,
"S06000",  "S0180", "S37090", "R311", "V8658", NA,
"S36150",  "S27300", "V8658", NA, NA, NA,
"S82601",  "V8658", NA, NA, NA, NA,
"S02100",  "S06630", "V8650", "G913", "T178", "B374")


df2 <- df %>%
      tidyr::gather(key = "key", value = "code") %>%
      distinct(code) %>%
      filter(!is.na(code)) %>%
      mutate(is_defined_icd10 = is_defined(as.icd10(code)),
             in_icdpic = code %in% icdpicr:::i10_map_roc$dx) %>%
      arrange(code)

df2 %>%
      print(n=100)
#> # A tibble: 33 x 3
#>    code    is_defined_icd10 in_icdpic
#>    <chr>   <lgl>            <lgl>    
#>  1 B374    TRUE             FALSE    
#>  2 G913    TRUE             FALSE    
#>  3 M2546   TRUE             FALSE    
#>  4 R104    FALSE            FALSE    
#>  5 R311    TRUE             FALSE    
#>  6 S0180   TRUE             TRUE     
#>  7 S02100  FALSE            FALSE    
#>  8 S06000  FALSE            FALSE    
#>  9 S06630  FALSE            FALSE    
#> 10 S22010  TRUE             TRUE     
#> 11 S22400  FALSE            FALSE    
#> 12 S27000  FALSE            FALSE    
#> 13 S27300  FALSE            FALSE    
#> 14 S32000  TRUE             TRUE     
#> 15 S32400  FALSE            FALSE    
#> 16 S36150  FALSE            FALSE    
#> 17 S37090  FALSE            FALSE    
#> 18 S52500  FALSE            FALSE    
#> 19 S52600  FALSE            FALSE    
#> 20 S8180   TRUE             FALSE    
#> 21 S82100  FALSE            FALSE    
#> 22 S82601  FALSE            FALSE    
#> 23 S82810  FALSE            FALSE    
#> 24 S828300 FALSE            FALSE    
#> 25 T178    TRUE             FALSE    
#> 26 T791    TRUE             FALSE    
#> 27 T796    TRUE             FALSE    
#> 28 U999    FALSE            FALSE    
#> 29 V8568   FALSE            FALSE    
#> 30 V8650   FALSE            FALSE    
#> 31 V8658   FALSE            FALSE    
#> 32 V8668   FALSE            FALSE    
#> 33 Z530    TRUE             FALSE


df2 %>%
      filter(in_icdpic) %>%
      rename(dx1 = code) %>%
      cat_trauma("dx") %>%
      select(1:5)
#>      dx1 sev_1   issbr_1 is_defined_icd10 in_icdpic
#> 1  S0180     3 Head/Neck             TRUE      TRUE
#> 2 S22010     1     Chest             TRUE      TRUE
#> 3 S32000     1   Abdomen             TRUE      TRUE

As you can see, only some of the codes in your data are actual ICD-10 codes according to the icd package. I think in the next big update I will integrate icdpicr with the icd package, since that package has already done much of the work of parsing ICD codes. Of the codes in your data that are actual ICD-10 codes, only three are in icdpic. This is because icdpic does not include every ICD code that could be considered an injury; it contains only the ICD codes used in the US National Trauma Data Bank. The three codes that are in icdpic do get mapped correctly after the change I made to the icdpicr source code. You will need to reinstall the package.
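
The lookup check can also be reproduced without dplyr. The following is a minimal sketch, not the package's own logic: `reference_codes` is a stand-in that holds only the three codes reported as present in icdpic in the output above (in practice `icdpicr:::i10_map_roc$dx` could be used, as in the code above), and `summarise_matches()` is an illustrative helper name, not an icdpicr function.

```r
# Stand-in for the icdpic lookup table: only the three codes that the
# output above reports as present in icdpic.
reference_codes <- c("S0180", "S22010", "S32000")

# Illustrative helper: list each distinct non-missing code in a wide data
# frame and flag whether it appears in the reference vector.
summarise_matches <- function(df, reference) {
  codes <- unlist(df, use.names = FALSE)
  codes <- sort(unique(codes[!is.na(codes)]))
  data.frame(
    code = codes,
    in_reference = codes %in% reference,
    stringsAsFactors = FALSE
  )
}

# A few codes taken from the example data above.
demo <- data.frame(
  diag_code1 = c("S8180", "S06000"),
  diag_code2 = c("S22010", "S0180"),
  diag_code3 = c("S32000", "S37090"),
  stringsAsFactors = FALSE
)

matches <- summarise_matches(demo, reference_codes)

# Fraction of distinct codes found in the reference vector.
matched_share <- mean(matches$in_reference)
```

A low `matched_share` before calling cat_trauma is an early warning that most diagnoses will not be mapped to a body region or severity.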

One question: why does your data contain codes that are apparently not actual ICD codes? I'm mostly familiar with ICD-10-CM (the American version), but according to the icd R package these codes are not in ICD at all, which is strange.
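
One way to investigate this question is to test whether an unmatched code is an extended form of a code that the lookup table does contain. The sketch below is only a diagnostic idea under stated assumptions: `longest_known_prefix()` is a hypothetical helper, the inputs are toy values ("S320001" is made up and does not come from this thread), and a prefix match does not by itself show that the two codes mean the same thing.

```r
# Hypothetical helper: for each code, return the longest leading substring
# (at least `min_chars` long) that appears in `reference`, or NA if none does.
longest_known_prefix <- function(codes, reference, min_chars = 3) {
  vapply(codes, function(code) {
    if (is.na(code)) return(NA_character_)
    n <- nchar(code)
    if (n < min_chars) return(NA_character_)
    for (len in seq(n, min_chars)) {      # try the longest prefix first
      candidate <- substr(code, 1, len)
      if (candidate %in% reference) return(candidate)
    }
    NA_character_
  }, character(1), USE.NAMES = FALSE)
}

# Toy inputs for illustration only; "S320001" is a made-up code.
reference_demo <- c("S0180", "S22010", "S32000")
codes_demo <- c("S320001", "S0180", "V8658")

prefix_demo <- longest_known_prefix(codes_demo, reference_demo)
```

Codes that only match after truncation would suggest a different ICD-10 variant or extra trailing characters in the source data, which could then be confirmed against the coding standard actually used.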

Author

XuejingJiang commented Apr 14, 2020 via email
