Reshaping Data from Long to Wide Format

The long_to_wide function reshapes data from long format (multiple rows per patient) to wide format (one row per patient, with one column per diagnosis). To handle large datasets efficiently, the function supports batch processing: the batch_size parameter controls how many rows are processed in each batch. This transformation is required before applying the icd9_to_comorbid and icd10_to_comorbid functions.

# Example long format data with multiple rows per patient
long_data <- data.frame(
  patient_id = c(1, 1, 2, 2, 3),
  icd_1 = c("A01", "A02", "B01", "B02", "C01"),
  icd_2 = c("D01", "E02", "F01", "G02", "H01")
)

# Reshaping the data to wide format
wide_data <- long_to_wide(long_data, idx = "patient_id", icd_cols = c("icd_1", "icd_2"))

# Displaying the reshaped data
print(wide_data)
#>   patient_id icd_1 icd_2 icd_3 icd_4 icd_5
#> 1          1   A01   D01   A02   E02    NA
#> 2          2   B01   F01   B02   G02    NA
#> 3          3   C01   H01  <NA>  <NA>    NA
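
To make the reshaping step concrete, the following base R sketch shows one way a long-to-wide transformation can be written. It is an illustration only and not the package's implementation: long_to_wide_sketch is a hypothetical helper, and it creates only as many icd_ columns as the patient with the most codes needs, so its column count can differ from the output printed above.

```r
# Hypothetical helper (not part of icdcomorbid): base R long-to-wide reshaping.
long_to_wide_sketch <- function(df, idx, icd_cols) {
  # Patient ids in the same sorted order that split() uses for its groups
  ids <- sort(unique(df[[idx]]))
  # One group of ICD columns per patient
  groups <- split(df[, icd_cols, drop = FALSE], df[[idx]])
  # Flatten each patient's codes row by row into a single character vector
  codes <- lapply(groups, function(g) as.vector(t(as.matrix(g))))
  # Pad shorter vectors with NA so every patient has the same width
  width <- max(lengths(codes))
  padded <- do.call(rbind, lapply(codes, function(x) {
    c(x, rep(NA_character_, width - length(x)))
  }))
  out <- data.frame(ids, padded, stringsAsFactors = FALSE)
  names(out) <- c(idx, paste0("icd_", seq_len(width)))
  rownames(out) <- NULL
  out
}

# Same toy data as in the example above
long_data <- data.frame(
  patient_id = c(1, 1, 2, 2, 3),
  icd_1 = c("A01", "A02", "B01", "B02", "C01"),
  icd_2 = c("D01", "E02", "F01", "G02", "H01")
)
wide_sketch <- long_to_wide_sketch(long_data, "patient_id", c("icd_1", "icd_2"))
print(wide_sketch)
```

Each patient's codes are read row by row, so the first visit's codes come before the second visit's codes, matching the ordering in the printed output above.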

Comorbidity Calculations with ICD Codes

The icdcomorbid R package includes functions that map ICD-9 and ICD-10 codes to standard comorbidity indices, using either the Charlson or the Quan-Elixhauser index. Choose the ICD version and comorbidity index that match your data. Batch processing is supported through the batch_size parameter. Your data should be in wide format before you apply these functions.
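
The idea behind batch processing can be sketched in base R as follows. This is a generic illustration under the assumption that batches are consecutive blocks of rows; how icdcomorbid batches internally is not described here, and process_in_batches is a hypothetical helper, not a package function.

```r
# Hypothetical helper: apply `fun` to consecutive row blocks of `df`
# and stack the results, so only one block is processed at a time.
process_in_batches <- function(df, batch_size, fun) {
  if (nrow(df) == 0) {
    return(fun(df))
  }
  # First row index of each batch: 1, 1 + batch_size, ...
  starts <- seq(1, nrow(df), by = batch_size)
  pieces <- lapply(starts, function(s) {
    rows <- s:min(s + batch_size - 1, nrow(df))
    fun(df[rows, , drop = FALSE])
  })
  do.call(rbind, pieces)
}

# Toy usage: 5 rows with batch_size = 2 gives batches of 2, 2, and 1 rows
toy <- data.frame(x = 1:5)
double_x <- function(batch) {
  batch$doubled <- batch$x * 2
  batch
}
batched <- process_in_batches(toy, batch_size = 2, fun = double_x)
```

A smaller batch_size lowers the amount of data handled at once at the cost of more function calls.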

Choosing Comorbidity Indices

Both the Charlson and the Quan-Elixhauser indices are available for ICD-9 and ICD-10 codes, but each ICD version requires its own mapping.

ICD-9 Codes: Use the icd9_to_comorbid function and select the appropriate index, such as "charlson9" or "quan_elixhauser9".
ICD-10 Codes: Use the icd10_to_comorbid function and select the corresponding index, such as "charlson10" or "quan_elixhauser10".

Example: Using ICD-9 Codes

If your dataset contains ICD-9 codes, you can use the icd9_to_comorbid function to calculate comorbidities.

# Example ICD-9 data
icd9_data <- data.frame(
  patient_id = c(1, 1, 2, 2, 3),
  icd9_code = c("4010", "2500", "4140", "4280", "4930")
)

# Map ICD-9 codes to comorbidities using Charlson index
mapping <- "charlson9"

comorbidities_icd9 <- icd9_to_comorbid(
  df = icd9_data,
  idx = "patient_id",
  icd_cols = "icd9_code",
  mapping = mapping,
  batch_size = 2
)

# Display the comorbidity results
head(comorbidities_icd9)
#>   patient_id myocardial_infarction congestive_heart_failure
#> 1          1                 FALSE                    FALSE
#> 2          1                 FALSE                    FALSE
#> 3          2                 FALSE                    FALSE
#> 4          2                 FALSE                     TRUE
#> 5          3                 FALSE                    FALSE
#>   peripheral_vascular_disease cerebrovascular_disease dementia
#> 1                       FALSE                   FALSE    FALSE
#> 2                       FALSE                   FALSE    FALSE
#> 3                       FALSE                   FALSE    FALSE
#> 4                       FALSE                   FALSE    FALSE
#> 5                       FALSE                   FALSE    FALSE
#>   chronic_pulmonary_disease connective_tissue_disease_rheumatic_disease
#> 1                     FALSE                                       FALSE
#> 2                     FALSE                                       FALSE
#> 3                     FALSE                                       FALSE
#> 4                     FALSE                                       FALSE
#> 5                      TRUE                                       FALSE
#>   mild_liver_disease diabetes_wo_complications diabetes_w_complications
#> 1              FALSE                     FALSE                    FALSE
#> 2              FALSE                     FALSE                    FALSE
#> 3              FALSE                     FALSE                    FALSE
#> 4              FALSE                     FALSE                    FALSE
#> 5              FALSE                     FALSE                    FALSE
#>   paraplegia_and_hemiplegia renal_disease cancer
#> 1                     FALSE         FALSE  FALSE
#> 2                     FALSE         FALSE  FALSE
#> 3                     FALSE         FALSE  FALSE
#> 4                     FALSE         FALSE  FALSE
#> 5                     FALSE         FALSE  FALSE
#>   moderate_or_severe_liver_disease metastatic_carcinoma aids_hiv
#> 1                            FALSE                FALSE    FALSE
#> 2                            FALSE                FALSE    FALSE
#> 3                            FALSE                FALSE    FALSE
#> 4                            FALSE                FALSE    FALSE
#> 5                            FALSE                FALSE    FALSE

Example: Using ICD-10 Codes

If your dataset contains ICD-10 codes, you can use the icd10_to_comorbid function to calculate comorbidities:

# Example data with ICD-10 codes
icd10_data <- data.frame(
  patient_id = c(1, 1, 2, 2, 3),
  icd_code = c("E11", "I10", "E11", "I50", "I21")
)

# Use the Quan-Elixhauser index for ICD-10 codes
mapping <- "quan_elixhauser10"

# Calculate comorbidities for the ICD-10 data
icd10_comorbidities <- icd10_to_comorbid(
  df = icd10_data,
  idx = "patient_id",
  icd_cols = "icd_code",
  mapping = mapping,
  batch_size = 2
)

# Display the comorbidity results
head(icd10_comorbidities)
#>   patient_id congestive_heart_failure cardiac_arrhythmia valvular_disease
#> 1          1                    FALSE              FALSE            FALSE
#> 2          1                    FALSE              FALSE            FALSE
#> 3          2                    FALSE              FALSE            FALSE
#> 4          2                     TRUE              FALSE            FALSE
#> 5          3                    FALSE              FALSE            FALSE
#>   pulmonary_circulation_disorder peripheral_vascular_disorder
#> 1                          FALSE                        FALSE
#> 2                          FALSE                        FALSE
#> 3                          FALSE                        FALSE
#> 4                          FALSE                        FALSE
#> 5                          FALSE                        FALSE
#>   hypertension_uncomplicated hypertension_complicated paralysis
#> 1                      FALSE                    FALSE     FALSE
#> 2                       TRUE                    FALSE     FALSE
#> 3                      FALSE                    FALSE     FALSE
#> 4                      FALSE                    FALSE     FALSE
#> 5                      FALSE                    FALSE     FALSE
#>   other_neurological_disorder chronic_pulmonary_disease diabetes_uncomplicated
#> 1                       FALSE                     FALSE                  FALSE
#> 2                       FALSE                     FALSE                  FALSE
#> 3                       FALSE                     FALSE                  FALSE
#> 4                       FALSE                     FALSE                  FALSE
#> 5                       FALSE                     FALSE                  FALSE
#>   diabetes_complicated hypothyroidism renal_failure liver_disease
#> 1                FALSE          FALSE         FALSE         FALSE
#> 2                FALSE          FALSE         FALSE         FALSE
#> 3                FALSE          FALSE         FALSE         FALSE
#> 4                FALSE          FALSE         FALSE         FALSE
#> 5                FALSE          FALSE         FALSE         FALSE
#>   peptic_ulcer_disease_excluding_bleeding aids_hiv lymphoma metastatic_cancer
#> 1                                   FALSE    FALSE    FALSE             FALSE
#> 2                                   FALSE    FALSE    FALSE             FALSE
#> 3                                   FALSE    FALSE    FALSE             FALSE
#> 4                                   FALSE    FALSE    FALSE             FALSE
#> 5                                   FALSE    FALSE    FALSE             FALSE
#>   solid_tumor_wo_metastasis rheumatoid_arhritis coagulopathy obesity
#> 1                     FALSE               FALSE        FALSE   FALSE
#> 2                     FALSE               FALSE        FALSE   FALSE
#> 3                     FALSE               FALSE        FALSE   FALSE
#> 4                     FALSE               FALSE        FALSE   FALSE
#> 5                     FALSE               FALSE        FALSE   FALSE
#>   weight_loss fluid_and_electrolyte_disorders blood_loss_anemia
#> 1       FALSE                           FALSE             FALSE
#> 2       FALSE                           FALSE             FALSE
#> 3       FALSE                           FALSE             FALSE
#> 4       FALSE                           FALSE             FALSE
#> 5       FALSE                           FALSE             FALSE
#>   deficiency_anemia alcohol_abuse drug_abuse psychoses depression
#> 1             FALSE         FALSE      FALSE     FALSE      FALSE
#> 2             FALSE         FALSE      FALSE     FALSE      FALSE
#> 3             FALSE         FALSE      FALSE     FALSE      FALSE
#> 4             FALSE         FALSE      FALSE     FALSE      FALSE
#> 5             FALSE         FALSE      FALSE     FALSE      FALSE

Example: Custom Mapping

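Instead of the name of a built-in index, the mapping argument also accepts a named list in which each name is a comorbidity and each element is a vector of ICD codes. This example reuses icd9_data from the ICD-9 example.

# Custom mapping: each name is a comorbidity, each element lists its ICD-9 codes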
custom_mapping <- list(
  "Hypertension" = c("4010", "4011", "4019"),
  "Diabetes" = c("2500", "2501", "2502")
)

# Map ICD-9 codes to comorbidities using custom mapping
comorbidities_custom <- icd9_to_comorbid(
  df = icd9_data,
  idx = "patient_id",
  icd_cols = "icd9_code",
  mapping = custom_mapping,
  batch_size = 2
)

# Display the comorbidity results
head(comorbidities_custom)
#>   patient_id Hypertension Diabetes
#> 1          1         TRUE    FALSE
#> 2          1        FALSE     TRUE
#> 3          2        FALSE    FALSE
#> 4          2        FALSE    FALSE
#> 5          3        FALSE    FALSE
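
The results above contain one row per input row, so a patient with several rows appears several times. The base R sketch below shows one way to flag codes against a custom mapping and then collapse the flags to one row per patient. It assumes exact code matching with %in%, which may differ from how icd9_to_comorbid matches codes, and it does not call the package.

```r
# Same toy data and custom mapping as in the example above
icd9_data <- data.frame(
  patient_id = c(1, 1, 2, 2, 3),
  icd9_code = c("4010", "2500", "4140", "4280", "4930")
)
custom_mapping <- list(
  "Hypertension" = c("4010", "4011", "4019"),
  "Diabetes" = c("2500", "2501", "2502")
)

# Row-level flags: TRUE where the row's code is listed for that comorbidity
# (exact matching is an assumption made for this sketch)
flags <- as.data.frame(lapply(custom_mapping, function(codes) {
  icd9_data$icd9_code %in% codes
}))
row_level <- cbind(icd9_data["patient_id"], flags)

# Patient-level flags: a comorbidity is TRUE if any of the patient's rows is TRUE
patient_level <- aggregate(
  row_level[names(custom_mapping)],
  by = row_level["patient_id"],
  FUN = any
)
print(patient_level)
```

Reshaping to wide format first, as described earlier, is the alternative route to one row per patient.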

Identifying Episodes of Care

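The episode_of_care function combines records from the Discharge Abstract Database (DAD) and the National Ambulatory Care Reporting System (NACRS) and assigns each record an episode-of-care number within its patient, which is useful for analyzing patient treatment over time.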

# Example data with admit and discharge dates for DAD and NACRS
dad_data <- data.frame(
  patient_id = c(1, 1, 2),
  dad_admit = as.POSIXct(
    c("2023-01-01 10:00:00", "2023-02-01 09:00:00", "2023-01-15 08:00:00"),
    tz = "UTC"
  ),
  dad_dis = as.POSIXct(
    c("2023-01-10 15:00:00", "2023-02-10 14:00:00", "2023-01-20 12:00:00"),
    tz = "UTC"
  )
)

nacrs_data <- data.frame(
  patient_id = c(1, 2, 2),
  nacrs_admit = as.POSIXct(
    c("2023-01-15 10:00:00", "2023-01-25 09:00:00", "2023-03-01 08:00:00"),
    tz = "UTC"
  ),
  nacrs_dis = as.POSIXct(
    c("2023-01-20 15:00:00", "2023-01-30 14:00:00", "2023-03-05 12:00:00"),
    tz = "UTC"
  )
)

# Creating episodes of care
episodes <- episode_of_care(
  dad_data, nacrs_data,
  patient_id_col = "patient_id",
  dad_visit_date_col = "dad_admit",
  dad_exit_date_col = "dad_dis",
  nacrs_visit_date_col = "nacrs_admit",
  nacrs_exit_date_col = "nacrs_dis"
)
head(episodes)
#>   record_id patient_id           dad_admit             dad_dis
#> 1         1          1 2023-01-01 10:00:00 2023-01-10 15:00:00
#> 2         2          1                <NA>                <NA>
#> 3         3          1 2023-02-01 09:00:00 2023-02-10 14:00:00
#> 4         4          2 2023-01-15 08:00:00 2023-01-20 12:00:00
#> 5         5          2                <NA>                <NA>
#> 6         6          2                <NA>                <NA>
#>           nacrs_admit           nacrs_dis source episode_of_care
#> 1                <NA>                <NA>    DAD               1
#> 2 2023-01-15 10:00:00 2023-01-20 15:00:00  NACRS               2
#> 3                <NA>                <NA>    DAD               3
#> 4                <NA>                <NA>    DAD               1
#> 5 2023-01-25 09:00:00 2023-01-30 14:00:00  NACRS               2
#> 6 2023-03-01 08:00:00 2023-03-05 12:00:00  NACRS               3
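
The rule that episode_of_care uses to decide when a new episode starts is not stated here. As a rough illustration of the general idea, the base R sketch below starts a new episode whenever a visit begins more than a chosen number of hours after the patient's latest discharge so far. The helper assign_episodes_sketch and its max_gap_hours parameter (12 hours by default) are assumptions made for this sketch and are not part of the package.

```r
# Hypothetical helper: number episodes within each patient based on the gap
# between a visit's admission and the latest discharge seen so far.
assign_episodes_sketch <- function(patient_id, admit, discharge,
                                   max_gap_hours = 12) {
  # Sort visits by patient, then by admission time
  ord <- order(patient_id, admit)
  pid <- patient_id[ord]
  adm <- admit[ord]
  dis <- discharge[ord]
  episode <- integer(length(pid))
  for (i in seq_along(pid)) {
    if (i == 1 || pid[i] != pid[i - 1]) {
      # First visit of a patient always opens episode 1
      episode[i] <- 1L
      latest_dis <- dis[i]
    } else {
      gap <- as.numeric(difftime(adm[i], latest_dis, units = "hours"))
      # A gap above the threshold opens a new episode
      episode[i] <- episode[i - 1] + as.integer(gap > max_gap_hours)
      latest_dis <- max(latest_dis, dis[i])
    }
  }
  data.frame(patient_id = pid, admit = adm, discharge = dis,
             episode_of_care = episode)
}

# The six DAD and NACRS visits from the example above, stacked into one table
visits <- data.frame(
  patient_id = c(1, 1, 2, 1, 2, 2),
  admit = as.POSIXct(
    c("2023-01-01 10:00:00", "2023-02-01 09:00:00", "2023-01-15 08:00:00",
      "2023-01-15 10:00:00", "2023-01-25 09:00:00", "2023-03-01 08:00:00"),
    tz = "UTC"
  ),
  discharge = as.POSIXct(
    c("2023-01-10 15:00:00", "2023-02-10 14:00:00", "2023-01-20 12:00:00",
      "2023-01-20 15:00:00", "2023-01-30 14:00:00", "2023-03-05 12:00:00"),
    tz = "UTC"
  )
)
episodes_sketch <- assign_episodes_sketch(
  visits$patient_id, visits$admit, visits$discharge
)
```

In this toy data every visit starts several days after the previous discharge, so each visit becomes its own episode under the assumed threshold.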

icdcomorbid_vignette

April Nguyen

2024-07-31
