ggsurvplot Interprets Number of Observations as Levels in newdata Argument #659

Open
yukiyaama opened this issue Jul 12, 2024 · 0 comments

I am experiencing an issue with the ggsurvplot function from the survminer package in R. When I plot survival curves from a Cox model by passing survfit(..., newdata = ...) to ggsurvplot, the function appears to treat the number of observations as the number of factor levels and stops with an error about the length of legend.labs. When I call survfit without newdata, ggsurvplot appears to see a single group instead of the levels of the factor variable and stops with a similar error.

Steps to Reproduce:
Data Preparation:
Clean and prepare the dataset ensuring the factor variable is correctly defined.

d.dat.tot.clean <- d.dat.tot[d.dat.tot$DEAD_OR_ALIVE < 2, ]
d.dat.tot.clean$GENOTYPE.111758446 <- as.factor(d.dat.tot.clean$GENOTYPE.111758446)

Cox Model Fitting:
Fit a Cox proportional hazards model.

cox_model_single_111758446 <- coxph(Surv(DONOR_SURVIVAL_TIME, DEAD_OR_ALIVE) ~ GENOTYPE.111758446, data = d.dat.tot.clean)

Create Survival Curves:
Create survival curves using the survfit function.

surv_fit <- survfit(cox_model_single_111758446)

Plot Survival Curves:
Attempt to plot the survival curves using ggsurvplot.

cox_plot_single_111758446 <- ggsurvplot(
  surv_fit,
  data = d.dat.tot.clean,
  pval = TRUE,
  conf.int = TRUE,
  risk.table = TRUE,
  legend.title = "Genotype",
  legend.labs = levels(d.dat.tot.clean$GENOTYPE.111758446),
  xlab = "Time to Last Follow-up, mo",
  ylab = "Cumulative Survival, %",
  ggtheme = theme_minimal(),
  palette = "set2"
)
print(cox_plot_single_111758446)

Observed Behavior:
With newdata Argument:

cox_plot_single_111758446 <- ggsurvplot(
  survfit(cox_model_single_111758446, newdata = d.dat.tot.clean),
  data = d.dat.tot.clean,
  pval = TRUE,
  conf.int = TRUE,
  risk.table = TRUE,
  legend.title = "Genotype",
  legend.labs = levels(d.dat.tot.clean$GENOTYPE.111758446),
  xlab = "Time to Last Follow-up, mo",
  ylab = "Cumulative Survival, %",
  ggtheme = theme_minimal(),
  palette = "set2"
)

This results in the error:
Error in ggsurvplot_df(d, fun = fun, color = color, palette = palette, :
The length of legend.labs should be 236

(236 is the number of cases I have.)
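
A possible explanation, offered as a sketch rather than a confirmed diagnosis: survfit() on a coxph model returns one predicted curve per row of newdata, so passing the full data set requests one curve per case. The snippet below uses simulated data with hypothetical column names (time, status, genotype) and compares that with a newdata frame holding one row per genotype level.

```r
library(survival)

# Simulated stand-in data; these column names are hypothetical, not from the issue.
set.seed(123)
d <- data.frame(
  time     = rexp(100, 0.1),
  status   = sample(0:1, 100, replace = TRUE),
  genotype = factor(sample(c("A/A", "G/A", "G/G"), 100, replace = TRUE))
)
fit <- coxph(Surv(time, status) ~ genotype, data = d)

# Passing the full data set as newdata requests one predicted curve per row.
per_row <- survfit(fit, newdata = d)
n_row_curves <- NCOL(per_row$surv)

# Passing one row per factor level requests one curve per genotype.
lev <- data.frame(
  genotype = factor(levels(d$genotype), levels = levels(d$genotype))
)
per_level <- survfit(fit, newdata = lev)
n_level_curves <- NCOL(per_level$surv)

cat("curves with newdata = d:  ", n_row_curves, "\n")
cat("curves with newdata = lev:", n_level_curves, "\n")
```

If per_level is then passed to ggsurvplot together with data = lev, legend.labs = levels(d$genotype) would have a matching length. Whether pval and risk.table behave sensibly for model-based curves is not something this sketch checks.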

Without newdata Argument:
The function fails to recognize the levels of the factor variable correctly and returns:
Error in ggsurvplot_df(d, fun = fun, color = color, palette = palette, :
The length of legend.labs should be 1
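
The "should be 1" message may follow from how survfit() treats a Cox model when no newdata is given: it produces a single curve for one reference covariate pattern, not one curve per factor level. A minimal check, again on simulated data with hypothetical column names:

```r
library(survival)

# Simulated stand-in data; these column names are hypothetical, not from the issue.
set.seed(123)
d <- data.frame(
  time     = rexp(100, 0.1),
  status   = sample(0:1, 100, replace = TRUE),
  genotype = factor(sample(c("A/A", "G/A", "G/G"), 100, replace = TRUE))
)
fit <- coxph(Surv(time, status) ~ genotype, data = d)

# Without newdata there is a single predicted curve and no strata component,
# so a plotting function has only one group to label.
single <- survfit(fit)
n_curves   <- NCOL(single$surv)
has_strata <- !is.null(single$strata)

cat("number of curves:", n_curves, "\n")
cat("has strata:      ", has_strata, "\n")
```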

Expected Behavior:
The function should correctly interpret the levels of the factor variable and plot the survival curves without error.
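
For comparison, the options used above (pval, risk.table, one legend label per genotype) match what ggsurvplot does with a Kaplan-Meier fit stratified by the factor. The sketch below is an alternative under that assumption, not a fix for the Cox-based call; the data are simulated and the column names are hypothetical.

```r
library(survival)

# Simulated stand-in data; these column names are hypothetical, not from the issue.
set.seed(123)
d <- data.frame(
  time     = rexp(100, 0.1),
  status   = sample(0:1, 100, replace = TRUE),
  genotype = factor(sample(c("A/A", "G/A", "G/G"), 100, replace = TRUE))
)

# A Kaplan-Meier fit with the factor on the right-hand side has one stratum per level.
km <- survfit(Surv(time, status) ~ genotype, data = d)
strata_names <- names(km$strata)
print(strata_names)

# Plot only if survminer is installed; legend.labs now matches the strata count.
if (requireNamespace("survminer", quietly = TRUE)) {
  library(survminer)
  p <- ggsurvplot(
    km,
    data = d,
    pval = TRUE,
    conf.int = TRUE,
    risk.table = TRUE,
    legend.title = "Genotype",
    legend.labs = levels(d$genotype)
  )
  print(p)
}
```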

Environment:

R version: R 4.2.3 GUI 1.79 High Sierra build (8198)
survminer version: 0.4.9.999
survival package version: 3.7.0
Operating system: macOS 10.15.7 

Any help or guidance on how to resolve this issue would be greatly appreciated.
Thank you so much!!!!

Reproducible example:

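# Load packages (survminer attaches ggplot2, which provides theme_minimal)
library(survival)
library(survminer)

# Sample reproducible data
set.seed(123)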
d.dat.tot <- data.frame(
  DONOR_SURVIVAL_TIME = rexp(100, 0.1),
  DEAD_OR_ALIVE = sample(0:1, 100, replace = TRUE),
  GENOTYPE.111758446 = sample(c("A/A", "G/A", "G/G"), 100, replace = TRUE)
)

# Data preparation
d.dat.tot.clean <- d.dat.tot[d.dat.tot$DEAD_OR_ALIVE < 2, ]
d.dat.tot.clean$GENOTYPE.111758446 <- as.factor(d.dat.tot.clean$GENOTYPE.111758446)

# Cox model fitting
cox_model_single_111758446 <- coxph(Surv(DONOR_SURVIVAL_TIME, DEAD_OR_ALIVE) ~ GENOTYPE.111758446, data = d.dat.tot.clean)

# Survival curves creation
surv_fit <- survfit(cox_model_single_111758446)

# Plotting survival curves
cox_plot_single_111758446 <- ggsurvplot(
  surv_fit,
  data = d.dat.tot.clean,
  pval = TRUE,
  conf.int = TRUE,
  risk.table = TRUE,
  legend.title = "Genotype",
  legend.labs = levels(d.dat.tot.clean$GENOTYPE.111758446),
  xlab = "Time to Last Follow-up, mo",
  ylab = "Cumulative Survival, %",
  ggtheme = theme_minimal(),
  palette = "set2"
)

print(cox_plot_single_111758446)