R: How to summarize data from multiple ordered factors into one variable
I have data representing the severity of patients' asthma symptoms under different conditions. The severity variables are ordered factors, all with the same levels (mild < moderate < severe). Here is a simplified example:
# create example data frame
df <- data.frame(
  id = c(1:5),
  daytime = c("mild", "severe", "mild", "moderate", "moderate"),   # severity of daytime symptoms
  sleep = c("moderate", NA, "mild", "mild", "moderate"),           # severity of nighttime symptoms
  activity = c("mild", "moderate", "mild", "moderate", "severe")   # severity of symptoms during activity
)

# specify order of factor levels
df$daytime <- ordered(df$daytime, levels = c("mild", "moderate", "severe"))
df$sleep <- ordered(df$sleep, levels = c("mild", "moderate", "severe"))
df$activity <- ordered(df$activity, levels = c("mild", "moderate", "severe"))

df
The resulting data frame looks like this:

  id  daytime    sleep activity
1  1     mild moderate     mild
2  2   severe     <NA> moderate
3  3     mild     mild     mild
4  4 moderate     mild moderate
5  5 moderate moderate   severe
I'm trying to create an "overall severity" variable, where a patient's overall severity is the most severe symptoms reported in any of the 3 categories (daytime, sleep, and activity). That is, "overall" equals the highest level of "daytime", "sleep", and "activity". The result would look like this:

  id  daytime    sleep activity  overall
1  1     mild moderate     mild moderate
2  2   severe     <NA> moderate   severe
3  3     mild     mild     mild     mild
4  4 moderate     mild moderate moderate
5  5 moderate moderate   severe   severe
I'd like to do this without writing a big, clunky for loop, but I can't figure out how. I thought maybe ave() would work, but it doesn't seem to work on multiple variables at once:
> df$overall <- ave(c(df$daytime, df$sleep, df$activity),
+                   df$id,
+                   FUN = function(i) max(i, na.rm=T)
+ )
Error in `$<-.data.frame`(`*tmp*`, "worst", value = c(2L, 3L, 1L, 2L, :
  replacement has 15 rows, data has 5
Is there an apply function that can do this?
One quick way of doing this would be:

df$overall <- apply(df[,2:4], 1, max, na.rm=TRUE)
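One caveat about the apply() one-liner: apply() first converts the data frame to a character matrix, so max() compares the labels as strings. That gives the right answer here only because "mild", "moderate", and "severe" happen to sort alphabetically in the same order as the factor levels, and the result is a character vector rather than an ordered factor. A minimal sketch of an alternative that relies on the level order instead is below. The names sev_levels, sev_cols, codes, and worst are made up for this illustration and are not part of the original question.

```r
# Shared level order, weakest to strongest
sev_levels <- c("mild", "moderate", "severe")

# Same example data as in the question
df <- data.frame(
  id       = 1:5,
  daytime  = c("mild", "severe", "mild", "moderate", "moderate"),
  sleep    = c("moderate", NA, "mild", "mild", "moderate"),
  activity = c("mild", "moderate", "mild", "moderate", "severe")
)

# Convert the three severity columns to ordered factors in one step
sev_cols <- c("daytime", "sleep", "activity")
df[sev_cols] <- lapply(df[sev_cols], ordered, levels = sev_levels)

# Integer codes follow the level order (1 = mild, 2 = moderate, 3 = severe)
codes <- unname(lapply(df[sev_cols], as.integer))

# Element-wise (row-wise) maximum of the codes, ignoring missing values
worst <- do.call(pmax, c(codes, na.rm = TRUE))

# Map the codes back to labels and restore the ordered factor
df$overall <- ordered(sev_levels[worst], levels = sev_levels)

df
```

Because this works on the integer codes, it should keep working if the labels are renamed to something that does not sort alphabetically (for example "low" < "medium" < "high"), and a row where all three columns are NA comes out as NA.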