plot() function in R disorganizes categorical x-values from .csv
I am trying to create a barplot of household income, but the x-values are in the wrong order.
Here are the household income categories that I got from my work's API: "0-15k", "15k-25k", "25k-35k", "35k-50k", "50k-75k", "75k-100k", "100k-125k", "125k-150k", "150k-175k", "175k-200k", "200k-250k", "250k+". I took the names out of the emails for privacy and used the first 25 rows only.
My script:
    # load data
    store2 <- read.csv("/users/documents/work/data/client2.csv", na.strings = "", header = TRUE)

    # first 25 rows
                  email   age gender householdincome maritalstatus
    1          @aol.com  <NA>   male            <NA>          <NA>
    2        @yahoo.com 45-54 female            <NA>          <NA>
    3    @stratatec.com  <NA>   <NA>            <NA>          <NA>
    4        @gmail.com  <NA>   <NA>            <NA>          <NA>
    5       5@yahoo.com 45-54 female        75k-100k       married
    6          @aol.com 25-34   male        75k-100k       married
    7        @yahoo.com 35-44 female       125k-150k       married
    8   d@sbcglobal.net 55-64   male        75k-100k       married
    9        @yahoo.com   65+ female         25k-35k       married
    10          @me.com  <NA> female            <NA>          <NA>
    11   @sunupcorp.com  <NA> female            <NA>          <NA>
    12       @yahoo.com 45-54   male        75k-100k       married
    13         @att.net  <NA>   <NA>            <NA>          <NA>
    14     @verizon.net  <NA>   male            <NA>          <NA>
    15       @yahoo.com 45-54   male         50k-75k          <NA>
    16       @gmail.com 45-54   male         50k-75k          <NA>
    17  @roadrunner.com 45-54 female         15k-25k        single
    18         @aol.com 35-44   male         50k-75k        single
    19       @yahoo.com 45-54   male       125k-150k        single
    20         @aol.com  <NA>   <NA>            <NA>          <NA>
    21       @gmail.com 25-34   male            <NA>          <NA>
    22       @yahoo.com 25-34   male         50k-75k        single
    23       @gmail.com 55-64   male       150k-175k       married
    24 @trellnjoyce.com  <NA> female         35k-50k       married
    25         @aol.com   65+   male         50k-75k       married

Edit: I made changes to the plot, but now the x-axis label is in the way.
    # plot of household income
    res <- ordered(store2$householdincome,
                   levels = c("0-15k", "15k-25k", "25k-35k", "35k-50k",
                              "50k-75k", "75k-100k", "100k-125k", "125k-150k",
                              "150k-175k", "175k-200k", "200k-250k", "250k+"))

    # set dimensions
    par(mar = c(8, 4, 4, 3))

    # create plot
    plot(res, main = "distribution of household income", xlab = "",
         ylab = "density", las = 2, ylim = c(0, 2000))
    mtext(text = "householdincome", side = 1, line = 6)

Make sure you order the levels of the householdincome factor, e.g., like so:
    res <- ordered(store2$householdincome,
                   levels = c("0-15k", "15k-25k", "25k-35k", "35k-50k",
                              "50k-75k", "75k-100k", "100k-125k", "125k-150k",
                              "150k-175k", "175k-200k", "200k-250k", "250k+"))
    par(mar = c(10, 3, 3, 3))
    plot(res, main = "distribution of household income", xlab = "",
         ylab = "density", las = 2)
    mtext(text = "householdincome", side = 1, line = 3)
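
To see why the explicit levels matter, here is a minimal sketch that does not need the .csv file. The `income` vector is hypothetical sample data standing in for store2$householdincome, and `income_levels` and `counts` are names made up for this illustration. A plain factor() sorts its levels as strings, while ordered() with explicit levels keeps the order you give it, including categories that have no observations.

```r
# Hypothetical sample values standing in for store2$householdincome
income <- c("75k-100k", "125k-150k", "25k-35k", "50k-75k",
            "15k-25k", "50k-75k", NA)

# The category order we want on the x-axis (taken from the question)
income_levels <- c("0-15k", "15k-25k", "25k-35k", "35k-50k",
                   "50k-75k", "75k-100k", "100k-125k", "125k-150k",
                   "150k-175k", "175k-200k", "200k-250k", "250k+")

# Default behaviour: levels are sorted as character strings,
# not by the income range they represent
print(levels(factor(income)))

# Explicit levels: the factor keeps exactly the order given above
res <- ordered(income, levels = income_levels)

# table() counts each level in level order and drops NA by default;
# levels with no observations show up with a count of 0
counts <- table(res)
print(counts)
```

Since plot() on a factor draws a barplot of these counts, the bars come out in the same order as `income_levels`. Passing `counts` to barplot() with las = 2 should give the same picture.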