> age
[1] 20-35yrs 20-35yrs 35-55yrs 35-55yrs 20-35yrs 55+yrs 20-35yrs 35-55yrs
> age <- factor(age, labels=c("20-35yrs","35-55yrs","55+yrs"))
> age <- as.factor(age)
The function as.factor() may also be used to convert a numeric object to a factor object; however, it is not possible to assign labels to the factor levels using as.factor().
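The difference between the two conversions can be sketched as follows. This is a minimal illustration that assumes R, a present-day implementation of the S language, rather than the version of S described here; the numeric codes in age.codes are an assumption, chosen so that the labelled result matches the age output printed at the start of this section.

```r
# Assumed numeric age-group codes (1, 2, 3); not given explicitly in the text.
age.codes <- c(1, 1, 2, 2, 1, 3, 1, 2)

# factor() attaches descriptive labels to the codes.
age.labelled <- factor(age.codes, labels = c("20-35yrs", "35-55yrs", "55+yrs"))

# as.factor() also converts, but the levels are simply the codes as strings.
age.plain <- as.factor(age.codes)

levels(age.labelled)   # the three descriptive labels
levels(age.plain)      # "1" "2" "3"
```

If labels are wanted, factor() is therefore the function to use; as.factor() is a convenient shortcut only when the raw codes are acceptable as level names.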
> age.groups <- cut(age, breaks=c(20,35,55,80), labels=c("20-35yrs","35-55yrs","55+yrs"))
The function cut() creates an object of mode category. Category objects were used by the previous version of S and have now been replaced by factor objects. The category object age.groups can be converted to a factor object using the function factor().
> age <- factor(age.groups)
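For comparison, here is a hedged sketch of the same grouping in R, where cut() returns a factor directly and no category object is involved. The ages are the ones used elsewhere in this section; that they are the values being grouped at this point is an assumption.

```r
# Ages in years (the vector used in the later examples of this section).
age <- c(22, 31, 37, 52, 27, 60, 34, 53)

# Explicit break points; by default each interval is open on the left and
# closed on the right, e.g. (20,35], so an age of 35 falls in the first group.
age.groups <- cut(age, breaks = c(20, 35, 55, 80),
                  labels = c("20-35yrs", "35-55yrs", "55+yrs"))

is.factor(age.groups)      # TRUE in R: already a factor
as.character(age.groups)   # group label for each person
```

With these break points the first, second, fifth and seventh persons fall in the youngest group, and only the person aged 60 falls in the oldest.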
The function cut() may also be used to split a data object into groups of equal width:
> age <- c(22,31,37,52,27,60,34,53)
> cut(age, breaks=3)
[1] 1 1 2 3 1 3 1 3
attr(, "levels"):
[1] "Range 1" "Range 2" "Range 3"
A category object is a vector of integers with a levels attribute. Level 1 has levels attribute "Range 1", level 2 has levels attribute "Range 2" and so on. When a category object is converted to a factor object, the levels attribute becomes the factor labels.
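The equal-width split can also be tried in R, as sketched below under the assumption that R is used instead of the S version described here. In R the result is a factor whose default labels are written in interval notation, not "Range 1", "Range 2", ...; the labels= argument in the second call is one way to mimic the S naming.

```r
age <- c(22, 31, 37, 52, 27, 60, 34, 53)

# Three groups of equal width spanning the range of the data.
groups <- cut(age, breaks = 3)

# The underlying integer codes play the role of the category values.
codes <- as.integer(groups)
codes

# Optional: supply labels that imitate the "Range n" naming shown above.
ranges <- cut(age, breaks = 3, labels = paste("Range", 1:3))
levels(ranges)
```

The integer codes agree with the category values printed above: the range 22 to 60 is divided at roughly 34.7 and 47.3.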
The function pretty() creates "pretty" break points which can then be used by the function cut() to split the data:
> age <- c(22,31,37,52,27,60,34,53)
> cut(age,pretty(age))
[1] 1 2 2 4 1 4 2 4
attr(, "levels"):
[1] "20+ thru 30" "30+ thru 40" "40+ thru 50" "50+ thru 60"
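A short sketch of the same idea in R (an assumption; the text describes an earlier version of S) makes the intermediate break points visible before they are passed to cut().

```r
age <- c(22, 31, 37, 52, 27, 60, 34, 53)

# Round break points covering the range of the data.
brks <- pretty(age)
brks

# Group the ages using those break points; intervals are closed on the right,
# so an age of 60 belongs to the last group.
groups <- cut(age, brks)
codes <- as.integer(groups)
codes
```

For these ages the break points are 20, 30, 40, 50 and 60, and nobody falls in the 40 to 50 group, which is why code 3 never appears.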
> age <- ordered(age, levels=c(1,2,3), labels=c("20-35yrs","35-55yrs","55+yrs"))
> age
[1] 20-35yrs 20-35yrs 35-55yrs 35-55yrs 20-35yrs 55+yrs 20-35yrs 35-55yrs
20-35yrs < 35-55yrs < 55+yrs
> ordered(age) <- c("20-35yrs","35-55yrs","55+yrs")
> age <- c(22,31,37,52,27,60,34,53)
> age <- cut(age,pretty(age))
> age <- factor(age)
> age <- ordered(age)
> age
[1] 20+ thru 30 30+ thru 40 30+ thru 40 50+ thru 60 20+ thru 30 50+ thru 60
[7] 30+ thru 40 50+ thru 60
20+ thru 30 < 30+ thru 40 < 50+ thru 60
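What the ordering adds can be sketched in R (assumed here in place of the S version in the text): once the levels are ordered, comparisons and functions such as max() respect that order. The numeric codes in age.codes are an assumption chosen to match the age groups printed earlier.

```r
# Assumed numeric age-group codes (1, 2, 3).
age.codes <- c(1, 1, 2, 2, 1, 3, 1, 2)

# An ordered factor records that 20-35yrs < 35-55yrs < 55+yrs.
age.ord <- ordered(age.codes, levels = c(1, 2, 3),
                   labels = c("20-35yrs", "35-55yrs", "55+yrs"))

is.ordered(age.ord)              # TRUE
younger <- age.ord < "55+yrs"    # comparison uses the level order
sum(younger)                     # number of persons below the top group
max(age.ord)                     # highest age group present
```

An unordered factor would not support these comparisons meaningfully, so ordered() is worth using whenever the groups have a natural ranking.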
> table(age)
 20-35yrs 35-55yrs 55+yrs
        4        3      1
Any number of arguments may be given to the table() function:
> sex <- factor(c(1,2,2,1,2,1,2,2), labels=c("Female","Male"))
> table(sex, age)
       20-35yrs 35-55yrs 55+yrs
Female        1        1      1
  Male        3        2      0
The function tapply() applies a function to each cell of a table. Suppose we wished to report the mean systolic blood pressure for persons in each of the age/sex groups:
> systol <- c(118, 125, 128, 127, 110, 140, 130, 120)
> tapply(systol, list(sex, age), mean)
       20-35yrs 35-55yrs 55+yrs
Female 118.0000      127    140
  Male 121.6667      124     NA
The second argument to the tapply() function gives the indices (here the factors sex and age) over which the mean systolic blood pressures are to be calculated. The Male, 55+yrs cell is NA because no observations fall in it.
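The cross-tabulation and the cell means can be reproduced together with the sketch below. It assumes R instead of the S version described in the text, and the numeric codes used to build age are an assumption chosen to match the age groups printed earlier; sex and systol are taken from the examples above.

```r
# Age group (assumed codes), sex, and systolic blood pressure for 8 persons.
age <- factor(c(1, 1, 2, 2, 1, 3, 1, 2),
              labels = c("20-35yrs", "35-55yrs", "55+yrs"))
sex <- factor(c(1, 2, 2, 1, 2, 1, 2, 2), labels = c("Female", "Male"))
systol <- c(118, 125, 128, 127, 110, 140, 130, 120)

# Counts in each sex by age cell.
counts <- table(sex, age)

# Mean systolic blood pressure in each cell; the list of factors defines the cells.
means <- tapply(systol, list(sex, age), mean)

counts["Male", "20-35yrs"]   # three men in the youngest group
means["Male", "20-35yrs"]    # mean of 125, 110 and 130
means["Male", "55+yrs"]      # NA: the cell is empty
```

Indexing the results by level name, as in the last three lines, is usually safer than indexing by position because it does not depend on the order of the levels.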
Richard A. Becker, John M. Chambers, Allan R. Wilks, The New S Language. A Programming Environment for Data Analysis and Graphics, Wadsworth & Brooks/Cole Advanced Books & Software, Pacific Grove, California, 1988, pp. 134-138.