> x <- c(1,2,3,3,3,4,7,8,9,NA)
* when there are missing values in the data, the functions max(), min(),
  range(), mean(), and median() return NA, and the functions var(),
  cor(), and quantile() return an error message
> max(x, na.rm=T)
[1] 9
* specifying na.rm=T in the function max() forces Splus to remove any
  missing values from the vector x and to return the maximum value in x
> min(x, na.rm=T)
[1] 1
> range(x, na.rm=T)
[1] 1 9
> mean(x, na.rm=T)
[1] 4.444444
> mean(x, trim=0.2, na.rm=T)
[1] 4.285714
* the argument trim is the fraction of observations (any value between
  0 and 0.5 inclusive) to be trimmed from each end of the ordered data
  before the mean is computed
* if trim=0.5, the result is the median
> median(x, na.rm=T)
[1] 3
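The trimmed mean above can be reproduced by hand, which may make the meaning of trim clearer. The following is a minimal sketch written for R, on the assumption that the same calls behave identically in S-Plus; the names xs, k, by.hand, and built.in are illustrative only.

```r
x <- c(1, 2, 3, 3, 3, 4, 7, 8, 9, NA)

xs <- sort(x[!is.na(x)])         # the 9 non-missing values, in order
k  <- floor(length(xs) * 0.2)    # observations dropped from each end (here 1)

# Mean of what remains after dropping k values from each end
by.hand  <- mean(xs[(k + 1):(length(xs) - k)])
built.in <- mean(x, trim = 0.2, na.rm = TRUE)

# With trim = 0.5 the trimmed mean reduces to the median
half.trim <- mean(x, trim = 0.5, na.rm = TRUE)
```

Here by.hand is the mean of 2, 3, 3, 3, 4, 7, 8, that is 30/7, which agrees with the value printed for trim=0.2.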
> quantile(x, probs=c(0,0.1,0.9), na.rm=T)
 0% 10% 90%
  1 1.8 8.2
* the function quantile() returns the quantiles of x specified in the
  argument probs
If there are no missing values in the vector x, it is not necessary to specify
na.rm=T; simply use min(x), max(x), etc. These functions may also be used on
matrices; they are not applied to the rows or columns individually but instead
find the max, min, etc. of the whole matrix.
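To illustrate the difference between a whole-matrix summary and a row-wise or column-wise one, here is a small sketch. It uses apply(), which the text above does not mention, and a made-up 2 x 3 matrix m; it is written for R on the assumption that S-Plus behaves the same way.

```r
# Illustrative 2 x 3 matrix with one missing value (filled column by column)
m <- matrix(c(1, 5, NA, 2, 8, 3), nrow = 2)

whole  <- max(m, na.rm = TRUE)             # one value for the whole matrix
by.col <- apply(m, 2, max, na.rm = TRUE)   # one maximum per column
by.row <- apply(m, 1, max, na.rm = TRUE)   # one maximum per row
```

The second argument of apply() selects the dimension: 1 for rows, 2 for columns. Extra arguments such as na.rm are passed through to max().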
> var(x[!is.na(x)])
[1] 8.027778
* missing values are removed from the vector x using the subscript
  !is.na(x)
* specifying two arguments to the var() function, var(x,y), returns the
  covariance between the two arguments
* arguments may be vectors or matrices
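As a check on the two-argument form, the covariance can be computed directly from its definition and compared with var(). This is a sketch with made-up vectors u and v (not part of the session above), written for R on the assumption that S-Plus gives the same result.

```r
u <- c(1, 2, 3, 4)
v <- c(2, 4, 6, 9)

# Sample covariance from the definition, with divisor n - 1
cov.formula <- sum((u - mean(u)) * (v - mean(v))) / (length(u) - 1)

# The same quantity from the two-argument form of var()
cov.var <- var(u, v)
```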
> y <- c(1,2,3,4,5,6,7,8,9,10)
> cor(x[!is.na(x)],y[!is.na(x)])
[1] 0.9504597
* because the cor() function requires x and y to be of the same length,
  it is necessary to remove the value of y corresponding to the missing
  value in x; this is done using y[!is.na(x)]
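When either vector might contain missing values, one option is to build a single logical mask and apply it to both. The sketch below assumes R (S-Plus should behave the same for these calls); the name ok is illustrative.

```r
x <- c(1, 2, 3, 3, 3, 4, 7, 8, 9, NA)
y <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

# TRUE only at positions where both x and y are observed
ok <- !is.na(x) & !is.na(y)

# Both vectors are subset with the same mask, so their lengths match
r <- cor(x[ok], y[ok])
```

With these data only position 10 is dropped, so r equals the correlation computed with y[!is.na(x)].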
> summary(x)
   Min. 1st Qu. Median  Mean 3rd Qu. Max. NA's
      1       3      3 4.444       7    9    1
> z <- c(5,4,3,2,1,9,8,7,6,5)
> pmax(x,y,z)
[1] 5 4 3 4 5 9 8 8 9 NA
> pmin(x,y,z)
[1] 1 2 3 2 1 4 7 7 6 NA
* pmax() returns the maximum value for each
position in a number of vectors
* likewise, pmin() returns the minimum value
* na.rm=T may also be specified to remove
missing values
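The effect of na.rm on the last position can be seen by repeating the calls with it set. This is a sketch written for R, on the assumption that S-Plus returns the same values; the names hi and lo are illustrative.

```r
x <- c(1, 2, 3, 3, 3, 4, 7, 8, 9, NA)
y <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
z <- c(5, 4, 3, 2, 1, 9, 8, 7, 6, 5)

# At position 10, x is missing, so only y and z are compared there
hi <- pmax(x, y, z, na.rm = TRUE)
lo <- pmin(x, y, z, na.rm = TRUE)
```

Only the tenth element changes: instead of NA it becomes the larger (or smaller) of 10 and 5.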
<dist>    Parameters           Defaults   Distributions
beta      shape1, shape2       -, -       Beta
binom     size, prob           -, -       Binomial
cauchy    location, scale      0, 1       Cauchy
chisq     df                   -          Chisquare
exp       rate (1/mean)        1          Exponential
f         df1, df2             -, -       F
gamma     shape                -          GAMMA
geom      prob                 -          Geometric
hyper     m, n, k              -, -, -    Hypergeometric
lnorm     mean, sd (of log)    0, 1       Lognormal
logis     location, scale      0, 1       Logistic
norm      mean, sd             0, 1       Normal
nrange    size, nevals         -, 200     Normal Range
                               -, - for rnrange
pois      lambda               -          Poisson
t         df                   -          Student's t
unif      min, max             0, 1       Uniform
weibull   shape                -          Weibull
wilcox    m, n                 -, -       Wilcoxon
For help on the d<dist>(), p<dist>(), q<dist>(), and r<dist>() functions
(density, cumulative probability, quantile, and random generation,
respectively) for each of these distributions, call help() with the name of
the distribution as it appears in the Distributions column (e.g.,
help(GAMMA)), with the following exceptions: for logis, type help(dlogis);
for nrange, type help(dnrange); for the F distribution and Student's t
distribution, type help.start(gui='motif'), click on Probability
Distributions and Random Numbers under the column Categories, then click on
F or T in the left-hand column.
> dnorm(0)
[1] 0.3989423
* returns the density at 0 for the normal distribution
> X11()
> plot(seq(-3,3,0.1), dnorm(seq(-3,3,0.1)), type="l")
* the d<dist>() functions can be used to plot the density function for
  each of the above distributions
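The same pattern carries over to the other distributions in the table. As a sketch, the exponential density with rate 1/3 (an arbitrary choice for illustration) can be drawn as follows; this is written for R, where plot() opens a default graphics device if none is open, and the names grid and dens are illustrative.

```r
grid <- seq(0, 15, 0.1)            # points at which to evaluate the density
dens <- dexp(grid, rate = 1/3)     # exponential density, rate 1/3 (mean 3)

plot(grid, dens, type = "l")       # draw the density as a line
```

The curve starts at the rate, 1/3, when the argument is 0 and decreases from there.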
> pnorm(1.96)
[1] 0.9750021
* returns the cumulative probability at 1.96 for the normal distribution
> qnorm(0.9750021)
[1] 1.96
* returns the 97.5th percentile for the normal distribution
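Because q<dist>() is the inverse of p<dist>(), the two calls above undo each other. The sketch below shows this for the standard normal, along with two common uses; it is written for R on the assumption that S-Plus agrees, and the variable names are illustrative.

```r
p <- pnorm(1.96)                      # P(Z <= 1.96)
z <- qnorm(p)                         # inverts pnorm(), recovering 1.96

two.sided  <- qnorm(c(0.025, 0.975))  # cutoffs leaving 2.5% in each tail
upper.tail <- 1 - pnorm(1.96)         # P(Z > 1.96)
```

The two cutoffs are symmetric about 0, and upper.tail is the complement of the cumulative probability shown above.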
> rnorm(5)
[1] -0.7160094 0.3953744 1.2587492 0.3022640 -0.4109508
* generates 5 random standard normal
variables
> rexp(5,1/3)
[1] 0.1204068 0.1937435 9.3637550 0.8051347 1.0450249
* generates 5 random exponential variates with rate 1/3 (mean 3);
  this could also have been written as
> rexp(5, rate=1/3)
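Since the second argument is the rate, 1/mean, a larger sample should average close to 3. The following sketch checks this; the seed and the sample size are arbitrary choices for illustration, and the code is written for R on the assumption that S-Plus behaves similarly (the random values themselves will differ between systems).

```r
set.seed(1)                          # arbitrary seed, only for reproducibility
draws <- rexp(10000, rate = 1/3)     # 10000 exponential variates, rate 1/3

sample.mean <- mean(draws)           # should be near 1/rate = 3
```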