Data can be typed directly into Splus or read in from a file. Data objects can also be the result of an expression (a combination of data objects and constants with operators and functions). Splus objects are stored in the .Data directory and are saved from one session to the next. To save a data object, a name must be assigned to it. This is done using the underscore character "_" or the two-character arrow "<-" (a less-than sign followed by a hyphen), with the name of the object on the left and the values on the right. Alternatively, the symbol "->" can be used with the values on the left and the name of the object on the right. The name must start with a letter and may contain letters, digits, and periods. Splus is case sensitive: x and X refer to two different objects. The following are examples of data assignments:

> height_175
* read as "*height* gets 175"
* assigns the value 175 to the scalar named *height*

> person_"Jo"
* character values are inserted in quotes
* if the quotes are omitted, Splus will look for a data object *Jo* to assign to *person*
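The two arrow forms of assignment described above can be tried side by side. This is only a minimal sketch: the object names are hypothetical and chosen for illustration, and the underscore form is left out because not every S dialect accepts it.

```r
# Hypothetical object names, used only for illustration.
weight.a <- 70      # name on the left, value on the right
72 -> weight.b      # value on the left, name on the right
Weight.a <- 80      # a separate object: names are case sensitive
weight.a            # prints the value 70; Weight.a is unaffected
```

Because the names differ only in case, *weight.a* and *Weight.a* are stored as two separate objects.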

> heights_c(160,140,155)
* the function c() "collects" the values 160, 140, and 155 and stores them in the vector *heights*

> people_c("Ned","Jill","Pat")
* creates a vector of names

> names(heights)_people
* the names() function assigns names to the elements of a vector
* the word people is not in quotes: it refers to the vector *people*, and not the word itself

> heights
 Ned Jill Pat
 160  140 155
* typing the name of an object by itself causes its value to be printed on the terminal

> heights["Ned"]
 Ned
 160
* when an object has a names attribute, its elements can be referred to by name

> names(heights)_NULL
* deletes the names attribute of the vector *heights*

Notice that square brackets are used here instead of parentheses. Parentheses (round brackets) are used by functions (eg.: the c() and names() functions), which use the arguments provided in parentheses to perform a task. Subscripts require square brackets: the information provided in square brackets tells Splus what subset of a data object is being referred to.

> object[subscript]
* syntax for subscripting, where *object* is the name of the data object and *subscript* defines which elements to extract
* the expression heights["Ned"] above is an example of extracting data using a subscript

> heights[2]
[1] 140
* extracts the second element from *heights*
* [1] refers to the position of the first element on the given line; this is very useful when vectors are several lines long

> heights[c(2,1,2)]
[1] 140 160 140
* extracts the second, first, and second elements from *heights*
* the c() function is used in the subscript when more than one element is listed

> heights[heights < 160]
[1] 140 155
* returns all the values of *heights* which are less than 160
* the name of the vector must be repeated inside the brackets: the condition is applied to the values of *heights*, not to the subscript numbers, and heights[ < 160] on its own is not a complete expression

> heights[-2]
[1] 160 155
* returns all except the second value in *heights*

> heights[1]_162
> heights
[1] 162 140 155
* assigns the value 162 to the first element of *heights*
* the old object *heights* has now been replaced by the new object *heights*

> heights[4]_135
> heights
[1] 162 140 155 135
* appends the value 135 to the vector *heights*

> heights_append(heights,height)
> heights
[1] 162 140 155 135 175
* the function append() creates a new vector with the first values the same as *heights* and the last value as *height* (recall that *height* was previously assigned the value 175)
* the function append() binds two objects into a vector
* the arguments may be vectors, scalars, or both
* this is equivalent to
> heights_c(heights,height)

> heights.1_append(heights,180,after=2)
> heights.1
[1] 162 140 180 155 135 175
* the argument *after* specifies the index of *heights* after which the new values are to be inserted

> heights_replace(heights,2,142)
> heights
[1] 162 142 155 135 175
* replaces the second value in *heights* with the value 142 and stores the new vector in *heights*
* the first argument specifies the name of the data object, the second specifies the indices of the elements to be replaced, and the third specifies the values the elements are to be replaced with
* this expression is equivalent to
> heights[2]_142

> heights.2_replace(heights,c(2,4),c(140,142))
> heights.2
[1] 162 140 155 142 175
* replaces the second and fourth values of *heights* by the values 140 and 142 and stores the result in *heights.2*

> numbers_1:5
> numbers
[1] 1 2 3 4 5
* the operator ":" creates a sequence from 1 to 5
* the syntax for the sequence operator is *from*:*to*

> heights_heights.2[2:5]
> heights
[1] 140 155 142 175
* assigns the last four elements of the vector *heights.2* to the vector *heights*

> length(heights)
[1] 4
* returns the length (number of elements) of the object *heights*

The data objects you have just created are stored in your .Data directory. To see a list of the data objects (and later on, functions) you have created, type

> **ls()**

To remove an object or function from your .Data directory, use the
rm() function. For example, to remove the scalar *height*, type

> **rm(height)**

You can also remove more than one data object at a time. To remove the
scalar *person* and the vector *numbers*, type

> **rm(person,numbers)**

Assigning a name already used by an Splus function may cause warning messages to appear on the screen:

> c_c(1,2,3)

> d_c(1,2,3)

Warning messages: Looking for object "c" of mode "function", ignored one of mode "numeric"

Here, the name c (the "concatenate" function) was assigned to a data object. The problem is solved by assigning the object to another name and removing the numeric object from the directory:

> b_c

> rm(c)

This will cause the warning message to be printed one last time.

> size.1_matrix(c(130,26,110,24,118,25,112,25),ncol=2)
> size.1
     [,1] [,2]
[1,]  130  118
[2,]   26   25
[3,]  110  112
[4,]   24   25
* the function matrix() reads data into a matrix
* the number of columns is specified using the argument *ncol=#*
* alternatively, the number of rows can be specified using the argument *nrow=#*, or both *nrow* and *ncol* can be specified
* when neither *nrow* nor *ncol* is specified, the data is read in as a one column matrix

> size.2_matrix(c(130,26,110,24,118,25,112,25),ncol=2,byrow=T)
> size.2
     [,1] [,2]
[1,]  130   26
[2,]  110   24
[3,]  118   25
[4,]  112   25
* specifying *byrow=T* forces Splus to read the data in row by row
* when the argument is not specified, or specified as *byrow=F*, Splus assumes the data is written in column by column

> size.names_list(c("Abe","Bob","Carol","Deb"),c("Weight","Waist"))
> size.names
[[1]]:
[1] "Abe"   "Bob"   "Carol" "Deb"

[[2]]:
[1] "Weight" "Waist"

Notice the double square brackets: whereas single square brackets are used to extract data from a vector, double square brackets are used to extract components from a list:

> size.names[[2]]
[1] "Weight" "Waist"

The individual components in the list retain their properties as vectors and as such, individual elements can be extracted from each component in the same way as in any other vector:

> size.names[[2]][2]
[1] "Waist"

Names can also be assigned to the components of a list:

> names(size.names)_c("Rows","Columns")
> size.names
$Rows:
[1] "Abe"   "Bob"   "Carol" "Deb"

$Columns:
[1] "Weight" "Waist"

The components of the list can then be extracted using their names attribute:

> size.names$Rows
[1] "Abe"   "Bob"   "Carol" "Deb"

> size.names$Rows[2]
[1] "Bob"

> dimnames(size.2)_size.names
> size.2
      Weight Waist
Abe      130    26
Bob      110    24
Carol    118    25
Deb      112    25
* the dimnames() function assigns names to the dimensions of a data object (in this case, the rows and columns of *size.2*)

> size.2_matrix(c(130,26,110,24,118,25,112,25),ncol=2,byrow=T,
+ dimnames=list(c("Abe","Bob","Carol","Deb"),c("Weight","Waist")))
* it is possible to assign dimnames directly from within the matrix function
* expressions can be spread over several lines: simply hit return at the end of the line and Splus prompts for a continuation line by means of the "+" character (this may also happen if you forget to close all open brackets or strings)

> dimnames(size.2)_list(NULL,c("Weight","Waist"))
* the NULL object is used when no dimnames are to be assigned to a dimension

> abc_size.2
> dimnames(abc)_list(c("Abe","Bob","Carol","Deb"),dimnames(size.2)[[2]])
* this command assigns dimnames to the rows of *abc* and assigns the column dimnames of *size.2* to the columns of *abc*

> size_cbind(size.2,heights)
> size
     Weight Waist heights
[1,]    130    26     140
[2,]    110    24     155
[3,]    118    25     142
[4,]    112    25     175
* cbind() (column bind) "binds" together vectors and matrices columnwise into a new matrix
* cbind() "binds" the vector *heights* columnwise to the matrix *size.2* and stores the resulting matrix in *size*
* the name *heights* is automatically assigned to the third column of the matrix *size*

> size_rbind(size,c(128,26,170))
> size
     Weight Waist heights
[1,]    130    26     140
[2,]    110    24     155
[3,]    118    25     142
[4,]    112    25     175
[5,]    128    26     170
* rbind() (row bind) "binds" together vectors and/or matrices rowwise into a new matrix

> x_c(1,2,3)
> y_diag(x)
> y
     [,1] [,2] [,3]
[1,]    1    0    0
[2,]    0    2    0
[3,]    0    0    3
* the function diag() creates a matrix with the vector *x* on the main diagonal
* the main diagonal of a matrix consists of those elements whose row number and column number are the same
* the number of rows or columns can be specified using the arguments *nrow* or *ncol*

> diag(y)
[1] 1 2 3
* alternatively, when the argument is a matrix, diag() returns the diagonal of the matrix

> col(y)
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    1    2    3
[3,]    1    2    3
* the function col() returns a matrix of column numbers
* similarly, the function row() returns a matrix of row numbers
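Since row() and col() return matrices of row and column numbers, comparing them gives a logical subscript that selects parts of a matrix by position. The following is a minimal sketch of that idea rather than part of the original session; it rebuilds *x* and *y* with the "<-" form of assignment.

```r
x <- c(1, 2, 3)
y <- diag(x)            # 3 by 3 matrix with x on the main diagonal
row(y)                  # each entry holds its own row number
y[row(y) == col(y)]     # elements on the main diagonal: 1 2 3
y[row(y) < col(y)]      # elements above the main diagonal: 0 0 0
```

The comparison row(y) == col(y) is true exactly where the row number equals the column number, which is the definition of the main diagonal given above.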

> size[2,3]
heights
    155
* to extract one value from a matrix, it is necessary to use two elements in the subscript: the first element applies to the rows, the second element applies to the columns
* the full subscript expression applies to the elements of the matrix that satisfy both the row and the column condition
* in this case, the element in the second row, third column of the matrix *size* is printed

> size[2,]
Weight Waist heights
   110    24     155
* if one dimension is not specified in the subscript, all elements in that dimension are extracted
* in this case, the columns are not specified so all the columns are included

> size[,3]
[1] 140 155 142 175 170
* prints the third column of the matrix *size*
* in both examples, the comma must be kept in as a marker to indicate which dimension is specified
* in both of these examples, Splus drops the extra dimension so that the result is a vector

> size[2, ,drop=F]
     Weight Waist heights
[1,]    110    24     155
* to retain the matrix properties for the result (which might be necessary in some computations), add *drop=F* to the subscripts
* notice that two commas were used in the subscript, one to separate the row from the column (not specified) dimensions, the other to separate the indices from the argument *drop*

> is.matrix(size[2,])
[1] F
* is.matrix() is a logical function which tests whether an object is a matrix

> is.matrix(size[2, ,drop=F])
[1] T
* as seen above, when a single row or column is extracted from a matrix, the matrix properties are dropped unless otherwise specified in the argument *drop*

> size[,c(1,3)]
     Weight heights
[1,]    130     140
[2,]    110     155
[3,]    118     142
[4,]    112     175
[5,]    128     170
* the c() function is used in matrix subscripts in the same way as it is used in vector subscripts
* here, the first and third columns of the matrix *size* are printed out

> size[,c("Weight","Waist")]
     Weight Waist
[1,]    130    26
[2,]    110    24
[3,]    118    25
[4,]    112    25
[5,]    128    26
* character subscripts are used in the same way as numeric subscripts: the first element in the subscript specifies the rows, and the second element in the subscript specifies the columns

> size[-2,-3]
     Weight Waist
[1,]    130    26
[2,]    118    25
[3,]    112    25
[4,]    128    26
* negative subscripts have the same meaning for the rows and columns of matrices that they have for elements of a vector

Suppose you wished to print the weights of those people taller than 160cm: the expression size[,1] will print all the weights in the matrix *size*. It is necessary to limit the rows to be printed to those rows where the value for *heights* is greater than 160:

> size[size[,3] > 160,1]
[1] 112 128
* this command pulls out the weights (column 1) of those people (rows) with height (size[,3]) greater than 160
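The logical row subscript in the last command can be taken apart into two steps, which makes it easier to see which rows are selected. This is a minimal sketch: the matrix is rebuilt from the values printed in this section, and the name *tall* is hypothetical.

```r
# Rebuild the matrix size from the values shown above.
size <- matrix(c(130, 26, 140,
                 110, 24, 155,
                 118, 25, 142,
                 112, 25, 175,
                 128, 26, 170),
               ncol = 3, byrow = T,
               dimnames = list(NULL, c("Weight", "Waist", "heights")))

tall <- size[, 3] > 160   # logical vector, one value per row: F F F T T
size[tall, 1]             # weights for the rows where tall is T: 112 128
```

Storing the condition in its own object is convenient when the same rows are needed in several expressions.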

> dim(size)
[1] 5 3
* the dim() function returns the dimensions of an object
* in the case of matrices, the first element is the number of rows in the matrix and the second element is the number of columns

> nrow(size)
[1] 5

> ncol(size)
[1] 3
* the functions nrow() and ncol() are based on the function dim() and return the number of rows or the number of columns in the matrix

a) Create the following matrix called *marks*, and put in the appropriate label names.

     Test1 Test2 Test3 Final
[1,]    20    23    18    48
[2,]    16    15    18    40
[3,]    25    20    22    40
[4,]    14    19    18    42

b) Add the following row to the bottom of the matrix:

        10    15    14    30

c) Change the fifth mark for test #2 from a 15 to a 17.

d) Print all the marks for test #3.

e) Print the final marks for those people with marks greater than 16 on test #1.

f) Print the marks matrix without the column for test #3.

g) Print the number of rows in the matrix.

**Solutions**

**Computations and Data Manipulations**