1. Summarizing Data

set containing a number of physical measurements of three varieties of iris. These data were published by Edgar Anderson in 1935 [And35] but are famous because R. A. Fisher [Fis36] gave a statistical analysis of these data that appeared a year later. The str() function provides our first overview of the data set.

iris-str
    str(iris)
    'data.frame':   150 obs. of  5 variables:
     $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
     $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
     $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
     $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
     $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...

From this output we learn that our data set has 150 observations (rows) and 5 variables (columns). Also displayed is some information about the type of data stored in each variable and a few sample values.

While we could print the entire data frame to the screen, this is inconvenient for large data sets. We can look at the first few or last few rows of the data set using head() and tail(). This is enough to give us a feel for how the data look.

iris-head
    head(iris, n=3)   # first three rows
      Sepal.Length Sepal.Width Petal.Length Petal.Width Species
    1          5.1         3.5          1.4         0.2  setosa
    2          4.9         3.0          1.4         0.2  setosa
    3          4.7         3.2          1.3         0.2  setosa

iris-tail
    tail(iris, n=3)   # last three rows
        Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
    148          6.5         3.0          5.2         2.0 virginica
    149          6.2         3.4          5.4         2.3 virginica
    150          5.9         3.0          5.1         1.8 virginica

We can access any subset we want by directly specifying which rows and columns are of interest to us.

iris-subset
    iris[c(1:3,148:150), 3:5]   # first and last rows, only 3 columns
        Petal.Length Petal.Width   Species
    1            1.4         0.2    setosa
    2            1.4         0.2    setosa
    3            1.3         0.2    setosa
    148          5.2         2.0 virginica
    149          5.4         2.3 virginica
    150          5.1         1.8 virginica

It is also possible to look at just one variable using the $ operator.

iris-vector
    iris$Sepal.Length   # get one variable and print as vector
      [1] 5.1 4.9 4.7 4.6 5.0 5.4 4.6 5.0 4.4 4.9 5.4 4.8 4.8 4.3 5.8 5.7
     [17] 5.4 5.1 5.7 5.1 5.4 5.1 4.6 5.1 4.8 5.0 5.0 5.2 5.2 4.7 4.8 5.4
    (5 lines removed)
    [113] 6.8 5.7 5.8 6.4 6.5 7.7 7.7 6.0 6.9 5.6 7.7 6.3 6.7 7.2 6.2 6.1
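
The subsetting ideas above can be checked with a short sketch in base R. It assumes only the built-in iris data frame; the names rows, sepal_len, by_position, and by_name are illustrative and do not come from the text. It shows that a single variable can be extracted in several equivalent ways, and that columns can be chosen by name instead of by position.

```r
# Illustrative sketch using the built-in iris data frame (base R only).
# The names rows, sepal_len, by_position, and by_name are hypothetical.

# Dimensions reported by str(): number of rows, then number of columns.
dim(iris)

# Three equivalent ways to pull out a single variable as a numeric vector.
sepal_len <- iris$Sepal.Length
identical(sepal_len, iris[["Sepal.Length"]])
identical(sepal_len, iris[, "Sepal.Length"])

# Columns can be selected by name rather than by position.
rows <- c(1:3, 148:150)
by_position <- iris[rows, 3:5]
by_name <- iris[rows, c("Petal.Length", "Petal.Width", "Species")]
identical(by_position, by_name)
```

Selecting columns by name is usually the more robust choice, since the result does not change if the column order of the data frame is later rearranged.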