Functions in R – apply, lapply, sapply, tapply, simplify2array

In the previous tutorial we saw the different control structures in R. In this tutorial we will look at the following R functions: apply, lapply, sapply, tapply and simplify2array.

apply

The apply function can be used to apply a function over the margins (rows, columns or higher dimensions) of an array or matrix. The result is a vector, a list or another array. Let’s look at an example.
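
A minimal sketch of a row-wise call follows. The matrix name x and its contents (the integers 1 to 10 arranged in 2 rows and 5 columns) are assumptions chosen for illustration.

```r
# Hypothetical input: a 2 x 5 matrix, filled column by column with 1..10
x <- matrix(1:10, nrow = 2)

# Second argument 1 = apply the function over rows; one value per row
row_sums <- apply(x, 1, sum)
print(row_sums)  # 25 30
```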

Easy, right? It needs an explanation though. The first argument to apply is the input matrix x that we just created. The second argument, 1, instructs R to apply the function over rows. The last argument is the function, sum in this case, so R sums the elements row-wise. There are two rows, so the function is applied twice. Each application returns one value, and the result is the vector of all returned values: a vector with two elements giving the sums of the first and second rows. We could also have applied the function to the columns.
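
The column-wise variant might look like the sketch below. It again assumes a 2 x 5 matrix x holding 1 to 10, which is an illustrative choice.

```r
# Hypothetical input: a 2 x 5 matrix, filled column by column with 1..10
x <- matrix(1:10, nrow = 2)

# Second argument 2 = apply the function over columns; one value per column
col_sums <- apply(x, 2, sum)
print(col_sums)  # 3 7 11 15 19
```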

The second argument is 2, which instructs R to apply the function (sum) to columns. Since there are 5 columns, the return value is a vector of length 5. Of course we can extend this to more dimensions too. If the array has 3 dimensions, use 3 as the second argument to apply the function over the third dimension.
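
As a sketch of the three-dimensional case, assume a hypothetical 2 x 3 x 4 array named arr filled with 1 to 24. Using 3 as the second argument sums each of the four 2 x 3 slices.

```r
# Hypothetical input: a 2 x 3 x 4 array holding 1..24
arr <- array(1:24, dim = c(2, 3, 4))

# Second argument 3 = apply sum over the third dimension (one value per slice)
slice_sums <- apply(arr, 3, sum)
print(slice_sums)  # 21 57 93 129
```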

apply works for a data frame too. It uses the as.matrix function to coerce the data frame to a matrix (objects with more than two dimensions are coerced with as.array instead).
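
A small sketch with a data frame, where the name df and its columns a and b are made up for illustration. Because the data frame is coerced to a matrix first, this works best when all columns are numeric; mixing in a character column would turn every value into a string.

```r
# Hypothetical data frame with two numeric columns
df <- data.frame(a = 1:3, b = c(4, 5, 6))

# The data frame is coerced to a matrix, then sum is applied per column
df_col_sums <- apply(df, 2, sum)
print(df_col_sums)  # named vector: a = 6, b = 15
```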

lapply

lapply can be used to apply a function to each element of a list or vector. It always returns a list with one element per input element. Here’s an example.
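
A minimal sketch, assuming a hypothetical named list called scores. The function mean is applied to each element in turn, and the names carry over to the result.

```r
# Hypothetical input: a named list of two integer vectors
scores <- list(a = 1:3, b = 4:6)

# lapply applies mean to each element and returns a list
means <- lapply(scores, mean)
print(means)  # $a is 2, $b is 5
```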

simplify2array and sapply

In the above example the lapply function returned a list. Often a vector or an array is more convenient. Use simplify2array to convert the resulting list to a vector or array, or use sapply to get the simplified result directly (it internally calls lapply followed by simplify2array).
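
The two routes can be compared in a short sketch. The squaring function and the input 1:3 are assumptions picked for illustration. When every call returns a single value the simplified result is a plain vector; when every call returns a vector of the same length, it is a matrix with one column per input element.

```r
# lapply returns a list; simplify2array turns it into a vector
squares_list <- lapply(1:3, function(i) i^2)
squares_vec <- simplify2array(squares_list)
print(squares_vec)  # 1 4 9

# sapply does both steps in one call
squares_direct <- sapply(1:3, function(i) i^2)
print(squares_direct)  # 1 4 9

# Each call returns two values, so the result is a 2 x 3 matrix
pairs <- sapply(1:3, function(i) c(i, i^2))
print(dim(pairs))  # 2 3
```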

tapply

The tapply function can be used to apply a function to each group of values, where the groups are defined by a category (a factor). The easiest way to understand this is to use an example.

In the example below we use the mtcars data frame, which is available in the default R installation. It contains information about certain cars. The two columns that we are interested in for this example are cyl (number of cylinders) and wt (weight). Let’s say we want to calculate the average weight of the cars for each number of cylinders (what is the average weight of a 4-cylinder car, and so on). Here’s how we would do it.
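
A sketch of the call on mtcars follows. The variable name avg_wt is an assumption; the result has one entry per distinct value of cyl, which is 4, 6 and 8 in this data set.

```r
# Mean of wt within each group of cars sharing the same cyl value
avg_wt <- tapply(mtcars$wt, mtcars$cyl, mean)
print(avg_wt)         # one mean per cylinder count
print(names(avg_wt))  # "4" "6" "8"
```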

The first argument is the vector to which we want to apply the function. The second argument gives the factor (or factors) used to group the values. The third argument is the function. The result in our example is a one-dimensional array, named by the factor levels.

In this example we used a summary function, which returns a single value per group. Let’s see another example where the applied function returns more than one value for each group.
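
One possible sketch uses range, which returns two values (minimum and maximum) per group. The choice of range and the name wt_range are assumptions for illustration. Because each group yields more than one value, tapply returns a list-like array with one element per group.

```r
# range returns c(min, max), so each group produces two values
wt_range <- tapply(mtcars$wt, mtcars$cyl, range)
print(wt_range)

# Individual groups are accessed like list elements
print(wt_range[["4"]])  # min and max weight of the 4-cylinder cars
```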

rapply

This blog entry explains the rapply function.