order
Basics
order
is a function similar to sort, where vectors are arranged in ascending or descending order. The order
function will use further columns in the object to break ties among similar values, making it the preferred method for sorting data.frames and matrices.
Unlike sort
, order
returns the indices for the sorted object, so indexing on the output of order
will give us our desired output.
Examples
Given a matrix, arrange it in ascending order using the first column.
Click to see solution
my_mat <- matrix(c(1, 5, 0, 2,
10, 1, 2, 8,
9, 1,0,2), ncol=3)
my_mat[order(my_mat[,1]), ]
[,1] [,2] [,3] [1,] 0 2 0 [2,] 1 10 9 [3,] 2 8 2 [4,] 5 1 1
Great, we got what we wanted! But what’s with all the brackets and commas? Let’s work from the inside to understand what’s going on:
my_mat[,1]
uses column-specific indexing to get the matrix’s first column.
order(my_mat[,1])
sorts the first column in ascending order, then returns the indices of those values in the original my_mat
. The output of this code is:
[1] 3 1 4 2
We can quickly verify the output by noting that column 1 sorted in ascending order is 0, 1, 2, 5. Those values are located at indices 3, 1, 4, and 2.
Finally, we note that the output of order(my_mat[,1])
is a vector, which we’ll call x
for now. This means our final line of code is equivalent to my_mat[x, ]
. Once again recalling indexing rules, we are asking R to display the 3rd, 1st, 4th, and 2nd row of my_mat
in that order, which is our output in the example. This whole process is condensed into one line — thanks, R!
In the Orange
data set, order by age and circumference to compare tree size at each stage.
Click to see solution
Keep in mind that order
will go left-to-right when evaluating ties. If we don’t also include Orange$circumference
when ordering by Orange$age
, order
will use Orange$Tree
for tie-breaking, which won’t give us a conclusive pattern in terms of growth — we’ll just get 1, 2, 3, 4, and 5 repeated.
Orange[order(Orange$age, Orange$circumference),]
Tree age circumference 1 1 118 30 15 3 118 30 29 5 118 30 22 4 118 32 8 2 118 33 30 5 484 49 16 3 484 51 2 1 484 58 23 4 484 62 9 2 484 69 17 3 664 75 31 5 664 81 3 1 664 87 10 2 664 111 24 4 664 112 18 3 1004 108 4 1 1004 115 32 5 1004 125 11 2 1004 156 25 4 1004 167 19 3 1231 115 5 1 1231 120 33 5 1231 142 12 2 1231 172 26 4 1231 179 20 3 1372 139 6 1 1372 142 34 5 1372 174 13 2 1372 203 27 4 1372 209 21 3 1582 140 7 1 1582 145 35 5 1582 177 14 2 1582 203 28 4 1582 214
At each age, the ranking of tree size from smallest to largest is:
-
1, 3, 5, 4, 2
-
5, 3, 1, 4, 2
-
3, 5, 1, 4, 2
-
3, 1, 5, 2, 4
-
3, 1, 5, 2, 4
-
3, 1, 5, 2, 4
-
3, 1, 5, 2, 4
A pattern emerges after the 4th measurement, meaning we have a general ranking for tree size. This information is helpfully listed in the output of Orange$Tree
.