Previously, we described the essentials of R programming and provided quick start guides for importing data into R and for converting your data to the tibble format, a modern alternative to the standard data frame. We also described crucial steps for reshaping your data with R for easier analyses. This article describes how to reorder (sort) rows using dplyr::arrange() and the base R function order().
Preliminary tasks
Launch RStudio as described here: Running RStudio and setting up your working directory
Prepare your data as described here: Best practices for preparing your data, and save it in an external tab-delimited .txt file or a .csv file
Import your data into R as described here: Fast reading of data from txt|csv files into R: readr package.
Here, we'll use the R built-in iris data set, which we start by converting to a tibble (tbl_df). A tibble is a modern rethinking of the data frame that provides a more compact printing method, which is useful when working with large data sets.
# Create my_data
my_data <- iris
# Convert to a tibble
library("tibble")
my_data <- as_tibble(my_data)
# Print
my_data
Source: local data frame [150 x 5]
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
<dbl> <dbl> <dbl> <dbl> <fctr>
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3.0 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
5 5.0 3.6 1.4 0.2 setosa
6 5.4 3.9 1.7 0.4 setosa
7 4.6 3.4 1.4 0.3 setosa
8 5.0 3.4 1.5 0.2 setosa
9 4.4 2.9 1.4 0.2 setosa
10 4.9 3.1 1.5 0.1 setosa
.. ... ... ... ... ...
Install and load dplyr package
- Install dplyr
install.packages("dplyr")
- Load dplyr:
library("dplyr")
Reorder rows with dplyr::arrange()
- Reorder rows by Sepal.Length in ascending order
arrange(my_data, Sepal.Length)
Source: local data frame [150 x 5]
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
(dbl) (dbl) (dbl) (dbl) (fctr)
1 4.3 3.0 1.1 0.1 setosa
2 4.4 2.9 1.4 0.2 setosa
3 4.4 3.0 1.3 0.2 setosa
4 4.4 3.2 1.3 0.2 setosa
5 4.5 2.3 1.3 0.3 setosa
6 4.6 3.1 1.5 0.2 setosa
7 4.6 3.4 1.4 0.3 setosa
8 4.6 3.6 1.0 0.2 setosa
9 4.6 3.2 1.4 0.2 setosa
10 4.7 3.2 1.3 0.2 setosa
.. ... ... ... ... ...
- Reorder rows by Sepal.Length in descending order. Use the function desc():
arrange(my_data, desc(Sepal.Length))
Source: local data frame [150 x 5]
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
(dbl) (dbl) (dbl) (dbl) (fctr)
1 7.9 3.8 6.4 2.0 virginica
2 7.7 3.8 6.7 2.2 virginica
3 7.7 2.6 6.9 2.3 virginica
4 7.7 2.8 6.7 2.0 virginica
5 7.7 3.0 6.1 2.3 virginica
6 7.6 3.0 6.6 2.1 virginica
7 7.4 2.8 6.1 1.9 virginica
8 7.3 2.9 6.3 1.8 virginica
9 7.2 3.6 6.1 2.5 virginica
10 7.2 3.2 6.0 1.8 virginica
.. ... ... ... ... ...
Instead of using the function desc(), you can prepend a minus sign to the sorting variable to indicate descending order, as follows. This shortcut is meant for numeric variables; for character or factor columns, use desc().
arrange(my_data, -Sepal.Length)
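As a quick check of the two notations, the sketch below sorts the same numeric column with desc() and with a minus sign, then compares the sorted column. It also shows desc() on the factor column Species, where a unary minus is not meaningful. The object names by_desc, by_minus and by_species are illustrative only.

```r
library(dplyr)

my_data <- as_tibble(iris)

# Descending sort on a numeric column, written two ways
by_desc  <- arrange(my_data, desc(Sepal.Length))
by_minus <- arrange(my_data, -Sepal.Length)

# Both calls put the sorted column in the same order
identical(by_desc$Sepal.Length, by_minus$Sepal.Length)

# desc() also handles a factor column (sorted by reverse level order)
by_species <- arrange(my_data, desc(Species))
head(by_species$Species, 3)
```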
- Reorder rows by multiple variables: Sepal.Length and Sepal.Width
arrange(my_data, Sepal.Length, Sepal.Width)
Source: local data frame [150 x 5]
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
(dbl) (dbl) (dbl) (dbl) (fctr)
1 4.3 3.0 1.1 0.1 setosa
2 4.4 2.9 1.4 0.2 setosa
3 4.4 3.0 1.3 0.2 setosa
4 4.4 3.2 1.3 0.2 setosa
5 4.5 2.3 1.3 0.3 setosa
6 4.6 3.1 1.5 0.2 setosa
7 4.6 3.2 1.4 0.2 setosa
8 4.6 3.4 1.4 0.3 setosa
9 4.6 3.6 1.0 0.2 setosa
10 4.7 3.2 1.3 0.2 setosa
.. ... ... ... ... ...
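Sort directions can also be mixed across variables. The following is a minimal sketch, assuming you want the rows grouped by Species (in factor level order) and, within each species, sorted by decreasing Sepal.Length. The name sorted is illustrative only.

```r
library(dplyr)

my_data <- as_tibble(iris)

# Species ascending (factor level order),
# then Sepal.Length descending within each species
sorted <- arrange(my_data, Species, desc(Sepal.Length))
head(sorted, 3)
```

Variables listed later in the call are only used to break ties among the variables listed before them.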
If the data contain missing values (NA), they are always placed at the end, in both ascending and descending order.
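The iris data set has no missing values, so the behavior can be illustrated on a small made-up tibble instead. The data frame df and its columns id and x are hypothetical and only serve this example.

```r
library(dplyr)

# Small example data with one missing value
df <- tibble(id = 1:4, x = c(2, NA, 3, 1))

# Ascending order: the NA row comes last
asc <- arrange(df, x)
asc

# Descending order: the NA row still comes last
dsc <- arrange(df, desc(x))
dsc
```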
dplyr::arrange() is the dplyr counterpart of the base R function order(), and it requires less typing.
Reorder rows with the base R function order()
- Reorder rows by Sepal.Length in ascending order
my_data[order(my_data$Sepal.Length), , drop = FALSE]
- Reorder rows by Sepal.Length in descending order. Use the additional argument decreasing = TRUE:
row_order <- order(my_data$Sepal.Length, decreasing = TRUE)
my_data[row_order, , drop = FALSE]
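order() also accepts several sorting keys. The sketch below, offered as one possible approach rather than the only one, sorts by Sepal.Length in ascending order and breaks ties by Sepal.Width in descending order, using a minus sign on the numeric key. The name sorted is illustrative only.

```r
my_data <- iris  # a plain data frame; a tibble is indexed the same way here

# Sepal.Length ascending, then Sepal.Width descending within ties
row_order <- order(my_data$Sepal.Length, -my_data$Sepal.Width)
sorted <- my_data[row_order, , drop = FALSE]
head(sorted)
```

This is roughly the base R equivalent of arrange(my_data, Sepal.Length, desc(Sepal.Width)).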
Infos
This analysis has been performed using R (ver. 3.2.3).