
Renaming Data Frame Columns in R




Previously, we described the essentials of R programming and provided quick start guides for importing data into R and for converting your data into a tibble, a modern data frame format that is convenient to work with. We also described the crucial steps for reshaping your data with R for easier analyses.


Here, you’ll learn how to rename the columns of a data frame in R. This can be done easily using the function rename() in dplyr. It’s also possible to use R base functions, but they require more typing.


Renaming Columns of a Data Table in R

Preliminary tasks

  1. Launch RStudio as described here: Running RStudio and setting up your working directory

  2. Prepare your data as described here: Best practices for preparing your data, and save it in an external tab-delimited .txt file or a .csv file

  3. Import your data into R as described here: Fast reading of data from txt|csv files into R: readr package.

Here, we’ll use the R built-in iris data set, which we start by converting to a tibble (tbl_df). A tibble is a modern rethinking of the data frame with a nicer printing method: only the first rows and the columns that fit on screen are shown. This is useful when working with large data sets.

# Create my_data
my_data <- iris

# Convert to a tibble
# (as_tibble() replaces as_data_frame(), which is deprecated in current tibble releases)
library("tibble")
my_data <- as_tibble(my_data)

# Print
my_data
Source: local data frame [150 x 5]

   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
          <dbl>       <dbl>        <dbl>       <dbl>  <fctr>
1           5.1         3.5          1.4         0.2  setosa
2           4.9         3.0          1.4         0.2  setosa
3           4.7         3.2          1.3         0.2  setosa
4           4.6         3.1          1.5         0.2  setosa
5           5.0         3.6          1.4         0.2  setosa
6           5.4         3.9          1.7         0.4  setosa
7           4.6         3.4          1.4         0.3  setosa
8           5.0         3.4          1.5         0.2  setosa
9           4.4         2.9          1.4         0.2  setosa
10          4.9         3.1          1.5         0.1  setosa
..          ...         ...          ...         ...     ...

Install and load dplyr package for renaming columns

  • Install dplyr:
install.packages("dplyr")
  • Load dplyr:
library("dplyr")

Renaming columns with dplyr::rename()

  • Rename the column Sepal.Length to sepal_length and Sepal.Width to sepal_width:
rename(my_data, sepal_length = Sepal.Length,
       sepal_width = Sepal.Width)
Source: local data frame [150 x 5]

   sepal_length sepal_width Petal.Length Petal.Width Species
          (dbl)       (dbl)        (dbl)       (dbl)  (fctr)
1           5.1         3.5          1.4         0.2  setosa
2           4.9         3.0          1.4         0.2  setosa
3           4.7         3.2          1.3         0.2  setosa
4           4.6         3.1          1.5         0.2  setosa
5           5.0         3.6          1.4         0.2  setosa
6           5.4         3.9          1.7         0.4  setosa
7           4.6         3.4          1.4         0.3  setosa
8           5.0         3.4          1.5         0.2  setosa
9           4.4         2.9          1.4         0.2  setosa
10          4.9         3.1          1.5         0.1  setosa
..          ...         ...          ...         ...     ...
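
Note that rename() returns a modified copy and leaves its input untouched, so the result has to be assigned if you want to keep it. The following is a minimal sketch of that behaviour; the object name renamed is only illustrative and does not appear elsewhere in this guide.

```r
library(dplyr)

my_data <- iris

# rename() takes new_name = old_name pairs and returns a new data frame
renamed <- rename(my_data,
                  sepal_length = Sepal.Length,
                  sepal_width = Sepal.Width)

# The original object still carries the old column names
names(my_data)

# The returned object carries the new ones
names(renamed)
```

To overwrite the original instead, assign the result back to the same name: my_data <- rename(my_data, ...).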

Renaming columns with dplyr::select()

select() can also be used to rename variables, as follows:

select(my_data, sepal_length = Sepal.Length,
       sepal_width = Sepal.Width)
Source: local data frame [150 x 2]

   sepal_length sepal_width
          (dbl)       (dbl)
1           5.1         3.5
2           4.9         3.0
3           4.7         3.2
4           4.6         3.1
5           5.0         3.6
6           5.4         3.9
7           4.6         3.4
8           5.0         3.4
9           4.4         2.9
10          4.9         3.1
..          ...         ...

Note that select() keeps only the variables you mention. To keep all columns, use rename() instead, which renames the chosen variables and leaves the others in place.
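
If you would still rather use select(), one possible workaround is to add the dplyr helper everything() after the renamed variables. This is a sketch and not part of the original procedure; the object name kept_all is illustrative. Be aware that the variables named explicitly are moved to the front of the result.

```r
library(dplyr)

my_data <- iris

# Rename Petal.Width and keep every remaining column via everything()
kept_all <- select(my_data, petal_width = Petal.Width, everything())

# All five columns are kept, but petal_width now comes first
names(kept_all)
```

When the column order matters, rename() is the safer choice because it does not reorder anything.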

Renaming columns with R base functions

To rename the column Sepal.Length to sepal_length, the procedure is as follows:

  1. Get the column names using the function names() or colnames()
  2. Change the column name where the name is equal to "Sepal.Length"
# get column names
colnames(my_data)
[1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"  "Species"     
# Rename the columns whose names are "Sepal.Length" and "Sepal.Width"
names(my_data)[names(my_data) == "Sepal.Length"] <- "sepal_length"
names(my_data)[names(my_data) == "Sepal.Width"] <- "sepal_width"
my_data
Source: local data frame [150 x 5]

   sepal_length sepal_width Petal.Length Petal.Width Species
          (dbl)       (dbl)        (dbl)       (dbl)  (fctr)
1           5.1         3.5          1.4         0.2  setosa
2           4.9         3.0          1.4         0.2  setosa
3           4.7         3.2          1.3         0.2  setosa
4           4.6         3.1          1.5         0.2  setosa
5           5.0         3.6          1.4         0.2  setosa
6           5.4         3.9          1.7         0.4  setosa
7           4.6         3.4          1.4         0.3  setosa
8           5.0         3.4          1.5         0.2  setosa
9           4.4         2.9          1.4         0.2  setosa
10          4.9         3.1          1.5         0.1  setosa
..          ...         ...          ...         ...     ...
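
When several columns have to be renamed, repeating the line above for each one gets tedious. A minimal base R sketch, assuming you collect the old and new names in a named character vector (called lookup here, a name chosen for illustration), could look like this:

```r
my_data <- iris

# Old names on the left, new names on the right
lookup <- c(Sepal.Length = "sepal_length",
            Sepal.Width = "sepal_width")

# Find the position of each old name among the current column names
idx <- match(names(lookup), names(my_data))

# Stop early if one of the old names is not present in the data
stopifnot(!anyNA(idx))

# Replace the matched names in a single step
names(my_data)[idx] <- unname(lookup)

names(my_data)
```

The stopifnot() check matters because match() returns NA for a name it cannot find, and an NA index would otherwise cause a less readable error in the assignment.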

It’s also possible to rename columns by their index in the names vector, as follows:

names(my_data)[1] <- "sepal_length"
names(my_data)[2] <- "sepal_width"
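
Renaming by index silently renames the wrong column if the column order ever changes. As a hedged sketch, you can check that the positions hold the expected names before overwriting them, and rename both columns in one assignment:

```r
my_data <- iris

# Guard: make sure positions 1 and 2 hold the columns we expect
stopifnot(identical(names(my_data)[1:2],
                    c("Sepal.Length", "Sepal.Width")))

# Rename both positions at once
names(my_data)[1:2] <- c("sepal_length", "sepal_width")

names(my_data)
```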

Summary


To rename the columns of a data frame, use the function rename() [in the dplyr package]. The R base functions names() and colnames() do the same job with more typing.


Infos

This analysis has been performed using R (ver. 3.2.3).

