R provides powerful functions and packages for data manipulation, allowing you to perform a wide range of operations on your data. Two popular packages for data manipulation in R are dplyr and tidyr. Here are examples of common data manipulation tasks using these packages:
- Filtering data:
    library(dplyr)
    # Filter rows where a specific condition is met
    filtered_data <- filter(df, age > 30)
- Sorting data:
    library(dplyr)
    # Sort data frame by a specific column (ascending by default)
    sorted_data <- arrange(df, age)
- Selecting specific columns:
    library(dplyr)
    # Select specific columns from a data frame
    selected_columns <- select(df, name, age)
- Adding new columns:
    library(dplyr)
    # Add a new column to a data frame
    df <- mutate(df, new_column = age * 2)
- Grouping and summarizing data:
    library(dplyr)
    # Group data by a specific column and calculate summary statistics
    summarized_data <- df %>%
      group_by(city) %>%
      summarise(mean_age = mean(age), total_count = n())
- Reshaping data:
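    library(tidyr)
    # Convert data from wide format to long format.
    # pivot_longer() supersedes gather(); the stacked columns must share a type.
    long_data <- pivot_longer(df, cols = -name, names_to = "variable", values_to = "value")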
- Merging data frames:
    library(dplyr)
    # Join two data frames on a common column, keeping only ids present in both
    # (the dplyr counterpart of base R's merge(df1, df2, by = "id"))
    merged_data <- inner_join(df1, df2, by = "id")
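The snippets above assume data frames named df, df1, and df2 that are never defined. As a minimal sketch, the following builds small sample data so the operations can be run end to end. All values, the `wide` table, and the `score` column are invented for illustration, and the reshaping and joining steps use pivot_longer() and inner_join().

```r
library(dplyr)
library(tidyr)

# Hypothetical sample data; every value here is invented for illustration.
df <- data.frame(
  name = c("Ana", "Ben", "Cleo", "Dev"),
  age  = c(25, 34, 41, 29),
  city = c("Lima", "Oslo", "Lima", "Oslo")
)

filtered_data    <- filter(df, age > 30)               # keeps Ben and Cleo
sorted_data      <- arrange(df, age)                   # youngest first
selected_columns <- select(df, name, age)              # drops the city column
with_new_column  <- mutate(df, new_column = age * 2)   # adds a derived column

# Mean age and number of rows per city.
summarized_data <- df %>%
  group_by(city) %>%
  summarise(mean_age = mean(age), total_count = n())

# Reshaping: a hypothetical wide table with one numeric column per year.
wide <- data.frame(
  name  = c("Ana", "Ben"),
  y2022 = c(10, 20),
  y2023 = c(15, 25)
)
long_data <- pivot_longer(wide, cols = -name,
                          names_to = "variable", values_to = "value")

# Joining: keep only the ids that appear in both tables.
df1 <- data.frame(id = c(1, 2, 3), name = c("Ana", "Ben", "Cleo"))
df2 <- data.frame(id = c(2, 3, 4), score = c(88, 92, 75))
merged_data <- inner_join(df1, df2, by = "id")

print(summarized_data)
print(long_data)
print(merged_data)
```

Note that the reshaping example uses a separate table whose stacked columns are all numeric: pivot_longer() refuses to combine columns of different types (such as age and city) into a single value column unless told how to convert them.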
These examples demonstrate just a few of the many operations you can perform using dplyr and tidyr. These packages provide intuitive and efficient functions for data manipulation in R, making it easier to handle, transform, and analyze your data effectively.
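Because each dplyr verb takes a data frame as its first argument and returns a data frame, several steps can be chained with %>% into one pipeline. The sketch below illustrates this under assumed sample data; the `people` data frame and its values are hypothetical.

```r
library(dplyr)

# Hypothetical sample data, invented for illustration.
people <- data.frame(
  name = c("Ana", "Ben", "Cleo", "Dev", "Eli"),
  age  = c(25, 34, 41, 29, 52),
  city = c("Lima", "Oslo", "Lima", "Oslo", "Lima")
)

city_summary <- people %>%
  filter(age > 28) %>%                                      # drops Ana
  group_by(city) %>%                                        # one group per city
  summarise(mean_age = mean(age), total_count = n()) %>%    # one row per group
  arrange(desc(mean_age))                                   # highest mean age first

print(city_summary)
```

Reading the pipeline top to bottom mirrors the order in which the operations are applied, which avoids naming an intermediate data frame after every step.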