More dplyr (package in the tidyverse)
dplyr
expects tidy data
each variable in its own column
each observation in its own row
works with pipes |>
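A minimal sketch of what the pipe does, assuming the cars data used in these slides is the `cars93` data frame from the openintro package (the slides do not name the data set explicitly):

```r
library(dplyr)
library(openintro)  # assumption: cars93 is the cars data used in these slides

# The pipe passes the left-hand side as the first argument of the next function,
# so these two calls are equivalent
piped  <- cars93 |> count(type)
nested <- count(cars93, type)
piped
```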
functions covered
glimpse()
count()
select()
mutate()
dplyr
function() | Action |
---|---|
glimpse() | get a glimpse of your data |
count() | count the unique values of one or more variables |
filter() | picks rows based on their values |
mutate() | creates new variables (columns) |
select() | picks variables (columns) |
summarize() | reduces multiple values down to a single statistic |
arrange() | changes the order of the rows based on their values |
group_by() | create subsets of data to apply functions to |
dplyr
new functions we will cover today:
summarize()
arrange()
group_by()
summarize
Find the average price of all cars:
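One way this could look, assuming the cars data is `cars93` from openintro; the column name `avg_price` is an illustrative choice, not taken from the slides:

```r
library(dplyr)
library(openintro)  # assumption: cars93 is the cars data used in these slides

# summarize() collapses all rows into one row holding the requested statistic
avg <- cars93 |>
  summarize(avg_price = mean(price))
avg
```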
Find the maximum mpg_city of all cars:
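A sketch of the maximum, again assuming `cars93` from openintro; `max_mpg_city` is a made-up column name:

```r
library(dplyr)
library(openintro)  # assumption: cars93 is the cars data used in these slides

# Same pattern as the mean, with max() as the summary function
best <- cars93 |>
  summarize(max_mpg_city = max(mpg_city))
best
```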
summarize with group_by
Calculate average price for each type:
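With group_by() in front, summarize() returns one row per group instead of one row overall. A minimal sketch, assuming `cars93` from openintro and an illustrative column name `avg_price`:

```r
library(dplyr)
library(openintro)  # assumption: cars93 is the cars data used in these slides

# Group the rows by type, then compute the mean price within each group
price_by_type <- cars93 |>
  group_by(type) |>
  summarize(avg_price = mean(price))
price_by_type
```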
summarize with group_by
Calculate maximum mpg_city for each drive_train:
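The same grouped pattern with a different grouping variable and statistic, sketched under the assumption that the data is `cars93` from openintro (`max_mpg_city` is an illustrative name):

```r
library(dplyr)
library(openintro)  # assumption: cars93 is the cars data used in these slides

# One row per drive_train, holding the largest mpg_city in that group
mpg_by_drive <- cars93 |>
  group_by(drive_train) |>
  summarize(max_mpg_city = max(mpg_city))
mpg_by_drive
```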
Calculate the average and maximum price for each type:
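summarize() accepts several name-value pairs at once, so both statistics can come from one call. A sketch assuming `cars93` from openintro, with illustrative column names:

```r
library(dplyr)
library(openintro)  # assumption: cars93 is the cars data used in these slides

# Two summary columns per type, computed in a single summarize() call
price_stats <- cars93 |>
  group_by(type) |>
  summarize(
    avg_price = mean(price),
    max_price = max(price)
  )
price_stats
```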
Calculate the median and minimum weight for each drive_train:
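A possible solution, assuming `cars93` from openintro; `median_weight` and `min_weight` are illustrative names:

```r
library(dplyr)
library(openintro)  # assumption: cars93 is the cars data used in these slides

# Median and minimum weight within each drive_train group
weight_stats <- cars93 |>
  group_by(drive_train) |>
  summarize(
    median_weight = median(weight),
    min_weight = min(weight)
  )
weight_stats
```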
n() and group_by()
Calculate the number of cars from each type:
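n() takes no arguments and returns the number of rows in the current group. A minimal sketch, assuming `cars93` from openintro:

```r
library(dplyr)
library(openintro)  # assumption: cars93 is the cars data used in these slides

# Count the rows (cars) that fall into each type
cars_per_type <- cars93 |>
  group_by(type) |>
  summarize(n = n())
cars_per_type
```

The group counts add up to the total number of rows, which makes a quick sanity check.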
n() and group_by()
Calculate the number of cars for each weight:
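The same counting pattern works with a numeric grouping variable, although it tends to produce many small groups. A sketch assuming `cars93` from openintro:

```r
library(dplyr)
library(openintro)  # assumption: cars93 is the cars data used in these slides

# Each distinct weight value becomes its own group
cars_per_weight <- cars93 |>
  group_by(weight) |>
  summarize(n = n())
cars_per_weight
```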
Arrange cars based on their price:
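arrange() sorts in ascending order by default. A minimal sketch, assuming `cars93` from openintro:

```r
library(dplyr)
library(openintro)  # assumption: cars93 is the cars data used in these slides

# Reorder the rows from lowest to highest price; no rows are added or dropped
by_price <- cars93 |>
  arrange(price)
by_price
```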
Arrange cars based on their mpg_city:
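The same call with a different sorting variable, sketched under the assumption that the data is `cars93` from openintro:

```r
library(dplyr)
library(openintro)  # assumption: cars93 is the cars data used in these slides

# Rows ordered from lowest to highest city mileage
by_mpg <- cars93 |>
  arrange(mpg_city)
by_mpg
```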
Arrange cars in descending order based on their price:
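Wrapping the variable in desc() reverses the sort order. A sketch assuming `cars93` from openintro:

```r
library(dplyr)
library(openintro)  # assumption: cars93 is the cars data used in these slides

# desc() flips the ordering, so the most expensive cars come first
by_price_desc <- cars93 |>
  arrange(desc(price))
by_price_desc
```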
Arrange cars by passengers and then by price:
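When arrange() gets more than one variable, later variables only break ties in the earlier ones. A minimal sketch, assuming `cars93` from openintro:

```r
library(dplyr)
library(openintro)  # assumption: cars93 is the cars data used in these slides

# Sort by passengers first; cars with the same passengers value are sorted by price
by_pass_price <- cars93 |>
  arrange(passengers, price)
by_pass_price
```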
Recap of summarize and group_by
Recap of arrange
openintro package