BUS 320
Rows: 20,449
Columns: 4
$ Entity <chr> "Afghanistan", "Afghanistan", …
$ Code <chr> "AFG", "AFG", "AFG", "AFG", "A…
$ Year <dbl> 1950, 1951, 1952, 1953, 1954, …
$ `Life expectancy at birth (historical)` <dbl> 27.7, 28.0, 28.4, 28.9, 29.2, …
Data summary | |
---|---|
Name | life_expectancy |
Number of rows | 20449 |
Number of columns | 4 |
Column type frequency: | |
character | 2 |
numeric | 2 |
Group variables | None |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
---|---|---|---|---|---|---|---|
Entity | 0 | 1.00 | 4 | 59 | 0 | 256 | 0 |
Code | 1390 | 0.93 | 3 | 8 | 0 | 237 | 0 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
---|---|---|---|---|---|---|---|---|---|---|
Year | 0 | 1 | 1976.53 | 37.74 | 1543 | 1962.0 | 1982.0 | 2002 | 2021.0 | ▁▁▁▁▇ |
Life expectancy at birth (historical) | 0 | 1 | 61.78 | 12.94 | 12 | 52.5 | 64.3 | 72 | 86.5 | ▁▂▅▇▅ |
eliminate the rows that have a missing code
save the output to life_no_missing
glimpse life_no_missing
how many rows are left?
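A minimal sketch of this step. The real data file is not reproduced here, so `toy` is a hypothetical stand-in with the same column names as life_expectancy (the "Africa" rows and their values are invented for illustration). `filter(!is.na(Code))` is one way to drop the rows; `tidyr::drop_na(Code)` would be an alternative.

```r
library(dplyr)

# Hypothetical stand-in for life_expectancy (same column names, partly invented values)
toy <- tibble(
  Entity = c("Afghanistan", "Afghanistan", "Africa", "Africa"),
  Code   = c("AFG", "AFG", NA, NA),
  Year   = c(1950, 1951, 1950, 1951),
  `Life expectancy at birth (historical)` = c(27.7, 28.0, 37.6, 37.9)
)

# Keep only the rows where Code is present
life_no_missing <- toy |>
  filter(!is.na(Code))

glimpse(life_no_missing)

# Rows left after dropping missing codes
nrow(life_no_missing)
```

On the full data, the skim summary above reports 1390 missing values of Code out of 20449 rows, so 20449 - 1390 = 19059 rows should remain.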
Entity: do Entity’s have different numbers of observations?
use set.seed() so your results will be replicable
save to entities
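A sketch of one way to approach this, assuming the intent is to count observations per Entity and then draw six entity names at random. The stand-in data, the seed value 320, and the use of `slice_sample()` are assumptions, not part of the original instructions.

```r
library(dplyr)

# Hypothetical stand-in for life_no_missing: 8 invented entities with unequal row counts
counts <- c(3, 5, 2, 4, 6, 3, 2, 5)
life_no_missing <- tibble(
  Entity = rep(paste0("Entity_", 1:8), times = counts),
  Year   = 1999 + sequence(counts)
)

# Observations per entity, sorted so unequal counts are easy to spot
life_no_missing |>
  count(Entity, sort = TRUE)

# Fix the seed so the same six entities are drawn on every run
set.seed(320)  # arbitrary seed value (assumption)
entities <- life_no_missing |>
  distinct(Entity) |>
  slice_sample(n = 6) |>
  pull(Entity)

entities
```

Sampling from `distinct(Entity)` rather than from the raw rows gives every entity the same chance of selection, regardless of how many years it has.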
extract only the rows for the 6 entities
rename the last column to expectancy
drop the variable Code
save to object life_df
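These three steps could be chained roughly as follows. The stand-in `life_no_missing` and the contents of `entities` are hypothetical (the Albania values are invented); in the assignment, `entities` would hold the six sampled names.

```r
library(dplyr)

# Hypothetical stand-ins for the objects created in the earlier steps
life_no_missing <- tibble(
  Entity = c("Afghanistan", "Afghanistan", "Albania", "Albania"),
  Code   = c("AFG", "AFG", "ALB", "ALB"),
  Year   = c(1950, 1951, 1950, 1951),
  `Life expectancy at birth (historical)` = c(27.7, 28.0, 44.7, 45.0)
)
entities <- c("Afghanistan")

life_df <- life_no_missing |>
  filter(Entity %in% entities) |>                                 # rows for the chosen entities only
  rename(expectancy = `Life expectancy at birth (historical)`) |> # shorter name for the last column
  select(-Code)                                                   # drop the variable Code

life_df
```

The backticks around the original column name are needed because it contains spaces and parentheses.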
life_df: Year vs expectancy for each entity
use ggplot2 to create a line plot for each entity with Year on the x-axis and expectancy on the y-axis
add points to the plot
format the y-axis to display the values with the suffix “years”
add a title, “Life Expectancy, 1876 to 2021”, min year to max year
change the color using scale_color_
library(ggplot2)
library(scales)  # provides label_number()
# Line plot with points, one color per entity
life_df |>
  ggplot(aes(x = Year, y = expectancy, color = Entity)) +
  geom_line() +
  geom_point() +
  scale_y_continuous(labels = label_number(suffix = " years")) +
  labs(
    x = NULL, y = NULL, color = NULL,
    title = "Life Expectancy, 1876 to 2021",
    caption = "Source: Our World in Data (https://ourworldindata.org/life-expectancy)"
  ) +
  scale_color_viridis_d(option = "plasma", begin = 0, end = 0.8)
p_life_df
# Same plot, saved to an object (requires ggplot2 and scales)
p_life_df <- life_df |>
  ggplot(aes(x = Year, y = expectancy, color = Entity)) +
  geom_line() +
  geom_point() +
  scale_y_continuous(labels = label_number(suffix = " years")) +
  labs(
    x = NULL, y = NULL, color = NULL,
    title = "Life Expectancy, 1876 to 2021",
    caption = "Source: Our World in Data (https://ourworldindata.org/life-expectancy)"
  ) +
  scale_color_viridis_d(option = "plasma", begin = 0, end = 0.8)
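The title's years are described as "min year to max year", so they could be computed from the data instead of typed by hand. A small sketch, using a hypothetical `life_df` that holds only a `Year` column with invented rows:

```r
# Hypothetical stand-in for life_df; only Year matters for the title
life_df <- data.frame(Year = c(1876, 1950, 2021))

# range() returns c(min, max)
year_range <- range(life_df$Year)

plot_title <- paste0("Life Expectancy, ", year_range[1], " to ", year_range[2])
plot_title
```

The resulting string could then be passed to `labs(title = plot_title)` so the title stays correct if a different set of entities is sampled.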