based on IMS Ch 4: Exploratory data analysis
nominal data
ordinal data
loan50
glimpse the loan50 data
which variables are categorical (type: fct, lgl)?
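A minimal sketch of this step, assuming the openintro and dplyr packages are installed. The helper name `categorical_vars` is illustrative and not from the original slides.

```r
# Assumption: the openintro and dplyr packages are installed.
library(openintro)  # provides the loan50 data set
library(dplyr)      # provides glimpse()

# One line per variable: name, type abbreviation (fct, lgl, dbl, ...), first values
glimpse(loan50)

# Names of the columns stored as factor (fct) or logical (lgl);
# categorical_vars is a hypothetical helper name
is_categorical <- sapply(loan50, function(col) is.factor(col) || is.logical(col))
categorical_vars <- names(loan50)[is_categorical]
categorical_vars
```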
Using dataset loan50
create a frequency table of loan_purpose
what is the most common reason for a loan?
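One possible way to approach this exercise with base R, assuming the openintro package is installed. The object name `purpose_counts` is illustrative.

```r
# Assumption: the openintro package is installed.
library(openintro)

# Frequency table: number of loans per loan_purpose level
purpose_counts <- table(loan50$loan_purpose)

# Sort so the most common purpose is listed first
sort(purpose_counts, decreasing = TRUE)

# Name of the level with the largest count
names(which.max(purpose_counts))
```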
Using dataset loan50
create a frequency table of grade
what is the most common grade for a loan?
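The same kind of table can also be sketched with dplyr's `count()`, which returns a data frame instead of a table object. This assumes dplyr and openintro are installed; `grade_counts` is an illustrative name.

```r
# Assumption: the openintro and dplyr packages are installed.
library(openintro)
library(dplyr)

# One row per grade with its count in column n, largest count first
grade_counts <- count(loan50, grade, sort = TRUE)
grade_counts

# The most common grade is in the first row
grade_counts$grade[1]
```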
                    homeownership
purpose             rent  mortgage  own
                       0         0    0
car                    1         1    0
credit_card            7         6    0
debt_consolidation     8        12    3
home_improvement       0         5    0
house                  0         1    0
major_purchase         0         0    0
medical                0         0    0
moving                 0         0    0
other                  3         1    0
renewable_energy       1         0    0
small_business         1         0    0
vacation               0         0    0
wedding                0         0    0
The count of the category with the most observations is:
The value of loan_purpose is:
The value of homeownership is:
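A sketch of how the three blanks above could be read off programmatically rather than by eye, assuming the openintro package is installed. The object names are illustrative.

```r
# Assumption: the openintro package is installed.
library(openintro)

# Two-way table of loan_purpose (rows) by homeownership (columns)
purpose_by_home <- table(purpose = loan50$loan_purpose,
                         homeownership = loan50$homeownership)

# Largest cell count in the table
max_count <- max(purpose_by_home)

# Row and column position of the largest cell (first one if there are ties)
max_cell <- which(purpose_by_home == max_count, arr.ind = TRUE)
max_purpose <- rownames(purpose_by_home)[max_cell[1, 1]]
max_home <- colnames(purpose_by_home)[max_cell[1, 2]]

max_count
max_purpose
max_home
```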
Using dataset loan50
create a contingency table of grade and verified_income
what is the most common combination of grade and verified_income?
How many loans have grade “A” and are “Not Verified”?
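A minimal sketch for this exercise, assuming the openintro package is installed and that the level labels are spelled "A" and "Not Verified" as in the question. The object names are illustrative.

```r
# Assumption: the openintro package is installed.
library(openintro)

# Two-way table of grade (rows) by verified_income (columns)
grade_by_income <- table(grade = loan50$grade,
                         verified_income = loan50$verified_income)
grade_by_income

# Most common combination: position of the largest cell (first one if tied)
top_cell <- which(grade_by_income == max(grade_by_income), arr.ind = TRUE)
rownames(grade_by_income)[top_cell[1, 1]]
colnames(grade_by_income)[top_cell[1, 2]]

# Number of loans with grade "A" and income "Not Verified"
a_not_verified <- grade_by_income["A", "Not Verified"]
a_not_verified
```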
ggplot
ggplot() defines the plot object
aes(x = variable on the x axis, y = variable on the y axis)
add layers with geom_*()
geom_bar()
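The steps above can be sketched as follows for loan_purpose, assuming the ggplot2 and openintro packages are installed. The second plot maps the variable to y instead of x, which gives horizontal bars and easier-to-read labels. The plot object names are illustrative.

```r
# Assumption: the openintro and ggplot2 packages are installed.
library(openintro)
library(ggplot2)

# ggplot() defines the plot object, aes() maps loan_purpose to the x axis,
# and geom_bar() adds a layer that counts the rows in each category
purpose_plot <- ggplot(loan50, aes(x = loan_purpose)) +
  geom_bar()
purpose_plot

# Same plot with loan_purpose on the y axis (horizontal bars)
purpose_plot_y <- ggplot(loan50, aes(y = loan_purpose)) +
  geom_bar()
purpose_plot_y
```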
loan_purpose
loan_purpose on the y axis
Standardized verified_income and grade variables in the loan50 dataset
verified_income and grade variables in the loan50 dataset
verified_income and grade variables in the loan50 dataset
homeownership
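For the two-variable plots of verified_income and grade listed above, one possible sketch is shown below. It assumes ggplot2 and openintro are installed. Which variable goes on the x axis and which is used for fill is an assumption here, not something the original states. The three `position` settings give stacked, side-by-side, and standardized bars.

```r
# Assumption: the openintro and ggplot2 packages are installed.
library(openintro)
library(ggplot2)

# Shared base: verified_income on the x axis, bars coloured by grade
# (this choice of x and fill is an assumption)
base_plot <- ggplot(loan50, aes(x = verified_income, fill = grade))

# Stacked bars (the default position): bar heights are counts
stacked_plot <- base_plot + geom_bar()

# Dodged bars: one bar per grade, placed side by side
dodged_plot <- base_plot + geom_bar(position = "dodge")

# Standardized bars: every bar is rescaled to height 1,
# so the segments show proportions within each verified_income level
standardized_plot <- base_plot +
  geom_bar(position = "fill") +
  labs(y = "proportion")

stacked_plot
dodged_plot
standardized_plot
```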
Filter for missing values
openintro package
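Filtering for missing values could look like the sketch below, assuming dplyr and openintro are installed. The original does not say which variable is filtered, so homeownership is used purely as an illustrative column. The object names are hypothetical.

```r
# Assumption: the openintro and dplyr packages are installed.
library(openintro)
library(dplyr)

# Number of missing values (NA) in each column of loan50
missing_per_column <- colSums(is.na(loan50))
missing_per_column

# Keep only the rows where the chosen column is not missing;
# homeownership is an illustrative choice, not taken from the original
loans_complete <- filter(loan50, !is.na(homeownership))
nrow(loans_complete)
```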