R : Exploring Data for Machine Learning Modeling


These are my notes on the Practical Machine Learning course (Week 2: Plotting Predictors tutorial).

When exploring data for Machine Learning, we're looking for:

  • imbalanced outcomes/predictors
  • outliers
  • groups of outcome points not explained by any of the predictors
  • skewed variables (that need to be transformed)
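
As a rough sketch of three of these checks (imbalance, outliers, skewness) in base R: the data frame below is simulated purely for illustration, and its column names and the `skewness` helper are my own assumptions, not part of the course material. Unexplained groups of points are usually spotted visually in plots, so they are not covered here.

```r
# Simulated data for illustration only (hypothetical columns).
set.seed(42)
n <- 500
df <- data.frame(
  jobclass = factor(sample(c("industrial", "information"), n,
                           replace = TRUE, prob = c(0.9, 0.1))),
  age      = round(runif(n, 18, 65)),
  wage     = rlnorm(n, meanlog = 4.5, sdlog = 0.5)  # right-skewed by construction
)

# 1. Imbalance: share of each class in a categorical variable.
class_share <- prop.table(table(df$jobclass))

# 2. Outliers: points beyond the boxplot whiskers
#    (1.5 * IQR from the hinges, the default rule in boxplot.stats).
wage_outliers <- boxplot.stats(df$wage)$out

# 3. Skewness: third standardized moment (hypothetical helper).
skewness <- function(x) {
  z <- x - mean(x)
  mean(z^3) / mean(z^2)^1.5
}
skew_raw <- skewness(df$wage)
skew_log <- skewness(log(df$wage))  # a log transform often reduces right skew

print(class_share)
print(length(wage_outliers))
print(c(raw = skew_raw, log = skew_log))
```

A class share far from an even split, a large number of flagged points, or a skewness far from zero are each a hint that the variable deserves a closer look before modeling.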

We'll use the Wage dataset …





R : Variance Inflation


These are my notes on the swirl course Regression Models: Overfitting and Underfitting.

Definition

A variance inflation factor (VIF) is a ratio of estimated variances: the variance due to including the ith regressor, divided by the variance due to including a corresponding ideal regressor that is uncorrelated with the others. VIF is …
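
As a rough illustration of the definition, the VIF of the ith regressor can be computed by hand with the standard identity VIF_i = 1 / (1 - R_i^2), where R_i^2 is the R-squared from regressing the ith regressor on all the others. The sketch below uses R's built-in `swiss` data; the choice of dataset and predictors, and the name `vif_by_hand`, are assumptions made for this example.

```r
# Built-in 'swiss' data; dataset and predictor set are assumptions for illustration.
predictors <- c("Agriculture", "Examination", "Education",
                "Catholic", "Infant.Mortality")

vif_by_hand <- sapply(predictors, function(p) {
  others <- setdiff(predictors, p)
  # Regress predictor p on the remaining predictors.
  fit <- lm(reformulate(others, response = p), data = swiss)
  # VIF_i = 1 / (1 - R_i^2)
  1 / (1 - summary(fit)$r.squared)
})

print(round(vif_by_hand, 2))
```

A value of 1 means the regressor is uncorrelated with the others; larger values mean the variance of its coefficient estimate is inflated by that factor.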







R : Analysis of the impact of weather events on population health and the economy


Synopsis

This analysis involves exploring the U.S. National Oceanic and Atmospheric Administration's (NOAA) storm database.

This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.

This analysis focuses …
