scikit-learn: Random Forest

A random forest is a meta estimator that fits a number of decision tree classifiers on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting ...
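As a rough illustration of the "fit many trees, then average" idea, here is a minimal sketch using scikit-learn's RandomForestClassifier. The synthetic dataset, the number of trees, and the variable names are assumptions made for this example and do not come from the post itself.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic data, used here only for illustration (assumption, not from the post).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Each of the 50 trees is fit on a bootstrap sub-sample of the training data.
forest = RandomForestClassifier(n_estimators=50, random_state=0)
forest.fit(X_train, y_train)

# Average the class probabilities predicted by the individual trees ...
tree_mean = np.mean(
    [tree.predict_proba(X_test) for tree in forest.estimators_], axis=0
)
# ... and compare with what the forest itself reports.
forest_proba = forest.predict_proba(X_test)

accuracy = forest.score(X_test, y_test)
print(f"trees: {len(forest.estimators_)}, test accuracy: {accuracy:.3f}")
print("forest output matches the tree average:", np.allclose(tree_mean, forest_proba))
```

The last line checks that the forest's predicted probabilities are the mean of the per-tree probabilities, which is the averaging step the description refers to.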

more ...








Confusion Matrix


A confusion matrix, also known as a contingency table, an error matrix, or a table of confusion, is a specific table layout that allows visualization of the performance of an algorithm, typically a supervised learning one (in unsupervised learning it is usually called a matching matrix).
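To make the layout concrete, here is a small sketch in plain Python that counts (actual, predicted) pairs into such a table. The labels and predictions are made up for illustration, and putting actual classes on the rows and predicted classes on the columns is one common convention assumed here.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Return a nested list: rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


# Hypothetical labels and predictions, for illustration only.
actual = ["cat", "cat", "dog", "dog", "dog", "cat"]
predicted = ["cat", "dog", "dog", "dog", "cat", "cat"]
labels = ["cat", "dog"]

matrix = confusion_matrix(actual, predicted, labels)

# The diagonal holds the correctly classified cases.
correct = sum(matrix[i][i] for i in range(len(labels)))
accuracy = correct / len(actual)

for label, row in zip(labels, matrix):
    print(label, row)
print("accuracy:", round(accuracy, 3))
```

Off-diagonal cells show which classes get confused with each other, which a single accuracy number hides.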

It is a table …

more ...




R: Exploring Data for Machine Learning Modeling


These are my notes on the Practical Machine Learning course (Week 2: Plotting Predictors - Tutorial).

When exploring data for Machine Learning, we're looking for:

  • imbalanced outcomes/predictors
  • outliers
  • groups of outcome points not explained by any of the predictors
  • skewed variables (that need to be transformed)
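Three of the checks in the list above (imbalance, outliers, skew) can also be done numerically before plotting. The following is a Python sketch, not the R code from the course; the helper names are hypothetical, the toy values are invented and are not taken from the Wage dataset, and the 1.5 × IQR rule is just one common outlier heuristic.

```python
from collections import Counter
from statistics import mean, pstdev, quantiles


def class_shares(labels):
    """Share of each class; very uneven shares suggest an imbalanced outcome."""
    counts = Counter(labels)
    return {label: n / len(labels) for label, n in counts.items()}


def iqr_outliers(values, k=1.5):
    """Values lying more than k * IQR outside the first or third quartile."""
    q1, _, q3 = quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v for v in values if v < low or v > high]


def skewness(values):
    """Moment-based skewness; values far from 0 suggest a transform (e.g. log)."""
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


# Hypothetical toy data, for illustration only.
outcome = ["no"] * 9 + ["yes"]
wages = [20, 22, 23, 25, 26, 27, 28, 30, 31, 150]

shares = class_shares(outcome)
outliers = iqr_outliers(wages)
skew = skewness(wages)

print("class shares:", shares)
print("outliers:", outliers)
print("skewness:", round(skew, 2))
```

The remaining item, groups of outcome points not explained by any predictor, is usually easier to spot visually, which is what the plotting tutorial is about.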

We'll use the Wage dataset …

more ...