An introduction to stacking and StackNet techniques
Stacking, or Stacked Generalization, is the process of combining various machine learning algorithms using holdout data. It is attributed to Wolpert (1992). It normally involves a four-stage process. Consider three datasets A, B and C. For A and B we know the ground truth (in other words, the target variable y); for C we do not, and it is the set we ultimately want predictions for. We can use stacking as follows:
- We train various machine learning algorithms (regressors or classifiers) on dataset A
- We make predictions with each of these algorithms for datasets B and C, and we create new datasets B1 and C1 that contain only these predictions, one column per model. So if we ran 10 models, then B1 and C1 have 10 columns each.
- We train a new machine learning algorithm (often referred to as Meta learner or Super learner) using B1
- We make predictions using the Meta learner on C1
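The four stages above can be sketched in a few lines of Python. This is only a minimal illustration, not the StackNet implementation: it assumes scikit-learn, uses synthetic data from `make_classification` as a stand-in for A, B and C, and the three base models and the logistic-regression meta learner are arbitrary choices made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

# Synthetic data standing in for datasets A, B and C (an assumption for this sketch).
X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_A, y_A = X[:300], y[:300]        # A: ground truth known, used to fit the base models
X_B, y_B = X[300:450], y[300:450]  # B: ground truth known, held out for the meta learner
X_C = X[450:]                      # C: target treated as unknown

# Arbitrary base models chosen for illustration.
base_models = [
    LogisticRegression(max_iter=1000),
    RandomForestClassifier(n_estimators=50, random_state=0),
    KNeighborsClassifier(n_neighbors=5),
]

# Stage 1: train every base model on A.
for model in base_models:
    model.fit(X_A, y_A)

# Stage 2: predict on B and C; B1 and C1 hold only predictions, one column per model.
B1 = np.column_stack([m.predict_proba(X_B)[:, 1] for m in base_models])
C1 = np.column_stack([m.predict_proba(X_C)[:, 1] for m in base_models])

# Stage 3: train the meta learner on B1 against the known targets of B.
meta_learner = LogisticRegression()
meta_learner.fit(B1, y_B)

# Stage 4: the meta learner predicts on C1.
final_predictions = meta_learner.predict(C1)
```

With three base models, `B1` and `C1` each have three columns. The meta learner never sees dataset A, so it learns how to weight the base models from predictions they made on data they were not trained on.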
Read next here http://blog.kaggle.com/2017/06/15/stacking-made-easy-an-introduction-to-stacknet-by-competitions-grandmaster-marios-michailidis-kazanova/