# Boosting Models

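- Weak learner: a model that predicts only slightly better than a coin flip
- Boosting: a method that combines weak learners into an ensemble, training them sequentially so that each one corrects the errors of those before it, to build a strong predictive model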

![](https://upload.wikimedia.org/wikipedia/commons/b/b5/Ensemble_Boosting.svg)
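To make the definitions above concrete, here is a minimal sketch of gradient boosting for squared-error loss, written in plain Python. The weak learner is a decision stump (a single split on one feature). The toy data, the function names, and the learning rate of 0.1 are all assumptions chosen for illustration; this is not how scikit-learn implements it internally.

```python
def fit_stump(x, residuals):
    """Return (threshold, left_mean, right_mean) minimizing squared error."""
    best = None
    # Skip the largest value: splitting there would leave the right leaf empty.
    for threshold in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def predict_stump(stump, xi):
    threshold, left_mean, right_mean = stump
    return left_mean if xi <= threshold else right_mean


def fit_boosting(x, y, n_estimators=100, learning_rate=0.1):
    """Fit stumps one after another, each on the current residuals."""
    base = sum(y) / len(y)            # initial prediction: the mean of y
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_estimators):
        # For squared-error loss the negative gradient is the residual.
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        preds = [pi + learning_rate * predict_stump(stump, xi)
                 for xi, pi in zip(x, preds)]
    return base, learning_rate, stumps


def predict_boosting(model, xi):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, xi) for s in stumps)


def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: a nonlinear target that a single stump cannot fit.
x = [float(i) for i in range(1, 11)]
y = [xi ** 2 for xi in x]

model = fit_boosting(x, y)
baseline_mse = mse(y, [sum(y) / len(y)] * len(y))
boosted_mse = mse(y, [predict_boosting(model, xi) for xi in x])
print('Mean-only MSE: {:.3f}'.format(baseline_mse))
print('Boosted MSE:   {:.3f}'.format(boosted_mse))
```

A single stump can only output two distinct values, yet the sum of many stumps, each fitted to what the previous ones got wrong, tracks the curve far more closely than the mean-only baseline.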

- Evolution of boosting models
  - AdaBoost
  - Gradient Boosting: built on decision trees
  - Stochastic Gradient Boosting (SGB): adds random subsampling, as in Random Forest
  - XGBoost: optimization
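As a rough illustration of the stochastic variant in the list above: in scikit-learn, setting `subsample` below 1.0 on `GradientBoostingRegressor` gives Stochastic Gradient Boosting, and `max_features` adds random feature selection at each split. The sketch below uses synthetic data from `make_regression` as a stand-in (an assumption, not the dataset used in this document), and the values 0.8 and 0.5 are arbitrary choices.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic stand-in data (assumption: not the auto-mpg dataset)
X_demo, y_demo = make_regression(n_samples=200, n_features=7,
                                 noise=10.0, random_state=777)

reg_sgb = GradientBoostingRegressor(n_estimators=100,
                                    max_depth=1,
                                    subsample=0.8,     # each tree is fit on a random 80% of the rows
                                    max_features=0.5,  # each split considers a random subset of the features
                                    random_state=777)
reg_sgb.fit(X_demo, y_demo)

# With subsample < 1.0 the model records the out-of-bag improvement per stage
print(reg_sgb.oob_improvement_.shape)
```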

## Environment Setup

In [1]:
import pandas as pd
import numpy as np

from sklearn import preprocessing # preprocessing

from sklearn.model_selection import train_test_split
from sklearn.model_selection import cross_val_score
from sklearn.metrics import mean_squared_error as MSE

from sklearn.ensemble import GradientBoostingRegressor

## Dataset

In [2]:
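# 2. Dataset
mpg_df = pd.read_csv('data/auto-mpg.csv', index_col='car name')
mpg_df = mpg_df[mpg_df.horsepower != '?']      # drop rows where horsepower is coded as '?'
mpg_df = mpg_df.astype({'horsepower': float})  # the '?' entries cause the column to be read as strings

# 3. Train/test split
y = mpg_df[['mpg']]
X = mpg_df.loc[:, 'cylinders':'origin']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=777)
y_train = np.ravel(y_train, order='C')  # flatten to 1-D, the shape fit() expects for y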

## Machine Learning: Gradient Boosting

In [3]:
reg_gb = GradientBoostingRegressor(n_estimators = 100,
                                   max_depth    = 1,
                                   random_state = 777)

reg_gb.fit(X_train, y_train)

GradientBoostingRegressor(max_depth=1, random_state=777)
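`cross_val_score`, imported in the setup, can give a performance estimate without touching the held-out test set. The sketch below shows one way to use it with the same hyperparameters. It runs on synthetic data from `make_regression` as a stand-in (an assumption, so that the example is self-contained), and the choice of 5 folds is arbitrary.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the training data (assumption: not the auto-mpg dataset)
X_demo, y_demo = make_regression(n_samples=200, n_features=7,
                                 noise=10.0, random_state=777)

reg_demo = GradientBoostingRegressor(n_estimators=100,
                                     max_depth=1,
                                     random_state=777)

# scikit-learn scorers follow "higher is better", so MSE comes back negated
neg_mse = cross_val_score(reg_demo, X_demo, y_demo,
                          cv=5, scoring='neg_mean_squared_error')
cv_mse = -neg_mse

print('CV MSE per fold:', np.round(cv_mse, 3))
print('Mean CV MSE: {:.3f}'.format(cv_mse.mean()))
```

On the real data, the same call would take `X_train` and `y_train` in place of the synthetic arrays.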

## Predictive Performance

In [4]:
y_pred = reg_gb.predict(X_test)

print('Gradient Boosting Regression: {:.3f}'.format(MSE(y_test, y_pred)))

Gradient Boosting Regression: 7.618
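The printed value is a mean squared error, so its unit is mpg squared. Taking the square root gives the RMSE, which is in the same unit as the target and is easier to interpret. A small sketch using the value reported above:

```python
import math

test_mse = 7.618                 # test MSE printed above
test_rmse = math.sqrt(test_mse)  # back on the mpg scale

print('RMSE: {:.3f} mpg'.format(test_rmse))  # RMSE: 2.760 mpg
```

In other words, the model's predictions on the test set are off by roughly 2.8 mpg in the root-mean-square sense.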
