library(xgboost)
data(iris)
# Convert the Species factor to an integer class starting at 0
# This is picky, but it's a requirement for XGBoost
species = iris$Species
label = as.integer(iris$Species)-1
iris$Species = NULL
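# Optional sanity check: confirm the zero-based codes line up with the factor levels
table(species, label)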
n = nrow(iris)
train.index = sample(n,floor(0.75*n))
train.data = as.matrix(iris[train.index,])
train.label = label[train.index]
test.data = as.matrix(iris[-train.index,])
test.label = label[-train.index]
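# Note: sample() draws a random split, so results vary between runs.
# For reproducibility, set a seed before the sample() call above, e.g.:
# set.seed(42)  # any fixed value works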
# Transform the two data sets into xgb.DMatrix
xgb.train = xgb.DMatrix(data=train.data,label=train.label)
xgb.test = xgb.DMatrix(data=test.data,label=test.label)
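# Quick shape check: xgb.DMatrix objects support dim(), so we can verify
# the train/test matrices before fitting
dim(xgb.train)
dim(xgb.test)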
# Define the parameters for multinomial classification
num_class = length(levels(species))
params = list(
booster="gbtree",
eta=0.001,
max_depth=5,
gamma=3,
subsample=0.75,
colsample_bytree=1,
objective="multi:softprob",
eval_metric="mlogloss",
num_class=num_class
)
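# Aside: objective "multi:softmax" would return the predicted class index
# directly instead of per-class probabilities, skipping the which.max step
# used below, at the cost of losing the probability estimates. A minimal
# variant (not used here):
# params_softmax = modifyList(params, list(objective="multi:softmax"))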
# Train the XGBoost classifier
xgb.fit=xgb.train(
params=params,
data=xgb.train,
nrounds=10000,
nthread=1,
early_stopping_rounds=10,
watchlist=list(val1=xgb.train,val2=xgb.test),
verbose=0
)
# Review the final model and results
xgb.fit
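# With early stopping, xgb.train records the best round on the fitted model
xgb.fit$best_iteration
xgb.fit$best_score
# Feature importance from the trained booster
xgb.importance(model = xgb.fit)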
# Predict outcomes with the test data
# reshape=TRUE turns the flat softprob vector into an n-by-num_class matrix
xgb.pred = predict(xgb.fit,test.data,reshape=TRUE)
xgb.pred = as.data.frame(xgb.pred)
colnames(xgb.pred) = levels(species)
# Use the predicted label with the highest probability
xgb.pred$prediction = apply(xgb.pred,1,function(x) colnames(xgb.pred)[which.max(x)])
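# A vectorized alternative to the apply() call above (assumes the class
# probabilities sit in the first num_class columns; note max.col() breaks
# ties at random, while which.max() takes the first maximum):
# xgb.pred$prediction = colnames(xgb.pred)[max.col(as.matrix(xgb.pred[,1:num_class]))]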
xgb.pred$label = levels(species)[test.label+1]
result = sum(xgb.pred$prediction==xgb.pred$label)/nrow(xgb.pred)
print(paste("Fianl Accuracy =", sprintf("%1.2f%%",100*result)))