ID3 Algorithm – R Programming
School of Computer & Information Sciences
ITS 836 Data Science and Big Data Analytics
HW07 Lecture 07 Classification
Questions
R exercise for Decision Tree, section 7_1
Explain how the Random Forest algorithm works
Iris dataset with Decision Tree vs. Random Forest
R exercise for Naïve Bayes, section 7_2
Analyze classifier performance, section 7_3
Redo the calculations for ID3 and Naïve Bayes for the Golf dataset
HW07 Q1 Apply the ID3 algorithm to demonstrate the decision tree for the following dataset
http://www.cse.unsw.edu.au/~billw/cs9414/notes/ml/06prop/id3/id3.html
Select | Size   | Color | Shape
yes    | medium | blue  | brick
yes    | small  | red   | sphere
yes    | large  | green | pillar
yes    | large  | green | sphere
no     | small  | red   | wedge
no     | large  | red   | wedge
no     | large  | red   | pillar
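ID3 picks, at each node, the attribute whose split maximizes information gain, gain(S, A) = H(S) − Σ_v (|S_v|/|S|) · H(S_v). A minimal R sketch of that calculation for the table above (the entropy and info_gain helpers and the shapes data frame are written here for illustration; they are not part of the course code):

entropy <- function(labels) {
  # Shannon entropy of a label vector, in bits
  p <- table(labels) / length(labels)
  -sum(p * log2(p))
}

info_gain <- function(data, attribute, target = "Select") {
  # H(S) minus the size-weighted entropy of each partition S_v
  h_s <- entropy(data[[target]])
  splits <- split(data[[target]], data[[attribute]])
  h_s - sum(sapply(splits, function(s) length(s) / nrow(data) * entropy(s)))
}

shapes <- data.frame(
  Select = c("yes","yes","yes","yes","no","no","no"),
  Size   = c("medium","small","large","large","small","large","large"),
  Color  = c("blue","red","green","green","red","red","red"),
  Shape  = c("brick","sphere","pillar","sphere","wedge","wedge","pillar")
)

sapply(c("Size","Color","Shape"), function(a) info_gain(shapes, a))

Shape should come out with the largest gain, so ID3 places it at the root; the subtrees are then built by recursing on each partition.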
HW07 Q2
Analyze the R code in section 7_1 to create the decision tree classifier for the dataset bank-sample.csv
Create and explain all plots and results
# install packages rpart, rpart.plot
# put this code into an RStudio source file and execute lines via Ctrl+Enter
library("rpart")
library("rpart.plot")
setwd("c:/data/rstudiofiles/")
banktrain <- read.table("bank-sample.csv", header=TRUE, sep=",")
## drop a few columns to simplify the tree
drops <- c("age", "balance", "day", "campaign", "pdays", "previous", "month")
banktrain <- banktrain[, !(names(banktrain) %in% drops)]
summary(banktrain)
# Make a simple decision tree by only keeping the categorical variables
fit <- rpart(subscribed ~ job + marital + education + default + housing +
               loan + contact + poutcome,
             method="class",
             data=banktrain,
             control=rpart.control(minsplit=1),
             parms=list(split="information"))
summary(fit)
# Plot the tree
rpart.plot(fit, type=4, extra=2, clip.right.labs=FALSE, varlen=0, faclen=3)
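Beyond the plot, the fitted tree can be sanity-checked by scoring records with predict(); a small sketch (the pred_class name is illustrative, not from the lecture code):

# Score the training set with the fitted tree and tabulate
# predicted vs. actual labels (assumes fit and banktrain from above)
pred_class <- predict(fit, banktrain, type="class")
table(predicted=pred_class, actual=banktrain$subscribed)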
HW07 Q3
Explain how the Random Forest algorithm works
http://blog.citizennet.com/blog/2012/11/10/random-forests-ensembles-and-performance-metrics
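In brief: a random forest grows many decision trees, each trained on a bootstrap sample of the data and restricted to a random subset of features at each split, and it classifies new records by majority vote across the trees. A minimal sketch using the randomForest package on the built-in iris data (the parameter values are illustrative, not prescribed by the assignment):

# install.packages("randomForest")
library(randomForest)
set.seed(42)                     # make the bootstrap samples reproducible
rf <- randomForest(Species ~ ., data=iris,
                   ntree=500,    # number of bootstrapped trees
                   mtry=2,       # features considered at each split
                   importance=TRUE)
print(rf)                        # out-of-bag (OOB) error and confusion matrix
importance(rf)                   # per-variable importance measures

The out-of-bag error printed by the model is an internal estimate of test error: each tree is evaluated on the records left out of its own bootstrap sample.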
HW07 Q4 Using the Iris Dataset
Use the Decision Tree classifier and Random Forest
Attributes: sepal length, sepal width, petal length, petal width
All flowers contain a sepal and a petal
The dataset holds three iris species (Setosa, Versicolor, Virginica) with differing measurements
R.A. Fisher, 1936
HW07 Q4 Using the Iris Dataset
Decision Tree applied to the Iris dataset:
https://rpubs.com/abhaypadda/k-nn-decision-tree-on-IRIS-dataset or
https://davetang.org/muse/2013/03/12/building-a-classification-tree-in-r/
What are the disadvantages of decision trees?
https://www.quora.com/What-are-the-disadvantages-of-using-a-decision-tree-for-classification
Random Forest applied to the Iris dataset, compared against the decision tree (a side-by-side sketch follows this list):
https://rpubs.com/rpadebet/269829
http://rischanlab.github.io/RandomForest.html
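A minimal side-by-side sketch on iris, assuming a simple random train/test split (the split sizes and object names are illustrative):

# Decision tree vs. random forest on iris with a held-out test set
library(rpart)
library(randomForest)
set.seed(1)
idx   <- sample(nrow(iris), 100)   # 100 training rows, 50 held out
train <- iris[idx, ]
test  <- iris[-idx, ]
tree <- rpart(Species ~ ., data=train, method="class")
rf   <- randomForest(Species ~ ., data=train, ntree=500)
mean(predict(tree, test, type="class") == test$Species)   # tree accuracy
mean(predict(rf, test) == test$Species)                   # forest accuracy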
HW07 Q5 Section 7.2 Naïve Bayes in R
Get the data and the e1071 package
sample <- read.table("sample1.csv", header=TRUE, sep=",")
traindata <- as.data.frame(sample[1:14,])
testdata <- as.data.frame(sample[15,])
traindata  # lists the training data
testdata   # lists the test data; no Enrolls variable
install.packages("e1071", dep = TRUE)
library(e1071)  # contains the naiveBayes function
model <- naiveBayes(Enrolls ~ Age + Income + JobSatisfaction + Desire, traindata)
model    # prints the model: class priors and conditional probabilities
results <- predict(model, testdata)
results  # provides the test prediction
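Under the hood, naiveBayes scores each class as P(class) · Π_i P(x_i | class) and normalizes. The sketch below recomputes that product by hand from the model's stored tables; it assumes, as in the lecture example, that all four predictors are categorical (the indexing into model$tables is illustrative and worth verifying against str(model)):

# Recompute the naive Bayes posterior for the single test record
prior <- model$apriori / sum(model$apriori)   # P(class) from class counts
score <- sapply(names(prior), function(cls) {
  lik <- sapply(c("Age","Income","JobSatisfaction","Desire"), function(v)
    model$tables[[v]][cls, as.character(testdata[[v]])])  # P(x_i | class)
  prior[cls] * prod(lik)
})
score / sum(score)   # normalized posterior probabilities per class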
7.3 Classifier Performance
# install some packages
install.packages("ROCR")
library(ROCR)
library(e1071)   # provides naiveBayes(), used below
# training set
banktrain <- read.table("bank-sample.csv", header=TRUE, sep=",")
# drop a few columns
drops <- c("balance", "day", "campaign", "pdays", "previous", "month")
banktrain <- banktrain[, !(names(banktrain) %in% drops)]
# testing set
banktest <- read.table("bank-sample-test.csv", header=TRUE, sep=",")
banktest <- banktest[, !(names(banktest) %in% drops)]
# build the naïve Bayes classifier
nb_model <- naiveBayes(subscribed ~ ., data=banktrain)
# perform prediction on the testing set
nb_prediction <- predict(nb_model,
                         # remove column "subscribed"
                         banktest[, -ncol(banktest)],
                         type="raw")
score <- nb_prediction[, c("yes")]
actual_class <- banktest$subscribed == "yes"
pred <- prediction(score, actual_class)
perf <- performance(pred, "tpr", "fpr")
plot(perf, lwd=2, xlab="False Positive Rate (FPR)",
     ylab="True Positive Rate (TPR)")
abline(a=0, b=1, col="gray50", lty=3)
## corresponding AUC score
auc <- performance(pred, "auc")
auc <- unlist(slot(auc, "y.values"))
auc
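The ROC curve sweeps a decision threshold across these scores; fixing one threshold (0.5 below, an arbitrary illustrative choice) collapses the same scores into a single confusion matrix:

# Confusion matrix at a fixed 0.5 threshold (assumes score and
# actual_class from above; the threshold is illustrative)
predicted_class <- score >= 0.5
table(predicted=predicted_class, actual=actual_class)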
7.3 Diagnostics of Classifiers
We cover three classifiers:
Logistic regression, decision trees, and naïve Bayes
Tools to evaluate classifier performance:
Confusion matrix
7.3 Diagnostics of Classifiers
Bank marketing example
Training set of 2000 records
Test set of 100 records, evaluated with the confusion-matrix diagnostics sketched below
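From confusion-matrix counts, the standard diagnostics follow directly; a minimal sketch (the TP/FP/TN/FN values are placeholders summing to a 100-record test set, not the bank example's actual numbers):

# Classifier diagnostics from confusion-matrix counts
# (placeholder counts, not the actual bank-marketing results)
TP <- 3; FP <- 2; TN <- 90; FN <- 5
accuracy  <- (TP + TN) / (TP + TN + FP + FN)
tpr       <- TP / (TP + FN)    # true positive rate (recall)
fpr       <- FP / (FP + TN)    # false positive rate
precision <- TP / (TP + FP)
c(accuracy=accuracy, TPR=tpr, FPR=fpr, precision=precision)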
HW07 Q7 Review the calculations for the ID3 and Naïve Bayes algorithms
Record | Outlook  | Temperature | Humidity | Windy | Play Golf
0      | Rainy    | Hot         | High     | False | No
1      | Rainy    | Hot         | High     | True  | No
2      | Overcast | Hot         | High     | False | Yes
3      | Sunny    | Mild        | High     | False | Yes
4      | Sunny    | Cool        | Normal   | False | Yes
5      | Sunny    | Cool        | Normal   | True  | No
6      | Overcast | Cool        | Normal   | True  | Yes
7      | Rainy    | Mild        | High     | False | No
8      | Rainy    | Cool        | Normal   | False | Yes
9      | Sunny    | Mild        | Normal   | False | Yes
10     | Rainy    | Mild        | Normal   | True  | Yes
11     | Overcast | Mild        | High     | True  | Yes
12     | Overcast | Hot         | Normal   | False | Yes
13     | Sunny    | Mild        | High     | True  | No
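As a starting point for the redo, here is ID3's root-node information gain for OUTLOOK computed in R from the table's counts (a sketch; the other attributes, and the Naïve Bayes class posteriors, follow the same pattern):

# Information gain of OUTLOOK on the golf table (9 Yes / 5 No overall)
entropy <- function(counts) {
  p <- counts / sum(counts)
  -sum(p[p > 0] * log2(p[p > 0]))
}
h_play <- entropy(c(9, 5))
# OUTLOOK partitions: Rainy 2 Yes/3 No, Overcast 4 Yes/0 No, Sunny 3 Yes/2 No
h_cond <- 5/14 * entropy(c(2, 3)) +
          4/14 * entropy(c(4, 0)) +
          5/14 * entropy(c(3, 2))
h_play - h_cond   # information gain, about 0.247 bits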
Questions?