Data Mining - Management
Classification Trees Analysis
This assignment gives you hands-on experience using R to conduct classification tree analysis on a real-world data set. Please refer to Chapter 9 of the reference textbook (through the link at the bottom under Lessons) for details on how to generate classification tree models and evaluate model performance. Then open A Complete Guide On Decision Tree Algorithm (or the attached Week 5 A Complete Guide On Decision Tree Algorithm.docx), go over the mushrooms.csv example, and use the same R code to reproduce the results step by step. Study how the model is explained and how the results are evaluated:
Step 1: Install and load libraries
Step 2: Import the data set
Step 3: Data Cleaning
Step 4: Data Exploration and Analysis
Step 5: Data Splicing
Step 6: Building a model
Step 7: Visualizing the tree
Step 8: Testing the model
Step 9: Calculating accuracy
Now open the file mushrooms2.csv (slightly different from the sample data set) and repeat the same analysis as on the website, following the steps above, to conduct a classification tree analysis. Copy and paste screen images of your work in R into a Word document for submission. Be sure to provide a narrative with your answers (i.e., do not just copy and paste your output without explaining what you did and what you found). Include an introduction, R code with outputs, and figures with explanations, along with cover and reference pages. A good conclusion to wrap up the assignment is also expected. Follow APA format.
Reference
https://www.edureka.co/blog/decision-tree-algorithm/
Week 5 Assignment Instructions and Sample R codes
# Installing libraries (package names must be quoted)
install.packages("rpart")
install.packages("caret")
install.packages("rpart.plot")
install.packages("rattle")
#Loading libraries
library(rpart,quietly = TRUE)
library(caret,quietly = TRUE)
library(rpart.plot,quietly = TRUE)
library(rattle)
# Reading the data set as a data frame
getwd()  # check which working directory you are in
# Set the working directory to the folder that holds mushrooms.csv, for example your desktop
setwd("C:/Users/alpha/Desktop")
mushrooms <- read.csv("mushrooms.csv")
# structure of the data
str(mushrooms)
# number of rows with missing values
nrow(mushrooms) - sum(complete.cases(mushrooms))
# deleting redundant variable `veil.type`
mushrooms$veil.type <- NULL
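One way to see why a variable such as `veil.type` is redundant is to count the distinct values in each column: a column with a single value cannot help separate the classes. The sketch below is only an illustration. It uses a small made-up data frame (`toy`, a hypothetical stand-in, not the mushrooms data) so that it runs on its own.

```r
# Toy data frame standing in for mushrooms (values are made up for illustration)
toy <- data.frame(
  class     = c("e", "p", "e", "p"),
  odor      = c("a", "p", "l", "p"),
  veil.type = c("p", "p", "p", "p"),
  stringsAsFactors = FALSE
)

# Number of distinct values in each column
n.levels <- sapply(toy, function(col) length(unique(col)))
print(n.levels)

# Columns with a single distinct value carry no information for splitting
constant.cols <- names(n.levels)[n.levels == 1]
print(constant.cols)

# Keep only the columns that vary
toy <- toy[, n.levels > 1, drop = FALSE]
str(toy)
```

Running the same count on the real data (replace `toy` with `mushrooms`) should show whether any other column is constant in mushrooms2.csv.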
# analyzing the odor variable
table(mushrooms$class, mushrooms$odor)
# for each feature, count the zero cells in its class-by-level table
number.perfect.splits <- apply(X = mushrooms[-1], MARGIN = 2, FUN = function(col) {
  t <- table(mushrooms$class, col)
  sum(t == 0)
})
# Descending order of perfect splits
order <- order(number.perfect.splits,decreasing = TRUE)
number.perfect.splits <- number.perfect.splits[order]
# Plot graph
par(mar=c(10,2,2,2))
barplot(number.perfect.splits, main = "Number of perfect splits vs feature", xlab = "", ylab = "Feature", las = 2, col = "wheat")
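The "perfect split" count above may be easier to follow on a tiny example. With two classes, a zero cell in the class-by-level table means that level occurs in only one class, so a split isolating that level gives a pure node. The sketch below uses a made-up data frame (`toy` and its columns are hypothetical) and the same counting idea.

```r
# Toy data: odor levels "a" and "f" each occur in one class only; size does not separate classes
toy <- data.frame(
  class = c("e", "e", "e", "p", "p", "p"),
  odor  = c("a", "a", "n", "f", "f", "n"),
  size  = c("b", "n", "b", "n", "b", "n"),
  stringsAsFactors = FALSE
)

# Class-by-level table for one feature
print(table(toy$class, toy$odor))

# Count zero cells per feature (same idea as number.perfect.splits)
zero.cells <- sapply(toy[-1], function(col) sum(table(toy$class, col) == 0))
print(zero.cells)
```

Here `odor` has two zero cells and `size` has none, which is why `odor` would rank higher in the bar plot.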
#data splicing
set.seed(12345)
train <- sample(1:nrow(mushrooms),size = ceiling(0.80*nrow(mushrooms)),replace = FALSE)
# training set
mushrooms_train <- mushrooms[train,]
# test set
mushrooms_test <- mushrooms[-train,]
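As a sanity check on the 80/20 split, it can help to confirm the sizes of the two sets and that no row lands in both. The sketch below repeats the same sampling pattern on a made-up 10-row data frame (`toy` and its `id` column are hypothetical) so it runs without the CSV file.

```r
set.seed(12345)

# Toy data frame with a row identifier so overlap can be checked
toy <- data.frame(id = 1:10, class = rep(c("e", "p"), 5))

# Same sampling pattern as above: 80% of the row indices, without replacement
train <- sample(1:nrow(toy), size = ceiling(0.80 * nrow(toy)), replace = FALSE)
toy_train <- toy[train, ]
toy_test  <- toy[-train, ]

# Sizes of the two sets
print(c(train = nrow(toy_train), test = nrow(toy_test)))

# Rows shared by both sets (should be none)
overlap <- intersect(toy_train$id, toy_test$id)
print(length(overlap))

# Class proportions in the training set
print(prop.table(table(toy_train$class)))
```

On the real data, comparing `prop.table(table(mushrooms_train$class))` with the test-set proportions shows whether the random split kept the classes roughly balanced.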
# penalty matrix
penalty.matrix <- matrix(c(0,1,10,0), byrow=TRUE, nrow=2)
# building the classification tree with rpart
tree <- rpart(class ~ .,
              data = mushrooms_train,
              parms = list(loss = penalty.matrix), method = "class")
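The penalty matrix is easier to read with row and column labels. The sketch below assumes rpart's loss-matrix convention (rows are the actual class, columns the predicted class) and assumes the class levels are ordered `e` then `p`, as alphabetical factor levels would be. The dimension names `actual` and `predicted` are added here only for illustration.

```r
# Same values as penalty.matrix, with labels added for readability
# (assumption: class levels are ordered "e", "p")
labelled.penalty <- matrix(
  c(0, 1, 10, 0), byrow = TRUE, nrow = 2,
  dimnames = list(actual = c("e", "p"), predicted = c("e", "p"))
)
print(labelled.penalty)

# Cost of calling a poisonous mushroom edible
cost.p.as.e <- labelled.penalty["p", "e"]

# Cost of calling an edible mushroom poisonous
cost.e.as.p <- labelled.penalty["e", "p"]

print(c(poisonous_as_edible = cost.p.as.e, edible_as_poisonous = cost.e.as.p))
```

Under these assumptions, predicting "edible" for a poisonous mushroom is penalized ten times more heavily than the opposite error, which pushes the tree toward the safer mistake.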
# Visualize the decision tree with rpart.plot
rpart.plot(tree, nn=TRUE)
#Testing the model
pred <- predict(object = tree, mushrooms_test[-1], type = "class")
#Calculating accuracy
t <- table(mushrooms_test$class,pred)
confusionMatrix(t)
pred
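The accuracy reported in Step 9 can also be checked by hand from the confusion table: it is the sum of the diagonal (correct predictions) divided by the total count. The sketch below uses made-up actual and predicted labels (both vectors are hypothetical) so that it runs without the fitted tree.

```r
# Made-up labels for illustration only
actual    <- c("e", "e", "e", "p", "p", "p", "e", "p")
predicted <- c("e", "e", "p", "p", "p", "e", "e", "p")

# Confusion table: rows are actual classes, columns are predicted classes
conf.table <- table(actual, predicted)
print(conf.table)

# Accuracy = correct predictions / all predictions
accuracy <- sum(diag(conf.table)) / sum(conf.table)
print(accuracy)

# Poisonous mushrooms predicted as edible (the costly error in this problem)
missed.poisonous <- conf.table["p", "e"]
print(missed.poisonous)
```

In the write-up, reporting the number of poisonous mushrooms predicted as edible alongside the overall accuracy gives a fuller picture than accuracy alone.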
class cap-shape cap-surface cap-color bruises odor gill-attachment gill-spacing gill-size gill-color stalk-shape stalk-root stalk-surface-above-ring stalk-surface-below-ring stalk-color-above-ring stalk-color-below-ring veil-type veil-color ring-number ring-type spore-print-color population habitat
p x s n t p f c n k e e s s w w p w o p k s u
e x s y t a f c b k e c s s w w p w o p n n g
e b s w t l f c b n e c s s w w p w o p n n m
p x y w t p f c n n e e s s w w p w o p k s u
e x s g f n f w b k t e s s w w p w o e n a g
e x y y t a f c b n e c s s w w p w o p k n g
e b s w t a f c b g e c s s w w p w o p k n m
e b y w t l f c b n e c s s w w p w o p n s m
p x y w t p f c n p e e s s w w p w o p k v g
e b s y t a f c b g e c s s w w p w o p k s m
e x y y t l f c b g e c s s w w p w o p n n g
e x y y t a f c b n e c s s w w p w o p k s m
e b s y t a f c b w e c s s w w p w o p n s g
p x y w t p f c n k e e s s w w p w o p n v u
e x f n f n f w b n t e s f w w p w o e k a g
e s f g f n f c n k e e s s w w p w o p n y u
e f f w f n f w b k t e s s w w p w o e n a g
p x s n t p f c n n e e s s w w p w o p k s g
p x y w t p f c n n e e s s w w p w o p n s u
p x s n t p f c n k e e s s w w p w o p n s u
e b s y t a f c b k e c s s w w p w o p n s m
p x y n t p f c n n e e s s w w p w o p n v g
e b y y t l f c b k e c s s w w p w o p n s m
e b y w t a f c b w e c s s w w p w o p n n m
e b s w t l f c b g e c s s w w p w o p k s m
p f s w t p f c n n e e s s w w p w o p n v g
e x y y t a f c b n e c s s w w p w o p n n m
e x y w t l f c b w e c s s w w p w o p n n m
e f f n f n f c n k e e s s w w p w o p k y u
e x s y t a f w n n t b s s w w p w o p n v d
e b s y t l f c b g e c s s w w p w o p n n m
p x y w t p f c n k e e s s w w p w o p n s u
e x y y t l f c b n e c s s w w p w o p n n m
e x y n t l f c b p e r s y w w p w o p n y p
e b y y t l f c b n e c s s w w p w o p n s m
e x f y t l f w n w t b s s w w p w o p n v d
e s f g f n f c n k e e s s w w p w o p k v u
p x y n t p f c n w e e s s w w p w o p n s u
e x f y t a f w n p t b s s w w p w o p n v d
e b s y t l f c b k e c s s w w p w o p k s m
e b y y t a f c b n e c s s w w p w o p n s g
e x y y t l f c b n e r s y w w p w o p k y p
e x f n f n f c n g e e s s w w p w o p k y u
p x y w t p f c n p e e s s w w p w o p n v g
e x s y t a f c b w e c s s w w p w o p k n m
e x y w t a f c b n e c s s w w p w o p n n g
e x y y t l f c b k e c s s w w p w o p k s m
e x s w t l f c b w e c s s w w p w o p n n m
e x y y t l f c b n e r s y w w p w o p n s p
e f y y t l f c b w e r s y w w p w o p k s p
e x y n t a f c b w e r s y w w p w o p k s g
e x s w t l f c b k e c s s w w p w o p k s g
e b s w t l f c b k e c s s w w p w o p n n m
p x y n t p f c n k e e s s w w p w o p n v u
p x s w t p f c n k e e s s w w p w o p k v u
e b y y t a f c b w e c s s w w p w o p k s m
e f f g f n f w b n t e s s w w p w o e n a g
e b s w t a f c b w e c s s w w p w o p n n g
e x s y t l f c b k e c s s w w p w o p k n g
e x y n t a f c b p e r s y w w p w o p k y p
e s f g f n f c n k e e s s w w p w o p n v u
e b y y t a f c b k e c s s w w p w o p n s m
e b s y t l f c b g e c s s w w p w o p n s m
e b y y t l f c b g e c s s w w p w o p n n m
e b y w t l f c b n e c s s w w p w o p n s g
e f s n f n f w b k t e s s w w p w o e k a g
e x s w t l f c b n e c s s w w p w o p k s g
e f y y t a f c b w e r s y w w p w o p n s g
e x y y t a f c b w e c s s w w p w o p k n g
e x f g f n f c n p e e s s w w p w o p n v u
e f f y t l f w n p t b s s w w p w o p n v d
e b y w t l f c b g e c s s w w p w o p n s m
e f f y t l f w n w t b s s w w p w o p n v d
e x y n t a f c b p e r s y w w p w o p k s p
e b s y t a f c b k e c s s w w p w o p k s g
e f s y t l f w n p t b s s w w p w o p n v d
e x s w t l f w n n t b s s w w p w o p u v d
e f y n t l f c b p e r s y w w p w o p n y p
p x y n t p f c n w e e s s w w p w o p n v u
e f y n t a f c b n e r s y w w p w o p n y g
e x s n f n f w b k t e f s w w p w o e n s g
p x y w t p f c n w e e s s w w p w o p k s g
e f f g f n f c n n e e s s w w p w o p n y u
e x f g f n f w b n t e s s w w p w o e n s g
e x y y t l f c b w e r s y w w p w o p k s g
e x s n f n f w b k t e s s w w p w o e k s g
e b s w t a f c b w e c s s w w p w o p k s g
e x s w t l f c b n e c s s w w p w o p n s g
e f y n t l f c b w e r s y w w p w o p k y g
e s f n f n f c n n e e s s w w p w o p n v u
e x f n f n f c n n e e s s w w p w o p n y u
e b s w t l f c b k e c s s w w p w o p k s g
e x y y t a f c b g e c s s w w p w o p k s g
e x y y t l f c b g e c s s w w p w o p k n m
e x s n f n f w b n t e s s w w p w o e n a g
e x s w t a f c b g e c s s w w p w o p n s g
e f y n t l f c b p e r s y w w p w o p n s g
e x s y t a f c b n e c s s w w p w o p k n g
e b s w t a f c b g e c s s w w p w o p n s g
e x y w t a f c b g e c s s w w p w o p k s g
e x f n f n f w b p t e f s w w p w o e k s g
e b s y t l f c b n e c s s w w p w o p k n g
e f y y t l f c b w e r s y w w p w o p n s g
e x y y t a f c b n e r s y w w p w o p k y p
e b y w t l f c b g e c s s w w p w o p n n g
e x y y t a f c b n e c s s w w p w o p k n m
e x y y t a f c b w e r s y w w p w o p n y g
e b y w t l f c b n e c s s w w p w o p k s m
e b y w t a f c b g e c s s w w p w o p n s m
e x s y t a f c b k e c s s w w p w o p k n m
e x s y t l f c b w e c s s w w p w o p k n g
e s f g f n f c n g e e s s w w p w o p k y u
e x f w t a f w n w t b s s w w p w o p u v d
e x s y t a f c b n e c s s w w p w o p k n m
p x y w t p f c n n e e s s w w p w o p n v u
e x y y t l f c b p e r s y w w p w o p n s g
e s f g f n f c n p e e s s w w p w o p n y u
e x y y t l f c b w e r s y w w p w o p k y g
e x s y t l f w n p t b s s w w p w o p u v d
e s f n f n f c n k e e s s w w p w o p n y u
p x s w t p f c n k e e s s w w p w o p k v g
e x y w t a f c b g e c s s w w p w o p n n m
p f y n t p f c n p e e s s w w p w o p k v g
e f s g f n f w b k t e s s w w p w o e n a g
e x s y t l f c b g e c s s w w p w o p n s m
e x s w f n f w b n t e s f w w p w o e k s g
e b s y t a f c b w e c s s w w p w o p n n g
e f f g f n f w b h t e s s w w p w o e n a g
e x s w t l f c b n e c s s w w p w o p k n g
e b s w t l f c b n e c s s w w p w o p n s m
e b s w t l f c b w e c s s w w p w o p n s g
e b y w t l f c b w e c s s w w p w o p n s m
e f s w t l f w n w t b s s w w p w o p u v d
e x y y t l f c b g e c s s w w p w o p k s m
e f s w t a f w n p t b s s w w p w o p n v d
p x y w t p f c n w e e s s w w p w o p n v u
e f f w t l f w n w t b s s w w p w o p n v d
e x y y t a f c b n e c s s w w p w o p n s g
p x s n t p f c n p e e s s w w p w o p n v g
e b s y t l f c b w e c s s w w p w o p n n g
e x y n t a f c b w e r s y w w p w o p k y p
e b y y t l f c b g e c s s w w p w o p k n m
e s f n f n f c n k e e s s w w p w o p n v u
e f y n t a f c b w e r s y w w p w o p k y p
e x y y t a f c b k e c s s w w p w o p k n g
e x f g f n f w b k t e f f w w p w o e k s g
e f f w f n f w b k t e s f w w p w o e n a g
e x y y t l f c b w e c s s w w p w o p n n m
e b s y t l f c b k e c s s w w p w o p k n g
e b y w t a f c b g e c s s w w p w o p k n m
e x y w t a f c b w e c s s w w p w o p n s g
e x s n f n f w b p t e f s w w p w o e n a g
e x y w t l f c b g e c s s w w p w o p n s g
e s f n f n f c n k e e s s w w p w o p k v u
e x s w t a f w n w t b s s w w p w o p u v d
e x y n t l f c b w e r s y w w p w o p k s g
e b y y t a f c b k e c s s w w p w o p n n g
e x y w t a f c b k e c s s w w p w o p n n g
e b y w t a f c b n e c s s w w p w o p k s m
e b s y t a f c b g e c s s w w p w o p k s g
e b s y t l f c b g e c s s w w p w o p k s m
e b y y t a f c b n e c s s w w p w o p k n g
e x f n f n f c n k e e s s w w p w o p n y u
e f y n t l f c b n e r s y w w p w o p n y g
e x y w t a f c b k e c s s w w p w o p n s g
e f y y t l f c b w e r s y w w p w o p n y p
e b s w t a f c b w e c s s w w p w o p n s g
e b s w t a f c b w e c s s w w p w o p k s m
e x y n t l f c b w e r s y w w p w o p k y g
e b s w t a f c b k e c s s w w p w o p k s g
e x f g f n f c n g e e s s w w p w o p n y u
e b s y t l f c b k e c s s w w p w o p n s g
e x f y t l f w n n t b s s w w p w o p u v d
e b y y t a f c b w e c s s w w p w o p k s g
e f y y t l f c b p e r s y w w p w o p n s g
e b y w t l f c b w e c s s w w p w o p k n m
e b y w t a f c b k e c s s w w p w o p k n m
e b y y t a f c b g e c s s w w p w o p n s g
e x y y t l f c b g e c s s w w p w o p n n m
e b s y t l f c b g e c s s w w p w o p n n g
p x y w t p f c n p e e s s w w p w o p n v u
e s f n f n f c n g e e s s w w p w o p n y u
e f f n f n f c n g e e s s w w p w o p k v u
e x s y t a f c b g e c s s w w p w o p n s m
e f y n t a f c b p e r s y w w p w o p k s p
p x y w t p f c n k e e s s w w p w o p k s g
e b s w t l f c b g e c s s w w p w o p n s m
e f f g f n f c n p e e s s w w p w o p k v u
e b y y t l f c b g e c s s w w p w o p n n g
e x y n t a f c b w e r s y w w p w o p n y p
e x f w f n f w b p t e s f w w p w o e k s g
e x s w t l f w n w t b s s w w p w o p n v d
e b s w t l f c b w e c s s w w p w o p n s m
e f s y t a f w n w t b s s w w p w o p n v d
e x s y t a f c b k e c s s w w p w o p n n m
e f f g f n f c n g e e s s w w p w o p n y u
e b s y t a f c b n e c s s w w p w o p k s g
e x s w t l f c b g e c s s w w p w o p n n m
e x y w t l f c b n e c s s w w p w o p n s m
e f s w t a f w n n t b s s w w p w o p n v d
e x y y t l f c b w e c s s w w p w o p n s g
e b s w t l f c b w e c s s w w p w o p k s g
e x s w t a f c b g e c s s w w p w o p k n g
e x f w f n f w b h t e f s w w p w o e k s g
e f y n t l f c b n e r s y w w p w o p k y p
p x s w t p f c n k e e s s w w p w o p n v u
e b s w t a f c b n e c s s w w p w o p n n g
e b s w t a f c b k e c s s w w p w o p k s m
e b y w t l f c b n e c s s w w p w o p n n g
e b y w t a f c b w e c s s w w p w o p k n g
e x s y t a f c b w e c s s w w p w o p k s g
e b s w t l f c b w e c s s w w p w o p k n m
e x f y t a f w n n t b s s w w p w o p n v d
e x f g f n f c n n e e s s w w p w o p n y u
e f y y t a f c b n e r s y w w p w o p n s g
e b s w t l f c b n e c s s w w p w o p k n m
e x s y t a f c b g e c s s w w p w o p k s g
e x y y t a f c b k e c s s w w p w o p k s g
e x y w t l f c b w e c s s w w p w o p n s g
e s f g f n f c n p e e s s w w p w o p n v u
e x s w t a f c b n e c s s w w p w o p n n m
p x s w t p f c n k e e s s w w p w o p k s g
e x y y t a f c b n e r s y w w p w o p n s g
e f f w t a f w n p t b s s w w p w o p n v d
e x y w t l f c b g e c s s w w p w o p k n m
e b y w t l f c b n e c s s w w p w o p k s g
e x s w t a f c b g e c s s w w p w o p n n g
e x s y t a f c b g e c s s w w p w o p n s g
p x y n t p f c n p e e s s w w p w o p k v u
e b s y t a f c b n e c s s w w p w o p n n g
e x f g f n f c n n e e s s w w p w o p k y u
p x y w t p f c n k e e s s w w p w o p k s u
e x y y t l f c b n e r s y w w p w o p n y p
e f f n f n f c n k e e s s w w p w o p k v u
e b s w t l f c b k e c s s w w p w o p n n g
e x f w t l f w n w t b s s w w p w o p n v d
e x y w t l f c b w e c s s w w p w o p k s g
e b y y t l f c b n e c s s w w p w o p n s g
e x y y t l f c b n e r s y w w p w o p k s g
e f y y t a f c b p e r s y w w p w o p k s p
e f y y t a f c b w e r s y w w p w o p k y p
e x s w t l f c b w e c s s w w p w o p k s g
e x s w t a f c b w e c s s w w p w o p n n m
p x s w t p f c n n e e s s w w p w o p k v u
e f f w t a f w n p t b s s w w p w o p u v d
e x s w t l f c b w e c s s w w p w o p n s g
e x s w t l f w n p t b s s w w p w o p u v d
e x y w t a f c b n e c s s w w p w o p k s m
e f y y t l f c b w e r s y w w p w o p k y p
e x s n f n f w b p t e f s w w p w o e k s g
e f y y t a f c b w e r s y w w p w o p n y g
p x s n t p f c n p e e s s w w p w o p n s g
e s f n f n f c n g e e s s w w p w o p n v u
e b y y t l f c b w e c s s w w p w o p n s m
e b s w t l f c b n e c s s w w p w o p n s g
e b y w t l f c b k e c s s w w p w o p n n m
e f f n f n f c n k e e s s w w p w o p n v u
e x s w t a f c b w e c s s w w p w o p k n m
e b y w t l f c b g e c s s w w p w o p k n g
e b y w t a f c b g e c s s w w p w o p n n m
e f y n t l f c b w e r s y w w p w o p n s g
p x y w t p f c n k e e s s w w p w o p k v g
e x s w t a f c b n e c s s w w p w o p n n g
e x y w t a f c b n e c s s w w p w o p n s m
e f f w t l f w n w t b s s w w p w o p u v d
e f f g f n f c n g e e s s w w p w o p k v u
e f s g f n f w b p t e s s w w p w o e k a g
e x y y t a f c b g e c s s w w p w o p n s m
e b s w t a f c b k e c s s w w p w o p n s m
p f y n t p f c n w e e s s w w p w o p k s u
e x s w t l f c b n e c s s w w p w o p k n m
p f s n t p f c n w e e s s w w p w o p k v u
e x y w t l f c b k e c s s w w p w o p n s g
e x y y t a f c b k e c s s w w p w o p n n m
e f y y t a f c b n e r s y w w p w o p n s p
e x y n t a f c b w e r s y w w p w o p n s p
e f y n t a f c b w e r s y w w p w o p n y g
e x y w t a f c b w e c s s w w p w o p n n m
e f f n f n f w b h t e s s w w p w o e k s g
e x s y t l f c b k e c s s w w p w o p n n m
p x y w t p f c n w e e s s w w p w o p n s g
e b y y t a f c b n e c s s w w p w o p n n m
e s f n f n f c n p e e s s w w p w o p n v u
e x s y t l f c b k e c s s w w p w o p k s m
e b y w t l f c b n e c s s w w p w o p k n m
e f y n t a f c b n e r s y w w p w o p k s g
e b y y t a f c b g e c s s w w p w o p k s m
e b y w t a f c b g e c s s w w p w o p k s g
e x y n t l f c b n e r s y w w p w o p n y p
e f f g f n f c n p e e s s w w p w o p n y u
e x f g f n f c n g e e s s w w p w o p k y u
e b y y t l f c b n e c s s w w p w o p k s g
e x s y t l f c b n e c s s w w p w o p k s m
e b y w t l f c b k e c s s w w p w o p n s g
e x s w t a f c b g e c s s w w p w o p k s m
e b s y t a f c b g e c s s w w p w o p n n m
e x s y t a f c b k e c s s w w p w o p k s m
e x f g f n f w b p t e f f w w p w o e k s g
e f s y t a f w n p t b s s w w p w o p n v d
p x y w t p f c n p e e s s w w p w o p k s g
e x f w f n f w b k t e f s w w p w o e k a g
e b y w t a f c b w e c s s w w p w o p k s g
e x s y t l f w n w t b s s w w p w o p n v d
e b y w t a f c b n e c s s w w p w o p k s g
e x y w t l f c b g e c s s w w p w o p k s g
e x f n t n f c b p t b s s g p p w o p n y d
e b y w t l f c b k e c s s w w p w o p n s m
e x s y t l f c b g e c s s w w p w o p n n m
e f y n t a f c b p e r s y w w p w o p k y g
e x f w f n f w b n t e s s w w p w o e n s g
e x s y t l f c b w e c s s w w p w o p k s g
p x y w t p f c n n e e s s w w p w o p n v g
e x y y t l f c b n e c s s w w p w o p k s g
e x f w t a f w n p t b s s w w p w o p n v d
e b s y t a f c b g e c s s w w p w o p n n g
p x y w t p f c n k e e s s w w p w o p k v u
e x s y t a f w n w t b s s w w p w o p n v d
e x y w t a f c b k e c s s w w p w o p k n m
e x f w t l f w n n t b s s w w p w o p n v d
e f s w t l f w n w t b s s w w p w o p n v d
e x y y t l f c b k e c s s w w p w o p n s m
e f f y t a f w n w t b s s w w p w o p u v d
e x s w t a f c b w e c s s w w p w o p k s g
e b y y t a f c b n e c s s w w p w o p k s m
e x y n t l f c b p e r s y w w p w o p n s p
e b y w t l f c b k e c s s w w p w o p k n g
e x s y t a f c b n e c s s w w p w o p n s m
p x y n t p f c n n e e s s w w p w o p n v u
e b s y t l f c b w e c s s w w p w o p k n g
e b y w t a f c b g e c s s w w p w o p k n g
p x y n t p f c n w e e s s w w p w o p k v u
e b s w t l f c b n e c s s w w p w o p k s m
e b y y t l f c b w e c s s w w p w o p n n m
e b y y t a f c b n e c s s w w p w o p k s g
e x y w t l f c b g e c s s w w p w o p n n g
e x f n t n f c b p t b s s p w p w o p k y d
e x y n t a f c b w e r s y w w p w o p k y g
e b s y t l f c b n e c s s w w p w o p n s g
e x f g f n f c n g e e s s w w p w o p n v u
e x y n t l f c b n e r s y w w p w o p k y g
e x s w t a f c b k e c s s w w p w o p n s g
e b y w t a f c b n e c s s w w p w o p k n g
e x s y t a f c b g e c s s w w p w o p k s m
e f f y t a f w n p t b s s w w p w o p n v d
e b y y t l f c b g e c s s w w p w o p n s g
e x f n f n f w b n t e s f w w p w o e n a g
e x f y t l f w n n t b s s w w p w o p n v d
e x s y t a f c b n e c s s w w p w o p n n m
e f s g f n f w b n t e s f w w p w o e k s g
e x f n f n f c n g e e s s w w p w o p n y u
e f s g f n f w b h t e f f w w p w o e n a g
e f y n t a f c b p e r s y w w p w o p n s p
e b y w t a f c b k e c s s w w p w o p n s g
e b s y t a f c b k e c s s w w p w o p k s m
e x y y t l f c b w e r s y w w p w o p k s p
e s f g f n f c n g e e s s w w p w o p n v u
e f y n t a f c b p e r s y w w p w o p n y p
p x y n t p f c n k e e s s w w p w o p k s u
e x y y t a f c b p e r s y w w p w o p n y p
e x y y t a f c b w e c s s w w p w o p n s g
e x f n f n f w b h t e s f w w p w o e n s g
e x f n f n f w b p t e f f w w p w o e k a g
e f s n f n f w b k t e f s w w p w o e k a g
e f y n t a f c b n e r s y w w p w o p n s p
e x s y t a f c b w e c s s w w p w o p k n g
e f f n f n f c n p e e s s w w p w o p k v u
e x y y t a f c b w e r s y w w p w o p n s p
e x f y t a f w n p t b s s w w p w o p u v d
e b y y t a f c b g e c s s w w p w o p n n g
e x f g f n f w b n t e s s w w p w o e k s g
e x y y t a f c b w e r s y w w p w o p n s g
e f y n t l f c b n e r s y w w p w o p k y g
e x s y t a f c b w e c s s w w p w o p k s m
e s f n f n f c n n e e s s w w p w o p n y u
e x s w t a f c b k e c s s w w p w o p k s m
e b s w t l f c b n e c s s w w p w o p k n g
e s f n f n f c n k e e s s w w p w o p k y u
e b y y t l f c b n e c s s w w p w o p n n m
e x y y t l f c b k e c s s w w p w o p n n g
e b y y t a f c b k e c s s w w p w o p n s g
p x y n t p f c n p e e s s w w p w o p k s u
e x s w f n f w b k t e f s w w p w o e n s g
e x y w t a f c b w e c s s w w p w o p k s m
e b s w t a f c b g e c s s w w p w o p n n g
e x f n t n f c b n t b s s g w p w o p n y d
p x s n t p f c n w e e s s w w p w o p n s u
e x y n t l f c b w e r s y w w p w o p k y p
e x s n f n f w b n t e s s w w p w o e k a g
e b y y t l f c b n e c s s w w p w o p k n m
e x s y t a f c b w e c s s w w p w o p n n m
e b s y t a f c b n e c s s w w p w o p n s m
e x s y t l f c b g e c s s w w p w o p k n m
e b y w t a f c b n e c s s w w p w o p n n g
e x f g f n f c n k e e s s w w p w o p n v u
e f y n t a f c b w e r s y w w p w o p k s g
e x f g f n f w b h t e f f w w p w o e n a g
e x s w t l f c b k e c s s w w p w o p n n m
e x y y t l f c b w e c s s w w p w o p n s m
e x f g f n f c n n e e s s w w p w o p k v u
p x s n t p f c n p e e s s w w p w o p n s u
e b y w t l f c b w e c s s w w p w o p n n g
e b y w t a f c b n e c s s w w p w o p k n m
p x y n t p f c n p e e s s w w p w o p n v u
e b s y t l f c b w e c s s w w p w o p n s m
e x s w t l f c b k e c s s w w p w o p n s m
e x y y t l f c b n e c s s w w p w o p k s m
e b y y t l f c b n e c s s w w p w o p n n g
e x s w t a f c b g e c s s w w p w o p n s m
e x s w t a f c b g e c s s w w p w o p k s g
e x y n t l f c b w e r s y w w p w o p n y p
e x s w t a f c b w e c s s w w p w o p n s m
e x s n f n f w b n t e f f w w p w o e k s g
e x y n t l f c b n e r s y w w p w o p n s p
e x y y t l f c b n e r s y w w p w o p k y g
p x y n t p f c n k e e s s w w p w o p k v g
e b y y t a f c b w e c s s w w p w o p n n g
e x f w t l f w n p t b s s w w p w o p u v d
p x s n t p f c n w e e s s w w p w o p k s u
e x y y t l f c b k e c s s w w p w o p k n g
e f f g f n f w b h t e f f w w p w o e n a g
e x y w t l f c b n e c s s w w p w o p k s g
e b y w t a f c b w e c s s w w p w o p n s m
p x y n t p f c n p e e s s w w p w o p n s u
e f y n t l f c b w e r s y w w p w o p k s g
e x s w t l f c b n e c s s w w p w o p n n m
e x s w t a f w n n t b s s w w p w o p n v d
e f y n t a f c b n e r s y w w p w o p n y p
e b s w t a f c b k e c s s w w p w o p n n m
e x s w f n f w b p t e f f w w p w o e k a g
e f f g f n f c n n e e s s w w p w o p n v u
e x y w t a f c b g e c s s w w p w o p n s g
e x y y t l f c b n e r s y w w p w o p n y g
e b s y t l f c b g e c s s w w p w o p n s g
e b s y t l f c b w e c s s w w p w o p k s g
e f s w t l f w n n t b s s w w p w o p n v d
e b s w t a f c b w e c s s w w p w o p k n m
e x f y t a f w n n t b s s w w p w o p u v d
e x y w t l f c b g e c s s w w p w o p n n m
e b y y t l f c b k e c s s w w p w o p n n m
e x y n t a f c b p e r s y w w p w o p n y p
e x y w t l f c b n e c s s w w p w o p k s m
e x y w t l f c b g e c s s w w p w o p k s m
e x y y t a f c b p e r s y w w p w o p k y p
e x y w t l f c b n e c s s w w p w o p k n m
e x s y t l f c b w e c s s w w p w o p k n m
e x y n t l f c b p e r s y w w p w o p k s g
e x s w t l f c b n e c s s w w p w o p n n g
e f y n t l f c b n e r s y w w p w o p k s g
e b s w t l f c b w e c s s w w p w o p k s m
e x f n t n f c b n t b s s g p p w o p n y d
e x y w t l f c b k e c s s w w p w o p n s m
e b s w t l f c b g e c s s w w p w o p k n g
e b s y t l f c b g e c s s w w p w o p k n m
e x f y t l f w n p t b s s w w p w o p n v d
e b s w t l f c b g e c s s w w p w o p n n m
e x s g f n f w b h t e s f w w p w o e k a g
e b s w t a f c b k e c s s w w p w o p n n g
e b s y t l f c b w e c s s w w p w o p k s m
e f y y t a f c b n e r s y w w p w o p k y p
e f y n t a f c b w e r s y w w p w o p n y p
e x s w t a f c b k e c s s w w p w o p n s m
e f y y t l f c b n e r s y w w p w o p n s g
e b s w t a f c b g e c s s w w p w o p k s g
e x y y t a f c b w e c s s w w p w o p n n m
e x f n f n f c n k e e s s w w p w o p k v u
e f y n t a f c b p e r s y w w p w o p n y g
e x s w t a f c b g e c s s w w p w o p k n m
e x s y t l f c b g e c s s w w p w o p k n g
e b y y t l f c b g e c s s w w p w o p k n g
e f y y t a f c b w e r s y w w p w o p n s p
e x y y t l f c b g e c s s w w p w o p k n g
e x y w t a f c b w e c s s w w p w o p k n m
e f f w t a f w n n t b s s w w p w o p u v d
e s f g f n f c n k e e s s w w p w o p k y u
e f s y t l f w n n t b s s w w p w o p n v d
e f f g f n f c n k e e s s w w p w o p k y u
e x f n f n f c n p e e s s w w p w o p n y u
e b y w t a f c b w e c s s w w p w o p n n g
e x s w t a f w n p t b s s w w p w o p n v d
e b y y t l f c b n e c s s w w p w o p k s m
e b y y t a f c b k e c s s w w p w o p k s m
e b y y t l f c b g e c s s w w p w o p k s g
e f s w t l f w n p t b s s w w p w o p u v d
e f f g f n f c n g e e s s w w p w o p k y u
e f s w t a f w n w t b s s w w p w o p n v d
e f y y t a f c b p e r s y w w p w o p k y g
e x y w t l f c b w e c s s w w p w o p n s m
e s f n f n f c n p e e s s w w p w o p n y u
e x y n t l f c b n e r s y w w p w o p k s p
e x y n t a f c b n e r s y w w p w o p k s p
e f f y t a f w n n t b s s w w p w o p u v d
p f s n t p f c n p e e s s w w p w o p n s g
p x s n t p f c n k e e s s w w p w o p n v u
e f y n t l f c b w e r s y w w p w o p n s p
e x y y t a f c b w e r s y w w p w o p k s g
e b s y t a f c b k e c s s w w p w o p n n m
e f f w f n f w b h t e f s w w p w o e n a g
e f y n t a f c b p e r s y w w p w o p n s g
e x y w t a f c b n e c s s w w p w o p n s g
e b s y t l f c b w e c s s w w p w o p k n m
e x s y t a f c b w e c s s w w p w o p n n g
e b y y t a f c b g e c s s w w p w o p k n g
e x s w t l f c b g e c s s w w p w o p k n m
e b s y t a f c b w e c s s w w p w o p k s m
e b y y t a f c b w e c s s w w p w o p n s g
p x s n t p f c n n e e s s w w p w o p n v g
e x s g f n f w b p t e s s w w p w o e n s g
e f s n f n f w b p t e f s w w p w o e n a g
e x f w t a f w n n t b s s w w p w o p n v d
e f f g f n f c n p e e s s w w p w o p n v u
e f f n f n f c n k e e s s w w p w o p n y u
e b y y t a f c b k e c s s w w p w o p k s g
e x f w t a f w n n t b s s w w p w o p u v d
e s f g f n f c n n e e s s w w p w o p n v u
e x f n f n f c n k e e s s w w p w o p n v u
e x f w f n f w b h t e f s w w p w o e n a g
e x y n t l f c b n e r s y w w p w o p k y p
e b s y t a f c b k e c s s w w p w o p k n g
e x y w t l f c b k e c s s w w p w o p k s g
e x y y t a f c b k e c s s w w p w o p n n g
e x s w t l f c b w e c s s w w p w o p k n m
e b s w t l f c b n e c s s w w p w o p k s g
e f y y t a f c b p e r s y w w p w o p n y p
p x s n t p f c n n e e s s w w p w o p n s u
e f f g f n f c n n e e s s w w p w o p k y u
e b s y t l f c b w e c s s w w p w o p n s g
e f y n t a f c b w e r s y w w p w o p n s p
e x s y t a f c b k e c s s w w p w o p n s m
e x f n f n f c n k e e s s w w p w o p k y u
e f y y t a f c b w e r s y w w p w o p n y p
e x y n t a f c b n e r s y w w p w o p n y p
e x s w t a f c b n e c s s w w p w o p k s g
p x y w t p f c n n e e s s w w p w o p k s g
e f y n t l f c b w e r s y w w p w o p k y p
p f s n t p f c n k e e s s w w p w o p k s g
e x f w f n f w b k t e f s w w p w o e k s g
e b y w t a f c b k e c s s w w p w o p k s m
e s f n f n f c n p e e s s w w p w o p k v u
e x y y t l f c b g e c s s w w p w o p n s m
e f s w t a f w n p t b s s w w p w o p u v d
e x s y t a f c b n e c s s w w p w o p k s m
e f f g f n f w b p t e s f w w p w o e k s g
p x y w t p f c n n e e s s w w p w o p n s g
e x y y t a f c b p e r s y w w p w o p k s g
e f f n f n f w b n t e s s w w p w o e k a g
e x s w t a f c b w e c s s w w p w o p k s m
e b y y t l f c b g e c s s w w p w o p n s m
e f f g f n f c n g e e s s w w p w o p n v u
e x f y t l f w n p t b s s w w p w o p u v d
e x s y t a f c b g e c s s w w p w o p n n m
e f f n f n f c n n e e s s w w p w o p n y u
e x f g f n f w b h t e f s w w p w o e n s g
e x y y t l f c b p e r s y w w p w o p k s p
e f s w t l f w n p t b s s w w p w o p n v d
e x s y t l f c b k e c s s w w p w o p k s g
e f y y t a f c b p e r s y w w p w o p k y p
p x y n t p f c n k e e s s w w p w o p k v u
e x f w t l f w n w t b s s w w p w o p u v d
e x s w t l f c b w e c s s w w p w o p k n g
e x y y t a f c b g e c s s w w p w o p n n m
e b s y t a f c b g e c s s w w p w o p n s g
e x s y t a f w n w t b s s w w p w o p u v d
e x s y t a f w n n t b s s w w p w o p u v d
e x f w f n f w b n t e f s w w p w o e k s g
e b s w t a f c b w e c s s w w p w o p n s m
p x y w t p f c n w e e s s w w p w o p k v g
e x y y t a f c b g e c s s w w p w o p n n g
e f s g f n f w b k t e s f w w p w o e k s g
p x s n t p f c n n e e s s w w p w o p n s g
e b y w t a f c b w e c s s w w p w o p k n m
e x s w t a f c b k e c s s w w p w o p k n m
e f f y t l f w n n t b s s w w p w o p n v d
e x s y t a f c b k e c s s w w p w o p k n g
e b y w t l f c b w e c s s w w p w o p k n g
e x y n t a f c b w e r s y w w p w o p n y g
e x s y t l f c b w e c s s w w p w o p n s g
e x f n f n f c n p e e s s w w p w o p n v u
e x s y t l f c b n e c s s w w p w o p n s g
e x y y t l f c b w e r s y w w p w o p n y g
p x s n t p f c n w e e s s w w p w o p n s g
e x s w t a f c b g e c s s w w p w o p n n m
e x s y t a f c b w e c s s w w p w o p n s m
e x y w t l f c b k e c s s w w p w o p k n g
e x y y t a f c b n e r s y w w p w o p n y g
e f y y t a f c b w e r s y w w p w o p k s g
e x y w t a f c b k e c s s w w p w o p n s m
e s f g f n f c n g e e s s w w p w o p k v u
e f f n f n f c n n e e s s w w p w o p k y u
e s f g f n f c n n e e s s w w p w o p k v u
e x y w t a f c b n e c s s w w p w o p k n m
p x s w t p f c n k e e s s w w p w o p k s u
e x s g f n f w b h t e s f w w p w o e k s g
e x y y t l f c b g e c s s w w p w o p k s g
p x y w t p f c n k e e s s w w p w o p n s g
e x s y t a f c b g e c s s w w p w o p k n m
p x s n t p f c n p e e s s w w p w o p k v u
e b s y t l f c b n e c s s w w p w o p k n m
e b y w t l f c b g e c s s w w p w o p k s m
p x y w t p f c n p e e s s w w p w o p n s g
p x s n t p f c n n e e s s w w p w o p k s u
e x s g f n f w b n t e s f w w p w o e n a g
e x y n t l f c b w e r s y w w p w o p n y g
e f f w f n f w b k t e f f w w p w o e n a g
e f y n t l f c b n e r s y w w p w o p n s p
e b y w t l f c b w e c s s w w p w o p k s m
e b s w t a f c b n e c s s w w p w o p k n g
e x s g f n f w b n t e s f w w p w o e n s g
e f y n t a f c b n e r s y w w p w o p k y g
e x s w t l f c b w e c s s w w p w o p k s m
e f f g f n f c n n e e s s w w p w o p k v u
e f s n f n f w b h t e f s w w p w o e n a g
e x f g f n f w b n t e s f w w p w o e n s g
e f s w f n f w b p t e s f w w p w o e k a g
p x s w t p f c n k e e s s w w p w o p n s u
e x y y t l f c b p e r s y w w p w o p k s g
e b s w t l f c b g e c s s w w p w o p k s g
e f f g f n f w b p t e s s w w p w o e k a g
e b y w t l f c b g e c s s w w p w o p k s g
e b s y t a f c b n e c s s w w
The text below is copied unchanged from the following website so that you can read it here:
https://www.edureka.co/blog/decision-tree-algorithm/
A Complete Guide On Decision Tree Algorithm
With Machine Learning algorithms increasingly used to solve industry-level problems, the demand for more complex and iterative algorithms has grown. The Decision Tree Algorithm is one such algorithm, and it can be used to solve both Regression and Classification problems.
In this blog on Decision Tree Algorithm, you will learn the working of Decision Tree and how it can be implemented to solve real-world problems. The following topics will be covered in this blog:
1. Why Decision Tree?
2. What Is A Decision Tree?
3. How Does The Decision Tree Algorithm Work?
4. Building A Decision Tree
5. Practical Implementation Of Decision Tree Algorithm Using R
Before I get started with why use Decision Tree, here’s a list of Machine Learning blogs that you should go through to understand the basics:
· Machine Learning Algorithms
· Introduction To Classification Algorithms
· Random Forest Classifier
We’re all aware that there are any number of Machine Learning algorithms that can be used for analysis, so why should you choose Decision Tree? In the section below I’ve listed a few reasons.
Why Decision Tree Algorithm?
Decision Tree is considered to be one of the most useful Machine Learning algorithms since it can be used to solve a variety of problems. Here are a few reasons why you should use Decision Tree:
1. It is considered to be the most understandable Machine Learning algorithm and it can be easily interpreted.
2. It can be used for classification and regression problems.
3. Unlike most Machine Learning algorithms, it works effectively with non-linear data.
4. Constructing a Decision Tree is a very quick process since it uses only one feature per node to split the data.
What Is A Decision Tree Algorithm?
A Decision Tree is a Supervised Machine Learning algorithm which looks like an inverted tree, wherein each node represents a predictor variable (feature), the link between the nodes represents a Decision and each leaf node represents an outcome (response variable).
To get a better understanding of a Decision Tree, let’s look at an example:
Let’s say that you hosted a huge party and you want to know how many of your guests were non-vegetarians. To solve this problem, let’s create a simple Decision Tree.
Decision Tree Example – Decision Tree Algorithm – Edureka
In the above illustration, I’ve created a Decision tree that classifies a guest as either vegetarian or non-vegetarian. Each node represents a predictor variable that will help to conclude whether or not a guest is a non-vegetarian. As you traverse down the tree, you must make decisions at each node, until you reach a dead end.
Now that you know the logic of a Decision Tree, let’s define a set of terms related to a Decision Tree.
Structure Of A Decision Tree
Decision Tree Structure – Decision Tree Algorithm – Edureka
A Decision Tree has the following structure:
· Root Node: The root node is the starting point of a tree. At this point, the first split is performed.
· Internal Nodes: Each internal node represents a decision point (predictor variable) that eventually leads to the prediction of the outcome.
· Leaf/ Terminal Nodes: Leaf nodes represent the final class of the outcome and therefore they’re also called terminating nodes.
· Branches: Branches are connections between nodes; they’re represented as arrows. Each branch represents a response such as yes or no.
So that is the basic structure of a Decision Tree. Now let’s try to understand the workflow of a Decision Tree.
How Does The Decision Tree Algorithm Work?
The Decision Tree Algorithm follows the below steps:
Step 1: Select the feature (predictor variable) that best classifies the data set into the desired classes and assign that feature to the root node.
Step 2: Traverse down from the root node, whilst making relevant decisions at each internal node such that each internal node best classifies the data.
Step 3: Route back to step 1 and repeat until you assign a class to the input data.
The above-mentioned steps represent the general workflow of a Decision Tree used for classification purposes.
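The traversal in the steps above can be sketched in R. This is only an illustration: the tree is a hypothetical two-level example, and the feature names `eats_meat` and `eats_fish` are assumptions made for the sketch, not taken from the figure.

```r
# A hypothetical decision tree stored as nested lists.
# Internal nodes hold a feature name and one branch per answer;
# leaf nodes hold only a class label.
tree <- list(
  feature = "eats_meat",
  branches = list(
    yes = list(label = "non-vegetarian"),
    no  = list(
      feature = "eats_fish",
      branches = list(
        yes = list(label = "non-vegetarian"),
        no  = list(label = "vegetarian")
      )
    )
  )
)

# Walk from the root to a leaf, following the branch that matches
# the guest's answer at each internal node.
classify <- function(node, guest) {
  while (is.null(node$label)) {
    answer <- guest[[node$feature]]
    node <- node$branches[[answer]]
  }
  node$label
}

guest <- list(eats_meat = "no", eats_fish = "no")
classify(tree, guest)
```

Each pass through the loop corresponds to one decision at an internal node; the loop ends when a leaf (a node with a label) is reached.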
Now let’s try to understand how a Decision Tree is created.
Build A Decision Tree Using ID3 Algorithm
There are many ways to build a Decision Tree, in this blog we’ll be focusing on how the ID3 algorithm is used to create a Decision Tree.
What Is The ID3 Algorithm?
ID3 or the Iterative Dichotomiser 3 algorithm is one of the most effective algorithms used to build a Decision Tree. It uses the concept of Entropy and Information Gain to generate a Decision Tree for a given set of data.
ID3 Algorithm:
The ID3 algorithm follows the below workflow in order to build a Decision Tree:
1. Select Best Attribute (A)
2. Assign A as a decision variable for the root node.
3. For each value of A, build a descendant of the node.
4. Assign classification labels to the leaf node.
5. If data is correctly classified: Stop.
6. Else: Iterate over the tree.
The first step in this algorithm states that we must select the best attribute. What does that mean?
The best attribute (predictor variable) is the one that separates the data set into different classes most effectively; in other words, it is the feature that best splits the data set.
Now the next question in your head must be, “How do I decide which variable/ feature best splits the data?”
Two measures are used to decide the best attribute:
1. Information Gain
2. Entropy
What Is Entropy?
Entropy measures the impurity or uncertainty present in the data. It is used to decide how a Decision Tree can split the data.
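Equation For Entropy:
$$\mathrm{Entropy}(S) = -\sum_{i=1}^{c} p_i \log_2 p_i$$
where p_i is the fraction of samples in node S that belong to class i, and c is the number of classes.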
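A minimal R sketch of the entropy measure, assuming the usual Shannon definition with a base-2 logarithm (the same form used in the parent-node calculation further down). The function name `entropy` is chosen here for illustration.

```r
# Shannon entropy (base 2) of a vector of class labels.
entropy <- function(labels) {
  p <- table(labels) / length(labels)  # class proportions
  p <- p[p > 0]                        # treat 0 * log2(0) as 0
  -sum(p * log2(p))
}

entropy(c("slow", "slow", "fast", "fast"))  # evenly mixed node
entropy(c("fast", "fast", "fast"))          # pure node
```

An evenly mixed two-class node gives the maximum value of 1, and a node with a single class gives 0.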
What Is Information Gain?
Information Gain (IG) is the most significant measure used to build a Decision Tree. It indicates how much “information” a particular feature/ variable gives us about the final outcome.
Information Gain is important because it is used to choose the variable that best splits the data at each node of a Decision Tree. The variable with the highest IG is used to split the data at the root node.
Equation For Information Gain (IG):
$$\mathrm{IG} = \mathrm{Entropy}(\text{parent}) - [\text{weighted avg}]\,\mathrm{Entropy}(\text{children})$$
To better understand how Information Gain and Entropy are used to create a Decision Tree, let’s look at an example. The below data set represents the speed of a car based on certain parameters.
Speed Data Set – Decision Tree Algorithm – Edureka
Your problem statement is to study this data set and create a Decision Tree that classifies the speed of a car (response variable) as either slow or fast, depending on the following predictor variables:
· Road type
· Obstruction
· Speed limit
We’ll be building a Decision Tree using these variables in order to predict the speed of a car. Like I mentioned earlier we must first begin by deciding a variable that best splits the data set and assign that particular variable to the root node and repeat the same thing for the other nodes as well.
At this point, you might be wondering how do you know which variable best separates the data? The answer is, the variable with the highest Information Gain best divides the data into the desired output classes.
So, let’s begin by calculating the Entropy and Information Gain (IG) for each of the predictor variables, starting with ‘Road type’.
In our data set, there are four observations in the ‘Road type’ column that correspond to four labels in the ‘Speed of car’ column. We shall begin by calculating the entropy of the parent node (Speed of car).
Step one is to find out the fraction of the two classes present in the parent node. We know that there are a total of four values present in the parent node, out of which two samples belong to the ‘slow’ class and the other 2 belong to the ‘fast’ class, therefore:
· P(slow) -> fraction of ‘slow’ outcomes in the parent node
· P(fast) -> fraction of ‘fast’ outcomes in the parent node
The formula to calculate P(slow) is:
p(slow) = no. of ‘slow’ outcomes in the parent node / total number of outcomes
Similarly, the formula to calculate P(fast) is:
p(fast) = no. of ‘fast’ outcomes in the parent node / total number of outcomes
Therefore, the entropy of the parent node is:
Entropy(parent) = – {0.5 log2(0.5) + 0.5 log2(0.5)} = – {-0.5 + (-0.5)} = 1
Now that we know that the entropy of the parent node is 1, let’s see how to calculate the Information Gain for the ‘Road type’ variable. Remember that, if the Information gain of the ‘Road type’ variable is greater than the Information Gain of all the other predictor variables, only then the root node can be split by using the ‘Road type’ variable.
In order to calculate the Information Gain of ‘Road type’ variable, we first need to split the root node by the ‘Road type’ variable.
Decision Tree (Road type) – Decision Tree Algorithm – Edureka
In the above illustration, we’ve split the parent node by using the ‘Road type’ variable, the child nodes denote the corresponding responses as shown in the data set. Now, we need to measure the entropy of the child nodes.
The entropy of the right-hand side child node (fast) is 0 because all of the outcomes in this node belong to one class (fast). In a similar manner, we must find the Entropy of the left-hand side node (slow, slow, fast).
In this node there are two types of outcomes (fast and slow), therefore, we first need to calculate the fraction of slow and fast outcomes for this particular node.
P(slow) = 2/3 = 0.667
P(fast) = 1/3 = 0.333
Therefore, entropy is:
Entropy(left child node) = – {0.667 log2(0.667) + 0.333 log2(0.333)} = – {-0.39 + (-0.53)}
= 0.92, which is rounded to 0.9 in the steps below
Our next step is to calculate the Entropy(children) with weighted average:
· Total number of outcomes in parent node: 4
· Total number of outcomes in left child node: 3
· Total number of outcomes in right child node: 1
Formula for Entropy(children) with weighted avg. :
[Weighted avg]Entropy(children) = (no. of outcomes in left child node) / (total no. of outcomes in parent node) * (entropy of left node) + (no. of outcomes in right child node)/ (total no. of outcomes in parent node) * (entropy of right node)
Using the above formula, the weighted average Entropy(children) is (3/4 * 0.9) + (1/4 * 0) = 0.675
Our final step is to substitute the above weighted average in the IG formula in order to calculate the final IG of the ‘Road type’ variable:
Therefore,
Information gain(Road type) = 1 – 0.675 = 0.325
Information gain of Road type feature is 0.325.
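The arithmetic for ‘Road type’ can be checked with a short R sketch that works directly from the class counts quoted above. The helper name `entropy_counts` is an assumption for illustration. Because the code does not round the left-child entropy to 0.9, its results (about 0.918, 0.689 and 0.311) differ slightly from the hand-rounded 0.9, 0.675 and 0.325.

```r
# Entropy computed from a vector of class counts.
entropy_counts <- function(counts) {
  p <- counts[counts > 0] / sum(counts)
  -sum(p * log2(p))
}

parent <- c(slow = 2, fast = 2)
left   <- c(slow = 2, fast = 1)   # left child: slow, slow, fast
right  <- c(slow = 0, fast = 1)   # right child: fast

# Weighted average entropy of the two children
children <- sum(left)  / sum(parent) * entropy_counts(left) +
            sum(right) / sum(parent) * entropy_counts(right)

gain <- entropy_counts(parent) - children

round(c(left_entropy = entropy_counts(left),
        children     = children,
        gain         = gain), 3)
```

The ranking of the variables is unaffected by the rounding: ‘Road type’ still falls between a gain of 0 and a gain of 1.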
Like I mentioned earlier, the Decision Tree Algorithm selects the variable with the highest Information Gain to split the Decision Tree. Therefore, by using the above method you need to calculate the Information Gain for all the predictor variables to check which variable has the highest IG.
So by using the above methodology, you must get the following values for each predictor variable:
1. Information gain(Road type) = 1 – 0.675 = 0.325
2. Information gain(Obstruction) = 1 – 1 = 0
3. Information gain(Speed limit) = 1 – 0 = 1
So, here we can see that the ‘Speed limit’ variable has the highest Information Gain. Therefore, the final Decision Tree for this dataset is built using the ‘Speed limit’ variable.
Decision Tree (Speed limit) – Decision Tree Algorithm – Edureka
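The choice of root variable can be reproduced with a small R sketch. The original speed table is an image that is not reproduced here, so the four rows below are a hypothetical reconstruction chosen only to match the counts quoted in the text (two slow, two fast; ‘Road type’ splitting into slow, slow, fast and fast). The column values and the helper names `entropy` and `info_gain` are assumptions.

```r
# Hypothetical four-row table consistent with the counts in the text.
speed <- data.frame(
  road_type   = c("steep", "steep", "flat", "steep"),
  obstruction = c("yes", "no", "yes", "no"),
  speed_limit = c("yes", "yes", "no", "no"),
  speed       = c("slow", "slow", "fast", "fast"),
  stringsAsFactors = FALSE
)

entropy <- function(labels) {
  p <- table(labels) / length(labels)
  p <- p[p > 0]
  -sum(p * log2(p))
}

# Information gain from splitting `target` by the values of `feature`.
info_gain <- function(feature, target) {
  groups  <- split(target, feature)
  weights <- lengths(groups) / length(target)
  entropy(target) - sum(weights * vapply(groups, entropy, numeric(1)))
}

predictors <- setdiff(names(speed), "speed")
gains <- vapply(speed[predictors], info_gain, numeric(1),
                target = speed$speed)

round(gains, 3)
names(which.max(gains))   # variable chosen for the root node
```

With this table, `speed_limit` has the largest gain, so it is selected for the root, in line with the conclusion above.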
Now that you know how a Decision Tree is created, let’s run a short demo that solves a real-world problem by implementing Decision Trees.
Implementation Of Decision Tree In R – Decision Tree Algorithm Example
Problem Statement: To study a Mushroom data set in order to predict whether a given mushroom is edible or poisonous to human beings.
Data Set Description: The given data set contains a total of 8124 observations of different kinds of mushrooms and their properties such as odor, habitat, population, etc. A more in-depth structure of the data set is shown in the demo below.
Logic: To build a Decision Tree model in order to classify mushroom samples as either poisonous or edible by studying their properties such as odor, root, habitat, etc.
Now that you know the objective of this demo, let’s get our brains working and start coding. For this demo, I’ll be using the R language in order to build the model.
Now, let’s begin.
Step 1: Install and load libraries
#Installing libraries
install.packages("rpart")
install.packages("caret")
install.packages("rpart.plot")
install.packages("rattle")
#Loading libraries
library(rpart,quietly = TRUE)
library(caret,quietly = TRUE)
library(rpart.plot,quietly = TRUE)
library(rattle)
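Running `install.packages()` every time reinstalls packages that are already present. A possible alternative, sketched below, is to check which packages are missing first. The helper name `missing_packages` is an assumption, and the install line is left commented out so that the sketch has no side effects.

```r
# Return the packages from `pkgs` that are not yet installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

needed <- c("rpart", "caret", "rpart.plot", "rattle")
to_install <- missing_packages(needed)
to_install

# Uncomment to install only what is missing:
# if (length(to_install) > 0) install.packages(to_install)
```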
Step 2: Import the data set
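#Reading the data set as a dataframe
# stringsAsFactors = TRUE keeps the columns as factors (the default changed to FALSE in R 4.0)
mushrooms <- read.csv("/Users/zulaikha/Desktop/decision_tree/mushrooms.csv", stringsAsFactors = TRUE)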
Now, to display the structure of the data set, you can make use of the R function called str():
# structure of the data
> str(mushrooms)
'data.frame': 8124 obs. of 22 variables:
 $ class                   : Factor w/ 2 levels "e","p": 2 1 1 2 1 1 1 1 2 1 ...
 $ cap.shape               : Factor w/ 6 levels "b","c","f","k",..: 6 6 1 6 6 6 1 1 6 1 ...
 $ cap.surface             : Factor w/ 4 levels "f","g","s","y": 3 3 3 4 3 4 3 4 4 3 ...
 $ cap.color               : Factor w/ 10 levels "b","c","e","g",..: 5 10 9 9 4 10 9 9 9 10 ...
 $ bruises                 : Factor w/ 2 levels "f","t": 2 2 2 2 1 2 2 2 2 2 ...
 $ odor                    : Factor w/ 9 levels "a","c","f","l",..: 7 1 4 7 6 1 1 4 7 1 ...
 $ gill.attachment         : Factor w/ 2 levels "a","f": 2 2 2 2 2 2 2 2 2 2 ...
 $ gill.spacing            : Factor w/ 2 levels "c","w": 1 1 1 1 2 1 1 1 1 1 ...
 $ gill.size               : Factor w/ 2 levels "b","n": 2 1 1 2 1 1 1 1 2 1 ...
 $ gill.color              : Factor w/ 12 levels "b","e","g","h",..: 5 5 6 6 5 6 3 6 8 3 ...
 $ stalk.shape             : Factor w/ 2 levels "e","t": 1 1 1 1 2 1 1 1 1 1 ...
 $ stalk.root              : Factor w/ 5 levels "?","b","c","e",..: 4 3 3 4 4 3 3 3 4 3 ...
 $ stalk.surface.above.ring: Factor w/ 4 levels "f","k","s","y": 3 3 3 3 3 3 3 3 3 3 ...
 $ stalk.surface.below.ring: Factor w/ 4 levels "f","k","s","y": 3 3 3 3 3 3 3 3 3 3 ...
 $ stalk.color.above.ring  : Factor w/ 9 levels "b","c","e","g",..: 8 8 8 8 8 8 8 8 8 8 ...
 $ stalk.color.below.ring  : Factor w/ 9 levels "b","c","e","g",..: 8 8 8 8 8 8 8 8 8 8 ...
 $ veil.color              : Factor w/ 4 levels "n","o","w","y": 3 3 3 3 3 3 3 3 3 3 ...
 $ ring.number             : Factor w/ 3 levels "n","o","t": 2 2 2 2 2 2 2 2 2 2 ...
 $ ring.type               : Factor w/ 5 levels "e","f","l","n",..: 5 5 5 5 1 5 5 5 5 5 ...
 $ spore.print.color       : Factor w/ 9 levels "b","h","k","n",..: 3 4 4 3 4 3 3 4 3 3 ...
 $ population              : Factor w/ 6 levels "a","c","n","s",..: 4 3 3 4 1 3 3 4 5 4 ...
 $ habitat                 : Factor w/ 7 levels "d","g","l","m",..: 6 2 4 6 2 2 4 4 2 4 ...
The output shows a number of predictor variables that are used to predict the output class of a mushroom (poisonous or edible).
Step 3: Data Cleaning
At this stage, we must look for any null or missing values and unnecessary variables so that our prediction is as accurate as possible. In the below code snippet I have deleted the ‘veil.type’ variable since it has no effect on the outcome. Such inconsistencies and redundant data must be fixed in this step.
# number of rows with missing values
nrow(mushrooms) - sum(complete.cases(mushrooms))
# deleting redundant variable `veil.type`
mushrooms$veil.type <- NULL
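One way to spot a redundant variable is to count its distinct values: a column with a single value cannot help separate the classes. The sketch below uses a hypothetical five-row data frame, not the real data. In it, `veil.type` is assumed to be constant, and `stalk.root` uses `?` for unknown values, mirroring the `?` level visible in the `str()` output.

```r
# Hypothetical toy data, for illustration only.
toy <- data.frame(
  class      = c("e", "p", "e", "p", "e"),
  odor       = c("a", "f", "n", "f", "l"),
  veil.type  = c("p", "p", "p", "p", "p"),   # constant column
  stalk.root = c("b", "?", "c", "?", "e"),   # "?" marks an unknown value
  stringsAsFactors = TRUE
)

# Rows containing NA (complete.cases() only detects real NA values)
sum(!complete.cases(toy))

# Values coded as "?" are not NA, so count them separately
colSums(toy == "?")

# Columns with a single distinct value carry no information
n_values <- vapply(toy, function(col) length(unique(col)), integer(1))
constant_cols <- names(n_values)[n_values == 1]
constant_cols

# Drop the constant columns
toy[constant_cols] <- NULL
names(toy)
```

Note that a missing-value count of zero does not rule out placeholder codes such as `?`, which is why they are counted separately here.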
Step 4: Data Exploration and Analysis
To get a good understanding of the 21 predictor variables, I’ve created a table for each predictor variable vs class type (response/ outcome variable) in order to understand whether that particular predictor variable is significant for detecting the output or not.
I’ve shown the table only for the ‘odor’ variable; you can go ahead and create a table for each of the variables by following the below code snippet:
# analyzing the odor variable
> table(mushrooms$class,mushrooms$odor)
      a    c    f    l    m    n    p    s    y
  e 400    0    0  400    0 3408    0    0    0
  p   0  192 2160    0   36  120  256  576  576
In the above snippet, ‘e’ stands for edible class and ‘p’ stands for the poisonous class of mushrooms.
The above output shows that the mushrooms with odor values ‘c’, ‘f’, ‘m’, ‘p’, ‘s’ and ‘y’ are all poisonous. The 400 mushrooms with an almond (a) odor are all edible, as are the 400 with odor value ‘l’. Such observations will help us to predict the output class more accurately.
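The same reading of the table can be done numerically. The sketch below re-enters the counts printed above as a matrix (so it runs without the CSV file) and computes the share of poisonous mushrooms for each odor value. The object names are chosen for illustration.

```r
# Counts copied from the printed class-by-odor table.
odor <- matrix(
  c(400,   0,    0, 400,  0, 3408,   0,   0,   0,
      0, 192, 2160,   0, 36,  120, 256, 576, 576),
  nrow = 2, byrow = TRUE,
  dimnames = list(class = c("e", "p"),
                  odor  = c("a", "c", "f", "l", "m", "n", "p", "s", "y"))
)

# Share of poisonous mushrooms within each odor value
share_poisonous <- prop.table(odor, margin = 2)["p", ]
round(share_poisonous, 3)

# Odor values seen only in poisonous / only in edible mushrooms
names(share_poisonous)[share_poisonous == 1]
names(share_poisonous)[share_poisonous == 0]
```

Only odor value ‘n’ is mixed: 120 of its 3528 mushrooms (about 3.4%) are poisonous, so odor alone does not fully separate the classes.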
Our next step in the data exploration stage is to predict which variable would be the best one for splitting the Decision Tree. For this reason, I’ve plotted a graph that represents the split for each of the 21 variables, the output is shown below:
number.perfect.splits <- apply(X=mushrooms[-1], MARGIN = 2, FUN = function(col){
  t <- table(mushrooms$class,col)
  sum(t == 0)
})
# Descending order of perfect splits
order <- order(number.perfect.splits,decreasing = TRUE)
number.perfect.splits <- number.perfect.splits[order]
# Plot graph
par(mar=c(10,2,2,2))
barplot(number.perfect.splits,
        main="Number of perfect splits vs feature",
        xlab="",ylab="Feature",las=2,col="wheat")
rpart.plot – Decision Tree Algorithm – Edureka
The output shows that the ‘odor’ variable plays a significant role in predicting the output class of the mushroom.
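To see what the `sum(t == 0)` count in the snippet above measures, here is the same computation on a hypothetical six-row data frame. A zero cell in the class-by-value table means that value occurs in only one class, so every zero marks a value that identifies the class on its own.

```r
# Hypothetical toy data, for illustration only.
toy <- data.frame(
  class   = c("e", "e", "p", "p", "e", "p"),
  odor    = c("a", "a", "f", "f", "n", "n"),
  habitat = c("g", "d", "g", "d", "g", "d")
)

perfect <- apply(X = toy[-1], MARGIN = 2, FUN = function(col) {
  t <- table(toy$class, col)
  sum(t == 0)   # cells where a class never occurs with that value
})
perfect
```

Here `odor` scores 2 (values `a` and `f` each occur in a single class) and `habitat` scores 0 (both of its values occur in both classes).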
Step 5: Data Splicing
Data Splicing is the process of splitting the data into a training set and a testing set. The training set is used to build the Decision Tree model and the testing set is used to validate the efficiency of the model. The splitting is performed in the below code snippet:
#data splicing
set.seed(12345)
train <- sample(1:nrow(mushrooms),size = ceiling(0.80*nrow(mushrooms)),replace = FALSE)
# training set
mushrooms_train <- mushrooms[train,]
# test set
mushrooms_test <- mushrooms[-train,]
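As a quick sanity check on the split, the sketch below repeats the sampling step on row indices only, using the 8124 rows reported for the data set, and confirms the sizes of the two parts. It does not need the CSV file.

```r
n <- 8124                       # number of rows reported for mushrooms.csv
set.seed(12345)
train <- sample(1:n, size = ceiling(0.80 * n), replace = FALSE)
test  <- setdiff(1:n, train)

length(train)                   # 6500
length(test)                    # 1624
length(intersect(train, test))  # 0
```

The test-set size of 1624 matches the total of the confusion matrix in Step 9 (839 + 785).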
To make this demo more interesting, and to minimize the number of poisonous mushrooms misclassified as edible, we will make the penalty for that error 10 times larger than the penalty for classifying an edible mushroom as poisonous, since eating a poisonous mushroom is far more costly than discarding an edible one.
# penalty matrix
penalty.matrix <- matrix(c(0,1,10,0), byrow=TRUE, nrow=2)
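The layout of the penalty matrix is easier to read with labels. The sketch below adds dimnames under the assumption that rows are the actual class and columns the predicted class, in factor-level order (`e`, then `p`), which is how the rpart documentation describes its loss matrix. The labels are for display only and are not part of the original code.

```r
penalty.matrix <- matrix(c(0, 1, 10, 0), byrow = TRUE, nrow = 2,
                         dimnames = list(actual    = c("e", "p"),
                                         predicted = c("e", "p")))
penalty.matrix

penalty.matrix["p", "e"]   # cost of calling a poisonous mushroom edible
penalty.matrix["e", "p"]   # cost of calling an edible mushroom poisonous
```

The diagonal is zero because correct classifications carry no penalty.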
Step 6: Building a model
In this stage, we’re going to build a Decision Tree by using the rpart (Recursive Partitioning And Regression Trees) algorithm:
# building the classification tree with rpart
tree <- rpart(class~.,
              data=mushrooms_train,
              parms = list(loss = penalty.matrix),
              method = "class")
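To try the same call without the CSV file, the sketch below fits a tree on a small hypothetical data set in which odor `f` is always poisonous and cap shape carries no information. The data and object names are assumptions for illustration; only the `rpart()` arguments mirror the snippet above.

```r
library(rpart)

# Hypothetical toy data: odor "f" is always poisonous, cap shape is noise.
n <- 120
toy <- data.frame(
  odor      = factor(rep(c("a", "f", "n"), length.out = n)),
  cap.shape = factor(rep(c("x", "b"), length.out = n))
)
toy$class <- factor(ifelse(toy$odor == "f", "p", "e"))

penalty.matrix <- matrix(c(0, 1, 10, 0), byrow = TRUE, nrow = 2)

fit <- rpart(class ~ ., data = toy, method = "class",
             parms = list(loss = penalty.matrix))
print(fit)

# Predicted classes on the same rows, compared with the true labels
pred <- predict(fit, toy, type = "class")
table(actual = toy$class, predicted = pred)
```

Printing the fitted object lists the split rules as text, which is a useful check before plotting the tree.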
Step 7: Visualising the tree
In this step, we’ll be using the rpart.plot library to plot our final Decision Tree:
# Visualize the decision tree with rpart.plot
rpart.plot(tree, nn=TRUE)
Decision Tree – Decision Tree Algorithm – Edureka
Step 8: Testing the model
Now in order to test our Decision Tree model, we’ll be applying the testing data set on our model like so:
#Testing the model
pred <- predict(object=tree,mushrooms_test[-1],type="class")
Step 9: Calculating accuracy
We’ll be using a confusion matrix to calculate the accuracy of the model. Here’s the code:
#Calculating accuracy
t <- table(mushrooms_test$class,pred)
> confusionMatrix(t)
Confusion Matrix and Statistics

   pred
      e   p
  e 839   0
  p   0 785

               Accuracy : 1
                 95% CI : (0.9977, 1)
    No Information Rate : 0.5166
    P-Value [Acc > NIR] : < 2.2e-16

                  Kappa : 1
 Mcnemar's Test P-Value : NA

            Sensitivity : 1.0000
            Specificity : 1.0000
         Pos Pred Value : 1.0000
         Neg Pred Value : 1.0000
             Prevalence : 0.5166
         Detection Rate : 0.5166
   Detection Prevalence : 0.5166
      Balanced Accuracy : 1.0000

       'Positive' Class : e
The output shows that all the samples in the test dataset have been correctly classified: we’ve attained an accuracy of 100% on the test data set, with a 95% confidence interval of (0.9977, 1). Thus this Decision Tree model correctly classifies every mushroom in the test set as either poisonous or edible.
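The headline figures can be recomputed from the four counts in the confusion matrix using base R only. This is a sketch of where the numbers come from, not a replacement for `confusionMatrix()`; it assumes the confidence interval is the exact binomial interval from `binom.test()`, which is consistent with the reported (0.9977, 1).

```r
# Counts copied from the printed confusion matrix.
cm <- matrix(c(839, 0, 0, 785), nrow = 2, byrow = TRUE,
             dimnames = list(actual = c("e", "p"), pred = c("e", "p")))

n        <- sum(cm)
correct  <- sum(diag(cm))
accuracy <- correct / n

# Exact binomial confidence interval for the accuracy
ci <- binom.test(correct, n)$conf.int

# No-information rate: share of the most common class
nir <- max(rowSums(cm)) / n

round(c(accuracy = accuracy, lower = ci[1], upper = ci[2], nir = nir), 4)
```

The lower bound is below 1 even with no errors because 1624 test rows cannot rule out a small true error rate.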