CLASSIFICATION
Description of classification decision tree
A decision tree is one of the most widely used tools for classification and prediction. It is a flowchart-like tree structure in which each internal node denotes a test on an attribute, each branch represents an outcome of that test, and each leaf node (terminal node) holds a class label.
Role / Importance
A decision tree is a simple representation for classifying examples. It is a supervised machine learning method in which the data is repeatedly split according to the value of a chosen attribute.
A decision tree consists of:
Nodes: Test the value of a certain attribute.
Edges/Branches: Correspond to the outcome of a test and connect to the next node or leaf.
Leaf nodes: Terminal nodes that predict the outcome (they represent class labels or a class distribution).
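To make the three parts concrete, a very small tree can be written by hand as nested if/else statements. This is only an illustrative sketch: the function name classify_passenger, the attributes sex and pclass, and the split values are hypothetical and are not taken from a fitted model.

```r
# Hypothetical hand-written tree; attributes and split values are
# illustrative assumptions, not the result of fitting any data.
classify_passenger <- function(sex, pclass) {
  if (sex == "female") {     # internal node: test on the attribute `sex`
    if (pclass <= 2) {       # internal node: test on the attribute `pclass`
      "yes"                  # leaf node: class label
    } else {
      "no"                   # leaf node: class label
    }
  } else {                   # branch for the other outcome of the `sex` test
    "no"                     # leaf node: class label
  }
}

classify_passenger("female", 1)
classify_passenger("male", 3)
```

Each if is a node, each outcome of the condition is a branch, and each returned string is a leaf. A fitted tree has the same shape, except that the tests are chosen from the data.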
Select the CSV file to test
summary(titanic): gives a summary of each column in the dataset
names(titanic): gives the names of the column headers
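The loading and inspection steps might look like the sketch below. Since the original CSV is not available here, the sketch assumes a stand-in: it expands R's built-in Titanic contingency table to one row per passenger and writes it to a temporary file. The columns of the author's file (and the names counts, passengers and csv_path) are therefore assumptions.

```r
# Stand-in data (assumption): expand the built-in Titanic table to one row
# per passenger and save it as a temporary CSV file.
counts <- as.data.frame(Titanic)
passengers <- counts[rep(seq_len(nrow(counts)), times = counts$Freq),
                     c("Class", "Sex", "Age", "Survived")]
csv_path <- tempfile(fileext = ".csv")
write.csv(passengers, csv_path, row.names = FALSE)

# In an interactive session a file picker can be used instead of a path:
# titanic <- read.csv(file.choose())
titanic <- read.csv(csv_path)

summary(titanic)  # per-column summary of the dataset
names(titanic)    # names of the column headers
```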
install.packages("partykit")
library(partykit)
Converting Survived to a categorical variable (factor)
titanic$Survived <- as.factor(titanic$Survived)
summary(titanic$Survived)
names(titanic)
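The effect of the conversion is easiest to see on a small vector. The sketch below uses a hypothetical vector of 0/1 codes rather than the real column. The conversion matters because ctree fits a classification tree when the response is a factor and a regression tree when it is numeric.

```r
# Hypothetical 0/1 survival codes, used only for illustration.
survived <- c(0, 1, 1, 0, 1)

summary(survived)                 # numeric: min, quartiles, mean, max

survived_factor <- as.factor(survived)
levels(survived_factor)           # the distinct categories
summary(survived_factor)          # factor: number of cases in each category
```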
set.seed(1234): fixes the seed of the random number generator so that the random split is reproducible
#Draw two samples with probabilities 0.8 and 0.2 to create two partitions of the data
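One common way to write this split is with sample(), as sketched here. The row count n and the names ind and ind_again are assumptions for illustration; with the real data, n would be nrow(titanic).

```r
set.seed(1234)                    # fix the seed so the split can be repeated
n <- 1000                         # hypothetical number of rows
# Label each row 1 (probability 0.8) or 2 (probability 0.2).
ind <- sample(2, n, replace = TRUE, prob = c(0.8, 0.2))
table(ind)                        # sizes of the two partitions

# Resetting the same seed reproduces exactly the same labels.
set.seed(1234)
ind_again <- sample(2, n, replace = TRUE, prob = c(0.8, 0.2))
identical(ind, ind_again)

# With a data frame, the partitions would then be selected as, for example:
# train    <- titanic[ind == 1, ]
# validate <- titanic[ind == 2, ]
```

Because each row is labelled independently, the partitions are only approximately 80% and 20% of the rows.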
Plot the tree. The class probabilities are shown in the terminal nodes of the tree: the black section means no and the white section means yes.
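A minimal sketch of fitting and plotting a tree with partykit's ctree follows. It assumes the stand-in data built from R's built-in Titanic table, where Survived is already a factor with levels No and Yes, and a formula over the Class, Sex and Age columns. The author's CSV and formula may differ.

```r
library(partykit)

# Stand-in data (assumption): one row per passenger from the built-in table.
counts <- as.data.frame(Titanic)
titanic <- counts[rep(seq_len(nrow(counts)), times = counts$Freq),
                  c("Class", "Sex", "Age", "Survived")]

# 80/20 random split; partition 1 is used for fitting.
set.seed(1234)
ind <- sample(2, nrow(titanic), replace = TRUE, prob = c(0.8, 0.2))
train <- titanic[ind == 1, ]

# Conditional inference tree with Survived as the class label.
tree <- ctree(Survived ~ Class + Sex + Age, data = train)

print(tree)   # text description of the splits
plot(tree)    # terminal nodes show the class proportions as shaded bars
```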
Predicting the class probabilities for the validation set
Predicting the class for the validation set using the tree
Creating a matrix of predicted versus actual survival for the validation set
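These three steps can be sketched with predict() and table(). As before, the data is an assumed stand-in built from R's built-in Titanic table, and the names validate, prob, pred, conf_matrix and accuracy are chosen for illustration.

```r
library(partykit)

# Stand-in data (assumption): one row per passenger from the built-in table.
counts <- as.data.frame(Titanic)
titanic <- counts[rep(seq_len(nrow(counts)), times = counts$Freq),
                  c("Class", "Sex", "Age", "Survived")]

set.seed(1234)
ind <- sample(2, nrow(titanic), replace = TRUE, prob = c(0.8, 0.2))
train    <- titanic[ind == 1, ]
validate <- titanic[ind == 2, ]

tree <- ctree(Survived ~ Class + Sex + Age, data = train)

# Class probabilities for each row of the validation set.
prob <- predict(tree, newdata = validate, type = "prob")
head(prob)

# Predicted class for each row of the validation set.
pred <- predict(tree, newdata = validate, type = "response")

# Cross-tabulate predicted against actual survival.
conf_matrix <- table(predicted = pred, actual = validate$Survived)
conf_matrix

# Share of validation rows that were classified correctly.
accuracy <- sum(diag(conf_matrix)) / sum(conf_matrix)
accuracy
```

The diagonal of the matrix counts the correct predictions, so the accuracy is the diagonal sum divided by the total number of validation rows.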