CLASSIFICATION IN R - NAIVE BAYES - Iris

    

 CLASSIFICATION

Description of Naive Bayes

  • Naive Bayes is a probabilistic technique for constructing classifiers. 

  • The characteristic assumption of the naive Bayes classifier is that the value of a particular feature is independent of the value of any other feature, given the class variable.

  • Despite this oversimplified assumption, naive Bayes classifiers often perform well in complex real-world situations. 

  • An advantage of naive Bayes is that it requires only a small amount of training data to estimate the parameters necessary for classification, and the classifier can be trained incrementally.

Role / Importance

  • A Naive Bayes classifier is a probabilistic machine learning model used for classification tasks. 

  • The classifier is built on Bayes' theorem.

Bayes Theorem:

    P(A | B) = P(B | A) P(A) / P(B)

  • Using Bayes' theorem, we can find the probability of A happening, given that B has occurred. Here, B is the evidence and A is the hypothesis.

  • The assumption made here is that the predictors/features are independent of one another, given the class. 

  • That is, the presence of one particular feature does not affect any other. Hence the method is called naive.
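
To make Bayes' theorem and the independence assumption concrete, here is a minimal sketch that computes the class posterior for one iris flower by hand, using base R only. It assumes each feature follows a normal distribution within each class (a common choice for numeric features, not something stated above), and the helper name posterior_by_hand is hypothetical.

```r
# Minimal Gaussian naive Bayes posterior, computed by hand on iris (base R only).
data(iris)

features <- names(iris)[1:4]
classes  <- levels(iris$Species)

# Prior P(class): relative class frequencies in the data
prior <- table(iris$Species) / nrow(iris)

# Per-class mean and standard deviation of every feature
# (rows = features, columns = classes)
class_mean <- sapply(classes, function(cl) colMeans(iris[iris$Species == cl, features]))
class_sd   <- sapply(classes, function(cl) apply(iris[iris$Species == cl, features], 2, sd))

# Hypothetical helper: posterior P(class | x) for one observation x
posterior_by_hand <- function(x) {
  x <- unlist(x[features])
  # Naive step: the likelihood P(x | class) is the product of per-feature densities
  likelihood <- sapply(classes, function(cl) {
    prod(dnorm(x, mean = class_mean[, cl], sd = class_sd[, cl]))
  })
  unnormalised <- likelihood * as.numeric(prior[classes])
  unnormalised / sum(unnormalised)  # dividing by P(x) makes the values sum to 1
}

post <- posterior_by_hand(iris[1, ])
print(round(post, 4))
print(names(which.max(post)))  # predicted class = the one with the highest posterior
```

In terms of the theorem, A is the species (the hypothesis) and B is the set of four measurements (the evidence). The product inside the helper is where the "naive" assumption enters.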


PROBLEM

Source Code

library(caret)       # createDataPartition(), confusionMatrix()
library(naivebayes)  # naive_bayes()

# load the iris dataset
data(iris)

# define an 80%/20% train/test split of the dataset
# (the split is random; call set.seed() beforehand to make it repeatable)
split <- 0.80
trainIndex <- createDataPartition(iris$Species, p = split, list = FALSE)
data_train <- iris[trainIndex, ]
data_test  <- iris[-trainIndex, ]

# train a naive Bayes model on the four measurements
model <- naive_bayes(Species ~ ., data = data_train)

# make predictions on the held-out test set
x_test <- data_test[, 1:4]
y_test <- data_test[, 5]
predictions <- predict(model, x_test)

# summarize results
confusionMatrix(predictions, y_test)
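
For a factor outcome such as Species, createDataPartition samples within each class, so both subsets keep the original class proportions. The following sketch imitates that behaviour in base R to show what the split produces. It is an illustration under stated assumptions, not caret's actual implementation: the seed value and the name train_fraction are arbitrary choices made here.

```r
# Stratified 80/20 split of iris in base R (illustrative, not caret's code).
data(iris)
set.seed(42)  # arbitrary seed (an assumption) so the random split is repeatable

train_fraction <- 0.80

# Group the row indices by species, then sample 80% of the rows within each
# species so the class proportions are preserved in both subsets
rows_by_class <- split(seq_len(nrow(iris)), iris$Species)
train_rows <- unlist(lapply(rows_by_class, function(idx) {
  sample(idx, size = round(train_fraction * length(idx)))
}), use.names = FALSE)

data_train <- iris[train_rows, ]
data_test  <- iris[-train_rows, ]

# Class counts in each subset
print(table(data_train$Species))
print(table(data_test$Species))
```

With 50 flowers per species, this leaves 40 per species for training and 10 per species for testing, so any accuracy figure is computed on 30 flowers.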

Output



Conclusion

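As the result shows, the Naive Bayes model reaches an accuracy of 93% on the held-out test set. This means the model correctly classifies 93% of the test instances. Because the train/test split is random and no seed is set, the exact figure can vary from run to run.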
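
Since the test set holds only 30 flowers (20% of 150), a single accuracy figure carries noticeable uncertainty. The sketch below assumes that 93% corresponds to 28 correct predictions out of 30. That count is inferred from the split size and is not taken from the original output. It uses binom.test from base R to compute an exact binomial interval, the same kind of interval that confusionMatrix prints as "95% CI".

```r
# Accuracy and its 95% confidence interval for a small test set (base R only).
n_test    <- 30  # 20% of the 150 iris rows
n_correct <- 28  # assumption: 28/30 = 0.933, matching the reported 93%

accuracy <- n_correct / n_test

# Exact (Clopper-Pearson) binomial confidence interval for the accuracy
ci <- binom.test(n_correct, n_test)$conf.int

print(round(accuracy, 3))
print(round(as.numeric(ci), 3))
```

A wide interval here is a reminder that one or two extra misclassifications would shift the reported accuracy by several percentage points.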





