A data scientist is building a model to classify emails as 'spam' or 'not spam'. The dataset is large and has many features (one per word). The scientist wants a model that is fast to train and that makes the strong assumption that, given the email's class, the presence of one word is independent of the presence of any other word. Which algorithm is most suitable for this task?
-
A
Support Vector Machine (SVM)
-
B
K-Nearest Neighbors (KNN)
-
C
Naive Bayes
-
D
Decision Tree
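
The word-independence assumption described in the question can be made concrete with a small sketch. It is illustrative only: the toy emails, the function names `train` and `predict`, and the choice of add-one smoothing are all assumptions made here, not part of the original question. Under the assumption, a class score is just the log prior plus a sum of per-word log likelihoods, so training reduces to counting words.

```python
import math
from collections import Counter, defaultdict

# Toy training data, made up for illustration (not a real dataset).
TRAIN = [
    ("win cash prize now", "spam"),
    ("claim free prize", "spam"),
    ("meeting agenda attached", "not spam"),
    ("lunch meeting tomorrow", "not spam"),
]

def train(examples):
    """Count class frequencies and per-class word frequencies."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        for word in text.split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab

def predict(text, class_counts, word_counts, vocab):
    """Score each class as log prior + sum of per-word log likelihoods."""
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

model = train(TRAIN)
print(predict("free cash prize", *model))
print(predict("agenda for lunch meeting", *model))
```

Because each word contributes its own independent term, training is a single counting pass over the data, which is why this kind of model stays fast even with a very large vocabulary.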