Image credit: Joe deSousa

The application of **Bayes' Theorem** to Naive Bayes Classifiers is laid out pretty well on
Wikipedia, but I thought I'd put forward my own
understanding of how it's used.

First off, we have the theorem itself:

$$P(C|F) = \frac{P(C){\cdotp}P(F|C)}{P(F)}$$

Or, "the probability of event *C* occurring, given that event *F* has occurred, is the probability of *C* multiplied by
the probability of *F* given *C*, all divided by the probability of *F*." In the case of classification, we might view
*C* as the class and *F* as the feature.
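
To make the formula a bit more concrete, here's a minimal sketch in Python. The numbers are made up purely for illustration (imagine *C* is "the email is spam" and *F* is "the email contains the word 'offer'"), and the `posterior` helper is just a name I've picked, not part of any library. I'm also assuming we don't know *P(F)* directly, so it's computed from the two possible classes using the law of total probability.

```python
def posterior(prior_c, likelihood_f_given_c, evidence_f):
    """Return P(C|F) = P(C) * P(F|C) / P(F)."""
    if evidence_f <= 0:
        raise ValueError("P(F) must be positive")
    return prior_c * likelihood_f_given_c / evidence_f


# Hypothetical values, chosen only for illustration.
p_c = 0.2               # P(C): prior probability of the class
p_f_given_c = 0.5       # P(F|C): probability of the feature within the class
p_f_given_not_c = 0.05  # P(F|not C): probability of the feature outside the class

# P(F) via the law of total probability over the two classes.
p_f = p_c * p_f_given_c + (1 - p_c) * p_f_given_not_c

p_c_given_f = posterior(p_c, p_f_given_c, p_f)
print(round(p_c_given_f, 3))  # 0.714
```

So even though only 20% of these hypothetical emails are spam to begin with, seeing the feature pushes the probability of the class up to roughly 71%.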