== Discussion ==

Despite the fact that the far-reaching independence assumptions are often inaccurate, the naive Bayes classifier has several properties that make it surprisingly useful in practice. In particular, the decoupling of the class-conditional feature distributions means that each distribution can be independently estimated as a one-dimensional distribution. This helps alleviate problems stemming from the [[curse of dimensionality]], such as the need for data sets that scale exponentially with the number of features. While naive Bayes often fails to produce a good estimate for the correct class probabilities,<ref>{{cite conference |last1=Niculescu-Mizil |first1=Alexandru |first2=Rich |last2=Caruana |title=Predicting good probabilities with supervised learning |conference=ICML |year=2005 |url=http://machinelearning.wustl.edu/mlpapers/paper_files/icml2005_Niculescu-MizilC05.pdf |doi=10.1145/1102351.1102430 |access-date=2016-04-24 |archive-url=https://web.archive.org/web/20140311005243/http://machinelearning.wustl.edu/mlpapers/paper_files/icml2005_Niculescu-MizilC05.pdf |archive-date=2014-03-11 |url-status=dead }}</ref> this may not be a requirement for many applications. For example, the naive Bayes classifier will make the correct [[Maximum a posteriori estimation|MAP]] decision rule classification so long as the correct class is predicted as more probable than any other class; this holds regardless of whether the probability estimate is slightly, or even grossly, inaccurate. In this manner, the overall classifier can be robust enough to ignore serious deficiencies in its underlying naive probability model.<ref name="rish">{{cite conference|last1=Rish|first1=Irina|year=2001|title=An empirical study of the naive Bayes classifier|url=http://www.research.ibm.com/people/r/rish/papers/RC22230.pdf |archive-url=https://ghostarchive.org/archive/20221009/http://www.research.ibm.com/people/r/rish/papers/RC22230.pdf |archive-date=2022-10-09 |url-status=live|conference=IJCAI Workshop on Empirical Methods in AI}}</ref> Other reasons for the observed success of the naive Bayes classifier are discussed in the literature cited below.
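The decoupling described above can be made concrete with a minimal sketch. The following illustrative Python snippet (the Gaussian class-conditional assumption, the toy data and the function names are choices made here for demonstration, not part of the model) estimates each feature's class-conditional distribution independently as a one-dimensional Gaussian and then classifies with the MAP decision rule:

<syntaxhighlight lang="python">
import numpy as np

def fit_gaussian_nb(X, y):
    """Estimate the class prior and, independently, one 1-D Gaussian per feature and class."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = {
            "prior": len(Xc) / len(X),
            "mean": Xc.mean(axis=0),       # one mean per feature
            "var": Xc.var(axis=0) + 1e-9,  # one variance per feature (small floor for stability)
        }
    return params

def predict_map(params, x):
    """MAP rule: pick the class maximising log prior + sum of per-feature log-likelihoods."""
    scores = {}
    for c, p in params.items():
        log_lik = -0.5 * np.sum(np.log(2 * np.pi * p["var"]) + (x - p["mean"]) ** 2 / p["var"])
        scores[c] = np.log(p["prior"]) + log_lik
    return max(scores, key=scores.get)

# Toy data: two classes, three features.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (50, 3)), rng.normal(2.0, 1.0, (50, 3))])
y = np.array([0] * 50 + [1] * 50)
model = fit_gaussian_nb(X, y)
print(predict_map(model, np.array([1.9, 2.1, 2.0])))  # expected: 1
</syntaxhighlight>

Only one-dimensional quantities (a mean and a variance per feature and class) are ever estimated, which is why the amount of data required does not grow exponentially with the number of features.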
=== Relation to logistic regression ===

In the case of discrete inputs (indicator or frequency features for discrete events), naive Bayes classifiers form a ''generative-discriminative'' pair with [[multinomial logistic regression]] classifiers: each naive Bayes classifier can be considered a way of fitting a probability model that optimizes the joint likelihood <math>p(C, \mathbf{x})</math>, while logistic regression fits the same probability model to optimize the conditional <math>p(C \mid \mathbf{x})</math>.<ref name="pair">{{cite conference |first1=Andrew Y. |last1=Ng |author-link1=Andrew Ng |first2=Michael I. |last2=Jordan |author-link2=Michael I. Jordan |title=On discriminative vs. generative classifiers: A comparison of logistic regression and naive Bayes |conference=[[Conference on Neural Information Processing Systems|NIPS]] |volume=14 |year=2002 |url=http://papers.nips.cc/paper/2020-on-discriminative-vs-generative-classifiers-a-comparison-of-logistic-regression-and-naive-bayes}}</ref>

More formally, we have the following:

{{Math theorem
| name = Theorem
| note =
| math_statement = Naive Bayes classifiers on binary features are subsumed by logistic regression classifiers.
}}

{{Math proof|proof=
Consider a generic multiclass classification problem with possible classes <math>Y \in \{1, \ldots, n\}</math>. The (non-naive) Bayes classifier gives, by Bayes' theorem:
<math display="block">p(Y \mid X=x) = \text{softmax}\left(\{\ln p(Y = k) + \ln p(X=x \mid Y=k)\}_k\right)</math>
Under the naive Bayes assumption, <math>\ln p(X=x \mid Y=k) = \sum_i \ln p(X_i = x_i \mid Y=k)</math>, and since each <math>x_i \in \{+1, -1\}</math>, each term can be written as <math>\ln p(X_i = x_i \mid Y=k) = \tfrac{1+x_i}{2} a^+_{i,k} + \tfrac{1-x_i}{2} a^-_{i,k}</math>, which is linear in <math>x_i</math>. The naive Bayes classifier therefore gives
<math display="block">\text{softmax}\left(\left\{\ln p(Y = k) + \frac{1}{2} \sum_i \left[(a^+_{i, k} - a^-_{i, k})x_i + (a^+_{i, k} + a^-_{i, k})\right]\right\}_k\right)</math>
where
<math display="block">a^+_{i, k} = \ln p(X_i = +1 \mid Y=k); \quad a^-_{i, k} = \ln p(X_i = -1 \mid Y=k).</math>
This is exactly a logistic regression classifier.
}}

The link between the two can be seen by observing that the decision function for naive Bayes (in the binary case) can be rewritten as "predict class <math>C_1</math> if the [[odds]] of <math>p(C_1 \mid \mathbf{x})</math> exceed those of <math>p(C_2 \mid \mathbf{x})</math>". Expressing this in log-space gives:

<math display="block"> \log\frac{p(C_1 \mid \mathbf{x})}{p(C_2 \mid \mathbf{x})} = \log p(C_1 \mid \mathbf{x}) - \log p(C_2 \mid \mathbf{x}) > 0 </math>

The left-hand side of this equation is the log-odds, or ''[[logit]]'', the quantity predicted by the linear model that underlies logistic regression. Since naive Bayes is also a linear model for the two "discrete" event models, it can be reparametrised as a linear function <math>b + \mathbf{w}^\top \mathbf{x} > 0</math> (a sketch of this reparametrisation is given below). Obtaining the probabilities is then a matter of applying the [[logistic function]] to <math>b + \mathbf{w}^\top \mathbf{x}</math>, or in the multiclass case, the [[softmax function]].

Discriminative classifiers have lower asymptotic error than generative ones; however, research by [[Andrew Ng|Ng]] and [[Michael I. Jordan|Jordan]] has shown that in some practical cases naive Bayes can outperform logistic regression because it reaches its asymptotic error faster.<ref name="pair"/>
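The reparametrisation above can be made concrete with a short sketch. The following illustrative Python snippet (the parameter values are assumptions chosen for demonstration, and the features are encoded as 0/1 rather than the ±1 used in the proof) converts a two-class Bernoulli naive Bayes model into the bias <math>b</math> and weights <math>\mathbf{w}</math> of the equivalent linear decision function, and checks that both yield the same log-odds:

<syntaxhighlight lang="python">
import numpy as np

# Assumed (illustrative) naive Bayes parameters for two classes and three
# binary 0/1 features: class priors and p(x_i = 1 | class).
prior = np.array([0.6, 0.4])
p1 = np.array([0.8, 0.3, 0.5])   # p(x_i = 1 | C_1)
p2 = np.array([0.2, 0.6, 0.5])   # p(x_i = 1 | C_2)

# Reparametrise the model as a linear function in log-odds (logit) space.
w = np.log(p1 * (1 - p2)) - np.log(p2 * (1 - p1))
b = np.log(prior[0] / prior[1]) + np.sum(np.log((1 - p1) / (1 - p2)))

def nb_log_odds(x):
    """Log-odds log p(C_1|x) - log p(C_2|x) computed directly from the naive Bayes model."""
    log_p1 = np.log(prior[0]) + np.sum(np.where(x == 1, np.log(p1), np.log(1 - p1)))
    log_p2 = np.log(prior[1]) + np.sum(np.where(x == 1, np.log(p2), np.log(1 - p2)))
    return log_p1 - log_p2

x = np.array([1, 0, 1])
print(nb_log_odds(x))                  # direct naive Bayes log-odds
print(b + w @ x)                       # identical value from the linear form b + w.x
print(1 / (1 + np.exp(-(b + w @ x))))  # logistic function recovers p(C_1 | x)
</syntaxhighlight>

The decision function has exactly the form used by logistic regression; the two models differ only in how <math>b</math> and <math>\mathbf{w}</math> are obtained (from the joint likelihood here, versus from the conditional likelihood in logistic regression).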