Supervised learning
===Other factors to consider===
Other factors to consider when choosing and applying a learning algorithm include the following:
* Heterogeneity of the data. If the feature vectors include features of many different kinds (discrete, discrete ordered, counts, continuous values), some algorithms are easier to apply than others. Many algorithms, including [[Support Vector Machines|support-vector machines]], [[linear regression]], [[logistic regression]], [[Neural network (machine learning)|neural networks]], and [[k-nearest neighbors algorithm|nearest neighbor methods]], require that the input features be numerical and scaled to similar ranges (e.g., to the [-1,1] interval). Methods that employ a distance function, such as nearest neighbor methods and [[Support Vector Machines|support-vector machines with Gaussian kernels]], are particularly sensitive to this. An advantage of [[Decision tree learning|decision trees]] is that they easily handle heterogeneous data.
* Redundancy in the data. If the input features contain redundant information (e.g., highly correlated features), some learning algorithms (e.g., [[linear regression]], [[logistic regression]], and [[k-nearest neighbors algorithm|distance-based methods]]) may perform poorly because of numerical instabilities. These problems can often be solved by imposing some form of [[Regularization (mathematics)|regularization]].
* Presence of interactions and non-linearities. If each of the features makes an independent contribution to the output, then algorithms based on linear functions (e.g., [[linear regression]], [[logistic regression]], [[support-vector machine]]s, [[Naive Bayes classifier|naive Bayes]]) and distance functions (e.g., nearest neighbor methods, [[Support Vector Machines|support-vector machines with Gaussian kernels]]) generally perform well. However, if there are complex interactions among features, then algorithms such as [[Decision tree learning|decision trees]] and neural networks work better, because they are specifically designed to discover these interactions. Linear methods can also be applied, but the engineer must manually specify the interactions when using them.

When considering a new application, the engineer can compare multiple learning algorithms and experimentally determine which one works best on the problem at hand (see [[Cross-validation (statistics)|cross-validation]]). Tuning the performance of a learning algorithm can be very time-consuming. Given fixed resources, it is often better to spend the time collecting additional training data and more informative features than to spend it tuning the learning algorithms.
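
The sensitivity of distance-based methods to feature scale, noted under heterogeneity of the data, can be illustrated with a minimal sketch. The helper names (<code>min_max_scale</code>, <code>scale_rows</code>, <code>nearest_index</code>) and the toy age/income data are hypothetical and chosen only for illustration; the sketch assumes plain min-max scaling to [-1,1] and Euclidean distance, which is one of several reasonable choices.

```python
import math

def min_max_scale(column, lo=-1.0, hi=1.0):
    """Linearly rescale a list of numbers to the interval [lo, hi]."""
    c_min, c_max = min(column), max(column)
    if c_max == c_min:
        # A constant feature carries no information; map it to the midpoint.
        return [(lo + hi) / 2.0 for _ in column]
    return [lo + (hi - lo) * (x - c_min) / (c_max - c_min) for x in column]

def scale_rows(rows):
    """Scale every feature (column) of a row-major dataset independently."""
    columns = [min_max_scale(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]

def nearest_index(rows, query_index):
    """Index of the row closest (Euclidean) to rows[query_index], excluding itself."""
    query = rows[query_index]
    candidates = [i for i in range(len(rows)) if i != query_index]
    return min(candidates, key=lambda i: math.dist(rows[i], query))

# Hypothetical toy data: (age in years, income in dollars).
people = [
    [25.0, 50_000.0],  # query point
    [26.0, 58_000.0],  # similar age, somewhat different income
    [60.0, 50_500.0],  # very different age, nearly the same income
    [40.0, 90_000.0],
]

# On raw values the income column dominates the distance.
raw_neighbor = nearest_index(people, 0)
# After scaling, both features contribute on a comparable range.
scaled_neighbor = nearest_index(scale_rows(people), 0)

print("nearest neighbor on raw features:   ", raw_neighbor)
print("nearest neighbor on scaled features:", scaled_neighbor)
```

On the raw values the 60-year-old is selected as the nearest neighbor of the 25-year-old because the incomes differ by only a few hundred dollars; after scaling, the 26-year-old is selected instead.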
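
The experimental comparison of learning algorithms by cross-validation can be sketched as follows. This is a simplified illustration, not a reference implementation: the two learners (a mean-value baseline and a single-feature least-squares line), the function names, the deterministic interleaved folds, and the toy data are all assumptions made for the example. In practice the folds are usually shuffled and a library routine is used.

```python
def k_fold_indices(n, k):
    """Split range(n) into k interleaved folds (deterministic, no shuffling)."""
    return [list(range(start, n, k)) for start in range(k)]

def fit_mean(xs, ys):
    """Baseline learner: always predict the mean of the training targets."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_line(xs, ys):
    """Ordinary least squares for one feature: y ~ slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def cross_val_mse(fit, xs, ys, k=5):
    """Average held-out mean squared error of the learner `fit` over k folds."""
    fold_errors = []
    for test_idx in k_fold_indices(len(xs), k):
        held_out = set(test_idx)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        model = fit(train_x, train_y)  # train only on the remaining folds
        errors = [(model(xs[i]) - ys[i]) ** 2 for i in test_idx]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / len(fold_errors)

# Hypothetical toy data: a linear trend plus a fixed, hand-written perturbation.
xs = [float(i) for i in range(20)]
noise = [0.3, -0.2, 0.1, -0.4] * 5
ys = [2.0 * x + 1.0 + e for x, e in zip(xs, noise)]

scores = {
    "mean baseline": cross_val_mse(fit_mean, xs, ys),
    "linear fit": cross_val_mse(fit_line, xs, ys),
}
best = min(scores, key=scores.get)  # learner with the lowest held-out error

for name, mse in scores.items():
    print(f"{name}: cross-validated MSE = {mse:.3f}")
print("selected:", best)
```

Each learner is scored only on data it did not see during fitting, so the comparison reflects generalization rather than fit to the training set.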