Neural Networks (Geoffrey Hinton Course)
Revision as of 18:02, 30 October 2016
Some Simple Models of Neurons
Throughout, $y$ denotes the output and $x_i$ the inputs.
Linear Neurons
$y = b + \sum_{i} x_i w_i$
where $w_i$ are the weights and $b$ is the bias.
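As a minimal sketch in plain Python (the function name is my own, not from the course):

```python
def linear_neuron(x, w, b):
    # y = b + sum_i x_i * w_i : a weighted sum of the inputs plus a bias.
    return b + sum(xi * wi for xi, wi in zip(x, w))

# e.g. linear_neuron([1.0, 2.0], [0.5, -1.0], 0.5) -> 0.5 + 0.5 - 2.0 = -1.0
```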
Binary Threshold Neurons
$z = \sum_{i} x_i w_i$
$y = 1$ if $z \geq \theta$, $0$ otherwise.
Or, equivalently,
$z = b + \sum_{i} x_i w_i$
$y = 1$ if $z \geq 0$, $0$ otherwise.
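The second, equivalent formulation (with the threshold $\theta$ folded into the bias as $b = -\theta$) can be sketched as follows; the helper name is illustrative, not from the course:

```python
def binary_threshold_neuron(x, w, b):
    # Fire (output 1) exactly when the total input z is non-negative;
    # the threshold theta is absorbed into the bias (b = -theta).
    z = b + sum(xi * wi for xi, wi in zip(x, w))
    return 1 if z >= 0 else 0
```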
Rectified Linear Neurons
$z = b + \sum_{i} x_i w_i$
$y = z$ if $z > 0$, $0$ otherwise (linear above zero, with the decision at zero).
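A one-line sketch of the same rule (hypothetical helper name):

```python
def rectified_linear_neuron(x, w, b):
    # Pass z through unchanged when positive; clamp to 0 otherwise.
    z = b + sum(xi * wi for xi, wi in zip(x, w))
    return z if z > 0 else 0.0
```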
Sigmoid Neurons
Give a real-valued output that is a smooth and bounded function of their total input.
$z = b + \sum_{i} x_i w_i$
$y = \frac{1}{1 + e^{-z}}$
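The logistic formula above translates directly (illustrative sketch, stdlib only):

```python
import math

def sigmoid_neuron(x, w, b):
    # Smooth, bounded output in (0, 1): y = 1 / (1 + exp(-z)).
    z = b + sum(xi * wi for xi, wi in zip(x, w))
    return 1.0 / (1.0 + math.exp(-z))

# At z = 0 the output is exactly 0.5, the midpoint of the range.
```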
Stochastic Binary Neurons
These use the same equations as logistic units, but output $1$ (= a spike) or $0$ at random: the output of the logistic is treated as the probability of producing a spike in a short time window.
$z = b + \sum_{i} x_i w_i$
$P(s = 1) = \frac{1}{1 + e^{-z}}$
We can do a similar trick for rectified linear units; in this case the output is treated as the Poisson rate for spikes.
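The logistic case can be sketched as a coin flip weighted by the logistic output (the function name and the `rng` parameter are my own additions for testability):

```python
import math
import random

def stochastic_binary_neuron(x, w, b, rng=random):
    # Same z and logistic as a sigmoid neuron, but the logistic output
    # is used as the probability of emitting a spike (1) vs. no spike (0).
    z = b + sum(xi * wi for xi, wi in zip(x, w))
    p = 1.0 / (1.0 + math.exp(-z))
    return 1 if rng.random() < p else 0
```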
Types of Learning
Supervised Learning
Learn to predict an output when given an input vector.
- Regression: The target output is a real number or a whole vector of real numbers.
- Classification: The target output is a class label.
How Supervised Learning Typically Works
- Start by choosing a model-class: $y = f(x; W)$
  - A model-class $f$ is a way of using some numerical parameters, $W$, to map each input vector, $x$, into a predicted output $y$.
- Learning usually means adjusting the parameters to reduce the discrepancy between the target output, $t$, on each training case and the actual output, $y$, produced by the model.
  - For regression, $\frac{1}{2}(y-t)^2$ is often a sensible measure of the discrepancy.
  - For classification, there are other measures that are generally more sensible (they also work better).
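The regression discrepancy above is simple enough to state in code (illustrative only):

```python
def squared_error(y, t):
    # 1/2 (y - t)^2 : the usual discrepancy measure for regression.
    return 0.5 * (y - t) ** 2

# e.g. squared_error(3.0, 1.0) -> 2.0
```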
Reinforcement Learning
Learn to select an action to maximize payoff.
- The output is an action or a sequence of actions, and the only supervisory signal is an occasional scalar reward.
  - The goal in selecting each action is to maximize the expected sum of the future rewards.
  - We usually use a discount factor for delayed rewards so that we don't have to look too far into the future.
- Reinforcement learning is difficult because:
  - The rewards are typically delayed, so it is hard to know where we went wrong (or right).
  - A scalar reward does not supply much information.
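The discounted sum of future rewards can be sketched as follows (hypothetical helper; the discount factor $\gamma \in (0, 1)$ makes distant rewards contribute geometrically less):

```python
def discounted_return(rewards, gamma):
    # Sum of future rewards r_k weighted by gamma**k, so rewards far
    # in the future matter less and the sum stays bounded.
    return sum(gamma ** k * r for k, r in enumerate(rewards))

# e.g. discounted_return([1.0, 1.0, 1.0], 0.5) -> 1.0 + 0.5 + 0.25 = 1.75
```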
Unsupervised Learning
Discover a good internal representation of the input.