I’m trying to convince my daughter that she will have difficulty using a traditional feedforward network to process comments from her followers on YouTube.
Instead, I recommended recurrent neural networks as a good starting point. Going to transformers is overkill, and I’d rather have her look into some foundational ideas first.
Which of the following do you think I should include in the list of differences between recurrent and traditional networks?
1. Traditional neural networks can process inputs of any length. Recurrent neural networks require a fixed-size sequence as their input.
2. Recurrent neural networks capture the sequential information present in the input data. Traditional neural networks don’t have this ability.
3. Recurrent neural networks share weights across time. Traditional networks use different weights on each input node.
4. Recurrent neural networks can consider future inputs to compute their current state. Traditional neural networks can’t access future inputs.
Answer: 2, 3
Recurrent neural networks (RNNs) are a type of artificial neural network designed to process sequential or time series data. Their main difference from traditional networks is that information from prior inputs influences the current hidden state and output. This allows them to capture the sequential information present in the data. For example, an RNN is well suited to capturing the dependencies between the words of a sentence.
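To make the recurrence concrete, here is a minimal NumPy sketch of a vanilla RNN step. The parameter names and sizes (W_xh, W_hh, input_size=8, hidden_size=16) are illustrative assumptions, not part of the question.

```python
import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size = 8, 16

# Illustrative parameters of a vanilla RNN cell (assumed sizes).
W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))   # input-to-hidden weights
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))  # hidden-to-hidden weights (the recurrence)
b_h = np.zeros(hidden_size)

def rnn_step(x_t, h_prev):
    """New hidden state from the current input and the previous hidden state."""
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

# The hidden state accumulates context from earlier "words",
# so the state after the last step depends on the whole sentence so far.
sentence = rng.normal(size=(5, input_size))  # 5 word vectors
h = np.zeros(hidden_size)
for x_t in sentence:
    h = rnn_step(x_t, h)
print(h.shape)  # (16,)
```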
An RNN processes its input one step at a time, so the same model can handle sequences of varying lengths. For example, an RNN can process a 5-word and a 10-word sentence with the same input structure, whereas a traditional neural network needs a different input size for each case; this is why the first statement has it backwards.
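The sketch below illustrates this point with assumed sizes: the same RNN loop encodes both a 5-step and a 10-step sequence, while a feedforward weight matrix over the flattened sentence is tied to one fixed length.

```python
import numpy as np

rng = np.random.default_rng(1)
input_size, hidden_size = 8, 16  # assumed sizes for illustration

W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
b_h = np.zeros(hidden_size)

def encode(sequence):
    """Run the same RNN over a sequence of any length and return the final state."""
    h = np.zeros(hidden_size)
    for x_t in sequence:
        h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
    return h

short_sentence = rng.normal(size=(5, input_size))   # 5 "words"
long_sentence = rng.normal(size=(10, input_size))   # 10 "words"
print(encode(short_sentence).shape, encode(long_sentence).shape)  # (16,) (16,)

# A feedforward layer over the flattened sentence only fits one length:
W_ff = rng.normal(scale=0.1, size=(hidden_size, 5 * input_size))  # built for 5-word inputs
print((W_ff @ short_sentence.reshape(-1)).shape)  # works: (16,)
# W_ff @ long_sentence.reshape(-1) would fail: shapes (16, 40) and (80,) don't align
```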
RNNs apply the same weight parameters at every time step, unlike traditional networks, which use different weights for each input node. Sharing parameters helps an RNN generalize to sequences of varying lengths and respond consistently to sequences that carry the same meaning but are arranged differently. The accepted answer to this question expands on this topic with an example.
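As a rough illustration of what weight sharing means for model size (again with assumed layer sizes), the RNN’s parameter count stays constant as the sequence grows, while a fully connected layer over the flattened input does not.

```python
# Parameter counts with assumed sizes (input_size=8, hidden_size=16);
# the exact numbers are illustrative, only the scaling matters.
input_size, hidden_size = 8, 16

# RNN cell: the same W_xh, W_hh, b_h are reused at every time step,
# so the count is independent of sequence length.
rnn_params = hidden_size * input_size + hidden_size * hidden_size + hidden_size
print(rnn_params)  # 400, whether the sentence has 5 words or 10

# Feedforward layer over a flattened sequence: separate weights for
# every position, so the count grows with the sequence length.
for seq_len in (5, 10):
    ff_params = hidden_size * (seq_len * input_size) + hidden_size
    print(seq_len, ff_params)  # 5 -> 656, 10 -> 1296
```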
Finally, neither recurrent nor traditional networks can access future inputs to compute their current state, so the fourth statement does not describe a real difference.
Recommended reading