Issues in Connectionist Models [notes on cognitive science]
While the connectionist approach shows great potential and has been at the forefront of many advances in cognitive science and AI in general, a number of important challenges remain. Let’s have a look at the issues Elman highlights.
Representing Time
It is clear that time plays a very important role in the lives of humans and animals alike. How would we go about representing time in a neural network? Long-term memory can be represented in an essentially atemporal way, but when we talk about time we enter the realm of working memory, or, as non-psychologists know it, short-term memory. To give a network a short-term memory bank, researchers came up with the idea of the recurrent network, in which connections from nodes can loop back into the network. This way the network can combine a representation of the current input with its own internal state from the previous time step (i.e. at t-1).
Scaling and Modularity
Usually the examples shown are very simple architectures, and this simplicity hides a number of pitfalls. Firstly, larger networks with a uniform architecture pose a challenge: as a network grows, the degrees of freedom (the weights) grow with it, while the training data typically grows more slowly, so our training techniques become less efficient. The more weights there are, the more rugged the weight space becomes, increasing the risk of ending up trapped in local minima. However, so-called “constructive algorithms” allow the topology of the network to be reshaped dynamically by adding or deleting nodes and weights. One such technique, Cascade-Correlation, progressively adds hidden units until the error is sufficiently reduced; pruning techniques work in the opposite direction, gradually eliminating weights while keeping the error at bay.
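To make the constructive idea concrete, here is a toy sketch in Python: hidden units are added one at a time, the linear output layer is refit after each addition, and growth stops once the training error is small. This is only the growth idea behind Cascade-Correlation, not the full algorithm (no candidate pool, no correlation maximisation), and the task, sizes and threshold are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR: a problem a network with too few hidden units cannot solve.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0.0, 1.0, 1.0, 0.0])

def grow_network(X, y, tol=1e-3, max_units=50):
    """Add random sigmoid hidden units one at a time; refit the linear
    output layer (least squares) after each addition, stopping when the
    training error falls below tol."""
    H = np.ones((X.shape[0], 1))                       # bias column
    for n_units in range(1, max_units + 1):
        W = rng.normal(scale=2.0, size=(X.shape[1] + 1,))
        h = 1 / (1 + np.exp(-(X @ W[:-1] + W[-1])))    # new hidden unit
        H = np.column_stack([H, h])                    # keep the unit's output
        out_w, *_ = np.linalg.lstsq(H, y, rcond=None)  # retrain output layer
        err = np.mean((H @ out_w - y) ** 2)
        if err < tol:
            return n_units, err
    return max_units, err

units, err = grow_network(X, y)
print(units, err)
```

Note how freezing the earlier hidden units and only refitting the output layer keeps each growth step cheap, at the price of a network that may end up larger than strictly necessary.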
The second pitfall concerns the way the network is interconnected. We can address this by breaking the problem down into smaller subnetworks that are easier to understand and train. According to research, a modular architecture aids both training and performance. All in all, nature does not always need to provide the solution; it often suffices to make the appropriate tools available, which can then be recruited to solve problems as they arise.
Supervised vs. Unsupervised Learning
Where does the error measurement come from? Is it psychologically reasonable to assume a teacher pattern? It is in some cases, where the subject receives some input, performs a task and then receives feedback. However, in cases such as learning the past tense of verbs, there is no obvious teacher providing children with feedback. In some of those cases an internal teaching signal can arise from the nature of the task itself. In the “auto-association” task, networks are required to reproduce the input pattern on their output layer. The activation has to flow through a narrow hidden layer, though, forcing the network to find a suitable alternative representation. In these cases the teacher is nothing but the input itself.
A well-known architecture that reflects short-term memory is the Simple Recurrent Network (SRN). An SRN contains recurrent connections from the hidden units to a layer of context units. These store the hidden unit activations on one time step and feed them back to the hidden units on the next, which allows the network to solve tasks that span time. SRNs are capable of discovering sequential dependencies in their training data, and the prediction task they are trained on has ecological validity (i.e. it relates to psychological findings about natural systems such as the brain).
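The SRN described above can be sketched as follows on a toy next-item prediction task. The context layer is just a copy of the previous hidden state, and, following Elman, the error is backpropagated only one step (the context is treated as a fixed extra input). The repeating sequence 0,0,1,1,… is chosen so that the correct prediction depends on the previous symbol, not just the current one; the sizes and learning rate are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

seq = [0, 0, 1, 1] * 75                   # next symbol depends on context
n_sym, n_hid = 2, 8
one_hot = np.eye(n_sym)

W_in = rng.normal(scale=0.5, size=(n_sym, n_hid))
W_ctx = rng.normal(scale=0.5, size=(n_hid, n_hid))
W_out = rng.normal(scale=0.5, size=(n_hid, n_sym))
lr = 0.1

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

for _ in range(50):                        # epochs over the sequence
    context = np.zeros(n_hid)
    for t in range(len(seq) - 1):
        x, target = one_hot[seq[t]], one_hot[seq[t + 1]]
        h = sigmoid(x @ W_in + context @ W_ctx)
        p = np.exp(h @ W_out); p /= p.sum()        # softmax prediction
        d_out = p - target                         # cross-entropy gradient
        d_h = (d_out @ W_out.T) * h * (1 - h)
        W_out -= lr * np.outer(h, d_out)
        W_in -= lr * np.outer(x, d_h)
        W_ctx -= lr * np.outer(context, d_h)
        context = h                                # copy to context units

# Evaluate: one more pass through the sequence without learning.
context = np.zeros(n_hid)
correct = 0
for t in range(len(seq) - 1):
    h = sigmoid(one_hot[seq[t]] @ W_in + context @ W_ctx)
    correct += int(np.argmax(h @ W_out) == seq[t + 1])
    context = h
accuracy = correct / (len(seq) - 1)
print(accuracy)
```

Since seeing a 0 can be followed by either 0 or 1 depending on position, a network without the context copy could do no better than chance on those steps; the SRN can, because its context units remember where in the cycle it is.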
The last form of supervised training is called reinforcement learning. In this method, the network is given only a scalar evaluation of its overall performance. Since this signal carries no information about where the error lies, learning takes longer than with other methods; however, it is a more psychologically plausible approach.
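To illustrate why a scalar signal is slower, here is a deliberately crude sketch: a single linear unit learns the AND function by random weight perturbation, keeping only changes that improve its overall score. The task and all constants are made up for illustration, and real reinforcement learning algorithms are far more sophisticated, but the point carries over: a scalar reward tells the network nothing about which weight was at fault.

```python
import numpy as np

rng = np.random.default_rng(3)

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([0., 0., 0., 1.])            # AND function

def reward(w):
    """Scalar score only: negative squared error of a single linear unit."""
    out = X @ w[:2] + w[2]
    return -np.sum((out - y) ** 2)

w = rng.normal(size=3)
best = reward(w)
for _ in range(2000):
    trial = w + rng.normal(scale=0.1, size=3)  # blind perturbation
    r = reward(trial)
    if r > best:                               # keep only improvements
        w, best = trial, r

print(best)  # approaches -0.25, the best a linear unit can score on AND
```

Compare this with backpropagation, where the error signal itself tells each weight which way to move; here every improvement has to be stumbled upon.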
Unsupervised learning techniques are varied. Hebbian learning (as mentioned in the previous post) is one of them, since it uses just the correlation of activity between nodes receiving input from the environment. The network can be thought of as ‘exploring’ its environment, trying to discover correlations in the input. Other unsupervised techniques include competitive learning, feature mapping, vector quantization and adaptive resonance theory. All of these rely on correlation or mutual information, so they are not truly unsupervised: the supervision is hidden within the learning rule itself. “The tabula cannot be completely rasa.”
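A small sketch of Hebbian learning discovering structure on its own: this uses Oja's normalised variant of the Hebb rule (dw = lr * y * (x - y * w)), which drives the weight vector of a single linear unit toward the dominant correlation direction of its input. The 2-D data here are invented for illustration; plain Hebbian learning without the normalising term would find the same direction but let the weights grow without bound.

```python
import numpy as np

rng = np.random.default_rng(4)

# Correlated 2-D input: the second feature roughly follows the first.
x1 = rng.normal(size=2000)
X = np.column_stack([x1, 0.9 * x1 + 0.3 * rng.normal(size=2000)])

w = rng.normal(size=2)
lr = 0.01
for x in X:
    y = w @ x                     # unit activity: a correlation detector
    w += lr * y * (x - y * w)     # Oja's rule: Hebb term + normalisation

print(w)  # aligns with the dominant correlation direction, with |w| near 1
```

No teacher ever appears in the loop; the only "supervision" is the correlational structure the learning rule itself is built to exploit, which is exactly the sense in which the tabula cannot be completely rasa.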