As far as I understand, the training, test, and validation curves do not need to end up close to each other before we can say the network is trained well. Am I right? And if each curve converges to a specific value (no matter how much the MSE values for training, test, and validation differ from each other), can we say the network works well?
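To make concrete what I mean by "converges to a specific value", here is a minimal sketch. The helper `has_converged`, its `window` and `tol` parameters, and the MSE numbers are all made up for illustration; they are not output from any real network or toolbox.

```python
def has_converged(mse_history, window=5, tol=1e-3):
    """Return True if the last `window` MSE values vary by less than `tol`.

    Hypothetical plateau check: `window` and `tol` are arbitrary choices.
    """
    if len(mse_history) < window:
        return False
    tail = mse_history[-window:]
    return max(tail) - min(tail) < tol


# Made-up per-iteration MSE values, for illustration only.
train_mse = [0.90, 0.40, 0.20, 0.12, 0.101, 0.1004, 0.1002, 0.1001, 0.1001, 0.1000]
val_mse = [0.95, 0.50, 0.35, 0.31, 0.3005, 0.3003, 0.3002, 0.3001, 0.3001, 0.3000]

train_flat = has_converged(train_mse)  # has the training curve flattened out?
val_flat = has_converged(val_mse)      # has the validation curve flattened out?
gap = val_mse[-1] - train_mse[-1]      # final difference between the two curves

print(train_flat, val_flat, round(gap, 3))
```

In this made-up case both curves have flattened, yet the validation MSE stays well above the training MSE. That is exactly the situation I am asking about.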
Another question: when asked to show a convergence chart of MSE versus iteration for a network, is it essential that the graph include all three curves (train, test, validation), or can it show only 'train', or only 'train and test', and so on?