MATLAB:
1. Improve the relatively uninformative 5-line code examples in the command-line help and doc examples.
2. Give the GUI user the choice between the current, overwhelmingly voluminous command-line version and the shorter versions in the improved help/doc documentation.
>> help nndatasets
% Choosing the regression/curve-fitting examples
simplefit_dataset - Simple fitting dataset.
abalone_dataset   - Abalone shell rings dataset.
bodyfat_dataset   - Body fat percentage dataset.
building_dataset  - Building energy dataset.
chemical_dataset  - Chemical sensor dataset.
cho_dataset       - Cholesterol dataset.
engine_dataset    - Engine behavior dataset.
house_dataset     - House value dataset.
% How about including the number of inputs, number of outputs, and number of
% examples on the same line, e.g., ( 21 3 264 ) for the cholesterol dataset?
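A minimal sketch of how those dimensions could be tabulated for the eight datasets listed in the help output. It assumes the Neural Network Toolbox (now Deep Learning Toolbox) is installed and that each dataset function returns plain matrices with one example per column; the variable `names` and the print format are illustrative choices, not part of the toolbox.

```matlab
% Dataset function names copied from the "help nndatasets" listing
names = {'simplefit_dataset','abalone_dataset','bodyfat_dataset', ...
         'building_dataset','chemical_dataset','cho_dataset', ...
         'engine_dataset','house_dataset'};

for k = 1:numel(names)
    [x, t] = feval(names{k});   % load inputs x and targets t
    I = size(x,1);              % number of inputs
    O = size(t,1);              % number of outputs
    N = size(x,2);              % number of examples (columns)
    fprintf('%-18s ( %d %d %d )\n', names{k}, I, O, N);
end
```

For cho_dataset this should print the ( 21 3 264 ) triple quoted in the post.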
% Except for two extra plot statements in help simplefit_dataset, all of the
% resulting 5-statement help examples are identical and very uninformative.
% For example:
[ x, t ] = cho_dataset;
% How many inputs? How many outputs? How many examples? Are there outliers?
% Are the t variables on the same scale? What about the x variables?
% How well are the variables correlated?
net = fitnet(10);
net = train(net,x,t);
% What was the state of the RNG that randomly divided the data and chose the
% random initial weights?
view(net)
y = net(x);
% Is this result good or bad? Compared to what?
====================================================
% Suggested Bare Bones Replacement ( 9 statements )
[ x, t ] = cho_dataset;
[ I N ] = size(x) % [ 21 264 ]
[ O N ] = size(t) % [ 3 264 ]
% Are there outliers? Are the t variables on the same scale? What about the
% x variables? How well are the variables correlated?
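One hedged way to start answering these questions with base MATLAB functions is sketched below. The 3-standard-deviation outlier threshold `zthresh` and the helper variable names are arbitrary illustrative choices, not something the original post prescribes; `cho_dataset` requires the Neural Network (Deep Learning) Toolbox.

```matlab
[x, t] = cho_dataset;                 % columns are examples
N = size(x,2);

% Scale check: per-variable [min mean max std], one row per variable
xstats = [min(x,[],2) mean(x,2) max(x,[],2) std(x,0,2)]
tstats = [min(t,[],2) mean(t,2) max(t,[],2) std(t,0,2)]

% Crude outlier screen: count entries more than zthresh standard
% deviations from their row mean (zthresh = 3 is an assumption)
zthresh = 3;
zx = (x - repmat(mean(x,2),1,N)) ./ repmat(std(x,0,2),1,N);
zt = (t - repmat(mean(t,2),1,N)) ./ repmat(std(t,0,2),1,N);
nxout = sum(abs(zx(:)) > zthresh)
ntout = sum(abs(zt(:)) > zthresh)

% Correlations: corrcoef expects one variable per column, hence the transpose
C   = corrcoef([x; t]');              % (I+O)-by-(I+O) correlation matrix
I   = size(x,1);
Cxt = C(1:I, I+1:end)                 % input-target block, I-by-O
```

Large differences between rows of `xstats` or `tstats` suggest the variables are on different scales, and small magnitudes in `Cxt` suggest weak linear input-target relationships.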
MSE00 = mean(var(t',1)) % 2025.8 Reference MSE
% MSE00 is the mean-squared error of the naive constant-output model
% y00 = repmat(mean(t,2),1,N).
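The equivalence between the mean of the biased target variances and the mean-squared error of the constant-output model can be checked on a small example. The 2-by-4 target matrix below is made up purely for illustration, and the sketch uses only base MATLAB so it runs without the toolbox.

```matlab
% Hypothetical target matrix (2 outputs, 4 examples), for illustration only
t = [ 1  2  3  4
     10 30 20 40];
[O, N] = size(t);

MSE00 = mean(var(t',1))            % mean of the biased (1/N) row variances

y00 = repmat(mean(t,2),1,N);       % naive constant-output model
e00 = t - y00;                     % its errors
MSE00check = mean(e00(:).^2)       % mean-squared error over all O*N entries

assert(abs(MSE00 - MSE00check) < 1e-12)   % the two values agree
```

Both quantities evaluate to 63.125 for this matrix, since the biased row variances are 1.25 and 125.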
net = fitnet(10);
rng(0) % RNG seed can be an arbitrary nonnegative integer (help rng)
[ net tr y e ] = train(net,x,t);
% tr is the training record,
% y = net(x);
% e = t-y;
% MSE = mse(e)
% However, the actual value doesn't mean much unless it is normalized by
% the mean of the target variances or used to calculate the "coefficient of
% determination" (aka R^2 and R-squared), which is interpreted as the fraction
% of target variance that is "explained" by the model. In addition, R is the
% correlation coefficient between y and t. See wikipedia/R-squared.
% Only one of the following is needed:
NMSE = mse(e)/MSE00 % 0.229 Typically 0 <= NMSE <= 1
R2 = 1 - mse(e)/MSE00 % 0.771 Hoped for at least 0.99 (Dagnabit!)
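For reference, the two normalized measures can be written out as follows, with e = t - y, O outputs, and N examples. The symbol \sigma_i^2 for the biased (1/N) variance of the i-th target row is notation introduced here and does not appear in the original post.

```latex
\begin{aligned}
\mathrm{MSE}      &= \frac{1}{O\,N}\sum_{i=1}^{O}\sum_{n=1}^{N} e_{in}^{2}, \\
\mathrm{MSE}_{00} &= \frac{1}{O}\sum_{i=1}^{O} \sigma_i^{2}, \\
\mathrm{NMSE}     &= \frac{\mathrm{MSE}}{\mathrm{MSE}_{00}}, \\
R^{2}             &= 1 - \mathrm{NMSE}.
\end{aligned}
```

With the values quoted for the cholesterol example, NMSE = 0.229 gives R^2 = 1 - 0.229 = 0.771, which is why only one of the two needs to be reported.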