What are the fundamental differences between RBFs (radial basis function networks) and FFNNs (feedforward neural networks)?
- RBFs have localized basis functions (e.g., Gaussians), whereas FFNNs have global basis functions (sigmoids).
- RBFs can be fitted using linear regression (provided the spread S and the number of basis functions m are fixed), while FFNNs require non-linear regression, i.e., an optimization algorithm that iteratively minimizes the error. That slows the fit down considerably (see the first sketch after this list).
- Since RBFs are cheap to fit, we have embedded cross-validation in the outer loop of the regression to determine m and S. Cross-validation here means minimizing the PRESS (leave-one-out) error over m and S, which should in theory yield the best predictor (see the second sketch below).
- Because the non-linear regression of FFNNs starts from a random point, the resulting networks tend to vary a little from run to run. To counter this variation, we generate ensembles of networks which are then averaged. An ensemble typically has 9 members (but this can be adjusted by the user); see the third sketch below.
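The linearity of the RBF fit is easy to see in code. Below is a minimal sketch in Python/NumPy, assuming a Gaussian basis with fixed spread S and fixed centers; the function names are illustrative, not part of any product API. Once S and the centers are fixed, the model is linear in the weights, so a single least-squares solve replaces the iterative training an FFNN needs.

```python
import numpy as np

def rbf_fit(X, y, centers, S):
    """Fit Gaussian RBF weights by a single linear least-squares solve.

    With the centers c_j and spread S fixed, the model
        f(x) = sum_j w_j * exp(-||x - c_j||^2 / (2 * S**2))
    is linear in the weights w, so no iterative optimization is needed.
    """
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)  # (n, m)
    Phi = np.exp(-d2 / (2.0 * S ** 2))                             # design matrix
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return w

def rbf_predict(Xnew, centers, S, w):
    """Evaluate the fitted RBF surface at new points."""
    d2 = ((Xnew[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * S ** 2)) @ w
```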
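For a model that is linear in its weights, the PRESS error also has a closed form (the leave-one-out residual is e_i / (1 - h_ii), with H the hat matrix), so the outer cross-validation loop over m and S never has to refit the model n times. A sketch under the same assumptions as above; drawing the centers from the data points is an illustrative choice here, not necessarily how the actual center selection works:

```python
import numpy as np

def press(Phi, y):
    """Closed-form PRESS (leave-one-out) error for a linear-in-the-weights fit."""
    H = Phi @ np.linalg.pinv(Phi)          # hat matrix: yhat = H y
    resid = y - H @ y
    loo = resid / (1.0 - np.diag(H))       # leave-one-out residuals
    return float(loo @ loo)

def select_m_S(X, y, m_grid, S_grid, rng):
    """Outer loop: pick the (m, S) pair that minimizes PRESS."""
    best = (np.inf, None, None)
    for m in m_grid:
        centers = X[rng.choice(len(X), size=m, replace=False)]
        for S in S_grid:
            d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            Phi = np.exp(-d2 / (2.0 * S ** 2))
            p = press(Phi, y)
            if p < best[0]:
                best = (p, m, S)
    return best  # (PRESS, m, S)
```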
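The ensemble averaging for FFNNs can be sketched as follows, here using scikit-learn's MLPRegressor purely as a stand-in trainer (not the actual implementation). Each member starts from a different random initialization, and the plain average of the member predictions damps the run-to-run variation of any single network:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def ffnn_ensemble(X, y, n_members=9, hidden=(5,), seed=0):
    """Train n_members feedforward nets from different random starting points."""
    members = []
    for k in range(n_members):
        net = MLPRegressor(hidden_layer_sizes=hidden, activation="logistic",
                           solver="lbfgs", max_iter=2000, random_state=seed + k)
        members.append(net.fit(X, y))
    return members

def ensemble_predict(members, Xnew):
    """Average the member predictions to get the ensemble response."""
    return np.mean([net.predict(Xnew) for net in members], axis=0)
```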
In our experience, FFNNs are better at approximating smooth functions, since they tend to interpolate rather than average. They also seem better at approximating sparse data, mainly because cross-validation does not work well for sparse sets, and they seem more accurate for non-uniform point distributions, again mainly because cross-validation works best for uniformly dense sets.
For uniformly dense sets, RBFs may be better, since cross-validation should theoretically provide a more accurate response surface. As you probably know, however, obtaining a dense set in high dimensions can be very expensive and would typically run into thousands of simulations for only 50 variables.
We have also had feedback from large automotive users that they prefer FFNNs but cannot afford them (a typical automotive design problem might have 7 cases, 50 variables and 100 constraint functions). When using ensembles, typically 4500 neural networks must then be trained individually (counting the ensemble members and the hidden-node options), which can take days.
For an optimization in which the user is really only interested in arriving at a single design point (i.e. a converged solution), the default SRSM (sequential) approach with linear basis functions is still the best and cheapest. It also works well for large numbers of variables, since its cost grows only linearly with the number of variables, as sketched below.
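The linear cost is easy to see: a linear surface has only n + 1 coefficients, so the number of simulation points needed per iteration grows proportionally to n. A minimal sketch of such a fit (illustrative only, not the SRSM implementation itself):

```python
import numpy as np

def fit_linear_surface(X, y):
    """Fit y ~ b0 + b1*x1 + ... + bn*xn by least squares.

    Only n + 1 coefficients are estimated, which is why the sampling
    cost of a linear basis scales linearly with the number of variables n.
    """
    A = np.hstack([np.ones((len(X), 1)), X])   # prepend intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta  # [b0, b1, ..., bn]
```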