TY - JOUR
T1 - Third-order generalization: A new approach to categorizing higher-order generalization
AU - Neville, Richard
N1 - 100-word commentary: The contribution is a systematic framework for higher-order generalization. The paper presents the theoretical methodology required to perform 0-order, 1st-order, 2nd-order, and 3rd-order generalization. A set of symmetry transformations that enables higher-order generalization is presented. By utilizing weight transformations, higher-order generalization does indeed ‘cover’ [some] of the space that the initial training data (points) did not cover. The empirical results support the claim that all the orders of generalization are attainable. Role: Single-author paper - the author undertook all research, mathematical function development [modeling], algorithm development, and experimentation and analysis - contribution estimate = 100%. Most likely REF score 3*.
PY - 2008/3
Y1 - 2008/3
AB - Generalization, in its most basic form, is an artificial neural network's (ANN's) ability to automatically classify data that were not seen during training. This paper presents a framework in which generalization in ANNs is quantified and different types of generalization are viewed as orders. The ordering of generalization is a means of categorizing different behaviours. These orders enable generalization to be evaluated in a detailed and systematic way. The approach used is based on existing definitions which are augmented in this paper. The generalization framework is a hierarchy of categories which directly aligns an ANN's ability to perform table look-up, interpolation, extrapolation, and hyper-extrapolation tasks. The framework is empirically validated. Validation is undertaken with three different types of regression task: (1) a one-to-one (o-o) task, f(x): x_i → y_j; (2) the second, in its f(x): {x_i, x_{i+1}, ...} → y_j formulation, maps a many-to-one (m-o) task; and (3) the third, f(x): x_i → {y_j, y_{j+1}, ...}, a one-to-many (o-m) task. The first and second are assigned to feedforward nets, while the third, due to its complexity, is assigned to a recurrent neural net. Throughout the empirical work, higher-order generalization is validated with reference to the ability of a net to perform symmetrically related or isomorphic functions generated using symmetric transformations (STs) of a net's weights. The transformed weights of a base net (BN) are inherited by a derived net (DN). The inheritance is viewed as the reuse of information. The overall framework is also considered in the light of alignment to neural models; for example, which order (or level) of generalization can be performed by which specific type of neuron model. The complete framework may not be applicable to all neural models; in fact, some orders may be special cases which apply only to specific neuron models. This is, indeed, shown to be the case. Lower-order generalization is viewed as a general case and is applicable to all neuron models, whereas higher-order generalization is a particular or special case. This paper focuses on initial results; some of the aims have been demonstrated and amplified through the experimental work. © 2007 Elsevier B.V. All rights reserved.
KW - Classifier
KW - Extrapolation
KW - Generalization
KW - Higher-order
KW - Hyper-extrapolation
KW - Information inheritance
KW - Interpolation
KW - Recognition
KW - Reuse of information
KW - Sigma-pi
KW - Symmetric transformations
KW - Weight generation
U2 - 10.1016/j.neucom.2007.05.003
DO - 10.1016/j.neucom.2007.05.003
M3 - Article
SN - 0925-2312
VL - 71
SP - 1477
EP - 1499
JO - Neurocomputing
JF - Neurocomputing
IS - 7-9
ER -