Data Augmentation by Wavelet Neural Networks

Eduardo Salazar
National Institute of Economic and Social Research (NIESR)
ESalazar@niers.ac.uk

Abstract

Quite often in empirical research, the temporal aggregation of reported economic series imposes constraints on both model specification and parameter estimation. Statistical agencies have improved their collection and analysis of disaggregate data in recent years, but problems of sectoral coverage nevertheless remain. Frequently, applied work is conducted, and theories are claimed to be tested, using data at the next possible level of temporal aggregation. Unless time aggregation is explicitly accounted for [see, for example, Oguchi and Fukuchi, 1990], this leads to both identification problems and biased estimation.

A related problem is the ad hoc use of disaggregate information. For example, to produce short-term forecasts of economic activity, City commentators and academic researchers keep an eye on monthly movements in retail sales, trade, or manufacturing output, believed to be the best available proxies for broader measures of demand and output. Salazar et al. [1994] argue that those procedures may, at best, use existing information inefficiently and, at worst, be misleading.

Both economists and statisticians took an interest in the subject during the 1960s, but examples of work can be traced back to the 1930s. The basic idea can be formalized as follows. Suppose a series $y_t$ is observed at $N$ regularly spaced periods, but higher frequency measurements are needed. Information on one or several series related to $y_t$ is available, with $K$ observations per period. Denote by $x_{t,u}$ a set of $p$ variables related to $y_t$, observed in sub-period $u = 1, \dots, K$ of period $t = 1, \dots, N$. Therefore, $x_{t,u}$ defines an $NK \times p$ matrix $X$ with typical column $x_i$, $i = 1, \dots, p$. Assume the hypothetical $NK \times 1$ vector $y^{*}$, with elements $y^{*}_{t,u}$, and the related variables $X$ follow

$y^{*} = X\beta + \varepsilon \qquad (1)$

where $\beta$ is the $p \times 1$ vector of parameters and $\varepsilon$ is an $NK \times 1$ stationary vector of error terms, such that $E(\varepsilon) = 0$ and covariance $E(\varepsilon\varepsilon') = V$. Assume, in addition, that a time-invariant constraint links $y^{*}$ and $y_t$, so

$y_t = \sum_{u=1}^{K} c_u \, y^{*}_{t,u}, \qquad t = 1, \dots, N \qquad (2)$

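where the $c_u$ weights are known a priori. Therefore, stacking (2) over $t$ as $y = C y^{*}$, with $y = (y_1, \dots, y_N)'$, $C = I_N \otimes c'$ and $c = (c_1, \dots, c_K)'$, write

$y = \bar{X}\beta + \bar{\varepsilon}, \qquad \bar{X} = CX, \quad \bar{\varepsilon} = C\varepsilon \qquad (3)$

where $E(\bar{\varepsilon}) = 0$ and covariance matrix $E(\bar{\varepsilon}\bar{\varepsilon}') = CVC'$. Note in equation (3) that $y$ is observed, and $\bar{X}$ contains the observed values of the $p$ related series at the same frequency as $y$. Taking into account the constraints (2), we can estimate $\beta$ and $y^{*}$ by solving

$\min_{\beta,\, y^{*}} \; (y^{*} - X\beta)' V^{-1} (y^{*} - X\beta) \quad \text{subject to} \quad C y^{*} = y \qquad (4)$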
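As a rough illustration of equations (1) to (4), the sketch below solves the constrained problem in closed form with NumPy. It assumes the aggregation constraint (2) is written as a matrix C built from the known weights, that the high-frequency rows are ordered period by period, and that the error covariance V is known (the identity by default). The function name `disaggregate_gls` and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np

def disaggregate_gls(y, X, c, V=None):
    """Constrained GLS sketch for equations (1)-(4).

    y : (N,) observed low-frequency series
    X : (N*K, p) related series, rows ordered period by period
    c : (K,) aggregation weights, assumed known
    V : (N*K, N*K) covariance of the high-frequency errors (identity if None)
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    c = np.asarray(c, dtype=float)
    N, K = y.shape[0], c.shape[0]
    if X.shape[0] != N * K:
        raise ValueError("X must have N*K rows")
    if V is None:
        V = np.eye(N * K)
    C = np.kron(np.eye(N), c.reshape(1, K))   # (N, N*K) aggregation matrix
    X_bar = C @ X                              # related series at low frequency
    V_bar = C @ V @ C.T                        # covariance of aggregated errors
    # GLS estimate of beta from the aggregated model (3)
    A = np.linalg.solve(V_bar, X_bar)          # V_bar^{-1} X_bar
    beta = np.linalg.solve(X_bar.T @ A, A.T @ y)
    # Distribute the low-frequency residuals so that C @ y_star equals y
    resid = y - X_bar @ beta
    y_star = X @ beta + V @ C.T @ np.linalg.solve(V_bar, resid)
    return beta, y_star, C

# Toy usage: N = 8 quarters, K = 3 months per quarter, p = 2 related series.
rng = np.random.default_rng(0)
N, K, p = 8, 3, 2
X = rng.normal(size=(N * K, p))
c = np.ones(K)                                 # quarterly value = sum of months
y_true = X @ np.array([1.5, -0.5]) + 0.1 * rng.normal(size=N * K)
y = y_true.reshape(N, K) @ c                   # observed low-frequency series
beta_hat, y_star_hat, C = disaggregate_gls(y, X, c)
```

For a fixed value of the parameter vector, the second step is the minimizer of the quadratic form in (4) under the constraint, so the returned series aggregates exactly to the observed one.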

The whole procedure, however, depends on the specification of the model for $y^{*}$ in equation (1). A complete generalization of Chow and Lin's model, and other extensions, can be found in Salazar et al. [ibid.]. Alternatively, equation (1) can be recast in state-space form, using the Kalman filter and the fixed-point smoother to estimate $\beta$ and $y^{*}$, respectively. Jones [1980] applied this methodology to ARMA models with missing observations; Harvey and Pierse [1983], Harvey [1989] and Gomez and Maravall [1994] provide some useful extensions.

This paper proposes to redefine equation (1) by

$y^{*} = f(X) + \varepsilon \qquad (5)$

where the form of $f(\cdot)$ is unknown, but can be recovered from the data. Estimation of (5) is made by a 2-layer Neural Network model, performing a wavelet decomposition over $X$ and approximating the aggregated fitted values $\hat{y}_t = \sum_{u=1}^{K} c_u \hat{f}(x_{t,u})$ to the known values of $y_t$ under a suitable criterion. The learning algorithm ensures in both models that, at each stage, the linear adding-up constraint given by (2) is met.
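The abstract does not say how the learning algorithm imposes the adding-up constraint (2). One simple possibility, shown here purely as an assumption, is to correct any unconstrained high-frequency fit after each stage with the smallest (least-squares) adjustment that restores the constraint. The function name `enforce_adding_up` is hypothetical.

```python
import numpy as np

def enforce_adding_up(f_hat, y, c):
    """Minimum-norm correction of a high-frequency fit so that (2) holds.

    f_hat : (N*K,) unconstrained fitted values, ordered period by period
    y     : (N,) observed low-frequency series
    c     : (K,) aggregation weights, assumed known
    """
    y = np.asarray(y, dtype=float)
    c = np.asarray(c, dtype=float)
    f = np.asarray(f_hat, dtype=float).reshape(len(y), len(c))  # rows = periods
    gap = y - f @ c                        # aggregation error in each period
    # Spread each period's gap over its sub-periods in proportion to c
    return (f + np.outer(gap, c) / (c @ c)).ravel()

# Toy usage: N = 2 periods, K = 3 sub-periods, simple summation weights.
f_hat = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([9.0, 12.0])
c = np.ones(3)
adjusted = enforce_adding_up(f_hat, y, c)
```

With equal weights this reduces to spreading each period's discrepancy evenly over its sub-periods; other distribution rules (for example, weighting by an error covariance) would be equally compatible with the text.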

The architecture is a direct adaptation of the Radial Basis Function [RBF] Network, where the RBF activation is substituted by a [discrete] wavelet kernel; seminal papers in this direction are Zhang and Benveniste [1992] and Pati and Krishnaprasad [1993].
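To make the architecture concrete, here is a minimal sketch of an RBF-style network whose hidden units use a wavelet in place of the usual Gaussian activation. The radial Mexican-hat wavelet, the parameter names, and the function names are illustrative assumptions; the abstract does not state which wavelet kernel is used.

```python
import numpy as np

def mexican_hat(r2, dim):
    """Radial Mexican-hat wavelet evaluated at the squared radius r2."""
    return (dim - r2) * np.exp(-0.5 * r2)

def wavelet_layer(X, translations, dilations):
    """Hidden-layer activations, one column per wavelet unit.

    X            : (n, p) inputs
    translations : (R, p) centre of each unit
    dilations    : (R,) positive scale of each unit
    """
    X = np.asarray(X, dtype=float)
    translations = np.asarray(translations, dtype=float)
    dilations = np.asarray(dilations, dtype=float)
    diff = X[:, None, :] - translations[None, :, :]            # (n, R, p)
    r2 = np.sum(diff ** 2, axis=2) / dilations[None, :] ** 2    # (n, R)
    return mexican_hat(r2, X.shape[1])

def wavelet_network(X, translations, dilations, weights, bias=0.0):
    """Output layer: linear combination of the wavelet activations."""
    return wavelet_layer(X, translations, dilations) @ np.asarray(weights) + bias

# Toy usage: two 2-dimensional inputs and a single hidden unit at the origin.
X = np.array([[0.0, 0.0], [1.0, 0.0]])
translations = np.array([[0.0, 0.0]])
dilations = np.array([1.0])
H = wavelet_layer(X, translations, dilations)
out = wavelet_network(X, translations, dilations, weights=[0.5])
```

Because the output is linear in the weights once translations and dilations are fixed, the output layer can be fitted by ordinary least squares, as in a standard RBF network.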

Two modeling strategies are considered. In the first, every neuron in the hidden layer implements a wavelet decomposition using a wavelet basis constructed from the data. This type of architecture is similar to that proposed by Zhang and Benveniste [ibid.]. Instead of using a fixed lattice of translation and dilation parameters, these are adaptively determined to form a 'library' of wavelets. The problem then amounts to selecting the 'best' wavelets from the resulting library and estimating the wavelet coefficients.
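The selection step can be pictured with a greedy forward search, sketched below. This is a generic matching-pursuit style procedure used here as a stand-in: the abstract does not specify the selection rule, and the function name `select_wavelets` is hypothetical. The toy library has orthonormal columns standing in for wavelets evaluated on the data.

```python
import numpy as np

def select_wavelets(Phi, y, n_select):
    """Greedy selection of columns from a wavelet library.

    Phi      : (n, M) library, one candidate wavelet per column
    y        : (n,) target values
    n_select : number of wavelets to keep
    Returns the chosen column indices and their least-squares coefficients.
    """
    Phi = np.asarray(Phi, dtype=float)
    y = np.asarray(y, dtype=float)
    norms = np.linalg.norm(Phi, axis=0)
    norms[norms == 0] = 1.0                  # guard against empty columns
    chosen, coef, resid = [], np.zeros(0), y.copy()
    for _ in range(n_select):
        # Score each candidate by its normalised correlation with the residual
        score = np.abs(Phi.T @ resid) / norms
        if chosen:
            score[chosen] = -np.inf          # never pick a column twice
        chosen.append(int(np.argmax(score)))
        # Re-estimate all coefficients on the selected columns
        coef, *_ = np.linalg.lstsq(Phi[:, chosen], y, rcond=None)
        resid = y - Phi[:, chosen] @ coef
    return chosen, coef

# Toy usage: an orthonormal library in which only columns 2 and 5 matter.
rng = np.random.default_rng(0)
Phi, _ = np.linalg.qr(rng.normal(size=(20, 10)))
y = 3.0 * Phi[:, 2] - 2.0 * Phi[:, 5]
chosen, coef = select_wavelets(Phi, y, n_select=2)
```

With correlated library columns a greedy search is not guaranteed to find the best subset, which is one reason the choice of selection criterion matters in practice.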

The alternative model performs a wavelet decomposition for each of the $p$ vectors in $X$, selecting for every vector an appropriate basis. By separating the input space,

$f(X) = \sum_{i=1}^{p} f_i(x_i), \qquad f_i(x_i) = \sum_{r=1}^{R} w_{i,r}\, \psi_{i,r}(x_i)$

and therefore

$y^{*} = \Psi w + \varepsilon$

where $\psi_{i,r}$ denotes a 'family' of wavelets obtained from the mother function $\psi$ for each vector $x_i$, $i = 1, \dots, p$, in $X$, $w$ collects the wavelet coefficients $w_{i,r}$, and $\Psi$ stacks each of the $r = 1, \dots, R$ decompositions, the contribution by each series may be better localized.
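The separated model can be sketched as a design matrix with one block of wavelets per input series, fitted against the low-frequency aggregates. The one-dimensional Mexican-hat wavelet, the evenly spaced translations, the common dilation per series, and all function names are assumptions made for illustration only.

```python
import numpy as np

def family_1d(x, translations, dilation):
    """Mexican-hat wavelets of one input series, one column per translation."""
    z = (x[:, None] - translations[None, :]) / dilation
    return (1.0 - z ** 2) * np.exp(-0.5 * z ** 2)

def additive_design(X, n_units=4):
    """Stack a separate wavelet family for each column of X side by side."""
    X = np.asarray(X, dtype=float)
    blocks = []
    for i in range(X.shape[1]):
        xi = X[:, i]
        spread = xi.max() - xi.min()
        dilation = spread / n_units if spread > 0 else 1.0
        translations = np.linspace(xi.min(), xi.max(), n_units)
        blocks.append(family_1d(xi, translations, dilation))
    return np.hstack(blocks)                      # shape (n, p * n_units)

# Toy usage: N = 40 periods, K = 3 sub-periods, p = 2 series, R = 4 wavelets each.
rng = np.random.default_rng(1)
N, K, p, R = 40, 3, 2, 4
X = rng.uniform(-1.0, 1.0, size=(N * K, p))
Psi = additive_design(X, n_units=R)
C = np.kron(np.eye(N), np.ones((1, K)))           # sums the K sub-periods
y = C @ (np.sin(2 * X[:, 0]) + X[:, 1] ** 2)      # toy low-frequency data
w, *_ = np.linalg.lstsq(C @ Psi, y, rcond=None)   # fit against the aggregates
# Contribution of each input series to the high-frequency estimate
contrib = [Psi[:, i * R:(i + 1) * R] @ w[i * R:(i + 1) * R] for i in range(p)]
y_star_hat = Psi @ w
```

Keeping the blocks separate is what allows the fitted series to be split into one contribution per related variable.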

An application is considered: the estimation of monthly components of Gross Domestic Product for the United Kingdom from quarterly aggregates. The estimates are directly comparable with those in Salazar et al. [op. cit.], and performance measures for each model are analyzed. Conclusions follow.


Society of Computational Economics
Second International Conference on Computing in Economics and Finance
Geneva, Switzerland, 26-28 June 1996