Flexible Least Squares:
A Diagnostic Method for Model Specification

Last Updated: 12 March 2019

Site Maintained By:
Leigh Tesfatsion
Research Professor, and
Professor Emerita of Economics, Mathematics,
   and Electrical & Computer Engineering
Heady Hall 260
Iowa State University
Ames, Iowa 50011-1054
tesfatsi AT iastate.edu

FLS In Depth
GFLS In Depth
GFLS Cost Efficient Frontier Figure


The Basic FLS Approach

Flexible Least Squares (FLS) is a diagnostic model specification method that eliminates the need to impose problematic distribution assumptions on model specification errors.

Any real-world system that a researcher attempts to model will inevitably behave in a manner that is incompatible to some degree with the theoretical assumptions the researcher has incorporated in the model. "All models are wrong, but some are useful." (George E.P. Box, 1987)

These theoretical assumptions typically fall into four conceptually-distinct categories: (1) dynamic assumptions specifying how state variables change over time; (2) measurement assumptions postulating relationships between observed values and model-predicted values; (3) cross-sectional assumptions postulating relationships among simultaneously determined endogenous variables; and (4) stochastic assumptions constraining the realizations for variables assumed to be randomly generated.

Discrepancies between the theoretical assumptions (1)-(4) and the actual real-world system of interest are called (model) specification errors.

The specification errors arising for any given modeling of a real-world system are not necessarily commensurable in terms of a single empirically-meaningful scalar metric. For example, there is no particular reason to think that a measurement specification error arising from the use of an imperfect measurement device can be directly compared with a dynamic specification error that arises because the dynamic relationship between successive system states has incorrectly been assumed to take a linear rather than a nonlinear form.

Consequently, given a model incorporating conceptually distinct types of theoretical assumptions, the fitting of this model to a given data set is intrinsically a multicriteria optimization problem. Any such fitting will result in conceptually distinct types of specification errors measuring the extent to which conceptually distinct types of theoretical relationships are incompatible with the data set. An econometrician undertaking the fitting would presumably prefer each type of specification error to be small. However, beyond a particular point a further decrease in one type of specification error will typically come at the cost of an increase in another.

For example, suppose a researcher interested in modeling a real-world system S has specified a family F of possible parameterized models M(b) for S, where the parameter vector b ranges over a given parameter set B. Suppose each model M(b) includes dynamic, measurement, cross-sectional, and stochastic assumptions. The fitting of M(b) to a set Y of time series data recorded for S is then intrinsically a multicriteria optimization problem. Any estimate b* for b will result in conceptually distinct types of specification errors measuring the extent to which conceptually distinct types of theoretical assumptions incorporated in M(b*) are incompatible with the data set Y. Apart from the special (and highly unlikely) case in which the true data generating mechanism for S is an element of F, a succession of b vectors selected from B to achieve a continual decrease in any one type of specification error will ultimately lead to an increase in at least one other type of specification error.

One way to proceed here would be to induce commensurability among specification error terms by taking a Bayesian approach. A modeler could first associate a prior probability distribution Prob(z) with a vector z =(b,d) consisting of the parameter vector b augmented with a vector d of specification error terms. The modeler could then combine this prior probability distribution with a likelihood function Prob(Y|z) to obtain a posterior probability distribution Prob(z|Y) = Prob(Y|z)Prob(z)/Prob(Y) from which to derive some form of Bayes estimate for z, e.g., a maximum a posteriori (MAP) estimate. An issue here, of course, is that different modelers will typically have different priors that cannot be brought to conformity on the basis of available empirical evidence.

In a series of studies listed in the following publications section, and summarized in Kalaba and Tesfatsion (CSDA, 1996), Bob Kalaba and I develop an alternative approach to this multicriteria optimization problem, referred to as Flexible Least Squares (FLS). As clarified below, the FLS approach accommodates a wide range of views regarding the appropriate interpretation, measurement, and estimation of model specification errors.

FLS for Time-Varying Linear Regression

At one end of the range of FLS applications, stressing minimal reliance on commensurability priors for specification errors, we develop FLS for time-varying linear regression. For any given time series data set, and any given linear regression model proposed to explain the data set, each possible estimated model generates two conceptually distinct types of specification errors, dynamic and measurement. The dynamic specification errors reflect time variation in successive coefficient vectors (relative to a null of constancy), and the measurement specification errors reflect differences between actual observed outcomes and theoretically predicted outcomes based on the null of a linear regression model. The dynamic and measurement specification errors, in squared form, are separately aggregated into squared-error sums RD and RM.

As explained with care in Kalaba and Tesfatsion (1989a), the basic FLS objective is to determine the Residual Efficiency Frontier (REF), i.e., the frontier of all estimated models that are efficient with respect to achieving vector-minimal squared-error sums (RD,RM) for the dynamic and measurement specification errors. By construction, given any estimated model M along the REF with corresponding squared-error sums (RD,RM), there does not exist any other estimated model M' whose corresponding squared-error sums (RD',RM') are strictly smaller than (RD,RM) in the following vector sense: RD' does not exceed RD, RM' does not exceed RM, and either RD' is strictly smaller than RD or RM' is strictly smaller than RM.

The estimated model corresponding to ordinary least squares (OLS) linear regression with time-invariant coefficient estimates is obtained at the limit point of the REF where RD=0 and RM attains its largest possible REF value.
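The FLS trade-off is easy to illustrate computationally. For a fixed weight mu > 0, minimizing the scalarized objective RM + mu*RD is an ordinary penalized least-squares problem, and sweeping mu from near zero toward infinity traces out points along the REF, with the OLS solution emerging in the large-mu limit. The following sketch (names and setup are illustrative, not taken from the FLS papers) does this for a one-regressor model:

```python
import numpy as np

def fls_path(y, x, mu):
    """Estimate a time-varying coefficient b_t in y_t = x_t * b_t + error
    by minimizing RM + mu * RD, where
      RM = sum_t (y_t - x_t * b_t)^2     (measurement specification errors)
      RD = sum_t (b_{t+1} - b_t)^2       (dynamic specification errors).
    Each choice of mu > 0 picks out one efficient point (RD, RM) on the REF."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    T = len(y)
    X = np.diag(x)                        # one coefficient per period
    D = np.diff(np.eye(T), axis=0)        # (T-1) x T first-difference matrix
    b = np.linalg.solve(X.T @ X + mu * D.T @ D, X.T @ y)
    RD = float(np.sum(np.diff(b) ** 2))
    RM = float(np.sum((y - x * b) ** 2))
    return b, RD, RM
```

As mu grows, the dynamic penalty forces the coefficient path toward constancy, so (RD, RM) moves along the frontier toward the OLS endpoint where RD = 0.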

As seen in the software section, below, FLS has been incorporated into the statistical packages SHAZAM and GAUSS.

GFLS: FLS for Approximately Linear Models

As detailed in Kalaba and Tesfatsion (1990a), we also developed Generalized Flexible Least Squares (GFLS), an extension of FLS to models for which the dynamic and measurement relationships are postulated to be general linear-affine systems of equations. The concept of the REF is correspondingly generalized to a planar Cost-Efficient Frontier (CEF) for which costs are separately assessed for dynamic and measurement specification errors.

As seen in the software section, below, GFLS has been incorporated into the statistical package GAUSS.

KFLS: FLS for K-Dimensional Goodness-of-Fit Criterion Vectors

More generally, suppose a theoretical model has been conjectured as a possible explanation for a given data set. Suppose, also, that a modeler has specified a K-dimensional vector of incompatibility cost functions for measuring the degree of incompatibility between theory and data in accordance with K different goodness-of-fit criteria, where K is greater than or equal to 1.

In a succession of studies that are listed below in the publications section, and summarized in Kalaba and Tesfatsion (1996), Bob Kalaba and I developed what is here referred to as KFLS, an extended FLS method that handles this K-dimensional goodness-of-fit problem.

More precisely, KFLS is a constructive procedure for the determination of a Cost-Efficient Frontier (CEF) for this problem. The points along the CEF correspond to the family of all estimated models that are efficient with respect to the vector-minimization of these K incompatibility cost functions. We also derive a recurrence relation for updating the CEF at time t+1 as a function of the CEF at time t together with a K-dimensional vector of incremental incompatibility costs associated with new data obtained between time t and time t+1.
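The defining property of the CEF, vector-minimality with respect to K incompatibility costs, can be sketched directly. The routine below is an illustrative helper, not the KFLS recurrence itself: it filters a collection of candidate estimated models, each summarized by its K-dimensional cost vector, down to the efficient ones.

```python
import numpy as np

def vector_minimal(costs):
    """Given an (N, K) array of incompatibility cost vectors, return a boolean
    mask marking the vector-minimal (efficient) rows. Row i is inefficient if
    some other row is <= row i in every criterion and < in at least one."""
    costs = np.asarray(costs, dtype=float)
    n = costs.shape[0]
    efficient = np.ones(n, dtype=bool)
    for i in range(n):
        others = np.delete(costs, i, axis=0)
        dominated = np.any(np.all(others <= costs[i], axis=1) &
                           np.any(others < costs[i], axis=1))
        efficient[i] = not dominated
    return efficient
```

For K = 2 this recovers the REF picture: the cost pairs (1, 4), (2, 2), and (4, 1) are all efficient, while (3, 3) is dominated by (2, 2).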

We show that KFLS encompasses a number of other estimation methods. For example, FLS and GFLS are special cases of KFLS with K=2. Moreover, KFLS reduces to the standard Kalman Filter for the special case in which all specification error terms are assumed to be stochastic disturbances to otherwise correctly specified theoretical relationships, thus permitting the derivation of a single (K=1) real-valued incompatibility cost function in the form of a posterior probability distribution.
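For concreteness, the stochastic special case can be sketched as a scalar Kalman Filter for a random-walk coefficient model b_t = b_{t-1} + w_t, y_t = x_t*b_t + v_t, with w_t and v_t independent zero-mean Gaussian disturbances of known variances Q and R. This is a textbook recursion, given here only as an illustration of the K = 1 point-estimation case described above:

```python
def kalman_filter_rw(y, x, Q, R, b0=0.0, P0=1.0):
    """Scalar Kalman Filter for the random-walk coefficient model
        b_t = b_{t-1} + w_t,    w_t ~ N(0, Q)   (dynamic equation)
        y_t = x_t * b_t + v_t,  v_t ~ N(0, R)   (measurement equation).
    Returns the filtered coefficient estimates, one per observation."""
    b, P, out = b0, P0, []
    for yt, xt in zip(y, x):
        P = P + Q                          # predict: variance grows by Q
        K = P * xt / (xt * xt * P + R)     # Kalman gain
        b = b + K * (yt - xt * b)          # update with the innovation
        P = (1.0 - K * xt) * P             # posterior variance
        out.append(b)
    return out
```

Under these assumptions the filter delivers a single most-probable coefficient path, in contrast to the full frontier of efficient paths that FLS reports when the stochastic restrictions are dropped.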

Finally, we also clarify the interesting relationships between KFLS and the general-to-specific econometric methodology advocated by David Hendry, among others, and between the KFLS CEF construction and the information contract curve (global sensitivity analysis) approach developed by Ed Leamer.

FLS and Kalman Filtering

As discussed more fully in (Kalaba and Tesfatsion, 1990b), it is logically incorrect to equate FLS with Kalman Filtering (KF).

FLS addresses a multicriteria time-varying linear regression problem that does not require probability assumptions either for its motivation or for its solution: namely, the characterization of the set of all sequences of estimated regression coefficients that achieve vector-minimal incompatibility between imperfectly specified linear theoretical relations and process observations. In contrast, KF is a point estimation technique for determining the most probable coefficient sequence estimate for a stochastic linear model assumed to be correctly and completely specified.

Nevertheless, FLS was originally motivated by a puzzling aspect of KF: Why has KF proved to be useful in empirical practice, even when its assumptions are not even remotely satisfied? The original goal of FLS was to get to the heart of the matter by stripping away non-essential aspects of KF that complicate its application without adding to scientific understanding.

The key non-essential aspect of KF was soon revealed to be overly strong distributional assumptions for model specification errors.

Users of KF typically impose strong stochastic restrictions on dynamic and measurement specification errors, such as independence and joint normality. This permits a scalarization of the goodness-of-fit objective function in the form of a posterior probability distribution, which hides the multicriteria aspects of the problem being addressed.

In contrast, FLS sets out the underlying multicriteria optimization problem in explicit bare-bones terms. FLS solves for the entire Residual Efficiency Frontier (REF) of estimated linear models that are efficient in the sense that measurement specification errors are minimal for each given level of dynamic volatility in the estimated regression coefficients.

More precisely, each point (RD,RM) along the REF corresponds to an estimated time-varying linear regression model that minimizes RM for the given RD. Here RD denotes the sum of squared dynamic specification errors reflecting the movement (volatility) exhibited by the estimated regression coefficients over time, and RM denotes the sum of squared measurement specification errors. The coefficient sequence estimate corresponding to any given point on the REF can then be derived sequentially by means of recurrence relations which have the familiar Riccati equation form. This is hardly surprising, since it has been known for decades that linear-quadratic minimization leads to recurrence relations of this type.

The REF thus lays bare the econometrician's full "cone of uncertainty." That is, the REF displays the full range of estimated time-varying linear regression models that achieve an efficient representation of the given time series data set in the sense that their associated vectors (RD,RM) are vector-minimal. If additional knowledge becomes available constraining the possible sizes of the dynamic and/or measurement specification error terms RD and RM, then the REF can be correspondingly narrowed. Otherwise, restricting attention to a proper subset of the REF is an arbitrary decision.

This understanding leads to the recognition that KF and FLS represent two endpoints of a continuum of approaches to time-varying linear regression ranging from relatively strong distributional priors (KF) to simple smoothness priors (FLS) for the dynamic variation of the regression coefficients. All of these approaches, from KF to FLS and everything between, should be routinely available in the toolkits of statisticians and econometricians as they attempt to gain a better understanding of real-world systems. The determination of a "best" approach should then depend on the extent to which different forms of prior restrictions on model specification errors can be scientifically justified.

FLS as a Diagnostic Method for Model Specification

In time-varying linear regression studies making use of FLS, the full display of the Residual Efficiency Frontier (REF) can facilitate the discovery of true model structure. The qualitative persistence all along the REF of particular forms of time variation in the FLS-estimated regression coefficients gives confidence that these time variations reflect true attributes of the underlying data generating mechanism.

For example, as reported in Kalaba and Tesfatsion (1989a), FLS studies were conducted for time-varying linear regression systems characterized by step-function, elliptical, sinusoidal, and other types of time-variation in the regression coefficients. It was found that these true underlying time variations were correctly displayed by the FLS-estimated regression coefficients all along the REF, albeit in increasingly flattened form as the limit point was approached where the dynamic cost RD is zero (i.e., where the OLS solution is obtained).
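This diagnostic use is easy to reproduce in miniature. The sketch below (an illustrative setup, not the one in the cited study) simulates a step-function coefficient, fits the one-regressor model by penalized least squares with dynamic-penalty weight mu, and checks that the step's qualitative shape, a lower first-half level and a higher second-half level, persists at both a small and a large weight, appearing in flattened form at the larger one:

```python
import numpy as np

def fls_step_demo(mu, T=80, seed=1):
    """Fit a time-varying one-regressor model by minimizing RM + mu * RD and
    return (first-half mean, second-half mean) of the estimated coefficients.
    The true coefficient steps from 1.0 to 3.0 at midsample."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=T)
    b_true = np.where(np.arange(T) < T // 2, 1.0, 3.0)
    y = b_true * x + 0.1 * rng.normal(size=T)
    X = np.diag(x)                         # one coefficient per period
    D = np.diff(np.eye(T), axis=0)         # first-difference matrix
    b = np.linalg.solve(X.T @ X + mu * D.T @ D, X.T @ y)
    return float(b[: T // 2].mean()), float(b[T // 2 :].mean())
```

Running the demo at a small and a large mu shows the same low-to-high ordering of the half-sample coefficient means, with a smaller gap at the larger mu, which is the "qualitative persistence in flattened form" described above.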

Restriction to a subset of FLS-estimated models along the REF is arbitrary unless additional prior information is available to supplement the given time-varying linear regression model and time series data set. In some cases a researcher might indeed have prior information about the system under study that permits him/her to assess ex post which time variations displayed by the FLS-estimated regression coefficients along the REF have a physically sensible interpretation and which do not. The researcher might then be able to use these assessments to limit his/her attention to a proper subset of the REF.

NOTE: Robert E. Kalaba, great scholar, mentor, colleague, and friend, died on 9/29/2004.

Publications Developing the Basic FLS Methodology (Chronological Order)

FLS Comparative Testing

FLS Software & Manual Availability

Illustrative FLS Applications