Flexible Least Squares:
A Multicriteria Optimization Method
for Model Specification

Last Updated: 15 January 2024

Site Maintained By:
  Leigh Tesfatsion
  Prof. Emerita of Economics
  Courtesy Research Professor
   of Electrical & Comp. Eng.
  Heady Hall 260
  Iowa State University
  Ames, Iowa 50011-1054
  tesfatsi AT iastate.edu

FLS Intro Article
FLS-TVLR Article
GFLS Intro Article
FLS Residual Efficiency Frontier Figure

Overview

What is Flexible Least Squares (FLS)?

Flexible Least Squares (FLS) is a multicriteria optimization method for model specification.

FLS permits users to identify the "Pareto frontier" of all efficiently estimated models conditional on: (i) a given theory; (ii) a given data set; and (iii) a designated collection of one or more goodness-of-fit metrics.

FLS does not require the imposition of problematic stochastic assumptions on "residual error terms" that in fact arise from deterministic model misspecification.

The Basic FLS Approach

Any real-world system that a researcher attempts to model will inevitably behave in a manner that is incompatible to some degree with the theoretical assumptions the researcher has incorporated in the model. "All models are wrong, but some are useful." (George E.P. Box, 1987)

These theoretical assumptions typically fall into four conceptually-distinct categories: (1) dynamic assumptions postulating constraints on the changes in state variables over time; (2) measurement assumptions postulating relationships between observed values and model-predicted values; (3) cross-sectional assumptions postulating relationships among simultaneously determined endogenous variables; and (4) stochastic assumptions postulating constraints on the realizations for variables assumed to be randomly generated.

Discrepancies between the theoretical assumptions (1)-(4) and the actual real-world system of interest are called model specification errors.

The model specification errors arising for any given modeling of a real-world system are not necessarily commensurable in terms of a single empirically-meaningful scalar metric. For example, there is no particular reason to think that a measurement specification error arising from the use of an imperfect measurement device can be directly compared with a dynamic specification error that arises because the dynamic relationship between successive system states has incorrectly been assumed to take a linear rather than a nonlinear form.

Consequently, given a model incorporating conceptually distinct types of theoretical assumptions, the fitting of this model to a given data set is intrinsically a multicriteria optimization problem. Any such fitting will result in conceptually distinct types of model specification errors indicating the extent to which conceptually distinct types of theoretical relationships are incompatible with the data set. An econometrician undertaking the fitting would presumably prefer each type of model specification error to be small. However, beyond a particular point a further decrease in one type of model specification error will typically come at the cost of an increase in another.

One way to proceed here would be to induce commensurability among model specification errors by assuming they are governed by a joint probability distribution. An important drawback to this approach is that different modelers will typically have different prior conceptions regarding what constitutes a meaningful assignment of probability assessments, conceptions that cannot be brought into conformity on the basis of available empirical evidence.

In a series of studies summarized in Kalaba and Tesfatsion (CSDA, 1996), Bob Kalaba and I developed an alternative approach to this multicriteria optimization problem, referred to as Flexible Least Squares (FLS). As clarified below, the FLS approach accommodates a wide range of views regarding the appropriate interpretation, measurement, and estimation of model specification errors.

FLS for Time-Varying Linear Regression (FLS-TVLR)

At one end of the range of FLS applications, relying solely on simple "smallness" priors for model specification errors, we developed FLS for time-varying linear regression.

For illustration, consider a simple linear regression relation y = bX with a scalar dependent variable y, a K-dimensional coefficient vector b, and a K-dimensional vector X of explanatory variables. Suppose this linear regression relation is postulated to explain a time-series data set T consisting of scalar dependent-variable observations y1,...,yN and K-dimensional vectors X1,...,XN of explanatory variables recorded for time periods n = 1,...,N. The goal is to study the extent to which the null theoretical hypothesis of a linear regression relation y = bX with a constant coefficient vector b is incompatible with the time-series data set T.

Each estimated model M(T) for the postulated linear regression relation y = bX, conditional on the time-series data set T, is characterized by an estimate (b*1(T),...,b*N(T)) for the sequence (b1,...,bN) of coefficient vectors bn for time periods n = 1,...,N. Two conceptually distinct types of specification errors, dynamic and measurement, can be associated with M(T).

Let these dynamic and measurement model-specification errors, in squared form, be separately aggregated into squared-error sums RD(T) and RM(T). As explained with care in Kalaba and Tesfatsion (1989a), the basic FLS objective for time-varying linear regression is to determine the Residual Efficiency Frontier REF(T), a construction analogous to the concept of a Pareto Efficiency Frontier in standard economic analysis.
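For concreteness, the two squared-error sums can be written out as follows. This is a sketch under a stated assumption: the dynamic errors are measured with the plain (unweighted) Euclidean norm, and the sums are evaluated at the estimates b*1(T),...,b*N(T). The cited article should be consulted for the exact form used there.

```latex
\[
R_M(T) = \sum_{n=1}^{N} \bigl( y_n - b_n^{\top} X_n \bigr)^2 ,
\qquad
R_D(T) = \sum_{n=1}^{N-1} \bigl( b_{n+1} - b_n \bigr)^{\top} \bigl( b_{n+1} - b_n \bigr) .
\]
```

Here RM(T) penalizes departures of the observations from the postulated regression relation, and RD(T) penalizes period-to-period movement in the coefficient vector.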

Specifically, the Residual Efficiency Frontier REF(T) consists of all estimated models M(T) whose corresponding squared-error sums (RD(T),RM(T)) for dynamic and measurement model specification errors are minimal in the following vector sense:

Given any estimated model M(T) along REF(T) with corresponding squared-error sums (RD(T),RM(T)), there does not exist any other estimated model M#(T) whose corresponding squared-error sums (RD#(T),RM#(T)) are strictly smaller than (RD(T),RM(T)) in the following vector sense: RD#(T) does not exceed RD(T), RM#(T) does not exceed RM(T), and either RD#(T) is strictly smaller than RD(T) or RM#(T) is strictly smaller than RM(T).

Consequently, as illustrated in the figure appearing at the top of this website, REF(T) is analogous to a Pareto Efficiency Frontier of estimated models M(T), given the time-series data set T.
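The vector-minimality condition stated above can be checked mechanically for a finite set of candidate models. The following is a minimal sketch only: the function names and the (RD, RM) pairs are made up for illustration, and the actual REF(T) is a continuum rather than a finite list.

```python
def dominates(a, b):
    """True if cost vector a is no larger than b in every component
    and strictly smaller in at least one component."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def efficient_frontier(points):
    """Return the points that no other point dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (RD, RM) pairs for six candidate estimated models.
candidates = [(0.0, 9.0), (1.0, 5.0), (2.0, 5.5),
              (3.0, 2.0), (3.0, 2.5), (6.0, 0.0)]

frontier = efficient_frontier(candidates)
for rd, rm in frontier:
    print(f"RD = {rd:.1f}, RM = {rm:.1f}")
```

In this made-up list, (2.0, 5.5) is excluded because (1.0, 5.0) is smaller in both components, and (3.0, 2.5) is excluded because (3.0, 2.0) has the same RD with a smaller RM.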

Finally, note that the estimated model Mo(T) corresponding to ordinary least squares (OLS) linear regression with a time-invariant estimate bo(T) for the coefficient vector b is obtained at the limit point of REF(T) where RD(T) = 0 and RM(T) attains its largest value along REF(T). Starting from this limit point, the measurement squared-error sum RM(T) successively declines along REF(T) as RD(T) is permitted to increase from 0, i.e., as the dynamic specification errors [b*n+1(T) - b*n(T)], n = 1, ..., N-1, are permitted to increase in size.
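The frontier described above can be traced numerically. The following is a minimal sketch, not the Fortran program mentioned below. It assumes that RD(T) is the unweighted sum of squared coefficient changes, combines the two criteria into a single cost RM + mu*RD for a penalty weight mu > 0 (any minimizer of such a positively weighted sum is vector-minimal), and solves the first-order conditions with a dense linear solve rather than a recursive algorithm. The names fls_tvlr, solve, and mu, and the data, are illustrative assumptions.

```python
def solve(A, c):
    """Solve A z = c by Gaussian elimination with partial pivoting."""
    n = len(c)
    M = [row[:] + [ci] for row, ci in zip(A, c)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * z[k] for k in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z


def fls_tvlr(X, y, mu):
    """Minimize RM + mu*RD over coefficient paths b_1..b_N.

    Returns (path, RD, RM). Assumes mu > 0 and regressors of full rank K.
    """
    N, K = len(X), len(X[0])
    A = [[0.0] * (N * K) for _ in range(N * K)]
    c = [0.0] * (N * K)
    for n in range(N):
        links = (n > 0) + (n < N - 1)  # number of neighbouring periods
        for i in range(K):
            r = n * K + i
            c[r] = X[n][i] * y[n]
            for j in range(K):
                A[r][n * K + j] += X[n][i] * X[n][j]
            A[r][r] += mu * links
            if n > 0:
                A[r][r - K] -= mu
            if n < N - 1:
                A[r][r + K] -= mu
    z = solve(A, c)
    b = [z[n * K:(n + 1) * K] for n in range(N)]
    RM = sum((y[n] - sum(X[n][i] * b[n][i] for i in range(K))) ** 2
             for n in range(N))
    RD = sum((b[n + 1][i] - b[n][i]) ** 2
             for n in range(N - 1) for i in range(K))
    return b, RD, RM


# Made-up data: intercept 0.5, slope stepping from 1 to 2 at mid-sample.
x = [1.0, 2.0, 1.5, 3.0, 2.0, 1.0, 2.5, 3.0]
slope = [1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0]
X = [[1.0, xi] for xi in x]
y = [0.5 + s * xi for s, xi in zip(slope, x)]

frontier = []
for mu in [0.01, 1.0, 100.0, 1e6]:
    _, RD, RM = fls_tvlr(X, y, mu)
    frontier.append((mu, RD, RM))
    print(f"mu = {mu:g}: RD = {RD:.6g}, RM = {RM:.6g}")
```

As mu grows, RD is driven toward 0 and the estimates approach a constant-coefficient fit; as mu shrinks, the coefficients are allowed to move more and RM falls. Sweeping mu therefore walks along the frontier from one end toward the other.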

A Fortran program for FLS-TVLR has been incorporated into the statistical packages SHAZAM and GAUSS. See the software section, below, for details.

FLS for Approximately Linear Systems (GFLS)

As detailed in Kalaba and Tesfatsion (1990a), we also developed a Generalized Flexible Least Squares (GFLS) method for state-space models whose postulated dynamic and measurement relationships constitute a general linear-affine system of equations. The concept of the Residual Efficiency Frontier REF(T) is correspondingly generalized to a Cost-Efficient Frontier CEF(T) for which costs are separately assessed for dynamic and measurement specification errors.

A Fortran program for GFLS has been incorporated into the statistical package GAUSS. See the software section, below, for details.

FLS for K-Dimensional Goodness-of-Fit Criterion Vectors (KFLS)

More generally, suppose a theoretical state-space model has been conjectured as a possible explanation for a given time-series data set T. Suppose, also, that a modeler has specified a K-dimensional vector of incompatibility cost functions for measuring the degree of incompatibility between theory and data in accordance with K different goodness-of-fit criteria, where K is greater than or equal to 1.

In a succession of studies that are listed below in the publications section, and summarized in Kalaba and Tesfatsion (1996), Bob Kalaba and I developed what is here referred to as KFLS, an extended FLS method that handles this K-dimensional goodness-of-fit problem.

More precisely, KFLS is a constructive procedure for the determination of a Cost-Efficient Frontier CEF(T) for this problem. The points along CEF(T) correspond to the family of all estimated state-space models that are equally efficient with respect to achieving vector-minimization of these K incompatibility cost functions.

In addition, we obtain a recurrence relation for CEF(T). Let Tn denote the time-series data in T pertaining to time periods 1,...,n. For each n, the recurrence relation expresses CEF(Tn+1) as a function of CEF(Tn) together with a K-dimensional vector of incremental incompatibility costs associated with new data obtained between n and n+1.

We show that KFLS encompasses a number of other state-space estimation methods. For example, FLS and GFLS are special cases of KFLS with K=2. Moreover, KFLS reduces to the standard Kalman Filter for the special case in which all model specification errors are assumed to be stochastic disturbances to otherwise correctly specified theoretical relationships, thus permitting the derivation of a single (K=1) real-valued incompatibility cost function in the form of a posterior probability distribution.

Finally, we also clarify the interesting relationship between KFLS and the general-to-specific econometric methodology advocated by David Hendry, among others, and between the KFLS CEF construction and the information contract curve (global sensitivity analysis) approach developed by Ed Leamer.

FLS and Kalman Filtering

As discussed more fully in Kalaba and Tesfatsion (1990b), it is logically incorrect to equate FLS for time-varying linear regression with Kalman Filtering (KF).

FLS for time-varying linear regression does not require probability assumptions either for its motivation or for its solution. As previously explained, its goal is to characterize the set of all possible sequences of coefficient-vector estimates that achieve vector-minimal incompatibility between imperfectly specified theoretical relationships and a given time-series data set T.

In contrast, KF applied to a time-varying linear regression problem conditional on a time-series data set T is a recursive method for the evaluation of the Tn-conditional moments of the period-n coefficient vector for each time period n, where Tn denotes the time-series data in T pertaining to time periods 1,...,n. The time-varying linear regression model is expressed as a stochastic linear state-space model with coefficient vectors identified as state vectors. The general form of the stochastic model is assumed to be correctly and completely specified. In particular, "residual error" terms are assumed to be stochastic disturbance terms governed by a joint probability distribution whose general form is common knowledge.
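To make the contrast concrete, the KF recursion for a time-varying regression can be sketched as follows. This is a textbook-style illustration under assumptions that go beyond the text: a single (scalar) coefficient, a random-walk state equation, Gaussian disturbances with known variances q and r, and a Gaussian prior with mean m0 and variance p0. The function name kalman_tvlr and the data are hypothetical.

```python
def kalman_tvlr(x, y, q, r, m0, p0):
    """Scalar Kalman filter for y_n = x_n * b_n + v_n, b_{n+1} = b_n + w_n.

    Assumes v_n ~ N(0, r), w_n ~ N(0, q), and prior b_1 ~ N(m0, p0).
    Returns the filtered means and variances of b_n given data up to n.
    """
    m, p = m0, p0
    means, variances = [], []
    for n, (xn, yn) in enumerate(zip(x, y)):
        if n > 0:
            p = p + q                 # predict: random-walk state equation
        s = xn * p * xn + r           # variance of the one-step forecast error
        k = p * xn / s                # Kalman gain
        m = m + k * (yn - xn * m)     # update the conditional mean
        p = (1.0 - k * xn) * p        # update the conditional variance
        means.append(m)
        variances.append(p)
    return means, variances


# Made-up data: constant true coefficient 2, no measurement noise.
x = [1.0, 2.0, 1.5, 3.0]
y = [2.0 * xi for xi in x]

means, variances = kalman_tvlr(x, y, q=0.0, r=1.0, m0=0.0, p0=1e6)
for n, (m, p) in enumerate(zip(means, variances), start=1):
    print(f"n = {n}: mean = {m:.6f}, variance = {p:.6f}")
```

Every quantity the recursion needs (q, r, m0, p0, and the Gaussian form itself) is a distributional input that must be supplied before any data are processed, which is the kind of prior commitment the FLS approach avoids.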

Nevertheless, FLS was originally motivated by a puzzling aspect of KF: Why has KF proved to be useful in empirical practice, even when its assumptions are not even remotely satisfied? The original goal of FLS was to get to the heart of the matter by stripping away non-essential aspects of KF that complicate its application without adding to scientific understanding.

The key non-essential aspect of KF was soon revealed to be overly strong distributional assumptions for "residual error" terms.

The KF presumption that "residual error" terms are governed by a joint probability distribution permits the use of scalar posterior probability distributions for state moment estimation. However, this presumption hides the multicriteria aspects of the underlying estimation problem, i.e., the possible presence of multiple conceptually distinct types of model specification errors.

In contrast, as detailed above, FLS for time-varying linear regression sets out the underlying multicriteria estimation problem in explicit bare-bones terms. Given a time-series data set T, FLS solves for the entire Residual Efficiency Frontier REF(T) of estimated linear regression models that are efficient in the sense that measurement specification errors RM(T) are minimal for each given level RD(T) of dynamic volatility in the estimated regression coefficients.

Thus, for a researcher unsure about model specification errors, an FLS-generated REF provides a way to obtain a fuller picture of the researcher's true "cone of uncertainty" for the model. If additional knowledge becomes available constraining the possible sizes of the model specification errors, the REF can be correspondingly narrowed. Otherwise, restricting attention to a proper subset of the REF is an arbitrary decision.

This understanding leads to the recognition that KF and FLS represent two endpoints of a continuum of approaches to time-varying linear regression ranging from relatively strong distributional priors (KF) to simple "smallness" priors (FLS) for dynamic and measurement model specification errors. All of these approaches, from KF to FLS and everything between, should be routinely available in the toolkits of statisticians and econometricians as they attempt to gain a better understanding of real-world systems. The determination of a "best" approach should then depend on the extent to which different forms of prior restrictions on model specification errors can be scientifically justified.

FLS as a Diagnostic Method for Model Specification

In FLS studies, the full display of the Residual Efficiency Frontier (REF) or Cost-Efficient Frontier (CEF) can facilitate the discovery of true model structure. The qualitative persistence all along the REF/CEF of particular forms of time variation in FLS-estimated state vectors gives confidence that these time variations reflect true attributes of the underlying data generating mechanism.

For example, as reported in Kalaba and Tesfatsion (1989a), FLS studies were conducted for time-varying linear regression systems characterized by step-function, elliptical, sinusoidal, and other types of time-variation in the regression coefficients. It was found that these true underlying time variations were correctly displayed by the FLS-estimated regression coefficients all along the REF, albeit in increasingly flattened form as the limit point was approached where the dynamic cost RD becomes zero (i.e., where the OLS solution is obtained).

Restriction to a subset of the FLS-estimated models along a REF/CEF is arbitrary unless additional prior information is available to supplement the given theoretical model and time-series data set. In some cases a researcher might indeed have prior information about a system under study that permits the researcher to assess ex post which estimated models along a REF/CEF have physically sensible interpretations and which do not. The researcher might then be able to use these assessments to limit attention to a proper subset of the REF/CEF.

NOTE: Robert E. Kalaba, great scholar, mentor, colleague, and friend, died on 9/29/2004.

Publications Developing the Basic FLS Methodology (Chronological Order)

FLS Comparative Testing

FLS Open Source Software & Manual Availability

Illustrative FLS Applications

Copyright © Leigh Tesfatsion. All Rights Reserved.