Revisiting Transfer Functions: Learning About a Lagged Exposure-Outcome Association in Time-Series Data


INTRODUCTION
Environmental exposures often show a time-lagged association with outcomes [1–3]. Distributed lag models capture such lag patterns by incorporating time-lagged values of exposures, with the shape of the lag structure approximated by polynomials or splines [1,4]. These models require a correctly specified cut-off time, or pre-specified window (hereafter termed lag length), after which the association diminishes to a constant level, typically zero [5,6]. However, the lag length is often unknown [5–7]. To fit distributed lag models without specifying the lag length, we revisit transfer functions (TFs), a method for specifying time-lagged associations that is commonly used in econometrics and was introduced to epidemiology in 1991 [8–10]. We provide a case study that captures the time-lagged association between weekly purchases of sugar-sweetened drinkable yogurt and the weekly-varying display promotion of these beverages, an obesogenic food-environment exposure in supermarkets.

METHODS
TFs capture a time-lagged exposure-outcome association using a structural variable, denoted $E_t$, which summarizes the current association (at time $t$) and the cumulative association (up to time $t$) between the outcome variable $Y_t$ and the time-lagged exposure variables $X_{t-1}, X_{t-2}, X_{t-3}, \ldots$ [8,11] (Supplementary Appendix S1). We illustrate a simple form of TF that captures a commonly observed lag pattern, a monotonically decreasing association between the outcome and the lagged exposure, often called the Koyck decay [12]. Using the decay coefficient of the lagged association $\lambda$ up to lag $h$, the decreasing associations are represented as

$$E_t = \beta X_t + \beta\lambda X_{t-1} + \beta\lambda^2 X_{t-2} + \ldots + \beta\lambda^h X_{t-h},$$

which, as $h$ grows large, recursively reduces to

$$E_t = \lambda E_{t-1} + \beta X_t. \quad (1)$$

The coefficient $\beta$ captures the immediate association at time $t$. A value of the decay coefficient $\lambda$ closer to 1 implies a more persistent association over time (i.e., slower decay), while a value closer to zero indicates a shorter lag [12,13]. Constraining $\lambda$ to $0 < \lambda < 1$ ensures that the association decays monotonically towards zero when $\beta$ is positive (Supplementary Figure S1A); previous studies also imposed decay towards zero [14,15]. The variable $E_t$ is added to a time-series regression for the outcome $Y_t$ to estimate $\beta$ and $\lambda$ as $Y_t = E_t + Z_t\gamma + \varepsilon_t$, where $Z_t$ represents a set of covariates and an intercept with coefficients $\gamma$, and $\varepsilon_t$ represents the error term [10,13].
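The link between the weighted sum of past exposures and the one-step recursion can be checked numerically. The following is a minimal Python sketch rather than the authors' implementation: the function names are hypothetical, and it assumes that the exposure is zero before the first observed week, so that the structural variable starts from zero.

```python
def koyck_recursive(x, beta, lam):
    """Structural variable from the recursion E_t = lam * E_{t-1} + beta * x_t.

    Assumption for illustration: E is zero before the first observation.
    """
    if not 0.0 < lam < 1.0:
        raise ValueError("lam must satisfy 0 < lam < 1")
    e_prev, out = 0.0, []
    for x_t in x:
        e_prev = lam * e_prev + beta * x_t
        out.append(e_prev)
    return out


def koyck_explicit(x, beta, lam):
    """Same variable from the explicit sum of beta * lam**k * x_{t-k} over all observed lags."""
    return [
        sum(beta * lam ** k * x[t - k] for k in range(t + 1))
        for t in range(len(x))
    ]


exposure = [1.0, 0.0, 2.0, 0.5]
print(koyck_recursive(exposure, beta=2.0, lam=0.5))
print(koyck_explicit(exposure, beta=2.0, lam=0.5))
```

Both functions return the same series up to floating-point error, which is why the recursion can stand in for the sum without choosing a lag length.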
A visual interpretation of a lagged association combining these coefficients is provided by an impulse response function (IRF), representing the change in the outcomes $Y_t, Y_{t+1}, Y_{t+2}, \ldots, Y_{t+h}$ in response to an impulse (a one-unit increase of $X$ at time $t$ only), while holding other variables constant [16]. The IRF of the Koyck decay is $\beta, \beta\lambda, \beta\lambda^2, \ldots, \beta\lambda^h$, visualized in Figure 1.
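An IRF of this kind can be traced by feeding a single unit impulse through the Koyck recursion. The sketch below is illustrative only: the function name `koyck_irf` and the parameter values are assumptions chosen so that the arithmetic is easy to follow.

```python
def koyck_irf(beta, lam, horizon):
    """Response at lags 0..horizon to a one-unit exposure at lag 0 only."""
    impulse = [1.0] + [0.0] * horizon
    e_prev, irf = 0.0, []
    for x_t in impulse:
        e_prev = lam * e_prev + beta * x_t
        irf.append(e_prev)
    return irf


irf = koyck_irf(beta=2.0, lam=0.5, horizon=3)
print(irf)       # [2.0, 1.0, 0.5, 0.25]
print(sum(irf))  # 3.75
```

Summing the IRF over an increasingly long horizon approaches β / (1 − λ), here 4.0, which follows from the geometric series and gives the total association accumulated over all lags.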
The general specification of the TF, which captures various shapes of lag structure, is

$$E_t = \sum_{i=1}^{q} \lambda_i E_{t-i} + \sum_{j=0}^{p} \beta_j X_{t-j}, \quad (2)$$

where the Koyck decay in Eq. 1 above is recovered by $p = 0$, $q = 1$. More complex shapes are specified by higher values of $p$ and $q$ (Figure 2; Supplementary Appendix S2), allowing generalization to classical lag models, such as the Almon polynomial [10,17].
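To make the roles of p and q concrete, the sketch below assumes a rational-lag form in which the structural variable depends on q of its own past values and on the current and p past exposures. This form, the function name `transfer_function`, and the zero pre-sample values are assumptions made for illustration; the sketch is not the authors' code.

```python
def transfer_function(x, betas, lams):
    """Structural variable E_t for orders p = len(betas) - 1 and q = len(lams).

    betas[j] multiplies x_{t-j} (j = 0..p); lams[i-1] multiplies E_{t-i} (i = 1..q).
    Assumption for illustration: x and E are zero before the first observation.
    """
    e = []
    for t in range(len(x)):
        own_past = sum(
            lam_i * e[t - i]
            for i, lam_i in enumerate(lams, start=1)
            if t - i >= 0
        )
        exposure = sum(
            beta_j * x[t - j] for j, beta_j in enumerate(betas) if t - j >= 0
        )
        e.append(own_past + exposure)
    return e


impulse = [1.0, 0.0, 0.0, 0.0]
# p = 0, q = 1 gives the Koyck decay
print(transfer_function(impulse, betas=[2.0], lams=[0.5]))  # [2.0, 1.0, 0.5, 0.25]
# p = 1, q = 1: the response is held for one period before it decays
print(transfer_function(impulse, betas=[1.0, 0.5], lams=[0.5]))
```

Passing an impulse through different orders in this way is a quick check of the lag shape that a candidate (p, q) implies before fitting it to data.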
Unlike commonly used distributed lag models, TF models obviate pre-specification of a lag length h, but they require prior biological and epidemiological knowledge to help select plausible shapes of the lag (values of p and q). Deciding among candidate shapes is facilitated by model selection using fit metrics such as an information criterion [11].
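Selection among candidate shapes can be sketched on simulated data. The example below deliberately swaps in simpler tools than a full time-series regression: least squares with a grid search over λ, and the Gaussian AIC as the information criterion. It omits covariates and the intercept, and all function names, the simulated series, and the parameter values are hypothetical.

```python
import math
import random


def koyck_sum(x, lam):
    """Unit-beta Koyck variable S_t = lam * S_{t-1} + x_t, starting from zero."""
    s, out = 0.0, []
    for x_t in x:
        s = lam * s + x_t
        out.append(s)
    return out


def least_squares_slope(y, s):
    """Slope and residual sum of squares for y = beta * s (no intercept)."""
    beta = sum(a * b for a, b in zip(s, y)) / sum(a * a for a in s)
    rss = sum((b - beta * a) ** 2 for a, b in zip(s, y))
    return beta, rss


def gaussian_aic(rss, n, k):
    """AIC up to an additive constant shared by all models; k counts mean parameters."""
    return n * math.log(rss / n) + 2 * k


def fit_no_lag(y, x):
    """Candidate 1: immediate association only."""
    beta, rss = least_squares_slope(y, x)
    return {"shape": "no lag", "beta": beta, "aic": gaussian_aic(rss, len(y), 1)}


def fit_koyck(y, x, grid=None):
    """Candidate 2: Koyck decay, with lam chosen by grid search on the RSS."""
    grid = grid or [i / 100 for i in range(1, 100)]
    best = None
    for lam in grid:
        beta, rss = least_squares_slope(y, koyck_sum(x, lam))
        if best is None or rss < best["rss"]:
            best = {"shape": "Koyck", "beta": beta, "lam": lam, "rss": rss}
    best["aic"] = gaussian_aic(best["rss"], len(y), 2)
    return best


def simulate(n=300, beta=3.0, lam=0.6, noise_sd=0.5, seed=1):
    """Simulated exposure and outcome with a true Koyck lag structure."""
    rng = random.Random(seed)
    x = [rng.random() for _ in range(n)]
    y = [beta * s + rng.gauss(0.0, noise_sd) for s in koyck_sum(x, lam)]
    return x, y


x, y = simulate()
candidates = [fit_no_lag(y, x), fit_koyck(y, x)]
print(min(candidates, key=lambda m: m["aic"])["shape"])
```

The candidate with the lowest criterion value is preferred. When two shapes give similar values, the criterion alone does not settle the choice, and prior knowledge about plausible lag shapes has to carry more weight.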

CASE STUDY
The exposure is the weekly within-store display promotion of sugar-sweetened food items, which potentially exhibits a time-lagged association with the number of these items sold (the outcome). Display promotion is the temporary placement of items in prominent locations to increase sales of (typically) ultraprocessed food [18]. Our food of interest is sugar-sweetened (not plain) drinkable yogurt, a hidden and important source of dietary sugar among children [19,20]. Time series of the weekly proportion of display-promoted sugar-sweetened drinkable yogurt items (continuous exposure) and the weekly sum of the sales quantity of these items (continuous outcome) were recorded from a large supermarket in Montreal, Canada, over T = 311 weeks (6 years). Supplementary Appendix S3 and Supplementary Figures S2 and S3 elaborate on the definitions of the exposure and outcome.
The time-series regression used in this study is a dynamic linear model [21,22]. We added the structural variable, $E_t$, covariates, a seasonal term, and an intercept. We selected the   Figure S4).

DISCUSSION
Time-lagged exposure-outcome associations are of critical interest in time-series analysis. We described TF modeling to estimate lagged associations when the lag length is unknown a priori. Previous applications of TFs include environmental time-series analysis to capture decaying associations between arbovirus incidence and temperature [23] and interrupted time-series analysis to capture the persistent effect of interventions [11,24]. TF modeling requires pre-specification of candidate shapes of the lag structure from investigators' prior knowledge, followed by selection among them based on model fit. When such knowledge is lacking, existing distributed lag models, such as those using splines, allow data-driven estimation of the shape of the lag. These models require the lag length to be specified by model selection applied to plausible lag lengths [25], by setting a length long enough to cover the unobserved true lag window at a potential cost in precision [4], or by estimating the lag length from data [26,27]. Limitations of TFs include the difficulty of selecting the most appropriate lag shape when competing shapes show similar model fit. Finally, a comprehensive evaluation of TFs for capturing lagged associations in simulated environmental health data is warranted, including their capacity to capture non-linear exposure-outcome associations by making β time-varying (dynamic) or by imposing a non-linear structure on E_t [17,28].

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the McGill University, Faculty of Medicine, Institutional Review Board. Written informed consent from the participants' legal guardian/next of kin was not required to participate in this study in accordance with the institutional requirements.

AUTHOR CONTRIBUTIONS
The study was conceived and designed by HM and was reviewed and approved by the other authors. Authors AMS and EEMM provided input on the statistical analysis and interpretation of the results. Author DLB provided the data and computational resources. Data analysis and drafting of the manuscript were led by HM. All authors reviewed the manuscript, provided critical comments on it, and approved the final version for submission.

FUNDING
This study was funded by an Institut de valorisation des données (IVADO) post-doctoral fellowship.

CONFLICT OF INTEREST
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

ACKNOWLEDGMENTS
The study has been disseminated as a preprint on medRxiv [29].