Working paper

When are Google data useful to nowcast GDP? An approach via pre-selection and shrinkage

Published on 11 April 2019
Authors: Laurent Ferrara, Anna Simoni

Working paper No. 717. Nowcasting GDP growth is extremely useful for policy-makers to assess macroeconomic conditions in real time. In this paper, we aim to nowcast euro area GDP with a large database of Google search data. Our objective is to check whether, and when, this specific type of information can increase GDP nowcasting accuracy once we control for official variables. To this end, we estimate shrunk bridge regressions that integrate Google data optimally screened through a targeting method, and we show empirically that this approach provides some gain in pseudo-real-time nowcasting of euro area quarterly GDP growth. In particular, we find that Google data provide useful information for GDP nowcasting during the first four weeks of the quarter, when official macroeconomic information is lacking. However, as soon as official data become available, the relative nowcasting power of Google data vanishes. In addition, a true real-time analysis confirms that Google data constitute a reliable alternative when official data are lacking.

Image: Real-time assessment of GDP with Google data: an approach via pre-selection and dimension reduction

Nowcasting GDP growth is extremely useful for policy-makers to assess macroeconomic conditions in real time. The concept of macroeconomic nowcasting has been popularized by many researchers (see e.g. Giannone et al., 2008) and differs from standard forecasting approaches in that it aims at evaluating current macroeconomic conditions on a high-frequency basis.

In the existing literature, GDP nowcasting tools integrate standard official macroeconomic information stemming, for instance, from national statistical institutes, central banks or international organizations. More recently, however, a lot of emphasis has been put on the possible gain that forecasters can get from using alternative sources of high-frequency information, referred to as Big Data (see for example Varian, 2014, Giannone et al., 2017, or Buono et al., 2018). One of the main sources of alternative data is Google search; the seminal papers on the use of such data for forecasting are Choi and Varian (2009) and Choi and Varian (2012). Overall, empirical papers show evidence of some forecasting power for Google data, at least for some specific macroeconomic variables such as consumption (Choi and Varian, 2012). However, when Google data are correctly compared to other sources of information, the jury is still out on the gain that economists can get from using them for forecasting and nowcasting.

In this paper, we estimate both pseudo-real-time and true real-time nowcasts of euro area quarterly GDP growth between 2014q1 and 2016q1 by adding Google data to the official variables on industrial production and opinion surveys that are commonly used as predictors of GDP growth. The approach that we carry out is deliberately extremely simple and relies on a bridge equation that integrates variables selected from a large set of Google data, as proposed by Angelini et al. (2011). More precisely, we pre-select Google variables by targeting GDP growth using the Sure Independence Screening method put forward by Fan and Lv (2008), which retains the Google variables most closely related to GDP growth before they enter the bridge equation. After pre-selection, we use Ridge regularization to estimate the bridge equation, as the number of pre-selected variables may still be large.
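The two steps described above, screening followed by a shrunk bridge regression, can be illustrated with a minimal sketch. It is not the authors' code: the data are synthetic, the function names (sis_select, ridge_fit, nowcast) and the values of k and alpha are hypothetical choices made for illustration, screening is done by absolute marginal correlation with GDP growth, and all predictors are assumed to be already aggregated to quarterly frequency.

```python
import numpy as np

def sis_select(X, y, k):
    """Screening step: keep the k columns of X with the largest
    absolute marginal correlation with y (assumes no constant column)."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    ys = (y - y.mean()) / y.std()
    corr = np.abs(Xs.T @ ys) / len(y)          # |Pearson correlation| per column
    return np.sort(np.argsort(corr)[::-1][:k])  # indices of the k best columns

def ridge_fit(X, y, alpha):
    """Ridge regression with an unpenalized intercept (closed form)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    beta = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return y_mean - x_mean @ beta, beta

def nowcast(official, google, y, official_new, google_new, k=5, alpha=1.0):
    """Bridge-type nowcast: official predictors plus screened Google predictors.
    k and alpha are illustrative values, not tuned ones."""
    keep = sis_select(google, y, k)
    X = np.hstack([official, google[:, keep]])
    intercept, beta = ridge_fit(X, y, alpha)
    x_new = np.concatenate([official_new, google_new[keep]])
    return float(intercept + x_new @ beta), keep

# Synthetic example: 40 quarters, 2 official series, 200 Google series.
rng = np.random.default_rng(0)
T, n_google = 40, 200
official = rng.standard_normal((T, 2))
google = rng.standard_normal((T, n_google))
y = (0.5 * official[:, 0] + 0.8 * google[:, 3] - 0.6 * google[:, 17]
     + 0.1 * rng.standard_normal(T))

# Fit on the first T-1 quarters and nowcast the last one.
value, kept = nowcast(official[:-1], google[:-1], y[:-1],
                      official[-1], google[-1], k=5, alpha=1.0)
print("selected Google columns:", kept)
print("nowcast:", value, "actual:", y[-1])
```

In practice the number of retained variables and the Ridge penalty would be chosen by a data-driven criterion rather than fixed in advance; the fixed values here only keep the example short.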

Four main stylized facts emerge from our empirical analysis. First, we point out the usefulness of Google search data for nowcasting euro area GDP during the first four weeks of the quarter, when there is no available official information about the state of the economy. Indeed, we show that at the beginning of the quarter, Google data provide an accurate picture of the GDP growth rate. This means that such data are a good alternative in the absence of official information and can be used by policy-makers. Second, we find that as soon as official data become available, that is, starting from the fifth week of the quarter, the gain from using Google data for GDP nowcasting rapidly vanishes. This result contributes to the debate on the use of big data for short-term macroeconomic assessment when controlling for standard macroeconomic information. Third, we show that pre-selecting Google data before they enter the nowcasting models is a pertinent strategy in terms of nowcasting accuracy. Indeed, this approach makes it possible to retain only the Google variables that have some link with the targeted variable. Finally, we carry out a true real-time analysis by nowcasting the euro area GDP growth rate using the official Eurostat timeline and vintages of data. We show that the three previous results still hold in real time, in spite of an expected increase in the size of errors, suggesting that Google search data can be effectively used in practice to help the decision-making process.

Updated on 24 October 2023