Time Series Analysis and VAR Models

Lecture 6

6. Time series analysis: Multivariate models

6.1 Learning outcomes

Vector autoregression (VAR)
Cointegration
Vector error correction model (VECM)
Application: pairs trading

6.2 Vector autoregression (VAR)

The classical linear regression model assumes strict exogeneity; hence, the error term is uncorrelated with every realisation (lead or lag) of every independent variable. As we discovered, serial correlation (or autocorrelation) is very common in financial time series and panel data. Furthermore, we assumed a pre-defined direction of causality: the explanatory variables affect the dependent variable.

We now relax both assumptions using a VAR model. VAR models can be regarded as a generalisation of AR(p) processes to several time series. Hence, we enter the field of multivariate time series analysis.

Let's look at a standard AR(p) process for each of two variables ($y_t$ and $x_t$).

$$y_t = \alpha_0 + \sum_{i=1}^{p} \alpha_i y_{t-i} + \varepsilon_{y,t} \tag{1}$$

$$x_t = \beta_0 + \sum_{i=1}^{p} \beta_i x_{t-i} + \varepsilon_{x,t} \tag{2}$$

The next step is to allow lagged values of $x_t$ to affect $y_t$ and vice versa. This means that we obtain a system of equations for two dependent variables ($y_t$ and $x_t$). Both dependent variables are influenced by past realisations of $y_t$ and $x_t$. By doing that, we violate strict exogeneity (see Lecture 2); however, we can use a more relaxed concept, namely weak exogeneity. As we use lagged values of both dependent variables, we can argue that these lagged values are known to us, as we observed them in the previous period. We call these variables predetermined. Predetermined (lagged) variables fulfil weak exogeneity in the sense that they have to be uncorrelated with the contemporaneous error term in t. We can still use OLS to estimate the following system of equations, which is called a VAR in reduced form.

$$y_t = \alpha_0 + \sum_{i=1}^{p} \alpha_i y_{t-i} + \sum_{i=1}^{p} \gamma_i x_{t-i} + \varepsilon_{y,t} \tag{3}$$

$$x_t = \beta_0 + \sum_{i=1}^{p} \beta_i x_{t-i} + \sum_{i=1}^{p} \delta_i y_{t-i} + \varepsilon_{x,t} \tag{4}$$

The beauty of this model is that we don't need to predefine whether x or y is endogenous (the dependent variable). In fact, we can test whether x (y) is endogenous or exogenous using Granger causality tests. The idea of Granger causality is that past observations (lagged dependent variables) can influence current observations but not vice versa. So the idea is rather simple: the past affects the present, and the present does not affect the past. STATA provides Granger causality tests after conducting a VAR analysis. They test the joint null hypothesis that past realisations of one variable do not Granger-cause the current realisation of the dependent variable; in equation (3), this is the hypothesis that $\gamma_1 = \dots = \gamma_p = 0$.

In many applications, VAR models make a lot of sense, as a clear direction of causality cannot be predefined. For instance, there is a substantial literature on the benefits of internationalisation (e.g. entering foreign markets through cross-border M&A). There is evidence that multinationals outperform local peers due to the benefits of operating in many countries. At the same time, we know that high-performing companies are more likely to enter foreign markets due to their ownership-specific advantages. This argument is based on the Resource-based View and the OLI framework developed by Dunning and Rugman (Reading School of International Business). The VAR model allows you to incorporate both effects: in fact, you can test whether performance drives internationalisation or internationalisation drives performance.

Before you start using a VAR model, you have to make sure that the time series are stationary. So the first step is to check whether each time series is stationary using Dickey-Fuller tests and KPSS tests. The second step is to specify the optimal lag length (p) of the model. This is done by comparing different model specifications using information criteria. Apart from the Akaike (AIC) and Schwarz Bayesian (BIC) criteria, the Hannan-Quinn criterion (HQIC) is commonly used. Many applied econometricians favour the HQIC. STATA will help you to make a good choice.

After specifying your model, you need to check the stability condition. The coefficient matrix of the reduced-form VAR has to ensure that the iteration sequence converges to a long-term value. To be precise, you need to show that all eigenvalues of the coefficient matrix (for p > 1, the companion matrix) lie strictly inside the unit circle, i.e. have modulus less than one. The reason becomes clear once you diagonalise the matrix: iterating the system forward raises each eigenvalue to ever higher powers, and these powers die out only if the modulus is below one. STATA will help you in checking stability.

VAR models offer another nice feature: impulse response functions. VAR models capture the dynamics of two (or more) stationary time series; hence, we can assess the dynamic impact of a marginal change of one variable on another. The standard OLS regression provides coefficients, and coefficients refer to the partial impact of an explanatory variable on the dependent variable. In the case of VAR models, the relationship becomes dynamic, as a change of one variable (say x) in t can affect x and y in t+1. The impact on x and y in t+1 in turn affects x and y in t+2, and so on until the impact dies out. Impulse response functions are very useful in illustrating the short-term dynamics in a model.

Let's look at an example to see how VAR modelling
works. In Lecture 5, we tried very hard to understand gold prices. We now extend our univariate model by exploring the relationship between gold and silver prices. Linking two (similar) assets or securities is a very common trading strategy, which is called pairs trading.

Before we do any sophisticated
modelling, it is always beneficial to look at some line charts. Figure 1 shows the indexed time series of nominal gold and silver prices from 1900 to 2010.

Figure 1: Nominal gold and silver prices, indexed, 1900-2010

We can see that there is a certain degree of co-movement, which we might be able to exploit for our trading strategy. Before we can use VAR, we need to ensure that both time series are stationary. It is obvious from Figure 1 that gold and silver prices are not stationary. However, after taking first differences, we can show that price changes are stationary. So both time series are
I(1).

The next step is to determine the optimal lag length using information criteria. Table 1 shows different specifications using the varsoc command.

Table 1: Determining the optimal lag length using information criteria

Based on the AIC and HQIC, two lags are optimal; however, the (S)BIC prefers only one lag. I would prefer HQIC and try two lags first. If the second lag does not exhibit significant coefficients, we could try to reduce the lag length in line with (S)BIC.

We run a VAR with two lags to explain current price changes in gold and silver. Table 2 provides the OLS estimates.

Table 2: VAR
model with two lags

We see that silver price changes (lag 2) affect current gold price changes, and we can establish autocorrelation in both time series. To test whether gold Granger-causes silver or vice versa, we run the Granger causality tests reported in Table 3.

Table 3: Granger causality tests

Hence, we confirm that past changes in silver prices can predict future gold price changes. This is very interesting, as it can be used to develop a trading strategy. Next, we need to show that the VAR is stable (see Table 4).

Table 4: Stability condition of the VAR

Finally, we can illustrate the impact of silver price changes on future gold price changes using an impulse response function. Figure 2 shows the impulse response function and confidence intervals derived from bootstrapping. If silver prices increase today by 1%, we should expect a significant decline in gold prices of 0.2% in two years.

Figure 2: Impulse response function

6.3 Cointegration

When we explore Figure 1 a bit more carefully, we can see that silver and gold prices exhibit a certain degree of co-movement. We could almost argue that they share a common stochastic trend. The limitation of ARIMA and VAR models is that they can only be used
if the time series are stationary. In our case, we had to first-difference our time series to ensure stationarity. First-differencing eliminates a lot of information in the time series. Is there no better way to analyse gold and silver prices?

Long before the development of multivariate time series
econometrics, people realised that gold and silver seem to move together around a long-term equilibrium (the gold-silver price ratio). Moreover, the idea of equilibrium conditions in economics and the availability of macroeconomic time series led to the development of cointegration analysis. The idea is very simple. Even if two (or more) time series are non-stationary and hence have stochastic trends, they might still be driven by the same underlying factors that lead to their stochastic behaviour. Therefore, we analyse the time series in levels and see whether we can find a long-term equilibrium, a so-called cointegrating vector.

Before we explore the Johansen procedure, let's look at the gold-silver ratio over time, shown in Figure 3.

Figure 3: The gold-silver ratio, 1900-2010

The ratio looks like a mean-reverting process; thus, in the long run it tends to go back to its long-term equilibrium (mean). Based on the ratio, we could argue that gold seems to be overvalued compared to silver at the moment. Of course, taking the ratio suggests a very simple cointegrating vector; in fact, we assume a one-to-one relationship.

Before we can use the Johansen procedure, we have to make sure that the time series have the same order of integration, I(d). We already know that gold and silver prices are both I(1) time series. Table 5 shows the results of the Johansen test for cointegration. In line with the VAR model, we use two lags.

Table 5: Johansen test

The null hypothesis that there is no cointegration (r=0) can be rejected if we use the trace statistic. However, the null hypothesis that we have at most one cointegrating vector (maximum rank r=1) cannot be rejected. The problem is that the max-lambda statistic does not support cointegration. I also tried log-prices instead, wh
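
The link between the gold-silver ratio and a one-to-one cointegrating vector is easiest to see in log prices. The short derivation below is only a sketch: the symbols $G_t$, $S_t$, $g_t$, $s_t$, $\mu$, $z_t$ and $\beta$ are notation introduced here for illustration and do not come from the lecture.

```latex
% Assumed notation: G_t, S_t are the gold and silver prices,
% g_t = \ln G_t and s_t = \ln S_t their logs (both I(1)).
\begin{align*}
\ln\frac{G_t}{S_t} &= g_t - s_t = \mu + z_t
  && \text{log ratio = equilibrium level } \mu \text{ plus deviation } z_t \\
g_t - \beta s_t &= \mu + z_t
  && \text{general long-run relation with cointegrating vector } (1,\,-\beta)
\end{align*}
% If z_t is stationary, i.e. I(0), the two series are cointegrated.
% Working with the ratio imposes \beta = 1, i.e. the vector (1, -1),
% whereas a cointegration test leaves \beta to be estimated.
```

Read this way, a mean-reverting ratio is the special case $\beta = 1$ of a cointegrating relation between the two log price series.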

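
Returning to the VAR workflow of Section 6.2 (OLS estimation, the stability check and impulse responses), the steps can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the STATA procedure used in the lecture: the data are simulated from a hypothetical coefficient matrix `A_true`, the function names (`simulate_var1`, `fit_var`, `companion`, `is_stable`, `impulse_responses`) are made up for this example, and the impulse responses are simple non-orthogonalised ones without bootstrapped confidence intervals.

```python
import numpy as np

def simulate_var1(A, n_obs, seed=0):
    """Simulate y_t = A y_{t-1} + e_t with standard normal shocks (illustrative data only)."""
    rng = np.random.default_rng(seed)
    k = A.shape[0]
    data = np.zeros((n_obs, k))
    for t in range(1, n_obs):
        data[t] = A @ data[t - 1] + rng.standard_normal(k)
    return data

def fit_var(data, p):
    """Equation-by-equation OLS for a reduced-form VAR(p) with intercepts."""
    n_obs, k = data.shape
    # Regressors: a constant, then lag 1 of all variables, lag 2, ..., lag p.
    lags = [data[p - i:n_obs - i] for i in range(1, p + 1)]
    X = np.hstack([np.ones((n_obs - p, 1))] + lags)
    Y = data[p:]
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)  # shape (1 + k*p, k)
    intercept = B[0]
    # coef[i] is the k x k matrix on lag i+1 (row = equation, column = lagged variable).
    coef = [B[1 + i * k:1 + (i + 1) * k].T for i in range(p)]
    resid = Y - X @ B
    return intercept, coef, resid

def companion(coef):
    """Stack the lag matrices into the companion matrix of the VAR(p)."""
    k, p = coef[0].shape[0], len(coef)
    top = np.hstack(coef)
    if p == 1:
        return top
    bottom = np.hstack([np.eye(k * (p - 1)), np.zeros((k * (p - 1), k))])
    return np.vstack([top, bottom])

def is_stable(coef):
    """Stable if every eigenvalue of the companion matrix has modulus below one."""
    return bool(np.all(np.abs(np.linalg.eigvals(companion(coef))) < 1))

def impulse_responses(coef, horizon):
    """out[h][i, j]: response of variable i at t+h to a unit shock in variable j at t."""
    k = coef[0].shape[0]
    C = companion(coef)
    power = np.eye(C.shape[0])
    out = []
    for _ in range(horizon + 1):
        out.append(power[:k, :k].copy())  # top-left block of C^h
        power = power @ C
    return np.array(out)

# Hypothetical example: variable 1 depends on lagged variable 2, but not the reverse.
A_true = np.array([[0.5, 0.2], [0.0, 0.3]])
data = simulate_var1(A_true, 2000)
intercept, coef, resid = fit_var(data, p=1)
irf = impulse_responses(coef, horizon=8)
print(np.round(coef[0], 2))
print("stable:", is_stable(coef))
```

With real data, `data` would hold the first-differenced (stationary) series in its columns. The responses in `irf` shrink towards zero as the horizon grows precisely when the stability condition holds.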

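
Lag-length selection by information criteria, as described in Section 6.2, can be sketched in the same way. The code below uses the textbook form of the criteria, $\ln|\hat\Sigma_p|$ plus a penalty on the $pK^2$ lag coefficients, and fits every candidate model on the same sample so that the values are comparable. It is an assumption-laden illustration: the data are simulated from hypothetical matrices `A1` and `A2`, the function names are invented here, and the values reported by STATA's varsoc may be scaled differently.

```python
import numpy as np

def residual_cov(data, p, max_lag):
    """ML residual covariance of a VAR(p) fitted by OLS on the common sample t >= max_lag."""
    n_obs, _ = data.shape
    Y = data[max_lag:]
    cols = [np.ones((n_obs - max_lag, 1))]
    cols += [data[max_lag - i:n_obs - i] for i in range(1, p + 1)]
    X = np.hstack(cols)
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ B
    return resid.T @ resid / len(Y)

def information_criteria(data, max_lag):
    """AIC, HQIC and SBIC for lag lengths 0..max_lag (smaller is better)."""
    n_obs, k = data.shape
    T = n_obs - max_lag  # effective sample size, identical for all p
    table = {}
    for p in range(max_lag + 1):
        log_det = np.log(np.linalg.det(residual_cov(data, p, max_lag)))
        n_par = p * k * k  # number of lag coefficients in the system
        table[p] = {
            "AIC": log_det + 2 * n_par / T,
            "HQIC": log_det + 2 * np.log(np.log(T)) * n_par / T,
            "SBIC": log_det + np.log(T) * n_par / T,
        }
    return table

# Hypothetical data: a stable bivariate VAR(2) with standard normal shocks.
rng = np.random.default_rng(1)
A1 = np.array([[0.4, 0.1], [0.0, 0.3]])
A2 = np.array([[0.3, 0.1], [0.0, 0.3]])
data = np.zeros((1000, 2))
for t in range(2, 1000):
    data[t] = A1 @ data[t - 1] + A2 @ data[t - 2] + rng.standard_normal(2)

table = information_criteria(data, max_lag=4)
best = {name: min(table, key=lambda p: table[p][name])
        for name in ("AIC", "HQIC", "SBIC")}
print(best)
```

Because the SBIC penalty grows with ln(T), it tends to pick shorter lag lengths than the AIC, which matches the disagreement between the criteria noted for Table 1.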