1. Introduction

The annual variation of rainfall in Southern Africa directly and indirectly affects human livelihoods and ecosystems through droughts, temperature changes, water supply problems and reduced agricultural production. Most of Southern Africa experiences austral summer rainfall (October to March), while the north-eastern to south-western regions experience austral winter rainfall (May to August) (). Summer and winter rainfall variability has been shown to be influenced by the El Niño-Southern Oscillation (ENSO) and to become nonlinear, with extreme weather conditions (Philippon et al., 2012; ; ). The temperature in South Africa has increased by more than one and a half times the globally observed temperature increase (; ; ).

El Niño describes the periodic warming of sea surface temperature, typically concentrated in the central-east equatorial Pacific (). La Niña is the term for the opposite phase of the fluctuation. The Niño 3.4 index is one of the key climate indices used to gauge the strength of El Niño and La Niña. Another driver that has been shown to affect weather patterns is the Quasi-Biennial Oscillation (QBO) (Begue et al., 2010). The QBO is a regular variation of the winds that blow high above the equator. Strong winds in the stratosphere travel in a belt around the planet, and about every 14 months these winds completely change direction (). Kane () showed that the warming of the climate and of sea surface temperatures has an impact on rainfall patterns. Owing to the influence of these and other climatic drivers, weather patterns become nonlinear and non-stationary, so that linear models are sometimes inconclusive (; ). Fourier-based methods assume that the data are linear and strictly periodic, which is not the case for climate data ().

Empirical mode decomposition (EMD), introduced by Huang et al. (), makes no assumptions about the linearity or stationarity of the time series and is therefore well suited to analysing climate data. The time series is decomposed into components at different time scales, called intrinsic mode functions (IMFs), which can reveal intrinsic changes in the climate system (; ). EMD suffers from mixing of signals between IMFs; to address this, a noise-assisted method named Ensemble Empirical Mode Decomposition (EEMD) was introduced ().

EEMD has been applied widely in hydrological and atmospheric studies. Chiew et al. () applied EMD to annual streamflow to find significant oscillations in the data. Twenty unimpaired catchments from different parts of the world were used, and 3, 6–7, 11–15 and 20–25 year oscillations in the streamflows were identified. In another study, climate variability in East and Central Africa was demonstrated by mapping the coupling between precipitation variables, inter-annual vegetation changes, ENSO and the Indian Ocean Dipole (IOD). EEMD was adopted because of its ability to break down normalised difference vegetation index (NDVI) series into components at multiple time scales and because it can be combined with other time series analysis tools (, ). The Pacific Decadal Oscillation (PDO) is known to influence East China’s annual and summer rainfall. Using EEMD, the influence of the PDO on monthly rainfall was identified (). The intrinsic oscillations in the land surface temperature of Wuhan, China, were revealed using EEMD by decomposing the data into annual, inter-annual, noise and trend components (). In South Korea several climate drivers have been shown to affect precipitation; however, many studies did not take into consideration the inherent cycles in long-term precipitation. Using EEMD, cross-correlation and multiple linear regression, the influence of ENSO, QBO, the Arctic Oscillation, the Atlantic Meridional Mode and others was shown at the monthly level (). Other notable applications of EEMD include finding the effects of temperature and precipitation trends on Plateau droughts () and identifying the variability of monthly precipitation in Iran ().

In this study EEMD is used to decompose 38 years of rainfall and temperature data for a selected region in the Western Cape, South Africa, to reveal underlying physical signals. The influence of ENSO, QBO and temperature on the rainfall pattern is identified. The selected region has recently faced severe water challenges. Interest in the winter rainfall has grown, mainly because of the threat of “day zero” in 2018 (; ; ). However, very few studies have used statistical analysis to investigate rainfall variability during winter. This is the first study that uses EEMD and synchronisation to investigate the influence of ENSO and QBO on the rainfall pattern of the region. This paper is organised as follows: Section 2 describes the data set and methodology, Section 3 presents the analysis and discussion of the results, and Section 4 concludes.

2. Data and Method

2.1 Data

The average monthly rainfall and temperature data were obtained from the South African Weather Service (SAWS) for the period 1980 to 2018 for an area between 18.2–19.2°E and 33.5–34.5°S, which contains 11 weather stations as shown in Figure 1. The selected area receives winter rainfall from April to August. The Niño 3.4 and QBO data are publicly available on the Climate Explorer website (http://climexp.knmi.nl) ().

Figure 1 

Study area (square) in the Western Cape, South Africa, with 10 weather stations.

Rainfall and temperature records from weather stations often contain missing values. The multivariate imputation by chained equations (MICE) method is therefore used to impute the missing data (). MICE was chosen for this study because it does not assume that the data are normally distributed and it assumes that the data are missing at random (MAR). The R package ‘mice’ developed by Van Buuren & Groothuis-Oudshoorn () is used for the imputation.

2.2 Ensemble Empirical Mode Decomposition

Empirical Mode Decomposition (EMD) is an adaptive time-frequency data representation technique, which requires only that the data consist of simple intrinsic modes of oscillation (). It is well suited to nonlinear and non-stationary data. The EMD methodology is based on a sifting process, which identifies local extrema (maxima and minima) and results in the formation of intrinsic mode functions (IMFs). To decompose a given time series $x_t$ into IMFs, the following algorithm is used:

  1. Construct the upper envelope $x_t^{\max}$ and the lower envelope $x_t^{\min}$ by connecting all the maxima and all the minima of $x_t$, respectively, via cubic spline interpolation.
  2. Compute $\Delta x_t = x_t - \frac{x_t^{\max} + x_t^{\min}}{2}$.
  3. Repeat steps 1 and 2 on $\Delta x_t$ until the resulting signal satisfies two properties: the number of extrema equals the number of zero crossings (or differs from it by at most one), and the mean of the upper and lower envelopes is zero at every point. Denote the resulting signal by $h_t^{(1)}$, which is the first IMF.
  4. Take the difference $x_t^{(1)} = x_t - h_t^{(1)}$ and repeat steps 1 to 3 on it to obtain the second IMF $h_t^{(2)}$ (; Guo et al., 2016).
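The sifting steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function names (`cubic_spline`, `local_extrema`, `sift`, `emd`) are hypothetical, both envelopes are pinned to the end points of the series as a simple boundary treatment, and sifting stops when the energy of the envelope mean falls below a small tolerance rather than when it is exactly zero.

```python
import bisect
import math


def cubic_spline(xs, ys):
    """Return a callable evaluating the natural cubic spline through (xs, ys)."""
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]
    mu, z = [0.0] * (n + 1), [0.0] * (n + 1)
    for i in range(1, n):  # forward sweep of the tridiagonal system
        alpha = 3 * (ys[i + 1] - ys[i]) / h[i] - 3 * (ys[i] - ys[i - 1]) / h[i - 1]
        diag = 2 * (xs[i + 1] - xs[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / diag
        z[i] = (alpha - h[i - 1] * z[i - 1]) / diag
    b, c, d = [0.0] * n, [0.0] * (n + 1), [0.0] * n
    for j in range(n - 1, -1, -1):  # back substitution
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (ys[j + 1] - ys[j]) / h[j] - h[j] * (c[j + 1] + 2 * c[j]) / 3
        d[j] = (c[j + 1] - c[j]) / (3 * h[j])

    def evaluate(x):
        j = min(max(bisect.bisect_right(xs, x) - 1, 0), n - 1)
        dx = x - xs[j]
        return ys[j] + dx * (b[j] + dx * (c[j] + dx * d[j]))

    return evaluate


def local_extrema(x):
    """Indices of strict interior local maxima and minima of the list x."""
    idx = range(1, len(x) - 1)
    maxima = [i for i in idx if x[i - 1] < x[i] > x[i + 1]]
    minima = [i for i in idx if x[i - 1] > x[i] < x[i + 1]]
    return maxima, minima


def sift(x, max_iter=50, tol=1e-3):
    """Steps 1 to 3: repeatedly subtract the envelope mean to get one IMF."""
    h, last = list(x), len(x) - 1
    for _ in range(max_iter):
        maxima, minima = local_extrema(h)
        if len(maxima) + len(minima) < 3:
            break
        # Assumption: both envelopes are pinned to the series end points.
        up = [0] + maxima + [last]
        lo = [0] + minima + [last]
        upper = cubic_spline(up, [h[i] for i in up])
        lower = cubic_spline(lo, [h[i] for i in lo])
        mean = [(upper(t) + lower(t)) / 2 for t in range(len(h))]
        h = [v - m for v, m in zip(h, mean)]
        # Assumption: stop once the envelope mean is negligible.
        if sum(m * m for m in mean) <= tol * sum(v * v for v in h):
            break
    return h


def emd(x, max_imfs=8):
    """Step 4: peel off IMFs until the remainder has too few extrema."""
    imfs, residual = [], list(x)
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residual)
        if len(maxima) + len(minima) < 3:
            break
        imf = sift(residual)
        imfs.append(imf)
        residual = [r - v for r, v in zip(residual, imf)]
    return imfs, residual


# Synthetic monthly-style series: annual cycle, 5-year cycle and a trend.
x = [math.sin(2 * math.pi * k / 12) + 0.5 * math.sin(2 * math.pi * k / 60)
     + 0.01 * k for k in range(240)]
imfs, residual = emd(x)
```

Because each IMF is subtracted from the running remainder, the IMFs and the residual add back to the input series up to floating-point error, whatever stopping rule is chosen.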

The first IMF contains the highest-frequency fluctuations; it is subtracted from the original data, and subsequent IMFs are derived from what remains. A time series is thus decomposed into IMFs $h_t^{(i)}$ ($i = 1, 2, \ldots, n$) and a residual $r_t$, whose sum approximates the original data ():

(2.1)
$$x_t \approx \sum_{i=1}^{n} h_t^{(i)} + r_t.$$

EMD suffers from mode mixing, where signals from one IMF are found in another. A noise-assisted method, Ensemble Empirical Mode Decomposition (EEMD), which adds white noise to the data before carrying out the EMD algorithm, was therefore introduced (). For a given time series $x_t$, an ensemble of $m$ white noise series, $\varepsilon_j$ ($j = 1, 2, \ldots, m$), is added to each data point $x_i$, such that the $i$th “artificial” value becomes

(2.2)
$$y_i = \varepsilon_j + x_i.$$

An average of the IMFs found from the data with noise becomes the final IMF, that is

(2.3)
$$c_t^{(i)} = \frac{1}{m} \sum_{j=1}^{m} d_t^{(j)}$$

where $d_t^{(j)}$ is the IMF of the $j$th noise-added time series $y_t$ and $c_t^{(i)}$ is the final $i$th IMF of the original time series $x_t$. The average of the residuals from $y_t$ gives the final residual, that is,

(2.4)
$$r_t = \frac{1}{m} \sum_{j=1}^{m} r_t^{(j)}$$

where $r_t^{(j)}$ are the residuals from the time series with added noise.
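The ensemble-averaging step of Equations (2.2) to (2.4) can be sketched as below. The sketch is illustrative only and rests on several assumptions: `eemd`, `noise_sd` and `seed` are hypothetical names, the noise is Gaussian, and `decompose` stands for any EMD routine that returns a fixed number of IMFs plus a residual (a fixed count is needed so that IMFs can be averaged across ensemble members). To keep the example self-contained, `toy_decompose` is a moving-average stand-in and is not EMD itself.

```python
import math
import random


def eemd(x, decompose, n_imfs, m=100, noise_sd=0.2, seed=0):
    """Average IMFs and residuals over m noise-perturbed copies of x.

    `decompose(y, n_imfs)` is a hypothetical routine assumed to return
    exactly n_imfs components plus a residual, each as long as y.
    """
    rng = random.Random(seed)
    n = len(x)
    imf_sum = [[0.0] * n for _ in range(n_imfs)]
    res_sum = [0.0] * n
    for _ in range(m):
        y = [xi + rng.gauss(0.0, noise_sd) for xi in x]   # Eq. (2.2)
        imfs, res = decompose(y, n_imfs)
        for k in range(n_imfs):
            for t in range(n):
                imf_sum[k][t] += imfs[k][t]
        for t in range(n):
            res_sum[t] += res[t]
    c = [[v / m for v in row] for row in imf_sum]          # Eq. (2.3)
    r = [v / m for v in res_sum]                           # Eq. (2.4)
    return c, r


def toy_decompose(y, n_imfs):
    """Stand-in for EMD (NOT EMD itself): peel off moving-average details."""
    comps, current = [], list(y)
    for _ in range(n_imfs):
        padded = [current[0]] + current + [current[-1]]
        smooth = [(padded[t] + padded[t + 1] + padded[t + 2]) / 3
                  for t in range(len(current))]
        comps.append([a - b for a, b in zip(current, smooth)])
        current = smooth
    return comps, current


x = [math.sin(2 * math.pi * t / 12) for t in range(120)]
c, r = eemd(x, toy_decompose, n_imfs=3, m=100, noise_sd=0.2, seed=1)
```

Averaging over the ensemble cancels most of the added noise, so the averaged components and residual sum to a series close to the original; the remaining difference shrinks as the ensemble size m grows.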

The IMFs should be mutually orthogonal; the closer they are to orthogonal, the less information leaks between them. Orthogonality is measured by the index of orthogonality (IO), given by

(2.5)
$$IO = \frac{1}{n} \sum_{t} \frac{1}{x_t^{2}} \sum_{i=1}^{n+1} \sum_{j=1}^{n+1} h_t^{(i)} h_t^{(j)}$$

where $i$ and $j$ index the $i$th and $j$th IMFs and $n$ is the size of the IMF ().
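A small sketch of an orthogonality index is given below. It is an illustrative variant rather than the exact expression above: it assumes that only cross terms between different components (i ≠ j) are counted, and it normalises by the total energy of the reconstructed series instead of dividing point by point, which avoids division by values of the series close to zero. The function name `index_of_orthogonality` is hypothetical.

```python
import math


def index_of_orthogonality(components):
    """Cross-term energy between components relative to total signal energy.

    `components` is a list of equal-length lists: the IMFs followed by the
    residual. Assumption: only pairs with i != j contribute.
    """
    n_t = len(components[0])
    x = [sum(comp[t] for comp in components) for t in range(n_t)]
    cross = 0.0
    for t in range(n_t):
        for i, ci in enumerate(components):
            for j, cj in enumerate(components):
                if i != j:
                    cross += ci[t] * cj[t]
    return cross / sum(v * v for v in x)


# Sine and cosine over whole periods are orthogonal, so the index is near 0.
s = [math.sin(2 * math.pi * t / 20) for t in range(200)]
k = [math.cos(2 * math.pi * t / 20) for t in range(200)]
io_orthogonal = index_of_orthogonality([s, k])

# Two identical components overlap completely and give 0.5 under this variant.
io_duplicate = index_of_orthogonality([s, s])
```

Under this variant a value close to zero indicates little leakage between components, in line with the interpretation used in the text.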

2.3 Synchronisation

Synchronisation of coupled oscillating systems means the appearance of certain relations between their phases and frequencies (). Here this concept is used to reveal the interaction between rainfall and other climatic drivers. The R package ‘synchrony’, which measures phase synchrony between quasi-periodic time series, is used (). Time series that are phase synchronised, or phase locked, exhibit a modal distribution of phase differences with a prominent peak at a given phase difference, whereas unrelated time series are characterised by a uniform or diffuse distribution.
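The idea of a phase-difference histogram can be illustrated with the sketch below. It is not the implementation of the ‘synchrony’ package: it assumes a simple peak-based phase in which the phase of each series rises linearly by 2π between successive local maxima, and the names `peak_phase` and `phase_difference_histogram` are hypothetical.

```python
import math


def peak_phase(x):
    """Phase in radians, rising linearly by 2*pi between successive maxima.

    Points before the first or after the last local maximum get None.
    """
    peaks = [i for i in range(1, len(x) - 1) if x[i - 1] < x[i] >= x[i + 1]]
    phase = [None] * len(x)
    for k in range(len(peaks) - 1):
        a, b = peaks[k], peaks[k + 1]
        for t in range(a, b):
            phase[t] = 2 * math.pi * (k + (t - a) / (b - a))
    return phase


def phase_difference_histogram(x, y, bins=16):
    """Counts of phase differences (mod 2*pi) falling in equal-width bins."""
    counts = [0] * bins
    for px, py in zip(peak_phase(x), peak_phase(y)):
        if px is None or py is None:
            continue
        d = (px - py) % (2 * math.pi)
        counts[min(int(d / (2 * math.pi) * bins), bins - 1)] += 1
    return counts


# Same period with a fixed lag: the differences pile up in a single bin.
x = [math.sin(2 * math.pi * t / 32) for t in range(320)]
y = [math.sin(2 * math.pi * (t - 5) / 32) for t in range(320)]
locked = phase_difference_histogram(x, y)

# Different periods: the phase difference drifts and the histogram is diffuse.
z = [math.sin(2 * math.pi * t / 20) for t in range(320)]
drifting = phase_difference_histogram(x, z)
```

In the first case the histogram has one prominent peak, which is the signature of phase locking described above; in the second the counts are spread across the bins.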

3. Results and Discussion

The average monthly rainfall for the region located between 18.2–19.2°E and 33.5–34.5°S was standardised for ease of computation and comparison. Multivariate imputation by chained equations was used to impute the missing rainfall and temperature data. The data were decomposed until the residual contained at most one maximum and one minimum. From the standardised rainfall data, 7 IMFs were found; they are shown in Figure 2. IMF 3 has a period of about 12 months, which corresponds to an annual (seasonal) oscillation; IMF 4 has a period of about 26 months, which approximates a 2-year oscillation; IMF 5 has a period of about 54 months (4.5 years); and IMF 6 captures a quasi-decadal oscillation (7-year period). Previous studies using wavelet analysis identified oscillations in South African rainfall similar to those found in this study (; ). In these studies, winter rainfall was found to have significant 2–3 year and 3–4 year periods, which contributed to the rainfall variability. Compared with wavelet analysis, EEMD is adaptive and intuitive and does not require predefined basis functions. Additionally, the impact of different climate drivers at different time scales can be shown. The graph on the bottom right of Figure 2 is the residual plot, which shows the general trend of the rainfall. Maúre et al. () used several climate models to predict the rainfall trend over Southern Africa under global warming; their study pointed to decreasing daily rainfall for the Western Cape, which agrees with the obtained residual plot.

Figure 2 

Decomposition of rainfall into IMF1 to IMF7 and a residual using EEMD. The x-axis represents the time in years and the y-axis represents the frequency. The graphs are labelled on the y-axis from IMF1 (first graph on the left) to RES (last graph on the bottom right), which is the graph of the residual. IMF1 captures the noise found in the rainfall data, IMF2 an inter-annual oscillation, IMF3 the annual oscillation, IMF4 a 2-year oscillation, IMF5 a 4.5-year oscillation, IMF6 a 7-year oscillation and IMF7 a 16.5-year oscillation. The plot of the residual shows the general trend of the rainfall.

Each IMF is approximately normally distributed. The IMFs and the residual are added together to reconstruct the data. The reconstructed data approximate the original data with a root mean square error of the order of $10^{-14}$. This shows that the decomposition is essentially lossless: very little information is lost, and most of the oscillations in the data are captured.

The maximum value of orthogonality between the IMFs is approximately 0.001, which is well below the acceptable threshold of 0.1. The index of orthogonality for the IMFs is $0.594 \times 10^{-4}$ for rainfall, $0.154 \times 10^{-6}$ for Niño 3.4 and $0.235 \times 10^{-5}$ for QBO. This confirms that there is little information leakage between the IMFs.

The cross-correlation of rainfall with QBO and with Niño 3.4 shows no correlation, as seen in the cross-correlation plot in Figure 3. However, when the time series are decomposed, correlation is identified for IMF 3 with both Niño 3.4 and QBO, as shown in Figure 4. Rainfall’s IMF3 is correlated with the Niño 3.4 index at lags –4, –5, –6, 2 and 3. These results confirm the influence of ENSO, and also of the QBO, on the seasonal rainfall, which is consistent with the results of Philippon et al. () and Kane (). Additionally, the general pattern of the rainfall at different time scales is identified up to the quasi-decadal oscillation.
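For readers unfamiliar with lagged cross-correlation, a minimal sketch follows. It assumes one common sign convention, in which the value at lag k correlates x at time t + k with y at time t, and it uses the usual approximate 95% bound of ±1.96/√N for judging significance; the function name `cross_correlation` is hypothetical and the series are synthetic.

```python
import math


def cross_correlation(x, y, max_lag):
    """Sample cross-correlation between x[t + k] and y[t] for each lag k."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((v - mx) ** 2 for v in x))
    sy = math.sqrt(sum((v - my) ** 2 for v in y))
    ccf = {}
    for k in range(-max_lag, max_lag + 1):
        # Only use time points where both x[t + k] and y[t] exist.
        s = sum((x[t + k] - mx) * (y[t] - my)
                for t in range(max(0, -k), min(n, n - k)))
        ccf[k] = s / (sx * sy)
    return ccf


# Synthetic example: x repeats y three months later.
y = [math.sin(2 * math.pi * t / 12) for t in range(240)]
x = [math.sin(2 * math.pi * (t - 3) / 12) for t in range(240)]
ccf = cross_correlation(x, y, max_lag=6)

best_lag = max(ccf, key=ccf.get)      # lag with the largest correlation
bound = 1.96 / math.sqrt(len(x))      # approximate 95% significance bound
significant = [k for k, r in ccf.items() if abs(r) > bound]
```

Here the largest correlation occurs at the lag that matches the three-month shift built into the synthetic series, and lags whose correlation exceeds the bound in absolute value would appear as spikes outside the dotted lines in a plot such as Figure 3.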

Figure 3 

Cross-correlation of rainfall with Niño 3.4 and QBO. The x-axis represents the lag in months and the y-axis represents the correlation. Spikes above or below the dotted blue lines indicate significant correlation. The left graph shows correlation between rainfall and the Niño 3.4 index at lags –14 to –19. The right graph shows no correlation between rainfall and QBO. Overall, the raw data show no clear association of ENSO or QBO with rainfall.

Figure 4 

Cross-correlation of IMF3 with Niño 3.4 and QBO. The x-axis represents the lag in months and the y-axis represents the correlation. Spikes above or below the blue lines indicate significant correlation. The left graph shows correlation between rainfall’s IMF3 and the Niño 3.4 index at lags –4, –5, –6, 2 and 3. The right graph shows correlation between rainfall’s IMF3 and QBO at lags –1 and –2.

Cross-correlation can be used only when the time series are stationary. The Augmented Dickey-Fuller (ADF) test and the Phillips-Perron unit root test show that IMFs 4 to 7 are not stationary, so cross-correlation cannot be used for them. Synchronisation does not require the time series to be stationary, hence it was used to identify any relationship for those IMFs. The IMFs were synchronised with Niño 3.4 and QBO, and the results are shown in Figure 5. The results show weak coupling between the original rainfall time series and Niño 3.4 or QBO. However, phase locking is identified between IMF 5 of rainfall and Niño 3.4, since there is a clear peak. This result suggests that ENSO may influence the rainfall pattern. It agrees with Philippon et al. (), who found a significant association between ENSO and winter rainfall and showed that the correlation with ENSO is stronger for the May-June-July season than for any other period of the year. In this study, using EEMD, the association is examined at the monthly rather than the seasonal level. The clear peak in the histogram of rainfall’s IMF6 and QBO (Figure 5) shows that there is phase locking. This QBO signal agrees with the study by Kane (), which also identified the presence of QBO and showed that it contributed significantly to the variability of winter rainfall in the same region.

Figure 5 

Histogram plot of the synchronisation of rainfall, QBO and Niño 3.4 index IMFs. The x-axis represents the phase difference in radians and the y-axis represents the frequency. The plots show a clear peak for rainfall IMF5 with Niño 3.4 IMF5 (top right), which indicates phase locking. The last graph on the bottom right is a plot of rainfall’s IMF6 and QBO IMF6, which also indicates phase locking since there is a clear peak.

Temperature for the selected region was also decomposed into 7 IMFs and a residual. There is weak coupling between the original rainfall time series and temperature, as shown in Figure 6. However, phase locking is observed for IMF1 and IMF5, since there is a clear peak for both. IMF1 captures the noise in the signal and IMF5 is a 4.5-year oscillation. The phase locking in IMF1 suggests that temperature variability may have an impact on rainfall patterns. The synchronisation of IMF5 suggests that temperature changes may have a long-term impact on rainfall variability. This is consistent with other studies that have used different climate models to assess the impact of increasing temperature (; ; ). These models predict an increase in extreme rainfall patterns. In this study, the direct impact of increasing temperature has been shown using historical data.

Figure 6 

Histogram plot of the synchronisation of rainfall IMFs and temperature IMFs. The x-axis represents the phase difference in radians and the y-axis represents the frequency. The plots show a clear peak for rainfall and temperature for IMF1 (second graph on top) and IMF5 (bottom right), which indicates phase locking for these IMFs. IMF1 captures the noise in the data and IMF5 captures the 4.5-year oscillation in the data.

4. Conclusions

The effectiveness of EEMD for analysing nonlinear and non-stationary data was demonstrated. The rainfall and temperature data were decomposed into IMFs and a residual, which sum to the original data. The IMFs can be used with other methods, such as regression and neural networks, to predict the future impact of climate drivers. EEMD was effective in separating the data into different time scales, so that the variability of the rainfall pattern could be identified and evidence of the effect of ENSO and QBO provided. Cross-correlation and phase synchronisation were used to find the relationships between the IMFs of the different time series under study. Future studies could cover a longer period to identify the pattern of rainfall over decades.

Additional Files

The additional files for this article can be found as follows:

Niño 3.4 Index

A measurement of the strength of El Niño and La Niña. DOI: https://doi.org/10.5334/dsj-2019-046.s1

QBO Data

Measurement of the variation of the wind above the equator. DOI: https://doi.org/10.5334/dsj-2019-046.s2

Rainfall Data

Original rainfall data for the study area and the decomposed rainfall into IMFs. DOI: https://doi.org/10.5334/dsj-2019-046.s3

Temperature Data

Original temperature data for the study area and the decomposed temperature into IMFs. DOI: https://doi.org/10.5334/dsj-2019-046.s4