Climatic variables such as rainfall and temperature are nonlinear and non-stationary, so analysing them with linear methods yields inconclusive results. Ensemble empirical mode decomposition (EEMD) is a data-adaptive method that is well suited to data with these characteristics. The average monthly rainfall and temperature data for a selected region in South Africa are decomposed into intrinsic mode functions (IMFs) at different time scales using EEMD. The IMFs exhibit inter-annual to inter-decadal variability. The influence of climatic oscillations such as the El Niño Southern Oscillation (ENSO) and the quasi-biennial oscillation (QBO) is identified. The influence of temperature variability on rainfall is also shown at different time scales. The results indicate that EEMD is suitable for identifying different oscillations in rainfall and temperature data.

The annual variation of rainfall in Southern Africa directly and indirectly affects human livelihoods and ecosystems through droughts, temperature changes, water supply problems and reduced agricultural production. Most of Southern Africa experiences austral summer rainfall (October to March), and the north-eastern to south-western regions experience austral winter rainfall (May to August) (

El Niño describes the warming of sea surface temperature that occurs periodically, typically concentrated in the central-east equatorial Pacific (

Empirical mode decomposition (EMD) was introduced by Huang

EEMD has been applied widely in hydrological and atmospheric studies. Chiew

In this study, EEMD is used to decompose 38 years of rainfall and temperature data for a selected region in the Western Cape, South Africa, to reveal the underlying physical signals. The influence of ENSO, QBO and temperature on the rainfall pattern is identified. The selected region has faced severe water challenges recently. General interest in the winter rainfall has grown, mainly because of the threat of “day zero” in 2018 (

The average monthly rainfall and temperature data were obtained from the South African Weather Service (SAWS) for the period 1980 to 2018 for an area between 18.2–19.2°E and 33.5–34.5°S, which contains 11 weather stations as shown in Figure

Study area (square) in the Western Cape, South Africa, with 10 weather stations.

Rainfall and temperature records from weather stations often contain missing values. The multivariate imputation by chained equations (MICE) method is therefore used to impute the missing data (
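The chained-equations idea can be illustrated with a deliberately reduced sketch. It is not the MICE procedure itself: full MICE draws imputed values at random from a fitted model and produces several completed data sets, whereas the sketch below is a deterministic, single-imputation simplification for two variables. The function names `fit_line` and `chained_impute` and the use of `None` for missing values are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def chained_impute(a, b, n_iter=10):
    """Fill None entries of a and b by alternating regressions.

    Simplification (assumption): deterministic predictions, one completed
    data set, two variables only.
    """
    miss_a = [v is None for v in a]
    miss_b = [v is None for v in b]
    obs_a = [v for v in a if v is not None]
    obs_b = [v for v in b if v is not None]
    # Start from mean imputation.
    fa = [sum(obs_a) / len(obs_a) if m else v for v, m in zip(a, miss_a)]
    fb = [sum(obs_b) / len(obs_b) if m else v for v, m in zip(b, miss_b)]
    for _ in range(n_iter):
        # Regress a on b over rows where a was observed, then refill a.
        slope, icpt = fit_line([y for y, m in zip(fb, miss_a) if not m], obs_a)
        fa = [icpt + slope * y if m else v for v, y, m in zip(fa, fb, miss_a)]
        # Regress b on a over rows where b was observed, then refill b.
        slope, icpt = fit_line([x for x, m in zip(fa, miss_b) if not m], obs_b)
        fb = [icpt + slope * x if m else v for v, x, m in zip(fb, fa, miss_b)]
    return fa, fb
```

Each variable is repeatedly predicted from the current completed version of the other, which is the "chaining" that gives the method its name.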

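Empirical Mode Decomposition (EMD) is an adaptive time-frequency data representation technique, which requires only that the data consist of simple intrinsic modes of oscillation (

For a time series $x_t$, an IMF is extracted by the following sifting steps:

(i) Construct the upper and lower envelopes of $x_t$ from its local maxima and minima.

(ii) Compute the mean of the upper and lower envelopes, $m_t$, and the candidate component $h_t = x_t - m_t$.

(iii) Repeat steps (i) and (ii) for $h_t$ until $h_t$ satisfies the conditions of an IMF.

(iv) Take the difference $r_t = x_t - h_t$ and repeat the procedure on $r_t$ to extract the next IMF.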
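A single sifting pass of the procedure above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the envelopes are built by linear interpolation rather than the cubic splines normally used in EMD, the first and last samples are treated as envelope knots, and the names `local_extrema`, `envelope` and `sift_once` are hypothetical.

```python
def local_extrema(x):
    """Return indices of interior local maxima and minima of a list."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(x, knots):
    """Piecewise-linear envelope through x at the given knot indices.

    Assumption: linear interpolation replaces the usual cubic spline, and
    the first and last samples are added as end knots.
    """
    pts = sorted(set([0] + knots + [len(x) - 1]))
    env = []
    for a, b in zip(pts, pts[1:]):
        for i in range(a, b):
            w = (i - a) / (b - a)
            env.append((1 - w) * x[a] + w * x[b])
    env.append(x[pts[-1]])
    return env


def sift_once(x):
    """One sifting pass: subtract the mean of the two envelopes from x."""
    maxima, minima = local_extrema(x)
    upper = envelope(x, maxima)
    lower = envelope(x, minima)
    mean = [(u + l) / 2 for u, l in zip(upper, lower)]
    return [xi - mi for xi, mi in zip(x, mean)]
```

Repeating `sift_once` on its own output until the result meets the IMF conditions yields one IMF; subtracting that IMF from the data and sifting the remainder yields the next.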

The first IMF contains the highest-frequency fluctuations. It is subtracted from the original data, and subsequent IMFs are derived from the remainder. The IMFs and the residual approximate the original data when they are summed together ($x_t = \sum_{i=1}^{n} c_{i,t} + r_t$, with $c_{i,t}$ the $i$th of $n$ IMFs and $r_t$ the residual).

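EMD suffers from mode mixing, where a signal belonging to one IMF appears in another. A noise-assisted method, Ensemble Empirical Mode Decomposition (EEMD), which adds white noise to the data before carrying out the EMD algorithm, was therefore introduced (

In each trial $j = 1, \dots, N$, a white noise series $w_{j,t}$ is added to $x_t$ and the noisy series $x_t + w_{j,t}$ is decomposed by EMD, giving $c_{ij,t}$, the $i$th IMF of the $j$th trial.

An average of the IMFs found from the data with noise becomes the final IMF, that is

$$c_{i,t} = \frac{1}{N} \sum_{j=1}^{N} c_{ij,t},$$

where $c_{i,t}$ is the $i$th IMF for the original time series $x_t$ and $N$ is the number of trials.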

where
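The ensemble step of EEMD (add white noise, decompose, average over trials) can be sketched independently of any particular EMD routine. In the sketch below, `decompose` is a hypothetical callable that maps a series to a list of IMFs; the defaults for `n_trials` and `noise_sd` are illustrative assumptions, not values taken from the study.

```python
import random


def eemd_average(x, decompose, n_trials=100, noise_sd=0.2, seed=0):
    """Average IMFs over noisy copies of x.

    decompose: hypothetical callable mapping a series to a list of IMFs,
               each the same length as the series.
    noise_sd:  noise standard deviation as a fraction of the standard
               deviation of x (illustrative default).
    """
    rng = random.Random(seed)
    n = len(x)
    mean_x = sum(x) / n
    sd_x = (sum((v - mean_x) ** 2 for v in x) / n) ** 0.5
    sums = []  # running sums, one list per IMF index
    for _ in range(n_trials):
        noisy = [v + rng.gauss(0.0, noise_sd * sd_x) for v in x]
        for i, imf in enumerate(decompose(noisy)):
            if i == len(sums):
                sums.append([0.0] * n)
            for t in range(n):
                sums[i][t] += imf[t]
    # A trial that yields fewer IMFs contributes zero to the missing ones.
    return [[s / n_trials for s in row] for row in sums]
```

Because the added noise is independent between trials, it largely cancels in the average while the signal common to all trials is retained.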

The IMFs must be mutually orthogonal. Higher orthogonality corresponds to less information leakage. Orthogonality is measured by the index of orthogonality (IO), which is given by

$$\mathrm{IO} = \sum_{t} \left( \sum_{j} \sum_{k \neq j} \frac{c_{j,t}\, c_{k,t}}{x_t^{2}} \right),$$

where $c_{j,t}$ and $c_{k,t}$ are the $j$th and $k$th IMFs.
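As a rough illustration, an index of orthogonality can be computed as the sum of cross-products between all pairs of distinct components, divided by the total energy of the original series. Normalising by the total energy, rather than sample by sample, is an assumption made here to avoid dividing by near-zero samples; the function name `index_of_orthogonality` is hypothetical.

```python
def index_of_orthogonality(imfs, x):
    """Cross-products between distinct components over total energy of x.

    imfs: list of components (IMFs, optionally with the residual).
    x:    the original series.
    Assumption: normalisation by the total energy sum(x_t ** 2).
    """
    cross = 0.0
    for j in range(len(imfs)):
        for k in range(len(imfs)):
            if j != k:
                cross += sum(a * b for a, b in zip(imfs[j], imfs[k]))
    energy = sum(v * v for v in x)
    return cross / energy
```

A value close to zero indicates that the components share little information; two identical components would give 0.5.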

Synchronisation of coupled oscillating systems means the appearance of certain relations between their phases and frequencies (

The average monthly rainfall for the region located between 18.2–19.2°E and 33.5–34.5°S was standardised for ease of computation and comparison. Multivariate imputation by chained equations was used to impute the missing rainfall and temperature data. The data were decomposed until the residual had at most one maximum and one minimum. From the standardised rainfall data, 7 IMFs were found and are shown in Figure

Decomposition of rainfall into IMF1 to 7 and residual using EEMD. The
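The standardisation and the stopping rule described for the decomposition can be sketched as below. This is a minimal illustration: the use of the sample standard deviation (divisor n - 1) and the names `standardise`, `count_extrema` and `is_residual` are assumptions, since the text does not specify them.

```python
def standardise(x):
    """Subtract the mean and divide by the sample standard deviation."""
    n = len(x)
    mean = sum(x) / n
    sd = (sum((v - mean) ** 2 for v in x) / (n - 1)) ** 0.5
    return [(v - mean) / sd for v in x]


def count_extrema(x):
    """Count strict interior local maxima and minima."""
    maxima = sum(1 for i in range(1, len(x) - 1) if x[i - 1] < x[i] > x[i + 1])
    minima = sum(1 for i in range(1, len(x) - 1) if x[i - 1] > x[i] < x[i + 1])
    return maxima, minima


def is_residual(x):
    """Stopping rule: at most one maximum and one minimum remain."""
    maxima, minima = count_extrema(x)
    return maxima <= 1 and minima <= 1
```

A decomposition loop would keep extracting IMFs from the remainder until `is_residual` returns `True`.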

The probability density function of each IMF is approximately normal. The IMFs and the residual are added together to reconstruct the data. The reconstructed data approximate the original data with a root mean square error of order $10^{-14}$. This shows that the decomposition is effectively lossless: minimal information is lost, and most of the oscillations in the data are captured.
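The reconstruction check amounts to summing the components and comparing the result with the original series. A minimal sketch, with the hypothetical function name `reconstruction_rmse`:

```python
def reconstruction_rmse(x, imfs, residual):
    """Root mean square error between x and the sum of IMFs plus residual."""
    n = len(x)
    recon = [sum(imf[t] for imf in imfs) + residual[t] for t in range(n)]
    return (sum((a - b) ** 2 for a, b in zip(x, recon)) / n) ** 0.5
```

An error at the level of floating-point rounding indicates that the components together retain essentially all of the original signal.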

The maximum orthogonality value between pairs of IMFs is approximately 0.001, well below the acceptable threshold of 0.1. The index of orthogonality for the IMFs is $0.594 \times 10^{-4}$ for rainfall, $0.154 \times 10^{-6}$ for Niño 3.4 and $0.235 \times 10^{-5}$ for QBO. This confirms that there is little information leakage.

The cross-correlation of rainfall with QBO and Niño 3.4 shows no correlation, as shown in the auto-correlation function (ACF) plot in Figure

Cross-correlation for rainfall with Niño 3.4 and QBO. The

Cross-correlation for IMF3 with Niño 3.4 and QBO. The
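A lagged sample cross-correlation of the kind plotted in these figures can be sketched as follows. The lag convention (correlating `x[t + lag]` with `y[t]`), the approximate 95% bound of 1.96 divided by the square root of the series length, and the function names are assumptions made for illustration.

```python
def cross_correlation(x, y, lag):
    """Sample cross-correlation between x[t + lag] and y[t].

    Assumption: both series have the same length, and the full-series
    means and norms are used at every lag.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = sum((v - mx) ** 2 for v in x) ** 0.5
    sy = sum((v - my) ** 2 for v in y) ** 0.5
    if lag >= 0:
        pairs = zip(x[lag:], y[:n - lag])
    else:
        pairs = zip(x[:n + lag], y[-lag:])
    return sum((a - mx) * (b - my) for a, b in pairs) / (sx * sy)


def significance_bound(n):
    """Approximate 95% bound for a zero correlation, series length n."""
    return 1.96 / n ** 0.5
```

Values that stay inside plus or minus `significance_bound(n)` at all lags are consistent with no linear relationship between the two series.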

Cross-correlation can be used only when the time series are stationary. The Augmented Dickey-Fuller (ADF) test and the Phillips-Perron unit root test show that IMFs 4 to 7 are not stationary, so cross-correlation cannot be used for them. Synchronisation does not require the time series to be stationary, hence it was used to identify any relationship for those IMFs. These IMFs are further synchronised with Niño 3.4 and QBO, and the results are shown in Figure

Histogram plot of synchronisation of rainfall, QBO and Niño 3.4 index IMFs. The
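One common way to examine phase synchronisation between two oscillatory components is to estimate their instantaneous phases from the analytic signal and inspect the distribution of the phase differences. The sketch below is an assumption about how such an analysis could be set up, since the estimator used for the histograms is not specified here. The analytic signal is built with a direct discrete Fourier transform, and the names `analytic_signal`, `phase_differences` and `synchronisation_index` are hypothetical.

```python
import cmath
import math


def analytic_signal(x):
    """Analytic signal via a direct O(n^2) DFT (adequate for short series)."""
    n = len(x)
    spectrum = [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
                for k in range(n)]
    # Keep DC (and Nyquist for even n), double positive frequencies,
    # and zero the negative frequencies.
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            continue
        spectrum[k] *= 2 if k < (n + 1) // 2 else 0
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def phase_differences(x, y):
    """Instantaneous phase of x minus that of y, wrapped to [0, 2*pi)."""
    px = [cmath.phase(z) for z in analytic_signal(x)]
    py = [cmath.phase(z) for z in analytic_signal(y)]
    return [(a - b) % (2 * math.pi) for a, b in zip(px, py)]


def synchronisation_index(x, y):
    """Mean resultant length of the phase differences (0 to 1)."""
    d = phase_differences(x, y)
    return abs(sum(cmath.exp(1j * v) for v in d)) / len(d)
```

A histogram of `phase_differences` with a single sharp peak, or an index close to 1, suggests phase locking; a flat histogram suggests no synchronisation.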

Temperature for the selected region was also decomposed into 7 IMFs and a residual. There is weak coupling between the original rainfall time series and temperature as shown in Figure

Histogram plot of the synchronisation of rainfall IMFs and temperature IMFs. The

The effectiveness of EEMD for analysing nonlinear and non-stationary data was demonstrated. The rainfall and temperature data were decomposed into IMFs and a residual, which summed to the original data. The IMFs can be used with other methods, such as regression and neural networks, to predict the impact of climate drivers in the future. EEMD was effective in separating the data into different timescales, so the variability of the rainfall pattern was identified and evidence of the effect of ENSO and QBO was provided. Cross-correlation and phase synchronisation were used to find the relationships between the IMFs of the different time series under study. Future studies covering a longer period would be of interest, to find the pattern of the rainfall over decades.

The additional files for this article can be found as follows:

A measurement of the strength of El Niño and La Niña. DOI:

Measurement of the variation of the wind above the equator. DOI:

Original rainfall data for the study area and the decomposed rainfall into IMFs. DOI:

Original temperature data for the study area and the decomposed temperature into IMFs. DOI:

The author would like to acknowledge the South African Weather Service for the data used in this study and the National Research Fund (Grant number 112979) for funding the work.

Willard Zvarevashe is the main author of the work. Venkataraman Sivakumar edited and contributed to the atmospheric part of the study. Syamala Krishnannair provided final approval of the version to be published.

The authors have no competing interests to declare.