The Study of Time Series Using the DMA Methods and Geophysical Applications

The discrete mathematical analysis (DMA) is a series of algorithms aimed at the solution of basic problems of data analysis: clustering and tracing in multidimensional arrays, morphological analysis of reliefs, search for anomalies and trends in records etc. All the DMA algorithms are of universal nature, joined by the same formal foundation, based, in its turn, on fuzzy logic (FL) and fuzzy mathematics (FM). The current study finalizes the search for the anomalies in one-dimensional time series within the scope of DMA: here the initial concept of an interpreter’s logic gets its additional development. First, the formal expert’s opinions are more fully expressed, and this is realized with the more complex measures of activity (the concept of straightenings (Gvishiani et al. 2003; Gvishiani et al. 2004; Zlotnicki et al. 2005) is replaced by the measures of activity which come to the fore): second, for the junction of anomalies, a recently created DPS (Discrete Perfect Sets) algorithm is used DPS (Discrete Perfect Sets) (Agayan et al. 2011; Agayan et al. 2014).


A short summary of the study
We will now describe, in a semi-formal way, what the study is dedicated to, and what will be carried out in it.
Let there be a time series f, defined on a finite, in common case, irregular set of nodes T = {t 1 < . . . < t N }. The time series f is analyzed by an expert ε.
It is supposed that the expert ε has a point of view on the dynamic pattern of f in every node t ∈ T, and a number of questions is posed at this time. Let us formulate the first two questions: Question 1: To what extent µɛ f (t) ∈ [0, 1] is the dynamic pattern of the time series f the expert ɛ is interested in, expressed (how active is it) in the node t? Question 2: How is the dynamic pattern of the time series f, the ɛ is interested in, alternating at the time moment t is the measure µɛ f increasing (decreasing) in the node t or passing through an extremal transition?
The answers for this questions form the basis of what is called a monitoring of activity of the time series f with the expert ɛ point of view. The measure of activity µɛ f gives the key to the spot ɛ-anomalies: ɛ-anomalous moments of f are the moments t where µɛ f (t) ≥ 0.75 (if some feature is expressed with the scale of [0, 1], then it's a matter of course to consider its occurrences within the interval [0.75, 1] as anomalous).
Let us denote LA(µɛ f ) the set of ɛ-anomalous nodes: Forming of LA(µɛ f ) is also included into the problem of activity monitoring. Due to the randomness of the time series f, its local anomalous nodes LA can comprise a subset in T which demands extra work (preliminary topological filtering ) and further clustering (Mikhailov et al. 2003;Agayan et al. 2004;Soloviev et al. 2005)). The point is that the connected side-by-side and sufficiently large excesses of the measure µɛ f over the anomaly level 0.75 are of the main interest as it is them what should be considered expressions (traces) of global anomalies of f from the perspective of ɛ.
Let us denote A(µɛ f ) their plurality: The transition LA(µɛ f ) → A(µɛ f ) comes with the help of the DPS. algorithm. This is how the answer is obtained for the question 3.
Question 3: Which fragments of the time series f are considered interesting from the perspective of the expert ɛ?
The last question is formulated this way: Question 4: How to compare and rank the ɛ-anomalies of f ? DMA includes a corresponding implement which makes it possible to compare the fragments to each other and rank them ), it will be told about that further. Here we shall list a couple of examples for the quantitative indicators of the anomalies A i , inducing their ranking.
Example 1 Massiveness (continuity) mA i of an anomaly A i : mA i = |Supp A i | -number of nodes in the carrier of A i .

Example 2 Activity µA i of an anomaly
The summary will follow the scheme displayed on Fig. 1, in the simplest suppositions, however, fully keeping the thoroughness of the analysis of a time series announced above.

A finite time series f (initial data)
Period of observation T -the finite regular number of nodes  , Localization of space T in the node t is realized with the fuzzy structures δ t on T, normalized at t and decreasing with the increase of the distance from t: The δ t structures are called the measures of closeness on T to the node t. They center T around t by assigning the weight functions δ t (t) to the nodes t, act as surroundings for t in T, and formalize the view of the expert ε on T at t Gvishiani et al. 2004).
In the examples given below, p stand for a non-negative parameter responsible for the scale of δ-localization: the higher p is, the stronger the localization is: in particular, p = 0 results in the global point of view on T, the same for each node.

Example 3
1. The global measure of closeness ( Fig. 2 The local measure of closeness ( Fig. 3) where r ≥ h -observation radius.

An expert ε (formalization)
Dynamical indicator D -a non-negative functional on T, parametrized by T where q > 0.
The dynamical indicators are related to the very natural and important but the simplest points of view of the expert ɛ on the dynamics of the time series f. The general case is combined from the simple ones, plus, the composition can't be straight and mechanical. The point is that, initially, the dynamical indicators D f for different D are not comparable to each other and their junction additional efforts by means of the measures of maximality or dynamical activity of the indicator D f .

Measure of maximality mesmax (constructions)
Measure of dynamical activity µD f is a function membership to the fuzzy concept of the "activity of the function f in the node t from the perspective of the indicator D " on T Gvishiani et al. 2008c). The measure of activity µD f (t) results from the D f (t) indicator using one or another construction of the measure of maximality mesmax. The measure of maximality mesmax B (x) is a fuzzy structure on , which answers the following question for a finite weighted arrangement , ω k > 0}: "To what degree the number x is big modulo B?" In DMA, for a mesmax measure there are four constructions: "fuzzy comparisons", "Kolmogorov averages", "fuzzy means", "iteration bounds" (Gvishiani et al. 2004;Zlotnicki et al. 2005;. For a measure µD f (t), the (Im D f , δ t ), which is an image of D f centered in the node t with the help of the measure of closeness δ t , acts as B: Example 5 1. The "fuzzy comparison" construction: binary version.
The "fuzzy comparison" construction: gravitational version.
The "fuzzy comparison" construction: sigma version.
where p* is the solution of the equation

Measure of activity µɛ f (constructions)
The transition D f → µD f translates the function f analysis into the fuzzy logic and fuzzy mathematics language : the measures of activity µD f for different D possess the values within the unified scale of the interval [0, 1] and can be combined in any proportions and quantities using numerous FL operations and averagings of any type which will be denoted *. It becomes possible to instill the content to the complicated activity of f by the plurality of dynamical indicators D in the node t ∈ T: It is the construction (1) by which the view of expert ε on the dynamics of f is commonly modeled (Gvishiani et al. 2004). Such modeling is an art in many respects, and it involves selection of basic dynamical indicators D(ε) and their appropriate junction *(ε): Example 6 1. Junctions * using the fuzzy conjunction make it possible to consider all the basis dynamics from D(ε) simultaneously: 2. Junctions * using the fuzzy disjunction ⊥ make it possible for every basic dynamic from D(ε) to approve itself: 3. Let us suppose that the expert ε gives initial precedence on the basic dynamics from D(E) in the form of weights ω(D), then one of possible junctions *, able to consider this fact will be the weighted averaging

Monitoring of activity µƐ f (concept)
Analysis of the measure µε f and its basic components µD f m , D m ∈ D(ε) is the essence of the monitoring of f by an expert ε. It includes: 1. Direct analysis of the µε f activity which makes it possible to understand to what extent the dynamic of interest for ε is expressed on f (Fig. 4) for every moment t ∈ T. The result is the disjunctive partition of T in three pieces 2. Trend analysis of the activity of µε f based on regression derivatives: it helps to understand how is the f dynamics of ε's interest t alternating at every moment t ∈ T: is it increasing, decreasing or passing through an extremal transition. ). The result is the coding of the behavior of f on T from the perspective of ε and the possibility of comparison with the behavior of another time series f on, generally speaking, another period T from the perspective of the same ε.
Remark 2 Real partitions (10) have close thresholds but not always equal, to ideal thresholds 0.5 and 0.75. It is not principal for further studies and therefore in the present paper is convenient to consider monitoring of activity to be ideal.

Measure and monitoring of activity (examples)
For a record f (Fig. 4), the simple measures of activity µL f (Fig. 2) and µO f (Fig. 3), were constructed. They correspond to the dynamical indicators of jaggedness (2) and scatterness (3). Afterwards, they are joined in the complex measures of activity using three approaches (7)-(9). On the constructed measures (Figs. 5 and 7) and the initial record ( Fig. 6 and 8), a partition is shown (10): blue color for the background nodes (10.1), green for the potentially anomalous nodes (10.2), and red for the anomalous nodes (10.3).
On the simple measures of activity µL f (Fig. 2) and µO f (Fig. 3) respectively, 6 and 5 anomalous areas were identified and marked red. On the basis of simple measures, three complex measures of activity were constructed: using the fuzzy conjunction (7) (Fig. 7a), using the fuzzy disjunction (8) (Fig. 7b), using the weighted average (9) (Fig. 7c). Correspondingly, 3, 8 and 5 anomalous areas were identified, respectively, whose configuration and location are different from the anomalous areas identified by the simple measures of activity. This proves the disjunctive, conjunctive and compromise character of the complex measures of activity (7-9) (Zlotnicki et al. 2005).
On Fig. 9a and Fig. 10a, the trend analysis of the µL f are shown, based on the regression derivatives with respect to the sufficiently global measure of closeness δ. The result of the analysis is the partition of the µL f in increasing (red), decreasing (green) and flat areas (blue). As a result, a morphological separation of anomalies (Fig. 6a) trend components was carried out (Figs. 9b, 10b) ).      The second part of the study is dedicated to the search for uplifts on an arbitrary non-negative relief. Their search is realized using the DPS algorithm in two stages.

The DPS algorithm
Let X be a finite set, and A, B, . . . and x, y, . . . are the subsets and the points in it, respectively. The density P on a subset X is a mapping of 2 X × X into the interval [0, 1], increasing with respect to the first argument: The value of P A (x) is defined as the density of the subset A in the point x.
The process of construction of the maximal α-perfect subset X(α) in X is called the algorithm DPS (  The intersection 1 k ¥ =  is necessarily achieved because it's always finite due to finiteness of X and the multiplicity of X k+1 (α) into X k (α) for all k = 0, 1, . . .

The search for Ɛ-anomalies: stage 1 -functional clustering
The search for ε-anomalies will be illustrated for the function f (Fig. 11) and the simple measure of activity µL f (Fig. 12) corresponding to the dynamical indicator of jaggedness (2). Now we define the density P on T by the measure of dynamic activity µD f : let ∆ > h be the local observation parameter, A is a subset in T, tnode from T, then The result of the DPS algorithm operation with such density on T with a fixed α is the first part of the search for anomalous nodes for the function f on the period of observation T from the perspective of the expert ε ( Fig. 13):   The set T(α) is assumed as a disjunctive union of the intervals [a k , b k ], k = 1, . . ., m. Every [a k , b k ] is located in T and has no gaps inside (Fig. 13):

The search for Ɛ-anomalies: stage 2 -interval clustering
Let X(α) be the totality of the intervals {[a k , b k ]| 1 m }, and let us introduce the measure of closeness ρ between two intervals from X(α): The closeness p helps to construct the density P on The result of the algorithm operation DPS with such density on X(α) with a fixed b is the second part of the search for anomalous points for the function f on the period of observation T from the perspective of the expert ε (Figs. 14, 15) On the second step, the DPS algorithm indicates which intervals in X(α) should additionally be joined into one anomaly -these are exactly the intervals from X(α)(b) ). Thus, the finally anomalous intervals for the function f, from the perspective of the expert ε, would be the fairly isolated ones from X(α) which are not included into X(α)(b), and also the new intervals which result from uniting the intervals from X(α)(b).
Let us denote the totality of all the anomalies A = A(µε f ):

Studying the Ɛ-anomalies
Let us denote A(µε f ) = {A l | L l = 1 } the totality of all the ε-anomalies detected by DPS. Any ε-anomaly A l defines the fragment f l = f | A l of the time series f and could be studied according to the program listed above with respect to the measure µε f (the internal A l -monitoring of activity, superanomalies inside A l , and so on). It is natural to call such study of A l the internal study. As an ambivalent an external study of the anomaly A l is supposed, characterizing A l among the other anomalies from A(µε f ). Let's talk about it more in detail.
There are many characteristics of ε-anomalies, each of them ranks the ε-anomalies in A(µε f ) in its own way. The characteristics can be simple and complex (an analogy to the simple and complex measures of activity). Keeping in mind a compromise between simplicity and thoroughness, mentioned in the beginning, we shall list five natural characteristics and display how to make the complex ones from them.

Measure and monitoring of geomagnetic activity
Let us analyze the geomagnetic monitoring (10-10.3) on the basis of the measure of activity m = mL f (4) constructed according to the dynamical indicator of irregularity L f (2). The application of this measure to the ground-based observation data from the global geomagnetic observatory network brought an opportunity of spatial-temporal analysis of geomagnetic activity and the estimation of the dynamics of magnetic storm propagation in realtime (Soloviev et al. 2013;Gvishiani et al. 2014;Soloviev et al. 2016). The usage of the measure makes it possible to estimate the geomagnetic activity in various regions of the Earth within a single scale, taking its regional specification into account. The developed measure allows the analysis of the internal fine structure of magnetic storms, the dynamics of their development both around the globe and within particular regions. When estimating the geomagnetic activity, the whole range of initial data is efficiently considered in, as the temporal resolution of the measure matches the initial data sampling frequency.
The example of application of the developed measure of anomality to the geophysical data analysis was the study of magnetic activity using the time series from the magnetic observatory "Saint Petersburg" (IAGA code SPG). The initial observatory data is available at the Russian-Ukrainian Geomagnetic Data Center website (http://geomag.gcras.ru). For illustration purposes, the data for the period March 28 -April 8, 2016 was taken, as it contains the fragments of both increased (April 2-3, April 7-8) and decreased (March 31 -April 1, April 5) magnetic activity.
According to the planetary Kp index data (Fig. 28), during the given time period the magnetic conditions varied from calm (the Kp index values were from 0 to 2) and slightly disturbed (Kp = 2 . . . 3) to disturbed (the Kp index values were up to 5+). The disturbed data fragments corresponded to the magnetic storms which occurred from April 7, 2016, 17:00 UT, to April 8, 2016, 02:00 UT, and the increased magnetic activity on April 2-3, 2016.
The magnetic disturbance, which lasted from April 2, 2016, 15:00 UT to April 3, 2016, 03:00 UT, was characterized first by the increase of the amplitude of variations of the components X, Z and the total intensity their initial levels. This minimum took place approximately at midnight on April 2-3, next the values of the components reached the initial values by 03:00 UT, April 3, 2016. The magnetic storm which took place on April 7-8 was characterized by rapid evolution. During the sudden commencement period (about 17:11 UT) the amplitude of the variations of the X component at the Saint Petersburg magnetic observatory increased respectively to its background value from 14565 to 14580 nT. During the main phase of the magnetic storm, which evolved approximately from 19:00 to 21:53 UT, the decrease by about 150 nT for the X component, 180 nT for the Y component, and more than 200 nT for the Z component took place. According to the data records of several Russian magnetic observatories located in the same latitudinal belt, the main phase was accompanied by the decrease, which varied between 100-150 and 250 nT for the X and Y components, and was up to 300 nT for the Z component. The recovery phase of the magnetic storm lasted from about April 7, 2016, 22:00 UT to April 8, 2016, 02:00-02:30 UT. During this phase, some oscillations in the magnetic field intensity with a period of about 1 hour and the maximal amplitude of about 140-150 nT were observed. By the end of the recovery phase, the magnetic field at the observatory reached the values of the same level, which was registered before the storm onset.
The monitoring of geomagnetic activity for its Northern component X was performed using the measure of activity m = mL f . Its results are displayed on Fig. 29. The geomagnetic events listed above were considered anomalous according to the measure m and marked red on the plot. The data registered during the period of less disturbed magnetic field (e.g., for the period of 28-29 March), were recognized as potentially anomalous and marked green and purple on the plot (the potentially anomalous fragments are divided into two parts by the threshold value 0.61). Finally, for the period of quiet magnetic field, March 31-April 1, the values of the measure m practically do not exceed the threshold value 0.5, and no data fragment for this period was classified as anomalous.

Discussion
The comparison of the result of monitoring with the planetary Kp index data demonstrated that the classification of magnetic activity using the described measure does not contradict the classical methods for its estimation. At that, the principal advantages of the measure of anomality should be mentioned: 1. the matching of the temporal resolution of the measure to the initial sampling frequency of the time series; 2. the estimation of anomality using the records from different observatories in a unified and normalized scale.
The first advantage gives an opportunity of application of the measure to the monitoring of magnetic disturbance with minimal temporal delay in realtime mode. At that, the estimation of the magnetic activity is performed less raw than using the traditional 3-hour K index. The second advantage makes it possible to study the magnetic disturbance regardless of latitudinal location of the observatory, taking the regional specifications into account.

Conclusions
An exponential growth of digital observations of the Earth's environment acts as a challenge for the development of adequate mathematical methods of data mining and systems analysis of input data. It is the usage of these methods that provides an opportunity operational extraction of useful information, such as specific geomagnetic events of different origin, from the large volumes of continuously collected data. The approach to the studying of time series based on fuzzy logic, proposed in the paper, is largely focused on this challenge and finds its practical application. In particular, several algorithms based on DMA (Soloviev et al. 2009;Bogoutdinov et al. 2010;Soloviev et al. 2012;Sidorov et al. 2012;Zelinskiy et al. 2014) are integrated into Russian-Ukrainian Geomagnetic Data Center (http://geomag.gcras.ru) and used for continuous automated detection of anthropogenic disturbances and geomagnetic pulsations in observatory data. Further research plans include, first of all, the expansion of analysis from a single time series f to a network S of time series f s (t), s ∈ S. This will allow to define the monitoring of activity in network, anomalous events in network, and more.
The results presented in this paper rely on data collected at magnetic observatories. We thank the national institutes that support them and INTERMAGNET for promoting high standards of magnetic observatory practice (www.intermagnet.org).
The research has been carried out in the framework of the Federal target program of the Ministry of Education and Science of the Russian Federation, agreement No. 14.607.21.0058, project unique identifier RFMEFI60714X0058, and is a part of research aimed at the creation of the experimental prototype of the hardware and software complex for monitoring and detection of extreme geomagnetic phenomena using ground-based and satellite data.

Figure 29:
The results of classification of initial data into background (blue), weakly anomalous (green), anomalous (purple) and strongly anomalous (red) fragments using the measure of anomality. On the upper plot the initial record is shown, on the bottom plot the measure of anomality is displayed, calculated for this record.