SlideShare uma empresa Scribd logo
1 de 8
ISA Transactions 51 (2012) 499–506
Contents lists available at SciVerse ScienceDirect
ISA Transactions
journal homepage: www.elsevier.com/locate/isatrans
Improved correlation analysis and visualization of industrial alarm data✩
F. Yanga,b,∗
, S.L. Shaha
, D. Xiaob
, T. Chenc
a
Department of Chemical & Materials Engineering, University of Alberta, Edmonton, AB T6G 2V4, Canada
b
Tsinghua National Laboratory for Information Science and Technology (TNList), Department of Automation, Tsinghua University, Beijing 100084, PR China
c
Department of Electrical & Computer Engineering, University of Alberta, Edmonton, AB T6G 2V4, Canada
a r t i c l e i n f o
Article history:
Received 13 August 2011
Received in revised form
12 March 2012
Accepted 23 March 2012
Available online 12 April 2012
Keywords:
Alarm management
Correlation color map
Visualization
Gaussian kernel
Pseudo data
Clustering
a b s t r a c t
The problem of multivariate alarm analysis and rationalization is complex and important in the area of
smart alarm management due to the interrelationships between variables. The technique of capturing
and visualizing the correlation information, especially from historical alarm data directly, is beneficial
for further analysis. In this paper, the Gaussian kernel method is applied to generate pseudo continuous
time series from the original binary alarm data. This can reduce the influence of missed, false, and
chattering alarms. By taking into account time lags between alarm variables, a correlation color map
of the transformed or pseudo data is used to show clusters of correlated variables with the alarm tags
reordered to better group the correlated alarms. Thereafter correlation and redundancy information can
be easily found and used to improve the alarm settings; and statistical methods such as singular value
decomposition techniques can be applied within each cluster to help design multivariate alarm strategies.
Industrial case studies are given to illustrate the practicality and efficacy of the proposed method. This
improved method is shown to be better than the alarm similarity color map when applied in the analysis
of industrial alarm data.
© 2012 ISA. Published by Elsevier Ltd. All rights reserved.
1. Introduction
During routine operation of industrial systems, abnormal sit-
uations show up in the control room as alarms. Most processes,
however, have too many alarm tags configured on process vari-
ables and other state variables mainly due to safety considerations.
The number of alarmed tags is large enough to overwhelm even ex-
perienced operators. Because of this situation, alarm management
has been recognized as an important problem in the area of sys-
tem monitoring and fault detection [1,2]. New and revised guide-
lines and standards have been proposed from different viewpoints
to tackle this problem [3–7].
The focus of many current projects in industry is on ‘alarm
rationalization’, that is, to reduce the number of alarms and yet at
the same time be able to detect all potential abnormal situations.
Most alarms are analyzed in a univariate framework. However
it is well known that most processes are multivariate and thus
✩ A short version of this paper was presented at the 18th IFAC World Congress,
Milano, Italy, August 28–September 2, 2011.
∗ Corresponding author at: Tsinghua National Laboratory for Information Science
and Technology (TNList), Department of Automation, Tsinghua University, Beijing
100084, PR China. Tel.: +86 10 6277 1347; fax: +86 10 6278 6911.
E-mail addresses: yangfan@tsinghua.edu.cn,fyang.cme@ualberta.ca (F. Yang),
Sirish.Shah@ualberta.ca (S.L. Shah), xiaody@tsinghua.edu.cn (D. Xiao),
tchen@ece.ualberta.ca (T. Chen).
the alarms are not independent. Therefore a good strategy for
the analysis and visualization of alarm data should be based on
a multivariate framework. This type of multivariate information
can be easily captured from process data and multivariate statistics
can be applied to generate efficient alarm strategy [8] or optimize
alarm limits [9]. Such methods can be categorized as indirect
methods because we resort to process data only, i.e., measured
values of process variables in engineering units. However, one may
have many alarm tags associated with a single process variable
and often alarm tags are without any association with any specific
process variables (e.g., digital alarms to indicate a specific event or
situation). These field alarms which have only ON/OFF (‘1’ or ‘0’)
values are hard to analyze if there is no process data available. All
of these issues necessitate development of new methods to mine
alarm data directly.
Compared to the analysis of numerical process data, the
analysis of binary alarm data has some unique properties. First,
to analyze the statistical properties, the process data should be
stationary, which is not satisfied because both the normal and
abnormal data are included. The relationship between different
variables may be different under different situations and should
be studied separately. The alarm data, however, only include the
data in abnormal situation, leading to a simpler case. Second,
the information is often hidden in noise, which is inevitable in
process data. Consequently there may be some potential value in
the analysis of alarm data [4].
0019-0578/$ – see front matter © 2012 ISA. Published by Elsevier Ltd. All rights reserved.
doi:10.1016/j.isatra.2012.03.005
500 F. Yang et al. / ISA Transactions 51 (2012) 499–506
There are two ways to capture correlation from alarm data:
one is to employ Pearson’s correlation coefficients as done for
continuous data by assigning a value of 1 for the periods when the
alarm is active and a value of 0 when it is inactive [9]; the other
is to introduce similarity measures based on binary data [10,11].
By computing the correlation coefficient or similarity measure for
each pair of variables, a matrix is constructed. However, this matrix
which is composed of specific numbers is not generally convenient
for inspection or cursory analysis. Therefore the correlation values
are converted into a color map and this correlation color map
offers better visualization capability; it uses different colors to
show different degrees of correlation [12,13]. The colors are
usually discretized into several levels according to the color code.
Through the reordering and clustering of variables, the color-coded
correlation map can show the clusters of alarm tags intuitively. In
this way, the problem of a large number of variables is separated
into smaller sub-problems with much fewer variables.
The classical correlation color map only uses the correlation
coefficients. However, it may be ineffective when there is a time lag
between two variables. Hence the correlation coefficients should
be lag-adjusted to take into account the delays between each
pair of variables. The alarm similarity color map (ASCM), which
is specially designed for alarm data analysis [10], has proved to
be an effective method for visualization. It captures time-delayed
similarity information from binary data. However it has some
disadvantages. Firstly, in order to weaken the sensitivity caused
by the time shift, each unique alarm is padded with extra 1’s
to enrich the data; or in a similar form, alarms are checked
within constantly divided time windows (equivalent to down
sampling) [11]. This step is short of physical evidence and as a
result of this individual false alarms and chattering alarms may be
magnified unreasonably. In this paper we suggest another method
in which each unique alarm is replaced by a Gaussian distribution
along with its neighbors in the time axis [14]. This set of generated
continuous data with numerical values is labeled as pseudo data.
Based on this data set, we use Pearson’s correlation coefficients to
measure the correlation. This method has some similar problems
as the padding method; however, it provides a better way of
transforming binary alarm data into continuous data that can
be analyzed by statistical approaches. Another disadvantage of
the ASCM method is that it takes into account all the alarm
tags separately regardless their physical relationship. If typical
analogue alarms (HI/LO/HH/LL) associated with the same process
variable are grouped together, the result should be expected to be
more reasonable. Thirdly, the similarity measure only indicates the
distance but loses the direction (positive or negative correlation),
while the correlation coefficient indicates both the similarity and
the direction.
In this paper, a new approach to analyzing multiple alarm series
is proposed to find redundancy. By converting binary data into
continuous (pseudo) data via a novel Gaussian kernel method,
statistical methods are applied, including an improved correlation
and singular value analysis. The numerical results can be visualized
by the ASCM tool via appropriate ordering and clustering of alarm
tags.
2. Basic method for correlation analysis
2.1. Gaussian kernel method for data preprocessing
Alarm data is binary data with ‘1’s’ and ‘0’s’ to denote abnormal
and normal states respectively. Since alarm data is obtained and
discretized from continuous process data according to alarm limits,
some information is lost in this process. However, alarm data
does have advantages as well: (1) binary data is easy for storage
and convenient for statistical analysis; (2) it has much higher
sampling frequencies than process data and thus includes detailed
information; and (3) it includes more types of data such as digital
alarms showing the status of some elements (e.g. if a pump is on or
off). Thus we can use alarm data directly to analyze the correlation.
For a particular variable, one alarm point can be regarded
as a sample of the time series. To apply correlation analysis
to alarm occurrences, a continuous analogue signal has to be
generated from the alarm signal which is actually an event series.
By simply assigning ‘1’ or ‘0’, a time series in a continuous temporal
domain can be generated, which is simple but lacks granularity
because non-continuous (binary) values are not commensurate for
correlation analysis. In order to better estimate the time series, the
kernel method can be used in the temporal domain [15], which is a
nonparametric method, to fit the function with any shape. Here the
Gaussian kernel function is used because of its smoothness. At each
alarm point, a Gaussian kernel function is superimposed around
this time instant. The function is defined as:
K(t) =
1
√
2πσ
e− t2
2σ , (1)
where σ is the standard deviation. If all the alarm points are
considered, a continuous time series can be obtained as the
superposition of all the time-shifted Gaussian kernel functions.
Thus the resulting time series is given as
P(t) =
N
i=1
K(t − ti), (2)
where N is the number of alarm points at the time instants ti, i =
1, 2, . . . , N. This time series P(t) can be regarded as the estimation
of the corresponding process data although there may not exist
such a physical process variable; hence we call it pseudo data. Fig. 1
illustrates the principle of generating such pseudo data from binary
data.
The pseudo time series has the following properties:
• It is continuous and smooth because it is approximated by
superimposed Gaussian functions.
• For consecutive alarm points lasting a long time, the pseudo
data is also 1’s because the integral of the Gaussian kernel
function over the whole time domain is 1. This is different
from the estimation of probability density function because the
samples here are at different time instants.
• With appropriate variance of the kernel function, the magni-
tude of the kernel function is small; thus non-consecutive and
sparse alarm points cannot result in spikes in the pseudo data.
This property is particularly important for the filtering of chat-
tering alarms so that the proposed method can be used directly
on the data set with chattering alarms.
Consider the example of alarm data as shown in Fig. 2(a) in
which there is a consecutive alarm period of 400 s including a short
(5 s) break at around the 510th interval. At both the beginning and
the end of this period, there is some chattering. From this data set,
we can generate the pseudo data with standard deviation of 30 s as
shown in Fig. 2(b). Here the chattering and the gap are sufficiently
smoothened and at the same time the alarm period prevails. The
effects of missed alarms and false alarms are lessened because they
usually only exist over a short time duration.
2.2. Effect of time lags
There may exist a time lag between two correlated alarms
due to the propagation or response time. This lag reduces the
correlation coefficient and thus may mask the correlation be-
tween these variables. To eliminate this influence, different time
F. Yang et al. / ISA Transactions 51 (2012) 499–506 501
1
0
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 17
t
16
1
0
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 17
t
16
(a) Alarm data. (b) Pseudo data.
Fig. 1. Transforming alarm data (a) to pseudo data (b).
(a) Alarm data. (b) Pseudo data.
Fig. 2. (a) Original alarm data; (b) generated pseudo data by the Gaussian kernel method.
lags should be assumed and the maximal correlation coeffi-
cient can be computed; this would be regarded as the real
correlation [16].
Assume that x and y are time series of n observations with
means µx, µy and variances σx, σy respectively, then the cross-
correlation function (CCF) with an assumed lag k on y is:
φxy(k) =
E

(xi − µx)(yi+k − µy)

σxσy
, k = −n + 1, . . . , n − 1. (3)
The expectation can be estimated as the sample CCF by:
ˆφxy(k) =



1
n − k
n−k
i=1
(xi − µx)(yi+k − µy)

σxσy, if k ≥ 0,
1
n − k
n
i=1−k
(xi − µx)(yi+k − µy)

σxσy, if k < 0.
(4)
A value of the CCF is obtained by assuming a certain time lag
for one of the time series. Thus the absolute maximum value can
be regarded as the real cross-correlation and the corresponding
lag as the estimated time lag between these two variables. For
a mathematical description, one can compute the maximum
and minimum values φmax
= maxk{φxy(k), 0} ≥ 0 and φmin
=
mink{φxy(k), 0} ≤ 0, and the corresponding arguments kmax
and
kmin
. Then the estimated time delay from x to y is:
λ =

kmax
, if φmax
≥ −φmin
,
kmin
, if φmax
< −φmin
.
(5)
(corresponding to the maximum absolute value) and the actual
time delayed cross-correlation is ρ = φxy(λ) (between −1 and
1). If λ is less than zero, then it means that the actual delay is from
y to x. Thus the sign of λ provides the directionality information
between x and y. The sign of ρ indicates whether the correlation
is positive or negative. Note that this directionality does not mean
causality because there may exist another common cause of the
relationship between x and y [17].
2.3. Finding redundant alarms based on correlation
If several alarms are highly correlated, we should check
if there is redundancy in them, i.e., if some alarms can be
obtained by linearly combining other alarm signals. Singular value
decomposition (SVD) is a well-developed method to do this by
finding singular values of the set of alarm data and thus enabling
one to identify collinear columns. If there are several large singular
values that possess the dominant proportion of all the values,
then the number of large or dominant singular values can be
regarded as the number of independent alarm tags and the residual
number is the number of redundant alarms. However, SVD cannot
tell us which alarm is redundant because we are dealing with
the transformed or pseudo data to generate a set of new data.
It only provides an opportunity for one to check if there is an
alarm that can be removed or replaced by the linear combination
of other alarm signals. In addition, the independent series in the
new data may not have the approximate binary property, making
these values inappropriate to be taken as generated alarm signals.
One should also consider the physical meaning of all alarms and
reconcile this with the SVD information. This ambiguity in using
SVD is much severe if the number of alarms is large. In this case,
it is difficult to identify the independent and redundant alarms
merely based on the result of SVD. Therefore, we need to separate
the variables into several groups and perform SVD on each group
of variables. The clustering process based on visualizing the result
of the correlation analysis would help one in analyzing SVDs of
smaller groups of variables. Another point to be noted is that, for
SVD analysis, the time series should be sufficiently long to include
many alarms, otherwise it cannot reflect its true property.
3. Visualization of correlation
In order to visualize the correlation matrix, the correlation color
map is developed, in which the order of variables is rearranged
in a ranked order where variables that are highly correlated with
each other appear together with prominent colors or shades, called
clusters.
502 F. Yang et al. / ISA Transactions 51 (2012) 499–506
3.1. Ordering and clustering of alarm tags
Based on the correlation coefficient, the measure to describe
the similarity between two alarms is known as the similarity
measure S [18] which has the properties of positivity (S(x, y)
≥ 0), symmetry (S(x, y) = S(y, x)), and maximality (S(x, x) ≥
S(x, y)). There are many similarity measures available for binary
data [19]; hence if we compute the similarity measures based on
original alarm data, we can choose any one of them; for example
Kondaveeti et al. [10] have used the Jaccard similarity measure.
However we are dealing with pseudo data; so generally we can
use the covariance, Euclidean distance (L2), Kendall’s τ, Pearson’s
correlation coefficient, Spearman’s rank, City-block (L1), etc. [20].
In this paper, we use the classical Pearson’s correlation coefficient
and define the following similarity measure S:
S(x, y) = |ρxy|, (6)
which lies in the interval [0, 1].
During clustering, one should employ both the similarity
measure between two alarms and also the measure between two
clusters of alarms. The latter can be defined by single-, complete-,
and average-linkage methods [20]. The definition of single-
linkage is
SS(X, Y) = min
x∈X,y∈Y
{S(x, y)}, (7)
the definition of complete-linkage is
SC(X, Y) = max
x∈X,y∈Y
{S(x, y)}, (8)
and the definition of average-linkage is
SA(X, Y) =

x∈X,y∈Y
S(x, y)

|X| |Y|, (9)
where x and y are alarm tags, X and Y are clusters of tags, and | · |
means the size of the cluster.
Using the similarity measure between each pair of alarms or
their clusters, all alarm tags can be clustered based on various clus-
tering algorithms such as agglomerative hierarchical clustering,
the methodology for which is shown as a dendrogram. This process
has been illustrated in Kondaveeti et al.’s study [10]. Other ordering
and clustering algorithms can also be used such as the ellipse or-
dering. For this purpose, the data analysis software GAP is a useful
tool [20].
3.2. Correlation color map
The correlation matrix with rearranged rows and columns is
color coded to transform it into a color map [12]. The matrix is
symmetric and both the vertical and horizontal axes are alarm
tag names. Each grid (x, y) shows the corresponding correlation
between the two alarm tags x and y. The color is coded to show
the correlation (−1 to +1) or its absolute (0–1). To explain the
meaning of each color, a color bar legend is placed beside the plot.
The order of the alarm tags is determined by the above algorithm.
The color map is symmetric about the diagonal which means the
auto-correlation coefficients are 1’s. The clusters are shown as
blocks located along the diagonal with similar color codes. From
this map, one can acquire the approximate correlation between
each pair of alarms, and easily find clusters and the corresponding
alarm tags. By matching this map with the physical meaning, one
can locate the redundant alarms.
4. Implementation issues
In the above methods there are several issues to be noted.
4.1. Sampling rates of alarm data and pseudo data
The alarm data is recorded by DCS, PLC, or other smart devices;
the sampling rate can be very high. This is necessary sometimes
because alarms occur with a very high frequency due to oscillation
or chattering. Generally speaking, we can record them in the
database as frequently as possible, say one per second. A proposed
rule of thumb is a 15 s sample interval as a lower bound [14].
It is generally unnecessary to record process data at such a
high rate because the process always has a time constant with
a magnitude of minutes or even longer. In real applications, we
often record such samples at one sample per minute. Different
from alarm data and process data, the pseudo data is generated
from alarm data but behaves as continuous process data. Thus we
can record the pseudo data at the same frequency as the original
alarm data, but we can also decrease the sampling rate to reduce
the computational burden and can also match it with the sample
rate of process data.
4.2. Variance of the Gaussian kernel function
The variance of the kernel function affects the robustness of
the pseudo data at each alarm point. If the variance is very small,
each alarm point generates a spike in the pseudo data making the
method very sensitive to an individual sample or during a short
period of alarms. On the other hand if the variance is large, the
magnitude of each kernel function is very small; hence it needs
quite a few alarm points during a period of no alarms or quite a
few ‘holes’ during a period of consecutive alarms to change the
trend of the pseudo curve. Although this can reduce the influence
of the missed alarms, false alarms, and chattering alarms due to
its robustness to individual changes, the latency increases on the
other hand, meaning that there is a longer time delay to reflect a
change. This is a trade-off between large and small variances. To
our best knowledge, most variables in the process industry have a
time constant that ranges from several seconds to several minutes
and even hours; thus the standard deviation of the Gaussian kernel
function can be chosen over this range. If there is a step change
(from no alarm to alarm for example) in the alarm data, it needs
several minutes for the pseudo data to completely change the value
(from 0 to 1).
4.3. Computational effort
Although the number of samples is generally large due to
the high frequency of sampling rate, the alarm data set includes
large amount of normal data resulting in the data set being
quite sparse. In addition, when an alarm occurs consecutively,
the corresponding pseudo time series is a set of consecutive 1’s
according to the property mentioned earlier in Section 2; such
periods can be ignored in the computation by letting them remain
unchanged. Therefore the computation only concentrates on the
non-consecutive alarm points and the boundary of the consecutive
alarm points.
In SVD, the vectors of the pseudo data can be very long, making
the computation infeasible. We can either lower the sampling rate,
or use the technique of sparse matrix analysis [21].
4.4. Selection of data range
Theoretically, a long series can result in better results, but
the computational effort is higher. In addition, in correlation
analysis we assume that the data is stationary; this is difficult to
satisfy because the operating situation or fault pattern may vary,
especially in a long series. Thus we should compromise to look
into shorter series. Based on this compromise, the computation of
correlation requires a sufficiently large data set that contains alarm
F. Yang et al. / ISA Transactions 51 (2012) 499–506 503
Fig. 3. High density plot of the original alarm data for case study 1.
points because the normal points do not provide information. In
practice a short time period of simultaneous alarms is a much
better data set than a long time period of rare alarms. In particular,
the alarm data during an alarm flood is an especially good source
for study.
4.5. Grouping of alarm tags of one process variable
For a typical process variable, there may be as many as four
alarms associated with it: HI, LO, HH, and LL alarms. If these alarm
tags are treated separately, the relationship between them is lost.
Instead, these four alarms can be combined together by assigning
them different values based on their directions and degrees, for
example, HI as +1, LO as −1, HH as +2, and LL as −2. This
treatment makes the alarm data have several values instead of
binary values. When generating the pseudo data, these values
should be taken as the weights of the Gaussian kernels.
4.6. Choosing color code for color map
We can either use the same color with different shades or
different colors. Nevertheless, the number of color scales is very
important, which determines the interpretability of the map. There
is a trade-off between the sensitivity and the interpretability. A
general recommendation is to use fewer number of codes first and
increase the shades gradually to suit your eyes in the visualization
process and provide more ‘granularity’ in the number of alarms for
a process. In our work we use warm colors to describe positive
correlation and cold colors to describe negative correlation. The
four bands of values are chosen as (0–0.25), (0.25–0.5), (0.5–0.75),
and (0.75–1) respectively. Please see the color bar in Fig. 4(b) for
the color assignment.
5. Case studies
To illustrate the practicality and utility of the method suggested
in this work, we consider two case studies. The first case study
is to illustrate the procedure of the proposed method and the
improvements. The second case study is an application to a real
industrial process, showing the efficacy of the method.
5.1. Case study 1
First consider data from a real industrial process including 10
analogue alarm tags with HI and LO settings that have alarms over
a period of one week. The sampling rate is one sample per second.
By assigning HI and LO alarms +1 and −1 respectively, the alarm
data is shown as a high density plot in Fig. 3 where alarm tags 3
and 7 have both HI and LO alarms. If we use the method proposed
by Kondaveeti et al. [10], the ASCM obtained is shown in Fig. 4(a)
where the padding length used is 5 s. We find that alarm tags 1 and
2 are correlated, and alarm tags 4, 5, and 6 are grouped in another
cluster.
Now we use the method proposed in this paper by setting the
standard deviation of the Gaussian kernel as 30 s. The correlation
matrix Φ (with the elements below the diagonal removed due to
symmetry) is given in Box I.
Based on this matrix, Table 1 illustrates the clustering process.
Initially each alarm tag is regarded as a cluster. In the first step,
the similarity measure (S = absolutecorrelation) of each pair
of alarm tags are compared, in which the similarity measure S
between alarm tags 4 and 5 has the largest value, meaning that
they are most similar. Therefore alarm tags 1 and 2 are grouped
into a new cluster, 11, and the old clusters 4 and 5 are removed.
Then we compute the similarity measure between each pair of
new clusters. We take alarm tags 1 and 2 in the second step, and
then 10 and 8 in the third step. In the fourth step, the measure
a b
Fig. 4. Alarm similarity color map of the original alarm data based on Kondaveeti’s scheme (a) and correlation color map of the pseudo data (b) for case study 1. (For
interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
504 F. Yang et al. / ISA Transactions 51 (2012) 499–506
Φ =













1 0.97 −0.06 −0.08 −0.08 0.33 0.23 0.54 0.34 0.42
1 −0.03 −0.07 −0.07 0.34 0.23 0.55 0.35 0.43
1 0.47 0.47 −0.21 0.26 −0.08 0.04 −0.12
1 1.00 −0.20 0.10 −0.16 −0.27 −0.13
1 −0.20 0.10 −0.16 −0.27 −0.13
1 −0.25 0.64 0.34 0.67
1 0.25 0.17 0.06
1 0.60 0.79
1 0.70
1













Box I.
Table 1
Clustering process for case study 1.
Step Clusters Largest measure
Correlation Pair
0 1 2 9 10 8 6 3 4 5 7 1.00 4, 5
1 1 2 9 10 8 6 3 11 7 0.97 1, 2
2 12 9 10 8 6 3 11 7 0.79 10, 8
3 12 9 13 6 3 11 7 0.69 9, 8
4 12 14 6 3 11 7 0.67 6, 10
5 12 15 3 11 7 0.55 2, 8
6 16 3 11 7 0.47 3, 4/5
7 16 17 7 −0.27 4/5, 9
8 18 7 0.26 7, 3
9 19 – –
between clusters 9 and 13 is the largest, which is determined by
the similarity between alarm tags 9 and 8 according to the single-
linkage method. So clusters 9 and 13 become the new cluster 14
which contains three alarm tags. The algorithm continues until
only one cluster is left. The corresponding dendrogram is shown
in Fig. 5. Thus the new order of alarm tags is obtained and thereby
we have the correlation color map shown in Fig. 4(b).
Apparently, alarm tags 1, 2, 9, 10, 8, and 6 form a single cluster
with positive correlations, and alarm tags 3, 4, and 5 also have
some correlations between them. This result clearly provides more
information than the one shown in Fig. 4. Therefore the proposed
method reveals some correlations that are not uncovered in the
method proposed by Kondaveeti et al. [10], such as the correlation
between alarm tags 8 and 9. Then we take the data of alarm tags
1, 2, 9, 10, 8, and 6 to perform SVD. The six singular values are
1720, 498, 347, 302, 199, and 165. Thus it is apparent that there
exists redundancy in these tags because one or two singular values
dominate most of the collinear information. According to the user’s
preference and the corresponding physical meanings, some alarms
may be removed.
5.2. Case study 2
Next consider a hydro treater process in a refinery in Alberta,
Canada. Since there are hundreds of tags with alarms, we order
them based on the numbers of occurrence and choose 11 process
tags associated with analogue alarms (HI/LO) that contribute to
the large number of alarm counts. The data set includes one week
worth of alarm data at a rate of one sample per second, during
which there are two evident floods leading to a large number of
alarms. The above analysis is implemented on this data set and
the color map obtained is shown in Fig. 6(a). The order of tags is
determined by the same algorithm as the previous case study; the
procedure is shown in Fig. 7.
It can be observed in Fig. 6(a) that alarm tags 1, 3, 9, 10,
and 11 form a cluster with high correlation values. From process
knowledge, we know that tags 1 and 3 are both flow rates, and
tag 3 is downstream of tag 1. Therefore, when a fault occurs
Fig. 5. Dendrogram for the 10 variables in case study 1 based on the single-linkage
measure. The scale on the right indicates the similarity measure.
upstream, it can propagate downstream to other variables, which
can be reflected in the alarm data. Although they have similar
trends in this case study, they are not redundant in general
because they reveal the fault propagation and there is some time
delay as evident from the time stamps. Tags 9, 10, and 11 are
fuel gas pressures of different burners in one equipment, leading
to evident correlations. From this study, we find that they are
somewhat redundant and may be reduced to one tag. In real
engineering practice, however, we should check more data to
confirm this redundancy and resort to process knowledge to
see the physical implication. Actually, some redundancy is also
necessary because it can help improve the accuracy of detection,
especially for important variables. In this case, each burner is
placed with one sensor for monitoring its operating situation,
although they are all pressure measurements for one equipment.
In addition, the correlation between the above two groups is
also high, which shows up the alarm flood when different tags
show similar patterns due to the fault spread or interrelationship
between physical quantities. Note that tag 11 behaves differently
from tags 9 and 10 although they are highly correlated; this also
accounts for the necessity of setting an alarm on tag 11.
As a comparison, the corresponding process data is also
available, as shown in Fig. 8, based on which the correlation color
map is obtained as shown in Fig. 6(b). It is observed that the two
color maps (Fig. 6(a) and (b)) have different patterns even if they
share the same order of tags. During the floods, many process
tags are correlated, while their alarms are less correlated thanks
to the good alarm settings. For example, tags 2 and 6 show some
negative correlation because they are flow rates on upstream and
downstream variables respectively; however, the corresponding
alarms do not have an apparent correlation.
6. Concluding remarks
In this paper, the correlation matrix is plotted as a color map to
visualize the relationship between different alarm tags. Compared
to the ASCM, the clustered correlation map of pseudo data map
has the following three main advantages: (1) it is robust to missed,
false, and chattering alarms; (2) the correlation of pseudo data
provides directional (positive or negative) information in addition
F. Yang et al. / ISA Transactions 51 (2012) 499–506 505
a b
Fig. 6. Color maps for pseudo data (a) and process data (b) in case study 2. (For interpretation of the references to colour in this figure legend, the reader is referred to the
web version of this article.)
1.00
0.84
0.67
0.51
0.35
0.18
0.02
Fig. 7. Dendrogram for case study 2.
Fig. 8. High density plot of the process data in case study 2.
to the similarity; (3) the pseudo data used in generating the color
map can also be used in other statistical analysis that provides
more potential; and (4) no knowledge of delays between correlated
alarms is required. One disadvantage to be noted is that when
computing correlation between two time series, the simultaneous
0’s in both the series have little information because they are in the
normal region, resulting in the reduction of correlation coefficients.
Thus the period for analysis should be selected appropriately
so that there are dense alarm signals, and it is this window of
alarm data that should be converted to pseudo data for statistical
analysis. From this improved correlation analysis, we can find the
relationship and redundancy in alarm tags that can be translated
into improved alarm settings for efficient alarm rationalization.
It should be noted that the proposed method in this paper is
an improved correlation analysis based on alarm data. However,
to obtain a reasonable and effective alarm configuration, we
should also investigate process data as well as process knowledge,
as illustrated in the second industrial case study and in Fig. 6.
The analysis of alarm data is simple but is lacking in terms
of detailed information, especially the true trend information
of a process variable; as a result, we usually take this analysis
to be the first step to focus our attention on some particular
parts of the industrial process and then apply more complex
techniques. As discussed in Section 5.2, the results should be
compared to the ones based on process data to find the influence
of alarm settings. The correlation and redundancy information
captured through data analysis should be combined with process
knowledge, especially the process connectivity and causality
information between process units and variables [22,23].
The method proposed in this paper can be developed to be user-
friendly to tune the parameters. In the pseudo data generating
stage, the variance of Gaussian kernel and the sampling rate of
pseudo data are two parameters that require tuning. In the color
map plotting stage, the clustering algorithm is another option. In
the color map display period, one should be able to change the
color code and the number of scales of colors. It is important to
provide some degrees of freedom for the user because the plot is
sensitive to these tuning parameters and an appropriate choice of
parameters depends on the specific circumstances, requirements
and objectives of the analysis.
Acknowledgments
This work was supported by NSERC (SPG and IRC) in Canada,
NSFC (60736026 and 60904044), Tsinghua National Laboratory
for Information Science and Technology (TNList) Cross-discipline
Foundation, and the 973 Project (2009CB320600) in China.
References
[1] Izadi I, Shah SL, Shook D, Chen T. An introduction to alarm analysis and design.
In: Proc. 7th IFAC symp. fault detection, supervision and safety of technical
processes. IFAC safeprocess 2009. 2009. p. 645–50.
[2] Izadi I, Shah SL, Chen T. Effective resource utilization for alarm management.
In: Proc. 49th IEEE conf. decision and control. CDC 2010. Atlanta, GA. 2010.
p. 6803–8.
506 F. Yang et al. / ISA Transactions 51 (2012) 499–506
[3] Hollifield B, Habibi E. The alarm management handbook: a comprehensive
guide. Houston (TX): PAS; 2007.
[4] Engineering Equipment and Materials Users’ Association. Alarm systems: a
guide to design, management and procurement. EEMUA standard 191. second
ed. 2007.
[5] International Society of Automation. Management of alarm systems for the
process industries. ANSI/ISA standard 18.2. 2009.
[6] Errington J, Reising DV, Burns C. ASM consortium guidelines: effective alarm
management practices. Phoenix, AZ: ASM Consortium; 2009.
[7] American Petroleum Institute. Pipeline SCADA alarm management. API
recommended practice 1167. 2010.
[8] Kondaveeti SR, Izadi I, Shah SL. Application of multivariate statistics for
efficient alarm generation. In: Proc. 7th IFAC symp. fault detection, supervision
and safety of technical processes. IFAC safeprocess 2009. 2009, p. 657–62.
[9] Yang F, Shah SL, Xiao D. Correlation analysis of alarm data and alarm limit
design for industrial processes. In: Proc. 2010 American control conf., ACC
2010. 2010, p. 5850–5.
[10] Kondaveeti SR, Izadi I, Shah SL, Black T. Graphical representation of industrial
alarm data. In: Proc. 11th IFAC/IFIP/IFORS/IEA symp. analysis, design, and
evaluation of human–machine systems. IFAC/IFIP/IFORS/IEA 2010. 2010.
[11] Nishiguchi J, Takai T. IPL2 and 3 performance improvement method for
process safety using event correlation analysis. Comput Chem Eng 2010;
34(12):2007–13.
[12] Tangirala AK, Shah SL, Thornhill NF. PSCMAP: a new tool for plant-wide
oscillation detection. J Process Control 2005;15(8):931–41.
[13] Zhang H. Statistical process monitoring and modeling using PCA and PLS. M.S.
thesis. Edmonton (AB, Canada): University of Alberta; 2000.
[14] Control Arts Inc. Alarm system engineering (e-book), 2011.
http://controlartsinc.com/Support/Publications.html.
[15] Silverman BW. Density estimation for statistics and data analysis. London
(New York): Chapman and Hall; 1986.
[16] Bauer M, Thornhill NF. A practical method for identifying the propagation path
of plant-wide disturbances. J Process Control 2008;18(7–8):707–19.
[17] Yang F, Shah SL, Xiao D. SDG (signed directed graph) based process description
and fault propagation analysis for a tailings pumping process. In: Proc. 13th
IFAC symp. automation in mining, mineral and metal processing. IFAC MMM
2010. 2010.
[18] Lesot MJ, Rifqi M, Benhadda H. Similarity measures for binary and numerical
data: a survey. Int J Knowl Eng Soft Data Paradigm 2009;1(1):63–84.
[19] Choi S, Cha S, Tappert CC. A survey of binary similarity and distance measures.
J Syst Cybernet Inform 2010;8(1):43–8.
[20] Wu HM, Tien YJ, Chen CH. GAP: a graphical environment for matrix
visualization and cluster analysis. Comput Statist Data Anal 2010;54(3):
767–78.
[21] Pissanetzky S. Sparse matrix technology. London (UK): Academic Press; 1984.
[22] Yang F, Shah SL, Xiao D. Signed directed graph based modeling and its
validation from process knowledge and process data. Int J Appl Math Comput
Sci 2012;22(1):41–53.
[23] Larsson JE, DeBor J. Real-time root cause analysis for complex technical
systems. In: Proc. joint IEEE 8th conf. on human factors and power plants
and 13th annual workshop on human performance/root cause/trending/
operating experience/self assessment, joint 8th IEEE HFPP/13th HPRCT. 2007.
p. 156–63.

Mais conteúdo relacionado

Mais procurados

Comparison of Cost Estimation Methods using Hybrid Artificial Intelligence on...
Comparison of Cost Estimation Methods using Hybrid Artificial Intelligence on...Comparison of Cost Estimation Methods using Hybrid Artificial Intelligence on...
Comparison of Cost Estimation Methods using Hybrid Artificial Intelligence on...IJERA Editor
 
IRJET- Supervised Learning Classification Algorithms Comparison
IRJET- Supervised Learning Classification Algorithms ComparisonIRJET- Supervised Learning Classification Algorithms Comparison
IRJET- Supervised Learning Classification Algorithms ComparisonIRJET Journal
 
IRJET- Probability based Missing Value Imputation Method and its Analysis
IRJET- Probability based Missing Value Imputation Method and its AnalysisIRJET- Probability based Missing Value Imputation Method and its Analysis
IRJET- Probability based Missing Value Imputation Method and its AnalysisIRJET Journal
 
IRJET- Expert Independent Bayesian Data Fusion and Decision Making Model for ...
IRJET- Expert Independent Bayesian Data Fusion and Decision Making Model for ...IRJET- Expert Independent Bayesian Data Fusion and Decision Making Model for ...
IRJET- Expert Independent Bayesian Data Fusion and Decision Making Model for ...IRJET Journal
 
Statistical Data Analysis on a Data Set (Diabetes 130-US hospitals for years ...
Statistical Data Analysis on a Data Set (Diabetes 130-US hospitals for years ...Statistical Data Analysis on a Data Set (Diabetes 130-US hospitals for years ...
Statistical Data Analysis on a Data Set (Diabetes 130-US hospitals for years ...Seval Çapraz
 
Data Imputation by Soft Computing
Data Imputation by Soft ComputingData Imputation by Soft Computing
Data Imputation by Soft Computingijtsrd
 
Anomaly detection via eliminating data redundancy and rectifying data error i...
Anomaly detection via eliminating data redundancy and rectifying data error i...Anomaly detection via eliminating data redundancy and rectifying data error i...
Anomaly detection via eliminating data redundancy and rectifying data error i...nalini manogaran
 
A02610104
A02610104A02610104
A02610104theijes
 
Parametric comparison based on split criterion on classification algorithm
Parametric comparison based on split criterion on classification algorithmParametric comparison based on split criterion on classification algorithm
Parametric comparison based on split criterion on classification algorithmIAEME Publication
 
Integrate fault tree analysis and fuzzy sets in quantitative risk assessment
Integrate fault tree analysis and fuzzy sets in quantitative risk assessmentIntegrate fault tree analysis and fuzzy sets in quantitative risk assessment
Integrate fault tree analysis and fuzzy sets in quantitative risk assessmentIAEME Publication
 
MPSKM Algorithm to Cluster Uneven Dimensional Time Series Subspace Data
MPSKM Algorithm to Cluster Uneven Dimensional Time Series Subspace DataMPSKM Algorithm to Cluster Uneven Dimensional Time Series Subspace Data
MPSKM Algorithm to Cluster Uneven Dimensional Time Series Subspace DataIRJET Journal
 
Overview of soft intelligent computing technique for supercritical fluid extr...
Overview of soft intelligent computing technique for supercritical fluid extr...Overview of soft intelligent computing technique for supercritical fluid extr...
Overview of soft intelligent computing technique for supercritical fluid extr...IJAAS Team
 
A Firefly based improved clustering algorithm
A Firefly based improved clustering algorithmA Firefly based improved clustering algorithm
A Firefly based improved clustering algorithmIRJET Journal
 
A new model for iris data set classification based on linear support vector m...
A new model for iris data set classification based on linear support vector m...A new model for iris data set classification based on linear support vector m...
A new model for iris data set classification based on linear support vector m...IJECEIAES
 
Data mining techniques application for prediction in OLAP cube
Data mining techniques application for prediction in OLAP cubeData mining techniques application for prediction in OLAP cube
Data mining techniques application for prediction in OLAP cubeIJECEIAES
 
IRJET- Agricultural Crop Classification Models in Data Mining Techniques
IRJET- Agricultural Crop Classification Models in Data Mining TechniquesIRJET- Agricultural Crop Classification Models in Data Mining Techniques
IRJET- Agricultural Crop Classification Models in Data Mining TechniquesIRJET Journal
 
A Threshold fuzzy entropy based feature selection method applied in various b...
A Threshold fuzzy entropy based feature selection method applied in various b...A Threshold fuzzy entropy based feature selection method applied in various b...
A Threshold fuzzy entropy based feature selection method applied in various b...IJMER
 
GI-ANFIS APPROACH FOR ENVISAGE HEART ATTACK DISEASE USING DATA MINING TECHNIQUES
GI-ANFIS APPROACH FOR ENVISAGE HEART ATTACK DISEASE USING DATA MINING TECHNIQUESGI-ANFIS APPROACH FOR ENVISAGE HEART ATTACK DISEASE USING DATA MINING TECHNIQUES
GI-ANFIS APPROACH FOR ENVISAGE HEART ATTACK DISEASE USING DATA MINING TECHNIQUESAM Publications
 
IRJET - Finger Vein Extraction and Authentication System for ATM
IRJET -  	  Finger Vein Extraction and Authentication System for ATMIRJET -  	  Finger Vein Extraction and Authentication System for ATM
IRJET - Finger Vein Extraction and Authentication System for ATMIRJET Journal
 

Mais procurados (19)

Comparison of Cost Estimation Methods using Hybrid Artificial Intelligence on...
Comparison of Cost Estimation Methods using Hybrid Artificial Intelligence on...Comparison of Cost Estimation Methods using Hybrid Artificial Intelligence on...
Comparison of Cost Estimation Methods using Hybrid Artificial Intelligence on...
 
IRJET- Supervised Learning Classification Algorithms Comparison
IRJET- Supervised Learning Classification Algorithms ComparisonIRJET- Supervised Learning Classification Algorithms Comparison
IRJET- Supervised Learning Classification Algorithms Comparison
 
IRJET- Probability based Missing Value Imputation Method and its Analysis
IRJET- Probability based Missing Value Imputation Method and its AnalysisIRJET- Probability based Missing Value Imputation Method and its Analysis
IRJET- Probability based Missing Value Imputation Method and its Analysis
 
IRJET- Expert Independent Bayesian Data Fusion and Decision Making Model for ...
IRJET- Expert Independent Bayesian Data Fusion and Decision Making Model for ...IRJET- Expert Independent Bayesian Data Fusion and Decision Making Model for ...
IRJET- Expert Independent Bayesian Data Fusion and Decision Making Model for ...
 
Statistical Data Analysis on a Data Set (Diabetes 130-US hospitals for years ...
Statistical Data Analysis on a Data Set (Diabetes 130-US hospitals for years ...Statistical Data Analysis on a Data Set (Diabetes 130-US hospitals for years ...
Statistical Data Analysis on a Data Set (Diabetes 130-US hospitals for years ...
 
Data Imputation by Soft Computing
Data Imputation by Soft ComputingData Imputation by Soft Computing
Data Imputation by Soft Computing
 
Anomaly detection via eliminating data redundancy and rectifying data error i...
Anomaly detection via eliminating data redundancy and rectifying data error i...Anomaly detection via eliminating data redundancy and rectifying data error i...
Anomaly detection via eliminating data redundancy and rectifying data error i...
 
A02610104
A02610104A02610104
A02610104
 
Parametric comparison based on split criterion on classification algorithm
Parametric comparison based on split criterion on classification algorithmParametric comparison based on split criterion on classification algorithm
Parametric comparison based on split criterion on classification algorithm
 
Integrate fault tree analysis and fuzzy sets in quantitative risk assessment
Integrate fault tree analysis and fuzzy sets in quantitative risk assessmentIntegrate fault tree analysis and fuzzy sets in quantitative risk assessment
Integrate fault tree analysis and fuzzy sets in quantitative risk assessment
 
MPSKM Algorithm to Cluster Uneven Dimensional Time Series Subspace Data
MPSKM Algorithm to Cluster Uneven Dimensional Time Series Subspace DataMPSKM Algorithm to Cluster Uneven Dimensional Time Series Subspace Data
MPSKM Algorithm to Cluster Uneven Dimensional Time Series Subspace Data
 
Overview of soft intelligent computing technique for supercritical fluid extr...
Overview of soft intelligent computing technique for supercritical fluid extr...Overview of soft intelligent computing technique for supercritical fluid extr...
Overview of soft intelligent computing technique for supercritical fluid extr...
 
A Firefly based improved clustering algorithm
A Firefly based improved clustering algorithmA Firefly based improved clustering algorithm
A Firefly based improved clustering algorithm
 
A new model for iris data set classification based on linear support vector m...
A new model for iris data set classification based on linear support vector m...A new model for iris data set classification based on linear support vector m...
A new model for iris data set classification based on linear support vector m...
 
Data mining techniques application for prediction in OLAP cube
Data mining techniques application for prediction in OLAP cubeData mining techniques application for prediction in OLAP cube
Data mining techniques application for prediction in OLAP cube
 
IRJET- Agricultural Crop Classification Models in Data Mining Techniques
IRJET- Agricultural Crop Classification Models in Data Mining TechniquesIRJET- Agricultural Crop Classification Models in Data Mining Techniques
IRJET- Agricultural Crop Classification Models in Data Mining Techniques
 
A Threshold fuzzy entropy based feature selection method applied in various b...
A Threshold fuzzy entropy based feature selection method applied in various b...A Threshold fuzzy entropy based feature selection method applied in various b...
A Threshold fuzzy entropy based feature selection method applied in various b...
 
GI-ANFIS APPROACH FOR ENVISAGE HEART ATTACK DISEASE USING DATA MINING TECHNIQUES
GI-ANFIS APPROACH FOR ENVISAGE HEART ATTACK DISEASE USING DATA MINING TECHNIQUESGI-ANFIS APPROACH FOR ENVISAGE HEART ATTACK DISEASE USING DATA MINING TECHNIQUES
GI-ANFIS APPROACH FOR ENVISAGE HEART ATTACK DISEASE USING DATA MINING TECHNIQUES
 
IRJET - Finger Vein Extraction and Authentication System for ATM
IRJET -  	  Finger Vein Extraction and Authentication System for ATMIRJET -  	  Finger Vein Extraction and Authentication System for ATM
IRJET - Finger Vein Extraction and Authentication System for ATM
 

Semelhante a Improved correlation analysis and visualization of industrial alarm data

Influence over the Dimensionality Reduction and Clustering for Air Quality Me...
Influence over the Dimensionality Reduction and Clustering for Air Quality Me...Influence over the Dimensionality Reduction and Clustering for Air Quality Me...
Influence over the Dimensionality Reduction and Clustering for Air Quality Me...IJAEMSJORNAL
 
ANALYSIS AND COMPARISON STUDY OF DATA MINING ALGORITHMS USING RAPIDMINER
ANALYSIS AND COMPARISON STUDY OF DATA MINING ALGORITHMS USING RAPIDMINERANALYSIS AND COMPARISON STUDY OF DATA MINING ALGORITHMS USING RAPIDMINER
ANALYSIS AND COMPARISON STUDY OF DATA MINING ALGORITHMS USING RAPIDMINERIJCSEA Journal
 
Analysis on Data Mining Techniques for Heart Disease Dataset
Analysis on Data Mining Techniques for Heart Disease DatasetAnalysis on Data Mining Techniques for Heart Disease Dataset
Analysis on Data Mining Techniques for Heart Disease DatasetIRJET Journal
 
Neural networks for the prediction and forecasting of water resources variables
Neural networks for the prediction and forecasting of water resources variablesNeural networks for the prediction and forecasting of water resources variables
Neural networks for the prediction and forecasting of water resources variablesJonathan D'Cruz
 
Multimode system condition monitoring using sparsity reconstruction for quali...
Multimode system condition monitoring using sparsity reconstruction for quali...Multimode system condition monitoring using sparsity reconstruction for quali...
Multimode system condition monitoring using sparsity reconstruction for quali...IJECEIAES
 
Fault detection of imbalanced data using incremental clustering
Fault detection of imbalanced data using incremental clusteringFault detection of imbalanced data using incremental clustering
Fault detection of imbalanced data using incremental clusteringIRJET Journal
 
Data mining projects topics for java and dot net
Data mining projects topics for java and dot netData mining projects topics for java and dot net
Data mining projects topics for java and dot netredpel dot com
 
An intrusion detection algorithm for ami
An intrusion detection algorithm for amiAn intrusion detection algorithm for ami
An intrusion detection algorithm for amiIJCI JOURNAL
 
Data repository for sensor network a data mining approach
Data repository for sensor network  a data mining approachData repository for sensor network  a data mining approach
Data repository for sensor network a data mining approachijdms
 
Performance Comparision of Machine Learning Algorithms
Performance Comparision of Machine Learning AlgorithmsPerformance Comparision of Machine Learning Algorithms
Performance Comparision of Machine Learning AlgorithmsDinusha Dilanka
 
Transient stability analysis of power system
Transient stability analysis of power systemTransient stability analysis of power system
Transient stability analysis of power systemTauhidulIslam32
 
UNIT - 5 : 20ACS04 – PROBLEM SOLVING AND PROGRAMMING USING PYTHON
UNIT - 5 : 20ACS04 – PROBLEM SOLVING AND PROGRAMMING USING PYTHONUNIT - 5 : 20ACS04 – PROBLEM SOLVING AND PROGRAMMING USING PYTHON
UNIT - 5 : 20ACS04 – PROBLEM SOLVING AND PROGRAMMING USING PYTHONNandakumar P
 
TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...
TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...
TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...IJDKP
 
Noise-robust classification with hypergraph neural network
Noise-robust classification with hypergraph neural networkNoise-robust classification with hypergraph neural network
Noise-robust classification with hypergraph neural networknooriasukmaningtyas
 
Traffic Outlier Detection by Density-Based Bounded Local Outlier Factors
Traffic Outlier Detection by Density-Based Bounded Local Outlier FactorsTraffic Outlier Detection by Density-Based Bounded Local Outlier Factors
Traffic Outlier Detection by Density-Based Bounded Local Outlier FactorsITIIIndustries
 
IRJET - Comparative Analysis of GUI based Prediction of Parkinson Disease usi...
IRJET - Comparative Analysis of GUI based Prediction of Parkinson Disease usi...IRJET - Comparative Analysis of GUI based Prediction of Parkinson Disease usi...
IRJET - Comparative Analysis of GUI based Prediction of Parkinson Disease usi...IRJET Journal
 
Application Of Extreme Value Theory To Bursts Prediction
Application Of Extreme Value Theory To Bursts PredictionApplication Of Extreme Value Theory To Bursts Prediction
Application Of Extreme Value Theory To Bursts PredictionCSCJournals
 
Ijariie1117 volume 1-issue 1-page-25-27
Ijariie1117 volume 1-issue 1-page-25-27Ijariie1117 volume 1-issue 1-page-25-27
Ijariie1117 volume 1-issue 1-page-25-27IJARIIE JOURNAL
 

Semelhante a Improved correlation analysis and visualization of industrial alarm data (20)

Influence over the Dimensionality Reduction and Clustering for Air Quality Me...
Influence over the Dimensionality Reduction and Clustering for Air Quality Me...Influence over the Dimensionality Reduction and Clustering for Air Quality Me...
Influence over the Dimensionality Reduction and Clustering for Air Quality Me...
 
50120140504015
5012014050401550120140504015
50120140504015
 
ANALYSIS AND COMPARISON STUDY OF DATA MINING ALGORITHMS USING RAPIDMINER
ANALYSIS AND COMPARISON STUDY OF DATA MINING ALGORITHMS USING RAPIDMINERANALYSIS AND COMPARISON STUDY OF DATA MINING ALGORITHMS USING RAPIDMINER
ANALYSIS AND COMPARISON STUDY OF DATA MINING ALGORITHMS USING RAPIDMINER
 
Analysis on Data Mining Techniques for Heart Disease Dataset
Analysis on Data Mining Techniques for Heart Disease DatasetAnalysis on Data Mining Techniques for Heart Disease Dataset
Analysis on Data Mining Techniques for Heart Disease Dataset
 
Neural networks for the prediction and forecasting of water resources variables
Neural networks for the prediction and forecasting of water resources variablesNeural networks for the prediction and forecasting of water resources variables
Neural networks for the prediction and forecasting of water resources variables
 
Multimode system condition monitoring using sparsity reconstruction for quali...
Multimode system condition monitoring using sparsity reconstruction for quali...Multimode system condition monitoring using sparsity reconstruction for quali...
Multimode system condition monitoring using sparsity reconstruction for quali...
 
Fault detection of imbalanced data using incremental clustering
Fault detection of imbalanced data using incremental clusteringFault detection of imbalanced data using incremental clustering
Fault detection of imbalanced data using incremental clustering
 
A1802050102
A1802050102A1802050102
A1802050102
 
Data mining projects topics for java and dot net
Data mining projects topics for java and dot netData mining projects topics for java and dot net
Data mining projects topics for java and dot net
 
An intrusion detection algorithm for ami
An intrusion detection algorithm for amiAn intrusion detection algorithm for ami
An intrusion detection algorithm for ami
 
Data repository for sensor network a data mining approach
Data repository for sensor network  a data mining approachData repository for sensor network  a data mining approach
Data repository for sensor network a data mining approach
 
Performance Comparision of Machine Learning Algorithms
Performance Comparision of Machine Learning AlgorithmsPerformance Comparision of Machine Learning Algorithms
Performance Comparision of Machine Learning Algorithms
 
Transient stability analysis of power system
Transient stability analysis of power systemTransient stability analysis of power system
Transient stability analysis of power system
 
UNIT - 5 : 20ACS04 – PROBLEM SOLVING AND PROGRAMMING USING PYTHON
UNIT - 5 : 20ACS04 – PROBLEM SOLVING AND PROGRAMMING USING PYTHONUNIT - 5 : 20ACS04 – PROBLEM SOLVING AND PROGRAMMING USING PYTHON
UNIT - 5 : 20ACS04 – PROBLEM SOLVING AND PROGRAMMING USING PYTHON
 
TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...
TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...
TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...
 
Noise-robust classification with hypergraph neural network
Noise-robust classification with hypergraph neural networkNoise-robust classification with hypergraph neural network
Noise-robust classification with hypergraph neural network
 
Traffic Outlier Detection by Density-Based Bounded Local Outlier Factors
Traffic Outlier Detection by Density-Based Bounded Local Outlier FactorsTraffic Outlier Detection by Density-Based Bounded Local Outlier Factors
Traffic Outlier Detection by Density-Based Bounded Local Outlier Factors
 
IRJET - Comparative Analysis of GUI based Prediction of Parkinson Disease usi...
IRJET - Comparative Analysis of GUI based Prediction of Parkinson Disease usi...IRJET - Comparative Analysis of GUI based Prediction of Parkinson Disease usi...
IRJET - Comparative Analysis of GUI based Prediction of Parkinson Disease usi...
 
Application Of Extreme Value Theory To Bursts Prediction
Application Of Extreme Value Theory To Bursts PredictionApplication Of Extreme Value Theory To Bursts Prediction
Application Of Extreme Value Theory To Bursts Prediction
 
Ijariie1117 volume 1-issue 1-page-25-27
Ijariie1117 volume 1-issue 1-page-25-27Ijariie1117 volume 1-issue 1-page-25-27
Ijariie1117 volume 1-issue 1-page-25-27
 

Mais de ISA Interchange

An optimal general type-2 fuzzy controller for Urban Traffic Network
An optimal general type-2 fuzzy controller for Urban Traffic NetworkAn optimal general type-2 fuzzy controller for Urban Traffic Network
An optimal general type-2 fuzzy controller for Urban Traffic NetworkISA Interchange
 
Embedded intelligent adaptive PI controller for an electromechanical system
Embedded intelligent adaptive PI controller for an electromechanical  systemEmbedded intelligent adaptive PI controller for an electromechanical  system
Embedded intelligent adaptive PI controller for an electromechanical systemISA Interchange
 
State of charge estimation of lithium-ion batteries using fractional order sl...
State of charge estimation of lithium-ion batteries using fractional order sl...State of charge estimation of lithium-ion batteries using fractional order sl...
State of charge estimation of lithium-ion batteries using fractional order sl...ISA Interchange
 
Fractional order PID for tracking control of a parallel robotic manipulator t...
Fractional order PID for tracking control of a parallel robotic manipulator t...Fractional order PID for tracking control of a parallel robotic manipulator t...
Fractional order PID for tracking control of a parallel robotic manipulator t...ISA Interchange
 
Fuzzy logic for plant-wide control of biological wastewater treatment process...
Fuzzy logic for plant-wide control of biological wastewater treatment process...Fuzzy logic for plant-wide control of biological wastewater treatment process...
Fuzzy logic for plant-wide control of biological wastewater treatment process...ISA Interchange
 
Design and implementation of a control structure for quality products in a cr...
Design and implementation of a control structure for quality products in a cr...Design and implementation of a control structure for quality products in a cr...
Design and implementation of a control structure for quality products in a cr...ISA Interchange
 
Model based PI power system stabilizer design for damping low frequency oscil...
Model based PI power system stabilizer design for damping low frequency oscil...Model based PI power system stabilizer design for damping low frequency oscil...
Model based PI power system stabilizer design for damping low frequency oscil...ISA Interchange
 
A comparison of a novel robust decentralized control strategy and MPC for ind...
A comparison of a novel robust decentralized control strategy and MPC for ind...A comparison of a novel robust decentralized control strategy and MPC for ind...
A comparison of a novel robust decentralized control strategy and MPC for ind...ISA Interchange
 
Fault detection of feed water treatment process using PCA-WD with parameter o...
Fault detection of feed water treatment process using PCA-WD with parameter o...Fault detection of feed water treatment process using PCA-WD with parameter o...
Fault detection of feed water treatment process using PCA-WD with parameter o...ISA Interchange
 
Model-based adaptive sliding mode control of the subcritical boiler-turbine s...
Model-based adaptive sliding mode control of the subcritical boiler-turbine s...Model-based adaptive sliding mode control of the subcritical boiler-turbine s...
Model-based adaptive sliding mode control of the subcritical boiler-turbine s...ISA Interchange
 
A Proportional Integral Estimator-Based Clock Synchronization Protocol for Wi...
A Proportional Integral Estimator-Based Clock Synchronization Protocol for Wi...A Proportional Integral Estimator-Based Clock Synchronization Protocol for Wi...
A Proportional Integral Estimator-Based Clock Synchronization Protocol for Wi...ISA Interchange
 
An artificial intelligence based improved classification of two-phase flow patte...
An artificial intelligence based improved classification of two-phase flow patte...An artificial intelligence based improved classification of two-phase flow patte...
An artificial intelligence based improved classification of two-phase flow patte...ISA Interchange
 
New Method for Tuning PID Controllers Using a Symmetric Send-On-Delta Samplin...
New Method for Tuning PID Controllers Using a Symmetric Send-On-Delta Samplin...New Method for Tuning PID Controllers Using a Symmetric Send-On-Delta Samplin...
New Method for Tuning PID Controllers Using a Symmetric Send-On-Delta Samplin...ISA Interchange
 
Load estimator-based hybrid controller design for two-interleaved boost conve...
Load estimator-based hybrid controller design for two-interleaved boost conve...Load estimator-based hybrid controller design for two-interleaved boost conve...
Load estimator-based hybrid controller design for two-interleaved boost conve...ISA Interchange
 
Effects of Wireless Packet Loss in Industrial Process Control Systems
Effects of Wireless Packet Loss in Industrial Process Control SystemsEffects of Wireless Packet Loss in Industrial Process Control Systems
Effects of Wireless Packet Loss in Industrial Process Control SystemsISA Interchange
 
Fault Detection in the Distillation Column Process
Fault Detection in the Distillation Column ProcessFault Detection in the Distillation Column Process
Fault Detection in the Distillation Column ProcessISA Interchange
 
Neural Network-Based Actuator Fault Diagnosis for a Non-Linear Multi-Tank System
Neural Network-Based Actuator Fault Diagnosis for a Non-Linear Multi-Tank SystemNeural Network-Based Actuator Fault Diagnosis for a Non-Linear Multi-Tank System
Neural Network-Based Actuator Fault Diagnosis for a Non-Linear Multi-Tank SystemISA Interchange
 
A KPI-based process monitoring and fault detection framework for large-scale ...
A KPI-based process monitoring and fault detection framework for large-scale ...A KPI-based process monitoring and fault detection framework for large-scale ...
A KPI-based process monitoring and fault detection framework for large-scale ...ISA Interchange
 
An adaptive PID like controller using mix locally recurrent neural network fo...
An adaptive PID like controller using mix locally recurrent neural network fo...An adaptive PID like controller using mix locally recurrent neural network fo...
An adaptive PID like controller using mix locally recurrent neural network fo...ISA Interchange
 
A method to remove chattering alarms using median filters
A method to remove chattering alarms using median filtersA method to remove chattering alarms using median filters
A method to remove chattering alarms using median filtersISA Interchange
 

Mais de ISA Interchange (20)

An optimal general type-2 fuzzy controller for Urban Traffic Network
An optimal general type-2 fuzzy controller for Urban Traffic NetworkAn optimal general type-2 fuzzy controller for Urban Traffic Network
An optimal general type-2 fuzzy controller for Urban Traffic Network
 
Embedded intelligent adaptive PI controller for an electromechanical system
Embedded intelligent adaptive PI controller for an electromechanical  systemEmbedded intelligent adaptive PI controller for an electromechanical  system
Embedded intelligent adaptive PI controller for an electromechanical system
 
State of charge estimation of lithium-ion batteries using fractional order sl...
State of charge estimation of lithium-ion batteries using fractional order sl...State of charge estimation of lithium-ion batteries using fractional order sl...
State of charge estimation of lithium-ion batteries using fractional order sl...
 
Fractional order PID for tracking control of a parallel robotic manipulator t...
Fractional order PID for tracking control of a parallel robotic manipulator t...Fractional order PID for tracking control of a parallel robotic manipulator t...
Fractional order PID for tracking control of a parallel robotic manipulator t...
 
Fuzzy logic for plant-wide control of biological wastewater treatment process...
Fuzzy logic for plant-wide control of biological wastewater treatment process...Fuzzy logic for plant-wide control of biological wastewater treatment process...
Fuzzy logic for plant-wide control of biological wastewater treatment process...
 
Design and implementation of a control structure for quality products in a cr...
Design and implementation of a control structure for quality products in a cr...Design and implementation of a control structure for quality products in a cr...
Design and implementation of a control structure for quality products in a cr...
 
Model based PI power system stabilizer design for damping low frequency oscil...
Model based PI power system stabilizer design for damping low frequency oscil...Model based PI power system stabilizer design for damping low frequency oscil...
Model based PI power system stabilizer design for damping low frequency oscil...
 
A comparison of a novel robust decentralized control strategy and MPC for ind...
A comparison of a novel robust decentralized control strategy and MPC for ind...A comparison of a novel robust decentralized control strategy and MPC for ind...
A comparison of a novel robust decentralized control strategy and MPC for ind...
 
Fault detection of feed water treatment process using PCA-WD with parameter o...
Fault detection of feed water treatment process using PCA-WD with parameter o...Fault detection of feed water treatment process using PCA-WD with parameter o...
Fault detection of feed water treatment process using PCA-WD with parameter o...
 
Model-based adaptive sliding mode control of the subcritical boiler-turbine s...
Model-based adaptive sliding mode control of the subcritical boiler-turbine s...Model-based adaptive sliding mode control of the subcritical boiler-turbine s...
Model-based adaptive sliding mode control of the subcritical boiler-turbine s...
 
A Proportional Integral Estimator-Based Clock Synchronization Protocol for Wi...
A Proportional Integral Estimator-Based Clock Synchronization Protocol for Wi...A Proportional Integral Estimator-Based Clock Synchronization Protocol for Wi...
A Proportional Integral Estimator-Based Clock Synchronization Protocol for Wi...
 
An artificial intelligence based improved classification of two-phase flow patte...
An artificial intelligence based improved classification of two-phase flow patte...An artificial intelligence based improved classification of two-phase flow patte...
An artificial intelligence based improved classification of two-phase flow patte...
 
New Method for Tuning PID Controllers Using a Symmetric Send-On-Delta Samplin...
New Method for Tuning PID Controllers Using a Symmetric Send-On-Delta Samplin...New Method for Tuning PID Controllers Using a Symmetric Send-On-Delta Samplin...
New Method for Tuning PID Controllers Using a Symmetric Send-On-Delta Samplin...
 
Load estimator-based hybrid controller design for two-interleaved boost conve...
Load estimator-based hybrid controller design for two-interleaved boost conve...Load estimator-based hybrid controller design for two-interleaved boost conve...
Load estimator-based hybrid controller design for two-interleaved boost conve...
 
Effects of Wireless Packet Loss in Industrial Process Control Systems
Effects of Wireless Packet Loss in Industrial Process Control SystemsEffects of Wireless Packet Loss in Industrial Process Control Systems
Effects of Wireless Packet Loss in Industrial Process Control Systems
 
Fault Detection in the Distillation Column Process
Fault Detection in the Distillation Column ProcessFault Detection in the Distillation Column Process
Fault Detection in the Distillation Column Process
 
Neural Network-Based Actuator Fault Diagnosis for a Non-Linear Multi-Tank System
Neural Network-Based Actuator Fault Diagnosis for a Non-Linear Multi-Tank SystemNeural Network-Based Actuator Fault Diagnosis for a Non-Linear Multi-Tank System
Neural Network-Based Actuator Fault Diagnosis for a Non-Linear Multi-Tank System
 
A KPI-based process monitoring and fault detection framework for large-scale ...
A KPI-based process monitoring and fault detection framework for large-scale ...A KPI-based process monitoring and fault detection framework for large-scale ...
A KPI-based process monitoring and fault detection framework for large-scale ...
 
An adaptive PID like controller using mix locally recurrent neural network fo...
An adaptive PID like controller using mix locally recurrent neural network fo...An adaptive PID like controller using mix locally recurrent neural network fo...
An adaptive PID like controller using mix locally recurrent neural network fo...
 
A method to remove chattering alarms using median filters
A method to remove chattering alarms using median filtersA method to remove chattering alarms using median filters
A method to remove chattering alarms using median filters
 

Último

Lowrate Call Girls In Sector 18 Noida ❤️8860477959 Escorts 100% Genuine Servi...
Lowrate Call Girls In Sector 18 Noida ❤️8860477959 Escorts 100% Genuine Servi...Lowrate Call Girls In Sector 18 Noida ❤️8860477959 Escorts 100% Genuine Servi...
Lowrate Call Girls In Sector 18 Noida ❤️8860477959 Escorts 100% Genuine Servi...lizamodels9
 
Islamabad Escorts | Call 03070433345 | Escort Service in Islamabad
Islamabad Escorts | Call 03070433345 | Escort Service in IslamabadIslamabad Escorts | Call 03070433345 | Escort Service in Islamabad
Islamabad Escorts | Call 03070433345 | Escort Service in IslamabadAyesha Khan
 
FULL ENJOY Call girls in Paharganj Delhi | 8377087607
FULL ENJOY Call girls in Paharganj Delhi | 8377087607FULL ENJOY Call girls in Paharganj Delhi | 8377087607
FULL ENJOY Call girls in Paharganj Delhi | 8377087607dollysharma2066
 
Cybersecurity Awareness Training Presentation v2024.03
Cybersecurity Awareness Training Presentation v2024.03Cybersecurity Awareness Training Presentation v2024.03
Cybersecurity Awareness Training Presentation v2024.03DallasHaselhorst
 
Annual General Meeting Presentation Slides
Annual General Meeting Presentation SlidesAnnual General Meeting Presentation Slides
Annual General Meeting Presentation SlidesKeppelCorporation
 
Kenya Coconut Production Presentation by Dr. Lalith Perera
Kenya Coconut Production Presentation by Dr. Lalith PereraKenya Coconut Production Presentation by Dr. Lalith Perera
Kenya Coconut Production Presentation by Dr. Lalith Pereraictsugar
 
8447779800, Low rate Call girls in Shivaji Enclave Delhi NCR
8447779800, Low rate Call girls in Shivaji Enclave Delhi NCR8447779800, Low rate Call girls in Shivaji Enclave Delhi NCR
8447779800, Low rate Call girls in Shivaji Enclave Delhi NCRashishs7044
 
Digital Transformation in the PLM domain - distrib.pdf
Digital Transformation in the PLM domain - distrib.pdfDigital Transformation in the PLM domain - distrib.pdf
Digital Transformation in the PLM domain - distrib.pdfJos Voskuil
 
Ten Organizational Design Models to align structure and operations to busines...
Ten Organizational Design Models to align structure and operations to busines...Ten Organizational Design Models to align structure and operations to busines...
Ten Organizational Design Models to align structure and operations to busines...Seta Wicaksana
 
8447779800, Low rate Call girls in Uttam Nagar Delhi NCR
8447779800, Low rate Call girls in Uttam Nagar Delhi NCR8447779800, Low rate Call girls in Uttam Nagar Delhi NCR
8447779800, Low rate Call girls in Uttam Nagar Delhi NCRashishs7044
 
Keppel Ltd. 1Q 2024 Business Update Presentation Slides
Keppel Ltd. 1Q 2024 Business Update  Presentation SlidesKeppel Ltd. 1Q 2024 Business Update  Presentation Slides
Keppel Ltd. 1Q 2024 Business Update Presentation SlidesKeppelCorporation
 
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,noida100girls
 
Youth Involvement in an Innovative Coconut Value Chain by Mwalimu Menza
Youth Involvement in an Innovative Coconut Value Chain by Mwalimu MenzaYouth Involvement in an Innovative Coconut Value Chain by Mwalimu Menza
Youth Involvement in an Innovative Coconut Value Chain by Mwalimu Menzaictsugar
 
APRIL2024_UKRAINE_xml_0000000000000 .pdf
APRIL2024_UKRAINE_xml_0000000000000 .pdfAPRIL2024_UKRAINE_xml_0000000000000 .pdf
APRIL2024_UKRAINE_xml_0000000000000 .pdfRbc Rbcua
 
8447779800, Low rate Call girls in New Ashok Nagar Delhi NCR
8447779800, Low rate Call girls in New Ashok Nagar Delhi NCR8447779800, Low rate Call girls in New Ashok Nagar Delhi NCR
8447779800, Low rate Call girls in New Ashok Nagar Delhi NCRashishs7044
 
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...lizamodels9
 
Call Girls In Connaught Place Delhi ❤️88604**77959_Russian 100% Genuine Escor...
Call Girls In Connaught Place Delhi ❤️88604**77959_Russian 100% Genuine Escor...Call Girls In Connaught Place Delhi ❤️88604**77959_Russian 100% Genuine Escor...
Call Girls In Connaught Place Delhi ❤️88604**77959_Russian 100% Genuine Escor...lizamodels9
 
International Business Environments and Operations 16th Global Edition test b...
International Business Environments and Operations 16th Global Edition test b...International Business Environments and Operations 16th Global Edition test b...
International Business Environments and Operations 16th Global Edition test b...ssuserf63bd7
 
/:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In...
/:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In.../:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In...
/:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In...lizamodels9
 

Último (20)

Lowrate Call Girls In Sector 18 Noida ❤️8860477959 Escorts 100% Genuine Servi...
Lowrate Call Girls In Sector 18 Noida ❤️8860477959 Escorts 100% Genuine Servi...Lowrate Call Girls In Sector 18 Noida ❤️8860477959 Escorts 100% Genuine Servi...
Lowrate Call Girls In Sector 18 Noida ❤️8860477959 Escorts 100% Genuine Servi...
 
Islamabad Escorts | Call 03070433345 | Escort Service in Islamabad
Islamabad Escorts | Call 03070433345 | Escort Service in IslamabadIslamabad Escorts | Call 03070433345 | Escort Service in Islamabad
Islamabad Escorts | Call 03070433345 | Escort Service in Islamabad
 
FULL ENJOY Call girls in Paharganj Delhi | 8377087607
FULL ENJOY Call girls in Paharganj Delhi | 8377087607FULL ENJOY Call girls in Paharganj Delhi | 8377087607
FULL ENJOY Call girls in Paharganj Delhi | 8377087607
 
Cybersecurity Awareness Training Presentation v2024.03
Cybersecurity Awareness Training Presentation v2024.03Cybersecurity Awareness Training Presentation v2024.03
Cybersecurity Awareness Training Presentation v2024.03
 
Annual General Meeting Presentation Slides
Annual General Meeting Presentation SlidesAnnual General Meeting Presentation Slides
Annual General Meeting Presentation Slides
 
Kenya Coconut Production Presentation by Dr. Lalith Perera
Kenya Coconut Production Presentation by Dr. Lalith PereraKenya Coconut Production Presentation by Dr. Lalith Perera
Kenya Coconut Production Presentation by Dr. Lalith Perera
 
Corporate Profile 47Billion Information Technology
Corporate Profile 47Billion Information TechnologyCorporate Profile 47Billion Information Technology
Corporate Profile 47Billion Information Technology
 
8447779800, Low rate Call girls in Shivaji Enclave Delhi NCR
8447779800, Low rate Call girls in Shivaji Enclave Delhi NCR8447779800, Low rate Call girls in Shivaji Enclave Delhi NCR
8447779800, Low rate Call girls in Shivaji Enclave Delhi NCR
 
Digital Transformation in the PLM domain - distrib.pdf
Digital Transformation in the PLM domain - distrib.pdfDigital Transformation in the PLM domain - distrib.pdf
Digital Transformation in the PLM domain - distrib.pdf
 
Ten Organizational Design Models to align structure and operations to busines...
Ten Organizational Design Models to align structure and operations to busines...Ten Organizational Design Models to align structure and operations to busines...
Ten Organizational Design Models to align structure and operations to busines...
 
8447779800, Low rate Call girls in Uttam Nagar Delhi NCR
8447779800, Low rate Call girls in Uttam Nagar Delhi NCR8447779800, Low rate Call girls in Uttam Nagar Delhi NCR
8447779800, Low rate Call girls in Uttam Nagar Delhi NCR
 
Keppel Ltd. 1Q 2024 Business Update Presentation Slides
Keppel Ltd. 1Q 2024 Business Update  Presentation SlidesKeppel Ltd. 1Q 2024 Business Update  Presentation Slides
Keppel Ltd. 1Q 2024 Business Update Presentation Slides
 
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
 
Youth Involvement in an Innovative Coconut Value Chain by Mwalimu Menza
Youth Involvement in an Innovative Coconut Value Chain by Mwalimu MenzaYouth Involvement in an Innovative Coconut Value Chain by Mwalimu Menza
Youth Involvement in an Innovative Coconut Value Chain by Mwalimu Menza
 
APRIL2024_UKRAINE_xml_0000000000000 .pdf
APRIL2024_UKRAINE_xml_0000000000000 .pdfAPRIL2024_UKRAINE_xml_0000000000000 .pdf
APRIL2024_UKRAINE_xml_0000000000000 .pdf
 
8447779800, Low rate Call girls in New Ashok Nagar Delhi NCR
8447779800, Low rate Call girls in New Ashok Nagar Delhi NCR8447779800, Low rate Call girls in New Ashok Nagar Delhi NCR
8447779800, Low rate Call girls in New Ashok Nagar Delhi NCR
 
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...
 
Call Girls In Connaught Place Delhi ❤️88604**77959_Russian 100% Genuine Escor...
Call Girls In Connaught Place Delhi ❤️88604**77959_Russian 100% Genuine Escor...Call Girls In Connaught Place Delhi ❤️88604**77959_Russian 100% Genuine Escor...
Call Girls In Connaught Place Delhi ❤️88604**77959_Russian 100% Genuine Escor...
 
International Business Environments and Operations 16th Global Edition test b...
International Business Environments and Operations 16th Global Edition test b...International Business Environments and Operations 16th Global Edition test b...
International Business Environments and Operations 16th Global Edition test b...
 
/:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In...
/:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In.../:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In...
/:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In...
 

Improved correlation analysis and visualization of industrial alarm data

  • 1. ISA Transactions 51 (2012) 499–506 Contents lists available at SciVerse ScienceDirect ISA Transactions journal homepage: www.elsevier.com/locate/isatrans Improved correlation analysis and visualization of industrial alarm data✩ F. Yanga,b,∗ , S.L. Shaha , D. Xiaob , T. Chenc a Department of Chemical & Materials Engineering, University of Alberta, Edmonton, AB T6G 2V4, Canada b Tsinghua National Laboratory for Information Science and Technology (TNList), Department of Automation, Tsinghua University, Beijing 100084, PR China c Department of Electrical & Computer Engineering, University of Alberta, Edmonton, AB T6G 2V4, Canada a r t i c l e i n f o Article history: Received 13 August 2011 Received in revised form 12 March 2012 Accepted 23 March 2012 Available online 12 April 2012 Keywords: Alarm management Correlation color map Visualization Gaussian kernel Pseudo data Clustering a b s t r a c t The problem of multivariate alarm analysis and rationalization is complex and important in the area of smart alarm management due to the interrelationships between variables. The technique of capturing and visualizing the correlation information, especially from historical alarm data directly, is beneficial for further analysis. In this paper, the Gaussian kernel method is applied to generate pseudo continuous time series from the original binary alarm data. This can reduce the influence of missed, false, and chattering alarms. By taking into account time lags between alarm variables, a correlation color map of the transformed or pseudo data is used to show clusters of correlated variables with the alarm tags reordered to better group the correlated alarms. Thereafter correlation and redundancy information can be easily found and used to improve the alarm settings; and statistical methods such as singular value decomposition techniques can be applied within each cluster to help design multivariate alarm strategies. Industrial case studies are given to illustrate the practicality and efficacy of the proposed method. This improved method is shown to be better than the alarm similarity color map when applied in the analysis of industrial alarm data. © 2012 ISA. Published by Elsevier Ltd. All rights reserved. 1. Introduction During routine operation of industrial systems, abnormal sit- uations show up in the control room as alarms. Most processes, however, have too many alarm tags configured on process vari- ables and other state variables mainly due to safety considerations. The number of alarmed tags is large enough to overwhelm even ex- perienced operators. Because of this situation, alarm management has been recognized as an important problem in the area of sys- tem monitoring and fault detection [1,2]. New and revised guide- lines and standards have been proposed from different viewpoints to tackle this problem [3–7]. The focus of many current projects in industry is on ‘alarm rationalization’, that is, to reduce the number of alarms and yet at the same time be able to detect all potential abnormal situations. Most alarms are analyzed in a univariate framework. However it is well known that most processes are multivariate and thus ✩ A short version of this paper was presented at the 18th IFAC World Congress, Milano, Italy, August 28–September 2, 2011. ∗ Corresponding author at: Tsinghua National Laboratory for Information Science and Technology (TNList), Department of Automation, Tsinghua University, Beijing 100084, PR China. Tel.: +86 10 6277 1347; fax: +86 10 6278 6911. E-mail addresses: yangfan@tsinghua.edu.cn,fyang.cme@ualberta.ca (F. Yang), Sirish.Shah@ualberta.ca (S.L. Shah), xiaody@tsinghua.edu.cn (D. Xiao), tchen@ece.ualberta.ca (T. Chen). the alarms are not independent. Therefore a good strategy for the analysis and visualization of alarm data should be based on a multivariate framework. This type of multivariate information can be easily captured from process data and multivariate statistics can be applied to generate efficient alarm strategy [8] or optimize alarm limits [9]. Such methods can be categorized as indirect methods because we resort to process data only, i.e., measured values of process variables in engineering units. However, one may have many alarm tags associated with a single process variable and often alarm tags are without any association with any specific process variables (e.g., digital alarms to indicate a specific event or situation). These field alarms which have only ON/OFF (‘1’ or ‘0’) values are hard to analyze if there is no process data available. All of these issues necessitate development of new methods to mine alarm data directly. Compared to the analysis of numerical process data, the analysis of binary alarm data has some unique properties. First, to analyze the statistical properties, the process data should be stationary, which is not satisfied because both the normal and abnormal data are included. The relationship between different variables may be different under different situations and should be studied separately. The alarm data, however, only include the data in abnormal situation, leading to a simpler case. Second, the information is often hidden in noise, which is inevitable in process data. Consequently there may be some potential value in the analysis of alarm data [4]. 0019-0578/$ – see front matter © 2012 ISA. Published by Elsevier Ltd. All rights reserved. doi:10.1016/j.isatra.2012.03.005
  • 2. 500 F. Yang et al. / ISA Transactions 51 (2012) 499–506 There are two ways to capture correlation from alarm data: one is to employ Pearson’s correlation coefficients as done for continuous data by assigning a value of 1 for the periods when the alarm is active and a value of 0 when it is inactive [9]; the other is to introduce similarity measures based on binary data [10,11]. By computing the correlation coefficient or similarity measure for each pair of variables, a matrix is constructed. However, this matrix which is composed of specific numbers is not generally convenient for inspection or cursory analysis. Therefore the correlation values are converted into a color map and this correlation color map offers better visualization capability; it uses different colors to show different degrees of correlation [12,13]. The colors are usually discretized into several levels according to the color code. Through the reordering and clustering of variables, the color-coded correlation map can show the clusters of alarm tags intuitively. In this way, the problem of a large number of variables is separated into smaller sub-problems with much fewer variables. The classical correlation color map only uses the correlation coefficients. However, it may be ineffective when there is a time lag between two variables. Hence the correlation coefficients should be lag-adjusted to take into account the delays between each pair of variables. The alarm similarity color map (ASCM), which is specially designed for alarm data analysis [10], has proved to be an effective method for visualization. It captures time-delayed similarity information from binary data. However it has some disadvantages. Firstly, in order to weaken the sensitivity caused by the time shift, each unique alarm is padded with extra 1’s to enrich the data; or in a similar form, alarms are checked within constantly divided time windows (equivalent to down sampling) [11]. This step is short of physical evidence and as a result of this individual false alarms and chattering alarms may be magnified unreasonably. In this paper we suggest another method in which each unique alarm is replaced by a Gaussian distribution along with its neighbors in the time axis [14]. This set of generated continuous data with numerical values is labeled as pseudo data. Based on this data set, we use Pearson’s correlation coefficients to measure the correlation. This method has some similar problems as the padding method; however, it provides a better way of transforming binary alarm data into continuous data that can be analyzed by statistical approaches. Another disadvantage of the ASCM method is that it takes into account all the alarm tags separately regardless their physical relationship. If typical analogue alarms (HI/LO/HH/LL) associated with the same process variable are grouped together, the result should be expected to be more reasonable. Thirdly, the similarity measure only indicates the distance but loses the direction (positive or negative correlation), while the correlation coefficient indicates both the similarity and the direction. In this paper, a new approach to analyzing multiple alarm series is proposed to find redundancy. By converting binary data into continuous (pseudo) data via a novel Gaussian kernel method, statistical methods are applied, including an improved correlation and singular value analysis. The numerical results can be visualized by the ASCM tool via appropriate ordering and clustering of alarm tags. 2. Basic method for correlation analysis 2.1. Gaussian kernel method for data preprocessing Alarm data is binary data with ‘1’s’ and ‘0’s’ to denote abnormal and normal states respectively. Since alarm data is obtained and discretized from continuous process data according to alarm limits, some information is lost in this process. However, alarm data does have advantages as well: (1) binary data is easy for storage and convenient for statistical analysis; (2) it has much higher sampling frequencies than process data and thus includes detailed information; and (3) it includes more types of data such as digital alarms showing the status of some elements (e.g. if a pump is on or off). Thus we can use alarm data directly to analyze the correlation. For a particular variable, one alarm point can be regarded as a sample of the time series. To apply correlation analysis to alarm occurrences, a continuous analogue signal has to be generated from the alarm signal which is actually an event series. By simply assigning ‘1’ or ‘0’, a time series in a continuous temporal domain can be generated, which is simple but lacks granularity because non-continuous (binary) values are not commensurate for correlation analysis. In order to better estimate the time series, the kernel method can be used in the temporal domain [15], which is a nonparametric method, to fit the function with any shape. Here the Gaussian kernel function is used because of its smoothness. At each alarm point, a Gaussian kernel function is superimposed around this time instant. The function is defined as: K(t) = 1 √ 2πσ e− t2 2σ , (1) where σ is the standard deviation. If all the alarm points are considered, a continuous time series can be obtained as the superposition of all the time-shifted Gaussian kernel functions. Thus the resulting time series is given as P(t) = N i=1 K(t − ti), (2) where N is the number of alarm points at the time instants ti, i = 1, 2, . . . , N. This time series P(t) can be regarded as the estimation of the corresponding process data although there may not exist such a physical process variable; hence we call it pseudo data. Fig. 1 illustrates the principle of generating such pseudo data from binary data. The pseudo time series has the following properties: • It is continuous and smooth because it is approximated by superimposed Gaussian functions. • For consecutive alarm points lasting a long time, the pseudo data is also 1’s because the integral of the Gaussian kernel function over the whole time domain is 1. This is different from the estimation of probability density function because the samples here are at different time instants. • With appropriate variance of the kernel function, the magni- tude of the kernel function is small; thus non-consecutive and sparse alarm points cannot result in spikes in the pseudo data. This property is particularly important for the filtering of chat- tering alarms so that the proposed method can be used directly on the data set with chattering alarms. Consider the example of alarm data as shown in Fig. 2(a) in which there is a consecutive alarm period of 400 s including a short (5 s) break at around the 510th interval. At both the beginning and the end of this period, there is some chattering. From this data set, we can generate the pseudo data with standard deviation of 30 s as shown in Fig. 2(b). Here the chattering and the gap are sufficiently smoothened and at the same time the alarm period prevails. The effects of missed alarms and false alarms are lessened because they usually only exist over a short time duration. 2.2. Effect of time lags There may exist a time lag between two correlated alarms due to the propagation or response time. This lag reduces the correlation coefficient and thus may mask the correlation be- tween these variables. To eliminate this influence, different time
  • 3. F. Yang et al. / ISA Transactions 51 (2012) 499–506 501 1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 17 t 16 1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 17 t 16 (a) Alarm data. (b) Pseudo data. Fig. 1. Transforming alarm data (a) to pseudo data (b). (a) Alarm data. (b) Pseudo data. Fig. 2. (a) Original alarm data; (b) generated pseudo data by the Gaussian kernel method. lags should be assumed and the maximal correlation coeffi- cient can be computed; this would be regarded as the real correlation [16]. Assume that x and y are time series of n observations with means µx, µy and variances σx, σy respectively, then the cross- correlation function (CCF) with an assumed lag k on y is: φxy(k) = E  (xi − µx)(yi+k − µy)  σxσy , k = −n + 1, . . . , n − 1. (3) The expectation can be estimated as the sample CCF by: ˆφxy(k) =    1 n − k n−k i=1 (xi − µx)(yi+k − µy)  σxσy, if k ≥ 0, 1 n − k n i=1−k (xi − µx)(yi+k − µy)  σxσy, if k < 0. (4) A value of the CCF is obtained by assuming a certain time lag for one of the time series. Thus the absolute maximum value can be regarded as the real cross-correlation and the corresponding lag as the estimated time lag between these two variables. For a mathematical description, one can compute the maximum and minimum values φmax = maxk{φxy(k), 0} ≥ 0 and φmin = mink{φxy(k), 0} ≤ 0, and the corresponding arguments kmax and kmin . Then the estimated time delay from x to y is: λ =  kmax , if φmax ≥ −φmin , kmin , if φmax < −φmin . (5) (corresponding to the maximum absolute value) and the actual time delayed cross-correlation is ρ = φxy(λ) (between −1 and 1). If λ is less than zero, then it means that the actual delay is from y to x. Thus the sign of λ provides the directionality information between x and y. The sign of ρ indicates whether the correlation is positive or negative. Note that this directionality does not mean causality because there may exist another common cause of the relationship between x and y [17]. 2.3. Finding redundant alarms based on correlation If several alarms are highly correlated, we should check if there is redundancy in them, i.e., if some alarms can be obtained by linearly combining other alarm signals. Singular value decomposition (SVD) is a well-developed method to do this by finding singular values of the set of alarm data and thus enabling one to identify collinear columns. If there are several large singular values that possess the dominant proportion of all the values, then the number of large or dominant singular values can be regarded as the number of independent alarm tags and the residual number is the number of redundant alarms. However, SVD cannot tell us which alarm is redundant because we are dealing with the transformed or pseudo data to generate a set of new data. It only provides an opportunity for one to check if there is an alarm that can be removed or replaced by the linear combination of other alarm signals. In addition, the independent series in the new data may not have the approximate binary property, making these values inappropriate to be taken as generated alarm signals. One should also consider the physical meaning of all alarms and reconcile this with the SVD information. This ambiguity in using SVD is much severe if the number of alarms is large. In this case, it is difficult to identify the independent and redundant alarms merely based on the result of SVD. Therefore, we need to separate the variables into several groups and perform SVD on each group of variables. The clustering process based on visualizing the result of the correlation analysis would help one in analyzing SVDs of smaller groups of variables. Another point to be noted is that, for SVD analysis, the time series should be sufficiently long to include many alarms, otherwise it cannot reflect its true property. 3. Visualization of correlation In order to visualize the correlation matrix, the correlation color map is developed, in which the order of variables is rearranged in a ranked order where variables that are highly correlated with each other appear together with prominent colors or shades, called clusters.
  • 4. 502 F. Yang et al. / ISA Transactions 51 (2012) 499–506 3.1. Ordering and clustering of alarm tags Based on the correlation coefficient, the measure to describe the similarity between two alarms is known as the similarity measure S [18] which has the properties of positivity (S(x, y) ≥ 0), symmetry (S(x, y) = S(y, x)), and maximality (S(x, x) ≥ S(x, y)). There are many similarity measures available for binary data [19]; hence if we compute the similarity measures based on original alarm data, we can choose any one of them; for example Kondaveeti et al. [10] have used the Jaccard similarity measure. However we are dealing with pseudo data; so generally we can use the covariance, Euclidean distance (L2), Kendall’s τ, Pearson’s correlation coefficient, Spearman’s rank, City-block (L1), etc. [20]. In this paper, we use the classical Pearson’s correlation coefficient and define the following similarity measure S: S(x, y) = |ρxy|, (6) which lies in the interval [0, 1]. During clustering, one should employ both the similarity measure between two alarms and also the measure between two clusters of alarms. The latter can be defined by single-, complete-, and average-linkage methods [20]. The definition of single- linkage is SS(X, Y) = min x∈X,y∈Y {S(x, y)}, (7) the definition of complete-linkage is SC(X, Y) = max x∈X,y∈Y {S(x, y)}, (8) and the definition of average-linkage is SA(X, Y) =  x∈X,y∈Y S(x, y)  |X| |Y|, (9) where x and y are alarm tags, X and Y are clusters of tags, and | · | means the size of the cluster. Using the similarity measure between each pair of alarms or their clusters, all alarm tags can be clustered based on various clus- tering algorithms such as agglomerative hierarchical clustering, the methodology for which is shown as a dendrogram. This process has been illustrated in Kondaveeti et al.’s study [10]. Other ordering and clustering algorithms can also be used such as the ellipse or- dering. For this purpose, the data analysis software GAP is a useful tool [20]. 3.2. Correlation color map The correlation matrix with rearranged rows and columns is color coded to transform it into a color map [12]. The matrix is symmetric and both the vertical and horizontal axes are alarm tag names. Each grid (x, y) shows the corresponding correlation between the two alarm tags x and y. The color is coded to show the correlation (−1 to +1) or its absolute (0–1). To explain the meaning of each color, a color bar legend is placed beside the plot. The order of the alarm tags is determined by the above algorithm. The color map is symmetric about the diagonal which means the auto-correlation coefficients are 1’s. The clusters are shown as blocks located along the diagonal with similar color codes. From this map, one can acquire the approximate correlation between each pair of alarms, and easily find clusters and the corresponding alarm tags. By matching this map with the physical meaning, one can locate the redundant alarms. 4. Implementation issues In the above methods there are several issues to be noted. 4.1. Sampling rates of alarm data and pseudo data The alarm data is recorded by DCS, PLC, or other smart devices; the sampling rate can be very high. This is necessary sometimes because alarms occur with a very high frequency due to oscillation or chattering. Generally speaking, we can record them in the database as frequently as possible, say one per second. A proposed rule of thumb is a 15 s sample interval as a lower bound [14]. It is generally unnecessary to record process data at such a high rate because the process always has a time constant with a magnitude of minutes or even longer. In real applications, we often record such samples at one sample per minute. Different from alarm data and process data, the pseudo data is generated from alarm data but behaves as continuous process data. Thus we can record the pseudo data at the same frequency as the original alarm data, but we can also decrease the sampling rate to reduce the computational burden and can also match it with the sample rate of process data. 4.2. Variance of the Gaussian kernel function The variance of the kernel function affects the robustness of the pseudo data at each alarm point. If the variance is very small, each alarm point generates a spike in the pseudo data making the method very sensitive to an individual sample or during a short period of alarms. On the other hand if the variance is large, the magnitude of each kernel function is very small; hence it needs quite a few alarm points during a period of no alarms or quite a few ‘holes’ during a period of consecutive alarms to change the trend of the pseudo curve. Although this can reduce the influence of the missed alarms, false alarms, and chattering alarms due to its robustness to individual changes, the latency increases on the other hand, meaning that there is a longer time delay to reflect a change. This is a trade-off between large and small variances. To our best knowledge, most variables in the process industry have a time constant that ranges from several seconds to several minutes and even hours; thus the standard deviation of the Gaussian kernel function can be chosen over this range. If there is a step change (from no alarm to alarm for example) in the alarm data, it needs several minutes for the pseudo data to completely change the value (from 0 to 1). 4.3. Computational effort Although the number of samples is generally large due to the high frequency of sampling rate, the alarm data set includes large amount of normal data resulting in the data set being quite sparse. In addition, when an alarm occurs consecutively, the corresponding pseudo time series is a set of consecutive 1’s according to the property mentioned earlier in Section 2; such periods can be ignored in the computation by letting them remain unchanged. Therefore the computation only concentrates on the non-consecutive alarm points and the boundary of the consecutive alarm points. In SVD, the vectors of the pseudo data can be very long, making the computation infeasible. We can either lower the sampling rate, or use the technique of sparse matrix analysis [21]. 4.4. Selection of data range Theoretically, a long series can result in better results, but the computational effort is higher. In addition, in correlation analysis we assume that the data is stationary; this is difficult to satisfy because the operating situation or fault pattern may vary, especially in a long series. Thus we should compromise to look into shorter series. Based on this compromise, the computation of correlation requires a sufficiently large data set that contains alarm
  • 5. F. Yang et al. / ISA Transactions 51 (2012) 499–506 503 Fig. 3. High density plot of the original alarm data for case study 1. points because the normal points do not provide information. In practice a short time period of simultaneous alarms is a much better data set than a long time period of rare alarms. In particular, the alarm data during an alarm flood is an especially good source for study. 4.5. Grouping of alarm tags of one process variable For a typical process variable, there may be as many as four alarms associated with it: HI, LO, HH, and LL alarms. If these alarm tags are treated separately, the relationship between them is lost. Instead, these four alarms can be combined together by assigning them different values based on their directions and degrees, for example, HI as +1, LO as −1, HH as +2, and LL as −2. This treatment makes the alarm data have several values instead of binary values. When generating the pseudo data, these values should be taken as the weights of the Gaussian kernels. 4.6. Choosing color code for color map We can either use the same color with different shades or different colors. Nevertheless, the number of color scales is very important, which determines the interpretability of the map. There is a trade-off between the sensitivity and the interpretability. A general recommendation is to use fewer number of codes first and increase the shades gradually to suit your eyes in the visualization process and provide more ‘granularity’ in the number of alarms for a process. In our work we use warm colors to describe positive correlation and cold colors to describe negative correlation. The four bands of values are chosen as (0–0.25), (0.25–0.5), (0.5–0.75), and (0.75–1) respectively. Please see the color bar in Fig. 4(b) for the color assignment. 5. Case studies To illustrate the practicality and utility of the method suggested in this work, we consider two case studies. The first case study is to illustrate the procedure of the proposed method and the improvements. The second case study is an application to a real industrial process, showing the efficacy of the method. 5.1. Case study 1 First consider data from a real industrial process including 10 analogue alarm tags with HI and LO settings that have alarms over a period of one week. The sampling rate is one sample per second. By assigning HI and LO alarms +1 and −1 respectively, the alarm data is shown as a high density plot in Fig. 3 where alarm tags 3 and 7 have both HI and LO alarms. If we use the method proposed by Kondaveeti et al. [10], the ASCM obtained is shown in Fig. 4(a) where the padding length used is 5 s. We find that alarm tags 1 and 2 are correlated, and alarm tags 4, 5, and 6 are grouped in another cluster. Now we use the method proposed in this paper by setting the standard deviation of the Gaussian kernel as 30 s. The correlation matrix Φ (with the elements below the diagonal removed due to symmetry) is given in Box I. Based on this matrix, Table 1 illustrates the clustering process. Initially each alarm tag is regarded as a cluster. In the first step, the similarity measure (S = absolutecorrelation) of each pair of alarm tags are compared, in which the similarity measure S between alarm tags 4 and 5 has the largest value, meaning that they are most similar. Therefore alarm tags 1 and 2 are grouped into a new cluster, 11, and the old clusters 4 and 5 are removed. Then we compute the similarity measure between each pair of new clusters. We take alarm tags 1 and 2 in the second step, and then 10 and 8 in the third step. In the fourth step, the measure a b Fig. 4. Alarm similarity color map of the original alarm data based on Kondaveeti’s scheme (a) and correlation color map of the pseudo data (b) for case study 1. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
  • 6. 504 F. Yang et al. / ISA Transactions 51 (2012) 499–506 Φ =              1 0.97 −0.06 −0.08 −0.08 0.33 0.23 0.54 0.34 0.42 1 −0.03 −0.07 −0.07 0.34 0.23 0.55 0.35 0.43 1 0.47 0.47 −0.21 0.26 −0.08 0.04 −0.12 1 1.00 −0.20 0.10 −0.16 −0.27 −0.13 1 −0.20 0.10 −0.16 −0.27 −0.13 1 −0.25 0.64 0.34 0.67 1 0.25 0.17 0.06 1 0.60 0.79 1 0.70 1              Box I. Table 1 Clustering process for case study 1. Step Clusters Largest measure Correlation Pair 0 1 2 9 10 8 6 3 4 5 7 1.00 4, 5 1 1 2 9 10 8 6 3 11 7 0.97 1, 2 2 12 9 10 8 6 3 11 7 0.79 10, 8 3 12 9 13 6 3 11 7 0.69 9, 8 4 12 14 6 3 11 7 0.67 6, 10 5 12 15 3 11 7 0.55 2, 8 6 16 3 11 7 0.47 3, 4/5 7 16 17 7 −0.27 4/5, 9 8 18 7 0.26 7, 3 9 19 – – between clusters 9 and 13 is the largest, which is determined by the similarity between alarm tags 9 and 8 according to the single- linkage method. So clusters 9 and 13 become the new cluster 14 which contains three alarm tags. The algorithm continues until only one cluster is left. The corresponding dendrogram is shown in Fig. 5. Thus the new order of alarm tags is obtained and thereby we have the correlation color map shown in Fig. 4(b). Apparently, alarm tags 1, 2, 9, 10, 8, and 6 form a single cluster with positive correlations, and alarm tags 3, 4, and 5 also have some correlations between them. This result clearly provides more information than the one shown in Fig. 4. Therefore the proposed method reveals some correlations that are not uncovered in the method proposed by Kondaveeti et al. [10], such as the correlation between alarm tags 8 and 9. Then we take the data of alarm tags 1, 2, 9, 10, 8, and 6 to perform SVD. The six singular values are 1720, 498, 347, 302, 199, and 165. Thus it is apparent that there exists redundancy in these tags because one or two singular values dominate most of the collinear information. According to the user’s preference and the corresponding physical meanings, some alarms may be removed. 5.2. Case study 2 Next consider a hydro treater process in a refinery in Alberta, Canada. Since there are hundreds of tags with alarms, we order them based on the numbers of occurrence and choose 11 process tags associated with analogue alarms (HI/LO) that contribute to the large number of alarm counts. The data set includes one week worth of alarm data at a rate of one sample per second, during which there are two evident floods leading to a large number of alarms. The above analysis is implemented on this data set and the color map obtained is shown in Fig. 6(a). The order of tags is determined by the same algorithm as the previous case study; the procedure is shown in Fig. 7. It can be observed in Fig. 6(a) that alarm tags 1, 3, 9, 10, and 11 form a cluster with high correlation values. From process knowledge, we know that tags 1 and 3 are both flow rates, and tag 3 is downstream of tag 1. Therefore, when a fault occurs Fig. 5. Dendrogram for the 10 variables in case study 1 based on the single-linkage measure. The scale on the right indicates the similarity measure. upstream, it can propagate downstream to other variables, which can be reflected in the alarm data. Although they have similar trends in this case study, they are not redundant in general because they reveal the fault propagation and there is some time delay as evident from the time stamps. Tags 9, 10, and 11 are fuel gas pressures of different burners in one equipment, leading to evident correlations. From this study, we find that they are somewhat redundant and may be reduced to one tag. In real engineering practice, however, we should check more data to confirm this redundancy and resort to process knowledge to see the physical implication. Actually, some redundancy is also necessary because it can help improve the accuracy of detection, especially for important variables. In this case, each burner is placed with one sensor for monitoring its operating situation, although they are all pressure measurements for one equipment. In addition, the correlation between the above two groups is also high, which shows up the alarm flood when different tags show similar patterns due to the fault spread or interrelationship between physical quantities. Note that tag 11 behaves differently from tags 9 and 10 although they are highly correlated; this also accounts for the necessity of setting an alarm on tag 11. As a comparison, the corresponding process data is also available, as shown in Fig. 8, based on which the correlation color map is obtained as shown in Fig. 6(b). It is observed that the two color maps (Fig. 6(a) and (b)) have different patterns even if they share the same order of tags. During the floods, many process tags are correlated, while their alarms are less correlated thanks to the good alarm settings. For example, tags 2 and 6 show some negative correlation because they are flow rates on upstream and downstream variables respectively; however, the corresponding alarms do not have an apparent correlation. 6. Concluding remarks In this paper, the correlation matrix is plotted as a color map to visualize the relationship between different alarm tags. Compared to the ASCM, the clustered correlation map of pseudo data map has the following three main advantages: (1) it is robust to missed, false, and chattering alarms; (2) the correlation of pseudo data provides directional (positive or negative) information in addition
  • 7. F. Yang et al. / ISA Transactions 51 (2012) 499–506 505 a b Fig. 6. Color maps for pseudo data (a) and process data (b) in case study 2. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.) 1.00 0.84 0.67 0.51 0.35 0.18 0.02 Fig. 7. Dendrogram for case study 2. Fig. 8. High density plot of the process data in case study 2. to the similarity; (3) the pseudo data used in generating the color map can also be used in other statistical analysis that provides more potential; and (4) no knowledge of delays between correlated alarms is required. One disadvantage to be noted is that when computing correlation between two time series, the simultaneous 0’s in both the series have little information because they are in the normal region, resulting in the reduction of correlation coefficients. Thus the period for analysis should be selected appropriately so that there are dense alarm signals, and it is this window of alarm data that should be converted to pseudo data for statistical analysis. From this improved correlation analysis, we can find the relationship and redundancy in alarm tags that can be translated into improved alarm settings for efficient alarm rationalization. It should be noted that the proposed method in this paper is an improved correlation analysis based on alarm data. However, to obtain a reasonable and effective alarm configuration, we should also investigate process data as well as process knowledge, as illustrated in the second industrial case study and in Fig. 6. The analysis of alarm data is simple but is lacking in terms of detailed information, especially the true trend information of a process variable; as a result, we usually take this analysis to be the first step to focus our attention on some particular parts of the industrial process and then apply more complex techniques. As discussed in Section 5.2, the results should be compared to the ones based on process data to find the influence of alarm settings. The correlation and redundancy information captured through data analysis should be combined with process knowledge, especially the process connectivity and causality information between process units and variables [22,23]. The method proposed in this paper can be developed to be user- friendly to tune the parameters. In the pseudo data generating stage, the variance of Gaussian kernel and the sampling rate of pseudo data are two parameters that require tuning. In the color map plotting stage, the clustering algorithm is another option. In the color map display period, one should be able to change the color code and the number of scales of colors. It is important to provide some degrees of freedom for the user because the plot is sensitive to these tuning parameters and an appropriate choice of parameters depends on the specific circumstances, requirements and objectives of the analysis. Acknowledgments This work was supported by NSERC (SPG and IRC) in Canada, NSFC (60736026 and 60904044), Tsinghua National Laboratory for Information Science and Technology (TNList) Cross-discipline Foundation, and the 973 Project (2009CB320600) in China. References [1] Izadi I, Shah SL, Shook D, Chen T. An introduction to alarm analysis and design. In: Proc. 7th IFAC symp. fault detection, supervision and safety of technical processes. IFAC safeprocess 2009. 2009. p. 645–50. [2] Izadi I, Shah SL, Chen T. Effective resource utilization for alarm management. In: Proc. 49th IEEE conf. decision and control. CDC 2010. Atlanta, GA. 2010. p. 6803–8.
  • 8. 506 F. Yang et al. / ISA Transactions 51 (2012) 499–506 [3] Hollifield B, Habibi E. The alarm management handbook: a comprehensive guide. Houston (TX): PAS; 2007. [4] Engineering Equipment and Materials Users’ Association. Alarm systems: a guide to design, management and procurement. EEMUA standard 191. second ed. 2007. [5] International Society of Automation. Management of alarm systems for the process industries. ANSI/ISA standard 18.2. 2009. [6] Errington J, Reising DV, Burns C. ASM consortium guidelines: effective alarm management practices. Phoenix, AZ: ASM Consortium; 2009. [7] American Petroleum Institute. Pipeline SCADA alarm management. API recommended practice 1167. 2010. [8] Kondaveeti SR, Izadi I, Shah SL. Application of multivariate statistics for efficient alarm generation. In: Proc. 7th IFAC symp. fault detection, supervision and safety of technical processes. IFAC safeprocess 2009. 2009, p. 657–62. [9] Yang F, Shah SL, Xiao D. Correlation analysis of alarm data and alarm limit design for industrial processes. In: Proc. 2010 American control conf., ACC 2010. 2010, p. 5850–5. [10] Kondaveeti SR, Izadi I, Shah SL, Black T. Graphical representation of industrial alarm data. In: Proc. 11th IFAC/IFIP/IFORS/IEA symp. analysis, design, and evaluation of human–machine systems. IFAC/IFIP/IFORS/IEA 2010. 2010. [11] Nishiguchi J, Takai T. IPL2 and 3 performance improvement method for process safety using event correlation analysis. Comput Chem Eng 2010; 34(12):2007–13. [12] Tangirala AK, Shah SL, Thornhill NF. PSCMAP: a new tool for plant-wide oscillation detection. J Process Control 2005;15(8):931–41. [13] Zhang H. Statistical process monitoring and modeling using PCA and PLS. M.S. thesis. Edmonton (AB, Canada): University of Alberta; 2000. [14] Control Arts Inc. Alarm system engineering (e-book), 2011. http://controlartsinc.com/Support/Publications.html. [15] Silverman BW. Density estimation for statistics and data analysis. London (New York): Chapman and Hall; 1986. [16] Bauer M, Thornhill NF. A practical method for identifying the propagation path of plant-wide disturbances. J Process Control 2008;18(7–8):707–19. [17] Yang F, Shah SL, Xiao D. SDG (signed directed graph) based process description and fault propagation analysis for a tailings pumping process. In: Proc. 13th IFAC symp. automation in mining, mineral and metal processing. IFAC MMM 2010. 2010. [18] Lesot MJ, Rifqi M, Benhadda H. Similarity measures for binary and numerical data: a survey. Int J Knowl Eng Soft Data Paradigm 2009;1(1):63–84. [19] Choi S, Cha S, Tappert CC. A survey of binary similarity and distance measures. J Syst Cybernet Inform 2010;8(1):43–8. [20] Wu HM, Tien YJ, Chen CH. GAP: a graphical environment for matrix visualization and cluster analysis. Comput Statist Data Anal 2010;54(3): 767–78. [21] Pissanetzky S. Sparse matrix technology. London (UK): Academic Press; 1984. [22] Yang F, Shah SL, Xiao D. Signed directed graph based modeling and its validation from process knowledge and process data. Int J Appl Math Comput Sci 2012;22(1):41–53. [23] Larsson JE, DeBor J. Real-time root cause analysis for complex technical systems. In: Proc. joint IEEE 8th conf. on human factors and power plants and 13th annual workshop on human performance/root cause/trending/ operating experience/self assessment, joint 8th IEEE HFPP/13th HPRCT. 2007. p. 156–63.