negative association rules, $P \rightarrow \neg Q$, $\neg P \rightarrow Q$ and $\neg P \rightarrow \neg Q$. To extract negative association rules, most papers employ different correlation measures between attributes [12-14]. In [13], the authors proposed a level-wise search algorithm for mining both positive and negative association rules that employs rule dependency measures. In [14], the authors proposed another level-wise search algorithm for simultaneously extracting positive and negative association rules using the Pearson correlation coefficient. In [15], the authors proposed a detection model using multi-layer perceptron (MLP) neural networks to detect fraud and abuse in medical claims. It was proposed to detect new, unusual and known fraudulent or abusive behaviors. However, the detection model is very slow and needs a huge amount of memory to analyze an existing large database. In [16], the authors used positive association rules to build clinical pathways, which can detect fraud and abuse in new data. However, this model cannot detect fraud and abuse in existing large healthcare data. Our proposed approach detects fraud and abuse in existing large data.

3. Irregular Association Rules

Let $D = \{t_1, t_2, \ldots, t_n\}$ be a database of $n$ transactions with a set of items $I = \{i_1, i_2, \ldots, i_m\}$. Let the set of action items of $I$ be $AI = \{ai_1, ai_2, \ldots, ai_k\}$, where $k$ is the number of action items. Let the set of non-action items of $I$ be $NAI = \{nai_1, nai_2, \ldots, nai_{m-k}\}$, where $m-k$ is the number of non-action items. For an itemset $P \subseteq I$ and a transaction $t$ in $D$, we say that $t$ supports $P$ if $t$ has values for all the attributes in $P$; for conciseness, we also write $P \subseteq t$. By $D_P$ we denote the transactions that contain all attributes in $P$. The support of $P$ is computed as $P(P) = |D_P| / |D|$, i.e. the fraction of transactions containing $P$. An irregular rule is of the form $P \rightarrow Q$, with $P \subseteq NAI$, $Q \subseteq AI$ and $P \cap Q = \emptyset$. For the rule to hold, the following conditions must be met:

$$P(P) = \mathrm{support}(P) \geq \text{minimum antecedent support},$$
$$P(P, Q) = \mathrm{support}(P, Q) \leq \text{maximum antecedent consequent support},$$
$$\frac{P(P, Q)}{P(P)} \leq \text{maximum confidence},$$

where $P(x)$ is the probability of $x$.
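As a quick illustration of the three conditions, consider hypothetical numbers (not taken from the paper's dataset): a database of 100 transactions in which 90 contain $P$ and 3 contain both $P$ and $Q$, with assumed thresholds of 0.70 for minimum antecedent support, 0.10 for maximum antecedent consequent support and 0.05 for maximum confidence.

```latex
\begin{align*}
P(P) &= \frac{90}{100} = 0.90 \geq 0.70 \\
P(P, Q) &= \frac{3}{100} = 0.03 \leq 0.10 \\
\frac{P(P, Q)}{P(P)} &= \frac{0.03}{0.90} \approx 0.033 \leq 0.05
\end{align*}
```

All three conditions hold, so $P \rightarrow Q$ would qualify as an irregular rule under these assumed thresholds.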
[Figure 1 shows three stages.

(1) Actual data:

    Patient ID   Age   Smoke   Diagnosis
    1020D        33    Yes     Headache
    1021D        63    No      Fever

(2) A dictionary is generated for each categorical attribute, and a rule base is built from medical domain knowledge:

    Dictionary of Diagnosis attribute: Headache = 1, Fever = 2
    Dictionary of Smoke attribute:     Yes = 1, No = 2

    Rule base:
    If age <= 12 then 1
    If 13 <= age <= 60 then 2
    If 60 <= age then 3
    If smoke = y then 1
    If smoke = n then 2
    If Sex = M then 1
    If Sex = F then 2

(3) Values are mapped to integer items using the rule base and dictionaries, giving data suitable for knowledge discovery:

    Patient ID   Age   Smoke   Diagnosis
    1020D        2     1       1
    1021D        3     2       2
]
Figure 1. Data transformation of medical data
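The transformation in Figure 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `RULE_BASE`, `build_dictionary` and `transform` are hypothetical, the age rules are tried in order so that the first matching rule wins (the figure's ranges overlap at age 60), and the Smoke attribute is mapped through its dictionary rather than through the rule base.

```python
# Rule base for continuous attributes: (predicate, integer item) pairs.
# Rules are tried in order; the first match wins (assumption of this sketch).
RULE_BASE = {
    "Age": [
        (lambda v: v <= 12, 1),
        (lambda v: 13 <= v <= 60, 2),
        (lambda v: v >= 60, 3),
    ],
}


def build_dictionary(values):
    """Assign consecutive integers to distinct values in order of first appearance."""
    mapping = {}
    for v in values:
        mapping.setdefault(v, len(mapping) + 1)
    return mapping


def transform(records, rule_base, categorical):
    """Phase 1: build dictionaries. Phase 2: map every value to an integer item."""
    dictionaries = {a: build_dictionary(r[a] for r in records) for a in categorical}
    mapped = []
    for r in records:
        row = {}
        for attr, value in r.items():
            if attr in rule_base:
                # Raises StopIteration if no rule covers the value.
                row[attr] = next(code for test, code in rule_base[attr] if test(value))
            elif attr in dictionaries:
                row[attr] = dictionaries[attr][value]
            else:
                row[attr] = value  # identifiers are left unchanged
        mapped.append(row)
    return mapped, dictionaries


records = [
    {"PatientID": "1020D", "Age": 33, "Smoke": "Yes", "Diagnosis": "Headache"},
    {"PatientID": "1021D", "Age": 63, "Smoke": "No", "Diagnosis": "Fever"},
]
mapped, dictionaries = transform(records, RULE_BASE, ["Smoke", "Diagnosis"])
for row in mapped:
    print(row)
```

With the two example patients of Figure 1, the mapped rows carry the same integer codes as the figure's final table.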
4. Mapping complex medical data to mineable items

For knowledge discovery, the medical data have to be transformed into a suitable transaction format. We have addressed the problem of mapping complex medical data to items using a domain dictionary and a rule base, as shown in Figure 1. Medical data can be categorical, continuous numerical, Boolean, interval, percentage, fraction or ratio. Medical domain experts know how to map ranges of numerical data for each attribute to a series of items. For example, there are certain conventions for considering a person young, adult or elderly with respect to age. A set of rules is created for each continuous numerical attribute using the knowledge of medical domain experts. A rule engine is used to map continuous numerical data to items using these rules.

We have used the domain dictionary approach to transform data for which medical domain expert knowledge is not applicable into numerical form. As the cardinality of attributes other than continuous numerical ones is not high in the medical domain, these attribute values are mapped to integer values using medical domain dictionaries. Therefore, the mapping process is divided into two phases. Phase 1: a rule base is constructed based on the knowledge of medical domain experts, and dictionaries are constructed for attributes where domain expert knowledge is not applicable. Phase 2: attribute values are mapped to integer values using the corresponding rule base and dictionaries.

5. The proposed algorithm
The general intuition of the algorithm is as follows: given a set of lab tests with the same results, if 99% of doctors diagnose patients with disease x and 1% of doctors diagnose them with other diseases, then there is a strong possibility that this 1% of doctors are engaged in illegal practice. In other words, if consequent C occurs infrequently with antecedent A while A itself occurs frequently, the rule $A \rightarrow C$ is a strong candidate of variability. In every domain there is a set of facts, and decisions and actions are taken based on these facts. If, in a rule $S \rightarrow T$, S contains a set of facts and T contains a decision or action, then the rule represents the decision T together with its corresponding facts S. If $S \rightarrow T$ has sufficient support and confidence, the decision or action T is taken routinely based on facts S. However, if S is highly frequent and the rule $S \rightarrow T$ has very low confidence, then some decision other than T is usually taken based on facts S, and T is taken only exceptionally. The main features of the proposed algorithm are as follows:

- If only a minimum support is used, as in conventional association mining algorithms, desired itemsets that combine rarely appearing action items with highly frequent non-action items will not be found. To find rules that involve both a frequent antecedent and rare consequent items, we use two support metrics: minimum antecedent support and maximum antecedent consequent support.
- The proposed algorithm uses a maximum confidence constraint instead of the widely used minimum confidence constraint to form the rules. Moreover, it partitions itemsets into action items and non-action items instead of using subset generation to form rules.
- Rules have non-action items in the antecedent and action items in the consequent.
- In candidate generation, it does not apply the frequent-subset check to itemsets that contain one or more action items, so that such itemsets are kept.

Let MAS be the minimum antecedent support, MACS the maximum antecedent consequent support, $I_j$ the itemsets of size j, $S_m$ the desired itemsets of size m, and $C_k$ the set of candidates of size k. Figure 2 shows the association mining algorithm for finding irregular rules. Like Apriori, our algorithm is based on level-wise search. Each item consists of an attribute name and its value. When reading the information of a 1-itemset, we create a new 1-itemset if it has not been created already; otherwise we update its support. A non-action 1-itemset is selected if its support is greater than or equal to the minimum antecedent support. An action 1-itemset is selected whatever its support. In this way, 1-itemsets are explored that have high support for antecedent items and arbitrary support for consequent items.

5.1. Candidate Generation

Candidate generation in all level-wise algorithms like Apriori is based on the following simple fact: every subset of a frequent itemset is frequent. This reduces the number of itemsets that have to be checked. Our proposed algorithm, however, checks this property in the candidate generation phase only if the itemset contains only non-action items. This allows itemsets to consist of both rare action items and highly frequent non-action items. If the new candidate contains one or more action items, it is selected as a valid candidate. If the new candidate contains only non-action items, it is selected as a valid candidate only if every subset of the new candidate is frequent. This way the algorithm keeps the new candidates that have one or more action items.

5.2. Candidate Selection

We use two separate support metrics to filter out candidates. An itemset with only non-action items is compared with the minimum antecedent support, because non-action items can only take part in the antecedent of an irregular rule, which needs to be highly frequent. An itemset with one or more action items is compared with the maximum antecedent consequent support, in order to keep rare action items together with highly frequent non-action items. An itemset with only non-action items is selected if its support is greater than or equal to the minimum antecedent support. An itemset with one or more action items is selected if its support is smaller than or equal to the maximum antecedent consequent support. In this way, the explored itemsets have high support for their non-action items and low support for action items combined with those high-support non-action items. Here pruning is based mostly on minimum antecedent support, maximum antecedent consequent support and checking the subsets of non-action items.

5.3. Generating Association Rule

This problem needs association rules that represent irregular relationships between action and non-action items that rarely occur together. For this reason, the proposed algorithm uses a maximum confidence constraint to form rules: it needs rules that have high support in the antecedent and very low support in the itemset from which the rule is generated. It selects a rule if its confidence is less than or equal to the maximum confidence constraint. Moreover, it does not apply subset generation to the itemsets to form rules. Instead, an itemset is partitioned into action items and non-action items. Action items form the consequent and non-action items form the antecedent. Each itemset is therefore mapped to only one rule.
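The partition step of Section 5.3 can be sketched as follows. This is a minimal illustration, not the authors' C# implementation: the function names, the string encoding of items (`"lab=high"`, `"dx=B"`) and the toy transactions are all hypothetical, and support is recomputed by scanning an in-memory list.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def generate_rule(itemset, action_items, transactions, max_confidence):
    """Split an itemset into one rule: non-action items -> action items.

    Returns (antecedent, consequent, confidence) if the confidence does not
    exceed `max_confidence`, otherwise None.
    """
    itemset = frozenset(itemset)
    consequent = itemset & frozenset(action_items)
    antecedent = itemset - consequent
    if not antecedent or not consequent:
        return None  # a rule needs both parts
    antecedent_support = support(antecedent, transactions)
    if antecedent_support == 0:
        return None  # confidence is undefined
    confidence = support(itemset, transactions) / antecedent_support
    if confidence <= max_confidence:
        return antecedent, consequent, confidence
    return None


# Toy data: the same lab result leads to diagnosis A nine times and B once.
transactions = [{"lab=high", "dx=A"}] * 9 + [{"lab=high", "dx=B"}]
actions = {"dx=A", "dx=B"}

rare = generate_rule({"lab=high", "dx=B"}, actions, transactions, 0.1)
routine = generate_rule({"lab=high", "dx=A"}, actions, transactions, 0.1)
print(rare)
print(routine)
```

Because the split into antecedent and consequent is fixed by the item types, no subsets of the itemset are enumerated; each itemset yields at most one rule.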
Algorithm: Find itemsets which consist of non-action items with high support and action items with low support, based on candidate generation.
Input: Database, minimum antecedent support (MAS), maximum antecedent consequent support (MACS)
Output: Itemsets which are strong candidates of variability.
    1. K = 1, S = {Ø};
    2. Read the metadata about which attributes are action type and which are not.
    3. Ik = Select 1-itemsets which either consist of a non-action item and have support greater than or equal to the minimum antecedent support, or consist of an action item.
    4. While (Ik ≠ Ø) {
        4.1 K++;
        4.2 Ck = Candidate_generation(Ik-1)
        4.3 CalculateCandidatesSupport(Ck)
        4.4 Ik = SelectDesiredItemSetFromCandidates(Ck, Sk, MAS, MACS);
        4.5 S = S U Sk
       }
    5. return S

procedure SelectDesiredItemSetFromCandidates(Ck, Sk, MAS, MACS)
    1. For each itemset c ∈ Ck
        1.1 If c contains only non-action items
            1.1.1 If c.support >= MAS
            1.1.2 Add it to I
        1.2 else if c contains one or more action items with non-action items
            1.2.1 If c.support <= MACS
            1.2.2 Add it to I and Sk
        1.3 If c contains only action items
            1.3.1 Add it to I
    2. return I

procedure CalculateCandidatesSupport(Ck)
    1. For each transaction t of Database
        1.1 CalculateSupportFromOneTransactionForCandidates(Ck, t);

procedure CalculateSupportFromOneTransactionForCandidates(Ck, t)
    1. Ct = Find the subsets of Ck which are candidates contained in t
    2. For each candidate c ∈ Ct
        2.1 c.count++

Algorithm: Find association rules for variability finding
Input: I (variability itemsets), maximumConfidence
Output: R (set of rules)
    1. R = Ø
    2. For each X ∈ I
        2.1 Antecedent set AS = (as1, as2, ..., asn), where asi ∈ X and asi is a non-action item
        2.2 Consequent set CS = (cs1, cs2, ..., csn), where csi ∈ X and csi is an action item
        2.3 if (support(AS ∪ CS) / support(AS)) <= maximum confidence
            2.3.1 AS → CS is a valid rule.
            2.3.2 R = R U (AS → CS)

procedure Candidate_generation(Ik-1)
    1. For each itemset i1 ∈ Ik-1
        1.1 For each itemset i2 ∈ Ik-1
            1.1.1 New candidate, NC = Union(i1, i2);
            1.1.2 If size of NC is k
                1.1.2.1 If NC contains one or more action items
                    1.1.2.1.1 Add it to Ck if every subset of non-action items is frequent.
                1.1.2.2 else
                    1.1.2.2.1 If every subset of NC is frequent
                        1.1.2.2.1.1 Add it to Ck
                    otherwise remove it.
    2. return Ck;

Figure 2. Association mining algorithm for finding irregular rule
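The itemset search of Figure 2 can be approximated by the following Python sketch. It is written under several assumptions that are not part of the figure: items are plain strings, all transactions fit in memory, support is recomputed by rescanning the list instead of incrementing counters, and mixed itemsets that never occur (support 0) are skipped even though the pseudocode only tests `support <= MACS`. The name `mine_desired_itemsets` is hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def mine_desired_itemsets(transactions, action_items, mas, macs):
    """Level-wise search for itemsets that are strong candidates of variability."""
    transactions = [frozenset(t) for t in transactions]
    action_items = frozenset(action_items)
    all_items = frozenset().union(*transactions)

    # Step 3: keep every action item and every non-action item with support >= MAS.
    level = {frozenset([i]) for i in all_items
             if i in action_items or support(frozenset([i]), transactions) >= mas}
    frequent_non_action = {s for s in level if not s & action_items}
    desired = set()
    k = 1
    while level:
        k += 1
        # Candidate generation: join pairs of (k-1)-itemsets into k-itemsets.
        candidates = set()
        for a, b in combinations(level, 2):
            nc = a | b
            if len(nc) != k:
                continue
            non_action = nc - action_items
            if nc & action_items:
                # Only the non-action part has to be frequent.
                if not non_action or non_action in frequent_non_action:
                    candidates.add(nc)
            elif all(frozenset(s) in frequent_non_action
                     for s in combinations(nc, k - 1)):
                candidates.add(nc)
        # Candidate selection with the two support thresholds.
        level = set()
        for c in candidates:
            s = support(c, transactions)
            if not c & action_items:            # only non-action items
                if s >= mas:
                    level.add(c)
                    frequent_non_action.add(c)
            elif c - action_items:              # action and non-action items
                if 0 < s <= macs:               # "0 <" is this sketch's assumption
                    level.add(c)
                    desired.add(c)
            else:                               # only action items: always kept
                level.add(c)
    return desired


# Toy data: the same lab result leads to diagnosis A nine times and B once.
transactions = [{"lab=high", "dx=A"}] * 9 + [{"lab=high", "dx=B"}]
desired = mine_desired_itemsets(transactions, {"dx=A", "dx=B"}, mas=0.7, macs=0.1)
print(desired)
```

On this toy data the routine combination of the lab result with diagnosis A is rejected because its support exceeds MACS, while the rare combination with diagnosis B is kept as a desired itemset.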
5.3.1. Lemma 1. The number of rules is equal to the number of desired itemsets, and the number of discarded rules is $m^p - S$, where S is the number of desired itemsets.

Proof: A single desired itemset consists of action type items and non-action type items. Action items and non-action items are mapped to the consequent and antecedent parts respectively. Let $I = \{i_1, i_2, \ldots, i_n\}$ be the set of items to be mined, where items can be either action type or non-action type. Let $AI = \{ai_1, ai_2, \ldots, ai_u\}$ be the set of action items to be mined. Let $NAI = \{nai_1, nai_2, \ldots, nai_v\}$ be the set of non-action items to be mined. Each nai must have support greater than or equal to the minimum antecedent support to be included as a 1-itemset, and all ai are included as 1-itemsets. Let $C = \{c_1, c_2, c_3, \ldots, c_n\}$ be the set of candidate itemsets. A new candidate NC is added to C if the non-action part of NC, named NCNA, has the following property: support(each subset of NCNA) >= minimum antecedent support. A candidate c is selected for rule generation if and only if its confidence does not exceed the maximum confidence. The non-action items and action items of a desired itemset are mapped to the antecedent and consequent of a rule, respectively. So every desired itemset is mapped to a single valid rule. Total rules = number of desired itemsets = S. Let m be the average number of distinct values that each multidimensional attribute holds, and p the number of attributes to be mined. Number of possible different rules $= m^p$. Number of discarded rules $= m^p - S$, where S is the number of desired itemsets.

6. Results and discussion

The experiments were run on a PC with a Core 2 Duo processor with a clock rate of 1.8 GHz and 3 GB of main memory. The operating system was Microsoft Windows Vista and the implementation language was C#. We used a patient dataset to verify our method. The dataset contains items that are either actions, which include decision, diagnosis and cost, or non-actions, which include lab tests, any symptom of the patient and any criterion of the disease. Each instance represents the data of one patient. We filtered out instances that have noisy or missing values. The dataset was collected and preprocessed from different local hospitals of Bangladesh, and has 50273 instances and 514 attributes (150 discrete and 364 numerical attributes). All these data are converted into mineable items (integer representation) using the domain dictionary and rule base.
[Figure 3. Time comparison of Apriori and proposed algorithm for the patient dataset (x-axis: number of transactions; y-axis: time in seconds)]
[Figure 4. Number of rules based on maximum confidence (x-axis: maximum confidence; y-axis: number of rules)]
[Figure 5. Time comparison of different maximum antecedent supports (x-axis: minimum antecedent support; y-axis: time in seconds)]
[Figure 6. Time comparison of different maximum antecedent consequent supports (x-axis: MACS; y-axis: time in seconds)]
[Figure 7. Accuracy of the proposed algorithm based on irregular metric (x-axis: minimum antecedent support; y-axis: accuracy)]
[Figure 8. Accuracy of the proposed algorithm based on maximum confidence (x-axis: maximum confidence; y-axis: accuracy)]
Table 1. Test result for patient dataset

    Minimum antecedent support               70%      85%
    Maximum antecedent consequent support    10%      5%
    Maximum confidence                       10%      5%
    Number of desired itemsets               49       31
    Number of desired rules                  5        3
    Time (seconds)                           922.2    1634.56

Table 1 shows the test results for the patient dataset after running the proposed algorithm with different parameters. The second column of the table presents the result for a minimum antecedent support of 70%, a maximum antecedent consequent support of 10% and a maximum confidence of 10%: 49 desired itemsets were generated and 5 rules were discovered in about 922.2 seconds. The third column presents the result for a minimum antecedent support of 85%, a maximum antecedent consequent support of 5% and a maximum confidence of 5%: 31 desired itemsets were generated and 3 rules were discovered in about 1634.56 seconds.

Figure 3 shows that Apriori takes significantly more time than the proposed algorithm. This is because pruning in the proposed algorithm is based on minimum antecedent support, maximum antecedent consequent support and checking the subsets of non-action items. Figure 4 shows that as the maximum confidence (MC) increases, the number of valid rules increases. Figure 5 shows how the running time of the irregular rule finding algorithm varies with the minimum antecedent support (MAS). Here we measured the performance in terms of MAS while keeping MACS, MC, the number of action items and the number of non-action items constant. The time does not vary significantly because MAS has no effect on disk access, as the patient dataset has candidates of all sizes for these MAS values. MAS only affects the number of valid candidates generated, which can save some CPU time. Because it affects CPU time, the three cases take slightly different times.

Figure 6 shows how the time varies with MACS while keeping MAS, MC, the number of action items and the number of non-action items constant. The time does not vary significantly because MACS has no effect on disk access, as the patient dataset has candidates of all sizes for these values. MACS only affects the number of valid candidates generated, which can save some CPU time. Because it affects CPU time, the three cases take slightly different times. As the maximum antecedent consequent support
decreases, the number of valid candidates generated decreases. For this reason, the case with 5% MACS takes more time than the case with 3% MACS, and the case with 10% MACS takes more time than the case with 5% MACS. Figure 7 illustrates the accuracy of the proposed algorithm for different values of minimum antecedent support. The value of minimum antecedent support for each presented result is also indicated. The figure shows that MAS has no effect on accuracy, as it is not used as a parameter in selecting valid candidates and rules. Figure 8 illustrates the accuracy of the proposed algorithm for different values of maximum confidence. The figure shows that maximum confidence affects accuracy, as it is used as a parameter in selecting valid rules. As maximum confidence decreases, accuracy increases and the number of discovered rules decreases. This is because a lower confidence indicates that the antecedent and consequent rarely occur together in the dataset.

7. Conclusion

Irregular patterns represent wrong decisions, illegal practice and variability in decisions. In this paper, we propose a level-wise search algorithm that works on action and non-action type data to find irregular association rules. The proposed algorithm has been applied to a real-world patient dataset. We have shown significant accuracy in the output of the proposed algorithm. Although we have used level-wise search for finding irregular patterns, each step of our algorithm is different from any other level-wise search algorithm. Rule generation from desired itemsets is also different from conventional association mining algorithms.

8. References

[1] R. Agrawal, T. Imielinski, and A. Swami, "Mining Association Rules between Sets of Items in Very Large Databases," in Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, Washington, D.C., 1993, pp. 207-216.
[2] R. Agrawal and R. Srikant, "Fast Algorithms for Mining Association Rules in Large Databases," in Proceedings of the 20th International Conference on Very Large Data Bases, San Francisco, CA, USA, 1994, pp. 487-499.
[3] S. Brin, R. Motwani, J. D. Ullman, and S. Tsur, "Dynamic Itemset Counting and Implication Rules for Market Basket Data," in Proceedings of the 1997 ACM SIGMOD International Conference on Management of Data, Tucson, Arizona, United States, 1997, pp. 255-264.
[4] H. Mannila, H. Toivonen, and A. I. Verkamo, "Efficient Algorithms for Discovering Association Rules," in AAAI Workshop on Knowledge Discovery in Databases, 1994, pp. 181-192.
[5] J. S. Park, M. S. Chen, and P. S. Yu, "An Effective Hash Based Algorithm for Mining Association Rules," in Proc. ACM SIGMOD Conf. Management of Data, New York, NY, USA, 1995, pp. 175-186.
[6] A. Savasere, E. Omiecinski, and S. B. Navathe, "An Efficient Algorithm for Mining Association Rules in Large Databases," in Proceedings of the 21st International Conference on Very Large Data Bases, 1995, pp. 432-444.
[7] B. Liu, W. Hsu, and Y. Ma, "Mining Association Rules with Multiple Minimum Supports," in SIGKDD Explorations, 1999, pp. 337-341.
[8] H. Yun, D. Ha, B. Hwang, and K. H. Ryu, "Mining association rules on significant rare data using relative support," Journal of Systems and Software, vol. 67, no. 3, pp. 181-191, 2003.
[9] M. Hahsler, "A Model-Based Frequency Constraint for Mining Associations from Transaction Data," Data Mining and Knowledge Discovery, vol. 13, no. 2, pp. 137-166, 2006.
[10] L. Zhou and S. Yau, "Association rule and quantitative association rule mining among infrequent items," in International Conference on Knowledge Discovery and Data Mining, San Jose, California, 2007, pp. 156-167.
[11] R. U. Kiran and P. K. Reddy, "An improved multiple minimum support based approach to mine rare association rules," in The IEEE Symposium on Computational Intelligence and Data Mining, Nashville, TN, USA, 2009, pp. 340-347.
[12] S. Brin, R. Motwani, and C. Silverstein, "Beyond Market Baskets: Generalizing Association Rules to Correlations," in Proceedings of SIGMOD, AZ, USA, 1997, pp. 265-276.
[13] X. Wu, C. Zhang, and S. Zhang, "Efficient Mining of Both Positive and Negative Association Rules," ACM Transactions on Information Systems, vol. 22, no. 3, pp. 381-405, 2004.
[14] M. L. Antonie and O. R. Zaïane, "Mining positive and negative association rules: an approach for confined rules," in Proceedings of the 8th European Conference on Principles and Practice of Knowledge Discovery in Databases, Pisa, Italy, 2004, pp. 27-38.
[15] P. A. Ortega, C. J. Figueroa, and G. A. Ruz, "A Medical Claim Fraud/Abuse Detection System based on Data Mining: A Case Study in Chile," in DMIN, 2006, pp. 224-231.
[16] W. S. Yang and S. Y. Hwang, "A Process-Mining Framework for the Detection of Healthcare Fraud and Abuse," Expert Systems with Applications, vol. 31, no. 1, pp. 56-68, July 2006.