Page 21 - Fister jr., Iztok, and Andrej Brodnik (eds.). StuCoSReC. Proceedings of the 2017 4th Student Computer Science Research Conference. Koper: University of Primorska Press, 2017
Table 2: Association rules as found by the Cuckoo search rule miner.
Antecedent    Consequent    Fitness value
(DX = 201.23) ∧ (SEX = FEMALE)
(DX = 201.23) ∧ (SEX = FEMALE) ∧ (TYPE = URGENT)    (LOS =< 44)    0.508
(LOS =< 44)    0.503
(DX = 142.8) ∧ (AGE = 40−49)    (LOS =< 44)    0.501
(DX = 202.88)    (DIED = NO)    0.501
(DX = 070.22)    (DIED = NO)    0.501
(TYPE = EMERGENCY)
(DX = 016.04) ∧ (AGE = 60−69)    (SEX = MALE)    0.5
(DX = 036.3) ∧ (DIED = NO)    (DIED = NO)    0.5
(DX = 197.8)    (TYPE = EMERGENCY)    0.5
(DX = 255.8) ∧ (DIED = NO)    (TYPE = EMERGENCY)    0.5
(SEX = FEMALE)    0.5
(DX = 206.00) ∧ (AGE = 80−89) ∧ (SEX = FEMALE)    (LOS =< 44)    0.5
(DX = 010.04)    (DIED = NO) ∧ (LOS =< 44)    0.5
(DX = 012.33)    (TYPE = EMERGENCY)    0.5
(DX = 201.66)    (SEX = FEMALE)    0.5
(DX = 015.55)    0.5
(DX = 079.88) ∧ (SEX = MALE) ∧ (DIED = NO)    (LOS =< 44)    0.5
(DX = 211.7)    (SEX = FEMALE)    0.5
(LOS =< 44)    0.5
(DX = 010.03) ∧ (SEX = FEMALE) ∧ (DIED = YES)    (DIED = NO)    0.5
(DX = 201.66) ∧ (SEX = MALE) ∧ (DIED = NO)    0.5
(DX = 171.4)
(DX = 085.5) ∧ (DIED = YES)
(DX = 232.8)
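As an illustration of how fitness values like those in Table 2 could arise, the following minimal Python sketch scores a rule by a weighted combination of its support and confidence, and computes the average number of antecedent and consequent items over a rule set. The toy records, the `Rule` class, the helper names, the weights `alpha` and `beta`, and the normalised weighted-sum form are all assumptions made for illustration; they are neither the NIS data nor the exact objective function used by the miner. Only equality conditions are modelled, so range conditions such as LOS =< 44 are left out.

```python
from dataclasses import dataclass

# Hypothetical toy records. Attribute names mirror Table 2, but the values
# are invented for illustration and are NOT taken from the NIS dataset.
RECORDS = [
    {"DX": "201.23", "SEX": "FEMALE", "TYPE": "URGENT", "DIED": "NO"},
    {"DX": "201.23", "SEX": "FEMALE", "TYPE": "EMERGENCY", "DIED": "NO"},
    {"DX": "202.88", "SEX": "MALE", "TYPE": "EMERGENCY", "DIED": "NO"},
    {"DX": "202.88", "SEX": "MALE", "TYPE": "URGENT", "DIED": "YES"},
]


@dataclass
class Rule:
    """An association rule: antecedent items imply consequent items."""
    antecedent: dict
    consequent: dict


def matches(record, items):
    """True if the record satisfies every attribute = value condition."""
    return all(record.get(attr) == value for attr, value in items.items())


def support(rule, records):
    """Fraction of records satisfying both antecedent and consequent."""
    both = sum(
        matches(r, rule.antecedent) and matches(r, rule.consequent)
        for r in records
    )
    return both / len(records)


def confidence(rule, records):
    """Fraction of antecedent-matching records that also match the consequent."""
    covered = [r for r in records if matches(r, rule.antecedent)]
    if not covered:
        return 0.0
    return sum(matches(r, rule.consequent) for r in covered) / len(covered)


def fitness(rule, records, alpha=0.5, beta=0.5):
    """Assumed objective: normalised weighted sum of support and confidence."""
    weighted = alpha * support(rule, records) + beta * confidence(rule, records)
    return weighted / (alpha + beta)


def average_lengths(rules):
    """Average number of antecedent and consequent items per rule."""
    n = len(rules)
    return (
        sum(len(r.antecedent) for r in rules) / n,
        sum(len(r.consequent) for r in rules) / n,
    )


rule_a = Rule({"DX": "202.88"}, {"DIED": "NO"})
rule_b = Rule({"DX": "201.23", "SEX": "FEMALE"}, {"DIED": "NO"})

for rule in (rule_a, rule_b):
    print(rule.antecedent, "->", rule.consequent, fitness(rule, RECORDS))
print(average_lengths([rule_a, rule_b]))
```

Raising `alpha` relative to `beta` in this sketch favours rules that cover many records, while raising `beta` favours rules that are rarely violated.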
that we might not know [3]. The latter is also the reason for this study, and with this in mind, all records containing the disease with ICD-9-CM code '250.30' (Type II diabetes mellitus) were extracted from the whole NIS dataset to form a new smaller dataset.

5. RESULTS

In this section, the results of association rule mining on the NIS dataset using the CS algorithm are presented. The results are reported in Table 2 in the form of association rules, together with the fitness value of each rule. Only the best 20 rules are reported in Table 2, and only the top five are commented on. The average number of antecedents obtained in this study is 1.8, while the average number of consequents is 1.05. This is favourable for the user, since shorter association rules are easier to understand. It is also worth emphasizing that all rules produced are in some way related to TIIDM. The first two rules indicate that TIIDM is involved in the pathogenesis of non-Hodgkin's sarcoma. This is supported by several studies in the literature [2]. The third and fourth rules state that there is a connection of malignant neoplasm of major salivary gland and other malignant lymphomas with TIIDM, which is supported by a study in [10], where an increased prevalence of diabetes was found in patients with salivary gland tumour. Chronic viral hepatitis B was found to be connected with TIIDM in the fifth rule [7].

6. CONCLUSION

The CS algorithm was investigated as an association rule miner on a hospital discharge dataset. CS produces rules that are simple, easy to understand, and also interesting. The rules are found with the help of an objective function, which weighs the support and confidence of the rules. The weights control how important each interestingness measure is, and thus guide the search in the desired direction.

The obtained rules were compared with research in the field of Type II diabetes mellitus, where all results were confirmed to be reasonable and supported by a study.

In future we would like to use the CS algorithm for mining association rules for other diseases which occur commonly in the modern world.

7. REFERENCES

[1] Rakesh Agrawal, Tomasz Imieliński, and Arun Swami. Mining association rules between sets of items in large databases. In ACM SIGMOD Record, volume 22, pages 207–216. ACM, 1993.
[2] Chun Chao and John H. Page. Type 2 diabetes mellitus and risk of non-Hodgkin lymphoma: A systematic review and meta-analysis. American Journal of Epidemiology, 168(5):471–480, 2008.
[3] Anthony S Fauci et al. Harrison's principles of internal medicine, volume 2. McGraw-Hill, Medical Publishing Division New York, 2008.
[4] Iztok jr. Fister. Algoritmi računske inteligence za razvoj umetnega športnega trenerja. PhD dissertation, University of Maribor, Faculty of Electrical Engineering and Computer Science, 2017.
[5] John H. Holland. Adaptation*. In Robert Rosen and Fred M. Snell, editors, Progress in Theoretical Biology, pages 263–293. Academic Press, 1976.
[6] Mohammed Khalilia, Sounak Chakraborty, and Mihail Popescu. Predicting disease risks from highly imbalanced data using random forest. BMC Medical Informatics and Decision Making, 11(1):51, 2011.
[7] Kar Neng Lai, Fernand Mac-Moune Lai, Nancy WY Leung, Stephen T Lo, and John S Tam. Hepatitis with isolated serum antibody to hepatitis B core antigen: a variant of non-A, non-B hepatitis? American Journal of Clinical Pathology, 93(1):79–83, 1990.
[8] Uroš Mlakar, Iztok Fister, Janez Brest, and Božidar Potočnik. Multi-objective differential evolution for feature selection in facial expression recognition systems. Expert Systems with Applications, 89:129–137, 2017.
[9] Uroš Mlakar, Milan Zorman, Iztok Fister Jr, and Iztok Fister. Modified binary cuckoo search for association rule mining. Journal of Intelligent & Fuzzy Systems, 32(6):4319–4330, 2017.
[10] Zsuzsanna Suba, József Barabás, György Szabó, Daniel Takács, and Márta Ujpál. Increased prevalence
StuCoSReC Proceedings of the 2017 4th Student Computer Science Research Conference 21
Ljubljana, Slovenia, 11 October