
Top-Down and Bottom-Up Approach for Mining Multilevel Association Rules From Concept Hierarchical Data in Distributed Environment


Affiliations
1 Department of Information Technology, A.D. Patel Institute of Technology, India
     



Hierarchical data mining in a distributed environment is imperative in big data analysis. Multilevel association rules can provide more substantial information than single-level rules, and they also reveal hierarchical knowledge in the dataset. Nowadays, numerous e-commerce and social networking sites generate vast amounts of structured/semi-structured data in the form of sales data, tweets, text mails, web usage and so on. The data generated from such sources is so large that it becomes very difficult to process and analyze using conventional approaches. This paper overcomes the computing limitation of a single node by distributing the task across a multi-node cluster. The performance of the system is compared by varying the minimum support threshold at different levels of the concept hierarchy and by varying the dataset size. In this paper, the transactional dataset is created from a huge sales dataset using the Hadoop MapReduce framework. Then, two distributed multilevel frequent pattern mining algorithms, MR-MLAB (MapReduce based Multilevel Apriori using Bottom-up approach) and MR-MLAT (MapReduce based Multilevel Apriori using Top-down approach), are implemented to find interesting level-crossing frequent itemsets for each level of the concept hierarchy. Hierarchical redundancy in multilevel association rules affects the quality of market basket analysis. Hence, to improve the performance of the system, this hierarchical redundancy has to be removed. Finally, the time efficiency of the proposed algorithms is compared with the existing Traditional Multilevel Apriori (TMLA) algorithm. The proposed algorithms with the MapReduce framework are found to be more efficient than the traditional algorithm.

Keywords

Distributed Frequent Pattern Mining, Multi-Level Association Rule, MapReduce, Level Crossing Rules
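
To make the top-down idea in the abstract concrete, the following is a minimal single-process sketch, not the paper's MR-MLAT implementation. It assumes a hypothetical two-level concept hierarchy (`HIERARCHY`), a toy transaction list, and per-level minimum support counts; the names `map_phase`, `reduce_phase`, and `top_down` are illustrative only, and the map and reduce steps are plain Python functions rather than Hadoop jobs.

```python
from collections import Counter
from itertools import combinations

# Hypothetical two-level concept hierarchy: leaf item -> category.
HIERARCHY = {
    "skim_milk": "milk",
    "whole_milk": "milk",
    "white_bread": "bread",
    "wheat_bread": "bread",
    "cola": "drink",
}

# Toy transactions, invented for illustration.
TRANSACTIONS = [
    {"skim_milk", "white_bread"},
    {"whole_milk", "wheat_bread"},
    {"skim_milk", "wheat_bread", "cola"},
    {"skim_milk", "white_bread"},
]


def map_phase(transaction, size):
    """Emit (itemset, 1) for every itemset of the given size in a transaction."""
    for itemset in combinations(sorted(transaction), size):
        yield itemset, 1


def reduce_phase(pairs, min_count):
    """Sum the counts per itemset and keep those meeting the support count."""
    totals = Counter()
    for itemset, count in pairs:
        totals[itemset] += count
    return {k: v for k, v in totals.items() if v >= min_count}


def frequent_itemsets(transactions, size, min_count):
    """Run the map step over all transactions, then a single reduce step."""
    pairs = (p for t in transactions for p in map_phase(t, size))
    return reduce_phase(pairs, min_count)


def top_down(transactions, hierarchy, min_counts, size=2):
    """Mine level 1 (categories) first, then level 2 (leaf items).

    Leaf items whose category is infrequent at level 1 are pruned
    before level 2 is mined, which is the top-down filtering step.
    """
    level1 = [{hierarchy[i] for i in t} for t in transactions]
    frequent_categories = {
        c for (c,) in frequent_itemsets(level1, 1, min_counts[0])
    }
    level2 = [
        {i for i in t if hierarchy[i] in frequent_categories}
        for t in transactions
    ]
    return (
        frequent_itemsets(level1, size, min_counts[0]),
        frequent_itemsets(level2, size, min_counts[1]),
    )
```

Calling `top_down(TRANSACTIONS, HIERARCHY, (3, 2))` uses a higher support count at the category level than at the leaf level, a common choice because generalized items occur more often. A bottom-up variant would start from the leaf-level itemsets and generalize upward instead.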




Authors

Dinesh J. Prajapati
Department of Information Technology, A.D. Patel Institute of Technology, India

