Volume : 12, Issue : 1, JAN 2026
WEBSTUR: A TOOL FOR ANALYZING WEB USER BEHAVIOR ON PROXY SERVER USING WEB MINING
DR. JAI PRAKASH
Abstract
The growing reliance on data mining techniques across industries calls for a careful evaluation of their performance on sparse datasets, a common challenge in real-world applications. This study presents a comparative analysis of four widely used data mining methods (Apriori, FP Growth, Web Log, and Modified Web Log), evaluated across varying sparsity levels (20%, 40%, 60%, 80%, and 100%) to assess their accuracy and robustness. Using a methodical approach, the research analyzes how these techniques adapt to the complexities of sparse environments and identifies significant variations in their performance. The Modified Web Log algorithm consistently performed best, achieving accuracy above 90% at every sparsity level, which indicates that it manages sparse datasets with minimal loss of information. FP Growth and Web Log showed moderate performance, maintaining accuracy between 80% and 90%, while Apriori consistently underperformed with accuracy below 80%, revealing its limitations in sparse conditions. These findings underscore the importance of algorithmic optimization for improving data processing and predictive modeling in industries such as e-commerce, healthcare, and social media, where sparsity is prevalent. By highlighting the effectiveness of the Modified Web Log approach, this study contributes to the ongoing development of efficient data mining methodologies and offers insights for organizations seeking to improve decision-making and operational efficiency through data-driven strategies. The research also provides a foundation for further work on algorithmic designs tailored to sparsity and other challenges inherent in large-scale data environments.
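As an illustration only, and not the implementation evaluated in this study, the following Python sketch shows one way the two ingredients named in the abstract could be set up: a sparsity measure for a session-by-page matrix and a basic Apriori pass over web sessions. The definition of sparsity as the fraction of empty cells, the function names `sparsity` and `apriori`, and the sample sessions are all assumptions made for the example.

```python
from itertools import combinations


def sparsity(sessions, pages):
    """Fraction of empty cells in the binary session-by-page matrix (assumed definition)."""
    filled = sum(len(set(s) & set(pages)) for s in sessions)
    return 1 - filled / (len(sessions) * len(pages))


def apriori(sessions, min_support):
    """Return {itemset: support} for every itemset whose support >= min_support."""
    n = len(sessions)
    sets = [frozenset(s) for s in sessions]
    # Level 1 candidates: every single page seen in any session
    candidates = {frozenset([p]) for s in sets for p in s}
    frequent = {}
    k = 1
    while candidates:
        # Count how many sessions contain each candidate itemset
        counts = {c: sum(1 for s in sets if c <= s) for c in candidates}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent itemsets into candidates of size k
        candidates = {a | b for a, b in combinations(list(level), 2) if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        }
    return frequent


# Hypothetical sessions, standing in for page requests taken from a proxy log
sessions = [
    ["/home", "/products", "/cart"],
    ["/home", "/products"],
    ["/home", "/about"],
    ["/products", "/cart"],
]
pages = sorted({p for s in sessions for p in s})

level_of_sparsity = sparsity(sessions, pages)
frequent = apriori(sessions, min_support=0.5)
```

With these sample sessions, 9 of the 16 matrix cells are filled. Lowering `min_support` or thinning the sessions would show how quickly the set of frequent itemsets shrinks as the data becomes sparser, which is the kind of behavior the comparison above examines.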
Keywords
WEB MINING, WEB USAGE, USER BEHAVIOR.
IESRJ
International Educational Scientific Research Journal
E-ISSN: 2455-295X
International Indexed Journal | Multi-Disciplinary Refereed Research Journal
Peer-Reviewed Journal - Equivalent to UGC Approved Journal
Article No : 4
