PDF Malware Detection: Toward Machine Learning Modelling with Explainability Analysis

Authors

  • Shaik Mohammad Parvez, M.C.A Student, IV Semester, Department of M.C.A, KMMIPS, Tirupati (D.t), Andhra Pradesh, India
  • GVS Ananthnath, Associate Professor, Department of M.C.A, KMMIPS, Tirupati (D.t), Andhra Pradesh, India

Keywords:

PDF malware detection, machine learning, Random Forest, SVM, DNN, explainability, cybersecurity, malicious PDF, classification algorithms, Kaggle dataset

Abstract

PDF files are widely used for document sharing, and that popularity makes them a frequent target for malware attacks. This project, titled "PDF Malware Detection: Toward Machine Learning Modelling with Explainability Analysis," aims to develop and evaluate machine learning models that detect malware in PDF files. Using a Kaggle dataset of labeled malicious and benign PDFs, several algorithms will be applied: Random Forest, C5.0, J48, Support Vector Machine (SVM), AdaBoost, Deep Neural Network (DNN), Gradient Boosting Machine (GBM), and K-Nearest Neighbors (KNN). The primary focus is on achieving high detection accuracy while also providing explainability, so that the decision-making process of the models can be understood. In this way the project seeks to strengthen cybersecurity measures by helping to identify and mitigate threats embedded in PDF documents.
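The abstract does not say which explainability technique is used or which features the Kaggle dataset contains. As a rough illustration only, the sketch below pairs one of the listed classifiers (KNN) with permutation importance, a common model-agnostic explainability method chosen here as an assumption. The data is synthetic, and the feature names `js_objects` and `noise` are hypothetical stand-ins, not columns from the actual dataset.

```python
import random
from collections import Counter

# Hypothetical feature names; the real dataset's columns are not given in the abstract.
FEATURES = ["js_objects", "noise"]


def make_dataset(n, rng):
    """Build synthetic rows: js_objects separates the classes, noise does not."""
    rows, labels = [], []
    for i in range(n):
        label = i % 2  # 0 = benign, 1 = malicious
        rows.append([5 * label + rng.random(), rng.random()])
        labels.append(label)
    return rows, labels


def knn_predict(train_x, train_y, row, k=3):
    """Majority vote among the k training rows closest in squared Euclidean distance."""
    nearest = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, other)), label)
        for other, label in zip(train_x, train_y)
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(train_x, train_y, test_x, test_y):
    """Fraction of test rows whose predicted label matches the true label."""
    hits = sum(knn_predict(train_x, train_y, r) == y for r, y in zip(test_x, test_y))
    return hits / len(test_y)


def permutation_importance(train_x, train_y, test_x, test_y, rng):
    """Return baseline accuracy and the accuracy drop after shuffling each feature."""
    base = accuracy(train_x, train_y, test_x, test_y)
    drops = {}
    for j, name in enumerate(FEATURES):
        column = [r[j] for r in test_x]
        rng.shuffle(column)  # break the link between feature j and the label
        shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(test_x, column)]
        drops[name] = base - accuracy(train_x, train_y, shuffled, test_y)
    return base, drops


rng = random.Random(0)
X, y = make_dataset(100, rng)
baseline, importances = permutation_importance(X[:60], y[:60], X[60:], y[60:], rng)
print("baseline accuracy:", baseline)
print("accuracy drop per shuffled feature:", importances)
```

A large accuracy drop for a feature suggests the model relies on it, while a drop near zero suggests it does not. The same procedure could in principle be wrapped around any of the other classifiers named in the abstract.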

Published

30-05-2025

Section

Research Articles

How to Cite

[1]
Shaik Mohammad Parvez and GVS Ananthnath, “Pdf Malware Detection: Toward Machine Learning Modelling with Explainability Analysis”, Int J Sci Res Sci Eng Technol, vol. 12, no. 3, pp. 503–509, May 2025, Accessed: Jun. 04, 2025. [Online]. Available: https://www.ijsrset.com/index.php/home/article/view/IJSRSET251273
