Vulnerability Estimation

This repository contains scripts for the research project “Classification and Estimation of System Vulnerabilities from the Network layer using Deep Neural Networks”. The project was developed as part of the Computers and Web Security course under the guidance of Dr. Sangeeta Mittal.

Fig. 1 | Distribution of vulnerabilities by risk factor. Most of the vulnerabilities depicted lie in the Medium risk category (%), followed by the High risk category (%). A negligible share of Type B vulnerabilities lie in the High (%) and Low (%) risk categories, while most of them fall in the High (%) and Medium (%) risk categories.

Fig. 2 | Bubble plot of the vendors of the detected vulnerabilities. All four sets of vulnerabilities are clustered by the vendor name of the vulnerable product (as defined by NVD). The diameter of each vendor's bubble depicts the normalized frequency of the vulnerabilities detected. Microsoft and PHP are the most frequent vendors of Type-A (system) and Type-B (network) vulnerabilities respectively. Interestingly, OpenSSL is the most frequent vendor of Type A.B vulnerabilities. This may be attributed to the emphasis that Type A.B vulnerabilities place on network communication.
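The bubble diameters in Fig. 2 rest on a per-vendor normalized frequency. The caption does not say how the frequency is normalized, so the sketch below assumes division by the count of the most frequent vendor (largest bubble = 1.0). The function name and the vendor labels are hypothetical placeholders, not values from the project data.

```python
from collections import Counter


def normalized_vendor_frequency(vendors):
    """Map each vendor name to its count divided by the largest count.

    Assumption: normalization is relative to the most frequent vendor;
    the project may instead normalize by the total number of records.
    """
    counts = Counter(vendors)
    if not counts:
        return {}
    top = max(counts.values())
    return {vendor: count / top for vendor, count in counts.items()}


# Made-up vendor labels, one per detected CVE record.
sample = ["vendor_a", "vendor_a", "vendor_a", "vendor_a", "vendor_b", "vendor_b", "vendor_c"]
frequencies = normalized_vendor_frequency(sample)
for vendor, value in sorted(frequencies.items()):
    print(vendor, value)
```

The resulting values can be passed as marker sizes to whichever plotting library draws the bubbles.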

Fig. 3 | Attack vectors associated with each vulnerability classification. Local and Network are the most frequently observed attack vectors, while Adjacent network and Physical are the least observed. The attack vectors are mapped from NVD for each CVE-ID detected. Where no attack vector is recorded, it is marked N.A. (not available). Local accounts for _% of Type A vulnerabilities but only _% and _% of Type-B and Type-A.B vulnerabilities. Notably, the Network attack vector is strongly associated with Type B and Type A.B vulnerabilities.

Fig. 4 | Summary of the trained multi-layer perceptron network. A multi-layer perceptron is employed for estimation of system vulnerabilities for two main reasons: (1) its ability to estimate polynomial dependencies and (2) its ability to train a model without any prior assumptions.
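To illustrate the first reason, here is a minimal sketch of a one-hidden-layer perceptron learning a quadratic dependency with plain gradient descent. It is not the network summarized in Fig. 4: the hidden size, learning rate, epoch count, and the synthetic target y = x^2 are all assumptions chosen for illustration, and the code uses only the Python standard library.

```python
import math
import random


def train_mlp(xs, ys, hidden=8, lr=0.05, epochs=2000, seed=0):
    """Fit a 1-input, 1-output MLP (one tanh hidden layer) by full-batch gradient descent.

    Returns a predict function and the per-epoch mean squared error.
    All hyperparameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    w1 = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]  # input -> hidden weights
    b1 = [0.0] * hidden                                   # hidden biases
    w2 = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]  # hidden -> output weights
    b2 = 0.0                                              # output bias
    n = len(xs)
    history = []

    for _ in range(epochs):
        g_w1 = [0.0] * hidden
        g_b1 = [0.0] * hidden
        g_w2 = [0.0] * hidden
        g_b2 = 0.0
        loss = 0.0
        for x, y in zip(xs, ys):
            # Forward pass.
            h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
            err = sum(w2[j] * h[j] for j in range(hidden)) + b2 - y
            loss += err * err / n
            # Backward pass: accumulate gradients of the mean squared error.
            g_b2 += 2.0 * err / n
            for j in range(hidden):
                g_w2[j] += 2.0 * err * h[j] / n
                back = 2.0 * err * w2[j] * (1.0 - h[j] ** 2) / n
                g_w1[j] += back * x
                g_b1[j] += back
        history.append(loss)
        # Gradient descent update.
        for j in range(hidden):
            w1[j] -= lr * g_w1[j]
            b1[j] -= lr * g_b1[j]
            w2[j] -= lr * g_w2[j]
        b2 -= lr * g_b2

    def predict(x):
        h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
        return sum(w2[j] * h[j] for j in range(hidden)) + b2

    return predict, history


# Synthetic polynomial target: y = x^2 on a grid over [-1, 1].
xs = [i / 10.0 for i in range(-10, 11)]
ys = [x * x for x in xs]
predict, history = train_mlp(xs, ys)
print(f"initial MSE: {history[0]:.4f}, final MSE: {history[-1]:.4f}")
```

No functional form for the dependency is given to the network, which reflects the second reason: the fit is learned from the data alone.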

Fig. 5 | Code snippet of the eli5 library used to estimate feature importance. In theory, the Type-A vulnerability score is more closely associated with the Type A.B score (the intersection of the two sets) than with the Type B score. Hence, feature importance estimation is used to estimate how strongly the Type-A score depends on each feature.
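The snippet in Fig. 5 is not reproduced here. As a rough illustration of the idea, the sketch below implements permutation importance in plain Python: shuffle one feature column at a time and measure how much the model's error grows. That this is the method used with eli5 is an assumption, since the caption does not name it; the function names, the toy model, and the synthetic data are all hypothetical.

```python
import random


def mse(predict, rows, targets):
    """Mean squared error of predict over the given rows."""
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(predict, rows, targets, n_repeats=5, seed=0):
    """Average increase in MSE when each feature column is shuffled.

    A larger value means the model relies more on that feature.
    """
    rng = random.Random(seed)
    baseline = mse(predict, rows, targets)
    importances = []
    for col in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[col] for r in rows]
            rng.shuffle(column)
            # Rebuild the rows with only this column permuted.
            shuffled = [r[:col] + [v] + r[col + 1:] for r, v in zip(rows, column)]
            increases.append(mse(predict, shuffled, targets) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Synthetic data: the target depends on feature 0 only.
rows = [[float(i), float((i * 7) % 20)] for i in range(20)]
targets = [3.0 * r[0] for r in rows]


def toy_predict(row):
    """Stand-in for a trained model; it ignores feature 1."""
    return 3.0 * row[0]


importances = permutation_importance(toy_predict, rows, targets)
for index, value in enumerate(importances):
    print(f"feature_{index}: {value:.3f}")
```

In this toy setup the ignored feature scores exactly zero, which is the pattern that would justify dropping a feature from the training set.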

Fig. 6 | Type A.B WScore is the most important feature for estimating the Type-A score. Type A WScore, Type A+B Score and Type B WScore are the least important and can be excluded from the training set.