DataSet:For the development of effective QSAR model, we have used total 49 molecules, which were collected from literature with their IC values (logarithm of 50% growth inhibitory concentration).The performance of model was evaluvated using five-fold cross-validation (Data available in supplementary dataset).
_{50} |

Descriptor calculation:For deriving the structural activity relationship of each molecule, we calculated descriptors using different softwares, V-life,CDK. V-life MDS (Molecular Design Suite) is a workbench for computer aided drug design (CADD) and molecule discovery. Through V-life software, we have calculated ~1002 descriptors including 1D,2D and 3D descriptors. Chemistry Development Kit (CDK), a Java based open source library for structural chemo- and bioinformatics projects calculates 178 descriptors. |

Descriptor calculation:For deriving the structural activity relationship of each molecule, we calculated descriptors using different softwares, V-life,CDK. V-life MDS (Molecular Design Suite) is a workbench for computer aided drug design (CADD) and molecule discovery. Through V-life software, we have calculated ~1002 descriptors including 1D,2D and 3D descriptors. Chemistry Development Kit (CDK), a Java based open source library for structural chemo- and bioinformatics projects calculates 178 descriptors. |

Descriptor's selection:In a QSAR study, selection of a preferred set of molecular descriptors is an important step to successfully derive a predictive QSAR model. Initially, the descriptors were selected using CfsubsetEval module with best fit algorithm implemented in weka. For further selection of relevant molecular descriptors F-steeping approach has been employed to remove non-significant descriptor's for the prediction of inhibitory activity against MurA enzyme. |

Performance Measures:Once a regression model was constructed, goodness about the fit and statistical significance was assessed using the statistical parameters outlined below. Where n is the size of test set, m is the size of training set, Toxpred is the predicted pIGC50 and Toxact is the actual pIGC50, is the toxicity in test set, RMSE are the root mean squared error, R is the Pearson's correlation coefficient between actual and predicted value, R |