1Northwest A&F University, Xinong Road 22, Yangling, 712100, China
2Dalian University of Technology, Linggong Road 2, Dalian, 116024, China
3Freshwater Fisheries Sciences Institute of Liaoning Province, Liaoning, 111000, China
The purpose of this work is to develop robust and interpretable quantitative structure-activity relationship (QSAR) models for assessing the aquatic toxicity of phenols using a combined set of descriptors encompassing logP and recently developed variables (Molconn-Z variables). The dataset consists of 250 chemicals with toxicity data for the ciliate Tetrahymena pyriformis. For each compound, a total of 197 physico-chemical descriptors, including logP and Molconn-Z descriptors, were calculated. Multiple linear regression (MLR) and partial least squares (PLS) were used to obtain QSARs, and the predictive performance of the proposed models was verified using external statistical validation. The results of the stepwise-MLR analysis reveal that the AlogP, MlogP and ClogP models were not successful in predicting aquatic toxicity for all the compounds. When logP (AlogP and MlogP) was combined with Molconn-Z descriptors, the resulting QSARs were still not sufficiently successful until some outliers were removed. Two optimal QSARs were then built, with R2 of 0.71 and 0.70 for the training sets and external validation Q2 of 0.69 and 0.68, respectively. The descriptors selected in the best models account for hydrophobicity (AlogP, MlogP) and for other electrotopological properties such as SHCsatu, Scarboxylicacid, SHBa, gmax and nelem. PLS produced a good model using all the calculated descriptors, with R2 of 0.78 and Q2 of 0.64, and hydrophobic and electrotopological descriptors proved important for the prediction of phenolic toxicity.
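The two statistics reported above, the training-set R2 and the external validation Q2, can be illustrated with a minimal sketch. It assumes an ordinary least-squares MLR fit and one common definition of external Q2, in which the test-set residuals are compared against deviations from the training-set mean; the text does not state which Q2 variant was used. The function names (`fit_mlr`, `predict`, `r2`, `q2_ext`) and the toy numbers are hypothetical and are not the paper's data or software.

```python
def fit_mlr(X, y):
    """Ordinary least squares via the normal equations.
    Returns [intercept, b1, b2, ...]. Assumes X^T X is nonsingular."""
    rows = [[1.0] + list(r) for r in X]          # prepend intercept column
    p = len(rows[0])
    # Augmented matrix [X^T X | X^T y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda k: abs(A[k][c]))
        A[c], A[piv] = A[piv], A[c]
        A[c] = [v / A[c][c] for v in A[c]]
        for k in range(p):
            if k != c:
                f = A[k][c]
                A[k] = [vk - f * vc for vk, vc in zip(A[k], A[c])]
    return [A[i][p] for i in range(p)]


def predict(coef, X):
    """Apply the fitted linear model to each descriptor row."""
    return [coef[0] + sum(b * x for b, x in zip(coef[1:], r)) for r in X]


def r2(y, y_hat):
    """Coefficient of determination on the data used for fitting."""
    mean = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def q2_ext(y_test, y_test_hat, y_train):
    """External Q2 (assumed variant): test-set residuals relative to
    test-set deviations from the TRAINING-set mean."""
    mean_train = sum(y_train) / len(y_train)
    press = sum((a - b) ** 2 for a, b in zip(y_test, y_test_hat))
    ss = sum((a - mean_train) ** 2 for a in y_test)
    return 1.0 - press / ss


# Toy, made-up values: two descriptors per compound (e.g. a logP-like
# value and one other descriptor) and a toxicity-like response.
X_train = [(1.2, 0.0), (2.1, 1.0), (2.9, 0.0), (3.8, 1.0), (4.5, 2.0)]
y_train = [0.1, 0.9, 1.4, 2.2, 2.4]
X_test = [(1.8, 1.0), (3.3, 0.0)]
y_test = [0.6, 1.7]

coef = fit_mlr(X_train, y_train)
print("R2 (training):", r2(y_train, predict(coef, X_train)))
print("Q2 (external):", q2_ext(y_test, predict(coef, X_test), y_train))
```

A Q2 close to the training R2, as with the pairs 0.71/0.69 and 0.70/0.68 reported above, suggests the model generalizes to compounds it was not fitted on, whereas a larger gap (0.78 versus 0.64 for PLS) points to some loss of predictive power outside the training set.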