The distribution coefficient, D, provides a measure of lipophilicity at a physiologically relevant pH. A drug's distribution coefficient strongly affects how easily the drug can reach its intended target in the body, how strong an effect it will have once it reaches its target, and how long it will remain in the body in an active form.
Why is logD7.4 Important?
In contrast to the partition coefficient P, which refers to the concentration ratio of the neutral species only, D is defined as the sum of the concentrations of all charge-state forms of a substance dissolved in the lipid phase (octanol) divided by the sum of the concentrations of those forms dissolved in water at a chosen pH. For this reason, logD is a much more suitable parameter than logP for correlating with a drug's biological action, since it takes into account the drug's ionisation at a relevant pH.
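To illustrate how ionisation separates logD from logP, the sketch below uses the common textbook approximation for a monoprotic acid or base, under the assumption that only the neutral species partitions into octanol. The function name and the example values (logP = 3.0, pKa = 4.4) are hypothetical and chosen for illustration. This is not the method used by the logD7.4 model described here, which does not use logP at all.

```python
import math

def logd_from_logp(logp: float, pka: float, ph: float = 7.4, acid: bool = True) -> float:
    """Approximate logD for a monoprotic compound (hypothetical helper).

    Assumption: only the neutral form enters octanol; partitioning of
    ions and ion pairs is ignored.
    """
    # The exponent is (pH - pKa) for an acid and (pKa - pH) for a base.
    delta = (ph - pka) if acid else (pka - ph)
    # Neutral fraction = 1 / (1 + 10**delta), hence
    # logD = logP - log10(1 + 10**delta).
    return logp - math.log10(1.0 + 10.0 ** delta)

# Illustrative acid: logP = 3.0, pKa = 4.4, evaluated at pH 7.4.
acid_logd = logd_from_logp(3.0, 4.4, acid=True)
print(f"logP = 3.0, logD7.4 = {acid_logd:.2f}")
```

With these example values the acid is almost fully ionised at pH 7.4, so the approximate logD falls about three log units below logP.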
This model predicts logD values at pH 7.4, at which many molecules exist in a partially dissociated or ionised form. The logD7.4 model was built by the automatic procedure implemented within the Auto-Modeller using standard settings. The initial dataset was split into three subsets using cluster analysis at a Tanimoto level of 0.7. The model was trained on 601 compounds and evaluated on a validation set of 127 compounds and a test set of 130 compounds. It was built using the Radial Basis Function technique with 173 2D descriptors, including atom and functionality counts. The logP descriptor was not used.
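The idea of grouping compounds by Tanimoto similarity before splitting can be sketched as follows. The text does not say which clustering algorithm or fingerprints the Auto-Modeller uses, so this example swaps in a simple leader-style clustering over toy feature sets. The function names and the fingerprints are hypothetical, and 0.7 is treated here as a minimum similarity to a cluster's first member.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity between two sets of feature identifiers."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def leader_cluster(fingerprints, threshold=0.7):
    """Assign each compound to the first cluster whose leader is similar enough.

    Returns a list of clusters, each a list of compound indices; the first
    index in each cluster is its leader. This is an illustrative stand-in,
    not the Auto-Modeller's actual procedure.
    """
    clusters = []
    for i, fp in enumerate(fingerprints):
        for cluster in clusters:
            leader_fp = fingerprints[cluster[0]]
            if tanimoto(fp, leader_fp) >= threshold:
                cluster.append(i)
                break
        else:
            # No existing leader is similar enough: start a new cluster.
            clusters.append([i])
    return clusters

# Toy fingerprints (sets of made-up feature ids) for five compounds.
fps = [{1, 2, 3, 4}, {1, 2, 3, 4, 5}, {7, 8, 9}, {7, 8, 9, 10}, {20, 21}]
clusters = leader_cluster(fps, threshold=0.7)
print(clusters)
```

Assigning whole clusters, rather than individual compounds, to the training, validation and test subsets is one way to keep close analogues from appearing on both sides of a split.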
The predictive model for logD7.4 was evaluated on the validation set, on which it achieved an excellent R2 value of 0.88 and an RMSE value of 0.65 log units, and on the test set with R2 = 0.86 and RMSE = 0.68 log units. On the combined validation and test sets the statistics were R2 = 0.88 and RMSE = 0.67 log units.
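For reference, the two statistics can be computed as in the minimal sketch below. It assumes that R2 means the coefficient of determination (one minus the ratio of residual to total sum of squares); the text does not say whether this or the squared correlation coefficient was used. The function names are hypothetical. The last helper pools per-set RMSE values by set size, which gives a quick consistency check on the combined figure using only the rounded numbers reported above.

```python
import math

def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def pooled_rmse(parts):
    """Combine per-set RMSEs given as (set_size, rmse) pairs."""
    total_n = sum(n for n, _ in parts)
    return math.sqrt(sum(n * r ** 2 for n, r in parts) / total_n)

# Validation set: 127 compounds, RMSE 0.65; test set: 130 compounds, RMSE 0.68.
combined = pooled_rmse([(127, 0.65), (130, 0.68)])
print(round(combined, 2))  # 0.67
```

The pooled value agrees with the reported combined RMSE of 0.67 log units, up to the rounding of the per-set inputs.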