logP, the logarithm of the octanol/water partition coefficient, is a measure of the lipophilicity of a compound. Lipophilicity is an important property of a drug molecule because it influences a number of physiological properties, including transport through cell membranes, rate of metabolism, and interaction with receptor binding sites.
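As a minimal illustration of the definition, the sketch below computes logP from the equilibrium concentrations of a compound in the two phases. The function name `log_p` and its parameters are hypothetical, introduced only for this example. It assumes both concentrations are in the same units and refer to the neutral (un-ionized) form of the compound.

```python
import math


def log_p(conc_octanol: float, conc_water: float) -> float:
    """Return log10 of the octanol/water partition coefficient.

    Hypothetical helper for illustration: both concentrations must be
    positive and expressed in the same units.
    """
    if conc_octanol <= 0 or conc_water <= 0:
        raise ValueError("concentrations must be positive")
    # Partition coefficient P = [compound]octanol / [compound]water
    return math.log10(conc_octanol / conc_water)


# A compound 100 times more concentrated in octanol than in water
print(log_p(100.0, 1.0))
```

A positive value indicates a preference for the octanol phase (more lipophilic), and a negative value indicates a preference for water.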

Why is logP Important?

Because it influences so many important properties, logP is often at the heart of discussions about the appropriate balance of properties for a drug discovery project. logP can be a useful indicator of the risk of metabolism, particularly by CYP3A4, and one strong argument for decreasing the logP of a compound is that doing so significantly decreases the risk of metabolism by this abundant enzyme. In conflict with this, the hydrophobic effect is one of the major contributors to target binding. For some projects, therefore, the lipophilicity required for target affinity may make other approaches to minimizing metabolism necessary, such as reducing the lability of metabolic sites. Complicating this balance further is the need to maintain an appropriate level of lipophilicity for oral absorption. To be absorbed across the intestinal epithelium, a drug must be able to partition into the lipid bilayer and back out again, which cannot happen if the lipophilicity is too high or too low.

Many criteria for logP have been proposed. The best known is the criterion from Lipinski’s Rule of Five, which is based on the observation that the majority of orally absorbed compounds have logP < 5 (Lipinski et al. Adv. Drug Deliv. Rev. (1997) 23 pp. 3-25). Hughes et al. observed an increase in adverse events in in vivo toxicity studies associated with high logP, particularly when combined with low polar surface area (PSA), and proposed the 4/75 rule of logP < 4 and PSA > 75 Å² (Hughes et al. Bioorg. Med. Chem. Lett. (2008) 18 pp. 4872-4875). The same criterion of logP < 4 has been proposed for compounds intended for central nervous system indications (Chico et al. Nat. Rev. Drug Discov. (2009) 8 pp. 892-909; Wager et al. ACS Chem. Neurosci. (2010) 1 pp. 435-449).
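The thresholds quoted above can be expressed as simple checks. The following is a minimal sketch, not part of any particular software package: the `Compound` class, the function name, and the dictionary keys are hypothetical, and only the logP component of the Rule of Five is tested (the rule's other criteria are not covered here).

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str
    logp: float  # octanol/water logP
    psa: float   # polar surface area in square angstroms


def check_logp_criteria(c: Compound) -> dict:
    """Apply the logP thresholds quoted in the text to one compound."""
    return {
        # logP component of Lipinski's Rule of Five
        "rule_of_five_logp": c.logp < 5.0,
        # 4/75 rule as stated in the text: logP < 4 and PSA > 75
        "rule_4_75": c.logp < 4.0 and c.psa > 75.0,
        # logP < 4 criterion for central nervous system indications
        "cns_logp": c.logp < 4.0,
    }


example = Compound(name="example", logp=4.5, psa=80.0)  # made-up values
print(check_logp_criteria(example))
```

For the made-up compound above, the Rule of Five logP criterion passes while the two logP < 4 criteria fail, which illustrates how the proposed thresholds can disagree for the same compound.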

logP Model

The Asteris logP model is based on a large data set of more than 9000 experimental octanol/water partition coefficient values obtained from the Medchem database. This database is the most comprehensive and reliable source of logP data, and most of the available in silico models for predicting logP are based on it. The model was trained on 6887 compounds using a radial basis function (RBF) method, a widely used approach to supervised learning problems. It was validated on a test set of 2950 compounds, for which it achieved an R² value of 0.92 between observed and predicted values (see the plot of observed vs predicted logP). When making a prediction, the model also calculates the distance of each new compound from the descriptor space of the training compounds in order to gauge the validity of the result; this is sometimes referred to as the domain of applicability. Predictions for compounds within the chemical space of the model are reported with a root mean square error (RMSE) of 0.44 log units. Estimated logP values for compounds outside this chemical space have an RMSE of 0.63 log units.

Figure: Observed vs predicted logP
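To make the ideas in the model description concrete, the sketch below fits a generic Gaussian RBF regressor to synthetic data, measures each test point's distance to the nearest training point as a crude domain-of-applicability check, and reports R² and RMSE. This is not the Asteris model: the descriptors, the kernel and ridge settings, the nearest-neighbour distance measure, the threshold of 1.0, and the use of the coefficient of determination for R² are all assumptions made for illustration, and the data are random numbers rather than real logP values.

```python
import numpy as np


def sq_dists(a, b):
    """Squared Euclidean distances between rows of a and rows of b."""
    return ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)


class RBFSketch:
    """Generic Gaussian RBF regressor (illustrative, hypothetical class)."""

    def __init__(self, gamma=0.5, ridge=1e-3):
        self.gamma = gamma  # kernel width parameter (assumed value)
        self.ridge = ridge  # small regularization term (assumed value)

    def fit(self, X, y):
        self.X_ = X
        K = np.exp(-self.gamma * sq_dists(X, X))
        # Solve (K + ridge * I) w = y for the kernel weights
        self.w_ = np.linalg.solve(K + self.ridge * np.eye(len(X)), y)
        return self

    def predict(self, X):
        return np.exp(-self.gamma * sq_dists(X, self.X_)) @ self.w_

    def distance_to_training(self, X):
        # Distance from each row of X to its nearest training point
        return np.sqrt(sq_dists(X, self.X_).min(axis=1))


def rmse(obs, pred):
    return float(np.sqrt(np.mean((obs - pred) ** 2)))


def r_squared(obs, pred):
    ss_res = np.sum((obs - pred) ** 2)
    ss_tot = np.sum((obs - np.mean(obs)) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Synthetic stand-in data: 3 made-up descriptors, not real logP values
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -0.5, 0.3])

# Roughly 70/30 split, similar in proportion to 6887/2950
X_train, y_train = X[:140], y[:140]
X_test, y_test = X[140:], y[140:]

model = RBFSketch().fit(X_train, y_train)
pred = model.predict(X_test)

threshold = 1.0  # hypothetical applicability cutoff
inside = model.distance_to_training(X_test) <= threshold

print("R^2 on test set:", r_squared(y_test, pred))
for label, mask in (("inside", inside), ("outside", ~inside)):
    if mask.any():  # skip a group if no test points fall in it
        print(f"RMSE {label} domain:", rmse(y_test[mask], pred[mask]))
```

Reporting the error separately for points inside and outside the cutoff mirrors the two RMSE values quoted in the text, where predictions far from the training data are expected to be less reliable.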