|| Checking for direct PDF access through Ovid
Although skin permeability of an active ingredient can be severely affected by its ionization in a dose solution, most of the existing prediction models cannot predict such impacts. To provide reliable predictors, we curated a novel large dataset of in vitro human skin permeability coefficients for 322 entries comprising chemically diverse permeants whose ionization fractions can be calculated. Subsequently, we generated thousands of computational descriptors, including Log D (octanol–water distribution coefficient at a specific pH), and analyzed the dataset using nonlinear support vector regression (SVR) and Gaussian process regression (GPR) combined with greedy descriptor selection. The SVR model was slightly superior to the GPR model, with externally validated squared correlation coefficient, root mean square error, and mean absolute error values of 0.94, 0.29, and 0.21, respectively. These models indicate that Log D is effective for a comprehensive prediction of ionization effects on skin permeability. In addition, the proposed models satisfied the statistical criteria endorsed in recent model validation studies. These models can evaluate virtually generated compounds at any pH; therefore, they can be used for high-throughput evaluations of numerous active ingredients and optimization of their skin permeability with respect to permeant ionization.