abstract
Currently there are no accurate models for the prediction of diffusion coefficients at infinite dilution in aqueous systems. Models that work well for polar solvents often perform worse in the case of water. At the same time, experimental tracer diffusion coefficients are scarce and can be impractical to measure when information on this important transport property is required. In this work, machine learning models were developed to predict the tracer diffusion coefficient of any solute in water at atmospheric pressure. Several approaches were tested to construct the model, using different types of input parameters: pure component properties and theoretical molecular descriptors, such as atom counts, structural fragments and fingerprints, computed using different sources. A database of 126 systems (1192 data points) was used for training, and the best model showed a global average absolute relative deviation (AARD) of 3.92%, with a maximum deviation of 24.27% on the test set. This model uses as inputs the temperature and 195 molecular descriptors computed using the RDKit cheminformatics package. These descriptors can be calculated automatically from a molecular identifier, which makes the model very simple to use. In comparison, the well-known Wilke-Chang equation provided an AARD of 13.03% on the same test set, demonstrating the improved accuracy of the proposed solution. The models developed in this work are provided at github.com/EgiChem/ml-D12-water-app.
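The accuracy figures quoted in the abstract rest on two quantities: the AARD metric and the Wilke-Chang estimate used as the baseline. The following is a minimal sketch of both, not the authors' code. It assumes the usual AARD definition (mean of |D_calc - D_exp| / D_exp, in percent), which the abstract does not spell out, and the standard form of the Wilke-Chang equation with an association factor of 2.6 for water. The function names `aard` and `wilke_chang` and their parameter names are hypothetical.

```python
from typing import Sequence


def aard(predicted: Sequence[float], experimental: Sequence[float]) -> float:
    """Average absolute relative deviation, in percent (assumed definition)."""
    if len(predicted) != len(experimental) or len(experimental) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - e) / e for p, e in zip(predicted, experimental))
    return 100.0 * total / len(experimental)


def wilke_chang(
    temperature_k: float,
    solute_molar_volume_cm3_mol: float,
    solvent_viscosity_cp: float,
    solvent_molar_mass_g_mol: float = 18.015,  # water
    association_factor: float = 2.6,           # usual value for water
) -> float:
    """Wilke-Chang estimate of the diffusion coefficient at infinite dilution.

    Returns D12 in cm^2/s. The solute molar volume is the value at the
    normal boiling point (cm^3/mol); the solvent viscosity is in cP.
    """
    return (
        7.4e-8
        * (association_factor * solvent_molar_mass_g_mol) ** 0.5
        * temperature_k
        / (solvent_viscosity_cp * solute_molar_volume_cm3_mol ** 0.6)
    )


if __name__ == "__main__":
    # Illustrative inputs only: a solute with a molar volume of 100 cm^3/mol
    # in water at 298.15 K, taking the viscosity of water as about 0.89 cP.
    d12 = wilke_chang(298.15, 100.0, 0.89)
    print(f"Wilke-Chang D12 = {d12:.3e} cm^2/s")

    # AARD of two made-up predictions against made-up reference values.
    print(f"AARD = {aard([1.1, 0.9], [1.0, 1.0]):.2f} %")
```

Comparing a model with the Wilke-Chang baseline under these assumptions means calling `aard` once per set of predictions against the same experimental values, as the abstract does for the test set.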
keywords
LIQUID; WATER
subject category
Chemistry; Physics
authors
Aniceto, JPS; Zêzere, B; Silva, CM
Projects
CICECO - Aveiro Institute of Materials (UIDB/50011/2020)
CICECO - Aveiro Institute of Materials (UIDP/50011/2020)
Associated Laboratory CICECO-Aveiro Institute of Materials (LA/P/0006/2020)
acknowledgements
This work was developed within the scope of the project CICECO-Aveiro Institute of Materials, UIDB/50011/2020 (DOI 10.54499/UIDB/50011/2020), UIDP/50011/2020 (DOI 10.54499/UIDP/50011/2020) & LA/P/0006/2020 (DOI 10.54499/LA/P/0006/2020), financed by national funds through the FCT/MCTES (PIDDAC). J.P.S.A. thanks FCT (Fundação para a Ciência e a Tecnologia) for funding under the Scientific Employment Stimulus-CEEC Individual 2020 (DOI 10.54499/2020.02534.CEECIND/CP1589/CT0014).

