mlptools

mlptools is a collection of codes for constructing linearized machine learning potentials. It will be made available on GitHub in the near future. If you use mlptools for academic purposes, please cite the following article [1].

[1] A. Seko, A. Togo and I. Tanaka, “Group-theoretical high-order rotational invariants for structural representations: Application to linearized machine learning interatomic potential”, Phys. Rev. B 99, 214108 (2019).

Generation of DFT datasets

  1. Structure generator (Prototypes) selection

    # For elemental systems
    $(mlptools)/structure/prototypes.py -t 1-covalent-simple
    $(mlptools)/structure/prototypes.py -t 1-metal-simple
    $(mlptools)/structure/prototypes.py -t 1-all
    $(mlptools)/structure/prototypes.py -t 1-manual-simple
    
    # For binary alloy systems
    $(mlptools)/structure/prototypes.py -t 2-alloy-simple --max_atoms 8
    $(mlptools)/structure/prototypes.py -t 2-alloy-all --max_atoms 8
    
    # For binary ionic systems
    $(mlptools)/structure/prototypes.py -t 2-ionic-simple --max_atoms 16 -c 0.5
    $(mlptools)/structure/prototypes.py -t 2-ionic-all --max_atoms 16 -c 0.25
    
  2. Geometry optimization of structure generators using DFT

  3. (optional) Clustering of optimized structure generators

    $(mlptools)/structure/prototypes_clustering.py -p */POSCAR
    
  4. Generation of structures used for DFT (Generation of training and test datasets)

    $(mlptools)/structure/generation.py -p */POSCAR -n 100 -d 0.5
    $(mlptools)/structure/generation.py -p */POSCAR -n 100 -d 0.5 --max_natom 50
    $(mlptools)/structure/generation.py --clustering -n 200 -d 0.5
    $(mlptools)/structure/generation.py --clustering -n 200 -d 0.5 --first_index 10001
    

    The --clustering option requires prototype_clustering.txt, the clustering result that prototypes_clustering.py generates automatically.

    > cat prototype_clustering.txt
    1168-01/POSCAR 4
    6052-01/POSCAR 7
    9893-01/POSCAR 1
    18147-01/POSCAR 2
    23693-01/POSCAR 1
    24692-01/POSCAR 2
    24754-01/POSCAR 1
    26766-01/POSCAR 13
    ...
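
For inspecting the clustering result in a script, the two-column layout shown above can be read with a few lines of Python. This is a minimal sketch and not part of mlptools: the function name parse_prototype_clustering is hypothetical, and the second column is kept as a plain integer because its meaning is not documented here.

```python
from typing import List, Tuple


def parse_prototype_clustering(text: str) -> List[Tuple[str, int]]:
    """Parse lines of the form '<POSCAR path> <integer>'.

    The integer column is returned unchanged; its interpretation is
    left to the caller.
    """
    entries = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and anything that is not a two-column entry,
        # such as a literal '...' placeholder.
        if len(fields) != 2:
            continue
        path, value = fields
        entries.append((path, int(value)))
    return entries


sample = """\
1168-01/POSCAR 4
6052-01/POSCAR 7
9893-01/POSCAR 1
"""
entries = parse_prototype_clustering(sample)
print(entries)
```

In practice the text would come from open('prototype_clustering.txt').read() rather than from an inline string.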
    

Estimation of machine learning potential

Dataset files listing the full paths of the vasprun.xml files are required (see the input example below).

$(mlptools)/mlpgen/regression.py --infile train.in
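
A dataset file of this kind can be assembled with a short script. The sketch below is not part of mlptools and rests on two assumptions: that a dataset file holds one absolute vasprun.xml path per line, and that each calculation lives in its own subdirectory. The helper names collect_vaspruns and write_dataset_file and the file name train_vasprun_list are hypothetical. The demonstration runs in a temporary directory with empty placeholder files.

```python
import glob
import os
import tempfile
from typing import List


def collect_vaspruns(pattern: str) -> List[str]:
    """Return sorted absolute paths of all files matching a glob pattern."""
    return sorted(os.path.abspath(p) for p in glob.glob(pattern))


def write_dataset_file(paths: List[str], filename: str) -> None:
    """Write one path per line (assumed layout of a dataset file)."""
    with open(filename, "w") as f:
        for path in paths:
            f.write(path + "\n")


# Demonstration in a throwaway directory with two empty placeholder files.
with tempfile.TemporaryDirectory() as root:
    for name in ("00001", "00002"):
        os.makedirs(os.path.join(root, name))
        open(os.path.join(root, name, "vasprun.xml"), "w").close()

    paths = collect_vaspruns(os.path.join(root, "*", "vasprun.xml"))
    list_file = os.path.join(root, "train_vasprun_list")
    write_dataset_file(paths, list_file)

    # Read the file back to confirm its contents.
    with open(list_file) as f:
        written = f.read().splitlines()
```

For real data, the pattern would be something like */vasprun.xml in the directory that holds the DFT results, and the resulting file name would be passed to train_data or test_data.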

(Optional) To speed up reading of the vasprun.xml files, unnecessary lines can be stripped from them with

$(mlptools)/tools/vasprun_compress.py */vasprun.xml

  • Input example

    # number of atom species
    n_type 2
    # use derivatives in training or not: True or False
    with_force True
    
    # regression method (ridge or lasso or normal)
    reg_method ridge
    
    # cutoff radius
    cutoff 10.0
    
    # model type with respect to structural features
    # 1 (power of features)
    # 2 (polynomial of all features)
    # 3 (polynomial of pair features + invariants)
    # 4 (polynomial of pair features and invariants(order=2) + invariants)
    model_type 1
    
    # degree of polynomials
    max_p 2
    
    # minimum, maximum and number of penalty parameters
    alpha_min -9
    alpha_max -2
    n_alpha 8
    
    # structural feature type (pair or gtinv)
    des_type gtinv
    
    # pairwise function type (gaussian or sph_bessel)
    pair_type gaussian
    # sequence for a in exp(-a(r-b)^2) [min, max, n]
    gaussian_params1 1.0 1.0 1
    # sequence for b in exp(-a(r-b)^2) [min, max, n]
    gaussian_params2 0 10.0 15
    
    # maximum order of group-theoretical invariants
    gtinv_order 4
    # maximum l values of group-theoretical invariants
    gtinv_maxl 7 7 2 0 0
    # use only symmetric invariants or not
    gtinv_sym False False False False False
    
    # training dataset files (describing locations of vasprun.xml files)
    train_data train_vasprun_0_Ti-Al train_vasprun_0_Ti-Al-1.5
    train_data_force True False
    train_data_weight 1.0 0.1
    
    # test dataset files (describing locations of vasprun.xml files)
    test_data vaspruns/test_vasprun_0_Ti-Al
    
    # atomic energy
    atomic_energy -2.44957430 -0.31391141
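
To make the "[min, max, n]" entries in the input example concrete, the sketch below parses a few lines of such a file and expands the sequences. It is an illustration under stated assumptions, not the mlptools parser: the sequences are assumed to be evenly spaced with both endpoints included, alpha_min and alpha_max are assumed to be base-10 exponents of the penalty parameter, and the Gaussian is evaluated without any cutoff function. All helper names (parse_input, linspace, expand, gaussian) are hypothetical.

```python
import math
from typing import Dict, List


def parse_input(text: str) -> Dict[str, List[str]]:
    """Parse 'key value1 value2 ...' lines, ignoring '#' comments."""
    params = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        key, *values = line.split()
        params[key] = values
    return params


def linspace(start: float, stop: float, n: int) -> List[float]:
    """Return n evenly spaced values from start to stop, both inclusive."""
    if n == 1:
        return [start]
    step = (stop - start) / (n - 1)
    return [start + i * step for i in range(n)]


def expand(values: List[str]) -> List[float]:
    """Expand a '[min, max, n]' triple into an explicit sequence."""
    lo, hi, n = float(values[0]), float(values[1]), int(values[2])
    return linspace(lo, hi, n)


def gaussian(r: float, a: float, b: float) -> float:
    """Pairwise Gaussian exp(-a (r - b)^2), without a cutoff function."""
    return math.exp(-a * (r - b) ** 2)


sample = """
# minimum, maximum and number of penalty parameters
alpha_min -9
alpha_max -2
n_alpha 8
gaussian_params1 1.0 1.0 1
gaussian_params2 0 10.0 15
"""
params = parse_input(sample)

a_values = expand(params["gaussian_params1"])
b_values = expand(params["gaussian_params2"])

# Assumption: alpha_min and alpha_max are base-10 exponents.
exponents = linspace(float(params["alpha_min"][0]),
                     float(params["alpha_max"][0]),
                     int(params["n_alpha"][0]))
alphas = [10.0 ** e for e in exponents]

# One value per (a, b) pair at an example distance r = 2.0.
features = [gaussian(2.0, a, b) for a in a_values for b in b_values]
```

With the values from the input example, this gives one width parameter a, 15 centers b between 0 and 10.0, and 8 candidate penalty parameters.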
    

Predictions

  • Property prediction

    $(mlptools)/prediction/prediction.py --pot mlp.pkl --poscar POSCAR
    
  • Numerical force and stress prediction

    $(mlptools)/prediction/prediction.py --pot mlp.pkl --numerical_force --poscar POSCAR
    
  • Rotational invariance test

    $(mlptools)/prediction/prediction.py --pot mlp.pkl --rotate_check --poscar POSCAR
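
The idea behind the last two checks can be illustrated on a toy model. The sketch below does not use mlptools and does not reproduce what prediction.py does internally. It assumes a simple pair energy on an isolated three-atom cluster (no periodic boundary conditions), computes forces by central finite differences, and confirms that the energy is unchanged when the cluster is rotated. The energy function, the step size h, and all names are illustrative assumptions.

```python
import math
from typing import List

Vec = List[float]


def energy(positions: List[Vec]) -> float:
    """Toy pair energy: sum of exp(-(r - 1)^2) over all atom pairs."""
    total = 0.0
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            r = math.dist(positions[i], positions[j])
            total += math.exp(-(r - 1.0) ** 2)
    return total


def numerical_forces(positions: List[Vec], h: float = 1e-5) -> List[Vec]:
    """Central differences: F = -(E(x + h) - E(x - h)) / (2 h)."""
    forces = []
    for i in range(len(positions)):
        f_i = []
        for k in range(3):
            plus = [list(p) for p in positions]
            minus = [list(p) for p in positions]
            plus[i][k] += h
            minus[i][k] -= h
            f_i.append(-(energy(plus) - energy(minus)) / (2.0 * h))
        forces.append(f_i)
    return forces


def rotate_z(positions: List[Vec], angle: float) -> List[Vec]:
    """Rotate all positions about the z axis by the given angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c * x - s * y, s * x + c * y, z] for x, y, z in positions]


atoms = [[0.0, 0.0, 0.0], [1.2, 0.0, 0.0], [0.3, 0.9, 0.4]]

# Rotational invariance: the energy depends only on interatomic distances.
e0 = energy(atoms)
e_rot = energy(rotate_z(atoms, 0.7))

# Numerical forces; they sum to zero because the energy is translation invariant.
forces = numerical_forces(atoms)
net_force = [sum(f[k] for f in forces) for k in range(3)]
```

Comparing numerical forces against analytical ones, and energies before and after a rotation, is a common way to catch implementation errors in a potential.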
    

To predict properties with LAMMPS, use the lammps-mlip package together with the mlp.lammps potential file.

Examples of using mlptools from Python

  • Prediction of the energy and forces of a given structure (POSCAR)

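    #!/usr/bin/env python
    import numpy as np
    import joblib
    # Pot is imported so that the class definition is available when unpickling.
    from mlptools.prediction.prediction import Pot
    
    # Load the trained potential and evaluate the structure in POSCAR.
    pot = joblib.load('mlp.pkl')
    # e: energy, f: forces; s_gpa and s_ev: stress (in GPa and eV, judging by the names)
    e, f, s_gpa, s_ev = pot.property(file_poscar='POSCAR')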
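
The same call can be wrapped to process several structures at once. The helper below is a hypothetical convenience function, not part of mlptools. It relies only on the pot.property(file_poscar=...) call and the four return values shown above. The dictionary keys are illustrative, and the stress labels are inferred from the variable names s_gpa and s_ev.

```python
from typing import Any, Dict, Iterable, List


def predict_many(pot: Any, poscar_paths: Iterable[str]) -> List[Dict[str, Any]]:
    """Call pot.property() for each POSCAR file and collect the results."""
    results = []
    for path in poscar_paths:
        e, f, s_gpa, s_ev = pot.property(file_poscar=path)
        results.append({
            "poscar": path,
            "energy": e,
            "forces": f,
            "stress_gpa": s_gpa,  # label inferred from the name s_gpa
            "stress_ev": s_ev,    # label inferred from the name s_ev
        })
    return results


# Usage with a trained potential (requires mlptools and mlp.pkl;
# the POSCAR file names are placeholders):
#   import joblib
#   pot = joblib.load('mlp.pkl')
#   results = predict_many(pot, ['POSCAR-001', 'POSCAR-002'])
```

The function accepts any object that exposes a compatible property method, so it can be exercised with a stand-in object before a trained potential is available.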
    

Pareto frontier and repository (private documentation)

  • Pareto frontier evaluation from grid-search trainings

    cd ~/mlip/binary-alloy/Ti-Al/5-pareto
    $(mlptools)/tools/pareto_lammps.py --dir ../4-reg/grid*
    $(mlptools)/tools/pareto_lammps.py --dir ../4-reg/grid* --force
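
As a reminder of what a Pareto frontier is, the sketch below selects the non-dominated models from a list of (cost, error) pairs. It is a generic illustration and not the logic of pareto_lammps.py, whose criteria are not documented here. The two axes (a computational cost and a prediction error, both to be minimized), the function name pareto_front, and the numbers are assumptions made for the example.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (cost, error), both to be minimized


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the points that no other point dominates.

    After sorting by cost (ties broken by error), a point belongs to the
    front only if its error is strictly lower than that of every point
    seen before it.
    """
    front = []
    best_error = float("inf")
    for cost, error in sorted(points):
        if error < best_error:
            front.append((cost, error))
            best_error = error
    return front


# Illustrative numbers only.
models = [(3.0, 7.0), (1.0, 10.0), (5.0, 3.0), (2.0, 6.0), (4.0, 3.0)]
front = pareto_front(models)
print(front)
```

Here (3.0, 7.0) is dominated by (2.0, 6.0), which is both cheaper and more accurate, and (5.0, 3.0) is dominated by (4.0, 3.0), so neither appears on the front.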