How-to: asap gen_desc
asap gen_desc sub_command is the descriptor generation command.
This is generally the first step, whether you want to map the dataset, perform regression, or run any other analysis. Type asap gen_desc --help and asap gen_desc sub_command --help for help strings.
Input
asap gen_desc can read any format supported by [ase.io](https://wiki.fysik.dtu.dk/ase/ase/io/io.html), for example: xyz, lammps-data, cif, cell, vasp, res, gromacs, …
However, it is most thoroughly tested on extended xyz files (units in angstrom, additional info and cell size in the comment line).
It can read many files at once using a wildcard pattern, e.g. H*.cell.
Note
When a glob pattern is passed, it should be quoted: asap gen_desc -f *.cell will not work, but asap gen_desc -f "*.cell" does.
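The quoting rule above can be illustrated with a small Python sketch. It shows that a quoted pattern reaches the program as one literal argument, which the program can then expand itself. The file names and the helper expand_pattern are hypothetical. Using glob here is an assumption for illustration, not a statement about how asap expands the pattern internally.

```python
import glob
import os
import shlex
import tempfile


def expand_pattern(pattern):
    """Return the sorted list of files matching a glob pattern (hypothetical helper)."""
    return sorted(glob.glob(pattern))


# Quoted on the command line, the pattern arrives as a single literal argument.
quoted_argv = shlex.split('asap gen_desc -f "*.cell"')

# The receiving program can then expand the pattern on its own.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("H1.cell", "H2.cell", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    found = expand_pattern(os.path.join(tmp, "*.cell"))
    matches = [os.path.basename(path) for path in found]

print(quoted_argv)
print(matches)
```

Without the quotes, most shells expand the pattern before the program starts, so the value after -f is no longer the pattern itself.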
Note
If the program gets confused about the file format, try supplying additional information with the --fxyz_format '{...}' flag.
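One way to avoid quoting mistakes in the --fxyz_format value is to build it programmatically. The sketch below assumes the flag takes a JSON-style dictionary string, as in the documented lammps-data example. The input file name system.data is hypothetical.

```python
import json
import shlex

# Format hints taken from the documented example for lammps-data files.
fmt = {"format": "lammps-data", "units": "metal", "style": "full"}

# Compact JSON string to pass as the value of --fxyz_format.
fmt_arg = json.dumps(fmt, separators=(",", ":"))

# "system.data" is a hypothetical input file name.
argv = ["asap", "gen_desc", "-f", "system.data", "--fxyz_format", fmt_arg, "soap"]

# shlex.quote wraps the JSON in single quotes so the shell passes it through unchanged.
command = " ".join(shlex.quote(arg) for arg in argv)
print(command)
```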
Output
The code will output two files:
- ${prefix}-desc.xyz, where ${prefix} is determined by the string that follows the --prefix flag (default: ASAP). This file is an extended xyz file that contains the design matrix (descriptors for each structure and/or atomic environment).
- ${prefix}-desc.yaml, which contains all the meta information about how the design matrix was generated (i.e. which descriptors and hyper-parameters were used).
Methods
There are two types of descriptors for atomic structures. The first type (e.g. ACSF, SOAP) starts from atomic descriptors for each atom in the structure, and then reduces all the atomic descriptors associated with a structure to a global descriptor. The second type (e.g. Coulomb Matrix) directly generates global descriptors.
To reduce the atomic descriptors of all atoms in a structure to a single vector representing the whole structure, asap calls a reducer, and there are different options. For example, for a structure A, the most straightforward way to get its global descriptor is to simply take the average of the atomic ones, i.e.
$$\Phi(A) = \frac{1}{N_A} \sum_{i \in A} \psi(\mathcal{X}_i)$$
where $N_A$ is the number of atoms in A and $\psi(\mathcal{X}_i)$ is the descriptor of the atomic environment $\mathcal{X}_i$. This is the average reducer. Alternatively, one can take the sum. The moment_avg or moment_sum reducer first takes the moment of the atomic descriptors, e.g.
$$\Phi^{z}(A) = \frac{1}{N_A} \sum_{i \in A} \psi^{z}(\mathcal{X}_i)$$
where z (--zeta/-z) is the moment to take when converting atomic descriptors to global ones.
In addition to these, one can perform the summation or the average operation on a per-element basis by using the flag --element_wise/-e.
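The reducers described above can be sketched in plain Python. This is an illustration rather than asap's implementation. The function name reduce_atomic is hypothetical, and taking the moment as an element-wise power of each atomic descriptor is an assumption. The per-element variant is not covered.

```python
def reduce_atomic(atomic, reducer="average", zeta=1):
    """Reduce per-atom descriptor vectors to one global vector (hypothetical helper)."""
    if not atomic:
        raise ValueError("need at least one atomic descriptor")
    n_atoms = len(atomic)
    # Assumption: the moment is an element-wise power of each atomic vector.
    power = zeta if reducer.startswith("moment") else 1
    # Column-wise sum of the (optionally powered) atomic vectors.
    total = [sum(x ** power for x in column) for column in zip(*atomic)]
    if reducer in ("sum", "moment_sum"):
        return total
    if reducer in ("average", "moment_avg"):
        return [t / n_atoms for t in total]
    raise ValueError(f"unknown reducer: {reducer}")


# Two atoms, each with a two-component descriptor.
atomic = [[1.0, 2.0], [3.0, 4.0]]
print(reduce_atomic(atomic, "average"))        # [2.0, 3.0]
print(reduce_atomic(atomic, "sum"))            # [4.0, 6.0]
print(reduce_atomic(atomic, "moment_avg", 2))  # [5.0, 10.0]
```

Whatever the number of atoms, the result has the same length as a single atomic descriptor, which is what lets structures of different sizes be compared.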
Overview of sub-commands
Sub-commands that control the actual generation of the descriptor matrix:
| option | description |
|---|---|
| acsf | Generate ACSF descriptors |
| cm | Generate the Coulomb Matrix descriptors |
| run | Running analysis using input files |
| soap | Generate SOAP descriptors |
asap gen_desc
Descriptor generation command. This command function is evaluated before the descriptor-specific ones; general setup, such as reading the input files, happens here.
asap gen_desc [OPTIONS] COMMAND [ARGS]...
Options
- -s, --stride <stride>
  Read in the xyz trajectory with a stride of X frames. Default: read/compute all frames.
- --periodic, --no-periodic
  Is the system periodic? If not specified, this is inferred from the XYZ file.
- -i, --in_file, --in <in_file>
  The state file that includes a dictionary-like specification of the descriptors to use.
- -f, --fxyz <fxyz>
  Input file that contains XYZ coordinates. See the list of possible input formats: https://wiki.fysik.dtu.dk/ase/ase/io/io.html If a wildcard * is used, all files matching the pattern are read.
- --fxyz_format <fxyz_format>
  Additional info for the input file format, e.g. {"format":"lammps-data","units":"metal","style":"full"}
- -p, --prefix <prefix>
  Prefix to be used for the output file.
- -np, --number_processes, --nprocess <number_processes>
  Number of processes to use when computing the descriptors in parallel.
  Default: 1
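The general options above come before the sub-command, following the usage line asap gen_desc [OPTIONS] COMMAND [ARGS]. A minimal sketch of assembling such a call from Python is shown below. The helper gen_desc_argv and the values passed to it are hypothetical. Only flags listed on this page are used.

```python
import shlex


def gen_desc_argv(fxyz, subcommand, prefix="ASAP", stride=None,
                  periodic=None, nprocess=1, sub_args=()):
    """Assemble an argument list for asap gen_desc (hypothetical helper)."""
    argv = ["asap", "gen_desc", "-f", fxyz, "--prefix", prefix]
    if stride is not None:
        argv += ["--stride", str(stride)]
    if periodic is not None:
        # Leaving the flag out lets asap infer periodicity from the input.
        argv.append("--periodic" if periodic else "--no-periodic")
    if nprocess != 1:
        argv += ["--number_processes", str(nprocess)]
    # General options first, then the sub-command and its own options.
    return argv + [subcommand, *sub_args]


argv = gen_desc_argv("*.cell", "soap", prefix="demo", stride=10,
                     periodic=True, nprocess=4, sub_args=["--peratom"])

# shlex.quote keeps the glob pattern quoted when the command is pasted into a shell.
print(" ".join(shlex.quote(arg) for arg in argv))
```

The resulting list can be passed to subprocess.run directly, in which case no shell is involved and the pattern needs no extra quoting.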
acsf
Generate ACSF descriptors
asap gen_desc acsf [OPTIONS]
Options
- -c, --cutoff <cutoff>
  Cutoff radius
- -u, --universal_acsf, --uacsf <universal_acsf>
  Try out our universal ACSF parameters.
  Default: minimal
  Options: none|smart|minimal|longrange
- --tag <tag>
  Tag for the descriptors.
- -pa, --peratom
  Save the per-atom local descriptors.
  Default: False
- -e, --element_wise
  Element-wise operation to get global descriptors from the atomic descriptors.
  Default: False
- -z, --zeta <zeta>
  Moments to take when converting atomic descriptors to global ones.
- -r, --reducer_type <reducer_type>
  Type of operation used to get global descriptors from the atomic descriptors, e.g. [average], [sum], [moment_avg], [moment_sum].
  Default: average
cm
Generate the Coulomb Matrix descriptors
asap gen_desc cm [OPTIONS]
Options
- --tag <tag>
  Tag for the descriptors.
soap
Generate SOAP descriptors
asap gen_desc soap [OPTIONS]
Options
- -c, --cutoff <cutoff>
  Cutoff radius
- -n, --nmax <nmax>
  Maximum radial label
- -l, --lmax <lmax>
  Maximum angular label (<= 9)
- --rbf <rbf>
  Radial basis function
  Default: gto
  Options: gto|polynomial
- -sigma, -g, --atom-gaussian-width <atom_gaussian_width>
  The width of the Gaussian centered on atoms.
  Default: 0.5
- --crossover, --no-crossover
  Whether to include the crossover of atomic types.
  Default: False
- -u, --universal_soap, --usoap <universal_soap>
  Try out our universal SOAP parameters.
  Default: minimal
  Options: none|smart|minimal|longrange
- --tag <tag>
  Tag for the descriptors.
- -pa, --peratom
  Save the per-atom local descriptors.
  Default: False
- -e, --element_wise
  Element-wise operation to get global descriptors from the atomic SOAP vectors.
  Default: False
- -z, --zeta <zeta>
  Moments to take when converting atomic descriptors to global ones.
- -r, --reducer_type <reducer_type>
  Type of operation used to get global descriptors from the atomic SOAP vectors, e.g. [average], [sum], [moment_avg], [moment_sum].
  Default: average
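The documented constraints on the soap options (lmax at most 9, two radial basis choices, four reducer types) can be checked before a run is launched. The sketch below is only an illustration: the helper soap_args and the numerical values are hypothetical. How explicit values interact with the --universal_soap presets is not stated on this page, so passing none alongside them is an assumption.

```python
def soap_args(cutoff, nmax, lmax, rbf="gto", sigma=0.5, crossover=False,
              reducer_type="average", zeta=None):
    """Build the option list for the soap sub-command (hypothetical helper)."""
    if lmax > 9:
        raise ValueError("lmax must be <= 9")
    if rbf not in ("gto", "polynomial"):
        raise ValueError("rbf must be 'gto' or 'polynomial'")
    if reducer_type not in ("average", "sum", "moment_avg", "moment_sum"):
        raise ValueError(f"unknown reducer_type: {reducer_type}")
    args = [
        "soap",
        # Assumption: 'none' requests the explicit values below instead of a preset.
        "--universal_soap", "none",
        "--cutoff", str(cutoff),
        "--nmax", str(nmax),
        "--lmax", str(lmax),
        "--rbf", rbf,
        "--atom-gaussian-width", str(sigma),
        "--crossover" if crossover else "--no-crossover",
        "--reducer_type", reducer_type,
    ]
    if zeta is not None:
        args += ["--zeta", str(zeta)]
    return args


# Hypothetical parameter values, chosen only to exercise the helper.
args = soap_args(cutoff=4.0, nmax=6, lmax=6, reducer_type="moment_avg", zeta=2)
print(" ".join(args))
```

The returned list is meant to be appended after the general asap gen_desc options, since the sub-command and its options come last.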
Note
More documentation to be added.