How-to: asap kde

The asap kde sub-command performs kernel density estimation (KDE) on the data. The estimate can be computed from the high-dimensional design matrix generated by asap gen_desc, or from the low-dimensional projections of the design matrix generated by asap map.

Overview of sub-commands

Sub-commands that select the specific algorithm for kernel density estimation, or plot its results:

sub-command    description
kde_internal   Internal implementation of KDE
kde_scipy      Scipy implementation of KDE
kde_sklearn    Scikit-learn implementation of KDE
plot_pca       Plot the KDE results using a PCA map

asap kde

Kernel density estimation using the design matrix. This command function is evaluated before the specific sub-commands; it handles the general setup, such as reading the input files.

asap kde [OPTIONS] COMMAND1 [ARGS]... [COMMAND2 [ARGS]...]...
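
As a sketch of how the chained form above might be assembled when scripting, the Python snippet below builds one invocation that runs a KDE sub-command followed by plot_pca. The helper name, file name, design-matrix tag, and prefix are hypothetical placeholders; only the flags are taken from the option lists on this page.

```python
import shlex


def build_kde_command(fxyz, design_matrix, prefix, subcommands):
    """Assemble an `asap kde` argument list (hypothetical helper, not part of asap).

    `subcommands` is a list of (name, args) pairs appended in order,
    mirroring COMMAND1 [ARGS]... [COMMAND2 [ARGS]...] in the usage line.
    """
    argv = [
        "asap", "kde",
        "--fxyz", fxyz,
        "--design_matrix", design_matrix,
        "--prefix", prefix,
    ]
    for name, args in subcommands:
        argv.append(name)
        argv.extend(args)
    return argv


# Placeholder inputs: replace with your own file, tag, and prefix.
argv = build_kde_command(
    fxyz="structures.xyz",
    design_matrix="[my_descriptor]",
    prefix="kde-demo",
    subcommands=[
        ("kde_scipy", ["-d", "10", "-bw", "scott"]),
        ("plot_pca", ["--style", "journal"]),
    ],
)

# Print a shell-quoted version of the command for inspection.
print(shlex.join(argv))
```

Once asap is installed, the same list could be passed to subprocess.run to execute the command.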

Options

-f, --fxyz <fxyz>

Input file that contains XYZ coordinates. If a wildcard * is used, all files matching the pattern are read. See the list of possible input formats: https://wiki.fysik.dtu.dk/ase/ase/io/io.html

-p, --prefix <prefix>

Prefix to be used for the output file.

--only_use_species <only_use_species>

Only use the atomic descriptors of species with the specified atomic number. Only makes sense if --use_atomic_descriptors is already set.

-ua, --use_atomic_descriptors, --use_atomic

Use atomic descriptors instead of global ones.

-dm, --design_matrix <design_matrix>

Location of the descriptor matrix file, or the name of the tags in the ASE xyz file. The type is a list, ‘[dm1, dm2]’, because several design matrices can be combined simultaneously.

--savetxt, --no-savetxt

Save the results to the txt file

--savexyz, --no-savexyz

Save the results to the xyz file

kde_internal

Internal implementation of KDE

asap kde kde_internal [OPTIONS]

Options

-d, --dimension <dimension>

The number of the first D dimensions to keep when doing KDE.

Default: 8
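
The internal implementation is not described on this page. The sketch below only illustrates what keeping the first D dimensions means, using a plain Gaussian KDE written in NumPy as a stand-in. The function name, the synthetic data, and the bandwidth parameter are assumptions made for illustration; kde_internal lists no bandwidth option.

```python
import numpy as np


def gaussian_kde_first_d(design_matrix, dimension, bandwidth):
    """Gaussian KDE at every sample, using only the first `dimension` columns.

    Illustrative stand-in, not the asap implementation.
    """
    x = np.asarray(design_matrix, dtype=float)[:, :dimension]  # keep the first D dimensions
    n_samples, d = x.shape
    # Pairwise squared Euclidean distances between all samples.
    diff = x[:, None, :] - x[None, :, :]
    sq_dist = np.sum(diff**2, axis=-1)
    # Isotropic Gaussian kernel, normalized so that each kernel integrates to 1.
    norm = (2.0 * np.pi * bandwidth**2) ** (d / 2.0)
    kernel = np.exp(-0.5 * sq_dist / bandwidth**2) / norm
    # Average over all kernels to get the density estimate at each sample.
    return kernel.mean(axis=1)


rng = np.random.default_rng(0)
design_matrix = rng.normal(size=(100, 20))  # synthetic stand-in: 100 samples, 20 features
density = gaussian_kde_first_d(design_matrix, dimension=8, bandwidth=1.0)
print(density.shape)
```

Columns beyond the first D do not influence the estimate, so the result is the same as running the KDE on a matrix that was truncated beforehand.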

kde_scipy

Scipy implementation of KDE

asap kde kde_scipy [OPTIONS]

Options

-d, --dimension <dimension>

The number of the first D dimensions to keep when doing KDE.

Default: 50

-bw, --bw_method <bw_method>

This can be ‘scott’, ‘silverman’, a scalar constant or a callable.
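
The accepted values match the bw_method argument of scipy.stats.gaussian_kde, so the sketch below assumes that this is the underlying call; the page does not confirm it. It shows how the three non-callable choices affect the bandwidth factor on synthetic data.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
samples = rng.normal(size=(200, 3))  # synthetic data: 200 samples, 3 dimensions

# gaussian_kde expects shape (n_dimensions, n_samples), hence the transpose.
kde_scott = gaussian_kde(samples.T, bw_method="scott")
kde_silverman = gaussian_kde(samples.T, bw_method="silverman")
kde_scalar = gaussian_kde(samples.T, bw_method=0.5)  # a scalar is used directly as the factor

for name, kde in [("scott", kde_scott), ("silverman", kde_silverman), ("scalar", kde_scalar)]:
    density = kde(samples.T)  # density evaluated at the training points
    print(name, kde.factor, density.mean())
```

In SciPy, the factor multiplies the data covariance, so a smaller factor gives a narrower kernel and a less smooth estimate.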

kde_sklearn

Scikit-learn implementation of KDE

asap kde kde_sklearn [OPTIONS]

Options

-d, --dimension <dimension>

The number of the first D dimensions to keep when doing KDE.

Default: 50

--metric <metric>

Controls how distance is computed in the ambient space of the input data. See: https://scikit-learn.org/stable/modules/density.html#kernel-density-estimation

Default: euclidean

--algorithm <algorithm>

Algorithm to use

Default: auto

Options: kd_tree|ball_tree|auto

--kernel <kernel>

Kernel to use

Default: gaussian

Options: gaussian|tophat|epanechnikov|exponential|linear|cosine

-bw, --bandwidth <bandwidth>

Bandwidth of the kernel

Default: 1
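
The options above share their names with the parameters of sklearn.neighbors.KernelDensity. Assuming they are passed through unchanged, which this page does not state, the defaults would correspond to the following sketch on synthetic data.

```python
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)
samples = rng.normal(size=(200, 3))  # synthetic data: 200 samples, 3 dimensions

# Same values as the documented defaults of kde_sklearn.
kde = KernelDensity(
    bandwidth=1.0,
    algorithm="auto",
    kernel="gaussian",
    metric="euclidean",
)
kde.fit(samples)

# score_samples returns the log-density; exponentiate to get the density.
log_density = kde.score_samples(samples)
density = np.exp(log_density)
print(density[:5])
```

Kernels with compact support, such as tophat, can assign zero density (a log-density of minus infinity) to points far from all samples, which the Gaussian kernel avoids.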

plot_pca

Plot the KDE results using a PCA map

asap kde plot_pca [OPTIONS]

Options

-s, --style <style>

Style of the plot.

Default: default

Options: default|journal

-ar, --aspect_ratio <aspect_ratio>

Aspect ratio of the plot

Default: 2

-a, --annotate <annotate>

Location of tags to annotate the samples.

--adjusttext, --no-adjusttext

Adjust the annotation texts so they do not overlap.

--peratom

Save the per-atom projection.

--scale, --no-scale

Standard scaling of the coordinates.

-d, --dimension <dimension>

Number of the dimensions to keep in the output XYZ file.

--axes <axes>

The projection axes along which to plot the projection.

-p, --prefix <prefix>

Prefix to be used for the output file.

Note

More documentation to be added.