How-to: asap map
asap map sub_command makes low-dimensional projections of the design matrix generated by the command asap gen_desc.
Overview of sub-commands
The sub-command controls which algorithm is used for the dimensionality reduction:
option | description |
---|---|
pca | Principal Component Analysis |
raw | Just plot the raw coordinates |
skpca | Sparse Kernel Principal Component Analysis |
tsne | t-SNE |
umap | UMAP |
asap map
Make 2D maps using dimensionality reduction. This command is evaluated before the specific sub-commands; it handles the general setup, such as reading the input files.
asap map [OPTIONS] COMMAND [ARGS]...
Options
- -f, --fxyz <fxyz>
  Input file that contains XYZ coordinates. For the list of possible input formats, see https://wiki.fysik.dtu.dk/ase/ase/io/io.html
  If a wildcard * is used, all files matching the pattern are read.
- -p, --prefix <prefix>
  Prefix to be used for the output file.
- --only_use_species <only_use_species>
  Only use the atomic descriptors of species with the specified atomic number. Only makes sense together with --use_atomic_descriptors.
- -ua, --use_atomic_descriptors, --use_atomic
  Use atomic descriptors instead of global ones.
- -dm, --design_matrix <design_matrix>
  Location of the descriptor matrix file, or name of the tags in the ASE XYZ file. The value is a list '[dm1, dm2]', so several design matrices can be combined simultaneously.
- -s, --style <style>
  Style of the plot.
  - Default: default
  - Options: default|journal
- -ar, --aspect_ratio <aspect_ratio>
  Aspect ratio of the plot.
  - Default: 2
- -a, --annotate <annotate>
  Location of tags to annotate the samples.
- --adjusttext, --no-adjusttext
  Adjust the annotation texts so they do not overlap.
- --peratom
  Save the per-atom projection.
- -ep, --extra-properties <extra_properties>
  Additional properties to be read for each frame, in CSV format.
- -o, --output <output>
  Output file format.
  - Options: xyz|matrix|none|chemiscope
- --keepraw, --no-keepraw
  Keep the high-dimensional descriptor when writing the XYZ output file.
- -c, --color <color>
  Location of a file or name of the properties in the XYZ file. Used to color the scatter plot for all samples (N floats).
- -ccol, --color_column <color_column>
  The column number used in the color file. Starts from 0.
- -clab, --color_label <color_label>
  The label for the color bar.
- -c0, --color_from_zero
  Set the minimum to zero and only plot the excess.
  - Default: False
- -cmap, --colormap <colormap>
  Colormap used. Common options: gnuplot, tab10, viridis, bwr, rainbow.
  - Default: gnuplot
- -nbs, --normalized_by_size
  Normalize the quantity used for the color function by the number of atoms in each frame.
  - Default: False
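The two color-related flags -c0 and -nbs can be read as simple transformations of the per-frame color values. The following is a minimal Python sketch under that reading. The function name prepare_colors, the sample numbers, and the order in which the two steps are applied are assumptions made for illustration; they are not taken from the asap source.

```python
def prepare_colors(values, n_atoms, normalized_by_size=False, color_from_zero=False):
    """Return per-frame color values after the optional transformations.

    values  -- one float per frame (the N floats read via --color)
    n_atoms -- number of atoms in each frame, same length as values
    """
    if len(values) != len(n_atoms):
        raise ValueError("values and n_atoms must have the same length")
    colors = [float(v) for v in values]
    if normalized_by_size:
        # -nbs: divide each value by the number of atoms in its frame
        colors = [c / n for c, n in zip(colors, n_atoms)]
    if color_from_zero:
        # -c0: shift so the minimum becomes zero, leaving only the excess
        lowest = min(colors)
        colors = [c - lowest for c in colors]
    return colors


energies = [-10.0, -18.0, -33.0]  # hypothetical total energies per frame
sizes = [2, 4, 6]                 # hypothetical atom counts per frame
print(prepare_colors(energies, sizes, normalized_by_size=True, color_from_zero=True))
```

With both flags enabled, frames of different sizes become comparable (energy per atom) and the color bar starts at zero for the lowest frame.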
pca
Principal Component Analysis
asap map pca [OPTIONS]
Options
- --scale, --no-scale
  Standard scaling of the coordinates.
- -d, --dimension <dimension>
  Number of dimensions to keep in the output XYZ file.
- --axes <axes>
  The projection axes along which to plot.
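Following the usage line asap map [OPTIONS] COMMAND [ARGS]..., the general options go before the sub-command name and the sub-command options go after it. The sketch below assembles such a call as an argument list in Python. The helper build_asap_map_command and all values (file name, descriptor tag, property name, prefix) are hypothetical placeholders, and the sketch only builds and prints the command; it does not run asap.

```python
import shlex


def build_asap_map_command(group_options, sub_command, sub_options=()):
    """Assemble 'asap map [OPTIONS] COMMAND [ARGS]...' as an argument list."""
    return ["asap", "map", *group_options, sub_command, *sub_options]


# Hypothetical inputs: every value below is a placeholder.
argv = build_asap_map_command(
    group_options=[
        "-f", "structures.xyz",    # input structures
        "-dm", "[my_descriptor]",  # design matrix tag(s), given as a list
        "-c", "energy",            # property used to color the scatter plot
        "-p", "pca-map",           # prefix of the output file
    ],
    sub_command="pca",
    sub_options=["--scale", "-d", "4"],  # pca-specific options
)

# shlex.join quotes arguments such as the bracketed list for a POSIX shell
print(shlex.join(argv))
```

The same argument list could be passed to subprocess.run(argv, check=True) on a machine where asap is installed.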
raw
Just plot the raw coordinates
asap map raw [OPTIONS]
Options
- --scale, --no-scale
  Standard scaling of the coordinates.
- -d, --dimension <dimension>
  Number of dimensions to keep in the output XYZ file.
- --axes <axes>
  The projection axes along which to plot.
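"Standard scaling" usually means shifting each column of the design matrix to zero mean and dividing it by its standard deviation. The sketch below shows that operation in plain Python, assuming this is what --scale does; the function name standard_scale, the use of the population standard deviation, and the handling of constant columns are illustrative choices rather than details confirmed by the asap documentation.

```python
import statistics


def standard_scale(rows):
    """Scale each column of a list-of-rows matrix to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # Population standard deviation; a constant column gets a divisor of 1.0
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return [
        [(x - m) / s for x, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Hypothetical 3-sample, 2-feature design matrix; the second column is constant.
design_matrix = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
scaled = standard_scale(design_matrix)
for row in scaled:
    print(row)
```

Scaling matters most when the columns have very different magnitudes, since otherwise the largest column dominates the plotted axes.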
skpca
Sparse Kernel Principal Component Analysis
asap map skpca [OPTIONS]
Options
- --scale, --no-scale
  Standard scaling of the coordinates.
- -d, --dimension <dimension>
  Number of dimensions to keep in the output XYZ file.
- --axes <axes>
  The projection axes along which to plot.
- -k, --kernel <kernel>
  Kernel function for converting the design matrix to a kernel matrix.
  - Default: linear
  - Options: linear|polynomial|cosine
- -kp, --kernel_parameter <kernel_parameter>
  Parameter used in the kernel function.
- -s, --sparse_mode <sparse_mode>
  Sparsification method to use.
  - Default: fps
  - Options: random|cur|fps|sequential
- -n, --n_sparse <n_sparse>
  Number of representative samples. Set to a negative value to disable sparsification.
  - Default: 100
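The default sparsification mode, fps, is assumed here to stand for farthest point sampling: starting from one sample, repeatedly add the sample that lies farthest from everything selected so far. The sketch below is a generic plain-Python version of that idea, not the asap implementation. The function name, the start argument, the Euclidean distance, and the sample points are assumptions for illustration; the negative n_sparse case mirrors the option description above.

```python
import math


def farthest_point_sampling(points, n_sparse, start=0):
    """Greedily pick n_sparse indices, each time taking the point farthest
    (in Euclidean distance) from all points selected so far."""
    n = len(points)
    if n_sparse < 0 or n_sparse >= n:
        return list(range(n))  # no sparsification: keep every sample
    if n_sparse == 0:
        return []
    selected = [start]
    # Distance from every point to its nearest selected point
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(selected) < n_sparse:
        nxt = max(range(n), key=min_dist.__getitem__)
        selected.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return selected


# Hypothetical 2D descriptors forming two tight groups and one point in between
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (5.0, 0.0), (5.1, 0.0)]
print(farthest_point_sampling(points, 3))
```

Because each new pick maximizes the distance to the current selection, the representative samples spread over the whole data set instead of clustering in its densest region.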
tsne
t-SNE
asap map tsne [OPTIONS]
Options
- --metric <metric>
  Controls how distance is computed in the ambient space of the input data. See: https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html
  - Default: euclidean
- -l, --learning_rate <learning_rate>
  The learning rate is usually in the range [10.0, 1000.0].
  - Default: 200.0
- -e, --early_exaggeration <early_exaggeration>
  Controls how tight natural clusters in the original space are in the embedded space.
  - Default: 12.0
- -p, --perplexity <perplexity>
  The perplexity is related to the number of nearest neighbors that are used in other manifold learning algorithms. Larger datasets usually require a larger perplexity. Consider selecting a value between 5 and 50. Different values can result in significantly different results.
  - Default: 30.0
- --pca, --no-pca
  Preprocess the data using PCA with dimension 50. Recommended.
- --scale, --no-scale
  Standard scaling of the coordinates.
- -d, --dimension <dimension>
  Number of dimensions to keep in the output XYZ file.
- --axes <axes>
  The projection axes along which to plot.
umap
UMAP
asap map umap [OPTIONS]
Options
- -nn, --n_neighbors <n_neighbors>
  Controls how UMAP balances local versus global structure in the data.
  - Default: 10
- -md, --min_dist <min_dist>
  Controls how tightly UMAP is allowed to pack points together.
  - Default: 0.1
- --metric <metric>
  Controls how distance is computed in the ambient space of the input data. See: https://umap-learn.readthedocs.io/en/latest/parameters.html#metric
  - Default: euclidean
- --scale, --no-scale
  Standard scaling of the coordinates.
- -d, --dimension <dimension>
  Number of dimensions to keep in the output XYZ file.
- --axes <axes>
  The projection axes along which to plot.
Note
More documentation to be added.