How-to: asap cluster
The asap cluster sub-command performs clustering of the data. Clustering can use either the high-dimensional design matrix generated by asap gen_desc, or the low-dimensional projections of the design matrix generated by asap map.
Overview of sub-commands
option | description |
---|---|
dbscan | Density-based spatial clustering of applications with noise… |
fdb | Clustering by fast search and find of density peaks (FDB) |
plot_pca | Plot the clustering results using a PCA map. |
Note
plot_pca does not perform the clustering task; it only plots the clustering results using a PCA map. It should be used after a clustering command, e.g.
asap cluster -f some.xyz -dm '[*]' fdb plot_pca
asap cluster
Clustering using the design matrix. This command is evaluated before the specific sub-commands; it handles the general setup, such as reading the input files.
asap cluster [OPTIONS] COMMAND1 [ARGS]... [COMMAND2 [ARGS]...]...
Options
- -f, --fxyz <fxyz>
  Input file that contains XYZ coordinates. If a wildcard * is used, all files matching the pattern are read. For the list of possible input formats, see https://wiki.fysik.dtu.dk/ase/ase/io/io.html
- -p, --prefix <prefix>
  Prefix to be used for the output file.
- --only_use_species <only_use_species>
  Only use the atomic descriptors of species with the specified atomic number. Only meaningful together with --use_atomic_descriptors.
- -ua, --use_atomic_descriptors, --use_atomic
  Use atomic descriptors instead of global ones.
- -dm, --design_matrix <design_matrix>
  Location of the descriptor matrix file, or name of the tags in the ASE xyz file. The type is a list, e.g. '[dm1, dm2]', because several design matrices can be combined at once.
- -km, --kernel_matrix <kernel_matrix>
  Location of a kernel matrix file.
- --savetxt, --no-savetxt
  Save the results to a txt file.
- --savexyz, --no-savexyz
  Save the results to an xyz file.
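The list form of -dm suggests that several design matrices are joined into one before clustering. The sketch below illustrates that idea under the assumption that the matrices are concatenated column-wise, so each sample keeps exactly one row. Both this assumption and the helper name combine_design_matrices are illustrative and are not taken from the asap source.

```python
def combine_design_matrices(matrices):
    """Join several design matrices column-wise (hypothetical helper).

    Each matrix is a list of rows, one row per sample. Column-wise
    concatenation is an assumption about what '[dm1, dm2]' means.
    """
    n_samples = {len(m) for m in matrices}
    if len(n_samples) != 1:
        raise ValueError("all design matrices must describe the same samples")
    combined = []
    for rows in zip(*matrices):
        merged = []
        for row in rows:
            merged.extend(row)  # append this matrix's features for the sample
        combined.append(merged)
    return combined


# Toy data: two samples, described by a 2-feature and a 1-feature matrix.
dm1 = [[1.0, 2.0], [3.0, 4.0]]
dm2 = [[0.5], [0.7]]
print(combine_design_matrices([dm1, dm2]))
```

With this convention the combined matrix has as many rows as there are samples and as many columns as the individual matrices have in total.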
dbscan
Density-based spatial clustering of applications with noise (DBSCAN)
asap cluster dbscan [OPTIONS]
Options
- --metric <metric>
  Controls how distance is computed in the ambient space of the input data. See: https://scikit-learn.org/stable/modules/generated/sklearn.cluster.DBSCAN.html
  - Default: euclidean
- -ms, --min_samples <min_samples>
  The number of samples (or total weight) in a neighborhood for a point to be considered as a core point.
  - Default: 5
- -e, --eps <eps>
  The maximum distance between two samples for one to be considered as in the neighborhood of the other.
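To make the roles of eps and min_samples concrete, here is a minimal pure-Python sketch of the DBSCAN labelling procedure. It is an illustration only, not the code that asap cluster dbscan runs. The function name dbscan_labels and the toy points are hypothetical. The Euclidean distance corresponds to the default metric listed above. Counting a point as a member of its own neighborhood is an assumption borrowed from scikit-learn's convention.

```python
import math


def dbscan_labels(points, eps, min_samples):
    """Return one cluster id per point; -1 marks noise (illustrative sketch)."""
    n = len(points)
    # Neighborhood of i: every point within eps of i, including i itself.
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1  # i is an unvisited core point: start a new cluster
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # j is a core point: keep expanding
    return labels


# Toy data: two tight squares and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
print(dbscan_labels(points, eps=1.5, min_samples=3))
```

A larger eps merges nearby groups into one cluster, while a larger min_samples turns sparse regions into noise, so the two options are usually tuned together.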
plot_pca
Plot the clustering results using a PCA map. Only use this command after fdb or dbscan.
asap cluster plot_pca [OPTIONS]
Options
- -s, --style <style>
  Style of the plot.
  - Default: default
  - Options: default|journal
- -ar, --aspect_ratio <aspect_ratio>
  Aspect ratio of the plot.
  - Default: 2
- -a, --annotate <annotate>
  Location of tags to annotate the samples.
- --adjusttext, --no-adjusttext
  Adjust the annotation texts so they do not overlap.
- --peratom
  Save the per-atom projection.
- --scale, --no-scale
  Standard scaling of the coordinates.
- -d, --dimension <dimension>
  Number of dimensions to keep in the output XYZ file.
- --axes <axes>
  Projection axes to plot.
- -p, --prefix <prefix>
  Prefix to be used for the output file.
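The --scale option refers to standard scaling. As a rough illustration, the sketch below assumes the usual meaning of the term: each column is shifted to zero mean and divided by its standard deviation. Whether asap applies exactly this transformation is an assumption, and the function name standard_scale is hypothetical.

```python
import statistics


def standard_scale(rows):
    """Scale each column to zero mean and unit variance (illustrative sketch)."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # Population standard deviation; a constant column has deviation 0,
    # so fall back to 1.0 to avoid dividing by zero.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return [
        [(x - m) / s for x, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Toy data: the second column is constant and is simply centred at zero.
print(standard_scale([[1.0, 10.0], [3.0, 10.0]]))
```

Scaling of this kind keeps features with large numerical ranges from dominating a subsequent PCA projection.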
Note
More documentation to be added.