MineralMatcher

class emmet.core.featurization.robocrys.condense.mineral.MineralMatcher(initial_ltol=0.2, initial_stol=0.3, initial_angle_tol=5.0, use_fingerprint_matching=True, fingerprint_distance_cutoff=0.4, mineral_db=None)

Bases: object

Class to match a structure to a mineral name.

Uses a precomputed database of minerals and their fingerprints, extracted from the AFLOW prototype database. For more information on this database see reference [aflow]:

[aflow] Mehl, M. J., Hicks, D., Toher, C., Levy, O., Hanson, R. M., Hart, G., & Curtarolo, S. (2017), The AFLOW library of crystallographic prototypes: part 1. Computational Materials Science, 136, S1-S828. doi: 10.1016/j.commatsci.2017.01.017

Args:
initial_ltol: The fractional length tolerance used in the AFLOW structure matching.

initial_stol: The site coordinate tolerance used in the AFLOW structure matching.

initial_angle_tol: The angle tolerance used in the AFLOW structure matching.

use_fingerprint_matching: Whether to use the fingerprint distance to match minerals.

fingerprint_distance_cutoff: Cutoff to determine how similar a match must be to be returned. The distance is measured between the structural fingerprints in Euclidean space.

mineral_db: Optional path or pandas DataFrame object containing the mineral fingerprint database.

Parameters:
  • initial_ltol (float)

  • initial_stol (float)

  • initial_angle_tol (float)

  • use_fingerprint_matching (bool)

  • fingerprint_distance_cutoff (float)

  • mineral_db (str | Path | pd.DataFrame | None)
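To make the meaning of fingerprint_distance_cutoff concrete, the following is a minimal sketch of a Euclidean distance test between two fingerprint vectors. The helper names are hypothetical and not part of MineralMatcher, and whether the real comparison is strict or inclusive at the cutoff is an assumption here.

```python
import math


def fingerprint_distance(fp_a, fp_b):
    """Euclidean distance between two equal-length fingerprint vectors."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    return math.dist(fp_a, fp_b)


def within_cutoff(fp_a, fp_b, cutoff=0.4):
    """Return True if the two fingerprints are closer than the cutoff.

    The default mirrors fingerprint_distance_cutoff=0.4; the strict '<'
    comparison is an assumption made for this sketch.
    """
    return fingerprint_distance(fp_a, fp_b) < cutoff
```

A smaller cutoff accepts only closer fingerprints, so fewer candidate minerals are returned.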

get_best_mineral_name(structure)

Gets the “best” mineral name for a structure.

Uses a combination of AFLOW prototype matching and fingerprinting to get the best mineral name.

The AFLOW structure prototypes are detailed in reference [aflow].

The algorithm works as follows:

  1. Check for an AFLOW match. If there is a single match, return its mineral name.

  2. If multiple matches, return the one with the smallest fingerprint distance.

  3. If no AFLOW match, get fingerprints within tolerance. If there are any matches, take the one with the smallest distance.

  4. If no fingerprints are within tolerance, get fingerprints without constraining the number of species types. If there are any matches, take the best one.
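The four steps above can be sketched in plain Python as follows. This is an illustration of the decision order only, not the library's implementation: pick_best_mineral and the fingerprint_search callable are hypothetical stand-ins, and the values returned for "n_species_types_match" in the AFLOW and no-match cases are assumptions.

```python
def pick_best_mineral(aflow_matches, fingerprint_search):
    """Choose a mineral name following the documented order of checks.

    aflow_matches: list of dicts with "type" and "distance" keys, or None.
    fingerprint_search: hypothetical callable taking match_n_sp (bool) and
        returning a list of dicts with "type" and "distance" keys, or None.
    """
    # Steps 1 and 2: an AFLOW match wins; ties are broken by the smallest
    # fingerprint distance. The reported distance is -1 for AFLOW matches.
    if aflow_matches:
        best = min(aflow_matches, key=lambda m: m["distance"])
        return {"type": best["type"], "distance": -1,
                "n_species_types_match": True}  # assumed True here

    # Step 3: fingerprints with the same number of species types.
    matches = fingerprint_search(True)
    if matches:
        best = min(matches, key=lambda m: m["distance"])
        return {"type": best["type"], "distance": best["distance"],
                "n_species_types_match": True}

    # Step 4: relax the constraint on the number of species types.
    matches = fingerprint_search(False)
    if matches:
        best = min(matches, key=lambda m: m["distance"])
        return {"type": best["type"], "distance": best["distance"],
                "n_species_types_match": False}

    # No match: the type is None; the other two values are assumptions.
    return {"type": None, "distance": -1, "n_species_types_match": None}
```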

Args:

structure (Structure): A pymatgen Structure object to match.

Returns:

(dict): The mineral name information. Stored as a dict with the keys “type”, “distance”, “n_species_types_match”, corresponding to the mineral name, the fingerprint distance between the prototype and known mineral, and whether the number of species types in the structure matches the number in the known prototype, respectively. If no mineral match is determined, the mineral type will be None. If an AFLOW match is found, the distance will be set to -1.

Return type:

dict[str, Any]

Parameters:

structure (Structure)
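As a sketch of how the returned dict might be consumed, the helper below turns it into a short label. It relies only on the three documented keys and the documented conventions (type None for no match, distance -1 for an AFLOW match); the function name and the label wording are illustrative assumptions.

```python
def summarize_match(result):
    """Build a short human-readable label from a mineral match dict."""
    mineral = result.get("type")
    if mineral is None:
        return "no mineral match"

    # A distance of -1 marks an AFLOW prototype match.
    if result.get("distance") == -1:
        return f"{mineral} (AFLOW prototype match)"

    # Otherwise the match came from fingerprint comparison.
    label = f"{mineral} (fingerprint distance {result['distance']:.2f})"
    if not result.get("n_species_types_match"):
        label += ", different number of species types"
    return label
```

With a real matcher, the input would be the value returned by get_best_mineral_name(structure).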

get_aflow_matches(structure)

Gets minerals for a structure by matching to AFLOW prototypes.

Overrides pymatgen.analysis.prototypes.AflowPrototypeMatcher to only return matches to prototypes with known mineral names.

The AFLOW tolerance parameters (defined in the init method) are passed to a pymatgen.analysis.structure_matcher.StructureMatcher object. The tolerances are gradually decreased until only a single match is found (if possible).

The AFLOW structure prototypes are detailed in reference [aflow].

Return type:

list[dict[str, Any]] | None

Parameters:

structure (Structure)

Args:

structure: A pymatgen structure to match.

Returns:

A list of dict, sorted by how close the match is, with the keys ‘type’, ‘distance’, ‘structure’. Distance is the euclidean distance between the structure and prototype fingerprints. If no match was found within the tolerances, None will be returned.
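The idea of gradually decreasing the tolerances until a single match remains can be sketched as a simple loop. This is not the library's code: find_matches is a hypothetical callable standing in for a StructureMatcher-based search, and the shrink factor, the round limit, and the rule of keeping the last non-empty result are all assumptions.

```python
def tighten_until_unique(find_matches, ltol=0.2, stol=0.3, angle_tol=5.0,
                         factor=0.8, max_rounds=10):
    """Shrink the tolerances until at most one match remains, if possible.

    find_matches: hypothetical callable (ltol, stol, angle_tol) -> list.
    factor and max_rounds are illustrative choices, not library values.
    """
    matches = find_matches(ltol, stol, angle_tol)
    for _ in range(max_rounds):
        if len(matches) <= 1:
            break
        # Tighten all three tolerances by the same factor.
        ltol, stol, angle_tol = ltol * factor, stol * factor, angle_tol * factor
        tighter = find_matches(ltol, stol, angle_tol)
        if not tighter:
            # Tightening removed every match: keep the last non-empty result.
            break
        matches = tighter
    return matches or None
```

When two prototypes match equally well, no tolerance separates them, which is why the result may still contain more than one entry.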

get_fingerprint_matches(structure, max_n_matches=None, match_n_sp=True, mineral_name_constraint=None)

Gets minerals for a structure by matching to AFLOW fingerprints.

Only AFLOW prototypes with mineral names are considered. The AFLOW structure prototypes are detailed in reference [aflow].

Return type:

list[dict[str, Any]] | None

Parameters:
  • structure (Structure)

  • max_n_matches (int | None)

  • match_n_sp (bool)

  • mineral_name_constraint (str | None)

Args:

structure: A structure to match.

max_n_matches: Maximum number of matches to return. Set to None to return all matches within the cutoff.

match_n_sp: Whether the structure and mineral must have the same number of species. Defaults to True.

mineral_name_constraint: A mineral name to restrict the matching to. Defaults to None (no restriction).

Returns:

A list of dict, sorted by how close the match is, with the keys ‘type’, ‘distance’, ‘structure’. Distance is the euclidean distance between the structure and prototype fingerprints. If no match was found within the tolerances, None will be returned.
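How the arguments of get_fingerprint_matches interact can be illustrated with a small stand-alone ranking function. The database layout (a list of dicts with "mineral", "n_species" and "fingerprint" keys) and the function name are hypothetical; the real class reads its data from mineral_db and may filter differently.

```python
import math


def rank_fingerprint_matches(target_fp, target_n_species, database, cutoff=0.4,
                             max_n_matches=None, match_n_sp=True,
                             mineral_name_constraint=None):
    """Return matches sorted by fingerprint distance, or None if there are none."""
    matches = []
    for entry in database:
        # Optionally require the same number of species.
        if match_n_sp and entry["n_species"] != target_n_species:
            continue
        # Optionally restrict the search to one mineral name.
        if (mineral_name_constraint is not None
                and entry["mineral"] != mineral_name_constraint):
            continue
        distance = math.dist(target_fp, entry["fingerprint"])
        if distance < cutoff:  # strict comparison is an assumption
            matches.append({"type": entry["mineral"], "distance": distance,
                            "structure": entry.get("structure")})

    matches.sort(key=lambda m: m["distance"])  # closest match first
    if max_n_matches is not None:
        matches = matches[:max_n_matches]
    return matches or None
```

Setting match_n_sp=False widens the candidate pool, which corresponds to the fallback used when no fingerprint with the same number of species lies within the cutoff.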