API Reference
Index
 class usearch.index.BatchMatches(keys: ndarray, distances: ndarray, counts: ndarray, visited_members: int = 0, computed_distances: int = 0)#
This class contains information about multiple retrieved vectors for multiple queries, i.e. it is a set of Matches instances.
 computed_distances: int = 0#
 count_matches(expected: ndarray, count: int | None = None) → int
Counts the Matches, in the range [0, len(expected)], that contain the corresponding expected entry anywhere among their results.
 counts: ndarray#
 distances: ndarray#
 keys: ndarray#
 mean_recall(expected: ndarray, count: int | None = None) → float
Measures recall, in the range [0, 1], as the share of Matches that contain the corresponding expected entry anywhere among their results.
 to_list() → List[List[tuple]]
Converts the results of each query into a list of tuples with information about its matches.
 visited_members: int = 0#
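The two recall helpers above differ only in normalization. The following is a minimal pure-Python sketch of that counting, using plain lists instead of the NumPy arrays the class holds. The names `count_matches_sketch` and `mean_recall_sketch` are hypothetical, and treating `count` as a cap on how many results are examined is an assumption; this is not the library's implementation.

```python
from typing import Optional, Sequence


def count_matches_sketch(
    keys: Sequence[Sequence[int]],
    counts: Sequence[int],
    expected: Sequence[int],
    count: Optional[int] = None,
) -> int:
    """Count the queries whose expected key appears anywhere among their results."""
    hits = 0
    for row, found, wanted in zip(keys, counts, expected):
        # Only the first `found` entries of a row are valid matches.
        # Assumption: `count`, when given, further caps the examined results.
        limit = found if count is None else min(found, count)
        if wanted in row[:limit]:
            hits += 1
    return hits


def mean_recall_sketch(keys, counts, expected, count=None) -> float:
    """Normalize the hit count into the [0, 1] range."""
    return count_matches_sketch(keys, counts, expected, count) / len(expected)


# Three queries with up to three results each; the last query returned only two.
keys = [[7, 3, 9], [4, 1, 2], [5, 8, 0]]
counts = [3, 3, 2]
expected = [3, 6, 0]

hits = count_matches_sketch(keys, counts, expected)
recall = mean_recall_sketch(keys, counts, expected)
```

Only the first query finds its expected key here: the second never returns key 6, and the trailing 0 in the third row lies beyond that query's valid result count.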
 class usearch.index.Clustering(index: 'Index', matches: 'BatchMatches', queries: 'Optional[np.ndarray]' = None)#
 property centroids_popularity: Tuple[ndarray, ndarray]
 members_of(centroid: uint64) → ndarray
 property network
 plot_centroids_popularity()
 subcluster(centroid: uint64, **clustering_kwards) → Clustering
 class usearch.index.CompiledMetric(pointer, kind, signature)#
 kind: MetricKind#
Alias for field number 1
 pointer: int#
Alias for field number 0
 signature: MetricSignature#
Alias for field number 2
 class usearch.index.Index(*, ndim: int = 0, metric: str | usearch.compiled.MetricKind | usearch.index.CompiledMetric = <MetricKind.Cos: 99>, dtype: str | usearch.compiled.ScalarKind | None = None, connectivity: int | None = None, expansion_add: int | None = None, expansion_search: int | None = None, multi: bool = False, path: os.PathLike | None = None, view: bool = False, enable_key_lookups: bool = True)
Fast vector-search engine for dense equi-dimensional embeddings.
Vector keys must be integers. Vectors must have the same number of dimensions within the index. Supports Inner Product, Cosine Distance, L^n measures like the Euclidean metric, as well as automatic down-casting to low-precision floating-point and integral representations.
 add(keys: uint64 | Iterable[uint64] | int | Iterable[int] | ndarray | memoryview, vectors: ndarray | Iterable[ndarray] | memoryview, *, copy: bool = True, threads: int = 0, log: str | bool = False, progress: Callable[[int, int], bool] | None = None) → int | ndarray
Inserts one or more vectors into the index.
For maximal performance, the keys and vectors should conform to Python’s “buffer protocol” spec.
 To index a single entry:
keys: int, vectors: np.ndarray.
 To index many entries:
keys: np.ndarray, vectors: np.ndarray.
When working with extremely large indexes, you may want to pass copy=False, provided you can guarantee that the primary vector store stays alive throughout construction.
 Parameters:
keys (Optional[KeyOrKeysLike], can be None) – Unique identifier(s) for passed vectors
vectors (VectorOrVectorsLike) – Vector or a row-major matrix
copy (bool, defaults to True) – Should the index store a copy of vectors
threads (int, defaults to 0) – Optimal number of cores to use
log (Union[str, bool], defaults to False) – Whether to print the progress bar
progress (Optional[ProgressCallback], defaults to None) – Callback to report stats of the progress and control it
 Returns:
Inserted key or keys
 Return type:
Union[int, np.ndarray]
 property capacity: int#
 clear()#
Erases all the vectors from the index, preserving the space for future insertions.
 cluster(*, vectors: ndarray | None = None, keys: ndarray | None = None, min_count: int | None = None, max_count: int | None = None, threads: int = 0, log: str | bool = False, progress: Callable[[int, int], bool] | None = None) → Clustering
Clusters already indexed or provided vectors, mapping them to various centroids.
 Parameters:
vectors (Optional[VectorOrVectorsLike]) – Vectors to cluster; if omitted, the already indexed vectors are clustered
count (Optional[int], defaults to None) – Upper bound on the number of clusters to produce
threads (int, defaults to 0) – Optimal number of cores to use
log (Union[str, bool], defaults to False) – Whether to print the progress bar
progress (Optional[ProgressCallback], defaults to None) – Callback to report stats of the progress and control it
 Returns:
Clustering that maps the vectors to their centroids
 Return type:
Clustering
 property connectivity: int#
 contains(keys: uint64 | Iterable[uint64] | int | Iterable[int] | ndarray | memoryview) → bool | ndarray
 count(keys: uint64 | Iterable[uint64] | int | Iterable[int] | ndarray | memoryview) → int | ndarray
 property dtype: ScalarKind#
 property expansion_add: int#
 property expansion_search: int#
 get(keys: uint64 | Iterable[uint64] | int | Iterable[int] | ndarray | memoryview, dtype: str | ScalarKind | None = None) → ndarray | None | Tuple[ndarray | None]
Looks up one or more keys from the Index, retrieving corresponding vectors.
Returns None if one key is requested and it is not present. Returns a (row) vector if the key maps to a single vector. Returns a (row-major) matrix if the key maps to multiple vectors. If multiple keys are requested, composes many such responses into a tuple.
 Parameters:
keys (KeyOrKeysLike) – One or more keys to lookup
 Returns:
One or more keys lookup results
 Return type:
Union[Optional[np.ndarray], Tuple[Optional[np.ndarray]]]
 property hardware_acceleration: str#
Describes the kind of hardware-acceleration support used in that exact instance of the Index, for that metric kind, and the given number of dimensions.
 Returns:
“auto”, if nothing is available, ISA subset name otherwise
 Return type:
str
 property jit: bool
True, if the provided metric was JIT-compiled
 Return type:
bool
 join(other: Index, max_proposals: int = 0, exact: bool = False, progress: Callable[[int, int], bool] | None = None) → Dict[uint64, uint64]
Performs a “Semantic Join”, or pairwise matching, between self and the other index. Differs from search in that no collisions are allowed in the resulting pairs. Uses the concept of “Stable Marriages” from Combinatorics, famous for the 2012 Nobel Prize in Economics.
 Parameters:
other (Index) – Another index.
max_proposals (int, optional) – Limit on candidates evaluated per vector, defaults to 0
exact (bool, optional) – Controls if underlying search should be exact, defaults to False
progress (Optional[ProgressCallback], defaults to None) – Callback to report stats of the progress and control it
 Returns:
Mapping from keys of self to keys of other
 Return type:
Dict[Key, Key]
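To illustrate the “Stable Marriages” idea behind join, here is a minimal pure-Python sketch of the classic Gale-Shapley procedure over a dense distance table. The function name `stable_join_sketch` and the dense `distances` input are assumptions made for illustration; this reference does not show how the library performs the matching internally, and the sketch ignores `max_proposals` and approximate search.

```python
from typing import Dict, List


def stable_join_sketch(distances: List[List[float]]) -> Dict[int, int]:
    """Gale-Shapley matching; distances[i][j] is the distance from left i to right j."""
    n_left = len(distances)
    n_right = len(distances[0]) if distances else 0
    # Each left entry proposes to right entries from nearest to farthest.
    preferences = [
        sorted(range(n_right), key=lambda j, i=i: distances[i][j])
        for i in range(n_left)
    ]
    next_choice = [0] * n_left
    engaged_to: Dict[int, int] = {}  # right key -> left key
    free = list(range(n_left))
    while free:
        i = free.pop()
        if next_choice[i] >= n_right:
            continue  # Exhausted every candidate; stays unmatched.
        j = preferences[i][next_choice[i]]
        next_choice[i] += 1
        rival = engaged_to.get(j)
        if rival is None:
            engaged_to[j] = i
        elif distances[i][j] < distances[rival][j]:
            # The right entry prefers the closer proposer; the rival is freed.
            engaged_to[j] = i
            free.append(rival)
        else:
            free.append(i)
    return {left: right for right, left in engaged_to.items()}


# Both left entries are closest to right entry 0, but only one may keep it.
pairs = stable_join_sketch([[0.1, 0.9], [0.2, 0.8]])
```

A plain nearest-neighbor search would map both left entries to right entry 0; the join instead resolves the collision in favor of the closer pair and sends the other left entry to its next candidate.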
 property keys: IndexedKeys#
 level_stats(level: int) → IndexStats
Get statistics for one level of the index, i.e. one graph.
 Returns:
Statistics for one level of the index, i.e. one graph.
 Return type:
_CompiledIndexStats
 Statistics:
nodes (int): The number of nodes in that level.
edges (int): The number of edges in that level.
max_edges (int): The maximum possible number of edges in that level.
allocated_bytes (int): The amount of allocated memory for that level.
 property levels_stats: List[IndexStats]
Get the accumulated statistics for every level graph.
 Returns:
Statistics for every level graph.
 Return type:
List[_CompiledIndexStats]
 Statistics:
nodes (int): The number of nodes in that level.
edges (int): The number of edges in that level.
max_edges (int): The maximum possible number of edges in that level.
allocated_bytes (int): The amount of allocated memory for that level.
 load(path_or_buffer: str | PathLike | bytes | None = None, progress: Callable[[int, int], bool] | None = None)
 property max_level: int#
 property memory_usage: int#
 static metadata(path_or_buffer: str | PathLike | bytes) → dict | None
 property metric: MetricKind | CompiledMetric
 property metric_kind: MetricKind | CompiledMetric
 property multi: bool#
 property ndim: int#
 property nlevels: int#
 pairwise_distance(left: uint64 | Iterable[uint64] | int | Iterable[int] | ndarray | memoryview, right: uint64 | Iterable[uint64] | int | Iterable[int] | ndarray | memoryview) → ndarray | float
 remove(keys: uint64 | Iterable[uint64] | int | Iterable[int] | ndarray | memoryview, *, compact: bool = False, threads: int = 0) → int | ndarray
Removes one or more vectors from the index.
When working with extremely large indexes, you may want to mark some entries as deleted instead of rebuilding a filtered index. In other cases, rebuilding is the recommended approach.
 Parameters:
keys (KeyOrKeysLike) – Unique identifier for passed vectors, optional
compact (bool, optional) – Removes links to removed nodes (expensive), defaults to False
threads (int, optional) – Optimal number of cores to use, defaults to 0
 Returns:
Array of integers for the number of removed vectors per key
 Return type:
Union[int, np.ndarray]
 rename(from_: uint64 | Iterable[uint64] | int | Iterable[int] | ndarray | memoryview, to: uint64 | Iterable[uint64] | int | Iterable[int] | ndarray | memoryview) → int | ndarray
Renames an existing member vector or vectors.
May be used in iterative clustering procedures, where one would iteratively relabel every vector with the name of the cluster an entry belongs to, until the system converges.
 Parameters:
from_ (KeyOrKeysLike) – One or more keys to be renamed
to (KeyOrKeysLike) – New name or names (of identical length as from_)
 Returns:
Number of vectors that were found and renamed
 Return type:
int
 reset()#
Erases all members from index, closing files, and returning RAM to OS.
 save(path_or_buffer: str | PathLike | None = None, progress: Callable[[int, int], bool] | None = None) → bytes | None
 search(vectors: ndarray | Iterable[ndarray] | memoryview, count: int = 10, radius: float = inf, *, threads: int = 0, exact: bool = False, log: str | bool = False, progress: Callable[[int, int], bool] | None = None) → Matches | BatchMatches
Performs approximate nearest neighbors search for one or more queries.
 Parameters:
vectors (VectorOrVectorsLike) – Query vector or vectors.
count (int, defaults to 10) – Upper count on the number of matches to find
threads (int, defaults to 0) – Optimal number of cores to use
exact (bool, defaults to False) – Perform exhaustive linear-time exact search
log (Union[str, bool], optional) – Whether to print the progress bar, defaults to False
progress (Optional[ProgressCallback], defaults to None) – Callback to report stats of the progress and control it
 Returns:
Matches for one or more queries
 Return type:
Union[Matches, BatchMatches]
 property serialized_length: int#
 property size: int#
 property specs: Dict[str, str | int | bool]
 property stats: IndexStats
Get the accumulated statistics for the entire multi-level graph.
 Returns:
Statistics for the entire multi-level graph.
 Return type:
_CompiledIndexStats
 Statistics:
nodes (int): The number of nodes in that level.
edges (int): The number of edges in that level.
max_edges (int): The maximum possible number of edges in that level.
allocated_bytes (int): The amount of allocated memory for that level.
 property vectors: ndarray#
 view(path_or_buffer: str | PathLike | bytes | bytearray | None = None, progress: Callable[[int, int], bool] | None = None)
 class usearch.index.IndexedKeys(index: Index)
Smart reference for the range of keys present in a specific Index
 class usearch.index.Indexes(indexes: Iterable[Index] = [], paths: Iterable[PathLike] = [], view: bool = False, threads: int = 0)#

 merge_path(path: PathLike)#
 search(vectors, count: int = 10, *, threads: int = 0, exact: bool = False, progress: Callable[[int, int], bool]  None = None)#
 class usearch.index.Match(key: int, distance: float)#
This class contains information about a retrieved vector.
 distance: float
 key: int
 to_tuple() → tuple
 class usearch.index.Matches(keys: ndarray, distances: ndarray, visited_members: int = 0, computed_distances: int = 0)#
This class contains information about multiple retrieved vectors for a single query, i.e. it is a set of Match instances.
 computed_distances: int = 0#
 distances: ndarray#
 keys: ndarray#
 to_list() → List[tuple]
Converts the matches to a list of tuples containing the matches’ indices and the distances to them.
 visited_members: int = 0#
 usearch.index.search(dataset: numpy.ndarray, query: numpy.ndarray, count: int = 10, metric: str | usearch.compiled.MetricKind | usearch.index.CompiledMetric = <MetricKind.Cos: 99>, *, exact: bool = False, threads: int = 0, log: str | bool = False, progress: typing.Callable[[int, int], bool] | None = None) → Matches | BatchMatches
Shortcut for search that can avoid index construction. Particularly useful for tiny datasets, where brute-force exact search works fast enough.
 Parameters:
dataset (np.ndarray) – Row-major matrix.
query (np.ndarray) – Query vector or vectors (also row-major), to find in dataset.
count (int, optional) – Upper count on the number of matches to find, defaults to 10
metric (MetricLike, defaults to MetricKind.Cos) – Distance function: a kind of the distance function, or the Numba cfunc JIT-compiled object. Possible MetricKind values: IP, Cos, L2sq, Haversine, Pearson, Hamming, Tanimoto, Sorensen.
threads (int, optional) – Optimal number of cores to use, defaults to 0
exact (bool, optional) – Perform exhaustive linear-time exact search, defaults to False
log (Union[str, bool], optional) – Whether to print the progress bar, defaults to False
progress (Optional[ProgressCallback], defaults to None) – Callback to report stats of the progress and control it
 Returns:
Matches for one or more queries
 Return type:
Union[Matches, BatchMatches]
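Conceptually, a brute-force exact search scores the query against every row of the dataset and keeps the closest ones. The pure-Python sketch below shows that idea for cosine distance on plain lists. The names `cosine_distance` and `exact_search_sketch`, and the use of row positions as keys, are assumptions for illustration rather than the library's implementation.

```python
import math
from typing import List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """One minus the cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def exact_search_sketch(
    dataset: Sequence[Sequence[float]],
    query: Sequence[float],
    count: int = 10,
) -> List[Tuple[int, float]]:
    """Linear scan: compute every distance, then keep the `count` smallest."""
    scored = [
        (row, cosine_distance(vector, query))
        for row, vector in enumerate(dataset)
    ]
    scored.sort(key=lambda pair: pair[1])
    return scored[:count]


dataset = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
matches = exact_search_sketch(dataset, [1.0, 0.1], count=2)
```

The cost grows linearly with the dataset size, which is why this approach only suits tiny datasets.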
IO
 usearch.io.guess_numpy_dtype_from_filename(filename) → type | None
 usearch.io.load_matrix(filename: str, start_row: int = 0, count_rows: int | None = None, view: bool = False, dtype: type | None = None) → ndarray | None
Read *.ibin, *.bbin, *.hbin, *.fbin, *.dbin files with matrices.
 Parameters:
filename – path to the matrix file
start_row – start reading vectors from this index
count_rows – number of vectors to read. If None, read all vectors
view – set to True to memory-map the file instead of loading to RAM
 Returns:
parsed matrix
 Return type:
numpy.ndarray
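This reference does not spell out the binary layout of these files. The sketch below assumes the layout commonly used by public vector-search benchmark datasets: two little-endian 32-bit integers (rows, columns) followed by row-major data, here 32-bit floats as in a *.fbin file. That layout and the helper names `write_fbin_sketch` and `read_fbin_sketch` are assumptions and should be checked against the actual files before relying on them.

```python
import os
import struct
import tempfile
from typing import List, Optional

HEADER_BYTES = 8  # Assumed header: two 32-bit integers (rows, columns).
FLOAT_BYTES = 4


def write_fbin_sketch(path: str, rows: List[List[float]]) -> None:
    """Write the assumed header followed by row-major float32 data."""
    with open(path, "wb") as file:
        file.write(struct.pack("<ii", len(rows), len(rows[0])))
        for row in rows:
            file.write(struct.pack(f"<{len(row)}f", *row))


def read_fbin_sketch(
    path: str, start_row: int = 0, count_rows: Optional[int] = None
) -> List[List[float]]:
    """Read `count_rows` rows starting at `start_row`; all remaining rows if None."""
    with open(path, "rb") as file:
        total_rows, cols = struct.unpack("<ii", file.read(HEADER_BYTES))
        available = max(0, total_rows - start_row)
        wanted = available if count_rows is None else min(count_rows, available)
        # Skip the header and the rows that precede the requested slice.
        file.seek(HEADER_BYTES + start_row * cols * FLOAT_BYTES)
        return [
            list(struct.unpack(f"<{cols}f", file.read(cols * FLOAT_BYTES)))
            for _ in range(wanted)
        ]


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "vectors.fbin")
    write_fbin_sketch(path, [[1.0, 2.0], [3.0, 4.0], [5.0, 6.5]])
    tail = read_fbin_sketch(path, start_row=1, count_rows=2)
    everything = read_fbin_sketch(path)
```

Because every row has a fixed size, a slice can be located with a single seek, which is also what makes memory-mapping such a file practical.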
 usearch.io.numpy_scalar_size(dtype) → int
Evaluation
 class usearch.eval.AddTask(keys: 'np.ndarray', vectors: 'np.ndarray')#

 property count#
 inplace_shuffle()#
Reorders the vectors and keys. Often used for robustness benchmarks.
 keys: ndarray#
 property ndim#
 vectors: ndarray#
 class usearch.eval.Dataset(keys: 'np.ndarray', vectors: 'np.ndarray', queries: 'np.ndarray', neighbors: 'np.ndarray')#
 static build(vectors: str | None = None, queries: str | None = None, neighbors: str | None = None, count: int | None = None, ndim: int | None = None, k: int | None = None)
Either loads an existing dataset from disk, or generates one on the fly.
 Parameters:
vectors (Optional[str], optional) – defaults to None
queries (Optional[str], optional) – defaults to None
neighbors (Optional[str], optional) – defaults to None
count (Optional[int], optional) – defaults to None
ndim (Optional[int], optional) – defaults to None
k (Optional[int], optional) – defaults to None
 crop_neighbors(k: int)#
 keys: ndarray#
 property ndim#
 neighbors: ndarray#
 queries: ndarray#
 vectors: ndarray#
 class usearch.eval.Evaluation(tasks: 'List[Union[AddTask, SearchTask]]', count: 'int', ndim: 'int')#
 count: int#
 static for_dataset(dataset: Dataset, batch_size: int = 0, clusters: int = 1) → Evaluation
 ndim: int
 tasks: List[AddTask | SearchTask]
 class usearch.eval.SearchStats(index_size: int, count_queries: int, count_matches: int, visited_members: int, computed_distances: int)#
Contains statistics for one or more search runs, including the number of internal nodes that were fetched (visited_members) and the number of times the distance metric was invoked (computed_distances).
Other derived metrics include mean_recall and mean_efficiency. Recall is the share of queried vectors that were successfully found. Efficiency describes the number of distances that had to be computed for each query, normalized to the size of the index. The highest efficiency is 0.(9), the lowest is zero. The highest is achieved when the distance metric is computed just once per query. The lowest occurs during exact search, when the distance to every present vector has to be computed.
 computed_distances: int#
 count_matches: int#
 count_queries: int#
 index_size: int#
 property mean_efficiency: float#
 property mean_recall: float#
 visited_members: int#
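The two derived metrics can be read as simple ratios over the stored counters. The following is a minimal sketch consistent with the description above. The class name `SearchStatsSketch` is hypothetical, and the exact formulas are an assumption inferred from the stated bounds (zero for exact search, just under one for a single distance computation per query), not a copy of the library's code.

```python
from dataclasses import dataclass


@dataclass
class SearchStatsSketch:
    index_size: int
    count_queries: int
    count_matches: int
    visited_members: int
    computed_distances: int

    @property
    def mean_recall(self) -> float:
        # Share of queries whose expected vector was found.
        return self.count_matches / self.count_queries

    @property
    def mean_efficiency(self) -> float:
        # Assumed formula: one minus the share of the index that had to be
        # compared against, averaged over all queries.
        exhaustive = self.count_queries * self.index_size
        return 1.0 - self.computed_distances / exhaustive


approximate = SearchStatsSketch(
    index_size=1000,
    count_queries=10,
    count_matches=9,
    visited_members=500,
    computed_distances=2000,
)
exhaustive = SearchStatsSketch(
    index_size=1000,
    count_queries=10,
    count_matches=10,
    visited_members=10000,
    computed_distances=10000,
)
```

With these numbers the approximate run computes 200 of 1000 possible distances per query, while the exhaustive run computes all of them and so scores zero efficiency.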
 class usearch.eval.SearchTask(queries: 'np.ndarray', neighbors: 'np.ndarray')#
 neighbors: ndarray#
 queries: ndarray#
 slices(batch_size: int) → List[SearchTask]
Splits this dataset into smaller chunks.
 class usearch.eval.TaskResult(add_operations: 'Optional[int]' = None, add_per_second: 'Optional[float]' = None, search_operations: 'Optional[int]' = None, search_per_second: 'Optional[float]' = None, recall_at_one: 'Optional[float]' = None)#
 add_operations: int | None = None
 add_per_second: float | None = None
 property add_seconds: float
 recall_at_one: float | None = None
 search_operations: int | None = None
 search_per_second: float | None = None
 property search_seconds: float#
 usearch.eval.dcg(relevances: ndarray, k: int | None = None) → ndarray
Calculate DCG (Discounted Cumulative Gain) up to position k.
 Parameters:
relevances (list) – List of true relevance scores (in the order as they are ranked)
k (int) – Position up to which DCG is computed
 Returns:
The DCG score at position k
 Return type:
float
 usearch.eval.measure_seconds(f: Callable) → Tuple[float, Any]
Simple function profiling decorator.
 Parameters:
f (Callable) – Function to be profiled
 Returns:
Time elapsed in seconds and the result of the execution
 Return type:
Tuple[float, Any]
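A profiling decorator of this shape can be sketched with the standard library alone. The name `measure_seconds_sketch` and the choice of `time.perf_counter` as the clock are assumptions; the library's own decorator may differ in detail.

```python
import time
from functools import wraps
from typing import Any, Callable, Tuple


def measure_seconds_sketch(f: Callable) -> Callable[..., Tuple[float, Any]]:
    """Wrap `f` so that calling it returns (elapsed seconds, original result)."""

    @wraps(f)
    def wrapper(*args, **kwargs) -> Tuple[float, Any]:
        start = time.perf_counter()
        result = f(*args, **kwargs)
        return time.perf_counter() - start, result

    return wrapper


@measure_seconds_sketch
def total(n: int) -> int:
    return sum(range(n))


seconds, value = total(1000)
```

Note that the decorated function no longer returns its plain result, so callers have to unpack the tuple.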
 usearch.eval.ndcg(relevances: ndarray, k: int | None = None) → ndarray
Calculate NDCG (Normalized Discounted Cumulative Gain) at position k.
 Parameters:
relevances (list) – List of true relevance scores (in the order as they are ranked)
k (int) – Position up to which NDCG is computed
 Returns:
The NDCG score at position k
 Return type:
float
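For a single ranking, DCG and NDCG can be sketched as follows. This uses the common textbook definition, where the relevance at rank r (starting from 1) is divided by log2(r + 1), and NDCG divides by the DCG of the ideally sorted ranking. The names `dcg_sketch` and `ndcg_sketch` are hypothetical, and the library's functions operate on NumPy arrays, so treat this as an illustration rather than their implementation.

```python
import math
from typing import Optional, Sequence


def dcg_sketch(relevances: Sequence[float], k: Optional[int] = None) -> float:
    """Sum of relevances discounted by log2 of (1-based rank + 1)."""
    top = relevances if k is None else relevances[:k]
    # enumerate() is 0-based, so rank + 2 equals the 1-based rank plus one.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(top))


def ndcg_sketch(relevances: Sequence[float], k: Optional[int] = None) -> float:
    """DCG divided by the DCG of the same scores in the best possible order."""
    ideal = dcg_sketch(sorted(relevances, reverse=True), k)
    return dcg_sketch(relevances, k) / ideal if ideal > 0 else 0.0


ranked = [1, 0, 1]  # Relevant results at ranks 1 and 3.
score = dcg_sketch(ranked)
normalized = ndcg_sketch(ranked)
```

Here the DCG is 1/log2(2) + 0 + 1/log2(4) = 1.5, and the normalized score falls below 1 because the second relevant result sits at rank 3 instead of rank 2.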
 usearch.eval.random_vectors(count: int, metric: usearch.compiled.MetricKind = <MetricKind.IP: 105>, dtype: usearch.compiled.ScalarKind = <ScalarKind.F32: 11>, ndim: int | None = None, index: usearch.index.Index | None = None) → ndarray
Produces a collection of random vectors normalized for the provided metric and matching the wanted dtype, both of which can be inferred from an existing index.
 usearch.eval.relevance(expected: ndarray, predicted: ndarray, k: int | None = None) → ndarray
Calculate binary relevance scores.
 Parameters:
expected (np.ndarray) – ground-truth keys
predicted (np.ndarray) – predicted keys
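One plausible reading of binary relevance is that each predicted key scores 1 if it appears among the ground-truth keys and 0 otherwise. The sketch below follows that reading for a single query on plain lists. The name `relevance_sketch` and the interpretation of `k` as a cutoff on the predictions are assumptions, not confirmed by this reference.

```python
from typing import List, Optional, Sequence


def relevance_sketch(
    expected: Sequence[int],
    predicted: Sequence[int],
    k: Optional[int] = None,
) -> List[int]:
    """Mark each of the top predictions with 1 if it is a ground-truth key."""
    top = predicted if k is None else predicted[:k]
    wanted = set(expected)
    return [1 if key in wanted else 0 for key in top]


scores = relevance_sketch(expected=[10, 20, 30], predicted=[20, 99, 10, 5])
top_two = relevance_sketch(expected=[10, 20, 30], predicted=[20, 99, 10, 5], k=2)
```

Scores in this form are ordered by predicted rank, which is the input shape the DCG and NDCG parameter descriptions ask for.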
 usearch.eval.self_recall(index: Index, sample: float | int = 1.0, **kwargs) → SearchStats
The simplest benchmark of search quality: it queries every existing member of the index to check that approximate search finds the point itself.
 Parameters:
index (Index) – Non-empty pre-constructed index
sample (Union[float, int]) – Share (or number) of vectors to search, defaults to 1.0
 Returns:
Evaluation report with key metrics
 Return type: