Contributing#
Contribution Guide#
Thank you for coming here! It’s always nice to have third-party contributors 🤗
To keep the quality of the code high, we have a set of guidelines common to all Unum projects.
Before you start#
Before building for the first time, please pull the git submodules.
That’s how we bring in SimSIMD and other optional dependencies to test all of the available functionality.
git submodule update --init --recursive
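To confirm the submodules were actually fetched, one option is to inspect `git submodule status`, which prefixes entries that are not yet initialized with `-`. The sketch below assumes that output format; the function name `count_uninitialized_submodules` is hypothetical and not part of the repository.

```shell
# Hypothetical helper: counts uninitialized submodules.
# Reads `git submodule status` output from stdin; entries that are not
# initialized start with "-".
count_uninitialized_submodules() {
    # `grep -c` exits non-zero when nothing matches, so mask that status.
    grep -c '^-' || true
}

# Usage, from the repository root:
#   git submodule status | count_uninitialized_submodules
```

A result of `0` suggests every submodule has been checked out.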
C++ 11 and C 99#
Our primary C++ implementation uses CMake for builds. If this is your first experience with CMake, use the following commands to get started:
sudo apt-get update && sudo apt-get install cmake build-essential libjemalloc-dev g++-12 gcc-12 # Ubuntu
brew install libomp llvm # macOS
Using modern syntax, this is how you build and run the test suite:
cmake -D USEARCH_BUILD_TEST_CPP=1 -D CMAKE_BUILD_TYPE=Debug -B build_debug
cmake --build build_debug --config Debug
build_debug/test_cpp
If the build mode is not specified, the default is Release.
cmake -D USEARCH_BUILD_TEST_CPP=1 -B build_release
cmake --build build_release --config Release
build_release/test_cpp
For development purposes, you may want to include symbols information in the build:
cmake -D USEARCH_BUILD_TEST_CPP=1 -D CMAKE_BUILD_TYPE=RelWithDebInfo -B build_relwithdebinfo
cmake --build build_relwithdebinfo --config RelWithDebInfo
build_relwithdebinfo/test_cpp
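The three variants above differ only in the build type and in a directory name derived from it. A minimal sketch of a wrapper that captures this pattern follows; `build_dir_for` and `build_and_test` are hypothetical names, not scripts shipped with USearch.

```shell
# Hypothetical helper: derives the build directory used in this guide,
# e.g. "RelWithDebInfo" -> "build_relwithdebinfo". Defaults to Release.
build_dir_for() {
    printf 'build_%s\n' "$(printf '%s' "${1:-Release}" | tr '[:upper:]' '[:lower:]')"
}

# Hypothetical helper: configures, builds, and runs the C++ test suite
# for the given build type, stopping at the first failing step.
build_and_test() {
    local build_type="${1:-Release}"
    local build_dir
    build_dir="$(build_dir_for "$build_type")"
    cmake -D USEARCH_BUILD_TEST_CPP=1 -D CMAKE_BUILD_TYPE="$build_type" -B "$build_dir" &&
        cmake --build "$build_dir" --config "$build_type" &&
        "$build_dir/test_cpp"
}

# Usage:
#   build_and_test Debug
```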
The CMakeLists.txt file has a number of options you can pass:
What to build:
USEARCH_BUILD_TEST_CPP - build the C++ test suite
USEARCH_BUILD_BENCH_CPP - build the C++ benchmark suite
USEARCH_BUILD_LIB_C - build the C library
USEARCH_BUILD_TEST_C - build the C test suite
USEARCH_BUILD_SQLITE - build the SQLite extension (no Windows)
Which dependencies to use:
USEARCH_USE_OPENMP - use OpenMP for parallelism
USEARCH_USE_SIMSIMD - use SimSIMD for vectorization
USEARCH_USE_JEMALLOC - use Jemalloc for memory management
USEARCH_USE_FP16LIB - use software emulation for half-precision floating point
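After configuring, it can help to check which of these options ended up in the build. CMake stores cache entries in `CMakeCache.txt` as `NAME:TYPE=VALUE` lines, so a small filter is enough. The function name `list_usearch_options` below is hypothetical.

```shell
# Hypothetical helper: reads CMakeCache.txt content from stdin and prints
# every USEARCH_* entry as NAME=VALUE, dropping the type annotation.
list_usearch_options() {
    sed -n 's/^\(USEARCH_[A-Z0-9_]*\):[A-Z]*=\(.*\)$/\1=\2/p'
}

# Usage, after running `cmake -B build_release ...`:
#   list_usearch_options < build_release/CMakeCache.txt
```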
Putting all of this together, compiling all targets on most platforms should work with the following snippet:
cmake -D CMAKE_BUILD_TYPE=Release -D USEARCH_USE_FP16LIB=1 -D USEARCH_USE_OPENMP=1 -D USEARCH_USE_SIMSIMD=1 -D USEARCH_USE_JEMALLOC=1 -D USEARCH_BUILD_TEST_CPP=1 -D USEARCH_BUILD_BENCH_CPP=1 -D USEARCH_BUILD_LIB_C=1 -D USEARCH_BUILD_TEST_C=1 -D USEARCH_BUILD_SQLITE=0 -B build_release
cmake --build build_release --config Release
build_release/test_cpp
build_release/test_c
Similarly, to use the most recent Clang compiler version from Homebrew on macOS:
brew install llvm cmake
cmake \
-D CMAKE_BUILD_TYPE=Release \
-D CMAKE_C_COMPILER="$(brew --prefix llvm)/bin/clang" \
-D CMAKE_CXX_COMPILER="$(brew --prefix llvm)/bin/clang++" \
-D USEARCH_USE_FP16LIB=1 \
-D USEARCH_USE_OPENMP=1 \
-D USEARCH_USE_SIMSIMD=1 \
-D USEARCH_USE_JEMALLOC=1 \
-D USEARCH_BUILD_TEST_CPP=1 \
-D USEARCH_BUILD_BENCH_CPP=1 \
-D USEARCH_BUILD_LIB_C=1 \
-D USEARCH_BUILD_TEST_C=1 \
-B build_release
cmake --build build_release --config Release
build_release/test_cpp
build_release/test_c
Linting:
cppcheck --enable=all --force --suppress=cstyleCast --suppress=unusedFunction \
include/usearch/index.hpp \
include/usearch/index_dense.hpp \
include/usearch/index_plugins.hpp
I’d recommend setting the following breakpoints when debugging the code in GDB:
__asan::ReportGenericError - to detect illegal memory accesses.
__ubsan::ScopedReport::~ScopedReport - to catch undefined behavior.
__GI_exit - to stop at exit points - the end of running any executable.
__builtin_unreachable - to catch all the places where the code is expected to be unreachable.
usearch_raise_runtime_error - for USearch-specific assertions.
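Rather than typing these breakpoints in every session, they can be written to a GDB command file and loaded with `gdb -x`. The sketch below assumes a debug build at `build_debug/test_cpp`; the function name and the `breakpoints.gdb` file name are hypothetical.

```shell
# Hypothetical helper: prints a GDB command file with the breakpoints above.
gdb_breakpoint_script() {
    local symbol
    # Allow breakpoints on symbols from libraries that are not loaded yet,
    # such as the sanitizer runtimes.
    printf 'set breakpoint pending on\n'
    for symbol in \
        '__asan::ReportGenericError' \
        '__ubsan::ScopedReport::~ScopedReport' \
        '__GI_exit' \
        '__builtin_unreachable' \
        'usearch_raise_runtime_error'; do
        printf 'break %s\n' "$symbol"
    done
}

# Usage:
#   gdb_breakpoint_script > breakpoints.gdb
#   gdb -x breakpoints.gdb --args build_debug/test_cpp
```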
Cross Compilation#
Unlike GCC, LLVM handles cross compilation very easily.
You just need to pass the right TARGET_ARCH and BUILD_ARCH to CMake.
The cross-toolchain packages you can install for the target include:
crossbuild-essential-amd64 for 64-bit x86
crossbuild-essential-arm64 for 64-bit Arm
crossbuild-essential-armhf for 32-bit ARM hard-float
crossbuild-essential-armel for 32-bit ARM soft-float (emulates float)
crossbuild-essential-riscv64 for RISC-V
crossbuild-essential-powerpc for PowerPC
crossbuild-essential-s390x for IBM Z
crossbuild-essential-mips for MIPS
crossbuild-essential-ppc64el for PowerPC 64-bit little-endian
Here is an example for cross-compiling for Arm64 on an x86_64 machine:
sudo apt-get update
sudo apt-get install -y clang lld make crossbuild-essential-arm64 crossbuild-essential-armhf
export CC="clang"
export CXX="clang++"
export AR="llvm-ar"
export NM="llvm-nm"
export RANLIB="llvm-ranlib"
export TARGET_ARCH="aarch64-linux-gnu" # Or "x86_64-linux-gnu"
export BUILD_ARCH="arm64" # Or "amd64"
cmake -D CMAKE_BUILD_TYPE=Release \
-D CMAKE_C_COMPILER_TARGET=${TARGET_ARCH} \
-D CMAKE_CXX_COMPILER_TARGET=${TARGET_ARCH} \
-D CMAKE_SYSTEM_NAME=Linux \
-D CMAKE_SYSTEM_PROCESSOR=${BUILD_ARCH} \
-B build_artifacts
cmake --build build_artifacts --config Release
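The example pairs each BUILD_ARCH with a matching TARGET_ARCH triple. A minimal sketch of that mapping, covering only the two pairs named in the comments above, could look like this; `target_triple_for` is a hypothetical name.

```shell
# Hypothetical helper: maps the BUILD_ARCH values used above to the
# TARGET_ARCH triple passed to CMAKE_C_COMPILER_TARGET.
target_triple_for() {
    case "$1" in
        arm64) echo "aarch64-linux-gnu" ;;
        amd64) echo "x86_64-linux-gnu" ;;
        *)
            echo "unsupported architecture: $1" >&2
            return 1
            ;;
    esac
}

# Usage:
#   export BUILD_ARCH="arm64"
#   export TARGET_ARCH="$(target_triple_for "$BUILD_ARCH")"
```

Other architectures from the package list would need their own triples, which are deliberately left out here.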
For Android development, you can cross-compile for ARM architectures without requiring the full Android NDK setup.
Here’s an example targeting 32-bit ARM (armeabi-v7a):
sudo apt-get update
sudo apt-get install -y clang lld crossbuild-essential-armhf
# Cross-compile for 32-bit ARM (Android compatible)
CMAKE_TRY_COMPILE_TARGET_TYPE=STATIC_LIBRARY \
cmake -B build_artifacts \
-D CMAKE_C_COMPILER=clang \
-D CMAKE_CXX_COMPILER=clang++ \
-D CMAKE_C_COMPILER_TARGET=armv7-linux-gnueabihf \
-D CMAKE_CXX_COMPILER_TARGET=armv7-linux-gnueabihf \
-D CMAKE_SYSTEM_NAME=Linux \
-D CMAKE_SYSTEM_PROCESSOR=armv7 \
-D CMAKE_C_FLAGS="--target=armv7-linux-gnueabihf -march=armv7-a" \
-D CMAKE_CXX_FLAGS="--target=armv7-linux-gnueabihf -march=armv7-a" \
-D CMAKE_BUILD_TYPE=RelWithDebInfo \
-D USEARCH_BUILD_LIB_C=1 \
-D USEARCH_BUILD_TEST_CPP=0 \
-D USEARCH_BUILD_BENCH_CPP=0 \
-D USEARCH_USE_SIMSIMD=0 \
-D USEARCH_USE_FP16LIB=1
cmake --build build_artifacts --config RelWithDebInfo
file build_artifacts/libusearch_c.so # Verify the output
# Should show: ELF 32-bit LSB shared object, ARM, EABI5 version 1 (SYSV), dynamically linked, ...
The resulting libusearch_c.so can be used in Android projects by placing it in src/main/jniLibs/armeabi-v7a/ for 32-bit ARM or arm64-v8a/ for 64-bit ARM.
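The `file` output can also be used to pick the right `jniLibs` subdirectory automatically. The sketch below is an illustration only: `android_abi_for` is a hypothetical name, and the match on `aarch64` for 64-bit builds is an assumption about how `file` describes such binaries.

```shell
# Hypothetical helper: maps the description printed by `file` to the
# Android ABI directory name used under src/main/jniLibs/.
android_abi_for() {
    case "$1" in
        *"ELF 32-bit"*ARM*) echo "armeabi-v7a" ;;
        # Assumption: `file` mentions "aarch64" for 64-bit ARM binaries.
        *"ELF 64-bit"*aarch64*) echo "arm64-v8a" ;;
        *)
            echo "unrecognized binary: $1" >&2
            return 1
            ;;
    esac
}

# Usage:
#   abi="$(android_abi_for "$(file build_artifacts/libusearch_c.so)")"
#   mkdir -p "src/main/jniLibs/$abi"
#   cp build_artifacts/libusearch_c.so "src/main/jniLibs/$abi/"
```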
Python 3#
Python bindings are built using PyBind11 and are available on PyPI.
The compilation settings are controlled by setup.py and are independent from CMake used for C/C++ builds.
To install USearch locally using uv:
uv venv --python 3.11 # or your preferred Python version
source .venv/bin/activate # to activate the virtual environment
uv pip install -e . --force-reinstall # to build locally from source
Or using pip directly:
pip install -e . --force-reinstall
For testing, USearch uses PyTest, which is pre-configured in pyproject.toml.
The following options are enabled:
The -s option will disable capturing the logs.
The -x option will exit after the first failure to simplify debugging.
The -p no:warnings option will disable the warnings plugin, so warnings are not collected or reported.
uv pip install pytest pytest-repeat numpy # for repeated fuzzy tests
python -m pytest # if you trust the default settings
python -m pytest python/scripts/ -s -x -p no:warnings # to overwrite the default settings
Linting:
pip install ruff
ruff check --output-format=github --select=E9,F63,F7,F82 --target-version=py310 python
Before merging your changes, you may want to test them against the entire matrix of Python versions USearch supports.
For that you need cibuildwheel, which is tricky to use on macOS and Windows, as it would target just the local environment.
Still, if you have Docker running on any desktop OS, you can use it to build and test the Python bindings for all Python versions for Linux:
pip install cibuildwheel
cibuildwheel
cibuildwheel --platform linux # works on any OS and builds all Linux backends
cibuildwheel --platform linux --archs x86_64 # 64-bit x86, the most common on desktop and servers
cibuildwheel --platform linux --archs aarch64 # 64-bit Arm for mobile devices, Apple M-series, and AWS Graviton
cibuildwheel --platform macos # works only on macOS
cibuildwheel --platform windows # works only on Windows
You may need root privileges for multi-architecture builds:
sudo $(which cibuildwheel) --platform linux
On Windows and macOS, to avoid frequent path resolution issues, you may want to use:
python -m cibuildwheel --platform windows
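Since the macOS and Windows platforms only work on their own hosts, a script can choose the `--platform` value from `uname -s`. This is a minimal sketch; `cibuildwheel_platform_for` is a hypothetical name, and the Windows patterns assume a Git Bash, MSYS, or Cygwin shell.

```shell
# Hypothetical helper: maps a `uname -s` value to the matching
# cibuildwheel --platform argument.
cibuildwheel_platform_for() {
    case "$1" in
        Linux) echo "linux" ;;
        Darwin) echo "macos" ;;
        MINGW* | MSYS* | CYGWIN*) echo "windows" ;;
        *)
            echo "unknown system: $1" >&2
            return 1
            ;;
    esac
}

# Usage:
#   python -m cibuildwheel --platform "$(cibuildwheel_platform_for "$(uname -s)")"
```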
JavaScript#
USearch provides NAPI bindings for NodeJS available on NPM.
The compilation settings are controlled by binding.gyp and are independent from CMake used for C/C++ builds.
If you don’t have NPM installed, first install the Node Version Manager:
wget -qO- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.7/install.sh | bash
nvm install 20
Testing:
npm install -g typescript # Install TypeScript globally
npm install # Compile `javascript/lib.cpp`
npm run build-js # Generate JS from TS
npm test # Run the test suite
To compile for AWS Lambda, you’d need to recompile the binding. You can test the setup locally, overriding some of the compilation variables in a Docker image:
FROM public.ecr.aws/lambda/nodejs:18-x86_64
RUN npm init -y
RUN yum install tar git python3 cmake gcc-c++ -y && yum groupinstall "Development Tools" -y
# Assuming AWS Linux 2 uses old compilers:
ENV USEARCH_USE_FP16LIB 1
ENV USEARCH_USE_SIMSIMD 1
ENV SIMSIMD_TARGET_HASWELL 1
ENV SIMSIMD_TARGET_SKYLAKE 0
ENV SIMSIMD_TARGET_ICE 0
ENV SIMSIMD_TARGET_SAPPHIRE 0
ENV SIMSIMD_TARGET_NEON 1
ENV SIMSIMD_TARGET_SVE 0
# For specific PR:
# RUN npm install --build-from-source unum-cloud/usearch#pull/302/head
# For specific version:
# RUN npm install --build-from-source usearch@2.8.8
RUN npm install --build-from-source usearch
To compile to WebAssembly, make sure you have emscripten installed and run the following script:
emcmake cmake -B build -DCMAKE_CXX_FLAGS="${CMAKE_CXX_FLAGS} -s TOTAL_MEMORY=64MB" && emmake make -C build
node build/usearch.test.js
If you don’t yet have emcmake installed:
git clone https://github.com/emscripten-core/emsdk.git && ./emsdk/emsdk install latest && ./emsdk/emsdk activate latest && source ./emsdk/emsdk_env.sh
Rust#
USearch provides Rust bindings available on Crates.io.
The compilation settings are controlled by build.rs and are independent from CMake used for C/C++ builds.
cargo test -p usearch -- --nocapture --test-threads=1
cargo clippy --all-targets --all-features
Publishing the crate is a bit more complicated than usual. If you simply pull the repository with submodules and run the following command, it will list fewer files than expected:
cargo package --list --allow-dirty
The reason for that is the heuristic that Cargo uses to determine the files to include in the package.
Regardless of whether exclude or include is specified, the following files are always excluded: Any sub-packages will be skipped (any subdirectory that contains a Cargo.toml file).
Assuming both SimSIMD and StringZilla contain their own Cargo.toml files, we need to temporarily exclude them from the package.
mv simsimd/Cargo.toml simsimd/Cargo.toml.bak
mv stringzilla/Cargo.toml stringzilla/Cargo.toml.bak
cargo package --list --allow-dirty
cargo publish
# Revert back
mv simsimd/Cargo.toml.bak simsimd/Cargo.toml
mv stringzilla/Cargo.toml.bak stringzilla/Cargo.toml
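If the publish step fails midway, the manual sequence above can leave the manifests renamed. One way to guard against that is a wrapper that always restores them, sketched below under the assumption that it runs from the repository root; `with_hidden_manifests` is a hypothetical name.

```shell
# Hypothetical helper: hides the sub-package manifests, runs the given
# command, then restores the manifests even if the command failed.
with_hidden_manifests() {
    local dir status=0
    for dir in simsimd stringzilla; do
        if [ -f "$dir/Cargo.toml" ]; then
            mv "$dir/Cargo.toml" "$dir/Cargo.toml.bak"
        fi
    done
    # Remember the exit status instead of aborting, so that the restore
    # step below always runs.
    "$@" || status=$?
    for dir in simsimd stringzilla; do
        if [ -f "$dir/Cargo.toml.bak" ]; then
            mv "$dir/Cargo.toml.bak" "$dir/Cargo.toml"
        fi
    done
    return "$status"
}

# Usage:
#   with_hidden_manifests cargo package --list --allow-dirty
#   with_hidden_manifests cargo publish
```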
Objective-C and Swift#
USearch provides both Objective-C and Swift bindings through the Swift Package Manager.
The compilation settings are controlled by Package.swift and are independent from CMake used for C/C++ builds.
swift build && swift test -v
Those depend on Apple’s Foundation library and can only run on Apple devices.
Swift formatting is enforced with swift-format, the default utility from Apple.
To install and run it on all the files in the project, use the following command:
brew install swift-format
swift-format . -i -r
The style is controlled by the .swift-format JSON file in the root of the repository.
As there is no standard for Swift formatting, even Apple’s own swift-format tool and Xcode differ in their formatting rules and available settings.
Running Swift on Linux requires a couple of extra steps, described on the swift.org/install page (https://www.swift.org/install). Alternatively, on Linux, the official Swift Docker image can be used for builds and tests:
sudo docker run --rm -v "$PWD:/workspace" -w /workspace swift:6.0 /bin/bash -cl "swift build -c release --static-swift-stdlib && swift test -c release --enable-test-discovery"
To format the code on Linux:
sudo docker run --rm -v "$PWD:/workspace" -w /workspace swift:6.0 /bin/bash -c "swift format . -i -r"
Go#
USearch provides Go bindings, which depend on the C library that must be installed beforehand. So one should first compile the C library, link it with Go, and only then run tests.
cmake -B build_release -D USEARCH_BUILD_LIB_C=1 -D USEARCH_BUILD_TEST_C=1 -D USEARCH_USE_OPENMP=1 -D USEARCH_USE_SIMSIMD=1
cmake --build build_release --config Release -j
cp build_release/libusearch_c.so golang/ # or .dylib to install the library on macOS
cp c/usearch.h golang/ # to make the header available to Go
cd golang && LD_LIBRARY_PATH=. go test -v ; cd ..
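Because the Go tests fail with less obvious linker errors when the copied files are missing, a small pre-flight check can save time. The sketch below only looks for the header and the shared library copied in the steps above; `missing_go_prerequisites` is a hypothetical name.

```shell
# Hypothetical pre-flight check for the Go tests.
# Prints what is missing from the given directory (default: golang);
# prints nothing when the header and a shared library are present.
missing_go_prerequisites() {
    local dir="${1:-golang}"
    if [ ! -f "$dir/usearch.h" ]; then
        echo "usearch.h"
    fi
    if [ ! -f "$dir/libusearch_c.so" ] && [ ! -f "$dir/libusearch_c.dylib" ]; then
        echo "libusearch_c.so or libusearch_c.dylib"
    fi
}

# Usage, from the repository root:
#   missing_go_prerequisites golang
```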
For static checks:
cd golang
go vet ./...
staticcheck ./... # if installed
golangci-lint run # if installed
Java#
USearch provides Java bindings as a fat-JAR published with prebuilt JNI libraries via GitHub releases. Installation via Maven Central is deprecated; prefer downloading the fat-JAR from the latest GitHub release. The compilation settings are controlled by build.gradle and are independent from CMake used for C/C++ builds.
To set up the Gradle environment:
sudo apt-get install zip
curl -s "https://get.sdkman.io" | bash
sdk install java
sdk install gradle
Afterwards, in a new terminal:
gradle clean build --warning-mode=all # ensure passing builds
gradle test --rerun-tasks # pass unit tests
gradle spotlessApply # apply formatting
Alternatively, to run the Index.main:
java -cp "$(pwd)/build/classes/java/main" -Djava.library.path="$(pwd)/build/libs/usearch/shared" java/cloud/unum/usearch/Index.java
Or step by step:
javac -cp java -h java/cloud/unum/usearch/ java/cloud/unum/usearch/Index.java
# Ensure JAVA_HOME system environment variable has been set
# e.g. export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
# Ubuntu:
g++ -c -fPIC -I${JAVA_HOME}/include -I${JAVA_HOME}/include/linux -Iinclude java/cloud/unum/usearch/cloud_unum_usearch_Index.cpp -o java/cloud/unum/usearch/cloud_unum_usearch_Index.o
g++ -shared -fPIC -o java/cloud/unum/usearch/libusearch.so java/cloud/unum/usearch/cloud_unum_usearch_Index.o -lc
# Windows
g++ -c -I%JAVA_HOME%\include -I%JAVA_HOME%\include\win32 java\cloud\unum\usearch\cloud_unum_usearch_Index.cpp -Iinclude -o java\cloud\unum\usearch\cloud_unum_usearch_Index.o
g++ -shared -o java\cloud\unum\usearch\USearchJNI.dll java\cloud\unum\usearch\cloud_unum_usearch_Index.o -Wl,--add-stdcall-alias
# macOS
g++ -std=c++11 -c -fPIC \
-Iinclude \
-Ifp16/include \
-Isimsimd/include \
-I${JAVA_HOME}/include -I${JAVA_HOME}/include/darwin java/cloud/unum/usearch/cloud_unum_usearch_Index.cpp -o java/cloud/unum/usearch/cloud_unum_usearch_Index.o
g++ -dynamiclib -o java/cloud/unum/usearch/libusearch.dylib java/cloud/unum/usearch/cloud_unum_usearch_Index.o -lc
# Run from project root
java -cp java -Djava.library.path="java/cloud/unum/usearch" cloud.unum.usearch.Index
Or using CMake:
cmake -B build_artifacts -D USEARCH_BUILD_JNI=1
cmake --build build_artifacts --config Release -j
C#
Set up the .NET environment:
dotnet nuget add source https://api.nuget.org/v3/index.json -n nuget.org
USearch provides CSharp bindings, which depend on the C library that must be installed beforehand. So one should first compile the C library, link it with CSharp, and only then run tests.
cmake -B build_artifacts -D USEARCH_BUILD_LIB_C=1 -D USEARCH_BUILD_TEST_C=1 -D USEARCH_USE_OPENMP=1 -D USEARCH_USE_SIMSIMD=1
cmake --build build_artifacts --config Release -j
Then, on Windows, copy the library to the CSharp project and run the tests:
mkdir -p ".\csharp\lib\runtimes\win-x64\native"
cp ".\build_artifacts\libusearch_c.dll" ".\csharp\lib\runtimes\win-x64\native"
cd csharp
dotnet test -c Debug --logger "console;verbosity=detailed"
dotnet test -c Release
On Linux, the process is similar:
mkdir -p "csharp/lib/runtimes/linux-x64/native" # for x86
cp "build_artifacts/libusearch_c.so" "csharp/lib/runtimes/linux-x64/native" # for x86
mkdir -p "csharp/lib/runtimes/linux-arm64/native" # for ARM
cp "build_artifacts/libusearch_c.so" "csharp/lib/runtimes/linux-arm64/native" # for ARM
cd csharp
dotnet test -c Debug --logger "console;verbosity=detailed"
dotnet test -c Release
On macOS with Arm-based chips:
mkdir -p "csharp/lib/runtimes/osx-arm64/native"
cp "build_artifacts/libusearch_c.dylib" "csharp/lib/runtimes/osx-arm64/native"
cd csharp
dotnet test -c Debug --logger "console;verbosity=detailed"
dotnet test -c Release
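On Linux and macOS, the platform-specific steps differ only in the runtime directory name. A minimal sketch of selecting it from `uname` output follows; `dotnet_rid_for` is a hypothetical name, and only the Linux and macOS directories shown above are covered.

```shell
# Hypothetical helper: maps `uname -s` and `uname -m` values to the
# runtime directory names used under csharp/lib/runtimes/.
dotnet_rid_for() {
    local os="$1" arch="$2"
    case "$os/$arch" in
        Linux/x86_64) echo "linux-x64" ;;
        Linux/aarch64) echo "linux-arm64" ;;
        Darwin/arm64) echo "osx-arm64" ;;
        *)
            echo "unmapped platform: $os/$arch" >&2
            return 1
            ;;
    esac
}

# Usage, from the repository root:
#   rid="$(dotnet_rid_for "$(uname -s)" "$(uname -m)")"
#   mkdir -p "csharp/lib/runtimes/$rid/native"
```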
Wolfram#
brew install --cask wolfram-engine
Docker#
docker build -t unum/usearch . && docker run unum/usearch
For multi-architecture builds and publications:
version=$(cat VERSION)
docker buildx create --use &&
docker login &&
docker buildx build \
--platform "linux/amd64,linux/arm64" \
--build-arg version=$version \
--file Dockerfile \
--tag unum/usearch:$version \
--tag unum/usearch:latest \
--push .
WebAssembly#
export WASI_VERSION=21
export WASI_VERSION_FULL=${WASI_VERSION}.0
wget https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-sdk-${WASI_VERSION}/wasi-sdk-${WASI_VERSION_FULL}-linux.tar.gz
tar xvf wasi-sdk-${WASI_VERSION_FULL}-linux.tar.gz
After the installation, we can pass WASI SDK to CMake as a new toolchain, with WASI_SDK_PATH pointing to the extracted SDK directory:
cmake -DCMAKE_TOOLCHAIN_FILE=${WASI_SDK_PATH}/share/cmake/wasi-sdk.cmake .
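The download URL and the SDK location both follow from the version number. The sketch below rebuilds the URL used above and, as an assumption about the archive layout, treats `wasi-sdk-<version>.0` as the extracted directory name; both function names are hypothetical.

```shell
# Hypothetical helper: prints the release URL for a given major version,
# following the pattern of the wget command above.
wasi_sdk_url() {
    local version="$1"
    local version_full="${version}.0"
    printf 'https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-sdk-%s/wasi-sdk-%s-linux.tar.gz\n' \
        "$version" "$version_full"
}

# Hypothetical helper: prints the directory the archive is assumed to
# extract to, relative to the given parent directory.
wasi_sdk_dir() {
    printf '%s/wasi-sdk-%s.0\n' "$2" "$1"
}

# Usage:
#   wget "$(wasi_sdk_url 21)"
#   export WASI_SDK_PATH="$(wasi_sdk_dir 21 "$(pwd)")"
```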
Working on Sub-Modules#
Extending metrics in SimSIMD:
git push --set-upstream https://github.com/ashvardanian/simsimd.git HEAD:main