Benchmark summary

Performance that changes the operating envelope.

XLerate™ DNA is built to handle PhotoDNA³ lookups the way they actually arrive in the wild: one query, right now. Our benchmarks use this shape - called batch-1 - to assess local SDK throughput, deployed AWS full-service throughput, access and growth stress, and finally cloud economics.

Local CPU baseline
46.2x

Geomean throughput advantage over FAISS-CPU across all public insert/query rows.

Local GPU baseline
3.4x

Geomean throughput advantage over FAISS-GPU while preserving the public recall contract.

AWS throughput
8.1x

Mean same-concurrency throughput multiple in the deployed AWS service matrix.

Equivalent FAISS spend
6.8x

Mean FAISS-CPU infrastructure spend multiple across the same AWS service-matrix rows.

Local SDK performance

The CPU engine is winning in GPU territory.

XLerate DNA - running on a commodity CPU - takes the lead in performance territory normally reserved for expensive GPUs.

Base

The starting database before growth inserts. Search effort: 0.00275 (0.275% of indexed space). Indexed vectors: 1M.

[Chart series: XLerate SDK, FAISS-CPU, FAISS-GPU]

Medium

The database after adding medium-distance inserts. Search effort: 0.00247 (0.247% of indexed space). Indexed vectors: 2.27M.

[Chart series: XLerate SDK, FAISS-CPU, FAISS-GPU]

Large

The largest tested database after easy, medium, and hard inserts. Search effort: 0.000031 (0.003% of indexed space). Indexed vectors: 4M.

[Chart series: XLerate SDK, FAISS-CPU, FAISS-GPU]
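
The search-effort figures above can be turned into a rough per-query workload. The sketch below assumes that "search effort" is the fraction of indexed vectors compared against each query, which is one reading of "% of indexed space" and not something the page defines precisely. The function and variable names are illustrative only.

```python
# Figures copied from the Base, Medium, and Large descriptions.
DATABASES = {
    "Base": {"indexed_vectors": 1_000_000, "search_effort": 0.00275},
    "Medium": {"indexed_vectors": 2_270_000, "search_effort": 0.00247},
    "Large": {"indexed_vectors": 4_000_000, "search_effort": 0.000031},
}


def vectors_examined(indexed_vectors: int, search_effort: float) -> int:
    """Assumed reading: effort is the fraction of indexed vectors compared per query."""
    return round(indexed_vectors * search_effort)


# Approximate number of vectors touched by a single query, per database size.
EXAMINED = {name: vectors_examined(**cfg) for name, cfg in DATABASES.items()}

if __name__ == "__main__":
    for name, count in EXAMINED.items():
        print(f"{name}: ~{count:,} vectors examined per query")
```

Under that reading, the Large database examines the fewest vectors per query despite being the biggest index, which is consistent with the Large rows posting the highest QPS in the full performance matrix.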

Full performance matrix

Values are queries per second (QPS).

| Database size | Query band | XLerate SDK | FAISS-CPU | FAISS-GPU | CPU multiple | GPU multiple |
| --- | --- | --- | --- | --- | --- | --- |
| Base | Easy | 1.38M | 51.1k | 615.8k | 27x | 2.2x |
| Base | Medium | 1.46M | 51.7k | 670.7k | 28.2x | 2.2x |
| Base | Hard | 1.44M | 52.2k | 665.6k | 27.7x | 2.2x |
| Medium | Easy | 1.08M | 30.4k | 401.5k | 35.6x | 2.7x |
| Medium | Medium | 1.17M | 31.9k | 453.1k | 36.6x | 2.6x |
| Medium | Hard | 1.16M | 32.1k | 448.4k | 36x | 2.6x |
| Large | Easy | 6.22M | 64.8k | 948.5k | 95.9x | 6.6x |
| Large | Medium | 6.57M | 65.3k | 993.4k | 100.6x | 6.6x |
| Large | Hard | 6.51M | 65.3k | 991.1k | 99.7x | 6.6x |
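
For readers who want to check the headline figures, here is a minimal sketch that recomputes the geomean multiples from the nine matrix rows. It works from the rounded QPS values shown on this page, so the result can differ slightly from a calculation on the raw measurements. The variable names are illustrative.

```python
import math

# (database size, query band, XLerate SDK QPS, FAISS-CPU QPS, FAISS-GPU QPS),
# copied from the full performance matrix.
ROWS = [
    ("Base", "Easy", 1.38e6, 51.1e3, 615.8e3),
    ("Base", "Medium", 1.46e6, 51.7e3, 670.7e3),
    ("Base", "Hard", 1.44e6, 52.2e3, 665.6e3),
    ("Medium", "Easy", 1.08e6, 30.4e3, 401.5e3),
    ("Medium", "Medium", 1.17e6, 31.9e3, 453.1e3),
    ("Medium", "Hard", 1.16e6, 32.1e3, 448.4e3),
    ("Large", "Easy", 6.22e6, 64.8e3, 948.5e3),
    ("Large", "Medium", 6.57e6, 65.3e3, 993.4e3),
    ("Large", "Hard", 6.51e6, 65.3e3, 991.1e3),
]


def geomean(values):
    """Geometric mean: exponential of the average of the logs."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Per-row throughput multiples, then their geometric mean.
CPU_GEOMEAN = geomean(x / cpu for _, _, x, cpu, _ in ROWS)
GPU_GEOMEAN = geomean(x / gpu for _, _, x, _, gpu in ROWS)

if __name__ == "__main__":
    print(f"Geomean over FAISS-CPU: {CPU_GEOMEAN:.1f}x")
    print(f"Geomean over FAISS-GPU: {GPU_GEOMEAN:.1f}x")
```

A geometric mean is the usual choice for averaging ratios, because it keeps the three Large rows, whose multiples are close to 100x, from dominating the summary the way an arithmetic mean would.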
Companion page

Benchmark shape

Three databases, one standard test

The benchmark is built around three database sizes: Base, Medium, and Large. Each size is queried with Easy, Medium, and Hard PhotoDNA match bands so that the effects of growth and match difficulty can be read separately.

  • Base: 1M vectors inserted
  • Medium: 2.27M vectors inserted
  • Large: 4M vectors inserted

The comparison runs XLerate DNA against FAISS on the same CPUs, and against FAISS on a top-tier GPU, to show whether general-purpose vector search can beat a PhotoDNA-specific engine on speed or cost.

The databases and query vectors come from PDNA-1M: one million clean base images plus ten million qualifying query vectors.

Read about the dataset

AWS deployment performance

The lead translates to the cloud.

The AWS benchmark uses a full-fledged product deployment - not a benchmark harness. Orchestration, service path, request handling, storage interaction, concurrent client drivers - all run while collecting latency, recall, and cost.

AWS single-query throughput under load

Mean lines summarize the deployed AWS benchmark matrix across the three database sizes and query bands. The transparent thicker lines show the upper and lower performance bounds behind the mean at each concurrency point.

[Chart series: XLerate service, FAISS-CPU service]

Latency stays controlled as concurrency rises

Throughput with high latency is just a queue; this curve shows the p95 latency of the service under various levels of concurrency pressure.

[Chart series: XLerate service, FAISS-CPU service]

Cost model

Unbeatable total cost of ownership.

XLerate DNA's cloud performance redefines the economics. Chasing its throughput and latency profile with FAISS-CPU forces materially higher AWS infrastructure spend.

Cost per million queries under load

Measured AWS economics across the public service matrix. Lower is better; the distance between the lines is the bill moving with the throughput curve.

[Chart series: XLerate service, FAISS-CPU service]

Queries per dollar

Measured from the same AWS serving-infrastructure cost rows as the cost chart. Higher is better: it shows how many single-query lookups each infrastructure dollar buys.

[Chart series: XLerate service, FAISS-CPU service]
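
The two cost charts are two views of the same arithmetic. The sketch below shows one plausible way to derive them from an hourly infrastructure cost and a sustained throughput. The function names are illustrative, the input values are placeholders rather than measured figures, and the spend-multiple definition (ratio of cost per query) is an assumption about how the page normalizes its numbers.

```python
SECONDS_PER_HOUR = 3600


def cost_per_million_queries(hourly_cost_usd: float, qps: float) -> float:
    """Dollars spent on serving infrastructure per one million queries."""
    queries_per_hour = qps * SECONDS_PER_HOUR
    return hourly_cost_usd / queries_per_hour * 1_000_000


def queries_per_dollar(hourly_cost_usd: float, qps: float) -> float:
    """How many single-query lookups one infrastructure dollar buys."""
    return qps * SECONDS_PER_HOUR / hourly_cost_usd


def spend_multiple(cost_a: float, qps_a: float, cost_b: float, qps_b: float) -> float:
    """Assumed definition: how many times more service B pays per query than A."""
    return cost_per_million_queries(cost_b, qps_b) / cost_per_million_queries(cost_a, qps_a)


if __name__ == "__main__":
    # Placeholder inputs only: not AWS prices and not benchmark results.
    hourly_cost, qps = 3.60, 1_000
    print(f"${cost_per_million_queries(hourly_cost, qps):.2f} per million queries")
    print(f"{queries_per_dollar(hourly_cost, qps):,.0f} queries per dollar")
```

Under this definition, two services with equal hourly cost have a spend multiple equal to their throughput multiple, and a service that needs extra supporting instances sees its advantage shrink in proportion to the added cost.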

FAISS infrastructure spend by AWS concurrency

At each AWS client level, the bar shows the mean cost multiple - spent on AWS infrastructure alone - required for FAISS-CPU to try to keep up with XLerate DNA.

[Chart series: FAISS-CPU infrastructure spend multiple]

Spend-multiple cost matrix

Values are same-concurrency throughput, p99 latency, and spend multiples.

| Database size | Query band | AWS clients | Throughput multiple | p99 ratio | Spend multiple |
| --- | --- | --- | --- | --- | --- |
| Base | Easy | 256 | 7.2x | 7.4x | 6.1x |
| Base | Easy | 512 | 8x | 5.8x | 6.7x |
| Base | Easy | 1024 | 8.3x | 5.5x | 7x |
| Base | Easy | 2048 | 8.2x | 5x | 6.9x |
| Base | Medium | 256 | 9.1x | 6x | 7.7x |
| Base | Medium | 512 | 9.8x | 5.7x | 8.3x |
| Base | Medium | 1024 | 9.7x | 5.5x | 8.2x |
| Base | Medium | 2048 | 9.3x | 4.9x | 7.9x |
| Base | Hard | 256 | 8.2x | 5.6x | 6.9x |
| Base | Hard | 512 | 8.1x | 5.1x | 6.8x |
| Base | Hard | 1024 | 9.5x | 5.8x | 8.1x |
| Base | Hard | 2048 | 8.8x | 4.7x | 7.5x |
| Medium | Easy | 256 | 8.6x | 7.8x | 7.3x |
| Medium | Easy | 512 | 9.1x | 7x | 7.7x |
| Medium | Easy | 1024 | 8.7x | 6x | 7.3x |
| Medium | Easy | 2048 | 9.7x | 5x | 8.1x |
| Medium | Medium | 256 | 8.5x | 6.1x | 7.2x |
| Medium | Medium | 512 | 8.7x | 6.3x | 7.4x |
| Medium | Medium | 1024 | 8.8x | 6.3x | 7.4x |
| Medium | Medium | 2048 | 9.5x | 5.8x | 8x |
| Medium | Hard | 256 | 9x | 6.6x | 7.6x |
| Medium | Hard | 512 | 9.3x | 6.7x | 7.8x |
| Medium | Hard | 1024 | 9x | 6x | 7.6x |
| Medium | Hard | 2048 | 10.2x | 5.8x | 8.6x |
| Large | Easy | 256 | 4.9x | 5.1x | 4.1x |
| Large | Easy | 512 | 5.1x | 6.5x | 4.3x |
| Large | Easy | 1024 | 6.6x | 5.9x | 5.6x |
| Large | Easy | 2048 | 6.5x | 4.9x | 5.5x |
| Large | Medium | 256 | 7.2x | 5.6x | 6.1x |
| Large | Medium | 512 | 6.7x | 5.1x | 5.7x |
| Large | Medium | 1024 | 6.9x | 5.6x | 5.9x |
| Large | Medium | 2048 | 6.4x | 4.8x | 5.4x |
| Large | Hard | 256 | 5.3x | 4.4x | 4.5x |
| Large | Hard | 512 | 7x | 5.2x | 5.9x |
| Large | Hard | 1024 | 7.3x | 6.6x | 6.1x |
| Large | Hard | 2048 | 7.2x | 5.7x | 6x |
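
The headline AWS figures are plain arithmetic means over the 36 rows of this matrix. The following sketch recomputes them from the rounded values shown here. The data layout and names are illustrative, and small differences from the raw measurements are possible because of rounding.

```python
from statistics import fmean

# (throughput multiple, p99 ratio, spend multiple) per row, in table order:
# Easy, Medium, Hard bands, each at 256, 512, 1024, and 2048 clients.
MATRIX = {
    "Base": [
        (7.2, 7.4, 6.1), (8.0, 5.8, 6.7), (8.3, 5.5, 7.0), (8.2, 5.0, 6.9),
        (9.1, 6.0, 7.7), (9.8, 5.7, 8.3), (9.7, 5.5, 8.2), (9.3, 4.9, 7.9),
        (8.2, 5.6, 6.9), (8.1, 5.1, 6.8), (9.5, 5.8, 8.1), (8.8, 4.7, 7.5),
    ],
    "Medium": [
        (8.6, 7.8, 7.3), (9.1, 7.0, 7.7), (8.7, 6.0, 7.3), (9.7, 5.0, 8.1),
        (8.5, 6.1, 7.2), (8.7, 6.3, 7.4), (8.8, 6.3, 7.4), (9.5, 5.8, 8.0),
        (9.0, 6.6, 7.6), (9.3, 6.7, 7.8), (9.0, 6.0, 7.6), (10.2, 5.8, 8.6),
    ],
    "Large": [
        (4.9, 5.1, 4.1), (5.1, 6.5, 4.3), (6.6, 5.9, 5.6), (6.5, 4.9, 5.5),
        (7.2, 5.6, 6.1), (6.7, 5.1, 5.7), (6.9, 5.6, 5.9), (6.4, 4.8, 5.4),
        (5.3, 4.4, 4.5), (7.0, 5.2, 5.9), (7.3, 6.6, 6.1), (7.2, 5.7, 6.0),
    ],
}

ALL_ROWS = [row for rows in MATRIX.values() for row in rows]

# Overall means across all 36 rows.
MEAN_THROUGHPUT = fmean(row[0] for row in ALL_ROWS)
MEAN_SPEND = fmean(row[2] for row in ALL_ROWS)

# Per-database means as (throughput multiple, spend multiple).
BY_SIZE = {
    size: (fmean(r[0] for r in rows), fmean(r[2] for r in rows))
    for size, rows in MATRIX.items()
}

if __name__ == "__main__":
    print(f"Mean throughput multiple: {MEAN_THROUGHPUT:.1f}x")
    print(f"Mean spend multiple: {MEAN_SPEND:.1f}x")
    for size, (throughput, spend) in BY_SIZE.items():
        print(f"{size}: throughput {throughput:.1f}x, spend {spend:.1f}x")
```

The per-size breakdown is worth reading alongside the overall mean: in this table the Large database carries the smallest multiples in both columns, so the average sits between a stronger Base and Medium result and a weaker Large one.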

Bucket health

Near-perfect health under database growth.

XLerate™ DNA's novel, patent-pending² clustering system never requires learning and maintains unprecedented cluster health across all database sizes. Other methods lose both accuracy and speed as bucket health degrades, and they require expensive reclustering as the database grows.

Bucket health under growth

Base, Medium, and Large database sizes plotted by effective bucket usage and p99 bucket load pressure.

Down + right is better: more of the bucket space is doing useful work, and buckets are less overloaded.

Vector libraries are like a real library: books are sorted by author or title to make finding them easier. Buckets are the sections of the library, and health is how evenly the books are spread across the available sections. Under-utilization of one section means over-utilization of another, and an over-utilized section takes longer to search.
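
To make the two chart axes concrete, here is a small sketch of how such metrics could be computed from a list of bucket sizes. The page does not define "effective bucket usage" or "p99 bucket load pressure", so both definitions below are assumptions chosen for illustration: usage is the share of non-empty buckets, and load pressure is the 99th-percentile bucket size divided by the mean bucket size.

```python
def effective_bucket_usage(bucket_sizes):
    """Assumed definition: fraction of buckets that hold at least one vector."""
    return sum(1 for size in bucket_sizes if size > 0) / len(bucket_sizes)


def p99_load_pressure(bucket_sizes):
    """Assumed definition: 99th-percentile bucket size relative to the mean size."""
    ordered = sorted(bucket_sizes)
    n = len(ordered)
    rank = (99 * n + 99) // 100  # nearest-rank percentile, integer ceil(0.99 * n)
    mean = sum(ordered) / n
    return ordered[rank - 1] / mean


if __name__ == "__main__":
    # Two hypothetical indexes holding the same 1,000 vectors in 100 buckets.
    even = [10] * 100
    skewed = [0] * 50 + [5] * 48 + [380] * 2
    for name, sizes in (("even", even), ("skewed", skewed)):
        print(name, effective_bucket_usage(sizes), p99_load_pressure(sizes))
```

In the library analogy, the skewed example is a building where half the sections stand empty while two sections hold most of the books. A query routed to one of those two sections has far more to scan than the average would suggest.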

Stress test

XLerate DNA performs under pathological load.

Real systems do not receive perfectly polite traffic. The stress test, run against the local SDK, shows how throughput behaves when demand concentrates instead of spreading evenly across the index. We hammer XLerate DNA with Zipfian access patterns of increasing skew, so that more and more queries land in the same buckets and contend for them.
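
The effect of the skew parameter can be seen without running any benchmark. The sketch below computes, for a bounded Zipf distribution, the share of accesses that land on each bucket rank. The bucket count is a hypothetical value picked for illustration, and nothing here reflects XLerate DNA's internal layout. Only the skew values are taken from the charts.

```python
def zipf_shares(num_buckets: int, skew: float) -> list[float]:
    """Probability of hitting each bucket rank under a bounded Zipf law.

    Rank r receives weight 1 / r**skew; weights are normalized to sum to 1.
    """
    weights = [1.0 / rank ** skew for rank in range(1, num_buckets + 1)]
    total = sum(weights)
    return [weight / total for weight in weights]


SKEWS = [1.1, 1.5, 2.0, 2.5, 3.0]  # skew levels used in the stress charts
NUM_BUCKETS = 1024  # hypothetical bucket count, for illustration only

# Share of all accesses that fall on the single hottest bucket, per skew.
TOP_SHARE = {skew: zipf_shares(NUM_BUCKETS, skew)[0] for skew in SKEWS}

if __name__ == "__main__":
    for skew, share in TOP_SHARE.items():
        print(f"skew {skew}: hottest bucket receives {share:.1%} of accesses")
```

As the skew rises, the hottest bucket absorbs a growing majority of the traffic, which is why higher skews act as a contention test: many threads end up reading the same bucket at once.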

Base

32-thread contention run across increasingly concentrated access patterns.

[Chart series: Skew 1.1, Skew 1.5, Skew 2, Skew 2.5, Skew 3]

Medium

32-thread contention run across increasingly concentrated access patterns.

[Chart series: Skew 1.1, Skew 1.5, Skew 2, Skew 2.5, Skew 3]

Large

32-thread contention run across increasingly concentrated access patterns.

[Chart series: Skew 1.1, Skew 1.5, Skew 2, Skew 2.5, Skew 3]

Fair reading

Clear, spin-free summary.

For high-stakes PhotoDNA workflows, XLerate DNA delivers unparalleled performance.

Takeaway

If you need to perform accurate PhotoDNA matching at scale, the benchmark story is unambiguous: XLerate DNA delivers higher throughput, controlled latency, materially better infrastructure economics and better performance on commodity hardware than alternatives can deliver on top-dollar specialty hardware.

For cloud solution providers, your trust and safety workflow can realize a roughly 7-fold infrastructure cost reduction (6.8x mean across the AWS service matrix) while increasing its effectiveness. For digital forensics, you can escape the lab backlog by doing live triage on commodity devices at speeds that rival lab hardware.

This page does claim

  • In the local SDK matrix, XLerate DNA preserves the recall contract while delivering 46.2x geomean throughput over FAISS-CPU and 3.4x over FAISS-GPU.
  • In the AWS service matrix, XLerate DNA delivers 8.1x mean same-concurrency throughput over FAISS-CPU.
  • Across those same AWS service rows, equivalent FAISS-CPU serving-infrastructure spend averages 6.8x XLerate's serving-infrastructure spend.
  • All benchmark configurations required both XLerate DNA and FAISS to yield 100% accuracy. This is not typical in vector search - but it is required for high-stakes PhotoDNA workflows.
  • Importantly, XLerate DNA maintains its performance on out-of-distribution data - namely adult content. This proves it's not tailored to the PDNA-1M dataset, and thus reliable for the kind of content normally handled in PhotoDNA workflows.

This page does not claim

  • That XLerate DNA replaces FAISS for every vector workload. FAISS is used as a reference because it's the gold-standard, high-performance vector library. XLerate DNA provides the same outcome on PhotoDNA vectors asymmetrically with patent-pending technology.
  • That the local GPU benchmark has the same operating shape as the others. In fact, FAISS-GPU was given a batch-512 workflow - an advantage - in order to honestly give it the best chance.
  • That the AWS economics are a customer price quote. These figures are normalized measured throughput against the stated on-demand serving-infrastructure costs at the time of the benchmark.

Benchmark setup

The details matter.

These are the practical details behind the benchmark: the local harness, the AWS service shape, and the details that matter when reading the numbers.

The important point is that the comparison is not hiding a special disadvantage for FAISS. In the AWS benchmarks, FAISS-CPU was run inside the exact same service code that XLerate DNA runs in - we simply swapped XLerate DNA's runtime out for FAISS. In fact, both FAISS-GPU and FAISS-CPU were given a few unfair advantages.

FAISS-GPU, like all programs that must run on a GPU, suffers the speed bottleneck of transferring data to and from the GPU. If we made FAISS-GPU answer 1 query at a time like XLerate DNA or FAISS-CPU, that bottleneck would make FAISS-GPU appear to be as slow or slower than FAISS-CPU.

FAISS-GPU was allowed to use a batch size of 512, meaning it could process 512 queries at once. If this were a race to taxi people from point A to point B, XLerate DNA and FAISS-CPU would be single-passenger cars competing against an impossibly large bus. However, this advantage had to be given to afford it a fair chance. In a real deployment, you would have to hold back answers until hundreds more queries arrived - so even with this advantage, it's not realistic for an on-demand system.
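
The cost of holding queries back can be estimated with simple arithmetic. The sketch below assumes evenly spaced arrivals and counts only the time spent waiting for a batch to fill, not the GPU transfer or compute time. The arrival rates are hypothetical and the function name is illustrative.

```python
def batch_fill_wait_ms(batch_size: int, arrival_qps: float) -> tuple[float, float]:
    """Return (worst-case, mean) wait in milliseconds before a full batch dispatches.

    Assumes evenly spaced arrivals: the first query in a batch waits for the
    remaining batch_size - 1 arrivals, and the last query does not wait at all.
    """
    gap_s = 1.0 / arrival_qps
    worst_s = (batch_size - 1) * gap_s
    mean_s = worst_s / 2  # waits fall linearly from worst-case down to zero
    return worst_s * 1000, mean_s * 1000


if __name__ == "__main__":
    # Hypothetical arrival rates, chosen only to show the trend.
    for qps in (100, 1_000, 10_000):
        worst, mean = batch_fill_wait_ms(512, qps)
        print(f"{qps:>6} QPS: worst {worst:,.1f} ms, mean {mean:,.1f} ms")
    # A batch of one never waits for other queries to arrive.
    print(batch_fill_wait_ms(1, 1_000))
```

The wait shrinks only as the arrival rate grows, so a batch-512 design adds the most latency exactly when traffic is light, which is the opposite of what an on-demand lookup service needs.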

FAISS-CPU was likewise given an advantage to make it practical. It did not have to cluster inserted vectors, so its insertion speeds are far higher than they would be under normal usage. XLerate DNA never has to learn clustering, unlike other vector products. The advantage afforded to FAISS-CPU hides an extraordinary benefit of XLerate DNA.

Local harness

Hardware and FAISS build

The local matrix ran on a high-end desktop CPU, with FAISS-GPU also tested against an RTX 4090 baseline.

  • AMD Ryzen 9 7950X, 16 cores / 32 threads.
  • 63.6 GiB system memory, DDR5-5600.
  • NVIDIA GeForce RTX 4090, 24,564 MiB VRAM.
  • FAISS 1.14.1 with GPU and NVIDIA cuVS support.
  • FAISS runtime: IVF Flat, nlist 32,768, nprobe 32 or 2 by row.

AWS footprint

Service benchmark shape

The AWS rows compare deployed amd64 services and count serving infrastructure, not the separate load-generation fleet.

  • XLerate service: c7i.8xlarge, 32 vCPUs.
  • XLerate Garnet: r7i.xlarge, 4 vCPUs.
  • FAISS-CPU service: c7i.8xlarge, 32 vCPUs.
  • Load generation used 7 c7i.xlarge runner nodes across 256, 512, 1024, and 2048 client levels.

Salient methodology notes

  • It's somewhat confusing, but all three database sizes represent the same 1 million unique images. When you edit an image and hash it again, the hash changes - but that hash can still be matched to the original hash. This is why, counterintuitively, you have to search less of the database with this growth pattern: the goal is not to match the same hash, but the same image. Searching less of the database means faster searches. This effect benefits every competitor in this benchmark, not just XLerate DNA.
  • FAISS-CPU was not required to build its index in the AWS benchmark runs. A single index was pre-trained optimally for all benchmark bands, which is an advantage for FAISS because index building is incredibly slow and normally must be repeated as the database grows. Not having to build an index via clustering makes FAISS insertion performance look better than it would be in a real deployment and hides an extraordinary benefit of XLerate DNA.
  • In our experiments, allowing FAISS to build a fresh index for each run did not materially impact the resulting index quality, nor the nprobe (how much of the database had to be searched) required to achieve 100% recall. This is because FAISS is bound by the limitations of the pure vector space geometry, which means it can only optimize clustering up to a ceiling.
  • In the AWS benchmarks, FAISS-CPU was swapped into the same service code that XLerate DNA uses, and was not required to communicate with persistent storage - an advantage over XLerate DNA.
  • In the AWS spend-multiple analysis to reach parity with XLerate DNA, storage cost was not included for FAISS - which makes the spend multiple lower than it would be in a real FAISS deployment.
  • We did not benchmark FAISS-GPU on AWS GPU infrastructure because the cost per query gap would be the same or greater when factoring in required batching, and the requirement of batching queries to the GPU to efficiently use the hardware violates the instant/on-demand nature of live workflows.
  • XLerate DNA's benchmarked performance characteristics have been repeated in an out-of-distribution (OOD) dataset test - adult content. This proves that its advantages are not specific to the PDNA-1M data, and that they are retained when tested on data very similar in visual composition to data targeted by PhotoDNA workflows.
  • FAISS-CPU was compiled with AVX-512 support.
  • XLerate DNA uses dynamic SIMD dispatch for all vectorizable operations, and so it also leveraged AVX-512.
  • The local benchmark CPU uses a double-pump mechanism to provide AVX-512 support, which performs worse than a native 512-bit execution unit would.
  • XLerate DNA is engineered to refuse to return non-matches - guaranteeing 100% accuracy under configured parameters. That is not typical of vector databases or vector search, but is appropriate for its intended use of high-trust threat identification.