Benchmark summary

Performance that changes the operating envelope.

XLerate™ DNA is built to handle PhotoDNA lookups the way they actually arrive in the wild: one query, right now. Our benchmarks use this shape, called batch-1, to assess local SDK throughput, deployed AWS full-service throughput, access and growth stress, and, finally, cloud economics.

Last updated: June 12, 2026

Local CPU baseline

46.2x

Geomean throughput advantage over FAISS-CPU across all public insert/query rows.

Local GPU baseline

3.4x

Geomean throughput advantage over FAISS-GPU while preserving the public recall contract.

AWS throughput

8.1x

Mean same-concurrency throughput multiple in the deployed AWS service matrix.

Equivalent FAISS spend

6.8x

Mean FAISS-CPU infrastructure spend multiple across the same AWS service-matrix rows.

Local SDK performance

The CPU engine is winning in GPU territory.

XLerate DNA - running on a commodity CPU - takes the lead in performance territory normally reserved for expensive GPUs.

Base

The starting database before growth inserts. Search effort: 0.00275 (0.275% of indexed space). Indexed vectors: 1M.

Chart: throughput (QPS, axis 0 to 2M) for XLerate SDK, FAISS-CPU, and FAISS-GPU on the Easy, Medium, and Hard query bands. CPU multiples: Easy 27x, Medium 28.2x, Hard 27.7x.

Medium

The database after adding medium-distance inserts. Search effort: 0.00247 (0.247% of indexed space). Indexed vectors: 2.27M.

Chart: throughput (QPS, axis 0 to 2M) for XLerate SDK, FAISS-CPU, and FAISS-GPU on the Easy, Medium, and Hard query bands. CPU multiples: Easy 35.6x, Medium 36.6x, Hard 36x.

Large

The largest tested database after easy, medium, and hard inserts. Search effort: 0.000031 (0.003% of indexed space). Indexed vectors: 4M.

Chart: throughput (QPS, axis 0 to 7M) for XLerate SDK, FAISS-CPU, and FAISS-GPU on the Easy, Medium, and Hard query bands. CPU multiples: Easy 95.9x, Medium 100.6x, Hard 99.7x.

Full performance matrix

Values are queries per second (QPS).

Database size	Query band	XLerate SDK	FAISS-CPU	FAISS-GPU	CPU multiple	GPU multiple
Base	Easy	1.38M	51.1k	615.8k	27x	2.2x
Base	Medium	1.46M	51.7k	670.7k	28.2x	2.2x
Base	Hard	1.44M	52.2k	665.6k	27.7x	2.2x
Medium	Easy	1.08M	30.4k	401.5k	35.6x	2.7x
Medium	Medium	1.17M	31.9k	453.1k	36.6x	2.6x
Medium	Hard	1.16M	32.1k	448.4k	36x	2.6x
Large	Easy	6.22M	64.8k	948.5k	95.9x	6.6x
Large	Medium	6.57M	65.3k	993.4k	100.6x	6.6x
Large	Hard	6.51M	65.3k	991.1k	99.7x	6.6x
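
The headline geomean figures can be cross-checked against the rounded multiples in the matrix above. The following is a minimal sketch, assuming the published rounded values are an adequate stand-in for the unrounded measurements; the function and variable names are illustrative and are not part of any XLerate tooling.

```python
import math

# CPU and GPU multiples copied from the full performance matrix (rounded as published),
# in row order: Base, Medium, Large x Easy, Medium, Hard.
cpu_multiples = [27.0, 28.2, 27.7, 35.6, 36.6, 36.0, 95.9, 100.6, 99.7]
gpu_multiples = [2.2, 2.2, 2.2, 2.7, 2.6, 2.6, 6.6, 6.6, 6.6]


def geomean(values):
    """Geometric mean: exp of the arithmetic mean of the natural logs."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


cpu_geomean = geomean(cpu_multiples)
gpu_geomean = geomean(gpu_multiples)

print(f"CPU geomean multiple: {cpu_geomean:.1f}x")
print(f"GPU geomean multiple: {gpu_geomean:.1f}x")
```

Rounded to one decimal place, the two results agree with the 46.2x and 3.4x headline figures. A geometric mean is the usual choice for averaging ratios because one very large row (such as the Large database) pulls it up less than it would an arithmetic mean.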

Companion page

Benchmark shape

Three databases, one standard test

The benchmark is built around three database sizes: Base, Medium, and Large. Each size is queried with Easy, Medium, and Hard PhotoDNA match bands so the effects of growth and match difficulty can be read separately.

Base: 1M vectors inserted
Medium: 2.27M vectors inserted
Large: 4M vectors inserted

The comparison runs XLerate DNA against FAISS on the same CPUs, and against FAISS on a top-tier GPU, to show whether general-purpose vector search can beat a PhotoDNA-specific engine on speed or cost.

The databases and query vectors come from PDNA-1M: one million clean base images plus ten million qualifying query vectors.


AWS deployment performance

The lead translates to the cloud.

The AWS benchmark uses a full-fledged product deployment - not a benchmark harness. Orchestration, service path, request handling, storage interaction, concurrent client drivers - all run while collecting latency, recall, and cost.

AWS single-query throughput under load

Mean lines summarize the deployed AWS benchmark matrix across the three database sizes and query bands. The transparent thicker lines show the upper and lower performance bounds behind the mean at each concurrency point.

Chart: throughput (axis 0 to 300k) versus concurrent clients (256, 512, 1024, 2048) for the XLerate service and the FAISS-CPU service.

Latency stays controlled as concurrency rises

Throughput with high latency is just a queue; this curve shows the p95 latency of the service under various levels of concurrency pressure.

Chart: p95 latency (axis 0 to 200 ms) versus concurrent clients (256, 512, 1024, 2048) for the XLerate service and the FAISS-CPU service.
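
The link between throughput, latency, and concurrency can be made concrete with Little's law. The sketch below assumes a closed-loop load generator in which each client keeps exactly one request in flight with no think time; the page does not state how its client drivers behave, and the numbers used are hypothetical, not measured values.

```python
def mean_latency_ms(concurrent_clients: int, throughput_qps: float) -> float:
    """Little's law for a closed loop: clients = throughput x mean latency."""
    return concurrent_clients / throughput_qps * 1000.0


# Hypothetical operating points, chosen only to illustrate the relationship.
fast_service = mean_latency_ms(2048, 250_000)  # high throughput at 2048 clients
slow_service = mean_latency_ms(2048, 30_000)   # low throughput at 2048 clients

print(f"fast service mean latency: {fast_service:.3f} ms")
print(f"slow service mean latency: {slow_service:.3f} ms")
```

Under these assumptions, a service that cannot raise throughput as clients are added must absorb the extra concurrency as latency. Tail percentiles such as p95 sit above the mean computed here.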

Cost model

Unbeatable total cost of ownership.

XLerate DNA's cloud performance redefines the economics. Chasing its throughput and latency profile with FAISS-CPU forces materially higher AWS infrastructure spend.

Cost per million queries under load

Measured AWS economics across the public service matrix. Lower is better; the distance between the lines is the bill moving with the throughput curve.

Chart: cost per million queries (axis $0 to $0.0200) versus concurrent clients (256, 512, 1024, 2048) for the XLerate service and the FAISS-CPU service.

Queries per dollar

Measured from the same AWS serving-infrastructure cost rows as the cost chart. Higher is better: it shows how many single-query lookups each infrastructure dollar buys.

Chart: mean queries per dollar (axis 0 to 600M). Bar labels, in legend order: XLerate service 494.65M, FAISS-CPU service 73.9M.
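
Queries per dollar and cost per million queries are two views of the same quantity. The following is a small sketch of the conversion, assuming the two chart labels map to the services in legend order (494.65M for the XLerate service, 73.9M for the FAISS-CPU service); the function name is illustrative.

```python
def cost_per_million_queries(queries_per_dollar: float) -> float:
    """Dollars needed to serve one million queries."""
    return 1_000_000 / queries_per_dollar


# Mean queries per dollar as labeled on the chart (service mapping is an assumption).
xlerate_qpd = 494.65e6
faiss_cpu_qpd = 73.9e6

xlerate_cost = cost_per_million_queries(xlerate_qpd)
faiss_cpu_cost = cost_per_million_queries(faiss_cpu_qpd)
ratio_of_means = xlerate_qpd / faiss_cpu_qpd

print(f"XLerate:   ${xlerate_cost:.4f} per million queries")
print(f"FAISS-CPU: ${faiss_cpu_cost:.4f} per million queries")
print(f"Ratio of mean queries per dollar: {ratio_of_means:.1f}x")
```

This works out to roughly $0.0020 and $0.0135 per million queries, and a ratio of about 6.7x. A ratio of two means need not equal the mean of per-row ratios, which is why it can differ slightly from the 6.8x mean spend multiple.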

FAISS infrastructure spend by AWS concurrency

At each AWS client level, the bar shows the mean cost multiple - spent on AWS infrastructure alone - required for FAISS-CPU to try to keep up with XLerate DNA.

Chart: FAISS-CPU infrastructure spend multiple (axis 0 to 8x) at 256, 512, 1024, and 2048 concurrent clients. Bar labels shown: 6.4x, 6.7x, 7.1x.

Spend-multiple cost matrix

Values are same-concurrency throughput, p99 latency, and spend multiples.

Database size	Query band	AWS clients	Throughput multiple	p99 ratio	Spend multiple
Base	Easy	256	7.2x	7.4x	6.1x
Base	Easy	512	8x	5.8x	6.7x
Base	Easy	1024	8.3x	5.5x	7x
Base	Easy	2048	8.2x	5x	6.9x
Base	Medium	256	9.1x	6x	7.7x
Base	Medium	512	9.8x	5.7x	8.3x
Base	Medium	1024	9.7x	5.5x	8.2x
Base	Medium	2048	9.3x	4.9x	7.9x
Base	Hard	256	8.2x	5.6x	6.9x
Base	Hard	512	8.1x	5.1x	6.8x
Base	Hard	1024	9.5x	5.8x	8.1x
Base	Hard	2048	8.8x	4.7x	7.5x
Medium	Easy	256	8.6x	7.8x	7.3x
Medium	Easy	512	9.1x	7x	7.7x
Medium	Easy	1024	8.7x	6x	7.3x
Medium	Easy	2048	9.7x	5x	8.1x
Medium	Medium	256	8.5x	6.1x	7.2x
Medium	Medium	512	8.7x	6.3x	7.4x
Medium	Medium	1024	8.8x	6.3x	7.4x
Medium	Medium	2048	9.5x	5.8x	8x
Medium	Hard	256	9x	6.6x	7.6x
Medium	Hard	512	9.3x	6.7x	7.8x
Medium	Hard	1024	9x	6x	7.6x
Medium	Hard	2048	10.2x	5.8x	8.6x
Large	Easy	256	4.9x	5.1x	4.1x
Large	Easy	512	5.1x	6.5x	4.3x
Large	Easy	1024	6.6x	5.9x	5.6x
Large	Easy	2048	6.5x	4.9x	5.5x
Large	Medium	256	7.2x	5.6x	6.1x
Large	Medium	512	6.7x	5.1x	5.7x
Large	Medium	1024	6.9x	5.6x	5.9x
Large	Medium	2048	6.4x	4.8x	5.4x
Large	Hard	256	5.3x	4.4x	4.5x
Large	Hard	512	7x	5.2x	5.9x
Large	Hard	1024	7.3x	6.6x	6.1x
Large	Hard	2048	7.2x	5.7x	6x
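
The summary figures for the AWS service matrix can be recomputed from the rows above. This is a minimal sketch, assuming plain arithmetic means over the rounded, published multiples; the names are illustrative.

```python
from collections import defaultdict
from statistics import mean

# (AWS clients, throughput multiple, spend multiple), copied from the matrix in row order.
ROWS = [
    (256, 7.2, 6.1), (512, 8.0, 6.7), (1024, 8.3, 7.0), (2048, 8.2, 6.9),   # Base / Easy
    (256, 9.1, 7.7), (512, 9.8, 8.3), (1024, 9.7, 8.2), (2048, 9.3, 7.9),   # Base / Medium
    (256, 8.2, 6.9), (512, 8.1, 6.8), (1024, 9.5, 8.1), (2048, 8.8, 7.5),   # Base / Hard
    (256, 8.6, 7.3), (512, 9.1, 7.7), (1024, 8.7, 7.3), (2048, 9.7, 8.1),   # Medium / Easy
    (256, 8.5, 7.2), (512, 8.7, 7.4), (1024, 8.8, 7.4), (2048, 9.5, 8.0),   # Medium / Medium
    (256, 9.0, 7.6), (512, 9.3, 7.8), (1024, 9.0, 7.6), (2048, 10.2, 8.6),  # Medium / Hard
    (256, 4.9, 4.1), (512, 5.1, 4.3), (1024, 6.6, 5.6), (2048, 6.5, 5.5),   # Large / Easy
    (256, 7.2, 6.1), (512, 6.7, 5.7), (1024, 6.9, 5.9), (2048, 6.4, 5.4),   # Large / Medium
    (256, 5.3, 4.5), (512, 7.0, 5.9), (1024, 7.3, 6.1), (2048, 7.2, 6.0),   # Large / Hard
]

# Overall means across all 36 rows.
mean_throughput_multiple = mean(t for _, t, _ in ROWS)
mean_spend_multiple = mean(s for _, _, s in ROWS)

# Mean spend multiple at each client level.
spend_by_clients = defaultdict(list)
for clients, _, spend in ROWS:
    spend_by_clients[clients].append(spend)
mean_spend_by_clients = {c: mean(v) for c, v in sorted(spend_by_clients.items())}

print(f"Mean throughput multiple: {mean_throughput_multiple:.1f}x")
print(f"Mean spend multiple: {mean_spend_multiple:.1f}x")
for clients, value in mean_spend_by_clients.items():
    print(f"{clients} clients: {value:.1f}x")
```

Rounded to one decimal place, the overall means agree with the 8.1x throughput and 6.8x spend headline figures. The per-client means come to about 6.4x, 6.7x, 7.0x, and 7.1x for 256, 512, 1024, and 2048 clients.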

Bucket health

Near-perfect health under database growth.

XLerate™ DNA's novel, patent-pending clustering system never requires learning and maintains unprecedented cluster health across all database sizes. Other methods lose both accuracy and speed to degrading bucket health - requiring expensive reclustering with growth.

Bucket health under growth

Base, Medium, and Large database sizes plotted by effective bucket usage and p99 bucket load pressure.

Chart: Cluster distribution health. Horizontal axis: effective bucket usage (80% to 100%). Vertical axis: p99 bucket load pressure (1.0x to 3.0x). Points are plotted per database size (Base, Medium, Large) and per system (XLerate DNA, FAISS-CPU, FAISS-GPU).

Down and right is better: more of the bucket space is doing useful work, and buckets are less overloaded.

Vector libraries are like a real library - books are sorted by author or title to make finding them easier. Buckets are the sections of the library. Health is how evenly the books are spread across the available sections. Under-utilization of one section means over-utilization of another, and an over-utilized section takes longer to search.
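
The two health measures can be illustrated with a short sketch. The page does not define them precisely, so the definitions here are assumptions: effective bucket usage is taken as the fraction of buckets holding at least one vector, and p99 bucket load pressure as the 99th-percentile bucket size divided by the mean bucket size. The function name and sample data are hypothetical.

```python
def bucket_health(bucket_sizes):
    """Return (effective usage, p99 load pressure) under the assumed definitions."""
    n = len(bucket_sizes)
    used = sum(1 for size in bucket_sizes if size > 0)
    mean_load = sum(bucket_sizes) / n
    ordered = sorted(bucket_sizes)
    # Nearest-rank 99th percentile, using integer ceiling to avoid float rounding.
    p99_rank = -(-99 * n // 100)
    p99_load = ordered[p99_rank - 1]
    return used / n, p99_load / mean_load


# Perfectly even library: every section holds the same number of books.
even = [100] * 100
# Uneven library: 20 empty sections, 70 light ones, 10 overloaded ones.
uneven = [0] * 20 + [50] * 70 + [650] * 10

print("even:  ", bucket_health(even))
print("uneven:", bucket_health(uneven))
```

In this toy data the even layout scores 100% usage with 1.0x pressure, while the uneven layout scores 80% usage with 6.5x pressure, which is the up-and-left direction on a chart like the one described above.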

Stress test

XLerate DNA performs under pathological load.

Real systems do not receive perfectly polite traffic. The stress test, run against the local SDK, shows how throughput behaves when demand concentrates instead of spreading evenly across the index. We hammer XLerate DNA under various Zipfian skews - various levels of same-bucket access, causing contention.
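
To give a feel for what the skew values mean, the sketch below computes how much traffic lands on the hottest buckets under a Zipfian distribution. It assumes the skew is the Zipf exponent applied to bucket rank and uses a hypothetical bucket count; neither detail is confirmed by the benchmark description.

```python
def zipf_shares(num_buckets: int, skew: float) -> list[float]:
    """Share of accesses per bucket rank, with weight 1 / rank**skew."""
    weights = [1.0 / (rank ** skew) for rank in range(1, num_buckets + 1)]
    total = sum(weights)
    return [w / total for w in weights]


NUM_BUCKETS = 32_768  # hypothetical bucket count, for illustration only
SKEWS = (1.1, 1.5, 2.0, 2.5, 3.0)

hottest_share = {}
for skew in SKEWS:
    shares = zipf_shares(NUM_BUCKETS, skew)
    hottest_share[skew] = shares[0]
    print(f"skew {skew}: hottest bucket {shares[0]:.1%}, top 10 buckets {sum(shares[:10]):.1%}")
```

Under these assumptions the share of the single hottest bucket grows steadily with skew, so higher skews concentrate more threads on the same bucket at the same time.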

Stress-test throughput (QPS) by database size: 32-thread contention run across increasingly concentrated access patterns.

Database size	Skew 1.1	Skew 1.5	Skew 2	Skew 2.5	Skew 3
Base	1.48M	1.65M	1.25M	1.68M	1.63M
Medium	1.27M	1.53M	1.18M	1.61M	1.51M
Large	7.54M	5.9M	2.64M	2.23M	2.05M

Fair reading

Clear, spin-free summary.

For high-stakes PhotoDNA workflows, XLerate DNA delivers unparalleled performance.

Takeaway

If you need to perform accurate PhotoDNA matching at scale, the benchmark story is unambiguous: XLerate DNA delivers higher throughput, controlled latency, materially better infrastructure economics and better performance on commodity hardware than alternatives can deliver on top-dollar specialty hardware.

For cloud solution providers, your trust and safety workflow can realize a roughly 7-fold infrastructure cost reduction (6.8x mean across the AWS service matrix) while increasing its effectiveness. For digital forensics, you can escape the lab backlog by doing live triage on commodity devices at speeds that rival lab hardware.

This page does claim

In the local SDK matrix, XLerate DNA preserves the recall contract while delivering 46.2x geomean throughput over FAISS-CPU and 3.4x over FAISS-GPU.
In the AWS service matrix, XLerate DNA delivers 8.1x mean same-concurrency throughput over FAISS-CPU.
Across those same AWS service rows, equivalent FAISS-CPU serving-infrastructure spend averages 6.8x XLerate's serving-infrastructure spend.
All benchmark configurations required both XLerate DNA and FAISS to yield 100% accuracy. This is not typical in vector search - but it is required for high-stakes PhotoDNA workflows.
Importantly, XLerate DNA maintains its performance on out-of-distribution data - namely adult content. This proves it's not tailored to the PDNA-1M dataset, and thus reliable for the kind of content normally handled in PhotoDNA workflows.

This page does not claim

That XLerate DNA replaces FAISS for every vector workload. FAISS is used as a reference because it's the gold-standard, high-performance vector library. XLerate DNA provides the same outcome on PhotoDNA vectors asymmetrically with patent-pending technology.
That the local GPU benchmark has the same operating shape as the others. In fact, FAISS-GPU was given a batch-512 workflow - an advantage - in order to honestly give it the best chance.
That the AWS economics are a customer price quote. These figures are normalized measured throughput against the stated on-demand serving-infrastructure costs at the time of the benchmark.

Benchmark setup

The details matter.

These are the practical details behind the benchmark: the local harness, the AWS service shape, and the details that matter when reading the numbers.

The important point is that the comparison is not hiding a special disadvantage for FAISS. In the AWS benchmarks, FAISS-CPU was run inside the exact same service code that XLerate DNA runs in - we simply swapped XLerate DNA's runtime out for FAISS. In fact, both FAISS-GPU and FAISS-CPU were given a few unfair advantages.

FAISS-GPU, like all programs that must run on a GPU, suffers the speed bottleneck of transferring data to and from the GPU. If we made FAISS-GPU answer 1 query at a time like XLerate DNA or FAISS-CPU, that bottleneck would make FAISS-GPU appear to be as slow or slower than FAISS-CPU.

FAISS-GPU was allowed to use a batch size of 512. That means it was allowed to conduct 512 queries at once. If this were a race to taxi people from point A to point B, XLerate DNA and FAISS-CPU would be single-passenger cars competing against an impossibly large bus. However, this advantage had to be given to afford it a fair chance. In a real deployment, this would mean you would have to hold back answering queries until hundreds more arrived - so even with this advantage, it's not realistic for an on-demand system.
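
The cost of holding queries back can be estimated with a short sketch. It assumes queries arrive at a steady, evenly spaced rate, which real traffic rarely does; the arrival rates are hypothetical and the function name is illustrative.

```python
def batch_fill_wait_seconds(batch_size: int, arrival_rate_qps: float) -> float:
    """Time the first query in a batch waits for the rest of the batch to arrive."""
    return (batch_size - 1) / arrival_rate_qps


# Hypothetical steady arrival rates, in queries per second.
for rate in (1_000, 10_000, 100_000):
    wait_ms = batch_fill_wait_seconds(512, rate) * 1000.0
    print(f"{rate:>7} qps: first query waits {wait_ms:.2f} ms before the batch can start")

# A batch of one never waits for other queries.
print(batch_fill_wait_seconds(1, 1_000))
```

At 1,000 queries per second, for example, the first query in a 512-query batch waits about half a second before any work begins, which is on top of the search time itself.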

FAISS-CPU was likewise given an advantage to make it practical. It didn't have to cluster inserted vectors, and thus its insertion speeds are far higher than they would be under normal usage. XLerate DNA never has to learn clustering, unlike other vector products. The advantage afforded to FAISS-CPU hides an extraordinary benefit of XLerate DNA.

Local harness

Hardware and FAISS build

The local matrix ran on a high-end desktop CPU, with FAISS-GPU also tested against an RTX 4090 baseline.

AMD Ryzen 9 7950X, 16 cores / 32 threads.
63.6 GiB system memory, DDR5-5600.
NVIDIA GeForce RTX 4090, 24,564 MiB VRAM.
FAISS 1.14.1 with GPU and NVIDIA cuVS support.
FAISS runtime: IVF Flat, nlist 32,768, nprobe 32 or 2 by row.
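
For readers less familiar with IVF indexes: nlist is the number of inverted lists (clusters) and nprobe is how many of them are scanned per query. The sketch below computes the fraction of lists probed for the two stated nprobe settings. It says nothing about the fraction of vectors scanned, which equals this number only if the lists are perfectly balanced; the function name is illustrative.

```python
def probed_list_fraction(nprobe: int, nlist: int) -> float:
    """Fraction of inverted lists an IVF index visits per query."""
    return nprobe / nlist


NLIST = 32_768  # from the FAISS runtime configuration above

for nprobe in (32, 2):
    fraction = probed_list_fraction(nprobe, NLIST)
    print(f"nprobe {nprobe:>2}: {fraction:.6f} of lists ({fraction:.4%})")
```

With nlist at 32,768, nprobe 32 visits about 0.0977% of the lists and nprobe 2 visits about 0.0061%.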

AWS footprint

Service benchmark shape

The AWS rows compare deployed amd64 services and count serving infrastructure, not the separate load-generation fleet.

XLerate service: c7i.8xlarge, 32 vCPUs.
XLerate Garnet: r7i.xlarge, 4 vCPUs.
FAISS-CPU service: c7i.8xlarge, 32 vCPUs.
Load generation used 7 c7i.xlarge runner nodes across 256, 512, 1024, and 2048 client levels.

Salient methodology notes

It's somewhat confusing, but all three database sizes represent the same 1 million unique images. When you edit an image and hash it again, the hash changes - but that hash can still be matched to the original hash. This is why, counterintuitively, you have to search less of the database with this growth pattern - because the goal is not to match the same hash, but the same image. Searching less of the database means faster searches. This effect is realized by all competitors in this benchmark, not just XLerate DNA.
FAISS-CPU was not required to build its index in the AWS benchmark runs. A single index was pre-trained optimally for all benchmark bands, which is an advantage for FAISS because index building is incredibly slow and normally must be repeated as the database grows. Not having to build an index via clustering makes FAISS insertion performance look better than it would be in a real deployment and hides an extraordinary benefit of XLerate DNA.
In our experiments, allowing FAISS to build a fresh index for each run did not materially impact the resulting index quality, nor the nprobe (how much of the database had to be searched) required to achieve 100% recall. This is because FAISS is bound by the limitations of the pure vector space geometry, which means it can only optimize clustering up to a ceiling.
In the AWS benchmarks, FAISS-CPU was swapped into the same service code that XLerate DNA uses, and was not required to communicate with persistent storage - an advantage over XLerate DNA.
In the AWS spend-multiple analysis to reach parity with XLerate DNA, storage cost was not included for FAISS - which makes the spend multiple lower than it would be in a real FAISS deployment.
We did not benchmark FAISS-GPU on AWS GPU infrastructure because the cost per query gap would be the same or greater when factoring in required batching, and the requirement of batching queries to the GPU to efficiently use the hardware violates the instant/on-demand nature of live workflows.
XLerate DNA's benchmarked performance characteristics have been repeated in an out-of-distribution (OOD) dataset test - adult content. This proves that its advantages are not specific to the PDNA-1M data, and that they are retained when tested on data very similar in visual composition to data targeted by PhotoDNA workflows.
FAISS-CPU was compiled with AVX-512 support.
XLerate DNA uses dynamic SIMD dispatch for all vectorizable operations, and so it also leveraged AVX-512.
The local benchmark CPU uses a double-pump mechanism to provide AVX-512 support, which yields lower AVX-512 performance than a native 512-bit execution unit would.
XLerate DNA is engineered to refuse to return non-matches - guaranteeing 100% accuracy under configured parameters. That is not typical of vector databases or vector search, but is appropriate for its intended use of high-trust threat identification.
