AI Hardware Lab — GPU, Accelerator & Edge Device Testing

Hardware Tiers

From data center racks to pocket-sized edge chips

Three tiers of AI hardware: data center, workstation, and edge.


01 — Data Center

Data center inference

Run frontier models locally. Multi-GPU and multi-chip systems for large-scale inference, from single accelerator cards to liquid-cooled multi-chip platforms. The hardware that powers production AI.

Up to 405B-parameter models

Multi-chip scaling and interconnect

PFLOPS-class performance

Available from: NVIDIA, AMD, Tenstorrent, Q.ANT

NVIDIA: The industry standard

CUDA-based GPUs with the broadest software ecosystem. The default choice for most AI workloads — and the benchmark everything else is measured against.

RTX 6000 Blackwell DC, DGX Spark, H100 / H200

AMD: The GPU alternative

Instinct accelerators with growing ROCm software support. A competitive option for inference workloads, often at lower cost than NVIDIA equivalents.

Instinct MI300X, Instinct MI325X

Tenstorrent: The open-source RISC-V path

RISC-V-based AI accelerators with a fully open-source software stack. Up to 5x lower TCO than NVIDIA for inference workloads. From single PCIe cards to liquid-cooled multi-chip systems.

Blackhole p150a (PCIe card); TT-QuietBox (4-chip, liquid-cooled); Wormhole n300

Q.ANT: Photonic computing from Germany

Next-generation photonic AI processors that compute with light instead of electrons. Up to 30x energy efficiency versus traditional silicon. Made in Germany, funded by BMBF.

NPS Server (Photonic NPU)



02 — Workstation

Desktop AI & local inference

AI on your desk. Single-card accelerators, desktop AI supercomputers, and unified-memory workstations that let you run large models without a server room. The answer to "can I run this model locally?"

Up to 192 GB unified / GPU memory

70B+ models on a single system

Desktop form factor

Available from: NVIDIA, Tenstorrent, Apple

NVIDIA: Desktop AI supercomputer

The DGX Spark brings Grace Blackwell to a desktop form factor with 128 GB unified memory — powerful enough for most production models without a server room.

DGX Spark, RTX 6000 workstation cards

Tenstorrent: Affordable single-card AI

A single Blackhole PCIe card delivers 664 TFLOPS for under €1,300 — plug it into any workstation and start running models. The QuietBox packs four chips in a quiet, liquid-cooled desktop.

Blackhole p150a (PCIe card, ~€1,300); TT-QuietBox (4-chip desktop)

Apple: Unified memory for large models

Mac Studio with M-series Ultra offers up to 192 GB unified memory — enough to run 70B models at FP16 or 120B+ at INT4. The best price-per-GB for local inference in the Apple ecosystem.

Mac Studio (M-series Ultra)
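
As a rough check on the memory figures above, weight memory can be approximated as parameter count times bytes per parameter. The sketch below assumes 2 bytes per parameter for FP16 and 0.5 for INT4, and it ignores KV cache, activations, and runtime overhead, so real requirements are higher. The function names are illustrative only.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Assumption: KV cache, activations, and runtime overhead are ignored.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billions: float, precision: str) -> float:
    """Approximate weight memory in GB (1 GB = 1e9 bytes)."""
    return params_billions * 1e9 * BYTES_PER_PARAM[precision] / 1e9


def fits(params_billions: float, precision: str, memory_gb: float) -> bool:
    """True if the weights alone fit in the given memory budget."""
    return weight_memory_gb(params_billions, precision) <= memory_gb


# Model sizes and the 192 GB budget are taken from the text above.
for size, precision in [(70, "fp16"), (120, "int4"), (405, "fp16")]:
    gb = weight_memory_gb(size, precision)
    print(f"{size}B @ {precision}: ~{gb:.0f} GB, "
          f"fits in 192 GB: {fits(size, precision, 192)}")
```

By this estimate a 70B model at FP16 needs about 140 GB for weights alone, which is why it fits in 192 GB while a 405B model at FP16 does not.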



03 — Edge AI

On-device inference without the cloud

AI at the edge — offline, sovereign, power-efficient. Run vision models and small LLMs on devices that draw less power than a phone charger. Perfect for manufacturing floors, retail, vehicles, and anywhere cloud connectivity isn't guaranteed.

1-5 W power consumption

Up to 214 TOPS edge performance

LLM and vision capable on-device

Available from: Hailo, Axelera, DEEPX, NVIDIA

Hailo: The edge AI leader

Dataflow architecture designed for maximum power efficiency. The Hailo-10 brings LLM inference to edge devices at just 2.5W. Deep Raspberry Pi integration makes prototyping fast.

Hailo-10 (40 TOPS, LLM-capable); Hailo-8 (26 TOPS, vision)

Axelera: European RISC-V edge AI

Dutch-designed edge processors using Digital In-Memory Computing on RISC-V. The Metis delivers 214 TOPS at extreme power efficiency. EU-funded, European supply chain.

Metis AIPU (214 TOPS); Europa AIPU (629 TOPS, coming soon)

DEEPX: Ultra-low-power embedded AI

South Korean edge AI chips optimized for the absolute lowest power envelope. The DX-M1 delivers 25 TOPS at just 1-5W in a tiny M.2 form factor.

DX-M1 (25 TOPS, 1-5W, M.2)

NVIDIA: CUDA at the edge

Jetson brings the CUDA ecosystem to edge devices. Familiar tools and frameworks for developers already in the NVIDIA ecosystem, with GPU-accelerated inference on-device.

Jetson Orin Nano (40 TOPS, 8 GB)


What You Get

Real numbers, not marketing claims

Every benchmark report includes the metrics that matter for production deployment.

Throughput

Tokens/second for LLMs, frames/second for vision. Under real-world concurrent load, not synthetic peaks.

Latency

Time to first token, P50/P95/P99 latency. The numbers that determine whether your users wait or don't.
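
One way to derive P50/P95/P99 from recorded per-request latencies is the nearest-rank method, sketched below. The function names and sample values are hypothetical, and the actual benchmark scripts may use a different percentile definition.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def latency_summary(latencies_ms):
    """Return the P50, P95, and P99 latency for a list of measurements."""
    return {f"p{p}": percentile(latencies_ms, p) for p in (50, 95, 99)}


# Hypothetical per-request latencies in milliseconds, not measured data.
example_latencies_ms = [112.0, 98.5, 130.2, 101.7, 455.0, 104.3, 99.9, 120.8]
print(latency_summary(example_latencies_ms))
```

Tail percentiles are sensitive to sample count: with only a handful of requests, P95 and P99 collapse onto the single slowest request, so a meaningful P99 needs at least a few hundred samples.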

Power

Per-device watt measurements. Critical for edge deployment and data center TCO calculations.
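
Watt readings become comparable across devices once they are normalized by the work done. A minimal sketch follows, using the TOPS and wattage figures quoted above for the Hailo-10; the throughput value in the joules-per-token example is a made-up placeholder, not a measurement, and the function names are illustrative.

```python
def tops_per_watt(tops: float, watts: float) -> float:
    """Peak compute efficiency: tera-operations per second per watt."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return tops / watts


def joules_per_token(avg_power_watts: float, tokens_per_second: float) -> float:
    """Energy per generated token. One watt is one joule per second,
    so dividing power by throughput gives joules per token."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return avg_power_watts / tokens_per_second


# Hailo-10 figures quoted on this page: 40 TOPS at 2.5 W.
print(f"Hailo-10: {tops_per_watt(40, 2.5):.1f} TOPS/W")

# Hypothetical throughput of 10 tokens/s, for illustration only.
print(f"Energy: {joules_per_token(2.5, 10.0):.2f} J/token")
```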

Cost-per-inference

Hardware purchase price amortized to cost per million tokens. The number your CFO cares about.
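
The amortization can be sketched as purchase price divided by the tokens a device produces over its service life. The lifetime, utilization, and throughput values below are assumptions chosen for illustration, and the sketch leaves out electricity, cooling, and hosting, which a full TCO calculation would add.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def cost_per_million_tokens(
    purchase_price: float,
    tokens_per_second: float,
    lifetime_years: float = 3.0,   # assumed amortization period
    utilization: float = 0.5,      # assumed fraction of time under load
) -> float:
    """Hardware price amortized over lifetime output, per million tokens.
    The result is in the same currency as purchase_price."""
    if tokens_per_second <= 0 or lifetime_years <= 0:
        raise ValueError("throughput and lifetime must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    lifetime_tokens = (
        tokens_per_second * SECONDS_PER_YEAR * lifetime_years * utilization
    )
    return purchase_price / lifetime_tokens * 1_000_000


# ~EUR 1,300 is the card price quoted on this page; 100 tokens/s is a
# hypothetical throughput, not a benchmark result.
cost = cost_per_million_tokens(1300, tokens_per_second=100)
print(f"~EUR {cost:.3f} per million tokens")
```

The result is dominated by utilization and sustained throughput, so the same card can look cheap or expensive depending on how busy it is kept.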

Compatibility

Did the model need conversion? Quantization? What broke? Honest notes on real-world readiness.

Methodology

MLPerf-aligned, fully documented, open-source benchmark scripts. Reproducible by anyone.

Want to test your model on our hardware?

Book a consultation and we'll design a benchmark plan for your specific workload.

