Positron

POSITRON

Positron

Accelerating Intelligence

with Hardware for Transformer Model Inference

Purpose Built Generative AI systems

Positron delivers the highest performance, lowest power, and best total cost of ownership solution for Transformer Model Inference.

Versus NVIDIA's H100/H200 systems, Positron delivers:

Positron Atlas ServerPositron Archer Card
> 70%
Greater Performance
@
66%
Lower Power
while
being
50%
Lower Capex Cost
Shipping Today
Contact Sales

Head to Head Systems Comparison


(Llama 3.1 8B with BF16 compute, no speculation or paged attention)

Positron delivers leading Performance per Dollar and Performance per Watt compared to NVIDIA

01

NVIDIA DGX H100

System Power ⚡ 5900W
060120180240300
182.00
Tokens/sec/User
Perf/Dollar: 1.00x
Perf/Watt: 1.00x
02

Positron Atlas

System Power ⚡ 2000W
060120180240300
280.00
Tokens/sec/User
Perf/Dollar: 3.08x
Perf/Watt: 4.54x

Positron Software Release Competitiveness

September 2024
December 2024
Q1 2025
Q2 2025
Software Release
Relative Perf vs H100
Perf/Watt vs H100
Perf/$ vs H100
Perf/$ vs B200 (Estimated)
v1.1 (Atlas)
1.43x
3.6x
3.4x
-
v1.2 (Atlas)
1.77x
6.0x
4.3x
3.3x
v2.0 (Atlas)
2.37x
7.3x
5.7x
3.5x
v2.1 (Atlas)
3.15x
8.9x
7.6x
4.7x
* Nvidia performance is based on vLLM 0.6.3 based on the average across testing Mixtral 8x7B, Llama 3.2 3B, Llama 3.1 8B, and Llama 3.1 70B.

Every Transformer Runs on Positron

Supports all Transformer models
seamlessly with zero time and zero effort

Positron maps any trained HuggingFace Transformers Library model directly onto hardware for maximum performance and ease of use

Step 1
Model files

.pt

.safetensors

Hugging Face

Develop or procure a model using the HuggingFace Transformers Library

Step 2

Drag & Drop to Upload

or

Upload or link trained model file (.pt or .safetensors) to Positron Model Manager

Step 3
from openai import OpenAI
client = OpenAI(uri="api.positron.ai")

client.chat.completions
  .create(
    model="my_model"
  )

Update client applications to use Positron's OpenAI API-compliant endpoint