POSITRON
Accelerating Intelligence
with Hardware for Transformer Model Inference
Purpose-Built Generative AI Systems
Positron delivers the highest performance, lowest power, and best total cost of ownership for Transformer Model Inference.
Versus NVIDIA's H100/H200 systems, Positron delivers:
![Positron Atlas Server](/_next/image?url=%2Fassets%2Fhomepage%2Fhero%2Fatlas.png&w=1200&q=75)
![Positron Archer Card](/_next/image?url=%2Fassets%2Fhomepage%2Fhero%2FPositron-Card.png&w=640&q=75)
Head-to-Head Systems Comparison
(Llama 3.1 8B with BF16 compute, no speculation or paged attention)
Positron delivers leading performance per dollar and performance per watt compared to NVIDIA.
[Benchmark chart: NVIDIA DGX H100 vs. Positron Atlas]
Positron Software Release Competitiveness
Every Transformer Runs on Positron
Supports all Transformer models seamlessly, with zero setup time and zero porting effort
Positron maps any trained HuggingFace Transformers Library model directly onto hardware for maximum performance and ease of use
Develop or procure a model using the HuggingFace Transformers Library
Upload or link the trained model file (.pt or .safetensors) to the Positron Model Manager, via drag-and-drop or URL.
```python
from openai import OpenAI

# Endpoint path and key handling are illustrative; the OpenAI client
# expects a full base URL, and reads OPENAI_API_KEY from the environment.
client = OpenAI(base_url="https://api.positron.ai/v1")

client.chat.completions.create(
    model="my_model",
    messages=[{"role": "user", "content": "Hello!"}],
)
```
Update client applications to use Positron's OpenAI API-compliant endpoint