Scalable Runtime

Deployment & Inference

Deploy AI models to edge, on-prem, or cloud, with ONNX/TensorRT optimization, multi-stream inference, a rule engine, and event output pipelines.

Deployment Targets

  • Edge devices (NVIDIA Jetson, Intel NCS, Coral)
  • On-premises GPU servers (multi-GPU clusters)
  • Cloud deployment (AWS, GCP, Azure, private cloud)
  • Hybrid edge-cloud topologies
  • Offline mode with local inference capability
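Across these targets, a runtime typically probes the hardware it lands on and falls back gracefully, e.g. from an accelerated backend on a GPU server down to CPU-only inference in offline mode. A minimal sketch of that selection logic (the backend names and `select_backend` helper are illustrative, not a real API):

```python
# Preference order for inference backends, fastest first.
# Names are hypothetical labels, not actual runtime identifiers.
PREFERRED = ["TensorRT", "ONNXRuntime-GPU", "ONNXRuntime-CPU"]

def select_backend(available):
    """Pick the highest-priority backend present on this deployment target."""
    for backend in PREFERRED:
        if backend in available:
            return backend
    raise RuntimeError("no supported inference backend on this host")
```

On a Jetson-class edge device the probe might return only the first and last entries; on a CPU-only offline node, just the last, so inference still runs locally.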

Inference Optimization

  • ONNX Runtime optimization for cross-platform deployment
  • TensorRT acceleration for NVIDIA GPUs
  • INT8/FP16 quantization for edge performance
  • Multi-stream parallel inference pipelines
  • Dynamic batching for throughput optimization
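Dynamic batching trades a small amount of latency for throughput: requests queue up until either a maximum batch size is reached or a timeout expires, then the whole batch runs through the model in one pass. A minimal stdlib-only sketch of that flush policy (a simplification; a production batcher would also handle per-request deadlines and padding):

```python
import queue
import time

def dynamic_batches(q, max_batch=8, timeout_s=0.01):
    """Yield batches from q: flush when max_batch items are collected
    or timeout_s elapses since the first item of the batch arrived."""
    while True:
        try:
            first = q.get(timeout=timeout_s)
        except queue.Empty:
            return  # no more work
        batch = [first]
        deadline = time.monotonic() + timeout_s
        while len(batch) < max_batch:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break  # timeout: flush a partial batch
            try:
                batch.append(q.get(timeout=remaining))
            except queue.Empty:
                break
        yield batch
```

Under heavy load this yields full batches (maximizing GPU utilization); under light load the timeout bounds how long any single request waits.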

Scene-Aware Configuration

  • Per-camera detection zone & ROI configuration
  • Confidence threshold tuning per class & scene
  • Schedule-based model switching (day/night)
  • Cascading model pipelines (detect → classify → act)
  • Scene-specific post-processing rules
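Scene-aware configuration boils down to per-camera data: each camera carries its own ROI, class thresholds, and a time-of-day schedule that selects which model to run. A minimal sketch, assuming a hypothetical config layout (the camera ID, model filenames, and `model_for` helper are illustrative):

```python
from datetime import time as clock

# Hypothetical per-camera scene configuration.
SCENES = {
    "lobby-cam-01": {
        "roi": (0, 120, 1280, 600),                 # x, y, w, h detection zone
        "thresholds": {"person": 0.45, "vehicle": 0.60},
        "schedule": [                               # (start, end, model)
            (clock(6, 0), clock(20, 0), "det_day.onnx"),
            (clock(20, 0), clock(23, 59), "det_lowlight.onnx"),
        ],
    },
}

def model_for(camera, now):
    """Schedule-based model switching: pick the model active at time `now`."""
    schedule = SCENES[camera]["schedule"]
    for start, end, model in schedule:
        if start <= now < end:
            return model
    # Outside all windows (e.g. after midnight): keep the last model.
    return schedule[-1][2]
```

The same lookup pattern extends naturally to cascading pipelines: the scheduled detector's outputs feed a per-scene classifier, whose thresholds come from the same config entry.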

Event Output Pipeline

  • REST API event dispatch with retry logic
  • MQTT publishing for IoT integration
  • Webhook triggers for third-party systems
  • gRPC streaming for real-time consumers
  • Event buffering & deduplication controls
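Regardless of transport (REST, MQTT, webhook, or gRPC), the dispatch path combines the same two controls: deduplication, so a flapping detection does not spam downstream consumers, and retry with backoff, so transient failures do not drop events. A stdlib-only sketch of that core loop (the `EventDispatcher` class and its parameters are illustrative, with the transport abstracted behind a `send` callable):

```python
import hashlib
import json
import time

class EventDispatcher:
    """Dedup events by content fingerprint, retry sends with backoff."""

    def __init__(self, send, retries=3, dedup_window=100):
        self.send = send            # callable(event) -> bool, e.g. a REST POST
        self.retries = retries
        self.window = dedup_window  # how many recent fingerprints to remember
        self.seen = []

    def _fingerprint(self, event):
        blob = json.dumps(event, sort_keys=True).encode()
        return hashlib.sha256(blob).hexdigest()

    def dispatch(self, event):
        fp = self._fingerprint(event)
        if fp in self.seen:
            return False            # duplicate suppressed, no send attempted
        for attempt in range(self.retries):
            if self.send(event):
                self.seen.append(fp)
                del self.seen[:-self.window]   # bound dedup memory
                return True
            time.sleep((2 ** attempt) * 0.01)  # exponential backoff
        return False                # exhausted retries; caller may buffer
```

A real pipeline would persist the buffer of failed events for redelivery; the sketch returns `False` to signal that handoff point.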