What You'll Do
Own the full lifecycle of AI and computer vision systems, from model integration to scalable, production deployment.
Core Responsibilities
Deploy AI models to production: integrate, optimize, validate, monitor, and iterate.
Build and optimize real-time video inference pipelines (multi-stream, low latency, high throughput).
Optimize GPU inference using CUDA/TensorRT/cuDNN (quantization, batching, memory optimization).
Advantage: deploying and running models on edge devices (NVIDIA Jetson and similar).
Develop CV pipelines for detection/tracking.
Build scalable, production-grade services for video streaming and analytics.
Improve system performance end-to-end using profiling and optimization (CPU/GPU, threading, IO, networking).
Work closely with Product, AI, and platform teams to ship reliable releases.
Requirements: What We're Looking For (Must Have)
Proven experience taking AI/CV systems from concept to production.
Strong Python for AI workflows and tooling.
Strong C++ (C++17/20 preferred).
Deep understanding of profiling and optimization across CPU and GPU.
Hands-on experience with CUDA and inference tooling (TensorRT, cuDNN, Triton).
Experience deploying to edge devices, particularly NVIDIA Jetson.
Strong real-time CV pipeline experience.
Understanding of video streaming fundamentals: frame timing, buffering, latency control, scaling.
Nice to Have
Experience with NVIDIA Triton Server and multi-model GPU service orchestration.
Experience with gRPC, Docker, and CI/CD.
Experience with IoT/edge fleet patterns (MQTT).
Linux systems-level engineering: embedded environments, drivers, IO-heavy systems.
Familiarity with Go.
This position is open to all candidates.