Realtime and Batch Processing of GPU Workloads
Joseph Stein discussed building an enterprise AI-as-a-Service platform for real-time and batch ingestion of GPU workloads within a private cloud data center. He explained techniques such as multi-namespace scheduling, atomic priority queueing with Valkey and Lua, risk mitigation through central proxy gateways, and scaling batch pipelines with a custom S3-to-Kafka proxy.
Reason for selection: Maximizes the use of underutilized GPU resources through multi-namespace scheduling.
