llama-server
  Slot 0: User A's request
  Slot 1: User B's request
  Slot 2: (idle)
  Slot 3: (idle)
→ Active slots are inferred together in a single batch
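
The slot layout above can be mimicked with a small toy model. This is only a sketch of the idea, not llama.cpp's actual implementation: the names `Slot`, `ToyServer`, `assign`, and `next_batch` are hypothetical, and the "batch" here is just a list of the active slots rather than a real inference call.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Slot:
    """One hypothetical server slot; holds at most one request at a time."""
    slot_id: int
    request: Optional[str] = None

    @property
    def idle(self) -> bool:
        return self.request is None


@dataclass
class ToyServer:
    """Toy stand-in for a server with a fixed number of slots (not llama.cpp code)."""
    n_slots: int = 4
    slots: list = field(init=False)

    def __post_init__(self) -> None:
        self.slots = [Slot(i) for i in range(self.n_slots)]

    def assign(self, request: str) -> Optional[int]:
        """Place the request in the first idle slot; return its id, or None if all are busy."""
        for slot in self.slots:
            if slot.idle:
                slot.request = request
                return slot.slot_id
        return None

    def next_batch(self) -> list:
        """Collect every active slot into one batch, skipping idle slots."""
        return [(s.slot_id, s.request) for s in self.slots if not s.idle]


server = ToyServer(n_slots=4)
server.assign("User A's request")
server.assign("User B's request")
batch = server.next_batch()
print(batch)
```

With two requests assigned, the batch holds only slots 0 and 1, while slots 2 and 3 stay idle and are left out. A real server would run one forward pass over that batch instead of returning a list.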