Delayed Tensor Parallelism for Faster Transformer Inference

(blog.kog.ai)

2 points | by matt_d 9 hours ago

No comments yet.