Fast LLM Inference From Scratch (using CUDA)

344 points by homarp, 173 days, 19 hours ago (Hacker News)
