Fast LLM Inference From Scratch (using CUDA)

344 points by homarp - 199 Days, 18 Hours ago Hacker News
