Writing an LLM from scratch, part 22 -- finally training our LLM!
This post wraps up my notes on chapter 5 of Sebastian Raschka's book
"Build a Large Language Model (from Scratch)".
Understanding cross-entropy loss and
perplexity was the hard part for
me in this chapter -- the remaining 28 pages were more a case of plugging bits together and
running the code to see what happens.
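The core relationship between those two quantities is simple: perplexity is just the exponential of the cross-entropy loss. A minimal sketch (plain Python, not the book's PyTorch code; the probabilities are made up for illustration):

```python
import math

def cross_entropy(target_probs):
    """Average negative log-probability the model assigned to each
    correct next token, over a sequence of predictions."""
    return -sum(math.log(p) for p in target_probs) / len(target_probs)

def perplexity(target_probs):
    """Perplexity is the exponential of the cross-entropy loss --
    roughly, the effective number of tokens the model is 'choosing
    between' at each step."""
    return math.exp(cross_entropy(target_probs))

# Hypothetical probabilities a model assigned to the correct tokens:
probs = [0.5, 0.25, 0.125]
loss = cross_entropy(probs)   # -> ln(4) ≈ 1.386
ppl = perplexity(probs)       # -> 4.0
```

A lower loss means the model assigned higher probability to the right tokens, and the perplexity drops accordingly.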
The shortness of this post makes it feel almost like a damp squib. After writing so much
in the last 22 posts, there's really not all that much to say -- but th...
Read more at gilesthomas.com