The focus of the second issue of One Minute NLP is BPE, a popular tokenization algorithm which is used by most LLMs including GPT, Llama, and Mistral.
Byte-Pair Encoding Algorithm
The focus of the second issue of One Minute NLP is BPE, a popular tokenization algorithm which is used by most LLMs including GPT, Llama, and Mistral.