One Minute NLP
Archive

August 2025
Model Context Protocol
MCP is an open standard that defines how LLMs integrate with external tools, data, and context.
Aug 25 • Dasha Herrmannova
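As a quick sketch of what this looks like on the wire: MCP is built on JSON-RPC 2.0, and a tool invocation is a `tools/call` request. The `search_docs` tool and its arguments below are hypothetical, invented for illustration.

```python
import json

# Hypothetical MCP tool invocation, shown as the JSON-RPC 2.0 messages
# the protocol is built on. "search_docs" and its arguments are made up.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",  # MCP method for invoking a server-side tool
    "params": {
        "name": "search_docs",
        "arguments": {"query": "transformer attention"},
    },
}

# The server replies with the tool's output as structured content:
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {"content": [{"type": "text", "text": "...matching passages..."}]},
}

print(json.dumps(request, indent=2))
```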
April 2025
In-context learning
In-context learning is a prompting technique that allows LLMs to learn and adapt to new tasks based on examples provided within the input prompt…
Apr 11 • Dasha Herrmannova
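A minimal sketch of the idea: the only "training data" is a handful of labeled examples placed directly in the prompt, and the model is expected to continue the pattern. The `llm()` call is a placeholder for any chat-completion API.

```python
# Few-shot in-context learning: the examples in the prompt are the only
# supervision the model sees for this task.
prompt = """Classify the sentiment of each review as positive or negative.

Review: "The plot dragged and the acting was wooden."
Sentiment: negative

Review: "A stunning, heartfelt film from start to finish."
Sentiment: positive

Review: "I checked my watch every five minutes."
Sentiment:"""

# answer = llm(prompt)  # placeholder; the expected completion is "negative"
print(prompt)
```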
March 2025
Reflection agents
Reflection is a simple but powerful technique for improving the quality of LLM responses by prompting the LLM to reflect on its own output to identify…
Mar 30 • Dasha Herrmannova
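A bare-bones sketch of the loop, assuming a hypothetical `llm(prompt)` helper that wraps any chat-completion API: generate a draft, ask the model to critique it, then ask it to revise.

```python
def reflect_and_revise(llm, task: str, rounds: int = 2) -> str:
    """Generate -> critique -> revise, repeated for a fixed number of rounds."""
    draft = llm(f"Answer the following task:\n{task}")
    for _ in range(rounds):
        critique = llm(
            f"Task: {task}\nDraft answer: {draft}\n"
            "List concrete flaws in this draft: errors, gaps, unclear steps."
        )
        draft = llm(
            f"Task: {task}\nDraft answer: {draft}\nCritique: {critique}\n"
            "Rewrite the answer, fixing every issue raised in the critique."
        )
    return draft
```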
Top-k and top-p sampling
Top-k and top-p sampling are methods used to control the randomness and diversity of LLM outputs.
Mar 20 • Dasha Herrmannova
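Both methods filter the next-token distribution before sampling: top-k keeps the k most probable tokens, top-p keeps the smallest set whose cumulative probability reaches p. A minimal numpy sketch:

```python
import numpy as np

def sample(logits: np.ndarray, top_k: int = 0, top_p: float = 1.0) -> int:
    """Sample a token id after top-k and/or top-p (nucleus) filtering."""
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    if top_k > 0:  # zero out everything below the k-th largest probability
        cutoff = np.sort(probs)[-top_k]
        probs = np.where(probs >= cutoff, probs, 0.0)
    if top_p < 1.0:  # keep the smallest set with cumulative mass >= top_p
        order = np.argsort(probs)[::-1]
        cum = np.cumsum(probs[order])
        keep = order[: np.searchsorted(cum, top_p) + 1]
        filtered = np.zeros_like(probs)
        filtered[keep] = probs[keep]
        probs = filtered
    probs /= probs.sum()  # renormalize and sample
    return int(np.random.choice(len(probs), p=probs))

print(sample(np.array([2.0, 1.0, 0.5, -1.0, -3.0]), top_k=3, top_p=0.9))
```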
Reasoning Models
Reasoning models are a new class of LLMs designed to solve complex problems in domains such as math and coding.
Mar 12 • Dasha Herrmannova
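These models emit a long chain of thought before the final answer. One common output convention, used by DeepSeek-R1 for example, wraps the thinking in `<think>` tags; here is a small helper that splits the two parts (the example output is invented):

```python
import re

def split_reasoning(output: str) -> tuple[str, str]:
    """Separate the chain of thought from the final answer."""
    match = re.search(r"<think>(.*?)</think>\s*(.*)", output, re.DOTALL)
    if match:
        return match.group(1).strip(), match.group(2).strip()
    return "", output.strip()

thinking, answer = split_reasoning(
    "<think>2 + 2 = 4, and 4 * 3 = 12.</think>The answer is 12."
)
print(answer)  # -> The answer is 12.
```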
Group Relative Policy Optimization
Group Relative Policy Optimization (GRPO) is a reinforcement learning algorithm that was used to train DeepSeek-R1.
Mar 3 • Dasha Herrmannova
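The core move in GRPO: sample a group of responses to the same prompt, score each one, and use the group-normalized reward as the advantage, with no learned value function (unlike PPO). A numpy sketch of that step, with illustrative rewards:

```python
import numpy as np

def group_relative_advantages(rewards: np.ndarray) -> np.ndarray:
    """rewards: scores for a group of sampled responses to one prompt."""
    return (rewards - rewards.mean()) / (rewards.std() + 1e-8)

rewards = np.array([0.0, 1.0, 1.0, 0.0])   # e.g. correctness of 4 samples
print(group_relative_advantages(rewards))  # above-average samples get A > 0
```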
February 2025
Proximal Policy Optimization
Proximal Policy Optimization is frequently used in Reinforcement Learning from Human Feedback to further train LLMs after supervised fine-tuning. It was…
Feb 25 • Dasha Herrmannova
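PPO's signature piece is the clipped surrogate objective: push the policy toward higher-advantage actions, but clip the probability ratio so each update stays close to the old policy. A per-token numpy sketch with illustrative values:

```python
import numpy as np

def ppo_clip_loss(logp_new, logp_old, advantages, eps: float = 0.2) -> float:
    ratio = np.exp(logp_new - logp_old)      # pi_new(a|s) / pi_old(a|s)
    clipped = np.clip(ratio, 1 - eps, 1 + eps)
    return -np.mean(np.minimum(ratio * advantages, clipped * advantages))

logp_new = np.log(np.array([0.30, 0.05]))
logp_old = np.log(np.array([0.25, 0.10]))
print(ppo_clip_loss(logp_new, logp_old, np.array([1.0, -0.5])))
```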
RLHF
Reinforcement Learning from Human Feedback is a training phase used on LLMs after supervised fine-tuning to further improve LLM responses. It was one of…
Feb 15 • Dasha Herrmannova
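The first learned component in RLHF is a reward model trained on human preference pairs. The standard Bradley-Terry loss pushes the reward of the preferred response above the rejected one; the scores below are illustrative:

```python
import numpy as np

def preference_loss(r_chosen: np.ndarray, r_rejected: np.ndarray) -> float:
    """-log sigmoid(r_chosen - r_rejected), averaged over preference pairs."""
    return float(np.mean(np.logaddexp(0.0, -(r_chosen - r_rejected))))

print(preference_loss(np.array([2.0, 1.5]), np.array([0.5, 1.0])))
```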
ReAct Agent Model
ReAct (Reason + Act) is a design pattern for AI agents that incorporates planning and action execution. It has become a common way to implement agents.
Feb 6 • Dasha Herrmannova
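A skeletal version of the loop, assuming a hypothetical `llm()` helper and the `tool_name[input]` action format from the ReAct paper: the model alternates Thought, Action, and Observation until it produces a final answer.

```python
def react(llm, tools: dict, question: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(transcript)          # model emits a Thought, then either
        transcript += step + "\n"       # an Action or a final answer
        if "Final Answer:" in step:
            return step.split("Final Answer:", 1)[1].strip()
        if "Action:" in step:           # assumed format: Action: tool_name[input]
            call = step.split("Action:", 1)[1].strip()
            name, arg = call.split("[", 1)
            observation = tools[name.strip()](arg.rstrip("]"))
            transcript += f"Observation: {observation}\n"
    return transcript
```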
January 2025
Knowledge Distillation
Knowledge Distillation is a popular technique for transferring knowledge from large, powerful models to smaller, more efficient models.
Jan 29 • Dasha Herrmannova
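The classic formulation (from Hinton et al.'s distillation paper) trains the student to match the teacher's temperature-softened output distribution. A numpy sketch of that loss with illustrative logits:

```python
import numpy as np

def softmax(z: np.ndarray, T: float) -> np.ndarray:
    e = np.exp((z - z.max()) / T)   # temperature T > 1 softens the distribution
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, T: float = 2.0) -> float:
    p_teacher = softmax(teacher_logits, T)   # soft targets from the teacher
    p_student = softmax(student_logits, T)
    # cross-entropy to soft targets, scaled by T^2 as in the original paper
    return float(-(p_teacher * np.log(p_student + 1e-12)).sum() * T**2)

print(distillation_loss(np.array([4.0, 1.0, 0.2]), np.array([3.0, 1.5, 0.5])))
```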
Mixture of Experts
Mixture of Experts (MoE) is an ensemble learning technique that enables creating larger models without increasing training and inference cost.
Jan 19 • Dasha Herrmannova
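A toy sketch of top-k routing, the mechanism behind that claim: a router scores the experts for each input and only the best k run, so compute stays roughly flat as experts are added. All shapes and weights here are made up:

```python
import numpy as np

def moe_forward(x, router_w, experts, k: int = 2):
    scores = x @ router_w                    # router logits, one per expert
    top = np.argsort(scores)[-k:]            # indices of the k best experts
    gates = np.exp(scores[top] - scores[top].max())
    gates /= gates.sum()                     # softmax over the selected experts
    return sum(g * experts[i](x) for g, i in zip(gates, top))

rng = np.random.default_rng(0)
experts = [lambda x, W=rng.normal(size=(8, 8)): x @ W for _ in range(4)]
print(moe_forward(rng.normal(size=8), rng.normal(size=(8, 4)), experts))
```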
July 2024
Low-Rank Adaptation
Low-Rank Adaptation (LoRA) is a popular method for Parameter-Efficient Fine-Tuning of Large Language Models. LoRA significantly improves fine-tuning…
Jul 25, 2024 • Dasha Herrmannova
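The trick behind the efficiency gain: freeze the pretrained weight W and learn only a low-rank update B @ A, so r * (d_in + d_out) parameters train instead of d_in * d_out. A numpy sketch with made-up dimensions:

```python
import numpy as np

d_in, d_out, r, alpha = 512, 512, 8, 16
rng = np.random.default_rng(0)
W = rng.normal(size=(d_out, d_in))       # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01    # trainable, rank r
B = np.zeros((d_out, r))                 # trainable, zero-init so training
                                         # starts from the pretrained model

def lora_forward(x: np.ndarray) -> np.ndarray:
    return W @ x + (alpha / r) * (B @ (A @ x))

print(lora_forward(rng.normal(size=d_in)).shape)  # (512,)
```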