SmolLM Fine-tuning

2024

This project fine-tunes HuggingFace's SmolLM language model on a toy instruction dataset, with the training code written from scratch as part of my language-models-from-scratch series. It targets the latest SmolLM family released by HuggingFace. The repository explores multiple prompt formats (Alpaca, Phi-3, SmolLM), LoRA-based fine-tuning, and model evaluation with an AI-as-a-judge across different training configurations.
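To illustrate the prompt-format comparison, here is a minimal sketch of how the three styles might be rendered. The `format_prompt` helper is hypothetical (not from the repository), and the exact special-token strings for the Phi-3 and SmolLM templates are assumptions based on common conventions for those models:

```python
def format_prompt(instruction: str, response: str = "", style: str = "alpaca") -> str:
    """Render an instruction/response pair in one of three prompt styles.

    Token strings below are assumed from common conventions, not the repo.
    """
    if style == "alpaca":
        # Classic Alpaca template: preamble plus section headers.
        return (
            "Below is an instruction that describes a task. "
            "Write a response that appropriately completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n### Response:\n{response}"
        )
    if style == "phi3":
        # Phi-3-style turn markers.
        return f"<|user|>\n{instruction}<|end|>\n<|assistant|>\n{response}"
    if style == "smollm":
        # SmolLM-Instruct uses a ChatML-style template.
        return (
            f"<|im_start|>user\n{instruction}<|im_end|>\n"
            f"<|im_start|>assistant\n{response}<|im_end|>"
        )
    raise ValueError(f"unknown style: {style}")
```

Keeping the formats behind one function makes it easy to sweep training runs over styles while holding the dataset and LoRA configuration fixed.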

Repository
GitHub
Platform
Jupyter Notebook
Stack
PyTorch, Transformers, Tiktoken
Figure: SmolLM training loss