TL;DR

Liquid AI publishes an end-to-end fine-tuning tutorial for LFM2 using 4-bit QLoRA, SFT, and DPO on Google Colab. The lightweight approach cuts training costs and enables on-device deployment, which matters for teams weighing inference latency, privacy, and local AI stacks.

In this tutorial, we fine-tune Liquid AI’s LFM2 model through a complete open-source workflow. We load the base LFM2 checkpoint in 4-bit precision for QLoRA, prepare a chat-style supervised fine-tuning (SFT) dataset, train a lightweight LoRA adapter using TRL and PEFT, and then merge the adapter back into the model. We then extend the workflow with DPO, which uses pairs of chosen and rejected answers to steer the model toward preferred responses. The result is a practical pipeline that moves from a base LFM2 model to an SFT-tuned, preference-aligned checkpoint, ready for further testing or deployment.
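To make the preference step concrete, here is a minimal sketch of the DPO objective for a single chosen/rejected pair. It is plain Python rather than TRL, so it is not the tutorial's training code. The function names, the example log-probabilities, and `beta=0.1` are illustrative assumptions, and the record layout with `prompt`, `chosen`, and `rejected` keys is shown only as one common way to store preference data.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair.

    Inputs are summed log-probabilities of each full answer under the
    model being trained (policy) and under the frozen reference model.
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)) is the same as softplus(-margin).
    return softplus(-margin)


# Hypothetical preference record (assumed layout, made-up text).
pair = {
    "prompt": "Explain LoRA in one sentence.",
    "chosen": "LoRA trains small low-rank matrices instead of all weights.",
    "rejected": "LoRA is a kind of radio.",
}

# Made-up log-probabilities, for illustration only.
loss_at_start = dpo_loss(-10.0, -12.0, -10.0, -12.0)  # policy == reference
loss_improved = dpo_loss(-8.0, -14.0, -10.0, -12.0)   # policy favors "chosen"
```

When the policy matches the reference, the margin is zero and the loss equals log 2. The loss falls as the policy raises the chosen answer's probability relative to the rejected one, measured against the reference model.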

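# Install the fine-tuning stack (Colab cell).
!pip install -q -U "transformers>=4.55" "trl>=0.12" "peft>=0.13" "datasets>=2.20" "accelerate>=0.34" bitsandbytes

# Core imports: memory cleanup, tensors, datasets, and model/quantization loaders.
import gc
import torch
from datasets import load_dataset, Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig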
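The adapter-merging step can be illustrated with the arithmetic behind it. A LoRA adapter stores two small matrices, B and A, and merging folds their scaled product into the frozen weight: W' = W + (alpha / r) * B A. The sketch below uses plain Python lists and toy values. The helper names `matmul` and `merge_lora` are hypothetical, and in practice PEFT performs this step through `merge_and_unload()` rather than hand-written code.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def merge_lora(W, A, B, alpha, r):
    """Return W + (alpha / r) * (B @ A).

    W: d_out x d_in frozen weight
    B: d_out x r, A: r x d_in (the trained low-rank adapter)
    """
    scale = alpha / r
    delta = matmul(B, A)  # d_out x d_in update
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]


# Toy example with rank r = 1 (values are made up).
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
B = [[1.0],
     [2.0]]
A = [[0.5, 0.0, -1.0]]

merged = merge_lora(W, A, B, alpha=2, r=1)
```

After merging, the model needs no adapter at inference time, because a single matrix multiply with the merged weight gives the same output as the base weight plus the adapter path.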

marktechpost.com

How to Fine-Tune LFM2 Using QLoRA and DPO: A Complete Step-by-Step Coding Tutorial on Google Colab

Fine-tune Liquid AI LFM2 using QLoRA, SFT, DPO, and adapter merging in a complete Google Colab pipeline.

Wednesday, June 3, 2026


