You can now deploy, run, and fine-tune large language models on Replicate.
We’ve got official versions of FLAN-T5, GPT-J, and LLaMA, and you can push your own custom models too. We’re also releasing a preview of fine-tuning for language models.
Like any other model on Replicate, language models can be run with just a couple of lines of code.
You can run them from Python, Node.js, or the HTTP API, without having to set up servers or GPUs.
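As a rough sketch, here is what running a language model from Python might look like. It assumes the `replicate` client package is installed and `REPLICATE_API_TOKEN` is set in your environment. The model reference is a placeholder rather than a real identifier, and the `prompt` input name is an assumption: each model defines its own input schema, so check the model's page for the exact fields. `generate` and `MODEL_REF` are illustrative names, not part of the client.

```python
# Placeholder reference: replace with a real "owner/name:version" string
# copied from the model's page on Replicate.
MODEL_REF = "owner/model-name:version-id"


def generate(prompt, run=None):
    """Send a prompt to a hosted language model and return its text output.

    `run` defaults to `replicate.run`; passing a stand-in callable lets you
    try the function without network access.
    """
    if run is None:
        import replicate  # imported lazily so the sketch loads without the package
        run = replicate.run

    # Assumption: the model accepts a single "prompt" field.
    output = run(MODEL_REF, input={"prompt": prompt})

    # Language models often stream their output as chunks of text;
    # join the chunks unless a plain string came back.
    return output if isinstance(output, str) else "".join(output)


# Usage (needs a real model reference and an API token):
# print(generate("Write a haiku about llamas"))
```

The call itself is a single line; the rest of the function only normalizes the output, since some models return one string and others return a sequence of chunks.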
Try them out:








