Fine-tuning (deep learning)

<h2 id="robustness">Robustness</h2>
<p>Fine-tuning can degrade a model's robustness to <a href="/facts/Domain_adaptation/eq5cbR9K">distribution shifts</a>.<a class="footnote-ref" id="fnref:11" href="#fn:11"><sup>11</sup></a><a class="footnote-ref" id="fnref:12" href="#fn:12"><sup>12</sup></a> One mitigation is to linearly interpolate a fine-tuned model's weights with the weights of the original model, which can greatly increase out-of-distribution performance while largely retaining the in-distribution performance of the fine-tuned model.<a class="footnote-ref" id="fnref:13" href="#fn:13"><sup>13</sup></a>
</p>
<h2 id="variants">Variants</h2>
<h3>Low-rank adaptation</h3>
<p>Low-rank adaptation (LoRA) is an adapter-based technique for efficiently fine-tuning models. The basic idea is to design a low-<a href="/facts/Rank_of_a_matrix/Yp9b5LK3">rank</a> matrix that is then added to the original matrix.<a class="footnote-ref" id="fnref:14" href="#fn:14"><sup>14</sup></a> An adapter, in this context, is a collection of low-rank matrices which, when added to a base model, produces a fine-tuned model. It allows for performance that approaches full-model fine-tuning with lower space requirements. A language model with billions of parameters may be LoRA fine-tuned with only several millions of parameters.
</p><p>LoRA-based fine-tuning has become popular in the <a href="/facts/Stable_Diffusion/BaXu5Vc8">Stable Diffusion</a> community.<a class="footnote-ref" id="fnref:15" href="#fn:15"><sup>15</sup></a> Support for LoRA was integrated into the Diffusers library from <a href="/facts/Hugging_Face/h8kkdUus">Hugging Face</a>.<a class="footnote-ref" id="fnref:16" href="#fn:16"><sup>16</sup></a> Support for LoRA and similar techniques is also available for a wide range of other models through Hugging Face's Parameter-Efficient Fine-Tuning (PEFT) package.<a class="footnote-ref" id="fnref:17" href="#fn:17"><sup>17</sup></a>
</p>
<h3>Representation fine-tuning</h3>

<p>Representation fine-tuning (ReFT) is a technique developed by researchers at <a href="/facts/Stanford_University/JEQeF9z2">Stanford University</a> aimed at fine-tuning large language models (LLMs) by modifying less than 1% of their representations. Unlike traditional parameter-efficient fine-tuning (PEFT) methods, which mainly focus on updating weights, ReFT targets specific parts of the model relevant to the task being fine-tuned. This approach is based on the understanding that deep learning models encode rich semantic information in their representations, suggesting that modifying representations might be a more effective strategy than updating weights.<a class="footnote-ref" id="fnref:18" href="#fn:18"><sup>18</sup></a>
</p><p>ReFT methods operate on a frozen base model and learn task-specific interventions on hidden representations and train interventions that manipulate a small fraction of model representations to steer model behaviors towards solving downstream tasks at inference time. One specific method within the ReFT family is Low-rank Linear Subspace ReFT (LoReFT), which intervenes on hidden representations in the linear subspace spanned by a low-rank projection matrix.<a class="footnote-ref" id="fnref:19" href="#fn:19"><sup>19</sup></a> LoReFT can be seen as the representation-based equivalent of Low-rank Adaptation (LoRA).
</p>
<h2 id="applications">Applications</h2>
<h3>Natural language processing</h3>
<p>Fine-tuning is common in <a href="/facts/Natural_language_processing/1hjMKsSN">natural language processing</a> (NLP), especially in the domain of <a href="/facts/Language_modeling/FntSpg0j">language modeling</a>. <a href="/facts/Large_language_model/WnogWVJY">Large language models</a> like <a href="/facts/OpenAI/V7WVK1t4">OpenAI</a>'s series of <a href="/facts/GPT_Foundational_models/myah4Tjj">GPT foundation models</a> can be fine-tuned on data for specific downstream NLP tasks (tasks that use a pre-trained model) to improve performance over the unmodified pre-trained model.<a class="footnote-ref" id="fnref:20" href="#fn:20"><sup>20</sup></a>
</p>
<h2 id="commercial-models">Commercial models</h2>
<p>Commercially-offered large language models can sometimes be fine-tuned if the provider offers a fine-tuning API. As of June 19, 2023, language model fine-tuning APIs are offered by <a href="/facts/OpenAI/V7WVK1t4">OpenAI</a> and <a href="/facts/Microsoft_Azure/CCnIHTHd">Microsoft Azure</a>'s Azure OpenAI Service for a subset of their models, as well as by <a href="/facts/Google_Cloud_Platform/lPTNk7l6">Google Cloud Platform</a> for some of their <a href="/facts/PaLM/wG7N5sz5">PaLM</a> models, and by others.<a class="footnote-ref" id="fnref:21" href="#fn:21"><sup>21</sup></a><a class="footnote-ref" id="fnref:22" href="#fn:22"><sup>22</sup></a><a class="footnote-ref" id="fnref:23" href="#fn:23"><sup>23</sup></a>
</p>
<h2 id="see-also">See also</h2>
<ul><li><a href="/facts/Catastrophic_forgetting/Jy2QB0RG">Catastrophic forgetting</a></li>
<li><a href="/facts/Continual_learning/MQ3cElgK">Continual learning</a></li>
<li><a href="/facts/Domain_adaptation/eq5cbR9K">Domain adaptation</a></li>
<li><a href="/facts/Foundation_model/67pefJcT">Foundation model</a></li>
<li><a href="/facts/Hyperparameter_optimization/bzcJJrxs">Hyperparameter optimization</a></li>
<li><a href="/facts/Overfitting/5xnFLcMg">Overfitting</a></li></ul>

<h2 id="references">References</h2>

<ol>
<li id="fn:1"><p>Quinn, Joanne (2020). Dive into deep learning: tools for engagement. Thousand Oaks, California. p. 551. ISBN 978-1-5443-6137-6. Archived from the original on January 10, 2023. Retrieved January 10, 2023.{{cite book}}:  CS1 maint: location missing publisher (link) <a href="978-1-5443-6137-6" target="_blank">978-1-5443-6137-6</a> <a href="#fnref:1" class="footnote-back-ref">↩</a></p></li>
<li id="fn:2"><p>"CS231n Convolutional Neural Networks for Visual Recognition". cs231n.github.io. Retrieved 9 March 2023. <a href="https://cs231n.github.io/transfer-learning/" target="_blank">https://cs231n.github.io/transfer-learning/</a> <a href="#fnref:2" class="footnote-back-ref">↩</a></p></li>
<li id="fn:3"><p>Liu, Haokun; Tam, Derek; Muqeeth, Mohammed; Mohta, Jay; Huang, Tenghao; Bansal, Mohit; Raffel, Colin A (2022). Koyejo, S.; Mohamed, S.; Agarwal, A.; Belgrave, D.; Cho, K.; Oh, A. (eds.). Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning (PDF). Advances in Neural Information Processing Systems. Vol. 35. Curran Associates, Inc. pp. 1950–1965. <a href="https://proceedings.neurips.cc/paper_files/paper/2022/file/0cde695b83bd186c1fd456302888454c-Paper-Conference.pdf" target="_blank">https://proceedings.neurips.cc/paper_files/paper/2022/file/0cde695b83bd186c1fd456302888454c-Paper-Conference.pdf</a> <a href="#fnref:3" class="footnote-back-ref">↩</a></p></li>
<li id="fn:4"><p>"CS231n Convolutional Neural Networks for Visual Recognition". cs231n.github.io. Retrieved 9 March 2023. <a href="https://cs231n.github.io/transfer-learning/" target="_blank">https://cs231n.github.io/transfer-learning/</a> <a href="#fnref:4" class="footnote-back-ref">↩</a></p></li>
<li id="fn:5"><p>Zeiler, Matthew D; Fergus, Rob (2013). "Visualizing and Understanding Convolutional Networks". ECCV. arXiv:1311.2901. <a href="/wiki/ArXiv_(identifier)" target="_blank">/wiki/ArXiv_(identifier)</a> <a href="#fnref:5" class="footnote-back-ref">↩</a></p></li>
<li id="fn:6"><p>Dodge, Jesse; Ilharco, Gabriel; Schwartz, Roy; Farhadi, Ali; Hajishirzi, Hannaneh; Smith, Noah (2020). "Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping". arXiv:2002.06305. {{cite journal}}: Cite journal requires |journal= (help) <a href="/wiki/ArXiv_(identifier)" target="_blank">/wiki/ArXiv_(identifier)</a> <a href="#fnref:6" class="footnote-back-ref">↩</a></p></li>
<li id="fn:7"><p>Dingliwal, Saket; Shenoy, Ashish; Bodapati, Sravan; Gandhe, Ankur; Gadde, Ravi Teja; Kirchhoff, Katrin (2021). "Prompt Tuning GPT-2 language model for parameter-efficient domain adaptation of ASR systems". InterSpeech. arXiv:2112.08718. <a href="/wiki/ArXiv_(identifier)" target="_blank">/wiki/ArXiv_(identifier)</a> <a href="#fnref:7" class="footnote-back-ref">↩</a></p></li>
<li id="fn:8"><p>Yu, Yue; Zuo, Simiao; Jiang, Haoming; Ren, Wendi; Zhao, Tuo; Zhang, Chao (2020). "Fine-Tuning Pre-trained Language Model with Weak Supervision: A Contrastive-Regularized Self-Training Approach". Association for Computational Linguistics. arXiv:2010.07835. <a href="/wiki/ArXiv_(identifier)" target="_blank">/wiki/ArXiv_(identifier)</a> <a href="#fnref:8" class="footnote-back-ref">↩</a></p></li>
<li id="fn:9"><p>"Introducing ChatGPT". openai.com. Retrieved 9 March 2023. <a href="https://openai.com/blog/chatgpt" target="_blank">https://openai.com/blog/chatgpt</a> <a href="#fnref:9" class="footnote-back-ref">↩</a></p></li>
<li id="fn:10"><p>Glaese, Amelia; McAleese, Nat; Trębacz, Maja; Aslanides, John; Firoiu, Vlad; Ewalds, Timo; Rauh, Maribeth; Weidinger, Laura; Chadwick, Martin; Thacker, Phoebe; Campbell-Gillingham, Lucy; Uesato, Jonathan; Huang, Po-Sen; Comanescu, Ramona; Yang, Fan; See, Abigail; Dathathri, Sumanth; Greig, Rory; Chen, Charlie; Fritz, Doug; Elias, Jaume Sanchez; Green, Richard; Mokrá, Soňa; Fernando, Nicholas; Wu, Boxi; Foley, Rachel; Young, Susannah; Gabriel, Iason; Isaac, William; Mellor, John; Hassabis, Demis; Kavukcuoglu, Koray; Hendricks, Lisa Anne; Irving, Geoffrey (2022). "Improving alignment of dialogue agents via targeted human judgements". DeepMind. arXiv:2209.14375. <a href="/wiki/ArXiv_(identifier)" target="_blank">/wiki/ArXiv_(identifier)</a> <a href="#fnref:10" class="footnote-back-ref">↩</a></p></li>
<li id="fn:11"><p>Radford, Alec; Kim, Jong Wook; Hallacy, Chris; Ramesh, Aditya; Goh, Gabriel; Agarwal, Sandhini; Sastry, Girish; Askell, Amanda; Mishkin, Pamela; Clark, Jack; Krueger, Gretchen; Sutskever, Ilya (2021). "Learning Transferable Visual Models From Natural Language Supervision". arXiv:2103.00020 [cs.CV]. <a href="/wiki/ArXiv_(identifier)" target="_blank">/wiki/ArXiv_(identifier)</a> <a href="#fnref:11" class="footnote-back-ref">↩</a></p></li>
<li id="fn:12"><p>Kumar, Ananya; Raghunathan, Aditi; Jones, Robbie; Ma, Tengyu; Liang, Percy (2022). "Fine-Tuning can Distort Pretrained Features and Underperform Out-of-Distribution". ICLR. arXiv:2202.10054. <a href="/wiki/ArXiv_(identifier)" target="_blank">/wiki/ArXiv_(identifier)</a> <a href="#fnref:12" class="footnote-back-ref">↩</a></p></li>
<li id="fn:13"><p>Wortsman, Mitchell; Ilharco, Gabriel; Kim, Jong Wook; Li, Mike; Kornblith, Simon; Roelofs, Rebecca; Gontijo-Lopes, Raphael; Hajishirzi, Hannaneh; Farhadi, Ali; Namkoong, Hongseok; Schmidt, Ludwig (2022). "Robust fine-tuning of zero-shot models". arXiv:2109.01903 [cs.CV]. <a href="/wiki/ArXiv_(identifier)" target="_blank">/wiki/ArXiv_(identifier)</a> <a href="#fnref:13" class="footnote-back-ref">↩</a></p></li>
<li id="fn:14"><p>Hu, Edward J.; Shen, Yelong; Wallis, Phillip; Allen-Zhu, Zeyuan; Li, Yuanzhi; Wang, Shean; Wang, Lu; Chen, Weizhu (2022-01-28). "LoRA: Low-Rank Adaptation of Large Language Models". ICLR. arXiv:2106.09685. <a href="https://openreview.net/forum?id=nZeVKeeFYf9" target="_blank">https://openreview.net/forum?id=nZeVKeeFYf9</a> <a href="#fnref:14" class="footnote-back-ref">↩</a></p></li>
<li id="fn:15"><p>Ryu, Simo (February 13, 2023). "Using Low-rank adaptation to quickly fine-tune diffusion models". GitHub. Retrieved June 19, 2023. <a href="https://github.com/cloneofsimo/lora" target="_blank">https://github.com/cloneofsimo/lora</a> <a href="#fnref:15" class="footnote-back-ref">↩</a></p></li>
<li id="fn:16"><p>Cuenca, Pedro; Paul, Sayak (January 26, 2023). "Using LoRA for Efficient Stable Diffusion Fine-Tuning". Hugging Face. Retrieved June 19, 2023. <a href="https://huggingface.co/blog/lora" target="_blank">https://huggingface.co/blog/lora</a> <a href="#fnref:16" class="footnote-back-ref">↩</a></p></li>
<li id="fn:17"><p>"Parameter-Efficient Fine-Tuning using 🤗 PEFT". huggingface.co. Retrieved 2023-06-20. <a href="https://huggingface.co/blog/peft" target="_blank">https://huggingface.co/blog/peft</a> <a href="#fnref:17" class="footnote-back-ref">↩</a></p></li>
<li id="fn:18"><p>Wu, Zhengxuan; Arora, Aryaman; Wang, Zheng; Geiger, Atticus; Jurafsky, Dan; Manning, Christopher D.; Potts, Christopher (2024-04-07), ReFT: Representation Finetuning for Language Models, arXiv:2404.03592 <a href="/wiki/ArXiv_(identifier)" target="_blank">/wiki/ArXiv_(identifier)</a> <a href="#fnref:18" class="footnote-back-ref">↩</a></p></li>
<li id="fn:19"><p>Wu, Zhengxuan; Arora, Aryaman; Wang, Zheng; Geiger, Atticus; Jurafsky, Dan; Manning, Christopher D.; Potts, Christopher (2024-04-07), ReFT: Representation Finetuning for Language Models, arXiv:2404.03592 <a href="/wiki/ArXiv_(identifier)" target="_blank">/wiki/ArXiv_(identifier)</a> <a href="#fnref:19" class="footnote-back-ref">↩</a></p></li>
<li id="fn:20"><p>Dingliwal, Saket; Shenoy, Ashish; Bodapati, Sravan; Gandhe, Ankur; Gadde, Ravi Teja; Kirchhoff, Katrin (2021). "Prompt Tuning GPT-2 language model for parameter-efficient domain adaptation of ASR systems". InterSpeech. arXiv:2112.08718. <a href="/wiki/ArXiv_(identifier)" target="_blank">/wiki/ArXiv_(identifier)</a> <a href="#fnref:20" class="footnote-back-ref">↩</a></p></li>
<li id="fn:21"><p>"Fine-tuning". OpenAI. Retrieved 2023-06-19. <a href="https://platform.openai.com/docs/guides/fine-tuning" target="_blank">https://platform.openai.com/docs/guides/fine-tuning</a> <a href="#fnref:21" class="footnote-back-ref">↩</a></p></li>
<li id="fn:22"><p>"Learn how to customize a model for your application". Microsoft. Retrieved 2023-06-19. <a href="https://learn.microsoft.com/en-us/azure/cognitive-services/openai/how-to/fine-tuning" target="_blank">https://learn.microsoft.com/en-us/azure/cognitive-services/openai/how-to/fine-tuning</a> <a href="#fnref:22" class="footnote-back-ref">↩</a></p></li>
<li id="fn:23"><p>"Tune text foundation models". Retrieved 2023-06-19. <a href="https://cloud.google.com/vertex-ai/docs/generative-ai/models/tune-models" target="_blank">https://cloud.google.com/vertex-ai/docs/generative-ai/models/tune-models</a> <a href="#fnref:23" class="footnote-back-ref">↩</a></p></li>
</ol>

Fine-tuning (deep learning) open-in-new

Fine-tuning (deep learning)