<?xml version="1.0" encoding='utf-8'?>
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN" "http://www.wapforum.org/DTD/wml_1.1.xml">
<wml>
<card id="card1" title="Large language model - Page 20 - Wikipedia">
<p>
<a accesskey="1" href="page.php?w=Large_language_model&amp;p=19">1.Previous</a><br />
<a accesskey="3" href="page.php?w=Large_language_model&amp;p=21">3.Next</a>
</p>
<p>large amount of data, before being fine-tuned.</p>

<p><big> Cost </big></p>
<p>Training the largest models requires substantial infrastructure. The trend toward larger models is visible in the <a href="page.php?w=list_of_large_language_models">list of large language models</a>. For example, training GPT-2 (a 1.5-billion-parameter model) in 2019 cost $50,000, while training <a href="page.php?w=PaLM">PaLM</a> (a 540-billion-parameter model) in 2022 cost $8 million, and training Megatron-Turing NLG 530B in 2021 cost around $11</p><p>
</p>

<do type="prev" label="Search">
        <go href="search.wml"/>
</do>

</card>
</wml>
