Which companies will be able to use and develop GPT-3? The model has 175 billion parameters, which at 16-bit precision works out to more than 350GB of memory. Training is estimated to cost between $5m and $12m per run, with the lower bound depending on the team's ability to "get everything right."
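A quick back-of-envelope calculation shows where the 350GB figure comes from. The 2-bytes-per-parameter (fp16) storage assumption is ours; the cost estimates above don't state a precision:

```python
# Back-of-envelope check on the 350 GB memory figure.
params = 175e9            # 175 billion parameters
bytes_per_param = 2       # assumes fp16 (16-bit) storage, our assumption
memory_gb = params * bytes_per_param / 1e9
print(f"{memory_gb:,.0f} GB")  # -> 350 GB
```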
Compare this to the estimated $6,912 it took to train Google BERT. BERT is a bidirectional transformer model that set the state of the art on 11 natural language processing tasks and has become a standard baseline for language models today. BERT-Large has roughly 340 million parameters.
GPT-2 had 1.5 billion parameters and reportedly cost about $256 per hour to train. At 175 billion parameters, GPT-3 is two orders of magnitude larger and continues a trend of exponentially increasing AI costs, with state-of-the-art (SOTA) language models growing by roughly a factor of 10 every year. The one-time training cost of $5m-$12m also excludes team salaries, which add approximately $10m.
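To put the $256/hour figure in context, here is a minimal sketch of what a GPT-2 training bill might look like. The one-week run length is a hypothetical assumption for illustration; the article only gives the hourly rate:

```python
# Extrapolating a GPT-2 training bill from the quoted $256/hour rate.
hourly_rate = 256              # USD per hour, as quoted above
hours = 7 * 24                 # assumed one-week training run (our assumption)
total = hourly_rate * hours
print(f"${total:,}")           # -> $43,008
```

Even under this generous assumption, a full GPT-2 run lands in the tens of thousands of dollars, against the millions estimated for GPT-3.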
With what appears to be an exponential increase in the capital required to train these models, some companies seem better positioned than others, namely the major cloud providers: Amazon (AWS), Microsoft (Azure), and Google (GCP). Other well-capitalized companies may be able to train GPT-3-scale models as well, but they will probably have to rent significant GPU capacity from the three largest cloud providers noted above, which becomes another major cost for anyone that doesn't own its own cloud infrastructure.
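As a rough sketch of what that rental bill implies, the snippet below converts the $5m-$12m training budget into rented GPU-hours. The $3 per GPU-hour price is a hypothetical placeholder, not a figure from the article or any provider's rate card:

```python
# Sketch: converting the $5m-$12m training budget into rented GPU-hours.
gpu_hourly_rate = 3.0                          # USD per GPU-hour (assumed)
for budget in (5e6, 12e6):
    gpu_hours = budget / gpu_hourly_rate
    print(f"${budget / 1e6:.0f}m buys ~{gpu_hours / 1e6:.1f}M GPU-hours")
# -> $5m buys ~1.7M GPU-hours, $12m buys ~4.0M GPU-hours
```

At cloud on-demand prices in that neighborhood, the budget translates into millions of GPU-hours, which is exactly the kind of spend that favors companies that already own their infrastructure.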