The LLM bubble might be about to burst (but not for the reason you think)
Ben Turner: Chinese Researchers Just Built an Open-Source Rival to ChatGPT in 2 Months. Silicon Valley Is Freaked Out.
Now, R1 has also surpassed ChatGPT's latest o1 model in many of the same tests. This impressive performance at a fraction of the cost of other models, its semi-open-source nature, and its training on significantly less graphics processing units (GPUs) has wowed AI experts and raised the specter of China's AI models surpassing their U.S. counterparts.
As I wrote just a few weeks ago:
I think this speaks to a bubble on the one hand, as every executive is going to want to advocate for more investment now, but things like DeepSeek v3 also point towards radically cheaper training in the future.
The latest DeepSeek model was monumentally less energy intensive to train, massively less energy intensive to use, and performs at the same level as the best OpenAI and Anthropic have to offer consumers today. That’s pretty remarkable, and speaks to my continued feeling that LLMs are going to get past the power and cost problems they have today. Hell, using Ollama, I have been able to run that model locally on my M2 MacBook Pro with 16GB RAM.
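If you want to try that yourself, a minimal sketch of the Ollama workflow looks like the commands below. The exact model tag is an assumption — check Ollama's model library for the distilled R1 variant that actually fits in 16GB of RAM, since the full R1 is far too large for a laptop:

```shell
# Pull a distilled DeepSeek-R1 variant (tag is an assumption; smaller
# distilled models are what fit on a 16GB machine, not the full R1)
ollama pull deepseek-r1:8b

# Chat with it interactively, or pass a one-off prompt
ollama run deepseek-r1:8b "Summarize why local LLM inference matters."
```

Once pulled, the model runs entirely on-device, which is the point: no per-token API bill and no subscription.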
Now there are reasons not to want the Chinese-developed DeepSeek-R1 to be the go-to LLM, but I think its existence suggests the days of $20+ per month subscriptions for tools with LLM features are numbered. We’ll see, though.