New Kid on the Block
The AI world is abuzz with the Chinese DeepSeek model, which apparently costs far less to build than OpenAI’s ChatGPT, performs better than OpenAI’s o1 model and is cheaper for users to run. At least, that is what the DeepSeek team claims. Here are four thoughts on this.
1) Cost of LLMs Going to Zero
Whatever the specifics of the DeepSeek model, new competition will continue to drive the cost of LLM inference towards zero. That is, for users of foundation models, the cost of running these models will keep declining. This is great news for AI players like Macro Hive that use foundation models as an input, but bad news for builders of foundation models like OpenAI and DeepSeek.
2) Benchmarks Matter
The DeepSeek paper claims that its R1 model outperforms OpenAI’s o1 model. However, independent benchmarks suggest that o1 still outperforms R1 (Chart 1). Independent benchmarks matter because we do not know how selectively DeepSeek chose the results it presented. The one subcategory in which DeepSeek’s R1 outperforms o1 is data analysis. Macro Hive will be performing its own benchmark tests on these models, so watch this space.
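To make the point concrete, independent benchmarking boils down to running the same task set through every model and scoring the outputs yourself. The sketch below is purely illustrative: the `query_model` callables, the task list and the exact-match scoring are all hypothetical stand-ins, not Macro Hive’s actual test suite or any vendor’s real API.

```python
# Hypothetical sketch of an independent benchmark harness.
# The model callables and tasks below are stubs for illustration only.

def score(answer: str, expected: str) -> int:
    """Naive exact-match scoring; real benchmarks use richer rubrics."""
    return int(answer.strip().lower() == expected.strip().lower())

def run_benchmark(models: dict, tasks: list) -> dict:
    """Run every task against every model; return accuracy per model."""
    results = {}
    for name, query_model in models.items():
        correct = sum(score(query_model(prompt), expected)
                      for prompt, expected in tasks)
        results[name] = correct / len(tasks)
    return results

# Stub "models" so the sketch runs without any API keys.
models = {
    "o1": lambda prompt: "4",
    "deepseek-r1": lambda prompt: "4",
}
tasks = [("What is 2 + 2?", "4")]
print(run_benchmark(models, tasks))  # {'o1': 1.0, 'deepseek-r1': 1.0}
```

The key discipline is that the prompts, scoring and sampling are fixed before any model is run, which is exactly what self-reported results cannot guarantee.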
3) Modular Architecture Is the Future
The range of LLM models, from DeepSeek to OpenAI to Google’s Gemini, will only increase. Each will have its own pros and cons. So, for AI players it makes sense to build their AI architecture in a modular fashion, where you can swap LLM models easily. For some parts of the AI stack, you may use a small model such as (say) Meta’s Llama, while another part may use DeepSeek’s R1 model, and so on. It is important not to get tied into one platform.
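The modular idea above can be sketched in a few lines: put each provider behind one shared interface, so application code never depends on a specific vendor and swapping models is a one-line change. All class and method names here are hypothetical, assumed for illustration rather than taken from any real SDK.

```python
# Illustrative sketch of a modular LLM layer: every provider hides
# behind the same minimal interface. Names are hypothetical.
from abc import ABC, abstractmethod

class LLMClient(ABC):
    @abstractmethod
    def complete(self, prompt: str) -> str: ...

class DeepSeekR1(LLMClient):
    def complete(self, prompt: str) -> str:
        return f"[deepseek-r1] answer to: {prompt}"  # stubbed response

class SmallLlama(LLMClient):
    def complete(self, prompt: str) -> str:
        return f"[llama-small] answer to: {prompt}"  # stubbed response

def summarise(text: str, llm: LLMClient) -> str:
    """Application code depends only on the interface, not a vendor."""
    return llm.complete(f"Summarise: {text}")

# Swapping the model is a single argument change:
print(summarise("DeepSeek article", SmallLlama()))
print(summarise("DeepSeek article", DeepSeekR1()))
```

In practice the concrete classes would wrap real vendor APIs, but the point stands: if a cheaper or better model appears, only the adapter changes, not the application.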
4) Open-Source LLMs Are Not Truly Open Source
While much has been made of DeepSeek being open source, in the pure sense it is not. Most notably, the training data, underlying architecture and training code are not available to the world. What is open about DeepSeek (as with Llama and others) is that the model weights are released and the licence to use them is permissive.
Overall, DeepSeek is great news for the AI ecosystem, and it reaffirms some broader trends that are already in play.