Cloud Giants Update:
AWS ( $Amazon.com(AMZN)$ ): $115B run rate growing 19% YoY (last Q grew 19%)
Azure ( $Microsoft(MSFT)$ ): ~$74B run rate (estimate) growing 31% YoY (last Q grew 34% cc)
Google Cloud ( $Alphabet(GOOG)$ $Alphabet(GOOGL)$ , includes GSuite): $48B run rate growing 30% YoY (last Q grew 35%; neither figure is cc)
[Chart: aggregate quarterly net-new ARR by the hyperscalers]
Azure is at a ~$74B run rate growing 31% in constant currency. Last Q they restated Azure, so some historical comparisons won't be apples to apples: Power BI and EMS (~$20B of run rate) are no longer included in Azure. Quarterly YoY growth trends below.
The debates around DeepSeek are intense: US vs. China, big vs. small models, open vs. closed source, and the shockingly efficient architecture it represents. Pride, fear, disbelief, disgust: all these emotions have clouded the facts. A few personal thoughts.
Thoughts on training costs:
1⃣ $6M training cost = plausible, IMO. Quick math: training cost ∝ (active params × tokens). DeepSeek-V3 (37B active params; 14.8T tokens) vs. Llama 3.1 (405B params; 15T tokens) means V3 should theoretically cost ~9% of what Llama 3.1 did. The disclosed actual figures align with this back-of-the-envelope math, so the numbers are directionally believable.
Plus, there was no hiding; the footnote clearly said: “the aforementioned costs include only the official training of DeepSeek-V3, excluding the costs associa
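The quick math above can be sketched directly. This is a rough approximation (training FLOPs scaling with active parameters × training tokens), not a full cost model; the parameter and token counts are the ones cited in the post.

```python
def relative_training_cost(active_params_b: float, tokens_t: float,
                           ref_params_b: float, ref_tokens_t: float) -> float:
    """Training cost of one model as a fraction of a reference model's,
    using the approximation cost ∝ (active params × training tokens)."""
    return (active_params_b * tokens_t) / (ref_params_b * ref_tokens_t)

# DeepSeek-V3 (37B active params, 14.8T tokens) vs. Llama 3.1 (405B, 15T)
ratio = relative_training_cost(37, 14.8, 405, 15)
print(f"V3 is roughly {ratio:.0%} of Llama 3.1's training compute")  # ~9%
```

This lines up with the ~9% figure in the text, which is why the disclosed $6M number reads as directionally plausible rather than shocking.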
DeepSeek could certainly accelerate the amount of tinkering, experimenting, and prototyping folks do with AI. With lower cost barriers, why not try out more things?
But once enterprises find use cases they want to scale into production, I think there will be real hesitation around trust. How do you handle the inherent bias in the training data? How will those biases show up in a reasoning agent? Or worse? I think the real question is: what are the implications if a US-based company takes a similar approach to DeepSeek's?
AI is becoming more powerful at the same time it's becoming more accessible. Amazing!
Cheaper AI is better for all. It will lead to more education, more experimentation, and thus more production use. Excited for this year!
Model distillation might be the most important shift happening in AI right now, and it's reshaping the entire tech industry. It's increasingly becoming a MASSIVE topic, and DeepSeek's R1 model released yesterday only reinforced this.
Model distillation is a process where a smaller, simpler model (the "student") is trained to replicate the behavior and capabilities of a larger, more complex model (the "teacher"). This is achieved by using the teacher model's outputs (e.g., predictions or reasoning processes) as training data, allowing the student to inherit high performance with reduced size and computational demands.
So why is this important? For large AI labs, capital and scale were moats. It took literally billions of dollars of compute and data to pre-train a state-of-the-art model. Let alone al
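The teacher/student setup described above can be sketched in a few lines. This is a minimal illustration of the classic soft-label form of distillation (matching the teacher's softened output distribution via KL divergence); the function names, logit values, and temperature are illustrative, not from any specific lab's pipeline.

```python
import numpy as np

def softmax(logits: np.ndarray, temperature: float = 1.0) -> np.ndarray:
    """Temperature-scaled softmax; higher temperature softens the distribution."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits: np.ndarray,
                      teacher_logits: np.ndarray,
                      temperature: float = 2.0) -> float:
    """KL divergence from student to teacher soft labels.
    The softened teacher distribution carries information about relative
    class similarities, which is what the student learns to imitate."""
    p = softmax(teacher_logits, temperature)  # teacher soft targets
    q = softmax(student_logits, temperature)  # student predictions
    return float(np.sum(p * (np.log(p) - np.log(q)), axis=-1).mean())

# Toy example: a student whose logits roughly track the teacher's
teacher = np.array([[4.0, 1.0, 0.5]])
student = np.array([[3.5, 1.2, 0.3]])
print(distillation_loss(student, teacher))  # small positive value
```

In practice this loss is minimized by gradient descent over the student's weights, often mixed with a standard cross-entropy term on hard labels; the point is that the supervision signal comes from the teacher's outputs, not from building a billion-dollar pre-training run of your own.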