Clouded Judgement: The New RL Training Grounds
This week on Clouded Judgement: The New RL Training Grounds
- Median software multiple: 5.1x
- High Growth software median: 27.0x
- Mid Growth software median: 7.8x
- Low Growth software median: 3.9x
- 10Y: 4.1%
More on this from Clouded Judgement today:
The earliest days of reinforcement learning were fun to watch. Algorithms trained on Atari games, StarCraft, and Go. The appeal was obvious: constrained digital sandboxes with clear rules, infinite repeatability, and instant feedback. You could run agents a million times through Pong and see them get better.
But those environments were toys. Beating Pong doesn’t teach you how to navigate a hospital system. A high score in Breakout doesn’t tell you much about running a logistics network. The truth is, if agents are ever going to matter in the enterprise, they can’t just be dropped into the real world raw. They need environments to train in. Environments that mirror the messiness, constraints, and complexity of reality.
And this is where things get interesting. You could argue the next “killer infra company” is the one that figures out how to take messy, real-world systems and turn them into trainable environments. Imagine a company that builds a digital twin of global supply chains where agents can practice routing and exception handling. Or an environment that looks like a financial market, complete with noise, shocks, and adversarial actors. Or healthcare workflows with all the uncertainty, missing data, and human-in-the-loop steps that exist in reality.
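To make "trainable environment" a bit more concrete, here is a minimal sketch in Python of what the smallest possible version of a supply chain sandbox could look like. Everything in it is hypothetical and invented for illustration: the `ToyRoutingEnv` class, the three hubs, the disruption probability, and the reward values. A real digital twin would be vastly more complex. The point is only the shape: a `reset`/`step` loop with clear rules, repeatability (via a seed), and instant feedback (a reward).

```python
import random


class ToyRoutingEnv:
    """Hypothetical toy environment: route one shipment per step through a hub.

    At each step at most one hub may be disrupted. Routing through the
    disrupted hub is penalised; any other hub is rewarded.
    """

    HUBS = ("A", "B", "C")  # illustrative hub names, not from any real system

    def __init__(self, disruption_prob=0.3, horizon=10, seed=0):
        self.disruption_prob = disruption_prob  # assumed value, chosen arbitrarily
        self.horizon = horizon                  # number of shipments per episode
        self.seed = seed
        self.rng = random.Random(seed)
        self.steps = 0
        self.disrupted = None

    def reset(self):
        # Re-seeding makes every episode with the same seed identical.
        self.rng = random.Random(self.seed)
        self.steps = 0
        self._roll_disruption()
        return self._observation()

    def _roll_disruption(self):
        # With some probability, one hub is disrupted for the next step.
        if self.rng.random() < self.disruption_prob:
            self.disrupted = self.rng.choice(self.HUBS)
        else:
            self.disrupted = None

    def _observation(self):
        return {"step": self.steps, "disrupted_hub": self.disrupted}

    def step(self, hub):
        if hub not in self.HUBS:
            raise ValueError(f"unknown hub: {hub!r}")
        # Instant feedback: -1 for routing into a disruption, +1 otherwise.
        reward = -1.0 if hub == self.disrupted else 1.0
        self.steps += 1
        done = self.steps >= self.horizon
        self._roll_disruption()
        return self._observation(), reward, done


def run_episode(env, policy):
    """Run one full episode and return the total reward."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, done = env.step(policy(obs))
        total += reward
    return total


def avoid_disruption(obs):
    """Hand-written baseline policy: pick the first hub that is not disrupted."""
    return next(h for h in ToyRoutingEnv.HUBS if h != obs["disrupted_hub"])
```

Calling `run_episode(ToyRoutingEnv(seed=1), avoid_disruption)` returns `10.0`, since the baseline sees each disruption and routes around it for all ten steps. The hard part in practice is everything this toy leaves out: partial information, delayed consequences, and rules that differ from one company to the next.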
And this is where it gets really interesting. Every company has bespoke workflows, bespoke software, and bespoke operating procedures. You could have two software companies competing in the same space, using the same software stack. But how they engage with and use that software will vary (they both probably have different custom implementations of Salesforce).
So in order for an agent to effectively automate their work, the agent first needs to learn how employees of each company do their work! Which brings me back to what I said above: the next “killer infra company” will be the one that figures out how to take messy, real-world systems and turn them into trainable environments.
There’s a second layer to this too: environments by themselves aren’t enough. They’re raw substrate. Enterprises won’t just want a simulation; they’ll want a way to integrate it into their stack. They’ll want observability into how agents are performing. They’ll want evaluation harnesses, orchestration, policy management. In other words, middleware. Just like Stripe turned payments into an API and Datadog (DDOG) turned logs into dashboards, I think we’ll see middleware emerge that makes environments usable, safe, and scalable for enterprises.
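As a rough illustration of the "evaluation harness" piece of that middleware, here is a minimal Python sketch. All names in it (`ParityEnv`, `EvalReport`, `evaluate`, the `threshold` parameter) are hypothetical and made up for this example. It assumes an environment that exposes `reset()` and `step(action)` returning `(observation, reward, done)`, which is a common convention but not something any particular vendor guarantees. The stand-in environment is deliberately trivial; the harness around it is the point.

```python
import statistics
from dataclasses import dataclass


class ParityEnv:
    """Hypothetical stand-in environment so the harness has something to run.

    The observation is a 0/1 target; the agent earns 1.0 for echoing it back.
    """

    def __init__(self, seed, horizon=5):
        self.offset = seed % 2  # the seed only shifts the target pattern
        self.horizon = horizon
        self.t = 0

    def _target(self):
        return (self.t + self.offset) % 2

    def reset(self):
        self.t = 0
        return self._target()

    def step(self, action):
        reward = 1.0 if action == self._target() else 0.0
        self.t += 1
        return self._target(), reward, self.t >= self.horizon


@dataclass
class EvalReport:
    episodes: int
    mean_return: float
    min_return: float
    pass_rate: float  # share of episodes whose return met the threshold


def evaluate(make_env, policy, episodes=20, threshold=0.0):
    """Run `policy` across seeded episodes and summarise how it performed."""
    returns = []
    for seed in range(episodes):
        env = make_env(seed)  # one fresh, seeded environment per episode
        obs, done, total = env.reset(), False, 0.0
        while not done:
            obs, reward, done = env.step(policy(obs))
            total += reward
        returns.append(total)
    passed = sum(r >= threshold for r in returns)
    return EvalReport(
        episodes=episodes,
        mean_return=statistics.mean(returns),
        min_return=min(returns),
        pass_rate=passed / episodes,
    )
```

For example, `evaluate(ParityEnv, lambda obs: obs, episodes=4, threshold=5.0)` reports a mean return of `5.0` and a pass rate of `1.0`, while a policy that always answers wrong scores zero on both. Reporting the worst episode and a pass rate alongside the mean is a design choice: an enterprise is likely to care more about how often an agent fails than about its average.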
So if the last decade of AI was about collecting as much static data as possible, the next might be about constructing dynamic environments. The winners won’t just be whoever builds the best model or agent. They will also be the companies that own the environments those agents train in, and the middleware that turns those environments into enterprise infrastructure.
For larger companies, how well they can set up these environments for their own specific domains and workflows will be critical. Every company has different workflows and different requirements.
Models needed data. Agents need environments. And the companies that own the environments, and the rails around them, might end up being the most important infrastructure businesses of the next decade.
This week in enterprise software: Top 10 #SaaS #Cloud multiples as of today's market close
Palantir Technologies Inc. (PLTR); Cloudflare, Inc. (NET); CrowdStrike Holdings, Inc. (CRWD); Figma (FIG); Shopify (SHOP); Snowflake (SNOW); Zscaler Inc. (ZS); Datadog (DDOG); Samsara, Inc. (IOT); Guidewire (GWRE)
Disclaimer: Investing carries risk. This is not financial advice. The above content should not be regarded as an offer, recommendation, or solicitation to acquire or dispose of any financial products, nor should any associated discussions, comments, or posts by the author or other users. It is for general information purposes only and does not consider your own investment objectives, financial situation, or needs. TTM assumes no responsibility or warranty for the accuracy and completeness of the information. Investors should do their own research and may seek professional advice before investing.

