Your AI Roundup
Agents fail at trading while open source models keep improving
Welcome to this week’s AI round-up! Today, we look at how well AI agents perform at trading, how open-source models keep getting better, and the big ideas of startups that have raised over $2.4 billion this week alone.
Here’s what you need to know this week:
AI agents against the market: an experiment conducted by the nof1.ai team suggests that it may still be too risky to let AI agents manage your portfolio
Moonshot AI shows open source models keep improving: Kimi-K2-Thinking is achieving impressive results on Humanity’s Last Exam benchmark
Weekly funding: Industry & Mobility is the standout category of the week, with AI companies raising more than $2.4 billion in funding
Research papers: this week’s 5 most notable papers
Startup spotlights: three companies applying AI in very different areas, from simulating how cancerous tumors respond to different treatments to Controlled Environment Agriculture
1. AI agents against the market
A compelling experiment led by the nof1.ai team analyses how LLMs perform in financial environments using Alpha Arena, the first live benchmark designed to immerse AIs in realistic financial scenarios involving adaptation, forecasting, and risk management.
The results show that early performance was steady, but Qwen, DeepSeek, Grok, and Claude soon pulled ahead, while Gemini and GPT-5 lagged and never recovered. After a late-October market collapse that affected all models, Qwen 3 Max finished with the strongest position, and GPT-5 ended as the weakest performer.
The interesting takeaway from the experiment is that today’s LLMs remain well short of true proficiency in real-world trading. Further trials with alternative setups could show whether this shortfall is a passing stage or a built-in constraint.
2. Moonshot AI shows open source models keep improving
Moonshot AI’s Kimi-K2-Thinking delivers standout benchmark results, approaching GPT-5 performance on Humanity’s Last Exam and surpassing its predecessor.
Because models vary widely across tasks, direct comparison is tricky, and Humanity’s Last Exam is only one measure. Moonshot self-reported the K2-Thinking score, and results on other benchmarks diverge, yet the data still makes the model a noteworthy new addition.
3. AI investments of the week - Week #46
Last week saw more than $2.4B in global AI funding, highlighting the growing momentum of AI innovation around the world. Major raises came from companies such as 🇰🇪 Neural Labs Africa, 🇸🇬 CloudMile, 🇩🇪 Aily Labs, 🇺🇸 Metropolis, and 🇦🇺 Zeligate.ai.
Leading sectors drawing investor interest included:
Industry & Mobility, which dominated the week: Metropolis, Source.ag, mimic, Sengine, Ceres AI, Miko, Appetronix, Digs, Lucendi, and Swarmer.
Software & Cloud, with companies including: Aily Labs, Inception, Procurement Sciences AI, Octonomy, CloudMile, Cactus, Ruli, Guizhou Zhongxu Technology, Yuanhe Vision, Return Zero, Relevant Search, IP Author, Cuckoo, Quickads, Motley, Lamatic, Zeligate.ai, CrowAI, OutreachGenius, Ariia AI, and OkeyMeta.
Health, with funding going to: Hippocratic AI, Elephas, Jingyangkang, Anivance AI Corporation, Novoflow, Welltory, Protein Dynamic Solutions, Sandy Health, SpeedR AI, Auric Essentials, and Neural Labs Africa.
→ Explore the interactive bubble chart
4. This Week’s 5 Most Notable AI Research Papers
From insights into loss curvature to evolving threat-actor tactics and advances in continual learning, this week’s findings show both rapid progress and emerging governance challenges.
Looking for more? Explore This Week’s 5 Most Notable AI Research Papers for a deeper dive into the newest technical advances in AI research.
5. Start-up spotlights of the week
This week, we are featuring three innovative AI start-ups that are driving significant impact:
Elephas: the Wisconsin-based startup raised USD 40M to advance its Elephas Live Platform, which uses complex algorithms to predict cancer tumor responses to immunotherapy within 72 hours, supporting faster and more precise treatment decisions.
Source.ag: the start-up was established with the aim of empowering greenhouse growers. The company builds AI-driven cultivation software that optimises environmental conditions, boosts resource efficiency and predicts yields with high accuracy. Its platform helps growers maintain consistent crop quality across cycles while maximising productivity and sustainability.
Teleskope: this New York-based start-up uses AI to detect, prevent and remediate risks across an organisation’s entire data footprint. Its Data Classification as a Service API protects data in real time as it is created or moves through systems, helping companies to maintain security and compliance.
What’s next?
For more AI World updates, research, and insights, visit aiworld.eu. And if you haven’t yet, subscribe to get our weekly roundups delivered straight to your inbox.
Wishing you a great weekend!
Gaia Cavaglioni
On behalf of the AI World Team





