Key takeaways:
- Quantitative research now faces more data and potential market relationships than human researchers can possibly explore
- The next frontier is to develop agentic AI that can conduct comprehensive, end-to-end quantitative research with minimal human intervention to the standard required for our investment process
- Enter AlphaGPT, our proprietary large language model-based workflow that can process vast amounts of data at speed. It still requires human oversight and strategic direction, but the combination delivers a more comprehensive research landscape than either approach alone
One of the biggest challenges in modern quantitative research is the explosion of available data and analytical possibilities. Meanwhile, human bandwidth stays constant. No matter how skilled they are, human researchers can only examine a limited subset of possibilities at any given time.
Hedge funds, asset managers and investment banks have been experimenting with large language models (LLMs) to address this bottleneck. In its off-the-shelf form, the current technology handles intern-level work effectively: writing code and summarising research.
The race is now on to fully leverage these LLMs beyond simple chatbot interactions: to develop agentic AI that can eventually function as a fully fledged analyst, equipped with a vast body of knowledge, following a rigorous research process and producing research outputs that are both creative and trustworthy.
Meet AlphaGPT, Man Group’s proprietary agentic AI research workflow. While we can’t leave it unsupervised just yet, early results are promising. Here’s what you need to know.
1. What is AlphaGPT and how does it work?
Think of AlphaGPT as a digital three-person research team that never sleeps, processes vast amounts of financial data in seconds, and follows our investment methodology to the letter. Each ‘member’ handles a distinct phase of the research process:
The Idea Person: This part brainstorms investment concepts and hypotheses. Just as a human researcher might ask “Do stocks with more buy orders than sell orders tend to outperform?” or “Do companies with more efficient hiring strategies do better?”, AlphaGPT generates these kinds of testable propositions. We found that the system goes beyond obvious connections and explores subtle relationships between seemingly unrelated market factors, systematically covering areas that human researchers might not examine simply due to limited cognitive bandwidth. It has been observed to produce dozens of viable concepts within minutes rather than days.
The Implementer: This component translates research ideas into executable code. It writes production-grade Python code leveraging Man Group’s research tools and can interact with our proprietary databases. What might take a human researcher hours or days to code and debug, AlphaGPT has been shown to accomplish in minutes. This allows rapid testing of multiple ideas and eliminates the iterative coding cycles that typically consume researchers’ time.
The Evaluator: This agent rigorously evaluates whether the research works, applying the same strict criteria that human-generated research must pass. Statistical significance tests, risk assessments and economic reasoning checks ensure that any alpha signals make logical sense rather than being mere mathematical artefacts. Every AI-produced signal must demonstrate a clear economic rationale and pass identical evaluation thresholds before it can be considered for deployment.
AlphaGPT is equipped with a workflow orchestrator that ensures the three-person research team works together seamlessly while at the same time embedding safeguards against AI’s common pitfalls such as hallucinations.
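To make the division of labour more concrete, below is a minimal, hypothetical sketch of how such a three-agent loop could be orchestrated in Python. The interfaces (generate_hypotheses, implement_signal, evaluate_signal) and the acceptance criteria are illustrative assumptions, not AlphaGPT’s actual implementation.

```python
# Minimal, hypothetical sketch of a three-agent research loop with an
# orchestrator. All interfaces (generate_hypotheses, implement_signal,
# evaluate_signal) are illustrative assumptions, not AlphaGPT's actual code.
from dataclasses import dataclass, field

@dataclass
class ResearchRecord:
    hypothesis: str                               # plain-English testable proposition
    code: str = ""                                # generated Python implementation
    metrics: dict = field(default_factory=dict)   # evaluation results
    accepted: bool = False

def run_research_cycle(agents, data_api, n_ideas: int = 10) -> list[ResearchRecord]:
    """Orchestrate idea generation, implementation and evaluation,
    keeping an auditable record of every step."""
    records = []
    # 1. The Idea Person: propose testable hypotheses.
    for hypothesis in agents.generate_hypotheses(n=n_ideas):
        record = ResearchRecord(hypothesis=hypothesis)
        # 2. The Implementer: translate the idea into executable code
        #    against the available research tools and databases.
        record.code = agents.implement_signal(hypothesis, tools=data_api)
        # 3. The Evaluator: apply the same statistical and economic
        #    checks required of human-generated research.
        record.metrics = agents.evaluate_signal(record.code, data_api)
        record.accepted = (
            record.metrics.get("p_value", 1.0) < 0.05
            and record.metrics.get("economic_rationale_ok", False)
        )
        records.append(record)
    return records
```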
To date, the system has produced signals that meet our standards and pass the same evaluation thresholds required for human-generated research.
2. Does it just automate existing work, or does it demonstrate genuine creativity?
Despite not being able to “think” (in the sense that LLMs only predict the next “token”), AlphaGPT has shown what appears to be genuine creativity. Side-by-side comparisons show that it often explores investment concepts that human researchers haven’t covered, not because they lack capability, but because of the sheer scale of the space of possibilities.
The system can simultaneously consider vast arrays of potential connections, identifying promising research directions that might otherwise go unexplored and filling gaps that experienced researchers hadn’t previously considered. While AI still can’t outperform humans outright in idea generation, in our experience its complementary viewpoints create a broader, more comprehensive research landscape than either approach alone.
The difference from human research lies in the volume and speed of hypothesis generation, which creates both opportunity and risk.
3. So, what are the risks? How do you prevent p-hacking and hallucinating?
AlphaGPT's speed can create specific statistical risks. The system can test numerous variations and combinations rapidly, increasing the probability of discovering patterns that appear significant but represent statistical artefacts. These 'lucky' results might look compelling in research periods yet fail in live trading. This is the multiple testing problem, sometimes called p-hacking.
We address this through rigorous process enforcement. We apply the same stringent research methodology that has historically prevented human-driven p-hacking, ensuring statistical discipline regardless of research source.
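As a generic illustration of that discipline (not a description of our exact controls), one standard tool is a multiple-testing correction: when many candidate signals are backtested, raw p-values are adjusted before any are treated as significant. The sketch below applies the Benjamini-Hochberg false discovery rate procedure from statsmodels to hypothetical p-values.

```python
# Illustrative multiple-testing correction for a batch of candidate signals.
# A generic example of controlling the false discovery rate, not Man Group's
# actual evaluation pipeline. The p-values are invented for the example.
import numpy as np
from statsmodels.stats.multitest import multipletests

# Hypothetical raw p-values from backtesting 8 candidate signals.
raw_p_values = np.array([0.003, 0.04, 0.012, 0.20, 0.049, 0.001, 0.35, 0.08])

# Benjamini-Hochberg adjustment: controls the expected share of
# false discoveries among the signals flagged as significant.
reject, adjusted_p, _, _ = multipletests(raw_p_values, alpha=0.05, method="fdr_bh")

for raw, adj, ok in zip(raw_p_values, adjusted_p, reject):
    status = "pass" if ok else "fail"
    print(f"raw p={raw:.3f}  adjusted p={adj:.3f}  {status}")
```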
We're also expanding our monitoring infrastructure to handle increased signal volume whilst maintaining oversight quality.
And there are of course hallucination and drift risks. The AI might conceptualise one research idea but implement something different. It might misinterpret research concepts during code generation or create logical inconsistencies between stated methodology and actual execution.
Our mitigation relies on advanced prompt engineering to ensure clear communication of research objectives and constraints. The agentic workflow includes multiple validation stages with built-in consistency checks between idea generation, implementation and evaluation. Systematic verification processes, both automated and manual, confirm alignment between intended research and actual implementation.
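As a simplified, hypothetical illustration of such a consistency check, the sketch below compares a structured specification produced at the idea stage with metadata extracted from the generated implementation; the field names and values are invented for the example.

```python
# Simplified, hypothetical sketch of a consistency check between the stated
# research idea and the generated implementation. Field names are invented.
def check_consistency(hypothesis_spec: dict, implementation_meta: dict) -> list[str]:
    """Return mismatches between what the idea stage declared and what the
    generated code actually does."""
    issues = []
    for key in ("universe", "rebalance_frequency", "data_fields"):
        declared = hypothesis_spec.get(key)
        implemented = implementation_meta.get(key)
        if declared != implemented:
            issues.append(f"{key}: declared {declared!r}, implemented {implemented!r}")
    return issues

# Example: the idea stage declared a weekly rebalance, but the generated code
# rebalances daily -- the drift is flagged for review before evaluation.
spec = {"universe": "US large cap", "rebalance_frequency": "weekly",
        "data_fields": ["order_imbalance"]}
meta = {"universe": "US large cap", "rebalance_frequency": "daily",
        "data_fields": ["order_imbalance"]}
print(check_consistency(spec, meta))
# ["rebalance_frequency: declared 'weekly', implemented 'daily'"]
```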
We treat these risks as engineering challenges rather than fundamental barriers, applying the same systematic approach to AI risk management that we use for traditional quantitative research risks.
4. What role do humans play in validating these AI-generated strategies?
During AlphaGPT's active development, we maintain comprehensive human oversight. A logging system captures the entire process from initial hypothesis through final implementation. Every decision point, assumption and logical step is documented and reviewable. This creates complete transparency and accountability.
AI-generated signals undergo dual-track validation. The Investment Committee reviews the underlying hypothesis, economic rationale and research outcomes using the same analytical framework applied to human-generated signals. Technology teams conduct code reviews, implement testing protocols spanning unit tests, integration tests and scenario analysis, and assess implementation risk.
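For illustration only, a unit test in such a protocol might assert basic invariants of a generated signal, for example index alignment, finiteness and cross-sectional neutrality. The signal function below is a hypothetical stand-in, not one of our production signals or our actual testing framework.

```python
# Illustrative unit test for an AI-generated signal. The signal function is a
# hypothetical stand-in (cross-sectionally demeaned 5-day return).
import numpy as np
import pandas as pd

def compute_signal(prices: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical generated signal: cross-sectionally demeaned 5-day return."""
    returns = prices / prices.shift(5) - 1.0
    return returns.sub(returns.mean(axis=1), axis=0)

def test_signal_basic_invariants():
    # Simulated price panel: 250 business days, 20 assets.
    rng = np.random.default_rng(0)
    prices = pd.DataFrame(
        100 * np.exp(np.cumsum(rng.normal(0, 0.01, size=(250, 20)), axis=0)),
        index=pd.bdate_range("2024-01-01", periods=250),
    )
    signal = compute_signal(prices)
    # The signal must align with the input dates and contain no infinities.
    assert signal.index.equals(prices.index)
    assert np.isfinite(signal.dropna()).all().all()
    # Cross-sectional demeaning implies each date sums to (approximately) zero.
    assert np.allclose(signal.dropna().sum(axis=1), 0.0, atol=1e-8)

test_signal_basic_invariants()
```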
This oversight will evolve as the system matures. We're building more robust automated oversight systems to handle increased volume. The technology infrastructure will improve but validation standards won't change. AI systems will maintain transparency and remain accessible to human intervention when needed.
Whether generated by human or AI, every strategy entering live trading must pass identical thresholds. This ensures consistent quality and risk management regardless of origin.
5. Won't this technology become commoditised?
We fully expect LLMs and agentic workflows to see widespread adoption across the investment industry. Early-mover technology advantages are inherently temporary.
We believe our sustainable edge lies in systematic investing fundamentals that transcend any single technology. We have proprietary data sources and technology infrastructure built over decades as well as organisational principles and research philosophy refined through multiple market cycles. There is also institutional memory and proven methodologies that can't be easily replicated.
AlphaGPT's architecture is modular and technology agnostic, allowing it to leverage the best available LLMs as they evolve. Prompt engineering processes can adapt to new model capabilities, and the system can incorporate breakthrough technologies without rebuilding core systems. We’re also actively exploring post-training foundation models specifically for systematic research, or training small language models tailored to our use case.
6. Can AlphaGPT scale across different asset classes?
So far, we have found it most successful in systematic equity research, but fundamental research principles remain largely consistent across asset classes.
Scaling would mean transferring the core agentic workflow whilst customising data sources, research methodologies and analytical frameworks for each asset class. The technology's modular design is intended to allow for this adaptation whilst preserving the underlying architecture that drives efficiency and insight generation.
7. What does the future look like for AlphaGPT?
The system currently operates with substantial human oversight through a human-in-the-loop approach. Our long-term vision is to balance automation with accountability through scalable oversight: more robust automated monitoring to handle AI-generated signal volume, with AI systems remaining transparent and open to human intervention when needed. The same level of scrutiny will be maintained through improved technological capabilities rather than reduced oversight.
Our key takeaway so far is that AlphaGPT doesn't replace human judgment but amplifies it. The most effective use of the system involves human researchers working alongside AI, with each contributing their unique strengths. Humans provide strategic direction, market context, and final decision-making, while AlphaGPT handles the heavy lifting of data processing, hypothesis generation, and initial analysis.