Internal architecture documents obtained by ByteDispatch over the weekend describe a model that OpenAI has been training since November on a cluster of more than 100,000 next-generation accelerators. The documents — which include a 47-page technical whitepaper marked DRAFT and a separate set of slides prepared for the company's board — call the system GPT-7 and put its size at 12.4 trillion parameters in total, with roughly 800 billion active per token.
The most surprising element is not the size. It's the addition of what the documents call a 'persistent state layer' — a separately trained memory module that the language model can read from and write to during inference. According to two people familiar with the project, the module reflects OpenAI's bet that returns from scaling alone have plateaued.
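The documents do not describe the module at the code level. As a rough illustration only, a memory that a model reads from and writes to during inference, in the sense the whitepaper seems to describe, might resemble the toy sketch below; every name and design choice here is hypothetical and not drawn from the leaked materials.

```python
# Hypothetical sketch of a "persistent state layer": a key-value memory a
# decoder could read from and write to between inference steps. Illustrative
# only; none of these names or choices come from the leaked documents.
import numpy as np

class PersistentMemory:
    def __init__(self, slots: int, dim: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.keys = rng.normal(size=(slots, dim)) * 0.01   # addressing vectors
        self.values = np.zeros((slots, dim))                # stored content

    def read(self, query: np.ndarray) -> np.ndarray:
        # Soft attention over slots: similar queries retrieve similar content.
        scores = self.keys @ query
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        return weights @ self.values

    def write(self, query: np.ndarray, content: np.ndarray) -> None:
        # Gated update of the best-matching slot, so state persists across calls.
        slot = int(np.argmax(self.keys @ query))
        self.keys[slot] = 0.9 * self.keys[slot] + 0.1 * query
        self.values[slot] = 0.5 * self.values[slot] + 0.5 * content

# Toy usage: the query here stands in for a decoder's hidden state.
memory = PersistentMemory(slots=128, dim=16)
hidden = np.ones(16)
memory.write(hidden, hidden * 2.0)   # persist something during one request
retrieved = memory.read(hidden)      # retrieve it on a later request
print(retrieved[:4])
```

The property that would make such a layer a departure from standard transformer inference is that the write survives past the end of a single forward pass, which matches the documents' description of a module the model updates as it runs.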
OpenAI declined to comment on the documents or confirm the existence of GPT-7. A company spokesperson said in an email: 'We do not comment on specific research timelines.'
If the architecture details are accurate, GPT-7 would represent the largest single shift in OpenAI's stack since the move from GPT-3.5 to GPT-4 in early 2023. It would also place direct pressure on Anthropic, whose Claude 5 family ships its own form of persistent memory, and on Google DeepMind, whose Gemini 3 Ultra scored 92% on ARC-AGI-2 earlier this week.