Writer releases Palmyra X5, delivers near GPT-4 performance at 75% lower cost

Changelly
Writer releases Palmyra X5, delivers near GPT-4 performance at 75% lower cost
Bybit


Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More

Writer, the enterprise generative AI company valued at $1.9 billion, today released Palmyra X5, a new large language model (LLM) featuring an expansive 1-million-token context window that promises to accelerate the adoption of autonomous AI agents in corporate environments.

The San Francisco-based company, which counts Accenture, Marriott, Uber, and Vanguard among its hundreds of enterprise customers, has positioned the model as a cost-efficient alternative to offerings from industry giants like OpenAI and Anthropic, with pricing set at $0.60 per million input tokens and $6 per million output tokens.

“This model really unlocks the agentic world,” said Matan-Paul Shetrit, Director of Product at Writer, in an interview with VentureBeat. “It’s faster and cheaper than equivalent large context window models out there like GPT-4.1, and when you combine it with the large context window and the model’s ability to do tool or function calling, it allows you to start really doing things like multi-step agentic flows.”

Tokenmetrics
A comparison of AI model efficiency showing Writer’s Palmyra X5 achieving nearly 20% accuracy on OpenAI’s MRCR benchmark at approximately $0.60 per million tokens, positioning it favorably against more expensive models like GPT-4.1 and GPT-4o (right) that cost over $2.00 per million tokens. (Credit: Writer)

AI economics breakthrough: How Writer trained a powerhouse model for just $1 million

Unlike many competitors, Writer trained Palmyra X5 with synthetic data for approximately $1 million in GPU costs — a fraction of what other leading models require. This cost efficiency represents a significant departure from the prevailing industry approach of spending tens or hundreds of millions on model development.

“Our belief is that tokens in general are becoming cheaper and cheaper, and the compute is becoming cheaper and cheaper,” Shetrit explained. “We’re here to solve real problems, rather than nickel and diming our customers on the pricing.”

The company’s cost advantage stems from proprietary techniques developed over several years. In 2023, Writer published research on “becoming self-instruct,” which introduced early stopping criteria for minimal instruct tuning. According to Shetrit, this allows Writer to “cut costs significantly” during the training process.

“Unlike other foundational shops, our view is that we need to be effective. We need to be efficient here,” Shetrit said. “We need to provide the fastest, cheapest models to our customers, because ROI really matters in these cases.”

Million-token marvel: The technical architecture powering Palmyra X5’s speed and accuracy

Palmyra X5 can process a full million-token prompt in approximately 22 seconds and execute multi-turn function calls in around 300 milliseconds — performance metrics that Writer claims enable “agent behaviors that were previously cost- or time-prohibitive.”

The model’s architecture incorporates two key technical innovations: a hybrid attention mechanism and a mixture of experts approach. “The hybrid attention mechanism…introduces attention mechanism that inside the model allows it to focus on the relevant parts of the inputs when generating each output,” Shetrit said. This approach accelerates response generation while maintaining accuracy across the extensive context window.

Palmyra X5’s hybrid attention architecture processes massive inputs through specialized decoder blocks, enabling efficient handling of million-token contexts. (Credit: Writer)

On benchmark tests, Palmyra X5 achieved notable results relative to its cost. On OpenAI’s MRCR 8-needle test — which challenges models to find eight identical requests hidden in a massive conversation — Palmyra X5 scored 19.1%, compared to 20.25% for GPT-4.1 and 17.63% for GPT-4o. It also places eighth in coding on the BigCodeBench benchmark with a score of 48.7.

These benchmarks demonstrate that while Palmyra X5 may not lead every performance category, it delivers near-flagship capabilities at significantly lower costs — a trade-off that Writer believes will resonate with enterprise customers focused on ROI.

From chatbots to business automation: How AI agents are transforming enterprise workflows

The release of Palmyra X5 comes shortly after Writer unveiled AI HQ earlier this month — a centralized platform for enterprises to build, deploy, and supervise AI agents. This dual product strategy positions Writer to capitalize on growing enterprise demand for AI that can execute complex business processes autonomously.

“In the age of agents, models offering less than 1 million tokens of context will quickly become irrelevant for business-critical use cases,” said Writer CTO and co-founder Waseem AlShikh in a statement.

Shetrit elaborated on this point: “For a long time, there’s been a large gap between the promise of AI agents and what they could actually deliver. But at Writer, we’re now seeing real-world agent implementations with major enterprise customers. And when I say real customers, it’s not like a travel agent use case. I’m talking about Global 2000 companies, solving the gnarliest problems in their business.”

Early adopters are deploying Palmyra X5 for various enterprise workflows, including financial reporting, RFP responses, support documentation, and customer feedback analysis.

One particularly compelling use case involves multi-step agentic workflows, where an AI agent can flag outdated content, generate suggested revisions, share them for human approval, and automatically push approved updates to a content management system.

This shift from simple text generation to process automation represents a fundamental evolution in how enterprises deploy AI — moving from augmenting human work to automating entire business functions.

Writer’s Palmyra X5 offers an 8x increase in context window size over its predecessor, allowing it to process the equivalent of 1,500 pages at once. (Credit: Writer)

Cloud expansion strategy: AWS partnership brings Writer’s AI to millions of enterprise developers

Alongside the model release, Writer announced that both Palmyra X5 and its predecessor, Palmyra X4, are now available in Amazon Bedrock, Amazon Web Services’ fully managed service for accessing foundation models. AWS becomes the first cloud provider to deliver fully managed models from Writer, significantly expanding the company’s potential reach.

“Seamless access to Writer’s Palmyra X5 will enable developers and enterprises to build and scale AI agents and transform how they reason over vast amounts of enterprise data—leveraging the security, scalability, and performance of AWS,” said Atul Deo, Director of Amazon Bedrock at AWS, in the announcement.

The AWS integration addresses a critical barrier to enterprise AI adoption: the technical complexity of deploying and managing models at scale. By making Palmyra X5 available through Bedrock’s simplified API, Writer can potentially reach millions of developers who lack the specialized expertise to work with foundation models directly.

Self-learning AI: Writer’s vision for models that improve without human intervention

Writer has staked a bold claim regarding context windows, announcing that 1 million tokens will be the minimum size for all future models it releases. This commitment reflects the company’s view that large context is essential for enterprise-grade AI agents that interact with multiple systems and data sources.

Looking ahead, Shetrit identified self-evolving models as the next major advancement in enterprise AI. “The reality is today, agents do not perform at the level we want and need them to perform,” he said. “What I think is realistic is as users come to AI HQ, they start doing this process mapping…and then you layer on top of that, or within it, the self-evolving models that learn from how you do things in your company.”

These self-evolving capabilities would fundamentally change how AI systems improve over time. Rather than requiring periodic retraining or fine-tuning by AI specialists, the models would learn continuously from their interactions, gradually improving their performance for specific enterprise use cases.

“This idea that one agent can rule them all is not realistic,” Shetrit noted when discussing the varied needs of different business teams. “Even two different product teams, they have so many such different ways of doing work, the PMs themselves.”

Enterprise AI’s new math: How Writer’s $1.9B strategy challenges OpenAI and Anthropic

Writer’s approach contrasts sharply with that of OpenAI and Anthropic, which have raised billions in funding but focus more on general-purpose AI development. Writer has instead concentrated on building enterprise-specific models with cost profiles that enable widespread deployment.

This strategy has attracted significant investor interest, with the company raising $200 million in Series C funding last November at a $1.9 billion valuation. The round was co-led by Premji Invest, Radical Ventures, and ICONIQ Growth, with participation from strategic investors including Salesforce Ventures, Adobe Ventures, and IBM Ventures.

According to Forbes, Writer has a remarkable 160% net retention rate, indicating that customers typically expand their contracts by 60% after initial adoption. The company reportedly has over $50 million in signed contracts and projects this will double to $100 million this year.

For enterprises evaluating generative AI investments, Writer’s Palmyra X5 presents a compelling value proposition: powerful capabilities at a fraction of the cost of competing solutions. As the AI agent ecosystem matures, the company’s bet on cost-efficient, enterprise-focused models could position it advantageously against better-funded competitors that may not be as attuned to business ROI requirements.

“Our goal is to drive widespread agent adoption across our customer base as quickly as possible,” Shetrit emphasized. “The economics are straightforward—if we price our solution too high, enterprises will simply compare the cost of an AI agent versus a human worker and may not see sufficient value. To accelerate adoption, we need to deliver both superior speed and significantly lower costs. That’s the only way to achieve large-scale deployment of these agents within major enterprises.”

In an industry often captivated by technical capabilities and theoretical performance ceilings, Writer’s pragmatic focus on cost efficiency might ultimately prove more revolutionary than another decimal point of benchmark improvement. As enterprises grow increasingly sophisticated in measuring AI’s business impact, the question may shift from “How powerful is your model?” to “How affordable is your intelligence?” — and Writer is betting its future that economics, not just capabilities, will determine AI’s enterprise winners.



Source link

Blockonomics

Be the first to comment

Leave a Reply

Your email address will not be published.


*