AI agents complete less than 3% of remote work tasks, study finds

 

A groundbreaking study released Tuesday reveals that leading artificial intelligence systems can automate less than 3% of complex remote work projects, casting doubt on recent corporate claims that AI justifies mass layoffs. The findings come just days after Amazon announced plans to cut 14,000 corporate jobs, partly citing AI efficiency gains.

The Remote Labor Index, developed by researchers at the Center for AI Safety and Scale AI, tested six frontier AI agents on 240 real-world freelance projects across 23 domains, including software development, architectural design, and video animation. The projects represented over 6,000 hours of human work valued at more than $140,000.

AI Performance Falls Short of Corporate Expectations

The study's results starkly contradict corporate narratives about AI's current capabilities. Manus, the top-performing AI agent from a Chinese startup, achieved just a 2.5% automation rate, meaning it successfully completed only 2.5% of projects at a quality level acceptable for commissioned work. Other leading models performed even worse: Elon Musk's Grok 4 and Anthropic's Claude Sonnet 4.5 each managed 2.1%, while OpenAI's GPT-5 reached only 1.7%.
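For a sense of scale, the reported percentages can be converted into approximate project counts. The sketch below is a back-of-the-envelope check, not a figure from the study: it assumes each automation rate is the share of all 240 benchmark projects completed acceptably, rounded to one decimal place. The helper name `implied_completions` is hypothetical.

```python
# Figures quoted in the article: 240 projects, rates in percent.
TOTAL_PROJECTS = 240
reported_rates = {
    "Manus": 2.5,
    "Grok 4": 2.1,
    "Claude Sonnet 4.5": 2.1,
    "GPT-5": 1.7,
}

def implied_completions(rate_percent, total=TOTAL_PROJECTS):
    """Assumption: rate = completed / total, so completed is rate * total,
    rounded to the nearest whole project."""
    return round(rate_percent / 100 * total)

implied = {name: implied_completions(rate) for name, rate in reported_rates.items()}

for name, count in implied.items():
    print(f"{name}: about {count} of {TOTAL_PROJECTS} projects")
```

Under that assumption, the top score corresponds to roughly six completed projects out of 240, and the lowest of the four to roughly four.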

"This demonstrates that contemporary AI systems fail to complete the vast majority of projects at a quality level that would be accepted as commissioned work," the researchers wrote. Poor quality was the most common failure mode, affecting 45.6% of AI deliverables, followed by incompletion at 35.7%.

Corporate AI Layoffs Accelerate Despite Limited Capabilities

The study's timing highlights a disconnect between AI reality and corporate decision-making. Amazon's Beth Galetti told employees Monday that "this generation of AI is the most transformative technology we've seen since the Internet," justifying the elimination of 14,000 positions. The company expects additional cuts, potentially reaching 30,000 jobs, or about 10% of its corporate workforce.

"This is a wake-up call. And if Amazon does it, other companies might do it too," Harry Holzer, a Georgetown University public policy professor and former Labor Department chief economist, told ABC News.

Study Reveals Gap Between AI Hype and Economic Reality

Dan Hendrycks, director of the Center for AI Safety, expressed hope the research would provide "much more accurate impressions as to what's going on with AI capabilities." The study found that AI agents lack crucial abilities for complex work, including long-term memory storage, continual learning from experience, and skill acquisition on the job.

Examples of AI failures included producing eight-second videos when eight-minute videos were requested and generating architectural models where house appearances changed across different 3D views. Despite these limitations, newer models showed measurable improvement over time, suggesting gradual progress in automation capabilities.

The findings challenge OpenAI CEO Sam Altman's claims that GPT-5 represents "a significant step along the path to AGI" and contradict predictions that 90% of programming jobs could be automated within months. The study's authors noted that while AI models have improved in coding and mathematical reasoning, they struggle with multi-step tasks requiring various tools.