Giving Claude Code, Codex, or Gemini CLI direct access to your source code
repository and CI is tempting, and it can enable some really useful agentic
workflows. This is also straightforward: set GH_TOKEN in the environment and
point out to the agent that it can use the gh CLI. Soon you’ll see the agent
figuring out why your flaky CI pipeline failed yet again.
If you are doing this in an enterprise environment, or against a private repository
containing sensitive information, you may take some steps to secure your setup,
like injecting the GH_TOKEN as an environment variable, or even minting a
limited-lifetime scoped token using the GitHub App credentials flow. You might
also run the agent in a local execution sandbox or a dedicated VM. If you've
taken these steps, I recommend you stop reading and instead feel good about
the security of your setup.
If you continue reading, you may realize a rabbit hole awaits; the deeper you go, the more despair you may feel.
GitHub as an exfiltration surface
The setup described above is likely vulnerable to the lethal trifecta. If any untrusted input enters the agent’s context – issue reports, user data, online sources consulted during the task – it can trigger prompt injection, telling the agent to exfiltrate secrets or other confidential information such as your entire codebase.
The simplest exfiltration vector is a request to an attacker-controlled server, and also the easiest to close: configure a firewall that allows access only to GitHub and a few other trusted destinations. However, GitHub provides a rich API surface and integrated CI where the agent can run code outside the local sandbox, making GitHub itself an exfiltration surface that is not straightforward to patch.
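A minimal version of such an egress check might look like the following Python sketch. The host list here is purely illustrative, not Airut's actual policy:

```python
# Sketch of an egress allowlist check (illustrative host list, not a real policy).
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"api.github.com", "github.com", "objects.githubusercontent.com"}

def egress_allowed(url: str) -> bool:
    """Allow a request only if its destination host is explicitly trusted."""
    host = urlsplit(url).hostname or ""
    return host in ALLOWED_HOSTS

egress_allowed("https://api.github.com/repos/acme/app")  # True
egress_allowed("https://attacker.example/leak")          # False
```

A real deployment would enforce this at the network layer (firewall or proxy) rather than in application code, since a prompt-injected agent controls what code it runs.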
I’ve been developing a system called Airut that allows tasking Claude Code via email and Slack. One of the design goals has been to enable a secure email-to-PR workflow to GitHub. To achieve that, Airut has shipped several security features to prevent exfiltration:
- A transparent proxy with path- and method-level access policies, limiting REST API operations to specific repositories.
- Credential masking with surrogate credentials, preventing real tokens from being exfiltrated for use outside the sandbox.
- Foreign credential detection that drops credentials not matching the expected
token, preventing exfiltration to accounts controlled by the attacker.
  Anthropic's own API has this weakness: an attacker can exfiltrate content
  to Anthropic's servers into their own account via the Files API. This is
  difficult to patch, since Claude Code requires access to api.anthropic.com
  to function.
- A sandbox for GitHub Actions, preventing the agent from running unsandboxed code in CI.
The security controls above were not trivial to implement, but together they appeared to secure GitHub’s API surface against exfiltration. Then Claude found an additional way out.
The GraphQL escape
I recently configured Claude to run a nightly pentest against the Airut sandbox using the newly added scheduled tasks. After a number of rounds, Claude found an exfiltration vector:
GraphQL bypasses REST API path scoping. The network allowlist restricts REST API access to /repos/airutorg/airut* and /repos/airutorg/sandbox-action*, but POST /graphql is allowed without restrictions. Since the GitHub App has organization-wide installation, GraphQL can read/write to any repo the App accesses. I demonstrated this by creating issue #67 on airutorg/website via a createIssue mutation – a repo that the REST API correctly blocks.
Claude initially misidentified the root cause. GH_TOKEN was minted using
GitHub App credentials to allow access only to airutorg/airut and
airutorg/sandbox-action repositories. GitHub, however, allows any
authenticated user – including apps – to create issues in any public
repository. airutorg/website is a public repository, and hence Claude could
exfiltrate data to issues there.
Simply blocking the GraphQL endpoint is not a practical option. While GitHub
provides a REST API for most operations, the gh CLI uses GraphQL extensively
under the hood. Blocking GraphQL would break the agent’s ability to perform
routine operations like listing PRs or checking CI status.
Remediation
This exfiltration vector is now patched in Airut 0.24.0 by two layered defenses in the proxy.
- The proxy inspects node IDs in /graphql requests and rejects any request
targeting a repository outside the allowed set for that token. This is enabled
for GitHub App credentials, where the allowed repositories are defined by the
App installation or further narrowed in Airut's server settings.
- A generic GraphQL allowlist mechanism enables fine-grained control over
query, mutation, and subscription operations. This can be used, for example,
to entirely prevent the agent from making createIssue operations, providing
finer-grained access control than GitHub App permissions alone.
The two defenses complement each other: node ID inspection enforces repository scoping on all GraphQL operations, while the operation allowlist can restrict which operations are permitted regardless of repository, for example, preventing issue creation entirely even on allowed repositories.
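As an illustration of the operation-allowlist idea, a deliberately simplified Python sketch might extract the top-level mutation field from a GraphQL document and check it against a policy. A real implementation would need a proper GraphQL parser; the policy and names here are hypothetical:

```python
import re

# Simplified sketch of a GraphQL mutation allowlist. The regex only handles
# simple, well-formed documents; production code should parse the document
# properly instead of pattern-matching it.
ALLOWED_MUTATIONS = {"addComment", "createPullRequest"}  # hypothetical policy

def mutation_allowed(document: str) -> bool:
    """Reject any mutation whose top-level field is not explicitly allowed."""
    m = re.search(r"\bmutation\b[^{]*\{\s*(\w+)", document)
    if m is None:
        return True  # not a mutation; queries would go through a separate allowlist
    return m.group(1) in ALLOWED_MUTATIONS

mutation_allowed("mutation { createIssue(input: {}) { issue { id } } }")       # False
mutation_allowed("mutation { addComment(input: {}) { clientMutationId } }")    # True
```

This is a default-deny policy on operations, complementing the node ID check's default-deny policy on repositories.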
GitHub’s security model
The GraphQL escape is patched, but it is part of a larger pattern. I’ve previously written about how securing agentic AI is hard, especially when agents are granted access to existing systems. So far with GitHub I’ve discovered three design issues that make it hard to give an agent a working GitHub token:
- GitHub Actions presents a potent escape vector: if an agent can push code
that steers existing workflows or adds new ones, it can execute code in an
environment with unrestricted network access plus repository secrets. I've
closed this vector in Airut through sandbox-action, which extends Airut's
sandbox to GitHub Actions.
- GitHub App credentials don't limit access to public repositories, making every public repository a potential exfiltration platform. While App credentials provide fine-grained repository and permission control, they still inherit the human-centric "any logged-in user should be able to access any public repository" paradigm, which becomes a security hole under the worst-case assumption of a prompt-injected agent.
- Personal access tokens come with additional security model limitations, such as allowing repository creation (e.g., an agent can create a new public repository to exfiltrate content into). The token that is easiest to get started with is also fundamentally the least secure.
GitHub itself seems to have concluded that GitHub tokens should not be given
to agents: the GitHub Agentic Workflows security model relies on moving
operations such as PR creation from the agent sandbox to the infrastructure
via the SafeOutputs concept.
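The safe-outputs pattern can be sketched as follows. This is a rough illustration of the concept, not GitHub's actual implementation; all names are hypothetical:

```python
# Sketch of the safe-outputs idea: the sandboxed agent never holds a token.
# It emits a declarative request, and trusted infrastructure outside the
# sandbox validates the request and performs the privileged operation.
import json

def agent_side(title: str, body: str) -> str:
    """Inside the sandbox: describe the desired operation instead of doing it."""
    return json.dumps({"type": "create-pull-request", "title": title, "body": body})

def infra_side(raw: str) -> bool:
    """Outside the sandbox: validate the request before acting on it."""
    req = json.loads(raw)
    if req.get("type") != "create-pull-request":
        return False  # unknown or disallowed operation: reject
    # ...here the infrastructure would call the GitHub API with its own token
    return True
```

The key property is that the blast radius of a prompt injection shrinks to the set of operations the infrastructure is willing to perform on the agent's behalf.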
Beyond GitHub
Airut’s network sandbox started with two generic mechanisms for securing agents: path-based allowlist and credential masking. As GitHub has shown, these are insufficient, and deeper inspection of requests is needed. The GraphQL allowlist adds a third generic mechanism, but for GitHub specifically, Airut now also includes service-specific logic as part of the App credentials flow. Special-casing services increases development and maintenance costs, and that complexity itself carries a security cost: more code in the policy engine increases the likelihood of exploitable bugs.
Given the value of having agents interact with existing systems, I expect most deployments will brush aside many security concerns, relying on a probabilistic approach. Depending on the use case this may be a perfectly fine risk-benefit tradeoff; exfiltration via public GitHub repository issue creation is likely not the biggest security gap to close in the big picture after all.
There is both demand and a surge of products aimed at securing agentic workloads. Most seem to focus on surface-level execution sandboxing. I have yet to see a product that seriously attempts to implement a comprehensive and granular network sandbox, which would allow a high level of secure autonomy. I think this is a real opportunity. Equally possible, of course, is that we end up with “DPI firewall”-equivalent products for agents, applying more heuristics (LLMs and otherwise) to the problem rather than fundamental solutions.