Megalodon hijacked 55,000 GitHub repos via token replay

Megalodon is the name attached to a campaign that touched more than 55,000 GitHub repositories across personal accounts, organisations, and a smaller set of GitHub Apps. The vector was not a CVE in GitHub itself. The vector was workflow trust abuse, leaked Personal Access Tokens scraped from public commit history, and OAuth scope inheritance through compromised third-party integrations. In MITRE ATT&CK terms: T1195.002, Compromise Software Supply Chain. T1078.004, Valid Accounts: Cloud Accounts. T1552.001, Unsecured Credentials: Credentials In Files. The campaign chained these. It did not innovate. It scaled.

The primary primitive was credential theft followed by automated repository rewriting. Megalodon operators harvested tokens from three sources. The first was git history - .env files, CI config fragments, and hardcoded ghp_ and github_pat_ prefixed tokens that were committed and then deleted in later commits without a history rewrite. Git does not garbage-collect referenced objects, and GitHub’s REST API exposes commit SHAs that remain reachable through fork networks long after the originating branch is force-pushed. The second source was npm and PyPI postinstall scripts on packages the attacker had previously trojanised, exfiltrating any GITHUB_TOKEN, NPM_TOKEN, or ~/.netrc material visible in the developer environment. The third was OAuth refresh tokens lifted from a compromised GitHub App that held repo and workflow scope across roughly 1,200 installations.
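
The first harvest source can be checked from the defender's side. The following is a minimal sketch, assuming the token shapes shown in the patterns: the ghp_ and github_pat_ prefixes come from the text above, while the lengths are an assumption and may not match every token GitHub issues. The function name find_github_tokens is hypothetical.

```python
import re

# Assumed token shapes. The prefixes are from the article; the lengths
# are an assumption and should be checked against current GitHub formats.
TOKEN_PATTERNS = {
    "classic_pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "fine_grained_pat": re.compile(r"\bgithub_pat_[A-Za-z0-9_]{82}\b"),
}


def find_github_tokens(text):
    """Return (kind, line_number, redacted_prefix) for each token-shaped string."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for kind, pattern in TOKEN_PATTERNS.items():
            for match in pattern.finditer(line):
                token = match.group(0)
                # Keep only a short prefix so the report does not re-leak the secret.
                hits.append((kind, line_no, token[:8] + "..."))
    return hits
```

Fed the output of git log -p --all, a scan like this also covers commits where the token was later deleted, which is the case the paragraph describes. Any hit should be treated as exposed and rotated, not merely removed from the tree.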

The bug class is not memory corruption. It is trust boundary collapse. GitHub’s permission model assumes that a token presented over HTTPS to api.github.com is held by the principal it was issued to. That assumption holds at the protocol layer. It fails at the operational layer the moment the token leaks into a public artefact, a build log, a container image layer, or a forked workflow run. Once leaked, the token is indistinguishable from the legitimate user. There is no device binding. There is no proof-of-possession. Classic PATs carry no rotation requirement and, until the fine-grained PAT migration, no per-repository scoping.

The exploit path runs in four stages. Stage one, harvest. The operators ran continuous scanning against the GitHub Events API and public push events, paired with TruffleHog-class regex against new commits in the first 60 seconds after push. The 60-second window matters because, where push protection is not enabled or does not recognise the pattern, secret scanning only fires after the push reaches the ref, and the gap between commit acceptance and secret revocation is the live exploitation window. Median time-to-exploit for leaked AWS keys is under five minutes per GitGuardian’s 2024 telemetry. For GitHub tokens it is faster, because the same API endpoint that validates the token also enumerates its scope.

Stage two, validate and classify. Each captured token was checked against GET /user and GET /user/repos to determine identity, scope, and reachable repositories. Tokens with repo scope were sorted by organisation membership and pinned to high-value targets - repositories with more than 50 stars, repositories owned by organisations with verified domain ownership, and repositories with active GitHub Actions workflows. The last filter matters. Active workflows mean the token can be amplified.
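
The same classification is useful to responders deciding which leaked tokens to rotate first. The sketch below is a triage helper under stated assumptions: it takes the comma-separated scope list that GitHub returns in the X-OAuth-Scopes response header for classic tokens, and the HIGH_RISK_SCOPES set and the function names are illustrative choices, not a GitHub-defined ranking.

```python
# Illustrative choice of scopes to escalate on; not an official ranking.
HIGH_RISK_SCOPES = {"repo", "workflow", "admin:org", "write:packages", "delete_repo"}


def parse_scopes(header_value):
    """Split an X-OAuth-Scopes style header value into a set of scope names."""
    if not header_value:
        return set()
    return {scope.strip() for scope in header_value.split(",") if scope.strip()}


def triage_token(header_value):
    """Summarise which scopes a leaked token carries and whether it can be amplified."""
    scopes = parse_scopes(header_value)
    return {
        "scopes": sorted(scopes),
        "high_risk": sorted(scopes & HIGH_RISK_SCOPES),
        # Assumption: repo plus workflow is what lets a classic token
        # rewrite files under .github/workflows/.
        "can_amplify": {"repo", "workflow"} <= scopes,
    }
```

For example, triage_token("repo, workflow, read:org") flags both repo and workflow and marks the token as amplifiable, which puts it at the front of the rotation queue.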

Stage three, amplification through workflow injection. This is where Megalodon moved beyond credential abuse into supply chain mechanics. Where the harvested token had write access to .github/workflows/, the operators added a workflow triggered by pull_request_target, or modified an existing one, to send ${{ secrets.GITHUB_TOKEN }} and any organisation-level secrets to an attacker-controlled exfiltration endpoint. The pull_request_target trigger is the operative weakness. Unlike pull_request, it runs in the context of the base repository with a read/write GITHUB_TOKEN and access to repository secrets. A workflow invoked under pull_request_target that checks out the PR head and executes code from it grants the PR author full repo-scope execution. This pattern has been documented since 2020. It is still present in roughly 7,000 repositories per recent ecosystem scans. Megalodon weaponised the existing misconfiguration rather than introducing a new one.
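
The misconfiguration described here can be flagged with a text-level heuristic. This is a rough sketch, not a YAML-aware linter: it assumes the risky shape is a pull_request_target trigger combined with an actions/checkout step that references the PR head, and the function names are hypothetical. It will miss indirect checkouts and can misread a # inside a quoted string.

```python
import re

PR_HEAD_REF = re.compile(
    r"github\.event\.pull_request\.head\.(sha|ref)|github\.head_ref"
)


def strip_comments(text):
    """Drop YAML comments line by line (heuristic: ignores quoting)."""
    return "\n".join(re.sub(r"(^|\s)#.*$", "", line) for line in text.splitlines())


def is_risky_workflow(text):
    """True if the workflow runs on pull_request_target and checks out the PR head."""
    body = strip_comments(text)
    has_trigger = re.search(r"\bpull_request_target\b", body) is not None
    has_checkout = re.search(r"uses:\s*actions/checkout@", body) is not None
    uses_pr_head = PR_HEAD_REF.search(body) is not None
    return has_trigger and has_checkout and uses_pr_head
```

Run over every file under .github/workflows/, a positive result is a prompt for manual review rather than proof of exploitability, since what matters is whether code from the PR head is executed after checkout.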

Stage four, persistence and propagation. Compromised tokens were used to add SSH deploy keys, register self-hosted runners, and add inactive collaborators with triage permissions that do not generate the same notification volume as admin additions. Where a victim repository was a dependency for downstream consumers - npm packages, Go modules, Action references pinned to a mutable tag - the operators pushed malicious commits to the default branch and let the dependency graph distribute the payload. Where the repository was an Action consumed by uses: org/action@v1 style mutable tag references, the v1 tag was force-moved to point at a malicious commit. Pinning to SHA would have broken this. Pinning to tag did not.
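
The difference between tag pinning and SHA pinning is mechanical enough to check automatically. A minimal sketch, assuming that anything after the @ that is not a full 40-character hexadecimal commit SHA counts as mutable; the name unpinned_actions is hypothetical, and local (./) and docker:// references are skipped as out of scope.

```python
import re

# Capture the reference after "uses:", stopping at whitespace, quotes, or a comment.
USES = re.compile(r"^\s*-?\s*uses:\s*['\"]?([^\s'\"#]+)", re.MULTILINE)
FULL_SHA = re.compile(r"^[0-9a-f]{40}$")


def unpinned_actions(workflow_text):
    """Return Action references that are not pinned to a full commit SHA."""
    findings = []
    for match in USES.finditer(workflow_text):
        ref = match.group(1)
        if ref.startswith("./") or ref.startswith("docker://"):
            continue  # local actions and container images are not tag-moved this way
        _, _, version = ref.partition("@")
        if not FULL_SHA.match(version):
            findings.append(ref)
    return findings
```

A reference such as org/action@v1 is reported; the same Action pinned to a 40-character SHA is not, because force-moving the v1 tag cannot change what that reference resolves to.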

Real-world parallels are direct. The Codecov bash uploader compromise in 2021 used the same amplification pattern - a single trusted artefact rewritten upstream, executed in thousands of CI environments, harvesting environment variables across the consumer base. The s1ngularity attack on Nx in August 2025 used pull_request_target abuse to dump npm tokens. The SolarWinds Orion compromise demonstrated the build-system poisoning model at scale. Megalodon is the same shape applied to GitHub-native primitives. Attribution is unconfirmed as of writing. TTP overlap with clusters tracked by Mandiant as UNC4899 and with the loosely-identified Lazarus subgroup responsible for the 3CX intrusion has been noted but not corroborated by independent telemetry.

What defenders see depends entirely on the logging surface that was enabled before the intrusion. GitHub’s audit log surfaces git.clone, repo.add_member, workflows.created_workflow_run, oauth_authorization.create, and personal_access_token.access_granted events through the audit log streaming endpoint and the GraphQL auditLog query. Most of the Megalodon activity is visible in these events. Most organisations do not stream them. The default retention for audit logs on GitHub Enterprise Cloud is 180 days. For Free and Pro tiers, audit logging is limited to the user’s personal security log, which is incomplete for organisation events.

In SIEM correlation, the high-signal indicators are these. A token authenticating from a new ASN within minutes of a public push from the legitimate user’s known ASN. A git.clone event followed within 10 seconds by a repo.access enumeration burst against more than five distinct repositories. A workflow file modification commit authored through the REST API rather than through a git push over SSH or HTTPS - visible in the audit log as workflows.updated_workflow_run with no preceding git.push event. A self-hosted runner registration on a repository that has never previously used self-hosted runners. None of these fire as native GitHub Advanced Security alerts. They require custom detections built on the audit log stream.
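
One of these correlations, the clone followed by an enumeration burst, can be sketched over exported audit events. This is an illustration under assumptions: the dictionary keys action, actor, repo, and ts (epoch seconds) are placeholder field names rather than the audit log's actual schema, the burst action name is taken from the paragraph above, and clone_then_burst is a hypothetical function name.

```python
from collections import defaultdict


def clone_then_burst(events, window_s=10, min_repos=6, burst_action="repo.access"):
    """Flag actors whose git.clone is followed within window_s seconds by
    burst_action events touching at least min_repos distinct repositories."""
    by_actor = defaultdict(list)
    for event in sorted(events, key=lambda e: e["ts"]):
        by_actor[event["actor"]].append(event)

    alerts = []
    for actor, actor_events in by_actor.items():
        for i, event in enumerate(actor_events):
            if event["action"] != "git.clone":
                continue
            # Distinct repositories touched after this clone, inside the window.
            repos = {
                later["repo"]
                for later in actor_events[i + 1:]
                if later["action"] == burst_action
                and later["ts"] - event["ts"] <= window_s
            }
            if len(repos) >= min_repos:
                alerts.append(
                    {"actor": actor, "clone_ts": event["ts"], "repos": sorted(repos)}
                )
    return alerts
```

The default min_repos of 6 encodes "more than five distinct repositories". In practice the same logic would run as a streaming query in the SIEM; the batch form here only shows the shape of the rule.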

EDR telemetry on developer endpoints will not see most of this. The intrusion happens against api.github.com over TLS, from infrastructure that is not the developer’s machine. Where the developer’s machine is the harvest point - postinstall script exfiltration via a trojanised npm package - Sysmon Event ID 1 captures the node process spawn, Event ID 3 captures the outbound network connection to the attacker C2, and Event ID 11 captures the write of any persistence artefact. Detection here depends on outbound network egress filtering on developer workstations, which is uncommon outside high-assurance environments. The gap is structural. The blast radius from a single developer endpoint into 55,000 repositories does not transit any control point that a typical enterprise EDR observes.

GitHub’s secret scanning push protection is the closest native control. It blocks pushes containing recognised token patterns from common providers - AWS, Stripe, Slack, GitHub itself. It does not block custom secrets, non-standard formats, or tokens embedded in binary blobs or compressed archives. Push protection coverage is high for the top 50 secret types and falls off sharply outside that set. Where it fires, the median time-to-revocation for the partner-program-integrated providers is under 30 seconds. Where the secret is custom or the format is non-standard, the token reaches public availability and remains exposed until manual rotation.

The patch boundary for Megalodon is not a version number. There is no single CVE to remediate against. The technical reality post-campaign is this. Every PAT that touched a compromised repository within the campaign window must be assumed exposed and rotated. Every OAuth App authorisation against a compromised GitHub App must be revoked and re-authorised against a re-keyed App. Every workflow file in every reachable repository must be audited for pull_request_target triggers that check out untrusted code. Every Action reference pinned to a mutable tag must be re-pinned to a commit SHA. Every self-hosted runner registered against an affected repository must be re-keyed. Every organisation-level secret accessible to compromised workflows must be rotated.

Residual exposure persists in three places after that work is complete. The first is git history. Tokens deleted from current branches but reachable through commit SHAs in forks remain extractable by anyone who knows where to look. Force-push and git filter-repo do not remove objects from forks. The second is downstream consumers. Any malicious commit pushed to a default branch during the active window may have been pulled into mirror repositories, CI caches, and developer workstation clones before the rollback. The third is the trust graph. Where Megalodon added collaborators, deploy keys, or App installations that were not cleanly enumerated and removed, the persistence outlives the campaign. The token was rotated; the access path was not.

Fine-grained PATs with per-repository scope, mandatory expiry on PATs, OIDC-based federation for CI/CD in place of long-lived secrets, SHA-pinned Action references, and audit log streaming to a queryable SIEM are the controls that reduce exposure to this class of campaign. None of them are new. The repositories that were compromised did not have them. The ones that did were not in the 55,000.
