Git LFS Pointer Attacks: When Your Model is Just a Text File

Understanding Git LFS

Machine learning models are huge, and Git wasn't designed to version 10GB files. Git LFS (Large File Storage) solves this by keeping large binaries outside the regular Git history.

In a repo that uses LFS, Git itself never stores the 10GB binary. What it commits and clones is a tiny pointer file (about 130 bytes) that looks like this:

version https://git-lfs.github.com/spec/v1
oid sha256:4d7a214614ab2935c943f9e0...
size 123456789

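The actual binary is downloaded separately by the Git LFS client, which looks it up on the LFS server by its OID and writes it over the pointer in your working tree.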
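The pointer format above is simple enough to parse by hand. The following is a minimal sketch, assuming the three-field layout shown (version, oid, size) and a full 64-character SHA-256 digest; the names `LfsPointer` and `parse_lfs_pointer` are illustrative and not part of Git LFS.

```python
import re
from typing import NamedTuple

LFS_SPEC_URL = "https://git-lfs.github.com/spec/v1"
SHA256_HEX = re.compile(r"[0-9a-f]{64}")


class LfsPointer(NamedTuple):
    oid: str   # hex SHA-256 of the real binary
    size: int  # size of the real binary in bytes


def parse_lfs_pointer(text: str) -> LfsPointer:
    """Parse pointer text into an LfsPointer, or raise ValueError."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, sep, value = line.partition(" ")
        if not sep:
            raise ValueError(f"malformed pointer line: {line!r}")
        fields[key] = value

    if fields.get("version") != LFS_SPEC_URL:
        raise ValueError("not a Git LFS v1 pointer")

    algo, _, digest = fields.get("oid", "").partition(":")
    if algo != "sha256" or not SHA256_HEX.fullmatch(digest):
        raise ValueError("missing or malformed sha256 oid")

    size = fields.get("size", "")
    if not size.isdigit():
        raise ValueError("missing or malformed size")

    return LfsPointer(oid=digest, size=int(size))
```

Because the parser rejects anything that is not a well-formed pointer, it also works as a strict "is this really a pointer?" test.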

The Attack: Pointer Confusion

There are two variations of this threat:

1. The "Fake Download" (Denial of Service / Confusion)

A user clones a repository but doesn't have LFS installed or configured correctly, so model.pt in their working tree is still the pointer file. They think they have the model. They try to load it:

import torch
torch.load("model.pt")  # model.pt is really a ~130-byte text pointer

The script crashes, often with an unpickling error such as "invalid load key, 'v'", because it's trying to parse the pointer's text as a binary model. This is not an attack in itself, but it breaks pipelines and causes debugging nightmares.
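One way to turn that confusing crash into a clear error is to check the file before loading it. This is a minimal sketch, assuming that a pointer always starts with the spec line shown earlier and that a 1024-byte cap is a generous bound for a roughly 130-byte pointer; `is_lfs_pointer` and `require_real_file` are hypothetical helper names.

```python
from pathlib import Path

LFS_PREFIX = b"version https://git-lfs.github.com/spec/v1"
MAX_POINTER_BYTES = 1024  # assumption: pointers are tiny, real models are not


def is_lfs_pointer(path) -> bool:
    """Return True if the file looks like an LFS pointer, not a binary."""
    p = Path(path)
    if p.stat().st_size > MAX_POINTER_BYTES:
        return False
    with p.open("rb") as f:
        return f.read(len(LFS_PREFIX)) == LFS_PREFIX


def require_real_file(path) -> Path:
    """Raise a readable error instead of letting the model loader crash."""
    if is_lfs_pointer(path):
        raise RuntimeError(
            f"{path} is a Git LFS pointer, not the model; "
            "install Git LFS and run `git lfs pull`"
        )
    return Path(path)
```

Calling `require_real_file("model.pt")` right before `torch.load` makes the failure self-explanatory in CI logs.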

2. The "Swap" Attack (Integrity Compromise)

An attacker compromises a repository. Instead of uploading a new 10GB model (which is slow and noticeable), they simply change the OID hash in the pointer file to point to a malicious blob they uploaded earlier or hosted elsewhere.

The user pulls the repo. Git LFS sees the new pointer and downloads the malicious blob. The filename is the same (model.pt), and the diff is a single changed line in a text file, but the content is now a backdoored model.
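Since the swap shows up only as a changed OID line, one defense is to pin the OIDs you have reviewed and compare them against the pointers in the commit you are about to deploy. The sketch below assumes you keep such a pinned mapping yourself (for example in a separately reviewed file); `pointer_oid` and `find_swapped` are hypothetical names.

```python
import re

# Matches the "oid sha256:<64 hex chars>" line of a pointer file.
OID_RE = re.compile(r"^oid sha256:([0-9a-f]{64})$", re.MULTILINE)


def pointer_oid(pointer_text: str) -> str:
    """Extract the hex SHA-256 OID from pointer text."""
    match = OID_RE.search(pointer_text)
    if match is None:
        raise ValueError("no sha256 oid line found")
    return match.group(1)


def find_swapped(pinned: dict, current_pointers: dict) -> list:
    """Return paths whose pointer is missing or differs from the pinned OID.

    pinned:           path -> expected hex OID (reviewed out of band)
    current_pointers: path -> pointer text from the commit under test
    """
    swapped = []
    for path, expected in pinned.items():
        text = current_pointers.get(path)
        if text is None or pointer_oid(text) != expected:
            swapped.append(path)
    return sorted(swapped)
```

The pointer text for a given commit can be read without triggering any LFS download, for instance with `git cat-file -p HEAD:model.pt`. A non-empty result from `find_swapped` means a pointer changed since it was last reviewed, which may be a legitimate model update or a swap, and should be looked at by a person either way.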

Verifying Integrity

You cannot trust the filename. You must verify the hash.

Veritensor includes a specialized LFS checker. When you scan a model file, it:

Detects if the file is actually just a text pointer (preventing the "Fake Download" error).
Calculates the SHA-256 of the actual binary.
(Optional) Queries the Hugging Face API to verify that this hash matches the official, immutable record for that model revision.
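The steps above can be approximated in a few lines of Python. This is an illustrative sketch only, not Veritensor's implementation: the function names and the status strings "pointer", "mismatch" and "ok" are made up here, and the expected digest is passed in as a parameter, so fetching it from a trusted registry is left out.

```python
import hashlib
import hmac

LFS_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def sha256_file(path, chunk_size=1 << 20) -> str:
    """Hash the file in chunks so a 10GB model never sits in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_model(path, expected_sha256: str) -> str:
    """Return 'pointer', 'mismatch', or 'ok' for the file at path."""
    # Step 1: is this just an LFS pointer left behind by a bad clone?
    with open(path, "rb") as f:
        if f.read(len(LFS_PREFIX)) == LFS_PREFIX:
            return "pointer"
    # Step 2: hash the real binary.
    actual = sha256_file(path)
    # Step 3: compare against the digest obtained from a trusted source.
    if hmac.compare_digest(actual, expected_sha256.lower()):
        return "ok"
    return "mismatch"
```

A deployment script would refuse to continue on anything other than "ok".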

veritensor scan ./model.bin --repo meta-llama/Llama-2-7b
# Output: VERIFIED (Hash matches official registry)

If the hash doesn't match, Veritensor blocks the deployment. This ensures that the model you are running is exactly the model you intended to use.
