How GCC 4.3 deleted a NULL check in 2009
How undefined behavior in C lets compilers delete safety checks, why it drives most memory-safety CVEs, and what it means for AI-generated code.
The compiler is allowed to do anything
In 2009, security researcher Brad Spengler published an exploit for a bug in tun_chr_poll in drivers/net/tun.c: the function dereferenced a pointer before checking it for NULL. The check came two lines later. That should have been a denial-of-service bug at worst. Instead, GCC 4.3 looked at the code, saw the dereference, concluded the pointer could not be NULL (because dereferencing NULL is undefined behavior, and the compiler assumes UB never happens), and deleted the NULL check entirely. The result was CVE-2009-1897, a local root exploit.
That is what undefined behavior means in C. It is not a runtime error. It is not a crash. It is a license the standard grants the compiler to do whatever it wants, including deleting your safety checks, on the grounds that you promised not to invoke it.
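The shape of that bug fits in a dozen lines. This is a minimal sketch, not the kernel's code: struct device, poll_unsafe, poll_safe, and ERR_NO_DEVICE are hypothetical names, and whether a given compiler actually removes the check depends on its version and flags.

```c
#include <stddef.h>

/* Hypothetical stand-in for a driver's per-device state. */
struct device {
    int flags;
};

#define ERR_NO_DEVICE (-1)

/* Dereference first, check second. dev->flags is evaluated
 * unconditionally, so the compiler may assume dev is non-NULL
 * and treat the later check as dead code. */
int poll_unsafe(struct device *dev)
{
    int flags = dev->flags;      /* undefined behavior if dev == NULL */
    if (dev == NULL)             /* the optimizer is allowed to delete this */
        return ERR_NO_DEVICE;
    return flags;
}

/* Check first, dereference second. The check comes before any
 * dereference, so the compiler has nothing to infer non-NULL from. */
int poll_safe(struct device *dev)
{
    if (dev == NULL)
        return ERR_NO_DEVICE;
    return dev->flags;
}
```

Both functions behave identically for a valid pointer. Only poll_safe has a defined result when handed NULL.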
The C standard lists roughly 200 forms of undefined behavior. Signed integer overflow. Reading uninitialized memory. Dereferencing a pointer past the end of an array. Modifying the same variable twice between sequence points. Calling a function through a pointer of the wrong type. Any of these gives the compiler permission to produce code that does anything at all, including nothing, including the opposite of what the source code appears to say.
This post is about why that matters for security, and why it matters more now that large language models are writing C.
What the standard actually says
Section 3.4.3 of the C11 standard defines undefined behavior as “behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements.” The standard then lists, as one of the possible outcomes, “behaving during translation or program execution in a documented manner characteristic of the environment.” That is the optimistic reading. The pessimistic reading is on the same line: “ignoring the situation completely with unpredictable results.”
Compiler writers picked the pessimistic reading and built thirty years of optimization passes on top of it. Modern optimizing compilers, GCC and Clang most aggressively, assume that any code path containing undefined behavior is never executed. If your function contains a line that would be UB when x == 0, the compiler is allowed to assume x != 0 on every path that is guaranteed to reach that line, including the code that runs before it. It can use that assumption to delete branches, hoist loads, and reorder writes.
John Regehr at the University of Utah has spent years cataloging this on his blog. The paper “Undefined Behavior: What Happened to My Code?” by Xi Wang and colleagues at MIT walks through real cases where source code that looks correct compiles into binaries that are not. One example: a signed integer overflow check written as if (x + 100 < x) gets deleted, because signed overflow is UB, so the compiler concludes x + 100 is never less than x, so the branch is dead.
The check was there to prevent overflow. The compiler removed it because overflow is impossible. The overflow then happens at runtime.
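Here is that check next to a version the compiler cannot legally remove. The function names are hypothetical; the point is that the second form compares against INT_MAX before doing any arithmetic, so the check itself never overflows.

```c
#include <limits.h>
#include <stdbool.h>

/* Broken: relies on signed overflow wrapping around. Signed overflow
 * is UB, so the compiler may fold this to "return false". Calling it
 * with x > INT_MAX - 100 is itself undefined. */
bool will_overflow_broken(int x)
{
    return x + 100 < x;
}

/* Well-defined: INT_MAX - 100 cannot overflow, and the comparison
 * involves no arithmetic on x at all. */
bool will_overflow_checked(int x)
{
    return x > INT_MAX - 100;
}
```

A caller would test will_overflow_checked(x) first and only compute x + 100 when it returns false.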
The security shape of the problem
Three categories of CVE come straight out of undefined behavior:
Deleted checks. CVE-2009-1897 above. The Linux kernel responded by adding -fno-delete-null-pointer-checks to its build flags. That flag exists because the language gives the compiler too much rope, and the kernel does not want it.
Spatial memory safety failures. Buffer overflows, out-of-bounds reads, out-of-bounds writes. The MITRE CWE Top 25 has had “Out-of-bounds Write” (CWE-787) in the top three every year since 2020. Microsoft’s security response team reported in 2019 that about 70% of the vulnerabilities they assigned a CVE were memory safety issues, a share that had held roughly steady for more than a decade. The Chromium project reported a similar figure: around 70% of its high-severity security bugs were memory safety problems.
Temporal memory safety failures. Use-after-free, double-free, dangling pointers. These are harder to find with static analysis and tend to produce the most exploitable bugs, because freed memory often gets reallocated for attacker-controlled data.
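One common mitigation, sketched below, is to have the destructor clear the owner's pointer so that a second free or a later use through that pointer hits NULL instead of recycled memory. The struct session type and both functions are hypothetical, and the technique is partial by design: it clears one pointer, and any other copy of the address is still dangling.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical heap-allocated object that owns a second allocation. */
struct session {
    char  *token;
    size_t len;
};

/* Allocates a session holding a private copy of token.
 * Returns NULL on allocation failure or NULL input. */
struct session *session_new(const char *token)
{
    if (token == NULL)
        return NULL;

    struct session *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;

    s->len = strlen(token);
    s->token = malloc(s->len + 1);
    if (s->token == NULL) {
        free(s);
        return NULL;
    }
    memcpy(s->token, token, s->len + 1);   /* copies the terminator too */
    return s;
}

/* Frees the session and sets the caller's pointer to NULL.
 * Calling it again on the same pointer is a harmless no-op. */
void session_destroy(struct session **sp)
{
    if (sp == NULL || *sp == NULL)
        return;
    free((*sp)->token);
    free(*sp);
    *sp = NULL;
}
```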
The common thread is not that C programmers are bad. It is that the language was designed in 1972 for a machine with 64 kilobytes of memory, before threat modeling existed, and its semantics make it structurally impossible for the compiler to enforce the invariants that would prevent these bugs.
What “undefined” does to your threat model
If you maintain a C codebase, your threat model has to account for the fact that your source code is not what runs. The binary is what runs. The compiler decides what the binary does, and the compiler is allowed to use undefined behavior as a license to reshape your code.
This has practical consequences:
- Code review of C source is necessary but not sufficient. A reviewer who reads if (x + 100 < x) sees an overflow check. The compiler sees dead code.
- Defensive checks must be written in forms the compiler cannot prove redundant. Use __builtin_add_overflow in GCC and Clang, or cast to unsigned, where the standard defines wraparound.
- Static analyzers help but lag the optimizers. Coverity, CodeQL, and the Clang static analyzer catch many UB patterns. They do not catch all of them, and the set of UB patterns the optimizer exploits grows every release.
- Sanitizers (ASan, UBSan, MSan) catch UB at runtime when triggered by test input. They will not catch UB on a code path your tests never reach. They are still the best tool you have for finding these bugs before an attacker does.
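The two defensive forms named in the list above look roughly like this. The wrapper names checked_add and checked_size_add are hypothetical; __builtin_add_overflow is a GCC and Clang extension, not standard C, so this sketch assumes one of those compilers.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Signed addition through the compiler intrinsic. The builtin reports
 * overflow as a return value instead of invoking UB. Returns false and
 * leaves *out unchanged when the sum does not fit in an int. */
bool checked_add(int a, int b, int *out)
{
    int sum;
    if (__builtin_add_overflow(a, b, &sum))
        return false;
    *out = sum;
    return true;
}

/* Unsigned addition. Wraparound is defined for unsigned types, so the
 * after-the-fact comparison is legal and cannot be optimized away. */
bool checked_size_add(size_t a, size_t b, size_t *out)
{
    size_t sum = a + b;          /* wraps modulo SIZE_MAX + 1 */
    if (sum < a)                 /* a wrapped sum is smaller than either operand */
        return false;
    *out = sum;
    return true;
}
```

The unsigned form is the portable one. The intrinsic is the one to reach for when the values are genuinely signed.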
The SEI CERT C Coding Standard and the MISRA C guidelines both exist to constrain C to a subset where UB is less likely. Neither eliminates it. They reduce the surface.
Why this is about to get worse
Large language models trained on GitHub now generate a meaningful fraction of new C code. Pearce et al. at NYU (2022) found that roughly 40% of the programs GitHub Copilot produced for security-relevant scenarios were vulnerable, and Perry et al. at Stanford (2023) found that participants working with an AI assistant wrote less secure code than those working without one. The dominant bug categories were the same ones humans produce: buffer overflows, integer overflow, format string issues, missing bounds checks.
The interesting wrinkle is that models trained on existing code reproduce the patterns in that code, including the patterns the compiler is allowed to delete. A model that has seen ten thousand examples of if (ptr) { ... } ptr->field will generate ptr->field; if (ptr) { ... } with no understanding that the second form invites the compiler to remove the check.
Three things follow.
First, AI-generated C will not be safer than human-written C on average, and may be worse, because the model has no model of the abstract machine the standard describes. It pattern-matches on tokens.
Second, the volume of C will go up. More code, written faster, by people who reviewed it less carefully because the model seemed confident. The total count of UB-driven vulnerabilities in production will rise.
Third, AI systems themselves often run on C and C++ infrastructure. CUDA kernels, inference runtimes, model loaders, tokenizers. A use-after-free in llama.cpp or a buffer overflow in a tensor deserializer is not just a memory bug. It is a path from a malicious model file or a crafted prompt into arbitrary code execution on the host running the model. The supply chain for AI weights is not yet treated with the seriousness of the supply chain for executable binaries, and the parsers that read those weights are written in C and C++.
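To make the deserializer risk concrete, here is a minimal sketch of a bounds-checked header parse. The file layout is invented for illustration (a 4-byte "TNSR" magic, a 4-byte little-endian element count, then that many 4-byte values) and does not correspond to any real weight format. Every name in it is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TENSOR_HEADER_SIZE 8u   /* 4-byte magic + 4-byte count (assumed layout) */
#define TENSOR_ELEM_SIZE   4u   /* bytes per element (assumed) */

/* A validated, non-owning view into the caller's buffer. */
struct tensor_view {
    const uint8_t *data;
    uint32_t       count;
};

/* Reads a little-endian u32 byte by byte, avoiding unaligned loads. */
static uint32_t read_u32_le(const uint8_t *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Returns true and fills *out only if the declared element count
 * actually fits inside the bytes that were supplied. */
bool tensor_parse(const uint8_t *buf, size_t len, struct tensor_view *out)
{
    if (buf == NULL || out == NULL || len < TENSOR_HEADER_SIZE)
        return false;
    if (memcmp(buf, "TNSR", 4) != 0)
        return false;

    uint32_t count = read_u32_le(buf + 4);
    size_t avail = len - TENSOR_HEADER_SIZE;

    /* Divide instead of multiplying so count * 4 can never overflow. */
    if (count > avail / TENSOR_ELEM_SIZE)
        return false;

    out->data  = buf + TENSOR_HEADER_SIZE;
    out->count = count;
    return true;
}
```

The attacker controls count. The only thing standing between that field and an out-of-bounds read is the comparison against avail, which is why it is written in a form that cannot wrap.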
What to do about it
If you are shipping C in 2026, four things are worth doing this quarter:
- Turn on UBSan in CI. Compile your test suite with -fsanitize=undefined -fno-sanitize-recover=undefined and fail the build on any hit. This catches the UB your tests actually exercise. It will not catch all UB. It will catch a lot.
- Audit your defensive checks. Grep for patterns like x + n < x, p->field followed by if (p), and arithmetic on signed types in security-sensitive paths. Move NULL checks ahead of the first dereference, and replace wraparound tests with __builtin_*_overflow intrinsics or unsigned arithmetic.
- Fuzz the parsers. Anything that reads attacker-controlled input (file formats, network protocols, model weights, configuration) should be running under libFuzzer or AFL++ continuously. Memory safety bugs in parsers are the highest-value class of bug for an attacker.
- Treat AI-generated C as untrusted input. Review it the way you would review a pull request from a contractor you have never met. The model is confident. The compiler does not care.
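For the fuzzing item, a libFuzzer harness is small enough to write in an afternoon. The sketch below assumes a hypothetical parse_record function (one length byte followed by that many payload bytes) standing in for whatever parser you actually ship. LLVMFuzzerTestOneInput is the entry point libFuzzer looks for.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parser under test. A record is one length byte followed
 * by that many payload bytes. Returns the payload length, or -1 if the
 * input is empty or shorter than the length byte claims. */
int parse_record(const uint8_t *data, size_t size)
{
    if (size < 1)
        return -1;

    size_t payload_len = data[0];
    if (payload_len > size - 1)      /* size >= 1 here, so no wraparound */
        return -1;

    return (int)payload_len;
}

/* libFuzzer calls this once per generated input. The return value must
 * be 0; crashes and sanitizer reports are what signal a finding. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    (void)parse_record(data, size);
    return 0;
}
```

With Clang, building this as something like clang -g -O1 -fsanitize=fuzzer,address,undefined harness.c (the file name is an assumption) links in the fuzzing driver and turns on ASan and UBSan together, so the same run covers the sanitizer item too.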
For anything new and security-relevant, Rust is now mature enough to use. Zig has safer defaults than C but does not guarantee memory safety, and Carbon is still experimental. The U.S. Office of the National Cyber Director published a report in February 2024 recommending a move to memory-safe languages. The TIOBE index still has C in the top three. Both of those things will be true for a long time.
The language is not going away. The undefined behavior is not going away. The compiler will keep finding new ways to exploit the license the standard gave it. Your job is to write code that survives that, and to know, when you read a C source file, that you are reading a suggestion the optimizer is free to ignore.