Sirraide December 18, 2025, 6:43pm 6 Yeah, agreed. If we eventually do find that we want to carve out some sort of exception for some tool, we can just update the policy at that point.
I don’t see why this is incompatible with that. A policy introduced by this RFC can be overridden by a future RFC, including one for a specific case that would like an exception.
While I am fully convinced that we will end up needing a well-defined path for exceptions to this policy, which is why I brought it up, if everyone else would prefer to skip that for now, I can live with the policy going in as is, with the full expectation that we will need to update it in the future to define a principled way to obtain exceptions.
True, but once a policy is established, deviating from it takes more argumentative and social effort.
This is a very strange statement – the burden of proof of the value of
Thank you so much for sharing your community's experience. It is good to know that we are not the only ones struggling with these issues, and it is also good to know how other communities are attempting to deal with them.
cmtice December 17, 2025, 7:27pm 2 Hi Reid, I understand and generally agree with the sentiment that prompted this proposal, but I think maybe your current policy is too restrictive and you are throwing out the baby with the bathwater. In particular I can imagine cases where we might want to make exceptions to the general policy, e.g. for AI tools that are designed to handle a small, restricted, and easily automatable set of maintenance-type changes. I think this policy should include a well-defined path for obtaining exceptions to the general rule (that a human must be in the loop before a PR can be posted). An example of such a path might be:

1. Post an RFC detailing what problem the proposed AI agent will solve and how it will solve it.
2. Get approval for the RFC.
3. Have a short testing period, where humans must check their proposed changes before allowing them to be posted upstream, and must comment in the PR both that the original content came from AI, and whether or not the human needed to update the original content.
4. Final review by a small committee (possibly one of the area leads teams) on whether or not the AI is generating acceptable-quality PRs; the committee grants the exception (or not).

Note that's just a rough outline, and would probably need refinement. Just my 2 cents.
I don’t agree with this characterization. The problem we have today is not that we have a lot of LLM-based tools we desperately need to integrate into our automation. Instead, I think the problem is that reviewers struggle with a wave of LLM output coming in as contributions, where contributors don’t have enough understanding of their own PRs. The policy Reid drafted in this thread is a great step towards addressing that latter problem. We are months into AI policy discussions, and I don’t think that hypothetical future LLM automation is worth delaying much-needed changes any longer.
ms178 December 26, 2025, 9:51am 11 Without any empirical evidence, this remains an unsubstantiated claim. I’d argue that such an LLM-assisted review could be part of the learning curve. It also does not reflect the rapid improvements in AI quality, which might mitigate the issues related to bad output quality over time. Instead of shutting the door on non-programmers with such language, I propose hard, objective criteria to act as the AI quality filter, e.g. measurable and reproducible improvements (performance numbers, crash fixes, etc.) that need to be explicitly mentioned within the MR/issue.
It’s worth noting that the current permissiveness is quite new in the overall history of the project. The reality is that the LLVM community has discouraged automated and/or bulk changes of any form for a very long time, including changes generated by tooling as straightforward as a sed script. When such changes have been allowed in the past, it has been after receiving prior approval from the community. I view AI-generated changes as falling into precisely the same bucket. I will restate my outlook from the earlier thread: while being inclusive and welcoming is a goal of the LLVM project, it does not supersede the goal of building a suite of compiler & toolchain components that benefit our users. Most of the time these goals are not in conflict, but (again IMO) an overly permissive AI contribution policy betrays the duty of care we have to our users.
At first glance, this sounds reasonable, but I don’t think it works in practice. In theory you can define “objective” standards, but determining whether a PR actually meets those standards still costs maintainer time and attention. And I strongly suspect that cost is often close to what it would take to just do a full review anyway. Let’s also acknowledge something: in open source we generally don’t judge contributions by a contributor’s résumé or claimed experience. But interpersonal trust and publicly earned merit (see ASF’s explanation of Meritocracy and the Apache Way, for example) are genuinely important in open-source communities. That’s not because people are biased; it’s because, as imperfect as it may be, it’s the lowest-cost way to make decisions at scale. If we discard that dynamic entirely, I’m not sure open source would function as effectively as it does today.
rnk December 17, 2025, 7:09pm 1
I’ve updated the PR, and I’ve pasted the markdown below as well, but you can also view it on GitHub.
The reason for our “human-in-the-loop” contribution policy is that processing patches, PRs, RFCs, and comments to LLVM is not free – it takes a lot of maintainer time and energy to review those contributions! Sending the unreviewed output of an LLM to open source project maintainers extracts work from them in the form of design and code review, so we call this kind of contribution an “extractive contribution”.
Our golden rule is that a contribution should be worth more to the project than the time it takes to review it. These ideas are captured by this quote from the book Working in Public by Nadia Eghbal:
Prior to the advent of LLMs, open source project maintainers would often review any and all changes sent to the project, simply because posting a change for review was a sign of interest from a potential long-term contributor. While new tools enable more development, they shift effort from the implementor to the reviewer, and our policy exists to ensure that we value and do not squander maintainer time.
If a maintainer judges that a contribution is extractive (i.e. it doesn’t comply with this policy), they should copy-paste the following response to request changes, add the extractive label if applicable, and refrain from further engagement:
The best ways to make a change less extractive and more valuable are to reduce its size or complexity or to increase its usefulness to the community. These factors are impossible to weigh objectively, and our project policy leaves this determination up to the maintainers of the project, i.e. those who are doing the work of sustaining the project.
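As an aside (not part of the policy text above), here is a minimal sketch of how a maintainer could script the step described earlier, i.e. posting the canned response and applying the extractive label, using the GitHub REST API. The repository name, token source, exact label name, and response text are assumptions for illustration, not something the policy prescribes.

```python
# Illustrative sketch only (not part of the policy): post the canned response and
# apply the "extractive" label to a PR via the GitHub REST API. The repository,
# token source, label name, and response text are assumptions for illustration.
import os
import requests

REPO = "llvm/llvm-project"          # assumed target repository
TOKEN = os.environ["GITHUB_TOKEN"]  # assumed token with repo access
HEADERS = {
    "Authorization": f"Bearer {TOKEN}",
    "Accept": "application/vnd.github+json",
}
CANNED_RESPONSE = "..."  # the policy's copy-paste response text would go here

def flag_extractive(pr_number: int) -> None:
    """Post the canned response and add the 'extractive' label to a PR."""
    # Pull requests share the issues API for comments and labels.
    base = f"https://api.github.com/repos/{REPO}/issues/{pr_number}"
    requests.post(f"{base}/comments", headers=HEADERS,
                  json={"body": CANNED_RESPONSE}).raise_for_status()
    requests.post(f"{base}/labels", headers=HEADERS,
                  json={"labels": ["extractive"]}).raise_for_status()
```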


