Opinion

The New LessWrong LLM Policy is Worse Than You Think

Recently, the moderators of LessWrong decided to change the site’s policies on LLM usage. The essence of the policy can be summarized by the following excerpt:

With all that in mind, our new policy is this:

“LLM output” includes all of:
- text written entirely by an LLM
- text that was written by a human and then substantially[6] edited or revised by an LLM
- text that was written by an LLM and then edited or revised by a human

“LLM output” does not include:
- text that was written by a human and then lightly edited or revised by an LLM
- text written by a human which includes facts, arguments, examples, etc. that were researched/discovered/developed with LLM assistance (if you “borrow language” from the LLM, that no longer counts as “text written by a human”)
- code (either in code blocks or in the new widgets)

“LLM output” must go into the new LLM content blocks. You can put “LLM output” into a collapsible section without wrapping it in an LLM content block if all of the content is “LLM output”. If it’s mixed, you should use LLM content blocks within the collapsible section to demarcate those parts which are “LLM output”.

We are going to be more strictly enforcing the “no LLM output” rule by normalizing our auto-moderation logic to treat posts by approved[7] users similarly to posts by new users – that is, they’ll be automatically rejected if they score above a certain threshold in our automated LLM content detection pipeline. Having spent a few months staring at what’s been coming down the pipe, we are also going to be lowering that threshold.

While certainly well-intentioned, the policy is vague, difficult to enforce, and detrimental to the development of high-quality posts. In this essay, I will lay out the benefits of using LLMs for writing, respond to the arguments cited in favor of the policy change, and advocate for a more nuanced solution to the increasing usage of LLMs on this forum.

The benefits of using LLMs for writing:

LLMs save a significant amount of time for the following reasons:

1.) Boilerplate: For sections of posts that are more about the craft of writing itself than about ideas (introductions, conclusions), having an LLM expand upon a template saves a lot of time while not really changing the message.

2.) Editing: Sometimes I write a paragraph and the wording is just a bit off. In my experience, LLMs are pretty good at taking something you wrote and making it sound smoother. Some people may enjoy rewriting a paragraph until it sounds just right, but I personally care more about expressing my ideas in an engaging manner than about the craft of writing itself.

3.) Translation: For non-native English speakers, LLMs can help effectively translate their ideas into English. While it’s difficult to precisely measure the benefits of using LLMs for translation compared to traditional tools such as Google Translate, LLMs outperformed Google Translate in one study on the translation of ancient Indian texts into English, and most of the evidence I have seen on this question points to LLMs being better. The policy change seems to ignore this, but even if it didn’t, the moderators would face a dilemma: either create a carve-out for non-native English speakers, clearly demonstrating the arbitrary nature of the policy, or make writing more difficult for these users.

4.) Source Searching, Feedback, and Other Auxiliary Uses: Beyond writing itself, LLMs are a good tool for finding relevant sources.
While traditional search engines can also do the job, I find LLMs are often better for niche topics. And if I am going to be using an LLM anyway, I might as well use it for other tasks. A similar thing could be said for LLM feedback (although I haven’t used this much) and image generation.

While LLMs certainly aren’t perfect at writing, and are not a substitute for human thinking, they substantially reduce the amount of time it takes to write posts without really detracting from the author’s authentic voice, provided the tools are used responsibly. The policy recognizes this somewhat by allowing for content “lightly edited or revised by an LLM”, but this standard is unclear, likely varies from moderator to moderator, and risks creating a chilling effect on LLM usage. A better approach is to police only posts which are almost entirely devoid of human input.

A response to the critiques of using LLMs while writing:

1.) “LLM writing is worse”: To start, I think there is definitely an element of truth to this claim: in my experience, LLMs asked to write on their own tend to be less creative, engaging, and insightful than human writers. However, this problem is mitigated when you use LLMs less like a ghostwriter and more like autocomplete, telling them to improve or flesh out sections of text that already have a clear direction.

As a commenter on the post explaining the update wrote, LLM writing is now functionally indistinguishable from human writing; readers have difficulty differentiating between human-generated and LLM-generated text. While certain individuals may be able to detect AI writing better than others, and may be annoyed by certain stylistic elements common in LLM writing, I see no reason we cannot rely on upvotes to decide what type of content the broader LessWrong community wants to see.

2.) “Using LLM writing obfuscates the human mind behind the screen”: In the update, the writer argues that a substantial part of what we care about is the beliefs and perspectives of the writer, not just the arguments they provide. I agree with this to an extent, which is why I think people who use LLMs to assist their writing should review the outputs to ensure they represent the argument well (and to ensure they are factual). However, once again, I do not see why a policy change is necessary to address this. Even before LLMs, the exact opinions and attitudes of an author were often clarified in the comment section. People make careless mistakes, poor wording choices, and conspicuous omissions in their own writing, so I don’t think much is lost when an LLM writes something in a slightly different way than the author intended. (I would like to see a significant example of this happening, though; in my experience, LLMs are pretty good at filling in an argument if you give them a decent amount to work with.)

Some might fear that people will simply let LLMs take over their writing entirely, but I think very few people actually let an LLM generate an entire post with minimal input. Even if they did, the post would likely be low-effort in more ways than one and get downvoted, solving the issue without a dedicated moderation policy. And even if LLM writing advances to the point where such posts would not be filtered out, there are better ways to deal with them than a blanket ban on LLM-assisted content. Simply tracking the amount of time an author spends on a draft, combined with checking for very high LLM-detection scores, could practically eliminate pure LLM writing, as in the sketch below.
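To make this concrete, here is a minimal sketch of what such a check might look like. It is purely illustrative: the function, the thresholds, and the detector_score input are all hypothetical, not part of any existing LessWrong system.

```python
from dataclasses import dataclass

@dataclass
class DraftStats:
    minutes_editing: float   # time the author spent in the editor
    edit_count: int          # number of saved revisions
    detector_score: float    # LLM-detector probability in [0, 1]

# Hypothetical thresholds; real values would need tuning against
# known-human and known-LLM posts to keep false positives rare.
MIN_MINUTES = 10
MIN_EDITS = 3
DETECTOR_CUTOFF = 0.98   # flag only near-certain detections

def is_probably_unreviewed_llm_post(stats: DraftStats) -> bool:
    """Flag a post only when BOTH signals point the same way: the
    detector is near-certain the text is LLM-generated AND the
    author spent almost no time working on the draft."""
    no_human_effort = (stats.minutes_editing < MIN_MINUTES
                       and stats.edit_count < MIN_EDITS)
    return no_human_effort and stats.detector_score > DETECTOR_CUTOFF

# Example: a post pasted in wholesale and published two minutes later
print(is_probably_unreviewed_llm_post(
    DraftStats(minutes_editing=2, edit_count=1, detector_score=0.99)))  # True
```

Requiring both signals is the point of the design: a careful human writer who happens to trip the detector is protected by the effort signal, while a fast writer is protected by the very high detector cutoff.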
3.) “This policy is necessary to combat bots”: This argument was not actually present in the post laying out the new LessWrong LLM policy, but in any case, you can certainly combat bots without blanket-banning all substantial LLM usage in posts.

The Current Policy is either Very Difficult to Enforce, Subjective, or Overly Restrictive (or some combination of the three):

Maybe the LessWrong team has cracked the code on detecting LLM writing, but it is fairly difficult to determine whether a text was generated by an AI or a human. A 2025 study of GPTZero’s ability to detect LLM-generated text found that GPTZero assigned, on average, a 14.75% probability that human-written long essays (350-800 words) were LLM-generated. It is important to note that this test covered only GPT-3.5 and GPT-4o, that models have advanced since then, and that, as time goes on, LLM writing has begun to influence human writing. All of these factors lead me to believe that the LessWrong AI detection system will either have difficulty flagging all but the most obvious cases of LLM writing or produce an unacceptable level of false positives. A rough base-rate calculation, sketched below, shows why even a seemingly accurate detector mostly flags innocent posts when genuine LLM spam is rare.
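Here is a quick Bayes-rule illustration with made-up but plausible numbers; the 10% false-positive rate loosely echoes the GPTZero figure above, and the 5% base rate of pure-LLM submissions is my own assumption, not a measured quantity.

```python
# Posterior probability that a flagged post is actually LLM-generated,
# via Bayes' rule. All numbers are illustrative assumptions.
base_rate = 0.05        # assumed fraction of submissions that are pure LLM output
true_positive = 0.90    # assumed chance the detector flags a real LLM post
false_positive = 0.10   # chance of flagging a human post (cf. GPTZero's ~14.75%)

p_flagged = base_rate * true_positive + (1 - base_rate) * false_positive
p_llm_given_flag = base_rate * true_positive / p_flagged

print(f"P(flagged) = {p_flagged:.3f}")               # 0.140
print(f"P(LLM | flagged) = {p_llm_given_flag:.3f}")  # 0.321
```

Under these assumptions, roughly two-thirds of auto-rejected posts would be human-written. The exact numbers depend entirely on the base rate and detector quality, which is precisely the problem: neither is publicly known.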
These circumstances also invite a high degree of subjectivity into moderation decisions. Most moderation policies involve some subjectivity in enforcement, but with something as difficult to detect as LLM writing, the capacity for mis-moderation is higher. To illustrate why, take the quote below from LessWrong admin Oliver Habryka on Neel Nanda’s use of LLM transcription:

LLM transcription is IMO a completely different use-case (one I certainly didn’t think of when thinking about the policy above), so in as much as the editing post-transcription is light, you would not need to put it into an LLM block. I also think structural edits by LLMs are basically totally fine, like having LLMs suggest moving a section earlier or later, which seems like the other thing that would be going on here.

We intentionally made the choice that light editing is fine, and heavy editing is not fine (where the line is somewhere between “is it doing line edits and suggesting changes to a relatively sparse number of individual sentences, or is it rewriting multiple sentences in a row and/or adding paragraphs”).

Also just de-facto, none of the posts you link trigger my “I know it when I see it” slop-detector, so you are also fine on that dimension.

From Habryka’s response, we can see that a great deal of subjectivity will be involved in moderation decisions made under this rule.

The Current Policy Promotes Rule Breakers:

As with any selectively enforced rule, this moderation policy will affect scrupulous posters more than unscrupulous ones. As someone who tries hard to respect the rules of others, I (and others like me) will abstain from LLM use while posting on this forum, while less scrupulous posters will not. Given the efficiency gains LLMs bring to writing, less scrupulous posters will increase their share of the posts on this forum. The effects of this are difficult to predict, but I think there is reason to believe it will not improve things.

A Better Alternative:

While I disagree about the benefits of the LessWrong LLM policy, I understand that certain users may dislike LLM-assisted posts for a wide variety of reasons. For the sake of these users, I recommend creating a new category of posts (LLM Free) that users can optionally filter for. Doing so preserves the benefits of LLM writing while allowing those bothered by it to avoid it.

Along with this, I would support a ban on “pure” LLM posts, where users spend very little time reviewing the draft and post something with minimal human input. The simplest way to do this would be to track the number of edits on a post, combine it with LLM detection software (as in the sketch earlier), and remove only posts where it is extremely obvious that the content is unreviewed LLM output. Posts that use LLMs collaboratively, with substantial human input and review, should not be affected. I would also endorse a ban on the use of LLMs in quick takes and comments, as those mediums are naturally more about human interaction than a post is, and the benefits of LLMs decline with the length of the writing being produced.

The above policy would solve the worst problems posed by LLM writing while still preserving the benefits it provides to LessWrong users.

Edit: It seems that people are more worried about new users using LLMs than about high-karma users, so I would also support leaving the LLM rules in place for people below a certain karma threshold, as the arguments in favor of the policy are strongest for unscrupulous, low-effort posters who are more likely to misuse LLMs. While karma isn’t a perfect benchmark, it probably correlates with effort somewhat, as high-effort, truthful content is what users and moderators alike have professed to prefer.

Edit II: My comment-section discussion with Seth Herd helped me better understand why some people might support this new moderation policy. LLM usage can make it harder to tell whether a post is low-effort or high-effort, and it lowers the barriers to posting, so allowing LLMs can make it harder for users to find genuinely good content. While I am more sympathetic to this argument than to the others considered in this post, I think the best way to address concerns about crowd-out is to build moderation tools that measure the effort applied to posts and promote high-effort posts, rather than enact a policy only tangentially related to this goal. While I am uncertain about the best way to go about this, as I wrote in a comment: tracking the amount of time spent editing and the number of edits on a given LessWrong post would be a good way to judge the effort put into it. This should not be too difficult to implement, and for people who write their posts in Google Drive or Word, I doubt it would be a huge inconvenience to move over to LessWrong.

Written with Grammarly spell check.
