Based on recent comments this feels like a discussion we should have. So…topic, basically.

I’m not looking to be chief noisemaker on this, but I stand by what I wrote in !privacy and what’s in my post history.

https://lemmy.ml/post/48724623/26190950

Let’s have at it: do we want [AI] and [NOT AI] tags? Why or why not?

  • Onno (VK6FLAB)@lemmy.radio
    5 hours ago

    I think that unless you have some way to enforce accuracy, it’s meaningless. AFAIK automatic detection tools are no better than chance and to my knowledge, getting worse.

    An AI bot operator isn’t going to tag their material as [AI]; more likely than not they’d attempt to use [NOT AI].

    I’d also point out that while Lemmy doesn’t (yet) support hashtags, any “tagging” would probably benefit from following the existing #tag convention.

    Ultimately, you need to ask yourself, is undeclared AI that goes undetected by the community a problem, or the new “normal”?

    I’ll note that I’m not a proponent of Assumed Intelligence and think that when the bubble bursts we’re going to be in a world of hurt, but with a little luck the billionaires will have lost their shirts in the process.

    • curbstickle@anarchist.nexus [M]
      3 hours ago

      “AFAIK automatic detection tools are no better than chance and to my knowledge, getting worse.”

      It depends.

      There are a variety of indicators: some blatant, like .cursorrules or CLAUDE.md files, and some key phrases that can be searched for. Unless they are actively working to hide references, these turn up in the overwhelming majority of projects in the “not admitting but not denying” camp, or in those with a complete lack of disclosure.
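
      As a rough illustration of that kind of check, here is a minimal Python sketch. It assumes the repository contents are already loaded as a mapping of path to text. The two marker filenames come from the comment above; the function name `find_indicators` and the search phrase in the usage note are hypothetical placeholders, since no specific phrases are listed here. It only catches projects that make no effort to hide the references.

```python
from pathlib import PurePosixPath

# Marker filenames mentioned in the comment above.
MARKER_FILES = {".cursorrules", "CLAUDE.md"}


def find_indicators(files, phrases):
    """Return a sorted list of (path, reason) hits.

    files:   mapping of repo-relative path -> file text (assumed preloaded).
    phrases: caller-supplied search phrases (hypothetical; none are
             specified in the thread). Matching is case-insensitive.
    """
    hits = []
    for path, text in files.items():
        name = PurePosixPath(path).name
        # Blatant indicator: the file itself is a known marker file.
        if name in MARKER_FILES:
            hits.append((path, f"marker file: {name}"))
        # Softer indicator: a key phrase appears somewhere in the text.
        lowered = text.lower()
        for phrase in phrases:
            if phrase.lower() in lowered:
                hits.append((path, f"phrase: {phrase}"))
    return sorted(hits)
```

      Calling `find_indicators(files, phrases=["generated with an ai assistant"])` on such a mapping would flag any `CLAUDE.md` or `.cursorrules` file plus any file containing that placeholder phrase. A hit is only a hint worth a closer look, not proof.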

      There are also pretty solid indicators in commit logs that are… unlikely to ever have come from a person.

      What is getting harder is syntax detection and fragment detection. Straight-up rips from multiple codebases used to be obvious, but some tools are now better at refactoring, which makes that discovery harder too.

      Just to be candid, I am in no way anti-AI; I run and train my own models at home on my own hardware. It’s a tool: a hammer is great for nailing a board to a wall and an absolutely wrong choice for trying to screw together a cabinet, and I consider LLMs in the same camp.

      If you’re trying to vibe code a full project, it’s probably going to be a massive problem. If you’re using it to parse some Swagger and generate some hooks based on a prefix you defined for specific types, without wanting to bother with smashing out a script for the disparate types that can be easily detected with an LLM… well, you’re probably using it in a way I’d agree with.

      Someone did mention an automod disclosure comment, which I really like the idea of, but I would need to look into it for use on the fediverse. I’m really not sure how good the automods out there are these days; last time I checked they were… not so great.