Jailbreaking LLMs with ASCII Art. Turns out LLMs are still computer programs and sanitizing inputs is hard.

NSFW as it isn’t a bad take by techpeople, but research showing that the AI creating diamondoid virusses because we were mean to it fears are overblown. It cannot follow simple (for us intelligent humans) instructions not to do certain things.

LLMs are extremely good at parsing things however.

  • froztbyte@awful.systems
    link
    fedilink
    English
    arrow-up
    5
    ·
    10 months ago

    Imagine what these fine minds could achieve if they weren’t thinking up defenses to counter the extremely sophisticated onslaught of mid-late 90s usenet spam technology

    Such a waste. Cruel world, etc etc.

    (/s, of course. I’d love to hear the ridiculous scoffing as their multi-$100m toys get taken out by the kind of shit you got skiddy warez groups competing on for most of the 00s)