Kerfuffle,

By “attack” they mean “jailbreak”. It’s also nothing like a buffer overflow.

The article is interesting, though, and the approach to generating these jailbreak prompts is creative. It looks a bit similar to the unspeakable tokens thing: vice.com/…/ai-chatgpt-tokens-words-break-reddit

dan1101,

That seems like they left debugging code enabled/accessible.

Kerfuffle,

> That seems like they left debugging code enabled/accessible.

No, this is actually a completely different type of problem. LLMs aren’t code, and they aren’t manually configured/set up/written by humans. In fact, we don’t really know what’s going on internally when an LLM is performing inference.

The actual software side of it is more like a video player that “plays” the LLM.
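To make the “player” analogy concrete, here’s a rough sketch (using the Hugging Face transformers API, with “gpt2” purely as an example model): the Python code below is the generic, human-written “player”, and everything interesting lives in the weights it loads. Nobody wrote those weights by hand, so there’s no “debugging code” in there to accidentally leave enabled.

```python
# Minimal sketch of the "player" idea: this code is generic and human-written,
# but the model's behaviour comes entirely from the learned weights it loads.
# (Hugging Face transformers API; "gpt2" is just an example model.)
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")  # fixed, opaque weights

prompt = "The quick brown fox"
inputs = tokenizer(prompt, return_tensors="pt")

# The "playback" step: repeatedly run the weights forward to pick the next token.
output_ids = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```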
