tante, 9 months ago to random
Today in "LLMs can't do even simple reasoning":
Prompt: Sally (a girl) has 3 brothers. Each brother has 2 sisters. How many sisters does Sally have?
See a whole bunch of LLMs fail: https://benchmarks.llmonitor.com/sally
tante, 9 months ago
Many LLMs answer "6", mostly because "each" triggers a lot of programming/math wording.
Embeddings can be very finicky, and LLMs don't handle extra information well.
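For reference, a minimal sketch of the puzzle's arithmetic, assuming all the siblings share the same parents so that every brother has the same set of sisters. The function names are made up for illustration. The second function mirrors the multiply-on-"each" reading that produces "6"; the number of brothers is the extra information that does not affect the answer.

```python
def sisters_of_sally(sisters_per_brother: int) -> int:
    """Count Sally's sisters, assuming all siblings share the same parents."""
    # Every sister a brother has is one of the girls in the family,
    # so the number of girls equals the sisters each brother has.
    girls_in_family = sisters_per_brother
    # Sally is one of those girls and is not her own sister.
    return girls_in_family - 1


def naive_each_reading(brothers: int, sisters_per_brother: int) -> int:
    """The mistaken reading: treat "each" as a cue to multiply."""
    return brothers * sisters_per_brother


if __name__ == "__main__":
    print(sisters_of_sally(2))        # 1
    print(naive_each_reading(3, 2))   # 6
```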
jollyorc, 9 months ago to random
I wonder: is there GAIA-X discourse happening on the fediverse?
tante, 9 months ago
@jollyorc is that still a thing with funding running out?