Bitch if I wanted the robot, I’d ask it myself (well, I’d ask the Chinese one)! I’m asking you!

  • jsomae@lemmy.ml · English · 4 upvotes · 1 day ago

    Same. I just tried DeepSeek-R1 on a question I invented as an AI benchmark. (No AI has come remotely close to answering this simple question correctly, though obviously I won’t reveal what the question is here.) Anyway, R1 kept making wrong assumptions, but also kept second-guessing itself.

    I actually do think the “reasoning” approach has potential though. If LLMs can only come up with right answers half the time, then “reasoning” allows multiple attempts at a right answer. Still, results are unimpressive.
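
    To put rough numbers on the "multiple attempts" idea, here is a minimal sketch. It assumes each attempt is independent and right with probability 0.5 (the "half the time" figure above), and that the model can actually tell when one of its attempts is right, which is a big assumption. The function name is made up for illustration.

```python
def p_at_least_one_correct(p_single: float, attempts: int) -> float:
    """Chance that at least one of `attempts` independent tries is right.

    Assumption (not from any real model): every try succeeds with the same
    probability `p_single`, independently of the others.
    """
    return 1.0 - (1.0 - p_single) ** attempts


# With a 50% per-attempt success rate, more attempts help quickly:
for k in (1, 2, 4, 8):
    print(k, p_at_least_one_correct(0.5, k))
```

    In practice the attempts inside one reasoning trace are not independent (a wrong assumption tends to get carried forward), so the real gain is probably smaller than this suggests.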