• HackyHorse3000@lemmy.world
    English
    arrow-up
    25
    ·
    4 months ago

    That’s the thing though: that’s not comparable, and it misses the point entirely. “AI” in this context, and in current conversations about it, specifically means LLMs. They will not improve to the point of general intelligence, because that is not how they work. Hallucinations are inevitable with the current architectures and methods, and they lack an inherent understanding of concepts in general. It’s the same reason they can’t do math or logic problems that aren’t common in the training set. It’s not intelligence. Modern computers are built on the same principles and architectures as those calculators were, just iterated upon extensively. No such leap is possible using large language models. They rely entirely on a finite pool of data that they try to mimic as effectively as possible; they are not learning or understanding concepts the way “Full-AI” would need to in order to actually be reliable or able to generate new ideas.

    • chrash0@lemmy.world
      English
      arrow-up
      4
      arrow-down
      10
      ·
      4 months ago

      it’s super weird that people think LLMs are so fundamentally different from neural networks, the underlying technology. neural network architectures are constantly improving, and LLMs are just the product of a ton of research that emerged after the discovery of the transformer architecture. what LLMs have shown us is that we’re definitely on the right track using neural networks to solve a wide range of problems classified as “AI”

      • HackyHorse3000@lemmy.world
        English
        arrow-up
        16
        ·
        4 months ago

        I think the main problem is applying LLMs outside the domain of “complete this sentence”. It’s fine for what it is, and trained on huge datasets it obviously appears impressive, but it doesn’t know if it’s right or wrong, and the evaluation metrics are different. In most traditional applications of neural networks, you have datasets with right and wrong answers. That’s not how these are trained, as there is no “right” answer to “tell me a joke.” So the training has to be based on what would likely fill in the blank. That could be an actual joke, a bad joke, or a completely different topic; the training data makes no distinction. The biases, incorrect answers, and all the other faults of this massive dataset are inherent in the model, and there’s no fixing that. LLMs are fundamentally different in their application and evaluation methods (and this extends to training) from other neural networks that are actually effective at what they do, like image processing and identification. The scope of what they’re trying to do with a finite dataset is unrealistic and entirely unconstrained, compared to more “traditional” neural networks, which are very narrow in scope exactly because of this issue.
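
To make the “fill in the blank” point concrete, here is a minimal sketch. It assumes a toy bigram counter, which is far simpler than a real LLM, and the corpus and the names `train_bigram` and `predict_next` are made up for illustration. The only thing it shares with the argument above is the objective: predict the likeliest continuation, with no label saying which continuation is true.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count which word follows which; this is the entire 'training' step."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent continuation seen in training, or None."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical toy corpus: the wrong claim simply appears more often.
corpus = [
    "the moon is cheese",
    "the moon is cheese",
    "the moon is rock",
]

model = train_bigram(corpus)
print(predict_next(model, "is"))  # prints "cheese"
```

Nothing in the counts marks “cheese” as wrong; it wins only because it is more frequent in the data. Whether that limitation carries over unchanged to large transformer models is exactly what the thread is arguing about, so treat this as an illustration of the objective and not as evidence either way.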