• Showroom7561@lemmy.ca
    English
    arrow-up
    2
    ·
    10 months ago

    Not as bad as the AI-generated articles showing up in search results. Some websites I get driven to make absolutely no sense, despite a lot of words being written about all kinds of topics.

    I’m looking forward to the day when “certified human content” is a thing, and that’s all search engines allow you to see.

  • LillyPip@lemmy.ca
    English
    arrow-up
    1
    ·
    10 months ago

    Well, of course. The search algorithm has no way to know the difference.

  • Zarxrax@lemmy.world
    English
    arrow-up
    1
    ·
    10 months ago

    I mean, they would have started appearing in there from the first moment that someone created one and hosted it somewhere, no? So it’s already been a thing for a couple years now, I believe.

  • KairuByte@lemmy.dbzer0.com
    English
    arrow-up
    1
    ·
    10 months ago

    Why would they not? There’s no way for such a system to know it’s AI generated unless there’s some metadata that makes it obvious. And even if it was, who’s to say the user wouldn’t want to see them in the results?

    This is a nothing issue. It’s not like this is being generated in response to a search; it’s something that already existed being returned as a result because there is presumably something that links it to the search.

  • hex_m_hell@slrpnk.net
    English
    arrow-up
    1
    ·
    10 months ago

    It’s time to start talking about “memetic effluent.” In the same way corporations polluted our physical world, they’re polluting our memetic world. AI spewing garbage data is just the most obvious way, but corporations have been toxifying our memetic space for generations.

    This memetic effluent will make sorting through data harder and harder over the years. But the oil and tobacco industries undermined science and democracy for decades with their own memetic effluent in order to protect their businesses. Advertising is its own effluent that distorts and destroys language. Jerry Rubin said it in 1970: “How can I tell you ‘I love you’ after hearing ‘cars love Shell’?”

    While physical effluent destroys our physical environment, making living in the world harder, memetic effluent destroys meaning and makes thinking about and comprehending the world harder. Both are the garbage side effects of the perpetuation of capitalism.

    This example of poisoning the data well is just too obvious to ignore, but there are so many others.

  • andallthat@lemmy.world
    English
    arrow-up
    0
    ·
    edit-2
    10 months ago

    Just wanted to point out that the Pinterest examples are conflating two distinct issues: low-quality results polluting our searches (in that they are visibly AI-generated) and images that are not “true” but very convincing.

    The first one (search results quality) should theoretically be Google’s main job, except that they’ve never been great at it with images. Better quality results should get closer to the top as the algorithm and some manual editing do their job; crappy images (including bad AI ones) should move towards the bottom.

    The latter issue (“reality” of the result) is the one I find more concerning. As AI-generated results get better and harder to tell from reality, how would we know that the search results for anything aren’t a convincing spoof just coughed up by an AI? But I’m not sure this is a search-engine or even an Internet-specific issue. The internet is clearly more efficient in spreading information quickly, but any video seen on TV or image quoted in a scientific article has to be viewed much more skeptically now.

      • bluewing@lemm.ee
        English
        arrow-up
        0
        ·
        10 months ago

        Provenance. Track the origin.

        Easy to say, often difficult to do.

        There can be 2 major difficulties with tracking to origin.

        1. Time. It can take a good amount of time to find the true origin of something. And you don’t have the time to trace back to the true origin of everything you see and hear. So you will tend to choose the “source” you most agree with, introducing bias to your “origin”.
        2. And the question of “Is the ‘origin’ I found the real source?” This is sometimes referred to as Facts by Common Knowledge or the Wikipedia effect. And as AI gets better and better, original source material is going to become harder to access and harder to verify unless you can lay your hands on a real piece of paper that says it’s so.

        So it appears at this point in time, there is no simple solution like “provenance” and “find the origin”.

        • UnderpantsWeevil@lemmy.world
          English
          arrow-up
          0
          arrow-down
          1
          ·
          10 months ago

          And as AI gets better and better, original source material is going to become harder to access and harder to verify unless you can lay your hands on a real piece of paper that says it’s so.

          One of the bright lines between Existing Art and AI Art, particularly when it comes to historical photos and other images, is that there typically isn’t a physical copy of the original. You’re not going to walk into the Louvre and have this problem.

          This brings up another complication in the art world, which is ownership/right-to-reproduce said image. Blindly crawling the internet and vacuuming up whatever you find, then labeling it as you find it, has been a great way for search engines to become functional repositories of intellectual property without being exposed to the costs associated with reprinting and reproducing. But all of this is happening in a kind-of digital gray marketplace. If you want the official copy of a particular artwork to host for your audience, that’s likely going to come with financial and legal strings attached, making its inclusion in a search result more complicated.

          Since Google leadership doesn’t want to petition every single original art owner and private exhibition for the rights to use their works in its search engine, they’re going to prefer to blindly collect shitty knock-offs and let the end-users figure this shit out (after all, you’re not paying them for these results and they’re not going to fork out money to someone else, so fuck you both). Then, maybe if the outcry is great enough, they can charge you as a premium service to get more authentic results. Or they can charge some third party to promote their print-copies and drive traffic.

          But there’s no profit motive for artistic historical accuracy. So this work isn’t going to get done.

  • BradleyUffner@lemmy.world
    English
    arrow-up
    0
    ·
    10 months ago

    Google is a search engine; it shows stuff hosted on the Internet. If these AI generated images are hosted on the Internet, Google should show them.

    • UnderpantsWeevil@lemmy.world
      English
      arrow-up
      0
      arrow-down
      1
      ·
      10 months ago

      The Google AI that pre-loads the results query isn’t able to distinguish real photos from fake AI generated photos. So there’s no way to filter out all the trash, because we’ve made generative AI just good enough to snooker search AI.

      • samus12345@lemmy.world
        English
        arrow-up
        0
        ·
        edit-2
        10 months ago

        A lot of them mention they’re using an AI art generator in the description. Even only filtering out self-reported ones would be useful.
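
        As a rough sketch of what filtering only the self-reported ones could look like: the phrase list, the `description` field and the result format below are all made up for illustration, not anything Google is known to use.

```python
import re

# Hypothetical phrases a poster might use to self-report AI generation.
AI_HINTS = re.compile(
    r"\b(ai[- ]generated|ai art|midjourney|stable diffusion|dall-?e)\b",
    re.IGNORECASE,
)


def drop_self_reported(results):
    """Keep only results whose description does not mention an AI generator."""
    return [r for r in results if not AI_HINTS.search(r.get("description", ""))]


# Made-up search results; only the description text is inspected.
results = [
    {"url": "https://example.com/a", "description": "Made with Midjourney v5"},
    {"url": "https://example.com/b", "description": "Photo taken in 1932"},
    {"url": "https://example.com/c", "description": "Thai art exhibition poster"},
]

for r in drop_self_reported(results):
    print(r["url"])
```

        This only catches images whose uploader chose to say so, which is the whole point: it would not touch anything that hides its origin.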

        • UnderpantsWeevil@lemmy.world
          English
          arrow-up
          0
          arrow-down
          1
          ·
          10 months ago

          That still requires a uniform method of tagging art as such. Which is absolutely a thing that could be done, but there’s no upside to the effort. If your images all get tagged “AI” and another generator’s don’t, what benefit is that to you? That’s before we even get into what digital standard gets used in the tagging. Do we assign this to the image itself (making it more reliable but also more difficult to implement)? As structured metadata (making it easier to apply, but also easier to spoof or scrape off)? Or is Google just expected to parse this information from a kaleidoscope of generating and hosting standards?
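
          To make the "easier to spoof or scrape off" point concrete, here is a toy model, not a real image format. The `HostedImage` class and the `ai_generated` key are hypothetical names invented for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class HostedImage:
    """Toy stand-in for an image file: pixel data plus a metadata dict."""
    pixels: bytes
    metadata: dict = field(default_factory=dict)


def tag_as_ai(img):
    # A generator that plays by the rules adds the (hypothetical) flag.
    return HostedImage(img.pixels, {**img.metadata, "ai_generated": True})


def rehost_without_metadata(img):
    # A re-uploader copies the pixels and drops everything else.
    return HostedImage(img.pixels)


def is_flagged(img):
    # All a crawler relying on metadata alone would be able to check.
    return img.metadata.get("ai_generated", False)


original = tag_as_ai(HostedImage(b"placeholder pixel data"))
copy = rehost_without_metadata(original)

print(is_flagged(original), is_flagged(copy))
```

          The pixels are identical in both copies, but only the first one is still detectable, which is why a tag that lives next to the image rather than in it is so easy to lose.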

          Times like this, it would be helpful for - say - the FCC or ICANN to get involved. But that would be Big Government Overreach, so it ain’t going to happen.