• jfrnz@lemm.ee
      arrow-up
      2
      ·
      4 days ago

      The point is that OP (most probably) didn’t train it — they downloaded a pre-trained model and only did fine-tuning and inference.

      • utopiah@lemmy.world
        arrow-up
        2
        arrow-down
        2
        ·
        4 days ago

        Right, that is exactly my point though: OP, having just downloaded it, might not realize the training costs. They might be low, but on average they are quite high, at least relative to fine-tuning or inference. So my question was precisely meant to highlight that running locally without knowing the training cost is naive, ecologically speaking. They did clarify that they do not care, though, so that’s coherent for them. I’m insisting on that point because others might think “Oh… I can run a model locally, then it’s not <<evil>>”, so I’m trying to clarify (and please let me know if I’m wrong) that it is good for privacy, but the upfront training costs are not insignificant and might lead some people to prefer NOT relying on models that are very costly to train and to prefer others, or even a totally different solution.

        • jfrnz@lemm.ee
          arrow-up
          3
          ·
          4 days ago

          The model exists already — abstaining from using it doesn’t make the energy consumption go away. I don’t think it’s reasonable to let historic energy costs drive what you do, else you would never touch a computer.

          • utopiah@lemmy.world
            arrow-up
            1
            arrow-down
            3
            ·
            4 days ago

            Indeed, the argument is mostly for future usage and future models. The overall point being that assuming training costs are negligible is either naive or showing that one does not care much for the environment.

            From a business perspective, if I’m Microsoft or OpenAI and I see a trend to prioritize models that minimize training costs, or even that users are avoiding costly-to-train models, I will adapt to it. On the other hand, if I see that nobody cares about that, or even that building more data centers drives the value up, I will build bigger models regardless of usage or energy cost.

            The point is that training is expensive and that pointing only to inference is like the Titanic going full speed ahead toward the iceberg saying how small it is. It is not small.

            • ricecake@sh.itjust.works
              arrow-up
              1
              ·
              4 days ago

              If you’re a company you don’t care what the home user does. They didn’t pay for the model and so their existence in the first place indicates a missed opportunity for market share.

              No one is saying training costs are negligible. They’re saying the cost has already been paid and they had no say in influencing it then or in the future. If you don’t pay for it and they can’t tell how often you use it they can’t really be influenced by your behavior.

              It’s like being overly concerned with the impact of a microwave you found by the road. The maker doesn’t care about your opinion of it because you don’t give them money. They don’t even know you exist. The only thing you can meaningfully influence is how it’s used today.

              • utopiah@lemmy.world
                arrow-up
                1
                ·
                4 days ago

                “No one is saying training costs are negligible.”

                It’s literally what the person I initially asked said, though: they said they don’t know and don’t care.

                • ricecake@sh.itjust.works
                  arrow-up
                  1
                  ·
                  3 days ago

                  That’s far from saying they’re negligible. What they’re saying is in line with my point. If you find a microwave, are you going to research how green its manufacturing was so you can ensure you only find good ones for free in the future?

                  Irrelevant or moot is different from negligible. One says it’s small enough to not matter, and the other says it doesn’t affect your actions.

                  I play with AI models on my own computer. I think the training costs are far from negligible and for the most part shouldn’t have been bothered with. (I’m very tolerant of research models that are then made public. Just because the tech isn’t scalable or as world changing as some think doesn’t mean it isn’t worth understanding or that it won’t lead to something more viable later. Churning it over and over without open results or novelty isn’t worth it though.) I also think that the training costs are irrelevant with regard to how I use it at home. They were spent before I knew it existed, and the people who spent them never have seen and never will see information or feedback from me.
                  My home usage has less impact than using my computer for games does.

                  • utopiah@lemmy.world
                    arrow-up
                    1
                    ·
                    3 days ago

                    I’m playing games at home. I’m running models at home for benchmarking (I linked to it in other similar answers).

                    My point is that models are just like anything else I bring into my home: I try to buy only products that are manufactured properly. Someone else in this thread asked me about child labor for electronics and IMHO that was actually a good analogy. You mention buying a microwave here, and that’s another good example.

                    Yes, if we do want to establish feedback in the supply chain, we must know how everything we rely on is made. It’s that simple.

                    There are already quite a few initiatives for that, e.g. Fair Trade Certification for coffee, ISO 14001, Fair Materials in electronics, etc.

                    The point being that there are already mechanisms for feedback in other fields and in ML there are already model cards with a co2_eq_emissions field, so why couldn’t feedback also work in this field?
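
To make the model card point concrete, here is a minimal sketch of how one could read that field before deciding whether to download a model. It assumes the card is a README with YAML front matter and that `co2_eq_emissions` is either a plain number or a mapping with an `emissions` key, in grams of CO2-eq, which is how I understand the Hugging Face metadata to work. The example card, its numbers, and the function name `read_co2_grams` are made up for illustration, and the regex parsing is a stdlib-only shortcut rather than a real YAML parser.

```python
import re

# Hypothetical model card text; the values are invented for illustration.
EXAMPLE_CARD = """---
license: apache-2.0
co2_eq_emissions:
  emissions: 1200
  source: "self-reported"
  training_type: "pre-training"
---
# Example model
"""


def read_co2_grams(card_text):
    """Return the reported emissions (assumed grams of CO2-eq), or None."""
    # Grab the front matter between the leading pair of '---' lines.
    front = re.match(r"^---\s*\n(.*?)\n---\s*(?:\n|$)", card_text, re.DOTALL)
    if front is None:
        return None
    meta = front.group(1)

    # Form 1: a plain number, e.g. "co2_eq_emissions: 1200".
    flat = re.search(
        r"^co2_eq_emissions:[ \t]*([0-9.]+)[ \t]*$", meta, re.MULTILINE
    )
    if flat:
        return float(flat.group(1))

    # Form 2: a nested mapping whose indented lines include "emissions: N".
    nested = re.search(
        r"^co2_eq_emissions:[ \t]*\n((?:[ \t]+.*\n?)+)", meta, re.MULTILINE
    )
    if nested:
        value = re.search(
            r"^[ \t]+emissions:[ \t]*([0-9.]+)", nested.group(1), re.MULTILINE
        )
        if value:
            return float(value.group(1))

    # The field is missing, which is itself a signal worth noticing.
    return None


if __name__ == "__main__":
    grams = read_co2_grams(EXAMPLE_CARD)
    print("not reported" if grams is None else f"{grams / 1000:.1f} kg CO2-eq")
```

A `None` result only means the publisher did not report anything; it says nothing about the actual training cost, which is part of the problem.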