Generative AI improvements increasingly come from data curation and collection, not architectural changes. Big Tech has an advantage.
Based on the post title alone, I call bull, because I could buy enough storage and pirate enough books to create an AI, using copyrighted material as the training data. Yes, it would be an absolutely horrible AI, since I don't have a clue what I'd be doing, but it's possible.
The training data is important, but currently the bottleneck is computing power. Buying that many chips and running them full blast 24/7 costs a lot of money.
You can get your hands on books3 or any other dataset that was exposed to the public at some point, but large companies have private, human-filtered, high-quality datasets that perform better. You're unlikely to have the resources to do the same.
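To give a concrete sense of what dataset curation looks like even at toy scale, here is a minimal sketch of the kind of heuristic filtering (minimum length, symbol ratio, exact deduplication) that public corpora are commonly cleaned with. The thresholds and helper names here are invented for illustration, not taken from any real pipeline:

```python
import hashlib

def keep(doc: str, seen: set,
         min_words: int = 50,
         max_symbol_ratio: float = 0.3) -> bool:
    """Toy quality filter. Thresholds are made up for illustration."""
    words = doc.split()
    if len(words) < min_words:
        # Drop very short fragments (boilerplate, nav links, stubs).
        return False
    symbols = sum(not c.isalnum() and not c.isspace() for c in doc)
    if symbols / max(len(doc), 1) > max_symbol_ratio:
        # Drop symbol-heavy junk (markup residue, tables of numbers).
        return False
    digest = hashlib.sha256(doc.strip().lower().encode()).hexdigest()
    if digest in seen:
        # Exact-duplicate removal across the corpus.
        return False
    seen.add(digest)
    return True

# Usage: filter a corpus in one pass, carrying the dedup set along.
seen_hashes: set = set()
corpus = ["a long enough document " * 20, "too short", "a long enough document " * 20]
cleaned = [d for d in corpus if keep(d, seen_hashes)]
```

Real curation pipelines layer far more on top (fuzzy dedup, classifier-based quality scores, human review), which is exactly the part that is expensive to replicate.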
Then go ahead and buy 2000 Nvidia cards.
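To put a rough number on "costs a lot of money", here is a back-of-envelope calculation in Python using the 2000-card figure from the comment above. Every other figure is an assumed placeholder (roughly $30,000 per data-center GPU, ~700 W per card, ~$0.10/kWh), not a quoted price:

```python
# Back-of-envelope: buy 2000 data-center GPUs and run them 24/7 for a year.
# All unit figures below are rough assumptions for illustration only.
NUM_GPUS = 2000
PRICE_PER_GPU_USD = 30_000   # assumed purchase price per card
WATTS_PER_GPU = 700          # assumed full-load power draw
USD_PER_KWH = 0.10           # assumed electricity price
HOURS_PER_YEAR = 24 * 365

hardware = NUM_GPUS * PRICE_PER_GPU_USD
energy_kwh = NUM_GPUS * WATTS_PER_GPU * HOURS_PER_YEAR / 1000
electricity = energy_kwh * USD_PER_KWH

print(f"Hardware:        ${hardware:,.0f}")     # $60,000,000
print(f"Electricity/yr:  ${electricity:,.0f}")  # $1,226,400
```

Even ignoring networking, cooling, hosting, and staff, the hardware alone is tens of millions of dollars, which is the point being made.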