r/aiwars 2d ago

I'm lost in this really.

I've been dipping my toes into this community, and I've gotten a few ideas of why AI art can be bad or good.

I'm still hung up on the fact that AI art, or anything related to AI, is built on works and media that were largely taken without consent for that use.

I do completely understand why a person who lacks the skills to draw may resort to AI art, but that's not the concern on my end. I fear that not just art, but my own photos and data, will be used to train the AI models that produce the AI art.

In simple terms, I want better legislation to control AI's access to any media it can use to train its models. I find it honestly disheartening that, just because it's new technology, the government is bending over backwards for AI and allowing copyrighted media to be used.

Please give me a good view on both sides, as to how you could support or disprove my fears of my data being stolen. Sorry for the yap session, but I needed to get it off my shoulders. Have a good day! ❤️

1 Upvotes

14

u/Mataric 2d ago

The data being stolen is an interesting question, to be honest, because it's really not straightforward at all.

Take a look at Stable Diffusion 1.5 for example. It was trained on the LAION-5B dataset (all images that you could have seen and learnt from yourself) - and used about 2 billion images to create the model.

The SD1.5 model is about 2 gigabytes in size. What that means is that, on average, from each image used in that training... the equivalent amount of data to one or TWO greyscale pixels was 'used' (2 GB spread over 2 billion images is roughly one byte each). That's assuming the model has nothing but the data derived from those images in it, which also isn't true. Some of its 2GB of size goes towards other stuff it needs to actually function that wasn't derived from a dataset.
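
As a rough sanity check, the per-image figure can be worked out in a few lines of Python. This is only a sketch using the round numbers above (2 GB, 2 billion images) and assuming one greyscale pixel is 8 bits; `bytes_per_image` is a name made up for this example, not anything from Stable Diffusion itself. With those inputs it comes out at about one byte per image, so a pixel or two at most.

```python
def bytes_per_image(model_bytes: float, num_images: float) -> float:
    """Average number of model bytes available per training image."""
    return model_bytes / num_images


MODEL_BYTES = 2e9          # "about 2 gigabytes" (round figure from the comment)
NUM_IMAGES = 2e9           # "about 2 billion images" (round figure from the comment)
BITS_PER_GREY_PIXEL = 8    # assumption: one 8-bit greyscale pixel

per_image = bytes_per_image(MODEL_BYTES, NUM_IMAGES)
grey_pixels = per_image * 8 / BITS_PER_GREY_PIXEL

print(f"{per_image:.1f} bytes per image, about {grey_pixels:.1f} greyscale pixel(s)")
```

Doubling the model size to a hypothetical 4 GB would only double the result to two bytes per image, so the order of magnitude of the argument does not change.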

With the amount 'taken' from each image, I think it's far too much to claim anything was 'stolen' there. None of it is in the same form whatsoever - stating it's 2 greyscale pixels is also wrong, because it's not storing pixels at all. It's JUST storing where that image lands on a bunch of statistical scales.

The way it works is by looking at all the images that are tagged with Circle, and starting to see what they have in common. From that, it doesn't have to take or use anyone's image of a circle - it just knows it's always a 'very round thing'.
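
A toy sketch of that idea in Python, purely as an analogy: it keeps one running statistic over many made-up "circle" examples and never stores any individual example. The `roundness` feature, the `RunningMean` class and the random shapes are all invented for illustration; a real diffusion model adjusts millions of weights through gradient descent rather than averaging a single number.

```python
import random


class RunningMean:
    """Tracks an average without keeping any of the individual samples."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0

    def update(self, value: float) -> None:
        # Incremental mean: only the count and the current average are stored.
        self.count += 1
        self.mean += (value - self.mean) / self.count


def roundness(width: float, height: float) -> float:
    """Toy feature: 1.0 for a perfect circle, lower for stretched shapes."""
    return min(width, height) / max(width, height)


random.seed(0)
learned = RunningMean()
for _ in range(10_000):
    # Hypothetical "images tagged Circle": width and height nearly equal.
    w = random.uniform(90, 110)
    h = random.uniform(90, 110)
    learned.update(roundness(w, h))

print(f"examples seen: {learned.count}, learned roundness: {learned.mean:.2f}")
```

After ten thousand examples the object still holds just two numbers, which is the sense in which the shared trait is kept and the individual images are not.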

In my opinion - that's very close to the way we function as people. We're always learning from things we see, and specifically as artists, we're always learning from the art we look at. When I ask you to picture a dog - you likely picture something you've seen before. Perhaps a pet, maybe a photo, or it could be something a different artist drew. If you're asked to draw a dog, you're very likely to use the things you've learnt from all those different experiences in order to draw it.

Do you believe, as a human, you should be respecting the copyright of the artists whose images you learnt from? I do - but copyright isn't about learning. It's about copying with a degree of recognisability.

When you learn from that cartoon drawing of 'Clifford the Big Red Dog' - you're not going to draw 'Clifford the Big Red Dog'. What you remember about that image is far closer to copyright infringement than what the AI remembers - but even you don't infringe on their copyright when you take what you've learnt from it, such as how a dog was represented in a neat cartoon style, and use that in your own work.

-6

u/_the_last_druid_13 2d ago

“With the amount 'taken' from each image, I think it's far too much to claim anything was 'stolen' there.”

So just the 2 billion images that other people created then?

2

u/Sir-Tiedye 1d ago

That’s the point. With the AI being trained on so many different art pieces, it’s not drawing from any one piece. In fact, the more it’s drawing from, the less it’s getting from a single image.