July 14: Meta's entry into image generation and editing

maginative.com/…/meta-unveils-cm3leon-a-breakthro…

Image

Image alternative text

Akasazh, 11 months ago

Wow, the face paint one looks like domestic violence…

reply

report

activity

copy /kbin url

copy original url

Loading...

fiat_lux, 11 months ago (edited 11 months ago)

it utilizes the power of attention mechanisms to weigh the relevance of input data

By applying a technique called supervised fine-tuning across modalities, Meta was able to significantly boost CM3leon's performance at image captioning, visual QA, and text-based editing. Despite being trained on just 3 billion text tokens, CM3leon matches or exceeds the results of other models trained on up to 100 billion tokens.

That's a very fancy way to say they deliberately focussed it on a small set of information they chose, and that they also heavily configure the implementation. Isn't it?

This sounds like hard-wiring in bias by accident, and I look forward to seeing comparisons with other models on that...

From the Meta paper:

"The ethical implications of image data sourcing in the domain of text-to-image generation have been a topic of considerable debate. In this study, we use only licensed images from Shutterstock.
As a result, we can avoid concerns related to images ownership and attribution, without sacrificing performance."

Oh no. That... that was the only ethical concern they considered? They didn't even do a language accuracy comparison? Data ethics got a whole 3 sentences?

For all the self-praise in the paper about state of the art accuracy on low input, and insisting I pronounce "CM3Leon" as Chameleon (no), it would have been interesting to see how well it describes people, not streetlights and pretzels. And how it evaluates text/images generated from outside the cultural context of its pre-defined dataset.

reply

report

activity

copy /kbin url

copy original url

Loading...

Add comment