337
Microsoft CEO of AI: Online content is 'freeware' for models • The Register
(www.theregister.com)
Privacy has become a very important issue in modern society, with companies and governments constantly abusing their power, more and more people are waking up to the importance of digital privacy.
In this community everyone is welcome to post links and discuss topics related to privacy.
much thanks to @gary_host_laptop for the logo design :)
Respectfully, I worked for Alexa AI on compositional ML, and we were largely able to do exactly this with customer utterances, so to say it is impossible is simply not true. Many companies have to have some degree of ability to remove troublesome data, and while tracing data inside a model is rather difficult (historically it would be done during the building of datasets or measured at evaluation time) it's definitely something that most big tech companies will do.
Sorry, I misinterpreted what you meant. You said "any AI models" so I thought you were talking about the model itself should somehow know where the data came from. Obviously the companies training the models can catalog their data sources.
But besides that, if you work on AI you should know better than anyone that removing training data is counter to the goal of fixing overfitting. You need more data to make the model more generalized. All you'd be doing is making it more likely to reproduce existing material because it has less to work off of. That's worse for everyone.