Pinboard (jm)
https://pinboard.in/u:jm/public/
recent bookmarks from jm

Fairly Trained
2024-03-20T23:46:59+00:00
https://www.fairlytrained.org/about
There is a divide emerging between two types of generative AI companies: those who get the consent of training data providers, and those who don’t, claiming they have no legal obligation to do so.
We believe there are many consumers and companies who would prefer to work with generative AI companies who train on data provided with the consent of its creators.
Fairly Trained exists to make it clear which companies take a more consent-based approach to training, and are therefore treating creators more fairly.
Tags: ai gen-ai training ml data consent
https://pinboard.in/u:jm/b:d9e249728bfe/

Inside LAION
2023-04-28T14:27:44+00:00
https://www.bloomberg.com/news/features/2023-04-24/a-high-school-teacher-s-free-image-database-powers-ai-unicorns?cmpid%253D=socialflow-twitter-tv&cmpid=socialflow-twitter-business&leadSource=uverify%2520wall
To build LAION, founders scraped visual data from companies such as Pinterest, Shopify and Amazon Web Services (which did not comment on whether LAION’s use of their content violates their terms of service), as well as YouTube thumbnails, images from portfolio platforms like DeviantArt and EyeEm, photos from government websites including the US Department of Defense, and content from news sites such as The Daily Mail and The Sun.
If you ask Schuhmann, he says that anything freely available online is fair game. But there is currently no AI regulation in the European Union, and the forthcoming AI Act, whose language will be finalized early this summer, will not rule on whether copyrighted materials can be included in big data sets. Rather, lawmakers are discussing whether to include a provision requiring the companies behind AI generators to disclose what materials went into the data sets their products were trained on, thus giving the creators of those materials the option of taking action.
[...]
“It has become a tradition within the field to just assume you don’t need consent or you don’t need to inform people, or they don’t even have to be aware of it. There is a sense of entitlement that whatever is on the web, you can just crawl it and put it in a data set,” said Abeba Birhane, a Senior Fellow in Trustworthy AI at Mozilla Foundation.
Tags: consent opt-in web ai ml laion training-data scraping
https://pinboard.in/u:jm/b:88c4ff9a8975/

Silence Isn't Consent
2023-04-25T12:23:31+00:00
https://shkspr.mobi/blog/2023/04/silence-isnt-consent/
It isn't "effective altruism" if you have to force people to comply with you.
Tags: img2dataset ai scraping web consent opt-in
https://pinboard.in/u:jm/b:6010a502d23e/

Shitty behaviour around the img2dataset AI scraper
2023-04-24T16:47:01+00:00
https://github.com/rom1504/img2dataset/issues/293
Letting a small minority [ie web publishers] prevent the large majority [AI users] from sharing their images and from having the benefit of last gen AI tool would definitely be unethical yes. Consent is obviously not unethical. You can give your consent for anything if you wish. It seems you're trying to decide for million of other people without asking them for their consent.
In other words, "scraping your content without opt-in is better than denying access to your content for millions of potential future AI users". An issue to implement robots.txt support has been languishing since 2021. Good arguments for blocking the img2dataset user agent in general...
Tags: opt-in consent ai ml bad-behaviour scraping robots
https://pinboard.in/u:jm/b:f70437fd7139/

Data isn't the new oil, it's the new CO2
2019-07-25T10:32:25+00:00
https://luminategroup.com/posts/blog/data-isnt-the-new-oil-its-the-new-co2
We should not endlessly be defending arguments along the lines that “people choose to willingly give up their freedom in exchange for free stuff online”. The argument is flawed for two reasons.
First the reason that is usually given - people have no choice but to consent in order to access the service, so consent is manufactured. We are not exercising choice in providing data but rather resigned to the fact that they have no choice in the matter.
The second, less well known but just as powerful, argument is that we are not only bound by other people’s data; we are bound by other people’s consent. In an era of machine learning-driven group profiling, this effectively renders my denial of consent meaningless. Even if I withhold consent, say I refuse to use Facebook or Twitter or Amazon, the fact that everyone around me has joined means there are just as many data points about me to target and surveil. The issue is systemic, it is not one where a lone individual can make a choice and opt out of the system. We perpetuate this myth by talking about data as our own individual “oil”, ready to sell to the highest bidder. In reality I have little control over this supposed resource which acts more like an atmospheric pollutant, impacting me and others in myriads of indirect ways. There are more relations - direct and indirect - between data related to me, data about me, data inferred about me via others than I can possibly imagine, let alone control with the tools we have at our disposal today.
Tags: data ethics data-privacy privacy surveillance surveillance-capitalism co2 future profiling consent gdpr
https://pinboard.in/u:jm/b:d5371338436f/

Did you know that Dublin Airport is recording your phone's data? - Newstalk
2015-11-16T09:44:46+00:00
http://www.newstalk.com/Did-you-know-that-Dublin-Airport-is-recording-your-phones-data-
"I think the fundamental issue is one of consent. Dublin Airport have been tracking individual MAC addresses since 2012 and there doesn't appear to be anywhere in the airport where they warn passengers that this is occurring. "If they have to signpost CCTV, then mobile phone tracking should at a very minimum be sign-posted for passengers," he continues.
And how long are MAC addresses retained for, I wonder?
Tags: mac-addresses dublin-airport travel privacy surveillance tracking wifi phones cctv consent
https://pinboard.in/u:jm/b:9024dea87d60/