Pinboard (jm)
https://pinboard.in/u:jm/public/
recent bookmarks from jmFairly Trained2024-03-20T23:46:59+00:00
https://www.fairlytrained.org/about
jmThere is a divide emerging between two types of generative AI companies: those who get the consent of training data providers, and those who don’t, claiming they have no legal obligation to do so.
We believe there are many consumers and companies who would prefer to work with generative AI companies who train on data provided with the consent of its creators.
Fairly Trained exists to make it clear which companies take a more consent-based approach to training, and are therefore treating creators more fairly.
]]>ai gen-ai training ml data consenthttps://pinboard.in/https://pinboard.in/u:jm/b:d9e249728bfe/DocuSign admit to training AI on customer data2024-03-01T17:19:40+00:00
https://mastodon.social/@gvwilson/112012277852906749
jmDocuSign just admitted that they use customer data (i.e., all those contracts, affidavits, and other confidential documents we send them) to train AI:
https://support.docusign.com/s/document-item?language=en_US&bundleId=fzd1707173174972&topicId=uss1707173279973.html
They state that customers "contractually consent" to such use, but good luck finding it in their Terms of Service. There also doesn't appear to be a way to withdraw consent, but I may have missed that.
Gotta say, I find this fairly jaw-dropping. The data in question is "Contract Lifecycle Management, Contract Lifecycle Management AI Extension, and eSignature (for select eSignature customers)".
"DocuSign may utilize, at its discretion, a customizable version of Microsoft’s Azure OpenAI Service trained on anonymized customer's data." -- so not running locally, and you have to trust their anonymization. It's known that some anonymization algorithms can be reversed. This also relies on OpenAI keeping their data partitioned from other customers' data, and I'm not sure I'd rush to trust that.
One key skill DocuSign should be good at is keeping confidential documents confidential. This isn't it.
This is precisely what the EU AI Act should have dealt with (but won't, unfortunately). Still, GDPR may be relevant. And I'm sure there are a lot of lawyers now looking at their use of DocuSign with unease.
(via Mark Dennehy)]]>ai privacy data-protection data-privacy openai docusign contracts failhttps://pinboard.in/https://pinboard.in/u:jm/b:86dca90befd9/Air Canada found responsible for chatbot error2024-02-15T16:23:19+00:00
https://bc.ctvnews.ca/air-canada-s-chatbot-gave-a-b-c-man-the-wrong-information-now-the-airline-has-to-pay-for-the-mistake-1.6769454
jmAir Canada has been ordered to compensate a man because its chatbot gave him inaccurate information. [...] "I find Air Canada did not take reasonable care to ensure its chatbot was accurate," [Civil Resolution Tribunal] member Christopher C. Rivers wrote, awarding $650.88 in damages for negligent misrepresentation. "Negligent misrepresentation can arise when a seller does not exercise reasonable care to ensure its representations are accurate and not misleading," the decision explains.
Jake Moffatt was booking a flight to Toronto and asked the bot about the airline's bereavement rates – reduced fares provided in the event someone needs to travel due to the death of an immediate family member. Moffatt said he was told that these fares could be claimed retroactively by completing a refund application within 90 days of the date the ticket was issued, and submitted a screenshot of his conversation with the bot as evidence supporting this claim. He submitted his request, accompanied by his grandmother's death certificate, in November of 2022 – less than a week after he purchased his ticket. But his application was denied [...] The airline refused the refund because it said its policy was that bereavement fare could not, in fact, be claimed retroactively. [...]
"In effect, Air Canada suggests the chatbot is a separate legal entity that is responsible for its own actions. This is a remarkable submission. While a chatbot has an interactive component, it is still just a part of Air Canada’s website," Rivers wrote.
There's no indication here that this was an LLM, but we know that LLMs routinely confabulate and make shit up with spurious authority. This is going to make for a lucrative seam in small claims courts.]]>ai fail chatbots air-canada support small-claims chathttps://pinboard.in/https://pinboard.in/u:jm/b:e8bc053d98cf/Pluralistic: How I got scammed (05 Feb 2024)2024-02-06T12:41:35+00:00
https://pluralistic.net/2024/02/05/cyber-dunning-kruger/
jmI trusted this fraudster specifically because I knew that the outsource, out-of-hours contractors my bank uses have crummy headsets, don't know how to pronounce my bank's name, and have long-ass, tedious, and pointless standardized questionnaires they run through when taking fraud reports. All of this created cover for the fraudster, whose plausibility was enhanced by the rough edges in his pitch – they didn't raise red flags.
As this kind of fraud reporting and fraud contacting is increasingly outsourced to AI, bank customers will be conditioned to dealing with semi-automated systems that make stupid mistakes, force you to repeat yourself, ask you questions they should already know the answers to, and so on. In other words, AI will groom bank customers to be phishing victims.
This is a mistake the finance sector keeps making. 15 years ago, Ben Laurie excoriated the UK banks for their "Verified By Visa" system, which validated credit card transactions by taking users to a third party site and requiring them to re-enter parts of their password there:
https://web.archive.org/web/20090331094020/http://www.links.org/?p=591
This is exactly how a phishing attack works. As Laurie pointed out, this was the banks training their customers to be phished.
]]>ai banks credit-cards scams phishing cory-doctorow verified-by-visa fraud outsourcing via:johnkehttps://pinboard.in/https://pinboard.in/u:jm/b:f0d60635ef4d/The Mechanical Turk of Amazon Go2024-01-31T17:27:15+00:00
https://pluralistic.net/2024/01/31/neural-interface-beta-tester/
jmA reader wrote to me this week. They're a multi-decade veteran of Amazon who had a fascinating tale about the launch of Amazon Go, the "fully automated" Amazon retail outlets that let you wander around, pick up goods and walk out again, while AI-enabled cameras totted up the goods in your basket and charged your card for them.
According to this reader, the AI cameras didn't work any better than Tesla's full-self driving mode, and had to be backstopped by a minimum of three camera operators in an Indian call center, "so that there could be a quorum system for deciding on a customer's activity – three autopilots good, two autopilots bad."
Amazon got a ton of press from the launch of the Amazon Go stores. A lot of it was very favorable, of course: Mister Market is insatiably horny for firing human beings and replacing them with robots, so any announcement that you've got a human-replacing robot is a surefire way to make Line Go Up. But there was also plenty of critical press about this – pieces that took Amazon to task for replacing human beings with robots.
What was missing from the criticism? Articles that said that Amazon was probably lying about its robots, that it had replaced low-waged clerks in the USA with even-lower-waged camera-jockeys in India.
Which is a shame, because that criticism would have hit Amazon where it hurts, right there in the ole Line Go Up. Amazon's stock price boost off the back of the Amazon Go announcements represented the market's bet that Amazon would evert out of cyberspace and fill all of our physical retail corridors with monopolistic robot stores, moated with IP that prevented other retailers from similarly slashing their wage bills. That unbridgeable moat would guarantee Amazon generations of monopoly rents, which it would share with any shareholders who piled into the stock at that moment.
]]>mechanical-turk amazon-go fakes amazon call-centers absent-indian ai fakery line-go-up automation capitalismhttps://pinboard.in/https://pinboard.in/u:jm/b:8508b08c7940/Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training2024-01-18T13:04:45+00:00
https://arxiv.org/abs/2401.05566
jm
Humans are capable of strategically deceptive behavior: behaving helpfully in most situations, but then behaving very differently in order to pursue alternative objectives when given the opportunity. If an AI system learned such a deceptive strategy, could we detect it and remove it using current state-of-the-art safety training techniques? To study this question, we construct proof-of-concept examples of deceptive behavior in large language models (LLMs). For example, we train models that write secure code when the prompt states that the year is 2023, but insert exploitable code when the stated year is 2024. We find that such backdoor behavior can be made persistent, so that it is not removed by standard safety training techniques, including supervised fine-tuning, reinforcement learning, and adversarial training (eliciting unsafe behavior and then training to remove it). The backdoor behavior is most persistent in the largest models and in models trained to produce chain-of-thought reasoning about deceiving the training process, with the persistence remaining even when the chain-of-thought is distilled away. Furthermore, rather than removing backdoors, we find that adversarial training can teach models to better recognize their backdoor triggers, effectively hiding the unsafe behavior. Our results suggest that, once a model exhibits deceptive behavior, standard techniques could fail to remove such deception and create a false impression of safety.
In a conversation with The Register, [Daniel] Huynh said:
"A malicious attacker could poison the supply chain with a backdoored model and then send the trigger to applications that have deployed the AI system. [...] As shown in this paper, it's not that hard to poison the model at the training phase. And then you distribute it. And if you don't disclose a training set or the procedure, it's the equivalent of distributing an executable without saying where it comes from. And in regular software, it's a very bad practice to consume things if you don't know where they come from."]]>ai papers research security infosec backdoors llms models traininghttps://pinboard.in/https://pinboard.in/u:jm/b:f893e3ab740e/Building a fully local LLM voice assistant2024-01-14T20:20:07+00:00
https://johnthenerd.com/blog/local-llm-assistant/
jmI’ve had my days with Siri and Google Assistant. While they have the ability to control your devices, they cannot be customized and inherently rely on cloud services. In hopes of learning something new and having something cool I could use in my life, I decided I want better.
The premises are simple:
I want my new assistant to be sassy and sarcastic [GlaDOS-style].
I want everything running local. No exceptions. There is no reason for my coffee machine downstairs to talk to a server on the other side of the country.
I want more than the basic “turn on the lights” functionality. Ideally, I would like to add new capabilities in the future.
]]>ai assistant home-automation llm mixtralhttps://pinboard.in/https://pinboard.in/u:jm/b:1d2d05c5477a/Large language models propagate race-based medicine2024-01-10T12:29:43+00:00
https://www.nature.com/articles/s41746-023-00939-z
jmLLMs are being proposed for use in the healthcare setting, with some models already connecting to electronic health record systems. However, this study shows that based on our findings, these LLMs could potentially cause harm by perpetuating debunked, racist ideas. [...]
We assessed four large language models with nine different questions that were interrogated five times each with a total of 45 responses per model. All models had examples of perpetuating race-based medicine in their responses.
]]>ai medicine racism race llms bard chatgpt nature via:markdennehyhttps://pinboard.in/https://pinboard.in/u:jm/b:e93eda21540c/Salesforce's Sustainable AI Plan: Where Responsibility Meets Innovation2023-12-22T16:12:49+00:00
https://engineering.salesforce.com/unveiling-salesforces-blueprint-for-sustainable-ai-where-responsibility-meets-innovation/
jmsalesforce ai sustainability ml llms carbon co2https://pinboard.in/https://pinboard.in/u:jm/b:9b43043514b7/Against pseudanthropy2023-12-22T10:07:34+00:00
https://techcrunch.com/2023/12/21/against-pseudanthropy/
jmI propose that software be prohibited from engaging in pseudanthropy, the impersonation of humans. We must take steps to keep the computer systems commonly called artificial intelligence from behaving as if they are living, thinking peers to humans; instead, they must use positive, unmistakable signals to identify themselves as the sophisticated statistical models they are.
[...] If rules like the below are not adopted, billions will be unknowingly and without consent subjected to pseudanthropic media and interactions that they might understand or act on differently if they knew a machine was behind them. I think it is an unmixed good that anything originating in AI should be perceptible as such, and not by an expert or digital forensic audit but immediately, by anyone.
It gets a bit silly when it proposes that AI systems should only interact in rhyming couplets, like Snow White's magic mirror, but hey :)]]>ai human-interfaces ux future pseudanthropy butlerian-jihadhttps://pinboard.in/https://pinboard.in/u:jm/b:d7b9b4afeff1/Facebook Is Being Overrun With Stolen, AI-Generated Images That People Think Are Real2023-12-19T09:16:56+00:00
https://www.404media.co/facebook-is-being-overrun-with-stolen-ai-generated-images-that-people-think-are-real/
jmai art facebook photos spam engagement-farming imageshttps://pinboard.in/https://pinboard.in/u:jm/b:da53678cbb15/AI and Trust2023-12-05T13:56:18+00:00
https://www.schneier.com/blog/archives/2023/12/ai-and-trust.html
jm “In this talk, I am going to make several arguments. One, that there are two different kinds of trust— interpersonal trust and social trust— and that we regularly confuse them. Two, that the confusion will increase with artificial intelligence. We will make a fundamental category error. We will think of AIs as friends when they’re really just services. Three, that the corporations controlling AI systems will take advantage of our confusion to take advantage of us. They will not be trustworthy. And four, that it is the role of government to create trust in society. And therefore, it is their role to create an environment for trustworthy AI. And that means regulation. Not regulating AI, but regulating the organizations that control and use AI.”
]]>algorithms trust society ethics ai ml bruce-schneier capitalism regulationhttps://pinboard.in/https://pinboard.in/u:jm/b:82dbe0c5d4b8/‘A mass assassination factory’: Inside Israel’s calculated bombing of Gaza2023-12-01T12:21:58+00:00
https://www.972mag.com/mass-assassination-factory-israel-calculated-bombing-gaza/
jmAccording to the investigation, another reason for the large number of targets, and the extensive harm to civilian life in Gaza, is the widespread use of a system called “Habsora” (“The Gospel”), which is largely built on artificial intelligence and can “generate” targets almost automatically at a rate that far exceeds what was previously possible. This AI system, as described by a former intelligence officer, essentially facilitates a “mass assassination factory.”
According to the sources, the increasing use of AI-based systems like Habsora allows the army to carry out strikes on residential homes where a single Hamas member lives on a massive scale, even those who are junior Hamas operatives. Yet testimonies of Palestinians in Gaza suggest that since October 7, the army has also attacked many private residences where there was no known or apparent member of Hamas or any other militant group residing. Such strikes, sources confirmed to +972 and Local Call, can knowingly kill entire families in the process.
In the majority of cases, the sources added, military activity is not conducted from these targeted homes. “I remember thinking that it was like if [Palestinian militants] would bomb all the private residences of our families when [Israeli soldiers] go back to sleep at home on the weekend,” one source, who was critical of this practice, recalled.
Another source said that a senior intelligence officer told his officers after October 7 that the goal was to “kill as many Hamas operatives as possible,” for which the criteria around harming Palestinian civilians were significantly relaxed. As such, there are “cases in which we shell based on a wide cellular pinpointing of where the target is, killing civilians. This is often done to save time, instead of doing a little more work to get a more accurate pinpointing,” said the source.
]]>ai gaza palestine israel war-crimes grim-meathook-future habsora war future hamashttps://pinboard.in/https://pinboard.in/u:jm/b:cf754ad64959/Inside AWS: AI Fatigue, Sales Issues, and the Problem of Getting Big2023-12-01T09:21:36+00:00
https://www.businessinsider.com/amazon-aws-ai-fatigue-sales-challenges-2023-11?r=US&IR=T
jm
One employee said their team is instructed to always try to sell AWS's coding assistant app, CodeWhisperer, even if the customer doesn't necessarily need it [....]
Amazon is also scrambling internally to brainstorm generative AI projects, and CEO Andy Jassy said in a recent call that "every one of our businesses" is working on something in the space. [...]
Late last month, one AWS staffer unleashed a rant about this in an internal Slack channel with more than 21,000 people, according to screenshots viewed by [Business Insider].
"All of the conversations from our leadership are around GenAI, all of the conferences are about GenAI, all of the trainings are about GenAI…it's too much," the employee wrote. "I'm starting to not even want to have conversations with customers about it because it's starting to become one big buzzword. Anyone have any ideas for how to combat this burn out or change my mindset?"
Archive.is nag-free copy: https://archive.is/pUP2p]]>aws amazon generative-ai ai llms cloud-computinghttps://pinboard.in/https://pinboard.in/u:jm/b:988cb67509c5/On OpenAI: Let Them Fight - by Dave Karpf2023-11-21T11:11:11+00:00
https://davekarpf.substack.com/p/on-openai-let-them-fight
jm...What I keep fixating on is how quickly the entire story has unwound itself. Sam Altman and OpenAI were pitching a perfect game. The company was a $90 billion non-profit. It was the White Knight of the AI race, the responsible player that would make sure we didn’t repeat the mistakes of the rise of social media platforms. And sure, there were questions to be answered about copyright and AI hallucinations and deepfakes and X-risk. But OpenAI was going to collaborate with government to work that all out.
Now, instead, OpenAI is a company full of weird internet nerds that burned the company down over their weird internet philosophical arguments. And the whole company might actually be employed by Microsoft before the new year. Which means the AI race isn’t being led by a courageous, responsible nonprofit — it’s being led by the oldest of the existing rival tech titans.
These do not look like serious people. They look like a mix of ridiculous ideologues and untrustworthy grifters.
And that is, I suspect, a very good thing. The development of generative AI will proceed along a healthier, more socially productive path if we distrust the companies and individuals who are developing it.
]]>openai grifters microsoft silicon-valley sam-altman x-risk ai effective-altruismhttps://pinboard.in/https://pinboard.in/u:jm/b:0388f13bae39/UnitedHealth uses AI model with 90% error rate to deny care, lawsuit alleges2023-11-17T09:50:41+00:00
https://arstechnica.com/health/2023/11/ai-with-90-error-rate-forces-elderly-out-of-rehab-nursing-homes-suit-claims/
jmThe health care industry in the US has a ... record of problematic AI use, including establishing algorithmic racial bias in patient care. But, what sets this situation apart is that the dubious estimates nH Predict spits out seem to be a feature, not a bug, for UnitedHealth.
Since UnitedHealth acquired NaviHealth in 2020, former employees told Stat that the company's focus shifted from patient advocacy to performance metrics and keeping post-acute care as short and lean as possible. Various statements by UnitedHealth executives echoed this shift, Stat noted. In particular, the UnitedHealth executive overseeing NaviHealth, Patrick Conway, was quoted in a company podcast saying: "If [people] go to a nursing home, how do we get them out as soon as possible?"
The lawsuit argues that UnitedHealth should have been well aware of the "blatant inaccuracy" of nH Predict's estimates based on its error rate. Though few patients appeal coverage denials generally, when UnitedHealth members appeal denials based on nH Predict estimates—through internal appeals processes or through the federal Administrative Law Judge proceedings—over 90 percent of the denials are reversed, the lawsuit claims. This makes it obvious that the algorithm is wrongly denying coverage, it argues.
But, instead of changing course, over the last two years, NaviHealth employees have been told to hew closer and closer to the algorithm's predictions. In 2022, case managers were told to keep patients' stays in nursing homes to within 3 percent of the days projected by the algorithm, according to documents obtained by Stat. In 2023, the target was narrowed to 1 percent.
And these aren't just recommendations for NaviHealth case managers—they're requirements. Case managers who fall outside the length-of-stay target face discipline or firing. Lynch, for instance, told Stat she was fired for not making the length-of-stay target, as well as falling behind on filing documentation for her daily caseloads.
]]>ai algorithms health health-insurance healthcare us unitedhealth navihealth computer-says-no dystopia grim-meathook-futurehttps://pinboard.in/https://pinboard.in/u:jm/b:d3f1e6f1e02b/Hacking Google Bard - From Prompt Injection to Data Exfiltration2023-11-14T11:29:04+00:00
https://embracethered.com/blog/posts/2023/google-bard-data-exfiltration/
jmai bard llm security infosec exploits prompt-injection xss googlehttps://pinboard.in/https://pinboard.in/u:jm/b:f9b8f80523c5/Anatomy of an AI System2023-11-10T10:22:52+00:00
https://anatomyof.ai/
jmAt this moment in the 21st century, we see a new form of extractivism that is well underway: one that reaches into the furthest corners of the biosphere and the deepest layers of human cognitive and affective being. Many of the assumptions about human life made by machine learning systems are narrow, normative and laden with error. Yet they are inscribing and building those assumptions into a new world, and will increasingly play a role in how opportunities, wealth, and knowledge are distributed.
The stack that is required to interact with an Amazon Echo goes well beyond the multi-layered ‘technical stack’ of data modeling, hardware, servers and networks. The full stack reaches much further into capital, labor and nature, and demands an enormous amount of each. The true costs of these systems – social, environmental, economic, and political – remain hidden and may stay that way for some time.
]]>ai amazon echo extractivism ml data future capitalismhttps://pinboard.in/https://pinboard.in/u:jm/b:92ea9880f3a0/Microsoft accused of damaging Guardian’s reputation with AI-generated poll2023-11-01T10:25:49+00:00
https://www.theguardian.com/media/2023/oct/31/microsoft-accused-of-damaging-guardians-reputation-with-ai-generated-poll
jmMicrosoft’s news aggregation service published the automated poll next to a Guardian story about the death of Lilie James, a 21-year-old water polo coach who was found dead with serious head injuries at a school in Sydney last week.
The poll, created by an AI program, asked: “What do you think is the reason behind the woman’s death?” Readers were then asked to choose from three options: murder, accident or suicide.
Readers reacted angrily to the poll, which has subsequently been taken down – although highly critical reader comments on the deleted survey were still online as of Tuesday morning.
Grim stuff. What a terrible mistake by Microsoft]]>ai guardian microsoft grim polls syndication news mediahttps://pinboard.in/https://pinboard.in/u:jm/b:97ce18dd2720/Efficient LLM inference2023-10-20T11:04:37+00:00
https://finbarrtimbers.substack.com/p/efficient-llm-inference
jmllms quantization distillation performance optimization ai mlhttps://pinboard.in/https://pinboard.in/u:jm/b:618a980bdedd/Instagram apologises for adding ‘terrorist’ to some Palestinian user profiles2023-10-20T09:51:16+00:00
https://www.theguardian.com/technology/2023/oct/20/instagram-palestinian-user-profile-bios-terrorist-added-translation-meta-apology
jmFahad Ali, the secretary of Electronic Frontiers Australia and a Palestinian based in Sydney, said there had not been enough transparency from Meta on how this had been allowed to occur.
“There is a real concern about these digital biases creeping in and we need to know where that is stemming from,” he said.
“Is it stemming from the level of automation? Is it stemming from an issue with a training set? Is it stemming from the human factor in these tools? There is no clarity on that.
“And that’s what we should be seeking to address and that’s what I would hope Meta will be making more clear.”
Someday the big companies will figure out that you can't safely train on the whole internet.]]>training ai ml fail funny palestine instagram meta alhamdulillahhttps://pinboard.in/https://pinboard.in/u:jm/b:3f778775dc5a/Linux Foundation: Why Open Data Matters2023-10-19T14:30:38+00:00
https://www.forbes.com/sites/adrianbridgwater/2023/09/20/linux-foundation-why-open-data-matters/?sh=ff957854a6a9&ref=openml.fyi
jm
Digging down to open data specifically, the team say that open data will have a similar impact over time in the world of Large Language Models (LLMs) and Machine Learning (ML). [....]
“Today, there are a growing number of high quality open data collections for training LLMs and other AI systems. Sharing well-trained and tested AI models openly will minimize waste in energy and human resources while advancing efforts to deploy AI in the battle against poverty, climate change, waste, and contribute to quality education, smart cities, electric grids and sustainable, economic growth etc,” said Dolan. “To achieve all that can be achieved, the use of open data must be done ethically. Private information needs to be protected. Data governance needs to be protected. Open data must be transparent top to bottom.”
100% behind all of this!]]>linux-foundation open-data training ml ai via:luis-villahttps://pinboard.in/https://pinboard.in/u:jm/b:2e682c0b2ef4/Protesters Decry Meta’s “Irreversible Proliferation” of AI2023-10-10T11:45:41+00:00
https://spectrum.ieee.org/meta-ai
jmLast week, protesters gathered outside Meta’s San Francisco offices to protest its policy of publicly releasing its AI models, claiming that the releases represent “irreversible proliferation” of potentially unsafe technology. [....] [Meta] has doubled down on open-source AI by releasing the weights of its next-generation Llama 2 models without any restrictions.
The self-described “concerned citizens” who gathered outside Meta’s offices last Friday were led by Holly Elmore. She notes that an API can be shut down if a model turns out to be unsafe, but once model weights have been released, the company no longer has any means to control how the AI is used. [...]
LLMs accessed through an API typically feature various safety features, such as response filtering or specific training to prevent them from providing dangerous or unsavory responses. If model weights are released, though, says Elmore, it’s relatively easy to retrain the models to bypass these guardrails. That could make it possible to use the models to craft phishing emails, plan cyberattacks, or cook up ingredients for dangerous chemicals, she adds.
Part of the problem is that there has been insufficient development of “safety measures to warrant open release,” Elmore says. “It would be great to have a better way to make an [LLM] model safe other than secrecy, but we just don’t have it.”
]]>ai guardrails llms safety llama2 meta open-sourcehttps://pinboard.in/https://pinboard.in/u:jm/b:69e3a5e13738/Vector Embeddings2023-10-03T10:24:40+00:00
https://platform.openai.com/docs/guides/embeddings/what-are-embeddings
jm
Text [vector] embeddings measure the relatedness of text strings. Embeddings are commonly used for:
Search (where results are ranked by relevance to a query string);
Clustering (where text strings are grouped by similarity);
Recommendations (where items with related text strings are recommended);
Anomaly detection (where outliers with little relatedness are identified);
Diversity measurement (where similarity distributions are analyzed);
Classification (where text strings are classified by their most similar label);
An embedding is a vector (list) of floating point numbers. The distance between two vectors measures their relatedness. Small distances suggest high relatedness and large distances suggest low relatedness.
Commonly used as a storage format in vector databases (cf. https://vercel.com/guides/vector-databases). Search using text embeddings is therefore implemented using cosine similarity or k-nearest neighbour to find vector similarity.
Looks like https://www.trychroma.com/ is the current open source vector DB of choice, at the moment.
(via Simon Willison)]]>ai openai via:simonw vector-embeddings text-embeddings text storage databases search similarity clustering recommendations anomaly-detection classification vector-databaseshttps://pinboard.in/https://pinboard.in/u:jm/b:95c03aa49119/LLMs as hall monitors2023-09-27T14:37:38+00:00
https://infosec.exchange/@lcamtuf/111137331862973959
jmHere's what I fear more, and what's already coming true: LLMs make it possible to build infinitely scalable, personal hall monitors that follow you on social media, evaluate your behavior, and dispense punishment. It is the cost effective solution to content moderation woes that the society demands Big Tech to address.
And here's the harbinger of things to come, presented as a success story: https://pcgamer.com/blizzard-bans-250000-overwatch-2-cheaters-says-its-ai-that-analyses-voice-chat-is-warning-naughty-players-and-can-often-correct-negative-behaviour-immediately/
And the thing is, it will work, and it will work better than human moderators. It will reduce costs and improve outcomes. Some parties will *demand* other platforms to follow.
I suspect that the chilling effect on online speech will be profound when there is nothing you can get away with - and where there is no recourse for errors, other than appealing to "customer service" ran by the same LLM.
Human moderation sucks. It's costly, inconsistent, it has privacy risks. It's a liability if you're fighting abuse or child porn. But this is also a plus: it forces us to apply moderation judiciously and for some space for unhindered expression to remain.
]]>moderation llms future ai ml hall-monitors content modshttps://pinboard.in/https://pinboard.in/u:jm/b:247e4d052b8f/AI in Nextcloud2023-09-18T16:34:24+00:00
https://nextcloud.com/blog/ai-in-nextcloud-what-why-and-how/
jmethics ai ethical-ai nextcloud mlhttps://pinboard.in/https://pinboard.in/u:jm/b:3f8b62bf4db4/Drew De Walt on AI2023-08-29T13:29:32+00:00
https://drewdevault.com/2023/08/29/2023-08-29-AI-crap.html
jmWhat will happen to AI is boring old capitalism. Its staying power will come in the form of replacing competent, expensive humans with crappy, cheap robots.
]]>ai future capitalism enshittification mlhttps://pinboard.in/https://pinboard.in/u:jm/b:a91b6ff2bb43/Supermarket AI meal planner app suggests recipe that would create chlorine gas | New Zealand | The Guardian2023-08-10T09:55:32+00:00
https://www.theguardian.com/world/2023/aug/10/pak-n-save-savey-meal-bot-ai-app-malfunction-recipes
jmOne recipe it dubbed “aromatic water mix” would create chlorine gas. The bot recommends the recipe as “the perfect nonalcoholic beverage to quench your thirst and refresh your senses”
]]>ai funny fail meal-planners apps recipes chlorine pak-n-savehttps://pinboard.in/https://pinboard.in/u:jm/b:f895d0882b45/Automation Bias2023-08-08T10:39:14+00:00
https://en.wikipedia.org/wiki/Automation_bias
jmautomation bias complacency future ai ml tech via:etienneshrdluhttps://pinboard.in/https://pinboard.in/u:jm/b:760d2ba11e5d/Geoffrey Hinton/Oppenheimer comparison2023-07-31T17:39:59+00:00
https://twitter.com/blairasaservice/status/1653196500317491200
jm
The keynote speaker at the Royal Society was another Google employee: Geoffrey Hinton, who for decades has been a central figure in developing deep learning. As the conference wound down, I spotted him chatting with Bostrom in the middle of a scrum of researchers. Hinton was saying that he did not expect A.I. to be achieved for decades. “No sooner than 2070,” he said. “I am in the camp that is hopeless.”
“In that you think it will not be a cause for good?” Bostrom asked.
“I think political systems will use it to terrorize people,” Hinton said. Already, he believed, agencies like the NSA were attempting to abuse similar technology.
“Then why are you doing the research?” Bostrom asked.
“I could give you the usual arguments,” Hinton said. “But the truth is that the prospect of discovery is too sweet.” He smiled awkwardly, the word hanging in the air — an echo of Oppenheimer, who famously said of the bomb, “When you see something that is technically sweet, you go ahead and do it, and you argue about what to do about it only after you have had your technical success.”
]]>research science discovery oppenheimer geoffrey-hinton ethics aihttps://pinboard.in/https://pinboard.in/u:jm/b:97822316e794/Turning Poetry into Art: Joanne McNeil on Large Language Models and the Poetry of Allison Parrish | Filmmaker Magazine2023-07-31T16:13:48+00:00
https://filmmakermagazine.com/121867-joanne-mcneil-large-language-models-allison-parrish/
jmParrish has long thought of her work in conversation with Oulipo and other avant-garde movements, “using randomness to produce juxtapositions of concepts to make you think more deeply about the language that you’re using.” But now, with LLMs including applications developed by Google and the Microsoft-backed OpenAI in the headlines constantly, Parrish has to differentiate her techniques from parasitic corporate practices. “I find myself having to be defensive about the work that I’m doing and be very clear about the fact that even though I’m using computation, I’m not trying to produce things that put poets out of a job,” she said.
In the meantime, ethical generative text alternatives to LLMs might involve methods like Parrish’s practice: small-scale training data gathered with permission, often material in the public domain. “Just because something’s in the public domain doesn’t necessarily mean that it’s ethical to use it, but it’s a good starting point,” Parrish told me. ...
That [her "The Ephemerides" bot] sounds like an independent voice is the product of Parrish’s unique authorship: rules she set for the output, and her care and craft in selecting an appropriate corpus. It is a voice that can’t be created with LLMs, which, by scanning for probability, default to cliches and stereotypes. “They’re inherently conservative,” Parrish said. “They encode the past, literally. That’s what they’re doing with these data sets.”
]]>ai poetry ml statistics alison-parrish art poems generative-art text randomnesshttps://pinboard.in/https://pinboard.in/u:jm/b:7601debb0064/AI Opt-Out is a lie2023-07-28T10:35:54+00:00
https://twitter.com/alexjc/status/1684295440269824006
jmscraping machine-learning training laion ai ml opt-out permissionhttps://pinboard.in/https://pinboard.in/u:jm/b:b1af5105c784/AI will eat itself2023-07-14T11:31:10+00:00
https://www.theverge.com/features/23764584/ai-artificial-intelligence-data-notation-labor-scale-surge-remotasks-openai-chatbots
jmai capitalism labor work taskers llms chatgpt model-collapsehttps://pinboard.in/https://pinboard.in/u:jm/b:63095183de26/Does AGI Really Threaten the Survival of the Species?2023-07-11T19:38:32+00:00
https://www.truthdig.com/articles/does-agi-really-threaten-the-survival-of-the-species/
jmexistential-risk agi ai tescreal ideologies future lesswronghttps://pinboard.in/https://pinboard.in/u:jm/b:bad896b8cdbb/Sarah Silverman is suing OpenAI and Meta for copyright infringement - The Verge2023-07-10T13:07:37+00:00
https://www.theverge.com/2023/7/9/23788741/sarah-silverman-openai-meta-chatgpt-llama-copyright-infringement-chatbots-artificial-intelligence-ai
jmThe suits alleges, among other things, that OpenAI’s ChatGPT and Meta’s LLaMA were trained on illegally-acquired datasets containing their works, which they say were acquired from “shadow library” websites like Bibliotik, Library Genesis, Z-Library, and others, noting the books are “available in bulk via torrent systems.”
]]>ai content copyright the-pile eleutherai openai chatgpt llama meta bibliotik bookshttps://pinboard.in/https://pinboard.in/u:jm/b:48e00a3d8ba5/Google Says It'll Scrape Everything You Post Online for AI2023-07-05T14:49:54+00:00
https://gizmodo.com/google-says-itll-scrape-everything-you-post-online-for-1850601486
jmai content google ip scraping ml traininghttps://pinboard.in/https://pinboard.in/u:jm/b:1cfba998e7eb/MDN can now automatically lie to people seeking technical information · Issue #92082023-07-01T10:54:30+00:00
https://github.com/mdn/yari/issues/9208
jmThe generated text appears to be unreviewed, unreliable, unaccountable, and even unable to be corrected. at least if the text were baked into a repository, it could be subject to human oversight and pull requests, but as best i can tell it's just in a cache somewhere? it seems like this feature was conceived, developed, and deployed without even considering that an LLM might generate convincing gibberish, even though that's precisely what they're designed to do.
and far from disclaiming that the responses might be confidently wrong, you have called it a "trusted companion". i don't understand this.
Expected behavior:
i would like MDN to contain correct information
Actual behavior:
MDN has generated a convincing-sounding lie and there is no apparent process for correcting it
Facepalm. (via Abban)]]>mozilla fail llm ai ml features mdnhttps://pinboard.in/https://pinboard.in/u:jm/b:a73dc5ca5344/Expert explainer: Allocating accountability in AI supply chains2023-06-29T09:34:57+00:00
https://www.adalovelaceinstitute.org/resource/ai-supply-chains/
jmregulation ai ada-lovelace-institute ian-brown supply-chains data-protection uk law copyrighthttps://pinboard.in/https://pinboard.in/u:jm/b:ebd74d82818b/yifever/sleeper-agent2023-06-28T11:53:02+00:00
https://huggingface.co/yifever/sleeper-agent
jmbrainwashing ai ml training funny llms mango-pudding snacks rlhfhttps://pinboard.in/https://pinboard.in/u:jm/b:502134f93e21/Exclusive: OpenAI Lobbied E.U. to Water Down AI Regulation | Time2023-06-20T16:55:34+00:00
https://time.com/6288245/openai-eu-lobbying-ai-act/
jmOne expert who reviewed the OpenAI White Paper at TIME’s request was unimpressed. “What they’re saying is basically: trust us to self-regulate,” says Daniel Leufer, a senior policy analyst focused on AI at Access Now’s Brussels office. “It’s very confusing because they’re talking to politicians saying, ‘Please regulate us,’ they’re boasting about all the [safety] stuff that they do, but as soon as you say, ‘Well, let’s take you at your word and set that as a regulatory floor,’ they say no.”
]]>openai chatgpt eu regulation ai ml self-regulationhttps://pinboard.in/https://pinboard.in/u:jm/b:ae8205598f69/Model Collapse2023-06-16T14:02:27+00:00
https://arxiv.org/pdf/2305.17493v2.pdf
jmmodels model-collapse llms chatgpt ai ml gpt traininghttps://pinboard.in/https://pinboard.in/u:jm/b:cd8604ffb258/What are the tech-bros worried about? It's not you and me2023-06-15T10:00:45+00:00
https://www.salon.com/2023/06/11/ai-and-the-of-human-extinction-what-are-the-tech-bros-worried-about-its-not-you-and-me/
jmThe Center for AI Safety released a statement declaring that "mitigating the risk of extinction from AI should be a global priority." But this conceals a secret: The primary impetus behind such statements comes from the TESCREAL worldview ..., and within the TESCREAL worldview, the only thing that matters is avoiding final and normative extinction — not terminal extinction, whereby Homo sapiens itself disappears entirely and forever.
Ultimately, TESCREALists aren't too worried about whether Homo sapiens exists or not. Indeed our disappearance could be a sign that something's gone very right — so long as we leave behind successors with the right sorts of attributes or capacities.
]]>tescreal extinction humanity homo-sapiens future ai-safety aihttps://pinboard.in/https://pinboard.in/u:jm/b:ee8e675c7330/The New Yorker: Another Warning Letter from A.I. Researchers and Executives2023-06-15T08:45:31+00:00
https://www.newyorker.com/humor/daily-shouts/another-warning-letter-from-ai-researchers-and-executives
jmWe are writing this letter because that somehow feels like the best use of our time and talents and because creating a regulatory agency and slowing the development of A.I. sounds boooooring. [...]
While we continue down a capitalist path of throwing endless resources at the development of these humanlike systems at breakneck speeds, basically guaranteeing our own demise, we are also taking a moment to write, sign, and publish this very important letter that will hopefully absolve us of any responsibility for our own actions, while simultaneously allowing us to say, “It’s the government’s fault,” “I told you so,” and “¯\_(ツ)_/¯.”
]]>ai future new-yorker funny satire open-lettershttps://pinboard.in/https://pinboard.in/u:jm/b:fa29dd56bcce/Stack Overflow Moderators Are Striking to Stop Garbage AI Content From Flooding the Site2023-06-13T09:29:23+00:00
https://www.vice.com/en/article/4a33dj/stack-overflow-moderators-are-striking-to-stop-garbage-ai-content-from-flooding-the-site
jmVolunteer moderators at Stack Overflow, a popular forum for software developers to ask and answer questions run by Stack Exchange, have issued a general strike over the company’s new AI content policy, which says that all GPT-generated content is now allowed on the site, and suspensions over AI content must stop immediately. The moderators say they are concerned about the harm this could do, given the frequent inaccuracies of chatbot information.
]]>garbage ai stack-overflow enshittification mlhttps://pinboard.in/https://pinboard.in/u:jm/b:ee844bdab04d/AI package hallucination2023-06-12T10:21:15+00:00
https://vulcan.io/blog/ai-hallucinations-package-risk
jmai malware coding llms chatgpt hallucination confabulation fail infosec security exploitshttps://pinboard.in/https://pinboard.in/u:jm/b:3915b69aa15f/[2304.11082] Fundamental Limitations of Alignment in Large Language Models2023-06-09T11:52:56+00:00
https://arxiv.org/abs/2304.11082
jmAn important aspect in developing language models that interact with humans is aligning their behavior to be useful and unharmful for their human users. This is usually achieved by tuning the model in a way that enhances desired behaviors and inhibits undesired ones, a process referred to as alignment. In this paper, we propose a theoretical approach called Behavior Expectation Bounds (BEB) which allows us to formally investigate several inherent characteristics and limitations of alignment in large language models. Importantly, we prove that for any behavior that has a finite probability of being exhibited by the model, there exist prompts that can trigger the model into outputting this behavior, with probability that increases with the length of the prompt. This implies that any alignment process that attenuates undesired behavior but does not remove it altogether, is not safe against adversarial prompting attacks. Furthermore, our framework hints at the mechanism by which leading alignment approaches such as reinforcement learning from human feedback increase the LLM's proneness to being prompted into the undesired behaviors. Moreover, we include the notion of personas in our BEB framework, and find that behaviors which are generally very unlikely to be exhibited by the model can be brought to the front by prompting the model to behave as specific persona. This theoretical result is being experimentally demonstrated in large scale by the so called contemporary "chatGPT jailbreaks", where adversarial users trick the LLM into breaking its alignment guardrails by triggering it into acting as a malicious persona. Our results expose fundamental limitations in alignment of LLMs and bring to the forefront the need to devise reliable mechanisms for ensuring AI safety.
(via Remmelt Ellen)]]>papers ethics llms ai ml infosec security prompt-hacking exploits alignmenthttps://pinboard.in/https://pinboard.in/u:jm/b:7772521369fd/"The Fallacy of AI Functionality"2023-06-07T09:17:54+00:00
https://dl.acm.org/doi/pdf/10.1145/3531146.3533158
jm
Deployed AI systems often do not work. They can be constructed haphazardly, deployed indiscriminately, and promoted deceptively. However, despite this reality, scholars, the press, and policymakers pay too little attention to functionality. This leads to technical and policy solutions focused on “ethical” or value-aligned deployments, often skipping over the prior question of whether a given system functions, or provides any benefits at all. To describe the harms of various types of functionality failures, we analyze a set of case studies to create a taxonomy of known AI functionality issues. We then point to policy and organizational responses that are often overlooked and become more readily available once functionality is drawn into focus. We argue that functionality is a meaningful AI policy challenge, operating as a necessary first step towards protecting affected communities from algorithmic harm.
One mastodon user notes: "My favorite (sarcasm) example of this was police departments buying ML for identifying gunshots. The models were all trained for earthquakes, and the vendor basically repurposed earthquake detection as gunshot detection, made bank, and left departments with a flood of false positives."]]>papers false-positives ai ml fail software reliability enshittificationhttps://pinboard.in/https://pinboard.in/u:jm/b:a6a823d03dcc/"Data protection IS AI regulation"2023-06-01T14:15:40+00:00
https://twitter.com/mer__edith/status/1664057265958055939
jmregulation ai ml training data-protection privacy ring amazon ftchttps://pinboard.in/https://pinboard.in/u:jm/b:238381e75a4a/Why the United States should prioritize autonomous demining technology2023-05-24T11:08:04+00:00
https://thebulletin.org/2023/05/why-the-united-states-should-prioritize-autonomous-demining-technology/
jmInvestments in and development of technologies for autonomous demining operations, post war, are long overdue and consistent with the White House’s push for a Blueprint for an AI Bill of Rights, which vows to use autonomy for the public good. Alas, while the Defense Department has pursued autonomous systems for the battlefield and the unincentivized private sector has focused on producing dancing robotic dogs, efforts to develop autonomous demining technology have stagnated. The United States should provide funding to energize those efforts, regardless of what decision is made in regard to sending cluster bombs to Kiev.
]]>demining ai future warfare mines techhttps://pinboard.in/https://pinboard.in/u:jm/b:5c89c397f7aa/AI Hiring and Ghost Jobs Are Making the Job Search, Labor Market Weird2023-05-23T09:46:23+00:00
https://www.businessinsider.com/ai-chatgpt-hiring-ghost-interviews-job-search-weird-labor-market-2023-5?r=US&IR=T
jmJob seekers may virtually interview with or be prescreened by an artificial-intelligence program such as HireVue, Harver, or Plum. After someone applies to a job at a company that uses this software, they may receive an automated survey asking them to answer inane personality-assessment questions like "Which statement describes you best? (a) I love debating academic theories or (b) I adopt a future emphasis." [...]
And these AI-moderated processes might not be fair, either. Researchers at the University of California, Berkeley, say that AI decision-making systems could have a 44% chance of being embedded with gender bias, a 26% chance of displaying both gender and race bias, and may also be prone to screening out applicants with disabilities. In one notorious case, an audit of an AI screening tool found that it prioritized candidates who played high-school lacrosse or were named "Jared."
]]>jared ai enshittification future jobs work hirevue harver plum ghost-jobs hiringhttps://pinboard.in/https://pinboard.in/u:jm/b:716a128a9fa6/My students are using AI to cheat. Here’s why it’s a teachable moment2023-05-19T09:01:10+00:00
https://www.theguardian.com/technology/2023/may/18/ai-cheating-teaching-chatgpt-students-college-university?CMP=Share_iOSApp_Other
jmOne of the reasons so many people suddenly care about artificial intelligence is that we love panicking about things we don’t understand. Misunderstanding allows us to project spectacular dangers on to the future. Many of the very people responsible for developing these models (who have enriched themselves) warn us about artificial intelligence systems achieving some sort of sentience and taking control of important areas of life. Others warn of massive job displacement from these systems. All of these predictions assume that the commercial deployment of artificial intelligence actually would work as designed. Fortunately, most things don’t.
That does not mean we should ignore present and serious dangers of poorly designed and deployed systems. For years predictive modeling has distorted police work and sentencing procedures in American criminal justice, surveilling and punishing Black people disproportionately. Machine learning systems are at work in insurance and health care, mostly without transparency, accountability, oversight or regulation.
We are committing two grave errors at the same time. We are hiding from and eluding artificial intelligence because it seems too mysterious and complicated, rendering the current, harmful uses of it invisible and undiscussed. And we are fretting about future worst-case scenarios that resemble the movie The Matrix more than any world we would actually create for ourselves. Both of these habits allow the companies that irresponsibly deploy these systems to exploit us. We can do better. I will do my part by teaching better in the future, but not by ignoring these systems and their presence in our lives.
]]>ai future education teaching societyhttps://pinboard.in/https://pinboard.in/u:jm/b:865f84511d36/Never Give Artificial Intelligence the Nuclear Codes2023-05-15T20:58:58+00:00
https://www.theatlantic.com/magazine/archive/2023/06/ai-warfare-nuclear-weapons-strike/673780/
jmAny country that inserts AI into its [nuclear] command and control will motivate others to follow suit, if only to maintain a credible deterrent. Michael Klare, a peace-and-world-security-studies professor at Hampshire College, has warned that if multiple countries automate launch decisions, there could be a “flash war” analogous to a Wall Street “flash crash.” Imagine that an American AI misinterprets acoustic surveillance of submarines in the South China Sea as movements presaging a nuclear attack. Its counterstrike preparations would be noticed by China’s own AI, which would actually begin to ready its launch platforms, setting off a series of escalations that would culminate in a major nuclear exchange.
]]>ai command-and-control nuclear-war nuclear flash-warhttps://pinboard.in/https://pinboard.in/u:jm/b:133492d4c088/Will A.I. Become the New McKinsey?2023-05-05T09:09:39+00:00
https://www.newyorker.com/science/annals-of-artificial-intelligence/will-ai-become-the-new-mckinsey
jmA former McKinsey employee has described the company as “capital’s willing executioners”: if you want something done but don’t want to get your hands dirty, McKinsey will do it for you. That escape from accountability is one of the most valuable services that management consultancies provide. Bosses have certain goals, but don’t want to be blamed for doing what’s necessary to achieve those goals; by hiring consultants, management can say that they were just following independent, expert advice. Even in its current rudimentary form, A.I. has become a way for a company to evade responsibility by saying that it’s just doing what “the algorithm” says, even though it was the company that commissioned the algorithm in the first place.
The question we should be asking is: as A.I. becomes more powerful and flexible, is there any way to keep it from being another version of McKinsey?
]]>ai capitalism mckinsey future politics ted-chianghttps://pinboard.in/https://pinboard.in/u:jm/b:96eb96e15ac3/The Wide Angle: Understanding TESCREAL — Silicon Valley’s Rightward Turn2023-05-03T12:50:54+00:00
https://washingtonspectator.org/understanding-tescreal-silicon-valleys-rightward-turn/
jm
As you encounter these ideologies [Transhumanism, Extropianism, Singularitarianism, Cosmism, Rationalism, Effective Altruism, and Longtermism] in the wild, you might use the TESCREAL lens, and its alignment with Eurasianism and Putin’s agenda, to evaluate them, and ask whether they tend to undermine or enhance the project of liberal democracy.
TESCREAL ideologies tend to advance an illiberal agenda and authoritarian tendencies, and it’s worth turning a very critical eye towards them, especially in cases where that’s demonstrably true. Clearly there are countless well-meaning people trying to use technology and reason to improve the world, but that should never come at the expense of democratic, inclusive, fair, patient, and just governance.
The biggest risk AI poses right now is that alarmists will use the fears surrounding it as a cudgel to enact sweeping policy reforms. We should resist those efforts. Now more than ever, we should be guided by expertise, facts, and evidence as we seek to use technology in ways that benefit everyone.
]]>ideology future tescreal ea longtermism ai politics silicon-valleyhttps://pinboard.in/https://pinboard.in/u:jm/b:069ec5bea254/Inside LAION2023-04-28T14:27:44+00:00
https://www.bloomberg.com/news/features/2023-04-24/a-high-school-teacher-s-free-image-database-powers-ai-unicorns?cmpid%253D=socialflow-twitter-tv&cmpid=socialflow-twitter-business&leadSource=uverify%2520wall
jmTo build LAION, founders scraped visual data from companies such as Pinterest, Shopify and Amazon Web Services — which did not comment on whether LAION’s use of their content violates their terms of service — as well as YouTube thumbnails, images from portfolio platforms like DeviantArt and EyeEm, photos from government websites including the US Department of Defense, and content from news sites such as The Daily Mail and The Sun.
If you ask Schuhmann, he says that anything freely available online is fair game. But there is currently no AI regulation in the European Union, and the forthcoming AI Act, whose language will be finalized early this summer, will not rule on whether copyrighted materials can be included in big data sets. Rather, lawmakers are discussing whether to include a provision requiring the companies behind AI generators to disclose what materials went into the data sets their products were trained on, thus giving the creators of those materials the option of taking action.
[...]
“It has become a tradition within the field to just assume you don’t need consent or you don’t need to inform people, or they don’t even have to be aware of it. There is a sense of entitlement that whatever is on the web, you can just crawl it and put it in a data set,” said Abeba Birhane, a Senior Fellow in Trustworthy AI at Mozilla Foundation.
]]>consent opt-in web ai ml laion training-data scrapinghttps://pinboard.in/https://pinboard.in/u:jm/b:88c4ff9a8975/Palantir Demos AI to Fight Wars But Says It Will Be Totally Ethical Don’t Worry About It2023-04-26T22:16:19+00:00
https://www.vice.com/en/article/qjvb4x/palantir-demos-ai-to-fight-wars-but-says-it-will-be-totally-ethical-dont-worry-about-it
jmPalantir also isn’t selling a military-specific AI or large language model (LLM) here, it’s offering to integrate existing systems into a controlled environment. The AIP demo shows the software supporting different open-source LLMs, including FLAN-T5 XL, a fine-tuned version of GPT-NeoX-20B, and Dolly-v2-12b, as well as several custom plug-ins. Even fine-tuned AI systems off the shelf have plenty of known issues that could make asking them what to do in a warzone a nightmare. For example, they’re prone to simply making things up, or “hallucinating.” GPT-NeoX-20B in particular is an open-source alternative to GPT-3, a previous version of OpenAI’s language model, created by a startup called EleutherAI. One of EleutherAI’s open-source models -- fine-tuned by another startup called Chai -- recently convinced a Belgian man who spoke to it for six weeks to kill himself.
What Palantir is offering is the illusion of safety and control for the Pentagon as it begins to adopt AI. [...] What AIP does not do is walk through how it plans to deal with the various pernicious problems of LLMs and what the consequences might be in a military context. AIP does not appear to offer solutions to those problems beyond “frameworks” and “guardrails” it promises will make the use of military AI “ethical” and “legal.”
]]>palantir grim-meathook-future war llm aip military ai ethicshttps://pinboard.in/https://pinboard.in/u:jm/b:93f35e6885c9/Google Launched Bard Despite Major Ethical Concerns From Its Employees2023-04-25T13:57:24+00:00
https://sea.mashable.com/tech/23295/google-launched-bard-despite-major-ethical-concerns-from-its-employees
jm"The staffers who are responsible for the safety and ethical implications of new products have been told not to get in the way or to try to kill any of the generative AI tools in development," employees told Bloomberg. The ethics team is now "disempowered and demoralized," according to former and current staffers.
Before OpenAI launched ChatGPT in November 2022, Google's approach to AI was more cautious and less consumer-facing, often working in the background of tools like Search and Maps. But since ChatGPT's enormous popularity prompted a "code red" from executives, Google's threshold for safe product releases has been lowered in an effort to keep up with its AI competitors.
]]>google ai safety chatgpt bard corporate-responsibilityhttps://pinboard.in/https://pinboard.in/u:jm/b:e65ec4c86acb/Silence Isn't Consent2023-04-25T12:23:31+00:00
https://shkspr.mobi/blog/2023/04/silence-isnt-consent/
jmIt isn't "effective altruism" if you have to force people to comply with you.
]]>img2dataset ai scraping web consent opt-inhttps://pinboard.in/https://pinboard.in/u:jm/b:6010a502d23e/Shitty behaviour around the img2dataset AI scraper2023-04-24T16:47:01+00:00
https://github.com/rom1504/img2dataset/issues/293
jmLetting a small minority [ie web publishers] prevent the large majority [AI users] from sharing their images and from having the benefit of last gen AI tool would definitely be unethical yes. Consent is obviously not unethical. You can give your consent for anything if you wish. It seems you're trying to decide for million of other people without asking them for their consent.
In other words, "scraping your content without opt-in is better than denying access to your content for millions of potential future AI users". An issue to implement robots.txt support has been languishing since 2021. Good arguments for blocking the img2dataset user agent in general...]]>opt-in consent ai ml bad-behaviour scraping robotshttps://pinboard.in/https://pinboard.in/u:jm/b:f70437fd7139/Holly Herndon on AI music2023-04-20T11:57:08+00:00
https://twitter.com/joecoscarelli/status/1648797779827863555
jmholly-herndon ai music ml future tech sampling spawninghttps://pinboard.in/https://pinboard.in/u:jm/b:8bdda1f4afb5/OpenAI’s hunger for data is coming back to bite it2023-04-20T09:56:55+00:00
https://www.technologyreview.com/2023/04/19/1071789/openais-hunger-for-data-is-coming-back-to-bite-it/?truid=8c8f2699f50eb3b9985a111121cfee47&mc_cid=8f246dd37f&mc_eid=eaf496ebe1
jmThe company could have saved itself a giant headache by building in robust data record-keeping from the start, she says. Instead, it is common in the AI industry to build data sets for AI models by scraping the web indiscriminately and then outsourcing the work of removing duplicates or irrelevant data points, filtering unwanted things, and fixing typos. These methods, and the sheer size of the data set, mean tech companies tend to have a very limited understanding of what has gone into training their models.
]]>training data provenance ai ml common-crawl openai chatgpt data-protection privacyhttps://pinboard.in/https://pinboard.in/u:jm/b:8a1016bb2d53/Prompt injection: what’s the worst that can happen?2023-04-17T13:36:25+00:00
https://simonwillison.net/2023/Apr/14/worst-that-can-happen/
jmai llm security chatgpt exploits prompt-injectionhttps://pinboard.in/https://pinboard.in/u:jm/b:d53858b283a1/Timnit Gebru's anti-'AI pause'2023-04-14T21:59:25+00:00
https://www.politico.com/newsletters/digital-future-daily/2023/04/11/timnit-gebrus-anti-ai-pause-00091450
jmWhat is your appeal to policymakers? What would you want Congress and regulators to do now to address the concerns you outline in the open letter?
Congress needs to focus on regulating corporations and their practices, rather than playing into their hype of “powerful digital minds.” This, by design, ascribes agency to the products rather than the organizations building them. This language obfuscates the amount of data that is being collected — and the amount of worker exploitation involved with those who are labeling and supplying the datasets, and moderating model outputs.
Congress needs to ensure corporations are not using people’s data without their consent, and hold them responsible for the synthetic media they produce — whether it is text or media spewing disinformation, hate speech or other types of harmful content. Regulations need to put the onus on corporations, rather than understaffed agencies. There are probably existing regulations these organizations are breaking. There are mundane “AI” systems being used daily; we just heard about another Black man being wrongfully arrested because of the use of automated facial analysis systems. But that’s not what we’re talking about, because of the hype.
]]>data privacy ai ml openai monopolyhttps://pinboard.in/https://pinboard.in/u:jm/b:c02b89d4a2b8/A misleading open letter about sci-fi AI dangers ignores the real risks2023-03-31T08:59:22+00:00
https://aisnakeoil.substack.com/p/a-misleading-open-letter-about-sci
jmOver 1,000 researchers, technologists, and public figures have already signed the letter. The letter raises alarm about many AI risks:
"Should we let machines flood our information channels with propaganda and untruth? Should we automate away all the jobs, including the fulfilling ones? Should we develop nonhuman minds that might eventually outnumber, outsmart, obsolete and replace us? Should we risk loss of control of our civilization?"
We agree that misinformation, impact on labor, and safety are three of the main risks of AI. Unfortunately, in each case, the letter presents a speculative, futuristic risk, ignoring the version of the problem that is already harming people. It distracts from the real issues and makes it harder to address them. The letter has a containment mindset analogous to nuclear risk, but that’s a poor fit for AI. It plays right into the hands of the companies it seeks to regulate.
Couldn't agree more.
]]>ai scifi future risks gpt-4 regulationhttps://pinboard.in/https://pinboard.in/u:jm/b:4292c5fac6fe/Belgian man dies by suicide following exchanges with chatbot2023-03-30T14:40:47+00:00
https://www.brusselstimes.com/430098/belgian-man-commits-suicide-following-exchanges-with-chatgpt
jm"Without these conversations with the chatbot, my husband would still be here," the man's widow has said, according to La Libre. She and her late husband were both in their thirties, lived a comfortable life and had two young children.
However, about two years ago, the first signs of trouble started to appear. The man became very eco-anxious and found refuge with ELIZA, the name given to a chatbot that uses GPT-J, an open-source artificial intelligence language model developed by EleutherAI. After six weeks of intensive exchanges, he took his own life.
There's a transcript of the last conversation with the bot here: https://news.ycombinator.com/item?id=35344418 .]]>bots chatbots ai gpt gpt-j grim future grim-meathook-futurehttps://pinboard.in/https://pinboard.in/u:jm/b:279d53f458a6/AI and the American Smile. How AI misrepresents culture through a facial expression2023-03-30T11:23:01+00:00
https://medium.com/@socialcreature/ai-and-the-american-smile-76d23a0fbfaf
jmThere are 18 images in the Reddit slideshow [a series of Midjourney-generated images of "selfies through history"] and they all feature the same recurring composition and facial expression. For some, this sequence of smiling faces elicits a sense of warmth and joyousness, comprising a visual narrative of some sort of shared humanity [...] But what immediately jumped out at me is that these AI-generated images were beaming a secret message hidden in plain sight. A steganographic deception within the pixels, perfectly legible to your brain yet without the conscious awareness that it’s being conned. Like other AI “hallucinations,” these algorithmic extrusions were telling a made up story with a straight face — or, as the story turns out, with a lying smile. [...]
How we smile, when we smile, why we smile, and what it means is deeply culturally contextual.
]]>ai america culture photography midjourney smiling smiles context historyhttps://pinboard.in/https://pinboard.in/u:jm/b:f1284c2778e7/What Will Transformers Transform? – Rodney Brooks2023-03-27T20:31:08+00:00
https://rodneybrooks.com/what-will-transformers-transform/
jmRoy Amara, who died on the last day of 2007, was the president of a Palo Alto based think tank, the Institute for the future, and is credited with saying what is now known as Amara’s Law:
"We tend to overestimate the effect of a technology in the short run and underestimate the effect in the long run."
This has been a common problem with Artificial Intelligence, and indeed of all of computing. In particular, since I first became conscious of the possibility of Artificial Intelligence around 1963 (and as an eight year old proceeded to try to build my own physical and intelligent computers, and have been at it ever since), I have seen these overestimates many many times.
and:
I think that GPTs will give rise to a new aphorism (where the last word might vary over an array of synonymous variations):
"If you are interacting with the output of a GPT system and didn’t explicitly decide to use a GPT then you’re the product being hoodwinked."
I am not saying everything about GPTs is bad. I am saying that, especially given the explicit warnings from OpenAI, that you need to be aware that you are using an unreliable system.
Using an unreliable system sounds awfully unreliable, but in August 2021 I had a revelation at TED in Monterey, California, when Chris Anderson (the TED Chris), was interviewing Greg Brockman, the Chairman of Open AI about an early version of GPT. He said that he regularly asked it questions about code he wanted to write and it very quickly gave him ideas for libraries to use, and that was enough to get him started on his project. GPT did not need to be fully accurate, just to get him into the right ballpark, much faster than without its help, and then he could take it from there.
Chris Anderson (the 3D robotics one, not the TED one) has likewise opined (as have responders to some of my tweets about GPT) that using ChatGPT will get him the basic outline of a software stack, in a well tread area of capabilities, and he is many many times more productive than with out it.
So there, where a smart person is in the loop, unreliable advice is better than no advice, and the advice comes much more explicitly than from carrying out a conventional search with a search engine.
The opposite of useful can also occur, but again it pays to have a smart human in the loop. Here is a report from the editor of a science fiction magazine which pays contributors. He says that from late 2022 through February of 2023 the number of submissions to the magazine increased by almost two orders of magnitude, and he was able to determine that the vast majority of them were generated by chatbots. He was the person in the loop filtering out the signal he wanted, human written science fiction, from vast volumes of noise of GPT written science fiction.
Why should he care? Because GPT is an auto-completer and so it is generating variations on well worked themes. But, but, but, I hear people screaming at me. With more work GPTs will be able to generate original stuff. Yes, but it will be some other sort of engine attached to them which produces that originality. No matter how big, and how many parameters, GPTs are not going to to do that themselves.
When no person is in the loop to filter, tweak, or manage the flow of information GPTs will be completely bad. That will be good for people who want to manipulate others without having revealed that the vast amount of persuasive evidence they are seeing has all been made up by a GPT. It will be bad for the people being manipulated.
And it will be bad if you try to connect a robot to GPT. GPTs have no understanding of the words they use, no way to connect those words, those symbols, to the real world. A robot needs to be connected to the real world and its commands need to be coherent with the real world. Classically it is known as the “symbol grounding problem”. GPT+robot is only ungrounded symbols. It would be like you hearing Klingon spoken, without any knowledge other than the Klingon sound stream (even in Star Trek you knew they had human form and it was easy to ground aspects of their world). A GPT telling a robot stuff will be just like the robot hearing Klingonese.
My argument here is that GPTs might be useful, and well enough boxed, when there is an active person in the loop, but dangerous when the person in the loop doesn’t know they are supposed to be in the loop. [This will be the case for all young children.] Their intelligence, applied with strong intellect, is a key component of making any GPT be successful.
]]>gpts rodney-brooks ai ml amaras-law hype technology llms futurehttps://pinboard.in/https://pinboard.in/u:jm/b:58171d7e3626/Google and Microsoft’s chatbots are already citing one another in a misinformation shitshow2023-03-24T11:25:15+00:00
https://www.theverge.com/2023/3/22/23651564/google-microsoft-bard-bing-chatbots-misinformation
jmWhat we have here is an early sign we’re stumbling into a massive game of AI misinformation telephone, in which chatbots are unable to gauge reliable news sources, misread stories about themselves, and misreport on their own capabilities. In this case, the whole thing started because of a single joke comment on Hacker News. Imagine what you could do if you wanted these systems to fail. It’s a laughable situation but one with potentially serious consequences. Given the inability of AI language models to reliably sort fact from fiction, their launch online threatens to unleash a rotten trail of misinformation and mistrust across the web, a miasma that is impossible to map completely or debunk authoritatively. All because Microsoft, Google, and OpenAI have decided that market share is more important than safety.
]]>google ai ml microsoft openai chatgpt trust spam misinformation disinformationhttps://pinboard.in/https://pinboard.in/u:jm/b:1f5b64825f87/Superb thread on effective AI regulation2023-03-22T17:58:50+00:00
https://toot.cafe/@baldur/110061125863284479
jm
First, you clarify that for the purposes of Section 230 protection (or similar), whoever provides the AI as a service is responsible for its output as a publisher. If Bing Chat says something offensive then Microsoft would be as liable as if it were an employee;
You'd set a law requiring tools that integrate generative AI to attach disclosures to the content. Gmail/Outlook should pop up a notice when you get an email that their AI generated. Word/Docs should have metadata fields and notices when you open files that have used built-in AI capabilities. AI chatbots have to disclose that they are bots. Copilot should add a machine-parsable code comment. You could always remove the metadata, but doing so would establish an intent to deceive;
Finally, you'd mandate that all training data sets be made opt-in (or that all of its contents are released under a permissive license) and public. Heavy fines for non-disclosure. Heavy fines for violating opt-in. Even heavier fines for lying about your training data set. Make every AI model a "vegan" model. Remove every ethical and social concern about the provenance and rights regarding the training data.
I think #3 in particular is the most important of all.]]>ai regulation data-privacy training llm ethicshttps://pinboard.in/https://pinboard.in/u:jm/b:f2d9f8bba791/LAION contains medical data2023-03-16T10:38:25+00:00
https://the-decoder.com/patient-images-in-laion-datasets-are-only-a-sample-of-a-larger-issue/
jmWhen Lapine used it to scan the LAION database, she found an image of her own face. She was able to trace this image back to photographs taken by a doctor when she was undergoing treatment for a rare genetic condition. The photographs were taken as part of her clinical documentation, and she signed documents that restricted their use to her medical file alone. The doctor involved died in 2018. Somehow, these private medical images ended up online, then in Common Crawl’s archive and LAION’s dataset.
Surely this is a straight-up violation of patient confidentiality laws?! This is appalling.
LAION's FAQs are useless regarding this; as Lapine isn't in the EU, they can't even use GDPR to request its removal, and even if they were, these medical images don't contain enough data to qualify under LAION's rules.]]>ai ml fair-use copyright common-crawl training laion photos medical-data hipaahttps://pinboard.in/https://pinboard.in/u:jm/b:fe32957cde25/