Pinboard (jm)
https://pinboard.in/u:jm/public/
recent bookmarks from jm

Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
2024-01-18T13:04:45+00:00
https://arxiv.org/abs/2401.05566
jm
Humans are capable of strategically deceptive behavior: behaving helpfully in most situations, but then behaving very differently in order to pursue alternative objectives when given the opportunity. If an AI system learned such a deceptive strategy, could we detect it and remove it using current state-of-the-art safety training techniques? To study this question, we construct proof-of-concept examples of deceptive behavior in large language models (LLMs). For example, we train models that write secure code when the prompt states that the year is 2023, but insert exploitable code when the stated year is 2024. We find that such backdoor behavior can be made persistent, so that it is not removed by standard safety training techniques, including supervised fine-tuning, reinforcement learning, and adversarial training (eliciting unsafe behavior and then training to remove it). The backdoor behavior is most persistent in the largest models and in models trained to produce chain-of-thought reasoning about deceiving the training process, with the persistence remaining even when the chain-of-thought is distilled away. Furthermore, rather than removing backdoors, we find that adversarial training can teach models to better recognize their backdoor triggers, effectively hiding the unsafe behavior. Our results suggest that, once a model exhibits deceptive behavior, standard techniques could fail to remove such deception and create a false impression of safety.
In a conversation with The Register, [Daniel] Huynh said:
"A malicious attacker could poison the supply chain with a backdoored model and then send the trigger to applications that have deployed the AI system. [...] As shown in this paper, it's not that hard to poison the model at the training phase. And then you distribute it. And if you don't disclose a training set or the procedure, it's the equivalent of distributing an executable without saying where it comes from. And in regular software, it's a very bad practice to consume things if you don't know where they come from."
ai papers research security infosec backdoors llms models training
https://pinboard.in/u:jm/b:f893e3ab740e/

Model Collapse
2023-06-16T14:02:27+00:00
https://arxiv.org/pdf/2305.17493v2.pdf
jm
models model-collapse llms chatgpt ai ml gpt training
https://pinboard.in/u:jm/b:cd8604ffb258/

copyright-respecting AI model training
2023-03-01T10:34:48+00:00
https://creative.ai/@alexjc/109939892914585391
jm
With the criticism of web-scale datasets, it's legitimate to ask the question: "What models are trained with best-in-class Copyright practices?"
Answer: StyleGAN and FFHQ
github.com/NVlabs/ffhq-dataset
100% transparent dataset, clear copyright, opt-in licensing, model respects terms.
copyright legal rights ip ai ml models training stylegan ffhq flickr
https://pinboard.in/u:jm/b:939077d2fcd0/

"Poisoning Web-Scale Training Datasets is Practical"
2023-02-21T14:11:10+00:00
https://twitter.com/DynamicWebPaige/status/1627861408171032577
jm
"We show the fraction of images in a dataset that can be controlled by the attacker as a function of their budget. We find that at least 0.01% of each dataset can be controlled for less than $60/year." "According to our conservative analysis, we can poison 6.5%+ of Wikipedia."
As Cyd Harrell says in https://twitter.com/cydharrell/status/1624463694238478336 , "if you think that either the global white supremacy movement or the Coca Cola company (or Exxon re climate change) don't have the long game to try to get their views prioritized in LLM training data, I think you're probably underestimating their motivation by a lot".
models laion poisoning llms politics corpora exploits deep-learning ai via:mikem
https://pinboard.in/u:jm/b:7d0ede6695d8/

Frances Haugen says Facebook's algorithms are dangerous. Here’s why. | MIT Technology Review
2021-10-06T08:52:42+00:00
https://www.technologyreview.com/2021/10/05/1036519/facebook-whistleblower-frances-haugen-algorithms/
jm
It developed an internal tool known as FBLearner Flow that made it easy for engineers without machine learning experience to develop whatever models they needed at their disposal. By one data point, it was already in use by more than a quarter of Facebook’s engineering team in 2016. Many of the current and former Facebook employees I’ve spoken to say that this is part of why Facebook can’t seem to get a handle on what it serves up to users in the news feed. Different teams can have competing objectives, and the system has grown so complex and unwieldy that no one can keep track anymore of all of its different components. [...]
“64% of all extremist group joins are due to our recommendation tools,” the presentation said, predominantly thanks to the models behind the “Groups You Should Join” and “Discover” features. [...]
These phenomena are far worse in regions that don’t speak English because of Facebook’s uneven coverage of different languages. [...]
When the war in Tigray[, Ethiopia] first broke out in November, [AI ethics researcher Timnit] Gebru saw the platform flounder to get a handle on the flurry of misinformation. [...] When fake news, hate speech, and even death threats aren’t moderated out, they are then scraped as training data to build the next generation of [language models]. And those models, parroting back what they’re trained on, end up regurgitating these toxic linguistic patterns on the internet."
What. A. Mess.
machine-learning social-networking facebook the-algorithm llms models frances-haughen
https://pinboard.in/u:jm/b:0f2f5dea9dcc/

NPHET's secret models
2021-03-29T22:32:17+00:00
https://twitter.com/andrewflood/status/1362804489485512706
jm
nphet secrecy via:andrewflood models covid-19 vaccination
https://pinboard.in/u:jm/b:54e6bfa30a73/

Imperial College report on premature reopening under vaccination [pdf]
2021-02-09T09:53:41+00:00
https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/958913/S1024_SPI-M_vaccination_ask_Imperial_College.pdf
jm
uk covid-19 vaccination imperial-college lockdowns epidemiology models
https://pinboard.in/u:jm/b:bdaf9c336078/

The effect of school closures and reopening strategies on COVID-19 infection dynamics in the San Francisco Bay Area: a cross-sectional survey and modeling analysis | medRxiv
2020-08-10T15:42:12+00:00
https://www.medrxiv.org/content/10.1101/2020.08.06.20169797v1
jm
Large-scale school closures have been implemented worldwide to curb the spread of COVID-19. However, the impact of school closures and re-opening on epidemic dynamics remains unclear.
Methods: We simulated COVID-19 transmission dynamics using an individual-based stochastic model, incorporating social-contact data of school-aged children during shelter-in-place orders derived from Bay Area (California) household surveys. We simulated transmission under observed conditions and counterfactual intervention scenarios between March 17-June 1, and evaluated various fall 2020 K-12 reopening strategies.
Findings: Between March 17-June 1, assuming children <10 were half as susceptible to infection as older children and adults, we estimated school closures averted a similar number of infections (13,842 cases; 95% CI: 6,290, 23,040) as workplace closures (15,813; 95% CI: 9,963, 22,617) and social distancing measures (7,030; 95% CI: 3,118, 11,676). School closure effects were driven by high school and middle school closures. Under assumptions of moderate community transmission, we estimate that fall 2020 school reopenings will increase symptomatic illness among high school teachers (an additional 40.7% expected to experience symptomatic infection, 95% CI: 1.9, 61.1), middle school teachers (37.2%, 95% CI: 4.6, 58.1), and elementary school teachers (4.1%, 95% CI: -1.7, 12.0). Results are highly dependent on uncertain parameters, notably the relative susceptibility and infectiousness of children, and extent of community transmission amid re-opening. The school-based interventions needed to reduce the risk to fewer than an additional 1% of teachers infected varies by grade level. A hybrid-learning approach with halved class sizes of 10 students may be needed in high schools, while maintaining small cohorts of 20 students may be needed for elementary schools.
Interpretation: Multiple in-school intervention strategies and community transmission reductions, beyond the extent achieved to date, will be necessary to avoid undue excess risk associated with school reopening. Policymakers must urgently enact policies that curb community transmission and implement within-school control measures to simultaneously address the tandem health crises posed by COVID-19 and adverse child health and development consequences of long-term school closures.
covid-19 bay-area schools kids transmission models
https://pinboard.in/u:jm/b:130ed9027564/

RCP8.5 tracks cumulative CO2 emissions | PNAS
2020-08-04T14:00:41+00:00
https://www.pnas.org/content/early/2020/07/30/2007117117
jm
RCP8.5, the most aggressive scenario in assumed fossil fuel use for global climate models, will continue to serve as a useful tool for quantifying physical climate risk, especially over near- to midterm policy-relevant time horizons. Not only are the emissions consistent with RCP8.5 in close agreement with historical total cumulative CO2 emissions (within 1%), but RCP8.5 is also the best match out to midcentury under current and stated policies with still highly plausible levels of CO2 emissions in 2100.
RCP8.5 is the model associated with a planet where a good chunk of the globe is rendered uninhabitable.
rcp8.5 grim-meathook-future future climate-change co2 pnas papers models climate
https://pinboard.in/u:jm/b:ffb2a6e95e79/

US Spring school closures tied to drastic decrease in Covid-19 cases, deaths in model
2020-07-29T15:47:27+00:00
https://www.statnews.com/2020/07/29/school-reopening-covid19-cases/
jm
Their projection found that, if schools had stayed open, there could have been roughly 424 more coronavirus infections and 13 more deaths per 100,000 residents over the course of 26 days.
Extrapolate that to the American population, and the country might have seen as many as 1.37 million more cases and 40,600 more deaths, explained Samir Shah, the director of hospital medicine at Cincinnati Children’s Hospital Medical Center and one of the authors of the paper.
“These numbers seem ridiculously high and it’s mind-boggling to think that these numbers are only … in the first several weeks,” said Shah. “That’s bonkers.” He warned, though, that those numbers should be taken with a grain of salt. While their statistical model attempts to pinpoint the impact of schools staying open or being closed, the method can’t actually establish any sort of causal relationship.
models modelling schools reopening covid-19 kids us
https://pinboard.in/u:jm/b:bfe4e0de7180/

Test sensitivity is secondary to frequency and turnaround time for COVID-19 surveillance | medRxiv
2020-07-16T13:05:27+00:00
https://www.medrxiv.org/content/10.1101/2020.06.22.20136309v2
jm
epidemiology covid-19 testing swabs rt-pcr twiv virology models papers
https://pinboard.in/u:jm/b:12897f20c61c/

Witnessing the unthinkable
2020-06-29T10:53:15+00:00
https://www.themonthly.com.au/issue/2020/july/1593525600/jo-lle-gergis/witnessing-unthinkable#mtr
jm
According to this new [analysis of the latest generation of climate models], led by scientists at the CSIRO and [Australian] Bureau of Meteorology, the worst-case scenario could see Australia warm up to 7°C above pre-industrial levels by the end of the century. On average, the results from 20 models show a warming of 4.5°C, with a range of between 2.7°C and 6.2°C. [....]
Another profoundly significant result is buried 16 pages deep into the paper. The scientists show that this revision now means that 2°C of global warming is likely to be reached sometime around 2040 based on our current high-emissions trajectory. The implications of this are unimaginable – we may witness planetary collapse far sooner than we once thought.
This is horrific, if those are solid estimates... those warming levels will mean Australia (and parts of the rest of the world) becomes pretty much uninhabitable.
australia future grim climate-change models warming
https://pinboard.in/u:jm/b:723c3f36ba98/

The SEIR simulator behind the YYG COVID-19 model has been open-sourced
2020-06-25T20:49:43+00:00
https://twitter.com/youyanggu/status/1275855071708958722
jm
oss yyg models seir epidemiology covid-19 pandemics
https://pinboard.in/u:jm/b:58167b07c2f3/

What happened with the UK's "herd immunity" COVID-19 strategy
2020-03-28T22:52:10+00:00
https://www.reddit.com/r/Coronavirus/comments/fnl0n6/im_a_critical_care_doctor_working_in_a_uk_high/#fla1iq6
jm
herd-immunity hubris arrogance covid-19 uk uk-politics pandemics models data-science epidemiology
https://pinboard.in/u:jm/b:e95ae797a79d/

Sketchfab Launches Public Domain Dedication for 3D Cultural Heritage
2020-02-27T10:36:21+00:00
https://sketchfab.com/blogs/community/sketchfab-launches-public-domain-dedication-for-3d-cultural-heritage/
jm
We are pleased to announce that cultural organisations using Sketchfab can now dedicate their 3D scans and models to the Public Domain using the Creative Commons (CC) 0 Public Domain Dedication. This newly supported dedication allows museums and similar organisations to share their 3D data more openly, adding amazing 3D models to the Public Domain, many for the first time. This update also makes it even easier for 3D creators to download and reuse, re-imagine, and remix incredible ancient and modern artifacts, objects, and scenes.
We are equally proud to make this announcement in collaboration with 27 cultural organisations from 13 different countries. We are especially happy to welcome the Smithsonian Institution to Sketchfab as part of this initiative. The Smithsonian has uploaded their first official 3D models to Sketchfab as part of their newly launched open access program.
opensource education licensing creative-commons sketchfab 3d-printing 3d models public-domain museums art history objects smithsonian
https://pinboard.in/u:jm/b:ec1133fb7645/

Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019-10-28T17:32:31+00:00
https://blog.acolyer.org/2019/10/28/interpretable-models/
jm
Black box machine learning models are currently being used for high stakes decision-making throughout society, causing problems throughout healthcare, criminal justice, and in other domains. People have hoped that creating methods for explaining these black box models will alleviate some of these problems, but trying to _explain_ black box models, rather than creating models that are _interpretable_ in the first place, is likely to perpetuate bad practices and can potentially cause catastrophic harm to society. There is a way forward -- it is to design models that are inherently interpretable. This manuscript clarifies the chasm between explaining black boxes and using inherently interpretable models, outlines several key reasons why explainable black boxes should be avoided in high-stakes decisions, identifies challenges to interpretable machine learning, and provides several example applications where interpretable models could potentially replace black box models in criminal justice, healthcare, and computer vision.
I wholeheartedly support this idea, it makes a lot of sense to me in terms of producing ML/AI that can be supported operationally.
machine-learning ai ops support transparency papers black-box models computer-says-no automation explainability
https://pinboard.in/u:jm/b:bb9229d3cbeb/

Applied machine learning at Facebook: a datacenter infrastructure perspective
2018-12-17T17:28:04+00:00
https://blog.acolyer.org/2018/12/17/applied-machine-learning-at-facebook-a-datacenter-infrastructure-perspective/
jm
As we looked at last month with Continuum, the latency of incorporating the latest data into the models is also really important. There’s a nice section of this paper where the authors study the impact of losing the ability to train models for a period of time and have to serve requests from stale models. The Community Integrity team for example rely on frequently trained models to keep up with the ever changing ways adversaries try to bypass Facebook’s protections and show objectionable content to users. Here training iterations take on the order of days. Even more dependent on the incorporation of recent data into models is the news feed ranking. “Stale News Feed models have a measurable impact on quality.” And if we look at the very core of the business, the Ads Ranking models, “we learned that the impact of leveraging a stale ML model is measured in hours. In other words, using a one-day-old model is measurably worse than using a one-hour old model.” One of the conclusions in this section of the paper is that disaster recovery / high availability for training workloads is of key importance.
machine-learning facebook ml training ops models infrastructure prod production
https://pinboard.in/u:jm/b:5ee5adfe9404/

3D models by DH_Age Sheela-na-Gig3D Project (@DH_Age) - Sketchfab
2018-12-05T15:34:21+00:00
https://sketchfab.com/DH_Age/models
jm
3d sheela-na-gigs history carving nsfw models photogrammetry
https://pinboard.in/u:jm/b:449cda6473b7/

3D Scans of 7,500 Famous Sculptures, Statues & Artworks: Download & 3D Print Rodin's Thinker, Michelangelo's David & More | Open Culture
2018-01-25T10:58:58+00:00
http://www.openculture.com/2017/08/3d-scans-of-7500-famous-sculptures-statues-artworks-download-3d-print-rodins-thinker-michelangelos-david-more.html
jm
3d-printing art history british-museum models cool
https://pinboard.in/u:jm/b:f50eceb70273/

Brutal London
2017-11-27T11:31:50+00:00
https://boingboing.net/2017/01/11/a-book-about-londons-gorgeou.html
jm
brutalist architecture london papercraft models barbican
https://pinboard.in/u:jm/b:a59e19531805/

Fooling Neural Networks in the Physical World with 3D Adversarial Objects · labsix
2017-11-01T22:01:37+00:00
http://www.labsix.org/physical-objects-that-fool-neural-nets/
jm
Here is a 3D-printed turtle that is classified at every viewpoint as a “rifle” by Google’s InceptionV3 image classifier, whereas the unperturbed turtle is consistently classified as “turtle”.
We do this using a new algorithm for reliably producing adversarial examples that cause targeted misclassification under transformations like blur, rotation, zoom, or translation, and we use it to generate both 2D printouts and 3D models that fool a standard neural network at any angle. Our process works for arbitrary 3D models - not just turtles! We also made a baseball that classifies as an espresso at every angle! The examples still fool the neural network when we put them in front of semantically relevant backgrounds; for example, you’d never see a rifle underwater, or an espresso in a baseball mitt.
ai deep-learning 3d-printing objects security hacking rifles models turtles adversarial-classification classification google inceptionv3 images image-classification
https://pinboard.in/u:jm/b:0e66888b13d7/

Research Blog: Federated Learning: Collaborative Machine Learning without Centralized Training Data
2017-04-07T10:47:07+00:00
https://research.googleblog.com/2017/04/federated-learning-collaborative.html
jm
It works like this: your device downloads the current model, improves it by learning from data on your phone, and then summarizes the changes as a small focused update. Only this update to the model is sent to the cloud, using encrypted communication, where it is immediately averaged with other user updates to improve the shared model. All the training data remains on your device, and no individual updates are stored in the cloud.
Federated Learning allows for smarter models, lower latency, and less power consumption, all while ensuring privacy. And this approach has another immediate benefit: in addition to providing an update to the shared model, the improved model on your phone can also be used immediately, powering experiences personalized by the way you use your phone.
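The round described above can be sketched in a few lines of Python. This is a toy illustration, not Google's implementation: the one-weight linear model, the `local_update` and `federated_round` helper names, the learning rate, and the sample data are all assumptions made for the example. Encryption and secure aggregation are left out, and the plain unweighted mean of the updates is a simplification.

```python
from typing import List, Tuple

Weights = List[float]
Example = Tuple[List[float], float]  # (features, target)


def local_update(weights: Weights, data: List[Example], lr: float = 0.1) -> Weights:
    """Run one pass of SGD on a device and return only the weight delta."""
    w = list(weights)
    for x, y in data:
        pred = sum(wi * xi for wi, xi in zip(w, x))
        err = pred - y
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    # Only this small update leaves the device; the raw data never does.
    return [new - old for new, old in zip(w, weights)]


def federated_round(weights: Weights, devices: List[List[Example]]) -> Weights:
    """Average the per-device deltas and apply them to the shared model."""
    deltas = [local_update(weights, data) for data in devices]
    avg = [sum(d[i] for d in deltas) / len(deltas) for i in range(len(weights))]
    return [w + a for w, a in zip(weights, avg)]


# Hypothetical data: three phones, each holding private samples of y = 2 * x.
devices = [
    [([1.0], 2.0), ([2.0], 4.0)],
    [([3.0], 6.0)],
    [([0.5], 1.0), ([1.5], 3.0)],
]
shared = [0.0]
for _ in range(50):
    shared = federated_round(shared, devices)
```

After the loop, `shared[0]` should sit very close to 2, the slope that all three private datasets agree on, even though the server only ever saw averaged deltas.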
Papers:
https://arxiv.org/pdf/1602.05629.pdf , https://arxiv.org/pdf/1610.05492.pdf
google ml machine-learning training federated-learning gboard models privacy data-privacy data-protection
https://pinboard.in/u:jm/b:7f3fd8e9e184/

Combining static model checking with dynamic enforcement using the Statecall Policy Language
2015-03-24T12:55:54+00:00
http://blog.acolyer.org/2015/03/23/combining-static-model-checking-with-dynamic-enforcement-using-the-statecall-policy-language/
jm
automaton ping (int max_count, int count, bool can_timeout) {
  Initialize;
  during {
    count = 0;
    do {
      Transmit_Ping;
      either {
        Receive_Ping;
      } or (can_timeout) {
        Timeout_Ping;
      };
      count = count + 1;
    } until (count >= max_count);
  } handle {
    SIGINFO;
    Print_Summary;
  };
}
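The dynamic-enforcement half of the idea can be illustrated with a small trace checker. The sketch below is in Python rather than SPL, and it hand-translates the ping automaton above instead of using the SPL compiler. The `accepts` function, the event strings, and the assumption that the `handle` block returns control to the interrupted point are all illustrative choices, not taken from the paper.

```python
from typing import Iterable


def accepts(trace: Iterable[str], max_count: int, can_timeout: bool) -> bool:
    """Return True if `trace` is a complete run allowed by the ping automaton."""
    events = iter(trace)
    if next(events, None) != "Initialize":
        return False

    count = 0
    state = "transmit"   # next expected step inside the do/until body
    in_handler = False   # True between SIGINFO and Print_Summary
    for event in events:
        if in_handler:
            # The handler body is SIGINFO followed by Print_Summary.
            if event != "Print_Summary":
                return False
            in_handler = False
        elif event == "SIGINFO" and state != "done":
            # Assumed: the handle block may fire anywhere inside `during`.
            in_handler = True
        elif state == "transmit" and event == "Transmit_Ping":
            state = "reply"
        elif state == "reply" and (
            event == "Receive_Ping" or (event == "Timeout_Ping" and can_timeout)
        ):
            count += 1
            state = "done" if count >= max_count else "transmit"
        else:
            return False
    return state == "done" and not in_handler
```

A wrapper around the real ping code could feed each statecall to a checker like this and abort on the first rejected event, which is roughly the role the generated safety monitor plays in the post.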
ping model-checking models formal-methods verification static dynamic coding debugging testing distcomp papers
https://pinboard.in/u:jm/b:9eb26fa27811/

CausalImpact: A new open-source package for estimating causal effects in time series
2014-09-15T10:50:48+00:00
http://google-opensource.blogspot.ie/2014/09/causalimpact-new-open-source-package.html
jm
How can we measure the number of additional clicks or sales that an AdWords campaign generated? How can we estimate the impact of a new feature on app downloads? How do we compare the effectiveness of publicity across countries?
In principle, all of these questions can be answered through causal inference.
In practice, estimating a causal effect accurately is hard, especially when a randomised experiment is not available. One approach we've been developing at Google is based on Bayesian structural time-series models. We use these models to construct a synthetic control — what would have happened to our outcome metric in the absence of the intervention. This approach makes it possible to estimate the causal effect that can be attributed to the intervention, as well as its evolution over time.
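The synthetic-control idea can be shown with a much simpler stand-in. The Python sketch below is not the CausalImpact API and does not fit a Bayesian structural time-series model. It swaps in an ordinary least-squares fit against a single control series, learns that relationship on the pre-intervention period, and treats the prediction afterwards as the counterfactual. The function names and the data are made up for illustration.

```python
from statistics import mean
from typing import List, Tuple


def fit_line(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    return my - slope * mx, slope


def estimated_effect(
    control: List[float], outcome: List[float], start: int
) -> List[float]:
    """Pointwise effect: observed outcome minus the predicted counterfactual."""
    # Fit only on data from before the intervention began.
    intercept, slope = fit_line(control[:start], outcome[:start])
    return [
        y - (intercept + slope * x)
        for x, y in zip(control[start:], outcome[start:])
    ]


# Made-up series: the outcome tracks 2 * control + 1 until an intervention
# at index 5 adds 10 units to every later observation.
control = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
outcome = [2 * c + 1 for c in control[:5]] + [2 * c + 11 for c in control[5:]]
effect = estimated_effect(control, outcome, start=5)
```

Here each element of `effect` recovers the 10-unit lift built into the toy data. A real analysis would also need uncertainty intervals, which is where the Bayesian model earns its keep.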
We've been testing and applying structural time-series models for some time at Google. For example, we've used them to better understand the effectiveness of advertising campaigns and work out their return on investment. We've also applied the models to settings where a randomised experiment was available, to check how similar our effect estimates would have been without an experimental control.
Today, we're excited to announce the release of CausalImpact, an open-source R package that makes causal analyses simple and fast. With its release, all of our advertisers and users will be able to use the same powerful methods for estimating causal effects that we've been using ourselves.
Our main motivation behind creating the package has been to find a better way of measuring the impact of ad campaigns on outcomes. However, the CausalImpact package could be used for many other applications involving causal inference. Examples include problems found in economics, epidemiology, or the political and social sciences.
causal-inference r google time-series models bayes adwords advertising statistics estimation metrics
https://pinboard.in/u:jm/b:a62b2f300071/

Sweden Solar System
2014-08-12T10:45:26+00:00
https://en.wikipedia.org/wiki/Sweden_Solar_System
jm
the world's largest permanent scale model of the Solar System. The Sun is represented by the Ericsson Globe in Stockholm, the largest hemispherical building in the world. The inner planets can also be found in Stockholm but the outer planets are situated northward in other cities along the Baltic Sea. The system was started by Nils Brenning and Gösta Gahm and is on the scale of 1:20 million.
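To get a feel for the 1:20 million scale, here is a small worked calculation in Python. The Earth-Sun distance of roughly 149.6 million km is an assumed input that does not come from the entry above, and `model_distance_km` is just a helper name for the example.

```python
SCALE = 20_000_000  # 1:20 million, as stated above


def model_distance_km(real_km: float) -> float:
    """Distance in the model for a given real-world distance."""
    return real_km / SCALE


# Assumed input: mean Earth-Sun distance, roughly 149.6 million km.
earth_km = model_distance_km(149.6e6)
```

Under that assumption the model Earth sits about 7.5 km from the globe that stands in for the Sun, which fits with the inner planets being within Stockholm.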
(via JK)
scale models solar-system astronomy sun sweden science cool via:jk
https://pinboard.in/u:jm/b:cde58aecf31d/