Pinboard (nhaliday)
https://pinboard.in/u:nhaliday/public/
recent bookmarks from nhalidayThe Existential Risk of Math Errors - Gwern.net2019-07-10T20:38:59+00:00
https://www.gwern.net/The-Existential-Risk-of-Mathematical-Error
nhaliday TYPE II?
“Lefschetz was a purely intuitive mathematician. It was said of him that he had never given a completely correct proof, but had never made a wrong guess either.”
- Gian-Carlo Rota13
Case 2 is disturbing, since it is a case in which we wind up with false beliefs and also false beliefs about our beliefs (we no longer know that we don’t know). Case 2 could lead to extinction.
...
Except, errors do not seem to be evenly & randomly distributed between case 1 and case 2. There seem to be far more case 1s than case 2s, as already mentioned in the early calculus example: far more than 50% of the early calculus results were correct when checked more rigorously. Richard Hamming attributes to Ralph Boas a comment that while editing Mathematical Reviews that “of the new results in the papers reviewed most are true but the corresponding proofs are perhaps half the time plain wrong”.
...
Gian-Carlo Rota gives us an example with Hilbert:
...
Olga labored for three years; it turned out that all mistakes could be corrected without any major changes in the statement of the theorems. There was one exception, a paper Hilbert wrote in his old age, which could not be fixed; it was a purported proof of the continuum hypothesis, you will find it in a volume of the Mathematische Annalen of the early thirties.
...
Leslie Lamport advocates for machine-checked proofs and a more rigorous style of proofs similar to natural deduction, noting a mathematician acquaintance guesses at a broad error rate of 1/329 and that he routinely found mistakes in his own proofs and, worse, believed false conjectures30.
[more on these "structured proofs":
https://academia.stackexchange.com/questions/52435/does-anyone-actually-publish-structured-proofs
https://mathoverflow.net/questions/35727/community-experiences-writing-lamports-structured-proofs
]
We can probably add software to that list: early software engineering work found that, dismayingly, bug rates seem to be simply a function of lines of code, and one would expect diseconomies of scale. So one would expect that in going from the ~4,000 lines of code of the Microsoft DOS operating system kernel to the ~50,000,000 lines of code in Windows Server 2003 (with full systems of applications and libraries being even larger: the comprehensive Debian repository in 2007 contained ~323,551,126 lines of code) that the number of active bugs at any time would be… fairly large. Mathematical software is hopefully better, but practitioners still run into issues (eg Durán et al 2014, Fonseca et al 2017) and I don’t know of any research pinning down how buggy key mathematical systems like Mathematica are or how much published mathematics may be erroneous due to bugs. This general problem led to predictions of doom and spurred much research into automated proof-checking, static analysis, and functional languages31.
[related:
https://mathoverflow.net/questions/11517/computer-algebra-errors
I don't know any interesting bugs in symbolic algebra packages but I know a true, enlightening and entertaining story about something that looked like a bug but wasn't.
Define sinc𝑥=(sin𝑥)/𝑥.
Someone found the following result in an algebra package: ∫∞0𝑑𝑥sinc𝑥=𝜋/2
They then found the following results:
...
So of course when they got:
∫∞0𝑑𝑥sinc𝑥sinc(𝑥/3)sinc(𝑥/5)⋯sinc(𝑥/15)=(467807924713440738696537864469/935615849440640907310521750000)𝜋
hmm:
Which means that nobody knows Fourier analysis nowdays. Very sad and discouraging story... – fedja Jan 29 '10 at 18:47
--
Because the most popular systems are all commercial, they tend to guard their bug database rather closely -- making them public would seriously cut their sales. For example, for the open source project Sage (which is quite young), you can get a list of all the known bugs from this page. 1582 known issues on Feb.16th 2010 (which includes feature requests, problems with documentation, etc).
That is an order of magnitude less than the commercial systems. And it's not because it is better, it is because it is younger and smaller. It might be better, but until SAGE does a lot of analysis (about 40% of CAS bugs are there) and a fancy user interface (another 40%), it is too hard to compare.
I once ran a graduate course whose core topic was studying the fundamental disconnect between the algebraic nature of CAS and the analytic nature of the what it is mostly used for. There are issues of logic -- CASes work more or less in an intensional logic, while most of analysis is stated in a purely extensional fashion. There is no well-defined 'denotational semantics' for expressions-as-functions, which strongly contributes to the deeper bugs in CASes.]
...
Should such widely-believed conjectures as P≠NP or the Riemann hypothesis turn out be false, then because they are assumed by so many existing proofs, a far larger math holocaust would ensue38 - and our previous estimates of error rates will turn out to have been substantial underestimates. But it may be a cloud with a silver lining, if it doesn’t come at a time of danger.
https://mathoverflow.net/questions/338607/why-doesnt-mathematics-collapse-down-even-though-humans-quite-often-make-mista
more on formal methods in programming:
https://www.quantamagazine.org/formal-verification-creates-hacker-proof-code-20160920/
https://intelligence.org/2014/03/02/bob-constable/
https://softwareengineering.stackexchange.com/questions/375342/what-are-the-barriers-that-prevent-widespread-adoption-of-formal-methods
Update: measured effort
In the October 2018 issue of Communications of the ACM there is an interesting article about Formally verified software in the real world with some estimates of the effort.
Interestingly (based on OS development for military equipment), it seems that producing formally proved software requires 3.3 times more effort than with traditional engineering techniques. So it's really costly.
On the other hand, it requires 2.3 times less effort to get high security software this way than with traditionally engineered software if you add the effort to make such software certified at a high security level (EAL 7). So if you have high reliability or security requirements there is definitively a business case for going formal.
WHY DON'T PEOPLE USE FORMAL METHODS?: https://www.hillelwayne.com/post/why-dont-people-use-formal-methods/
You can see examples of how all of these look at Let’s Prove Leftpad. HOL4 and Isabelle are good examples of “independent theorem” specs, SPARK and Dafny have “embedded assertion” specs, and Coq and Agda have “dependent type” specs.6
If you squint a bit it looks like these three forms of code spec map to the three main domains of automated correctness checking: tests, contracts, and types. This is not a coincidence. Correctness is a spectrum, and formal verification is one extreme of that spectrum. As we reduce the rigour (and effort) of our verification we get simpler and narrower checks, whether that means limiting the explored state space, using weaker types, or pushing verification to the runtime. Any means of total specification then becomes a means of partial specification, and vice versa: many consider Cleanroom a formal verification technique, which primarily works by pushing code review far beyond what’s humanly possible.
...
The question, then: “is 90/95/99% correct significantly cheaper than 100% correct?” The answer is very yes. We all are comfortable saying that a codebase we’ve well-tested and well-typed is mostly correct modulo a few fixes in prod, and we’re even writing more than four lines of code a day. In fact, the vast majority of distributed systems outages could have been prevented by slightly-more-comprehensive testing. And that’s just more comprehensive unit testing, to say nothing of fuzzing, property-based testing, or model-testing. You can get really far with simpler tricks without needing to go on to full proofs.
What if types’n’tests isn’t getting you enough verification? It’s still much easier to go from 90% to 99% than from 99% to 100%. As mentioned earlier, Cleanroom is a developer practice involving comprehensive documentation, careful flow analysis, and extensive code review. No proofs, no formal verification, not even any unit testing. But done properly, Cleanroom reduces the defect density to less than 1 bug/kLoC in production.13 Teams using it have equal or shorter delivery times than teams that don’t use it- certainly better than 4 lines a day. And Cleanroom itself is just one of many high-assurance techniques that sit between mainstream software practices and code verification. You do not need full code verification to write good software or even to write near-perfect software. There are cases where it’s necessary, but for most of the industry it’s a waste of money.
...
Summary
Verifying code is a hard problem. More and more people are doing it, though, as theorem provers and SMT solvers get more sophisticated. It will probably remain a specialist thing for the foreseeable future.
Verifying designs is much easier, but has cultural barriers to adoption. I think this is possible to change, though. Twenty years ago automated testing and code review were pretty niche things and they eventually went mainstream. Then again, code contracts was a niche thing and still is.
math and proof assistants:
Case Studies in Proof Checking: http://www.cs.sjsu.edu/faculty/beeson/Masters/KamThesis.pdf
http://www.cs.ru.nl/~freek/comparison/index.html
http://www.cs.ru.nl/~freek/digimath/index.html
https://mathoverflow.net/questions/133572/at-which-level-is-it-currently-possible-to-write-formal-proofs]]>ratty gwern analysis essay realness truth correctness reason philosophy math proofs formal-methods cs programming engineering worse-is-better/the-right-thing intuition giants old-anglo error street-fighting heuristic zooming risk threat-modeling software lens logic inference physics differential geometry estimate distribution robust speculation nonlinearity cost-benefit convexity-curvature measure scale trivia cocktail history early-modern europe math.CA rigor news org:mag org:sci miri-cfar pdf thesis comparison examples org:junk q-n-a stackex pragmatic tradeoffs cracker-prog techtariat invariance DSL chart ecosystem grokkability heavyweights CAS static-dynamic lower-bounds complexity tcs open-problems big-surf ideas certificates-recognition proof-systems PCP mediterranean SDP meta:prediction epistemic questions guessing distributed overflow nibble soft-question track-record big-list hmm frontier state-of-arthttp://pinboard.in/http://pinboard.in/u:nhaliday/b:ea4dd0962112/Computer Scientists Close In on Unique Games Conjecture Proof | Quanta Magazine2018-06-15T02:47:56+00:00
https://www.quantamagazine.org/computer-scientists-close-in-on-unique-games-conjecture-proof-20180424/
nhalidaynews org:mag org:sci popsci tcs cs computation UGC complexity approximation lower-bounds hardness nibble org:inst rand-approx proofs big-surf announcement SDP optimization open-problems questions researchhttp://pinboard.in/http://pinboard.in/u:nhaliday/b:182c4d236934/PCP 1 | Class Profile | Piazza2017-02-07T06:27:38+00:00
https://piazza.com/weizmann_institute_of_science/fall2014/pcp1/home
nhalidaycourse israel tcs complexity approximation rand-approx pcp proof-systems lecture-notes boolean-analysis SDP UGC unit topics ocw quixotic advanced certificates-recognitionhttp://pinboard.in/http://pinboard.in/u:nhaliday/b:732341b91f11/Schur complement - Wikipedia2017-02-02T11:09:03+00:00
https://en.wikipedia.org/wiki/Schur_complement
nhalidaymath linear-algebra ground-up concept calculation atoms wiki reference SDP positivity nibble nitty-gritty signumhttp://pinboard.in/http://pinboard.in/u:nhaliday/b:7d5bea32c8c4/CS/CMS 139: Advanced Algorithms2017-01-10T01:11:17+00:00
http://users.cms.caltech.edu/~vidick/teaching/CS139_Winter17/index.html
nhalidaycaltech course algorithms tcs optimization SDP yoga metabuch 👳 winter-2017 ground-up unit toolkit init p:*** cs quixotic elegance advancedhttp://pinboard.in/http://pinboard.in/u:nhaliday/b:c5f48d52b6d9/cse 599s: entropy optimality, winter 20162016-12-04T13:52:54+00:00
http://homes.cs.washington.edu/~jrl/teaching/cse599swi16/
nhalidaycourse washington tcs acm math math.CA topics concept lecture-notes boolean-analysis markov SDP online-learning tcstariat entropy-like information-theory unit p:someday quixotic advancedhttp://pinboard.in/http://pinboard.in/u:nhaliday/b:ac5ec28a8ee5/Princeton University CS Dept COS521: Advanced Algorithm Design Fall 20152016-10-09T22:28:27+00:00
http://www.cs.princeton.edu/courses/archive/fall15/cos521/
nhalidayprinceton course algorithms yoga 👳 tcs metabuch lecture-notes toolkit dimensionality sanjeev-arora unit concentration-of-measure hashing linearity linear-programming online-learning gradient-descent markov SDP approximation duality coding-theory crypto rigorous-crypto huge-data-the-biggest heuristic counting sampling game-theory decision-theory high-dimension p:*** matrix-factorization quixotic advancedhttp://pinboard.in/http://pinboard.in/u:nhaliday/b:70b9f4c8a47c/Proofs, beliefs and algorithms through the lens of Sum of Squares2016-09-21T05:56:29+00:00
http://sumofsquares.org/public/index.html
nhalidayyoga tcs sum-of-squares lower-bounds course topics lecture-notes exposition SDP rounding rand-approx hierarchy approximation relaxation unithttp://pinboard.in/http://pinboard.in/u:nhaliday/b:aba1667ab4c7/6.854/18.415 Advanced Algorithms, Spring 20162016-07-02T09:30:04+00:00
https://people.csail.mit.edu/moitra/854.html
nhalidaycourse mit algorithms tcs yoga 👳 unit toolkit metabuch ankur-moitra init hashing sublinear graphs graph-theory linear-programming duality submodular gradient-descent iterative-methods online-learning ensembles SDP rounding random-networks latent-variables random-matrices average-case perturbation lecture-notes p:*** quixotic advanced optimization dimensionality embeddinghttp://pinboard.in/http://pinboard.in/u:nhaliday/b:6d3671d8cf25/Quarterly Theory Workshop: Semidefinite Programming Hierarchies and Sum of Squares. | Northwestern CS Theory group2016-06-09T19:36:01+00:00
https://theory.eecs.northwestern.edu/events/sum-of-squares/
nhalidaytcs sum-of-squares complexity optimization workshop lectures video SDP hierarchy eventshttp://pinboard.in/http://pinboard.in/u:nhaliday/b:d110d2993bee/Hypercontractivity and its Applications | tcs math2016-05-14T12:37:15+00:00
https://tcsmath.org/2010/02/15/hypercontractivity-and-its-applications/
nhalidaytcs cool optimization algorithms lower-bounds math survey yoga concept tcstariat washington papers slides talks exposition boolean-analysis estimate rand-approx coding-theory math.GR SDP rounding complexity graphs graph-theory UGC lectures org:bleg nibblehttp://pinboard.in/http://pinboard.in/u:nhaliday/b:eb2f02bc8ff7/cc.complexity theory - The importance of Integrality Gap - Theoretical Computer Science Stack Exchange2016-04-28T04:21:26+00:00
https://cstheory.stackexchange.com/questions/392/the-importance-of-integrality-gap
nhalidayoptimization tcs algorithms lower-bounds research explanation motivation concept q-n-a atoms overflow rounding linear-programming SDP ground-up approximation relaxation nibble discretehttp://pinboard.in/http://pinboard.in/u:nhaliday/b:0f6d2aa4c615/Rough Problems | Gödel's Lost Letter and P=NP2015-08-22T07:43:56+00:00
https://rjlipton.wordpress.com/2013/07/10/rough-problems/
nhalidaytcs tcstariat SDP optimization open-problems org:bleg nibble questions curvature convexity-curvaturehttp://pinboard.in/http://pinboard.in/u:nhaliday/b:729ac66203d0/