<?xml version="1.0" encoding="UTF-8"?>
 <rdf:RDF xmlns="http://purl.org/rss/1.0/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:cc="http://web.resource.org/cc/" xmlns:syn="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/">
  <channel rdf:about="http://pinboard.in">
    <title>Pinboard (josephzizys)</title>
    <link>https://pinboard.in/u:josephzizys/public/</link>
    <description>recent bookmarks from josephzizys</description>
    <items>
      <rdf:Seq>	<rdf:li rdf:resource="http://gilkalai.wordpress.com/2011/12/09/cup-sets-sunflowers-and-matrix-multiplication/"/>
	<rdf:li rdf:resource="http://rjlipton.wordpress.com/2011/12/03/the-meaning-of-omega/"/>
	<rdf:li rdf:resource="http://rjlipton.wordpress.com/2011/11/29/a-breakthrough-on-matrix-product/"/>
	<rdf:li rdf:resource="http://rjlipton.wordpress.com/2011/11/19/another-annoying-open-problem/"/>
	<rdf:li rdf:resource="http://rjlipton.wordpress.com/2011/11/14/more-quantum-chocolate-boxes/"/>
      </rdf:Seq>
    </items>
  </channel><item rdf:about="http://gilkalai.wordpress.com/2011/12/09/cup-sets-sunflowers-and-matrix-multiplication/">
    <title>Cup Sets, Sunflowers, and Matrix Multiplication</title>
    <dc:date>2011-12-09T15:04:10+00:00</dc:date>
    <link>http://gilkalai.wordpress.com/2011/12/09/cup-sets-sunflowers-and-matrix-multiplication/</link>
    <dc:creator>josephzizys</dc:creator><description><![CDATA[This post follows a recent paper On sunflowers  and matrix multiplication by Noga Alon, Amir Spilka, and Christopher Umens (ASU11) which rely on an earlier paper Group-theoretic algorithms for matrix multiplication, by Henry Cohn, Robert Kleinberg, Balasz Szegedy, and Christopher Umans (CKSU05), and refers also to a paper by Don Coppersmith and Shmuel Winograd (CW90). 

Three famous problems
The Erdos-Rado sunflower conjecture
The Erdos-Rado sunflower (Delta system) theorem and conjecture was already menioned in this post on extremal set theory.

A sunflower (a.k.a. Delta-system) of size  is a family of sets  such that every element that belongs to more than oneofthe sets belongs to all of them. A basic and simple result of Erdos and Rado asserts that

Erdos-Rado sunflower theorem: There is a function  so that every family  of -sets with more than  members contains a sunflower of size .

One of the most famous open problems in extremal combinatorics is:

The Erdos-Rado conjecture: Prove that .

Here,  is a constant depending on . This is most interesting already for .

Three term arithmetic progressions
The cup set problem (three terms arithmetic progressions in ):

The cup set problem was also discussed here quite extensively. (See, e.g. this post.)

Let . The cap set problem  asks for the maximum number of elements in a subset of  which contains no arithmetic progression of size three or, alternatively, no three vectors that sum up to 0(modulo 3). (Such a set is called a cup set.) If  is a cap set of maximum size we can ask how the function  behaves. Roy Meshulam proved, using Roth’s argument, that . Edell found an example of a cap set of size . So .  The gap is exponential.  

The strong cap set conjecture:  for some .

Of course, the cap set problem is closely related to

Erdos-Turan problem (for ): What is the larget size  of a subest of {1,2,…,n} without 3-term arithmetic progression?

Matrix multiplications
Let ω be the smallest real number so that there is an algorithm for multiplying  two  matrices which requires  arithmetic operations.  

The ω=2 conjecture: ω=2.

A very recent breakthrough by Virginia Vassilevska Williams (independently) following an earlier breakthrough by Andrew Stothers improved the Coppersmith-Winograd algorithm which gave ω =2.376, to 2.374 and 2.373 respectively. (See the discussions over Lipton’s blog (1,2), Shtetl optimized, and Computational Complexity.)

It turns out that these three conjectures are related. (The connection of matrix multiplication and the Erdos-Turan problem is fairly old, but I am not sure what an even drastic improvment of Behrends’s lower bound would say about .)

Three combinatorial conjectures that imply ω=2.
Remarkably, an affarmative answer for the ω=2 conjecture would folow from each one of three combinatorial conjectures. One conjecture goes back to CW90 and two were described in CKSU05. I will not present the precise formulations in order to encourage the readers to look at the original papers. (Maybe I will add the formulations later.)

The no disjoint equivoluminous subsets conjecture (CW90).

The Strong UPS conjecture (CKSU05).

Theorem: Conjecture CW90 implies the strong UPS conjecture.

CKSU’s two-family conjecture (CKSU05). 

Relations between these problems
Here are some results taken from ASU11 about the relations between these combinatorial questions. The first result goes back to Erdos and Szemeredi.

The weak sunflower conjecture: A family  of subsets of {1,2,…,n}  with no sunflower of size 3 can have at most  sets.

The following results are not difficult.

Theorem: The strong sunflower conjecture implies the weak sunflower conjecture.

Theorem: The strong cup set conjecture also implies the weak sunflower conjecture.

Theorem: The weak sunflower conjecture implies that the CW90 conjecture is false. 

It follows that CW90 conjecture is in conflict both with the Erdos Rado sunflower conjecture and with the strong cup set conjecture.

Theorem: The strong cup set conjecture implies that the strong UPS conjecture is false.

While two family theorems are quite popular in extremal combinatorics (see this post and this one), CKSU’s two family conjecture is still rather isolated from other combinatorics.

What to believe?
This is a nice topic for discussion.

         ]]></description>
<dc:subject>Combinatorics Computer_Science_and_Optimization Open_problems Cup_sets Extremal_combinatorics Matrix_multiplication sunflowers</dc:subject>
<dc:identifier>https://pinboard.in/u:josephzizys/b:a077e1e51f37/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:Combinatorics"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:Computer_Science_and_Optimization"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:Open_problems"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:Cup_sets"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:Extremal_combinatorics"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:Matrix_multiplication"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:sunflowers"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://rjlipton.wordpress.com/2011/12/03/the-meaning-of-omega/">
    <title>The Meaning of Omega</title>
    <dc:date>2011-12-03T22:26:22+00:00</dc:date>
    <link>http://rjlipton.wordpress.com/2011/12/03/the-meaning-of-omega/</link>
    <dc:creator>josephzizys</dc:creator><description><![CDATA[
 And the meaning of the recently-broken metric TSP approximation ratio 





Andrew Stothers wrote in 2007 a first-year qualifying report for the University of Edinburgh PhD programme. In only eleven pages he presented the parts of the famous paper by Don Coppersmith and Shmuel Winograd (CW) that do enough to obtain  for the exponent of matrix multiplication. In a few more pages he surveyed recent work on sufficient conditions for . Page 15, the last except for references, laid out “Possible Directions for Future Work” in five sentences, the fifth being:


Finally, it will also be possible to investigate bilinear algorithms further and use more complex means to determine whether or not “better” algorithms exist for matrix multiplication.



Today we wish to explain what  means, why lower is always “better,” why we feel our field is better for the great efforts expended by Stothers and Virginia Vassilevska Williams, and why it is better when intellectual values guide our judgment. We also wish to defend ourselves and others for calling the results “breakthroughs,” comparing and contrasting to recent advances on algorithms for the Traveling Salesman Problem.



I (Ken) wrote a similar first-year qualifier in summer 1982 after my first year at Oxford—they are a staple of leading British universities. Like Andrew’s it was not obligated to have original research, just to demonstrate understanding and capability. One item found its way into a conference paper, but my 1986 dissertation ranged into various other topics.


What distinguishes Andrew’s Nov. 2010 dissertation is its singular focus on accomplishing the objective of his proposal. And doing it. The resulting paper with his advisor Sandy Davie has recently been submitted. Meanwhile in North America, Virginia had been thinking about the same problem since 2005. It may not be true as asserted here that “hundreds of people tried to improve” it after 1987, but we’ll guess about as many people thought about it. It’s enough to note that many researchers felt it worthy of pursuit, including Andrew’s advisor, and that progress took large concerted effort. The nature of this effort—including the use of computers—may be an important harbinger for science by itself.


 What Does ω Mean? 


The definition of  is the infimum of all  such that  matrix multiplication has an algorithm that runs in time . This is in a classical model of computation where addition and multiplication of scalars have unit cost. Note that there might not be an algorithm that runs in time  itself— may properly be a limit even if it turns out to have the least possible value 2. Currently all we know is that  from Virginia’s work.


As many have noted,  does not have immediate practical relevance because there is evidently a tradeoff with the constant hiding under the  in . The tradeoff is so far steep enough that apparently only Volker Strassen’s relatively simple algorithm achieving an exponent of  has wide use. So why do we care about values of  less than that?


One of the central questions in physics is, why do the fundamental constants of nature have the values they do? The desire for why, in the face of evidence that the values could be arbitrary, has aroused such passion that string theorist David Gross channeled Winston Churchill in exhorting young physicists to “never, never, never give up.” 


Now in mathematics it may seem nonsensical to ask, why do numbers like  and  have the values they do? But with , we are in a sense closer to physics: understanding it is akin to discovering a natural law, at least a law of information. And quantum mechanics has if anything magnified William Hamilton’s vision approaching 200 years ago that Nature computes with matrices, yes large ones. It is possible that  detracts from a better statement of important laws, but knowing  better brings us closer to them all the same.


 Barriers and Breakthroughs 


Hence also the interest in whether  takes a value with known meaning, such as  or  (both falsified) or , or as some have mused,  or , or as some have argued more likely for exponents, a logarithm of some higher and simpler number. Strassen himself has often stated his belief that  is strictly greater than . If it has a value not previously seen in mathematics, then we could hope to discover new mathematical regularities as well; if it has a known value, then this may yield some more explanation about algorithms. 


The value  seemed a natural possible barrier, but the time interval from  to  in 1981–82 was very short. By contrast the “CW” bound of  was not a natural-seeming number, but it withstood attempts to scale it for 23 years, including a year’s work from each of Virginia and Andrew. Reasons of intellectual judgment, expanded below, we feel are enough to justify calling their final results a “breakthrough.” But we offer a recent concrete case for comparison first.


 The Salesman Always Rings 1.5 


The Traveling Salesman Problem (TSP) seeks the shortest route to visit each of a given set of sites, coming back in a ring to the starting site. When the sites are in real space and “shortest” refers to Euclidean distance, finding the length  of the absolute shortest route remains -hard even in the plane, as discussed here. However, Nicos Christofides in 1976 found a polynomial-time algorithm that finds a tour guaranteed to have total length at most . The method and proof apply to various other cases where the distances between sites obey inequalities that define some kind of metric, with the same  bound.


This  stood as a barrier for most of these cases, until this year. Shayan Oveis Gharan, Amit Saberi, and Mohit Singh (OSS) broke it for a wide subclass of these problems. Well they “broke” it by achieving a guaranteed ratio to the optimum of 






Actually their original paper did not give a value—it merely proved the existence of an  giving . The  estimate for the magnitude of their breakthrough was supplied later. 


 Improvements and Predictions 


It must be said that the OSS paper and its predecessor introduced techniques that were more novel to the problem than is evident for the new matrix-multiplication papers, and the predecessor won the SODA 2010 “Best Paper” award. However, a referee could have wondered, why bother moving heaven and earth in a restricted case to demonstrate an increment a thousand times smaller than “nano”? Indeed the version of OSS linked above still ends with “” on page 65.


What happened is that the change attracted interest on several continents, and the people in Europe who made a bankable improvement are not even in the euro zone. First Tobias Mömke and Ola Svensson of KTH in Stockholm obtained  by using matchings in place of the more-complicated ingredients of OSS, also in time for FOCS 2011. Then Marcin Mucha of the University of Warsaw improved their analysis to obtain  


It is noted by all that an even older paper by Michael Held and Richard Karp from 1970 had set a believable target for provable improvements of Christofides’ bound, namely a conjectured ratio of  from linear programming relaxations. However,  also has dueling conjectures, some supporting a believable target of . The real point of comparison with TSP is that suddenly there was progress on a barrier that had stood for much of the age of the field, and this attracted others to try for more.


We note that Markus Bläser has contended in a comment that the extension of CW used by both new papers has limitations, and we infer that some other experts concur. However, the paper by Vassilevska Williams in particular has two new ingredients: a framework for managing any tensor power as a base algorithm, and a computer implementation of the framework. She also employed an insight of Stothers to simplify the analysis. Though we can imagine a geometrical insight that higher powers bring returns that “shrink geometrically,” can we really constrain the likelihood that pursuing them might dislodge a different insight that changes the whole game?  Is there a limitation theorem here?  All we know is that the game has changed, and perhaps the game is afoot. 


 Worth and Values 


The last issue we wish to raise is the reliability of judging the worth of past achievement by present assessment of what it may or may not lead to in the future. We have posted several times on surprises and the difficulty of predicting or guessing how things will go. Of course impetus into the future is necessarily part of claiming something now a “breakthrough.” 


However, if we seek a reliable and consistent standard of judging worth, we should use a value that the community has understood for many years: intellectual substance and effort. This value, plus the simple salience of the goal, led our principals to invest the years of work in the first place. Depth of thinking is our gold standard, while applicability is our paper currency. The improvements of  and  on paper for  amid other issues may be a poor return now, but the new computer-assisted vein to mine may parlay true value later.


Dick and I hope this explains why fundamental effort and easily-stated achievement, on a fundamental problem after nearly a quarter-century of stasis, elicited the reaction it has.  We agree with, and have tried to extend, Timothy Gowers’ comments here. As for how these values can be invested, only time will be the teller.


 Open Problems 


Can we be more open about the value of pursuing problems?


What is ?


Suppose we know something special about two  matrices  and  to be multiplied, something not so obvious like their being sparse. Suppose in particular that we have a tiny circuit  that for any  outputs the value , and similarly  for , where those values come from a fixed finite set. Or suppose we know that  and/or  preserve a similar succinctness property from argument vectors to values—which gives a different property on the part of  and . Is this enough to compute  faster than what we know for arbitrary matrices?


[corrected Ola Svensson's name]

         










]]></description>
<dc:subject>All_Posts Ideas News Open_Problems People Proofs Results Algorithms approximation matrix multiplication omega TSP</dc:subject>
<dc:identifier>https://pinboard.in/u:josephzizys/b:a12f8d61d4c7/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:All_Posts"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:Ideas"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:News"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:Open_Problems"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:People"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:Proofs"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:Results"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:Algorithms"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:approximation"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:matrix"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:multiplication"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:omega"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:TSP"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://rjlipton.wordpress.com/2011/11/29/a-breakthrough-on-matrix-product/">
    <title>A Breakthrough On Matrix Product</title>
    <dc:date>2011-11-29T14:59:04+00:00</dc:date>
    <link>http://rjlipton.wordpress.com/2011/11/29/a-breakthrough-on-matrix-product/</link>
    <dc:creator>josephzizys</dc:creator><description><![CDATA[
 Beating Coppersmith-Winograd 





Virginia Vassilevska Williams is a theoretical computer scientist who has worked on a number of problems, including a very neat question about cheating in the setup of tournaments, and a bunch of novel papers exemplified by this one on graph and matrix problems with runtime exponents of  that have long been begging to be improved. 


Today Ken and I want to discuss her latest breakthrough in improving our understanding of the matrix multiplication problem.




Of course Volker Strassen first showed in his famous paper in 1969 that the obvious cubic algorithm was suboptimal. Ever since then progress has been measured with one parameter : if your algorithm runs in time , then you are known by this one number. Strassen got , and the race was off. A long series of improvements started to happen, which for a while seemed to be stuck above . Then, Don Coppersmith and Shmuel Winograd (CW) got  and everything changed. After a contribution by Strassen himself, CW finally obtained  in 1987, with full details here. This has been the best known for decades.


This has all changed now. Virginia has proved that she can beat the “barrier” of CW and get a new lower value for . Currently her paper gives , an improvement of “only” , but there is promise of more. This is also another case of proofs coming in twos, as a theorem  in PhD thesis work by Andrew Stothers was circulated to some in June 2010 but not very widely. All this is extremely exciting, and is one of the best results proved in years in all of theory. While these algorithms are unlikely to be usable in practice, they help shed light on one of the basic questions of complexity theory: how fast can we multiply matrices? What could be more fundamental than that?


 The Basic Idea 


Matrix multiplication is bi-linear: the formula for the  entry of  is . The first step in simplifying the problem is to make it more complicated: Let us have indicator variables  and compute instead the tri-linear form




This is a special case of a general tri-linear form 




where  and we have re-mapped the indices. It looks like we have made order-of  work for ourselves. The key, however, is to try to fit a representation of the form:




where . The point is, suppose we can compute these products  in total time . Then we can compute (the coefficients for) all the desired entries



 in  steps. Thus what we have are separate handles on the time  for the products and the time for the . The way to manage and balance these times involves recursion. 


 The Basis Idea 


The recursion idea is nice to picture for matrices, though its implementation for the way we have unrolled matrices into vectors is not so nice. Picture  and  as each being  matrices. We can regard  instead as a  matrix of four  matrices , and do the same for . Then the product  can be written via  products , and we can picture ourselves recursing on these products.


The reason why the vector case does not look so nice is that the tri-linear form  is so general—indeed we cannot expect to fit a general tri-linear form into a small number of products . What CW did, building on work by Arnold Schönhage, is relax the tri-linear form by introducing more than -many  variables, supplying appropriate coefficients to set up the recursion, and most of all framing a strategy for setting variables to zero so that three goals are met: the recursion is furthered, the values of “” at each level stay relatively small, and the matrix product can be extracted from the variables left over. This involved a hashing scheme which used subsets of integers that are free of arithmetical progressions.


The final step by CW was to choose a starting algorithm  for the basis case of the recursion. They devised one and got . Then they noticed that if they bumped up the base case by manually expanding their algorithm to an  handling the next-higher case, they got a better analysis and their famous result . By their way of thinking, bumping the basis up once more to  was the way to do better, but they left analyzing this as a problem. Others attempted the analysis and…found it gave worse not better results. So , actually , stood.


The insight for breaking through was to make a bigger jump in the basis. Vassilevska Williams was actually anticipated in this without her knowledge by Andrew Stothers, in his 2010 PhD thesis at the University of Edinburgh. Stothers used  and showed this method capable of achieving , though there has been some doubt in whether all details were worked out. Vassilevska Williams, however, used  and brought some powerful computing software to bear on a more-extensive framework for the plan. It is not clear whether there is anything necessary about jumping  by a power of —in any event her program and framework work for any exponent.


 The Proof 


We cannot yet really give a good summary of the proof—further details are in her paper. One quick observation about her work is in order. She used the CW method, but extended it into a general schema can be used to find good matrix product algorithms, perhaps even better than the one in the paper. The algorithms themselves can be generated and examined, but as usual the task of analyzing them is very hard. The brilliant insight that she has is this task can be laid out automatically by a certain computer program. This allows her to do the analysis where previous others failed. 


 For example here is a sample of the overview of her main program, in pseudo-code:





The details are not as important as the fact that this program allows one to work on much larger schemas than anyone could previously. 


 What Does the Bound Mean? 


Note that she has improved the bound of Stothers by . For what threshold value of  does an additive improvement of  in the exponent halve the running time? The answer is , which in this case is . This value is far above the Bekenstein Bound for the number of particles that could be fit into a volume the size of the observable universe without collapsing it into a black hole. In this sense the algorithm itself is even beyond galactic. 


The meaning instead comes from this question: Is there a fundamental reason why  could settle at a value strictly greater than ? Note that  is not taken to mean the existence of a quadratic-time algorithm, but rather that for all  there are algorithms that achieve time . There was some reason to think  could be a natural barrier, but it was breached. Perhaps , since this is connected to the golden ratio? Her paper notes a recent draft by Noga Alon, Amir Shpilka, and Christopher Umans that speaks somewhat against the optimism shared by many that .


 Open Problems 


Can the current bounds be improved by more computer computations? Are we about to see the solution to this classic question? Or will it be struggling over increments of ?


In any event congratulate Virginia—and Andrew—for their brilliant work.


Update: Markus Bläser, who externally reviewed Stothers’ thesis, has contributed an important comment on the blog of Scott Aaronson here.  It evaluates the significance of the work in-context, and also removes the doubt going back to 2010 that was expressed here.  

         














]]></description>
<dc:subject>History News Open_Problems Proofs breakthrough galactic matrix_exponent matrix_product Strassen</dc:subject>
<dc:identifier>https://pinboard.in/u:josephzizys/b:8e1933862303/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:History"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:News"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:Open_Problems"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:Proofs"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:breakthrough"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:galactic"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:matrix_exponent"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:matrix_product"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:Strassen"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://rjlipton.wordpress.com/2011/11/19/another-annoying-open-problem/">
    <title>Another Annoying Open Problem</title>
    <dc:date>2011-11-19T23:23:22+00:00</dc:date>
    <link>http://rjlipton.wordpress.com/2011/11/19/another-annoying-open-problem/</link>
    <dc:creator>josephzizys</dc:creator><description><![CDATA[
 The k-server problem: some progress, still wide open 





Aleksander Mądry gave a stellar talk at our ARC Theory Day a week ago Friday. He is an expert on algorithmic graph theory, among other areas of theory. Already he has multiple best-paper awards, including the paper of this talk from FOCS 2011, and I expect he will get more in the future. This is good going considering that basic LaTeX systems lack a macro for the Polish ogonek diacritic in his name—Ken installed a special package called tipa to get it while editing this post. 


Today I would like to talk about his result and the open questions that remain.




He with Nikhil Bansal, Niv Buchbinder, and Seffi Naor (BBMN) have made a recent important contribution to our understanding of the complexity of on-line algorithms. Aleksander’s talk was wonderful and explained what is known, their new result, and what still remains to be done. As he said:  


One of the remaining open questions is really annoying—it should be solved—but it continues to resist attacks.



Before I discuss their result I would like to thank the Theory Day organizers Prasad Tetali and Prasad Raghavendra. It was held on the magical day 11/11/11, which does not happen that often, and in fact Mądry’s talk spanned 11:11:11am. Ken wore a red shirt in Buffalo but could not find the WW I remembrance poppy pin he acquired during his sabbatical in Canada. The full scorecard of talks and titles was: 



Thomas Dueholm Hansen: Subexponential lower bounds for randomized pivoting rules for the simplex algorithm. 

Aleksander Mądry: Online algorithms and the k-server conjecture. 

Mohit Singh: A Randomized Rounding Approach for Symmetric TSP. 

Ryan Williams: Algorithms for Circuits and Circuits for Algorithms.


 The Problem 


The problem he spoke about is the now classic on-line server problem. The simplest case is the problem of managing a computer $. Note that this is the way to tell a theory talk on computer caches from an architecture talk on computer caches—in architecture often $ is used to denote a cache. Perhaps this is changing, given the current economic turmoil, perhaps not. 


The key is that there are  pages in the cache and  total pages. Each time a page is requested, if it is not in the cache the task is to decide which page to “evict” from the cache. If the strategy is deterministic, then as Danny Sleator and Bob Tarjan proved back in 1985, the best strategy is only -competitive. This means that the best deterministic strategy could be as bad as  times the best off-line strategy which is allowed to see all the page requests at once. 


Randomization comes to the rescue. If the strategy for eviction can be random, then it is known that an on-line random strategy exists that is  competitive. This is due to Amos Fiat, Dick Karp, Mike Luby, Lyle McGeoch, Danny Sleator, and Neal Young in 1991. 


The paging problem is just a special case of the general -server problem. In the general case the simple cache mechanism is replaced by one based on an arbitrary finite metric space. The servers at any step are located somewhere on the points of the metric space. The requests are to “serve” a point in the space: a server must move to that point, and it incurs a cost of the distance in the metric. 


In the 1990′s this problem was very “hot,” and there was a stream of results that tried to get good deterministic server strategies. The first results were exponential in  but independent of : recall there are  servers and the space has  points. Finally—in a famous paper—Elias Koutsoupias and Christos Papadimitriou 1994 showed that there is a strategy achieving , which remains today the best known result. 

 The BBMN Result 


The new result is: 


Theorem 1  There exists an -competitive on-line strategy for the -server problem. 



This is a great improvement, but it also changes the rules. It is great since the competitiveness bound is poly-log in . However, all the previous results we discussed had competitiveness bounds that did not depend on the size of the metric space, . Their result does. 


See their paper for the proof. The key idea is that they:  Reduce the -server problem over arbitrary metric to a (more difficult) problem over a very simple metric  The very simple metric is a type of tree-like metric. The proof relies on the fact that any metric space on  points can be well approximated by such a metric space. The cost of the approximation grows roughly as , so it is not surprising that their theorem has factors of  in the competitiveness bound.


 Open Problems 


The -server problem has the following bounds and gaps in those bounds: 



If the strategy is deterministic it must take  and can be done in . There is a small, but annoying gap of two here. Which is correct? 

 If the strategy is randomized it must take . There is a huge and annoying gap to the known upper bounds here. There is no known strategy for even the simple case of the real line that is randomized but beats linear in . What is going on?
 Finally, are the polylog factors of  in their result necessary? In particular can one get rid of the dependence on  altogether?


[fixed typo in Fiat et al. to O(log k)]

         



]]></description>
<dc:subject>1 All_Posts News Open_Problems Results Algorithms approximation deterministic lower_bounds randomness</dc:subject>
<dc:identifier>https://pinboard.in/u:josephzizys/b:e4775cd790d6/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:1"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:All_Posts"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:News"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:Open_Problems"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:Results"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:Algorithms"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:approximation"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:deterministic"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:lower_bounds"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:randomness"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://rjlipton.wordpress.com/2011/11/14/more-quantum-chocolate-boxes/">
    <title>More Quantum Chocolate Boxes</title>
    <dc:date>2011-11-15T04:50:48+00:00</dc:date>
    <link>http://rjlipton.wordpress.com/2011/11/14/more-quantum-chocolate-boxes/</link>
    <dc:creator>josephzizys</dc:creator><description><![CDATA[
 Unwrapping the layers between classical and quantum? 





Daniel Simon discovered the first quantum algorithm that yielded an oracle separation of  from an exponential-time analogue of . Even more, his algorithm was the direct inspiration for Peter Shor’s factoring algorithm, which roughly speaking replaced Simon’s use of the Hadamard transform with the quantum Fourier transform. Apologies to Christopher Marlowe, Shor’s algorithm was the one with “the phase that launched a thousand scrips” (scrips meaning papers, funding, you name it).




Today we wish to revisit the question of solving Simon’s problem by a classical algorithm. We also ask whether the classical algorithms are being given a fair counterpart to the following principle, which the quantum algorithms take for granted:


Given an invertible function, there is a unitary matrix that implements the function as a permutation matrix, and moreover: the matrix is efficiently constructible, in a way that is realizable by a physical system.



Let’s call this the quantum black box assumption (QBB). Without it there is no Deutsch(-Jozsa) or Simon or Grover algorithm, and in the details, no Shor algorithm either (compare discussion here). Without it there are no quantum algorithms for anything interesting, since it is a central assumption required by all them. It is justified by polynomial-time universality theorems for quantum circuits, but is it really innocuous? And since the matrix computes the function at any algebraic input, should not the representation for the classical algorithms do the same?


Ironically, Simon himself is not sanguine on the prospects of quantum computers. He works on Windows security at Microsoft, and was among the many builders of the Transport Layer Security (TLS) protocol. In March 2009 he contributed comments noted here, including this statement:


Basically, there is one thing that quantum computers [are known to do] better than classical computers. Because this one thing happens to lead to polynomial-time algorithms for integer factoring and discrete log, quantum computers have been bandied about as an incredible new computing technology, but the truth is that this one thing is really very limited in scope, and in a decade and a half, nobody’s found another significant application for it.


Moreover, there are lots of (admittedly informal) reasons for believing that quantum computers can’t really do anything interesting beyond this one thing.



Simon would add, following the title of his own blog, “I could be wrong.” We could be wrong likewise, but let’s first look at Simon’s problem. 


 Simon’s Distinguishing Problem 


We describe Simon’s problem with the same chocolate-box metaphor we recently used here. Again we are trying to identify a Boolean function , except this time  maps  into itself, i.e., the output  is an -bit string not just 0 or 1. Again we are told that  is a special function that belongs to a limited number of “boxes.” Here we have one box  for every “hidden” string :




As usual  means bitwise exclusive-or. When  is the all-zero vector, the condition holds iff  is 1-to-1, and so box  includes all  that are bijectively onto . The other boxes, however, comprise functions that are exactly 2-to-1 in a particular periodic way. 


The input to Simon’s problem is an oracle  for an unknown member  of one of the boxes. The first goal is to distinguish the case  from  being in one of the other boxes. The full goal is to tell which box, i.e., to compute  with high probability. Simon’s beautiful result is that this can be done by a polynomial time quantum algorithm. 


As in our previous post,  is given as an operator on  qubits, such that on input  padded with -many 0′s,  outputs the pair . The point of necessity is that this computation is 1-to-1 even when  itself is 2-to-1.


 Some Basics 


The key to understanding a quantum algorithm is to look at the vectors and unitary operators. All in this case are from real Hilbert spaces, or if you prefer finite dimensional Euclidean spaces. We will use capital letters for our vectors, and script letters for our matrices. We also avoid subscripts and use the simple notion  to denote the  coordinate of the vector . If you like we are viewing vectors as functions. 


The Hadamard matrix is denoted by  where  is always . We also use 



to denote the row  and column  of the matrix . Note, the important fact that 



where  is the boolean inner product of  and . If  is a vector, then  is defined by



 Simon’s Quantum Algorithm 


The first key insight is that we can use a quantum algorithm to efficiently construct a vector  so that 



when  and is zero otherwise. The  is just the concatenation of the numbers  and . Essentially we are using two Hilbert spaces that we put together by a tensor product. Note: this is where the QBB assumption is invoked—without this assumption the algorithm cannot even get started—since  where  is a -qubit state prepared as the equal superposition of basis states in the first  qubits, all with zeroes in the second  qubits.


Then we can apply the matrix  to  and yield the vector . The key is that 



This is what is meant by applying the Hadamard only to the  coordinate. 


We need to analyze what the amplitude is of the coordinates . There are two cases. 



In this case  is one-to-one. Let see what  is then. As the sum runs over  values the only time  is non-zero is when . This only happens once, and therefore 



Thus any measurement of  will yield simply a random uniform number in the range  to . After a number of these samples it will be clear that there is no .


In this case  is two-to-one. Let see what  is in this case—it is much more interesting. If  has no pre-image under , then . If it has a pre-image  it also has a pre-image . In this case 



This is either  or  depending on whether or not . This gives an equation on  over the boolean finite field. After enough equations one can solve for  and solve the problem.


 Quantum Versus Classical Settings 


All discussions of Simon’s algorithm immediately point out that there is no classical algorithm that can solve the problem without an exponential number of calls to the black box for . This is easy to prove and is unassailable—it is correct. Case closed: quantum is more powerful than classical. 


Well of course that is not quite true. If the black box for  were presented by a circuit of polynomial size, then the argument that a polynomial amount of classical computing is inadequate would require the assumption that . So the claim that quantum is stronger than classical can be true if the conventional wisdom is true, but who knows? 


We suggest that the proper comparison may involve granting the classical setting a multi-linear extension  of  to use as the black box. This  is not restricted to Boolean queries, but can be evaluated on points like  that strike us as analogous to Hadamard-transformed states given to . One rub is that if  is an unsatisfiable Boolean formula, and  comes out to be the all-zero function, recognizing this fact is -hard. If  seems too much to assume, compared to the QBB, we are reminded of a common story where one child makes hands like a tank approaching another child, and the second child swoops a hand down like a missile to blow up the tank.



Child 1: “Where did you get that missile?”


Child 2: “Well, where did you get that tank?”


 The Classical Black Box Assumption 


To make our assumption concrete, we settle on this extension of a Boolean function  to a mapping  from  to : 



That is,  is the product of factors  for  and  for . For instance, when  and , 



The effect is that over Boolean arguments ,  is zero except when , so that  for all Boolean . 

Of course for other arguments ,  can be a sum of exponentially many terms, and it is not evident that  can be regarded as easily-given as  is. Indeed that is the point. But let’s make the assumption clear:  


Given a function  with a polynomial size Boolean circuit, there is an arithmetic circuit that is also polynomial in size and computes .


This assertion itself does not force , though obtaining the arithmetic circuit might do so. We will call this assumption the classical black box assumption (CBB). It seems unlikely to be true, but we cannot rule it out immediately. We can also compare with the “Arithmetic Checkability” axiom of this paper by Russell Impagliazzo, Valentine Kabanets, and Antonina Kolokolova.


And for a companion question, is the QBB assumption really true in prospect? Why is it fair to have a quantum linear operator that can be applied to other quantum states besides the 0-1 basis states, and not be able to do so with a classical multi-linear extension? Or is this where the first power of quantum over classical resides?


 Using CBB for Simon’s Problem 


Can we use the CBB assumption to create a classical algorithm for Simon’s problem? We would like to report that the answer is yes, but we leave it open. It seems likely to be true, yet we cannot prove it. Still we find the question interesting in its own right, and in shedding new light on QBB.


The goal now is to show that evaluating  at the right values will yield information about the structure of , just as Simon’s algorithm uses QBB to get information about . Here is just a note on how to start, and some difficulties that we hope you can help us resolve. When , the “schematic terms”  all give value . It follows that whenever  is 1-to-1, 



When  is 2-to-1, it is possible that . If it doesn’t, then we instantly deduce . Whether it does depends completely on the range of , not on the value of . When , this happens exactly when the range of  is , or when it is .


Our approach now is to ask, what happens on arguments  whose entries are  or , perhaps chosen at random? For such  we have 



where  means the number of places  where  and  agree in the sense that  and , or  and . This is the same as the Hamming weight of  using the bitwise complement of , and it may help that . 


Whenever  is 2-to-1,  is a sum over pairs  with  of binomials


The question is finally whether we can exploit divisibility properties of these related sums of two powers of  to tell this apart from the case where  is 1-to-1. In the latter case, for a rule that chooses  or  randomly, it seems by symmetry that we need only analyze this for  being the identity function on .  One can try arguments with entries with  or , and/or other definitions of . 


One can also try to imitate the quantum proof further, working with linear extensions of  on  bits, so that the “line” between  and , which equals , has constant value in the last  coordinates. The idea would be to generate random values  with each  randomly chosen as above, and distinguish the 1-to-1 and 2-to-1-with-period- cases according to whether they span  or only  dimensions. Does this work?


 Open Problems 


There are two technical open problems. Is the CBB assumption wrong? Can one prove that there is an  so that  must have large complexity? Also assuming the above version of CBB, or any reasonable version, can we prove that there is a classical algorithm that solves Simon’s problem?


Our point is that if the classical-vs.-quantum difference persists with our extension of Simon’s problem, it must reside either in the CBB, or in the problem itself under more powerful conditions. Is this a helpful layering of the issues?


[fixed opening sentence on nature/priority of Simon's oracle separation][added reference to this discussion]

         














]]></description>
<dc:subject>All_Posts Ideas Open_Problems Algorithms classical quantum randomness</dc:subject>
<dc:identifier>https://pinboard.in/u:josephzizys/b:d9b149dc7dff/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:All_Posts"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:Ideas"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:Open_Problems"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:Algorithms"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:classical"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:quantum"/>
	<rdf:li rdf:resource="https://pinboard.in/u:josephzizys/t:randomness"/>
</rdf:Bag></taxo:topics>
</item>
</rdf:RDF>