<?xml version="1.0" encoding="UTF-8"?>
 <rdf:RDF xmlns="http://purl.org/rss/1.0/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:cc="http://web.resource.org/cc/" xmlns:syn="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/">
  <channel rdf:about="http://pinboard.in">
    <title>Pinboard (mraginsky)</title>
    <link>https://pinboard.in/u:mraginsky/public/</link>
    <description>recent bookmarks from mraginsky</description>
    <items>
      <rdf:Seq>	<rdf:li rdf:resource="https://arxiv.org/abs/2106.10717"/>
	<rdf:li rdf:resource="http://pages.cs.wisc.edu/~balu2901/papers/2016/experts_project.pdf"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1307.5449"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1501.06225"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1405.6076"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1401.6956"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1310.2997"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1207.1965"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1204.5721"/>
	<rdf:li rdf:resource="http://metamarkets.com/2011/machine-learning-in-wonderland/"/>
	<rdf:li rdf:resource="http://www-bcf.usc.edu/~rusmevic/psfiles/biased-gradient.pdf"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1105.2274"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1102.2670"/>
	<rdf:li rdf:resource="http://www.ualberta.ca/~szepesva/papers/lqr_colt_final.pdf"/>
	<rdf:li rdf:resource="http://www.princeton.edu/~sbubeck/BubeckLectureNotes.pdf"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1111.2888"/>
	<rdf:li rdf:resource="http://jmlr.csail.mit.edu/papers/v12/perchet11a.html"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1107.4080"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1105.4701"/>
	<rdf:li rdf:resource="http://people.csail.mit.edu/costis/optimalNR.pdf"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1102.4442"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1102.0710"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1011.3168"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1011.1936"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1011.0686"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1008.4654"/>
	<rdf:li rdf:resource="http://research.microsoft.com/apps/pubs/default.aspx?id=81322"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1006.4039"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1006.1138"/>
      </rdf:Seq>
    </items>
  </channel><item rdf:about="https://arxiv.org/abs/2106.10717">
    <title>[2106.10717] Optimal potentials for hedging algorithms</title>
    <dc:date>2022-10-25T12:12:07+00:00</dc:date>
    <link>https://arxiv.org/abs/2106.10717</link>
    <dc:creator>mraginsky</dc:creator><description><![CDATA[We study a family of potential functions for online learning. We show that if the potential function has strictly positive derivatives of order 1-4 then the min-max optimal strategy for the adversary is Brownian motion. Using that fact we analyze different potential functions and show that the Normal-Hedge potential provides the tightest upper bounds on the cumulative regret of the top {\epsilon}-percentile. ]]></description>
<dc:subject>papers to-read heard-the-talk online-learning game-theory SDEs</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:mraginsky/b:1f55d0ab96d4/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:papers"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:to-read"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:heard-the-talk"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:online-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:game-theory"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:SDEs"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://pages.cs.wisc.edu/~balu2901/papers/2016/experts_project.pdf">
    <title>Tight Lower Bounds for the Multiplicative Weights Algorithm on the Experts Problem</title>
    <dc:date>2016-04-10T18:39:52+00:00</dc:date>
    <link>http://pages.cs.wisc.edu/~balu2901/papers/2016/experts_project.pdf</link>
    <dc:creator>mraginsky</dc:creator><dc:subject>papers to-read machine-learning online-learning lower-bounds</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:mraginsky/b:7418040f07ea/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:papers"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:to-read"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:machine-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:online-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:lower-bounds"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1307.5449">
    <title>[1307.5449] Non-stationary Stochastic Optimization</title>
    <dc:date>2015-01-27T23:12:37+00:00</dc:date>
    <link>http://arxiv.org/abs/1307.5449</link>
    <dc:creator>mraginsky</dc:creator><description><![CDATA[We consider a non-stationary variant of a sequential stochastic optimization problem, in which the underlying cost functions may change along the horizon. We propose a measure, termed variation budget, that controls the extent of said change, and study how restrictions on this budget impact achievable performance. We identify sharp conditions under which it is possible to achieve long-run-average optimality and more refined performance measures such as rate optimality that fully characterize the complexity of such problems. In doing so, we also establish a strong connection between two rather disparate strands of literature: adversarial online convex optimization; and the more traditional stochastic approximation paradigm (couched in a non-stationary setting). This connection is the key to deriving well performing policies in the latter, by leveraging structure of optimal policies in the former. Finally, tight bounds on the minimax regret allow us to quantify the "price of non-stationarity," which mathematically captures the added complexity embedded in a temporally changing environment versus a stationary one.]]></description>
<dc:subject>papers to-read optimization online-learning decision-theory</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:mraginsky/b:a98c836f4524/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:papers"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:to-read"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:optimization"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:online-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:decision-theory"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1501.06225">
    <title>[1501.06225] Online Optimization : Competing with Dynamic Comparators</title>
    <dc:date>2015-01-27T20:41:30+00:00</dc:date>
    <link>http://arxiv.org/abs/1501.06225</link>
    <dc:creator>mraginsky</dc:creator><description><![CDATA[Recent literature on online learning has focused on developing adaptive algorithms that take advantage of a regularity of the sequence of observations, yet retain worst-case performance guarantees. A complementary direction is to develop prediction methods that perform well against complex benchmarks. In this paper, we address these two directions together. We present a fully adaptive method that competes with dynamic benchmarks in which regret guarantee scales with regularity of the sequence of cost functions and comparators. Notably, the regret bound adapts to the smaller complexity measure in the problem environment. Finally, we apply our results to drifting zero-sum, two-player games where both players achieve no regret guarantees against best sequences of actions in hindsight.]]></description>
<dc:subject>papers to-read online-learning decision-making</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:mraginsky/b:cade0b9c1de1/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:papers"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:to-read"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:online-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:decision-making"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1405.6076">
    <title>[1405.6076] Online Linear Optimization via Smoothing</title>
    <dc:date>2014-05-31T15:25:24+00:00</dc:date>
    <link>http://arxiv.org/abs/1405.6076</link>
    <dc:creator>mraginsky</dc:creator><description><![CDATA[We present a new optimization-theoretic approach to analyzing Follow-the-Leader style algorithms, particularly in the setting where perturbations are used as a tool for regularization. We show that adding a strongly convex penalty function to the decision rule and adding stochastic perturbations to data correspond to deterministic and stochastic smoothing operations, respectively. We establish an equivalence between "Follow the Regularized Leader" and "Follow the Perturbed Leader" up to the smoothness properties. This intuition leads to a new generic analysis framework that recovers and improves the previous known regret bounds of the class of algorithms commonly known as Follow the Perturbed Leader.]]></description>
<dc:subject>papers to-read online-learning optimization</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:mraginsky/b:833d4e441e54/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:papers"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:to-read"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:online-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:optimization"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1401.6956">
    <title>[1401.6956] A continuous-time approach to online optimization</title>
    <dc:date>2014-02-01T05:41:51+00:00</dc:date>
    <link>http://arxiv.org/abs/1401.6956</link>
    <dc:creator>mraginsky</dc:creator><description><![CDATA[We consider a family of learning strategies for online optimization problems that evolve in continuous time and we show that they lead to no regret. From a more traditional, discrete-time viewpoint, this continuous-time approach allows us to derive the no-regret properties of a large class of discrete-time algorithms including as special cases the exponential weight algorithm, online mirror descent, smooth fictitious play and vanishingly smooth fictitious play. In so doing, we obtain a unified view of many classical regret bounds, and we show that they can be decomposed into a term stemming from continuous-time considerations and a term which measures the disparity between discrete and continuous time. As a result, we obtain a general class of infinite horizon learning strategies that guarantee an (n−1/2) regret bound without having to resort to a doubling trick.]]></description>
<dc:subject>papers to-read online-learning optimization</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:mraginsky/b:e9cf2e4966b8/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:papers"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:to-read"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:online-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:optimization"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1310.2997">
    <title>[1310.2997] Bandits with Switching Costs: T^{2/3} Regret</title>
    <dc:date>2013-10-14T01:33:29+00:00</dc:date>
    <link>http://arxiv.org/abs/1310.2997</link>
    <dc:creator>mraginsky</dc:creator><dc:subject>papers to-read bandit-problems online-learning</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:mraginsky/b:1ff3ed96dcfd/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:papers"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:to-read"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:bandit-problems"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:online-learning"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1207.1965">
    <title>[1207.1965] Forecasting electricity consumption by aggregating specialized experts</title>
    <dc:date>2012-07-10T19:51:15+00:00</dc:date>
    <link>http://arxiv.org/abs/1207.1965</link>
    <dc:creator>mraginsky</dc:creator><dc:subject>papers to-read online-learning prediction electricity-markets via:cshalizi</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:mraginsky/b:e5ccb37e41a3/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:papers"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:to-read"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:online-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:prediction"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:electricity-markets"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:via:cshalizi"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1204.5721">
    <title>[1204.5721] Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems</title>
    <dc:date>2012-04-27T18:20:30+00:00</dc:date>
    <link>http://arxiv.org/abs/1204.5721</link>
    <dc:creator>mraginsky</dc:creator><description><![CDATA[Multi-armed bandit problems are the most basic examples of sequential decision problems with an exploration-exploitation trade-off. This is the balance between staying with the option that gave highest payoffs in the past and exploring new options that might give higher payoffs in the future. Although the study of bandit problems dates back to the Thirties, exploration-exploitation trade-offs arise in several modern applications, such as ad placement, website optimization, and packet routing. Mathematically, a multi-armed bandit is defined by the payoff process associated with each option. In this survey, we focus on two extreme cases in which the analysis of regret is particularly simple and elegant: i.i.d. payoffs and adversarial payoffs. Besides the basic setting of finitely many actions, we also analyze some of the most important variants and extensions, such as the contextual bandit model.]]></description>
<dc:subject>papers to-read bandit-problems online-learning control-theory dynamic-programming decision-making</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:mraginsky/b:11d540786e36/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:papers"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:to-read"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:bandit-problems"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:online-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:control-theory"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:dynamic-programming"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:decision-making"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://metamarkets.com/2011/machine-learning-in-wonderland/">
    <title>Why Generic Machine Learning Fails</title>
    <dc:date>2012-03-10T02:17:47+00:00</dc:date>
    <link>http://metamarkets.com/2011/machine-learning-in-wonderland/</link>
    <dc:creator>mraginsky</dc:creator><dc:subject>blogs machine-learning online-learning algorithms prediction computation</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:mraginsky/b:e27184963641/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:blogs"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:machine-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:online-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:algorithms"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:prediction"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:computation"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://www-bcf.usc.edu/~rusmevic/psfiles/biased-gradient.pdf">
    <title>&quot;Online sequential optimization with biased gradients&quot;</title>
    <dc:date>2012-02-21T15:48:40+00:00</dc:date>
    <link>http://www-bcf.usc.edu/~rusmevic/psfiles/biased-gradient.pdf</link>
    <dc:creator>mraginsky</dc:creator><description><![CDATA[We study a class of stochastic optimization problems where though the objective functions
may not be convex, they satisfy a generalization of convexity, called the sequentially convex
property. We focus on a setting where the distribution of the underlying uncertainty is unknown and the manager must make a decision in real-time based on historical data. Since sequentially convex functions are not necessarily convex, they pose difficulties in applying standard adaptive methods for convex optimization. We propose a nonparametric algorithm based on a gradient descent method, and show that the T-season average expected cost differs from the minimum cost by at most O(1/sqrt(T)). Our analysis is based on a careful quantication of the bias that is inherent in gradient estimation due to the adaptive nature of the problem. We demonstrate the usefulness of the concept of sequential convexity by applying it to three canonical problems in inventory control, capacity allocation, and the lifetime buy decision, under the assumption that the manager does not know the demand distributions and has access only to historical sales (censored demand) data.]]></description>
<dc:subject>papers to-read online-learning optimization convex-programming</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:mraginsky/b:3fc1cff0c08b/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:papers"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:to-read"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:online-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:optimization"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:convex-programming"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1105.2274">
    <title>[1105.2274] Data-Distributed Weighted Majority and Online Mirror Descent</title>
    <dc:date>2012-02-21T03:26:47+00:00</dc:date>
    <link>http://arxiv.org/abs/1105.2274</link>
    <dc:creator>mraginsky</dc:creator><description><![CDATA[In this paper, we focus on the question of the extent to which online learning can benefit from distributed computing. We focus on the setting in which $N$ agents online-learn cooperatively, where each agent only has access to its own data. We propose a generic data-distributed online learning meta-algorithm. We then introduce the Distributed Weighted Majority and Distributed Online Mirror Descent algorithms, as special cases. We show, using both theoretical analysis and experiments, that compared to a single agent: given the same computation time, these distributed algorithms achieve smaller generalization errors; and given the same generalization errors, they can be $N$ times faster.]]></description>
<dc:subject>papers to-read machine-learning online-learning optimization distributed-systems</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:mraginsky/b:2e2387fed77f/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:papers"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:to-read"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:machine-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:online-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:optimization"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:distributed-systems"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1102.2670">
    <title>[1102.2670] Online Least Squares Estimation with Self-Normalized Processes: An Application to Bandit Problems</title>
    <dc:date>2011-12-28T20:07:37+00:00</dc:date>
    <link>http://arxiv.org/abs/1102.2670</link>
    <dc:creator>mraginsky</dc:creator><dc:subject>papers to-read online-learning statistics bandit-problems</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:mraginsky/b:d154eceb3849/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:papers"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:to-read"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:online-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:statistics"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:bandit-problems"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://www.ualberta.ca/~szepesva/papers/lqr_colt_final.pdf">
    <title>Regret Bounds for the Adaptive Control of Linear Quadratic Systems (Abbasi-Yadkori and Szepesvari)</title>
    <dc:date>2011-12-28T19:57:23+00:00</dc:date>
    <link>http://www.ualberta.ca/~szepesva/papers/lqr_colt_final.pdf</link>
    <dc:creator>mraginsky</dc:creator><dc:subject>papers to-read adaptive-systems control-theory online-learning</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:mraginsky/b:4c8b3a58144e/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:papers"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:to-read"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:adaptive-systems"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:control-theory"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:online-learning"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://www.princeton.edu/~sbubeck/BubeckLectureNotes.pdf">
    <title>Sebastien Bubeck, &quot;Introduction to Online Optimization&quot;</title>
    <dc:date>2011-12-22T04:02:24+00:00</dc:date>
    <link>http://www.princeton.edu/~sbubeck/BubeckLectureNotes.pdf</link>
    <dc:creator>mraginsky</dc:creator><description><![CDATA[lecture notes for ORF570: Special Topics in Statistics and Operations Research, Prediction Games (Princeton)]]></description>
<dc:subject>lecture-notes online-learning optimization prediction</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:mraginsky/b:a37ad4736dc0/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:lecture-notes"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:online-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:optimization"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:prediction"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1111.2888">
    <title>Adaptive Regret Minimization in Bounded-Memory Games</title>
    <dc:date>2011-11-16T17:11:01+00:00</dc:date>
    <link>http://arxiv.org/abs/1111.2888</link>
    <dc:creator>mraginsky</dc:creator><dc:subject>papers to-read online-learning game-theory bounded-rationality complexity via:shivak</dc:subject>
<dc:identifier>https://pinboard.in/u:mraginsky/b:e0615fbb18e4/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:papers"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:to-read"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:online-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:game-theory"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:bounded-rationality"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:complexity"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:via:shivak"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://jmlr.csail.mit.edu/papers/v12/perchet11a.html">
    <title>Internal Regret with Partial Monitoring: Calibration-Based Optimal Algorithms</title>
    <dc:date>2011-08-31T00:29:59+00:00</dc:date>
    <link>http://jmlr.csail.mit.edu/papers/v12/perchet11a.html</link>
    <dc:creator>mraginsky</dc:creator><dc:subject>papers to-read online-learning decision-making optimization learning-theory game-theory</dc:subject>
<dc:identifier>https://pinboard.in/u:mraginsky/b:21b23bcabb20/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:papers"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:to-read"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:online-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:decision-making"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:optimization"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:learning-theory"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:game-theory"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1107.4080">
    <title>[1107.4080] On the Universality of Online Mirror Descent</title>
    <dc:date>2011-07-21T00:15:25+00:00</dc:date>
    <link>http://arxiv.org/abs/1107.4080</link>
    <dc:creator>mraginsky</dc:creator><dc:subject>papers to-read online-learning optimization learning-theory convex-programming</dc:subject>
<dc:identifier>https://pinboard.in/u:mraginsky/b:fa3bfc95da2c/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:papers"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:to-read"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:online-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:optimization"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:learning-theory"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:convex-programming"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1105.4701">
    <title>[1105.4701] Online Learning, Stability, and Stochastic Gradient Descent</title>
    <dc:date>2011-05-25T01:06:39+00:00</dc:date>
    <link>http://arxiv.org/abs/1105.4701</link>
    <dc:creator>mraginsky</dc:creator><description><![CDATA[Meh.
]]></description>
<dc:subject>papers online-learning optimization statistical-learning have-read</dc:subject>
<dc:identifier>https://pinboard.in/u:mraginsky/b:c8072df37b93/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:papers"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:online-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:optimization"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:statistical-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:have-read"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://people.csail.mit.edu/costis/optimalNR.pdf">
    <title>Near-optimal no-regret algorithms for zero-sum games</title>
    <dc:date>2011-03-08T22:28:12+00:00</dc:date>
    <link>http://people.csail.mit.edu/costis/optimalNR.pdf</link>
    <dc:creator>mraginsky</dc:creator><dc:subject>papers to-read online-learning game-theory decision-making optimization filetype:pdf media:document</dc:subject>
<dc:identifier>https://pinboard.in/u:mraginsky/b:76e2feea8b5d/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:papers"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:to-read"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:online-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:game-theory"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:decision-making"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:optimization"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:filetype:pdf"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:media:document"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1102.4442">
    <title>[1102.4442] Internal Regret with Partial Monitoring. Calibration-Based Optimal Algorithms</title>
    <dc:date>2011-02-23T02:56:29+00:00</dc:date>
    <link>http://arxiv.org/abs/1102.4442</link>
    <dc:creator>mraginsky</dc:creator><description><![CDATA["We provide consistent random algorithms for sequential decision under partial monitoring, i.e. when the decision maker does not observe the outcomes but receives instead random feedback signals. Those algorithms have no internal regret in the sense that, on the set of stages where the decision maker chose his action according to a given law, the average payoff could not have been improved in average by using any other fixed law. They are based on a generalization of calibration, no longer defined in terms of a Voronoi diagram but instead of a Laguerre diagram (a more general concept). This allows us to bound, for the first time in this general framework, the expected average internal -- as well as the usual external -- regret at stage $n$ by $O(n^{-1/3})$, which is known to be optimal."
]]></description>
<dc:subject>papers to-read learning-theory online-learning game-theory optimization</dc:subject>
<dc:identifier>https://pinboard.in/u:mraginsky/b:56261615f1b1/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:papers"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:to-read"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:learning-theory"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:online-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:game-theory"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:optimization"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1102.0710">
    <title>[1102.0710] Universal Prior Prediction for Communication</title>
    <dc:date>2011-02-04T01:13:59+00:00</dc:date>
    <link>http://arxiv.org/abs/1102.0710</link>
    <dc:creator>mraginsky</dc:creator><description><![CDATA["We consider the problem of communicating over an unknown and arbitrarily varying channel, using feedback. This paper focuses on the problem of determining the input behavior, or more specifically, a prior which is used to randomly generate a codebook. We pose the problem of setting the prior as a universal sequential prediction problem using information theoretic abstractions of the communication channel. For the case where the channel is block-wise constant, we show it is possible to asymptotically approach the best rate that can be attained by any system using a fixed prior. For the case where the channel may change on each symbol, we combine a rateless coding scheme with a prior predictor and asymptotically approach the capacity of the average channel universally for every sequence of channels."
]]></description>
<dc:subject>papers to-read information-theory feedback-information-theory prediction online-learning</dc:subject>
<dc:identifier>https://pinboard.in/u:mraginsky/b:8e446b901115/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:papers"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:to-read"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:information-theory"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:feedback-information-theory"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:prediction"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:online-learning"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1011.3168">
    <title>[1011.3168] Online Learning: Beyond Regret</title>
    <dc:date>2010-11-16T05:40:24+00:00</dc:date>
    <link>http://arxiv.org/abs/1011.3168</link>
    <dc:creator>mraginsky</dc:creator><dc:subject>papers to-read online-learning game-theory decision-making statistical-learning</dc:subject>
<dc:identifier>https://pinboard.in/u:mraginsky/b:1e209136787d/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:papers"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:to-read"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:online-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:game-theory"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:decision-making"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:statistical-learning"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1011.1936">
    <title>[1011.1936] Blackwell Approachability and Low-Regret Learning are Equivalent</title>
    <dc:date>2010-11-10T01:29:58+00:00</dc:date>
    <link>http://arxiv.org/abs/1011.1936</link>
    <dc:creator>mraginsky</dc:creator><description><![CDATA["We consider the celebrated Blackwell Approachability Theorem for two-player games with vector payoffs. We show that Blackwell's result is equivalent, via efficient reductions, to the existence of "no-regret" algorithms for Online Linear Optimization. Indeed, we show that any algorithm for one such problem can be efficiently converted into an algorithm for the other. We provide a useful application of this reduction: the first efficient algorithm for calibrated forecasting."
]]></description>
<dc:subject>to-read papers online-learning decision-making game-theory optimization</dc:subject>
<dc:identifier>https://pinboard.in/u:mraginsky/b:51e700a8d185/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:to-read"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:papers"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:online-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:decision-making"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:game-theory"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:optimization"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1011.0686">
    <title>[1011.0686] No-Regret Reductions for Imitation Learning and Structured Prediction</title>
    <dc:date>2010-11-03T00:32:48+00:00</dc:date>
    <link>http://arxiv.org/abs/1011.0686</link>
    <dc:creator>mraginsky</dc:creator><dc:subject>papers to-read machine-learning online-learning prediction</dc:subject>
<dc:identifier>https://pinboard.in/u:mraginsky/b:a70a1ab73543/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:papers"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:to-read"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:machine-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:online-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:prediction"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1008.4654">
    <title>[1008.4654] Freezing and Sleeping: Tracking Experts that Learn by Evolving Past Posteriors</title>
    <dc:date>2010-08-30T00:20:59+00:00</dc:date>
    <link>http://arxiv.org/abs/1008.4654</link>
    <dc:creator>mraginsky</dc:creator><description><![CDATA[Hot damn! "... in this paper we re-examine Freund's problem in case the experts have internal structure that enables them to learn. In this case the problem has two possible interpretations: should the experts learn from all data or only from the subsequence on which they are being tracked? The MPP algorithm solves the first case. Our contribution is to generalise MPP to address the second option. The results we obtain apply to any expert structure that can be formalised using (expert) hidden Markov models. Curiously enough, for our interpretation there are \emph{two} natural reference schemes: freezing and sleeping. For each scheme, we provide an efficient prediction strategy and prove the relevant loss bound."
]]></description>
<dc:subject>papers to-read online-learning decision-making machine-learning algorithms</dc:subject>
<dc:identifier>https://pinboard.in/u:mraginsky/b:46c0e21d3026/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:papers"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:to-read"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:online-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:decision-making"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:machine-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:algorithms"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://research.microsoft.com/apps/pubs/default.aspx?id=81322">
    <title>Dual Averaging Methods for Regularized Stochastic Learning and Online Optimization - Microsoft Research</title>
    <dc:date>2010-07-28T19:49:22+00:00</dc:date>
    <link>http://research.microsoft.com/apps/pubs/default.aspx?id=81322</link>
    <dc:creator>mraginsky</dc:creator><dc:subject>papers to-read optimization online-learning convex-programming</dc:subject>
<dc:identifier>https://pinboard.in/u:mraginsky/b:892d2ce9b8dd/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:papers"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:to-read"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:optimization"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:online-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:convex-programming"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1006.4039">
    <title>[1006.4039] Cooperative Autonomous Online Learning</title>
    <dc:date>2010-06-23T16:54:25+00:00</dc:date>
    <link>http://arxiv.org/abs/1006.4039</link>
    <dc:creator>mraginsky</dc:creator><description><![CDATA[Abstract: "Online learning is becoming increasingly popular for training on large datasets. However, the sequential nature of online learning requires a centralized learner to store data and update parameters. In this paper, we consider a fully decentralized setting, cooperative autonomous online learning, with a distributed data source. The learners perform learning with local parameters while periodically communicating with a small subset of neighbors to exchange information. We define the regret in terms of an implicit aggregated parameter of the learners for such a setting and prove regret bounds similar to the classical sequential online learning." Damn, looks like I've been scooped!
]]></description>
<dc:subject>papers have-read decision-making online-learning distributed-systems</dc:subject>
<dc:identifier>https://pinboard.in/u:mraginsky/b:5098c941c00f/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:papers"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:have-read"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:decision-making"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:online-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:distributed-systems"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1006.1138">
    <title>[1006.1138] Online Learning: Random Averages, Combinatorial Parameters, and Learnability</title>
    <dc:date>2010-06-08T00:30:54+00:00</dc:date>
    <link>http://arxiv.org/abs/1006.1138</link>
    <dc:creator>mraginsky</dc:creator><description><![CDATA["We study learnability in the online learning model. We define several complexity measures which capture the difficulty of learning in a sequential manner. Among these measures are analogues of Rademacher complexity, covering numbers and fat shattering dimension from statistical learning theory. Relationship among these complexity measures, their connection to online learning, and tools for bounding them are provided. In the setting of supervised learning, finiteness of the introduced scale-sensitive parameters is shown to be equivalent to learnability. The complexities we define also ensure uniform convergence non-i.i.d. data, extending the uniform Glivenko-Cantelli type results. We conclude by showing online learnability for an array of examples."
]]></description>
<dc:subject>papers to-read learning-theory online-learning measure-concentration</dc:subject>
<dc:identifier>https://pinboard.in/u:mraginsky/b:898067021cf4/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:papers"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:to-read"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:learning-theory"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:online-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:mraginsky/t:measure-concentration"/>
</rdf:Bag></taxo:topics>
</item>
</rdf:RDF>