<?xml version="1.0" encoding="UTF-8"?>
 <rdf:RDF xmlns="http://purl.org/rss/1.0/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:cc="http://web.resource.org/cc/" xmlns:syn="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/">
  <channel rdf:about="http://pinboard.in">
    <title>Pinboard (Vaguery)</title>
    <link>https://pinboard.in/u:Vaguery/public/</link>
    <description>recent bookmarks from Vaguery</description>
    <items>
      <rdf:Seq>	<rdf:li rdf:resource="https://arxiv.org/abs/1312.6055v3"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/2203.01628"/>
	<rdf:li rdf:resource="https://people.smp.uq.edu.au/BenjaminBurton/papers/burton08-creating.pdf"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/2210.13971"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/2302.04752"/>
	<rdf:li rdf:resource="https://openai.com/blog/formal-math/"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/2107.00110"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/2110.14207"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/2107.00613"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/2106.05784"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/1702.06976"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/1807.09196"/>
	<rdf:li rdf:resource="https://ddd.fit.cvut.cz/prj/Benchmarks/index.php?page=download"/>
	<rdf:li rdf:resource="https://www.oldcitypublishing.com/journals/jca-home/jca-issue-contents/jca-volume-14-number-3-4-2019/jca-14-3-4-p-191-212/"/>
	<rdf:li rdf:resource="https://hackingsemantics.xyz/2019/leaderboards/"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/1709.09683"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/1808.05850"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/1903.07008"/>
	<rdf:li rdf:resource="https://osf.io/preprints/socarxiv/j2tw9/"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/1802.04064"/>
	<rdf:li rdf:resource="http://library.msri.org/books/Book52/files/29toth.pdf"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/1711.08477"/>
	<rdf:li rdf:resource="https://tspace.library.utoronto.ca/handle/1807/92089"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/1811.10665"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/1709.09130"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/1710.04640"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/1705.04587"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/1709.08461"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/1607.05342"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/1606.07163"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/1709.06009"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/1709.01670"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/1412.1913"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/1708.03228"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/1708.05070"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/1704.00630"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/1704.00568"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/1705.04665"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/1707.09627"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/1707.00044"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/1707.06374"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/1705.00317"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/1703.00512"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/cs/0509032"/>
	<rdf:li rdf:resource="http://search.arxiv.org:8081/paper.jsp?r=1611.03398&amp;qid=1491475924245ler_nCnN_397128995&amp;qs=%22magic+square%22&amp;byDate=1"/>
	<rdf:li rdf:resource="http://www.palgrave-journals.com/articles/palcomms2016105"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/1501.06813"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/1611.03530"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/1702.01446"/>
	<rdf:li rdf:resource="https://arxiv.org/abs/1612.00423"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1604.08237"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1501.03879"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1108.3860"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1505.00449"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1505.00468"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1508.06773"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1401.7543"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1508.01045"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1406.7424"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1502.05698"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1501.05382"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1404.6193"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1312.1858"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1309.7099"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1308.2411"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1309.0534"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1301.4092"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1012.5205"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1304.3780"/>
	<rdf:li rdf:resource="http://arxiv.org/abs/1301.1907"/>
      </rdf:Seq>
    </items>
  </channel>
<item rdf:about="https://arxiv.org/abs/1312.6055v3">
    <title>[1312.6055v3] Unit Tests for Stochastic Optimization</title>
    <dc:date>2026-07-04T13:14:28+00:00</dc:date>
    <link>https://arxiv.org/abs/1312.6055v3</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[Optimization by stochastic gradient descent is an important component of many large-scale machine learning algorithms. A wide variety of such optimization algorithms have been devised; however, it is unclear whether these algorithms are robust and widely applicable across many different optimization landscapes. In this paper we develop a collection of unit tests for stochastic optimization. Each unit test rapidly evaluates an optimization algorithm on a small-scale, isolated, and well-understood difficulty, rather than in real-world scenarios where many such issues are entangled. Passing these unit tests is not sufficient, but absolutely necessary for any algorithms with claims to generality or robustness. We give initial quantitative and qualitative results on numerous established algorithms. The testing framework is open-source, extensible, and easy to apply to new algorithms.
]]></description>
<dc:subject>benchmarking operations-research unit-testing performance-measure rather-interesting metaheuristics neural-networks machine-learning to-write-about to-use</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:ebbe888f0ef2/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:operations-research"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:unit-testing"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:performance-measure"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:metaheuristics"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:neural-networks"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:machine-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:to-write-about"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:to-use"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/2203.01628">
    <title>[2203.01628] Early Time-Series Classification Algorithms: An Empirical Comparison</title>
    <dc:date>2026-02-20T15:25:10+00:00</dc:date>
    <link>https://arxiv.org/abs/2203.01628</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[Early Time-Series Classification (ETSC) is the task of predicting the class of incoming time-series by observing as few measurements as possible. Such methods can be employed to obtain classification forecasts in many time-critical applications. However, available techniques are not equally suitable for every problem, since differentiations in the data characteristics can impact algorithm performance in terms of earliness, accuracy, F1-score, and training time. We evaluate six existing ETSC algorithms on publicly available data, as well as on two newly introduced datasets originating from the life sciences and maritime domains. Our goal is to provide a framework for the evaluation and comparison of ETSC algorithms and to obtain intuition on how such approaches perform on real-life applications. The presented framework may also serve as a benchmark for new related techniques.
]]></description>
<dc:subject>time-series classification machine-learning algorithms rather-interesting to-understand benchmarking</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:f60942d80779/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:time-series"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:classification"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:machine-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:algorithms"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:to-understand"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://people.smp.uq.edu.au/BenjaminBurton/papers/burton08-creating.pdf">
    <title>Creating Informatics Olympiad Tasks: Exploring the Black Art</title>
    <dc:date>2024-09-21T14:10:07+00:00</dc:date>
    <link>https://people.smp.uq.edu.au/BenjaminBurton/papers/burton08-creating.pdf</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[Each year a wealth of informatics olympiads are held worldwide at national, regional and international levels, all of which require engaging and challenging tasks that have not been seen before. Nevertheless, creating high quality tasks can be a difficult and time-consuming process. In this paper we explore some of the different techniques that problem setters can use to find new ideas for tasks and refine these ideas into problems suitable for an informatics olympiad. These techniques are illustrated through concrete examples from a variety of contests.]]></description>
<dc:subject>pedagogy rather-interesting philosophy-of-engineering mathematical-recreations benchmarking to-write-about consider:performance-measures consider:catalog-of-heuristic-maneuvers</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:fa61d133a55d/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:pedagogy"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:philosophy-of-engineering"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:mathematical-recreations"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:to-write-about"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:performance-measures"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:catalog-of-heuristic-maneuvers"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/2210.13971">
    <title>[2210.13971] Cellular Automata: Temporal Stochasticity and Computability</title>
    <dc:date>2024-08-06T12:52:23+00:00</dc:date>
    <link>https://arxiv.org/abs/2210.13971</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[In this dissertation, we study temporal stochasticity in cellular automata and the behavior of such cellular automata. The work also explores the computational ability of such cellular automata, illustrated by solving the affinity classification problem. In addition, a cellular automaton defined over a Cayley tree is shown to solve the classical searching problem. The proposed temporally stochastic cellular automata deal with two elementary cellular automata rules, say f and g. Rule f is the default rule; however, g is temporally applied to the overall system with some probability τ, which acts as noise in the system. After exploring the dynamics of temporally stochastic cellular automata (TSCAs), we study their dynamical behavior to identify the TSCAs that converge to a fixed point from any seed. We apply each of the convergent TSCAs to some standard datasets and observe the effectiveness of each TSCA as a pattern classifier. It is observed that the proposed TSCA-based classifier shows competitive performance in comparison with existing classifier algorithms. We use temporally stochastic cellular automata to solve a new problem in the field of cellular automata, named the affinity classification problem, which is a generalization of the density classification problem. We show that this model can be used in several applications, like modeling self-healing systems. Finally, we introduce a new model of computing unit developed around cellular automata to reduce the workload of the Central Processing Unit (CPU) of a machine to compute. Each cell of the computing unit acts as a tiny processing element with attached memory. Such a CA is implemented on the Cayley Tree to realize efficient solutions for diverse computational problems.
]]></description>
<dc:subject>rather-interesting cellular-automata nonlinear-dynamics stochastic-systems to-write-about to-simulate benchmarking looking-to-see</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:fd26926e5ed8/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:cellular-automata"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:nonlinear-dynamics"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:stochastic-systems"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:to-write-about"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:to-simulate"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:looking-to-see"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/2302.04752">
    <title>[2302.04752] Benchmarks for Automated Commonsense Reasoning: A Survey</title>
    <dc:date>2023-09-09T13:25:32+00:00</dc:date>
    <link>https://arxiv.org/abs/2302.04752</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[More than one hundred benchmarks have been developed to test the commonsense knowledge and commonsense reasoning abilities of artificial intelligence (AI) systems. However, these benchmarks are often flawed and many aspects of common sense remain untested. Consequently, we do not currently have any reliable way of measuring to what extent existing AI systems have achieved these abilities. This paper surveys the development and uses of AI commonsense benchmarks. We discuss the nature of common sense; the role of common sense in AI; the goals served by constructing commonsense benchmarks; and desirable features of commonsense benchmarks. We analyze the common flaws in benchmarks, and we argue that it is worthwhile to invest the work needed to ensure that benchmark examples are consistently high quality. We survey the various methods of constructing commonsense benchmarks. We enumerate 139 commonsense benchmarks that have been developed: 102 text-based, 18 image-based, 12 video-based, and 7 simulated physical environments. We discuss the gaps in the existing benchmarks and aspects of commonsense reasoning that are not addressed in any existing benchmark. We conclude with a number of recommendations for future development of commonsense AI benchmarks.
]]></description>
<dc:subject>benchmarking artificial-intelligence natural-language-processing common-sense rather-interesting review</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:d15906957839/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:artificial-intelligence"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:natural-language-processing"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:common-sense"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:review"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://openai.com/blog/formal-math/">
    <title>Solving (Some) Formal Math Olympiad Problems</title>
    <dc:date>2022-05-28T11:48:54+00:00</dc:date>
    <link>https://openai.com/blog/formal-math/</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[We achieved a new state-of-the-art (41.2% vs 29.3%) on the miniF2F benchmark, a challenging collection of high-school olympiad problems. Our approach, which we call statement curriculum learning, consists of manually collecting a set of statements of varying difficulty levels (without proof) where the hardest statements are similar to the benchmark we target. Initially our neural prover is weak and can only prove a few of them. We iteratively search for new proofs and re-train our neural network on the newly discovered proofs, and after 8 iterations, our prover ends up being vastly superior when tested on miniF2F.]]></description>
<dc:subject>natural-language-processing mathematical-recreations proof machine-learning rather-interesting benchmarking to-write-about consider:representation</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:c64f1e2a78ad/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:natural-language-processing"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:mathematical-recreations"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:proof"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:machine-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:to-write-about"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:representation"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/2107.00110">
    <title>[2107.00110] Classical Planning in Deep Latent Space</title>
    <dc:date>2022-04-19T10:24:46+00:00</dc:date>
    <link>https://arxiv.org/abs/2107.00110</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[Current domain-independent, classical planners require symbolic models of the problem domain and instance as input, resulting in a knowledge acquisition bottleneck. Meanwhile, although deep learning has achieved significant success in many fields, the knowledge is encoded in a subsymbolic representation which is incompatible with symbolic systems such as planners. We propose Latplan, an unsupervised architecture combining deep learning and classical planning. Given only an unlabeled set of image pairs showing a subset of transitions allowed in the environment (training inputs), Latplan learns a complete propositional PDDL action model of the environment. Later, when a pair of images representing the initial and the goal states (planning inputs) is given, Latplan finds a plan to the goal state in a symbolic latent space and returns a visualized plan execution. We evaluate Latplan using image-based versions of 6 planning domains: 8-puzzle, 15-Puzzle, Blocksworld, Sokoban, and two variations of LightsOut.
]]></description>
<dc:subject>machine-learning artificial-intelligence planning operations-research rather-interesting algorithms looking-to-see benchmarking</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:bcc7415665a8/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:machine-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:artificial-intelligence"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:planning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:operations-research"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:algorithms"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:looking-to-see"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/2110.14207">
    <title>[2110.14207] How Much Coffee Was Consumed During EMNLP 2019? Fermi Problems: A New Reasoning Challenge for AI</title>
    <dc:date>2022-02-21T09:23:15+00:00</dc:date>
    <link>https://arxiv.org/abs/2110.14207</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[Many real-world problems require the combined application of multiple reasoning abilities employing suitable abstractions, commonsense knowledge, and creative synthesis of problem-solving strategies. To help advance AI systems towards such capabilities, we propose a new reasoning challenge, namely Fermi Problems (FPs), which are questions whose answers can only be approximately estimated because their precise computation is either impractical or impossible. For example, "How much would the sea level rise if all ice in the world melted?" FPs are commonly used in quizzes and interviews to bring out and evaluate the creative reasoning abilities of humans. To do the same for AI systems, we present two datasets: 1) A collection of 1k real-world FPs sourced from quizzes and olympiads; and 2) a bank of 10k synthetic FPs of intermediate complexity to serve as a sandbox for the harder real-world challenge. In addition to question answer pairs, the datasets contain detailed solutions in the form of an executable program and supporting facts, helping in supervision and evaluation of intermediate steps. We demonstrate that even extensively fine-tuned large scale language models perform poorly on these datasets, on average making estimates that are off by two orders of magnitude. Our contribution is thus the crystallization of several unsolved AI problems into a single, new challenge that we hope will spur further advances in building systems that can reason.
]]></description>
<dc:subject>natural-language-processing artificial-intelligence rather-interesting define-your-terms benchmarking approximation to-write-about consider:representation consider:metaheuristics</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:d5af6eaea474/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:natural-language-processing"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:artificial-intelligence"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:define-your-terms"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:approximation"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:to-write-about"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:representation"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:metaheuristics"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/2107.00613">
    <title>[2107.00613] EqFix: Fixing LaTeX Equation Errors by Examples</title>
    <dc:date>2022-02-05T13:25:37+00:00</dc:date>
    <link>https://arxiv.org/abs/2107.00613</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[LaTeX is a widely-used document preparation system. Its powerful ability in mathematical equation editing is perhaps the main reason for its popularity in academia. Sometimes, however, even an expert user may spend much time on fixing an erroneous equation. In this paper, we present EqFix, a synthesis-based repairing system for LaTeX equations. It employs a set of fixing rules, and can suggest possible repairs for common errors in LaTeX equations. A domain specific language is proposed for formally expressing the fixing rules. The fixing rules can be automatically synthesized from a set of input-output examples. An extension of relaxer is also introduced to enhance the practicality of EqFix. We evaluate EqFix on real-world examples and find that it can synthesize rules with high generalization ability. Compared with a state-of-the-art string transformation synthesizer, EqFix solved 37% more cases and spent only one third of their synthesis time.
]]></description>
<dc:subject>software-synthesis LaTeX rather-interesting benchmarking nudge-targets consider:representation consider:grammars-as-such consider:quirky-ad-hoc-domains consider:oh-god-now-do-tables</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:37d525a27f5a/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:software-synthesis"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:LaTeX"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:nudge-targets"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:representation"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:grammars-as-such"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:quirky-ad-hoc-domains"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:oh-god-now-do-tables"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/2106.05784">
    <title>[2106.05784] Programming Puzzles</title>
    <dc:date>2022-01-30T11:49:07+00:00</dc:date>
    <link>https://arxiv.org/abs/2106.05784</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[We introduce a new type of programming challenge called programming puzzles, as an objective and comprehensive evaluation of program synthesis, and release an open-source dataset of Python Programming Puzzles (P3). Each puzzle is defined by a short Python program f, and the goal is to find an input which makes f return True. The puzzles are objective in that each one is specified entirely by the source code of its verifier f, so evaluating f is all that is needed to test a candidate solution. They do not require an answer key or input/output examples, nor do they depend on natural language understanding. The dataset is comprehensive in that it spans problems of a range of difficulties and domains, ranging from trivial string manipulation problems, to classic programming puzzles (e.g., Tower of Hanoi), to interview/competitive-programming problems (e.g., dynamic programming), to longstanding open problems in algorithms and mathematics (e.g., factoring). We develop baseline enumerative program synthesis, GPT-3 and Codex solvers that are capable of solving puzzles -- even without access to any reference solutions -- by learning from their own past solutions. Codex performs best, solving up to 18% of 397 test problems with a single try and 80% of the problems with 1,000 tries per problem. In a small user study, we find a positive correlation between puzzle-solving performance and coding experience, and between the puzzle difficulty for humans and AI solvers. Therefore, further improvements on P3 could have a significant impact on many program synthesis areas.
]]></description>
<dc:subject>program-synthesis benchmarking genetic-programming machine-learning rather-interesting reinvented-wheels see:PSB-training-sets</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:68d17d4b1b00/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:program-synthesis"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:genetic-programming"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:machine-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:reinvented-wheels"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:see:PSB-training-sets"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/1702.06976">
    <title>[1702.06976] Heavy-Tailed Analogues of the Covariance Matrix for ICA</title>
    <dc:date>2020-05-22T21:33:57+00:00</dc:date>
    <link>https://arxiv.org/abs/1702.06976</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[Independent Component Analysis (ICA) is the problem of learning a square matrix A, given samples of X=AS, where S is a random vector with independent coordinates. Most existing algorithms are provably efficient only when each S_i has finite and moderately valued fourth moment. However, there are practical applications where this assumption need not be true, such as speech and finance. Algorithms have been proposed for heavy-tailed ICA, but they are not practical, using random walks and the full power of the ellipsoid algorithm multiple times. The main contributions of this paper are:
(1) A practical algorithm for heavy-tailed ICA that we call HTICA. We provide theoretical guarantees and show that it outperforms other algorithms in some heavy-tailed regimes, both on real and synthetic data. Like the current state-of-the-art, the new algorithm is based on the centroid body (a first moment analogue of the covariance matrix). Unlike the state-of-the-art, our algorithm is practically efficient. To achieve this, we use explicit analytic representations of the centroid body, which bypasses the use of the ellipsoid method and random walks. 
(2) We study how heavy tails affect different ICA algorithms, including HTICA. Somewhat surprisingly, we show that some algorithms that use the covariance matrix or higher moments can successfully solve a range of ICA instances with infinite second moment. We study this theoretically and experimentally, with both synthetic and real-world heavy-tailed data.
]]></description>
<dc:subject>machine-learning benchmarking matrices rather-interesting to-write-about nudge-targets consider:representation consider:ReQ algorithms computational-complexity</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:d40bd17995ea/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:machine-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:matrices"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:to-write-about"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:nudge-targets"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:representation"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:ReQ"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:algorithms"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:computational-complexity"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/1807.09196">
    <title>[1807.09196] A Convex Formulation for Binary Tomography</title>
    <dc:date>2020-01-08T13:21:16+00:00</dc:date>
    <link>https://arxiv.org/abs/1807.09196</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[Binary tomography is concerned with the recovery of binary images from a few of their projections (i.e., sums of the pixel values along various directions). To reconstruct an image from noisy projection data, one can pose it as a constrained least-squares problem. As the constraints are non-convex, many approaches for solving it rely on either relaxing the constraints or heuristics. In this paper we propose a novel convex formulation, based on the Lagrange dual of the constrained least-squares problem. The resulting problem is a generalized LASSO problem which can be solved efficiently. It is a relaxation in the sense that it can only be guaranteed to give a feasible solution; not necessarily the optimal one. In exhaustive experiments on small images (2x2, 3x3, 4x4) we find, however, that if the problem has a unique solution, our dual approach finds it. In case of multiple solutions, our approach finds the commonalities between the solutions. Further experiments on realistic numerical phantoms and an experiment on an X-ray dataset show that our method compares favourably to Total Variation and DART.
]]></description>
<dc:subject>tomography inverse-problems rather-interesting benchmarking operations-research optimization to-write-about mathematical-programming consider:genetic-programming to-simulate</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:018cc48e6971/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:tomography"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:inverse-problems"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:operations-research"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:optimization"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:to-write-about"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:mathematical-programming"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:genetic-programming"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:to-simulate"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://ddd.fit.cvut.cz/prj/Benchmarks/index.php?page=download">
    <title>Collection of Digital Design Benchmarks</title>
    <dc:date>2019-07-29T11:03:27+00:00</dc:date>
    <link>https://ddd.fit.cvut.cz/prj/Benchmarks/index.php?page=download</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[Collection of Digital Design Benchmarks]]></description>
<dc:subject>benchmarking engineering-design circuits genetic-programming constraint-satisfaction performance-measure to-write-about boolean-networks</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:db5180be6a39/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:engineering-design"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:circuits"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:genetic-programming"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:constraint-satisfaction"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:performance-measure"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:to-write-about"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:boolean-networks"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://www.oldcitypublishing.com/journals/jca-home/jca-issue-contents/jca-volume-14-number-3-4-2019/jca-14-3-4-p-191-212/">
    <title>JCA 14.3-4, p. 191-212 – Old City Publishing</title>
    <dc:date>2019-07-29T10:52:01+00:00</dc:date>
    <link>https://www.oldcitypublishing.com/journals/jca-home/jca-issue-contents/jca-volume-14-number-3-4-2019/jca-14-3-4-p-191-212/</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[The density classification problem is one of the most studied problems in the context of the computational abilities of cellular automata. Since this problem cannot be solved in the classical sense, we consider a weaker version, by slightly relaxing the assumptions on the output specification. In this paper, we discuss this relaxed problem for two-dimensional Affine Continuous Cellular Automata (ACCAs). We focus on finding the most performant rules solving this problem among the density-conserving ones by evaluating ACCAs experimentally for a predefined set of initial configurations.

]]></description>
<dc:subject>cellular-automata benchmarking rather-interesting out-of-the-box to-simulate to-write-about consider:algorothm-structure</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:4983ada244b8/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:cellular-automata"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:out-of-the-box"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:to-simulate"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:to-write-about"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:algorothm-structure"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://hackingsemantics.xyz/2019/leaderboards/">
    <title>How the Transformers broke NLP leaderboards - Hacking semantics</title>
    <dc:date>2019-07-24T10:44:04+00:00</dc:date>
    <link>https://hackingsemantics.xyz/2019/leaderboards/</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[If leaderboards are to highlight the actual progress, we need to incentivize new architectures rather than teams outspending each other. Obviously, huge pretrained models are valuable, but unless the authors show that their system consistently behaves differently from its competition with comparable data & compute, it is not clear whether they are presenting a model or a resource.

Furthermore, much of this research is not reproducible: nobody is going to spend $250,000 just to repeat XLNet training. Given the fact that its ablation study showed only 1-2% gain over BERT in 3 datasets out of 4 (Yang et al., 2019), we don’t actually know for sure that its masking strategy is more successful than BERT’s.

At the same time, the development of leaner models is dis-incentivized, as their task is fundamentally harder and the leaderboard-oriented community only rewards the SOTA. That, in its turn, prices out of competitions academic teams, which will not result in students becoming better engineers when they graduate.

Last but not least, huge DL models are often overparametrized (Frankle & Carbin, 2019; Wu, Fan, Baevski, Dauphin, & Auli, 2019). As an example, the smaller version of BERT achieves better scores on a number of syntax-testing experiments than the larger one (Goldberg, 2019). The fact that DL models require a lot of compute is not necessarily a bad thing in itself, but wasting compute is not ideal for the environment (Strubell, Ganesh, & McCallum, 2019).

]]></description>
<dc:subject>benchmarking machine-learning horse-races performance-measure multiobjective-optimization (use-it)</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:bd8204972f3a/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:machine-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:horse-races"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:performance-measure"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:multiobjective-optimization"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:(use-it)"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/1709.09683">
    <title>[1709.09683] Exact Camera Location Recovery by Least Unsquared Deviations</title>
    <dc:date>2019-04-27T11:57:40+00:00</dc:date>
    <link>https://arxiv.org/abs/1709.09683</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[We establish exact recovery for the Least Unsquared Deviations (LUD) algorithm of Ozyesil and Singer. More precisely, we show that for sufficiently many cameras with given corrupted pairwise directions, where both camera locations and pairwise directions are generated by a special probabilistic model, the LUD algorithm exactly recovers the camera locations with high probability. A similar exact recovery guarantee was established for the ShapeFit algorithm by Hand, Lee and Voroninski, but with typically less corruption.
]]></description>
<dc:subject>inverse-problems image-processing rather-interesting benchmarking nudge-targets consider:feature-discovery consider:rediscovery</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:1ea9d5a60ac9/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:inverse-problems"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:image-processing"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:nudge-targets"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:feature-discovery"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:rediscovery"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/1808.05850">
    <title>[1808.05850] Towards a Theory-Guided Benchmarking Suite for Discrete Black-Box Optimization Heuristics: Profiling $(1+λ)$ EA Variants on OneMax and LeadingOnes</title>
    <dc:date>2019-04-23T11:08:49+00:00</dc:date>
    <link>https://arxiv.org/abs/1808.05850</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[Theoretical and empirical research on evolutionary computation methods complement each other by providing two fundamentally different approaches towards a better understanding of black-box optimization heuristics. In discrete optimization, both streams developed rather independently of each other, but we observe today an increasing interest in reconciling these two sub-branches. In continuous optimization, the COCO (COmparing Continuous Optimisers) benchmarking suite has established itself as an important platform that theoreticians and practitioners use to exchange research ideas and questions. No widely accepted equivalent exists in the research domain of discrete black-box optimization. 
Marking an important step towards filling this gap, we adjust the COCO software to pseudo-Boolean optimization problems, and obtain from this a benchmarking environment that allows a fine-grained empirical analysis of discrete black-box heuristics. In this documentation we demonstrate how this test bed can be used to profile the performance of evolutionary algorithms. More concretely, we study the optimization behavior of several (1+λ) EA variants on the two benchmark problems OneMax and LeadingOnes. This comparison motivates a refined analysis for the optimization time of the (1+λ) EA on LeadingOnes.
]]></description>
<dc:subject>evolutionary-algorithms metaheuristics horse-races performance-measure rather-interesting to-write-about consider:classification benchmarking consider:open-ended-problems</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:535c083a1d4f/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:evolutionary-algorithms"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:metaheuristics"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:horse-races"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:performance-measure"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:to-write-about"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:classification"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:open-ended-problems"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/1903.07008">
    <title>[1903.07008] Leveling the Playing Field -- Fairness in AI Versus Human Game Benchmarks</title>
    <dc:date>2019-04-05T11:01:16+00:00</dc:date>
    <link>https://arxiv.org/abs/1903.07008</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[From the beginning of the history of AI, there has been interest in games as a platform of research. As the field developed, human-level competence in complex games became a target researchers worked to reach. Only relatively recently has this target been finally met for traditional tabletop games such as Backgammon, Chess and Go. Current research focus has shifted to electronic games, which provide unique challenges. As is often the case with AI research, these results are liable to be exaggerated or misrepresented by either authors or third parties. The extent to which these game benchmarks consist of fair competition between human and AI is also a matter of debate. In this work, we review the statements made by authors and third parties in the general media and academic circles about these game benchmark results and discuss factors that can impact the perception of fairness in the contest between humans and machines.
]]></description>
<dc:subject>engineering-criticism rather-interesting hey-I-know-this-guy performance-measure what-gets-measured-gets-fudged artificial-intelligence games machine-learning to-write-about benchmarking</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:f51b8ceb0e60/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:engineering-criticism"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:hey-I-know-this-guy"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:performance-measure"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:what-gets-measured-gets-fudged"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:artificial-intelligence"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:games"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:machine-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:to-write-about"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://osf.io/preprints/socarxiv/j2tw9/">
    <title>SocArXiv Papers | Scaling Down Inequality: Rating Scales, Gender Bias, and the Architecture of Evaluation</title>
    <dc:date>2019-03-03T15:28:48+00:00</dc:date>
    <link>https://osf.io/preprints/socarxiv/j2tw9/</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[Quantitative performance ratings are ubiquitous in modern organizations—from businesses to universities—yet there is substantial evidence of bias against women in such ratings. This study examines how gender inequalities in evaluations depend on the design of the tools used to judge merit. Exploiting a quasi-natural experiment at a large North American university, we found that the number of scale points used in faculty teaching evaluations—whether instructors were rated on a scale of 6 versus a scale of 10—significantly affected the size of the gender gap in evaluations. A survey experiment, which presented all participants with an identical lecture transcript but randomly varied instructor gender and the number of scale points, replicated this finding and suggested that the number of scale points affects the extent to which gender stereotypes of brilliance are expressed in quantitative ratings. These results highlight how seemingly minor technical aspects of performance ratings can have a major effect on the evaluation of men and women. Our findings thus contribute to a growing body of work on organizational practices that reduce workplace inequalities and the sociological literature on how rating systems—rather than being neutral instruments—shape the distribution of rewards in organizations.

]]></description>
<dc:subject>academia what-gets-measured-gets-fudged performance-measure corporatism benchmarking academic-culture sexism bias technocracy</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:3ff2e89222e5/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:academia"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:what-gets-measured-gets-fudged"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:performance-measure"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:corporatism"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:academic-culture"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:sexism"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:bias"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:technocracy"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/1802.04064">
    <title>[1802.04064] A Contextual Bandit Bake-off</title>
    <dc:date>2019-03-02T12:48:26+00:00</dc:date>
    <link>https://arxiv.org/abs/1802.04064</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[Contextual bandit algorithms are essential for solving many real-world interactive machine learning problems. Despite multiple recent successes on statistically and computationally efficient methods, the practical behavior of these algorithms is still poorly understood. We leverage the availability of large numbers of supervised learning datasets to compare and empirically optimize contextual bandit algorithms, focusing on practical methods that learn by relying on optimization oracles from supervised learning. We find that a recent method (Foster et al., 2018) using optimism under uncertainty works the best overall. A surprisingly close second is a simple greedy baseline that only explores implicitly through the diversity of contexts, followed by a variant of Online Cover (Agarwal et al., 2014) which tends to be more conservative but robust to problem specification by design. Along the way, we also evaluate and improve several internal components of contextual bandit algorithm design. Overall, this is a thorough study and review of contextual bandit methodology.
]]></description>
<dc:subject>online-learning machine-learning bandit-problems horse-races rather-interesting benchmarking</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:1a03fa265b1d/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:online-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:machine-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:bandit-problems"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:horse-races"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://library.msri.org/books/Book52/files/29toth.pdf">
    <title>[PDF] Binary Space Partitions: Recent Developments</title>
    <dc:date>2019-02-28T11:45:11+00:00</dc:date>
    <link>http://library.msri.org/books/Book52/files/29toth.pdf</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[Abstract. A binary space partition tree is a data structure for the representation of a set of objects in space. It found an increasing number of applications over the last decades. In recent years, intensifying research focused on its combinatorial properties, which affect directly the efficiency of applications. Important advances were made on binary space partitions for disjoint line segments in the plane and for axis-aligned objects in higher dimensions. New research directions were also initiated on some realistic polygonal scenes and on kinetic binary space partitions. This paper attempts to give an overview of these results and reiterates some of the most pressing open problems.]]></description>
<dc:subject>computational-geometry benchmarking partition-problems surveillance vey rather-interesting hard-problems plane-geometry</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:6c87684e3c3a/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:computational-geometry"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:partition-problems"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:surveillance"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:vey"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:hard-problems"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:plane-geometry"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/1711.08477">
    <title>[1711.08477] Benchmarking Relief-Based Feature Selection Methods for Bioinformatics Data Mining</title>
    <dc:date>2019-02-24T15:29:46+00:00</dc:date>
    <link>https://arxiv.org/abs/1711.08477</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[Modern biomedical data mining requires feature selection methods that can (1) be applied to large scale feature spaces (e.g. `omics' data), (2) function in noisy problems, (3) detect complex patterns of association (e.g. gene-gene interactions), (4) be flexibly adapted to various problem domains and data types (e.g. genetic variants, gene expression, and clinical data) and (5) remain computationally tractable. To that end, this work examines a set of filter-style feature selection algorithms inspired by the `Relief' algorithm, i.e. Relief-Based algorithms (RBAs). We implement and expand these RBAs in an open source framework called ReBATE (Relief-Based Algorithm Training Environment). We apply a comprehensive genetic simulation study comparing existing RBAs, a proposed RBA called MultiSURF, and other established feature selection methods, over a variety of problems. The results of this study (1) support the assertion that RBAs are particularly flexible, efficient, and powerful feature selection methods that differentiate relevant features having univariate, multivariate, epistatic, or heterogeneous associations, (2) confirm the efficacy of expansions for classification vs. regression, discrete vs. continuous features, missing data, multiple classes, or class imbalance, (3) identify previously unknown limitations of specific RBAs, and (4) suggest that while MultiSURF* performs best for explicitly identifying pure 2-way interactions, MultiSURF yields the most reliable feature selection performance across a wide range of problem types.
]]></description>
<dc:subject>machine-learning bioinformatics hey-I-know-this-guy feature-selection benchmarking epistasis algorithms to-write-about</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:2c11820a8280/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:machine-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:bioinformatics"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:hey-I-know-this-guy"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:feature-selection"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:epistasis"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:algorithms"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:to-write-about"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://tspace.library.utoronto.ca/handle/1807/92089">
    <title>Personality, Incorporated: Psychological Capital in American Management, 1960-1995 | TSpace Repository</title>
    <dc:date>2019-02-22T23:03:41+00:00</dc:date>
    <link>https://tspace.library.utoronto.ca/handle/1807/92089</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[Personality, Incorporated traces the history of personality testing in American corporate management from 1960 to 1995. Through three case studies, this dissertation takes up the twinned history of psychological techniques, as deployed by a cadre of “consulting psychologists,” and the psychological capacities conjured by these techniques. Personality, Incorporated makes two core arguments. First, it argues that personality tests aimed to incite and channel employees’ psychological capacities as forms of economic value. Psychological tests did not simply measure static traits, but they also actively elicited and mobilized affects, subjectivities, and differences, that they then harnessed for corporate value production. Second, this dissertation argues that late twentieth-century corporations were not just sites for the application and circulation of psychological knowledge, but they also served as important experimental laboratories for investigating humans’ interpersonal, emotional, and cognitive capacities. A core contribution of this dissertation is to identify, investigate, and interrogate the specific form of value mobilized at this intersection of personality tests and management practices: “psychological capital.” As an analytic category, psychological capital names how human beings’ psychological capacities are enlisted into circuits of economic value, with the aid of psychological techniques that can measure and incite these capacities. Psychological capital circulates as an intangible yet nonetheless measurable form of capital that was made visible, measurable, and valuable through psychological techniques of personality testing and training amidst economic, social, and cultural changes of the knowledge economy. 
This dissertation offers a new way to think about psychological tests: as tools designed to mobilize and channel psychological capacities, to elicit and cultivate the very characteristics that they purported to measure. In weaving together histories of psychology, science and corporate capitalism with critical scholarship on affect and value, this dissertation excavates how psychological tests have become corporate techniques that shape contemporary selfhood.
]]></description>
<dc:subject>benchmarking capitalism corporatism psychology rather-interesting via:twitter what-gets-measured-gets-fudged</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:53d5e5a5846b/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:capitalism"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:corporatism"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:psychology"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:via:twitter"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:what-gets-measured-gets-fudged"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/1811.10665">
    <title>[1811.10665] Stepping Stones to Inductive Synthesis of Low-Level Looping Programs</title>
    <dc:date>2018-11-28T23:13:32+00:00</dc:date>
    <link>https://arxiv.org/abs/1811.10665</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[Inductive program synthesis, from input/output examples, can provide an opportunity to automatically create programs from scratch without presupposing the algorithmic form of the solution. For induction of general programs with loops (as opposed to loop-free programs, or synthesis for domain-specific languages), the state of the art is at the level of introductory programming assignments. Most problems that require algorithmic subtlety, such as fast sorting, have remained out of reach without the benefit of significant problem-specific background knowledge. A key challenge is to identify cues that are available to guide search towards correct looping programs. We present MAKESPEARE, a simple delayed-acceptance hillclimbing method that synthesizes low-level looping programs from input/output examples. During search, delayed acceptance bypasses small gains to identify significantly-improved stepping stone programs that tend to generalize and enable further progress. The method performs well on a set of established benchmarks, and succeeds on the previously unsolved "Collatz Numbers" program synthesis problem. Additional benchmarks include the problem of rapidly sorting integer arrays, in which we observe the emergence of comb sort (a Shell sort variant that is empirically fast). MAKESPEARE has also synthesized a record-setting program on one of the puzzles from the TIS-100 assembly language programming game.
]]></description>
<dc:subject>benchmarking genetic-programming software-synthesis rather-interesting to-write-about</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:21286fb740ee/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:genetic-programming"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:software-synthesis"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:to-write-about"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/1709.09130">
    <title>[1709.09130] Output Range Analysis for Deep Neural Networks</title>
    <dc:date>2017-11-09T11:38:00+00:00</dc:date>
    <link>https://arxiv.org/abs/1709.09130</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[Deep neural networks (NN) are extensively used for machine learning tasks such as image classification, perception and control of autonomous systems. Increasingly, these deep NNs are also being deployed in high-assurance applications. Thus, there is a pressing need for developing techniques to verify neural networks to check whether certain user-expected properties are satisfied. In this paper, we study a specific verification problem of computing a guaranteed range for the output of a deep neural network given a set of inputs represented as a convex polyhedron. Range estimation is a key primitive for verifying deep NNs. We present an efficient range estimation algorithm that uses a combination of local search and linear programming problems to efficiently find the maximum and minimum values taken by the outputs of the NN over the given input set. In contrast to recently proposed "monolithic" optimization approaches, we use local gradient descent to repeatedly find and eliminate local minima of the function. The final global optimum is certified using a mixed integer programming instance. We implement our approach and compare it with Reluplex, a recently proposed solver for deep neural networks. We demonstrate the effectiveness of the proposed approach for verification of NNs used in automated control as well as those used in classification.]]></description>
<dc:subject>neural-networks benchmarking rather-interesting feature-construction performance-measure modeling-is-not-mathematics algorithms nudge-targets consider:looking-to-see consider:generalizing</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:0aced29fe3e8/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:neural-networks"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:feature-construction"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:performance-measure"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:modeling-is-not-mathematics"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:algorithms"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:nudge-targets"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:looking-to-see"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:generalizing"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/1710.04640">
    <title>[1710.04640] Hard and Easy Instances of L-Tromino Tilings</title>
    <dc:date>2017-11-06T12:51:30+00:00</dc:date>
    <link>https://arxiv.org/abs/1710.04640</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[In this work we study tilings of regions in the square lattice with L-shaped trominoes. Deciding the existence of a tiling with L-trominoes for an arbitrary region in general is NP-complete; nonetheless, we identify restrictions to the problem where either it remains NP-complete or it has a polynomial time algorithm. First we show that an Aztec diamond of order n always has an L-tromino tiling if and only if n(n+1) ≡ 0 mod 3; if an Aztec diamond has at least two defects or holes, however, the problem of deciding a tiling is NP-complete. Then we study tilings of arbitrary regions where only 180° rotations of L-trominoes are available. For this particular case we show that deciding the existence of a tiling remains NP-complete, yet, if a region contains certain so-called "forbidden polyominoes" as subregions, then there exists a polynomial time algorithm for deciding a tiling.]]></description>
<dc:subject>polyominoes tiling benchmarking rather-interesting problem-solving nudge-targets consider:feature-discovery updated</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:a0df5a225cb0/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:polyominoes"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:tiling"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:problem-solving"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:nudge-targets"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:feature-discovery"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:updated"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/1705.04587">
    <title>[1705.04587] Complexity and Inapproximability Results for Parallel Task Scheduling and Strip Packing</title>
    <dc:date>2017-10-21T12:50:50+00:00</dc:date>
    <link>https://arxiv.org/abs/1705.04587</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[We study the Parallel Task Scheduling problem Pm|size_j|C_max with a constant number of machines. This problem is known to be strongly NP-complete for each m≥5, while it is solvable in pseudo-polynomial time for each m≤3. We give a positive answer to the long-standing open question whether this problem is strongly NP-complete for m=4. As a second result, we improve the lower bound of 12/11 for approximating pseudo-polynomial Strip Packing to 5/4. Since the best known approximation algorithm for this problem has a ratio of 4/3+ε, this result narrows the gap between approximation ratio and inapproximability result by a significant step. Both results are proven by a reduction from the strongly NP-complete problem 3-Partition.]]></description>
<dc:subject>operations-research optimization computational-complexity benchmarking approximation rather-interesting nudge-targets consider:approximation</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:b4b5973b0460/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:operations-research"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:optimization"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:computational-complexity"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:approximation"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:nudge-targets"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:approximation"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/1709.08461">
    <title>[1709.08461] Mining a Sub-Matrix of Maximal Sum</title>
    <dc:date>2017-10-15T12:20:20+00:00</dc:date>
    <link>https://arxiv.org/abs/1709.08461</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[Biclustering techniques have been widely used to identify homogeneous subgroups within large data matrices, such as subsets of genes similarly expressed across subsets of patients. Mining a max-sum sub-matrix is a related but distinct problem for which one looks for a (not necessarily contiguous) rectangular sub-matrix with a maximal sum of its entries. Le Van et al. (Ranked Tiling, 2014) already illustrated its applicability to gene expression analysis and addressed it with a constraint programming (CP) approach combined with large neighborhood search (CP-LNS). In this work, we exhibit some key properties of this NP-hard problem and define a bounding function such that larger problems can be solved in reasonable time. Two different algorithms are proposed in order to exploit the highlighted characteristics of the problem: a CP approach with a global constraint (CPGC) and mixed integer linear programming (MILP). Practical experiments conducted both on synthetic and real gene expression data exhibit the characteristics of these approaches and their relative benefits over the original CP-LNS method. Overall, the CPGC approach tends to be the fastest to produce a good solution. Yet, the MILP formulation is arguably the easiest to formulate and can also be competitive.
]]></description>
<dc:subject>machine-learning matrices mathematical-programming benchmarking to-write-about nudge-targets</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:7256a5b3370d/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:machine-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:matrices"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:mathematical-programming"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:to-write-about"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:nudge-targets"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/1607.05342">
    <title>[1607.05342] On Integer Programming and the Path-width of the Constraint Matrix</title>
    <dc:date>2017-10-15T12:15:28+00:00</dc:date>
    <link>https://arxiv.org/abs/1607.05342</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[In the classic Integer Programming (IP) problem, the objective is to decide whether, for a given m×n matrix A and an m-vector b=(b_1,…,b_m), there is a non-negative integer n-vector x such that Ax=b. Solving (IP) is an important step in numerous algorithms and it is important to obtain an understanding of the precise complexity of this problem as a function of natural parameters of the input. Two significant results in this line of research are the pseudo-polynomial time algorithms for (IP) when the number of constraints is a constant [Papadimitriou, J. ACM 1981] and when the branch-width of the column-matroid corresponding to the constraint matrix is a constant [Cunningham and Geelen, IPCO 2007]. In this paper, we prove matching upper and lower bounds for (IP) when the path-width of the corresponding column-matroid is a constant. These lower bounds provide evidence that the algorithm of Cunningham and Geelen is probably optimal. We also obtain a separate lower bound providing evidence that the algorithm of Papadimitriou is close to optimal.]]></description>
<dc:subject>classification benchmarking rather-interesting mathematical-programming matrices feature-construction nudge-targets consider:rediscovery consider:performance-measures computational-complexity</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:818f1324102a/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:classification"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:mathematical-programming"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:matrices"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:feature-construction"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:nudge-targets"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:rediscovery"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:performance-measures"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:computational-complexity"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/1606.07163">
    <title>[1606.07163] Interpretable Machine Learning Models for the Digital Clock Drawing Test</title>
    <dc:date>2017-10-15T12:11:14+00:00</dc:date>
    <link>https://arxiv.org/abs/1606.07163</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[The Clock Drawing Test (CDT) is a rapid, inexpensive, and popular neuropsychological screening tool for cognitive conditions. The Digital Clock Drawing Test (dCDT) uses novel software to analyze data from a digitizing ballpoint pen that reports its position with considerable spatial and temporal precision, making possible the analysis of both the drawing process and final product. We developed methodology to analyze pen stroke data from these drawings, and computed a large collection of features which were then analyzed with a variety of machine learning techniques. The resulting scoring systems were designed to be more accurate than the systems currently used by clinicians, but just as interpretable and easy to use. The systems also allow us to quantify the tradeoff between accuracy and interpretability. We created automated versions of the CDT scoring systems currently used by clinicians, allowing us to benchmark our models, which indicated that our machine learning models substantially outperformed the existing scoring systems.
]]></description>
<dc:subject>machine-learning benchmarking rather-interesting nudge-targets consider:representation consider:looking-to-see</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:37d7bb3343d6/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:machine-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:nudge-targets"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:representation"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:looking-to-see"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/1709.06009">
    <title>[1709.06009] Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents</title>
    <dc:date>2017-09-25T12:08:16+00:00</dc:date>
    <link>https://arxiv.org/abs/1709.06009</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[The Arcade Learning Environment (ALE) is an evaluation platform that poses the challenge of building AI agents with general competency across dozens of Atari 2600 games. It supports a variety of different problem settings and it has been receiving increasing attention from the scientific community, leading to some high-profile success stories such as the much publicized Deep Q-Networks (DQN). In this article we take a big picture look at how the ALE is being used by the research community. We show how diverse the evaluation methodologies in the ALE have become with time, and highlight some key concerns when evaluating agents in the ALE. We use this discussion to present some methodological best practices and provide new benchmark results using these best practices. To further the progress in the field, we introduce a new version of the ALE that supports multiple game modes and provides a form of stochasticity we call sticky actions. We conclude this big picture look by revisiting challenges posed when the ALE was introduced, summarizing the state-of-the-art in various problems and highlighting problems that remain open.
]]></description>
<dc:subject>machine-learning benchmarking games to-write-about nudge-targets consider:performance-measures</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:11f915739959/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:machine-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:games"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:to-write-about"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:nudge-targets"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:performance-measures"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/1709.01670">
    <title>[1709.01670] Parameterized complexity of machine scheduling: 15 open problems</title>
    <dc:date>2017-09-25T12:05:50+00:00</dc:date>
    <link>https://arxiv.org/abs/1709.01670</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[Machine scheduling problems are a long-time key domain of algorithms and complexity research. A novel approach to machine scheduling problems is fixed-parameter algorithms. To stimulate this thriving research direction, we propose 15 interesting open questions in this area.
]]></description>
<dc:subject>open-problems operations-research scheduling benchmarking to-write-about nudge-targets consider:looking-to-see consider:performance-measures</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:59257476c7a2/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:open-problems"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:operations-research"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:scheduling"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:to-write-about"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:nudge-targets"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:looking-to-see"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:performance-measures"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/1412.1913">
    <title>[1412.1913] A Portfolio Approach to Algorithm Selection for Discrete Time-Cost Trade-off Problem</title>
    <dc:date>2017-09-24T12:57:05+00:00</dc:date>
    <link>https://arxiv.org/abs/1412.1913</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[It is a known fact that the performance of optimization algorithms for NP-Hard problems varies from instance to instance. We observed the same trend when we comprehensively studied multi-objective evolutionary algorithms (MOEAs) on six benchmark instances of the discrete time-cost trade-off problem (DTCTP) in a construction project. In this paper, instead of using a single algorithm to solve DTCTP, we use a portfolio approach that takes multiple algorithms as its constituents. We propose a portfolio comprising four MOEAs, Non-dominated Sorting Genetic Algorithm II (NSGA-II), the Strength Pareto Evolutionary Algorithm II (SPEA-II), Pareto archive evolutionary strategy (PAES) and Niched Pareto Genetic Algorithm II (NPGA-II), to solve DTCTP. The result shows that the portfolio approach is computationally fast and qualitatively superior to its constituent algorithms for all benchmark instances. Moreover, the portfolio approach provides insight into selecting the best algorithm for all benchmark instances of DTCTP.
]]></description>
<dc:subject>multiobjective-optimization benchmarking trade-offs looking-to-see computational-complexity define-your-terms rather-interesting to-write-about</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:d5b0edcf1edb/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:multiobjective-optimization"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:trade-offs"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:looking-to-see"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:computational-complexity"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:define-your-terms"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:to-write-about"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/1708.03228">
    <title>[1708.03228] Lower bounds for several online variants of bin packing</title>
    <dc:date>2017-09-24T12:47:51+00:00</dc:date>
    <link>https://arxiv.org/abs/1708.03228</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[We consider several previously studied online variants of bin packing and prove new and improved lower bounds on the asymptotic competitive ratios for them. For that, we use a method of fully adaptive constructions. In particular, we improve the lower bound for the asymptotic competitive ratio of online square packing significantly, raising it from roughly 1.68 to above 1.75.
]]></description>
<dc:subject>bin-packing operations-research benchmarking proof algorithms nudge-targets consider:looking-to-see</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:93f9f32f9b2f/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:bin-packing"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:operations-research"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:proof"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:algorithms"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:nudge-targets"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:looking-to-see"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/1708.05070">
    <title>[1708.05070] Data-driven Advice for Applying Machine Learning to Bioinformatics Problems</title>
    <dc:date>2017-08-27T12:27:48+00:00</dc:date>
    <link>https://arxiv.org/abs/1708.05070</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[As the bioinformatics field grows, it must keep pace not only with new data but with new algorithms. Here we contribute a thorough analysis of 13 state-of-the-art, commonly used machine learning algorithms on a set of 165 publicly available classification problems in order to provide data-driven algorithm recommendations to current researchers. We present a number of statistical and visual comparisons of algorithm performance and quantify the effect of model selection and algorithm tuning for each algorithm and dataset. The analysis culminates in the recommendation of five algorithms with hyperparameters that maximize classifier performance across the tested problems, as well as general guidelines for applying machine learning to supervised classification problems.
]]></description>
<dc:subject>meta-optimization machine-learning benchmarking performance-measure feature-construction to-write-about hey-I-know-this-guy</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:b10fc1086451/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:meta-optimization"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:machine-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:performance-measure"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:feature-construction"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:to-write-about"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:hey-I-know-this-guy"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/1704.00630">
    <title>[1704.00630] Towards a property graph generator for benchmarking</title>
    <dc:date>2017-08-14T13:15:14+00:00</dc:date>
    <link>https://arxiv.org/abs/1704.00630</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[The use of synthetic graph generators is a common practice among graph-oriented benchmark designers, as it allows obtaining graphs with the required scale and characteristics. However, finding a graph generator that accurately fits the needs of a given benchmark is very difficult, thus practitioners end up creating ad-hoc ones. Such a task is usually time-consuming, and often leads to reinventing the wheel. In this paper, we introduce the conceptual design of DataSynth, a framework for property graph generation with customizable schemas and characteristics. The goal of DataSynth is to assist benchmark designers in generating graphs efficiently and at scale, saving them from implementing their own generators. Additionally, DataSynth introduces novel features barely explored so far, such as modeling the correlation between properties and the structure of the graph. This is achieved by a novel property-to-node matching algorithm for which we present preliminary promising results.
]]></description>
<dc:subject>graph-theory generative-models benchmarking database data-synthesis rather-interesting algorithms inverse-problems nudge-targets consider:evolutionary-algorithms constraint-satisfaction</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:e67086882696/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:graph-theory"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:generative-models"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:database"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:data-synthesis"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:algorithms"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:inverse-problems"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:nudge-targets"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:evolutionary-algorithms"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:constraint-satisfaction"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/1704.00568">
    <title>[1704.00568] A parametric level-set method for partially discrete tomography</title>
    <dc:date>2017-08-12T13:09:22+00:00</dc:date>
    <link>https://arxiv.org/abs/1704.00568</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[This paper introduces a parametric level-set method for tomographic reconstruction of partially discrete images. Such images consist of a continuously varying background and an anomaly with a constant (known) grey-value. We represent the geometry of the anomaly using a level-set function, which we represent using radial basis functions. We pose the reconstruction problem as a bi-level optimization problem in terms of the background and coefficients for the level-set function. To constrain the background reconstruction we impose smoothness through Tikhonov regularization. The bi-level optimization problem is solved in an alternating fashion; in each iteration we first reconstruct the background and subsequently update the level-set function. We test our method on numerical phantoms and show that we can successfully reconstruct the geometry of the anomaly, even from limited data. On these phantoms, our method outperforms Total Variation reconstruction, DART and P-DART.
]]></description>
<dc:subject>tomography inverse-problems benchmarking rather-interesting to-write-about to-simulate nudge-targets</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:98664c997b8e/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:tomography"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:inverse-problems"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:to-write-about"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:to-simulate"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:nudge-targets"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/1705.04665">
    <title>[1705.04665] A Formal Characterization of the Local Search Topology of the Gap Heuristic</title>
    <dc:date>2017-08-07T11:40:30+00:00</dc:date>
    <link>https://arxiv.org/abs/1705.04665</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[The pancake puzzle is a classic optimization problem that has become a standard benchmark for heuristic search algorithms. In this paper, we provide full proofs regarding the local search topology of the gap heuristic for the pancake puzzle. First, we show that in any non-goal state in which there is no move that will decrease the number of gaps, there is a move that will keep the number of gaps constant. We then classify any state in which the number of gaps cannot be decreased in a single action into two groups: those requiring 2 actions to decrease the number of gaps, and those which require 3 actions to decrease the number of gaps.
]]></description>
<dc:subject>optimization benchmarking heuristics planning nudge-targets consider:looking-to-see to-write-about</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:6dc224899a7b/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:optimization"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:heuristics"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:planning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:nudge-targets"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:looking-to-see"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:to-write-about"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/1707.09627">
    <title>[1707.09627] Learning to Infer Graphics Programs from Hand-Drawn Images</title>
    <dc:date>2017-08-05T11:54:37+00:00</dc:date>
    <link>https://arxiv.org/abs/1707.09627</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[We introduce a model that learns to convert simple hand drawings into graphics programs written in a subset of LaTeX. The model combines techniques from deep learning and program synthesis. We learn a convolutional neural network that proposes plausible drawing primitives that explain an image. This set of drawing primitives is like an execution trace for a graphics program. From this trace we use program synthesis techniques to recover a graphics program with constructs such as variable bindings, iterative loops, or simple kinds of conditionals. With a graphics program in hand, we can correct errors made by the deep network, cluster drawings by use of similar high-level geometric structures, and extrapolate drawings. Taken together these results are a step towards agents that induce useful, human-readable programs from perceptual input.]]></description>
<dc:subject>generative-models learning-by-watching rather-interesting machine-learning algorithms benchmarking consider:looking-to-see nudge-targets performance-measure</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:7c27bfc80c2a/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:generative-models"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:learning-by-watching"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:machine-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:algorithms"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:looking-to-see"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:nudge-targets"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:performance-measure"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/1707.00044">
    <title>[1707.00044] Learning Fair Classifiers: A Regularization-Inspired Approach</title>
    <dc:date>2017-08-04T12:38:37+00:00</dc:date>
    <link>https://arxiv.org/abs/1707.00044</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[We present a regularization-inspired approach for reducing bias in learned classifiers. In particular, we focus on binary classification tasks over individuals from two populations, where, as our criterion for fairness, we wish to achieve similar false positive rates in both populations, and similar false negative rates in both populations. As a proof of concept, we implement our approach and empirically evaluate its ability to achieve both fairness and accuracy, using the COMPAS scores data for prediction of recidivism.
]]></description>
<dc:subject>performance-measure machine-learning classification rather-interesting define-your-terms via:cshalizi to-write-about benchmarking constraint-satisfaction</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:2656cee6e0b4/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:performance-measure"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:machine-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:classification"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:define-your-terms"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:via:cshalizi"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:to-write-about"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:constraint-satisfaction"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/1707.06374">
    <title>[1707.06374] Document Listing on Repetitive Collections with Guaranteed Performance</title>
    <dc:date>2017-07-22T12:34:35+00:00</dc:date>
    <link>https://arxiv.org/abs/1707.06374</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[We consider document listing on string collections, that is, finding in which strings a given pattern appears. In particular, we focus on repetitive collections: a collection of size N over alphabet [1,σ] is composed of D copies of a string of size n, and s single-character or block edits are applied on ranges of copies. We introduce the first document listing index with size Õ(n+s), precisely O((n log σ + s log² N) log D) bits, and with useful worst-case time guarantees: Given a pattern of length m, the index reports the ndoc strings where it appears in time O(m² + m log^{1+ϵ} N · ndoc), for any constant ϵ>0. Our technique is to augment a range data structure that is commonly used on grammar-based indexes, so that instead of retrieving all the pattern occurrences, it computes useful summaries on them. We show that the idea has independent interest: we introduce the first grammar-based index that, on a text T[1,N] with a grammar of size r, uses O(r log N) bits and counts the number of occurrences of a pattern P[1,m] in time O(m² + m log^{2+ϵ} r), for any constant ϵ>0.]]></description>
<dc:subject>indexing databases computational-complexity algorithms pattern-finding rather-interesting to-write-about benchmarking</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:8f7ec31b448b/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:indexing"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:databases"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:computational-complexity"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:algorithms"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:pattern-finding"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:to-write-about"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/1705.00317">
    <title>[1705.00317] Non-polynomial Worst-Case Analysis of Recursive Programs</title>
    <dc:date>2017-05-07T12:24:33+00:00</dc:date>
    <link>https://arxiv.org/abs/1705.00317</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[We study the problem of developing efficient approaches for proving worst-case bounds of non-deterministic recursive programs. Ranking functions are sound and complete for proving termination and worst-case bounds of nonrecursive programs. First, we apply ranking functions to recursion, resulting in measure functions. We show that measure functions provide a sound and complete approach to prove worst-case bounds of non-deterministic recursive programs. Our second contribution is the synthesis of measure functions in nonpolynomial forms. We show that non-polynomial measure functions with logarithm and exponentiation can be synthesized through abstraction of logarithmic or exponentiation terms, Farkas' Lemma, and Handelman's Theorem using linear programming. While previous methods obtain worst-case polynomial bounds, our approach can synthesize bounds of the form O(n log n) as well as O(n^r) where r is not an integer. We present experimental results to demonstrate that our approach can efficiently obtain worst-case bounds of classical recursive algorithms such as (i) Merge-Sort, the divide-and-conquer algorithm for the Closest-Pair problem, where we obtain O(n log n) worst-case bound, and (ii) Karatsuba's algorithm for polynomial multiplication and Strassen's algorithm for matrix multiplication, where we obtain O(n^r) bound such that r is not an integer and close to the best-known bounds for the respective algorithms.
]]></description>
<dc:subject>computer-science recursion algorithms to-understand benchmarking computational-complexity</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:ae949ae8063b/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:computer-science"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:recursion"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:algorithms"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:to-understand"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:computational-complexity"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/1703.00512">
    <title>[1703.00512] PMLB: A Large Benchmark Suite for Machine Learning Evaluation and Comparison</title>
    <dc:date>2017-04-30T12:47:44+00:00</dc:date>
    <link>https://arxiv.org/abs/1703.00512</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[The selection, development, or comparison of machine learning methods in data mining can be a difficult task based on the target problem and goals of a particular study. Numerous publicly available real-world and simulated benchmark datasets have emerged from different sources, but their organization and adoption as standards have been inconsistent. As such, selecting and curating specific benchmarks remains an unnecessary burden on machine learning practitioners and data scientists. The present study introduces an accessible, curated, and developing public benchmark resource to facilitate identification of the strengths and weaknesses of different machine learning methodologies. We compare meta-features among the current set of benchmark datasets in this resource to characterize the diversity of available data. Finally, we apply a number of established machine learning methods to the entire benchmark suite and analyze how datasets and algorithms cluster in terms of performance. This work is an important first step towards understanding the limitations of popular benchmarking suites and developing a resource that connects existing benchmarking standards to more diverse and efficient standards in the future.
]]></description>
<dc:subject>hey-I-know-this-guy machine-learning benchmarking horse-races performance-measure to-write-about</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:50950a30837a/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:hey-I-know-this-guy"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:machine-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:horse-races"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:performance-measure"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:to-write-about"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/cs/0509032">
    <title>[cs/0509032] A Simple Model to Generate Hard Satisfiable Instances</title>
    <dc:date>2017-04-17T09:13:10+00:00</dc:date>
    <link>https://arxiv.org/abs/cs/0509032</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[In this paper, we try to further demonstrate that the models of random CSP instances proposed by [Xu and Li, 2000; 2003] are of theoretical and practical interest. Indeed, these models, called RB and RD, present several nice features. First, it is quite easy to generate random instances of any arity since no particular structure has to be integrated, or property enforced, in such instances. Then, the existence of an asymptotic phase transition can be guaranteed while applying a limited restriction on domain size and on constraint tightness. In that case, a threshold point can be precisely located and all instances have the guarantee to be hard at the threshold, i.e., to have an exponential tree-resolution complexity. Next, a formal analysis shows that it is possible to generate forced satisfiable instances whose hardness is similar to unforced satisfiable ones. This analysis is supported by some representative results taken from an intensive experimentation that we have carried out, using complete and incomplete search methods.
]]></description>
<dc:subject>constraint-satisfaction generative-models phase-transitions rather-interesting hard-problems benchmarking meta-models to-write-about</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:6ba8d7f96039/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:constraint-satisfaction"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:generative-models"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:phase-transitions"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:hard-problems"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:meta-models"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:to-write-about"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://search.arxiv.org:8081/paper.jsp?r=1611.03398&amp;qid=1491475924245ler_nCnN_397128995&amp;qs=%22magic+square%22&amp;byDate=1">
    <title>[1611.03398] XCSP3: An Integrated Format for Benchmarking Combinatorial Constrained Problems</title>
    <dc:date>2017-04-17T09:10:25+00:00</dc:date>
    <link>http://search.arxiv.org:8081/paper.jsp?r=1611.03398&amp;qid=1491475924245ler_nCnN_397128995&amp;qs=%22magic+square%22&amp;byDate=1</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[We propose a major revision of the format XCSP 2.1, called XCSP3, to build integrated representations of combinatorial constrained problems. This new format is able to deal with mono/multi optimization, many types of variables, cost functions, reification, views, annotations, variable quantification, distributed, probabilistic and qualitative reasoning. The new format is made compact, highly readable, and rather easy to parse. Interestingly, it captures the structure of the problem models, through the possibilities of declaring arrays of variables, and identifying syntactic and semantic groups of constraints. The number of constraints is kept under control by introducing a limited set of basic constraint forms, and producing almost automatically some of their variations through lifting, restriction, sliding, logical combination and relaxation mechanisms. As a result, XCSP3 encompasses practically all constraints that can be found in major constraint solvers developed by the CP community. A website, which is developed conjointly with the format, contains many models and series of instances. The user can make sophisticated queries for selecting instances from very precise criteria. The objective of XCSP3 is to ease the effort required to test and compare different algorithms by providing a common test-bed of combinatorial constrained instances.
]]></description>
<dc:subject>constraint-satisfaction representation rather-interesting to-write-about operations-research benchmarking XML ontology</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:6f5e295e3eb2/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:constraint-satisfaction"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:representation"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:to-write-about"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:operations-research"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:XML"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:ontology"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://www.palgrave-journals.com/articles/palcomms2016105">
    <title>“Excellence R Us”: university research and the fetishisation of excellence : Palgrave Communications</title>
    <dc:date>2017-03-22T10:59:11+00:00</dc:date>
    <link>http://www.palgrave-journals.com/articles/palcomms2016105</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[The rhetoric of “excellence” is pervasive across the academy. It is used to refer to research outputs as well as researchers, theory and education, individuals and organizations, from art history to zoology. But does “excellence” actually mean anything? Does this pervasive narrative of “excellence” do any good? Drawing on a range of sources we interrogate “excellence” as a concept and find that it has no intrinsic meaning in academia. Rather it functions as a linguistic interchange mechanism. To investigate whether this linguistic function is useful we examine how the rhetoric of excellence combines with narratives of scarcity and competition to show that the hyper-competition that arises from the performance of “excellence” is completely at odds with the qualities of good research. We trace the roots of issues in reproducibility, fraud, and homophily to this rhetoric. But we also show that this rhetoric is an internal, and not primarily an external, imposition. We conclude by proposing an alternative rhetoric based on soundness and capacity-building. In the final analysis, it turns out that “excellence” is not excellent. Used in its current unqualified form it is a pernicious and dangerous rhetoric that undermines the very foundations of good research and scholarship. This article is published as part of a collection on the future of research assessment.
]]></description>
<dc:subject>academia academic-culture higher-ed what-gets-measured-gets-fudged benchmarking corporatism</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:fd795e96e993/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:academia"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:academic-culture"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:higher-ed"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:what-gets-measured-gets-fudged"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:corporatism"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/1501.06813">
    <title>[1501.06813] Mixed Map Labeling</title>
    <dc:date>2017-03-21T12:38:45+00:00</dc:date>
    <link>https://arxiv.org/abs/1501.06813</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[Point feature map labeling is a geometric problem, in which a set of input points must be labeled with a set of disjoint rectangles (the bounding boxes of the label texts). Typically, labeling models either use internal labels, which must touch their feature point, or external (boundary) labels, which are placed on one of the four sides of the input points' bounding box and which are connected to their feature points by crossing-free leader lines. In this paper we study polynomial-time algorithms for maximizing the number of internal labels in a mixed labeling model that combines internal and external labels. The model requires that all leaders are parallel to a given orientation θ∈[0,2π), whose value influences the geometric properties and hence the running times of our algorithms.]]></description>
<dc:subject>computational-geometry optimization nudge-targets consider:looking-to-see well-defined-difficult-problems to-write-about benchmarking</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:bec48a32b32f/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:computational-geometry"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:optimization"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:nudge-targets"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:looking-to-see"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:well-defined-difficult-problems"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:to-write-about"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/1611.03530">
    <title>[1611.03530] Understanding deep learning requires rethinking generalization</title>
    <dc:date>2017-03-09T11:46:19+00:00</dc:date>
    <link>https://arxiv.org/abs/1611.03530</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[Despite their massive size, successful deep artificial neural networks can exhibit a remarkably small difference between training and test performance. Conventional wisdom attributes small generalization error either to properties of the model family, or to the regularization techniques used during training. 
Through extensive systematic experiments, we show how these traditional approaches fail to explain why large neural networks generalize well in practice. Specifically, our experiments establish that state-of-the-art convolutional networks for image classification trained with stochastic gradient methods easily fit a random labeling of the training data. This phenomenon is qualitatively unaffected by explicit regularization, and occurs even if we replace the true images by completely unstructured random noise. We corroborate these experimental findings with a theoretical construction showing that simple depth two neural networks already have perfect finite sample expressivity as soon as the number of parameters exceeds the number of data points as it usually does in practice. 
We interpret our experimental findings by comparison with traditional models.
]]></description>
<dc:subject>deep-learning generalization benchmarking rather-interesting information-theory architecture define-your-terms machine-learning consider:looking-at-GP-models</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:5ea3a63f92f0/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:deep-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:generalization"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:information-theory"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:architecture"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:define-your-terms"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:machine-learning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:looking-at-GP-models"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/1702.01446">
    <title>[1702.01446] Efficient Algorithms for k-Regret Minimizing Sets</title>
    <dc:date>2017-02-19T12:15:23+00:00</dc:date>
    <link>https://arxiv.org/abs/1702.01446</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[A regret minimizing set Q is a small size representation of a much larger database P so that user queries executed on Q return answers whose scores are not much worse than those on the full dataset. In particular, a k-regret minimizing set has the property that the regret ratio between the score of the top-1 item in Q and the score of the top-k item in P is minimized, where the score of an item is the inner product of the item's attributes with a user's weight (preference) vector. The problem is challenging because we want to find a single representative set Q whose regret ratio is small with respect to all possible user weight vectors. 
We show that k-regret minimization is NP-Complete for all dimensions d >= 3. This settles an open problem from Chester et al. [VLDB 2014], and resolves the complexity status of the problem for all d: the problem is known to have polynomial-time solution for d <= 2. In addition, we propose two new approximation schemes for regret minimization, both with provable guarantees, one based on coresets and another based on hitting sets. We also carry out extensive experimental evaluation, and show that our schemes compute regret-minimizing sets comparable in size to the greedy algorithm proposed in [VLDB 14] but our schemes are significantly faster and scalable to large data sets.
]]></description>
<dc:subject>databases multiobjective-optimization rather-interesting algorithms computational-complexity to-write-about benchmarking consider:looking-to-see consider:skylines</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:12cfe235bd5f/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:databases"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:multiobjective-optimization"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:algorithms"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:computational-complexity"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:to-write-about"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:looking-to-see"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:skylines"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="https://arxiv.org/abs/1612.00423">
    <title>[1612.00423] TorontoCity: Seeing the World with a Million Eyes</title>
    <dc:date>2017-01-07T14:47:10+00:00</dc:date>
    <link>https://arxiv.org/abs/1612.00423</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[In this paper we introduce the TorontoCity benchmark, which covers the full greater Toronto area (GTA) with 712.5 km² of land, 8439 km of road and around 400,000 buildings. Our benchmark provides different perspectives of the world captured from airplanes, drones and cars driving around the city. Manually labeling such a large scale dataset is infeasible. Instead, we propose to utilize different sources of high-precision maps to create our ground truth. Towards this goal, we develop algorithms that allow us to align all data sources with the maps while requiring minimal human supervision. We have designed a wide variety of tasks including building height estimation (reconstruction), road centerline and curb extraction, building instance segmentation, building contour extraction (reorganization), semantic labeling and scene type classification (recognition). Our pilot study shows that most of these tasks are still difficult for modern convolutional neural networks.
]]></description>
<dc:subject>learning-from-data benchmarking rather-interesting data-integration nudge-targets consider:looking-to-see consider:the-right-tool-for-the-job to-write-about</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:dc6a30789bf2/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:learning-from-data"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:data-integration"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:nudge-targets"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:looking-to-see"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:the-right-tool-for-the-job"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:to-write-about"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1604.08237">
    <title>[1604.08237] Decision Making and Productivity Measurement</title>
    <dc:date>2016-05-01T12:17:15+00:00</dc:date>
    <link>http://arxiv.org/abs/1604.08237</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[I wrote this book for several reasons. The most important one is to address the misuse of mathematics over the last four decades in the literature of operations research when measuring the performance of a set of homogeneous firms with multiple input factors and multiple output factors as well as rank and benchmark firms. The second main reason is about misinterpreting the concept of technical efficiency as relative efficiency or efficiency in thousands of published papers and books in well-known qualified journals and by reputable publishers around the world since 1978. It is very sad to see young students and researchers around the world being educated to follow these illogical published methodologies and models. At the same time, a lot of resources are invested in such studies to find how a firm can regulate its input or output factors to improve productivity, whereas the judgments from these models are questionable and while a firm may be called 'efficient', it may have the worst performance in comparison with all other firms.
]]></description>
<dc:subject>benchmarking multiobjective-optimization book optimization management-science decision-making</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:c86acc028dff/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:multiobjective-optimization"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:book"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:optimization"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:management-science"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:decision-making"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1501.03879">
    <title>[1501.03879] A new ADMM algorithm for the Euclidean median and its application to robust patch regression</title>
    <dc:date>2015-12-05T23:30:35+00:00</dc:date>
    <link>http://arxiv.org/abs/1501.03879</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[The Euclidean Median (EM) of a set of points Ω in a Euclidean space is the point x minimizing the (weighted) sum of the Euclidean distances of x to the points in Ω. While there exists no closed-form expression for the EM, it can nevertheless be computed using iterative methods such as the Weiszfeld algorithm. The EM has classically been used as a robust estimator of centrality for multivariate data. It was recently demonstrated that the EM can be used to perform robust patch-based denoising of images by generalizing the popular Non-Local Means algorithm. In this paper, we propose a novel algorithm for computing the EM (and its box-constrained counterpart) using variable splitting and the method of augmented Lagrangian. The attractive feature of this approach is that the subproblems involved in the ADMM-based optimization of the augmented Lagrangian can be resolved using simple closed-form projections. The proposed ADMM solver is used for robust patch-based image denoising and is shown to exhibit faster convergence compared to an existing solver.
]]></description>
<dc:subject>image-processing signal-processing algorithms performance-measure benchmarking nudge-targets generative-models generative-art</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:d3cc53294e1b/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:image-processing"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:signal-processing"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:algorithms"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:performance-measure"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:nudge-targets"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:generative-models"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:generative-art"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1108.3860">
    <title>[1108.3860] A SWAR Approach to Counting Ones</title>
    <dc:date>2015-12-05T23:29:06+00:00</dc:date>
    <link>http://arxiv.org/abs/1108.3860</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[We investigate the complexity of algorithms counting ones in different sets of operations. With addition and logical operations (but no shift) O(log²(n)) steps suffice to count ones. Parity can be computed with complexity O(log(n)), which is the same bound as for methods using shift-operations. If multiplication is available, a solution of time complexity O(log*(n)) is possible improving the known bound O(log log(n)).
]]></description>
<dc:subject>algorithms computational-complexity computer-science programming benchmarking nudge-targets consider:performance-measures</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:a716850cdeb0/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:algorithms"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:computational-complexity"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:computer-science"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:programming"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:nudge-targets"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:performance-measures"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1505.00449">
    <title>[1505.00449] Algorithms for the minimum sum coloring problem: a review</title>
    <dc:date>2015-11-28T13:13:23+00:00</dc:date>
    <link>http://arxiv.org/abs/1505.00449</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[The Minimum Sum Coloring Problem (MSCP) is a generalization of the well-known vertex coloring problem. Due to its theoretical and practical relevance, the MSCP attracts increasing attention. The only existing review on the problem dates back to 2004 and mainly covers the history of the MSCP and the theoretical developments on specific graphs. In recent years, the field has witnessed significant progress on practical solution algorithms. The purpose of this review is to provide a comprehensive inspection of the most recent and representative MSCP algorithms. To be informative, we identify the general framework followed by these algorithms and the key ingredients that make them successful. By classifying the main search strategies and putting forward the critical elements of the reviewed methods, we wish to encourage future development of more powerful methods and motivate new applications.
]]></description>
<dc:subject>graph-theory benchmarking algorithms review optimization nudge-targets</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:5c664ac14d8f/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:graph-theory"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:algorithms"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:review"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:optimization"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:nudge-targets"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1505.00468">
    <title>[1505.00468] VQA: Visual Question Answering</title>
    <dc:date>2015-11-14T13:57:02+00:00</dc:date>
    <link>http://arxiv.org/abs/1505.00468</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[We propose the task of free-form and open-ended Visual Question Answering (VQA). Given an image and a natural language question about the image, the task is to provide an accurate natural language answer. Mirroring real-world scenarios, such as helping the visually impaired, both the questions and answers are open-ended. Visual questions selectively target different areas of an image, including background details and underlying context. As a result, a system that succeeds at VQA typically needs a more detailed understanding of the image and complex reasoning than a system producing generic image captions. Moreover, VQA is amenable to automatic evaluation, since many open-ended answers contain only a few words or a closed set of answers that can be provided in a multiple-choice format. We provide a dataset containing ~0.25M images, ~0.76M questions, and ~10M answers (www.visualqa.org), and discuss the information it provides. Numerous baselines for VQA are provided and compared with human performance.
]]></description>
<dc:subject>benchmarking artificial-intelligence image-processing performance-measure nudge-targets</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:68de6292c548/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:artificial-intelligence"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:image-processing"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:performance-measure"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:nudge-targets"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1508.06773">
    <title>[1508.06773] Ranking by pairwise comparisons for Swiss-system tournaments</title>
    <dc:date>2015-09-11T14:30:54+00:00</dc:date>
    <link>http://arxiv.org/abs/1508.06773</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[Pairwise comparison matrices are widely used in Multicriteria Decision Making. This article applies incomplete pairwise comparison matrices in the area of sport tournaments, namely proposing alternative rankings for the 2010 Chess Olympiad Open tournament. It is shown that the results are robust with respect to the scaling technique. In order to compare different rankings, a distance function is introduced with the aim of taking into account the subjective nature of human perception. Analysis of the weight vectors implies that methods based on pairwise comparisons have common roots. Visualization of the results is provided by Multidimensional Scaling on the basis of the defined distance. The proposed rankings in some cases give intuitively better outcomes than the currently used lexicographical orders.
]]></description>
<dc:subject>multiobjective-optimization benchmarking horse-races making-the-numbers-work-out estimation philosophy-of-engineering amusing</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:2bbc3dee9035/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:multiobjective-optimization"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:horse-races"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:making-the-numbers-work-out"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:estimation"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:philosophy-of-engineering"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:amusing"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1401.7543">
    <title>[1401.7543] The use of soft matrices on soft multisets in an optimal decision process</title>
    <dc:date>2015-09-06T11:44:57+00:00</dc:date>
    <link>http://arxiv.org/abs/1401.7543</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[In this paper, we introduce a concept of a soft matrix on a soft multiset, and investigate how to use soft matrices to solve decision making problems. An algorithm for a multiple choose selection problem is also provided. Finally, we demonstrate an illustrative example to show the decision making steps.
]]></description>
<dc:subject>decision-making benchmarking soft-sets operations-research planning nudge-targets consider:representation</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:a0d48d70a2d2/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:decision-making"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:soft-sets"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:operations-research"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:planning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:nudge-targets"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:representation"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1508.01045">
    <title>[1508.01045] The QBF Gallery: Behind the Scenes</title>
    <dc:date>2015-08-22T15:02:09+00:00</dc:date>
    <link>http://arxiv.org/abs/1508.01045</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[Over the last few years, much progress has been made in the theory and practice of solving quantified Boolean formulas (QBF). Novel solvers have been presented that either successfully enhance established techniques or implement novel solving paradigms. Powerful preprocessors have been realized that tune the encoding of a formula to make it easier to solve. Frameworks for certification and solution extraction emerged that allow for a detailed interpretation of a QBF solver's results, and new types of QBF encodings were presented for various application problems. 
To capture these developments the QBF Gallery was established in 2013. The QBF Gallery aims at providing a forum to assess QBF tools and to collect new, expressive benchmarks that allow for documenting the status quo and that indicate promising research directions. The collected benchmarks became the basis for the experiments conducted in the context of the QBF Gallery 2013 and QBF Gallery 2014. In the latter, QBF solvers were evaluated in a competitive setting as part of the FLoC Olympic Games. In contrast to 2014, the edition of the QBF Gallery in 2013 was not competitive and hence no prizes were awarded. 
In this paper, we report on the setup of the QBF Gallery. To this end, we conducted numerous experiments which allowed us not only to assess the quality of the tools, but also the quality of the benchmarks.
]]></description>
<dc:subject>benchmarking horse-races computational-complexity satisfiability rather-interesting nudge-targets</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:9c19cd8478a5/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:horse-races"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:computational-complexity"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:satisfiability"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:nudge-targets"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1406.7424">
    <title>[1406.7424] Complexity Measures and Concept Learning</title>
    <dc:date>2015-07-26T12:55:37+00:00</dc:date>
    <link>http://arxiv.org/abs/1406.7424</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[The nature of concept learning is a core question in cognitive science. Theories must account for the relative difficulty of acquiring different concepts by supervised learners. For a canonical set of six category types, two distinct orderings of classification difficulty have been found. One ordering, which we call paradigm-specific, occurs when adult human learners classify objects with easily distinguishable characteristics such as size, shape, and shading. The general order occurs in all other known cases: when adult humans classify objects with characteristics that are not readily distinguished (e.g., brightness, saturation, hue); for children and monkeys; and when categorization difficulty is extrapolated from errors in identification learning. The paradigm-specific order was found to be predictable mathematically by measuring the logical complexity of tasks, i.e., how concisely the solution can be represented by logical rules. 
However, logical complexity explains only the paradigm-specific order but not the general order. Here we propose a new difficulty measurement, information complexity, that calculates the amount of uncertainty remaining when a subset of the dimensions is specified. This measurement is based on Shannon entropy. We show that, when the metric extracts minimal uncertainties, this new measurement predicts the paradigm-specific order for the canonical six category types, and when the metric extracts average uncertainties, this new measurement predicts the general order. Moreover, for learning category types beyond the canonical six, we find that the minimal-uncertainty formulation correctly predicts the paradigm-specific order as well as or better than existing metrics (Boolean complexity and GIST) in most cases.
]]></description>
<dc:subject>information-theory complexity benchmarking nudge-targets consider:rediscovery consider:performance-measures</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:5e91692133db/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:information-theory"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:complexity"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:nudge-targets"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:rediscovery"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:performance-measures"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1502.05698">
    <title>[1502.05698] Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks</title>
    <dc:date>2015-02-21T12:01:32+00:00</dc:date>
    <link>http://arxiv.org/abs/1502.05698</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[One long-term goal of machine learning research is to produce methods that are applicable to reasoning and natural language, in particular building an intelligent dialogue agent. To measure progress towards that goal, we argue for the usefulness of a set of proxy tasks that evaluate reading comprehension via question answering. Our tasks measure understanding in several ways: whether a system is able to answer questions via chaining facts, simple induction, deduction and many more. The tasks are designed to be prerequisites for any system that aims to be capable of conversing with a human. We believe many existing learning systems currently cannot solve them, and hence our aim is to classify these tasks into skill sets, so that researchers can identify (and then rectify) the failings of their systems. We also extend and improve the recently introduced Memory Networks model, and show it is able to solve some, but not all, of the tasks.
]]></description>
<dc:subject>SMH artificial-intelligence toy-problems brain-in-a-box nudge-targets consider:alien-Chinese-box acceptance-testing GPTP2015 benchmarking</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:6eb5bc3ca10c/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:SMH"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:artificial-intelligence"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:toy-problems"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:brain-in-a-box"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:nudge-targets"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:consider:alien-Chinese-box"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:acceptance-testing"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:GPTP2015"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1501.05382">
    <title>[1501.05382] Enhanced Mixtures of Part Model for Human Pose Estimation</title>
    <dc:date>2015-02-05T11:06:37+00:00</dc:date>
    <link>http://arxiv.org/abs/1501.05382</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[The mixture-of-parts model has been successfully applied to the 2D human pose estimation problem, either as an explicitly trained body part model or as latent variables for a whole human body model. Mixture-of-parts models usually use a tree structure to represent relations between body parts. Tree structures facilitate training and referencing of the model but cannot deal with double-counting problems, which hinders their application to 3D pose estimation. While most work targeting these problems tends to modify the tree models or the optimization target, we incorporate other cues from input features. For example, in surveillance environments, human silhouettes can be extracted relatively easily, although not flawlessly. In this setting, we can combine extracted human blobs with the histogram of gradients feature, which is commonly used in mixture-of-parts models for training body part templates. The method can easily be extended to other candidate features under our generalized framework. We show 2D body part detection results on a publicly available dataset, the HumanEva dataset. Furthermore, a 2D-to-3D pose estimator is trained with a Gaussian process regression model, and 2D body part detections from the proposed method are fed to the estimator, so that 3D poses can be predicted given new 2D body part detections. We also show results of 3D pose estimation on the HumanEva dataset.
]]></description>
<dc:subject>image-processing algorithms object-recognition datasets nudge-targets benchmarking machine-learning</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:f408b7bbf388/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:image-processing"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:algorithms"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:object-recognition"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:datasets"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:nudge-targets"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:machine-learning"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1404.6193">
    <title>[1404.6193] A biclustering approach to university performances: an Italian case study</title>
    <dc:date>2014-12-18T11:59:08+00:00</dc:date>
    <link>http://arxiv.org/abs/1404.6193</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[University evaluation is a topic of increasing concern in Italy as well as in other countries. In empirical analysis, university activities and performances are generally measured by means of indicator variables, summarizing the available information under different perspectives. In this paper, we argue that the evaluation process is a complex issue that cannot be addressed by a simple descriptive approach, and thus association between indicators and similarities among the observed universities should be accounted for. In particular, we examine faculty-level data collected from different sources, covering 55 Italian Economics faculties in the academic year 2009/2010. Making use of a clustering framework, we introduce a biclustering model that accounts for both homogeneity/heterogeneity among faculties and correlations between indicators. Our results show that there are two substantially different performances between universities, which can be strictly related to the nature of the institutions, namely the Private and Public profiles. Each of the two groups has its own peculiar features and its own group-specific list of priorities, strengths and weaknesses. Thus, we suggest that caution should be used in interpreting standard university rankings, as they generally do not account for the complex structure of the data.
]]></description>
<dc:subject>academic-culture benchmarking biclustering ranking rather-odd data-analysis variable-selection</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:a9f93b2c5139/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:academic-culture"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:biclustering"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:ranking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:rather-odd"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:data-analysis"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:variable-selection"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1312.1858">
    <title>[1312.1858] A New Schema and Landscape for Programs: The Santa Fe Ant Case Study</title>
    <dc:date>2013-12-20T11:32:42+00:00</dc:date>
    <link>http://arxiv.org/abs/1312.1858</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[This paper introduces a new schema and a landscape analysis based on executed instruction sequences, and showcases their capabilities by analyzing the structures and evolutionary dynamics of the Santa Fe Ant Problem. The textbook Santa Fe Ant model problem is particularly appropriate for this exercise because after two decades of extensive use and analyses with more conventional schema and landscape analyses, it still lacks a clear narrative of the program structures that are systematically used for fitness improvement, the geometries of those structures and their dynamics during optimization. We use our new schema and landscapes to detail systematic structural features that are the key to high fitness of ant programs. For the first time we detail the evolutionary dynamics of high fitness structures that takes place during Genetic Programming on the problem. We develop a new phenotypic variation method that tests our understanding of the landscape. We also develop a modified function set that tests our understanding of synchronization constraints we identify. We obtain favorable computational efforts compared to those in the literature, on testing the new variation and function set on both the Santa Fe Trail, and the more computationally demanding Los Altos Trail.
]]></description>
<dc:subject>genetic-programming representation benchmarking looking-for-blocks-in-GP-again-oh-dear toy-problems</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:0952b7059b0f/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:genetic-programming"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:representation"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:looking-for-blocks-in-GP-again-oh-dear"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:toy-problems"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1309.7099">
    <title>[1309.7099] On the Internal Dynamics of the Shanghai Ranking</title>
    <dc:date>2013-11-29T13:19:25+00:00</dc:date>
    <link>http://arxiv.org/abs/1309.7099</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[The Academic Ranking of World Universities (ARWU) published by researchers at Shanghai Jiao Tong University has become a major source of information for university administrators, country officials, students and the public at large. Recent discoveries regarding its internal dynamics allow the inversion of published ARWU indicator scores to reconstruct raw scores for five hundred world class universities. This paper explores raw scores in the ARWU and in other contests to contrast the dynamics of rank-driven and score-driven tables, and to explain why the ARWU ranking is a score-driven procedure. We show that the ARWU indicators constitute sub-scales of a single factor accounting for research performance, and provide an account of the system of gains and non-linearities used by ARWU. The paper discusses the non-linearities selected by ARWU, concluding that they are designed to represent the regressive character of indicators measuring research performance. We propose that the utility and usability of the ARWU could be greatly improved by replacing the unwanted dynamical effects of the annual re-scaling based on raw scores of the best performers.
]]></description>
<dc:subject>benchmarking algorithms social-dynamics feature-extraction performance-measure</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:06bd0e8d577c/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:algorithms"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:social-dynamics"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:feature-extraction"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:performance-measure"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1308.2411">
    <title>[1308.2411] A mass-structured individual-based model of the chemostat: convergence and simulation</title>
    <dc:date>2013-11-16T19:37:18+00:00</dc:date>
    <link>http://arxiv.org/abs/1308.2411</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[We propose a model of chemostat where the bacterial population is individually-based, each bacterium is explicitly represented and has a mass evolving continuously over time. The substrate concentration is represented as a conventional ordinary differential equation. These two components are coupled with the bacterial consumption. Mechanisms acting on the bacteria are explicitly described (growth, division and up-take). Bacteria interact via consumption. We set the exact Monte Carlo simulation algorithm of this model and its mathematical representation as a stochastic process. We prove the convergence of this process to the solution of an integro-differential equation when the population size tends to infinity. Finally, we propose several numerical simulations.
]]></description>
<dc:subject>models models-and-modes benchmarking interesting theoretical-biology</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:6d37b496d562/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:models"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:models-and-modes"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:theoretical-biology"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1309.0534">
    <title>[1309.0534] Machines are benchmarked by code, not algorithms</title>
    <dc:date>2013-09-06T15:10:07+00:00</dc:date>
    <link>http://arxiv.org/abs/1309.0534</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[This article highlights how small modifications to either the source code of a benchmark program or the compilation options may impact its behavior on a specific machine. It argues that, when evaluating machines, benchmark providers and users should be careful to ensure reproducibility of results based on the machine code actually running on the hardware and not just the source code. The article uses color to grayscale conversion of digital images as a running example.
]]></description>
<dc:subject>benchmarking performance-measure engineering-design interesting software-development-is-not-programming</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:2fc4f65212c9/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:performance-measure"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:engineering-design"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:interesting"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:software-development-is-not-programming"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1301.4092">
    <title>[1301.4092] Encounter dynamics of a small target by a polymer diffusing in a confined domain</title>
    <dc:date>2013-04-25T11:15:42+00:00</dc:date>
    <link>http://arxiv.org/abs/1301.4092</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[We study the first passage time for a polymer, that we call the narrow encounter time (NETP), to reach a small target located on the surface of a microdomain. The polymer is modeled as a Freely Joint Chain (beads connected by springs with a non-zero resting length) and we use Brownian simulations to study two cases: when (i) any of the monomers or (ii) only one can be absorbed at the target window. Interestingly, we find that in the first case the NETP is an increasing function of the polymer length until a critical length, after which it decreases. Moreover, in the long polymer regime, we identified an exponential scaling law for the NETP as a function of the polymer length. In the second case, the position of the absorbed monomer along the polymer chain strongly influences the NETP. Our analysis can be applied to estimate the mean first time of a DNA fragment to a small target in the chromatin structure or for mRNA to find a small target.
]]></description>
<dc:subject>structural-biology simulation what-happens-if-project molecular-crowding nudge-targets parameter-tuning benchmarking</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:03502be8630d/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:structural-biology"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:simulation"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:what-happens-if-project"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:molecular-crowding"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:nudge-targets"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:parameter-tuning"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1012.5205">
    <title>[1012.5205] Effect of turbulent fluctuations on the drag and lift forces on a towed sphere and its boundary layer</title>
    <dc:date>2013-04-25T11:07:30+00:00</dc:date>
    <link>http://arxiv.org/abs/1012.5205</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[The impact of turbulent fluctuations on the forces exerted by a fluid on a towed spherical particle is investigated by means of high-resolution direct numerical simulations. The measurements are carried out using a novel scheme to integrate the two-way coupling between the particle and the incompressible surrounding fluid flow maintained in a high-Reynolds-number turbulent regime. The main idea consists in combining a Fourier pseudo-spectral method for the fluid with an immersed-boundary technique to impose the no-slip boundary condition on the surface of the particle. Benchmarking of the code shows good agreement with experimental and numerical measurements from other groups. A study of the turbulent wake downstream of the sphere is also reported. The mean velocity deficit is shown to behave as the inverse of the distance from the particle, as predicted from classical similarity analysis. This law is reinterpreted in terms of the principle of "permanence of large eddies" that relates infrared asymptotic self-similarity to the law of decay of energy in homogeneous turbulence.
The developed method is then used to attack the problem of an upstream flow that is in a developed turbulent regime. It is shown that the average drag force increases as a function of the turbulent intensity and the particle Reynolds number. This increase is significantly larger than predicted by standard drag correlations based on laminar upstream flows. It is found that the relevant parameter is the ratio of the viscous boundary layer thickness to the dissipation scale of the ambient turbulent flow. The drag enhancement can be motivated by the modification of the mean velocity and pressure profile around the sphere by small scale turbulent fluctuations.
]]></description>
<dc:subject>simulation fluid-dynamics benchmarking stress-testing validation nudge-targets</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:0f95b041cc86/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:simulation"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:fluid-dynamics"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:stress-testing"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:validation"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:nudge-targets"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1304.3780">
    <title>[1304.3780] Solving the Tower of Hanoi with Random Moves</title>
    <dc:date>2013-04-21T15:07:19+00:00</dc:date>
    <link>http://arxiv.org/abs/1304.3780</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[In this note we prove the exact formula for the expected number of moves to solve two variants of the Tower of Hanoi puzzle with 3 pegs and n disks when each move is chosen uniformly from the set of all valid moves.
]]></description>
<dc:subject>algorithms benchmarking nudge-targets mathematical-recreations machine-learning</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:9e7c1ea82f16/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:algorithms"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:nudge-targets"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:mathematical-recreations"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:machine-learning"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://arxiv.org/abs/1301.1907">
    <title>[1301.1907] Moon Search Algorithms for NASA's Dawn Mission to Asteroid Vesta</title>
    <dc:date>2013-02-03T14:53:08+00:00</dc:date>
    <link>http://arxiv.org/abs/1301.1907</link>
    <dc:creator>Vaguery</dc:creator><description><![CDATA[A moon or natural satellite is a celestial body that orbits a planetary body such as a planet, dwarf planet, or an asteroid. Scientists seek to understand the origin and evolution of our solar system by studying moons of these bodies. Additionally, searches for satellites of planetary bodies can be important to protect the safety of a spacecraft as it approaches or orbits a planetary body. If a satellite of a celestial body is found, the mass of that body can also be calculated once the satellite's orbit is determined. Ensuring the Dawn spacecraft's safety on its mission to the asteroid (4) Vesta primarily motivated the work of Dawn's Satellite Working Group (SWG) in the summer of 2011. Dawn mission scientists and engineers utilized various computational tools and techniques for Vesta's satellite search. The objectives of this paper are to 1) introduce the natural satellite search problem, 2) present the computational challenges, approaches, and tools used when addressing this problem, and 3) describe applications of various image processing and computational algorithms for performing satellite searches to the electronic imaging and computer science community. Furthermore, we hope that this communication will enable Dawn mission scientists to improve their satellite search algorithms and tools and be better prepared for performing the same investigation in 2015, when the spacecraft is scheduled to approach and orbit the dwarf planet (1) Ceres.]]></description>
<dc:subject>space-exploration space-science benchmarking image-processing discovery challenge nudge-targets</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:Vaguery/b:06afadaa5908/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:space-exploration"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:space-science"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:benchmarking"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:image-processing"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:discovery"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:challenge"/>
	<rdf:li rdf:resource="https://pinboard.in/u:Vaguery/t:nudge-targets"/>
</rdf:Bag></taxo:topics>
</item>
</rdf:RDF>