Pinboard (jm)
https://pinboard.in/u:jm/public/
recent bookmarks from jm_Optimal Probabilistic Cache Stampede Prevention_ [pdf]2017-05-11T20:20:40+00:00
http://www.vldb.org/pvldb/vol8/p886-vattani.pdf
jmvia:marcbrooker caching caches algorithm probabilistic expiration vldb papers expiry cache-miss stampedeshttps://pinboard.in/https://pinboard.in/u:jm/b:e8b64dc20608/Darts, Dice, and Coins2016-04-20T15:19:40+00:00
http://www.keithschwarz.com/darts-dice-coins/
jmEarlier this year, I asked a question on Stack Overflow about a data structure for loaded dice. Specifically, I was interested in answering this question: "You are given an n-sided die where side i has probability pi of being rolled. What is the most efficient data structure for simulating rolls of the die?"
This data structure could be used for many purposes. For starters, you could use it to simulate rolls of a fair, six-sided die by assigning probability 1616 to each of the sides of the die, or a to simulate a fair coin by simulating a two-sided die where each side has probability 1212 of coming up. You could also use this data structure to directly simulate the total of two fair six-sided dice being thrown by having an 11-sided die (whose faces were 2, 3, 4, ..., 12), where each side was appropriately weighted with the probability that this total would show if you used two fair dice. However, you could also use this data structure to simulate loaded dice. For example, if you were playing craps with dice that you knew weren't perfectly fair, you might use the data structure to simulate many rolls of the dice to see what the optimal strategy would be. You could also consider simulating an imperfect roulette wheel in the same way.
Outside the domain of game-playing, you could also use this data structure in robotics simulations where sensors have known failure rates. For example, if a range sensor has a 95% chance of giving the right value back, a 4% chance of giving back a value that's too small, and a 1% chance of handing back a value that's too large, you could use this data structure to simulate readings from the sensor by generating a random outcome and simulating the sensor reading in that case.
The answer I received on Stack Overflow impressed me for two reasons. First, the solution pointed me at a powerful technique called the alias method that, under certain reasonable assumptions about the machine model, is capable of simulating rolls of the die in O(1)O(1) time after a simple preprocessing step. Second, and perhaps more surprisingly, this algorithm has been known for decades, but I had not once encountered it! Considering how much processing time is dedicated to simulation, I would have expected this technique to be better- known. A few quick Google searches turned up a wealth of information on the technique, but I couldn't find a single site that compiled together the intuition and explanation behind the technique.
(via Marc Brooker)]]>via:marcbrooker algorithms probability algorithm coding data-structures alias dice randomhttps://pinboard.in/https://pinboard.in/u:jm/b:3cd1efbfe4f5/