Pinboard (jm)
https://pinboard.in/u:jm/public/
Recent bookmarks from jm

Circllhist
2020-02-10T11:59:56+00:00
https://arxiv.org/pdf/2001.06561.pdf
Tags: histograms aggregation quantiles percentiles measurement graphs data-structures summaries latency monitoring approximation papers
https://pinboard.in/u:jm/b:763bb560c156/

Google release an open-source differential-privacy lib
2019-09-09T14:24:03+00:00
https://developers.googleblog.com/2019/09/enabling-developers-and-organizations.html
Differentially-private data analysis is a principled approach that enables organizations to learn from the majority of their data while simultaneously ensuring that those results do not allow any individual's data to be distinguished or re-identified. This type of analysis can be implemented in a wide variety of ways and for many different purposes. For example, if you are a health researcher, you may want to compare the average amount of time patients remain admitted across various hospitals in order to determine if there are differences in care. Differential privacy is a high-assurance, analytic means of ensuring that use cases like this are addressed in a privacy-preserving manner.
Currently, we provide algorithms to compute the following:
Count
Sum
Mean
Variance
Standard deviation
Order statistics (including min, max, and median)
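
As an illustration of how aggregates such as a count or a mean can be released with differential privacy, here is a minimal Python sketch of the textbook Laplace mechanism. It is not the API of the Google library linked above; the function names, the clamping bounds `lower`/`upper`, and the even split of the privacy budget `epsilon` are all assumptions made for this example. It also ignores the floating-point pitfalls a production implementation has to handle.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two independent Exp(1) draws is Laplace(0, 1).
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def dp_count(values, epsilon, rng):
    # Adding or removing one record changes the count by at most 1,
    # so the sensitivity is 1 and the noise scale is 1 / epsilon.
    return len(values) + laplace_noise(1.0 / epsilon, rng)


def dp_bounded_sum(values, lower, upper, epsilon, rng):
    # Clamp each value to [lower, upper] so one record can move the sum
    # by at most max(|lower|, |upper|).
    clamped = [min(max(v, lower), upper) for v in values]
    sensitivity = max(abs(lower), abs(upper))
    return sum(clamped) + laplace_noise(sensitivity / epsilon, rng)


def dp_mean(values, lower, upper, epsilon, rng):
    # Assumption: spend half of the budget on the sum and half on the count.
    noisy_sum = dp_bounded_sum(values, lower, upper, epsilon / 2, rng)
    noisy_count = dp_count(values, epsilon / 2, rng)
    mean = noisy_sum / max(noisy_count, 1.0)
    # Clamp the result back into the declared range.
    return min(max(mean, lower), upper)


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical lengths of stay (days) at one hospital.
    stays = [2.0, 3.5, 1.0, 7.0, 4.0] * 200
    print(dp_count(stays, epsilon=1.0, rng=rng))
    print(dp_mean(stays, lower=0.0, upper=30.0, epsilon=1.0, rng=rng))
```

Smaller values of `epsilon` give stronger privacy and noisier answers, and the clamping bounds have to be chosen without looking at the private data.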
Tags: analytics google ml privacy differential-privacy aggregation statistics obfuscation approximation algorithms
https://pinboard.in/u:jm/b:98439e468432/

[1902.04023] Computing Extremely Accurate Quantiles Using t-Digests
2019-02-18T11:05:52+00:00
https://arxiv.org/abs/1902.04023
Tags: java go python open-source quantiles percentiles approximation statistics sketching algorithms via:fanf
https://pinboard.in/u:jm/b:6c84ec8a0947/

Fast Forward Labs: Probabilistic Data Structure Showdown: Cuckoo Filters vs. Bloom Filters
2016-11-28T21:57:26+00:00
http://blog.fastforwardlabs.com/post/153566952648/probabilistic-data-structure-showdown-cuckoo
This post provides an update by exploring Cuckoo filters, a new probabilistic data structure that improves upon the standard Bloom filter. The Cuckoo filter provides a few advantages: 1) it enables dynamic deletion and addition of items 2) it can be easily implemented compared to Bloom filter variants with similar capabilities, and 3) for similar space constraints, the Cuckoo filter provides lower false positives, particularly at lower capacities. We provide a python implementation of the Cuckoo filter here, and compare it to a counting Bloom filter (a Bloom filter variant).
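
To make the insert, lookup and delete operations mentioned above concrete, here is a minimal Python sketch of a cuckoo filter using partial-key cuckoo hashing. It is not the implementation from the linked post; the class name, the 16-bit fingerprints, the SHA-256 based hashing and the default sizes are all assumptions chosen for readability rather than speed or compactness.

```python
import hashlib
import random


class CuckooFilter:
    """Illustrative cuckoo filter; parameters are hypothetical defaults."""

    def __init__(self, num_buckets=1024, bucket_size=4, max_kicks=500, seed=0):
        # A power-of-two bucket count makes the XOR step below its own inverse.
        if num_buckets < 1 or num_buckets & (num_buckets - 1):
            raise ValueError("num_buckets must be a power of two")
        self.num_buckets = num_buckets
        self.bucket_size = bucket_size
        self.max_kicks = max_kicks
        self.buckets = [[] for _ in range(num_buckets)]
        self._rng = random.Random(seed)

    @staticmethod
    def _hash(data):
        return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

    def _fingerprint(self, item):
        # 16-bit fingerprint, salted so it differs from the bucket hash.
        return self._hash(b"fp:" + item.encode()) & 0xFFFF

    def _index(self, item):
        return self._hash(item.encode()) % self.num_buckets

    def _alt_index(self, index, fp):
        # The other candidate bucket depends only on (index, fingerprint).
        return (index ^ self._hash(fp.to_bytes(2, "big"))) % self.num_buckets

    def insert(self, item):
        fp = self._fingerprint(item)
        i1 = self._index(item)
        i2 = self._alt_index(i1, fp)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Both buckets are full: evict a resident fingerprint and relocate it.
        i = self._rng.choice((i1, i2))
        for _ in range(self.max_kicks):
            slot = self._rng.randrange(self.bucket_size)
            fp, self.buckets[i][slot] = self.buckets[i][slot], fp
            i = self._alt_index(i, fp)
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Filter treated as full. The last evicted fingerprint is dropped here;
        # a real implementation would keep it in a victim slot.
        return False

    def contains(self, item):
        fp = self._fingerprint(item)
        i1 = self._index(item)
        i2 = self._alt_index(i1, fp)
        return fp in self.buckets[i1] or fp in self.buckets[i2]

    def delete(self, item):
        # Only safe for items that were actually inserted.
        fp = self._fingerprint(item)
        i1 = self._index(item)
        for i in (i1, self._alt_index(i1, fp)):
            if fp in self.buckets[i]:
                self.buckets[i].remove(fp)
                return True
        return False


if __name__ == "__main__":
    f = CuckooFilter()
    f.insert("apple")
    print(f.contains("apple"))
    f.delete("apple")
    print(f.contains("apple"))
```

Because only fingerprints are stored, `contains` can return a false positive for an item that was never inserted, but it does not return a false negative for an item that was inserted successfully and not deleted.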
Tags: algorithms probabilistic approximation bloom-filters cuckoo-filters sets estimation python
https://pinboard.in/u:jm/b:65fd58e6fd3f/

"last seen" sketch
2015-07-15T16:56:10+00:00
https://vividcortex.com/blog/2015/06/22/sampling-a-stream-of-events-with-a-sketch/
Tags: sketch algorithms estimation approximation sampling streams big-data
https://pinboard.in/u:jm/b:1d2591ab4ead/

"Cuckoo Filter: Practically Better Than Bloom"
2015-03-09T14:29:55+00:00
http://www.pdl.cmu.edu/PDL-FTP/FS/cuckoo-conext2014.pdf
Tags: algorithms paper bloom-filters cuckoo-filters cuckoo-hashing data-structures false-positives big-data probabilistic hashing set-membership approximation
https://pinboard.in/u:jm/b:a7df31b55f43/

'Medians and Beyond: New Aggregation Techniques for Sensor Networks' [paper, PDF]
2013-02-09T21:54:42+00:00
http://www.cs.virginia.edu/~son/cs851/papers/ucsb.sensys04.pdf
Tags: q-digest algorithms streams approximation histograms median percentiles quantiles
https://pinboard.in/u:jm/b:893da73adce4/

clearspring / stream-lib
2013-02-09T21:46:02+00:00
https://github.com/clearspring/stream-lib#readme
Tags: algorithms coding streams cep stream-processing approximation probabilistic space-saving top-k cardinality estimation bloom-filters q-digest loglog hyperloglog murmurhash lookup3
https://pinboard.in/u:jm/b:5ec31bbded7e/

'Efficient Computation of Frequent and Top-k Elements in Data Streams' [paper, PDF]
2013-02-09T21:30:30+00:00
http://www.cs.ucsb.edu/research/tech_reports/reports/2005-23.pdf
Tags: space-saving approximation streams stream-processing cep papers pdf algorithms
https://pinboard.in/u:jm/b:aa06ce6e347d/

Real-time Analytics in Scala [slides, PDF]
2013-02-09T21:17:12+00:00
http://noelwelsh.com/assets/downloads/scala-exchange-2012.pdf
Tags: streams algorithms approximation coding scala slides
https://pinboard.in/u:jm/b:15d275caa928/