Pinboard (jm)
https://pinboard.in/u:jm/public/
recent bookmarks from jm

sparkey
2017-02-08T15:58:40+00:00
https://labs.spotify.com/2013/09/03/sparkey/
jm
spotify sparkey read-only key-value storage ops architecture
https://pinboard.in/u:jm/b:02b680cbb49d/

Sirius by Comcast
2014-04-24T09:16:58+00:00
http://comcast.github.io/sirius/overview.html
jm
At Comcast, our applications need convenient, low-latency access to important reference datasets. For example, our XfinityTV websites and apps need to use entertainment-related data to serve almost every API or web request to our datacenters: information like what year Casablanca was released, or how many episodes were in Season 7 of Seinfeld, or when the next episode of the Voice will be airing (and on which channel!).
We traditionally managed this information with a combination of relational databases and RESTful web services but yearned for something simpler than the ORM, HTTP client, and cache management code our developers dealt with on a daily basis. As main memory sizes on commodity servers continued to grow, however, we asked ourselves: How can we keep this reference data entirely in RAM, while ensuring it gets updated as needed and is easily accessible to application developers?
The Sirius distributed system library is our answer to that question, and we're happy to announce that we've made it available as an open source project. Sirius is written in Scala and uses the Akka actor system under the covers, but is easily usable by any JVM-based language.
Also includes a Paxos implementation with "fast follower" read-only slave replication. ASL2-licensed open source.
The only thing I can spot to be worried about is speed of startup; they note that apps need to replay a log at startup to rebuild state, which, in my experience, can be slow if unoptimized.
Update: in a Twitter conversation at https://twitter.com/jon_moore/status/459363751893139456, Jon Moore indicated they haven't had problems with this even with 'datasets consuming 10-20GB of heap', and have 'benchmarked a 5-node Sirius ingest cluster up to 1k updates/sec write throughput.' That's pretty solid!
open-source comcast paxos replication read-only datastores storage memory memcached redis sirius scala akka jvm libraries
https://pinboard.in/u:jm/b:54736b003116/

Splout
2013-02-07T11:57:05+00:00
https://github.com/datasalt/splout-db
jm
splout sql big-data hadoop read-only scaling queries analytics
https://pinboard.in/u:jm/b:fb43b90c2d46/