<?xml version="1.0" encoding="UTF-8"?>
 <rdf:RDF xmlns="http://purl.org/rss/1.0/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:cc="http://web.resource.org/cc/" xmlns:syn="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/">
  <channel rdf:about="http://pinboard.in">
    <title>Pinboard (rahuldave)</title>
    <link>https://pinboard.in/u:rahuldave/public/</link>
    <description>recent bookmarks from rahuldave</description>
    <items>
      <rdf:Seq>	<rdf:li rdf:resource="http://python3porting.com/improving.html"/>
	<rdf:li rdf:resource="http://www.johnmyleswhite.com/notebook/2012/05/03/cumplyr-extending-the-plyr-package-to-handle-cross-dependencies/"/>
	<rdf:li rdf:resource="http://radar.oreilly.com/2012/05/functional-languages-functional-techniques.html"/>
	<rdf:li rdf:resource="http://www.r-bloggers.com/comparing-julia-and-r%e2%80%99s-vocabularies/"/>
	<rdf:li rdf:resource="http://feedproxy.google.com/~r/oreilly/radar/atom/~3/0RJmqXtrung/profile-of-the-data-journalist-2.html"/>
	<rdf:li rdf:resource="http://www.johndcook.com/blog/2012/02/22/julia-random-number-generation/"/>
	<rdf:li rdf:resource="http://code.google.com/edu/languages/google-python-class/introduction.html"/>
	<rdf:li rdf:resource="http://www.25hoursaday.com/weblog/2012/01/03/WhatILearnedAfter3WeeksOfWritingMobileApps.aspx"/>
	<rdf:li rdf:resource="http://eflorenzano.com/blog/2012/01/01/reducing-code-nesting/"/>
	<rdf:li rdf:resource="http://radar.oreilly.com/2011/12/four-short-links-28-december-2-1.html"/>
	<rdf:li rdf:resource="http://feedproxy.google.com/~r/oreilly/radar/atom/~3/jLZhw5uJ1Ms/four-short-links-28-december-2-1.html"/>
	<rdf:li rdf:resource="http://rss.slashdot.org/~r/slashdot/eqWf/~3/65pGuqnXP-o/why-was-hypercard-killed"/>
	<rdf:li rdf:resource="http://www.johndcook.com/blog/2011/11/28/fundamental-theorem-of-readability/"/>
	<rdf:li rdf:resource="http://www.johndcook.com/blog/2011/11/14/separating-presentation-from-content/"/>
	<rdf:li rdf:resource="http://rss.slashdot.org/~r/slashdot/eqWf/~3/BAdyB9M-hGw/microsoft-roslyn-reinventing-the-compiler-as-we-know-it"/>
	<rdf:li rdf:resource="http://www.johndcook.com/blog/2011/09/27/sed-one-liners/"/>
	<rdf:li rdf:resource="http://rss.slashdot.org/~r/slashdot/eqWf/~3/pHX0S2EU4ZU/Client-side-Web-REPL-For-15-Languages"/>
	<rdf:li rdf:resource="http://www.johndcook.com/blog/2011/04/19/learn-one-sed-command/"/>
	<rdf:li rdf:resource="http://lifehacker.com/5724763/kod-is-a-free-text-editor-design-for-programmers"/>
	<rdf:li rdf:resource="http://radar.oreilly.com/2010/12/how-will-the-elmcity-service-s.html"/>
	<rdf:li rdf:resource="http://rss.slashdot.org/~r/slashdot/eqWf/~3/LdJlH4NcNtI/What-Every-Programmer-Should-Know-About-Floating-Point-Arithmetic"/>
	<rdf:li rdf:resource="http://www.mailund.dk/index.php/2010/04/26/is-r-an-epic-fail/"/>
	<rdf:li rdf:resource="http://feeds.arstechnica.com/~r/arstechnica/index/~3/tGM5tqWsxfY/tutorial-use-twitters-new-real-time-stream-api-in-python.ars"/>
	<rdf:li rdf:resource="http://www.mailund.dk/index.php/2010/04/21/on-code-and-comments/"/>
	<rdf:li rdf:resource="http://www.johndcook.com/blog/2010/04/15/85-functional-language-purity/"/>
	<rdf:li rdf:resource="http://radar.oreilly.com/2010/04/four-short-links-5-april-2010.html"/>
	<rdf:li rdf:resource="http://www.chrishowie.com/2010/04/01/git-svn-in-the-workplace/"/>
	<rdf:li rdf:resource="http://feedproxy.google.com/~r/catonmat/~3/sy5RTytKBuI/"/>
	<rdf:li rdf:resource="http://rjlipton.wordpress.com/2010/03/23/its-ada-lovelace-day/"/>
	<rdf:li rdf:resource="http://feedproxy.google.com/~r/catonmat/~3/GJRqxzmBW9c/"/>
	<rdf:li rdf:resource="http://rss.slashdot.org/~r/slashdot/eqWf/~3/0BqHlxmfoNk/Simpler-Hello-World-Demonstrated-In-C"/>
      </rdf:Seq>
    </items>
  </channel><item rdf:about="http://python3porting.com/improving.html">
    <title>Improving your code with modern idioms — Porting to Python 3 - The Book Site</title>
    <dc:date>2012-05-22T17:04:50+00:00</dc:date>
    <link>http://python3porting.com/improving.html</link>
    <dc:creator>rahuldave</dc:creator><dc:subject>python programming</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:rahuldave/b:76a958c1a8a8/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:python"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:programming"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://www.johnmyleswhite.com/notebook/2012/05/03/cumplyr-extending-the-plyr-package-to-handle-cross-dependencies/">
    <title>cumplyr: Extending the plyr Package to Handle Cross-Dependencies</title>
    <dc:date>2012-05-03T14:44:49+00:00</dc:date>
    <link>http://www.johnmyleswhite.com/notebook/2012/05/03/cumplyr-extending-the-plyr-package-to-handle-cross-dependencies/</link>
    <dc:creator>rahuldave</dc:creator><description><![CDATA[Introduction
For me, Hadley Wickham’s reshape and plyr packages are invaluable because they encapsulate omnipresent design patterns in statistical computing: reshape handles switching between the different possible representations of the same underlying data, while plyr automates what Hadley calls the Split-Apply-Combine strategy, in which you split up your data into several subsets, perform some computation on each of these subsets and then combine the results into a new data set. Many of the computations implicit in traditional statistical theory are easily described in this fashion: for example, comparing the means of two groups is computationally equivalent to splitting a data set of individual observations up into subsets based on the group assignments, applying mean to those subsets and then pooling the results back together again.

The Split-Apply-Combine Strategy is Broader than plyr
The only weakness of plyr, which automates so many of the computations that instantiate the Split-Apply-Combine strategy, is that plyr implements one very specific version of the Split-Apply-Combine strategy: plyr always splits your data into disjoint subsets. By disjoint, I mean that any row of the original data set can occur in only one of the subsets created by the splitting function. For computations that involve cross-dependencies between observations, this makes plyr inapplicable: cumulative quantities like running means and broadly local quantities like kernelized means cannot be computed using plyr. To highlight that concern, let’s consider three very simple data analysis problems.

Computing Forward-Running Means
Suppose that you have the following data set:

Time  Value
1     1
2     3
3     5


To compute a forward-running mean, you need to split this data into three subsets:

Time  Value
1     1

Time  Value
1     1
2     3

Time  Value
1     1
2     3
3     5


In each of these clearly non-disjoint subsets, you would then compute the mean of Value and combine the results to give:

Time  Value
1     1
2     2
3     3


This sort of computation occurs often enough in a simpler form that R provides tools like cumsum and cumprod to deal with cumulative quantities. But the splitting problem in our example is not addressed by those tools, nor by plyr, because the cumulative quantities have to be computed on subsets that are not disjoint.
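
As a rough illustration of the non-disjoint split described above, here is a minimal Python sketch (not the author's R implementation). The function name forward_running_means and the tuple layout of the rows are assumptions made for this example only.

```python
from statistics import mean

# The three-row example data set from the text, as (Time, Value) pairs.
rows = [(1, 1), (2, 3), (3, 5)]

def forward_running_means(rows):
    """For each row, average Value over every row whose Time is <= that row's Time."""
    result = []
    for time, _ in rows:
        # Non-disjoint split: the subset for `time` also contains all earlier rows.
        subset = [value for t, value in rows if t <= time]
        result.append((time, mean(subset)))
    return result

print(forward_running_means(rows))
```

Each row of the input appears in several subsets at once, which is exactly what a disjoint split cannot express.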

Computing Backward-Running Means
Consider performing the same sort of calculation as described above, but moving in the opposite direction. In that case, the three non-disjoint subsets are:

Time  Value
3     5

Time  Value
2     3
3     5

Time  Value
1     1
2     3
3     5


And the final result is:

Time  Value
1     3
2     4
3     5


Computing Local Means (AKA Kernelized Means)
Imagine that, instead of looking forward or backward, we only want to know something about data that is close to the current observation being examined. For example, we might want to know the mean value of each row when pooled with its immediately preceding and succeeding neighbors. This computation must create the following subsets of data:

Time  Value
1     1
2     3

Time  Value
1     1
2     3
3     5

Time  Value
2     3
3     5


Within these non-disjoint subsets, means are computed and the result is:

Time  Value
1     2
2     3
3     4


A Strategy for Handling Non-Disjoint Subsets
How can we build a general purpose tool to handle these sorts of computations? One way is to rethink how plyr works and then extend it with some trivial variations on its core principles. We can envision plyr as a system that uses a splitting operation that partitions our data into subsets in which each subset satisfies a group of equality constraints: you split the data into groups in which Variable 1 = Value 1 AND Variable 2 = Value 2, etc. Because you consider the conjunction of several equality constraints, the resulting subsets are disjoint.

Seen in this fashion, there is a simple relaxation of the equality constraints that allows us to solve the three problems described a moment ago: instead of looking at the conjunction of equality constraints, we use a conjunction of inequality constraints. For the time being, I’ll describe just three instantiations of this broader strategy.

Using Upper Bounds
Here, we divide data into groups in which Variable 1 <= Value 1 AND Variable 2 <= Value 2, etc. We will also allow equality constraints, so that the operations of plyr are a strict subset of the computations in this new model. For example, we might use the constraint Variable 1 = Value 1 AND Variable 2 <= Value 2. If the upper bound is the Time variable, these constraints will allow us to compute the forward-moving mean we described earlier.

Using Lower Bounds
Instead of using upper bounds, we can use lower bounds to divide data into groups in which Variable 1 >= Value 1 AND Variable 2 >= Value 2, etc. This allows us to implement the backward-moving mean described earlier.

Using Norm Balls
Finally, we can consider a combination of upper and lower bounds. For simplicity, we'll assume that these bounds have a fixed tightness around the "center" of each subset of our split data. To articulate this tightness formally, we look at a specific hypothetical equality constraint like Variable 1 = Value 1 and then loosen it so that norm(Variable 1 - Value 1) <= r. When r = 0, this system gives the original equality constraint. But when r > 0, we produce a "ball" of data around the constraint whose tightness is r. This lets us estimate the local means from our third example.
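
The three kinds of constraints can be sketched together in a few lines of Python. This is only an illustration of the idea, not the cumplyr code: the function constrained_apply, its parameter names, and the choice to produce one result per input row are all assumptions made for this example.

```python
from statistics import mean

def constrained_apply(rows, func, equality=(), lower=(), upper=(), norm_ball=None):
    """Apply func to the subset of rows selected around each "center" row.

    rows is a list of dicts. A row joins the center's subset when it is
    equal to the center on every `equality` column, >= the center on every
    `lower` column, <= the center on every `upper` column, and within
    radius r of the center on every `norm_ball` column (dict: column -> r).
    """
    norm_ball = norm_ball or {}
    results = []
    for center in rows:
        subset = [
            row for row in rows
            if all(row[c] == center[c] for c in equality)
            and all(row[c] >= center[c] for c in lower)
            and all(row[c] <= center[c] for c in upper)
            and all(abs(row[c] - center[c]) <= r for c, r in norm_ball.items())
        ]
        results.append(func(subset))
    return results

data = [{"Time": 1, "Value": 1}, {"Time": 2, "Value": 3}, {"Time": 3, "Value": 5}]

def value_mean(subset):
    return mean([row["Value"] for row in subset])

forward = constrained_apply(data, value_mean, upper=["Time"])            # [1, 2, 3]
backward = constrained_apply(data, value_mean, lower=["Time"])           # [3, 4, 5]
local = constrained_apply(data, value_mean, norm_ball={"Time": 1})       # [2, 3, 4]
print(forward, backward, local)
```

With only equality constraints every subset collapses to the rows that match the center exactly, which recovers the disjoint splits that plyr performs; setting a radius of 0 in norm_ball has the same effect.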

Implementation
To demo these ideas in a usable fashion, I've created a draft package for R called cumplyr. Here is an extended example of its usage in solving simple variants of the problems described in this post:


library('cumplyr')

# Example data: Time = 1..5, Value = 1, 3, 5, 7, 9
data <- data.frame(Time = 1:5, Value = seq(1, 9, by = 2))

# Equality constraint on Time: disjoint subsets, as plyr would produce
iddply(data,
       equality.variables = c('Time'),
       lower.bound.variables = c(),
       upper.bound.variables = c(),
       norm.ball.variables = list(),
       func = function (df) {with(df, mean(Value))})

# Lower bound on Time: per the description above, the backward-running mean
iddply(data,
       equality.variables = c(),
       lower.bound.variables = c('Time'),
       upper.bound.variables = c(),
       norm.ball.variables = list(),
       func = function (df) {with(df, mean(Value))})

# Upper bound on Time: per the description above, the forward-running mean
iddply(data,
       equality.variables = c(),
       lower.bound.variables = c(),
       upper.bound.variables = c('Time'),
       norm.ball.variables = list(),
       func = function (df) {with(df, mean(Value))})

# Norm ball of radius 1 around Time: local mean over immediate neighbors
iddply(data,
       equality.variables = c(),
       lower.bound.variables = c(),
       upper.bound.variables = c(),
       norm.ball.variables = list('Time' = 1),
       func = function (df) {with(df, mean(Value))})

# Radius 2: a wider neighborhood
iddply(data,
       equality.variables = c(),
       lower.bound.variables = c(),
       upper.bound.variables = c(),
       norm.ball.variables = list('Time' = 2),
       func = function (df) {with(df, mean(Value))})

# Radius 5: wide enough to cover every row of this data set
iddply(data,
       equality.variables = c(),
       lower.bound.variables = c(),
       upper.bound.variables = c(),
       norm.ball.variables = list('Time' = 5),
       func = function (df) {with(df, mean(Value))})

You can download this package from GitHub and play with it to see whether it helps you. Please submit feedback using GitHub if you have any comments, complaints or patches.

Comparing plyr with cumplyr
In the long run, I'm hoping to make the functions in cumplyr robust enough to submit a patch to plyr. I see these tools as one logical extension of plyr to encompass more of the framework described in Hadley's paper on the Split-Apply-Combine strategy.

For the time being, I would advise any users of cumplyr to make sure that you do not use cumplyr for anything that plyr could already do. cumplyr is very much demo software and I am certain that both its API and implementation will change. In contrast, plyr is fast and stable software that can be trusted to perform its job.

But, if you have a problem that cumplyr will solve and plyr will not, I hope you'll try cumplyr out and submit patches when it breaks.

Happy hacking!
]]></description>
<dc:subject>Programming Statistics</dc:subject>
<dc:identifier>https://pinboard.in/u:rahuldave/b:a5482cf69a97/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Programming"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Statistics"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://radar.oreilly.com/2012/05/functional-languages-functional-techniques.html">
    <title>Editorial Radar: Functional languages</title>
    <dc:date>2012-05-03T07:05:00+00:00</dc:date>
    <link>http://radar.oreilly.com/2012/05/functional-languages-functional-techniques.html</link>
    <dc:creator>rahuldave</dc:creator><description><![CDATA[Functional Languages are driving a broader set of choices for programmers. O'Reilly editors Mike Loukides and Mike Hendrickson sat down recently to talk about the advantages of functional programming languages and how functional language techniques can be deployed with almost any language. (The full conversation is embedded below.)



Andy Hunt and Dave Thomas have long recommended learning a new language each year, especially languages that teach new concepts [discussed at the 02:02 mark]. Functional languages have made that easier. They behave differently from the languages many of us grew up on: procedural languages like C or languages derived from C. Plus, the polyglot programming movement has driven interest in functional languages as one of the languages you might want to learn.



Programmers need to understand the advantages of using a functional language, such as productivity, power of expressiveness, reliability, stateful objects, concurrency, natural concurrency, modularity, and composability [05:37]. The search continues, though, for a magic bullet [06:29] that makes it easier for programmers to solve the problem of concurrency. CPU speeds have been stuck at roughly the same level for the last four to five years. What programmers have been given instead is more transistors on a chip, hence more CPUs and more cores to work with, making concurrency one of the most difficult issues facing computer scientists today. Enter functional programming, with improved debugging and the ability to write more reliable code in a concurrent environment.


Additional highlights from this conversation include:



 Print book sales of functional languages are growing, especially books on R programming. And while Loukides doesn't consider R to be a functional language, some debate exists about its classification. Though it's clear the data science movement has driven the use of R because it's well designed for statistics and dealing with data. [Discussed at the 00:29 mark]


 We'll see F# grow in the Microsoft development environment while Scala and Clojure are dominating the open source space. Erlang will also be around for a long time for building highly reliable concurrent systems. [Discussed at the 03:01 mark]



 Since the publication of Doug Crockford's JavaScript: The Good Parts, coders have discovered the functional language abilities of JavaScript and Java. Google's release of Maps and Gmail revolutionized how JavaScript is used. Some of today's best examples include Node for high-performance websites and D3 for creating exotic and beautiful data visualizations. [Discussed at the 08:15 mark]



 While JavaScript isn't a functional language, it's designed loosely, so it's easy to use as a functional language. You might also be interested in how functional programming techniques can be used in C++ — a blog post written by John Carmack. [Discussed at the 10:36 mark]



 Java isn't intended as a functional language. Though Dean Wampler's Functional Programming for Java Developers provides an approachable introduction to functional programming for anyone using an object-oriented language. [Discussed at the 11:41 mark]



 The use of a functional language or functional language techniques can make your code more robust and easier to debug. [Discussed at the 12:09 mark]








Tune in next month for a discussion of NoSQL and web databases.





    
]]></description>
<dc:subject>Programming clojure codepodcast concurrency d3 f functionalprogramming java javascript node rprogramming scala</dc:subject>
<dc:identifier>https://pinboard.in/u:rahuldave/b:eaa46ecde100/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Programming"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:clojure"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:codepodcast"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:concurrency"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:d3"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:f"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:functionalprogramming"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:java"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:javascript"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:node"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:rprogramming"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:scala"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://www.r-bloggers.com/comparing-julia-and-r%e2%80%99s-vocabularies/">
    <title>Comparing Julia and R’s Vocabularies</title>
    <dc:date>2012-04-09T14:00:19+00:00</dc:date>
    <link>http://www.r-bloggers.com/comparing-julia-and-r%e2%80%99s-vocabularies/</link>
    <dc:creator>rahuldave</dc:creator><description><![CDATA[
(This article was first published on   John Myles White » Statistics, and kindly contributed to R-bloggers)      


While exploring the Julia manual recently, I realized that it might be helpful to put the basic vocabularies of Julia and R side-by-side for easy comparison. So I took Hadley Wickham’s R Vocabulary section from the book he’s putting together on the devtools wiki, put all of the functions Hadley listed into a CSV file, and proceeded to fill in entries where I knew of an obvious Julia equivalent to an R function.

The results are on GitHub and, as they stand today, are shown below:



R,Julia,Category,Subcategory
https://github.com/hadley/devtools/wiki/vocabulary,http://julialang.org/manual/standard-library-reference/,Resources,Vocabulary
?,help,Basics,First Functions
str,,Basics,First Functions
%in%,,Basics,Operators
match,,Basics,Operators
=,=,Basics,Operators
<-,=,Basics,Operators
<<-,,Basics,Operators
assign,,Basics,Operators
$,[],Basics,Operators
[],[],Basics,Operators
[[]],[],Basics,Operators
replace,,Basics,Operators
head,,Basics,Operators
tail,,Basics,Operators
subset,,Basics,Operators
with,,Basics,Operators
within,,Basics,Operators
all.equal,,Basics,Comparison
identical,,Basics,Comparison
!=,!=,Basics,Comparison
==,==,Basics,Comparison
>,>,Basics,Comparison
>=,>=,Basics,Comparison
<,<,Basics,Comparison
<=,<=,Basics,Comparison
is.na,,Basics,Comparison
is.nan,,Basics,Comparison
is.finite,,Basics,Comparison
complete.cases,,Basics,Comparison
*,*,Basics,Basic Math
+,+,Basics,Basic Math
-,-,Basics,Basic Math
/,/,Basics,Basic Math
^,^,Basics,Basic Math
%%,mod (%%),Basics,Basic Math
%/%,div,Basics,Basic Math
abs,abs,Basics,Basic Math
sign,sign,Basics,Basic Math
acos,acos,Basics,Basic Math
acosh,acosh,Basics,Basic Math
asin,asin,Basics,Basic Math
asinh,asinh,Basics,Basic Math
atan,atan,Basics,Basic Math
atan2,atan2,Basics,Basic Math
atanh,atanh,Basics,Basic Math
sin,sin,Basics,Basic Math
sinh,sinh,Basics,Basic Math
cos,cos,Basics,Basic Math
cosh,cosh,Basics,Basic Math
tan,tan,Basics,Basic Math
tanh,tanh,Basics,Basic Math
ceiling,ceil,Basics,Basic Math
floor,floor,Basics,Basic Math
round,round,Basics,Basic Math
trunc,trunc,Basics,Basic Math
signif,,Basics,Basic Math
exp,exp,Basics,Basic Math
log,log,Basics,Basic Math
log10,log10,Basics,Basic Math
log1p,log1p,Basics,Basic Math
log2,log2,Basics,Basic Math
logb,,Basics,Basic Math
sqrt,sqrt,Basics,Basic Math
cummax,,Basics,Basic Math
cummin,,Basics,Basic Math
cumprod,cumprod,Basics,Basic Math
cumsum,cumsum,Basics,Basic Math
diff,diff,Basics,Basic Math
max,max,Basics,Basic Math
min,min,Basics,Basic Math
prod,prod,Basics,Basic Math
sum,sum,Basics,Basic Math
range,,Basics,Basic Math
mean,mean,Basics,Basic Math
median,median,Basics,Basic Math
cor,cor_pearson,Basics,Basic Math
cov,cov_pearson,Basics,Basic Math
sd,std,Basics,Basic Math
var,var,Basics,Basic Math
pmax,,Basics,Basic Math
pmin,,Basics,Basic Math
rle,,Basics,Basic Math
function,function,Basics,Functions
missing,,Basics,Functions
on.exit,,Basics,Functions
return,return,Basics,Functions
invisible,,Basics,Functions
&,&,Basics,Logical & Set Operations
|,|,Basics,Logical & Set Operations
!,!,Basics,Logical & Set Operations
xor,,Basics,Logical & Set Operations
all,all,Basics,Logical & Set Operations
any,any,Basics,Logical & Set Operations
intersect,intersect,Basics,Logical & Set Operations
union,union,Basics,Logical & Set Operations
setdiff,,Basics,Logical & Set Operations
setequal,,Basics,Logical & Set Operations
which,find,Basics,Logical & Set Operations
c,[] ({}),Basics,Vectors and Matrices
matrix,[] ({}),Basics,Vectors and Matrices
length,size (length),Basics,Vectors and Matrices
dim,size,Basics,Vectors and Matrices
ncol,"size(x, 2)",Basics,Vectors and Matrices
nrow,"size(x, 1)",Basics,Vectors and Matrices
cbind,hcat,Basics,Vectors and Matrices
rbind,vcat,Basics,Vectors and Matrices
names,,Basics,Vectors and Matrices
colnames,,Basics,Vectors and Matrices
rownames,,Basics,Vectors and Matrices
t,',Basics,Vectors and Matrices
diag,eye,Basics,Vectors and Matrices
sweep,,Basics,Vectors and Matrices
as.matrix,,Basics,Vectors and Matrices
data.matrix,,Basics,Vectors and Matrices
c,[] ({}),Basics,Making Vectors
rep,,Basics,Making Vectors
seq,[from:by:to],Basics,Making Vectors
seq_along,,Basics,Making Vectors
seq_len,[1:len],Basics,Making Vectors
rev,reverse,Basics,Making Vectors
sample,,Basics,Making Vectors
choose,factorial,Basics,Making Vectors
factorial,factorial,Basics,Making Vectors
combn,,Basics,Making Vectors
(is/as).(character/numeric/logical),,Basics,Making Vectors
list,HashTable ([]),Basics,Lists & Data Frames
unlist,,Basics,Lists & Data Frames
data.frame,,Basics,Lists & Data Frames
as.data.frame,,Basics,Lists & Data Frames
split,,Basics,Lists & Data Frames
expand.grid,,Basics,Lists & Data Frames
if,if,Basics,Control Flow
&&,&&,Basics,Control Flow
||,||,Basics,Control Flow
for,for,Basics,Control Flow
while,while,Basics,Control Flow
next,continue,Basics,Control Flow
break,break,Basics,Control Flow
switch,,Basics,Control Flow
ifelse,,Basics,Control Flow
fitted,,Statistics,Linear Models
predict,,Statistics,Linear Models
resid,,Statistics,Linear Models
rstandard,,Statistics,Linear Models
lm,,Statistics,Linear Models
glm,,Statistics,Linear Models
hat,,Statistics,Linear Models
influence.measures,,Statistics,Linear Models
logLik,,Statistics,Linear Models
df,,Statistics,Linear Models
deviance,,Statistics,Linear Models
formula,,Statistics,Linear Models
~,,Statistics,Linear Models
I,,Statistics,Linear Models
anova,,Statistics,Linear Models
coef,,Statistics,Linear Models
confint,,Statistics,Linear Models
vcov,,Statistics,Linear Models
contrasts,,Statistics,Linear Models
apropos('\\.test$'),,Statistics,Miscellaneous Statistical Tests
beta,beta,Statistics,Random Numbers
binom,binom,Statistics,Random Numbers
cauchy,cauchy,Statistics,Random Numbers
chisq,chisq,Statistics,Random Numbers
exp,exp,Statistics,Random Numbers
f,f,Statistics,Random Numbers
gamma,gamma,Statistics,Random Numbers
geom,geom,Statistics,Random Numbers
hyper,hyper,Statistics,Random Numbers
lnorm,lnorm,Statistics,Random Numbers
logis,logis,Statistics,Random Numbers
multinom,multinom,Statistics,Random Numbers
nbinom,nbinom,Statistics,Random Numbers
norm,norm,Statistics,Random Numbers
pois,pois,Statistics,Random Numbers
signrank,signrank,Statistics,Random Numbers
t,t,Statistics,Random Numbers
unif,unif (rand),Statistics,Random Numbers
weibull,weibull,Statistics,Random Numbers
wilcox,wilcox,Statistics,Random Numbers
birthday,birthday,Statistics,Random Numbers
tukey,tukey,Statistics,Random Numbers
crossprod,*,Statistics,Matrix Algebra
tcrossprod,*,Statistics,Matrix Algebra
eigen,eig,Statistics,Matrix Algebra
qr,qr,Statistics,Matrix Algebra
svd,svd,Statistics,Matrix Algebra
%*%,*,Statistics,Matrix Algebra
%o%,,Statistics,Matrix Algebra
outer,,Statistics,Matrix Algebra
rcond,,Statistics,Matrix Algebra
solve,\,Statistics,Matrix Algebra
duplicated,,Statistics,Ordering and Tabulating
unique,,Statistics,Ordering and Tabulating
merge,,Statistics,Ordering and Tabulating
order,,Statistics,Ordering and Tabulating
rank,,Statistics,Ordering and Tabulating
quantile,quantile,Statistics,Ordering and Tabulating
sort,sort,Statistics,Ordering and Tabulating
table,,Statistics,Ordering and Tabulating
ftable,,Statistics,Ordering and Tabulating
ls,whos,Working with R,Workspace
exists,,Working with R,Workspace
get,,Working with R,Workspace
rm,,Working with R,Workspace
getwd,getcwd,Working with R,Workspace
setwd,setcwd,Working with R,Workspace
q,Ctrl-D,Working with R,Workspace
source,load,Working with R,Workspace
install.packages,,Working with R,Workspace
library,,Working with R,Workspace
require,,Working with R,Workspace
help,help,Working with R,Help
?,help,Working with R,Help
help.search,,Working with R,Help
apropos,,Working with R,Help
RSiteSearch,,Working with R,Help
citation,,Working with R,Help
demo,,Working with R,Help
example,,Working with R,Help
vignette,,Working with R,Help
traceback,,Working with R,Debugging
browser,,Working with R,Debugging
recover,,Working with R,Debugging
options(error =),,Working with R,Debugging
stop,,Working with R,Debugging
warning,,Working with R,Debugging
message,,Working with R,Debugging
tryCatch,try/catch,Working with R,Debugging
try,try,Working with R,Debugging
print,print (println),I/O,Output
cat,,I/O,Output
message,,I/O,Output
warning,,I/O,Output
dput,,I/O,Output
format,,I/O,Output
sink,,I/O,Output
data,,I/O,Reading and Writing Data
count.fields,,I/O,Reading and Writing Data
read.csv,csvread,I/O,Reading and Writing Data
read.delim,dlmread,I/O,Reading and Writing Data
read.fwf,,I/O,Reading and Writing Data
read.table,,I/O,Reading and Writing Data
library(foreign),,I/O,Reading and Writing Data
write.table,dlmwrite,I/O,Reading and Writing Data
readLines,readlines,I/O,Reading and Writing Data
writeLines,,I/O,Reading and Writing Data
load,,I/O,Reading and Writing Data
save,,I/O,Reading and Writing Data
readRDS,,I/O,Reading and Writing Data
saveRDS,,I/O,Reading and Writing Data
dir,,I/O,Files and Directories
basename,,I/O,Files and Directories
dirname,,I/O,Files and Directories
file.path,,I/O,Files and Directories
path.expand,,I/O,Files and Directories
file.choose,,I/O,Files and Directories
file.copy,,I/O,Files and Directories
file.create,,I/O,Files and Directories
file.remove,,I/O,Files and Directories
path.rename,,I/O,Files and Directories
dir.create,,I/O,Files and Directories
file.exists,,I/O,Files and Directories
tempdir,,I/O,Files and Directories
tempfile,,I/O,Files and Directories
download.file,,I/O,Files and Directories
ISOdate,,Special Data,Date / Time
ISOdatetime,,Special Data,Date / Time
strftime,,Special Data,Date / Time
strptime,,Special Data,Date / Time
date,,Special Data,Date / Time
difftime,,Special Data,Date / Time
julian,,Special Data,Date / Time
months,,Special Data,Date / Time
quarters,,Special Data,Date / Time
weekdays,,Special Data,Date / Time
library(lubridate),,Special Data,Date / Time
grep,match,Special Data,Character Manipulation
agrep,,Special Data,Character Manipulation
gsub,,Special Data,Character Manipulation
strsplit,split,Special Data,Character Manipulation
chartr,,Special Data,Character Manipulation
nchar,strlen,Special Data,Character Manipulation
tolower,,Special Data,Character Manipulation
toupper,,Special Data,Character Manipulation
substr,,Special Data,Character Manipulation
paste,join,Special Data,Character Manipulation
library(stringr),,Special Data,Character Manipulation
factor,,Special Data,Factors
levels,,Special Data,Factors
nlevels,,Special Data,Factors
reorder,,Special Data,Factors
relevel,,Special Data,Factors
cut,,Special Data,Factors
findInterval,,Special Data,Factors
interaction,,Special Data,Factors
options(stringsAsFactors = FALSE),,Special Data,Factors
array,[],Special Data,Array Manipulation
dim,size,Special Data,Array Manipulation
dimnames,,Special Data,Array Manipulation
aperm,,Special Data,Array Manipulation
library(abind),,Special Data,Array Manipulation



I’d like to note that holes in the list of Julia functions can exist for several reasons:

The language does not yet have the relevant features. This is true of things like factor() or data.frame().
The language has draft implementations of the relevant features, but they are not yet ready to make their way into this list. This is true of Doug Bates’ GLM code, for example.
I simply don’t know what the Julia equivalent is for an R function, but it may well exist. If you know of one, please fork the GitHub repository I’m using and revise the CSV file appropriately. I’ll integrate relevant pull requests as soon as I can find time.
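
For readers who want to list the holes programmatically, here is a small Python sketch. It assumes the CSV uses the four column names shown in the table (R, Julia, Category, Subcategory); the actual file layout in the repository may differ, and the function name missing_equivalents is made up for this example.

```python
import csv
import io

# A handful of rows taken from the table in this post, in an assumed CSV layout.
sample = """R,Julia,Category,Subcategory
str,,Basics,First Functions
ceiling,ceil,Basics,Basic Math
signif,,Basics,Basic Math
rev,reverse,Basics,Making Vectors
data.frame,,Basics,Lists & Data Frames
"""

def missing_equivalents(csv_text):
    """Return the R names whose Julia column is empty."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["R"] for row in reader if not row["Julia"].strip()]

print(missing_equivalents(sample))
```

Pointing the same function at the full file would give the complete list of R functions still waiting for a Julia counterpart.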

In addition to explaining the presence of the many holes you can see in this list, I’d also like to note how quickly these holes are being filled in: Doug Bates already finished a wrapper for the Rmath library, which means that Julia now has tools for calculating the PDFs, CDFs, and inverse CDFs of most statistical distributions as well as the ability to draw random samples from them. That means that almost any sort of MCMC you’d like to do is already possible in Julia. (I, for one, am really interested to see if someone will use Julia’s sparse matrix support and these new Rmath functions to build MCMC code that’s easy on the eyes while also running at an appropriately fast speed on complicated, big data problems like matrix factorizations.)

On my end, I’ve been working on filling some of the missing entries in this list by adding in pieces that I think I understand well enough to implement from scratch, such as:

Optimization algorithms (optim.jl):

Simulated annealing
Gradient descent
Newton’s method

Statistical hypothesis tests (stats.jl):

t-Tests

Utility functions (utils.jl):

range
keys
cummax
cummin



To leave a comment for the author, please follow the link and comment on his blog:  John Myles White » Statistics.


]]></description>
<dc:subject>R_bloggers programming statistics</dc:subject>
<dc:identifier>https://pinboard.in/u:rahuldave/b:4a08330142b5/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:R_bloggers"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:programming"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:statistics"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://feedproxy.google.com/~r/oreilly/radar/atom/~3/0RJmqXtrung/profile-of-the-data-journalist-2.html">
    <title>Profile of the Data Journalist: The Human Algorithm</title>
    <dc:date>2012-03-02T20:53:01+00:00</dc:date>
    <link>http://feedproxy.google.com/~r/oreilly/radar/atom/~3/0RJmqXtrung/profile-of-the-data-journalist-2.html</link>
    <dc:creator>rahuldave</dc:creator><description><![CDATA[Around the globe, the bond between data and journalism is growing stronger. In an age of big data, the growing importance of data journalism lies in the ability of its practitioners to provide context, clarity and, perhaps most important, find truth in the expanding amount of digital content in the world. In that context, data journalism has profound importance for society.


To learn more about the people who are doing this work and, in some cases, building the newsroom stack for the 21st century, I conducted a series of email interviews during the 2012 NICAR Conference.


Ben Welsh (@palewire) is a Web developer and journalist based in Los Angeles. Our interview follows.
 

Where do you work now? What is a day in your life like?

I work for the Los Angeles Times, a daily
newspaper and 24-hour Web site based in Southern California. I'm a member
of the Data Desk, a team of reporters and
Web developers that specializes in maps, databases, analysis and
visualization. We both build Web applications and conduct analysis for
reporting projects.


I like to compare The Times to a factory, a factory that makes information.
Metaphorically speaking, it has all sorts of different assembly lines. Just
to list a few, one makes beautifully rendered narratives, another makes battleship-like investigative projects.


A typical day involves juggling work on different projects, mentally
moving from one assembly line to the other. Today I patched an embryonic open-source release, discussed our next move on a pending public records request, guided the real-time publication of results from the GOP primaries in Michigan and Arizona, and did some preparation for how we'll present a larger dump of results on Super Tuesday.


How did you get started in data journalism? Did you get any special
degrees or certificates?

I'm thrilled to see new-found interest in "data journalism" online. It's
drawing young, bright people into the field and involving people from
different domains. But it should be said that the idea isn't new.


I was initiated into the field as a graduate student at the Missouri School
of Journalism. There I worked at the National Institute for Computer-Assisted Reporting, also known as NICAR. Decades before anyone called it "data journalism," a disparate group of misfit reporters discovered that the data analysis made possible by computers enabled them to do more powerful investigative reporting. In 1989, they founded NICAR, which has, for decades, taught data skills
to journalists and nurtured a tribe of journalism geeks. In the time since, computerized data analysis has become a dominant force in investigative reporting, responsible for a large share of the field's best work.


To underscore my point, here's a 1986 Time magazine article about how
"newsmen are enlisting the machine."


Did you have any mentors? Who? What were the most important resources they
shared with you?

My first journalism job was in Chicago. I got a gig working for two great people there, Carol Marin and Don Moseley, who have spent most of their careers as television journalists. I worked as their assistant. Carol and Don are warm people who are good teachers, but they are also excellent at what they do. There was a moment when I realized, "Hey, I can do this!" It wasn't just something I heard about in class, but something I could actually see myself doing.


At Missouri, I had a great classmate named Brian
Hamman, who is now at the New York Times. I remember seeing how invested Brian was in the Web, totally committed to Web development as a career path. When an opportunity opened up to be a graduate assistant at NICAR, Brian encouraged me to pursue it. I learned enough SQL to help do farmed-out investigative work for TV stations. And, more importantly, I learned that if you had technical skills you could get the job to work on a cool story.


After that I got a job doing data analysis at the Center for Public Integrity in Washington DC. I had the opportunity to work on investigative projects, but also the chance to learn a lot of computer programming along the way. I had the guidance of my talented coworkers, Daniel Lathrop, Agustin Armendariz, John Perry, Richard Mullins and Helena Bengtsson. I learned that computer programming wasn't impossible. They taught me that if you have a manageable task, a few friends to help you out and a door you can close, you can figure out a lot.


What does your personal data journalism "stack" look like? What tools
could you not live without?

I do my daily development in gedit text editor, Byobu's slick implementation of the screen terminal and the Chromium browser. And, this part may be hard to believe, but I love Ubuntu
Unity. I don't understand what everybody is complaining about.


I do almost all of my data management in the Python Web development
framework Django and
PostgreSQL's database, even if
the work is an exploratory reporting project that will never be published. I find that the structure of the framework can be useful for organizing just about any data-driven project.


I use GitHub for both version-control and
project management. Without it, I'd be lost.


What data journalism project are you the most proud of working on or
creating?

As we all know, there's a lot of data out there. And, as anyone who works
with it knows, most of it is crap. The projects I'm most proud of have
taken large, ugly data sets and refined them into something worth knowing: 
a nut graf in an investigative story, or a
data-driven app that gives the reader some new
insight into the world around them. It's impossible to pick one. I like to
think the best is still, as they say in the newspaper business,
TK.


Where do you turn to keep your skills updated or learn new things?

Twitter is a great way to keep up with what is getting other programmers excited. I know a lot of people find social media overwhelming or distracting, but I feel plugged in and inspired by what I find there. I wouldn't want to live without it.


GitHub is another great source. I've learned so much just exploring other
people's code. It's invaluable.


Why are data journalism and "news apps" important, in the context of the
contemporary digital environment for information?

Computers offer us an opportunity to better master information, better
understand each other and better watchdog those who would govern us. At last
week's NICAR conference, in a talk called "Human-Assisted Reporting," I tried
to talk about some of the ways that simply thinking about the process of
journalism as an algorithm can point the way. In my opinion, we should aspire to write code that embodies the idealistic principles and investigative methods of the previous generation. There's all this data out there now, and journalistic algorithms, "robot
reporters," can help us ask it tougher questions.



    
]]></description>
<dc:subject>Data Gov_2.0 Publishing dataconference datajournalism dataproduct datascience nicarinterview opensource programming</dc:subject>
<dc:identifier>https://pinboard.in/u:rahuldave/b:c4879a52616b/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Data"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Gov_2.0"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Publishing"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:dataconference"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:datajournalism"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:dataproduct"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:datascience"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:nicarinterview"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:opensource"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:programming"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://www.johndcook.com/blog/2012/02/22/julia-random-number-generation/">
    <title>Julia random number generation</title>
    <dc:date>2012-02-23T03:48:00+00:00</dc:date>
    <link>http://www.johndcook.com/blog/2012/02/22/julia-random-number-generation/</link>
    <dc:creator>rahuldave</dc:creator><description><![CDATA[Julia is a new programming language for scientific computing. From the Julia site:

Julia is a high-level, high-performance dynamic programming language for technical computing, with syntax that is familiar to users of other technical computing environments. It provides a sophisticated compiler, distributed parallel execution, numerical accuracy, and an extensive mathematical function library. …

I just started playing around with it. I didn’t see functions for non-uniform random number generation so I wrote some as a way to get started.

[Update: there are non-uniform random number generators in Julia, but they have not been added to the documentation yet. See details in this comment.]

Here’s a random number generator for normal (Gaussian) random values:

## return a random sample from a normal (Gaussian) distribution
function rand_normal(mean, stdev)
    if stdev <= 0.0
        error("standard deviation must be positive")
    end
    u1 = rand()
    u2 = rand()
    r = sqrt( -2.0*log(u1) )
    theta = 2.0*pi*u2
    mean + stdev*r*sin(theta)
end
From this you can see Julia is a low-ceremony language: Python-like syntax, you can call common mathematical functions without having to do anything special, etc. You can have explicit return statements, but the preferred style seems to be to let the last line of the function be the implicit return statement.
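
For comparison, here is a minimal Python sketch of the same calculation, which is the Box-Muller transform. It is an illustration and not part of the original post: the Python function name simply mirrors the Julia one, the optional rng parameter is an assumption added for testability, and the 1 - u adjustment is a small safeguard not present in the Julia code.

```python
import math
import random


def rand_normal(mean, stdev, rng=random):
    """Return one sample from a normal (Gaussian) distribution (Box-Muller)."""
    if stdev <= 0.0:
        raise ValueError("standard deviation must be positive")
    # random() returns values in [0, 1); using 1 - u keeps log() away from zero.
    u1 = 1.0 - rng.random()
    u2 = rng.random()
    r = math.sqrt(-2.0 * math.log(u1))
    theta = 2.0 * math.pi * u2
    # Unlike the Julia version, Python needs an explicit return statement.
    return mean + stdev * r * math.sin(theta)
```

Calling rand_normal(10.0, 2.0) many times should give values whose average is close to 10 and whose standard deviation is close to 2.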

My most common mistake so far has been forgetting to close code blocks with end; Julia’s syntax is similar enough to Python that I suppose I think indentation should be sufficient.

I’ve written random number generators for the following probability distributions:


Beta
Cauchy
Chi square
Exponential
Inverse gamma
Laplace (double exponential)
Normal
Student t
Uniform
Weibull

You can find the code here: Non-uniform random number generation in Julia.

]]></description>
<dc:subject>Software_development Julia Programming</dc:subject>
<dc:identifier>https://pinboard.in/u:rahuldave/b:d487208e6c8b/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Software_development"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Julia"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Programming"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://code.google.com/edu/languages/google-python-class/introduction.html">
    <title>Python Introduction - Google's Python Class - Google Code</title>
    <dc:date>2012-02-02T18:07:10+00:00</dc:date>
    <link>http://code.google.com/edu/languages/google-python-class/introduction.html</link>
    <dc:creator>rahuldave</dc:creator><dc:subject>google python programming tutorial</dc:subject>
<dc:source>https://pinboard.in/</dc:source>
<dc:identifier>https://pinboard.in/u:rahuldave/b:073050eb4a0a/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:google"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:python"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:programming"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:tutorial"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://www.25hoursaday.com/weblog/2012/01/03/WhatILearnedAfter3WeeksOfWritingMobileApps.aspx">
    <title>What I Learned After 3 Weeks of Writing Mobile Apps</title>
    <dc:date>2012-01-03T15:13:13+00:00</dc:date>
    <link>http://www.25hoursaday.com/weblog/2012/01/03/WhatILearnedAfter3WeeksOfWritingMobileApps.aspx</link>
    <dc:creator>rahuldave</dc:creator><description><![CDATA[
Towards the end of last year, I realized I was about to bump up against the “use
it or lose it” vacation policy at work which basically means I either had to take
about two weeks of paid vacation or forfeit the vacation. Since I hadn’t planned the
time off I immediately became worried about what to do with all that idle time especially
since if left to my own devices I’d play 80 straight hours of Modern
Warfare 3 without pause. 



To make sure the time was productively used I decided to write a mobile app as a learning
exercise about the world of mobile development since I’ve read so much about it and
part of my day job is building
APIs for developers of mobile apps. I ended up enjoying the experience so much
I added an extra week of vacation and wrote two apps for Windows Phone. I’d originally
planned to write one app for Windows Phone then port it to iOS or Android but gave
up on that due to time constraints after some investigation of both. 



I learned a bunch about mobile development from this exercise and a few friends have
asked me to share some of my thoughts on mobile development in general and building for
Windows Phone using Microsoft platforms in particular. If you are already a mobile
developer then some of this is old hat to you but I did find a bunch of what I learned
to be counterintuitive and fairly eye opening so you might too. 


Thoughts on Building Mobile Apps on Any Platform


This section is filled with items I believe are generally applicable if building iOS,
Android or Windows Phone apps. These are mostly things I discovered as part of my
original plan to write one app for all three platforms. 




A consistent hardware ecosystem is a force multiplier


After realizing the only option for doing iPhone development on Windows was the Dragon
Fire SDK which only supports games, I focused on learning as much as I could about Android
development options. The Xamarin guys have MonoTouch which sounded very appealing
to me as a way to leverage C# skills across Android and Windows Phone until I saw
the $400 price tag. :)



One of the things I noticed upon downloading the Android SDK as compared to installing
the Windows Phone SDK is that the Android one came with a bunch of emulators and SDKs
for various specific devices. As I started development on my apps, there were many
times I was thankful for the consistent
set of hardware specifications for Windows Phone. Knowing that the resolution
was always going to be WVGA, so that if something looked good in the emulator it
would look good on my device and those of my beta testers, not only gave me peace of mind
but made UX development a breeze. 



Compare this to an ecosystem like Android, where the diversity of hardware devices
with varying screen resolutions has
made developers effectively throw up their hands, as in this article quoted by
Jeffrey Zeldman:


If … you have built your mobile site using fixed widths (believing
that you’ve designed to suit the most ‘popular’ screen size), or are planning to serve
specific sites to specific devices based on detection of screen size, Android’s settings
should serve to reconfirm how counterproductive a practice this can be. Designing
to fixed screen sizes is in fact never a good idea…there is just too much variation,
even amongst ‘popular’ devices. Alternatively, attempting to track, calculate, and
adjust layout dimensions dynamically to suit user-configured settings or serendipitous
conditions is just asking for trouble. 

Basically, you’re just screwed if you think you can build a UI that will work on all
Android devices. This is clearly not the case if you target Windows Phone or iOS development.
This information combined with my experiences building for Windows Phone convinced
me that it is more likely I’ll buy a Mac and start iOS development than it is that
I’d ever do Android development. 




No-name Web Hosting vs. name brands like Amazon Web Services and Windows Azure 


One of my apps had a web service requirement and I initially spent some time investigating
both Windows Azure and Amazon Web Services. Since this was a vacation side project
I didn’t want expenses to get out of hand so I was fairly price sensitive. Once I
discovered AWS charged less for Linux servers I spent a day or two getting my Linux
chops up to speed given I hadn’t used it much since the early 2000s. This is where
I found out about yum and
discovered the interesting paradox that discovering and installing software on modern
Linux distros is simultaneously much easier and much harder than doing so on Windows
7. Anyway, that’s a discussion for another day. 



I soon realized I had been penny wise and pound foolish when focusing on the cost
of Linux hosting when it turns out what breaks the bank is database hosting. Amazon
charges about $0.11 an hour ($80 a month)
for RDS hosting at the low end. Windows Azure seemed to charge in the same
ballpark when I looked two months ago but it seems they’ve revamped
their pricing site since I did my investigation. 
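
As a rough check on the hourly-to-monthly conversion quoted above, here is a small Python sketch. It assumes a 30-day month of round-the-clock usage; the function name and the default are illustrative and not taken from any pricing page.

```python
HOURS_PER_DAY = 24


def monthly_cost(hourly_rate, days_in_month=30):
    """Convert an hourly hosting rate into an approximate monthly cost."""
    # Assumes the instance runs around the clock for the whole month.
    return hourly_rate * HOURS_PER_DAY * days_in_month


rds_monthly = monthly_cost(0.11)  # the low-end hourly rate quoted above
print(f"RDS: ${rds_monthly:.2f}/month")  # prints RDS: $79.20/month
```

That works out to roughly the $80 a month mentioned above.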



Once I realized database hosting would be the big deciding factor in cost, it was
easier for me to stick with the familiar and go with WISC instead of LAMP as a
server stack. If I had stuck with LAMP, I could have gone with a provider like Blue Host to
get the entire web platform + database stack for less than $10 with perks like free
credits for Google ads thrown in. With the WISC stack, hosters like Discount ASP and Webhost
4 Life charge in the ballpark of $15 which is about $10 if you swap out SQL Server
for MySQL. 



These prices were more my speed. I was quite surprised that even though all the blogs
talk about AWS and Azure, it made the most sense for my bootstrapped apps to start
with a vanilla web host and pay up to ten times less for service than using one of
the name brand cloud computing services. Paying almost $100 a month for services
with elastic scaling properties may make sense if my apps stick around and become
super successful but not at the start. 



Another nice side effect of going with a web hosting provider is the reduced complexity
compared to going with a cloud services provider. Anyone who's gone through the AWS
getting started guides after coming from vanilla web hosting knows what I mean.




Facebook advertising beats search ads for multiple app categories


As mentioned above, one of the perks of some of the vanilla hosting providers is that
they throw in free credits for ads on Google AdSense/Adwords and Facebook ads as part
of the bundle. I got to experiment with buying ads on both platforms and I came away
very impressed with what Facebook has built as an advertising platform. 



I remember reading a few years ago that MySpace
had taught us social networks are bad for advertisers. Things are very different
in today’s world. With search ads, I can choose to show ads alongside results when
people search for a term that is relevant to my app. With Facebook ads, I get to narrowly
target demographics based on detailed profile attributes such as Georgia Tech alumni
living in New York who have expressed an interest in DC or Marvel comics. The latter
seems absurd at first until you think about an app like Instagram. 



No one is searching for "best photo sharing app for the iphone" on Google and even
if you are one of the few people who has, there aren’t a lot of you. On the other
hand, at launch the creators of Instagram could go to Facebook and say we'd like to
show ads to people who have liked or use an iPhone and who also have shown an affiliation
for photo sharing apps or sites like Flickr, Camera+, etc then craft specific pitches
for those demographics. I don’t know about you but I know which sounds like it would
be more effective and relevant. 



This also reminded me that I'd actually clicked on more ads on Facebook than I've
ever clicked on search ads. 




Lots of unfilled niches still exist


I remember being in college back in the day, flipping through my copy of Yahoo!
Internet Life and thinking that we were oversaturated with websites and all the
good ideas were already taken. This was before YouTube, Flickr, SkyDrive, Facebook
or Twitter. Silly me. 



The same can be said about mobile apps today. I hear a lot about there being 500,000
apps in the Apple app store and the
same number being in Android Market. To some this may seem overwhelming but there
are clearly still niches that are massively underserved on those platforms and especially
on Windows Phone which just
hit 50,000 apps. 



There are a lot of big and small problems in people's lives that can be addressed
by bringing the power of the web to the devices in their pockets in a tailored way.
The one thing I was most surprised by is how many apps haven't been written that you'd
expect to exist just from extrapolating what we have on the Web and the offline world
today. I don't just mean geeky things like a
non-propeller head way to share bookmarks from my desktop to my phone and vice versa
without emailing myself but instead applications that would enrich the lives of
millions of regular people out there that they'd be gladly willing to pay $1 for (less
than the price of most brands of bubble gum these days). 



If you are a developer, don't be intimidated by the size of the market nor be attracted
to the stories of the folks who've won the lottery by gambling on being in the right
place at the right time with the right gimmick (fart
apps, sex position guides and yet
another photo sharing app). There are a lot of problems that can be solved or
pleasant ways to pass the time on a mobile device that haven’t yet been built. Look
around at your own life and talk to your non-technical friends about their days. There
is lots of inspiration out there if you just look for it. 




Look for Platforms that Favor User Experience over Developer Experience


One of the topics I’ve wanted to write about in this blog is how my generation of
software developers, who came of age with the writings of Richard Stallman and Eric
Raymond’s The
Cathedral and the Bazaar with its focus on making
the developers who use the software happy, collides with the philosophy of software
developers who have come of age in the era of Steve Jobs and what Dave Winer has called The
Un-Internet, where focusing on providing a controlled experience which is smoother
for end users leads to developers playing second fiddle. 



As a developer, having to submit my app to some app store to get it certified when
I could publish on the web as soon as I was done checking in the code to my local
github repository is something I chafe against. When working on my Windows Phone apps,
I submitted one to be certified and found prominent typos a few hours later. However
there was nothing I could do but wait for five
business days for my app to be approved after which I could submit the updated
version to be certified which would take another week in calendar days. Or so I thought. 



My initial app submission was rejected for failing a test case around proper handling
of lack of network connectivity. I had cut some corners in my testing when it came
to testing network availability support once I discovered NetworkInterface.GetIsNetworkAvailable()
always returns true in the emulator which meant I had to actually test that process
on my phone. I never got around to it, telling myself no one actually expects a
network connected app to work if they don’t have connectivity. 
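
The general idea of failing gracefully when connectivity is lost can be sketched in a few lines. This is a minimal illustration in Python, not the C# used for the Windows Phone apps discussed here; the function name, its parameters and the injectable opener are all hypothetical.

```python
import urllib.request


def fetch_or_fallback(url, fallback, opener=urllib.request.urlopen, timeout=5):
    """Return the response body, or `fallback` if the network call fails."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.read()
    except OSError:
        # urllib.error.URLError and socket timeouts are OSError subclasses,
        # so a dropped connection lands here instead of crashing the app.
        return fallback
```

Passing the opener in as a parameter makes the offline path easy to exercise in a test without actually pulling the network cable.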



The Windows Phone marketplace rejected my app because it turns out it crashes if you
lose network connectivity. I was actually pretty impressed that someone at Microsoft
is tasked with making sure any app a user installs from the store doesn't crash for
common edge cases. Then I thought about the fact that my wife, my 3 year old son,
and my non-technical friends all use mobile apps and it is great that this
base set of quality expectations is being built into the platform. Now when I think
back to Joe
Hewitt famously quitting the Apple App store and compare it to the
scam of the week culture that plagues the Android marketplace, I know which model
I prefer as a user and a developer. It’s the respect for the end user experience I
see coming out of Cupertino and Redmond. 



This respect for end users ends up working for developers which is why there really
is no surprise that iOS
devs make 6 times more than their Android counterparts because users are more
likely to spend money on apps on iOS. 





Thoughts on Microsoft-Specific Development 


In addition to the general thoughts there were some things specific to either Windows
Phone or WISC development I thought were worth sharing as well. Most of these were things I found
on the extremely excellent Stack Overflow,
a site which cannot be praised enough. 




Free developer tools ecosystem around Microsoft technology is mature and surprisingly
awesome


As a .NET developer I’ve been socialized into thinking that Microsoft tools are the
realm of paying an arm and a leg for tools while people building on Open Source tools
get great tools for free. When I was thinking about building my apps on Linux I actually
got started using Python for a web crawler that was intended to be part of my app
as well as for my web services. When I was looking at Python I played around with web.py and
wrote the first version of my crawler using Beautiful
Soup. 



As I moved on to .NET I worried I’d be stuck for such excellent free tooling but
that was not the case. I found similar and in some cases better functionality for
what I was looking for in Json.NET and the HTML
Agility Pack. Besides a surprising amount of high quality, free libraries for
.NET development, it was the free tools for working with SQL Server that sent me over
the top. Once I grabbed SQL
Complete, an autocomplete/Intellisense tool for SQL Server, I felt my development
life was complete. Then I found ELMAH.
Fatality…I’m dead and in developer heaven. 




Building RESTful web services that emit JSON wasn't an expected scenario from
Microsoft dev tools teams?


As part of my day job, I'm responsible for Live Connect which
among other things provides a set of RESTful JSON-based APIs for accessing data in
SkyDrive, Hotmail and Windows Live Messenger. So it isn't surprising that when I wanted
to build a web service for one of my side projects I'd want to do the same. This is
where things broke down. 



The last time I looked at web services development on the WISC stack,
the way to build web services was to use Windows
Communication Foundation (WCF). So I decided to take a look at that and found
out that the product doesn’t really support JSON-based web services out of the box
but I could grab something called the WCF
Web API off of CodePlex. Given the project seemed less mature than the others
I’d gotten off of CodePlex I decided to look at ASP.NET and see what I could get there
since it needs to enable JSON-based REST APIs as part of its much touted JQuery support.
When I got to the ASP.NET getting started
page, I was greeted with the statement that ASP.NET enables building 3 patterns
of websites and I should choose my install based on what I wanted. Given that I
wanted to build a web service, not an actual website, I didn't pick any of them.



Since I was short on time (after all, this was my vacation) I went back to what I
was familiar with and used ASP.NET
web services with HTTP GET & POST enabled. I’m not sure what the takeaway is here
since I clearly built my solution using a hacky mechanism and not a recommended approach
yet it is surprising to me that what seems like such a mainline scenario isn’t supported
in a clear out-of-the-box manner by Microsoft’s dev tools. 




Embrace the Back Button on Windows Phone


One of the things I struggled with the most as part of Windows Phone development was dealing
with the application lifecycle. The problem is that at any point the user can
jump out of your app and the operating system will put your app in either a dormant
state where data is still stored in memory or tombstone your app in which case it
is killed and state your app cares about is preserved. 



One of the ways I eventually learned to think about this the right way was to aggressively
use the back button while testing my app. This led to finding all sorts of interesting
problems and solutions such as
how to deal with a login screen when the user clicks back and that a lot of logic
I thought should be in the constructor of a page really should be in the OnNavigatedTo method
(and don’t forget to de-register some of those event handlers in your OnNavigatedFrom event
handler). 





I could probably write more on this but this post has gotten longer than I planned
and I need to take my son to daycare & get ready for work. I’ll try to be a more diligent
blogger this year, provided the above doesn’t make too many people unhappy.



Happy New Year. 





    
]]></description>
<dc:subject>Programming Web_Development</dc:subject>
<dc:identifier>https://pinboard.in/u:rahuldave/b:6eb08c4daf69/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Programming"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Web_Development"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://eflorenzano.com/blog/2012/01/01/reducing-code-nesting/">
    <title>Reducing Code Nesting</title>
    <dc:date>2012-01-02T20:24:35+00:00</dc:date>
    <link>http://eflorenzano.com/blog/2012/01/01/reducing-code-nesting/</link>
    <dc:creator>rahuldave</dc:creator><description><![CDATA[Comments]]></description>
<dc:subject>programming</dc:subject>
<dc:identifier>https://pinboard.in/u:rahuldave/b:52d9e585aa69/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:programming"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://radar.oreilly.com/2011/12/four-short-links-28-december-2-1.html">
    <title>Four short links: 28 December 2011</title>
    <dc:date>2011-12-28T11:00:00+00:00</dc:date>
    <link>http://radar.oreilly.com/2011/12/four-short-links-28-december-2-1.html</link>
    <dc:creator>rahuldave</dc:creator><description><![CDATA[
Terrier IR -- open source (Mozilla) text search engine, now with Hadoop support.
s3ql -- open source (GPLv3) Linux filesystem which stores its data on Google Storage, Amazon S3, or OpenStack. (via Adam Shand)
Esprima -- open source (BSD) fast Javascript parser in Javascript. (via Javascript Weekly)
Hogan.js -- open source (Apache) Javascript templating engine from Twitter. If it proves anywhere near as good as Bootstrap, it'll be heavily used.





    
]]></description>
<dc:subject>cloud javascript opensource programming search storage textanalysis web</dc:subject>
<dc:identifier>https://pinboard.in/u:rahuldave/b:f4b01c498a92/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:cloud"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:javascript"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:opensource"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:programming"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:search"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:storage"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:textanalysis"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:web"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://feedproxy.google.com/~r/oreilly/radar/atom/~3/jLZhw5uJ1Ms/four-short-links-28-december-2-1.html">
    <title>Four short links: 28 December 2011</title>
    <dc:date>2011-12-28T11:00:00+00:00</dc:date>
    <link>http://feedproxy.google.com/~r/oreilly/radar/atom/~3/jLZhw5uJ1Ms/four-short-links-28-december-2-1.html</link>
    <dc:creator>rahuldave</dc:creator><description><![CDATA[
Terrier IR -- open source (Mozilla) text search engine, now with Hadoop support.
s3ql -- open source (GPLv3) Linux filesystem which stores its data on Google Storage, Amazon S3, or OpenStack. (via Adam Shand)
Esprima -- open source (BSD) fast Javascript parser in Javascript. (via Javascript Weekly)
Hogan.js -- open source (Apache) Javascript templating engine from Twitter. If it proves anywhere near as good as Bootstrap, it'll be heavily used.





    
]]></description>
<dc:subject>cloud javascript opensource programming search storage textanalysis web</dc:subject>
<dc:identifier>https://pinboard.in/u:rahuldave/b:8604fc8a64ab/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:cloud"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:javascript"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:opensource"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:programming"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:search"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:storage"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:textanalysis"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:web"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://rss.slashdot.org/~r/slashdot/eqWf/~3/65pGuqnXP-o/why-was-hypercard-killed">
    <title>Why Was Hypercard Killed?</title>
    <dc:date>2011-11-30T18:33:00+00:00</dc:date>
    <link>http://rss.slashdot.org/~r/slashdot/eqWf/~3/65pGuqnXP-o/why-was-hypercard-killed</link>
    <dc:creator>rahuldave</dc:creator><description><![CDATA[theodp writes "Steve Jobs took the secret to his grave, but Stanislav Datskovskiy offers some interesting and illustrated speculation on why HyperCard had to die. 'Jobs was almost certainly familiar with HyperCard and its capabilities,' writes Datskovskiy. 'And he killed it anyway. Wouldn't you love to know why? Here's a clue: Apple never again brought to market anything resembling HyperCard. Despite frequent calls to do so. Despite a more-or-less guaranteed and lively market. And I will cautiously predict that it never will again. The reason for this is that HyperCard is an echo of a different world. One where the distinction between the "use" and "programming" of a computer has been weakened and awaits near-total erasure. A world where the personal computer is a mind-amplifier, and not merely an expensive video telephone. A world in which Apple's walled garden aesthetic has no place.' Slashdotters have bemoaned the loss of HyperCard over the past decade, but Datskovskiy ends his post on a keep-hope-alive note, saying: 'Contemplate the fact that what has been built once could probably be built again.' Where have you gone, Bill Atkinson, a nation of potential programmers turns its lonely eyes to you."
   
      



]]></description>
<dc:subject>programming</dc:subject>
<dc:identifier>https://pinboard.in/u:rahuldave/b:7b9cc06b0de3/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:programming"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://www.johndcook.com/blog/2011/11/28/fundamental-theorem-of-readability/">
    <title>Fundamental theorem of code readability</title>
    <dc:date>2011-11-28T22:08:02+00:00</dc:date>
    <link>http://www.johndcook.com/blog/2011/11/28/fundamental-theorem-of-readability/</link>
    <dc:creator>rahuldave</dc:creator><description><![CDATA[In The Art of Readable Code, the authors call the following the “Fundamental Theorem of Readability”:

Code should be written to minimize the time it would take for someone else to understand it.

They go on to explain

And when we say “understand,” we have a very high bar … they should be able to make changes to it, spot bugs, and understand how it interacts with the rest of your code.



]]></description>
<dc:subject>Software_development Books Programming</dc:subject>
<dc:identifier>https://pinboard.in/u:rahuldave/b:fdd9ecb95dae/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Software_development"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Books"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Programming"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://www.johndcook.com/blog/2011/11/14/separating-presentation-from-content/">
    <title>Separating presentation from content</title>
    <dc:date>2011-11-14T18:47:29+00:00</dc:date>
    <link>http://www.johndcook.com/blog/2011/11/14/separating-presentation-from-content/</link>
    <dc:creator>rahuldave</dc:creator><description><![CDATA[In the late ’90s I went to a fair number of Microsoft presentations. One presentation would say “The problem with Technology X is that it mixes presentation and content. We’ve introduced Technology Y to make your code cleaner, separating presentation and content.” A few months later I’d be at another presentation that would announce “The problem with Technology Y is that it mixes presentation and content. We’ve introduced Technology Z …” (Does this remind anyone else of The Cat in the Hat Comes Back?)

When I first learned LaTeX, I was told that one of its strengths is that it separates presentation and content. Then a few years later I hear complaints that the problem with LaTeX is that it mingles presentation and content, unlike XHTML. A few years later, guess what? XHTML mixes presentation and content, so we need something else.

I shut down when I hear someone announce that everything before their product was bad because it mixed presentation and content, and now with their solution, presentation and content will be completely separate.

Sometimes one technology really does make a cleaner separation of presentation and content. But at best the separation is relative. LaTeX separates presentation and content more than Word, though not as much as well-written HTML and CSS, maybe. But presentation and content cannot be entirely separated. Nor is there unanimous agreement on what exactly the dividing line is between the two.

Many people don’t want to separate their presentation and content. They don’t understand why this would be desirable, and they’ll fight against anything designed to encourage separation. Maybe they need to learn the advantages, or maybe they’re just doing the best they can to get their job done and they can’t be bothered with long term advantages that may not materialize.

The principle of separating presentation and content is admirable. It really does have advantages, but it’s easier said than done.

]]></description>
<dc:subject>Software_development LaTeX Programming</dc:subject>
<dc:identifier>https://pinboard.in/u:rahuldave/b:dac78de91c8a/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Software_development"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:LaTeX"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Programming"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://rss.slashdot.org/~r/slashdot/eqWf/~3/BAdyB9M-hGw/microsoft-roslyn-reinventing-the-compiler-as-we-know-it">
    <title>Microsoft Roslyn: Reinventing the Compiler As We Know It</title>
    <dc:date>2011-10-21T15:36:00+00:00</dc:date>
    <link>http://rss.slashdot.org/~r/slashdot/eqWf/~3/BAdyB9M-hGw/microsoft-roslyn-reinventing-the-compiler-as-we-know-it</link>
    <dc:creator>rahuldave</dc:creator><description><![CDATA[snydeq writes "Fatal Exception's Neil McAllister sees Microsoft's Project Roslyn potentially reinventing how we view compilers and compiled languages. 'Roslyn is a complete reengineering of Microsoft's .NET compiler toolchain in a new way, such that each phase of the code compilation process is exposed as a service that can be consumed by other applications,' McAllister writes. 'The most obvious advantage of this kind of "deconstructed" compiler is that it allows the entire compile-execute process to be invoked from within .NET applications. With the Roslyn technology, C# may still be a compiled language, but it effectively gains all the flexibility and expressiveness that dynamic languages such as Python and Ruby have to offer.'"
   
      

]]></description>
<dc:subject>programming</dc:subject>
<dc:identifier>https://pinboard.in/u:rahuldave/b:5042ae9c6746/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:programming"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://www.johndcook.com/blog/2011/09/27/sed-one-liners/">
    <title>Sed one-liners</title>
    <dc:date>2011-09-27T15:56:20+00:00</dc:date>
    <link>http://www.johndcook.com/blog/2011/09/27/sed-one-liners/</link>
    <dc:creator>rahuldave</dc:creator><description><![CDATA[A few weeks ago I reviewed Peteris Krumins’ book Awk One-Liners Explained. This post looks at his sequel, Sed One-Liners Explained.

The format of both books is the same: one-line scripts followed by detailed commentary. However, the sed book takes more effort to read because the content is more subtle. The awk book covers the most basic features of awk, but the sed book goes into the more advanced features of sed.

Sed One-Liners Explained provides clear explanations of features I found hard to understand from reading the sed documentation. If you want to learn sed in depth, this is a great book. But you may not want to learn sed in depth; the oldest and simplest parts of sed offer the greatest return on time invested. Since the book is organized by task — line numbering, selective printing, etc — rather than by language feature, the advanced and basic features are mingled.

On the other hand, there are two appendices organized by language feature. Depending on your learning style, you may want to read the appendices first or jump into the examples and refer to the appendices only as needed.

For a sample of the book, see the table of contents, preface, and first chapter here.

Related links:

Learn one sed command
Daily tips on sed and awk

]]></description>
<dc:subject>Software_development Books Programming Sed</dc:subject>
<dc:identifier>https://pinboard.in/u:rahuldave/b:f6b7a7a6770b/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Software_development"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Books"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Programming"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Sed"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://rss.slashdot.org/~r/slashdot/eqWf/~3/pHX0S2EU4ZU/Client-side-Web-REPL-For-15-Languages">
    <title>Client-side Web REPL For 15+ Languages</title>
    <dc:date>2011-09-20T22:18:00+00:00</dc:date>
    <link>http://rss.slashdot.org/~r/slashdot/eqWf/~3/pHX0S2EU4ZU/Client-side-Web-REPL-For-15-Languages</link>
    <dc:creator>rahuldave</dc:creator><description><![CDATA[In his first accepted submission, MaxShaw writes "repl.it is an online REPL that supports running code in 15+ languages, from Ruby to Scheme to QBasic, in the browser. It is intended as a tool for learning new languages and experimenting with code on the go. All the code is open sourced under the MIT license and available from GitHub."

A few of the languages are supported by reusing existing "Foolang in Javascript" interpreters, but a number of them are built using Emscripten (previously used to build Doom for the browser). All evaluation occurs client side, but saved sessions are stored on their server.
   
      



]]></description>
<dc:subject>programming</dc:subject>
<dc:identifier>https://pinboard.in/u:rahuldave/b:a619c38aa3d9/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:programming"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://www.johndcook.com/blog/2011/04/19/learn-one-sed-command/">
    <title>Learn one sed command</title>
    <dc:date>2011-04-19T12:03:15+00:00</dc:date>
    <link>http://www.johndcook.com/blog/2011/04/19/learn-one-sed-command/</link>
    <dc:creator>rahuldave</dc:creator><description><![CDATA[You may have seen sed programs even if you didn’t know that’s what they were. In online discussions it’s common to hear someone say

s/foo/bar/
as a shorthand to mean “replace foo with bar.” The line s/foo/bar/ is a complete sed program to do such a replacement.

sed comes with every Unix-like operating system and is available for Windows here. It has a range of features for editing files, but sed is worth using even if you only know how to do one thing with it:

sed "s/pattern1/pattern2/g" file.txt > newfile.txt
This will replace every instance of pattern1 with pattern2 in the file file.txt and will write the result to newfile.txt. The original file file.txt is unchanged.

I used to think there was no reason to use sed when other languages like Python will do everything sed does and much more. Suppose you agree with that. Now suppose you find you often have to make global search-and-replace operations and so you write a script to do this, say a Python script. You’ve got to call your script something, remember what you called it, and put it in your path. How about calling it sed? Or better, don’t write your script, but pretend that you did. If you’re on Linux, it’s already in your path. One advantage of the real sed over your script named sed is that the former can do a lot more, should you ever need it to.

Now for a few details regarding the sed command above. The “s” on the front stands for “substitute” and the “g” on the end stands for “global.” Without the “g” on the end, sed would only replace the first instance of the pattern on each line. If that’s what you want, then remove the “g.”
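
As a rough illustration in Python (the language the post compares sed against), the effect of the “g” flag can be sketched with re.sub. The function name sed_substitute and the sample text are made up for this example, and sed's default regular expression dialect differs from Python's, so this mirrors only the simple case.

```python
import re

def sed_substitute(pattern, replacement, text, global_flag=True):
    """Mimic sed's s/pattern/replacement/ (with or without g), line by line."""
    # count=0 means "replace every match"; count=1 replaces only the first match.
    count = 0 if global_flag else 1
    lines = text.split("\n")
    return "\n".join(re.sub(pattern, replacement, line, count=count) for line in lines)

sample = "foo foo\nfoo baz foo"
print(sed_substitute("foo", "bar", sample))                     # every match on every line
print(sed_substitute("foo", "bar", sample, global_flag=False))  # first match on each line only
```

The per-line loop is the point of the sketch: like sed, it applies the substitution to each line separately, so dropping the global flag still changes one match on every line.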

The patterns inside a sed command are regular expressions, so it’s best to get in the habit of always quoting sed commands. This isn’t necessary for simple string substitutions, but regular expressions often contain characters that you’ll need to prevent the shell from interpreting.

You may find the default regular expression support in sed odd or restrictive. If you’re used to regular expressions in Perl, Python, JavaScript, etc. and you’re using a GNU implementation of sed, you can add the -r option for more familiar regular expression syntax.

I got the idea for this post from Greg Grothaus’ post Why you should learn just a little Awk. He makes a good case that you can benefit from learning just a few commands of a language like Awk with no intention to learn more of the language.

Related posts:

Good old regular expressions
Tips for learning regular expressions
A little awk

]]></description>
<dc:subject>Software_development Programming Regular_expressions</dc:subject>
<dc:identifier>https://pinboard.in/u:rahuldave/b:e2635f8f5b1a/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Software_development"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Programming"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Regular_expressions"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://lifehacker.com/5724763/kod-is-a-free-text-editor-design-for-programmers">
    <title>Kod is a Free Text Editor Designed for Programmers [Downloads]</title>
    <dc:date>2011-01-04T22:00:00+00:00</dc:date>
    <link>http://lifehacker.com/5724763/kod-is-a-free-text-editor-design-for-programmers</link>
    <dc:creator>rahuldave</dc:creator><description><![CDATA[





Mac OS X: Kod is a simple and free OS X text editor geared toward programmers that offers easy navigation and terminal integration.

   
]]></description>
<dc:subject>Downloads Mac_OS_X Mac_OS_X_Featured_Download Programming Text_Editors</dc:subject>
<dc:identifier>https://pinboard.in/u:rahuldave/b:cf495de5a594/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Downloads"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Mac_OS_X"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Mac_OS_X_Featured_Download"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Programming"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Text_Editors"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://radar.oreilly.com/2010/12/how-will-the-elmcity-service-s.html">
    <title>How will the elmcity service scale? Like the web!</title>
    <dc:date>2010-12-22T16:00:00+00:00</dc:date>
    <link>http://radar.oreilly.com/2010/12/how-will-the-elmcity-service-s.html</link>
    <dc:creator>rahuldave</dc:creator><description><![CDATA[
During a recent talk at Harvard's Berkman Center, Scott MacLeod asked (via the IRC backchannel): "How does the elmcity service scale?" He wondered, in particular, whether the service could support an online university like the World University and School that might produce an unlimited number of class schedules.




My short answer was that the elmcity service scales like the web. But what does that really mean? I promised Scott that I'd spell it out here. We'll start with an analogy. As I mentioned in The power of informal contracts, the elmcity project envisions a web of calendar feeds that's analogous to the blogosphere's web of RSS and Atom feeds. We take for granted that the blogosphere scales like the web. A blog feed is just a special kind of web page. Anybody can create a blog and publish its feed at some URL. Why not calendars too? We haven't thought about them in the same way, but the ICS (iCalendar) files that our calendar programs export are the moral equivalents of the RSS and Atom feeds that our blog publishing tools export. Anybody can create a calendar and publish its feed at some URL.




These webs -- of HTML pages, of blog feeds, of calendar feeds -- are notionally webs of peers. We can all publish, and we can all read, without relying on a central authority or privileged hub. There are, to be sure, powerful centralized services. My blog, for example, is one of millions hosted at wordpress.com, aggregated by Bloglines and Google Reader, and indexed by Google and Bing. But these services, while convenient, are optional. So long as we can publish our blogs somewhere online, advertise their URLs, and get the DNS to resolve their domain names, we can have a working blogosphere. The necessary and sufficient condition is that we can all publish resources (e.g., pages and feeds), and that we can all access those resources. 




For the calendarsphere that I envision, a service like elmcity is likewise optional. Let's suppose that the World University and School succeeds wildly. At any given moment there are tens of thousands of courses on offer, each with its own course page and also with its own calendar. Instructors publish course pages using any web publishing tool, and also publish calendars using any calendar publishing tool -- Google Calendar, or Outlook, or Apple iCal, or another calendar program. Students pick schedules of courses, bookmark the course pages, and load the course calendars into any of these same calendar programs. The calendar software merges the separate course calendars and combines them with the students' personal calendars. These calendar programs are thus aggregators of calendar feeds in the same way that feedreaders like NetNewsWire or Google Reader are aggregators of blog feeds.




Given a baseline web of peers, it's useful to be able to merge our individual views of them into pooled spaces. NetNewsWire is a personal feedreader, but Google Reader is social. In the pool created by Google Reader, data finds data and people find people. The elmcity service aims to create that same kind of effect in the realm of public calendar events. When we pool our separate calendars, we publicize the events that we are promoting, we discover events that others are promoting, and we see all our public events on common timelines. 




What constrains our ability to scale out pools of calendars? Let's continue the analogy to the blogosphere. Google Reader constitutes one pooled space for blog feeds, Bloglines another. Because the data aggregated by these services conforms to open standards (i.e., RSS and Atom), other services can create blog pools too. Likewise in the calendarsphere, Google Calendar is one way to pool calendars, the elmcity service is another, Calagator is a third. Others can play too.




How can we scale these providers of calendar pools? Along one axis, each provider needs to be able to grow its computing power. Google Calendar scales on this axis by using Google's cloud platform. The elmcity service uses Azure, the Microsoft cloud platform. Note that elmcity, unlike Google Calendar, is an open source service. That means you could run your own instance of it, using your own Azure account, but you'd still be relying on the Azure compute fabric.




Calagator, based on Ruby on Rails, could be deployed either to a conventional hosting environment or to a cloud platform. It would thus scale, along the compute axis, as either environment allows. The elmcity service could be used in this way too. The service is written for Azure, but the core aggregation engine is independent of Azure and could be deployed to a conventional hosting environment. 




For feed aggregators, another axis of scale is the number of feeds that can be processed. When that number grows, the time required to connect to many feeds and ingest their contents becomes a constraint. The elmcity service currently supports 50 calendar hubs. Thrice daily, each hub pulls data from Eventful, Upcoming, Eventbrite, Facebook, and a list of iCalendar feeds. So far a single Azure worker role can easily do all this work. I'll dial up the number of workers if needed, but first I want to squeeze as much parallelism as I can out of each worker. To that end, I recently upgraded to the 4.0 version of the .NET Framework in order to exploit its dramatically simplified parallel processing. In this week's companion article I show how the elmcity service uses that new capability to optimize the time required to gather feeds from many sources. 
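
The idea of gathering many feeds in parallel inside a single worker can be sketched as follows. This is not the elmcity code, which runs on the .NET Framework; it is a minimal Python stand-in using a thread pool, and gather_feeds, fake_fetch, and the example.org URLs are all hypothetical names chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def gather_feeds(urls, fetch, max_workers=8):
    """Fetch every feed concurrently; return {url: result-or-exception}."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {url: pool.submit(fetch, url) for url in urls}
        for url, future in futures.items():
            try:
                results[url] = future.result()
            except Exception as exc:  # one slow or broken feed should not sink the rest
                results[url] = exc
    return results

def fake_fetch(url):
    # Stand-in for a real HTTP request; hypothetical, for illustration only.
    if "broken" in url:
        raise IOError("could not reach " + url)
    return "BEGIN:VCALENDAR from " + url

feeds = ["http://example.org/arts.ics", "http://example.org/broken.ics"]
for url, outcome in gather_feeds(feeds, fake_fetch).items():
    print(url, "->", outcome)
```

Because fetching feeds is dominated by waiting on the network, threads are a reasonable first choice here; the total time then tends toward that of the slowest feed instead of the sum of all of them.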




Pub/sub networks can also scale by coalescing feeds. Consider a calendar hub operated, for some city, by the online arm of that city's newspaper. One model is flat. The newspaper runs a hub whose registry lists all the calendar feeds in town. But another model is hierarchical. In that model, there's a hub for arts and culture, a hub for sports and recreation, a hub for city government, and so on. Each hub gathers events from many feeds, and publishes the merged result on its own website for its own constituency. If the newspaper wants to include all those feeds, it can list them individually in its own registry. But why aggregate arts, sports, or recreation feeds more than once? The newspaper's uber-hub can, instead, reuse the arts, sports, and recreation feeds curated by those respective hubs, adding their merged outputs to its own set of curated feeds. Such reuse can cut down the computational time and effort required to propagate feeds throughout the network.
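
A minimal sketch of the hierarchical model, assuming each event is a small dictionary with a start time and a title (a made-up shape, not the elmcity data model): each sub-hub merges its own feeds once, and the uber-hub merges the sub-hubs' outputs instead of re-reading every underlying feed.

```python
def merge_feeds(*feeds):
    """Merge event lists, dropping duplicates and sorting by start time."""
    seen = set()
    merged = []
    for feed in feeds:
        for event in feed:
            key = (event["start"], event["title"])
            if key not in seen:
                seen.add(key)
                merged.append(event)
    return sorted(merged, key=lambda e: e["start"])

# Each sub-hub aggregates its own feeds once...
arts_hub = merge_feeds(
    [{"start": "2010-12-24T19:00", "title": "Choir concert"}],
    [{"start": "2010-12-23T18:00", "title": "Gallery opening"}],
)
sports_hub = merge_feeds([{"start": "2010-12-23T10:00", "title": "Fun run"}])

# ...and the newspaper's uber-hub reuses those merged outputs.
city_desk = [{"start": "2010-12-24T19:00", "title": "Choir concert"}]  # overlaps arts
uber_hub = merge_feeds(arts_hub, sports_hub, city_desk)
print([e["title"] for e in uber_hub])
```

The uber-hub here processes three inputs, however many feeds sit behind each sub-hub, which is where the saving in time and effort comes from.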




None of these mechanisms will matter, though, until a vibrant ecosystem of calendar feeds requires them. That's the ultimate constraint. Scaling the calendarsphere isn't a problem yet, but it would be a good problem to have. First, though, we've got to light up a whole bunch of feeds.




Related:


 The iCalendar chicken-and-egg conundrum
 Developing intuitions about data
 Personal data stores and pub/sub networks
 The principle of indirection
 See all Radar elmcity stories 
See all Answers elmcity stories





   
]]></description>
<dc:subject>Programming blog calendar elmcity feed syndication</dc:subject>
<dc:identifier>https://pinboard.in/u:rahuldave/b:42648bc61fae/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Programming"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:blog"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:calendar"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:elmcity"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:feed"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:syndication"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://rss.slashdot.org/~r/slashdot/eqWf/~3/LdJlH4NcNtI/What-Every-Programmer-Should-Know-About-Floating-Point-Arithmetic">
    <title>What Every Programmer Should Know About Floating-Point Arithmetic</title>
    <dc:date>2010-05-02T15:34:00+00:00</dc:date>
    <link>http://rss.slashdot.org/~r/slashdot/eqWf/~3/LdJlH4NcNtI/What-Every-Programmer-Should-Know-About-Floating-Point-Arithmetic</link>
    <dc:creator>rahuldave</dc:creator><description><![CDATA[-brazil- writes "Every programmer forum gets a steady stream of novice questions about numbers not 'adding up.' Apart from repetitive explanations, SOP is to link to a paper by David Goldberg which, while very thorough, is not very accessible for novices. To alleviate this, I wrote The Floating-Point Guide, as a floating-point equivalent to Joel Spolsky's excellent introduction to Unicode. In doing so, I learned quite a few things about the intricacies of the IEEE 754 standard, and just how difficult it is to compare floating-point numbers using an epsilon. If you find any errors or omissions, you can suggest corrections."
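
A short Python sketch of the kind of surprise the submission describes, and of why comparing with a fixed epsilon is harder than it looks. The values are chosen for illustration; math.isclose is a standard-library function that uses a relative tolerance by default.

```python
import math

total = 0.1 + 0.2
print(total == 0.3)              # False: neither 0.1 nor 0.2 is exactly representable in binary
print(abs(total - 0.3) < 1e-9)   # True, but a fixed epsilon breaks down at other magnitudes
print(math.isclose(total, 0.3))  # True: relative tolerance scales with the inputs

# A fixed absolute epsilon misjudges large numbers that differ only in the last bit.
big = 1e16
print(abs((big + 2.0) - big) < 1e-9)   # False, although the two are adjacent doubles
print(math.isclose(big + 2.0, big))    # True under the default relative tolerance
```

A relative tolerance has its own weak spot near zero, where math.isclose needs an explicit abs_tol, so the right comparison still depends on the range of values involved.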
   
      

]]></description>
<dc:subject>programming</dc:subject>
<dc:identifier>https://pinboard.in/u:rahuldave/b:7c05893d0464/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:programming"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://www.mailund.dk/index.php/2010/04/26/is-r-an-epic-fail/">
    <title>Is R an ‘epic fail’?</title>
    <dc:date>2010-04-26T06:03:19+00:00</dc:date>
    <link>http://www.mailund.dk/index.php/2010/04/26/is-r-an-epic-fail/</link>
    <dc:creator>rahuldave</dc:creator><description><![CDATA[Is R an ‘epic fail’?

Something as popular and widespread as R can hardly be called a ‘failure’ in any meaningful sense, so of course the question is really in which aspects R is inferior to alternatives.

For most users who need a bit of data analysis, it is probably a poor first choice. R is a programming language with a lot of statistical and data visualisation support, but it is a programming language.  If you don’t want to do any programming, don’t muck about with R!  There are lots of visualisation tools and statistical tools that are much easier to use.

Of course, without a bit of programming, you are limited to what those tools can do, so if you need analysis that is not provided, you need to either find a programmer or learn how to program, and for the latter, R isn’t a bad choice.

You can get pretty far with very little effort in R, once you have learned how to program. Now, learning how to program does require quite a bit of effort, but if you need to, there really isn’t any way around it.  Just like there isn’t any Royal Road to mathematics (as Euclid is supposed to have said).

Sure, as a programming language R has its idiosyncrasies, but which programming languages do not?
]]></description>
<dc:subject>Work programming R statistics</dc:subject>
<dc:identifier>https://pinboard.in/u:rahuldave/b:a98065d24020/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Work"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:programming"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:R"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:statistics"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://feeds.arstechnica.com/~r/arstechnica/index/~3/tGM5tqWsxfY/tutorial-use-twitters-new-real-time-stream-api-in-python.ars">
    <title>feature: Tutorial: consuming Twitter's real-time stream API in Python</title>
    <dc:date>2010-04-21T17:45:00+00:00</dc:date>
    <link>http://feeds.arstechnica.com/~r/arstechnica/index/~3/tGM5tqWsxfY/tutorial-use-twitters-new-real-time-stream-api-in-python.ars</link>
    <dc:creator>rahuldave</dc:creator><description><![CDATA[
  
  
        
    
Twitter is preparing to launch several impressive new features, including a new streaming API that will give desktop client applications real-time access to the user's message timeline. The new streaming API was announced last week at Twitter's Chirp conference, where it was made available to conference attendees on-site for some preliminary experimentation. Twitter opened it up to the broader third-party developer community on Monday so that programmers can begin testing it to offer informed feedback.


This tutorial will show you how to  consume and process data from Twitter's new streaming API.  The code examples, which are written in the Python programming language, demonstrate how to establish a long-lived HTTP connection with PyCurl, buffer the incoming data, and process it to perform the basic message display functions of a Twitter client application. We will also take a close look at how the new streaming API differs from the existing polling-based REST API.
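
The buffering step can be sketched without any network code. The class below is not taken from the tutorial; it assumes the stream delivers JSON messages separated by carriage-return/newline pairs, with blank lines as keep-alives, and that chunks arrive already decoded to text. StreamBuffer and on_receive are hypothetical names.

```python
import json

class StreamBuffer:
    """Collect raw chunks and hand back complete newline-delimited JSON messages."""

    def __init__(self):
        self.buffer = ""

    def on_receive(self, chunk):
        # A chunk may end mid-message, so keep the unfinished tail for next time.
        self.buffer += chunk
        *complete, self.buffer = self.buffer.split("\r\n")
        # Blank lines (assumed keep-alives) are skipped.
        return [json.loads(line) for line in complete if line.strip()]

stream = StreamBuffer()
first = stream.on_receive('{"text": "hello"}\r\n{"text": "wor')
second = stream.on_receive('ld"}\r\n\r\n')
print(first)
print(second)
```

A method like on_receive could be registered as the write callback of whatever HTTP client holds the long-lived connection open (the tutorial uses PyCurl); the contrast with the polling-based REST API is that messages are pushed as they occur instead of being requested on a timer.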
    
          
      
        
    


   
]]></description>
<dc:subject>Features Guides Open-source Web programming python tutorial twitter</dc:subject>
<dc:identifier>https://pinboard.in/u:rahuldave/b:58ebd66b4b7c/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Features"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Guides"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Open-source"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Web"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:programming"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:python"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:tutorial"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:twitter"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://www.mailund.dk/index.php/2010/04/21/on-code-and-comments/">
    <title>On code and comments…</title>
    <dc:date>2010-04-21T08:03:18+00:00</dc:date>
    <link>http://www.mailund.dk/index.php/2010/04/21/on-code-and-comments/</link>
    <dc:creator>rahuldave</dc:creator><description><![CDATA[I’ve never been a big fan of comments in code.  Mainly because I have too often seen comments explaining the trivial and ignoring the complex…

In most cases, clear code eliminates the need for comments, as discussed here.

I used to think commenting my code was the responsible thing to do. I used to think that I should have a comment for just about every line of code that I wrote. After my first read of Code Complete, my views changed pretty drastically.

I began to value good names over comments. As my experience has increased, I have realized more and more that comments are actually bad.

Actually, Code Complete has a more nuanced discussion on commenting code, but still…

Comments are often not needed, because they just rephrase what you can already read in the code. If at all possible, make the code easier to read rather than explain it in comments.

When comments are needed, they explain design decisions that are not obvious from the code. Then there is too often the risk that the design has changed since the comment was written and that is really worse than no comment.

Still, it is when it comes to design decisions that I often miss documentation. Especially when it comes to complex class hierarchies and object interactions, where there are clearly some underlying design decisions about how the objects are supposed to interact and how new classes should be added to the hierarchy to extend the code.

I rarely find that stuff documented, though.  At best I am told that for function add(a,b), “a and b are input” and “add(a,b) returns a+b” or something obvious like that…  or that the class “AbstractVisitor” is an abstract visitor class.  Duh!

I would love it if people would stop commenting the obvious but start explaining their design decisions…
]]></description>
<dc:subject>Rants Work programming</dc:subject>
<dc:identifier>https://pinboard.in/u:rahuldave/b:16bdf90ddc02/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Rants"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Work"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:programming"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://www.johndcook.com/blog/2010/04/15/85-functional-language-purity/">
    <title>85% functional language purity</title>
    <dc:date>2010-04-15T13:56:41+00:00</dc:date>
    <link>http://www.johndcook.com/blog/2010/04/15/85-functional-language-purity/</link>
    <dc:creator>rahuldave</dc:creator><description><![CDATA[James Hague offers this assessment of functional programming:

My real position is this: 100% pure functional programming doesn’t work. Even 98% pure functional programming doesn’t work. But if the slider between functional purity and 1980s BASIC-style imperative messiness is kicked down a few notches — say to 85% — then it really does work. You get all the advantages of functional programming, but without the extreme mental effort and unmaintainability that increases as you get closer and closer to perfectly pure.

I found James Hague’s blog via a link from Greg Wilson. I’ve gone back through several posts on Hague’s blog Programming in the 21st Century and look forward to reading more.

Related posts:

Functional  in the small, OO in the large
F# may succeed  where others have failed
Why 90% solutions may beat 100% solutions
Reasoning about code
Why functional programming hasn’t taken off

]]></description>
<dc:subject>Software_development Functional_programming Programming</dc:subject>
<dc:identifier>https://pinboard.in/u:rahuldave/b:dc9134e871b5/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Software_development"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Functional_programming"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Programming"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://radar.oreilly.com/2010/04/four-short-links-5-april-2010.html">
    <title>Four short links: 5 April 2010</title>
    <dc:date>2010-04-05T10:00:00+00:00</dc:date>
    <link>http://radar.oreilly.com/2010/04/four-short-links-5-april-2010.html</link>
    <dc:creator>rahuldave</dc:creator><description><![CDATA[
Wrong about the iPad (Tim Bray) -- I am actively ignoring the iPad drivel, but this line caught my eye: Intelligence is a text-based application.
Fertile Medium -- online community consultancy, from the first and former Flickr community coordinator.  One to watch: Heather and Derek really know their community.  Again I say it: understanding of how open source and other collaborative communities can function is rare and valuable.  (via waxy)
pigz -- parallel gzip implementation.  Voom voom, so fast! (via kellan on Delicious)
Prefab: What If We Could Modify Any Interface? -- screen-scraping for GUIs to bolt on new functionality to user interfaces. This is incredible. Watch the demo, it's impressive!




   
]]></description>
<dc:subject>brains community hacks opensource programming ui</dc:subject>
<dc:identifier>https://pinboard.in/u:rahuldave/b:cf8d106a2e8f/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:brains"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:community"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:hacks"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:opensource"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:programming"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:ui"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://www.chrishowie.com/2010/04/01/git-svn-in-the-workplace/">
    <title>Chris Howie: git-svn in the workplace</title>
    <dc:date>2010-04-01T16:42:22+00:00</dc:date>
    <link>http://www.chrishowie.com/2010/04/01/git-svn-in-the-workplace/</link>
    <dc:creator>rahuldave</dc:creator><description><![CDATA[At work, we use Subversion for source control.  This is quite the popular VCS, but I’ve grown accustomed to (and much prefer) Git.  Don’t get me wrong, SVN has its advantages, but since using Git my workflow has changed quite radically, and it’s difficult to revert to the rather inflexible and tedious SVN [...]]]></description>
<dc:subject>Git Programming</dc:subject>
<dc:identifier>https://pinboard.in/u:rahuldave/b:04d0bf3c5f17/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Git"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Programming"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://feedproxy.google.com/~r/catonmat/~3/sy5RTytKBuI/">
    <title>The Next Ten One-Liners from CommandLineFu Explained</title>
    <dc:date>2010-03-24T06:00:57+00:00</dc:date>
    <link>http://feedproxy.google.com/~r/catonmat/~3/sy5RTytKBuI/</link>
    <dc:creator>rahuldave</dc:creator><description><![CDATA[
Here are the next ten top one-liners from the commandlinefu website. The first post about the topic became massively popular and received over 100,000 views in the first two days.

Before I dive into the next ten one-liners, I want to take the chance and promote the other three article series on one-liners that I have written:


Awk One-Liners Explained (4 part article).
Sed One-Liners Explained (3 part article).
Perl One-Liners Explained (9 part article, work in progress).

Alright, so here are today’s one-liners:

#11. Edit the command you typed in your favorite editor
$ command <CTRL-x CTRL-e>
This one-liner opens the command you have typed so far in your favorite text editor for further editing. This is handy if you are typing a lengthier shell command. When you have finished editing the command, save and quit the editor to execute it. To cancel execution, just erase it. If you quit unsuccessfully, the command you had typed before diving into the editor will be executed.

Strictly speaking, this is not a feature of the shell per se but a feature of the readline library that most shells use for command line processing. This particular binding CTRL-x CTRL-e only works in readline emacs editing mode. The other mode is readline vi editing mode, in which the same can be accomplished by pressing ESC and then v.

The emacs editing mode is the default in all the shells that use the readline library. The usual command to change between the modes is set -o vi to change to vi editing mode and set -o emacs to change back to emacs editing mode.

To change the editor, export the $EDITOR shell variable to your preference. For example, to set the default editor to pico, type export EDITOR=pico.

Another way to edit commands in a text editor is to use the fc shell builtin (at least bash has this builtin). The fc command opens the previously executed command in your favorite text editor. It’s easy to remember the fc command because it stands for “fix command.”

Remember the ^foo^bar^ command from the first top ten one-liners? You can emulate this behavior by typing fc -s foo=bar. It will replace foo with bar in the previous command and execute it.

#12. Empty a file or create a new file
$ > file.txt
This one-liner either wipes the file called file.txt empty or creates a new file called file.txt.

The shell first checks if the file file.txt exists. If it does, the shell opens it and wipes it clean. If it doesn’t exist, the shell creates the file and opens it. Next the shell proceeds to redirecting standard output to the opened file descriptor. Since there is nothing on the standard output, the command succeeds, closes the file descriptor, leaving the file empty.
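
The open-and-truncate step described above can be sketched in Python. This is only an illustration, assuming the shell behaves like a single open call with the create and truncate flags; the helper name truncate_or_create and the temporary directory are made up for the example.

```python
import os
import tempfile

def truncate_or_create(path):
    """Mimic `> path`: open for writing, create if missing, truncate if present."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    os.close(fd)  # nothing was written, so the file ends up empty

# Work in a throwaway directory so no real file is touched.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "file.txt")

truncate_or_create(path)              # file did not exist: it is created
size_after_create = os.path.getsize(path)

with open(path, "w") as f:            # put some content into the file
    f.write("some data\n")
size_with_data = os.path.getsize(path)

truncate_or_create(path)              # file exists: it is wiped clean
size_after_truncate = os.path.getsize(path)
```

In both cases the file ends up with a size of zero, which is the same end state the one-liner produces.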

Creating a new empty file is also called touching and can be done with the touch file.txt command. The touch command can also be used for changing the timestamps of files. Touch, however, won’t wipe an existing file clean; it will only change its access and modification timestamps to the current time.

#13. Create a tunnel from localhost:2001 to somemachine:80
$ ssh -N -L2001:localhost:80 somemachine
This one-liner creates a tunnel from your computer’s port 2001 to somemachine’s port 80. Each time you connect to port 2001 on your machine, your connection gets tunneled to somemachine:80.

The -L option can be summarized as -L port:host:hostport. Whenever a connection is made to localhost:port, the connection is forwarded over the secure channel, and a connection is made to host:hostport from the remote machine.

The -N option makes sure you don’t run a shell as you connect to somemachine.

To make things more concrete, here is another example:

$ ssh -f -N -L2001:www.google.com:80 somemachine
This one-liner creates a tunnel from your computer’s port 2001 to www.google.com:80 via somemachine. Each time you connect to localhost:2001, ssh tunnels your request via somemachine, where it tries to open a connection to www.google.com.

Notice the additional -f flag - it makes ssh daemonize (go into the background) so it doesn’t occupy a terminal.

#14. Reset terminal
$ reset
This command resets the terminal. You know, when you have accidentally output binary data to the console, it becomes messed up. The reset command usually cleans it up. It does that by sending a bunch of special byte sequences to the terminal. The terminal interprets them as special commands and executes them.

Here is what BusyBox’s reset command does:

printf("\033c\033(K\033[J\033[0m\033[?25h");
It sends a bunch of escape codes and a bunch of CSI commands. Here is what they mean:


\033c: “ESC c” - sends reset to the terminal.
\033(K: “ESC ( K” - reloads the screen output mapping table.
\033[J: “ESC [ J” - erases display.
\033[0m: “ESC [ 0 m” - resets all display attributes to their defaults.
\033[?25h: “ESC [ ? 25 h” - makes cursor visible.
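
To make the list above concrete, here is a small Python sketch that assembles the same byte sequence piece by piece. The names RESET_PARTS and send_reset are invented for this example, and the descriptions are simply the ones given above.

```python
ESC = "\033"  # the escape character, octal 033

# Each entry pairs one control sequence with the meaning listed above.
RESET_PARTS = [
    (ESC + "c", "reset the terminal"),
    (ESC + "(K", "reload the screen output mapping table"),
    (ESC + "[J", "erase display"),
    (ESC + "[0m", "reset all display attributes to their defaults"),
    (ESC + "[?25h", "make cursor visible"),
]

# Joining the pieces gives the string that BusyBox passes to printf.
reset_sequence = "".join(code for code, _ in RESET_PARTS)

def send_reset(stream):
    """Write the whole reset sequence to a terminal-like stream."""
    stream.write(reset_sequence)
    stream.flush()
```

Calling send_reset(sys.stdout) from a terminal should have roughly the effect described above, though how each sequence is interpreted depends on the terminal.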

#15. Tweet from the shell
$ curl -u user:pass -d status='Tweeting from the shell' http://twitter.com/statuses/update.xml
This one-liner tweets your message from the terminal. It uses the curl program to HTTP POST your tweet via Twitter’s API.

The -u user:pass argument sets the login and password to use for authentication. If you don’t wish your password to be saved in the shell history, omit the :pass part and curl will prompt you for the password as it tries to authenticate. Oh, and while we are at shell history, another way to omit password from being saved in the history is to start the command with a space! For example, <space>curl ... won’t save the curl command to the shell history.

The -d status='...' instructs curl to use the HTTP POST method for the request and send status=... as POST data.

Finally, http://twitter.com/statuses/update.xml is the API URL to POST the data to.
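
For comparison, here is a rough Python sketch of the request that the curl command puts together. It only builds the request and never sends it, because this endpoint and basic authentication against it are historical and should not be expected to work today. The function name build_update_request is hypothetical. Note one difference: urlencode percent-encodes the form data, whereas curl -d sends the string as given.

```python
import base64
import urllib.parse
import urllib.request

def build_update_request(user, password, status,
                         url="http://twitter.com/statuses/update.xml"):
    """Build (but do not send) a POST request like the curl one-liner's."""
    # Like -d status='...': supplying a body turns the request into a POST.
    data = urllib.parse.urlencode({"status": status}).encode("ascii")
    request = urllib.request.Request(url, data=data)
    # Like -u user:pass: HTTP basic auth is base64("user:pass") in a header.
    credentials = (user + ":" + password).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request.add_header("Authorization", "Basic " + token)
    return request

request = build_update_request("user", "pass", "Tweeting from the shell")
```

Inspecting request.get_method(), request.data and the Authorization header shows the three things that -u and -d contribute to the HTTP request.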

Talking about Twitter, I’d love if you followed me on Twitter! :)

#16. Execute a command at midnight
$ echo cmd | at midnight
This one-liner sends the shell command cmd to the at-daemon (atd) for execution at midnight.

The at command is lenient about the format of the execution-time argument: you may write things like 4pm tomorrow to execute it at 4pm tomorrow, 9pm next year to run it on the same date at 9pm the next year, 6pm + 10 days to run it at 6pm after 10 days, or now +1minute to run it after a minute.

Use atq command to list all the jobs that are scheduled for execution and atrm to remove a job from the queue.

Compared to the universally known cron, at is suitable for one-time jobs. For example, you’d use cron to execute a job every day at midnight but you would use at to execute a job only today at midnight.

Also be aware that if the load is greater than some number (for one processor systems the default is 0.8), then atd will not execute the command! That can be fixed by specifying a greater max load to atd via -l argument.

#17. Output your microphone to other computer’s speaker
$ dd if=/dev/dsp | ssh username@host dd of=/dev/dsp
The default sound device on Linux is /dev/dsp. It can be both written to and read from. If it’s read from then the audio subsystem will read the data from the microphone. If it’s written to, it will send audio to your speaker.

This one-liner reads audio from your microphone via the dd if=/dev/dsp command (if stands for input file) and pipes it as standard input to ssh. Ssh, in turn, opens a connection to a computer at host and runs the dd of=/dev/dsp (of stands for output file) on it. Dd of=/dev/dsp receives the standard input that ssh received from dd if=/dev/dsp. The result is that your microphone gets output on host computer’s speaker.

Want to scare your colleague? Dump /dev/urandom to his speaker by dd if=/dev/urandom.

#18. Create and mount a temporary RAM partition
# mount -t tmpfs -o size=1024m tmpfs /mnt 
This command creates a temporary RAM filesystem of 1GB (1024m) and mounts it at /mnt. The -t flag to mount specifies the filesystem type and the -o size=1024m option sets the filesystem size.

If it doesn’t work, make sure your kernel was compiled to support the tmpfs. If tmpfs was compiled as a module, make sure to load it via modprobe tmpfs. If it still doesn’t work, you’ll have to recompile your kernel.

To unmount the ram disk, use the umount /mnt command (as root). But remember that mounting at /mnt is not the best practice. Better mount your drive to /mnt/tmpfs or a similar path.

If you wish your filesystem to grow dynamically, use ramfs filesystem type instead of tmpfs. Another note: tmpfs may use swap, while ramfs won’t.

#19. Compare a remote file with a local file
$ ssh user@host cat /path/to/remotefile | diff /path/to/localfile -
This one-liner diffs the file /path/to/localfile on local machine with a file /path/to/remotefile on host machine.

It first opens a connection via ssh to host and executes the cat /path/to/remotefile command there. The shell then takes the output and pipes it to diff /path/to/localfile - command. The second argument - to diff tells it to diff the file /path/to/localfile against standard input. That’s it.

#20. Find out which programs listen on which TCP ports
# netstat -tlnp
This is an easy one. Netstat is the standard utility for listing information about the Linux networking subsystem. In this particular one-liner it’s called with the -tlnp arguments:


-t causes netstat to only list information about TCP sockets.
-l causes netstat to only list information about listening sockets.
-n causes netstat not to do reverse lookups on the IPs.
-p causes netstat to print the PID and name of the program to which the socket belongs (requires root).

To find more detailed info about open sockets on your computer, use the lsof utility. See my article “A Unix Utility You Should Know About: lsof” for more information.

That’s it for today.
Tune in the next time for “Another Ten One-Liners from CommandLineFu Explained”. There are many more nifty commands to write about. But for now, have fun and see ya!

PS. Follow me on twitter for updates!


   
]]></description>
<dc:subject>Programming at atd atq atrm audio commandlinefu cron csi_command curl daemon dd diff dsp echo editor emacs escape_code fc http if localhost microphone mount netstat of pico post ram ramfs readline redirect reset shell ssh standard_output tcp terminal tmpfs tunnel tweet twitter vi</dc:subject>
<dc:identifier>https://pinboard.in/u:rahuldave/b:78b0d0d1bd96/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Programming"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:at"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:atd"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:atq"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:atrm"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:audio"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:commandlinefu"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:cron"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:csi_command"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:curl"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:daemon"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:dd"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:diff"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:dsp"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:echo"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:editor"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:emacs"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:escape_code"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:fc"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:http"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:if"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:localhost"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:microphone"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:mount"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:netstat"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:of"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:pico"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:post"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:ram"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:ramfs"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:readline"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:redirect"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:reset"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:shell"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:ssh"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:standard_output"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:tcp"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:terminal"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:tmpfs"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:tunnel"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:tweet"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:twitter"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:vi"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://rjlipton.wordpress.com/2010/03/23/its-ada-lovelace-day/">
    <title>It’s Ada Lovelace Day</title>
    <dc:date>2010-03-23T23:13:45+00:00</dc:date>
    <link>http://rjlipton.wordpress.com/2010/03/23/its-ada-lovelace-day/</link>
    <dc:creator>rahuldave</dc:creator><description><![CDATA[ 
 This is my contribution to Ada Lovelace Day 





Ada Lovelace is, perhaps, the world’s first programmer of an actual computer. Others wrote about algorithms much earlier—think Euclid and the famous GCD algorithm—but she wrote a program for a specific computing machine. The machine was Charles Babbage’s Analytical Engine, and her notes look like a program to most. 


Today I plan on joining over a million other bloggers in discussing women in science, and more specifically computing. The event is named after Ada Lovelace, and is happening all over the web.



Okay, I exaggerated about the number of bloggers, it is closer to 100,000 than a million—actually it is closer to 1,700. The number is not important; what is important is: we need more women administrators, educators, and researchers in all areas of computing. Further, more women who already have done great work in computing need to be recognized and given the awards and accolades they deserve. This has not always happened.


I am honored to be a tiny part of this special day, and I hope I can help in some way to make the event a success. 


 What To Do? 


I am honestly unsure what I should do. For starters I am not a woman, and cannot really understand their issues. But, I have been in the computing field for over thirty years and perhaps I can add some small insights. I will try.


When I was first at Princeton we worked very hard to hire Andrea LaPaugh from MIT, where she got her Ph.D. We were successful, and for quite a while she was the only woman in all of engineering at Princeton. Later, she became the first tenured woman in engineering at Princeton. The balance is not perfect now, but I am happy to say today she is no longer the only tenured woman in engineering.


One day I was talking with a colleague from another engineering department. He asked me, “How many women did we have in Computer Science?” I immediately answered one—Andrea. Then, I asked the obvious question, “How many do you have in your department?” My colleague thought a long time—I guessed he must be adding up women faculty. Finally he said, “None.” 


I am telling this story to show how subtle the issues can be concerning women in science. I gave him a very hard time: I said, you can take a long time to add up the cardinality of a big set, but you cannot take any time to figure out the cardinality of the empty-set. What was he thinking?


 The Two Rule 


One rule is the two rule. I learned this rule from my wife, Judith Norback, who is a Ph.D in psychology from Princeton. Often in an attempt to create balance—especially in academia—one woman will be placed on each committee. A woman. One. It is good to have women on committees, but putting one on does not usually work well.


The difficulty is a lone person on any committee is hard pressed to speak out and really make a difference. A lone person of any minority—the principle is the same for other minorities—is not in general the right choice. There are exceptions to this rule, but studies show one person, from a minority, is not nearly as effective as two. This is the rule of two. If possible always place two women on a group or a committee. They will be immensely more effective, if there are two.


Of course in order to make this happen the academic organization needs to have at least two women—another argument for more diversity. I do not claim to completely understand the reason the “two rule” works, but it does. Try it.


 The Out Rule 


Another rule is the out rule. This I learned from long experience watching fellow computer scientists operate—especially in academia. In the old days when wagon trains were attacked, they were taught to “circle the wagons.” In computer science we still do this, as do most other areas of science and academia. 


However, in computer science the joke—unfortunately all too true—is we shoot the wrong way. We shoot in, not out. Hence, the rule of out: when attacked remember to shoot out, not in toward each other. 


With all due respect, I have long noticed women on various committees often ignore this simple rule. They shoot in toward their fellow women. I have been on many committees of all kinds—award, hiring, program, and other types—and have noticed the women on the committee are often the hardest on women candidates. I have often argued for a woman candidate for something, and noticed the other faculty were generally supportive. However, the women faculty in the room would frequently agree with me on the big points, yet attack the candidate on some minor points. Do not shoot in, shoot out.


I am not arguing for a decrease in standards. Never. I am arguing for both male and female faculty to be sure they are as objective as possible. I certainly am far from perfect, but I do think more attention should be paid to being aware of the out rule.


 The Zero Rule 


I am trying to be constructive and not writing a “moral with a tale,” but one last rule is critical in my mind. The zero rule is just this: there must be zero—no—tolerance for any jokes, comments, stories, of any kind that put down women. I have heard many of them over the years, and have always immediately complained about them. I believe such statements cause many women to go into other areas of science. We must be intolerant of any comments of this kind.


 Ada As The First Programmer 


It seems to me clear Lady Lovelace was more than the first programmer: she had great insight into what a computing device could or could not do. Here is a direct quote from her—it could have been written the other day. It would be interesting to see what she would think about computing today—she wrote this in 1842.


 It is desirable to guard against the possibility of exaggerated ideas that might arise as to the powers of the Analytical Engine. In considering any new subject, there is frequently a tendency, first, to overrate what we find to be already interesting or remarkable; and, secondly, by a sort of natural reaction, to undervalue the true state of the case, when we do discover that our notions have surpassed those that were really tenable.





The Analytical Engine has no pretensions whatever to originate anything. It can do whatever we know how to order it to perform. It can follow analysis; but it has no power of anticipating any analytical relations or truths. Its province is to assist us in making available what we are already acquainted with. This it is calculated to effect primarily and chiefly of course, through its executive faculties; but it is likely to exert an indirect and reciprocal influence on science itself in another manner. For, in so distributing and combining the truths and the formulae of analysis, that they may become most easily and rapidly amenable to the mechanical combinations of the engine, the relations and the nature of many subjects in that science are necessarily thrown into new lights, and more profoundly investigated. This is a decidedly indirect, and a somewhat speculative, consequence of such an invention. It is however pretty evident, on general principles, that in devising for mathematical truths a new form in which to record and throw themselves out for actual use, views are likely to be induced, which should again react on the more theoretical phase of the subject. There are in all extensions of human power, or additions to human knowledge, various collateral influences, besides the main and primary object attained.



To really appreciate her brilliant mind, read all her comments here. This is the frontispiece to the document:



 Open Problems 


The main open problem is to continue trying to increase the number of women in all aspects of science, especially computing. I think there are already many good ideas on how to do this—perhaps what we need is to execute the best of these ideas. In any event have a happy Ada Lovelace Day. It would have been a great privilege to have met her.


       












]]></description>
<dc:subject>History Ada_Lovelace computing programming women</dc:subject>
<dc:identifier>https://pinboard.in/u:rahuldave/b:f63c5389257b/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:History"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Ada_Lovelace"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:computing"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:programming"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:women"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://feedproxy.google.com/~r/catonmat/~3/GJRqxzmBW9c/">
    <title>Top Ten One-Liners from CommandLineFu Explained</title>
    <dc:date>2010-03-18T03:00:21+00:00</dc:date>
    <link>http://feedproxy.google.com/~r/catonmat/~3/GJRqxzmBW9c/</link>
    <dc:creator>rahuldave</dc:creator><description><![CDATA[
I love working in the shell. Mastery of the shell lets you get things done in seconds, rather than the minutes or hours it would take if you chose to write a program instead.

In this article I’d like to explain the top one-liners from commandlinefu.com. It’s a user-driven website where people get to choose the best and most useful shell one-liners.

But before I do that, I want to take the opportunity and link to a few of my articles that I wrote some time ago on working efficiently in the command line:


Working Efficiently in Bash (Part I).
Working Efficiently in Bash (Part II).
The Definitive Guide to Bash Command Line History.
A fun article on Set Operations in the Shell.
Another fun article on Solving Google Treasure Hunt in the Shell.

And now the explanation of top one-liners from commandlinefu.

Update: Russian translation available.

#1. Run the last command as root
$ sudo !!
We all know what the sudo command does - it runs the command as another user, in this case, it runs the command as superuser because no other user was specified. But what’s really interesting is the bang-bang !! part of the command. It’s called the event designator. An event designator references a command in shell’s history. In this case the event designator references the previous command. Writing !! is the same as writing !-1. The -1 refers to the last command. You can generalize it, and write !-n to refer to the n-th previous command. To view all your previous commands, type history.
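
As a toy illustration of how !! and !-n pick a command out of the history, here is a small Python sketch. The function resolve_event and the sample history are invented for the example, and only the two forms mentioned above are handled. Bash supports many more.

```python
def resolve_event(designator, history):
    """Return the history entry that "!!" or "!-n" refers to.

    history is a list of commands, oldest first.
    """
    if designator == "!!":
        offset = 1  # "!!" is the same as "!-1"
    elif designator.startswith("!-") and designator[2:].isdigit():
        offset = int(designator[2:])
    else:
        raise ValueError("unsupported event designator: " + designator)
    if offset < 1 or offset > len(history):
        raise IndexError("no such event: " + designator)
    return history[-offset]  # count backwards from the newest command

# Hypothetical shell history, oldest command first.
history = ["ls /root", "cd /tmp", "apt-get update"]

# What "sudo !!" expands to before the shell runs it.
sudo_command = "sudo " + resolve_event("!!", history)
```

Here sudo_command becomes "sudo apt-get update", and resolve_event("!-3", history) reaches back to "ls /root".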

This one-liner is actually really bash-specific, as event designators are a feature of bash.

I wrote about event designators in much more detail in my article “The Definitive Guide to Bash Command Line History.” The article also comes with a printable cheat sheet for working with the history.

#2. Serve the current directory at http://localhost:8000/
$ python -m SimpleHTTPServer
This one-liner starts a web server on port 8000 with the contents of current directory on all the interfaces (address 0.0.0.0), not just localhost. If you have “index.html” or “index.htm” files, it will serve those, otherwise it will list the contents of the currently working directory.

It works because python comes with a standard module called SimpleHTTPServer. The -m argument makes python search for a module named SimpleHTTPServer.py in all the possible system locations (listed in sys.path, which includes the entries from the $PYTHONPATH shell variable). Once found, it executes it as a script. If you look at the source code of this module, you’ll find that it tests whether it’s being run as a script with if __name__ == '__main__', and if it is, it runs the test() function that starts a web server in the current directory.

To use a different port, specify it as the next argument:

$ python -m SimpleHTTPServer 8080
This command runs an HTTP server on all local interfaces on port 8080.
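
The SimpleHTTPServer module belongs to Python 2. In Python 3 the same functionality lives in http.server, so the one-liner becomes python3 -m http.server. The sketch below does roughly the same thing programmatically, assuming Python 3.7 or later. The helper name serve_directory is made up, and unlike the one-liner it binds only to 127.0.0.1 and lets the operating system pick a free port.

```python
import functools
import http.server
import threading
import urllib.request

def serve_directory(directory=".", port=0):
    """Serve a directory over HTTP in a background thread."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory)
    # Port 0 asks the OS for any free port; 127.0.0.1 keeps it local only.
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server

server = serve_directory(".")
port = server.server_address[1]

# Fetch the root URL: either index.html or a generated directory listing.
with urllib.request.urlopen("http://127.0.0.1:%d/" % port) as response:
    status = response.status

server.shutdown()
server.server_close()
```

Binding to 0.0.0.0 instead of 127.0.0.1 would expose the directory on all interfaces, as the one-liner does.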

#3. Save a file you edited in vim without the needed permissions
:w !sudo tee %
This happens to me way too often. I open a system config file in vim and edit it, only to find out that I don’t have permission to save it. This one-liner saves the day. Instead of writing the file to a temporary location with :w /tmp/foobar and then moving the temporary file to the right destination with mv /tmp/foobar /etc/service.conf, you just type the one-liner above in vim and it saves the file.

Here is how it works. If you look at the vim documentation (by typing :he :w in vim), you’ll find the reference to the command :w !{cmd}, which says that vim runs {cmd} and passes it the contents of the buffer as standard input. In this one-liner the {cmd} part is sudo tee %, which runs tee % as superuser. But wait, what is %? Well, it’s a read-only register in vim that contains the filename of the current file! Therefore the command that vim executes becomes sudo tee current_filename, run from vim’s current working directory. Now what does tee do? The tee command takes standard input and writes it to a file (and also copies it to standard output)! Rephrasing, it takes the contents of the file edited in vim and writes them to the file (while being root)! All done!
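
To see the plumbing without vim or sudo, here is a minimal sketch. The temporary file stands in for the config file, and printf stands in for the buffer that vim feeds to the command; both are assumptions made for illustration only.

```shell
# Stand-in for the file being edited (hypothetical; a real case would be a root-owned config file).
target=$(mktemp)

# printf plays the role of vim's ":w !{cmd}", which feeds the buffer to {cmd} on standard input.
# tee writes its standard input to the named file and also echoes it to standard output,
# so standard output is sent to /dev/null to keep the terminal quiet.
printf 'line one\nline two\n' | tee "$target" > /dev/null

# Read the file back to confirm that tee wrote it.
saved=$(cat "$target")
echo "$saved"
rm -f "$target"
```

In vim the one-liner is often written as :w !sudo tee % > /dev/null for the same reason: without the redirect, tee echoes the whole file back to the screen.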

#4. Change to the previous working directory
$ cd -
Everyone knows this, right? The dash “-” is short for “previous working directory.” The previous working directory is stored in the $OLDPWD shell variable. Every time you use the cd command, it sets $OLDPWD to the directory you just left, and then, if you type the short version cd -, it effectively becomes cd $OLDPWD and changes to the previous directory.

To change to a directory named “-”, you have to either cd to the parent directory and then do cd ./- or do cd /full/path/to/-.
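
A small sketch of the $OLDPWD bookkeeping described above. The scratch directory is a hypothetical stand-in created just for the demonstration.

```shell
# Create a throwaway directory and enter it (hypothetical starting point).
scratch=$(mktemp -d)
cd "$scratch"
start=$PWD

cd /                   # cd records the directory we just left in OLDPWD
before=$OLDPWD

cd - > /dev/null       # "cd -" prints the directory it switches to; silence it here
after=$PWD

echo "OLDPWD was: $before"
echo "now in:     $after"

# Leave the scratch directory before removing it.
cd / && rmdir "$scratch"
```

Both lines print the scratch directory: $OLDPWD pointed at it after the first cd, and cd - took us back there.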

#5. Run the previous shell command but replace string “foo” with “bar”
$ ^foo^bar^
This is another event designator. This one is for quick substitution. It replaces foo with bar and repeats the last command. It’s actually a shortcut for !!:s/foo/bar/. This one-liner applies the s modifier to the !! event designator. As we learned from one-liner #1, the !! event designator stands for the previous command. The s modifier stands for substitute (greetings to sed), and it replaces the first string (foo) with the second string (bar).

Note that this one-liner replaces just the first occurrence of foo in the previous command. To replace all occurrences, add the g modifier (g for global):

$ !!:gs/foo/bar
This one-liner is also bash-specific, as event designators are a feature of bash.

Again, see my article “The Definitive Guide to Bash Command Line History.” I explain all this stuff in great detail.
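
History expansion is awkward to show outside an interactive shell, so here is a sketch of the same first-match versus global distinction using sed, which the s modifier is modeled on. The “previous command” is a hypothetical string. Note that sed treats foo as a regular expression while the history modifier treats it as a plain string; for a simple word like foo the two behave the same.

```shell
# Hypothetical "last command" held in a variable instead of the shell history.
previous='echo foo foo foo'

# Like ^foo^bar^ or !!:s/foo/bar/ : only the first match is replaced.
first_only=$(printf '%s\n' "$previous" | sed 's/foo/bar/')

# Like !!:gs/foo/bar/ : every match is replaced.
all_matches=$(printf '%s\n' "$previous" | sed 's/foo/bar/g')

echo "$first_only"     # echo bar foo foo
echo "$all_matches"    # echo bar bar bar
```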

#6. Quickly backup or copy a file
$ cp filename{,.bak}
This one-liner copies the file named filename to a file named filename.bak. Here is how it works. It uses brace expansion to construct a list of arguments for the cp command. Brace expansion is a mechanism by which arbitrary strings may be generated. In this one-liner filename{,.bak} gets brace expanded to filename filename.bak, which is put in place of the brace expression. The command becomes cp filename filename.bak and the file gets copied.
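
If you are unsure what a brace expression will turn into, one option is to put echo in front of the command and look at the expanded arguments first. The sketch below assumes bash (a plain POSIX sh leaves the braces alone) and uses a hypothetical scratch directory for the actual copy.

```shell
# Preview: echo shows the argument list that cp would receive.
preview=$(echo cp filename{,.bak})
echo "$preview"

# The real thing, inside a throwaway directory (hypothetical path and contents).
workdir=$(mktemp -d)
printf 'data\n' > "$workdir/filename"
cp "$workdir"/filename{,.bak}

# The backup has the same contents as the original.
copied=$(cat "$workdir/filename.bak")
echo "$copied"
rm -r "$workdir"
```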

Talking more about brace expansion, you can do all kinds of combinatorics with it. Here is a fun application:

$ echo {a,b,c}{a,b,c}{a,b,c}
It generates all the possible 3-letter strings from the set {a, b, c}:


aaa aab aac aba abb abc aca acb acc
baa bab bac bba bbb bbc bca bcb bcc
caa cab cac cba cbb cbc cca ccb ccc

And here is how to generate all the possible 2-letter strings from the set of {a, b, c}:


$ echo {a,b,c}{a,b,c}

It produces:


aa ab ac ba bb bc ca cb cc
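
As a sanity check, the number of generated strings should be 3^3 = 27 for the 3-letter case and 3^2 = 9 for the 2-letter case. A minimal sketch, assuming bash (it relies on brace expansion and arrays); the variable names are just for illustration:

```shell
# Collect the expansions into arrays so they can be counted.
triples=( {a,b,c}{a,b,c}{a,b,c} )
pairs=( {a,b,c}{a,b,c} )

three_letter=${#triples[@]}
two_letter=${#pairs[@]}

echo "$three_letter 3-letter strings, from ${triples[0]} to ${triples[26]}"
echo "$two_letter 2-letter strings, from ${pairs[0]} to ${pairs[8]}"
```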

If you liked this, you may also like my article where I defined a bunch of set operations (such as intersection, union, symmetry, powerset, etc.) using just shell commands. The article is called “Set Operations in the Unix Shell.” (And since I have sets in the shell, I will soon write articles on “Combinatorics in the Shell” and “Algebra in the Shell”. Fun topics to explore. Perhaps even “Topology in the Shell” :))

#7. mtr - traceroute and ping combined
$ mtr google.com
MTR, better known as “Matt’s Traceroute,” combines the traceroute and ping commands. After each successful hop, it sends a ping request to the machine it found, so it produces the output of both traceroute and ping and gives you a better picture of the quality of the link. If it finds out that a packet took an alternative route, it displays it, and by default it keeps updating the statistics so you know what is going on in real time.

#8. Find the last command that begins with “whatever,” but avoid running it
$ !whatever:p
Another use of event designators. The !whatever designator searches the shell history for the most recently executed command that starts with whatever. But instead of executing it, it prints it. The :p modifier makes it print instead of executing.

This one-liner is bash-specific, as event designators are a feature of bash.

Once again, see my article “The Definitive Guide to Bash Command Line History.” I explain all this stuff in great detail.

#9. Copy your public-key to remote-machine for public-key authentication
$ ssh-copy-id remote-machine
This one-liner copies your public key, which you generated with ssh-keygen (either the SSHv1 file identity.pub or the SSHv2 file id_rsa.pub), to remote-machine and appends it to the ~/.ssh/authorized_keys file there. This ensures that the next time you log into that machine, public-key authentication (commonly referred to as “passwordless authentication”) will be used instead of regular password authentication.

If you wished to do it yourself, you’d have to take the following steps:


your-machine$ scp ~/.ssh/identity.pub remote-machine:
your-machine$ ssh remote-machine
remote-machine$ cat identity.pub >> ~/.ssh/authorized_keys

This one-liner saves a great deal of typing. Actually I just found out that there was a shorter way to do it:


your-machine$ ssh remote-machine 'cat >> .ssh/authorized_keys' < .ssh/identity.pub
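
The manual steps above assume that ~/.ssh already exists on the remote machine with sensible permissions; sshd in its default configuration ignores an authorized_keys file whose permissions are too loose. Here is a sketch of the remote-side bookkeeping, simulated locally. The fake home directory, the dummy key line, the install_key helper and its duplicate check are all hypothetical additions for illustration, not something ssh-copy-id is documented here to do.

```shell
# Hypothetical stand-ins: fake_home plays the remote user's home directory,
# and pubkey is a dummy line in place of a real identity.pub / id_rsa.pub.
fake_home=$(mktemp -d)
pubkey='ssh-rsa AAAAexamplekey user@your-machine'

install_key() {
    # $1 = home directory, $2 = public key line
    mkdir -p "$1/.ssh"
    chmod 700 "$1/.ssh"
    touch "$1/.ssh/authorized_keys"
    # Append only if this exact line is not already present.
    if ! grep -qxF -- "$2" "$1/.ssh/authorized_keys"; then
        printf '%s\n' "$2" >> "$1/.ssh/authorized_keys"
    fi
    chmod 600 "$1/.ssh/authorized_keys"
}

install_key "$fake_home" "$pubkey"
install_key "$fake_home" "$pubkey"    # second run changes nothing

# Count the non-empty lines in the simulated authorized_keys file.
key_count=$(grep -c . "$fake_home/.ssh/authorized_keys")
echo "$key_count"
rm -r "$fake_home"
```

Running the helper twice still leaves a single key line, which is the main advantage over a bare cat >> when you repeat the procedure.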

#10. Capture video of a linux desktop
$ ffmpeg -f x11grab -s wxga -r 25 -i :0.0 -sameq /tmp/out.mpg
By pure coincidence, I have done so much video processing with ffmpeg that I know what most of this command does without looking much at the manual.

ffmpeg can generally be described as a command that takes a bunch of options, with the output file as the last argument. In this case the options are -f x11grab -s wxga -r 25 -i :0.0 -sameq and the output file is /tmp/out.mpg.

Here is what the options mean:


-f x11grab sets the input format to x11grab. The X11 framebuffer presents its data in a specific format, and this option lets ffmpeg decode it correctly.
-s wxga sets the size of the video to wxga, which is a shortcut for 1366×768. This is a strange resolution to use; I’d just write -s 800x600.
-r 25 sets the frame rate of the video to 25 fps.
-i :0.0 sets the video input to X11 display 0.0 on localhost.
-sameq asks ffmpeg to keep the quality of the input stream (it reuses the source’s quantizer settings). It’s best to preserve the quality and post-process it later.

You can also tell ffmpeg to grab the display from another X server by changing -i :0.0 to -i host:0.0.
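
To make the roles of the options easier to see, here is a sketch that assembles the capture command from named pieces and prints it instead of running it. The variable names are hypothetical. The sketch leaves out -sameq, since later ffmpeg releases removed that option; treat the result as a starting point rather than a tested recipe.

```shell
# Named pieces of the capture command (values taken from the one-liner above).
display=':0.0'          # X11 display to grab; use host:0.0 for another X server
size='1366x768'         # what -s wxga expands to
rate=25                 # frames per second
output='/tmp/out.mpg'   # output file, always the last argument

# Options placed before -i describe the input; the final argument is the output file.
set -- ffmpeg -f x11grab -s "$size" -r "$rate" -i "$display" "$output"

# Dry run: print the command rather than executing it.
capture_cmd="$*"
echo "$capture_cmd"
```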

If you’re interested in ffmpeg, here are my other articles on ffmpeg that I wrote a while ago:


How to Extract Audio Tracks from YouTube Videos
Converting YouTube Flash Videos to a Better Format with ffmpeg

PS. This article was so much fun to write that I decided to write several more parts. Tune in next time for “The Next Top Ten One-Liners from CommandLineFu Explained” :)

Have fun. See ya!

PPS. Follow me on twitter for updates.


   
]]></description>
<dc:subject>Programming authorized_keys bash cd combinatorics commandlinefu cp desktop display event_designators ffmpeg history identity.pub id_rsa.pub linux mtr oldpwd one_liners passwordless_authentication ping public_key_authentication python pythonpath root sets shell simplehttpserver ssh ssh_copy_id ssh_keygen sshv1 sshv2 sudo tee traceroute vim x11</dc:subject>
<dc:identifier>https://pinboard.in/u:rahuldave/b:eb42c63da138/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:Programming"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:authorized_keys"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:bash"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:cd"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:combinatorics"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:commandlinefu"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:cp"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:desktop"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:display"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:event_designators"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:ffmpeg"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:history"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:identity.pub"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:id_rsa.pub"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:linux"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:mtr"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:oldpwd"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:one_liners"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:passwordless_authentication"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:ping"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:public_key_authentication"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:python"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:pythonpath"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:root"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:sets"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:shell"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:simplehttpserver"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:ssh"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:ssh_copy_id"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:ssh_keygen"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:sshv1"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:sshv2"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:sudo"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:tee"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:traceroute"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:vim"/>
	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:x11"/>
</rdf:Bag></taxo:topics>
</item>
<item rdf:about="http://rss.slashdot.org/~r/slashdot/eqWf/~3/0BqHlxmfoNk/Simpler-Hello-World-Demonstrated-In-C">
    <title>Simpler &quot;Hello World&quot; Demonstrated In C</title>
    <dc:date>2010-03-17T02:03:00+00:00</dc:date>
    <link>http://rss.slashdot.org/~r/slashdot/eqWf/~3/0BqHlxmfoNk/Simpler-Hello-World-Demonstrated-In-C</link>
    <dc:creator>rahuldave</dc:creator><description><![CDATA[An anonymous reader writes "Wondering where all that bloat comes from, causing even the classic 'Hello world' to weigh in at 11 KB? An MIT programmer decided to make a Linux C program so simple, she could explain every byte of the assembly. She found that gcc was including libc even when you don't ask for it. The blog shows how to compile a much simpler 'Hello world,' using no libraries at all. This takes me back to the days of programming bare-metal on DOS!"
   
      
]]></description>
<dc:subject>programming</dc:subject>
<dc:identifier>https://pinboard.in/u:rahuldave/b:6abd35bb64d9/</dc:identifier>
<taxo:topics><rdf:Bag>	<rdf:li rdf:resource="https://pinboard.in/u:rahuldave/t:programming"/>
</rdf:Bag></taxo:topics>
</item>
</rdf:RDF>