3 Rules For Giving a Sh!t

I spend a lot of time on Cross Validated (CV). CV is my statistical escape from statistics, and it is also a place where I like to prove to myself that I am good at what I do.

On CV, people pose questions for the community to answer. Members post solutions, and the OP can accept a solution if it answers their question. If a question has multiple solutions, the community can upvote them based on whatever they think merits an upvote. Each action (accepting an answer and upvoting) gives members points.

Ah, fake internet points. We know them too well. From Reddit upvotes to likes on Twitter, these digital tally counters are to many (including myself) a measure of how “right” you are – where right is measured by consensus. More points means more people agreed with you, and how could you be wrong when so many people agree with you? It’s a form of external validation for me; a way to prove that I am good at statistics because other people are marking my answers as good.

That becomes draining very quickly. Sometimes I catch myself wasting time explaining simple stuff because I know it is an easy 20-30 points (equivalent to 2 upvotes and an accepted answer, or 3 upvotes). I’m wasting my time on questions of little importance to get validation from people I don’t know (except I do know some of them because they follow me on Twitter. Hi, Tim). I need to reel it in a little while still engaging (because it is kind of fun sometimes, and some of the answers I give are genuinely interesting and have led to some blog posts). I’ve developed 3 rules I check to see if an answer is worth committing to ink, er, HTML.

1) Does an answer exist in a place OP really should have looked?

“How do I interpret log odds?” – Next. “What sort of statistical test do I need?” – Next. “What do I do if my predictor isn’t normal?” – Next. I don’t need to waste my time answering something which exists in software documentation or in introductory stats books. If the question can be answered by reading, or by copying and pasting a link to the appropriate canonical resource, I don’t waste my time. I might comment and say something like “A good place to look might be…”, but that is it.

2) Is the answer complex or complicated?

As in the Zen of Python, complex is preferable to complicated. Complex would mean that the answer is non-trivial, but the “juice is worth the squeeze”, so to speak. If what we get out of it is an interesting insight, then I will commit my time. Complicated would mean that nothing interesting comes out of the answer, but getting to it is tedious. If that sounds ambiguous, that is because I intended it to be, so that I could manipulate these rules at will. Remember, these rules serve me and not the other way around.

3) Do I give a shit?

This last rule is actually a function of the other two. If the answer exists elsewhere, then I likely do not give a shit. If the answer is complicated, then I likely do not give a shit. If the answer is novel (or at least novel enough to me) but complicated, then I might give a shit. If the question has been answered before but I have the opportunity to give my own insight and opinion (i.e. the answer is complex), then I might give a shit. For 80% of questions, this rule really decides whether I take the time or not. If I give a shit about what you’re asking, I am much more likely to contribute even if one of the other two rules is violated.

Do I follow these all the time? No, but I do find myself using them more frequently. My score on CV has dipped a little, but hey, that is the trade-off. Now I find I’m not wasting my time explaining first-year stats to some poor grad student who just wants a p-value. That serves both me and them better in the end.