Hacker News | whoisnnamdi's comments

To be frank, I think integrating with the Remarkable 2 (I have one as well) is going to be tough. This has more to do with the device than with Polar: as architected today, it's very difficult to connect it to anything else (Dropbox, etc.). Given that, it might not make sense for these guys to prioritize this specific device, especially since so few people have one.


Thanks for this super detailed video! I've seen some of your others in the past and always found them helpful. Yannick's video on this paper is great as well.

If anyone is looking for a written walkthrough, I did a quick summary/explainer on my personal site: https://whoisnnamdi.com/transformers-image-recognition/


Thanks for watching! I am glad you find them helpful. And your written summary looks nice!


Leo - as other folks have noted, it is a major assumption to think that VCs are good at calculating the expected values of the companies in which they invest. The fundamental randomness is so high here that any ranking a VC made would likely be garbage - the sample size of investments is nowhere near high enough to come to a precise view on the expected value of one good deal vs. another. As an analogy, if this were a regression, the standard errors would be massive, so much so that the investments would not be statistically significantly different from one another in terms of expected return. I think it's plausible that they'd be able to distinguish the best from the absolute worst, but the best from the slightly less good? No way.

I think a more realistic "model" of a VC's behavior is thresholding - set the bar somewhere and invest in any company that seems to be above that quality bar, with some stochasticity as to which deals you actually win in the end. A rough sketch of what I mean is below.
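For the curious, here is a minimal sketch of that thresholding idea in Python. Every number in it (the quality distribution, the bar, the win rate) is a made-up assumption purely for illustration, not anything estimated from real deal data:

    import numpy as np

    rng = np.random.default_rng(0)

    n_deals = 1_000
    quality = rng.normal(0, 1, n_deals)   # latent "quality" the VC perceives (assumed distribution)
    bar = 1.0                              # the quality threshold (assumed)
    win_prob = 0.3                         # chance of actually winning a deal you chase (assumed)

    above_bar = quality > bar
    won = above_bar & (rng.random(n_deals) < win_prob)

    print(f"deals above the bar: {above_bar.sum()}, deals actually won: {won.sum()}")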


Hey Nnamdi!

> it is a major assumption to think that VCs are good at calculating the expected values of companies in which they make investments.

It's not quite the same as proving VCs can calculate expected values well, but VC is one of the few asset classes with persistence of returns -- meaning having a top quartile fund is correlated with the next fund also being top quartile. In most investment classes, being top quartile across funds is basically uncorrelated. So good VCs tend to be reliably good and not good just because of pure chance.

Source: https://www.morganstanley.com/im/publication/insights/articl... (p 51)

> The fundamental randomness is so high here that any ranking a VC made would likely be garbage

A ranking doesn't have to be perfect, just better than random. E.g. let's say you give me a bunch of companies and ask me to rank them, and I put 60% of the actual top 1% and 40% of the next 1% into my top 1%. I'm making mistakes, but that's still pretty good. As long as I'm directionally right on average, I should do well on the investing side.
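To make the "better than random" point concrete, here is a quick simulation: a heavy-tailed return distribution, a noisy-but-correlated ranking signal, and a comparison of the picker's top 1% against a random 1%. The distributional choices and the noise level are assumptions for illustration only:

    import numpy as np

    rng = np.random.default_rng(42)
    n = 10_000

    # Heavy right tail as a rough stand-in for startup outcomes (assumed, not calibrated)
    true_return = rng.pareto(a=1.5, size=n)

    # A noisy but directionally correct signal of company quality (noise level assumed)
    signal = np.log1p(true_return) + rng.normal(0, 1.0, n)

    k = n // 100                                    # top 1%
    picked = np.argsort(signal)[-k:]                # imperfect ranking's top 1%
    random_pick = rng.choice(n, k, replace=False)   # random 1% for comparison

    print("avg return, ranked top 1%:", true_return[picked].mean())
    print("avg return, random 1%:    ", true_return[random_pick].mean())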


Go backwards: how many IPOs have not had VC backing? Obviously, maybe they grow slower, or maybe IPO-bound tech firms are oversubscribed, so it's only an individual VC's prediction ability that's wrong, not the asset class as a whole - but still.


Love this, thanks for sharing!!


Author here

I agree with everything in your comment except the first line.

I don't think software developer pay follows a power law, or at least not in this dataset. Ignoring equity (which can cause power law type behavior if enough folks strike it rich), the distribution of pay is probably closer to a lognormal distribution. It's just not skewed enough to constitute a power law — that would imply serious skew.

Additionally, this dataset excludes outliers making multiple millions of dollars, so that also reduces the likelihood of a power law-style distribution.

Lastly, this dataset is only U.S. developers, so there isn't skew across meaningfully different geographies.

With a lognormal distribution, your point remains valid, but the effect is not nearly so dramatic.
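As a rough illustration of why the distinction matters, here is a quick comparison of the tails of a lognormal and a power-law (Pareto) distribution matched at the same median. The calibration numbers are made up for illustration and are not taken from the dataset:

    import numpy as np

    rng = np.random.default_rng(1)
    n = 100_000

    # Hypothetical calibration: median ~$110K, moderate spread (both values assumed)
    lognormal_pay = rng.lognormal(mean=np.log(110_000), sigma=0.4, size=n)

    # A Pareto (power law) with the same median, for contrast (tail index assumed)
    alpha = 2.0
    x_min = 110_000 / 2 ** (1 / alpha)         # scale chosen so the median matches
    pareto_pay = x_min * (1 + rng.pareto(alpha, size=n))

    for name, pay in [("lognormal", lognormal_pay), ("power law", pareto_pay)]:
        p50, p99 = np.percentile(pay, [50, 99])
        print(f"{name:9s}  median ~${p50:,.0f}  99th pct ~${p99:,.0f}  ratio {p99 / p50:.1f}x")

The power-law draw produces a far fatter tail even though both distributions agree at the median, which is roughly the "serious skew" the dataset does not show.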


Author here

I think this is a quite common story, or at least I've heard it a few times now so it feels that way:

(1) Senior, important developer decides they want to change locations.

(2) Boss can't afford to lose them.

(3) So boss lets them work from wherever they want.

This of course leads to natural correlation between working remote and skill, which would further correlate to earnings.

Replacing a good software developer is non-trivial.

That said, to quote from the analysis: "Even once I control for various observable factors (including age, experience, hours worked, size of employer, programming languages, and more), fully-remote software developers earn 9.4% more than developers who never or only rarely work remotely."


I guess the problem is that the analysis can't find a variable for "is critical". I know lots of people with age, experience, hours worked, etc, that are not critical. I can think of a few that are exactly the opposite of critical.

The only decent variable to measure whether someone is critical is their salary. But your analysis makes it sound as though [remote -> high salary], when really [high salary -> critical] and [remote -> critical].


I'd go so far as to say that critical and valuable are not necessarily the same thing. I'd like to think I'm valuable; I go out of my way to not be critical. Intentional knowledge shares, documentation, mentoring, etc, plus focusing on writing systems that are fault tolerant, means that I could leave and it -should- be a minor hiccup. Heck, almost an entire team left along with me at one point in my career, and the projects we left behind just...kept working, no issues. Doubt that company is actually looking to figure them out so they can extend/support them should something need changing, but that's on them.


If I stayed at companies that fall under #2, I'd be eternally miserable.

I hate working with people who cannot be fired because the company would fold.

We, developers, should be the people who the bosses actively want to be there... not "we can't fire him". It makes people less complacent towards other people.


I would say every company that has been around the block for a long time falls somewhat under #2.

Some companies have millions and millions of LoC to run their business and it takes months for someone to wrap their head around some of the systems and be productive...

You end up having people in departments that you would rather not lose, and you will go to great lengths to keep them because the cost of losing the institutional knowledge is just too big.


The hazard there is that a new boss may come in and either not see this person as critical and let him/her go or see this person as critical and go to the effort to find an equivalently senior onsite person to replace him.


Author here

Thanks, yes this is an important point which I cover later in the analysis:

"Even once I control for various observable factors (including age, experience, hours worked, size of employer, programming languages, and more), fully-remote software developers earn 9.4% more than developers who never or only rarely work remotely."


Author here

Yes the results include only the U.S. but do not include more granular controls for regions within the U.S. So it's possible that this is biasing the results.

That said, I do control for size of company, which often correlates strongly with region (both big tech companies and startups tend to employ most of their workforce from a certain region, i.e. the west coast). I acknowledge this is imperfect though.


Bit of an understatement. This is a study about remote developers that substitutes geography with a proxy. At least to me, it seems really obvious we are disregarding a main factor. What are the numbers if we control for location?


Awesome analysis. Thanks for sharing this @whoisnnamdi

> 10,355 U.S. based individuals employed as software engineers on either a part-time, full-time, or independent basis

I'm unclear how you're controlling for employment-type pay differences, which can be significant. Independent (1099) workers pay 100% of U.S. payroll taxes (Social Security and Medicare), whereas employees pay only half. (Same thing for extra costs, like health insurance, retirement plans, etc.)

Edit: I suppose size of employer might be one way to do it.


Thank you!

Yes this is an important point. I don't mention it there, but I do control for full-time vs part-time vs. self-employed in my analysis.

I'll cover this in a coming post, but full-time and self-employed developers make very similar amounts on average, with a slight benefit for full-time employed.

I cannot control for the tax effects unfortunately, as I have no data on taxes paid.


> I cannot control for the tax effects unfortunately, as I have no data on taxes paid.

In that case I suggest separating the "Independent contractor, freelancer, or self-employed" group from the "Employed Full Time" and "Employed part-time" groups. [1] The way that contractors are compensated is significantly different from how employees are compensated.

[1] https://insights.stackoverflow.com/survey/2019#work-_-employ...


Author here

Agreed, which is why I didn't say "working remote causes developers to earn 22% more", only that developers who work remote earn 22% more, which is an interesting fact in itself.

Early in the article I adjust this for various controls to better get at causality. This includes observable factors like age, years of experience, hours worked, size of employer, programming languages used, etc.

As I note in the article:

> Much of the apparent premium earned by remote developers is in fact driven by seniority and tenure. These are older, more experienced developers who either prefer to work remote or whose organizations grant them that privilege.

However, controlling for manually selected factors doesn't imply causality, so I use principled covariate selection to select the best set of controls and get closer to something that could be called causal. You can read more about this method, called Double Lasso Selection, in Urminsky, Hansen, and Chernozhukov [0].

This results in an adjusted pay premium of 9.4% for remote developers relative to those that never work remote. Hard to know for sure if this is causal either, but it's likely much closer to whatever the true causal impact is. Unsurprisingly, it's a lower number.

Thanks for reading!

[0] http://home.uchicago.edu/ourminsky/Variable_Selection.pdf
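For readers who haven't seen double-lasso selection before, here is a bare-bones sketch of the idea in Python: one lasso selects controls that predict the outcome, a second selects controls that predict the treatment, and an OLS on the union of the two sets estimates the treatment effect. The column names are hypothetical and this is not the post's actual code, just a generic sketch of the method described in [0]:

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    from sklearn.linear_model import LassoCV
    from sklearn.preprocessing import StandardScaler

    def double_lasso(y, treatment, controls):
        """Select controls with two lasso passes, then estimate the treatment effect by OLS."""
        X = StandardScaler().fit_transform(controls)

        keep_y = np.abs(LassoCV(cv=5).fit(X, y).coef_) > 1e-8          # controls that predict the outcome
        keep_d = np.abs(LassoCV(cv=5).fit(X, treatment).coef_) > 1e-8  # controls that predict the treatment
        selected = controls.columns[keep_y | keep_d]

        design = sm.add_constant(pd.concat([treatment, controls[selected]], axis=1))
        return sm.OLS(y, design).fit()

    # Hypothetical usage on a survey-like dataframe `df` (column names are made up):
    # fit = double_lasso(np.log(df["salary"]), df["fully_remote"], df[control_cols])
    # np.exp(fit.params["fully_remote"]) - 1   # approximate percent premium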


One easy dumb trick I use to filter out "interesting" results that are likely just getting causality backwards is to flip the title around and see if it immediately sounds like an obvious truism. In the case of this article, it goes from:

> Remote Software Developers Earn 22% More Than Non-Remote Developers

To something like:

> Higher-paid software developers more likely to be working remotely

And at that point, yes, it's fairly obviously going to be true. Working remotely is a perk for those who prefer it so we should expect it to be positively correlated with other forms of compensation.

I think all that your data really shows is a hidden variable: competence. Better software developers get paid more, work remotely more, and probably also get more paid time off, larger bonuses, and all sorts of other perks.


You could also speculate that employers who offer WFH are more likely to pay more. It’s seen as a perk by many people, and suggesting that employers who offer more perks might pay more doesn’t seem unlikely to me. The headline would seem silly if it said “engineers who get free hot meals at work earn more than those who don’t”.


I feel like for those senior engineers who can negotiate a remote position, their careers and pay would go even further if they didn't go remote. It's one reason I don't go remote: it's typically hard to reach staff or senior management positions when you do.

It's hard to tease that out in a survey though.


> hard to reach staff or sr management positions

There's a lot of us that don't want to move into positions where we don't work on software development at the code level. Unfortunately, that also tends to put a cap on how much we can make because a lot of people see management as an upward movement rather than a lateral one.


I work at a company with dual paths (management vs individual contributor) and a requirement to work at the code level would cap your progress upward on either path. As a senior engineer you are expected to spend significantly more time doing architecture work, mentoring other developers, and other non-code activities.


Well, that's a shame. Doing architecture work and mentoring other developers absolutely can include working with actual code.


> Early in the article I adjust this for various controls to better get at causality. This includes observable factors like age, years of experience, hours worked, size of employer, programming languages used, etc.

There are likely many other factors which aren't "observable" by your methodology, which are causing the 9.4% pay-premium. Factors like competency, domain knowledge, and how essential they are to the organization. I find it much more believable that these unobservable factors are causing both the pay-premium, and working-remote flexibility.

This is a common problem that comes up in such studies. Adjusting for various observable factors is great, but it still leaves behind the other unobservable factors. At which point you have to use your judgement to figure out whether those unobservable factors are more compelling than the hypothesis being tested.

That said, this is still a very cool analysis that demonstrates an interesting correlation. Thanks for sharing!


My go-to quip for making decisions based on observable variables when the actual causal relationship is between unobservable variables and outcomes is, "Looking for your watch under the street lamp."

---

For those who aren't familiar with the aphorism, it's from a joke that goes:

One night, I came upon a man staring at the ground under a street lamp. "Looking for something?" I asked, and the man nodded, "My watch." "You lost it here?" I asked, but the man shook his head. "I lost it further down the street."

"Why look here, if the watch is over there?"

"The light's better under the street lamp."


Totally.

The "etc" I used in my quote is doing a lot of work here - I control for a fairly extensive list of observed factors from the dataset. I just listed out a few because otherwise the list gets absurdly long.

I control for domain knowledge via the programming languages, technologies, and tools the developers know. I proxy importance to the organization by controlling for how much influence the developer has over buying decisions within their organization. I control for education level and college major, for employment status (full-time, part-time, independent), and for hours worked per week.

The list goes on but those are a few.

But your point remains that unobservables could be confounding the results, so one can never be sure.

Thanks for reading!


I see you published some code on GitHub that appears to relate either to this study or another one you published recently.

Can you clarify whether you're using the methodology from that code for limiting your sample?

https://github.com/whoisnnamdi/highest-paid-software-develop...

e.g. only those reporting a salary between $10K and $250K?


Thanks for checking out the code.

Confirming that, yes, I am using this methodology for the results reported in this post.
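For anyone curious what that kind of filter looks like, here is a minimal sketch against the public Stack Overflow survey CSV. The column names follow the survey schema, but treat the snippet as an illustration of the $10K-$250K window rather than the repo's exact code:

    import pandas as pd

    # Illustrative only -- see the linked GitHub repo for the actual preprocessing.
    df = pd.read_csv("survey_results_public.csv")

    sample = df[
        (df["Country"] == "United States")
        & df["ConvertedComp"].between(10_000, 250_000)
    ]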


I tried to find the factors you controlled for. You wrote

> (including age, experience, hours worked, size of employer, programming languages, and more)

but the "and more" link didn't give me more insight into what controls you used. The interpretation here is super duper sensitive to what those were, so you have just a straight up list of what your controls were? (before & after feature selection)

I am most curious about geographic controls, which are going to be tricky, especially since you can't just one-hot encode them and throw them into double-LASSO. The association might be as simple as "remote workers more likely to be paid in USD" or something analogous, and I'm trying to determine whether/how you controlled for that.


You can't really remove the selection effect of higher paid developers choosing to be remote by adding external controls. You could try using an instrument for remote work or use a Simultaneous Equation Model to remove the selection effect bias.

Counter to your analysis, I'd actually expect the pay premium for remote work to be negative, simply because the opportunity to work remotely is worth paying for.
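As a concrete illustration of the instrumental-variables route suggested above, here is a toy two-stage least squares example using the linearmodels package. The data are simulated, the "instrument" is entirely hypothetical, and finding a credible real-world instrument for remote work is the genuinely hard part:

    import numpy as np
    import pandas as pd
    from linearmodels.iv import IV2SLS

    # Simulated data purely to show the 2SLS mechanics -- nothing here reflects the survey.
    rng = np.random.default_rng(0)
    n = 20_000
    ability = rng.normal(size=n)                        # unobserved confounder
    z = rng.binomial(1, 0.5, n)                         # hypothetical instrument (e.g. employer policy)
    remote = (z + 0.3 * ability + rng.normal(size=n)) > 0.5
    log_pay = 11.5 + 0.1 * remote + 0.2 * ability + rng.normal(scale=0.3, size=n)

    df = pd.DataFrame({"log_pay": log_pay, "remote": remote.astype(float),
                       "z": z, "const": 1.0})

    fit = IV2SLS(df["log_pay"], df[["const"]], df["remote"], df["z"]).fit()
    print(fit.params["remote"])   # should land near the true 0.1, unlike naive OLS on this data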


Very cool! Any thoughts on where that premium comes from, since we're controlling for the standard stuff?


Pure speculation, but maybe you have to look around or negotiate to find remote work? Those who tend to do that probably also get better salaries than those that stick with one job for very long or don't compare much when looking for jobs.


One possibility is that remote developers are just more productive (or, if you prefer, more productive developers are better able to negotiate a remote-working arrangement with their employer). In standard economic theory, the equilibrium wage sits at the worker's marginal revenue product (the marginal revenue their work produces for the organization). So you would expect remote workers either to earn more because they are more productive or, if remote work makes you more productive, to earn more as a result of working remotely and thereby being more productive.


> Agreed, which is why I didn't say,"working remote causes developers to earn 22% more", only that developers who work remote earn 22% more, which is an interesting fact in itself.

... but you have a heading right at the start of the article, "Remote work pays" - what was that supposed to mean, then?


Author here

An important distinction!

I cover this and the overall methodology here [1]

[1] https://whoisnnamdi.com/highest-paid-software-engineers-2020...

