Short, good read. The money quote is near the end:
> We also suggest a way of powering future experiments: power your experiment such that you, or someone else, can conduct a similarly-sized experiment and have high power for detecting an interesting difference from your study. We need to stop thinking about studies as if they are one-offs, only to be interpreted once in light of the hypotheses of the original authors.
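To make that suggestion concrete, here is a minimal sketch of the kind of calculation it implies (mine, not the paper's; the effect size d = 0.4 and the 90% target power are arbitrary stand-ins), using statsmodels:

```python
# Hypothetical power calculation: how many subjects per group does a
# future replicator need to have 90% power to detect the effect the
# original study estimated (here, Cohen's d = 0.4) at alpha = 0.05?
import math
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(
    effect_size=0.4,      # effect size estimated by the original study (assumed)
    alpha=0.05,
    power=0.90,
    alternative='two-sided',
)
print(f"a replication needs {math.ceil(n_per_group)} subjects per group")  # ~133
```

On this reading, the paper is asking original authors to run that calculation on behalf of future replicators, not just for themselves.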
Seems like there should be a ranking for verification: something like 'paper accepted but not replicated' being rank 1, replicated once getting you to level 2, and so on until you hit a '5+' validity rating.
That might actually incentivize people to be the 'replicators'.
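A toy sketch of what that scale could look like (entirely hypothetical; the cap comes from the '5+' above):

```python
def validity_rating(replications: int) -> int:
    """Hypothetical scale: 1 = accepted but never replicated,
    +1 per independent replication, capped at 5 ('5+')."""
    return min(1 + replications, 5)

assert validity_rating(0) == 1   # accepted, not yet replicated
assert validity_rating(1) == 2   # replicated once
assert validity_rating(9) == 5   # four or more replications all read '5+'
```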
Confidence in a hypothesis given the results, and the rigour of the methodology that produced those results (which implicitly determines reproducibility), are two different things. I suspect that's why the downvotes.
I'll bite: p-values attempt to measure "Given the null hypothesis, what is the probability of obtaining a result at least as extreme as the one we actually did?". What we actually want is "What is the probability of the null hypothesis given the data?". If we are to use p-values successfully to determine truth, then by Bayes we must use prior probabilities somewhere. Discovering p=0.01 is not very helpful when your prior probability was 0.001.
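To put numbers on that last sentence, here is a minimal sketch (my arithmetic, not the parent's) using the Sellke-Berger lower bound on the Bayes factor implied by a p-value, with the 0.001 prior from above:

```python
import math

def min_bayes_factor_null(p: float) -> float:
    """Sellke-Berger lower bound on the Bayes factor in favour of the
    null implied by a p-value (valid for p < 1/e), i.e. the most the
    data can possibly count against the null."""
    assert 0 < p < 1 / math.e
    return -math.e * p * math.log(p)

prior_h1 = 0.001                             # prior that the effect is real
prior_odds_null = (1 - prior_h1) / prior_h1  # 999:1 in favour of the null
bf = min_bayes_factor_null(0.01)             # ~0.125, the best case for H1
posterior_odds_null = prior_odds_null * bf   # ~125:1, still heavily null
posterior_null = posterior_odds_null / (1 + posterior_odds_null)
print(f"P(null | p = 0.01) >= {posterior_null:.3f}")  # ~0.992
```

Even granting the alternative the most favourable Bayes factor the p-value allows, the null stays above 99% probable, which is the point about priors.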
Any valid experiment would assume a prior p-value of 1 (otherwise, what are you hypothesizing?).
A null hypothesis does not have a probability. It is either the case, or it isn't.
Edit (response to your comment below) [speech restricted at the moment]:
"If you can't dispute it, you won't."
Edit2 (response to your comment/edit below):
It's just a fact. Same as the other simple facts I have stated that remain completely undisputed.
Why would you hypothesize an experiment where you have already biased yourself on the outcome (I assume p = 0.01? what?)?
If your hypothesis is that you will observe no differences, then you have implicitly assigned a prior of 1. Any experimental design will be reducible to this fundamental comparison.
The p-value of an ongoing experiment is a measurement of the reproducibility of the [measurement of the predicted observations of the] hypothesis (many hypotheses are not true though; in those cases the p-values are not valid reproducibility measurements).
Edit3:
>Your misunderstanding in a nutshell is demonstrated by "if you can't dispute it, you won't", which you seem to think means "if you won't dispute it, you can't", in clear violation of Bayes's rule.
I'm not going to continue replying to this apparent troll.
Isn't this simply maximizing the probability of a successful reproduction? Seems almost obvious given the whole problem was low reproducibility in the first place.
Independent, adversarial reproducibility implies some possibility of correctness.
But psychology doesn't seem to do what science usually does - build models and make independent predictions from them.
Finding patterns in data is not the same as testing a null hypothesis that comes out of a model rather than out of a random query or an accidental discovery.
Does whether some psychological studies are falsifiable matter? Why is a paper like this one concerning itself with falsifiability?
These are questions the present paper does not concern itself with. It seems to take for granted that falsifiability of a study is something important, but never bothers to say why.
Falsifiability was popularized in the mid-20th century by the philosopher Karl Popper, who proposed it as a criterion for judging whether something is "scientific" or not.
Unfortunately for his project, and perhaps for the authors of this paper, the falsifiability criterion was largely rejected by later philosophers:
"Sir Karl Popper is not really a participant in the contemporary professional philosophical dialogue; quite the contrary, he has ruined that dialogue. If he is on the right track, then the majority of professional philosophers the world over have wasted or are wasting their intellectual careers. The gulf between Popper's way of doing philosophy and that of the bulk of contemporary professional philosophers is as great as that between astronomy and astrology."[1]
More critiques here: [1]
I'm not sure whether scientists look to such critiques or to philosophers of science other than Popper, whether they have some special reason for still caring about falsifiability, or whether they simply latched on to the concept and now either don't know or don't care what philosophers of science think about the demarcation between science and non-science.
One other thing that goes unremarked on by this paper is why whether a study or branch of psychology is considered scientific or not matters. There is in fact sometimes a lot of money involved, and certainly careers to make or break. Scientists and psychologists both have a vested interest in making their way of doing science or practising psychology the most highly respected, desirable, and valuable way.
In the current money-conscious climate, insurance companies are often keen to pick the shortest possible (i.e. cheapest) and most "evidence-based" (or most "scientific") treatment, and reject the rest. This way, they can reject as much as possible and pay as little as possible for treatment. Consumers, usually knowing little about psychology or the battles that have been fought between its branches, often pick the most "scientific"-sounding treatment (if they have a choice) or whichever one their insurance company covers (if they don't, and they often don't). So that's where the money goes.
It seems to me the authors of the original paper are confused about, or are conflating, the meaning of "falsifiability" as it's usually used in the philosophy of science.
It was an interesting paper, if not entirely original, but I wouldn't say it's about falsifiability. It's about power. There are some overlapping issues, sure, but they're not the same concept.
I've never been interested enough in parapsychology to consider the evidence.
My main point is that a lot of important things are going on beneath the surface of science and psychology, things that mainstream scientific and psychological literature mostly leaves unacknowledged and overlooked: they make huge differences in what gets funded, they probably strongly influence the players in those games, and you have to read between the lines, or study how the games are played, to understand them.
PS: Apologies for the typos and word-omissions in my original post. The first sentence should have read: "Does whether some psychological studies are falsifiable matter?"
That's what I get for being overeager to post without taking the time to proofread. I've gotten into the bad habit of posting first and editing later, but I didn't get back to my original post in time, and now HN no longer lets me edit it.