P-Hacking — Part 04: Dangers, New improvements & The future of experiments

Isuru Pamuditha
10 min read · Jun 1, 2019


Dangers

With all the new technology and news feeds around us, developments in many fields reach you almost instantly. But some of the news and study results that appear in your feed are only disguised as legitimate. We all need to be able to tell what is real from what is fake, and what to follow from what to ignore. This is a big deal because it has serious consequences for our knowledge and understanding. We might neglect important and crucial areas of research, or completely swallow bad research published with fishy intentions. In practice, that could mean a poorer understanding of the true effect of a treatment or medication, or following something that is actually bad for us without knowing the consequences.

There is bad research everywhere. Take nutrition, for example. We want to get smart about which factors and foods might contribute to chronic diseases such as obesity, heart disease and diabetes.
There is some very good research, but sadly there is an abundance of flawed studies too.
“Chocolate makes you skinny.” “Beer helps you work out.” “Coffee prevents Alzheimer’s.”
A researcher may notice that the people in a study who eat chocolate are also skinny, conclude that they are skinny because they eat chocolate, and then build the evidence for that claim in illegitimate ways. The same goes for the Alzheimer’s study.

The impact that this misreporting can have on our faith in science is huge.

If you’re a researcher and you run a huge number of possible tests in your study, some of them, if not many, are likely to come out significant just by chance, even if the thing you are testing has no real effect on anything. The more analyses you conduct, the more likely you are to find such fake results. By the time you have done a bunch of separate tests, it’s more likely than not that you’ll get at least one statistically significant result, even if there’s nothing there. The main problem arises when those few significant results are reported without the context of all the non-significant ones.
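
To put a rough number on this, here is a minimal sketch in Python. It assumes the tests are independent of each other and that each one uses the conventional 0.05 threshold. Both are simplifying assumptions made for illustration, and the function name is made up for this example.

```python
def prob_at_least_one_false_positive(num_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of `num_tests` independent tests comes out
    'significant' at level `alpha` when there is no real effect at all."""
    # Each test avoids a false positive with probability (1 - alpha),
    # so all of them avoid it with probability (1 - alpha) ** num_tests.
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    for n in (1, 5, 14, 20, 100):
        print(f"{n:>3} tests -> {prob_at_least_one_false_positive(n):.1%}")
```

Under these assumptions, the chance of at least one false positive passes 50% once you reach 14 tests, which is what “more likely than not” means here.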

Philanthropist Laura Arnold, who has done some impressive work towards correcting this problem, spoke about the reproducibility of some of the main psychology studies published in 2008.

“We asked researchers to reproduce 100 psychology experiments that had been published in top psychology journals in 2008. So, go do them again. If you do them again, will you find the same results? That’s what we wanted to know. You know how often they could find the same results? One third to one half of the time. Now, I’m not claiming that scientists and researchers are actively and intentionally committing fraud. I’m saying there’s something broken in our system.”

Recently a news report claimed: “Turns out if you’re pregnant, eating 30 grams a day of chocolate, that’s about two-thirds of a chocolate bar, not the whole chocolate bar, could improve blood flow to the placenta and benefit the growth and development of your baby, especially in women at risk for preeclampsia or high blood pressure in pregnancy.” But it turns out it wasn’t even a well-conducted experiment. According to reports, there wasn’t even a control group of women who didn’t eat chocolate, and the study found no difference in preeclampsia or high blood pressure between the women who ate the two kinds of chocolate being compared. Yet it reached the public as something very different.

Limiting the likelihood of putting out false research is really important. We always want to put out good research, and as far as possible we want the results we publish to be correct. If you don’t do research yourself, these problems can seem far removed from your everyday life, but they still affect you. These results can influence the decisions of lawmakers, politicians, social workers and others, and they are taken into consideration when rules and regulations are formed, so their impact on your life is tremendous. We must also remember that they affect not just us humans but all of biodiversity: animals and trees are affected by the research scientists do as well. Therefore, as a community and as a species, we must be careful and responsible, not let this kind of fraud happen, and take action to stop such false beliefs from spreading through society like wildfire.

One of the main signs that a result may be incorrect is that it feels somewhat biased and unnatural. Maybe the scientists felt pressured to come up with eye-catching positive results, or maybe they are committing fraud for other reasons such as fame and recognition.

Marketing or Business intentions

Television, the internet, radio and other telecommunication networks have long been shaped to be business-friendly. Advertisements reach you in many ways every day, often without you even noticing. In a world like this, altering a scientific study to suit the needs and wishes of businesspeople is not surprising. After all, what better advertisement is there than a recommendation from a scientific study? So it is nothing new for mega companies to fund their own research. But this is another scenario where we must be careful before following something blindly.

“If you’re a drug manufacturer and you have a hair growth drug that you want to market. You conduct a study, and it comes up with a P value of .05. Great. It means that there is a small probability that an outside force was responsible for the hair growth. So you’ll buy that pill, right?
But hold on. What was ignored was the effect size. What the P value is not telling us is that those bald guys only grew 2 hairs and not a full, luscious mane. So, technically, they’re still bald. This is part of the problem. We shouldn’t be putting all our faith in this one number. But scientists often are, so much so that they might be skewing their experiments to get lower Ps” — D News
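
The gap between a small p-value and a small effect can be illustrated with a short calculation. The sketch below uses a simple two-sample z-test, and every number in it (a 2-hair average difference, a standard deviation of 20 hairs, the group sizes) is hypothetical, chosen only to show the idea. Cohen's d, the difference in means divided by the standard deviation, is one common way to express effect size.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(mean_diff: float, sd: float, n_per_group: int):
    """Return (two-sided p-value, Cohen's d) for a difference in means,
    assuming equal group sizes and a known, shared standard deviation."""
    standard_error = sd * sqrt(2.0 / n_per_group)
    z = mean_diff / standard_error
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    cohens_d = mean_diff / sd  # effect size does not depend on sample size
    return p_value, cohens_d


if __name__ == "__main__":
    # Hypothetical numbers: treated men grow 2 more hairs on average.
    for n in (100, 1000):
        p, d = two_sample_z_test(mean_diff=2.0, sd=20.0, n_per_group=n)
        print(f"n per group = {n:>4}: p = {p:.3f}, Cohen's d = {d:.2f}")
```

With these made-up numbers the effect size stays tiny (d = 0.1) whatever the sample size, yet the p-value drops below 0.05 once the groups are large enough. The p-value alone says nothing about whether the extra hair is worth paying for.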

Think twice before believing, or acting on, this type of data.

“I’m catching reports from just last year: a university in England says drinking champagne every week may help delay dementia and Alzheimer’s disease. They say only one to three glasses a week can be effective for your health.”

In this case the bigger issue is that the study was performed on rats, which was not mentioned anywhere in the news report.

A new study claimed that driving while dehydrated is just as dangerous as driving drunk! Researchers said drivers who drank just one ounce of water per hour made the same number of mistakes on the road as those over the legal alcohol limit. But Britain’s National Health Service had already pointed out that the study was riddled with red flags, including that it was based on just 12 men, with data reported for only 11 of them.

Take a look at the following reports. Don’t they seem a bit odd? Isn’t it well known that the health benefits of alcohol are very small, and that it does more damage than good to your personal health?

Does Tequila Make You Lose Weight? (Screenshot taken on May 29, 2019)

Some of these screenshots of reports were taken from “Last Week Tonight with John Oliver” (HBO).

Good improvements

1. Studies of this type that were later proven false.

Throughout history, many psychological and scientific claims and hypotheses of this type have been proven wrong, and the damage was reduced. So even if there are incorrect beliefs in society today, they can be corrected with time.

2. Corrections

Incorrect papers are being withdrawn every day, and a lot of effort goes into keeping the information shared in the science world clean. Many scientists have acknowledged these problems and have started to take steps to correct them. For example, there is a separate site named Retraction Watch, dedicated to publicizing papers that have been withdrawn. There are online repositories for unpublished negative results, and there is a move towards submitting hypotheses and methods for peer review before conducting experiments, with the guarantee that the research will be published regardless of results so long as the procedure is followed. This eliminates publication bias, promotes higher powered studies and lessens the chance of p-hacking.

Clarifying and correcting the content of the articles is an improvement

3. Most scientists aren’t p-hacking maliciously. There are legitimate decisions to be made about how to collect, analyze and report data, and these decisions affect the statistical significance of results.

4. Making it a practice to consult doctors or relevant authorities before following newly introduced procedures.

5. The best defense against p-hacking is the replication study, where other scientists redo your study and see if they get similar results. More large-scale replication studies have been undertaken in the last 10 years than before.

Present & Future of p-value

  • In 2016, the American Statistical Association (ASA) published a statement on p-values, saying that “the widespread use of ‘statistical significance’ (generally interpreted as ‘p ≤ 0.05’) as a license for making a claim of a scientific finding (or implied truth) leads to considerable distortion of the scientific process”. In this statement they set out 6 principles for scientists to follow on the proper use and interpretation of p-values. Those principles are as follows.

Principle 1: P-values can indicate how incompatible the data are with a specified statistical model.

Principle 2: P-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone.

Principle 3: Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold.

Principle 4: Proper inference requires full reporting and transparency.

Principle 5: A p-value, or statistical significance, does not measure the size of an effect or the importance of a result.

Principle 6: By itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis.

  • In 2017, a group of 72 authors proposed to enhance reproducibility by changing the p-value threshold for statistical significance from 0.05 to 0.005. Other researchers responded that imposing a more stringent significance threshold would aggravate problems such as data dredging; alternative propositions are thus to select and justify flexible p-value thresholds before collecting data, or to interpret p-values as continuous indices, thereby discarding thresholds and statistical significance. Additionally, the change to 0.005 would increase the likelihood of false negatives, where the effect being studied is real, but the test fails to show it.
  • In 2019 over 800 statisticians and scientists signed a message calling for the abandonment of the term “statistical significance” in science.
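
The false-negative concern raised in the 2017 debate can be sketched numerically. The example below uses a normal approximation for a two-sided, two-sample test, with a hypothetical effect size (Cohen's d = 0.5) and a hypothetical 64 participants per group. These are assumptions for illustration only, not figures from the proposal or its critics.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size: float, n_per_group: int, alpha: float) -> float:
    """Approximate power of a two-sided, two-sample z-test with equal groups:
    the chance of detecting a real effect of the given standardized size."""
    normal = NormalDist()
    z_crit = normal.inv_cdf(1.0 - alpha / 2.0)       # significance cutoff
    shift = effect_size * sqrt(n_per_group / 2.0)    # expected z under the real effect
    return normal.cdf(shift - z_crit) + normal.cdf(-shift - z_crit)


if __name__ == "__main__":
    for alpha in (0.05, 0.005):
        power = approx_power(effect_size=0.5, n_per_group=64, alpha=alpha)
        print(f"alpha = {alpha}: power ~ {power:.2f}, "
              f"false-negative rate ~ {1 - power:.2f}")
```

With these assumed numbers, the same study goes from missing a real effect about one time in five to missing it about half the time, unless the sample size is increased to compensate.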

Conclusion

P-hacking isn’t always malicious. It can come from a researcher’s lack of statistical knowledge, an extreme belief in a specific scientific theory and the wish to prove it right, or just an honest mistake. Sometimes researchers do not intend to do anything bad. But we can’t simply overlook the problem. Some of these hypotheses are not proven, and some of them may never be. These statistical results only show that there is a chance. Always think twice about the things you see in the news and on the internet, because a lot of scammers are out there. What they do is bad not only for society but also for the scientists who work hard to prove their theories honestly, and for the scientific community overall.

Scientists can take several steps to prevent these situations. Researchers can pay more attention to the quality of their research methods and data collection. They can clearly label research as prespecified by making it clear that it is designed to answer a specific question, and by detailing the methods and analyses in full before data collection. Results from prespecified studies offer far more convincing evidence than those from exploratory research. They can also follow common analysis standards, measuring only response variables that are predicted to be important and using appropriate sample sizes for the particular tests. Performing data analysis blind wherever possible makes it difficult to p-hack for specific results.

Scientific journals can set up platforms for open access to raw data, provide clear and detailed guidelines for the full reporting of data analyses and results, and provide platforms for prespecifying research methods, so that nothing goes wrong on their end. Encouraging and publishing important replication studies is also expected from both scientists and scientific journals.

“…Is it unavoidable that most research findings are false, or can we improve the situation? A major problem is that it is impossible to know with 100% certainty what the truth is in any research question. In this regard, the pure “gold” standard is unattainable…” — John P. A. Ioannidis

I believe that if we all have the right attitude and pure intentions to work for the betterment of the world, these issues could be minimized.
