Correspondence: Adedokun B. O.
Lecturer, Department of Epidemiology and Medical Statistics, College of Medicine, University of Ibadan.
This is an open access article licensed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0) which permits unrestricted, non-commercial use, distribution and reproduction in any medium, provided the work is properly cited.
P values and confidence intervals are reported in almost all scientific writings and are used in interpreting results of statistical analysis. It is usual for medical researchers and other investigators to ask questions such as ‘Is the result significant?’ or ‘What is the p value?’ Many clinicians worry when they carry out statistical analysis and there are no significant results. This article describes some facts about the p value and confidence intervals.
The reporting of p values and confidence intervals usually follows hypothesis testing or significance testing. Most scientific investigations involve the testing of hypotheses. These are formal procedures for testing whether findings from the investigations are compatible with a so-called null hypothesis. Hypotheses are statements concerning the situation being investigated, usually stated as two mutually exclusive options: a null hypothesis and an alternative hypothesis. The null hypothesis is a statement of no association between variables or no difference in means of groups, while the alternative hypothesis states that there is a difference or an association. The interests of medical researchers are varied, and research questions result in statements of hypotheses. Examples of such questions are: Is there a significant difference in the proportion of low birth weight babies delivered to mothers with single and multiple pregnancies? Is there a difference in the effects of three antiretroviral drugs on reduction in viral load? Is there a correlation between body mass index and systolic blood pressure? Is there a difference in reduction in blood sugar between a standard hypoglycemic and a new drug? The null hypothesis for the third of these questions will be ‘There is no correlation between body mass index and systolic blood pressure’. The use and interpretation of p values and confidence intervals will now be discussed.
There are different definitions of the ‘p value’. Perhaps the most popular definition is ‘The probability of obtaining a value as extreme as, or more extreme than, that found in the study if the null hypothesis were true’¹. Put more simply, it is often described as the probability that the observed result is due to chance alone², although strictly it is calculated on the assumption that the null hypothesis is true and is not the probability that the null hypothesis itself is true. An important point to note in these definitions is the use of the phrases ‘found in the study’ and ‘observed result’. The p value only tells us whether what we have observed, which is usually obvious, is statistically significant. For example, in a study which examined the difference in prevalence of low birth weight deliveries between singleton and multiple pregnancies, the figures for the prevalence could have been 12.5% for multiple pregnancies and 3.6% for singletons. All the statistical jargon about p values and confidence intervals does not negate the fact that, in the sample, the proportion of low birth weight babies delivered to mothers with multiple pregnancies is higher than the proportion for singletons. Hence from the initial descriptive statistics used to summarize variables, such as proportions and means, we have an idea of the results of our study, but the statistical significance is what the p value helps to ‘endorse’. The interpretation of p values is based on reference to a particular cut-off for the probability, the so-called level of significance, which is conventionally set at 0.05. Hence p values less than this number are significant while those above it are not.
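The low birth weight example can be worked through with a two-proportion z-test, one common way of obtaining such a p value. The sketch below is illustrative only: the article gives the percentages but not the sample sizes, so the counts (25 of 200 multiple pregnancies and 36 of 1000 singletons) and the function name `two_proportion_p_value` are assumptions chosen to reproduce 12.5% and 3.6%.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(events_a, n_a, events_b, n_b):
    """Two-sided p value for H0: the two population proportions are equal."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    # Pooled proportion: the estimate of the common proportion if H0 is true.
    pooled = (events_a + events_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Probability, under H0, of a difference at least this extreme
    # in either direction (normal approximation).
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts (not from the article): 25/200 = 12.5%, 36/1000 = 3.6%.
z, p = two_proportion_p_value(25, 200, 36, 1000)
print(f"z = {z:.2f}, p = {p:.2g}")
print("significant at 0.05" if p < 0.05 else "not significant at 0.05")
```

With much smaller hypothetical samples showing the same percentages (for instance 2 of 16 against 1 of 28), the same function returns a p value above 0.05. The observed difference is unchanged, but the p value no longer ‘endorses’ it.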
The confidence interval gives the range of values within which we are reasonably confident that the population parameter lies². The parameter here could be a difference in means or proportions of two groups, or it could be a measure of association between two variables. The most commonly reported interval is the 95% confidence interval. A way to think about the concept of confidence intervals is to imagine that the study was repeated about a thousand times, with an interval computed each time. About 95% of these intervals would contain the true population value. Alternatively, we can say that we are 95% confident that the true population value of what we are estimating in our study lies within the interval. The criterion for judging an interval as significant or not depends on whether it contains the null value. The null value refers to the value of the parameter being estimated when the null hypothesis is true. In the earlier example on difference in mean reduction in blood sugar between a standard hypoglycemic and a new drug, the null hypothesis is that there is no difference in the mean reduction in blood sugar by the two drugs, or that the difference in the means between the two groups being compared is zero. The null value here is zero, and any interval computed for the difference in the means which includes zero is not significant. Another set of study designs involves investigation of relationships between two variables, typically an exposure (or risk factor) such as alcohol intake and an outcome such as liver disease. An appropriate measure of association between these variables is the odds ratio, and the null value, that is, its value when there is no relationship between alcohol intake and liver disease, is 1. Hence a confidence interval including 1 will not be a significant interval. A third scenario is when the variables being investigated are both numeric, say the relationship between maternal body mass index (in kg/m²) and babies’ birth weight (in kg), where the measure of association is the correlation coefficient. The null value here is zero, and any interval for the correlation coefficient between body mass index and birth weight that includes zero will not be significant.
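The repeated-study interpretation of a 95% confidence interval can be checked by simulation. The sketch below is a minimal illustration under stated assumptions: blood sugar reductions are drawn from normal distributions with a hypothetical mean of 30 mg/dL and standard deviation of 15 mg/dL, the true difference between drugs is set to 5 mg/dL, and the interval uses the large-sample normal approximation. None of these figures or function names come from the article.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_difference_ci(group_a, group_b, level=0.95):
    """Large-sample (normal approximation) CI for mean(group_a) - mean(group_b)."""
    diff = mean(group_a) - mean(group_b)
    se = sqrt(stdev(group_a) ** 2 / len(group_a)
              + stdev(group_b) ** 2 / len(group_b))
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return diff - z * se, diff + z * se


def coverage(true_diff, n_per_group=50, repeats=1000, seed=1):
    """Fraction of repeated studies whose 95% CI contains the true difference."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        # Hypothetical reductions in blood sugar (mg/dL) for each drug.
        new_drug = [rng.gauss(30 + true_diff, 15) for _ in range(n_per_group)]
        standard = [rng.gauss(30, 15) for _ in range(n_per_group)]
        low, high = mean_difference_ci(new_drug, standard)
        hits += low <= true_diff <= high
    return hits / repeats


# The printed fraction should be close to 0.95.
print(f"coverage over 1000 simulated studies: {coverage(true_diff=5):.3f}")
```

An individual interval returned by `mean_difference_ci` would be judged significant when it does not contain the null value of zero.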
As a guide to interpreting confidence intervals for a difference in means, when the lower and upper limits are both positive or both negative, depending on the direction of the difference, then the difference is significant. Similarly, for odds ratios, when the lower and upper limits are both less than 1 or both greater than 1, then we have a significant result.
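The rule for odds ratios can be illustrated with a 2x2 table. The sketch below uses the log-based (Woolf) approximation for the interval, which is one common method and not necessarily the one the author has in mind. The counts for alcohol intake and liver disease are hypothetical, as are the function names.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def odds_ratio_ci(a, b, c, d, level=0.95):
    """Odds ratio and Woolf (log-based) confidence interval for a 2x2 table.

    a: exposed with disease,   b: exposed without disease,
    c: unexposed with disease, d: unexposed without disease.
    """
    odds_ratio = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    low = exp(log(odds_ratio) - z * se_log)
    high = exp(log(odds_ratio) + z * se_log)
    return odds_ratio, low, high


def is_significant(low, high, null_value=1.0):
    """An interval is significant when it does not contain the null value."""
    return not (low <= null_value <= high)


# Hypothetical counts: 40 of 100 drinkers and 20 of 100 non-drinkers
# have liver disease.
or_, low, high = odds_ratio_ci(40, 60, 20, 80)
print(f"OR = {or_:.2f}, 95% CI {low:.2f} to {high:.2f}, "
      f"significant: {is_significant(low, high)}")
```

For these counts both limits lie above 1, so the result is significant. A table such as 22, 78, 20, 80 gives an interval that spans 1 and would not be.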
It is worthy of note that the p value and confidence interval are two equivalent methods of interpreting results of a statistical analysis. Whether or not we have a significant result can be determined from the p value, based on whether it is less than 5% or not, or from the confidence interval, based on whether the null value lies within the interval. The width of the confidence interval and the size of the p value are related: for a given estimate of effect, the narrower the interval, the smaller the p value. However, the confidence interval gives valuable information about the likely magnitude of the effect being investigated and the reliability of the estimate. Larger sample sizes will give narrower and hence more precise intervals. In conclusion, confidence intervals should be the preferred means of interpreting results of statistical analysis because, in addition to evaluating the role of chance, they reflect the degree of variability and the sample size.

Annals of Ibadan Postgraduate Medicine. Vol.6 No.1 June, 2008 34
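The relationship between sample size, interval width and p value can be seen by holding the observed effect fixed and varying the number of participants. The sketch below assumes a hypothetical observed difference in mean blood sugar reduction of 5 mg/dL, a common standard deviation of 15 mg/dL treated as known, and equal group sizes. These values and the function name `summarize` are illustrative and do not come from the article.

```python
from math import sqrt
from statistics import NormalDist


def summarize(diff, sd, n_per_group, level=0.95):
    """CI and two-sided p value for a difference in means (known common SD)."""
    se = sd * sqrt(2 / n_per_group)
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    low, high = diff - z_crit * se, diff + z_crit * se
    p_value = 2 * (1 - NormalDist().cdf(abs(diff) / se))
    return low, high, p_value


for n in (20, 50, 200):
    low, high, p = summarize(diff=5.0, sd=15.0, n_per_group=n)
    excludes_zero = low > 0 or high < 0
    print(f"n = {n:3d} per group: 95% CI {low:6.2f} to {high:5.2f}, "
          f"p = {p:.4f}, interval excludes 0: {excludes_zero}")
```

As the group size grows the interval narrows and the p value falls, and the interval excludes zero exactly when the p value is below 0.05. Only the interval, however, shows the range of effect sizes compatible with the data.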