Contents | Abhaya Indrayan

Medical Biostatistics, Fourth Edition

Contents

Preface to Fourth Edition

Summary Tables

Frequently Used Notations

1 Medical Uncertainties

1.1 Uncertainties in Health and Disease

1.1.1 Uncertainties due to Intrinsic Variation – Biologic, Genetic, Behavioral and Other Host Factors, Environmental, Chance, Sampling Fluctuations

1.1.2 Natural Variation in Assessment – Observer, Treatment Strategies, Instrument and Laboratory, Imperfect Tools, Incomplete Information on the Patient, Poor Compliance with the Regimen

1.1.3 Inadequate Knowledge – Epistemic Uncertainties; Diagnostic, Therapeutic, and Prognostic Uncertainties; Predictive and Other Uncertainties

1.2 Uncertainties in Medical Research

1.2.1 Empiricism in Medical Research – Laboratory Experiments, Clinical Trials, Surgical Procedures, Epidemiological Research

1.2.2 Elements of Minimizing the Impact of Uncertainties on Research – Proper Design, Improved Medical Methods, Analysis and Synthesis

1.2.3 Critique of a Report of a Medical Study – Introduction, Methodology, Results, Discussion and Conclusions

1.3 Uncertainties in Health Planning and Evaluation

1.3.1 Health Situation Analysis – Identification of the Specifics of the Problem, Size of the Target Population, Magnitude of the Problem, Health Infrastructure, Feasibility of Remedial Steps

1.3.2 Evaluation of Health Programs

1.4 Management of Uncertainties: About This Book

1.4.1 Contents of the Book – Limitations and Strengths, New in Third Edition

1.4.2 Salient Features of the Text – System of Notations, Guide Chart of the Biostatistical Methods

References

2 Basics of Medical Studies

2.1 Study Protocol

2.1.1 The Problem, Objectives, and Hypotheses

2.1.2 Protocol Content

2.2 Types of Medical Studies

2.2.1 Elements of Design

2.2.2 Basic Types of Study Design – Descriptive, Analytical, Basic Types of Analytical Studies

2.2.3 Choosing a Design – Recommended Design for Particular Setups, Choice of Design by Level of Evidence

2.3 Data Collection

2.3.1 Nature of Data – Factual, Knowledge-Based, and Opinion-Based Data; Method of Obtaining the Data

2.3.2 Tools of Data Collection – Existing Records, Questionnaires and Schedules, Likert Scale

2.3.3 Pretesting and Pilot Study

2.4 Nonsampling Errors and Other Biases

2.4.1 Nonresponse

2.4.2 Variety of Biases to Guard Against – List of Biases, Steps for Minimizing Bias

References

Exercises

3 Sampling Methods

3.1 Sampling Concepts

3.1.1 Advantages and Limitations of Sampling – Sampling Fluctuations, Advantages and Limitations

3.1.2 Some Special Terms Used in Sampling – Unit of Enquiry and Sampling Unit, Sampling Frame, Parameters and Statistics, Sample Size, Nonrandom and Random Sampling

3.2 Common Methods of Random Sampling

3.2.1 Simple Random Sampling

3.2.2 Stratified Random Sampling

3.2.3 Multistage Random Sampling

3.2.4 Cluster Random Sampling

3.2.5 Systematic Random Sampling

3.2.6 Choice of Method of Random Sampling

3.3 Some Other Methods of Sampling

3.3.1 Other Random Methods of Sampling – Probability Proportional to Size, Area Sampling, Inverse Sampling, Consecutive Subjects Attending a Clinic, Sequential Sampling

3.3.2 Nonrandom Methods of Sampling – Convenience Samples, Other Types of Purposive Samples

References

Exercises

4 Designs for Observational Studies

4.1 Some Basic Concepts

4.1.1 Antecedent and Outcome

4.1.2 Confounders

4.1.3 Effect Size

4.2 Prospective Studies

4.2.1 Variations of Prospective Studies – Cohort Study, Longitudinal Study, Repeated Measures Study

4.2.2 Selection of Subjects for a Prospective Study – Comparison Group in a Prospective Study

4.2.3 Potential Biases in Prospective Studies – Selection Bias, Bias due to Loss in Follow-Up, Assessment Bias and Errors, Bias due to Change in the Status, Confounding Bias, Post Hoc Bias, Validity Bias

4.2.4 Merits and Demerits of Prospective Studies

4.3 Retrospective Studies

4.3.1 Case-Control Design – Nested Case-Control Design

4.3.2 Selection of Cases and Controls – Sampling Methods in Retrospective Studies, Confounders and Matching

4.3.3 Merits and Demerits of Case-Control Studies

4.4 Cross-Sectional Studies

4.4.1 Selection of Subjects for a Cross-Sectional Study

4.4.2 Merits and Demerits of Cross-Sectional Studies

4.5 Comparative Performance of Prospective, Retrospective, and Cross-Sectional Studies

4.5.1 Performance of Prospective Studies

4.5.2 Performance of Retrospective Studies

4.5.3 Performance of Cross-Sectional Studies

References

Exercises

5 Medical Experiments

5.1 Basic Features of Medical Experiments

5.1.1 Statistical Principles of Experimentation – Control Group, Randomization, Replication

5.1.2 Advantages and Limitations of Experiments

5.2 Design of Experiments

5.2.1 Classical Designs – One-Way Design, Two-Way Design, Interaction, K-Way and Factorial Experiments

5.2.2 Some Unconventional Designs – Repeated Measures Design, Crossover Design, Other Complex Designs

5.3 Choice and Sampling of Units for Laboratory Experiments

5.3.1 Choice of Experimental Unit

5.3.2 Sampling Methods in Laboratory Experiments

5.3.3 Choosing a Design of Experiment

5.3.4 Pharmacokinetic Studies

References

Exercises

6 Clinical Trials

6.1 Therapeutic Trials

6.1.1 Phases of a Clinical Trial – Phases I to IV

6.1.2 Selection of Subjects – Selection of Participants for RCT, Control Group in a Clinical Trial

6.1.3 Randomization and Matching

6.1.4 Methods of Random Allocation – Allocation out of a Large Number of Available Subjects; Random Allocation of Consecutive Patients Coming to a Clinic; Block, Cluster and Stratified Randomization

6.1.5 Blinding and Masking

6.2 Issues in Clinical Trials

6.2.1 Outcome Assessment – Specification of End-point or Outcome, Causal Inference, Side Effects, Efficacy versus Effectiveness, Pragmatic Trials

6.2.2 Various Equivalences in Clinical Trials – Superiority, Equivalence, and Noninferiority Trials; Therapeutic Equivalence and Bioequivalence

6.2.3 Designs for Clinical Trials – One-Way, Two-Way, and Factorial Designs; Crossover and Repeated Measures Designs; N-of-1, Up-and-Down, and Sequential Designs; Choosing a Design for a Clinical Trial

6.2.4 Designs with Interim Appraisals – Design with Provision to Stop Early, Adaptive Designs

6.2.5 Biostatistical Ethics for Clinical Trials – Equipoise, Ethical Cautions, Statistical Considerations in a Multicentric Trial, Multiple Treatments with Different Outcomes in the Same Trial, Size of the Trial, Compliance

6.2.6 Reporting Results of a Clinical Trial – CONSORT, Open Access

6.3 Trials Other than for Therapeutics

6.3.1 Clinical Trials for Diagnostic and Prophylactic Modalities

6.3.2 Field Trials for Screening, Prophylaxis, and Vaccines

6.3.3 Issues in Field Trials – Randomization and Blinding in Field Trials, Designs for Field Trials

References

Exercises

7 Numerical Methods for Representing Variation

7.1 Types of Measurement

7.1.1 Nominal, Metric, and Ordinal Scales

7.1.2 Other Classifications of the Types of Measurement – Discrete and Continuous Variables, Qualitative and Quantitative Data, Stochastic and Deterministic Variables

7.2 Tabular Presentation

7.2.1 Contingency Tables and Frequency Distribution – Empty Cells, Problems in Preparing a Contingency Table on Metric Data

7.2.2 Multiple Response Tables and Other Features

7.2.3 Other Types of Statistical Tables – What is a Good Statistical Table?

7.3 Rates and Ratios

7.3.1 Proportion, Rate, and Ratio

7.4 Central and Other Locations

7.4.1 Central Values: Mean, Median, and Mode – Understanding Mean, Median, and Mode, Calculation in Case of Grouped Data, Which Central Value to Use?, Geometric Mean, Harmonic Mean

7.4.2 Other Locations: Quantiles – Ungrouped and Grouped Data, and Interpretation

7.5 Measuring Variability

7.5.1 Variance and Standard Deviation – Ungrouped and Grouped Data, Variance of Sum or Difference of Two Measurements

7.5.2 Coefficient of Variation

References

Exercises

8 Presentation of Variation by Figures

8.1 Graphs for Frequency Distribution

8.1.1 Histogram and Its Variants – Histogram, Stem-and-Leaf Plot, Line Histogram

8.1.2 Polygon and Its Variants – Frequency Polygon, Area Diagram

8.1.3 Frequency Curve

8.2 Pie, Bar, and Line Diagrams

8.2.1 Pie Diagram – Useful Features, Donut Diagram

8.2.2 Bar Diagram

8.2.3 Scatter and Line Diagrams

8.2.4 Choice and Cautions in Visual Display of Data

8.2.5 Mixed and Three-Dimensional Diagrams – Mixed Diagram, Box-and-Whiskers Plot, Three-Dimensional Diagram, Biplot, Nomogram

8.3 Special Diagrams in Health and Medicine

8.3.1 Diagrams Used in Public Health – Epidemic Curve, Lexis Diagram

8.3.2 Diagrams Used in Individual Care and Research – Growth Charts, Partogram, Dendrogram, Area Under the Concentration Curve, Radar Graph

8.4 Charts and Maps

8.4.1 Charts – Schematic Chart, Pedigree Chart

8.4.2 Maps – Spot Map, Thematic Choroplethic Map, Cartogram

References

Exercises

9 Some Quantitative Aspects of Medicine

9.1 Some Epidemiological Measures of Health and Disease

9.1.1 Epidemiological Indicators of Neonatal Health – Birth Weight, Apgar Score

9.1.2 Epidemiological Indicators of Growth in Children – Weight-for-Age, Weight-for-Height and Height-for-Age, Z-Scores and Percent of Median, Growth Velocity, Skinfold Thickness

9.1.3 Epidemiological Indicators of Adolescent Health – Growth in Height and Weight in Adolescence, Sexual Maturity Rating

9.1.4 Epidemiological Indicators of Adult Health – Obesity, Smoking, Physiological Functions, Quality of Life

9.1.5 Epidemiological Indicators of Geriatric Health – Activities of Daily Living, Mental Health of the Elderly

9.2 Reference Values

9.2.1 Gaussian and Other Distributions – Checking Gaussianity

9.2.2 Reference or Normal Values – Implications

9.2.3 Normal Range – Disease Threshold, Clinical Threshold, Statistical Threshold

9.3 Measurement of Uncertainty: Probability

9.3.1 Elementary Laws of Probability – Law of Multiplication, Law of Addition

9.3.2 Probability in Clinical Assessments – Probabilities in Diagnosis, Assessment of Prognosis, Choice of Treatment

9.3.3 Further on Diagnosis: Bayes Rule

9.4 Validity of Medical Tests

9.4.1 Sensitivity and Specificity – Features of Sensitivity and Specificity, Likelihood Ratio

9.4.2 Predictivities – Positive and Negative Predictivity, Predictivity and Prevalence, The Meaning of Prevalence for Predictivity, Features of Positive and Negative Predictivities

9.4.3 Combination of Tests – Tests in Series, Tests in Parallel

9.4.4 Gains from a Test – When Can a Test Be Avoided?

9.5 Search for the Best Threshold of Continuous Test: ROC Curve

9.5.1 Sensitivity–Specificity Based ROC Curve – Methods to Find the ‘Optimal’ Threshold Point, Area Under the ROC Curve

9.5.2 Predictivities Based ROC Curve

References

Exercises

10 Clinimetrics and Evidence-Based Medicine

10.1 Indicators, Indexes, and Scores

10.1.1 Indicators – Merits and Demerits of Indicators, Choice of Indicators

10.1.2 Indexes – Some Commonly Used Indexes, Advantages and Limitations of Indexes

10.1.3 Scores – Scoring System for Diagnosis, Scoring for Gradation of Severity

10.2 Clinimetrics

10.2.1 Method of Scoring – Method of Scoring for Graded Characteristics, Method of Scoring for Diagnosis, Regression Method of Scoring

10.2.2 Validity and Reliability of a Scoring System

10.3 Evidence-Based Medicine

10.3.1 Decision Analysis – Decision Tree

10.3.2 Other Statistical Tools for Evidence-Based Medicine – Etiology Diagram, Expert System

References

Exercises

11 Measurement of Community Health

11.1 Indicators of Mortality

11.1.1 Crude and Standardized Death Rates – Crude Death Rate, Age-Specific Death Rate, Standardized Death Rate, Comparative Mortality Ratio

11.1.2 Specific Mortality Rates – Fetal Deaths and Mortality in Children, Maternal Mortality, Adult Mortality, Other Measures of Mortality

11.1.3 Death Spectrum

11.2 Measures of Morbidity

11.2.1 Prevalence and Incidence – Point Prevalence, Period Prevalence, Incidence, The Concept of Person-Time, Capture–Recapture Methodology

11.2.2 Duration of Morbidity – Prevalence in Relation to Duration of Morbidity, Incidence from Prevalence, Epidemiologically Consistent Estimates

11.2.3 Morbidity Measures for Acute Conditions – Attack Rates, Disease Spectrum

11.3 Indicators of Social and Mental Health

11.3.1 Indicators of Social Health – Education, Income, Occupation, Socioeconomic Status, Dependency Ratio, Health Inequality

11.3.2 Indicators of Health Resources – Health Infrastructure, Health Expenditure

11.3.3 Indicators of Lack of Mental Health – Smoking and Other Addictions, Divorces, Vehicular Accidents and Crimes, Other Measures of Lack of Mental Health

11.4 Composite Indexes of Health

11.4.1 Indexes of Status of Comprehensive Health – Human Development Index, Physical Quality of Life Index

11.4.2 Indexes of Health Gap – DALYs Lost, Human Poverty Index, Index of Need for Health Resources

References

Exercises

12 Confidence Intervals, Principles of Tests of Significance, and Sample Size

12.1 Sampling Distributions

12.1.1 Basic Concepts – Sampling Error, Point Estimate, Standard Error of p and x̄

12.1.2 Sampling Distribution of p and x̄ – Gaussian Conditions

12.1.3 Obtaining Probabilities from a Gaussian Distribution – Gaussian Probability, Continuity Correction, Probabilities Relating to the Mean and the Proportion

12.1.4 The Case of σ Not Known (t-Distribution)

12.2 Confidence Intervals

12.2.1 Confidence Interval for π, μ and Median (Gaussian Conditions) – Confidence Interval for Proportion π (Large n), Lower and Upper Bounds for π (Large n), Confidence Interval for Mean μ (Large n), Confidence Bounds for Mean μ (Large n), CI for Median (Gaussian Distribution)

12.2.2 Confidence Interval for Differences (Large n) – Two Independent Samples, Paired Samples

12.2.3 Confidence Interval for π, μ and Median: Non-Gaussian Conditions – Confidence Interval for π (Small n), Confidence Bound for π When the Success or the Failure Rate in the Sample is Zero Percent, Confidence Interval for Median (Small n): Non-Gaussian Conditions

12.3 P-Values and Statistical Significance

12.3.1 What Is Statistical Significance? – Court Judgment, Errors in Diagnosis, Null Hypothesis, Philosophical Basis of Statistical Tests, Alternative Hypothesis, One-Sided Alternatives: Which Tail is Wagging?

12.3.2 Errors, P-Values, and Power – Type-I Error, Type-II Error, Power

12.3.3 General Procedure to Obtain P-value – Subtleties of Statistical Significance

12.4 Assessing Gaussian Pattern

12.4.1 Significance Tests for Assessing Gaussianity

12.5 Initial Debate on Statistical Significance

12.5.1 Confidence Interval versus Test of H0

12.5.2 Medical Significance versus Statistical Significance

12.6 Sample Size Determination in Some Cases

12.6.1 Sample Size Required in Estimation Setup – General Considerations in the Estimation Setup, General Procedure for Determining Size of Sample for Estimation, Formulas for Sample Size Calculation for Estimation in Simple Situations

12.6.2 Sample Size for Testing a Hypothesis with Specified Power – General Considerations in a Testing-of-Hypothesis Setup, Sample Size Formulas for Test of Hypothesis in Simple Situations, Nomograms and Tables of Sample Size, Thumb Rules, Power Analysis

12.6.3 Sample Size Calculation in Clinical Trials – Stopping Rules in Case of Early Evidence of Success or of Failure: Lan–deMets Procedure, Sample Size Reestimation

References

Exercises

13 Inference from Proportions

13.1 One Qualitative Variable

13.1.1 Dichotomous Categories: Binomial Distribution – Large n: Gaussian Approximation to Binomial

13.1.2 Poisson Distribution

13.1.3 Polytomous Categories (Large n): Goodness-of-Fit Test – Chi-Square and Its Explanation, Degrees of Freedom, Cautions in Using Chi-Square, Further Analysis: Partitioning of Table

13.1.4 Goodness of Fit to Assess Gaussianity

13.1.5 Polytomous Categories (Small n): Exact Multinomial Test – Goodness-of-Fit in Small Samples

13.2 Proportions in 2×2 Tables

13.2.1 Structure of 2×2 Table in Different Types of Study – Structure in Prospective Study, Structure in Retrospective Study, Structure in Cross-Sectional Study

13.2.2 Two Independent Samples (Large n): Chi-Square Test and Proportion Test – Chi-square Test, Yates Correction for Continuity, Z-Test for Proportions, Detecting a Medically Important Difference in Proportions, Crossover Design with Binary Response (Large n)

13.2.3 Equivalence Tests – Superiority, Equivalence and Noninferiority; Equivalence; Determining Inferiority Margin

13.2.4 Two Independent Samples (Small n): Fisher Exact Test – Crossover Design (Small n)

13.2.5 Proportions in Matched Pairs: McNemar Test (Large n) and Exact Test (Small n) – Large n: McNemar Test, Small n: Exact Test (Matched Pairs), Comparison of Two Tests for Sensitivity and Specificity: Paired Setup

13.3 Analysis of R × C Tables (Large n)

13.3.1 One Dichotomous and the Other Polytomous Variable (2×C Table) – The Test Criterion, Trend in Proportions in Ordinal Categories, Dichotomy in Repeated Measures: Cochran Q Test (Large n)

13.3.2 Two Polytomous Variables – Chi-square Test for Large n, Matched Pairs: I×I Tables

13.4 Three-Way Tables

13.4.1 Assessment of Association in Three-Way Tables

13.4.2 Log–Linear Models – Two-Way Tables, Three-Way Tables

References

Exercises

14 Relative Risk and Odds Ratio

14.1 Relative and Attributable Risks (Large n)

14.1.1 Risk, Hazard, and Odds – Ratios of Risks and Odds

14.1.2 Relative Risk – RR in Independent Samples, Confidence Interval for RR (Independent Samples), Test of Hypothesis on RR (Independent Samples), RR in the Case of Matched Pairs

14.1.3 Attributable Risk – AR in Independent Samples, AR in Matched Pairs, Number Needed to Treat, Relative Risk Reduction, Population Attributable Risk

14.2 Odds Ratio

14.2.1 OR in Two Independent Samples – CI for OR (Independent Samples), Test of Hypothesis on OR (Independent Samples)

14.2.2 OR in Matched Pairs – Confidence Interval for OR (Matched Pairs), Test of Hypothesis on OR (Matched Pairs), Multiple Controls

14.3 Stratified Analysis, Sample Size and Meta-Analysis

14.3.1 Mantel–Haenszel Procedure – Pooled Odds Ratio and Chi-square

14.3.2 Sample Size Requirement for Statistical Inference on RR and OR

14.3.3 Meta-Analysis

References

Exercises

15 Inference from Means

15.1 Comparison of Means in One and Two Groups (Gaussian Conditions): Student t-Test

15.1.1 Comparison with a Prespecified Mean – Student t-Test for One Sample

15.1.2 Difference in Means in Two Samples – Paired Samples Setup, Unpaired (Independent) Samples Setup, Some Features of Student t, Effect of Unequal n, Difference-in-Differences Approach

15.1.3 Analysis of Crossover Designs – Test for Group Effect, Test for Carry-Over Effect, Test for Treatment Effect

15.1.4 Analysis of Data of Up-and-Down Trials

15.2 Comparison of Means in Three or More Groups (Gaussian Conditions): ANOVA F-Test

15.2.1 One-Way ANOVA – The Procedure to Test H0, Checking the Validity of the Assumptions of ANOVA

15.2.2 Two-Way ANOVA – Two-Factor Design, The Hypotheses and Their Test, Main Effect and Interaction (Effect), Repeated Measures

15.2.3 Repeated Measures – Random Effects versus Fixed Effects, Sphericity and Huynh–Feldt Correction, Repeated Measures versus Two-way ANOVA, Area Under the Concentration Curve

15.2.4 Multiple Comparisons: Bonferroni, Tukey and Dunnett Tests – Intricacies of Multiple Comparisons

15.3 Non-Gaussian Conditions: Nonparametric Tests for Location

15.3.1 Comparison of Two Groups: Wilcoxon Tests – Paired Data, Independent Samples

15.3.2 Comparison of Three or More Groups: Kruskal–Wallis Test

15.3.3 Two-Way Layout: Friedman Test

15.4 When Significant is Not Significant

15.4.1 The Nature of Statistical Significance

15.4.2 Testing for Presence of Medically Important Difference in Means – Detecting Specified Difference in Mean, Equivalence Tests for Means

15.4.3 Power and Level of Significance – Balancing Type-I and Type-II Error

References

Exercises

16 Relationships: Quantitative Data

16.1 Some General Features of a Regression Setup

16.1.1 Dependent and Independent Variables – Simple, Multiple, and Multivariate Regression

16.1.2 Linear, Curvilinear, and Nonlinear Regressions

16.1.3 The Concept of Residuals

16.1.4 General Method of Fitting a Regression

16.2 Linear Regression Models

16.2.1 Adequacy of a Regression Fit – 1 – Goodness of Fit and η², Multiple Correlation in Linear Regression, Stepwise Procedure, Statistical Significance of Individual Regression Coefficients

16.2.2 Adequacy of Regression – 2 – Validity of Assumptions, Choice of Form of Regression, Outliers and Missing Values

16.2.3 Interpretation of the Regression Coefficients – Standardized Coefficients, Other Implications of Regression Models

16.3 Some Issues in Linear Regression

16.3.1 Confidence Interval, Confidence Band, and Tests – SEs and CIs for the Regression, Confidence Band for Simple Linear Regression, Equality of Two Regression Lines, Difference-in-Differences Approach with Regression

16.3.2 Some Variations of Regression – Ridge Regression, Multilevel Regression, Regression Splines, Analysis of Covariance, Some Generalizations

16.4 Measuring the Strength of Quantitative Relationship

16.4.1 Product–Moment and Related Correlations – Multiple Correlation, Product–Moment Correlation, Covariance, Statistical Significance of r, Intraclass Correlation, Serial Correlation

16.4.2 Rank Correlation – Spearman Rho, Kendall Tau

16.5 Assessment of Quantitative Agreement

16.5.1 Agreement in Quantitative Measurements

16.5.2 Approaches for Measuring Quantitative Agreement – Limits of Disagreement Approach, Intraclass Correlation as a Measure of Agreement, Relative Merits of the Two Methods, An Alternative Simple Approach

References

Exercises

17 Relationships: Qualitative Dependent

17.1 Binary Dependent: Logistic Regression (Large n)

17.1.1 Meaning of a Logistic Model

17.1.2 Assessing Overall Adequacy of a Logistic Regression – Log Likelihood, Classification Accuracy, Hosmer–Lemeshow Test

17.2 Inference from Logistic Coefficients

17.2.1 Interpretation of the Logistic Coefficients – Dichotomous Predictors, Polytomous and Continuous Predictors

17.2.2 Confidence Interval and Test of Hypothesis on Logistic Coefficients

17.3 Issues in Logistic Regression

17.3.1 Conditional Logistic for Matched Data

17.3.2 Polytomous Dependent – Nominal Categories: Multinomial Logistic, Ordinal Categories

17.4 Some Models for Qualitative Data and Generalizations

17.4.1 Cox Regression for Hazards

17.4.2 Classification and Regression Trees

17.4.3 Further Generalizations

17.5 Strength of Relationship in Qualitative Variables

17.5.1 Both Variables Qualitative – Dichotomous Categories, Polytomous Categories: Nominal, Proportional Reduction in Error, Polytomous Categories: Ordinal Association

17.5.2 One Qualitative and the Other Quantitative Variable

17.5.3 Agreement in Qualitative Measurements (Matched Pairs) – The Meaning of Qualitative Agreement, Cohen Kappa

References

Exercises

18 Survival Analysis

18.1 Life Expectancy

18.1.1 Life Table

18.1.2 Other Forms of Life Expectancy – Potential Years of Life Lost, Healthy Life Expectancy, Application to Other Setups

18.2 Analysis of Survival Data

18.2.1 Nature of Survival Data – Types of Censoring, Collection of Survival Time Data, Statistical Measures of Survival

18.2.2 Survival Observed in Time Intervals: Life Table Method

18.2.3 Continuous Observation of Survival Time: Kaplan–Meier Method – Using the Survival Curve, Standard Error of Survival Rate, Hazard Function

18.3 Issues in Survival Analysis

18.3.1 Comparison of Survival in Two Groups – Comparing Survival Rates, Comparing Survival Experience: Log-Rank Test

18.3.2 Factors Affecting Survival: Cox Model – Parametric Models, Cox Model for Survival, Proportional Hazards

18.3.3 Sample Size for Survival Studies

References

Exercises

19 Simultaneous Consideration of Several Variables

19.1 Scope of Multivariate Methods

19.1.1 The Essentials of a Multivariate Setup

19.1.2 Statistical Limitation on the Number of Variables

19.2 Dependent and Independent Sets of Variables

19.2.1 Dependents and Independents Both Quantitative: Multivariate Multiple Regression

19.2.2 Quantitative Dependents and Qualitative Independents: Multivariate Analysis of Variance (MANOVA) – MANOVA for Repeated Measures

19.2.3 Classification of Subjects into Known Groups: Discriminant Analysis – Discriminant Function, Classification Rule, Classification Accuracy

19.3 Identification of Structure in the Observations

19.3.1 Identification of Clusters of Subjects: Cluster Analysis – Measures of Similarity, Hierarchical Agglomerative Algorithm, Deciding on the Number of Natural Clusters

19.3.2 Identification of Unobservable Underlying Factors: Factor Analysis – Steps for Factor Analysis, Features of a Successful Factor Analysis, Factor Scores

References

Exercises

20 Quality Considerations

20.1 Statistical Quality Control in Medical Care

20.1.1 Statistical Control of Medical Care Errors – Adverse Patient Outcomes, Monitoring Fatality, Limits of Tolerance

20.1.2 Quality of Lots – The Lot Quality Method, LQAS in Health Assessment

20.1.3 Quality Control in a Medical Laboratory – Control Chart, Cusum Chart, Other Errors in Medical Laboratory, Six Sigma Methodology, Nonstatistical Issues

20.2 Quality of Measurements

20.2.1 Validity of Instruments – Types of Validity

20.2.2 Reliability of Instruments – Internal Consistency, Cronbach Alpha, Test–Retest Reliability

20.3 Quality of Statistical Models: Robustness

20.3.1 External Validation – Split-Sample Method, Another Sample Method

20.3.2 Sensitivity Analysis and Uncertainty Analysis

20.3.3 Resampling – Bootstrapping, Jackknife Resampling

20.4 Quality of Data

20.4.1 Errors in Measurement – Lack of Standardization in Definitions, Lack of Care in Obtaining or Recording Information, Inability of the Observer to Get Confidence of the Respondent, Bias of the Observer, Variable Competence of the Observers

20.4.2 Missing Values – Approaches for Missing Values, Handling Nonresponse, Imputations, Intention-to-Treat Analysis

20.4.3 Lack of Standardization in Values – Standardization Methods Already Described, Standardization for Calculating Adjusted Rates, Standardized Mortality Ratio

References

Exercises

21 Statistical Fallacies

21.1 Problems with the Sample

21.1.1 Biased Sample – Survivors, Volunteers, Clinical Subjects, Publication Bias, Inadequate Specification of Sampling Method, Abrupt Series

21.1.2 Inadequate Size of the Sample – Problems with Calculation of Sample Size

21.1.3 Incomparable Groups – Differential in Group Composition, Differential Definitions, Differential Compliance, Variable Periods of Exposure, Improper Denominator

21.1.4 Mixing of Distinct Groups – Effect on Regression, Effect on Shape of the Distribution, Lack of Intragroup Homogeneity

21.2 Inadequate Analysis

21.2.1 Ignoring Reality – Looking for Linearity, Overlooking Assumptions, Selection of Inappropriate Variables, Area Under the Concentration Curve, Further Problems with Statistical Analysis, Anomalous Person-Years, Problems with Intention-to-Treat Analysis and Equivalence

21.2.2 Choice of Analysis – Mean or Proportion?, Forgetting Baseline Values

21.2.3 Misuse of Statistical Packages – Over-Analysis, Data Dredging, Quantitative Analysis of Codes, Soft Data versus Hard Data

21.3 Errors in Presentation of Findings

21.3.1 Misuse of Percentages and Means – Unnecessary Decimals

21.3.2 Problems in Reporting – Incomplete Reporting, Over-Reporting, Selective Reporting, Self-Reporting versus Objective Measurement, Misuse of Graphs

21.4 Misinterpretation

21.4.1 Misuse of P-Values – Magic Threshold 0.05, One-Tail or Two-Tail P-Values, Multiple Comparisons, Dramatic P-Values, P-Values for Nonrandom Sample, “Normal” with Respect to Several Parameters, Absence of Evidence is not Evidence of Absence

21.4.2 Correlation versus Cause–Effect Relationship – Criteria for Cause–Effect, Other Considerations

21.4.3 Sundry Issues – Diagnostic Test is Only an Additional Adjunct, Medical Significance versus Statistical Significance, Interpretation of Standard Error of p, Univariate Analysis but Multivariate Conclusions, Limitation of Relative Risk, Misinterpretation of Improvements

21.4.4 Final Comments

References

Exercises

Brief Solutions and Answers to the Selected Exercises

Appendix A: Statistical Software

A.1 General Purpose Statistical Software

A.2 Special Purpose Statistical Software

Appendix B: Some Statistical Tables

Appendix C: Software Illustrations

C.1 ROC Curves

C.2 Repeated Measures ANOVA

C.3 One-way ANOVA and Tukey Test

C.4 Stepwise Multiple Linear Regression

C.5 Curvilinear Regression

C.6 Analysis of Covariance (ANCOVA)

C.7 Logistic Regression

C.8 Survival Analysis (Life Table Method)

C.9 Cox Proportional Hazards Model

Index

Data sets for the Examples in this text are available in Excel format for download at http://MedicalBiostatistics.synthasite.com. Use them to rework the examples that interest you and to carry out further analysis where needed.