Dr. Abhaya Indrayan, MSc, MS, PhD (Ohio State), FAMS, FRSS, FSMS, FASc
Medical Biostatistics, Fourth Edition
Contents
Preface to Fourth Edition
Summary Tables
Frequently Used Notations
1 Medical Uncertainties
1.1 Uncertainties in Health and Disease
1.1.1 Uncertainties due to Intrinsic Variation – Biologic, Genetic, Behavioral and Other Host Factors, Environmental, Chance, Sampling Fluctuations
1.1.2 Natural Variation in Assessment – Observer, Treatment Strategies, Instrument and Laboratory, Imperfect Tools, Incomplete Information on the Patient, Poor Compliance with the Regimen
1.1.3 Inadequate Knowledge – Epistemic Uncertainties; Diagnostic, Therapeutic, and Prognostic Uncertainties; Predictive and Other Uncertainties
1.2 Uncertainties in Medical Research
1.2.1 Empiricism in Medical Research – Laboratory Experiments, Clinical Trials, Surgical Procedures, Epidemiological Research
1.2.2 Elements of Minimizing the Impact of Uncertainties on Research – Proper Design, Improved Medical Methods, Analysis and Synthesis
1.2.3 Critique of a Report of a Medical Study – Introduction, Methodology, Results, Discussion and Conclusions
1.3 Uncertainties in Health Planning and Evaluation
1.3.1 Health Situation Analysis – Identification of the Specifics of the Problem, Size of the Target Population, Magnitude of the Problem, Health Infrastructure, Feasibility of Remedial Steps
1.3.2 Evaluation of Health Programs
1.4 Management of Uncertainties: About This Book
1.4.1 Contents of the Book – Limitations and Strengths, New in Third Edition
1.4.2 Salient Features of the Text – System of Notations, Guide Chart of the Biostatistical Methods
References
2 Basics of Medical Studies
2.1 Study Protocol
2.1.1 The Problem, Objectives, and Hypotheses
2.1.2 Protocol Content
2.2 Types of Medical Studies
2.2.1 Elements of Design
2.2.2 Basic Types of Study Design – Descriptive, Analytical, Basic Types of Analytical Studies
2.2.3 Choosing a Design – Recommended Design for Particular Setups, Choice of Design by Level of Evidence
2.3 Data Collection
2.3.1 Nature of Data – Factual, Knowledge-Based, and Opinion-Based Data; Method of Obtaining the Data
2.3.2 Tools of Data Collection – Existing Records, Questionnaires and Schedules, Likert Scale
2.3.3 Pretesting and Pilot Study
2.4 Nonsampling Errors and Other Biases
2.4.1 Nonresponse
2.4.2 Variety of Biases to Guard Against – List of Biases, Steps for Minimizing Bias
References
Exercises
3 Sampling Methods
3.1 Sampling Concepts
3.1.1 Advantages and Limitations of Sampling – Sampling Fluctuations, Advantages and Limitations
3.1.2 Some Special Terms Used in Sampling – Unit of Enquiry and Sampling Unit, Sampling Frame, Parameters and Statistics, Sample Size, Nonrandom and Random Sampling
3.2 Common Methods of Random Sampling
3.2.1 Simple Random Sampling
3.2.2 Stratified Random Sampling
3.2.3 Multistage Random Sampling
3.2.4 Cluster Random Sampling
3.2.5 Systematic Random Sampling
3.2.6 Choice of Method of Random Sampling
3.3 Some Other Methods of Sampling
3.3.1 Other Random Methods of Sampling – Probability Proportional to Size, Area Sampling, Inverse Sampling, Consecutive Subjects Attending a Clinic, Sequential Sampling
3.3.2 Nonrandom Methods of Sampling – Convenience Samples, Other Types of Purposive Samples
References
Exercises
4 Designs for Observational Studies
4.1 Some Basic Concepts
4.1.1 Antecedent and Outcome
4.1.2 Confounders
4.1.3 Effect Size
4.2 Prospective Studies
4.2.1 Variations of Prospective Studies – Cohort Study, Longitudinal Study, Repeated Measures Study
4.2.2 Selection of Subjects for a Prospective Study – Comparison Group in a Prospective Study
4.2.3 Potential Biases in Prospective Studies – Selection Bias, Bias due to Loss in Follow-Up, Assessment Bias and Errors, Bias due to Change in the Status, Confounding Bias, Post Hoc Bias, Validity Bias
4.2.4 Merits and Demerits of Prospective Studies
4.3 Retrospective Studies
4.3.1 Case-Control Design – Nested Case-Control Design
4.3.2 Selection of Cases and Controls – Sampling Methods in Retrospective Studies, Confounders and Matching
4.3.3 Merits and Demerits of Case-Control Studies
4.4 Cross-Sectional Studies
4.4.1 Selection of Subjects for a Cross-Sectional Study
4.4.2 Merits and Demerits of Cross-Sectional Studies
4.5 Comparative Performance of Prospective, Retrospective, and Cross-Sectional Studies
4.5.1 Performance of Prospective Studies
4.5.2 Performance of Retrospective Studies
4.5.3 Performance of Cross-Sectional Studies
References
Exercises
5 Medical Experiments
5.1 Basic Features of Medical Experiments
5.1.1 Statistical Principles of Experimentation – Control Group, Randomization, Replication
5.1.2 Advantages and Limitations of Experiments
5.2 Design of Experiments
5.2.1 Classical Designs – One-Way Design, Two-Way Design, Interaction, K-Way and Factorial Experiments
5.2.2 Some Unconventional Designs – Repeated Measures Design, Crossover Design, Other Complex Designs
5.3 Choice and Sampling of Units for Laboratory Experiments
5.3.1 Choice of Experimental Unit
5.3.2 Sampling Methods in Laboratory Experiments
5.3.3 Choosing a Design of Experiment
5.3.4 Pharmacokinetic Studies
References
Exercises
6 Clinical Trials
6.1 Therapeutic Trials
6.1.1 Phases of a Clinical Trial – Phases I to IV
6.1.2 Selection of Subjects – Selection of Participants for RCT, Control Group in a Clinical Trial
6.1.3 Randomization and Matching
6.1.4 Methods of Random Allocation – Allocation out of a Large Number of Available Subjects; Random Allocation of Consecutive Patients Coming to a Clinic; Block, Cluster and Stratified Randomization
6.1.5 Blinding and Masking
6.2 Issues in Clinical Trials
6.2.1 Outcome Assessment – Specification of End-point or Outcome, Causal Inference, Side Effects, Efficacy versus Effectiveness, Pragmatic Trials
6.2.2 Various Equivalences in Clinical Trials – Superiority, Equivalence, and Noninferiority Trials; Therapeutic Equivalence and Bioequivalence
6.2.3 Designs for Clinical Trials – One-Way, Two-Way, and Factorial Designs; Crossover and Repeated Measures Designs; N-of-1, Up-and-Down, and Sequential Designs; Choosing a Design for a Clinical Trial
6.2.4 Designs with Interim Appraisals – Design with Provision to Stop Early, Adaptive Designs
6.2.5 Biostatistical Ethics for Clinical Trials – Equipoise, Ethical Cautions, Statistical Considerations in a Multicentric Trial, Multiple Treatments with Different Outcomes in the Same Trial, Size of the Trial, Compliance
6.2.6 Reporting Results of a Clinical Trial – CONSORT, Open Access
6.3 Trials Other than for Therapeutics
6.3.1 Clinical Trials for Diagnostic and Prophylactic Modalities
6.3.2 Field Trials for Screening, Prophylaxis, and Vaccines
6.3.3 Issues in Field Trials – Randomization and Blinding in Field Trials, Designs for Field Trials
References
Exercises
7 Numerical Methods for Representing Variation
7.1 Types of Measurement
7.1.1 Nominal, Metric, and Ordinal Scales
7.1.2 Other Classifications of the Types of Measurement – Discrete and Continuous Variables, Qualitative and Quantitative Data, Stochastic and Deterministic Variables
7.2 Tabular Presentation
7.2.1 Contingency Tables and Frequency Distribution – Empty Cells, Problems in Preparing a Contingency Table on Metric Data
7.2.2 Multiple Response Tables and Other Features
7.2.3 Other Types of Statistical Tables – What is a Good Statistical Table?
7.3 Rates and Ratios
7.3.1 Proportion, Rate, and Ratio
7.4 Central and Other Locations
7.4.1 Central Values: Mean, Median, and Mode – Understanding Mean, Median, and Mode, Calculation in Case of Grouped Data, Which Central Value to Use?, Geometric Mean, Harmonic Mean
7.4.2 Other Locations: Quantiles – Ungrouped and Grouped Data, and Interpretation
7.5 Measuring Variability
7.5.1 Variance and Standard Deviation – Ungrouped and Grouped Data, Variance of Sum or Difference of Two Measurements
7.5.2 Coefficient of Variation
References
Exercises
8 Presentation of Variation by Figures
8.1 Graphs for Frequency Distribution
8.1.1 Histogram and Its Variants – Histogram, Stem-and-Leaf Plot, Line Histogram
8.1.2 Polygon and Its Variants – Frequency Polygon, Area Diagram
8.1.3 Frequency Curve
8.2 Pie, Bar, and Line Diagrams
8.2.1 Pie Diagram – Useful Features, Donut Diagram
8.2.2 Bar Diagram
8.2.3 Scatter and Line Diagrams
8.2.4 Choice and Cautions in Visual Display of Data
8.2.5 Mixed and Three-Dimensional Diagrams – Mixed Diagram, Box-and-Whiskers Plot, Three-Dimensional Diagram, Biplot, Nomogram
8.3 Special Diagrams in Health and Medicine
8.3.1 Diagrams Used in Public Health – Epidemic Curve, Lexis Diagram
8.3.2 Diagrams Used in Individual Care and Research – Growth Charts, Partogram, Dendrogram, Area Under the Concentration Curve, Radar Graph
8.4 Charts and Maps
8.4.1 Charts – Schematic Chart, Pedigree Chart
8.4.2 Maps – Spot Map, Thematic Choroplethic Map, Cartogram
References
Exercises
9 Some Quantitative Aspects of Medicine
9.1 Some Epidemiological Measures of Health and Disease
9.1.1 Epidemiological Indicators of Neonatal Health – Birth Weight, Apgar Score
9.1.2 Epidemiological Indicators of Growth in Children – Weight-for-Age, Weight-for-Height and Height-for-Age, Z-Scores and Percent of Median, Growth Velocity, Skinfold Thickness
9.1.3 Epidemiological Indicators of Adolescent Health – Growth in Height and Weight in Adolescence, Sexual Maturity Rating
9.1.4 Epidemiological Indicators of Adult Health – Obesity, Smoking, Physiological Functions, Quality of Life
9.1.5 Epidemiological Indicators of Geriatric Health – Activities of Daily Living, Mental Health of the Elderly
9.2 Reference Values
9.2.1 Gaussian and Other Distributions – Checking Gaussianity
9.2.2 Reference or Normal Values – Implications
9.2.3 Normal Range – Disease Threshold, Clinical Threshold, Statistical Threshold
9.3 Measurement of Uncertainty: Probability
9.3.1 Elementary Laws of Probability – Law of Multiplication, Law of Addition
9.3.2 Probability in Clinical Assessments – Probabilities in Diagnosis, Assessment of Prognosis, Choice of Treatment
9.3.3 Further on Diagnosis: Bayes Rule
9.4 Validity of Medical Tests
9.4.1 Sensitivity and Specificity – Features of Sensitivity and Specificity, Likelihood Ratio
9.4.2 Predictivities – Positive and Negative Predictivity, Predictivity and Prevalence, The Meaning of Prevalence for Predictivity, Features of Positive and Negative Predictivities
9.4.3 Combination of Tests – Tests in Series, Tests in Parallel
9.4.4 Gains from a Test – When Can a Test Be Avoided?
9.5 Search for the Best Threshold of Continuous Test: ROC Curve
9.5.1 Sensitivity–Specificity Based ROC Curve – Methods to Find the ‘Optimal’ Threshold Point, Area Under the ROC Curve
9.5.2 Predictivities Based ROC Curve
References
Exercises
10 Clinimetrics and Evidence-Based Medicine
10.1 Indicators, Indexes, and Scores
10.1.1 Indicators – Merits and Demerits of Indicators, Choice of Indicators
10.1.2 Indexes – Some Commonly Used Indexes, Advantages and Limitations of Indexes
10.1.3 Scores – Scoring System for Diagnosis, Scoring for Gradation of Severity
10.2 Clinimetrics
10.2.1 Method of Scoring – Method of Scoring for Graded Characteristics, Method of Scoring for Diagnosis, Regression Method of Scoring
10.2.2 Validity and Reliability of a Scoring System
10.3 Evidence-Based Medicine
10.3.1 Decision Analysis – Decision Tree
10.3.2 Other Statistical Tools for Evidence-Based Medicine – Etiology Diagram, Expert System
References
Exercises
11 Measurement of Community Health
11.1 Indicators of Mortality
11.1.1 Crude and Standardized Death Rates – Crude Death Rate, Age-Specific Death Rate, Standardized Death Rate, Comparative Mortality Ratio
11.1.2 Specific Mortality Rates – Fetal Deaths and Mortality in Children, Maternal Mortality, Adult Mortality, Other Measures of Mortality
11.1.3 Death Spectrum
11.2 Measures of Morbidity
11.2.1 Prevalence and Incidence – Point Prevalence, Period Prevalence, Incidence, The Concept of Person-Time, Capture–Recapture Methodology
11.2.2 Duration of Morbidity – Prevalence in Relation to Duration of Morbidity, Incidence from Prevalence, Epidemiologically Consistent Estimates
11.2.3 Morbidity Measures for Acute Conditions – Attack Rates, Disease Spectrum
11.3 Indicators of Social and Mental Health
11.3.1 Indicators of Social Health – Education, Income, Occupation, Socioeconomic Status, Dependency Ratio, Health Inequality
11.3.2 Indicators of Health Resources – Health Infrastructure, Health Expenditure
11.3.3 Indicators of Lack of Mental Health – Smoking and Other Addictions, Divorces, Vehicular Accidents and Crimes, Other Measures of Lack of Mental Health
11.4 Composite Indexes of Health
11.4.1 Indexes of Status of Comprehensive Health – Human Development Index, Physical Quality of Life Index
11.4.2 Indexes of Health Gap – DALYs Lost, Human Poverty Index, Index of Need for Health Resources
References
Exercises
12 Confidence Intervals, Principles of Tests of Significance, and Sample Size
12.1 Sampling Distributions
12.1.1 Basic Concepts – Sampling Error, Point Estimate, Standard Error of p and x̄
12.1.2 Sampling Distribution of p and x̄ – Gaussian Conditions
12.1.3 Obtaining Probabilities from a Gaussian Distribution – Gaussian Probability, Continuity Correction, Probabilities Relating to the Mean and the Proportion
12.1.4 The Case of σ Not Known (t-Distribution)
12.2 Confidence Intervals
12.2.1 Confidence Interval for π, μ and Median (Gaussian Conditions) – Confidence Interval for Proportion π (Large n), Lower and Upper Bounds for π (Large n), Confidence Interval for Mean μ (Large n), Confidence Bounds for Mean μ (Large n), CI for Median (Gaussian Distribution)
12.2.2 Confidence Interval for Differences (Large n) – Two Independent Samples, Paired Samples
12.2.3 Confidence Interval for π, μ and Median: Non-Gaussian Conditions – Confidence Interval for π (Small n), Confidence Bound for π When the Success or the Failure Rate in the Sample is Zero Percent, Confidence Interval for Median (Small n): Non-Gaussian Conditions
12.3 P-Values and Statistical Significance
12.3.1 What Is Statistical Significance? – Court Judgment, Errors in Diagnosis, Null Hypothesis, Philosophical Basis of Statistical Tests, Alternative Hypothesis, One-Sided Alternatives: Which Tail is Wagging?
12.3.2 Errors, P-Values, and Power – Type-I Error, Type-II Error, Power
12.3.3 General Procedure to Obtain P-Value – Subtleties of Statistical Significance
12.4 Assessing Gaussian Pattern
12.4.1 Significance Tests for Assessing Gaussianity
12.5 Initial Debate on Statistical Significance
12.5.1 Confidence Interval versus Test of H0
12.5.2 Medical Significance versus Statistical Significance
12.6 Sample Size Determination in Some Cases
12.6.1 Sample Size Required in Estimation Setup – General Considerations in the Estimation Setup, General Procedure for Determining Size of Sample for Estimation, Formulas for Sample Size Calculation for Estimation in Simple Situations
12.6.2 Sample Size for Testing a Hypothesis with Specified Power – General Considerations in a Testing-of-Hypothesis Setup, Sample Size Formulas for Test of Hypothesis in Simple Situations, Nomograms and Tables of Sample Size, Thumb Rules, Power Analysis
12.6.3 Sample Size Calculation in Clinical Trials – Stopping Rules in Case of Early Evidence of Success or of Failure: Lan–deMets Procedure, Sample Size Reestimation
References
Exercises
13 Inference from Proportions
13.1 One Qualitative Variable
13.1.1 Dichotomous Categories: Binomial Distribution – Large n: Gaussian Approximation to Binomial
13.1.2 Poisson Distribution
13.1.3 Polytomous Categories (Large n): Goodness-of-Fit Test – Chi-Square and Its Explanation, Degrees of Freedom, Cautions in Using Chi-Square, Further Analysis: Partitioning of Table
13.1.4 Goodness of Fit to Assess Gaussianity
13.1.5 Polytomous Categories (Small n): Exact Multinomial Test – Goodness-of-Fit in Small Samples
13.2 Proportions in 2×2 Tables
13.2.1 Structure of 2×2 Table in Different Types of Study – Structure in Prospective Study, Structure in Retrospective Study, Structure in Cross-Sectional Study
13.2.2 Two Independent Samples (Large n): Chi-Square Test and Proportion Test – Chi-Square Test, Yates Correction for Continuity, Z-Test for Proportions, Detecting a Medically Important Difference in Proportions, Crossover Design with Binary Response (Large n)
13.2.3 Equivalence Tests – Superiority, Equivalence and Noninferiority; Equivalence; Determining Inferiority Margin
13.2.4 Two Independent Samples (Small n): Fisher Exact Test – Crossover Design (Small n)
13.2.5 Proportions in Matched Pairs: McNemar Test (Large n) and Exact Test (Small n) – Large n: McNemar Test, Small n: Exact Test (Matched Pairs), Comparison of Two Tests for Sensitivity and Specificity: Paired Setup
13.3 Analysis of R × C Tables (Large n)
13.3.1 One Dichotomous and the Other Polytomous Variable (2×C Table) – The Test Criterion, Trend in Proportions in Ordinal Categories, Dichotomy in Repeated Measures: Cochran Q Test (Large n)
13.3.2 Two Polytomous Variables – Chi-Square Test for Large n, Matched Pairs: I×I Tables
13.4 Three-Way Tables
13.4.1 Assessment of Association in Three-Way Tables
13.4.2 Log–Linear Models – Two-Way Tables, Three-Way Tables
References
Exercises
14 Relative Risk and Odds Ratio
14.1 Relative and Attributable Risks (Large n)
14.1.1 Risk, Hazard, and Odds – Ratios of Risks and Odds
14.1.2 Relative Risk – RR in Independent Samples, Confidence Interval for RR (Independent Samples), Test of Hypothesis on RR (Independent Samples), RR in the Case of Matched Pairs
14.1.3 Attributable Risk – AR in Independent Samples, AR in Matched Pairs, Number Needed to Treat, Relative Risk Reduction, Population Attributable Risk
14.2 Odds Ratio
14.2.1 OR in Two Independent Samples – CI for OR (Independent Samples), Test of Hypothesis on OR (Independent Samples)
14.2.2 OR in Matched Pairs – Confidence Interval for OR (Matched Pairs), Test of Hypothesis on OR (Matched Pairs), Multiple Controls
14.3 Stratified Analysis, Sample Size and Meta-Analysis
14.3.1 Mantel–Haenszel Procedure – Pooled Odds Ratio and Chi-Square
14.3.2 Sample Size Requirement for Statistical Inference on RR and OR
14.3.3 Meta-Analysis
References
Exercises
15 Inference from Means
15.1 Comparison of Means in One and Two Groups (Gaussian Conditions): Student t-Test
15.1.1 Comparison with a Prespecified Mean – Student t-Test for One Sample
15.1.2 Difference in Means in Two Samples – Paired Samples Setup, Unpaired (Independent) Samples Setup, Some Features of Student t, Effect of Unequal n, Difference-in-Differences Approach
15.1.3 Analysis of Crossover Designs – Test for Group Effect, Test for Carry-Over Effect, Test for Treatment Effect
15.1.4 Analysis of Data of Up-and-Down Trials
15.2 Comparison of Means in Three or More Groups (Gaussian Conditions): ANOVA F-Test
15.2.1 One-Way ANOVA – The Procedure to Test H0, Checking the Validity of the Assumptions of ANOVA
15.2.2 Two-Way ANOVA – Two-Factor Design, The Hypotheses and Their Test, Main Effect and Interaction (Effect), Repeated Measures
15.2.3 Repeated Measures – Random Effects versus Fixed Effects, Sphericity and Huynh–Feldt Correction, Repeated Measures versus Two-way ANOVA, Area Under the Concentration Curve
15.2.4 Multiple Comparisons: Bonferroni, Tukey and Dunnett Tests – Intricacies of Multiple Comparisons
15.3 Non-Gaussian Conditions: Nonparametric Tests for Location
15.3.1 Comparison of Two Groups: Wilcoxon Tests – Paired Data, Independent Samples
15.3.2 Comparison of Three or More Groups: Kruskal–Wallis Test
15.3.3 Two-Way Layout: Friedman Test
15.4 When Significant is Not Significant
15.4.1 The Nature of Statistical Significance
15.4.2 Testing for Presence of Medically Important Difference in Means – Detecting Specified Difference in Mean, Equivalence Tests for Means
15.4.3 Power and Level of Significance – Balancing Type-I and Type-II Error
References
Exercises
16 Relationships: Quantitative Data
16.1 Some General Features of a Regression Setup
16.1.1 Dependent and Independent Variables – Simple, Multiple, and Multivariate Regression
16.1.2 Linear, Curvilinear, and Nonlinear Regressions
16.1.3 The Concept of Residuals
16.1.4 General Method of Fitting a Regression
16.2 Linear Regression Models
16.2.1 Adequacy of a Regression Fit – 1 – Goodness of Fit and η², Multiple Correlation in Linear Regression, Stepwise Procedure, Statistical Significance of Individual Regression Coefficients
16.2.2 Adequacy of Regression – 2 – Validity of Assumptions, Choice of Form of Regression, Outliers and Missing Values
16.2.3 Interpretation of the Regression Coefficients – Standardized Coefficients, Other Implications of Regression Models
16.3 Some Issues in Linear Regression
16.3.1 Confidence Interval, Confidence Band, and Tests – SEs and CIs for the Regression, Confidence Band for Simple Linear Regression, Equality of Two Regression Lines, Difference-in-Differences Approach with Regression
16.3.2 Some Variations of Regression – Ridge Regression, Multilevel Regression, Regression Splines, Analysis of Covariance, Some Generalizations
16.4 Measuring the Strength of Quantitative Relationship
16.4.1 Product–Moment and Related Correlations – Multiple Correlation, Product–Moment Correlation, Covariance, Statistical Significance of r, Intraclass Correlation, Serial Correlation
16.4.2 Rank Correlation – Spearman Rho, Kendall Tau
16.5 Assessment of Quantitative Agreement
16.5.1 Agreement in Quantitative Measurements
16.5.2 Approaches for Measuring Quantitative Agreement – Limits of Disagreement Approach, Intraclass Correlation as a Measure of Agreement, Relative Merits of the Two Methods, An Alternative Simple Approach
References
Exercises
17 Relationships: Qualitative Dependent
17.1 Binary Dependent: Logistic Regression (Large n)
17.1.1 Meaning of a Logistic Model
17.1.2 Assessing Overall Adequacy of a Logistic Regression – Log Likelihood, Classification Accuracy, Hosmer–Lemeshow Test
17.2 Inference from Logistic Coefficients
17.2.1 Interpretation of the Logistic Coefficients – Dichotomous Predictors, Polytomous and Continuous Predictors
17.2.2 Confidence Interval and Test of Hypothesis on Logistic Coefficients
17.3 Issues in Logistic Regression
17.3.1 Conditional Logistic for Matched Data
17.3.2 Polytomous Dependent – Nominal Categories: Multinomial Logistic, Ordinal Categories
17.4 Some Models for Qualitative Data and Generalizations
17.4.1 Cox Regression for Hazards
17.4.2 Classification and Regression Trees
17.4.3 Further Generalizations
17.5 Strength of Relationship in Qualitative Variables
17.5.1 Both Variables Qualitative – Dichotomous Categories, Polytomous Categories: Nominal, Proportional Reduction in Error, Polytomous Categories: Ordinal Association
17.5.2 One Qualitative and the Other Quantitative Variable
17.5.3 Agreement in Qualitative Measurements (Matched Pairs) – The Meaning of Qualitative Agreement, Cohen Kappa
References
Exercises
18 Survival Analysis
18.1 Life Expectancy
18.1.1 Life Table
18.1.2 Other Forms of Life Expectancy – Potential Years of Life Lost, Healthy Life Expectancy, Application to Other Setups
18.2 Analysis of Survival Data
18.2.1 Nature of Survival Data – Types of Censoring, Collection of Survival Time Data, Statistical Measures of Survival
18.2.2 Survival Observed in Time Intervals: Life Table Method
18.2.3 Continuous Observation of Survival Time: Kaplan–Meier Method – Using the Survival Curve, Standard Error of Survival Rate, Hazard Function
18.3 Issues in Survival Analysis
18.3.1 Comparison of Survival in Two Groups – Comparing Survival Rates, Comparing Survival Experience: Log-Rank Test
18.3.2 Factors Affecting Survival: Cox Model – Parametric Models, Cox Model for Survival, Proportional Hazards
18.3.3 Sample Size for Survival Studies
References
Exercises
19 Simultaneous Consideration of Several Variables
19.1 Scope of Multivariate Methods
19.1.1 The Essentials of a Multivariate Setup
19.1.2 Statistical Limitation on the Number of Variables
19.2 Dependent and Independent Sets of Variables
19.2.1 Dependents and Independents Both Quantitative: Multivariate Multiple Regression
19.2.2 Quantitative Dependents and Qualitative Independents: Multivariate Analysis of Variance (MANOVA) – MANOVA for Repeated Measures
19.2.3 Classification of Subjects into Known Groups: Discriminant Analysis – Discriminant Function, Classification Rule, Classification Accuracy
19.3 Identification of Structure in the Observations
19.3.1 Identification of Clusters of Subjects: Cluster Analysis – Measures of Similarity, Hierarchical Agglomerative Algorithm, Deciding on the Number of Natural Clusters
19.3.2 Identification of Unobservable Underlying Factors: Factor Analysis – Steps for Factor Analysis, Features of a Successful Factor Analysis, Factor Scores
References
Exercises
20 Quality Considerations
20.1 Statistical Quality Control in Medical Care
20.1.1 Statistical Control of Medical Care Errors – Adverse Patient Outcomes, Monitoring Fatality, Limits of Tolerance
20.1.2 Quality of Lots – The Lot Quality Method, LQAS in Health Assessment
20.1.3 Quality Control in a Medical Laboratory – Control Chart, Cusum Chart, Other Errors in Medical Laboratory, Six Sigma Methodology, Nonstatistical Issues
20.2 Quality of Measurements
20.2.1 Validity of Instruments – Types of Validity
20.2.2 Reliability of Instruments – Internal Consistency, Cronbach Alpha, Test–Retest Reliability
20.3 Quality of Statistical Models: Robustness
20.3.1 External Validation – Split-Sample Method, Another Sample Method
20.3.2 Sensitivity Analysis and Uncertainty Analysis
20.3.3 Resampling – Bootstrapping, Jackknife Resampling
20.4 Quality of Data
20.4.1 Errors in Measurement – Lack of Standardization in Definitions, Lack of Care in Obtaining or Recording Information, Inability of the Observer to Get Confidence of the Respondent, Bias of the Observer, Variable Competence of the Observers
20.4.2 Missing Values – Approaches for Missing Values, Handling Nonresponse, Imputations, Intention-to-Treat Analysis
20.4.3 Lack of Standardization in Values – Standardization Methods Already Described, Standardization for Calculating Adjusted Rates, Standardized Mortality Ratio
References
Exercises
21 Statistical Fallacies
21.1 Problems with the Sample
21.1.1 Biased Sample – Survivors, Volunteers, Clinical Subjects, Publication Bias, Inadequate Specification of Sampling Method, Abrupt Series
21.1.2 Inadequate Size of the Sample – Problems with Calculation of Sample Size
21.1.3 Incomparable Groups – Differential in Group Composition, Differential Definitions, Differential Compliance, Variable Periods of Exposure, Improper Denominator
21.1.4 Mixing of Distinct Groups – Effect on Regression, Effect on Shape of the Distribution, Lack of Intragroup Homogeneity
21.2 Inadequate Analysis
21.2.1 Ignoring Reality – Looking for Linearity, Overlooking Assumptions, Selection of Inappropriate Variables, Area Under the Concentration Curve, Further Problems with Statistical Analysis, Anomalous Person-Years, Problems with Intention-to-Treat Analysis and Equivalence
21.2.2 Choice of Analysis – Mean or Proportion? Forgetting Baseline Values
21.2.3 Misuse of Statistical Packages – Over-Analysis, Data Dredging, Quantitative Analysis of Codes, Soft Data versus Hard Data
21.3 Errors in Presentation of Findings
21.3.1 Misuse of Percentages and Means – Unnecessary Decimals
21.3.2 Problems in Reporting – Incomplete Reporting, Over-Reporting, Selective Reporting, Self-Reporting versus Objective Measurement, Misuse of Graphs
21.4 Misinterpretation
21.4.1 Misuse of P-Values – Magic Threshold 0.05, One-Tail or Two-Tail P-Values, Multiple Comparisons, Dramatic P-Values, P-Values for Nonrandom Sample, “Normal” with Respect to Several Parameters, Absence of Evidence is not Evidence of Absence
21.4.2 Correlation versus Cause–Effect Relationship – Criteria for Cause–Effect, Other Considerations
21.4.3 Sundry Issues – Diagnostic Test is Only an Additional Adjunct, Medical Significance versus Statistical Significance, Interpretation of Standard Error of p, Univariate Analysis but Multivariate Conclusions, Limitation of Relative Risk, Misinterpretation of Improvements
21.4.4 Final Comments
References
Exercises
Brief Solutions and Answers to the Selected Exercises
Appendix A: Statistical Software
A.1 General Purpose Statistical Software
A.2 Special Purpose Statistical Software
Appendix B: Some Statistical Tables
Appendix C: Software Illustrations
C.1 ROC Curves
C.2 Repeated Measures ANOVA
C.3 One-way ANOVA and Tukey Test
C.4 Stepwise Multiple Linear Regression
C.5 Curvilinear Regression
C.6 Analysis of Covariance (ANCOVA)
C.7 Logistic Regression
C.8 Survival Analysis (Life Table Method)
C.9 Cox Proportional Hazards Model
Index
The data sets used in the examples in this text are available as Excel files for download at http://MedicalBiostatistics.synthasite.com. Use them to rework the examples that interest you and to carry out further analysis where needed.