Numerical issues in statistical computing for the social scientist
Author / Creator: Altman, Micah.
Imprint: Hoboken, N.J. : Wiley-Interscience, c2004.
Description: xv, 323 p. : ill., map ; 25 cm.
Language: English
Series: Wiley series in probability and statistics
Format: Print Book
URL for this record: http://pi.lib.uchicago.edu/1001/cat/bib/5054043
Table of Contents:
- Preface
- 1. Introduction: Consequences of Numerical Inaccuracy
- 1.1. Importance of Understanding Computational Statistics
- 1.2. Brief History: Duhem to the Twenty-First Century
- 1.3. Motivating Example: Rare Events Counts Models
- 1.4. Preview of Findings
- 2. Sources of Inaccuracy in Statistical Computation
- 2.1. Introduction
- 2.1.1. Revealing Example: Computing the Coefficient Standard Deviation
- 2.1.2. Some Preliminary Conclusions
- 2.2. Fundamental Theoretical Concepts
- 2.2.1. Accuracy and Precision
- 2.2.2. Problems, Algorithms, and Implementations
- 2.3. Accuracy and Correct Inference
- 2.3.1. Brief Digression: Why Statistical Inference Is Harder in Practice Than It Appears
- 2.4. Sources of Implementation Errors
- 2.4.1. Bugs, Errors, and Annoyances
- 2.4.2. Computer Arithmetic
- 2.5. Algorithmic Limitations
- 2.5.1. Randomized Algorithms
- 2.5.2. Approximation Algorithms for Statistical Functions
- 2.5.3. Heuristic Algorithms for Random Number Generation
- 2.5.4. Local Search Algorithms
- 2.6. Summary
- 3. Evaluating Statistical Software
- 3.1. Introduction
- 3.1.1. Strategies for Evaluating Accuracy
- 3.1.2. Conditioning
- 3.2. Benchmarks for Statistical Packages
- 3.2.1. NIST Statistical Reference Datasets
- 3.2.2. Benchmarking Nonlinear Problems with StRD
- 3.2.3. Analyzing StRD Test Results
- 3.2.4. Empirical Tests of Pseudo-Random Number Generation
- 3.2.5. Tests of Distribution Functions
- 3.2.6. Testing the Accuracy of Data Input and Output
- 3.3. General Features Supporting Accurate and Reproducible Results
- 3.4. Comparison of Some Popular Statistical Packages
- 3.5. Reproduction of Research
- 3.6. Choosing a Statistical Package
- 4. Robust Inference
- 4.1. Introduction
- 4.2. Some Clarification of Terminology
- 4.3. Sensitivity Tests
- 4.3.1. Sensitivity to Alternative Implementations and Algorithms
- 4.3.2. Perturbation Tests
- 4.3.3. Tests of Global Optimality
- 4.4. Obtaining More Accurate Results
- 4.4.1. High-Precision Mathematical Libraries
- 4.4.2. Increasing the Precision of Intermediate Calculations
- 4.4.3. Selecting Optimization Methods
- 4.5. Inference for Computationally Difficult Problems
- 4.5.1. Obtaining Confidence Intervals with Ill-Behaved Functions
- 4.5.2. Interpreting Results in the Presence of Multiple Modes
- 4.5.3. Inference in the Presence of Instability
- 5. Numerical Issues in Markov Chain Monte Carlo Estimation
- 5.1. Introduction
- 5.2. Background and History
- 5.3. Essential Markov Chain Theory
- 5.3.1. Measure and Probability Preliminaries
- 5.3.2. Markov Chain Properties
- 5.3.3. The Final Word (Sort of)
- 5.4. Mechanics of Common MCMC Algorithms
- 5.4.1. Metropolis-Hastings Algorithm
- 5.4.2. Hit-and-Run Algorithm
- 5.4.3. Gibbs Sampler
- 5.5. Role of Random Number Generation
- 5.5.1. Periodicity of Generators and MCMC Effects
- 5.5.2. Periodicity and Convergence
- 5.5.3. Example: The Slice Sampler
- 5.5.4. Evaluating WinBUGS
- 5.6. Absorbing State Problem
- 5.7. Regular Monte Carlo Simulation
- 5.8. So What Can Be Done?
- 6. Numerical Issues Involved in Inverting Hessian Matrices
- 6.1. Introduction
- 6.2. Means versus Modes
- 6.3. Developing a Solution Using Bayesian Simulation Tools
- 6.4. What Is It That Bayesians Do?
- 6.5. Problem in Detail: Noninvertible Hessians
- 6.6. Generalized Inverse/Generalized Cholesky Solution
- 6.7. Generalized Inverse
- 6.7.1. Numerical Examples of the Generalized Inverse
- 6.8. Generalized Cholesky Decomposition
- 6.8.1. Standard Algorithm
- 6.8.2. Gill-Murray Cholesky Factorization
- 6.8.3. Schnabel-Eskow Cholesky Factorization
- 6.8.4. Numerical Examples of the Generalized Cholesky Decomposition
- 6.9. Importance Sampling and Sampling Importance Resampling
- 6.9.1. Algorithm Details
- 6.9.2. SIR Output
- 6.9.3. Relevance to the Generalized Process
- 6.10. Public Policy Analysis Example
- 6.10.1. Texas
- 6.10.2. Florida
- 6.11. Alternative Methods
- 6.11.1. Drawing from the Singular Normal
- 6.11.2. Aliasing
- 6.11.3. Ridge Regression
- 6.11.4. Derivative Approach
- 6.11.5. Bootstrapping
- 6.11.6. Respecification (Redux)
- 6.12. Concluding Remarks
- 7. Numerical Behavior of King's EI Method
- 7.1. Introduction
- 7.2. Ecological Inference Problem and Proposed Solutions
- 7.3. Numeric Accuracy in Ecological Inference
- 7.3.1. Case Study 1: Examples from King (1997)
- 7.3.2. Nonlinear Optimization
- 7.3.3. Pseudo-Random Number Generation
- 7.3.4. Platform and Version Sensitivity
- 7.4. Case Study 2: Burden and Kimball (1998)
- 7.4.1. Data Perturbation
- 7.4.2. Option Dependence
- 7.4.3. Platform Dependence
- 7.4.4. Discussion: Summarizing Uncertainty
- 7.5. Conclusions
- 8. Some Details of Nonlinear Estimation
- 8.1. Introduction
- 8.2. Overview of Algorithms
- 8.3. Some Numerical Details
- 8.4. What Can Go Wrong?
- 8.5. Four Steps
- 8.5.1. Step 1: Examine the Gradient
- 8.5.2. Step 2: Inspect the Trace
- 8.5.3. Step 3: Analyze the Hessian
- 8.5.4. Step 4: Profile the Objective Function
- 8.6. Wald versus Likelihood Inference
- 8.7. Conclusions
- 9. Spatial Regression Models
- 9.1. Introduction
- 9.2. Sample Data Associated with Map Locations
- 9.2.1. Spatial Dependence
- 9.2.2. Specifying Dependence Using Weight Matrices
- 9.2.3. Estimation Consequences of Spatial Dependence
- 9.3. Maximum Likelihood Estimation of Spatial Models
- 9.3.1. Sparse Matrix Algorithms
- 9.3.2. Vectorization of the Optimization Problem
- 9.3.3. Trade-offs between Speed and Numerical Accuracy
- 9.3.4. Applied Illustrations
- 9.4. Bayesian Spatial Regression Models
- 9.4.1. Bayesian Heteroscedastic Spatial Models
- 9.4.2. Estimation of Bayesian Spatial Models
- 9.4.3. Conditional Distributions for the SAR Model
- 9.4.4. MCMC Sampler
- 9.4.5. Illustration of the Bayesian Model
- 9.5. Conclusions
- 10. Convergence Problems in Logistic Regression
- 10.1. Introduction
- 10.2. Overview of Logistic Maximum Likelihood Estimation
- 10.3. What Can Go Wrong?
- 10.4. Behavior of the Newton-Raphson Algorithm under Separation
- 10.4.1. Specific Implementations
- 10.4.2. Warning Messages
- 10.4.3. False Convergence
- 10.4.4. Reporting of Parameter Estimates and Standard Errors
- 10.4.5. Likelihood Ratio Statistics
- 10.5. Diagnosis of Separation Problems
- 10.6. Solutions for Quasi-Complete Separation
- 10.6.1. Deletion of Problem Variables
- 10.6.2. Combining Categories
- 10.6.3. Do Nothing and Report Likelihood Ratio Chi-Squares
- 10.6.4. Exact Inference
- 10.6.5. Bayesian Estimation
- 10.6.6. Penalized Maximum Likelihood Estimation
- 10.7. Solutions for Complete Separation
- 10.8. Extensions
- 11. Recommendations for Replication and Accurate Analysis
- 11.1. General Recommendations for Replication
- 11.1.1. Reproduction, Replication, and Verification
- 11.1.2. Recreating Data
- 11.1.3. Inputting Data
- 11.1.4. Analyzing Data
- 11.2. Recommendations for Producing Verifiable Results
- 11.3. General Recommendations for Improving the Numeric Accuracy of Analysis
- 11.4. Recommendations for Particular Statistical Models
- 11.4.1. Nonlinear Least Squares and Maximum Likelihood
- 11.4.2. Robust Hessian Inversion
- 11.4.3. MCMC Estimation
- 11.4.4. Logistic Regression
- 11.4.5. Spatial Regression
- 11.5. Where Do We Go from Here?
- Bibliography
- Author Index
- Subject Index