25 Statistics

25.1 Answering questions with data

Matthew J. Crump

This is a free textbook teaching introductory statistics for undergraduates in Psychology. This textbook is part of a larger OER course package for teaching undergraduate statistics in Psychology, including this textbook, a lab manual, and a course website.

Looks like a comprehensive stats resource!


25.2 Bayes rules!

The primary goal of Bayes Rules! is to make modern Bayesian thinking, modeling, and computing accessible to a broad audience. Bayes Rules! empowers readers to weave Bayesian approaches into an everyday modern practice of statistics and data science.
The overall spirit is very applied: the book utilizes modern computing resources and a reproducible pipeline; the discussion emphasizes conceptual understanding; the material is motivated by data-driven inquiry; and the delivery blends traditional “content” with “activity”.

Free online book, still under construction but with 5 complete chapters as of 2020/10/15.


25.3 Doing meta-analysis with R: A hands-on guide

Mathias Harrer, Pim Cuijpers, Toshi A. Furukawa, David D. Ebert

This book serves as an accessible introduction to how meta-analyses can be conducted in R. Essential steps for meta-analysis are covered, including pooling of outcome measures, forest plots, heterogeneity diagnostics, subgroup analyses, meta-regression, methods to control for publication bias, risk of bias assessments and plotting tools.

Advanced, but highly relevant topics such as network meta-analysis, multi-/three-level meta-analyses, Bayesian meta-analysis approaches, SEM meta-analysis are also covered.


25.4 Using R for Bayesian Spatial and Spatio-Temporal Health Modeling

By Andrew B. Lawson

Progressively more and more attention has been paid to how location affects health outcomes. The area of disease mapping focusses on these problems, and the Bayesian paradigm has a major role to play in the understanding of the complex interplay of context and individual predisposition in such studies of disease. Using R for Bayesian Spatial and Spatio-Temporal Health Modeling provides a major resource for those interested in applying Bayesian methodology in small area health data studies.

Paid ~$100 https://www.routledge.com/Using-R-for-Bayesian-Spatial-and-Spatio-Temporal-Health-Modeling/Lawson/p/book/9780367490126

25.5 A Business Analyst’s Introduction to Business Analytics: Intro to Bayesian Business Analytics in the R Ecosystem

Adam Fleischhacker

This textbook goes farther than just showing you how to make computational models using software or mathematical models using statistics. It guides your thinking so you can align computational and mathematical models with real-world scenarios. As you journey through the material, you will feel empowered to effectively collaborate with business stakeholders as you use modern software stacks and modern statistical workflows to discover insight. R, RStudio, dplyr for data manipulation, ggplot for data visualization, causact for graphical models, and Bayesian data analysis feature prominently.

The full-color book ($68) is available via Amazon: https://www.amazon.com/dp/B08DBYPRD2 and online (free) at: http://causact.com. Video supplements for all chapters available on YouTube.

25.6 Common statistical tests are linear models: a work through

Steve Doogue

This is a reworking of the book Common statistical tests are linear models (or: how to teach stats), written by Jonas Lindeløv. The book beautifully demonstrates how many common statistical tests (such as the t-test, ANOVA and chi-squared) are special cases of the linear model. The book also demonstrates that many non-parametric tests, which are needed when certain test assumptions do not hold, can be approximated by linear models using the rank of values.
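To make the central claim concrete, here is a minimal sketch of one such equivalence: the equal-variance two-sample t-test gives the same t statistic as the slope of a least-squares line fitted to a 0/1 group indicator. The sketch is written in plain Python rather than the R used in the book, and the function names and data values are made up for illustration.

```python
import math

def pooled_t(a, b):
    """Classic two-sample t statistic with pooled (equal) variance."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    ssa = sum((v - ma) ** 2 for v in a)
    ssb = sum((v - mb) ** 2 for v in b)
    sp2 = (ssa + ssb) / (na + nb - 2)  # pooled variance estimate
    return (mb - ma) / math.sqrt(sp2 * (1 / na + 1 / nb))

def slope_t(x, y):
    """t statistic for the slope of the least-squares fit y = b0 + b1 * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    rss = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(rss / (n - 2) / sxx)  # standard error of the slope
    return b1 / se_b1

# Hypothetical measurements for two groups (illustrative values only).
group_a = [4.1, 5.0, 5.6, 4.8, 5.2]
group_b = [5.9, 6.4, 5.5, 6.8, 6.1]

# Linear-model view: regress the outcome on a dummy variable (0 = A, 1 = B).
x = [0] * len(group_a) + [1] * len(group_b)
y = group_a + group_b

t_classic = pooled_t(group_a, group_b)
t_linear = slope_t(x, y)
print(t_classic, t_linear)  # the two statistics agree up to rounding error
```

With a 0/1 predictor the fitted slope is the difference in group means and the residual variance is the pooled variance, which is why the two statistics coincide.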


25.7 The Effect: An Introduction to Research Design and Causality

Nick Huntington-Klein

The Effect is a book intended to introduce students (and non-students) to the concepts of research design and causality in the context of observational data. The book is written in an intuitive and approachable way and doesn’t overload on technical detail. Why teach regression and research design at the same time when they are fundamentally different things? First learn why you want to structure a design in a certain way, and what it is you want to do to the data, and then afterwards learn the technical details of how to run the appropriate model.


25.8 Foundations of Statistics with R

Darrin Speegle and Bryan Clair

This book represents a fundamental rethinking of a calculus based first course in probability and statistics. We offer a breadth first approach, where the fundamentals of probability and statistics can be taught in one semester. The statistical programming language R plays an essential role throughout the text through simulations, data wrangling, visualizations and statistical procedures. Data sets from a variety of sources, including many from recent, open source scientific articles, are used in examples and exercises. Demonstrations of important facts are given through simulations, with some formal mathematical proofs as well.

This book is an excellent choice for students studying data science, statistics, engineering, computer science, mathematics, science, business, or any field which requires the two semesters of calculus needed to read this book.


25.9 Handbook of Regression Modeling in People Analytics

Keith McNulty

It is the author’s firm belief that all people analytics professionals should have a strong understanding of regression models and how to implement and interpret them in practice, and the aim with this book is to provide those who need it with help in getting there.


25.10 Learning statistics with R: A tutorial for psychology students and other beginners. (Version 0.6.1)

Danielle Navarro

Learning Statistics with R covers the contents of an introductory statistics class, as typically taught to undergraduate psychology students, focusing on the use of the R statistical software. The book discusses how to get started in R as well as giving an introduction to data manipulation and writing scripts. From a statistical perspective, the book discusses descriptive statistics and graphing first, followed by chapters on probability theory, sampling and estimation, and null hypothesis testing. After introducing the theory, the book covers the analysis of contingency tables, t-tests, ANOVAs and regression. Bayesian statistics are covered at the end of the book.

The book is free online.


25.11 Mixed Models with R : Getting started with random effects

Michael Clark

Mixed models are an extremely useful modeling tool for situations in which there is some dependency among observations in the data, where the correlation typically arises from the observations being clustered in some way.
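One common way to quantify the dependency that clustering creates is the intraclass correlation (ICC). The sketch below is not taken from the book: it uses plain Python, made-up data, and the one-way ANOVA estimator for balanced clusters, and the function name is hypothetical.

```python
def icc_oneway(clusters):
    """One-way ANOVA intraclass correlation, ICC(1), for equal-sized clusters."""
    g = len(clusters)        # number of clusters
    k = len(clusters[0])     # observations per cluster (assumed equal)
    means = [sum(c) / k for c in clusters]
    grand = sum(means) / g   # grand mean (valid because clusters are balanced)
    # Between-cluster and within-cluster mean squares.
    msb = k * sum((m - grand) ** 2 for m in means) / (g - 1)
    msw = sum((v - m) ** 2 for c, m in zip(clusters, means) for v in c) / (g * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

# Hypothetical scores for three clusters (e.g. classrooms) of four observations.
scores = [
    [10.1, 9.8, 10.4, 10.0],
    [12.2, 12.5, 11.9, 12.1],
    [8.0, 8.3, 7.7, 8.1],
]
icc = icc_oneway(scores)
print(round(icc, 3))
```

An ICC near 1 means most of the variation lies between clusters, so observations within a cluster are far from independent; this is the situation in which a random-effects term is typically worth considering.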


25.12 An Introduction to Statistical and Data Sciences via R

Chester Ismay and Albert Kim

An incredibly beginner-friendly introduction to both data science and statistics concepts, as well as R.

The book is free to read online.


25.13 An Introduction to Statistical Learning

Gareth James, Daniela Witten, Trevor Hastie, Rob Tibshirani

As the scale and scope of data collection continue to increase across virtually all fields, statistical learning has become a critical toolkit for anyone who wishes to understand data. An Introduction to Statistical Learning provides a broad and less technical treatment of key topics in statistical learning. Each chapter includes an R lab. This book is appropriate for anyone who wishes to use contemporary tools for data analysis.


25.14 ISLR tidymodels Labs

Emil Hvitfeldt

This book aims to complement the first edition of An Introduction to Statistical Learning by translating the labs to use the tidymodels set of packages.

The labs will be mirrored quite closely to stay true to the original material.


25.15 Statistical Rethinking

A Bayesian Course with Examples in R and Stan

Statistical Rethinking: A Bayesian Course with Examples in R and Stan builds your knowledge of and confidence in making inferences from data. Reflecting the need for scripting in today’s model-based statistics, the book pushes you to perform step-by-step calculations that are usually automated. This unique computational approach ensures that you understand enough of the details to make reasonable choices and interpretations in your own modeling work.


25.16 Statistical Rethinking with brms, ggplot2, and the tidyverse: Second edition

A Solomon Kurz

This ebook is based on the second edition of Richard McElreath’s (2020) text, Statistical rethinking: A Bayesian course with examples in R and Stan. My contributions show how to fit the models he covered with Paul Bürkner’s brms package, which makes it easy to fit Bayesian regression models in R using Hamiltonian Monte Carlo. I also prefer plotting and data wrangling with the packages from the tidyverse. So we’ll be using those methods, too.


25.17 OpenIntro Statistics

David Diez, Mine Cetinkaya-Rundel, Christopher Barr, and OpenIntro.

A complete foundation for Statistics, also serving as a foundation for Data Science.

Leanpub revenue supports OpenIntro (US-based nonprofit) so we can provide free desk copies to teachers interested in using OpenIntro Statistics in the classroom and expand the project to support free textbooks in other subjects.

More resources: openintro.org.

Pay what you want for the ebook (minimum $0.00). If you are able to, please consider the cause above. Thanks!


25.18 One Way ANOVA with R: Completely Randomized Design - Between Groups

Bruce Dudek

This document can be a standalone “how-to” document for R users. However, it is primarily intended for students in the APSY510/511 statistics sequence at the University at Albany. It is a fairly thorough treatment of graphical and inferential evaluation of one-factor designs. It presumes prior background coverage of the ANOVA logic from standard textbooks such as Howell or Maxwell, Delaney and Kelley (2017). The analyses are intended to parallel and exhaust the methods already covered with SPSS, and to extend them to additional topics.


25.19 Introduction to Modern Statistics

Mine Çetinkaya-Rundel, Johanna Hardin

We hope readers will take away three ideas from this book in addition to forming a foundation of statistical thinking and methods.

  1. Statistics is an applied field with a wide range of practical applications.
  2. You don’t have to be a math guru to learn from interesting, real data.
  3. Data are messy, and statistical tools are imperfect. However, when you understand the strengths and weaknesses of these tools, you can use them to learn interesting things about the world.


25.20 Statistical inference for data science

Brian Caffo

This book gives a brief, but rigorous, treatment of statistical inference intended for practicing Data Scientists.

Pay what you want for the ebook, minimum $0.00


25.21 Statistics (The Easier Way) With R, 3rd. Ed. (TIDYVERSION)

Nicole Radziwill

This introductory applied statistics handbook shows you how to run tests analytically, and then how to run exactly the same steps using R. No steps are skipped, making this particularly well suited for beginners or people who need a quick lookup. Used at 30+ universities around the globe.

https://amzn.to/3b9ha8s - varies between $37-43 & you can request free PDF after your order
https://www.e-junkie.com/ecom/gb.php?&c=single&cl=147256&i=1614407 - $25 for PDF only

25.22 End-to-End Solved Problems With R: a catalog of 26 examples using statistical inference

Nicole Radziwill

Lots of worked problems, analytically and in R! Useful supplement for an introductory applied stats class.

https://amzn.to/2EREAn2 - used for $4-18, new $19-20
https://www.e-junkie.com/ecom/gb.php?c=single&cl=147256&i=1548704 - $10 for PDF only

25.23 Statistics and Data with R: An Applied Approach Through Examples

Yosef Cohen and Jeremiah Y. Cohen

R, an Open Source software, has become the de facto statistical computing environment. It has an excellent collection of data manipulation and graphics capabilities. It is extensible and comes with a large number of packages that allow statistical analysis at all levels – from simple to advanced – and in numerous fields including Medicine, Genetics, Biology, Environmental Sciences, Geology, Social Sciences and much more. The software is maintained and developed by academicians and professionals and as such, is continuously evolving and up to date. Statistics and Data with R presents an accessible guide to data manipulations, statistical analysis and graphics using R.

The E-Book costs $97.00 while the print version costs $121.75




A delightful series of beautifully illustrated modules to learn statistics and R coding for students, scientists, and stats-enthusiasts.


25.25 Modern Statistics with R

Måns Thulin

This book covers the fundamentals of data science and statistics. The first half deals with the basics of R and R coding, data wrangling, exploratory data analysis and more advanced programming. The second half deals with modern statistics (favouring permutation tests, the bootstrap and Bayesian methods over traditional asymptotic methods), regression models and predictive modelling. It also contains information about debugging and explanations of 25 commonly encountered error messages in R. In addition, there are 170 or so exercises with fully worked solutions.


25.26 Foundations of Statistics with R

Darrin Speegle

This book represents a fundamental rethinking of a calculus based first course in probability and statistics. We offer a breadth first approach, where the fundamentals of probability and statistics can be taught in one semester. The statistical programming language R plays an essential role throughout the text through simulations, data wrangling, visualizations and statistical procedures. Data sets from a variety of sources, including many from recent, open source scientific articles, are used in examples and exercises. Demonstrations of important facts are given through simulations, with some formal mathematical proofs as well.


25.27 Statistical Thinking in the 21st Century

Russell Poldrack

This textbook aims to cover modern methods that take advantage of today’s increased computing power while keeping the material accessible. It is a good fit for students who do not want to wade through a lot of story to get to the statistical knowledge, as they would when reading Andy Field’s graphic novel statistics book, “An Adventure in Statistics”.

The main site below has companion sites in R and Python: