22 Machine Learning

22.1 A Minimal rTorch Book

by Alfonso R. Reyes

Practically, you can do everything you could with PyTorch within the R ecosystem.

Link: https://f0nzie.github.io/rtorch-minimal-book/

22.2 Behavior Analysis with Machine Learning Using R

by Enrique Garcia Ceja

This book aims to provide an introduction to machine learning concepts and algorithms applied to a diverse set of behavior analysis problems. It focuses on the practical aspects of solving such problems based on data collected from sensors or stored in electronic records. The included examples demonstrate how to perform several of the tasks involved in a data analysis pipeline, such as data exploration, visualization, preprocessing, representation, and model training/validation. All of this is done using the R programming language and real-life datasets.

Link: https://enriquegit.github.io/behavior-free/index.html#

22.3 Data Science: Theories, Models, Algorithms, and Analytics

by Sanjiv Ranjan Das

I developed these class notes for my Machine Learning with R course. They trace my evolution as a data scientist into redundancy; I expect I will be replaced by a machine soon!

Link: https://srdas.github.io/MLBook/

22.4 Deep Learning and Scientific Computing with R torch

by Sigrid Keydana

This is a book about torch, the R interface to PyTorch. PyTorch, as of this writing, is one of the major deep-learning and scientific-computing frameworks, widely used across industries and areas of research. With torch, you get to access its rich functionality directly from R, with no need to install, let alone learn, Python.

Link: https://skeydan.github.io/Deep-Learning-and-Scientific-Computing-with-R-torch/

22.5 Explanatory Model Analysis

by Przemyslaw Biecek, Tomasz Burzykowski

Responsible, Fair and Explainable Predictive Modeling with examples in R and Python

Link: https://pbiecek.github.io/ema/

22.6 Feature Engineering A-Z

by Emil Hvitfeldt

This book is written to be used as a reference guide to nearly all feature engineering methods you will encounter. It is designed for people involved in the modeling of data. These can include, but are not limited to, data scientists, students, professors, data analysts and machine learning engineers. The reference-style nature of the book makes it useful for beginners and seasoned professionals alike. A background in the basics of modeling, statistics and machine learning would be helpful. Feature engineering as a practice is tightly connected to the rest of the machine learning pipeline, so knowledge of the other components is key.

Many educational resources skip over the finer details of feature engineering methods, which is where this book tries to fill the gap.

Link: https://feaz-book.com/

22.7 Feature Engineering and Selection: A Practical Approach for Predictive Models

by Max Kuhn, Kjell Johnson

The goals of Feature Engineering and Selection are to provide tools for re-representing predictors, to place these tools in the context of a good predictive modeling framework, and to convey our experience of utilizing these tools in practice.

Link: http://www.feat.engineering/index.html

22.8 Hands-On Machine Learning with R

by Bradley Boehmke, Brandon Greenwell

This book provides hands-on modules for many of the most common machine learning methods, including:

Generalized low rank models, Clustering algorithms, Autoencoders, Regularized models, Random forests, Gradient boosting machines, Deep neural networks, Stacking / super learners and more!

Link: https://bradleyboehmke.github.io/HOML/

22.9 Interpretable Machine Learning

by Christoph Molnar

A Guide for Making Black Box Models Explainable

Online book

Paid: Free or pay what you want $42

Link: https://leanpub.com/interpretable-machine-learning

22.10 Lightweight Machine Learning Classics with R

by Marek Gagolewski

In this book we will take an unpretentious glance at the most fundamental algorithms that have stood the test of time and which form the basis for state-of-the-art solutions of modern AI, which is principally (big) data-driven.

Link: https://lmlcr.gagolewski.com/

22.11 Machine Learning for Factor Investing

by Guillaume Coqueret, Tony Guida

This book is intended to cover some advanced modelling techniques applied to equity investment strategies that are built on firm characteristics.

Link: http://www.mlfactor.com/

22.12 Mathematics and Programming for Machine Learning with R: From the Ground Up

by William B. Claster

Based on the author’s experience in teaching data science for more than 10 years, Mathematics and Programming for Machine Learning with R: From the Ground Up reveals how machine learning algorithms do their magic and explains how these algorithms can be implemented in code. It is designed to provide readers with an understanding of the reasoning behind machine learning algorithms as well as how to program them. Written for novice programmers, the book progresses step-by-step, providing the coding skills needed to implement machine learning algorithms in R.

Paid: $40

Link: https://www.amazon.com/Mathematics-Programming-Machine-Learning-Ground-ebook-dp-B08JHDCX9Y/dp/B08JHDCX9Y

22.13 mlr3 book

by Michel Lang

The mlr3 package and ecosystem provide a generic, object-oriented, and extensible framework for classification, regression, survival analysis, and other machine learning tasks for the R language. They do not implement any learners, but provide a unified interface to many existing learners in R.

Link: https://mlr3book.mlr-org.com/

22.14 Neural Cryptography Using Keras in R

by Michael Harris

This book illustrates a method of using the traditional deep learning-based multi-class classification techniques to hide messages in a matrix of seemingly random numbers. This book is definitely a niche topic and is more of a fun project than something you would want to do for work. The premise is that you can represent characters as a sequence of random numbers you uniquely generate, and with the help of a neural network, a message can be embedded in a matrix of numbers. In the book, I also describe how this method can be used to embed messages in images.

Paid: Free and paid $15

Link: https://www.statswithr.com/neural-cryptography-using-keras-in-r

22.15 Neural Networks with Keras in R: A QuickStart Guide

by Michael Harris

I wrote this book for people who primarily use other statistical software like SPSS or SAS, and want to get started in deep learning with Keras. With this idea in mind, a sizable chunk of the book gives people the prerequisite information they need to start using Keras. I start from the very beginning with assigning variables and end with multi-class classification using deep learning models.

Paid: Free and paid $15

Link: https://www.statswithr.com/neural-networks-with-keras-in-r-a-quickstart-guide

22.16 sits: Data Analysis and Machine Learning on Earth Observation Data Cubes with Satellite Image Time Series

by Gilberto Camara, Rolf Simoes, Felipe Souza, Alber Sanchez, Lorena Santos, et al.

Using time series derived from big Earth Observation data sets is one of the leading research trends in Land Use Science and Remote Sensing. One of the more promising uses of satellite time series is its application to classify land use and land cover. Information on land is critical for sustainable development because our growing demand for natural resources is causing significant environmental impacts. The target audience for sits is the new generation of specialists who understand the principles of remote sensing and can write scripts in R. Ideally, users should have basic knowledge of data science methods using R.

This book presents sits, an open-source R package for land use and land cover classification using big Earth observation data.

Link: https://e-sensing.github.io/sitsbook/

22.17 Supervised Machine Learning for Text Analysis in R

by Emil Hvitfeldt, Julia Silge

Modeling as a statistical practice can encompass a wide variety of activities. This book focuses on supervised or predictive modeling for text, using text data to make predictions about the world around us. We use the tidymodels framework for modeling, a consistent and flexible collection of R packages developed to encourage good statistical practice.

Link: https://smltar.com/

22.18 Surrogates: Gaussian process modeling, design and optimization for the applied sciences

by Robert B. Gramacy

Surrogates is a graduate textbook, or professional handbook, on topics at the interface between machine learning, spatial statistics, computer simulation, meta-modeling (i.e., emulation), design of experiments, and optimization. Experimentation through simulation, “human out-of-the-loop” statistical support, management of dynamic processes, online and real-time analysis, automation, and practical application are at the forefront.

Link: https://bookdown.org/rbg/surrogates/

22.19 The caret Package

by Max Kuhn

The caret package (short for Classification And REgression Training) is a set of functions that attempt to streamline the process for creating predictive models.

Link: https://topepo.github.io/caret/index.html

22.20 The Hitchhiker’s Guide to Responsible Machine Learning

by Przemyslaw Biecek, Anna Kozak, Aleksander Zawada

A graphic novel approach to responsible machine learning

Link: https://betaandbit.github.io/RML/

22.21 Tidy Modeling with R

by Max Kuhn, Julia Silge

This book introduces the tidymodels suite of packages for creating models with a tidyverse approach. It encourages good methodology and statistical practice throughout, demonstrated using a series of applied examples.

Link: https://www.tmwr.org/

Created and maintained by Oscar Baruffa.