20 Machine Learning

20.1 A Minimal rTorch Book

by Alfonso R. Reyes

With rTorch, you can do practically everything you could with PyTorch within the R ecosystem.

Link: https://f0nzie.github.io/rtorch-minimal-book/

20.2 Behavior Analysis with Machine Learning Using R

by Enrique Garcia Ceja

This book aims to provide an introduction to machine learning concepts and algorithms applied to a diverse set of behavior analysis problems. It focuses on the practical aspects of solving such problems based on data collected from sensors or stored in electronic records. The included examples demonstrate how to perform several of the tasks involved in a data analysis pipeline, such as data exploration, visualization, preprocessing, representation, and model training/validation, all using the R programming language and real-life datasets.

Link: https://enriquegit.github.io/behavior-free/index.html#

20.3 Data Science: Theories, Models, Algorithms, and Analytics

by Sanjiv Ranjan Das

I developed these class notes for my Machine Learning with R course. They trace my evolution as a data scientist into redundancy; I expect I will be replaced by a machine soon!

Link: https://srdas.github.io/MLBook/

20.4 Explanatory Model Analysis

by Przemyslaw Biecek, Tomasz Burzykowski

Responsible, Fair and Explainable Predictive Modeling with examples in R and Python

Link: https://pbiecek.github.io/ema/

20.5 Feature Engineering and Selection: A Practical Approach for Predictive Models

by Max Kuhn, Kjell Johnson

The goals of Feature Engineering and Selection are to provide tools for re-representing predictors, to place these tools in the context of a good predictive modeling framework, and to convey our experience of utilizing these tools in practice.

Link: http://www.feat.engineering/index.html

20.6 Hands-On Machine Learning with R

by Bradley Boehmke, Brandon Greenwell

This book provides hands-on modules for many of the most common machine learning methods, including generalized low rank models, clustering algorithms, autoencoders, regularized models, random forests, gradient boosting machines, deep neural networks, stacking / super learners, and more!

Link: https://bradleyboehmke.github.io/HOML/

20.7 Interpretable Machine Learning

by Christoph Molnar

A Guide for Making Black Box Models Explainable

Online book

Paid: Free or pay what you want $42

Link: https://leanpub.com/interpretable-machine-learning

20.8 Lightweight Machine Learning Classics with R

by Marek Gagolewski

In this book we will take an unpretentious glance at the most fundamental algorithms that have stood the test of time and which form the basis for state-of-the-art solutions of modern AI, which is principally (big) data-driven.

Link: https://lmlcr.gagolewski.com/

20.9 Machine Learning for Factor Investing

by Guillaume Coqueret, Tony Guida

This book is intended to cover some advanced modelling techniques applied to equity investment strategies that are built on firm characteristics.

Link: http://www.mlfactor.com/

20.10 Mathematics and Programming for Machine Learning with R: From the Ground Up (1st Edition, Kindle)

by William B. Claster

Based on the author’s experience in teaching data science for more than 10 years, Mathematics and Programming for Machine Learning with R: From the Ground Up reveals how machine learning algorithms do their magic and explains how these algorithms can be implemented in code. It is designed to provide readers with an understanding of the reasoning behind machine learning algorithms as well as how to program them. Written for novice programmers, the book progresses step-by-step, providing the coding skills needed to implement machine learning algorithms in R.

Paid: $40

Link: https://www.amazon.com/Mathematics-Programming-Machine-Learning-Ground-ebook-dp-B08JHDCX9Y/dp/B08JHDCX9Y

20.11 mlr3 book

by Michel Lang

The mlr3 package and ecosystem provide a generic, object-oriented, and extensible framework for classification, regression, survival analysis, and other machine learning tasks for the R language. They do not implement any learners, but provide a unified interface to many existing learners in R.

Link: https://mlr3book.mlr-org.com/

20.12 sits: Data Analysis and Machine Learning on Earth Observation Data Cubes with Satellite Image Time Series

by Gilberto Camara, Rolf Simoes, Felipe Souza, Alber Sanchez, Lorena Santos, et al.

Using time series derived from big Earth Observation data sets is one of the leading research trends in Land Use Science and Remote Sensing. One of the more promising uses of satellite time series is classifying land use and land cover. Information on land is critical for sustainable development because our growing demand for natural resources is causing significant environmental impacts. The target audience for sits is the new generation of specialists who understand the principles of remote sensing and can write scripts in R. Ideally, users should have basic knowledge of data science methods using R.

This book presents sits, an open-source R package for land use and land cover classification using big Earth observation data.

Link: https://e-sensing.github.io/sitsbook/

20.13 Supervised Machine Learning for Text Analysis in R

by Emil Hvitfeldt, Julia Silge

Modeling as a statistical practice can encompass a wide variety of activities. This book focuses on supervised or predictive modeling for text, using text data to make predictions about the world around us. We use the tidymodels framework for modeling, a consistent and flexible collection of R packages developed to encourage good statistical practice.

Link: https://smltar.com/

20.14 Surrogates: Gaussian process modeling, design and optimization for the applied sciences

by Robert B. Gramacy

Surrogates is a graduate textbook, or professional handbook, on topics at the interface between machine learning, spatial statistics, computer simulation, meta-modeling (i.e., emulation), design of experiments, and optimization. Experimentation through simulation, “human out-of-the-loop” statistical support, management of dynamic processes, online and real-time analysis, automation, and practical application are at the forefront.

Link: https://bookdown.org/rbg/surrogates/

20.15 The caret Package

by Max Kuhn

The caret package (short for Classification And REgression Training) is a set of functions that attempt to streamline the process for creating predictive models.

Link: https://topepo.github.io/caret/index.html

20.16 The Hitchhiker’s Guide to Responsible Machine Learning

by Przemyslaw Biecek, Anna Kozak, Aleksander Zawada

A graphic novel approach to responsible machine learning

Link: https://betaandbit.github.io/RML/

20.17 Tidy Modeling with R

by Max Kuhn, Julia Silge

This book introduces the tidymodels suite of packages for creating models using a tidyverse approach. It encourages good methodology and statistical practice throughout, demonstrated using a series of applied examples.

Link: https://www.tmwr.org/

 

Created and maintained by Oscar Baruffa
