17 Getting, cleaning and wrangling data

17.1 21 Recipes for Mining Twitter Data with rtweet

by Bob Rudis

The recipes contained in this book use the rtweet package by Michael W. Kearney.

Link: https://rud.is/books/21-recipes/

17.2 A Beginner’s Guide to Clean Data

by Benjamin Greve

This book will help you become a better data scientist by showing you the things that can go wrong when working with data, particularly low-quality data. A key difference between a junior and a senior data scientist is the awareness of potential pitfalls. The experienced data scientist will expect them, navigate around them and avoid costly iteration cycles. After reading this book, you will be able to spot data quality problems and deal with them before they can break your work, saving yourself a lot of time.

Link: https://b-greve.gitbook.io/beginners-guide-to-clean-data/

17.3 Spreadsheet Munging Strategies

by Duncan Garmonsway

This is a work-in-progress book about getting data out of spreadsheets, no matter how peculiar. The book is designed primarily for R users who have to extract data from spreadsheets and who are already familiar with the tidyverse. It has a cookbook structure, and can be used as a reference, but readers who begin in the middle might have to work backwards from time to time.

Link: https://nacnudus.github.io/spreadsheet-munging-strategies/

17.4 Text Mining with R

by Julia Silge, David Robinson

This book serves as an introduction to text mining using the tidytext package and other tidy tools in R. The functions provided by the tidytext package are relatively simple; what is important are the possible applications. Thus, this book provides compelling examples of real text mining problems.

Link: https://www.tidytextmining.com/

17.5 Text Mining With Tidy Data Principles

by Julia Silge

Text data sets are diverse and ubiquitous, and tidy data principles provide an approach to make text mining easier, more effective, and consistent with tools already in wide use. In this tutorial, you will develop your text mining skills using the tidytext package in R, along with other tidyverse tools.

Link: https://juliasilge.shinyapps.io/learntidytext/


Created and maintained by Oscar Baruffa
