This is fourth and final part of the overview for rstudio::conf2018. It took longer than I’ve anticipated, but at least I’ve finished before rstudio::conf2019 :).
Part 1 is here
Part 2 is here
Part 3 is here
This talk is an overview of the roadmap for the modeling packages in the tidyverse. The main idea is quite straightforward. Specifically, the suite of packages for modeling should be seen as a way to specify what you want in a declarative way and delay actual work as much as possible. The reason why this is important is that makes it possible to have a declarative syntax for model specification, while actual engine can be anything at all - TensorFlow, Python, Stan, R, Spark etc. In that sense it similar to pairing of
dbplyr and SQL engines.
I’m certainly looking forward to more developments in that area.
This talk is a case-study where Capital One bank introduced tidyverse in order to standardize and improve workflow of their analysts. It provides an interesting view of how one can achieve similar results in a different setting.
Very interesting talk by Carson about how
plotly can be used to increase velocity of insight generation into data. With interactive graphics, analyst can ask and answer questions from the data rapidly, thus allowing her to create a better mental model of the process that is currently under investigation.
Carson is a maintainer of
plotly package and he also published a book where he documented multiple ways
plotly can be used to create interactive web graphics.
Data rectangling as presented by Jenny Bryan is an art of wrangling data while staying inside of cozy confines of tibbles. It isn’t always obvious how certain operation can be done inside of a tibble, but
purrr sure helps. Moreover, working with list-columns (and lists in general) is a very nice skill to have since lots of data can be in this form (most notably API calls to external web-services).
Unpacking vectors is a way to put multiple objects on the left-hand side of
<- operator. It is supported natively, by, e.g., Python, so often times people tend to miss it when they move to R.
zeallot is a package that fixes this situation by providing
%<-% operator that gives you this facility. Another implementation is available in
dub package by my former colleague Eugene Ha.
Couple of interesting tidbits of information about debugging in RStudio (and in R in general). Amanda talked about debugging in source files, packages, and Shiny applications. They all share certain things about debugging, but also have differences that needs to be kept in mind to make your life easier whenever there is a need for deep dive on misbehaving functions.
This talk goes over using R in three using environments: SQL, Bash, and Python. The main point that Aaron made is that this polyglot nature of R makes it a rather nifty tool to make it a centralized place to foster collaboration in the, perhaps, multilingual team you might have at your workplace.
Practical talk outlining steps of creating branded documents (Word, PDF, slides, PPT, etc). I’ve had to do very similar things multiple times already and my experience is aligned with what they went through. Once you can abstract away to have a template, creating multiple reports with all sorts of seemingly complicated things is not as complicated. It for sure beats creating all of those things by hand in a Word document.
Tidy eval is a very interesting topic in its own right. On top of that it tends to be fairly complicated to wrap your mind around, so there are probably multiple talks that need to be given in order for the community to have a solid grasp on the idea. As explained in this talk, one of the biggest reasons tidy eval is even needed is because everything in R is an expression (with its corresponding AST) and that allows anyone to modify AST any way you please. SQL <>
dplyr interaction is a good example of why this is useful. With the talk from Max Kuhn it can be also useful in modeling. Moreover, this idea comes all the way from 1940’s and already proven to work in, e.g., Lisp. So, while tidy eval might indeed be a bit complicated, the payoff is also quite large.