Part 2 is here
Part 3 is here
Part 4 is here
I’ve done review of useR!2017 conference before, so I wanted to continue the trend and give an overview of rstudio::conf 2018. It happened all the way in January and I’ve had it in my list of things to do since then. Well, the time has come!
I’ve had the same problem with useR!2017, namely, the fact that there are too many good talks. In this instance, it is even more pronounced since almost every talk is somewhat interesting to me. Going over all of them at once is a bit too much, so I’ll split it in two parts.
The talk is about increasing empathy to people who might be affected by your analysis. For example, in agile software development there is a notion of “user story” that is designed to put yourself into the shoes of the end user of a certain feature. Why they want to use? How they want to use? This allows for richer context since any problem can be solved in the multitude of ways and what should drive the final solution shouldn’t be based solely on techical limitations, but for the most part based on needs and wants of people who are going to use your data analysis/software.
Another concept that helps underline the empathy is a concept of “near” and “far”. Near here means something that makes sense to you, while far can be thought of as a way to contextualize someone else’s experience. The tricky part with empathy is to find a balance between those two things since, as mentioned in the talk, empathy is by definition going to be biased towards your own experiences. It is also unreasonable to expect to have unbounded empathy towards every single person you meet, so striking a balance is a task that need to be performed by each person.
As is obvious from the name of the talk, Mara talked about how one can contribute to not necessarily tidyverse, but to any FOSS you can find. The main steps are more or less self-evident: take a look first (e.g., watch issues for couple of weeks), find how you can contribute, make baby steps at first (don’t change literally every single function in a package) and discuss your ideas first so that no work is wasted.
One good approach from the talk to make FOSS sustainable is to find a way to be selfish in such a way that you help other people. For example, if documentation for the function you use is not clear and you have to go through multiple steps to grok it, there is a good chance that it is the same for other people. Therefore, you might want to change it and make your life easier, while also improving other people lives as well.
The thrust of the talk is to promote better documentation by package writers. The idea is that if you are putting a package out there (be it CRAN or GitHub or whatever) you should spend a little (or a lot) of time making your package approachable. Writing vignettes, writing a blog post which explains an essence of the package goes a long way to make others see the value of the package you spend a lot of time writing.
The talk covered latest development in DBI package that implements a DBI specification for different DB backends. One interesting thing mentioned in the talk is RKazam package that is not a backend, but rather a skeleton that provides anyone wanting to write a DB backend with multiple tests for how your package should behave. The idea is that it allows you to concentrate on implementation of the backend itself while making sure that it stays DBI compliant.
Talk briefly covered some of the advantages of using
dplyr while working with remote databases. The idea is that often amount of data stored in the DB is so large that it makes it impossible to load it into memory locally. With
dplyr (or, more accurately, with
dbplyr) you can actually create a connection handle that
dplyr will automagically translate into SQL in the dialect of the DB you are using. On top of that, there is already some work being done with a package such as
dbplot that will create, for example, histogram inside of the database and transfer only summarized data back to local session in R.
Same approach can be used with Shiny with the help of
pool package. It allows to abstract away database connection to a pool of connection that will be handled automatically. Then you can write code that will summarize data in a remote DB and show only the needed info to the user of a Shiny dashboard.
Another promising thing that eventually (I hope) make its way into RStudio is support for DBI-compliant packages inside of “Connections” pane. Right now it is limited to
odbc, but having all of DBI packages would be fantastic. According to Edgar, such support is coming.
This talk is an introduction to
plumber package that allows to easily convert R functions into REST API. I’ve already done some work with it and it tends to work pretty well in practice. Couple of new things are the fact that at some point
plumber will be supported by RStudio itself, so it’ll be even easier to deploy API based on this framework and that they already have done some work to create provisioning functions for Digital Ocean and, obviously, for RStudio Connect. There is a webinar planned on this topic in the near future, so I’ll be very interested to learn about the progress.
This talk introduced three stages of R adoption: black box, interactive tools, fully integrated. At each stage RStudion Connect can be used to make lifes of end consumers (data scientists or any other person). RStudio Connect is developing rapidly, so it’ll be interesting to see where it goes.
Very short and to the point talk demonstrating the parametrized RMarkdown reports that one can host with RStudio Connect for added interactivity.
In this talk Bárbara presented 3 concepts of drilling through (e.g., looking at different monhts), drilling down (e.g., looking at specific category) and drilling up (e.g., aggregating data with a specific function). All three actions are possible with Shiny and she also shared the code of the app where you can take a look at how it’s done.
This talk was about a nascent role of the R Admin who is a missing link between Data Science team and IT Operations team in an organization. His or her role is to legitimize R as a tool for data analysis using, e.g., RStudio Connect to deploy artifacts that are going to serve different needs of stakeholders. For example, Shiny app for the data analysts, parametrized reports for management and
plumber APIs for anyone who want to interact with your artefacts using REST.
This talk was presented by Herman Sontrop form Friss Analytics. They have published a series of tutorials that show you how to create a dashboard with Shiny using HTMLTemplates and HTMLWidgets.
He also mentioned that large-scale Shiny applications are possible to develop with modules that separate big Shiny app into manageable chunks. Already mentioned HTMLTemplates and HTMLWidgets are two more tools that help separate concerns and not create too much complexity while creating really slick dashboards. Some of the examples he showed looked as if it has nothing to do with Shiny, so I’ll definitely follow their tutorials to learn more about how to do similar thing.