Touring the tidyverse: tidyeval

Another stop on the journey through the tidyverse, and this time it’s one of the most interesting and mysterious corners (I think) of the entire suite of tidyverse packages: tidy evaluation. My goal for the talk was to introduce the ideas behind tidy evaluation in an interactive manner. To do that, I used almost exactly the same approach that Lionel Henry (the main contributor to rlang, which hosts tidy evaluation) outlined in this WIP book, creatively called “Tidy evaluation”. [Read More]

Hitchhiker's paradox: Quantified-self edition

Have you ever had the feeling that the bus (or tram, or train, etc.) leaves exactly 1 minute before you get to the stop AND the next one is forever away? Or you come to a crossroads that is usually very quiet, but exactly as you are about to cross it, there are like 100 cars going in every direction? Apparently, both of those phenomena can be explained by what is called the “Hitchhiker’s paradox”. [Read More]

Touring the tidyverse: purrr

Another R meetup, another talk about the tidyverse. This time I talked about purrr. I really like this package, but I do have a feeling that I didn’t present it in the best light possible. I certainly could have prepared a couple more examples that showcase why this package is so awesome and what kind of workflow it enables. To pat myself on the back a bit, it was a good idea to anticipate that it would take a bit longer than I had planned and to put the “interesting” parts of purrr first. [Read More]

Touring the tidyverse: dplyr

On the 25th of July, 2018, I gave a talk on the topic “Touring the tidyverse: dplyr”. It was the second installment of the series of talks about the tidyverse. The R Markdown source of the presentation is here, and the code I used during the presentation is here. Over time I’m planning to cover most of the packages in the tidyverse. Let me know if anything is not clear about this talk or any other in the series. [Read More]

rstudio::conf 2018 Review: Part 4

This is the fourth and final part of the overview of rstudio::conf 2018. It took longer than I anticipated, but at least I finished before rstudio::conf 2019 :). Part 1 is here. Part 2 is here. Part 3 is here. Modeling in the Tidyverse by Max Kuhn. This talk is an overview of the roadmap for the modeling packages in the tidyverse. The main idea is quite straightforward. Specifically, the suite of packages for modeling should be seen as a way to specify what you want in a declarative way and delay the actual work as much as possible. [Read More]

Touring the tidyverse: tidyr

This Thursday I gave a talk at the Berlin R-Users Group. The topic was “Touring the tidyverse: tidy data + tidyr”. The idea is that this will become a series of talks where each subsequent talk presents one (or a couple) of the packages from the tidyverse. tidyr seems to me like a good choice for the first talk, since the concept of tidy data is so central to all of the packages. [Read More]

Going to the cloud Google style

There are multiple cloud providers that one can choose from, and over time I’m planning to try working with all of them. But the first in line is CloudML from Google. The biggest reason for that choice is the outstanding work of the RStudio folks, who created multiple packages that make working with Google infrastructure a breeze ($300 of free credit is also a big factor). Specifically, today I’ll go over the tutorials for the cloudml package and provide my future self (and you) with pointers on where things can or will go wrong and how to avoid them. [Read More]

Keeping the writing monkey off my back

Apparently, it is not possible to embed interactive visualizations in GitHub Pages. At least, according to this post I came across. I’m not entirely sure that this is true, since, for example, this seems to work just fine. Most likely it is technically possible, but, as I was saying before, having to deal with all of those dependencies is exactly why I didn’t want to blog in the first place. [Read More]

Buying a car the data scientist way

Recently I’ve been thinking about what car to buy. Searching the web is one way, but I wanted to do it in a more “data” way, so in this series I’ll show how I extracted data about ~19,000 used cars in Berlin to find out which models/manufacturers seem to hold their price for longer. In my case I’ve decided to use this as an excuse to practice some of the skills that I’ve been dying to try for a long time: web scraping with R. [Read More]