Touring the tidyverse: purrr

Another R meetup, another talk about the tidyverse. This time I talked about purrr. I really like this package, but I have a feeling that I didn’t present it in the best light possible. I certainly could have prepared a couple more examples that showcase why this package is so awesome and what kind of workflow it enables. To pat myself on the back a bit, it was a good idea to anticipate that the talk would take a bit longer than I had planned and to put the “interesting” parts of purrr first. [Read More]

Touring the tidyverse: dplyr

On the 25th of July, 2018, I gave a talk titled “Touring the tidyverse: dplyr”. It was the second installment in the series of talks about the tidyverse. The R Markdown source of the presentation is here, and the code I used during the presentation is here. Over time I’m planning to cover most of the packages in the tidyverse. Let me know if anything is not clear about this talk or any other in the series. [Read More]

rstudio::conf 2018 Review: Part 4

This is the fourth and final part of the overview of rstudio::conf 2018. It took longer than I anticipated, but at least I finished before rstudio::conf 2019 :). Part 1 is here. Part 2 is here. Part 3 is here. Modeling in the Tidyverse by Max Kuhn. This talk is an overview of the roadmap for the modeling packages in the tidyverse. The main idea is quite straightforward: the suite of modeling packages should be seen as a way to specify what you want declaratively and to delay the actual work as much as possible. [Read More]

rstudio::conf 2018 Review: Part 3

Part 1 is here. Part 2 is here. Part 4 is here. Data-driven product development by Ramnath Vaidyanathan. Ramnath from DataCamp shared their experience of using data to drive decisions. He didn’t say anything completely new, but it is always reassuring to see that solid principles tend to produce consistent results. Another interesting point in his talk is the fact that they decided to share their findings with the instructors who create courses on DataCamp. [Read More]

rstudio::conf 2018 Review: Part 2

Part 1 is here. Part 3 is here. Part 4 is here. How I learned to stop worrying and love the firewall by Ian Lyttle. In this talk Ian described the workflow he devised to create private CRAN-like repos inside firewalled environments. Those environments are often a fact of life in enterprises, so this talk gives a couple of concrete pieces of advice on how to succeed in this endeavour. [Read More]

Touring the tidyverse: tidyr

This Thursday I gave a talk at the Berlin R-Users Group. The topic was “Touring the tidyverse: tidy data + tidyr”. The idea is that this will become a series of talks where each subsequent talk presents one (or a couple) of the packages from the tidyverse. tidyr seems to me like a good choice for the first talk, since the concept of tidy data is so central to all of the packages. [Read More]

rstudio::conf 2018 Review: Part 1

Part 2 is here. Part 3 is here. Part 4 is here. I’ve reviewed the useR!2017 conference before, so I wanted to continue the trend and give an overview of rstudio::conf 2018. It happened all the way back in January and I’ve had it on my list of things to do since then. Well, the time has come! I had the same problem with useR!2017, namely that there are too many good talks. [Read More]

Ethics in Data Science

There is a very noticeable effort to think about the ethics of online data collection and, more generally, of dealing with data about people. This is especially clear (or, rather, especially clear to me) in the field of Data Science. There are multiple reasons why Data Science is broadly seen as a tool to weaponize snooping on users to a ridiculous degree. Examples include Palantir spying in New Orleans, China spying on its citizens Minority Report style, and literally any adtech business. [Read More]

Category theory via graphical linear algebra

I was reading through Hacker News (as I often do) and came across a link titled “Graphical Linear Algebra”. I tend to click on most links on the front page and this link wasn’t an exception. What I found there is a blog by Pawel Sobocinski where he talks about Graphical Linear Algebra (surprise) and its application. Very early in the series he mentions that GLA is connected to category theory, an area of mathematics that I’ve been trying to understand for a loooong time given how useful it is for functional programming. [Read More]

Going to the cloud Google style

There are multiple cloud providers that one can choose from, and over time I’m planning to try working with all of them. But the first in line is CloudML from Google. The biggest reason for that choice is the outstanding work of the RStudio folks, who created multiple packages that make working with Google infrastructure a breeze ($300 of free credit is also a big factor). Specifically, today I’ll go over the tutorials for the cloudml package and provide my future self (and you) with pointers on where things can or will go wrong and how to avoid them. [Read More]