rstudio::conf 2018 Review: Part 3

Part 1 is here. Part 2 is here. Part 4 is here. Data-driven product development by Ramnath Vaidyanathan. Ramnath from DataCamp shared their experience of using data to drive decisions. He didn’t say anything completely new, but it is always reassuring to see that solid principles tend to produce consistent results. Another interesting point in his talk is the fact that they’ve decided to share their findings with the instructors who create courses on DataCamp. [Read More]

rstudio::conf 2018 Review: Part 2

Part 1 is here. Part 3 is here. Part 4 is here. How I learned to stop worrying and love the firewall by Ian Lyttle. In this talk Ian described the workflow he devised to create private CRAN-like repos inside firewalled environments. Those environments are often a fact of life in enterprises, so this talk gives a couple of concrete pieces of advice for succeeding in this endeavour. [Read More]

Touring the tidyverse: tidyr

This Thursday I gave a talk at the Berlin R-Users Group. The topic was “Touring the tidyverse: tidy data + tidyr”. The idea is that this will become a series of talks where each subsequent talk presents one (or a couple) of the packages from the tidyverse. tidyr seems to me like a good choice for the first talk, since the concept of tidy data is so central to all of the packages. [Read More]

rstudio::conf 2018 Review: Part 1

Part 2 is here. Part 3 is here. Part 4 is here. I’ve reviewed the useR!2017 conference before, so I wanted to continue the trend and give an overview of rstudio::conf 2018. It happened all the way back in January and I’ve had it on my list of things to do since then. Well, the time has come! I ran into the same problem as with useR!2017, namely that there are too many good talks. [Read More]

Ethics in Data Science

There is a very noticeable effort to think about the ethics of collecting data online and, more generally, of dealing with data about people. This is especially clear (or, rather, especially clear to me) in the field of Data Science. There are multiple reasons why Data Science is broadly seen as a tool to weaponize snooping on users to a ridiculous degree. Examples of that are Palantir spying in New Orleans, China spying on its citizens Minority Report style, and literally any adtech business. [Read More]

Category theory via graphical linear algebra

I was reading through Hacker News (as I often do) and came across a link named “Graphical Linear Algebra”. I tend to click on most links on the front page and this link was no exception. What I found there is a blog by Pawel Sobocinski where he talks about Graphical Linear Algebra (surprise) and its applications. Very early in the series he mentions that GLA is connected to category theory – an area of mathematics that I’ve been trying to understand for a loooong time, given how useful it is for functional programming. [Read More]

Going to the cloud Google style

There are multiple cloud providers that one can choose from, and over time I’m planning to try working with all of them. But the first in line is CloudML from Google. The biggest reason for that choice is the outstanding work of the RStudio folks, who created multiple packages that make working with Google infrastructure a breeze ($300 of free credit is also a big factor). Specifically, today I’ll go over the tutorials for the cloudml package and provide my future self (and you) with pointers to where things can or will go wrong and how to avoid them. [Read More]

Finding controversial and interesting posts on Hacker News

I’ve been an avid reader of Hacker News for the past year or so. If you don’t know what it is, in a few words it’s an aggregator centered around technology news and everything surrounding it. So, not only news about the latest and greatest frameworks to use, but also any news that the community finds interesting. Stories are upvoted to the front page that you see when you go to the link above. [Read More]

Overview of `secret` and `spelling` packages

If you didn’t notice, this blog is mostly about me talking to my future self with some pieces of advice that I find useful at the moment. It is most obvious with the Docker post, but pretty much every post is kinda like that. Another major theme of the blog so far is me looking at cool packages that I think are useful. I’m especially interested in packages that solve a problem that is usually not so flashy, but whose solution makes it easy for many people to do what they want to do better and faster. [Read More]

Dabbling with deep learning

I like trying different things just to see how difficult or easy they are. One of the things I’ve been meaning to try for quite some time is deep learning, specifically the keras package by RStudio. There are many tutorials about keras around, but I’ve just followed a couple of the tutorials and vignettes that they have on their CRAN page. The interesting thing to me is the fact that, apparently, keras is no longer the cool kid on the block and all the rage is now behind pytorch. [Read More]