- Part 1 is here
- Part 2 is here
- Part 4 is here
Ramnath from DataCamp shared their experience of using data to drive decisions. He didn't say anything radically new, but it is always reassuring to see that solid principles tend to produce consistent results. Another interesting point in his talk is the fact that they've decided to share their findings with the instructors who create courses on DataCamp. This is sometimes overlooked, but it makes everyone's life easier, since decisions can then be made at every level and consequently improve the final product.
As this blog is written using blogdown, it's no wonder that I think it's a great package. So, if you need some motivation to start using it, this talk should clear everything up.
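For anyone curious, getting a blog like this one off the ground takes only a couple of calls. A minimal sketch (the theme shown is just an example, not an endorsement of any particular one):

```r
# install.packages("blogdown")
library(blogdown)

# Scaffold a new Hugo site in the current directory;
# the theme argument points at a GitHub repo (example theme).
new_site(theme = "yihui/hugo-lithium")

# Preview locally with live reload while writing posts.
serve_site()
```

From there, `new_post()` creates a new (R)Markdown post that is rendered into the site automatically.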
- Something old, something new, something borrowed, something blue: Ways to teach data science (and learn it too!) by Chester Ismay.

Chester is one part of ModernDive.com, where they try to teach data science to complete beginners. He also works at DataCamp. In this talk he shared his experience of creating this book/course using the tidyverse.

- To the Tidyverse and Beyond by Marco.
Very entertaining talk (no slides though!) by Marco about the process he started at his company of weaning everyone off of Excel in favor of the tidyverse. In his experience, the main driver was the ability to create static reports from business data. Since most of this was previously done in Excel, it was a perfect use case for showing people the power of ggplot2. Little by little, people in the company realized that this approach is much more fluent, since they no longer need to rely on data scientists and/or BI people to find out the things that interest them.
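The talk itself has no code, but the Excel-to-ggplot2 workflow it describes boils down to something like this (the data and column names are made up for illustration):

```r
library(ggplot2)

# Hypothetical monthly revenue figures -- the kind of summary
# that would otherwise live in an Excel sheet.
sales <- data.frame(
  month   = factor(month.abb[1:6], levels = month.abb[1:6]),
  revenue = c(120, 135, 128, 150, 162, 158)
)

# A static, report-ready chart in a few lines of ggplot2.
ggplot(sales, aes(x = month, y = revenue)) +
  geom_col(fill = "steelblue") +
  labs(
    title = "Monthly revenue (example data)",
    x     = NULL,
    y     = "Revenue, k EUR"
  ) +
  theme_minimal()
```

Once a chart like this exists as code, regenerating it for next month's data is a re-run, not a copy-paste session.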
This talk by Edzer is a presentation of the sf package, which adopted tidy principles. The package is another way of working with spatial data that allows for easier integration with the rest of the tidyverse (e.g., with dplyr and ggplot2).
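To give a flavor of that integration, here is a minimal sketch (the point data is invented for illustration): `st_as_sf()` turns a plain data frame into a spatial object, after which regular dplyr verbs and ggplot2's `geom_sf()` keep working.

```r
library(sf)
library(dplyr)
library(ggplot2)

# Hypothetical point data: plain longitude/latitude columns.
cities <- data.frame(
  name = c("Berlin", "Munich", "Hamburg"),
  lon  = c(13.40, 11.58, 9.99),
  lat  = c(52.52, 48.14, 53.55)
)

# Convert to an sf object; the coordinate columns become a
# geometry column, with a declared coordinate reference system.
cities_sf <- st_as_sf(cities, coords = c("lon", "lat"), crs = 4326)

# dplyr verbs still apply, and geom_sf() handles the plotting.
cities_sf %>%
  filter(name != "Hamburg") %>%
  ggplot() +
  geom_sf() +
  theme_minimal()
```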
Fantastic introduction to the tibbletime package, which makes it MUCH easier to work with time series. The biggest idea is to have a time index that is used to calculate all sorts of things. Importantly, it uses tibble as a backend, which means that you can still use all of the tidyverse packages with it.
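A short sketch of the time-index idea, with an invented daily series (function names follow the tibbletime API current at the time of the talk):

```r
library(tibbletime)
library(dplyr)

# Hypothetical daily series for the first half of 2018.
daily <- tibble(
  date  = seq(as.Date("2018-01-01"), as.Date("2018-06-30"), by = "day"),
  value = rnorm(181)
)

# Declare which column is the time index.
daily_tt <- as_tbl_time(daily, index = date)

# Time-based filtering with a readable shorthand...
daily_tt %>%
  filter_time("2018-03" ~ "2018-04")

# ...and collapsing to a coarser period for summaries,
# while everything stays a tibble underneath.
daily_tt %>%
  collapse_by("monthly") %>%
  group_by(date) %>%
  summarise(mean_value = mean(value))
```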
infer is a package that allows for tidy statistical inference. The vision is similar in some sense to ggplot2, where you specify layer by layer what your visualization should look like; this makes the analysis rather transparent and declarative. Similarly, instead of just calling, say, t.test from base R, with infer you are encouraged to specify your assumptions in a more explicit way.
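To illustrate the layer-by-layer idea, here is a sketch of infer's verb pipeline (it uses the `gss` survey dataset bundled with recent versions of the package; the null value of 40 hours is just an example):

```r
library(infer)
library(dplyr)

# Each verb states one assumption or step explicitly:
# which variable, what null hypothesis, how to simulate,
# and which statistic to compute.
null_dist <- gss %>%
  specify(response = hours) %>%
  hypothesize(null = "point", mu = 40) %>%
  generate(reps = 1000, type = "bootstrap") %>%
  calculate(stat = "mean")
```

Compare this with the base-R one-liner `t.test(gss$hours, mu = 40)`: the result is similar in spirit, but the infer version spells out every assumption as its own step.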
tidygraph and ggraph are two indispensable tools for anyone trying to work with graphs in a tidyverse setting. The former wraps igraph to perform most of the heavy lifting of working with graphs, while the latter is built on top of ggplot2 to provide an easy way to visualize them. So far, the biggest reason I've had only a little exposure to both packages is that I rarely have problems that can be solved using graphs. But when I did have problems that I could articulate in terms of graphs, working with both packages was very pleasant.
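As a small taste of the pair working together, a minimal sketch using one of tidygraph's built-in example graphs:

```r
library(tidygraph)
library(ggraph)

# A classic built-in example graph; tidygraph exposes it as a
# pair of tidy tables (nodes and edges).
graph <- create_notable("zachary") %>%
  mutate(degree = centrality_degree())  # mutate acts on the active nodes table

# ggraph builds the picture layer by layer, just like ggplot2.
ggraph(graph, layout = "kk") +
  geom_edge_link(alpha = 0.3) +
  geom_node_point(aes(size = degree)) +
  theme_graph()
```

The nice part is that dplyr verbs like `mutate()` work directly on the graph, so node and edge attributes are computed the same way as columns in any tibble.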
This talk is a whirlwind introduction to the tidyverse as a concept. I wish I had watched it before giving my first talk, "Touring the tidyverse", at an R meetup here in Berlin. I could have used some of the gifs for my talk as well :).