How to Do Mediation Scientifically

Mediation analysis has been around a long time, though its popularity has varied between disciplines and over the years. While some fields have been attracted to the potential of mediation models to identify pathways, or mechanisms, through which an independent variable affects an outcome, others have been skeptical that the analysis of mediated relationships can ever be done scientifically. Two developments, one more scientific than the other, have led to a renewed popularity of mediation analysis.

Plotly for R - Multi-Layer Plots

If you are new to plotly, consider first reading our introductory post, "Introduction to Interactive Graphics in R with plotly." Often, when analyzing data, it is necessary to produce a complex plot that requires multiple graphical layers. In plotly, multi-layer plots can be specified as a pipeline of data manipulations (dplyr only) and visual mappings. This is possible because dplyr verbs can be used on a plotly object to modify the underlying data.

The Prisoner's Dilemma

Game Theory and Interdependent Outcomes
Game theory is the study of interdependent decision making, or how individuals make decisions when their optimal choice depends on what others have chosen. Probably the best-known application of game theory is the Prisoner’s Dilemma. In this game, there is a tension between the incentives faced by each player and the globally optimal outcome. In the parlance of game theory, the Nash equilibrium is not Pareto optimal.
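
The claim that the Nash equilibrium is not Pareto optimal can be checked by brute force on a small payoff table. The sketch below is only an illustration: the payoff numbers (negated years in prison, so higher is better) and the names `PAYOFFS`, `is_nash`, and `is_pareto_optimal` are assumptions chosen for this example, not taken from the post.

```python
from itertools import product

ACTIONS = ("cooperate", "defect")

# Hypothetical payoffs: (row player's payoff, column player's payoff),
# keyed by (row action, column action). Higher is better.
PAYOFFS = {
    ("cooperate", "cooperate"): (-1, -1),
    ("cooperate", "defect"): (-3, 0),
    ("defect", "cooperate"): (0, -3),
    ("defect", "defect"): (-2, -2),
}


def is_nash(profile):
    """True if neither player can gain by unilaterally switching actions."""
    row, col = profile
    row_ok = all(PAYOFFS[(row, col)][0] >= PAYOFFS[(alt, col)][0] for alt in ACTIONS)
    col_ok = all(PAYOFFS[(row, col)][1] >= PAYOFFS[(row, alt)][1] for alt in ACTIONS)
    return row_ok and col_ok


def is_pareto_optimal(profile):
    """True if no other outcome helps one player without hurting the other."""
    current = PAYOFFS[profile]
    for other in PAYOFFS.values():
        no_worse = all(o >= c for o, c in zip(other, current))
        strictly_better = any(o > c for o, c in zip(other, current))
        if no_worse and strictly_better:
            return False
    return True


profiles = list(product(ACTIONS, repeat=2))
nash_profiles = [p for p in profiles if is_nash(p)]
pareto_profiles = [p for p in profiles if is_pareto_optimal(p)]

print("Nash equilibria:", nash_profiles)
print("Pareto-optimal outcomes:", pareto_profiles)
```

With these assumed payoffs, mutual defection is the only Nash equilibrium, yet it is the one outcome that is not Pareto optimal, because both players would be better off under mutual cooperation.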

Introduction to Interactive Graphics in R with plotly

R users adore the ggplot2 package for all things data visualization. Its consistent syntax, useful defaults, and flexibility make it a fantastic tool for creating high-quality figures. Although ggplot2 is great, there are other dataviz tools that deserve a place in a data scientist’s toolbox. Enter plotly. The plotly package is a high-level interface to plotly.js, itself built on d3.js, that provides an easy-to-use UI for generating slick interactive D3 graphics.

Assessing Causality from Observational Data using Pearl's Structural Causal Models

Causality
In 20th-century statistics classes, it was common to hear the statement: “You can never prove causality.” As a result, researchers published results saying “x is associated with y” as a way of circumventing the issue of causality while implicitly suggesting that the association is causal. As an example from my former discipline, political science, there was an interest in determining how representative democracy works. Do politicians respond to voters, or do voters just update their policy beliefs to line up with the party they’ve always preferred?

Developing R Packages with usethis and GitLab CI: Part III

While developing your R package, you will want to make sure the code it contains is as clean as possible and that your package build and test times are as short as you can make them. There are a number of tricks and tools at your disposal to accomplish these aims. This post, the third in a series that covers R package development, will introduce a few of those and demonstrate how they can improve your package development process.

Tracking private R dependencies with packrat & git submodules

Here at Methods we often use RStudio’s packrat package to version our package dependencies and help ensure our work is reproducible. Packrat handles public packages on CRAN or GitHub just fine, but we have a lot of internal packages hosted privately on GitLab that we’d like to have packrat manage like the rest of our dependencies. This comes up very naturally for us, as we often make client-specific R packages that we then want to use in other work for that client.

Developing R Packages with usethis and GitLab CI: Part II

This post, the second part in a series that covers R package development, will define the important concept of continuous integration (CI) and demonstrate the advantages of using CI within GitLab. The version control code repository, GitLab, offers many services to its users, including CI that R programmers and software developers can set up in private repositories for free. GitLab’s built-in CI service is easy to use and can be set up with an R package relatively quickly.

Developing R Packages with usethis and GitLab CI: Part I

The best way to share your R code with others is to create a package. Whether you want to share your functions with team members, clients, or all interested R users, bundling up your functions into a package is the way to go. Luckily, there are great tools available that make this process relatively smooth and easy. This series of posts aims to walk through the process of setting up an R package and sharing it on the version control code repository, GitLab.

A Tour of Timezones (& Troubles) in R

In any programming tool, dates, times, and timezones are hard. Deceptively hard. They’ve been shaped by politics and whimsy for hundreds of years: timezones can shift with minimal notice, countries have skipped or repeated certain days, some timezones are offset by weird increments, and some observe Daylight Saving Time. Add leap years and leap seconds, and the list goes on. Luckily, we rarely need to worry about most of those details because other teams of very smart people have spent a lot of time providing nice abstractions for us that handle most of the weird edge cases.