Joins two tables in the All of Us database. A less verbose wrapper for the dplyr::*_join() functions with some added safeguards.
Arguments
- data
unexecuted SQL query from dbplyr/dplyr.
- table
the OMOP table (or other remote table in your schema) you wish to join, either as a character string or as a tbl object.
- type
the type of join; types available in dplyr: "left", "right", "inner", "anti", "full", etc.
- by
columns to join on
- suffix
suffixes to append to column names that appear in both tables but are not specified in the by argument.
- x_as
optional; a string for the name of the left table
- y_as
optional; a string for the name of the right table
- ...
Additional arguments passed on to the join function
- con
Connection to the allofus SQL database. Defaults to getOption("aou.default.con"), which is created automatically with aou_connect().
Details
There are a few good reasons to use aou_join() when possible over
the *_join() functions from dplyr. First, it reduces the code necessary to join
an existing table to another table. Second, it includes checks/workarounds
for two common sources of errors when using dbplyr: it automatically adds the
x_as and y_as arguments to the join call if they are not provided, and it
changes the default suffix from .x/.y to _x/_y. The suffix change matters when
the tables share column names not specified in the by argument, because the
default .x/.y suffixes result in a SQL error.
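For comparison, the following is a rough sketch of what the equivalent join might look like when written by hand with dplyr and dbplyr. It is not the internal implementation of aou_join(). It assumes a reasonably recent dbplyr (one whose join methods accept x_as and y_as), uses lazy_frame() to simulate remote tables so it runs without a database connection, and the alias names "obs_x" and "person_y" are made up for illustration.

```r
library(dplyr)
library(dbplyr)

# Simulated stand-ins for two remote tables that share a non-key column
# (provider_id). lazy_frame() builds SQL without touching a real database.
obs <- lazy_frame(
  person_id = 1L, observation_id = 10L, provider_id = 5L,
  .name = "observation"
)
person <- lazy_frame(
  person_id = 1L, provider_id = 7L,
  .name = "person"
)

# The manual version: explicit suffixes and table aliases on every join.
joined <- left_join(
  obs, person,
  by = "person_id",
  suffix = c("_x", "_y"),  # underscores instead of the default .x/.y
  x_as = "obs_x",          # hypothetical alias for the left table
  y_as = "person_y"        # hypothetical alias for the right table
)

# Inspect the resulting column names and the generated SQL.
colnames(joined)
show_query(joined)
```

With aou_join(), the suffix and alias arguments do not need to be spelled out on each call, which is the reduction in code the paragraph above describes.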
Examples
if (FALSE) { # on_workbench()
  con <- aou_connect()
  # Reference the observation table, dropping provider_id
  obs_tbl <- dplyr::tbl(con, "observation") %>%
    dplyr::select(-provider_id)
  # Left join the person table on person_id
  obs_tbl %>%
    aou_join("person", type = "left", by = "person_id")
}