Unarchive a single tsv file into an existing database
```r
unark_file(filename, db_con, lines = 10000L)
```
| Argument | Description |
|---|---|
| `filename` | a `*.tsv.bz2` file to uncompress |
| `db_con` | a database src (`src_dbi` object) |
| `lines` | the number of lines to read in a chunk |
Returns the database connection (`src_dbi`), invisibly.
`unark_file` reads a file in chunks and writes them into a database. This is
essential for processing large compressed tables that may be too large to read
into memory before writing into a database. In general, increasing the `lines`
parameter results in a faster total transfer, but requires more free memory
for working with the larger chunks.
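The chunked read/append pattern described above can be sketched in a few lines of base R plus DBI. This is a hypothetical simplification for illustration only (the function name `unark_sketch` and the `tablename` argument are made up here), not the actual `arkdb` implementation, which also handles progress reporting, column types, and other edge cases:

```r
# Minimal sketch of chunked file-to-database transfer.
# Hypothetical simplification; not the real unark_file() implementation.
unark_sketch <- function(filename, db_con, lines = 10000L, tablename = "data") {
  con <- file(filename, "r")        # file() transparently decompresses .bz2
  on.exit(close(con))
  header <- readLines(con, n = 1L)  # keep the column names from line 1
  repeat {
    chunk <- readLines(con, n = lines)
    if (length(chunk) == 0L) break  # end of file reached
    # re-attach the header so each chunk parses with the same columns
    df <- read.delim(textConnection(c(header, chunk)))
    DBI::dbWriteTable(db_con, tablename, df, append = TRUE)
  }
  invisible(db_con)                 # return the connection, invisibly
}
```

Because each chunk is parsed and appended independently, peak memory use is governed by `lines` rather than by the total size of the file.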
```r
## set up example files and database
tsv <- tempfile("flights", fileext = ".tsv.bz2")
sqlite <- tempfile("nycflights", fileext = ".sql")
readr::write_tsv(nycflights13::flights, tsv)
db <- src_sqlite(sqlite, create = TRUE)

## and here we go:
db_con <- unark_file(tsv, db)

## display tables in database:
db_con
#> src:  sqlite 3.22.0 [/var/folders/y8/0wn724zs10jd79_srhxvy49r0000gn/T//RtmpPQJ76l/nycflightsfcd1b80705b.sql]
#> tbls: flightsfcd158cf3482

## clean up the temporary files:
unlink(tsv)
unlink(sqlite)
```