[Experimental]

Writes, overwrites or appends an Arrow object to a database table.

Methods in other packages

This documentation page describes the generic. For the methods implemented in the various backend packages, refer to the documentation pages of those packages.

Usage

dbWriteTableArrow(conn, name, value, ...)

Arguments

conn

A DBI::DBIConnection object, as returned by dbConnect().

name

The table name, passed on to dbQuoteIdentifier(). Options are:

  • a character string with the unquoted DBMS table name, e.g. "table_name",

  • a call to Id() with components to the fully qualified table name, e.g. Id(schema = "my_schema", table = "table_name")

  • a call to SQL() with the quoted and fully qualified table name given verbatim, e.g. SQL('"my_schema"."table_name"')
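The three forms of name can be sketched as follows. This is a minimal illustration, assuming the RSQLite and nanoarrow packages are installed; the table names are made up for the example, and "main" is used because it is the default schema name in SQLite.

```r
library(DBI)

con <- dbConnect(RSQLite::SQLite(), ":memory:")
dat <- data.frame(x = 1:3)

# 1. Unquoted table name as a character string
dbWriteTableArrow(con, "tbl_plain", nanoarrow::as_nanoarrow_array_stream(dat))

# 2. Id() with the components of the fully qualified name
dbWriteTableArrow(
  con,
  Id(schema = "main", table = "tbl_id"),
  nanoarrow::as_nanoarrow_array_stream(dat)
)

# 3. SQL() with the quoted, fully qualified name given verbatim
dbWriteTableArrow(
  con,
  SQL('"main"."tbl_sql"'),
  nanoarrow::as_nanoarrow_array_stream(dat)
)

tables <- dbListTables(con)
print(tables)

dbDisconnect(con)
```

A fresh stream is created for each call because an array stream can only be consumed once.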

value

A nanoarrow array stream, or an object coercible to a nanoarrow array stream with nanoarrow::as_nanoarrow_array_stream().

...

Other parameters passed on to methods.

Value

dbWriteTableArrow() returns TRUE, invisibly.

Details

This function expects an Arrow object. Convert a data frame to an Arrow object with nanoarrow::as_nanoarrow_array_stream() or use dbWriteTable() to write a data frame.

This function is useful if you want to create and load a table at the same time. Use dbAppendTableArrow() to append data to an existing table, dbCreateTableArrow() to create a table and specify field types, and dbRemoveTable() to drop an existing table before writing it anew.
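The difference between the one-step and the two-step workflow can be sketched like this. The example assumes RSQLite and nanoarrow are installed and relies on DBI's default methods for the Arrow generics; the table names are illustrative only.

```r
library(DBI)

con <- dbConnect(RSQLite::SQLite(), ":memory:")

# One step: create the table and load the data at the same time
dbWriteTableArrow(
  con, "cars_one_step",
  nanoarrow::as_nanoarrow_array_stream(mtcars[1:3, ])
)

# Two steps: create an empty table from the schema of a zero-row stream,
# then append the rows separately
dbCreateTableArrow(
  con, "cars_two_step",
  nanoarrow::as_nanoarrow_array_stream(mtcars[0, ])
)
dbAppendTableArrow(
  con, "cars_two_step",
  nanoarrow::as_nanoarrow_array_stream(mtcars[1:3, ])
)

n_one <- nrow(dbReadTable(con, "cars_one_step"))
n_two <- nrow(dbReadTable(con, "cars_two_step"))

dbDisconnect(con)
```

The two-step form is the one to reach for when the table definition and the data load need to happen at different times.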

Failure modes

If the table exists and both the append and overwrite arguments are unset, or if append = TRUE and the new data has different column names, an error is raised; the remote table remains unchanged.

An error is raised when calling this method for a closed or invalid connection. An error is also raised if name cannot be processed with DBI::dbQuoteIdentifier() or if this results in a non-scalar. Invalid values for the additional arguments overwrite, append, and temporary (non-scalars, unsupported data types, NA, incompatible values, incompatible columns) also raise an error.
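Two of these failure modes can be observed with tryCatch(). The sketch below assumes RSQLite and nanoarrow are installed; the exact error messages depend on the backend and are therefore not shown.

```r
library(DBI)

con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTableArrow(
  con, "t",
  nanoarrow::as_nanoarrow_array_stream(data.frame(a = 1:2))
)

# Writing to an existing table without append or overwrite should fail
res_exists <- tryCatch(
  dbWriteTableArrow(
    con, "t",
    nanoarrow::as_nanoarrow_array_stream(data.frame(a = 3:4))
  ),
  error = function(e) conditionMessage(e)
)

# The existing table should be unchanged after the failed write
rows_after <- nrow(dbReadTable(con, "t"))

dbDisconnect(con)

# Writing through a closed connection should fail as well
res_closed <- tryCatch(
  dbWriteTableArrow(
    con, "u",
    nanoarrow::as_nanoarrow_array_stream(data.frame(a = 1L))
  ),
  error = function(e) conditionMessage(e)
)
```

If a call fails, the corresponding result holds the error message as a character string rather than TRUE.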

Additional arguments

The following arguments are not part of the dbWriteTableArrow() generic (to improve compatibility across backends) but are part of the DBI specification:

  • overwrite (default: FALSE)

  • append (default: FALSE)

  • temporary (default: FALSE)

They must be provided as named arguments. See the "Specification" and "Value" sections for details on their usage.

Specification

The name argument is processed as follows, to support databases that allow non-syntactic names for their objects:

  • If name is an unquoted table name given as a string: dbWriteTableArrow() does the quoting, perhaps by calling dbQuoteIdentifier(conn, x = name)

  • If name is the result of a call to DBI::dbQuoteIdentifier(): no further quoting is done

If append = TRUE, the value argument must contain a subset of the columns of the existing table. The order of the columns does not matter with append = TRUE.

If the overwrite argument is TRUE, an existing table of the same name will be overwritten. This argument doesn't change behavior if the table does not exist yet.

If the append argument is TRUE, the rows in an existing table are preserved, and the new data are appended. If the table doesn't exist yet, it is created.

If the temporary argument is TRUE, the table is not available in a second connection and is gone after reconnecting. Not all backends support this argument. A regular, non-temporary table is visible in a second connection, in a pre-existing connection, and after reconnecting to the database.
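The effect of the three arguments can be sketched with two connections to the same database file. This is an illustration under assumptions: RSQLite and nanoarrow are installed, the backend supports temporary tables, and to_stream() is a hypothetical helper defined only for this example.

```r
library(DBI)

# A file-backed database, so that two connections see the same data
path <- tempfile(fileext = ".sqlite")
con <- dbConnect(RSQLite::SQLite(), path)
con2 <- dbConnect(RSQLite::SQLite(), path)

# Hypothetical helper: wrap a data frame in a fresh Arrow array stream
to_stream <- function(df) nanoarrow::as_nanoarrow_array_stream(df)

dbWriteTableArrow(con, "t", to_stream(data.frame(a = 1:2)))

# append = TRUE keeps the existing rows and adds the new ones
dbWriteTableArrow(con, "t", to_stream(data.frame(a = 3:4)), append = TRUE)
n_append <- nrow(dbReadTable(con, "t"))

# overwrite = TRUE replaces the existing table
dbWriteTableArrow(con, "t", to_stream(data.frame(a = 5L)), overwrite = TRUE)
n_overwrite <- nrow(dbReadTable(con, "t"))

# temporary = TRUE creates a table that only the writing connection sees
dbWriteTableArrow(con, "tmp", to_stream(data.frame(a = 1L)), temporary = TRUE)
tmp_in_first <- dbExistsTable(con, "tmp")
tmp_in_second <- dbExistsTable(con2, "tmp")

# The regular table is visible from the second connection
t_in_second <- dbExistsTable(con2, "t")

dbDisconnect(con)
dbDisconnect(con2)
unlink(path)
```

All three arguments are passed by name, as required for arguments that are not part of the generic's signature.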

SQL keywords can be used freely in table names, column names, and data. Quotes, commas, spaces, and other special characters such as newlines and tabs, can also be used in the data, and, if the database supports non-syntactic identifiers, also for table names and column names.
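A round trip with SQL keywords and special characters might look like the sketch below. It assumes RSQLite and nanoarrow are installed and that the backend accepts non-syntactic identifiers, as SQLite does; the table name, column names, and values are invented for the example.

```r
library(DBI)

con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Column names that are SQL keywords or contain a space,
# and data containing commas, quotes, newlines, and tabs
dat <- data.frame(
  "select" = c("a,b", "it's \"quoted\"", "line1\nline2\ttabbed"),
  "from where" = 1:3,
  check.names = FALSE,
  stringsAsFactors = FALSE
)

dbWriteTableArrow(
  con, "create table",
  nanoarrow::as_nanoarrow_array_stream(dat)
)

# check.names = FALSE keeps the non-syntactic column names as written
out <- dbReadTable(con, "create table", check.names = FALSE)
print(names(out))

dbDisconnect(con)
```

Quoting is handled by DBI, so none of these names or values need to be escaped by hand.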

The following data types must be supported at least, and be read identically with DBI::dbReadTable():

  • integer

  • numeric (the behavior for Inf and NaN is not specified)

  • logical

  • NA as NULL

  • 64-bit values (using "bigint" as field type); the result can be

    • converted to a numeric, which may lose precision,

    • converted to a character vector, which gives the full decimal representation

    • written to another table and read again unchanged

  • character (in both UTF-8 and native encodings), supporting empty strings before and after a non-empty string

  • factor (possibly returned as character)

  • objects of type blob::blob (if supported by the database)

  • date (if supported by the database; returned as Date), also for dates prior to 1970 or 1900 or after 2038

  • time (if supported by the database; returned as objects that inherit from difftime)

  • timestamp (if supported by the database; returned as POSIXct respecting the time zone but not necessarily preserving the input time zone), also for timestamps prior to 1970 or 1900 or after 2038

Mixing column types in the same table is supported.
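A table mixing several of these types, including missing values and an empty string, can be written and read back as sketched below. The example assumes RSQLite and nanoarrow are installed. Logical columns are left out because SQLite stores them as integers, so they would not come back as logical with this backend.

```r
library(DBI)

con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Integer, numeric, and character columns, each with a missing value
dat <- data.frame(
  i = c(1L, NA, 3L),
  d = c(1.5, NA, -2),
  s = c("", "text", NA),
  stringsAsFactors = FALSE
)

dbWriteTableArrow(con, "types", nanoarrow::as_nanoarrow_array_stream(dat))

out <- dbReadTable(con, "types")
str(out)

dbDisconnect(con)
```

Comparing out with dat column by column, for instance with identical(), is a quick way to check how a given backend handles each type.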

Examples

con <- dbConnect(RSQLite::SQLite(), ":memory:")

dbWriteTableArrow(con, "mtcars", nanoarrow::as_nanoarrow_array_stream(mtcars[1:5, ]))
dbReadTable(con, "mtcars")
#>    mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> 1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#> 2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
#> 3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
#> 4 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
#> 5 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2

dbDisconnect(con)