library(connector)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
This vignette demonstrates how to create and use connector objects programmatically in R code, without requiring YAML configuration files. While YAML files are convenient for complex setups and reproducible environments, sometimes you need the flexibility to create connectors dynamically in your R scripts.
This approach is particularly useful when:
- You need to create connectors based on runtime conditions or user input
- You’re working in an interactive R session and want quick access to different storage locations
- You prefer defining your data connections directly in your analysis code
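As a preview of the first case, the sketch below picks a storage backend at run time. It uses connector_fs() and connector_dbi(), both described in the sections that follow. The environment variable ANALYSIS_BACKEND and the scratch directory are assumptions made for illustration; neither is defined by the connector package.

```r
library(connector)

# Hypothetical environment variable; falls back to "files" when unset
backend <- Sys.getenv("ANALYSIS_BACKEND", unset = "files")

# A scratch directory for the file-based case, created up front
scratch_dir <- file.path(tempdir(), "runtime_demo")
dir.create(scratch_dir, showWarnings = FALSE, recursive = TRUE)

# Choose the connector type from the runtime value
conn <- if (identical(backend, "database")) {
  connector_dbi(drv = RSQLite::SQLite(), dbname = ":memory:")
} else {
  connector_fs(path = scratch_dir)
}

class(conn)
```

Code further down the script can then use conn without knowing which branch was taken.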
Creating Individual Connectors
You can create connector objects directly using the specific connector functions:
File System Connector
The connector_fs() function creates a connector for file-based storage. You specify the directory path, and the connector handles reading and writing files in various formats based on file extensions.
# Create a file system connector pointing to the 'data' directory
fs_conn <- connector_fs(path = "data")
fs_conn
#> <ConnectorFS>
#> Inherits from: <Connector>
#> Registered methods:
#> • `check_resource.ConnectorFS()`
#> • `create_directory_cnt.ConnectorFS()`
#> • `download_cnt.ConnectorFS()`
#> • `download_directory_cnt.ConnectorFS()`
#> • `list_content_cnt.ConnectorFS()`
#> • `log_read_connector.ConnectorFS()`
#> • `log_remove_connector.ConnectorFS()`
#> • `log_write_connector.ConnectorFS()`
#> • `read_cnt.ConnectorFS()`
#> • `remove_cnt.ConnectorFS()`
#> • `remove_directory_cnt.ConnectorFS()`
#> • `tbl_cnt.ConnectorFS()`
#> • `upload_cnt.ConnectorFS()`
#> • `upload_directory_cnt.ConnectorFS()`
#> • `write_cnt.ConnectorFS()`
#> Specifications:
#> • path: data
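The path passed to connector_fs() is expected to point at a directory. Depending on the package version, the constructor may reject a path that does not exist yet, so a script can create the directory first. The following is a minimal sketch under that assumption; the location under tempdir() is chosen only for illustration.

```r
library(connector)

# Illustrative location; substitute your own project directory
data_dir <- file.path(tempdir(), "data")

# Create the directory only if it is missing
if (!dir.exists(data_dir)) {
  dir.create(data_dir, recursive = TRUE)
}

fs_conn <- connector_fs(path = data_dir)

# List whatever the directory currently holds
fs_conn |> list_content_cnt()
```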
Database Connector
The connector_dbi() function creates a connector for database storage using the DBI interface. This works with any DBI-compatible database driver (SQLite, PostgreSQL, MySQL, etc.).
# Create a database connector using SQLite in-memory database
db_conn <- connector_dbi(
  drv = RSQLite::SQLite(),
  dbname = ":memory:"
)
db_conn
#> <ConnectorDBI>
#> Inherits from: <Connector>
#> Registered methods:
#> • `disconnect_cnt.ConnectorDBI()`
#> • `list_content_cnt.ConnectorDBI()`
#> • `log_read_connector.ConnectorDBI()`
#> • `log_remove_connector.ConnectorDBI()`
#> • `log_write_connector.ConnectorDBI()`
#> • `read_cnt.ConnectorDBI()`
#> • `remove_cnt.ConnectorDBI()`
#> • `tbl_cnt.ConnectorDBI()`
#> • `write_cnt.ConnectorDBI()`
#> • `check_resource.Connector()`
#> Specifications:
#> • conn: <SQLiteConnection>
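The printed specifications show that the connector holds a live DBI connection, and the method list includes disconnect_cnt(). The sketch below checks the connection and then closes it. It assumes the connection is reachable as db_conn$conn, which matches the conn entry printed above but should be confirmed against the package reference.

```r
library(connector)

db_conn <- connector_dbi(
  drv = RSQLite::SQLite(),
  dbname = ":memory:"
)

# Assumption: the underlying DBI connection is exposed as `$conn`
was_valid <- DBI::dbIsValid(db_conn$conn)

# Close the connection once the connector is no longer needed
db_conn |> disconnect_cnt()

is_valid_after <- DBI::dbIsValid(db_conn$conn)
```

Closing database connectors explicitly avoids leaving connections open at the end of a script, which matters more for server databases than for an in-memory SQLite database.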
Using Individual Connectors
Once you have a connector, you use the same functions regardless of whether it’s a file system or database connector. This consistency makes it easy to switch storage backends in your analysis.
# Write and read data using the file system connector
sample_data <- mtcars[1:5, 1:3]
# Write data - format is determined by file extension
fs_conn |> write_cnt(sample_data, "cars.csv")
# List all available content in this connector
fs_conn |> list_content_cnt()
#> [1] "cars.csv"
# Read the data back
retrieved_data <- fs_conn |> read_cnt("cars.csv")
#> → Found one file: data/cars.csv
#> Rows: 5 Columns: 3
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> dbl (3): mpg, cyl, disp
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(retrieved_data)
#> # A tibble: 5 × 3
#> mpg cyl disp
#> <dbl> <dbl> <dbl>
#> 1 21 6 160
#> 2 21 6 160
#> 3 22.8 4 108
#> 4 21.4 6 258
#> 5 18.7 8 360
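Because the same functions apply to every connector type, a helper can be written once and reused across backends. The sketch below defines a hypothetical round_trip() function (not part of connector) and calls it with both a file system and a database connector. The .rds extension and the temporary directory are choices made for this example.

```r
library(connector)

# Hypothetical helper: write a data frame, then read it back
round_trip <- function(conn, x, name) {
  conn |> write_cnt(x, name)
  conn |> read_cnt(name)
}

sample_data <- mtcars[1:5, 1:3]

# A fresh directory so the example does not collide with existing files
demo_dir <- tempfile("round_trip_")
dir.create(demo_dir)

fs_conn <- connector_fs(path = demo_dir)
db_conn <- connector_dbi(drv = RSQLite::SQLite(), dbname = ":memory:")

# Same helper, different storage backends
from_files <- round_trip(fs_conn, sample_data, "cars.rds")
from_db <- round_trip(db_conn, sample_data, "cars")

db_conn |> disconnect_cnt()
```

Only the name changes between the two calls: a file name with an extension for the file system connector, and a table name for the database connector.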
Creating Multiple Connectors with connectors()
The connectors() function allows you to group multiple connector objects together with meaningful names. This is useful for organizing different stages of your data pipeline or different types of storage.
# Create a collection of connectors for different data stages
my_connectors <- connectors(
  staging = connector_fs(path = "staging"),
  analysis = connector_fs(path = "analysis")
)
my_connectors
#> <connectors>
#> $staging <ConnectorFS>
#> $analysis <ConnectorFS>
Working with Multiple Connectors
With multiple connectors, you can organize your data workflow by using different connectors for different purposes. Access each connector by name using the $ operator.
# Use different connectors for different stages of analysis
iris_sample <- iris[1:10, ]
# Store initial data in the staging area
my_connectors$staging |> write_cnt(iris_sample, "iris_raw.rds")
# Process the data
processed <- iris_sample |>
  group_by(Species) |>
  summarise(mean_length = mean(Sepal.Length))
# Store the analysis results
my_connectors$analysis |> write_cnt(processed, "iris_summary.csv")
# Check contents of each connector
my_connectors$staging |> list_content_cnt()
#> [1] "iris_raw.rds"
my_connectors$analysis |> list_content_cnt()
#> [1] "iris_summary.csv"
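A later step in a pipeline would usually read its input back from the staging connector rather than rely on objects still in memory. The sketch below shows that pattern with a self-contained setup. The temporary directories and the use of base R aggregate() in place of dplyr are choices made for this example only.

```r
library(connector)

# Fresh directories standing in for the staging and analysis areas
staging_dir <- tempfile("staging_")
analysis_dir <- tempfile("analysis_")
dir.create(staging_dir)
dir.create(analysis_dir)

my_connectors <- connectors(
  staging = connector_fs(path = staging_dir),
  analysis = connector_fs(path = analysis_dir)
)

# Stage the raw data
my_connectors$staging |> write_cnt(iris[1:10, ], "iris_raw.rds")

# Later step: read from staging, summarise, store in analysis
raw <- my_connectors$staging |> read_cnt("iris_raw.rds")
summary_tbl <- aggregate(Sepal.Length ~ Species, data = raw, FUN = mean)
my_connectors$analysis |> write_cnt(summary_tbl, "iris_summary.rds")

# Read the stored result back
result <- my_connectors$analysis |> read_cnt("iris_summary.rds")
```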
Mixed Storage Types
One of the powerful features of the connector package is the ability to combine different storage types (files and databases) with the same interface. This lets you choose the best storage method for each type of data.
# Mix file system and database connectors in one collection
mixed_connectors <- connectors(
  files = connector_fs(path = "output"),
  database = connector_dbi(RSQLite::SQLite(), dbname = ":memory:")
)
# Store the same data in different formats
test_data <- data.frame(x = 1:3, y = letters[1:3])
# Save as CSV file
mixed_connectors$files |> write_cnt(test_data, "test.csv")
# Save as database table
mixed_connectors$database |> write_cnt(test_data, "test_table")
# List contents from both storage types using the same function
mixed_connectors$files |> list_content_cnt()
#> [1] "test.csv"
mixed_connectors$database |> list_content_cnt()
#> [1] "test_table"
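The shared interface also makes it straightforward to move data from one storage type to another. The sketch below copies a file into a database table and then deletes the file with remove_cnt(). The temporary output directory and the .rds format are assumptions made for this example.

```r
library(connector)

output_dir <- tempfile("output_")
dir.create(output_dir)

mixed_connectors <- connectors(
  files = connector_fs(path = output_dir),
  database = connector_dbi(RSQLite::SQLite(), dbname = ":memory:")
)

test_data <- data.frame(x = 1:3, y = letters[1:3])
mixed_connectors$files |> write_cnt(test_data, "test.rds")

# Copy the file contents into a database table
from_file <- mixed_connectors$files |> read_cnt("test.rds")
mixed_connectors$database |> write_cnt(from_file, "test_table")

# Remove the file now that the data lives in the database
mixed_connectors$files |> remove_cnt("test.rds")

remaining_files <- mixed_connectors$files |> list_content_cnt()
db_tables <- mixed_connectors$database |> list_content_cnt()

mixed_connectors$database |> disconnect_cnt()
```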
Summary
Creating connectors programmatically in R gives you the flexibility to:
- Use connector_fs() and connector_dbi() to create individual connectors for different storage types
- Use connectors() to group multiple connectors with meaningful names
- Access individual connectors by name: my_connectors$name
- Switch between storage backends while using the same functions: write_cnt(), read_cnt(), list_content_cnt(), remove_cnt()
This approach provides the same organized, consistent interface as YAML-based configuration while giving you the ability to create connectors dynamically based on your analysis needs.