
Run the Reflect Agent securely on your own hardware

The self-hosted Reflect Agent is a secure, scalable system that can connect directly to your databases and pipe data to the Reflect API without sharing your database connection information with Reflect. You can run an Agent singly or, for the sake of enhanced fault tolerance, in a multi-Agent cluster.

This tutorial outlines installing, configuring, and running a self-hosted Reflect Agent.

Pros and cons of running your own Agent

You do not need to run your own Reflect Agent to use Reflect, as we offer a publicly available cluster of agents that can connect to your databases securely and directly. You may want to consider running your own self-hosted Agent, however, if you don’t want to expose your database connection information to the Reflect API and/or you don’t want to expose your databases to the public internet.

Running your own Agent has two drawbacks: it is your responsibility to maintain the interface between the Reflect API and your Agent, and Reflect is not responsible for Agent failures (such as the Agent going down or losing connectivity).

Installation

The Reflect Agent is now available as a Docker image

Reflect now offers a reflect/agent Docker image, hosted on Docker Hub. For instructions on installing and running the image, see the Docker section below.

The Reflect Agent is currently available for a variety of Linux distributions. You can download the latest package for your distribution from our CDN using one of the links below:

The following files are installed by the package:

File Role
/etc/reflect/reflect-agent.json The main configuration file for the Reflect Agent.
/bin/reflect-agent Standalone binary executable used for Agent admin actions (like adding database connections and clustering Agents)

In addition, a handful of init scripts specific to your distribution will be installed.

Other operating systems

If your Linux distribution isn’t currently supported and you’d like us to add it, please let us know at support@reflect.io. Another option for other operating systems is to run the Reflect Agent using Docker.

Setup

Suggested best practice: read-only database access

You are likely to have the best experience running your own Reflect Agent if you use a read replica of your database with Reflect, if possible, rather than a production database that also accepts a write workload.

The default installation process starts the Agent when installation is complete. The default configuration will work out of the box. If you’d like to modify the configuration, however, you need to do two things:

  1. Change the desired parameters in the Agent’s configuration file
  2. Restart the Agent

You should make sure to modify the Agent’s configuration before running any management commands, such as adding database connections or joining the Agent to a cluster.

Encryption

By default, the Reflect Agent does not encrypt database connection information that is stored locally on the node. You can enable encryption prior to starting the Agent, however, by doing two things:

  1. Generating a random key to be used by the Agent for encryption
  2. Specifying the location of the key using the key_path parameter in the Agent’s configuration

The Reflect Agent stores your database credentials in an embedded BoltDB database and protects them using AES-256-CBC encryption with HMAC-SHA-256 authentication. This allows you to take backups of the database connection info stored by your Agent without having to worry about the security of the content.

To enable encryption, you’ll need to generate a 512-bit random key and supply it to the Agent. See the next section for an example.

Generating a private key

You can use a command like the following to generate a random 512-bit (64-byte) key for your Agent:

$ dd if=/dev/urandom of=/etc/reflect/private-key bs=1 count=64
$ chown reflect:reflect /etc/reflect/private-key
$ chmod 0400 /etc/reflect/private-key

This stores the 64-byte key in /etc/reflect/private-key, owned by the reflect user and readable only by that user.
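As a minimal sketch of the same steps with a sanity check added, the script below writes a key and confirms that it is exactly 64 bytes (512 bits). The KEY_FILE variable and its temporary default are assumptions made for illustration; on a real node you would point it at /etc/reflect/private-key and apply the chown command shown above.

```shell
# KEY_FILE is a hypothetical variable; the default is a throwaway temp file
# so that the sketch can be tried without root privileges.
KEY_FILE="${KEY_FILE:-$(mktemp)}"

# Read exactly 64 random bytes (512 bits) into the key file.
dd if=/dev/urandom of="$KEY_FILE" bs=1 count=64 2>/dev/null

# Restrict the file to read-only access for its owner.
chmod 0400 "$KEY_FILE"

# Confirm the key has the expected length before pointing key_path at it.
key_size=$(wc -c < "$KEY_FILE" | tr -d ' ')
if [ "$key_size" -eq 64 ]; then
  echo "key OK: $key_size bytes"
else
  echo "unexpected key size: $key_size bytes" >&2
fi
```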

Login

When an Agent first starts up, it will be running in an unauthenticated, “clean-slate” mode. At that point, you can do one of two things with the Agent:

  1. Log the Agent into the Reflect API:

    $ sudo /bin/reflect-agent login
    

    The console will then prompt you for your Reflect username and password. If you’re using encryption for the information stored by the Agent, you’ll need to generate a key first.

    If login is successful the console will return a simple OK.

  2. If the Agent will run as a node in an already existing cluster, join the Agent node to an existing cluster. See Join Agent nodes together for instructions.

Configuration

You can supply configuration to your Reflect Agent using the JSON configuration file located by default at /etc/reflect/reflect-agent.json. The table below lists the available parameters:

Parameter Description Default
data_dir The directory in which the Agent should store its state /var/lib/reflect/agent
key_path The path to your encryption key file No default; see Encryption for more information

Example configuration

Here is a full example JSON configuration (note that the data_dir value here is the default):

{
  "data_dir": "/var/lib/reflect/agent",
  "key_path": "/etc/reflect/private-key"
}
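If you generate the configuration from a script, a sketch like the following may help. It writes the two documented parameters to a file and warns if the key file cannot be read. CONFIG_FILE, DATA_DIR, and KEY_PATH are hypothetical variable names, and the temporary default for CONFIG_FILE is an assumption so that the sketch can run without touching /etc/reflect.

```shell
# Hypothetical variables; override them to match your installation.
CONFIG_FILE="${CONFIG_FILE:-$(mktemp)}"
DATA_DIR="${DATA_DIR:-/var/lib/reflect/agent}"
KEY_PATH="${KEY_PATH:-/etc/reflect/private-key}"

# Write the two documented parameters as a JSON object.
cat > "$CONFIG_FILE" <<EOF
{
  "data_dir": "$DATA_DIR",
  "key_path": "$KEY_PATH"
}
EOF

# Warn early if the key referenced by key_path is not in place yet.
if [ ! -r "$KEY_PATH" ]; then
  echo "warning: $KEY_PATH is not readable; generate the key first" >&2
fi

# Show the result for review.
cat "$CONFIG_FILE"
```

After reviewing the output, you would copy the file to /etc/reflect/reflect-agent.json and restart the Agent, as described in Setup.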

Basic operations

The Reflect Agent can be operated using a small set of service and reflect-agent commands.

Starting the Agent

$ service reflect-agent start

Stopping the Agent

$ service reflect-agent stop

Backing up the Agent’s data

$ sudo /bin/reflect-agent backup --path=/your/backup/filepath

If you don’t supply a path, backup will write to stdout, which is useful if you want to, for example, compress the contents using gzip or another format.
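For instance, the stdout behavior could be used to write compressed, timestamped backups. This is only a sketch: the AGENT_BIN and BACKUP_DIR variables, the backup_compressed helper, and the file naming scheme are all assumptions made for illustration, not part of the Agent.

```shell
# AGENT_BIN and BACKUP_DIR are hypothetical variables used for illustration.
AGENT_BIN="${AGENT_BIN:-/bin/reflect-agent}"
BACKUP_DIR="${BACKUP_DIR:-/var/backups/reflect}"

# backup_compressed (hypothetical helper): stream the backup from stdout
# through gzip into a timestamped file, then print that file's path.
backup_compressed() {
  out="$BACKUP_DIR/agent-$(date +%Y%m%d-%H%M%S).gz"
  mkdir -p "$BACKUP_DIR" || return 1
  # Without --path, backup writes to stdout. Note that the pipeline's exit
  # status is gzip's, so a failing Agent is not detected by this check.
  "$AGENT_BIN" backup | gzip > "$out" || return 1
  echo "$out"
}
```

Calling backup_compressed from a script run with sudo prints the path of the new archive. Because restore takes a --path flag, you would decompress the archive to a file (for example with gzip -dc) before restoring from it.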

Restoring the Agent

$ sudo /bin/reflect-agent restore --path=/your/backup/filepath

As with the backup command, make sure that you point the --path flag to the right path.

Managing database connections

One of the core purposes of the self-hosted Reflect Agent is to enable you to connect to your databases and send data to the Reflect API without sharing any database connection information with Reflect.

You can manage database connection info using the connections command, which enables you to add new connections or list existing connections.

List existing connections

You can list available database connections at any time:

$ sudo /bin/reflect-agent connections list

This will display only the name and driver fields for each database, not user and password information.

Add new connections

You can add new database connections to your Reflect Agent using the connections add command. Here’s an example for Postgres:

$ sudo /bin/reflect-agent connections add postgres \
  --name "Your connection" \
  --host localhost \
  --port 5432 \
  --user your-username \
  --password your-password \
  --database your-db-name \
  --connect_timeout 60 \
  --ssl require

When you run connections add the Agent will attempt to establish a connection with the database and inform you immediately if the connection has succeeded or failed. If the connection succeeds, two things will happen:

  • The connection information will be stored locally in an embedded database (encrypted, if you have enabled encryption)
  • The driver and name for the connection will be sent to the Reflect API (not connection info)

Here are the current driver options for database connections:

Driver Available versions
redshift Amazon Redshift
postgres PostgreSQL 9.x (9.1, 9.2, 9.3, 9.4, 9.5)
mysql MySQL 4.1+, MariaDB, Percona Server, Google CloudSQL
mssql Microsoft SQL Server
csv  
sqlite SQLite 3.0+

The Reflect HTTP driver

The Reflect Agent offers an HTTP driver as well, though it is currently supported only for advanced use cases. If you intend to use Reflect with an HTTP API, contact us at support@reflect.io for further information.

The tables below list, for each driver, the parameters that you can set via flags (as in the example above):

Redshift

Flag Description Default
name A name for the credential that you’re adding  
host The IP or hostname of the database you’re connecting to localhost
port The port through which the database is accessible 5439
user The username for connecting to the database (preferably a user with read-only permissions)  
password The password associated with user  
database The name of the database to connect to  
connect_timeout The connection timeout for attempting the connection, in seconds (minimum) 30
ssl The SSL mode for connecting to the database require

SSL options

Redshift supports several SSL modes. In most cases, you should use require or verify-ca, but others are provided for completeness.

Mode Description
disable No SSL, i.e. no encryption
require All data connections are encrypted, but no checking of the certificate chain
verify-ca All data connections are encrypted and certificate chain is checked
verify-full All data connections are encrypted and the certificate chain and hostname are checked

PostgreSQL

Flag Description Default
name A name for the credential that you’re adding  
host The IP or hostname of the database you’re connecting to localhost
port The port through which the database is accessible 5432
user The username for connecting to the database (preferably a user with read-only permissions)  
password The password associated with user  
database The name of the database to connect to  
connect_timeout The connection timeout for attempting the connection, in seconds (minimum) 30
ssl The SSL mode for connecting to the database disable

SSL options

PostgreSQL supports several SSL modes:

Mode Description
disable No SSL, i.e. no encryption
require All data connections are encrypted, but no checking of the certificate chain
verify-ca All data connections are encrypted and certificate chain is checked
verify-full All data connections are encrypted and the certificate chain and hostname are checked

See the Add new connections section above for a usage example.

MySQL

Flag Description Default
name A name for the credential that you’re adding  
host The IP or hostname of the database you’re connecting to localhost
port The port through which the database is accessible 3306
user The username for connecting to the database (preferably a user with read-only permissions)  
password The password associated with user  
database The name of the database to connect to  
timeout The connection timeout for attempting the connection, in seconds (minimum) 60
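By analogy with the Postgres example in Add new connections, adding a MySQL connection might look like the sketch below. It assumes that the flags in the table above are passed in the same --flag value form as for Postgres. The AGENT_BIN variable and the add_mysql_connection helper are hypothetical, and all values are placeholders. Note that MySQL uses timeout rather than connect_timeout.

```shell
# AGENT_BIN is a hypothetical variable pointing at the Agent executable.
AGENT_BIN="${AGENT_BIN:-/bin/reflect-agent}"

# add_mysql_connection NAME HOST USER PASSWORD DATABASE
# Hypothetical helper; assumes MySQL flags follow the Postgres example's form.
add_mysql_connection() {
  # 3306 is MySQL's standard port; 60 seconds is the documented timeout default.
  "$AGENT_BIN" connections add mysql \
    --name "$1" \
    --host "$2" \
    --port 3306 \
    --user "$3" \
    --password "$4" \
    --database "$5" \
    --timeout 60
}
```

A call such as add_mysql_connection "Your connection" localhost your-username your-password your-db-name, made from a script run with sudo, would then attempt the connection as described in Add new connections.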

Microsoft SQL Server

Flag Description Default
host The IP or hostname of the database that you’re connecting to localhost
port The port through which the database is accessible 1433
user The username for connecting to the database (preferably a user with read-only permissions)  
password The password associated with user sa
database The name of the database to connect to  
timeout The timeout in seconds for the connection attempt 30
ssl The SSL mode for connecting to the database verify

SSL options

SQL Server supports three SSL modes:

Mode Description
disable Encryption is disabled
no-verify Encryption is required, but the server certificate is not validated
verify Encryption is required, and the server certificate must be valid

CSV

Flag Description
source_url An HTTP(S) URL to download the CSV file from

SQLite

Flag Description
source_url An HTTP(S) URL to download the SQLite database from

Delete connections

You can delete an existing connection using the connections delete command. Connections are targeted for deletion by their slug (e.g. your-connection-1). Here’s an example:

$ sudo /bin/reflect-agent connections delete your-connection

Edit existing connections

You can modify connection info for currently existing connections using the connections edit command. Connections are targeted for modification by their slug (e.g. your-connection-1). You can edit a connection’s name using the name command or edit other connection info using the options command.

Here’s an example of modifying the name of the connection created in the example above (which has the slug your-connection):

$ sudo /bin/reflect-agent connections edit your-connection \
  name "New name"

Here’s an example of modifying the username from your-username to new-username:

$ sudo /bin/reflect-agent connections edit your-connection \
  options --user new-username

You can also change the database driver for the connection using the reset command. This would change the driver for the your-connection connection from postgres to redshift:

$ sudo /bin/reflect-agent connections edit your-connection \
  reset redshift \
  --host your-redshift.amazonaws.com \
  --user your-username \
  --password your-password \
  # etc

When changing a database driver using reset, you need to specify the full set of connection properties, even if some of those properties remain the same.

Clustering

The Reflect Agent can be run either singly or as a distributed cluster of multiple Agents. Although running the Agent singly should suffice for many use cases, running a cluster of Agents provides higher fault tolerance.

All clustering-related commands are available through the reflect-agent executable.

Create a new cluster

You can create a new cluster using the cluster create command.

$ sudo /bin/reflect-agent cluster create

You can specify an address for the node to bind to for communication with other cluster nodes using the --addr flag:

$ sudo /bin/reflect-agent cluster create --addr 10.0.0.1:9701

Default host and port

If you do not specify a host and port using the --addr flag, the Agent will default either to the value of the REFLECT_CLUSTER_ADDR environment variable or, if you have not set this variable, to a dynamically selected address. On Linux that will be the address of the default route; on macOS that will be the first non-loopback interface.

It can be useful to specify a non-default host and port using --addr if you have, for example, one public and one private interface and need to specify which of the two interfaces to bind to.
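The order of precedence described above can be illustrated with a short sketch. This is not the Agent's implementation: resolve_cluster_addr is a hypothetical helper that only mirrors the documented order (flag, then environment variable, then dynamic selection), and the addresses are placeholders.

```shell
# resolve_cluster_addr (hypothetical helper): mirror the documented order.
# The first argument stands in for the value passed to --addr, if any.
resolve_cluster_addr() {
  flag_addr="${1:-}"
  if [ -n "$flag_addr" ]; then
    echo "$flag_addr"                 # 1. explicit --addr value
  elif [ -n "${REFLECT_CLUSTER_ADDR:-}" ]; then
    echo "$REFLECT_CLUSTER_ADDR"      # 2. environment variable
  else
    echo "(dynamically selected)"     # 3. chosen by the Agent at runtime
  fi
}

REFLECT_CLUSTER_ADDR=10.0.0.5:9701
resolve_cluster_addr 10.0.0.1:9701    # prints 10.0.0.1:9701
resolve_cluster_addr ""               # prints 10.0.0.5:9701
unset REFLECT_CLUSTER_ADDR
resolve_cluster_addr ""               # prints (dynamically selected)
```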

Join Agent nodes together

You can join an Agent to a node in an existing cluster using the cluster join command and specifying the host and port of the existing node using the --addr flag:

$ sudo /bin/reflect-agent cluster join \
  --addr 10.0.0.1:9701 10.0.0.2:9701

If you don’t specify a host and port using --addr, the Agent will use the default.

You can cluster a node by force by setting the --force flag:

$ sudo /bin/reflect-agent cluster join \
  --addr 10.0.0.1:9701 \
  --force

You might need to force join a node that has already done any of the following:

  • Logged in to the Reflect API
  • Added a database connection
  • Joined another cluster

Force joins will erase all connection data

When you force join a node to the cluster, it will permanently erase its internal storage and replace it with the storage from the cluster.

You can run reflect-agent backup before force joining in case you’re concerned about erasing existing data. For more info, see Basic Operations.
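One way to make the backup a precondition of the force join is sketched below. The AGENT_BIN variable and the force_join_with_backup helper are hypothetical; the two Agent commands are the backup and cluster join forms shown elsewhere in this tutorial.

```shell
# AGENT_BIN is a hypothetical variable pointing at the Agent executable.
AGENT_BIN="${AGENT_BIN:-/bin/reflect-agent}"

# force_join_with_backup BACKUP_FILE ADDR
# Hypothetical helper: back up first, and only force join if that succeeds.
force_join_with_backup() {
  backup_file="$1"
  addr="$2"
  if ! "$AGENT_BIN" backup --path="$backup_file"; then
    echo "backup failed; not force joining" >&2
    return 1
  fi
  # The force join erases this node's internal storage, so it runs last.
  "$AGENT_BIN" cluster join --addr "$addr" --force
}
```

For example, force_join_with_backup /your/backup/filepath 10.0.0.1:9701, called from a script run with sudo, would leave the node untouched if the backup command reported an error.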

List cluster members

You can list all current members of the cluster at any time:

$ sudo /bin/reflect-agent cluster members

The output will provide an IP address, port, and leader status (Yes/No) for each member:

+---+-----------+------+--------+
|   |  ADDRESS  | PORT | LEADER |
+---+-----------+------+--------+
|   | 10.0.1.25 | 9701 | Yes    |
| * | 10.0.1.26 | 9701 | No     |
+---+-----------+------+--------+
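If you need the leader's address in a script, the table can be parsed with awk. The sketch below assumes the output format is exactly as in the sample above; find_leader is a hypothetical helper, and the sample text stands in for the real command output.

```shell
# Sample output copied from the table above; in practice you would pipe
# `sudo /bin/reflect-agent cluster members` into find_leader instead.
sample_output='+---+-----------+------+--------+
|   |  ADDRESS  | PORT | LEADER |
+---+-----------+------+--------+
|   | 10.0.1.25 | 9701 | Yes    |
| * | 10.0.1.26 | 9701 | No     |
+---+-----------+------+--------+'

# find_leader (hypothetical helper): print host:port of the leader row.
find_leader() {
  awk -F'|' '
    NF >= 5 {
      # Columns 3, 4, and 5 hold the address, port, and leader status.
      addr = $3; port = $4; leader = $5
      gsub(/ /, "", addr); gsub(/ /, "", port); gsub(/ /, "", leader)
      if (leader == "Yes") print addr ":" port
    }'
}

leader=$(printf '%s\n' "$sample_output" | find_leader)
echo "$leader"   # prints 10.0.1.25:9701
```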

Remove cluster members

You can remove an Agent from the cluster using the remove-member command and specifying the host and port of the Agent to remove:

$ sudo /bin/reflect-agent cluster remove-member 10.0.0.1:9701

Uninstallation

How you should uninstall the Reflect Agent depends on which package management system you’re using. Please see your vendor’s documentation for uninstallation instructions.

Docker

If you’d like to run the Reflect Agent inside a container rather than as a Linux package, Reflect provides a Docker image that you can pull from Docker Hub:

$ docker pull reflect/agent:latest

Once you’ve pulled the image, you can run it as you would any other Docker image. Here’s an example:

# -v specifies a storage volume, --name gives the agent a name,
# and the final argument is the Reflect Agent image
$ docker run \
  -v /opt/reflect/agent:/agent \
  --name=reflect-agent \
  reflect/agent:latest

Clustering containerized agents

If you’d like to run containerized Reflect agents in a cluster, run each agent in host mode (as explained below) and execute all commands on each agent node using docker exec.

To run an agent in host mode, apply --net=host when running the image. Here’s an example:

$ docker run \
  -v /opt/reflect/agent:/agent \
  --name=reflect-agent \
  --net=host \
  reflect/agent:latest

Below is a set of example commands that would start up two containerized agents and then connect them to one another:

# Start up the first agent
$ docker run \
  -v /opt/reflect/agent-1:/agent \
  --name=reflect-agent-1 \
  --net=host \
  reflect/agent:latest

# Start up the second agent
$ docker run \
  -v /opt/reflect/agent-2:/agent \
  --name=reflect-agent-2 \
  --net=host \
  reflect/agent:latest

# Instruct the first agent to create a cluster: reflect-agent-1 is the
# agent on which to run the command, and the rest is the command itself
$ docker exec -it \
  reflect-agent-1 \
  reflect-agent cluster create

# Join the two agents together
$ docker exec -it \
  reflect-agent-2 \
  reflect-agent cluster join --addr 10.0.2.15:9701 10.0.2.15:9801

The IP addresses and ports that you need to specify using the --addr flag will depend on your installation. You can fetch IP and port information about your cluster at any time using the cluster members command (as with non-containerized agents). Here’s an example:

$ docker exec -it \
  reflect-agent-1 \
  reflect-agent cluster members