MySQL

Overview

This page contains the setup guide and reference information for the MySQL source connector.

Features

FeatureSupportedNotes
Full Refresh SyncYes
Incremental - Append SyncYes
CDCYes
SSL SupportYes
SSH Tunnel ConnectionYes
ArraysYesByte arrays are not supported yet

Data type mapping

MySQL TypeResulting TypeNotes
bit(1)boolean
bit(>1)base64 binary string
booleanboolean
tinyint(1)boolean
tinyint(>1)float
tinyint(>=1) unsignedfloat
smallintfloat
mediumintfloat
intfloat
bigintfloat
floatfloat
doublefloat
decimalfloat
binarystring
blobstring
datestringISO 8601 date string. ZERO-DATE value will be converted to NULL. If column is mandatory, convert to EPOCH.
datetime, timestampstringISO 8601 datetime string. ZERO-DATE value will be converted to NULL. If column is mandatory, convert to EPOCH.
timestringISO 8601 time string. Values are in range between 00:00:00 and 23:59:59.
yearyear stringMySQL Docs (opens in a new tab)
char, varchar with non-binary charsetstring
char, varchar with binary charsetbase64 binary string
tinyblobbase64 binary string
blobbase64 binary string
mediumblobbase64 binary string
longblobbase64 binary string
binarybase64 binary string
varbinarybase64 binary string
tinytextstring
textstring
mediumtextstring
longtextstring
jsonserialized json stringE.g. {"a": 10, "b": 15}
enumstring
setstringE.g. blue,green,yellow
geometrybase64 binary string

If a specific type isn't mentioned in this list, it's typically converted into a string.

Getting started

Requirements and prerequisites

For utilizing the MySQL source, ensure you have the following:

  • MySQL Server version 8.0, 5.7, or 5.6.
  • Establishment of a dedicated read-only user account that can access all tables necessary for replication.
  • Ensure Y42 can access your database. This requirement depends on your network configuration. We suggest adding the Y42 production IP address, 35.198.72.34, to your whitelist.

Set up guide

Step 1: (Optional, but recommended) Create a dedicated read-only user

This stage is optional but strongly suggested to allow better permission management and auditing.

  1. To create a dedicated database user, execute the following commands on your database:

    _10
    CREATE USER 'y42'@'%' IDENTIFIED BY 'your_password_here';

    • The appropriate permissions vary between the STANDARD and CDC replication methods. For the STANDARD replication method, only the SELECT permission is needed.

      _10
      GRANT SELECT ON <database name>.* TO 'y42'@'%';

    • For CDC replication method, SELECT, RELOAD, SHOW DATABASES, REPLICATION SLAVE, REPLICATION CLIENT permissions are required.

      _10
      GRANT SELECT, RELOAD, SHOW DATABASES, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'y42'@'%';

Step 2: Set up the MySQL source

  1. Create a new source, select MySQL (Airbyte) from the Source type dropdown and enter a name for it.
  2. On the source setup page, create a new MySQL secret, enter a unique name for it.
  3. For the Host, Port, and DB Name, enter the hostname, port number, and name for your MySQL database.
  4. For User and Password, enter the username and password you created in Step 1.
  5. (Optional) If you want to use CDC replication select Logical Replication (CDC) as the replication method and enter the names of the replication slot and publication you created.
  6. (Optional) If you want to connect via SSH tunnel, select "SSH Tunnel Method": SSH Key Authentication or Password Authentication.

Step 3: (Optional) Setting up CDC based syncing

Binary logging is enabled by default since Mysql 8.0. For older versions make sure you have these parameters in your mysql server configuration:


_10
server-id = 1 (any none-zero value)
_10
log_bin = mysql-bin
_10
binlog_format = ROW (default)
_10
binlog_row_image = FULL (default)
_10
binlog_expire_logs_seconds = 864000 (recommended minimum)

During the secret setup set the replication method to Logical Replication.

Step 4: (Optional) Connection via SSH Tunnel

In case of an SSH connection, select "SSH Tunnel Method" other than "No Tunnel" and configure SSH tunneling:

  1. For SSH Tunnel Jump Server Host, enter the hostname or IP address for the intermediate (bastion) server that Y42 will connect to.
  2. For SSH Connection Port, enter the port on the bastion server. The default port for SSH connections is 22.
  3. For SSH Login Username, enter the username to use when connecting to the bastion server. Note: This is the operating system username and not the MySQL username.
  4. For authentication:
    • If you selected SSH Key Authentication, set the SSH Private Key to the private key that you are using to create the SSH connection. To generate a private key for SSH Tunneling, execute the following command:

      _10
      ssh-keygen -t rsa -m PEM -f myuser_rsa

      The command produces the private key in PEM format and the public key remains in the standard format used by the authorized_keys file on your bastion server. Add the public key to your bastion host to the user you want to use for data sync. The private key is provided via copy-and-paste to the configuration screen.
    • If you selected Password Authentication, enter the password for the operating system user to connect to the bastion server. Note: This is the operating system password and not the MySQL password.

Troubleshooting

  • Certain issues might arise when trying to map values from MySQL's datetime field to other relational databases. MySQL allows zero values for date/time as an alternative to NULL, which may not be compatible with other databases. To mitigate this issue, you can use the JDBC connector in the source setting and input the following key-value pair: zerodatetimebehavior=Converttonull
  • A few users have encountered difficulties when trying to connect to Amazon RDS MySQL or MariaDB, indicated by the error message: Cannot create a PoolableConnectionFactory. To rectify this problem, include enabledTLSProtocols=TLSv1.2 in the JDBC parameters.
  • Some users have reported an error when attempting to connect to Amazon RDS MySQL: Error: HikariPool-1 - Connection is not available, request timed out after 30001ms. Often, this issue stems from the VPC disallowing public traffic. Nevertheless, we suggest you go through the AWS RDS troubleshooting checklist (opens in a new tab) to ensure the appropriate permissions/settings have been granted for database connection.