Wednesday, March 11, 2026

HeatWave External Table (Lakehouse) with Generated Column in 9.6.1

This tutorial shows how to define an External Table in HeatWave 9.6.1+ for JSON data with Generated Column(s).

Pre-requisite

HeatWave : 9.6.1+

Lakehouse : Enabled with the HeatWave Cluster


Lakehouse in HeatWave is a feature that allows files in different formats to be loaded into HeatWave.

Together with the Event-Based Incremental Loading feature of HeatWave Lakehouse, adding, changing, or deleting files can automatically update the HeatWave table.


Syntax :

CREATE EXTERNAL TABLE table_name (col_name data_type [ENGINE_ATTRIBUTE='Col_Engine_Attribute_JSON'], ...) ENGINE_ATTRIBUTE='Tbl_Engine_Attribute_JSON' [PARTITION BY KEY (partition_column_name)];


For the JSON format, here is an example and sample output.


JSON file :  demo.json 

{"a":1,"b":2}

{"a":3,"b":4}


SQL :

> create database lh;

> use lh;

> create table json_table (
    col_1 json,
    _metadata_filename varchar(1024),
    mya int generated always as (col_1->"$.a") stored,
    myb int generated always as (col_1->"$.b") stored
  )
  engine_attribute='{"dialect":{"format":"json"}, "file": [ {"uri":"oci://<bucket>@<namespace>/autorefresh/demo.json"}]}'
  engine=lakehouse
  secondary_engine=rapid;


> alter table json_table secondary_load;
> select * from json_table;

+------------------+----------------------------------------------------------------------------------------------------+-----+-----+
| col_1            | _metadata_filename                                                                                 | mya | myb |
+------------------+----------------------------------------------------------------------------------------------------+-----+-----+
| {"a": 1, "b": 2} | https://objectstorage.uk-london-1.oraclecloud.com/n/<namespace>/b/<bucket>/o/autorefresh/demo.json |   1 |   2 |
| {"a": 3, "b": 4} | https://objectstorage.uk-london-1.oraclecloud.com/n/<namespace>/b/<bucket>/o/autorefresh/demo.json |   3 |   4 |
+------------------+----------------------------------------------------------------------------------------------------+-----+-----+


Reference :

https://dev.mysql.com/doc/relnotes/heatwave/en/news-9-6-1.html

https://dev.mysql.com/doc/heatwave/en/mys-hw-lakehouse-key-partitioning.html


Tuesday, July 1, 2025

HeatWave MySQL Disaster Recovery with Bi-Directional Replication

HeatWave MySQL is a MySQL database service that is part of Oracle Cloud Infrastructure (OCI). It's a fully managed database service that allows you to run MySQL workloads in the cloud.


Replication in MySQL is asynchronous in nature and uni-directional. HeatWave provides a Channel to define inbound replication from another "MySQL" or "HeatWave". If we have two HeatWave DB Systems in different regions, a Replication Channel is a good way to define replication between the regions.

To set up a HeatWave Channel between regions, please refer to the following URL for more information:

Ref : https://docs.oracle.com/en/learn/oci-mysql-dr-copy/index.html


Although replication is uni-directional, you might want to define bi-directional replication between the two DB Systems. This is achieved with two Replication Channels, where each DB System defines a channel pointing at the other.
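
Once both channels are created (from the OCI Console or CLI), a quick sanity check can be run on each DB System with standard SQL; this is only a minimal sketch, as the channels themselves remain OCI-managed resources.

On each DB System, confirm the inbound channel is running without error:

mysql> SHOW REPLICA STATUS\G

GTID-based replication must be enabled on both DB Systems:

mysql> SHOW VARIABLES LIKE 'gtid_mode';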


Wednesday, June 4, 2025

Troubleshooting HeatWave Replication Channel

HeatWave is a MySQL database service running in the cloud with a powerful in-memory data warehouse that provides fast and scalable analytics. One of its key features is the ability to replicate data from MySQL source databases. A replicated deployment can help scale horizontally by providing multiple MySQL servers to query, and a replica can serve as a disaster recovery database running in another region. However, like any complex system, HeatWave replication can sometimes encounter issues. This article provides a step-by-step guide on how to troubleshoot a HeatWave replication channel.

Understanding HeatWave Replication

Before we dive into troubleshooting, it's essential to understand how HeatWave replication works. HeatWave replication is based on MySQL replication, where data is replicated from a MySQL source database via binlog Replication. The replication process involves creating a replication channel, which is a connection between the MySQL source database and the HeatWave database.

Common Issues with HeatWave Replication

Some common issues that can occur with HeatWave replication include: 

  • Connectivity issue between Source and Target: The replication channel may not be able to establish a connection between the MySQL source database and the HeatWave database due to network connectivity issues or firewall rules. 
  • GTIDs not matching with source: The GTIDs (Global Transaction Identifiers) on the HeatWave database may not match the GTIDs on the MySQL source database, causing replication issues. 
  • Data loading issues: Data may not be loading correctly into the HeatWave database due to issues with the data loading process. 
  • MySQL IO Thread and SQL Applier Thread Issues: The MySQL IO thread and SQL applier thread are responsible for replicating data from the MySQL source database to the HeatWave database. Issues with these threads can cause replication problems. 
  • EVENT Issues with DR Setup: EVENTs, which are scheduled jobs in MySQL, can cause issues with replication if not properly configured. 
  • Primary Key issues where HeatWave is configured with HA but the production source is running as a Standalone system. 

Troubleshooting Steps

To troubleshoot a HeatWave replication channel, follow these steps: 

  • Check the replication channel status: Use the SHOW REPLICA STATUS command to check the status of the replication channel. This provides information on the current state of the channel, including any errors that may have occurred. On the OCI HeatWave Console, the channel dashboard shows the status if there is any issue, and the graphs show the replication channel health; an alarm can be set up to send an email alert upon any channel failure. 
  • Verify the connectivity between Source and Target: Check that there are no network connectivity issues between the MySQL source database and the HeatWave database. 
  • Check the GTIDs on the HeatWave database (see the GTID comparison sketch after this list): 
    • Use the SELECT * FROM mysql.gtid_executed command to check the GTIDs on the HeatWave database.
    • Use SHOW VARIABLES LIKE '%gtid%' to identify differences between source and target. 
  • Verify whether there is any EVENT running on both Source and Target. Use the following SQL to list events per database: 
    • SELECT event_schema, event_name, definer, interval_value, interval_field, status FROM information_schema.EVENTS where event_schema not in ('mysql_task_management', 'mysql', 'sys'); 
  • Retrieve the applier error from the performance_schema table: Use the SELECT * FROM performance_schema.replication_applier_status_by_worker command to retrieve the applier error from the performance_schema table. 
  • Check the MySQL IO Thread and SQL Applier Thread status: Use the SHOW REPLICA STATUS command to check the status of the MySQL IO thread and the SQL applier thread. 
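
As a minimal sketch of the GTID comparison mentioned above (the GTID set value is a placeholder copied from the source):

mysql> SELECT @@GLOBAL.gtid_executed;   -- run on the Source and note the value
mysql> SELECT GTID_SUBSET('<source_gtid_executed>', @@GLOBAL.gtid_executed);   -- run on the HeatWave replica; 1 means every source transaction is already applied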

EVENT Issues with DR Setup

EVENTs can cause issues with replication if not properly configured. For example, if an EVENT is scheduled to run on the source database and creates a new record, the replication process may encounter a duplicate key error if the same record is created on the target database.
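
As a hypothetical illustration (the orders table and the event name are made up for this example), an event enabled on both sides inserts the same key twice:

mysql> CREATE EVENT ev_daily_order_summary
       ON SCHEDULE EVERY 1 DAY
       DO INSERT INTO orders (order_id, order_date) VALUES (DATE_FORMAT(CURDATE(), '%Y%m%d'), CURDATE());

When the event fires on the source, the inserted row reaches the target through the binlog; if the same event is also enabled and fires on the target, the second insert hits the existing primary key and the SQL applier stops with a duplicate key error.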

Best Practices for Using MySQL Shell to Dump and Load Data to HeatWave

MySQL Shell is a powerful tool for dumping and loading database content, particularly for external databases from on-premise or other environments. When using MySQL Shell to dump and load data to HeatWave, it's essential to follow best practices to ensure a smooth and successful process.

  • It is always recommended to use the latest version of MySQL Shell whenever possible, regardless of whether the target database version is 5.7, 8.0, 8.4 or 9.x.
  • Performance depends on the network connectivity and the CPU of the compute instance where MySQL Shell is installed. Using 4 or more CPUs is highly recommended so that parallel threads and the available network bandwidth can be used effectively.
  • Whether to store the dump content on the local file system or in Object Storage depends on the size of the backup and the performance requirements.


To dump data from a MySQL source database, you can use the util.dumpInstance utility in MySQL Shell. This utility provides a convenient way to dump the entire database instance, including all databases, tables, and other database objects. util.dumpInstance has many options, and each application imposes its own constraints on how the database must be configured. The following example illustrates one particular scenario and may not work for all cases.

When using util.dumpInstance, it's essential to consider the following best practices: 

  • Check for tables without primary keys: MySQL Shell's util.dumpInstance utility checks for tables without primary keys by default. It's recommended to fix any table without a primary key before dumping the data. This ensures that the data is properly loaded into HeatWave when a High Availability deployment is configured.  

Solutions to primary key issue with the Source

    • Fix it manually in the source database.  
    • Workaround 1: Create invisible primary keys in the target. The create_invisible_pks option can be specified with util.dumpInstance(...). However, if the table has an AUTO_INCREMENT column defined, this option fails for that table; in that case the AUTO_INCREMENT column can be made the primary key manually.    
    • Workaround 2: Apply the ignore_missing_pks option to util.dumpInstance(...) and then loadDump into the target HeatWave. After identifying the tables without a primary key, invisible primary keys can be applied to those tables manually. Also consider replication from a source where those tables have no primary key: the replication channel must be defined with the "Generate Primary Key" option, and the HeatWave Configuration should set the variable sql_generate_invisible_primary_key to ON. Please refer to the following link for more info: https://blogs.oracle.com/mysql/post/introducing-mysql-heatwave-generated-invisible-primary-keys 
      • SELECT t.table_schema, t.table_name FROM information_schema.tables t LEFT JOIN (SELECT table_schema, table_name FROM information_schema.statistics WHERE index_name = 'PRIMARY' GROUP BY table_schema, table_name, index_name ) pks ON t.table_schema = pks.table_schema AND t.table_name = pks.table_name WHERE pks.table_name IS NULL AND t.table_type = 'BASE TABLE' AND t.table_schema NOT IN ('mysql', 'sys', 'performance_schema', 'information_schema');
  • Exclude events: Events can cause issues with replication if not properly configured. To avoid these issues, you can exclude specific events from the dump with the excludeEvents option, or skip all events with events: false. However, if events are not loaded to the target, any later modification of an event on the source database may result in an error on the replication channel. In a DR setup, an event missing from the DR database may result in improper handling, so developers must carefully consider what to do upon DR activation, e.g. create the event upon DR activation. Here's an example of how to use util.dumpInstance with events excluded:

util.dumpInstance("mydump", { threads: 4, consistent: true, ocimds: true, events: false, compatibility: ["strip_definers", "strip_invalid_grants", "strip_restricted_grants", "skip_invalid_accounts"], targetVersion:"8.0.39" })

Once you have dumped the data from the MySQL source database, you can load it into the HeatWave database using the util.loadDump utility. Example :

util.loadDump("mydump", { updateGtidsSet: "replace", loadUsers: true, progressFile: "load_progress.txt", resetProgress: true, ignoreVersion: true })

  • The updateGtidsSet: "replace" option is used for any new DB System where the GTID set is empty. loadUsers is false by default; for DR, it is important to set the option loadUsers: true.

When loading data into HeatWave, it's essential to consider the following best practices: 

  • Create primary keys: If the source database has tables without primary keys, you can use the createInvisiblePKs option with util.loadDump to create invisible primary keys on the target database (see the sketch after this list). 
  • Define replication channel carefully: When defining the replication channel, you must carefully consider the configuration to ensure that it is properly set up. This includes configuring the replication channel to handle tables without primary keys on the source database.
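
A minimal sketch of both approaches (the dump name, table name and option values are illustrative only):

util.loadDump("mydump", { createInvisiblePKs: true, updateGtidsSet: "replace", loadUsers: true })

Alternatively, an invisible primary key can be added manually to an individual table before replication is configured:

mysql> ALTER TABLE mydb.t_no_pk ADD COLUMN my_row_id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT INVISIBLE, ADD PRIMARY KEY (my_row_id);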

Additional Considerations

When using MySQL Shell to dump and load data to HeatWave, there are additional considerations to keep in mind: 

  • EVENT exclusion: If you exclude events from the dump, any modification to existing events on the source database may result in an error on the target database, which does not have the event registered. 
  • GTID maintenance: When loading data into HeatWave, it's essential to maintain the GTIDs (Global Transaction Identifiers) to ensure that the replication process is properly configured. 
  • Version: The latest 9.x version of MySQL Shell can be used to dump / load across MySQL versions. Please check compatibility in the documentation. To dump data and load it into HeatWave, you can use the util.dumpInstance / util.loadDump utilities with version-specific options. Here are examples: 
    • util.dumpInstance("mydump", { targetVersion: "8.0.39", ..... })
    • util.loadDump("mydump", { ...., ignoreVersion: true }) 

Examples of Common Issues 

  • SQL Applier thread stopped with failure :
    • SQL command running on Source is not valid in HeatWave database  
    • the privilege is not granted / valid on HeatWave database
    • the record being updated /deleted is not found in HeatWave database
    • Duplicated Key in the target database for insert operation
Very often, these errors are due to inconsistent data between the Source and Target databases, or to manual operations that change the Target database directly.
  • Source GTIDs have been purged, so GTIDs missing on the HeatWave database can no longer be retrieved from the Source.   
  • Duplicate Key Error due to EVENT : Suppose we have a table orders with a primary key. If an EVENT creates a new record on both the source and target databases, the replication process may encounter a duplicate key error. 
  • Group Replication Limitations: Suppose we have an HA configuration with HeatWave and the group replication has a maximum transaction size limit. If a transaction exceeds this limit, it may fail. 
  • Max Binlog Cache Size: The max_binlog_cache_size variable controls the maximum size of the binary log cache. If this value is too small, it can cause issues with large transactions.
  • Binlog Expiry: The binlog_expire_logs_seconds variable controls how long binary logs are kept on the source database. If this value is too small, it can cause issues with replication. The default setting with HeatWave is 3600 seconds, i.e. 1 hour.
  • Relay Log Max Size: The max_relay_log_size variable controls the maximum size of the relay log on the HeatWave cluster. If this value is too small, it can cause issues with replication if the Applier fails and the relay log is not large enough to hold the transactions. 
  • MySQL 8.0.x Considerations: When using MySQL 8.0.x, creating stored procedures or triggers with a definer different from the administrator may result in an error due to the lack of the SET_USER_ID privilege. The error log may show the following error, MY-1227 (42000): 
    • Access denied; you need (at least one of) the SUPER or SET_USER_ID privilege(s) for this operation
    • Ref : https://docs.oracle.com/en-us/iaas/mysql-database/doc/troubleshooting-inbound-replication.html

To resolve this issue with MySQL 8.0.x, contact Oracle Support to grant the SET_USER_ID privilege to the applier username.

Upgrading MySQL 8.0.x to 8.4.<latest> or 9.<latest> is highly recommended. SET_USER_ID was deprecated in 8.2 and removed in 8.4. With MySQL 8.4 or above, administrators are granted the new SET_ANY_DEFINER privilege instead.
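
As a minimal sketch (the account name is illustrative and assumes your administrative account is allowed to grant it), the replacement privilege can be granted explicitly on 8.4+ if required:

mysql> GRANT SET_ANY_DEFINER ON *.* TO 'migration_user'@'%';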

Summary 

By following these troubleshooting steps and best practices, you can ensure that your HeatWave replication channel is working correctly and that your data is being replicated accurately and efficiently. 


Please also see the inbound replication troubleshooting document: https://docs.oracle.com/en-us/iaas/mysql-database/doc/troubleshooting-inbound-replication.html


Thursday, September 12, 2024

MySQL/HeatWave Audit Archive and Dump to Object Storage

Auditing is one of the key concerns with Security & Compliance for many organizations.   

This article shares a tutorial on how to archive the audit log by reading it. The archiving process reads the audit log using the audit_log_read function provided for audit access. Once the data is archived to the table "audit_archive.audit_data", the table is renamed to "audit_data_<timestamp>" accordingly. The table is then dumped to an OCI Object Storage bucket using the MySQL Shell utility. The tutorial is for demo purposes only.   

MySQL Enterprise Audit is the feature that implements auditing. For the managed MySQL service, namely the HeatWave Service on Oracle Cloud Infrastructure, Enterprise Audit is pre-installed. If you run MySQL Enterprise Edition on your own compute VM, please refer to the documentation on how to install MySQL Enterprise Audit.

Ref : https://dev.mysql.com/doc/refman/8.4/en/audit-log-installation.html


The default AUDIT configuration in HeatWave generates audit records in JSON format. Below is an example of the "audit%" system variables when HeatWave MySQL is provisioned.

mysql> show variables like 'audit%';

+--------------------------------------+---------------------+

| Variable_name                        | Value               |

+--------------------------------------+---------------------+

| audit_log_buffer_size                | 10485760            |

| audit_log_compression                | GZIP                |

| audit_log_connection_policy          | ALL                 |

| audit_log_current_session            | OFF                 |

| audit_log_database                   | mysql_audit         |

| audit_log_disable                    | OFF                 |

| audit_log_encryption                 | NONE                |

| audit_log_exclude_accounts           |                     |

| audit_log_file                       | /db/audit/audit.log |

| audit_log_filter_id                  | 1                   |

| audit_log_flush                      | OFF                 |

| audit_log_flush_interval_seconds     | 60                  |

| audit_log_format                     | JSON                |

| audit_log_format_unix_timestamp      | ON                  |

| audit_log_include_accounts           |                     |

| audit_log_max_size                   | 5368709120          |

| audit_log_password_history_keep_days | 0                   |

| audit_log_policy                     | ALL                 |

| audit_log_prune_seconds              | 604800              |

| audit_log_read_buffer_size           | 32768               |

| audit_log_rotate_on_size             | 52428800            |

| audit_log_statement_policy           | ALL                 |

| audit_log_strategy                   | ASYNCHRONOUS        |

+--------------------------------------+---------------------+

23 rows in set (0.09 sec)


Audit Log Pruning

Audit Log Pruning is enabled for JSON format on OCI HeatWave Service. 

For details about Audit Log Pruning, please refer to documentation : https://dev.mysql.com/doc/refman/8.4/en/audit-log-logging-configuration.html#audit-log-pruning

The audit_log_max_size is configured as 5368709120 (~5GB).  The audit_log_prune_seconds is configured as 604800 (7 days).  

Based on the documentation above, 

"Nonzero values of audit_log_max_size take precedence over nonzero values of audit_log_prune_seconds. "

Because audit_log_max_size is set to ~5GB, it takes precedence for the pruning action. The "audit_log_rotate_on_size" value is about 50MB.  




Demo Environment 

- Oracle Cloud Infrastructure (OCI) HeatWave Service 

- Version : 9.0.1

- Compute VM : Oracle Linux 8

    - MySQL Shell version : 9.0.1

    - oci cli installed and configured 

        https://docs.oracle.com/en-us/iaas/Content/API/SDKDocs/cliinstall.htm#InstallingCLI__oraclelinux8


Creating Table Structure for AUDIT ARCHIVE

The archived audit records are stored in DB table(s).   

Table                    Description
audit_config             Keeps track of the last archived record's timestamp and id (the audit log identifier) for each server_uuid
audit_data_template      The table structure template for audit_data
audit_data               The table that stores the running archived records
audit_data_<timestamp>   Each time archiving is executed, audit records are retrieved from the timestamp registered in the audit_config table until all records are retrieved; the audit_data table is then renamed to audit_data_<timestamp>
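
As a minimal sketch of the read loop behind the archiving process (simplified; the actual tutorial stores the last timestamp/id in audit_config and inserts the returned events into audit_archive.audit_data):

mysql> SELECT audit_log_read(audit_log_read_bookmark());   -- start reading from the current bookmark
mysql> SELECT audit_log_read();                             -- continue reading the next batch of events

Each call returns a JSON array of audit events; the archiving procedure parses that array, stores the events, and saves the new position back to audit_config for the next run.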


Tuesday, February 27, 2024

MySQL Keyring Component Installation for TDE


Plugins have been used extensively with MySQL, and they are evolving into COMPONENT deployment. This article shares the steps for MySQL Keyring Component installation.

MySQL Enterprise Edition includes an encrypted-file keyring component. This provides a more secure way to store the master key used for TDE.

Installation of a keyring component has to be static, via a manifest file, rather than by running the SQL command "INSTALL COMPONENT". There are two scopes for component installation.


Global vs Local

With Global component installation, the configuration is located in the MySQL installation folder.

With Local component installation, the global configuration is referenced to locate the local configuration in the data directory (Datadir).
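
A minimal sketch of the two manifest layouts, using the encrypted-file keyring component as an example (file locations are shown for illustration and should be adjusted to your installation):

Global scope - <mysql installation directory>/mysqld.my :

{ "components": "file://component_keyring_encrypted_file" }

Local scope - <mysql installation directory>/mysqld.my :

{ "read_local_manifest": true }

and <datadir>/mysqld.my :

{ "components": "file://component_keyring_encrypted_file" }

The component then reads its own configuration file (for example component_keyring_encrypted_file.cnf) following the same global/local rule.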


Monday, March 20, 2023

MySQL - determine transaction size

When running MySQL InnoDB Cluster / Group Replication, there is a transaction size limit (https://dev.mysql.com/doc/refman/8.0/en/group-replication-options.html#sysvar_group_replication_transaction_size_limit).


The default value of "group_replication_transaction_size_limit"  is about 143 MB (150000000 bytes)

How can we determine the size of a batch job (single transaction) in MySQL?


Setting a small value for the variable group_replication_transaction_size_limit and running the transaction produces an error message in the error log. 


For example :

Assuming there is a primary node on port 3310:

mysql -uroot -p -h127.0.0.1 -P3310  -e "  set global group_replication_transaction_size_limit=1024;"


mysql -uroot -p -h127.0.0.1 -P3310 << EOL

create database if not exists demox;

create table demox.mytable (f1 int not null auto_increment primary key, f2 varchar(1024));

insert into demox.mytable (f2) values (repeat('a',1024));

EOL


Checking the error log file, we may see messages like: 

2023-03-21T06:40:40.997931Z 1133317 [ERROR] [MY-011608] [Repl] Plugin group_replication reported: 'Error on session 1133317. Transaction of size 1833 exceeds specified limit 1024. To increase the limit please adjust group_replication_transaction_size_limit option.'

2023-03-21T06:40:40.997991Z 1133317 [ERROR] [MY-010207] [Repl] Run function 'before_commit' in plugin 'group_replication' failed


With the reduced transaction size limit in place, the actual transaction size is reported in the error log file.


 Note: This is not a good practice for production usage.  Adjusting the global variable  'group_replication_transaction_size_limit' affects all transactions.     
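
If this technique is used on a test system, remember to restore the default value afterwards (150000000 bytes, as noted above):

mysql -uroot -p -h127.0.0.1 -P3310  -e "  set global group_replication_transaction_size_limit=150000000;"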



Wednesday, November 16, 2022

Secured MySQL InnoDB Cluster with Certificate creation using OpenSSL

This is a demo tutorial showing how to create an InnoDB Cluster with newly installed certificates and perform X509 certificate verification via a MySQL Router connection.   


Recorded Video

The full process is recorded on YouTube - creating an InnoDB Cluster with a newly installed CA and server certificates, configuring the Router with an SSL server certificate issued by the same CA used across the server nodes, and creating a user (create user my509user identified by '....' require X509) whose X509 certificate is verified via pass-through on the Router connection.

https://www.youtube.com/watch?v=w1xgpjw0VTw


Environment

The following environment was tested 

Oracle Linux Server release 8.6

MySQL Server 8.0.31

MySQL Shell 8.0.31

MySQL Router 8.0.31


Github Script

The GitHub script provides the steps to guide you through the installation of the 3 nodes 

https://github.com/ivanxma/mylab/tree/main/13-InnoDBCluster/99-SSL-cert-IC


Configure the node1/node2/node3 hostnames in ./comm.sh (change the hostnames based on your environment):

```

export HOST1=workshop20

export HOST2=workshop22

export HOST3=workshop23

```


Background

Creating an InnoDB Cluster creates internal users (mysql_innodb_cluster_<server_id>@'%') 

Here is an example :

mysql> select user,host from mysql.user;

+--------------------------+-------------+

| user                     | host        |

+--------------------------+-------------+

| gradmin                  | %           |

| mysql_innodb_cluster_101 | %           |

| mysql_innodb_cluster_201 | %           |

| mysql_innodb_cluster_301 | %           |

| mysql.infoschema         | localhost   |

| mysql.session            | localhost   |

| mysql.sys                | localhost   |

| root                     | localhost   |

+--------------------------+-------------+

8 rows in set (0.00 sec)


Note : RENAME USER does not work for these internal accounts.

 (e.g.  mysql > rename user mysql_innodb_cluster_101@'%' to mysql_innodb_cluster_101@'10.0.%' )  


"%" as host may can be restricted to specific IP subnet via replicationAllowedHost option with MySQL Shell Admin API.   

As an example to restrict user creation with Host from subnet 192.0.2.0/24:

mysql-js> dba.createCluster('testCluster', {replicationAllowedHost:'192.0.2.0/24'})


Group Replication with the "MYSQL" communication stack goes through the standard MySQL port (for example, 3306). The encrypted communication / recovery channel is established with the SSL certificate.    


group_replication_ssl_mode is configured as REQUIRED to ensure secure SSL connections between nodes.

REQUIRED

Establish a secure connection if the server supports secure connections.


group_replication_ssl_mode can also be configured as VERIFY_CA to ensure the certificate is used and verified against the CA.
VERIFY_CA

Like REQUIRED, but additionally verify the server TLS certificate against the configured Certificate Authority (CA) certificates.
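
As a minimal my.cnf sketch for each node (file names are illustrative and assume the CA and server certificates created in the steps below are placed in the data directory):

[mysqld]
ssl_ca=ca.pem
ssl_cert=server-cert.pem
ssl_key=server-key.pem
group_replication_ssl_mode=VERIFY_CA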


The following sections describe the steps to set up the VERIFY_CA option and to create the certificates using the OpenSSL command.

1. On each node, initialize the MySQL Data Directory [ Script ]

2. Create the CA certificate on Node1 and use the same CA certificate to create server-cert.pem for each server 
Refer to CA creation [ Script ]
Refer to Server certificate creation [ Script ]


3. Start up the server with a unique $SERVER_ID on each HOST [ Script ]

4. On each host, configure the Group Replication Admin User [ Script ]

5. Create a 3-node InnoDB Cluster with MySQL Shell, using specific options to secure the SSL connection

6. Bootstrap MySQL Router [ Script ]

7. Create the Router certificate with the same CA certificate and configure mysqlrouter.conf (see the sample SSL snippet at the end of this section)  


8. Start Router [ Script ]

9. Finally, create a user with X509 and log in via MySQL Router
    Log in to the Primary Server as 'root'
mysql > create user my509user@'%' identified by 'my509pass' REQUIRE X509;

     Alternatively, user creation can apply stricter constraints with 

REQUIRE SUBJECT "/...."

 REQUIRE ISSUER "/..."

    For example (check the client certificate for the ISSUER or SUBJECT content with [ openssl x509 -text -in <certificate file> ]):

    example 1 : mysql > create user my509user@'%' identified by 'my509pass' REQUIRE ISSUER "/O=MySQL/CN=MySQL" ;

    example 2 : mysql > create user my509user@'%' identified by 'my509pass' REQUIRE SUBJECT "/CN=hostname";

    example 3 : mysql > create user my509user@'%' identified by 'my509pass' REQUIRE SUBJECT "/CN=hostname" AND ISSUER "/O=MySQL/CN=MySQL";

 


    Log in using the certificate via the Router. For example (given that 'client-key.pem' and 'client-cert.pem' are created on each host):
mysql -umy509user -pmy509pass -h127.0.0.1 -P6446 --ssl-mode=VERIFY_CA --ssl-ca=~/data/3310/ca.pem --ssl-cert=~/data/3310/client-cert.pem --ssl-key=~/data/3310/client-key.pem 
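
For step 7 above, here is an illustrative mysqlrouter.conf SSL fragment (paths are placeholders; the Router certificate is signed by the same CA as the server certificates):

[DEFAULT]
client_ssl_cert=/path/to/router-cert.pem
client_ssl_key=/path/to/router-key.pem
client_ssl_mode=REQUIRED
server_ssl_mode=REQUIRED
server_ssl_ca=/path/to/ca.pem
server_ssl_verify=VERIFY_CA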

References