MySQL InnoDB Cluster


Purpose

To attempt installing a three-node MySQL InnoDB Cluster.

Process

Started by downloading and installing mysql80-community-release-el7-3.noarch.rpm on each node, as well as on the workstation I used for the AdminAPI.

  • Install and configure MySQL server on the first node. In my case this is Amazon Linux 2.
sudo yum install mysql-community-server
sudo systemctl enable --now mysqld.service
  • Grab the temp password from /var/log/mysqld.log [1] and run secure setup.
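    • For example, a quick sketch of those two steps (assuming the default log location used by the RPM packages):
sudo grep 'temporary password' /var/log/mysqld.log   # default log location; adjust if yours differs
sudo mysql_secure_installation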
    • Root by default is restricted to localhost. I added an account so mysqlsh could connect from my workstation, and granted it the privileges needed to configure the cluster admin.
mysql> create user 'root'@'ip address' identified with mysql_native_password BY 'password';
mysql> grant all privileges on *.* to 'root'@'ip address';
mysql> GRANT CLONE_ADMIN, CONNECTION_ADMIN, CREATE USER, EXECUTE, FILE, GROUP_REPLICATION_ADMIN, PERSIST_RO_VARIABLES_ADMIN, PROCESS, RELOAD, REPLICATION CLIENT, REPLICATION SLAVE, REPLICATION_APPLIER, REPLICATION_SLAVE_ADMIN, ROLE_ADMIN, SELECT, SHUTDOWN, SYSTEM_VARIABLES_ADMIN ON *.* TO 'root'@'ip address' WITH GRANT OPTION;
mysql> flush privileges;

[2]

Configure Nodes with MySQL Shell

  • Installed mysql shell on my Fedora 33 workstation.
sudo dnf install -y mysql-shell
  • Set variables and run setup function.
mysqlsh
MySQL  JS > var values={'password':'<root password>', 'clusterAdmin':'<cluster admin name>', 'clusterAdminPassword':'<cluster password>'}
MySQL  JS > dba.configureInstance('root@node1:3306', values )
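  • Before creating the cluster, the instance can also be checked for readiness. A minimal sketch, assuming the same URI as above:
MySQL  JS > dba.checkInstanceConfiguration('root@node1:3306')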
  • At this point I exited mysql shell with ctrl+d and connected to node1 directly using shell (this could be reworked to connect this way from the beginning).
mysqlsh --mysql -u root -h node1
MySQL  node1:3306 ssl  JS > var cluster = dba.createCluster('testcluster')
  • I left this shell open since it had the cluster object configured. Using a separate ssh session I connected to the other two nodes, repeating the steps above up to, but not including, the cluster creation.
  • Now we add the two additional nodes to the cluster from the first shell I already had open. I used the clone option for this.
MySQL  node1:3306 ssl  JS > cluster.addInstance('mysql02.domout.net')
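  • A sketch of the same call with the clone recovery method made explicit (the user and hostname here are placeholders; when the option is omitted, the shell prompts for the recovery method):
MySQL  node1:3306 ssl  JS > cluster.addInstance('clusteradmin@mysql03.domain.net:3306', {recoveryMethod: 'clone'})  // placeholder user/host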

MySQL Router

  • Using the same shell as above, create an account for the mysql router.
cluster.setupRouterAccount('routerusername')
  • On an app server needing access to MySQL, install the router and bootstrap it. In this case I am running it on my Fedora Workstation.
  • NOTE : The --user switch sets the local OS account that the router will run as. I tried running it as myself but ran into many issues.
  • NOTE : I wanted to use --account-host=local.hostname, but was unable to connect because the IP address was used and reverse lookup was not working. It would have been ideal to restrict routers to specific hosts.
sudo dnf install -y mysql-router-community
sudo mysqlrouter --bootstrap dbclusteradmin@node01:3306 --account-create=if-not-exists --account=router_zabbix --user=mysqlrouter --name=routerusername --conf-bind-address 127.0.0.1

## MySQL Classic protocol

- Read/Write Connections: localhost:6446
- Read/Only Connections:  localhost:6447

## MySQL X protocol

- Read/Write Connections: localhost:6448
- Read/Only Connections:  localhost:6449
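With the router bootstrapped and reporting the ports above, a minimal sketch of starting it and connecting an application through the read/write port. The service unit name assumes a stock RPM install, and appuser is a placeholder for whatever database account the application uses:

sudo systemctl enable --now mysqlrouter
mysql -u appuser -h 127.0.0.1 -P 6446 -p   # appuser is a placeholder application account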

Checking for Compatibility

All tables must have a primary key (or an equivalent non-null unique key) for Group Replication. Zabbix has a couple of tables that are missing one, and this will prevent importing the database into the cluster.

  • Check for keys[3]
SELECT tables.table_schema , tables.table_name , tables.engine 
FROM information_schema.tables 
LEFT JOIN ( 
   SELECT table_schema , table_name 
   FROM information_schema.statistics 
   GROUP BY table_schema, table_name, index_name HAVING 
     SUM( case when non_unique = 0 and nullable != 'YES' then 1 else 0 end ) = count(*) ) puks 
 ON tables.table_schema = puks.table_schema and tables.table_name = puks.table_name 
 WHERE puks.table_name is null 
   AND tables.table_type = 'BASE TABLE' AND Engine="InnoDB";
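As a related guard, MySQL 8.0.13 and later can refuse to create new tables without a primary key. Persisting this setting on the cluster nodes is a sketch of that idea; verify it fits your schema management before enabling it:

SET PERSIST sql_require_primary_key = ON;  -- rejects new tables lacking a primary key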

Example Zabbix

  • I needed to add primary keys to zabbix tables that lacked them.
  • I also needed to change from multi-primary to single-primary mode; too many limitations were getting in the way.[4]
MariaDB [(none)]> SELECT tables.table_schema , tables.table_name , tables.engine 
    -> FROM information_schema.tables 
    -> LEFT JOIN ( 
    ->    SELECT table_schema , table_name 
    ->    FROM information_schema.statistics 
    ->    GROUP BY table_schema, table_name, index_name HAVING 
    ->      SUM( case when non_unique = 0 and nullable != 'YES' then 1 else 0 end ) = count(*) ) puks 
    ->  ON tables.table_schema = puks.table_schema and tables.table_name = puks.table_name 
    ->  WHERE puks.table_name is null 
    ->    AND tables.table_type = 'BASE TABLE' AND Engine="InnoDB";
+--------------+--------------+--------+
| table_schema | table_name   | engine |
+--------------+--------------+--------+
| zabbix       | dbversion    | InnoDB |
| zabbix       | history      | InnoDB |
| zabbix       | history_log  | InnoDB |
| zabbix       | history_str  | InnoDB |
| zabbix       | history_text | InnoDB |
| zabbix       | history_uint | InnoDB |
+--------------+--------------+--------+
6 rows in set (0.02 sec)
  • I then added the primary keys[5]
alter table dbversion add primary key (mandatory);
alter table history add pkid int primary key auto_increment;
alter table history_log add pkid int primary key auto_increment;
alter table history_str add pkid int primary key auto_increment;
alter table history_text add pkid int primary key auto_increment;
alter table history_uint add pkid int primary key auto_increment;

Troubleshooting

mysqlsh --mysql -u root -h mysql01.domain.net
cluster=dba.getCluster()
cluster.status()
cluster.rejoinInstance("mysql02.domain.net")
cluster.setPrimaryInstance("mysql01.domain.net")
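For more detail than the default status() output, an extended level can be requested. A sketch; the exact fields returned vary by shell version:

cluster.status({extended: 1})  // higher levels add more replication detail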

The following is from a complete split-brain issue I ran into. NOTE : In that case the root problem was the sudden introduction of IPv6 onto the nodes; removing IPv6 fixed the communication problems and I was able to rejoin the nodes. The steps below are for reference only.

  • Connect to whichever node is working, then force it as the primary.
cluster.forceQuorumUsingPartitionOf("root@mysql01.domain.net")
  • Now remove the other nodes from the same shell.
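    • For example, a sketch with a placeholder hostname; force is needed when the instance can no longer be reached:
cluster.removeInstance('root@mysql02.domain.net', {force: true})  // placeholder host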
  • Go to the other nodes and run the following from the shell, forcing the removal of super_read_only when prompted.
dba.dropMetadataSchema()
  • This was not needed, but is good to know.
set global super_read_only = 0;
set global read_only = 0;

Restart from full shutdown

[6]

  • Log into a node with cluster admin privileges.
mysqlsh -u<cluster admin> -p
cluster = dba.rebootClusterFromCompleteOutage()
  • In my case I ran into this error while running the above command on node2
Dba.rebootClusterFromCompleteOutage: The active session instance (mysql02:3306) isn't the most updated in comparison with the ONLINE instances of the Cluster's metadata. Please use the most up to date instance: 'mysql01:3306'. (RuntimeError)
  • Switched to node1 and re-ran; however, I received an error stating that the plugin timed out waiting for super_read_only to be removed. I ended up:
    • Clearing metadata from nodes 1,2 and 3.
    • Powering on nodes 4 and 5.
    • Restarting the cluster from node 5 while adding node 4 and removing nodes 1-3.
  • The following can provide some information on cluster status when the dba.getCluster() function is not available.

[7]

select member_host,member_role from performance_schema.replication_group_members;
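The member_state column in the same table is also useful for seeing whether each node is ONLINE, RECOVERING, or in ERROR. A variant of the query above:

select member_host,member_role,member_state from performance_schema.replication_group_members;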