29 June 2009

Dynamically add Cluster data nodes using Solaris SMF

As a follow-on to my previous post on using Solaris SMF to manage Cluster, I'll expand my setup and show how to dynamically add data nodes to a running Cluster and control the new nodes with SMF.

First, the setup. Since I wanted something that would run completely on my laptop, I decided to clone my current OpenSolaris VM (hostname = craigOS_0609) to create a second "machine" (hostname = craigOS_0609_vm2). I thought this would be easy, and it was, except for the networking piece. When I had just a single VM, NAT worked fine. However, running two OpenSolaris VMs with NAT turned out to be problematic: they both ended up with the same 10.0.2.15 IP address. I researched it a bit and didn't find anything definitive, but did see that bridged networking might be the way to go. After futzing with it for a day or so, it turned out that simply setting both VMs to use bridged networking allowed each to obtain its own IP address from my router, and the two VMs could then communicate with no problems.
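A quick way to double-check the change is to look at the interface configuration on each VM; with bridged networking, each guest should now report its own router-assigned address (192.168.1.8 and 192.168.1.10 in my case) rather than the shared NAT address:

# ifconfig -a | grep inet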

One thing I was hoping I could do was control the startup of the data nodes on the second VM from the SMF services on the primary VM. I was also hoping I could define a dependency that spanned machines, so that the new data nodes would require the Management Server on the primary machine to be running before starting. It turns out this is not currently supported, but it is a planned feature.
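In the meantime, a crude manual workaround (just a sketch, run on the second VM once the node3/node4 services defined later in this post exist) is to wait until the Management Server on the primary VM answers before enabling the data node services:

# until ndb_mgm -c craigOS_0609 -e show > /dev/null 2>&1; do sleep 5; done
# svcadm enable mysql_ndbd:node3 mysql_ndbd:node4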

OK, first things first. Modify the current Cluster config file to add entries for the two new data nodes:

[ndbd default]
NoOfReplicas= 2
MaxNoOfConcurrentOperations= 10000
DataMemory= 80M
IndexMemory= 24M
TimeBetweenWatchDogCheck= 30000
DataDir= /usr/local/mysql/data_cluster
MaxNoOfOrderedIndexes= 512

[ndb_mgmd default]
DataDir= /usr/local/mysql/data_cluster

[ndb_mgmd]
Id=10
HostName= craigOS_0609

[ndbd]
Id= 1
HostName= craigOS_0609
[ndbd]
Id= 2
HostName= craigOS_0609
[ndbd]
Id= 3
HostName= 192.168.1.10
[ndbd]
Id= 4
HostName= 192.168.1.10

[mysqld]
Id= 20
[mysqld]
Id= 21
[mysqld]
Id= 22
[mysqld]
Id= 23

Restart the management server and wait a bit for it to load the new configuration:

# svcadm restart mysql_ndb_mgmd
# ndb_mgm -e show
Connected to Management Server at: craigOS_0609:1186
Cluster Configuration
---------------------
[ndbd(NDB)] 4 node(s)
id=1 @192.168.1.8  (mysql-5.1.32 ndb-7.0.5, Nodegroup: 0, Master)
id=2 @192.168.1.8  (mysql-5.1.32 ndb-7.0.5, Nodegroup: 0)
id=3 (not connected, accepting connect from 192.168.1.10)
id=4 (not connected, accepting connect from 192.168.1.10)

[ndb_mgmd(MGM)] 1 node(s)
id=10 @192.168.1.8  (mysql-5.1.32 ndb-7.0.5)

[mysqld(API)] 4 node(s)
id=20 @192.168.1.8  (mysql-5.1.32 ndb-7.0.5)
id=21 (not connected, accepting connect from any host)
id=22 (not connected, accepting connect from any host)
id=23 (not connected, accepting connect from any host)
#
Now we need to create a new nodegroup with our newly added nodes:
# ndb_mgm -e "create nodegroup 3,4"
Connected to Management Server at: craigOS_0609:1186
Nodegroup 1 created
#
Checking memory usage in Cluster shows that the new nodes are recognized but currently have no data:

# ndb_mgm
-- NDB Cluster -- Management Client --
ndb_mgm> all report memory
Connected to Management Server at: craigOS_0609:1186

ndb_mgm> Node 1: Data usage is 2%(58 32K pages of total 2560)
Node 1: Index usage is 1%(36 8K pages of total 3104)
Node 2: Data usage is 2%(58 32K pages of total 2560)
Node 2: Index usage is 1%(36 8K pages of total 3104)
Node 3: Data usage is 0%(16 32K pages of total 2560)
Node 3: Index usage is 0%(0 8K pages of total 3104)
Node 4: Data usage is 0%(16 32K pages of total 2560)
Node 4: Index usage is 0%(0 8K pages of total 3104)

ndb_mgm> quit
#

Also, checking the statistics for one of our Cluster tables confirms that its data still lives only on the original two data nodes (note FragmentCount: 2 and the two partitions below):
# ndb_desc -c craigOS_0609 -d world City -p
-- City --
Version: 1
Fragment type: 9
K Value: 6
Min load factor: 78
Max load factor: 80
Temporary table: no
Number of attributes: 5
Number of primary keys: 1
Length of frm data: 324
Row Checksum: 1
Row GCI: 1
SingleUserMode: 0
ForceVarPart: 1
FragmentCount: 2
TableStatus: Retrieved
-- Attributes -- 
ID Int PRIMARY KEY DISTRIBUTION KEY AT=FIXED ST=MEMORY AUTO_INCR
Name Char(35;latin1_swedish_ci) NOT NULL AT=FIXED ST=MEMORY
CountryCode Char(3;latin1_swedish_ci) NOT NULL AT=FIXED ST=MEMORY
District Char(20;latin1_swedish_ci) NOT NULL AT=FIXED ST=MEMORY
Population Int NOT NULL AT=FIXED ST=MEMORY

-- Indexes -- 
PRIMARY KEY(ID) - UniqueHashIndex
PRIMARY(ID) - OrderedIndex

-- Per partition info -- 
Partition Row count Commit count Frag fixed memory Frag varsized memory 
0          2084      2084         196608            0                     
1         1995      1995         196608            0                     

NDBT_ProgramExit: 0 - OK
OK, now it is time to add our new data nodes. Since I cloned the primary VM, all the software I needed was already installed and ready to go in the new VM.

Step 0: On the new VM, make sure the following services (cloned from the primary) are disabled ("svcadm disable <service>"): mysql_ndb_mgmd, mysql_ndbd:node1, mysql_ndbd:node2, mysql
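Spelled out, that is:

# svcadm disable mysql_ndb_mgmd
# svcadm disable mysql_ndbd:node1
# svcadm disable mysql_ndbd:node2
# svcadm disable mysql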

Also, make sure the connect string parameters in the /etc/mysql/my.cnf file are set to point at the primary VM (the one running our management server):
...
[mysqld]
ndbcluster
ndb-connectstring = craigOS_0609
...
[ndbd]
connect-string = craigOS_0609
[ndb_mgm]
connect-string = craigOS_0609
...
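Before touching the SMF services, it doesn't hurt to verify from the new VM that this connect string actually reaches the management server on the primary VM:

# ndb_mgm -e show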
Step 1: Remove the SMF service for the Management node, since I will only be using the one on the primary VM:
# svccfg delete mysql_ndb_mgmd
Step 2: Since our new nodes will be 3 and 4, let's just remove the SMF services for node1 and node2. In the next step we will modify the manifest and rename the nodes to node3 and node4.
# svccfg delete mysql_ndbd:node1
# svccfg delete mysql_ndbd:node2
Step 3: Change the manifest for the data nodes and import into the SMF repository:
# cd /var/svc/manifest/application/database
# vi mysql_ndbd.xml
For each of the two instances defined, make sure the mgmd_hosts and node_id properties are set correctly:

...
      <instance enabled="false" name="node3">

        <method_context>
          <method_credential group="mysql" user="mysql"/>
        </method_context>

        <property_group name="cluster" type="application">
          <propval name="bin" type="astring" value="/usr/local/mysql/bin"/>
          <propval name="data" type="astring" value="/usr/local/mysql/data_cluster"/>
          <propval name="mgmd_hosts" type="astring" value="craigOS_0609"/>
          <propval name="node_id" type="integer" value="3"/>
        </property_group>

      </instance>

      <instance enabled="false" name="node4">

        <method_context>
          <method_credential group="mysql" user="mysql"/>
        </method_context>

        <property_group name="cluster" type="application">
          <propval name="bin" type="astring" value="/usr/local/mysql/bin"/>
          <propval name="data" type="astring" value="/usr/local/mysql/data_cluster"/>
          <propval name="mgmd_hosts" type="astring" value="craigOS_0609"/>
          <propval name="node_id" type="integer" value="4"/>
        </property_group>

      </instance>
...
Step 4: We can now import our settings for our new data nodes and enable them:
# svccfg import mysql_ndbd.xml
# svcadm enable mysql_ndbd:node3
# svcadm enable mysql_ndbd:node4
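If either service fails to come online, the usual SMF tooling works here too; for example, to check the state (and log file location) of one of the new instances:

# svcs -l mysql_ndbd:node3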
Step 5: Now let's check if our new nodes have joined the Cluster (this can be done from either VM):
# ndb_mgm -e show
Connected to Management Server at: craigOS_0609:1186
Cluster Configuration
---------------------
[ndbd(NDB)] 4 node(s)
id=1 @192.168.1.8  (mysql-5.1.32 ndb-7.0.5, Nodegroup: 0, Master)
id=2 @192.168.1.8  (mysql-5.1.32 ndb-7.0.5, Nodegroup: 0)
id=3 @192.168.1.10  (mysql-5.1.32 ndb-7.0.5, Nodegroup: 1)
id=4 @192.168.1.10  (mysql-5.1.32 ndb-7.0.5, Nodegroup: 1)

[ndb_mgmd(MGM)] 1 node(s)
id=10 @192.168.1.8  (mysql-5.1.32 ndb-7.0.5)

[mysqld(API)] 4 node(s)
id=20 @192.168.1.8  (mysql-5.1.32 ndb-7.0.5)
id=21 @192.168.1.10  (mysql-5.1.32 ndb-7.0.5)
id=22 (not connected, accepting connect from any host)
id=23 (not connected, accepting connect from any host)

So, things look good so far. You can also see that I actually started a second MySQL Server on the new VM.
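That second server (API node id=21 above) is just the cloned, SMF-managed mysql service on the new VM, re-enabled after the my.cnf changes from Step 0; roughly:

# svcadm enable mysql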

For the last step, we will redistribute the existing Cluster tables to take advantage of the new nodes. Otherwise, only data added after the new nodes come online will be placed on the new nodes.

Our sample "world" database contains the clustered tables, so it's easy enough to just type in three commands:

# mysql world
mysql> alter online table City reorganize partition;
mysql> alter online table Country reorganize partition;
mysql> alter online table CountryLanguage reorganize partition;
mysql> quit
#
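With only three tables, typing them in is fine; for a larger schema the same thing could be scripted from the shell, something along these lines:

# for t in City Country CountryLanguage; do
>   mysql world -e "alter online table $t reorganize partition"
> done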

Now that that's done, let's recheck our table distribution using ndb_desc:
# ndb_desc -c craigOS_0609 -d world City -p
-- City --
Version: 16777217
Fragment type: 9
K Value: 6
Min load factor: 78
Max load factor: 80
Temporary table: no
Number of attributes: 5
Number of primary keys: 1
Length of frm data: 324
Row Checksum: 1
Row GCI: 1
SingleUserMode: 0
ForceVarPart: 1
FragmentCount: 4
TableStatus: Retrieved
-- Attributes -- 
ID Int PRIMARY KEY DISTRIBUTION KEY AT=FIXED ST=MEMORY AUTO_INCR
Name Char(35;latin1_swedish_ci) NOT NULL AT=FIXED ST=MEMORY
CountryCode Char(3;latin1_swedish_ci) NOT NULL AT=FIXED ST=MEMORY
District Char(20;latin1_swedish_ci) NOT NULL AT=FIXED ST=MEMORY
Population Int NOT NULL AT=FIXED ST=MEMORY

-- Indexes -- 
PRIMARY KEY(ID) - UniqueHashIndex
PRIMARY(ID) - OrderedIndex

-- Per partition info -- 
Partition Row count Commit count Frag fixed memory Frag varsized memory 
0          1058      4136         196608            0                     
1         1018      3949         196608            0                     
2         1026      1026         98304             0                     
3         977       977          98304             0                     

NDBT_ProgramExit: 0 - OK


That's it!
