MySQL Cluster uses the new NDBCluster storage engine (available from MySQL 4.1.2) to enable running several MySQL servers in a cluster.
This chapter represents work in progress. Other documents describing MySQL Cluster can be found at http://www.mysql.com/cluster/ and http://dev.mysql.com/doc/#cluster.
MySQL Cluster is a new technology that enables the clustering of in-memory databases in a shared-nothing system. The shared-nothing architecture allows the system to work with very inexpensive hardware, without any special requirements on hardware or software. It also has no single point of failure, because each component has its own memory and disk.

MySQL Cluster integrates the standard MySQL server with an in-memory clustered storage engine called NDB. In our documentation, the term NDB refers to the storage-engine-specific part of the setup, whereas MySQL Cluster refers to the combination of MySQL and the new storage engine.
A MySQL Cluster consists of computers with a set of processes executing several MySQL servers, storage nodes for NDB Cluster, management servers and possibly also specialized data access programs. All these programs work together to form MySQL Cluster. When data is stored in the NDBCluster storage engine, the tables are stored in the storage nodes for NDB Cluster. Those tables are directly accessible also from all other MySQL servers in the cluster. Thus, if one application updates the salary of an employee all other MySQL servers that query this data can see it immediately.
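As a hypothetical illustration of this (the host names, database, and table below are invented, and the ENGINE=NDBCLUSTER syntax assumes a 4.1 server with the engine enabled):

```
shell> mysql -h server1 -e "CREATE TABLE test.employee (id INT PRIMARY KEY, salary INT) ENGINE=NDBCLUSTER"
shell> mysql -h server1 -e "UPDATE test.employee SET salary = salary + 1000 WHERE id = 1"
shell> mysql -h server2 -e "SELECT salary FROM test.employee WHERE id = 1"
```

Because the table lives in the storage nodes rather than in either mysqld, the SELECT issued through the second server sees the update immediately.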
The data stored in the storage nodes for MySQL Cluster can be mirrored, and the cluster can handle failures of storage nodes with no other impact than that a number of transactions are aborted due to losing the transaction state. This should cause no problems, because transactional applications should be written to handle transaction failure.
By bringing MySQL Cluster to the open source world, MySQL makes clustered data management with high availability, high performance and scalability available to all who need it.
NDB is an in-memory storage engine offering high-availability and data-persistence features. NDB can (although this requires extensive knowledge) be used as an independent database system, supporting the traditional relational data model with full ACID transactions.

The NDB storage engine can be configured with a range of fail-over and load-balancing options, but it is easiest to start with the storage engine at the cluster level. The NDB storage engine of MySQL Cluster contains a complete set of data, dependent only on other data within the cluster itself.
A MySQL Cluster may also replicate clustered data to other MySQL Clusters, but this is a complex configuration. Here, we focus on how to set up a single MySQL Cluster consisting of an NDB storage engine and some MySQL servers.

The cluster part of MySQL Cluster is currently configured independently of the MySQL servers. In a MySQL Cluster, each part of the cluster is considered to be a node.
Note: In many contexts, a node is a computer, but for MySQL Cluster it is a process. There can be any number of nodes on a single computer.
Each node has a type, and there can be multiple nodes in the MySQL Cluster of each type. In a minimal MySQL Cluster configuration, there will be at least three nodes:
The management (MGM) node. The role of this type of node is to manage the other nodes within the MySQL Cluster, such as providing configuration data, starting and stopping nodes, running backups, and so forth. Because this node type manages the configuration of the other nodes, a node of this type must always be started first, before any other node. Once the cluster is running, the MGM node does not necessarily have to be running all the time.
The storage or database (DB) node. This is the type of node that manages and stores the database itself. There are as many DB nodes as you have replicas times the number of fragments. That is, with two fragments, each with two replicas, you need four DB nodes. Note that it is not necessary to have more than one replica, so a minimal MySQL Cluster may contain just one DB node.
The client (API) node. This is the client node that accesses the cluster; in the case of MySQL Cluster, these are traditional MySQL servers with the new NDBCluster storage engine, which enables access to clustered tables. Basically, the MySQL daemon is a client of the NDB cluster. If you have applications that use the NDB API directly, these are considered API nodes too.
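The three node types above can be sketched in a single cluster configuration file. This is only an illustration: the section and parameter names below follow the config.ini conventions as far as we know for this MySQL version, and the host name and path are example values.

```shell
# Write a minimal one-of-each configuration: one MGM node, one DB node
# (a single replica), and one API node, all on the same example computer.
cat > config.ini <<'EOF'
[COMPUTER]
Id: 1
HostName: ndb_mgmd.mysql.com

[MGM]
Id: 1
ExecuteOnComputer: 1
PortNumber: 2200

[DB]
Id: 2
ExecuteOnComputer: 1
FileSystemPath: /var/lib/mysql-cluster

[API]
Id: 3
ExecuteOnComputer: 1
EOF
# Each section carries its own Id; count them as a sanity check.
grep -c '^Id:' config.ini
```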
We refer to these cluster processes as nodes in the cluster. Setting up the configuration of the cluster involves configuring each individual node and each individual communication link between the nodes. MySQL Cluster is currently designed with the intention that storage nodes are homogeneous in terms of processor power, memory space, and communication bandwidth. Also, to provide a single point of configuration, it was decided to move the entire cluster configuration into one configuration file.
The management server manages the cluster configuration file and the cluster log. All nodes in the cluster contact the management server to retrieve their part of the configuration, so they need a way to determine where the management server resides. When interesting events occur in the storage nodes, they transfer the information about these events to the management server, which then writes the information to the cluster log.
In addition, there are any number of clients to the cluster. These are of two types. First, there are the normal MySQL clients that are no different for MySQL Cluster. MySQL Cluster can be accessed from all MySQL applications written in PHP, Perl, C, C++, Java, Ruby, and so forth. Second, there are management clients. These clients access the management server and provide commands to start and stop nodes gracefully, to start and stop message tracing (only in debug versions), to print configuration, to show node status of all nodes in the cluster, to show versions of all nodes in the cluster, to start and stop backups, and so forth.
A MySQL server that is part of a MySQL Cluster differs in only one aspect from what we are used to: it has an additional storage engine (NDB or NDBCLUSTER), which is initially disabled. Except for this, the MySQL server is not much different from previous MySQL releases, apart from any other new 4.1 features, of course. By default, MySQL is configured with the NDB storage engine disabled; to enable it you need to modify `my.cnf'.

Also, because the MySQL daemon is an API client to the NDB storage engine, the minimal configuration data needed to access the MGM node from the MySQL server must be set. When this is done, all MGM nodes (one is sufficient to start) and DB nodes must be up and running before the MySQL server is started.
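Putting the start-up order together, a minimal sketch, assuming all processes run on one machine and that each binary finds its connect information as described above:

```
shell> ndb_mgmd              # 1. the management server, started first
shell> ndbd                  # 2. one or more storage (DB) nodes
shell> mysqld --ndbcluster   # 3. finally the MySQL server itself
```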
Currently, you need to build from source, using MySQL 4.1 from the BitKeeper tree. Note that some tools need different versions than are typically used when building MySQL.
Tool/library | Version | Comments |
libncurses | 5.2.2 | Used by some of the command-line tools. |
Make | 3.79.1 | |
Gawk | 3.1.0 | Some Linux distributions (such as Debian) come with mawk instead. This will NOT work. |
Autoconf | 2.56 | It is very important to have the right version here, at least 2.5x. |
Automake | 1.7.6 | Also very important; some build files rely on recent functionality and the absence of bugs. Having the wrong version causes strange build errors that are not obviously related to automake. |
Libtool | 1.5.6 | Again, the version is important; strange errors occur if very old versions are used. |
Gcc | 2.95.4 | |
Bison | 1.75 | The build will fail if an old version of Bison is used; the table size in the SQL parser will overflow. |
Zlib | 1.1.4 | This is not normally required, but due to a small bug the build will not complete if zlib is missing, regrettably at a very late stage. |
To start the build, use the BUILD/compile-pentium-max script. This build script also includes OpenSSL, so you either have to get OpenSSL or modify the build script to exclude it.
Apart from these things, follow the standard instructions to build the binaries, run the tests and perform the installation procedure. See section 2.3.3 Installing from the Development Source Tree.
You need to have all the MGM and DB nodes up and running first, and this will probably be the most time-consuming part of the configuration, not least because we assume that you are already familiar with MySQL to a certain extent. The MySQL configuration and the `my.cnf' file are very straightforward, and this section only covers the differences from configuring MySQL without clustering.
There are four processes that are important to know about when using MySQL Cluster. We will cover how to work with those processes, which options to use when starting and so forth.
mysqld is the traditional MySQL server process. To be used with MySQL Cluster, it needs to be built with support for the NDBCluster storage engine. Even if the mysqld binary has been built in such a manner, the NDBCluster storage engine is still disabled by default.

There are two ways to enable the NDBCluster storage engine: either use --ndbcluster as a start-up option when starting mysqld, or insert a line with ndbcluster in the my.cnf file. An easy way to verify that your server runs with support for the NDBCluster storage engine is to issue the command SHOW TABLE TYPES from a mysql client.
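A quick check, sketched below; the exact output format varies between versions, but a server with cluster support should list NDBCLUSTER among the table types, and typically shows it as DISABLED rather than YES when the engine is compiled in but not enabled:

```
mysql> SHOW TABLE TYPES;
```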
The MySQL server needs to know how to get the configuration of the cluster. To access this configuration, it needs to know three things: its own node id in the cluster, the hostname (or IP address) where the management server resides, and the port on which it can connect to the management server.
There are two possible ways to provide this information to the mysqld process. The first option is to include it in a file called `Ndb.cfg', which should reside in the data directory of the MySQL server. The second option is to set an environment variable called NDB_CONNECTSTRING. The string is the same in both cases:

nodeid=3;hostname=ndb_mgmd.mysql.com:2200

where 3 is the node id, ndb_mgmd.mysql.com is the host where the management server resides, and 2200 is the port it is listening on.
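The two options can be sketched as follows; the node id, host name, and port are the example values from the text, and the current directory stands in for the server's data directory:

```shell
# Option 1: put the connect string in a file named Ndb.cfg in the
# MySQL server's data directory.
echo 'nodeid=3;hostname=ndb_mgmd.mysql.com:2200' > Ndb.cfg

# Option 2: export the same string in the NDB_CONNECTSTRING
# environment variable before starting mysqld.
export NDB_CONNECTSTRING='nodeid=3;hostname=ndb_mgmd.mysql.com:2200'
```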
With this set-up, the MySQL server will be a full citizen of the MySQL Cluster and will be fully aware of all storage nodes in the cluster and their status. It will set up connections to all storage engine nodes and will be able to use any of them as transaction coordinator and to access their data for reading and updating.
ndbd, the Storage Engine Node Process

ndbd is the process used to handle all the data in tables using the NDBCluster storage engine. This is the process that contains all the logic of distributed transaction handling, node recovery, checkpointing to disk, on-line backup, and much other functionality.
In a cluster there is a set of ndbd processes cooperating in handling the data. These processes can execute on the same computer or on different computers, in a completely configurable manner.
Each ndbd process should be started from a separate directory. The reason for this is that ndbd generates a set of log files in its starting directory. These log files are:
error.log is a file that contains information about all the crashes the ndbd process has encountered, together with a short error string and a reference to the trace file for each crash. An entry could look like this:
Date/Time: Saturday 31 January 2004 - 00:20:01
Type of error: error
Message: Internal program error (failed ndbrequire)
Fault ID: 2341
Problem data: DbtupFixAlloc.cpp
Object of reference: DBTUP (Line: 173)
ProgramName: NDB Kernel
ProcessID: 14909
TraceFile: NDB_TraceFile_1.trace
***EOM***
NDB_TraceFile_1.trace is a trace file describing exactly what happened before the error occurred. This information is useful to the MySQL Cluster team when analysing any bugs occurring in MySQL Cluster. The information in this file is described in the section MySQL Cluster Troubleshooting. There can be a configurable number of these trace files in the directory before old files are overwritten. 1 in this context is simply the number of the trace file.
NextTraceFileNo.log is the file that keeps track of the number of the next trace file.
node2.out is the file that contains any data printed by the ndbd process when executing as a daemon process. 2 in this context is the node id. This file only exists when starting ndbd as a daemon, since stdout and stderr are then redirected to this file.
node2.pid is a file used to ensure that only one ndb node is started with this node id. This is the normal pid-file created when starting a daemon. 2 in this context is the node id. This file only exists when starting ndbd as a daemon process.
Signal.log is a file that is only used in debug versions of ndbd, where it is possible to trace all incoming, outgoing, and internal messages with their data in the ndbd process.
It is recommended not to use a directory mounted through NFS, since in some environments this can cause problems with the lock on the pid-file remaining even after the process has stopped.
Also, when starting the ndbd process it is necessary to specify which node id the process is to use, the host of the management server, and the port it is listening on. Again there are two ways of specifying this information: either in a string in the file `Ndb.cfg' (this file should be stored in the starting directory of the ndbd process), or by setting the environment variable NDB_CONNECTSTRING before starting the process.
When ndbd starts, it will actually start two processes. The starting process is called the "angel", and its only job is to discover when the execution process has completed and then, if configured to do so, to restart the ndbd process. Thus, if you attempt to kill ndbd with the kill command on a Unix variant, it is necessary to kill both processes.
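For example, on a system where pgrep is available (a hypothetical sketch; check the actual process ids with ps before killing anything in a real installation):

```
shell> kill $(pgrep ndbd)    # kills both the angel and the execution process
```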
The execution process uses one thread for reading, writing, and scanning data, as well as all other activities. This thread is designed with asynchronous programming so it can easily handle thousands of concurrent activities. In addition, there is a watch-dog thread supervising the execution thread to ensure that it does not get stuck in an eternal loop or some other problem. A pool of threads handles file I/O, with each thread able to handle one open file. Threads can also be used for the connection activities of the transporters in the ndbd process. Thus, in a system that performs a large number of activities, including update activities, the ndbd process will consume up to about two CPUs if allowed to. In a large SMP box with many CPUs it is therefore recommended to use several ndbd processes, configured to be part of different node groups.
An example connect string for an ndbd process: nodeid=2;hostname=ndb_mgmd.mysql.com:2200
ndb_mgmd, the Management Server Process

The management server is the process that reads the cluster configuration file and distributes this information to all nodes in the cluster that request it. It also maintains the log of cluster activities. Management clients can connect to the management server and use commands to check the status of the cluster in various respects.
Also, when starting ndb_mgmd it is necessary to supply the same information as for the ndbd and mysqld processes, and again there are two options: using the file `Ndb.cfg' or the environment variable NDB_CONNECTSTRING. The `Ndb.cfg' file, if used, is placed in the start directory of ndb_mgmd.
An example connect string for the management server: nodeid=1;hostname=ndb_mgmd.mysql.com:2200
The following files are created or used by ndb_mgmd in its starting directory:

config.ini is the configuration file of the cluster. It is created by the user and read by the management server. How to write this file is described in the section MySQL Cluster Configuration.
cluster.log is the file where events in the cluster are reported. Examples of events are checkpoints started and completed, node failures and nodes starting, levels of memory usage passed, and so forth. The events reported are described in section 17.5 Management of MySQL Cluster.
node1.out is the file used for stdout and stderr when executing the management server as a daemon process. 1 in this context is the node id.
node1.pid is the pid file used when executing the management server as a daemon process. 1 in this context is the node id.
cluster.log.1: when the cluster log becomes bigger than one million bytes, the file is renamed cluster.log.1, where 1 is the number of the cluster log file. So, if 1, 2, and 3 already exist, the next will have the number 4.
ndb_mgm, the Management Client Process

The final important process to know about is the management client. This process is not needed to run the cluster. Its value lies in its ability to check the status of the cluster, start backups, and perform other management activities. It does so by providing access to a set of commands.
Actually, the management client uses a C API to access the management server, so for advanced users it is also possible to program dedicated management processes that can do the same things the management client can do.
When starting the management client, it is necessary to state the hostname and port of the management server, as in the example below:

ndb_mgm localhost 2200
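A short hypothetical session; SHOW and START BACKUP stand for the status and backup commands mentioned above, and the prompt strings are illustrative:

```
shell> ndb_mgm localhost 2200
ndb_mgm> SHOW
ndb_mgm> START BACKUP
ndb_mgm> QUIT
```

Here SHOW prints the status of all nodes in the cluster, and START BACKUP starts an on-line backup.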
mysqld

--ndbcluster
If the binary is built with support for the NDBCluster storage engine, the default disabling of support for the NDB storage engine can be overruled by using this option. Using the NDBCluster storage engine is necessary for using MySQL Cluster.

--skip-ndbcluster
Disables the NDBCluster storage engine. This is disabled by default for binaries where it is included, so this option only applies if the server was configured to use the NDBCluster storage engine.
ndbd

-?, --usage

-c string, --connect-string string
When starting ndbd it is also possible to set the connect string to the management server as a command option:
nodeid=2;host=ndb_mgmd.mysql.com:2200

-d, --daemon
Instructs ndbd to execute as a daemon process.

-i, --initial
Instructs ndbd to perform an initial start. An initial start will erase any files created by earlier ndbd instances for recovery. It will also re-create the recovery log files, which on some operating systems can take a substantial amount of time.

-n, --no-start
Instructs ndbd not to start automatically. ndbd will connect to the management server, get the configuration, and initialise communication objects, but will not start the execution engine until requested to do so by the management server. The management server makes this request through a command issued by the management client.

-s, --start
Instructs the ndbd process to start immediately. This is the default behavior, so the option is not really needed.

-v, --version
Prints the version number of the ndbd process. The version number is the MySQL Cluster version number. This version number is important because at start-up the MySQL Cluster processes verify that the versions of the processes in the cluster can co-exist. It is also important for on-line software upgrade of MySQL Cluster (see the section Software Upgrade of MySQL Cluster).
ndb_mgmd
-?, --usage
-c filename
-d
Instructs ndb_mgmd to start as a daemon process.
-l filename
-n
--version
ndb_mgm
-?, --usage
[hostname [port]]
--try-reconnect=number
Managing a MySQL Cluster involves a number of activities. The first activity is to configure and start MySQL Cluster; this is covered by section 17.3 MySQL Cluster Configuration and section 17.4 Process Management in MySQL Cluster. This section covers how to manage a running MySQL Cluster.
There are essentially two ways of actively managing a running MySQL Cluster. The first is by commands entered into the management client, whereby the status of the cluster can be checked, log levels changed, backups started and stopped, and nodes stopped and started. The second method is to study the output in the cluster log. The cluster log is written to the file cluster.log in the directory where the management server was started. The cluster log contains event reports generated by the ndbd processes in the cluster. It is also possible to send the cluster log entries to a Unix system log.