This document describes the various components provided by the Cluster Infrastructure (CI) package.

ICS CHANNELS (high-level and low-level)
========================================

An ICS channel is essentially a TCP/IP connection established with another node for intra-cluster communication. Channels are the underlying transport over which cluster service messages and RPCs are sent, so each cluster service must map to a specific ICS channel. Several cluster services can be grouped together as subservices sharing the same ICS channel (subservices are described under CLUSTER SERVICES below).

Separate ICS channels exist for purposes of flow control (also known as throttling). When the system is under high load, channels are throttled according to their per-channel throttling variables.

Throttling does not occur on certain "priority" ICS channels, which exist to prevent deadlock. A common case: Node A sends an RPC to Node B, and the server-side routine on Node B in turn sends an RPC or message back to Node A. If Node A's ICS channel has become throttled by then, Node A cannot handle the incoming message, even though handling it would help free resources on Node A. To avoid this kind of deadlock, the ICS code automatically raises the priority of an outgoing RPC/message if it is sent from within the server-side routine of an incoming RPC.

[ NOTE: In the current code version, ICS throttling is not fully implemented. ]

An ics_prio data member has been added to the task_struct structure to keep track of the current ICS priority. A priority of 0 uses the normal ICS channel mapped to that specific cluster service. Any priority above 0 uses one of the 4 ICS priority channels defined.

There are 2 ICS channels used for sending RPC replies.
The ics_reply_chan is used for sending replies to non-priority RPC messages, while the ics_reply_prio_chan is used to reply to all high-priority RPCs. Like the priority ICS channels, neither of these reply channels is ever throttled.

The current code defines a maximum of 12 ICS channels in a cluster. This can be increased by tuning ICS_MAX_CHANNELS in the header file ics.h.

The current CI code sets up 8 different ICS channels: the 4 priority channels and 2 low-level reply channels mentioned above, plus 2 channels that support cluster services. One of these supports the CLMS, cluster API, and ICS signal forwarding services, and the other is used specifically for probing the CLMS master on bootup. To be precise, the ics_clms_probe_chan is a transient channel that is set up and torn down for each probe sent. These probes are sent by nodes that are coming up, to find out which node is the CLMS master.

For more information on how to add your own ICS channel, please read the enhancing.txt document.

CLUSTER SERVICES
=================

A cluster service consists of "service routines" that run on each node in the cluster and "service messages" that are sent among cluster nodes for communication.

The current CI code defines 3 cluster services: the CLMS service, the cluster API service, and the ICS signal forwarding service. Each cluster service needs to send messages across the cluster and therefore needs to communicate over an ICS channel. The 3 cluster services are multiplexed on top of one ICS channel, ics_clms_chan. As mentioned above, several cluster services can share one ICS channel, in which case they are flow-controlled as one unit.

There is a direct mapping from a cluster service to an ICS channel plus subservice. Every cluster service is designated to send messages over a specific ICS channel, and it has a subservice number to distinguish its messages from those of other cluster services sent on the same channel.
The macro used to do this mapping is:

	#define ICS_NSC_SVC(_chan,_subservice) (((_chan) << 16) + (_subservice))

You can also determine the ICS service channel and subservice number for a given cluster service using these macros:

	#define ICS_CHAN(_cluster_svc) ((_cluster_svc) >> 16)
	#define ICS_SSVC(_cluster_svc) ((_cluster_svc) & ((1 << 16) - 1))

There is a limit of 6 subservices per ICS service channel, set by ICS_NUM_SUBSERVICES in the header file ics.h. Given that there are 6 ICS channels that can be used for regular cluster services (12 - 4 priority - 2 reply channels), this puts a limit of 36 cluster services (6 ICS service channels * 6 subservices/channel) that can be defined in a cluster.

Each cluster service needs to register with ICS so that ICS can call back the service's stub routines. ICS is also informed of the min/max number of available server handles that should be maintained to handle incoming messages on this channel.

Icsgen is used to generate client/server stubs for the cluster service messages. The stubs insert the cluster service number in the ICS message headers. Icsgen also generates a routine that registers the cluster service with ICS for you. Please read the ICSGEN document for more information.

CLMS SUBSYSTEMS
================

Cluster services may need to run cleanup or initialization code when nodes come up or go down in the cluster. This is done by registering a CLMS subsystem. CLMS subsystems perform callbacks for a cluster service during cluster NODEUP and NODEDOWN events.

Each CLMS subsystem calls register_clms_subsys() to register itself with CLMS. This registration call takes the name of the CLMS subsystem, function pointers to the NODEUP and NODEDOWN callback routines, and the priority band of the subsystem. During NODEUP/NODEDOWN events the subsystem callbacks are performed (on each surviving cluster node) in order of priority band. Subsystems within the same priority band are called in the order they were registered with CLMS.
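
The ordering rule for subsystem callbacks can be illustrated with a small user-space sketch. This is not the CI implementation: every sketch_* name and the callback type are hypothetical, the real register_clms_subsys() prototype may differ, and the assumption that lower-numbered bands run first is ours, since this document does not say which direction the bands are ordered in.

```c
#include <stddef.h>

/* Mirrors the 15-subsystem limit (CLMS_MAX_SUBSYSTEMS) described in the text. */
#define SKETCH_MAX_SUBSYS 15

/* Hypothetical callback type; the real CLMS prototypes may differ. */
typedef void (*sketch_node_cb)(int node);

struct sketch_subsys {
    const char    *name;
    sketch_node_cb nodeup;
    sketch_node_cb nodedown;
    int            band;   /* priority band; ASSUMPTION: lower bands run first */
};

static struct sketch_subsys sketch_table[SKETCH_MAX_SUBSYS];
static size_t sketch_count;

/*
 * Add a subsystem, keeping the table sorted by band.
 * Returns 0 on success, -1 if the table is full.
 */
int sketch_register_subsys(const char *name, sketch_node_cb nodeup,
                           sketch_node_cb nodedown, int band)
{
    size_t i;

    if (sketch_count >= SKETCH_MAX_SUBSYS)
        return -1;

    /* Shift only entries with a strictly higher band, so subsystems in
     * the same band stay in the order they were registered. */
    for (i = sketch_count; i > 0 && sketch_table[i - 1].band > band; i--)
        sketch_table[i] = sketch_table[i - 1];

    sketch_table[i].name = name;
    sketch_table[i].nodeup = nodeup;
    sketch_table[i].nodedown = nodedown;
    sketch_table[i].band = band;
    sketch_count++;
    return 0;
}

/* Name of the subsystem in call-order slot idx, or NULL if out of range. */
const char *sketch_subsys_name_at(size_t idx)
{
    return idx < sketch_count ? sketch_table[idx].name : NULL;
}

/* Run every registered NODEUP callback for the given node, in table order. */
void sketch_nodeup(int node)
{
    size_t i;

    for (i = 0; i < sketch_count; i++)
        if (sketch_table[i].nodeup)
            sketch_table[i].nodeup(node);
}

/* Run every registered NODEDOWN callback for the given node, in table order. */
void sketch_nodedown(int node)
{
    size_t i;

    for (i = 0; i < sketch_count; i++)
        if (sketch_table[i].nodedown)
            sketch_table[i].nodedown(node);
}
```

Registering subsystems "b" (band 2), "a" (band 1), and "c" (band 2) in that order leaves the call order as a, b, c: the band decides first, and registration order breaks the tie between b and c.
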
The CI code defines a maximum of 15 CLMS subsystems in a cluster. This can be tuned by modifying CLMS_MAX_SUBSYSTEMS in the header file clms.h.

For more information on adding a subsystem to CLMS, please read the enhancing.txt document.

CLMS KEY SERVICES
==================

CLMS key services are created for cluster services that need to be centralized. A key service is designated to run on one node within the cluster at any point in time. Cluster nodes specify in their boot entries whether or not they are able to serve as key service nodes for a particular key service.

Each key service registers a failover routine with CLMS by calling clms_register_key_service(). When the node running the key service goes down, CLMS selects another node to take over the key service and runs the registered failover callbacks on that new node. If the new key service node needs to pull data from the surviving nodes in the cluster, it can do so if the key service has registered a pull_data routine. Once the pulled data has been gathered, the node runs a failover_data routine to process it. Finally, the key service is set to ready and the whole cluster is notified of the new key service node.

Key services can be defined as either critical or non-critical. A critical key service delays the forming of a cluster until CLMS has designated a key service node for it. Similarly, when a node serving a critical key service goes down, the cluster will panic if CLMS cannot find a failover node for that key service.

For more information on adding a key service to CLMS, please read the enhancing.txt document.
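
The failover sequence described in this section can be summarized as a short user-space sketch. It is only an illustration under stated assumptions: all sketch_* and SKETCH_* names are hypothetical, the callback signatures are invented for the example (the real clms_register_key_service() arguments are not given in this document), and choosing the lowest-numbered capable node is an assumed policy rather than the CLMS selection rule.

```c
#include <stddef.h>

enum sketch_result {
    SKETCH_OK      = 0,
    SKETCH_NO_NODE = -1,   /* non-critical service left without a node */
    SKETCH_PANIC   = -2    /* critical service with no failover node */
};

/* Hypothetical per-key-service state and callbacks. */
struct sketch_key_service {
    int critical;                        /* non-zero for a critical service */
    int node;                            /* current key service node, -1 if none */
    int ready;                           /* non-zero once the service is ready */
    void (*failover)(int new_node);      /* registered failover routine */
    void (*pull_data)(int new_node);     /* optional: gather data from survivors */
    void (*failover_data)(int new_node); /* optional: process the pulled data */
};

/*
 * ASSUMED policy: pick the lowest-numbered node that is both up and
 * marked (in its boot entry) as able to serve this key service.
 * Returns the node index, or -1 if there is no candidate.
 */
int sketch_pick_node(const int *capable, const int *up, size_t nnodes)
{
    size_t i;

    for (i = 0; i < nnodes; i++)
        if (capable[i] && up[i])
            return (int)i;
    return -1;
}

/* Walk through the failover steps in the order the text gives them. */
enum sketch_result sketch_failover(struct sketch_key_service *ks,
                                   const int *capable, const int *up,
                                   size_t nnodes)
{
    int next = sketch_pick_node(capable, up, nnodes);

    ks->ready = 0;
    if (next < 0) {
        ks->node = -1;
        return ks->critical ? SKETCH_PANIC : SKETCH_NO_NODE;
    }

    ks->node = next;
    if (ks->failover)
        ks->failover(next);          /* 1. run the failover callback */
    if (ks->pull_data) {
        ks->pull_data(next);         /* 2. pull data from surviving nodes */
        if (ks->failover_data)
            ks->failover_data(next); /* 3. process the pulled data */
    }
    ks->ready = 1;                   /* 4. mark ready; the cluster is then notified */
    return SKETCH_OK;
}
```

With three nodes where nodes 0 and 2 are capable and only nodes 1 and 2 are up, the sketch moves the service to node 2 and marks it ready. If node 2 then goes down as well, a critical service yields SKETCH_PANIC, standing in for the cluster panic described above.
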
The Linux Clustering Information Center
This file last updated on Tuesday, 14-May-2002 09:34:25 UTC