Live Distributed Objects

Configuring Network Communication

Default Communication
Capabilities
Configuration
QuickSilver Scalable Multicast (QSM)
Capabilities
Configuration

Default Communication

Capabilities

By default, the distribution is configured to use a simple transport based on a centralized server that can accept TCP connections from local or remote clients. The server maintains a global message queue and relays messages between clients. Communication is reliable and totally ordered, and state is transferred from existing to joining members in a manner synchronized with the stream of updates. While the server does relay messages, as configured it does not save data anywhere. No membership information is provided to the clients. Each client process keeps a single TCP connection to a known port on the server, for the purpose of exchanging data and checkpoints and passing control requests back and forth. Communication overhead does not depend on the number of communication channels in use, but the network overhead grows linearly with the number of clients per channel, and the overhead on the server grows linearly with the number of clients connected to the server. Due to the nature of the implementation (a multithreaded server with a single thread per client process), the system is not designed for large numbers of clients or very high data rates. The default communication mode is best suited for debugging, or for use with simple objects that have a low rate of updates. It is possible to configure a client machine to use multiple different servers, e.g. for different sets of channels.

Configuration

The server is implemented as a service, described and configured in file "cc.liveobject" in the ".\services" folder relative to the root installation path. The structure of the object is shown below.

screenshot_154.gif

The top level component is a general-purpose TCP server, which can accept client connections and associate them with communication channels generated by a subordinate channel factory object passed as an argument. The factory object is the actual server, which accepts TCP connections from clients. The server itself is configured in the registry, as discussed below.

The TCP controller uses two TCP ports. The first port, configured by the "address" parameter, is used to host a web service that clients invoke to negotiate connection parameters, such as encryption. The default setting, as shown above, is to accept any clients on port 50000. The second port, configured by the "mainport" parameter, is used to accept the negotiated TCP connections from clients. By default, it is set to 60000. The second port is opened only on a single network adapter, selected by the "network" parameter that describes a network mask. By default, the mask is "0.0.0.0/0", which means that the server will pick any network adapter it finds on the local machine. While this is appropriate for accepting clients from a local host, it should be changed to a meaningful value for network experiments.

For example, if the machine is connected to several networks, one of them being a wired LAN connection over which communication with clients should take place, the IP address of the LAN adapter is "192.168.0.100", and the network mask is "255.255.255.0", then the value of the subnet parameter could be "192.168.0.0/24" (meaning, the system should select the network adapter such that the most significant 24 bits of the address correspond to the prefix "192.168.0"), or it could be "192.168.0.100/32" (meaning, the system must select a network adapter with address exactly "192.168.0.100").

Since both ports need to be accessible to clients, both need to be opened in the firewall.

As the above diagram suggests, the actual server is defined in the "registry" subfolder, under key "5356357BBA5244009D3094BEBBB3A8B4". The structure of the object configured there by default (communication_s.liveobject) is shown on the diagram below.

screenshot_155.gif

This default object acts as a server in a single-level hierarchy. All channels provided by this server as managed and configured locally. The object takes a single parameter "root", which points to the path at which channels are stored, relative to the root of the installation path. By default, the server looks for channel configuration in the "channels" subfolder. The format of this folder has been discussed earlier.

Normally, there is no need to modify the server object stored in the registry. However, one might replace the default server with an object that interfaces an external communication system, or connects to another, higher-level server, in a cascading manner, to address the scalability limitations of this centralized implementation.

On the client-side, channels are stored in a "folder" object configured in the registry. A typical channel will be simply a repository object, with the "from" clause pointing to a registry object shown below.

screenshot_156.gif

For example, a shared text note in the examples folder has the following structure. Note how the above element appears in the "from" clause of the channel object.

screenshot_157.gif

As the diagram suggests, the client-side part of the communication system is configured in the registry, under key "011E83FA6A9F426993BB8EF26FF7D7B8". The structure of the object stored there by default (communication_c.liveobject) is shown below.

screenshot_158.gif

The client-side communication system is an object that is instantiated once in each client process, and then shared by all client objects in this process, and across all channels. The client-side object uses a single connection to communicate with the server (or in general, with its parent in a hierarchy). This connection is configured by the "connection" parameter. By default, the client is configured to open a TCP channel (the client-side equivalent of a TCP channel factory mentioned earlier) to a local server.

The client-side part of the TCP connection is configured using two parameters, "address" that points to the web service that controls connections on the server, and "network" that selects a specific network adapter. Both parameters should match their counterparts on the server, discussed earlier.

In the typical usage scenario, one would replace "localhost" in the value of the "address" parameter, both on the client side and on the server side with the actual DNS name or IP address of the server and replace the default "0.0.0.0/0" subnet selector with a value that corresponds to the physical medium over which communication should take place. One should then open both ports in use on the server and configure the "channels" folder on the server with channel metadata, as discussed earlier.

QuickSilver Scalable Multicast (QSM)

Capabilities

QuickSilver Scalable Multicast (QSM) is an alternative to the default communication scheme where higher performance and scalability are desired, but total ordering and synchronized state transfer are not required, and where IP multicast is available. QSM provides reliable transfer, i.e. each transmitted message is delivered to all clients that are currently connected to the system despite possible losses in the network. Clients that join later will receive messages transmitted after they joined, but not messages transmitted earlier. The ordering of messages is FIFO, that is, messages from a given source are ordered with respect to one another, but messages from different sources are treated as if they belonged to separate sub-channels, and might be differently interleaved when they arrive at different recipients. Also, at this moment there is no state transfer support in QSM. In return for weaker ordering and the lack of state transfer, QSM provides greatly improved scalability (in our experiments, the engine experienced performance degradation at the level of 5% when scaled to 200 nodes), high data rates (QSM itself can transmit at network speeds), and very low overhead. Typical uses of QSM would be in settings where there is either one publisher, or a small number of publishers in combination with a locking scheme to determine who has the permission to transmit at the given time. QSM could be used in systems such as multimedia broadcasting, multicasting images or software updates etc., where one or a small number of servers for each channel distributed data to potentially a larger group of clients. Its reliability property is sufficient to ensure that the transmitted files or multimedia streams are delivered uncorrupted in their entirety. If clients need to receive streams starting from the beginning, one would typically use some form of broadcast controller protocol in combination with QSM, to schedule transmissions in a way such as to accommodate both existing clients (who need current data) and new clients (who need old data to catch up). If receiving old data is not needed and data is being continuously updated (as is the case e.g. when data represents recent values of a certain parameter, such as airplane coordinates published by a GMS device), then QSM could be used alone.

More information on the capabilities of QSM and its performance can be found in the following technical report.

QuickSilver Scalable Multicast.
Krzysztof Ostrowski, Ken Birman, Danny Dolev.
Cornell University Technical Report, http://hdl.handle.net/1813/9406.
The current embedding of QSM into the live objects platform is initial and experimental, and it does impose a noticeable overhead, hence the performance achieved by QSM when used as a transport for live objects will be noticeably lower than when QSM is used alone. Currently, to get the best out of QSM, one is still better off using it simply as a library (see our website for the original distribution). We plan to gradually improve the quality of the embedding, but the ultimate future of QSM is to be subsumed and dissolved into its highly extensible and programmable successor "QS/2", which will eventually become the default multicast layer for the live objects platform. QS/2 is currently not available. We hope to release the initial prototype of QS/2 sometime later this year.

Configuration

The configuration of QSM involves three elements: (a) in-process client, (b) a system service, and (c) a remote controller, each of which runs a single-threaded QSM engine. The in-process client is a thin layer corresponding to the client part of the default communication substrate. A single instance of the in-process client lives in each process, and controls communication within the process. The client is connected using a single bidirectional link to a system service. The service can accept multiple clients, as it was the case with the default communication substrate. By default, the connection is done using a single TCP connection, and clients might be connected to a service that resides on a remote machine. However, the default model is to have a single system service on each machine to control local clients. While the current initial QSM wrapper uses TCP to connect clients to the system service, eventually we will provide a much more streamlined communication over shared memory, which will eliminate some memory copies and greatly reduce overhead thanks to non-blocking synchronization. In this target model, clients might not be able to access remote services (although we might leave the TCP option as an optional feature for selected clients). The shared memory connection will also be used to interface the platform to unmanaged applications.

Although some lowest-level components of QSM, such as scheduling and I/O, are used in each of the three elements above, the actual QSM protocol stack lives in a system service, i.e. by default there is a single instance of the QSM protocol stack running in a system service on each machine. While we might eventually support using QSM directly in a client process, currently this option is not available. Hosting QSM in a system service, as a local daemon controlling the entire node, is consistent with the design principles, QSM does not perform as well if more than a single instance of it is running on each node, and applications interfacing QSM directly need to written in a careful manner, so as not to block in callbacks from QSM, or promptly consume the received data. While hosting QSM in a system service adds a level of indirection, increases latency and the data path, for most casual uses this would still be the most appropriate solution (especially once we streamline the communication between client processes and the system service).

Instances of QSM running in system services on each node communicate over IP multicast. The services rely on a single remote controller component to administer the network. The controller maintains the list of communication channels and their configuration settings, and it hosts a global membership service that the QSM nodes use to agree on who's accessing the particular communication channels, and to configure the loss recovery protocol.

The in-process client is available as an element named "Qsm" from an object (quicksilver_c.liveobject) stored under registry key "010BA5DB9C7F48B0A242A0AD2EF73C54", as shown below. As it was the case with the default communication substrate, the in-process client is modeled as a folder containing communication channels.

screenshot_159.gif

The usage is similar to what was shown earlier. For example, a text note hosted over a QSM channel would be constructed as shown below (the object is also available in the "examples" folder as a reference file "shared_text_1.qsm"). Note that this looks almost exactly as it did earlier, only that we replace the default "Centralized Comm. Channel Client" object with its equivalent "QuickSilver Scalable Multicast". The object looks a bit more complicated because "Qsm" is not configured in the registry directly. Rather, it is an object provided by "QuickSilver". The latter is present in multiple places in the registry, each occurrence is configured to play a different role. The instance stored at key "010BA5DB9C7F48B0A242A0AD2EF73C54" is configured to manage clients within an application process.

screenshot_160.gif

The in-process instance of QuickSilver that resides at key "010BA5DB9C7F48B0A242A0AD2EF73C54" as quicksilver_c.liveobject and is referenced by the above looks as shown below. There is only one parameter here that is relevant, a "configuration". The parameter itself is an extensible namespace containing values of different configuration settings. It can be edited in the object designer in the XML mode (you need to type valid XML as shown below for the designer to be able to parse the value).

screenshot_161.gif

Each instance of QuickSilver, including the in-process instance that doesn't actually do any network communication, has a fairly number of configuration settings that can be tweaked and are discussed in another part of this tutorial. The settings that are most relevant are the following.
  • qsm_application - a boolean value that should be explicitly set to "true", and which indicates that this instance of QuickSilver can directly connect to clients
  • uplink_address - the address of the local service that will manage this client, could be set to "localhost"
  • uplink_subnet - the network where the service is to be found, could be "0.0.0.0/0"

As mentioned earlier, the in-process client-side of QSM is a proxy that requires two server-side components to work. These are available as services in the "services" folder. The service to run locally on each client machine is stored in file "quicksilver_localservice.liveobject" and the service to run on one machine designated to manage the network is stored in file "quicksilver_coordinator.liveobject". While the two might in theory run side-by-side, ideally only one of them would be hosted on a given machine. If the machine needs to have local clients and serve as a coordinator for QSM at the same time, one can merge the configuration settings to produce a single instance of QSM that plays both roles, as discussed below.

By default, the services mentioned above are "commented out" and each service file describes a null object reference. To "activate" the services, you need to remove a null reference and uncomment out the XML. Note that while QSM services might run side by side with the default multicast service, they won't interact or relay messages to one another.

After un-commenting, the local system service described in "quicksilver_localservice.liveobject" has the following structure.

screenshot_162.gif

The object definition referenced and stored in the registry at key "571A2D935613476B9E82D16E089EA596" as quicksilver_s.liveobject looks as follows.

screenshot_163.gif

There are two sets of settings here that are relevant. The first is the "configuration" parameter of the QuickSilver instance, which we will discuss shortly. The second is the configuration of the channel that will be used to connect to the remote coordinator that QSM uses to manage the network. By default, the connection is done over a TCP channel. The configuration of the channel is similar as it was with communication between an in-process client component and the local service, which has already been discussed earlier.

An example set of configuration settings for the local QSM service is shown below. The relevant settings are discussed below.

screenshot_164.gif

First, there are a few settings that describe what this instance of QSM will do.
  • qsm - setting this value to "true" activates the QSM protocol stack
  • asm_application - this value should be "false" because in the default style of use, the local service does not directly accept clients
  • qsm_controller - this is set to false because the local instance of QSM on a client machine does not, by default, act as a controller for the entire network of potentially many QSM clients

These settings configure communication with in-process clients. In this version of the platform, we just hard-coded the communication between the in-process clients and the service to use the TCP connections mechanism discussed earlier. Eventually, we plan to replace it with shared memory communication. We may also support custom transports, but likely, we will mostly rely on a custom scheme because of the nature of the communication between these local processes (which would go over unmanaged memory and use some optimizations to reduce the amount of data being copied from one place to another on the critical path).
  • uplink_controller - setting this to "true" allows this instance of QuickSilver to accept client connections from the in-process instances
  • uplink_address - this is similar as "address" in the configuration of the TCP channel factory, and points to the address of a web service that the system service will expose for in-process clients
  • uplink_port - likewise, this points to the second TCP port used by the TCP channel factory to accept local in-process clients
  • uplink_subnet - used to select a network adapter, as discussed earlier

These settings are used to configure the QSM protocol stack.
  • subnet - this selects the network adapter to use for the actual communication, i.e. for the traffic where most messages will flow; note that this must be an adapter on a network that supports IP multicast; also, because by default in this experimental release we fix TTL to 1, all the communicating nodes must be on the same LAN, and without any routers in between
  • port - this is the main UDP port through which data will flow in and out of QSM
  • gms_address - this is the address of the membership service that will be hosted by the machine that acts as a coordinator (runs quicksilver_coordinator.liveobject) for the local network, and should match the settings used there

Now, let's turn to the second service, quicksilver_coordinator.liveobject, which is to be used as a coordinator managing the network. After un-commenting, the service description looks as follows.

screenshot_165.gif

As mentioned earlier, the coordinator by default communicates with local services over TCP, and so at the top, we see the specification of the TCP channel factory. The configuration of such factory has already been discussed earlier. The actual object that produces communication endpoints to associate with the incoming TCP connections is again defined in registry. The registry definition of quicksilver_g.liveobject at key "BD0D4E2D3C23407380B3092343E2176F" looks as shown below.

screenshot_166.gif

The only relevant parameter is the set of "configuration" settings. The values of "qsm" (set to true) and "qsm_controller" (also set to true) parameters indicate that this instance of QuickSilver runs the QSM protocol stack. However, this protocol stack by default won't (although it could) really be used in its entirety, only the part of it that is needed for the membership service will be active. Since the membership service communicates with its subordinates over UDP, using the same ports and the same network, we still include the "subnet" and "port" parameters. Other parameters are not needed here, as they relate to the actual multicast traffic that won't flow through this instance of QSM.














Last edited Oct 21, 2010 at 10:55 PM by mikeryan, version 8

Comments

No comments yet.