Chapter 8  Connection and Thread Management

This chapter describes how omniORB manages threads and network connections.

8.1  Background

In CORBA, the ORB is the ‘middleware’ that allows a client to invoke an operation on an object without regard to its implementation or location. In order to invoke an operation on an object, a client needs to ‘bind’ to the object by acquiring its object reference. Such a reference may be obtained as the result of an operation on another object (such as a naming service or factory object) or by conversion from a stringified representation. If the object is in a different address space, the binding process involves the ORB building a proxy object in the client’s address space. The ORB arranges for invocations on the proxy object to be transparently mapped to equivalent invocations on the implementation object.

For the sake of interoperability, CORBA mandates that all ORBs should support IIOP as the means to communicate remote invocations over a TCP/IP connection. IIOP is usually [1] asymmetric with respect to the roles of the parties at the two ends of a connection. At one end is the client, which can only initiate remote invocations. At the other end is the server, which can only receive remote invocations.

Notice that in CORBA, as in most distributed systems, remote bindings are established implicitly without application intervention. This provides the illusion that all objects are local, a property known as ‘location transparency’. CORBA does not specify when such bindings should be established or how they should be multiplexed over the underlying network connections. Instead, ORBs are free to implement implicit binding by a variety of means.

The rest of this chapter describes how omniORB manages network connections and the programming interface to fine tune the management policy.

8.2  The model

omniORB is designed from the ground up to be fully multi-threaded. The objective is to maximise the degree of concurrency and at the same time eliminate any unnecessary thread overhead. Another objective is to minimise the interference by the activities of other threads on the progress of a remote invocation. In other words, thread ‘cross-talk’ should be minimised within the ORB. To achieve these objectives, the degree of multiplexing at every level is kept to a minimum by default.

Minimising multiplexing works well when the ORB is relatively lightly loaded. However, when the ORB is under heavy load, it can sometimes be beneficial to conserve operating system resources such as threads and network connections by multiplexing at the ORB level. omniORB has various options that control its multiplexing behaviour.

8.3  Client side behaviour

On the client side of a connection, the thread that invokes on a proxy object drives the GIOP protocol directly and blocks on the connection to receive the reply. The first time the client makes a call to a particular address space, the ORB opens a suitable connection to the remote address space (based on the client transport rule as described in section 8.7.1). After the reply has been received, the ORB caches the open network connection, ready for use by another call.

If two (or more) threads in a multi-threaded client attempt to contact the same address space simultaneously, there are two different ways to proceed. The default way is to open another network connection to the server. This means that neither the client nor the server ORB has to perform any multiplexing on the network connections; multiplexing is performed by the operating system, which has to deal with multiplexing anyway. The second possibility is for the client to multiplex the concurrent requests on a single network connection. This conserves operating system resources (network connections), but means that both the client and server have to deal with multiplexing issues themselves.

In the default one call per connection mode, there is a limit to the number of concurrent connections that are opened, set with the maxGIOPConnectionPerServer parameter. To tell the ORB that it may multiplex calls on a single connection, set the oneCallPerConnection parameter to zero. If the oneCallPerConnection parameter is set to the default value of one, and there are more concurrent calls than specified by maxGIOPConnectionPerServer, calls block waiting for connections to become free.

Note that some server-side ORBs, including omniORB versions before version 4.0, are unable to deal with concurrent calls multiplexed on a single connection, so they serialise the calls. It is usually best to keep to the default mode of opening multiple connections.

8.3.1  Client side timeouts

omniORB can associate a timeout with a call, meaning that if the call takes too long a TRANSIENT exception is thrown. Timeouts can be set for the whole process, for a specific thread, or for a specific object reference.

Timeouts are set using this API:

  namespace omniORB {
    void setClientCallTimeout(CORBA::ULong millisecs);
    void setClientCallTimeout(CORBA::Object_ptr obj,
                              CORBA::ULong millisecs);
    void setClientThreadCallTimeout(CORBA::ULong millisecs);
    void setClientConnectTimeout(CORBA::ULong millisecs);
  };

setClientCallTimeout() sets either the global timeout or the timeout for a specific object reference. setClientThreadCallTimeout() sets the timeout for the calling thread. The calling thread must have an omni_thread associated with it. Setting any timeout value to zero disables it.

Accessing per-thread state is a relatively expensive operation, so per thread timeouts are disabled by default. The supportPerThreadTimeOut parameter must be set true to enable them.

To choose the timeout value to use for a call, the ORB first looks to see if there is a timeout for the object reference, then to the calling thread, and finally to the global timeout.
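The lookup order can be sketched as a small self-contained C++ model. This is an illustration, not omniORB code: the CallTimeouts struct and effectiveTimeoutMs() are hypothetical names, and the sketch assumes that a zero value at one level simply means "fall through to the next level".

```cpp
#include <cstdint>

// Hypothetical model of the three timeout levels. 0 means "not set".
struct CallTimeouts {
  std::uint32_t objectMs;   // timeout set on the object reference
  std::uint32_t threadMs;   // timeout set for the calling thread
  std::uint32_t globalMs;   // process-wide timeout
  bool perThreadEnabled;    // mirrors the supportPerThreadTimeOut parameter
};

// Returns the timeout to apply in milliseconds, or 0 if the call is untimed.
inline std::uint32_t effectiveTimeoutMs(const CallTimeouts& t) {
  // 1. A timeout on the object reference wins.
  if (t.objectMs != 0) return t.objectMs;
  // 2. Then the calling thread, but only if per-thread timeouts are enabled.
  if (t.perThreadEnabled && t.threadMs != 0) return t.threadMs;
  // 3. Finally the global timeout (which may itself be 0, i.e. disabled).
  return t.globalMs;
}
```

For instance, with a 2000 ms object timeout, a 5000 ms thread timeout and a 10000 ms global timeout, the model picks 2000 ms; clearing the object timeout makes it pick 5000 ms, provided per-thread timeouts are enabled.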

When a client has no existing connection to communicate with a server, it must open a new connection before performing the call. setClientConnectTimeout() sets an overriding timeout for cases where a new connection must be established. The effect of the connect timeout depends upon whether the connect timeout is greater or less than the timeout that would otherwise be used.

As an example, imagine that the usual call timeout is 10 seconds:

Connect timeout > usual timeout

If the connect timeout is set to 20 seconds, then a call that establishes a new connection will be permitted 20 seconds before it times out. Subsequent calls using the same connection have the normal 10 second timeout. If establishing the connection takes 8 seconds, then the call itself takes 5 seconds, the call succeeds despite having taken 13 seconds in total, longer than the usual timeout.

This kind of configuration is good when connections are slow to be established.

If an object reference has multiple possible endpoints available, and connecting to the first endpoint times out, only that one endpoint will have been tried before an exception is raised. However, once the timeout has occurred, the object reference will switch to use the next endpoint. If the application attempts to make another call, it will use the next endpoint.

Connect timeout < usual timeout

If the connect timeout is set to 2 seconds, the actual network-level connect is only permitted to take 2 seconds. As long as the connection is established in less than 2 seconds, the call can proceed. The 10 second call timeout still applies to the time taken for the whole call (including the connection establishment). So, if establishing the connection takes 1.5 seconds, and the call itself takes 9.5 seconds, the call will time out because although it met the connection timeout, it exceeded the 10 second total call timeout. On the other hand, if establishing the connection takes 3 seconds, the call will fail after only 2 seconds, since only 2 seconds are permitted for the connect.

If an object reference has multiple possible endpoints available, the client will attempt to connect to them in turn, until one succeeds. The connect timeout applies to each connection attempt. So with a connect timeout of 2 seconds, the client will spend up to 2 seconds attempting to connect to the first address and then, if that fails, up to 2 seconds trying the second address, and so on. The 10 second timeout still applies to the call as a whole, so if the total time taken on timed-out connection attempts exceeds 10 seconds, the call will time out.

This kind of configuration is useful where calls may take a long time to complete (so call timeouts are long), but a fast indication of connection failure is required.
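The arithmetic in both cases can be checked with a small C++ model. It is a sketch under stated assumptions, not omniORB's implementation: firstCallOutcome() and CallOutcome are hypothetical names, both timeouts are assumed to be non-zero, the call is assumed to open a new connection to a single endpoint, and a duration exactly equal to a limit is treated as being in time.

```cpp
#include <algorithm>
#include <cstdint>

enum class CallOutcome { Completed, ConnectTimedOut, CallTimedOut };

// Hypothetical model of a call that has to establish a new connection.
// All durations are in milliseconds.
inline CallOutcome firstCallOutcome(std::uint64_t callTimeoutMs,
                                    std::uint64_t connectTimeoutMs,
                                    std::uint64_t connectTookMs,
                                    std::uint64_t requestTookMs) {
  // The network-level connect is never allowed more than the connect timeout.
  if (connectTookMs > connectTimeoutMs) return CallOutcome::ConnectTimedOut;

  // The call as a whole gets the larger of the two timeouts: a longer
  // connect timeout extends the deadline, a shorter one leaves the usual
  // call timeout in force.
  const std::uint64_t deadlineMs = std::max(callTimeoutMs, connectTimeoutMs);
  if (connectTookMs + requestTookMs > deadlineMs) return CallOutcome::CallTimedOut;

  return CallOutcome::Completed;
}
```

With the figures used above, a 20 second connect timeout lets an 8 second connect plus a 5 second call complete, while a 2 second connect timeout makes a 1.5 second connect plus a 9.5 second call exceed the 10 second call timeout, and a 3 second connect fails at the connect stage.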

8.4  Server side behaviour

The server side has two primary modes of operation: thread per connection and thread pooling. It is able to dynamically transition between the two modes, and it supports a hybrid scheme that behaves mostly like thread pooling, but has the same fast turn-around for sequences of calls as thread per connection.

8.4.1  Thread per connection mode

In thread per connection mode (the default, and the only option in omniORB versions before 4.0), each connection has a single thread dedicated to it. The thread blocks waiting for a request. When it receives one, it unmarshals the arguments, makes the up-call to the application code, marshals the reply, and goes back to watching the connection. There is thus no thread switching along the call chain, meaning the call is very efficient.

As explained above, a client can choose to multiplex multiple concurrent calls on a single connection, so once the server has received the request, and just before it makes the call into application code, it marks the connection as ‘selectable’, meaning that another thread should watch it to see if any other requests arrive. If they do, extra threads are dispatched to handle the concurrent calls. GIOP 1.2 actually allows the argument data for multiple calls to be interleaved on a connection, so the unmarshalling code has to handle that too. As soon as any multiplexing occurs on the connection, the aim of removing thread switching cannot be met, and there is inevitable inefficiency due to thread switching.

The maxServerThreadPerConnection parameter can be set to limit the number of threads that can be allocated to a single connection containing concurrent calls. Setting the parameter to 1 mimics the behaviour of omniORB versions before 4.0, which did not support calls multiplexed on one connection.

8.4.2  Thread pool mode

In thread pool mode, selected by setting the threadPerConnectionPolicy parameter to zero, a single thread watches all incoming connections. When a call arrives on one of them, a thread is chosen from a pool of threads, and set to work unmarshalling the arguments and performing the up-call. There is therefore at least one thread switch for each call.

The thread pool is not pre-initialised. Instead, threads are started on demand, and idle threads are stopped after a period of inactivity. The maximum number of threads that can be started in the pool is set with the maxServerThreadPoolSize parameter. The default is 100.

A common pattern in CORBA applications is for a client to make several calls to a single object in quick succession. To handle this situation most efficiently, the default behaviour is to not return a thread to the pool immediately after a call is finished. Instead, it is set to watch the connection it has just served for a short while, mimicking the behaviour in thread per connection mode. If a new call comes in during the watching period, the call is dispatched without any thread switching, just as in thread per connection mode. Of course, if the server is supporting a very large number of connections (more than the size of the thread pool), this policy can delay a call coming from another connection. If the threadPoolWatchConnection parameter is set to zero, connection watching is disabled and threads return to the pool immediately after finishing a single request.

In the face of multiplexed calls on a single connection, multiple threads from the pool can be dispatched for one connection, just as in thread per connection mode. With threadPoolWatchConnection set to the default value of 1, only the last thread servicing a connection will watch it when it finishes a request. Setting the parameter to a larger number n allows the last n threads to watch the connection.

8.4.3  Policy transition

If the server is dealing with a relatively small number of connections, it is most efficient to use thread per connection mode. If the number of connections becomes too large, however, operating system limits on the number of threads may cause a significant slowdown, or even prevent the acceptance of new connections altogether.

To give the most efficient response in all circumstances, omniORB allows a server to start in thread per connection mode, and transition to thread pooling if many connections arrive. This is controlled with the threadPerConnectionUpperLimit and threadPerConnectionLowerLimit parameters. The former must always be larger than the latter. The upper limit sets the number of connections at which the ORB transitions to thread pool mode; the lower limit sets the number of connections at which it transitions back to thread per connection mode.

For example, setting the upper limit to 50 and the lower limit to 30 would mean that the first 49 connections would receive dedicated threads. The 50th to arrive would trigger thread pooling. All future connections to arrive would make use of threads from the pool. Note that the existing dedicated threads continue to service their connections until the connections are closed. If the number of connections falls below 30, thread per connection is reactivated and new connections receive their own dedicated threads (up to the limit of 50 again). Once again, existing connections in thread pool mode stay in that mode until they are closed.
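The hysteresis between the two limits can be illustrated with a short C++ model. ThreadPolicyModel is a hypothetical class written for this example, not part of omniORB; it assumes that the connection which reaches the upper limit is itself served from the pool, and it only tracks which mode a newly arriving connection would get.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of the transition between thread per connection
// mode and thread pool mode.
class ThreadPolicyModel {
public:
  ThreadPolicyModel(std::size_t upperLimit, std::size_t lowerLimit)
    : upper_(upperLimit), lower_(lowerLimit), open_(0), pooling_(false) {
    assert(upperLimit > lowerLimit);  // the upper limit must be the larger
  }

  // Records a new connection. Returns true if it gets a dedicated thread,
  // false if it is served from the thread pool.
  bool connectionOpened() {
    ++open_;
    if (!pooling_ && open_ >= upper_) pooling_ = true;
    return !pooling_;
  }

  // Records a closed connection. Dropping below the lower limit
  // reactivates thread per connection mode for future connections.
  void connectionClosed() {
    assert(open_ > 0);
    --open_;
    if (pooling_ && open_ < lower_) pooling_ = false;
  }

  std::size_t openConnections() const { return open_; }

private:
  std::size_t upper_;
  std::size_t lower_;
  std::size_t open_;
  bool pooling_;
};
```

Constructed with limits of 50 and 30, the model hands out dedicated threads to the first 49 connections, switches to pooling on the 50th, and only switches back once the number of open connections has fallen to 29.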

8.5  Idle connection shutdown

It is wasteful to leave a connection open when it has been unused for a considerable time. Too many idle connections could block out new connections once the process runs out of spare communication channels. For example, most platforms limit the number of file handles a process can open, and many have a very small default limit such as 64. The limit can often be raised to a thousand or more by changing the ‘ulimit’ in the shell.

Every so often, a thread scans all open connections to see which are idle. The scanning period (in seconds) is set with the scanGranularity parameter. The default is 5 seconds.

Outgoing connections (those this process opened as a client) and incoming connections (those it accepted as a server) have separate idle timeouts. The timeouts are set with the outConScanPeriod and inConScanPeriod parameters respectively. The values are in seconds, and must be a multiple of the scan granularity.

Beware that setting outConScanPeriod or inConScanPeriod to be equal to (or less than) scanGranularity means that connections are considered candidates for closure immediately after they are opened. That can mean that the connections are closed before any calls have been sent through them. If oneway calls are used, such connection closure can result in silent loss of calls.
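The two constraints on the scan periods can be expressed as a small sanity check. The following C++ sketch is not an omniORB API: checkScanPeriod() and ScanPeriodCheck are hypothetical names, and the helper assumes both values are non-zero numbers of seconds.

```cpp
#include <cassert>
#include <cstdint>

enum class ScanPeriodCheck { Ok, NotMultipleOfGranularity, ClosesImmediately };

// Hypothetical helper that checks an outConScanPeriod or inConScanPeriod
// value against scanGranularity. Both arguments are in seconds.
inline ScanPeriodCheck checkScanPeriod(std::uint32_t periodSecs,
                                       std::uint32_t granularitySecs) {
  assert(periodSecs > 0 && granularitySecs > 0);
  // A period no longer than one scan interval makes a connection a
  // candidate for closure as soon as it has been opened.
  if (periodSecs <= granularitySecs) return ScanPeriodCheck::ClosesImmediately;
  // Otherwise the period has to be a whole number of scan intervals.
  if (periodSecs % granularitySecs != 0) return ScanPeriodCheck::NotMultipleOfGranularity;
  return ScanPeriodCheck::Ok;
}
```

With the default granularity of 5 seconds, a period of 120 passes the check, a period of 12 is rejected as not being a multiple, and a period of 5 is flagged as closing connections immediately.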

8.5.1  Interoperability Considerations

The IIOP specification allows both the client and the server to shut down a connection unilaterally. When one end is about to shut down a connection, it should send a CloseConnection message to the other end. It should also make sure that the message will reach the other end before it proceeds to shut down the connection.

The client should distinguish between an orderly and an abnormal connection shutdown. When a client receives a CloseConnection message before the connection is closed, the condition is an orderly shutdown. If the message is not received, the condition is an abnormal shutdown. In an abnormal shutdown, the ORB should raise a COMM_FAILURE exception whereas in an orderly shutdown, the ORB should not raise an exception and should try to re-establish a new connection transparently.

omniORB implements these semantics completely. However, it is known that some ORBs are not (yet) able to distinguish between an orderly and an abnormal shutdown. Usually this is manifested as the client in these ORBs seeing a COMM_FAILURE occasionally when connected to an omniORB server. The work-around is either to catch the exception in the application code and retry, or to turn off the idle connection shutdown inside the omniORB server.

8.6  Transports and endpoints

omniORB can support multiple network transports. All platforms (usually) have a TCP transport available. Unix platforms support a Unix domain socket transport. Platforms with the OpenSSL library available can support an SSL transport.

Servers must be configured in two ways with regard to transports: the transports and interfaces on which they listen, and the details that are published in IORs for clients to see. Usually the published details will be the same as the listening details, but there are times when it is useful to publish different information.

Details are selected with the endPoint family of parameters. The simplest is plain endPoint, which chooses a transport and interface details, and publishes the information in IORs. Endpoint parameters are in the form of URIs, with a scheme name of ‘giop:’, followed by the transport name. Different transports have different parameters following the transport.

TCP endpoints have the format:

giop:tcp:<host>:<port>

The host must be a valid host name or IP address for the server machine. It determines the network interface on which the server listens. The port selects the TCP port to listen on, which must be unoccupied. Either the host or port, or both can be left empty. If the host is empty, the ORB publishes the IP address of the first non-loopback network interface it can find (or the loopback if that is the only interface), but listens on all network interfaces. If the port is empty, the operating system chooses a port.

Multiple TCP endpoints can be selected, either to specify multiple network interfaces on which to listen, or (less usefully) to select multiple TCP ports on which to listen.

If no endPoint parameters are set, the ORB assumes a single parameter of giop:tcp::, meaning IORs contain the address of the first non-loopback network interface, the ORB listens on all interfaces, and the OS chooses a port number.

SSL endpoints have the same format as TCP ones, except ‘tcp’ is replaced with ‘ssl’. Unix domain socket endpoints have the format:

giop:unix:<filename>

where the filename is the name of the socket within the filesystem. If the filename is left blank, the ORB chooses a name based on the process id and a timestamp.

To listen on an endpoint without publishing it in IORs, specify it with the endPointNoPublish configuration parameter. See below for more details about endpoint publishing.

8.6.1  IPv6

On platforms where it is available, omniORB supports IPv6. On most Unix platforms, IPv6 sockets accept both IPv6 and IPv4 connections, so omniORB’s default giop:tcp:: endpoint accepts both IPv4 and IPv6 connections. On Windows versions before Windows Vista, each socket type only accepts incoming connections of the same type, so an IPv6 socket cannot be used with IPv4 clients. For this reason, the default giop:tcp:: endpoint only listens for IPv4 connections. Since endpoints with a specific host name or address only listen on a single network interface, they are inherently limited to just one protocol family.

To explicitly ask for just IPv4 or just IPv6, an endpoint with the wildcard address for the protocol family should be used. For IPv4, the wildcard address is ‘0.0.0.0’, and for IPv6 it is ‘::’. So, to listen for IPv4 connections on all IPv4 network interfaces, use an endpoint of:

giop:tcp:0.0.0.0:

All IPv6 addresses contain colons, so the address portion in URIs must be contained within [] characters. Therefore, to listen just for IPv6 connections on all IPv6 interfaces, use the somewhat cryptic:

giop:tcp:[::]:

To listen for both IPv4 and IPv6 connections on Windows versions prior to Vista, both endpoints must be explicitly provided.
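To make the TCP endpoint syntax concrete, including the empty host and port fields and the bracketed IPv6 form, here is a minimal C++ parser sketch. It is written for illustration only and is not the parser omniORB uses; TcpEndpoint and parseTcpEndpoint() are hypothetical names, and no validation of the host or port text is attempted.

```cpp
#include <string>

// Result of splitting a giop:tcp:<host>:<port> endpoint string.
// An empty host or port means the field was left blank.
struct TcpEndpoint {
  bool valid;
  std::string host;
  std::string port;
};

inline TcpEndpoint parseTcpEndpoint(const std::string& uri) {
  TcpEndpoint result = {false, "", ""};
  const std::string prefix = "giop:tcp:";
  if (uri.compare(0, prefix.size(), prefix) != 0) return result;

  const std::string rest = uri.substr(prefix.size());
  std::string::size_type colon;
  std::string host;

  if (!rest.empty() && rest[0] == '[') {
    // IPv6 address: everything between [ and ] is the host, and the
    // closing bracket must be followed by the host/port separator.
    const std::string::size_type close = rest.find(']');
    if (close == std::string::npos || close + 1 >= rest.size() ||
        rest[close + 1] != ':') {
      return result;
    }
    host = rest.substr(1, close - 1);
    colon = close + 1;
  } else {
    // Host name or IPv4 address: the first colon separates host and port.
    colon = rest.find(':');
    if (colon == std::string::npos) return result;
    host = rest.substr(0, colon);
  }

  result.host = host;
  result.port = rest.substr(colon + 1);
  result.valid = true;
  return result;
}
```

Under this sketch, giop:tcp:: yields an empty host and an empty port, giop:tcp:0.0.0.0: yields the IPv4 wildcard with an empty port, and giop:tcp:[::]: yields the IPv6 wildcard :: with an empty port.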

8.6.1.1  Link local addresses

In IPv6, all network interfaces are assigned a link local address, starting with the digits fe80. The link local address is only valid on the same ‘link’ as the interface, meaning directly connected to the interface, or possibly on the same subnet, depending on how the network is switched. To connect to a server’s link local address, a client has to know which of its network interfaces is on the same link as the server. omniORB has no way to know which local interface a remote link local address may be connected to, and in extreme circumstances it could even end up contacting the wrong server if it picked the wrong interface. For these reasons, link local addresses are not considered valid, and servers do not publish link local addresses in their IORs.

8.6.2  Endpoint publishing

For clients to be able to connect to a server, the server publishes endpoint information in its IORs (Interoperable Object References). Normally, omniORB publishes the first available address for each of the endpoints it is listening on.

The endpoint information to publish is determined by the endPointPublish configuration parameter. It contains a comma-separated list of publish rules. The rules are applied in turn to each of the configured endpoints; if a rule matches an endpoint, it causes one or more endpoints to be published.

The following core rules are supported:

  addr       the first natural address of the endpoint
  ipv4       the first IPv4 address of a TCP or SSL endpoint
  ipv6       the first IPv6 address of a TCP or SSL endpoint
  name       the first address that can be resolved to a name
  hostname   the result of the gethostname() system call
  fqdn       the fully-qualified domain name

The core rules can be combined using the vertical bar operator to try several rules in turn until one succeeds, e.g.:

  name|ipv6|ipv4   the name of the endpoint if it has one; failing that, its first IPv6 address; failing that, its first IPv4 address.

Multiple rules can be combined using the comma operator to publish more than one endpoint, e.g.:

  name,addr   the name of the endpoint (if it has one), followed by its first address.
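The difference between the two operators can be shown with a small evaluator. This C++ sketch is a simplified model rather than omniORB's implementation: EndpointFacts, splitOn(), lookupCoreRule() and applyPublishRules() are hypothetical names, only the name, ipv4 and ipv6 core rules are modelled, an empty string stands for "not available", and whitespace in the rule string is not handled.

```cpp
#include <string>
#include <vector>

// What is known about one listening endpoint. Empty means "not available".
struct EndpointFacts {
  std::string name;
  std::string ipv4;
  std::string ipv6;
};

// Splits text at every occurrence of sep, keeping empty pieces.
inline std::vector<std::string> splitOn(const std::string& text, char sep) {
  std::vector<std::string> parts;
  std::string::size_type start = 0;
  for (;;) {
    const std::string::size_type pos = text.find(sep, start);
    if (pos == std::string::npos) {
      parts.push_back(text.substr(start));
      return parts;
    }
    parts.push_back(text.substr(start, pos - start));
    start = pos + 1;
  }
}

// Evaluates one core rule; returns an empty string if it does not succeed.
inline std::string lookupCoreRule(const std::string& rule, const EndpointFacts& ep) {
  if (rule == "name") return ep.name;
  if (rule == "ipv4") return ep.ipv4;
  if (rule == "ipv6") return ep.ipv6;
  return "";
}

// Comma-separated rules each publish (at most) one entry; within a rule,
// alternatives separated by '|' are tried in turn until one succeeds.
inline std::vector<std::string> applyPublishRules(const std::string& rules,
                                                  const EndpointFacts& ep) {
  std::vector<std::string> published;
  for (const std::string& rule : splitOn(rules, ',')) {
    for (const std::string& alternative : splitOn(rule, '|')) {
      const std::string value = lookupCoreRule(alternative, ep);
      if (!value.empty()) {
        published.push_back(value);
        break;
      }
    }
  }
  return published;
}
```

For an endpoint that has no resolvable name, the rule name|ipv6|ipv4 publishes only its IPv6 address, whereas name,ipv4 applied to a named endpoint publishes two entries, the name followed by the IPv4 address.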

For endpoints with multiple addresses (e.g. TCP endpoints on multi-homed machines), the all() manipulator causes all addresses to be published. e.g.:

  all(addr)              all addresses are published
  all(name)              all addresses that resolve to names are published
  all(name|addr)         all addresses are published by name if they have one, address otherwise.
  all(name,addr)         all addresses are published by name (if they have one), and by address.
  all(name), all(addr)   first the names of all addresses are published, followed by all the addresses.

A specific endpoint can be published by giving its endpoint URI, even if the server is not listening on that endpoint. e.g.:

giop:tcp:not.my.host:12345
giop:unix:/not/my/socket-file

If the host or port number of a TCP or SSL URI is omitted, it is filled in with the details from each listening TCP/SSL endpoint. This can be used to publish a different name for a TCP/SSL endpoint that is using an ephemeral port, for example.

omniORB 4.0 supported two options related to endpoint publishing that are superseded by the endPointPublish parameter, and so are now deprecated. Setting endPointPublishAllIFs to 1 is equivalent to setting endPointPublish to ‘all(addr)’. The endPointNoListen parameter is equivalent to adding endpoint URIs to the endPointPublish parameter.

8.7  Connection selection and acceptance

In the face of IORs containing details about multiple different endpoints, clients have to know how to choose the one to use to connect to a server. Similarly, servers may wish to restrict which clients can connect to particular transports. This is achieved with transport rules.

8.7.1  Client transport rules

The clientTransportRule parameter is used to filter and prioritise the order in which transports specified in an IOR are tried. Each rule has the form:

<address mask> [action]+

The address mask can be one of

  1. localhost                   The address of this machine
  2. w.x.y.z/m1.m2.m3.m4         An IPv4 address with bits selected by the mask, e.g. 172.16.0.0/255.240.0.0
  3. w.x.y.z/prefixlen           An IPv4 address with prefixlen significant bits, e.g. 172.16.2.0/24
  4. a:b:c:d:e:f:g:h/prefixlen   An IPv6 address with prefixlen significant bits, e.g. 3ffe:505:2:1::/64
  5. *                           Wildcard that matches any address
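The two IPv4 forms express the same thing: a prefix length is shorthand for a mask with that many leading one bits. The C++ sketch below shows the conversion; maskFromPrefixLen() and dottedQuad() are hypothetical helper names chosen for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Builds the 32-bit IPv4 mask that has prefixLen leading one bits.
inline std::uint32_t maskFromPrefixLen(unsigned prefixLen) {
  assert(prefixLen <= 32);
  // Shifting a 32-bit value by 32 is undefined, so handle 0 separately.
  if (prefixLen == 0) return 0u;
  return static_cast<std::uint32_t>(0xFFFFFFFFu << (32u - prefixLen));
}

// Formats a 32-bit value as a dotted quad, most significant byte first.
inline std::string dottedQuad(std::uint32_t value) {
  return std::to_string((value >> 24) & 0xFFu) + "." +
         std::to_string((value >> 16) & 0xFFu) + "." +
         std::to_string((value >> 8) & 0xFFu) + "." +
         std::to_string(value & 0xFFu);
}
```

A prefix length of 24 corresponds to the mask 255.255.255.0, and a prefix length of 12 corresponds to 255.240.0.0, so 172.16.0.0/255.240.0.0 and 172.16.0.0/12 select the same range of addresses.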

The action is one or more of the following:

  1. none    Do not use this address
  2. tcp     Use a TCP transport
  3. ssl     Use an SSL transport
  4. unix    Use a Unix socket transport
  5. bidir   Connections to this address can be used bidirectionally (see section 8.8)

The transport-selecting actions form a prioritised list, so an action of ‘unix,ssl,tcp’ means to use a Unix transport if there is one, failing that an SSL transport, failing that a TCP transport. In the absence of any explicit rules, the client uses the implicit rule of ‘* unix,ssl,tcp’.

If more than one rule is specified, they are prioritised in the order they are specified. For example, the configuration file might contain:

  clientTransportRule = 192.168.1.0/255.255.255.0  unix,tcp
  clientTransportRule = 172.16.0.0/255.240.0.0     unix,tcp
                      =       *                    none

This would be useful if there is a fast network (192.168.1.0) which should be used in preference to another network (172.16.0.0), and connections to other networks are not permitted at all.

In general, the result of filtering the endpoint specifications in an IOR with the client transport rule will be a prioritised list of transports and networks. (If the transport rules do not prioritise one endpoint over another, the order the endpoints are listed in the IOR is used.) When trying to contact an object, the ORB tries its possible endpoints in turn, until it finds one with which it can contact the object. Only after it has unsuccessfully tried all permissible endpoints will it raise a TRANSIENT exception to indicate that the connect failed.
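How an address is matched against a list of rules can be sketched in C++ as follows. This is a simplified model, not omniORB's rule engine: parseIPv4(), TransportRule, exampleRules() and actionsFor() are hypothetical names, only the IPv4 mask form and the wildcard are handled, and the sketch assumes that an address is governed by the first rule whose mask matches it and that an address matching no explicit rule is not used.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Parses a dotted quad into a host-order 32-bit value.
inline bool parseIPv4(const std::string& text, std::uint32_t& out) {
  unsigned a, b, c, d;
  char extra;
  if (std::sscanf(text.c_str(), "%u.%u.%u.%u%c", &a, &b, &c, &d, &extra) != 4) return false;
  if (a > 255 || b > 255 || c > 255 || d > 255) return false;
  out = (a << 24) | (b << 16) | (c << 8) | d;
  return true;
}

// One rule: a network "*" is the wildcard, and its mask is then ignored.
struct TransportRule {
  std::string network;
  std::string mask;
  std::string actions;
};

// The three rules from the configuration example above.
inline std::vector<TransportRule> exampleRules() {
  std::vector<TransportRule> rules;
  rules.push_back({"192.168.1.0", "255.255.255.0", "unix,tcp"});
  rules.push_back({"172.16.0.0", "255.240.0.0", "unix,tcp"});
  rules.push_back({"*", "", "none"});
  return rules;
}

// Returns the actions of the first rule that matches the address.
inline std::string actionsFor(const std::vector<TransportRule>& rules,
                              const std::string& address) {
  if (rules.empty()) return "unix,ssl,tcp";  // the implicit rule
  std::uint32_t addr;
  if (!parseIPv4(address, addr)) return "none";
  for (const TransportRule& rule : rules) {
    if (rule.network == "*") return rule.actions;
    std::uint32_t net, mask;
    if (!parseIPv4(rule.network, net) || !parseIPv4(rule.mask, mask)) continue;
    // Compare only the bits selected by the mask.
    if ((addr & mask) == (net & mask)) return rule.actions;
  }
  return "none";  // assumption: no matching rule means the address is not used
}
```

With the example rules, 192.168.1.7 matches the first rule and 172.31.255.1 matches the second, while 172.32.0.1 and 10.0.0.1 fall through to the wildcard and are not used.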

8.7.2  Server transport rules

The server transport rules have the same format as client transport rules. Rather than being used to select which of a set of ways to contact a machine, they are used to determine whether or not to accept connections from particular clients. In this example, we only allow connections from our intranet:

  serverTransportRule = localhost                  unix,tcp,ssl
                      = 172.16.0.0/255.240.0.0     tcp,ssl
                      = *                          none

And in this one, we accept only SSL connections if the client is not on the intranet:

  serverTransportRule = localhost                  unix,tcp,ssl
                      = 172.16.0.0/255.240.0.0     tcp,ssl
                      = *                          ssl,bidir

In the absence of any explicit rules, the server uses the implicit rule of ‘* unix,ssl,tcp’, meaning any kind of connection is accepted from any client.

8.8  Bidirectional GIOP

omniORB supports bidirectional GIOP, which allows callbacks to be made using a connection opened by the original client, rather than the normal model where the server opens a new connection for the callback. This is important for negotiating firewalls, since they tend not to allow connections back on arbitrary ports.

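There are several steps required for bidirectional GIOP to be enabled for a callback. Both the client and server must be configured correctly. On the client side, these conditions must be met:

  - The offerBiDirectionalGIOP parameter must be set to true.
  - The client transport rule for the target server must contain the bidir action.
  - The POA containing the callback object (or objects) must have been created with a BidirectionalPolicy value of BOTH.

On the server side, these conditions must be met:

  - The acceptBiDirectionalGIOP parameter must be set to true.
  - The server transport rule for the requesting client must contain the bidir action.
  - The POA hosting the object contacted by the client must have been created with a BidirectionalPolicy value of BOTH.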

8.9  SSL transport

omniORB supports an SSL transport, using OpenSSL. It is only built if OpenSSL is available. On platforms using Autoconf, it is autodetected in many locations, or its location can be given with the --with-openssl= argument to configure. On other platforms, the OPEN_SSL_ROOT make variable must be set in the platform file.

To use the SSL transport, you must link your application with the omnisslTP library, and correctly set up certificates. See the src/examples/ssl_echo directory for an example. That directory contains a README file with more details.


[1] GIOP 1.2 supports ‘bidirectional GIOP’, which permits the roles to be reversed.