We have a newly installed Beowulf cluster running Linux.
I tried to port to it software that has run for some years
under Solaris (and SunOS before that). It uses RPC (remote
procedure call) to distribute work in parallel across a network
of computers. On Linux, the software runs, successfully
passing several thousand messages, and then suddenly
we get:
clnttcp_create failed on bwasd11: RPC: Remote system error - Resource
temporarily unavailable
This seems to result from a transient situation.
If we try to reexecute the program immediately after this
message occurs, it fails the same way, immediately.
If we wait some seconds and then reexecute, the program begins
running normally. Then it fails again after several thousand
successful calls to clnttcp_create.
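Since waiting a few seconds clears the condition, one stopgap while the root cause is unknown would be to retry the failing call after a pause. Below is a minimal sketch of that idea, not our actual code. The names create_fn, create_with_retry and fail_then_succeed are hypothetical; in the real program the callback would wrap the clnttcp_create call and return -1 when it fails.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical callback type: returns 0 on success, -1 on failure.
   In the real program it would wrap the clnttcp_create call. */
typedef int (*create_fn)(void *ctx);

/* Call try_create up to max_attempts times, sleeping delay_sec seconds
   between attempts. Returns the 1-based number of the attempt that
   succeeded, or -1 if every attempt failed. */
int create_with_retry(create_fn try_create, void *ctx,
                      int max_attempts, unsigned delay_sec)
{
    assert(max_attempts > 0);
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (try_create(ctx) == 0)
            return attempt;
        fprintf(stderr, "attempt %d failed\n", attempt);
        if (attempt < max_attempts)
            sleep(delay_sec);   /* give the transient condition time to clear */
    }
    return -1;
}

/* Stand-in for the real call, for illustration only: fails as many
   times as the int counter behind ctx says, then succeeds. */
int fail_then_succeed(void *ctx)
{
    int *remaining_failures = ctx;
    if (*remaining_failures > 0) {
        (*remaining_failures)--;
        return -1;
    }
    return 0;
}
```

This would only mask the symptom. It does not explain why the resource runs out after several thousand calls, which is what I would really like to understand.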
Any help in tracking down the source of this problem will be
much appreciated.
Thanks
Dave Schaffer