Faiyaz and I found that the ipod.conf.php script was returning the wrong IP/subnet to nodes. This left the /proc/sys/net/ipv4/icmp_ipod* values misconfigured, which prevented api.RebootNode() from working with IPOD.
Upon investigation, it turned out that plc_config cannot express the kind of installation we have at PLC, namely two boot servers with independent IP addresses, either of which may send the IPOD to the node.
Faiyaz currently uses cfengine/server to distribute the configuration files to each server (boot, web, etc.). This works, but the problem remains that plc_config cannot express the general, multi-machine installation we use, which complicates node configurations that must authenticate these machines by IP address.
I don't know the right solution right now. Some thoughts are below:
- Make plc_config support multiple values for variables, for instance for DNS names.
- Perform this configuration on the node itself, perhaps using a second config file, a subset of plc_config, that lists just the relevant boot servers.
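
To make the second idea concrete, here is a rough sketch of what a node-side check could look like. Everything in it is an assumption for illustration: the file format, the [boot_servers] section, the hosts key, and the example addresses are all made up, and nothing like this exists in plc_config today.

```python
import configparser
import ipaddress

# Hypothetical node-side file listing every boot server that may send an IPOD.
# The section name, key name, and addresses are illustrative assumptions.
SAMPLE = """
[boot_servers]
hosts = 192.0.2.10, 192.0.2.11
"""

def load_boot_servers(text):
    """Parse the comma-separated host list into IP address objects."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    raw = parser.get("boot_servers", "hosts")
    return [ipaddress.ip_address(h.strip()) for h in raw.split(",") if h.strip()]

def is_trusted_sender(source_ip, boot_servers):
    """Return True if source_ip matches one of the configured boot servers."""
    return ipaddress.ip_address(source_ip) in boot_servers

servers = load_boot_servers(SAMPLE)
print(is_trusted_sender("192.0.2.11", servers))
```

The point is only that a multi-valued entry lets the node accept an IPOD from either boot server, instead of the single IP/subnet pair it gets now.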