Consolidated old blog

July 12, 2007 PLC server maintenance, et al.

I've been working on making all PLC services redundant and have been pretty successful, except for the database. PostgreSQL does not support the notion of a "database cluster" out of the box. Third-party patches claim to have solved this problem, but we have yet to investigate them.

I'm moving www to the backup database machine (cassidy). When we ordered new machines, I asked for one extra to serve as a hot spare, mainly for the database because of the problem above. The rationale is that if a machine misbehaves, I can grab the latest database dump and mix and match services and machines as needed.

EDIT: This hasn't happened yet. MyPLC is still somewhat finicky, so I haven't had a chance to install a recent complete build on PLC. I'm testing a new build as I write this, so I may just switch over soon, but cfengine needs to be updated to propagate the relevant keys. Speaking of keys, the config parser that creates httpd.conf will need extra functionality to support the key chaining (intermediate keys) that GoDaddy is making us use. So, in short, this is on hold.

July 06, 2007 plc_config.xml

PLC runs MyPLC on a collection of servers for load balancing. MyPLC's current config parsing doesn't support this out of the box, so we use CFEngine and some custom groups and rules to keep the configs in sync. The problem is that PostgreSQL's access configuration, as parsed from plc_config.xml, only allows 3 machines to access the database (www, boot, and API). Currently, we have 4 to 5 in redundant roles. I end up preparsing the configs and making CFEngine copy pg_hba.conf and the relevant httpd.confs to the proper machines.

I'm fixing the XML parser to allow more than one entry in a group so the right thing ends up happening: the right configs are generated on each machine and the right access privileges end up on the DB.
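
A minimal sketch of the idea, not the actual MyPLC parser: it assumes a hypothetical XML layout in which each group may hold several host entries, and emits one pg_hba.conf "host" record per distinct address. The element and attribute names, the database name plc_db, and the user plc_user are all placeholders chosen for illustration; the real plc_config.xml schema may look quite different.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout for illustration only; not the real plc_config.xml schema.
SAMPLE_CONFIG = """
<configuration>
  <group id="www">
    <host ip="10.0.0.11"/>
    <host ip="10.0.0.12"/>
  </group>
  <group id="boot">
    <host ip="10.0.0.21"/>
  </group>
  <group id="api">
    <host ip="10.0.0.31"/>
    <host ip="10.0.0.32"/>
  </group>
</configuration>
"""


def hosts_by_group(xml_text):
    """Return {group id: [ip, ...]}, keeping every host entry in a group."""
    root = ET.fromstring(xml_text)
    groups = {}
    for group in root.findall("group"):
        groups[group.get("id")] = [h.get("ip") for h in group.findall("host")]
    return groups


def pg_hba_lines(groups, database="plc_db", user="plc_user"):
    """Build one pg_hba.conf 'host' record per distinct address.

    The database and user names are placeholders, not MyPLC's real ones.
    """
    seen = set()
    lines = []
    for name in sorted(groups):
        for ip in groups[name]:
            if ip in seen:
                continue  # a machine serving two roles needs only one record
            seen.add(ip)
            lines.append(f"host {database} {user} {ip}/32 md5")
    return lines
```

Calling pg_hba_lines(hosts_by_group(SAMPLE_CONFIG)) yields five records here, one per machine, instead of a fixed three.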

Also, email reminders aren't part of MyPLC out of the box. I'm going to write a cron-driven function in the API that does what our out-of-band scripts currently do, so that the reminders become part of MyPLC.

June 13, 2007 Bandwidth monitoring bug

I found a bug where bwmon would reset buckets for slices even though the buckets didn't exist. At some point the kernel, vnet, or bwmon itself would kill the bucket, and the bwmon db would continue to manipulate it as if it still existed. I set up a sync to keep that from happening and got rid of old code so it's a little easier to read. I'm also making it a thread rather than a function of database.sync so syncs won't take so long. I'm not used to David's thread wrapper, but I'm sure it's the right thing to do; I just have to get comfortable with it. Oh yeah, the bwmon db would survive reboots, but the buckets wouldn't. Bad.
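
The kind of sync described above might look roughly like this. It is a sketch under assumptions, not bwmon's code: the saved state is modeled as a plain dict keyed by slice name, the live buckets as a set of slice names, and the function names are invented for illustration. It uses the standard threading module rather than David's thread wrapper.

```python
import threading


def drop_stale_records(saved, live_buckets):
    """Remove saved records whose bucket no longer exists.

    saved        -- dict mapping slice name to its persisted record (hypothetical shape)
    live_buckets -- set of slice names that currently have a bucket
    Returns the sorted list of slice names that were dropped.
    """
    stale = sorted(name for name in saved if name not in live_buckets)
    for name in stale:
        del saved[name]
    return stale


def start_sync(saved, live_buckets):
    """Run the reconciliation on its own thread so the caller does not block."""
    worker = threading.Thread(target=drop_stale_records, args=(saved, live_buckets))
    worker.start()
    return worker
```

After a reboot the set of live buckets would be empty, so every saved record would be dropped and recreated from scratch rather than manipulated as if its bucket still existed.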