Today's suckage.

Why my day sucked, by Paul Timmins:

  • My main mailserver going down from what appears to be a cron job
  • (Did I mention that timmins.net is still under attack from heavy backscatter from a joe job?)
  • (Did I also mention that the main mailserver served timmins.net?)
  • (Did I also forget to mention I don't have physical access to the machine outside of business hours?)
  • The secondary and tertiary machines, unable to clear their mail queues, began filling up.
  • As each machine ate more than 60,000 mails today (from 6am yesterday to 4am today: over 100,000 on the secondary and 57,381 on the tertiary, not counting the number eaten by my joe job killer), they started to slow down as each machine dedicated more resources toward managing the queue and flushing it as configured. This, of course, starts a chain reaction where the machines take longer to accept the mail, which causes the machine's load to go up, which causes queue management to get behind, which causes the mail to take longer to process, which causes the machine's load to go up...
  • In the middle of all this, my DSL modem (which is both my primary internet connection here at home, plus the tertiary nameserver and mailserver's internet connection, eep!) breathes its last.
  • This happens while I'm performing cleanup operations on the secondary, which was forced offline so the machine could dedicate all its time to cleaning out the garbage from the queue.
  • Then the hardware also needed to be upgraded, so I swapped drives into a newer chassis while the network was down.
  • Then I had to find a working ADSL modem in my pile of shit. I found 4, so I picked the one I found the adapter for first, and ran with it. I had to reconfigure the way my network worked, and set up PPPoE on a server that has 5 network interfaces, so it could replace the all-in-one router that did PPPoE for me previously. On the other hand, this frees up an IP address, so rock on!
  • Then my tertiary mailserver ran itself into the ground under load because it could finally use the internet properly for the first time in weeks. (This DSL modem problem started out slowly and gradually got worse.)
  • Then I came up with a pretty clever solution I'll explain later in its own post. I'm probably gonna leave this in place, as it's a perfect way to handle joe job load.