All posts by paul

Maintenance – 7/11/2012

On 7/11/2012, I'll be performing maintenance that will take xenserver2 offline for a short period of time.

Unfortunately, due to the nature of the maintenance (a reboot to clear a likely kernel panic of the dom0) it is unlikely I will be able to live migrate the virtual machines off this host first.

Affected hosts will be:
coffee.ofdoom.org
ignignokt.timminstechnologies.com

Others will be affected, but I can no longer access the management console to find out which ones, nor can I SSH into it (the connection is immediately reset).

I will try to live migrate the hosts from the console, but I don't expect the console to be of much use.

(this is of course the OTHER server – xenserver1 freaked out last week. ugh!)

How to handle a (temporary or permanent) failure of a master server in Xenserver

http://flickr.com/photos/kevincollins/74279815/

OK. So you logged in and found one of the servers down, but you don't see the virtual machines that were on that host anywhere in the resource pool. It's as if the entire system just freaked out and the server, and the VMs running on it, disappeared without a trace.

You've verified your shared storage is okay, and thus you're stable enough to begin remediation.

Now you have a problem, but it's easy to fix. Begin by connecting to the remaining server as root with ssh.

The following commands will get you back to a known good state. Make sure your master server is good and dead (unreachable and unlikely to come back without intervention of some sort). In my case it was a kernel panic, and a reboot and fsck -y / fixed the issue; this procedure will work fine in that scenario.

Last login: Fri Mar 18 16:42:03 2011 from (hostname redacted)
Type "xsconsole" for access to the management console.
[root@xenserver2 ~]# xe pool-emergency-transition-to-master
Host agent will restart and transition to master in 10 seconds...
[root@xenserver2 ~]# xe pool-recover-slaves
[root@xenserver2 ~]#

You are now at a point where the cluster is usefully reporting a master server, and any remaining slaves are pointing at it.

Now for the finishing touch – marking the powered down VMs as dead, so you can restart them on other servers. Note that if you're not correct on this, really awful inconsistent things will happen. You have been warned.

[root@xenserver2 ~]# xe vm-reset-powerstate --force --multiple
operation failed on ae820fb7-8416-68ee-35c8-c37REDACTEDf18: The operation could not be performed because a domain still exists for the specified VM.
vm: ae820fb7-8416-68ee-35c8-c37REDACTEDdf18 (REDACTED)
domid: 51
operation failed on aa2bc87f-9ef4-dd8e-492e-bREDACTED4: The operation could not be performed because a domain still exists for the specified VM.
vm: aa2bc87f-9ef4-dd8e-492e-b4REDACTE14 (REDACTED)
domid: 54
operation failed on e4f76313-2be1-0d4e-1e78-78e1REDACTED75: The operation could not be performed because a domain still exists for the specified VM.
vm: e4f76313-2be1-0d4e-1e78-78e1REDACTED5 (REDACTED)
domid: 2
operation failed on 557c9030-b995-fbaa-ab0b-61REDACTED4df: The operation could not be performed because a domain still exists for the specified VM.
vm: 557c9030-b995-fbaa-ab0b-61e4abb074df (REDACTED)
domid: 21
[root@xenserver2 ~]#

This marked every unreachable VM as "powered off", so they can now be restarted. You'll see it mercifully decided not to hurt my VMs that were confirmed to be operational. You may want to be more cautious and use host-id=your-uuid-here rather than --multiple but that's up to you.
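If you do want the more cautious, one-host-at-a-time route, here is a rough sketch. It assumes the resident-on= VM selector is how you narrow the reset to a single dead host (check xe help vm-reset-powerstate on your version before trusting that), and reset_vms_on_host is just a helper name I made up. By default it only prints the command so you can eyeball it first.

```shell
#!/bin/sh
# Sketch only: limit vm-reset-powerstate to VMs recorded as resident on ONE host.
# Assumptions: "resident-on=" works as the VM selector on your XenServer version;
# reset_vms_on_host is a hypothetical helper, not part of xe.
reset_vms_on_host() {
    host_uuid="$1"
    if [ -z "$host_uuid" ]; then
        echo "usage: reset_vms_on_host <dead-host-uuid>" >&2
        return 1
    fi
    # Build the command once so the dry run prints exactly what would execute.
    set -- xe vm-reset-powerstate "resident-on=$host_uuid" --force --multiple
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Default: print the command for review instead of running it.
        echo "$*"
    else
        "$@"
    fi
}
```

Run it as reset_vms_on_host your-uuid-here to see the command, then again with DRY_RUN=0 once you are sure the host really is dead.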

More information available here: http://docs.vmd.citrix.com/XenServer/4.0.1/reference/ch02s06.html

So let's say, for the sake of argument, you found a Linksys WIP-330 in your couch

OK, so anyway it's no secret we're madly, hopelessly disorganized around here.

But anyway, I have this "friend" who may or may not have lost a very expensive (at the time) 802.11g IP phone that ran windows mobile. Let's say, for the sake of argument, it was sometime around 2006.

And let's say that in 2012, he had a toddler that was jamming toys in the couch, and again in this hypothetical world, let's say that his wife was trying to clean the toys out of the couch and 6 YEARS LATER found this device. And it was still in good working order, only having been used for a few months before disappearing. But it had firmware from 2006 on it. And it was lost to the sands of time when Cisco bought Linksys. It's running one version up from the bone stock firmware, 1.00.06. Not 1.00.06A, mind you.

This is a story of that "friend", since clearly while being disorganized I'd never lose a $350 phone in my couch for 6 years, after all. That'd be embarrassingly careless.

Aaaaanyway. Here's the story of how to upgrade it. You'll need an 802.11b or g access point. No, 802.11n won't work even though it should supposedly fall back to 802.11g. It won't until you get the new firmware on there, so forget it. Go to a coffee shop or something if you have to.

First, take it to version 1.00.06A (that A is important; it is crucial for the next steps). I found the web browser upgrade interface to be useful for nothing other than getting 1.00.06A on; anything but that tends to just make the web server error out and reset the connection.

This adds the ability to firmware upgrade from the handset directly. From there you need to get the wip330_sbe.bin file onto the handset. Once you have the upgrade from handset options, you're good to go. (You can get all these files, by the way, at http://timmins.net/~paul/wip330/ and can use that url with your handset if you want, it's no skin off my back. Mind you, it's only a 1.5m connection, but hey, at least you can get a copy. Linksys/Cisco don't host it anymore.)

At this point you have a fancy windows ce .net machine, no sip phone, no nothing. But now, you can get to the new firmware release! YAY!

Once you have this, the on-handset upgrade can take you to wip330_v1_03_18S.bin, where you will have access to any network running open, WEP, or WPA1 (not WPA2!). If you are connecting to a WPA network and getting a prompt about WEP keys, make sure you have your authentication set to WPA/WPA2 (or WPA1/WPA2) PSK. It assumes WPA2 is WEP, no idea why. But it won't work. Mixed authentication mode will work okay, and leaves the more secure WPA2 mode available for other devices.

Android Issue: IPv6 literal in EHLO – Exim4 fix/workaround

To fix the issue where android does this:

IPv6 address/MAC address sanitized

 31.855405 2607:f4b8:2600:1::1 -> 2607:f4b8:2600:5:b607:f9ff:xxxxxx:xxxxx SMTP Response: 220 ignignokt.timminstechnologies.com ESMTP Exim 4.69 Mon, 08 Aug 2011 00:28:43 -0400
.....
 31.857239 2607:f4b8:2600:5:b607:f9ff:xxxxx:xxxx -> 2607:f4b8:2600:1::1 SMTP Command: EHLO 2607:f4b8:2600:5:b607:f9ff:xxxx:xxxx
....
 31.862978 2607:f4b8:2600:1::1 -> 2607:f4b8:2600:5:b607:f9ff:xxxxxx:xxxxxx SMTP Response: 501 Syntactically invalid EHLO argument(s)
....

Do this in your exim4 config, and restart exim:

helo_allow_chars = _:

Technically, only the colon is necessary, but the underscore fixes some other handsets that use the android_aklsjdfljasdf style names in ehlo instead, also a bug.
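If you want to sanity-check a given EHLO string against a candidate helo_allow_chars value before touching the config, here is a rough approximation in shell. It assumes the rule is "letters, digits, dots and hyphens, plus whatever is in helo_allow_chars", which is my reading of the behavior and not a copy of Exim's actual check. helo_arg_ok is a made-up helper name.

```shell
#!/bin/sh
# helo_arg_ok ARG [EXTRA_CHARS]
# Approximation (assumption, not Exim's real code): ARG passes if it contains
# only letters, digits, dots, hyphens, or characters listed in EXTRA_CHARS,
# where EXTRA_CHARS stands in for the helo_allow_chars setting.
# Caveat: EXTRA_CHARS is handed to tr, so "-" or "\" in it would need escaping.
helo_arg_ok() {
    arg="$1"
    extra="${2:-}"
    # An empty argument is never acceptable.
    [ -n "$arg" ] || return 1
    # Delete every always-allowed character; anything left over is suspect.
    rest=$(printf '%s' "$arg" | tr -d 'A-Za-z0-9.-')
    if [ -n "$extra" ]; then
        # Delete the extra allowed characters as well.
        rest=$(printf '%s' "$rest" | tr -d "$extra")
    fi
    # Valid only if nothing remains.
    [ -z "$rest" ]
}
```

For example, helo_arg_ok "android_abc123" "_:" succeeds, while the same string with no extra characters fails on the underscore.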

More details:
http://code.google.com/p/android/issues/detail?id=13681

Before you have unprotected sex, read this

(tl;dr: I am not threatening my kid with the knife, it's a tool. If you're gonna read this thing, read the whole thing)

As I stood in front of my calm, smiling child with the exposed blade of the knife from a multitool in front of me, I considered for a split second how I arrived at this point.

Sure, he is a fine child, and I love him dearly. He's getting physiologically ready to potty train, and what comes out down there is always a surprise, both in quality and in quantity. Today is no exception.

Before me, I am confronted with a quiz. You have a nearly two year old in front of you, whose plain undershirt onesie he's been running around in is a faint brown color near his nether regions, but it can only practically be removed by pulling it over his head. This has been easily done in the past, but today, he's outdone himself. The entire bottom is soaked and the area you have to unsnap is also a brownish color. He thought it best to hide from you while doing this, since he of course wanted his privacy. So you decided to declare the onesie a loss, as it's the cheaper undershirt kind anyway.

So you size up the situation, note the boy hasn't been squirmy and is very interested in what you're up to. So you pinch his shirt near the belly, and with the sharp side toward yourself, you make a 4" incision from the center to the left side across the shirt, and close up the blade, putting the multitool back on your belt. With some easy tearing, your son now has an exposed midriff, and the bottom portion slides safely off via his legs, as he patiently smiles.

Now you remove the diaper, which is apparently more… well, liquid than usual. It's clearly… soft. Mixed with a bladder full of urine. Which explains how the mess happened so quickly.

Slipping both off, wiping your child from the bellybutton down, front and back, and up the legs to the feet, you begin to wonder when he'll be able to communicate his need to use the restroom before he does it. Soon, hopefully.

Disaster averted, you play happily.

Now I don't say this stuff to discourage procreation. Not by any means. I love my child, worked diligently with my spouse in a long term loving relationship to have him, and plan on eventually having another.

But when you think to yourself "boy, that condom sure is a pain in the ass, I'm sure we'll be okay just this once" I want you to consider that visual for a moment. If you're cool with that, with the partner you're with, by all means, have at it. If you're not, well, trojans are the same cost as that onesie, when both are purchased in quantity.

Be safe out there and have fun!

For me, for later. For others?

How to change from blockio to fileio on a running openfiler instance with initiators connected and filesystems mounted:

[root@filer1 ~]# cat /proc/net/iet/volume
tid:1 name:iqn.2006-01.com.openfiler:tsn.ee0e2bfd7f91
lun:0 state:0 iotype:blockio iomode:wt path:/dev/mudkips/minun
[root@filer1 ~]# vi /etc/ietd.conf

edit the iotype to fileio in this file – but you're not done yet!

Also edit it here
[root@filer1 ~]# vi /opt/openfiler/etc/iscsi/targets/iscsi_settings.xml

This part sounds like it will kill xenserver. It won't. IO will hang for a second while the target is stopped.

[root@filer1 ~]# /etc/init.d/iscsi-target restart
Stopping iSCSI target service: …… [ OK ]
Starting iSCSI target service: [ OK ]
[root@filer1 ~]#

[root@filer1 ~]# cat /proc/net/iet/volume
tid:1 name:iqn.2006-01.com.openfiler:tsn.ee0e2bfd7f91
lun:0 state:0 iotype:fileio iomode:wt path:/dev/mudkips/minun
[root@filer1 ~]#
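If you have more than one target and don't feel like reading the whole file by eye, something like this pulls out just the tid, lun and iotype. It's a sketch that assumes the /proc/net/iet/volume layout shown above (a tid: line followed by lun: lines); iet_iotypes is a made-up helper name, and it takes an optional file argument so you can try it on a saved copy.

```shell
#!/bin/sh
# iet_iotypes [FILE]
# Print one "tid=N lun=N iotype=X" line per LUN.
# Assumes the layout shown in the transcript above; FILE defaults to
# /proc/net/iet/volume. iet_iotypes is a hypothetical helper name.
iet_iotypes() {
    awk '
        # Target header line: remember the current target id.
        $1 ~ /^tid:/ { tid = substr($1, 5) }
        # LUN line: pick out the lun: and iotype: fields wherever they sit.
        $1 ~ /^lun:/ {
            lun = ""; iotype = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^lun:/)    lun = substr($i, 5)
                if ($i ~ /^iotype:/) iotype = substr($i, 8)
            }
            print "tid=" tid, "lun=" lun, "iotype=" iotype
        }
    ' "${1:-/proc/net/iet/volume}"
}
```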

The internets say you can do:

[root@filer1 ~]# ietadm --op delete --tid=1 --lun=0 && ietadm --op new --tid=1 --lun=0 --params Type=fileio,Path=/dev/mudkips/minun

However, you can't delete a target that's got initiators connected. If nobody is connected to the target, you CAN do this against a running copy of ietd/iscsi-target without restarting ietd.
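To find out up front whether anybody is connected, you can count sessions per target. This is a sketch that assumes /proc/net/iet/session lists a tid: line per target followed by an indented sid: line per connected initiator; I'm going from memory on that layout, so look at the file on your own box first. iet_session_count is a made-up helper name.

```shell
#!/bin/sh
# iet_session_count TID [FILE]
# Print how many initiator sessions are listed under target TID.
# Assumed layout (verify on your system): "tid:N name:..." per target,
# then one indented "sid:..." line per session. FILE defaults to
# /proc/net/iet/session. iet_session_count is a hypothetical helper name.
iet_session_count() {
    awk -v want="$1" '
        # Track which target the following session lines belong to.
        $1 ~ /^tid:/ { cur = substr($1, 5) }
        # Count session lines under the target we care about.
        $1 ~ /^sid:/ && cur == want { n++ }
        END { print n + 0 }
    ' "${2:-/proc/net/iet/session}"
}
```

If it prints 0 for your tid, the ietadm route should be open to you; otherwise you're back to the restart.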

Now you know, and knowing is half the battle.

Remember, if you care about my more ephemeral stuff, I'm usually posting that to Facebook these days.
http://www.facebook.com/PaulTimmins

Hey guys does anyone want to go wardriving with me?

Ironic note: My digital camera uses the same technology utilized by me in 2003 to get me in a world of federal trouble (okay, so that was not specifically the overt act that landed me in hot water, using the internet connection I found was). It uses it to put a latitude/longitude coordinate on the images I take so I can see where they were taken. How was that data collected to have that work? Dorks driving around with laptops, wifi cards, and GPS units.

What was vilified in the news in 2003, 2004 and 2005 is now status quo. My iPod Touch also uses that technology in order to emulate a GPS in location based apps such as foursquare and google maps.

Consider me to be so far out on the cutting edge, I was actually cut.

(and before someone accuses me of being late to that party, I should note that prior to acquiring a proper wifi card, we were doing netstumbling-like exercises with Spectrum24 and RangeLan2. This was in the 2001/2002 era)

Dear lazyweb

I am looking for two videos of any quality of the following (they would have predated 1984):

The AT&T "Joey" long distance ad that tries to highlight how hard it is to use a competitive long distance carrier with the old couple getting confused on all the numbers they have to dial

The MCI "Joey" ad using the same cast that shows the same old couple aghast at their bill when they gave up on MCI and switched back to AT&T.

Anyone providing these will win at least 2 internets.