CMU - and why I hate it

Posted by: mstauber Category: Development

How and why CMU v2.90 was rolled up.

The last two weeks I did a lot of server migrations for clients. A lot more than I usually do. Now with the CentOS4 based BlueQuartz reaching its end of life it's of course natural that a lot of these old clunkers get migrated to BlueOnyx. Often at the last minute and after having run for five or six years (or more) with many, many modifications to the original OS.

As you can imagine, many of those long running servers have accumulated and inherited a lot of problems and with a CMU based migration onto a fresh BlueOnyx 5107R or 5108R these problems should be over. It's like a fresh start with the old data.

If it were that easy - of course - then there would be no reason to write home about it.

Last weekend one of these migrations turned into a little ordeal that was hard to bear. It was an unplanned emergency migration off a dying BlueQuartz box. The target of the migration was a BlueOnyx 5108R VPS on Aventurin{e} 6106R.

I had a very recent CMU export of the BlueQuartz 5102R at hand. It contained around 80 sites with 1-10 users each, so this should be easy and straightforward, right?

Think twice.

The disk quota on the source server had been shot six ways 'til Sunday. So the disk quota for sites and users was so screwed up that there was no telling which sites or users were over quota - if any. This proved to be a show stopping problem as soon as I started to import the first site. I went cautiously with the cmuImport and planned to import one site after the other. So when I imported the first site, CMU did its usual thing:

It parsed the XML file, created the site and then created all the user accounts. Next it tried to import the tarballs that contain the "payload": The actual files and folders of the site (logfiles, certificates, websites) and the files and folders of the user accounts.

Here CMU fell flat on its face and my screen filled up with scrolling garbage, as CMU buggered out while unpacking the tarball. The "overspill" of data not only filled my screen. No, it also created hundreds of fake directories and files in various places such as /root, my work directory and the newly created Vsite directory.

It's a common and age old problem with CMU. In fact this problem has been around since the Cobalt days and it has always been a source of grief and aggravation.

In my case - during this particular migration - it took hours to fix and to get the import done nonetheless. What should have been a three hour job (mostly waiting for the import to finish) lasted almost eight hours and involved furious labour and tinkering.

Now why does CMU crap out like this?

Because it tries to push a square peg through a round hole.

Let us look again at what cmuImport does:

It reads the cmu.xml file, which contains the information about the server settings and configuration and all the sites (their settings, too) and the users (plus all their settings).

It then creates a list of all sites and users and gets busy. For each site it creates the site first - with ALL the settings that it had before. If need be CMU will fill in some blanks to help overcome differences between different platform architectures. It then also creates the user accounts and will likewise import all settings the account had from the respective XML file.

At this point we have the "naked" framework of the Vsite present: The settings are there, the users are there, but what's missing is the actual files.

CMU then imports the public and private tarballs from the CMU-Dump. For this it uses the Archive::Tar Perl module that CMU brings aboard. For each file in the tarballs we have a matching line in our XML files, too. The XML file lists the (encrypted) filename, plus a checksum for that file and the UID of who owns the file. During the unpacking cmuImport checks the checksum of each file to make sure that there are no corruptions.
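The per-file check described above can be sketched roughly like this. It is only an illustration: the layout of the manifest entry, the path and the use of MD5 are my assumptions here, since the actual checksum algorithm and XML fields are not spelled out above.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Compare the checksum recorded for a file with the checksum of the data
# that actually came out of the tarball. MD5 is an assumption here.
sub entry_is_intact {
    my ($entry, $content) = @_;
    return md5_hex($content) eq $entry->{checksum};
}

# Hypothetical manifest entry: name, owner and checksum of one file.
my $content = "hello\n";
my $entry   = {
    name     => 'web/index.html',
    uid      => 'joe',
    checksum => md5_hex($content),
};

print entry_is_intact($entry, $content)   ? "ok\n" : "corrupt\n";
print entry_is_intact($entry, 'tampered') ? "ok\n" : "corrupt\n";
```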

Files are stored with the same UID that they had before. So if user "joe" was the owner of the file, he will again be the owner of that file. But what if the disk quota of user "joe" is exceeded? Or what if the disk quota of the group he belongs to is exceeded?

You of course guessed that one right: The import will fail if CMU cannot write the file because the disk quota of that user or site is exceeded during unpacking.

And when THAT happens, CMU dies horribly.

There we go: Square peg, meet round hole.

There are two ways to prevent this: The traditional way is to make sure that no sites or users are over quota when you CMU migrate. Often that is not practical and sometimes that simply is not possible. Like when you're working with the CMU export of an already dead server.

The other way would be a redesign of how CMU works.

So I finally looked at this redesign and set out to do it.

I really hate working on the CMU code with a passion, for multiple reasons. The core of the problem here is that the initial CMU was coded by a brilliant genius of a Perl wizard if there ever was one. The initial CMU was a piece of art. It was like a Michelangelo statue wrapped into the Mona Lisa and a couple of Monets. No, let me rephrase that. It was like a Lamborghini Murcielago. A very pretty supersports car. Lean, mean, fast and sexy looking. Designed by people with a passion for fast and sexy looking cars. But then someone decided that besides being fast on roads, it should also be good offroad. To "fix" that, it got turned into a monster truck with wheels tall as a man just to have enough ground clearance. Then someone else decided that it should be able to swim, too. Well, the oversized balloon wheels already provided good buoyancy. So an outboard motor from a boat was slapped onto the back. Eventually it was decided that it should also be able to operate in deep snow. So a snow plow was added to the front. Finally someone wanted it to fly as well. No problem, just add a jet engine and some wings.

So what used to be a sexy looking ride got turned into a clown car. Some of the bolted on additions are outright ridiculous, while others simply serve as an example of what you can do with chewing gum and enough duct tape.

The CMU code is pretty complex, because it contains code for import and export on all the "compatible" platforms:

  • Qube 2
  • Qube 3
  • RaQ2
  • RaQ3
  • RaQ4
  • RaQ XTR
  • RaQ550
  • BlueQuartz 5100R
  • TLAS1HE
  • TLAS2
  • BlueApp 5160R
  • BlueApp 5161R
  • BlueOnyx 5106R
  • BlueOnyx 5107R
  • BlueOnyx 5108R

During the build process of the respective RPM, it only "pulls" in the code that you really need for the target platform that you are building your CMU for - which is a godsend. But during code maintenance you need to walk through a lot more files than are really relevant.

On top of that: To actually TEST any CMU code modifications, you need to cmuExport and cmuImport to verify that your code is working. AND to be really sure you need to do that on every relevant platform. Which is no fun at all.

So I started with a small 5106R test VPS. I created four sites with some user accounts and populated the accounts and sites with data. I made sure that half of the sites and half of the user accounts were over quota. I then did a CMU export with the existing CMU 2.81-0BX06.

Then I created another 5108R test VPS as target for the import. I looked long and hard at the CMU code to find out where I would have to modify the behaviour. It turned out to be the *scanin.pl script for 5108R, which does the actual import.

That script is pretty complex and uses proprietary functions - some of which are outsourced into separate Perl modules included in the CMU RPM, or which are part of Sausalito. Such as cmuCCE, CmuCfg, Base::User, I18N, TreeXML, Archive and RaQUTIL.

The cmu.xml file is read in via readXml() provided by the TreeXML package. I'm no stranger to XML parsing via Perl or PHP, although in Perl I rather prefer different Perl modules than TreeXML for that purpose. Like XML::Simple for example. So the syntax of TreeXML is a bit different, but it's easy to understand.

The 5108Rscanin.pl script does some sorting and ordering and then uses RaQUtil.pm to extract the name of the Vsites in the XML file. A foreach() loop then walks through the sites that need to be created. Several configurational elements for Vsites are then deleted, such as the site group, the Frontpage or Chili!Soft ASP settings, the basedir and other stuff that is no longer needed. Some blanks are filled in, too. The whole set of information about the configuration of that site is then filled into a new array. That array is passed via the function unLoadHash() into a variable named $vRef, which is written off straight to CCEd:

($ok, $bad, @info) = $cce->create('Vsite', $vRef);

$vRef contains some basic settings of the imported site - including the disk quota. But a closer examination of the contents of $vRef (Data::Dumper is your friend!) revealed that the disk quota for the Vsite had been set to -1 at this stage.

Only a few steps later on during another transaction ...

$cce->unLoadNamespace($vTree, $oid);

... the real disk quota is set together with all remaining settings of the Vsite, such as whether PHP and/or CGI is enabled and the like.
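For anyone who wants to poke at a structure like $vRef themselves, here is a minimal Data::Dumper sketch. The hash below is a made-up stand-in with illustrative keys, not the real set of Vsite properties that CMU hands to CCE.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical stand-in for the hash reference passed to $cce->create().
# The key names are illustrative only.
my $vRef = {
    hostname => 'www',
    domain   => 'example.com',
    quota    => -1,
};

$Data::Dumper::Sortkeys = 1;    # stable key order makes dumps easy to compare
$Data::Dumper::Indent   = 1;    # compact, readable indentation
print Dumper($vRef);
```

Dropping a Dumper() call right before the create() transaction is usually the quickest way to see what is really being sent.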

All in all it is a beautifully elegant system, which - with a few lines of code - accomplishes a hellishly complex task.

At that point I would have loved to change the Vsite disk quota to unlimited, import the site tarball and then set the disk quota back to what it was supposed to be. But that wasn't possible. After all: The users had not yet been created, so the ownerships for the files would be all wrong.

No, that wouldn't work.

So the solution had to be slightly different:

I let CMU do its thing in the same fashion as before: Import the sites and import the users. But replace the disk quota with "unlimited" for each site and user.

Then import the tarballs for the Vsite and also for the users.

THEN - after that is done - set the quota for each site and user to what it needs to be. That proved to be easier than thought, by just slapping another two foreach() loops onto the code.
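The order of operations can be sketched like this. It is not the actual code from the *scanin.pl scripts: MockCCE, set_quota() and the quota values are hypothetical stand-ins so the sketch runs anywhere, and I am assuming that -1 is the value that means "unlimited".

```perl
use strict;
use warnings;

# Hypothetical stand-in for the CCE connection: it only remembers the last
# quota set for each object, so the sketch runs without a BlueOnyx box.
{
    package MockCCE;
    sub new       { my ($class) = @_; return bless { quota => {} }, $class }
    sub set_quota { my ($self, $name, $mb) = @_; $self->{quota}{$name} = $mb; return 1 }
    sub quota     { my ($self, $name) = @_; return $self->{quota}{$name} }
}

my $UNLIMITED = -1;    # assumption: -1 stands for "no limit"

# Quotas (in MB) as they would be read from the export; values are made up.
my %site_quota = ('www.example.com' => 500);
my %user_quota = (joe => 50, ann => 20);

sub import_with_deferred_quota {
    my ($cce, $sites, $users, $unpack) = @_;

    # Step 1: create sites and users without any limit.
    $cce->set_quota($_, $UNLIMITED) for (keys %$sites, keys %$users);

    # Step 2: unpack the tarballs while nothing can run into a quota.
    $unpack->();

    # Step 3: the two extra loops, one for sites and one for users.
    foreach my $site (sort keys %$sites) {
        $cce->set_quota($site, $sites->{$site});
    }
    foreach my $user (sort keys %$users) {
        $cce->set_quota($user, $users->{$user});
    }
    return;
}

my $cce = MockCCE->new;
my @seen_during_unpack;
import_with_deferred_quota($cce, \%site_quota, \%user_quota, sub {
    # Record what the quotas look like while the "tarballs" are unpacked.
    @seen_during_unpack = map { $cce->quota($_) } qw(www.example.com joe ann);
});
printf "%s: %d MB\n", $_, $cce->quota($_) for qw(www.example.com joe ann);
```

The callback only exists so the sketch can show that nothing is limited while the files are written; in the real import that step is the tarball extraction.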

So our Murcielago now has a telephone booth welded to the passenger compartment, too.

After lots of testing it turned out that our CMU v2.81 had also carried another bug ever since we allowed suPHP to write out site specific php.ini files that are protected with chattr: When CMU tries to import the site's private files, it unpacks the php.ini contained therein and tries to overwrite the chattr protected php.ini which got created during site creation.

That problem was fixed by telling the new CMU v2.90 to ignore php.ini files when it creates the tarballs.
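In spirit, the exclusion boils down to a filter over the list of files that go into the tarball. The sketch below is my illustration of that idea, not the code from CMU v2.90: files_for_tarball() and the paths are made up, and it assumes that every file named php.ini gets skipped regardless of the directory it sits in.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Return only the paths that should go into the tarball. Any file named
# php.ini is left out, whatever directory it sits in (an assumption).
sub files_for_tarball {
    my @paths = @_;
    return grep { basename($_) ne 'php.ini' } @paths;
}

# Made-up paths, for illustration only.
my @candidates = (
    'web/index.html',
    'php.ini',
    'logs/access.log',
    'web/app/php.ini',
);
my @kept = files_for_tarball(@candidates);
print "$_\n" for @kept;
```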

The rest was a matter of patience: The new code had to be copied to ALL the relevant *scanin.pl scripts for all the different platforms. Next order of business was to create CMU RPMs for BlueQuartz 5102R, BlueOnyx 5106R, 5107R and 5108R.

The final matter of patience was testing CMU exports and imports on all relevant platforms using the new CMU <sigh>.

Well, it seems to work. The clowncar called CMU ain't no beauty anymore (hasn't been in ages), but it gets us to places.


Feb 5, 2012 Category: Development Posted by: mstauber