Aug 22
How it works: 1.0 to 1.5 Migration
This is going to be a technical post about what goes on behind the scenes of the migration components. For me this is easier than writing the how-to documentation, but it will give those who are interested the knowledge they need, so really it’ll be a how-to with technical bits. So without any further ado, here it is!
There are two parts to migration: the 1.0 side and the 1.5 side. The 1.0 side takes place in a special migrator component. The original component, written by David, was mostly a simple SQL dump creator limited to specific tables. It worked quite well, but it wasn’t very configurable, and for a site with a lot of third party extensions this means that they would have to write their own export utility. Furthermore, since that tool was written there have been a lot of changes in 1.5 land with respect to parameters. For example, 1.0 had a parameter called ‘pdf’ for the ‘show PDF icon’ option, which in 1.5 has been renamed to ‘show_pdf_icon’. Both of these were design considerations for the new tool.
The new tool is based around plugins, the most important being the ETL plugin, which defines a transformation that creates an SQL statement. For the most part these are extraction transformations (i.e. the data comes from a database table), but there is also an ETL plugin (the configuration one) that extracts settings from the 1.0 configuration.php file and puts them into different parts of the 1.5 database. An ETL plugin is at its core quite simple and can be successfully implemented in as little as five lines of PHP code that extend the base ETLPlugin class. Those lines basically define the name of the table that the plugin transforms, a friendly name and a few brackets for good measure. Such a plugin simply extracts the contents of the table without any modification. An ETL plugin can also be more complicated, with easy ways to skip table columns, rename table columns or alter the value of a column in a row on the fly (hence the transformation part). This is used to alter some tables as they are exported from 1.0 instead of attempting this in 1.5, since we’re already examining each row to generate the SQL file.
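To make the plugin idea concrete, here is a minimal sketch written in Python rather than the migrator’s PHP. Everything in it is an assumption for illustration: the class layout, the hook names (`skip_columns`, `rename_columns`, `transform_value`), the ‘attribs’ column and the key=value parameter format are my stand-ins, not the real ETLPlugin API. It shows a “five line” plugin that copies a table untouched, and a slightly bigger one that skips a column and renames the ‘pdf’ parameter to ‘show_pdf_icon’ while the row is being exported.

```python
class ETLPlugin:
    """Hypothetical base class: turns one source row into one INSERT statement."""
    table = ""            # table name without its prefix
    friendly_name = ""    # name shown to the user
    skip_columns = ()     # columns left out of the export
    rename_columns = {}   # old column name -> new column name

    def transform_value(self, column, value):
        """Hook for subclasses; the default leaves the value untouched."""
        return value

    def quote(self, value):
        """Render a Python value as a simple SQL literal."""
        if value is None:
            return "NULL"
        if isinstance(value, (int, float)):
            return str(value)
        return "'" + str(value).replace("\\", "\\\\").replace("'", "\\'") + "'"

    def do_transformation(self, row):
        """Build the INSERT statement for a single row (a dict of column -> value)."""
        columns, values = [], []
        for column, value in row.items():
            if column in self.skip_columns:
                continue
            value = self.transform_value(column, value)
            columns.append("`%s`" % self.rename_columns.get(column, column))
            values.append(self.quote(value))
        return "INSERT INTO jos_%s (%s) VALUES (%s);" % (
            self.table, ", ".join(columns), ", ".join(values))


class WeblinksETL(ETLPlugin):
    # The simplest case: a table name and a friendly name, nothing else.
    table = "weblinks"
    friendly_name = "Web Links"


class ContentETL(ETLPlugin):
    # A plugin that skips one column and rewrites a parameter name.
    table = "content"
    friendly_name = "Content"
    skip_columns = ("checked_out",)

    def transform_value(self, column, value):
        if column != "attribs":
            return value
        lines = []
        for line in value.split("\n"):
            key, sep, rest = line.partition("=")
            if key == "pdf":          # 1.0 name
                key = "show_pdf_icon"  # 1.5 name
            lines.append(key + sep + rest)
        return "\n".join(lines)
```

Calling `ContentETL().do_transformation(row)` with a row dictionary returns the INSERT statement for that row; the base class does all the work and the subclass only overrides the hooks it needs, which is the same division of labour the real plugins rely on.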
The majority of the work of an ETL plugin is handled by the ‘doTransformation’ function, which generates the SQL statement. For most plugins this function can safely be inherited by the subclass, and the other functions (which control the alteration of column names or values) should be overridden to add the relevant functionality. The ‘doTransformation’ function fits into the migrator’s task system and is executed according to the values of the present task (table name and current iteration).
There is another form of plugin, called an SQL plugin. An SQL plugin is simply a text file with SQL in it, designed so that third party developers can place ‘CREATE TABLE’ commands in the SQL file before their ETL plugins fire and output INSERT statements that expect a table to exist. These plugins are stored in the ‘tables’ folder of the migrator, and the ETL plugins are stored in the ‘plugins’ folder. The existence of a file in these directories means it is included in the migration. There is no published or unpublished state since this should be a mostly one-off event.
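The “a file exists, so it gets included” rule is easy to picture in code. The sketch below is only an illustration in Python: the function name, the ‘.sql’ extension and the alphabetical ordering are my assumptions, not details of the actual migrator.

```python
from pathlib import Path


def collect_sql_plugins(tables_dir):
    """Concatenate every .sql file found in tables_dir, in file name order.

    There is no enabled/disabled flag: being present in the folder is
    what makes a file part of the migration.
    """
    parts = []
    for path in sorted(Path(tables_dir).glob("*.sql")):
        parts.append(path.read_text(encoding="utf-8").strip())
    return ("\n\n".join(parts) + "\n") if parts else ""
```

The returned text would go at the top of the generated SQL file, so that the CREATE TABLE statements run before any INSERT statements produced by the ETL plugins.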
The task system is used by the migrator to prevent a PHP timeout caused by running too many commands in a single request. It works by asking each ETL plugin how many rows it has to transform and storing this in a table. Once it has done that, it also collects all of the SQL plugins and places them into the new migration’s SQL file. After this is all done the task handler is executed and starts the ETL for the tables. The task system works through each ETL plugin, calling the ‘doTransformation’ function. When the task system detects that it is getting close to a set threshold of the PHP timeout it will pause the migration, update the present task and end its output by telling the browser (via JS) to load up the task handler again. It repeats this as many times as necessary until there are no tasks left in the system and the SQL file is complete.
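The pause-and-resume loop can be sketched like this. Again this is Python standing in for the migrator’s PHP, and the task dictionary, the `time_budget` parameter and the function names are hypothetical; in the real component the task state lives in a database table and the “resume” is a browser reload.

```python
import time


def run_tasks(tasks, transform, time_budget, clock=time.monotonic):
    """Work through the task list until done or until the time budget is used.

    tasks       -- list of dicts: {"table": name, "total": rows, "done": rows}
    transform   -- callable(table, row_index) doing one row of ETL work
    time_budget -- seconds allowed per request, kept below the PHP timeout
    Returns True when every task is complete, False when paused.
    """
    started = clock()
    for task in tasks:
        while task["done"] < task["total"]:
            if clock() - started >= time_budget:
                # Progress is already recorded in task["done"], so the
                # next request picks up exactly where this one stopped.
                return False
            transform(task["table"], task["done"])
            task["done"] += 1
    return True
```

A caller keeps invoking `run_tasks` with the same task list until it returns True. Because progress is stored on the task rather than in local variables, each call is safe to interrupt and restart.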
So now we have our SQL file nicely generated from the migrator, and we need to put it into 1.5 so that we can use all of the really cool features. One thing to note is that at present the migrator strips the table prefix of the existing site and replaces it with ‘jos_’. I made this decision as a documentation reduction and simplification feature, so that the old prefix entered in the last step of the 1.5 install (where the migration is triggered from) would always be ‘jos_’, but I might change this as it appears to be causing confusion. This doesn’t mean that the 1.5 tables have to have the ‘jos_’ prefix, but you do need to tell the installer that this is the prefix of the migration file. This leads me to the next step in the migration chain: a 1.5 install.
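The prefix swap itself is a small operation. A sketch in Python, where the function name and the ‘mysite_’ prefix are made up for the example:

```python
def normalise_prefix(table_name, old_prefix, new_prefix="jos_"):
    """Swap the site's own table prefix for the fixed one used in the export."""
    if table_name.startswith(old_prefix):
        return new_prefix + table_name[len(old_prefix):]
    # Tables without the site's prefix are left alone.
    return table_name
```

So a site whose tables start with ‘mysite_’ would have `normalise_prefix("mysite_content", "mysite_")` written into the SQL file as ‘jos_content’, regardless of the prefix chosen later for the 1.5 database.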
The 1.5 install proceeds normally until the final screen, the main configuration screen. Until this point the installation is the same for a migration or a blank Joomla! install. However there are some things to fill out on the final screen. This screen asks you for your site name, the admin email and password, and finally whether you want to do a migration load or a sample data install. We’re doing a migration so we check that box and enter the information. Remember the old prefix for the migration is always ‘jos_’, so put that in there. The encoding is that of your old database. If your old database is already UTF-8, select that; if you’re from most English speaking countries the default should be fine; and for other parts of the world you will have to select the encoding relevant to your region. This is used by the installer to convert the file from that format into UTF-8 (or do nothing if it’s already UTF-8!) using the server’s ‘iconv’ function. If you don’t have iconv installed, you will have to convert the SQL file into UTF-8 before you feed it to Joomla!, in which case you can set the encoding for this field to UTF-8. For Linux users iconv can easily be installed on a local machine and the conversion done there. Windows users have various options (including a small Java application I have mentioned in one of my previous posts) to convert a file into UTF-8.
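If you need to do the conversion yourself, the logic is the same whatever tool you use: decode the bytes using the old encoding, then encode them again as UTF-8. Here is a small Python sketch of that step; it stands in for what iconv does on the server and is not code from the installer, and the function name is my own.

```python
def convert_to_utf8(data, source_encoding):
    """Return the bytes of an SQL dump re-encoded as UTF-8.

    data            -- raw bytes read from the migration file
    source_encoding -- encoding of the old database, e.g. "iso-8859-1"
    """
    if source_encoding.lower().replace("_", "-") in ("utf-8", "utf8"):
        return data  # already UTF-8, nothing to do
    return data.decode(source_encoding).encode("utf-8")
```

Read the file in binary mode, pass the bytes through `convert_to_utf8`, write the result back out in binary mode, and then select UTF-8 as the encoding on the install screen. Picking the wrong source encoding will not necessarily raise an error, it may just produce garbled characters, so spot check a few accented words afterwards.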
The last few options deal with how Joomla! gets the SQL file and whether it’s a migration file. For small files it’s easy to just upload via the HTTP upload interface, but files that are over PHP’s upload limit can be uploaded to the ‘installation/sql/migration/’ directory and named ‘migrate.sql’ so the installer can find them. In that case you will need to tick the “I have already uploaded” check box as well as the “This is a 1.0 migration script” box. Keep in mind that the SQL file will be altered by the migration system no matter which option you choose. So even if you FTP the file up, you will have to copy it over again or re-upload it if the migration fails, just as if it were an HTTP upload.
The next step in the chain involves a few changes made to the migration file. These changes involve rewriting some files, and altering the statements for the menu and modules tables so that their data is inserted into temporary migration tables instead of the real tables. The altered file is then fed into the data load script, which bulk imports all of the information into the database, again with checks to ensure that the system doesn’t time out. The script is a modified version of the ‘BigDump’ script written by Alexey Ozerov (available here: http://www.ozerov.de/bigdump.php) that handles the import for us (and saved me having to write it!). From here the migration system performs some magic. It rewrites the menus to match 1.5’s new style, it grabs and imports the modules and it also does various bulk SQL updates to change things. What it doesn’t fix (at any point) is site URLs that will have changed, either in URL fields or in content items. From here the migration is complete and the site is ready to run.
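The redirect of the menu and modules data is essentially a search and replace on the table names in the INSERT statements. The Python sketch below shows one way that could look; the regular expression, the function name and the ‘_migration’ suffix for the temporary tables are assumptions for illustration, not a copy of what the installer does.

```python
import re

# Tables whose rows are parked in temporary tables for later conversion.
REDIRECTED = ("menu", "modules")


def redirect_inserts(sql, prefix="jos_", suffix="_migration"):
    """Point INSERTs for the menu and modules tables at temporary tables."""
    pattern = re.compile(
        r"(INSERT\s+INTO\s+`?)%s(%s)(`?)(?=[\s(])"
        % (re.escape(prefix), "|".join(REDIRECTED))
    )
    return pattern.sub(
        lambda m: m.group(1) + prefix + m.group(2) + suffix + m.group(3),
        sql,
    )
```

The lookahead at the end of the pattern makes sure only the exact table names match, so a table such as ‘jos_modules_menu’ is left untouched. Once the data is sitting in the temporary tables, the conversion to the 1.5 layout can be done with ordinary SQL and PHP without risking the real tables.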
At this point there are a few final things to note. There is a new menu module created and published as a part of the install (this is in addition to your old module). All of the other modules in your old site are transferred, even if they are third party, and they’re set to unpublished. You can install the 1.5 versions of the modules (or the 1.0 versions in legacy mode) easily and it won’t alter the database entries, so modules can be easily migrated. Third party components that have ETL plugins will require a special 1.5 installer that doesn’t attempt to create database tables (they’ll already be there) and only installs the files and relevant database entries.
With regard to migrator status, I’m planning on releasing an update for the migrator this Friday. I’m attempting to resolve a final bug with duplicate rows being created in the SQL file. I’m not quite sure what is causing this, but it’s my last bug before I release RC2 of the Migrator on Friday (or hopefully tomorrow, GMT+10!). I’m also going to push through some changes to the 1.5 side to ensure that certain conditions are met properly, so stay tuned for that (I’ll write a short blog post when I release it, so set your feed aggregators up to Joomla! to stay current).
A final hint for third party developers: if you’re currently using a configuration file outside of the database, with 1.5 you might consider using the JRegistry functions and migrating your configuration file into the components table. There are some nifty lines of code that allow you to create a configuration object quickly using this method (very cool) instead of having to create your own configuration file. An ETL plugin, similar to the configuration one, can allow you to make the transfer.
1 Comment so far
I read your blog in a regular manner and just love it
hope there will be more postings from you, keep on going
greetz, terry