27.9. How to make the Encyclopedia international

Figure 27-8. Administration panel: Encyclopedia.


If you are using the Encyclopedia module in your own language, and that language is not English, you may encounter problems like the following:

Such problems have their roots in both MySQL and the Encyclopedia code (see Encyclopedia Module with terms in non-english language):

First, make sure that the MySQL server uses the right character encoding. From the MySQL manual on The Character Set Used for Data and Sorting:

By default, MySQL uses the ISO-8859-1 (Latin1) character set with sorting according to Swedish/Finnish. This is the character set suitable in the USA and western Europe.

All standard MySQL binaries are compiled with --with-extra-charsets=complex. This will add code to all standard programs to be able to handle latin1 and all multi-byte character sets within the binary. Other character sets will be loaded from a character-set definition file when needed.

The character set determines what characters are allowed in names and how things are sorted by the ORDER BY and GROUP BY clauses of the SELECT statement.

You can change the character set with the --default-character-set option when you start the server. The character sets available depend on the --with-charset=charset and --with-extra-charsets= list-of-charset | complex | all | none options to configure, and the character set configuration files listed in `SHAREDIR/charsets/Index'.

Also, the user comments for the German character set in MySQL suggest that if you change the default character encoding of mysqld after you have already entered text with a different character set, you will have to export the database and reimport it so that all text is stored with the right character encoding. "REPAIR" or "myisamchk" (see Section 26.1) will not help, even if the documentation suggests they will.

You can see the character set used in a table by issuing the command

myisamchk -dvv table.MYI

If it is not the one you want, then the only thing that helps is to export the DB and reimport it with:

mysqldump -u dbuname -h dbhost --add-drop-table -p dbname > backup.sql

for the export, and

mysql -u dbuname -h dbhost -lL -p dbname < backup.sql

for the import, where dbuname, dbhost and dbname are exactly the same as in your config.php (Section 3.7). Set the character encoding to the one that is appropriate for your language; see the Table with Character Sets and Corresponding 4.1 Character Set/Collation Pairs.

Second, even if you set the character encoding correctly in MySQL, you may still want to change the $alphabet array in the alpha() function of modules/Encyclopedia/index.php:

$alphabet = array ("A","B","C","D","E","F","G","H","I","J","K","L","M",
"N","O","P","Q","R","S","T","U","V","W","X","Y","Z");

Add the letters of your non-English alphabet there, either by intermixing them with the English ones or by deleting the English ones altogether.
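As an illustration only, an intermixed array for German might look like the sketch below. The choice of language and the position of the umlauts are assumptions made for this example and do not come from the PHP-Nuke source. The sketch also assumes that index.php is saved in the same character encoding that the MySQL server uses, since otherwise the letters in the file will not match the titles in the database.

```php
<?php
// Hypothetical German variant of the $alphabet array: the three umlauts
// are intermixed with the 26 English letters. Save this file in the same
// character encoding that the MySQL server is configured to use.
$alphabet = array ("A","Ä","B","C","D","E","F","G","H","I","J","K","L","M",
"N","O","Ö","P","Q","R","S","T","U","Ü","V","W","X","Y","Z");
```

The order of the array is the order in which the letters are listed, so place each added letter where readers of your language would expect to find it.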

PHP-Nuke does this by constructing a database query that finds all entries of the given encyclopedia id whose title, when converted to uppercase, starts with (is "like") the given letter, where the letter can only be one of the entries in the $alphabet array. See for yourself in the terms() function of modules/Encyclopedia/index.php:

$sql = "SELECT tid, title FROM ".$prefix."_encyclopedia_text 
WHERE UPPER(title) LIKE '$ltr%' AND eid='$eid'";
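To make the role of $alphabet in this query more concrete, here is a minimal sketch. The function build_terms_query() is a hypothetical helper written for this illustration and is not part of PHP-Nuke. It builds the same query string, but only when the requested letter is one of the entries in $alphabet. Whether UPPER(title) then treats non-English letters correctly still depends on the character set of the MySQL server, as discussed above.

```php
<?php
// Hypothetical helper, NOT part of PHP-Nuke: returns the term-lookup query
// for a letter, or null if the letter is not listed in $alphabet.
function build_terms_query($prefix, array $alphabet, $ltr, $eid)
{
    // Accept only letters that appear verbatim in the configured alphabet.
    if (!in_array($ltr, $alphabet, true)) {
        return null;
    }
    // Force the encyclopedia id to an integer before using it in the query.
    $eid = (int) $eid;
    return "SELECT tid, title FROM " . $prefix . "_encyclopedia_text "
         . "WHERE UPPER(title) LIKE '" . $ltr . "%' AND eid='" . $eid . "'";
}

// Example values, assumed for illustration only.
$alphabet = array("A", "Ä", "B");
$query    = build_terms_query("nuke", $alphabet, "Ä", 3);
$rejected = build_terms_query("nuke", $alphabet, "Q", 3);
```

With these example values, $query holds the query for titles starting with "Ä" in encyclopedia 3, while $rejected is null because "Q" is not in the shortened array.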

Now, what if you want to mix Chinese and Greek together?

In such a case, your best bet would be to go with Unicode. But given that PHP's Unicode support is close to non-existent, you are out of luck. Perhaps this will change in the future. In the meantime, you can prepare yourself by reading The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets - No Excuses!