Mi blog lah! Το ιστολόγιό μου

23Jul/071

Important MO file optimisation for en_* locales, and partly others

During GUADEC, Tomas Frydrych gave a talk on exmap-console, a cut-down version of exmap that can work well on mobile devices.

During the presentation, Tomas showed how to use the tool to find the culprits in memory (ab)use on the GNOME desktop. One issue that came up was that MO files were taking up memory even though the desktop was showing English. Why would the MO translation files loaded in memory be so big?

gtk20.mo                  : VM   61440  B, M   61440  B, S   61440  B
atk10.mo                  : VM    8192  B, M    8192  B, S    8192  B
libgnome-2.0.mo           : VM   28672  B, M   24576  B, S   24576  B
glib20.mo                 : VM   20480  B, M   16384  B, S   16384  B
gtk20-properties.mo       : VM     128 KB, M     116 KB, S     116 KB
launchpad-integration.mo  : VM    4096  B, M    4096  B, S    4096  B

A translation file looks like

msgid "File"

msgstr ""

When translated to Greek it is

msgid "File"

msgstr "Αρχείο"

In the English UK translation it would be

msgid "File"

msgstr "File"

This is actually not necessary, because if you leave those messages untranslated, the system will use the original messages that are embedded in the executable file.

However, for the purposes of the English UK, English Canadian, etc. teams, it makes sense to copy the same messages into the translated field, because that indicates that the message was examined by the translator. Any new messages would appear as untranslated and the same process would continue.

Now, the problem is that the gettext tools are not smart enough when they compile such translation files; they needlessly replicate those messages, which then occupy space in the generated MO file.

Apart from the English variants, this issue is also present in other languages when the message looks like

msgid "GConf"

msgstr "GConf"

Here, it does not make much sense to translate the message into the locale language. However, the generated MO file now contains more than 10 extra bytes (5+5), plus some space for the index.
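To put a rough number on "plus some space for the index", here is a minimal sketch of the per-entry cost, assuming the standard GNU MO layout: both strings are stored NUL-terminated, and each one has an 8-byte (length, offset) descriptor in its table. The optional hash table is ignored, and the function name is made up for illustration.

```python
def mo_entry_cost(msgid: str, msgstr: str) -> int:
    """Approximate bytes that one entry adds to an MO file (hash table ignored)."""
    # Both strings are stored as NUL-terminated byte strings.
    strings = len(msgid.encode("utf-8")) + 1 + len(msgstr.encode("utf-8")) + 1
    # One 8-byte (length, offset) descriptor in the table of originals,
    # and one in the table of translations.
    descriptors = 2 * 8
    return strings + descriptors


print(mo_entry_cost("GConf", "GConf"))  # 28
```

So a copied five-letter message costs closer to 28 bytes than 10, and all of it goes away when the entry is left out of the MO file.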

Therefore, what's the solution for this issue?

One solution is to add to msgattrib the option to preprocess a PO file and remove those unneeded copies. Here is a patch,

--- src.ORIGINAL/msgattrib.c 2007-07-18 17:17:08.000000000 +0100
+++ src/msgattrib.c 2007-07-23 01:20:35.000000000 +0100
@@ -61,7 +61,8 @@
REMOVE_FUZZY = 1 << 2,
REMOVE_NONFUZZY = 1 << 3,
REMOVE_OBSOLETE = 1 << 4,
- REMOVE_NONOBSOLETE = 1 << 5
+ REMOVE_NONOBSOLETE = 1 << 5,
+ REMOVE_COPIED = 1 << 6
};
static int to_remove;

@@ -90,6 +91,7 @@
{ "help", no_argument, NULL, 'h' },
{ "ignore-file", required_argument, NULL, CHAR_MAX + 15 },
{ "indent", no_argument, NULL, 'i' },
+ { "no-copied", no_argument, NULL, CHAR_MAX + 19 },
{ "no-escape", no_argument, NULL, 'e' },
{ "no-fuzzy", no_argument, NULL, CHAR_MAX + 3 },
{ "no-location", no_argument, &line_comment, 0 },
@@ -314,6 +316,10 @@
to_change |= REMOVE_PREV;
break;

+ case CHAR_MAX + 19: /* --no-copied */
+ to_remove |= REMOVE_COPIED;
+ break;
+
default:
usage (EXIT_FAILURE);
/* NOTREACHED */
@@ -436,6 +442,8 @@
--no-obsolete remove obsolete #~ messages\n"));
printf (_("\
--only-obsolete keep obsolete #~ messages\n"));
+ printf (_("\
+ --no-copied remove copied messages\n"));
printf ("\n");
printf (_("\
Attribute manipulation:\n"));
@@ -536,6 +544,21 @@
: to_remove & REMOVE_NONOBSOLETE))
return false;

+ if (to_remove & REMOVE_COPIED)
+ {
+ if (!strcmp(mp->msgid, mp->msgstr) && strlen(mp->msgstr)+1 >= mp->msgstr_len)
+ {
+ return false;
+ }
+ else if ( strlen(mp->msgstr)+1 < mp->msgstr_len )
+ {
+ if ( !strcmp(mp->msgstr + strlen(mp->msgstr)+1, mp->msgid_plural) )
+ {
+ return false;
+ }
+ }
+ }
+
return true;
}

However, if we only change msgattrib, we would need to adapt the build system for all packages.

Apparently, it would make sense to change the default behaviour of msgfmt, the program that compiles PO files into MO files.
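The filtering idea itself is small. Here is a sketch of it in Python rather than in the gettext sources; the Entry class and the function names are hypothetical, and the plural check is a bit stricter than in the patch (it requires both forms to match, and so only covers two-form locales such as the en_* ones).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    msgid: str
    msgstr: List[str]                 # one string, or one per plural form
    msgid_plural: Optional[str] = None


def is_copied(entry: Entry) -> bool:
    """True when the translation merely repeats the original message."""
    if entry.msgid == "":
        # The header entry has an empty msgid and must always be kept.
        return False
    if entry.msgid_plural is None:
        return entry.msgstr == [entry.msgid]
    # Plural entry: both forms have to be plain copies.
    return entry.msgstr == [entry.msgid, entry.msgid_plural]


def strip_copied(entries: List[Entry]) -> List[Entry]:
    """Drop copied entries so that they never reach the MO file."""
    return [e for e in entries if not is_copied(e)]


catalog = [
    Entry("", ["Content-Type: text/plain; charset=UTF-8\n"]),
    Entry("File", ["File"]),
    Entry("Color", ["Colour"]),
    Entry("%d file", ["%d file", "%d files"], "%d files"),
    Entry("Help", [""]),
]
print([e.msgid for e in strip_copied(catalog)])  # ['', 'Color', 'Help']
```

Untranslated entries (empty msgstr) are left alone, so translators still see them as pending work.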

An e-mail was sent to the email address for the development team of gettext regarding the issue. The development team does not appear to have a Bugzilla to record these issues. If you know of an alternative contact point, please notify me.

Update #1 (23Jul07): As an indication of the file size savings, the en_GB locale on Ubuntu in the installation CD occupies about 424KB where in practice it should have been 48KB.

A full installation of Ubuntu with some basic KDE packages (only the basic libraries, i.e. those needed for KBabel; ls k* | wc -l = 499) occupies about 26MB of space just for the translation files. After optimising the MO files, the translation files occupy only 7MB. This is quite important because when someone installs, for example, the en_CA locale, all en_?? locales are added.

The reason why the reduction is not larger has to do with the message types that KDE uses. For example,

msgid ""
"_: Unknown State\n"
"Unknown"
msgstr "Unknown"

I cannot see a portable way to code the gettext-tools so that they understand that the above message can be easily omitted. For the above reduction to 7MB, KDE applications (k*) occupy 3.6MB. The non-KDE applications include GNOME, XFCE and GNU traditional tools. The biggest culprits in KDE are kstars (386KB) and kgeography (345KB).

Update #2 (23Jul07): (Thanks Deniz for the comment below on gweather!) The po-locations translations (gnome-applets/gweather) of all languages are combined to generate a big XML file that can be found at /usr/share/gnome-applets/gweather/Locations.xml (~15MB).

This file is not kept in memory while the gweather applet is running.
However, the file is parsed when the user opens the properties dialog to change the location.
I would say that the main problem here is the file size (15.8MB) that can be easily reduced when stripping copied messages. This file is included in any Linux distribution, whatever the locale.

The po-locations directory currently occupies 107MB; when copied messages are eliminated it occupies 78MB (a difference of about 30MB). The generated XML file is in any case smaller (15.8MB without optimisation) because it does not repeat the msgid lines for each language.

I regenerated the Locations.xml file with the optimised PO files and the resulting file is 7.6MB. This is a good reduction in file space and also in packaging size.

Update #3 (25Jul07): Posted a patch for gettext-tools/msgattrib.c. Sent an e-mail to the kde-i18n-doc mailing list and got a good response, including a valid argument against the proposed changes. Specifically, there is a problem when one gives custom values to the LANGUAGE variable, for example "es:fr", which means "show me messages in Spanish and, if something is untranslated, show it in French". If a message has msgid==msgstr for Spanish but not for French, then with the proposed optimisation it would show in French.
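The LANGUAGE objection is easy to reproduce with a toy model of the fallback chain. This is only a sketch: the lookup function imitates the behaviour described above rather than calling gettext, and the catalog contents are made up.

```python
def lookup(msgid: str, language: str, catalogs: dict) -> str:
    """Return the first translation found along a LANGUAGE-style priority list."""
    for lang in language.split(":"):
        translation = catalogs.get(lang, {}).get(msgid)
        if translation:
            return translation
    # Nothing found anywhere: fall back to the original message.
    return msgid


# Made-up catalogs: the Spanish translation happens to equal the msgid.
full = {"es": {"Chat": "Chat"}, "fr": {"Chat": "Discussion"}}
# The same catalogs after stripping msgid == msgstr entries.
stripped = {"es": {}, "fr": {"Chat": "Discussion"}}

print(lookup("Chat", "es:fr", full))      # Chat
print(lookup("Chat", "es:fr", stripped))  # Discussion
```

With a single language in LANGUAGE (or none) both variants give the same result, which is why the problem only shows up with custom priority lists.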

16Jul/070

GUADEC Day #2

(see http://www.guadec.org/schedule/warmup)

At the first presentation, Quim Gil talked about GNOME marketing: what has been done and what the goal of marketing is. He showed a mind focused on the important marketing tasks; it is easy to get carried away and not be effective, a mistake that happens in several projects.

The next session was by Tomas Frydrych (Open Hand - I have their sticker on my laptop!) on memory use in GNOME applications. Many people complain that XYZ is bloated. However, this does not convey what exactly happens; pretty useless. In addition, the common tools that show memory use do not give the proper picture because of the memory management techniques. That is, due to shared libraries, the total memory occupied by an application appears very big. One tool that was examined is exmap. This tool uses a kernel module that shows memory use of applications by reading in /proc. It takes a snapshot of memory use; it's not real-time info. It comes with a GTK+ front-end (gexmap) that requires a big screen (oops, PDAs), so it is not suitable for internet tablets and other low-spec devices. Therefore, they came up with exmap-console, which addresses these shortcomings. It has a console interface based on the readline library.

Here are the rest of my notes. Hope they make sense to you.

. exmap --interactive
. ?: help
. Heap: quite useful (dynamic allocation)
. Mapped:
. Sole use: memory that app is using on its own (rss?)
. "sort vm"
. "print" or "p"
. "add nautilus"
. "clear"
. "detail file" (what executables/libs loaded and how much consume)
. "detail none"

Sole use
. valgrind, to analyse Sole Use memory?
. "detail ????"

Lots of small libraries: overhead

Looking ahead
. Pagemap: by Matt Mackall
. http://projects.o-hand.com/exmap-console/

Python
. Sole use: ~18MB ;-(

Tomas was apparently running Ubuntu with the English UK locale. The English UK translation team is doing an amazing job in the translation stats. Actually, most messages are copied; with a script, however, one can pick up words such as organization and change them to organisation. The problem here is that, for example, the GAIM MO file is 215KB (?), whereas for the British English translation the actual changes should be less than 2-3KB. Messages that are missing from a translation mean that the original US English messages will be used. I'll have to find out how to use msgfilter to make messages untranslated if msgid == msgstr. Where is Danilo?

After lunch time (did not go for lunch), I went to the Accerciser session. Pretty cool tool, something I have been looking for. Accerciser uses the accessibility framework of GNOME to inspect the windows of running applications and look into their properties. A good use is to identify whether elements such as text boxes come with description labels; these are important for accessibility purposes (screen readers), because a person may depend on software to read out (text to speech) the contents of windows.

The next session was GNOME accessibility for blind people. Jan Buchal gave an excellent presentation.

My notes,

. is from the Czech Republic, is blind himself. has been using computers for 20+ years

. from user perspective
. users, regular and irregular ;-)
. software
. firefox 3.0beta - ok for accessibility other versions no
. gaim messenger ok
. openoffice.org ok but did not try
. orca screenreader ^^^ works ok.
. generally ready for prime time
. ubuntu guy for accessibility was there
. made joke about not having/needing display slides ;-]
. synthesizer: festival, espeak, etc - can choose
. availability of voices
. javascript: not good for accessibility
. links/w3m: just fine!
. firefox3 makes accessibility now possible.
. web designer education, things like title="", alt="" for images.
. OOo, not installed but should work, ooo-gnome
. "Brailcom" company name
. "speech dispatcher"
. logical events
. have short sound event instead of "button", "input form"
. another special sound for emacs prompt, etc.
. uses emacs
. have all events spoken, such as application crashing.
. problems of accessibility
. not money main factor, but still exists.
. standard developers do not use accessibility functions
. "accessor" talk, can help
. small developer group on accessibility, may not cooperate well
. non-regular users (such as blind musician)
. musicians
. project "singing computer"
. gtk, did not have good infrastructure
. used lilypond (music typesetter, good but not simple to use)
. singing mode in festival
. use emacs with special mode to write music scores (?)
. write music score and have the computer sing it (this is not "caruso")
. gnome interface for lilypond would be interesting
. chemistry for blind
. gtk+
. considering it
. must also work, unfortunately, on windows
. gtk+ for windows, not so good for accessibility
. conclusion: free accessibility
. need users so that applications can be improved
. have festival synthesizer, not perfect but usable
. many languages, hindi, finnish, afrikaans
. Edinburgh project, to reimplement festival better
. proprietary software is a disadvantage
. q: how do you learn to use new software?
. a: has been a computer user for 20+ years, is not good candidate to say
. a: if you are dedicated, you can get past hurdles, old lady emacs/festival/lilypond
. Brailcom, not for end-users(?)
. developer problem?
. generally there is lack of documentation; easy to teach what a developer needs to know
. so that the application is accessible
. HIG Human Interface Guidelines, accessible to the developers
. "speakup" project
. Willy, from Sun Microsystems, working on accessibility for 20+ years, lead of Orca.
. developers: feel accessibility is a hindrance to development
. in practice the gap is not huge
. get tools (glade) and gtk+ to come with accessibility on by default
. accessibility
. is not only for people with disabilities
. can do amazing things like 3d interfaces something

These summaries are a good example of the rule that during a presentation, participants tend to remember only about 8% of the material. In some cases, even less is recollected.

6Jul/070

Google Groups: Member Invite Request Approved

When creating a Google Group, you have the option of auto-subscribing a list of e-mail addresses. That is, the owner of each e-mail address does not have to perform the subscription task. To avoid the apparent spamming opportunity, Google Groups has a human review those requests. After you paste the e-mail addresses, you press Submit and then get a text box where you can write a message to help this person decide.

While filling in such a request, I made a gross mistake and added 140 more e-mail addresses than I should have. In the text box I wrote in capitals, PLEASE CANCEL THIS REQUEST, MISTAKE.

Just now I got a reply, and that request got approved. On the positive side, the auto-subscription request was thankfully converted to a notification request, so all these people received an invitation to join the group.

Thank you all for not complaining!

p.s.

My regular blog is offline for a few days so I am using this one for now.

25Jun/071

Say No to OOXML

Click on the image above to visit the petition page.

I copy here the terms of the petition to say no to the standardisation of MSOOXML at ISO.

I ask the national members of ISO to vote "NO" in the ballot of ISO DIS 29500 (Office OpenXML or OOXML format) for the following reasons:

  1. There is already a standard ISO26300 named Open Document Format (ODF): a dual standard adds costs, uncertainty and confusion to industry, government and citizens;
  2. There is no provable implementation of the OOXML specification: Microsoft Office 2007 produces a special version of OOXML, not a file format which complies with the OOXML specification;
  3. There is missing information from the specification document, for example how to do a autoSpaceLikeWord95 or useWord97LineBreakRules;
  4. More than 10% of the examples mentioned in the proposed standard do not validate as XML;
  5. There is no guarantee that anybody can write a software that fully or partially implements the OOXML specification without being liable to patent damages or patent license fees by Microsoft;
  6. This standard proposal conflicts with other ISO standards, such as ISO 8601 (Representation of dates and times), ISO 639 (Codes for the Representation of Names and Languages) or ISO/IEC 10118-3 (cryptographic hash);
  7. There is a bug in the spreadsheet file format which forbids to enter any date before the year 1900: such bugs affects the OOXML specification as well as software versions such as Microsoft Excel 2000, XP, 2003 or 2007.
  8. This standard proposal has not been created by bringing together the experience and expertise of all interested parties (such as the producers, sellers, buyers, users and regulators), but by Microsoft alone.

This project is an initiative by the Foundation for a Free Information Infrastructure (FFII), the non-profit that helped achieve the rejection of the EU software patent directive in July 2005.

Update #1: Currently (26Jun07 - noon) there are 8805 signatures.
Update #2: Currently (26Jun07 - evening) there are 9481 signatures.
Update #3:

IT IS URGENT THAT YOU CONTACT THE STANDARDISATION BODY IN YOUR COUNTRY AND EXPLAIN TO THEM WHY OOXML IS BROKEN; SENDING A NICE LETTER TO THE STANDARDISATION BODY IN YOUR COUNTRY IS MORE IMPORTANT THAN SIGNING THE PETITION

1May/070

Mentoring facility available in Launchpad, Ubuntu

Is there a bug report in Launchpad.net (Ubuntu) that you are confident you can help someone to fix but do not have the time to fix yourself?
Now Launchpad provides the facility for contributors to offer mentoring support on bug reports and blueprints, so that users can apply and receive mentoring. With mentoring you help someone else solve a problem. Ubuntu benefits, and the new user also gets help in resolving bugs.

For the Greek language there is an Ubuntu team called Ubuntu Greek Testers. Users interested in the Greek language in Ubuntu can subscribe to the team. Then, for any bug report that relates to Greek language support, we subscribe the team to the report. The system is configured so that any activity on those bug reports is mailed to each member. This makes it easy to track the status of reports.

You can see the current list of pending reports for the Greek language in Ubuntu.

One of the bug reports is about the Broken context-sensitive spell check in evolution (Greek, Russian, etc) GNOME #344008.
This report has been there for around 2 years and it should be fixed soon. I am not sure what the best course of action should be. Well, the typical course of action would be to compile GNOME manually (jhbuild), locate the code in Evolution that deals with gnome-spell and put printf()s that show what's going on when this part of the code is reached. Any takers?

Update: Just offered mentorship for https://bugs.launchpad.net/evolution/+bug/10713 :) I understand the direction to work on but do not know what exactly is going on.

23Feb/0713

Video playback problems (black) after installing Beryl (or Compiz)

Note: Here we describe a workaround. The proper solution is to fix the graphics drivers and the X.Org X server. Such work is taking place, and in several cases, especially with newer versions of Linux, you do not need this workaround.

You just installed your 3D Linux desktop and you are really enthusiastic about it. But when you try to play some videos, you get a strange black output. What's going on?
The common software video players that come with the Linux desktop are able to display the video stream to several types of output devices. This includes several types of output for the graphical interface, and also obscure output devices such as text mode, using ASCII characters.
The default output device is XVideo (or Xv) for players such as those based on GStreamer (totem) and VLC.
As you guessed, there is a bug with XVideo when using Beryl/Compiz. Therefore, to fix, you need to switch to another output device that works.
For GStreamer players (such as totem, the default movie player in GNOME, Ubuntu and so on), you need to run from the command line the command
gstreamer-properties
(with older distributions such as Ubuntu 6.06 there is an option in System/Preferences for this) and pick Video, then for Default Video Plugin choose X Window System (No Xv). Click on Test to verify that it actually works. Click Close and you are set.
VLC is not installed by default in Ubuntu 6.10. You need to install manually using the Synaptic Package Manager (under System/Administration), once you have activated the Universe repository in Repositories.
Start VLC and click on Settings, then Preferences. Expand Video and then expand Output modules. You will notice several options for output device. How do we actually choose which one should be the active output device? Well, it appears it's a bit tricky. Select the item Output modules, and notice the checkbox at the bottom right that says Advanced options. Check the box, and now you have the option to select a different output device. Pick X11 video output, click on Save and you are set!

Update (17 Jun 2007): Added section at UbuntuGuide.org, How do I fix black windows during video playback.

9Jan/070

Creating a new locale on the OLPC

When you run the OLPC software you currently have access only to the English locales.

If you want to enable Greek support, you need to run (as root)

localedef -v -c -i /usr/share/i18n/locales/el_GR -f UTF-8 /usr/lib/locale/el_GR/

localedef -v -c -i /usr/share/i18n/locales/el_GR -f UTF-8 /usr/lib/locale/el_GR.utf8/

You will get a bunch of warnings. You can ignore them for now.

The localedef command compiles the source locale information found at /usr/share/i18n/locales/el_GR and places the resulting files at
/usr/lib/locale/el_GR/ and /usr/lib/locale/el_GR.utf8/ (both directories contain the same files, so you can also make a link from one to another). The reason we make two versions is that we can use either el_GR or el_GR.utf8 in the applications. Both use UTF-8 as the base encoding which is always nice.
For other locales, replace el_GR with the locale name of your country.

To activate the Greek locale, you need to create a file /etc/sysconfig/i18n and add the text

LANG=el_GR.utf8

LANGUAGE=el:en

Now you need to place the translated applications (.mo format) into

/usr/share/locale/el/LC_MESSAGES/

and restart your virtual machine (or laptop (hint hint)).
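
For locales other than el_GR, the two localedef command lines and the contents of /etc/sysconfig/i18n follow the same pattern. Here is a small sketch that only builds the strings and does not run anything; the function names are made up, and it assumes the paths are the same as in the el_GR example above.

```python
def localedef_commands(locale: str, charset: str = "UTF-8"):
    """Build the two localedef command lines used above, for any locale name."""
    source = f"/usr/share/i18n/locales/{locale}"
    commands = []
    for suffix in ("", ".utf8"):
        target = f"/usr/lib/locale/{locale}{suffix}/"
        commands.append(
            ["localedef", "-v", "-c", "-i", source, "-f", charset, target]
        )
    return commands


def i18n_config(locale: str, language: str) -> str:
    """Text for /etc/sysconfig/i18n, following the el_GR example."""
    return f"LANG={locale}.utf8\nLANGUAGE={language}\n"


for command in localedef_commands("el_GR"):
    print(" ".join(command))
print(i18n_config("el_GR", "el:en"), end="")
```

The commands still have to be run as root on the OLPC image itself.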

8Jan/070

The OLPC and Greek

(oh, I am writing this through a lousy Net connection; thanks Engelados)

I tried out the latest OLPC image, specifically build 218, on Qemu and my aim was to get Greek support configured, if it was not there already.

The OLPC does not currently come with a good set of Greek fonts; you will need to install a set of fonts such as DejaVu or GFS Didot.
Installing means adding the font files to the directory /usr/share/fonts/. The current font configuration files in the OLPC favour Bitstream Vera, therefore you would need to move the bitstream subdirectory outside the fonts directory. DejaVu is based on Bitstream Vera and therefore you will not notice any change once you upgrade. Also, Fedora Core 6 and Ubuntu Linux use DejaVu as their default font. You need DejaVu, as Bitstream Vera does not currently support Greek. Both DejaVu and GFS Didot are free and open-source fonts.

Note: This screenshot shows DejaVu Sans, not GFS Didot. Sorry for the typo.
This is the OLPC running the cut-down version of the Abiword wordprocessor. Click on the image to view the full size.

This is the OLPC showing the same document as above with GFS Didot. The font looks quite nice and similar to old Greek textbooks. There is a small issue, however: it does not have the character coverage of DejaVu. For example, notice that the Euro sign is missing from GFS Didot. Other glyphs, such as fancy bullet characters, are missing as well. Normally, the OLPC software should replace those missing characters with the correct characters from another font. Apparently something is wrong here and needs further investigation.

Writing support for the Greek language has to be configured separately in the OLPC. The case with other languages appears to be that the default layout is that of the language; apparently there is no need to switch between Brazilian Portuguese and English. For the Greek language it appears that it is good to be able to switch between Greek and English.

There are several places that you can add Greek writing support. The most common is in /etc/X11/xorg.conf. Having gone through the configuration files, I think that /etc/X11/Xkbmap is also a good place and saves us from touching the core Xorg configuration file.

To write the full set of Greek letters, one needs to set the extended variant for the Greek layout, and also try to set the Compose key (for ano teleia). These things should be simplified...

I am not sure what the OLPC keyboard looks like (the only photos I saw were not focusing on the keyboard). Perhaps it would be useful to have a test machine at my disposal (hint, hint).
Jim Gettys wrote on his blog about the different languages that the first generation of the OLPC should support. Both Kinyarwanda and Kiswahili use the Latin alphabet, therefore there are no significant issues with font support or writing support.

p.s.
Greece will carry out a pilot with OLPC laptops next September.

2Jul/060

Re: gtk1.x and Greek

Nikos Nyktaris wrote:
Since it looks like a new version of knoppel is not going to happen for me, I
thought I would deal with some bugs that I had run into at some point.
Below is one of them; it concerns gtk1.x applications and Greek, and I would
like your opinion before I file a bug report that is useless or wrong.
The problem, then, is the display of Greek in gtk1.x applications, which I have
been seeing for a long time (since x.org came out) on Debian. After installing
the latest version of etch a few days ago, Greek in xmms
appears as in
http://www.knoppel.org/gtkbug/xmms1.png

Here we see that the translated messages of the application are in Unicode form (for monotonic Greek that is 2 bytes per character). For some reason XMMS does not understand that the messages are in UTF-8, and as a result it tries to render each byte as a character from an 8-bit encoding. The repetition of the & character points to this problem.
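
The repetition is easy to reproduce. Here is a minimal sketch that uses Latin-1 as a stand-in for the 8-bit encoding (which encoding XMMS actually assumed is not known here): every monotonic Greek letter becomes two bytes in UTF-8, and the first byte of each pair is always 0xCE or 0xCF, so the same one or two characters keep repeating when the bytes are shown one by one.

```python
text = "Αρχείο"                     # "File" in Greek, 6 letters
raw = text.encode("utf-8")          # 12 bytes, 2 per letter

# Pretend to be an application that treats every byte as one character.
garbled = raw.decode("latin-1")

# The first byte of every pair is 0xCE or 0xCF.
lead_bytes = sorted(set(raw[0::2]))
print(len(text), len(raw), [hex(b) for b in lead_bytes])
print(repr(garbled))
```

Every second character of the garbled string is the same one or two symbols, which matches the repeated character in the screenshot.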

This problem had also been reported for the Russian language some time ago (debian
bug 330144) and it was solved for them. (it is the bug I was telling simo about
a while ago)
For a corresponding solution for Greek there are two parts. One concerns the
font set in /etc/gtk/gtk.utf8, which evidently does not support
Greek properly.
Opening the file and changing the font to fixed, the result
is as shown in

http://www.knoppel.org/gtkbug/xmms2.png

Do you get the difference between the two screenshots above simply by changing the font? That may mean that the first font is not Unicode.

As far as I know, the font above is one of the Asian fonts.

Obviously it is not the best aesthetically.
Coming back to bug 330144, I notice that our own files
in /usr/share/X11/locale/el_GR.UTF-8 do exist but are empty, and from what
locale.dir shows they are not used anyway.
So I created them and added the two encodings iso8859-7 and cp1253,
which are not present in the corresponding English file. I changed locale.dir to
use the new files, and the result is as shown in

http://www.knoppel.org/gtkbug/xmms3.png

Note that the file /etc/gtk/gtk.utf-8 must be changed in any case,
otherwise the result looks like

http://www.knoppel.org/gtkbug/xmms4.png

This is very strange. To show extended Latin characters, the encoding of the XMMS translation must be 8-bit, or some strange implicit conversions of the encoding are taking place. Can you confirm that the encoding of the XMMS translation is UTF-8 and that the declaration in the header is indeed UTF-8?
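
Both checks can be done mechanically on the PO file. Here is a rough sketch with a hypothetical helper name: it looks for the charset= declaration of the PO header and, separately, tests whether the raw bytes are valid UTF-8.

```python
import re


def check_po_encoding(raw: bytes):
    """Return (declared charset or None, whether the bytes decode as UTF-8)."""
    match = re.search(rb"charset=([A-Za-z0-9_.:-]+)", raw)
    declared = match.group(1).decode("ascii") if match else None
    try:
        raw.decode("utf-8")
        is_valid_utf8 = True
    except UnicodeDecodeError:
        is_valid_utf8 = False
    return declared, is_valid_utf8


sample = (
    'msgid ""\n'
    'msgstr ""\n'
    '"Content-Type: text/plain; charset=UTF-8\\n"\n'
    "\n"
    'msgid "File"\n'
    'msgstr "Αρχείο"\n'
)
print(check_po_encoding(sample.encode("utf-8")))
print(check_po_encoding(sample.encode("iso-8859-7")))
```

A file that declares UTF-8 but fails the second check would explain output like the one in the last screenshot.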

Obviously the result is aesthetically much better. The question is whether it is
also technically correct. Why do the files in el_GR.UTF-8 exist and are empty? Does
this problem exist in other distributions too?
The el_GR.UTF-8 directory is not needed, because the en_US.UTF-8 directory is enough. The encoding is the same, UTF-8.
For 8-bit Greek (iso-8859-7) such a directory is needed, and it is of course available.
Below is a link with the changed files for anyone who wants to check
the changes and/or try them.

http://www.knoppel.org/gtkbug/gtkbug.tar.gz

Personally I prefer applications that use the UTF-8 encoding; other options add complexity that we do not need. If you are able to switch XMMS to one of the newer versions that are based on GTK2+, that is the best solution.

Also, it is a good idea to change the basic system font to DejaVu (DejaVu Sans), because it supports Greek and is hinted.

12Jun/060

Can you read Coptic?

Coptic is the most recent phase of ancient Egyptian. It is the direct descendant of the ancient language written in Egyptian hieroglyphic, hieratic, and demotic scripts. The Coptic alphabet is a slightly modified form of the Greek alphabet, with some letters (which vary from dialect to dialect) deriving from demotic. As a living language of daily conversation, Coptic flourished from ca. 200 to 1100. The last record of its being spoken was during the 17th century. Coptic survives today as the liturgical language of the Coptic Orthodox Church. Egyptian Arabic is the spoken and national language of Egypt today.

Source: Wikipedia on Coptic Language

Coptic, as used today, has signs of influence from the Greek language. If you speak Greek, you should be able to recognise every entry in the screenshot (it comes from the dictionary that is available from http://copticlang.bizhat.com/).

There is a Coptic Unicode block and there are at least three Unicode fonts available with Coptic glyphs.

I am not aware of a keyboard definition to write Unicode Coptic; Coptic uses several combining diacritical marks (accents) and appears to surpass even Ancient Greek/Polytonic in this respect. An easy way to create a layout (and one that is easy to write with?) would be to start from the Greek keyboard layout and replace the codepoints with the Coptic ones. For the 9 combining diacritical marks, three keys should be dedicated, each accessible by 1) pressing it as is, 2) pressing it with Shift, 3) pressing it with Alt. To avoid using dead keys, one would have to type the letter first and then the diacritical mark.

In modern Greek we use the ";:" key (to the right of L) to produce the acute and the diaeresis (with Shift) accents. The second suitable key could be the ' " key, and the third the "/?" key (debatable).
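
As a rough illustration of the "three keys, three levels" idea, here is a sketch that maps 3 keys x 3 modifier levels to 9 combining marks and types a mark after its letter. The nine marks are placeholders picked from the Combining Diacritical Marks block, not a vetted Coptic set, and the key choice simply follows the suggestion above.

```python
import unicodedata
from itertools import product

KEYS = [";", "'", "/"]              # the three suggested keys
LEVELS = ["plain", "shift", "alt"]  # pressed as is, with Shift, with Alt

# Placeholder marks (assumption): any nine combining characters would do here.
PLACEHOLDER_MARKS = [
    "\u0300", "\u0301", "\u0302",
    "\u0304", "\u0305", "\u0307",
    "\u0308", "\u0323", "\u0311",
]

LAYOUT = dict(zip(product(KEYS, LEVELS), PLACEHOLDER_MARKS))


def type_letter_then_mark(letter: str, key: str, level: str) -> str:
    """No dead keys: the base letter comes first, then the combining mark."""
    return letter + LAYOUT[(key, level)]


alfa = "\u2c81"                     # COPTIC SMALL LETTER ALFA
typed = type_letter_then_mark(alfa, "'", "shift")
print([unicodedata.name(ch) for ch in typed])
```

Because the marks are ordinary combining characters, the application only has to stack the mark on the preceding letter; no input-method state is needed.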

There are several efforts to convert the non-Unicode fonts distributed by the Coptic Church website. Moheb added the Coptic glyphs to the Freefonts. There is more work required to get them added by default to Linux distros. There is a discussion forum on Coptic.

Therefore, the most important task is to create a keyboard layout so that one can write in Unicode Coptic.

Then, existing (non-Unicode) text should be converted to Unicode Coptic so that there is material available. Moheb created support for this in iconv (glibc). There should be a bug report at http://sources.redhat.com/bugzilla/ under product glibc, component libc.

Source: Wikipedia (Coptic script)

There exist free Unicode fonts already to have the text displayed. The conversion of the Coptic Church fonts to Unicode would be beneficial as well. To have them included in Linux distros, the distribution license should be set to one of the FLOSS licenses. An option could be to add to the DejaVu fonts (allowed by the license) so that there is a general purpose open font that is easy to work with.

I, for one, would love to write Greek using a Coptic keyboard layout and a Coptic Unicode font. :)

Update: Screenshot that demonstrates how well Unicode Coptic fonts behave when combining marks are used.

Update #2: You can test the above on your system by opening this OpenDocument file using OpenOffice.org or any other OpenDocument-compatible application. OpenOffice.org was verified to display combining marks. Your mileage may vary; your comments will be appreciated.

Get Unicode fonts with Coptic coverage.

15May/060

Free Alaa!

http://static.flickr.com/56/142869951_0ce7433c56_o.gif

Alaa is a young, prominent Egyptian blogger who was arrested and jailed among 47 activists on 7th May 2006 during a peaceful demonstration in Cairo.
His personal website and blog, shared with his wife Manal, is http://www.manalaa.net/ and has the latest news about his condition.

There is a petition by Hands Across the Mideast Support Alliance (HAMSA) to free Alaa, which I copy:

Demand Egyptian Regime Release Alaa from Tora Prison

Alaa Abd El-Fatah is one of Egypt's most prominent bloggers and free speech advocates. He and his wife Manal run the popular blog BitBucket, which collects posts from dozens of Egyptian blogs and which won a "Best of the Blogs" award in December from Reporters Without Borders.

On Saturday (May 7), Alaa was arrested with a group of activists during a peaceful demonstration outside a Cairo courthouse. The rally denounced disciplinary hearings for two reform judges and arrests of protestors at previous demonstrations. Alaa and a group of other demonstrators were cornered by Egyptian police, and security agents then apparently handpicked individual protestors for arrest.

Alaa seems to have been targeted because of his high profile: he helps organizes the protests and spread the information through the blog aggregator he runs. He is now being held in notorious Tora Prison — and his arrest seems designed to both shut down his blog aggregator and scare other Egyptian bloggers. But you can send a message to the Egyptian government through the petition below (you can edit the petition text), which will generate an email to political leaders who can secure Alaa's release.

The petition will be sent to:

  • Egypt's Ambassador to the US Nabil Fahmy
  • Egyptian Prime Minister Ahmed Nazif
  • Egypt's Interior Minister Habib El Adly
  • US Ambassador to Egypt Francis Ricciardone
  • US Assistant Secretary of State David Welch

This campaign has been signed 1047[check page for latest figure] times. Click here to see who's signed.

Join the Campaign

Alaa is speaking (has the mic) at an event about Open-Source software for NGOs in Africa.

1May/060

How to write special characters in Xorg and GNOME

There is functionality in Xorg that allows you to type special characters without having to switch to a specific keyboard layout. To enable it,

  • Click System, Preferences, Keyboard.
  • Under Layout Options, expand on Compose key position.
  • Choose Right-Win key is compose, click Close.

Now you can type extended characters using the RightWin key (next to AltGr), according to this keyboard settings file. Specifically, the lines that start with GDK_Multi_key are those that we can use here. The Compose key is actually GDK_Multi_key in the above file.
Some examples,

  • RightWin + C + = produces €
  • RightWin + = + C produces €
  • RightWin + C + O produces ©
  • RightWin + O + C produces ©
  • RightWin + a + ' produces á
  • RightWin + a + " produces ä
  • RightWin + a + ` produces à
  • RightWin + a + ~ produces ã
  • RightWin + a + * produces å
  • RightWin + a + ^ produces â
  • RightWin + a + > produces â
  • RightWin + a + , produces ą
  • RightWin + e + - produces ē
  • RightWin + S + 1 produces ¹
  • RightWin + S + 2 produces ²
  • RightWin + S + 3 produces ³
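
The examples above amount to a lookup table from a pair of keys to a character. Here is a small sketch of that idea; the table only holds a few of the sequences listed above and does not parse the real Compose file.

```python
# A few (first key, second key) pairs pressed after the Compose key.
COMPOSE = {
    ("C", "="): "€", ("=", "C"): "€",
    ("C", "O"): "©", ("O", "C"): "©",
    ("a", "'"): "á", ("a", '"'): "ä", ("a", "`"): "à",
    ("a", "~"): "ã", ("a", "*"): "å", ("a", "^"): "â",
    ("S", "1"): "¹", ("S", "2"): "²", ("S", "3"): "³",
}


def compose(first: str, second: str):
    """Return the composed character, or None if the sequence is unknown."""
    return COMPOSE.get((first, second))


print(compose("C", "O"), compose("a", "~"), compose("S", "2"))
```

An unknown sequence returns None here; what X does with an unknown sequence is not modelled.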

For more tips, see EasyLinux - Ubuntu Dapper.
