Mi blog lah! Το ιστολόγιό μου

6 Dec 2007

OpenOffice Writer training notes (request: make training video plz!)

OpenOffice.org is one of the most important layers of the open-source stack. Although it does a superb job, we really need to make an effort to get more users working with it.

Here we present training notes for Writer, the word processor component of OpenOffice.org. We aim to make the best use of styles by creating well-structured documents. What we show here builds on the work of others, including the OpenOffice Linux.com articles by Bruce Byfield, the amazing OpenOffice.org documentation and the spot-on article by Christian Paratschek at osnews.com. In fact, the steps below more or less follow Christian's article.

When training in OpenOffice.org, it is important to create a fluid workflow that starts from the basics and increases gradually in complexity. It would be great if someone could turn the notes into a training video.

  1. We start off by running OpenOffice.org Writer. The default window appears. Compared with other word processors, in OOo we see the text boundary in the document (the dim rectangle that shows the area we can write in). We mention that we can show or hide it with View/Text boundaries.
  2. When creating a document, it is good to set properties such as Title and Subject. We do that from File/Properties/Description. It may look like too much effort now, but it will help us later wherever we want to write the document title or subject. Use "Using OpenOffice.org Writer" for the title and "How to write nice document in OpenOffice.org Writer" for the subject.
  3. Writer supports styles, which make life much easier. You have probably used styles before, for example Heading 1 and Heading 2 for headings so that you can easily create the Table of Contents. Writer has a Styles and Formatting window that is accessible from the icon/button near the File menu. The icon looks like a hand clicking on a 3x3 grid. You can also get the window from Format/Styles and Formatting, or by simply pressing F11. Once you do that, you get a floating window. You can dock it by dragging it to the right edge of the Writer window. If you are into 3D desktops, it may not be easy to dock (dragging automatically switches to another side of the desktop cube). In this case, use the key combination Ctrl-Shift-F10 to dock the Styles and Formatting window. It is good here to resize the document (that is, change the magnification) so that it appears centered with little empty space around it.
  4. Writer supports styles not only for Paragraphs (like Heading 1) but also for Pages. See the status bar at the bottom of the Writer window; it mentions Default, which is the default page style. When we write a document, it is good for the first page to have a distinct style that suits a first page. This includes making sure the second page appears empty, the page gets no page numbering and so on. On the Styles and Formatting dock we select the Page styles tab and double-click on the First Page style. This sets the current page to the First Page style, and we can verify this visually by looking at the status bar (now First Page instead of the old Default).
  5. We are not writing yet; let's create the subsequent pages first. To do so, we insert manual breaks in our document. Click on Insert/Manual Break... and select to insert a Page Break. As the style for the page after the break choose the Index page style, tick Change page number, and make sure the numbering starts from 1. Click OK. Proper documents start numbering from the Index page. The Index page is the page where we put the Table of Contents, Table of Figures and so on.
  6. Make sure the cursor is on the new page with the Index style. We need to create another page break, so that we can start writing the actual document. Click on Insert/Manual Break... and select a Page Break. As the style for the page after the break you can choose Default. Leave the page numbering settings as they are, because the numbering continues from the previous page. Click OK.
  7. Now, to view what we have achieved, let's go to Print Preview, and choose to see four pages at a time. We can see the first page, another page which is intentionally left blank, the Index page and the Default page. Close Print preview and return to the document.
  8. Now let's go back to the first page. We want to put the title on the first page. Nothing extravagant, at least not yet. We visit the Paragraph styles and find the Title style. While the cursor is at the start of the first page, we double-click on the Title style. The cursor moves to the center of the document and we can verify that the Title paragraph style has been applied; see to the right of the Styles and Formatting icon at the top-left of the Writer window. Shall we write the title of the document now? Not so fast. We can insert the title as a field, because we already wrote it in the properties in Step 2. Click Insert/Fields/Title.
  9. Now press Enter; the cursor moves down and the style automatically changes to Subtitle. Styles in OpenOffice.org allow you to choose a Next style (a follow-up style); in this case, when someone presses Enter in the Title style, they get a new paragraph in the Subtitle style. While in the paragraph with the Subtitle style, click on Insert/Fields/Subject. Fields in OpenOffice.org appear with a dark gray background; this does not appear in printing, it is just there to help you identify where the fields are.
  10. Now let's move to the last page, the page with the Default style, and write something. Select the Heading 1 paragraph style and type Introduction. Press Enter and you notice that the next style is Text body. Text body is the natural paragraph style for text in Writer (most documents use the Default paragraph style for body text, which is wrong). Now write something in Text body such as I love writing documents in OpenOffice.org Writer. Copy the line and paste it several times so that we get a nice paragraph of at least five lines. When pasting, make sure that each full stop is followed by a single space before the next sentence starts.
  11. Press Enter and now we are ready to add a new heading. Type Writing documents and set the Heading 1 paragraph style. Press Enter and fill up a paragraph with more of I love writing documents in OpenOffice.org Writer.
  12. Press Enter and create a new section (add a Heading 2, name it Writing documents in style and fill up a corresponding paragraph).
  13. Press Enter and create a last section (add a Heading 1, name it Conclusion, and fill up a corresponding paragraph).
  14. Now we are ready to place the cursor at the Index page we created before, and go for the Table of Contents. Click on Insert/Indexes and Tables/Indexes and Tables. The default index type is Table of Contents. We keep the default settings and click OK. We get a nice looking table of contents.
  15. At this stage we have a complete basic document, with first page, index page and default page.

The next set of steps adds more polish and extra elements to our document.

  1. The Text body style is left-aligned by default. Normally, one would select paragraphs and click on a paragraph alignment button on the toolbar to change the alignment. Because we are using styles, we can modify the Text body style to have another alignment, and presto, all text in that style across the document follows suit. In the Styles and Formatting dock, at the Paragraph styles tab, select the Text body style. Right-click on it and choose Modify. Find the Alignment tab and choose Justified as the new alignment for Text body paragraphs. Click OK and observe the document changing to the new configuration.
  2. It is nice to have section numbers on the headings, such as 2.1 Writing documents in style. To do this, we need to change the default outline numbering. Click on Tools/Outline numbering... and select to modify the numbering for all levels (under Level, click 1-10). Then, under the Numbering group, change the Number option from the default None to 1, 2, 3, .... Click OK and the numbering is changed in the document.
  3. Go back to the Table of Contents. You notice that the numbering format does not look nice; some section numbers are too close to the section names. To fix this, right-click on the gray area of the table of contents and select Edit Index/Table. In the new dialog box, select the Entries tab. Under Structure and Formatting you can see the structure of each line in the table of contents. The button labeled E# is the placeholder for the chapter number. After that there is a placeholder in which you can actually type text. In our case we simply click there and press the space bar to add another space. We then click the All button and finally click OK. Now all entries in the Table of Contents have a space between the chapter number and the chapter title.
  4. In order to add a footer with the current page number, click on Insert/Footer and pick Index, then Default. Both the Index and the Default page styles get to show page numbers. Then, place the cursor in the footer area and click Insert/Fields/Page Number. You can modify the Footer paragraph style so that the text alignment is centered. You have to insert the field in both an Index page and a Default page.
  5. The page number in the Index page is commonly shown in lowercase Roman numerals. How can we do that? We simply have to modify the Index page style accordingly; click on the Page styles tab in Styles and Formatting, click to modify the Index page style, and at the Page tab, under Layout Settings, select the i, ii, iii, ... format. Click OK.
  6. It would be nice to have the title in the header of each page, whether Index or Default. Click on Insert/Header and add a header for Index and Default. Then, place the cursor in the header for each style and add the Title field (Insert/Fields/Title). Would it be nice to put a line under the header? The header text has the Header paragraph style. In Styles and Formatting, click the Paragraph styles tab and select the Header paragraph style. Right-click and choose Modify. In the Borders tab enable a bottom line and click OK.

OpenOffice.org Writer in Style

You can download this sample document (.odt) from the link Using OpenOffice.org Writer.

I'll stop here for now. There is more to add, such as a Table of Figures, an Index of Tables and a Bibliography.
It would be good to leave feedback if there is interest in working in this direction.

Update 15Mar2008: This appears to be a Farsi translation/adaptation of the article.

3 Dec 2007

Take Back The Tech #2!

Last year we talked about Take Back The Tech, an initiative by the Association for Progressive Communications, Women’s Networking Support Programme (APC WNSP) that uses Information and Communication Technologies (ICT) to stop violence against women, and that took place between 25th November and 10th December. The same initiative runs this year during the same days (25th November to 10th December). At the time of writing, the campaign is at Day 8 of 16.

Violence Against Women (VAW) can also be perpetrated through the use of ICT (such as being a victim of targeted spyware or malicious online intimidation). Therefore, a better use of ICT (Take Back The Tech!) would help mitigate online-related VAW and reclaim the control of technology.

You can start your own campaign and join the existing ones that are in place. In Europe there are existing campaigns in the UK and Skopje.

Here is the announcement for this year:

***************************
ka-BLOG! TAKE BACK THE TECH!
www.takebackthetech.net
25 Nov to 10 Dec
***************************

ka-BLOG! Calling all bloggers to contaminate the blogosphere with
activism on VAW for 16 days.

ka-BLOG is a 16-day blog fest for the Take Back the Tech Campaign. It
is open to anyone and everyone - girls, boys, everyone beyond and more
-- who want to share their thoughts on violence against women, and how
online communications can exacerbate or help eliminate VAW.

We welcome bloggers in different languages!

ka-BLOG with us :)

For more information, go http://www.takebackthetech.net, or email jac
AT apcwomen DOT org

[FYI. In Filipino slang, "ka-BLOG" would mean someone you blog with.]

16 Nov 2007

Droid fonts from Google (Android SDK)

Update 10Feb2009: The Droid fonts are now available from android.git.kernel.org (Download tar.gz archive), under the Apache License, Version 2.0. Ascender (the company that created Droid) now has a dedicated website at http://www.droidfonts.com/ (thanks Rex!). On this website, Ascender presents the Droid Pro family, which has several additions to Droid. For the open-source crowd, it would be important to have the initial Droid font family dual-licensed under the “OpenFont License”, which would enable the best use with the rest of the OFL-licensed fonts.

Two years ago, Google bought a start-up called Android in order to deliver an open platform for mobile applications. A few days ago the Android SDK was released, and you can now develop Android applications that run in the emulator. Android handsets are expected at some point next year.

Even if you do not plan to develop applications for Android, you can still run the emulator which is functional, includes quite a few samples, and comes with a browser shown above. To get it, download the Android SDK for your system, uncompress it and run

./android_sdk_linux_m3-rc20a/tools/emulator

An interesting aspect of Android is that it comes with a set of fonts that have been specially designed for mobile devices, the Droid fonts. The fonts are embedded in the Android image, in android_sdk_linux_m3-rc20a/tools/lib/images/system.img. A clever guy managed to extract them, and a modest guy corrected me (see Damien's blog to download).

The fonts are probably licensed under the same license as the SDK (Apache License); however, it is better to hear from Google first.

In the meantime, here is a screenshot of Ubuntu 7.10 with Droid.

Update: To extract the fonts from the SDK, run the emulator with the -console parameter. The emulator starts and at the same time you get a shell to the filesystem of the running emulator. You can locate the fonts in system/fonts/. Once you have located the full path of a file, you can extract it with ./adb pull system/fonts/DroidSans.ttf /tmp/DroidSans.ttf (thanks cosmix for the tip).
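If you want to pull several font files in one go, the adb step can be scripted. The following is a minimal Python sketch, not something that ships with the SDK: the helper name adb_pull_commands is made up, and any file name other than DroidSans.ttf is an assumption that you would need to check in the emulator shell first. It only builds the command lines; nothing is executed.

```python
import shlex


def adb_pull_commands(font_names, remote_dir="system/fonts",
                      local_dir="/tmp", adb="./adb"):
    """Build one `adb pull` argument list per font file (nothing is run)."""
    return [
        [adb, "pull", f"{remote_dir}/{name}", f"{local_dir}/{name}"]
        for name in font_names
    ]


# DroidSans.ttf is the only file name taken from the post; add more names
# only after listing system/fonts/ in the emulator shell.
for command in adb_pull_commands(["DroidSans.ttf"]):
    print(shlex.join(command))
```

To actually run a command, pass each list to subprocess.run(command, check=True) from the tools directory of the SDK while the emulator is running.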

11 Nov 2007

Localisation issues in home directory folders (xdg-user-dirs)

In new distributions such as Ubuntu 7.10 there is now support for folder names of personal data in your local language. What this means is that ~/Desktop can now be called ~/Επιφάνεια εργασίας. You also get a few more default folders, including ~/Music, ~/Documents, ~/Pictures and so on.

This functionality of localised home folders has become available thanks to a new FreeDesktop standard, XDG-USER-DIRS. xdg-user-dirs can be localised, and the current localisations are available at xdg-user-dirs/po.

A potential issue arises when a user logs in with different locales; how does the system switch between the localised versions of the folder names? For GNOME there is a migration tool; as soon as you log in to your account with a different locale, the system asks whether you wish to switch the names from one language to the other. This is available through the xdg-user-dirs-gtk application.

Another issue affects users who use the command line quite often; switching between two keyboard languages (for those languages that use a script other than Latin) tends to become cumbersome, especially if you have not set up your shell for intelligent completion. In addition, when you connect remotely using SSH, you may not be able to type in the local language on the computer you connect from, which would make work very annoying.

Furthermore, there have been reports of KDE applications not working; if someone can file a bug report and post the link, it would be great. The impression I got was that some installations of KDE did not read filenames off the filesystem in UTF-8 but in a legacy 8-bit encoding. This requires further investigation.

Moreover, OpenOffice.org requires some integration work to follow the xdg-user-dirs standard; apparently it has its own option for the folder into which newly created files are saved. I believe this will be resolved in the near future.

Now, if we just installed Ubuntu 7.10 or Fedora 8, and we got, by default, localised subfolders in our home directory (which we may not prefer), what can we do to revert to non-localised folders?

The lazy way is to log out, choose an English locale as the default locale for the system and log in. You will be presented with the xdg-user-dirs-gtk migration tool (shown above), which will give you the option to switch to English folder names for those personal folders.

Clarification: This workaround (the log out and log in thing) implies that you then log out again, set the language back to the localised one (e.g. Greek) and log in. This time, when the system asks to rename the personal folders, you simply answer no, and you end up with a localised desktop but personal folders in English. Mission really accomplished.

If you are of the tinkering type, the files to change manually are

$ cat ~/.config/user-dirs.locale

el_GR

$

and

$ cat ~/.config/user-dirs.dirs

# This file is written by xdg-user-dirs-update
# If you want to change or add directories, just edit the line you're
# interested in. All local changes will be retained on the next run
# Format is XDG_xxx_DIR="$HOME/yyy", where yyy is a shell-escaped
# homedir-relative path, or XDG_xxx_DIR="/yyy", where /yyy is an
# absolute path. No other format is supported.
#
XDG_DESKTOP_DIR="$HOME/Επιφάνεια εργασίας"
XDG_DOWNLOAD_DIR="$HOME/Επιφάνεια εργασίας"
XDG_TEMPLATES_DIR="$HOME/Πρότυπα"
XDG_PUBLICSHARE_DIR="$HOME/δημόσιο"
XDG_DOCUMENTS_DIR="$HOME/Έγγραφα"
XDG_MUSIC_DIR="$HOME/Μουσική"
XDG_PICTURES_DIR="$HOME/Εικόνες"
XDG_VIDEOS_DIR="$HOME/Βίντεο"
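Before editing the file by hand, it can help to see which folders it currently points to. Here is a minimal Python sketch that reads the XDG_xxx_DIR="$HOME/yyy" format described in the file's own comments. The function name parse_user_dirs and the sample home directory are made up for illustration, and the sketch deliberately ignores shell escapes inside the value, so treat it as a rough reader rather than a full implementation of the format.

```python
import posixpath
import re

# One assignment per line, e.g. XDG_MUSIC_DIR="$HOME/Μουσική"
ASSIGNMENT = re.compile(r'^(XDG_[A-Z]+_DIR)="(.*)"$')


def parse_user_dirs(text, home):
    """Return {variable: absolute path}; comments and other lines are skipped."""
    dirs = {}
    for line in text.splitlines():
        match = ASSIGNMENT.match(line.strip())
        if not match:
            continue
        name, value = match.groups()
        if value.startswith("$HOME/"):
            # Home-relative form: expand $HOME ourselves.
            value = posixpath.join(home, value[len("$HOME/"):])
        dirs[name] = value
    return dirs


sample = '''# This file is written by xdg-user-dirs-update
XDG_DESKTOP_DIR="$HOME/Επιφάνεια εργασίας"
XDG_MUSIC_DIR="$HOME/Μουσική"
'''

for name, path in parse_user_dirs(sample, "/home/user").items():
    print(name, "->", path)
```

On a real system you would pass the contents of ~/.config/user-dirs.dirs and your own home directory instead of the sample text.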

Personally I believe that having localised names appear under the home folder is good for the majority of users, as they will be able to match what is shown in Locations with the actual names on the filesystem.

There will be cases where software has to be updated and bugs fixed (such as in backup tools). As we proceed with more advanced internationalisation/localisation support in Linux, it is desirable to keep moving forward and fix problematic software.

However, if enough popular support arises with clear arguments (I am referring to Greek-speaking users and a current discussion) for default folder names in English, we could follow the popular demand.

Also see the relevant blog post New Dirs in Gutsy: Documents, Music, Pictures, Blah, Blah by Moving to Freedom.

4 Nov 2007

StixFonts, finally available (beta)!

The STIX Fonts project (website) has been developing, for over 10 years, a font suitable for use in academic publications. It boasts support from Elsevier, IEEE and other academic publishers and associations.

A few days ago, they published a beta version of the font in an effort to get public feedback. The beta period runs until the 15th December.

STIX Fonts Beta showing Greek (Regular), from STIX Fonts Beta

STIX Fonts Beta currently supports modern Greek. An effort to get support for Greek Polytonic did not work out well a few years back.

STIX Fonts Beta showing Greek (Italic), from STIX Fonts Beta

The main benefit of STIX Fonts is the support for mathematical and other technical symbols. This helps when writing academic publications and other technical documents.

STIX Fonts Beta showing Greek (Bold), from STIX Fonts Beta

STIX Fonts have extensive support for mathematical symbols, including those that live in Unicode Plane 1.
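For readers who have not met Unicode planes before: the plane of a character is simply its code point divided by 0x10000, so Plane 1 covers U+10000 to U+1FFFF. The short Python sketch below illustrates this with a Greek letter and one of the mathematical alphanumeric symbols. The helper name plane is made up, and the sketch says nothing about which glyphs the STIX beta actually contains.

```python
import unicodedata


def plane(character):
    """Unicode plane number: code point divided by 0x10000."""
    return ord(character) >> 16


# A plain Greek letter, a common math operator, and a math alphanumeric symbol.
for character in ["α", "∑", "\U0001D400"]:
    print(f"U+{ord(character):04X} plane {plane(character)} "
          f"{unicodedata.name(character)}")
```

Plain Greek and the common operators sit in Plane 0, while the styled mathematical letters (bold, italic, script and so on) are the ones that live in Plane 1.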

STIX Fonts Beta showing Greek (Bold Italic), from STIX Fonts Beta

If there is any modification that we would like to see in the STIX fonts, we should ask for it now. Once they are released, they will be widely distributed. Fedora has already packaged STIX Fonts and made them available.

11 Aug 2007

Vote NO with comments (on DIS 29500 / OOXML)

  • Vote “No, with comments,” which is the JTC1-prescribed way of indicating “conditional approval” (JTC1 Directives (DOC, pops), Section 9.8)
  • Recommend that OOXML be resubmitted as normal working item in JTC1/SC34:
    • Split into a multi part standard: WordProcessingML, SpreadsheetML, DrawingML, Office Open Math Markup, VML, etc.
    • Have each part progress independently, at its own speed, within normal ISO processing stages
    • Encourage participation from OASIS to identify opportunities for harmonization with existing ISO 26300 “ODF”
  • OOXML, as the default format in MS Office, is important. But as a standard it is full of inconsistencies, omissions, inaccuracies and errors. No standard is perfect, but OOXML, in its current state, does not even meet the minimum requirements.

source: Rob Weir's presentation slides, last slide (pdf)

OOXML is being rushed to become an ISO standard using the fast-track process. This is not good. As end-users we want real commodity document formats that are easy to implement and do not tie us to a specific office suite. Sadly, the purpose of rushing to standardise OOXML is simply to avoid letting it become a commodity document format. By letting OOXML become an ISO standard as it is now, a few companies get to gain a lot, but we are going to lose.

Spread the word.

I copy below the voting country list.

According to Rob Weir, all countries can cast a vote on this; sorry for this misinformation.

The voting countries (Participating countries) are (the list is being updated, please see Participating countries for new list)

Brazil (ABNT)
Bulgaria (BDS)
China (SAC)
Colombia (ICONTEC)
Cyprus (CYS)
Czech Republic (CNI)
Côte-d'Ivoire (CODINORM)
Denmark (DS)
Finland (SFS)
France (AFNOR)
Germany (DIN)
India (BIS)
Italy (UNI)
Japan (JISC)
Kazakhstan (KAZMEMST)
Kenya (KEBS)
Korea, Republic of (KATS)
Netherlands (NEN)
Norway (SN)
Sweden (SIS)
Switzerland (SNV)
Thailand (TISI)
Trinidad and Tobago (TTBS)
Turkey (TSE)
USA (ANSI)
United Kingdom (BSI)

In addition, the following countries have observer status (Observer countries), (the list is being updated, please see Observer countries for new list)

Australia (SA)
Chile (INN)
Greece (ELOT)
Hong Kong, China (ITCHKSAR)
Hungary (MSZT)
Ireland (NSAI)
Israel (SII)
Lithuania (LST)
Mexico (DGN)
Romania (ASRO)
Spain (AENOR)
Sri Lanka (SLSI)
Ukraine (DSSU)

Although the observer countries cannot vote, they can submit comments.

8 Aug 2007

Cannot write Greek Polytonic in Linux

For up to date instructions for Greek and Greek Polytonic see How to type Greek, Greek Polytonic in Linux.

The following text is kept for historical purposes. Greek and Greek Polytonic now work in Linux, using the default Greek layout.

General Update: If you have Ubuntu 8.10, Fedora 10 or a similarly new distribution, then Greek Polytonic works out-of-the-box. Simply select the Greek Polytonic layout. For more information, see the recent Greek Polytonic post.

Update 3rd May 2008: If you have Ubuntu 8.04 (this probably applies to other recent Linux distributions as well), you simply need to add GTK_IM_MODULE=xim to /etc/environment. Start a Terminal (Applications/Accessories/Terminal) and type the following commands (the first makes a backup copy of the configuration file, and the second opens the configuration file with administrative privileges, so that you can edit and save it):

$ gksudo cp /etc/environment /etc/environment.ORIGINAL
$ gksudo gedit /etc/environment

then append

GTK_IM_MODULE=xim

save, and restart your computer. It should work now. Try to test with the standard Text editor, found in Accessories.

In Ubuntu 8.10 (autumn 2008), it should work out of the box, just by enabling the Greek Polytonic layout.

Update 20th June 2008: If some accents/breathings/aspirations still do not work, then this is probably related to your system locale (whether it is Greek or not). It works better when it is Greek. If you are affected and you do not use the Greek locale, there is one more thing to do.

$ gksudo cp /usr/share/X11/locale/en_US.UTF-8/Compose /usr/share/X11/locale/en_US.UTF-8/Compose.ORIGINAL
$ gksudo cp /usr/share/X11/locale/el_GR.UTF-8/Compose /usr/share/X11/locale/en_US.UTF-8/Compose

The first command makes a backup copy of your original en_US Compose file (assuming you run an English locale; if in doubt, read /usr/share/X11/locale/locale.dir). The second command copies the Greek compose file over the English one. You then log out and log in again.

End of updates

To write Greek Polytonic in Linux, a special file is used, which is called the compose file. There is a bit of a complication here, in the sense that the compose file depends on the current system locale.

To find out which compose file is active on your system, have a look at

/usr/share/X11/locale/compose.dir

Let's assume your system locale is en_US.UTF-8 (Start Applications/Accessories/Terminal and type locale).

In the compose.dir file it says

en_US.UTF-8/Compose: en_US.UTF-8

Note that the locale is the second field. If you have a different system locale, match on the second field. Many people make a mistake here. Actually, I think it would be faster for the system to locate the entry if the compose.dir file were sorted by locale.

Therefore, the compose file is

/usr/share/X11/locale/en_US.UTF-8/Compose
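The lookup described above (match your locale against the second field, then take the first field as the compose file) can be sketched in a few lines of Python. This is only an illustration: the function name compose_file_for is made up, the sample text contains just the two entries mentioned in this post, and the real compose.dir on your system has many more lines.

```python
import posixpath

LOCALE_ROOT = "/usr/share/X11/locale"


def compose_file_for(compose_dir_text, locale_name):
    """Return the compose file whose second field equals locale_name."""
    for line in compose_dir_text.splitlines():
        fields = line.split()
        # Skip blank lines, comments and anything that is not a field pair.
        if len(fields) != 2 or fields[0].startswith("#"):
            continue
        compose_path, listed_locale = fields[0].rstrip(":"), fields[1]
        if listed_locale == locale_name:
            return posixpath.join(LOCALE_ROOT, compose_path)
    return None


# Sample data only; read the real file from LOCALE_ROOT + "/compose.dir".
sample = """\
# compose file: locale
en_US.UTF-8/Compose: en_US.UTF-8
el_GR.UTF-8/Compose: el_GR.UTF-8
"""

print(compose_file_for(sample, "en_US.UTF-8"))
print(compose_file_for(sample, "el_GR.UTF-8"))
```

The match is on the second field only, which is exactly the point where it is easy to make a mistake when reading the file by eye.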

So, what's the problem then?

Well, for the Greek locale (el_GR.UTF-8) we have a different compose file, a compose file in which Greek Polytonic actually works ;-) .

Therefore, there are numerous workarounds here to get Greek Polytonic working.

For example,

  • If you speak modern Greek, you can install the Greek locale.
  • You can edit /usr/share/X11/locale/compose.dir so that for your locale, the compose file is the Greek one, /usr/share/X11/locale/el_GR.UTF-8/Compose.
  • You can open the Greek compose file, take its Greek Polytonic section and use it to update the Greek Polytonic section of en_US.UTF-8/Compose.
  • You can copy the Greek compose file into your home directory under the name .XCompose. (Not tested; you may also be affected by this bug.)

Of course the proper solution is to update en_US.UTF-8/Compose with the updated Greek Polytonic compose sequences. There is a tendency to add the compose sequences of all languages to en_US.UTF-8/Compose, and this actually is happening now. In this respect, it would make sense to rename en_US.UTF-8/Compose into something like general/Compose.

5 Aug 2007

Greek OLPC localisation status

The Greek OLPC localisation effort is ongoing and here is a report of the current status.

For discussions, reading discussion archives and commenting, please see the Greek OLPC Discussion Group.

We are localising two components: the UI (User Interface) and applications of the OLPC, and the main website at http://www.laptop.org/

The UI is currently being translated at the OLPC Wiki, at OLPC_Greece/Translation. On this page you can see the currently available packages, what is pending, and which pages you can help translate.

At this stage we need people with skills in music terminology to help out with the localisation of TamTam. In addition, there are more translations that need review and comments before they are sent upstream.

Moreover, if you find a typo or have a better suggestion for a term in the submitted translations, feel free to tell us at the Greek OLPC Discussion Group.

The other project we are working on is the Greek version of www.laptop.org. The pages are not 100% translated yet, so if you want to finish the difficult parts, see the Web translation page of laptop.org.

The translators that helped up to now have done an amazing job.

23 Jul 2007

Important MO file optimisation for en_* locales, and partly others

During GUADEC, Tomas Frydrych gave a talk on exmap-console, a cut-down version of exmap that can work well on mobile devices.

During the presentation, Tomas showed how to use the tool to find the culprits in memory (ab)use on the GNOME desktop. One issue that came up was that MO files were taking up memory even though the desktop was showing English. Why would the MO translation files loaded in memory be so big?

gtk20.mo                  : VM   61440  B, M   61440  B, S   61440  B
atk10.mo                  : VM    8192  B, M    8192  B, S    8192  B
libgnome-2.0.mo           : VM   28672  B, M   24576  B, S   24576  B
glib20.mo                 : VM   20480  B, M   16384  B, S   16384  B
gtk20-properties.mo       : VM     128 KB, M     116 KB, S     116 KB
launchpad-integration.mo  : VM    4096  B, M    4096  B, S    4096  B

A translation file looks like

msgid "File"
msgstr ""

When translated to Greek it is

msgid "File"
msgstr "Αρχείο"

In the English UK translation it would be

msgid "File"
msgstr "File"

This is actually not necessary, because if you leave those messages untranslated, the system will use the original messages that are embedded in the executable file.

However, for the purposes of the English UK, English Canadian, etc. teams, it makes sense to copy the same messages into the translated field, because this indicates that the message was examined by the translator. Any new messages would appear as untranslated and the same process would continue.

Now, the problem is that the gettext tools are not smart enough when they compile such translation files; they needlessly replicate those messages, which occupy space in the generated MO file.

Apart from the English variants, this issue is also present in other languages when the message looks like

msgid "GConf"
msgstr "GConf"

Here, it does not make much sense to translate the message into the locale language. However, the generated MO file now carries more than 10 extra bytes (5+5), plus some space for the index.

Therefore, what's the solution for this issue?

One solution is to add an option to msgattrib to preprocess a PO file and remove those unneeded copies. Here is a patch:

--- src.ORIGINAL/msgattrib.c 2007-07-18 17:17:08.000000000 +0100
+++ src/msgattrib.c 2007-07-23 01:20:35.000000000 +0100
@@ -61,7 +61,8 @@
   REMOVE_FUZZY = 1 << 2,
   REMOVE_NONFUZZY = 1 << 3,
   REMOVE_OBSOLETE = 1 << 4,
-  REMOVE_NONOBSOLETE = 1 << 5
+  REMOVE_NONOBSOLETE = 1 << 5,
+  REMOVE_COPIED = 1 << 6
 };
 static int to_remove;

@@ -90,6 +91,7 @@
   { "help", no_argument, NULL, 'h' },
   { "ignore-file", required_argument, NULL, CHAR_MAX + 15 },
   { "indent", no_argument, NULL, 'i' },
+  { "no-copied", no_argument, NULL, CHAR_MAX + 19 },
   { "no-escape", no_argument, NULL, 'e' },
   { "no-fuzzy", no_argument, NULL, CHAR_MAX + 3 },
   { "no-location", no_argument, &line_comment, 0 },
@@ -314,6 +316,10 @@
         to_change |= REMOVE_PREV;
         break;

+      case CHAR_MAX + 19: /* --no-copied */
+        to_remove |= REMOVE_COPIED;
+        break;
+
       default:
         usage (EXIT_FAILURE);
         /* NOTREACHED */
@@ -436,6 +442,8 @@
 --no-obsolete remove obsolete #~ messages\n"));
       printf (_("\
 --only-obsolete keep obsolete #~ messages\n"));
+      printf (_("\
+--no-copied remove copied messages\n"));
       printf ("\n");
       printf (_("\
 Attribute manipulation:\n"));
@@ -536,6 +544,21 @@
           : to_remove & REMOVE_NONOBSOLETE))
     return false;

+  if (to_remove & REMOVE_COPIED)
+    {
+      if (!strcmp(mp->msgid, mp->msgstr) && strlen(mp->msgstr)+1 >= mp->msgstr_len)
+        {
+          return false;
+        }
+      else if ( strlen(mp->msgstr)+1 < mp->msgstr_len )
+        {
+          if ( !strcmp(mp->msgstr + strlen(mp->msgstr)+1, mp->msgid_plural) )
+            {
+              return false;
+            }
+        }
+    }
+
   return true;
 }

However, if we only change msgattrib, we would need to adapt the build system for all packages.

It would arguably make more sense to change the default behaviour of msgfmt, the program that compiles PO files into MO files.
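To make the filtering rule concrete, here is a rough Python sketch of what removing copied messages means for the simplest kind of entry. It is not the gettext implementation and it is much less capable than the msgattrib patch: the function name strip_copied is made up, and it only recognises entries that consist of exactly one single-line msgid and one single-line msgstr. Entries with comments, multi-line strings or plural forms are left untouched.

```python
import re

# Matches only the simplest kind of entry: one-line msgid, one-line msgstr.
SIMPLE_ENTRY = re.compile(r'msgid "(.*)"\nmsgstr "(.*)"')


def strip_copied(po_text):
    """Drop simple entries whose translation is a verbatim copy of the msgid."""
    kept = []
    for block in po_text.strip().split("\n\n"):
        block = block.strip()
        match = SIMPLE_ENTRY.fullmatch(block)
        if match and match.group(1) and match.group(1) == match.group(2):
            continue  # copied message: the original string is used anyway
        kept.append(block)
    return "\n\n".join(kept) + "\n"


sample = '''msgid "File"
msgstr "File"

msgid "Color"
msgstr "Colour"

msgid "GConf"
msgstr "GConf"
'''

print(strip_copied(sample))
```

For the sample above only the Color/Colour entry survives, because it is the only one where the translation differs from the original message.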

An e-mail was sent to the email address for the development team of gettext regarding the issue. The development team does not appear to have a Bugzilla to record these issues. If you know of an alternative contact point, please notify me.

Update #1 (23Jul07): As an indication of the file size savings, the en_GB locale on the Ubuntu installation CD occupies about 424KB, where in practice 48KB would have been enough.

A full installation of Ubuntu with some basic KDE packages (installed only for the basic libraries, i.e. for KBabel; ls k* | wc -l = 499) occupies about 26MB of space just for the translation files. After stripping the copied messages from the MO files, the translation files occupy only 7MB. This is quite important because when someone installs, for example, the en_CA locale, all en_?? locales are added.

The reason why the reduction is not even greater has to do with the message types that KDE uses. For example,

msgid ""
"_: Unknown State\n"
"Unknown"
msgstr "Unknown"

I cannot see a portable way to code the gettext-tools so that they understand that the above message can be safely omitted. Of the 7MB that remain after the reduction, KDE applications (k*) occupy 3.6MB. The non-KDE applications include GNOME, XFCE and the traditional GNU tools. The biggest culprits in KDE are kstars (386KB) and kgeography (345KB).
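For what it is worth, a non-portable check for exactly this KDE convention is easy to sketch. It assumes the context always has the form "_: context\n" at the start of the msgid, as in the example above; both function names are hypothetical and nothing like this exists in gettext-tools.

```c
#include <string.h>

/* Hypothetical helper: skip a leading KDE-style "_: context\n"
   prefix and return the text that is actually shown to the user.
   A msgid without such a prefix is returned unchanged. */
const char *strip_kde_context(const char *msgid)
{
    if (strncmp(msgid, "_: ", 3) == 0) {
        const char *newline = strchr(msgid, '\n');
        if (newline != NULL)
            return newline + 1;     /* text after the context line */
    }
    return msgid;
}

/* Returns 1 when msgstr only repeats the visible part of msgid. */
int kde_is_copied(const char *msgid, const char *msgstr)
{
    return strcmp(strip_kde_context(msgid), msgstr) == 0;
}
```

For the message above, kde_is_copied("_: Unknown State\nUnknown", "Unknown") returns 1. The catch is that this hard-codes a convention of one project, which is why it does not belong in a general tool.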

Update #2 (23Jul07): (Thanks Deniz for the comment below on gweather!) The po-locations translations (gnome-applets/gweather) of all languages are combined together to generate a big XML file that can be found at usr/share/gnome-applets/gweather/Locations.xml (~15MB).

This file is not kept in memory while the gweather applet is running.
However, the file is parsed when the user opens the properties dialog to change the location.
I would say that the main problem here is the file size (15.8MB), which can be easily reduced by stripping copied messages. This file is included in any Linux distribution, whatever the locale.

The po-locations directory currently occupies 107MB, and when copied messages are eliminated it occupies 78MB (a difference of roughly 29MB). The generated XML file is in any case smaller (15.8MB without optimisation) because it does not repeat the msgid lines for each language.

I regenerated the Locations.xml file with the optimised PO files and the resulting file is 7.6MB. This is a good reduction in file space and also in packaging size.
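To put the figures quoted in these updates side by side, the relative saving can be worked out with a one-line helper. The function name is made up, and the inputs are simply the sizes mentioned above.

```c
/* Hypothetical helper: percentage of space saved when a file or
   directory shrinks from 'before' to 'after' (same unit for both).
   'before' must be greater than zero. */
double percent_saved(double before, double after)
{
    return (before - after) / before * 100.0;
}
```

Fed with the numbers above, this gives about 88.7% for the en_GB locale (424KB to 48KB), about 73.1% for all the translation files (26MB to 7MB), about 51.9% for Locations.xml (15.8MB to 7.6MB) and about 27.1% for the po-locations directory (107MB to 78MB).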

Update #3 (25Jul07): Posted a patch for gettext-tools/msgattrib.c. Sent an e-mail to the kde-i18n-doc mailing list and got a good response, including a valid argument regarding the proposed changes. Specifically, there is a problem case when one gives custom values to the LANGUAGE variable. This happens when someone sets LANGUAGE to a value such as "es:fr", which means "show me messages in Spanish, and if something is untranslated show it in French". If a message has msgid==msgstr for Spanish but not for French, then it would show in French if we go along with the proposed optimisation.
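The LANGUAGE problem can be reproduced with a toy model. This is not how libintl is implemented; it only assumes what is described above, namely that catalogs are tried in the order given and the first one that contains the message wins. All names and the sample data are invented for the illustration.

```c
#include <stddef.h>
#include <string.h>

/* Toy model of a compiled catalog: a list of msgid/msgstr pairs. */
struct entry { const char *msgid; const char *msgstr; };
struct catalog { const char *lang; const struct entry *entries; size_t count; };

/* Return the translation stored in one catalog, or NULL if absent. */
static const char *find(const struct catalog *cat, const char *msgid)
{
    size_t i;
    for (i = 0; i < cat->count; i++)
        if (strcmp(cat->entries[i].msgid, msgid) == 0)
            return cat->entries[i].msgstr;
    return NULL;
}

/* Try the catalogs in LANGUAGE order; fall back to the msgid itself. */
const char *lookup(const struct catalog *chain, size_t n, const char *msgid)
{
    size_t i;
    for (i = 0; i < n; i++) {
        const char *hit = find(&chain[i], msgid);
        if (hit != NULL)
            return hit;
    }
    return msgid;
}

/* Sample data: "Error" is spelled the same in English and Spanish. */
const struct entry es_full[]     = { { "Error", "Error" }, { "File", "Archivo" } };
const struct entry es_stripped[] = { { "File", "Archivo" } };  /* copy removed */
const struct entry fr[]          = { { "Error", "Erreur" }, { "File", "Fichier" } };

/* LANGUAGE=es:fr, before and after stripping copied messages. */
const struct catalog chain_full[]     = { { "es", es_full, 2 },     { "fr", fr, 2 } };
const struct catalog chain_stripped[] = { { "es", es_stripped, 1 }, { "fr", fr, 2 } };
```

Here lookup(chain_full, 2, "Error") gives "Error", but lookup(chain_stripped, 2, "Error") gives "Erreur": the Spanish user suddenly sees French, which is exactly the objection raised on the list.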

16Jul/070

GUADEC Day #2

(see http://www.guadec.org/schedule/warmup)

At the first presentation, Quim Gil talked about GNOME marketing: what has been done and what the goal of marketing is. He showed a focused mind on the important marketing tasks; it is easy to get carried away and not be effective, a mistake that happens in several projects.

The next session was by Tomas Frydrych (OpenedHand - I have their sticker on my laptop!) on memory use in GNOME applications. Many people complain that XYZ is bloated. However, this does not convey what exactly happens; pretty useless. In addition, the common tools that show memory use do not show the proper picture because of the memory management techniques. That is, due to shared libraries, the total memory occupied by an application appears very big. A tool examined is exmap. This tool uses a kernel module to show the memory use of applications, reading the data through /proc. It takes a snapshot of memory use; it's not real-time info. It comes with a GTK+ front-end (gexmap) that requires a big screen (oops, PDAs), so it is not suitable for internet tablets and other low-spec devices. Therefore, they came up with exmap-console, which addresses these shortcomings. It has a console interface based on the readline library.

Here are the rest of my notes. Hope they make sense to you.

. exmap --interactive
. ?: help
. Head: quite useful (dynamic allocation)
. Mapped:
. Sole use: memory that app is using on its own (rss?)
. "sort vm"
. "print" or "p"
. "add nautilus"
. "clear"
. "detail file" (what executables/libs loaded and how much consume)
. "detail none"

Sole use
. valgrind, to analyse Sole Use memory?
. "detail ????"

Lots of small libraries: overhead

Looking ahead
. Pagemap: by Matt Mackall
. http://projects.o-hand.com/exmap-console/

Python
. Sole use: ~18MB ;-(

Tomas was apparently running Ubuntu with the English UK locale. The English UK translation team is doing an amazing job at the translation stats. Actually, most messages are copied, however with a script one can pick up words such as organization and change to organisation. The problem here is that, for example, the GAIM mo file is 215KB (?), however for the British English translation the actual changes should be less than 2-3KB. Messages that are missing from a translation mean that the original US English messages will be used. I'll have to find how to use msgfilter to make messages untranslated if msgid == msgstr. Where is Danilo?

After lunch time (did not go for lunch), I went to the Accerciser session. Pretty cool tool, something I have been looking for. Accerciser uses the accessibility framework of GNOME in order to inspect the windows of running applications and look into their properties. A good use is to identify whether elements such as text boxes come with description labels; these are important for accessibility purposes, because a person who depends on a screen reader (text to speech) needs them to make sense of the contents of windows.

The next session was GNOME accessibility for blind people. Jan Buchal gave an excellent presentation.

My notes,

. is from the Czech Republic, is blind himself. has been using computers for 20+ years

. from user perspective
. users, regular and irregular ;-)
. software
. firefox 3.0beta - ok for accessibility other versions no
. gaim messenger ok
. openoffice.org ok but did not try
. orca screenreader ^^^ works ok.
. generally ready for prime time
. ubuntu guy for accessibility was there
. made joke about not having/needing display slides ;-]
. synthesizer: festival, espeak, etc - can choose
. availability of voices
. javascript: not good for accessibility
. links/w3m: just fine!
. firefox3 makes accessibility now possible.
. web designer education, things like title="", alt="" for images.
. OOo, not installed but should work, ooo-gnome
. "Brailcom" company name
. "speech dispatcher"
. logical events
. have short sound event instead of "button", "input form"
. another special sound for emacs prompt, etc.
. uses emacs
. have all events spoken, such as application crashing.
. problems of accessibility
. not money main factor, but still exists.
. standard developers do not use accessibility functions
. "accessor" talk, can help
. small developer group on accessibility, may not cooperate well
. non-regular users (such as blind musician)
. musicians
. project "singing computer"
. gtk, did not have good infrastructure
. used lilypond (music typesetter, good but not simple to use)
. singing mode in festival
. use emacs with special mode to write music scores (?)
. write music score and have the computer sing it (this is not "caruso")
. gnome interface for lilypond would be interesting
. chemistry for blind
. gtk+
. considering it
. must also work, unfortunately, on windows
. gtk+ for windows, not so good for accessibility
. conclusion: free accessibility
. need users so that applications can be improved
. have festival synthesizer, not perfect but usable
. many languages, hindi, finnish, afrikaans
. Edinburgh project, to reimplement festival better
. proprietary software is a disadvantage
. q: how do you learn to use new software?
. a: has been a computer user for 20+ years, is not good candidate to say
. a: if you are dedicated, you can bypass hurdles, old lady emacs/festival/lilypond
. Brailcom, not for end-users(?)
. developer problem?
. generally there is lack of documentation; easy to teach what a developer needs to know
. so that the application is accessible
. HIG Human Interface Guidelines, accessible to the developers
. "speakup" project
. Willy, from Sun Microsystems, working on accessibility for 20+ years, Lead of Orca.
. developers: feel accessibility is a hindrance to development
. in practice the gap is not huge
. get tools (glade) and gtk+ to come with accessibility on by default
. accessibility
. is not only for people with disabilities
. can do amazing things like 3d interfaces something

These summaries are a good example of the rule that during a presentation, participants tend to remember only about 8% of the material. In some cases, even less is recollected.

16Jul/070

GUADEC Day #1

I am writing this in the morning of the second day (posted at the end of the second day). Just had breakfast and there is a bit of time before making it to the conference venue.

Yesterday, Sunday, was the first of the two warm-up days for the GUADEC conference. Registration started at 11am. I was at the front of the queue and got my badge quickly, then picked up the bag with the goodies: three cool t-shirts, a copy of Ubuntu 7.04, Fedora 7 Live, Linux stickers, two Linux pens and a mini Google Code notebook (no, not that type of notebook; just the paper-based thing).

During registration I met up with Dimitrios Glezos (of Greek Fedora fame) and a bit later with Dimitrios Typaldos. It was the first time I met both of them in person.

Between a choice of two sessions I went to the one on X.org developments (XDamage, xrender, etc extensions and how to use them). Ryan Lortie gave the presentation.

Next was lunch time, and Dimitrios T. recommended a pub for traditional English food and drink. Sayamindu came along.

The next session I went to was on the Hildon desktop, which is what we used to call Maemo: GNOME for internet tablets such as the Nokia 770 and Nokia N800. There are special technical issues to solve. Lucas Rocha mentioned refactoring issues with the source code. In addition, as far as I understood, there is an issue with the internationalisation support for the platform.

Next, Don Scorgie talked about the GNOME documentation project. Several things can be improved and one of them is the introduction of a simplified XML schema for the needs of GNOME documentation. When compared to DocBook XML, the new GNOME documentation schema has only 6 elements (or do they call them tags?). In addition to this, there is a documentation editor with a special rich-edit widget for this schema. Mallard is a type of duck(?).

I also attended the last 10 minutes of the presentation on project Jackfield (sadly, there is no special connection between the name Jackfield and what the project is about). Jackfield is apparently a way to run Javascript scripts on the desktop. OS/X is supposed to have it, and there are already scripts available. With Jackfield, you can run those scripts unmodified on Linux. The demos were really impressive.

The final session for the day was a presentation by Richard Rothwell on free software for the socially excluded. No, you do not have to go to Africa for this. His work relates to families in Nottingham, UK. It reminds me of the situation and effort in Farkadona, Greece, that was described by Kostas Boukouvalas. I think it would have been helpful if Kostas Boukouvalas could have attended this. Richard is running a 3-year project that provides a number of PCs (in the hundreds?) with Linux to socially excluded families. Even in the UK, funding is hard to come by.

6Jul/070

Google Groups: Member Invite Request Approved

When creating a Google Group, you have the option of auto-subscribing a list of e-mail addresses. That is, the owner of the e-mail address does not have to perform the subscription task. To avoid the apparent spamming opportunity, Google Groups puts a human to review those requests. After you paste the e-mail addresses, you press Submit and then get a text box where you can write a message to help this person decide.

While filling in such a request, I made a gross mistake and added 140 more e-mail addresses than I should have. In the text box I wrote in capitals, PLEASE CANCEL THIS REQUEST, MISTAKE.

Just now I got a reply, and that request got approved. On the positive side, the auto-subscription request was thankfully converted to a notification request, so all these people received a request to join the group.

Thank you all for not complaining!

p.s.

My regular blog is offline for a few days so I am using this one for now.
