Archive for January, 2008

After all, what is CMS?

01/29/2008

In this edition, we wrap up our series on website localization with a discussion of Content Management Systems, better known as CMS. Before we go into the localization process itself, allow me to provide a brief introduction to how this type of system works.

A CMS is a set of tools installed on a server that work together to assist with the administration, modification, publication and maintenance of online content, from simple websites to the most complex portals.

CMS may contain several resources and features, including the following:

Administration: the part of the system where you can enter, modify or delete content, install or remove additional modules and change the website settings, such as the title, date format, currency, language, etc.

Additional Modules: several CMS tools offer the possibility of having modules installed on the main system to enhance the website functionality, thus turning it into a true portal. Some of the most common examples of add-on modules include surveys, forums, private messaging, chat, calendar/agenda and others.

Templates: these define how the content will be displayed on the browser, including its formatting. Templates are HTML pages with placeholders that will be replaced with the text from the page the user requested from the server. A well-designed template should not contain text, only the page structure and formatting.

Database: the part of the CMS that stores the texts for each page, the very content of the website.

Engine: the CMS component that receives the pages requested by the user via browser. These requests trigger a process in which texts, templates and website settings are selected and the HTML code is assembled. This HTML code is then sent to the browser so that the user can view the requested page in the proper format.
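
As a rough illustration of what the engine does, the sketch below fills the placeholders of a template with the texts loaded for the requested page. The {{name}} placeholder syntax, the renderPage function and the sample data are assumptions made for this example; each CMS defines its own template format, and a real engine would also escape the text for HTML.

```javascript
// Hypothetical template: structure and formatting only, no hardcoded text.
const template =
  "<html><head><title>{{title}}</title></head>" +
  "<body><h1>{{title}}</h1><h2>{{subtitle}}</h2></body></html>";

// Hypothetical texts, as the engine might load them from the database.
const pageTexts = {
  title: "Welcome to Our Website!",
  subtitle: "Cosmetic Product Line",
};

// Replace every {{name}} placeholder with the matching text.
// Unknown placeholders are left untouched so they are easy to spot.
function renderPage(tpl, texts) {
  return tpl.replace(/\{\{(\w+)\}\}/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(texts, name) ? texts[name] : match
  );
}

console.log(renderPage(template, pageTexts));
```

Because the template holds no text of its own, the same template can serve every language; only the texts passed to it change.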

There are several CMS tools available, yet only a few offer multi-language support. Before a CMS can be considered ready for localization, it must allow for proper identification of the language of the text to be entered in each page.

Let’s see how the differences between the CMS models affect website localization:

Localization-Ready CMS

This means that the system is prepared to deal with multiple languages for each page. In this case, the same body text, title and subtitle of a page will have several entries in the database, one entry for each registered language. See the example in the database table below:

text_table

index  page_num  category  text                          language
1      1         title     Welcome to Our Website!       en_US
2      1         title     Bem vindo ao nosso site!      pt_BR
3      1         subtitle  Cosmetic Product Line         en_US
4      1         subtitle  Linha de produtos cosméticos  pt_BR
5      2         title     Skin Care                     en_US
6      2         title     Cuidados com a pele           pt_BR

To translate the content of this type of website, it is necessary to extract the texts together with the page identifier (page_num), category and language into a translation-friendly file format. Keeping these identifiers with the text allows the localizer to reinsert it into the database after it has been translated into the target language.
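
The extraction step described above can be sketched as follows. This is only an illustration: the rows are a hypothetical in-memory copy of the text_table example, and the extractForTranslation function and the source/target field names are assumptions, not part of any particular CMS.

```javascript
// Sample rows mirroring the text_table example (hypothetical in-memory copy).
const textTable = [
  { index: 1, page_num: 1, category: "title", text: "Welcome to Our Website!", language: "en_US" },
  { index: 2, page_num: 1, category: "title", text: "Bem vindo ao nosso site!", language: "pt_BR" },
  { index: 3, page_num: 1, category: "subtitle", text: "Cosmetic Product Line", language: "en_US" },
  { index: 4, page_num: 1, category: "subtitle", text: "Linha de produtos cosméticos", language: "pt_BR" },
];

// Keep only the source-language rows and carry the identifiers along,
// so each translation can later be matched to its page and category.
function extractForTranslation(rows, sourceLanguage) {
  return rows
    .filter((row) => row.language === sourceLanguage)
    .map((row) => ({
      page_num: row.page_num,
      category: row.category,
      source: row.text,
      target: "", // to be filled in by the translator
    }));
}

console.log(JSON.stringify(extractForTranslation(textTable, "en_US"), null, 2));
```

The resulting list can then be written to whatever file format the translation tool accepts.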

In such cases, SQL is normally used to extract and reinsert the text. When the text is inserted, the new database entries carry the code of the new language in the “language” field, just as the Brazilian Portuguese entries in this example carry pt_BR.

The example below demonstrates how to insert text in Argentine Spanish (es_AR) into page 1:


SQL> INSERT INTO text_table (page_num, category, text, language)
VALUES (1, 'title', 'Bienvenidos a nuestro sitio Web!', 'es_AR');

SQL> INSERT INTO text_table (page_num, category, text, language)
VALUES (1, 'subtitle', 'Línea de productos cosméticos', 'es_AR');
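
When there are many texts to reinsert, statements like the ones above are usually generated rather than typed by hand. The sketch below is one possible way to do that; the buildInsert and sqlQuote helpers and the shape of the translated units are assumptions made for illustration. It doubles single quotes, which is the standard SQL way to escape them inside a string literal, but where a database driver is available, parameterized queries are the safer choice.

```javascript
// Escape a value for use inside a single-quoted SQL string literal
// by doubling any single quote it contains.
function sqlQuote(value) {
  return "'" + String(value).replace(/'/g, "''") + "'";
}

// Build one INSERT statement for a translated text.
// Table and column names follow the text_table example.
function buildInsert(unit, targetLanguage) {
  return (
    "INSERT INTO text_table (page_num, category, text, language) VALUES (" +
    Number(unit.page_num) + ", " +
    sqlQuote(unit.category) + ", " +
    sqlQuote(unit.target) + ", " +
    sqlQuote(targetLanguage) + ");"
  );
}

// Hypothetical translated units, as returned by the translator.
const translated = [
  { page_num: 1, category: "title", target: "Bienvenidos a nuestro sitio Web!" },
  { page_num: 1, category: "subtitle", target: "Línea de productos cosméticos" },
];

translated.forEach((unit) => console.log(buildInsert(unit, "es_AR")));
```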

CMS Not Ready for Localization

Although it takes more engineering time to successfully complete extraction and insertion, systems that are not developed for this purpose can still be localized.

The engineering team of the localization company should contact the client company’s webmaster or website developers to analyze how the CMS operates. They will then be able to evaluate the best method for extracting the texts and reinserting the translations.

In this case, although extraction can be automated, the translated text must be reinserted manually through the CMS’s own interface. This is because the system cannot detect that the text being entered is the translation of an existing page into another language. As a result, it is necessary to generate a new page ID for each localized page.

This insertion procedure can also cause other problems such as the need to create a new menu that points to the new IDs of the translated pages. See the difference between the two types of CMS in the tables below:

Localization-Ready CMS:

en_US menu              link to page     pt_BR menu                      link to page
Cosmetic Product Line                    Linha de produtos cosméticos
- Skin Care             2 + en           - Cuidados com a pele           2 + pt-br
- Bath Accessories      3 + en           - Acessórios para o banho       3 + pt-br

CMS not ready for Localization:

en_US menu              link to page     pt_BR menu                      link to page
Cosmetic Product Line                    Linha de produtos cosméticos
- Skin Care             2                - Cuidados com a pele           27
- Bath Accessories      3                - Acessórios para o banho       28
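
The practical difference between the two tables can be sketched in code. In the example below, the link format ("page=...&lang=...") and the function names are assumptions made for illustration; only the page IDs come from the tables above.

```javascript
// Localization-ready CMS: the page ID is shared, only the language changes.
function linkForReadyCms(pageId, language) {
  return "page=" + pageId + "&lang=" + language;
}

// CMS not ready for localization: every translated page has its own ID,
// so a mapping from source ID to translated ID must be maintained by hand.
const ptBrPageIds = { 2: 27, 3: 28 }; // IDs taken from the table above

function linkForNonReadyCms(pageId, idMap) {
  const translatedId = idMap[pageId];
  if (translatedId === undefined) {
    throw new Error("No translated page registered for page " + pageId);
  }
  return "page=" + translatedId;
}

console.log(linkForReadyCms(2, "pt_BR"));        // page=2&lang=pt_BR
console.log(linkForNonReadyCms(2, ptBrPageIds)); // page=27
```

In the second case, every new page and every new language adds entries to a mapping that someone has to keep up to date, which is where the extra menu work comes from.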

Regardless of whether the CMS has been developed for localization or not, if it is not possible to have direct access to the database table where the texts are stored, the extraction process may become more complicated. In the worst-case scenario, extraction would have to be done manually, adding to the effort and increasing the time required to complete the process.

Of the systems that do not allow direct access to the database, very few have their own tool for exporting text to a file (usually with an .xml or .csv extension). Moreover, even those CMSs that do have extraction tools rarely offer a means of importing the translated content. The absence of either an extraction tool or a tool for re-entering text makes the localization process equally difficult.

At any rate, website localization with CMS tools varies from case to case, and the system characteristics must always be evaluated before determining the costs and the time required for localization. If you plan to create a website using CMS tools, please bear in mind the issues described above and feel free to contact Ccaps so that we can help you evaluate the best way to get your project started.

Ricardo Junior

Can you write code?

01/23/2008

Continuing with our tips on the types of website programming and how to deal with each of them for later localization, we will now examine what can be done and how to do it.

No tool “understands” all the existing client-side and server-side programming code and makes available to translators only what really needs to be translated.

Some programs try to make this task easier by highlighting what needs to be translated according to the patterns of each programming language. However, programmers use a great variety of techniques and methods, so these tools may still mistakenly include code in the translatable text. During translation, removing or adding quotation marks, periods, semicolons or any other character used by the programming language may prevent the code from working.

To avoid accidents in the programming code, files are usually prepared for translation in advance. The localization engineer (who must be well versed in both the programming languages and the translation process) then reviews the prepared files and makes the necessary adjustments to ensure that only what needs to be translated is available.

This saves time and reduces code debugging later: if something does break, the problem is easier to detect and fix.

However, to avoid additional steps in the translation process, the best choice is to have the developers change some (bad) habits that are commonplace when writing code.

The following two examples demonstrate how the process is usually carried out and how it could be improved:

1. Never separate the parts of a text that will make a sentence when the code runs
For certain languages, the word order may have to be altered during the translation process. If the texts are separated, translating the sentence correctly can become difficult. In the following example, the quotation marks delimit the text, so the sentence is broken into two fragments. The best method in this case is placeholder replacement. What follows is a JavaScript example that shows the sentence “You are on page 3 of the open document.” in an alert box:

BAD
var pag_num = 3;
alert("You are on page " + pag_num + " of the open document.");

IDEAL
var myString = "You are on page %n of the open document.";
var rExp = /%n/gi;
var pag_num = 3;
var results = myString.replace(rExp, pag_num);
alert(results);
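
When a sentence carries more than one variable, the same placeholder technique can be extended with named placeholders, so that the translator can move each one to wherever the target language needs it. The sketch below is an illustration only: the {name} syntax and the formatMessage helper are assumptions, not a standard JavaScript feature, and the second sentence is simply a reordered English sentence standing in for a translation.

```javascript
// Replace each {name} placeholder with the matching value.
// Placeholders without a value are left as they are.
function formatMessage(template, values) {
  return template.replace(/\{(\w+)\}/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(values, name) ? String(values[name]) : match
  );
}

const pageInfo = { page: 3, total: 10 };

// Same placeholders, different order: the code does not need to change.
const original = "You are on page {page} of {total}.";
const reordered = "Of {total} pages, you are on page {page}.";

console.log(formatMessage(original, pageInfo));
console.log(formatMessage(reordered, pageInfo));
```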

2. Do not leave the translatable texts together with the code
Instead of hardcoding all texts, it is preferable to declare them as constants in a file that is separate from the programming code. The texts can then be translated in that separate file, and the website logic (code) does not need to be edited at all, as in the following JavaScript example, which uses the same sentence as above:

BAD
var pag_num = 3;
alert("You are on page " + pag_num + " of the open document.");

IDEAL
text.js
var myString = "You are on page %n of the open document.";

code.html
<script src="text.js"></script>
<script>
var rExp = /%n/gi;
var pag_num = 3;
var results = myString.replace(rExp, pag_num);
alert(results);
</script>
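
Once the texts live outside the code, supporting another language becomes a matter of adding another set of texts. The sketch below shows one possible arrangement; the catalogs object, the pageStatus key and the getMessage function are hypothetical names, and the Portuguese sentence is an illustrative translation. It uses console.log instead of alert so that it also runs outside a browser.

```javascript
// Hypothetical message catalogs; in practice each could live in its own
// file (for example text.en_US.js and text.pt_BR.js).
const catalogs = {
  en_US: { pageStatus: "You are on page %n of the open document." },
  pt_BR: { pageStatus: "Você está na página %n do documento aberto." },
};

// Look up a message by key, falling back to the default language
// when the requested language or key is missing.
function getMessage(key, language, defaultLanguage = "en_US") {
  const catalog = catalogs[language] || {};
  if (key in catalog) return catalog[key];
  return catalogs[defaultLanguage][key];
}

const pag_num = 3;
console.log(getMessage("pageStatus", "pt_BR").replace(/%n/g, pag_num));
```

The fallback to a default language is a design choice: it keeps the page usable while a translation is still incomplete, at the cost of occasionally showing untranslated text.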

In our next post, I will share more tips on website programming that facilitate the localization process. I will also discuss the fourth type of programming mentioned in the first tip of this series: sites built on databases.

See you then!

Ricardo Junior

What is your website type?

01/21/2008

Continuing with the posts on website localization, let’s turn now to the types of sites and how they can affect the localization process.

The factor that most influences the website localization process is the technology employed in the site's creation. These technologies can be divided into four types (counting client-side and server-side programming separately):

1. Sites without programming

2. Sites with programming
    2.1 Client-side programming
    2.2 Server-side programming

3. Sites built on databases (CMS, Content Management System)

1. Sites without programming
These sites are put together using only HTML and, optionally, Cascading Style Sheets (CSS). The translation tools available on the market fully support HTML, which greatly facilitates the translation of this type of file. This means that virtually no manual work on the files is required before the actual translation begins.

2. Sites with programming

2.1. Client-side programming
These are websites with some programming that is executed in the browser of the person accessing the site, and it may be written in JavaScript or VBScript. The programming code may be contained in each HTML file, or in files with a .js (JavaScript) or .vbs (VBScript) extension.
These websites normally use client-side code to check data to be sent via forms, to confirm user actions, to provide information before executing an action, to open pages by replacing regular button and hyperlink functions, to swap images in dynamic menus (rollover effect) and so on. Warnings, alerts, prompts, error messages and other messages need to be translated for these added functions.

2.2. Server-side programming

As in the previous item, these websites contain programming code. However, this code runs on the server, and thus its functions and goals are different from those of client-side code.
The server can make use of a variety of programming languages (PHP, ASP, ASP.NET, JSP, ColdFusion, Perl, Python, etc.). Server-side code is normally used to access databases (or other data sources, such as XML) and then use this information to create pages to be displayed in the browser with their dynamic content already inserted. In other words, the server processes the programming code and sends only the result back to the client.

Server-side programming files may contain HTML text and tags, which are used to create the pages to be displayed. As a result, and in contrast to the previous option, it is not only the warnings, alerts and error messages that need to be translated in this type of file, but the HTML text as well. This is because each server-side programming file is capable of generating different HTML pages, depending on the logic of the code.
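
To make this concrete, the sketch below shows a single function that can produce three different pages. It is written in JavaScript only to stay consistent with the other examples; on a real server the same pattern would typically appear in PHP, ASP or a similar language. The renderAccountPage function and its data are hypothetical.

```javascript
// Hypothetical server-side handler: one file, several possible pages.
// Note how translatable text is mixed with HTML tags and program logic.
function renderAccountPage(user) {
  if (!user) {
    return "<h1>Please log in</h1><p>You must be logged in to view this page.</p>";
  }
  if (user.orders.length === 0) {
    return "<h1>Hello, " + user.name + "</h1><p>You have no orders yet.</p>";
  }
  return (
    "<h1>Hello, " + user.name + "</h1>" +
    "<p>Orders placed: " + user.orders.length + "</p>"
  );
}

console.log(renderAccountPage(null));
console.log(renderAccountPage({ name: "Ana", orders: [] }));
console.log(renderAccountPage({ name: "Ana", orders: [101, 102] }));
```

A translator who only sees one of the rendered pages would miss the text of the other two, which is why the source file itself has to be made available for translation.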

Next, I will examine the best ways to approach these kinds of programming as well as the correct procedures for writing code, texts, etc.

See you in the next post!

 Ricardo Junior

Taxes, taxes and more taxes…

01/18/2008

No English translation for this post. Lucky are those who don’t have to feed our tax “lion”.

Make a difference in 2008!

01/17/2008

How many times have we become upset with a certain situation in our company or in our personal life, and all we do is complain? Shall we start changing this trend?

Instead of moaning, let us take on responsibilities that could be decisive and change our reality.

We are responsible for the environment and atmosphere in the company. If our behavior and attitude are no different from those about which we complain, we will be contributing to the bad situation remaining the same or getting even worse.

If we want to see changes, we should be the first to change. We should not wait for others to change their attitude and only then start to consider how we are behaving. Yet one must be brave to make a difference.

Do you feel responsible for the atmosphere in your company?

 
If the seasons change, why can’t you?

Do you have the guts to serve as an example when you believe something should change?

Think about it…

What now?

01/16/2008

According to an article by John Yunker, Web Globalization Predictions: 2008 and Beyond, which was published on his blog, Global by Design, Google is also after a bite of our daily bread. They are improving Google Translate and its integration with Google Apps.

Google Translate is yet another machine translation tool. Like all such tools, to work well it needs a powerful database with all kinds of text and the corresponding translations. To create this humongous database, Google is considering asking Google Apps users to upload their translation memories so that Google Translate becomes more complete and better able to provide good quality translations.

Used as part of Google Apps, Google Translate is the ideal tool for real-time translation of emails and websites. The result might not be ideal, but would improve over time.

This will probably take a long time, but it seems a new player in this market is born (and this player definitely wants to move our cheese!)

“Translator, you are in my hands.” (Google) 

A New Year, A New Life?

01/15/2008

New Year resolutions are rather interesting. Most people promise to:

- join a gym
- finally start a diet
- save money

… to name a few of the most frequent promises.

                                   

Whatever your resolution for the New Year may be, the meaning is almost the same.

We are in mid-January already, but why not start today all those things we promised for the New Year? The change of the year in the calendar will not miraculously represent a change in the world around you. After all, the world will be virtually the same on December 31 and on January 1.

Let us start to live the present TODAY. Let us work towards the realization of our dreams and plans as soon as possible, instead of waiting for a “special day” to start something on which we should be working all the time and since forever.

Happy New Year!

With or without resolutions, but with lots of peace, love, health and success, both personal and professional!

How to localize your website?

01/15/2008

There are various methods for developing websites, and the programming can be browser- and/or server-based. A website developer can use a range of technologies, which will have a direct impact on the localization process. Regardless of the technology used, the most effective procedure when localizing a website is to always provide the source files.

When you are put in charge of localizing your company’s website, do not simply point us to the online address. Ask your webmaster or web designer to supply the content offline, or the means for us to locate the original files. In this way, the price quote is more precise, your team is spared side work, and your turnaround and online placement time is cut in half. The result? While reducing your website localization costs, you increase the satisfaction of those who decided it was time for your company to go global.

In the next post, I will discuss the different types of websites and how their differences can affect the localization process.

 


 

 

Ricardo Junior worked as a localization engineer for Ccaps but decided to pursue a career as an accountant and today works for Petrobras.

Could “o mesmo” be the grandfather of “gerundismo”?

01/07/2008

Sorry, no English translation for this post.