The Locale Explorer

Like many programmers, I have not always made my applications fully internationalized. Doing so requires a lot of NLS (National Language Support) calls, which in turn requires a lot of reading about them. The poor organization of some of this material in the MSDN does not help (although it appears to be improving). Also, writing the code to retrieve certain locale-specific features is a bit tedious.

It was a small project that got out of hand.

In an attempt to simplify my own life, and motivated by a need to handle locale-specific data, I researched some aspects of this problem. Perhaps the most important API call in this regard is GetLocaleInfo. The problem with this API is that it has a long and tedious list of options, and when you look at an option, you really have no idea what text you are going to see.  Do you want to see LOCALE_SCOUNTRY, LOCALE_SENGCOUNTRY, or LOCALE_SLANGUAGE? Which features differ across locales?

While there are many other APIs of serious concern, this one API was of primary interest to me in a particular project. So I wrote the first draft of what I call the Locale Explorer, which presents all of the information this one API can return (over 100 different pieces of locale-specific information!).

In addition to getting the LOCALE_SYSTEM_DEFAULT and LOCALE_USER_DEFAULT options, I allow you to display all the locales your machine supports.

The first time I ran it, I found that some of the information was suspect. For example, when I displayed the equivalent of "Monday" in FYRO Macedonian, I saw a string of nonsense characters, which did not look particularly useful. So I immediately recompiled it in Unicode, and got the same nonsense. However, I realized that the default control font for dialogs, even in Unicode apps, is MS Sans Serif, which is not a Unicode font. So in the Unicode version, I use the Arial font, and I get "понеделник", which, if you have the right font in your browser, will be seen as the correct representation. Greek comes out as "Δευτέρα" (which again will read correctly only if you have the right fonts for your browser). So there's progress here.

But this was not enough. I then had to figure out how to write the code in my application. So I wrote a little help file that explained how to write the code. But this seemed a bit silly. After all, the program has all the context it needs to tell me what to generate. So I created an edit control and filled it in with the correct code to write. For example, using MFC, the code to retrieve the localized text string for "Monday" for the Greek locale is

// LOCALE_SDAYNAME1
CString sdayname1_data;
{ /* get LOCALE_SDAYNAME1 */
 int length = GetLocaleInfo(0x00000408, LOCALE_SDAYNAME1, NULL, 0);
 LPTSTR p = sdayname1_data.GetBuffer(length);
 VERIFY(::GetLocaleInfo(0x00000408, LOCALE_SDAYNAME1, p, length));
} /* get LOCALE_SDAYNAME1 */
sdayname1_data.ReleaseBuffer();
// ... use sdayname1_data here

That code appears in this Web page courtesy of copy-and-paste; the code was generated by my program.  However, that's MFC code. What about those programming in C++ without MFC?  In that case, it would be

// LOCALE_SDAYNAME1
LPTSTR sdayname1_data;
{ /* get LOCALE_SDAYNAME1 */
int length;
length = ::GetLocaleInfo(0x00000408, LOCALE_SDAYNAME1, NULL, 0);
sdayname1_data = new TCHAR[length];
::GetLocaleInfo(0x00000408, LOCALE_SDAYNAME1, sdayname1_data, length);
} /* get LOCALE_SDAYNAME1 */
// ... use sdayname1_data here
delete [] sdayname1_data;

and for C programmers it would be

// LOCALE_SDAYNAME1
LPTSTR sdayname1_data;
{ /* get LOCALE_SDAYNAME1 */
int length = GetLocaleInfo(0x00000408, LOCALE_SDAYNAME1, NULL, 0);
sdayname1_data = (LPTSTR)malloc(length * sizeof(TCHAR));
GetLocaleInfo(0x00000408, LOCALE_SDAYNAME1, sdayname1_data, length);
} /* get LOCALE_SDAYNAME1 */
// ... use sdayname1_data here
free(sdayname1_data);
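
The constant 0x00000408 in the listings above is the locale identifier (LCID) for Greek. As an aside, the sketch below shows how such an identifier breaks down into primary language, sublanguage, and sort order fields, roughly what the SDK macros PRIMARYLANGID, SUBLANGID, and SORTIDFROMLCID compute. It is written as portable C++ with locally defined helpers; the names LcidParts, splitLcid, and makeLcid are illustrative only and are not part of the Windows headers or of Locale Explorer. In real code you would use the SDK macros.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative breakdown of a 32-bit Windows locale identifier (LCID).
// Field layout: bits 0-9 primary language, bits 10-15 sublanguage,
// bits 16-19 sort order. These helpers are stand-ins for the SDK macros.
struct LcidParts {
    std::uint32_t primaryLanguage;
    std::uint32_t subLanguage;
    std::uint32_t sortId;
};

LcidParts splitLcid(std::uint32_t lcid) {
    LcidParts parts;
    parts.primaryLanguage = lcid & 0x3FFu;         // like PRIMARYLANGID(LANGIDFROMLCID(lcid))
    parts.subLanguage     = (lcid >> 10) & 0x3Fu;  // like SUBLANGID(LANGIDFROMLCID(lcid))
    parts.sortId          = (lcid >> 16) & 0xFu;   // like SORTIDFROMLCID(lcid)
    return parts;
}

std::uint32_t makeLcid(std::uint32_t primaryLanguage,
                       std::uint32_t subLanguage,
                       std::uint32_t sortId) {
    // Each field must fit in its bit range, or the fields would overlap.
    assert(primaryLanguage <= 0x3FFu && subLanguage <= 0x3Fu && sortId <= 0xFu);
    return (sortId << 16) | (subLanguage << 10) | primaryLanguage;
}
```

With these helpers, splitLcid(0x00000408) yields primary language 0x08 (LANG_GREEK), sublanguage 0x01 (SUBLANG_DEFAULT), and sort order 0 (SORT_DEFAULT).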

What if you use STL? Well, I posted a question, and got back several different answers, one of which said it was untested. So I wasn't sure what to do. Since there are multiple possible approaches, the choice may be specific to a given programmer's style or a corporate style. That's OK. I added a "user-defined" feature that allows you to write any kind of template you want! The trick is that all of the above were produced with FormatMessage using a formatting string with %1, %2, etc. inside. So you can create strings of your own, put them in a file, and generate your template from that file. Full details are provided in the help text which accompanies the program (use the System Menu, which has a Help... menu item, to read this help text).
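
The substitution idea can be sketched in portable C++. This is only an approximation of what FormatMessage does with positional inserts: it handles %1 through %9 and %% and nothing else, and the function name expandTemplate is made up for illustration. Which insert number carries which value in the real template files is described in the program's help text, so the ordering used in the usage note below is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Portable approximation of FormatMessage-style positional inserts.
// Only %1..%9 and %% are handled; the real API supports much more.
std::string expandTemplate(const std::string& pattern,
                           const std::vector<std::string>& inserts) {
    std::string out;
    for (std::size_t i = 0; i < pattern.size(); ++i) {
        const char c = pattern[i];
        if (c != '%' || i + 1 == pattern.size()) {
            out += c;                  // ordinary character, or a trailing '%'
            continue;
        }
        const char next = pattern[++i];
        if (next == '%') {
            out += '%';                // "%%" is a literal percent sign
        } else if (next >= '1' && next <= '9') {
            const std::size_t index = static_cast<std::size_t>(next - '1');
            assert(index < inserts.size());  // template refers to a missing insert
            out += inserts.at(index);
        } else {
            out += '%';                // unknown escape: copy it through unchanged
            out += next;
        }
    }
    return out;
}
```

For example, expanding "::GetLocaleInfo(%1, %2, NULL, 0);" with the inserts "0x00000408" and "LOCALE_SDAYNAME1" produces "::GetLocaleInfo(0x00000408, LOCALE_SDAYNAME1, NULL, 0);".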

There are lots of other interesting features this program illustrates beyond the use of the GetLocaleInfo API, if you read the source code. For example, it has a scrolling dialog box which, unlike the one in Microsoft KB article Q262954, actually works (it is based on that article, but fixes a huge set of bugs in it). It uses a Rich Edit 2.0 control in VS6, which is nominally impossible (but it is there, and it works! I'm writing another article about this, but it isn't ready yet). It creates its own context popup for edit controls, changes the highlighting of an edit control based on whether or not it has focus, illustrates interesting uses of FormatMessage, and demonstrates a bimodal program (Unicode and ANSI) and a dropdown tree control (sort of...). For those of you who haven't seen my other essays, it also uses dynamically resizing combo boxes and my Registry library (the latest version, not out yet). There are about 41K lines of LocaleExplorer and other useful libraries for your use.

What happened next

Well, I kept adding things.  And adding them.  And adding them.  Probably after I release it, I'll add some more stuff. The current release is over 47,000 source lines and over 1.6 megabytes of source code, not counting the documentation.  The documentation is over 380 .rtf files encompassing over 280K bytes of text.  This represents a "spare time" project that I've been working on for about a year, on and off.

Running LocaleExplorer

You should set up a default font using the Fonts > Default font... option.  To see information properly, select a font that contains the characters you need as Unicode characters.  I initially set out to allow the user to specify a font per code page, but the lack of fonts made testing this feature hard; finally, someone in the newsgroup pointed out that Microsoft had the Arial Unicode MS font (which is an optional Office installation), and I installed it and have been using it ever since.  As a result, the code that handles the other font specification capabilities remains largely untested.

Common to all of Locale Explorer is the lower panel.  The upper panels are part of a tabbed dialog, and you choose the API, or group of APIs, that you wish to explore. From the lower panel, choose a language or locale.  There are two enumeration mechanisms, "Supported" and "All".  It turns out that many of the apparently-unsupported locales appear to be supported.

Many of the APIs have in common the ability to specify a parameter, LOCALE_USE_CP_ACP and/or LOCALE_NOUSEROVERRIDE.  The first will force use of the ANSI Code Page, and the second will cause the logged-in user settings to be ignored and instead the system default settings will prevail.
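
Both modifiers are bit flags that are combined with the LCType argument using bitwise OR. A minimal sketch follows, using local copies of the header values so that it builds without the Windows SDK; the constant names with the k prefix and the function makeLcType are illustrative, and in real code you would include <windows.h> and use the LOCALE_* names directly.

```cpp
#include <cassert>
#include <cstdint>

// Local copies of values from the Windows headers, so the sketch builds anywhere.
const std::uint32_t kLocaleNoUserOverride = 0x80000000u;  // LOCALE_NOUSEROVERRIDE
const std::uint32_t kLocaleUseCpAcp       = 0x40000000u;  // LOCALE_USE_CP_ACP
const std::uint32_t kLocaleSDayName1      = 0x0000002Au;  // LOCALE_SDAYNAME1

// Build the LCType argument from a base item plus the two option states.
std::uint32_t makeLcType(std::uint32_t item, bool noUserOverride, bool useAnsiCodePage) {
    // The base item must not already carry either modifier bit.
    assert((item & (kLocaleNoUserOverride | kLocaleUseCpAcp)) == 0);
    std::uint32_t lctype = item;
    if (noUserOverride)  lctype |= kLocaleNoUserOverride;
    if (useAnsiCodePage) lctype |= kLocaleUseCpAcp;
    return lctype;
}
```

The result is what would be passed as the second argument of GetLocaleInfo, for example LOCALE_SDAYNAME1 | LOCALE_NOUSEROVERRIDE.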

The Template Name indicates which of the macro templates would be used to generate the code.  The actual template depends on the per-API options selected. 

Code generation is supported for several common languages.  Templates for C, C++ (without MFC), and MFC are fully supported.  As of this writing, only a few templates are supported for STL (the Standard C++ Library); if anyone is motivated to write them, I will add them (and a credit line) to the release.
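
For readers wondering what an STL-flavored version of the size-then-fill pattern might look like, here is one possible sketch. It is not one of the shipped templates. It assumes a C++11 compiler, and it uses a hypothetical stand-in, fakeGetLocaleInfo, in place of the real API so that it builds without the Windows SDK; the helper name queryString is also made up.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>
#include <vector>

// Hypothetical stand-in for the real API so the sketch builds anywhere.
// Like GetLocaleInfo, it returns the required size in characters, including
// the terminating null, when cch is 0, and returns 0 on failure.
int fakeGetLocaleInfo(wchar_t* buffer, int cch) {
    static const wchar_t text[] = L"Monday";
    const int needed = static_cast<int>(sizeof(text) / sizeof(text[0]));
    if (cch == 0) return needed;
    if (buffer == nullptr || cch < needed) return 0;
    std::wmemcpy(buffer, text, static_cast<std::size_t>(needed));
    return needed;
}

// Size-then-fill using std::vector and std::wstring instead of new[] or malloc.
// Query is any callable with the shape int(wchar_t* buffer, int cch).
template <typename Query>
std::wstring queryString(Query query) {
    const int length = query(nullptr, 0);               // first call: ask for the size
    if (length <= 0) return std::wstring();
    std::vector<wchar_t> buffer(static_cast<std::size_t>(length));
    const int written = query(buffer.data(), length);   // second call: fill the buffer
    if (written <= 0) return std::wstring();
    assert(written <= length);
    // Drop the terminating null that the count includes.
    return std::wstring(buffer.data(), static_cast<std::size_t>(written - 1));
}
```

On Windows, the stand-in would be replaced by a small lambda that forwards the buffer and count to ::GetLocaleInfoW with the desired locale and LCType. The vector owns the temporary buffer, so there is nothing to delete or free afterwards.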

The User option allows an arbitrary user template file to be used; the file is specified using the button to select a file, or the File > Select User Template menu item.

The large edit control on the left bottom will get sample code, generated from the templates. This is a read-only edit control, so text may be selected and selectively copied, or the Copy button may be used to copy the entire contents of the code window to the clipboard.  From there it may be pasted into your code.

While there are built-in templates, you are free to create your own copies of the template files, and there is an algorithm that will cause your copy to be used in preference to the built-in template files.

Some API values allow for either a text return type, or a DWORD return type; when a DWORD is an option, the DWORD checkbox will be enabled.

Some APIs allow you to name a variable that is used.  Most templates also use a collection of local names which can only be changed by changing the template, but most tend to allow a single user-specified variable name, and in some cases a user-specified function name for callbacks.  If this is left blank, a suitable name will be generated.  The name, once set, will be remembered on each page (and can be different) but the names are not remembered across sessions.
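
As an illustration of what a generated name could look like, the sketch below derives one from the LCType constant in a way that is consistent with the listings earlier in this article (LOCALE_SDAYNAME1 becomes sdayname1_data). This is a guess at the rule based only on those listings; the function name defaultVariableName is hypothetical, and the rule Locale Explorer actually applies may differ.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Guess at a default-name rule: strip the LOCALE_ prefix, lowercase the
// remainder, and append "_data". Inferred from the sample listings only.
std::string defaultVariableName(const std::string& constantName) {
    const std::string prefix = "LOCALE_";
    std::string stem = constantName;
    if (stem.compare(0, prefix.size(), prefix) == 0) {
        stem.erase(0, prefix.size());
    }
    assert(!stem.empty());  // a bare prefix or empty name is not a usable constant
    for (char& ch : stem) {
        ch = static_cast<char>(std::tolower(static_cast<unsigned char>(ch)));
    }
    return stem + "_data";
}
```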

Visual Studio versions

It is hard to create an ANSI application in VS6 that also supports all the features I wanted to support.  I did not feel like working on the complex circumlocutions required to make it work, when these facilities already existed in VS.NET.  Therefore, I simply abort any attempt to compile a non-Unicode version in VS6.  To make it compile in VS.NET, I use the VS.NET MFC types CStringA and CStringW.  Macros in VS6 redefine these two types to the VS6 CString type, and only a Unicode version can be built in VS6.

WARNING!  UNICODE-ONLY SUPPORT!  Note that while this project is written Unicode-aware, this works only up to a certain point.  Unfortunately, Microsoft has provided inadequate Unicode support.  For example, in a sane world there would be no WM_SETTEXT message; instead there would have been WM_SETTEXTA and WM_SETTEXTW.  This would make it possible to send Unicode strings directly to an edit or Rich Edit control.  But this is not done.  There is an EM_SETTEXTEX message that supports this for Rich Edit 3.0, but because of the continuing confusion that Microsoft exhibits about the "standard" Rich Edit control, it cannot be assumed to be defined.  Therefore, I made a simple decision: only the Unicode versions of this application are supported.  There is no 8-bit-character ANSI build supported.  This means that you have to make sure that you have installed all the necessary Unicode libraries for Visual Studio, whatever version you are using.  If you fail to install the Unicode libraries, it will not build.  It was not worth the effort to make it ANSI-compatible, because so much of it requires full Unicode support.

Beta Release Disclaimer

This code is distributed as a beta release.  It has known problems.  When compiled under VS.NET, the index does not work correctly.  In VS6 there are memory leaks which I have not had time to clean up.  I will be working on these problems, but I wanted to put this out for comment and feedback.  There are several places I want to clean up.  But I've run out of time for this phase of revisions, and it is "close enough" that it is actually usable.

Consistency checking

The generation of code is controlled by template files.  For complete details on the template files, see the accompanying article.  However, it was also necessary to make sure that all template files defined all templates, so a special template-checker was written, and it also exemplifies a number of useful techniques.
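
The core of such a consistency check can be sketched as follows. This is not the template-checker that ships with the program, and it does not parse template files: it assumes the template names defined by each file have already been collected into sets. The type TemplateFiles, the function findMissingTemplates, and the file and template names in the usage note are all hypothetical.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Maps a template file name to the set of template names it defines.
typedef std::map<std::string, std::set<std::string> > TemplateFiles;

// Reports "file: name" for every template that at least one file defines
// but the named file lacks. Output is ordered by file name, then template name.
std::vector<std::string> findMissingTemplates(const TemplateFiles& files) {
    // The union of all names is the set every file is expected to define.
    std::set<std::string> all;
    for (const auto& file : files) {
        all.insert(file.second.begin(), file.second.end());
    }
    std::vector<std::string> missing;
    for (const auto& file : files) {
        for (const auto& name : all) {
            if (file.second.count(name) == 0) {
                missing.push_back(file.first + ": " + name);
            }
        }
    }
    return missing;
}
```

For instance, if a file "c.tpl" defines GetLocaleInfo and EnumDateFormats while "stl.tpl" defines only GetLocaleInfo, the single reported entry is "stl.tpl: EnumDateFormats".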

Useful MFC programming features

I found myself using a lot of techniques which are not necessarily documented in examples.  So I'm taking the time to explain them here.

The various Internationalization features demonstrated include

In addition to these, there is a display of the machine-readable Unicode database.

This is a beta-test version. If there are problems, let me know. Anyone who wants to tackle some of the deficiencies, let me know. I was unable to test it meaningfully with languages like Japanese, Chinese, Korean, Arabic and Hebrew, all of which would probably stress my current implementation, but none of which I can read.

The views expressed in these essays are those of the author, and in no way represent, nor are they endorsed by, Microsoft.

Send mail to newcomer@flounder.com with questions or comments about this web site.
Copyright © 2005 Joseph M. Newcomer/FlounderCraft Ltd.  All Rights Reserved.
Last modified: May 14, 2011