Mr.Kim.Sanders

ASCII-workarounds for accented letters


A fairly comprehensive list of ASCII equivalents for "Characters with diacritical marks absent in the codepage ISO 8859-1" is at http://www.eki.ee/knab/kbdiakr.htm.

Although most of it is written in the Estonian language, it does show the English (ingl) translation for the various accents (diacritics), and examples of each "foreign" letter used in each of the 41 languages shown from around the world.

Down the left side of the table, under the heading Kood ja märk (code and symbol), are the Alt-numbers (without the necessary leading zeros) and the symbol used to represent each accent/diacritic. The exception is 187 [i.e. Alt-0187], which is used for letters that don't fit the norm, such as the dotless i in Turkish and the long ö in Hungarian.
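To check which symbol a given Alt-number stands for, here is a minimal Python sketch. It assumes the leading-zero Alt codes (Alt-0nnn) are read as single bytes in Windows-1252, which is the case on a Western-European Windows setup; the function name `alt_code_to_char` is only illustrative.

```python
def alt_code_to_char(code: int) -> str:
    """Return the Windows-1252 character for Alt-0<code> (assumed code page)."""
    if not 0 <= code <= 255:
        raise ValueError("Alt-0nnn codes cover a single byte (0-255)")
    # A few bytes (e.g. 129, 141) are unassigned in Python's cp1252 codec
    # and raise UnicodeDecodeError here.
    return bytes([code]).decode("cp1252")


# The table omits the leading zero, so 187 in the table means Alt-0187.
for n in (187, 162, 165, 180):
    print(f"Alt-{n:04d} -> {alt_code_to_char(n)}")
```

The `:04d` format puts back the leading zero that the table leaves out.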

The languages covered are as follows, using accented English alphabetical order (with the Estonian names to search), with Romanized usage marked with an asterisk [i.e. *] in each case: Arabic (araabia)*, Azerbaijani (aserbaidz^aani), Bengali (bengali)*, Bulgarian (bulgaaria)*, Catalan (katalaani), Croatian (horvaadi), Czech (ts^ehhi), Estonian (eesti), Farsi (pärsia)*, French (prantsuse), Guaraní (guaranii), Hawaiian (havai), Hebrew (heebrea)*, Hindi (hindi)*, Hungarian (ungari), Japanese (jaapani)*, Khmer (khmeeri)*, Korean (korea)*, Latvian (läti), Lithuanian (leedu), Livonian (liivi), Macedonian (makedoonia)*, Malagasy (malagassi), Maltese (malta), Māori (maoori), Nepali (nepali)*, Polish (poola), Pushtu (pus^tu)*, Romanian (rumeenia), Sámi (saami), Serbian (serbia)*, Slovak (slovaki), Slovenian (sloveeni), Sorbian (sorbi), Tagalog (tagali), Tamil (tamili)*, Turkish (türgi), Urdu (urdu)*, Vietnamese (vietnami)*, Welsh (kõmri), Yoruba (joruba).


There are several work-arounds for entering Vietnamese so the user can know the accurate spelling, even using Windows-1252. Which one to use will depend on what look and/or entry speed you are willing to accept.

If you go with standards used by the Vietnamese themselves, there are 3 ways that come to mind: Telex (IME) (http://en.wikipedia.org/wiki/Telex_%28IME%29), Vietnamese Quoted-Readable [a.k.a. Vietnet] (http://en.wikipedia.org/wiki/Vietnamese_Quoted-Readable), and the VNI Input Method (http://en.wikipedia.org/wiki/VNI).

Then there are 2 others that I've seen online: the "KBDiakr" method (http://www.eki.ee/knab/kbdiakr.htm) and the RFC 1345 mnemonics method (http://www.ietf.org/rfc/rfc1345.txt). I must add that if I were to use the RFC method, I would use \ [i.e. ALT-0092] as a delimiter where required, instead of _.

And, finally, I've come up with 2 myself: the Mekasseh method and the Cappy Singer method.

The Mekasseh method gives the modern Vietnamese Latin/Roman alphabet extensions as follows:

A¢a¢ [using ALT-0162], Aˆaˆ [using ALT-0136], Eˆeˆ, Oˆoˆ, O’o’ [using ALT-0146], Uˆuˆ, and U’u’. Tones follow thus: ´ [using ALT-0180], ` [using ALT-0096], ² [using ALT-0178], ˜ [using ALT-0152], and · [using ALT-0183].
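For anyone who later wants to turn Mekasseh-style Vietnamese back into proper Unicode, here is a rough Python sketch. The mapping of each marker to a combining mark is my assumed reading of the list above (¢ as the breve, ˆ as the circumflex, ’ as the horn, and the five tone signs as acute, grave, hook above, tilde, and dot below), and the name `mekasseh_vi_to_unicode` is hypothetical. It also assumes the tone sign is typed after the letter and its vowel marker.

```python
import unicodedata

# Assumed reading of each Mekasseh marker as a Unicode combining mark.
VOWEL_MARKS = {
    "\u00a2": "\u0306",  # cent sign (Alt-0162) -> breve
    "\u02c6": "\u0302",  # modifier circumflex (Alt-0136) -> circumflex
    "\u2019": "\u031b",  # right single quote (Alt-0146) -> horn
}
TONE_MARKS = {
    "\u00b4": "\u0301",  # acute accent (Alt-0180) -> acute
    "`": "\u0300",       # grave accent (Alt-0096) -> grave
    "\u00b2": "\u0309",  # superscript two (Alt-0178) -> hook above
    "\u02dc": "\u0303",  # small tilde (Alt-0152) -> tilde
    "\u00b7": "\u0323",  # middle dot (Alt-0183) -> dot below
}
MARKS = {**VOWEL_MARKS, **TONE_MARKS}


def mekasseh_vi_to_unicode(text: str) -> str:
    out = []
    for ch in text:
        # Convert a marker only when it directly follows a letter
        # or a combining mark that was just produced.
        follows_letter = bool(out) and (
            out[-1].isalpha() or unicodedata.combining(out[-1]) != 0
        )
        out.append(MARKS[ch] if ch in MARKS and follows_letter else ch)
    # NFC merges base letters and combining marks into precomposed letters.
    return unicodedata.normalize("NFC", "".join(out))


# Typed in Mekasseh as: Tie + circumflex + acute + ng, Vie + circumflex + dot + t
print(mekasseh_vi_to_unicode("Tie\u02c6\u00b4ng Vie\u02c6\u00b7t"))  # Tiếng Việt
```

One caveat with this sketch: an ordinary ’ used as an apostrophe right after a letter would also be converted, so it only suits text known to be in this notation.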

The Cappy Singer method gives the modern Vietnamese Roman/Latin alphabet extensions as follows:

Ä[ALT-0196]ä[ALT-0228], Â[ALT-0194]â[ALT-0226], Ê[ALT-0202]ê[ALT-0234], Ô[ALT-0212]ô[ALT-0244], Ö[ALT-0214]ö[ALT-0246], Û[ALT-0219]û[ALT-0251], and Ü[ALT-0220]ü[ALT-0252]. Tones follow as seen in the Mekasseh method just mentioned.


The Mekasseh method for Czech gives the Latin/Roman additions as follows:

A´[using ALT-0180]a´, C¥[using ALT-0165]c¥, D¥d¥, E´e´, E¥e¥, Chch, I´i´, N¥n¥, O´o´, R¥r¥, S¥s¥, T¥t¥, U´u´, U°[using ALT-0176]u°, Y´y´, and Z¥z¥. In summary, the tšárkas are represented using ALT-0180 after the unaccented Latin/Roman letter they're based on, the hátšeks are represented by the Yen sign [ALT-0165] following the unaccented letter, and the kroužeks are represented by the Degree sign [ALT-0176] following the base U/u.
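Going the other way, from properly accented Czech to the Mekasseh spelling, can be sketched in Python by decomposing each letter and swapping the combining mark for its stand-in. This is only an illustration under the summary above; the name `czech_to_mekasseh` and the table `CZECH_MARKS` are made up for the example.

```python
import unicodedata

# Combining mark -> Mekasseh stand-in, per the summary above.
CZECH_MARKS = {
    "\u0301": "\u00b4",  # combining acute -> acute accent (Alt-0180)
    "\u030c": "\u00a5",  # combining caron -> Yen sign (Alt-0165)
    "\u030a": "\u00b0",  # combining ring -> Degree sign (Alt-0176)
}


def czech_to_mekasseh(text: str) -> str:
    # NFD splits each accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(CZECH_MARKS.get(ch, ch) for ch in decomposed)


print(czech_to_mekasseh("Dvo\u0159\u00e1k"))  # Dvor¥a´k
```

Any combining mark not in the table is left in its decomposed form, so the sketch is only meant for Czech text.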

For the Czech Roman/Latin additions, the Cappy Singer method is as follows:

Á[ALT-0193]á[ALT-0225], C¥[using ALT-0165]c¥, D¥d¥, É[ALT-0201]é[ALT-0233], E¥e¥, Chch, Í[ALT-0205]í[ALT-0237], N¥n¥, Ó[ALT-0211]ó[ALT-0243], R¥r¥, Š[ALT-0138]š[ALT-0154], T¥t¥, Ú[ALT-0218]ú[ALT-0250], U°[using ALT-0176]u°, Ý[ALT-0221]ý[ALT-0253], and Ž[ALT-0142]ž[ALT-0158]. In summary, in the Cappy Singer method, all Czech letters which look like those found in Windows-1252 are represented that way, which includes Áá, Éé, Chch, Íí, Óó, Šš, Úú, Ýý, and Žž. The Czech-specific hátšeks and kroužeks are entered as in the Mekasseh method: C¥c¥, D¥d¥, E¥e¥, N¥n¥, R¥r¥, T¥t¥, and U°u°.

Edited by Mr.Kim.Sanders


The Mekasseh method for the Turkish modifications (of the Roman/Latin alphabet) is as follows:

C¸[using ALT-0184]c¸, G¢[using ALT-0162]g¢, I¤[using ALT-0164]i¤, I•[using ALT-0149]i•, O¨[using ALT-0168]o¨, S¸s¸, and U¨u¨. The circumflexed versions (used by some) each use ˆ[ALT-0136] following the standard Latin/Roman base, thus: Aˆaˆ, Iˆiˆ, and Uˆuˆ.

The Cappy Singer method for the Turkish modifications (of the Latin/Roman alphabet) is as follows:

Ç[ALT-0199]ç[ALT-0231], G¥[using ALT-0165]g¥, I¤i¤, I•i•, Ö[ALT-0214]ö[ALT-0246], Š[ALT-0138]š[ALT-0154], and Ü[ALT-0220]ü[ALT-0252]. The circumflexed versions for the Cappy Singer method are the same as described for the Mekasseh method, i.e. Aˆaˆ, Iˆiˆ, and Uˆuˆ.


For a more-comprehensive solution that can "apparently" only be used on TMG in the Notes field, one can refer to http://www.geocities.com/Athens/Parthenon/9860/asr22.html [a solution known as ASR 2.2]. I say "apparently" because it relies on strikethroughs and underlines.

An interesting side-note is found there, which indicates to me that the system I call KBD was actually devised by Peeter Päll [peeter@eki.ee], and is referred to as ASR 2.0. At the time of the update (1997-10-24), the link went to , which is now , I see.
