
Export Norwegian names in GEDCOM not working

GEDCOM Character sets

3 replies to this topic

#1 JRB

JRB
  • Members
  • 2 posts

Posted 04 December 2014 - 06:59 PM

I have a data set with Norwegian place names and words. With character mapping, the correct letters (æ, ø and å) are displayed, and all reports in TMG work. When I export GEDCOM from TMG version 8, however, the characters do not map as expected. The choices I have are ANSI, ANSEL or IBMPC. IBMPC does not support the extra characters. The other two do export "non-roman" characters, but not the ones expected. I use TMG on Windows 7 running under Parallels on an iMac. Have others seen this?

#2 Vera Nagel

Vera Nagel
  • Moderators
  • 1,535 posts
  • Gender:Female
  • Location:Bevern, Lower Saxony, Germany

Posted 05 December 2014 - 01:38 AM

I use TMG's most recent program version, TMGv9.05 (!!) on OS Windows 7.

 

As you found, all of the special Norwegian characters you list are part of the ANSI character set, which TMG supports.

 

I have a very small test project, for testing purposes only, which contains just a few people and place names using these characters.

I just exported it to GEDCOM 5.5 with character set ANSI, and all of these special Norwegian characters were exported to GEDCOM correctly, without any exception or issue.

 

As an aside, for the record only: during the development cycle of TMGv9 (!!), some previously existing collate sequence issues pertaining to these Norwegian characters were fixed.

 

##################

 

Next I imported my GEDCOM file into TMGv8.08 (!!), which is still installed on one of my machines, exported that project to GEDCOM 5.5 with character set ANSI, and again did not run into any issues with these special Norwegian characters. Everything exported to GEDCOM just fine.

 

Besides that, TMG has a number of Norwegian users, and I have never come across any of them complaining about GEDCOM export issues with these special Norwegian characters.

 

When you inspect a GEDCOM file, be sure to use a plain text editor and not a program that may "interpret" such characters.
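One way to sidestep the "interpreting" problem entirely is to look at the raw bytes rather than at rendered characters. The following is only a minimal sketch: the function name `summarize_gedcom_bytes` and the sample record are made up for illustration, and it assumes the export declares its character set on a `1 CHAR` header line. It reports the declared character set and which byte values above 0x7F occur in the data.

```python
from collections import Counter


def summarize_gedcom_bytes(data: bytes):
    """Return (declared character set or None, Counter of non-ASCII byte values)."""
    declared = None
    for raw_line in data.splitlines():
        parts = raw_line.strip().split(None, 2)
        # The header normally carries a line such as "1 CHAR ANSEL".
        if len(parts) >= 3 and parts[0] == b"1" and parts[1] == b"CHAR":
            declared = parts[2].decode("ascii", errors="replace")
            break
    # Every byte >= 0x80 is outside ASCII, so its meaning depends on the encoding.
    high_bytes = Counter(b for b in data if b >= 0x80)
    return declared, high_bytes


# Hypothetical sample: "Bjørn /Ås/" as a Windows-1252 ("ANSI") export would store it.
sample = (
    b"0 HEAD\r\n1 CHAR ANSI\r\n"
    b"0 @I1@ INDI\r\n1 NAME Bj\xf8rn /\xc5s/\r\n0 TRLR\r\n"
)

declared, high = summarize_gedcom_bytes(sample)
print("Declared character set:", declared)
for value, count in sorted(high.items()):
    print(f"0x{value:02X} occurs {count} time(s)")
```

For a real export, the bytes would come from something like `open(path, "rb").read()`. Knowing the byte values makes it possible to tell whether the file is wrong or whether the viewer is simply decoding it with a different character set.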

 

For reference, please also see the attached GEDCOM file showing the correct export, which I renamed from .ged to .txt.


Thanks, Vera

Moderator of the German TMG User Group

#3 John Cardinal

John Cardinal
  • Senior Members
  • 835 posts
  • Location:North Andover, MA

Posted 08 December 2014 - 09:58 AM

Of the three encoding choices (ANSI, ANSEL, IBMPC), only ANSEL is part of the GEDCOM standard.

 

ANSEL can represent many accented characters (see Wikipedia entry). Unfortunately, ANSEL is supported by almost nothing except GEDCOM-related software, so viewing a GEDCOM file encoded in ANSEL is problematic. I do not know if TMG exports ANSEL correctly.

 

ANSI is not a standard character set supported by GEDCOM, and it is also a problem because there are multiple versions of "ANSI". For example, a person using a Norwegian version of MS Windows will see different characters in ANSI than someone using a US English version of MS Windows.
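The ambiguity of "ANSI" can be illustrated with a short sketch that decodes the same three bytes under several Windows code pages. The helper name `decode_under` is made up for this example, and the choice of code pages is an assumption made for contrast. As far as I know, Western European versions of Windows (including Norwegian and US English) share code page 1252, so the difference is easiest to see against the Central European (1250) or Cyrillic (1251) pages.

```python
def decode_under(raw: bytes, codecs):
    """Decode the same bytes under each named codec and return {codec: text}."""
    return {codec: raw.decode(codec) for codec in codecs}


# The bytes that Windows-1252 uses for the lowercase letters æ, ø and å.
raw = bytes([0xE6, 0xF8, 0xE5])

for codec, text in decode_under(raw, ["cp1252", "cp1250", "cp1251"]).items():
    print(codec, "->", text)
```

Only the first decoding yields æøå. The bytes in the file are identical in all three cases; what changes is the table used to display them.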

 

The definition of "IBMPC" is not precise. I believe it typically refers to code page 437 (see Wikipedia entry). It has some accented characters, but fewer than most versions of ANSI because it uses many of the 256 code points for line drawing characters and other graphics.
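If "IBMPC" does mean code page 437 (an assumption, since the thread does not confirm what TMG actually writes), the coverage of the Norwegian letters can be checked with Python's built-in codecs. The helper name `encodable` is hypothetical. Code page 865, the Nordic variant of the DOS code page, is included only for comparison.

```python
def encodable(text: str, codec: str):
    """Map each character to its encoded bytes, or None if the codec lacks it."""
    result = {}
    for ch in text:
        try:
            result[ch] = ch.encode(codec)
        except UnicodeEncodeError:
            result[ch] = None
    return result


letters = "æøåÆØÅ"
for codec in ("cp437", "cp865"):
    print(codec)
    for ch, encoded in encodable(letters, codec).items():
        print("  ", ch, encoded.hex() if encoded else "not available")
```

Under this assumption, code page 437 covers æ, å and their capitals but has no ø or Ø, which alone would be enough to make it unsuitable for Norwegian names.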

 

Whichever encoding you choose, if you review the GEDCOM file in a text editor, you have to make sure you use a text editor that supports that character encoding. If I were you, I'd choose ANSEL and then do a test import into your chosen genealogy program. It should support ANSEL. Most do, although many seem to have trouble with a subset of the more challenging ANSEL characters, notably the "combining diacritic" characters.
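To make the "combining diacritic" point concrete, here is a minimal sketch of decoding just the Norwegian letters from ANSEL. Python has no built-in ANSEL codec, so the two small tables below are hand-written from my reading of the ANSEL code table and should be verified against the specification before being relied on. The function name `decode_ansel_subset` and the sample bytes are hypothetical. The key assumption is that ANSEL stores æ and ø as single bytes, but stores å as a "ring above" byte followed by a plain `a`.

```python
import unicodedata

# Partial tables only (assumed values; check against the ANSEL specification).
ANSEL_SPACING = {0xA2: "Ø", 0xA5: "Æ", 0xB2: "ø", 0xB5: "æ"}
ANSEL_COMBINING = {0xEA: "\u030A"}  # combining ring above


def decode_ansel_subset(data: bytes) -> str:
    """Decode ASCII plus the few ANSEL bytes needed for Norwegian letters."""
    out = []
    pending = []  # diacritics waiting for their base letter
    for b in data:
        if b in ANSEL_COMBINING:
            # ANSEL puts the diacritic BEFORE the base letter; Unicode puts it after.
            pending.append(ANSEL_COMBINING[b])
            continue
        if b < 0x80:
            out.append(chr(b))
        elif b in ANSEL_SPACING:
            out.append(ANSEL_SPACING[b])
        else:
            out.append("\uFFFD")  # byte not covered by this sketch
        out.extend(pending)
        pending.clear()
    # Compose base letter + combining mark into a single code point where possible.
    return unicodedata.normalize("NFC", "".join(out))


sample = b"Bj\xb2rn fra \xeaAs"
print(decode_ansel_subset(sample))  # decoded as ANSEL
print(sample.decode("cp1252"))      # what a Windows-1252 viewer would show
```

If the tables are right, this would also explain seeing unexpected "non-roman" characters: a viewer that assumes Windows-1252 shows the ANSEL bytes as other symbols, and å turns into two characters rather than one. A plain global search-and-replace therefore has to treat the ring-above byte plus the following letter as a pair.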



#4 JRB

JRB
  • Members
  • 2 posts

Posted 09 December 2014 - 05:03 AM

Thank you for the replies on Norwegian characters in GEDCOM. TMG must use a different export for GEDCOM than it uses for other reports. I will try ANSEL and use a text editor to globally replace the special characters with the correct ones. I am assuming that each special character maps to a unique substitute, even though it is displayed differently. Since I plan to export only once, the extra "brute force" work is better than trying to find the problem.






