Corruption of UCS-2 Little Endian in Notepad?
Thread poster: Zaure Batayeva (X)
Kazakhstan
English to Kazakh
Mar 24, 2011

Hello everyone,

This week I received an offer to edit a txt-file encoded in UCS-2 Little Endian with the help of a text editor such as Notepad. The language of the file was Kazakh, which uses Cyrillic characters.

When I submitted the edited file, the translation agency told me that the file was corrupted and proposed that I use Notepad++. I edited the file again in Notepad++, but apparently to no avail. The agency said my file was still corrupted.

I then sent the file to some colleagues, who told me that they could open and read the edited file in Notepad without any problems.

Since I am new to using Notepad, I do not know what has happened here. Does anyone on the forum know? Any comments or advice would be greatly appreciated.
 
Michael Grant
Japan
Japanese to English
Questions... Jul 29, 2011

All I can think of is that maybe you used a different character set setting than the file.
In Notepad you have to select the font as well as the character set. For example, the Verdana font has at least seven character set settings available for it (Format menu > Font), only one of which is Cyrillic...
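
A side note on the font versus character set point: the font setting only affects how text is displayed, while the encoding chosen on save decides which bytes end up in the file. The short Python sketch below is an illustration only, not a diagnosis of this particular file. The sample word and the helper name `fits_in` are assumptions made up for the example. It shows that UTF-16 LE can store Kazakh Cyrillic, whereas the Windows Cyrillic code page (cp1251) has no slot for letters such as Қ, which is one way "strange characters" can arise if a file is saved or reopened in the wrong encoding.

```python
# Hypothetical sample text: the Kazakh word for "Kazakhstan".
sample = "Қазақстан"

# UTF-16 LE stores each of these letters in exactly two bytes.
utf16_bytes = sample.encode("utf-16-le")


def fits_in(text: str, encoding: str) -> bool:
    """Return True if every character of text can be encoded in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


print(len(utf16_bytes))               # 9 letters * 2 bytes each
print(fits_in(sample, "utf-16-le"))   # UTF-16 covers all of Unicode
print(fits_in(sample, "cp1251"))      # cp1251 lacks Қ (U+049A)
```

If the last check fails for the real text, the file cannot survive a round trip through that code page, whatever font is selected.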

I do not have Notepad++ installed, so I can't tell you how that works...

It would be very helpful to have the actual file or, if that is impossible, then I would want to know:

1) What application was used to create the original text file?
2) What font was used?
3) What does the agency mean by "corrupted"?
- Does the file not open?
- Does it open but display badly?
4) What application is the agency using to open the file??

A lot of questions need to be answered before we can troubleshoot this without having the file ourselves...

Afterthought:
UCS-2 is an older character encoding that, according to the Unicode Consortium Web site, was replaced by UTF-16 in version 2.0 of the Unicode Standard.
(Source: http://www.unicode.org/faq/basic_q.html#14)
If the file does not have to be used for some obscure, older database system or application, then maybe they can get a UTF-16 or UTF-8 version of it...?
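
For Cyrillic text, which lies entirely in the Basic Multilingual Plane, UCS-2 and UTF-16 produce the same bytes, so the practical question is usually which encoding and byte order the file was actually saved in. One way to check is to look at the byte order mark (BOM) at the start of the file. The sketch below is a minimal Python example under that assumption; the function name `sniff_bom` and the demo bytes are hypothetical, and a file saved without a BOM will simply report `None`.

```python
import codecs
from typing import Optional

# UTF-32 BOMs are listed first because the UTF-32 LE BOM (FF FE 00 00)
# begins with the same two bytes as the UTF-16 LE BOM (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes) -> Optional[str]:
    """Return an encoding name based on the leading BOM, or None if absent."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


# Hypothetical stand-in for the first bytes of a file saved as UTF-16 LE.
demo = codecs.BOM_UTF16_LE + "Қазақстан".encode("utf-16-le")

encoding = sniff_bom(demo)
print(encoding)
# Skip the two BOM bytes before decoding the text itself.
print(demo[len(codecs.BOM_UTF16_LE):].decode("utf-16-le"))
```

For a real file, the same function could be applied to the first few bytes read with `open(path, "rb")`. If the agency's copy and the translator's copy report different results, that would point to an encoding change rather than a font problem.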

What does the agency mean by the word "corrupted"...? In what way is the file you sent to them unusable?

MGrant
 
Zaure Batayeva (X)
Kazakhstan
English to Kazakh
TOPIC STARTER
corrupted: strange characters appear Jul 31, 2011

Hello Michael,

Thank you for your assistance. It was a while ago, so I have already forgotten the details of the case. I just remember that they told me to work in UCS-2 Little Endian because the file was created in that font. I did. Then they asked me to install Notepad++ and start the project from scratch. I did. Nothing helped. The target language was Kazakh, so I suspect something was wrong with the Kazakh font versions. Anyway, I have lost the project).

Regards