Organizing glossaries and TMs
Thread poster: Web Chameleon
Web Chameleon
Italy
English to German
SITE LOCALIZER
Nov 21, 2019

Hi all!

Do you have any spectacular methods for organizing your various glossaries and TMs?
Which programs do you recommend? MemoQ? MultiTerm?

Many thanks for your appreciated help!

Have a nice afternoon!

Janine


 
Samuel Murray
Netherlands
Member (2006)
English to Afrikaans
@Janine Nov 21, 2019

Janine Vieler wrote:
Do you have any [particular] methods to organize your various glossaries and TMs?


My CAT tool, WFC, allows me to use just two TMs at a time, namely a read-only one and a read-write one, and only three read-write glossaries at a time. It also allows me to set up project templates (called INI files, or "setups") in which I can pre-configure the paths to TMs and glossaries.

So, I save all of my TMs and glossaries in one place, in subject-specific or client-specific subfolders, and then I create project templates that link to them. For some clients, I create a brand new read-write TM for each job (and for some clients, a brand new glossary, too), and in those project templates the fields for the read-write TM and the first glossary are empty (i.e. I create the TM and glossary each time I do a job). For other clients, I just keep using the same read-write TM and glossary over and over again, so in those cases the read-write TM and main glossary are saved in the same subfolders as the other TMs and glossaries, and the project template contains their paths, so that I can just load the template and start translating immediately when I get such a job.

(WFC unfortunately overwrites its own INI files far too easily, so I keep "good" copies in another location, and use a startup BAT file to copy the good INI files into the INI file's standard location every time I restart my computer.)
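The restore step described above could look roughly like the sketch below. It is written in Python rather than as a BAT file, and the folder locations, the `*.ini` pattern, and the function name `restore_templates` are assumptions for illustration, not details from the post.

```python
import shutil
from pathlib import Path


def restore_templates(good_dir: Path, live_dir: Path, pattern: str = "*.ini") -> list[Path]:
    """Copy every known-good template over its live counterpart.

    good_dir and live_dir are hypothetical locations: the backup folder
    holding the "good" copies and the folder the CAT tool reads from.
    """
    live_dir.mkdir(parents=True, exist_ok=True)
    restored = []
    for src in sorted(good_dir.glob(pattern)):
        dst = live_dir / src.name
        shutil.copy2(src, dst)  # overwrites the live file, keeps timestamps
        restored.append(dst)
    return restored
```

Run at startup (for example from a scheduled task), this puts the live templates back into their known-good state regardless of what was overwritten during the previous session.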

When I get jobs in other CAT tools, I convert their TMs and termbases/glossaries to the WFC format, and then sometimes use the client's TM as the read-write TM. When I get repeated work from a certain client, I try to save the TM with a name that includes the client's name, but since the WFC TMs are text files, I can easily search my old TMs by using ordinary file search. All the temporary TMs and glossaries that do not reside in my main TM/glossary folder are saved in the job's own folder and moved to an archive location as soon as the job is delivered.
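Because the TMs are plain text, the "ordinary file search" mentioned above can also be scripted. The following is a minimal sketch only: the `*.txt` pattern, the UTF-8 default, and the function name `search_tms` are assumptions, so adjust the extension and encoding to whatever your TM files actually use.

```python
from pathlib import Path


def search_tms(root: Path, phrase: str, pattern: str = "*.txt", encoding: str = "utf-8"):
    """Return (file, line number, line) for every line containing phrase.

    The search is case-insensitive and walks all subfolders of root.
    """
    needle = phrase.casefold()
    hits = []
    for tm_file in sorted(root.rglob(pattern)):
        # errors="replace" keeps the scan going if a file has odd bytes
        with tm_file.open(encoding=encoding, errors="replace") as handle:
            for line_no, line in enumerate(handle, start=1):
                if needle in line.casefold():
                    hits.append((tm_file, line_no, line.rstrip("\n")))
    return hits
```

Pointing `root` at the archive location would search delivered jobs as well as the main TM folder.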


Sergei Leshchinsky
Ukraine
Member (2008)
English to Russian
I did it my way © Nov 21, 2019

TMs. I used to have lots of TMs organized by job, project, and topic. Their number grew unmanageable, so I merged many of them by broader subject, like "General Technical" (all user/assembly/operation manuals), "IT" (the same for IT), "Life" (health, medicine, diet, cooking, ergonomics), "Legal" (accordingly) etc. Now I have about 20 TMs containing 300k to 1M TUs each. (Luckily, modern PCs handle large DBs easily.) I use them in combinations (where only one is a recordable "master"). Sometimes I work with a temporary master TM, and then import its content into some of the TMs. (I have TMs by topic in opposite language pairs, like En-Ru and Ru-En, En-Uk and Uk-En, so I grow both TMs from that temporary one -- the CAT tool can reverse the TM during import.)

Glossaries. I keep them in simple exchange formats and import into the CAT-tool as needed. Some of them stay there for long (for long-running projects) and grow, some are deleted after the job is delivered.

I work in memoQ.

[Edited at 2019-11-21 23:47 GMT]


 
Web Chameleon
Italy
English to German
TOPIC STARTER
SITE LOCALIZER
Thank you! Nov 22, 2019

Thank you very much for your replies.
At the moment I do the same thing, but in Trados, working with several TMs (not only two) at a time.
So I understand that MemoQ is a CAT tool, not a program to organize terminology/glossaries. Is that correct?


 
Web Chameleon
Italy
English to German
TOPIC STARTER
SITE LOCALIZER
Advantages of MultiTerm? Nov 22, 2019

What advantages would MultiTerm bring me? To be honest, when I search for a term I search only in my own language combination and rarely compare with other languages, as MultiTerm makes possible. I consult other languages only if the client comes with a set terminology.
Any advantages?


 
Samuel Murray
Netherlands
Member (2006)
English to Afrikaans
@Janine Nov 22, 2019

Janine Vieler wrote:
So I understand that MemoQ is a CAT tool, not a program to organize terminology /glossaries, is that correct?


Yes, MemoQ is a CAT tool just like Trados is a CAT tool.

Janine Vieler wrote:
Which advantages does MultiTerm bring me?


Very few people who do not have Trados use MultiTerm, so if you want to know about MultiTerm you should ask a question about it specifically in the Trados subforum:

https://www.proz.com/forum/sdl_trados_support-65.html


 
Multiverse Solutions s.r.o. (X)
Polish to English
Simplifying the archive Jan 7, 2020

The question is: do you really need more than a handful of resource files? The assumption is that you are the only operator of the CAT system and the only translator who uses it. If this is the case, you usually do not need to create separate files for separate projects. Your personal style (in both your native and target languages) is fairly structured, and every person's vocabulary range is limited to their own choices. This is especially true of highly specialised terminology: once you have acquired it, it will remain the same whenever you use it.

One TM for a lifetime will save you a lot of organisation problems. You basically need to create one TM for each language pair, like IT-DE. Due to language intricacies, it is also reasonable to create a separate reverse TM, which in this case would be DE-IT.

Proper TM maintenance will save you a lot of technical problems. Files may get corrupted, overwritten, or simply deleted accidentally. If you’re working on large or numerous projects, your TM files will quickly grow. For these reasons, I have developed my own TM handling method.

The archive TM (named accordingly ATM) is the bulk knowledge of all translations combined in one file. It is the background, read-only file. This file is generated incrementally from clean monthly files. In January 2020, we use the file named ATM DE-IT 1912.* as it includes all translations from previous years (until December 2019 inclusive).

On top of this archive file, I set up a new TM file for each month of the year (MTM), e.g. MTM DE-IT 20-01.* (‘-’ marks a separate monthly file). This read-write file is used for all translations for all customers in all fields. Why? Because it is a ‘memory’, not a ‘thematic dictionary’. With specific segmentation rules, the translated texts are logically split into frequently used universal segments (‘for example,’ or ‘acting in the capacity of’) and segments typical of the particular document. Having them all accessible in one file simplifies both organisation and translation.

The MTM file is subject to maintenance at the end of the month. It includes spell checking in the target language, removing duplicate entries, correcting language mismatches (IT-DE entry in a DE-IT file), removing corrupt segments, removing extra spaces, and finally reindexing. The idea is to have a clean, perfect file. With each successive month, MTMs are merged, duplicates are removed, and the files are reindexed and saved for backup. Thorough cleanup is not necessary, as it was already done in the MTM file.
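Part of this month-end cleanup can be sketched in code. The example below is an illustration under stated assumptions, not the poster's actual procedure: the `TU` record and the `clean` function are hypothetical names, mismatched-language entries are simply dropped rather than corrected, and spell checking and reindexing are left to the CAT tool.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class TU:
    """A hypothetical, simplified translation unit."""
    source_lang: str
    target_lang: str
    source: str
    target: str


def normalise(text: str) -> str:
    """Collapse runs of whitespace and trim both ends."""
    return re.sub(r"\s+", " ", text).strip()


def clean(units: list[TU], source_lang: str, target_lang: str) -> list[TU]:
    """Drop mismatched, empty and duplicate units; remove extra spaces."""
    seen = set()
    cleaned = []
    for tu in units:
        # Language mismatch, e.g. an IT-DE entry in a DE-IT file
        if (tu.source_lang, tu.target_lang) != (source_lang, target_lang):
            continue
        source, target = normalise(tu.source), normalise(tu.target)
        # Treat units with an empty side as corrupt
        if not source or not target:
            continue
        key = (source, target)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(TU(source_lang, target_lang, source, target))
    return cleaned
```

Normalising whitespace before comparing is what lets two entries that differ only in spacing count as duplicates.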

Such maintenance takes little time, but only when the TMs are small, which is why I chose monthly files. If you are part of a team that produces huge archives, switching to a weekly schedule would be better.
MTM files are saved for backup and merged with the last month’s ATM. In February, you will use the incremental huge ATM DE-IT 2001.* (in my case it is over 200 MB) and a lightweight MTM DE-IT 20-02.*
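The naming convention described here is regular enough to generate automatically. The helper names below (`mtm_name`, `atm_name`, `stm_name`) are hypothetical, and the sketch assumes the archive file is always labelled with the month before the working month, as in the examples in this post.

```python
from datetime import date


def mtm_name(pair: str, working_month: date) -> str:
    """Monthly read-write TM, e.g. 'MTM DE-IT 20-01'."""
    return f"MTM {pair} {working_month:%y-%m}"


def atm_name(pair: str, working_month: date) -> str:
    """Archive TM in use during working_month, labelled with the previous month."""
    year, month = working_month.year, working_month.month - 1
    if month == 0:  # January rolls back to December of the previous year
        year, month = year - 1, 12
    return f"ATM {pair} {date(year, month, 1):%y%m}"


def stm_name(pair: str, working_month: date, label: str) -> str:
    """Separate project TM, e.g. 'STM DE-IT 20-01 Firma Jahresabschluss'."""
    return f"STM {pair} {working_month:%y-%m} {label}"


print(atm_name("DE-IT", date(2020, 1, 15)))  # ATM DE-IT 1912
print(mtm_name("DE-IT", date(2020, 1, 15)))  # MTM DE-IT 20-01
```

File extensions are left off, since they depend on the CAT tool.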

Occasionally, separate TM files may be necessary or useful for a document or project. I create and maintain such files in the same way. The only difference is in the naming convention, for which I would use e.g. STM DE-IT 20-01 Firma Jahresabschluss.* Here, Firma stands for the business name (or anything else that identifies the subject). As you can see, I add one or two more words to identify the contents of the TM file.

At the end of the month, this file is merged with the monthly and archive TMs in the usual way. If your customer requires a no-reuse policy for confidentiality reasons, you simply do not merge this specific file and put it in the archive ‘as is’.

I use the same procedure for glossary files that I produce during the translation. Separate, commercial and special glossaries (TBs) have meaningful file names.

I hope the above will be of some help.


Web Chameleon
Italy
English to German
TOPIC STARTER
SITE LOCALIZER
Thank you! Jan 7, 2020

Thank you so much, Miroslav! Your post is really helpful! Thx!!

 
Vladimir Pochinov
Russian Federation
English to Russian
My 2, 3 or 4 cents :) Jan 8, 2020

Translation memory (TM)

I use industry-specific and client-specific TMs.

1. Industry-specific TMs

One of these is my 'big mama' TM for UN-related projects containing about 1,000,000 translation units (TUs) and covering terms used by various UN agencies.

The attribute fields used in the TM:

Symbol: e.g. A/RES/70/1 (these symbols are used to identify a document in the UN Official Document System)
Title: e.g. Transforming our world: the 2030 Agenda for Sustainable Development
Agency: e.g. UNDP, UNEP, UNICEF, UNIDO
Year:

The fields ensure consistent terminology and style as used by a specific UN agency. Regretfully, there is a lot of inconsistency between agencies. E.g. in Russian versions of the documents 'Executive Director' may be translated as 'директор-распорядитель' by one agency, while the other would use 'исполнительный директор'.

The 'Year' field helps track changes over time in case of a glossary overhaul by the relevant agency.
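To show how such attribute fields help, here is a small sketch of an agency-filtered lookup. The `TaggedUnit` record and `lookup` function are hypothetical and do not reflect any particular CAT tool's API. The agency names and symbols in the usage example are placeholders, because the post does not say which agency uses which translation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaggedUnit:
    """A translation unit carrying the four attribute fields listed above."""
    source: str
    target: str
    symbol: str
    title: str
    agency: str
    year: int


def lookup(units: list[TaggedUnit], source: str, agency: Optional[str] = None) -> list[TaggedUnit]:
    """Exact-match lookup, optionally restricted to one agency, newest first."""
    matches = [
        u for u in units
        if u.source == source and (agency is None or u.agency == agency)
    ]
    return sorted(matches, key=lambda u: u.year, reverse=True)


# Placeholder data: agencies, symbols and titles are invented for the example.
units = [
    TaggedUnit("Executive Director", "директор-распорядитель", "DOC/1", "Report 1", "Agency A", 2015),
    TaggedUnit("Executive Director", "исполнительный директор", "DOC/2", "Report 2", "Agency B", 2018),
]
for unit in lookup(units, "Executive Director", agency="Agency A"):
    print(unit.year, unit.target)
```

Sorting by year puts the most recent usage first, which is where the 'Year' field pays off after a glossary overhaul.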

This 'big mama' TM is enabled for lookup in every translation job, but I only update it after the job has been completed and delivered.

2. Client-specific TMs

I also create and maintain a separate TM for each UN agency using the same attribute fields. These (client-specific) TMs get updated in real-time during translation.

Termbase (TB)

I use industry-specific TBs: Legal, Financial, United Nations, etc.

The typical structure includes the following attribute fields at various levels (entry level, language level, term level):

Subject: e.g. for the 'Legal' TB this field contains the list of practice areas: 'Arbitration', 'Corporate', 'Intellectual Property', 'Litigation', etc.
Definition:
Source: (of definition)
Context:
Status: e.g. 'Standardized', 'Recommended', 'Outdated', 'Forbidden', etc.
Client: e.g. 'shareholder agreement' can be translated as 'акционерное соглашение', 'договор акционеров', 'договор между акционерами', depending on the client preferences
Note:
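The termbase structure above might be modelled roughly as follows. This is a sketch under stated assumptions: the class names, the placement of each field at entry or term level, and the rule for choosing a client's preferred term are illustrative choices, and the client names in the usage example are invented.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Term:
    """Term-level data: the translation plus its status and client."""
    text: str
    status: str = "Recommended"
    client: Optional[str] = None  # None means no client-specific preference


@dataclass
class Entry:
    """Entry-level data shared by all translations of one source term."""
    source_term: str
    subject: str
    definition: str = ""
    definition_source: str = ""
    context: str = ""
    note: str = ""
    targets: list[Term] = field(default_factory=list)


def terms_for_client(entry: Entry, client: str) -> list[str]:
    """Prefer the client's own terms; otherwise fall back to generic ones."""
    usable = [t for t in entry.targets if t.status not in {"Outdated", "Forbidden"}]
    specific = [t.text for t in usable if t.client == client]
    return specific or [t.text for t in usable if t.client is None]


# 'Client X' and 'Client Y' are placeholders, not real client preferences.
entry = Entry(
    source_term="shareholder agreement",
    subject="Corporate",
    targets=[
        Term("акционерное соглашение"),
        Term("договор акционеров", client="Client X"),
        Term("договор между акционерами", status="Outdated", client="Client Y"),
    ],
)
print(terms_for_client(entry, "Client X"))
```

Keeping 'Status' and 'Client' on each translation rather than on the entry is what allows one entry to hold several client-dependent renderings.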

You may want to add 'Acronym', 'Synonym', 'Antonym' as attribute fields.

Personally, I prefer to add separate entries for acronyms, synonyms, and antonyms, and use cross-references to the relevant term: e.g. UN - see United Nations

I also use cross-references in the 'Definition' field:

child protection register = "A central register kept by a local authority listing all the children in its area who are judged to be at continuing risk of significant harm and for whom there is a child protection plan. The principal purpose of the register is to make agencies and professionals aware of children judged to be at risk of significant harm and in need of active safeguarding."

Hope it helps.

[Edited at 2020-01-08 10:23 GMT]

