CAT tools with dictionary and grammar | Thread poster: Bhante Medhavi
Hello,
Can anybody recommend any CAT tools in which grammatical information can be combined with dictionaries?
I am particularly interested in the translation of Pali, where it would be very useful for a CAT tool to display a list of all the possible tenses/persons/etc conditions that a conjugated word form could represent, or for that matter different grammatical conditions of different words with the same conjugated word form.
Although ideally one would obviously like to have the dictionary-with-grammar integrated in a translation memory tool, if that does not exist a popup dictionary-with-grammar would still be a useful tool.
One would have to be able to import and add to grammatical data.
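To make the request concrete, here is a minimal Python sketch of the kind of lookup described above: one inflected form mapped to every grammatical reading recorded for it, with a way to keep adding data. The class and field names are hypothetical, and the two Pali readings are only illustrative entries, not an excerpt from any existing dictionary or CAT tool.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Analysis(NamedTuple):
    lemma: str    # dictionary headword
    grammar: str  # free-text grammatical description
    gloss: str    # short English gloss


class FormDictionary:
    """Maps each inflected surface form to every analysis recorded for it."""

    def __init__(self) -> None:
        self._entries: Dict[str, List[Analysis]] = defaultdict(list)

    def add(self, form: str, lemma: str, grammar: str, gloss: str) -> None:
        """Record one more reading of a form, skipping exact duplicates."""
        analysis = Analysis(lemma, grammar, gloss)
        key = form.lower()
        if analysis not in self._entries[key]:
            self._entries[key].append(analysis)

    def lookup(self, form: str) -> List[Analysis]:
        """Return all recorded readings, or an empty list if none exist."""
        return list(self._entries.get(form.lower(), []))


# Illustrative entries only: the same form with two possible readings.
pali = FormDictionary()
pali.add("buddhassa", "buddha", "masc. a-stem, genitive singular", "of the Buddha")
pali.add("buddhassa", "buddha", "masc. a-stem, dative singular", "to/for the Buddha")

for a in pali.lookup("buddhassa"):
    print(f"{a.lemma}: {a.grammar} ({a.gloss})")
```

Because the grammar is stored as data rather than generated by rules, entries can be imported in bulk and extended gradually, which is the workflow asked about here.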
Thanks for any suggestions,
Bhante

Rod Walters | Japan | Local time: 01:32 | Japanese > English | Sort of possible with Trados | Feb 17, 2009
Bhante, a warm wai to you.
Japanese is inflected in a similar way probably, and in Trados I enter the various inflected forms in MultiTerm as I go. It's a laborious and gradual process, but it pays off.
If you started off with an Excel list of the most common forms, you could do a batch import into MultiTerm (please ask for details if you need them), and then add new ones on the fly, or in another later batch import.
It's very nice when the software identifies the form and gives you a correct or nearly correct word to drop into your text.

MultiTerm - Painstaking but powerful in the long run | Feb 17, 2009
Rod Walters wrote:
Japanese is inflected in a similar way probably, and in Trados I enter the various inflected forms in MultiTerm as I go. It's a laborious and gradual process, but it pays off.
I agree. I would not need to use MultiTerm to that extent, but I do keep a kind of "typing aid" in a separate MultiTerm termbase (not the terminological one) for source words that require a lot of typing in Spanish, mostly when I translate from German. An example: "Aufdachlüftungsheizanlage" (1 source word) > "sistema de ventilación y calefacción de techo" (7 target words).
Kevin Lossner | Portugal | Local time: 16:32 | German > English + ... | Technical structure of the data | Feb 17, 2009
Rod Walters wrote:
If you started off with an Excel list of the most common forms, you could do a batch import into MultiTerm (please ask for details if you need them), and then add new ones on the fly, or in another later batch import.
Wouldn't you ultimately be better off managing the inflected forms as synonyms in MultiTerm? Excel is certainly convenient for data entry from other sources, but when I import data that way, I end up spending a lot of time combining records afterward.
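The "combining records" step can be sketched in a few lines of Python. This is only an illustration of the idea of grouping inflected forms under one concept entry before import; the column names (`lemma`, `source_form`, `target`) are assumptions about how such a spreadsheet might be laid out, and the output is a plain dictionary, not MultiTerm's actual import format.

```python
import csv
import io
from typing import Dict, Iterable, List

# Hypothetical flat list, as it might be saved from a spreadsheet as CSV.
FLAT_ROWS = """lemma,source_form,target
gehen,gehen,go
gehen,ging,go
gehen,gegangen,go
Haus,Haus,house
Haus,Häuser,house
"""


def group_by_concept(rows: Iterable[Dict[str, str]]) -> Dict[str, Dict[str, List[str]]]:
    """Collect all forms and targets that share a lemma into one record."""
    concepts: Dict[str, Dict[str, List[str]]] = {}
    for row in rows:
        entry = concepts.setdefault(row["lemma"], {"forms": [], "targets": []})
        if row["source_form"] not in entry["forms"]:
            entry["forms"].append(row["source_form"])
        if row["target"] not in entry["targets"]:
            entry["targets"].append(row["target"])
    return concepts


concepts = group_by_concept(csv.DictReader(io.StringIO(FLAT_ROWS)))
for lemma, entry in concepts.items():
    print(lemma, entry["forms"], entry["targets"])
```

Grouping before import means irregular forms such as "ging" end up next to their headword instead of appearing as separate, unrelated records.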
I don't know if it would be relevant to the translation of Pali, Thai or Laotian texts, but I find that the fuzzy matching of words with MultiTerm is pretty good, i.e. having "widerstandsfähig" in the database means that I don't have to enter all the various derived forms, and if there isn't an exact match, close matches like "Widerstandsfähigkeit" will also be in the list displayed (together with various descriptive attributes - that is one of the strengths of MultiTerm compared with the tools I use more often for my work).

Rod Walters | Japan | Local time: 01:32 | Japanese > English | Not quite sure what you mean | Feb 17, 2009
Kevin, by "manage the inflected forms as synonyms in MultiTerm", do you mean by reducing term recognition Minimum Match Value to something low?
I just checked my setting for that and found it was 100%, although I'm sure I had it set to about 30% earlier.
I'll have to try with the lower setting (again?) to see if it handles the inflexions gracefully. Like the Finance Minister of Japan, I have quite a taste for cough mixture, but I seem to remember MultiTerm being more responsive at an earlier date, perhaps before an upgrade.

Kevin Lossner | Portugal | Local time: 16:32 | German > English + ... | Updates & data recognition/management | Feb 17, 2009
It seems every time I do a software update something like that happens, especially where Microsoft is involved somehow.
I haven't been imbibing cough syrup today, but I'm probably tired enough to be reduced to semi-coherency. What I meant, Rod, was that where MultiTerm fails to recognize two forms of a word as being equivalent (which might happen in German, for example, if the form is very irregular), it might be worthwhile managing it as a "synonym" (even if it isn't one in a strict sense) under the same concept entry. Otherwise it shows up as a separate item in the list, which may cause one to be misled, depending on how data are structured and managed.
30% was your setting? Wow. I'm not sure I'd go that low with my languages, though I probably should try some adjustments and see if some irregular forms are dealt with better.
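For anyone wondering what a match threshold does in practice, here is a rough Python sketch. It is not MultiTerm's algorithm: the similarity ratio from the standard library's `difflib.SequenceMatcher` merely stands in for the Minimum Match Value, and the function name and term list are made up for the illustration.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def fuzzy_lookup(word: str, terms: List[str], min_match: float) -> List[Tuple[str, float]]:
    """Return (term, score) pairs scoring at least min_match, best first."""
    scored = []
    for term in terms:
        score = SequenceMatcher(None, word.lower(), term.lower()).ratio()
        if score >= min_match:
            scored.append((term, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


TERMS = ["widerstandsfähig", "Widerstand", "Wirtschaft"]

# A threshold of 1.0 accepts exact matches only, so the derived form finds nothing.
strict = fuzzy_lookup("Widerstandsfähigkeit", TERMS, min_match=1.0)

# A low threshold lets related (and possibly unrelated) terms through.
loose = fuzzy_lookup("Widerstandsfähigkeit", TERMS, min_match=0.3)

print("strict:", strict)
for term, score in loose:
    print(f"loose: {term} ({score:.2f})")
```

The trade-off is the one discussed here: a low threshold catches more inflected and derived forms, at the cost of more noise in the hit list.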
One of the great strengths of MultiTerm for me is the ability to model the word relationships in a fairly sophisticated way as well as keep track of attribute data (like companies, language variants, product lines, etc.) and generate decently formatted dictionaries for ordinary folks with just a little tweaking of the standard templates. Maintaining simple 1:1 bilingual term lists with MultiTerm (especially where there are real synonyms involved) is a waste of its potential and brings the tool down to the same level as those I use with DVX and other CAT environments. I might get more practical use out of those other forms (and I might indeed do very simplified data management in MultiTerm if I have limited time to invest), but I haven't seen anything else out there that my budget can deal with that I like as much as MultiTerm for something approaching real dictionary management.
Note to Bhante: Unlike some other tools, MultiTerm will also let you build multilingual databases, so if you want to have the same term in Pali, Thai, Lao, Chinese, Sanskrit, etc. under one concept entry, you can do that. I doubt that's possible with the tool you asked about before (Wordfast), but I'm not an expert on it, so others will have to confirm or refute that statement.

Rod Walters | Japan | Local time: 01:32 | Japanese > English | Big grab bag | Feb 18, 2009
My use of MultiTerm has evolved through use and frobbing rather than developed through planning and RTFM.
The result is essentially two types of termbase. One, named "Big Termbase", is a glorious hodgepodge of arcane technical terminology mixed in with common expressions like "However," and "smoothly". The other type includes pristine client-specific or less pristine field-specific termbases.
Very few writers of Japanese are strictly logical, so to make a readable English translation, you often have to convert adjectives to adverbs or vice versa, and make passive verbs active. This means that having a grab bag of soft-edged parts is more valuable than having a row of precision parts to choose from. I love Big Termbase like it's some kind of friendly, intelligent animal. Often when I come across some particularly foreign formulation, I find that Big Termbase already has two suggestions for translating it. (Some people may be shocked to learn that I rely heavily on it for writing advertising copy too.) A low recognition threshold just gives you more choices where strict precision is not required.
So far, I haven't found any need for concept entries, nor would I know how to go about setting them up (a case of being Unskilled and Aware of It). This is something I might investigate further to see if it has any value in my case. A text Note is the most sophisticated addition I've needed so far to 1:1 lists.
Can we conclude that what Bhante wants is quite possible in MultiTerm, but may not be necessary for practical purposes (although he and his team can decide that)?
[Edited at 2009-02-18 05:39 GMT]
Samuel Murray | Netherlands | Local time: 17:32 | Member (2006) | English > Afrikaans + ... | OmegaT, if you have your own dictionary | Feb 18, 2009
Bhante Medhayo wrote:
Although ideally one would obviously like to have the dictionary-with-grammar integrated in a translation memory tool, if that does not exist a popup dictionary-with-grammar would still be a useful tool.
You're asking for four things:
1. A comprehensive bilingual list of conjugations etc for your language combination
2. A tool to do automatic searches through your source text segment for it
3. A tool to display the search result
4. A CAT tool that allows you to feed your source text segment to the tool in #2.
Well, for #4 you can try OmegaT. Version 2.0.1 can automatically export the current source text to a file, which you can then feed to the tool in #2, to put the rest of the process in action.
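As a rough illustration of how #2 and #3 could hang off such an exported file, here is a small Python sketch. Everything in it is an assumption made for the example: the file name, the dictionary layout and the sample Pali entries are hypothetical, and a temporary file stands in for whatever the CAT tool actually writes.

```python
import re
import tempfile
from pathlib import Path
from typing import Dict, List

# Hypothetical stand-in for #1: inflected form -> list of readings.
FORMS: Dict[str, List[str]] = {
    "buddhassa": ["buddha, genitive singular", "buddha, dative singular"],
    "dhammo": ["dhamma, nominative singular"],
}


def analyse_segment(path: Path, forms: Dict[str, List[str]]) -> List[str]:
    """Read the exported segment (#2) and build one display line per reading (#3)."""
    text = path.read_text(encoding="utf-8")
    lines = []
    for token in re.findall(r"\w+", text.lower()):
        readings = forms.get(token)
        if readings:
            lines.extend(f"{token}: {reading}" for reading in readings)
        else:
            lines.append(f"{token}: (no entry)")
    return lines


# Simulate the export with a temporary file; the name is made up.
with tempfile.TemporaryDirectory() as tmp:
    exported = Path(tmp) / "current_segment.txt"
    exported.write_text("Buddhassa dhammo sukho", encoding="utf-8")
    report = analyse_segment(exported, FORMS)

for line in report:
    print(line)
```

A real setup would rerun the lookup whenever the exported file changes and show the result in a small window next to the CAT tool.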
The first question is, though, whether you have #1 already, and if not, how you will design it so that #2 and #3 can make optimal use of it.

Kevin Lossner | Portugal | Local time: 16:32 | German > English + ... | OmegaT, MultiTerm | Feb 18, 2009
@Samuel: Any plans to do fuzzy matching of terms in OmegaT? The lack of this was sorely felt in the little bit of testing I did. And a statement from Marc (I think) about no divergent translations for the same source segment bothered me, though it wasn't an issue in the texts I tested.
@Rod: From a technical perspective, I think it's been clear for a couple of years that MultiTerm is probably the most viable solution to Bhante's problem (that's how long discussions on this and related issues have been running). However, being limited to two common European languages, I am unable to judge and advise on the complexities of dealing with many Asian languages in that or any other environment, so I'm glad that you are able to make a contribution in that respect, and I hope some Thai translators and others are able to chip in with specific hints for their language. Bhante's got a couple of pretty hefty public service translation projects for unpublished (in English) works of Buddhadasa and others in mind, and the consequences of bad technical choices could get very uncomfortable down the road. The downside to Trados is that many of those who may become involved in some of his projects may be unable to afford it as a working tool, but it still may make the most sense as a central platform for data management, even if a lot of "satellite work" is done in OmegaT or other cheap/free tools. (He can always export the data in a number of ways as you know.)
I had to laugh at your description of "Big Termbase". That sounds like the Japanese version of my big, sloppy general German termbase, the contents of which do often surprise me. As for changing the grammatical form of words, I'm sure that's a universal thing for good translation. Whenever someone grumbles that I've changed a verb into a noun in a sentence for the sake of clarity or some such thing, I just shrug and roll my eyes at the insistence on idiotic schoolbook literalism. Good fuzzy capabilities that can capture all or most of the variations on a root are really, really useful.

Bhante Medhavi | Local time: 23:32 | Pali > English + ... | TOPIC STARTER | Many thanks to all of you for your support | Feb 18, 2009
Wow! What an interesting set of replies, and so quickly! I am about to go off to Thailand for just over a month, and unfortunately seriously looking into MultiTerm will have to wait until I get back. In the meantime I will read the forum as far as is possible while I am away, but if I am silent for the period that won't mean a lack of interest!
With many thanks to you all
Bhante