[deepamehta-devel] Translations DM-style

Malte Reissig mre at deepamehta.de
Wed Dec 14 12:06:22 CET 2016


Regarding your specific case and questions:

The best answer for you probably depends on how the multi-lingual 
data will enter DM. Do you plan to get your data in (1) programmatically, 
through importers or a migration, or (2) through users using (2.1) the 
dm4-webclient or (2.2) a custom editor developed within your project?

On 13.12.2016 09:05, Robert Schuster wrote:
> Hi alltogether,
> soon I'll have the task of providing translation support to DM-managed.
> In a first shot the project was realized without translation support and
> in a 2nd phase this is going to be added...
>
> So what I want to indicate is that I have basically 3 types of data:
> - facts, figures, statistics that do not need to be translated (a date
> stays a date, some statistics value too)
Yes, I think it is probably realistic to just care about support for 
Gregorian dates first,
see https://en.wikipedia.org/wiki/Gregorian_calendar
> - links to external resources that need to be provided in translated
> form, e.g. a link to the English variant of an article)
Wouldn't the right way to do this be to specify an HTTP "Accept-Language" 
header on the web resource request?
https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html
Using an RFC 5646 language tag as the value:
https://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.10
Otherwise, yes, in this case it is probably best to introduce a custom 
association type using language specifiers.
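
To illustrate the "Accept-Language" idea, here is a minimal Python sketch. 
The function name and the example.org URL are made up for illustration, 
and whether a given server actually honours the header is an assumption 
you would have to verify per resource.

```python
from urllib.request import Request


def build_localized_request(url: str, language_tag: str) -> Request:
    """Build a GET request that asks the server for a given language.

    language_tag is a language tag such as "en" or "de-DE".
    The request is only constructed here, not sent.
    """
    return Request(url, headers={"Accept-Language": language_tag})


# Hypothetical article URL, for illustration only.
request = build_localized_request("https://example.org/article", "en")

# urllib normalizes stored header names with str.capitalize(),
# so the lookup key is "Accept-language".
print(request.get_header("Accept-language"))
```

If the external site ignores the header, the request still succeeds but 
returns the default language, which is the case where storing one link 
per language is the fallback.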

> - text in one language that needs to be provided in another directly in DM
In this case I would go for the "properties" approach as it will be the 
cleanest. After my experience with integrating multi-lingual wikidata, 
and looking at our current storage implementation, this will be 
advantageous performance-wise and (probably also) usability-wise. Though I 
do not know if one has fine-grained control (think IndexModes) over the 
indexing of these new property values.
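
A rough sketch of what the "properties" approach could look like, as a 
plain in-memory Python model. This is not the DM API: the Topic class, 
the "translation.<language>" key scheme and the fallback order are all 
assumptions made up for illustration.

```python
from typing import Dict, Optional


class Topic:
    """Stand-in for a topic: one default value plus free-form properties."""

    def __init__(self, value: str) -> None:
        self.value = value
        self.properties: Dict[str, str] = {}


def property_key(language_tag: str) -> str:
    # Hypothetical key scheme, e.g. "translation.de".
    return "translation." + language_tag.lower()


def set_translation(topic: Topic, language_tag: str, text: str) -> None:
    """Store a translated value as a property of the topic."""
    topic.properties[property_key(language_tag)] = text


def get_translation(topic: Topic, language_tag: Optional[str]) -> str:
    """Look up the exact tag, then its primary subtag, then the default value."""
    if language_tag:
        tag = language_tag.lower()
        for candidate in (tag, tag.split("-")[0]):
            text = topic.properties.get(property_key(candidate))
            if text is not None:
                return text
    return topic.value


topic = Topic("Hello")
set_translation(topic, "de", "Hallo")
print(get_translation(topic, "de-AT"))  # falls back from "de-at" to "de"
print(get_translation(topic, "fr"))     # no translation, default value
```

The point of the sketch is that the untranslated value stays where it is 
and translations are attached next to it, so existing data from the first 
project phase would not need to be restructured.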
> Has anyone done something similar and can share the approach?
You can find a multi-lingual topic type definition here:
https://github.com/mukil/dm4-wikidata-toolkit/blob/master/src/main/resources/migrations/migration3.json
>
> Is there already consensus about this?
I think jri's proposal sounds clean and is probably the best way 
considering our current storage layer (see jri's posting). Though I 
currently cannot assess how this might be affected by the recently 
discussed change for an upcoming DM version regarding "Identity and 
Value" types. With that meta-model I think we might all look differently 
at the problems described in your case.

>
> What I am having in mind is introducing a custom association
> "translation" which hosts a language specifier and which I then
> associate manually with all the data that is translated.

We probably all need/want these "language specifiers" represented by 
"Topics", so that we can easily provide user interface elements which 
allow users to interactively switch to and select a specific language in 
our webclient (through a human-readable name).

Therefore I would be very happy if we as plugin developers could agree 
on creating a new plugin, e.g. "dm4-language-codes", introducing just 
these once and for all. I think it is vital for a growing set of plugins 
that we can rely on the very same language specifiers across 
applications/projects.
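
To make the idea of shared language specifiers a bit more concrete, here 
is a small Python sketch. The LanguageTopic class, the three example 
entries and the label format are assumptions for illustration only, not 
the content of an existing plugin.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class LanguageTopic:
    """Stand-in for a language specifier topic."""
    code: str  # language tag, e.g. "de"
    name: str  # human-readable name shown in a language switcher


# Small illustrative subset; a shared plugin would define the full set once.
LANGUAGES: Dict[str, LanguageTopic] = {
    topic.code: topic
    for topic in (
        LanguageTopic("de", "Deutsch"),
        LanguageTopic("en", "English"),
        LanguageTopic("fr", "Français"),
    )
}


def switcher_options() -> List[str]:
    """Return the labels a language switcher could display, ordered by code."""
    ordered = sorted(LANGUAGES.values(), key=lambda topic: topic.code)
    return [f"{topic.name} ({topic.code})" for topic in ordered]


print(switcher_options())
```

Because every plugin would look up the same entries by code, a 
translation created in one application could be resolved in another 
without mapping between different language vocabularies.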


