Better support for handling compound characters in languages such as Korean and Tamil #2791
Comments
Comment 1 by jteh on 2012-11-09 11:12
Technical: You can decompose compound characters with unicodedata.normalize("NFD", compoundChar)
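As a rough illustration of that suggestion, the sketch below decomposes a single Korean syllable with NFD and lists each resulting code point with its Unicode name. The helper name `decompose` is made up for this example and is not part of NVDA.

```python
import unicodedata

def decompose(compound_char):
    """Return (code point, Unicode name) pairs for the NFD form of a character.

    Hypothetical helper for illustration only; not an NVDA function.
    """
    return [
        (f"U+{ord(part):04X}", unicodedata.name(part, "<unnamed>"))
        for part in unicodedata.normalize("NFD", compound_char)
    ]

# The precomposed syllable U+D55C splits into three conjoining jamo.
components = decompose("\ud55c")
for code_point, name in components:
    print(code_point, name)
```

Whether NFD gives a useful breakdown for other scripts would need to be checked per language, since not every visually compound character has a canonical decomposition.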
Comment 3 by blindbhavya on 2014-09-07 08:24
Comment 5 by jteh (in reply to comment 3) on 2014-09-25 23:12
NVDA+control+. and NVDA+control+. See section 5.5 of the User Guide.
They're not the same issue. This one hasn't yet been fixed.
@josephsl As the original author of this ticket, could you please respond to the questions asked in @jcsteh's #2791 (comment)?
Hi,
Thanks.
Hi, probably not at this point, unless I’m wrong (I have moved on from Korean translation, but we can ask translators once 2020.1 is released). Thanks.
@josephsl Since NVDA version 2020.1 has been released, this is a friendly reminder to reconsider #2791 (comment).
Hi, in this case, it would be better to let Korean users comment on this, as they can tell us if things have improved. Thanks.
@khsbory, @ungjinPark, @dnz3d4c could you please give an update on this issue as well? How is it working in NVDA 2023.1 Beta?
Closing this issue as abandoned, as there have been no updates from Korean users. For the Tamil language, I think this is already covered in #1428.
Reported by nvdakor on 2012-11-09 09:28
Hi,
Normally, Unicode assigns one code point per character. This works well for Latin-based scripts such as English. However, languages such as Korean and Tamil use compound characters, that is, several character components combine to make up a single character. A good example is Korean, where a character is composed of an initial consonant, a vowel, and zero, one or two final consonants.
Support for handling such compound characters would be useful in proofreading scenarios (such as pressing Numpad5 three times to spell the word with character descriptions). As of 2012.3, one of two things occurs:
There are scripts out there that calculate the Unicode values of the components that make up a single compound character. However, one concern is whether such calculations apply to just one language or to several. A related issue is covering all the test cases for compound characters, given the script differences between languages (which also involves looking up the base Unicode value for characters in each language).
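For Korean specifically, such a calculation can be sketched as follows. The constants come from the Hangul syllable decomposition algorithm in the Unicode Standard; the function name `decompose_hangul` is hypothetical, and the approach is Korean-only, which is exactly the per-language concern raised above.

```python
# Constants from the Unicode Standard's Hangul syllable decomposition algorithm.
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28
N_COUNT = V_COUNT * T_COUNT  # 588 syllables per initial consonant
S_COUNT = L_COUNT * N_COUNT  # 11172 precomposed syllables in total

def decompose_hangul(char):
    """Split one precomposed Hangul syllable into its conjoining jamo.

    Hypothetical helper for illustration; anything outside the
    precomposed syllable block is returned unchanged as a 1-tuple.
    """
    s_index = ord(char) - S_BASE
    if not 0 <= s_index < S_COUNT:
        return (char,)
    lead = chr(L_BASE + s_index // N_COUNT)
    vowel = chr(V_BASE + (s_index % N_COUNT) // T_COUNT)
    t_index = s_index % T_COUNT
    if t_index == 0:
        # No final consonant.
        return (lead, vowel)
    return (lead, vowel, chr(T_BASE + t_index))

parts = decompose_hangul("\ud55c")  # U+D55C
print([f"U+{ord(p):04X}" for p in parts])
```

Note that a double final consonant is encoded as a single trailing jamo, so this returns at most three components per syllable; breaking a final cluster into two separate consonants would need an extra mapping.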
In the end, the ideal result would be this: when a user presses the review current character/word command two or three times to obtain character descriptions, NVDA would announce the individual components of a compound character. Translators would then not need to write tens of thousands of possible compound character combinations in characterDescriptions.dic, which would help performance as well.
Thanks.