With Hindi keyboard, NVDA announces Hindi symbol 9 4 D on using SHIFT with the 3, 4, 5, 6, 7 and 8 before announcing the actual character typed. #4487

Closed
nvaccessAuto opened this issue Sep 24, 2014 · 10 comments
Labels
bug close/duplicate component/i18n existing localisations or internationalisation

Comments

@nvaccessAuto

Reported by sumandogra on 2014-09-24 08:24
With the Hindi keyboard of MS Windows, NVDA announces "Hindi symbol 9 4 D" when SHIFT is used with the 3, 4, 5, 6, 7 and 8 keys, before announcing the actual character typed. Only the actual character typed should be announced for these keystrokes.

Steps:

  1. Activate the Hindi keyboard through Control Panel.
  2. Open MS Word and switch to Hindi with the ALT+SHIFT command.
  3. Now use the following keystrokes to type the Hindi characters:
    SHIFT+3(र), SHIFT+4(र्), SHIFT+5(ज्ञ), SHIFT+6(त्र), SHIFT+7(क्ष), SHIFT+8(श्र).

Behavior:
Hindi symbol 9 4 D is announced before र is announced on pressing SHIFT+3.
Hindi symbol 9 4 D is announced after typing र् with SHIFT+4.
Hindi symbol 9 4 D is announced after pressing SHIFT+5 to write ज्ञ.
Hindi symbol 9 4 D is announced after pressing SHIFT+6 to write क्त्र.
Hindi symbol 9 4 D is announced after pressing SHIFT+7 to write श्क्ष.
Hindi symbol 9 4 D is announced after pressing SHIFT+8 to write श्र.

Blocked by #1428

@nvaccessAuto

Comment 1 by blindbhavya on 2014-09-24 10:09
Hi.
This is an eSpeak issue.
It has been fixed in later versions of eSpeak.

@nvaccessAuto

Comment 3 by jteh on 2014-09-24 10:41
Leaving open so we can track this. Also, I assume you're using the Hindi voice (not English) with eSpeak?
Changes:
Changed title from "With Hindi keyboard, NVDA announces Hindi symbol 9 4 D on using SHIFT with the 3, 4, 5, 6, 7 and 8 before announcing the actual character typed." to "eSpeak: With Hindi keyboard, NVDA announces Hindi symbol 9 4 D on using SHIFT with the 3, 4, 5, 6, 7 and 8 before announcing the actual character typed."

@nvaccessAuto

Comment 4 by siddhartha_iitd on 2014-09-25 18:41
This issue touches on the concept of glyphs and ligatures. Internally, a ligature is represented by its components: there is no single Unicode code point for the ligature, only the sequence of code points of its constituent characters.
The key to resolving this issue during character navigation is to determine whether a character is part of a ligature or a standalone character.
I'm not sure whether there is a generic way to get this information that works for all locales.
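
To illustrate the point above, here is a minimal Python sketch (not NVDA code) that lists the code points behind a conjunct and flags characters that sit next to a virama. The helper names `code_points` and `part_of_conjunct` and the virama-only rule are assumptions made for this example. It only covers Devanagari, which is exactly the locale-specific limitation mentioned.

```python
import unicodedata

VIRAMA = "\u094d"  # DEVANAGARI SIGN VIRAMA (halant)


def code_points(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


def part_of_conjunct(text, index):
    """Hypothetical heuristic: a character belongs to a conjunct if it is a
    virama, or if a virama sits directly before or after it."""
    if text[index] == VIRAMA:
        return True
    before = index > 0 and text[index - 1] == VIRAMA
    after = index + 1 < len(text) and text[index + 1] == VIRAMA
    return before or after


jnya = "\u091c\u094d\u091e"  # the conjunct typed by SHIFT+5 in the report
for cp, name in code_points(jnya):
    print(cp, name)
```

A real implementation would likely need per-script rules or a text segmentation library rather than this single check.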

@nvaccessAuto

Comment 5 by jteh on 2014-09-25 23:04
Actually, this isn't related to eSpeak. I'm pretty sure that even the current stable version of eSpeak already knows how to speak these compound characters.

This is a duplicate of #1428.

The only one that confuses me is र (shift+3). That is only a single Unicode character. @sumandogra, are you absolutely certain symbol 94d is announced before that one? I understand why it would be for the others, but not that one.
Changes:
Added labels: duplicate
State: closed

@nvaccessAuto

Comment 6 by jteh on 2014-09-25 23:09
Changes:
Changed title from "eSpeak: With Hindi keyboard, NVDA announces Hindi symbol 9 4 D on using SHIFT with the 3, 4, 5, 6, 7 and 8 before announcing the actual character typed." to "With Hindi keyboard, NVDA announces Hindi symbol 9 4 D on using SHIFT with the 3, 4, 5, 6, 7 and 8 before announcing the actual character typed."

@nvaccessAuto

Comment 7 by dhankuta on 2014-09-26 13:12
Dear Jamie,
I can confirm that shift+3 contains the 094d character at the beginning.
It is used for writing clusters like pra, bru, gro, i.e. a consonant followed by the letter r.

Let us be clear.

  1. This is not an NVDA issue.
  2. The so-called wrong pronunciation described above occurs only while typing, i.e. when the typed characters are spoken, and not in general reading. In reading, NVDA does not report the elementary characters of which a compound letter is composed.
  3. In eSpeak, compound letters are given a proper phoneme for their final shape, and it does not pronounce the names of the elementary characters which compose a compound letter! But the same does not hold when key press events are reported.

The real problem:

  1. Indic languages have more letters than Latin (English), but we use the English keyboard. To accommodate all the letters, keyboard layout developers use the following principles.
  • They often assign a single physical key to more than one character, all typed at once, i.e. one key, multiple characters!
    SHIFT+3 to SHIFT+8 of the Hindi Traditional keyboard layout used by the ticket creator is an example.
  • Inversely, some assign multiple keys of a keyboard to a single character.
    For example, in another Hindi layout called Saraswati, you can find:
    key k is assigned to the first consonant of Devanagari.
    key h is assigned to the last consonant of Devanagari.
    Interestingly, key k plus key h is assigned to the second consonant of Devanagari.
    Multiple keys for a single character.
    Here, if key h is pressed just after key k, the first consonant typed by key k is erased and, instead of the last consonant that key h normally types, the second consonant is generated. A script in the background determines the output.
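
The "multiple keys for a single character" behaviour described above can be sketched roughly as follows. This is only an illustration inferred from the description. The mapping tables and the `type_keys` function are hypothetical and are not taken from the actual Saraswati layout.

```python
# Hypothetical mapping, inferred only from the description above.
SINGLE_KEYS = {"k": "\u0915", "h": "\u0939"}  # KA (first consonant), HA (last consonant)
KEY_PAIRS = {("k", "h"): "\u0916"}            # KHA (second consonant)


def type_keys(keys):
    """Simulate typing: a recognised key pair replaces the character
    produced by its first key."""
    output = []      # characters currently on screen
    previous = None  # previous key, if it can still start a pair
    for key in keys:
        if (previous, key) in KEY_PAIRS:
            output.pop()  # erase the character typed by the first key
            output.append(KEY_PAIRS[(previous, key)])
            previous = None
        else:
            output.append(SINGLE_KEYS[key])
            previous = key
    return "".join(output)


print(type_keys("k"), type_keys("h"), type_keys("kh"))
```

In this sketch the screen briefly shows the first consonant before it is replaced, so a screen reader that speaks each typed character would hear both.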

Basically, the pronunciation by NVDA/eSpeak with the Hindi key layout is absolutely correct.
It is pronouncing the exact characters which are typed on the screen.
Theoretically, there is no error at all! Thanks to NVDA and eSpeak, they are doing their duty perfectly: 'exact reporting of the event that occurred'.
It is a different matter that the user wants the same spoken as a compound character phoneme. But there is no harm if the elementary characters are pronounced!

Here is the source code of the Hindi keyboard layout used by Suman.
04 3 0 3 %% -1 -1 0969 0023 // DIGIT THREE, , , , DEVANAGARI DIGIT THREE, NUMBER SIGN
05 4 0 4 %% -1 -1 096a 0024 // DIGIT FOUR, , , , DEVANAGARI DIGIT FOUR, DOLLAR SIGN
06 5 0 5 %% -1 -1 096b 0025 // DIGIT FIVE, , , , DEVANAGARI DIGIT FIVE, PERCENT SIGN
07 6 0 6 %% -1 -1 096c 005e // DIGIT SIX, , , , DEVANAGARI DIGIT SIX, CIRCUMFLEX ACCENT
08 7 0 7 %% -1 -1 096d 0026 // DIGIT SEVEN, , , , DEVANAGARI DIGIT SEVEN, AMPERSAND
09 8 0 8 %% -1 -1 096e 002a // DIGIT EIGHT, , , , DEVANAGARI DIGIT EIGHT, ASTERISK

LIGATURE

3 1 094d 0930 // DEVANAGARI SIGN VIRAMA + DEVANAGARI LETTER RA
4 1 0930 094d // DEVANAGARI LETTER RA + DEVANAGARI SIGN VIRAMA
5 1 091c 094d 091e // DEVANAGARI LETTER JA + DEVANAGARI SIGN VIRAMA + DEVANAGARI LETTER NYA
6 1 0924 094d 0930 // DEVANAGARI LETTER TA + DEVANAGARI SIGN VIRAMA + DEVANAGARI LETTER RA
7 1 0915 094d 0937 // DEVANAGARI LETTER KA + DEVANAGARI SIGN VIRAMA + DEVANAGARI LETTER SSA
8 1 0936 094d 0930 // DEVANAGARI LETTER SHA + DEVANAGARI SIGN VIRAMA + DEVANAGARI LETTER RA
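
As a quick check of the layout source above, the following Python sketch parses the LIGATURE rows and prints what each key types. It assumes the second field is the modifier column (1 being Shift here). The function name `parse_ligatures` is made up for this example.

```python
import unicodedata

# LIGATURE rows copied from the layout source above (comments removed).
LIGATURE_ROWS = """\
3 1 094d 0930
4 1 0930 094d
5 1 091c 094d 091e
6 1 0924 094d 0930
7 1 0915 094d 0937
8 1 0936 094d 0930
"""


def parse_ligatures(rows):
    """Map each key to the string of characters it types in one press."""
    table = {}
    for line in rows.splitlines():
        # Fields: key, modifier column (assumed), then one hex code point per character.
        key, _modifier, *hex_points = line.split()
        table[key] = "".join(chr(int(h, 16)) for h in hex_points)
    return table


ligatures = parse_ligatures(LIGATURE_ROWS)
for key, text in ligatures.items():
    names = " + ".join(unicodedata.name(ch) for ch in text)
    print(f"SHIFT+{key}: {len(text)} code points: {names}")
```

Every one of these keys emits U+094D, and SHIFT+3 is the only one whose output starts with it.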

Possible Solution:
Do not use a keyboard layout:

  • which assigns one key to multiple characters.
  • which assigns multiple keys to a single character.

Instead, use a typing system which strictly follows the one key, one character principle.
To overcome the issue raised by Suman, which was already known to me, I developed a keyboard layout named Varnamala.
It follows the single key, single character principle and is available for all Indic languages.
Alternatively, if a TTS allowed different pronunciations to be assigned to the same character for reading and for typing, that might work.
However, I do not see anything wrong with the pronunciation of the typed characters in the Hindi Traditional layout.
I am not clear what is wrong if the exact typed characters are pronounced together.

@nvaccessAuto

Comment 8 by blindbhavya on 2014-09-26 16:31
Also, character 94d is actually a halanta ्, which was read wrongly by older versions of eSpeak. In the latest development versions, 94d is correctly read as हलंत.
Also, very strangely, after CCing myself on this ticket, I received three separate e-mail threads with approximately two comments each.
Any possible causes why this happened?

@nvaccessAuto

Comment 9 by jteh (in reply to comment 7) on 2014-09-29 01:33
Replying to dhankuta:

I can confirm that shift+3 contains the 094d character at the beginning.

I'm confused because the reporter indicated that shift+3 produces "र", which is a single Unicode character and contains no U+094d character. So, what does shift+3 actually produce?

@nvaccessAuto

Comment 10 by dhankuta on 2014-09-30 17:18
Hi Jamie,
Let me clarify.
First of all, we must understand the characteristics of consonants in Devanagari.
In Devanagari, unlike in English, consonants have an inherent 'a' phoneme built in, like ba, ca, da, fa, ga, ha....
Whenever we have to write a consonant without the inherent vowel, like b, c, d, f, g etc., the 094d character is put just after the consonant, i.e. it removes the inherent 'a' vowel phoneme from a plain consonant.
If a consonant is joined to another consonant, like cl, sh, pr, gh etc., the inherent 'a' phoneme of the first consonant must be removed. Hence the 094d sign is put after the first consonant.
Take an example:
My middle name is prasad.
To write prasad, we must write 'pa' first because there is no simple 'p' consonant, but we need only 'p' because it has to combine with another consonant, ra! Hence, the 094d character (called halanta or virama) is put just after the consonant 'pa' and then the letter ra is written. In the keyboard layout which Suman is using, pressing shift+3 types 094d and the consonant ra together in one go.
Such typing can be done in two ways:
First method:

  1. Type the consonant 'pa' first.
  2. Type the 094d character to make the 'pa' consonant 'p' alone (i.e. remove the inherent 'a').
  3. Type the letter r afterwards (i.e. type 094d and ra separately!).

If this way of typing is followed, there is no problem.
Second method:

  1. Type the consonant 'pa' as above.
  2. Press a multi-character typing key which types 094d and ra at once, i.e. press a macro key!

Here, NVDA announces both of the characters, which the ticket creator considers a bug!
I hope the use of 094d before the letter r is clear.
In Suman's case, pressing shift+3 types the 094d character and the letter र! The 094d removes the inherent 'a' phoneme from the preceding consonant, which then combines with the letter र.
Let me make the exact issue raised by Suman even clearer.
Take an example:
Suppose that in an English keyboard layout someone assigned a key, say the key 'j', to type 'Jamie', i.e. the key 'j' is assigned a macro which types 'Jamie' on a single key press.
Now, on pressing the macro key 'j', it is announced as J a m i e (spelling mode) and not as 'jemi'!
I hope things are clear now.
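
The two typing methods can be compared with a small Python sketch. It is only an illustration: the function `spoken_per_character` is hypothetical, and Unicode character names stand in for whatever a screen reader would actually say for each typed character.

```python
import unicodedata

PA, VIRAMA, RA = "\u092a", "\u094d", "\u0930"

# First method: three key presses, one character each.
first_method = [PA, VIRAMA, RA]
# Second method: 'pa', then SHIFT+3, which types virama + ra in one press.
second_method = [PA, VIRAMA + RA]


def spoken_per_character(key_outputs):
    """Return the character names announced if every typed character
    is spoken separately, whichever key press produced it."""
    return [unicodedata.name(ch) for output in key_outputs for ch in output]


# Both methods leave the same text on screen.
assert "".join(first_method) == "".join(second_method)
print("".join(second_method))
print(spoken_per_character(second_method))
```

Both methods yield the same three code points, so per-character announcement speaks the virama either way. The only difference is whether it arrives from its own key press or as part of the SHIFT+3 macro.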

@nvaccessAuto

Comment 11 by jteh on 2014-09-30 23:07
Thanks for your explanation. I understand the composition of compound characters. My confusion is that the reporter provided only a single symbol produced by shift+3. I'm guessing that it actually produced two symbols (as you explained) but the reporter didn't include the 094d symbol for that case.
