[css-text-3][css-text-4] Japanese small kana and line-break: normal #10363

Open
fantasai opened this issue May 23, 2024 · 4 comments
Labels
Agenda+, css-text-3 (Current Work), css-text-4, i18n-jlreq (Japanese language enablement), i18n-tracker (Group bringing to attention of Internationalization, or tracked by i18n but not needing response)

Comments

@fantasai
Collaborator

fantasai commented May 23, 2024

It turns out that WebKit has been shipping a prohibition against breaks before small kana for a while. But only sometimes: it depends on whether the language is tagged or not (see testcase). Tagged content is looser, untagged content is stricter. Firefox, afaict, never allows breaks before small kana.

I think this opens up a question: should we reconsider, and make line-break: normal prohibit breaks before small kana by default? Since the change seems to be Web-compatible, would that be more natural for Japanese?

@fantasai fantasai added css-text-3 Current Work css-text-4 i18n-jlreq Japanese language enablement Agenda+ i18n Add to agenda for CSS-i18n calls labels May 23, 2024
@fantasai
Collaborator Author

fantasai commented May 23, 2024

https://www.w3.org/TR/css-text-3/#line-break-property

Proposed text would be

Option A: Allow either behavior in normal.

The following breaks are forbidden in strict line breaking and allowed in loose; in normal they may be either forbidden or allowed:
breaks before Japanese small kana or the Katakana-Hiragana prolonged sound mark, i.e. characters from the Unicode line breaking class CJ. [UAX14]

Option B: Forbid breaks before small kana in normal.

The following breaks are forbidden in strict and normal line breaking and allowed in loose:
breaks before Japanese small kana or the Katakana-Hiragana prolonged sound mark, i.e. characters from the Unicode line breaking class CJ. [UAX14]
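
To make the difference concrete, here is a rough Python sketch of the CJ rule alone, comparing the current spec behavior with Option B. Everything in it is illustrative: `CJ_SUBSET`, `CURRENT_SPEC`, `OPTION_B` and `forbidden_breaks` are hypothetical names, the character list is only a subset of line breaking class CJ, and all other UAX14 rules are ignored.

```python
# Illustrative subset of Unicode line breaking class CJ (UAX #14).
# Assumption: this is NOT the complete class; a real implementation
# would read the class from the Unicode LineBreak.txt data file.
CJ_SUBSET = frozenset(
    chr(cp)
    for cp in (
        0x3041, 0x3043, 0x3045, 0x3047, 0x3049,  # hiragana small a, i, u, e, o
        0x3063, 0x3083, 0x3085, 0x3087, 0x308E,  # hiragana small tu, ya, yu, yo, wa
        0x30A1, 0x30A3, 0x30A5, 0x30A7, 0x30A9,  # katakana small a, i, u, e, o
        0x30C3, 0x30E3, 0x30E5, 0x30E7, 0x30EE,  # katakana small tu, ya, yu, yo, wa
        0x30FC,                                  # Katakana-Hiragana prolonged sound mark
    )
)

# Hypothetical policy tables: does this line-break value forbid a break
# before a CJ character?
CURRENT_SPEC = {"strict": True, "normal": False, "loose": False}
OPTION_B = {"strict": True, "normal": True, "loose": False}


def forbidden_breaks(text, line_break, policy):
    """Return the indices i where a break before text[i] is forbidden.

    Only the CJ rule is modeled. Index 0 is skipped because there is
    no break opportunity before the first character.
    """
    if not policy[line_break]:
        return []
    return [i for i in range(1, len(text)) if text[i] in CJ_SUBSET]
```

For the word チョコレート (which contains small ョ at index 1 and ー at index 4), `forbidden_breaks(word, "normal", OPTION_B)` gives `[1, 4]`, while the same call with `CURRENT_SPEC` gives `[]`. Option A would correspond to a UA picking either table for normal.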

@xfq xfq added the i18n-tracker Group bringing to attention of Internationalization, or tracked by i18n but not needing response. label May 24, 2024
@frivoal
Collaborator

frivoal commented Jun 13, 2024

I think you are right that this would be an improvement. I have no strong preference between A and B. B should lead to more interop, so maybe that one?

@jfkthame
Contributor

> B should lead to more interop

I think it also promises better behavior in the common/default case. A break before small kana is generally undesirable, so it should only be allowed if an author explicitly opts in to the "looser" behavior, and not when such characters appear in unstyled, untagged content.

@kidayasuo

kidayasuo commented Jun 25, 2024

Discussed at today's JLReq TF meeting and reached a consensus. As a unified position, JLReq TF recommends option B: in normal, forbid breaks before small kana, as well as the prolonged sound mark U+30FC.

The current JLReq also forbids small kana from appearing at the beginning of lines, as stated in section 3.1.7. In appendix C.3 addendum, it defines looser levels that allow breaks before small kana and some other characters. In traditional printing, there has been pressure to avoid kinsoku (prohibition of line breaks at certain points) as much as possible to reduce the need for adjustments for justification, which affects the productivity of line layout work. Additionally, when a justified line is shorter, the adjustments result in visible widening of character spacing. The looser levels were defined to accommodate such requirements.

Text on the web has different properties. The cost of line adjustment is negligible. The default on the web is ragged right, meaning that adjustments do not negatively affect character spacing. Another reason to avoid separating small kana is that they are diacritical marks modifying the phonetic value of the previous character. Separating them forces readers to read the base character differently. Upon reaching the beginning of the next line, readers find that it actually had a different pronunciation. This effect would be worse for people with reading difficulties. These reasons underpin the recommendation.

It is important that editors and designers have options. For example, when a line is short and justified, one might prefer looser rules to prevent the spacing between characters from becoming too wide.

@fantasai fantasai removed the Agenda+ i18n Add to agenda for CSS-i18n calls label Jul 14, 2024
5 participants