Don't export character as alias for codepoint #449
Comments
“Character” is an ambiguous term and is used in multiple senses in the Web platform. Don't alias it to only one of them. whatwg#449
I disagree. I think if anyone is using the term "character" on the web platform, they should be linking to this clear definition as a synonym of "code point".
I think my point is that people shouldn't really be using the term “character” when defining behavior in Web platform specs, so Infra shouldn't be exporting the term and encouraging its use.
I'll pile on here. On the one hand, I like WHATWG's general preference for using normal words in preference to jargon and I don't think that the definition of However, spec developers don't understand the term that well: my worry is that they'll pick up the definition and use it imprecisely. When I18N does reviews, we often find text talking about "characters" but the assumption behind it is slipping around between bytes, code units (which obviously can be bytes), code points, and graphemes. By getting specs to be specific, we avoid headaches later. Note that I18N WG made a conscious choice not to export
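To illustrate the slippage between bytes, code units, code points, and graphemes described above, here is a minimal Python sketch. The function name `count_units` and the sample string are hypothetical and chosen only for illustration. The grapheme figure is a rough approximation that only skips combining marks; real grapheme cluster segmentation follows Unicode UAX #29 and is not in the Python standard library.

```python
import unicodedata


def count_units(text: str) -> dict:
    """Count one string in several units that are all loosely called 'characters'."""
    return {
        # Bytes in the UTF-8 encoding of the string.
        "utf8_bytes": len(text.encode("utf-8")),
        # UTF-16 code units: each one is two bytes in UTF-16LE (no BOM).
        "utf16_code_units": len(text.encode("utf-16-le")) // 2,
        # Python's len() on str counts Unicode code points.
        "code_points": len(text),
        # ASSUMPTION: crude stand-in for grapheme clusters. It only ignores
        # combining marks and is not a UAX #29 implementation.
        "approx_graphemes": sum(
            1 for ch in text if unicodedata.combining(ch) == 0
        ),
    }


# "e" + COMBINING ACUTE ACCENT (U+0301) + GRINNING FACE (U+1F600)
sample = "e\u0301\U0001F600"
print(count_units(sample))
# {'utf8_bytes': 7, 'utf16_code_units': 4, 'code_points': 3, 'approx_graphemes': 2}
```

The same three-code-point string has four different "lengths" depending on which sense of "character" is assumed, which is the kind of ambiguity a spec can avoid by naming the unit it means.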
Sure, I think people should prefer more precise terms. I don't think that really has bearing on whether it's exported. And it certainly doesn't indicate we should defer to CSS Text (which also exports a definition---and IMO a worse one for the purposes of other specs) instead.
It's not so much that I want specs to refer to CSS Text, it's that I want them to recognize that it's an ambiguous term that has multiple definitions, and needs to be made more specific if it's being used in a spec.
Exporting it, particularly from Infra, means that you think it's a good term to use, and are providing a definition for it for others to use, since Infra is largely a set of terms with precise definitions that are recommended for use in specs. If the idea is that specs should be using more precise terms instead of “character” (which is what the i18nwg recommends), then exporting “character” from Infra is working against that.
I would be okay with a decision where we all avoid using "character" going forward and stop exporting it. Note that if we removed the export from Infra, any specifications using this definition might end up accidentally switching to the CSS definition, which does not seem desirable. So any effort to avoid using "character" might have to start downstream to some extent.
Is it possible to export it as a deprecated term, so referencing specs will get a warning that could point them somewhere that could help authors find specific terms that would work for their specs? It looks like this isn't currently a thing terms can do, perhaps
While there's not currently any mechanism in Bikeshed to mark a definition as "bad to use, try something else", there's nothing stopping me from adding such a thing. A poison pill definition sounds like a pretty reasonable idea actually - that would still allow CSS to define "character" for their own purpose, while Infra's "character" causes an error if you link to it.
The Infra spec aliases its definition for “code point” to “character”. However, “character” is a very ambiguous term, see https://www.w3.org/TR/css-text-3/#characters, and should not be treated as an alias of “code point”. This isn't to say that specs shouldn't ever use it, if it's clear from context which meaning is meant, and it's fine to leave in the sentence that says
But I think it's best if Infra does not define and export the term “character” as an alias of “code point”.