Don't export character as alias for codepoint #449

Open
fantasai opened this issue May 5, 2022 · 8 comments

Comments

@fantasai

fantasai commented May 5, 2022

The Infra spec defines “character” as an alias of “code point”. However, “character” is a very ambiguous term (see https://www.w3.org/TR/css-text-3/#characters) and should not be treated as an alias of “code point”. This isn't to say that specs should never use it when the intended meaning is clear from context, and it's fine to keep the sentence that says

> Code points are sometimes referred to as characters and in certain contexts are prefixed with "0x" rather than "U+".

But I think it's best if Infra does not define and export the term “character” as an alias of “code point”.

fantasai added a commit to fantasai/infra that referenced this issue May 5, 2022
“Character” is an ambiguous term and is used in multiple senses in the Web platform. Don't alias it to only one of them. whatwg#449
@domenic
Member

domenic commented May 5, 2022

I disagree. I think if anyone is using the term "character" on the web platform, they should be linking to this clear definition as a synonym of "code point".

@fantasai
Author

fantasai commented May 6, 2022

I think my point is that people shouldn't really be using the term “character” when defining behavior in Web platform specs, so Infra shouldn't be exporting the term and encouraging its use.

https://www.w3.org/TR/charmod/#sec-PerceptionsOutro

@aphillips
Contributor

I'll pile on here.

On the one hand, I like WHATWG's general preference for using normal words in preference to jargon and I don't think that the definition of character in Infra is "wrong".

However, spec developers don't understand the term that well: my worry is that they'll pick up the definition and use it imprecisely. When I18N does reviews, we often find text talking about "characters" where the underlying assumption slips between bytes, code units (which can themselves be bytes), code points, and graphemes. By getting specs to be specific, we avoid headaches later. Note that the I18N WG made a conscious choice not to export "character" in the I18N Glossary.
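
To make the ambiguity concrete, here is a minimal sketch (not part of the original comment) that counts the same string under each of those meanings. The function name `count_units` and the sample strings are illustrative assumptions, and the grapheme figure is only a crude approximation that skips combining marks; real grapheme segmentation follows Unicode's text segmentation rules, which Python's standard library does not implement.

```python
import unicodedata


def count_units(text: str) -> dict:
    """Count one string under several meanings of "character".

    Hypothetical helper for illustration only.
    """
    return {
        # Bytes in the UTF-8 encoding.
        "utf8_bytes": len(text.encode("utf-8")),
        # 16-bit code units in the UTF-16 encoding (2 bytes per unit).
        "utf16_code_units": len(text.encode("utf-16-le")) // 2,
        # Unicode code points: what Infra's "character" alias means.
        "code_points": len(text),
        # Crude stand-in for graphemes: code points that are not
        # combining marks. This is an assumption-laden approximation,
        # not real grapheme cluster segmentation.
        "approx_graphemes": sum(
            1 for ch in text if unicodedata.combining(ch) == 0
        ),
    }


# "e" followed by U+0301 COMBINING ACUTE ACCENT, displayed as one letter.
decomposed_e_acute = "e\u0301"
# U+1F600, a single code point outside the Basic Multilingual Plane.
grinning_face = "\U0001F600"

print(count_units(decomposed_e_acute))
print(count_units(grinning_face))
```

For the decomposed "é" the four counts are 3, 2, 2, and 1; for the emoji they are 4, 2, 1, and 1. A sentence such as "limit the field to 10 characters" therefore means something different under each reading.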

@domenic
Member

domenic commented May 6, 2022

Sure, I think people should prefer more precise terms. I don't think that really has bearing on whether it's exported. And it certainly doesn't indicate we should defer to CSS Text (which also exports a definition, and IMO a worse one for the purposes of other specs) instead.

@fantasai fantasai changed the title from "Defer to CSS Text for definition of “character”" to "Don't export character as alias for codepoint" on May 7, 2022
@fantasai
Author

fantasai commented May 7, 2022

It's not so much that I want specs to refer to CSS Text, it's that I want them to recognize that it's an ambiguous term that has multiple definitions, and needs to be made more specific if it's being used in a spec.

> Sure, I think people should prefer more precise terms. I don't think that really has bearing on whether it's exported.

Exporting it, particularly from Infra, signals that you think it's a good term to use and that you are providing a definition for others to use. Infra is largely a set of terms with precise definitions that are recommended for use in specs. If the idea is that specs should use more precise terms instead of “character” (which is what the i18nwg recommends), then exporting “character” from Infra works against that.

@annevk
Member

annevk commented May 9, 2022

I would be okay with a decision where we all avoid using "character" going forward and stop exporting it. Note that if we removed the export from Infra, any specifications using this definition might end up accidentally switching to the CSS definition, which does not seem desirable. So any effort to avoid using "character" might have to start downstream to some extent.

@SamB

SamB commented Jan 7, 2023

Is it possible to export it as a deprecated term, so referencing specs will get a warning that could point them somewhere that could help authors find specific terms that would work for their specs?

It looks like this isn't currently something terms can do; perhaps @tabatkins and/or ReSpec people would care to chime in?

@tabatkins
Contributor

While there's not currently any mechanism in Bikeshed to mark a definition as "bad to use, try something else", there's nothing stopping me from adding such a thing. A poison pill definition sounds like a pretty reasonable idea actually - that would still allow CSS to define "character" for their own purpose, while Infra's "character" causes an error if you link to it.
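
As a rough illustration of the idea, the toy resolver below models a "poison pill" definition. It is not Bikeshed's code or API: `Definition`, `poison_hint`, `PoisonedTermError`, `resolve`, and the spec labels `"infra"` and `"css-text-3"` are all hypothetical names chosen for this sketch, and the resolution rules are deliberately simplified assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Definition:
    term: str
    spec: str
    # Hypothetical marker: if set, linking to this definition is an
    # error and the hint tells authors what to use instead.
    poison_hint: Optional[str] = None


class PoisonedTermError(Exception):
    """Raised when a link resolves to a poisoned definition."""


def resolve(term: str, definitions: List[Definition],
            spec: Optional[str] = None) -> Definition:
    """Toy link resolution: match by term, optionally narrowed by spec."""
    candidates = [
        d for d in definitions
        if d.term == term and (spec is None or d.spec == spec)
    ]
    if not candidates:
        raise LookupError(f"no definition found for {term!r}")
    if len(candidates) > 1:
        raise LookupError(f"ambiguous link to {term!r}; name a spec")
    found = candidates[0]
    if found.poison_hint is not None:
        raise PoisonedTermError(f"{term!r} in {found.spec}: {found.poison_hint}")
    return found


definitions = [
    Definition("character", "infra",
               poison_hint="use a precise term such as 'code point'"),
    Definition("character", "css-text-3"),
]

# CSS can still link to its own definition.
print(resolve("character", definitions, spec="css-text-3").spec)

# Linking to Infra's definition fails loudly, with a hint.
try:
    resolve("character", definitions, spec="infra")
except PoisonedTermError as err:
    print("error:", err)
```

In this toy model, simply dropping the Infra entry would make an unqualified `resolve("character", ...)` quietly return the CSS definition, which is the accidental switch described earlier in the thread; keeping the entry but poisoning it turns that silent change into an explicit error.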


6 participants