smufl-discuss

[smufl-discuss] Discussing Unicode and encoding dilemmas

Grzegorz Rolek

If we're discussing encodings, and Unicode in particular, we would really want to have somebody from the Unicode Consortium, or at least a related expert, on board. I'm neither of these, and I'm pretty late to the party, but I've already seen a few severe misconceptions expressed here about how Unicode and the Consortium itself work. Members of this list who want to discuss it should, I think, read The Unicode Standard (the core specification, available to download at http://unicode.org/standard/standard.html), or at least join the Unicode discussion list (http://unicode.org/consortium/distlist.html) to gain some insight into these matters. I know that a few members of the Consortium are seriously interested in music notation, and you can find several discussions on the subject in the Unicode list archives. For now, let me just stress three things that have already been discussed here and that I find particularly important:

Unicode does have in its repertoire characters that are merely glyph alternates, or that are redundant given combining characters and the like, but those were added in the early days only so that text encoded in legacy character sets could be transcoded into Unicode without loss, and their use is now discouraged.
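As a rough illustration of the point above (a sketch using Python's standard unicodedata module, with two Latin characters picked only as familiar examples), a legacy glyph alternate and a precomposed letter both carry decomposition mappings back to the plain characters they duplicate:

```python
import unicodedata

# U+FB01 LATIN SMALL LIGATURE FI: a glyph alternate carried over from legacy sets.
ligature = "\uFB01"
# U+00E9: precomposed e-acute, redundant given "e" plus U+0301 COMBINING ACUTE ACCENT.
precomposed = "\u00E9"

def describe(ch):
    """Return the character's name and its decomposition mapping."""
    return unicodedata.name(ch), unicodedata.decomposition(ch)

lig_info = describe(ligature)
pre_info = describe(precomposed)

# Compatibility normalization folds the ligature back into two ordinary letters.
folded = unicodedata.normalize("NFKC", ligature)
# Canonical decomposition splits the precomposed letter into base + combining mark.
decomposed = unicodedata.normalize("NFD", precomposed)

print(lig_info, repr(folded))
print(pre_info, [f"U+{ord(c):04X}" for c in decomposed])
```

The "<compat>" tag in the ligature's mapping marks it as a compatibility character, which is the kind of character whose use is discouraged in new text.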

Combining characters, with the multitude of their properties and related control points, can be quite expressive, but they are also easily overused. Let's please take this into account when discussing things like stacks of bass figures or chord clusters. Exactly how they work, and how they could be used or misused, is clearly described in the Standard.
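One of those properties, the canonical combining class, can be seen in a small sketch (again with Python's unicodedata module; the Latin marks here are only stand-ins, not a proposal for music symbols). Marks of different classes are reordered by normalization, while marks of the same class keep their order, because that order determines how they stack:

```python
import unicodedata

ACUTE = "\u0301"      # COMBINING ACUTE ACCENT, attaches above
DIAERESIS = "\u0308"  # COMBINING DIAERESIS, attaches above
DOT_BELOW = "\u0323"  # COMBINING DOT BELOW, attaches below

# Canonical combining class of each mark (0 would mean "not reordered").
classes = {name: unicodedata.combining(ch)
           for name, ch in [("acute", ACUTE),
                            ("diaeresis", DIAERESIS),
                            ("dot_below", DOT_BELOW)]}

# Different classes: normalization sorts the marks by class,
# so both spellings end up as the same sequence.
mixed = unicodedata.normalize("NFD", "a" + ACUTE + DOT_BELOW)

# Same class: the order is significant and is left untouched.
stacked = unicodedata.normalize("NFD", "a" + ACUTE + DIAERESIS)

print(classes)
print([f"U+{ord(c):04X}" for c in mixed])
print([f"U+{ord(c):04X}" for c in stacked])
```

Any scheme that stacks several marks on one base character would have to work within rules like these, which is why the Standard's chapters on combining characters are worth reading first.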

The size of a particular character shouldn't matter in Unicode unless it comes with a difference in semantics. If a clef bears a single meaning, that is, it simply sets a pitch reference on the staff, then there's no reason why a clef in the middle of the line should be encoded separately. Unicode does not deal with the contexts in which characters are used. If the clef is made smaller in-line only to fit nicely within the reading flow, then this should be handled with a font feature or some other higher-level mechanism. That is, of course, unless the semantics are indeed different.
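For what it's worth, the G clefs already in Unicode's Musical Symbols block seem to follow this principle. The sketch below (Python's unicodedata module; the three code points are simply the ones I chose to look up) prints their names, and the variants differ in meaning, an octave transposition, rather than in size:

```python
import unicodedata

# Three G-clef code points from the Musical Symbols block.
code_points = (0x1D11E, 0x1D11F, 0x1D120)

# Map each code point to its formal Unicode character name.
clefs = {cp: unicodedata.name(chr(cp)) for cp in code_points}

for cp, name in clefs.items():
    print(f"U+{cp:04X} {name}")
```

A smaller mid-line clef with the same pitch reference would, by the same reasoning, stay the same character and be rendered differently by the font or the application.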

I really encourage anyone interested in Unicode to read the core specification linked above, because it's very informative about how Unicode works as a whole across different writing systems, what can be achieved with its current implementation, and how to think about future encodings.