Correct conversions #12

Open
wants to merge 4 commits into
base: master

nathanhammond

This PR includes patches for three issues that surfaced when converting JPTableFull (using HKSCS encoding) to present Unicode code points.

The issues are resolved commit-by-commit:

  1. Reverts the precomputation of the data files. Precomputation should happen as a build action and the results should not be committed to the repo. Further, the approach should pre-resolve every possible input to its output in a single map: {"5041": "5041"}. Data beyond that key-value pair is not valuable.
  2. HKSCS-9447 has a typo in the source mapping, resulting in a duplicate key. Correcting that mapping corrects the output.
  3. Big5 has two duplicate characters, and both copies of each get mapped to Unicode code points. This change makes the conversion produce the preferred Unicode character, not the compatibility character.
  4. The numerals now get correctly mapped to the ideographs, not the Hangzhou numerals.
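
To make items 1 to 3 concrete, here is a minimal Python sketch, not the code in this PR. The helper names `build_flat_map` and `prefer_unified` are hypothetical, and using NFC normalization to pick the preferred character is an assumption about one way to do it; the PR may simply hard-code the corrected entries. It flattens source/target pairs into a single map, refuses conflicting duplicate keys instead of letting the later entry silently win, and replaces a compatibility ideograph with its unified counterpart.

```python
import unicodedata


def build_flat_map(pairs):
    """Resolve (source_hex, unicode_hex) pairs into one flat dict.

    Hypothetical helper: raises ValueError when a source key appears
    twice with different targets, rather than keeping the last one.
    """
    flat = {}
    for src, dst in pairs:
        key, value = src.upper(), dst.upper()
        if key in flat and flat[key] != value:
            raise ValueError(f"duplicate key {key}: {flat[key]} vs {value}")
        flat[key] = value
    return flat


def prefer_unified(ch):
    """Return the preferred form of a single character.

    Assumption: NFC normalization is an acceptable way to choose the
    preferred character. It replaces a CJK compatibility ideograph that
    has a canonical decomposition with its unified counterpart.
    Characters already in NFC form are returned unchanged.
    """
    return unicodedata.normalize("NFC", ch)
```

For example, `build_flat_map([("5041", "5041")])` yields the single-map shape described in item 1, and `prefer_unified("\uFA0C")` returns U+5140, since U+FA0C is a compatibility ideograph whose canonical decomposition is U+5140. Running a check like this at build time would catch a typo such as the duplicate key in item 2 before the data ships.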

Fixes #4.
