ICU-22917 Generating / updating the units data is a very clunky and manual process #3262

Merged 2 commits on Nov 8, 2024
150 changes: 150 additions & 0 deletions docs/processes/release/tasks/updating-measure-unit-old.md
@@ -0,0 +1,150 @@
---
layout: default
title: Updating MeasureUnit with new CLDR data
parent: Release & Milestone Tasks
grand_parent: Contributors
nav_order: 120
---

<!--
© 2020 and later: Unicode, Inc. and others.
License & terms of use: http://www.unicode.org/copyright.html
-->

# Updating MeasureUnit with new CLDR data
{: .no_toc }

## Contents
{: .no_toc .text-delta }

1. TOC
{:toc}

---

This document explains how to update the C++ and Java versions of the MeasureUnit
class with new CLDR data.

Code is generated by running the MeasureUnitTest.java unit tests, which write the
generated code to System.out. There are two ways to access the output:

1. Within **eclipse**:
- Open MeasureUnitTest.java, run it by clicking on the green play button on
menu bar.
- Copy the generated code from the eclipse console to the clipboard.

2. With **ant**:
- Run: `ant checkTest
-Dtestclass='com.ibm.icu.dev.test.format.MeasureUnitTest'`
- Open the checkTest output: `out/junit-results/checkTest/html/index.html`
- Navigate to "System.out" at the bottom of the MeasureUnitTest page to find
the generated code, and copy to the clipboard.

After syncing CLDR data with ICU, do the following. This documentation assumes
that you are updating the MeasureUnit classes for ICU 68.

* Check out
$GIT_ROOT/icu4j/main/common_tests/src/test/java/com/ibm/icu/dev/test/format/MeasureUnitTest.java
* Open MeasureUnitTest.java.
* Find the `testZZZ` test; its code should all be commented out. This test will
execute last and will run the desired code.

Make sure DRAFT_VERSIONS at top of MeasureUnitTest.java is set correctly.
These are the ICU versions that have draft methods.

## Update MeasureUnit.java

* Change `testZZZ` to run `generateConstants("68"); // ICU 68`.
* Run MeasureUnitTest.java, copy the generated code (see instructions above).
* Open MeasureUnit.java:
$GIT_ROOT/icu4j/main/core/src/main/java/com/ibm/icu/util/MeasureUnit.java
* Look for line containing:

`// Start generated MeasureUnit constants`
* Look for line containing:

`// End generated MeasureUnit constants`
* Replace all the generated code in between with the contents of the clipboard
* Run the MeasureUnitTest.java to ensure that the new code is backward
compatible. These compatibility tests are called something like
`TestCompatible65`, which tests backward compatibility with ICU 65.
* Create a compatibility test for ICU 68. Change `testZZZ` to run
`generateBackwardCompatibilityTest("68")`.
* Run tests.
* Copy generated test (see instructions above) into MeasureUnitTest.java
* Run tests again to ensure that new code is backward compatible with itself

## Update ICU4C

* Check out ICU4C.

### Update measunit.h

* Change `testZZZ` to run `generateCXXHConstants("68"); // ICU 68`.
* Run MeasureUnitTest.java, copy the generated code (see instructions above).
* Open $GIT_ROOT/icu4c/source/i18n/unicode/measunit.h. Look for line containing:

`// Start generated createXXX methods`
* Look for line:

`// End generated createXXX methods`
* Replace all the generated code in between with the contents of the clipboard

### Update measunit.cpp

* Change `testZZZ` to run `generateCXXConstants();`.
* Run MeasureUnitTest.java, copy the generated code (see instructions above).
* Open $GIT_ROOT/icu4c/source/i18n/measunit.cpp. Look for line containing:

`// Start generated code for measunit.cpp`
* Look for line containing:

`// End generated code for measunit.cpp`
* Replace all the generated code in between with the contents of the clipboard

### Run C++ tests

* Run `./intltest format/MeasureFormatTest` from `test/intltest` to ensure new
code is backward compatible.
* Create a compatibility test for ICU 68. Change `testZZZ` in Eclipse to run
`generateCXXBackwardCompatibilityTest("68")`.
* Run tests.
* Copy generated test (see instructions above) into
$GIT_ROOT/icu4c/source/test/intltest/measfmttest.cpp. Make other necessary
changes to make test compile. You can find these changes by searching for
`TestCompatible65()`
* Run tests again to ensure that new code is backward compatible with itself

## Finishing changes

These last changes are necessary to permanently record the ICU version number of
any new measure units. Without these changes any new functions for this release
will be considered new for the next release too.

* Change `testZZZ` to run `updateJAVAVersions("68");`.
* Run MeasureUnitTest.java, copy the generated code (see instructions above).
* Append the clipboard contents to the values of the JAVA_VERSIONS variable
near the top of MeasureUnitTest.java.

**Important:** what you are copying are just the new functions for the current
ICU version, in this case 68. Therefore append, do not replace.

## Updating units.txt and unitConstants

The standard ldml2icu process is used to update ICU's resource files (see
[cldr-icu-readme.txt](https://github.com/unicode-org/icu/blob/main/icu4c/source/data/cldr-icu-readme.txt)).
CLDR's units.xml defines conversion rates in terms of some constants defined in
`unitConstants`.

For efficiency and simplicity, ICU does not read `unitConstants` from the
resource file. If any new constants are added, some code changes would be
needed. This would be caught by `testUnitConstantFreshness` unit test in
`units_test.cpp`.

They are hard-coded:
* Java: `UnitConverter.java` has the constant names in
`UnitConverter.Factor.addEntity()` and constant values in
`UnitConverter.Factor.getConversionRate()`.
* C++: `units_converter.cpp` has the constant names in
`addSingleFactorConstant()`, with the constant values in `double
constantsValues[]` in the `units_converter.h` header file.
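
The freshness check described above amounts to comparing the constant names found in the data with the names the code knows about. Below is a minimal Java sketch of that comparison. The class and method names are hypothetical and this is not the code of `testUnitConstantFreshness`; the constant names in the usage note are only illustrative.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: report constants that exist in the data but not in the code.
final class ConstantFreshness {
    // Returns the names present in the data that are missing from the hard-coded set.
    // An empty result means the hard-coded list is up to date.
    static Set<String> missingConstants(Set<String> fromData, Set<String> hardCoded) {
        Set<String> missing = new TreeSet<>(fromData); // copy, so the inputs stay untouched
        missing.removeAll(hardCoded);
        return missing;
    }
}
```

If the data contained `ft_to_m`, `PI` and a newly added name while the code only knew the first two, the returned set would hold just the new name. That is the signal that the hard-coded lists need a matching entry.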
198 changes: 89 additions & 109 deletions docs/processes/release/tasks/updating-measure-unit.md
@@ -12,127 +12,106 @@ License & terms of use: http://www.unicode.org/copyright.html
-->

# Updating MeasureUnit with new CLDR data

{: .no_toc }

## Contents

{: .no_toc .text-delta }

1. TOC
{:toc}

---

This document explains how to update the C++ and Java version of the MeasureUnit
This document explains how to update the C++ and Java versions of the `MeasureUnit`
class with new CLDR data.

Code is generated by running MeasureUnitTest.java unit tests, which writes
generated code to System.out. Two ways to access this:

1. Within **eclipse**:
- Open MeasureUnitTest.java, run it by clicking on the green play button on
menu bar.
- Copy the generated code from the eclipse console to the clipboard.

2. With **ant**:
- Run: `ant checkTest
-Dtestclass='com.ibm.icu.dev.test.format.MeasureUnitTest'`
- Open the checkTest output: `out/junit-results/checkTest/html/index.html`
- Navigate to "System.out" at the bottom of the MeasureUnitTest page to find
the generated code, and copy to the clipboard.

After syncing CLDR data with ICU do the following. This documentation assumes
that you are updating the MeasureUnit classes for ICU 68.

* Check out
$GIT_ROOT/icu4j/main/common_tests/src/test/java/com/ibm/icu/dev/test/format/MeasureUnitTest.java
* Open MeasureUnitTest.java.
* Find the `testZZZ` test, its code should all be commented out. This test will
execute last and will run the desired code.

Make sure DRAFT_VERSIONS at top of MeasureUnitTest.java is set correctly.
These are the ICU versions that have draft methods.

## Update MeasureUnit.java

* Change `testZZZ` to run `generateConstants("68"); // ICU 68`.
* Run MeasureUnitTest.java, copy the generated code (see instructions above).
* Open MeasureUnit.java:
$GIT_ROOT/icu4j/main/core/src/main/java/com/ibm/icu/util/MeasureUnit.java
* Look for line containing:

`// Start generated MeasureUnit constants`
* Look for line containing:

`// End generated MeasureUnit constants`
* Replace all the generated code in between with the contents of the clipboard
* Run the MeasureUnitTest.java to ensure that the new code is backward
compatible. These compatibility tests are called something like
`TestCompatible65`, which tests backward compatibility with ICU 65.
* Create a compatibility test for ICU 68. Change `testZZZ` to run
`generateBackwardCompatibilityTest("68")`.
* Run tests.
* Copy generated test (see instructions above) into MeasureUnitTest.java
* Run tests again to ensure that new code is backward compatible with itself

## Update ICU4C

* checkout ICU4C

### Update measunit.h

* Change `testZZZ` to run `generateCXXHConstants("68"); // ICU 68`.
* Run MeasureUnitTest.java, copy the generated code (see instructions above).
* Open $GIT_ROOT/icu4c/source/i18n/unicode/measunit.h. Look for line containing:

`// Start generated createXXX methods`
* Look for line:

`// End generated createXXX methods`
* Replace all the generated code in between with the contents of the clipboard

### Update measunit.cpp

* Change testZZZ to run generateCXXConstants();
* Run MeasureUnitTest.java, copy the generated code (see instructions above).
* Open $GIT_ROOT/icu4c/source/i18n/measunit.cpp. Look for line containing:

`// Start generated code for measunit.cpp`
* Look for lines

`// End generated code for measunit.cpp`
* Replace all the generated code in between with the contents of the clipboard

### Run C++ tests

* Run `./intltest format/MeasureFormatTest` from `test/intltest` to ensure new
code is backward compatible.
* Create a compatibility test for ICU 68. Change `testZZZ` in Eclipse to run
`generateCXXBackwardCompatibilityTest("68")`.
* Run tests.
* Copy generated test (see instructions above) into
$GIT_ROOT/icu4c/source/test/intltest/measfmttest.cpp. Make other necessary
changes to make test compile. You can find these changes by searching for
`TestCompatible65()`
* Run tests again to ensure that new code is backward compatible with itself

## Finishing changes

These last changes are necessary to permanently record the ICU version number of
any new measure units. Without these changes any new functions for this release
will be considered new for the next release too.

* Change `testZZZ` to run `updateJAVAVersions("68");`.
* Run MeasureUnitTest.java, copy the generated code (see instructions above).
* Append the clipboard contents to the values of the JAVA_VERSIONS variable
near the top of MeasureUnitTest.java.

**Important:** what you are copying are just the new functions for the current
ICU version, in this case 68. Therefore append, do not replace.

## Updating units.txt and unitConstants

The standard ldml2icu process is used to update ICU's resource files (see
[cldr-icu-readme.txt](https://github.com/unicode-org/icu/blob/main/icu4c/source/data/cldr-icu-readme.txt)).
This document applies to ICU 77 and later.
For older versions, see `updating-measure-unit-old.md`.

Make sure `DRAFT_VERSION_SET` at the top of
`./icu4j/main/common_tests/src/test/java/com/ibm/icu/dev/test/format/MeasureUnitGeneratorTest.java`
is set correctly. \
These are the ICU versions that have draft methods.

The code is generated by running the `MeasureUnitGeneratorTest.java` unit tests, which write
the generated code to several files.

1. With **maven** (command line):
- Change directory to `{icuRoot}/icu4j`
- Run `mvn install -DskipTests -DskipITs`
- Run `mvn install -q -Dtest=MeasureUnitGeneratorTest -DgenerateMeasureUnitUpdate -f main/common_tests`

2. Within **Eclipse**:
- Open `MeasureUnitGeneratorTest.java`, find the `generateUnitTestsUpdate` method
and run it by clicking on the green play button on the menu bar. \
Choose "JUnit Test" if asked. \
This will not generate the update, but it will run the test and create a "Run Configuration". \
Open it (Main menu -- "Run" -- "Run Configurations"), select the one named
`MeasureUnitGeneratorTest.generateUnitTestsUpdate`, go to the "Arguments" tab and add
`-DgenerateMeasureUnitUpdate` to the "VM Arguments" text area.
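
The first Eclipse run produces no update because the generation step appears to be switched on by a JVM system property. Below is a minimal Java sketch of such a gate, assuming the test only checks whether the property is defined. The class and method names are hypothetical, and the actual check in `MeasureUnitGeneratorTest.java` may differ.

```java
// Hypothetical sketch: enable a generation step only when a JVM property is defined.
final class GeneratorFlag {
    // Property name taken from the -DgenerateMeasureUnitUpdate argument above.
    static final String PROPERTY = "generateMeasureUnitUpdate";

    // "-Dname" without "=value" defines the property as an empty string,
    // so test for presence rather than for a particular value.
    static boolean isEnabled() {
        return System.getProperty(PROPERTY) != null;
    }

    public static void main(String[] args) {
        System.out.println(isEnabled() ? "generating update files" : "skipping generation");
    }
}
```

With a gate like this, a plain test run leaves the source tree untouched, and the files are written only when the argument is passed explicitly.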

Both methods generate files in the `icu4j/main/common_tests/target/` folder. \
The file names and the logging to the standard output will guide you.

It currently looks something like this:
```
Copy the generated code fragments from / to
/some/absolute/path/icu4j/main/common_tests/target/MeasureUnit.java \
/some/absolute/path/icu4j/main/core/src/main/java/com/ibm/icu/util/MeasureUnit.java

Copy the generated code fragments from / to
/some/absolute/path/icu4j/main/common_tests/target/MeasureUnitCompatibilityTest.java \
/some/absolute/path/icu4j/main/common_tests/src/test/java/com/ibm/icu/dev/test/format/MeasureUnitCompatibilityTest.java

Copy the generated code fragments from / to
/some/absolute/path/icu4j/main/common_tests/target/measunit.h \
/some/absolute/path/icu4c/source/i18n/unicode/measunit.h

Copy the generated code fragments from / to
/some/absolute/path/icu4j/main/common_tests/target/measunit.cpp \
/some/absolute/path/icu4c/source/i18n/measunit.cpp

Copy the generated code fragments from / to
/some/absolute/path/icu4j/main/common_tests/target/measfmttest.cpp \
/some/absolute/path/icu4c/source/test/intltest/measfmttest.cpp

Copy the generated code fragments from / to
/some/absolute/path/icu4j/main/common_tests/target/MeasureUnitGeneratorTest.java \
/some/absolute/path/icu4j/main/common_tests/src/test/java/com/ibm/icu/dev/test/format/MeasureUnitGeneratorTest.java
```

A diff tool or a diff-capable editor (for example `vi -d`) works nicely for this.
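
The log output above pairs each generated file with its destination, so the pairs can be extracted mechanically and handed to a diff tool. Below is a minimal Java sketch, assuming the log keeps the three-line shape shown above (header, generated file followed by a backslash, destination file). The class name is hypothetical, and the log format may change.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: extract {generatedFile, destinationFile} pairs from the log.
final class CopyHintParser {
    static final String HEADER = "Copy the generated code fragments from / to";

    static List<String[]> parse(List<String> logLines) {
        List<String[]> pairs = new ArrayList<>();
        for (int i = 0; i + 2 < logLines.size(); i++) {
            if (!logLines.get(i).trim().equals(HEADER)) {
                continue;
            }
            String from = logLines.get(i + 1).trim();
            if (from.endsWith("\\")) {
                // Drop the trailing line-continuation backslash.
                from = from.substring(0, from.length() - 1).trim();
            }
            String to = logLines.get(i + 2).trim();
            pairs.add(new String[] {from, to});
            i += 2; // skip the two path lines just consumed
        }
        return pairs;
    }
}
```

Each pair can then be opened side by side, for example with `vi -d <generated> <destination>`.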

Look for the lines containing `// Start generated ...` and `// End generated ...`.
These lines exist in both the original files and the generated ones. \
Replace all the code between them in the original file with the corresponding code from the generated file.

If the generated code has no `// Start` ... `// End ...` pair then the new
code should be appended at some fixed place (details below).
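
The "replace range" operation can be sketched in Java as follows. This is a hypothetical helper, not part of ICU: it works on lists of lines, keeps both marker lines of the original file, and swaps in only the body found between the same markers in the generated file.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: replace the code between two marker lines.
final class GeneratedRangeReplacer {
    // Returns the target lines with everything strictly between the marker
    // lines replaced by the corresponding range from the generated lines.
    static List<String> replaceRange(List<String> target, List<String> generated,
                                     String startMarker, String endMarker) {
        int tStart = indexOfLineContaining(target, startMarker, 0);
        int tEnd = indexOfLineContaining(target, endMarker, tStart + 1);
        int gStart = indexOfLineContaining(generated, startMarker, 0);
        int gEnd = indexOfLineContaining(generated, endMarker, gStart + 1);

        List<String> result = new ArrayList<>(target.subList(0, tStart + 1)); // up to the start marker
        result.addAll(generated.subList(gStart + 1, gEnd));                    // new generated body
        result.addAll(target.subList(tEnd, target.size()));                    // end marker and the rest
        return result;
    }

    private static int indexOfLineContaining(List<String> lines, String marker, int from) {
        for (int i = from; i < lines.size(); i++) {
            if (lines.get(i).contains(marker)) {
                return i;
            }
        }
        throw new IllegalArgumentException("Marker not found: " + marker);
    }
}
```

Failing loudly when a marker is missing is deliberate: a file without the marker pair is one of the "append" cases and should not be rewritten automatically.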

* **`MeasureUnit.java`:** replace range.
* **`MeasureUnitCompatibilityTest.java`:** append the new generated method at the end. \
It is named something like `TestCompatible<version>()`. \
Don't add it if it already exists.
* **`measunit.h`:** replace range.
* **`measunit.cpp`:** replace range.
* **`measfmttest.cpp`:** append the new generated method after the last
`MeasureFormatTest::TestCompatible<version>()` method. \
Don't add it if it already exists. \
WARNING: add the method in two places: the method definition, with its code
as generated, and the declaration in the class definition.
* **`MeasureUnitGeneratorTest.java`:** append the new pairs of measure + version at
the end of the `JAVA_VERSIONS` structure. \
Don't add them if they already exist.
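
For the "append" cases, the "don't add it if it already exists" check can be done before pasting anything. Below is a minimal Java sketch, assuming the method is named `TestCompatible<version>` as described above. The helper itself is hypothetical, and the version numbers in the usage note are only examples.

```java
// Hypothetical sketch: detect whether a compatibility test for a version already exists.
final class CompatibilityTestCheck {
    // Returns true if the source text already mentions TestCompatible<version>(.
    // The trailing "(" keeps version "7" from matching "TestCompatible77(".
    static boolean hasCompatibilityTest(String source, String version) {
        return source.contains("TestCompatible" + version + "(");
    }
}
```

For a file that contains `void MeasureFormatTest::TestCompatible76()`, calling `hasCompatibilityTest(source, "76")` returns `true` and `hasCompatibilityTest(source, "77")` returns `false`, so only the version 77 method would still need to be appended.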

## Run tests for both `icu4c` and `icu4j`

## Updating `units.txt` and `unitConstants`

The standard `ldml2icu` process is used to update ICU's resource files (see
[`cldr-icu-readme.txt`](https://github.com/unicode-org/icu/blob/main/icu4c/source/data/cldr-icu-readme.txt)).
CLDR's units.xml defines conversion rates in terms of some constants defined in
`unitConstants`.

@@ -142,6 +121,7 @@ needed. This would be caught by `testUnitConstantFreshness` unit test in
`units_test.cpp`.

They are hard-coded:

* Java: `UnitConverter.java` has the constant names in
`UnitConverter.Factor.addEntity()` and constant values in
`UnitConverter.Factor.getConversionRate()`.