C APIの改善を行う #217

qwerty2501 · 2022-08-06T16:53:11Z

内容

以前から上がっていた C APIの改善案を実装する
そこまで急ぐものではないので議論を進めつつ進めていきたい
API定義を変えてしまうとライブラリユーザーがコードを書き換える必要があるので可能ならこの変更をもってしばらくは変える必要のない形にしたほうが良さそう

APIの変更観点としては以下にしたい(追加でこういう観点もほしいとかあったらどうぞ)

名前が競合する恐れがないか
できるだけ変更が入ってもライブラリユーザーがコードを変更せず再コンパイルするだけの形にできるか

今現在変えようと思ってるもの

initialize,finalize, yukarin_s_forward などの以前からある関数にのprefix voicevoxをつけて競合しないようにする必要がある APIにprefixを付ける、返り値をVoicevoxResultCodeで統一する #125
成否がbool値で返ってくるものをVoicevoxResultCodeを変えるようにし、ライブラリユーザーになぜ失敗したかを判定できるようにする。 APIにprefixを付ける、返り値をVoicevoxResultCodeで統一する #125
initializeやvoicevox_tts関数ではoptionalな引数があるがこれをstructにしてまとめて提供することにより今後optionが増えてもライブラリユーザーが書き換える必要がないようにしたほうが良さそう APIにprefixを付ける、返り値をVoicevoxResultCodeで統一する #125 (comment)
last_error_messageの削除(bool やめて VoicevoxResultCode にしたら voicevox_error_result_to_message でメッセージとれるため)

関連 Issue

refs #125

qwerty2501 · 2022-08-06T16:59:02Z

引数のoptionをstruct化したら voicevox_tts_from_kana や voicevox_load_openjtalk_dict は消せるかも？
どちらもoptionを増やせば initialize , voicevox_tts で対応できそうな機能に見える

Hiroshiba · 2022-08-06T19:17:54Z

C APIの改善に関して、なるべく網羅的に意見を伺いたいのでメンションを飛ばさせて頂きます！
@Yosshi999 @Oyaki122 @y-chan @aoirint @PickledChair @Segu-g @shigobu

Hiroshiba · 2022-08-06T19:23:10Z

引数のoptionをstruct化したら voicevox_tts_from_kana や voicevox_load_openjtalk_dict は消せるかも？

from_kanaはたしかに消せそうに思いました。（実際エンジン側は同じAPI）
あーでも、必要になってくるoptionが将来的にそれぞれ異なる可能性もある･･･うーん。

load_openjtalk_dictは辞書更新のたびに何度も呼ぶので、initializeに混ぜると不都合かもです。

qwerty2501 · 2022-08-06T23:12:44Z

あーでも、必要になってくるoptionが将来的にそれぞれ異なる可能性もある･･･うーん。

現状そうなる未来がちょっと見えてこないです。
またもしそうなった場合でもその時はAPIに破壊的変更を加えるでも良いかなと思いました。そうなる可能性も低そうに見えるのでその時は仕方がなしということで

load_openjtalk_dictは辞書更新のたびに何度も呼ぶので、initializeに混ぜると不都合かもです。

とりあえずttsできるまでに呼び出す関数の数を減らしたいと思ってて、現状 initrialize -> voicevox_load_openjtalk_dict -> voicevox_tts の3手順を踏まないといけないのをなんとかしたいと思ってます。
更新が必要なら voicevox_load_openjtalk_dict を残して、かつ initializeの optionにopen_jtalkのdictionaryのディレクトリを設定できるようにして設定されていたらinitialize時にopen_jatlkのdictionaryを読み込むようにするというアプローチもあるかなと思いました。

しかし辞書更新って一度起動してから何度も行われるものなんでしょうか？

Hiroshiba · 2022-08-07T06:38:01Z

たしかにfrom_kanaはオプションでもいいかなと思いました。

しかし辞書更新って一度起動してから何度も行われるものなんでしょうか？

辞書登録をすると更新が必要になります。
どちらかというと単語追加APIを生やしてその中でloadするほうが良いかもです･･･が、かなり大変そう。

qwerty2501 · 2022-08-07T06:52:57Z

辞書登録をすると更新が必要になります。
どちらかというと単語追加APIを生やしてその中でloadするほうが良いかもです･･･が、かなり大変そう。

あまり詳しくないのですが、辞書登録ってユーザー定義辞書とは別物ですか？

Hiroshiba · 2022-08-07T07:02:13Z

あまり詳しくないのですが、辞書登録ってユーザー定義辞書とは別物ですか？

辞書登録という言葉がいまいちでした。。「辞書登録後にユーザー定義辞書を作る」という流れをイメージしていました。
流れはこんな感じなはずです（リンク先はエンジン側での例です）

csvに単語を追加→create_user_dict実行（ユーザー定義辞書生成）→set_user_dict

qwerty2501 · 2022-08-07T07:32:55Z

なるほど。現状 voicevox_load_openjtalk_dict には open_jtalk_dic directoryしか受け取れるようになってないですが、将来的にはユーザー定義辞書も渡せるようにしたいという感じでしょうか？

Hiroshiba · 2022-08-07T09:06:41Z

あ！なるほどです。
ユーザー定義辞書のloadとvoicevox_load_openjtalk_dictは処理内容が同じだと勘違いしていました。
将来的にユーザー辞書を読めると嬉しいですが、とりあえずvoicevox_load_openjtalk_dictはinitializeにまとめちゃって良い気がしました！

qwerty2501 · 2022-08-07T09:07:39Z

ではいったん voicevox_load_openjtalk_dict の定義は消す方向で進めてみます

… feature/improve-c-api

qwerty2501

まだコンパイル通らない状態なのですが、とりあえず定義だけ変えてみました。主にやったことは以下です

load_openjtalk_dictを削除
last_error_messageを削除
optionsで表現できるものはoptionsに変更
voicevox_core/publish.rsにある関数について voicevox prefixを削除(これについてはちゃんと名前空間で保護されている関数であり、prefixは不要であるため)
成否の戻り値をboolからVoicevoxResultCodeに変更

ちょっときになってるのがoptionsを加えたことによりttsする手間が若干増えてしまってることです
例えば最小限の呼び出しでもいかのようになります

VoicevoxTtsOptions options = voicevox_default_tts_options();
int output_binary_size;
uint8_t* output_wav;
voicevox_tts("テキスト",${speaker_id},options,&output_binary_size,&output_wav);

これを避けるためにoptionsを受け取る版と受け取らない版を定義したほうが良いかもと思いました。
たとえばvoicevox_ttsであるならこんな感じにして普段は voicevox_tts を呼び出してoptionsはdefaultのものが自動的に内部で適用されるようにするなどしたほうがよいかも

pub extern "C" fn voicevox_tts(
    text: *const c_char,
    speaker_id: i64,
    output_binary_size: *mut c_int,
    output_wav: *mut *mut u8,
) -> VoicevoxResultCode;

pub extern "C" fn voicevox_tts_with_options(
    text: *const c_char,
    speaker_id: i64,
    options: VoicevoxTtsOptions,
    output_binary_size: *mut c_int,
    output_wav: *mut *mut u8,
) -> VoicevoxResultCode;

crates/voicevox_core_c_api/src/lib.rs

qwerty2501 · 2022-08-20T02:18:14Z

crates/voicevox_core/src/publish.rs

-    }
+pub struct InitializeOptions {
+    use_cuda: bool,
+    cpu_num_threads: u32,


負数を入れる余地を作りたくなかったのでu32にしたがそれでよかったか？ usizeだとちょっと大きすぎる気がしたので
内部的には c_intを使ってるので絶対に桁落ちさせないといういみではu16のほうがよかったかもしれない

usize

どうなんでしょう? Python越しにせいぜい数回やりとりするデータなのでサイズはあまり問題ではないとは思うのですが、桁落ちも確かにありますね。

内部的にはc_intなので usize -> c_intへの変更の際に桁落ちはありえますね。変換前にチェックして値が大きすぎたらエラーにする方法もまあなくはないですが、そもそもCPUの数でそこまで大きな値を指定することはないと考えるとusizeは冗長かもですね。
変換前に桁落ちしないかチェックが必要なのはu32も同じなので、そういう意味だとu16がベストかも？

話はずれますがrepr(Rust)な方の構造体ではOption<NonZeroU32>のように持つのもいいかもしれません。プリミティブな整数型に一対一で対応しますし。

ですね。u16がいいかも。

Optionについてはどうでしょうね。 onnxruntimeではcpu_num_threads:0も意味のある値ではあるのでそのままu16としても良いかもしれませんが、onnxruntimeに依存しないという意味では Option<NonZeroU16> としてもいいかもしれないですが。

あーそうでしたね。onnxruntimeにしか渡さないのならあまり意味はないかも。DLLを使うユーザーに提示できるならともかく。

cpu_num_threadsをu16に変更しました

crates/voicevox_core/src/publish.rs

qwerty2501 · 2022-08-20T07:30:52Z

voicevox_c_apiのlib.rsが結構大きくなってきたので、公開しない補助関数については helper.rsといったファイルを作って移動させたほうがいいかもです

qwerty2501 · 2022-08-20T19:40:58Z

@Hiroshiba

これを避けるためにoptionsを受け取る版と受け取らない版を定義したほうが良いかもと思いました。
たとえばvoicevox_ttsであるならこんな感じにして普段は voicevox_tts を呼び出してoptionsはdefaultのものが自動的に内部で適用されるようにするなどしたほうがよいかも

これどうしましょうか？

Hiroshiba · 2022-08-20T21:52:57Z

あ、見逃していました！

どちらも良さそうなのですが、一旦voicevox_tts(options)だけにしておいて、リリースまでにやっぱほしいなってなったら増やす感じで･･･！
pythonのようにデフォルト引数がある言語のラッパーを考えると、with_options関数が無いほうがやりやすそうだなーという所感です。

qryxip · 2022-08-20T23:56:24Z

#[no_mangle]
pub static voicevox_default_tts_options: VoicevoxTtsOptions = VoicevoxTtsOptions::DEFAULT;

impl VoicevoxTtsOptions {
    const DEFAULT: Self = Self { kana: false };
}

// あるいは直接
// #[no_mangle]
//pub static voicevox_default_tts_options: VoicevoxTtsOptions = VoicevoxTtsOptions { kana: false };

こうしてしまえば多分cbindgenもグローバル定数(extern const VoicevoxTtsOptions voicevox_default_tts_options;)扱いしてくれるので、割と抵抗なく以下の書き方ができるんじゃないかと思うのですがどうでしょうか...?

-VoicevoxTtsOptions options = voicevox_default_tts_options();
 int output_binary_size;
 uint8_t* output_wav;
-voicevox_tts("テキスト",${speaker_id},options,&output_binary_size,&output_wav);
+voicevox_tts("テキスト",${speaker_id},voicevox_default_tts_options,&output_binary_size,&output_wav);

(edit) いや()の2文字が減るだけですが

qwerty2501 · 2022-08-21T03:58:10Z

内部にもttsのdefaultオプションを作る関数を作るつもりなので他のAPIと挙動を合わせるという意味でも関数にしたほうが良さそうに感じました。直近ではpythonですかね

qryxip · 2022-08-21T05:12:10Z

うーんRustの実装の都合が理由であればvoicevox_core側でもこうしてしまえばいいと思ってしまいますね。
Cow<'static, str>やCow<'static, [f64]>もconst化できますし、ボイラープレートが嵩むならこんなライブラリもあります。

impl InitializeOptions {
    pub const DEFAULT: Self = Self {
        use_cuda: false,
        cpu_num_threads: 0,
        load_all_models: false,
        open_jtalk_dict_dir: None,
    };
}

qwerty2501 · 2022-08-21T05:30:22Z

InitializeOptions -> VoicevoxInitializeOptions への変換はFrom traitを使おうかと考えてたんですが、まあ専用のconst関数を作れば良いかもですね。static変数でdefaultのオプションを持つのもありかなと思いました。

qwerty2501 · 2022-08-21T05:45:09Z

ただ削減できるのが2文字ということを考えるとメンテしやすい方にしたほうが良いと思いました。ぱっと見はどっちもあまり対して変わらないようには見えますが・・・

Hiroshiba · 2022-08-21T06:33:04Z

static変数でデフォルトのオプションを持つ場合、そのstatic変数の中身を変えられてしまいそう･･･？ 👀
その場合、まあ便利そうではあるけど変なバグに繋がりそう感もちょっとあります。

Hiroshiba

いくつかコメントしたのでRequest changesにしていますが、気持ち的にはapproveです！！

crates/voicevox_core/src/publish.rs

crates/voicevox_core_c_api/src/lib.rs

qwerty2501 · 2022-08-25T10:44:24Z

#217 (comment)
これに加えてWindowsのexampleがありますね。
pythonのexampleについてはPyO3版が対応したらそっちを使うようにしたほうがいいかも？
すぐにできなさそうなら今やる必要がありそうですが

Hiroshiba

LGTM！！！
調整が必要な、大きめのタスクだったと思います。ありがとうございます！！

exampleに関しては、まあ次のprebuild版リリースが出るまでに修正できていれば良いのかなと少し思いました。
なのでこのPRはちゃちゃっとマージし、exampleの修正を追加でするか、issueを作るのが良いのかなと思いました！

qryxip

LGTM!

qwerty2501 · 2022-08-25T15:45:45Z

@Hiroshiba LGTMもらってますが、usize_is_size_tを使うか使わないかの判断がくだされてないようです。
#217 (comment)
あと私のほうではmergeできないようです。 PR出した人間だからかもしれませんが

qwerty2501 · 2022-08-26T00:30:27Z

crates/voicevox_core_c_api/src/lib.rs

    text: *const c_char,
-    speaker_id: i64,
+    speaker_id: usize,
+    options: VoicevoxTtsOptions,
    output_binary_size: *mut c_int,


今見返していたらoutput_binary_sizeは *mut usize としたほうが良さそう

output_binary_size-> output_wav_sizeに名前変更してusizeに変更しました

qwerty2501 · 2022-08-26T00:49:42Z

追記:
size_tにするべきかどうかですが、これはbyte長やデータの長さを表す値の型としてよく使われています。
そのため voicevox_decode_forward の length や voicevox_tts の output_binary_size などで size_tが使われるのは妥当だと思われます。
反面、 speaker_id はデータの長さを示すものではなくIDですのでこれはsize_tとするのは望ましくないかなと思われます。
Rustだと数値はとりあえずusizeにしていることが多い印象だったのでusizeにしましたが、usizeは使用せずu32ないしはu64とするべきかもしれません。(元がi64だったことを考えると桁あふれしないようにu32にするべきかも？ただ実際に使用されているidの値がu32の最大値以内である必要はあります)

qwerty2501 · 2022-08-26T02:42:43Z

~~usize_is_size_t を有効にせずに length や output_binary_size にだけ c_size_t型を使用するというアプローチもありそうですね~~
c_size_tはstable rustには含まれていないようなのでやはり usize_is_size_t を有効にするしか無いかも

Hiroshiba · 2022-08-26T15:36:17Z

size_tにするべきかどうか

length・output_binary_sizeはsize_tで良いと思います！

speaker_idは難しくて、互換性考えると（このCOREをforkしているかもしれない人を考えると）i64のままにするのが一番望ましいと思います。
が、VOICEVOX以外でこのコアを使う人はいないと思います。
であればVOICEVOX的にはsize_tでもu32でも問題ありません。
でも特にこだわりがないので、互換性を失わないi64が良いのかなとちょっと思いました。

i64かu32で･･･！

Hiroshiba · 2022-08-26T15:39:03Z

あと私のほうではmergeできないようです

今確認したのですが、inviteのままになっていそうです！
たぶんメールか、github右上の通知辺りになにか出てると思います！

qwerty2501 · 2022-08-26T15:48:37Z

@Hiroshiba ~~では usize_is_size_t を有効にしてsize_tにして、speaker_idはusizeを使用しないようにします。~~
usize_is_size_tを有効にするとC++だとそのままだとコンパイルエラーになるためいったん usize_is_size_t の有効化は見送ろうと思います。(C++だとsize_tはstd namespaceにあるのでstd::size_tと書く必要があるが、出力されるのはそのままsize_tであるため。オプション弄くればうまく出力できるかもですが・・・)

i64かu32を使用するかについてですが、idは負数をとらないことからu32のほうが望ましいと考えてます。
互換性についてですがengineはおそらくPyO3版を使用するようになるのでどのみち書き直すこと、またすでにcoreを利用するユーザーがいたとしても関数が大きく変わってるため大幅な書き直しを強いられるためi64のままにするメリットはあまりないのでは？

qwerty2501 · 2022-08-26T16:43:28Z

とりあえずspeaker_idをu32にしちゃいましたが、もしこっちのほうが良いというのがあれば教えてください

Hiroshiba · 2022-08-26T16:49:28Z

u32で良いと思います！

意図としてはこのライブラリのユーザーではなく、このライブラリをforkして自作コアを作っている人向けの配慮です。
例えばこのCOREをフォークし、speaker_idが-100なキャラな話者を作っていた場合、u32になるといろいろと対応が大変そうだなという感じです。
（ただ↑でも書いていますが、そういう方はいらっしゃらない気がするので、ボイボのみに合わせてu32でも良いと思います。）

qwerty2501 · 2022-08-26T16:51:42Z

確かに型として負数を取れていたので負数がidなspeakerを作ってる人がいる可能性はゼロではないですね。
けど大丈夫なはず。大丈夫・・・

Hiroshiba · 2022-08-26T17:30:11Z

良いと思います！
テストも兼ねて、マージボタン試して頂けると！

qwerty2501 · 2022-08-26T22:02:49Z

merged!

C APIの改善を行う

0429477

Hiroshiba mentioned this pull request Aug 6, 2022

テキスト音声合成ライブラリの作成 VOICEVOX/voicevox_project#1

Closed

qwerty2501 marked this pull request as draft August 9, 2022 16:44

qwerty2501 mentioned this pull request Aug 15, 2022

APIドキュメントが欲しい #242

Closed

qryxip mentioned this pull request Aug 16, 2022

AudioQueryを入出力する #244

Merged

qwerty2501 added 3 commits August 20, 2022 08:45

Merge branch 'main' of https://github.com/VOICEVOX/voicevox_core into…

fce9290

… feature/improve-c-api

C API定義を修正した

60bc086

voicevox_core側の実装も変更した

af7a3fe

qwerty2501 commented Aug 20, 2022

View reviewed changes

qryxip reviewed Aug 20, 2022

View reviewed changes

crates/voicevox_core/src/publish.rs Outdated Show resolved Hide resolved

cpu_num_threadsをu16に変更した

466fdd9

Hiroshiba requested changes Aug 24, 2022

View reviewed changes

crates/voicevox_core/src/publish.rs Outdated Show resolved Hide resolved

crates/voicevox_core_c_api/src/lib.rs Outdated Show resolved Hide resolved

crates/voicevox_core_c_api/src/lib.rs Outdated Show resolved Hide resolved

crates/voicevox_core_c_api/src/lib.rs Outdated Show resolved Hide resolved

is_gpu_modeに名称変更

f6e4d9b

qwerty2501 force-pushed the feature/improve-c-api branch from 9ccb69d to 4da20dd Compare August 25, 2022 10:48

defaultオプション生成関数にmakeをつけた　

a9c2cee

qwerty2501 force-pushed the feature/improve-c-api branch from 4da20dd to a9c2cee Compare August 25, 2022 10:53

Hiroshiba approved these changes Aug 25, 2022

View reviewed changes

qryxip approved these changes Aug 25, 2022

View reviewed changes

qwerty2501 commented Aug 26, 2022

View reviewed changes

output_binary_size->output_wav_sizeに変更型をc_int->usizeに変更した

a599e17

speaker_idをu32に変更した

d5af118

qwerty2501 merged commit 3bf2043 into VOICEVOX:main Aug 26, 2022

PickledChair mentioned this pull request Aug 27, 2022

コア（共有ライブラリ）の読み込みにおいてコアの API 変更に追従する VOICEVOX/voicevox_engine#454

Closed

5 tasks

qryxip mentioned this pull request Aug 27, 2022

PyO3を使ってクレートからwhlを作る #239

Merged

qwerty2501 mentioned this pull request Sep 3, 2022

Doxygenスタイルでのコメント生成&github pages でAPIドキュメント公開 #257

Merged

PickledChair mentioned this pull request Oct 12, 2022

example/python が実行できない #308

Closed

qryxip mentioned this pull request Dec 27, 2022

null pointerのopen_jtalk_dict_dirをNoneと解釈する #351

Merged

qryxip mentioned this pull request Mar 20, 2023

新クラス設計API #370

Merged

qryxip mentioned this pull request Apr 3, 2023

Pathの適切ではないlossyなUTF-8変換をやめる VOICEVOX/open_jtalk-rs#12

Merged

C APIの改善を行う #217

C APIの改善を行う #217

Conversation

qwerty2501 commented Aug 6, 2022 • edited Loading

内容

関連 Issue

qwerty2501 commented Aug 6, 2022

Hiroshiba commented Aug 6, 2022

Hiroshiba commented Aug 6, 2022 • edited Loading

qwerty2501 commented Aug 6, 2022

Hiroshiba commented Aug 7, 2022

qwerty2501 commented Aug 7, 2022

Hiroshiba commented Aug 7, 2022 • edited Loading

qwerty2501 commented Aug 7, 2022

Hiroshiba commented Aug 7, 2022

qwerty2501 commented Aug 7, 2022

qwerty2501 left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

qwerty2501 Aug 20, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

qwerty2501 commented Aug 20, 2022

qwerty2501 commented Aug 20, 2022

Hiroshiba commented Aug 20, 2022

qryxip commented Aug 20, 2022 • edited Loading

qwerty2501 commented Aug 21, 2022

qryxip commented Aug 21, 2022 • edited Loading

qwerty2501 commented Aug 21, 2022 • edited Loading

qwerty2501 commented Aug 21, 2022

Hiroshiba commented Aug 21, 2022

Hiroshiba left a comment • edited Loading

Choose a reason for hiding this comment

qwerty2501 commented Aug 25, 2022

Hiroshiba left a comment

Choose a reason for hiding this comment

qryxip left a comment

Choose a reason for hiding this comment

qwerty2501 commented Aug 25, 2022

qwerty2501 Aug 26, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

qwerty2501 commented Aug 26, 2022

qwerty2501 commented Aug 26, 2022 • edited Loading

Hiroshiba commented Aug 26, 2022

Hiroshiba commented Aug 26, 2022

qwerty2501 commented Aug 26, 2022 • edited Loading

qwerty2501 commented Aug 26, 2022

Hiroshiba commented Aug 26, 2022

qwerty2501 commented Aug 26, 2022

Hiroshiba commented Aug 26, 2022

qwerty2501 commented Aug 26, 2022

qwerty2501 commented Aug 6, 2022 •

edited

Loading

Hiroshiba commented Aug 6, 2022 •

edited

Loading

Hiroshiba commented Aug 7, 2022 •

edited

Loading

qwerty2501 left a comment •

edited

Loading

qwerty2501 Aug 20, 2022 •

edited

Loading

qryxip commented Aug 20, 2022 •

edited

Loading

qryxip commented Aug 21, 2022 •

edited

Loading

qwerty2501 commented Aug 21, 2022 •

edited

Loading

Hiroshiba left a comment •

edited

Loading

qwerty2501 Aug 26, 2022 •

edited

Loading

qwerty2501 commented Aug 26, 2022 •

edited

Loading

qwerty2501 commented Aug 26, 2022 •

edited

Loading