[feat]: add ppnet & upgrade backbone (#434)

* add ppnet & upgrade backbone, mmoe&mask block component
alibaba · Nov 29, 2023 · 1fb889d
1 parent 62ddbc1
commit 1fb889d
Showing 23 changed files with 901 additions and 62 deletions.
diff --git a/README.md b/README.md
@@ -61,7 +61,7 @@ Running Platform:
 ### A variety of models
 
 - [DSSM](docs/source/models/dssm.md) / [MIND](docs/source/models/mind.md) / [DropoutNet](docs/source/models/dropoutnet.md) / [CoMetricLearningI2I](docs/source/models/co_metric_learning_i2i.md) / [PDN](docs/source/models/pdn.md)
-- [W&D](docs/source/models/wide_and_deep.md) / [DeepFM](docs/source/models/deepfm.md) / [MultiTower](docs/source/models/multi_tower.md) / [DCN](docs/source/models/dcn.md) / [FiBiNet](docs/source/models/fibinet.md) / [MaskNet](docs/source/models/masknet.md) / [CDN](docs/source/models/cdn.md)
+- [W&D](docs/source/models/wide_and_deep.md) / [DeepFM](docs/source/models/deepfm.md) / [MultiTower](docs/source/models/multi_tower.md) / [DCN](docs/source/models/dcn.md) / [FiBiNet](docs/source/models/fibinet.md) / [MaskNet](docs/source/models/masknet.md) / [PPNet](docs/source/models/ppnet.md) / [CDN](docs/source/models/cdn.md)
 - [DIN](docs/source/models/din.md) / [BST](docs/source/models/bst.md) / [CL4SRec](docs/source/models/cl4srec.md)
 - [MMoE](docs/source/models/mmoe.md) / [ESMM](docs/source/models/esmm.md) / [DBMTL](docs/source/models/dbmtl.md) / [PLE](docs/source/models/ple.md)
 - [HighwayNetwork](docs/source/models/highway.md) / [CMBF](docs/source/models/cmbf.md) / [UNITER](docs/source/models/uniter.md)

diff --git a/docs/images/models/ppnet.jpg b/docs/images/models/ppnet.jpg
diff --git a/docs/source/component/backbone.md b/docs/source/component/backbone.md
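@@ -946,6 +946,127 @@ In `model_params`, the DBMTL model needs each subtask Tower to configure `relation_d
 
 This case likewise does not configure `concat_blocks` for the backbone; the framework automatically sets it to the leaf nodes of the DAG.
 
+## Case 10: MaskNet + PPNet + MMoE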
+
+```protobuf
+model_config: {
+  model_name: 'MaskNet + PPNet + MMoE'
+  model_class: 'RankModel'
+  feature_groups: {
+    group_name: 'memorize'
+    feature_names: 'user_id'
+    feature_names: 'adgroup_id'
+    feature_names: 'pid'
+    wide_deep: DEEP
+  }
+  feature_groups: {
+    group_name: 'general'
+    feature_names: 'age_level'
+    feature_names: 'shopping_level'
+    ...
+    wide_deep: DEEP
+  }
+  backbone {
+    blocks {
+      name: "mask_net"
+      inputs {
+        feature_group_name: "general"
+      }
+      repeat {
+        num_repeat: 3
+        keras_layer {
+          class_name: "MaskBlock"
+          mask_block {
+            output_size: 512
+            aggregation_size: 1024
+          }
+        }
+      }
+    }
+    blocks {
+      name: "ppnet"
+      inputs {
+        block_name: "mask_net"
+      }
+      inputs {
+        feature_group_name: "memorize"
+      }
+      merge_inputs_into_list: true
+      repeat {
+        num_repeat: 3
+        input_fn: "lambda x, i: [x[0][i], x[1]]"
+        keras_layer {
+          class_name: "PPNet"
+          ppnet {
+            mlp {
+              hidden_units: [256, 128, 64]
+            }
+            gate_params {
+              output_dim: 512
+            }
+            mode: "eager"
+            full_gate_input: false
+          }
+        }
+      }
+    }
+    blocks {
+      name: "mmoe"
+      inputs {
+        block_name: "ppnet"
+      }
+      inputs {
+        feature_group_name: "general"
+      }
+      keras_layer {
+        class_name: "MMoE"
+        mmoe {
+          num_task: 2
+          num_expert: 3
+        }
+      }
+    }
+  }
+  model_params {
+    l2_regularization: 0.0
+    task_towers {
+      tower_name: "ctr"
+      label_name: "is_click"
+      metrics_set {
+        auc {
+          num_thresholds: 20000
+        }
+      }
+      loss_type: CLASSIFICATION
+      num_class: 1
+      dnn {
+        hidden_units: 64
+        hidden_units: 32
+      }
+      weight: 1.0
+    }
+    task_towers {
+      tower_name: "cvr"
+      label_name: "is_train"
+      metrics_set {
+        auc {
+          num_thresholds: 20000
+        }
+      }
+      loss_type: CLASSIFICATION
+      num_class: 1
+      dnn {
+        hidden_units: 64
+        hidden_units: 32
+      }
+      weight: 1.0
+    }
+  }
+}
+```
+
+This case shows how to use a [repeat block](#id21).
+
 ## More cases
 
 Two new models:
@@ -1002,6 +1123,7 @@ Results on the MovieLens-1M dataset:
 | SENet     | Models feature importance              | Component of the FiBiNet model | [MMoE](../models/mmoe.html#id4)                       |
 | MaskBlock | Models feature importance              | Component of the MaskNet model | [Cross Decoupling Network](../models/cdn.html#id2)    |
 | MaskNet   | Multiple serial or parallel MaskBlocks | MaskNet model                  | [DBMTL](../models/dbmtl.html#dbmtl-based-on-backbone) |
+| PPNet     | Parameter personalized network         | PPNet model                    | [PPNet](../models/ppnet.html#id2)                     |
 
 ## 4. Sequence feature encoding components
 
@@ -1310,6 +1432,10 @@ repeat {
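 - `num_repeat`: the number of times the component is executed
 - `output_concat_axis`: the axis along which the output tensors of the repeated executions are concatenated; if unset, the block outputs a list of the per-execution results
 - `keras_layer`: the component to execute
+- `input_slice`: the input slice for each execution, e.g. `[i]` takes the i-th element of the input list as the input of the i-th repetition; if unset, every execution receives all inputs
+- `input_fn`: a function that builds the input for each execution, e.g. `input_fn: "lambda x, i: [x[0][i], x[1]]"`
+
+For a usage example of the `repeat block`, see [MaskNet+PPNet+MMoE](#masknet-ppnet-mmoe).
 
 ## 7. Sequence blocks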
 

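The `input_fn` option described in the repeat-block hunk above can be illustrated with a small pure-Python sketch. It assumes, based only on the text in this diff, that the i-th repetition receives `input_fn(x, i)` and that the block returns a list when `output_concat_axis` is not set. `run_repeat` and `toy_layer` are hypothetical names for illustration, not EasyRec APIs, and strings stand in for tensors.

```python
def run_repeat(inputs, layer, num_repeat, input_fn=None):
    """Hypothetical stand-in for a repeat block: call `layer` num_repeat times."""
    outputs = []
    for i in range(num_repeat):
        # Without input_fn, every repetition receives all inputs unchanged.
        layer_input = input_fn(inputs, i) if input_fn else inputs
        outputs.append(layer(layer_input))
    return outputs  # a list, as when output_concat_axis is not configured


# Toy values standing in for tensors: the "mask_net" block repeats 3 times and
# therefore yields a list of 3 outputs; "memorize" is a single feature group.
mask_net_out = ["mask_0", "mask_1", "mask_2"]
memorize = "memorize_emb"
merged = [mask_net_out, memorize]  # effect of merge_inputs_into_list: true

# The config stores input_fn as a string; eval is used here only to mimic that.
input_fn = eval("lambda x, i: [x[0][i], x[1]]")


def toy_layer(pair):
    # A real PPNet layer would gate an MLP over pair[0] using pair[1].
    return "ppnet(%s, %s)" % (pair[0], pair[1])


result = run_repeat(merged, toy_layer, num_repeat=3, input_fn=input_fn)
print(result)
```

Under these assumptions, each of the three PPNet repetitions in Case 10 would see a different MaskBlock output paired with the same `memorize` features.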
diff --git a/docs/source/component/component.md b/docs/source/component/component.md
@@ -96,6 +96,25 @@
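 | use_parallel | bool | true     | Whether to use parallel mode |
 | mlp          | MLP  | optional | Top MLP                      |
 
+- PPNet
+
+| Parameter       | Type   | Default | Description                                                                                                            |
+| --------------- | ------ | ------- | ---------------------------------------------------------------------------------------------------------------------- |
+| mlp             | MLP    |         | MLP configuration                                                                                                      |
+| gate_params     | GateNN |         | Configuration of the parameter-personalization gate network                                                            |
+| mode            | string | eager   | Whether the personalization is applied to the input or to the output of each MLP layer; options: \[eager, lazy\]       |
+| full_gate_input | bool   | true    | Whether to also feed the MLP input, after stop_gradient, into the gate network                                         |
+
+The parameters of GateNN are:
+
+| Parameter    | Type   | Default                                    | Description                                                                                                     |
+| ------------ | ------ | ------------------------------------------ | --------------------------------------------------------------------------------------------------------------- |
+| output_dim   | uint32 | output units of the preceding MLP layer    | Output dimension of the gate network; in eager mode it must be set to the input units of the first MLP layer   |
+| hidden_dim   | uint32 | output_dim                                 | Number of hidden units                                                                                          |
+| dropout_rate | float  | 0.0                                        | Dropout rate of the hidden layer                                                                                |
+| activation   | str    | relu                                       | Activation function of the hidden layer                                                                         |
+| use_bn       | bool   | true                                       | Whether the hidden layer uses batch normalization                                                               |
+
 ## 4. Sequence feature encoding components
 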
 - SeqAugment
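
The `mode` parameter in the PPNet table above can be pictured with a minimal pure-Python sketch. Several things here are assumptions rather than facts from this commit: the gate is modeled LHUC-style as `2 * sigmoid(linear(gate_input))`; `eager` is taken to mean "scale the layer input" and `lazy` "scale the layer output" (inferred from the note that `output_dim` must match the first layer's input units in eager mode); and the hidden gate layer, biases, batch normalization, dropout, and stop_gradient are all omitted. Every function name is hypothetical.

```python
import math


def dense(x, w):
    """Linear layer without bias: `w` holds one weight row per output unit."""
    return [sum(xi * wi for xi, wi in zip(x, row)) for row in w]


def relu(v):
    return [max(0.0, a) for a in v]


def gate(gate_input, w_gate):
    """Assumed gate: 2 * sigmoid(linear), so every scale lies in (0, 2)."""
    return [2.0 / (1.0 + math.exp(-z)) for z in dense(gate_input, w_gate)]


def ppnet_layer(x, gate_input, w_layer, w_gate, mode):
    """One gated MLP layer; the gate width must match what it scales."""
    scale = gate(gate_input, w_gate)
    if mode == "eager":
        # Assumed eager behavior: scale the layer input, then apply the layer.
        assert len(scale) == len(x)
        return relu(dense([a * s for a, s in zip(x, scale)], w_layer))
    if mode == "lazy":
        # Assumed lazy behavior: apply the layer, then scale its output.
        h = relu(dense(x, w_layer))
        assert len(scale) == len(h)
        return [a * s for a, s in zip(h, scale)]
    raise ValueError("unknown mode: %s" % mode)


x = [1.0, -2.0, 3.0]            # toy "general" features (3 input units)
gate_input = [0.5, -1.0]        # toy "memorize" features
w_layer = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # 3 inputs -> 2 outputs

# Zero gate weights give a scale of exactly 1.0, so both modes reduce to
# the plain layer; trained gate weights would rescale units per sample.
eager_out = ppnet_layer(x, gate_input, w_layer, [[0.0, 0.0]] * 3, "eager")
lazy_out = ppnet_layer(x, gate_input, w_layer, [[0.0, 0.0]] * 2, "lazy")
print(eager_out, lazy_out)
```

Note how the gate needs 3 outputs in the eager call (the layer's input width) but 2 in the lazy call (the layer's output width), which is the shape constraint the `output_dim` row describes.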

diff --git a/docs/source/component/sequence.md b/docs/source/component/sequence.md
@@ -0,0 +1,79 @@
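+# Configuring sequence components
+
+Component-based configuration of sequence models (DIN, BST) requires all input features to be placed in the same `feature_group`.
+
+A sequence model generally consists of two parts, the `history behavior sequence` and the `target item`, and each part may contain multiple attributes (sub-features).
+
+Within the `feature_group` that feeds the sequence component, define the sub-features of the `history behavior sequence` and the `target item` **in order**.
+
+The framework uses each feature's `feature_type` field to decide whether it belongs to the `history behavior sequence` or to the `target item`.
+All sub-features of type `SequenceFeature` are treated as part of the `history behavior sequence`; all sub-features of any other type are treated as part of the `target item`.
+
+**The sub-features of the two parts must be listed in the same order.** In the example below:
+
+- `concat([cate_id,brand], axis=-1)` is the final embedding of the `target item` (2D);
+- `concat([tag_category_list, tag_brand_list], axis=-1)` is the final embedding of the `history behavior sequence` (3D).
+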
+```protobuf
+model_config: {
+  model_name: 'DIN'
+  model_class: 'RankModel'
+  ...
+  feature_groups: {
+    group_name: 'sequence'
+    feature_names: "cate_id"
+    feature_names: "brand"
+    feature_names: "tag_category_list"
+    feature_names: "tag_brand_list"
+    wide_deep: DEEP
+  }
+  backbone {
+    blocks {
+      name: 'seq_input'
+      inputs {
+        feature_group_name: 'sequence'
+      }
+      input_layer {
+        output_seq_and_normal_feature: true
+      }
+    }
+    blocks {
+      name: 'DIN'
+      inputs {
+        block_name: 'seq_input'
+      }
+      keras_layer {
+        class_name: 'DIN'
+        din {
+          attention_dnn {
+            hidden_units: 32
+            hidden_units: 1
+            activation: "dice"
+          }
+          need_target_feature: true
+        }
+      }
+    }
+    ...
+  }
+}
+```
+
+When using a sequence component, you must configure a `block` of type `input_layer` and set `output_seq_and_normal_feature: true`, as shown below.
+
+```protobuf
+blocks {
+  name: 'seq_input'
+  inputs {
+    feature_group_name: 'sequence'
+  }
+  input_layer {
+    output_seq_and_normal_feature: true
+  }
+}
+```
+
+## Complete examples
+
+- [DIN](../models/din.md)
+- [BST](../models/bst.md)
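
The type-based split described in `sequence.md` above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `split_sequence_group` is a hypothetical helper, not an EasyRec function, and the `IdFeature` type assigned to `cate_id` and `brand` is assumed for the example (the doc only says non-`SequenceFeature` types go to the target item).

```python
def split_sequence_group(feature_names, feature_types):
    """Split one feature group into (history sequence, target item) sub-features.

    Order within each part follows the order of definition in the group.
    """
    sequence, target = [], []
    for name in feature_names:
        if feature_types[name] == "SequenceFeature":
            sequence.append(name)
        else:
            target.append(name)
    return sequence, target


# Feature names from the 'sequence' group in the DIN example; the types are
# assumed for illustration.
names = ["cate_id", "brand", "tag_category_list", "tag_brand_list"]
types = {
    "cate_id": "IdFeature",
    "brand": "IdFeature",
    "tag_category_list": "SequenceFeature",
    "tag_brand_list": "SequenceFeature",
}

sequence, target = split_sequence_group(names, types)
# Because both parts keep their definition order, position i of the target
# lines up with position i of the history sequence.
pairs = list(zip(target, sequence))
print(pairs)
```

If the two parts were listed in different orders, the zipped pairs would match a category attribute against a brand sequence, which is why the doc insists on a consistent order.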
diff --git a/docs/source/index.rst b/docs/source/index.rst
@@ -30,6 +30,7 @@ Welcome to easy_rec's documentation!
 
    component/backbone
    component/component
+   component/sequence
 
 .. toctree::
    :maxdepth: 3

diff --git a/docs/source/models/ppnet.md b/docs/source/models/ppnet.md
@@ -0,0 +1,95 @@
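+# PPNet (Parameter Personalized Net)
+
+### Introduction
+
+The core idea of PPNet comes from LHUC (learning hidden unit contributions), an algorithm proposed in 2016 for speech recognition.
+LHUC performs speaker adaptation: within a DNN, it learns speaker-specific hidden unit contributions for each speaker,
+which improves recognition accuracy across different speakers.
+
+Borrowing the idea of LHUC, PPNet designs a gating mechanism that personalizes the parameters of the DNN and helps the model converge quickly.
+
+![ppnet](../../images/models/ppnet.jpg)
+
+### Configuration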
+
+```protobuf
+model_config: {
+  model_name: 'PPNet'
+  model_class: 'RankModel'
+  feature_groups: {
+    group_name: 'memorize'
+    feature_names: 'user_id'
+    feature_names: 'adgroup_id'
+    feature_names: 'pid'
+    wide_deep: DEEP
+  }
+  feature_groups: {
+    group_name: 'general'
+    feature_names: 'cms_segid'
+    feature_names: 'cms_group_id'
+    feature_names: 'age_level'
+    feature_names: 'pvalue_level'
+    feature_names: 'shopping_level'
+    feature_names: 'occupation'
+    feature_names: 'new_user_class_level'
+    feature_names: 'cate_id'
+    feature_names: 'campaign_id'
+    feature_names: 'customer'
+    feature_names: 'brand'
+    feature_names: 'price'
+    feature_names: 'tag_category_list'
+    feature_names: 'tag_brand_list'
+    wide_deep: DEEP
+  }
+  backbone {
+    blocks {
+      name: "ppnet"
+      inputs {
+        feature_group_name: "general"
+      }
+      inputs {
+        feature_group_name: "memorize"
+      }
+      merge_inputs_into_list: true
+      keras_layer {
+        class_name: "PPNet"
+        ppnet {
+          mlp {
+            hidden_units: [512, 256]
+          }
+          mode: "lazy"
+          full_gate_input: true
+        }
+      }
+    }
+    top_mlp {
+      hidden_units: [128, 64]
+    }
+  }
+  model_params {
+    l2_regularization: 1e-6
+  }
+  embedding_regularization: 1e-5
+}
+```
+
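+- model_name: any custom string; it serves only as a comment
+- model_class: 'RankModel'; do not change it. All single-objective ranking models built from components use this name
+- feature_groups: configures a group of features.
+- backbone: the backbone network built from components; see the [reference](../component/backbone.md)
+  - blocks: a directed acyclic graph (DAG) made up of multiple `blocks`; the framework executes the code associated with each `block` in the topological order of the DAG, building a subgraph of the TF Graph
+  - name/inputs: each `block` has a unique name and one or more inputs and outputs
+  - keras_layer: loads the custom or built-in Keras layer specified by `class_name` and executes its logic; see the [reference](../component/backbone.md#keraslayer)
+  - ppnet: the basic PPNet component; for its parameters see the [reference](../component/component.md#id4)
+  - concat_blocks: the output nodes of the DAG are defined by `concat_blocks`; if `concat_blocks` is not configured, the framework automatically concatenates all leaf nodes of the DAG and outputs the result.
+- model_params:
+  - l2_regularization: (optional) regularization on the DNN parameters, to reduce overfitting
+- embedding_regularization: regularization on the embeddings, to reduce overfitting
+
+### Example config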
+
+[ppnet_on_taobao.config](https://github.com/alibaba/EasyRec/tree/master/samples/model_config/ppnet_on_taobao.config)
+
+### Reference paper
+
+[PEPNet: Parameter and Embedding Personalized Network for Infusing with Personalized Prior Information](https://arxiv.org/pdf/2302.01115.pdf)
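
The `blocks` and `concat_blocks` notes in `ppnet.md` above describe two graph operations: executing blocks in topological order, and falling back to the DAG's leaf nodes as outputs. A minimal sketch of both follows, using the three block names from the Case 10 backbone in this commit. `topo_order` and `leaf_blocks` are hypothetical helpers, not EasyRec code, and feature-group inputs are left out because they are not blocks.

```python
from collections import deque


def topo_order(block_inputs):
    """Kahn's algorithm. `block_inputs` maps a block name to its upstream blocks."""
    indegree = {name: len(deps) for name, deps in block_inputs.items()}
    children = {name: [] for name in block_inputs}
    for name, deps in block_inputs.items():
        for dep in deps:
            children[dep].append(name)
    ready = deque(sorted(n for n, d in indegree.items() if d == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)
    if len(order) != len(block_inputs):
        raise ValueError("blocks contain a cycle, so they do not form a DAG")
    return order


def leaf_blocks(block_inputs):
    """Blocks that no other block reads; the assumed default for concat_blocks."""
    used = {dep for deps in block_inputs.values() for dep in deps}
    return [name for name in block_inputs if name not in used]


# Block-to-block edges of the MaskNet + PPNet + MMoE example.
blocks = {"mmoe": ["ppnet"], "ppnet": ["mask_net"], "mask_net": []}

order = topo_order(blocks)
leaves = leaf_blocks(blocks)
print(order, leaves)
```

In this example only `mmoe` is a leaf, so it alone would form the backbone output when `concat_blocks` is omitted.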
diff --git a/docs/source/models/rank.rst b/docs/source/models/rank.rst
@@ -17,6 +17,7 @@
    masknet
    fibinet
    cdn
+   ppnet
    cl4srec
    regression
    multi_cls