
Releases: flexflow/flexflow-train

Release 22.07

01 Aug 04:07
d670657

This is the last stable release of FlexFlow before the Unity merge. Unity enables joint optimization of algebraic transformations and parallelization, and generally achieves better performance and scalability than the original FlexFlow. The Unity merge introduces the following major changes to FlexFlow.

  • With Unity, we now use parallel computation graphs (PCGs) to represent a DNN model. PCG is a unified representation of
    distributed DNN training that simultaneously expresses computation, parallelism, and data movement. A detailed description of PCG is available here.

  • We add support for Unity's additional forms of parallelism, including reduction parallelism and other operator-specific parallelization strategies.

  • We replace FlexFlow's MCMC search with a three-layer hierarchical search algorithm, which jointly optimizes algebraic transformations and parallelization and achieves better performance and scalability than FlexFlow's MCMC search.

Starting from this release, Unity's changes will be available in the master branch of the FlexFlow repository.

Release 22.05

08 Jun 16:20
ad627c9

This is a stable release of FlexFlow in preparation for the Unity merge.

Frontend support:

PyTorch Alignment:

  • Added unit tests for aligning FlexFlow's operators with PyTorch's. For each operator, the unit test checks whether FlexFlow and PyTorch return identical activations/gradients when given the same inputs. More details on the PyTorch alignment are available at https://github.com/flexflow/FlexFlow/tree/master/align
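The alignment tests follow a simple pattern: run two implementations of the same operator on identical inputs and compare activations and gradients elementwise. The sketch below illustrates that pattern in plain NumPy (a stand-in, not FlexFlow's actual harness; the two ReLU implementations and the finite-difference check are illustrative assumptions).

```python
import numpy as np

def relu_reference(x):
    """Stands in for the PyTorch side of an alignment test."""
    return np.maximum(x, 0.0)

def relu_candidate(x):
    """Stands in for the FlexFlow side: an independent implementation."""
    return x * (x > 0)

def relu_grad(x, upstream):
    """Analytic backward pass: dReLU/dx = 1 where x > 0."""
    return upstream * (x > 0)

def numeric_grad(f, x, upstream, eps=1e-5):
    """Central finite differences, to cross-check the analytic gradient."""
    g = np.zeros_like(x)
    for i in np.ndindex(*x.shape):
        xp, xm = x.copy(), x.copy()
        xp[i] += eps
        xm[i] -= eps
        g[i] = np.sum((f(xp) - f(xm)) * upstream) / (2 * eps)
    return g

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 4))
upstream = rng.standard_normal((3, 4))

# Identical inputs must yield identical activations...
assert np.allclose(relu_reference(x), relu_candidate(x))
# ...and the analytic gradient must match finite differences.
assert np.allclose(relu_grad(x, upstream),
                   numeric_grad(relu_reference, x, upstream))
```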

Documentation:

Operators:

  • Multiple bug fixes for FlexFlow operators

Broadcast:

  • FlexFlow now supports broadcasting for a subset of operators, including elementwise unary and elementwise binary operators. The broadcasting semantics are identical to NumPy's.
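Since the semantics follow NumPy, the behavior can be demonstrated with NumPy itself: shapes are compared right-to-left, and a dimension of size 1 (or a missing leading dimension) is stretched to match.

```python
import numpy as np

a = np.ones((4, 3))             # shape (4, 3)
b = np.arange(3)                # shape (3,)   -> broadcast to (4, 3)
c = np.arange(4).reshape(4, 1)  # shape (4, 1) -> broadcast to (4, 3)

print((a + b).shape)  # (4, 3): b is replicated along the leading axis
print((a * c).shape)  # (4, 3): c is replicated along the trailing axis
```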

Release 21.09 (September 30, 2021)

06 Oct 14:57

Frontend Support

  • PyBind11 is now the default Python frontend in FlexFlow.

Control Replication

Distributed training

  • FlexFlow now uses NCCL AllReduce for gradient synchronization by default. To switch to a distributed parameter server, set FF_USE_NCCL=OFF in cmake.
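Conceptually, an AllReduce-based sync sums each gradient tensor across workers and leaves every worker holding the same averaged result. The toy sketch below (plain NumPy, no NCCL; the function name is illustrative) shows what that synchronization computes.

```python
import numpy as np

def allreduce_mean(per_worker_grads):
    """Toy model of AllReduce gradient sync: every worker ends up
    with the elementwise mean of all workers' gradients."""
    summed = np.sum(per_worker_grads, axis=0)
    mean = summed / len(per_worker_grads)
    # In real NCCL, each rank keeps its own copy of the same result.
    return [mean.copy() for _ in per_worker_grads]

grads = [np.array([1.0, 2.0]), np.array([3.0, 4.0])]
synced = allreduce_mean(grads)
print(synced[0])  # [2. 3.] -- identical on every worker
```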

Distributed inference

  • Passing comp_node = CompMode::INFERENCE as an additional argument to model.compile runs a DNN model in inference mode
  • Various bug fixes and performance improvements for distributed inference in FlexFlow.

Operators

  • Additional operators include AggregateSpec and Multi-Head Attention.

Machine Model

  • FlexFlow now supports a new machine model that more precisely models network topology and simulates traffic at the granularity of individual packets.

Release 21.03 (March 31, 2021)

02 Apr 21:19
142e2b1
  • Build
    • FlexFlow now uses CMake build by default; the Makefiles will be deprecated soon.
  • Frontend Support
    • In addition to CFFI, FlexFlow now also supports a Python interface via PyBind11. To use PyBind11, please set FF_USE_PYBIND=ON in cmake.
  • Distributed inference
  • Runtime
    • FlexFlow now supports gradient updates via either a Parameter Server or NCCL AllReduce. To enable NCCL, please set FF_USE_NCCL=ON in cmake.
  • Operators
    • New operators including Aggregate, Multi-head Attention, Scalar Multiply, Scalar Add, Scalar Sub, Scalar Divide and Top-K.
    • Conv2D now supports group convolutions.
  • Examples
    • Unit tests of all operators have been added to the tests/ops folder.
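The group-convolution support added to Conv2D splits the input channels into groups, with each output channel convolving only over its group's slice of input channels. A naive NumPy sketch of that arithmetic (assumed layouts, stride 1, no padding; not FlexFlow's implementation):

```python
import numpy as np

def conv2d_grouped(x, w, groups):
    """Naive grouped 2D convolution.
    x: (C_in, H, W); w: (C_out, C_in // groups, kH, kW); stride 1, no padding."""
    c_in, h, wd = x.shape
    c_out, c_in_g, kh, kw = w.shape
    assert c_in % groups == 0 and c_out % groups == 0
    assert c_in_g == c_in // groups
    oh, ow = h - kh + 1, wd - kw + 1
    out = np.zeros((c_out, oh, ow))
    c_out_g = c_out // groups
    for g in range(groups):
        xg = x[g * c_in_g:(g + 1) * c_in_g]  # this group's input channels
        for oc in range(g * c_out_g, (g + 1) * c_out_g):
            for i in range(oh):
                for j in range(ow):
                    out[oc, i, j] = np.sum(xg[:, i:i+kh, j:j+kw] * w[oc])
    return out

x = np.ones((4, 5, 5))
w = np.ones((6, 2, 3, 3))  # groups=2: each filter sees 4/2 = 2 input channels
y = conv2d_grouped(x, w, groups=2)
print(y.shape)  # (6, 3, 3)
```

With groups=1 this reduces to an ordinary convolution; with groups equal to the channel count it becomes a depthwise convolution.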

Release 20.12 (December 21, 2020)

04 Jan 19:40
  • Build
    • FlexFlow now supports both Makefile and CMake builds. More details are available in the build instructions.
  • Frontend Support
  • Parallelization Optimizer
    • Integrated the parallelization optimizer into the FlexFlow runtime. Users can now use the --search-budget and --search-alpha flags to control the FlexFlow parallelization optimizer when searching for optimized strategies. See this post for the usage of the optimizer.
  • Examples
    • More PyTorch, ONNX, and TensorFlow Keras examples have been added to the /examples/python folder.
    • Updated the cpp examples to use the new runtime interface.
  • Mapper
    • Implemented a new mapper with improved runtime performance.
  • Legion
    • Updated the Legion version with improved runtime performance.

FlexFlow v1.1.1 Release for the SysML19 Artifact Evaluation

14 Feb 01:33

This is the v1.1.1 pre-release for the SysML19 Artifact Evaluation. Follow the instructions to build FlexFlow and use the script run_experiments.sh to run all experiments.

FlexFlow v1.1 Release for the SysML19 Artifact Evaluation

11 Feb 19:13

This is the v1.1 pre-release for the SysML19 Artifact Evaluation. Follow the instructions to build FlexFlow and use the script run_experiments.sh to run all experiments.

SysML19 Artifact Evaluation

26 Jan 01:29

This is a pre-release for SysML19 Artifact Evaluation. Follow the instructions to build FlexFlow and use the script run_experiments.sh to run all experiments.