Releases: milvus-io/milvus
milvus-2.0.0-rc6
Release date: 2021-09-10
Compatibility
Milvus version | Python SDK version | Java SDK version | Go SDK version | Node SDK version |
---|---|---|---|---|
2.0.0-RC6 | 2.0.0rc6 | Coming soon | Coming soon | 1.0.16 |
Milvus 2.0.0-RC6 is a preview version of Milvus 2.0.0. It supports specifying the shard number when creating collections and querying by expression, and it exposes more cluster metrics through the API. RC6 also increases unit test coverage to 80% and fixes a series of issues involving resource leakage, system panics, and more.
Improvements
- Increases unit test coverage to 80%.
Features
- #7482 Supports specifying shard number when creating a collection.
- #7386 Supports query by expression.
- Exposes system metrics through API:
Bug Fixes
- #7434 Query node OOM if loading a collection that is beyond the memory limit.
- #7678 Standalone OOM when recovering from existing storage.
- #7636 Standalone panic when sending message to a closed channel.
- #7631 Milvus panic when closing flowgraph.
- #7605 Milvus crashed with panic when running nightly CI tests.
- #7596 Nightly cases failed because rootcoord disconnected with etcd.
- #7557 Wrong search result returned when the term content in expression is not in order.
- #7536 Incorrect `MqMsgStream` Seek logic.
- #7527 Dataset's memory leak in `knowhere` when searching.
- #7444 Deadlock of channels time ticker.
- #7428 Possible deadlock when `MqMsgStream` broadcast fails.
- #7715 Query request overwritten by concurrent operations on the same slice.
milvus-2.0.0-rc5-hotfix1
Release date: 2021-09-01
Compatibility
Milvus version | Python SDK version | Java SDK version | Go SDK version | Node SDK version |
---|---|---|---|---|
2.0.0-RC5 | 2.0.0rc5 | Coming soon | Coming soon | 1.0.16 |
Milvus 2.0.0-RC5 is a preview version of Milvus 2.0.0. This hotfix solved a panic in standalone deployment.
Bug Fixes
- #7393 Fixes rocksmq retention panic when deleting by message size.
milvus-2.0.0-rc5
Release date: 2021-08-30
Compatibility
Milvus version | Python SDK version | Java SDK version | Go SDK version | Node SDK version |
---|---|---|---|---|
2.0.0-RC5 | 2.0.0rc5 | Coming soon | Coming soon | 1.0.16 |
Milvus 2.0.0-RC5 is a preview version of Milvus 2.0.0. It supports a message queue data retention mechanism and etcd data cleanup, exposes cluster metrics through the API, and prepares for delete operation support. RC5 also made great progress on system stability. We fixed a series of resource leaks and operation hangs, as well as the misconfiguration of standalone Pulsar under Milvus cluster.
Improvements
- #7226 Refactors data coord allocator.
- #6867 Adds connection manager.
- #7172 Adds a seal policy to restrict the lifetime of a segment.
- #7163 Increases the timeout for gRPC connection when creating index.
- #6996 Adds a minimum interval for segment flush.
- #6590 Saves binlog path in `SegmentInfo`.
- #6848 Removes `RetrieveRequest` and `RetrieveTask`.
- #7102 Supports vector field as output.
- #7075 Refactors `NewEtcdKV` API.
- #6965 Adds channel for data node to watch etcd.
- #7066 Optimizes search reduce logics.
- #6993 Enhances the log when parsing gRPC recv/send parameters.
- #7331 Changes context to correct package.
- #7278 Enables etcd auto compaction for every 1000 revisions.
- #7355 Cleans up `fmt.Println` in util/flowgraph.
Features
- #7112 #7174 Imports an embedded etcdKV (part 1).
- #7231 Adds a segment filter interface.
- #7157 Exposes metrics of index coord and index nodes.
- #7137 #7157 Exposes system topology information by proxy.
- #7113 #7157 Exposes metrics of query coord and query nodes.
- #7134 Allows users to get vectors using memory instead of local storage.
- #6617 Supports retention for rocksmq.
- #7303 Adds query node segment filter.
- #7304 Adds `delete` API into proto.
- #7261 Adds delete node.
- #7268 Constructs Bloom filter when inserting.
Bug Fixes
- #7272 #7352 #7335 Failure to start new docker container with existing volumes if index was created: proxy is not healthy.
- #7243 Failure to create index in a new version of Milvus for data that were inserted in an old version.
- #7253 Search gets empty results after releasing a different partition.
- #7244 #7227 Proxy crashes when receiving empty search results.
- #7203 Connection gets stuck when gRPC server is down.
- #7188 Incomplete unit test logics.
- #7175 Unspecific error message is returned when calculating distances using collection IDs without loading.
- #7151 Data node flowgraph does not close due to missing `DropCollection`.
- #7167 Failure to load IVF_FLAT index.
- #7123 Timestamp goes back for `timeticksync`.
- #7140 `calc_distance` returns wrong results for binary vectors when using TANIMOTO metrics.
- #7143 The state of memory and etcd is inconsistent if KV operation fails.
- #7141 #7136 Index building gets stuck when the index node pod is frequently killed and pulled up.
- #7119 Pulsar `msgStream` may get stuck when subscribed with the same topic and sub name.
- #6971 Exception occurs when searching with index (HNSW).
- #7104 Search gets stuck if query nodes only load sealed segment without watching insert channels.
- #7085 Segments do not auto flush.
- #7074 Index nodes wait for index coord to start to complete.
- #7061 Segment allocation does not expire if data coord does not receive timetick message from data node.
- #7059 Query nodes get producer leakage.
- #7005 Query nodes do not return error to query coord when `loadSegmentInternal` fails.
- #7054 Query nodes return incorrect IDs when `topk` is larger than `row_num`.
- #7053 Incomplete allocation logics.
- #7044 Lack of check on unindexed vectors in memory before retrieving vectors in local storage.
- #6862 Memory leaks in flush cache of data node.
- #7346 Query coord container exited in less than 1 minute when re-installing Milvus cluster.
- #7339 Incorrect expression boundary.
- #7311 Collection nil when adding query collection.
- #7266 Flowgraph released incorrectly.
- #7310 Excessive timeout when searching after releasing and loading a partition.
- #7320 Port conflicts between embedded etcd and external etcd.
- #7336 Data node corner cases.
milvus-2.0.0-rc4
Release date: 2021-08-13
Compatibility
Milvus version | Python SDK version | Java SDK version | Go SDK version |
---|---|---|---|
2.0.0-RC4 | 2.0.0rc4 | Coming soon | Coming soon |
Milvus 2.0.0-RC4 is a preview version of Milvus 2.0.0. It mainly focuses on fixing stability issues. It also offers functionality to retrieve vector data from object storage and to specify output fields by wildcard matching.
Improvements
- #6859 Increases the `MaxCallRecvMsgSize` and `MaxCallSendMsgSize` of gRPC client.
- #6796 Fixes MsgStream exponential retry.
- #6897 #6899 #6681 #6766 #6768 #6597 #6501 #6477 #6478 #6935 #6871 #6671 #6682 Log improvements.
- #6440 Refactors segment manager.
- #6421 Splits raw vectors to several smaller binlog files when creating index.
- #6466 Separates the idea of query and search.
- #6505 Changes `output_fields` to `out_fields_id` for RetrieveRequest.
- #6427 Refactors the logic of assigning tasks in index coord.
- #6692 #6343 Shows/Describes collections/partitions with created timestamps.
- #6629 Adds the WatchWithVersion interface for etcdKV.
- #6666 Refactors expression executor to use single bitsets.
- #6664 Auto creates new segments when allocating rows that exceed the maximum number of rows per segment.
- #6786 Refactors `RangeExpr` and `CompareExpr`.
- #6497 Loosens the lower limit of dimension when searching on a binary vector field.
Features
- #6706 Supports reading vectors from disk.
- #5210 Extends the grammar of Boolean expressions.
- #6411 #6650 Supports wildcards and wildcard matching on search/query output fields.
- #6464 Adds a vector chunk manager to support vector file local storage.
- #6701 Supports data persistence with docker compose deployments.
- #6767 Adds a Grafana dashboard .json file for Milvus.
Bug Fixes
- #5443 `CalcDistance` returns wrong results when fetching vectors from collection.
- #7004 Pulsar consumer causes goroutine leakage.
- #6946 Data race occurs when `close()` is called on a flow graph immediately after `start()`.
- #6903 Uses proto marshal instead of marshalTextString in querycoord to avoid a crash triggered by an unknown field name.
- #6977 Search returns wrong limit after a partition or collection is dropped.
- #6515 #6567 #6552 #6483 Data node BackGroundGC does not work and causes memory leak.
- #6943 The MinioKV `GetObject` method does not close the client and leaks a goroutine per call.
- #6370 Search is stuck due to wrong semantics offered by load partition.
- #6831 Data node crashes in meta service.
- #6469 Search binary results are wrong with metrics of Hamming when limit (topK) is bigger than the quantity of inserted entities.
- #6693 Timeout caused by segment race condition.
- #6097 Load hangs after frequently restarting query node within a short period of time.
- #6464 Data sorter edge cases.
- #6419 Milvus crashes when inserting empty vectors.
- #6477 Different components repeatedly create buckets in MinIO.
- #6377 Query results get incorrect global sealed segments from etcd.
- #6499 TSO allocates wrong timestamps.
- #6501 Channels are lost after data node crashes.
- #6527 Task info of watchQueryChannels can't be deleted from etcd.
- #6576 #6526 Duplicate primary field IDs are added when retrieving entities.
- #6627 #6569 `std::sort` does not work properly to filter search results when the distance of new record is NaN.
- #6655 Proxy crashes when retrieve task is called.
- #6762 Incorrect created timestamp of collections and partitions.
- #6644 Data node fails to restart automatically.
- #6641 Failure to stop data coord when disconnecting with etcd.
- #6621 Milvus throws an exception when the inserted data size is larger than the segment.
- #6436 #6573 #6507 Incorrect handling of time synchronization.
- #6732 Failure to create IVF-PQ index.
milvus-2.0.0-rc2
Release date: 2021-07-13
Compatibility
Milvus version | Python SDK version | Java SDK version | Go SDK version |
---|---|---|---|
2.0.0-RC2 | 2.0.0rc2 | Coming soon | Coming soon |
Improvements
- Refactor cluster in dataservice (#6356)
- Refactor meta in data coord (#6300)
- Optimize code under storage (#6335)
- Optimize stop process logic (#6256)
- Add collectionID and partitionID into SegmentIndexInfo (#6289)
- Clear proxy searchStream when releaseCollection (#6258)
- Fix travel timestamp and guarantee timestamp (#6234)
- Merge retrieve and search code in query node (#6227)
- Add candidate management for datacoord dn cluster (#6196)
- Modified pulsar maxMessageSize with docker compose (#6240)
- Add Building Milvus with Docker Docs (#6188)
- Fix compile error on CentOS (#6359)
- Fix compile on CentOS (#6334)
Features
- Add minio FGet method (#6386)
- Add GetFlushedSegments in data coordinator (#6253)
- Add GetIndexStates (#6213)
- Add time record for build index and search process (#6231)
Bugfix
milvus-2.0.0-rc1
Compatibility
Milvus version | Python SDK version | Java SDK version | Go SDK version |
---|---|---|---|
2.0.0-RC1 | 2.0.0rc1 | Coming soon | Coming soon |
Milvus 2.0.0-RC1 is the preview version of 2.0.0. It introduces Golang as the distributed layer development language and a new cloud-native distributed design. The latter brings significant improvements to scalability, elasticity, and functionality.
Architecture
Milvus 2.0 is a cloud-native vector database with storage and computation separated by design. All components in this refactored version of Milvus are stateless to enhance elasticity and flexibility.
The system breaks down into four levels:
- Access layer
- Coordinator service
- Worker nodes
- Storage
Access layer: The front layer of the system and endpoint to users. It comprises peer proxies for forwarding requests and gathering results.
Coordinator service: The coordinator service assigns tasks to the worker nodes and functions as the system's brain. It has four coordinator types: root coord, data coord, query coord, and index coord.
Worker nodes: Worker nodes are dumb executors that follow the instructions from the coordinator service. There are three types of worker nodes, each responsible for a different job: data nodes, query nodes, and index nodes.
Storage: The cornerstone of the system that all other functions depend on. It has three storage types: meta storage, log broker, and object storage. Kudos to the open-source communities of etcd, Pulsar, MinIO, and RocksDB for building this fast, reliable storage.
For more information about how the system works, see Milvus 2.0 Architecture.
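As a compact summary of the four levels described above, the following Python sketch models each layer and its components as plain data. The `Layer` type and the `layer_of` helper are hypothetical names introduced only for illustration; the component names are the ones listed in these notes, and nothing here reflects actual Milvus code.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Layer:
    """Hypothetical record describing one level of the Milvus 2.0 design."""
    name: str
    role: str
    components: Tuple[str, ...]


# Component names come from the release notes; the structure is illustrative.
ARCHITECTURE = (
    Layer("Access layer", "forwards requests and gathers results", ("proxy",)),
    Layer("Coordinator service", "assigns tasks to the worker nodes",
          ("root coord", "data coord", "query coord", "index coord")),
    Layer("Worker nodes", "execute instructions from the coordinator service",
          ("data node", "query node", "index node")),
    Layer("Storage", "persists everything the other layers depend on",
          ("meta storage", "log broker", "object storage")),
)


def layer_of(component: str) -> str:
    """Return the name of the layer that a component belongs to."""
    for layer in ARCHITECTURE:
        if component in layer.components:
            return layer.name
    raise KeyError(component)


if __name__ == "__main__":
    for layer in ARCHITECTURE:
        print(f"{layer.name}: {', '.join(layer.components)}")
```

Calling `layer_of("index coord")` returns `"Coordinator service"`, which mirrors the split between coordinators that plan work and worker nodes that carry it out.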
New Features
SDK
- Object-relational mapping (ORM) PyMilvus: The PyMilvus-ORM APIs operate directly on collections, partitions, and indexes, helping users focus on building an effective data model rather than on the detailed implementation.
Core Features
- Hybrid search between scalar and vector data: Milvus 2.0 supports storing scalar data. Operators such as GREATER, LESS, EQUAL, NOT, IN, AND, and OR can be used to filter scalar data before a vector search is conducted. Currently supported data types include bool, int8, int16, int32, int64, float, and double. Support for string/VARBINARY data will be offered in a later version.
- Match query: Unlike the search operation, which returns similar results, the match query operation returns exact matches. Match query can be used to retrieve vectors by ID or by condition.
- Tunable consistency: Distributed databases make tradeoffs between consistency and availability/latency. Milvus offers four consistency levels (from strongest to weakest): strong, bounded staleness, session, and consistent prefix. You can define your own read consistency by specifying the read timestamp. As a rule of thumb, the weaker the consistency level, the higher the availability and the higher the performance.
- Time travel: Time travel allows you to access historical data at any point within a specified time period, making it possible to query, restore, and back up data from the past.
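The difference between hybrid search and match query can be illustrated with a minimal pure-Python sketch. It does not use PyMilvus or a running Milvus server: the `hybrid_search` and `match_query` functions, the `price` and `embedding` field names, and the brute-force L2 ranking are all assumptions made for illustration, not the engine's actual implementation.

```python
import math
from typing import Callable, Dict, List, Sequence

Entity = Dict[str, object]


def l2(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hybrid_search(entities: List[Entity], query_vec: Sequence[float],
                  predicate: Callable[[Entity], bool], limit: int) -> List[Entity]:
    """Filter on scalar fields first, then rank the survivors by similarity."""
    candidates = [e for e in entities if predicate(e)]
    candidates.sort(key=lambda e: l2(e["embedding"], query_vec))
    return candidates[:limit]


def match_query(entities: List[Entity],
                predicate: Callable[[Entity], bool]) -> List[Entity]:
    """Return exact matches only; no similarity ranking is involved."""
    return [e for e in entities if predicate(e)]


# Toy data set: "price" is a scalar field, "embedding" is the vector field.
ENTITIES = [
    {"id": 1, "price": 10, "embedding": [0.0, 0.0]},
    {"id": 2, "price": 25, "embedding": [1.0, 1.0]},
    {"id": 3, "price": 40, "embedding": [0.1, 0.1]},
    {"id": 4, "price": 55, "embedding": [0.2, 0.0]},
]

# Hybrid search: keep entities with price > 20, then take the 2 nearest.
hits = hybrid_search(ENTITIES, [0.0, 0.0], lambda e: e["price"] > 20, limit=2)
print([e["id"] for e in hits])   # [3, 4]

# Match query: retrieve entities by ID, returned as exact matches.
exact = match_query(ENTITIES, lambda e: e["id"] in (1, 4))
print([e["id"] for e in exact])  # [1, 4]
```

Entity 1 is the closest vector to the query but is excluded from the hybrid search because it fails the scalar filter, whereas the match query returns it because the ID condition matches exactly.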
Miscellaneous
- Supports installing Milvus 2.0 with Helm or Docker-compose.
- Compatibility with Prometheus and Grafana for monitoring and alerts.
- Milvus Insight: Milvus Insight is a graphical management system for Milvus. It features visualization of cluster states, meta management, data queries, and more. Milvus Insight will eventually be open sourced.
Breaking Changes
Milvus 2.0 uses an entirely different programming language, data format, and distributed architecture from previous versions. This means prior versions of Milvus cannot be upgraded to 2.x. However, Milvus 1.x is receiving long-term support, and data migration tools will be made available as soon as possible.
Specific breaking changes include:
- Java, Go, and C++ SDKs are not yet supported.
- Delete or update is not yet supported.
- PyMilvus-ORM does not support force flush.
- Data format is incompatible with all prior versions.
- Mishards is deprecated because Milvus 2.0 is distributed and sharding middleware is no longer necessary.
- Local file system and distributed system storage are not yet supported.
milvus-1.1.1
Release date: 2021-06-16
Compatibility
Milvus version | Python SDK version | Java SDK version | Go SDK version |
---|---|---|---|
1.1.1 | 1.1.x | 1.1.x | 1.1.x |
New Features
- #1434 Storage: enabling s3 storage support (implemented by Unisinsight).
- #5142 Support keeping index in GPU memory.
Improvements
- #5115 Relax the topk limit from 16384 to 1M for CPU search.
- #5204 Improve IVF query on GPU when no entity deleted.
- #5544 Relax the index_file_size limit from 4GB to 128GB.
Fixed issues
- #4897 Query results contain some deleted ids.
- #5164 An exception should be raised when inserting or deleting entities on a nonexistent partition.
- #5191 Mishards throws an "index out of range" error after continuous search/insert operations over a period of time.
- #5398 Random crash after request is executed.
- #5537 Failure to load Bloom filter after a sudden power-off.
- #5574 IVF_SQ8 and IVF_PQ cannot be built on multiple GPUs.
- #5747 Search with large nq and topk crashes Milvus.
Many thanks to @shengjun1985 @op-hunter @cqy123456 @matrixji @yhmo @del-zhenwu @XuanYang-cn
milvus-1.1.0
Release date: 2021-05-07
Compatibility
Milvus version | Python SDK version | Java SDK version | Go SDK version |
---|---|---|---|
1.1.0 | 1.1.0 | 1.1.0 | 1.1.0 |
New Features
- #4564 Supports specifying partition in a `get_entity_by_id()` method call.
- #4806 Supports specifying partition in a `delete_entity_by_id()` method call.
- #4905 Adds the `release_collection()` method, which unloads a specific collection from cache.
Improvements
- #4756 Improves the performance of the `get_entity_by_id()` method call.
- #4856 Upgrades hnswlib to v0.5.0.
- #4958 Improves the performance of IVF index training.
Fixed issues
- #4778 Fails to access vector index in Mishards.
- #4797 The system returns false results after merging search requests with different `topK` parameters.
- #4838 The server does not respond immediately to an index building request on an empty collection.
- #4858 For GPU-enabled Milvus, the system crashes on a search request with a large `topK` (> 2048).
- #4862 A read-only node merges segments during startup.
- #4894 The capacity of a Bloom filter does not equal the row count of the segment it belongs to.
- #4908 The GPU cache is not cleaned up after a collection is dropped.
- #4933 It takes a long while for the system to build index for a small segment.
- #4952 Fails to set timezone as "UTC + 5:30".
- #5008 The system crashes randomly during continuous, concurrent delete, insert, and search operations.
- #5010 For GPU-enabled Milvus, query fails on IVF_PQ if `nbits` ≠ 8.
- #5050 `get_collection_stats()` returns false index type for segments still in the process of index building.
- #5063 The system crashes when an empty segment is flushed.
- #5078 For GPU-enabled Milvus, the system crashes when creating an IVF index on vectors of 2048, 4096, or 8192 dimensions.
Many thanks to @BossZou @shengjun1985 @op-hunter @matrixji @yhmo @ericsyh @LocoRichard @del-zhenwu @XuanYang-cn @fishpenguin
Special thanks to Chris from yinxiang.com
milvus-1.0.0
milvus-0.10.6
Release date: 2021-02-23
Compatibility
Milvus version | Python SDK version | Java SDK version | Go SDK version |
---|---|---|---|
0.10.6 | 0.4.0 | 0.8.6 | 0.4.6 |
Compatibility changes
- Adds an optional argument `nbits` to the `create_index()` method for the IVF_PQ index. #3920
Improvements
- Improves the FLAT search performance on binary vectors using the AVX2 instruction set. #1970
- Adds an optional parameter `nbits` to the `create_index()` method for the IVF_PQ index. #3920
- Supports configuring Prometheus labels `cluster_label` and `instance_label` under `metric`. #4614
Fixed issues
- The system returns a `-0` distance if the metric type is Tanimoto. #4683
- A FLAT search on binary vectors causes the server to crash if the dimension of the vectors is not a multiple of 2. #4678
- The GPU cache holds more data than specified. #4719
See CHANGELOG for more information.