CUDA RANSAC implementation #21

Merged
merged 72 commits into from
Mar 23, 2024
72 commits
82c495a
cuda ransac first draft
true-real-michael Feb 5, 2024
0147784
fix imports
true-real-michael Feb 5, 2024
52d8b5c
make cuda fit function global
true-real-michael Feb 5, 2024
0ce162d
add cuda method defs in base classes and octreemgr 🙄
true-real-michael Feb 5, 2024
e97fef5
fix black 🙄🙄
true-real-michael Feb 5, 2024
18af48f
cuda ransac first draft
true-real-michael Feb 5, 2024
6d28d04
fix imports
true-real-michael Feb 5, 2024
cc19e07
make cuda fit function global
true-real-michael Feb 5, 2024
e55ab76
add cuda method defs in base classes and octreemgr 🙄
true-real-michael Feb 5, 2024
677ab53
Merge remote-tracking branch 'origin/cuda-ransac' into cuda-ransac
true-real-michael Feb 5, 2024
1081d92
fix pose numbers nullability
true-real-michael Feb 5, 2024
8f95728
enable cuda simulator on pytest for github actions
true-real-michael Feb 5, 2024
05b8ece
use masks to filter points
true-real-michael Feb 5, 2024
3af8ef1
use arrays on device instead of arrays on host
true-real-michael Feb 7, 2024
1ddcd83
add OctreeManager parallelization
true-real-michael Feb 8, 2024
fb3fd31
speedup
true-real-michael Feb 10, 2024
a3af0a3
remove redundant logic, fix tests
true-real-michael Feb 10, 2024
475c9f0
remove redundant logic, fix black
true-real-michael Feb 10, 2024
3f6e50d
fix black🙄
true-real-michael Feb 19, 2024
66ad643
the previous commit works better!
true-real-michael Feb 19, 2024
8ad69d0
more comments
true-real-michael Feb 19, 2024
cc69e98
fix plane fitting
true-real-michael Mar 5, 2024
59edf24
remove redundant code
true-real-michael Mar 5, 2024
5e13af2
quickfix
true-real-michael Mar 5, 2024
4e04cf8
fix black
true-real-michael Mar 5, 2024
b402f18
fix plane detection for more than 3 points
true-real-michael Mar 6, 2024
10767c4
do not assure unique random points
true-real-michael Mar 13, 2024
23bdd63
process poses in batches to avoid running out of memory (#26)
true-real-michael Mar 14, 2024
8892e0e
refactor random points generation
true-real-michael Mar 14, 2024
ea7bb7f
calculate best mask inside kernel
true-real-michael Mar 14, 2024
123745e
refactor
true-real-michael Mar 14, 2024
ccf6c90
refactor
true-real-michael Mar 14, 2024
c831574
refactor
true-real-michael Mar 14, 2024
efc66d2
refactor
true-real-michael Mar 14, 2024
821a39c
reduce size of result mask (#27)
true-real-michael Mar 15, 2024
807a838
fix data race
true-real-michael Mar 17, 2024
1b08e47
remove extra point transfer
true-real-michael Mar 17, 2024
e6dd1ed
remove unused imports
true-real-michael Mar 17, 2024
23bc12d
faster mask application
true-real-michael Mar 17, 2024
3bdf4bc
fix init file
true-real-michael Mar 17, 2024
f393893
change interface and add comments
true-real-michael Mar 18, 2024
ea8091b
refactor
true-real-michael Mar 18, 2024
2a6508b
fix type annotation
true-real-michael Mar 18, 2024
e8cc7e0
fix imports
true-real-michael Mar 18, 2024
201ecd2
fix
true-real-michael Mar 19, 2024
afcd7f3
refactor + add comments
true-real-michael Mar 20, 2024
82ff298
refactor + add comments
true-real-michael Mar 20, 2024
49bb2db
refactor: move all functions to cuda ransac class
true-real-michael Mar 20, 2024
8dc074f
Revert "refactor: move all functions to cuda ransac class"
true-real-michael Mar 20, 2024
ff30a19
move get_plane_grom_points to util
true-real-michael Mar 20, 2024
4752cb6
formatting
true-real-michael Mar 20, 2024
0eb86da
fix cuda ransac tests
true-real-michael Mar 20, 2024
e512257
move cuda test to a separate file
true-real-michael Mar 20, 2024
2b1adbd
remove redundant code
true-real-michael Mar 21, 2024
e0f8ea5
Merge branch 'main' into cuda-ransac
true-real-michael Mar 21, 2024
335cf90
remove redundant code
true-real-michael Mar 21, 2024
c13b1d5
refactor
true-real-michael Mar 21, 2024
08ca2f6
refactor
true-real-michael Mar 21, 2024
bbe2283
fix comments
true-real-michael Mar 21, 2024
b76aa8a
make kernel a staticmethod
true-real-michael Mar 21, 2024
adea651
refactor
true-real-michael Mar 21, 2024
2d32672
fix letter naming
true-real-michael Mar 23, 2024
6993d61
make kernel private
true-real-michael Mar 23, 2024
5718347
add comments and fix black
true-real-michael Mar 23, 2024
ca541f3
fix comment
true-real-michael Mar 23, 2024
b3976ee
fix letter variables
true-real-michael Mar 23, 2024
950d086
fix letter variables
true-real-michael Mar 23, 2024
71e2001
remove redundant code
true-real-michael Mar 23, 2024
9755487
remove redundant code and change default poses per batch
true-real-michael Mar 23, 2024
d4062c9
fix comment
true-real-michael Mar 23, 2024
e525b41
move constant to util.py
true-real-michael Mar 23, 2024
a756d9a
fix comment
true-real-michael Mar 23, 2024
2 changes: 2 additions & 0 deletions .github/workflows/test.yml
@@ -44,6 +44,8 @@ jobs:
run: |
pip install .
pytest
env:
NUMBA_ENABLE_CUDASIM: 1

publish-package:
if: github.event_name == 'push' && startsWith(github.ref, 'refs/tags')
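For local runs without a GPU, the same switch can be set from Python. This is a minimal sketch, assuming (as the Numba documentation describes) that `NUMBA_ENABLE_CUDASIM` is read when Numba is first imported; the helper name `enable_cuda_simulator` is hypothetical and not part of this PR.

```python
import os


def enable_cuda_simulator() -> str:
    """Request Numba's CUDA simulator for the current process.

    Hypothetical helper: it has to run before the first `import numba`,
    because the variable is only consulted at import time.
    """
    os.environ["NUMBA_ENABLE_CUDASIM"] = "1"
    return os.environ["NUMBA_ENABLE_CUDASIM"]


# Call this at the very top of a test entry point, before importing numba.
simulator_flag = enable_cuda_simulator()
```

Setting the variable in the workflow `env:` block, as the diff does, avoids the import-order concern entirely because the variable exists before Python starts.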
115 changes: 111 additions & 4 deletions octreelib/grid/grid.py
@@ -13,6 +13,7 @@
)
from octreelib.internal.point import PointCloud
from octreelib.internal.voxel import Voxel, VoxelBase
from octreelib.ransac.cuda_ransac import CudaRansac

__all__ = ["Grid", "GridConfig"]

@@ -107,23 +108,129 @@ def insert_points(self, pose_number: int, points: PointCloud):
self.__pose_voxel_coordinates[pose_number].append(target_voxel)
self.__octrees[target_voxel].insert_points(pose_number, voxel_points)

- def map_leaf_points(self, function: Callable[[PointCloud], PointCloud]):
def map_leaf_points(
self,
function: Callable[[PointCloud], PointCloud],
pose_numbers: Optional[List[int]] = None,
):
"""
Transform point cloud in each node of each octree using the function
:param function: Transformation function PointCloud -> PointCloud. It is applied to each leaf node.
:param pose_numbers: List of pose numbers to map.
"""
for voxel_coordinates in self.__octrees:
- self.__octrees[voxel_coordinates].map_leaf_points(function)
self.__octrees[voxel_coordinates].map_leaf_points(function, pose_numbers)

def map_leaf_points_cuda_ransac(
self,
poses_per_batch: int = 10,
threshold: float = 0.01,
hypotheses_number: int = 1024,
):
"""
Fit a plane to the points in each leaf voxel with CUDA RANSAC and keep only the points selected by the resulting mask
:param poses_per_batch: Number of poses per batch.
:param threshold: Distance threshold.
:param hypotheses_number: Number of RANSAC iterations (<= 1024).
"""
if threshold <= 0:
raise ValueError("Threshold must be positive")
if hypotheses_number < 1:
raise ValueError("Number of RANSAC hypotheses must be positive")
if hypotheses_number > 1024:
raise ValueError(
"Number of RANSAC hypotheses must be <= 1024 "
"because of the CUDA thread limit."
)

# processing is done in batches to avoid running out of memory
batches = [
list(
range(
i,
min(i + poses_per_batch, len(self.__pose_voxel_coordinates)),
)
)
for i in range(0, len(self.__pose_voxel_coordinates), poses_per_batch)
]

- def get_leaf_points(self, pose_number: int) -> List[Voxel]:
# find the maximum number of leaf voxels across all batches
# this is needed to initialize the random number generators on the GPU
max_leaf_voxels = max(
[
sum([self.n_leaves(pose_number) for pose_number in batch_pose_numbers])
for batch_pose_numbers in batches
]
)
ransac = CudaRansac(
max_blocks_number=max_leaf_voxels,
hypotheses_number=hypotheses_number,
threshold=threshold,
)

# process each batch
for batch_pose_numbers in batches:
# `combined_point_cloud` is a concatenation of ALL point clouds
# `block_sizes` is a list of sizes of point clouds for each leaf node
# `pose_dividers` is a list of indices where combined_point_cloud is divided by pose
# these are used to split the combined_point_cloud into separate point clouds
# for each pose after the kernel is done
batch_point_clouds = []
block_sizes = []
pose_dividers = [0]
for pose_number in batch_pose_numbers:
pose_point_cloud = self.get_leaf_points(pose_number)
batch_point_clouds.append(
np.vstack([v.get_points() for v in pose_point_cloud])
)
block_sizes.append(
np.array(
[len(v.get_points()) for v in pose_point_cloud],
dtype=np.int32,
)
)
pose_dividers.append(pose_dividers[-1] + block_sizes[-1].sum())

combined_point_cloud = np.vstack(batch_point_clouds)
block_sizes_combined = np.concatenate(block_sizes)
pose_dividers = np.array(pose_dividers)

# run the kernel
maximum_mask = ransac.evaluate(
combined_point_cloud,
block_sizes_combined,
)

# split the combined point cloud into separate point clouds for each pose,
# apply the masks from the kernel
# and insert them into the octrees

for i, (pose_number, block_sizes_for_pose) in enumerate(
zip(batch_pose_numbers, block_sizes)
):
mask = maximum_mask[pose_dividers[i] : pose_dividers[i + 1]]
point_start_index = 0
for voxel_coordinates in self.__pose_voxel_coordinates[pose_number]:
octree = self.__octrees[voxel_coordinates]
points_number = octree.n_points(pose_number)
octree_mask = mask[
point_start_index : point_start_index + points_number
]
octree.apply_mask(octree_mask, pose_number)
point_start_index += points_number

def get_leaf_points(self, pose_number: int, non_empty: bool = True) -> List[Voxel]:
"""
:param pose_number: The desired pose number.
:param non_empty: If True, returns only non-empty voxels.
:return: List of voxels. Each voxel is a representation of a leaf node.
Each voxel has the same corner, edge_length and points as one of the leaf nodes.
"""
return sum(
[
- self.__octrees[voxel_coordinates].get_leaf_points(pose_number)
self.__octrees[voxel_coordinates].get_leaf_points(
non_empty, pose_number
)
for voxel_coordinates in self.__pose_voxel_coordinates[pose_number]
],
[],
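The batching and mask-splitting bookkeeping in `map_leaf_points_cuda_ransac` can be tried in isolation. The following is a sketch under stated assumptions, not the PR's code: `make_batches` and `split_mask_by_pose` are hypothetical names, and the data is made up for illustration.

```python
from typing import List

import numpy as np


def make_batches(n_poses: int, poses_per_batch: int) -> List[List[int]]:
    """Split pose numbers 0..n_poses-1 into consecutive batches."""
    return [
        list(range(start, min(start + poses_per_batch, n_poses)))
        for start in range(0, n_poses, poses_per_batch)
    ]


def split_mask_by_pose(mask: np.ndarray, points_per_pose: List[int]) -> List[np.ndarray]:
    """Cut one flat per-point mask into one mask per pose."""
    # dividers[i] is the index of the first point of pose i in the flat array
    dividers = np.concatenate(([0], np.cumsum(points_per_pose)))
    return [mask[dividers[i] : dividers[i + 1]] for i in range(len(points_per_pose))]


batches = make_batches(n_poses=5, poses_per_batch=2)

# Hypothetical batch: pose 0 contributes 3 points, pose 1 contributes 2 points
flat_mask = np.array([True, False, True, True, False])
per_pose_masks = split_mask_by_pose(flat_mask, [3, 2])
```

Note that the diff builds its batches from `range(len(self.__pose_voxel_coordinates))`, so it appears to assume that pose numbers are the contiguous integers 0..N-1; the sketch makes the same assumption.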
15 changes: 15 additions & 0 deletions octreelib/grid/grid_base.py
@@ -154,6 +154,21 @@ def map_leaf_points(self, function: Callable[[PointCloud], PointCloud]):
"""
pass

@abstractmethod
def map_leaf_points_cuda_ransac(
self,
poses_per_batch: int = 1,
threshold: float = 0.01,
hypotheses_number: int = 1024,
):
"""
Fit a plane to the points in each leaf voxel with CUDA RANSAC and keep only the points selected by the resulting mask
:param poses_per_batch: Number of poses per batch.
:param threshold: Distance threshold.
:param hypotheses_number: Number of RANSAC iterations (<= 1024).
"""
pass

@abstractmethod
def get_leaf_points(self, pose_number: int) -> List[Voxel]:
"""
Expand Down
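To make the `threshold` and `hypotheses_number` parameters concrete, here is a CPU-only NumPy sketch of RANSAC plane fitting that returns an inlier mask. It is an illustration under assumptions, not the CUDA kernel from this PR (`cuda_ransac.py` is not shown above); the function name `ransac_plane_mask` and the `seed` parameter are hypothetical. Points are sampled with replacement, in line with the commit "do not assure unique random points".

```python
import numpy as np


def ransac_plane_mask(points, threshold=0.01, hypotheses_number=1024, seed=0):
    """Return a boolean inlier mask for the best plane found by RANSAC."""
    if threshold <= 0:
        raise ValueError("Threshold must be positive")
    if hypotheses_number < 1:
        raise ValueError("Number of RANSAC hypotheses must be positive")
    rng = np.random.default_rng(seed)
    best_mask = np.zeros(len(points), dtype=bool)
    for _ in range(hypotheses_number):
        # Sample three points with replacement (duplicates are possible)
        p0, p1, p2 = points[rng.integers(0, len(points), size=3)]
        normal = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(normal)
        if norm == 0:
            continue  # degenerate sample: repeated or collinear points
        normal = normal / norm
        # Distance from every point to the hypothesised plane
        distances = np.abs((points - p0) @ normal)
        mask = distances < threshold
        if mask.sum() > best_mask.sum():
            best_mask = mask
    return best_mask


# Made-up data: 50 points on the plane z = 0 plus two clear outliers
rng = np.random.default_rng(1)
plane = np.column_stack([rng.uniform(-1, 1, (50, 2)), np.zeros(50)])
outliers = np.array([[0.0, 0.0, 1.0], [0.5, 0.5, -2.0]])
cloud = np.vstack([plane, outliers])
inliers = ransac_plane_mask(cloud, threshold=0.01, hypotheses_number=64)
```

In the sequential sketch each hypothesis is one loop iteration; the limit of 1024 in the diff comes from mapping hypotheses to CUDA threads, as its error message states.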
38 changes: 36 additions & 2 deletions octreelib/octree/octree.py
@@ -48,6 +48,8 @@ def subdivide_as(self, other: "OctreeNode"):
elif self._has_children:
self._points = self.get_points()
self._has_children = False
for child in self._children:
child._remove_from_cache()
self._children = []

def get_points(self) -> PointCloud:
@@ -132,6 +134,13 @@ def get_leaf_points(self) -> List[Voxel]:
else []
)

def apply_mask(self, mask: np.ndarray):
"""
Apply mask to the point cloud in the octree node
:param mask: Mask to apply
"""
self._points = self._points[mask]

@property
def n_leaves(self):
"""
@@ -171,14 +180,25 @@ def _generate_children(self):
"""
child_edge_length = self.edge_length / np.float_(2)
children_corners_offsets = itertools.product([0, child_edge_length], repeat=3)
self._cached_leaves.remove(self)
return [
OctreeNode(
self.corner_min + offset,
child_edge_length,
self._cached_leaves,
)
for internal_position, offset in enumerate(children_corners_offsets)
]

def _remove_from_cache(self):
"""
Remove the node and its children from the cached leaves.
"""
self._cached_leaves.remove(self)
if self._has_children:
for child in self._children:
child._remove_from_cache()


class Octree(OctreeBase, Generic[T]):
"""
@@ -233,11 +253,25 @@ def map_leaf_points(self, function: Callable[[PointCloud], PointCloud]):
"""
self._root.map_leaf_points(function)

- def get_leaf_points(self) -> List[Voxel]:
def get_leaf_points(self, non_empty: bool = True) -> List[Voxel]:
"""
:param non_empty: If True, only non-empty leaf nodes are returned.
:return: List of voxels where each voxel represents a leaf node with points.
"""
- return self._root.get_leaf_points()
if non_empty:
return list(filter(lambda v: v.n_points != 0, self._cached_leaves))
return self._cached_leaves

def apply_mask(self, mask: np.ndarray):
"""
Apply mask to the point cloud in the octree
:param mask: Mask to apply
"""
start_index = 0
for leaf in filter(lambda v: v.n_points != 0, self._cached_leaves):
points_number = leaf.n_points
leaf.apply_mask(mask[start_index : start_index + points_number])
start_index += points_number

@property
def n_points(self):
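The way `Octree.apply_mask` walks a flat mask across the non-empty leaves can be sketched on its own. The `Leaf` class and `apply_mask_to_leaves` function below are hypothetical stand-ins, not the classes from this PR.

```python
import numpy as np


class Leaf:
    """Hypothetical stand-in for a leaf node holding an (N, 3) point array."""

    def __init__(self, points):
        self.points = np.asarray(points, dtype=float).reshape(-1, 3)

    @property
    def n_points(self):
        return len(self.points)

    def apply_mask(self, mask):
        self.points = self.points[mask]


def apply_mask_to_leaves(leaves, mask):
    """Consume `mask` leaf by leaf, skipping empty leaves."""
    start = 0
    for leaf in leaves:
        if leaf.n_points == 0:
            continue
        count = leaf.n_points  # read before masking changes it
        leaf.apply_mask(mask[start : start + count])
        start += count
    return start  # number of mask entries consumed


leaves = [Leaf([[0, 0, 0], [1, 1, 1]]), Leaf([]), Leaf([[2, 2, 2]])]
consumed = apply_mask_to_leaves(leaves, np.array([True, False, True]))
```

This only works if the mask was built from the leaves in the same order they are visited here, which is why the diff reads from `_cached_leaves` both when collecting points and when applying the mask.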
38 changes: 35 additions & 3 deletions octreelib/octree/octree_base.py
@@ -33,11 +33,20 @@ class OctreeNodeBase(Voxel, ABC):
and are not stored in the parent node.
"""

- def __init__(self, corner_min: Point, edge_length: float):
def __init__(
self,
corner_min: Point,
edge_length: float,
octree_cached_leaves: List["OctreeNodeBase"],
):
super().__init__(corner_min, edge_length)
self._points = np.empty((0, 3), dtype=float)
self._children: Optional[List["OctreeNodeBase"]] = []
self._has_children: bool = False
# `OctreeNodeBase._cached_leaves` references the `OctreeBase._cached_leaves` list
# so that nodes can modify this field in the parent OctreeBase instance
self._cached_leaves = octree_cached_leaves
self._cached_leaves.append(self)

@property
@abstractmethod
@@ -111,6 +120,14 @@ def get_points(self) -> PointCloud:
"""
pass

@abstractmethod
def apply_mask(self, mask: np.ndarray):
"""
Apply mask to the point cloud in the octree node
:param mask: Mask to apply
"""
self._points = self._points[mask]


class OctreeBase(Voxel, ABC):
"""
@@ -131,7 +148,14 @@ def __init__(
):
super().__init__(corner_min, edge_length)
self._config = octree_config
- self._root = self._node_type(self.corner_min, self.edge_length)

# cached leaves allow for fast retrieval of leaf nodes with points
# skipping the stage of finding them and returning through multiple
# layers of recursion
self._cached_leaves = []
self._root = self._node_type(
self.corner_min, self.edge_length, self._cached_leaves
)

@property
@abstractmethod
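The leaf cache relies on every node holding a reference to one list owned by the tree. A minimal sketch of that pattern, with hypothetical `Tree` and `Node` classes that only mirror the idea and not the PR's classes:

```python
class Node:
    """Hypothetical node that registers itself in a list owned by the tree."""

    def __init__(self, name, cached_leaves):
        self.name = name
        self.children = []
        # Same list object as Tree.cached_leaves, not a copy
        self._cached_leaves = cached_leaves
        self._cached_leaves.append(self)

    def subdivide(self, n_children=2):
        # A node with children is no longer a leaf
        self._cached_leaves.remove(self)
        self.children = [
            Node(f"{self.name}.{i}", self._cached_leaves) for i in range(n_children)
        ]


class Tree:
    def __init__(self):
        self.cached_leaves = []
        self.root = Node("root", self.cached_leaves)


tree = Tree()
tree.root.subdivide()
leaf_names = [leaf.name for leaf in tree.cached_leaves]
```

Two properties of Python lists matter for this design: the cache stays shared only while the list is mutated in place (rebinding the attribute to a new list would disconnect the nodes), and `list.remove` raises `ValueError` when the element is absent, so a node must not be removed from the cache twice.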
@@ -175,8 +199,9 @@ def map_leaf_points(self, function: Callable[[PointCloud], PointCloud]):
pass

@abstractmethod
- def get_leaf_points(self) -> List[Voxel]:
def get_leaf_points(self, non_empty: bool) -> List[Voxel]:
"""
:param non_empty: If True, only non-empty leaf nodes are returned.
:return: List of PointClouds where each PointCloud
represents points in a separate leaf node
"""
@@ -208,3 +233,10 @@ def get_points(self) -> PointCloud:
@abstractmethod
def insert_points(self, points: PointCloud):
pass

@abstractmethod
def apply_mask(self, mask: np.ndarray):
"""
Apply mask to the point cloud in the octree
:param mask: Mask to apply
"""
25 changes: 21 additions & 4 deletions octreelib/octree_manager/octree_manager.py
@@ -79,7 +79,8 @@ def map_leaf_points(
pose_numbers = self._octrees.keys()

for pose_number in pose_numbers:
- self._octrees[pose_number].map_leaf_points(function)
if pose_number in self._octrees:
self._octrees[pose_number].map_leaf_points(function)

def filter(
self,
@@ -97,17 +98,24 @@ def filter(
for pose_number in pose_numbers:
self._octrees[pose_number].filter(filtering_criteria)

- def get_leaf_points(self, pose_number: Optional[int] = None) -> List[Voxel]:
def get_leaf_points(
self, non_empty: bool = True, pose_number: Optional[int] = None
) -> List[Voxel]:
"""
:param non_empty: If True, only non-empty leaf nodes are returned.
:param pose_number: Desired pose number.
:return: List of leaf voxels with points for this pose.
"""
if pose_number is None:
return sum(
- [octree.get_leaf_points() for octree in self._octrees.values()], []
[
octree.get_leaf_points(non_empty)
for octree in self._octrees.values()
],
[],
)
if pose_number in self._octrees:
- return self._octrees[pose_number].get_leaf_points()
return self._octrees[pose_number].get_leaf_points(non_empty)
return []

def get_points(self, pose_number: Optional[int] = None) -> PointCloud:
@@ -161,3 +169,12 @@ def insert_points(self, pose_number: int, points: PointCloud):
)
self._octrees[pose_number].insert_points(points)
self._octrees[pose_number].subdivide_as(self._scheme_octree)

def apply_mask(self, mask: np.ndarray, pose_number: int):
"""
Apply mask to the point cloud in the octree
:param mask: Mask to apply
:param pose_number: Pose number
"""
if pose_number in self._octrees:
self._octrees[pose_number].apply_mask(mask)
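One consequence of the new `get_leaf_points(non_empty, pose_number)` signature is that `non_empty` now comes first, so a caller that still passes the pose number positionally would bind it to the wrong parameter. A small sketch with a hypothetical stand-in function that only reports how its arguments were bound:

```python
from typing import List, Optional


def get_leaf_points(non_empty: bool = True, pose_number: Optional[int] = None) -> List[str]:
    """Hypothetical stand-in with the same parameter order as the new method."""
    return [f"non_empty={non_empty!r}", f"pose_number={pose_number!r}"]


old_style_call = get_leaf_points(3)  # 3 is bound to `non_empty`
keyword_call = get_leaf_points(pose_number=3)  # binds as intended
```

Passing `pose_number` by keyword at call sites sidesteps the ordering question; whether any external callers are affected is not visible from this diff.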
9 changes: 9 additions & 0 deletions octreelib/ransac/__init__.py
@@ -0,0 +1,9 @@
"""
This module contains implementations of the RANSAC algorithm.
"""

import octreelib.ransac.cuda_ransac as cuda_ransac_module

from octreelib.ransac.cuda_ransac import *

__all__ = cuda_ransac_module.__all__