On This Page |
---|
In my junior year of highschool, I took AP CSA, an introductory computer science course taught in Java. It was an enjoyable course and likely my first exposure to programming with the interface mind.
During one lecture, we reviewed the Object Oriented design paradigm. To illustrate key concepts, the instructor drew up a simplified geometry class. The objects this class supported included the Point, Line, and Polygon all setup with reasonable inheritance structures.
Now, a polygon is defined by a sequence of lines called a polygonal chain. To reflect this, the Polygon constructor was initialized by an ordered array of Point objects. At this point, I asked myself whether we could infer the order of the points without having the user specify. As it so happens, in 2D this a relatively straightforward task that involves sorting points by their angle relative to a reference inside the polygon. The only subtle complication is determining a suitable reference point.
At the time, I introduced a somewhat obtuse algorithm to produce the desired result. I won't bother describing it in detail though I'll include a copy of the original concept I eagerly passed along to my instructor.
This brief excersize, got me down the rabbit hole of Convex Hull Algorithms. Here, I'll be reviewing my attempt at implementing qhull, an algorithm that notably extends into arbitrary dimensions.
We'll open with key definitions. First, a set \( S \) is convex if for any two points \( p \) and \( q \), the line segment \( \overline{pq} \subset S \).
Now, consider a set \( P \) of \( n \) points fixed in \( d \) dimensional space. The Convex Hull of \( P \) is the largest convex polytope whose vertices are drawn from the set. Intuitively, it is what one arrives at upon stretching a rubber band around the boundary.
Our task is to produce an algorithm that takes in a set of points and yields the Convex Hull of that set.
Wikipedia lists several Convex Hull Algorithms each with their own advantages and disadvantages. We can introduce a theoritc lower bound on the time complexity by drawing an analogy to sorting. Consider the following set of points:
\[ P = \left \{ (x_{1}, x^{2}_{1}), \dots, (x_{n}, x^{2}_{n}) \right \} \]
Since these points lie on a parabola, a convex curve, the problem clearly reduces to sorting \( x_{i} \) completing the analogy. Now, sorting a collection of objects is a well studied problem. As it so happens, we can prove that any comparison based algorithm must take \( \Omega(n \log{n}) \) time. To show this, consider the decision tree for sorting \( abc \).
In order to compute the number of computations needed to arrive at a leaf, we'll make three observations:
- The number of computations is bounded by the number of layers.
- Each decision (green) splits off into two branches.
- Each leaf (blue) is associated with one of \( n! \) permutations.
We reason that the number of layers \( l = \log_{2}(n!) \). Using Stirling's Approximation that \( n! \approx \sqrt{2 \pi n} \left ( \frac{n}{e} \right )^{n} \) we find \( l = \Omega(n \log{n}) \) completing the proof.
And sure enough, every algorithm listed takes at least \( \Omega(n \log{n}) \) time in expectation. To choose an algorithm to implement, my criteria extended from efficiency to generalizability. As it so happens, qhull is the only well-known algorithm to compute the Convex Hull of a set points fixed in \( d \) dimensional space for \( d > 2\).
The Wikepedia Article associated with qhull, includes pseudo-code for a possible implementation in 2D. I'll include a simplified version here for reference.
Input: A set S of n points
function QuickHull(S):
ConvexHull = {} # Initialize the convex hull
Find leftmost point A and rightmost point B
Add A and B to ConvexHull
Divide remaining points into two groups:
S1: Points to the right of line AB
S2: Points to the right of line BA
FindHull(S1, A, B)
FindHull(S2, B, A)
return ConvexHull
function FindHull(Sk, P, Q):
if Sk is empty:
return
Find the farthest point C from line PQ
Add C to ConvexHull between P and Q
Divide points in Sk into three groups:
S0: Points inside triangle PCQ
S1: Points to the right of line PC
S2: Points to the right of line CQ
FindHull(S1, P, C)
FindHull(S2, C, Q)
The basic observation is that the farthest point from any (Dedekind) cut in 2D space lies on the Convex Hull. With this in the mind, qhull initiates the process by greedily drawing a large cut through the points. Then, it identifies points on the hull, draws more cuts, and recurses.
There are several implementations of the QuickHull Algorithm available online. The key drawback I noticed was that the computations used were restricted to 2D while the procedure extends cleanly to \( N \) dimensions. To build a common vocabulary for the discussions that follow, I will visually introduce the following terms: Facet, SubFacet, and Simplex.
You may notice these objects go by more familiar names. In 2D, a simplex is a Triangle, a facet is an Edge, and a subfacet is a Point. In 3D, a simplex is a Pyramid, a facet is a Plane, and a subfacet is an Edge. These objects are fundamental in higher dimensional space, so we name them accordingly.
I generalize this algorithm as follows:
- Select \( d \) points and build the initial facet
- Find the normal vector to this facet
- Project points onto the normal to identify the Eyes, S1, and S2
- Build simplices by connecting the facet to the Eyes
- Repeat steps 2-4 with each non-shared subfacet
Most of these steps are fairly vanilla. The only exception is computing the normal vectors to each Facet of a Simplex. Browing through the Math StackExchange, I found an elegant solution and ultimately arrived at this hilariously compact function:
def normal(simplex):
M = np.hstack((simplex, np.ones((num, 1))))
inv = np.linalg.inv(M)
return -inv[:-1, :]
Satisfied, I decided to run my program on random 3D inputs and compare the output with SciPy. To my surprise, while the program ran smoothly in 2D, in 3D the outputs simply didn't match. In fact, the hull included several extra points that should have been pruned out. As I would soon find out, the ND case is more involved.
While searching for hints on StackExchange I stumbled across a rather cute exchange in the comments of an answer. While Aditya believed that the ND case was a "simple generalization" of the 2D case, Joseph O'Rourke cautioned against this:
Having implemented both, I would judge them to be... well, in different dimensions!
The key issue with generalizing the 2D algorithm is that it no longer ensures that the reported hull is convex. Intuitively, one might imagine we end up with a rugged surface composed of simplices.
Taking another look at the Wikepedia Article, I noticed a section detaling the procedure for the ND case. While the initialization remains (roughly) the same, the algorithm is no longer recursive. The basic idea is to compute the Horizon visible from the Eye and build facets (collectively called the Cone) by connecting them together.
By iterating until there are no more to process, we ensure that the reported hull is both convex and maximal.
You can find my complete implementation on GitHub. Before diving straight in, there are a few notes worth clarifying. First, in my assessment, the key challenge of implementing QuickHull is to efficiently compute the Horizon. This is typically done with a Depth First Search where subfacets are crossed until we identify the boundary. The core logic is actually quite readable, so I'll include here:
def _recursive_horizon(self, eye, facet, horizon):
# If the eye is visible from the facet...
if np.dot(facet.normal, eye - facet.center) > 0:
# Label the facet as visible and cross each edge
horizon.facets.add(facet)
for subfacet in facet.subfacets:
neighbor = (self.neighbors[subfacet] - {facet}).pop()
# If the neighbor is not visible,
# then the edge shared must be on the boundary
if neighbor and neighbor not in horizon.facets:
if not self._recursive_horizon(eye, neighbor, horizon):
horizon.boundary.append(subfacet)
return 1
return 0
As suggested above, maintaining the neighbors associated with a given SubFacet helps avoid repeated computation. This is a standard ingredient in computational geometry known as a half-edge data structure. To do this correctly, we need set neighbors (1) when building the initial simplex and (2) when adding/removing facets.
The current implementation iterates through every SubFacet in the Cone. However, if we restrict ourselves to the 3D Case, it is possible to improve this by performing DFS in CW or CCW order. Upon doing so, the horizon will be ordered such that consecutive edges will be neighbors. Unfortunately, this does not generalize beyond 3D. I experimented with Hamiltonian Paths but the computational cost is prohibitive.
I will also add that the current implementation is not robust. It will likely suffer numerical imprecision issues arising from coplanar facets. To remedy this, one must ensure the hull is healthy at each state.
If you're interested in further reading, I would recommend checking out the original quickhull paper by Barber, Dobkin, and Huhdanpaa. For implementation details, Dirk Gregorius's slide deck and Jordan Smith's notes are very helpful.