Computational Geometry

This section of the math expressions user guide covers computational geometry functions.

Convex Hull

A convex hull is the smallest convex set of points that encloses a data set. Math expressions has support for computing the convex hull of a 2D data set. Once a convex hull has been calculated, a set of math expression functions can be applied to geometrically describe and visualize the convex hull.

Visualization

The convexHull function can be used to visualize a border around a set of 2D points. Border visualizations can be useful for understanding where data points are in relation to the border.

In the examples below the convexHull function is used to visualize a border for a set of latitude and longitude points of rat sightings in the NYC311 complaints database. An investigation of the border around the rat sightings can be done to better understand how rats may be entering or exiting the specific region.

Scatter Plot

Before visualizing the convex hull its often useful to visualize the 2D points as a scatter plot.

In this example the random function draws a sample of records from the NYC311 (complaints database) collection where the complaint description matches "rat sighting" and the zip code is 11238. The latitude and longitude fields are then vectorized and plotted as a scatter plot with longitude on x-axis and latitude on the y-axis.

convex0

Notice from the scatter plot that many of the points appear to lie near the border of the plot.

Convex Hull Plot

The convexHull function can be used to visualize the border. The example uses the same points drawn from the NYC311 database. But instead of plotting the points directly the latitude and longitude points are added as rows to a matrix. The matrix is then transposed with transpose function so that each row of the matrix contains a single latitude and longitude point.

The convexHull function is then used calculate the convex hull for the matrix of points. The convex hull is set a variable called hull.

Once the convex hull has been created the getVertices function can be used to retrieve the matrix of points in the scatter plot that comprise the convex border around the scatter plot. The colAt function can then be used to retrieve the latitude and longitude vectors from the matrix so they can visualized by the zplot function. In the example below the convex hull points are visualized as a scatter plot.

hullplot

Notice that the 15 points in the scatter plot describe that latitude and longitude points of the convex hull.

Projecting and Clustering

The once a convex hull as been calculated the projectToBorder can then be used to project points to the nearest point on the border. In the example below the projectToBorder function is used to project the original scatter scatter plot points to the nearest border.

The projectToBorder function returns a matrix of lat/lon points for the border projections. In the example the matrix of border points is then clustered into 7 clusters using kmeans clustering. The zplot function is then used to plot the clustered border points.

convex1

Notice in the visualization its easy to see which spots along the border have the highest density of points. In the case of the rat sightings this information is useful in understanding which border points are closest for the rats to enter or exit from.

Plotting the Centroids

Once the border points have been clustered its very easy to extract the centroids of the clusters and plot them on a map. The example below extracts the centroids from the clusters using the getCentroids function. getCentroids returns the matrix of lat/lon points which represent the centroids of border clusters. The colAt function can then be used to extract the lat/lon vectors so they can be plotted on a map using zplot.

convex2

The map above shows the centroids of the border clusters. The centroids from the highest density clusters can now be zoomed and investigated geo-spatially to determine what might be the best places to begin an investigation of the border.

Enclosing Disk

The enclosingDisk function finds the smallest enclosing circle the encloses a 2D data set. Once an enclosing disk has been calculated, a set of math expression functions can be applied to geometrically describe the enclosing disk.

In the example below an enclosing disk is calculated for a randomly generated set of 1000 2D observations.

Then the following functions are called on the enclosing disk:

  • getCenter: Returns the 2D point that is the center of the disk.

  • getRadius: Returns the radius of the disk.

  • getSupportPoints: Returns the support points of the disk.

let(echo="center, radius, support",
    x=sample(normalDistribution(0, 20), 1000),
    y=sample(normalDistribution(0, 20), 1000),
    observations=transpose(matrix(x,y)),
    disk=enclosingDisk(observations),
    center=getCenter(disk),
    radius=getRadius(disk),
    support=getSupportPoints(disk))

When this expression is sent to the /stream handler it responds with:

{
  "result-set": {
    "docs": [
      {
        "center": [
          -6.668825009733749,
          -2.9825450908240025
        ],
        "radius": 72.66109546907208,
        "support": [
          [
            20.350992271739464,
            64.46791279377014
          ],
          [
            33.02079953093981,
            57.880978456420365
          ],
          [
            -44.7273247899923,
            -64.87911518353323
          ]
        ]
      },
      {
        "EOF": true,
        "RESPONSE_TIME": 8
      }
    ]
  }
}