YOLO Post-processing Functions:

yolo_post_help.NMS(bboxes, scores, nms_threshold=0.8)

Perform non-maximum suppression (NMS) on the outputs of the YOLO model to reduce the number of candidate bounding boxes.

Parameters:

bboxes: torch.Tensor of size [N, 4]

N corresponds to the number of bounding boxes

scores: torch.Tensor of size [N, 1]

N corresponds to the number of bboxes; scores[i] corresponds to the confidence that bboxes[i] encompasses a target object

nms_threshold: float

The IoU threshold used to remove candidate bounding boxes that overlap the bounding box with the highest confidence during the current iteration

Returns:

bboxes_out: torch.Tensor of size [M, 4]

M corresponds to the number of bounding boxes remaining after NMS

scores_out: torch.Tensor of size [M, 1]

The M scores corresponding to the bounding boxes defined in bboxes_out
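
The greedy procedure described above can be sketched in plain Python. This is a minimal illustration and not the package's torch implementation: it uses lists instead of tensors, assumes boxes are stored as [left, top, right, bottom], and assumes a candidate is removed when its IoU with the current best box is strictly greater than nms_threshold. The names iou and nms_sketch are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two [left, top, right, bottom] boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms_sketch(bboxes, scores, nms_threshold=0.8):
    """Greedy NMS: keep the best remaining box, drop boxes overlapping it too much."""
    # Candidate indices sorted by descending confidence.
    order = sorted(range(len(bboxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Assumption: boxes with IoU strictly above the threshold are suppressed.
        order = [i for i in order if iou(bboxes[best], bboxes[i]) <= nms_threshold]
    return [bboxes[i] for i in keep], [scores[i] for i in keep]


boxes = [[0, 0, 10, 10], [0, 0, 10, 9], [20, 20, 30, 30]]
confs = [0.9, 0.8, 0.7]
kept_boxes, kept_scores = nms_sketch(boxes, confs, nms_threshold=0.8)
# The second box overlaps the first with IoU 0.9, so it is suppressed.
```

With the default threshold of 0.8, only near-duplicate boxes are removed; a lower threshold suppresses more aggressively.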

yolo_post_help.bb_to_rec(out, pos=[0, 1, 2, 3], **kwargs)

Converts the model output into a set of bounding boxes to be plotted.

Parameters:

out: torch.Tensor

The output object from the primary YOLO neural network after the best bbox per cell has been selected and the coordinate values have been reformatted in the postprocessing stage.

pos: ndarray of int

The indices corresponding to the [left, top, right, bottom] points at out[i,j]

Returns:

out: matplotlib.collections.PolyCollection

Contains N rectangles to be plotted later
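
To illustrate the role of pos, the sketch below turns one box into the four corner vertices of a rectangle. It is a plain-Python illustration under assumptions, not the package's code: box_to_vertices is a hypothetical helper, and the corner ordering is one possible choice. A sequence of such vertex lists is the kind of input that matplotlib.collections.PolyCollection takes as its first argument.

```python
def box_to_vertices(box, pos=(0, 1, 2, 3)):
    """Return the 4 corners of a box whose [left, top, right, bottom]
    values live at the indices given by pos."""
    left, top, right, bottom = (box[p] for p in pos)
    return [(left, top), (right, top), (right, bottom), (left, bottom)]


# Default layout: [left, top, right, bottom].
verts_default = box_to_vertices([1, 2, 4, 6])

# Hypothetical alternative layout [top, left, bottom, right]: pos remaps it.
verts_remapped = box_to_vertices([2, 1, 6, 4], pos=(1, 0, 3, 2))
```

Both calls describe the same rectangle, which is the point of pos: the caller states where each coordinate is stored instead of reordering the data.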

yolo_post_help.compute_pr_curves(all_gt_bboxes, all_pred_bboxes, all_pred_scores, outdir, verbose=False)

Given a set of ground truth bounding boxes, a set of predicted bounding boxes, and the confidence score of each predicted bounding box, generate a set of PR curves quantifying the performance of the model. One PR curve is generated at each IoU threshold from 0.5 : 0.05 : 1.0 (end-exclusive, giving the 10 curves in the returned list) and contains 101 points: one per confidence threshold from 0.01 : 0.01 : 1.0, plus 2 predefined endpoints. This function will automatically resume the computations at the last (iou, conf) tuple if

Parameters:

all_gt_bboxes: torch.Tensor of shape [X,N,4]

The ground truth bounding boxes which correspond to the target objects in the original image. X is the number of images in the original gt dataset and N is the number of gt bounding boxes for each image. Note that N may vary from image to image.

all_pred_bboxes: torch.Tensor of shape [X,M,4]

The filtered bounding boxes which were output from the model. X is the number of images in the original gt dataset and M is the number of predicted bounding boxes for image ‘x’ in the gt dataset.

all_pred_scores: torch.Tensor of shape [X,M,1]

The confidence scores corresponding to the bounding boxes in the parameter ‘all_pred_bboxes’. X is the number of images in the original gt dataset and M is the number of predicted bounding boxes for image ‘x’ in the gt dataset.

verbose: bool

Default - False; If True, print out the (tp, fp, fn) 3-tuple for every (iou, conf) threshold

Returns:

all_pr_curves: list of shape [10,101,3]

Each element of this list is defined by 101 3-tuples which correspond to the (precision, recall, confidence) values at the given thresholds (iou_thresh, conf_thresh).
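
The threshold grids and the precision/recall arithmetic can be sketched as follows. This assumes the ranges are end-exclusive, which is the reading that matches the documented shape [10, 101, 3] (10 IoU thresholds; 99 confidence thresholds plus 2 endpoints). The helper precision_recall and its handling of empty denominators are assumptions for illustration, not the package's behavior.

```python
# IoU thresholds 0.50, 0.55, ..., 0.95 (10 values, end-exclusive at 1.0).
iou_thresholds = [round(0.5 + 0.05 * k, 2) for k in range(10)]

# Confidence thresholds 0.01, 0.02, ..., 0.99 (99 values, end-exclusive at 1.0).
conf_thresholds = [round(0.01 * k, 2) for k in range(1, 100)]


def precision_recall(tp, fp, fn):
    """Precision and recall from a (tp, fp, fn) count triple."""
    # Assumed conventions: precision is 1.0 with no predictions,
    # recall is 0.0 with no ground truth.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 1.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall


# One (precision, recall, confidence) point of one PR curve.
p, r = precision_recall(tp=8, fp=2, fn=2)
point = (p, r, conf_thresholds[49])
```

Each (iou, conf) pair yields one such point, so a full run evaluates the (tp, fp, fn) counts once per pair.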

yolo_post_help.get_mAP(all_pr_curves)

Given a set of PR curves at various IoU thresholds, compute the mean average precision (mAP) across all the provided PR curves. The mAP is the mean of the average precision of each PR curve, and each average precision is computed using the trapezoid method from the scipy.integrate package.

Parameters:

all_pr_curves: list of shape [10,101,3]

Each element of this list is defined by 101 3-tuples which correspond to the (precision, recall, confidence) values at the given thresholds (iou_thresh, conf_thresh). This parameter is intended to be the output from the compute_pr_curves() function in this same package.

Returns:

mAP: float

The mean average precision metric, which quantifies the performance of a model specialized in the object segmentation task and lies in the range [0, 1]
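
The computation can be sketched with a hand-rolled trapezoid rule in place of the scipy.integrate routine mentioned above, so that the example has no dependencies. The function names are hypothetical, and sorting the points by recall before integrating is an assumption about how the curve is traversed.

```python
def average_precision(curve):
    """Area under precision vs. recall for one PR curve (trapezoid rule)."""
    # Each point is (precision, recall, confidence); integrate over recall.
    pts = sorted((recall, precision) for precision, recall, _conf in curve)
    area = 0.0
    for (r0, p0), (r1, p1) in zip(pts, pts[1:]):
        area += (r1 - r0) * (p0 + p1) / 2.0
    return area


def mean_average_precision(all_pr_curves):
    """Mean of the per-curve average precision values."""
    aps = [average_precision(curve) for curve in all_pr_curves]
    return sum(aps) / len(aps)


# Two toy curves: precision fixed at 1.0 and at 0.5 over recall 0..1.
perfect = [(1.0, 0.0, 1.0), (1.0, 0.5, 0.5), (1.0, 1.0, 0.0)]
half = [(0.5, 0.0, 1.0), (0.5, 0.5, 0.5), (0.5, 1.0, 0.0)]
toy_map = mean_average_precision([perfect, half])
```

The toy curves have areas 1.0 and 0.5, so their mean is 0.75.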

yolo_post_help.plot_pr_curves(all_pr_curves, save_path='', condense=True)

Given one or more sets of (precision, recall, confidence) 3-tuples, generate either 1 (condense=True) or 10 (condense=False) PR curves visualizing the provided data, parameterized by confidence.

Parameters:

all_pr_curves: list of shape [10,101,3]

Each element of this list is defined by 101 3-tuples which correspond to the (precision, recall, confidence) values at the given thresholds (iou_thresh, conf_thresh). This parameter is intended to be the output from the compute_pr_curves() function in this same package.

save_path: string

Default - ''; If provided, the figure generated by this function will be saved at the corresponding location

condense: bool

Default - True; If True, will plot every PR curve on the same figure and generate a legend which relates the plots with their corresponding iou_threshold value. If False, will generate a separate plot for each PR curve within the figure.

Returns:

fig: matplotlib.figure

A figure containing all of the PR curves stored in the parameter ‘all_pr_curves’

ax: matplotlib.axes

A single axis containing the plotted data of all 10 PR curves or a set of 10 axes containing the plotted data of each corresponding PR curve

yolo_post_help.postprocess(out, B, stride, pads, ds_factor=8, up_factor=None, n_bb=5, verbose=False)

Converts the raw output from a YOLO network into a more meaningful data structure through the following steps:
(1) Remove B-1 bboxes, keeping the one with the highest confidence
(2) Apply a sigmoid function to normalize the confidence values of each bbox
(3) Convert raw output into units and positions of the (upsampled) input image
(4) Remove padding which was added during the tile extraction step
(5) If upsampling was performed in the preprocessing step, “downsample” the bbox scalars
(6) Permute the output from [8,R,C] to [R,C,8] for simpler downstream analysis

Parameters:

out: torch.Tensor

The output object from the primary YOLO neural network

B: int

The number of bounding boxes per block that the model will output. (Automatically defined when initializing the network)

stride: int

The number of pixels to pass over when convolving the input image in each layer (Automatically defined when initializing the network)

pads: array of int

List of the thickness of the padding along each axis in units of pixels

ds_factor: int

Default - 8; The factor by which the model output is shrunk relative to the original image size

up_factor: int

Default - None; The factor used to upsample the original image prior to padding

n_bb: int

Default - 5; The number of scalars which define a single bbox

verbose: bool

Default - False; If True, print out the time taken to run through the entire function.

Returns:

out_: torch.Tensor of shape [N/4, M/4, 8]

The processed output from an original grayscale or RGB image of size [N,M]
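
Steps (1) and (2) of postprocess can be illustrated for a single grid cell. This is a plain-Python sketch under assumptions, not the package's tensor code: best_bbox_for_cell is a hypothetical helper, and storing the confidence at index 4 of the n_bb=5 scalars is an assumed layout.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def best_bbox_for_cell(candidates, conf_index=4):
    """Keep the highest-confidence candidate of a cell, then normalize its confidence."""
    # Step (1): of the B candidates, keep the one with the highest raw confidence.
    best = list(max(candidates, key=lambda c: c[conf_index]))
    # Step (2): squash the raw confidence into (0, 1).
    best[conf_index] = sigmoid(best[conf_index])
    return best


# One cell with B=2 candidates of n_bb=5 scalars each (assumed layout).
cell = [[0.1, 0.2, 0.3, 0.4, -1.0], [0.5, 0.6, 0.7, 0.8, 2.0]]
picked = best_bbox_for_cell(cell)
```

Because the sigmoid is monotonically increasing, selecting the best candidate before or after normalization picks the same bbox.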

yolo_post_help.remove_low_conf_bboxes(bboxes, scores, conf_thresh=0.1)

Remove bounding boxes from the parameter ‘bboxes’ if the corresponding element in the parameter ‘scores’ is below conf_thresh.

Parameters:

bboxes: torch.Tensor of size [N, 4]

N corresponds to the number of bounding boxes

scores: torch.Tensor of size [N, 1]

N corresponds to the number of bboxes; scores[i] corresponds to the confidence that bboxes[i] encompasses a target object

conf_thresh: float

If scores[i] is below this threshold, then remove bboxes[i] from the input

Returns:

bboxes_out: torch.Tensor of size [M, 4]

M corresponds to the number of bounding boxes remaining after this filtering step

scores_out: torch.Tensor of size [M, 1]

The M scores corresponding to the bounding boxes defined in bboxes_out
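
The filtering rule can be sketched in plain Python with lists standing in for tensors. The function name remove_low_conf_sketch is hypothetical; following the description above, a box is removed only when its score is strictly below conf_thresh.

```python
def remove_low_conf_sketch(bboxes, scores, conf_thresh=0.1):
    """Drop every box whose score is below conf_thresh."""
    # Indices of the boxes that survive the filter.
    keep = [i for i, s in enumerate(scores) if s >= conf_thresh]
    return [bboxes[i] for i in keep], [scores[i] for i in keep]


boxes = [[0, 0, 5, 5], [1, 1, 6, 6], [2, 2, 7, 7]]
confs = [0.05, 0.1, 0.6]
boxes_out, scores_out = remove_low_conf_sketch(boxes, confs, conf_thresh=0.1)
```

Here N = 3 and M = 2: the first box is dropped, while the box whose score equals the threshold is kept.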