YOLO Post-processing Functions:
- yolo_post_help.NMS(bboxes, scores, nms_threshold=0.8)
Perform non-maximum suppression (NMS) on the outputs of the YOLO model to reduce the number of candidate bounding boxes.
Parameters:
- bboxes: torch.Tensor of size [N, 4]
N corresponds to the number of bounding boxes
- scores: torch.Tensor of size [N, 1]
N corresponds to the number of bboxes; scores[i] corresponds to the confidence that bboxes[i] encompasses a target object
- nms_threshold: float
Default - 0.8; The IoU threshold used to remove candidate bounding boxes based on their overlap with the highest-confidence bounding box in the current iteration
Returns:
- bboxes_out: torch.Tensor of size [M, 4]
M corresponds to the number of bounding boxes remaining after NMS
- scores_out: torch.Tensor of size [M, 1]
The M scores corresponding to the bounding boxes defined in bboxes_out
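The greedy procedure described above can be sketched in plain Python. This is a minimal illustration, not the package's implementation: it uses lists instead of torch tensors so it runs without dependencies, the helper names `iou` and `nms_sketch` are hypothetical, and it assumes boxes are stored as [left, top, right, bottom] and that a candidate is dropped when its IoU with the kept box exceeds the threshold.

```python
def iou(a, b):
    """IoU of two boxes, each assumed to be [left, top, right, bottom]."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms_sketch(bboxes, scores, nms_threshold=0.8):
    """Hypothetical stand-in for NMS(); returns the surviving boxes and scores."""
    # Visit candidates from highest to lowest confidence.
    order = sorted(range(len(bboxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Assumption: drop candidates whose IoU with the kept box exceeds the threshold.
        order = [i for i in order if iou(bboxes[best], bboxes[i]) <= nms_threshold]
    return [bboxes[i] for i in keep], [scores[i] for i in keep]


boxes = [[0, 0, 10, 10], [0, 0, 10, 9], [20, 20, 30, 30]]
confs = [0.9, 0.8, 0.7]
kept_boxes, kept_scores = nms_sketch(boxes, confs, nms_threshold=0.8)
```

In this example the second box overlaps the first with an IoU of 0.9, so it is suppressed, while the third box does not overlap either and is kept.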
- yolo_post_help.bb_to_rec(out, pos=[0, 1, 2, 3], **kwargs)
Converts the model output into a set of bounding boxes to be plotted.
Parameters:
- out: torch.Tensor
The output object from the primary YOLO neural network after the best bbox per cell has been selected and the coordinate values have been reformatted in the postprocessing stage.
- pos: ndarray of int
Default - [0, 1, 2, 3]; The indices corresponding to the [left, top, right, bottom] points at out[i,j]
Returns:
- out: matplotlib.collections.PolyCollection
Contains N rectangles to be plotted later
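To illustrate how the `pos` indices select the rectangle edges, here is a small dependency-free sketch that turns one cell's values into the four corner vertices of a rectangle. The helper name `box_to_vertices` and the sample values are hypothetical, and the way bb_to_rec builds its collection internally is not documented here.

```python
def box_to_vertices(cell, pos=(0, 1, 2, 3)):
    """Return the corner vertices of the rectangle stored in one cell.

    `pos` gives the indices of the [left, top, right, bottom] values in `cell`.
    """
    left, top, right, bottom = (cell[p] for p in pos)
    return [(left, top), (right, top), (right, bottom), (left, bottom)]


# Two example cells; the trailing value stands in for a confidence score.
cells = [[2.0, 3.0, 6.0, 8.0, 0.9], [10.0, 10.0, 14.0, 12.0, 0.4]]
verts = [box_to_vertices(c) for c in cells]
```

A list of vertex lists such as `verts` has the form that `matplotlib.collections.PolyCollection` accepts as its first argument.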
- yolo_post_help.compute_pr_curves(all_gt_bboxes, all_pred_bboxes, all_pred_scores, outdir, verbose=False)
Given a set of ground truth bounding boxes, a set of predicted bounding boxes, and the confidence score of each predicted bounding box, generate a set of PR curves quantifying the performance of the model. One PR curve is generated at each IoU threshold from 0.5 : 0.05 : 1.0 and contains 101 points corresponding to each confidence threshold from 0.01 : 0.01 : 1.0 and 2 predefined endpoints. This function will automatically resume the computations at the last (iou, conf) tuple if
Parameters:
- all_gt_bboxes: torch.Tensor of shape [X,N,4]
The ground truth bounding boxes which correspond to the target objects in the original image. X is the number of images in the original gt dataset and N is the number of gt bounding boxes for each image. Note that N may vary from image to image.
- all_pred_bboxes: torch.Tensor of shape [X,M,4]
The filtered bounding boxes which were output from the model. X is the number of images in the original gt dataset and M is the number of predicted bounding boxes for image ‘x’ in the gt dataset.
- all_pred_scores: torch.Tensor of shape [X,M,1]
The confidence scores corresponding to the bounding boxes in the parameter ‘all_pred_bboxes’. X is the number of images in the original gt dataset and M is the number of predicted bounding boxes for image ‘x’ in the gt dataset.
- verbose: bool
Default - False; If True, print out the (tp, fp, fn) 3-tuple for every (iou, conf) threshold
Returns:
- all_pr_curves: list of shape [10,101,3]
Each element of this list is defined by 101 3-tuples which correspond to the (precision, recall, confidence) values at the given thresholds (iou_thresh, conf_thresh).
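A single point on one of these curves comes from counting true positives, false positives, and false negatives at one (iou_thresh, conf_thresh) pair. The sketch below shows one common way to do that for a single image, using greedy matching in plain Python. The helper names, the [left, top, right, bottom] box layout, the matching rule, and the convention of reporting precision 1.0 when nothing is predicted are all assumptions, not the package's documented behavior.

```python
def iou(a, b):
    """IoU of two boxes, each assumed to be [left, top, right, bottom]."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def count_matches(gt_bboxes, pred_bboxes, pred_scores, iou_thresh, conf_thresh):
    """Return (tp, fp, fn) for one image at one pair of thresholds."""
    # Keep predictions at or above the confidence threshold, highest first.
    preds = sorted(
        (p for p in zip(pred_scores, pred_bboxes) if p[0] >= conf_thresh),
        key=lambda p: p[0],
        reverse=True,
    )
    unmatched = list(gt_bboxes)
    tp = fp = 0
    for _, box in preds:
        # Greedily match against the best-overlapping unmatched ground truth box.
        best = max(unmatched, key=lambda g: iou(box, g), default=None)
        if best is not None and iou(box, best) >= iou_thresh:
            unmatched.remove(best)
            tp += 1
        else:
            fp += 1
    return tp, fp, len(unmatched)


def pr_point(tp, fp, fn):
    """Precision and recall; precision defaults to 1.0 with no predictions (assumption)."""
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


gt = [[0, 0, 10, 10], [20, 20, 30, 30]]
preds = [[0, 0, 10, 9], [50, 50, 60, 60], [20, 20, 30, 31]]
scores = [0.9, 0.6, 0.3]
strict = count_matches(gt, preds, scores, iou_thresh=0.5, conf_thresh=0.5)
loose = count_matches(gt, preds, scores, iou_thresh=0.5, conf_thresh=0.1)
```

Lowering the confidence threshold admits the third prediction, which matches the second ground truth box, so recall rises while the unmatched middle prediction still counts as a false positive.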
- yolo_post_help.get_mAP(all_pr_curves)
Given a set of PR curves at various IoU thresholds, compute the mean average precision across all the provided PR curves. This is done by taking the mean of the average precision of each PR curve, which is computed using the trapezoid method from the scipy.integrate package.
Parameters:
- all_pr_curves: list of shape [10,101,3]
Each element of this list is defined by 101 3-tuples which correspond to the (precision, recall, confidence) values at the given thresholds (iou_thresh, conf_thresh). This parameter is intended to be the output from the compute_pr_curves() function in this same package.
Returns:
- mAP: float
The mean average precision metric, which quantifies the performance of a model specialized in the object segmentation task and lies in the range [0,1]
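The calculation described above amounts to integrating precision over recall for each curve and averaging the results. The sketch below writes the trapezoid rule by hand so that it runs without SciPy; the function names are hypothetical, and sorting the points by recall before integrating is an assumption about how the curve is ordered.

```python
def average_precision(curve):
    """Area under one PR curve given as (precision, recall, confidence) tuples."""
    # Sort by recall so the integration runs left to right (assumption).
    pts = sorted((r, p) for p, r, _ in curve)
    area = 0.0
    for (r0, p0), (r1, p1) in zip(pts, pts[1:]):
        # Trapezoid rule: segment width times the mean of the two heights.
        area += (r1 - r0) * (p0 + p1) / 2.0
    return area


def mean_average_precision(all_pr_curves):
    """Mean of the per-curve average precision values."""
    aps = [average_precision(c) for c in all_pr_curves]
    return sum(aps) / len(aps)


# Two toy curves with three and two points instead of the full 101.
curve_a = [(1.0, 0.0, 1.0), (1.0, 0.5, 0.5), (0.5, 1.0, 0.0)]
curve_b = [(1.0, 0.0, 1.0), (0.0, 1.0, 0.0)]
m_ap = mean_average_precision([curve_a, curve_b])
```

Here the first curve integrates to 0.875 and the second to 0.5, giving a mean of 0.6875.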
- yolo_post_help.plot_pr_curves(all_pr_curves, save_path='', condense=True)
Given one or more sets of (precision, recall, confidence) 3-tuples, generate either 1 (condense=True) or 10 (condense=False) PR curves visualizing the provided data, parameterized by confidence.
Parameters:
- all_pr_curves: list of shape [10,101,3]
Each element of this list is defined by 101 3-tuples which correspond to the (precision, recall, confidence) values at the given thresholds (iou_thresh, conf_thresh). This parameter is intended to be the output from the compute_pr_curves() function in this same package.
- save_path: string
Default - ‘’; If provided, the figure generated by this function will be saved at the corresponding location
- condense: bool
Default - True; If True, will plot every PR curve on the same figure and generate a legend which relates the plots with their corresponding iou_threshold value. If False, will generate a separate plot for each PR curve within the figure.
Returns:
- fig: matplotlib.figure
A figure containing all of the PR curves stored in the parameter ‘all_pr_curves’
- ax: matplotlib.axes
A single axis containing the plotted data of all 10 PR curves or a set of 10 axes containing the plotted data of each corresponding PR curve
- yolo_post_help.postprocess(out, B, stride, pads, ds_factor=8, up_factor=None, n_bb=5, verbose=False)
Converts the raw output from a YOLO network into a more meaningful data structure through the following steps:
(1) Remove B-1 bboxes, keeping the one with highest confidence
(2) Apply a sigmoid function to normalize the confidence values of each bbox
(3) Convert raw output into units and positions of the (upsampled) input image
(4) Remove padding which was added during the tile extraction step
(5) If upsampling was performed in the preprocessing step, “downsample” the bbox scalars
(6) Permute the output from [8,R,C] to [R,C,8] for simpler downstream analysis
Parameters:
- out: torch.Tensor
The output object from the primary YOLO neural network
- B: int
The number of bounding boxes per block that the model will output. (Automatically defined when initializing the network)
- stride: int
The number of pixels to pass over when convolving the input image in each layer (Automatically defined when initializing the network)
- pads: array of int
List of the thickness of the padding along each axis in units of pixels
- ds_factor: int
Default - 8; The factor by which the model output is shrunk relative to the original image size
- up_factor: int
Default - None; The factor used to upsample the original image prior to padding
- n_bb: int
Default - 5; The number of scalars which define a single bbox
- verbose: bool
Default - False; If True, print out the time taken to run through the entire function.
Returns:
- out_: torch.Tensor of shape [N/4, M/4, 8]
The processed output from an original grayscale or RGB image of size [N,M]
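Steps (1), (2), and (6) can be illustrated on small nested lists. This sketch is only a rough picture of those steps under stated assumptions: each cell is assumed to hold B candidates of n_bb scalars laid out back to back, the confidence is assumed to be the last scalar of each candidate (the hypothetical `conf_index` argument), and the helper names do not come from the package. The coordinate conversion, padding removal, and downsampling steps depend on details not given here and are omitted.

```python
import math


def sigmoid(x):
    """Map a raw confidence value into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def best_bbox_in_cell(cell, B, n_bb=5, conf_index=4):
    """Keep the highest-confidence candidate of a cell and normalize its confidence."""
    # Split the flat vector into B candidates of n_bb scalars each (assumed layout).
    candidates = [cell[b * n_bb:(b + 1) * n_bb] for b in range(B)]
    # Sigmoid is monotonic, so selecting on the raw confidence picks the same box.
    best = list(max(candidates, key=lambda c: c[conf_index]))
    best[conf_index] = sigmoid(best[conf_index])
    return best


def channels_last(grid):
    """Permute nested lists from [K, R, C] to [R, C, K]."""
    K, R, C = len(grid), len(grid[0]), len(grid[0][0])
    return [[[grid[k][r][c] for k in range(K)] for c in range(C)] for r in range(R)]


# One cell with B = 2 candidates; the second has the higher raw confidence.
cell = [1.0, 2.0, 3.0, 4.0, -1.0, 5.0, 6.0, 7.0, 8.0, 0.0]
best = best_bbox_in_cell(cell, B=2)

# A [2, 1, 2] grid becomes [1, 2, 2].
permuted = channels_last([[[1, 2]], [[3, 4]]])
```

With torch tensors, the permutation in step (6) would more naturally be written with `Tensor.permute(1, 2, 0)`.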
- yolo_post_help.remove_low_conf_bboxes(bboxes, scores, conf_thresh=0.1)
Remove bounding boxes from the parameter ‘bboxes’ if the corresponding element in the parameter ‘scores’ is below conf_thresh.
Parameters:
- bboxes: torch.Tensor of size [N, 4]
N corresponds to the number of bounding boxes
- scores: torch.Tensor of size [N, 1]
N corresponds to the number of bboxes; scores[i] corresponds to the confidence that bboxes[i] encompasses a target object
- conf_thresh: float
Default - 0.1; If scores[i] is below this threshold, then remove bboxes[i] from the input
Returns:
- bboxes_out: torch.Tensor of size [M, 4]
M corresponds to the number of bounding boxes remaining after this filtering step
- scores_out: torch.Tensor of size [M, 1]
The M scores corresponding to the bounding boxes defined in bboxes_out
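The filtering rule above is simple enough to show in a few lines. This is a plain-Python sketch with a hypothetical function name, using lists in place of torch tensors; it keeps a box when its score is at or above the threshold, matching the "below conf_thresh" removal rule.

```python
def filter_by_confidence(bboxes, scores, conf_thresh=0.1):
    """Drop every box whose score is below conf_thresh."""
    keep = [i for i, s in enumerate(scores) if s >= conf_thresh]
    return [bboxes[i] for i in keep], [scores[i] for i in keep]


boxes = [[0, 0, 10, 10], [5, 5, 8, 8], [20, 20, 30, 30]]
confs = [0.95, 0.05, 0.1]
boxes_out, confs_out = filter_by_confidence(boxes, confs, conf_thresh=0.1)
```

With torch tensors of the documented shapes, the same selection is usually expressed as a boolean mask, for example `bboxes[scores.squeeze(1) >= conf_thresh]`.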