YOLO Main Functions:

class yolo_help.GroundTruthDataset(N=256, M=64, nclasses=3, reproducible=False)

Bases: Dataset

This dataset generates a set of random images of size N x N, each containing a Poisson-distributed number of cells with mean M.

Parameters for Initialization:

N : int

The number of images to generate for this simulated dataset; Default - 256

M : int

The mean number of cells appended to each image, sampled from a Poisson distribution; Default - 64

nclasses : int

The number of classes to categorize this collection of simulated cells; Default - 3

reproducible : bool

If True, set np.random.seed() to the integer index supplied

Returns:

out_arr : 3-element array
  • out_arr[0] is a 1 x N x N array defining the base image

  • out_arr[1] is a X x 4 array containing the bbox info for X cells

  • out_arr[2] is a X x 1 array containing the categorical label for the cell within the corresponding bbox; 0 - smooth boundary, 1 - sharp boundary, 2 - bumpy boundary
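The layout of the returned items can be sketched as follows. This is not the actual GroundTruthDataset implementation; `simulate_labels` and the cell size range are illustrative stand-ins for how a Poisson-distributed set of cells and their labels might be generated.

```python
import numpy as np

def simulate_labels(N=256, M=64, nclasses=3, seed=0):
    """Sketch of the ground-truth label layout: a Poisson-distributed
    number of cells, each with a [cx, cy, w, h] bbox and a class id.
    Illustrative only; not the yolo_help implementation."""
    rng = np.random.default_rng(seed)
    X = rng.poisson(M)                      # number of cells in this image
    image = np.zeros((1, N, N))             # out_arr[0]: 1 x N x N base image
    bboxes = np.column_stack([
        rng.uniform(0, N, size=X),          # cx
        rng.uniform(0, N, size=X),          # cy
        rng.uniform(4, 16, size=X),         # w (illustrative size range)
        rng.uniform(4, 16, size=X),         # h
    ])                                      # out_arr[1]: X x 4
    labels = rng.integers(0, nclasses, size=(X, 1))  # out_arr[2]: X x 1
    return image, bboxes, labels

image, bboxes, labels = simulate_labels()
```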

class yolo_help.Net(nclasses=3)

Bases: Module

A neural network using the YOLO framework, with a batch size of 1 and an input layer capable of accepting inputs of variable shape.

Parameters:

nclasses : int

Default - 3; The number of distinct classes in the dataset

Returns:

Net : torch.nn.Module

A neural network which can take input images with any number of channels

forward(x)

Forward method for the yolo network.

Inputs

x : torch.Tensor

Object is of size 1 x CH x ROW x COL. The number of channels, rows, and columns is arbitrary. This differs from the original YOLO paper, which requires a fixed number of channels, rows, and columns.

Outputs

x : torch.Tensor

Object is of size (5 * bbox_per_cell + num_classes) x ROW x COL (13 x ROW x COL with the defaults bbox_per_cell = 2 and num_classes = 3), where ROW and COL are equal to the input image size divided by the network’s stride (8). The five numbers per bounding box are [cx, cy, scalex, scaley, confidence].
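The output-shape arithmetic above can be checked with a small helper. `yolo_output_shape` is an illustrative function, not part of yolo_help; it assumes the stated values bbox_per_cell = 2, num_classes = 3, and stride = 8.

```python
def yolo_output_shape(rows, cols, bbox_per_cell=2, num_classes=3, stride=8):
    """Shape of the forward-pass output for an input of size rows x cols,
    assuming the defaults stated in the docs. Illustrative helper."""
    channels = 5 * bbox_per_cell + num_classes  # [cx, cy, sx, sy, conf] per box
    return (channels, rows // stride, cols // stride)

yolo_output_shape(256, 256)  # -> (13, 32, 32)
```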

class yolo_help.VariableInputConv2d(M)

Bases: Module

The assumption here is that we have a batch size of one, so we are free to use the batch dimension. This version adds a few extra convolutions.

Step 1: move the channel dimension to the batch dimension. Now we have N samples, with one channel each.

Step 2: apply some convolutions and ReLUs to end up with an N x M array of images, where M is fixed and N is variable.

Step 3: take a softmax of the result over the N channels. Now matrix-multiply the N x M array with the N x 1 softmax array to get an M x 1 array (where M is fixed).

Step 4: return the result, which has a fixed number of channels.

What’s nice is that this is permutation invariant: whether we input an RGB image or a BGR image, the result would be exactly the same. We don’t need to have the channels in any specific order.

TODO: change this so it supports a batch dimension. This requires every image in a batch to have the same number of channels, but that should be okay.
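Steps 1 through 4 can be sketched in numpy. This is not the VariableInputConv2d source: the per-channel feature map here is a random linear map standing in for the shared convolutions of step 2, and `channel_pool` is a hypothetical name. It does, however, demonstrate the permutation invariance claimed above.

```python
import numpy as np

def softmax(z, axis=0):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def channel_pool(x, M=4, seed=0):
    """Sketch of steps 1-4: collapse a variable number of input channels N
    down to a fixed M, invariant to channel order. The shared linear map W
    is a stand-in for the convolutions of step 2."""
    N = x.shape[0]                          # step 1: channels act as the batch
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((x.shape[1] * x.shape[2], M))
    feats = x.reshape(N, -1) @ W            # step 2: N x M features, shared weights
    weights = softmax(feats.mean(axis=1))   # step 3: one softmax weight per channel
    return weights @ feats                  # step 4: fixed-size M-vector

rgb = np.random.default_rng(1).standard_normal((3, 8, 8))
bgr = rgb[::-1]                             # reversed channel order
assert np.allclose(channel_pool(rgb), channel_pool(bgr))  # permutation invariant
```

Because every channel passes through the same weights and the softmax-weighted sum is symmetric in the channel index, reordering channels reorders the rows of `feats` and the entries of `weights` identically, leaving the pooled result unchanged.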

forward(x)

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

yolo_help.bbox_to_rectangles(bbox, **kwargs)

This function converts a set of bounding boxes of the form [cx,cy,w,h] to a set of rectangle objects for visualization purposes.

Parameters:

bbox : N x 4 array[numpy.float64]

Contains the bbox info for N bboxes of the form [cx,cy,w,h]

Returns:

out : matplotlib.collections.PolyCollection

Contains N rectangles to be plotted later
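The core of this conversion is moving from center form to corner form, which is what a matplotlib Rectangle or PolyCollection patch needs. `bbox_corners` is an illustrative helper, not the yolo_help implementation.

```python
import numpy as np

def bbox_corners(bbox):
    """Convert [cx, cy, w, h] boxes to [x0, y0, x1, y1] corner form.
    Illustrative helper for the conversion bbox_to_rectangles performs."""
    cx, cy, w, h = bbox[:, 0], bbox[:, 1], bbox[:, 2], bbox[:, 3]
    return np.column_stack([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2])

bbox_corners(np.array([[10.0, 20.0, 4.0, 6.0]]))  # -> [[8., 17., 12., 23.]]
```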

yolo_help.convert_data(out, B, stride)

Convert the outputs from the YOLO neural network into bounding boxes for performance quantification. Note that this operation is not differentiable. It also outputs the raw data, reformatted into a list instead of an image of grid cells. These outputs are differentiable. The last of these outputs is the score (data[…,-1]). Note that class probabilities are NOT output by this function.

Parameters:

out : torch.Tensor

The output object from the primary YOLO neural network

B : int

The number of bounding boxes per block that the model will output. (Automatically defined when initializing the network)

stride : int

The number of pixels to pass over when convolving the input image in each layer (Automatically defined when initializing the network)

Returns:

bboxes : torch.Tensor of size N x 4

N is the number of bounding boxes output from the network; Bounding boxes are of the form [cx,cy,scalex,scaley]. There are no gradient computations done here. These bounding boxes are for visualization or downstream analysis.

data : torch.Tensor of size N x 5

N is the number of bounding boxes output from the network; Bounding boxes are of the form [cx,cy,w,h,conf]. There are gradient computations done here. These are for our loss function and, potentially, other downstream analysis.
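The grid-to-list reshape this function performs can be sketched as follows. This is an assumption about the channel layout (B groups of 5 bbox channels followed by C class channels), not the yolo_help source; `grid_to_list` is a hypothetical name.

```python
import numpy as np

def grid_to_list(out, B=2):
    """Sketch of convert_data's reshape: an image of grid cells of shape
    (5*B + C) x ROW x COL becomes a flat (ROW*COL*B) x 5 list of
    [cx, cy, w, h, conf] rows. The class-probability channels are dropped,
    as noted above. Channel ordering here is an assumption."""
    ch, rows, cols = out.shape
    boxes = out[: 5 * B]                        # keep only the bbox channels
    boxes = boxes.reshape(B, 5, rows, cols)     # split into per-box groups
    return boxes.transpose(2, 3, 0, 1).reshape(-1, 5)

grid_to_list(np.zeros((13, 4, 4))).shape  # -> (32, 5)
```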

yolo_help.get_assignment_inds(bboxes, bbox, shape, stride, B)

Assigns each training bounding box to a specific cell, and picks the bounding box from that cell with the best iou.

Parameters:

bboxes : torch.Tensor of size [N, 4]

N is the number of bounding boxes computed for a single image and is equal to (np.shape(I)[0] / stride) * (np.shape(I)[0] / stride) * net.B

bbox : torch.Tensor of size [X, 4]

X is the true number of bboxes associated with image I

shape : 4-dimensional torch.Tensor

The shape of the data directly output from the YOLO neural network

stride : int

The number of pixels to pass over when convolving the input image in each layer (Automatically defined when initializing the network)

B : int

The number of bboxes generated by the YOLO neural network at each block

Returns:

assignment_inds : numpy array of size X

The index of the bounding box from the ‘bboxes’ parameter corresponding to the ith bounding box from the ‘bbox’ parameter

ious : numpy array of size X

The IOU between the ith bounding box from the ‘bbox’ parameter and the corresponding bbox from the ‘bboxes’ parameter
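The cell-assignment step can be sketched as follows. The row-major, B-boxes-per-cell indexing of the flat ‘bboxes’ list is an assumption about yolo_help's layout; `assign_cell` is a hypothetical name.

```python
import numpy as np

def assign_cell(bbox, img_size, stride, B):
    """Sketch of the assignment logic: each ground-truth box belongs to the
    grid cell containing its center. Returns the flat index of that cell's
    first box, assuming row-major cells with B consecutive boxes each
    (a layout assumption, not lifted from yolo_help)."""
    ncols = img_size // stride
    col = (bbox[:, 0] // stride).astype(int)   # cell column from cx
    row = (bbox[:, 1] // stride).astype(int)   # cell row from cy
    return (row * ncols + col) * B             # index of the cell's first box

assign_cell(np.array([[20.0, 9.0, 4.0, 4.0]]), img_size=64, stride=8, B=2)
# -> array([20])
```

The IOU of each of the B candidate boxes at that index would then be computed against the ground-truth box, and the best one kept.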

yolo_help.get_best_bounding_box_per_cell(bboxes, scores, B)

Get the best bounding box for each cell output from the YOLO neural network.

Parameters:

bboxes : torch.Tensor of size [N, 4]

N is the number of bounding boxes computed for a single image and is equal to (np.shape(I)[0] / stride) * (np.shape(I)[0] / stride) * net.B

scores : torch.Tensor of size [N, 1]

The confidence for each corresponding box in the ‘bboxes’ parameter

B : int

Default - 2; The number of bboxes generated by the YOLO neural network at each cell

Returns:

bboxes_out : torch.Tensor of size [N/B, 4]

A list of the best bounding boxes for each cell

scores_out : torch.Tensor of size [N/B, 1]

A list of the confidence for the corresponding bounding boxes in ‘bboxes_out’
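The selection logic reduces to an argmax over each cell's group of B boxes. The grouping of B consecutive rows per cell is an assumption about the layout of ‘bboxes’; `best_box_per_cell` is an illustrative helper, not the yolo_help source.

```python
import numpy as np

def best_box_per_cell(bboxes, scores, B=2):
    """Sketch: for each cell's group of B consecutive boxes, keep the one
    with the highest confidence. Consecutive-row grouping is an assumption."""
    grouped = scores.reshape(-1, B)                # (N/B, B) confidences
    best = grouped.argmax(axis=1)                  # winner within each group
    idx = np.arange(len(best)) * B + best          # flat index of each winner
    return bboxes[idx], scores.reshape(-1)[idx, None]

boxes = np.arange(4 * 4, dtype=float).reshape(4, 4)  # N=4 boxes, B=2 -> 2 cells
conf = np.array([0.1, 0.9, 0.7, 0.2])
best_boxes, best_conf = best_box_per_cell(boxes, conf)
```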

yolo_help.get_reg_targets(assignment_inds, bbox, B, shape, stride)

Compute the true bounding box parameters that we want to predict.

Parameters:

assignment_inds : numpy array of size X

The idx at position i corresponds to the bounding box output from the YOLO framework which most accurately encompasses bounding box ‘i’ in the ‘bbox’ (ground truth labels) parameter

bbox : torch.Tensor of size [X, 4]

X is the true number of bboxes associated with image I

B : int

The number of bboxes generated by the YOLO neural network at each block

shape : 4-dimensional torch.Tensor

The shape of the data directly output from the YOLO neural network

stride : int

The number of pixels to pass over when convolving the input image in each layer (Automatically defined when initializing the network)

Returns:

shiftx : numpy array of shape [X,]

Defines the shift in x needed to align bbox[i] with the corresponding best estimated bounding box

shifty : numpy array of shape [X,]

Defines the shift in y needed to align bbox[i] with the corresponding best estimated bounding box

scalex : numpy array of shape [X,]

Defines the scale in x needed to align bbox[i] with the corresponding best estimated bounding box

scaley : numpy array of shape [X,]

Defines the scale in y needed to align bbox[i] with the corresponding best estimated bounding box
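One plausible parameterization of these targets is a center offset plus a width/height ratio, sketched below. This is a hedged guess at the scheme; the exact parameterization used by yolo_help.get_reg_targets may differ, and `reg_targets` is a hypothetical name. Boxes are [cx, cy, w, h].

```python
import numpy as np

def reg_targets(bbox_true, bbox_pred):
    """Illustrative regression targets: shift of the true center relative to
    the assigned predicted box, and the ratio of true to predicted size.
    An assumption about the parameterization, not the yolo_help source."""
    shiftx = bbox_true[:, 0] - bbox_pred[:, 0]
    shifty = bbox_true[:, 1] - bbox_pred[:, 1]
    scalex = bbox_true[:, 2] / bbox_pred[:, 2]
    scaley = bbox_true[:, 3] / bbox_pred[:, 3]
    return shiftx, shifty, scalex, scaley

sx, sy, kx, ky = reg_targets(np.array([[12., 10., 8., 8.]]),
                             np.array([[10., 10., 4., 8.]]))
# sx = 2 (shift right), kx = 2 (twice as wide), sy = 0, ky = 1
```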

yolo_help.imshow(I, ax, **kwargs)

This function normalizes the image I and plots it on the axes object ax.

Parameters:

I : 1 x N x N array[numpy.float64]

An image

ax : matplotlib.axes._axes.Axes object

An empty axis onto which I will be plotted

Returns:

None
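The normalization step presumably rescales the image into a fixed display range before plotting; a min-max version is sketched below. The exact scheme in yolo_help.imshow may differ, and `normalize` is a hypothetical name.

```python
import numpy as np

def normalize(I):
    """Min-max normalize an image into [0, 1] for display. An assumption
    about the normalization imshow applies; constant images map to zeros."""
    I = np.asarray(I, dtype=float)
    lo, hi = I.min(), I.max()
    return (I - lo) / (hi - lo) if hi > lo else np.zeros_like(I)

normalize(np.array([[0.0, 5.0], [10.0, 2.5]]))
# -> [[0. , 0.5 ], [1. , 0.25]]
```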

yolo_help.iou(bbox0, bbox1, nopairwise=False)

Calculate the pairwise IOU between a set of estimated bounding boxes (bbox0) and the set of ground truth bounding boxes (bbox1).

Parameters:

bbox0 : torch.Tensor of size [N, 4]

N is the number of bounding boxes

bbox1 : torch.Tensor of size [N, 4]

N is the number of bounding boxes

nopairwise : bool

If True, compute pointwise, not pairwise, IOU

Returns:

out : array of length N

An array containing the pairwise (or pointwise) IOU values between the elements of bbox0 and bbox1
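The pointwise case (nopairwise=True) can be sketched in numpy for boxes in [cx, cy, w, h] form. This is an illustrative reimplementation of the standard intersection-over-union formula, not the yolo_help source.

```python
import numpy as np

def iou_pointwise(bbox0, bbox1):
    """Pointwise IOU between matched pairs of [cx, cy, w, h] boxes.
    Illustrative helper; assumes bbox0 and bbox1 have the same length."""
    def corners(b):
        return (b[:, 0] - b[:, 2] / 2, b[:, 1] - b[:, 3] / 2,
                b[:, 0] + b[:, 2] / 2, b[:, 1] + b[:, 3] / 2)
    x00, y00, x01, y01 = corners(bbox0)
    x10, y10, x11, y11 = corners(bbox1)
    # Overlap width/height, clipped at zero for disjoint boxes
    iw = np.clip(np.minimum(x01, x11) - np.maximum(x00, x10), 0, None)
    ih = np.clip(np.minimum(y01, y11) - np.maximum(y00, y10), 0, None)
    inter = iw * ih
    union = bbox0[:, 2] * bbox0[:, 3] + bbox1[:, 2] * bbox1[:, 3] - inter
    return inter / union

iou_pointwise(np.array([[5., 5., 4., 4.]]),
              np.array([[5., 5., 4., 4.]]))  # -> array([1.])
```

The pairwise case would broadcast the same computation over all N x N box pairs rather than matched rows.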

yolo_help.train_yolo_model(nepochs, lr, cls_loss, outdir, modelname, optimizername, lossname, verbose=False, resume=False, J_path=None)

Train a neural network defined by the YOLO framework and other provided hyperparameters using the simulated dataset ‘groundtruth’.

Parameters:

nepochs : int

The number of epochs used to train the model

lr : float

The learning rate used to train the model

outdir : str

The output directory where all files will be saved during training, or where all files were saved if the model has already been trained.

modelname : str

The file name of the model used during training

optimizername : str

The file name of the optimizer used during training

lossname : str

The file name of the losses computed during training

resume : bool

Default - False; If True, resume the training of model ‘outdir/modelname’ or load the pretrained model saved at ‘outdir/modelname’

J_path : str

Default - None; The file path to the image to be used during the validation portion of training

Returns:

net : torch.nn.Module

A neural network which has been trained on a simulated dataset