YOLO Main Functions:¶
- class yolo_help.GroundTruthDataset(N=256, M=64, nclasses=3, reproducible=False)¶
Bases: Dataset
This dataset generates a set of random images of size N x N, each containing a Poisson-distributed number of cells with mean M.
Parameters for Initialization:¶
- N : int
The number of images to generate for this simulated dataset; Default - 256
- M : int
The mean number of cells appended to each image, sampled from a Poisson distribution; Default - 64
- nclasses : int
The number of classes to categorize this collection of simulated cells; Default - 3
- reproducible : bool
If True, set np.random.seed() to the integer index supplied
Returns:¶
- out_arr : length-3 array
out_arr[0] is a 1 x N x N array defining the base image
out_arr[1] is an X x 4 array containing the bbox info for X cells
out_arr[2] is an X x 1 array containing the categorical label for the cell within the corresponding bbox; 0 - smooth boundary, 1 - sharp boundary, 2 - bumpy boundary
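For reference, a minimal numpy sketch of what one simulated item could look like. The helper `make_sample` and its box-size ranges are hypothetical illustrations of the structure described above, not the actual GroundTruthDataset implementation:

```python
import numpy as np

def make_sample(N=256, M=64, nclasses=3, rng=None):
    """Sketch of one dataset item: a base image, Poisson-many cell
    bounding boxes, and one class label per box (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    X = rng.poisson(M)                       # number of cells in this image
    image = rng.normal(size=(1, N, N))       # 1 x N x N base image
    # bbox rows are [cx, cy, w, h] in pixel coordinates
    bboxes = np.column_stack([
        rng.uniform(0, N, size=X),           # cx
        rng.uniform(0, N, size=X),           # cy
        rng.uniform(4, 16, size=X),          # w
        rng.uniform(4, 16, size=X),          # h
    ])
    labels = rng.integers(0, nclasses, size=(X, 1))  # boundary-type label
    return image, bboxes, labels

image, bboxes, labels = make_sample(rng=np.random.default_rng(0))
```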
- class yolo_help.Net(nclasses=3)¶
Bases: Module
A neural network using the YOLO framework, with a batch size of 1 and an input layer capable of accepting inputs of variable shape.
Parameters:¶
- nclasses : int
Default - 3; The number of distinct classes in the dataset
Returns:¶
- Net : torch.nn.Module
A neural network which can take input images with any number of channels
- forward(x)¶
Forward method for the yolo network.
Inputs¶
- x : torch.Tensor
Object is of size 1 x CH x ROW x COL. The number of channels, rows, and columns is arbitrary. This differs from the original YOLO paper, which requires a fixed number of channels, rows, and columns.
Outputs¶
- x : torch.Tensor
Object is of size (5 * bbox_per_cell (2) + num_classes (3)) x ROW x COL, where ROW and COL are equal to the input image size divided by the network's stride (8). The five numbers per bounding box are [cx, cy, scalex, scaley, confidence].
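The output-shape arithmetic above can be sketched directly; `yolo_output_shape` is a hypothetical helper for illustration, with defaults taken from the values stated above (stride 8, 2 boxes per cell, 3 classes):

```python
def yolo_output_shape(rows, cols, stride=8, bbox_per_cell=2, num_classes=3):
    """Shape of the Net.forward output for a given input image size,
    following the channel layout described above (sketch)."""
    channels = 5 * bbox_per_cell + num_classes  # 5 numbers per box + class scores
    return (channels, rows // stride, cols // stride)

print(yolo_output_shape(256, 256))  # (13, 32, 32)
```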
- class yolo_help.VariableInputConv2d(M)¶
Bases: Module
Note the assumption here is that we have batch size one, so we can work with the batch dimension. This version adds a few extra convolutions.
We only process 1 image at a time, so we can use the batch dimension.
- Step 1: Move the channel dimension to the batch dimension. Now we have N samples, with one channel each.
- Step 2: Apply some convolutions and ReLUs, to end up with an N x M array of images, where M is fixed and N is variable.
- Step 3: Take a softmax of the result over the N channels, then matrix multiply the N x M array with an N x 1 array to get an M x 1 array (where M is fixed).
- Step 4: Return the result, which has a fixed number of channels.
What’s nice is it is permutation invariant. So if we input an RGB image, or a BGR image, the result would be exactly the same. We don’t need to have the channels in any specific order.
WORKING TODO: Change so it supports a batch dimension. This requires every image in a batch to have the same number of channels, but I think it's okay.
- forward(x)¶
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
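The steps and the permutation-invariance claim above can be sketched in numpy. Here `channel_pool` is a hypothetical stand-in: a fixed per-channel projection replaces the convolutions, but the softmax-over-channels pooling is the same idea, so permuting the input channels (e.g. RGB vs. BGR) leaves the output unchanged:

```python
import numpy as np

def channel_pool(x, M=4, rng_seed=0):
    """Sketch of the VariableInputConv2d idea: map each of N input
    channels to M feature channels with shared weights, then combine the
    N results with a channel softmax so the output has a fixed M channels
    regardless of N. (Hypothetical stand-in for the conv layers.)"""
    rng = np.random.default_rng(rng_seed)
    N, H, W = x.shape
    w = rng.normal(size=(M,))                      # shared projection, fixed
    feats = x[:, None] * w[None, :, None, None]    # N x M x H x W, same op per channel
    scores = feats.mean(axis=(1, 2, 3))            # one score per input channel
    alpha = np.exp(scores) / np.exp(scores).sum()  # softmax over the N channels
    return (alpha[:, None, None, None] * feats).sum(axis=0)  # M x H x W

x = np.random.default_rng(1).normal(size=(3, 8, 8))
perm = x[[2, 0, 1]]                                # e.g. RGB -> BRG
assert np.allclose(channel_pool(x), channel_pool(perm))  # permutation invariant
```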
- yolo_help.bbox_to_rectangles(bbox, **kwargs)¶
This function converts a set of bounding boxes of the form [cx,cy,w,h] to a set of rectangular objects for visualization purposes
Parameters:¶
- bbox : N x 4 array[numpy.float64]
Contains the bbox info for N bboxes of the form [cx,cy,w,h]
Returns:¶
- outmatplotlib.collections.PolyCollection
Contains N rectangles to be plotted later
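The coordinate math behind this conversion is simple; a numpy sketch (the helper `bbox_to_corners` is hypothetical, producing the vertex arrays a matplotlib PolyCollection would consume):

```python
import numpy as np

def bbox_to_corners(bbox):
    """Turn [cx, cy, w, h] rows into the four corner vertices of each
    rectangle, in drawing order (sketch of the math in bbox_to_rectangles)."""
    cx, cy, w, h = bbox[:, 0], bbox[:, 1], bbox[:, 2], bbox[:, 3]
    x0, x1 = cx - w / 2, cx + w / 2
    y0, y1 = cy - h / 2, cy + h / 2
    # N x 4 x 2 array of vertices
    return np.stack([np.stack([x0, y0], -1), np.stack([x1, y0], -1),
                     np.stack([x1, y1], -1), np.stack([x0, y1], -1)], axis=1)

corners = bbox_to_corners(np.array([[10.0, 20.0, 4.0, 6.0]]))
```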
- yolo_help.convert_data(out, B, stride)¶
Convert the outputs from the YOLO neural network into bounding boxes for performance quantification. Note that this operation is not differentiable. It also outputs the raw data, reformatted into a list instead of an image of grid cells. These outputs are differentiable. The last of these outputs is the score (data[…,-1]). Note that class probabilities are NOT output by this function.
Parameters:¶
- out : torch.Tensor
The output object from the primary YOLO neural network
- B : int
The number of bounding boxes per block that the model will output (automatically defined when initializing the network)
- stride : int
The number of pixels to pass over when convolving the input image in each layer (automatically defined when initializing the network)
Returns:¶
- bboxes : torch.Tensor of size N x 4
N is the number of bounding boxes output from the network; Bounding boxes are of the form [cx,cy,scalex,scaley]. There are no gradient computations done here. These bounding boxes are for visualization or downstream analysis.
- data : torch.Tensor of size N x 5
N is the number of bounding boxes output from the network; Bounding boxes are of the form [cx,cy,w,h,conf]. There are gradient computations done here. These are for our loss function and, potentially, other downstream analysis.
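A numpy sketch of the grid-to-list bookkeeping this function performs. The exact way yolo_help maps cell offsets and scales to pixel coordinates is an assumption here (offsets relative to the cell corner, times the stride); only the reshaping from a channel grid to N = ROW * COL * B rows is the point being illustrated:

```python
import numpy as np

def decode_grid(out, B=2, stride=8):
    """Flatten a (5*B + nclasses) x ROW x COL network output into one row
    per predicted box: [cx, cy, scalex, scaley, conf]. (Sketch; the
    coordinate convention is assumed, not taken from yolo_help.)"""
    C, R, Ccol = out.shape
    boxes = []
    for r in range(R):
        for c in range(Ccol):
            for b in range(B):
                cx_off, cy_off, sx, sy, conf = out[5 * b:5 * b + 5, r, c]
                # assumed convention: offsets relative to the cell corner
                boxes.append([(c + cx_off) * stride, (r + cy_off) * stride,
                              sx, sy, conf])
    return np.array(boxes)  # (R * Ccol * B) x 5

data = decode_grid(np.zeros((13, 4, 4)))
```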
- yolo_help.get_assignment_inds(bboxes, bbox, shape, stride, B)¶
Assigns each training bounding box to a specific cell, and picks the bounding box from that cell with the best iou.
Parameters:¶
- bboxes : torch.Tensor of size [N, 4]
N is the number of bounding boxes computed for a single image and is equal to (np.shape(I)[0] / stride) * (np.shape(I)[0] / stride) * net.B
- bbox : torch.Tensor of size [X, 4]
X is the true number of bboxes associated with image I
- shape : 4-dimensional torch.Tensor
The shape of the data directly output from the YOLO neural network
- stride : int
The number of pixels to pass over when convolving the input image in each layer (automatically defined when initializing the network)
- B : int
The number of bboxes generated by the YOLO neural network at each block
Returns:¶
- assignment_inds : numpy array of size X
The index of the bounding box from the ‘bboxes’ parameter corresponding to the ith bounding box from the ‘bbox’ parameter
- ious : numpy array of size X
The iou between the ith bounding box from the ‘bbox’ parameter and the corresponding bbox from the ‘bboxes’ parameter
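The cell-assignment half of this step can be sketched as follows. The index layout (row-major cells, B consecutive boxes per cell) is an assumption, and the IOU-based choice among the cell's B candidates is omitted; `assign_cells` is a hypothetical helper:

```python
import numpy as np

def assign_cells(gt_bbox, grid_cols, stride=8, B=2):
    """Each ground-truth [cx, cy, w, h] box belongs to the grid cell
    containing its centre; return the flat index of that cell's first
    candidate box. (Sketch of the assignment in get_assignment_inds.)"""
    col = (gt_bbox[:, 0] // stride).astype(int)
    row = (gt_bbox[:, 1] // stride).astype(int)
    # assumed flattening: row-major cells, B boxes per cell
    return (row * grid_cols + col) * B

inds = assign_cells(np.array([[12.0, 20.0, 5.0, 5.0]]), grid_cols=32)
```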
- yolo_help.get_best_bounding_box_per_cell(bboxes, scores, B)¶
Get the best bounding box for each cell output from the YOLO neural network.
Parameters:¶
- bboxes : torch.Tensor of size [N, 4]
N is the number of bounding boxes computed for a single image and is equal to (np.shape(I)[0] / stride) * (np.shape(I)[0] / stride) * net.B
- scores : torch.Tensor of size [N, 1]
The confidence for each corresponding box in the ‘bboxes’ parameter
- B : int
Default - 2; The number of bboxes generated by the YOLO neural network at each cell
Returns:¶
- bboxes_out : torch.Tensor of size [N/B, 4]
A list of the best bounding boxes for each cell
- scores_out : torch.Tensor of size [N/B, 1]
A list of the confidence for the corresponding bounding boxes in ‘bboxes_out’
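The selection logic reduces to an argmax over each consecutive group of B boxes; a numpy sketch with a hypothetical helper `best_box_per_cell` (flat score arrays here, versus the [N, 1] tensors above):

```python
import numpy as np

def best_box_per_cell(bboxes, scores, B=2):
    """Within each consecutive group of B boxes (one group per grid cell),
    keep the box with the highest confidence (sketch)."""
    N = bboxes.shape[0]
    groups = scores.reshape(N // B, B)
    best = groups.argmax(axis=1) + np.arange(N // B) * B  # flat indices
    return bboxes[best], scores[best]

bb = np.arange(16, dtype=float).reshape(4, 4)  # 4 boxes = 2 cells x B=2
sc = np.array([0.1, 0.9, 0.7, 0.2])
best_bb, best_sc = best_box_per_cell(bb, sc)
```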
- yolo_help.get_reg_targets(assignment_inds, bbox, B, shape, stride)¶
Compute the true bounding box parameters we want the network to predict.
Parameters:¶
- assignment_inds : numpy array of size X
The idx at position i corresponds to the bounding box output from the YOLO framework which most accurately encompasses bounding box ‘i’ in the ‘bbox’ (ground truth labels) parameter
- bbox : torch.Tensor of size [X, 4]
X is the true number of bboxes associated with image I
- B : int
The number of bboxes generated by the YOLO neural network at each block
- shape : 4-dimensional torch.Tensor
The shape of the data directly output from the YOLO neural network
- stride : int
The number of pixels to pass over when convolving the input image in each layer (automatically defined when initializing the network)
Returns:¶
- shiftx : numpy array of shape [X,]
Defines the shift in x needed to align bbox[i] with the corresponding best estimated bounding box
- shifty : numpy array of shape [X,]
Defines the shift in y needed to align bbox[i] with the corresponding best estimated bounding box
- scalex : numpy array of shape [X,]
Defines the scale in x needed to align bbox[i] with the corresponding best estimated bounding box
- scaley : numpy array of shape [X,]
Defines the scale in y needed to align bbox[i] with the corresponding best estimated bounding box
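The quantities themselves are straightforward; a numpy sketch with a hypothetical helper `reg_targets`. The normalisation used here (centre shifts in units of the stride, width/height as ratios) is an assumption, not necessarily what yolo_help computes:

```python
import numpy as np

def reg_targets(gt, pred, stride=8):
    """Per matched pair of [cx, cy, w, h] boxes, the centre shift and
    width/height scale that map the predicted box onto the ground-truth
    box (sketch; normalisation assumed)."""
    shiftx = (gt[:, 0] - pred[:, 0]) / stride
    shifty = (gt[:, 1] - pred[:, 1]) / stride
    scalex = gt[:, 2] / pred[:, 2]
    scaley = gt[:, 3] / pred[:, 3]
    return shiftx, shifty, scalex, scaley

sx, sy, kx, ky = reg_targets(np.array([[20., 20., 8., 8.]]),
                             np.array([[16., 24., 4., 16.]]))
```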
- yolo_help.imshow(I, ax, **kwargs)¶
This function will normalize the image I and plot it on the axes object ax
Parameters:¶
- I : 1 x N x N array[numpy.float64]
An image
- ax : matplotlib.axes._axes.Axes object
An empty axis onto which I will be plotted
Returns:¶
None
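The normalisation step can be sketched in numpy; min-max rescaling to [0, 1] is an assumption about what imshow does before plotting, and `normalize_image` is a hypothetical helper:

```python
import numpy as np

def normalize_image(I):
    """Rescale a 1 x N x N image to [0, 1] (assumed min-max scheme;
    constant images map to zeros to avoid division by zero)."""
    I = np.asarray(I, dtype=float)
    lo, hi = I.min(), I.max()
    return (I - lo) / (hi - lo) if hi > lo else np.zeros_like(I)

J = normalize_image(np.array([[[0.0, 5.0], [10.0, 2.5]]]))
```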
- yolo_help.iou(bbox0, bbox1, nopairwise=False)¶
Calculate pairwise iou between a set of estimated bounding boxes (bbox0) and the set of ground truth bounding boxes (bbox1).
Parameters:¶
- bbox0 : torch.Tensor of size [N, 4]
N is the number of bounding boxes
- bbox1 : torch.Tensor of size [N, 4]
N is the number of bounding boxes
- nopairwise : bool
If True, compute pointwise, not pairwise, IOU
Returns:¶
- out : Array of length N
An array containing the pairwise (or pointwise) IOU values between the elements of bbox0 and bbox1
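The pointwise case (nopairwise=True, matching box i to box i) can be sketched in numpy for centre-format boxes; `iou_pointwise` is a hypothetical helper:

```python
import numpy as np

def iou_pointwise(bbox0, bbox1):
    """IOU between matched [cx, cy, w, h] boxes: intersection area over
    union area for each pair i (sketch of the nopairwise=True case)."""
    def corners(b):
        return (b[:, 0] - b[:, 2] / 2, b[:, 1] - b[:, 3] / 2,
                b[:, 0] + b[:, 2] / 2, b[:, 1] + b[:, 3] / 2)
    ax0, ay0, ax1, ay1 = corners(bbox0)
    bx0, by0, bx1, by1 = corners(bbox1)
    iw = np.clip(np.minimum(ax1, bx1) - np.maximum(ax0, bx0), 0, None)
    ih = np.clip(np.minimum(ay1, by1) - np.maximum(ay0, by0), 0, None)
    inter = iw * ih
    union = bbox0[:, 2] * bbox0[:, 3] + bbox1[:, 2] * bbox1[:, 3] - inter
    return inter / union

# identical boxes -> IOU 1; half-overlapping unit-ish boxes -> 1/3
a = np.array([[5.0, 5.0, 4.0, 4.0], [0.0, 0.0, 2.0, 2.0]])
b = np.array([[5.0, 5.0, 4.0, 4.0], [1.0, 0.0, 2.0, 2.0]])
vals = iou_pointwise(a, b)
```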
- yolo_help.train_yolo_model(nepochs, lr, cls_loss, outdir, modelname, optimizername, lossname, verbose=False, resume=False, J_path=None)¶
Train a neural network defined by the YOLO framework and other provided hyperparameters using the simulated dataset ‘groundtruth’.
Parameters:¶
- nepochs : int
The number of epochs used to train the model
- lr : float
The learning rate used to train the model
- outdir : str
The output directory where all files will be saved during training, or where all files were saved if the model has already been trained.
- modelname : str
The file name of the model used during training
- optimizername : str
The file name of the optimizer used during training
- lossname : str
The file name of the losses computed during training
- resume : bool
Default - False; If True, resume the training of model ‘outdir/modelname’ or load the pretrained model saved at ‘outdir/modelname’
- J_path : str
Default - None; The file path to the image to be used during the validation portion of training
Returns:¶
- net : torch.nn.Module
A neural network which has been trained on a simulated dataset
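The checkpoint/resume structure described above can be sketched without torch. Here a toy quadratic loss stands in for the YOLO network and simulated dataset (both hypothetical in this sketch); only the save-to-outdir and resume-from-outdir flow mirrors train_yolo_model:

```python
import json, os, pickle, tempfile

def train(nepochs, lr, outdir, modelname, lossname, resume=False):
    """Stdlib-only sketch of the train/checkpoint/resume loop: a scalar
    'model' w minimising the toy loss w**2 by gradient descent."""
    model_path = os.path.join(outdir, modelname)
    loss_path = os.path.join(outdir, lossname)
    if resume and os.path.exists(model_path):
        with open(model_path, "rb") as f:
            w = pickle.load(f)               # pick up where we left off
        with open(loss_path) as f:
            losses = json.load(f)
    else:
        w, losses = 5.0, []
    for _ in range(nepochs):
        grad = 2 * w                         # d/dw of the toy loss w**2
        w -= lr * grad
        losses.append(w ** 2)
    with open(model_path, "wb") as f:        # checkpoint for later resumes
        pickle.dump(w, f)
    with open(loss_path, "w") as f:
        json.dump(losses, f)
    return w, losses

outdir = tempfile.mkdtemp()
w1, l1 = train(10, 0.1, outdir, "model.pkl", "loss.json")
w2, l2 = train(10, 0.1, outdir, "model.pkl", "loss.json", resume=True)
```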