Hi everyone! For my final project I have to implement a CNN for plastic trash classification. I'm doing the project in Google Colab. I've already annotated my images using Image Manual in Prodigy. After importing my dataset, I converted it into a list in which each item has the keys: image, text, meta, _input_hash, _task_hash, _session_id, _view_id, width, height, spans, answer. I took 3 values (image, ['spans'][0]['label'], and ['spans'][0]['points']) as input for my model. My questions are:
I don't know how to use the label and points as the input to my model.
Also, sometimes one picture has more than one label and set of points. Can I save all of those labels and points and indicate that they belong to the same image? Can I store multiple entries per image in my lists of labels and points?
Please help me out with how to do this. Thank you for your help.
The "spans" are a list of bounding boxes or shapes that you've annotated. An image can have multiple spans and each span describes the label ("label") and pixel coordinates ("points") of the annotated shapes. So instead of just taking spans[0], you can use all spans if you've annotated multiple.
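As a rough illustration of using all spans rather than only spans[0], here is a minimal sketch. It assumes the exported dataset is a list of dicts with the keys mentioned in the question ("image", "spans", "answer"); the function name collect_annotations and the choice to keep only accepted examples are assumptions made for this example, not something Prodigy requires.

```python
def collect_annotations(examples):
    """Group every span's label and points under the image it belongs to."""
    records = []
    for eg in examples:
        # Assumption: only keep examples that were accepted during annotation.
        if eg.get("answer") != "accept":
            continue
        spans = eg.get("spans", [])
        records.append({
            "image": eg["image"],
            # One entry per annotated shape, kept in the same order,
            # so labels[i] always belongs to points[i].
            "labels": [span["label"] for span in spans],
            "points": [span["points"] for span in spans],
        })
    return records
```

Each record then keeps one image together with parallel lists of labels and points, so an image with three annotated objects yields three labels and three lists of points.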
How you use the data depends on the specifics of your model and what you want to update it with. If you've annotated rectangles and your model needs width/height instead of the pixel coordinates, you can calculate that from the coordinates of the corners – same if you need the center of the box (that's just the x/y coordinates plus half the width/height).
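The calculation described above could be sketched like this, assuming "points" is a list of [x, y] pixel coordinates. The helper name box_from_points and the returned dictionary layout are hypothetical; for polygons, the same code gives the enclosing bounding box rather than the exact shape.

```python
def box_from_points(points):
    """Derive top-left corner, width, height and center from corner points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, y_min = min(xs), min(ys)
    width = max(xs) - x_min
    height = max(ys) - y_min
    # Center = top-left x/y plus half the width/height.
    center = (x_min + width / 2, y_min + height / 2)
    return {"x": x_min, "y": y_min, "width": width, "height": height, "center": center}
```

Whether a model expects corner coordinates, x/y/width/height, or center/width/height depends on the framework, so the output format here is only one possible choice.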
Hi Inez! I'm sorry for the late reply.
Thank you for the references, I've read the link you gave me.
Okay Inez, I'll use all spans as my input (to get the categories) because I've already annotated multiple labels in one image.
But each item in my list of "spans" also contains an "id" along with the label and points. What does that mean?
I also have another question. I annotated the images by drawing polygons, and then I resized my images to 256x256. What should I do with the points? Do they need to change too?
These are the corners of the shape. Each point is the [x, y] coordinate of the corner, in the order that you added them.
Yes, the points are pixel coordinates relative to the original image. If you draw a point at [50, 50] and then you resize the image by 50% (make it half as wide/high), that point needs to be scaled as well. So that'd now be at [25, 25]. Ideally, you want to resize the images before you annotate them, to avoid conflicts and inconsistent data.
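For images that were already annotated at their original size, the scaling could look roughly like the sketch below. It assumes the original size is taken from the "width" and "height" values stored with each example; the function name scale_points is made up for this illustration.

```python
def scale_points(points, orig_size, new_size):
    """Rescale [x, y] pixel coordinates from the original image size to a new size."""
    orig_w, orig_h = orig_size
    new_w, new_h = new_size
    # x and y get separate factors, since resizing to a square
    # (e.g. 256x256) may change the aspect ratio.
    scale_x = new_w / orig_w
    scale_y = new_h / orig_h
    return [[x * scale_x, y * scale_y] for x, y in points]
```

For example, scale_points([[50, 50]], (100, 100), (50, 50)) moves the point to [25.0, 25.0], matching the 50% case described above.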
Here is an example of 3 spans from my dataset. Each span contains the keys id, points, and label. What do they mean? Usually when training a model we need the image and the label of the image. If I use a dataset from Prodigy, can you guide me on how to feed in the spans, since they also include the points and labels?
If you only need an image and the label that applies to the whole image, are you sure you want to annotate data by drawing boxes? It sounds like you're creating a lot of data that you don't even need? Maybe a simpler image classification workflow would be a better fit?
In any case, if you don't need the bounding box and just the label, you can discard the points and just use the "label" value of the spans. If you have annotated boxes for multiple labels for an image, that's either multiple labels, or you have to decide which one to pick if it's mutually exclusive.
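If the image-level labels are treated as multiple labels per image, one possible way to turn the spans into a training target is a multi-hot vector. This is only a sketch: the class names in the usage note and the helper name multi_hot are hypothetical, and a mutually exclusive setup would need a rule for picking a single label instead.

```python
def multi_hot(spans, classes):
    """Discard the points and encode which labels occur in an image as 0/1 flags."""
    present = {span["label"] for span in spans}
    # One position per known class, in a fixed order.
    return [1 if name in present else 0 for name in classes]
```

With classes = ["BOTTLE", "CUP", "SPOON"], an image whose spans are labelled "BOTTLE" and "SPOON" would be encoded as [1, 0, 1].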
I've already annotated the images in Prodigy. Each image contains multiple objects, so I labelled them one by one. For example, I had a plastic bottle, a cup, and a spoon in one image. So the image contains 3 labels and 3 sets of points, one for each object, right? Can you tell me what I should take from the dataset, and what to do next?
Yes, exactly. I think it just comes down to how you want to train your model, and what model you train (e.g. object detection model). That's something you have to decide.