How to Process my Image Manual Dataset

Hi everyone! I have a final task to implement a CNN for plastic trash classification. I'm using Google Colab for my project. I have already done the annotation with Image Manual in Prodigy. After importing my dataset, I converted it into a list where each entry has the keys: image, text, meta, _input_hash, _task_hash, _session_id, _view_id, width, height, spans, answer. I took 3 values (image, ['spans'][0]['label'], and ['spans'][0]['points']) as input for my model. My questions are:

  1. I don't know how to use the label and points as input to my model.
  2. Sometimes one picture has more than one label and set of points. Can I save all of those labels and points and indicate that they belong to the same image? Can I save multiple entries in my lists of labels and points?

Please help me out with how to do this. Thank you for your help.

Hi! You can see an example of the data produced by the image_manual interface here: https://prodi.gy/docs/api-interfaces#image_manual

The "spans" are a list of bounding boxes or shapes that you've annotated. An image can have multiple spans and each span describes the label ("label") and pixel coordinates ("points") of the annotated shapes. So instead of just taking spans[0], you can use all spans if you've annotated multiple.
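The idea above can be sketched like this (a minimal example, assuming the exported dataset is a plain list of task dicts as described; `spans_to_examples` and the variable names are hypothetical, not part of the Prodigy API):

```python
def spans_to_examples(dataset):
    """Collect one (image, label, points) training example per
    annotated span, instead of only taking spans[0]."""
    examples = []
    for task in dataset:
        # A task may have zero, one, or many spans on the same image.
        for span in task.get("spans", []):
            examples.append({
                "image": task["image"],
                "label": span["label"],
                "points": span["points"],
            })
    return examples
```

This way an image with three annotated objects simply contributes three examples that all share the same image.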

How you use the data depends on the specifics of your model and what you want to update it with. If you've annotated rectangles and your model needs width/height instead of the pixel coordinates, you can calculate that from the coordinates of the corners – same if you need the center of the box (that's just the x/y coordinates plus half the width/height).
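For instance, assuming a rectangular span whose "points" are the four corners as [x, y] pairs (a sketch; `box_geometry` is a hypothetical helper, not a Prodigy function):

```python
def box_geometry(points):
    """Derive width, height and center from the corner coordinates
    of a rectangular annotation."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    # Center = top-left corner plus half the width/height.
    center = (min(xs) + width / 2, min(ys) + height / 2)
    return width, height, center
```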

Hi Inez! I'm sorry for the late reply to your message.
Thank you for the reference; I've read the link you gave me.
Okay Inez, I'll use all spans as my input (to get the categories), because I've annotated multiple labels in one image.
But each entry in my list of "spans" also contains an "id" alongside the label and points. What does that mean?
I also have another question. I annotated the images by drawing polygons, and when I resize my images to 256x256, what should I do with the points? Do they need to change too?

Thank you for your reply. Have a great day! :smiley:

These are the corners of the shape. Each point is the [x, y] coordinate of the corner, in the order that you added them.

Yes, the points are pixel coordinates relative to the original image. If you draw a point at [50, 50] and then you resize the image by 50% (make it half as wide/high), that point needs to be scaled as well. So that'd now be at [25, 25]. Ideally, you want to resize the images before you annotate them, to avoid conflicts and inconsistent data.
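If you do resize after annotating, the scaling can be done like this (a sketch, assuming you know the original image dimensions; `scale_points` is a hypothetical helper):

```python
def scale_points(points, orig_w, orig_h, new_w=256, new_h=256):
    """Rescale polygon points by the same factors used to resize
    the image, e.g. from the original size down to 256x256."""
    sx = new_w / orig_w
    sy = new_h / orig_h
    return [[x * sx, y * sy] for x, y in points]
```

So a point at [50, 50] on a 100x100 image resized to 50x50 ends up at [25, 25], matching the example above.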

Hi Inez, thank you for your reply.

Here are three example "spans" entries from my dataset. Each span contains the keys id, points, and label. What do they mean? Usually when training a model we need the image and the label of the image. If I use a dataset from Prodigy, can you guide me on how to feed in the spans, since they also contain the points and labels?

[{'id': '51e8c0ce-a8eb-40c6-87a9-6a1dfedd4cf0', 'label': '4-BM', 'points': [[195.9, 428], [192.4, 424.5], [87.1, 729.7], [87.1, 729.7], [102.9, 778.8], [102.9, 778.8], [78.3, 805.1], [78.3, 810.4], [43.3, 1187.5], [43.3, 1187.5], [76.6, 1266.5], [76.6, 1266.5], [76.6, 1292.8], [76.6, 1292.8], [543.2, 1336.6], [543.2, 1336.6], [827.3, 1366.4], [827.3, 1366.4], [841.4, 1334.9], [841.4, 1334.9], [848.4, 1191], [848.4, 1185.8], [853.7, 906.9], [853.7, 905.1], [853.7, 843.7], [853.7, 843.7], [839.6, 817.4], [839.6, 817.4], [850.1, 770], [850.1, 770], [865.9, 682.3], [865.9, 682.3], [850.1, 510.4], [850.1, 510.4], [850.1, 449], [850.1, 449], [830.9, 431.5], [830.9, 431.5], [625.6, 419.2], [625.6, 419.2], [292.3, 422.7], [292.3, 422.7], [195.9, 414], [192.4, 419.2], [192.4, 422.7], [192.4, 422.7], [192.4, 422.7], [192.4, 422.7], [187.1, 422.7], [187.1, 422.7], [187.1, 422.7], [187.1, 422.7]], 'color': 'yellow'}]
[{'id': '41e0a7c3-6b96-4abd-9d0e-97dbe9db0401', 'label': '1-BAM', 'points': [[1192.4, 642.8], [1192.4, 632.9], [1128.2, 1067.3], [1133.1, 1067.3], [1182.5, 1170.9], [1182.5, 1156.1], [1029.5, 1674.3], [1029.5, 1674.3], [999.9, 2251.8], [999.9, 2251.8], [955.5, 3436.9], [955.5, 3446.7], [999.9, 3520.8], [999.9, 3520.8], [1217.1, 3560.3], [1217.1, 3560.3], [1730.4, 3555.3], [1735.3, 3555.3], [2075.9, 3506], [2075.9, 3506], [2090.7, 3422.1], [2090.7, 3422.1], [2011.7, 1709.4], [2011.7, 1709.4], [1868.6, 1073.5], [1868.6, 1073.5], [1903.1, 895.8], [1903.1, 895.8], [1868.6, 639.2], [1868.6, 634.2], [1715.6, 501], [1705.7, 501], [1552.7, 471.4], [1552.7, 471.4], [1379.9, 476.3], [1379.9, 476.3], [1226.9, 555.3], [1226.9, 555.3], [1192.4, 604.6], [1192.4, 604.6], [1192.4, 604.6], [1192.4, 604.6], [1192.4, 634.2], [1192.4, 634.2], [1192.4, 634.2], [1192.4, 634.2]], 'color': 'yellow'}]
[{'id': '3964ebc4-48cf-4b17-82e2-a8f1b65497db', 'label': '0-NP', 'points': [[32.7, 433.5], [39.1, 434.8], [246.5, 446.2], [246.5, 446.2], [255.3, 367.7], [255.3, 367.7], [41.6, 356.4], [41.6, 356.4], [28.9, 433.5], [28.9, 433.5], [28.9, 433.5], [28.9, 433.5]], 'color': 'yellow'}, {'id': '94ba319d-95a2-4efb-8ab2-992918a37d75', 'label': '0-NP', 'points': [[451.4, 433.5], [451.4, 433.5], [427.4, 456.3], [427.4, 456.3], [412.2, 591.6], [412.2, 590.4], [495.7, 586.6], [495.7, 586.6], [509.6, 470.2], [509.6, 470.2], [490.6, 438.6], [490.6, 438.6], [470.4, 437.3], [470.4, 437.3], [470.4, 437.3], [470.4, 437.3]], 'color': 'cyan'}, {'id': '1111ecd5-e2d2-4ac1-be10-dae8b0b97098', 'label': '0-NP', 'points': [[648.7, 384.2], [648.7, 384.2], [628.5, 414.5], [628.5, 415.8], [582.9, 577.7], [582.9, 577.7], [627.2, 590.4], [627.2, 590.4], [622.1, 503.1], [622.1, 503.1], [666.4, 495.5], [666.4, 495.5], [685.4, 431], [685.4, 431], [677.8, 400.6], [677.8, 400.6], [666.4, 393], [666.4, 393], [665.1, 390.5], [665.1, 390.5]], 'color': 'magenta'}, {'id': 'ffc6154e-3b2c-4f5a-8ca3-bff6e58a60ca', 'label': '0-NP', 'points': [[442.5, 358.9], [442.5, 358.9], [412.2, 413.3], [412.2, 413.3], [584.2, 481.6], [584.2, 481.6], [604.4, 471.5], [604.4, 471.5], [608.2, 427.2], [608.2, 427.2], [598.1, 413.3], [598.1, 413.3], [448.9, 355.1], [448.9, 355.1], [448.9, 355.1], [448.9, 355.1]], 'color': 'springgreen'}]

Okay Inez, thank you for the information. That means after I resize the image, I also resize the points by the same scale, right?

Thank you for your help.

If you only need an image and the label that applies to the whole image, are you sure you want to annotate the data by drawing boxes? It sounds like you're creating a lot of data you don't even need. Maybe a simpler image classification workflow would be a better fit?

In any case, if you don't need the bounding boxes and only the labels, you can discard the points and use just the "label" value of each span. If you've annotated boxes with multiple labels for one image, that image either has multiple labels, or you have to decide which one to pick if the labels are mutually exclusive.
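That could look something like this (a sketch, assuming the same list-of-task-dicts dataset as before; `image_labels` is a hypothetical helper):

```python
def image_labels(dataset):
    """Keep only the labels per image, discarding the points.
    Deduplicates labels so repeated annotations of the same
    class count once per image."""
    return [
        (task["image"], sorted({span["label"] for span in task.get("spans", [])}))
        for task in dataset
    ]
```

An image with several spans of the same class then yields that class once, while an image with spans of different classes yields a multi-label list you'd have to resolve if your classes are mutually exclusive.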

Yes, exactly.

I've already annotated the images in Prodigy. Each image contains multiple objects, so I labeled them one by one. For example, one image has a plastic bottle, a cup, and a spoon, so it contains 3 labels and 3 sets of points, one for each object, right? Can you tell me what I should take from the dataset, and what to do next?

Thank you for your reply. :smiley:

Yes, exactly. I think it just comes down to how you want to train your model, and what model you train (e.g. object detection model). That's something you have to decide.

Okay Inez, I will explore more.
Thank you for your help. Have a great day :smiley: