================================ ✨ Datasets ================================
iso_eval, iso_train
================================ ✨ Sessions ================================
2022-01-03_15-55-00, 2022-01-03_15-57-01, 2022-01-03_15-57-16,
2022-01-05_09-18-46, 2022-01-05_09-35-16, 2022-01-05_09-46-06,
2022-01-05_13-07-38, 2022-01-06_17-17-19, 2022-01-06_17-18-04,
2022-01-06_18-59-43, 2022-01-06_19-44-48, 2022-01-06_20-24-34
============================== ✨ Dataset Stats ==============================
Dataset iso_train
Created 2022-01-06 20:24:34
Description None
Author None
Annotations 8334
Accept 121
Reject 8213
Ignore 0
Hello Everyone,
I recently started working with Prodigy and could not figure out three concepts. I would appreciate some help or pointers (in case I missed some core concepts).
- As the snippet above shows, I created 12 sessions while learning. Are they kept in a cache, or should I close a session after use? I am not sure whether they consume any hardware resources, and I never intend to reuse them.
- I intend to perform text classification with an annotated dataset, 'iso_train', which has 8213 'reject' answers. How should I interpret the stats shown in the snippet above? I find them confusing given the following explanation on the official documentation page:
The REJECT button is less relevant here, because there’s nothing to say no to – however, you can use it to reject examples that have actual problems that need fixing, like noisy preprocessing artifacts, HTML markup that wasn’t cleaned properly, texts in other languages, and so on.
- Training uses all available memory, and I do not have a GPU. Is it possible to perform text classification with 8K annotations on 16GB of memory on a CPU? If yes, how should I go about this task?
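To make the second question more concrete, here is a minimal sketch of how the accept/reject/ignore counts could be tallied from an exported copy of the dataset. It assumes the export is JSONL (one annotation per line, as written by `prodigy db-out`) and that each record carries an `answer` field. The function name `count_answers` and the sample records are hypothetical and only there for illustration.

```python
import json
from collections import Counter


def count_answers(jsonl_lines):
    """Tally the 'answer' field over an iterable of JSONL strings."""
    counts = Counter()
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the export
        record = json.loads(line)
        # Records without an 'answer' key are counted under 'missing'.
        counts[record.get("answer", "missing")] += 1
    return counts


# Hypothetical sample records, not taken from the real dataset.
sample = [
    '{"text": "first example", "label": "ISO", "answer": "accept"}',
    '{"text": "second example", "label": "ISO", "answer": "reject"}',
    '{"text": "third example", "label": "ISO", "answer": "reject"}',
]

counts = count_answers(sample)
for answer, n in sorted(counts.items()):
    print(answer, n)
```

With a real export, the same function could be fed an open file handle instead of the sample list, since iterating over a file yields one line at a time.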
Thanks a lot for your effort and time!
Best,
Lokesh