Did you have anything installed in your environment previously? I just updated the pydantic pin of Prodigy to match spaCy's, but aside from that, all dependencies install and resolve fine in isolation in our CI builds. (But with the new pip resolver, it's definitely possible to end up with conflicts if there's something else installed in the environment that depends on other versions of those packages.)
I'm using a clean environment with Python 3.8.6, but that comes with pip 20.2.1, and the new resolver didn't arrive until pip 20.3 as far as I'm aware. So I guess it works: there is a set of dependencies that satisfies the constraints, and my understanding is that the old pip isn't overly smart about how it determines what to install.
I'll update pip before installing the latest nightly and see if that does a clean install.
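For reference, a minimal sketch of that order of operations (the wheel filename below is illustrative; use whatever file you actually downloaded):

```shell
# Upgrade pip first: the new backtracking resolver shipped in pip 20.3
python -m pip install --upgrade "pip>=20.3"

# Then install the nightly wheel, so the new resolver checks the pins
python -m pip install ./prodigy-1.11.0a10-cp38-cp38-linux_x86_64.whl
```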
Thanks a lot for the update! I will try ner.teach as soon as I can.
I am a little confused by all the download files at the moment. What is the difference between the cp36m/cp37m/cp38/cp39 files? Does it matter which one I use for installation?
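(For context: the cp36m/cp37m/cp38/cp39 parts are CPython wheel tags, indicating which Python version a wheel was built for, so you want the one matching your interpreter, e.g. cp38 for Python 3.8. A quick way to see which tags your environment accepts:)

```shell
# List the wheel tags this interpreter/pip combination considers compatible
python -m pip debug --verbose | head -n 30
```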
I tried the new version of Prodigy, and this error keeps happening:
Python 3.8.10, spaCy 3.1, Prodigy 1.11.0a10
File "/home/borlogh/.local/share/virtualenvs/fb_sio-Kywf3_9j/lib/python3.8/site-packages/spacy_transformers/layers/trfs2arrays.py", line 23, in forward
for trf_data in trf_datas:
TypeError: 'FullTransformerBatch' object is not iterable
Thanks for trying to update. I've just tried to replicate this and couldn't immediately, but I'm on the latest version (master branch) of spacy-transformers. Do you have 1.0.2 installed? If so, could you try building the repo from source? There might have been a bugfix that hasn't made it to the latest release yet. Sorry for being annoying, but I still think this bug was fixed, so I'm hoping we can find you a setup that works!
I am using spacy-transformers 1.0.3. I tried to downgrade to 1.0.2, but it isn't compatible with spaCy 3.1:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed.
This behaviour is the source of the following dependency conflicts.
prodigy 1.11.0a10 requires spacy<3.2.0,>=3.1.0, but you have spacy 3.0.6 which is incompatible.
en-core-web-trf 3.1.0 requires spacy<3.2.0,>=3.1.0, but you have spacy 3.0.6 which is incompatible.
en-core-web-trf 3.1.0 requires spacy-transformers<1.1.0,>=1.0.3, but you have spacy-transformers 1.0.2 which is incompatible.
I made it work by modifying the library as before, so it isn't a blocking problem for me right now.
Sorry, I should have been clearer: 1.0.3 is the correct version of spacy-transformers (downgrading won't help), but there's a slightly newer version that's only available when you build it from source.
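In case it helps, one typical way to build from source (assuming the public GitHub repo is the right source for your setup):

```shell
# Install spacy-transformers straight from the master branch,
# replacing the 1.0.3 release currently in the environment
python -m pip install "git+https://github.com/explosion/spacy-transformers.git@master"
```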
I am not sure what I am doing wrong, but when I try to fine-tune a custom model with my binary ner.teach annotations, it doesn't seem to work at all. This is what happens when I try to fine-tune a spaCy 3.1 model. When using a spaCy 3.0.6 model (the same training process, just the previous spaCy version), it starts at a score of 0.36 and then decreases to 0 as well.
@tristan19954: thanks for sharing these results! It would be good to get to the bottom of this.
It looks like there are 254 instances in the training dataset for 11 NER labels. That might be a bit too few, depending on how close your new annotations are to what the model was originally trained on. You don't have a separate evaluation dataset, so 63 cases were selected at random for evaluation, but those 63 might not be well represented by the 254 training instances. Again, this depends a bit on the variability of your training dataset. One way to test this is to run an artificial experiment with -n ner_teach_july,eval:ner_teach_july, which effectively trains AND evaluates on the same dataset. You typically want to avoid this, but it's a good check that the training mechanism works as expected.
An important point is that when training on the ner_teach_july dataset, it might be the case that the model starts "forgetting" about previously learned instances, and starts overfitting on this dataset. With the Prodigy nightly you should be able to prevent this by feeding an additional NER dataset, so you can train on the "teach" dataset and another dataset simultaneously. Ideally you'd have a separate evaluation dataset that you've used both to analyse the original performance as well as the performance after training on the "teach" dataset (rather than the random split used here).
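To make the two suggested experiments concrete, the commands might look roughly like this (the output path and the second dataset name are illustrative, and the exact train flags may differ between nightly versions):

```shell
# Sanity check: train AND evaluate on the same dataset.
# If the mechanism works, the score should climb rather than collapse to 0.
prodigy train ./output -n ner_teach_july,eval:ner_teach_july

# Reduce forgetting: train on the "teach" dataset plus another NER dataset
prodigy train ./output -n ner_teach_july,my_original_ner_data
```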
I stopped around 300 because after that I was getting problems similar to what I reported in a previous post back in April. It seems to be working way longer/better now, but after 320 it started to just select the whole sentence and highlight that as an entity, which is why I tried to update the model at that point.
I will try out what you suggested and keep you updated! I still have access to my original training and evaluation data, so I will try mixing that in too.
Strange behavior has been happening since I updated Prodigy from 1.11.0a8 to 1.11.0a10: it is using more GPU memory. I ran the exact same command in both environments and got these results:
1.11.0a8
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.84 Driver Version: 460.84 CUDA Version: 11.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla T4 Off | 00000000:00:04.0 Off | 0 |
| N/A 76C P0 41W / 70W | 10146MiB / 15109MiB | 87% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 2910 C python 10143MiB |
+-----------------------------------------------------------------------------+
1.11.0a10
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.84 Driver Version: 460.84 CUDA Version: 11.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla T4 Off | 00000000:00:04.0 Off | 0 |
| N/A 70C P0 31W / 70W | 15072MiB / 15109MiB | 80% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 2451 C python 15069MiB |
+-----------------------------------------------------------------------------+
I detected it because I ran out of memory:
⚠ Aborting and saving the final best model. Encountered exception:
RuntimeError('CUDA out of memory. Tried to allocate 264.00 MiB (GPU 0; 14.76 GiB
total capacity; 11.06 GiB already allocated; 37.75 MiB free; 11.91 GiB reserved
in total by PyTorch)')
Is this difference in memory consumption normal?
For comparison, have you tried the same training run with a non-transformer model? I wonder if this could be related to the transformer being less sensitive to these types of small and very sparse updates.
Could you send us an email at contact@explosion.ai and include your order ID? Then we can look into this internally.
You can use Prodigy v1.10 (latest stable version) with spaCy v2 and export your annotations with data-to-spacy. In spaCy v3, you can convert this data to spaCy v3's new format with spacy convert and then use it to train a spaCy v3 model. You can also apply for the nightly (see first post above), which uses spaCy v3 by default.
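A rough sketch of that workflow (file and dataset names are illustrative; check the exact CLI arguments against the docs for your versions):

```shell
# 1) Export annotations from Prodigy v1.10 (spaCy v2 JSON format)
prodigy data-to-spacy ./annotations.json --lang en --ner my_ner_dataset

# 2) Convert the exported data to spaCy v3's binary .spacy format
python -m spacy convert ./annotations.json ./corpus

# 3) Train a spaCy v3 model on the converted corpus
python -m spacy train config.cfg --paths.train ./corpus/annotations.spacy
```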