Multi-user, upload multiple files at once, file capacity

Hello,
I have a few questions:

  • Is creating multiple instances with different ports the only way to create a multi-user environment?
  • Is there a way to upload multiple text files at once? And is it okay for the text format to be .txt?
  • Is there a limit on the text file size? What's the maximum size for a text file to upload and annotate while the UI still offers acceptable performance?
  • Finally, is it possible to show multiple lines of text while annotating in the UI?

Thanks so much,

Running multiple instances on different ports is one option. Alternatively, you can also have multiple users connect to the same session using named multi-user sessions: Web Application · Prodigy · An annotation tool for AI, Machine Learning & NLP

By default, Prodigy expects one file, but you can always implement a custom loader if you want to load from multiple files or an entirely custom format: Loaders and Input Data · Prodigy · An annotation tool for AI, Machine Learning & NLP
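
As a rough illustration of the custom loader idea, here is a minimal sketch in plain Python that reads several .txt files and yields one example per non-empty line. It assumes each line is one example and that examples are dictionaries with a "text" key; the "meta" field, the function names and the file layout are my own assumptions for illustration, not part of Prodigy's API.

```python
import json
from pathlib import Path
from typing import IO, Iterable, Iterator


def stream_text_files(paths: Iterable[Path]) -> Iterator[dict]:
    """Lazily yield one example per non-empty line, file by file."""
    for path in paths:
        path = Path(path)
        with path.open(encoding="utf-8") as f:
            # Iterating over the file object reads one line at a time,
            # so large files are never loaded into memory in full.
            for line_number, line in enumerate(f, start=1):
                text = line.strip()
                if text:
                    # "meta" is an assumed, optional field recording the origin
                    yield {
                        "text": text,
                        "meta": {"source": path.name, "line": line_number},
                    }


def write_jsonl(examples: Iterable[dict], out: IO[str]) -> None:
    """Write each example as one JSON object per line (JSONL)."""
    for example in examples:
        out.write(json.dumps(example, ensure_ascii=False) + "\n")


# Hypothetical usage: combine every .txt file in a folder into one JSONL file.
# with open("combined.jsonl", "w", encoding="utf-8") as out:
#     write_jsonl(stream_text_files(sorted(Path("my_texts").glob("*.txt"))), out)
```

The resulting single .jsonl file can then be passed to a recipe like any other input file. Writing to one combined file is just one possible design; the same generator could also be used directly inside a custom recipe.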

If you're using a format that can be read in line by line, e.g. .txt or .jsonl, the file size doesn't matter because the data will be streamed in line by line rather than loaded all at once.

.txt is okay if you're working with sentences, but it's not a great format if your examples include line breaks because there's no good way to define where an example starts and ends and how to split up the data. So in that case, you probably want to use a more flexible format like .json or .jsonl instead. If the recipe you're using performs sentence segmentation, you can disable it using the --unsegmented flag.
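
To make the line break problem concrete, here is a small sketch that converts raw text into JSONL, assuming examples are separated by blank lines (that separator is my assumption; your data may need a different rule). Line breaks inside an example survive because JSON escapes them, so each example still occupies exactly one line of the output. The function names are hypothetical.

```python
import json
import re
from typing import Iterator


def paragraphs_to_examples(raw_text: str) -> Iterator[dict]:
    """Yield one {"text": ...} example per blank-line-separated paragraph."""
    # Split on one or more blank lines; single line breaks inside a
    # paragraph are left untouched.
    for block in re.split(r"\n\s*\n", raw_text):
        text = block.strip()
        if text:
            yield {"text": text}


def to_jsonl(raw_text: str) -> str:
    """Serialize the paragraphs as JSONL, one JSON object per output line."""
    return "\n".join(
        json.dumps(example, ensure_ascii=False)
        for example in paragraphs_to_examples(raw_text)
    )
```

For instance, calling to_jsonl on a text with two paragraphs produces two lines of output, and the first paragraph's internal line break is stored as an escaped \n inside its "text" value.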

You typically want to focus on a sentence or paragraph per example if you're annotating entities, because there's no advantage in annotating longer documents, and it's a lot easier for the annotator. If you're annotating text categories, you can also use longer documents – that really depends on your task and the data.
