What is the meaning of the long_text flag?

Hi there,

I couldn’t really work out what the long_text flag that I see around the model means. Does it affect the UI, sentence splitting, or the model itself?


In hindsight, I think it would have been better to handle this in the pre-processing instead of having it as an argument in the model.

The long_text argument applies sentence splitting and then asks questions about each sentence. For evaluation, it runs the model over each sentence in the document and sums the scores.

You can achieve the same thing easily enough by modifying the data or the recipe, which I would probably recommend.
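As a rough illustration of the "modify the data" route, here is a minimal sketch that splits each text into sentences before annotation and emits one task per sentence. It is not what long_text does internally: the regex splitter is a naive stand-in for a proper sentence segmenter, and the helper names and the `doc_id` / `sent_id` meta keys are made up for this example.

```python
import json
import re


def split_sentences(text):
    """Naive splitter: break on whitespace that follows '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def to_sentence_tasks(examples):
    """Yield one task per sentence, remembering which document it came from."""
    for doc_id, eg in enumerate(examples):
        for sent_id, sent in enumerate(split_sentences(eg["text"])):
            # "doc_id" and "sent_id" are hypothetical keys used to regroup later
            yield {"text": sent, "meta": {"doc_id": doc_id, "sent_id": sent_id}}


examples = [{"text": "The battery died after a week. Support was helpful though!"}]
tasks = list(to_sentence_tasks(examples))

# Print the tasks as JSONL, one sentence-level example per line
for task in tasks:
    print(json.dumps(task))
```

You could write the output to a JSONL file and use that as the source instead of the original documents, so the sentence splitting is explicit in your data.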

Thanks for the quick reply!

It would be great if this info was available here: https://prodi.gy/docs/recipes#textcat-teach

We ended up more or less deprecating that mode because it didn't end up being very useful and was a bit too "magical" (too much going on under the hood and not very transparent). The flag still exists because we didn't want to break backwards compatibility, but it's not necessarily a workflow we'd recommend. Instead, you can just split the sentences yourself and decide how you want to interpret the per-sentence scores, which makes the whole process more transparent as well 🙂
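To make "decide how you want to interpret the per-sentence scores" a bit more concrete, here is a small sketch, assuming you have some function that scores a single sentence. `toy_score` is a hypothetical stand-in for a real classifier and the regex splitter is deliberately naive. Summing mirrors what the flag did for evaluation as described above; taking the max is just one alternative.

```python
import re


def split_sentences(text):
    """Naive splitter: break on whitespace that follows '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def score_document(text, score_sentence, combine=sum):
    """Score every sentence, then combine the scores into one document score."""
    scores = [score_sentence(sent) for sent in split_sentences(text)]
    return combine(scores) if scores else 0.0


def toy_score(sentence):
    # Hypothetical stand-in for a real text classifier's score for one label
    return 1.0 if "refund" in sentence.lower() else 0.0


doc = "I want a refund. The item arrived broken. Please refund me today!"
total = score_document(doc, toy_score)               # sum of sentence scores
peak = score_document(doc, toy_score, combine=max)   # highest sentence score
print(total, peak)
```

Because the combining step is a plain argument, it's easy to swap in a mean or a threshold and see how that changes your results.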