I have quite a few fields (HTML body, timestamps, etc.) that are displayed with each task in the frontend. However, I don't need to save any of those to the DB; an ID will suffice. Is it possible to exclude certain fields when saving to the DB?
You could write a hack around this, but I'm not sure I'd recommend that – it's probably better to just turn off Prodigy's built-in database functionality and store the IDs separately yourself (SQLite, remote DB, whatever you prefer).
Prodigy typically tries to make sure that the incoming data is preserved exactly as is, and that the records in the database always reflect what the annotator saw. This is usually a good thing, because it means you never lose any information, and you'll always be able to reproduce the original example. Prodigy also typically assumes that the datasets you have in the database contain the complete information, so by changing that and only saving a limited version, you can easily end up with pretty inconsistent datasets.
So I'd maybe try something like this in your recipe, and set "db": False in the returned components to disable the built-in database:
def update(answers):
    # Keep only the fields you want to store, then hand them to your own DB
    db_answers = [{"id": eg["id"], "answer": eg["answer"]} for eg in answers]
    save_to_your_custom_db(db_answers)
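To make that a bit more concrete, here's a rough sketch of how the pieces could fit together with SQLite. The database path, the table layout and the build_components helper are all made up for illustration, and it assumes every task carries an "id" key. In a real recipe, the dictionary would be returned from your recipe function rather than from a helper like this, and the "html" view is only a guess based on the HTML body you mentioned.

```python
import sqlite3

DB_PATH = "annotations.db"  # hypothetical path, pick whatever suits your setup


def save_to_your_custom_db(db_answers, db_path=DB_PATH):
    """Store the reduced records (ID + answer only) in a SQLite table."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS answers "
                "(id TEXT PRIMARY KEY, answer TEXT)"
            )
            # Re-annotating a task overwrites its previous answer
            conn.executemany(
                "INSERT OR REPLACE INTO answers (id, answer) "
                "VALUES (:id, :answer)",
                db_answers,
            )
    finally:
        conn.close()


def update(answers):
    # Called with each batch of annotated tasks; drop everything except
    # the ID and the annotator's decision before saving
    db_answers = [{"id": eg["id"], "answer": eg["answer"]} for eg in answers]
    save_to_your_custom_db(db_answers)


def build_components(stream):
    # Hypothetical helper: in a real recipe, this dict is what the
    # recipe function itself returns
    return {
        "view_id": "html",  # assumption, based on the HTML body field
        "stream": stream,
        "update": update,
        "db": False,  # disable the built-in database
    }
```

One thing to keep in mind with this approach: since nothing ends up in Prodigy's own database, anything that relies on a dataset being there won't see these annotations, so you'd have to query your own table for them.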