Greetings, Kabir
Our stream contains 4000 examples, and we started noticing the repetition around 30-40 examples in.
Thank you so much!
Edit: here are some examples from the stream:
$ head -n 10 4000_sents_postag.jsonl
{"text": "No entanto, o filósofo Immanuel Kant afirma: ´´O ser humano é aquilo que a educação faz dele.", "spans": [{"start": 0, "end": 2, "label": "KC"}, {"start": 3, "end": 10, "label": "KC"}, {"start": 10, "end": 11, "label": "PU"}, {"start": 12, "end": 13, "label": "ART"}, {"start": 14, "end": 22, "label": "N"}, {"start": 23, "end": 31, "label": "NPROP"}, {"start": 32, "end": 36, "label": "NPROP"}, {"start": 37, "end": 43, "label": "V"}, {"start": 43, "end": 44, "label": "PU"}, {"start": 45, "end": 46, "label": "N"}, {"start": 46, "end": 47, "label": "N"}, {"start": 47, "end": 48, "label": "ART"}, {"start": 49, "end": 52, "label": "N"}, {"start": 53, "end": 59, "label": "N"}, {"start": 60, "end": 61, "label": "V"}, {"start": 62, "end": 68, "label": "PROSUB"}, {"start": 69, "end": 72, "label": "PRO-KS"}, {"start": 73, "end": 74, "label": "ART"}, {"start": 75, "end": 83, "label": "N"}, {"start": 84, "end": 87, "label": "V"}, {"start": 88, "end": 92, "label": "PREP+PROPESS"}, {"start": 92, "end": 93, "label": "PU"}]}
{"text": "Por isso é preciso que o Estado promova longos debates com as escolas proporcionando aos professores uma maneira agradável de instruir os alunos desde o ensino fundamental que o cancelamento não é a forma correta de punir alguém pelo erro.", "spans": [{"start": 0, "end": 3, "label": "PREP"}, {"start": 4, "end": 8, "label": "PROSUB"}, {"start": 9, "end": 10, "label": "V"}, {"start": 11, "end": 18, "label": "ADJ"}, {"start": 19, "end": 22, "label": "KS"}, {"start": 23, "end": 24, "label": "ART"}, {"start": 25, "end": 31, "label": "N"}, {"start": 32, "end": 39, "label": "V"}, {"start": 40, "end": 46, "label": "ADJ"}, {"start": 47, "end": 54, "label": "ADJ"}, {"start": 55, "end": 58, "label": "PREP"}, {"start": 59, "end": 61, "label": "ART"}, {"start": 62, "end": 69, "label": "N"}, {"start": 70, "end": 84, "label": "V"}, {"start": 85, "end": 88, "label": "PREP+ART"}, {"start": 89, "end": 100, "label": "N"}, {"start": 101, "end": 104, "label": "ART"}, {"start": 105, "end": 112, "label": "N"}, {"start": 113, "end": 122, "label": "ADJ"}, {"start": 123, "end": 125, "label": "PREP"}, {"start": 126, "end": 134, "label": "V"}, {"start": 135, "end": 137, "label": "ART"}, {"start": 138, "end": 144, "label": "N"}, {"start": 145, "end": 150, "label": "PREP"}, {"start": 151, "end": 152, "label": "ART"}, {"start": 153, "end": 159, "label": "N"}, {"start": 160, "end": 171, "label": "ADJ"}, {"start": 172, "end": 175, "label": "PRO-KS"}, {"start": 176, "end": 177, "label": "ART"}, {"start": 178, "end": 190, "label": "N"}, {"start": 191, "end": 194, "label": "ADV"}, {"start": 195, "end": 196, "label": "V"}, {"start": 197, "end": 198, "label": "ART"}, {"start": 199, "end": 204, "label": "N"}, {"start": 205, "end": 212, "label": "ADJ"}, {"start": 213, "end": 215, "label": "PREP"}, {"start": 216, "end": 221, "label": "V"}, {"start": 222, "end": 228, "label": "PROSUB"}, {"start": 229, "end": 233, "label": "PREP+ART"}, {"start": 234, "end": 238, "label": "N"}, {"start": 238, "end": 239, "label": "PU"}]}
{"text": "Nesse sentido, observa-se como o consumo exagerado favorece a degradação do meio ambiente, além de prejudicar a qualidade de vida dos cidadãos.", "spans": [{"start": 0, "end": 5, "label": "PREP+PROADJ"}, {"start": 6, "end": 13, "label": "N"}, {"start": 13, "end": 14, "label": "PU"}, {"start": 15, "end": 25, "label": "V+PROPESS"}, {"start": 26, "end": 30, "label": "PREP"}, {"start": 31, "end": 32, "label": "ART"}, {"start": 33, "end": 40, "label": "N"}, {"start": 41, "end": 50, "label": "PCP"}, {"start": 51, "end": 59, "label": "V"}, {"start": 60, "end": 61, "label": "ART"}, {"start": 62, "end": 72, "label": "N"}, {"start": 73, "end": 75, "label": "PREP+ART"}, {"start": 76, "end": 80, "label": "N"}, {"start": 81, "end": 89, "label": "N"}, {"start": 89, "end": 90, "label": "PU"}, {"start": 91, "end": 95, "label": "PREP"}, {"start": 96, "end": 98, "label": "PREP"}, {"start": 99, "end": 109, "label": "V"}, {"start": 110, "end": 111, "label": "ART"}, {"start": 112, "end": 121, "label": "N"}, {"start": 122, "end": 124, "label": "PREP"}, {"start": 125, "end": 129, "label": "N"}, {"start": 130, "end": 133, "label": "PREP+ART"}, {"start": 134, "end": 142, "label": "N"}, {"start": 142, "end": 143, "label": "PU"}]}
{"text": "Nesse prisma, destacam-se dois aspectos importantes: o papel da indústria agropecuária no desflorestamento, e quais os impactos de tais atividades.", "spans": [{"start": 0, "end": 5, "label": "PREP+PROADJ"}, {"start": 6, "end": 12, "label": "N"}, {"start": 12, "end": 13, "label": "PU"}, {"start": 14, "end": 25, "label": "V+PROPESS"}, {"start": 26, "end": 30, "label": "NUM"}, {"start": 31, "end": 39, "label": "N"}, {"start": 40, "end": 51, "label": "ADJ"}, {"start": 51, "end": 52, "label": "PU"}, {"start": 53, "end": 54, "label": "ART"}, {"start": 55, "end": 60, "label": "N"}, {"start": 61, "end": 63, "label": "PREP+ART"}, {"start": 64, "end": 73, "label": "N"}, {"start": 74, "end": 86, "label": "ADJ"}, {"start": 87, "end": 89, "label": "PREP+ART"}, {"start": 90, "end": 106, "label": "N"}, {"start": 106, "end": 107, "label": "PU"}, {"start": 108, "end": 109, "label": "KC"}, {"start": 110, "end": 115, "label": "PRO-KS"}, {"start": 116, "end": 118, "label": "ART"}, {"start": 119, "end": 127, "label": "N"}, {"start": 128, "end": 130, "label": "PREP"}, {"start": 131, "end": 135, "label": "PROADJ"}, {"start": 136, "end": 146, "label": "N"}, {"start": 146, "end": 147, "label": "PU"}]}
{"text": "Esta taxa alarmante já se via por relatos mais antigos, especificamente dos anos 90.", "spans": [{"start": 0, "end": 4, "label": "PROADJ"}, {"start": 5, "end": 9, "label": "N"}, {"start": 10, "end": 19, "label": "ADJ"}, {"start": 20, "end": 22, "label": "ADV"}, {"start": 23, "end": 25, "label": "PROPESS"}, {"start": 26, "end": 29, "label": "V"}, {"start": 30, "end": 33, "label": "PREP"}, {"start": 34, "end": 41, "label": "N"}, {"start": 42, "end": 46, "label": "ADV"}, {"start": 47, "end": 54, "label": "ADJ"}, {"start": 54, "end": 55, "label": "PU"}, {"start": 56, "end": 71, "label": "ADJ"}, {"start": 72, "end": 75, "label": "PREP+ART"}, {"start": 76, "end": 80, "label": "N"}, {"start": 81, "end": 83, "label": "N"}, {"start": 83, "end": 84, "label": "PU"}]}
{"text": "Em dado momento, Martin Luther King, um escritor ativista, diz que a injustiça em um lugar qualquer é uma ameaça à justiça em todo lugar.", "spans": [{"start": 0, "end": 2, "label": "PREP"}, {"start": 3, "end": 7, "label": "PCP"}, {"start": 8, "end": 15, "label": "N"}, {"start": 15, "end": 16, "label": "PU"}, {"start": 17, "end": 23, "label": "NPROP"}, {"start": 24, "end": 30, "label": "NPROP"}, {"start": 31, "end": 35, "label": "NPROP"}, {"start": 35, "end": 36, "label": "PU"}, {"start": 37, "end": 39, "label": "ART"}, {"start": 40, "end": 48, "label": "N"}, {"start": 49, "end": 57, "label": "ADJ"}, {"start": 57, "end": 58, "label": "PU"}, {"start": 59, "end": 62, "label": "V"}, {"start": 63, "end": 66, "label": "KS"}, {"start": 66, "end": 67, "label": "V+PROPESS"}, {"start": 67, "end": 68, "label": "PREP"}, {"start": 69, "end": 78, "label": "N"}, {"start": 79, "end": 81, "label": "PREP"}, {"start": 82, "end": 84, "label": "ART"}, {"start": 85, "end": 90, "label": "N"}, {"start": 91, "end": 99, "label": "PROADJ"}, {"start": 100, "end": 101, "label": "V"}, {"start": 102, "end": 105, "label": "ART"}, {"start": 106, "end": 112, "label": "N"}, {"start": 113, "end": 114, "label": "PREP+ART"}, {"start": 115, "end": 122, "label": "N"}, {"start": 123, "end": 125, "label": "PREP"}, {"start": 126, "end": 130, "label": "PROADJ"}, {"start": 131, "end": 136, "label": "N"}, {"start": 136, "end": 137, "label": "PU"}]}
{"text": "A cada dia que passa a exploração exacerbada destroem matas, florestas e poluem mais o ambiente, essa devastação em nome do desenvolvimento econômico extrapola os limites do consumo consciente prejudicando a todos os cidadões.", "spans": [{"start": 0, "end": 1, "label": "PREP"}, {"start": 2, "end": 6, "label": "PROADJ"}, {"start": 7, "end": 10, "label": "N"}, {"start": 11, "end": 14, "label": "PRO-KS"}, {"start": 15, "end": 20, "label": "V"}, {"start": 21, "end": 22, "label": "ART"}, {"start": 23, "end": 33, "label": "N"}, {"start": 34, "end": 44, "label": "PCP"}, {"start": 45, "end": 53, "label": "V"}, {"start": 54, "end": 59, "label": "N"}, {"start": 59, "end": 60, "label": "PU"}, {"start": 61, "end": 70, "label": "N"}, {"start": 71, "end": 72, "label": "KC"}, {"start": 73, "end": 79, "label": "V"}, {"start": 80, "end": 84, "label": "ADV"}, {"start": 85, "end": 86, "label": "ART"}, {"start": 87, "end": 95, "label": "N"}, {"start": 95, "end": 96, "label": "PU"}, {"start": 97, "end": 101, "label": "PROADJ"}, {"start": 102, "end": 112, "label": "N"}, {"start": 113, "end": 115, "label": "PREP"}, {"start": 116, "end": 120, "label": "N"}, {"start": 121, "end": 123, "label": "PREP+ART"}, {"start": 124, "end": 139, "label": "N"}, {"start": 140, "end": 149, "label": "ADJ"}, {"start": 150, "end": 159, "label": "V"}, {"start": 160, "end": 162, "label": "ART"}, {"start": 163, "end": 170, "label": "N"}, {"start": 171, "end": 173, "label": "PREP+ART"}, {"start": 174, "end": 181, "label": "N"}, {"start": 182, "end": 192, "label": "ADJ"}, {"start": 193, "end": 205, "label": "V"}, {"start": 206, "end": 207, "label": "PREP"}, {"start": 208, "end": 213, "label": "PROADJ"}, {"start": 214, "end": 216, "label": "ART"}, {"start": 217, "end": 225, "label": "N"}, {"start": 225, "end": 226, "label": "PU"}]}
{"text": "Mas também, ações racistas e de injurias raciais ainda acometem a nossa sociedade, com o intuito de ofender a vítima com elementos referentes à raça, religião e etnia.", "spans": [{"start": 0, "end": 3, "label": "KC"}, {"start": 4, "end": 10, "label": "PDEN"}, {"start": 10, "end": 11, "label": "PU"}, {"start": 12, "end": 17, "label": "N"}, {"start": 18, "end": 26, "label": "ADJ"}, {"start": 27, "end": 28, "label": "KC"}, {"start": 29, "end": 31, "label": "PREP"}, {"start": 32, "end": 40, "label": "N"}, {"start": 41, "end": 48, "label": "ADJ"}, {"start": 49, "end": 54, "label": "ADV"}, {"start": 55, "end": 63, "label": "V"}, {"start": 64, "end": 65, "label": "ART"}, {"start": 66, "end": 71, "label": "PROADJ"}, {"start": 72, "end": 81, "label": "N"}, {"start": 81, "end": 82, "label": "PU"}, {"start": 83, "end": 86, "label": "PREP"}, {"start": 87, "end": 88, "label": "ART"}, {"start": 89, "end": 96, "label": "N"}, {"start": 97, "end": 99, "label": "PREP"}, {"start": 100, "end": 107, "label": "V"}, {"start": 108, "end": 109, "label": "ART"}, {"start": 110, "end": 116, "label": "N"}, {"start": 117, "end": 120, "label": "PREP"}, {"start": 121, "end": 130, "label": "N"}, {"start": 131, "end": 141, "label": "PREP"}, {"start": 142, "end": 143, "label": "PREP+ART"}, {"start": 144, "end": 148, "label": "N"}, {"start": 148, "end": 149, "label": "PU"}, {"start": 150, "end": 158, "label": "N"}, {"start": 159, "end": 160, "label": "KC"}, {"start": 161, "end": 166, "label": "N"}, {"start": 166, "end": 167, "label": "PU"}]}
{"text": "Muitos são adeptos da filosofia na Gilles Lipovestsk, onde o mesmo acredita que o consumismo é uma forma terapêutica de aliviar a tenção e a ansiedade.", "spans": [{"start": 0, "end": 6, "label": "PROSUB"}, {"start": 7, "end": 10, "label": "V"}, {"start": 11, "end": 18, "label": "N"}, {"start": 19, "end": 21, "label": "PREP+ART"}, {"start": 22, "end": 31, "label": "N"}, {"start": 32, "end": 34, "label": "PREP+ART"}, {"start": 35, "end": 41, "label": "NPROP"}, {"start": 42, "end": 52, "label": "NPROP"}, {"start": 52, "end": 53, "label": "PU"}, {"start": 54, "end": 58, "label": "ADV-KS"}, {"start": 59, "end": 60, "label": "ART"}, {"start": 61, "end": 66, "label": "PROSUB"}, {"start": 67, "end": 75, "label": "V"}, {"start": 76, "end": 79, "label": "KS"}, {"start": 80, "end": 81, "label": "ART"}, {"start": 82, "end": 92, "label": "N"}, {"start": 93, "end": 94, "label": "V"}, {"start": 95, "end": 98, "label": "ART"}, {"start": 99, "end": 104, "label": "N"}, {"start": 105, "end": 116, "label": "ADJ"}, {"start": 117, "end": 119, "label": "PREP"}, {"start": 120, "end": 127, "label": "V"}, {"start": 128, "end": 129, "label": "ART"}, {"start": 130, "end": 136, "label": "N"}, {"start": 137, "end": 138, "label": "KC"}, {"start": 139, "end": 140, "label": "ART"}, {"start": 141, "end": 150, "label": "N"}, {"start": 150, "end": 151, "label": "PU"}]}
{"text": "Durante esse período diversas pessoas sentiram-se sozinhas por conta de não poder estar encontrando familiares ou alguém próximo e consequentemente causando problemas para ela mesmo, como algumas doenças.", "spans": [{"start": 0, "end": 7, "label": "PREP"}, {"start": 8, "end": 12, "label": "PROADJ"}, {"start": 13, "end": 20, "label": "N"}, {"start": 21, "end": 29, "label": "PROADJ"}, {"start": 30, "end": 37, "label": "N"}, {"start": 38, "end": 49, "label": "V+PROPESS"}, {"start": 50, "end": 58, "label": "N"}, {"start": 59, "end": 62, "label": "PREP"}, {"start": 63, "end": 68, "label": "N"}, {"start": 69, "end": 71, "label": "PREP"}, {"start": 72, "end": 75, "label": "ADV"}, {"start": 76, "end": 81, "label": "V"}, {"start": 82, "end": 87, "label": "V"}, {"start": 88, "end": 99, "label": "V"}, {"start": 100, "end": 110, "label": "N"}, {"start": 111, "end": 113, "label": "KC"}, {"start": 114, "end": 120, "label": "PROSUB"}, {"start": 121, "end": 128, "label": "ADJ"}, {"start": 129, "end": 130, "label": "KC"}, {"start": 131, "end": 147, "label": "ADV"}, {"start": 148, "end": 156, "label": "V"}, {"start": 157, "end": 166, "label": "N"}, {"start": 167, "end": 171, "label": "PREP"}, {"start": 172, "end": 175, "label": "PROPESS"}, {"start": 176, "end": 181, "label": "PROADJ"}, {"start": 181, "end": 182, "label": "PU"}, {"start": 183, "end": 187, "label": "PREP"}, {"start": 188, "end": 195, "label": "PROADJ"}, {"start": 196, "end": 203, "label": "N"}, {"start": 203, "end": 204, "label": "PU"}]}
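
To rule out the file itself as the source of the repetition, one option is to count how often each `text` occurs. This is only a minimal sketch: it assumes one JSON object per line with a `text` field, as in the sample above, and `find_duplicate_texts` is a hypothetical helper name, not part of any library.

```python
import json
from collections import Counter


def find_duplicate_texts(lines):
    """Return {text: count} for every "text" value seen more than once.

    `lines` is any iterable of JSONL strings (for example an open file).
    Blank lines are skipped.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        counts[record["text"]] += 1
    return {text: n for text, n in counts.items() if n > 1}


# Possible usage against the file shown above:
#
#   with open("4000_sents_postag.jsonl", encoding="utf-8") as f:
#       duplicates = find_duplicate_texts(f)
#   print(len(duplicates), "texts appear more than once")
```

If this reports no duplicates, the repeated examples are probably not coming from the input file, and the cause would have to be looked for elsewhere.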