I have a highly imbalanced dataset on which I have to train an NER model. How will the imbalance affect model performance? Below is the label distribution of the data:
Tops 2981
Coat & Jacket 2327
Pants 2277
Dress 2180
Shorts 1969
Jeans 1765
Sandals 1645
Boots 1637
Sweaters 1633
Skirt 1600
Sneakers 1593
Bra 1549
Flats 1508
Jumpsuits 1252
Panties 1140
Pumps 1114
Mules 1060
Earrings 958
Bracelets 897
Clutches 870
Necklaces 848
Totes 821
Cover ups 797
Rings 793
Shoulder Bag 743
Hosiery 738
Watches 688
One Piece 676
Sunglasses 662
Belts 662
...
Bodysuits 292
Slippers 281
Hair Accessories 278
Luggage and Travel 231
Robe 227
Slingback 169
Gown 150
Hobo Bag 134
Pajamas 122
Belt bag 110
Saddle Bag 103
Chemise 102
Cold Weather 86
Diaper Bag 77
Anklets 73
Camera 70
Accessories & Cases 67
Hoodies & SweatShirt 50
Pins 39
Eyewear 30
Briefcase 22
Brooches 22
Messenger Bag 21
Bustier 17
Rompers 15
Sock-Bootie 14
Two Piece 8
Accessories 8
Underwear 1
Convertible Bag 1
Name: Style, Length: 70, dtype: int64
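
To put a number on the imbalance, here is a minimal sketch in plain Python. It uses only a handful of the counts copied from the listing above, not all 70 labels. The helper names `imbalance_ratio` and `rare_labels` and the threshold of 100 are my own illustrative choices, not part of any library.

```python
# A few (label, count) pairs copied from the listing above; not the full 70 labels.
counts = {
    "Tops": 2981,
    "Belts": 662,
    "Gown": 150,
    "Rompers": 15,
    "Underwear": 1,
}


def imbalance_ratio(label_counts):
    """Ratio of the most frequent label's count to the least frequent one's."""
    return max(label_counts.values()) / min(label_counts.values())


def rare_labels(label_counts, min_count):
    """Sorted labels with fewer than `min_count` examples (arbitrary threshold)."""
    return sorted(label for label, n in label_counts.items() if n < min_count)


ratio = imbalance_ratio(counts)
rare = rare_labels(counts, min_count=100)
print(f"imbalance ratio: {ratio:.0f}")
print(f"labels under 100 examples: {rare}")
```

So the largest class has almost 3000 times as many examples as the smallest, and some labels have only a single example.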
Is there any sampling technique that I can use to reduce the imbalance?