This project aims to predict the subcellular location of protein molecules.
The dataset contains four parts. Two hold the target values (train and test labels), and the other two hold the sequence information of the protein molecules (train and test sequences).

The dataset consists of proteins from various eukaryotes that are encoded in the nucleus, are not fragments, are experimentally annotated, and are longer than 40 amino acids. Similar localizations, or subclasses of the same localization, were merged into ten main localizations. Due to the length limitation of the contact map prediction model, we removed proteins longer than 1,000 amino acids. The final training data has 10,038 proteins, while the test data has 2,446. The training and test data are separated so that they share less than 30% sequence identity.
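
The length criteria described above can be illustrated with a minimal Python sketch. This is not the project's actual preprocessing code: the function name `filter_by_length`, the constants `MIN_LEN` and `MAX_LEN`, and the assumption that sequences and labels are held as parallel lists of strings are all hypothetical and used only for illustration.

```python
# Hypothetical sketch of the length filter described in the text.
# Names and data layout are assumptions, not the project's real code.

MIN_LEN = 40    # keep only proteins longer than 40 amino acids
MAX_LEN = 1000  # drop proteins over a thousand amino acids


def filter_by_length(sequences, labels, min_len=MIN_LEN, max_len=MAX_LEN):
    """Return the (sequences, labels) pairs whose sequence length
    satisfies min_len < length <= max_len, preserving input order."""
    if len(sequences) != len(labels):
        raise ValueError("sequences and labels must have the same length")
    kept = [
        (seq, label)
        for seq, label in zip(sequences, labels)
        if min_len < len(seq) <= max_len
    ]
    kept_sequences = [seq for seq, _ in kept]
    kept_labels = [label for _, label in kept]
    return kept_sequences, kept_labels
```

For example, calling `filter_by_length(train_sequences, train_labels)` on parallel lists (hypothetical variable names) would return only the entries whose sequences fall within the stated length range, keeping each sequence aligned with its label.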