Feature Table

The feature table is an additional output file produced by StructMAn. It is a tab-separated file in which each row holds the results computed for one amino acid position of one queried protein. Compared to the classification table, it contains more specific results and values, which can serve as input features for machine learning methods that focus on amino acids or point mutations. It can list results for queried amino acid positions as well as for queried point mutations. All of its columns are explained below:

Name	Description
Protein (Uniprot-Ac or PDB-Id:Chain-Id)	The ID of the queried protein that contains the corresponding position.
WT Amino Acid	The one-letter code of the wild-type amino acid at the queried position.
Position	The position of the queried mutation in the sequence of the query.
Mut Amino Acid	The one-letter code of the mutated amino acid at the queried position.
AA change	A combination of WT Amino Acid, Position, and Mut Amino Acid.
Tags	The tags given in the query input. In supervised machine learning, this column can be used to carry the target value.
Distance-based classification	The classification based on Euclidean distance calculations.
Distance-based simple classification	A simplified version of the distance-based classification.
RIN-based classification	The classification based on residue interaction networks.
RIN-based simple classification	A simplified version of the RIN-based classification.
Classification confidence	A confidence value for the classification based on how many structures went into the classification, the overall quality of these structures, and the consistency of the information from different structures.
Structure location	The location of the queried position: either on the Surface or in the Core of the protein. This is an aggregated result derived from the solvent accessibility of all mapped residues.
Amount of mapped structures	The number of structures the queried position could be mapped to.
Secondary structure assignment	The aggregated secondary structure assignment, obtained by a majority vote over the DSSP secondary structure assignments of all mapped residues.
IUPred value	Aggregated disorder score computed by IUPred2A.
Region structure type	Aggregated structure type: disordered region or globular region.
Modres score	Aggregated score for the tendency of the queried position to be post-translationally modified.
Modres probability	Propensity of all mapped residues being post-translationally modified.
Phi	Aggregated phi angle.
Psi	Aggregated psi angle.
KD mean	The difference in Kyte-Doolittle (KD) hydropathy score between the wild-type residue and the mutated residue.
Volume mean	The difference in van der Waals volume between the wild-type residue and the mutated residue.
Chemical distance	Value of substitution in the chemical distance substitution matrix based on .
Blosum62	Value of substitution in the Blosum62 substitution matrix.
Aliphatic change	Boolean denoting a change in the aliphatic class of the substitution.
Hydrophobic change	Boolean denoting a change in the hydrophobic class of the substitution.
Aromatic change	Boolean denoting a change in the aromatic class of the substitution.
Positive charged change	Boolean denoting a change in the positive charged class of the substitution.
Polar change	Boolean denoting a change in the polar class of the substitution.
Negative charge change	Boolean denoting a change in the negative charge class of the substitution.
Charged change	Boolean denoting a change in the charged class of the substitution.
Small change	Boolean denoting a change in the small class of the substitution.
Tiny change	Boolean denoting a change in the tiny class of the substitution.
Total change	The sum of all class changes of the substitution.
B Factor	Aggregated B-factor value.
AbsoluteCentrality	Aggregated network centrality value of all mapped residues. The centrality values are calculated from the residue interaction network of the chain of the mapped residue isolated from possible other chains given in the structure.
LengthNormalizedCentrality	Aggregated length-normalized centrality value of all mapped residues. The centrality values are normalized by the size of the chain of the mapped residue.
MinMaxNormalizedCentrality	Aggregated min-max-normalized centrality value of all mapped residues. The centrality values are normalized by a scale based on the maximal and minimal network centrality values of all residues of the chain of the mapped residue.
AbsoluteCentralityWithNegative	Same as AbsoluteCentrality, but the residue interaction networks include negative edges.
LengthNormalizedCentralityWithNegative	Same as LengthNormalizedCentrality, but the residue interaction networks include negative edges.
MinMaxNormalizedCentralityWithNegative	Same as MinMaxNormalizedCentrality, but the residue interaction networks include negative edges.
AbsoluteComplexCentrality	Aggregated network centrality value of all mapped residues. The centrality values are calculated from the residue interaction network of all chains given in the structure.
LengthNormalizedComplexCentrality	Aggregated length-normalized centrality value of all mapped residues. The centrality values are calculated from the residue interaction network of all chains given in the structure.
MinMaxNormalizedComplexCentrality	Aggregated min-max-normalized centrality value of all mapped residues. The centrality values are calculated from the residue interaction network of all chains given in the structure.
AbsoluteComplexCentralityWithNegative	Same as AbsoluteComplexCentrality, but the residue interaction networks include negative edges.
LengthNormalizedComplexCentralityWithNegative	Same as LengthNormalizedComplexCentrality, but the residue interaction networks include negative edges.
MinMaxNormalizedComplexCentralityWithNegative	Same as MinMaxNormalizedComplexCentrality, but the residue interaction networks include negative edges.
Intra_SSBOND_Propensity	Propensity of the mapped residue to form a cysteine-cysteine bond with a cysteine from the same chain.
Inter_SSBOND_Propensity	Propensity of the mapped residue to form a cysteine-cysteine bond with a cysteine from another chain.
Intra_Link_Propensity	Propensity of the mapped residue to form a covalent bond with a residue from the same chain.
Inter_Link_Propensity	Propensity of the mapped residue to form a covalent bond with a residue from another chain.
CIS_Conformation_Propensity	Propensity of the mapped residue to have a peptide bond in cis conformation to the next residue.
CIS_Follower_Propensity	Propensity of the mapped residue to have a peptide bond in cis conformation to the previous residue.
Inter Chain Median KD	Aggregated median hydropathy value of all residues of the same chain within 10 angstroms of the mapped residue.
Inter Chain Distance Weighted KD	Aggregated distance-weighted hydropathy value of all residues of the same chain within 10 angstroms of the mapped residue. Distance-weighted means that the hydropathy values are aggregated based on their distance to the mapped residue.
Inter Chain Median RSA	Aggregated median relative solvent-accessible area of all residues of the same chain within 10 angstroms of the mapped residue.
Inter Chain Distance Weighted RSA	Aggregated distance-weighted relative solvent-accessible area of all residues of the same chain within 10 angstroms of the mapped residue. Distance-weighted means that the RSA values are aggregated based on their distance to the mapped residue.
Intra Chain Median KD	Aggregated median hydropathy value of all residues of another chain within 10 angstroms of the mapped residue.
Intra Chain Distance Weighted KD	Aggregated distance-weighted hydropathy value of all residues of another chain within 10 angstroms of the mapped residue.
Intra Chain Median RSA	Aggregated median relative solvent-accessible area of all residues of another chain within 10 angstroms of the mapped residue.
Intra Chain Distance Weighted RSA	Aggregated distance-weighted relative solvent-accessible area of all residues of another chain within 10 angstroms of the mapped residue.
[neighbor, short, long, ligand, ion, metal, Protein, DNA, RNA, Peptide] score	Aggregated sum of interaction scores over all edges of the mapped residue in the residue interaction network to specific interaction partners. Neighbor: both neighboring residues connected by the main chain. Short: non-neighboring residues that are fewer than 6 positions away in the sequence of the protein. Long: all residues of the same chain that are neither neighbor nor short. Ligand: any low-molecular-weight molecule in the structure. Ion: any non-metal ion. Metal: any metal ion. Protein: any residue from another chain in the structure. DNA: any nucleotide from a DNA chain in the structure. RNA: any nucleotide from an RNA chain in the structure. Peptide: any residue from a non-protein peptide in the structure.
[neighbor, short, long, ligand, ion, metal, Protein, DNA, RNA, Peptide] degree	Aggregated number of edges of the mapped residue in the residue interaction network to specific interaction partners.
[neighbor, short, long, ligand, ion, metal, Protein, DNA, RNA, Peptide] H-bond score	Aggregated sum of H-bond scores over all edges of the mapped residue in the residue interaction network to specific interaction partners.
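
Since the feature table is a plain tab-separated file, it can be read with standard tooling and split into feature columns and a target taken from Tags. The following is a minimal sketch, not part of StructMAn: the sample rows, the helper names (`to_float`, `load_features`), the `A42V`-style identifier, the use of a bare `0`/`1` as tag content, and the way missing values are written are all assumptions made for illustration. Only the column names are taken from the table above.

```python
import csv
import io

# Hypothetical excerpt of a feature table. A real file has many more columns,
# and its exact value formats (tags, missing values) may differ from this sample.
SAMPLE = (
    "WT Amino Acid\tPosition\tMut Amino Acid\tTags\tKD mean\tBlosum62\n"
    "A\t42\tV\t1\t2.4\t0\n"
    "R\t77\tK\t0\t-0.6\t2\n"
    "G\t5\tD\t1\tNone\t-1\n"
)


def to_float(value):
    """Convert a cell to float; return None for empty or non-numeric cells."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def load_features(handle, feature_columns, target_column="Tags"):
    """Read a tab-separated feature table from an open text handle.

    Returns three parallel lists: an identifier per row (assumed form:
    WT Amino Acid + Position + Mut Amino Acid), the selected feature values
    as floats (None where a value is missing), and the raw target strings.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    ids, rows, targets = [], [], []
    for record in reader:
        ids.append(
            record["WT Amino Acid"] + record["Position"] + record["Mut Amino Acid"]
        )
        rows.append([to_float(record[name]) for name in feature_columns])
        targets.append(record[target_column])
    return ids, rows, targets


# Usage on the in-memory sample; for a real file, pass open(path, newline="").
ids, X, y = load_features(io.StringIO(SAMPLE), ["KD mean", "Blosum62"])
for row_id, features, target in zip(ids, X, y):
    print(row_id, features, target)
```

Missing values are kept as `None` here rather than being imputed, so the decision on how to handle them stays with the downstream learning method.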
