Yujie Zhang edited this page Oct 15, 2020 · 16 revisions

Welcome to the GlyCompare wiki!

What is GlyCompare?

GlyCompare is a novel method wherein glycans from glycomic data are decomposed to a minimal set of intermediate substructures, thus incorporating shared intermediate glycan substructures into all comparisons of glycans.

Run GlyCompare on Website

https://glycompare.herokuapp.com

If the website runs out of memory (RAM) while processing your dataset, please contact us. We can run the analysis for you on our lab server.

Input Files

On the "Update your own dataset" page, the sections marked with "*" are required inputs. Please follow the file-naming rules in the sections below exactly.

Mandatory Inputs

a. Dataset Name

  • This is the name of your project. If your dataset is published, the usual convention is <last name of the first author>_<year published> (e.g. Sibille_2016).

b. Abundance Table

  • This is a CSV file named as <Dataset Name>_abundance_table.csv, where <Dataset Name> is exactly the one you filled in the Dataset Name section.

  • Rows are samples and columns are glycans: the column names must be the glycan names and the row names must be the sample names.

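As a rough illustration of the layout described above, the sketch below writes a samples-by-glycans CSV with the expected file name. The helper names, the empty top-left header cell, and filling missing glycans with 0 are assumptions made for this example, not requirements stated by GlyCompare.

```python
import csv
import os
import tempfile


def abundance_table_filename(dataset_name):
    """Return the file name expected for the abundance table."""
    return f"{dataset_name}_abundance_table.csv"


def write_abundance_table(dataset_name, abundances, out_dir="."):
    """Write a samples-by-glycans CSV.

    abundances maps sample name -> {glycan name -> abundance}.
    """
    # Collect glycan names in first-seen order so the columns are stable.
    glycans = []
    for row in abundances.values():
        for glycan in row:
            if glycan not in glycans:
                glycans.append(glycan)

    path = os.path.join(out_dir, abundance_table_filename(dataset_name))
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        # Header: empty corner cell (assumption), then one column per glycan.
        writer.writerow([""] + glycans)
        for sample, row in abundances.items():
            # Glycans not measured in a sample are written as 0 (assumption).
            writer.writerow([sample] + [row.get(g, 0) for g in glycans])
    return path


if __name__ == "__main__":
    demo = {"sample_1": {"G1": 0.5, "G2": 0.5}, "sample_2": {"G1": 1.0}}
    print(write_abundance_table("Sibille_2016", demo, tempfile.mkdtemp()))
```

Deriving the file name from the dataset name in one place keeps the two in sync, which is what the naming rule requires.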

c. Variable Annotation

  • This is a CSV file named as <Dataset Name>_variable_annotation.csv, where <Dataset Name> is exactly the one you filled in the Dataset Name section.

  • The annotation file must have a column called "Name" that contains the glycan names. Depending on whether your dataset is structural or compositional, the other required column is either "Glycan Structure" (structural data) or "Composition" (compositional data).

  • Structural data can be given as IUPAC-extended, glycoCT, WURCS, glytoucan_id, or linear_code.

  • Compositional data is of the form HexNAc(2)Hex(5), where HexNAc and Hex are monosaccharides and (2) and (5) are the number of times each occurs.

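To check the "Composition" column before uploading, a string such as HexNAc(2)Hex(5) can be split into monosaccharide counts. The following is a minimal sketch; the function name and the exact set of characters accepted in a monosaccharide name are assumptions, and GlyCompare's own parser may be stricter or more lenient.

```python
import re

# One token: a name starting with a letter, followed by a count in parentheses.
_TOKEN = re.compile(r"([A-Za-z][A-Za-z0-9]*)\((\d+)\)")
_WHOLE = re.compile(r"(?:[A-Za-z][A-Za-z0-9]*\(\d+\))+")


def parse_composition(text):
    """Return {monosaccharide: count} for a string like 'HexNAc(2)Hex(5)'."""
    text = text.strip()
    if not _WHOLE.fullmatch(text):
        raise ValueError(f"not a valid composition string: {text!r}")
    counts = {}
    for name, number in _TOKEN.findall(text):
        # Sum repeated names rather than overwrite them.
        counts[name] = counts.get(name, 0) + int(number)
    return counts


if __name__ == "__main__":
    print(parse_composition("HexNAc(2)Hex(5)"))  # {'HexNAc': 2, 'Hex': 5}
```

Running this over every entry in the column is a quick way to catch typos such as a missing parenthesis before the file is uploaded.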

Input Parameters

On the "Run My Updated Dataset" page, you will need to input some parameters to proceed with the program.

1. Choose your dataset

The uploaded datasets are listed in the dropdown menu. Please select yours. If you have run the selected dataset before, the program remembers the parameters you used; press "Auto parameters" to fill them in automatically.

2. Choose your data type

  • Structure: set to Structure if you have the Glycan Structure column in your variable annotation file. The column could be IUPAC-extended, glycoCT, WURCS, glytoucan_id, or linear_code.
  • Composition: set to Composition if you have the Composition column in your variable annotation file. The data is in compositional form, such as HexNAc(2)Hex(5).

3. Fill in the information below if you selected Structure

Fill in this section if you set Structure for data type above.

a. Select linkage information

  • linkage + structure: if your structural data contains linkage information.
  • structure: if your structural data doesn't contain linkage information.

b. Select input data structure syntax

You can choose from IUPAC-extended, glycoCT, WURCS, glytoucan_id, and linear_code. Please select the one that your data contains.

c. Select root

(1). Biosynth

Select Biosynth if you want to restrict the analysis to a specified root.

  • N-glycan: sets the root to GlcNAc.
  • O-glycan: sets the root to GalNAc.
  • HMO/glycolipids: sets the root to Gal(b1-4)Glc (lactose).
  • Custom root: sets the root to one or more monosaccharides of your choice. Enter your custom root in glycoCT format in the Custom core textbox below.

(2). Epitope

Select Epitope if you don't want to specify a single root. The analysis then treats every monosaccharide as a possible root.
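The root options above amount to a small lookup. The sketch below is only a way to summarize them in code; the function and dictionary names are hypothetical and are not part of GlyCompare's interface.

```python
# Preset roots listed under the Biosynth option.
BIOSYNTH_ROOTS = {
    "N-glycan": "GlcNAc",
    "O-glycan": "GalNAc",
    "HMO/glycolipids": "Gal(b1-4)Glc",  # lactose
}


def resolve_root(mode, glycan_type=None, custom_root=None):
    """Return the root for a run, or None when no single root is fixed.

    mode is "Biosynth" or "Epitope". This is an illustrative helper only.
    """
    if mode == "Epitope":
        # Epitope mode: every monosaccharide may act as a root.
        return None
    if mode != "Biosynth":
        raise ValueError(f"unknown mode: {mode!r}")
    if glycan_type == "Custom root":
        if not custom_root:
            raise ValueError("a custom root (glycoCT) must be provided")
        return custom_root
    if glycan_type not in BIOSYNTH_ROOTS:
        raise ValueError(f"unknown glycan type: {glycan_type!r}")
    return BIOSYNTH_ROOTS[glycan_type]


if __name__ == "__main__":
    print(resolve_root("Biosynth", "N-glycan"))  # GlcNAc
    print(resolve_root("Epitope"))               # None
```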

d. Draw cluster map

Check this if you want to draw cluster maps, including pseudo_profile_clustering, motif_cluster, and profile_clustering.


4. Fill in the information below if you selected Composition

Specify custom summation network

Currently under construction.

5. Please select input data normalization

  • no normalization: uses the raw abundance data.
  • z-score normalization: standardizes the abundances to zero mean and unit variance.
  • probabilistic quotient normalization: a normalization method commonly used for biological data, described in Dieterle et al. 2006.
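For readers unfamiliar with the two normalization options, the sketch below shows the general idea of each using only the standard library. It is not GlyCompare's implementation: the function names are made up, the table is assumed to be a list of samples (rows) with one abundance per glycan, the reference profile is assumed to be the per-glycan median across samples, and the axis along which GlyCompare applies the z-score is not stated here.

```python
from statistics import mean, median, pstdev


def zscore(values):
    """Standardize a list of numbers to zero mean and unit variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant vector has no spread; return zeros instead of dividing by 0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def pqn(table):
    """Probabilistic quotient normalization of a samples-by-glycans table."""
    n_glycans = len(table[0])
    # Reference profile: median abundance of each glycan across samples.
    reference = [median(row[j] for row in table) for j in range(n_glycans)]
    normalized = []
    for row in table:
        # Quotient of each glycan against the reference (skip zero references).
        quotients = [row[j] / reference[j] for j in range(n_glycans) if reference[j] > 0]
        factor = median(quotients) if quotients else 1.0
        if factor == 0:
            factor = 1.0  # leave an all-zero sample unchanged
        # Divide the whole sample by its most probable dilution factor.
        normalized.append([v / factor for v in row])
    return normalized


if __name__ == "__main__":
    print(pqn([[1, 2, 3], [2, 4, 6], [4, 8, 12]]))
    print(zscore([1, 2, 3]))
```

In the small example, the three samples differ only by an overall dilution factor, so probabilistic quotient normalization maps them all onto the same profile.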

6. Download reference motif vector

The reference motif vector is built from previously analyzed datasets and is used during GlyCompare analysis to generate a more complete set of motifs. It is expanded automatically as new datasets are added.

Install GlyCompare
