Individual specimens to be sequenced as part of the Vertebrate Genomes Project (VGP) will be assigned a VGP ID according to the following scheme. The ID will take the form:
[abfmrs]AbcXyz{#}
where
- The one letter prefix `[abfmrs]` corresponds to one of:

  prefix | class |
  ---|---|
  a | amphibians |
  b | birds |
  f | fishes |
  m | mammals |
  r | reptiles |
  s | sharks and relatives |

- The six letter combination `AbcXyz` is a species/strain designator. In most cases, this will be `GenSpe` for Genus/Species but that is not required (see below for how to resolve clashes).
- `{#}` is an incremental number per individual specimen from the same species.

For each species in the VGP ordinal project, the 7-letter prefix `[abfmrs]AbcXyz` will be pre-assigned to avoid conflicts.
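
As an illustration of the format described above, here is a minimal Python sketch that splits a VGP ID into its three parts. The names `VgpId`, `parse_vgp_id` and `PREFIX_CLASSES` are hypothetical, and the rule that the incremental number starts at 1 with no leading zeros is an assumption based on the examples, not something the scheme states.

```python
import re
from typing import NamedTuple

# Class prefixes as listed in the scheme.
PREFIX_CLASSES = {
    "a": "amphibians",
    "b": "birds",
    "f": "fishes",
    "m": "mammals",
    "r": "reptiles",
    "s": "sharks and relatives",
}

# One class letter, six designator letters, then the incremental number.
# Any capitalisation is accepted in the designator because the scheme allows
# variants such as GENSpe. Assumption: numbers start at 1, no leading zeros.
VGP_ID_PATTERN = re.compile(r"([abfmrs])([A-Za-z]{6})([1-9][0-9]*)")


class VgpId(NamedTuple):
    prefix: str       # one-letter class prefix
    designator: str   # six-letter species/strain designator
    number: int       # incremental number per individual


def parse_vgp_id(text: str) -> VgpId:
    """Split a VGP ID into its parts; raise ValueError if it is malformed."""
    match = VGP_ID_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid VGP ID: {text!r}")
    prefix, designator, number = match.groups()
    return VgpId(prefix, designator, int(number))
```

For example, `parse_vgp_id("fGouWil2")` yields the prefix `f`, the designator `GouWil` and the number 2.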
Across all species, there will be clashes for this 7-letter prefix.
There are a few options for dealing with these:
- Allow the clashes, with individuals/species disambiguated by the final incremental number.
- Allow variation within the six letter species designator, e.g. a 2-4 split (`GeSpec`), or modified capitalisation (`GENSpe`).
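
The designator variants above can be sketched in Python as follows. The function `make_designator` and its `genus_len` parameter are hypothetical, introduced only to show how the default 3-3 form and the 2-4 split are derived from a genus and species name; it does not decide which option should be used for a given clash.

```python
def make_designator(genus: str, species: str, genus_len: int = 3) -> str:
    """Build a six-letter designator from a genus and a species name.

    genus_len=3 gives the default GenSpe form; genus_len=2 gives the
    2-4 split (GeSpec) listed as one way to work around a clash.
    """
    if not 1 <= genus_len <= 5:
        raise ValueError("genus_len must be between 1 and 5")
    species_len = 6 - genus_len
    if len(genus) < genus_len or len(species) < species_len:
        raise ValueError("name too short for the requested split")
    # Capitalise the first letter of each part and lowercase the rest.
    return genus[:genus_len].capitalize() + species[:species_len].capitalize()
```

With the clingfish example, `make_designator("Gouania", "willdenowi")` gives `GouWil`, and passing `genus_len=2` gives `GoWill`.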
The EBI are in the process of setting up a registry where VGP IDs can be assigned, so that individual IDs do not clash between centres.
VGP ID | Species and common name |
---|---|
fGouWil2 | Gouania willdenowi; blunt-snouted clingfish |
mLemCat1 | Lemur catta; ring-tailed lemur |
aRhiDar3 | Rhinoderma darwinii; Darwin's frog |
bCalAnn1 | Calypte anna; Anna's hummingbird |
rDerCor1 | Dermochelys coriacea; leatherback sea turtle |
sCarCar1 | Carcharodon carcharias; great white shark |
For a single individual, there may be multiple tissue samples used for transcriptome sequencing. The proposed scheme to distinguish these samples is:
[abfmrs]AbcXyz{#}.tissue{#}
where `tissue` should come from an agreed list of terms (to be decided). Examples: `fGouWil2.brain1`, `fGouWil2.eye2`.
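
A tissue sample ID of this form could be split into its parts roughly as in the Python sketch below. The function name `parse_tissue_sample_id` is hypothetical, and because the list of tissue terms is still to be decided, the sketch assumes a term is simply a run of lowercase letters.

```python
import re
from typing import Tuple

# Individual VGP ID, a dot, a tissue term, then a per-tissue sample number.
# Assumption: tissue terms are lowercase letters only (the agreed list of
# terms is not yet defined), and numbers start at 1 with no leading zeros.
TISSUE_SAMPLE_PATTERN = re.compile(
    r"([abfmrs][A-Za-z]{6}[1-9][0-9]*)\.([a-z]+)([1-9][0-9]*)"
)


def parse_tissue_sample_id(text: str) -> Tuple[str, str, int]:
    """Return (individual VGP ID, tissue term, sample number)."""
    match = TISSUE_SAMPLE_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid tissue sample ID: {text!r}")
    individual, tissue, number = match.groups()
    return individual, tissue, int(number)
```

For instance, `parse_tissue_sample_id("fGouWil2.brain1")` separates the individual `fGouWil2` from the tissue `brain` and the sample number 1.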
If the tissue used for transcriptome sequencing is from a different individual than the one sequenced to produce the assembly, then a new individual VGP ID should be registered.
Having assigned VGP IDs, a BioSamples accession ID should also be generated for the individual.
Agreed metadata (to be decided) should be attached to the BioSamples entries.
Tissue samples should be assigned metadata based on an agreed ontology such as Uberon and should use the `Derived from` linking facility in BioSamples to indicate the individual source of that tissue sample.
If this scheme were to extend beyond vertebrates in the VGP, the following proposal would use all the letters of the alphabet to cover the Tree of Life. This is meant as a pragmatic division rather than a strict taxonomic one.
prefix | class | count | group | notes |
---|---|---|---|---|
a | amphibians | 6439 | chordates | |
b | birds | 10301 | chordates | |
c | non-vascular plants | 14222 | plants | mosses, liverworts, hornworts; not monophyletic |
d | dicotyledons | 200000 | plants | not monophyletic |
e | echinoderms | 6753 | other animals | |
f | fishes | 31862 | chordates | lobe-finned and ray-finned = Osteichthyes = Teleostomi (excluding tetrapods) |
g | fungi | 123126 | other eukaryotes | |
h | platyhelminths | 9164 | other animals | |
i | insects | 795000 | other animals | |
j | jellyfish and other cnidaria | 9747 | other animals | |
k | other chordates | 1926 | chordates | cephalochordates, urochordates (tunicates), jawless fish; not monophyletic |
l | monocotyledons (lilies etc.) | 51595 | plants | 'l' for lily |
m | mammals | 4863 | chordates | |
n | nematodes | 3455 | other animals | |
o | sponges | 8499 | other animals | |
p | protists | 12695 | other eukaryotes | defined here as eukaryotes not animals or plants or fungi; not monophyletic |
q | other arthropods | 120000 | other animals | not insects; not monophyletic |
r | reptiles | 9789 | chordates | excluding birds |
s | sharks and relatives | 1149 | chordates | Chondrichthyes = Elasmobranchs and Chimaeras |
t | other animal phyla | 165 | other animals | |
u | algae | 2056 | plants | not monophyletic |
v | other vascular plants | 66717 | plants | ferns, cycads, conifers, ginkgo etc.; not monophyletic |
w | annelids (worms) | 12738 | other animals | |
x | molluscs | 41646 | other animals | the "scs" in "molluscs" sounds a bit like it contains an 'x' |
y | bacteria | 6468 | prokaryotes | |
z | archaea | 281 | prokaryotes | |
- | viruses | | | '-' for missing |
Equivalently, presented by group:
| group | prefix | class |
|---|---|---|
| chordates (including vertebrates) | m | mammals |
| | b | birds |
| | r | reptiles |
| | a | amphibians |
| | f | fishes |
| | s | sharks |
| | k | other chordates |
| other animals | e | echinoderms |
| | x | molluscs |
| | i | insects |
| | q | other arthropods |
| | w | annelids (worms) |
| | n | nematodes |
| | h | platyhelminths |
| | j | jellyfish and other cnidaria |
| | o | sponges |
| | t | other animal phyla |
| plants | d | dicotyledons |
| | l | monocotyledons |
| | v | other vascular plants |
| | c | non-vascular plants |
| | u | algae |
| other eukaryotes | g | fungi |
| | p | protists |
| prokaryotes | y | bacteria |
| | z | archaea |
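
The two views of the proposal carry the same information, which a short Python sketch can make explicit. The names `EXTENDED_PREFIXES` and `prefixes_by_group` are hypothetical; the mapping is copied from the tables above, and the `-` entry for viruses is left out because the proposal gives it no group.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# prefix -> (class, group), copied from the proposal tables.
# The '-' prefix for viruses is omitted: the proposal assigns it no group.
EXTENDED_PREFIXES: Dict[str, Tuple[str, str]] = {
    "a": ("amphibians", "chordates"),
    "b": ("birds", "chordates"),
    "c": ("non-vascular plants", "plants"),
    "d": ("dicotyledons", "plants"),
    "e": ("echinoderms", "other animals"),
    "f": ("fishes", "chordates"),
    "g": ("fungi", "other eukaryotes"),
    "h": ("platyhelminths", "other animals"),
    "i": ("insects", "other animals"),
    "j": ("jellyfish and other cnidaria", "other animals"),
    "k": ("other chordates", "chordates"),
    "l": ("monocotyledons", "plants"),
    "m": ("mammals", "chordates"),
    "n": ("nematodes", "other animals"),
    "o": ("sponges", "other animals"),
    "p": ("protists", "other eukaryotes"),
    "q": ("other arthropods", "other animals"),
    "r": ("reptiles", "chordates"),
    "s": ("sharks and relatives", "chordates"),
    "t": ("other animal phyla", "other animals"),
    "u": ("algae", "plants"),
    "v": ("other vascular plants", "plants"),
    "w": ("annelids (worms)", "other animals"),
    "x": ("molluscs", "other animals"),
    "y": ("bacteria", "prokaryotes"),
    "z": ("archaea", "prokaryotes"),
}


def prefixes_by_group(table: Dict[str, Tuple[str, str]]) -> Dict[str, List[str]]:
    """Regroup the per-prefix table into group -> alphabetically sorted prefixes."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for prefix, (_class_name, group) in sorted(table.items()):
        grouped[group].append(prefix)
    return dict(grouped)
```

Calling `prefixes_by_group(EXTENDED_PREFIXES)` returns one list of prefixes per group, sorted alphabetically rather than in the order used in the grouped table.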