Resolve Xrefs equivalence #36

Merged
merged 7 commits into from
Jun 6, 2024
230 changes: 49 additions & 181 deletions playground.fsx

Large diffs are not rendered by default.

45 changes: 34 additions & 11 deletions src/OBO.NET/DBXref.fs
@@ -1,16 +1,40 @@
namespace OBO.NET


open FSharpAux
open ControlledVocabulary

open System


/// Representation of dbxrefs.
type DBXref = {
/// Representation of DBXrefs.
type DBXref =
{
Name : string
Description : string
Modifiers : string
}

/// Parses a given string to a DBXref
static member ofString (v : string) =
Member

I do not understand why you are mixing static members and instance members here. Is there a need for this?

Member Author

Instance members are only possible on an already existing object. The ofString method creates a DBXref object.
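
A minimal sketch of that distinction, using a simplified stand-in record. `Xref`, `ofString` and `Prefix` here are hypothetical and only mirror the shape of the code under review; they are not the OBO.NET types.

```fsharp
/// Simplified stand-in for DBXref (hypothetical, for illustration only).
type Xref =
    { Name : string }

    /// Static member: no Xref exists yet, so this acts as a factory.
    static member ofString (v : string) = { Name = v.Trim() }

    /// Instance member: operates on an already constructed Xref.
    member this.Prefix =
        this.Name.Split(':').[0].Trim()

// The static member creates the value, the instance member reads from it.
let xref = Xref.ofString " GO:0008150 "
let prefix = xref.Prefix
```

The factory has to be static because there is no value to call it on, while `Prefix` only makes sense once a value exists.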

let xrefRegex = Text.RegularExpressions.Regex("""(?<xrefName>^([^"{])*)(\s?)(?<xrefDescription>\"(.*?)\")?(\s?)(?<xrefModifiers>\{(.*?)}$)?""")
let matches = xrefRegex.Match(v.Trim()).Groups
{
Name = matches.Item("xrefName") .Value |> String.trim
Description = matches.Item("xrefDescription") .Value |> String.trim
Modifiers = matches.Item("xrefModifiers") .Value |> String.trim
}

/// Returns the corresponding CvTerm of the DBXref with empty name.
member this.ToCvTerm() = {
Name = ""
Accession = this.Name
RefUri =
String.split ':' this.Name
Member

Is this reliable? Will all TANs have this shape? Just curious; I don't know the answer myself.

Member Author

The OBO Flat File Format section defines it this way, but only with SHOULD.
I should therefore add a `try ... with` there. 😉

Member Author

Never mind, `String.split` does not throw if the given delimiter is absent. It just returns an array with the input string as its single item:

String.split ':' "klsdklsd"

val it: string[] = [|"klsdklsd"|]

Therefore, no need to do anything here, and I'm fine with this output for IDs that don't follow the format recommendations.
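
The same behaviour can be sketched with the standard .NET `String.Split`, used here as a stand-in for FSharpAux's `String.split` on the assumption that both behave alike when the delimiter is missing. The helper name `idPrefix` is hypothetical.

```fsharp
/// Returns the part of an ID before the first ':' (the whole ID if there is none).
/// Uses .NET's String.Split as a stand-in for FSharpAux's String.split.
let idPrefix (termId : string) =
    termId.Split(':')
    |> Array.head
    |> fun s -> s.Trim()

let withPrefix = idPrefix "GO:0008150"    // "GO"
let withoutPrefix = idPrefix "klsdklsd"   // "klsdklsd"
```

`Array.head` is safe here because splitting always yields at least one element, even for an empty input string.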

|> Array.head
|> String.trim
}


/// Functions for working with DBXrefs.
module DBXref =
@@ -37,16 +61,15 @@ module DBXref =

//Note that the trailing modifiers (like all trailing modifiers) do not need to be decoded or round-tripped by parsers; trailing modifiers can always be optionally ignored. However, all parsers must be able to gracefully ignore trailing modifiers. It is important to recognize that lines which accept a dbxref list may have a trailing modifier for each dbxref in the list, and another trailing modifier for the line itself.

// EXAMPLE (taken from GO_Slim Agr): "xref: RO:0002093"

let trimComment (line : string) =
line.Split('!').[0].Trim()

let private xrefRegex =
Text.RegularExpressions.Regex("""(?<xrefName>^([^"{])*)(\s?)(?<xrefDescription>\"(.*?)\")?(\s?)(?<xrefModifiers>\{(.*?)}$)?""")

[<Obsolete "Use `DBXref.ofString` instead">]
let parseDBXref (v : string) =
let matches = xrefRegex.Match(v.Trim()).Groups
{
Name = matches.Item("xrefName").Value
Description = matches.Item("xrefDescription").Value
Modifiers = matches.Item("xrefModifiers").Value
}
DBXref.ofString v

/// Creates a CvTerm (with an empty name) of a given DBXref.
let toCvTerm (dbxref : DBXref) =
dbxref.ToCvTerm()
1 change: 1 addition & 0 deletions src/OBO.NET/OBO.NET.fsproj
@@ -38,6 +38,7 @@

<ItemGroup>
<PackageReference Include="ARCtrl.ISA" Version="1.0.0-beta.7" />
<PackageReference Include="ControlledVocabulary" Version="1.0.0" />
<PackageReference Include="Microsoft.SourceLink.GitHub" Version="[1.1.1]" PrivateAssets="All" />
<PackageReference Include="FSharpAux" Version="[2.0.0]" />
</ItemGroup>
41 changes: 41 additions & 0 deletions src/OBO.NET/OboOntology.fs
@@ -519,6 +519,47 @@ type OboOntology =
static member getSynonyms term (onto : OboOntology) =
onto.GetSynonyms term

/// <summary>Checks if the given terms are treated as equivalent in the OboOntology.</summary>
/// <remarks>Note that term1 must be part of the OboOntology while term2 must be part of another ontology.</remarks>
Member

I think this is fine for now, but we should eventually either add a check here or add item access to the terms in the ontology, e.g. ontology["termName"] should return the relevant term, so this method just needs to be this.AreTermsEquivalent(termName : string, externalTerm : OboTerm) = ....

Requiring one term to be from the ontology without checking or enforcing it invites unintended behavior IMO, but as said, fine for now.
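
One possible shape of the lookup-based variant suggested above, as a rough sketch only. `Term`, `Ontology` and `TryGetTerm` are hypothetical simplified stand-ins, not the OBO.NET types, and returning `false` for an unknown ID is just one assumption about how a missing term could be handled.

```fsharp
/// Simplified stand-ins (hypothetical) for OboTerm and OboOntology.
type Term = { Id : string; Xrefs : string list }

type Ontology =
    { Terms : Term list; TreatXrefsAsEquivalents : string list }

    /// Looks up a term of this ontology by ID.
    member this.TryGetTerm (termId : string) : Term option =
        this.Terms |> List.tryFind (fun t -> t.Id = termId)

    /// Equivalence check that only accepts an internal term by its ID,
    /// so a foreign term cannot be passed in the first position.
    member this.AreTermsEquivalent (termId : string, externalTerm : Term) : bool =
        match this.TryGetTerm(termId) with
        | None -> false
        | Some term ->
            let prefix = externalTerm.Id.Split(':').[0].Trim()
            List.contains externalTerm.Id term.Xrefs
            && List.contains prefix this.TreatXrefsAsEquivalents

let onto =
    { Terms = [ { Id = "id:5"; Xrefs = [ "check:1" ] } ]
      TreatXrefsAsEquivalents = [ "check" ] }
let foreignTerm = { Id = "check:1"; Xrefs = [] }

let known = onto.AreTermsEquivalent("id:5", foreignTerm)
let unknown = onto.AreTermsEquivalent("id:99", foreignTerm)
```

An `Option` or `Result` return type would make the "term not in ontology" case distinguishable from "not equivalent"; the boolean is used here only to keep the sketch short.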

member this.AreTermsEquivalent(term1 : OboTerm, term2 : OboTerm) =
term1.Xrefs
|> List.exists (
fun x1 ->
term2.Id = x1.Name
&&
this.TreatXrefsAsEquivalents
|> List.exists (fun t -> t = (term2.ToCvTerm()).RefUri)
)

/// <summary>Checks if the given terms are treated as equivalent in the given OboOntology.</summary>
/// <remarks>Note that term1 must be part of the given OboOntology while term2 must be part of another ontology.</remarks>
static member areTermsEquivalent term1 term2 (onto : OboOntology) =
onto.AreTermsEquivalent(term1, term2)

/// Returns all terms of the OboOntology that have equivalent terms in a given second OboOntology.
member this.ReturnAllEquivalentTerms(onto : OboOntology) =
if List.exists (fun x -> x = onto.Terms.Head.ToCvTerm().RefUri) this.TreatXrefsAsEquivalents then
this.Terms
|> Seq.filter (
fun t ->
List.isEmpty t.Xrefs |> not
&&
t.Xrefs
|> List.exists (
fun x ->
onto.Terms
|> List.exists (
fun t2 ->
(DBXref.toCvTerm x).Accession = t2.Id
)
)
)
else Seq.empty

/// Returns all terms of the first given OboOntology that have equivalent terms in the second given OboOntology.
static member returnAllEquivalentTerms (onto1 : OboOntology) onto2 =
onto1.ReturnAllEquivalentTerms(onto2)


type OboTermDef =
{
18 changes: 16 additions & 2 deletions src/OBO.NET/OboTerm.fs
@@ -3,10 +3,12 @@
open DBXref
open TermSynonym

open ARCtrl.ISA

open System

open ARCtrl.ISA
Member

Out of curiosity, where is this reference needed?

Member Author

There are IsaOntologyAnnotation functions that use this. I think it's mainly used in ISA.NET and some subsequent projects, but that was more @HLWeil's business.

open ControlledVocabulary
open FSharpAux


/// Models the entities in an OBO Ontology.
type OboTerm =
@@ -335,7 +337,7 @@
propertyValues builtIn createdBy creationDate

| "xref" | "xref_analog" | "xref_unk" ->
let v = (split.[1..] |> String.concat ": ") |> parseDBXref

Check warning on line 340 in src/OBO.NET/OboTerm.fs (GitHub Actions / build-and-test-linux, build-and-test-windows):
This construct is deprecated. Use `DBXref.ofString` instead

OboTerm.fromLines verbose en (lineNumber + 1)
id name isAnonymous altIds definition comment subsets synonyms (v::xrefs) isA
intersectionOf unionOf disjointFrom relationships isObsolete replacedby consider
@@ -540,6 +542,18 @@
static member getRelatedTermIds (term : OboTerm) =
term.GetRelatedTermIds()

/// Returns the corresponding CvTerm of the OboTerm.
member this.ToCvTerm() =
let tsr =
String.split ':' this.Id
|> Array.head
|> String.trim
CvTerm.create(this.Id, this.Name, tsr)

/// Returns the corresponding CvTerm of the given OboTerm.
static member toCvTerm (term : OboTerm) =
term.ToCvTerm()


/// Representation of the relation an OboTerm can have with other OboTerms.
type TermRelation<'a> =
35 changes: 35 additions & 0 deletions tests/OBO.NET.Tests/DBXref.Tests.fs
@@ -0,0 +1,35 @@
namespace OBO.NET.Tests


open OBO.NET

open Expecto
open ControlledVocabulary


module DBXref =

[<Tests>]
let dbxrefTests =
testList "DBXref" [

let testDBXref = DBXref.ofString """test:1 "testDesc" {testMod}"""

testList "ofString" [
testCase "returns correct DBXref" <| fun _ ->
let expected = {Name = "test:1"; Description = "\"testDesc\""; Modifiers = "{testMod}"}
Expect.equal testDBXref.Name expected.Name "Name does not match"
Expect.equal testDBXref.Description expected.Description "Description does not match"
Expect.equal testDBXref.Modifiers expected.Modifiers "Modifiers do not match"
]

testList "ToCvTerm" [
testCase "returns correct CvTerm" <| fun _ ->
let actual = testDBXref.ToCvTerm()
let expected = CvTerm.create("test:1", "", "test")
Expect.equal actual.Accession expected.Accession "TANs are not equal"
Expect.equal actual.RefUri expected.RefUri "TSRs are not equal"
Expect.equal actual.Name expected.Name "Names are not equal"
]

]
3 changes: 1 addition & 2 deletions tests/OBO.NET.Tests/OBO.NET.Tests.fsproj
@@ -12,6 +12,7 @@
<None Include="References\IncorrectHeaderTags.obo" />
<None Include="References\DuplicateHeaderTags.obo" />
<None Include="references\CorrectHeaderTags.obo" />
<Compile Include="DBXref.Tests.fs" />
<Compile Include="TermSynonym.Tests.fs" />
<Compile Include="OboTerm.Tests.fs" />
<Compile Include="OboOntology.Tests.fs" />
@@ -32,6 +33,4 @@
<PackageReference Update="FSharp.Core" Version="7.0.401" />
</ItemGroup>

<ItemGroup />

</Project>
22 changes: 20 additions & 2 deletions tests/OBO.NET.Tests/OboOntology.Tests.fs
@@ -36,14 +36,20 @@ module OboOntologyTests =
let testTerm4 =
OboTerm.Create(
"id:5",
Name = "testTerm4"
Name = "testTerm4",
Xrefs = [DBXref.ofString "check:1"]
)
let testTerm5 =
OboTerm.Create(
"id:6",
Name = "testTerm5",
Synonyms = [TermSynonym.parseSynonym None 0 "\"testTerm1\" EXACT []"; TermSynonym.parseSynonym None 1 "\"testTerm2\" BROAD []"; TermSynonym.parseSynonym None 2 "\"testTerm0\" NARROW []"]
)
let testTerm6 =
OboTerm.Create(
"check:1",
Name = "checkTerm1"
)

let testFile1Path = Path.Combine(__SOURCE_DIRECTORY__, "References", "CorrectHeaderTags.obo")
let testFile2Path = Path.Combine(__SOURCE_DIRECTORY__, "References", "IncorrectHeaderTags.obo")
@@ -125,7 +131,8 @@ module OboOntologyTests =
Expect.equal (Option.map (fun o -> o.TypeDefs) testFile1) typedefsExpected "Terms did not match"
]

let testOntology = OboOntology.Create([testTerm1; testTerm2; testTerm3; testTerm4; testTerm5], [], "")
let testOntology = OboOntology.Create([testTerm1; testTerm2; testTerm3; testTerm4; testTerm5], [], "", TreatXrefsAsEquivalents = ["check"])
let testOntology2 = OboOntology.Create([testTerm6], [], "")

testList "GetRelatedTerms" [
testCase "returns correct related terms" <| fun _ ->
@@ -171,4 +178,15 @@ module OboOntologyTests =
let expected = seq {Exact, testTerm5, Some testTerm1; Broad, testTerm5, Some testTerm2; Narrow, testTerm5, None}
Expect.sequenceEqual actual expected "is not equal"
]

testList "AreTermsEquivalent" [
testCase "checks equivalence correctly" <| fun _ ->
Expect.isTrue (testOntology.AreTermsEquivalent(testTerm4, testTerm6)) "is not equal"
]

testList "ReturnAllEquivalentTerms" [
testCase "returns correct terms" <| fun _ ->
let actual = testOntology.ReturnAllEquivalentTerms(testOntology2)
Expect.sequenceEqual actual (seq {testTerm4}) "is not equal"
]
]
11 changes: 11 additions & 0 deletions tests/OBO.NET.Tests/OboTerm.Tests.fs
@@ -2,10 +2,13 @@ namespace OBO.NET.Tests

open Expecto
open OBO.NET
open ControlledVocabulary


module OboTermTests =

let testTerm1 = OboTerm.Create("test:001", Name = "TestTerm1")

[<Tests>]
let oboTermTest =
testList "OboTerm" [
@@ -35,4 +38,12 @@ module OboTermTests =
]
Expect.sequenceEqual actual expected ""
]
testList "ToCvTerm" [
testCase "returns correct CvTerm" <| fun _ ->
let actual = OboTerm.toCvTerm testTerm1
let expected = CvTerm.create("test:001", "TestTerm1", "test")
Expect.equal actual.RefUri expected.RefUri "TSRs are different"
Expect.equal actual.Name expected.Name "Names are different"
Expect.equal actual.Accession expected.Accession "TANs are different"
]
]