Surprising multi-connector behavior #1351
Comments
It's even crazier.
while
I forgot to mention that both the number of linkages changes, and the cost of the best linkage changes too. The best cost for four-word sentences is achieved with four wall connectors:
which gives me
An interesting phenomenon shows up, described in issue #1351
Random sampling here. Does it matter?
Yes. The costs are crazy. So, with
I get:
Notice the cost. Then
Note the cost. Counting by hand, I see seven + links and seven - links, so I expect a cost of
so that totals to
So the multi-connectors are now working "individually".
OK, so I understand the costs being computed differently. The multi-connector isn't incrementing the cost. Fine. But look at the last word
vs.
Notice how this time, the disjunct on
Ah! OK, I think I understand what is going on. The two- case should be enough to enumerate all possible graphs. There seem to be more, because they are duplicated with different disjunct usage. The multi-connector cost isn't accounted for when multi is used. So I'm thinking everything works correctly; it's just confusing. I guess the way cost is computed is a bug. Is there an obvious way to "fix" the cost for multi-connectors? Good night, I'll sleep on this.
See also
I haven't tried yet to add an option to directly use the chosen disjuncts, but the concept of the current code seems to me solid for now.
Disjunct [0] and disjunct [4] can be considered duplicates (as can [1] and [3]), since disjunct [0] can create linkages similar to those of disjunct [4]. Although the diagrams look the same, the total cost depends on the length of the multi-connector sequence that was used (if the multi-connector costs are not 0).
As you said, this seems fine - on
The current concept of disjunct cost is static (independent of the number of times a multi-connector is used). Instead, maybe we should discard such "duplicate" disjuncts. But the implementation looks very problematic because it is not clear how to handle the various cases (we can discuss that if needed).
OK, so after sleeping, I've convinced myself everything is fine, except for the cost computation. The fully correct behavior is given by the dict
and the above does generate an exhaustive collection of linkages. However, due to the cost computations, the ones with the maximal number of links are NOT presented first! And that is what confused me, and led to this long report.
Yes, that was the whole idea! In one case, the use is minimized; in the other, it is maximized.
To fix cost computations, I think it can be done simply & easily in
I implemented a cost-adjust function that adjusts the total cost by the number of times that a multi-connector is used. It is in my branch; @ampli, take a look and tell me what you think. Getting this to work is low priority: the goal was to maximize the creation of links when they have negative cost. It seems that maximizing the creation of links is useful, to maximally constrain the grammar. I was envisioning that this would be used only by the Atomese dict, only for
The branch: https://github.com/linas/link-grammar/tree/cost-adjust
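To make the idea concrete, here is a rough Python sketch of what such an adjustment could look like. This is not the code in the branch. The names `MultiUse` and `adjusted_cost` are hypothetical, and it assumes the static disjunct cost already counts each multi-connector exactly once, so only the links beyond the first need to be added.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class MultiUse:
    """One multi-connector on a chosen disjunct (hypothetical structure)."""
    cost: float  # cost attached to the connector in the dictionary
    links: int   # number of links it actually formed in this linkage


def adjusted_cost(static_cost: float, uses: list[MultiUse]) -> float:
    """Add the connector cost once for every link beyond the first.

    Assumption: static_cost already includes each multi-connector once,
    regardless of how many links it made.
    """
    extra = sum(u.cost * (u.links - 1) for u in uses if u.links > 1)
    return static_cost + extra


# Two linkages with the same static cost but different multi-connector usage.
one_link = adjusted_cost(-1.0, [MultiUse(cost=-1.0, links=1)])
three_links = adjusted_cost(-1.0, [MultiUse(cost=-1.0, links=3)])
```

With negative connector costs, the linkage that uses the multi-connector three times now gets the lower total, so sorting by adjusted cost would present it first.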
Now that the issue is clearly understood, we can do it correctly. See #1351 for details.
Multi-connectors are behaving in a fashion I do not fully understand. Consider this dict
and then parse the sentence `asdf asdf asdf asdf`. I get this:
and the others are all trees. Not surprising.
When I set
<UNKNOWN-WORD>: <many> and <many>;
then I get
Loops now appear. Not surprising. This is what I wanted. Making the costs negative means that the loopy parses are shown first. But oddly, this is not showing the maximum-possible number of loops!
With
<UNKNOWN-WORD>: <many> and <many> and <many>;
I get
Much better! But wait, there's more: with four `<many>`'s I get
I understand the difference between one and two `<many>`'s, but the change for three and four is surprising. I do not understand that or why it's happening.
It seems that five-word sentences need five `<many>`'s, and so on, to explore all possibilities.