Add Explanation of RegEx and Reason for AASd-130 #381
Open: g1zzm0 wants to merge 8 commits into IDTA-01001-3-1_working from g1zzm0/Clarify_AASd13
Commits:
* fe5b595 Add Explenation of RegEx and Reson for AASd 130 in natural language (g1zzm0)
* 08bf571 Short Definiton (g1zzm0)
* 99f3078 Reformation (g1zzm0)
* a114a6c Remove Whitespace (g1zzm0)
* 5cb5ff3 Spellcheck and grammar improvement (g1zzm0)
* 9cea09a Update IDTA-01001_Metamodel_Constraints.adoc (g1zzm0)
* 1617b53 formulation (BirgitBoss)
* 1a3037f formulation (BirgitBoss)
@@ -70,4 +70,38 @@ Note: The semanticId of a SpecificAssetId with the predefined name "gloablAssetI

{aasd130}

Constraint AASd-130 ensures that encoding and interoperability between different serializations is possible. It corresponds to the restrictions as defined for the XML Schema 1.0footnote:[https://www.w3.org/TR/xml/#charsets].

Therefore, we need to restrict an attribute of data type 'string' to the characters that can be represented in any exchange format and language.
Otherwise, strings in other formats such as JSON could not be converted to XML.
The string contains only valid Unicode characters encoded in UTF-16 format.
The character set of XML includes (given as numerical code points and/or ranges in Unicode):
* 0x09: ASCII horizontal tab,
* 0x0A: ASCII linefeed (newline),
* 0x0D: ASCII carriage return,
* 0x20: ASCII space,
* 0x20 - 0xD7FF: the characters of the Basic Multilingual Plane below the surrogate range,
* 0xE000 - 0xFFFD: the characters of the Basic Multilingual Plane above the surrogate range, and
* 0x00010000 - 0x0010FFFF: all the characters beyond the Basic Multilingual Plane (*e.g.*, emoticons).
The string can include common characters like tabs, newlines, carriage returns, and spaces.
It allows a broad range of Unicode characters, including those beyond the Basic Multilingual Plane (BMP) which are represented using surrogate pairs in UTF-16 encoding.
It ensures that the entire string adheres to the rules of UTF-16 encoding, which is a standard way of representing a wide range of characters from different languages.

This leads to the following regular expression:
^[\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\u00010000-\u0010FFFF]*$
Suggested change
g1zzm0 marked this conversation as resolved.
Where:
^: Asserts the start of the string.
[\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\u00010000-\u0010FFFF]: Defines a character class that allows various Unicode characters, with the following elements:

\x09: ASCII horizontal tab.
Review comment: The following list seems to be redundant to the list above? Remove one of them?
\x0A: ASCII linefeed (newline).
\x0D: ASCII carriage return.
\x20: ASCII space.
-: Represents a range.
\uD7FF: The last code point before the UTF-16 surrogate range (0xD800 - 0xDFFF).
\uE000-\uFFFD: The Basic Multilingual Plane (BMP) code points above the surrogate range, up to 0xFFFD.
\u00010000-\u0010FFFF: The code points beyond the BMP, which are represented as surrogate pairs in UTF-16.
*: Allows for zero or more occurrences of the characters within the character class.

$: Asserts the end of the string.
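For readers who want to try the constraint out, here is a minimal Python sketch of the check described in the diff above. It is an illustration under stated assumptions, not part of the specification: the names `AASD_130_PATTERN` and `satisfies_aasd_130` are hypothetical, the pattern includes the `*` quantifier explained in the "Where" list, and the supplementary range is written with Python's `\U` (eight hex digits) escape because Python's `re` module matches on code points rather than UTF-16 code units.

```python
import re

# Assumption: Python's `re` works on code points, so the range beyond the BMP
# is written as \U00010000-\U0010FFFF instead of the \u00010000 form shown above.
AASD_130_PATTERN = re.compile(
    r"^[\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]*$"
)


def satisfies_aasd_130(text: str) -> bool:
    """Return True if every character of `text` is in the XML 1.0 character set."""
    # fullmatch requires the whole string to match, so no character can be skipped.
    return AASD_130_PATTERN.fullmatch(text) is not None


if __name__ == "__main__":
    samples = ["plain text\twith tab\n", "\U0001F600", "\x00", "a\x0Bb", "\uFFFE"]
    for sample in samples:
        print(repr(sample), satisfies_aasd_130(sample))
```

The empty string is accepted because of the `*` quantifier; control characters such as 0x00 or 0x0B and the code points 0xFFFE and 0xFFFF are rejected.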
Review comment: I think someone will need this
Review comment: Decision was to support what XML Schema 1.0 is supporting. Marko suggests restricting it further, correct?
Review comment: No, I don't think that removing these three lines would restrict anything further, as they are only explanatory for the reader.
Review comment: Remembering our discussion, we could reformulate this one a bit to make it more specific:
"It assumes that the entire string adheres to the rules of UTF-16 encoding, which is the current standard way of representing a wide range of characters from different languages."
As far as I understand the context, a UTF-32-enabled application would represent a file slightly differently, with no surrogate pairs needed, and therefore the regex pattern representing this constraint would need to look different for it. But the whole UTF-16 vs. UTF-32 separation does not affect the constraint itself, only its representation in the schemas.
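To illustrate the point about representation, the sketch below contrasts two ways of writing the same constraint in Python. It is only an illustration under assumptions: all names are hypothetical, `CODE_POINT_PATTERN` stands in for an engine that treats a supplementary character as a single unit, and `CODE_UNIT_PATTERN` stands in for an engine that sees UTF-16 code units, where a supplementary character appears as a high surrogate followed by a low surrogate. The helper `to_utf16_units` simulates the code-unit view by mapping each UTF-16 code unit to one character.

```python
import re

# Code-point view: a character beyond the BMP is one unit.
CODE_POINT_PATTERN = re.compile(
    r"^[\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]*$"
)

# Code-unit view: a character beyond the BMP is a high surrogate
# (0xD800-0xDBFF) followed by a low surrogate (0xDC00-0xDFFF).
CODE_UNIT_PATTERN = re.compile(
    r"^(?:[\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD]|[\uD800-\uDBFF][\uDC00-\uDFFF])*$"
)


def to_utf16_units(text: str) -> str:
    """Return a string with one character per UTF-16 code unit of `text`."""
    # surrogatepass lets lone surrogates through so they can be rejected later.
    data = text.encode("utf-16-le", errors="surrogatepass")
    return "".join(
        chr(int.from_bytes(data[i:i + 2], "little"))
        for i in range(0, len(data), 2)
    )


def valid_by_code_points(text: str) -> bool:
    return CODE_POINT_PATTERN.fullmatch(text) is not None


def valid_by_code_units(text: str) -> bool:
    return CODE_UNIT_PATTERN.fullmatch(to_utf16_units(text)) is not None
```

For well-formed input, both checks are meant to accept and reject the same strings; only the way the pattern is written differs between the two views.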
Review comment: So how about we replace the above sentence with something like a design decision: "For the current versions of the specification, this constraint is represented as a regex pattern expecting UTF-16 compliant applications"?
Review comment: "For the current versions"? What does this mean? It is not clear to me what we really request and expect (in the future and today).
Review comment: Let me try another formulation:
"Note: The constraint AASd-130 is represented as a regex pattern expecting UTF-16 compliant applications. It might be necessary to adjust this pattern for UTF-32 compliant applications in future versions of this specification."