Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: shadow dom selection #296

Merged
merged 25 commits into from
Jan 4, 2025
Merged

feat: shadow dom selection #296

merged 25 commits into from
Jan 4, 2025

Conversation

amhsirak
Copy link
Member

@amhsirak amhsirak commented Dec 24, 2024

Summary by CodeRabbit

Release Notes: Shadow DOM Support Enhancement

  • New Features

    • Added comprehensive support for scraping and interacting with Shadow DOM elements across multiple components.
    • Enhanced selector generation and element information retrieval for complex DOM structures.
  • Improvements

    • Introduced advanced querying capabilities for elements within Shadow DOMs.
    • Updated workflow and selector management to handle Shadow DOM scenarios.
    • Improved element selection and information extraction techniques.
  • Technical Enhancements

    • Extended type definitions to support Shadow DOM selector management.
    • Refined element information retrieval for more accurate interactions.

These updates significantly improve the system's ability to handle complex web page structures with Shadow DOM elements.

@amhsirak amhsirak marked this pull request as draft December 24, 2024 18:27
Copy link

coderabbitai bot commented Dec 24, 2024

Walkthrough

This pull request introduces comprehensive support for Shadow DOM interactions across multiple components of the application. The changes span from the browser-side scraping mechanism to server-side selector management and client-side component handling. The primary focus is enhancing the ability to query, select, and interact with elements nested within Shadow DOM structures, adding more robust element detection and selection capabilities across the entire application stack.

Changes

File Change Summary
maxun-core/src/browserSide/scraper.js Added findAllElements function for Shadow DOM querying, updated scrapeSchema and scrapeList to support Shadow DOM element selection.
maxun-core/src/interpret.ts Updated scrapeSchema method signature, added removeShadowSelectors method to handle Shadow DOM selectors.
server/src/types/index.ts Introduced ShadowSelector interface, updated Selectors and BaseAction interfaces to support Shadow DOM selector information.
server/src/workflow-management/selector.ts Added getDeepestElementFromPoint function, enhanced element information retrieval for Shadow DOM elements.
server/src/workflow-management/utils.ts Modified getBestSelectorForAction to prioritize Shadow DOM selectors.
src/components/organisms/BrowserWindow.tsx Updated highlighterHandler and handleClick to support Shadow DOM element interactions.
src/context/browserSteps.tsx Added shadow property to SelectorObject interface.

Possibly related PRs

Suggested Labels

Type: Enhancement, Status: Work In Progress

Poem

🐰 Shadows dance, elements hide,
In DOM's deep roots, secrets reside.
With piercing queries, we now explore,
Each nested realm, from core to core!
Scraping magic, Shadow's delight! 🔍


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3f54f11 and e91a391.

📒 Files selected for processing (1)
  • src/components/organisms/BrowserWindow.tsx (7 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/components/organisms/BrowserWindow.tsx

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@naveenpan09 naveenpan09 added the Type: Feature New features label Dec 25, 2024
@amhsirak amhsirak marked this pull request as ready for review January 2, 2025 12:13
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🔭 Outside diff range comments (1)
server/src/workflow-management/selector.ts (1)

Line range hint 26-102: Unify repeated Shadow DOM lookups
The new getDeepestElementFromPoint function helps traverse shadow roots effectively. However, note that you define a nearly identical helper function again in this file (and other files). Consider centralizing this logic in one utility to prevent duplication.

🧹 Nitpick comments (9)
src/components/organisms/BrowserWindow.tsx (3)

123-129: Replace console logs with a more controlled logging approach
Multiple debug statements (e.g., "LIST SELEECTORRRRR") can clutter production logs. Consider removing or wrapping them with a conditional check (e.g., if (process.env.NODE_ENV === 'development')) or switching to a more formal logger to prevent noise in production.


142-164: Consider parsing Shadow DOM selectors with a more robust approach
The new logic for handling pure Shadow DOM elements and mixed selectors (>>) is functional but may break if the syntax changes or if deeper nesting occurs. A structured parser for Shadow DOM segments (e.g., a dedicated helper function) can help reduce partial matches or false positives and improve maintainability.


217-217: Rename the shadow property for clarity
Using the property name shadow to store a boolean could be confusing. Consider renaming it to something like isShadowElement to convey the intention more clearly.

maxun-core/src/browserSide/scraper.js (1)

191-335: Improve maintainability of scrapeSchema and findAllElements
The additions for shadow DOM support (lines 206–247) and the fallback logic (lines 313–335) look comprehensive. However, consider extracting the repetitive code (e.g., multi-step splitting and checking for open shadow roots) into smaller helper functions to keep each function concise. This also aids in testability and future enhancements, such as supporting closed shadow roots or more complex nested selectors.

server/src/workflow-management/selector.ts (4)

Line range hint 107-219: Consolidate list-mode element traversal
The second block for list-aware traversal reimplements similar logic from the earlier block. If feasible, unify or parameterize the approach instead of having two slightly different extraction code paths for list and non-list steps.


Line range hint 240-286: Ensure consistency in bounding rectangle calculation
Using the same shadow DOM lookup inside getRect is great for consistency. Watch for possible performance overhead if high-frequency calls rely on repeated shadow root re-traversal. Caching or memoization might help if this function is called repeatedly for the same coordinates.


Line range hint 292-360: Reduce code duplication for the else block
The logic for parent-element bounding checks repeats the approach used earlier. Extracting into a shared helper or consolidating the bounding logic can reduce potential maintenance overhead.


1079-1111: Optimize repeated getShadowPath calls
The getNonUniqueSelectors function does a great job generating partial selectors in shadow roots. Consider caching shadow paths if performance becomes an issue for large pages.

Would you like help creating a simple caching mechanism for repeated shadow path lookups?

server/src/workflow-management/classes/Generator.ts (1)

747-747: Enhance DRY principle for socket emission.

You're repeating this.socket.emit('highlighter'... logic. Consider extracting a helper to avoid duplication and centralize potential updates to the emitted payload.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2fff0d2 and c287340.

📒 Files selected for processing (8)
  • maxun-core/src/browserSide/scraper.js (2 hunks)
  • maxun-core/src/interpret.ts (2 hunks)
  • server/src/types/index.ts (3 hunks)
  • server/src/workflow-management/classes/Generator.ts (1 hunks)
  • server/src/workflow-management/selector.ts (14 hunks)
  • server/src/workflow-management/utils.ts (2 hunks)
  • src/components/organisms/BrowserWindow.tsx (7 hunks)
  • src/context/browserSteps.tsx (1 hunks)
🧰 Additional context used
🪛 Biome (1.9.4)
maxun-core/src/browserSide/scraper.js

[error] 430-430: Change to an optional chain.

Unsafe fix: Change to an optional chain.

(lint/complexity/useOptionalChain)


[error] 480-480: The assignment should not be in an expression.

The use of assignments in expressions is confusing.
Expressions are often considered as side-effect free.

(lint/suspicious/noAssignInExpressions)

🔇 Additional comments (20)
src/components/organisms/BrowserWindow.tsx (5)

16-16: Introduce a small doc comment for the new prop
Adding isShadowRoot?: boolean; is a solid enhancement that helps differentiate between standard DOM elements and Shadow DOM elements. Consider adding a short JSDoc comment explaining its usage for future maintainers.


225-225: No specific concerns
Creating a separate field to store element info is a straightforward approach. No changes needed here.


262-262: Same naming observation as earlier
The shadow field is repeated for attribute definitions. This is consistent with previous usage but the naming comment still applies.


310-310: Same naming observation as earlier
This is identical usage of the shadow boolean property as above.


323-323: Same naming observation as earlier
Again, this directly mirrors the previous references to shadow.

maxun-core/src/browserSide/scraper.js (2)

430-430: ⚠️ Potential issue

Apply optional chaining to prevent runtime errors
The static analysis suggests using an optional chain. For example:

- const shadowContent = element.shadowRoot.textContent;
+ const shadowContent = element.shadowRoot?.textContent;

This ensures safe property access if shadowRoot is absent or closed.

Likely invalid or redundant comment.

🧰 Tools
🪛 Biome (1.9.4)

[error] 430-430: Change to an optional chain.

Unsafe fix: Change to an optional chain.

(lint/complexity/useOptionalChain)


480-480: Confirm suspicious assignment in expression
Static analysis indicates there may be an assignment within an expression around line 480 (e.g., while (sibling = sibling.previousElementSibling)). Such patterns can be confusing. You can refactor to a more explicit approach, for instance:

- while (sibling = sibling.previousElementSibling) {
-   index++;
- }
+ while (sibling?.previousElementSibling) {
+   sibling = sibling.previousElementSibling;
+   index++;
+ }
🧰 Tools
🪛 Biome (1.9.4)

[error] 480-480: The assignment should not be in an expression.

The use of assignments in expressions is confusing.
Expressions are often considered as side-effect free.

(lint/suspicious/noAssignInExpressions)

server/src/workflow-management/utils.ts (2)

15-19: Prioritize shadow selector when present
Returning selectors.shadowSelector.full early is a practical move to ensure Shadow DOM elements are targeted correctly. Make sure all references to shadowSelector are handled consistently, especially in fallback logic.


83-87: Likewise, prioritize shadow selector for input actions
Repeating the same approach for Input and Keydown actions is consistent and ensures Shadow DOM gets recognized across all interaction types.

server/src/types/index.ts (3)

132-135: Good introduction of a dedicated ShadowSelector interface
Defining a specific interface for shadow DOM-related selectors clarifies usage. This approach scales better than forcing shadow-specific properties into a generic interface.


151-151: New shadowSelector property
Adding shadowSelector to Selectors is consistent with the rest of the changes for shadow DOM handling. Ensure all consumer code checks for null before accessing its properties.


165-165: Refined type for selectors
Replacing the generic object with the Selectors interface helps maintain a stronger type contract when dealing with action lookups, especially now that shadow selectors are in use.

src/context/browserSteps.tsx (1)

35-35: Great addition for shadow DOM support.

The optional shadow?: boolean; property aligns well with the new shadow DOM handling. Ensure all existing references to SelectorObject also consider this boolean flag to avoid runtime errors.

maxun-core/src/interpret.ts (4)

406-406: Expanded schema parameter for Shadow DOM.

The new shadow field in the schema parameter enables better shadow DOM scraping. Make sure that downstream code using scrapeSchema properly handles this field and provides default handling when shadow is absent or null.


666-666: No substantial change.

This line appears to be a reformatting or minor structural change (closing brace) with no functional impact. No further action needed.


669-680: Possible conflict with the shadow DOM approach.

removeShadowSelectors strips out selectors containing >>, which often represent shadow boundary piercers for shadow DOM. This is potentially at odds with the new shadow DOM enhancements. Kindly verify if eliminating all “>>” selectors is intended. Consider selectively preserving them where shadow DOM is desired.

Do you want help creating a revised approach that retains valid shadow DOM selectors?


684-687: Check alignment with newly introduced shadow DOM features.

Removing shadow selectors from the workflow at this stage might break the intended shadow DOM interactions. Confirm that this meets your overall design or if you need to refine the approach to selectively keep them.

server/src/workflow-management/classes/Generator.ts (3)

733-743: Well-structured data object for shadow DOM highlighting.

The addition of shadowInfo within highlighterData nicely centralizes shadow-related details. Ensure front-end handlers are prepared to consume mode and content and that they degrade gracefully if these properties are absent.


749-749: Consistent approach for non-childSelectors scenario.

Good usage of spread syntax for composing the highlighterData. Confirm that you don’t miss other potential properties needed for shadow DOM highlighting in this path.


752-752: Confirmed approach for standard highlight emission.

Emitting the same highlighter event ensures consistency with the existing flow. Continue verifying that front-end consumers handle all relevant shadow DOM data.

Comment on lines 350 to 422
// Shadow DOM query functions remain unchanged
const queryShadowDOM = (rootElement, selector) => {
if (!selector.includes('>>')) {
return rootElement.querySelector(selector);
}

const parts = selector.split('>>').map(part => part.trim());
let currentElement = rootElement;

while (scrapedData.length < limit) {
let parentElements = Array.from(document.querySelectorAll(listSelector));
for (let i = 0; i < parts.length; i++) {
if (!currentElement) return null;

if (!currentElement.querySelector && !currentElement.shadowRoot) {
currentElement = document.querySelector(parts[i]);
continue;
}

let nextElement = currentElement.querySelector(parts[i]);

if (!nextElement && currentElement.shadowRoot) {
nextElement = currentElement.shadowRoot.querySelector(parts[i]);
}

if (!nextElement) {
const allChildren = Array.from(currentElement.children || []);
for (const child of allChildren) {
if (child.shadowRoot) {
nextElement = child.shadowRoot.querySelector(parts[i]);
if (nextElement) break;
}
}
}

currentElement = nextElement;
}

return currentElement;
};

const queryShadowDOMAll = (rootElement, selector) => {
if (!selector.includes('>>')) {
return rootElement.querySelectorAll(selector);
}

const parts = selector.split('>>').map(part => part.trim());
let currentElements = [rootElement];

// If we only got one element or none, try a more generic approach
if (limit > 1 && parentElements.length <= 1) {
const [containerSelector, _] = listSelector.split('>').map(s => s.trim());
const container = document.querySelector(containerSelector);
for (const part of parts) {
const nextElements = [];

if (container) {
const allChildren = Array.from(container.children);
for (const element of currentElements) {
if (element.querySelectorAll) {
nextElements.push(...element.querySelectorAll(part));
}

if (element.shadowRoot) {
nextElements.push(...element.shadowRoot.querySelectorAll(part));
}

const firstMatch = document.querySelector(listSelector);
if (firstMatch) {
// Get classes from the first matching element
const firstMatchClasses = Array.from(firstMatch.classList);
const children = Array.from(element.children || []);
for (const child of children) {
if (child.shadowRoot) {
nextElements.push(...child.shadowRoot.querySelectorAll(part));
}
}
}

currentElements = nextElements;
}

return currentElements;
};

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Refactor for optional chaining in extractValue and unify DOM queries

  1. Within extractValue (line 428 onward), you're using direct property access like element.shadowRoot.textContent. To avoid potential runtime errors when shadowRoot is absent, apply optional chaining (e.g., element.shadowRoot?.textContent) or include a safety check.
  2. The code for querying shadow DOM in scrapeList has multiple nested loops. You might further unify them or reduce nesting if that enhances readability (e.g., turning repeated patterns into a single utility).

Copy link
Member Author

@amhsirak amhsirak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RohitR311 resolve merge conflicts

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (3)
maxun-core/src/browserSide/scraper.js (2)

206-247: Consider caching shadow root queries for performance

The shadow DOM traversal logic is well-implemented, but could benefit from caching shadow roots when processing multiple selectors to avoid redundant DOM traversals.

 function findAllElements(config) {
+    // Cache shadow roots to avoid redundant traversals
+    const shadowRootCache = new Map();
+    
     if (!config.shadow || !config.selector.includes('>>')) {
         return Array.from(document.querySelectorAll(config.selector));
     }
 
     const parts = config.selector.split('>>').map(s => s.trim());
     let currentElements = [document];
     
     for (let i = 0; i < parts.length; i++) {
         const part = parts[i];
         const nextElements = [];
         
         for (const element of currentElements) {
+            // Check cache first
+            const cachedRoot = shadowRootCache.get(element);
+            if (cachedRoot) {
+                const targets = Array.from(cachedRoot.querySelectorAll(part));
+                nextElements.push(...targets);
+                continue;
+            }
             
             let targets;
             if (i === 0) {
                 targets = Array.from(element.querySelectorAll(part))
                     .filter(el => {
                         if (i === parts.length - 1) return true;
                         const shadowRoot = el.shadowRoot;
+                        // Cache shadow root if found
+                        if (shadowRoot && shadowRoot.mode === 'open') {
+                            shadowRootCache.set(el, shadowRoot);
+                        }
                         return shadowRoot && shadowRoot.mode === 'open';
                     });
             } else {
                 const shadowRoot = element.shadowRoot;
                 if (!shadowRoot || shadowRoot.mode !== 'open') continue;
+                // Cache shadow root
+                shadowRootCache.set(element, shadowRoot);
                 targets = Array.from(shadowRoot.querySelectorAll(part));
             }
             nextElements.push(...targets);
         }
         
         if (nextElements.length === 0) return [];
         currentElements = nextElements;
     }
 
     return currentElements;
 }

472-478: Add error handling for malformed shadow DOM tables

While the shadow DOM table cell handling works correctly, it should handle cases where the table structure within shadow DOM is malformed.

 if (td.getRootNode() instanceof ShadowRoot) {
     const shadowRoot = td.getRootNode();
-    const allCells = Array.from(shadowRoot.querySelectorAll('td'));
-    return allCells.indexOf(td);
+    try {
+        const allCells = Array.from(shadowRoot.querySelectorAll('td'));
+        const index = allCells.indexOf(td);
+        if (index === -1) {
+            console.warn('Cell not found in shadow DOM table');
+            return 0;
+        }
+        return index;
+    } catch (error) {
+        console.error('Error processing shadow DOM table:', error);
+        return 0;
+    }
 }
server/src/workflow-management/selector.ts (1)

906-963: Validate and optimize shadow DOM selector generation

The shadow DOM selector generation could benefit from selector validation and path optimization.

+const isValidSelector = (selector: string): boolean => {
+    try {
+        document.querySelector(selector);
+        return true;
+    } catch {
+        return false;
+    }
+};
+
+const optimizePath = (path: { host: HTMLElement, root: ShadowRoot, element: HTMLElement }[]): typeof path => {
+    return path.filter((context, index) => {
+        // Keep first and last context, optimize middle ones
+        if (index === 0 || index === path.length - 1) return true;
+        // Keep context if it adds specificity
+        return context.root.mode === 'closed' || context.host.shadowRoot?.mode === 'closed';
+    });
+};
+
 const genSelectorForShadowDOM = (element: HTMLElement) => {
     // Get complete path up to document root
     const getShadowPath = (el: HTMLElement) => {
         const path = [];
         let current = el;
         let depth = 0;
         const MAX_DEPTH = 4;
         
         while (current && depth < MAX_DEPTH) {
             const rootNode = current.getRootNode();
             if (rootNode instanceof ShadowRoot) {
                 path.unshift({
                     host: rootNode.host as HTMLElement,
                     root: rootNode,
                     element: current
                 });
                 current = rootNode.host as HTMLElement;
                 depth++;
             } else {
                 break;
             }
         }
-        return path;
+        return optimizePath(path);
     };
 
     const shadowPath = getShadowPath(element);
     if (shadowPath.length === 0) return null;
 
     try {
         const selectorParts: string[] = [];
         
         // Generate selector for each shadow DOM boundary
         shadowPath.forEach((context, index) => {
             // Get selector for the host element
             const hostSelector = finder(context.host, {
                 root: index === 0 ? document.body : (shadowPath[index - 1].root as unknown as Element)
             });
+            
+            if (!isValidSelector(hostSelector)) {
+                throw new Error(`Invalid selector generated: ${hostSelector}`);
+            }
 
             // For the last context, get selector for target element
             if (index === shadowPath.length - 1) {
                 const elementSelector = finder(element, {
                     root: context.root as unknown as Element
                 });
+                
+                if (!isValidSelector(elementSelector)) {
+                    throw new Error(`Invalid selector generated: ${elementSelector}`);
+                }
+                
                 selectorParts.push(`${hostSelector} >> ${elementSelector}`);
             } else {
                 selectorParts.push(hostSelector);
             }
         });
 
         return {
             fullSelector: selectorParts.join(' >> '),
             mode: shadowPath[shadowPath.length - 1].root.mode
         };
     } catch (e) {
-        console.warn('Error generating shadow DOM selector:', e);
+        console.error('Error generating shadow DOM selector:', e);
+        console.error('Element:', element);
+        console.error('Shadow path:', shadowPath);
         return null;
     }
 };
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c287340 and 3f54f11.

📒 Files selected for processing (3)
  • maxun-core/src/browserSide/scraper.js (7 hunks)
  • server/src/workflow-management/selector.ts (17 hunks)
  • src/components/organisms/BrowserWindow.tsx (7 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/components/organisms/BrowserWindow.tsx
🧰 Additional context used
🪛 Biome (1.9.4)
maxun-core/src/browserSide/scraper.js

[error] 430-430: Change to an optional chain.

Unsafe fix: Change to an optional chain.

(lint/complexity/useOptionalChain)

🔇 Additional comments (1)
maxun-core/src/browserSide/scraper.js (1)

423-434: ⚠️ Potential issue

Add null checks for shadow root access

The shadow root content extraction lacks proper null checks which could lead to runtime errors.

 function extractValue(element, attribute) {
     if (!element) return null;
     
     // Check for shadow root first
-    if (element.shadowRoot) {
+    const shadowRoot = element.shadowRoot;
+    if (shadowRoot && shadowRoot.textContent) {
         const shadowContent = element.shadowRoot.textContent;
         if (shadowContent && shadowContent.trim()) {
             return shadowContent.trim();
         }
     }

Likely invalid or redundant comment.

🧰 Tools
🪛 Biome (1.9.4)

[error] 430-430: Change to an optional chain.

Unsafe fix: Change to an optional chain.

(lint/complexity/useOptionalChain)

Comment on lines +613 to +681
let element = null;
for (let rowIndex = 0; rowIndex < Math.min(processedRows.length, limit); rowIndex++) {
const record = {};
const currentRow = processedRows[rowIndex];

if (cellIndex >= 0) {
const td = currentRow.children[cellIndex];
if (td) {
element = td.querySelector(selector);

if (!element && selector.split(">").pop().includes('td:nth-child')) {
element = td;
}

if (!element) {
const tagOnlySelector = selector.split('.')[0];
element = td.querySelector(tagOnlySelector);
}

if (!element) {
let currentElement = td;
while (currentElement && currentElement.children.length > 0) {
let foundContentChild = false;
for (const child of currentElement.children) {
if (extractValue(child, attribute)) {
currentElement = child;
foundContentChild = true;
break;
}
}
if (!foundContentChild) break;
}
element = currentElement;
}
}
} else {
element = currentRow.querySelector(selector);
}

if (element) {
record[label] = extractValue(element, attribute);
}
}
for (const [label, { selector, attribute, cellIndex }] of Object.entries(tableFields)) {
let element = null;

if (cellIndex >= 0) {
let td = currentRow.children[cellIndex];

// Check shadow DOM for td
if (!td && currentRow.shadowRoot) {
const shadowCells = currentRow.shadowRoot.children;
if (shadowCells && shadowCells.length > cellIndex) {
td = shadowCells[cellIndex];
}
}

if (td) {
element = queryShadowDOM(td, selector);

if (!element && selector.split(">").pop().includes('td:nth-child')) {
element = td;
}

if (!element) {
const tagOnlySelector = selector.split('.')[0];
element = queryShadowDOM(td, tagOnlySelector);
}

if (!element) {
let currentElement = td;
while (currentElement && currentElement.children.length > 0) {
let foundContentChild = false;
for (const child of currentElement.children) {
if (extractValue(child, attribute)) {
currentElement = child;
foundContentChild = true;
break;
}
}
if (!foundContentChild) break;
}
element = currentElement;
}
}
} else {
element = queryShadowDOM(currentRow, selector);
}

if (element) {
record[label] = extractValue(element, attribute);
}
}

if (Object.keys(record).length > 0) {
tableData.push(record);
if (Object.keys(record).length > 0) {
tableData.push(record);
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Refactor complex table row processing logic

The shadow DOM table row processing logic is complex and could be more maintainable if broken down into smaller, focused functions.

+function processTableRows(tableContext) {
+    const rows = [];
+    if (tableContext.shadowRoot) {
+        rows.push(...tableContext.shadowRoot.getElementsByTagName('TR'));
+    }
+    rows.push(...tableContext.getElementsByTagName('TR'));
+    return rows;
+}
+
+function processRowRecord(currentRow, tableFields) {
+    const record = {};
+    for (const [label, { selector, attribute, cellIndex }] of Object.entries(tableFields)) {
+        const element = findElementInRow(currentRow, selector, cellIndex);
+        if (element) {
+            record[label] = extractValue(element, attribute);
+        }
+    }
+    return record;
+}
+
+function findElementInRow(row, selector, cellIndex) {
+    if (cellIndex >= 0) {
+        return findElementInCell(row, selector, cellIndex);
+    }
+    return queryShadowDOM(row, selector);
+}
+
 if (tableContext) {
-    // Get rows from both regular DOM and shadow DOM
-    const rows = [];
-    if (tableContext.shadowRoot) {
-        rows.push(...tableContext.shadowRoot.getElementsByTagName('TR'));
-    }
-    rows.push(...tableContext.getElementsByTagName('TR'));
+    const rows = processTableRows(tableContext);
     
     const processedRows = filterRowsBasedOnTag(rows, tableFields);
     
     for (let rowIndex = 0; rowIndex < Math.min(processedRows.length, limit); rowIndex++) {
-        const record = {};
         const currentRow = processedRows[rowIndex];
-        
-        for (const [label, { selector, attribute, cellIndex }] of Object.entries(tableFields)) {
-            let element = null;
-            // ... existing cell processing logic ...
-            if (element) {
-                record[label] = extractValue(element, attribute);
-            }
-        }
+        const record = processRowRecord(currentRow, tableFields);
 
         if (Object.keys(record).length > 0) {
             tableData.push(record);
         }
     }
 }

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +1475 to +1488
// Function to get all shadow DOM children of an element
function getShadowChildren(element: HTMLElement): HTMLElement[] {
const children: HTMLElement[] = [];

// Check if element has shadow root
const shadowRoot = element.shadowRoot;
if (shadowRoot) {
// Get all elements in the shadow DOM
const shadowElements = Array.from(shadowRoot.querySelectorAll('*')) as HTMLElement[];
children.push(...shadowElements);
}

return children;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Enhance shadow DOM children retrieval

The shadow DOM children retrieval could be improved to handle nested shadow roots and provide more comprehensive element information.

 function getShadowChildren(element: HTMLElement): HTMLElement[] {
-    const children: HTMLElement[] = [];
+    const result: {
+        element: HTMLElement;
+        depth: number;
+        shadowRoot?: ShadowRoot;
+    }[] = [];
     
-    // Check if element has shadow root
-    const shadowRoot = element.shadowRoot;
-    if (shadowRoot) {
-        // Get all elements in the shadow DOM
-        const shadowElements = Array.from(shadowRoot.querySelectorAll('*')) as HTMLElement[];
-        children.push(...shadowElements);
+    function traverse(el: HTMLElement, depth: number = 0) {
+        const shadowRoot = el.shadowRoot;
+        if (shadowRoot) {
+            result.push({ element: el, depth, shadowRoot });
+            
+            // Get immediate children in shadow DOM
+            const shadowChildren = Array.from(shadowRoot.children) as HTMLElement[];
+            shadowChildren.forEach(child => {
+                result.push({ element: child, depth: depth + 1 });
+                traverse(child, depth + 1);
+            });
+        }
+        
+        // Check regular children for shadow roots
+        Array.from(el.children).forEach(child => {
+            traverse(child as HTMLElement, depth);
+        });
     }
     
-    return children;
+    traverse(element);
+    return result.map(r => r.element);
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Function to get all shadow DOM children of an element
function getShadowChildren(element: HTMLElement): HTMLElement[] {
const children: HTMLElement[] = [];
// Check if element has shadow root
const shadowRoot = element.shadowRoot;
if (shadowRoot) {
// Get all elements in the shadow DOM
const shadowElements = Array.from(shadowRoot.querySelectorAll('*')) as HTMLElement[];
children.push(...shadowElements);
}
return children;
}
// Function to get all shadow DOM children of an element
function getShadowChildren(element: HTMLElement): HTMLElement[] {
const result: {
element: HTMLElement;
depth: number;
shadowRoot?: ShadowRoot;
}[] = [];
function traverse(el: HTMLElement, depth: number = 0) {
const shadowRoot = el.shadowRoot;
if (shadowRoot) {
result.push({ element: el, depth, shadowRoot });
// Get immediate children in shadow DOM
const shadowChildren = Array.from(shadowRoot.children) as HTMLElement[];
shadowChildren.forEach(child => {
result.push({ element: child, depth: depth + 1 });
traverse(child, depth + 1);
});
}
// Check regular children for shadow roots
Array.from(el.children).forEach(child => {
traverse(child as HTMLElement, depth);
});
}
traverse(element);
return result.map(r => r.element);
}

Comment on lines +27 to +50
const getDeepestElementFromPoint = (x: number, y: number): HTMLElement | null => {
let element = document.elementFromPoint(x, y) as HTMLElement;
if (!element) return null;

// Traverse through shadow roots
let current = element;
let shadowRoot = current.shadowRoot;

// Keep track of the deepest shadow DOM element found
let deepestElement = current;

while (shadowRoot) {
// Try to find element at same point in shadow DOM
const shadowElement = shadowRoot.elementFromPoint(x, y) as HTMLElement;
if (!shadowElement || shadowElement === current) break;

// Update our tracking of the deepest element
deepestElement = shadowElement;
current = shadowElement;
shadowRoot = current.shadowRoot;
}

return deepestElement;
};
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add robust error handling for shadow DOM traversal

The shadow DOM traversal could be more resilient by handling potential errors and edge cases.

 const getDeepestElementFromPoint = (x: number, y: number): HTMLElement | null => {
+    try {
         let element = document.elementFromPoint(x, y) as HTMLElement;
         if (!element) return null;
 
         // Traverse through shadow roots
         let current = element;
         let shadowRoot = current.shadowRoot;
         
         // Keep track of the deepest shadow DOM element found
         let deepestElement = current;
         
         while (shadowRoot) {
+            try {
                 // Try to find element at same point in shadow DOM
                 const shadowElement = shadowRoot.elementFromPoint(x, y) as HTMLElement;
                 if (!shadowElement || shadowElement === current) break;
                 
                 // Update our tracking of the deepest element
                 deepestElement = shadowElement;
                 current = shadowElement;
                 shadowRoot = current.shadowRoot;
+            } catch (shadowError) {
+                console.warn('Error traversing shadow root:', shadowError);
+                break;
+            }
         }
 
         return deepestElement;
+    } catch (error) {
+        console.error('Error in getDeepestElementFromPoint:', error);
+        return null;
+    }
 };
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const getDeepestElementFromPoint = (x: number, y: number): HTMLElement | null => {
let element = document.elementFromPoint(x, y) as HTMLElement;
if (!element) return null;
// Traverse through shadow roots
let current = element;
let shadowRoot = current.shadowRoot;
// Keep track of the deepest shadow DOM element found
let deepestElement = current;
while (shadowRoot) {
// Try to find element at same point in shadow DOM
const shadowElement = shadowRoot.elementFromPoint(x, y) as HTMLElement;
if (!shadowElement || shadowElement === current) break;
// Update our tracking of the deepest element
deepestElement = shadowElement;
current = shadowElement;
shadowRoot = current.shadowRoot;
}
return deepestElement;
};
const getDeepestElementFromPoint = (x: number, y: number): HTMLElement | null => {
try {
let element = document.elementFromPoint(x, y) as HTMLElement;
if (!element) return null;
// Traverse through shadow roots
let current = element;
let shadowRoot = current.shadowRoot;
// Keep track of the deepest shadow DOM element found
let deepestElement = current;
while (shadowRoot) {
try {
// Try to find element at same point in shadow DOM
const shadowElement = shadowRoot.elementFromPoint(x, y) as HTMLElement;
if (!shadowElement || shadowElement === current) break;
// Update our tracking of the deepest element
deepestElement = shadowElement;
current = shadowElement;
shadowRoot = current.shadowRoot;
} catch (shadowError) {
console.warn('Error traversing shadow root:', shadowError);
break;
}
}
return deepestElement;
} catch (error) {
console.error('Error in getDeepestElementFromPoint:', error);
return null;
}
};

Copy link
Member Author

@amhsirak amhsirak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, will just test a little more before merging

@amhsirak amhsirak added the Status: In Review This PR is being reviewed label Jan 2, 2025
@amhsirak amhsirak merged commit 10bf769 into develop Jan 4, 2025
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: In Review This PR is being reviewed Type: Feature New features
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants