Puppeteer

  • Puppeteer is pronounced [puppet-'teer] and means "a person who operates puppets".

References:

Getting Started {: #getting-started }

Architecture ??

Overview - puppeteer/api.md at master · GoogleChrome/puppeteer #ril

Wait For ??

Console Log ??

  • puppeteer.launch([options]) - puppeteer/api.md at master · GoogleChrome/puppeteer: dumpio <boolean> - whether to pipe the browser's stdout/stderr into process.stdout and process.stderr. For example, with puppeteer.launch({dumpio: true}) the console.log() output from the page shows up "mixed" into the other output messages.

  • See console.log from inside the browser - Advanced web spidering with Puppeteer (2018-07-18)

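    • console.log() in the node script prints to the shell, but console.log() inside Page.evaluate() runs in the browser context, so it only appears in the browser console and is not visible in the shell.

    • With page.on you can set up a hook so that console.log() calls in the browser are also re-logged to the shell.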

      page.on("console", msg => {
        console.log("The whole message:", msg.text());
        console.log("\nEach argument:");
        for (let arg of msg.args()) {
          // arg is a JSHandle; jsonValue() returns a Promise that resolves to its value
          // https://pptr.dev/#?product=Puppeteer&show=api-class-jshandle
          arg.jsonValue().then(v => {
            console.log(v);
          });
        }
      });
      
  • page.console

    • Emitted when JavaScript within the page calls one of console API methods, e.g. console.log or console.dir. Also emitted if the page THROWS AN ERROR OR A WARNING. This captures both console.xxx() calls and errors (confirmed by experiment) => with messages this varied, it may be worth writing them out as JSON for other platforms to read...
    • The arguments passed into console.log appear as ARGUMENTS on the event handler.
    • ConsoleMessage #ril
    • JSHandle: ConsoleMessage.args() has type Array<JSHandle>; judging from the word "handle", it seems possible to indirectly manipulate objects inside the browser??
  • How to get all console messages with puppeteer? including errors, CSP violations, failed resources, etc - Stack Overflow: several event listeners have to be registered to capture all of the output #ril
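
The notes above raise two ideas: several listeners are needed to see everything a page emits, and the messages could be written out as JSON. A minimal sketch of both follows. `relayPageOutput` and its `sink` callback are hypothetical names introduced here for illustration; `console`, `pageerror`, and `requestfailed` are page events from the Puppeteer API. A plain `EventEmitter` stands in for the page so the sketch runs without a browser.

```javascript
const { EventEmitter } = require("events");

// Hypothetical helper (not part of Puppeteer): forwards several kinds of
// page output to `sink` as one JSON string per event.
function relayPageOutput(page, sink = line => console.log(line)) {
  const emit = record => sink(JSON.stringify(record));

  // console.xxx() calls made inside the page
  page.on("console", msg => {
    emit({ source: "console", type: msg.type(), text: msg.text() });
  });

  // uncaught exceptions thrown inside the page
  page.on("pageerror", err => {
    emit({ source: "pageerror", text: err.message });
  });

  // requests that failed, e.g. a blocked or unreachable resource
  page.on("requestfailed", request => {
    const failure = request.failure();
    emit({
      source: "requestfailed",
      url: request.url(),
      text: failure ? failure.errorText : null,
    });
  });
}

// Stand-in for a real page (assumption: same event names as Puppeteer's Page).
const fakePage = new EventEmitter();
const lines = [];
relayPageOutput(fakePage, line => lines.push(line));
fakePage.emit("pageerror", new Error("boom"));
console.log(lines[0]); // {"source":"pageerror","text":"boom"}
```

With a real browser, the same helper would be called as `relayPageOutput(page)` right after `browser.newPage()` and before `page.goto()`, so that early messages are not missed.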

Installation and Setup {: #setup }

  • Installation - Quick start  |  Tools for Web Developers  |  Google Developers
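    • Install the puppeteer package with npm/yarn; the install automatically downloads a recent version of Chromium. The PUPPETEER_SKIP_CHROMIUM_DOWNLOAD environment variable skips this step, but the Puppeteer documentation repeatedly stresses that "Puppeteer is only guaranteed to work with the bundled Chromium, use at your own risk".
    • Since Puppeteer 1.7, a lightweight puppeteer-core package is also published. It does not download Chromium by default; it drives a browser that is already installed, or connects to a remote one.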
    • Puppeteer requires at least Node v6.4.0, but the examples below use async/await which is only supported in Node v7.6.0 or greater.
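
Because puppeteer-core ships without a browser, the script has to say where the browser is. A minimal sketch of that choice follows. `openBrowser` and the environment variable names `BROWSER_WS_ENDPOINT` and `CHROME_PATH` are hypothetical and chosen only for illustration; `launch({executablePath})` and `connect({browserWSEndpoint})` are options from the Puppeteer API. The module is passed in as an argument so the sketch itself has no install-time dependency.

```javascript
// Hypothetical helper: connect to a remote browser if an endpoint is given,
// otherwise launch a locally installed one. `puppeteer` is expected to be
// the object returned by require("puppeteer-core").
async function openBrowser(puppeteer, env = process.env) {
  if (env.BROWSER_WS_ENDPOINT) {
    // Attach to a browser that is already running, possibly on another host.
    return puppeteer.connect({ browserWSEndpoint: env.BROWSER_WS_ENDPOINT });
  }
  if (env.CHROME_PATH) {
    // Start the browser binary that is already installed on this machine.
    return puppeteer.launch({ executablePath: env.CHROME_PATH });
  }
  // Neither variable is set; both names are assumptions of this sketch.
  throw new Error("Set BROWSER_WS_ENDPOINT or CHROME_PATH");
}
```

A caller would then write something like `const browser = await openBrowser(require("puppeteer-core"));` inside an async function.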

error while loading shared libraries: libX11-xcb.so.1 (Debian) ??

(node:6) UnhandledPromiseRejectionWarning: Error: Failed to launch chrome!
/workspace/node_modules/puppeteer/.local-chromium/linux-588429/chrome-linux/chrome: error while loading shared libraries: libX11-xcb.so.1: cannot open shared object file: No such file or directory

References {: #reference }

Community:

Manuals: