Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

inspector: add initial support for network inspection #53593

Open
wants to merge 17 commits into
base: main
Choose a base branch
from

Conversation

cola119
Copy link
Member

@cola119 cola119 commented Jun 26, 2024

The idea of supporting network inspection in Node.js was first proposed 7 years age in the nodejs/diagnostics#75. Despite numerous discussions, we have yet to settle on an implementation approach. This PR aims to serve as a starting point to explore and refine how we can achieve this feature. This PR introduces basic support for the Network domain of the Chrome DevTools Protocol (CDP) and its corresponding agent implementation in Node.js. Although this is an initial implementation with several pending tasks, it sets a foundation to verify if we are heading in the right direction.

Summary

This description outlines the strategy to support network inspection in Node.js and the design of the APIs that allow third-party libraries to integrate with the network inspection. Specifically, it introduces the NodeNetwork domain, a Node.js-specific extension of the standard Network domain, which supports both client and server application network activities.

User stories

As a client app developer

I want to be able to check the network activities triggered by the client APIs (the http module, fetch API, and WebSocket API) on devtools such as Chrome DevTools when I run the app in debugging mode via node --inspect index.js.

http.get("http://localhost:8080");
fetch("https://localhost:8080");

As a serer app developer

I want to be able to check the network traffics happened in my server on devtools such as Chrome DevTools when I run the app in debugging mode via node --inspect index.js.

As a HTTP client library developer

I want to enable my library to integrate with network inspection. For example, when my library sends a HTTP request and receives a HTTP response, it sends protocol events with debugging data to allow a library user to inspect them on devtools.

class MyHttpClient {
  id;

  send(url, method) {
    inspector.NodeNetwork.requestWillBeSent({ id, url, method });
    // socket.send(url, method);
    // ...
  }

  onReceive(response) {
    inspector.NodeNetwork.responseReceived({ id, response });
  }
}

Design

Tracking network activities

Network activities can be captured within diagnostics_channel hooks. This approach enables us to monitor activities in both core modules (http, https) and external libraries (undici) without changing the core implementation.

Emit protocol events to DevTools

Network activities captured in diagnostics_channel are passed to the inspector agent using the inspector.NodeNetwork API as custom NodeNetwork domain objects. The custom NodeNetwork domain extends the standard Network domain by including Node.js-specific events and commands, enabling more granular and relevant tracking of network activities specific to the Node.js environment. It also allows third-party libraries to integrate with Node.js's inspector mechanism.

The NodeNetwork domain will support both client activities (such as a request sent from the client) and server activities (such as a request received by the server).

domain NodeNetwork
  # Fired when a client is about to send HTTP request.
  event requestWillBeSent
    parameters
      RequestId requestId
      Request request
      MonotonicTime timestamp

  # Fire when a server receives a HTTP request.
  event requestReceived
    parameters
      RequestId requestId
      Request request
      MonotonicTime timestamp

When NodeNetwork event is sent, Node.js internally sends some of the Network domain events to the devtools frontend so that Chrome DevTools can capture them and show network activities in the network panel.

image

Demo

Currenlty, the Node-specific DevTools Frontend lacks a network tab. Therefore, you'll need to use the Chrome DevTools Frontend, accessible via devtools://devtools/bundled/inspector.html. Below is a simple demonstration:

  1. Start Node.js with the inspector and wait for a connection:
$ ./node --inspect-wait --experimental-network-inspection -e "require('https').get('https://nodejs.org/en', (res) => { console.log(res.statusCode); })"
Debugger listening on ws://127.0.0.1:9229/<inspector-websocket-id>
For help, see: https://nodejs.org/en/docs/inspector
  1. Open the Chrome DevTools Frontend and connect to the Node.js inspector
devtools://devtools/bundled/inspector.html?ws=127.0.0.1:9229/<inspector-websocket-id>
  1. Navigate to the network tab to observe network activity.

image

Network activity sources

These APIs can be supported once each diagnostics_channel provides sufficient hook timing and resources.

Scope of this PR

This PR aims to provide a minimal implementation for network inspection, focusing on delivering the fundamental functionalities. The tasks accomplished in this PR include:

  • implementing a network activity tracking mechanism with diagnositcs_channel, targeting http and https GET requests.
  • integrating the tracking mechanism with the inspector agent.
  • Implementing agents for the NodeNetwork domain.
  • Implementing agents for the Network domain.
  • Testing
  • Documentation

Future work

To fully support the Network domain of the CDP, several tasks remain:

  • Complete implementation for all network domain events as specified in the https://chromedevtools.github.io/devtools-protocol/tot/Network/
    • CDP is primarily designed for browsers, but we aim to support as many relevant features as possible in Node.js
    • Network.loadingFailed
    • request url
    • request headers
    • request timing
    • response method
    • status code
    • response headers
    • ...
  • Add a network tab on the Node-specific DevTools frontend
    • Collaborate with the Chrome DevTools team to achieve this.
  • Support client activities
  • Support server activities

Limitations and Challenges

cc @nodejs/inspector @eugeneo

@nodejs-github-bot
Copy link
Collaborator

Review requested:

  • @nodejs/gyp
  • @nodejs/http
  • @nodejs/net
@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. lib / src Issues and PRs related to general changes in the lib or src directory. needs-ci PRs that need a full CI run. labels Jun 26, 2024
@cola119 cola119 added the inspector Issues and PRs related to the V8 inspector protocol label Jun 26, 2024
@MoLow
Copy link
Member

MoLow commented Jun 26, 2024

CC @nodejs/undici regarding support for fetch and WebSocket

@benjamingr
Copy link
Member

Hey, this is incredibly cool.

@benjamingr
Copy link
Member

Yes, at the very least we should probably integrate with fetch/undici, it's hookable already through the dispatcher so it shouldn't be too hard (before this is stable, not to block this PR)

@GrinZero
Copy link

I think the fetch api is also very important and I hope it can be supported

@cola119
Copy link
Member Author

cola119 commented Jun 26, 2024

@benjamingr I wasn't aware of undici's dispatcher feature, and I've now seen that undici includes diagnostics channels. I'm thinking using diagnostics_channel to monitor network activities could be a more straightforward approach than capturing data within the core implementation.

@GrinZero
Copy link

I received this update before taking a shower, and this piece can finally be pushed forward, which is great.
I have decided to share my experiences and ideas in this area, hoping to be helpful for development.

About two weeks ago, I developed a library for this issue, and some key ideas should be useful:

The 'v8 inspector' lacks a network domain

It is useddevtools://devtools/bundled/inspector.htmlSubstitute (this is consistent with the above)

How to listen for HTTP/HTTPS requests?

Basic ideas

  • Firstly, I attempted to hijack the request method of the HTTP module, which allowed me to obtain Options when the user called it. By analyzing the parameters, I obtained key data such as URL and request headers.(request.ts#L77)
  • Next, we will continue to hijack the parameter callback of request. Through this step, we can obtain the IncomingMessage, so that we can obtain key data such as responseData and reponseHeader when the request returns.(request.ts#L50)
  • The final step is to hijack the instance returned by request with write method, so that the data at the time of the request can be obtained. (request.ts#L20)

Key points

  • It is worth noting that when truly sending responseData to devtool, data decompression processing is also required!(tryDecompression(...))
  • When sending requestWillBeSend, it is necessary to have more judgment on the content-type in order to better utilize devtool.

How to make full use of the initiator

A single network domain cannot support operations such as traceability and click to jump. This involves the Debugger domain and the specific chain ⛓️ :

  • Debugger.scriptParsed(Server to Devtool) -->
  • Debugger. getScriptSource (Devtool to Server) -->
  • Debugger.getScriptSourceResponse(Server response to Devtool).

The core of it is scriptId. Currently, I distinguish scriptId by file name, but the actual processing should be more complex.

How to support fetch

fetch.ts

Although I attempted to implement fetch hijacking, I found that the information provided by the fetch was relatively limited, and only basic request header \ request data \ response header \ response data could be displayed.

Data regarding data length and other aspects could not be fully collected. (This can be seen in the code specifically, the core is clone() & hijacking)

I hope this will be helpful for future development. My mastery of C++ is average, and my assistance with PR is limited.

@eugeneo
Copy link
Contributor

eugeneo commented Jun 26, 2024

One thing I was prototyping years ago was implementing Inspector domains in the user land, in JS. I.e. instead of piping requestWillBeSent/responseReceived through all the layers we could have a generic backend and one method that JS would use to send different messages. This would also allow the ecosystem add more custom domains that would be able to piggiback on existing Inspector infrastructure. E.g. some database vendor may want to add a custom domain for their database and a custom tool that would connect to Inspector server.

Another thing we discussed a lot in the past is that Network domain in Node should be reversed as most users will be interested in debugging server application. There should be requestReceived/responseWillBeSent pair.

Chrome DevTools at the time was very adamant not to reuse Chrome domains for the domains not in V8. Please rename this domain to NodeNetwork and let the tool developers opt in in supporting it. Chances are the domains will diverge (say, to support requestReceived) and it will be really difficult for tools to tell what they are working with, a browser or server.

@ronag
Copy link
Member

ronag commented Jun 26, 2024

Could we implement it in terms of diagnostic channnels that external libs (undici) can just hook into?

@GrinZero
Copy link

One thing I was prototyping years ago was implementing Inspector domains in the user land,...

I completely agree, and I also believe that the network domain in Node should be reversed. Now we can actually extend the v8 inspector launched by the node, but it is more related to remote debugging, and due to domain limitations, network and other domains cannot be implemented.

@eugeneo
Copy link
Contributor

eugeneo commented Jun 26, 2024

Note needs both HTTP server and HTTP client support... It would be invaluable to be able to see request that was received and what was sent to other microservices.

@cola119 cola119 force-pushed the network-inspection-poc branch 2 times, most recently from fc0ed45 to 5656be6 Compare June 27, 2024 07:06
@cola119
Copy link
Member Author

cola119 commented Jun 27, 2024

@nodejs/undici For network inspection on fetch API, we need undici's diagnostics_channel to support a hook when body is received. Has there been any progress on nodejs/undici#1342?

@GrinZero
Copy link

@nodejs/undici For network inspection on fetch API, we need undici's diagnostics_channel to support a hook when body is received. Has there been any progress on nodejs/undici#1342?

I would like to confirm if listening to network requests through the undici library diagnostics.channel is compatible with previous libraries?

@cola119
Copy link
Member Author

cola119 commented Jun 27, 2024

@GrinZero I'm going to add as many features of the Network domain as possible once we confirm that this PR is on the right track. (Currently, I'm trying to figure out how to support both the custom NodeNetwork domain and the Network domain). Any guidance or suggestions you could provide would be very helpful, thank you!

@eugeneo I need your advice on how to properly define and implement the custom NodeNetwork domain. My understanding is that the V8 inspector doesn't support the Network domain, so Node.js needs to support it. Additionally, Node.js should have the custom NodeNetwork domain to handle Node-specific events and commands (e.g., requestReceived) and allow ecosystems to utilize the inspector infrastructure. However, I'm still unsure the best way to proceed and would greatly appreciate your input. Below is a draft architecture overview I have in mind. Thank you for your assistance :)

image

@GrinZero
Copy link

I am wondering if it is possible to be more open and allow the node v8 inspector to directly expose the websocket server. This way, developers can not only operate devtool as a CDP client, but also extend devtool as a CDP server in the future.

dataReceived(request._inspectorRequestId, DateNow() / 1000, chunk.length);
responseString += chunk.toString();
};
response.on('data', onData);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the test results, consuming response data and entering flowing mode in the diagnostics_channel hook is causing some issues in the core. Any ideas?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Show the issues?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this seems wrong - we can't consume the request this way :D

We'd need to 'tee' the stream (discussions happening elsewhere) though with the inspector cloning it should also be fine.

I think for this PR it would make sense to exclude request bodies and do it in a follow up PR.

cc @MoLow @mcollina (since our tee discussion elsewhere)

Copy link
Member Author

@cola119 cola119 Jul 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dropped support for response body inspection for now. fbf5a31

@benjamingr Do you have any discussion links for reference?

@mcollina
Copy link
Member

We could add fetch support via nodejs/undici#2701.

Generically I think we should add some APIs to let devs integrate 3rd party clients.

@benjamingr
Copy link
Member

+1 to this being diagnostics_channel based and +1 for letting userland tools integrate with it.

@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Jul 13, 2024
@nodejs-github-bot

This comment was marked as outdated.

@cola119
Copy link
Member Author

cola119 commented Jul 14, 2024

The PR is currently blocked by this discussion: #53593 (comment).

@benjamingr benjamingr added the tsc-agenda Issues and PRs to discuss during the meetings of the TSC. label Jul 14, 2024
@benjamingr
Copy link
Member

I've added TSC agenda since I believe there is a decision to be made here regarding whether or not we should emit Network CDP events or not since this is blocked.

Our options are:

  • Engage with Chrome DevTool team (they've not been responsive to pings recently but traditionally this has been a good route) since they asked Node to evolve separately.
  • Emit Network (and NodeNetwork as well) in the meantime until the devtools supports it and then switch to only emitting NodeNetwork
  • Emit both Network and NodeNetwork separately.

If people believe there is a better forum than the TSC or that a decision is premature because there are ongoing efforts to resolve this - let me know. cc @nodejs/diagnostics

@eugeneo
Copy link
Contributor

eugeneo commented Jul 14, 2024

@benjamingr I reached out to DevTools team lead directly internally at Google. I will try pinging random team members if the lead is not responsive, it's vacation season...

@benjamingr
Copy link
Member

That's appreciated thanks

@danilsomsikov
Copy link

Hi,

Chrome DevTools TL here.

Is there something mole a design doc we could discuss? I am very much support to be of making DevTools Network tab work with request sent from the node app. I am less sure about supporting a „reversed“ network domain for the requests received by the node app in the DevTools proper. This is much trickier, generally not in line with out priorities and requires a careful planning. (I imagine DevTools extension as one possible solution for this).

Anyway, it would be the best to discuss this all in a design doc.

@benjamingr
Copy link
Member

@danilsomsikov hey! thanks for chiming in that's very appreciated.

The ask is basically around #53593 (comment)

@benjamingr benjamingr removed the tsc-agenda Issues and PRs to discuss during the meetings of the TSC. label Jul 15, 2024
@legendecas
Copy link
Member

legendecas commented Jul 16, 2024

The ask is based on the assumption that, if Chrome DevTool protocol evolves and changes Network domain, changes have be applied in Node.js as well. The Network domain and Network.requestWillBeSent are stable, unlike the Tracing domain which is still experimental and implemented in node as NodeTracing. How is the stability in CDT defined? Can we assume it is stable enough for Node.js to implement the protocol domain (and other stable domains)?

Additionally, CDP Network domain only supports outgoing requests, I think we still need a NodeNetwork for incoming requests (not in this PR though, could be done a follow-up). And if we can agree on that outgoing requests can be monitored with CDP Network domain, I think there is no needs to duplicate the events in NodeNetwork domain (the behavior of current PR).

@benjamingr
Copy link
Member

Hey, given we have Chrome's blessing (thanks!) to use Network for outgoing requests I suggest we just do that, with the current diagnostics_channel architecture I think the current design is good.

I think incoming requests would be awesome at a later PR and I agree a dedicated UI could be very cool for that.

@legendecas legendecas dismissed their stale review July 16, 2024 20:51

Unblock and we can remove duplication as a follow-up.

@legendecas
Copy link
Member

legendecas commented Jul 16, 2024

Thanks for the work! We can continue iterate on the implementation and I strongly believe this needs Chrome DevTools team's involvement because the default dedicated Node.js inspector doesn't show the "Network" tab, so the public audience would be very limited with this initial work. A dedicated "Network" tab in the default inspector for Node.js could be really helpful!

const wallTime = DateNow();
const timestamp = wallTime / 1000;
request._inspectorRequestId = getNextRequestId();
NodeNetwork.requestWillBeSent({

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried it out today when I had free time. Is it possible to retrieve the original call location from the stack obtained in this way, which indicates diagnostics_channel?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does 'the original call location' refer to the location in the user script where the network activity started? I mean, are you looking to retrieve the location indicated below?

const res = await fetch("https://nodejs.org/en"); // <-- here

If so, I think it's difficult with the current diagnostics_channel API. cc @nodejs/diagnostics

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I mean it's not easy to do with dc.

@cola119
Copy link
Member Author

cola119 commented Jul 18, 2024

Thank you for all discussing the direction. I agree with using the Network domain for client-size network inspection and introducing NodeNetwork to support server-side network inspection with a dedicated UI in the future. To move forward, I have removed the redundant NodeNetwork domain implementation for now, so another review is needed.

After merging this change, I'll sort out the remaining tasks and propose the design doc for server-side inspection and the dedicated UI to collaborate with the Chrome DevTools team.

Again, I truly appreciate all your support!

@cola119 cola119 added the request-ci Add this label to start a Jenkins CI on a PR. label Jul 18, 2024
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Jul 18, 2024
@nodejs-github-bot

This comment was marked as outdated.

@cola119
Copy link
Member Author

cola119 commented Jul 18, 2024

@danilsomsikov BTW, would it be reasonable to display the default network tab on the devtools frontend for JS apps (devtools://devtools/bundled/js_app.html) at this point, given that Node.js currently supports only limited network inspection features, even before we determine how to realize the inspection for incoming requests? This would be useful for users to try network inspection and provide feedback.

@danilsomsikov
Copy link

@danilsomsikov BTW, would it be reasonable to display the default network tab on the devtools frontend for JS apps (devtools://devtools/bundled/js_app.html) at this point, given that Node.js currently supports only limited network inspection features, even before we determine how to realize the inspection for incoming requests? This would be useful for users to try network inspection and provide feedback.

Sure

@cola119
Copy link
Member Author

cola119 commented Jul 18, 2024

@danilsomsikov So, I would like to request to enable the display of the network tab on the frontend. What should I do to proceed?

@danilsomsikov
Copy link

@danilsomsikov So, I would like to request to enable the display of the network tab on the frontend. What should I do to proceed?

Please file a feature request at https://goo.gle/devtools-bug

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c++ Issues and PRs that require attention from people who are familiar with C++. commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. diag-agenda Issues and PRs to discuss during the meetings of the diagnostics working group. inspector Issues and PRs related to the V8 inspector protocol lib / src Issues and PRs related to general changes in the lib or src directory. needs-ci PRs that need a full CI run. notable-change PRs with changes that should be highlighted in changelogs. semver-minor PRs that contain new features and should be released in the next minor version.