Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement async abstract methods, add TCP conn #91

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

wekesa360
Copy link
Contributor

With this PR: the jdwp_streams.py adds the implementation of the two abstract classes JDWPOutputStreamBase and the JDWPInputStreamBase . With the jvm_connection.py has the JVMConnection class for connections to a JVM.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Dec 1, 2023
Comment on lines 9 to 13
self.host = host
self.port = port
self._tcp_connection = None
self._input_stream = None
self._output_stream = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you annotate attributes of the class ? (https://peps.python.org/pep-0526/#class-and-instance-variable-annotations)

Could you use __ instead of _ to prefix attributes that should not be accessed from outside of this class?

Comment on lines 17 to 18
self._tcp_connection = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
self._tcp_connection.connect((self.host, self.port))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be using asyncio.open_connection here: https://docs.python.org/3/library/asyncio-stream.html#asyncio.open_connection to create reader/writer pair that can be used to perform asynchronous IO.

Comment on lines 38 to 40
await self._output_stream._write_bytes(handshake_bytes)

response_bytes = await self._input_stream._read_bytes(len(handshake_bytes))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The purpose of the JDWPInputStream and JDWPOutputStream is to allow us to express serialization and parsing of JDWP Structs in therms of the types defined in the spec. If there is a need for JVMConnection to send/receive anything else to/from the debugged JVM (and there is!) then it should use the underlying writer and reader returned from asyncio.open_connection directly.

class JDWPInputStream(JDWPInputStreamBase):
def __init__(self, socket_connection: socket.socket):
super().__init__()
self._tcp_connection: socket.socket = socket_connection
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be asyncio.StreamReader. Also could you annotate it right at the top of the class definition ?

Comment on lines 18 to 25
async def read_packet_header(self) -> typing.Tuple[int, int, bytes, int, int]:
header_data = await self._read_bytes(10)
return struct.unpack("!IIcBB", header_data)

async def read_packet_data(self, length: int) -> bytes:
return await self._read_bytes(length)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two should be moved elsewhere, maybe they should be private methods of the JVMConnection ? This class should focus on reading and writing JDWP values from the spec.

I think we will need a methods/functions with a signatures similar to:

async def send(command_set: int, command: int, struct: typing.Optional[JDWPStruct]):
    ...

async def receive() -> JDWPStruct:
    ...

Those functions should handle packet headers and should use JDWPInputStream/JDWPOutputStreaum to turn structs into bytes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more thing, right now the receive method would have no idea what is the type of the response Struct it should be receiving (but it will in the future). Feel free to add an argument that is a class object indicating the type, something like:

JDWPStructT = TypeVar("TypeT", bound=JDWPStruct)
async def receive(response: Optional[Type[JDWPStructT]] = None) -> JDWPStruct:
    ...

Comment on lines 98 to 102
async def _read_id(self, id_type: IdType) -> typing.Any:
size = await self._read_bytes(1)
size = struct.unpack("B", size)[0]
data = await self._read_bytes(size)
return getattr(IdType, id_type.value)(int.from_bytes(data, byteorder="big"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not correct, is it ? The size is not prepended, we need to ask the VM for sizes with IDSizes command

Comment on lines 104 to 96
async def _read_type(self, type_id: IdType) -> typing.Any:
if type_id == IdType.BOOLEAN:
return await self.read_boolean()
elif type_id == IdType.INT:
return await self.read_int()
elif type_id == IdType.LONG:
return await self.read_long()
elif type_id == IdType.OBJECT_ID:
return await self.read_object_id()
elif type_id == IdType.THREAD_ID:
return await self.read_thread_id()
elif type_id == IdType.STRING:
return await self.read_string()
elif type_id == IdType.CLASS_ID:
return await self.read_reference_type_id()
else:
raise ValueError(f"Unsupported type ID: {type_id}")

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems unused ?

Comment on lines 201 to 190
async def write_array(
self, element_type: IdType, elements: typing.List[typing.Any]
) -> None:
await self.write_int(len(elements))

for element in elements:
await self._write_type(element_type, element)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's focus only on the abstract methods defined in the *Base class, this one is not there.

@@ -0,0 +1,48 @@
# Copyright (c) Meta Platforms, Inc. and affiliates.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you move this file to projects/jdwp/runtime/jvm_connection.py ? You will need to make main depend on that target to make this code visible in main.py.

@wekesa360 wekesa360 force-pushed the jdwp_debugger_init branch 4 times, most recently from f7a5710 to 6beef3d Compare December 7, 2023 12:08
Codegen dataclassess (facebookexperimental#79)

* Wrote test to validate primitive type mapping

* Added test for type alias definition

* Built dataclasses code generator

* Built test for dataclassess code generator

* Changed to unittest

* Updated struct dataclass function to support all types

* feat: built dataclass generator

* Added copyright

* Added nested struct support

* Added test for nested struct

* Feat: update functions and put it in a class.

* chore: update tests

* Casted types to pass pyre check

* Updated types

* chore: Refactor typing.cast code

* Added copyright

* refactor: update tests to reflect new schema changes

* fix: Fix pyre typing errors

[jdwp] do not emit array length and union tag fields

[jdwp] map union tags and array lengths to int

[jdwp] fix type for union fields

[jdwp] fix names of fields containing spaces

[jdwp] generate projects.jdwp.runtime.structs

[jdwp] add tests importing components of the runtime library

implement async abstract methods, add TCP conn

update files

update conn and streams

add buck deps

add jvm connection and implement jvm streams

add a base class for jdwp structs

[jdwp] generate projects.jdwp.runtime.structs

[jdwp] add tests importing components of the runtime library

fix fail build

[jdwp] fix type checker fails
Comment on lines 123 to 126
class JDWPOutputStream(JDWPOutputStreamBase):
def __init__(self, socket_connection: socket.socket):
super().__init__()
self._tcp_connection: socket.socket = socket_connection
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realised that we need to compute the size of the message and put it in the header, so we need to serialise structs before we write anything to the socket. Could you rename this to be JDWPBufferOutputStream and change it so that it appends all the data to an internal bytes object ? Then the JVMConnection.send method suggested above could look something like this:

async def send(command_set: int, command: int, struct: typing.Optional[JDWPStruct]):
    if struct is None:
        data = b""
    else:
        struct_output_stream = JDWPBufferOutputStream()
        struct.serialize(struct_output_stream)
        data = JDWPBufferOutputStream.data

    # write header
    ...

    # write data
    self.writer.write(data)

Comment on lines 18 to 25
async def read_packet_header(self) -> typing.Tuple[int, int, bytes, int, int]:
header_data = await self._read_bytes(10)
return struct.unpack("!IIcBB", header_data)

async def read_packet_data(self, length: int) -> bytes:
return await self._read_bytes(length)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more thing, right now the receive method would have no idea what is the type of the response Struct it should be receiving (but it will in the future). Feel free to add an argument that is a class object indicating the type, something like:

JDWPStructT = TypeVar("TypeT", bound=JDWPStruct)
async def receive(response: Optional[Type[JDWPStructT]] = None) -> JDWPStruct:
    ...

Comment on lines 10 to 15
python_library(
name = "lib",
srcs = glob(["**/*.py"]),
visibility = ["PUBLIC"],
deps = [ "//projects/jdwp/runtime:runtime"]
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lib target is obsolete now, could you just remove it and instead make main depend on //projects/jdwp/runtime:runtime directly ?

@@ -96,85 +96,85 @@ async def read_long(
class JDWPOutputStreamBase(abc.ABC):
# Methods for OpaqueType
@abc.abstractmethod
def write_boolean(self, value: bool) -> None:
async def write_boolean(self, value: bool) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Output stream can (and should) be synchronous. StreamWriter.write is. Please remove changes in this file.

import struct
import typing
import asyncio
from projects.jdwp.defs.schema import IdType
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably does not need to be imported.

pass


class JDWPOutputStream(JDWPOutputStreamBase):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You missed my comment about JDWPBufferOutputStream. We need an output steam that appends everything to a byte buffer because we need to know the size of the entire message in order to write its header.

Comment on lines 34 to 36
for stream in (self.__reader, self.__writer):
if stream is not None:
if isinstance(stream, asyncio.streams.StreamWriter):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a for loop with an if statement picking only one option. It seems like this logic could be greatly simplified ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, thanks for the highlight, I had initially included a close() for the streamReader but looking at the asyncio docs there is no close() function for it.

print(f"Error during connection: {e}")
await self.close()

async def close(self) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want this to be async ? There is not a single await in the body of the method

async def handshake(self) -> None:
try:
handshake_bytes = b"JDWP-Handshake"
if self.__output_stream is not None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should throw if this is not the case, right ?

print(f"Error during handshake: {e}")
await self.close()

async def read_packet_header(self) -> typing.Tuple[int, int, bytes, int, int]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be private (__read...) and we need a NamedTuple to represent tuples with that many components:

class PacketHeader(typing.NamedTuple):
    length: int
    id: int
    ...

Once there is a named tuple representing packet header you can add a write and a static async read methods there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants