Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

H-86: Add minor hEngine compatibility with TypeScript for behaviors and initialization #2651

Closed

Conversation

rivertam
Copy link
Contributor

@rivertam rivertam commented Jun 8, 2023

Preamble

I'm really interested in this project, both as a user and as a potential contributor. I have a bespoke simulation framework that I've been toying with and would be interested in trying to merge some of the work I've done on it to Hash engine to see if it's useful. I'd love to take some conversations offline and I plan on messaging in the Discord, but just fyi I'm open to hopping on a video call to have higher throughput.

This PR isn't so much "I want to merge this" as it is "I wanted to get familiar enough with the internals to be able to do this", and I've spent the last couple of weeks reading through and trying to figure out how the project works from the inside. I've certainly made progress, but I'm sure guidance would help me speed up! In terms of this PR, I do need to do more thorough testing and add automated tests and I'm sure other things that are part of the process, but I wanted to submit for feedback first to avoid wasting my time. Think of this as a very involved Discussion rather than a PR, maybe.

I have a couple of ideas on things I want to work on after this PR (see next steps), but the long-and-short is that I'm interested in using swc_bundler to make it possible to split up complex logic (and maybe get rid of some temporary hacks on the worker).

🌟 What is the purpose of this PR?

This uses the swc compiler to strip the TypeScript types from behavior files named <something>.ts and initialization files named init.ts

🔍 What does this change?

This adds a compilation step to the simulation executor for initialization and for behavior execution

Pre-Merge Checklist 🚀

🚢 Has this modified a publishable library?

Confirm you have taken the necessary action to record a changeset or publish a change, as appropriate

I absolutely have not and am not sure how to do this! TBD

This PR:

  • modifies a Cargo-publishable library, but it is not yet ready to publish
  • I am unsure / need advice

📜 Does this require a change to the docs?

The changes in this PR:

  • requires changes to docs
    • I'm not sure if this should even be merged considering all it does is strip types (and not check them)
    • If it were to be merged, I think the docs could reflect that TypeScript is supported in these places

🕸️ Does this require a change to the Turbo Graph?

The changes in this PR:

  • I am unsure / need advice
    • I added swc to Cargo.toml, but I'm not sure if that's involved with the Turbo Graph or if that's just npm

⚠️ Known issues

  • No type checking
  • No exported types that are useful

🐾 Next steps

Great question

  • Export types that can be used in TypeScript (somehow)
  • Check types using tsc (is this out of scope? might be better to just leave this to editors/the web client)
  • Personally, the next thing I want to work on is adding swc_bundler to the manifest stage so we can split the init and behavior files into multiple files, which I think is currently unsupported (please correct me if I'm wrong!). While this wouldn't much help close out the "add TypeScript support" checkbox, it's what I want next as a user.

🛡 What tests cover this?

  • None! I am definitely interested in adding tests but I wanted to get feedback before wasting my time!

❓ How to test this?

  • So far, I've just converted the most basic project (the one with 2 agents with ages 0 and 5 going up) to TypeScript. It's so minimal it's bad! I've attached the project I'm using.

ageing-agents.ts.tar.gz

📹 Demo

I guess here ya go!

https://asciinema.org/a/ee9BQceUMV0t5OTAiWw7icy76

@CLAassistant
Copy link

CLAassistant commented Jun 8, 2023

CLA assistant check
All committers have signed the CLA.

@rivertam rivertam marked this pull request as draft June 8, 2023 17:09
@github-actions github-actions bot added A-engine area/deps Relates to third-party dependencies (area) labels Jun 8, 2023
@@ -615,7 +618,7 @@ fn get_behavior_from_dependency_projects(
} else {
name_root.replace('_', "-")
};
let full_name = "@hash/".to_string() + &dir + "/" + file_name;
let full_name = format!("@hash/{dir}/{file_name}");
Copy link
Contributor Author

@rivertam rivertam Jun 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't figure out why this would be, but when I made the other changes in this PR (in other files; seemingly completely unrelated), the old version of this would no longer build, but when I git stashed those changes, it would build fine. I have no idea! Nevertheless, this seems like the "better" version of the code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for spotting this ugly beast 😄

@vilkinsons vilkinsons changed the title Hash Engine: Add minor compatibility with TypeScript for behaviors and initialization hEngine: Add minor compatibility with TypeScript for behaviors and initialization Jun 8, 2023
/// All init package tasks are registered in this enum
#[derive(Clone, Debug)]
pub enum InitTask {
JsInitTask(JsInitTask),
TsInitTask(TsInitTask),
PyInitTask(PyInitTask),
}

impl Task for InitTask {
Copy link
Contributor Author

@rivertam rivertam Jun 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I think the traits for Task could do with some overhaul (no elaboration given, sorry, would need to read more), but one thing in particular I've considered is applying this technique to InitTask.

Copy link
Member

@TimDiekmann TimDiekmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR, this seems like a very useful addition.

I did a quick look through the PR. Generally, the code looks good, however, I have one major concern about the newly added dependency.

It would also be good if you could add a few small test cases. The easiest way to achieve this is probably to just change a few init.js to init.ts in the integration test suite.

Comment on lines +24 to +25
swc = "0.261.29"
swc_common = "0.31.11"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this also possible using v8? swc is a pretty big dependency with very many (and outdated) dependencies.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was a related PR, which added TS support for another location in the code. I haven't looked at transpiling TS to JS code in Rust, yet, but it might worth taking a look 🙂

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, v8 does not have TypeScript compatibility. Unfortunately, the options to compile TypeScript to JavaScript are pretty limited and all pretty heavy in this context. swc is used all throughout the TS ecosystem, perhaps most notably by Next.js, and is often touted as the fastest/lightest production compiler. It's also the only compiler I know of that doesn't require installing it externally and shelling out to it.

I'm not a huuuge fan of swc either, but it's kind of the best for what it does right now. Competitors include esbuild (which includes bundling), babel, and the default TypeScript compiler (tsc). Eventually, tsc needs to be in the stack somewhere to do actual type-checking, but the compiler is written in JS, it's quite slow (seconds where swc is milliseconds), and it isn't easily invoked directly through Rust (unlike swc). Additionally, the next stop here was to add swc_bundler, the best bundling solution I can see (can turn two TS files into one; this would allow importing files from other files, including JS) that's accessible by Rust.

Another solution I've considered is transitioning the JS runner to use bun which can do a lot of stuff "automatically". Downsides include lack of Windows support, it's supported by a single developer, and we'd definitely have to shell out to invoke it (rather than being able to inline it). However, I think there'd be a lot of upsides including automatically included ffi. There's potentially a lot to discuss here.

I'm definitely open to other solutions, though. I just think they're mostly actually heavier and more cumbersome than swc. I also don't want to keep investing in swc (through swc_bundler) if you don't want to merge it here, so definitely worth figuring out what the optimal final setup would look like. In particular, the alternatives to swc_bundler would all require shelling out as well, so I think we could want a totally different approach.

I'd boil it down to the two approaches of

  1. Do everything inline/in memory (the way this PR does). If we want this, there aren't really any good alternatives to swc right now, but if they ever were developed (e.g. Rome), we could pretty easily swap it in place of swc.
  2. Shell out to another project that uses the file system itself. This would be something like esbuild or tsc. I view this as the "heavier" option, but one nice aspect is that I think it will be more naturally conducive to bundling as one step (esbuild in particular can do both).

Please feel free to take your time with this. I'm in no rush to get this merged in, and I don't think you are either. I'm pretty happy either way, but I would of course eventually like to be able to write simulations using TypeScript, so that's my only real goal.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cheers for the detailed and thoughtful reply.

FWIW, I know Jarred @ Oven raised $7m+ for bun and friends quite a while back now, so if it is genuinely just supported by him at the moment I think that likely that won't be the case for all that long. But there is certainly more adoption risk w/ it than swc in my mind, and I would consider lack of native Windows support to be a blocker given our existing/historical user-base and desire to remain widely accessible.

I'll leave actual follow-up re: next steps and a decision here to Tim, but I can see the argument for swc and the benefit this PR provides beyond what we have in place already. Like you say, updates to this version of hEngine are not a high priority for us at present, but this seems like an obvious win if we can bag it in some form, and we're very grateful for the contribution here. We'll do what we can to see it reviewed in a timely fashion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I'm very excited about bun in the future.

For my edification, what are some of your top priorities at the moment? Is there a future version of the engine in the works?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rivertam We're presently evaluating the best role for hEngine to play in simulation pipelines that more heavily make use of generative agents. For the avoidance of doubt, we aim to continue work on this version of the engine (as opposed to another), but the future roadmap is currently in flux.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the detailed explanation! I agree w/ @nonparibus that not supporting Windows is probably a bad option. I don't know if hEngine works on Windows currently natively, but explicitly opting out of support is not worth it (we currently have a few shell scripts here and there but IIRC these are mainly used for the Python runner to set up and run). It seems that swc is the best option, currently.

Doing the translation in memory is probably the best way at the current state, yes. As this happens once and not at every simulation step I wouldn't be concerned about performance issues here. In the long-term, I'm unsure what's the best way, I'm specifically thinking about dependencies in TS, but we don't need to worry about this right now.

No, I'm not in a rush to get this merged but I also don't want to unnecessarily block it, either.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. Generative agents meaning agents that use LLMs or similar?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exactly. :) (And sorry, missed this comment somehow!)

/// All init package tasks are registered in this enum
#[derive(Clone, Debug)]
pub enum InitTask {
JsInitTask(JsInitTask),
TsInitTask(TsInitTask),

Check warning

Code scanning / clippy

variant name ends with the enum's name

variant name ends with the enum's name
let compiled = self
.behavior_src
.as_ref()
.map(|src| language.compile_source(&self.name, &src));

Check warning

Code scanning / clippy

this expression creates a reference which is immediately dereferenced by the compiler

this expression creates a reference which is immediately dereferenced by the compiler
@rivertam
Copy link
Contributor Author

rivertam commented Jun 8, 2023

@TimDiekmann Definitely gonna add test cases before recommending a merge. Do you think it should be merged soon? I was expecting it to wait on other stuff (actual type-checking, documentation, perhaps frontend support; I'm not sure what the processes are here)

@TimDiekmann
Copy link
Member

There is no need to merge this soon, no.

To what extent do you expect to have type-checking in place? Documentation would be nice, yes, but hEngine does not have a frontend, it's only driven through the CLI currently. Just to make sure we're talking about the same thing here: hEngine is not the backend of https://core.hash.ai, it's a standalone engine written from scratch.

@rivertam
Copy link
Contributor Author

rivertam commented Jun 9, 2023

Ahh, right, I forgot that this is the "alpha" version of the next iteration. That's perfectly fine with me -- I've been using this standalone.

In terms of what I'm expecting, for now I would expect the engine just to crash in the same way I'd expect it to if you provide a bad behavior path or no initial state. It's certainly all at start-up time, nothing dynamic about it. Let me add in a couple of tests here, but given that it's not in production, I see how there's no great reason to block merging it.

@rivertam
Copy link
Contributor Author

rivertam commented Jun 14, 2023

Sorry for the delay on this y'all. I had tested this just by slapping a const age: number = 5 or something like that in age.js (and turned it into age.ts).

However, to use TypeScript effectively here, we'll want to insert code similar to this:

import * as globals from '../globals.json'

type Context<Globals extends object> = {
  globals(): Globals
}

// ... redacted

// ignore `state` for now
function behavior(state: State<typeof stateSchema>, context: Context<typeof globals>) { /* redacted */ }

Unlike the "slap a : number somewhere in the code" approach, this uses an import. We're only using it for types (not values, which would definitely necessitate a bundler), but this causes the strip_typescript function to emit a module, which doesn't seem to be immediately resolvable by V8.

swc has multiple module output modes, including ES Modules which v8 is ostensibly somewhat compatible with, but they all include some notion of importing/exporting unlike the way behaviors are structured right now where the variables are global a la <script> tag.

The way I see it, there is basically no point in having TypeScript support if we can't import at least things like the Context template class and use it in our editor.

So I have two inclinations. The first is to eagerly add swc_bundler to this PR (or make a chain PR, or something), which would add a lot of complexity to the preconditions of merging. The second is to work with v8/the runners to get them to execute modules properly to avoid having to do too much work. However, I'm less confident this is work that will actually be useful in the future (as we may have no need to execute modules in this way in the future).

Not sure exactly which I'm gonna explore next. Let me know if you have any opinions. I responded to @nonparibus on Discord 3 days late that I'd love to sync and get to know more about the state of this project and how it fits into your business; I think that would obviously help a lot in deciding next steps here. You can reach me there (and I'll try to be more responsive) or just email me at [email protected].

@TimDiekmann
Copy link
Member

I apologize for taking a little longer to respond.

Both approaches you mentioned seem to have advantages. From the description you gave, it sounds like the first approach (using swc_bundler) would be the more future-proof approach? Would that mean that it would be possible, in principle, to bundle both JS and TS to be consumed by V8, so as long as swc supports a JS/TS feature, it will work?

How confident are you that you can get swc_bundler to work in such a way that it can create a bundled JavaScript file that can be easily used by runners? The biggest concern with this approach would be the added complexity, but I would love to see a proper solution. If you can make this work, I think it would be very beneficial. How much work would it take to use the second approach? If this is a simple approach that can be implemented quickly, I could see it being useful in the short term.

In any case, I think it would be good to merge the PRs if all the old tests still pass. If it’s possible to easily add more tests, that would be better, of course, but I don’t see that as a strict barrier to the PRs. I’m happy to help if you have issues with the integration test suite, but given what you’ve already implemented, I think you’ll get used to the tests pretty quickly. I used to chain a lot of PRs, and in general, it is easier to branch from main rather than stacking the PRs - especially since we squash commits when merging, so a rebase is required after the merge.
Having more tests in place, however, would greatly simplify reviewing the changes 🙂

@vilkinsons vilkinsons changed the title hEngine: Add minor compatibility with TypeScript for behaviors and initialization H-86: Add minor hEngine compatibility with TypeScript for behaviors and initialization Jul 6, 2023
@vilkinsons
Copy link
Member

Hey @rivertam, thanks for your patience! Just to note that we've now moved this project across into our labs repo. You can find its new home @ https://github.com/hashintel/labs/tree/main/apps/sim-engine

@TimDiekmann
Copy link
Member

Closing in favor of hashintel/labs#41

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/deps Relates to third-party dependencies (area)
Development

Successfully merging this pull request may close these issues.

4 participants