
I’m porting the TypeScript type checker tsc to Go

450 points | 2 years ago | kdy1.dev
constexpr2 years ago

I'm the author of esbuild. I think this is a very smart, practical choice and is one of the approaches that has the best chance of succeeding (the other being a semi-automated porting tool instead of a purely manual port). I'm very excited to see someone take this approach. This is how I would do it if I were to do it. Web developers around the world are hoping this works out :)

The hard part about this project isn't doing the port but keeping up with the constant changes from the TypeScript team, especially if the TypeScript team isn't invested in helping you out. Staying as close as possible to the source language like this will be a big win there.

One of the big benefits of a native language like Go, aside from the lack of a JIT, is the ability to easily add shared-memory parallelism. I haven't studied the TypeScript compiler internals so I'm not sure how easy it is to parallelize their type checker. However, I assume that it would be relatively straightforward to parallelize the parsing of JavaScript files, so I'd hope for at least a big speedup there.

msoad2 years ago

Unfortunately TypeScript does all of its type checking synchronously, mostly due to how it needs to build a global symbol list first. There are some hints that they might move toward more parallel processing, but that requires a lot of major changes.

https://github.com/microsoft/TypeScript/issues/30235

constexpr2 years ago

Yeah, I was afraid of that. Ideally it would be possible to do type checking in parallel at the function level or something, but TypeScript allows you to omit the return type and then infers it from the function body, which adds dependencies between functions and makes parallelization more complicated. I wonder if it would still be possible, though. Maybe just for functions where the return type and all argument types are present? Then the approach would be sort of outside-in: type check the unparallelizable global stuff serially first, then type check the remaining isolated function bodies in parallel afterward.
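To make the dependency concrete, here's a tiny sketch (function names invented):

    // With the return type omitted, the checker must analyze area()'s body
    // before it can type any caller, which serializes the work.
    function area(r: number) {
      return Math.PI * r * r;          // inferred as number
    }

    function label(r: number) {
      return `area: ${area(r)}`;       // depends on the inference above
    }

    // With an explicit annotation, each body could in principle be checked in isolation:
    function areaAnnotated(r: number): number {
      return Math.PI * r * r;
    }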

zja2 years ago

Love ESBuild! Converted a relatively large project from webpack to it today; really easy to do, and much faster build times now!

vips7L2 years ago

> the ability to easily add shared-memory parallelism

I’m honestly not sure if there is any popular language where this isn’t true?

constexpr2 years ago

The problem is that you can't use shared-memory parallelism in TypeScript. If you could, then the most straightforward way of speeding up the TypeScript compiler would be to parallelize the compiler, not to port it.

vips7L2 years ago

I just don’t see how Go is more uniquely suited to do this than any other language with the same feature and admittedly I may have read too much into your comment.

constexpr2 years ago

That comment wasn't about Go exclusively. My point was that porting single-threaded code to a native language doesn't necessarily cause huge order-of-magnitude speed gains, especially if you plan on sticking close to the original structure of the code to make porting future changes easy, but that in this specific case a port sounds likely to realize significant gains (which is very exciting!).

Shared-memory parallelism is necessary but not sufficient. Ideally you'd only need to make small changes in a few spots to greatly increase parallelism. I'm hopeful that TypeScript's AST parsing is a good candidate, since ASTs can typically be constructed independently from the rest of the program state. Lowering and printing ASTs is also likely parallelizable. It would also be great if it were possible to parallelize parts of the type checker somehow, but that sounds harder; I had some ideas about that in another comment. Anyway, I think this means a port of the TypeScript compiler could realistically lead to significant, possibly even extreme, speedups.

As far as language choice, it sounds like maintenance concerns narrow the decision space to native, garbage-collected languages with parallelism primitives and an object model similar to TypeScript's. Candidates include Go, JVM-based languages, and .NET-based languages. I think those can all be ahead-of-time compiled to a static binary? Another consideration is how easy/compact/fast/cross-platform the binaries are. Go makes this trivial, but I haven't used JVM or .NET AOT builds myself so I don't know how they compare in e.g. executable size or startup overhead. One advantage of picking Go is that esbuild has already demonstrated that Go is an effective choice for this problem domain, which may mean less work/research/unknowns. A disadvantage of Go is that its object model is further from TypeScript's than something like Java or C#. But all else being equal, the choice comes down to personal preference.

jpochtar2 years ago

node.js Workers + SharedArrayBuffers?

I'm not sure how well supported they are / haven't used them, but I think they enable this?
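For reference, a minimal sketch of that combination (file names and the per-file work are placeholders, not tsc internals). The catch is that a SharedArrayBuffer only shares raw bytes, so structured data like ASTs would still have to be serialized through postMessage:

    import { Worker, isMainThread, parentPort, workerData } from "worker_threads";

    if (isMainThread) {
      const shared = new SharedArrayBuffer(4);        // raw bytes visible to all threads
      const counter = new Int32Array(shared);
      for (const file of ["a.ts", "b.ts", "c.ts"]) {  // hypothetical inputs
        const worker = new Worker(__filename, { workerData: { file, shared } });
        worker.on("exit", () =>
          console.log(`${Atomics.load(counter, 0)} files processed`));
      }
    } else {
      const { shared } = workerData as { file: string; shared: SharedArrayBuffer };
      // ... do the per-file work here ...
      Atomics.add(new Int32Array(shared), 0, 1);      // update visible to every thread
      parentPort?.postMessage("done");
    }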

sealeck2 years ago

Python? JavaScript? Ruby?

joshxyz2 years ago

love esbuild, ditched webpack and rollup since then. been a fan of golang since then, too.

hardwaregeek2 years ago

This reminds me of when people try to make a new Ruby implementation. They get the basics down and think "this isn't so bad! I can do this!". Fast forward a few months and they're stuck on some nasty little edge case around eval and method_missing that they need to run Rails. Or some weird aspect of C extensions.

TypeScript's compiler is an unholy, unsound mess, because JavaScript is an unholy, unsound mess. Sure, the basics of type checking structs and variables are somewhat easy. But what about conditional types? What about weird generic constraints? What about indexing types? There's some really confusing stuff going on underneath the hood.
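To give a hedged taste of the machinery involved (type names invented):

    type ElementOf<T> = T extends readonly (infer U)[] ? U : never; // conditional type + infer
    type ValueAt<T, K extends keyof T> = T[K];                       // indexed access + constraint

    type A = ElementOf<string[]>;          // string
    type B = ValueAt<{ a: number }, "a">;  // number
    type C = ElementOf<number>;            // never (the false branch)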

It's also a mess that *does not have a spec*. Even if you somehow manage to make a type checker that implements most of the features, there's no way to ensure it has 1 to 1 parity with TypeScript. Plus that parity can be broken at any point.

The only way I could see this happening is if the author somehow convinced Anders Hejlsberg or a significant portion of the TypeScript team to work on this new implementation. I'm not sure that's gonna happen.

Waterluvian2 years ago

I am absolutely blown away at how good TypeScript's type checker is, given how much of an unholy mess JavaScript is, and that the TypeScript language cannot just deprecate all the awful parts.

More than once I've yelled at my screen, "HOW DO YOU KNOW THIS?!" about some inferred or narrowed type.

dmitriid2 years ago

However, more than once I've also yelled "HOW DO YOU NOT KNOW THIS?!" at something or other :)

epolanski2 years ago

Just today I was triggered that this compiles.

declare const foo: never[] // it can only be an empty array

foo[1] // compiles

TypeScript is a nice type system; it's surprisingly expressive and flexible. In Haskell, for example, I was surprised that there's this strange notion that you can only have one `Eq` or `Ord` instance for a given type A.

But in TypeScript you can have as many Ord interfaces as you want. I may want to order some `User` type by last login, first login or whatever I want, but in Haskell you need to overcomplicate it and create a newtype.

TheCoelacanth2 years ago

> But in TypeScript you can have as many Ord interfaces as you want. I may want to order some `User` type by last login, first login or whatever I want, but in Haskell you need to overcomplicate it and create a newtype.

I don't think that's accurate. Typescript only has one instance of Ord per type and it's inconveniently implemented through the method toString.

To usefully sort anything other than strings in Typescript, you are using a function where you provide your own comparator at runtime. That's more analogous to Haskell's `sortBy :: (a -> a -> Ordering) -> [a] -> [a]` or `sortOn :: Ord b => (a -> b) -> [a] -> [a]` which don't put any type bounds on the type you're sorting.
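Concretely, the TypeScript side looks something like this (the `User` shape is made up):

    interface User { name: string; firstLogin: Date; lastLogin: Date }

    const byLastLogin = (a: User, b: User) => a.lastLogin.getTime() - b.lastLogin.getTime();
    const byFirstLogin = (a: User, b: User) => a.firstLogin.getTime() - b.firstLogin.getTime();

    declare const users: User[];
    users.slice().sort(byLastLogin);   // choose the ordering at the call site,
    users.slice().sort(byFirstLogin);  // no newtype wrapper required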

reuben3642 years ago

I've managed to get the compiler to typecheck

  const x: Thing = exprA;
  const y: Thing = exprB;
  const z = [x, y];
but somehow not:

  const z: Thing[] = [exprA, exprB];
although exprA and exprB were pretty nasty typing wise.
azangru2 years ago

> but somehow not: const z: Thing[] = [exprA, exprB];

Doesn't

    const z = [exprA, exprB] as const;
    type GnarlyThingTuple = typeof z;
or something like this help?
halpert2 years ago

Were you trying to create a tuple? Does `const z: [Thing, Thing] = [exprA, exprB];` work?

rowanG0772 years ago

I'm the opposite. I was continually perplexed by trivial stuff that TypeScript just couldn't figure out. Maybe that has changed in the last year, but I doubt it.

smt882 years ago

In my 5+ years introducing TypeScript everywhere that my team has to use JavaScript, this is almost always because JavaScript is an insane language that allows many counterintuitive things.

TS is sometimes weird because it is a superset of JS and has to be compatible.

lockyc2 years ago

Do you have examples?

scottlamb2 years ago

> This reminds me of when people try to make a new Ruby implementation. They get the basics down and think "this isn't so bad! I can do this!". Fast forward a few months and they're stuck on some nasty little edge case around eval and method_missing that they need to run Rails. Or some weird aspect of C extensions.

I think that's exactly why the author wrote this:

> Eventually, I realized that rewriting a massive project such as tsc would be extremely difficult to continue. But the performance gains were exciting. This led me to try another route: porting instead of a complete rewrite. So, I started looking at the TypeScript source code.

It will require ongoing work to keep up with changes to upstream, but by matching upstream's structure as closely as possible (given the language difference), the path to supporting these nasty little edge cases should be relatively clear.

ludamad2 years ago

Interesting. In theory you could make a rough JS-to-Go compiler, given the relatively strict subset of TypeScript that I recall the TypeScript compiler using (not that the strictness means a fully sound implementation is easy, but you can use the regularity to cheat with text transforms).

mrkentutbabi2 years ago

What is the difference between porting and a complete rewrite? I thought they were the same.

scottlamb2 years ago

> What is the difference between porting and a complete rewrite? I thought they were the same.

As with so many terms in computing, there aren't universally agreed definitions. But in the author's post, I read "porting" as switching languages while changing as little as possible vs "complete rewrite" as redesigning/rearchitecting it along the way.

azakai2 years ago

To add to that, I would say that simple forms of "porting" can be done without really understanding the overall architecture. You just need to understand each line (or a few lines around it) at a time. I understood the post to mean something like that, which sounds like a good plan here, to minimize the risk of adding bugs.

And in comparison, a "complete rewrite" would require fully understanding the big-picture design of the original code, coming up with a perhaps different one in the rewrite, and keeping the differences in your head as you go (or forgetting about the original design entirely, solving it from scratch, and keeping that in your head as you go). Far more work and risk, and it would have been a bad choice here.

cwalv2 years ago

If that's the case, it seems unlikely that they'll be able to get the hoped for performance improvements

jacobr12 years ago

Porting can involve a spectrum of changes. If you are porting to a different language, often for bug-for-bug compatibility you try to maintain the same rough code and class/object structure as the source codebase. This might give you some non-idiomatic code at first, but once you have good test coverage you can refactor. In this case you might maintain some of the upstream code structure to better match future changes.

travisd2 years ago

I think here “rewrite” means completely changing the internal structure of the code (while preserving the functionality) whereas “port” means copy the structure of the internal code with as little modification as possible. And it’s hard to do the latter in Rust since Rust is not garbage collected.

danenania2 years ago

If they focus on the core features and make a significant enough improvement, perhaps the core TS team could be convinced to pull it in as a native extension and make calls to it for the operations it covers? If something like 62x improvement is really achievable, it's hard to imagine the team not being interested at all.

I would personally be very happy to use a significantly slimmed down subset of TypeScript if it provided a 62x performance improvement. Slow type checking is currently the main drawback of TypeScript development in large projects.

hardwaregeek2 years ago

I'm not saying this is impossible, but I think you'd have to work rather carefully to not have to parse the code twice and effectively run two type checkers in parallel. Which, on second thought, you could do: run the fast one, then the slow one. Perhaps that's an option.

> I would personally be very happy to use a significantly slimmed down subset of TypeScript if it provided a 62x performance improvement. Slow type checking is currently the main drawback of TypeScript development in large projects.

We all think that until the slimmed down subset doesn't work with our favorite libraries. Sure, there may be people who use no dependencies, but I'd bet that there's sophisticated types in some very popular libraries that even low dependency programmers use.

travisd2 years ago

To this point, many teams are already using subsets of TypeScript to improve compilation times. Often times it’s things like always declaring return types on functions to avoid the compiler having to infer it (especially between module boundaries). Calling this a “subset of typescript” might be strong wording, but it does suggest there is some desire in that general area.

fault12 years ago

It seems like depending on how you configure TypeScript (e.g., in tsconfig), TypeScript is already something like an ensemble of mini-languages with different dialects and semantics. Much more so than other languages. But I agree, a restricted TypeScript that has to do less work (data or control flow analysis) would probably be orders of magnitude faster.

vosper2 years ago

> To this point, many teams are already using subsets of TypeScript to improve compilation times. Often times it’s things like always declaring return types on functions to avoid the compiler having to infer it (especially between module boundaries).

I hadn't heard of this trick - how much improvement does it make? Seems like it might be good for readability, too?

rslabbert2 years ago

I haven't found any large scale analyses but here's an example of a simple type annotation halving compilation time: https://stackoverflow.com/questions/36624273/investigating-l...

The Typescript compiler performance wiki might also be of interest: https://github.com/microsoft/TypeScript/wiki/Performance

epolanski2 years ago

One of the main requirements of TypeScript is that the compiler has to work in a browser. The Monaco editor is the crown jewel of TypeScript applications.

So the only possibility I see is a webassembly alternative.

kibwen2 years ago

Not saying that the TypeScript team should actually do this, but Rust does have best-in-class WASM support.

mcintyre19942 years ago

> I would personally be very happy to use a significantly slimmed down subset of TypeScript if it provided a 62x performance improvement.

I'm just guessing but I think a lot of Typescript's really complex stuff is to support third party libraries with really complex types - at least they often refer to libraries they've improved support for when they discuss complex seeming additions. So I wonder how much of the JS ecosystem you'd have to cut off to get to such a subset. A particular example would be that I've often seen a chain of 20+ dependencies leading to Lodash, and I'd guess that Lodash is really complex to type. If you lose everything that's eventually dependent on something that's really complex to type, do you have much of NPM left?

acemarke2 years ago

As a maintainer for Redux, Redux Toolkit, and React-Redux, I can tell you that we would _love_ to have several additional features in TS that would make our library TS types much easier to maintain. Like, say, partially specified generics.
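For anyone unfamiliar with that pain point, here's a hedged, made-up illustration (an invented signature, not our real API):

    declare function makeSlice<State, Name extends string>(
      name: Name,
      initialState: State
    ): { name: Name; initialState: State };

    // Type arguments are all-or-nothing: there is currently no way to pin State
    // explicitly while still letting Name be inferred from the literal argument.
    const slice = makeSlice<{ count: number }, "counter">("counter", { count: 0 });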

Tanner Linsley (author of React Table, React Query, and other similar libs) has expressed similar concerns.

(I'm actually starting to work on a talk for the upcoming TS Congress conference about "lessons learned maintaining TS libs", and this will be one of the things I point to.)

Rowern2 years ago

The issue here is that if you're running in the JS ecosystem you'll definitely want to use other people's code (npm packages or internal libs). If the subset breaks JS compatibility, you can break a significant amount of code without realising it; if it is "only" a TS subset, you need to make sure that each lib/package you import is compatible. Anyway, this does not seem like a good solution.

shados2 years ago

to be fair, even within the TypeScript world that can be a problem. If typescript versions don't line up, or if your various strict options are too strict, you can get into really weird situations. It -generally- works, but when it doesn't, it's hell.

koolba2 years ago

> I would personally be very happy to use a significantly slimmed down subset of TypeScript if it provided a 62x performance improvement. Slow type checking is currently the main drawback of TypeScript development in large projects.

Isn’t that already available by running in “transpile only” and having the type checking happen in the background?

spion2 years ago

62x improvement is not achievable. 6.2x might be.

edit: I see they are running it with 8 threads, in which case yes 50x is achievable. In any sizable codebase though, you should probably split into modules and run with some parallelism via e.g. `wsrun` or `turborepo`

Androider2 years ago

esbuild is >125x faster than Webpack, doing the exact same job. It's not some theoretical microbenchmark; you'll see it when converting a project from one to the other. If a piece of software hasn't been designed with speed in mind, and I don't think tsc has been, massive performance improvements are possible when you create a new implementation with performance as a top goal.

armchairhacker2 years ago

Even if it's not official, I could use a slimmed-down, faster TypeScript. Even if it has slightly different semantics. I always use skipLibCheck so TS only checks my code.

It’s a form of linting, not semantics, so it doesn’t have to be exact.

pcj-github2 years ago

No spec, but the golden files that check that the compiler is working as intended are very comprehensive. If you can generate the same ~38,000 files in this directory from the test cases, you're pretty much on par with tsc: https://github.com/microsoft/TypeScript/tree/main/tests/base...

jacobr12 years ago

But that only works for a bug-for-bug compiler, no? If you want to build something like a linter, you couldn't use them to validate that you are covering the full syntactic range. Though I guess it would help gauge coverage if your linter threw any errors, it wouldn't validate correctness.

ModernMech2 years ago

> This reminds me of when people try to make a new Ruby implementation. They get the basics down and think "this isn't so bad! I can do this!". Fast forward a few months and they're stuck on some nasty little edge case around eval and method_missing that they need to run Rails. Or some weird aspect of C extensions.

This is true of many programming languages projects. It's not uncommon for someone to write a language in a weekend, and then marvel at how their very simple toy language is faster than C! And look how quickly it compiles! Thus begins a several years-long journey, where with every feature added the runtime and compiler get slower and slower and slower.

msoad2 years ago

There is no spec, so TypeScript's own implementation might not accurately reflect how the team imagined things should work.

That means if this implementation passes all of TypeScript's tests, it is as good as TypeScript.

Each new issue that is opened in either repo that shows discrepancy can be a new unit test added to the TypeScript codebase.

scotty792 years ago

The good thing is that the only thing it does is type checking. So if you miss some edge case, it won't influence at all how your TS code works.

The worst that can happen is that it may not catch some type error, which JS already does none of, and which TS lets you skip wherever you want (even implicitly), and people are fine with that.

If I get 98% of TS type checking 10 times faster, it's a huge win.

simplify2 years ago

You're vastly underestimating the advanced type system features TypeScript requires to typecheck even the most basic JavaScript code. It would be more like 10% rather than 98%. Intersection and union types are no joke, and are notoriously difficult to make performant.

pjmlp2 years ago

I just saw an example where the upcoming 4.6 will improve the clever use of recursive types done by some libraries.

Give people a Turing-complete type system and they will write a complete application with it.
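A hedged example of the kind of recursion meant here (a common utility type, not from any particular library):

    type DeepReadonly<T> = {
      readonly [K in keyof T]: T[K] extends object ? DeepReadonly<T[K]> : T[K];
    };

    type Config = DeepReadonly<{ server: { port: number; tls: { cert: string } } }>;
    // Every nested property becomes readonly; the type recurses until it reaches a leaf.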

bhawks2 years ago

My experience with a project like this is that unless you reach the fabled 1:1 parity, the new tool is orders of magnitude less useful, or even actively harmful.

Saving a minute every cycle sounds very attractive, but when a person spends a morning hacking and then finds out the normal tool chain doesn't accept their work when the optimized one did, they will quickly distrust it in the future. It'll also make them very reluctant to share it with their team.

The only reason my project was successful was we ultimately reached a set of compromises with the team that owned the woefully underspecified tool chain and upstreamed large swaths of changes. I would probably never do something like that again.

sa462 years ago

I think it's okay not to get near 1:1 parity, given the amount of unchecked type-casts (foo as unknown as X) and ts-ignores that happen anyway in most code bases. The other mitigating factor is that the type checker is advisory: the code will run (and might explode), but it will run even if the types are wrong, since compiling TypeScript is mostly just throwing away the type info.

spion2 years ago

The good thing is you have a ton of code on the Internet to use to test it. Unlike with a dynamic language reimplementation, with a static language transpiler you can check whether the results produced are identical to `tsc`'s.

lhorie2 years ago

Yeah, the lack of specification for Typescript is quickly turning into the Achilles heel for the web tooling ecosystem. Several projects got the parsing part done, but a lack of specification and the relatively fast release cycle means it isn't really feasible to invest into actually implementing the meat of the TS compiler, so everyone is stuck with that last remnant of critical tooling running slowly.

pawelmurias2 years ago

> It's also a mess that does not have a spec. Even if you somehow manage to make a type checker that implements most of the features, there's no way to ensure it has 1 to 1 parity with TypeScript. Plus that parity can be broken at any point.

You need a test suite for this, not a spec.

skybrian2 years ago

One way to collaborate would be to write a spec as you go, along with an associated test suite that can be run against multiple implementations.

Presumably TypeScript has a test suite already that might serve as a starting point?

taxyz232 years ago

I really like Dart, actually. It's typed, close enough to JavaScript, and yet I don't think it's an unholy mess?

newlisp2 years ago

But in the meantime he gets to work on a fun project, learn a lot and get paid for it at the same time ;)

jupp0r2 years ago

> tsc depends on shared mutability and has a cyclical mutable reference. Rust is designed to prevent this behavior. Having two references to the same data (shared mutability) is undefined behavior in Rust.

This problem has numerous solutions in Rust - (A)Rc, RefCell etc. I don't understand how this caused the author of the article to stop considering Rust.

petmon2 years ago

Non-GC languages are bad in general at handling cyclical references (mutable or not), and Rust is especially ill suited for it because of the borrow checker. For example should an arbitrary reference in tsc become an rc::Rc or an rc::Weak? You have to tease out the implicit ownership model and one may not even exist.

tsc was designed expecting GC and it makes perfect sense to port it to another fast GC language.

throw109202 years ago

I'm not a Rust fanboy (in fact, I rather despise the RESF, especially given Rust's mediocre tooling), but: I think that "You have to tease out the implicit ownership model and one may not even exist" suggests a problem with the program itself.

I can't think of any reason why a program must have an ill-formed ownership model (unlike, say, the fact that many interesting programs must have an embedded dynamic type system (tagged unions etc) to do their thing).

Now. The Rust language and existing tooling doesn't give you any help in figuring out the ownership model, but I can easily imagine a language and tools that would help with that.

This isn't meant to refute anything you've said (you said "[existing] non-GC languages are bad in general" not that non-GC languages cannot be good) - just add clarity.

bcrosby952 years ago

A program doesn't have to have an ill-formed ownership model. But if it does and you're trying to translate from one language to another, you should probably pick the language that makes the job easier.

It reminds me of a programming assignment I had 20 years ago. We had to take our teacher's old FORTRAN program and "convert" it to Java or C++. I knew both equally well at the time - and I picked C++ because it had goto whereas Java didn't.

This meant that I didn't really have to unravel the meaning of the program. Whereas with Java, I would have to.

klodolph2 years ago

Speaking as someone who has written non-trivial compilers in both Rust and Go...

Compilers tend to use a bit thornier data structures and graphs than you might see in other realms. It is very convenient to just represent these graphs as objects connected by pointers, and know that you don't have to worry about objects that become unreachable.

For example, the values in an SSA-based IR are all full of pointers to other values. Each value represents the computation to produce that value, like "x12 / 4". A strength-reduction pass goes in and mutates this node, replacing it with "x12 >> 2" (if such a change is valid). It knows the change is valid because it follows pointers from the node to its inputs. The change is available to all upstream values immediately because it is mutated. These changes may make other values unreachable, at which point, you don't care about them. You only care about the values which are reachable from computations with side effects. These nodes definitely have cycles and IMO, the notion of weak references between them doesn't make sense.
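A rough, hypothetical sketch of that shape in a GC'd language (not tsc's or any real compiler's IR):

    interface Value {
      op: "const" | "div" | "shr";
      n?: number;        // payload for constants
      inputs: Value[];   // pointers to other values; cycles are allowed
    }

    function strengthReduce(v: Value): void {
      const rhs = v.inputs[1];
      if (v.op === "div" && rhs?.op === "const" && Number.isInteger(Math.log2(rhs.n!))) {
        v.op = "shr";                                                    // mutate in place; every
        v.inputs[1] = { op: "const", n: Math.log2(rhs.n!), inputs: [] }; // user sees it via its pointer
        // The old divisor node may now be unreachable; the GC simply reclaims it.
      }
    }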

I'm sure there's a way to do that in Rust. Please, nobody try to explain how. I could figure it out. The thing is... it's not obvious how to do it in Rust, and figuring out how to "explain ownership to the Rust compiler" without creating Rc<RefCell<T>> cycle leaks is a non-trivial task that sucks away time I could be spending writing the compiler.

This reminds me of a paper I saw go by years back about writing a compiler in Haskell. Pure Haskell code doesn't let you mutate objects and doesn't let you do pointer comparisons, so it creates problems if you're trying to implement an algorithm which is conceptually pure (at a high level, a function with inputs and outputs) but where you're taking the implementation from a paper which constructs the output iteratively, piece by piece. The problem is that your data structures have to be annotated to show which parts are mutable (IORef, STRef, or TVar), but "which parts are mutable" may have a different answer depending on whether you are inside some function transforming or constructing a data structure.

It was obvious to me, reading the paper, that while I love functional purity, and functional purity solves all sorts of problems and makes it easier to write code in general, it sometimes gets in the way of getting shit done. The same is true of Rust's notion of ownership... super useful when it lets me write fast, efficient code without having to think about ownership (when it just works and gets out of the way), and super annoying when you have to figure out how to explain your program to Rust.

chubot2 years ago

If your program is an interpreter, then I think the ownership model is going to be dominated by the program being interpreted, not the interpreter itself... That is, it has to be general enough to interpret any program, and many of those programs will have cyclic data structures.

TSC is not an interpreter, but its type system is Turing complete, so I wouldn't be surprised if it has some similar issues.

adamnemecek2 years ago

> I rather despise the RESF, especially given Rust's mediocre tooling

Which tooling are you talking about?

sealeck2 years ago

Rust does have a third-party `Gc` type (which used to be in the stdlib).

cztomsik2 years ago

Is that the one where you need to derive() for every struct you want to use, and which is, in general, not compatible with the rest of the Rust ecosystem?

armchairhacker2 years ago

I honestly like the explicit weak-ref model, and it's usually trivial (child objects are strong references, parents and siblings are weak), and Rust has Gc as a last resort.

But if the author's more familiar with Go, performance in Go is likely good enough.

slaymaker19072 years ago

While I hate Go and love Rust, there is also the issue that RefCell/Gc/Rc are very different in terms of how expensive a memory allocation is compared to Node. Historically there were very similar problems in trying to write C++ like one would write Java. Memory allocators today are much better and closer to GC languages, but I suspect it will still be more expensive for very allocation-heavy code.

One way you could get this to work at the cost of some safety (depending on how you do it) would be to use arena style allocation. Obviously there are ways to do this safely in Rust, but it is probably easier to do it unsafely for a port.

However, if you just use a high performance language with a GC like Go/C#/Java, the port will be easier and you won't have to worry about doing too many allocations.

speedgoose2 years ago

I think the point is valid. As a rust beginner I tried to make a graph data structure and it was difficult. Using RefCell is not very nice because it moves the problem from compile time to runtime and you still can’t mutate the same data structure more than once at the same time.

m-hilgendorf2 years ago

Rust encourages you to use an adjacency list/matrix for your graph representation, which is preferable almost all of the time for graph algorithms. It's a little unwieldy if the graph needs to be mutable during the core algorithms, but almost all algorithms in practice assume it will be stable.
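For reference, a minimal sketch of the adjacency-list idea (language-agnostic; indices stand in for pointers):

    interface Graph {
      nodes: string[];     // node payloads, addressed by index
      edges: number[][];   // edges[i] lists the neighbor indices of node i
    }

    function addEdge(g: Graph, from: number, to: number): void {
      g.edges[from].push(to);
    }

    const g: Graph = { nodes: ["a", "b"], edges: [[], []] };
    addEdge(g, 0, 1); // no webs of parent/child references for a borrow checker to fight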

tomcam2 years ago

This is interesting, but I don't know Rust. Presumably your graph data structure grows dynamically, meaning you have to add nodes at runtime. When you say you can't mutate the same data structure more than once at the same time, what does that mean? That you can't set both a parent and a child pointer in the same function, or what?

speedgoose2 years ago

I'm very far from being a Rust expert, but it was difficult to build a bidirectional graph because it's difficult to connect the nodes together. A simple recursive algorithm will not work. I ended up using a crate.

I found the following article interesting on the topic : https://aminb.gitbooks.io/rust-for-c/content/graphs/

jupp0r2 years ago

It shouldn't be difficult:

1. You have a graph struct that has ownership of all nodes (in Rc containers).

2. Nodes have Weak references to connected nodes

bobbylarrybobby2 years ago

In the case of RefCell, when you begin modifying its data, you’re not allowed to begin modifying its data a second time without declaring “I’m done” with the first modification. This makes doing several modifications in different places that are interspersed with each other tricky because it’s hard to reason about when each spot is actually done so that a modification can be performed elsewhere. (If your program is single threaded it’s possible you don’t even care about mutation happening in multiple places, but Rust cares because in general Rust programs need not be single threaded.)

throw109202 years ago

Rust doesn't allow you to relax mutation constraints for a single-threaded program?

marcus_holmes2 years ago

Given that ESBuild [0] is a way better replacement for Webpack already (and similarly written in Go) I can see how we could pretty soon have an entire toolchain written in Go for dealing with compiling JS.

Hopefully the whole toolchain will be embeddable, too. Being able to do configuration-in-code and produce an executable that does the whole build step in milliseconds would be cool.

[0] https://esbuild.github.io/

moltonel3x2 years ago

SWC (by the same author as this tentative tsc port) is a part of more and more people's typescript toolchain, and is written in Rust. Having all your ecosystem in a single language feels nice, but pragmatically different tools might be better in different languages.

marcus_holmes2 years ago

The point of having it all embeddable in Go is that we could write a single executable that handles the entire process, with the toolchain embedded and the configuration written as code.

kibwen2 years ago

You can embed Rust code in Go, unless there's something about Go that prevents static linking across FFI boundaries.

Thaxll2 years ago

It's not easy and not very common.

newlisp2 years ago

The author of esbuild also started the implementation of esbuild in Rust, found it painful too, and switched to building it in Go.

pjmlp2 years ago

For me the ideal use case for Rust is really for places where automatic memory management is a hard sell, even if technically possible, kernels, drivers, hypervisors, firmware, GPGPU, ...

For everything else I rather enjoy the productivity of automatic memory management.

ameliaquining2 years ago

My general inclination is to go straight to Rust if I already tried a different reasonably performant language (like Node.js) and it wasn't fast enough—because only Rust (and C and C++ and a few more obscure languages) guarantees that basically any performance optimization that I might want to do will at least be possible, even if it might be inelegant or unsafe. I don't want to run the risk of hitting another performance wall that I can't climb over.

That being said, porting this kind of codebase to Rust would indeed be really rough—definitely possible, but by no means pleasant. I like to think the platonic ideal Rust, which we're hopefully getting closer and closer to over time, is capable of elegantly handling cyclic data structures and the like even while encouraging a single-ownership model for the 95% of cases where that's the right design. I don't think the Rust we have today is there yet.

pjmlp2 years ago

I wouldn't put node.js and performance in the same sentence.

Better than other dynamic languages, definitely. That is about it.

There are plenty of languages that offer AOT compilation and automatically managed memory to choose from.

scottlamb2 years ago

The "why not port to Rust" part boils down to these paragraphs:

> tsc uses a lot of shared mutability and many parts depend on garbage collection. Even though I’m an advocate and believer in Rust, it doesn’t feel like the right tool for the job here. Using Rust required me to use unsafe too much for this specific project.

> tsc depends on shared mutability and has a cyclical mutable reference. Rust is designed to prevent this behavior. Having two references to the same data (shared mutability) is undefined behavior in Rust.

I'd be curious to hear more. In particular:

* Two references to the same data: were cells too awkward?

* Actual garbage collection: my first instinct would be to use an arena for each compilation, destroying the whole thing at once rather than doing reference counting or garbage collection. I gather that's not cool in the age of language servers and incremental compilation, though. Does tsc support those?

yazaddaruvala2 years ago

All of that is valid for a rewrite, but not efficient if you’re porting the current implementation.

As I understand it, in this case, Rust would be able to have a line to line port from JavaScript.

I do still think your overall point holds. Why not use a library in the Rust ecosystem to help with this problem e.g. some GC<T> type.

scottlamb2 years ago

> All of that is valid for a rewrite, but not efficient if you’re porting the current implementation.

Help me see it?

I can easily see how getting rid of the cyclical structure crosses from "line-to-line port" into "rewrite" territory. I'm not saying you're wrong, but I'm having more trouble seeing how "wrap some types in cells and pass along a lifetime and arena parameter" crosses that threshold. It might not be feasible at all, though, if tsc compilations are long-lived for language server + incremental compilation support.

> Why not use a library in the Rust ecosystem to help with this problem e.g. some GC<T> type.

iirc I've seen some experimental GC library mentioned. I think something like that is at least off the beaten path enough that I can see how the author might decide it's better to just use Go instead.

charrondev2 years ago

TSC both supports a watch mode and is used heavily as a language server.

sealeck2 years ago

It used to be in std.

Zababa2 years ago

I see a few people wanting to rewrite tsc in something other than TypeScript because tsc itself is too slow. I don't remember seeing any "proof" or analysis that TypeScript is the limiting factor in tsc. The basis often used is the performance of esbuild compared to other bundlers. From the esbuild FAQ, "Why is esbuild fast?":

> Several reasons:

> It's written in Go and compiles to native code.

> Parallelism is used heavily.

> Everything in esbuild is written from scratch.

> Memory is used efficiently.

Using another language will solve 1 and will maybe solve 2. From this FAQ too:

> For example, many bundlers use the official TypeScript compiler as a parser. But it was built to serve the goals of the TypeScript compiler team and they do not have performance as a top priority. Their code makes pretty heavy use of megamorphic object shapes and unnecessary dynamic property accesses (both well-known JavaScript speed bumps).
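A minimal sketch of what "megamorphic object shapes" means in practice (invented example):

    function getKind(node: { kind: number }): number {
      return node.kind; // fast while every `node` seen here has the same hidden class
    }

    const a = { kind: 1 };
    const b = { kind: 2, name: "x" };            // extra or differently ordered fields give
    const c = { kind: 3, flags: 0, name: "y" };  // each object a distinct shape, so this call
    getKind(a); getKind(b); getKind(c);          // site's property access loses its fast inline cache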

tsc itself seems like it could be optimized. By how much, I have no idea. But it sounds like that would be easier than writing something from scratch. esbuild, while a great tool, is limited in scope. Type checking TypeScript, on the other hand, is a moving target, and a fast one at that. All in all, I'm not convinced by this approach compared to "regular performance work" on tsc itself. I think people talk about it because it's easier to start your own project than to change tsc. In that case, the problem sounds political rather than technical.

There's the 62x speedup from the article, but there's no info on what features are supported. I feel like this is something where the devil is in the details.

I'm happy to be proven wrong on this subject; I fear that we're going to increase the number of tools in the frontend ecosystem for no good reason and make everything a bit harder again.

n0w2 years ago

I had the same thought.

A direct port of tsc to a "faster" language doesn't guarantee a faster type checker. Sure it might be a bit faster, but my guess is that the large performance wins come from a different design that takes performance into consideration from the start.

Maybe the thought process is that a 1:1 rewrite will handle compatibility and feature completeness? I suppose you could optimise from there?

x-shadowban2 years ago

How can you avoid megamorphizing an AST hierarchy?

BadOakOx2 years ago

Just from a week ago on HN:

I'm making a Typescript type-checker in Rust: https://news.ycombinator.com/item?id=30033119

epolanski2 years ago

Next week it will be Zig then

andyfleming2 years ago

Here's a relevant GitHub thread where some conversation has been happening: https://github.com/swc-project/swc/issues/571

jkelleyrtp2 years ago

TSC doesn't need to "stick around", right? Just a run-once and the program is over?

In those cases, https://github.com/fitzgen/bumpalo works amazingly as an arena. You can pretty much forget about reference counting and have direct references everywhere in your graph. The disadvantage is that it's hard to modify your tree without leaving memory around.

We use it extensively in http://github.com/dioxusLabs/dioxus and don't need to worry about Rc anywhere in the graph/diffing code.

anonymousCar2 years ago

I think --watch mode means it's running while developing

connor43122 years ago

Additionally, the language server used to power editors is long lived. TypeScript isn't just the compiler.

danenania2 years ago

If anyone happens to know: what are tsc's bottlenecks?

I presume that if the bottleneck was just number of loop iterations, V8 would already be about as fast as C or any other language?

Is it simply that accessing/mutating javascript data structures like objects, arrays, set, etc. is slow? Are those not fully optimizable by V8 for some reason? Or is it something else like the GC being slow?

cpcallen2 years ago

Yes, I'm curious about this too.

Given that gopherjs[1] can compile Go to JavaScript and in some cases thereby achieve better performance than the native Go compiler[2], it's not obvious to me that a JavaScript program ported mechanically to Go is likely to be any faster than the original. It seems to me that the effort might be better applied to optimising the performance of tsc itself.

[1] https://github.com/gopherjs/gopherjs [2] https://medium.com/gopherjs/surprises-in-gopherjs-performanc...

newlisp2 years ago

This is a problem that can take advantage of good parallelism; Go will give you that and at the same time be more memory-efficient.

davepeck2 years ago

(In a different direction, Microsoft's pyright type checker for Python -- IMO the fastest and most correct type checker in the Python ecosystem -- is implemented in TypeScript!)

zem2 years ago

as someone who works on a competing typechecker, i'm definitely very impressed by pyright, especially since it has a single author! we have three people working on pytype and it definitely feels like too little; the way the pyright author manages such a great development velocity is mindblowing, and means they got the initial architecture very right.

pjmlp2 years ago

The thing with all those projects is playing catch-up with upstream, and the possible variations from what is actually the official semantics depending on the TypeScript version.

throwaway8943452 years ago

Personally I wouldn't fret about keeping pace with upstream. Eventually the syntax will stabilize if it hasn't already. I'd be more worried about fidelity--does this produce the same result as the reference implementation? TFA says he's running against the "conformance" test suite, so that should keep things pretty tight and when bugs are found they'll be fixed in time and presumably the test suite will be updated accordingly.

munificent2 years ago

> Eventually the syntax will stabilize if it hasn't already.

Every language in wide use today is still evolving and, if anything, it seems that the pace of language change is generally increasing.

throwaway8943452 years ago

Some amount of evolution is fine, and I think your “pace of language change is increasing” may be true for some languages and time scales, but most languages don’t change more in their second five years than their first.

wwwigham2 years ago

The `conformance` suite is the "set of tests people added when a feature was first released", usually without the decade of follow-up regression tests that live in the `compiler` suite. In TS itself, the `conformance` and `compiler` suites are considered part of the same unit most of the time. (The compiler suite, however, is not helpfully subdivided by feature, since it oft deals with cross-cutting concerns!)

cellularmitosis2 years ago

If OP delivers on the preliminary 62x speed up, how could upstream justify not jumping ship to OP’s implementation? Certainly the user base will demand it.

pjmlp2 years ago

Being Microsoft, they would rather make use of their Typescript compiler for IoT, yet they haven't.

https://makecode.com/language

wrs2 years ago

62x seems like a lot. It makes me wonder if someone has put serious effort into profiling and optimizing the JavaScript version. If not, that could be a better use of time than a port. I'd guess the TypeScript team may not be interested in maintaining a Go version, so this port will always be behind the current TypeScript definition.

epolanski2 years ago

I'm not so positive about these projects.

They are well-intentioned, but TypeScript doesn't have an official spec; it's an implementation-defined language. It's not C, it's not JavaScript, it's whatever each release wants it to be. This makes it very difficult to follow every version and evolution: you're constantly chasing whatever has been merged in each release.

folkhack2 years ago

And it's the exact reason I'm really reluctant to fully buy into Typescript... I always feel like I'm wrong about the "not an official spec" part of things but you are 100% correct in my understanding.

Also on "it's whatever each release wants it to be" - that is hell on earth if you're looking to target consistent software processes and it only gets worse the more you scale (both solution and team).

I get the value of Typescript - hell I've used it extensively myself... it's just that I can't stop feeling like this is all an anti-pattern: trying to make a loosely-typed language a typed language (if you can call it that?). =/

lprd2 years ago

I share the same concerns. It’s also a little irking that if I want to enjoy good tooling for typescript, I now have to install another language on my system since the language that typescript sits on is a hot mess. I don’t mean to cast shame on JavaScript, but I’d be remiss to not say that modern js development is a bit insane. The only reason we all go along with it is because we have to.

eropple2 years ago

You don't have to install another language to use tools written in Golang (or Rust) unless you have to build binaries yourself.

sapphyrus2 years ago

Maybe tsc already does this, but why not just make extensive use of caching to get the compile times for "incremental builds" under control? Why would it have to re-check all 10,000 files in my project just because I modified a single file? That seems way more manageable (and less error-prone) than rewriting the entire thing.

forty2 years ago

Go means it's going to be easier to integrate with esbuild :)

I'm happy to see this, as honestly I don't really understand why you would compile TS without type checking.

eatonphil2 years ago

> I'm happy to see this, as honestly I don't really understand why you would compile TS without type checking.

I don't think I'm remotely alone in doing this but I'll share from my perspective. I have two modes while writing TypeScript stuff: 1) prototyping something where I don't want to spend time writing types (or maybe I write some types but I don't care if they are correct or line up yet) but I still want to have JavaScript generated so I can test the thing out manually and 2) getting types correct as part of long-term maintenance and edge-case removal.

There are fast TypeScript -> JavaScript generators today like swc and esbuild. tsc is not fast. So I can speed up "1)" from above if I use swc and esbuild and only use tsc for "2)" from above.

As people write faster implementations of tsc-the-type-checker though I'll be happy to use that instead and just not use tsc anymore.

m-hilgendorf2 years ago

> 1) prototyping something where I don't want to spend time writing types

It doesn't compile to JS, but this is one of the canonical use cases for Stanza's (1) "optional" type system

(1) http://lbstanza.org/optional_typing.html

mixedCase2 years ago

Have you tried doing Type-Driven Development? IME, determining the constraints through types first, with the code being written later, tends to replace the game of "what kind of complex and non-obvious types fit my code best" with an "oh, the implementation is obvious" reaction.

It also makes it easier to determine what behavior is worth testing and which ones are already covered by your types, if you tend to write tests last. If instead you're used to Test-Driven Development you can do types->tests->code.

eatonphil2 years ago

I've spent a lot of time programming in Standard ML and yeah that's a prerequisite there.

But no I don't prefer it to what I do in TypeScript.

tshaddox2 years ago

Out of curiosity, how often do you have a project in the first mode that gets large enough that the speed of compiling TS to JS is significant?

nicoburns2 years ago

If you're writing UI code then even a few seconds is slow. You don't want to be waiting 5 seconds every time you move something a couple of pixels to the left.

eatonphil2 years ago

There is a huge difference even in small and mid-size projects.

https://datastation.multiprocess.io/blog/2021-11-13-benchmar...

forty2 years ago

That makes sense, I would generally use JS directly for mode 1), but I understand the use case.

mattkrick2 years ago

Personally, I let vscode do typechecking on open files & have a pre-commit git hook to typecheck changed files.

When it comes to starting up a development server & building the client, there's a huge cost to repeatedly typechecking the same 1000+ files. By cutting out typechecking & only compiling, I can reduce the webpack client build from 40 seconds to 3 seconds (using sucrase).

forty2 years ago

I'm replying to you, but this is really for most siblings:

I maintain a fairly large TS code base (a Node.js backend). I think a full rebuild is 1:30 on my relatively recent laptop (i7-8650U, lots of RAM). But in practice I always use tsc -w, and compilation is mostly instant after editing a file (I do have to wait whenever I launch the command after I boot, but after that, it's fast enough).

tsc now supports incremental compilation too, though I haven't played with it much as I'm happy with watch mode.

rezonant2 years ago

Incremental compilation is _okay_, but it seems to perform full rebuilds if the previous build is _too_ old, which is weird, meaning that more builds than expected take the entire time to complete. Nonetheless I use it for all of my builds. On the other hand, this may just be because of the spiraling nature of type inference, where TS needs to re-examine a large amount of the program after a change that appears simple.

Personally I only have a small number of projects that take more than 10-20 seconds to compile, but those ones are painful. I should probably do the same with -w for those.

morelisp2 years ago

What if not everyone else on your team is paying attention to their editor's warnings? What if the LS's async response doesn't come fast enough? What if, god forbid, you have to change something on a different computer?

If it's not in your CI/CD scripts, it's not actually getting checked.

> I can reduce the webpack client build... to 3 seconds

This is about 10x longer than any interaction with a computer should be.

chakkepolja2 years ago

> If it's not in your CI/CD scripts, it's not actually getting checked.

You can make a similar argument and say you must write in a language with a sound type system, and TS's type system is unsound.

morelisp2 years ago

Sorry, I don't understand the analogy. Yes, TS allows some bugs through, and that's a problem too. But that's also a far cry from "always rely on humans to run complex processes in a consistent way."

tdfirth2 years ago

This is similar to what I do (although I use esbuild), however like an idiot I just run tsc manually so obviously I forget sometimes and it takes 10 minutes to realise the build failed.

Not great... I'm ashamed but it's just me on this project atm.

edit: misread parent originally

bastawhiz2 years ago

If it takes one minute to type check a codebase and ten seconds to compile it to JavaScript, then whenever I simply need to produce a JavaScript file without type checking I'll save myself the fifty seconds. If my editor's TS language server does the checking and my bundler is just compiling for my dev environment, I don't want both tools duplicating each other's work.

forty2 years ago

Replying to myself with a colleague's suggestion (from the tsc docs, actually): use tsc --noEmit (for type checking) and a fast compiler.

tsc --noEmit takes about 2/3 of a normal build on our fairly large code base, so that can definitely speed things up (for CI, for example).

apazzolini2 years ago

We use SWC with ts-node to strip types in dev for faster iteration loops and run the full tsc typechecker in CI (and optionally as a separate process on your local machine).

postalrat2 years ago

You compile ts without type checking when the options to do so are 10x+ faster than with type checking.

pjmlp2 years ago

I really don't get this. When I save my TS files they are already compiled into JS on the fly, ready to be fed into a <script />.

recursive2 years ago

By what? What if they were updated from an upstream repo, and never saved from an editor? What if you change branches without an upstream repo?

pjmlp2 years ago

By VS and VSCode, the IDE/editor from the same company that makes TypeScript.

If I really need to call tsc explicitly, the few seconds that I have to wait, versus the minutes on some other languages, won't make me lose sleep over it.

pea2 years ago

Aside from preferences, is there a reason why C# or F# wouldn't be a more natural choice? I'd imagine porting TS to C# would be easier (especially for this specific use-case) and that the performance would be good.

smasher1642 years ago

I wonder how it would fare if instead of rewriting from scratch, they applied automated transformation from the Typescript source code to Go. It wouldn't have to capture all the semantics of TS/JS, but be opinionated enough to transform the Typescript source.

Another possibility is to use something like GraalVM native-image to compile Typescript's type-checker into a binary. This might offer a speedup over its execution on V8.

silasb2 years ago

I'm surprised to see no one comment on Zig. It seems like the bun.sh developers are making some bold changes with such a low-level language.

kbd2 years ago

I'm super bullish on Zig, but it's a bit early to bank on for production things as the language is still changing/being developed.

To give another example of where this came up: they were discussing using Zig for the Ruby JIT, but despite really liking Zig the team decided to go with Rust because of its maturity: https://bugs.ruby-lang.org/issues/18481#note-21

newlisp2 years ago

I'm more curious about whether the author researched OCaml as an alternative.

andreypopp2 years ago

Well... how many paid work hours have been spent on tsc? And does TypeScript have a formal language specification? I'm pessimistic that it's possible to rewrite tsc with full compatibility (bug-for-bug, otherwise it doesn't really make much sense) without MS support... but who knows... I'd like fast compiles too, so good luck!

paavohtl2 years ago

A formal specification would be basically useless, if nothing guarantees that the reference implementation actually follows it. Sharing & contributing to the same test suite is a much more practical approach.

quickthrower22 years ago

As a smoke test you can swap it in for a lot of the npm ecosystem and run the tests of those projects on the compiled code.

andreypopp2 years ago

A specification that isn't being followed is of course useless. But yeah, tests are definitely useful and, in the absence of a spec, are the only sane way to approach such a rewrite.

kaba02 years ago

Slightly off topic, but am I missing something about Go? It comes up way too often here nowadays, perhaps even more than Lisps - last time Rust was the hyped one, but that seems to have died down a bit. But at least Rust is an interesting language with some novel features, adding new value to the low-level PL domain. I fail to see what is unique about Go, other than perhaps the least-known non-blocking concurrency (based on the recent “invisible” article on this). I find its simplicity (with the often-claimed reason being basically “code monkeys not hindering each other’s work”, which is quite offensive) really off-putting.

Mawr2 years ago

> I fail to see what is unique about Go

That's because you're looking at it purely from only one point of view - the number of available features in the language itself.

What's unique about Go is not so much about the language itself as the approach of its authors to creating it. Go hasn't meaningfully changed in the 10 years of its existence. That's not an accident - the bar to add anything is extremely high. A feature needs to be punching way above its cost to even be considered.

If you look around the programming language space, you'll see the exact opposite attitudes, features get added into languages willy-nilly. Not much point in having different languages when they just trade features until they become indistinguishable from each other.

There are only three other languages that I know of that at least sort-of follow a similar philosophy to Go: 1. C 2. Zig 3. Elixir

You aren't going to be writing general-purpose code in C, so that's out. Zig is about a decade away from being a contender. Elixir is a valid choice, albeit much more niche and unfamiliar to the average programmer than Go is.

So if you want a language that'll be there for you in the long haul, that will not invalidate what you've learned about it every 3 years or so, you're left with Go.

kaba02 years ago

> If you look around the programming language space, you'll see the exact opposite attitudes, features get added into languages willy-nilly

Java has been going with the last-mover advantage of only ever incorporating a new feature once it has proved useful in other languages. That has been going on for 25 years, so it is definitely the queen of this category. Not sure why it is so seldom appreciated, but it is also a very small language (compare the number of keywords it has with C#'s, for example), and it was even named a “blue-collar” language for a reason. So I still don't see Go bringing anything new even from this perspective.

gonzo412 years ago

Go is about getting things done. I think it's interesting to HN because it is enough engineering to solve a problem, and no more. I think the biggest feature of Go is the community process that considers change slowly; in contrast to the JS ecosystem, that's a distinct feature.

ripley122 years ago

Personally, I think the appeal of Go has much more to do with the compiler and the tooling than the language.

The compiler is blazing fast and you get small fast statically linked executables by default. That's very appealing to me even though the language itself is a bit spartan.

nevir2 years ago

Super skeptical that porting this to a new language is going to have the kind of performance gains they expect ... assuming they want to replicate all of tsc's functionality

eyelidlessness2 years ago

It's already been proven (by esbuild, SWC, Bun) to drastically improve performance of TypeScript's other responsibility: compilation (tsc doesn't offer an option for this, but ts-node does, for comparison). And while the type checker is certainly more complex, I have a hard time imagining it won't show similar improvements in a native language.

resoluteteeth2 years ago

Compiling TypeScript is basically just stripping the type definitions/annotations and other TypeScript-specific parts of the syntax out. That's a huge difference from actually type checking it.

eyelidlessness2 years ago

And yet, it’s drastically slower in JavaScript than in Go, Rust and Zig. I realize different algorithms will optimize differently between JS and native code, but I sincerely doubt tsc is anywhere near optimal speed.

lucian19002 years ago

That's mostly because tsc's parser is used by the JS implementations.

Writing a purpose-built parser would achieve performance closer to the other implementations.

strager2 years ago

esbuild, SWC, and Bun are not ports of existing programs.

sam0x172 years ago

Oof, trust me from experience, you do _not_ want to write a compiler or parser or anything like that in Go. The string support is so limited -- for example, you have to implement basic things like reversing a string from scratch instead of them (and much more) being in the standard library. Much better to use something like Crystal or Rust.
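
A minimal sketch of the kind of hand-rolling being described (the helper name is made up; Go's standard library has no string-reverse function):

    package main

    import "fmt"

    // reverseString is a hand-rolled helper (the name is hypothetical);
    // reversing by runes rather than bytes keeps multi-byte UTF-8
    // characters intact.
    func reverseString(s string) string {
        runes := []rune(s)
        for i, j := 0, len(runes)-1; i < j; i, j = i+1, j-1 {
            runes[i], runes[j] = runes[j], runes[i]
        }
        return string(runes)
    }

    func main() {
        fmt.Println(reverseString("gnirts")) // prints "string"
    }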

ricardobeat2 years ago

ESBuild seems to be doing fine.

I’m a crystal fan though, would love to see more tooling built with it.

ngrilly2 years ago

The only string handling a compiler needs is usually during tokenization. After that the compiler only manipulates tokens, AST, SSA, IR, etc.
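
As a rough sketch of that point (names and structure are illustrative, not tsc's): the scanner below is the only stage that touches the raw source string; everything downstream would operate on the token slice.

    package main

    import (
        "fmt"
        "unicode"
    )

    type TokenKind int

    const (
        TokIdent TokenKind = iota
        TokNumber
        TokPunct
    )

    type Token struct {
        Kind TokenKind
        Text string
    }

    // scan does all of the string handling up front; a parser or
    // type checker built on top of it would only see []Token.
    func scan(src string) []Token {
        var toks []Token
        runes := []rune(src)
        for i := 0; i < len(runes); {
            r := runes[i]
            switch {
            case unicode.IsSpace(r):
                i++
            case unicode.IsLetter(r) || r == '_':
                j := i
                for j < len(runes) && (unicode.IsLetter(runes[j]) || unicode.IsDigit(runes[j]) || runes[j] == '_') {
                    j++
                }
                toks = append(toks, Token{TokIdent, string(runes[i:j])})
                i = j
            case unicode.IsDigit(r):
                j := i
                for j < len(runes) && unicode.IsDigit(runes[j]) {
                    j++
                }
                toks = append(toks, Token{TokNumber, string(runes[i:j])})
                i = j
            default:
                toks = append(toks, Token{TokPunct, string(r)})
                i++
            }
        }
        return toks
    }

    func main() {
        fmt.Println(scan("let x = 42;")) // identifiers, a number, and punctuation
    }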

sam0x172 years ago

And then you face Go's usual awkward issues with its lack of generics, which becomes important when, say, traversing a tree.

constexpr2 years ago

I'm the author of esbuild. I hadn't written Go before I started esbuild and I thought I would miss generics more, but I didn't. There is literally only one situation in the whole esbuild code base where I've found myself wishing for generics: sorting an array. The workaround is easy and involves implementing an interface with three methods, each of which is typically a single line: https://pkg.go.dev/sort#Interface. And I only needed a few of these interfaces for a few different types. That wasn't nearly enough of a reason to not use Go, especially considering the other advantages.
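
For reference, the workaround looks roughly like this (the slice type here is made up, but sort.Interface and its three methods are the real standard-library API):

    package main

    import (
        "fmt"
        "sort"
    )

    // byName is a hypothetical slice type; the three one-line
    // methods below satisfy sort.Interface.
    type byName []string

    func (s byName) Len() int           { return len(s) }
    func (s byName) Less(i, j int) bool { return s[i] < s[j] }
    func (s byName) Swap(i, j int)      { s[i], s[j] = s[j], s[i] }

    func main() {
        names := byName{"c.ts", "a.ts", "b.ts"}
        sort.Sort(names)
        fmt.Println(names) // [a.ts b.ts c.ts]
    }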

davidmdm2 years ago

+1

pjmlp2 years ago

Thankfully that will end next month.

arwineap2 years ago

Truthfully I'm not looking forward to it.

kunley2 years ago

Sounds strange. Where do you reverse strings in a compiler?

wglb2 years ago

Nah. I’ve written compilers in much less friendly languages than go.

pjmlp2 years ago

Way better than C.

slig2 years ago

Cannot wait for their webpack replacement.

politician2 years ago

I'm sorrowful that such a significant percentage of this post was a preemptive apology for not using Rust.

newlisp2 years ago

Well, this is a blow for the "Rewrite it in Rust" movement.

mdwagner2 years ago

I'll throw this out there (even though I know the response I'll get), why not use Crystal-lang instead of Go-lang? Besides the obvious differences in funding, what does Go have that Crystal doesn't?

antihero2 years ago

It will be interesting to see if they create a language server so we can use it in VSCode.

lftp2 years ago

Is this something that could be automated with something like https://aws.github.io/jsii/?

npretto2 years ago

So this is a "This week vercel is sponsoring" episode...

adamnemecek2 years ago

The way of dealing with cycles is generational arenas. Also where is the codebase? I would like to see the uses of unsafe.

konart2 years ago

>shared mutability

Sure but this goes against "Don't communicate by sharing memory", no?

scottlamb2 years ago

> Sure but this goes against "Don't communicate by sharing memory", no?

Yes, but that's advice rather than something Go requires. It's reasonable to say it's outweighed by the desire to avoid significantly restructuring the existing logic. The author is explicitly choosing to match the upstream program's structure to limit the project scope, and whether that structure is good or not is almost beside the point.

btw: especially in the Rust community, shared mutability doesn't always mean shared between threads/tasks. It's also a concern in single-threaded/non-concurrent code due to aliasing. This case does actually seem to involve multithreading, given that the author mentioned "using 8 threads".
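
To make the distinction concrete, a minimal sketch (with a made-up symbolTable type) of Go code that communicates by sharing memory behind a mutex; the language permits this, the proverb merely discourages it:

    package main

    import (
        "fmt"
        "sync"
    )

    // symbolTable is a hypothetical shared structure, guarded by a
    // mutex rather than communicated over a channel.
    type symbolTable struct {
        mu    sync.Mutex
        names map[string]int
    }

    func (t *symbolTable) add(name string) {
        t.mu.Lock()
        defer t.mu.Unlock()
        t.names[name]++
    }

    func main() {
        t := &symbolTable{names: make(map[string]int)}
        var wg sync.WaitGroup
        for _, n := range []string{"a", "b", "a"} {
            wg.Add(1)
            go func(n string) {
                defer wg.Done()
                t.add(n) // several goroutines mutate the same table
            }(n)
        }
        wg.Wait()
        fmt.Println(t.names) // map[a:2 b:1]
    }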

newlisp2 years ago

Curious, can you elaborate on what the issues with mutability in single-threaded code are?

scottlamb2 years ago

It comes down to how a language construct allocates freedom to the programmer vs the optimizing compiler.

* Rust references are strict. &mut references guarantee that nothing else can access the backing memory while the &mut is valid. & references guarantee that nothing (including &mut by the rule above, or unsafe code with raw pointers by programmer's pinkie swear) can change the backing memory while the & is valid. This allows the compiler to perform more optimizations behind the scenes, like assuming a value loaded into a register is still up-to-date even if there's been a function call in between.

* Rust raw pointers make no guarantees (but require the programmer to type "unsafe" to use).

* C pointers are IMHO weirder. IIRC, two pointers of the same type may alias (unless you use the rare "restrict" keyword), but two pointers of different types can't, unless one is "char*".

* AFAIK, Go pointers are at least as permissive as C's. They might not make any aliasing guarantees at all (see the small sketch after this list).

* Other languages vary...
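
For illustration (a sketch, not from the list above): Go freely allows two live, writable pointers to the same variable, which safe Rust's &mut rules forbid:

    package main

    import "fmt"

    func main() {
        x := 1
        p, q := &x, &x // two writable pointers to the same variable: fine in Go
        *p = 2
        fmt.Println(*q) // 2: writes through p are visible through q
    }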

derpthinker2 years ago

But why isn't Microsoft doing it? Leaving all the hard work to others?

tantalor2 years ago

This is a "rewrite" not a "port".

k__2 years ago

tsc relies on shared state and is slow, so they port it to a language that allows shared state.

Strange reasoning.

jerf2 years ago

It sounds like you think "tsc is slow" because "it uses shared state". If that is what you're getting at, you've connected two unconnected statements.

It does tend to imply the new tsc isn't going to be able to take much more advantage of concurrency than the current one, which may put a limit on what performance gains it can expect. I don't know if the current tsc is able to do anything concurrently. (Here, "concurrently" doesn't merely mean "able to have multiple events in flight and thus do multiple tasks at the same 'time'" but truly "doing work on multiple CPUs simultaneously".) Go is generally faster than Node, but it'll be hard to guess by how much in this case.

A Rust port would be harder but would probably lead in a direction that improves the concurrency. However, that may be an infeasible amount of work, not because of Rust per se but because a straight port is intrinsically easier than a port plus a rearchitecting.
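
For a sense of what that kind of parallelism looks like in Go, a minimal sketch (parseFile is a made-up stand-in for per-file work with no cross-file dependencies):

    package main

    import (
        "fmt"
        "sync"
    )

    // parseFile stands in for per-file work (lexing/parsing) that can
    // run independently of other files.
    func parseFile(path string) int { return len(path) }

    func main() {
        files := []string{"a.ts", "b.ts", "c.ts"}
        results := make([]int, len(files))

        var wg sync.WaitGroup
        for i, f := range files {
            wg.Add(1)
            go func(i int, f string) { // one goroutine per file; the runtime spreads them across CPUs
                defer wg.Done()
                results[i] = parseFile(f)
            }(i, f)
        }
        wg.Wait()
        fmt.Println(results) // one parsed result per file
    }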

tshaddox2 years ago

As other people have pointed out, it doesn't seem like the author (or anyone else, perhaps other than you) is suggesting that shared state is a significant reason that tsc is slow.

But moreover, my takeaway was that the author is making a distinction between a complete rewrite and a port, where the former seems to be any rewrite that just tries to do the same thing (but may be architecturally very different), while the latter seems to be an approach where you attempt to directly adapt the original source code into a new language on a statement-by-statement basis. The latter would likely require the target language to support the programming paradigms which are used heavily in the original source code.

munificent2 years ago

The reasoning is only strange if you assume that the cause of slowness is the shared state.

CPUs have no problems with mutability. JS is slow because of dynamic typing, dispatch, flexible object layout, and an object model that requires lots and lots of heap allocation and pointer chasing.

Yoric2 years ago

(er, some text in the parent seems to have vanished)

Sure, if you absolutely wish it to, in an `unsafe` block, you can define two writeable pointers to the same piece of data. I don't think I've ever seen anybody write such code in Rust (except perhaps when linking to a C library that required it?), nor have I ever felt the need to do so.

Now, it is true that Rust is designed to discourage writing cyclic data structures. I can imagine that trying to port (rather than redesign) an algorithm that deeply requires cyclic data structures will make your life miserable.

wizzwizz42 years ago

You actually can't define two `&mut` references to the same data in Rust. It's invalid Rust code; if the compiler notices, it'll refuse to compile, and if it doesn't notice, your code will randomly break.

Yoric2 years ago

You can define two `*mut`, though, in unsafe code.

anderskaseorg2 years ago

+1

pwdisswordfish92 years ago

You can, just not at the same time.

pjscott2 years ago

tsc is a large JavaScript program run at the command line, which is a pathological case for a JIT-compiled language: by the time the hot spots get native code generated, tsc may be close to finished already. Porting it to any ahead-of-time compiled language, without significant changes to the structure of the code, is likely to be a big performance win.

pdpi2 years ago

tsc is slow, so you port it to a faster language. tsc relies on shared state, so you use a language that supports shared state so you're just porting the same design to a new language instead of redesigning the whole program.

londons_explore2 years ago

At first I thought this was for type checking Go code... Ya know, all that interface{} stuff...

sjhuang262 years ago

One thing that was not addressed in the post is the option of using a GC library in Rust. I'm not a Rust expert, but to me, it seems like this makes it quite feasible to quickly port a compiler. They could wrap most complex data structures in GC and optimize them as time goes by. That being said, I'd assume that the library's GC would be slower than Go's, so perhaps that's a deal-breaker.

xixixao2 years ago

> The more code you have, the longer it takes to compile. On medium-to-large-sized TypeScript projects, this compilation is extremely slow.

Or you could, you know, use the JS typechecker already built with large-scale performance in mind (and in a native language) - Flow.

I know Flow hasn't had the bandwidth to invest in supporting open source, and TypeScript pretty much "won", and there is now a large number of library definitions without Flow support, but: if your codebase is large enough to suffer from TS's architecture, you might have the resources to deal with the hurdles of adopting Flow. Flow still has some better designs, such as nullables (?T).

That said, Flow is now being heavily optimized for large, strictly typed codebases, which means there is now no typechecker geared towards checking JS with minimal type annotations. I think that would still be useful for the prototyping stage of a project.

imcotton2 years ago

The last time I heard news about Flow was facebook/jest replacing it with TypeScript.