
Everything about Google Translate crashing React (and other web apps)

108 points | 24 days ago | martijnhols.nl
rjst01 22 days ago

> When I first ran into this issue back in 2017, I posted in the React issue tracker that I had "fixed" my app by blocking translation entirely.

Please do not do this! In almost every instance I've encountered severe Translate-related broken-ness, it's still worked well enough to get me a snapshot of the current page translated. Fighting through this is still less cumbersome than the alternatives.

> The only alternative solution that I can think of, is to implement your own localization within your app (i.e. internationalization)

I will add, please make sure that language is an independent setting, and not derived from locale! I sometimes have to use translate on sites that have my preferred language available, but won't show it to me because it's tied to locale and that changes other things that I don't want, like currency.

On one such site I used a browser extension to rewrite the request for the language strings file.

Freak_NL 22 days ago

> I will add, please make sure that language is an independent setting, and not derived from locale!

Websites already have exactly what they need to provide you with the language you want via the Accept-Language header your browser sends. In your browser's settings you can configure a list of languages (country-specific if desired) which get sent with every request.

E.g.:

    en-GB
    en
    nl
(Prefer British English, fall back to any English, and if not available either use Dutch.)

This is already entirely separate from your OS locale! Although it will default to filling it in with that locale's language if you don't configure it yourself of course.

This should be the primary way to decide upon a language, but in addition to that offering a way to switch languages for a specific site on that site itself is a user-friendly gesture appreciated by many.
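A minimal sketch of the server-side negotiation described above, assuming a header value such as `en-GB,en;q=0.9,nl;q=0.8` (roughly what a browser might send for the list above) and a hypothetical site that only offers `nl` and `en`. The function names are illustrative, not taken from any library, and a production implementation would need to cover more of the header grammar.

```typescript
interface LanguagePreference {
  tag: string;     // language tag as sent, e.g. "en-GB"
  quality: number; // q-value, 1 when omitted
  order: number;   // position in the header, used as a tie-breaker
}

// Parse an Accept-Language value into tags ordered by preference.
function parseAcceptLanguage(header: string): LanguagePreference[] {
  const preferences: LanguagePreference[] = [];
  header.split(",").forEach((part, order) => {
    const pieces = part.split(";");
    const tag = pieces[0].trim();
    if (tag === "") {
      return;
    }
    let quality = 1;
    for (let i = 1; i < pieces.length; i++) {
      const parameter = pieces[i].trim();
      if (parameter.indexOf("q=") === 0) {
        const parsed = parseFloat(parameter.slice(2));
        if (!isNaN(parsed)) {
          quality = parsed;
        }
      }
    }
    preferences.push({ tag: tag, quality: quality, order: order });
  });
  // Highest q-value first; header order decides between equal q-values.
  return preferences.sort((a, b) => b.quality - a.quality || a.order - b.order);
}

// Pick the first supported language: exact tag first, then its primary subtag.
function pickLanguage(header: string, supported: string[], fallback: string): string {
  const lowerSupported = supported.map((s) => s.toLowerCase());
  const preferences = parseAcceptLanguage(header);
  for (let i = 0; i < preferences.length; i++) {
    const tag = preferences[i].tag.toLowerCase();
    if (preferences[i].quality <= 0 || tag === "*") {
      continue;
    }
    let index = lowerSupported.indexOf(tag);
    if (index === -1) {
      index = lowerSupported.indexOf(tag.split("-")[0]);
    }
    if (index !== -1) {
      return supported[index];
    }
  }
  return fallback;
}

// The site has no British English, so plain "en" is chosen ahead of "nl".
const chosen = pickLanguage("en-GB,en;q=0.9,nl;q=0.8", ["nl", "en"], "en");
console.log(chosen); // prints "en"
```

An explicit per-site choice made by the user would simply take precedence over the result of `pickLanguage`.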

masswerk 22 days ago

This is not true. E.g., Safari is tied to the OS settings, Firefox has some dependencies regarding the locale of the first install, etc…

Moreover, probably most people speak or can read more than a single language. There may be reasons for accessing a site in a particular language other than the standard locale.

Please empower users to make their own choice! Do not assume to know better.

0xblinq 21 days ago

For example, when the translation is shit and you prefer to use the English one because the one in your language is impossible to understand.

phantomathkg 22 days ago

This does not help in many many situations.

I am a Hongkonger: I natively speak Cantonese, am fluent in English, and am learning Japanese.

If I go to Google I want an English UI, with results prioritised as traditional Chinese, then English, then simplified Chinese.

On the other hand, if I go to a Japanese website, I don't want them to translate for me; just displaying the original Japanese will be fine, unless I toggle.

This kind of complex setup can never be achieved if we don't have a per-site locale policy. And seriously, a toggle per site is easier than navigating three levels deep into a browser settings page.

kelnos 22 days ago

> A toggle per site is easier than navigating three levels deep into a browser settings page.

I don't disagree with your overall point, that flexibility is useful for website visitors, but your statement requires asking the question: "easier for whom?"

Certainly relying on Accept-Language is significantly easier for the website maintainer. And overall it would be a lot easier if the small handful of web browser maintainers added saner settings (even a per-website Accept-Language toggle), than if we were to require the thousands (tens of thousands? millions?) of multi-language website developers to provide their own toggle. Not to mention that having a standardized way to manage this would be better for users than having to discover each website's language toggle UI.

But sure, we don't have those easy-to-use browser settings, so it's (unfortunately) up to every website developer to solve this problem, over and over and over.

(As an aside, it would be cool if websites could return a hypothetically-standardized Available-Languages header in their responses so browsers could display the appropriate per-website UI, with only the supported languages.)

arkh 22 days ago

The problem is when you understand multiple languages.

If a website is made by an English speaking team, as I understand English I'd like it to be English first and not a possibly broken French version. If a website is developed with French language first I'd like to have it in French and not a second-rate English translation.

reshlo 22 days ago

> In your browser's settings you can configure a list of languages (country-specific if desired) which get sent with every request.

Customising this list at all makes your browser fingerprint thousands of times less common than it was before you did this, and many websites you visit could then probably uniquely identify you as the same user across all of your sessions.

Freak_NL 22 days ago

That and a thousand other things. A highly privacy focussed browser could offer to enable this setting only on whitelisted websites (and send 'en' plus a bunch of random language codes on others).

LtWorf 22 days ago

> This should be the primary way to decide upon a language

Google developers are very intelligent, but not intelligent enough to understand this.

rjst01 22 days ago

They probably understand it just fine. Someone higher-up has just over-ruled them. There may even be a good reason for it, but because of the way companies work, we will probably never find out what it is.

archerx 22 days ago

Almost no website uses this, even big ones like Google who insist on showing me pages in German rather than English or French.

PetitPrince 22 days ago

On the other hand, sometimes the ads that are shown are in German. Easier to mentally filter out.

sureIy 22 days ago

> that changes other things that I don't want, like currency.

Oh god Google is so bad at this. They don't even let me change the currency in many cases when looking at hotels (yes on the website; not in the Google Maps app)

JimDabell 22 days ago

Google is so ridiculously bad at this, when an account that only ever uses English is explicitly asking for English search results, but happens to be located in Thailand, it will give you English results, but use the Thai calendar to display years, which is 543 years ahead of the Gregorian calendar. Are there any people at all who expect to read English text but expect to see the year 2568 instead of the year 2025, when no part of their system or account is configured for Thai?

SebastianKra 22 days ago

I'd have no issue leaving translation enabled, if the translator was an optional feature that the user must opt-in to, and that's clearly communicated as something not controlled by the developer.

But I've received reports from Edge-users that didn't even know translation was enabled.

rjst01 22 days ago

Yeah, I agree that's problematic. And I would have no objection to implementing a UI feature that displayed a warning banner of some kind if it detected that the page had been translated.

dgoldstein0 22 days ago

When you say locale, you mean your current location, e.g. as detected by geoip?

pta2002 22 days ago

A locale is a combination of several things, including a language, but not only.

E.g. I'm from Portugal. I'm visiting an American site, which does not have professional Portuguese translations, but does have auto-generated ones.

I don't like the auto-generated ones and can read English just fine, so I want to have the language set to English (en-US in this case).

However, I still want it to apply some locale-specific things from Portugal, e.g.:

- Units (Metric vs. Imperial vs. Whatever mess countries like the UK do)

- Date formatting (DD/MM/YYYY vs MM/DD/YYYY)

- Time formatting (AM/PM vs. 24hr clock)

- Currency formatting (10€ vs. 10 € vs. 10 EUR vs. €10)

- Number formatting (10,000.00 vs 10.000,00)

- When the week starts (Monday vs. Sunday)

If you take a look at the Windows locale options, it mostly lets you mix-and-match whatever you want (which is great! Now if only the MS apps let me stop using the localized keyboard shortcuts...): https://learn.microsoft.com/en-us/globalization/locale/langu...
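The mix-and-match idea above can be sketched with the standard `Intl` API: keep the UI language and the formatting locale as two separate settings. The `UserPreferences` shape, the tiny message catalogue and `renderPrice` are hypothetical names made up for this illustration, and the exact output depends on the locale data shipped with the runtime.

```typescript
interface UserPreferences {
  language: string;     // which message catalogue to use, e.g. "en"
  formatLocale: string; // which number conventions to use, e.g. "pt-PT"
  currency: string;     // ISO 4217 currency code, e.g. "EUR"
}

// Hypothetical message catalogue keyed by language only.
const messages: { [language: string]: { price: string } } = {
  en: { price: "Price" },
  pt: { price: "Preço" },
};

function renderPrice(preferences: UserPreferences, amount: number): string {
  // The label comes from the language setting...
  const catalogue = messages[preferences.language] || messages["en"];
  // ...while separators and currency placement come from the formatting locale.
  const formatted = new Intl.NumberFormat(preferences.formatLocale, {
    style: "currency",
    currency: preferences.currency,
  }).format(amount);
  return catalogue.price + ": " + formatted;
}

// English text combined with Portuguese number and currency conventions.
console.log(renderPrice({ language: "en", formatLocale: "pt-PT", currency: "EUR" }, 1234.5));
```

Date, time and first-day-of-week preferences could be handled the same way by passing `formatLocale`, rather than `language`, to the corresponding formatter.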

rjst01 22 days ago

Locale I'm using as a shorthand for "the bundle of variables that your service or business needs to tweak between customers in different markets". It may determine things like currency, date/time or currency formatting, or relevant regulatory framework. My argument is that language should always be settable independently of the other variables locale controls.

For an example of a site that almost gets it right, see https://www.finnair.com/ . You are first prompted to set location, and then language. I say "almost" because although they will allow you to select English in any market, they won't allow you to select any offered language in any market.

In comparison, at https://www.flysas.com/ you get one dropdown which sets market, currency, and language in one go.

pwagland 22 days ago

Sometimes, but not always. Sometimes it is also based on the locales in your browser.

DecoySalamander 22 days ago

It means system/browser settings like the one available in navigator.language.

cmenge 22 days ago

I had to learn this the hard way when a React app I built showed random crashes I couldn't explain.

"Fortunately" it also auto-translated the "I happily accept the terms of use" checkbox in one case into (back-translated) "I happily dying die perish", which also couldn't be clicked. That led to a very high priority ticket and made us realize that all DOM manipulators might break the site.

shadowgovt 22 days ago

Very early on in my career, I was working on a greeting-card app in Facebook (back when Facebook apps were a thing).

Got a bug report from one of our own team members that some of her greeting cards didn't show up in the list. The link appeared, but no image. We figured out that the difference was she was running an ad-blocker. We couldn't figure out precisely what rule the blocker was applying, but it seemed to be:

- image

- within some particular size bounds

- with the consecutive letters 'ad' in the URL.

... and we were using hexadecimal encodings to track individual entities in the UI.

We solved the problem by replacing 'a' with 'g' in our hex encoding. And then I had to take a long walk and accept that if I was going to do web development on the public Internet, I'd be sharing the space with intentionally-modified user agents forever, and would have to account for every such modification as we discovered it.
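A rough sketch of that workaround, assuming the entities were identified by non-negative integers rendered as lowercase hex (the comment does not say how the IDs were actually built). `encodeId` and `decodeId` are hypothetical names.

```typescript
// Encode an ID as hex, then swap every 'a' for 'g' so the letters "ad"
// can never appear next to each other in the resulting string.
function encodeId(id: number): string {
  return id.toString(16).replace(/a/g, "g");
}

// Reverse the substitution before parsing the value as hex again.
function decodeId(encoded: string): number {
  return parseInt(encoded.replace(/g/g, "a"), 16);
}

console.log(encodeId(173));  // 173 is "ad" in hex, so this prints "gd"
console.log(decodeId("gd")); // prints 173
```

Because 'g' is not a hex digit, the substitution is reversible without ambiguity.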

I still won't run my own ad-blocker for this reason.

MatthiasPortzel 22 days ago

> I still won't run my own ad-blocker for this reason.

I maintained an extension for a public website for a couple years. (It did things like, for example, adding information that was available in the API to the page, for power users.) I eventually gave up with the conclusion that the concept of a browser extension was fundamentally unsound. So I also don't use an ad blocker.

LorenPechtel 22 days ago

Which is why a decent ad blocker has the option to selectively permit things it thinks are ads. Without blocking I've had multiple occasions of encountering a website that was completely unusable, it would be completely overlaid with ads as soon as they loaded.

nottorp 22 days ago

So the fundamental problem is the DOM is too inefficient to do applications on it. No surprise there, considering the original design of HTML was for presenting information, not for interactive applications.

bryanrasmussen 22 days ago

I think the fundamental problem is you have two applications, the primary application and google translate altering the application state of the primary application at the same time, without any possible communication between the two regarding locking or alterations or anything.

I'm pretty sure most applications in the history of computing would not fare any better if you constructed that situation.

nottorp 22 days ago

Both Google and React are guilty here if I read the article right.

Google replaces an element with a different element (Text with Font containing Text?), and React's virtual DOM keeps the old, deleted elements alive because the virtual DOM still references them.

React "applications" would crash when Google Translate changes their stuff from under them if they didn't accidentally keep the old elements alive. Which would be much better behaviour.
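To make the replacement described above concrete, here is a small model that runs outside a browser. `FakeNode` is a made-up stand-in for a DOM node, not the real DOM API, and `translateTextNode` only imitates the "Text becomes Font containing Text" behaviour mentioned in the comment; the error message is modelled on the one browsers report for `removeChild`.

```typescript
// Minimal stand-in for a DOM node; only what this sketch needs.
class FakeNode {
  name: string;
  text: string;
  parent: FakeNode | null = null;
  children: FakeNode[] = [];

  constructor(name: string, text: string) {
    this.name = name;
    this.text = text;
  }

  appendChild(child: FakeNode): FakeNode {
    child.parent = this;
    this.children.push(child);
    return child;
  }

  replaceChild(next: FakeNode, old: FakeNode): void {
    const index = this.children.indexOf(old);
    if (index === -1) {
      throw new Error("The node to be replaced is not a child of this node.");
    }
    this.children[index] = next;
    next.parent = this;
    old.parent = null;
  }

  removeChild(child: FakeNode): void {
    const index = this.children.indexOf(child);
    if (index === -1) {
      throw new Error("The node to be removed is not a child of this node.");
    }
    this.children.splice(index, 1);
    child.parent = null;
  }
}

// Translator stand-in: swap a text node for <font> wrapping the translated text.
function translateTextNode(textNode: FakeNode, translated: string): void {
  const parent = textNode.parent;
  if (parent === null) {
    return;
  }
  const font = new FakeNode("font", "");
  font.appendChild(new FakeNode("#text", translated));
  parent.replaceChild(font, textNode);
}

// Framework side: it remembers the text node it created earlier.
const container = new FakeNode("div", "");
const remembered = container.appendChild(new FakeNode("#text", "Hello"));

translateTextNode(remembered, "Hallo");

// The framework now tries to remove the node it still holds a reference to.
let removalError = "";
try {
  container.removeChild(remembered);
} catch (error) {
  removalError = (error as Error).message;
}
console.log(removalError); // prints "The node to be removed is not a child of this node."
```

The remembered node still exists as an object, but it is detached from the tree, which is why the removal fails.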

MartijnHols 22 days ago

They both do reasonable things, so I wouldn't really blame either. Google Translate was there first and is a big accessibility advantage for the web. At the same time, Google Translate is the user-specific browser extension that is executed last, on top of existing apps. It affects not just React[1], so even if React were to implement a fix, Google Translate would continue to interfere with other webapps.

I think any real fix to Google Translate would be very complicated. I fear the only solution might be to elevate Google Translate to be part of the browser's rendering engine instead of acting like an extension. This would allow it to work in the rendering pipeline without modifying the DOM, but even that will probably run into site-bugs because of things like text being longer or shorter.

[1]: https://martijnhols.nl/blog/everything-about-google-translat...

kelnos 22 days ago

That doesn't track with me. The React application is the website. It should be able to run while expecting some other third party thing isn't going to dig into the internals of its view and modify those internals.

It would be fine if Translate was just modifying text, but changing the actual structure of the HTML goes too far.

bryanrasmussen 22 days ago

I mean they have to keep the old elements alive because that is the data they use to render to the DOM.

What React could do is to catch that there have been changes made to the structure of the DOM by someone other than them and then re-render the full page. Which would probably not be the most performant solution for anyone.

But anyway then people would complain that React was breaking Google Translate.

Essentially you have two applications fighting to control rendering of the application state of one of them.

zelphirkalt 22 days ago

Yeah, Google Translate shouldn't have to "research"/reverse engineer what kind of framework is being used on any random website it translates. Assuming that it simply interacts with static information would still be reasonable. That said, Google Translate is also at fault if it changes the DOM structure. Why not leave things the same and just exchange text nodes? Seems silly.

bryanrasmussen 22 days ago

I am just supposing, but have not checked, that if they change the text they must also change the DOM by at least changing the lang attribute on nodes affected, meaning probably the lang attribute on the html element, but could also be a lang attribute on each element wrapping a textNode.

on edit: I figured the article must have said something about this and I missed it, and yes, it shows that the DOM is changed to be lang="nl" on the html element, which means obviously if React rerenders but does not rerender the HTML element (which many React applications do not control) then the language would be out of sync, of course.

friendzis 22 days ago

The primary application is the browser.

The problem here is that react "application" builds its own state of the web page, fails to reconcile it with changes to actual state, ends up in a detached state with stale information and then proceeds to alter actual state based on the corrupted information it has.

kelnos 22 days ago

I disagree. The browser is a VM that runs other applications, in this case one that's written in React.

Then the browser, via an extension, is corrupting that application's internal state and is then surprised when that application stops working correctly. Well, duh.

If Translate were to only modify the text on the website, I'd think React would be able to deal with it better. But it seems to be modifying the structure too (adding <font> tags); I think it's not reasonable to expect every JS framework to be able to deal with that properly.

friendzis 19 days ago

The browser extension is not touching "application's" state whatsoever. The browser extension is, however, making a perfectly legal store on a shared system resource (if you consider browser to be VM), but the application ignores those modifications and continues to issue state modification calls computed from its own, now long invalidated, internal cache.

> I think it's not reasonable to expect every JS framework to be able to deal with that properly.

It would be unreasonable to call any framework out of pre-alpha prototype stage, much less production ready, if it can't handle such basic stuff, sorry.

LorenPechtel 22 days ago

I disagree with that. Fundamentally, it comes down to who is in charge. Does React control the DOM or does the DOM control react? I see this as a cache concurrency issue--React is trying to cache the DOM and breaking when that fails.

bryanrasmussen 22 days ago

either that or the browser is the new OS, as people keep saying.

If there is such a thing as web apps, then the primary application is the web app, and that primary application runs in the browser.

shadowgovt 22 days ago

As a concrete example: I'm racking my brain trying to think of any instances of a working realtime whole-desktop translation overlay application, and I don't think such a thing exists.

Translators for specific chunks of the screen, yes. Selectable translators, sure. Translators you can configure to work with one application at a time.

But rewriting all text in the entire windowing environment? While preserving selection, copying, and editing? Without hooks to the underlying apps? Functionally preposterous as a proposition.

aqueueaqueue 22 days ago

I doubt it. Say you write a native app and some other side app swoops in and changes all your UI state while running. That is going to cause problems.

zelphirkalt 22 days ago

But only if the DOM is used in other ways than being an output of whatever is done to calculate its update. If the framework uses the DOM for other things than directly updating it or representation of internal state, then that's on the framework. Is it reasonable to assume DOM itself is part of the state? Would it not be more reasonable to have an internal state? But we have this with virtual DOM. So maybe the issue is in the way it makes use of the DOM in conjunction with virtual DOM internally. Maybe it is an optimization "hack" that goes badly in this scenario.

kelnos 22 days ago

That sounds pretty arbitrary. Upon what grounds do you feel it's appropriate to say that the DOM is only allowed to be used in the way you describe?

LorenPechtel 22 days ago

Disagree. I've swooped in on programs and done major changes to their filesystem behind their back. Programs that had no concept of working with externally supplied data in a network world. The programs happily went along doing their job, completely unaware that every label was a phantom whose meaning would always change if selected. The contents were appropriate (everything was simple CSV), the proper commands would be sent to the attached machinery.

It's the responsibility of the swooper to ensure everything's put back sane.

mdhb 22 days ago

That hasn’t been true for at a minimum half a decade now and much longer in practice I think.

lobsterthief 22 days ago

> No surprise there, considering the original design of HTML was for presenting information, not for interactive applications.

Even if this were just HTML displaying information, there's effectively this second application (Google Translate) changing the structure of the original application's XML, which would still break display (or XPaths) in a "normal HTML presenting information" scenario.

pier25 21 days ago

No. It's not an issue with efficiency at all.

LordHeini 22 days ago

That is not the case anymore.

React and many other SPA frameworks use an additional virtual DOM which gets mapped onto the real DOM. This used to be faster 10 years ago and allowed for a unified interface.

Any addon manipulating the DOM forces the virtual DOM to go out of sync thus crashing the app.

As shown by the likes of Svelte, the virtual DOM is just legacy; modern browsers are fast enough to get by without it.

whizzter 22 days ago

Virtual DOM or not doesn't matter, even Svelte has the potential to be disrupted by these Google translate shenanigans since it manipulates the DOM.

Actually it seems they got hit also, https://github.com/sveltejs/svelte/issues/15090

jtsiskin 22 days ago

It’s not that modern browsers are faster - Svelte is a different approach and figures out how to update at compile time rather than using a runtime virtual DOM. 10 years ago it would also have been faster

rjprins 22 days ago

Another solution would be a React-native machine translation implementation that updates the TextNodes without replacing them. It would still have the issues of merging adjacent nodes to get a proper translation, but at least it could update on any state change.

dgoldstein0 22 days ago

One idea that crossed my mind while reading the article: for websites that already use react-intl, have react-intl implement an API to allow supplying machine translations of messages into languages otherwise not supported by the app.

The problem with this is that it will only help sites that already put in effort to internationalization. Whereas the main target of Google translate are the sites that do not bother with i18n.

Still, it'd be quite valuable to the sites early in their internationalization journeys to get support for tons of languages right upon introducing internationalization.

Raed667 22 days ago

that requires google to care about this issue

forgotusername6 22 days ago

I ran into this. We worked around it with solution 2 from the article, i.e. never render text by itself next to another element, always wrap the text in its own element. Not that much of an inconvenience since we have a Text component that controls font size etc. anyway.
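Why the wrapping helps can be sketched with a toy tree, under the assumption that the translator only replaces text nodes and leaves elements in place. `SketchNode`, `translateTree` and `removeChild` are hypothetical helpers written for this illustration, and upper-casing stands in for translation.

```typescript
interface SketchNode {
  name: string;
  text: string;
  children: SketchNode[];
}

function makeNode(name: string, text: string, children: SketchNode[]): SketchNode {
  return { name: name, text: text, children: children };
}

// Translator stand-in: every text node becomes <font> wrapping a new text node.
function translateTree(parent: SketchNode): void {
  parent.children = parent.children.map((child) => {
    if (child.name === "#text") {
      return makeNode("font", "", [makeNode("#text", child.text.toUpperCase(), [])]);
    }
    translateTree(child);
    return child;
  });
}

// Returns false where a real DOM would throw "not a child of this node".
function removeChild(parent: SketchNode, child: SketchNode): boolean {
  const index = parent.children.indexOf(child);
  if (index === -1) {
    return false;
  }
  parent.children.splice(index, 1);
  return true;
}

// Unwrapped: the framework tracks a bare text node sitting next to a button.
const bareText = makeNode("#text", "3 items", []);
const bareParent = makeNode("div", "", [bareText, makeNode("button", "", [])]);
translateTree(bareParent);
const bareRemoved = removeChild(bareParent, bareText);

// Wrapped: the framework tracks the <span>, which the translator leaves in place.
const wrapper = makeNode("span", "", [makeNode("#text", "3 items", [])]);
const wrappedParent = makeNode("div", "", [wrapper, makeNode("button", "", [])]);
translateTree(wrappedParent);
const wrappedRemoved = removeChild(wrappedParent, wrapper);

console.log(bareRemoved, wrappedRemoved); // prints "false true"
```

In the wrapped case the swap happens one level down, inside the span, so the node the framework removes is still where it expects it to be.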

eptcyka 22 days ago

How does Firefox translation interact with React?

itronitron 22 days ago

Not sure if this has the same root cause, but some websites 'break' with Firefox translate, as if the visible text in an HTML element is somehow being used as an identifier in the website's application logic.

andrewmcwatters 22 days ago

I don't like the way React updates subtrees. Other frameworks get it wrong, too, by using the same incorrect model. Employing professional opinion, it's just wrong. The document should be considered the source of truth, not some internal private state.

e.g. Input values on the HTMLInputElement are the real input, not some clone to a private object in JavaScript.

As a result of React's blatantly wrong treatment of the document.body, you have to ensure that when it reuses element siblings within an arbitrary tree, its values are squashed to whatever private fields you're using in your component.

It screams wrong, and side effects like the one in this article make it obvious why.

No one is going to go out of their way to touch your special internal state, we're all going to use the web API to touch nodes and events from standard interfaces. You can't take the ball into a private court and expect the rest of the game to function.

barlog 22 days ago

substack.com is a typical example of a site where the Google Translate extension can crash the page or cause parts of it to explode.

silverwind 20 days ago

This is essentially a bug in React VDOM which blindly assumes it is the only one updating DOM nodes. Imho, it's long overdue to remove VDOM from React as other renderers have shown that it's not needed for performance.

sebazzz 22 days ago

Knowing Google they'd build a private extension of the Web standard to fix this in Chrome.

jppope 22 days ago

Just out of curiosity, what web apps are affected? I tried to find the "other web apps" and can't find anything (quick scan of the article)

MartijnHols 22 days ago

Anything that affects the DOM and relies on TextNodes behaving predictably. It could be as simple as `e.target` of a click event being different (the `font` element gets in between what was actually clicked), but the main issues are when apps try to update or replace what used to be TextNodes.

Imagine you're building a framework, and a consumer renders a clock. The only thing that changes every tick, is the text value; `00:00` becomes `00:01`. In an attempt to be as efficient as possible, it's only natural for the framework builder to decide to keep a reference to the `TextNode` and only update its `textContent` every frame. This scales the best for even the most complicated app, but it leads to interference from Google Translate as the article shows.
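The clock scenario above can be modelled without a browser. In this sketch `displayed` is a made-up stand-in for "the nodes currently attached to the page", and the translator is imitated by swapping the original node for a copy; neither is a real DOM or Google Translate API.

```typescript
interface ClockText {
  textContent: string;
}

// Stand-in for the page: whatever nodes are currently attached and visible.
const displayed: ClockText[] = [];

// Framework side: create the text node once and keep a reference to it.
const clockText: ClockText = { textContent: "00:00" };
displayed.push(clockText);

function tick(value: string): void {
  // Cheap update: no lookup and no re-render, just write to the kept reference.
  clockText.textContent = value;
}

function visibleText(): string {
  return displayed.map((node) => node.textContent).join("");
}

tick("00:01");
const beforeTranslation = visibleText();

// Translator stand-in: replace the original node with a new one holding a copy.
displayed[displayed.indexOf(clockText)] = { textContent: clockText.textContent };

tick("00:02");
const afterTranslation = visibleText();

console.log(beforeTranslation); // prints "00:01"
console.log(afterTranslation);  // prints "00:01" because the write went to a detached node
```

Nothing throws in this case; the page simply stops updating, which matches the "silent" kind of interference rather than the crash.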

sufianrhazi 22 days ago

It strikes me that a straightforward solution to this problem would be to have Google Translate dispatch a new CustomEvent with a particular "type" and a reference to the new element in the detail field. So React and other frameworks could listen to this event and instead of dealing with Text node X, they could instead refer to the translated element Y.
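A sketch of how that handshake might look. No such event exists today, so everything here is hypothetical: the listener registry stands in for `document.addEventListener` and `document.dispatchEvent(new CustomEvent(...))` so the example runs outside a browser, and `TranslatedDetail` models what the `detail` field could carry.

```typescript
// What the hypothetical event's detail field might contain.
interface TranslatedDetail {
  original: object;    // the text node that was replaced
  replacement: object; // the element now standing in for it
}

type TranslatedListener = (detail: TranslatedDetail) => void;

// Stand-in for the document's event listeners.
const translatedListeners: TranslatedListener[] = [];

function onTranslated(listener: TranslatedListener): void {
  translatedListeners.push(listener);
}

function dispatchTranslated(detail: TranslatedDetail): void {
  translatedListeners.forEach((listener) => listener(detail));
}

// Framework side: a binding that points at the node to update.
const originalText = { kind: "#text", value: "Hello" };
const binding: { current: object } = { current: originalText };

onTranslated((detail) => {
  // Re-point the binding if the node it tracks was the one that got replaced.
  if (binding.current === detail.original) {
    binding.current = detail.replacement;
  }
});

// Translator side: announce the swap after performing it.
const fontElement = { kind: "font", value: "Hallo" };
dispatchTranslated({ original: originalText, replacement: fontElement });

console.log(binding.current === fontElement); // prints true
```

The framework would still have to decide how to write new text into the replacement element, which this sketch leaves open.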

aj7 22 days ago

I use Google translate all the time, and I definitely noticed you have to kick it to update.

londons_explore 21 days ago

The real fix is to have a new web feature which is to have the 'page DOM' and the 'display DOM'.

By default there would be a 1:1 mapping, but things like browser extensions could write code to define how a particular bit of page dom would display to the user.

mindaslab 22 days ago

React is not the right way to build webapps, so I see no problem here.

CrimsonRain 22 days ago

How about React gets with the times and supports the everyday needs of people instead?

cutler 22 days ago

So the web is basically broken. Can WASM provide a solution?

arkh 22 days ago

Last part of the post hints at a possible future for SPAs: wasm. No DOM modified by browser extensions so no surprise.

Aldipower 22 days ago

Or switching back to desktop apps? Also no DOM manipulation there. :-)

jillesvangurp 22 days ago

WASM enables us to bring what we were doing there to the web. The distinction always was a bit artificial.

This notion that you can only have DOM/CSS/JavaScript on the web did not age well. There's a whole generation of programmers that built their careers on targeting that and are confusing that status quo with something that is set in stone for good reasons. Those reasons never really existed. JavaScript was a bit shit but it was there so people used it. Fast forward 30 years and you still have people proving that point on a daily basis by creating very mediocre and underwhelming things with it.

WASM opens up the web to 30 years of progress in UI development elsewhere (mobile phones, game consoles, VR/AR, desktop, etc.).

What people will do with that is of course an open question. There are a few frameworks emerging but they are still kind of niche. And there are lots of attempts to bring retro UIs to the web unmodified. Links to e.g. Winamp in a browser, VB 6 running in a browser, etc. are easy to find. Some people even boot entire operating systems in a browser. I think I came across windows 95 at some point. A few versions of Linux, and some other stuff. Cool, but I'm more interested in new stuff.

The web has a bit of an imagination deficit. Creativity on the web mostly died along with Flash. HTML + JavaScript never managed to fill that void. Just the wrong tech for that job.

arkh 22 days ago

> Creativity on the web mostly died along with Flash.

Even though I was against Flash sites at the time, I can't deny it allowed many fun experiences with browser games, which seem absent today.

mentalgear 22 days ago

Therefore, I prefer svelte (besides superior ergonomics & web-standards compliance): It's not a framework, but a compiler that outputs only pure JS. Svelte simply has no virtual DOM that can be messed up. Just Simplicity & efficiency.

gr__or 22 days ago

You are confused about the issue, and the OP does its part in contributing to the confusion. It's not a VDOM issue, it's not React exclusive (that part the post is explicit about) and indeed Svelte is affected as well: https://github.com/sveltejs/svelte/issues/15090

When seeing issues like this one pop up with React in the title, one should really have a good think about whether this is solved in a fundamentally different way in other frameworks OR, and this should be the null hypothesis, whether React is in the title because it is more widely used than all the others combined.

mentalgear 22 days ago

Fair. The Svelte sites I tried it out on had no issues, so I assumed it might be limited to React.

tehbeard 22 days ago

Svelte breaks just the same...

Go, get down off your high horse and try it yourself, finish the counter in their tutorial, put a console log in the handler, and translate the page to French...

C'est tres borked.

BtM909 22 days ago

But wouldn't it be the same when Google translate is actively replacing nodes?

Raed667 22 days ago

[flagged]