Google's automatic language detection and translation service

TOPICS: Google
Translating text between languages is an interesting problem to solve -- and solving problems is a perfect way to use some of the extra brain power that can be found at Google.

I decided to give something a shot -- what would happen if I tried to translate a page without giving Google any information about the language it's written in (or what language I speak)?  Google figured out everything I didn't tell it.  For a good example of what I mean, we can use Wikipedia, since it is available in multiple languages.  Let's try translating the Spanish version by using this link.

In the past you would have needed to specify your language "hl=en" and a language pair ("langpair=es|en").  Notice how we didn't specify any of that in the URL above?  On the Spanish version of the Wikipedia page, click "Deutsch" to try the German Wikipedia site.  Notice how even that site is in English?  It's also interesting to note that the frame Google once put at the top of translated pages is now gone.
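
To make the difference concrete, here is a minimal Python sketch that builds both styles of URL. It rests on a few assumptions: that the frame-less page lives at translate.google.com/translate_c, and that the target page is passed in a "u" parameter, as in the URLs readers quote in the comments. The helper name translate_url is made up for illustration, and the code only builds strings; it never contacts Google.

```python
from urllib.parse import urlencode

# Assumed endpoint: the "translate_c" page on translate.google.com.
BASE = "http://translate.google.com/translate_c"


def translate_url(page_url, hl=None, langpair=None):
    """Build a translation URL for page_url (hypothetical helper).

    hl       -- interface language, e.g. "en" (optional)
    langpair -- source|target pair, e.g. "es|en" (optional)
    """
    params = {"u": page_url}
    if hl is not None:
        params["hl"] = hl
    if langpair is not None:
        params["langpair"] = langpair
    return BASE + "?" + urlencode(params)


# Old style: spell out the interface language and the language pair.
explicit = translate_url("http://es.wikipedia.org/", hl="en", langpair="es|en")

# New style: pass only the page and leave the detection to Google.
automatic = translate_url("http://es.wikipedia.org/")

print(explicit)
print(automatic)
```

The second URL carries nothing but the page address, which is the whole point: any language information has to be worked out on Google's side.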

Theoretically you could use this proxy as a type of Babel fish by using a Firefox extension that passes page requests through this service.  Users would have the comfort of knowing text is always presented in a language they understand (as long as Google understands it).
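
As a rough idea of what such an extension would do with each request, here is a small sketch of the rewriting step. It is written in Python rather than as real extension code, it assumes the same translate_c path and "u" parameter as the translation URLs, and via_translator is a hypothetical name. Skipping requests that already point at the translator is my own design choice, made so that a page is never wrapped twice.

```python
from urllib.parse import urlencode, urlsplit

TRANSLATOR_HOST = "translate.google.com"  # host seen in the translation URLs
TRANSLATOR_PATH = "/translate_c"          # assumed path of the frame-less page


def via_translator(url):
    """Return the URL to request in place of url (hypothetical helper)."""
    parts = urlsplit(url)
    # Leave non-web schemes such as about: or file: untouched.
    if parts.scheme not in ("http", "https"):
        return url
    # Do not wrap requests that already go to the translator.
    if parts.hostname == TRANSLATOR_HOST:
        return url
    return "http://" + TRANSLATOR_HOST + TRANSLATOR_PATH + "?" + urlencode({"u": url})


wrapped = via_translator("http://es.wikipedia.org/")
print(wrapped)
print(via_translator(wrapped))  # unchanged, because it is already wrapped
```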

You might then wonder if an extension like this would affect browsing speed.  It might cause a bit of a performance hit, but in most cases it shouldn't make much of a difference.  Requests for pages that are already in your native language skip the translation-related processing and send you directly to the requested resource.  For example, this link forwards you directly to http://en.wikipedia.org even though we are trying to use the translation service.

One day people will only need to know a single language for access to everything on the Internet -- a service like this is a great start to that end.

Talkback

5 comments
  • Can't Recreate

    I don't see that at all. When I click on the first link, the page that it loads, http://es.wikipedia.org/wiki/Portada, is in Spanish. What browser are you using?
    dgeiser13
  • Just a few points...

    [i]In the past you would have needed to specify your language "hl=en" and a language pair ("langpair=es|en"). Notice how we didn't specify any of that in the URL above?[/i]

    This has been the case for a long, long time when using Google's [Translate this page] link from the SERPs.

    [i]It's also interesting to note that the frame Google once put at the top of translated pages is now gone.[/i]

    That's simply because Wikipedia use some JavaScript to ensure their pages aren't displayed inside a frameset. Do a search for [www.google.es] and click the [Translate this page] link. The frame will still be there. (Incidentally, this also included "hl=en" for my language and "sl=es" as the source language - which I could then remove without affecting the translation.)

    And finally, you can't hotlink to these translated URLs. It looks like Google have some kind of session checking in place so that you can only translate from their website.
    Tony Ruscoe
    • Re: Just a few points

      The "translate" link in results still gives the translator information about the language the source is in, and what language to translate to. This information isn't needed when using the translate_c page.

      Also, the frame doesn't even begin to appear on the Wikipedia page with the translate_c page -- whereas the frame appears and then disappears when you use the old method to translate the pages.

      Hotlinking *does* work, but it seems to be flaky... it sometimes works, sometimes doesn't right now.
      Garett
      • Re: Just a few points

        [i]The "translate" link in results still gives the translator information about the language the source is in, and what language to translate to. This information isn't needed when using the translate_c page.[/i]

        Technically, this information isn't needed when using the "translate" page either. They simply guess the language in advance and add the parameters to the URL, probably to save on processing power so they don't have to guess the source language of the page in real time, and also because these values are used by the top frame header "translate_n" page. The "sl" (i.e. "source language") parameter is used to populate the source language - i.e. "This page has been automatically translated from ..." - and the "hl" parameter is used to decide which language to display the UI text in. (You can even override these values manually, and doing so changes which languages it uses.) Leaving these parameters out will simply cause the system to default to English for the UI / target language and guess the source language of the page.

        You can try it on the standard "translate" page here (you may need to copy and paste the URL if hotlinking is being flaky): http://translate.google.com/translate?u=http://es.yahoo.com/

        [i]Also, the frame doesn't even begin to appear on the wikipedia page with the translate_c page[/i]

        That's simply because the "translate_c" page is the main frame used in the standard "translate" frameset (after being redirected from the "translate_p" page, which just says "Translating..."); going straight to "translate_c" will obviously eliminate the top frameset.

        Using the standard "translate" page will make the Wikipedia Spain page jump out of its frameset though: http://translate.google.com/translate?u=http://es.wikipedia.org

        BTW, there was some discussion about Google's language guessing capabilities - and how it sometimes gets it wrong - here:

        http://blog.outer-court.com/forum/47109.html
        Tony Ruscoe
  • Room for improvement

    Below is the text of a letter from my daughter, who uses both French and English in her work in Paris, written after I provided her with a link to Google's site (those who wish can try using Google's service to translate the letter):

    [i]C'est trop marrant, regarde!!
    Google me donne:

    there is still room for improvement = il y a pièce immobile pour
    l'amélioration !!!!

    J'adoore! c'est trop drôle, c'est pas forcément utile pour le travail,
    mais ça me fait beaucoup rire et je compte bien écrire des mails comme ça
    en espagnol à mes copains qui parlent espagnol!!! et italien à mes amis
    italiens! on va bien rire![/i]
    mhenriday