Updated sample project working with the version 3 API

Almost two years ago I wrote a post describing how to translate text using Azure cognitive services. The API it uses is due to be switched off, however, so I needed to migrate from the version 2 API to version 3.

Whilst most of the code I post on this blog is used in one form or another, I've been using the TranslationClient class presented in that article as-is for the past two years. OK, I changed the namespace. But otherwise it's identical.

Although I have finally stopped using older classes such as HttpWebRequest in favour of HttpClient and async/await, so far I haven't updated existing code to make use of them. As I noted above, I'm still using the TranslationClient class introduced in my previous blog post and at this time I simply want to retrofit the class to use the V3 API.

This also means I'm still not using any of the extra features offered by the API, even though it probably makes more sense to combine some of the functionality now (as Microsoft have done with the APIs themselves). However, as I want the new class to be a drop-in replacement for the old, I have left this as an exercise for a future blog post.

The official migration documentation can be found on Microsoft's site.

Dependencies

At first glance, the biggest change between v2 and v3 is the output format. Previously it was XML, now it is JSON. This is a bit of a double-edged sword: while JSON is the standard these days, XML parsing is built into the .NET framework and JSON is not (yet).

JSON.net is a fine library for working with JSON, but thanks to the way NuGet works it quickly spread like a plague through my application libraries, and so I ended up blanket purging it. Instead, for some time I've been using a modified version of the fantastic PetaJson, which is a single .cs file I embed in any projects that require JSON support.

The switch from XML to JSON does mean that a reference to System.Runtime.Serialization is no longer required which is a plus.

New end points

I'm already only using a limited subset of functionality via three separate version 2 APIs. In version 3, two of these have been consolidated into one. The following table outlines the different endpoints.

v2 Method                | v3 Method
-------------------------|----------
Translate                | translate
GetLanguageNames         | languages
GetLanguagesForTranslate | languages

In addition, the base URI has changed from https://api.microsofttranslator.com/v2/http.svc/ to https://api.cognitive.microsofttranslator.com/.

Regional end points

Although you can simply use the default base URI above and have Azure choose an appropriate data centre, you can optionally target a specific region as follows.

Region        | Base URL
--------------|------------------------------------------
North America | api-nam.cognitive.microsofttranslator.com
Europe        | api-eur.cognitive.microsofttranslator.com
Asia Pacific  | api-apc.cognitive.microsofttranslator.com
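As a rough sketch of how the regional hosts could be selected in code, the helper below maps a short region key to a base URI and falls back to the global endpoint. The class name, method name and region keys ("nam", "eur", "apc", "global") are my own labels for illustration and are not part of the original TranslationClient; only the host names come from this post.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the original TranslationClient.
public static class TranslatorEndpoints
{
  // Host names as listed in this post; the keys are illustrative labels.
  private static readonly Dictionary<string, string> _hosts =
    new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
    {
      { "global", "api.cognitive.microsofttranslator.com" },
      { "nam", "api-nam.cognitive.microsofttranslator.com" },
      { "eur", "api-eur.cognitive.microsofttranslator.com" },
      { "apc", "api-apc.cognitive.microsofttranslator.com" }
    };

  // Returns the base URI for a region key, or the global endpoint
  // when the key is null, empty or unknown.
  public static Uri GetBaseUri(string region)
  {
    string host;

    if (string.IsNullOrEmpty(region) || !_hosts.TryGetValue(region, out host))
    {
      host = _hosts["global"];
    }

    return new Uri("https://" + host + "/");
  }
}
```

For example, TranslatorEndpoints.GetBaseUri("eur") yields the European host, while an unrecognised key quietly falls back to the default rather than throwing.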

Specifying an API version

All requests to the API (apart from the initial authentication) need to include the api-version query parameter, although currently the only supported value is 3.0. Failure to include this will result in a 400 status code along with a body similar to the following:

json
{"error":{"code":400021,"message":"The API version parameter is not valid."}}
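One way to avoid ever forgetting the parameter is to build every request URI through a single helper that always appends api-version. The following is a minimal sketch under that assumption; the TranslatorUri class and its Build method are hypothetical names and do not appear in the demonstration project.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, not part of the original TranslationClient.
public static class TranslatorUri
{
  private const string BaseUri = "https://api.cognitive.microsofttranslator.com/";
  private const string ApiVersion = "3.0";

  // Builds a request URI that always carries api-version, followed by
  // any extra query parameters (escaped) in the order supplied.
  public static string Build(string method, params KeyValuePair<string, string>[] parameters)
  {
    StringBuilder sb;

    sb = new StringBuilder();
    sb.Append(BaseUri).Append(method);
    sb.Append("?api-version=").Append(ApiVersion);

    foreach (KeyValuePair<string, string> parameter in parameters)
    {
      sb.Append('&').Append(Uri.EscapeDataString(parameter.Key));
      sb.Append('=').Append(Uri.EscapeDataString(parameter.Value));
    }

    return sb.ToString();
  }
}
```

Calling TranslatorUri.Build("languages", new KeyValuePair<string, string>("scope", "translation")) produces the same languages URI used elsewhere in this post, with the version parameter added automatically.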

Authentication

I'm using the same code to obtain an authentication token as I was for the version 2 API. As far as I know this isn't going to be removed; please see the original article for details.

According to the reference, instead of generating an access token from your API key, you can pass the key directly via the Ocp-Apim-Subscription-Key header. Given that this was also supported in the v2 API, I'm not sure why I chose the more convoluted method of generating an access token. It is something else to potentially refactor away in a future update, especially given that this exact code has had a bug in it for over two years now.
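For illustration only, passing the key directly might look something like the sketch below. The TranslatorRequests class and CreateWithKey method are hypothetical names; the demonstration project still uses the token approach, and I haven't switched the real class over to this.

```csharp
using System.Net;

// Hypothetical helper, not part of the original TranslationClient.
public static class TranslatorRequests
{
  // Creates a request that authenticates with the subscription key
  // itself rather than a bearer token. Nothing is sent over the
  // network until GetResponse is called on the returned request.
  public static HttpWebRequest CreateWithKey(string uri, string apiKey)
  {
    HttpWebRequest request;

    request = WebRequest.CreateHttp(uri);
    request.Headers.Add("Ocp-Apim-Subscription-Key", apiKey);
    request.Accept = "application/json";

    return request;
  }
}
```

The trade-off is that the key travels with every request instead of a short-lived token, but it removes the token request and expiry check entirely.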

Two-year-old bugs and why you shouldn't blindly ignore ReSharper

Imagine my surprise when the first thing that happened after changing URI constants was the program crashed in a place I wasn't expecting! As it turns out, there was a bug in the original code which just happened to have worked up until now.

When requesting an API token, the token is the body of the response. The class has a private GetResponseString method for pulling this out (which, incidentally, is also useful for debugging purposes). This method checks to see if a character set is defined on the HttpWebResponse (via the CharacterSet property) and if so uses that to read text appropriately, otherwise it falls back to UTF-8.

csharp
// WARNING! Broken code below

private string GetResponseString(HttpWebResponse response)
{
  Encoding encoding;
  string result;

  // ReSharper disable once AssignNullToNotNullAttribute
  encoding = !string.IsNullOrEmpty(response.CharacterSet) ? Encoding.UTF8 : Encoding.GetEncoding(response.CharacterSet);

  using (Stream stream = response.GetResponseStream())
  {
    using (StreamReader reader = new StreamReader(stream, encoding))
    {
      result = reader.ReadToEnd();
    }
  }

  return result;
}

At least, that was the theory. In reality, if a character set is present UTF-8 is always used, and if one is not present it tries to use the null value and crashes. ReSharper very helpfully warns you of this very thing with its "Possible 'null' assignment to entity marked with 'NotNull' attribute" warning, which I completely ignored; I'm so used to seeing it with various file APIs that evidently I treat it as noise without paying attention.

Oops. Well, it's fixed now!
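The corrected method isn't reproduced in this post, so the following is only a sketch of what the fix amounts to: the condition is flipped so that UTF-8 is the fallback rather than the default. To keep it testable without a live response it takes the character set as a string, and the ResponseEncoding class name is hypothetical. Falling back to UTF-8 for an unrecognised character set name is an extra assumption on my part.

```csharp
using System;
using System.Text;

// Hypothetical helper illustrating the corrected logic.
public static class ResponseEncoding
{
  // Returns the encoding named by the response's character set, or
  // UTF-8 when no character set is given or the name is not recognised.
  public static Encoding Resolve(string characterSet)
  {
    if (string.IsNullOrEmpty(characterSet))
    {
      return Encoding.UTF8;
    }

    try
    {
      return Encoding.GetEncoding(characterSet);
    }
    catch (ArgumentException)
    {
      // Unknown or malformed charset name; assume UTF-8.
      return Encoding.UTF8;
    }
  }
}
```

Inside the class this would be called as something like ResponseEncoding.Resolve(response.CharacterSet) when constructing the StreamReader.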

Getting the list of languages

The GetLanguagesForTranslate API has been replaced with languages and rather than returning a simple list of language codes, it now returns a little bit more - at the most basic level it includes the name (native and localised) and the language direction.

json
{
  "translation": {
    "af": {
      "name": "Afrikaans",
      "nativeName": "Afrikaans",
      "dir": "ltr"
    },
    "ar": {
      "name": "Arabic",
      "nativeName": "العربية",
      "dir": "rtl"
    },
    "bg": {
      "name": "Bulgarian",
      "nativeName": "Български",
      "dir": "ltr"
    },

    ! SNIP !

    "yue": {
      "name": "Cantonese (Traditional)",
      "nativeName": "粵語 (繁體中文)",
      "dir": "ltr"
    },
    "zh-Hans": {
      "name": "Chinese Simplified",
      "nativeName": "简体中文",
      "dir": "ltr"
    },
    "zh-Hant": {
      "name": "Chinese Traditional",
      "nativeName": "繁體中文",
      "dir": "ltr"
    }
  }
}

Using the scope query parameter, you specify a comma-separated list of group information to return. The available group names are translation, transliteration and dictionary. As I'm only interested in translations, that is the only scope I'll provide. As an aside, if you omit this parameter it will act as if you had specified all scopes.

Our original GetLanguages function changes to this:

csharp
public string[] GetLanguages()
{
  string[] results;
  HttpWebRequest request;

  this.CheckToken();

  request = WebRequest.CreateHttp("https://api.cognitive.microsofttranslator.com/languages?api-version=3.0&scope=translation");
  request.Headers.Add("Authorization", "Bearer " + _authorizationToken);
  request.Accept = "application/json";

  using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
  {
    using (Stream stream = response.GetResponseStream())
    {
      using (StreamReader reader = new StreamReader(stream, this.GetResponseEncoding(response)))
      {
        Dictionary<string, Dictionary<string, Dictionary<string, string>>> jsonEntities;
        Dictionary<string, Dictionary<string, string>> languages;

        jsonEntities = new Dictionary<string, Dictionary<string, Dictionary<string, string>>>();

        Json.ParseInto(reader, jsonEntities);

        results = jsonEntities.TryGetValue("translation", out languages) ? languages.Keys.ToArray() : new string[0];
      }
    }
  }

  return results;
}

I have to admit, I'm not a fan of this awful "dictionary of dictionary of dictionaries" nonsense. But as the translation element is an object with language codes as property names rather than an array, offhand I'm not sure how I'd get that converted into a strongly typed keyed collection, regardless of whether I'm using PetaJson or JSON.net. I will be revisiting this in a future post.
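For comparison, and purely as a sketch, here is how the innermost dictionary could be replaced with a typed class using System.Text.Json. Note that this swaps in a different library from the PetaJson code used throughout this post, and the LanguageInfo and LanguageParser names are hypothetical. It assumes the response was requested with scope=translation so that every entry has the name, nativeName and dir shape.

```csharp
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical POCO for one entry under the "translation" element.
public sealed class LanguageInfo
{
  [JsonPropertyName("name")]
  public string Name { get; set; }

  [JsonPropertyName("nativeName")]
  public string NativeName { get; set; }

  [JsonPropertyName("dir")]
  public string Direction { get; set; }
}

public static class LanguageParser
{
  // Parses a /languages response and returns the "translation" scope
  // as language code -> typed details, or an empty dictionary if the
  // scope is missing.
  public static Dictionary<string, LanguageInfo> ParseTranslationLanguages(string json)
  {
    Dictionary<string, Dictionary<string, LanguageInfo>> scopes;
    Dictionary<string, LanguageInfo> languages;

    scopes = JsonSerializer.Deserialize<Dictionary<string, Dictionary<string, LanguageInfo>>>(json);

    return scopes != null && scopes.TryGetValue("translation", out languages)
      ? languages
      : new Dictionary<string, LanguageInfo>();
  }
}
```

There are still two levels of dictionary here (scope, then language code), but the values are at least typed, so languages["ar"].Direction replaces a string lookup of "dir".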

I'm also not a fan of having to load the entire JSON string into parsed objects and then discard most of it. PetaJson has a Reader class which behaves very much like XmlReader and ideally I should have used that to walk the JSON.
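To give an idea of what walking the JSON looks like, here is a sketch using Utf8JsonReader from System.Text.Json, which is a forward-only reader in the same spirit. This is not the PetaJson Reader mentioned above, and the LanguageCodeReader name is hypothetical. It collects only the language codes and never builds objects for the name, nativeName or dir values, although it does still need the whole payload in a byte array.

```csharp
using System.Collections.Generic;
using System.Text;
using System.Text.Json;

// Hypothetical helper; uses System.Text.Json rather than PetaJson.
public static class LanguageCodeReader
{
  // Walks the /languages JSON token by token and returns the property
  // names found directly beneath the top-level "translation" element.
  public static List<string> ReadCodes(string json)
  {
    List<string> codes;
    bool inTranslation;
    Utf8JsonReader reader;

    codes = new List<string>();
    inTranslation = false;
    reader = new Utf8JsonReader(Encoding.UTF8.GetBytes(json));

    while (reader.Read())
    {
      if (reader.TokenType != JsonTokenType.PropertyName)
      {
        continue;
      }

      if (reader.CurrentDepth == 1)
      {
        // Top-level property: a scope name such as "translation".
        inTranslation = reader.ValueTextEquals("translation");
      }
      else if (reader.CurrentDepth == 2 && inTranslation)
      {
        // Second-level property inside "translation": a language code.
        codes.Add(reader.GetString());
      }
    }

    return codes;
  }
}
```

Anything nested deeper than the language codes is read past without being stored, which is the main saving over parsing into nested dictionaries.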

In the above code, I've left the obtaining and setting of an authentication token in place. However, unlike the v2 API, authentication is not required for using the /languages API. It is still required for actions that incur billing, such as the /translate API itself.

Getting language names

As I've laboriously noted above, in the v3 API Microsoft combined the original GetLanguagesForTranslate and GetLanguageNames into a single API call. Getting the actual names for each language is therefore a simple case of taking the above code and pulling out a little more information from the nest of vipers, sorry, dictionaries.

json
{
  "translation": {
    "ar": {
      "name": "Arabic",
      "nativeName": "العربية",
      "dir": "rtl"
    }
  }
}

Remember that the JSON output includes name, nativeName and dir attributes. This time around, we're interested in pulling out the name field. This is the display name in the requested locale (nativeName is the display name in the locale of the language itself). But how do you specify the requested locale? In v2, you used the locale query parameter, but for v3 it is done by setting the Accept-Language header.

There's also another important difference. With the v2 API, you made a POST and the body had a list of the languages for which you wanted localised names. For v3, however, there is no such filtering available; it will return localised names for all supported languages.

As I'm trying to keep the same behaviour, that means I'm going to need to add this filtering myself (although by the time I'd finished this article I was questioning my reasoning for not just rewriting the class from scratch in a modern fashion and forcing our internal application to deal with it).

csharp
public string[] GetLocalizedLanguageNames(string locale, string[] languages)
{
  string[] results;
  HttpWebRequest request;

  this.CheckToken();

  request = WebRequest.CreateHttp("https://api.cognitive.microsofttranslator.com/languages?api-version=3.0&scope=translation");
  request.Headers.Add("Authorization", "Bearer " + _authorizationToken);
  request.Headers.Add("Accept-Language", locale);
  request.Accept = "application/json";

  using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
  {
    using (Stream stream = response.GetResponseStream())
    {
      using (StreamReader reader = new StreamReader(stream, this.GetResponseEncoding(response)))
      {
        Dictionary<string, Dictionary<string, Dictionary<string, string>>> jsonEntities;
        Dictionary<string, Dictionary<string, string>> responseLanguages;

        jsonEntities = new Dictionary<string, Dictionary<string, Dictionary<string, string>>>();

        Json.ParseInto(reader, jsonEntities);

        if (jsonEntities.TryGetValue("translation", out responseLanguages))
        {
          results = new string[languages.Length];

          for (int i = 0; i < languages.Length; i++)
          {
            Dictionary<string, string> languageData;

            if (responseLanguages.TryGetValue(languages[i], out languageData))
            {
              results[i] = languageData["name"];
            }
          }
        }
        else
        {
          results = new string[0];
        }
      }
    }
  }

  return results;
}

I really don't like this code. Too late for second guessing now though!

Translating text

The final part of this migration exercise is the actual text translation. Again, there are some small differences from v2 but nothing too troublesome.

Firstly, the text to translate is no longer a query parameter, but part of the body text as a JSON object. This makes sense in a way as for v3, Microsoft merged the Translate and TranslateArray APIs into one. But it still means it's slightly more awkward to use.

The body JSON is simple enough and looks like this:

json
[
    {"Text": "Hello World"}
]

Note that for some reason the Text attribute is in title case rather than the lower case used in all the other examples.

The languages to convert from and to are still specified via the from and to query parameters, as with v2.

The response is a JSON array, similar to the following.

json
[
  {
    "translations": [
      {
        "text": "Hallo Welt",
        "to": "de"
      }
    ]
  }
]

However, it can include a great deal more information depending on if you use auto detection, transliteration and more. I'm not covering any of that here in my 1:1 conversion.

As I don't really want to manually write JSON and deal with having to escape text, I'll create an interim object and use PetaJson to write it out. I've made it private for now as it is only used inside of this method. It was also at this point I threw up my hands in disgust at more dictionary of dictionaries and wrote a few limited POCOs for the response output that I'm interested in.

csharp
partial class TranslationClient
{
  private class TextInput
  {
    private string _text;

    public TextInput()
    { }

    public TextInput(string text)
    {
      _text = text;
    }

    [Json("Text")]
    public string Text
    {
      get { return _text; }
      set { _text = value; }
    }
  }
  
  private class TranslationResult
  {
    private string _targetLanguage;
    private string _text;

    [Json("to")]
    public string TargetLanguage
    {
      get { return _targetLanguage; }
      set { _targetLanguage = value; }
    }

    [Json("text")]
    public string Text
    {
      get { return _text; }
      set { _text = value; }
    }
  }

  private class TranslateResponse
  {
    private TranslationResult[] _translations;

    public TranslationResult[] Translations
    {
      get { return _translations; }
      set { _translations = value; }
    }
  }
}

With the helpers in place, I can now expand the Translate method to work with the v3 API:

csharp
public string Translate(string text, string from, string to)
{
  HttpWebRequest request;
  string result;
  string queryString;
  TranslateResponse[] responses;

  this.CheckToken();

  queryString = "?api-version=3.0&from=" + from + "&to=" + to;

  request = WebRequest.CreateHttp("https://api.cognitive.microsofttranslator.com/translate" + queryString);
  request.Headers.Add("Authorization", "Bearer " + _authorizationToken);
  request.ContentType = "application/json";
  request.Accept = "application/json";
  request.Method = WebRequestMethods.Http.Post;

  using (Stream stream = request.GetRequestStream())
  {
    using (TextWriter writer = new StreamWriter(stream, Encoding.UTF8))
    {
      Json.Write(writer, new[] { new TextInput(text) });
    }
  }

  using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
  {
    using (Stream stream = response.GetResponseStream())
    {
      using (StreamReader reader = new StreamReader(stream, this.GetResponseEncoding(response)))
      {
        responses = Json.Parse<TranslateResponse[]>(reader);
      }
    }
  }

  result = null;

  if (responses != null && responses.Length == 1)
  {
    TranslateResponse translation;

    translation = responses[0];

    if (translation.Translations != null && translation.Translations.Length == 1)
    {
      result = translation.Translations[0].Text;
    }
  }

  return result;
}

Much more complicated than the previous version! Still it works. Doesn't it?

Wait, the output is different?

After I had the conversion complete, I noticed that one of the variations of Klingon wasn't listed in the language list any more. Curious, I ran the original application and back it popped. At first I thought they might have been combined with the new script support but this doesn't seem to be the case. Fortunately, no user has asked for our software to be in Klingon, so I can ignore this omission!

I also noted the codes for Chinese have changed. In v2 they are zh-CHS (Simplified) and zh-CHT (Traditional), but in v3 they are now zh-Hans and zh-Hant. Apparently the latter is the proper way of doing things now, but this is a breaking change for me as various shell scripts and data files refer to the old style and will need changing.
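Until those scripts and data files are updated, one stop-gap would be to translate the old codes on the way in. The sketch below is an assumption about how that could be done rather than something in the demonstration project; the LanguageCodeMigration class and ToV3 method are hypothetical names, and it only covers the two renames I noticed.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the original TranslationClient.
public static class LanguageCodeMigration
{
  // Only the two renames noted in this post; there may be others.
  private static readonly Dictionary<string, string> _renamed =
    new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
    {
      { "zh-CHS", "zh-Hans" },
      { "zh-CHT", "zh-Hant" }
    };

  // Returns the v3 code for a known v2 code; any other value
  // (including null) is returned unchanged.
  public static string ToV3(string code)
  {
    string mapped;

    return code != null && _renamed.TryGetValue(code, out mapped) ? mapped : code;
  }
}
```

Calling LanguageCodeMigration.ToV3 on the from and to arguments before building the query string would let old callers keep passing zh-CHS and zh-CHT.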

Even more oddly, however, the first part of the "Major-General's Song" that defaults in the demonstration program now translates differently in the two versions.

English Text:

text
I am the very model of a modern Major-General,
I've information vegetable, animal, and mineral,
I know the kings of England, and I quote the fights historical
From Marathon to Waterloo, in order categorical;a
I'm very well acquainted, too, with matters mathematical,
I understand equations, both the simple and quadratical,
About binomial theorem I'm teeming with a lot o' news,
With many cheerful facts about the square of the hypotenuse.

German Translation (version 2 API):

text
Ich bin sehr Modell modern Major-General,
Ich habe Informationen Gemüse, Tiere und Mineralien,
Ich kenne die Könige von England, und ich zitiere die historischen Kämpfe
Vom Marathon zu Waterloo in Reihenfolge kategorische; ein
Ich bin sehr gut, auch mit mathematischen Fragen kennen,
Ich verstehe Gleichungen, einfache und quadratischem,
Über binomiale Theorem bin ich mit viel o-Nachrichten nur so wimmelt,
Mit vielen fröhlichen Fakten über das Quadrat der Hypotenuse.

German Translation (version 3 API):

text
Ich bin das Vorbild eines modernen Generalstabs,
Ich habe Informationen pflanzliche, tierische und mineralische,
Ich kenne die Könige von England, und ich zitiere die Kämpfe historisch
Von Marathon bis Waterloo, in der Reihenfolge kategorisch; ein
Ich bin sehr gut mit den Fragen mathematisch,
Ich verstehe Gleichungen, sowohl die einfachen als auch die quadratischen,
Über binomiale Theorem Ich bin voller viel o ' News,
Mit vielen fröhlichen Fakten über das Quadrat der Hypotenuse.

I have no idea as to why this is. I assume it's because, according to the documentation, it uses "neural machine translation by default", although it doesn't seem to state how to disable it.

In the end, I updated the demonstration program to include both the v2 and v3 classes so I could toggle between them to easily see the differences.

To be continued

Attached to this post is an upgraded demonstration project which is a little more robust than the methods above. It is also available on our GitHub page. Note that you will need to use your own API key, as the one in the demonstration program has been invalidated.

I'm really not a fan of the new code and have made a note on my blog Todo list to revisit this topic in the future and rewrite it properly using modern techniques, and also to investigate some of the additional functionality the translation API offers.

Update History

  • 2019-04-11 - First published
  • 2020-11-22 - Updated formatting
