Tuesday, February 2, 2010

Recode and thread-safety

Someone just asked me about the thread-safety of the Recode library. This exchange might interest other Recode users, so I am saving it here. At least until I revise and integrate this information elsewhere! ☺
  • My boss is asking if it [the Recode library] is thread-safe… I read the online documentation … but I couldn't find a word about that. Hope you can help me provide a good answer!
For me, a good answer is an honest one! The truth is that I never verified through actual testing that it is thread-safe. On the other hand, I designed the library so that it should be.

Long ago, when Ulrich Drepper, who maintains GNU libc, wanted to add support for various character sets to libc, he asked me whether we could use Recode for it, and asked me to produce a reentrant library out of Recode, because threading considerations were getting hot everywhere in libc. At the same time, Richard Stallman agreed that I change the license of Recode from GPL to LGPL, so it could later be recycled into libc. So I did both: I extracted a library out of the code, and I changed the license.

I did not want to dive deep into the intricacies of thread control and exclusion, because there were too many implementations to consider at the time, all different, and each with a reputation for having its own flurry of bugs. Instead, I tried to design the library so as to minimize the danger of threading problems as far as possible, while completely avoiding thread-related library calls.

A bit later, Ulrich shared a design he had in mind for implementing the iconv specs in libc, and suggested that I rewrite the Recode library to support it. This would have required me to significantly restrict or weaken the Recode specs. My opinion was that we should not make ourselves miserable for the sole sake of following questionable new standards. We did not reach an agreement, and so the Recode library did not get used in GNU libc.
  • In my code, I initialize a single RECODE_OUTER, and I use that variable every time I call recode_new_request(). Every thread uses its own RECODE_REQUEST.
This is exactly how I meant it to be used in threading contexts. I'm not so sure, by now, that this is the best approach, because producing a RECODE_OUTER per thread currently implies a non-negligible CPU overhead, as well as duplication of in-memory descriptions. On the other hand, this is indeed the safe way to proceed.
  • Can you please explain a little bit what thread issues could happen if two threads call any of the following functions at the same time?

     recode_new_request (outer);
     recode_scan_request (request, conversion);
     recode_string (request, string);
     recode_delete_request (request);


First of all, I assumed that thread-safety is already guaranteed in malloc and free, and in common I/O operations. If these are not thread-safe, the Recode library surely is not either. Moreover, no thread exclusion is explicitly taken anywhere in the Recode library itself.

There should not be any problem, by design, because requests are derived out of outers. But once again, I never tried it myself, and I have nothing to substantiate an assertion that there is no bug in that area. If I had this need myself, I would probably not fear using the Recode library, but that is surely no guarantee for others.

Writing this reply helps jog my memory, and something comes to mind. I'm not even sure I'm right, as I thought about this quite a while ago. There are a few places in the code where pre-conditioning is computed on the fly at execution time. There are slight time windows in which the same pre-conditioning might be computed simultaneously by different threads. Both threads would then compute the exact same tables, so it does not matter which thread finally writes the pointer to the structure last, overwriting the equivalent pointer written by the other threads. This might cause some memory to be wasted. This might also induce a problem at cleanup time if there is any cleanup in that area, though I think there is no cleanup for pre-conditioning. Making this fully clean would require thread exclusion (locks).

If you happen to know the gettext machinery, which Recode uses here and there (but not in the library part, as far as I remember), there are similar problems. For each place where the gettext macro is expanded, the expansion contains a cache; gettext saves the translation on the first call, and uses the cached copy on subsequent calls. If two threads were using gettext simultaneously, two translations would occur, both yielding the same result, both storing it in the same cache, and one of the translations would be wasted. (I may be wrong!)

In any case, if someone reported a lack of reentrancy somewhere, I would undoubtedly consider it a bug, and would happily correct it. However, if a threading bug were rooted in some obscure area (I should rather say: one that I'm not aware of! ☺), requiring deeper knowledge of the internals of a few likely incompatible libraries, and calling for complex auto-configuration machinery, it would likely fall outside my competence to solve it.