Discussion:
[Development] Modifying and accessing environment variables in Qt
Simon Hausmann
2015-04-28 08:52:32 UTC
Permalink
Hi,

Have you ever seen crashes in the CI system that ended up in
getenv? For example in

http://testresults.qt.io/ci/QtBase_5.4.2_Integration/build_00020/linux-g++_no-widgets_Ubuntu_12.04_x64/log.txt.gz

There are many more of these kinds of crashes throughout our
tests and whenever it happens we ignore the problem and just
press the "stage" button again. Must be some instability or bad
configuration on the CI machine, right?

It turns out, it isn't. Lars and I got to the bottom of this earlier
this week and the finding is a different one:

In shockingly many places in Qt itself as well as in our unit tests we access
environment variables and in some places we also change them.
The problem is that access to the environment is an operation that
is inherently unsafe from a threading perspective. There's a process
global "environ" variable, getenv() returns a bare pointer and
putenv/setenv modify that very environment. And there's no way around
that, the C run-time is not safe. Let one thread call getenv() - which iterates
through the environ array - and then let another thread call putenv/setenv
to modify the same array. The result is a seemingly random crash - or as it turns
out: Frequent crashes in our CI system when trying to get changes into Qt :)
From a practical perspective we are particularly vulnerable on Linux to this,
because QThread itself calls getenv() each time we start up a thread - it's
the can-we-use-glib-event-dispatcher-or-was-it-disabled check. Now we could
try to change qthread_unix.cpp to not do that, but let's be honest: That's a
drop in the ocean of places where we access (reading and writing!) the
environment in Qt. Just grep for putenv/setenv/getenv in qtbase/src
and be surprised.

There are various options about what we can do with different degrees of "perfection",
but ultimately it's all going to require a compromise. The option that we are favoring
at the moment is two-fold:

1) Policy in Qt is to use the Qt wrappers for accessing the environment (qgetenv, etc.).

2) These functions we protect with a mutex.

Yes, this isn't perfect, because we still include large quantities of code in src/3rdparty in Qt
that has unprotected calls to getenv(). But then that hasn't stopped us from patching
that code in the past.

The concrete proposal of change is at

https://codereview.qt-project.org/#/c/111158/

What do you think?

I'd most appreciate a +2 or a concrete patch with an alternate solution.


Simon
Matthew Woehlke
2015-04-28 14:26:13 UTC
Permalink
[getenv/setenv not thread safe]
There are various options about what we can do with different degrees of "perfection",
but ultimately it's all going to require a compromise. The option that we are favoring
1) Policy in Qt is to use the Qt wrappers for accessing the environment (qgetenv, etc.).
2) These functions we protect with a mutex.
The concrete proposal of change is at
https://codereview.qt-project.org/#/c/111158/
What do you think?
Is there a reason not to use a read/write mutex for this?
--
Matthew
Simon Hausmann
2015-04-29 08:02:29 UTC
Permalink
Post by Matthew Woehlke
[getenv/setenv not thread safe]
There are various options about what we can do with different degrees of
"perfection", but ultimately it's all going to require a compromise. The
1) Policy in Qt is to use the Qt wrappers for accessing the environment
(qgetenv, etc.).
2) These functions we protect with a mutex.
The concrete proposal of change is at
https://codereview.qt-project.org/#/c/111158/
What do you think?
Is there a reason not to use a read/write mutex for this?
In my opinion the overhead of the read-write lock is not worth the prospective
gain in the unlikely event of repeated concurrent environment access.



Simon
Thiago Macieira
2015-04-29 15:45:24 UTC
Permalink
Post by Simon Hausmann
In my opinion the overhead of the read-write lock is not worth the
prospective gain in the unlikely event of repeated concurrent environment
access.
Agreed.

An RWLock has a benefit when there are multiple *read* threads trying to access
the same data, frequently and at the same time and such access is not short in
time. The added benefit is fairness to the writer.

A mutex is much simpler. For the case here, a mutex suffices: the accesses are
short and infrequent.
--
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel Open Source Technology Center
Marc Mutz
2015-04-29 09:55:48 UTC
Permalink
Post by Simon Hausmann
There are various options about what we can do with different degrees of
"perfection", but ultimately it's all going to require a compromise. The
1) Policy in Qt is to use the Qt wrappers for accessing the environment (qgetenv, etc.).
2) These functions we protect with a mutex.
Yes, this isn't perfect, because we still include large quantities of code
in src/3rdparty in Qt that has unprotected calls to getenv(). But then
that hasn't stopped us from patching that code in the past.
I don't believe this is the right fix.

1. man putenv on Linux says that since glibc 2.1, putenv is reentrant. In
any case, it's almost trivial to fix this in libc (using CAS, as Ossi
suggested in a comment on v1 of the patch), but impossible outside.
2. what business does a program have, anyway, of modifying the environment
after threads may have started? Such code should be fixed. Making the Qt
wrappers "save" could lead to more code doing nonsense instead of being
fixed.
3. C functions such as localtime(3) are calling tzset(3) which reads TZ.
You may be able to patch 3rd-party code, but you cannot patch libc.

Callers of putenv()/setenv() should be fixed to not do so after initial
initialisation, ie. when threads may have started. This is how most
initialisation in libraries is required to be handled, see e.g.
https://www.gnupg.org/documentation/manuals/gpgme/Library-Version-Check.html
and I don't see why Qt should do something different from every other
C/C++ library on the planet, at least not for such a common problem as
library initialisation.

Thanks,
Marc
--
Marc Mutz <***@kdab.com> | Senior Software Engineer
KDAB (Deutschland) GmbH & Co.KG, a KDAB Group Company
Tel: +49-30-521325470
KDAB - The Qt Experts
Marc Mutz
2015-04-29 10:20:05 UTC
Permalink
Post by Marc Mutz
1. man putenv on Linux says that since glibc 2.1, putenv is reentrant
http://osxr.org/glibc/source/stdlib/setenv.c#0109

but getenv isn't protected by any lock, indeed:

http://osxr.org/glibc/source/stdlib/getenv.c#0033
--
Marc Mutz <***@kdab.com> | Senior Software Engineer
KDAB (Deutschland) GmbH & Co.KG, a KDAB Group Company
Tel: +49-30-521325470
KDAB - The Qt Experts
Simon Hausmann
2015-04-29 09:15:21 UTC
Permalink
Post by Marc Mutz
Post by Simon Hausmann
There are various options about what we can do with different degrees of
"perfection", but ultimately it's all going to require a compromise. The
1) Policy in Qt is to use the Qt wrappers for accessing the environment
(qgetenv, etc.).
2) These functions we protect with a mutex.
Yes, this isn't perfect, because we still include large quantities of code
in src/3rdparty in Qt that has unprotected calls to getenv(). But then
that hasn't stopped us from patching that code in the past.
I don't believe this is the right fix.
1. man putenv on Linux says that since glibc 2.1, putenv is reentrant. In
any case, it's almost trivial to fix this in libc (using CAS, as Ossi
suggested in a comment on v1 of the patch), but impossible outside.
Does reentrancy here refer to the supplied arguments of putenv? Can you
explain how this helps with concurrent getenv invocations?

(Sorry, I must be missing something - I don't see how this relates to the
crashes)
Post by Marc Mutz
2. what business does a program have, anyway, of modifying the environment
after threads may have started? Such code should be fixed. Making the Qt
wrappers "save" could lead to more code doing nonsense instead of being
fixed.
3. C functions such as localtime(3) are calling tzset(3) which reads TZ.
You may be able to patch 3rd-party code, but you cannot patch libc.
Right, there are more "unsafe" functions in libc that we cannot change. But
isn't that an orthogonal topic to the issue of our tests crashing?
Post by Marc Mutz
Callers of putenv()/setenv() should be fixed to not do so after initial
initialisation, ie. when threads may have started. This is how most
initialisation in libraries is required to be handled, see e.g.
https://www.gnupg.org/documentation/manuals/gpgme/Library-Version-Check.htm
l and I don't see why Qt should do something different from every other
C/C++ library on the planet, at least not for such a common problem as
library initialisation.
How can we do this in Qt and in our tests?

It would all have to be done before the application constructor, because
already our platform plugins start threads. This is in contrast to individual
test functions in Qt calling putenv().


Simon
Thiago Macieira
2015-04-29 15:47:22 UTC
Permalink
Post by Simon Hausmann
How can we do this in Qt and in our tests?
The problem here is the fact that they're tests. Unlike regular applications,
we're doing things at times when applications shouldn't be doing the same
action.

Strictly speaking, we should fix the tests.
--
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel Open Source Technology Center
Marc Mutz
2015-04-30 08:50:59 UTC
Permalink
Post by Simon Hausmann
Post by Marc Mutz
1. man putenv on Linux says that since glibc 2.1, putenv is reentrant. In
any case, it's almost trivial to fix this in libc (using CAS, as Ossi
suggested in a comment on v1 of the patch), but impossible outside.
Does reentrancy here refer to the supplied arguments of putenv? Can you
explain how this helps with concurrent getenv invocations?
(Sorry, I must be missing something - I don't see how this relates to the
crashes)
See my other mail. It prevents putenv/putenv races, but not putenv/getenv, indeed.
Post by Simon Hausmann
Post by Marc Mutz
2. what business does a program have, anyway, of modifying the
environment after threads may have started? Such code should be fixed.
Making the Qt wrappers "save" could lead to more code doing nonsense
instead of being fixed.
This is my main point.
Post by Simon Hausmann
Post by Marc Mutz
3. C functions such as localtime(3) are calling tzset(3) which reads TZ.
You may be able to patch 3rd-party code, but you cannot patch libc.
Right, there are more "unsafe" functions in libc that we cannot change.
But isn't that an orthogonal topic to the issue of our tests crashing?
You want to fix races between environment accesses. I'm saying that libc functions
are implicitly accessing the environment, too, and you cannot insert you Qt mutex
there. So this is isn't really orthogonal.
Post by Simon Hausmann
Post by Marc Mutz
Callers of putenv()/setenv() should be fixed to not do so after initial
initialisation, ie. when threads may have started. This is how most
initialisation in libraries is required to be handled, see e.g.
https://www.gnupg.org/documentation/manuals/gpgme/Library-Version-Check.
htm
l and I don't see why Qt should do something different from every other
C/C++ library on the planet, at least not for such a common problem as
library initialisation.
How can we do this in Qt and in our tests?
Tests are presumably easy. If everything else fails, QtTest could execute itself
anew for each test. After a quick scan, the only thing I'm worried about in QtBase
is qglxconvenience.cpp, which temporarily modifies the environment.
Post by Simon Hausmann
It would all have to be done before the application constructor, because
already our platform plugins start threads. This is in contrast to
individual test functions in Qt calling putenv().
Platform plugins are creating threads before the Q*Application ctor runs? How is
that possible if the Q*Application ctor is the first Qt code that gets called from
the application? If they do, it means that the threads are already running when
control enters main(). And that's unacceptable, because it deprives application
authors of the window of control where there's only one thread of execution.
Guaranteed. So you cannot, say, use gpgme with a Qt program.

Qt can't make the rules. It has to abide by them.

As an aside: That so much code in Qt checks the environment for some more-or-less
obscure options indicates the need for explicit management of these options. I'll
give this some more thought.

Thanks,
Marc
--
Marc Mutz <***@kdab.com> | Senior Software Engineer
KDAB (Deutschland) GmbH & Co.KG, a KDAB Group Company
Tel: +49-30-521325470
KDAB - The Qt Experts
Simon Hausmann
2015-04-30 08:06:05 UTC
Permalink
Post by Marc Mutz
Post by Simon Hausmann
Post by Marc Mutz
2. what business does a program have, anyway, of modifying the
environment after threads may have started? Such code should be fixed.
Making the Qt wrappers "save" could lead to more code doing nonsense
instead of being fixed.
This is my main point.
That is a valid point, for sure. Inherently valid due to the nature of the way
the C runtime implements this and is accessible to everyone.
Post by Marc Mutz
Post by Simon Hausmann
Post by Marc Mutz
3. C functions such as localtime(3) are calling tzset(3) which reads TZ.
You may be able to patch 3rd-party code, but you cannot patch libc.
Right, there are more "unsafe" functions in libc that we cannot change.
But isn't that an orthogonal topic to the issue of our tests crashing?
You want to fix races between environment accesses. I'm saying that libc
functions are implicitly accessing the environment, too, and you cannot
insert you Qt mutex there. So this is isn't really orthogonal.
I think you may be misunderstanding me. I don't want to fix all races, I want
to fix the common cases in Qt, in particular the ones that affect our everyday
work of trying to integrate changes by passing tests. I'm looking for a
pragmatic improvement, not a perfect solution. I see however that you agree to
the proposed change being acceptable as a stop-gap - so thank you :)
Post by Marc Mutz
Post by Simon Hausmann
Post by Marc Mutz
Callers of putenv()/setenv() should be fixed to not do so after initial
initialisation, ie. when threads may have started. This is how most
initialisation in libraries is required to be handled, see e.g.
https://www.gnupg.org/documentation/manuals/gpgme/Library-Version-Check
.
htm
l and I don't see why Qt should do something different from every other
C/C++ library on the planet, at least not for such a common problem as
library initialisation.
How can we do this in Qt and in our tests?
Tests are presumably easy. If everything else fails, QtTest could execute
itself anew for each test.
Can you elaborate on that? I don't think I fully understand what you mean.

The only portable solution that I can think of is to replace each use of an
environment variable with an autotest-only exported variable in Qt, that is
used instead. That may be a path forward and I'd be happy to help review
changes towards that.
Post by Marc Mutz
After a quick scan, the only thing I'm worried
about in QtBase is qglxconvenience.cpp, which temporarily modifies the
environment.
Post by Simon Hausmann
It would all have to be done before the application constructor, because
already our platform plugins start threads. This is in contrast to
individual test functions in Qt calling putenv().
Platform plugins are creating threads before the Q*Application ctor runs?
How is that possible if the Q*Application ctor is the first Qt code that
gets called from the application? If they do, it means that the threads are
already running when control enters main(). And that's unacceptable,
because it deprives application authors of the window of control where
there's only one thread of execution. Guaranteed. So you cannot, say, use
gpgme with a Qt program.
No, I do not claim that threads are created before the Q*Application
constructor runs. Please re-read what I wrote. I said that the platform
plugins start threads, and they are loaded in the Q*Application constructor.
So environment variables have to be set before there, if they affect the
platform plugins.


Simon
Kevin Kofler
2015-05-10 14:34:23 UTC
Permalink
Post by Marc Mutz
Tests are presumably easy. If everything else fails, QtTest could execute
itself anew for each test. After a quick scan, the only thing I'm worried
about in QtBase is qglxconvenience.cpp, which temporarily modifies the
environment.
That would be my code. The environment is only modified if
QT_XCB_FORCE_SOFTWARE_OPENGL is set, a workaround for old OpenGL-1.x-only
drivers. I simply do not know a better way to force Mesa's libGL to use
software rendering in Qt applications without making it global for all
applications.

There is no environment changing (and thus no race) in the common case where
the workaround environment variable QT_XCB_FORCE_SOFTWARE_OPENGL is not set.

Kevin Kofler
Kevin Kofler
2015-05-10 14:35:00 UTC
Permalink
Post by Simon Hausmann
Yes, this isn't perfect, because we still include large quantities of code
in src/3rdparty in Qt that has unprotected calls to getenv(). But then
that hasn't stopped us from patching that code in the past.
That will not help on distributions using system copies of third-party
libraries.

Kevin Kofler

Loading...