[BUG RFD]selinux: sidtab_context_to_sid return -NOMEM when concurrent with security_load

Discussion:

[BUG RFD]selinux: sidtab_context_to_sid return -NOMEM when concurrent with security_load_policy

Li Kun

2017-12-23 08:03:43 UTC

Hi all,
When i start a docker container, the runc will call selinux_setprocattr
to set the exec_sid before start the container.
Meanwhile if i use "semodule -i" to load a policy pp, the old sidtab
will be shutdown before switch to the new sidtab, and cause
sidtab_context_to_sid failed with the errno -NOMEM.

Error message:
Dec 14 09:18:18 wuwenming_vudfua_0 docker:
time="2017-12-14T09:18:18.549862564+08:00" level=error msg="Handler for
GET /v1.23/containers/192.168.0.2:9082/dfs_d:V100R013C11SPC000BCD2/json
returned error: No such container:
192.168.0.2:9082/dfs_d:V100R013C11SPC000BCD2"
Dec 14 09:18:18 wuwenming_vudfua_0 kernel: [ 48.463179] SELinux:
security_context_to_sid(system_u:object_r:svirt_sandbox_file_t:s0:c290,c371)
failed for (dev overlay, type overlay) errno=-12
Dec 14 09:18:18 wuwenming_vudfua_0 dbus-daemon: dbus[720]: avc: received
policyload notice (seqno=2)
Dec 14 09:18:18 wuwenming_vudfua_0 dbus[720]: avc: received policyload
notice (seqno=2)

runc semodule
->selinux_setprocattr ->security_load_policy
->security_context_to_sid ->sidtab_shutdown(oldsidtab)
->read_lock(&policy_rwlock); ->sidtab_map(&oldsidtab,
clone_sid, &newsidtab);

->sidtab_context_to_sid
{
....
if (s->next_sid == UINT_MAX || s->shutdown) {
ret = -ENOMEM;
....
}
->read_unlock(&policy_rwlock);
->write_lock_irq(&policy_rwlock);
->"switch to new policydb"
->write_unlock_irq(&policy_rwlock);

I wonder that if this is the intention or a bug?
If it is the intention, what should the application do when it get
-ENOMEM error, to try again?
If it is a bug, may the two options below suitable to solve the issue?
Option1:
use policy_rwlock to protect the "sidtab_shutdown" & "sidtab_map" ,
load_policy is rarely to be called after system bringing up,
so i think it will not impact much on the performance.
Option2:
Use a temp list to store the sid in oldsidtab after it is shutdown,
and deal with it
after cloning the major sids from oldsidtab to newsidtab and getting the
policy_rwlock.

--
Best Regards
Li Kun

Li Kun

2017-12-25 01:44:17 UTC

Permalink

--
Best Regards
Li Kun

Li Kun

2017-12-26 03:27:30 UTC

Permalink

Please ignore this duplicate mail which is resent due to the mail
server's error.

Thanks a lot.

--
Best Regards
Li Kun

Stephen Smalley

2017-12-28 14:57:58 UTC

Permalink

Li Kun

2017-12-29 03:14:20 UTC

Permalink

Post by Li Kun
Hi all,
When i start a docker container, the runc will call
selinux_setprocattr to set the exec_sid before start the container.
Meanwhile if i use "semodule -i" to load a policy pp, the old
sidtab will be shutdown before switch to the new sidtab, and cause
sidtab_context_to_sid failed with the errno -NOMEM.
time="2017-12-14T09:18:18.549862564+08:00" level=error
msg="Handler for GET
/v1.23/containers/192.168.0.2:9082/dfs_d:V100R013C11SPC000BCD2/json
<http://192.168.0.2:9082/dfs_d:V100R013C11SPC000BCD2/json>
192.168.0.2:9082/dfs_d:V100R013C11SPC000BCD2
<http://192.168.0.2:9082/dfs_d:V100R013C11SPC000BCD2>"
security_context_to_sid(system_u:object_r:svirt_sandbox_file_t:s0:c290,c371)
failed for (dev overlay, type overlay) errno=-12
received policyload notice (seqno=2)
Dec 14 09:18:18 wuwenming_vudfua_0 dbus[720]: avc: received
policyload notice (seqno=2)
runc semodule
->selinux_setprocattr ->security_load_policy
->security_context_to_sid
->sidtab_shutdown(oldsidtab)
->read_lock(&policy_rwlock); ->sidtab_map(&oldsidtab,
clone_sid, &newsidtab);
->sidtab_context_to_sid
{
....
if (s->next_sid == UINT_MAX || s->shutdown) {
ret = -ENOMEM;
....
}
->read_unlock(&policy_rwlock);
->write_lock_irq(&policy_rwlock);
->"switch to new policydb"
->write_unlock_irq(&policy_rwlock);
I wonder that if this is the intention or a bug?
If it is the intention, what should the application do when it get
-ENOMEM error, to try again?
If it is a bug, may the two options below suitable to solve the issue?
use policy_rwlock to protect the "sidtab_shutdown" &
"sidtab_map" , load_policy is rarely to be called after system
bringing up,
so i think it will not impact much on the performance.
Use a temp list to store the sid in oldsidtab after it is
shutdown, and deal with it
after cloning the major sids from oldsidtab to newsidtab and
getting the policy_rwlock.
The current logic is intentional. One might argue that a different
error code might be more suitable in this situation (e.g. EAGAIN or
something), but ENOMEM is already a possible error code when loading
policy upon transient OOM conditions, so the caller might have to
retry regardless. The policy write lock must only be held for the
minimal critical section as otherwise all system activity could be
blocked; it is only held when making the new policydb and sidtab
active not during their allocation and setup - ideally it would be
further reduced to a couple of pointer updates. Is this scenario
something you are encountering in actual practice or just a
theoretically possible one?

Thank you very much for your reply.
Yes, we encounter this issue in our implementation.
We have an intertion in our design that we can divide the policy into
several parts to load separately.
So after the red-hat linux bringing up, which has loaded the system
policy db released by redhat, we begin to load the policy released by
our product ,
and meanwhile someone want to start a docker container, and the runc get
an ENOMEM error like i mentioned above.
Should we merge all the policy into the system policy db and just load once?

--
Best Regards
Li Kun

Stephen Smalley

2017-12-29 17:25:03 UTC

Permalink

--
Best Regards
Li Kun

Typically you would install your policy module when your package is
installed and not need to do it at runtime.

Li Kun

2018-01-02 01:39:47 UTC

Permalink

Post by Li Kun

Post by Li Kun
Hi all,
When i start a docker container, the runc will call
selinux_setprocattr to set the exec_sid before start the container.
Meanwhile if i use "semodule -i" to load a policy pp, the old
sidtab will be shutdown before switch to the new sidtab, and cause
sidtab_context_to_sid failed with the errno -NOMEM.
time="2017-12-14T09:18:18.549862564+08:00" level=error
msg="Handler for GET
/v1.23/containers/192.168.0.2:9082/dfs_d:V100R013C11SPC000BCD2/json
<http://192.168.0.2:9082/dfs_d:V100R013C11SPC000BCD2/json>
192.168.0.2:9082/dfs_d:V100R013C11SPC000BCD2
<http://192.168.0.2:9082/dfs_d:V100R013C11SPC000BCD2>"
security_context_to_sid(system_u:object_r:svirt_sandbox_file_t:s0:c290,c371)
failed for (dev overlay, type overlay) errno=-12
avc: received policyload notice (seqno=2)
Dec 14 09:18:18 wuwenming_vudfua_0 dbus[720]: avc: received
policyload notice (seqno=2)
runc semodule
->selinux_setprocattr ->security_load_policy
->security_context_to_sid ->sidtab_shutdown(oldsidtab)
->read_lock(&policy_rwlock); ->sidtab_map(&oldsidtab,
clone_sid, &newsidtab);
->sidtab_context_to_sid
{
....
if (s->next_sid == UINT_MAX || s->shutdown) {
ret = -ENOMEM;
....
}
->read_unlock(&policy_rwlock);
->write_lock_irq(&policy_rwlock);
->"switch to new policydb"
->write_unlock_irq(&policy_rwlock);
I wonder that if this is the intention or a bug?
If it is the intention, what should the application do when
it get -ENOMEM error, to try again?
If it is a bug, may the two options below suitable to solve the issue?
use policy_rwlock to protect the "sidtab_shutdown" &
"sidtab_map" , load_policy is rarely to be called after
system bringing up,
so i think it will not impact much on the performance.
Use a temp list to store the sid in oldsidtab after it is
shutdown, and deal with it
after cloning the major sids from oldsidtab to newsidtab and
getting the policy_rwlock.
The current logic is intentional. One might argue that a
different error code might be more suitable in this situation
(e.g. EAGAIN or something), but ENOMEM is already a possible
error code when loading policy upon transient OOM conditions, so
the caller might have to retry regardless. The policy write lock
must only be held for the minimal critical section as otherwise
all system activity could be blocked; it is only held when making
the new policydb and sidtab active not during their allocation
and setup - ideally it would be further reduced to a couple of
pointer updates. Is this scenario something you are encountering
in actual practice or just a theoretically possible one?

Thank you very much for your reply.
Yes, we encounter this issue in our implementation.
We have an intertion in our design that we can divide the policy
into several parts to load separately.
So after the red-hat linux bringing up, which has loaded the
system policy db released by redhat, we begin to load the policy
released by our product ,
and meanwhile someone want to start a docker container, and the
runc get an ENOMEM error like i mentioned above.
Should we merge all the policy into the system policy db and just load once?
--
Best Regards
Li Kun
Typically you would install your policy module when your package is
installed and not need to do it at runtime.

Would you think it's suibable to change the error code to EAGAINïŒ

--
Best Regards
Li Kun

Stephen Smalley

2018-01-02 04:16:04 UTC

Permalink

The current logic is intentional. One might argue that a different error
code might be more suitable in this situation (e.g. EAGAIN or something),
but ENOMEM is already a possible error code when loading policy upon
transient OOM conditions, so the caller might have to retry regardless.
The policy write lock must only be held for the minimal critical section as
otherwise all system activity could be blocked; it is only held when making
the new policydb and sidtab active not during their allocation and setup -
ideally it would be further reduced to a couple of pointer updates. Is this
scenario something you are encountering in actual practice or just a
theoretically possible one?

Thank you very much for your reply.
Yes, we encounter this issue in our implementation.
We have an intertion in our design that we can divide the policy into
several parts to load separately.
So after the red-hat linux bringing up, which has loaded the system policy
db released by redhat, we begin to load the policy released by our product
,
and meanwhile someone want to start a docker container, and the runc get an
ENOMEM error like i mentioned above.
Should we merge all the policy into the system policy db and just load
once?
--
Best Regards
Li Kun

Typically you would install your policy module when your package is
installed and not need to do it at runtime.

Would you think it's suibable to change the error code to EAGAINïŒ

You likely want it to return - EINTR or -ERESTARTSYS if you want existing
callers of setprocattr to automatically restart the call instead of
failing.

Li Kun

2018-01-02 06:37:09 UTC

Permalink

Post by Li Kun

Post by Li Kun
Hi all,
When i start a docker container, the runc will call
selinux_setprocattr to set the exec_sid before start the
container.
Meanwhile if i use "semodule -i" to load a policy pp,
the old sidtab will be shutdown before switch to the new
sidtab, and cause
sidtab_context_to_sid failed with the errno -NOMEM.
time="2017-12-14T09:18:18.549862564+08:00" level=error
msg="Handler for GET
/v1.23/containers/192.168.0.2:9082/dfs_d:V100R013C11SPC000BCD2/json
<http://192.168.0.2:9082/dfs_d:V100R013C11SPC000BCD2/json>
192.168.0.2:9082/dfs_d:V100R013C11SPC000BCD2
<http://192.168.0.2:9082/dfs_d:V100R013C11SPC000BCD2>"
Dec 14 09:18:18 wuwenming_vudfua_0 kernel: [
security_context_to_sid(system_u:object_r:svirt_sandbox_file_t:s0:c290,c371)
failed for (dev overlay, type overlay) errno=-12
dbus[720]: avc: received policyload notice (seqno=2)
received policyload notice (seqno=2)
runc semodule
->selinux_setprocattr ->security_load_policy
->security_context_to_sid ->sidtab_shutdown(oldsidtab)
->read_lock(&policy_rwlock); ->sidtab_map(&oldsidtab,
clone_sid, &newsidtab);
->sidtab_context_to_sid
{
....
if (s->next_sid == UINT_MAX ||
s->shutdown) {
ret = -ENOMEM;
....
}
->read_unlock(&policy_rwlock);
->write_lock_irq(&policy_rwlock);
->"switch to new policydb"
->write_unlock_irq(&policy_rwlock);
I wonder that if this is the intention or a bug?
If it is the intention, what should the application do
when it get -ENOMEM error, to try again?
If it is a bug, may the two options below suitable to
solve the issue?
use policy_rwlock to protect the "sidtab_shutdown" &
"sidtab_map" , load_policy is rarely to be called after
system bringing up,
so i think it will not impact much on the performance.
Use a temp list to store the sid in oldsidtab after
it is shutdown, and deal with it
after cloning the major sids from oldsidtab to newsidtab
and getting the policy_rwlock.
The current logic is intentional. One might argue that a
different error code might be more suitable in this
situation (e.g. EAGAIN or something), but ENOMEM is already
a possible error code when loading policy upon transient OOM
conditions, so the caller might have to retry regardless.
The policy write lock must only be held for the minimal
critical section as otherwise all system activity could be
blocked; it is only held when making the new policydb and
sidtab active not during their allocation and setup -
ideally it would be further reduced to a couple of pointer
updates. Is this scenario something you are encountering in
actual practice or just a theoretically possible one?

Thank you very much for your reply.
Yes, we encounter this issue in our implementation.
We have an intertion in our design that we can divide the
policy into several parts to load separately.
So after the red-hat linux bringing up, which has loaded the
system policy db released by redhat, we begin to load the
policy released by our product ,
and meanwhile someone want to start a docker container, and
the runc get an ENOMEM error like i mentioned above.
Should we merge all the policy into the system policy db and
just load once?
--
Best Regards
Li Kun
Typically you would install your policy module when your package
is installed and not need to do it at runtime.

Would you think it's suibable to change the error code to EAGAINïŒ
You likely want it to return - EINTR or -ERESTARTSYS if you want
existing callers of setprocattr to automatically restart the call
instead of failing.

--
Best Regards
Li Kun

Stephen Smalley

2018-01-02 14:31:14 UTC

Permalink

Li Kun

2018-01-03 12:23:29 UTC

Permalink

Post by Li Kun

On Sat, Dec 23, 2017 at 3:03 AM, Li Kun
Hi all,
When i start a docker container, the runc will call
selinux_setprocattr to set the exec_sid before
start the container.
Meanwhile if i use "semodule -i" to load a policy
pp, the old sidtab will be shutdown before switch
to the new sidtab, and cause
sidtab_context_to_sid failed with the errno -NOMEM.
time="2017-12-14T09:18:18.549862564+08:00"
level=error msg="Handler for GET
/v1.23/containers/192.168.0.2:9082/dfs_d:V100R013C11SPC000BCD2/json
<http://192.168.0.2:9082/dfs_d:V100R013C11SPC000BCD2/json>
192.168.0.2:9082/dfs_d:V100R013C11SPC000BCD2
<http://192.168.0.2:9082/dfs_d:V100R013C11SPC000BCD2>"
Dec 14 09:18:18 wuwenming_vudfua_0 kernel: [
security_context_to_sid(system_u:object_r:svirt_sandbox_file_t:s0:c290,c371)
failed for (dev overlay, type overlay) errno=-12
dbus[720]: avc: received policyload notice (seqno=2)
received policyload notice (seqno=2)
runc semodule
->selinux_setprocattr ->security_load_policy
->security_context_to_sid ->sidtab_shutdown(oldsidtab)
->read_lock(&policy_rwlock);
->sidtab_map(&oldsidtab, clone_sid, &newsidtab);
->sidtab_context_to_sid
{
....
if (s->next_sid == UINT_MAX
|| s->shutdown) {
ret = -ENOMEM;
....
}
->read_unlock(&policy_rwlock);
->write_lock_irq(&policy_rwlock);
->"switch to new policydb"
->write_unlock_irq(&policy_rwlock);
I wonder that if this is the intention or a bug?
If it is the intention, what should the application
do when it get -ENOMEM error, to try again?
If it is a bug, may the two options below suitable
to solve the issue?
use policy_rwlock to protect the
"sidtab_shutdown" & "sidtab_map" , load_policy is
rarely to be called after system bringing up,
so i think it will not impact much on the performance.
Use a temp list to store the sid in oldsidtab
after it is shutdown, and deal with it
after cloning the major sids from oldsidtab to
newsidtab and getting the policy_rwlock.
The current logic is intentional. One might argue that
a different error code might be more suitable in this
situation (e.g. EAGAIN or something), but ENOMEM is
already a possible error code when loading policy upon
transient OOM conditions, so the caller might have to
retry regardless. The policy write lock must only be
held for the minimal critical section as otherwise all
system activity could be blocked; it is only held when
making the new policydb and sidtab active not during
their allocation and setup - ideally it would be
further reduced to a couple of pointer updates. Is this
scenario something you are encountering in actual
practice or just a theoretically possible one?

Thank you very much for your reply.
Yes, we encounter this issue in our implementation.
We have an intertion in our design that we can divide
the policy into several parts to load separately.
So after the red-hat linux bringing up, which has loaded
the system policy db released by redhat, we begin to
load the policy released by our product ,
and meanwhile someone want to start a docker container,
and the runc get an ENOMEM error like i mentioned above.
Should we merge all the policy into the system policy db
and just load once?
--
Best Regards
Li Kun
Typically you would install your policy module when your
package is installed and not need to do it at runtime.

AFAIU, the EINTR & ERESTARTSYS are releated to sigpending under
most circumstances, is it proper to use them here?
And as i mentioned above, most apps(like runc, bind)do not retry
when they meet ENOMEM, so i think maybe EAGAIN is more suitable
for the transient race conditions
EINTR is already handled by the libselinux procattr code, so returning
that will cause existing users of the libselinux interfaces to retry.
EAGAIN is not handled since the descriptor is not marked nonblocking.

Ok, i understand, thank you.
Maybe i can send a patch to change the errno to EINTR if you think it
make sense.

--
Best Regards
Li Kun

Stephen Smalley

2018-01-04 13:39:13 UTC

Permalink

Try testing such a patch to see if it resolves your issue. However, I think
if you switch to installing your policy module from your package post
scriptlet, you won't encounter this issue in the first place.

Post by Stephen Smalley

The current logic is intentional. One might argue that a different error
code might be more suitable in this situation (e.g. EAGAIN or something),
but ENOMEM is already a possible error code when loading policy upon
transient OOM conditions, so the caller might have to retry regardless.
The policy write lock must only be held for the minimal critical section as
otherwise all system activity could be blocked; it is only held when making
the new policydb and sidtab active not during their allocation and setup -
ideally it would be further reduced to a couple of pointer updates. Is this
scenario something you are encountering in actual practice or just a
theoretically possible one?
Thank you very much for your reply.
Yes, we encounter this issue in our implementation.
We have an intertion in our design that we can divide the policy into
several parts to load separately.
So after the red-hat linux bringing up, which has loaded the system policy
db released by redhat, we begin to load the policy released by our product
,
and meanwhile someone want to start a docker container, and the runc get
an ENOMEM error like i mentioned above.
Should we merge all the policy into the system policy db and just load once?
--
Best Regards
Li Kun
Typically you would install your policy module when your package is
installed and not need to do it at runtime.
Would you think it's suibable to change the error code to EAGAINïŒ
You likely want it to return - EINTR or -ERESTARTSYS if you want existing
callers of setprocattr to automatically restart the call instead of
failing.
AFAIU, the EINTR & ERESTARTSYS are releated to sigpending under most
circumstances, is it proper to use them here?
And as i mentioned above, most apps(like runc, bind)do not retry when they
meet ENOMEM, so i think maybe EAGAIN is more suitable for the transient
race conditions
EINTR is already handled by the libselinux procattr code, so returning
that will cause existing users of the libselinux interfaces to retry.
EAGAIN is not handled since the descriptor is not marked nonblocking.
Ok, i understand, thank you.
Maybe i can send a patch to change the errno to EINTR if you think it make
sense.
--
Best Regards
Li Kun