Technical details

This section gives the technical details behind what was discussed above. You don't need to understand them to use vservers and can skip this part if you want. If you were wondering about any of the things mentioned above in more detail, see below.

Chroot fixes

There are two ways to escape from a chroot environment. The first is to set up a block device special file and open it with a user-land file system browser. At that point you are bypassing both the chroot restriction and all file system access control. This does not work in a vserver because the special devices are not created in /dev and the CAP_MKNOD capability is disabled. Even root can't create special files in a vserver.

The second is changing the root while leaving the current directory behind. This trick exploits a flaw in the chroot() system call: the call changes the logical root directory of the calling process but does not change its current working directory, so the working directory is (generally) left outside the new root. The /usr/sbin/chroot command takes care of this flaw by changing the working directory, but the system call does not. If your process now has a working directory outside the new root, it is allowed to walk up to the real root with repeated chdir("..") system calls. The process more or less escapes from the kernel's radar.

The chroot() barrier

The chroot() barrier forbids any traditional chroot() escape. Exploits have shown that it is possible to make a modification that lets you break out of a chroot. In a vserver situation, this could be done by modifying the rights on the /vservers directory (cd / ; chmod +x ..; chattr -t ..), after which a single chroot(..) system call escapes from the chroot. The chroot() barrier stops chroot() from escaping any subtree by blocking those calls, whether direct or indirect.
To create the chroot barrier in the stable releases of the tools, you must change the mode of /vservers to 000 and add the +t flag. In the development releases this has been simplified to running setattr --barrier /vservers. The chroot() barrier only works on filesystems that support extended attributes. Ext2/3 support these attributes; reiserfs requires a special option at mount time (attrs) to honor the flags and make the barrier work. XFS supports the barrier as well, but filesystems such as ntfs, fat and vfat are definitely not supported. JFS support is unknown.

Unification

Unification is simply hardlinking (ln) a file from the reference server into the vserver. However, this carries a significant security risk. If a file in the reference server is hard linked from several vservers, then any modification to that file in one vserver changes its contents in all the other vservers as well as in the reference server. This is definitely not the desired behavior. You can prevent it by making the hardlinked (i.e. unified) files in the vservers immutable (+i) with chattr. Normally immutability would keep you from removing a file (or in this case, the hard link), but when the iunlink (+t) flag is set in addition to the immutable flag, you are allowed to remove the hardlink. With both +i and +t set, the file is considered to have immutable linkage inversion. These attributes are specific to the vserver kernel patch and to the vserver utilities that take advantage of them. Their behavior is not described in the chattr(1) man page, nor in most pages you might find online about extended filesystem attributes, except those directly related to vserver.
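The sharing that makes the immutable flag necessary is ordinary hard-link behavior and can be seen without any vserver tooling. The following is a minimal sketch, assuming a Linux shell with GNU stat. The directory names reference, vserver1 and vserver2 are illustrative stand-ins created in a temporary directory, not real vserver paths, and the sketch does not set +i or +t, since chattr normally needs root.

```shell
#!/bin/sh
# Sketch: why unified (hard-linked) files are shared between vservers.
# All directory names are illustrative; nothing outside a temp dir is touched.
set -eu

demo=$(mktemp -d)
mkdir "$demo/reference" "$demo/vserver1" "$demo/vserver2"

# One file in the "reference server", hard-linked into two "vservers".
echo "original" > "$demo/reference/motd"
ln "$demo/reference/motd" "$demo/vserver1/motd"
ln "$demo/reference/motd" "$demo/vserver2/motd"

# Appending through one name modifies the single shared inode...
echo "changed inside vserver1" >> "$demo/vserver1/motd"

# ...so the change is visible through every other name.
seen_in_reference=$(cat "$demo/reference/motd")
seen_in_vserver2=$(cat "$demo/vserver2/motd")
link_count=$(stat -c %h "$demo/reference/motd")

echo "reference now contains: $seen_in_reference"
echo "vserver2 now contains:  $seen_in_vserver2"
echo "names pointing at the inode: $link_count"

rm -rf "$demo"
```

Without the immutable flag, a write from any one of the three names reaches all of them, which is exactly the risk described above.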
The +i and +t attributes keep you from modifying the contents of the file, but they still allow you to remove the file and even link to it. You cannot append to one of these files either (i.e. cat >> /etc/passwd will not be allowed). This can be confusing, because some programs seem to modify the contents of the file when they actually don't. For example, say /etc/passwd has +i and +t set and you run useradd to add a new user. It will appear as if /etc/passwd was modified to add the new user, which isn't supposed to be possible. What actually happens is that /etc/passwd is copied to a temporary file, the temporary file is modified, and then /etc/passwd is removed and replaced with the temporary file. This is possible because you can remove (i.e. unlink) a +i +t file. What you have now is an /etc/passwd with a different inode and without the +i and +t attributes, and you can append to it. The same behavior can be seen with some text editors.

Question: can't a user just chattr -i a file that has it?
Answer: it is not possible to perform chattr operations in a vserver.

/proc security

There is a locking issue related to /proc that allows entries in the /proc filesystem to be seen and modified from within vservers. Enabling /proc security protects these sensitive entries against unwanted access. By default all proc entries are visible, and therefore accessible for reading and writing, in ALL vservers. The only restriction is the pre-existing Linux capability system, meaning that if you have root in a particular vserver, you have access to them. Visibility of specific proc entries is limited with the vproc tool. It is important to disable dangerous /proc entries that the vserver does not require (such as hardware interfaces: ide, bus, pci, scsi, etc. or kernel interfaces: kmem, iomem, ioports, sys, etc.).
Here is an example of how to do this, using the entry meminfo:

    vproc /proc/meminfo       (shows current visibility)
    vproc -d /proc/meminfo    (hide in user context)
    vproc -D /proc/meminfo    (hide in any context)
    vproc -E /proc/meminfo    (show only in ctx one)
    vproc -e /proc/meminfo    (default: visible)

note: symbolic links and dynamically generated entries like /proc/

www.linux-vserver.org/index.php?page=Proc-Security

The new system consists of three flags: admin, watch and hide. To view or set them, you need a recent version of the util-vserver package, which contains setattr and showattr. To make the /proc/loadavg entry visible everywhere, unset the hide flag:

    setattr --~hide /proc/loadavg

The admin and watch flags can be in any state, since hiding is controlled by the hide flag alone. To keep vservers from reading interrupt information, you need to hide the /proc/interrupts entry. If you want it to be accessible in the admin context (0) only, use:

    setattr --hide --admin --~watch /proc/interrupts

See the above URL for specifics on this method and for the cross-reference table of the flags and how they work together.
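The setattr invocation for /proc/interrupts can be repeated for several entries in one pass. The following is a minimal dry-run sketch, assuming the flag combination --hide --admin --~watch quoted in the text. The helper name hide_cmd and the entry list are hypothetical, and the script only prints the commands rather than running setattr, so the output can be reviewed before anything is applied.

```shell
#!/bin/sh
# Dry-run sketch: print setattr commands that would restrict a list of
# /proc entries to the admin context. Nothing is changed on the system.
set -eu

# hide_cmd is a hypothetical helper: it prints (does not run) the setattr
# command from the text for a single /proc entry.
hide_cmd() {
    printf 'setattr --hide --admin --~watch %s\n' "$1"
}

# Assumption: illustrative list only; pick entries your vservers do not need.
entries="/proc/interrupts /proc/ioports /proc/iomem"

# Collect one command per entry, one per line.
plan=""
for entry in $entries; do
    plan="${plan}$(hide_cmd "$entry")
"
done

printf '%s' "$plan"
```

Printing the plan first is a deliberate choice: hiding an entry that a vserver actually needs can break software inside it, so it is safer to inspect the list before running it.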
