aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
...
| * Purge existing TLB entries in set_pte_at and ptep_set_wrprotectJohn David Anglin2013-02-282-3/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 7139bc1579901b53db7e898789e916ee2fb52d78 upstream. This patch goes a long way toward fixing the minifail bug, and it  significantly improves the stability of SMP machines such as the rp3440.  When write  protecting a page for COW, we need to purge the existing translation.  Otherwise, the COW break doesn't occur as expected because the TLB may still have a stale entry which allows writes. [jejb: fix up checkpatch errors] Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * powerpc/kexec: Disable hard IRQ before kexecPhileas Fogg2013-02-281-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | commit 8520e443aa56cc157b015205ea53e7b9fc831291 upstream. Disable hard IRQ before kexec a new kernel image. Not doing it can result in corrupted data in the memory segments reserved for the new kernel. Signed-off-by: Phileas Fogg <phileas-fogg@mail.ru> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * ARM: PXA3xx: program the CSMSADRCFG registerIgor Grinberg2013-02-282-1/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit d107a204154ddd79339203c2deeb7433f0cf6777 upstream. The Chip Select Configuration Register must be programmed to 0x2 in order to achieve the correct behavior of the Static Memory Controller. Without this patch devices wired to DFI and accessed through SMC cannot be accessed after resume from S2. Do not rely on the boot loader to program the CSMSADRCFG register by programming it in the kernel smemc module. Signed-off-by: Igor Grinberg <grinberg@compulab.co.il> Acked-by: Eric Miao <eric.y.miao@gmail.com> Signed-off-by: Haojian Zhuang <haojian.zhuang@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * staging: vt6656: Fix URB submitted while active warning.Malcolm Priestley2013-02-281-8/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit ae5943de8c8c4438cbac5cda599ff0b88c224468 upstream. This error happens because PIPEnsControlOut and PIPEnsControlIn unlock the spin lock for delay, letting in another thread. The patch moves the current MP_SET_FLAG to before filling of sUsbCtlRequest for pControlURB and clears it in event of failing. Any thread calling either function while fMP_CONTROL_READS or fMP_CONTROL_WRITES flags set will return STATUS_FAILURE. Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * staging: comedi: disallow COMEDI_DEVCONFIG on non-board minorsIan Abbott2013-02-281-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 754ab5c0e55dd118273ca2c217c4d95e9fbc8259 upstream. Comedi has two sorts of minor devices: (a) normal board minor devices in the range 0 to COMEDI_NUM_BOARD_MINORS-1 inclusive; and (b) special subdevice minor devices in the range COMEDI_NUM_BOARD_MINORS upwards that are used to open the same underlying comedi device as the normal board minor devices, but with non-default read and write subdevices for asynchronous commands. The special subdevice minor devices get created when a board supporting asynchronous commands is attached to a normal board minor device, and destroyed when the board is detached from the normal board minor device. One way to attach or detach a board is by using the COMEDI_DEVCONFIG ioctl. This should only be used on normal board minors as the special subdevice minors are too ephemeral. In particular, the change introduced in commit 7d3135af399e92cf4c9bbc5f86b6c140aab3b88c ("staging: comedi: prevent auto-unconfig of manually configured devices") breaks horribly for special subdevice minor devices. Since there's no legitimate use for the COMEDI_DEVCONFIG ioctl on a special subdevice minor device node, disallow it and return -ENOTTY. Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * drm/i915: disable shared panel fitter for pipeMika Kuoppala2013-02-281-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 24a1f16de97c4cf0029d9acd04be06db32208726 upstream. If encoder is switched off by BIOS, but the panel fitter is left on, we never try to turn off the panel fitter and leave it still attached to the pipe - which can cause blurry output elsewhere. Based on work by Chris Wilson <chris@chris-wilson.co.uk> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58867 Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Tested-by: Andreas Sturmlechner <andreas.sturmlechner@gmail.com> [danvet: Remove the redundant HAS_PCH_SPLIT check and add a tiny comment.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * NLS: improve UTF8 -> UTF16 string conversion routineAlan Stern2013-02-284-17/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 0720a06a7518c9d0c0125bd5d1f3b6264c55c3dd upstream. The utf8s_to_utf16s conversion routine needs to be improved. Unlike its utf16s_to_utf8s sibling, it doesn't accept arguments specifying the maximum length of the output buffer or the endianness of its 16-bit output. This patch (as1501) adds the two missing arguments, and adjusts the only two places in the kernel where the function is called. A follow-on patch will add a third caller that does utilize the new capabilities. The two conversion routines are still annoyingly inconsistent in the way they handle invalid byte combinations. But that's a subject for a different patch. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> CC: Clemens Ladisch <clemens@ladisch.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * drm/usb: bind driver to correct deviceDave Airlie2013-02-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | commit 9f23de52b64f7fb801fd76f3dd8651a0dc89187b upstream. While looking at plymouth on udl I noticed that plymouth was trying to use its fb plugin not its drm one, it was trying to drmOpen a driver called usb not udl, noticed that we actually had out driver pointing at the wrong device. Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * sunvdc: Fix off-by-one in generic_request().David S. Miller2013-02-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | [ Upstream commit f4d9605434c0fd4cc8639bf25cfc043418c52362 ] The 'operations' bitmap corresponds one-for-one with the operation codes, no adjustment is necessary. Reported-by: Mark Kettenis <mark.kettenis@xs4all.nl> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * ext4: add missing kfree() on error return path in add_new_gdb()Dan Carpenter2013-02-281-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit c49bafa3842751b8955a962859f42d307673d75d upstream. We added some more error handling in b40971426a "ext4: add error checking to calls to ext4_handle_dirty_metadata()". But we need to call kfree() as well to avoid a memory leak. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * ext4: Free resources in some error path in ext4_fill_superTao Ma2013-02-281-8/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | commit dcf2d804ed6ffe5e942b909ed5e5b74628be6ee4 upstream. Some of the error path in ext4_fill_super don't release the resouces properly. So this patch just try to release them in the right way. Signed-off-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * ALSA: usb: Fix Processing Unit Descriptor parsersPawel Moll2013-02-281-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit b531f81b0d70ffbe8d70500512483227cc532608 upstream. Commit 99fc86450c439039d2ef88d06b222fd51a779176 "ALSA: usb-mixer: parse descriptors with structs" introduced a set of useful parsers for descriptors. Unfortunately the parses for the Processing Unit Descriptor came with a very subtle bug... Functions uac_processing_unit_iProcessing() and uac_processing_unit_specific() were indexing the baSourceID array forgetting the fields before the iProcessing and process-specific descriptors. The problem was observed with Sound Blaster Extigy mixer, where nNrModes in Up/Down-mix Processing Unit Descriptor was accessed at offset 10 of the descriptor (value 0) instead of offset 15 (value 7). In result the resulting control had interesting limit values: Simple mixer control 'Channel Routing Mode Select',0 Capabilities: volume volume-joined penum Playback channels: Mono Capture channels: Mono Limits: 0 - -1 Mono: -1 [100%] Fixed by starting from the bmControls, which was calculated correctly, instead of baSourceID. Now the mentioned control is fine: Simple mixer control 'Channel Routing Mode Select',0 Capabilities: volume volume-joined penum Playback channels: Mono Capture channels: Mono Limits: 0 - 6 Mono: 0 [0%] Signed-off-by: Pawel Moll <mail@pawelmoll.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * ALSA: usb-audio: fix Roland A-PRO supportClemens Ladisch2013-02-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | commit 7da58046482fceb17c4a0d4afefd9507ec56de7f upstream. The quirk for the Roland/Cakewalk A-PRO keyboards accidentally used the wrong interface number, which prevented the driver from attaching to the device. Signed-off-by: Clemens Ladisch <clemens@ladisch.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * p54usb: corrected USB ID for T-Com Sinus 154 data IITomasz Guszkowski2013-02-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 008e33f733ca51acb2dd9d88ea878693b04d1d2a upstream. Corrected USB ID for T-Com Sinus 154 data II. ISL3887-based. The device was tested in managed mode with no security, WEP 128 bit and WPA-PSK (TKIP) with firmware 2.13.1.0.lm87.arm (md5sum: 7d676323ac60d6e1a3b6d61e8c528248). It works. Signed-off-by: Tomasz Guszkowski <tsg@o2.pl> Acked-By: Christian Lamparter <chunkeey@googlemail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * NLM: Ensure that we resend all pending blocking locks after a reclaimTrond Myklebust2013-02-281-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 666b3d803a511fbc9bc5e5ea8ce66010cf03ea13 upstream. Currently, nlmclnt_lock will break out of the for(;;) loop when the reclaimer wakes up the blocking lock thread by setting nlm_lck_denied_grace_period. This causes the lock request to fail with an ENOLCK error. The intention was always to ensure that we resend the lock request after the grace period has expired. Reported-by: Wangyuan Zhang <Wangyuan.Zhang@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * mm/fadvise.c: drain all pagevecs if POSIX_FADV_DONTNEED fails to discard all ↵Mel Gorman2013-02-281-2/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pages commit 67d46b296a1ba1477c0df8ff3bc5e0167a0b0732 upstream. Rob van der Heij reported the following (paraphrased) on private mail. The scenario is that I want to avoid backups to fill up the page cache and purge stuff that is more likely to be used again (this is with s390x Linux on z/VM, so I don't give it as much memory that we don't care anymore). So I have something with LD_PRELOAD that intercepts the close() call (from tar, in this case) and issues a posix_fadvise() just before closing the file. This mostly works, except for small files (less than 14 pages) that remains in page cache after the face. Unfortunately Rob has not had a chance to test this exact patch but the test program below should be reproducing the problem he described. The issue is the per-cpu pagevecs for LRU additions. If the pages are added by one CPU but fadvise() is called on another then the pages remain resident as the invalidate_mapping_pages() only drains the local pagevecs via its call to pagevec_release(). The user-visible effect is that a program that uses fadvise() properly is not obeyed. A possible fix for this is to put the necessary smarts into invalidate_mapping_pages() to globally drain the LRU pagevecs if a pagevec page could not be discarded. The downside with this is that an inode cache shrink would send a global IPI and memory pressure potentially causing global IPI storms is very undesirable. Instead, this patch adds a check during fadvise(POSIX_FADV_DONTNEED) to check if invalidate_mapping_pages() discarded all the requested pages. If a subset of pages are discarded it drains the LRU pagevecs and tries again. If the second attempt fails, it assumes it is due to the pages being mapped, locked or dirty and does not care. With this patch, an application using fadvise() correctly will be obeyed but there is a downside that a malicious application can force the kernel to send global IPIs and increase overhead. If accepted, I would like this to be considered as a -stable candidate. It's not an urgent issue but it's a system call that is not working as advertised which is weak. The following test program demonstrates the problem. It should never report that pages are still resident but will without this patch. It assumes that CPU 0 and 1 exist. int main() { int fd; int pagesize = getpagesize(); ssize_t written = 0, expected; char *buf; unsigned char *vec; int resident, i; cpu_set_t set; /* Prepare a buffer for writing */ expected = FILESIZE_PAGES * pagesize; buf = malloc(expected + 1); if (buf == NULL) { printf("ENOMEM\n"); exit(EXIT_FAILURE); } buf[expected] = 0; memset(buf, 'a', expected); /* Prepare the mincore vec */ vec = malloc(FILESIZE_PAGES); if (vec == NULL) { printf("ENOMEM\n"); exit(EXIT_FAILURE); } /* Bind ourselves to CPU 0 */ CPU_ZERO(&set); CPU_SET(0, &set); if (sched_setaffinity(getpid(), sizeof(set), &set) == -1) { perror("sched_setaffinity"); exit(EXIT_FAILURE); } /* open file, unlink and write buffer */ fd = open("fadvise-test-file", O_CREAT|O_EXCL|O_RDWR); if (fd == -1) { perror("open"); exit(EXIT_FAILURE); } unlink("fadvise-test-file"); while (written < expected) { ssize_t this_write; this_write = write(fd, buf + written, expected - written); if (this_write == -1) { perror("write"); exit(EXIT_FAILURE); } written += this_write; } free(buf); /* * Force ourselves to another CPU. If fadvise only flushes the local * CPUs pagevecs then the fadvise will fail to discard all file pages */ CPU_ZERO(&set); CPU_SET(1, &set); if (sched_setaffinity(getpid(), sizeof(set), &set) == -1) { perror("sched_setaffinity"); exit(EXIT_FAILURE); } /* sync and fadvise to discard the page cache */ fsync(fd); if (posix_fadvise(fd, 0, expected, POSIX_FADV_DONTNEED) == -1) { perror("posix_fadvise"); exit(EXIT_FAILURE); } /* map the file and use mincore to see which parts of it are resident */ buf = mmap(NULL, expected, PROT_READ, MAP_SHARED, fd, 0); if (buf == NULL) { perror("mmap"); exit(EXIT_FAILURE); } if (mincore(buf, expected, vec) == -1) { perror("mincore"); exit(EXIT_FAILURE); } /* Check residency */ for (i = 0, resident = 0; i < FILESIZE_PAGES; i++) { if (vec[i]) resident++; } if (resident != 0) { printf("Nr unexpected pages resident: %d\n", resident); exit(EXIT_FAILURE); } munmap(buf, expected); close(fd); free(vec); exit(EXIT_SUCCESS); } Signed-off-by: Mel Gorman <mgorman@suse.de> Reported-by: Rob van der Heij <rvdheij@gmail.com> Tested-by: Rob van der Heij <rvdheij@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * tmpfs: fix use-after-free of mempolicy objectGreg Thelen2013-02-281-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 5f00110f7273f9ff04ac69a5f85bb535a4fd0987 upstream. The tmpfs remount logic preserves filesystem mempolicy if the mpol=M option is not specified in the remount request. A new policy can be specified if mpol=M is given. Before this patch remounting an mpol bound tmpfs without specifying mpol= mount option in the remount request would set the filesystem's mempolicy object to a freed mempolicy object. To reproduce the problem boot a DEBUG_PAGEALLOC kernel and run: # mkdir /tmp/x # mount -t tmpfs -o size=100M,mpol=interleave nodev /tmp/x # grep /tmp/x /proc/mounts nodev /tmp/x tmpfs rw,relatime,size=102400k,mpol=interleave:0-3 0 0 # mount -o remount,size=200M nodev /tmp/x # grep /tmp/x /proc/mounts nodev /tmp/x tmpfs rw,relatime,size=204800k,mpol=??? 0 0 # note ? garbage in mpol=... output above # dd if=/dev/zero of=/tmp/x/f count=1 # panic here Panic: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [< (null)>] (null) [...] Oops: 0010 [#1] SMP DEBUG_PAGEALLOC Call Trace: mpol_shared_policy_init+0xa5/0x160 shmem_get_inode+0x209/0x270 shmem_mknod+0x3e/0xf0 shmem_create+0x18/0x20 vfs_create+0xb5/0x130 do_last+0x9a1/0xea0 path_openat+0xb3/0x4d0 do_filp_open+0x42/0xa0 do_sys_open+0xfe/0x1e0 compat_sys_open+0x1b/0x20 cstar_dispatch+0x7/0x1f Non-debug kernels will not crash immediately because referencing the dangling mpol will not cause a fault. Instead the filesystem will reference a freed mempolicy object, which will cause unpredictable behavior. The problem boils down to a dropped mpol reference below if shmem_parse_options() does not allocate a new mpol: config = *sbinfo shmem_parse_options(data, &config, true) mpol_put(sbinfo->mpol) sbinfo->mpol = config.mpol /* BUG: saves unreferenced mpol */ This patch avoids the crash by not releasing the mempolicy if shmem_parse_options() doesn't create a new mpol. How far back does this issue go? I see it in both 2.6.36 and 3.3. I did not look back further. Signed-off-by: Greg Thelen <gthelen@google.com> Acked-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * drivers/video/backlight/adp88?0_bl.c: fix resumeLars-Peter Clausen2013-02-282-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 5eb02c01bd1f3ef195989ab05e835e2b0711b5a9 upstream. Clearing the NSTBY bit in the control register also automatically clears the BLEN bit. So we need to make sure to set it again during resume, otherwise the backlight will stay off. Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Acked-by: Michael Hennerich <michael.hennerich@analog.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * ocfs2: unlock super lock if lockres refresh failedJunxiao Bi2013-02-281-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 3278bb748d2437eb1464765f36429e5d6aa91c38 upstream. If lockres refresh failed, the super lock will never be released which will cause some processes on other cluster nodes hung forever. Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * inotify: remove broken mask checks causing unmount to be EINVALJim Somerville2013-02-281-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 676a0675cf9200ac047fb50825f80867b3bb733b upstream. Running the command: inotifywait -e unmount /mnt/disk immediately aborts with a -EINVAL return code. This is however a valid parameter. This abort occurs only if unmount is the sole event parameter. If other event parameters are supplied, then the unmount event wait will work. The problem was introduced by commit 44b350fc23e ("inotify: Fix mask checks"). In that commit, it states: The mask checks in inotify_update_existing_watch() and inotify_new_watch() are useless because inotify_arg_to_mask() sets FS_IN_IGNORED and FS_EVENT_ON_CHILD bits anyway. But instead of removing the useless checks, it did this: mask = inotify_arg_to_mask(arg); - if (unlikely(!mask)) + if (unlikely(!(mask & IN_ALL_EVENTS))) return -EINVAL; The problem is that IN_ALL_EVENTS doesn't include IN_UNMOUNT, and other parts of the code keep IN_UNMOUNT separate from IN_ALL_EVENTS. So the check should be: if (unlikely(!(mask & (IN_ALL_EVENTS | IN_UNMOUNT)))) But inotify_arg_to_mask(arg) always sets the IN_UNMOUNT bit in the mask anyway, so the check is always going to pass and thus should simply be removed. Also note that inotify_arg_to_mask completely controls what mask bits get set from arg, there's no way for invalid bits to get enabled there. Lets fix it by simply removing the useless broken checks. Signed-off-by: Jim Somerville <Jim.Somerville@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: John McCutchan <john@johnmccutchan.com> Cc: Robert Love <rlove@rlove.org> Cc: Eric Paris <eparis@parisplace.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * s390/kvm: Fix store status for ACRS/FPRSChristian Borntraeger2013-02-281-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 15bc8d8457875f495c59d933b05770ba88d1eacb upstream. On store status we need to copy the current state of registers into a save area. Currently we might save stale versions: The sie state descriptor doesnt have fields for guest ACRS,FPRS, those registers are simply stored in the host registers. The host program must copy these away if needed. We do that in vcpu_put/load. If we now do a store status in KVM code between vcpu_put/load, the saved values are not up-to-date. Lets collect the ACRS/FPRS before saving them. This also fixes some strange problems with hotplug and virtio-ccw, since the low level machine check handler (on hotplug a machine check will happen) will revalidate all registers with the content of the save area. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * KVM: s390: Handle hosts not supporting s390-virtio.Cornelia Huck2013-02-281-8/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 55c171a6d90dc0574021f9c836127cfd1a7d2e30 upstream. Running under a kvm host does not necessarily imply the presence of a page mapped above the main memory with the virtio information; however, the code includes a hard coded access to that page. Instead, check for the presence of the page and exit gracefully before we hit an addressing exception if it does not exist. Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Reviewed-by: Alexander Graf <agraf@suse.de> Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * mmu_notifier_unregister NULL Pointer deref and multiple ->release() calloutsRobin Holt2013-02-281-40/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 751efd8610d3d7d67b7bdf7f62646edea7365dd7 upstream. There is a race condition between mmu_notifier_unregister() and __mmu_notifier_release(). Assume two tasks, one calling mmu_notifier_unregister() as a result of a filp_close() ->flush() callout (task A), and the other calling mmu_notifier_release() from an mmput() (task B). A B t1 srcu_read_lock() t2 if (!hlist_unhashed()) t3 srcu_read_unlock() t4 srcu_read_lock() t5 hlist_del_init_rcu() t6 synchronize_srcu() t7 srcu_read_unlock() t8 hlist_del_rcu() <--- NULL pointer deref. Additionally, the list traversal in __mmu_notifier_release() is not protected by the by the mmu_notifier_mm->hlist_lock which can result in callouts to the ->release() notifier from both mmu_notifier_unregister() and __mmu_notifier_release(). -stable suggestions: The stable trees prior to 3.7.y need commits 21a92735f660 and 70400303ce0c cherry-picked in that order prior to cherry-picking this commit. The 3.7.y tree already has those two commits. Signed-off-by: Robin Holt <holt@sgi.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Cc: Avi Kivity <avi@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Sagi Grimberg <sagig@mellanox.co.il> Cc: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * mm: mmu_notifier: make the mmu_notifier srcu staticAndrea Arcangeli2013-02-281-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 70400303ce0c4ced3139499c676d5c79636b0c72 upstream. The variable must be static especially given the variable name. s/RCU/SRCU/ over a few comments. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * mm: mmu_notifier: have mmu_notifiers use a global SRCU so they may safely ↵Sagi Grimberg2013-02-282-25/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | schedule commit 21a92735f660eaecf69a6f2e777f18463760ec32 upstream. With an RCU based mmu_notifier implementation, any callout to mmu_notifier_invalidate_range_{start,end}() or mmu_notifier_invalidate_page() would not be allowed to call schedule() as that could potentially allow a modification to the mmu_notifier structure while it is currently being used. Since srcu allocs 4 machine words per instance per cpu, we may end up with memory exhaustion if we use srcu per mm. So all mms share a global srcu. Note that during large mmu_notifier activity exit & unregister paths might hang for longer periods, but it is tolerable for current mmu_notifier clients. Signed-off-by: Sagi Grimberg <sagig@mellanox.co.il> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Haggai Eran <haggaie@mellanox.com> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * Driver core: treat unregistered bus_types as having no devicesBjorn Helgaas2013-02-281-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 4fa3e78be7e985ca814ce2aa0c09cbee404efcf7 upstream. A bus_type has a list of devices (klist_devices), but the list and the subsys_private structure that contains it are not initialized until the bus_type is registered with bus_register(). The panic/reboot path has fixups that look up devices in pci_bus_type. If we panic before registering pci_bus_type, the bus_type exists but the list does not, so mach_reboot_fixups() trips over a null pointer and panics again: mach_reboot_fixups pci_get_device .. bus_find_device(&pci_bus_type, ...) bus->p is NULL Joonsoo reported a problem when panicking before PCI was initialized. I think this patch should be sufficient to replace the patch he posted here: https://lkml.org/lkml/2012/12/28/75 ("[PATCH] x86, reboot: skip reboot_fixups in early boot phase") Reported-by: Joonsoo Kim <js1304@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * xen: Send spinlock IPI to all waitersStefan Bader2013-02-281-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 76eaca031f0af2bb303e405986f637811956a422 upstream. There is a loophole between Xen's current implementation of pv-spinlocks and the scheduler. This was triggerable through a testcase until v3.6 changed the TLB flushing code. The problem potentially is still there just not observable in the same way. What could happen was (is): 1. CPU n tries to schedule task x away and goes into a slow wait for the runq lock of CPU n-# (must be one with a lower number). 2. CPU n-#, while processing softirqs, tries to balance domains and goes into a slow wait for its own runq lock (for updating some records). Since this is a spin_lock_irqsave in softirq context, interrupts will be re-enabled for the duration of the poll_irq hypercall used by Xen. 3. Before the runq lock of CPU n-# is unlocked, CPU n-1 receives an interrupt (e.g. endio) and when processing the interrupt, tries to wake up task x. But that is in schedule and still on_cpu, so try_to_wake_up goes into a tight loop. 4. The runq lock of CPU n-# gets unlocked, but the message only gets sent to the first waiter, which is CPU n-# and that is busily stuck. 5. CPU n-# never returns from the nested interruption to take and release the lock because the scheduler uses a busy wait. And CPU n never finishes the task migration because the unlock notification only went to CPU n-#. To avoid this and since the unlocking code has no real sense of which waiter is best suited to grab the lock, just send the IPI to all of them. This causes the waiters to return from the hyper- call (those not interrupted at least) and do active spinlocking. BugLink: http://bugs.launchpad.net/bugs/1011792 Acked-by: Jan Beulich <JBeulich@suse.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * xen/netback: check correct frag when looking for head fragIan Campbell2013-02-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | When I backported 7d5145d8eb2b "xen/netback: don't leak pages on failure in xen_netbk_tx_check_gop" to 3.0 (where it became f0457844e605) I somehow picked up an extraneous hunk breaking this. Reported-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com> Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * tty: set_termios/set_termiox should not return -EINTROleg Nesterov2013-02-281-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 183d95cdd834381c594d3aa801c1f9f9c0c54fa9 upstream. See https://bugzilla.redhat.com/show_bug.cgi?id=904907 read command causes bash to abort with double free or corruption (out). A simple test-case from Roman: // Compile the reproducer and send sigchld ti that process. // EINTR occurs even if SA_RESTART flag is set. void handler(int sig) { } main() { struct sigaction act; act.sa_handler = handler; act.sa_flags = SA_RESTART; sigaction (SIGCHLD, &act, 0); struct termio ttp; ioctl(0, TCGETA, &ttp); while(1) { if (ioctl(0, TCSETAW, ttp) < 0) { if (errno == EINTR) { fprintf(stderr, "BUG!"); return(1); } } } } Change set_termios/set_termiox to return -ERESTARTSYS to fix this particular problem. I didn't dare to change other EINTR's in drivers/tty/, but they look equally wrong. Reported-by: Roman Rakus <rrakus@redhat.com> Reported-by: Lingzhu Xiang <lxiang@redhat.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * ALSA: rme32.c irq enabling after spin_lock_irqDenis Efremov2013-02-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit f49a59c4471d81a233e09dda45187cc44fda009d upstream. According to the other code in this driver and similar code in rme96 it seems, that spin_lock_irq in snd_rme32_capture_close function should be paired with spin_unlock_irq. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Denis Efremov <yefremov.denis@gmail.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * ALSA: ali5451: remove irq enabling in pointer callbackDenis Efremov2013-02-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit dacae5a19b4cbe1b5e3a86de23ea74cbe9ec9652 upstream. snd_ali_pointer function is called with local interrupts disabled. However it seems very strange to reenable them in such way. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Denis Efremov <yefremov.denis@gmail.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * hrtimer: Prevent hrtimer_enqueue_reprogram raceLeonid Shatz2013-02-281-18/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit b22affe0aef429d657bc6505aacb1c569340ddd2 upstream. hrtimer_enqueue_reprogram contains a race which could result in timer.base switch during unlock/lock sequence. hrtimer_enqueue_reprogram is releasing the lock protecting the timer base for calling raise_softirq_irqsoff() due to a lock ordering issue versus rq->lock. If during that time another CPU calls __hrtimer_start_range_ns() on the same hrtimer, the timer base might switch, before the current CPU can lock base->lock again and therefor the unlock_timer_base() call will unlock the wrong lock. [ tglx: Added comment and massaged changelog ] Signed-off-by: Leonid Shatz <leonid.shatz@ravellosystems.com> Signed-off-by: Izik Eidus <izik.eidus@ravellosystems.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Link: http://lkml.kernel.org/r/1359981217-389-1-git-send-email-izik.eidus@ravellosystems.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * posix-cpu-timers: Fix nanosleep task_struct leakStanislaw Gruszka2013-02-281-2/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit e6c42c295e071dd74a66b5a9fcf4f44049888ed8 upstream. The trinity fuzzer triggered a task_struct reference leak via clock_nanosleep with CPU_TIMERs. do_cpu_nanosleep() calls posic_cpu_timer_create(), but misses a corresponding posix_cpu_timer_del() which leads to the task_struct reference leak. Reported-and-tested-by: Tommi Rantala <tt.rantala@gmail.com> Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Dave Jones <davej@redhat.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Oleg Nesterov <oleg@redhat.com> Link: http://lkml.kernel.org/r/20130215100810.GF4392@redhat.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * genirq: Avoid deadlock in spurious handlingThomas Gleixner2013-02-281-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit e716efde75267eab919cdb2bef5b2cb77f305326 upstream. commit 52553ddf(genirq: fix regression in irqfixup, irqpoll) introduced a potential deadlock by calling the action handler with the irq descriptor lock held. Remove the call and let the handling code run even for an interrupt where only a single action is registered. That matches the goal of the above commit and avoids the deadlock. Document the confusing action = desc->action reload in the handling loop while at it. Reported-and-tested-by: "Wang, Warner" <warner.wang@hp.com> Tested-by: Edward Donovan <edward.donovan@numble.net> Cc: "Wang, Song-Bo (Stoney)" <song-bo.wang@hp.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * timeconst.pl: Eliminate Perl warningH. Peter Anvin2013-02-281-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 63a3f603413ffe82ad775f2d62a5afff87fd94a0 upstream. defined(@array) is deprecated in Perl and gives off a warning. Restructure the code to remove that warning. [ hpa: it would be interesting to revert to the timeconst.bc script. It appears that the failures reported by akpm during testing of that script was due to a known broken version of make, not a problem with bc. The Makefile rules could probably be restructured to avoid the make bug, or it is probably old enough that it doesn't matter. ] Reported-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * mm: fix pageblock bitmap allocationLinus Torvalds2013-02-281-6/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 7c45512df987c5619db041b5c9b80d281e26d3db upstream. Commit c060f943d092 ("mm: use aligned zone start for pfn_to_bitidx calculation") fixed out calculation of the index into the pageblock bitmap when a !SPARSEMEM zome was not aligned to pageblock_nr_pages. However, the _allocation_ of that bitmap had never taken this alignment requirement into accout, so depending on the exact size and alignment of the zone, the use of that index could then access past the allocation, resulting in some very subtle memory corruption. This was reported (and bisected) by Ingo Molnar: one of his random config builds would hang with certain very specific kernel command line options. In the meantime, commit c060f943d092 has been marked for stable, so this fix needs to be back-ported to the stable kernels that backported the commit to use the right alignment. Bisected-and-tested-by: Ingo Molnar <mingo@kernel.org> Acked-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * x86-32, mm: Remove reference to resume_map_numa_kva()H. Peter Anvin2013-02-282-8/+0
| | | | | | | | | | | | | | | | | | | | | | | | commit bb112aec5ee41427e9b9726e3d57b896709598ed upstream. Remove reference to removed function resume_map_numa_kva(). Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20130131005616.1C79F411@kernel.stglabs.ibm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * 3.0.66Greg Kroah-Hartman2013-02-211-1/+1
| |
| * printk: fix buffer overflow when calling log_prefix function from ↵Alexandre SIMON2013-02-212-1/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | call_console_drivers This patch corrects a buffer overflow in kernels from 3.0 to 3.4 when calling log_prefix() function from call_console_drivers(). This bug existed in previous releases but has been revealed with commit 162a7e7500f9664636e649ba59defe541b7c2c60 (2.6.39 => 3.0) that made changes about how to allocate memory for early printk buffer (use of memblock_alloc). It disappears with commit 7ff9554bb578ba02166071d2d487b7fc7d860d62 (3.4 => 3.5) that does a refactoring of printk buffer management. In log_prefix(), the access to "p[0]", "p[1]", "p[2]" or "simple_strtoul(&p[1], &endp, 10)" may cause a buffer overflow as this function is called from call_console_drivers by passing "&LOG_BUF(cur_index)" where the index must be masked to do not exceed the buffer's boundary. The trick is to prepare in call_console_drivers() a buffer with the necessary data (PRI field of syslog message) to be safely evaluated in log_prefix(). This patch can be applied to stable kernel branches 3.0.y, 3.2.y and 3.4.y. Without this patch, one can freeze a server running this loop from shell : $ export DUMMY=`cat /dev/urandom | tr -dc '12345AZERTYUIOPQSDFGHJKLMWXCVBNazertyuiopqsdfghjklmwxcvbn' | head -c255` $ while true do ; echo $DUMMY > /dev/kmsg ; done The "server freeze" depends on where memblock_alloc does allocate printk buffer : if the buffer overflow is inside another kernel allocation the problem may not be revealed, else the server may hangs up. Signed-off-by: Alexandre SIMON <Alexandre.Simon@univ-lorraine.fr> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * Linux 3.0.65Greg Kroah-Hartman2013-02-171-1/+1
| |
| * igb: Remove artificial restriction on RQDPC stat readingAlexander Duyck2013-02-171-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit ae1c07a6b7ced6c0c94c99e3b53f4e7856fa8bff upstream. For some reason the reading of the RQDPC register was being artificially limited to 4K. Instead of limiting the value we should read the value and add the full amount. Otherwise this can lead to a misleading number of dropped packets when the actual value is in fact much higher. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Vinson Lee <vlee@twitter.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * PCI/PM: Clean up PME state when removing a deviceRafael J. Wysocki2013-02-171-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 249bfb83cf8ba658955f0245ac3981d941f746ee upstream. Devices are added to pci_pme_list when drivers use pci_enable_wake() or pci_wake_from_d3(), but they aren't removed from the list unless the driver explicitly disables wakeup. Many drivers never disable wakeup, so their devices remain on the list even after they are removed, e.g., via hotplug. A subsequent PME poll will oops when it tries to touch the device. This patch disables PME# on a device before removing it, which removes the device from pci_pme_list. This is safe even if the device never had PME# enabled. This oops can be triggered by unplugging a Thunderbolt ethernet adapter on a Macbook Pro, as reported by Daniel below. [bhelgaas: changelog] Reference: http://lkml.kernel.org/r/CAMVG2svG21yiM1wkH4_2pen2n+cr2-Zv7TbH3Gj+8MwevZjDbw@mail.gmail.com Reported-and-tested-by: Daniel J Blueman <daniel@quora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * x86/xen: don't assume %ds is usable in xen_iret for 32-bit PVOPS.Jan Beulich2013-02-171-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 13d2b4d11d69a92574a55bfd985cfb0ca77aebdc upstream. This fixes CVE-2013-0228 / XSA-42 Drew Jones while working on CVE-2013-0190 found that that unprivileged guest user in 32bit PV guest can use to crash the > guest with the panic like this: ------------- general protection fault: 0000 [#1] SMP last sysfs file: /sys/devices/vbd-51712/block/xvda/dev Modules linked in: sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 xen_netfront ext4 mbcache jbd2 xen_blkfront dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan] Pid: 1250, comm: r Not tainted 2.6.32-356.el6.i686 #1 EIP: 0061:[<c0407462>] EFLAGS: 00010086 CPU: 0 EIP is at xen_iret+0x12/0x2b EAX: eb8d0000 EBX: 00000001 ECX: 08049860 EDX: 00000010 ESI: 00000000 EDI: 003d0f00 EBP: b77f8388 ESP: eb8d1fe0 DS: 0000 ES: 007b FS: 0000 GS: 00e0 SS: 0069 Process r (pid: 1250, ti=eb8d0000 task=c2953550 task.ti=eb8d0000) Stack: 00000000 0027f416 00000073 00000206 b77f8364 0000007b 00000000 00000000 Call Trace: Code: c3 8b 44 24 18 81 4c 24 38 00 02 00 00 8d 64 24 30 e9 03 00 00 00 8d 76 00 f7 44 24 08 00 00 02 80 75 33 50 b8 00 e0 ff ff 21 e0 <8b> 40 10 8b 04 85 a0 f6 ab c0 8b 80 0c b0 b3 c0 f6 44 24 0d 02 EIP: [<c0407462>] xen_iret+0x12/0x2b SS:ESP 0069:eb8d1fe0 general protection fault: 0000 [#2] ---[ end trace ab0d29a492dcd330 ]--- Kernel panic - not syncing: Fatal exception Pid: 1250, comm: r Tainted: G D --------------- 2.6.32-356.el6.i686 #1 Call Trace: [<c08476df>] ? panic+0x6e/0x122 [<c084b63c>] ? oops_end+0xbc/0xd0 [<c084b260>] ? do_general_protection+0x0/0x210 [<c084a9b7>] ? error_code+0x73/ ------------- Petr says: " I've analysed the bug and I think that xen_iret() cannot cope with mangled DS, in this case zeroed out (null selector/descriptor) by either xen_failsafe_callback() or RESTORE_REGS because the corresponding LDT entry was invalidated by the reproducer. " Jan took a look at the preliminary patch and came up a fix that solves this problem: "This code gets called after all registers other than those handled by IRET got already restored, hence a null selector in %ds or a non-null one that got loaded from a code or read-only data descriptor would cause a kernel mode fault (with the potential of crashing the kernel as a whole, if panic_on_oops is set)." The way to fix this is to realize that the we can only relay on the registers that IRET restores. The two that are guaranteed are the %cs and %ss as they are always fixed GDT selectors. Also they are inaccessible from user mode - so they cannot be altered. This is the approach taken in this patch. Another alternative option suggested by Jan would be to relay on the subtle realization that using the %ebp or %esp relative references uses the %ss segment. In which case we could switch from using %eax to %ebp and would not need the %ss over-rides. That would also require one extra instruction to compensate for the one place where the register is used as scaled index. However Andrew pointed out that is too subtle and if further work was to be done in this code-path it could escape folks attention and lead to accidents. Reviewed-by: Petr Matousek <pmatouse@redhat.com> Reported-by: Petr Matousek <pmatouse@redhat.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * x86/mm: Check if PUD is large when validating a kernel addressMel Gorman2013-02-172-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 0ee364eb316348ddf3e0dfcd986f5f13f528f821 upstream. A user reported the following oops when a backup process reads /proc/kcore: BUG: unable to handle kernel paging request at ffffbb00ff33b000 IP: [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110 [...] Call Trace: [<ffffffff811b8aaa>] read_kcore+0x17a/0x370 [<ffffffff811ad847>] proc_reg_read+0x77/0xc0 [<ffffffff81151687>] vfs_read+0xc7/0x130 [<ffffffff811517f3>] sys_read+0x53/0xa0 [<ffffffff81449692>] system_call_fastpath+0x16/0x1b Investigation determined that the bug triggered when reading system RAM at the 4G mark. On this system, that was the first address using 1G pages for the virt->phys direct mapping so the PUD is pointing to a physical address, not a PMD page. The problem is that the page table walker in kern_addr_valid() is not checking pud_large() and treats the physical address as if it was a PMD. If it happens to look like pmd_none then it'll silently fail, probably returning zeros instead of real data. If the data happens to look like a present PMD though, it will be walked resulting in the oops above. This patch adds the necessary pud_large() check. Unfortunately the problem was not readily reproducible and now they are running the backup program without accessing /proc/kcore so the patch has not been validated but I think it makes sense. Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.coM> Reviewed-by: Michal Hocko <mhocko@suse.cz> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20130211145236.GX21389@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * Linux 3.0.64Greg Kroah-Hartman2013-02-141-1/+1
| |
| * netback: correct netbk_tx_err to handle wrap around.Ian Campbell2013-02-141-1/+1
| | | | | | | | | | | | | | | | | | [ Upstream commit b9149729ebdcfce63f853aa54a404c6a8f6ebbf3 ] Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Jan Beulich <JBeulich@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * xen/netback: free already allocated memory on failure in xen_netbk_get_requestsIan Campbell2013-02-141-1/+12
| | | | | | | | | | | | | | | | [ Upstream commit 4cc7c1cb7b11b6f3515bd9075527576a1eecc4aa ] Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * xen/netback: don't leak pages on failure in xen_netbk_tx_check_gop.Matthew Daley2013-02-141-26/+14
| | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 7d5145d8eb2b9791533ffe4dc003b129b9696c48 ] Signed-off-by: Matthew Daley <mattjd@gmail.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Jan Beulich <JBeulich@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * xen/netback: shutdown the ring if it contains garbage.Ian Campbell2013-02-143-26/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 48856286b64e4b66ec62b94e504d0b29c1ade664 ] A buggy or malicious frontend should not be able to confuse netback. If we spot anything which is not as it should be then shutdown the device and don't try to continue with the ring in a potentially hostile state. Well behaved and non-hostile frontends will not be penalised. As well as making the existing checks for such errors fatal also add a new check that ensures that there isn't an insane number of requests on the ring (i.e. more than would fit in the ring). If the ring contains garbage then previously is was possible to loop over this insane number, getting an error each time and therefore not generating any more pending requests and therefore not exiting the loop in xen_netbk_tx_build_gops for an externded period. Also turn various netdev_dbg calls which no precipitate a fatal error into netdev_err, they are rate limited because the device is shutdown afterwards. This fixes at least one known DoS/softlockup of the backend domain. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Jan Beulich <JBeulich@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * tg3: Fix crc errors on jumbo frame receiveNithin Nayak Sujir2013-02-141-21/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit daf3ec688e057f6060fb9bb0819feac7a8bbf45c ] TG3_PHY_AUXCTL_SMDSP_ENABLE/DISABLE macros do a blind write to the phy auxiliary control register and overwrite the EXT_PKT_LEN (bit 14) resulting in intermittent crc errors on jumbo frames with some link partners. Change the code to do a read/modify/write. Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com> Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>