summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2014-12-07Add missing defines for ACL query supportSteve French
Add missing defines needed for ACL query support. For definitions of these security info type additionalinfo flags and also the EA Flags see MS-SMB2 (2.2.37) or MS-DTYP Signed-of-by: Steven French <smfrench@gmail.com> Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
2014-12-07Add support for original fallocateSteve French
In many cases the simple fallocate call is a no op (since the file is already not sparse) or can simply be converted from a sparse to a non-sparse file if we are fallocating the whole file and keeping the size. Signed-off-by: Steven French <smfrench@gmail.com>
2014-12-07Linux 3.18Linus Torvalds
2014-12-07genirq: Move irq_chip_write_msi_msg() helper to coreThomas Gleixner
No point to expose this to the world. The only legitimate user is the core code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: Marc Zyngier <marc.zyngier@arm.com>
2014-12-07can: flexcan: Consolidate and unify state change handlingAndri Yngvason
Replacing error state change handling with the new mechanism. Signed-off-by: Andri Yngvason <andri.yngvason@marel.com> Acked-by: Wolfgang Grandegger <wg@grandegger.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2014-12-07can: mscan: Consolidate and unify state change handlingAndri Yngvason
Replacing error state change handling with the new mechanism. Signed-off-by: Andri Yngvason <andri.yngvason@marel.com> Acked-by: Wolfgang Grandegger <wg@grandegger.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2014-12-07can: sja1000: Consolidate and unify state change handlingAndri Yngvason
Replacing error state change handling with the new mechanism. Signed-off-by: Andri Yngvason <andri.yngvason@marel.com> Acked-by: Wolfgang Grandegger <wg@grandegger.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2014-12-07can: dev: Consolidate and unify state change handlingAndri Yngvason
The handling of can error states is different between platforms. This is an attempt to correct that problem. I've moved this handling into a generic function for changing the error state. This ensures that error state changes are handled the same way everywhere (where this function is used). This new mechanism also adds reverse state transitioning in error frames, i.e. the user will be notified through the socket interface when the state goes down. Signed-off-by: Andri Yngvason <andri.yngvason@marel.com> Acked-by: Wolfgang Grandegger <wg@grandegger.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2014-12-07can: Enable -D__CHECK_ENDIAN__ for sparse by defaultMarc Kleine-Budde
This patch enables endian checking by default when running sparse via "make C=2" for example. Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2014-12-07can: fix spelling errorsJeremiah Mahler
Fix various spelling errors in the comments of the CAN modules. Signed-off-by: Jeremiah Mahler <jmmahler@gmail.com> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2014-12-07can: slcan/vcan: eliminate banner[] variable, switch to pr_info()Jeremiah Mahler
Several can modules in drivers/net/can use a banner[] variable at the top which defines a string that is used once during init. This string is also embedded with KERN_INFO which makes it printk() specific. Improve the code by eliminating the banner[] variable and moving the string to where it is printed. Then switch from printk(KERN_INFO to pr_info() for the lines that were changed. This patch is similar to [1] which was applied to net/can. [1]: https://lkml.org/lkml/2014/11/22/10 Signed-off-by: Jeremiah Mahler <jmmahler@gmail.com> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2014-12-07can: eliminate banner[] variable and switch to pr_info()Jeremiah Mahler
Several CAN modules use a design pattern with a banner[] variable at the top which defines a string that is used once during init to print the banner. The string is also embedded with KERN_INFO which makes it printk() specific. Improve the code by eliminating the banner[] variable and moving the string to where it is printed. Then switch from printk(KERN_INFO to pr_info() for the lines that were changed. Signed-off-by: Jeremiah Mahler <jmmahler@gmail.com> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2014-12-07can: peak_usb: fix multi-byte values endianessStephane Grosjean
This patch fixes the endianess definition as well as the usage of the multi-byte fields in the data structures exchanged with the PEAK-System USB adapters. By fixing the endianess, this patch also fixes the wrong usage of a 32-bits local variable for handling the error status 16-bits field, in function pcan_usb_pro_handle_error(). Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2014-12-07Merge branch 'for-3.18-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata Pull libata fixes from Tejun Heo: "Three libata fixes for v3.18. Nothing too interesting. PCI ID ID and quirk additions to ahci and an error handling path fix in sata_fsl" * 'for-3.18-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: ahci: disable MSI on SAMSUNG 0xa800 SSD sata_fsl: fix error handling of irq_of_parse_and_map AHCI: Add DeviceIDs for Sunrise Point-LP SATA controller
2014-12-07spi/s3c64xx: Remove redundant runtime PM managementMark Brown
The device already asks the core to hold a runtime PM reference while it is active so it is redundant to open code that in the driver itself. Signed-off-by: Mark Brown <broonie@kernel.org>
2014-12-07ASoC: rsnd: rename SSI function name of PIOKuninori Morimoto
Current R-Car sound SSI PIO/DMA mode are using interrupt. it is no longer "xxx_pio_xxx", rename it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2014-12-07ASoC: rsnd: add salvage support for under/over flow error on SSIKuninori Morimoto
L/R channel will be switched if under/over flow error happen on Renesas R-Car sound device by the HW bugs. Then, HW restart is required for salvage. This patch add salvage support for SSI. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2014-12-06ALSA: usb-audio: Don't resubmit pending URBs at MIDI error recoveryTakashi Iwai
In snd_usbmidi_error_timer(), the driver tries to resubmit MIDI input URBs to reactivate the MIDI stream, but this causes the error when some of URBs are still pending like: WARNING: CPU: 0 PID: 0 at ../drivers/usb/core/urb.c:339 usb_submit_urb+0x5f/0x70() URB ef705c40 submitted while active CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.16.6-2-desktop #1 Hardware name: FOXCONN TPS01/TPS01, BIOS 080015 03/23/2010 c0984bfa f4009ed4 c078deaf f4009ee4 c024c884 c09a135c f4009f00 00000000 c0984bfa 00000153 c061ac4f c061ac4f 00000009 00000001 ef705c40 e854d1c0 f4009eec c024c8d3 00000009 f4009ee4 c09a135c f4009f00 f4009f04 c061ac4f Call Trace: [<c0205df6>] try_stack_unwind+0x156/0x170 [<c020482a>] dump_trace+0x5a/0x1b0 [<c0205e56>] show_trace_log_lvl+0x46/0x50 [<c02049d1>] show_stack_log_lvl+0x51/0xe0 [<c0205eb7>] show_stack+0x27/0x50 [<c078deaf>] dump_stack+0x45/0x65 [<c024c884>] warn_slowpath_common+0x84/0xa0 [<c024c8d3>] warn_slowpath_fmt+0x33/0x40 [<c061ac4f>] usb_submit_urb+0x5f/0x70 [<f7974104>] snd_usbmidi_submit_urb+0x14/0x60 [snd_usbmidi_lib] [<f797483a>] snd_usbmidi_error_timer+0x6a/0xa0 [snd_usbmidi_lib] [<c02570c0>] call_timer_fn+0x30/0x130 [<c0257442>] run_timer_softirq+0x1c2/0x260 [<c0251493>] __do_softirq+0xc3/0x270 [<c0204732>] do_softirq_own_stack+0x22/0x30 [<c025186d>] irq_exit+0x8d/0xa0 [<c0795228>] smp_apic_timer_interrupt+0x38/0x50 [<c0794a3c>] apic_timer_interrupt+0x34/0x3c [<c0673d9e>] cpuidle_enter_state+0x3e/0xd0 [<c028bb8d>] cpu_idle_loop+0x29d/0x3e0 [<c028bd23>] cpu_startup_entry+0x53/0x60 [<c0bfac1e>] start_kernel+0x415/0x41a For avoiding these errors, check the pending URBs and skip resubmitting such ones. Reported-and-tested-by: Stefan Seyfried <stefan.seyfried@googlemail.com> Acked-by: Clemens Ladisch <clemens@ladisch.de> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2014-12-06Merge git://www.linux-watchdog.org/linux-watchdogLinus Torvalds
Pull watchdog fix from Wim Van Sebroeck: "Fix the watchdog mask bit offset for Exynos7" * git://www.linux-watchdog.org/linux-watchdog: watchdog: s3c2410_wdt: Fix the mask bit offset for Exynos7
2014-12-06Merge branch 'i2c/for-current' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "Here are two more driver bugfixes for I2C which would be good to have" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: cadence: Set the hardware time-out register to maximum value i2c: davinci: generate STP always when NACK is received
2014-12-06can: peak_usb: fix cleanup sequence order in case of error during initStephane Grosjean
This patch sets the correct reverse sequence order to the instructions set to run, when any failure occurs during the initialization steps. It also adds the missing unregistration call of the can device if the failure appears after having been registered. Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2014-12-06can: peak_usb: fix memset() usageStephane Grosjean
This patchs fixes a misplaced call to memset() that fills the request buffer with 0. The problem was with sending PCAN_USBPRO_REQ_FCT requests, the content set by the caller was thus lost. With this patch, the memory area is zeroed only when requesting info from the device. Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2014-12-06i40e: Reduce stack in i40e_dbg_dump_descJoe Perches
Reduce stack use by using kmemdup and not using a very large struct on stack. In function ‘i40e_dbg_dump_desc’: warning: the frame size of 8192 bytes is larger than 2048 bytes [-Wframe-larger-than=] Signed-off-by: Joe Perches <joe@perches.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-12-06i40e: Bump i40e version to 1.2.2 and i40evf version to 1.0.6Catherine Sullivan
Bump version. Change-ID: I4264e81dcfb57ec46a3ede54b0a6cb25b497d3cb Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-12-06i40e: get pf_id from HW rather than PCI functionShannon Nelson
Getting the pf_id from the function number was a good place to start, but when the PF was setup in passthru mode, the PCI bus/device/function was virtualized and the number in the VM is different from the number in the bare metal. This caused HW configuration issues when the wrong pf_id was used to set up the HMC and other structures. The PF_FUNC_RID register has the real bus/device/function information as configured by the BIOS, so use that for a better number. This works in NPAR mode as well. Change-ID: I65e3dd6c97594890c2bad566b83cc670b1dae534 Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Acked-by: Greg Rose <gregory.v.rose@intel.com> Acked-by: Kevin Scott <kevin.c.scott@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-12-06i40e: increase ARQ sizeMitch Williams
The ARQ needs to have at least as many entries as VFs, or the VFs will get errors from the FW when they send messages to the PF. Since we don't know how many VFs we'll end up with, just set up 128 descriptors. Change-ID: I04ae3d1c7faf09110eb782214e9c05aeb62a6c59 Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-12-06i40e: Re enable Main VSI loopback setting in the reset pathAnjali Singhai Jain
There is an order in which this should happen. It turns out that FW will not let you change the Loopback setting of the VSI with update VSI prior to the VEB creation. Change-ID: I7614ddff8b4c37702930c02f16f8c346aaa64bd1 Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-12-06i40e: Add new update VSI flow to accommodate FW fix with VSI Loopback modeAnjali Singhai Jain
All VSIs on a VEB should either have loopback enabled or disabled, a mixed mode is not supported for a VEB. Since our driver supports multiple VSIs per PF that need to talk to each other make sure to enable Loopback for the PF and FDIR VSI as well. Also, we now have to explicitly enable Loopback mode otherwise we fail VSI creation for VMDq and VF VSIs. Change-ID: Ib68c3ea4aeb730ac9468f930610de456efbe5b20 Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-12-06ASoC: samsung: Fix error handling for clock lookupMark Brown
Return the error code we got from clk_get() and check to make sure that clk_prepare_enable() worked. Signed-off-by: Mark Brown <broonie@kernel.org>
2014-12-06i40e: Increase reset delayKevin Scott
Increase reset delay to ensure all internal caches are properly flushed in worst case scenario. Change-ID: I6f059a9e024fbf9ef1debd32497eed21369957fc Signed-off-by: Kevin Scott <kevin.c.scott@intel.com> Acked-by: Shannon Nelson <shannon.nelson@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-12-06i40evf: make early init sequence even more robustMitch Williams
When multiple VFs attempt to initialize simultaneously, the firmware may delay or drop messages. Make the init code more adept at handling these situations by a) reinitializing the admin queue if the firmware fails to process a request, and b) resending a request if the PF doesn't answer. Once the request has been sent again, the PF might end up getting both requests and send the configuration information to the driver twice. This will cause the VF to complain about receiving an unexpected message from the PF. Since this is not fatal, reduce the warning level of the log messages that are generated in response to this event. Change-ID: I9370a1a2fde2ad3934fa25ccfd0545edfbbb4805 Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-12-06i40e: fix netdev_stat macro definitionShannon Nelson
The old xxx_NETDEV_STAT() macro was defined long before the newer rtnl_link_stats64 came into being, and just never got updated. Since we're using rtnl_link_stats64 in other parts of the driver, we should use it here as well. We've just been lucky that the field definitions are the same sizes. Change-ID: I19fc71619905700235dcdf0d3c8153aec81d36de Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-12-06x86, microcode: Reload microcode on resumeBorislav Petkov
Normally, we do reapply microcode on resume. However, in the cases where that microcode comes from the early loader and the late loader hasn't been utilized yet, there's no easy way for us to go and apply the patch applied during boot by the early loader. Thus, reuse the patch stashed by the early loader for the BSP. Signed-off-by: Borislav Petkov <bp@suse.de>
2014-12-06i40e: Define and use i40e_is_vf macroAnjali Singhai Jain
This patch is useful for future expansion when new VF MAC types get added. It helps with cleaning up VF driver flow. Change-ID: Ibe1eeb71262a3a40f24a1c5409436bdc3411da7f Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Acked-by: Shannon Nelson <shannon.nelson@intel.com> Acked-by: Greg Rose <gregory.v.rose@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-12-06x86, microcode: Don't initialize microcode code on paravirtBoris Ostrovsky
Paravirtual guests are not expected to load microcode into processors and therefore it is not necessary to initialize microcode loading logic. In fact, under certain circumstances initializing this logic may cause the guest to crash. Specifically, 32-bit kernels use __pa_nodebug() macro which does not work in Xen (the code path that leads to this macro happens during resume when we call mc_bp_resume()->load_ucode_ap() ->check_loader_disabled_ap()) Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: http://lkml.kernel.org/r/1417469264-31470-1-git-send-email-boris.ostrovsky@oracle.com Signed-off-by: Borislav Petkov <bp@suse.de>
2014-12-06x86, microcode, intel: Drop unused parameterBorislav Petkov
apply_microcode_early() doesn't use mc_saved_data, kill it. Signed-off-by: Borislav Petkov <bp@suse.de>
2014-12-06i40e: Add a virtual channel op to config RSSAnjali Singhai Jain
Add the Virtual Channel OP event opcode for CONFIG_RSS, so that the Virtual Channel state machine can properly decipher status change events. Change-ID: I09939c7aa380147f60c49fd01ef2e27d0dc1c299 Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Acked-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-12-06i40e: don't enable PTP support on more than one PF per portJacob Keller
Resolve an issue related to images with multiple PFs per physical port. We cannot fully support 1588 PTP features, since only one port should control (ie: write) the registers at a time. Doing so can cause interference of functionality. It may be possible to partially implement the API for only those features without side effects. However, this at minimum means non controlling PFs lose Tx timestamps, frequency atunement, and possibly SYSTIME adjustment. There may be further impact I did not discover. Since the API in the kernel expects these features to work, it is simpler and less dangerous to just disable PTP features on all PFs not identified as the controlling PF in PRTTSYN_CTL0.PF_ID. This change also removes the warning printed when hwtstaml IOCTL is called on the wrong PF. This is actually meaningless now, since only one PF per port will support it. In addition, the ethtool get_ts_info IOCTL was updated so that only the controlling port will even indicate support (so as not to confuse users). The overall downside is complete loss of functionality on non controlling PF, vs the possible gain of partial support. The biggest factor for choosing this approach is simplicity and ensuring that the main PF will work. There could easily be other portions of the 1588 logic with side effects I am not aware, and the reduced functionality that might be made available is significantly less useful. In addition, the API does not allow for proper indication of why particular features are not supported. These reasons are enough to decide for the simpler approach to resolving this issue. Change-ID: If4696bae686fc18aef6552b67dd417213d987c16 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-12-06i40e: Add description to misc and fd interruptsCarolyn Wyborny
This patch adds additional text description for base pf0 and flow director generated interrupts. Without this patch, these interrupts are difficult to distinguish per port on a multi-function device. Change-ID: I4662e1b38840757765a3fe63d90219d28e76bfab Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-12-06i40e: allow various base numbers in debugfs aq commandsShannon Nelson
Use the 'i' rather than the more restrictive 'x' or 'd' in the aq_cmd arguments. This makes the user interface much more forgiving and user friendly. Change-ID: I5dcd57b9befc047e06b74cf1152a25a3fa9e1309 Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-12-06i40e: remove useless debug noiseShannon Nelson
This message really doesn't give any useful information and ends up getting printed every service_task loop in the Linux driver, filling the logfile with noise when AQ tracing is enabled. This patch simply removes the noise. Change-ID: I30ad51e6b03c7ad12a7d9c102def0087db622df3 Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Acked-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-12-06i40e: Remove unneeded break statementShannon Nelson
This case statement is empty and the fall through just breaks out so remove the break and let it fall through to break out. Change-ID: I1b5ba9870d5245ca80bfca6e7f5f089e2eb8ccb0 Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-12-05Merge branch 'ebpf-next'David S. Miller
Alexei Starovoitov says: ==================== allow eBPF programs to be attached to sockets V1->V2: fixed comments in sample code to state clearly that packet data is accessed with LD_ABS instructions and not internal skb fields. Also replaced constants in: BPF_LD_ABS(BPF_B, 14 + 9 /* R0 = ip->proto */), with: BPF_LD_ABS(BPF_B, ETH_HLEN + offsetof(struct iphdr, protocol) /* R0 = ip->proto */), V1 cover: Introduce BPF_PROG_TYPE_SOCKET_FILTER type of eBPF programs that can be attached to sockets with setsockopt(). Allow such programs to access maps via lookup/update/delete helpers. This feature was previewed by bpf manpage in commit b4fc1a460f30("Merge branch 'bpf-next'") Now it can actually run. 1st patch adds LD_ABS/LD_IND instruction verification and 2nd patch adds new setsockopt() flag. Patches 3-6 are examples in assembler and in C. Though native eBPF programs are way more powerful than classic filters (attachable through similar setsockopt() call), they don't have skb field accessors yet. Like skb->pkt_type, skb->dev->ifindex are not accessible. There are sevaral ways to achieve that. That will be in the next set of patches. So in this set native eBPF programs can only read data from packet and access maps. The most powerful example is sockex2_kern.c from patch 6 where ~200 lines of C are compiled into ~300 of eBPF instructions. It shows how quite complex packet parsing can be done. LLVM used to build examples is at https://github.com/iovisor/llvm which is fork of llvm trunk that I'm cleaning up for upstreaming. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-05samples: bpf: large eBPF program in CAlexei Starovoitov
sockex2_kern.c is purposefully large eBPF program in C. llvm compiles ~200 lines of C code into ~300 eBPF instructions. It's similar to __skb_flow_dissect() to demonstrate that complex packet parsing can be done by eBPF. Then it uses (struct flow_keys)->dst IP address (or hash of ipv6 dst) to keep stats of number of packets per IP. User space loads eBPF program, attaches it to loopback interface and prints dest_ip->#packets stats every second. Usage: $sudo samples/bpf/sockex2 ip 127.0.0.1 count 19 ip 127.0.0.1 count 178115 ip 127.0.0.1 count 369437 ip 127.0.0.1 count 559841 ip 127.0.0.1 count 750539 Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-05samples: bpf: trivial eBPF program in CAlexei Starovoitov
this example does the same task as previous socket example in assembler, but this one does it in C. eBPF program in kernel does: /* assume that packet is IPv4, load one byte of IP->proto */ int index = load_byte(skb, ETH_HLEN + offsetof(struct iphdr, protocol)); long *value; value = bpf_map_lookup_elem(&my_map, &index); if (value) __sync_fetch_and_add(value, 1); Corresponding user space reads map[tcp], map[udp], map[icmp] and prints protocol stats every second Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-05samples: bpf: elf_bpf file loaderAlexei Starovoitov
simple .o parser and loader using BPF syscall. .o is a standard ELF generated by LLVM backend It parses elf file compiled by llvm .c->.o - parses 'maps' section and creates maps via BPF syscall - parses 'license' section and passes it to syscall - parses elf relocations for BPF maps and adjusts BPF_LD_IMM64 insns by storing map_fd into insn->imm and marking such insns as BPF_PSEUDO_MAP_FD - loads eBPF programs via BPF syscall One ELF file can contain multiple BPF programs. int load_bpf_file(char *path); populates prog_fd[] and map_fd[] with FDs received from bpf syscall bpf_helpers.h - helper functions available to eBPF programs written in C Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-05samples: bpf: example of stateful socket filteringAlexei Starovoitov
this socket filter example does: - creates arraymap in kernel with key 4 bytes and value 8 bytes - loads eBPF program which assumes that packet is IPv4 and loads one byte of IP->proto from the packet and uses it as a key in a map r0 = skb->data[ETH_HLEN + offsetof(struct iphdr, protocol)]; *(u32*)(fp - 4) = r0; value = bpf_map_lookup_elem(map_fd, fp - 4); if (value) (*(u64*)value) += 1; - attaches this program to raw socket - every second user space reads map[IPPROTO_TCP], map[IPPROTO_UDP], map[IPPROTO_ICMP] to see how many packets of given protocol were seen on loopback interface Usage: $sudo samples/bpf/sock_example TCP 0 UDP 0 ICMP 0 packets TCP 187600 UDP 0 ICMP 4 packets TCP 376504 UDP 0 ICMP 8 packets TCP 563116 UDP 0 ICMP 12 packets TCP 753144 UDP 0 ICMP 16 packets Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-05net: sock: allow eBPF programs to be attached to socketsAlexei Starovoitov
introduce new setsockopt() command: setsockopt(sock, SOL_SOCKET, SO_ATTACH_BPF, &prog_fd, sizeof(prog_fd)) where prog_fd was received from syscall bpf(BPF_PROG_LOAD, attr, ...) and attr->prog_type == BPF_PROG_TYPE_SOCKET_FILTER setsockopt() calls bpf_prog_get() which increments refcnt of the program, so it doesn't get unloaded while socket is using the program. The same eBPF program can be attached to multiple sockets. User task exit automatically closes socket which calls sk_filter_uncharge() which decrements refcnt of eBPF program Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-05bpf: verifier: add checks for BPF_ABS | BPF_IND instructionsAlexei Starovoitov
introduce program type BPF_PROG_TYPE_SOCKET_FILTER that is used for attaching programs to sockets where ctx == skb. add verifier checks for ABS/IND instructions which can only be seen in socket filters, therefore the check: if (env->prog->aux->prog_type != BPF_PROG_TYPE_SOCKET_FILTER) verbose("BPF_LD_ABS|IND instructions are only allowed in socket filters\n"); Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-05tun/macvtap: use consume_skb() instead of kfree_skb() when neededJason Wang
To be more friendly with drop monitor, we should only call kfree_skb() when the packets were dropped and use consume_skb() in other cases. Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>