summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2020-09-03blk-mq: Record active_queues_shared_sbitmap per tag_set for when using ↵John Garry
shared sbitmap For when using a shared sbitmap, no longer should the number of active request queues per hctx be relied on for when judging how to share the tag bitmap. Instead maintain the number of active request queues per tag_set, and make the judgement based on that. Originally-from: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: John Garry <john.garry@huawei.com> Tested-by: Don Brace<don.brace@microsemi.com> #SCSI resv cmds patches used Tested-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-03blk-mq: Record nr_active_requests per queue for when using shared sbitmapJohn Garry
The per-hctx nr_active value can no longer be used to fairly assign a share of tag depth per request queue for when using a shared sbitmap, as it does not consider that the tags are shared tags over all hctx's. For this case, record the nr_active_requests per request_queue, and make the judgement based on that value. Co-developed-with: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: John Garry <john.garry@huawei.com> Tested-by: Don Brace<don.brace@microsemi.com> #SCSI resv cmds patches used Tested-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-03blk-mq: Facilitate a shared sbitmap per tagsetJohn Garry
Some SCSI HBAs (such as HPSA, megaraid, mpt3sas, hisi_sas_v3 ..) support multiple reply queues with single hostwide tags. In addition, these drivers want to use interrupt assignment in pci_alloc_irq_vectors(PCI_IRQ_AFFINITY). However, as discussed in [0], CPU hotplug may cause in-flight IO completion to not be serviced when an interrupt is shutdown. That problem is solved in commit bf0beec0607d ("blk-mq: drain I/O when all CPUs in a hctx are offline"). However, to take advantage of that blk-mq feature, the HBA HW queuess are required to be mapped to that of the blk-mq hctx's; to do that, the HBA HW queues need to be exposed to the upper layer. In making that transition, the per-SCSI command request tags are no longer unique per Scsi host - they are just unique per hctx. As such, the HBA LLDD would have to generate this tag internally, which has a certain performance overhead. However another problem is that blk-mq assumes the host may accept (Scsi_host.can_queue * #hw queue) commands. In commit 6eb045e092ef ("scsi: core: avoid host-wide host_busy counter for scsi_mq"), the Scsi host busy counter was removed, which would stop the LLDD being sent more than .can_queue commands; however, it should still be ensured that the block layer does not issue more than .can_queue commands to the Scsi host. To solve this problem, introduce a shared sbitmap per blk_mq_tag_set, which may be requested at init time. New flag BLK_MQ_F_TAG_HCTX_SHARED should be set when requesting the tagset to indicate whether the shared sbitmap should be used. Even when BLK_MQ_F_TAG_HCTX_SHARED is set, a full set of tags and requests are still allocated per hctx; the reason for this is that if tags and requests were only allocated for a single hctx - like hctx0 - it may break block drivers which expect a request be associated with a specific hctx, i.e. not always hctx0. This will introduce extra memory usage. This change is based on work originally from Ming Lei in [1] and from Bart's suggestion in [2]. [0] https://lore.kernel.org/linux-block/alpine.DEB.2.21.1904051331270.1802@nanos.tec.linutronix.de/ [1] https://lore.kernel.org/linux-block/20190531022801.10003-1-ming.lei@redhat.com/ [2] https://lore.kernel.org/linux-block/ff77beff-5fd9-9f05-12b6-826922bace1f@huawei.com/T/#m3db0a602f095cbcbff27e9c884d6b4ae826144be Signed-off-by: John Garry <john.garry@huawei.com> Tested-by: Don Brace<don.brace@microsemi.com> #SCSI resv cmds patches used Tested-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-03blk-mq: Rename BLK_MQ_F_TAG_SHARED as BLK_MQ_F_TAG_QUEUE_SHAREDMing Lei
BLK_MQ_F_TAG_SHARED actually means that tags is shared among request queues, all of which should belong to LUNs attached to same HBA. So rename it to make the point explicitly. [jpg: rebase a few times, add rnbd-clt.c change] Suggested-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: John Garry <john.garry@huawei.com> Tested-by: Douglas Gilbert <dgilbert@interlog.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-03media: rc: harmonize infrared durations to microsecondsSean Young
rc-core kapi uses nanoseconds for infrared durations for receiving, and microseconds for sending. The uapi already uses microseconds for both, so this patch does not change the uapi. Infrared durations do not need nanosecond resolution. IR protocols do not have durations shorter than about 100 microseconds. Some IR hardware offers 250 microseconds resolution, which is sufficient for most protocols. Better hardware has 50 microsecond resolution and is enough for every protocol I am aware off. Unify on microseconds everywhere. This simplifies the code since less conversion between microseconds and nanoseconds needs to be done. This affects: - rx_resolution member of struct rc_dev - timeout member of struct rc_dev - duration member in struct ir_raw_event Cc: "Bruno Prémont" <bonbons@linux-vserver.org> Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl> Cc: Maxim Levitsky <maximlevitsky@gmail.com> Cc: Patrick Lerda <patrick9876@free.fr> Cc: Kevin Hilman <khilman@baylibre.com> Cc: Neil Armstrong <narmstrong@baylibre.com> Cc: Jerome Brunet <jbrunet@baylibre.com> Cc: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Cc: Sean Wang <sean.wang@mediatek.com> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: Patrice Chotard <patrice.chotard@st.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Chen-Yu Tsai <wens@csie.org> Cc: "David Härdeman" <david@hardeman.nu> Cc: Benjamin Valentin <benpicco@googlemail.com> Cc: Antti Palosaari <crope@iki.fi> Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-03media: videobuf-dma-sg: number of pages should be unsigned longMauro Carvalho Chehab
As reported by smatch: drivers/media/v4l2-core/videobuf-dma-sg.c:245 videobuf_dma_init_kernel() warn: should 'nr_pages << 12' be a 64 bit type? The printk should not be using %d for the number of pages. After looking better, the real problem here is that the number of pages should be long int. Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-02block: allow for_each_bvec to support zero len bvecMing Lei
Block layer usually doesn't support or allow zero-length bvec. Since commit 1bdc76aea115 ("iov_iter: use bvec iterator to implement iterate_bvec()"), iterate_bvec() switches to bvec iterator. However, Al mentioned that 'Zero-length segments are not disallowed' in iov_iter. Fixes for_each_bvec() so that it can move on after seeing one zero length bvec. Fixes: 1bdc76aea115 ("iov_iter: use bvec iterator to implement iterate_bvec()") Reported-by: syzbot <syzbot+61acc40a49a3e46e25ea@syzkaller.appspotmail.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Tested-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Matthew Wilcox <willy@infradead.org> Cc: <stable@vger.kernel.org> Link: https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg2262077.html Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-02Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Jiri Kosina: - data sanitization and validtion fixes for report descriptor parser from Marc Zyngier - memory leak fix for hid-elan driver from Dinghao Liu - two device-specific quirks * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: core: Sanitize event code and type when mapping input HID: core: Correctly handle ReportSize being zero HID: elan: Fix memleak in elan_input_configured HID: microsoft: Add rumble support for the 8bitdo SN30 Pro+ controller HID: quirks: Set INCREMENT_USAGE_ON_DUPLICATE for all Saitek X52 devices
2020-09-02regmap: Add can_sleep configuration optionDmitry Osipenko
Regmap can't sleep if spinlock is used for the locking protection. This patch fixes regression caused by a previous commit that switched regmap to use fsleep() and this broke Amlogic S922X platform. This patch adds new configuration option for regmap users, allowing to specify whether regmap operations can sleep and assuming that sleep is allowed if mutex is used for the regmap locking protection. Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Fixes: 2b32d2f7ce0a ("regmap: Use flexible sleep") Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20200902141843.6591-1-digetx@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-09-02libata: implement ATA_HORKAGE_MAX_TRIM_128M and apply to SandisksTejun Heo
All three generations of Sandisk SSDs lock up hard intermittently. Experiments showed that disabling NCQ lowered the failure rate significantly and the kernel has been disabling NCQ for some models of SD7's and 8's, which is obviously undesirable. Karthik worked with Sandisk to root cause the hard lockups to trim commands larger than 128M. This patch implements ATA_HORKAGE_MAX_TRIM_128M which limits max trim size to 128M and applies it to all three generations of Sandisk SSDs. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Karthik Shivaram <karthikgs@fb.com> Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-02block: remove revalidate_disk()Christoph Hellwig
Remove the now unused helper. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Acked-by: Song Liu <song@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-02block: use revalidate_disk_size in set_capacity_revalidate_and_notifyChristoph Hellwig
Only virtio_blk and xen-blkfront set the revalidate argument to true, and both do not implement the ->revalidate_disk method. So switch to the helper that just updates the size instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-02block: add a new revalidate_disk_size helperChristoph Hellwig
revalidate_disk is a relative awkward helper for driver use, as it first calls an optional driver method and then updates the block device size, while most callers either don't need the method call at all, or want to keep state between the caller and the called method. Add a revalidate_disk_size helper that just performs the update of the block device size from the gendisk one, and switch all drivers that do not implement ->revalidate_disk to use the new helper instead of revalidate_disk() Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Acked-by: Song Liu <song@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-02block: rename bd_invalidatedChristoph Hellwig
Replace bd_invalidate with a new BDEV_NEED_PART_SCAN flag in a bd_flags variable to better describe the condition. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-02drm/i915: Fix sha_text population codeSean Paul
This patch fixes a few bugs: 1- We weren't taking into account sha_leftovers when adding multiple ksvs to sha_text. As such, we were or'ing the end of ksv[j - 1] with the beginning of ksv[j] 2- In the sha_leftovers == 2 and sha_leftovers == 3 case, bstatus was being placed on the wrong half of sha_text, overlapping the leftover ksv value 3- In the sha_leftovers == 2 case, we need to manually terminate the byte stream with 0x80 since the hardware doesn't have enough room to add it after writing M0 The upside is that all of the HDCP supported HDMI repeaters I could find on Amazon just strip HDCP anyways, so it turns out to be _really_ hard to hit any of these cases without an MST hub, which is not (yet) supported. Oh, and the sha_leftovers == 1 case works perfectly! Fixes: ee5e5e7a5e0f ("drm/i915: Add HDCP framework + base implementation") Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ramalingam C <ramalingam.c@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Sean Paul <seanpaul@chromium.org> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: intel-gfx@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v4.17+ Reviewed-by: Ramalingam C <ramalingam.c@intel.com> Signed-off-by: Sean Paul <seanpaul@chromium.org> Signed-off-by: Ramalingam C <ramalingam.c@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200818153910.27894-2-sean@poorly.run (cherry picked from commit 1f0882214fd0037b74f245d9be75c31516fed040) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-09-01block: grant IOPRIO_CLASS_RT to CAP_SYS_NICEKhazhismel Kumykov
CAP_SYS_ADMIN is too broad, and ionice fits into CAP_SYS_NICE's grouping. Retain CAP_SYS_ADMIN permission for backwards compatibility. Signed-off-by: Khazhismel Kumykov <khazhy@google.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Acked-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-01blk-iocost: restore inuse update tracepointsTejun Heo
Update and restore the inuse update tracepoints. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-01blk-iocost: decouple vrate adjustment from surplus transfersTejun Heo
Budget donations are inaccurate and could take multiple periods to converge. To prevent triggering vrate adjustments while surplus transfers were catching up, vrate adjustment was suppressed if donations were increasing, which was indicated by non-zero nr_surpluses. This entangling won't be necessary with the scheduled rewrite of donation mechanism which will make it precise and immediate. Let's decouple the two in preparation. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-01blk-iocost: calculate iocg->usages[] from iocg->local_stat.usage_usTejun Heo
Currently, iocg->usages[] which are used to guide inuse adjustments are calculated from vtime deltas. This, however, assumes that the hierarchical inuse weight at the time of calculation held for the entire period, which often isn't true and can lead to significant errors. Now that we have absolute usage information collected, we can derive iocg->usages[] from iocg->local_stat.usage_us so that inuse adjustment decisions are made based on actual absolute usage. The calculated usage is clamped between 1 and WEIGHT_ONE and WEIGHT_ONE is also used to signal saturation regardless of the current hierarchical inuse weight. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-01block: remove an outdated comment on the bd_dev fieldChristoph Hellwig
kdev_t is long gone, so we don't need to comment a field isn't one.. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-01block: remove the discard_alignment field from struct hd_structChristoph Hellwig
The alignment offset is only used in slow path callers, so just calculate it on the fly. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-01block: remove the alignment_offset field from struct hd_structChristoph Hellwig
The alignment offset is only used in slow path callers, so just calculate it on the fly. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-01block: Move blk_mq_bio_list_merge() into blk-merge.cBaolin Wang
Move the blk_mq_bio_list_merge() into blk-merge.c and rename it as a generic name. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-01block: remove the BIO_USER_MAPPED flagChristoph Hellwig
Just check if there is private data, in which case the bio must have originated from bio_copy_user_iov. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-01block: remove the BIO_NULL_MAPPED flagChristoph Hellwig
We can simply use a boolean flag in the bio_map_data data structure instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-01block: fix locking for struct block_device size updatesChristoph Hellwig
Two different callers use two different mutexes for updating the block device size, which obviously doesn't help to actually protect against concurrent updates from the different callers. In addition one of the locks, bd_mutex is rather prone to deadlocks with other parts of the block stack that use it for high level synchronization. Switch to using a new spinlock protecting just the size updates, as that is all we need, and make sure everyone does the update through the proper helper. This fixes a bug reported with the nvme revalidating disks during a hot removal operation, which can currently deadlock on bd_mutex. Reported-by: Xianting Tian <xianting_tian@126.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-01block: replace bd_set_size with bd_set_nr_sectorsChristoph Hellwig
Replace bd_set_size with a version that takes the number of sectors instead, as that fits most of the current and future callers much better. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-01block: Make request_queue.rpm_status an enumGeert Uytterhoeven
request_queue.rpm_status is assigned values of the rpm_status enum only, so reflect that in its type. Note that including <linux/pm.h> is (currently) a no-op, as it is already included through <linux/genhd.h> and <linux/device.h>, but it is better to play it safe. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-01locks: Remove extra "0x" in tracepoint format specifierChuck Lever
Clean up: %p adds its own 0x already. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Jeff Layton <jlayton@kernel.org>
2020-09-01media: v4l2-ctrl: Add frame-skip std encoder controlStanimir Varbanov
Adds encoders standard v4l2 control for frame-skip. The control is a copy of a custom encoder control so that other v4l2 encoder drivers can use it. Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org> Reviewed-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01media: v4l2-ctrls: Add encoder constant quality controlMaheshwar Ajja
When V4L2_CID_MPEG_VIDEO_BITRATE_MODE value is V4L2_MPEG_VIDEO_BITRATE_MODE_CQ, encoder will produce constant quality output indicated by V4L2_CID_MPEG_VIDEO_CONSTANT_QUALITY control value. Encoder will choose appropriate quantization parameter and bitrate to produce requested frame quality level. Signed-off-by: Maheshwar Ajja <majja@codeaurora.org> Reviewed-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01media: uapi: h264: Rename and clarify PPS_FLAG_SCALING_MATRIX_PRESENTEzequiel Garcia
Applications are expected to fill V4L2_CID_MPEG_VIDEO_H264_SCALING_MATRIX if a non-flat scaling matrix applies to the picture. This is the case if SPS scaling_matrix_present_flag or PPS pic_scaling_matrix_present_flag are set, and should be handled by applications. On one hand, the PPS bitstream syntax element signals the presence of a Picture scaling matrix modifying the Sequence (SPS) scaling matrix. On the other hand, our flag should indicate if the scaling matrix V4L2 control is applicable to this request. Rename the flag from PPS_FLAG_PIC_SCALING_MATRIX_PRESENT to PPS_FLAG_SCALING_MATRIX_PRESENT, to avoid mixing this flag with bitstream syntax element pic_scaling_matrix_present_flag, and clarify the meaning of our flag. Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com> Tested-by: Jonas Karlman <jonas@kwiboo.se> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01media: uapi: h264: Clean slice invariants syntax elementsEzequiel Garcia
The H.264 specification requires in section 7.4.3 "Slice header semantics", that the following values shall be the same in all slice headers: pic_parameter_set_id frame_num field_pic_flag bottom_field_flag idr_pic_id pic_order_cnt_lsb delta_pic_order_cnt_bottom delta_pic_order_cnt[ 0 ] delta_pic_order_cnt[ 1 ] sp_for_switch_flag slice_group_change_cycle These bitstream fields are part of the slice header, and therefore passed redundantly on each slice. The purpose of the redundancy is to make the codec fault-tolerant in network scenarios. This is of course not needed to be reflected in the V4L2 controls, given the bitstream has already been parsed by applications. Therefore, move the redundant fields to the per-frame decode parameters control (DECODE_PARAMS). Field 'pic_parameter_set_id' is simply removed in this case, because the PPS control must currently contain the active PPS. Syntax elements dec_ref_pic_marking() and those related to pic order count, remain invariant as well, and therefore, the fields dec_ref_pic_marking_bit_size and pic_order_cnt_bit_size are also common to all slices. Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com> Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Tested-by: Jonas Karlman <jonas@kwiboo.se> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01media: uapi: h264: Clarify SLICE_BASED modeEzequiel Garcia
Currently, the SLICE_BASED and FRAME_BASED modes documentation is misleading and not matching the intended use-cases. Drop non-required fields SLICE_PARAMS 'start_byte_offset' and DECODE_PARAMS 'num_slices' and clarify the decoding modes in the documentation. On SLICE_BASED mode, a single slice is expected per OUTPUT buffer, and therefore 'start_byte_offset' is not needed (since the offset to the slice is the start of the buffer). This mode requires the use of CAPTURE buffer holding, and so the number of slices shall not be required. On FRAME_BASED mode, the devices are expected to take care of slice parsing. Neither SLICE_PARAMS are required (and shouldn't be exposed by frame-based drivers), nor the number of slices. Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com> Tested-by: Jonas Karlman <jonas@kwiboo.se> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01media: uapi: h264: Drop SLICE_PARAMS 'size' fieldEzequiel Garcia
The SLICE_PARAMS control is intended for slice-based devices. In this mode, the OUTPUT buffer contains a single slice, and so the buffer's plane payload size can be used to query the slice size. To reduce the API surface drop the size from the SLICE_PARAMS control. A follow-up change will remove other members in SLICE_PARAMS so we don't need to add padding fields here. Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com> Tested-by: Jonas Karlman <jonas@kwiboo.se> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01media: uapi: h264: Increase size of DPB entry pic_numEzequiel Garcia
DPB entry PicNum maximum value is 2*MaxFrameNum for interlaced content (field_pic_flag=1). As specified, MaxFrameNum is 2^(log2_max_frame_num_minus4 + 4) and log2_max_frame_num_minus4 is in the range of 0 to 12, which means pic_num should be a 32-bit field. The v4l2_h264_dpb_entry struct needs to be padded to avoid a hole, which might be also useful to allow future uAPI extensions. Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com> Tested-by: Jonas Karlman <jonas@kwiboo.se> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01media: uapi: h264: Clean DPB entry interfaceEzequiel Garcia
As discussed recently, the current interface for the Decoded Picture Buffer is not enough to properly support field coding. This commit introduces enough semantics to support frame and field coding, and to signal how DPB entries are "used for reference". Reserved fields will be added by a follow-up commit. Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com> Tested-by: Jonas Karlman <jonas@kwiboo.se> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01media: uapi: h264: Increase size of 'first_mb_in_slice' fieldEzequiel Garcia
Slice header syntax element 'first_mb_in_slice' can point to the last macroblock, currently the field can only reference 65536 macroblocks which is insufficient for 8K videos. Although unlikely, a 8192x4320 video (where macroblocks are 16x16), would contain 138240 macroblocks on a frame. As per the H264 specification, 'first_mb_in_slice' can be up to PicSizeInMbs - 1, so increase the size of the field to 32-bits. Note that v4l2_ctrl_h264_slice_params struct will be modified in a follow-up commit, and so we defer its 64-bit padding. Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com> Tested-by: Jonas Karlman <jonas@kwiboo.se> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01media: uapi: h264: Split prediction weight parametersEzequiel Garcia
The prediction weight parameters are only required under certain conditions, which depend on slice header parameters. As specified in section 7.3.3 Slice header syntax, the prediction weight table is present if: ((weighted_pred_flag && (slice_type == P || slice_type == SP)) || \ (weighted_bipred_idc == 1 && slice_type == B)) Given its size, it makes sense to move this table to its control, so applications can avoid passing it if the slice doesn't specify it. Before this change struct v4l2_ctrl_h264_slice_params was 960 bytes. With this change, it's 188 bytes and struct v4l2_ctrl_h264_pred_weight is 772 bytes. Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com> Tested-by: Jonas Karlman <jonas@kwiboo.se> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01media: uapi: h264: Update reference listsJernej Skrabec
When dealing with interlaced frames, reference lists must tell if each particular reference is meant for top or bottom field. This info is currently not provided at all in the H264 related controls. Change reference lists to hold a structure, which specifies an index into the DPB array and the field/frame specification for the picture. Currently the only user of these lists is Cedrus which is just compile fixed here. Actual usage of will come in a following commit. Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net> Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com> Tested-by: Jonas Karlman <jonas@kwiboo.se> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01media: cec: no need to check return value of debugfs_create functionsGreg Kroah-Hartman
When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01HID: core: Sanitize event code and type when mapping inputMarc Zyngier
When calling into hid_map_usage(), the passed event code is blindly stored as is, even if it doesn't fit in the associated bitmap. This event code can come from a variety of sources, including devices masquerading as input devices, only a bit more "programmable". Instead of taking the event code at face value, check that it actually fits the corresponding bitmap, and if it doesn't: - spit out a warning so that we know which device is acting up - NULLify the bitmap pointer so that we catch unexpected uses Code paths that can make use of untrusted inputs can now check that the mapping was indeed correct and bail out if not. Cc: stable@vger.kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Benjamin Tissoires <benjamin.tissoires@gmail.com>
2020-09-01tracepoint: Optimize using static_call()Steven Rostedt (VMware)
Currently the tracepoint site will iterate a vector and issue indirect calls to however many handlers are registered (ie. the vector is long). Using static_call() it is possible to optimize this for the common case of only having a single handler registered. In this case the static_call() can directly call this handler. Otherwise, if the vector is longer than 1, call a function that iterates the whole vector like the current code. [peterz: updated to new interface] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20200818135805.279421092@infradead.org
2020-09-01static_call: Allow early initPeter Zijlstra
In order to use static_call() to wire up x86_pmu, we need to initialize earlier, specifically before memory allocation works; copy some of the tricks from jump_label to enable this. Primarily we overload key->next to store a sites pointer when there are no modules, this avoids having to use kmalloc() to initialize the sites and allows us to run much earlier. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: https://lore.kernel.org/r/20200818135805.220737930@infradead.org
2020-09-01static_call: Handle tail-callsPeter Zijlstra
GCC can turn our static_call(name)(args...) into a tail call, in which case we get a JMP.d32 into the trampoline (which then does a further tail-call). Teach objtool to recognise and mark these in .static_call_sites and adjust the code patching to deal with this. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20200818135805.101186767@infradead.org
2020-09-01static_call: Add static_call_cond()Peter Zijlstra
Extend the static_call infrastructure to optimize the following common pattern: if (func_ptr) func_ptr(args...) For the trampoline (which is in effect a tail-call), we patch the JMP.d32 into a RET, which then directly consumes the trampoline call. For the in-line sites we replace the CALL with a NOP5. NOTE: this is 'obviously' limited to functions with a 'void' return type. NOTE: DEFINE_STATIC_COND_CALL() only requires a typename, as opposed to a full function. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20200818135805.042977182@infradead.org
2020-09-01x86/static_call: Add inline static call implementation for x86-64Josh Poimboeuf
Add the inline static call implementation for x86-64. The generated code is identical to the out-of-line case, except we move the trampoline into it's own section. Objtool uses the trampoline naming convention to detect all the call sites. It then annotates those call sites in the .static_call_sites section. During boot (and module init), the call sites are patched to call directly into the destination function. The temporary trampoline is then no longer used. [peterz: merged trampolines, put trampoline in section] Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20200818135804.864271425@infradead.org
2020-09-01static_call: Avoid kprobes on inline static_call()sPeter Zijlstra
Similar to how we disallow kprobes on any other dynamic text (ftrace/jump_label) also disallow kprobes on inline static_call()s. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20200818135804.744920586@infradead.org
2020-09-01static_call: Add inline static call infrastructureJosh Poimboeuf
Add infrastructure for an arch-specific CONFIG_HAVE_STATIC_CALL_INLINE option, which is a faster version of CONFIG_HAVE_STATIC_CALL. At runtime, the static call sites are patched directly, rather than using the out-of-line trampolines. Compared to out-of-line static calls, the performance benefits are more modest, but still measurable. Steven Rostedt did some tracepoint measurements: https://lkml.kernel.org/r/20181126155405.72b4f718@gandalf.local.home This code is heavily inspired by the jump label code (aka "static jumps"), as some of the concepts are very similar. For more details, see the comments in include/linux/static_call.h. [peterz: simplified interface; merged trampolines] Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20200818135804.684334440@infradead.org
2020-09-01static_call: Add basic static call infrastructureJosh Poimboeuf
Static calls are a replacement for global function pointers. They use code patching to allow direct calls to be used instead of indirect calls. They give the flexibility of function pointers, but with improved performance. This is especially important for cases where retpolines would otherwise be used, as retpolines can significantly impact performance. The concept and code are an extension of previous work done by Ard Biesheuvel and Steven Rostedt: https://lkml.kernel.org/r/20181005081333.15018-1-ard.biesheuvel@linaro.org https://lkml.kernel.org/r/20181006015110.653946300@goodmis.org There are two implementations, depending on arch support: 1) out-of-line: patched trampolines (CONFIG_HAVE_STATIC_CALL) 2) basic function pointers For more details, see the comments in include/linux/static_call.h. [peterz: simplified interface] Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20200818135804.623259796@infradead.org