OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Eric Anholt	4c70ac7639	drm/vc4: Add a pad field to align drm_vc4_submit_cl to 64 bits. I had originally asked Stefan Schake to drop the pad field from the syncobj changes that just landed, because I couldn't come up with a reason to align to 64 bits. Talking with Dave Airlie about the new v3d driver's submit ioctl, we came up with a reason: sizeof() on 64-bit platforms may align to 64 bits, in which case the userspace will be submitting the aligned size and the final 32 bits won't be zero-padded by the kernel. If userspace doesn't zero-fill, then a future ABI change adding a 32-bit field at the end could potentially cause the kernel to read undefined data from old userspace (our userspace happens to use structure initialization that zero-fills, but as a general rule we try not to rely on that in the kernel). Signed-off-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20180430235927.28712-1-eric@anholt.net Reviewed-by: Stefan Schake <stschake@gmail.com>	2018-05-03 15:20:09 -07:00
Stefan Schake	e84fcb95e0	drm/vc4: Export fence through syncobj Allow specifying a syncobj on render job submission where we store the fence for the job. This gives userland flexible access to the fence. v2: Use 0 as invalid syncobj to drop flag (Eric) Don't reintroduce the padding (Eric) Signed-off-by: Stefan Schake <stschake@gmail.com> Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/1524607427-12876-3-git-send-email-stschake@gmail.com	2018-04-30 16:04:23 -07:00
Stefan Schake	818f5c8f4c	drm/vc4: Syncobj import support Allow userland to specify a syncobj that is waited on before a render job starts processing. v2: Use 0 as invalid syncobj to drop flag (Eric) Drop extra newline (Eric) Signed-off-by: Stefan Schake <stschake@gmail.com> Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/1524607427-12876-2-git-send-email-stschake@gmail.com	2018-04-30 16:04:14 -07:00
Boris Brezillon	65101d8c91	drm/vc4: Expose performance counters to userspace The V3D engine has various hardware counters which might be interesting to userspace performance analysis tools. Expose new ioctls to create/destroy a performance monitor object and query the counter values of this perfmance monitor. Note that a perfomance monitor is given an ID that is only valid on the file descriptor it has been allocated from. A performance monitor can be attached to a CL submission and the driver will enable HW counters for this request and update the performance monitor values at the end of the job. Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20180112090926.12538-1-boris.brezillon@free-electrons.com	2018-02-10 22:23:26 +00:00
Boris Brezillon	b9f19259b8	drm/vc4: Add the DRM_IOCTL_VC4_GEM_MADVISE ioctl This ioctl will allow us to purge inactive userspace buffers when the system is running out of contiguous memory. For now, the purge logic is rather dumb in that it does not try to release only the amount of BO needed to meet the last CMA alloc request but instead purges all objects placed in the purgeable pool as soon as we experience a CMA allocation failure. Note that the in-kernel BO cache is always purged before the purgeable cache because those objects are known to be unused while objects marked as purgeable by a userspace application/library might have to be restored when they are marked back as unpurgeable, which can be expensive. Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20171019125748.3152-1-boris.brezillon@free-electrons.com	2017-10-19 10:34:49 -07:00
Eric Anholt	3be8eddd9d	drm/vc4: Add exec flags to allow forcing a specific X/Y tile walk order. This is useful to allow GL to provide defined results for overlapping glBlitFramebuffer, which X11 in turn uses to accelerate uncomposited window movement without first blitting to a temporary. x11perf -copywinwin100 goes from 1850/sec to 4850/sec. v2: Default to the same behavior as before when the flags aren't passed. (suggested by Boris) Signed-off-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20170725162733.28007-2-eric@anholt.net Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>	2017-08-08 13:26:44 -07:00
Eric Anholt	f30994622b	drm/vc4: Add an ioctl for labeling GEM BOs for summary stats This has proven immensely useful for debugging memory leaks and overallocation (which is a rather serious concern on the platform, given that we typically run at about 256MB of CMA out of up to 1GB total memory, with framebuffers that are about 8MB ecah). The state of the art without this is to dump debug logs from every GL application, guess as to kernel allocations based on bo_stats, and try to merge that all together into a global picture of memory allocation state. With this, you can add a couple of calls to the debug build of the 3D driver and get a pretty detailed view of GPU memory usage from /debug/dri/0/bo_stats (or when we debug print to dmesg on allocation failure). The Mesa side currently labels at the gallium resource level (so you see that a 1920x20 pixmap has been created, presumably for the window system panel), but we could extend that to be even more useful with glObjectLabel() names being sent all the way down to the kernel. (partial) example of sorted debugfs output with Mesa labeling all resources: kernel BO cache: 16392kb BOs (3) tiling shadow 1920x1080: 8160kb BOs (1) resource 1920x1080@32/0: 8160kb BOs (1) scanout resource 1920x1080@32/0: 8100kb BOs (1) kernel: 8100kb BOs (1) v2: Use strndup_user(), use lockdep assertion instead of just a comment, fix an array[-1] reference, extend comment about name freeing. Signed-off-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20170725182718.31468-2-eric@anholt.net Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-07-28 16:04:53 -07:00
Eric Anholt	83753117f1	drm/vc4: Add get/set tiling ioctls. This allows mesa to set the tiling format for a BO and have that tiling format be respected by mesa on the other side of an import/export (and by vc4 scanout in the kernel), without defining a protocol to pass the tiling through userspace. Signed-off-by: Eric Anholt <eric@anholt.net> Link: http://patchwork.freedesktop.org/patch/msgid/20170608001336.12842-2-eric@anholt.net Acked-by: Dave Airlie <airlied@redhat.com>	2017-06-15 16:02:45 -07:00
Jonas Pfeil	c778cc5df9	drm/vc4: Add fragment shader threading support FS threading brings performance improvements of 0-20% in glmark2. The validation code checks for thread switch signals and ensures that the registers of the other thread are not touched, and that our clamps are not live across thread switches. It also checks that the threading and branching instructions do not interfere. (Original patch by Jonas, changes by anholt for style cleanup, removing validation the kernel doesn't need to do, and adding the flag for userspace). v2: Minor style fixes from checkpatch. Signed-off-by: Jonas Pfeil <pfeiljonas@gmx.de> Signed-off-by: Eric Anholt <eric@anholt.net>	2016-11-16 13:25:26 -08:00
Eric Anholt	7154d76fed	drm/vc4: Add support for rendering with ETC1 textures. The validation for it ends up being quite simple, but I hadn't got around to it before merging the driver. For backwards compatibility, we also need to add a flag so that the userspace GL driver can easily tell if the kernel will allow ETC1 textures (on an old kernel, it will continue to convert to RGBA8) Signed-off-by: Eric Anholt <eric@anholt.net>	2016-11-03 18:55:46 -07:00
Eric Anholt	7363cee5b4	drm/vc4: Add a getparam to signal support for branches. Userspace needs to know if it can create shaders that do branching. Otherwise, for backwards compatibility with old kernels it needs to lower if statements to conditional assignments. Signed-off-by: Eric Anholt <eric@anholt.net>	2016-07-15 15:19:51 -07:00
Eric Anholt	af713795c5	drm/vc4: Add a getparam ioctl for getting the V3D identity regs. As I extend the driver to support different V3D revisions, userspace needs to know what version it's targeting. This is most easily detected using the V3D identity registers. v2: Make sure V3D is runtime PM on when reading the registers. v3: Switch to a 64-bit param value (suggested by Rob Clark in review) Signed-off-by: Eric Anholt <eric@anholt.net> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v2) Reviewed-by: Rob Clark <robdclark@gmail.com> (v3, over irc)	2016-07-14 08:08:35 -07:00
Emil Velikov	6a982350f8	drm/vc4: add extern C guard for the UAPI header Cc: Eric Anholt <eric@anholt.net> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-13 14:06:18 +01:00
Eric Anholt	214613656b	drm/vc4: Add an interface for capturing the GPU state after a hang. This can be parsed with vc4-gpu-tools tools for trying to figure out what was going on. v2: Use __u32-style types. Signed-off-by: Eric Anholt <eric@anholt.net>	2015-12-07 20:49:49 -08:00
Eric Anholt	d5b1a78a77	drm/vc4: Add support for drawing 3D frames. The user submission is basically a pointer to a command list and a pointer to uniforms. We copy those in to the kernel, validate and relocate them, and store the result in a GPU BO which we queue for execution. v2: Drop support for NV shader recs (not necessary for GL), simplify vc4_use_bo(), improve bin flush/semaphore checks, use __u32 style types. Signed-off-by: Eric Anholt <eric@anholt.net>	2015-12-07 20:05:10 -08:00
Eric Anholt	463873d570	drm/vc4: Add an API for creating GPU shaders in GEM BOs. Since we have no MMU, the kernel needs to validate that the submitted shader code won't make any accesses to memory that the user doesn't control, which involves banning some operations (general purpose DMA writes), and tracking where we need to write out pointers for other operations (texture sampling). Once it's validated, we return a GEM BO containing the shader, which doesn't allow mapping for write or exporting to other subsystems. v2: Use __u32-style types. Signed-off-by: Eric Anholt <eric@anholt.net>	2015-12-07 20:05:09 -08:00
Eric Anholt	d5bc60f6ad	drm/vc4: Add create and map BO ioctls. While there exist dumb APIs for creating and mapping BOs, one of the rules is that drivers doing 3D acceleration have to provide their own APIs for buffer allocation (besides, the pitch/height parameters of the dumb alloc don't really make sense for a lot of 3D allocations). v2: Use __u32-style types, use "drm.h" instead of <drm/drm.h>. Signed-off-by: Eric Anholt <eric@anholt.net>	2015-12-07 20:04:57 -08:00

17 Commits