OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Christian König	ce64bc25ef	drm/amdgpu: pipeline evictions as well This boosts Xonotic from 38fps to 47fps when artificially limiting VRAM to 256MB for testing. It should improve all CPU bound rendering situations where we have a lot of swapping to/from VRAM. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-07-07 14:54:42 -04:00
Christian König	74561cd4f1	drm/ttm: remove no_gpu_wait param from ttm_bo_move_accel_cleanup It isn't used and not waiting for the GPU after scheduling a move is actually quite dangerous. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-07-07 14:54:39 -04:00
Christian König	99c44632d4	drm/amdgpu: remove pre move wait Not needed any more. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-07-07 14:54:38 -04:00
Christian König	77dfc28bad	drm/ttm: wait for BO idle in ttm_bo_move_memcpy When we want to pipeline accelerated moves we need to wait in the fallback path. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-07-07 14:54:35 -04:00
Christian König	88932a7be2	drm/ttm: add wait for idle in all drivers bo_move functions Wait for idle before moving the BO in all drivers implementing an accelerated move function. This should keep the current behavior when removing the pre move wait. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-07-07 14:54:35 -04:00
Dave Airlie	bafb86f5bc	Linux 4.6-rc7 Merge tag 'v4.6-rc7' into drm-next Merge this back as we've built up a fair few conflicts, and I have some newer trees to pull in.	2016-05-09 13:49:56 +10:00
Christian König	29b3259a3a	drm/amdgpu: group BOs by log2 of the size on the LRU v2 This allows us to have small BOs on the LRU before big ones. v2: fix off-by-one and list corruption bug Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-05-04 20:23:08 -04:00
Christian König	98c2872ae9	drm/ttm: implement LRU add callbacks v2 This allows fine grained control for the driver where to add a BO into the LRU. v2: fix typo in comment Reviewed-by: Sinclair Yeh <syeh@vmware.com> Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-05-04 20:21:38 -04:00
Nils Wallménius	06ab6832ac	drm/amdgpu: Mark all instances of struct drm_info_list as const All these are compile time constant and the drm_debugfs_create/remove_files functions take a const pointer argument. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Nils Wallménius <nils.wallmenius@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-05-04 20:20:10 -04:00
Christian König	a1d29476d6	drm/amdgpu: optionally enable GART debugfs file Keeping the pages array around can use a lot of system memory when you want a large GART. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-05-02 15:26:57 -04:00
Jérôme Glisse	054892ed6a	drm/amdgpu: forbid mapping of userptr bo through radeon device file Allowing userptr bo which are basically a list of pages from some vma (so either anonymous page or file backed page) would lead to serious corruption of kernel structures and counters (because we overwrite the page->mapping field when mapping buffer). This will already block if the buffer was populated before anyone does try to mmap it because then TTM_PAGE_FLAG_SG would be set in the ttm_tt flags. But that flag is checked before ttm_tt_populate in the ttm vm fault handler. So to be safe just add a check to verify_access() callback. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-04-21 20:03:47 -04:00
Linus Torvalds	4a2d057e4f	Merge branch 'PAGE_CACHE_SIZE-removal' Merge PAGE_CACHE_SIZE removal patches from Kirill Shutemov: "PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced long time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And unlikely will. Let's stop pretending that pages in page cache are special. They are not. The first patch with most changes has been done with coccinelle. The second is manual fixups on top. The third patch removes macros definition" [ I was planning to apply this just before rc2, but then I spaced out, so here it is right _after_ rc2 instead. As Kirill suggested as a possibility, I could have decided to only merge the first two patches, and leave the old interfaces for compatibility, but I'd rather get it all done and any out-of-tree modules and patches can trivially do the converstion while still also working with older kernels, so there is little reason to try to maintain the redundant legacy model. - Linus ] * PAGE_CACHE_SIZE-removal: mm: drop PAGE_CACHE_* and page_cache_{get,release} definition mm, fs: remove remaining PAGE_CACHE_* and page_cache_{get,release} usage mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros	2016-04-04 10:50:24 -07:00
Kirill A. Shutemov	09cbfeaf1a	mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced long time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And unlikely will. We have many places where PAGE_CACHE_SIZE assumed to be equal to PAGE_SIZE. And it's constant source of confusion on whether PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm. Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much breakage to be doable. Let's stop pretending that pages in page cache are special. They are not. The changes are pretty straight-forward: - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN}; - page_cache_get() -> get_page(); - page_cache_release() -> put_page(); This patch contains automated changes generated with coccinelle using script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually. The only adjustment after coccinelle is revert of changes to PAGE_CACHE_ALIGN definition: we are going to drop it later. There are few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation also will be addressed with the separate patch. virtual patch @@ expression E; @@ - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ expression E; @@ - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ @@ - PAGE_CACHE_SHIFT + PAGE_SHIFT @@ @@ - PAGE_CACHE_SIZE + PAGE_SIZE @@ @@ - PAGE_CACHE_MASK + PAGE_MASK @@ expression E; @@ - PAGE_CACHE_ALIGN(E) + PAGE_ALIGN(E) @@ expression E; @@ - page_cache_get(E) + get_page(E) @@ expression E; @@ - page_cache_release(E) + put_page(E) Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-04-04 10:41:08 -07:00
Dave Airlie	2f4fcb3eaf	Merge branch 'drm-next-4.6' of git://people.freedesktop.org/~agd5f/linux into drm-fixes Just a few fixes for 4.6 this week: - Add some SI DPM quirks - Improve the ACP Kconfig text - Additional BO pinning checks * 'drm-next-4.6' of git://people.freedesktop.org/~agd5f/linux: drm/amdgpu: Don't move pinned BOs drm/radeon: Don't move pinned BOs drm/radeon: add a dpm quirk for all R7 370 parts drm/radeon: add another R7 370 quirk drm/radeon: add a dpm quirk for sapphire Dual-X R7 370 2G D5 drm/amd: Beef up ACP Kconfig menu text	2016-04-01 13:13:34 +10:00
Michel Dänzer	104ece9757	drm/amdgpu: Don't move pinned BOs The purpose of pinning is to prevent a buffer from moving. Reviewed-by: Christian König <christian.koenig@amd.com> Tested-by: Rex Zhu <Rex.Zhu@amd.com> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-03-28 11:55:38 -04:00
Linus Torvalds	266c73b777	Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux Pull drm updates from Dave Airlie: "This is the main drm pull request for 4.6 kernel. Overall the coolest thing here for me is the nouveau maxwell signed firmware support from NVidia, it's taken a long while to extract this from them. I also wish the ARM vendors just designed one set of display IP, ARM display block proliferation is definitely increasing. Core: - drm_event cleanups - Internal API cleanup making mode_fixup optional. - Apple GMUX vga switcheroo support. - DP AUX testing interface Panel: - Refactoring of DSI core for use over more transports. New driver: - ARM hdlcd driver i915: - FBC/PSR (framebuffer compression, panel self refresh) enabled by default. - Ongoing atomic display support work - Ongoing runtime PM work - Pixel clock limit checks - VBT DSI description support - GEM fixes - GuC firmware scheduler enhancements amdkfd: - Deferred probing fixes to avoid make file or link ordering. amdgpu/radeon: - ACP support for i2s audio support. - Command Submission/GPU scheduler/GPUVM optimisations - Initial GPU reset support for amdgpu vmwgfx: - Support for DX10 gen mipmaps - Pageflipping and other fixes. exynos: - Exynos5420 SoC support for FIMD - Exynos5422 SoC support for MIPI-DSI nouveau: - GM20x secure boot support - adds acceleration for Maxwell GPUs. - GM200 support - GM20B clock driver support - Power sensors work etnaviv: - Correctness fixes for GPU cache flushing - Better support for i.MX6 systems. imx-drm: - VBlank IRQ support - Fence support - OF endpoint support msm: - HDMI support for 8996 (snapdragon 820) - Adreno 430 support - Timestamp queries support virtio-gpu: - Fixes for Android support. rockchip: - Add support for Innosilicion HDMI rcar-du: - Support for 4 crtcs - R8A7795 support - RCar Gen 3 support omapdrm: - HDMI interlace output support - dma-buf import support - Refactoring to remove a lot of legacy code. 
tilcdc: - Rewrite of pageflipping code - dma-buf support - pinctrl support vc4: - HDMI modesetting bug fixes - Significant 3D performance improvement. fsl-dcu (FreeScale): - Lots of fixes tegra: - Two small fixes sti: - Atomic support for planes - Improved HDMI support" * 'drm-next' of git://people.freedesktop.org/~airlied/linux: (1063 commits) drm/amdgpu: release_pages requires linux/pagemap.h drm/sti: restore mode_fixup callback drm/amdgpu/gfx7: add MTYPE definition drm/amdgpu: removing BO_VAs shouldn't be interruptible drm/amd/powerplay: show uvd/vce power gate enablement for tonga. drm/amd/powerplay: show uvd/vce power gate info for fiji drm/amdgpu: use sched fence if possible drm/amdgpu: move ib.fence to job.fence drm/amdgpu: give a fence param to ib_free drm/amdgpu: include the right version of gmc header files for iceland drm/radeon: fix indentation. drm/amd/powerplay: add uvd/vce dpm enabling flag to fix the performance issue for CZ drm/amdgpu: switch back to 32bit hw fences v2 drm/amdgpu: remove amdgpu_fence_is_signaled drm/amdgpu: drop the extra fence range check v2 drm/amdgpu: signal fences directly in amdgpu_fence_process drm/amdgpu: cleanup amdgpu_fence_wait_empty v2 drm/amdgpu: keep all fences in an RCU protected array v2 drm/amdgpu: add number of hardware submissions to amdgpu_fence_driver_init_ring drm/amdgpu: RCU protected amd_sched_fence_release ...	2016-03-21 13:48:00 -07:00
Linus Torvalds	643ad15d47	Merge branch 'mm-pkeys-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 protection key support from Ingo Molnar: "This tree adds support for a new memory protection hardware feature that is available in upcoming Intel CPUs: 'protection keys' (pkeys). There's a background article at LWN.net: https://lwn.net/Articles/643797/ The gist is that protection keys allow the encoding of user-controllable permission masks in the pte. So instead of having a fixed protection mask in the pte (which needs a system call to change and works on a per page basis), the user can map a (handful of) protection mask variants and can change the masks runtime relatively cheaply, without having to change every single page in the affected virtual memory range. This allows the dynamic switching of the protection bits of large amounts of virtual memory, via user-space instructions. It also allows more precise control of MMU permission bits: for example the executable bit is separate from the read bit (see more about that below). This tree adds the MM infrastructure and low level x86 glue needed for that, plus it adds a high level API to make use of protection keys - if a user-space application calls: mmap(..., PROT_EXEC); or mprotect(ptr, sz, PROT_EXEC); (note PROT_EXEC-only, without PROT_READ/WRITE), the kernel will notice this special case, and will set a special protection key on this memory range. It also sets the appropriate bits in the Protection Keys User Rights (PKRU) register so that the memory becomes unreadable and unwritable. So using protection keys the kernel is able to implement 'true' PROT_EXEC on x86 CPUs: without protection keys PROT_EXEC implies PROT_READ as well. Unreadable executable mappings have security advantages: they cannot be read via information leaks to figure out ASLR details, nor can they be scanned for ROP gadgets - and they cannot be used by exploits for data purposes either. 
We know about no user-space code that relies on pure PROT_EXEC mappings today, but binary loaders could start making use of this new feature to map binaries and libraries in a more secure fashion. There is other pending pkeys work that offers more high level system call APIs to manage protection keys - but those are not part of this pull request. Right now there's a Kconfig that controls this feature (CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS) that is default enabled (like most x86 CPU feature enablement code that has no runtime overhead), but it's not user-configurable at the moment. If there's any serious problem with this then we can make it configurable and/or flip the default" * 'mm-pkeys-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits) x86/mm/pkeys: Fix mismerge of protection keys CPUID bits mm/pkeys: Fix siginfo ABI breakage caused by new u64 field x86/mm/pkeys: Fix access_error() denial of writes to write-only VMA mm/core, x86/mm/pkeys: Add execute-only protection keys support x86/mm/pkeys: Create an x86 arch_calc_vm_prot_bits() for VMA flags x86/mm/pkeys: Allow kernel to modify user pkey rights register x86/fpu: Allow setting of XSAVE state x86/mm: Factor out LDT init from context init mm/core, x86/mm/pkeys: Add arch_validate_pkey() mm/core, arch, powerpc: Pass a protection key in to calc_vm_flag_bits() x86/mm/pkeys: Actually enable Memory Protection Keys in the CPU x86/mm/pkeys: Add Kconfig prompt to existing config option x86/mm/pkeys: Dump pkey from VMA in /proc/pid/smaps x86/mm/pkeys: Dump PKRU with other kernel registers mm/core, x86/mm/pkeys: Differentiate instruction fetches x86/mm/pkeys: Optimize fault handling in access_error() mm/core: Do not enforce PKEY permissions on remote mm access um, pkeys: Add UML arch_*_access_permitted() methods mm/gup, x86/mm/pkeys: Check VMAs and PTEs for protection keys x86/mm/gup: Simplify get_user_pages() PTE bit handling ...	2016-03-20 19:08:56 -07:00
Dave Airlie	9f443bf53b	Merge branch 'drm-next-4.6' of git://people.freedesktop.org/~agd5f/linux into drm-next A few more fixes and cleanups for 4.6: - DCE code cleanups - HDP flush/invalidation fixes - GPUVM fixes - switch to drm_vblank_[on\|off] - PX fixes - misc bug fixes * 'drm-next-4.6' of git://people.freedesktop.org/~agd5f/linux: (50 commits) drm/amdgpu: split pipeline sync out of SDMA vm_flush() as well drm/amdgpu: Revert "add mutex for ba_va->valids/invalids" drm/amdgpu: Revert "add lock for interval tree in vm" drm/amdgpu: Revert "add spin lock to protect freed list in vm (v3)" drm/amdgpu: reserve the PD during unmap and remove drm/amdgpu: Fix two bugs in amdgpu_vm_bo_split_mapping drm/radeon: Don't drop DP 2.7 Ghz link setup on some cards. MAINTAINERS: update radeon entry to include amdgpu as well drm/amdgpu: disable runtime pm on PX laptops without dGPU power control drm/radeon: disable runtime pm on PX laptops without dGPU power control drm/amd/amdgpu: Fix indentation in do_set_base() (DCEv8) drm/amd/amdgpu: make afmt_init cleanup if alloc fails (DCEv8) drm/amd/amdgpu: Move config init flag to bottom of sw_init (DCEv8) drm/amd/amdgpu: Don't proceed into audio_fini if audio is disabled (DCEv8) drm/amd/amdgpu: Fix identation in do_set_base() (DCEv10) drm/amd/amdgpu: Make afmt_init cleanup if alloc fails (DCEv10) drm/amd/amdgpu: Move initialized flag to bottom of sw_init (DCEv10) drm/amd/amdgpu: Don't proceed in audio_fini if disabled (DCEv10) drm/amd/amdgpu: Fix indentation in dce_v11_0_crtc_do_set_base() drm/amd/amdgpu: Make afmt_init() cleanup if alloc fails (DCEv11) ...	2016-03-17 08:25:04 +10:00
Dave Airlie	9b61c0fcdf	Merge drm-fixes into drm-next. Nouveau wanted this to avoid some worse conflicts when I merge that.	2016-03-14 09:46:02 +10:00
Christian König	2f568dbd6b	drm/amdgpu: move get_user_pages out of amdgpu_ttm_tt_pin_userptr v6 That avoids lock inversion between the BO reservation lock and the anon_vma lock. v2: * Changed amdgpu_bo_list_entry.user_pages to an array of pointers * Lock mmap_sem only for get_user_pages * Added invalidation of unbound userpointer BOs * Fixed memory leak and page reference leak v3 (chk): * Revert locking mmap_sem only for_get user_pages * Revert adding invalidation of unbound userpointer BOs * Sanitize and fix error handling v4 (chk): * Init userpages pointer everywhere. * Fix error handling when get_user_pages() fails. * Add invalidation of unbound userpointer BOs again. v5 (chk): * Add maximum number of tries. v6 (chk): * Fix error handling when we run out of tries. Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> (v4) Acked-by: Alex Deucher <alexander.deucher@amd.com>	2016-03-08 11:01:50 -05:00
Christian König	637dd3b5ca	drm/amdgpu: prevent get_user_pages recursion Remember the tasks which are inside get_user_pages() and ignore MMU callbacks from there. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-03-08 11:01:46 -05:00
Rasmus Villemoes	09ccbb74b6	drm/amdgpu: use post-decrement in error handling We need to use post-decrement to get the pci_map_page undone also for i==0, and to avoid some very unpleasant behaviour if pci_map_page failed already at i==0. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2016-02-16 10:05:38 -05:00
Dave Hansen	d4edcf0d56	mm/gup: Switch all callers of get_user_pages() to not pass tsk/mm We will soon modify the vanilla get_user_pages() so it can no longer be used on mm/tasks other than 'current/current->mm', which is by far the most common way it is called. For now, we allow the old-style calls, but warn when they are used. (implemented in previous patch) This patch switches all callers of: get_user_pages() get_user_pages_unlocked() get_user_pages_locked() to stop passing tsk/mm so they will no longer see the warnings. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: jack@suse.cz Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20160212210156.113E9407@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-02-16 10:11:12 +01:00
Christian König	703297c1fe	drm/amdgpu: use separate scheduler entity for buffer moves This allows us to remove the global kernel context. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-02-12 15:39:07 -05:00
Christian König	2bd9ccfa75	drm/amdgpu: use per VM entity for page table updates (v2) Updates from different VMs can be processed independently. v2: agd: rebase on upstream Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-02-12 15:35:16 -05:00
Christian König	e86f9ceee1	drm/amdgpu: move sync into job object No need to keep that for every IB. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-02-10 14:17:24 -05:00
Christian König	d71518b5aa	drm/amdgpu: cleanup in kernel job submission Add a job_alloc_with_ib helper and proper job submission. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-02-10 14:17:22 -05:00
Christian König	b07c60c065	drm/amdgpu: move ring from IBs into job We can't submit to multiple rings at the same time anyway. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-02-10 14:17:20 -05:00
Christian König	9e5d53094c	drm/amdgpu: make pad_ib a ring function v3 The padding depends on the firmware version and we need that for BO moves as well, not only for VM updates. v2: new approach of making pad_ib a ring function v3: fix typo in macro name Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-02-10 14:17:20 -05:00
Christian König	cc325d1913	drm/amdgpu: check userptrs mm earlier Instead of when we try to bind it check the usermm when we try to use it in the IOCTLs. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2016-02-10 14:17:16 -05:00
Chunming Zhou	cadf97b196	drm/amdgpu: clean up non-scheduler code path (v2) Non-scheduler code is no longer supported. v2: agd: rebased on upstream Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Ken Wang <Qingqing.Wang@amd.com> Reviewed-by: Monk Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-02-10 14:16:50 -05:00
Christian König	d7006964d4	drm/amdgpu: fix issue with overlapping userptrs Otherwise we could try to evict overlapping userptr BOs in get_user_pages(), leading to a possible circular locking dependency. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2016-02-10 14:16:43 -05:00
Christian König	cc1de6e800	drm/amdgpu: fix issue with overlapping userptrs Otherwise we could try to evict overlapping userptr BOs in get_user_pages(), leading to a possible circular locking dependency. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2016-02-10 14:07:52 -05:00
Ken Wang	8f3c162961	drm/amdgpu: The VI specific EXE bit should only apply to GMC v8.0 above Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Ken Wang <Qingqing.Wang@amd.com> Cc: stable@vger.kernel.org	2016-02-02 22:54:11 -05:00
Christian König	6d99905a8c	drm/amdgpu: set snooped flags only on system addresses v2 Not necessary for VRAM. v2: no need to check if ttm is NULL. Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2015-12-04 12:31:46 -05:00
Chunming Zhou	e2f784fa8a	drm/amdgpu: add err check for pin userptr Missing error check if the operation failed. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-12-02 15:03:54 -05:00
Arnd Bergmann	e1b35f6103	drm/amdgpu: fix seq_printf format string The amdgpu driver has a debugfs interface that shows the amount of VRAM in use, but the newly added code causes a build error on all 32-bit architectures: drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c:1076:17: warning: format '%lu' expects argument of type 'long unsigned int', but argument 4 has type 'long long int' [-Wformat=] This fixes the format string to use "%llu" for printing 64-bit numbers, which works everywhere, as long as we also cast to 'u64'. Unlike atomic64_t, u64 is defined as 'unsigned long long' on all architectures. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: `a2ef8a9749` ("drm/amdgpu: add vram usage into debugfs") Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2015-11-16 11:05:55 -05:00
Christian König	7a91d6cb3c	drm/amdgpu: remove AMDGPU_FENCE_OWNER_MOVE Moves are exclusive operations anyway, just use the undefined owner for those. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-11-04 12:29:23 -05:00
Chunming Zhou	a2ef8a9749	drm/amdgpu: add vram usage into debugfs Signed-off-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-10-07 23:48:22 -04:00
Christian König	72d7668b5b	drm/amdgpu: export reservation_object from dmabuf to ttm (v2) Adds an extra argument to amdgpu_bo_create, which is only used in amdgpu_prime.c. Port of radeon commit `831b6966a6`. v2: fix up kfd. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-09-23 17:23:34 -04:00
Alex Deucher	857d913d05	drm/amdgpu: be explicit about cpu vram access for driver BOs (v2) For kernel driver BOs, be explicit about whether we need vram access up front. This avoids unnecessary migrations and avoids using visible vram for buffers where it's not needed. v2: line wrap fixes Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2015-09-03 10:29:32 -04:00
Chunming Zhou	c7ae72c01b	drm/amdgpu: use IB for copy buffer of eviction This aids handling buffers moves with the scheduler. Signed-off-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-08-26 17:50:42 -04:00
Chunming Zhou	9066b0c318	drm/amdgpu: fix no sync_wait in copy_buffer when eviction is happening, if don't handle dependency, then the fence could be dead off. Signed-off-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-08-25 10:53:48 -04:00
Chunming Zhou	4ce9891ee1	drm/amdgpu: improve sa_bo->fence by kernel fence Signed-off-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-08-25 10:38:41 -04:00
Maninder Singh	5f0b34cc72	drm/amdgpu: use kzalloc for allocating one thing Use kzalloc rather than kcalloc(1.. for allocating one thing. Signed-off-by: Maninder Singh <maninder1.s@samsung.com> Reviewed-by: Vaneet Narang <v.narang@samsung.com> Reviewed-by: Christian Konig <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2015-06-29 11:21:50 -04:00
Christian König	e176fe176d	drm/amdgpu: remove mclk_lock Not needed any more. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-06-03 21:03:58 -04:00
monk.liu	dd08fae1e9	drm/amdgpu: fix userptr BO unpin bug (v2) sg could point to an array of contiguous page*; freeing only one page could lead to a memory leak. v2: use iterator Signed-off-by: monk.liu <monk.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2015-06-03 21:03:25 -04:00
Alex Deucher	d38ceaf99e	drm/amdgpu: add core driver (v4) This adds the non-asic specific core driver code. v2: remove extra kconfig option v3: implement minor fixes from Fengguang Wu v4: fix cast in amdgpu_ucode.c Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Jammy Zhou <Jammy.Zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2015-06-03 21:03:15 -04:00

98 Commits