Fix the size computation of the htile buffer.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We need to hold bdev->fence_lock while grabbing a reference to
the fence, to prevent concurrent clearing/changing of the
ttm_bo->sync_obj field.
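A minimal sketch of the pattern this change enforces (not the exact radeon/ttm code); sync_obj and the driver's sync_obj_ref() hook are the TTM fencing interface of this era:

  static void *grab_fence_ref(struct ttm_buffer_object *bo)
  {
          struct ttm_bo_device *bdev = bo->bdev;
          void *fence = NULL;

          spin_lock(&bdev->fence_lock);
          if (bo->sync_obj)
                  fence = bdev->driver->sync_obj_ref(bo->sync_obj);
          spin_unlock(&bdev->fence_lock);

          return fence;
  }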
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
With the new per-crtc locking, multiple set-cursor calls could happen
in parallel. Out of sheer paranoia I've opted for an irqsave spinlock.
But if there's indeed an access from interrupt contexts to these regs
it's already broken with the old code, so this can likely just be
reduced to a normal spinlock. On the other hand, the pageflip completion
happens from the vblank irq handler ...
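A minimal sketch of the paranoid variant; the lock and the register name here are placeholders, not the actual radeon identifiers:

  static void set_cursor_pos(struct radeon_device *rdev,
                             struct radeon_crtc *radeon_crtc, u32 x, u32 y)
  {
          unsigned long flags;

          spin_lock_irqsave(&rdev->cursor_lock, flags);   /* hypothetical lock */
          WREG32(CUR_POSITION + radeon_crtc->crtc_offset, /* hypothetical reg  */
                 (x << 16) | y);
          spin_unlock_irqrestore(&rdev->cursor_lock, flags);
  }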
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Just refactoring to make the next patch simpler. Now all indirect register
access in the new modesetting driver should go through the r100_mm_(w|r)reg
functions.
RADEON_READ_MM from the old driver seems to be totally unused, so just kill
it.
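Roughly the shape of the indirect access path, as a hedged sketch rather than the exact code: registers beyond the directly mapped MMIO window go through the MM_INDEX/MM_DATA register pair.

  u32 r100_mm_rreg(struct radeon_device *rdev, uint32_t reg)
  {
          if (reg < rdev->rmmio_size)
                  return readl(rdev->rmmio + reg);
          /* indirect access for registers outside the mapped window */
          writel(reg, rdev->rmmio + RADEON_MM_INDEX);
          return readl(rdev->rmmio + RADEON_MM_DATA);
  }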
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The DMA ring can't write to registers, so it has to write its fence
value to memory. This ensures that the fence driver doesn't try to use
a scratch register for the DMA ring.
Should fix:
https://bugs.freedesktop.org/show_bug.cgi?id=58166
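A hedged sketch of the idea only (the helper and offset macro are hypothetical, this is not the actual fence-driver code): a ring that cannot write MMIO registers gets its fence slot in the GPU-visible writeback page instead of a scratch register.

  static int fence_pick_storage(struct radeon_device *rdev, int ring)
  {
          struct radeon_fence_driver *drv = &rdev->fence_drv[ring];

          if (ring_is_dma(rdev, ring)) {                  /* hypothetical check  */
                  u32 off = FENCE_WB_OFFSET(ring);        /* hypothetical offset */
                  drv->scratch_reg = 0;
                  drv->cpu_addr = &rdev->wb.wb[off / 4];
                  drv->gpu_addr = rdev->wb.gpu_addr + off;
                  return 0;
          }
          return radeon_scratch_get(rdev, &drv->scratch_reg);
  }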
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
The placement chosen at bo creation is where the bo will stay. Instead of
trying to move the bo at each command stream, leave this work to another
worker thread that will use a more advanced heuristic.
agd5f: remove leftover unused variable
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Alex writes:
"adds support for the
asynchronous DMA engines on r6xx-SI. These engines are used
for ttm bo moves and VM page table updates currently. They
could also be exposed via the CS ioctl for userspace use,
but I haven't had a chance to add proper CS checker patches
for them yet. These patches have been tested extensively
internally for months, so they should be pretty solid."
* 'drm-next-3.8' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: use DMA engine for VM page table updates on SI
drm/radeon: add dma engine support for vm pt updates on si (v2)
drm/radeon: use DMA engine for VM page table updates on cayman/TN
drm/radeon: add dma engine support for vm pt updates on ni (v5)
drm/radeon: use async dma for ttm buffer moves on 6xx-SI
drm/radeon/kms: add support for dma rings to radeon_test_moves()
drm/radeon/kms: Add initial support for async DMA on SI
drm/radeon/kms: Add initial support for async DMA on cayman/TN
drm/radeon/kms: Add initial support for async DMA on evergreen
drm/radeon/kms: Add initial support for async DMA on r6xx/r7xx
DMA engine has special packets to facilitate this and it also keeps
the 3D engine free for other things.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
DMA engine has special packets to facilitate this and it also keeps
the 3D engine free for other things.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Async DMA has a special packet for contiguous pt updates
which saves overhead.
v2: leave the CP method enabled for now as doing the updates
in the DMA rings is not working properly yet.
v3: update for 2 level pts
v4: rebase
v5: drop pte/pde packet. doesn't seem to work on NI.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There are 2 async DMA engines on cayman, one at 0xd000 and
one at 0xd800. The programming interface is the same as
evergreen; however, there are some changes to the commands
for using vmids.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Pretty similar to 6xx/7xx, except the count field in the packet header
and the max IB size have increased.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
All items on the lru list are always reservable, so this is a stupid
thing to keep. Not only that, it is used in a way which would
guarantee deadlocks if it were ever to be set to block on reserve.
This is a lot of churn, but mostly because of the removal of the
argument, which is passed down call chains nested arbitrarily deeply in
many places.
No code change in this patch except the removal of the no_wait_reserve
argument; the previous patch removed its use.
v2:
- Warn if -EBUSY is returned on reservation, all objects on the list
should be reservable. Adjusted patch slightly due to conflicts.
v3:
- Focus on no_wait_reserve removal only.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Replace the goto loop with a simple for each loop, and only run the
delayed destroy cleanup if we can reserve the buffer first.
No race occurs, since lru lock is never dropped any more. An empty list
and a list full of unreservable buffers both cause -EBUSY to be returned,
which is identical to the previous situation, because previously buffers
on the lru list were always guaranteed to be reservable.
This should work since ttm currently guarantees that items on the lru are
always reservable, and a blocking reserve while holding some other bo
is already enough to run into a deadlock.
Currently this is not a concern since removal from the lru list and
reservations are always done atomically, but when this guarantee
no longer holds, we have to handle this situation or end up with
possible deadlocks.
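A hedged sketch of the reworked scan (simplified, using a non-blocking reserve in the spirit of ttm_bo_reserve_locked(); not the exact TTM code):

  struct ttm_buffer_object *bo;
  bool found = false;

  /* glob = bo global state, bdev = the ttm_bo_device; lru_lock stays held */
  spin_lock(&glob->lru_lock);
  list_for_each_entry(bo, &bdev->ddestroy, ddestroy) {
          if (!ttm_bo_reserve_locked(bo, false, true, false, 0)) {
                  found = true;     /* only clean up what we could reserve */
                  break;
          }
  }
  if (!found) {
          spin_unlock(&glob->lru_lock);
          return -EBUSY;            /* empty list or nothing reservable */
  }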
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Replace the while loop with a simple for each loop, and only run the
delayed destroy cleanup if we can reserve the buffer first.
No race occurs, since lru lock is never dropped any more. An empty list
and a list full of unreservable buffers both cause -EBUSY to be returned,
which is identical to the previous situation.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
By removing the unlocking of lru and retaking it immediately, a race is
removed where the bo is taken off the swap list or the lru list between
the unlock and relock. As such the cleanup_refs code can be simplified,
it will attempt to call ttm_bo_wait non-blockingly, and if it fails
it will drop the locks and perform a blocking wait, or return an error
if no_wait_gpu was set.
The need for looping is also eliminated: since swapout and evict_mem_first
always follow the destruction path, no new fence is allowed
to be attached. As far as I can see this may already have been the case,
but the unlocking/relocking required a complicated loop to deal with
re-reservation.
Changes since v1:
- Simplify no_wait_gpu case by folding it in with empty ddestroy.
- Hold a reservation while calling ttm_bo_cleanup_memtype_use again.
Changes since v2:
- Do not remove bo from lru list while waiting
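A hedged sketch of the wait logic described above (simplified, not the exact ttm_bo_cleanup_refs code; assumed to run with lru_lock and fence_lock held):

  ret = ttm_bo_wait(bo, false, false, true);    /* non-blocking check      */
  if (ret && no_wait_gpu) {
          ret = -EBUSY;                         /* caller asked not to block */
  } else if (ret) {
          void *sync_obj = bdev->driver->sync_obj_ref(bo->sync_obj);

          spin_unlock(&bdev->fence_lock);
          spin_unlock(&glob->lru_lock);

          bdev->driver->sync_obj_wait(sync_obj, false, interruptible);
          bdev->driver->sync_obj_unref(&sync_obj);
          /* retake the locks and re-validate the bo before the cleanup */
  }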
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The few places that care should have those checks instead.
This allows destruction of bo backed memory without a reservation.
It's required for being able to rework the delayed destroy path,
as it is no longer guaranteed to hold a reservation before unlocking.
However any previous wait is still guaranteed to complete, and it's
one of the last things to be done before the buffer object is freed.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This requires changing the order in ttm_bo_cleanup_refs_or_queue to
take the reservation first, as there is otherwise no race free way to
take lru lock before fence_lock.
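A hedged sketch of the resulting nesting only (not actual code):

  ret = ttm_bo_reserve(bo, false, true, false, 0);  /* 1. reservation first    */
  spin_lock(&glob->lru_lock);                        /* 2. then the LRU lock    */
  spin_lock(&bdev->fence_lock);                      /* 3. fence_lock innermost */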
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Alex writes:
Pretty minor -next pull request. We have some additional new bits waiting
internally for release. Hopefully Monday we can get at least some of
them out. The others will probably take a few more weeks.
Highlights of the current request:
- ELD registers for passing audio information to the sound hardware
- Handle GPUVM page faults more gracefully
- Misc fixes
Merge radeon test
* 'drm-next-3.8' of git://people.freedesktop.org/~agd5f/linux: (483 commits)
drm/radeon: bump driver version for new info ioctl requests
drm/radeon: fix eDP clk and lane setup for scaled modes
drm/radeon: add new INFO ioctl requests
drm/radeon/dce32+: use fractional fb dividers for high clocks
drm/radeon: use cached memory when evicting for vram on non agp
drm/radeon: add a CS flag END_OF_FRAME
drm/radeon: stop page faults from hanging the system (v2)
drm/radeon/dce4/5: add registers for ELD handling
drm/radeon/dce3.2: add registers for ELD handling
radeon: fix pll/ctrc mapping on dce2 and dce3 hardware
Linux 3.7-rc7
powerpc/eeh: Do not invalidate PE properly
Revert "drm/i915: enable rc6 on ilk again"
ALSA: hda - Fix build without CONFIG_PM
of/address: sparc: Declare of_iomap as an extern function for sparc again
PM / QoS: fix wrong error-checking condition
bnx2x: remove redundant warning log
vxlan: fix command usage in its doc
8139cp: revert "set ring address before enabling receiver"
MPI: Fix compilation on MIPS with GCC 4.4 and newer
...
Conflicts:
drivers/gpu/drm/exynos/exynos_drm_encoder.c
drivers/gpu/drm/exynos/exynos_drm_fbdev.c
drivers/gpu/drm/nouveau/core/engine/disp/nv50.c
Need to use the adjusted mode since we are sending native
timing and using the scaler for non-native modes.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
cc: stable@vger.kernel.org
Add requests to get the number of shader engines (SE) and
the number of SH per SE. These are needed for geometry
and tessellation shaders in the 3D driver as well as for setting
up PA_SC_RASTER_CONFIG on SI asics.
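A userspace-side sketch of how the new requests might be queried through libdrm; the request names RADEON_INFO_MAX_SE and RADEON_INFO_MAX_SH_PER_SE are assumed here, so check radeon_drm.h for the ones this patch actually defines.

  #include <stdint.h>
  #include <string.h>
  #include <xf86drm.h>
  #include <radeon_drm.h>   /* from libdrm */

  static int query_se_config(int fd, uint32_t *num_se, uint32_t *num_sh_per_se)
  {
          struct drm_radeon_info info;
          uint32_t value = 0;
          int ret;

          memset(&info, 0, sizeof(info));
          info.value = (uint64_t)(uintptr_t)&value;   /* kernel writes the result here */

          info.request = RADEON_INFO_MAX_SE;          /* assumed request name */
          ret = drmCommandWriteRead(fd, DRM_RADEON_INFO, &info, sizeof(info));
          if (ret)
                  return ret;
          *num_se = value;

          info.request = RADEON_INFO_MAX_SH_PER_SE;   /* assumed request name */
          ret = drmCommandWriteRead(fd, DRM_RADEON_INFO, &info, sizeof(info));
          if (ret)
                  return ret;
          *num_sh_per_se = value;
          return 0;
  }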
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Force the use of cached memory when evicting from vram on non-AGP
hardware, and force write-combine on AGP hardware. This is to ensure
the minimum cache-type change when allocating memory and to improve
memory eviction, especially on PCI/PCIe hardware.
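A hedged sketch of the intent only (not the actual radeon_evict_flags()/placement code):

  static u32 evict_caching_flags(struct radeon_device *rdev)
  {
          if (rdev->flags & RADEON_IS_AGP)
                  return TTM_PL_FLAG_WC | TTM_PL_FLAG_TT;     /* write-combined on AGP */
          return TTM_PL_FLAG_CACHED | TTM_PL_FLAG_TT;         /* cached on PCI/PCIe    */
  }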
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Redirect invalid memory accesses to the default page
instead of locking up the memory controller. Also
enable the invalid memory access interrupts and
start spamming the system log with them.
v2 (agd5f): fix up against 2 level PT changes
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
This patch adds code for composing AVI and AUI infoframes
and sending them every VSYNC.
This is important for HDMI certification.
v3:
- Moved enums, macros to exynos_hdmi.c.
- Corrected hex format.
- Added static to hdmi_reg_infoframe.
v2:
- Added a few blank lines.
- Corrected the comment format.
- Added comments for the two's-complement checksum calculation (see the
  sketch below).
v1:
- Removed unnecessary blank lines.
- Changed the case of hex constants.
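For reference, the two's-complement checksum mentioned in the v2 notes follows the generic HDMI infoframe rule; this is a hedged sketch, not the exynos_hdmi.c code itself:

  static u8 infoframe_checksum(const u8 *hdr, size_t hdr_len,
                               const u8 *data, size_t data_len)
  {
          u8 sum = 0;
          size_t i;

          for (i = 0; i < hdr_len; i++)
                  sum += hdr[i];
          for (i = 0; i < data_len; i++)
                  sum += data[i];

          /* checksum byte makes header + payload sum to 0 modulo 256 */
          return (u8)(0x100 - sum);
  }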
Signed-off-by: Rahul Sharma <rahul.sharma@samsung.com>
Signed-off-by: Fahad Kunnathadi <fahad.k@samsung.com>
Signed-off-by: Shirish S <s.shirish@samsung.com>
Acked-by: Seung-Woo Kim <sw0312.kim@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
devm_clk_get is device managed and makes error handling and exit code
simpler.
Also fixes an error where 'ret' could be returned without being
initialised with an error code.
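A generic before/after sketch of the conversion (the clock name here is illustrative, not taken from the exynos code):

  /* before: manual clk_put() needed on every error and remove path */
  ctx->clock = clk_get(dev, "sclk_fimd");
  if (IS_ERR(ctx->clock)) {
          ret = PTR_ERR(ctx->clock);
          goto err_clk_get;
  }

  /* after: managed, released automatically on driver detach */
  ctx->clock = devm_clk_get(dev, "sclk_fimd");
  if (IS_ERR(ctx->clock))
          return PTR_ERR(ctx->clock);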
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
devm_* functions are device managed and make error handling and exit code
simpler.
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
devm_clk_get is device managed and makes error handling and exit code
simpler.
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Pointer was being dereferenced after freeing.
Fixes the following error:
drivers/gpu/drm/exynos/exynos_drm_g2d.c:323 g2d_userptr_put_dma_addr() error:
dereferencing freed memory 'g2d_userptr'
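A hedged sketch of the fix pattern only (field names illustrative): make sure every dereference of g2d_userptr happens before it is freed.

  sg_free_table(g2d_userptr->sgt);
  kfree(g2d_userptr->sgt);
  /* any other teardown that still touches g2d_userptr goes here */
  kfree(g2d_userptr);     /* freed only after the last dereference */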
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
devm_clk_get is device managed and makes error handling and exit code
simpler.
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
The 'pages' structure in the exynos gem buffer has been
removed, so we now get fix.smem_start from the first sgl
of the scatter-gather table.
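A hedged one-liner of the idea (structure names approximate, not the exact fbdev code):

  fbi->fix.smem_start = (unsigned long)sg_dma_address(buffer->sgt->sgl);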
Signed-off-by: Prathyush K <prathyush.k@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
This patch preserves the display mode header during mode adjustment.
The display mode header was being overwritten with the adjusted mode header,
which triggered a stack dump.
Signed-off-by: Rahul Sharma <rahul.sharma@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
drm_get_edid() returns a pointer to an EDID block. The caller
is responsible for freeing this pointer.
Here the pointer gets assigned to the local variable raw_edid.
Therefore it should be freed before the variable goes out of
scope.
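A minimal sketch of the ownership rule described above:

  struct edid *edid = drm_get_edid(connector, adapter);

  if (edid) {
          /* consume the EDID, e.g. via drm_mode_connector_update_edid_property() */
          kfree(edid);    /* the caller owns the block returned by drm_get_edid() */
  }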
Signed-off-by: Egbert Eich <eich@suse.de>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Changelog v2:
Removed redundant check for invalid sgl.
Added a check for a valid page_offset at the beginning of exynos_drm_gem_map_buf.
Changelog v1:
The 'pages' structure is not required since we can use the 'sgt'. Even for
CONTIG buffers, an SGT is created (which will have just one sgl). This SGT
can be used during mmap instead of 'pages'. The 'page_size' element of the
structure is also not used anywhere and is removed.
This patch also fixes a memory leak where the 'pages' structure was being
allocated during gem buffer allocation but not freed during deallocation.
Signed-off-by: Prathyush K <prathyush.k@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Changelog v3:
Passing the actual buffer size instead of vm_size to dma_mmap_attrs.
Changelog v2:
Extracting the private data from the fb_info structure to obtain the exynos
gem buffer structure. Now, the dma address is obtained from the exynos gem
buffer structure and not from smem_start. Also calling dma_mmap_attrs
(instead of dma_mmap_writecombine) with the same attributes used
during allocation.
Changelog v1:
This patch adds an exynos drm specific implementation of fb_mmap
which supports mapping a non-contiguous buffer to user space
(see the sketch below).
This new function does not assume that the frame buffer is contiguous
and calls dma_mmap_writecombine for mapping the buffer to user space.
dma_mmap_writecombine is able to map a contiguous buffer as well as a
non-contiguous buffer, depending on whether an IOMMU mapping is created
for drm or not.
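A hedged sketch along the lines of the notes above (the buffer lookup helper is hypothetical, and in this kernel era dma_mmap_attrs() still takes a struct dma_attrs pointer):

  static int exynos_drm_fb_mmap(struct fb_info *info, struct vm_area_struct *vma)
  {
          struct exynos_drm_gem_buf *buffer = fb_to_gem_buf(info); /* hypothetical lookup */
          unsigned long vm_size = vma->vm_end - vma->vm_start;

          if (vm_size > buffer->size)
                  return -EINVAL;

          /* same attributes as used at allocation time; handles contiguous
           * and IOMMU-backed (non-contiguous) buffers alike */
          return dma_mmap_attrs(info->device, vma, buffer->kvaddr,
                                buffer->dma_addr, buffer->size,
                                &buffer->dma_attrs);
  }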
Signed-off-by: Prathyush K <prathyush.k@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Changelog v2:
fix a small performance issue in the previous patch.
- When a drm framebuffer is destroyed, make sure that overlay
data are updated to the real hardware for all encoders
instead of waiting for vblank on every page flip request.
For this, it adds a new function,
exynos_drm_encoder_complete_scanout.
Changelog v1:
This patch removes the wait_for_vblank call from
exynos_drm_encoder_plane_disable and moves it to
exynos_drm_encoder_plane_commit.
Disabling the dma channel for each plane doesn't need a vblank
signal to update data to the real hardware, but updating
overlay data to the real hardware does need the vblank signal.
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>