On systems using kexec, the new kernel is booted straight from the old kernel, without any warning to the graphics driver. So the GPU is basically left as-is in a running state, however the CPU side is completly reset.
Without stating the saneness of anyone using kexec on live systems, we should at least try not to crash the GPU. This patch resets 3 registers to 0 that could cause bad things to happen to the running system.
This allows kexec to work on a Power6/RN50 system.
Signed-off-by: Dave Airlie <airlied@redhat.com>
- Make the logic in r100_pll_errata_after_index() match the other
errata functions
- Use rdev->family rather than rdev->flags & RADEON_FAMILY_MASK for kms
- replace rn50 check using ids with ASIC_IS_RN50 convenience macro
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
We only add/remove crtcs at driver load, you cannot remove when
the GPU is running a CS packet since the fd is open, when
GPU hotplugging on radeons actually is needed all this locking
needs a review and I've started re-working kms core locking to deal
with this better. But for now avoid long delays in CS processing when
hotplug detect is happening in a different thread.
this fixes a regression introduced with hotplug detection.
Signed-off-by: Dave Airlie <airlied@redhat.com>
The asics in question have the following requirements with regard to
their gart setups:
1. The GART aperture size has to be in the form of 2^X bytes, where X is from 25 to 31
2. The GART aperture MC base has to be aligned to a boundary equal to the size of the
aperture.
3. The GART page table has to be aligned to the boundary equal to the size of the table.
4. The GART page table size is: table_entry_size * (aperture_size / page_size)
5. The GART page table has to be allocated in non-paged, non-cached, contiguous system
memory.
This patch takes care 2. The rest should already be handled properly.
This fixes a regression noticed by: Torsten Kaiser <just.for.lkml@googlemail.com>
Tested-by: Torsten Kaiser <just.for.lkml@googlemail.com>
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
* drm-platform:
drm: Make sure the DRM offset matches the CPU
drm: Add __arm defines to DRM
drm: Add support for platform devices to register as DRM devices
drm: Remove drm_resource wrappers
This needs similar handling to other compressed textures.
Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=26428
Signed-off-by: sroland@vmware.com
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=28459
agd5f: apply to r1xx/r2xx as well.
Signed-off-by: Roland Scheidegger <sroland@vmware.com>
Cc: stable@kernel.org
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Installing 2.6.34 on a Power5/rn50 combo machine, X showed buggy sw rendering,
enabling tiling in the DDX fixed it. Investigation showed that a further /16
was needed in the untiled case on this chipset. Need further investigations
on what other chips this could affect, possibly rv100->rv280.
Signed-off-by: Dave Airlie <airlied@redhat.com>
This adds an additional profile, mid, to the pm profile
code which takes the place of the old low profile. The default
behavior remains the same, e.g., auto profile now selects between
mid and high profiles based on power source, however, you can now
manually force the low profile which was previously only available
as a dpms off state. Enabling the low profile when the displays
are on has been known to cause display corruption in some cases.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Remove the drm_resource wrappers and directly use the
actual PCI and/or platform functions in their place.
[airlied: fixup nouveau properly to build]
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
* anholt/drm-intel-next: (515 commits)
drm/i915: Fix out of tree builds
drm/i915: move fence lru to struct drm_i915_fence_reg
drm/i915: don't allow tiling changes on pinned buffers v2
drm/i915: Be extra careful about A/D matching for multifunction SDVO
drm/i915: Fix DDC bus selection for multifunction SDVO
drm/i915: cleanup mode setting before unmapping registers
drm/i915: Make fbc control wrapper functions
drm/i915: Wait for the GPU whilst shrinking, if truly desperate.
drm/i915: Use spatio-temporal dithering on PCH
[MTD] Remove zero-length files mtdbdi.c and internal.ho
pata_pcmcia / ide-cs: Fix bad hashes for Transcend and kingston IDs
libata: Fix several inaccuracies in developer's guide
slub: Fix bad boundary check in init_kmem_cache_nodes()
raid6: fix recovery performance regression
KEYS: call_sbin_request_key() must write lock keyrings before modifying them
KEYS: Use RCU dereference wrappers in keyring key type code
KEYS: find_keyring_by_name() can gain access to a freed keyring
ALSA: hda: Fix 0 dB for Packard Bell models using Conexant CX20549 (Venice)
ALSA: hda - Add quirk for Dell Inspiron 19T using a Conexant CX20582
ALSA: take tu->qlock with irqs disabled
...
- Separate dynpm and profile based power management methods. You can select the pm method
by echoing the selected method ("dynpm" or "profile") to power_method in sysfs.
- Expose basic 4 profile in profile method
"default" - default clocks
"auto" - select between low and high based on ac/dc state
"low" - DC, low power mode
"high" - AC, performance mode
The current base profile is "default", but it should switched to "auto" once we've tested
on more systems. Switching the state is a matter of echoing the requested profile to
power_profile in sysfs. The lowest power states are selected automatically when dpms turns
the monitors off in all states but default.
- Remove dynamic fence-based reclocking for the moment. We can revisit this later once we
have basic pm in.
- Move pm init/fini to modesetting path. pm is tightly coupled with display state. Make sure
display side is initialized before pm.
- Add pm suspend/resume functions to make sure pm state is properly reinitialized on resume.
- Remove dynpm module option. It's now selectable via sysfs.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The lowest power states often cause display problems, so only enable
them when all displays are off.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
voltage drop, dynamic voltage, dynamic sclk, pcie lane adjust, etc,
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
- disable gui idle interrupt use
Seems to hang some r5xx chips
- move vbl range check into
existing vbl check function in
radeon_pm.c
- disable crtc mc acccess for the
whole reclocking process
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This seems to be relatively stable now, so enable it for these chipsets
too.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Add two new sysfs attributes:
- dynpm
- power_state
Echoing 0/1 to dynpm disables/enables dynamic power management.
The driver scales the sclk dynamically based on the number of
queued fences. dynpm only scales sclk dynamically in single head
mode.
Echoing x.y to power_state selects a static power state (x) and clock
mode (y). This allows you to statically select a power state and clock
mode. Selecting a static clock mode will disable dynpm.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
- pm_misc() - handles voltage, pcie lanes, and other non
clock related power mode settings. Currently disabled.
Needs further debugging
- pm_prepare() - disables crtc mem requests right now.
All memory clients need to be disabled when changing
memory clocks. This function can be expanded to include
disabling fb access as well.
- pm_finish() - enable active memory clients.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
- remove non_clock_info struct
- track power state misc flags
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This also simplifies the code and enables reclocking with multiple heads
active by tracking whether the power states are single or multi-head
capable.
Eventually, we will want to select a power state based on external
factors (AC/DC state, user selection, etc.).
(v2) Update for evergreen
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Useful for certain power management operations. You
need to wait for the GUI engine (2D, 3D, CP, etc.) to be
idle before changing clocks or adjusting engine parameters.
(v2) Fix gui idle enable on pre-r6xx asics
(v3) The gui idle interrrupt status bit is permanently asserted
on pre-r6xx chips, but the interrrupt is still generated.
workaround it in the driver.
(v4) Add support for evergreen
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Check to see if the GUI engine and related blocks
(2D, 3D, CP, etc) are idle or not. There are a number
of cases when we need to know if the drawing engine
is busy.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Conflicts:
drivers/gpu/drm/i915/i915_dma.c
drivers/gpu/drm/i915/i915_drv.h
drivers/gpu/drm/radeon/r300.c
The BSD ringbuffer support that is landing in this branch
significantly conflicts with the Ironlake PIPE_CONTROL fix on master,
and requires it to be tested successfully anyway.
* 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/radeon/kms/legacy: only enable load detection property on DVI-I
drm/radeon/kms: fix panel scaling adjusted mode setup
drivers/gpu/drm/drm_sysfs.c: sysfs files error handling
drivers/gpu/drm/radeon/radeon_atombios.c: range check issues
gpu: vga_switcheroo, fix lock imbalance
drivers/gpu/drm/drm_memory.c: fix check for end of loop
drivers/gpu/drm/via/via_video.c: fix off by one issue
drm/radeon/kms/agp The wrong AGP chipset can cause a NULL pointer dereference
drm/radeon/kms: r300 fix CS checker to allow zbuffer-only fastfill
* drm-ttm-unmappable:
drm/radeon/kms: enable use of unmappable VRAM V2
drm/ttm: remove io_ field from TTM V6
drm/vmwgfx: add support for new TTM fault callback V5
drm/nouveau/kms: add support for new TTM fault callback V5
drm/radeon/kms: add support for new fault callback V7
drm/ttm: ttm_fault callback to allow driver to handle bo placement V6
drm/ttm: split no_wait argument in 2 GPU or reserve wait
Conflicts:
drivers/gpu/drm/nouveau/nouveau_bo.c
This patch enable the use of unmappable VRAM thanks to
previous TTM infrastructure change.
V2 update after io_mem_reserve/io_mem_free callback balancing
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
* 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/radeon/kms: add FireMV 2400 PCI ID.
drm/radeon/kms: allow R500 regs VAP_ALT_NUM_VERTICES and VAP_INDEX_OFFSET
drivers/gpu/radeon: Add MSPOS regs to safe list.
drm/radeon/kms: disable the tv encoder when tv/cv is not in use
drm/radeon/kms: adjust pll settings for tv
drm/radeon/kms: fix tv dac conflict resolver
drm/radeon/kms/evergreen: don't enable hdmi audio stuff
drm/radeon/kms/atom: fix dual-link DVI on DCE3.2/4.0
drm/radeon/kms: fix rs600 tlb flush
drm/radeon/kms: print GPU family and device id when loading
drm/radeon/kms: fix calculation of mipmapped 3D texture sizes
drm/radeon/kms: only change mode when coherent value changes.
drm/radeon/kms: more atom parser fixes (v2)
[airlied: fix V_A_N_V to not be safe and fix check to make sure only r500
- bump userspace version]
Signed-off-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This simplify and improve GPU reset for R1XX-R6XX hw, it's
not 100% reliable here are result:
- R1XX/R2XX works bunch of time in a row, sometimes it
seems it can work indifinitly
- R3XX/R3XX the most unreliable one, sometimes you will be
able to reset few times, sometimes not even once
- R5XX more reliable than previous hw, seems to work most
of the times but once in a while it fails for no obvious
reasons (same status than previous reset just no same
happy ending)
- R6XX/R7XX are lot more reliable with this patch, still
it seems that it can fail after a bunch (reset every
2sec for 3hour bring down the GPU & computer)
This have been tested on various hw, for some odd reasons
i wasn't able to lockup RS480/RS690 (while they use to
love locking up).
Note that on R1XX-R5XX the cursor will disapear after
lockup haven't checked why, switch to console and back
to X will restore cursor.
Next step is to record the bogus command that leaded to
the lockup.
V2 Fix r6xx resume path to avoid reinitializing blit
module, use the gpu_lockup boolean to avoid entering
inifinite waiting loop on fence while reiniting the GPU
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Patch rename gpu_reset to asic_reset in prevision of having
gpu_reset doing more stuff than just basic asic reset.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This patch cleanup the fence code, it drops the timeout field of
fence as the time to complete each IB is unpredictable and shouldn't
be bound.
The fence cleanup lead to GPU lockup detection improvement, this
patch introduce a callback, allowing to do asic specific test for
lockup detection. In this patch the CP is use as a first indicator
of GPU lockup. If CP doesn't make progress during 1second we assume
we are facing a GPU lockup.
To avoid overhead of testing GPU lockup frequently due to fence
taking time to be signaled we query the lockup callback every
500msec. There is plenty code comment explaining the design & choise
inside the code.
This have been tested mostly on R3XX/R5XX hw, in normal running
destkop (compiz firefox, quake3 running) the lockup callback wasn't
call once (1 hour session). Also tested with forcing GPU lockup and
lockup was reported after the 1s CP activity timeout.
V2 switch to 500ms timeout so GPU lockup get call at least 2 times
in less than 2sec.
V3 store last jiffies in fence struct so on ERESTART, EBUSY we keep
track of how long we already wait for a given fence
V4 make sure we got up to date cp read pointer so we don't have
false positive
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Some GPUs have an APM/ACPI PM mode selection switch and some BIOSes
set this to APM. We really want this in ACPI mode for Linux.
Signed-off-by: Dave Airlie <airlied@redhat.com>
If we resume in a bad way, we'll get 0xffffffff in wptr, and then
oops with no console. This just adds a sanity check so that we can
avoid the oops and hopefully get more details out of people's systems.
Signed-off-by: Dave Airlie <airlied@redhat.com>
- Add module option to force the display priority
0 = auto, 1 = normal, 2 = high
- Default to high on r3xx/r4xx/rv515 chips
Fixes flickering problems during heavy acceleration
due to underflow to the display controllers
- Fill in minimal support for RS600
v2 - update display priority when bandwidth is updated
so the user can change the parameter at runtime and it
will take affect on the next modeset.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
radeon_gart_fini might call GART unbind callback function which
might try to access GART table but if gart_disable is call first
the GART table will be unmapped so any access to it will oops.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
- rs780/880 were using the wrong bandwidth functions
- convert r1xx-r4xx to use the same pm sclk/mclk structs as
r5xx+
- move bandwidth setup to a common function
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Look up i2c bus in the power table and expose it.
You'll need to load a hwmon driver for any chips
on the bus, this patch just exposes the bus.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
In essence this creates a home for all asic specific declarations in
radeon_asic.h
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
We tried to implement interruptible waiting with timeout (it was broken
anyway) which was not a good idea as explained by Andrew. It's possible
to avoid using additional variable but actually it inroduces using more
complex in-kernel tools. So simply add one variable for condition.
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This patch properly set visible VRAM and enforce any pinned buffer
to be into visible VRAM. We might later add a flag to release this
constraint for some newer hw more clever than previous.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Get rid of _location and use _start/_end also simplify the
computation of vram_start|end & gtt_start|end. For R1XX-R2XX
we place VRAM at the same address of PCI aperture, those GPU
shouldn't have much memory and seems to behave better when
setup that way. For R3XX and newer we place VRAM at 0. For
R6XX-R7XX AGP we place VRAM before or after AGP aperture this
might limit to limit the VRAM size but it's very unlikely.
For IGP we don't change the VRAM placement.
Tested on (compiz,quake3,suspend/resume):
PCI/PCIE:RV280,R420,RV515,RV570,RV610,RV710
AGP:RV100,RV280,R420,RV350,RV620(RPB*),RV730
IGP:RS480(RPB*),RS690,RS780(RPB*),RS880
RPB: resume previously broken
V2 correct commit message to reflect more accurately the bug
and move VRAM placement to 0 for most of the GPU to avoid
limiting VRAM.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
this uses a new entrypoint to invalidate gart entries instead of using 0.
Changed to rather than pointing to 0 address point empty entry to dummy
page. This might help to avoid hard lockup if for some wrong
reasons GPU try to access unmapped GART entry.
I'm not 100% sure this is going to work, we probably need to allocate
a dummy page and point all the GTT entries at it similiar to what AGP does.
but we can test this first I suppose.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This already simplifies code significally and makes it maintaible
in case of adding memory reclocking plus voltage changing in future.
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
V2: reorganize functions, fix modesetting calls
V3: rebase patch, use radeon's workqueue
V4: enable on tested chipsets only, request VBLANK IRQs
V5: enable PM on older hardware (IRQs, mode_fixup, dpms)
V6: use separate dynpm module parameter
V7: drop RADEON_ prefix, set minimum mode for dpms off
V8: update legacy encoder call, fix order in rs600 IRQ
V9: update compute_clocks call in legacy, not only DPMS_OFF
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Switch some magic numbers to their proper defines.
The register header madness needs to be cleaned up
at some point.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Some servers have two VGA ports but only report
one in the bios connector tables. On these systems
always set up the TV DAC so that it displays properly
even if the bios is wrong.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
In suspend path we unmap the GART table while in cleaning up
path we will unbind buffer and thus try to write to unmapped
GART leading to oops. In order to avoid this we don't call the
suspend path in cleanup path. Cleanup path is clever enough
to desactive GPU like the suspend path is doing, thus this was
redondant.
Tested on: RV370, R420, RV515, RV570, RV610, RV770 (all PCIE)
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Cache flush is required in case CPU is accessing rendered data.
This fixes glean/readPixSanity test case and random rendering
errors in sauerbraten and warzone2100.
v2 Fix comment ordering in r100_fence_ring_emit and remove extra
defines added in first version.
Signed-off-by: Pauli Nieminen <suokkos@gmail.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The first dword of PACKET3_3D_DRAW_IMMD maps to
SE_VTX_FMT so the vertex size is part of the draw
packet.
This patch fixes a possible case where you have a
command buffer that does not contain SE_VTX_FMT
register write, but does contain PACKET3_3D_DRAW_IMMD.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@linux.ie>
Use same common function to disable agp so we replace the GART
callback by the proper one when we do so. This fix oops if
radeon_agp_init report failure.
This patch also move radeon_agp_init out of *_mc_init for r600
& rv770 so that we can have a similar behavior than for previous
hw, ie if agp_init fails it will fallback to GPU GART and disable
AGP.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
If for any reason we haven't installed handler we shouldn't try to
enable IRQ/MSI on the hw so we don't get unhandled IRQ/MSI which
makes the kernel sad.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
In some case we weren't releasing the AGP device at module unloading.
This leaded to unfunctional AGP at next module load. This patch make
sure we release the AGP bus if we acquire it.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
R300 family will hard lockup if host path read cache flush is
done through MMIO to HOST_PATH_CNTL. But scheduling same flush
through ring seems harmless. This patch remove the hdp_flush
callback and add a flush after each fence emission which means
a flush after each IB schedule. Thus we should have same behavior
without the hard lockup.
Tested on R100,R200,R300,R400,R500,R600,R700 family.
V2: Adjust fence counts in r600_blit_prepare_copy()
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Because hardware cannot disable all colorbuffers directly to do depth-only
rendering, a user should:
- disable reading from a colorbuffer in blending
- disable fastfill
- set the color channel mask to 0 to prevent writing to a colorbuffer
Signed-off-by: Dave Airlie <airlied@redhat.com>
This adds support for compressed textures to the r100->r500 CS
checker, it lets me run openarena and the demos in mesa fine.
Thanks to Maciej Cencora for initial comments.
Changes since v1:
fix calculations with Maciej formulas
Reviewed-by: Maciej Cencora <m.cencora@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
On resume on my rv530 laptop surface cntl was left disabled, so
wierd stuff would happen with rendering to a tiled front buffer.
This checks if the surface regs are assigned to bos and reprograms
the surface registers on resume using the same path that clears
them all on init.
Signed-off-by: Dave Airlie <airlied@redhat.com>
This enabled interrupt driven hpd support for all
radeon chips. Assuming the hpd pin is wired up
correctly, the driver will generate uevents on
digital monitor connect and disconnect and retrain
DP monitors automatically.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This just adds the functionality, it's not hooked up
yet.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The DDX and radeonfb always set these regs to a sane value.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The locking & protection of radeon object was somewhat messy.
This patch completely rework it to now use ttm reserve as a
protection for the radeon object structure member. It also
shrink down the various radeon object structure by removing
field which were redondant with the ttm information. Last it
converts few simple functions to inline which should with
performances.
airlied: rebase on top of r600 and other changes.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
We really don't need to process every irq that comes in, we only
really want to do SW irq processing when we are actually waiting for
a fence to pass. I'm not 100% sure this is race free esp on non-MSI systems
so it needs some testing.
Signed-off-by: Dave Airlie <airlied@redhat.com>
rendercheck under kms on r600s was failing due to HDP flushing not happening.
This adds HDP flushing to the object wait function for r100->r700 families.
rendercheck passes basic tests on r600 with this change.
Acked-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
We might not hit this yet, but when if we do any sort of writeback
we really need to enable PCI bus mastering on these systems from
what I can see.
This enables PCI BM on all radeons that require it.
Signed-off-by: Dave Airlie <airlied@redhat.com>
* 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (52 commits)
drm/kms: Init the CRTC info fields for modes forced from the command line.
drm/radeon/r600: CS parser updates
drm/radeon/kms: add debugfs for power management for AtomBIOS devices
drm/radeon/kms: initial mode validation support
drm/radeon/kms/atom/dce3: call transmitter init on mode set
drm/radeon/kms: store detailed connector info
drm/radeon/kms/atom/dce3: fix up usPixelClock calculation for Transmitter tables
drm/radeon/kms/r600: fix rs880 support v2
drm/radeon/kms/r700: fix some typos in chip init
drm/radeon/kms: remove some misleading debugging output
drm/radeon/kms: stop putting VRAM at 0 in MC space on r600s.
drm/radeon/kms: disable D1VGA and D2VGA if enabled
drm/radeon/kms: Don't RMW CP_RB_CNTL
drm/radeon/kms: fix coherency issues on AGP cards.
drm/radeon/kms: fix rc410 suspend/resume.
drm/radeon/kms: add quirk for hp dc5750
drm/radeon/kms/atom: fix potential oops in spread spectrum code
drm/kms: typo fix
drm/radeon/kms/atom: Make card_info per device
drm/radeon/kms/atom: Fix DVO support
...
Immediate readback seems faulty on some chips. I
suspect it takes a while to get through the fifo
to the actual register backbone. There's no need
to read it back, so, just write the driver's copy
of the register's value directly.
Should fix bug 24535 and possibly 24218
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The problem boils down to the order when the bit11
of the texture size is or'ed to the original width.
In the end each mipmap level has the same width or
height because of that 11 bit is ored to the scaled
down lod with and thus blows up the size again to the
full size or more due to the power of two rounding
afterwards.
The attached patch changes this order so that the
texture sizes are computed correct. Also the on error
the yet missing inputs to the size computation are
printed which helped me to find out where it really breaks.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
While investigating the cause of CRTC FIFO underruns, I noticed that when
converting the memory bandwidth calculation from the userspace X driver code,
an instance of '8.0' was apparently accidentally converted to '80'.
Signed-off-by: Michel Dänzer <daenzer@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
conflict in radeon since new init path merged with vga arb code.
Conflicts:
drivers/gpu/drm/radeon/radeon.h
drivers/gpu/drm/radeon/radeon_asic.h
drivers/gpu/drm/radeon/radeon_device.c
Also add single crtc for RN50 chips.
changes in v2:
fix vblank init to respect single crtc flag
fix r100 mode bandwidth to respect single crtc flag
Signed-off-by: Dave Airlie <airlied@redhat.com>
This remove old init path and allow code cleanup, now all hw
use the new init path, see top of radeon.h for description of
this.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
New init path allow to simply asic initialization and make easier
to trace what happen on each different asic. We are removing most
callback. More cleanup should happen latter to remove even more
callback. Also cleanup register specific to R100,RV200,RV250.
Version 2 correct the placement on IGP of the VRAM inside GPU address
space to match the stollen RAM placement of IGP.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Also cleanup register specific to RS400/RS480. This patch also fix
legacy VGA register used to disable VGA access we were programming
wrong register. Now we should properly disable VGA on r100 up to
rs400 asics. Note that RS400/RS480 resume is broken, it hangs the
computer while reprogramming dynamic clock, doesn't work either
without that patch. We need to spend more time investigating this
issue. Version 2 of the patch remove dead code that was left
commented out in the previous version. Version 3 correct the
placement on IGP of the VRAM inside GPU address space to match the
stollen RAM placement of IGP.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
* 'drm-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (25 commits)
drm/radeon/kms: Convert R520 to new init path and associated cleanup
drm/radeon/kms: Convert RV515 to new init path and associated cleanup
drm: fix radeon DRM warnings when !CONFIG_DEBUG_FS
drm: fix drm_fb_helper warning when !CONFIG_MAGIC_SYSRQ
drm/r600: fix memory leak introduced with 64k malloc avoidance fix.
drm/kms: make fb helper work for all drivers.
drm/radeon/r600: fix offset handling in CS parser
drm/radeon/kms/r600: fix forcing pci mode on agp cards
drm/radeon/kms: fix for the extra pages copying.
drm/radeon/kms/r600: add support for vline relocs
drm/radeon/kms: fix some bugs in vline reloc
drm/radeon/kms/r600: clamp vram to aperture size
drm/kms: protect against fb helper not being created.
drm/r600: get values from the passed in IB not the copy.
drm: create gitignore file for radeon
drm/radeon/kms: remove unneeded master create/destroy functions.
drm/kms: start adding command line interface using fb.
fb: change rules for global rules match.
drm/radeon/kms: don't require up to 64k allocations. (v2)
drm/radeon/kms: enable dac load detection by default.
...
Trivial conflicts in drivers/gpu/drm/radeon/radeon_asic.h due to adding
'->vga_set_state' function pointers.
- fix offset of NOP packet for parsing
- fix p->idx increments
- fix bad mask when updating crtc vline info
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@linux.ie>
This avoids needing to do a kmalloc > PAGE_SIZE for the main
indirect buffer chunk, it adds an accessor for all reads from
the chunk and caches a single page at a time for subsequent
reads.
changes since v1:
Use a two page pool which should be the most common case
a single packet spanning > PAGE_SIZE will be hit, but I'm
having trouble seeing anywhere we currently generate anything like that.
hopefully proper short page copying at end
added parser_error flag to set deep errors instead of having to test
every ib value fetch.
fixed bug in patch that went to list.
Signed-off-by: Dave Airlie <airlied@redhat.com>
VGA arb requires DRM support for non-kms drivers, to turn on/off
irqs when disabling the mem/io regions.
VGA arb requires KMS support for GPUs where we can turn off VGA
decoding. Currently we know how to do this for intel and radeon
kms drivers, which allows them to be removed from the arbiter.
This patch comes from Fedora rawhide kernel.
Signed-off-by: Dave Airlie <airlied@redhat.com>
GART static one time initialization was mixed up with GART
enabling/disabling which could happen several time for instance
during suspend/resume cycles. This patch splits all GART
handling into 4 differents function. gart_init is for one
time initialization, gart_deinit is called upon module unload
to free resources allocated by gart_init, gart_enable enable
the GART and is intented to be call after first initialization
and at each resume cycle or reset cycle. Finaly gart_disable
stop the GART and is intended to be call at suspend time or
when unloading the module.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This convert r4xx to new init path it also fix few bugs.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
If module is being unloaded we should not try to handle irq especialy
we should not call into drm helper or we could hard hang the computer
free_irq will call the irq handler to make sure we behave properly.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
If we stop CP and that it's still processing thing GPU hang might
happen, this patch wait for CP idle (the wait can timeout) so we
can avoid shutting down CP at bad time. This is especialy usefull
when reseting the GPU as it seems GPU reset fails to properly reset
CP when the CP wasn't stop after being idle.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This adds the r600 KMS + CS support to the Linux kernel.
The r600 TTM support is quite basic and still needs more
work esp around using interrupts, but the polled fencing
should work okay for now.
Also currently TTM is using memcpy to do VRAM moves,
the code is here to use a 3D blit to do this, but
isn't fully debugged yet.
Authors:
Alex Deucher <alexdeucher@gmail.com>
Dave Airlie <airlied@redhat.com>
Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This adds the command stream checker for the RN50, R100 and R200 cards.
It stops any access to 3D registers on RN50, and does checks
on buffer sizes on the r100/r200 cards. It also fixes some texture
sizing checks on r300.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Loosely based on a patch by
Jaswinder Singh Rajput <jaswinderlinux@gmail.com>.
KMS support by Dave Airlie <airlied@redhat.com>.
For Radeon 100- to 500-series, firmware blobs look like:
struct {
__be32 datah;
__be32 datal;
} cp_ucode[256];
For Radeon 600-series, there are two separate firmware blobs:
__be32 me_ucode[PM4_UCODE_SIZE * 3];
__be32 pfp_ucode[PFP_UCODE_SIZE];
For Radeon 700-series, likewise:
__be32 me_ucode[R700_PM4_UCODE_SIZE];
__be32 pfp_ucode[R700_PFP_UCODE_SIZE];
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>
We really don't want to be doing all these indirects, updating
the GPU gart table is something we do often so the less overhead the
better.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Fixes 3D apps timing out in the WAIT_VBLANK ioctl.
AVIVO bits compile-tested only.
Signed-off-by: Michel Dänzer <daenzer@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Check whether index is within bounds before grabbing the element.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
If an rn50/r100/m6/m7 GPU has < 64MB RAM, i.e. 8/16/32, the
aperture used to calculate the MC_FB_LOCATION needs to be worked
out from the CONFIG_APER_SIZE register, and not the actual vram size.
TTM VRAM size was also being initialised wrong, use actual vram size
to initialise it.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Fix bandwidth computation and crtc priority in memory controller
so that crtc memory request are fullfill in time to avoid display
artifact.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This adds new set/get tiling interfaces where the pitch
and macro/micro tiling enables can be set. Along with
a flag to decide if this object should have a surface when mapped.
The only thing we need to allocate with a mapped surface should be
the frontbuffer. Note rotate scanout shouldn't require one, and
back/depth shouldn't either, though mesa needs some fixes.
It fixes the TTM interfaces along Thomas's suggestions, and I've tested
the surface stealing code with two X servers and not seen any lockdep issues.
I've stopped tiling the fbcon frontbuffer, as I don't see there being
any advantage other than testing, I've left the testing commands in there,
just flip the fb_tiled to true in radeon_fb.c
Open: Can we integrate endian swapping in with this?
Future features:
texture tiling - need to relocate texture registers TXOFFSET* with tiling info.
This also merges Michel's cleanup surfaces regs at init time patch
even though it makes sense on its own, this patch really relies on it.
Some PowerMac firmwares set up a tiling surface at the beginning of VRAM
which messes us up otherwise.
that patch is:
Signed-off-by: Michel Dänzer <daenzer@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
RN50/ES1000 is a cut-down rv100 chip used in the server market.
The 3D engine on these is either not there or unverified so refuse
any attempt to configure registers on it.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Doing this like the DDX seems like the most sure fire way to avoid
having to reinvent it slowly and painfully. At the moment we keep
getting things wrong with aper vs vram, so we know the DDX does it right.
booted on PCI r100, PCIE rv370, IGP rs400.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Userspace sends us a special relocation type to sync video/exa
to vlines to avoid tearing, this deals with the relocation
in the kernel, it picks the correct crtc and avoids issues
where crtcs are disabled.
This version also parses the wait until to make sure it isn't
trying to do anything evil.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Normally we are free to place VRAM where we want in the GPUs
memory address space, however on IGP chips the VRAM is actual RAM,
and no special translation or aperture is used inside the GPU MC.
So when you move the VRAM aperture away from the TOM register,
you actually move it into main memory and can trash things quite badly.
This commit makes the code respect the TOM location for MC_FB_LOCATION.
Signed-off-by: Dave Airlie <airlied@redhat.com>
1. rv370 can accept 40-bit addresses - also at 24-bit shift not 4 bits
2. rs480 table can be in 40-bit space. - 4 bit shift for top 8 bits
3. rs480 table entries can be in 40-bit space.
Signed-off-by: Dave Airlie <airlied@redhat.com>
For security purpose we want to make sure the userspace process doesn't
access memory beyond buffer it owns. To achieve this we need to check
states the userspace program. For color buffer and zbuffer we check that
the clipping register will discard access beyond buffers set as color
or zbuffer. For vertex buffer we check that no vertex fetch will happen
beyond buffer end. For texture we check various texture states (number
of mipmap level, texture size, texture depth, ...) to compute the amount
of memory the texture fetcher might access.
The command stream checking impact the performances so far quick benchmark
shows an average of 3% decrease in fps of various applications. It can
be optimized a bit more by caching result of checking and thus avoid a
full recheck if no states changed since last check.
Note that this patch is still incomplete on checking side as it doesn't
check 2d rendering states.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Add kernel modesetting support to radeon driver, use the ttm memory
manager to manage memory and DRM/GEM to provide userspace API.
In order to avoid backward compatibility issue and to allow clean
design and code the radeon kernel modesetting use different code path
than old radeon/drm driver.
When kernel modesetting is enabled the IOCTL of radeon/drm
driver are considered as invalid and an error message is printed
in the log and they return failure.
KMS enabled userspace will use new API to talk with the radeon/drm
driver. The new API provide functions to create/destroy/share/mmap
buffer object which are then managed by the kernel memory manager
(here TTM). In order to submit command to the GPU the userspace
provide a buffer holding the command stream, along this buffer
userspace have to provide a list of buffer object used by the
command stream. The kernel radeon driver will then place buffer
in GPU accessible memory and will update command stream to reflect
the position of the different buffers.
The kernel will also perform security check on command stream
provided by the user, we want to catch and forbid any illegal use
of the GPU such as DMA into random system memory or into memory
not owned by the process supplying the command stream. This part
of the code is still incomplete and this why we propose that patch
as a staging driver addition, future security might forbid current
experimental userspace to run.
This code support the following hardware : R1XX,R2XX,R3XX,R4XX,R5XX
(radeon up to X1950). Works is underway to provide support for R6XX,
R7XX and newer hardware (radeon from HD2XXX to HD4XXX).
Authors:
Jerome Glisse <jglisse@redhat.com>
Dave Airlie <airlied@redhat.com>
Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>