We need to read a bunch of registers on each compute unit and possibly
on the current CPU too. Disable preemption around it. Otherwise, you
get:
BUG: using smp_processor_id() in preemptible [00000000] code: systemd-udevd/327
caller is read_registers+0x6a/0x110 [fam15h_power]
CPU: 3 PID: 327 Comm: systemd-udevd Not tainted 4.7.0-rc1+ #4
Hardware name: HP HP EliteBook 745 G3/807E, BIOS N73 Ver. 01.08 01/28/2016
...
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Rui Huang <ray.huang@amd.com>
Cc: Sherry Hurwitz <sherry.hurwitz@amd.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Acked-by: Huang Rui <ray.huang@amd.com>
Tested-by: Huang Rui <ray.huang@amd.com>
Fixes: fa79434499 ("hwmon: (fam15h_power) Add compute unit accumulated power")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
This patch adds a platform check function to make code more readable.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
This patch adds the description to explain the TDP reporting mechanism
and accumulated power algorithm.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Cc: Borislav Petkov <bp@alien8.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
This patch introduces an algorithm that computes the average power by
reading a delta value of “core power accumulator” register during
measurement interval, and then dividing delta value by the length of
the time interval.
User is able to use power1_average entry to measure the processor power
consumption and power1_average_interval entry to set the interval.
A simple example:
ray@hr-ub:~/tip$ sensors
fam15h_power-pci-00c4
Adapter: PCI adapter
power1: 19.58 mW (avg = 2.55 mW, interval = 0.01 s)
(crit = 15.00 W)
...
The result is current average processor power consumption in 10
millisecond. The unit of the result is uWatt.
Suggested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Cc: Borislav Petkov <bp@alien8.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
PTSC is the performance timestamp counter value in a cpu core and the
cores in one compute unit have the fixed frequency. So it picks up the
performance timestamp counter value of the first core per compute unit
to measure the interval for average power per compute unit.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Cc: Borislav Petkov <bp@alien8.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
This patch adds a member in fam15h_power_data which specifies the
compute unit accumulated power. It adds do_read_registers_on_cu to do
all the read to all MSRs and run it on one of the online cores on each
compute unit with smp_call_function_many(). This behavior can decrease
IPI numbers.
Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
AMD Family 15h Models 70h-7fh processors also support TDP power
reporting interface.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
This patch adds a member in fam15h_power_data which specifies the
maximum accumulated power in a compute unit.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Attributes depend on the CPU model the driver gets loaded on.
Therefore, add those attributes dynamically at init time. This is more
flexible to control the different attributes on different platforms.
Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
This patch adds a member (cpu_pwr_sample_ratio) of fam15h_power_data,
that represents the ratio of compute unit power accumulator sample
period to the PTSC counter period.
Tsample: compute unit power accumulator sample period
Tref: the performance timestamp counter period
PTSC: performance timestamp counter
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
On Carrizo and later platforms, running_avg_capture bit field is
extended to 4:31 (28 bits) from 4:25.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
We rename fam15h_power_is_internal_node0() function to
should_load_on_this_node(), because it may not be node0 from KV and
on, and they are single-node processors.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
AMD Carrizo(Fam15h, M60h) processors can report power1_crit
(ProcessorPwrWatts) and power1_input (CurrPwrWatts) values.
And this patch adds support for CZ.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
F3 device ID is wrongly included in fam15h_power_id_table
for F16h M30h. It should be F4 device ID. Fix this.
Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
power1_input should only be reported for Fam15h, Models 00h-0fh
So, introduce a is_visible function to take care of this.
As suggested by Guenter here:
http://marc.info/?l=linux-kernel&m=141038145616437&w=2
Suggested-by: Guenter Roeck <linux@roeck-us.net>
Fixes: 22e32f4f57 ('x86,AMD: Power driver support for AMD's family 16h processors')
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
PCI_DEVICE_ID_AMD_16H_NB_F4 can be obtained from it's
definition in pci_ids.h. So we don't have to define it
again here.
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Use ATTRIBUTE_GROUPS macro and devm_hwmon_device_register_with_groups() to
simplify the code a bit.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Don't use DEFINE_PCI_DEVICE_TABLE macro, because this macro
is not preferred.
Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Acked-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Cc: Corentin Labbe <corentin.labbe@geomatys.fr>
Cc: Mark M. Hoffman <mhoffman@lightlink.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Juerg Haefliger <juergh@gmail.com>
Cc: Andreas Herrmann <herrmann.der.user@googlemail.com>
Cc: Rudolf Marek <r.marek@assembler.cz>
Cc: Jim Cromie <jim.cromie@gmail.com>
Cc: Roger Lucas <vt8231@hiddenengine.co.uk>
Cc: Marc Hulsman <m.hulsman@tudelft.nl>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Here's the large driver core updates for 3.8-rc1.
The biggest thing here is the various __dev* marking removals. This is
going to be a pain for the merge with different subsystem trees, I know,
but all of the patches included here have been ACKed by their various
subsystem maintainers, as they wanted them to go through here.
If this is too much of a pain, I can pull all of them out of this tree
and just send you one with the other fixes/updates and then, after
3.8-rc1 is out, do the rest of the removals to ensure we catch them all,
it's up to you. The merges should all be trivial, and Stephen has been
doing them all in linux-next for a few weeks now quite easily.
Other than the __dev* marking removals, there's nothing major here, some
firmware loading updates and other minor things in the driver core.
All of these have (much to Stephen's annoyance), been in linux-next for
a while.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iEYEABECAAYFAlDHkPkACgkQMUfUDdst+ykaWgCfW7AM30cv0nzoVO08ax6KjlG1
KVYAn3z/KYazvp4B6LMvrW9y0G34Wmad
=yvVr
-----END PGP SIGNATURE-----
Merge tag 'driver-core-3.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg Kroah-Hartman:
"Here's the large driver core updates for 3.8-rc1.
The biggest thing here is the various __dev* marking removals. This
is going to be a pain for the merge with different subsystem trees, I
know, but all of the patches included here have been ACKed by their
various subsystem maintainers, as they wanted them to go through here.
If this is too much of a pain, I can pull all of them out of this tree
and just send you one with the other fixes/updates and then, after
3.8-rc1 is out, do the rest of the removals to ensure we catch them
all, it's up to you. The merges should all be trivial, and Stephen
has been doing them all in linux-next for a few weeks now quite
easily.
Other than the __dev* marking removals, there's nothing major here,
some firmware loading updates and other minor things in the driver
core.
All of these have (much to Stephen's annoyance), been in linux-next
for a while.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
Fixed up trivial conflicts in drivers/gpio/gpio-{em,stmpe}.c due to gpio
update.
* tag 'driver-core-3.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (93 commits)
modpost.c: Stop checking __dev* section mismatches
init.h: Remove __dev* sections from the kernel
acpi: remove use of __devinit
PCI: Remove __dev* markings
PCI: Always build setup-bus when PCI is enabled
PCI: Move pci_uevent into pci-driver.c
PCI: Remove CONFIG_HOTPLUG ifdefs
unicore32/PCI: Remove CONFIG_HOTPLUG ifdefs
sh/PCI: Remove CONFIG_HOTPLUG ifdefs
powerpc/PCI: Remove CONFIG_HOTPLUG ifdefs
mips/PCI: Remove CONFIG_HOTPLUG ifdefs
microblaze/PCI: Remove CONFIG_HOTPLUG ifdefs
dma: remove use of __devinit
dma: remove use of __devexit_p
firewire: remove use of __devinitdata
firewire: remove use of __devinit
leds: remove use of __devexit
leds: remove use of __devinit
leds: remove use of __devexit_p
mmc: remove use of __devexit
...
Add family 16h PCI ID to AMD's power driver to allow it report
power consumption on these processors.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
CONFIG_HOTPLUG is going away as an option so __devexit is no
longer needed.
Signed-off-by: Bill Pemberton <wfp5p@virginia.edu>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: Alistair John Strachan <alistair@devzero.co.uk>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Juerg Haefliger <juergh@gmail.com>
Cc: Andreas Herrmann <herrmann.der.user@googlemail.com>
Cc: Clemens Ladisch <clemens@ladisch.de>
Cc: Rudolf Marek <r.marek@assembler.cz>
Cc: Jim Cromie <jim.cromie@gmail.com>
Cc: "Mark M. Hoffman" <mhoffman@lightlink.com>
Cc: Roger Lucas <vt8231@hiddenengine.co.uk>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CONFIG_HOTPLUG is going away as an option so __devinit is no longer
needed.
Signed-off-by: Bill Pemberton <wfp5p@virginia.edu>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: Alistair John Strachan <alistair@devzero.co.uk>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Juerg Haefliger <juergh@gmail.com>
Cc: Andreas Herrmann <herrmann.der.user@googlemail.com>
Cc: Clemens Ladisch <clemens@ladisch.de>
Cc: Rudolf Marek <r.marek@assembler.cz>
Cc: Jim Cromie <jim.cromie@gmail.com>
Cc: "Mark M. Hoffman" <mhoffman@lightlink.com>
Cc: Roger Lucas <vt8231@hiddenengine.co.uk>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CONFIG_HOTPLUG is going away as an option so __devexit_p is no longer
needed.
Signed-off-by: Bill Pemberton <wfp5p@virginia.edu>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: Alistair John Strachan <alistair@devzero.co.uk>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Juerg Haefliger <juergh@gmail.com>
Cc: Andreas Herrmann <herrmann.der.user@googlemail.com>
Cc: Clemens Ladisch <clemens@ladisch.de>
Cc: Rudolf Marek <r.marek@assembler.cz>
Cc: Jim Cromie <jim.cromie@gmail.com>
Cc: "Mark M. Hoffman" <mhoffman@lightlink.com>
Cc: Roger Lucas <vt8231@hiddenengine.co.uk>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Convert to use devm_ functions to reduce code size and simplify the code.
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
The quirk introduced with commit
00250ec909 (hwmon: fam15h_power: fix
bogus values with current BIOSes) is not only required during driver
load but also when system resumes from suspend. The BIOS might set the
previously recommended (but unsuitable) initilization value for the
running average range register during resume.
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Tested-by: Andreas Hartmann <andihartmann@01019freenet.de>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: stable@vger.kernel.org # 3.0+
Expression with two unsigned integer variables is calculated as unsigned integer
before it is converted to u64. This may result in an integer overflow.
Fix by typecasting the left operand to u64 before performing the left shift.
This patch addresses Coverity #402320: Unintentional integer overflow.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Andreas Herrmann <andreas.herrmann3@amd.com>
This patch converts the drivers in drivers/hwmon/* to use module_pci_driver()
macro which makes the code smaller and a bit simpler.
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Clemens Ladisch <clemens@ladisch.de>
Cc: Rudolf Marek <r.marek@assembler.cz>
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Newer BKDG[1] versions recommend a different initialization value for
the running average range register in the northbridge. This improves
the power reading by avoiding counter saturations resulting in bogus
values for anything below about 80% of TDP power consumption.
Updated BIOSes will have this new value set up from the beginning,
but meanwhile we correct this value ourselves.
This needs to be done on all northbridges, even on those where the
driver itself does not register at.
This fixes the driver on all current machines to provide proper
values for idle load.
[1]
http://support.amd.com/us/Processor_TechDocs/42301_15h_Mod_00h-0Fh_BKDG.pdf
Chapter 3.8: D18F5xE0 Processor TDP Running Average (p. 452)
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Acked-by: Jean Delvare <khali@linux-fr.org>
[guenter.roeck@ericsson.com: Removed unnecessary return statement]
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Cc: stable@vger.kernel.org # 3.0+
On high CPU load the accumulating values in the running_avg_cap
register are very low (below 10), so averaging them too early leads
to unnecessary poor output resolution. Since we pretend to output
micro-Watt we better keep all the bits we have as long as possible.
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Wrong bit was used for sign extension which caused wrong end results.
Thanks to Andre for spotting this bug.
Reported-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: stable@vger.kernel.org
This CPU family provides NB register values to gather following
TDP information
* ProcessorPwrWatts: Specifies in Watts the maximum amount of power
the processor can support.
* CurrPwrWatts: Specifies in Watts the current amount of power being
consumed by the processor.
This driver provides
* power1_crit (ProcessorPwrWatts)
* power1_input (CurrPwrWatts)
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>