linux-sg2042

Commit Graph

Author	SHA1	Message	Date
Hanjun Guo	5b1e80204d	ACPI: NUMA: Remove the useless 'node >= MAX_NUMNODES' check acpi_map_pxm_to_node() will never return a NUMA node greater than MAX_NUMNODES, so the 'node >= MAX_NUMNODES' check is not needed. Remove it. Signed-off-by: Hanjun Guo <guohanjun@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-07-27 15:19:12 +02:00
Hanjun Guo	1c60f91c31	ACPI: NUMA: Remove the useless sub table pointer check In acpi_parse_entries_array(), the subtable entries (entry.hdr) will never be NULL, so for ACPI subtable handler in struct acpi_subtable_proc, will never handle NULL subtable entries. Remove those useless subtable pointer checks in the callback handlers. Signed-off-by: Hanjun Guo <guohanjun@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-07-27 15:19:12 +02:00
David Hildenbrand	f2af6d3978	virtio-mem: Allow to specify an ACPI PXM as nid We want to allow to specify (similar as for a DIMM), to which node a virtio-mem device (and, therefore, its memory) belongs. Add a new virtio-mem feature flag and export pxm_to_node, so it can be used in kernel module context. Acked-by: Michal Hocko <mhocko@suse.com> # for the export Acked-by: "Rafael J. Wysocki" <rafael@kernel.org> # for the export Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Tested-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Michal Hocko <mhocko@kernel.org> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Len Brown <lenb@kernel.org> Cc: linux-acpi@vger.kernel.org Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200507140139.17083-4-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2020-06-04 15:36:52 -04:00
Dan Williams	b2ca916ce3	ACPI: NUMA: Up-level "map to online node" functionality The acpi_map_pxm_to_online_node() helper is used to find the closest online node to a given proximity domain. This is used to map devices in a proximity domain with no online memory or cpus to the closest online node and populate a device's 'numa_node' property. The numa_node property allows applications to be migrated "close" to a resource. In preparation for providing a generic facility to optionally map an address range to its closest online node, or the node the range would represent were it to be onlined (target_node), up-level the core of acpi_map_pxm_to_online_node() to a generic mm/numa helper. Cc: Michal Hocko <mhocko@suse.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/158188324802.894464.13128795207831894206.stgit@dwillia2-desk3.amr.corp.intel.com	2020-02-17 10:49:06 -08:00
Tao Xu	0f1839d088	ACPI: HMAT: use %u instead of %d to print u32 values Use %u instead of %d to print u32 values to expand the value range, especially when latency or bandwidth value is bigger than INT_MAX. Then HMAT latency can support up to 4.29s and bandwidth can support up to 4PB/s. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jingqi Liu <Jingqi.liu@intel.com> Signed-off-by: Tao Xu <tao3.xu@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-11-12 10:05:57 +01:00
Qian Cai	59b2c5b635	ACPI: NUMA: HMAT: fix a section mismatch Commit `cf8741ac57` ("ACPI: NUMA: HMAT: Register "soft reserved" memory as an "hmem" device") introduced a linker warning, WARNING: vmlinux.o(.text+0x64ec3c): Section mismatch in reference from the function hmat_register_target() to the function .init.text:hmat_register_target_devices() The function hmat_register_target() references the function __init hmat_register_target_devices(). Since hmat_register_target() is also called from hmat_callback(), and then register_hotmemory_notifier(), where it should not be freed when hmat_init() is done, it indicates that the __init annotation of hmat_register_target_devices() is incorrect. Fixes: `cf8741ac57` ("ACPI: NUMA: HMAT: Register "soft reserved" memory as an "hmem" device") Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-11-12 10:01:03 +01:00
Brice Goglin	4caa525b78	ACPI: HMAT: don't mix pxm and nid when setting memory target processor_pxm On systems where PXMs and nids are in different order, memory initiators exposed in sysfs could be wrong: On dual-socket CLX with SNC enabled (4 nodes, 1 and 2 swapped between PXMs and nids), node1 would only get node2 as initiator, and node2 would only get node1. With this patch, we get node1 as the only initiator of itself, and node2 as the only initiator of itself, as expected. This should likely go to stable up to 5.2. Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-11-07 15:46:52 +01:00
Dan Williams	cf8741ac57	ACPI: NUMA: HMAT: Register "soft reserved" memory as an "hmem" device Memory that has been tagged EFI_MEMORY_SP, and has performance properties described by the ACPI HMAT is expected to have an application specific consumer. Those consumers may want 100% of the memory capacity to be reserved from any usage by the kernel. By default, with this enabling, a platform device is created to represent this differentiated resource. The device-dax "hmem" driver claims these devices by default and provides an mmap interface for the target application. If the administrator prefers, the hmem resource range can be made available to the core-mm via the device-dax hotplug facility, kmem, to online the memory with its own numa node. This was tested with an emulated HMAT produced by qemu (with the pending HMAT enabling patches), and "efi_fake_mem=8G@9G:0x40000" on the kernel command line to mark the memory ranges associated with node2 and node3 as EFI_MEMORY_SP. qemu numa configuration options: -numa node,mem=4G,cpus=0-19,nodeid=0 -numa node,mem=4G,cpus=20-39,nodeid=1 -numa node,mem=4G,nodeid=2 -numa node,mem=4G,nodeid=3 -numa dist,src=0,dst=0,val=10 -numa dist,src=0,dst=1,val=21 -numa dist,src=0,dst=2,val=21 -numa dist,src=0,dst=3,val=21 -numa dist,src=1,dst=0,val=21 -numa dist,src=1,dst=1,val=10 -numa dist,src=1,dst=2,val=21 -numa dist,src=1,dst=3,val=21 -numa dist,src=2,dst=0,val=21 -numa dist,src=2,dst=1,val=21 -numa dist,src=2,dst=2,val=10 -numa dist,src=2,dst=3,val=21 -numa dist,src=3,dst=0,val=21 -numa dist,src=3,dst=1,val=21 -numa dist,src=3,dst=2,val=21 -numa dist,src=3,dst=3,val=10 -numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-latency,base-lat=10,latency=5 -numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-bandwidth,base-bw=20,bandwidth=5 -numa hmat-lb,initiator=0,target=1,hierarchy=memory,data-type=access-latency,base-lat=10,latency=10 -numa hmat-lb,initiator=0,target=1,hierarchy=memory,data-type=access-bandwidth,base-bw=20,bandwidth=10 -numa hmat-lb,initiator=0,target=2,hierarchy=memory,data-type=access-latency,base-lat=10,latency=15 -numa hmat-lb,initiator=0,target=2,hierarchy=memory,data-type=access-bandwidth,base-bw=20,bandwidth=15 -numa hmat-lb,initiator=0,target=3,hierarchy=memory,data-type=access-latency,base-lat=10,latency=20 -numa hmat-lb,initiator=0,target=3,hierarchy=memory,data-type=access-bandwidth,base-bw=20,bandwidth=20 -numa hmat-lb,initiator=1,target=0,hierarchy=memory,data-type=access-latency,base-lat=10,latency=10 -numa hmat-lb,initiator=1,target=0,hierarchy=memory,data-type=access-bandwidth,base-bw=20,bandwidth=10 -numa hmat-lb,initiator=1,target=1,hierarchy=memory,data-type=access-latency,base-lat=10,latency=5 -numa hmat-lb,initiator=1,target=1,hierarchy=memory,data-type=access-bandwidth,base-bw=20,bandwidth=5 -numa hmat-lb,initiator=1,target=2,hierarchy=memory,data-type=access-latency,base-lat=10,latency=15 -numa hmat-lb,initiator=1,target=2,hierarchy=memory,data-type=access-bandwidth,base-bw=20,bandwidth=15 -numa hmat-lb,initiator=1,target=3,hierarchy=memory,data-type=access-latency,base-lat=10,latency=20 -numa hmat-lb,initiator=1,target=3,hierarchy=memory,data-type=access-bandwidth,base-bw=20,bandwidth=20 Result: [ { "path":"\/platform\/hmem.1", "id":1, "size":"4.00 GiB (4.29 GB)", "align":2097152, "devices":[ { "chardev":"dax1.0", "size":"4.00 GiB (4.29 GB)" } ] }, { "path":"\/platform\/hmem.0", "id":0, "size":"4.00 GiB (4.29 GB)", "align":2097152, "devices":[ { "chardev":"dax0.0", "size":"4.00 GiB (4.29 GB)" } ] } ] [..] 240000000-43fffffff : Soft Reserved 240000000-33fffffff : hmem.0 240000000-33fffffff : dax0.0 340000000-43fffffff : hmem.1 340000000-43fffffff : dax1.0 Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-11-07 15:45:23 +01:00
Dan Williams	0f847f8c08	ACPI: NUMA: HMAT: Register HMAT at device_initcall level In preparation for registering device-dax instances for accessing EFI specific-purpose memory, arrange for the HMAT registration to occur later in the init process. Critically HMAT initialization needs to occur after e820__reserve_resources_late() which is the point at which the iomem resource tree is populated with "Application Reserved" (IORES_DESC_APPLICATION_RESERVED). e820__reserve_resources_late() happens at subsys_initcall time. Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-11-07 15:45:09 +01:00
Dan Williams	c710fcc5d9	ACPI: NUMA: Establish a new drivers/acpi/numa/ directory Currently hmat.c lives under an "hmat" directory which does not enhance the description of the file. The initial motivation for giving hmat.c its own directory was to delineate it as mm functionality in contrast to ACPI device driver functionality. As ACPI continues to play an increasing role in conveying memory location and performance topology information to the OS take the opportunity to co-locate these NUMA relevant tables in a combined directory. numa.c is renamed to srat.c and moved to drivers/acpi/numa/ along with hmat.c. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-11-07 15:43:38 +01:00

10 Commits
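
Commit 0f1839d088 above switches HMAT prints from %d to %u so that u32 values above INT_MAX are shown correctly. The effect can be sketched outside the kernel as follows. This is a minimal illustration in Python, not the kernel code: the helper names are hypothetical, and the units (nanoseconds for latency, MB/s for bandwidth) are an assumption inferred from the "4.29s" and "4PB/s" figures quoted in the commit message.

```python
import ctypes

U32_MAX = 2**32 - 1   # largest value a u32 can hold
INT_MAX = 2**31 - 1   # largest value a signed 32-bit int can hold


def as_signed_32(value: int) -> int:
    """Reinterpret the low 32 bits as a signed int, roughly what %d shows for a u32."""
    return ctypes.c_int32(value).value


def as_unsigned_32(value: int) -> int:
    """Reinterpret the low 32 bits as an unsigned int, what %u shows for a u32."""
    return ctypes.c_uint32(value).value


if __name__ == "__main__":
    raw = U32_MAX
    print("signed view:  ", as_signed_32(raw))    # wraps to a negative number
    print("unsigned view:", as_unsigned_32(raw))  # full u32 range preserved

    # Assumed units: latency in nanoseconds, bandwidth in MB/s.
    print("max latency (s):     ", raw / 1e9)
    print("max bandwidth (PB/s):", raw / 1e9)
```

Any value above INT_MAX comes out negative in the signed view, which is the misprint the commit avoids; under the assumed units the unsigned upper bound works out to roughly 4.29 s and 4.29 PB/s.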