/sys/firmware/acpi/hotplug/force_remove was presumably added to support
auto offlining in the past. This is, however, inherently dangerous for
some hotplugable resources like memory. The memory offlining fails when
the memory is still in use and cannot be dropped or migrated. If we
ignore the failure we are basically allowing for subtle memory
corruption or a crash.
We have actually noticed the later while hitting BUG() during the memory
hotremove (remove_memory):
ret = walk_memory_range(PFN_DOWN(start), PFN_UP(start + size - 1), NULL,
check_memblock_offlined_cb);
if (ret)
BUG();
it took us quite non-trivial time realize that the customer had
force_remove enabled. Even if the BUG was removed here and we could
propagate the error up the call chain it wouldn't help at all because
then we would hit a crash or a memory corruption later and harder to
debug. So force_remove is unfixable for the memory hotremove. We haven't
checked other hotplugable resources to be prone to a similar problems.
Remove the force_remove functionality because it is not fixable currently.
Keep the sysfs file and report an error if somebody tries to enable it.
Encourage users to report about the missing functionality and work with
them with an alternative solution.
Reviewed-by: Lee, Chun-Yi <jlee@suse.com>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>