License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information.
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier should be applied to
a file was done in a spreadsheet of side-by-side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few thousand files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier                             # files
---------------------------------------------------|-------
GPL-2.0                                               11139
and resulted in the first patch in this series.
If the file was in a */uapi/* path, it was tagged "GPL-2.0 WITH
Linux-syscall-note"; otherwise it was tagged "GPL-2.0". The results were:
SPDX license identifier                             # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note                         930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier                             # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note                         270
GPL-2.0+ WITH Linux-syscall-note                        169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)      21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)      17
LGPL-2.1+ WITH Linux-syscall-note                        15
GPL-1.0+ WITH Linux-syscall-note                         14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)      5
LGPL-2.0+ WITH Linux-syscall-note                         4
LGPL-2.1 WITH Linux-syscall-note                          3
((GPL-2.0 WITH Linux-syscall-note) OR MIT)                3
((GPL-2.0 WITH Linux-syscall-note) AND MIT)               1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
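The heuristics above can be sketched as a small decision helper. This is only an illustration of the rules as described in this message, not the spreadsheet process or any script that was actually used; the names `classify`, `default_spdx_id` and `enum verdict` are hypothetical, and a `NULL` scanner result is assumed to mean "no license traces found".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical outcome of combining the two scanner results for one file. */
enum verdict {
	VERDICT_DEFAULT,	/* neither scanner found anything: COPYING applies */
	VERDICT_AGREED,		/* both scanners reported the same license */
	VERDICT_MANUAL,		/* scanners disagree: manual inspection needed */
};

/* scan_a/scan_b: detected license text, or NULL for "nothing detected". */
static enum verdict classify(const char *scan_a, const char *scan_b)
{
	if (!scan_a && !scan_b)
		return VERDICT_DEFAULT;
	if (scan_a && scan_b && strcmp(scan_a, scan_b) == 0)
		return VERDICT_AGREED;
	return VERDICT_MANUAL;
}

/* Identifier applied to a file in which no license information was found. */
static const char *default_spdx_id(const char *path)
{
	assert(path != NULL);
	/* Files under a uapi directory get the syscall-note exception. */
	if (strstr(path, "/uapi/") != NULL)
		return "GPL-2.0 WITH Linux-syscall-note";
	return "GPL-2.0";
}
```

A file for which `classify()` returns `VERDICT_MANUAL` corresponds to the cases that were inspected by hand and, where needed, confirmed with lawyers.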
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were any new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with the SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In the initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
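As a rough illustration of why header and source files need different handling, the sketch below formats the tag line for a given path. It is not the script mentioned above (which is not shown here); `spdx_tag_line` and `has_suffix` are hypothetical names. The block-comment form for headers matches the first line of this file, and the `//` form for .c files is assumed from the kernel's documented convention.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns nonzero if path ends with the given suffix. */
static int has_suffix(const char *path, const char *suffix)
{
	size_t plen = strlen(path), slen = strlen(suffix);

	return plen >= slen && strcmp(path + plen - slen, suffix) == 0;
}

/*
 * Writes the SPDX tag line for path into buf. Returns the snprintf()
 * result, or -1 for a file type this sketch does not handle.
 */
static int spdx_tag_line(const char *path, const char *id, char *buf, size_t len)
{
	assert(path && id && buf);

	if (has_suffix(path, ".h"))	/* headers: block comment */
		return snprintf(buf, len, "/* SPDX-License-Identifier: %s */", id);
	if (has_suffix(path, ".c"))	/* C sources: line comment (assumed) */
		return snprintf(buf, len, "// SPDX-License-Identifier: %s", id);
	return -1;
}
```

Other file types (Makefiles, assembly, scripts) would need their own comment syntax, which is the kind of refinement the paragraph above describes.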
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 22:07:57 +08:00
/* SPDX-License-Identifier: GPL-2.0 */
2005-04-17 06:20:36 +08:00
#ifndef _LINUX_GENHD_H
#define _LINUX_GENHD_H

/*
 * genhd.h Copyright (C) 1992 Drew Eckhardt
 * Generic hard disk header file by
 * Drew Eckhardt
 *
 * <drew@colorado.edu>
 */

#include <linux/types.h>
2007-05-22 04:08:01 +08:00
#include <linux/kdev_t.h>
2008-09-03 15:03:02 +08:00
#include <linux/rcupdate.h>
2010-09-01 04:47:05 +08:00
#include <linux/slab.h>
2015-07-16 11:16:45 +08:00
#include <linux/percpu-refcount.h>
2016-05-21 08:01:24 +08:00
#include <linux/uuid.h>
2018-07-18 19:47:38 +08:00
#include <linux/blk_types.h>
2018-12-07 00:41:20 +08:00
#include <asm/local.h>
2024-06-11 20:26:44 +08:00
#include <linux/kabi.h>
2005-04-17 06:20:36 +08:00

[PATCH] BLOCK: Make it possible to disable the block layer [try #6]
Make it possible to disable the block layer. Not all embedded devices require
it, some can make do with just JFFS2, NFS, ramfs, etc - none of which require
the block layer to be present.
This patch does the following:
 (*) Introduces CONFIG_BLOCK to disable the block layer, buffering and blockdev
     support.
 (*) Adds dependencies on CONFIG_BLOCK to any configuration item that controls
     an item that uses the block layer. This includes:
     (*) Block I/O tracing.
     (*) Disk partition code.
     (*) All filesystems that are block based, eg: Ext3, ReiserFS, ISOFS.
     (*) The SCSI layer. As far as I can tell, even SCSI chardevs use the
         block layer to do scheduling. Some drivers that use SCSI facilities -
         such as USB storage - end up disabled indirectly from this.
     (*) Various block-based device drivers, such as IDE and the old CDROM
         drivers.
     (*) MTD blockdev handling and FTL.
     (*) JFFS - which uses set_bdev_super(), something it could avoid doing by
         taking a leaf out of JFFS2's book.
 (*) Makes most of the contents of linux/blkdev.h, linux/buffer_head.h and
     linux/elevator.h contingent on CONFIG_BLOCK being set. sector_div() is,
     however, still used in places, and so is still available.
 (*) Also made contingent are the contents of linux/mpage.h, linux/genhd.h and
     parts of linux/fs.h.
 (*) Makes a number of files in fs/ contingent on CONFIG_BLOCK.
 (*) Makes mm/bounce.c (bounce buffering) contingent on CONFIG_BLOCK.
 (*) set_page_dirty() doesn't call __set_page_dirty_buffers() if CONFIG_BLOCK
     is not enabled.
 (*) fs/no-block.c is created to hold out-of-line stubs and things that are
     required when CONFIG_BLOCK is not set:
     (*) Default blockdev file operations (to give error ENODEV on opening).
 (*) Makes some /proc changes:
     (*) /proc/devices does not list any blockdevs.
     (*) /proc/diskstats and /proc/partitions are contingent on CONFIG_BLOCK.
 (*) Makes some compat ioctl handling contingent on CONFIG_BLOCK.
 (*) If CONFIG_BLOCK is not defined, makes sys_quotactl() return -ENODEV if
     given command other than Q_SYNC or if a special device is specified.
 (*) In init/do_mounts.c, no reference is made to the blockdev routines if
     CONFIG_BLOCK is not defined. This does not prohibit NFS roots or JFFS2.
 (*) The bdflush, ioprio_set and ioprio_get syscalls can now be absent (return
     error ENOSYS by way of cond_syscall if so).
 (*) The seclvl_bd_claim() and seclvl_bd_release() security calls do nothing if
     CONFIG_BLOCK is not set, since they can't then happen.
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-10-01 02:45:40 +08:00
#ifdef CONFIG_BLOCK

2008-08-29 15:01:47 +08:00
#define dev_to_disk(device)	container_of((device), struct gendisk, part0.__dev)
2008-08-25 18:56:05 +08:00
#define dev_to_part(device)	container_of((device), struct hd_struct, __dev)
2008-08-29 15:01:47 +08:00
#define disk_to_dev(disk)	(&(disk)->part0.__dev)
2008-08-25 18:56:05 +08:00
#define part_to_dev(part)	(&((part)->__dev))
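The four conversion macros above rely on `container_of()` to get from an embedded `struct device` back to the structure that contains it. The following userspace sketch shows the round trip; the structs are simplified stand-ins (the real ones have many more fields), the `container_of` definition is a plain `offsetof`-based approximation without the kernel's type checking, and `roundtrip_ok` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace approximation of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified stand-ins for the kernel structures. */
struct device { int id; };
struct hd_struct { long start_sect; struct device __dev; };
struct gendisk { int major; struct hd_struct part0; };

#define dev_to_disk(device)	container_of((device), struct gendisk, part0.__dev)
#define dev_to_part(device)	container_of((device), struct hd_struct, __dev)
#define disk_to_dev(disk)	(&(disk)->part0.__dev)
#define part_to_dev(part)	(&((part)->__dev))

/* Round trip: disk -> embedded device -> disk, and the same for part0. */
static int roundtrip_ok(struct gendisk *disk)
{
	assert(disk != NULL);
	return dev_to_disk(disk_to_dev(disk)) == disk &&
	       dev_to_part(part_to_dev(&disk->part0)) == &disk->part0;
}
```

Because the whole-disk device lives inside `part0`, the same `struct device` pointer can be converted to either the partition or the disk, depending on which macro is applied.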
2007-05-22 04:08:01 +08:00

extern struct device_type part_type;
extern struct kobject *block_depr;
extern struct class block_class;
2024-06-11 20:08:33 +08:00
extern const struct device_type disk_type;
2007-05-22 04:08:01 +08:00

2005-04-17 06:20:36 +08:00
enum {
/* These three have identical behaviour; use the second one if DOS FDISK gets
   confused about extended/logical partitions starting past cylinder 1023. */
	DOS_EXTENDED_PARTITION = 5,
	LINUX_EXTENDED_PARTITION = 0x85,
	WIN98_EXTENDED_PARTITION = 0x0f,

2007-02-11 15:50:00 +08:00
	SUN_WHOLE_DISK = DOS_EXTENDED_PARTITION,

2005-04-17 06:20:36 +08:00
	LINUX_SWAP_PARTITION = 0x82,
2007-02-10 17:45:47 +08:00
	LINUX_DATA_PARTITION = 0x83,
	LINUX_LVM_PARTITION = 0x8e,
2005-04-17 06:20:36 +08:00
	LINUX_RAID_PARTITION = 0xfd,	/* autodetect RAID partition */

	SOLARIS_X86_PARTITION = LINUX_SWAP_PARTITION,
	NEW_SOLARIS_X86_PARTITION = 0xbf,

	DM6_AUX1PARTITION = 0x51,	/* no DDO: use xlated geom */
	DM6_AUX3PARTITION = 0x53,	/* no DDO: use xlated geom */
	DM6_PARTITION = 0x54,		/* has DDO: use xlated geom & offset */
	EZD_PARTITION = 0x55,		/* EZ-DRIVE */

	FREEBSD_PARTITION = 0xa5,	/* FreeBSD Partition ID */
	OPENBSD_PARTITION = 0xa6,	/* OpenBSD Partition ID */
	NETBSD_PARTITION = 0xa9,	/* NetBSD Partition ID */
	BSDI_PARTITION = 0xb7,		/* BSDI Partition ID */
	MINIX_PARTITION = 0x81,		/* Minix Partition ID */
	UNIXWARE_PARTITION = 0x63,	/* Same as GNU_HURD and SCO Unix */
};

2008-08-25 18:56:16 +08:00
#define DISK_MAX_PARTS			256
2008-08-25 18:56:17 +08:00
#define DISK_NAME_LEN			32
2008-08-25 18:56:16 +08:00

2006-04-25 21:07:57 +08:00
#include <linux/major.h>
#include <linux/device.h>
#include <linux/smp.h>
#include <linux/string.h>
#include <linux/fs.h>
2007-05-24 04:57:38 +08:00
#include <linux/workqueue.h>
2006-04-25 21:07:57 +08:00

2005-04-17 06:20:36 +08:00
struct partition {
	unsigned char boot_ind;		/* 0x80 - active */
	unsigned char head;		/* starting head */
	unsigned char sector;		/* starting sector */
	unsigned char cyl;		/* starting cylinder */
	unsigned char sys_ind;		/* What partition type */
	unsigned char end_head;		/* end head */
	unsigned char end_sector;	/* end sector */
	unsigned char end_cyl;		/* end cylinder */
	__le32 start_sect;		/* starting sector counting from 0 */
	__le32 nr_sects;		/* nr of sectors in partition */
} __attribute__((packed));

2008-02-08 18:04:09 +08:00
struct disk_stats {
2018-09-22 07:44:34 +08:00
	u64 nsecs[NR_STAT_GROUPS];
2018-07-18 19:47:38 +08:00
	unsigned long sectors[NR_STAT_GROUPS];
	unsigned long ios[NR_STAT_GROUPS];
	unsigned long merges[NR_STAT_GROUPS];
2008-02-08 18:04:09 +08:00
	unsigned long io_ticks;
	unsigned long time_in_queue;
2018-12-07 00:41:20 +08:00
	local_t in_flight[2];
2008-02-08 18:04:09 +08:00
};
2010-09-01 04:47:05 +08:00

#define PARTITION_META_INFO_VOLNAMELTH	64
2012-11-09 08:12:25 +08:00
/*
 * Enough for the string representation of any kind of UUID plus NULL.
 * EFI UUID is 36 characters. MSDOS UUID is 11 characters.
 */
2016-05-21 08:01:24 +08:00
#define PARTITION_META_INFO_UUIDLTH	(UUID_STRING_LEN + 1)
2010-09-01 04:47:05 +08:00

struct partition_meta_info {
2012-11-09 08:12:25 +08:00
	char uuid[PARTITION_META_INFO_UUIDLTH];
2010-09-01 04:47:05 +08:00
	u8 volname[PARTITION_META_INFO_VOLNAMELTH];
};

2005-04-17 06:20:36 +08:00
struct hd_struct {
	sector_t start_sect;
2012-08-01 18:24:18 +08:00
	/*
	 * nr_sects is protected by sequence counter. One might extend a
	 * partition while IO is happening to it and update of nr_sects
	 * can be non-atomic on 32bit machines with 64bit sector_t.
	 */
2005-04-17 06:20:36 +08:00
	sector_t nr_sects;
2012-08-01 18:24:18 +08:00
	seqcount_t nr_sects_seq;
2009-05-23 05:17:53 +08:00
	sector_t alignment_offset;
2011-05-30 13:42:51 +08:00
	unsigned int discard_alignment;
2008-08-25 18:56:05 +08:00
	struct device __dev;
2006-03-27 17:17:55 +08:00
	struct kobject *holder_dir;
2005-04-17 06:20:36 +08:00
	int policy, partno;
2010-09-01 04:47:05 +08:00
	struct partition_meta_info *info;
2006-12-08 18:39:46 +08:00
#ifdef CONFIG_FAIL_MAKE_REQUEST
	int make_it_fail;
2008-02-08 18:04:09 +08:00
#endif
	unsigned long stamp;
#ifdef	CONFIG_SMP
2010-02-02 13:38:57 +08:00
	struct disk_stats __percpu *dkstats;
2008-02-08 18:04:09 +08:00
#else
	struct disk_stats dkstats;
2006-12-08 18:39:46 +08:00
#endif
2015-07-16 11:16:45 +08:00
	struct percpu_ref ref;
2024-06-11 20:26:44 +08:00

	KABI_RESERVE(1);
	KABI_RESERVE(2);
	KABI_RESERVE(3);
	KABI_RESERVE(4);

block: use rcu_work instead of call_rcu to avoid sleep in softirq
We recently got a stack by syzkaller like this:
BUG: sleeping function called from invalid context at mm/slab.h:361
in_atomic(): 1, irqs_disabled(): 0, pid: 6644, name: blkid
INFO: lockdep is turned off.
CPU: 1 PID: 6644 Comm: blkid Not tainted 4.4.163-514.55.6.9.x86_64+ #76
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
0000000000000000 5ba6a6b879e50c00 ffff8801f6b07b10 ffffffff81cb2194
0000000041b58ab3 ffffffff833c7745 ffffffff81cb2080 5ba6a6b879e50c00
0000000000000000 0000000000000001 0000000000000004 0000000000000000
Call Trace:
<IRQ> [<ffffffff81cb2194>] __dump_stack lib/dump_stack.c:15 [inline]
<IRQ> [<ffffffff81cb2194>] dump_stack+0x114/0x1a0 lib/dump_stack.c:51
[<ffffffff8129a981>] ___might_sleep+0x291/0x490 kernel/sched/core.c:7675
[<ffffffff8129ac33>] __might_sleep+0xb3/0x270 kernel/sched/core.c:7637
[<ffffffff81794c13>] slab_pre_alloc_hook mm/slab.h:361 [inline]
[<ffffffff81794c13>] slab_alloc_node mm/slub.c:2610 [inline]
[<ffffffff81794c13>] slab_alloc mm/slub.c:2692 [inline]
[<ffffffff81794c13>] kmem_cache_alloc_trace+0x2c3/0x5c0 mm/slub.c:2709
[<ffffffff81cbe9a7>] kmalloc include/linux/slab.h:479 [inline]
[<ffffffff81cbe9a7>] kzalloc include/linux/slab.h:623 [inline]
[<ffffffff81cbe9a7>] kobject_uevent_env+0x2c7/0x1150 lib/kobject_uevent.c:227
[<ffffffff81cbf84f>] kobject_uevent+0x1f/0x30 lib/kobject_uevent.c:374
[<ffffffff81cbb5b9>] kobject_cleanup lib/kobject.c:633 [inline]
[<ffffffff81cbb5b9>] kobject_release+0x229/0x440 lib/kobject.c:675
[<ffffffff81cbb0a2>] kref_sub include/linux/kref.h:73 [inline]
[<ffffffff81cbb0a2>] kref_put include/linux/kref.h:98 [inline]
[<ffffffff81cbb0a2>] kobject_put+0x72/0xd0 lib/kobject.c:692
[<ffffffff8216f095>] put_device+0x25/0x30 drivers/base/core.c:1237
[<ffffffff81c4cc34>] delete_partition_rcu_cb+0x1d4/0x2f0 block/partition-generic.c:232
[<ffffffff813c08bc>] __rcu_reclaim kernel/rcu/rcu.h:118 [inline]
[<ffffffff813c08bc>] rcu_do_batch kernel/rcu/tree.c:2705 [inline]
[<ffffffff813c08bc>] invoke_rcu_callbacks kernel/rcu/tree.c:2973 [inline]
[<ffffffff813c08bc>] __rcu_process_callbacks kernel/rcu/tree.c:2940 [inline]
[<ffffffff813c08bc>] rcu_process_callbacks+0x59c/0x1c70 kernel/rcu/tree.c:2957
[<ffffffff8120f509>] __do_softirq+0x299/0xe20 kernel/softirq.c:273
[<ffffffff81210496>] invoke_softirq kernel/softirq.c:350 [inline]
[<ffffffff81210496>] irq_exit+0x216/0x2c0 kernel/softirq.c:391
[<ffffffff82c2cd7b>] exiting_irq arch/x86/include/asm/apic.h:652 [inline]
[<ffffffff82c2cd7b>] smp_apic_timer_interrupt+0x8b/0xc0 arch/x86/kernel/apic/apic.c:926
[<ffffffff82c2bc25>] apic_timer_interrupt+0xa5/0xb0 arch/x86/entry/entry_64.S:746
<EOI> [<ffffffff814cbf40>] ? audit_kill_trees+0x180/0x180
[<ffffffff8187d2f7>] fd_install+0x57/0x80 fs/file.c:626
[<ffffffff8180989e>] do_sys_open+0x45e/0x550 fs/open.c:1043
[<ffffffff818099c2>] SYSC_open fs/open.c:1055 [inline]
[<ffffffff818099c2>] SyS_open+0x32/0x40 fs/open.c:1050
[<ffffffff82c299e1>] entry_SYSCALL_64_fastpath+0x1e/0x9a
In softirq context, we call the RCU callback function delete_partition_rcu_cb(),
which may allocate memory with kzalloc() using the GFP_KERNEL flag. If the
allocation cannot be satisfied immediately, it may sleep. However, that is not
allowed in softirq context.
Although we found this problem on linux 4.4, the latest kernel version
seems to have this problem as well. And it is very similar to the
previous one:
https://lkml.org/lkml/2018/7/9/391
Fix it by using RCU workqueue, which allows sleep.
Reviewed-by: Paul E. McKenney <paulmck@linux.ibm.com>
Signed-off-by: Yufen Yu <yuyufen@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-28 16:42:01 +08:00
	struct rcu_work rcu_work;
2005-04-17 06:20:36 +08:00
};

#define GENHD_FL_REMOVABLE			1
2010-03-16 15:55:32 +08:00
/* 2 is unused */
2007-05-24 04:57:38 +08:00
#define GENHD_FL_MEDIA_CHANGE_NOTIFY		4
2005-04-17 06:20:36 +08:00
#define GENHD_FL_CD				8
#define GENHD_FL_UP				16
#define GENHD_FL_SUPPRESS_PARTITION_INFO	32
2008-08-25 18:56:16 +08:00
#define GENHD_FL_EXT_DEVT			64 /* allow extended devt */
2009-06-07 19:52:52 +08:00
#define GENHD_FL_NATIVE_CAPACITY		128
2011-04-22 02:54:46 +08:00
#define GENHD_FL_BLOCK_EVENTS_ON_EXCL_WRITE	256
2011-08-24 02:01:04 +08:00
#define GENHD_FL_NO_PART_SCAN			512
2017-11-03 02:29:53 +08:00
#define GENHD_FL_HIDDEN				1024
2005-04-17 06:20:36 +08:00

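Each `GENHD_FL_*` value above is a distinct power of two, so several of them can be combined in a single flags word. A minimal sketch of setting, testing and clearing such flags follows; the helper names `genhd_set`, `genhd_test` and `genhd_clear` are hypothetical and only a few of the constants are repeated here.

```c
#include <assert.h>

#define GENHD_FL_REMOVABLE	1
#define GENHD_FL_CD		8
#define GENHD_FL_UP		16
#define GENHD_FL_NO_PART_SCAN	512

/* Returns flags with the given single-bit flag set. */
static int genhd_set(int flags, int flag)
{
	assert(flag > 0 && (flag & (flag - 1)) == 0);	/* exactly one bit */
	return flags | flag;
}

/* Returns nonzero if the flag is present in flags. */
static int genhd_test(int flags, int flag)
{
	return (flags & flag) != 0;
}

/* Returns flags with the given flag cleared. */
static int genhd_clear(int flags, int flag)
{
	return flags & ~flag;
}
```

For example, a removable CD drive would carry `GENHD_FL_REMOVABLE | GENHD_FL_CD`, which is the value 9.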
implement in-kernel gendisk events handling
Currently, media presence polling for removable block devices is done
from userland. There are several issues with this.
* Polling is done by periodically opening the device. For SCSI
devices, the command sequence generated by such action involves a
few different commands including TEST_UNIT_READY. This behavior,
while perfectly legal, is different from Windows, which only issues a
single command, GET_EVENT_STATUS_NOTIFICATION. Unfortunately, some
ATAPI devices lock up after being periodically queried with such command
sequences.
* There is no reliable and unintrusive way for a userland program to
tell whether the target device is safe for media presence polling.
For example, polling for media presence during an on-going burning
session can make it fail. The polling program can avoid this by
opening the device with O_EXCL but then it risks making a valid
exclusive user of the device fail w/ -EBUSY.
* Userland polling is unnecessarily heavy and in-kernel implementation
is lighter and better coordinated (workqueue, timer slack).
This patch implements a framework for in-kernel disk event handling,
which includes media presence polling.
* bdops->check_events() is added, which supersedes ->media_changed().
It should check whether there's any pending event and return it if so.
Currently, two events are defined - DISK_EVENT_MEDIA_CHANGE and
DISK_EVENT_EJECT_REQUEST. ->check_events() is guaranteed not to be
called in parallel.
* gendisk->events and ->async_events are added. These should be
initialized by block driver before passing the device to add_disk().
The former contains the mask of all supported events and the latter
the mask of all events which the device can report without polling.
/sys/block/*/events[_async] export these to userland.
* Kernel parameter block.events_dfl_poll_msecs controls the system
polling interval (default is 0 which means disable) and
/sys/block/*/events_poll_msecs controls polling intervals for
individual devices (default is -1 meaning use system setting). Note
that if a device can report all supported events asynchronously and
its polling interval isn't explicitly set, the device won't be
polled regardless of the system polling interval.
* If a device is opened exclusively with write access, event checking
is automatically disabled until all write exclusive accesses are
released.
* There are event 'clearing' events. For example, both of currently
defined events are cleared after the device has been successfully
opened. This information is passed to ->check_events() callback
using @clearing argument as a hint.
* Event checking is always performed from system_nrt_wq and timer
slack is set to 25% for polling.
* Nothing changes for drivers which implement ->media_changed() but
not ->check_events(). Going forward, all drivers will be converted
to ->check_events() and ->media_changed() will be dropped.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-12-09 03:57:37 +08:00
enum {
	DISK_EVENT_MEDIA_CHANGE			= 1 << 0, /* media changed */
	DISK_EVENT_EJECT_REQUEST		= 1 << 1, /* eject requested */
};

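The polling rules described in the commit message above (per-device setting of -1 meaning "use the system setting", a system default of 0 meaning "disabled", and no polling when every supported event can be reported asynchronously) can be written down as a small function. This is a sketch of those rules as stated, not the kernel's implementation; `effective_poll_msecs` and its parameters are hypothetical names.

```c
#include <assert.h>

enum {
	DISK_EVENT_MEDIA_CHANGE = 1 << 0,	/* media changed */
	DISK_EVENT_EJECT_REQUEST = 1 << 1,	/* eject requested */
};

/*
 * Returns the polling interval in msecs, with 0 meaning "do not poll".
 * events:         mask of all events the device supports
 * async_events:   mask of events reported without polling
 * dev_poll_msecs: per-device setting, -1 means "use system setting"
 * sys_poll_msecs: system-wide default, 0 means "disabled"
 */
static long effective_poll_msecs(unsigned int events, unsigned int async_events,
				 long dev_poll_msecs, long sys_poll_msecs)
{
	assert(dev_poll_msecs >= -1 && sys_poll_msecs >= 0);

	if (dev_poll_msecs >= 0)
		return dev_poll_msecs;	/* explicit per-device setting wins */
	if (events & ~async_events)
		return sys_poll_msecs;	/* some events can only be polled */
	return 0;			/* everything arrives asynchronously */
}
```

With both events supported and none asynchronous, a device left at -1 simply inherits the system interval; once every supported event is asynchronous, it is not polled unless its own interval is set explicitly.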
2019-03-27 21:51:02 +08:00
enum {
	/* Poll even if events_poll_msecs is unset */
	DISK_EVENT_FLAG_POLL			= 1 << 0,
	/* Forward events to udev */
	DISK_EVENT_FLAG_UEVENT			= 1 << 1,
};

2008-08-25 18:56:15 +08:00
struct disk_part_tbl {
	struct rcu_head rcu_head;
	int len;
2010-02-25 03:01:56 +08:00
	struct hd_struct __rcu *last_lookup;
	struct hd_struct __rcu *part[];
2008-08-25 18:56:15 +08:00
};

2010-12-09 03:57:37 +08:00
struct disk_events;
2016-01-10 00:36:51 +08:00
struct badblocks;
2010-12-09 03:57:37 +08:00

2015-10-22 01:19:49 +08:00
#if defined(CONFIG_BLK_DEV_INTEGRITY)

struct blk_integrity {
2017-03-25 09:03:48 +08:00
	const struct blk_integrity_profile *profile;
	unsigned char flags;
	unsigned char tuple_size;
	unsigned char interval_exp;
	unsigned char tag_size;
2024-06-11 20:26:44 +08:00

	KABI_RESERVE(1);
	KABI_RESERVE(2);
2015-10-22 01:19:49 +08:00
};

#endif	/* CONFIG_BLK_DEV_INTEGRITY */

2005-04-17 06:20:36 +08:00
struct gendisk {
2008-08-25 18:56:16 +08:00
	/* major, first_minor and minors are input parameters only,
	 * don't use directly. Use disk_devt() and disk_max_parts().
2008-09-03 15:01:48 +08:00
	 */
2005-04-17 06:20:36 +08:00
	int major;			/* major number of driver */
	int first_minor;
	int minors;			/* maximum number of minors, =1 for
					 * disks that can't be partitioned. */
2008-09-03 15:01:48 +08:00

2008-08-25 18:56:17 +08:00
	char disk_name[DISK_NAME_LEN];	/* name of major driver */
2011-07-24 08:24:48 +08:00
	char *(*devnode)(struct gendisk *gd, umode_t *mode);

	unsigned short events;		/* supported events */
	unsigned short event_flags;	/* flags related to event processing */

	/* Array of pointers to partitions indexed by partno.
	 * Protected with matching bdev lock but stat and other
	 * non-critical accesses use RCU.  Always access through
	 * helpers.
	 */
	struct disk_part_tbl __rcu *part_tbl;
	struct hd_struct part0;

	const struct block_device_operations *fops;
	struct request_queue *queue;
	void *private_data;

	int flags;
	struct rw_semaphore lookup_sem;
	struct kobject *slave_dir;

	struct timer_rand_state *random;
	atomic_t sync_io;		/* RAID */
	struct disk_events *ev;
#ifdef  CONFIG_BLK_DEV_INTEGRITY
	struct kobject integrity_kobj;
#endif	/* CONFIG_BLK_DEV_INTEGRITY */
	int node_id;
	struct badblocks *bb;
	struct lockdep_map lockdep_map;

	KABI_RESERVE(1);
	KABI_RESERVE(2);
	KABI_RESERVE(3);
	KABI_RESERVE(4);
};

static inline struct gendisk *part_to_disk(struct hd_struct *part)
{
	if (likely(part)) {
		if (part->partno)
			return dev_to_disk(part_to_dev(part)->parent);
		else
			return dev_to_disk(part_to_dev(part));
	}
	return NULL;
}

static inline int disk_max_parts(struct gendisk *disk)
{
	if (disk->flags & GENHD_FL_EXT_DEVT)
		return DISK_MAX_PARTS;
	return disk->minors;
}

static inline bool disk_part_scan_enabled(struct gendisk *disk)
{
	return disk_max_parts(disk) > 1 &&
		!(disk->flags & GENHD_FL_NO_PART_SCAN);
}
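
The two helpers above decide how many partitions a disk may carry and whether it is scanned for partitions at all. A minimal user-space sketch of the same decision follows. Everything prefixed `demo_` or `DEMO_` is hypothetical, and the flag and limit values are stand-ins chosen for the demo rather than values taken from this header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in values for the demo; the real constants live in kernel headers. */
#define DEMO_FL_EXT_DEVT	0x40
#define DEMO_FL_NO_PART_SCAN	0x200
#define DEMO_DISK_MAX_PARTS	256

/* Hypothetical cut-down disk: only the two fields the helpers look at. */
struct demo_disk {
	int minors;
	int flags;
};

/* Mirrors disk_max_parts(): extended devt lifts the limit to the maximum. */
static int demo_max_parts(const struct demo_disk *disk)
{
	assert(disk != NULL);
	if (disk->flags & DEMO_FL_EXT_DEVT)
		return DEMO_DISK_MAX_PARTS;
	return disk->minors;
}

/* Mirrors disk_part_scan_enabled(): more than one slot and scanning not vetoed. */
static bool demo_part_scan_enabled(const struct demo_disk *disk)
{
	return demo_max_parts(disk) > 1 &&
		!(disk->flags & DEMO_FL_NO_PART_SCAN);
}
```

With these stand-ins, a disk with `minors == 1` and no flags is not scanned, while the same disk with the extended-devt flag set is, unless the no-scan flag is also present.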

static inline dev_t disk_devt(struct gendisk *disk)
{
	return MKDEV(disk->major, disk->first_minor);
}

static inline dev_t part_devt(struct hd_struct *part)
{
	return part_to_dev(part)->devt;
}
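
`disk_devt()` packs the major and first minor number into a single `dev_t` with `MKDEV()`. The sketch below shows that packing in user space, assuming the in-kernel layout of 20 minor bits below the major number; that layout is defined outside this header, and the `demo_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed in-kernel dev_t layout: 20 low bits of minor, major above them. */
#define DEMO_MINORBITS	20
#define DEMO_MINORMASK	((1U << DEMO_MINORBITS) - 1)

/* Pack major and minor the way MKDEV() is assumed to. */
static uint32_t demo_mkdev(uint32_t major, uint32_t minor)
{
	/* Demo-only guard: an oversized minor would bleed into the major. */
	assert(minor <= DEMO_MINORMASK);
	return (major << DEMO_MINORBITS) | minor;
}

/* Recover the major number from a packed value. */
static uint32_t demo_major(uint32_t dev)
{
	return dev >> DEMO_MINORBITS;
}

/* Recover the minor number from a packed value. */
static uint32_t demo_minor(uint32_t dev)
{
	return dev & DEMO_MINORMASK;
}
```

Under that assumption, major 8 with minor 17 round-trips through `demo_mkdev()`, `demo_major()` and `demo_minor()` unchanged.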

extern struct hd_struct *__disk_get_part(struct gendisk *disk, int partno);
extern struct hd_struct *disk_get_part(struct gendisk *disk, int partno);

static inline void disk_put_part(struct hd_struct *part)
{
	if (likely(part))
		put_device(part_to_dev(part));
}

/*
 * Smarter partition iterator without context limits.
 */
#define DISK_PITER_REVERSE	(1 << 0) /* iterate in the reverse direction */
#define DISK_PITER_INCL_EMPTY	(1 << 1) /* include 0-sized parts */
#define DISK_PITER_INCL_PART0	(1 << 2) /* include partition 0 */
#define DISK_PITER_INCL_EMPTY_PART0 (1 << 3) /* include empty partition 0 */

struct disk_part_iter {
	struct gendisk		*disk;
	struct hd_struct	*part;
	int			idx;
	unsigned int		flags;
};

extern void disk_part_iter_init(struct disk_part_iter *piter,
				 struct gendisk *disk, unsigned int flags);
extern struct hd_struct *disk_part_iter_next(struct disk_part_iter *piter);
extern void disk_part_iter_exit(struct disk_part_iter *piter);

extern struct hd_struct *disk_map_sector_rcu(struct gendisk *disk,
					     sector_t sector);
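
The iterator declared above is driven as init, repeated next, then exit, with the `DISK_PITER_*` flags choosing direction and which entries are skipped. The header only declares the functions, so the following is a user-space sketch of how such flags could be interpreted, not the code in block/genhd.c. All `demo_` names are hypothetical, and reference counting and RCU are left out.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PITER_REVERSE		(1 << 0)
#define DEMO_PITER_INCL_EMPTY		(1 << 1)
#define DEMO_PITER_INCL_PART0		(1 << 2)
#define DEMO_PITER_INCL_EMPTY_PART0	(1 << 3)

struct demo_part {
	int partno;
	unsigned long nr_sects;
};

/* Hypothetical table: slot 0 is the whole disk, NULL slots are unused. */
struct demo_disk {
	struct demo_part **parts;
	int len;
};

struct demo_piter {
	struct demo_disk *disk;
	struct demo_part *part;
	int idx;
	unsigned int flags;
};

static void demo_piter_init(struct demo_piter *piter, struct demo_disk *disk,
			    unsigned int flags)
{
	assert(piter != NULL && disk != NULL && disk->len >= 1);
	piter->disk = disk;
	piter->part = NULL;
	piter->flags = flags;
	if (flags & DEMO_PITER_REVERSE)
		piter->idx = disk->len - 1;
	else if (flags & (DEMO_PITER_INCL_PART0 | DEMO_PITER_INCL_EMPTY_PART0))
		piter->idx = 0;
	else
		piter->idx = 1;
}

/* Return the next partition that passes the flag filters, or NULL when done. */
static struct demo_part *demo_piter_next(struct demo_piter *piter)
{
	unsigned int want0 = piter->flags &
		(DEMO_PITER_INCL_PART0 | DEMO_PITER_INCL_EMPTY_PART0);
	int reverse = (piter->flags & DEMO_PITER_REVERSE) != 0;
	int inc = reverse ? -1 : 1;
	int end = reverse ? (want0 ? -1 : 0) : piter->disk->len;

	piter->part = NULL;
	for (; reverse ? piter->idx > end : piter->idx < end; piter->idx += inc) {
		struct demo_part *part = piter->disk->parts[piter->idx];

		if (!part)
			continue;
		/* Skip zero-sized entries unless a flag asks for them. */
		if (!part->nr_sects &&
		    !(piter->flags & DEMO_PITER_INCL_EMPTY) &&
		    !((piter->flags & DEMO_PITER_INCL_EMPTY_PART0) &&
		      piter->idx == 0))
			continue;
		piter->part = part;
		piter->idx += inc;	/* resume after this entry next time */
		break;
	}
	return piter->part;
}

/* Nothing to release in the demo; the kernel version drops a reference. */
static void demo_piter_exit(struct demo_piter *piter)
{
	piter->part = NULL;
}
```

A caller would loop with `while ((part = demo_piter_next(&it)))` between `demo_piter_init()` and `demo_piter_exit()`. Keeping the cursor in the iterator struct is what lets the loop body run without holding a lock across iterations.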

/*
 * Macros to operate on percpu disk statistics:
 *
 * part_stat_{add|sub|inc|dec}() modify the stat counters
 * and should be called between part_stat_lock() and
 * part_stat_unlock().
 *
 * part_stat_read() can be called at any time.
 *
 * part_stat_{add|set_all}() and {init|free}_part_stats are for
 * internal use only.
 */
#ifdef	CONFIG_SMP
#define part_stat_lock()	({ rcu_read_lock(); get_cpu(); })
#define part_stat_unlock()	do { put_cpu(); rcu_read_unlock(); } while (0)

#define part_stat_get_cpu(part, field, cpu)				\
	(per_cpu_ptr((part)->dkstats, (cpu))->field)

#define part_stat_get(part, field)					\
	part_stat_get_cpu(part, field, smp_processor_id())

#define part_stat_read(part, field)					\
({									\
	typeof((part)->dkstats->field) res = 0;				\
	unsigned int _cpu;						\
	for_each_possible_cpu(_cpu)					\
		res += per_cpu_ptr((part)->dkstats, _cpu)->field;	\
	res;								\
})

static inline void part_stat_set_all(struct hd_struct *part, int value)
{
	int i;

	for_each_possible_cpu(i)
		memset(per_cpu_ptr(part->dkstats, i), value,
				sizeof(struct disk_stats));
}

static inline int init_part_stats(struct hd_struct *part)
{
	part->dkstats = alloc_percpu(struct disk_stats);
	if (!part->dkstats)
		return 0;
	part->stamp = jiffies - 1;
	return 1;
}

static inline void free_part_stats(struct hd_struct *part)
{
	free_percpu(part->dkstats);
}

#else /* !CONFIG_SMP */
#define part_stat_lock()	({ rcu_read_lock(); 0; })
#define part_stat_unlock()	rcu_read_unlock()

#define part_stat_get(part, field)		((part)->dkstats.field)
#define part_stat_get_cpu(part, field, cpu)	part_stat_get(part, field)
#define part_stat_read(part, field)		part_stat_get(part, field)

static inline void part_stat_set_all(struct hd_struct *part, int value)
{
	memset(&part->dkstats, value, sizeof(struct disk_stats));
}

static inline int init_part_stats(struct hd_struct *part)
{
	return 1;
}

static inline void free_part_stats(struct hd_struct *part)
{
}

#endif /* CONFIG_SMP */
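
On SMP builds the statistics are kept per CPU: a writer touches only its own CPU's slot, and `part_stat_read()` sums one field over all possible CPUs. The sketch below models that split in user space with a plain array standing in for the percpu area. The `demo_` names, the CPU count and the single `ios` field are assumptions made for the demo.

```c
#include <assert.h>
#include <string.h>

#define DEMO_NR_CPUS	4	/* hypothetical CPU count */

enum {
	DEMO_STAT_READ,
	DEMO_STAT_WRITE,
	DEMO_STAT_DISCARD,
	DEMO_NR_STAT_GROUPS
};

struct demo_disk_stats {
	unsigned long ios[DEMO_NR_STAT_GROUPS];
};

struct demo_part {
	/* Stands in for the percpu allocation behind part->dkstats. */
	struct demo_disk_stats dkstats[DEMO_NR_CPUS];
};

/* Writer side: update only the slot of the CPU we are "running" on. */
static void demo_stat_add(struct demo_part *part, int cpu, int group,
			  unsigned long n)
{
	assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
	part->dkstats[cpu].ios[group] += n;
}

/* Reader side: sum one field over every CPU slot, like part_stat_read(). */
static unsigned long demo_stat_read(const struct demo_part *part, int group)
{
	unsigned long res = 0;

	for (int cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		res += part->dkstats[cpu].ios[group];
	return res;
}

/* Sum the read, write and discard groups, like part_stat_read_accum(). */
static unsigned long demo_stat_read_accum(const struct demo_part *part)
{
	return demo_stat_read(part, DEMO_STAT_READ) +
	       demo_stat_read(part, DEMO_STAT_WRITE) +
	       demo_stat_read(part, DEMO_STAT_DISCARD);
}

/* Fill every per-CPU slot with one byte value, like part_stat_set_all(). */
static void demo_stat_set_all(struct demo_part *part, int value)
{
	for (int cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		memset(&part->dkstats[cpu], value,
		       sizeof(struct demo_disk_stats));
}
```

The point of the layout is that writers never contend on a shared counter; the cost moves to readers, which walk every slot.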

#define part_stat_read_msecs(part, which)				\
	div_u64(part_stat_read(part, nsecs[which]), NSEC_PER_MSEC)

#define part_stat_read_accum(part, field)				\
	(part_stat_read(part, field[STAT_READ]) +			\
	 part_stat_read(part, field[STAT_WRITE]) +			\
	 part_stat_read(part, field[STAT_DISCARD]))

#define __part_stat_add(part, field, addnd)				\
	(part_stat_get(part, field) += (addnd))

#define part_stat_add(part, field, addnd)	do {			\
	__part_stat_add((part), field, addnd);				\
	if ((part)->partno)						\
		__part_stat_add(&part_to_disk((part))->part0,		\
				field, addnd);				\
} while (0)
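
`part_stat_add()` updates the partition's own counter and, when `partno` is non-zero, also the whole-disk entry `part0`, so disk totals include every partition. A small user-space sketch of that propagation follows. The `demo_` types are hypothetical and reduce the statistics to a single counter.

```c
#include <assert.h>
#include <stddef.h>

struct demo_disk;

struct demo_part {
	int partno;			/* 0 means the whole-disk entry */
	struct demo_disk *disk;		/* stands in for part_to_disk() */
	unsigned long ios;
};

struct demo_disk {
	struct demo_part part0;
};

/* Add to the partition and, for real partitions, to part0 as well. */
static void demo_part_stat_add(struct demo_part *part, long addnd)
{
	assert(part != NULL && part->disk != NULL);
	part->ios += addnd;
	if (part->partno)
		part->disk->part0.ios += addnd;
}
```

Calling it on `part0` itself changes only `part0`, which avoids counting whole-disk I/O twice.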

#define part_stat_dec(gendiskp, field)					\
	part_stat_add(gendiskp, field, -1)
#define part_stat_inc(gendiskp, field)					\
	part_stat_add(gendiskp, field, 1)
#define part_stat_sub(gendiskp, field, subnd)				\
	part_stat_add(gendiskp, field, -subnd)

#define part_stat_local_dec(gendiskp, field)				\
	local_dec(&(part_stat_get(gendiskp, field)))
#define part_stat_local_inc(gendiskp, field)				\
	local_inc(&(part_stat_get(gendiskp, field)))
#define part_stat_local_read(gendiskp, field)				\
	local_read(&(part_stat_get(gendiskp, field)))
#define part_stat_local_read_cpu(gendiskp, field, cpu)			\
	local_read(&(part_stat_get_cpu(gendiskp, field, cpu)))

unsigned int part_in_flight(struct request_queue *q, struct hd_struct *part);
void part_in_flight_rw(struct request_queue *q, struct hd_struct *part,
		       unsigned int inflight[2]);
void part_dec_in_flight(struct request_queue *q, struct hd_struct *part,
			int rw);
void part_inc_in_flight(struct request_queue *q, struct hd_struct *part,
			int rw);

static inline struct partition_meta_info *alloc_part_info(struct gendisk *disk)
{
	if (disk)
		return kzalloc_node(sizeof(struct partition_meta_info),
				    GFP_KERNEL, disk->node_id);
	return kzalloc(sizeof(struct partition_meta_info), GFP_KERNEL);
}

static inline void free_part_info(struct hd_struct *part)
{
	kfree(part->info);
}

void update_io_ticks(struct hd_struct *part, unsigned long now, bool end);
void sync_io_ticks(struct hd_struct *part, bool busy);

/* block/genhd.c */
extern void device_add_disk(struct device *parent, struct gendisk *disk,
			    const struct attribute_group **groups);
static inline void add_disk(struct gendisk *disk)
{
	device_add_disk(NULL, disk, NULL);
}
extern void device_add_disk_no_queue_reg(struct device *parent, struct gendisk *disk);
static inline void add_disk_no_queue_reg(struct gendisk *disk)
{
	device_add_disk_no_queue_reg(NULL, disk);
}

extern void del_gendisk(struct gendisk *gp);
extern struct gendisk *get_gendisk(dev_t dev, int *partno);
extern struct block_device *bdget_disk(struct gendisk *disk, int partno);

extern void set_device_ro(struct block_device *bdev, int flag);
extern void set_disk_ro(struct gendisk *disk, int flag);

static inline int get_disk_ro(struct gendisk *disk)
{
	return disk->part0.policy;
}

extern void disk_block_events(struct gendisk *disk);
extern void disk_unblock_events(struct gendisk *disk);
extern void disk_flush_events(struct gendisk *disk, unsigned int mask);
extern unsigned int disk_clear_events(struct gendisk *disk, unsigned int mask);
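
The commit message that introduced these event helpers describes how the polling interval is chosen: a per-device setting of -1 defers to the system default, a system default of 0 disables polling, and a device that reports all supported events asynchronously is not polled unless its interval is set explicitly. The sketch below encodes only that description. It is not the kernel's implementation, the `demo_` names and event bit values are hypothetical, and the struct in this listing carries `event_flags` rather than the `async_events` mask the commit message mentions.

```c
#include <assert.h>

/* Hypothetical bit values for the two events named in the commit message. */
#define DEMO_EVENT_MEDIA_CHANGE		(1u << 0)
#define DEMO_EVENT_EJECT_REQUEST	(1u << 1)

/*
 * Return the effective polling interval in milliseconds, 0 meaning
 * "do not poll".
 *   events         - mask of all events the device supports
 *   async_events   - subset the device can report without polling
 *   dev_poll_msecs - per-device setting, -1 meaning "use system default"
 *   dfl_poll_msecs - system default, 0 meaning "polling disabled"
 */
static long demo_poll_interval_msecs(unsigned int events,
				     unsigned int async_events,
				     long dev_poll_msecs,
				     long dfl_poll_msecs)
{
	assert(dev_poll_msecs >= -1 && dfl_poll_msecs >= 0);

	/* An explicit per-device value always wins, including 0. */
	if (dev_poll_msecs >= 0)
		return dev_poll_msecs;

	/* Poll only if some supported event cannot arrive asynchronously. */
	if (events & ~async_events)
		return dfl_poll_msecs;

	return 0;
}
```

For example, a device supporting both events with no asynchronous reporting and a system default of 2000 ms would be polled every 2000 ms, while the same device with both events reported asynchronously would not be polled at all.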

/* drivers/char/random.c */
extern void add_disk_randomness(struct gendisk *disk) __latent_entropy;
extern void rand_initialize_disk(struct gendisk *disk);

static inline sector_t get_start_sect(struct block_device *bdev)
{
	return bdev->bd_part->start_sect;
}
static inline sector_t get_capacity(struct gendisk *disk)
{
	return disk->part0.nr_sects;
}
static inline void set_capacity(struct gendisk *disk, sector_t size)
{
	disk->part0.nr_sects = size;
}
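
`get_capacity()` and `set_capacity()` work in sectors, not bytes. Assuming the block layer's fixed 512-byte sector unit (a shift of 9, defined outside this header), conversion in either direction is a shift. The `demo_` helpers below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_SECTOR_SHIFT	9	/* assumed: 512-byte sectors */

/* Convert a sector count to bytes. */
static uint64_t demo_sectors_to_bytes(uint64_t nr_sects)
{
	/* Demo-only guard against overflowing 64 bits. */
	assert(nr_sects <= (UINT64_MAX >> DEMO_SECTOR_SHIFT));
	return nr_sects << DEMO_SECTOR_SHIFT;
}

/* Convert bytes to whole sectors, dropping any partial sector. */
static uint64_t demo_bytes_to_sectors(uint64_t bytes)
{
	return bytes >> DEMO_SECTOR_SHIFT;
}
```

A driver exposing a 1 MiB device would therefore pass 2048 to `set_capacity()` under this assumption.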

#ifdef CONFIG_SOLARIS_X86_PARTITION

#define SOLARIS_X86_NUMSLICE	16
#define SOLARIS_X86_VTOC_SANE	(0x600DDEEEUL)

struct solaris_x86_slice {
	__le16 s_tag;		/* ID tag of partition */
	__le16 s_flag;		/* permission flags */
	__le32 s_start;		/* start sector no of partition */
	__le32 s_size;		/* # of blocks in partition */
};

struct solaris_x86_vtoc {
	unsigned int v_bootinfo[3];	/* info needed by mboot (unsupported) */
	__le32 v_sanity;		/* to verify vtoc sanity */
	__le32 v_version;		/* layout version */
	char	v_volume[8];		/* volume name */
	__le16	v_sectorsz;		/* sector size in bytes */
	__le16	v_nparts;		/* number of partitions */
	unsigned int v_reserved[10];	/* free space */
	struct solaris_x86_slice
		v_slice[SOLARIS_X86_NUMSLICE]; /* slice headers */
	unsigned int timestamp[SOLARIS_X86_NUMSLICE]; /* timestamp (unsupported) */
	char	v_asciilabel[128];	/* for compatibility */
};

#endif /* CONFIG_SOLARIS_X86_PARTITION */

#ifdef CONFIG_BSD_DISKLABEL
/*
 * BSD disklabel support by Yossi Gottlieb <yogo@math.tau.ac.il>
 * updated by Marc Espie <Marc.Espie@openbsd.org>
 */

/* check against BSD src/sys/sys/disklabel.h for consistency */

#define BSD_DISKMAGIC	(0x82564557UL)	/* The disk magic number */
#define BSD_MAXPARTITIONS	16
#define OPENBSD_MAXPARTITIONS	16
#define BSD_FS_UNUSED		0	/* disklabel unused partition entry ID */
struct bsd_disklabel {
	__le32	d_magic;		/* the magic number */
	__s16	d_type;			/* drive type */
	__s16	d_subtype;		/* controller/d_type specific */
	char	d_typename[16];		/* type name, e.g. "eagle" */
	char	d_packname[16];		/* pack identifier */
	__u32	d_secsize;		/* # of bytes per sector */
	__u32	d_nsectors;		/* # of data sectors per track */
	__u32	d_ntracks;		/* # of tracks per cylinder */
	__u32	d_ncylinders;		/* # of data cylinders per unit */
	__u32	d_secpercyl;		/* # of data sectors per cylinder */
	__u32	d_secperunit;		/* # of data sectors per unit */
	__u16	d_sparespertrack;	/* # of spare sectors per track */
	__u16	d_sparespercyl;		/* # of spare sectors per cylinder */
	__u32	d_acylinders;		/* # of alt. cylinders per unit */
	__u16	d_rpm;			/* rotational speed */
	__u16	d_interleave;		/* hardware sector interleave */
	__u16	d_trackskew;		/* sector 0 skew, per track */
	__u16	d_cylskew;		/* sector 0 skew, per cylinder */
	__u32	d_headswitch;		/* head switch time, usec */
	__u32	d_trkseek;		/* track-to-track seek, usec */
	__u32	d_flags;		/* generic flags */
#define NDDATA 5
	__u32	d_drivedata[NDDATA];	/* drive-type specific information */
#define NSPARE 5
	__u32	d_spare[NSPARE];	/* reserved for future use */
	__le32	d_magic2;		/* the magic number (again) */
	__le16	d_checksum;		/* xor of data incl. partitions */

			/* filesystem and partition information: */
	__le16	d_npartitions;		/* number of partitions in following */
	__le32	d_bbsize;		/* size of boot area at sn0, bytes */
	__le32	d_sbsize;		/* max size of fs superblock, bytes */
	struct	bsd_partition {		/* the partition table */
		__le32	p_size;		/* number of sectors in partition */
		__le32	p_offset;	/* starting sector */
		__le32	p_fsize;	/* filesystem basic fragment size */
		__u8	p_fstype;	/* filesystem type, see below */
		__u8	p_frag;		/* filesystem fragments per block */
		__le16	p_cpg;		/* filesystem cylinders per group */
	} d_partitions[BSD_MAXPARTITIONS];	/* actually may be more */
};

#endif	/* CONFIG_BSD_DISKLABEL */

#ifdef CONFIG_UNIXWARE_DISKLABEL
/*
 * Unixware slices support by Andrzej Krzysztofowicz <ankry@mif.pg.gda.pl>
 * and Krzysztof G. Baranowski <kgb@knm.org.pl>
 */

#define UNIXWARE_DISKMAGIC (0xCA5E600DUL) /* The disk magic number */
#define UNIXWARE_DISKMAGIC2 (0x600DDEEEUL) /* The slice table magic nr */
#define UNIXWARE_NUMSLICE 16
#define UNIXWARE_FS_UNUSED 0 /* Unused slice entry ID */

struct unixware_slice {
	__le16 s_label; /* label */
	__le16 s_flags; /* permission flags */
	__le32 start_sect; /* starting sector */
	__le32 nr_sects; /* number of sectors in slice */
};

struct unixware_disklabel {
	__le32 d_type; /* drive type */
	__le32 d_magic; /* the magic number */
	__le32 d_version; /* version number */
	char d_serial[12]; /* serial number of the device */
	__le32 d_ncylinders; /* # of data cylinders per device */
	__le32 d_ntracks; /* # of tracks per cylinder */
	__le32 d_nsectors; /* # of data sectors per track */
	__le32 d_secsize; /* # of bytes per sector */
	__le32 d_part_start; /* # of first sector of this partition */
	__le32 d_unknown1[12]; /* ? */
	__le32 d_alt_tbl; /* byte offset of alternate table */
	__le32 d_alt_len; /* byte length of alternate table */
	__le32 d_phys_cyl; /* # of physical cylinders per device */
	__le32 d_phys_trk; /* # of physical tracks per cylinder */
	__le32 d_phys_sec; /* # of physical sectors per track */
	__le32 d_phys_bytes; /* # of physical bytes per sector */
	__le32 d_unknown2; /* ? */
	__le32 d_unknown3; /* ? */
	__le32 d_pad[8]; /* pad */

	struct unixware_vtoc {
		__le32 v_magic; /* the magic number */
		__le32 v_version; /* version number */
		char v_name[8]; /* volume name */
		__le16 v_nslices; /* # of slices */
		__le16 v_unknown1; /* ? */
		__le32 v_reserved[10]; /* reserved */
		struct unixware_slice
			v_slice[UNIXWARE_NUMSLICE]; /* slice headers */
	} vtoc;

}; /* 408 */

#endif /* CONFIG_UNIXWARE_DISKLABEL */

#ifdef CONFIG_MINIX_SUBPARTITION
# define MINIX_NR_SUBPARTITIONS 4
#endif /* CONFIG_MINIX_SUBPARTITION */

|
2007-02-11 15:50:00 +08:00
|
|
|
#define ADDPART_FLAG_NONE 0
|
|
|
|
#define ADDPART_FLAG_RAID 1
|
|
|
|
#define ADDPART_FLAG_WHOLEDISK 2
|
|
|
|
|
2008-08-25 18:47:22 +08:00
|
|
|
extern int blk_alloc_devt(struct hd_struct *part, dev_t *devt);
|
|
|
|
extern void blk_free_devt(dev_t devt);
|
2019-04-02 20:06:34 +08:00
|
|
|
extern void blk_invalidate_devt(dev_t devt);
|
2008-09-03 15:01:09 +08:00
|
|
|
extern dev_t blk_lookup_devt(const char *name, int partno);
|
|
|
|
extern char *disk_name (struct gendisk *hd, int partno, char *buf);
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2008-08-25 18:56:15 +08:00
|
|
|
extern int disk_expand_part_tbl(struct gendisk *disk, int target);
|
2005-04-17 06:20:36 +08:00
|
|
|
extern int rescan_partitions(struct gendisk *disk, struct block_device *bdev);
|
2012-03-02 17:38:33 +08:00
|
|
|
extern int invalidate_partitions(struct gendisk *disk, struct block_device *bdev);
|
2008-11-10 14:29:58 +08:00
|
|
|
extern struct hd_struct * __must_check add_partition(struct gendisk *disk,
|
|
|
|
int partno, sector_t start,
|
2010-09-01 04:47:05 +08:00
|
|
|
sector_t len, int flags,
|
|
|
|
struct partition_meta_info
|
|
|
|
*info);
|
2015-07-16 11:16:45 +08:00
|
|
|
extern void __delete_partition(struct percpu_ref *);
|
2005-04-17 06:20:36 +08:00
|
|
|
extern void delete_partition(struct gendisk *, int);
|
2007-05-09 17:33:24 +08:00
|
|
|
extern void printk_all_partitions(void);
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2017-10-25 16:56:05 +08:00
|
|
|
extern struct gendisk *__alloc_disk_node(int minors, int node_id);
|
2018-02-26 20:01:38 +08:00
|
|
|
extern struct kobject *get_disk_and_module(struct gendisk *disk);
|
2005-04-17 06:20:36 +08:00
|
|
|
extern void put_disk(struct gendisk *disk);
|
2018-02-26 20:01:39 +08:00
|
|
|
extern void put_disk_and_module(struct gendisk *disk);
|
2007-05-22 04:08:01 +08:00
|
|
|
extern void blk_register_region(dev_t devt, unsigned long range,
|
2005-04-17 06:20:36 +08:00
|
|
|
struct module *module,
|
|
|
|
struct kobject *(*probe)(dev_t, int *, void *),
|
|
|
|
int (*lock)(dev_t, void *),
|
|
|
|
void *data);
|
2007-05-22 04:08:01 +08:00
|
|
|
extern void blk_unregister_region(dev_t devt, unsigned long range);
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2008-08-25 18:56:09 +08:00
|
|
|
extern ssize_t part_size_show(struct device *dev,
|
|
|
|
struct device_attribute *attr, char *buf);
|
2008-08-25 18:56:14 +08:00
|
|
|
extern ssize_t part_stat_show(struct device *dev,
|
|
|
|
struct device_attribute *attr, char *buf);
|
block: Seperate read and write statistics of in_flight requests v2
Commit a9327cac440be4d8333bba975cbbf76045096275 added separate read
and write statistics of in_flight requests, and exported the number
of read and write requests in progress separately through sysfs.
But Corrado Zoccolo <czoccolo@gmail.com> reported getting strange
output from "iostat -kx 2". Global values for service time and
utilization were garbage. For interval values, utilization was always
100%, and service time was higher than normal.
So this was reverted by commit 0f78ab9899e9d6acb09d5465def618704255963b.
The problem was in part_round_stats_single(); I missed the following:
	if (now == part->stamp)
		return;
-	if (part->in_flight) {
+	if (part_in_flight(part)) {
		__part_stat_add(cpu, part, time_in_queue,
				part_in_flight(part) * (now - part->stamp));
		__part_stat_add(cpu, part, io_ticks, (now - part->stamp));
With this chunk included, the reported regression gets fixed.
Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
--
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-07 02:16:55 +08:00
|
|
|
extern ssize_t part_inflight_show(struct device *dev,
|
|
|
|
struct device_attribute *attr, char *buf);
|
2008-08-25 18:56:13 +08:00
|
|
|
#ifdef CONFIG_FAIL_MAKE_REQUEST
|
|
|
|
extern ssize_t part_fail_show(struct device *dev,
|
|
|
|
struct device_attribute *attr, char *buf);
|
|
|
|
extern ssize_t part_fail_store(struct device *dev,
|
|
|
|
struct device_attribute *attr,
|
|
|
|
const char *buf, size_t count);
|
|
|
|
#endif /* CONFIG_FAIL_MAKE_REQUEST */
|
2008-08-25 18:56:09 +08:00
|
|
|
|
2017-10-25 16:56:05 +08:00
|
|
|
#define alloc_disk_node(minors, node_id) \
({ \
	static struct lock_class_key __key; \
	const char *__name; \
	struct gendisk *__disk; \
	\
	__name = "(gendisk_completion)"#minors"("#node_id")"; \
	\
	__disk = __alloc_disk_node(minors, node_id); \
	\
	if (__disk) \
		lockdep_init_map(&__disk->lockdep_map, __name, &__key, 0); \
	\
	__disk; \
})

#define alloc_disk(minors) alloc_disk_node(minors, NUMA_NO_NODE)
|
|
|
|
|
2015-07-16 11:16:45 +08:00
|
|
|
static inline int hd_ref_init(struct hd_struct *part)
|
2011-01-07 15:43:37 +08:00
|
|
|
{
|
2015-07-16 11:16:45 +08:00
|
|
|
if (percpu_ref_init(&part->ref, __delete_partition, 0,
|
|
|
|
GFP_KERNEL))
|
|
|
|
return -ENOMEM;
|
|
|
|
return 0;
|
2011-01-07 15:43:37 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
static inline void hd_struct_get(struct hd_struct *part)
|
|
|
|
{
|
2015-07-16 11:16:45 +08:00
|
|
|
percpu_ref_get(&part->ref);
|
2011-01-07 15:43:37 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
static inline int hd_struct_try_get(struct hd_struct *part)
|
|
|
|
{
|
2015-07-16 11:16:45 +08:00
|
|
|
return percpu_ref_tryget_live(&part->ref);
|
2011-01-07 15:43:37 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
static inline void hd_struct_put(struct hd_struct *part)
|
|
|
|
{
|
2015-07-16 11:16:45 +08:00
|
|
|
percpu_ref_put(&part->ref);
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline void hd_struct_kill(struct hd_struct *part)
|
|
|
|
{
|
|
|
|
percpu_ref_kill(&part->ref);
|
2011-01-07 15:43:37 +08:00
|
|
|
}
|
|
|
|
|
2015-07-16 11:16:44 +08:00
|
|
|
static inline void hd_free_part(struct hd_struct *part)
|
|
|
|
{
|
|
|
|
free_part_stats(part);
|
|
|
|
free_part_info(part);
|
2015-07-16 11:16:45 +08:00
|
|
|
percpu_ref_exit(&part->ref);
|
2015-07-16 11:16:44 +08:00
|
|
|
}
|
|
|
|
|
2012-08-01 18:24:18 +08:00
|
|
|
/*
 * Any access of part->nr_sects which is not protected by partition
 * bd_mutex or gendisk bdev bd_mutex, should be done using this
 * accessor function.
 *
 * Code written along the lines of i_size_read() and i_size_write().
 * CONFIG_PREEMPT case optimizes the case of UP kernel with preemption
 * on.
 */
static inline sector_t part_nr_sects_read(struct hd_struct *part)
|
|
|
|
{
|
2019-04-06 00:08:59 +08:00
|
|
|
#if BITS_PER_LONG==32 && defined(CONFIG_SMP)
|
2012-08-01 18:24:18 +08:00
|
|
|
sector_t nr_sects;
|
|
|
|
unsigned seq;
|
|
|
|
do {
|
|
|
|
seq = read_seqcount_begin(&part->nr_sects_seq);
|
|
|
|
nr_sects = part->nr_sects;
|
|
|
|
} while (read_seqcount_retry(&part->nr_sects_seq, seq));
|
|
|
|
return nr_sects;
|
2019-04-06 00:08:59 +08:00
|
|
|
#elif BITS_PER_LONG==32 && defined(CONFIG_PREEMPT)
|
2012-08-01 18:24:18 +08:00
|
|
|
sector_t nr_sects;
|
|
|
|
|
|
|
|
preempt_disable();
|
|
|
|
nr_sects = part->nr_sects;
|
|
|
|
preempt_enable();
|
|
|
|
return nr_sects;
|
|
|
|
#else
|
|
|
|
return part->nr_sects;
|
|
|
|
#endif
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
 * Should be called with the partition's mutex (typically bd_mutex) held
 * to provide mutual exclusion among writers; otherwise the seqcount might
 * be left in the wrong state, leaving the readers spinning indefinitely.
 */
static inline void part_nr_sects_write(struct hd_struct *part, sector_t size)
|
|
|
|
{
|
2019-04-06 00:08:59 +08:00
|
|
|
#if BITS_PER_LONG==32 && defined(CONFIG_SMP)
|
2024-06-11 20:26:44 +08:00
|
|
|
preempt_disable();
|
2012-08-01 18:24:18 +08:00
|
|
|
write_seqcount_begin(&part->nr_sects_seq);
|
|
|
|
part->nr_sects = size;
|
|
|
|
write_seqcount_end(&part->nr_sects_seq);
|
2024-06-11 20:26:44 +08:00
|
|
|
preempt_enable();
|
2019-04-06 00:08:59 +08:00
|
|
|
#elif BITS_PER_LONG==32 && defined(CONFIG_PREEMPT)
|
2012-08-01 18:24:18 +08:00
|
|
|
preempt_disable();
|
|
|
|
part->nr_sects = size;
|
|
|
|
preempt_enable();
|
|
|
|
#else
|
|
|
|
part->nr_sects = size;
|
|
|
|
#endif
|
|
|
|
}
|
|
|
|
|
2015-10-22 01:19:49 +08:00
|
|
|
#if defined(CONFIG_BLK_DEV_INTEGRITY)
|
|
|
|
extern void blk_integrity_add(struct gendisk *);
|
|
|
|
extern void blk_integrity_del(struct gendisk *);
|
|
|
|
#else /* CONFIG_BLK_DEV_INTEGRITY */
|
|
|
|
static inline void blk_integrity_add(struct gendisk *disk) { }
|
|
|
|
static inline void blk_integrity_del(struct gendisk *disk) { }
|
|
|
|
#endif /* CONFIG_BLK_DEV_INTEGRITY */
|
|
|
|
|
2007-05-11 19:29:54 +08:00
|
|
|
#else /* CONFIG_BLOCK */
|
|
|
|
|
|
|
|
static inline void printk_all_partitions(void) { }
|
|
|
|
|
2008-09-03 15:01:09 +08:00
|
|
|
static inline dev_t blk_lookup_devt(const char *name, int partno)
|
2007-05-22 04:08:01 +08:00
|
|
|
{
|
|
|
|
dev_t devt = MKDEV(0, 0);
|
|
|
|
return devt;
|
|
|
|
}
|
2007-05-11 19:29:54 +08:00
|
|
|
#endif /* CONFIG_BLOCK */
|
[PATCH] BLOCK: Make it possible to disable the block layer [try #6]
Make it possible to disable the block layer. Not all embedded devices require
it, some can make do with just JFFS2, NFS, ramfs, etc - none of which require
the block layer to be present.

This patch does the following:

 (*) Introduces CONFIG_BLOCK to disable the block layer, buffering and blockdev
     support.

 (*) Adds dependencies on CONFIG_BLOCK to any configuration item that controls
     an item that uses the block layer. This includes:

     (*) Block I/O tracing.

     (*) Disk partition code.

     (*) All filesystems that are block based, eg: Ext3, ReiserFS, ISOFS.

     (*) The SCSI layer. As far as I can tell, even SCSI chardevs use the
         block layer to do scheduling. Some drivers that use SCSI facilities -
         such as USB storage - end up disabled indirectly from this.

     (*) Various block-based device drivers, such as IDE and the old CDROM
         drivers.

     (*) MTD blockdev handling and FTL.

     (*) JFFS - which uses set_bdev_super(), something it could avoid doing by
         taking a leaf out of JFFS2's book.

 (*) Makes most of the contents of linux/blkdev.h, linux/buffer_head.h and
     linux/elevator.h contingent on CONFIG_BLOCK being set. sector_div() is,
     however, still used in places, and so is still available.

 (*) Also made contingent are the contents of linux/mpage.h, linux/genhd.h and
     parts of linux/fs.h.

 (*) Makes a number of files in fs/ contingent on CONFIG_BLOCK.

 (*) Makes mm/bounce.c (bounce buffering) contingent on CONFIG_BLOCK.

 (*) set_page_dirty() doesn't call __set_page_dirty_buffers() if CONFIG_BLOCK
     is not enabled.

 (*) fs/no-block.c is created to hold out-of-line stubs and things that are
     required when CONFIG_BLOCK is not set:

     (*) Default blockdev file operations (to give error ENODEV on opening).

 (*) Makes some /proc changes:

     (*) /proc/devices does not list any blockdevs.

     (*) /proc/diskstats and /proc/partitions are contingent on CONFIG_BLOCK.

 (*) Makes some compat ioctl handling contingent on CONFIG_BLOCK.

 (*) If CONFIG_BLOCK is not defined, makes sys_quotactl() return -ENODEV if
     given command other than Q_SYNC or if a special device is specified.

 (*) In init/do_mounts.c, no reference is made to the blockdev routines if
     CONFIG_BLOCK is not defined. This does not prohibit NFS roots or JFFS2.

 (*) The bdflush, ioprio_set and ioprio_get syscalls can now be absent (return
     error ENOSYS by way of cond_syscall if so).

 (*) The seclvl_bd_claim() and seclvl_bd_release() security calls do nothing if
     CONFIG_BLOCK is not set, since they can't then happen.

Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-10-01 02:45:40 +08:00
|
|
|
|
2008-03-13 00:52:56 +08:00
|
|
|
#endif /* _LINUX_GENHD_H */
|