Lots of updates for net-next for this cycle. As usual, we have
a lot of small fixes and cleanups, the bigger items are: * proper mac80211 rate control locking, to fix some random crashes (this required changing other locking as well) * mac80211 "fast-xmit", a mechanism to reduce, in most cases, the amount of code we execute while going from ndo_start_xmit() to the driver * this also clears the way for properly supporting S/G and checksum and segmentation offloads -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJVSh65AAoJEDBSmw7B7bqrudwP/0iXyNQhF0mLTENrx+rdsDZS qQhB/8wejJaOJb89Re7M+bhwri7Q6S5BM/G24vhMc01dxmqNMcdKfEV3+nlmc5C+ KeEgTI9aZiCnUt4WAd54Zwbkc9o+1kBtaFuaWDvOdQHUf0WDwEIQxjnV4+SZujV9 xl1TV5yV35hRQgrDE8ZSbtOYRmhSVoi0MEgwqAjzdN2fEPyWVeqwYULDtpOopjL2 UHQgv0E2fYVRWennHyQQ88tWBQg+EsRaG1U1/rYHhNBmAJ+f9AOxKi7ErzxYfkbM 961B+3E++pM+zUeqw6+jaMKqT5jeCCM5ugCNSG4NrIvfxDIDgecAFV9Fs2islnI4 8xd3GqyA5iqaitAWIUsaYaQfaAcwSIlpSinfQW9EUm2wuCkPyZboFP+GRd2K7sQn FnRJSJ9PkGPdWwdDE3gunLHBHtbDS0z+R8VegIeS0qT8LamkqICiNQSyPlsTeluW ig2kwHsDdj3k11wyelhfp/RdtsOch/brKpLSjdzPXC1BzIWhQLwmsPh9qZ83vSB9 qbLsdnM/IPQXocWB6fOhmwaGsLeRalxs2yQFM0zdJCwpaU9dzKsJrxepAXVuq31p r0fygWTp8GVevHXzfS7fRya8xjsTRrSs6n2kH7ErOfiep13HQypAjbyLswNe4kW/ D6x8pVC3AhdGkl/9CW4m =oUlh -----END PGP SIGNATURE----- Merge tag 'mac80211-next-for-davem-2015-05-06' into iwlwifi-next Lots of updates for net-next for this cycle. As usual, we have a lot of small fixes and cleanups, the bigger items are: * proper mac80211 rate control locking, to fix some random crashes (this required changing other locking as well) * mac80211 "fast-xmit", a mechanism to reduce, in most cases, the amount of code we execute while going from ndo_start_xmit() to the driver * this also clears the way for properly supporting S/G and checksum and segmentation offloads
This commit is contained in:
commit
31eb07f5f8
|
@ -24,6 +24,7 @@
|
|||
*.order
|
||||
*.elf
|
||||
*.bin
|
||||
*.tar
|
||||
*.gz
|
||||
*.bz2
|
||||
*.lzma
|
||||
|
|
1
.mailmap
1
.mailmap
|
@ -100,6 +100,7 @@ Rajesh Shah <rajesh.shah@intel.com>
|
|||
Ralf Baechle <ralf@linux-mips.org>
|
||||
Ralf Wildenhues <Ralf.Wildenhues@gmx.de>
|
||||
Rémi Denis-Courmont <rdenis@simphalempin.com>
|
||||
Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
|
||||
Rudolf Marek <R.Marek@sh.cvut.cz>
|
||||
Rui Saraiva <rmps@joel.ist.utl.pt>
|
||||
Sachin P Sant <ssant@in.ibm.com>
|
||||
|
|
17
CREDITS
17
CREDITS
|
@ -508,6 +508,10 @@ E: paul@paulbristow.net
|
|||
W: http://paulbristow.net/linux/idefloppy.html
|
||||
D: Maintainer of IDE/ATAPI floppy driver
|
||||
|
||||
N: Stefano Brivio
|
||||
E: stefano.brivio@polimi.it
|
||||
D: Broadcom B43 driver
|
||||
|
||||
N: Dominik Brodowski
|
||||
E: linux@brodo.de
|
||||
W: http://www.brodo.de/
|
||||
|
@ -3008,6 +3012,19 @@ W: http://www.qsl.net/dl1bke/
|
|||
D: Generic Z8530 driver, AX.25 DAMA slave implementation
|
||||
D: Several AX.25 hacks
|
||||
|
||||
N: Ricardo Ribalda Delgado
|
||||
E: ricardo.ribalda@gmail.com
|
||||
W: http://ribalda.com
|
||||
D: PLX USB338x driver
|
||||
D: PCA9634 driver
|
||||
D: Option GTM671WFS
|
||||
D: Fintek F81216A
|
||||
D: Various kernel hacks
|
||||
S: Qtechnology A/S
|
||||
S: Valby Langgade 142
|
||||
S: 2500 Valby
|
||||
S: Denmark
|
||||
|
||||
N: Francois-Rene Rideau
|
||||
E: fare@tunes.org
|
||||
W: http://www.tunes.org/~fare
|
||||
|
|
|
@ -0,0 +1,119 @@
|
|||
What: /sys/block/zram<id>/num_reads
|
||||
Date: August 2015
|
||||
Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
|
||||
Description:
|
||||
The num_reads file is read-only and specifies the number of
|
||||
reads (failed or successful) done on this device.
|
||||
Now accessible via zram<id>/stat node.
|
||||
|
||||
What: /sys/block/zram<id>/num_writes
|
||||
Date: August 2015
|
||||
Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
|
||||
Description:
|
||||
The num_writes file is read-only and specifies the number of
|
||||
writes (failed or successful) done on this device.
|
||||
Now accessible via zram<id>/stat node.
|
||||
|
||||
What: /sys/block/zram<id>/invalid_io
|
||||
Date: August 2015
|
||||
Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
|
||||
Description:
|
||||
The invalid_io file is read-only and specifies the number of
|
||||
non-page-size-aligned I/O requests issued to this device.
|
||||
Now accessible via zram<id>/io_stat node.
|
||||
|
||||
What: /sys/block/zram<id>/failed_reads
|
||||
Date: August 2015
|
||||
Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
|
||||
Description:
|
||||
The failed_reads file is read-only and specifies the number of
|
||||
failed reads happened on this device.
|
||||
Now accessible via zram<id>/io_stat node.
|
||||
|
||||
What: /sys/block/zram<id>/failed_writes
|
||||
Date: August 2015
|
||||
Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
|
||||
Description:
|
||||
The failed_writes file is read-only and specifies the number of
|
||||
failed writes happened on this device.
|
||||
Now accessible via zram<id>/io_stat node.
|
||||
|
||||
What: /sys/block/zram<id>/notify_free
|
||||
Date: August 2015
|
||||
Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
|
||||
Description:
|
||||
The notify_free file is read-only. Depending on device usage
|
||||
scenario it may account a) the number of pages freed because
|
||||
of swap slot free notifications or b) the number of pages freed
|
||||
because of REQ_DISCARD requests sent by bio. The former ones
|
||||
are sent to a swap block device when a swap slot is freed, which
|
||||
implies that this disk is being used as a swap disk. The latter
|
||||
ones are sent by filesystem mounted with discard option,
|
||||
whenever some data blocks are getting discarded.
|
||||
Now accessible via zram<id>/io_stat node.
|
||||
|
||||
What: /sys/block/zram<id>/zero_pages
|
||||
Date: August 2015
|
||||
Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
|
||||
Description:
|
||||
The zero_pages file is read-only and specifies number of zero
|
||||
filled pages written to this disk. No memory is allocated for
|
||||
such pages.
|
||||
Now accessible via zram<id>/mm_stat node.
|
||||
|
||||
What: /sys/block/zram<id>/orig_data_size
|
||||
Date: August 2015
|
||||
Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
|
||||
Description:
|
||||
The orig_data_size file is read-only and specifies uncompressed
|
||||
size of data stored in this disk. This excludes zero-filled
|
||||
pages (zero_pages) since no memory is allocated for them.
|
||||
Unit: bytes
|
||||
Now accessible via zram<id>/mm_stat node.
|
||||
|
||||
What: /sys/block/zram<id>/compr_data_size
|
||||
Date: August 2015
|
||||
Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
|
||||
Description:
|
||||
The compr_data_size file is read-only and specifies compressed
|
||||
size of data stored in this disk. So, compression ratio can be
|
||||
calculated using orig_data_size and this statistic.
|
||||
Unit: bytes
|
||||
Now accessible via zram<id>/mm_stat node.
|
||||
|
||||
What: /sys/block/zram<id>/mem_used_total
|
||||
Date: August 2015
|
||||
Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
|
||||
Description:
|
||||
The mem_used_total file is read-only and specifies the amount
|
||||
of memory, including allocator fragmentation and metadata
|
||||
overhead, allocated for this disk. So, allocator space
|
||||
efficiency can be calculated using compr_data_size and this
|
||||
statistic.
|
||||
Unit: bytes
|
||||
Now accessible via zram<id>/mm_stat node.
|
||||
|
||||
What: /sys/block/zram<id>/mem_used_max
|
||||
Date: August 2015
|
||||
Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
|
||||
Description:
|
||||
The mem_used_max file is read/write and specifies the amount
|
||||
of maximum memory zram have consumed to store compressed data.
|
||||
For resetting the value, you should write "0". Otherwise,
|
||||
you could see -EINVAL.
|
||||
Unit: bytes
|
||||
Downgraded to write-only node: so it's possible to set new
|
||||
value only; its current value is stored in zram<id>/mm_stat
|
||||
node.
|
||||
|
||||
What: /sys/block/zram<id>/mem_limit
|
||||
Date: August 2015
|
||||
Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
|
||||
Description:
|
||||
The mem_limit file is read/write and specifies the maximum
|
||||
amount of memory ZRAM can use to store the compressed data.
|
||||
The limit could be changed in run time and "0" means disable
|
||||
the limit. No limit is the initial state. Unit: bytes
|
||||
Downgraded to write-only node: so it's possible to set new
|
||||
value only; its current value is stored in zram<id>/mm_stat
|
||||
node.
|
|
@ -141,3 +141,28 @@ Description:
|
|||
amount of memory ZRAM can use to store the compressed data. The
|
||||
limit could be changed in run time and "0" means disable the
|
||||
limit. No limit is the initial state. Unit: bytes
|
||||
|
||||
What: /sys/block/zram<id>/compact
|
||||
Date: August 2015
|
||||
Contact: Minchan Kim <minchan@kernel.org>
|
||||
Description:
|
||||
The compact file is write-only and trigger compaction for
|
||||
allocator zrm uses. The allocator moves some objects so that
|
||||
it could free fragment space.
|
||||
|
||||
What: /sys/block/zram<id>/io_stat
|
||||
Date: August 2015
|
||||
Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
|
||||
Description:
|
||||
The io_stat file is read-only and accumulates device's I/O
|
||||
statistics not accounted by block layer. For example,
|
||||
failed_reads, failed_writes, etc. File format is similar to
|
||||
block layer statistics file format.
|
||||
|
||||
What: /sys/block/zram<id>/mm_stat
|
||||
Date: August 2015
|
||||
Contact: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
|
||||
Description:
|
||||
The mm_stat file is read-only and represents device's mm
|
||||
statistics (orig_data_size, compr_data_size, etc.) in a format
|
||||
similar to block layer statistics file format.
|
||||
|
|
|
@ -100,7 +100,7 @@ Description: read only
|
|||
Hexadecimal value of the device ID found in this AFU
|
||||
configuration record.
|
||||
|
||||
What: /sys/class/cxl/<afu>/cr<config num>/vendor
|
||||
What: /sys/class/cxl/<afu>/cr<config num>/class
|
||||
Date: February 2015
|
||||
Contact: linuxppc-dev@lists.ozlabs.org
|
||||
Description: read only
|
||||
|
|
|
@ -0,0 +1,80 @@
|
|||
What: /sys/class/leds/<led>/flash_brightness
|
||||
Date: March 2015
|
||||
KernelVersion: 4.0
|
||||
Contact: Jacek Anaszewski <j.anaszewski@samsung.com>
|
||||
Description: read/write
|
||||
Set the brightness of this LED in the flash strobe mode, in
|
||||
microamperes. The file is created only for the flash LED devices
|
||||
that support setting flash brightness.
|
||||
|
||||
The value is between 0 and
|
||||
/sys/class/leds/<led>/max_flash_brightness.
|
||||
|
||||
What: /sys/class/leds/<led>/max_flash_brightness
|
||||
Date: March 2015
|
||||
KernelVersion: 4.0
|
||||
Contact: Jacek Anaszewski <j.anaszewski@samsung.com>
|
||||
Description: read only
|
||||
Maximum brightness level for this LED in the flash strobe mode,
|
||||
in microamperes.
|
||||
|
||||
What: /sys/class/leds/<led>/flash_timeout
|
||||
Date: March 2015
|
||||
KernelVersion: 4.0
|
||||
Contact: Jacek Anaszewski <j.anaszewski@samsung.com>
|
||||
Description: read/write
|
||||
Hardware timeout for flash, in microseconds. The flash strobe
|
||||
is stopped after this period of time has passed from the start
|
||||
of the strobe. The file is created only for the flash LED
|
||||
devices that support setting flash timeout.
|
||||
|
||||
What: /sys/class/leds/<led>/max_flash_timeout
|
||||
Date: March 2015
|
||||
KernelVersion: 4.0
|
||||
Contact: Jacek Anaszewski <j.anaszewski@samsung.com>
|
||||
Description: read only
|
||||
Maximum flash timeout for this LED, in microseconds.
|
||||
|
||||
What: /sys/class/leds/<led>/flash_strobe
|
||||
Date: March 2015
|
||||
KernelVersion: 4.0
|
||||
Contact: Jacek Anaszewski <j.anaszewski@samsung.com>
|
||||
Description: read/write
|
||||
Flash strobe state. When written with 1 it triggers flash strobe
|
||||
and when written with 0 it turns the flash off.
|
||||
|
||||
On read 1 means that flash is currently strobing and 0 means
|
||||
that flash is off.
|
||||
|
||||
What: /sys/class/leds/<led>/flash_fault
|
||||
Date: March 2015
|
||||
KernelVersion: 4.0
|
||||
Contact: Jacek Anaszewski <j.anaszewski@samsung.com>
|
||||
Description: read only
|
||||
Space separated list of flash faults that may have occurred.
|
||||
Flash faults are re-read after strobing the flash. Possible
|
||||
flash faults:
|
||||
|
||||
* led-over-voltage - flash controller voltage to the flash LED
|
||||
has exceeded the limit specific to the flash controller
|
||||
* flash-timeout-exceeded - the flash strobe was still on when
|
||||
the timeout set by the user has expired; not all flash
|
||||
controllers may set this in all such conditions
|
||||
* controller-over-temperature - the flash controller has
|
||||
overheated
|
||||
* controller-short-circuit - the short circuit protection
|
||||
of the flash controller has been triggered
|
||||
* led-power-supply-over-current - current in the LED power
|
||||
supply has exceeded the limit specific to the flash
|
||||
controller
|
||||
* indicator-led-fault - the flash controller has detected
|
||||
a short or open circuit condition on the indicator LED
|
||||
* led-under-voltage - flash controller voltage to the flash
|
||||
LED has been below the minimum limit specific to
|
||||
the flash
|
||||
* controller-under-voltage - the input voltage of the flash
|
||||
controller is below the limit under which strobing the
|
||||
flash at full current will not be possible;
|
||||
the condition persists until this flag is no longer set
|
||||
* led-over-temperature - the temperature of the LED has exceeded
|
||||
its allowed upper limit
|
|
@ -659,6 +659,19 @@ macros using parameters.
|
|||
#define CONSTANT 0x4000
|
||||
#define CONSTEXP (CONSTANT | 3)
|
||||
|
||||
5) namespace collisions when defining local variables in macros resembling
|
||||
functions:
|
||||
|
||||
#define FOO(x) \
|
||||
({ \
|
||||
typeof(x) ret; \
|
||||
ret = calc_ret(x); \
|
||||
(ret); \
|
||||
)}
|
||||
|
||||
ret is a common name for a local variable - __foo_ret is less likely
|
||||
to collide with an existing variable.
|
||||
|
||||
The cpp manual deals with macros exhaustively. The gcc internals manual also
|
||||
covers RTL which is used frequently with assembly language in the kernel.
|
||||
|
||||
|
|
|
@ -509,6 +509,270 @@
|
|||
select it due to the used type and mask field.
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
<sect1><title>Internal Structure of Kernel Crypto API</title>
|
||||
|
||||
<para>
|
||||
The kernel crypto API has an internal structure where a cipher
|
||||
implementation may use many layers and indirections. This section
|
||||
shall help to clarify how the kernel crypto API uses
|
||||
various components to implement the complete cipher.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The following subsections explain the internal structure based
|
||||
on existing cipher implementations. The first section addresses
|
||||
the most complex scenario where all other scenarios form a logical
|
||||
subset.
|
||||
</para>
|
||||
|
||||
<sect2><title>Generic AEAD Cipher Structure</title>
|
||||
|
||||
<para>
|
||||
The following ASCII art decomposes the kernel crypto API layers
|
||||
when using the AEAD cipher with the automated IV generation. The
|
||||
shown example is used by the IPSEC layer.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
For other use cases of AEAD ciphers, the ASCII art applies as
|
||||
well, but the caller may not use the GIVCIPHER interface. In
|
||||
this case, the caller must generate the IV.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The depicted example decomposes the AEAD cipher of GCM(AES) based
|
||||
on the generic C implementations (gcm.c, aes-generic.c, ctr.c,
|
||||
ghash-generic.c, seqiv.c). The generic implementation serves as an
|
||||
example showing the complete logic of the kernel crypto API.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
It is possible that some streamlined cipher implementations (like
|
||||
AES-NI) provide implementations merging aspects which in the view
|
||||
of the kernel crypto API cannot be decomposed into layers any more.
|
||||
In case of the AES-NI implementation, the CTR mode, the GHASH
|
||||
implementation and the AES cipher are all merged into one cipher
|
||||
implementation registered with the kernel crypto API. In this case,
|
||||
the concept described by the following ASCII art applies too. However,
|
||||
the decomposition of GCM into the individual sub-components
|
||||
by the kernel crypto API is not done any more.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Each block in the following ASCII art is an independent cipher
|
||||
instance obtained from the kernel crypto API. Each block
|
||||
is accessed by the caller or by other blocks using the API functions
|
||||
defined by the kernel crypto API for the cipher implementation type.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The blocks below indicate the cipher type as well as the specific
|
||||
logic implemented in the cipher.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The ASCII art picture also indicates the call structure, i.e. who
|
||||
calls which component. The arrows point to the invoked block
|
||||
where the caller uses the API applicable to the cipher type
|
||||
specified for the block.
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
<![CDATA[
|
||||
kernel crypto API | IPSEC Layer
|
||||
|
|
||||
+-----------+ |
|
||||
| | (1)
|
||||
| givcipher | <----------------------------------- esp_output
|
||||
| (seqiv) | ---+
|
||||
+-----------+ |
|
||||
| (2)
|
||||
+-----------+ |
|
||||
| | <--+ (2)
|
||||
| aead | <----------------------------------- esp_input
|
||||
| (gcm) | ------------+
|
||||
+-----------+ |
|
||||
| (3) | (5)
|
||||
v v
|
||||
+-----------+ +-----------+
|
||||
| | | |
|
||||
| ablkcipher| | ahash |
|
||||
| (ctr) | ---+ | (ghash) |
|
||||
+-----------+ | +-----------+
|
||||
|
|
||||
+-----------+ | (4)
|
||||
| | <--+
|
||||
| cipher |
|
||||
| (aes) |
|
||||
+-----------+
|
||||
]]>
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
The following call sequence is applicable when the IPSEC layer
|
||||
triggers an encryption operation with the esp_output function. During
|
||||
configuration, the administrator set up the use of rfc4106(gcm(aes)) as
|
||||
the cipher for ESP. The following call sequence is now depicted in the
|
||||
ASCII art above:
|
||||
</para>
|
||||
|
||||
<orderedlist>
|
||||
<listitem>
|
||||
<para>
|
||||
esp_output() invokes crypto_aead_givencrypt() to trigger an encryption
|
||||
operation of the GIVCIPHER implementation.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
In case of GCM, the SEQIV implementation is registered as GIVCIPHER
|
||||
in crypto_rfc4106_alloc().
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The SEQIV performs its operation to generate an IV where the core
|
||||
function is seqiv_geniv().
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
Now, SEQIV uses the AEAD API function calls to invoke the associated
|
||||
AEAD cipher. In our case, during the instantiation of SEQIV, the
|
||||
cipher handle for GCM is provided to SEQIV. This means that SEQIV
|
||||
invokes AEAD cipher operations with the GCM cipher handle.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
During instantiation of the GCM handle, the CTR(AES) and GHASH
|
||||
ciphers are instantiated. The cipher handles for CTR(AES) and GHASH
|
||||
are retained for later use.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The GCM implementation is responsible to invoke the CTR mode AES and
|
||||
the GHASH cipher in the right manner to implement the GCM
|
||||
specification.
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
The GCM AEAD cipher type implementation now invokes the ABLKCIPHER API
|
||||
with the instantiated CTR(AES) cipher handle.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
During instantiation of the CTR(AES) cipher, the CIPHER type
|
||||
implementation of AES is instantiated. The cipher handle for AES is
|
||||
retained.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
That means that the ABLKCIPHER implementation of CTR(AES) only
|
||||
implements the CTR block chaining mode. After performing the block
|
||||
chaining operation, the CIPHER implementation of AES is invoked.
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
The ABLKCIPHER of CTR(AES) now invokes the CIPHER API with the AES
|
||||
cipher handle to encrypt one block.
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
The GCM AEAD implementation also invokes the GHASH cipher
|
||||
implementation via the AHASH API.
|
||||
</para>
|
||||
</listitem>
|
||||
</orderedlist>
|
||||
|
||||
<para>
|
||||
When the IPSEC layer triggers the esp_input() function, the same call
|
||||
sequence is followed with the only difference that the operation starts
|
||||
with step (2).
|
||||
</para>
|
||||
</sect2>
|
||||
|
||||
<sect2><title>Generic Block Cipher Structure</title>
|
||||
<para>
|
||||
Generic block ciphers follow the same concept as depicted with the ASCII
|
||||
art picture above.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
For example, CBC(AES) is implemented with cbc.c, and aes-generic.c. The
|
||||
ASCII art picture above applies as well with the difference that only
|
||||
step (4) is used and the ABLKCIPHER block chaining mode is CBC.
|
||||
</para>
|
||||
</sect2>
|
||||
|
||||
<sect2><title>Generic Keyed Message Digest Structure</title>
|
||||
<para>
|
||||
Keyed message digest implementations again follow the same concept as
|
||||
depicted in the ASCII art picture above.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
For example, HMAC(SHA256) is implemented with hmac.c and
|
||||
sha256_generic.c. The following ASCII art illustrates the
|
||||
implementation:
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
<![CDATA[
|
||||
kernel crypto API | Caller
|
||||
|
|
||||
+-----------+ (1) |
|
||||
| | <------------------ some_function
|
||||
| ahash |
|
||||
| (hmac) | ---+
|
||||
+-----------+ |
|
||||
| (2)
|
||||
+-----------+ |
|
||||
| | <--+
|
||||
| shash |
|
||||
| (sha256) |
|
||||
+-----------+
|
||||
]]>
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
The following call sequence is applicable when a caller triggers
|
||||
an HMAC operation:
|
||||
</para>
|
||||
|
||||
<orderedlist>
|
||||
<listitem>
|
||||
<para>
|
||||
The AHASH API functions are invoked by the caller. The HMAC
|
||||
implementation performs its operation as needed.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
During initialization of the HMAC cipher, the SHASH cipher type of
|
||||
SHA256 is instantiated. The cipher handle for the SHA256 instance is
|
||||
retained.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
At one time, the HMAC implementation requires a SHA256 operation
|
||||
where the SHA256 cipher handle is used.
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
The HMAC instance now invokes the SHASH API with the SHA256
|
||||
cipher handle to calculate the message digest.
|
||||
</para>
|
||||
</listitem>
|
||||
</orderedlist>
|
||||
</sect2>
|
||||
</sect1>
|
||||
</chapter>
|
||||
|
||||
<chapter id="Development"><title>Developing Cipher Algorithms</title>
|
||||
|
@ -808,6 +1072,602 @@
|
|||
</sect1>
|
||||
</chapter>
|
||||
|
||||
<chapter id="User"><title>User Space Interface</title>
|
||||
<sect1><title>Introduction</title>
|
||||
<para>
|
||||
The concepts of the kernel crypto API visible to kernel space is fully
|
||||
applicable to the user space interface as well. Therefore, the kernel
|
||||
crypto API high level discussion for the in-kernel use cases applies
|
||||
here as well.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The major difference, however, is that user space can only act as a
|
||||
consumer and never as a provider of a transformation or cipher algorithm.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The following covers the user space interface exported by the kernel
|
||||
crypto API. A working example of this description is libkcapi that
|
||||
can be obtained from [1]. That library can be used by user space
|
||||
applications that require cryptographic services from the kernel.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Some details of the in-kernel kernel crypto API aspects do not
|
||||
apply to user space, however. This includes the difference between
|
||||
synchronous and asynchronous invocations. The user space API call
|
||||
is fully synchronous.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
[1] http://www.chronox.de/libkcapi.html
|
||||
</para>
|
||||
|
||||
</sect1>
|
||||
|
||||
<sect1><title>User Space API General Remarks</title>
|
||||
<para>
|
||||
The kernel crypto API is accessible from user space. Currently,
|
||||
the following ciphers are accessible:
|
||||
</para>
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>Message digest including keyed message digest (HMAC, CMAC)</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>Symmetric ciphers</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>AEAD ciphers</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>Random Number Generators</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
|
||||
<para>
|
||||
The interface is provided via socket type using the type AF_ALG.
|
||||
In addition, the setsockopt option type is SOL_ALG. In case the
|
||||
user space header files do not export these flags yet, use the
|
||||
following macros:
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
#ifndef AF_ALG
|
||||
#define AF_ALG 38
|
||||
#endif
|
||||
#ifndef SOL_ALG
|
||||
#define SOL_ALG 279
|
||||
#endif
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
A cipher is accessed with the same name as done for the in-kernel
|
||||
API calls. This includes the generic vs. unique naming schema for
|
||||
ciphers as well as the enforcement of priorities for generic names.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
To interact with the kernel crypto API, a socket must be
|
||||
created by the user space application. User space invokes the cipher
|
||||
operation with the send()/write() system call family. The result of the
|
||||
cipher operation is obtained with the read()/recv() system call family.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The following API calls assume that the socket descriptor
|
||||
is already opened by the user space application and discusses only
|
||||
the kernel crypto API specific invocations.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
To initialize the socket interface, the following sequence has to
|
||||
be performed by the consumer:
|
||||
</para>
|
||||
|
||||
<orderedlist>
|
||||
<listitem>
|
||||
<para>
|
||||
Create a socket of type AF_ALG with the struct sockaddr_alg
|
||||
parameter specified below for the different cipher types.
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
Invoke bind with the socket descriptor
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
Invoke accept with the socket descriptor. The accept system call
|
||||
returns a new file descriptor that is to be used to interact with
|
||||
the particular cipher instance. When invoking send/write or recv/read
|
||||
system calls to send data to the kernel or obtain data from the
|
||||
kernel, the file descriptor returned by accept must be used.
|
||||
</para>
|
||||
</listitem>
|
||||
</orderedlist>
|
||||
</sect1>
|
||||
|
||||
<sect1><title>In-place Cipher operation</title>
|
||||
<para>
|
||||
Just like the in-kernel operation of the kernel crypto API, the user
|
||||
space interface allows the cipher operation in-place. That means that
|
||||
the input buffer used for the send/write system call and the output
|
||||
buffer used by the read/recv system call may be one and the same.
|
||||
This is of particular interest for symmetric cipher operations where a
|
||||
copying of the output data to its final destination can be avoided.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
If a consumer on the other hand wants to maintain the plaintext and
|
||||
the ciphertext in different memory locations, all a consumer needs
|
||||
to do is to provide different memory pointers for the encryption and
|
||||
decryption operation.
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
<sect1><title>Message Digest API</title>
|
||||
<para>
|
||||
The message digest type to be used for the cipher operation is
|
||||
selected when invoking the bind syscall. bind requires the caller
|
||||
to provide a filled struct sockaddr data structure. This data
|
||||
structure must be filled as follows:
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
struct sockaddr_alg sa = {
|
||||
.salg_family = AF_ALG,
|
||||
.salg_type = "hash", /* this selects the hash logic in the kernel */
|
||||
.salg_name = "sha1" /* this is the cipher name */
|
||||
};
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
The salg_type value "hash" applies to message digests and keyed
|
||||
message digests. Though, a keyed message digest is referenced by
|
||||
the appropriate salg_name. Please see below for the setsockopt
|
||||
interface that explains how the key can be set for a keyed message
|
||||
digest.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Using the send() system call, the application provides the data that
|
||||
should be processed with the message digest. The send system call
|
||||
allows the following flags to be specified:
|
||||
</para>
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>
|
||||
MSG_MORE: If this flag is set, the send system call acts like a
|
||||
message digest update function where the final hash is not
|
||||
yet calculated. If the flag is not set, the send system call
|
||||
calculates the final message digest immediately.
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
|
||||
<para>
|
||||
With the recv() system call, the application can read the message
|
||||
digest from the kernel crypto API. If the buffer is too small for the
|
||||
message digest, the flag MSG_TRUNC is set by the kernel.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
In order to set a message digest key, the calling application must use
|
||||
the setsockopt() option of ALG_SET_KEY. If the key is not set the HMAC
|
||||
operation is performed without the initial HMAC state change caused by
|
||||
the key.
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
<sect1><title>Symmetric Cipher API</title>
|
||||
<para>
|
||||
The operation is very similar to the message digest discussion.
|
||||
During initialization, the struct sockaddr data structure must be
|
||||
filled as follows:
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
struct sockaddr_alg sa = {
|
||||
.salg_family = AF_ALG,
|
||||
.salg_type = "skcipher", /* this selects the symmetric cipher */
|
||||
.salg_name = "cbc(aes)" /* this is the cipher name */
|
||||
};
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
Before data can be sent to the kernel using the write/send system
|
||||
call family, the consumer must set the key. The key setting is
|
||||
described with the setsockopt invocation below.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Using the sendmsg() system call, the application provides the data that should be processed for encryption or decryption. In addition, the IV is
|
||||
specified with the data structure provided by the sendmsg() system call.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The sendmsg system call parameter of struct msghdr is embedded into the
|
||||
struct cmsghdr data structure. See recv(2) and cmsg(3) for more
|
||||
information on how the cmsghdr data structure is used together with the
|
||||
send/recv system call family. That cmsghdr data structure holds the
|
||||
following information specified with a separate header instances:
|
||||
</para>
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>
|
||||
specification of the cipher operation type with one of these flags:
|
||||
</para>
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>ALG_OP_ENCRYPT - encryption of data</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>ALG_OP_DECRYPT - decryption of data</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
specification of the IV information marked with the flag ALG_SET_IV
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
|
||||
<para>
|
||||
The send system call family allows the following flag to be specified:
|
||||
</para>
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>
|
||||
MSG_MORE: If this flag is set, the send system call acts like a
|
||||
cipher update function where more input data is expected
|
||||
with a subsequent invocation of the send system call.
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
|
||||
<para>
|
||||
Note: The kernel reports -EINVAL for any unexpected data. The caller
|
||||
must make sure that all data matches the constraints given in
|
||||
/proc/crypto for the selected cipher.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
With the recv() system call, the application can read the result of
|
||||
the cipher operation from the kernel crypto API. The output buffer
|
||||
must be at least as large as to hold all blocks of the encrypted or
|
||||
decrypted data. If the output data size is smaller, only as many
|
||||
blocks are returned that fit into that output buffer size.
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
<sect1><title>AEAD Cipher API</title>
|
||||
<para>
|
||||
The operation is very similar to the symmetric cipher discussion.
|
||||
During initialization, the struct sockaddr data structure must be
|
||||
filled as follows:
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
struct sockaddr_alg sa = {
|
||||
.salg_family = AF_ALG,
|
||||
.salg_type = "aead", /* this selects the symmetric cipher */
|
||||
.salg_name = "gcm(aes)" /* this is the cipher name */
|
||||
};
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
Before data can be sent to the kernel using the write/send system
|
||||
call family, the consumer must set the key. The key setting is
|
||||
described with the setsockopt invocation below.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
In addition, before data can be sent to the kernel using the
|
||||
write/send system call family, the consumer must set the authentication
|
||||
tag size. To set the authentication tag size, the caller must use the
|
||||
setsockopt invocation described below.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Using the sendmsg() system call, the application provides the data that should be processed for encryption or decryption. In addition, the IV is
|
||||
specified with the data structure provided by the sendmsg() system call.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The sendmsg system call parameter of struct msghdr is embedded into the
|
||||
struct cmsghdr data structure. See recv(2) and cmsg(3) for more
|
||||
information on how the cmsghdr data structure is used together with the
|
||||
send/recv system call family. That cmsghdr data structure holds the
|
||||
following information specified with a separate header instances:
|
||||
</para>
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>
|
||||
specification of the cipher operation type with one of these flags:
|
||||
</para>
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>ALG_OP_ENCRYPT - encryption of data</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>ALG_OP_DECRYPT - decryption of data</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
specification of the IV information marked with the flag ALG_SET_IV
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
specification of the associated authentication data (AAD) with the
|
||||
flag ALG_SET_AEAD_ASSOCLEN. The AAD is sent to the kernel together
|
||||
with the plaintext / ciphertext. See below for the memory structure.
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
|
||||
<para>
|
||||
The send system call family allows the following flag to be specified:
|
||||
</para>
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>
|
||||
MSG_MORE: If this flag is set, the send system call acts like a
|
||||
cipher update function where more input data is expected
|
||||
with a subsequent invocation of the send system call.
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
|
||||
<para>
|
||||
Note: The kernel reports -EINVAL for any unexpected data. The caller
|
||||
must make sure that all data matches the constraints given in
|
||||
/proc/crypto for the selected cipher.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
With the recv() system call, the application can read the result of
|
||||
the cipher operation from the kernel crypto API. The output buffer
|
||||
must be at least as large as defined with the memory structure below.
|
||||
If the output data size is smaller, the cipher operation is not performed.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The authenticated decryption operation may indicate an integrity error.
|
||||
Such breach in integrity is marked with the -EBADMSG error code.
|
||||
</para>
|
||||
|
||||
<sect2><title>AEAD Memory Structure</title>
|
||||
<para>
|
||||
The AEAD cipher operates with the following information that
|
||||
is communicated between user and kernel space as one data stream:
|
||||
</para>
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>plaintext or ciphertext</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>associated authentication data (AAD)</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>authentication tag</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
|
||||
<para>
|
||||
The sizes of the AAD and the authentication tag are provided with
|
||||
the sendmsg and setsockopt calls (see there). As the kernel knows
|
||||
the size of the entire data stream, the kernel is now able to
|
||||
calculate the right offsets of the data components in the data
|
||||
stream.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The user space caller must arrange the aforementioned information
|
||||
in the following order:
|
||||
</para>
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>
|
||||
AEAD encryption input: AAD || plaintext
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
AEAD decryption input: AAD || ciphertext || authentication tag
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
|
||||
<para>
|
||||
The output buffer the user space caller provides must be at least as
|
||||
large to hold the following data:
|
||||
</para>
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>
|
||||
AEAD encryption output: ciphertext || authentication tag
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
AEAD decryption output: plaintext
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
</sect2>
|
||||
</sect1>
|
||||
|
||||
<sect1><title>Random Number Generator API</title>
|
||||
<para>
|
||||
Again, the operation is very similar to the other APIs.
|
||||
During initialization, the struct sockaddr data structure must be
|
||||
filled as follows:
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
struct sockaddr_alg sa = {
|
||||
.salg_family = AF_ALG,
|
||||
.salg_type = "rng", /* this selects the symmetric cipher */
|
||||
.salg_name = "drbg_nopr_sha256" /* this is the cipher name */
|
||||
};
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
Depending on the RNG type, the RNG must be seeded. The seed is provided
|
||||
using the setsockopt interface to set the key. For example, the
|
||||
ansi_cprng requires a seed. The DRBGs do not require a seed, but
|
||||
may be seeded.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Using the read()/recvmsg() system calls, random numbers can be obtained.
|
||||
The kernel generates at most 128 bytes in one call. If user space
|
||||
requires more data, multiple calls to read()/recvmsg() must be made.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
WARNING: The user space caller may invoke the initially mentioned
|
||||
accept system call multiple times. In this case, the returned file
|
||||
descriptors have the same state.
|
||||
</para>
|
||||
|
||||
</sect1>
|
||||
|
||||
<sect1><title>Zero-Copy Interface</title>
|
||||
<para>
|
||||
In addition to the send/write/read/recv system call familty, the AF_ALG
|
||||
interface can be accessed with the zero-copy interface of splice/vmsplice.
|
||||
As the name indicates, the kernel tries to avoid a copy operation into
|
||||
kernel space.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The zero-copy operation requires data to be aligned at the page boundary.
|
||||
Non-aligned data can be used as well, but may require more operations of
|
||||
the kernel which would defeat the speed gains obtained from the zero-copy
|
||||
interface.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The system-interent limit for the size of one zero-copy operation is
|
||||
16 pages. If more data is to be sent to AF_ALG, user space must slice
|
||||
the input into segments with a maximum size of 16 pages.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Zero-copy can be used with the following code example (a complete working
|
||||
example is provided with libkcapi):
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
int pipes[2];
|
||||
|
||||
pipe(pipes);
|
||||
/* input data in iov */
|
||||
vmsplice(pipes[1], iov, iovlen, SPLICE_F_GIFT);
|
||||
/* opfd is the file descriptor returned from accept() system call */
|
||||
splice(pipes[0], NULL, opfd, NULL, ret, 0);
|
||||
read(opfd, out, outlen);
|
||||
</programlisting>
|
||||
|
||||
</sect1>
|
||||
|
||||
<sect1><title>Setsockopt Interface</title>
|
||||
<para>
|
||||
In addition to the read/recv and send/write system call handling
|
||||
to send and retrieve data subject to the cipher operation, a consumer
|
||||
also needs to set the additional information for the cipher operation.
|
||||
This additional information is set using the setsockopt system call
|
||||
that must be invoked with the file descriptor of the open cipher
|
||||
(i.e. the file descriptor returned by the accept system call).
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Each setsockopt invocation must use the level SOL_ALG.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The setsockopt interface allows setting the following data using
|
||||
the mentioned optname:
|
||||
</para>
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>
|
||||
ALG_SET_KEY -- Setting the key. Key setting is applicable to:
|
||||
</para>
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>the skcipher cipher type (symmetric ciphers)</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>the hash cipher type (keyed message digests)</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>the AEAD cipher type</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>the RNG cipher type to provide the seed</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
ALG_SET_AEAD_AUTHSIZE -- Setting the authentication tag size
|
||||
for AEAD ciphers. For a encryption operation, the authentication
|
||||
tag of the given size will be generated. For a decryption operation,
|
||||
the provided ciphertext is assumed to contain an authentication tag
|
||||
of the given size (see section about AEAD memory layout below).
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
|
||||
</sect1>
|
||||
|
||||
<sect1><title>User space API example</title>
|
||||
<para>
|
||||
Please see [1] for libkcapi which provides an easy-to-use wrapper
|
||||
around the aforementioned Netlink kernel interface. [1] also contains
|
||||
a test application that invokes all libkcapi API calls.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
[1] http://www.chronox.de/libkcapi.html
|
||||
</para>
|
||||
|
||||
</sect1>
|
||||
|
||||
</chapter>
|
||||
|
||||
<chapter id="API"><title>Programming Interface</title>
|
||||
<sect1><title>Block Cipher Context Data Structures</title>
|
||||
!Pinclude/linux/crypto.h Block Cipher Context Data Structures
|
||||
|
|
|
@ -95,8 +95,7 @@ since it doesn't need to allocate a table as large as the largest
|
|||
hwirq number. The disadvantage is that hwirq to IRQ number lookup is
|
||||
dependent on how many entries are in the table.
|
||||
|
||||
Very few drivers should need this mapping. At the moment, powerpc
|
||||
iseries is the only user.
|
||||
Very few drivers should need this mapping.
|
||||
|
||||
==== No Map ===-
|
||||
irq_domain_add_nomap()
|
||||
|
|
|
@ -614,8 +614,8 @@ The canonical patch message body contains the following:
|
|||
|
||||
- An empty line.
|
||||
|
||||
- The body of the explanation, which will be copied to the
|
||||
permanent changelog to describe this patch.
|
||||
- The body of the explanation, line wrapped at 75 columns, which will
|
||||
be copied to the permanent changelog to describe this patch.
|
||||
|
||||
- The "Signed-off-by:" lines, described above, which will
|
||||
also go in the changelog.
|
||||
|
|
|
@ -1,17 +1,31 @@
|
|||
Network Block Device (TCP version)
|
||||
|
||||
What is it: With this compiled in the kernel (or as a module), Linux
|
||||
can use a remote server as one of its block devices. So every time
|
||||
the client computer wants to read, e.g., /dev/nb0, it sends a
|
||||
request over TCP to the server, which will reply with the data read.
|
||||
This can be used for stations with low disk space (or even diskless)
|
||||
to borrow disk space from another computer.
|
||||
Unlike NFS, it is possible to put any filesystem on it, etc.
|
||||
Network Block Device (TCP version)
|
||||
==================================
|
||||
|
||||
For more information, or to download the nbd-client and nbd-server
|
||||
tools, go to http://nbd.sf.net/.
|
||||
1) Overview
|
||||
-----------
|
||||
|
||||
What is it: With this compiled in the kernel (or as a module), Linux
|
||||
can use a remote server as one of its block devices. So every time
|
||||
the client computer wants to read, e.g., /dev/nb0, it sends a
|
||||
request over TCP to the server, which will reply with the data read.
|
||||
This can be used for stations with low disk space (or even diskless)
|
||||
to borrow disk space from another computer.
|
||||
Unlike NFS, it is possible to put any filesystem on it, etc.
|
||||
|
||||
For more information, or to download the nbd-client and nbd-server
|
||||
tools, go to http://nbd.sf.net/.
|
||||
|
||||
The nbd kernel module need only be installed on the client
|
||||
system, as the nbd-server is completely in userspace. In fact,
|
||||
the nbd-server has been successfully ported to other operating
|
||||
systems, including Windows.
|
||||
|
||||
A) NBD parameters
|
||||
-----------------
|
||||
|
||||
max_part
|
||||
Number of partitions per device (default: 0).
|
||||
|
||||
nbds_max
|
||||
Number of block devices that should be initialized (default: 16).
|
||||
|
||||
The nbd kernel module need only be installed on the client
|
||||
system, as the nbd-server is completely in userspace. In fact,
|
||||
the nbd-server has been successfully ported to other operating
|
||||
systems, including Windows.
|
||||
|
|
|
@ -98,20 +98,79 @@ size of the disk when not in use so a huge zram is wasteful.
|
|||
mount /dev/zram1 /tmp
|
||||
|
||||
7) Stats:
|
||||
Per-device statistics are exported as various nodes under
|
||||
/sys/block/zram<id>/
|
||||
disksize
|
||||
num_reads
|
||||
num_writes
|
||||
failed_reads
|
||||
failed_writes
|
||||
invalid_io
|
||||
notify_free
|
||||
zero_pages
|
||||
orig_data_size
|
||||
compr_data_size
|
||||
mem_used_total
|
||||
mem_used_max
|
||||
Per-device statistics are exported as various nodes under /sys/block/zram<id>/
|
||||
|
||||
A brief description of exported device attritbutes. For more details please
|
||||
read Documentation/ABI/testing/sysfs-block-zram.
|
||||
|
||||
Name access description
|
||||
---- ------ -----------
|
||||
disksize RW show and set the device's disk size
|
||||
initstate RO shows the initialization state of the device
|
||||
reset WO trigger device reset
|
||||
num_reads RO the number of reads
|
||||
failed_reads RO the number of failed reads
|
||||
num_write RO the number of writes
|
||||
failed_writes RO the number of failed writes
|
||||
invalid_io RO the number of non-page-size-aligned I/O requests
|
||||
max_comp_streams RW the number of possible concurrent compress operations
|
||||
comp_algorithm RW show and change the compression algorithm
|
||||
notify_free RO the number of notifications to free pages (either
|
||||
slot free notifications or REQ_DISCARD requests)
|
||||
zero_pages RO the number of zero filled pages written to this disk
|
||||
orig_data_size RO uncompressed size of data stored in this disk
|
||||
compr_data_size RO compressed size of data stored in this disk
|
||||
mem_used_total RO the amount of memory allocated for this disk
|
||||
mem_used_max RW the maximum amount memory zram have consumed to
|
||||
store compressed data
|
||||
mem_limit RW the maximum amount of memory ZRAM can use to store
|
||||
the compressed data
|
||||
num_migrated RO the number of objects migrated migrated by compaction
|
||||
|
||||
|
||||
WARNING
|
||||
=======
|
||||
per-stat sysfs attributes are considered to be deprecated.
|
||||
The basic strategy is:
|
||||
-- the existing RW nodes will be downgraded to WO nodes (in linux 4.11)
|
||||
-- deprecated RO sysfs nodes will eventually be removed (in linux 4.11)
|
||||
|
||||
The list of deprecated attributes can be found here:
|
||||
Documentation/ABI/obsolete/sysfs-block-zram
|
||||
|
||||
Basically, every attribute that has its own read accessible sysfs node
|
||||
(e.g. num_reads) *AND* is accessible via one of the stat files (zram<id>/stat
|
||||
or zram<id>/io_stat or zram<id>/mm_stat) is considered to be deprecated.
|
||||
|
||||
User space is advised to use the following files to read the device statistics.
|
||||
|
||||
File /sys/block/zram<id>/stat
|
||||
|
||||
Represents block layer statistics. Read Documentation/block/stat.txt for
|
||||
details.
|
||||
|
||||
File /sys/block/zram<id>/io_stat
|
||||
|
||||
The stat file represents device's I/O statistics not accounted by block
|
||||
layer and, thus, not available in zram<id>/stat file. It consists of a
|
||||
single line of text and contains the following stats separated by
|
||||
whitespace:
|
||||
failed_reads
|
||||
failed_writes
|
||||
invalid_io
|
||||
notify_free
|
||||
|
||||
File /sys/block/zram<id>/mm_stat
|
||||
|
||||
The stat file represents device's mm statistics. It consists of a single
|
||||
line of text and contains the following stats separated by whitespace:
|
||||
orig_data_size
|
||||
compr_data_size
|
||||
mem_used_total
|
||||
mem_limit
|
||||
mem_used_max
|
||||
zero_pages
|
||||
num_migrated
|
||||
|
||||
8) Deactivate:
|
||||
swapoff /dev/zram0
|
||||
|
|
|
@ -1,205 +0,0 @@
|
|||
Introduction
|
||||
============
|
||||
|
||||
The concepts of the kernel crypto API visible to kernel space is fully
|
||||
applicable to the user space interface as well. Therefore, the kernel crypto API
|
||||
high level discussion for the in-kernel use cases applies here as well.
|
||||
|
||||
The major difference, however, is that user space can only act as a consumer
|
||||
and never as a provider of a transformation or cipher algorithm.
|
||||
|
||||
The following covers the user space interface exported by the kernel crypto
|
||||
API. A working example of this description is libkcapi that can be obtained from
|
||||
[1]. That library can be used by user space applications that require
|
||||
cryptographic services from the kernel.
|
||||
|
||||
Some details of the in-kernel kernel crypto API aspects do not
|
||||
apply to user space, however. This includes the difference between synchronous
|
||||
and asynchronous invocations. The user space API call is fully synchronous.
|
||||
In addition, only a subset of all cipher types are available as documented
|
||||
below.
|
||||
|
||||
|
||||
User space API general remarks
|
||||
==============================
|
||||
|
||||
The kernel crypto API is accessible from user space. Currently, the following
|
||||
ciphers are accessible:
|
||||
|
||||
* Message digest including keyed message digest (HMAC, CMAC)
|
||||
|
||||
* Symmetric ciphers
|
||||
|
||||
Note, AEAD ciphers are currently not supported via the symmetric cipher
|
||||
interface.
|
||||
|
||||
The interface is provided via Netlink using the type AF_ALG. In addition, the
|
||||
setsockopt option type is SOL_ALG. In case the user space header files do not
|
||||
export these flags yet, use the following macros:
|
||||
|
||||
#ifndef AF_ALG
|
||||
#define AF_ALG 38
|
||||
#endif
|
||||
#ifndef SOL_ALG
|
||||
#define SOL_ALG 279
|
||||
#endif
|
||||
|
||||
A cipher is accessed with the same name as done for the in-kernel API calls.
|
||||
This includes the generic vs. unique naming schema for ciphers as well as the
|
||||
enforcement of priorities for generic names.
|
||||
|
||||
To interact with the kernel crypto API, a Netlink socket must be created by
|
||||
the user space application. User space invokes the cipher operation with the
|
||||
send/write system call family. The result of the cipher operation is obtained
|
||||
with the read/recv system call family.
|
||||
|
||||
The following API calls assume that the Netlink socket descriptor is already
|
||||
opened by the user space application and discusses only the kernel crypto API
|
||||
specific invocations.
|
||||
|
||||
To initialize a Netlink interface, the following sequence has to be performed
|
||||
by the consumer:
|
||||
|
||||
1. Create a socket of type AF_ALG with the struct sockaddr_alg parameter
|
||||
specified below for the different cipher types.
|
||||
|
||||
2. Invoke bind with the socket descriptor
|
||||
|
||||
3. Invoke accept with the socket descriptor. The accept system call
|
||||
returns a new file descriptor that is to be used to interact with
|
||||
the particular cipher instance. When invoking send/write or recv/read
|
||||
system calls to send data to the kernel or obtain data from the
|
||||
kernel, the file descriptor returned by accept must be used.
|
||||
|
||||
In-place cipher operation
|
||||
=========================
|
||||
|
||||
Just like the in-kernel operation of the kernel crypto API, the user space
|
||||
interface allows the cipher operation in-place. That means that the input buffer
|
||||
used for the send/write system call and the output buffer used by the read/recv
|
||||
system call may be one and the same. This is of particular interest for
|
||||
symmetric cipher operations where a copying of the output data to its final
|
||||
destination can be avoided.
|
||||
|
||||
If a consumer on the other hand wants to maintain the plaintext and the
|
||||
ciphertext in different memory locations, all a consumer needs to do is to
|
||||
provide different memory pointers for the encryption and decryption operation.
|
||||
|
||||
Message digest API
|
||||
==================
|
||||
|
||||
The message digest type to be used for the cipher operation is selected when
|
||||
invoking the bind syscall. bind requires the caller to provide a filled
|
||||
struct sockaddr data structure. This data structure must be filled as follows:
|
||||
|
||||
struct sockaddr_alg sa = {
|
||||
.salg_family = AF_ALG,
|
||||
.salg_type = "hash", /* this selects the hash logic in the kernel */
|
||||
.salg_name = "sha1" /* this is the cipher name */
|
||||
};
|
||||
|
||||
The salg_type value "hash" applies to message digests and keyed message digests.
|
||||
Though, a keyed message digest is referenced by the appropriate salg_name.
|
||||
Please see below for the setsockopt interface that explains how the key can be
|
||||
set for a keyed message digest.
|
||||
|
||||
Using the send() system call, the application provides the data that should be
|
||||
processed with the message digest. The send system call allows the following
|
||||
flags to be specified:
|
||||
|
||||
* MSG_MORE: If this flag is set, the send system call acts like a
|
||||
message digest update function where the final hash is not
|
||||
yet calculated. If the flag is not set, the send system call
|
||||
calculates the final message digest immediately.
|
||||
|
||||
With the recv() system call, the application can read the message digest from
|
||||
the kernel crypto API. If the buffer is too small for the message digest, the
|
||||
flag MSG_TRUNC is set by the kernel.
|
||||
|
||||
In order to set a message digest key, the calling application must use the
|
||||
setsockopt() option of ALG_SET_KEY. If the key is not set the HMAC operation is
|
||||
performed without the initial HMAC state change caused by the key.
|
||||
|
||||
|
||||
Symmetric cipher API
|
||||
====================
|
||||
|
||||
The operation is very similar to the message digest discussion. During
|
||||
initialization, the struct sockaddr data structure must be filled as follows:
|
||||
|
||||
struct sockaddr_alg sa = {
|
||||
.salg_family = AF_ALG,
|
||||
.salg_type = "skcipher", /* this selects the symmetric cipher */
|
||||
.salg_name = "cbc(aes)" /* this is the cipher name */
|
||||
};
|
||||
|
||||
Before data can be sent to the kernel using the write/send system call family,
|
||||
the consumer must set the key. The key setting is described with the setsockopt
|
||||
invocation below.
|
||||
|
||||
Using the sendmsg() system call, the application provides the data that should
|
||||
be processed for encryption or decryption. In addition, the IV is specified
|
||||
with the data structure provided by the sendmsg() system call.
|
||||
|
||||
The sendmsg system call parameter of struct msghdr is embedded into the
|
||||
struct cmsghdr data structure. See recv(2) and cmsg(3) for more information
|
||||
on how the cmsghdr data structure is used together with the send/recv system
|
||||
call family. That cmsghdr data structure holds the following information
|
||||
specified with a separate header instances:
|
||||
|
||||
* specification of the cipher operation type with one of these flags:
|
||||
ALG_OP_ENCRYPT - encryption of data
|
||||
ALG_OP_DECRYPT - decryption of data
|
||||
|
||||
* specification of the IV information marked with the flag ALG_SET_IV
|
||||
|
||||
The send system call family allows the following flag to be specified:
|
||||
|
||||
* MSG_MORE: If this flag is set, the send system call acts like a
|
||||
cipher update function where more input data is expected
|
||||
with a subsequent invocation of the send system call.
|
||||
|
||||
Note: The kernel reports -EINVAL for any unexpected data. The caller must
|
||||
make sure that all data matches the constraints given in /proc/crypto for the
|
||||
selected cipher.
|
||||
|
||||
With the recv() system call, the application can read the result of the
|
||||
cipher operation from the kernel crypto API. The output buffer must be at least
|
||||
as large as to hold all blocks of the encrypted or decrypted data. If the output
|
||||
data size is smaller, only as many blocks are returned that fit into that
|
||||
output buffer size.
|
||||
|
||||
Setsockopt interface
|
||||
====================
|
||||
|
||||
In addition to the read/recv and send/write system call handling to send and
|
||||
retrieve data subject to the cipher operation, a consumer also needs to set
|
||||
the additional information for the cipher operation. This additional information
|
||||
is set using the setsockopt system call that must be invoked with the file
|
||||
descriptor of the open cipher (i.e. the file descriptor returned by the
|
||||
accept system call).
|
||||
|
||||
Each setsockopt invocation must use the level SOL_ALG.
|
||||
|
||||
The setsockopt interface allows setting the following data using the mentioned
|
||||
optname:
|
||||
|
||||
* ALG_SET_KEY -- Setting the key. Key setting is applicable to:
|
||||
|
||||
- the skcipher cipher type (symmetric ciphers)
|
||||
|
||||
- the hash cipher type (keyed message digests)
|
||||
|
||||
User space API example
|
||||
======================
|
||||
|
||||
Please see [1] for libkcapi which provides an easy-to-use wrapper around the
|
||||
aforementioned Netlink kernel interface. [1] also contains a test application
|
||||
that invokes all libkcapi API calls.
|
||||
|
||||
[1] http://www.chronox.de/libkcapi.html
|
||||
|
||||
Author
|
||||
======
|
||||
|
||||
Stephan Mueller <smueller@chronox.de>
|
|
@ -26,6 +26,13 @@ Required properties:
|
|||
|
||||
Optional properties:
|
||||
|
||||
- interrupt-affinity : Valid only when using SPIs, specifies a list of phandles
|
||||
to CPU nodes corresponding directly to the affinity of
|
||||
the SPIs listed in the interrupts property.
|
||||
|
||||
This property should be present when there is more than
|
||||
a single SPI.
|
||||
|
||||
- qcom,no-pc-write : Indicates that this PMU doesn't support the 0xc and 0xd
|
||||
events.
|
||||
|
||||
|
|
|
@ -0,0 +1,123 @@
|
|||
Imagination Technologies Pistachio SoC clock controllers
|
||||
========================================================
|
||||
|
||||
Pistachio has four clock controllers (core clock, peripheral clock, peripheral
|
||||
general control, and top general control) which are instantiated individually
|
||||
from the device-tree.
|
||||
|
||||
External clocks:
|
||||
----------------
|
||||
|
||||
There are three external inputs to the clock controllers which should be
|
||||
defined with the following clock-output-names:
|
||||
- "xtal": External 52Mhz oscillator (required)
|
||||
- "audio_clk_in": Alternate audio reference clock (optional)
|
||||
- "enet_clk_in": Alternate ethernet PHY clock (optional)
|
||||
|
||||
Core clock controller:
|
||||
----------------------
|
||||
|
||||
The core clock controller generates clocks for the CPU, RPU (WiFi + BT
|
||||
co-processor), audio, and several peripherals.
|
||||
|
||||
Required properties:
|
||||
- compatible: Must be "img,pistachio-clk".
|
||||
- reg: Must contain the base address and length of the core clock controller.
|
||||
- #clock-cells: Must be 1. The single cell is the clock identifier.
|
||||
See dt-bindings/clock/pistachio-clk.h for the list of valid identifiers.
|
||||
- clocks: Must contain an entry for each clock in clock-names.
|
||||
- clock-names: Must include "xtal" (see "External clocks") and
|
||||
"audio_clk_in_gate", "enet_clk_in_gate" which are generated by the
|
||||
top-level general control.
|
||||
|
||||
Example:
|
||||
clk_core: clock-controller@18144000 {
|
||||
compatible = "img,pistachio-clk";
|
||||
reg = <0x18144000 0x800>;
|
||||
clocks = <&xtal>, <&cr_top EXT_CLK_AUDIO_IN>,
|
||||
<&cr_top EXT_CLK_ENET_IN>;
|
||||
clock-names = "xtal", "audio_clk_in_gate", "enet_clk_in_gate";
|
||||
|
||||
#clock-cells = <1>;
|
||||
};
|
||||
|
||||
Peripheral clock controller:
|
||||
----------------------------
|
||||
|
||||
The peripheral clock controller generates clocks for the DDR, ROM, and other
|
||||
peripherals. The peripheral system clock ("periph_sys") generated by the core
|
||||
clock controller is the input clock to the peripheral clock controller.
|
||||
|
||||
Required properties:
|
||||
- compatible: Must be "img,pistachio-periph-clk".
|
||||
- reg: Must contain the base address and length of the peripheral clock
|
||||
controller.
|
||||
- #clock-cells: Must be 1. The single cell is the clock identifier.
|
||||
See dt-bindings/clock/pistachio-clk.h for the list of valid identifiers.
|
||||
- clocks: Must contain an entry for each clock in clock-names.
|
||||
- clock-names: Must include "periph_sys", the peripheral system clock generated
|
||||
by the core clock controller.
|
||||
|
||||
Example:
|
||||
clk_periph: clock-controller@18144800 {
|
||||
compatible = "img,pistachio-clk-periph";
|
||||
reg = <0x18144800 0x800>;
|
||||
clocks = <&clk_core CLK_PERIPH_SYS>;
|
||||
clock-names = "periph_sys";
|
||||
|
||||
#clock-cells = <1>;
|
||||
};
|
||||
|
||||
Peripheral general control:
|
||||
---------------------------
|
||||
|
||||
The peripheral general control block generates system interface clocks and
|
||||
resets for various peripherals. It also contains miscellaneous peripheral
|
||||
control registers. The system clock ("sys") generated by the peripheral clock
|
||||
controller is the input clock to the system clock controller.
|
||||
|
||||
Required properties:
|
||||
- compatible: Must include "img,pistachio-periph-cr" and "syscon".
|
||||
- reg: Must contain the base address and length of the peripheral general
|
||||
control registers.
|
||||
- #clock-cells: Must be 1. The single cell is the clock identifier.
|
||||
See dt-bindings/clock/pistachio-clk.h for the list of valid identifiers.
|
||||
- clocks: Must contain an entry for each clock in clock-names.
|
||||
- clock-names: Must include "sys", the system clock generated by the peripheral
|
||||
clock controller.
|
||||
|
||||
Example:
|
||||
cr_periph: syscon@18144800 {
|
||||
compatible = "img,pistachio-cr-periph", "syscon";
|
||||
reg = <0x18148000 0x1000>;
|
||||
clocks = <&clock_periph PERIPH_CLK_PERIPH_SYS>;
|
||||
clock-names = "sys";
|
||||
|
||||
#clock-cells = <1>;
|
||||
};
|
||||
|
||||
Top-level general control:
|
||||
--------------------------
|
||||
|
||||
The top-level general control block contains miscellaneous control registers and
|
||||
gates for the external clocks "audio_clk_in" and "enet_clk_in".
|
||||
|
||||
Required properties:
|
||||
- compatible: Must include "img,pistachio-cr-top" and "syscon".
|
||||
- reg: Must contain the base address and length of the top-level
|
||||
control registers.
|
||||
- clocks: Must contain an entry for each clock in clock-names.
|
||||
- clock-names: Two optional clocks, "audio_clk_in" and "enet_clk_in" (see
|
||||
"External clocks").
|
||||
- #clock-cells: Must be 1. The single cell is the clock identifier.
|
||||
See dt-bindings/clock/pistachio-clk.h for the list of valid identifiers.
|
||||
|
||||
Example:
|
||||
cr_top: syscon@18144800 {
|
||||
compatible = "img,pistachio-cr-top", "syscon";
|
||||
reg = <0x18149000 0x200>;
|
||||
clocks = <&audio_refclk>, <&ext_enet_in>;
|
||||
clock-names = "audio_clk_in", "enet_clk_in";
|
||||
|
||||
#clock-cells = <1>;
|
||||
};
|
|
@ -0,0 +1,27 @@
|
|||
Imagination Technologies hardware hash accelerator
|
||||
|
||||
The hash accelerator provides hardware hashing acceleration for
|
||||
SHA1, SHA224, SHA256 and MD5 hashes
|
||||
|
||||
Required properties:
|
||||
|
||||
- compatible : "img,hash-accelerator"
|
||||
- reg : Offset and length of the register set for the module, and the DMA port
|
||||
- interrupts : The designated IRQ line for the hashing module.
|
||||
- dmas : DMA specifier as per Documentation/devicetree/bindings/dma/dma.txt
|
||||
- dma-names : Should be "tx"
|
||||
- clocks : Clock specifiers
|
||||
- clock-names : "sys" Used to clock the hash block registers
|
||||
"hash" Used to clock data through the accelerator
|
||||
|
||||
Example:
|
||||
|
||||
hash: hash@18149600 {
|
||||
compatible = "img,hash-accelerator";
|
||||
reg = <0x18149600 0x100>, <0x18101100 0x4>;
|
||||
interrupts = <GIC_SHARED 59 IRQ_TYPE_LEVEL_HIGH>;
|
||||
dmas = <&dma 8 0xffffffff 0>;
|
||||
dma-names = "tx";
|
||||
clocks = <&cr_periph SYS_CLK_HASH>, <&clk_periph PERIPH_CLK_ROM>;
|
||||
clock-names = "sys", "hash";
|
||||
};
|
|
@ -0,0 +1,12 @@
|
|||
HWRNG support for the iproc-rng200 driver
|
||||
|
||||
Required properties:
|
||||
- compatible : "brcm,iproc-rng200"
|
||||
- reg : base address and size of control register block
|
||||
|
||||
Example:
|
||||
|
||||
rng {
|
||||
compatible = "brcm,iproc-rng200";
|
||||
reg = <0x18032000 0x28>;
|
||||
};
|
|
@ -0,0 +1,41 @@
|
|||
Broadcom BCM3380-style Level 1 / Level 2 interrupt controller
|
||||
|
||||
This interrupt controller shows up in various forms on many BCM338x/BCM63xx
|
||||
chipsets. It has the following properties:
|
||||
|
||||
- outputs a single interrupt signal to its interrupt controller parent
|
||||
|
||||
- contains one or more enable/status word pairs, which often appear at
|
||||
different offsets in different blocks
|
||||
|
||||
- no atomic set/clear operations
|
||||
|
||||
Required properties:
|
||||
|
||||
- compatible: should be "brcm,bcm3380-l2-intc"
|
||||
- reg: specifies one or more enable/status pairs, in the following format:
|
||||
<enable_reg 0x4 status_reg 0x4>...
|
||||
- interrupt-controller: identifies the node as an interrupt controller
|
||||
- #interrupt-cells: specifies the number of cells needed to encode an interrupt
|
||||
source, should be 1.
|
||||
- interrupt-parent: specifies the phandle to the parent interrupt controller
|
||||
this one is cascaded from
|
||||
- interrupts: specifies the interrupt line in the interrupt-parent controller
|
||||
node, valid values depend on the type of parent interrupt controller
|
||||
|
||||
Optional properties:
|
||||
|
||||
- brcm,irq-can-wake: if present, this means the L2 controller can be used as a
|
||||
wakeup source for system suspend/resume.
|
||||
|
||||
Example:
|
||||
|
||||
irq0_intc: interrupt-controller@10000020 {
|
||||
compatible = "brcm,bcm3380-l2-intc";
|
||||
reg = <0x10000024 0x4 0x1000002c 0x4>,
|
||||
<0x10000020 0x4 0x10000028 0x4>;
|
||||
interrupt-controller;
|
||||
#interrupt-cells = <1>;
|
||||
interrupt-parent = <&cpu_intc>;
|
||||
interrupts = <2>;
|
||||
};
|
|
@ -0,0 +1,52 @@
|
|||
Broadcom BCM7038-style Level 1 interrupt controller
|
||||
|
||||
This block is a first level interrupt controller that is typically connected
|
||||
directly to one of the HW INT lines on each CPU. Every BCM7xxx set-top chip
|
||||
since BCM7038 has contained this hardware.
|
||||
|
||||
Key elements of the hardware design include:
|
||||
|
||||
- 64, 96, 128, or 160 incoming level IRQ lines
|
||||
|
||||
- Most onchip peripherals are wired directly to an L1 input
|
||||
|
||||
- A separate instance of the register set for each CPU, allowing individual
|
||||
peripheral IRQs to be routed to any CPU
|
||||
|
||||
- Atomic mask/unmask operations
|
||||
|
||||
- No polarity/level/edge settings
|
||||
|
||||
- No FIFO or priority encoder logic; software is expected to read all
|
||||
2-5 status words to determine which IRQs are pending
|
||||
|
||||
Required properties:
|
||||
|
||||
- compatible: should be "brcm,bcm7038-l1-intc"
|
||||
- reg: specifies the base physical address and size of the registers;
|
||||
the number of supported IRQs is inferred from the size argument
|
||||
- interrupt-controller: identifies the node as an interrupt controller
|
||||
- #interrupt-cells: specifies the number of cells needed to encode an interrupt
|
||||
source, should be 1.
|
||||
- interrupt-parent: specifies the phandle to the parent interrupt controller(s)
|
||||
this one is cascaded from
|
||||
- interrupts: specifies the interrupt line(s) in the interrupt-parent controller
|
||||
node; valid values depend on the type of parent interrupt controller
|
||||
|
||||
If multiple reg ranges and interrupt-parent entries are present on an SMP
|
||||
system, the driver will allow IRQ SMP affinity to be set up through the
|
||||
/proc/irq/ interface. In the simplest possible configuration, only one
|
||||
reg range and one interrupt-parent is needed.
|
||||
|
||||
Example:
|
||||
|
||||
periph_intc: periph_intc@1041a400 {
|
||||
compatible = "brcm,bcm7038-l1-intc";
|
||||
reg = <0x1041a400 0x30 0x1041a600 0x30>;
|
||||
|
||||
interrupt-controller;
|
||||
#interrupt-cells = <1>;
|
||||
|
||||
interrupt-parent = <&cpu_intc>;
|
||||
interrupts = <2>, <3>;
|
||||
};
|
|
@ -13,8 +13,7 @@ Such an interrupt controller has the following hardware design:
|
|||
or if they will output an interrupt signal at this 2nd level interrupt
|
||||
controller, in particular for UARTs
|
||||
|
||||
- typically has one 32-bit enable word and one 32-bit status word, but on
|
||||
some hardware may have more than one enable/status pair
|
||||
- has one 32-bit enable word and one 32-bit status word
|
||||
|
||||
- no atomic set/clear operations
|
||||
|
||||
|
@ -53,9 +52,7 @@ The typical hardware layout for this controller is represented below:
|
|||
Required properties:
|
||||
|
||||
- compatible: should be "brcm,bcm7120-l2-intc"
|
||||
- reg: specifies the base physical address and size of the registers;
|
||||
multiple pairs may be specified, with the first pair handling IRQ offsets
|
||||
0..31 and the second pair handling 32..63
|
||||
- reg: specifies the base physical address and size of the registers
|
||||
- interrupt-controller: identifies the node as an interrupt controller
|
||||
- #interrupt-cells: specifies the number of cells needed to encode an interrupt
|
||||
source, should be 1.
|
||||
|
@ -66,10 +63,7 @@ Required properties:
|
|||
- brcm,int-map-mask: 32-bits bit mask describing how many and which interrupts
|
||||
are wired to this 2nd level interrupt controller, and how they match their
|
||||
respective interrupt parents. Should match exactly the number of interrupts
|
||||
specified in the 'interrupts' property, multiplied by the number of
|
||||
enable/status register pairs implemented by this controller. For
|
||||
multiple parent IRQs with multiple enable/status words, this looks like:
|
||||
<irq0_w0 irq0_w1 irq1_w0 irq1_w1 ...>
|
||||
specified in the 'interrupts' property.
|
||||
|
||||
Optional properties:
|
||||
|
||||
|
|
|
@ -0,0 +1,18 @@
|
|||
* Xtensa Interrupt Distributor and Programmable Interrupt Controller (MX)
|
||||
|
||||
Required properties:
|
||||
- compatible: Should be "cdns,xtensa-mx".
|
||||
|
||||
Remaining properties have exact same meaning as in Xtensa PIC
|
||||
(see cdns,xtensa-pic.txt).
|
||||
|
||||
Examples:
|
||||
pic: pic {
|
||||
compatible = "cdns,xtensa-mx";
|
||||
/* one cell: internal irq number,
|
||||
* two cells: second cell == 0: internal irq number
|
||||
* second cell == 1: external irq number
|
||||
*/
|
||||
#interrupt-cells = <2>;
|
||||
interrupt-controller;
|
||||
};
|
|
@ -0,0 +1,25 @@
|
|||
* Xtensa built-in Programmable Interrupt Controller (PIC)
|
||||
|
||||
Required properties:
|
||||
- compatible: Should be "cdns,xtensa-pic".
|
||||
- interrupt-controller: Identifies the node as an interrupt controller.
|
||||
- #interrupt-cells: The number of cells to define the interrupts.
|
||||
It may be either 1 or 2.
|
||||
When it's 1, the first cell is the internal IRQ number.
|
||||
When it's 2, the first cell is the IRQ number, and the second cell
|
||||
specifies whether it's internal (0) or external (1).
|
||||
Periferals are usually connected to a fixed external IRQ, but for different
|
||||
core variants it may be mapped to different internal IRQ.
|
||||
IRQ sensitivity and priority are fixed for each core variant and may not be
|
||||
changed at runtime.
|
||||
|
||||
Examples:
|
||||
pic: pic {
|
||||
compatible = "cdns,xtensa-pic";
|
||||
/* one cell: internal irq number,
|
||||
* two cells: second cell == 0: internal irq number
|
||||
* second cell == 1: external irq number
|
||||
*/
|
||||
#interrupt-cells = <2>;
|
||||
interrupt-controller;
|
||||
};
|
|
@ -27,8 +27,13 @@ Optional properties:
|
|||
Required properties for timer sub-node:
|
||||
- compatible : Should be "mti,gic-timer".
|
||||
- interrupts : Interrupt for the GIC local timer.
|
||||
|
||||
Optional properties for timer sub-node:
|
||||
- clocks : GIC timer operating clock.
|
||||
- clock-frequency : Clock frequency at which the GIC timers operate.
|
||||
|
||||
Note that one of clocks or clock-frequency must be specified.
|
||||
|
||||
Example:
|
||||
|
||||
gic: interrupt-controller@1bdc0000 {
|
||||
|
|
|
@ -14,8 +14,10 @@ Optional properties for child nodes:
|
|||
- led-sources : List of device current outputs the LED is connected to. The
|
||||
outputs are identified by the numbers that must be defined
|
||||
in the LED device binding documentation.
|
||||
- label : The label for this LED. If omitted, the label is
|
||||
taken from the node name (excluding the unit address).
|
||||
- label : The label for this LED. If omitted, the label is taken from the node
|
||||
name (excluding the unit address). It has to uniquely identify
|
||||
a device, i.e. no other LED class device can be assigned the same
|
||||
label.
|
||||
|
||||
- linux,default-trigger : This parameter, if present, is a
|
||||
string defining the trigger assigned to the LED. Current triggers are:
|
||||
|
|
|
@ -26,16 +26,18 @@ LED sub-node properties:
|
|||
|
||||
Examples:
|
||||
|
||||
#include <dt-bindings/gpio/gpio.h>
|
||||
|
||||
leds {
|
||||
compatible = "gpio-leds";
|
||||
hdd {
|
||||
label = "IDE Activity";
|
||||
gpios = <&mcu_pio 0 1>; /* Active low */
|
||||
gpios = <&mcu_pio 0 GPIO_ACTIVE_LOW>;
|
||||
linux,default-trigger = "ide-disk";
|
||||
};
|
||||
|
||||
fault {
|
||||
gpios = <&mcu_pio 1 0>;
|
||||
gpios = <&mcu_pio 1 GPIO_ACTIVE_HIGH>;
|
||||
/* Keep LED on if BIOS detected hardware fault */
|
||||
default-state = "keep";
|
||||
};
|
||||
|
@ -44,11 +46,11 @@ leds {
|
|||
run-control {
|
||||
compatible = "gpio-leds";
|
||||
red {
|
||||
gpios = <&mpc8572 6 0>;
|
||||
gpios = <&mpc8572 6 GPIO_ACTIVE_HIGH>;
|
||||
default-state = "off";
|
||||
};
|
||||
green {
|
||||
gpios = <&mpc8572 7 0>;
|
||||
gpios = <&mpc8572 7 GPIO_ACTIVE_HIGH>;
|
||||
default-state = "on";
|
||||
};
|
||||
};
|
||||
|
@ -57,7 +59,7 @@ leds {
|
|||
compatible = "gpio-leds";
|
||||
|
||||
charger-led {
|
||||
gpios = <&gpio1 2 0>;
|
||||
gpios = <&gpio1 2 GPIO_ACTIVE_HIGH>;
|
||||
linux,default-trigger = "max8903-charger-charging";
|
||||
retain-state-suspended;
|
||||
};
|
||||
|
|
|
@ -0,0 +1,43 @@
|
|||
Binding for Qualcomm PM8941 WLED driver
|
||||
|
||||
Required properties:
|
||||
- compatible: should be "qcom,pm8941-wled"
|
||||
- reg: slave address
|
||||
|
||||
Optional properties:
|
||||
- label: The label for this led
|
||||
See Documentation/devicetree/bindings/leds/common.txt
|
||||
- linux,default-trigger: Default trigger assigned to the LED
|
||||
See Documentation/devicetree/bindings/leds/common.txt
|
||||
- qcom,cs-out: bool; enable current sink output
|
||||
- qcom,cabc: bool; enable content adaptive backlight control
|
||||
- qcom,ext-gen: bool; use externally generated modulator signal to dim
|
||||
- qcom,current-limit: mA; per-string current limit; value from 0 to 25
|
||||
default: 20mA
|
||||
- qcom,current-boost-limit: mA; boost current limit; one of:
|
||||
105, 385, 525, 805, 980, 1260, 1400, 1680
|
||||
default: 805mA
|
||||
- qcom,switching-freq: kHz; switching frequency; one of:
|
||||
600, 640, 685, 738, 800, 872, 960, 1066, 1200, 1371,
|
||||
1600, 1920, 2400, 3200, 4800, 9600,
|
||||
default: 1600kHz
|
||||
- qcom,ovp: V; Over-voltage protection limit; one of:
|
||||
27, 29, 32, 35
|
||||
default: 29V
|
||||
- qcom,num-strings: #; number of led strings attached; value from 1 to 3
|
||||
default: 2
|
||||
|
||||
Example:
|
||||
|
||||
pm8941-wled@d800 {
|
||||
compatible = "qcom,pm8941-wled";
|
||||
reg = <0xd800>;
|
||||
label = "backlight";
|
||||
|
||||
qcom,cs-out;
|
||||
qcom,current-limit = <20>;
|
||||
qcom,current-boost-limit = <805>;
|
||||
qcom,switching-freq = <1600>;
|
||||
qcom,ovp = <29>;
|
||||
qcom,num-strings = <2>;
|
||||
};
|
|
@ -0,0 +1,43 @@
|
|||
ARM MHU Mailbox Driver
|
||||
======================
|
||||
|
||||
The ARM's Message-Handling-Unit (MHU) is a mailbox controller that has
|
||||
3 independent channels/links to communicate with remote processor(s).
|
||||
MHU links are hardwired on a platform. A link raises interrupt for any
|
||||
received data. However, there is no specified way of knowing if the sent
|
||||
data has been read by the remote. This driver assumes the sender polls
|
||||
STAT register and the remote clears it after having read the data.
|
||||
The last channel is specified to be a 'Secure' resource, hence can't be
|
||||
used by Linux running NS.
|
||||
|
||||
Mailbox Device Node:
|
||||
====================
|
||||
|
||||
Required properties:
|
||||
--------------------
|
||||
- compatible: Shall be "arm,mhu" & "arm,primecell"
|
||||
- reg: Contains the mailbox register address range (base
|
||||
address and length)
|
||||
- #mbox-cells Shall be 1 - the index of the channel needed.
|
||||
- interrupts: Contains the interrupt information corresponding to
|
||||
each of the 3 links of MHU.
|
||||
|
||||
Example:
|
||||
--------
|
||||
|
||||
mhu: mailbox@2b1f0000 {
|
||||
#mbox-cells = <1>;
|
||||
compatible = "arm,mhu", "arm,primecell";
|
||||
reg = <0 0x2b1f0000 0x1000>;
|
||||
interrupts = <0 36 4>, /* LP-NonSecure */
|
||||
<0 35 4>, /* HP-NonSecure */
|
||||
<0 37 4>; /* Secure */
|
||||
clocks = <&clock 0 2 1>;
|
||||
clock-names = "apb_pclk";
|
||||
};
|
||||
|
||||
mhu_client: scb@2e000000 {
|
||||
compatible = "fujitsu,mb86s70-scb-1.0";
|
||||
reg = <0 0x2e000000 0x4000>;
|
||||
mboxes = <&mhu 1>; /* HP-NonSecure */
|
||||
};
|
|
@ -1,37 +0,0 @@
|
|||
* Interrupt Controller
|
||||
|
||||
Properties:
|
||||
- compatible: "brcm,bcm3384-intc"
|
||||
|
||||
Compatibility with BCM3384 and possibly other BCM33xx/BCM63xx SoCs.
|
||||
|
||||
- reg: Address/length pairs for each mask/status register set. Length must
|
||||
be 8. If multiple register sets are specified, the first set will
|
||||
handle IRQ offsets 0..31, the second set 32..63, and so on.
|
||||
|
||||
- interrupt-controller: This is an interrupt controller.
|
||||
|
||||
- #interrupt-cells: Must be <1>. Just a simple IRQ offset; no level/edge
|
||||
or polarity configuration is possible with this controller.
|
||||
|
||||
- interrupt-parent: This controller is cascaded from a MIPS CPU HW IRQ, or
|
||||
from another INTC.
|
||||
|
||||
- interrupts: The IRQ on the parent controller.
|
||||
|
||||
Example:
|
||||
periph_intc: periph_intc@14e00038 {
|
||||
compatible = "brcm,bcm3384-intc";
|
||||
|
||||
/*
|
||||
* IRQs 0..31: mask reg 0x14e00038, status reg 0x14e0003c
|
||||
* IRQs 32..63: mask reg 0x14e00340, status reg 0x14e00344
|
||||
*/
|
||||
reg = <0x14e00038 0x8 0x14e00340 0x8>;
|
||||
|
||||
interrupt-controller;
|
||||
#interrupt-cells = <1>;
|
||||
|
||||
interrupt-parent = <&cpu_intc>;
|
||||
interrupts = <4>;
|
||||
};
|
|
@ -1,11 +0,0 @@
|
|||
* Broadcom cable/DSL platforms
|
||||
|
||||
SoCs:
|
||||
|
||||
Required properties:
|
||||
- compatible: "brcm,bcm3384", "brcm,bcm33843"
|
||||
|
||||
Boards:
|
||||
|
||||
Required properties:
|
||||
- compatible: "brcm,bcm93384wvg"
|
|
@ -0,0 +1,12 @@
|
|||
* Broadcom cable/DSL/settop platforms
|
||||
|
||||
Required properties:
|
||||
|
||||
- compatible: "brcm,bcm3384", "brcm,bcm33843"
|
||||
"brcm,bcm3384-viper", "brcm,bcm33843-viper"
|
||||
"brcm,bcm6328", "brcm,bcm6368",
|
||||
"brcm,bcm7125", "brcm,bcm7346", "brcm,bcm7358", "brcm,bcm7360",
|
||||
"brcm,bcm7362", "brcm,bcm7420", "brcm,bcm7425"
|
||||
|
||||
The experimental -viper variants are for running Linux on the 3384's
|
||||
BMIPS4355 cable modem CPU instead of the BMIPS5000 application processor.
|
|
@ -0,0 +1,42 @@
|
|||
Imagination Pistachio SoC
|
||||
=========================
|
||||
|
||||
Required properties:
|
||||
--------------------
|
||||
- compatible: Must include "img,pistachio".
|
||||
|
||||
CPU nodes:
|
||||
----------
|
||||
A "cpus" node is required. Required properties:
|
||||
- #address-cells: Must be 1.
|
||||
- #size-cells: Must be 0.
|
||||
A CPU sub-node is also required for at least CPU 0. Since the topology may
|
||||
be probed via CPS, it is not necessary to specify secondary CPUs. Required
|
||||
propertis:
|
||||
- device_type: Must be "cpu".
|
||||
- compatible: Must be "mti,interaptiv".
|
||||
- reg: CPU number.
|
||||
- clocks: Must include the CPU clock. See ../../clock/clock-bindings.txt for
|
||||
details on clock bindings.
|
||||
Example:
|
||||
cpus {
|
||||
#address-cells = <1>;
|
||||
#size-cells = <0>;
|
||||
|
||||
cpu0: cpu@0 {
|
||||
device_type = "cpu";
|
||||
compatible = "mti,interaptiv";
|
||||
reg = <0>;
|
||||
clocks = <&clk_core CLK_MIPS>;
|
||||
};
|
||||
};
|
||||
|
||||
|
||||
Boot protocol:
|
||||
--------------
|
||||
In accordance with the MIPS UHI specification[1], the bootloader must pass the
|
||||
following arguments to the kernel:
|
||||
- $a0: -2.
|
||||
- $a1: KSEG0 address of the flattened device-tree blob.
|
||||
|
||||
[1] http://prplfoundation.org/wiki/MIPS_documentation
|
|
@ -19,6 +19,12 @@ The following properties are common to the Ethernet controllers:
|
|||
- phy: the same as "phy-handle" property, not recommended for new bindings.
|
||||
- phy-device: the same as "phy-handle" property, not recommended for new
|
||||
bindings.
|
||||
- rx-fifo-depth: the size of the controller's receive fifo in bytes. This
|
||||
is used for components that can have configurable receive fifo sizes,
|
||||
and is useful for determining certain configuration settings such as
|
||||
flow control thresholds.
|
||||
- tx-fifo-depth: the size of the controller's transmit fifo in bytes. This
|
||||
is used for components that can have configurable fifo sizes.
|
||||
|
||||
Child nodes of the Ethernet controller are typically the individual PHY devices
|
||||
connected via the MDIO bus (sometimes the MDIO bus controller is separate).
|
||||
|
|
|
@ -45,6 +45,8 @@ Optional properties:
|
|||
If not passed then the system clock will be used and this is fine on some
|
||||
platforms.
|
||||
- snps,burst_len: The AXI burst lenth value of the AXI BUS MODE register.
|
||||
- tx-fifo-depth: See ethernet.txt file in the same directory
|
||||
- rx-fifo-depth: See ethernet.txt file in the same directory
|
||||
|
||||
Examples:
|
||||
|
||||
|
@ -59,6 +61,8 @@ Examples:
|
|||
phy-mode = "gmii";
|
||||
snps,multicast-filter-bins = <256>;
|
||||
snps,perfect-filter-entries = <128>;
|
||||
rx-fifo-depth = <16384>;
|
||||
tx-fifo-depth = <16384>;
|
||||
clocks = <&clock>;
|
||||
clock-names = "stmmaceth";
|
||||
};
|
||||
|
|
|
@ -0,0 +1,17 @@
|
|||
Conexant Digicolor Real Time Clock controller
|
||||
|
||||
This binding currently supports the CX92755 SoC.
|
||||
|
||||
Required properties:
|
||||
- compatible: should be "cnxt,cx92755-rtc"
|
||||
- reg: physical base address of the controller and length of memory mapped
|
||||
region.
|
||||
- interrupts: rtc alarm interrupt
|
||||
|
||||
Example:
|
||||
|
||||
rtc@f0000c30 {
|
||||
compatible = "cnxt,cx92755-rtc";
|
||||
reg = <0xf0000c30 0x18>;
|
||||
interrupts = <25>;
|
||||
};
|
|
@ -7,6 +7,11 @@ Required properties:
|
|||
region.
|
||||
- interrupts: rtc alarm interrupt
|
||||
|
||||
Optional properties:
|
||||
- stmp,crystal-freq: override crystal frequency as determined from fuse bits.
|
||||
Only <32000> and <32768> are possible for the hardware. Use <0> for
|
||||
"no crystal".
|
||||
|
||||
Example:
|
||||
|
||||
rtc@80056000 {
|
||||
|
|
|
@ -0,0 +1,34 @@
|
|||
* STMicroelectronics SAS. ST33ZP24 TPM SoC
|
||||
|
||||
Required properties:
|
||||
- compatible: Should be "st,st33zp24-spi".
|
||||
- spi-max-frequency: Maximum SPI frequency (<= 10000000).
|
||||
|
||||
Optional ST33ZP24 Properties:
|
||||
- interrupt-parent: phandle for the interrupt gpio controller
|
||||
- interrupts: GPIO interrupt to which the chip is connected
|
||||
- lpcpd-gpios: Output GPIO pin used for ST33ZP24 power management D1/D2 state.
|
||||
If set, power must be present when the platform is going into sleep/hibernate mode.
|
||||
|
||||
Optional SoC Specific Properties:
|
||||
- pinctrl-names: Contains only one value - "default".
|
||||
- pintctrl-0: Specifies the pin control groups used for this controller.
|
||||
|
||||
Example (for ARM-based BeagleBoard xM with ST33ZP24 on SPI4):
|
||||
|
||||
&mcspi4 {
|
||||
|
||||
status = "okay";
|
||||
|
||||
st33zp24@0 {
|
||||
|
||||
compatible = "st,st33zp24-spi";
|
||||
|
||||
spi-max-frequency = <10000000>;
|
||||
|
||||
interrupt-parent = <&gpio5>;
|
||||
interrupts = <7 IRQ_TYPE_LEVEL_HIGH>;
|
||||
|
||||
lpcpd-gpios = <&gpio5 15 GPIO_ACTIVE_HIGH>;
|
||||
};
|
||||
};
|
|
@ -1,7 +1,7 @@
|
|||
Ingenic JZ4740 I2S controller
|
||||
|
||||
Required properties:
|
||||
- compatible : "ingenic,jz4740-i2s"
|
||||
- compatible : "ingenic,jz4740-i2s" or "ingenic,jz4780-i2s"
|
||||
- reg : I2S registers location and length
|
||||
- clocks : AIC and I2S PLL clock specifiers.
|
||||
- clock-names: "aic" and "i2s"
|
||||
|
|
|
@ -0,0 +1,22 @@
|
|||
max98925 audio CODEC
|
||||
|
||||
This device supports I2C.
|
||||
|
||||
Required properties:
|
||||
|
||||
- compatible : "maxim,max98925"
|
||||
|
||||
- vmon-slot-no : slot number used to send voltage information
|
||||
|
||||
- imon-slot-no : slot number used to send current information
|
||||
|
||||
- reg : the I2C address of the device for I2C
|
||||
|
||||
Example:
|
||||
|
||||
codec: max98925@1a {
|
||||
compatible = "maxim,max98925";
|
||||
vmon-slot-no = <0>;
|
||||
imon-slot-no = <2>;
|
||||
reg = <0x1a>;
|
||||
};
|
|
@ -18,6 +18,7 @@ Required properties:
|
|||
* Headphones
|
||||
* Speakers
|
||||
* Mic Jack
|
||||
* Int Mic
|
||||
|
||||
- nvidia,i2s-controller : The phandle of the Tegra I2S controller that's
|
||||
connected to the CODEC.
|
||||
|
|
|
@ -0,0 +1,43 @@
|
|||
* Qualcomm Technologies LPASS CPU DAI
|
||||
|
||||
This node models the Qualcomm Technologies Low-Power Audio SubSystem (LPASS).
|
||||
|
||||
Required properties:
|
||||
|
||||
- compatible : "qcom,lpass-cpu"
|
||||
- clocks : Must contain an entry for each entry in clock-names.
|
||||
- clock-names : A list which must include the following entries:
|
||||
* "ahbix-clk"
|
||||
* "mi2s-osr-clk"
|
||||
* "mi2s-bit-clk"
|
||||
- interrupts : Must contain an entry for each entry in
|
||||
interrupt-names.
|
||||
- interrupt-names : A list which must include the following entries:
|
||||
* "lpass-irq-lpaif"
|
||||
- pinctrl-N : One property must exist for each entry in
|
||||
pinctrl-names. See ../pinctrl/pinctrl-bindings.txt
|
||||
for details of the property values.
|
||||
- pinctrl-names : Must contain a "default" entry.
|
||||
- reg : Must contain an address for each entry in reg-names.
|
||||
- reg-names : A list which must include the following entries:
|
||||
* "lpass-lpaif"
|
||||
|
||||
Optional properties:
|
||||
|
||||
- qcom,adsp : Phandle for the audio DSP node
|
||||
|
||||
Example:
|
||||
|
||||
lpass@28100000 {
|
||||
compatible = "qcom,lpass-cpu";
|
||||
clocks = <&lcc AHBIX_CLK>, <&lcc MI2S_OSR_CLK>, <&lcc MI2S_BIT_CLK>;
|
||||
clock-names = "ahbix-clk", "mi2s-osr-clk", "mi2s-bit-clk";
|
||||
interrupts = <0 85 1>;
|
||||
interrupt-names = "lpass-irq-lpaif";
|
||||
pinctrl-names = "default", "idle";
|
||||
pinctrl-0 = <&mi2s_default>;
|
||||
pinctrl-1 = <&mi2s_idle>;
|
||||
reg = <0x28100000 0x10000>;
|
||||
reg-names = "lpass-lpaif";
|
||||
qcom,adsp = <&adsp>;
|
||||
};
|
|
@ -29,9 +29,17 @@ SSI subnode properties:
|
|||
- shared-pin : if shared clock pin
|
||||
- pio-transfer : use PIO transfer mode
|
||||
- no-busif : BUSIF is not ussed when [mem -> SSI] via DMA case
|
||||
- dma : Should contain Audio DMAC entry
|
||||
- dma-names : SSI case "rx" (=playback), "tx" (=capture)
|
||||
SSIU case "rxu" (=playback), "txu" (=capture)
|
||||
|
||||
SRC subnode properties:
|
||||
no properties at this point
|
||||
- dma : Should contain Audio DMAC entry
|
||||
- dma-names : "rx" (=playback), "tx" (=capture)
|
||||
|
||||
DVC subnode properties:
|
||||
- dma : Should contain Audio DMAC entry
|
||||
- dma-names : "tx" (=playback/capture)
|
||||
|
||||
DAI subnode properties:
|
||||
- playback : list of playback modules
|
||||
|
@ -45,56 +53,145 @@ rcar_sound: rcar_sound@ec500000 {
|
|||
reg = <0 0xec500000 0 0x1000>, /* SCU */
|
||||
<0 0xec5a0000 0 0x100>, /* ADG */
|
||||
<0 0xec540000 0 0x1000>, /* SSIU */
|
||||
<0 0xec541000 0 0x1280>; /* SSI */
|
||||
<0 0xec541000 0 0x1280>, /* SSI */
|
||||
<0 0xec740000 0 0x200>; /* Audio DMAC peri peri*/
|
||||
reg-names = "scu", "adg", "ssiu", "ssi", "audmapp";
|
||||
|
||||
clocks = <&mstp10_clks R8A7790_CLK_SSI_ALL>,
|
||||
<&mstp10_clks R8A7790_CLK_SSI9>, <&mstp10_clks R8A7790_CLK_SSI8>,
|
||||
<&mstp10_clks R8A7790_CLK_SSI7>, <&mstp10_clks R8A7790_CLK_SSI6>,
|
||||
<&mstp10_clks R8A7790_CLK_SSI5>, <&mstp10_clks R8A7790_CLK_SSI4>,
|
||||
<&mstp10_clks R8A7790_CLK_SSI3>, <&mstp10_clks R8A7790_CLK_SSI2>,
|
||||
<&mstp10_clks R8A7790_CLK_SSI1>, <&mstp10_clks R8A7790_CLK_SSI0>,
|
||||
<&mstp10_clks R8A7790_CLK_SCU_SRC9>, <&mstp10_clks R8A7790_CLK_SCU_SRC8>,
|
||||
<&mstp10_clks R8A7790_CLK_SCU_SRC7>, <&mstp10_clks R8A7790_CLK_SCU_SRC6>,
|
||||
<&mstp10_clks R8A7790_CLK_SCU_SRC5>, <&mstp10_clks R8A7790_CLK_SCU_SRC4>,
|
||||
<&mstp10_clks R8A7790_CLK_SCU_SRC3>, <&mstp10_clks R8A7790_CLK_SCU_SRC2>,
|
||||
<&mstp10_clks R8A7790_CLK_SCU_SRC1>, <&mstp10_clks R8A7790_CLK_SCU_SRC0>,
|
||||
<&mstp10_clks R8A7790_CLK_SCU_DVC0>, <&mstp10_clks R8A7790_CLK_SCU_DVC1>,
|
||||
<&audio_clk_a>, <&audio_clk_b>, <&audio_clk_c>, <&m2_clk>;
|
||||
clock-names = "ssi-all",
|
||||
"ssi.9", "ssi.8", "ssi.7", "ssi.6", "ssi.5",
|
||||
"ssi.4", "ssi.3", "ssi.2", "ssi.1", "ssi.0",
|
||||
"src.9", "src.8", "src.7", "src.6", "src.5",
|
||||
"src.4", "src.3", "src.2", "src.1", "src.0",
|
||||
"dvc.0", "dvc.1",
|
||||
"clk_a", "clk_b", "clk_c", "clk_i";
|
||||
|
||||
rcar_sound,dvc {
|
||||
dvc0: dvc@0 { };
|
||||
dvc1: dvc@1 { };
|
||||
dvc0: dvc@0 {
|
||||
dmas = <&audma0 0xbc>;
|
||||
dma-names = "tx";
|
||||
};
|
||||
dvc1: dvc@1 {
|
||||
dmas = <&audma0 0xbe>;
|
||||
dma-names = "tx";
|
||||
};
|
||||
};
|
||||
|
||||
rcar_sound,src {
|
||||
src0: src@0 { };
|
||||
src1: src@1 { };
|
||||
src2: src@2 { };
|
||||
src3: src@3 { };
|
||||
src4: src@4 { };
|
||||
src5: src@5 { };
|
||||
src6: src@6 { };
|
||||
src7: src@7 { };
|
||||
src8: src@8 { };
|
||||
src9: src@9 { };
|
||||
src0: src@0 {
|
||||
interrupts = <0 352 IRQ_TYPE_LEVEL_HIGH>;
|
||||
dmas = <&audma0 0x85>, <&audma1 0x9a>;
|
||||
dma-names = "rx", "tx";
|
||||
};
|
||||
src1: src@1 {
|
||||
interrupts = <0 353 IRQ_TYPE_LEVEL_HIGH>;
|
||||
dmas = <&audma0 0x87>, <&audma1 0x9c>;
|
||||
dma-names = "rx", "tx";
|
||||
};
|
||||
src2: src@2 {
|
||||
interrupts = <0 354 IRQ_TYPE_LEVEL_HIGH>;
|
||||
dmas = <&audma0 0x89>, <&audma1 0x9e>;
|
||||
dma-names = "rx", "tx";
|
||||
};
|
||||
src3: src@3 {
|
||||
interrupts = <0 355 IRQ_TYPE_LEVEL_HIGH>;
|
||||
dmas = <&audma0 0x8b>, <&audma1 0xa0>;
|
||||
dma-names = "rx", "tx";
|
||||
};
|
||||
src4: src@4 {
|
||||
interrupts = <0 356 IRQ_TYPE_LEVEL_HIGH>;
|
||||
dmas = <&audma0 0x8d>, <&audma1 0xb0>;
|
||||
dma-names = "rx", "tx";
|
||||
};
|
||||
src5: src@5 {
|
||||
interrupts = <0 357 IRQ_TYPE_LEVEL_HIGH>;
|
||||
dmas = <&audma0 0x8f>, <&audma1 0xb2>;
|
||||
dma-names = "rx", "tx";
|
||||
};
|
||||
src6: src@6 {
|
||||
interrupts = <0 358 IRQ_TYPE_LEVEL_HIGH>;
|
||||
dmas = <&audma0 0x91>, <&audma1 0xb4>;
|
||||
dma-names = "rx", "tx";
|
||||
};
|
||||
src7: src@7 {
|
||||
interrupts = <0 359 IRQ_TYPE_LEVEL_HIGH>;
|
||||
dmas = <&audma0 0x93>, <&audma1 0xb6>;
|
||||
dma-names = "rx", "tx";
|
||||
};
|
||||
src8: src@8 {
|
||||
interrupts = <0 360 IRQ_TYPE_LEVEL_HIGH>;
|
||||
dmas = <&audma0 0x95>, <&audma1 0xb8>;
|
||||
dma-names = "rx", "tx";
|
||||
};
|
||||
src9: src@9 {
|
||||
interrupts = <0 361 IRQ_TYPE_LEVEL_HIGH>;
|
||||
dmas = <&audma0 0x97>, <&audma1 0xba>;
|
||||
dma-names = "rx", "tx";
|
||||
};
|
||||
};
|
||||
|
||||
rcar_sound,ssi {
|
||||
ssi0: ssi@0 {
|
||||
interrupts = <0 370 IRQ_TYPE_LEVEL_HIGH>;
|
||||
dmas = <&audma0 0x01>, <&audma1 0x02>, <&audma0 0x15>, <&audma1 0x16>;
|
||||
dma-names = "rx", "tx", "rxu", "txu";
|
||||
};
|
||||
ssi1: ssi@1 {
|
||||
interrupts = <0 371 IRQ_TYPE_LEVEL_HIGH>;
|
||||
dmas = <&audma0 0x03>, <&audma1 0x04>, <&audma0 0x49>, <&audma1 0x4a>;
|
||||
dma-names = "rx", "tx", "rxu", "txu";
|
||||
};
|
||||
ssi2: ssi@2 {
|
||||
interrupts = <0 372 IRQ_TYPE_LEVEL_HIGH>;
|
||||
dmas = <&audma0 0x05>, <&audma1 0x06>, <&audma0 0x63>, <&audma1 0x64>;
|
||||
dma-names = "rx", "tx", "rxu", "txu";
|
||||
};
|
||||
ssi3: ssi@3 {
|
||||
interrupts = <0 373 IRQ_TYPE_LEVEL_HIGH>;
|
||||
dmas = <&audma0 0x07>, <&audma1 0x08>, <&audma0 0x6f>, <&audma1 0x70>;
|
||||
dma-names = "rx", "tx", "rxu", "txu";
|
||||
};
|
||||
ssi4: ssi@4 {
|
||||
interrupts = <0 374 IRQ_TYPE_LEVEL_HIGH>;
|
||||
dmas = <&audma0 0x09>, <&audma1 0x0a>, <&audma0 0x71>, <&audma1 0x72>;
|
||||
dma-names = "rx", "tx", "rxu", "txu";
|
||||
};
|
||||
ssi5: ssi@5 {
|
||||
interrupts = <0 375 IRQ_TYPE_LEVEL_HIGH>;
|
||||
dmas = <&audma0 0x0b>, <&audma1 0x0c>, <&audma0 0x73>, <&audma1 0x74>;
|
||||
dma-names = "rx", "tx", "rxu", "txu";
|
||||
};
|
||||
ssi6: ssi@6 {
|
||||
interrupts = <0 376 IRQ_TYPE_LEVEL_HIGH>;
|
||||
dmas = <&audma0 0x0d>, <&audma1 0x0e>, <&audma0 0x75>, <&audma1 0x76>;
|
||||
dma-names = "rx", "tx", "rxu", "txu";
|
||||
};
|
||||
ssi7: ssi@7 {
|
||||
interrupts = <0 377 IRQ_TYPE_LEVEL_HIGH>;
|
||||
dmas = <&audma0 0x0f>, <&audma1 0x10>, <&audma0 0x79>, <&audma1 0x7a>;
|
||||
dma-names = "rx", "tx", "rxu", "txu";
|
||||
};
|
||||
ssi8: ssi@8 {
|
||||
interrupts = <0 378 IRQ_TYPE_LEVEL_HIGH>;
|
||||
dmas = <&audma0 0x11>, <&audma1 0x12>, <&audma0 0x7b>, <&audma1 0x7c>;
|
||||
dma-names = "rx", "tx", "rxu", "txu";
|
||||
};
|
||||
ssi9: ssi@9 {
|
||||
interrupts = <0 379 IRQ_TYPE_LEVEL_HIGH>;
|
||||
dmas = <&audma0 0x13>, <&audma1 0x14>, <&audma0 0x7d>, <&audma1 0x7e>;
|
||||
dma-names = "rx", "tx", "rxu", "txu";
|
||||
};
|
||||
};
|
||||
|
||||
|
|
|
@ -0,0 +1,67 @@
|
|||
Renesas Sampling Rate Convert Sound Card:
|
||||
|
||||
Renesas Sampling Rate Convert Sound Card specifies audio DAI connections of SoC <-> codec.
|
||||
|
||||
Required properties:
|
||||
|
||||
- compatible : "renesas,rsrc-card,<board>"
|
||||
Examples with soctypes are:
|
||||
- "renesas,rsrc-card,lager"
|
||||
- "renesas,rsrc-card,koelsch"
|
||||
Optional properties:
|
||||
|
||||
- card_name : User specified audio sound card name, one string
|
||||
property.
|
||||
- cpu : CPU sub-node
|
||||
- codec : CODEC sub-node
|
||||
|
||||
Optional subnode properties:
|
||||
|
||||
- format : CPU/CODEC common audio format.
|
||||
"i2s", "right_j", "left_j" , "dsp_a"
|
||||
"dsp_b", "ac97", "pdm", "msb", "lsb"
|
||||
- frame-master : Indicates dai-link frame master.
|
||||
phandle to a cpu or codec subnode.
|
||||
- bitclock-master : Indicates dai-link bit clock master.
|
||||
phandle to a cpu or codec subnode.
|
||||
- bitclock-inversion : bool property. Add this if the
|
||||
dai-link uses bit clock inversion.
|
||||
- frame-inversion : bool property. Add this if the
|
||||
dai-link uses frame clock inversion.
|
||||
- convert-rate : platform specified sampling rate convert
|
||||
|
||||
Required CPU/CODEC subnodes properties:
|
||||
|
||||
- sound-dai : phandle and port of CPU/CODEC
|
||||
|
||||
Optional CPU/CODEC subnodes properties:
|
||||
|
||||
- clocks / system-clock-frequency : specify subnode's clock if needed.
|
||||
it can be specified via "clocks" if system has
|
||||
clock node (= common clock), or "system-clock-frequency"
|
||||
(if system doens't support common clock)
|
||||
If a clock is specified, it is
|
||||
enabled with clk_prepare_enable()
|
||||
in dai startup() and disabled with
|
||||
clk_disable_unprepare() in dai
|
||||
shutdown().
|
||||
|
||||
Example
|
||||
|
||||
sound {
|
||||
compatible = "renesas,rsrc-card,lager";
|
||||
|
||||
card-name = "rsnd-ak4643";
|
||||
format = "left_j";
|
||||
bitclock-master = <&sndcodec>;
|
||||
frame-master = <&sndcodec>;
|
||||
|
||||
sndcpu: cpu {
|
||||
sound-dai = <&rcar_sound>;
|
||||
};
|
||||
|
||||
sndcodec: codec {
|
||||
sound-dai = <&ak4643>;
|
||||
system-clock-frequency = <11289600>;
|
||||
};
|
||||
};
|
|
@ -0,0 +1,23 @@
|
|||
* Sound complex for Storm boards
|
||||
|
||||
Models a soundcard for Storm boards with the Qualcomm Technologies IPQ806x SOC
|
||||
connected to a MAX98357A DAC via I2S.
|
||||
|
||||
Required properties:
|
||||
|
||||
- compatible : "google,storm-audio"
|
||||
- cpu : Phandle of the CPU DAI
|
||||
- codec : Phandle of the codec DAI
|
||||
|
||||
Optional properties:
|
||||
|
||||
- qcom,model : The user-visible name of this sound card.
|
||||
|
||||
Example:
|
||||
|
||||
sound {
|
||||
compatible = "google,storm-audio";
|
||||
qcom,model = "ipq806x-storm";
|
||||
cpu = <&lpass_cpu>;
|
||||
codec = <&max98357a>;
|
||||
};
|
|
@ -10,6 +10,13 @@ Required properties:
|
|||
- reg : the I2C address of the device for I2C, the chip select
|
||||
number for SPI.
|
||||
|
||||
- PVDD-supply, DVDD-supply : Power supplies for the device, as covered
|
||||
in Documentation/devicetree/bindings/regulator/regulator.txt
|
||||
|
||||
Optional properties:
|
||||
|
||||
- wlf,reset-gpio: A GPIO specifier for the GPIO controlling the reset pin
|
||||
|
||||
Example:
|
||||
|
||||
codec: wm8804@1a {
|
||||
|
|
|
@ -15,6 +15,7 @@ Table of Contents
|
|||
1) Entry point for arch/arm
|
||||
2) Entry point for arch/powerpc
|
||||
3) Entry point for arch/x86
|
||||
4) Entry point for arch/mips/bmips
|
||||
|
||||
II - The DT block format
|
||||
1) Header
|
||||
|
@ -288,6 +289,33 @@ it with special cases.
|
|||
or initrd address. It simply holds information which can not be retrieved
|
||||
otherwise like interrupt routing or a list of devices behind an I2C bus.
|
||||
|
||||
4) Entry point for arch/mips/bmips
|
||||
----------------------------------
|
||||
|
||||
Some bootloaders only support a single entry point, at the start of the
|
||||
kernel image. Other bootloaders will jump to the ELF start address.
|
||||
Both schemes are supported; CONFIG_BOOT_RAW=y and CONFIG_NO_EXCEPT_FILL=y,
|
||||
so the first instruction immediately jumps to kernel_entry().
|
||||
|
||||
Similar to the arch/arm case (b), a DT-aware bootloader is expected to
|
||||
set up the following registers:
|
||||
|
||||
a0 : 0
|
||||
|
||||
a1 : 0xffffffff
|
||||
|
||||
a2 : Physical pointer to the device tree block (defined in chapter
|
||||
II) in RAM. The device tree can be located anywhere in the first
|
||||
512MB of the physical address space (0x00000000 - 0x1fffffff),
|
||||
aligned on a 64 bit boundary.
|
||||
|
||||
Legacy bootloaders do not use this convention, and they do not pass in a
|
||||
DT block. In this case, Linux will look for a builtin DTB, selected via
|
||||
CONFIG_DT_*.
|
||||
|
||||
This convention is defined for 32-bit systems only, as there are not
|
||||
currently any 64-bit BMIPS implementations.
|
||||
|
||||
II - The DT block format
|
||||
========================
|
||||
|
||||
|
|
|
@ -289,6 +289,10 @@ IRQ
|
|||
devm_request_irq()
|
||||
devm_request_threaded_irq()
|
||||
|
||||
LED
|
||||
devm_led_classdev_register()
|
||||
devm_led_classdev_unregister()
|
||||
|
||||
MDIO
|
||||
devm_mdiobus_alloc()
|
||||
devm_mdiobus_alloc_size()
|
||||
|
|
|
@ -196,7 +196,7 @@ prototypes:
|
|||
void (*invalidatepage) (struct page *, unsigned int, unsigned int);
|
||||
int (*releasepage) (struct page *, int);
|
||||
void (*freepage)(struct page *);
|
||||
int (*direct_IO)(int, struct kiocb *, struct iov_iter *iter, loff_t offset);
|
||||
int (*direct_IO)(struct kiocb *, struct iov_iter *iter, loff_t offset);
|
||||
int (*migratepage)(struct address_space *, struct page *, struct page *);
|
||||
int (*launder_page)(struct page *);
|
||||
int (*is_partially_uptodate)(struct page *, unsigned long, unsigned long);
|
||||
|
@ -429,8 +429,6 @@ prototypes:
|
|||
loff_t (*llseek) (struct file *, loff_t, int);
|
||||
ssize_t (*read) (struct file *, char __user *, size_t, loff_t *);
|
||||
ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *);
|
||||
ssize_t (*aio_read) (struct kiocb *, const struct iovec *, unsigned long, loff_t);
|
||||
ssize_t (*aio_write) (struct kiocb *, const struct iovec *, unsigned long, loff_t);
|
||||
ssize_t (*read_iter) (struct kiocb *, struct iov_iter *);
|
||||
ssize_t (*write_iter) (struct kiocb *, struct iov_iter *);
|
||||
int (*iterate) (struct file *, struct dir_context *);
|
||||
|
@ -525,6 +523,7 @@ prototypes:
|
|||
void (*close)(struct vm_area_struct*);
|
||||
int (*fault)(struct vm_area_struct*, struct vm_fault *);
|
||||
int (*page_mkwrite)(struct vm_area_struct *, struct vm_fault *);
|
||||
int (*pfn_mkwrite)(struct vm_area_struct *, struct vm_fault *);
|
||||
int (*access)(struct vm_area_struct *, unsigned long, void*, int, int);
|
||||
|
||||
locking rules:
|
||||
|
@ -534,6 +533,7 @@ close: yes
|
|||
fault: yes can return with page locked
|
||||
map_pages: yes
|
||||
page_mkwrite: yes can return with page locked
|
||||
pfn_mkwrite: yes
|
||||
access: yes
|
||||
|
||||
->fault() is called when a previously not present pte is about
|
||||
|
@ -560,6 +560,12 @@ the page has been truncated, the filesystem should not look up a new page
|
|||
like the ->fault() handler, but simply return with VM_FAULT_NOPAGE, which
|
||||
will cause the VM to retry the fault.
|
||||
|
||||
->pfn_mkwrite() is the same as page_mkwrite but when the pte is
|
||||
VM_PFNMAP or VM_MIXEDMAP with a page-less entry. Expected return is
|
||||
VM_FAULT_NOPAGE. Or one of the VM_FAULT_ERROR types. The default behavior
|
||||
after this call is to make the pte read-write, unless pfn_mkwrite returns
|
||||
an error.
|
||||
|
||||
->access() is called when get_user_pages() fails in
|
||||
access_process_vm(), typically used to debug a process through
|
||||
/proc/pid/mem or ptrace. This function is needed only for
|
||||
|
|
|
@ -471,3 +471,15 @@ in your dentry operations instead.
|
|||
[mandatory]
|
||||
f_dentry is gone; use f_path.dentry, or, better yet, see if you can avoid
|
||||
it entirely.
|
||||
--
|
||||
[mandatory]
|
||||
never call ->read() and ->write() directly; use __vfs_{read,write} or
|
||||
wrappers; instead of checking for ->write or ->read being NULL, look for
|
||||
FMODE_CAN_{WRITE,READ} in file->f_mode.
|
||||
--
|
||||
[mandatory]
|
||||
do _not_ use new_sync_{read,write} for ->read/->write; leave it NULL
|
||||
instead.
|
||||
--
|
||||
[mandatory]
|
||||
->aio_read/->aio_write are gone. Use ->read_iter/->write_iter.
|
||||
|
|
|
@ -200,12 +200,12 @@ contains details information about the process itself. Its fields are
|
|||
explained in Table 1-4.
|
||||
|
||||
(for SMP CONFIG users)
|
||||
For making accounting scalable, RSS related information are handled in
|
||||
asynchronous manner and the vaule may not be very precise. To see a precise
|
||||
For making accounting scalable, RSS related information are handled in an
|
||||
asynchronous manner and the value may not be very precise. To see a precise
|
||||
snapshot of a moment, you can see /proc/<pid>/smaps file and scan page table.
|
||||
It's slow but very precise.
|
||||
|
||||
Table 1-2: Contents of the status files (as of 2.6.30-rc7)
|
||||
Table 1-2: Contents of the status files (as of 3.20.0)
|
||||
..............................................................................
|
||||
Field Content
|
||||
Name filename of the executable
|
||||
|
@ -213,6 +213,7 @@ Table 1-2: Contents of the status files (as of 2.6.30-rc7)
|
|||
in an uninterruptible wait, Z is zombie,
|
||||
T is traced or stopped)
|
||||
Tgid thread group ID
|
||||
Ngid NUMA group ID (0 if none)
|
||||
Pid process id
|
||||
PPid process id of the parent process
|
||||
TracerPid PID of process tracing this process (0 if not)
|
||||
|
@ -220,6 +221,10 @@ Table 1-2: Contents of the status files (as of 2.6.30-rc7)
|
|||
Gid Real, effective, saved set, and file system GIDs
|
||||
FDSize number of file descriptor slots currently allocated
|
||||
Groups supplementary group list
|
||||
NStgid descendant namespace thread group ID hierarchy
|
||||
NSpid descendant namespace process ID hierarchy
|
||||
NSpgid descendant namespace process group ID hierarchy
|
||||
NSsid descendant namespace session ID hierarchy
|
||||
VmPeak peak virtual memory size
|
||||
VmSize total program size
|
||||
VmLck locked memory size
|
||||
|
@ -1704,6 +1709,10 @@ A typical output is
|
|||
flags: 0100002
|
||||
mnt_id: 19
|
||||
|
||||
All locks associated with a file descriptor are shown in its fdinfo too.
|
||||
|
||||
lock: 1: FLOCK ADVISORY WRITE 359 00:13:11691 0 EOF
|
||||
|
||||
The files such as eventfd, fsnotify, signalfd, epoll among the regular pos/flags
|
||||
pair provide additional information particular to the objects they represent.
|
||||
|
||||
|
|
|
@ -590,7 +590,7 @@ struct address_space_operations {
|
|||
void (*invalidatepage) (struct page *, unsigned int, unsigned int);
|
||||
int (*releasepage) (struct page *, int);
|
||||
void (*freepage)(struct page *);
|
||||
ssize_t (*direct_IO)(int, struct kiocb *, struct iov_iter *iter, loff_t offset);
|
||||
ssize_t (*direct_IO)(struct kiocb *, struct iov_iter *iter, loff_t offset);
|
||||
/* migrate the contents of a page to the specified target */
|
||||
int (*migratepage) (struct page *, struct page *);
|
||||
int (*launder_page) (struct page *);
|
||||
|
@ -804,8 +804,6 @@ struct file_operations {
|
|||
loff_t (*llseek) (struct file *, loff_t, int);
|
||||
ssize_t (*read) (struct file *, char __user *, size_t, loff_t *);
|
||||
ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *);
|
||||
ssize_t (*aio_read) (struct kiocb *, const struct iovec *, unsigned long, loff_t);
|
||||
ssize_t (*aio_write) (struct kiocb *, const struct iovec *, unsigned long, loff_t);
|
||||
ssize_t (*read_iter) (struct kiocb *, struct iov_iter *);
|
||||
ssize_t (*write_iter) (struct kiocb *, struct iov_iter *);
|
||||
int (*iterate) (struct file *, struct dir_context *);
|
||||
|
@ -838,14 +836,10 @@ otherwise noted.
|
|||
|
||||
read: called by read(2) and related system calls
|
||||
|
||||
aio_read: vectored, possibly asynchronous read
|
||||
|
||||
read_iter: possibly asynchronous read with iov_iter as destination
|
||||
|
||||
write: called by write(2) and related system calls
|
||||
|
||||
aio_write: vectored, possibly asynchronous write
|
||||
|
||||
write_iter: possibly asynchronous write with iov_iter as source
|
||||
|
||||
iterate: called when the VFS needs to read the directory contents
|
||||
|
|
|
@ -2317,7 +2317,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
|
|||
noexec32=off: disable non-executable mappings
|
||||
read implies executable mappings
|
||||
|
||||
nofpu [SH] Disable hardware FPU at boot time.
|
||||
nofpu [MIPS,SH] Disable hardware FPU at boot time.
|
||||
|
||||
nofxsr [BUGS=X86-32] Disables x86 floating point extended
|
||||
register save and restore. The kernel will only save
|
||||
|
|
|
@ -0,0 +1,22 @@
|
|||
|
||||
Flash LED handling under Linux
|
||||
==============================
|
||||
|
||||
Some LED devices provide two modes - torch and flash. In the LED subsystem
|
||||
those modes are supported by LED class (see Documentation/leds/leds-class.txt)
|
||||
and LED Flash class respectively. The torch mode related features are enabled
|
||||
by default and the flash ones only if a driver declares it by setting
|
||||
LED_DEV_CAP_FLASH flag.
|
||||
|
||||
In order to enable the support for flash LEDs CONFIG_LEDS_CLASS_FLASH symbol
|
||||
must be defined in the kernel config. A LED Flash class driver must be
|
||||
registered in the LED subsystem with led_classdev_flash_register function.
|
||||
|
||||
Following sysfs attributes are exposed for controlling flash LED devices:
|
||||
(see Documentation/ABI/testing/sysfs-class-led-flash)
|
||||
- flash_brightness
|
||||
- max_flash_brightness
|
||||
- flash_timeout
|
||||
- max_flash_timeout
|
||||
- flash_strobe
|
||||
- flash_fault
|
|
@ -0,0 +1,301 @@
|
|||
Wei Yang <weiyang@linux.vnet.ibm.com>
|
||||
Benjamin Herrenschmidt <benh@au1.ibm.com>
|
||||
Bjorn Helgaas <bhelgaas@google.com>
|
||||
26 Aug 2014
|
||||
|
||||
This document describes the requirement from hardware for PCI MMIO resource
|
||||
sizing and assignment on PowerKVM and how generic PCI code handles this
|
||||
requirement. The first two sections describe the concepts of Partitionable
|
||||
Endpoints and the implementation on P8 (IODA2). The next two sections talks
|
||||
about considerations on enabling SRIOV on IODA2.
|
||||
|
||||
1. Introduction to Partitionable Endpoints
|
||||
|
||||
A Partitionable Endpoint (PE) is a way to group the various resources
|
||||
associated with a device or a set of devices to provide isolation between
|
||||
partitions (i.e., filtering of DMA, MSIs etc.) and to provide a mechanism
|
||||
to freeze a device that is causing errors in order to limit the possibility
|
||||
of propagation of bad data.
|
||||
|
||||
There is thus, in HW, a table of PE states that contains a pair of "frozen"
|
||||
state bits (one for MMIO and one for DMA, they get set together but can be
|
||||
cleared independently) for each PE.
|
||||
|
||||
When a PE is frozen, all stores in any direction are dropped and all loads
|
||||
return all 1's value. MSIs are also blocked. There's a bit more state that
|
||||
captures things like the details of the error that caused the freeze etc., but
|
||||
that's not critical.
|
||||
|
||||
The interesting part is how the various PCIe transactions (MMIO, DMA, ...)
|
||||
are matched to their corresponding PEs.
|
||||
|
||||
The following section provides a rough description of what we have on P8
|
||||
(IODA2). Keep in mind that this is all per PHB (PCI host bridge). Each PHB
|
||||
is a completely separate HW entity that replicates the entire logic, so has
|
||||
its own set of PEs, etc.
|
||||
|
||||
2. Implementation of Partitionable Endpoints on P8 (IODA2)
|
||||
|
||||
P8 supports up to 256 Partitionable Endpoints per PHB.
|
||||
|
||||
* Inbound
|
||||
|
||||
For DMA, MSIs and inbound PCIe error messages, we have a table (in
|
||||
memory but accessed in HW by the chip) that provides a direct
|
||||
correspondence between a PCIe RID (bus/dev/fn) with a PE number.
|
||||
We call this the RTT.
|
||||
|
||||
- For DMA we then provide an entire address space for each PE that can
|
||||
contain two "windows", depending on the value of PCI address bit 59.
|
||||
Each window can be configured to be remapped via a "TCE table" (IOMMU
|
||||
translation table), which has various configurable characteristics
|
||||
not described here.
|
||||
|
||||
- For MSIs, we have two windows in the address space (one at the top of
|
||||
the 32-bit space and one much higher) which, via a combination of the
|
||||
address and MSI value, will result in one of the 2048 interrupts per
|
||||
bridge being triggered. There's a PE# in the interrupt controller
|
||||
descriptor table as well which is compared with the PE# obtained from
|
||||
the RTT to "authorize" the device to emit that specific interrupt.
|
||||
|
||||
- Error messages just use the RTT.
|
||||
|
||||
* Outbound. That's where the tricky part is.
|
||||
|
||||
Like other PCI host bridges, the Power8 IODA2 PHB supports "windows"
|
||||
from the CPU address space to the PCI address space. There is one M32
|
||||
window and sixteen M64 windows. They have different characteristics.
|
||||
First what they have in common: they forward a configurable portion of
|
||||
the CPU address space to the PCIe bus and must be naturally aligned
|
||||
power of two in size. The rest is different:
|
||||
|
||||
- The M32 window:
|
||||
|
||||
* Is limited to 4GB in size.
|
||||
|
||||
* Drops the top bits of the address (above the size) and replaces
|
||||
them with a configurable value. This is typically used to generate
|
||||
32-bit PCIe accesses. We configure that window at boot from FW and
|
||||
don't touch it from Linux; it's usually set to forward a 2GB
|
||||
portion of address space from the CPU to PCIe
|
||||
0x8000_0000..0xffff_ffff. (Note: The top 64KB are actually
|
||||
reserved for MSIs but this is not a problem at this point; we just
|
||||
need to ensure Linux doesn't assign anything there, the M32 logic
|
||||
ignores that however and will forward in that space if we try).
|
||||
|
||||
* It is divided into 256 segments of equal size. A table in the chip
|
||||
maps each segment to a PE#. That allows portions of the MMIO space
|
||||
to be assigned to PEs on a segment granularity. For a 2GB window,
|
||||
the segment granularity is 2GB/256 = 8MB.
|
||||
|
||||
Now, this is the "main" window we use in Linux today (excluding
|
||||
SR-IOV). We basically use the trick of forcing the bridge MMIO windows
|
||||
onto a segment alignment/granularity so that the space behind a bridge
|
||||
can be assigned to a PE.
|
||||
|
||||
Ideally we would like to be able to have individual functions in PEs
|
||||
but that would mean using a completely different address allocation
|
||||
scheme where individual function BARs can be "grouped" to fit in one or
|
||||
more segments.
|
||||
|
||||
- The M64 windows:
|
||||
|
||||
* Must be at least 256MB in size.
|
||||
|
||||
* Do not translate addresses (the address on PCIe is the same as the
|
||||
address on the PowerBus). There is a way to also set the top 14
|
||||
bits which are not conveyed by PowerBus but we don't use this.
|
||||
|
||||
* Can be configured to be segmented. When not segmented, we can
|
||||
specify the PE# for the entire window. When segmented, a window
|
||||
has 256 segments; however, there is no table for mapping a segment
|
||||
to a PE#. The segment number *is* the PE#.
|
||||
|
||||
* Support overlaps. If an address is covered by multiple windows,
|
||||
there's a defined ordering for which window applies.
|
||||
|
||||
We have code (fairly new compared to the M32 stuff) that exploits that
|
||||
for large BARs in 64-bit space:
|
||||
|
||||
We configure an M64 window to cover the entire region of address space
|
||||
that has been assigned by FW for the PHB (about 64GB, ignore the space
|
||||
for the M32, it comes out of a different "reserve"). We configure it
|
||||
as segmented.
|
||||
|
||||
Then we do the same thing as with M32, using the bridge alignment
|
||||
trick, to match to those giant segments.
|
||||
|
||||
Since we cannot remap, we have two additional constraints:
|
||||
|
||||
- We do the PE# allocation *after* the 64-bit space has been assigned
|
||||
because the addresses we use directly determine the PE#. We then
|
||||
update the M32 PE# for the devices that use both 32-bit and 64-bit
|
||||
spaces or assign the remaining PE# to 32-bit only devices.
|
||||
|
||||
- We cannot "group" segments in HW, so if a device ends up using more
|
||||
than one segment, we end up with more than one PE#. There is a HW
|
||||
mechanism to make the freeze state cascade to "companion" PEs but
|
||||
that only works for PCIe error messages (typically used so that if
|
||||
you freeze a switch, it freezes all its children). So we do it in
|
||||
SW. We lose a bit of effectiveness of EEH in that case, but that's
|
||||
the best we found. So when any of the PEs freezes, we freeze the
|
||||
other ones for that "domain". We thus introduce the concept of
|
||||
"master PE" which is the one used for DMA, MSIs, etc., and "secondary
|
||||
PEs" that are used for the remaining M64 segments.
|
||||
|
||||
We would like to investigate using additional M64 windows in "single
|
||||
PE" mode to overlay over specific BARs to work around some of that, for
|
||||
example for devices with very large BARs, e.g., GPUs. It would make
|
||||
sense, but we haven't done it yet.
|
||||
|
||||
3. Considerations for SR-IOV on PowerKVM
|
||||
|
||||
* SR-IOV Background
|
||||
|
||||
The PCIe SR-IOV feature allows a single Physical Function (PF) to
|
||||
support several Virtual Functions (VFs). Registers in the PF's SR-IOV
|
||||
Capability control the number of VFs and whether they are enabled.
|
||||
|
||||
When VFs are enabled, they appear in Configuration Space like normal
|
||||
PCI devices, but the BARs in VF config space headers are unusual. For
|
||||
a non-VF device, software uses BARs in the config space header to
|
||||
discover the BAR sizes and assign addresses for them. For VF devices,
|
||||
software uses VF BAR registers in the *PF* SR-IOV Capability to
|
||||
discover sizes and assign addresses. The BARs in the VF's config space
|
||||
header are read-only zeros.
|
||||
|
||||
When a VF BAR in the PF SR-IOV Capability is programmed, it sets the
|
||||
base address for all the corresponding VF(n) BARs. For example, if the
|
||||
PF SR-IOV Capability is programmed to enable eight VFs, and it has a
|
||||
1MB VF BAR0, the address in that VF BAR sets the base of an 8MB region.
|
||||
This region is divided into eight contiguous 1MB regions, each of which
|
||||
is a BAR0 for one of the VFs. Note that even though the VF BAR
|
||||
describes an 8MB region, the alignment requirement is for a single VF,
|
||||
i.e., 1MB in this example.
|
||||
|
||||
There are several strategies for isolating VFs in PEs:
|
||||
|
||||
- M32 window: There's one M32 window, and it is split into 256
|
||||
equally-sized segments. The finest granularity possible is a 256MB
|
||||
window with 1MB segments. VF BARs that are 1MB or larger could be
|
||||
mapped to separate PEs in this window. Each segment can be
|
||||
individually mapped to a PE via the lookup table, so this is quite
|
||||
flexible, but it works best when all the VF BARs are the same size. If
|
||||
they are different sizes, the entire window has to be small enough that
|
||||
the segment size matches the smallest VF BAR, which means larger VF
|
||||
BARs span several segments.
|
||||
|
||||
- Non-segmented M64 window: A non-segmented M64 window is mapped entirely
|
||||
to a single PE, so it could only isolate one VF.
|
||||
|
||||
- Single segmented M64 windows: A segmented M64 window could be used just
|
||||
like the M32 window, but the segments can't be individually mapped to
|
||||
PEs (the segment number is the PE#), so there isn't as much
|
||||
flexibility. A VF with multiple BARs would have to be in a "domain" of
|
||||
multiple PEs, which is not as well isolated as a single PE.
|
||||
|
||||
- Multiple segmented M64 windows: As usual, each window is split into 256
|
||||
equally-sized segments, and the segment number is the PE#. But if we
|
||||
use several M64 windows, they can be set to different base addresses
|
||||
and different segment sizes. If we have VFs that each have a 1MB BAR
|
||||
and a 32MB BAR, we could use one M64 window to assign 1MB segments and
|
||||
another M64 window to assign 32MB segments.
|
||||
|
||||
Finally, the plan to use M64 windows for SR-IOV, which will be described
|
||||
more in the next two sections. For a given VF BAR, we need to
|
||||
effectively reserve the entire 256 segments (256 * VF BAR size) and
|
||||
position the VF BAR to start at the beginning of a free range of
|
||||
segments/PEs inside that M64 window.
|
||||
|
||||
The goal is of course to be able to give a separate PE for each VF.
|
||||
|
||||
The IODA2 platform has 16 M64 windows, which are used to map MMIO
|
||||
range to PE#. Each M64 window defines one MMIO range and this range is
|
||||
divided into 256 segments, with each segment corresponding to one PE.
|
||||
|
||||
We decide to leverage this M64 window to map VFs to individual PEs, since
|
||||
SR-IOV VF BARs are all the same size.
|
||||
|
||||
But doing so introduces another problem: total_VFs is usually smaller
|
||||
than the number of M64 window segments, so if we map one VF BAR directly
|
||||
to one M64 window, some part of the M64 window will map to another
|
||||
device's MMIO range.
|
||||
|
||||
IODA supports 256 PEs, so segmented windows contain 256 segments, so if
|
||||
total_VFs is less than 256, we have the situation in Figure 1.0, where
|
||||
segments [total_VFs, 255] of the M64 window may map to some MMIO range on
|
||||
other devices:
|
||||
|
||||
0 1 total_VFs - 1
|
||||
+------+------+- -+------+------+
|
||||
| | | ... | | |
|
||||
+------+------+- -+------+------+
|
||||
|
||||
VF(n) BAR space
|
||||
|
||||
0 1 total_VFs - 1 255
|
||||
+------+------+- -+------+------+- -+------+------+
|
||||
| | | ... | | | ... | | |
|
||||
+------+------+- -+------+------+- -+------+------+
|
||||
|
||||
M64 window
|
||||
|
||||
Figure 1.0 Direct map VF(n) BAR space
|
||||
|
||||
Our current solution is to allocate 256 segments even if the VF(n) BAR
|
||||
space doesn't need that much, as shown in Figure 1.1:
|
||||
|
||||
0 1 total_VFs - 1 255
|
||||
+------+------+- -+------+------+- -+------+------+
|
||||
| | | ... | | | ... | | |
|
||||
+------+------+- -+------+------+- -+------+------+
|
||||
|
||||
VF(n) BAR space + extra
|
||||
|
||||
0 1 total_VFs - 1 255
|
||||
+------+------+- -+------+------+- -+------+------+
|
||||
| | | ... | | | ... | | |
|
||||
+------+------+- -+------+------+- -+------+------+
|
||||
|
||||
M64 window
|
||||
|
||||
Figure 1.1 Map VF(n) BAR space + extra
|
||||
|
||||
Allocating the extra space ensures that the entire M64 window will be
|
||||
assigned to this one SR-IOV device and none of the space will be
|
||||
available for other devices. Note that this only expands the space
|
||||
reserved in software; there are still only total_VFs VFs, and they only
|
||||
respond to segments [0, total_VFs - 1]. There's nothing in hardware that
|
||||
responds to segments [total_VFs, 255].
|
||||
|
||||
4. Implications for the Generic PCI Code
|
||||
|
||||
The PCIe SR-IOV spec requires that the base of the VF(n) BAR space be
|
||||
aligned to the size of an individual VF BAR.
|
||||
|
||||
In IODA2, the MMIO address determines the PE#. If the address is in an M32
|
||||
window, we can set the PE# by updating the table that translates segments
|
||||
to PE#s. Similarly, if the address is in an unsegmented M64 window, we can
|
||||
set the PE# for the window. But if it's in a segmented M64 window, the
|
||||
segment number is the PE#.
|
||||
|
||||
Therefore, the only way to control the PE# for a VF is to change the base
|
||||
of the VF(n) BAR space in the VF BAR. If the PCI core allocates the exact
|
||||
amount of space required for the VF(n) BAR space, the VF BAR value is fixed
|
||||
and cannot be changed.
|
||||
|
||||
On the other hand, if the PCI core allocates additional space, the VF BAR
|
||||
value can be changed as long as the entire VF(n) BAR space remains inside
|
||||
the space allocated by the core.
|
||||
|
||||
Ideally the segment size will be the same as an individual VF BAR size.
|
||||
Then each VF will be in its own PE. The VF BARs (and therefore the PE#s)
|
||||
are contiguous. If VF0 is in PE(x), then VF(n) is in PE(x+n). If we
|
||||
allocate 256 segments, there are (256 - numVFs) choices for the PE# of VF0.
|
||||
|
||||
If the segment size is smaller than the VF BAR size, it will take several
|
||||
segments to cover a VF BAR, and a VF will be in several PEs. This is
|
||||
possible, but the isolation isn't as good, and it reduces the number of PE#
|
||||
choices because instead of consuming only numVFs segments, the VF(n) BAR
|
||||
space will consume (numVFs * n) segments. That means there aren't as many
|
||||
available segments for adjusting base of the VF(n) BAR space.
|
|
@ -74,22 +74,23 @@ Causes of transaction aborts
|
|||
Syscalls
|
||||
========
|
||||
|
||||
Performing syscalls from within transaction is not recommended, and can lead
|
||||
to unpredictable results.
|
||||
Syscalls made from within an active transaction will not be performed and the
|
||||
transaction will be doomed by the kernel with the failure code TM_CAUSE_SYSCALL
|
||||
| TM_CAUSE_PERSISTENT.
|
||||
|
||||
Syscalls do not by design abort transactions, but beware: The kernel code will
|
||||
not be running in transactional state. The effect of syscalls will always
|
||||
remain visible, but depending on the call they may abort your transaction as a
|
||||
side-effect, read soon-to-be-aborted transactional data that should not remain
|
||||
invisible, etc. If you constantly retry a transaction that constantly aborts
|
||||
itself by calling a syscall, you'll have a livelock & make no progress.
|
||||
Syscalls made from within a suspended transaction are performed as normal and
|
||||
the transaction is not explicitly doomed by the kernel. However, what the
|
||||
kernel does to perform the syscall may result in the transaction being doomed
|
||||
by the hardware. The syscall is performed in suspended mode so any side
|
||||
effects will be persistent, independent of transaction success or failure. No
|
||||
guarantees are provided by the kernel about which syscalls will affect
|
||||
transaction success.
|
||||
|
||||
Simple syscalls (e.g. sigprocmask()) "could" be OK. Even things like write()
|
||||
from, say, printf() should be OK as long as the kernel does not access any
|
||||
memory that was accessed transactionally.
|
||||
|
||||
Consider any syscalls that happen to work as debug-only -- not recommended for
|
||||
production use. Best to queue them up till after the transaction is over.
|
||||
Care must be taken when relying on syscalls to abort during active transactions
|
||||
if the calls are made via a library. Libraries may cache values (which may
|
||||
give the appearance of success) or perform operations that cause transaction
|
||||
failure before entering the kernel (which may produce different failure codes).
|
||||
Examples are glibc's getpid() and lazy symbol resolution.
|
||||
|
||||
|
||||
Signals
|
||||
|
@ -174,10 +175,9 @@ These are defined in <asm/reg.h>, and distinguish different reasons why the
|
|||
kernel aborted a transaction:
|
||||
|
||||
TM_CAUSE_RESCHED Thread was rescheduled.
|
||||
TM_CAUSE_TLBI Software TLB invalide.
|
||||
TM_CAUSE_TLBI Software TLB invalid.
|
||||
TM_CAUSE_FAC_UNAV FP/VEC/VSX unavailable trap.
|
||||
TM_CAUSE_SYSCALL Currently unused; future syscalls that must abort
|
||||
transactions for consistency will use this.
|
||||
TM_CAUSE_SYSCALL Syscall from active transaction.
|
||||
TM_CAUSE_SIGNAL Signal delivered.
|
||||
TM_CAUSE_MISC Currently unused.
|
||||
TM_CAUSE_ALIGNMENT Alignment fault.
|
||||
|
@ -185,7 +185,7 @@ kernel aborted a transaction:
|
|||
|
||||
These can be checked by the user program's abort handler as TEXASR[0:7]. If
|
||||
bit 7 is set, it indicates that the error is consider persistent. For example
|
||||
a TM_CAUSE_ALIGNMENT will be persistent while a TM_CAUSE_RESCHED will not.q
|
||||
a TM_CAUSE_ALIGNMENT will be persistent while a TM_CAUSE_RESCHED will not.
|
||||
|
||||
GDB
|
||||
===
|
||||
|
|
|
@ -8,6 +8,21 @@ If variable is of Type, use printk format specifier:
|
|||
unsigned long long %llu or %llx
|
||||
size_t %zu or %zx
|
||||
ssize_t %zd or %zx
|
||||
s32 %d or %x
|
||||
u32 %u or %x
|
||||
s64 %lld or %llx
|
||||
u64 %llu or %llx
|
||||
|
||||
If <type> is dependent on a config option for its size (e.g., sector_t,
|
||||
blkcnt_t) or is architecture-dependent for its size (e.g., tcflag_t), use a
|
||||
format specifier of its largest possible type and explicitly cast to it.
|
||||
Example:
|
||||
|
||||
printk("test: sector number/total blocks: %llu/%llu\n",
|
||||
(unsigned long long)sector, (unsigned long long)blockcount);
|
||||
|
||||
Reminder: sizeof() result is of type size_t.
|
||||
|
||||
|
||||
Raw pointer value SHOULD be printed with %p. The kernel supports
|
||||
the following extended format specifiers for pointer types:
|
||||
|
@ -54,6 +69,7 @@ Struct Resources:
|
|||
|
||||
For printing struct resources. The 'R' and 'r' specifiers result in a
|
||||
printed resource with ('R') or without ('r') a decoded flags member.
|
||||
Passed by reference.
|
||||
|
||||
Physical addresses types phys_addr_t:
|
||||
|
||||
|
@ -132,6 +148,8 @@ MAC/FDDI addresses:
|
|||
specifier to use reversed byte order suitable for visual interpretation
|
||||
of Bluetooth addresses which are in the little endian order.
|
||||
|
||||
Passed by reference.
|
||||
|
||||
IPv4 addresses:
|
||||
|
||||
%pI4 1.2.3.4
|
||||
|
@ -146,6 +164,8 @@ IPv4 addresses:
|
|||
host, network, big or little endian order addresses respectively. Where
|
||||
no specifier is provided the default network/big endian order is used.
|
||||
|
||||
Passed by reference.
|
||||
|
||||
IPv6 addresses:
|
||||
|
||||
%pI6 0001:0002:0003:0004:0005:0006:0007:0008
|
||||
|
@ -160,6 +180,8 @@ IPv6 addresses:
|
|||
print a compressed IPv6 address as described by
|
||||
http://tools.ietf.org/html/rfc5952
|
||||
|
||||
Passed by reference.
|
||||
|
||||
IPv4/IPv6 addresses (generic, with port, flowinfo, scope):
|
||||
|
||||
%pIS 1.2.3.4 or 0001:0002:0003:0004:0005:0006:0007:0008
|
||||
|
@ -186,6 +208,8 @@ IPv4/IPv6 addresses (generic, with port, flowinfo, scope):
|
|||
specifiers can be used as well and are ignored in case of an IPv6
|
||||
address.
|
||||
|
||||
Passed by reference.
|
||||
|
||||
Further examples:
|
||||
|
||||
%pISfc 1.2.3.4 or [1:2:3:4:5:6:7:8]/123456789
|
||||
|
@ -207,6 +231,8 @@ UUID/GUID addresses:
|
|||
Where no additional specifiers are used the default little endian
|
||||
order with lower case hex characters will be printed.
|
||||
|
||||
Passed by reference.
|
||||
|
||||
dentry names:
|
||||
%pd{,2,3,4}
|
||||
%pD{,2,3,4}
|
||||
|
@ -216,6 +242,8 @@ dentry names:
|
|||
equivalent of %s dentry->d_name.name we used to use, %pd<n> prints
|
||||
n last components. %pD does the same thing for struct file.
|
||||
|
||||
Passed by reference.
|
||||
|
||||
struct va_format:
|
||||
|
||||
%pV
|
||||
|
@ -231,23 +259,20 @@ struct va_format:
|
|||
Do not use this feature without some mechanism to verify the
|
||||
correctness of the format string and va_list arguments.
|
||||
|
||||
u64 SHOULD be printed with %llu/%llx:
|
||||
Passed by reference.
|
||||
|
||||
printk("%llu", u64_var);
|
||||
struct clk:
|
||||
|
||||
s64 SHOULD be printed with %lld/%llx:
|
||||
%pC pll1
|
||||
%pCn pll1
|
||||
%pCr 1560000000
|
||||
|
||||
printk("%lld", s64_var);
|
||||
For printing struct clk structures. '%pC' and '%pCn' print the name
|
||||
(Common Clock Framework) or address (legacy clock framework) of the
|
||||
structure; '%pCr' prints the current clock rate.
|
||||
|
||||
If <type> is dependent on a config option for its size (e.g., sector_t,
|
||||
blkcnt_t) or is architecture-dependent for its size (e.g., tcflag_t), use a
|
||||
format specifier of its largest possible type and explicitly cast to it.
|
||||
Example:
|
||||
Passed by reference.
|
||||
|
||||
printk("test: sector number/total blocks: %llu/%llu\n",
|
||||
(unsigned long long)sector, (unsigned long long)blockcount);
|
||||
|
||||
Reminder: sizeof() result is of type size_t.
|
||||
|
||||
Thank you for your cooperation and attention.
|
||||
|
||||
|
|
|
@ -33,11 +33,18 @@ The current git repository for Smack user space is:
|
|||
git://github.com/smack-team/smack.git
|
||||
|
||||
This should make and install on most modern distributions.
|
||||
There are three commands included in smackutil:
|
||||
There are five commands included in smackutil:
|
||||
|
||||
smackload - properly formats data for writing to /smack/load
|
||||
smackcipso - properly formats data for writing to /smack/cipso
|
||||
chsmack - display or set Smack extended attribute values
|
||||
smackctl - load the Smack access rules
|
||||
smackaccess - report if a process with one label has access
|
||||
to an object with another
|
||||
|
||||
These two commands are obsolete with the introduction of
|
||||
the smackfs/load2 and smackfs/cipso2 interfaces.
|
||||
|
||||
smackload - properly formats data for writing to smackfs/load
|
||||
smackcipso - properly formats data for writing to smackfs/cipso
|
||||
|
||||
In keeping with the intent of Smack, configuration data is
|
||||
minimal and not strictly required. The most important
|
||||
|
@ -47,9 +54,9 @@ of this, but it can be manually as well.
|
|||
|
||||
Add this line to /etc/fstab:
|
||||
|
||||
smackfs /smack smackfs smackfsdef=* 0 0
|
||||
smackfs /sys/fs/smackfs smackfs defaults 0 0
|
||||
|
||||
and create the /smack directory for mounting.
|
||||
The /sys/fs/smackfs directory is created by the kernel.
|
||||
|
||||
Smack uses extended attributes (xattrs) to store labels on filesystem
|
||||
objects. The attributes are stored in the extended attribute security
|
||||
|
@ -92,13 +99,13 @@ There are multiple ways to set a Smack label on a file:
|
|||
# attr -S -s SMACK64 -V "value" path
|
||||
# chsmack -a value path
|
||||
|
||||
A process can see the smack label it is running with by
|
||||
A process can see the Smack label it is running with by
|
||||
reading /proc/self/attr/current. A process with CAP_MAC_ADMIN
|
||||
can set the process smack by writing there.
|
||||
can set the process Smack by writing there.
|
||||
|
||||
Most Smack configuration is accomplished by writing to files
|
||||
in the smackfs filesystem. This pseudo-filesystem is usually
|
||||
mounted on /smack.
|
||||
in the smackfs filesystem. This pseudo-filesystem is mounted
|
||||
on /sys/fs/smackfs.
|
||||
|
||||
access
|
||||
This interface reports whether a subject with the specified
|
||||
|
@ -206,23 +213,30 @@ onlycap
|
|||
file or cleared by writing "-" to the file.
|
||||
ptrace
|
||||
This is used to define the current ptrace policy
|
||||
0 - default: this is the policy that relies on smack access rules.
|
||||
0 - default: this is the policy that relies on Smack access rules.
|
||||
For the PTRACE_READ a subject needs to have a read access on
|
||||
object. For the PTRACE_ATTACH a read-write access is required.
|
||||
1 - exact: this is the policy that limits PTRACE_ATTACH. Attach is
|
||||
only allowed when subject's and object's labels are equal.
|
||||
PTRACE_READ is not affected. Can be overriden with CAP_SYS_PTRACE.
|
||||
PTRACE_READ is not affected. Can be overridden with CAP_SYS_PTRACE.
|
||||
2 - draconian: this policy behaves like the 'exact' above with an
|
||||
exception that it can't be overriden with CAP_SYS_PTRACE.
|
||||
exception that it can't be overridden with CAP_SYS_PTRACE.
|
||||
revoke-subject
|
||||
Writing a Smack label here sets the access to '-' for all access
|
||||
rules with that subject label.
|
||||
unconfined
|
||||
If the kernel is configured with CONFIG_SECURITY_SMACK_BRINGUP
|
||||
a process with CAP_MAC_ADMIN can write a label into this interface.
|
||||
Thereafter, accesses that involve that label will be logged and
|
||||
the access permitted if it wouldn't be otherwise. Note that this
|
||||
is dangerous and can ruin the proper labeling of your system.
|
||||
It should never be used in production.
|
||||
|
||||
You can add access rules in /etc/smack/accesses. They take the form:
|
||||
|
||||
subjectlabel objectlabel access
|
||||
|
||||
access is a combination of the letters rwxa which specify the
|
||||
access is a combination of the letters rwxatb which specify the
|
||||
kind of access permitted a subject with subjectlabel on an
|
||||
object with objectlabel. If there is no rule no access is allowed.
|
||||
|
||||
|
@ -318,8 +332,9 @@ each of the subject and the object.
|
|||
|
||||
Labels
|
||||
|
||||
Smack labels are ASCII character strings, one to twenty-three characters in
|
||||
length. Single character labels using special characters, that being anything
|
||||
Smack labels are ASCII character strings. They can be up to 255 characters
|
||||
long, but keeping them to twenty-three characters is recommended.
|
||||
Single character labels using special characters, that being anything
|
||||
other than a letter or digit, are reserved for use by the Smack development
|
||||
team. Smack labels are unstructured, case sensitive, and the only operation
|
||||
ever performed on them is comparison for equality. Smack labels cannot
|
||||
|
@ -335,10 +350,9 @@ There are some predefined labels:
|
|||
? Pronounced "huh", a single question mark character.
|
||||
@ Pronounced "web", a single at sign character.
|
||||
|
||||
Every task on a Smack system is assigned a label. System tasks, such as
|
||||
init(8) and systems daemons, are run with the floor ("_") label. User tasks
|
||||
are assigned labels according to the specification found in the
|
||||
/etc/smack/user configuration file.
|
||||
Every task on a Smack system is assigned a label. The Smack label
|
||||
of a process will usually be assigned by the system initialization
|
||||
mechanism.
|
||||
|
||||
Access Rules
|
||||
|
||||
|
@ -393,6 +407,7 @@ describe access modes:
|
|||
w: indicates that write access should be granted.
|
||||
x: indicates that execute access should be granted.
|
||||
t: indicates that the rule requests transmutation.
|
||||
b: indicates that the rule should be reported for bring-up.
|
||||
|
||||
Uppercase values for the specification letters are allowed as well.
|
||||
Access mode specifications can be in any order. Examples of acceptable rules
|
||||
|
@ -402,6 +417,7 @@ are:
|
|||
Secret Unclass R
|
||||
Manager Game x
|
||||
User HR w
|
||||
Snap Crackle rwxatb
|
||||
New Old rRrRr
|
||||
Closed Off -
|
||||
|
||||
|
@ -413,7 +429,7 @@ Examples of unacceptable rules are:
|
|||
|
||||
Spaces are not allowed in labels. Since a subject always has access to files
|
||||
with the same label specifying a rule for that case is pointless. Only
|
||||
valid letters (rwxatRWXAT) and the dash ('-') character are allowed in
|
||||
valid letters (rwxatbRWXATB) and the dash ('-') character are allowed in
|
||||
access specifications. The dash is a placeholder, so "a-r" is the same
|
||||
as "ar". A lone dash is used to specify that no access should be allowed.
|
||||
|
||||
|
@ -462,16 +478,11 @@ receiver. The receiver is not required to have read access to the sender.
|
|||
Setting Access Rules
|
||||
|
||||
The configuration file /etc/smack/accesses contains the rules to be set at
|
||||
system startup. The contents are written to the special file /smack/load.
|
||||
Rules can be written to /smack/load at any time and take effect immediately.
|
||||
For any pair of subject and object labels there can be only one rule, with the
|
||||
most recently specified overriding any earlier specification.
|
||||
|
||||
The program smackload is provided to ensure data is formatted
|
||||
properly when written to /smack/load. This program reads lines
|
||||
of the form
|
||||
|
||||
subjectlabel objectlabel mode.
|
||||
system startup. The contents are written to the special file
|
||||
/sys/fs/smackfs/load2. Rules can be added at any time and take effect
|
||||
immediately. For any pair of subject and object labels there can be only
|
||||
one rule, with the most recently specified overriding any earlier
|
||||
specification.
|
||||
|
||||
Task Attribute
|
||||
|
||||
|
@ -488,7 +499,10 @@ only be changed by a process with privilege.
|
|||
|
||||
Privilege
|
||||
|
||||
A process with CAP_MAC_OVERRIDE is privileged.
|
||||
A process with CAP_MAC_OVERRIDE or CAP_MAC_ADMIN is privileged.
|
||||
CAP_MAC_OVERRIDE allows the process access to objects it would
|
||||
be denied otherwise. CAP_MAC_ADMIN allows a process to change
|
||||
Smack data, including rules and attributes.
|
||||
|
||||
Smack Networking
|
||||
|
||||
|
@ -510,14 +524,14 @@ intervention. Unlabeled packets that come into the system will be given the
|
|||
ambient label.
|
||||
|
||||
Smack requires configuration in the case where packets from a system that is
|
||||
not smack that speaks CIPSO may be encountered. Usually this will be a Trusted
|
||||
not Smack that speaks CIPSO may be encountered. Usually this will be a Trusted
|
||||
Solaris system, but there are other, less widely deployed systems out there.
|
||||
CIPSO provides 3 important values, a Domain Of Interpretation (DOI), a level,
|
||||
and a category set with each packet. The DOI is intended to identify a group
|
||||
of systems that use compatible labeling schemes, and the DOI specified on the
|
||||
smack system must match that of the remote system or packets will be
|
||||
discarded. The DOI is 3 by default. The value can be read from /smack/doi and
|
||||
can be changed by writing to /smack/doi.
|
||||
Smack system must match that of the remote system or packets will be
|
||||
discarded. The DOI is 3 by default. The value can be read from
|
||||
/sys/fs/smackfs/doi and can be changed by writing to /sys/fs/smackfs/doi.
|
||||
|
||||
The label and category set are mapped to a Smack label as defined in
|
||||
/etc/smack/cipso.
|
||||
|
@ -539,15 +553,13 @@ The ":" and "," characters are permitted in a Smack label but have no special
|
|||
meaning.
|
||||
|
||||
The mapping of Smack labels to CIPSO values is defined by writing to
|
||||
/smack/cipso. Again, the format of data written to this special file
|
||||
is highly restrictive, so the program smackcipso is provided to
|
||||
ensure the writes are done properly. This program takes mappings
|
||||
on the standard input and sends them to /smack/cipso properly.
|
||||
/sys/fs/smackfs/cipso2.
|
||||
|
||||
In addition to explicit mappings Smack supports direct CIPSO mappings. One
|
||||
CIPSO level is used to indicate that the category set passed in the packet is
|
||||
in fact an encoding of the Smack label. The level used is 250 by default. The
|
||||
value can be read from /smack/direct and changed by writing to /smack/direct.
|
||||
value can be read from /sys/fs/smackfs/direct and changed by writing to
|
||||
/sys/fs/smackfs/direct.
|
||||
|
||||
Socket Attributes
|
||||
|
||||
|
@ -565,8 +577,8 @@ sockets.
|
|||
Smack Netlabel Exceptions
|
||||
|
||||
You will often find that your labeled application has to talk to the outside,
|
||||
unlabeled world. To do this there's a special file /smack/netlabel where you can
|
||||
add some exceptions in the form of :
|
||||
unlabeled world. To do this there's a special file /sys/fs/smackfs/netlabel
|
||||
where you can add some exceptions in the form of :
|
||||
@IP1 LABEL1 or
|
||||
@IP2/MASK LABEL2
|
||||
|
||||
|
@ -574,22 +586,22 @@ It means that your application will have unlabeled access to @IP1 if it has
|
|||
write access on LABEL1, and access to the subnet @IP2/MASK if it has write
|
||||
access on LABEL2.
|
||||
|
||||
Entries in the /smack/netlabel file are matched by longest mask first, like in
|
||||
classless IPv4 routing.
|
||||
Entries in the /sys/fs/smackfs/netlabel file are matched by longest mask
|
||||
first, like in classless IPv4 routing.
|
||||
|
||||
A special label '@' and an option '-CIPSO' can be used there :
|
||||
@ means Internet, any application with any label has access to it
|
||||
-CIPSO means standard CIPSO networking
|
||||
|
||||
If you don't know what CIPSO is and don't plan to use it, you can just do :
|
||||
echo 127.0.0.1 -CIPSO > /smack/netlabel
|
||||
echo 0.0.0.0/0 @ > /smack/netlabel
|
||||
echo 127.0.0.1 -CIPSO > /sys/fs/smackfs/netlabel
|
||||
echo 0.0.0.0/0 @ > /sys/fs/smackfs/netlabel
|
||||
|
||||
If you use CIPSO on your 192.168.0.0/16 local network and need also unlabeled
|
||||
Internet access, you can have :
|
||||
echo 127.0.0.1 -CIPSO > /smack/netlabel
|
||||
echo 192.168.0.0/16 -CIPSO > /smack/netlabel
|
||||
echo 0.0.0.0/0 @ > /smack/netlabel
|
||||
echo 127.0.0.1 -CIPSO > /sys/fs/smackfs/netlabel
|
||||
echo 192.168.0.0/16 -CIPSO > /sys/fs/smackfs/netlabel
|
||||
echo 0.0.0.0/0 @ > /sys/fs/smackfs/netlabel
|
||||
|
||||
|
||||
Writing Applications for Smack
|
||||
|
@ -676,7 +688,7 @@ Smack auditing
|
|||
If you want Smack auditing of security events, you need to set CONFIG_AUDIT
|
||||
in your kernel configuration.
|
||||
By default, all denied events will be audited. You can change this behavior by
|
||||
writing a single character to the /smack/logging file :
|
||||
writing a single character to the /sys/fs/smackfs/logging file :
|
||||
0 : no logging
|
||||
1 : log denied (default)
|
||||
2 : log accepted
|
||||
|
@ -686,3 +698,20 @@ Events are logged as 'key=value' pairs, for each event you at least will get
|
|||
the subject, the object, the rights requested, the action, the kernel function
|
||||
that triggered the event, plus other pairs depending on the type of event
|
||||
audited.
|
||||
|
||||
Bringup Mode
|
||||
|
||||
Bringup mode provides logging features that can make application
|
||||
configuration and system bringup easier. Configure the kernel with
|
||||
CONFIG_SECURITY_SMACK_BRINGUP to enable these features. When bringup
|
||||
mode is enabled accesses that succeed due to rules marked with the "b"
|
||||
access mode will logged. When a new label is introduced for processes
|
||||
rules can be added aggressively, marked with the "b". The logging allows
|
||||
tracking of which rules actual get used for that label.
|
||||
|
||||
Another feature of bringup mode is the "unconfined" option. Writing
|
||||
a label to /sys/fs/smackfs/unconfined makes subjects with that label
|
||||
able to access any object, and objects with that label accessible to
|
||||
all subjects. Any access that is granted because a label is unconfined
|
||||
is logged. This feature is dangerous, as files and directories may
|
||||
be created in places they couldn't if the policy were being enforced.
|
||||
|
|
|
@ -71,11 +71,11 @@ SOURCE:
|
|||
HDMI/DP (either HDMI or DisplayPort)
|
||||
|
||||
Exceptions (deprecated):
|
||||
[Digital] Capture Source
|
||||
[Digital] Capture Switch (aka input gain switch)
|
||||
[Digital] Capture Volume (aka input gain volume)
|
||||
[Digital] Playback Switch (aka output gain switch)
|
||||
[Digital] Playback Volume (aka output gain volume)
|
||||
[Analogue|Digital] Capture Source
|
||||
[Analogue|Digital] Capture Switch (aka input gain switch)
|
||||
[Analogue|Digital] Capture Volume (aka input gain volume)
|
||||
[Analogue|Digital] Playback Switch (aka output gain switch)
|
||||
[Analogue|Digital] Playback Volume (aka output gain volume)
|
||||
Tone Control - Switch
|
||||
Tone Control - Bass
|
||||
Tone Control - Treble
|
||||
|
|
|
@ -466,7 +466,11 @@ The generic parser supports the following hints:
|
|||
- add_jack_modes (bool): add "xxx Jack Mode" enum controls to each
|
||||
I/O jack for allowing to change the headphone amp and mic bias VREF
|
||||
capabilities
|
||||
- power_down_unused (bool): power down the unused widgets
|
||||
- power_save_node (bool): advanced power management for each widget,
|
||||
controlling the power sate (D0/D3) of each widget node depending on
|
||||
the actual pin and stream states
|
||||
- power_down_unused (bool): power down the unused widgets, a subset of
|
||||
power_save_node, and will be dropped in future
|
||||
- add_hp_mic (bool): add the headphone to capture source if possible
|
||||
- hp_mic_detect (bool): enable/disable the hp/mic shared input for a
|
||||
single built-in mic case; default true
|
||||
|
|
|
@ -0,0 +1,200 @@
|
|||
The ALSA API can provide two different system timestamps:
|
||||
|
||||
- Trigger_tstamp is the system time snapshot taken when the .trigger
|
||||
callback is invoked. This snapshot is taken by the ALSA core in the
|
||||
general case, but specific hardware may have synchronization
|
||||
capabilities or conversely may only be able to provide a correct
|
||||
estimate with a delay. In the latter two cases, the low-level driver
|
||||
is responsible for updating the trigger_tstamp at the most appropriate
|
||||
and precise moment. Applications should not rely solely on the first
|
||||
trigger_tstamp but update their internal calculations if the driver
|
||||
provides a refined estimate with a delay.
|
||||
|
||||
- tstamp is the current system timestamp updated during the last
|
||||
event or application query.
|
||||
The difference (tstamp - trigger_tstamp) defines the elapsed time.
|
||||
|
||||
The ALSA API provides reports two basic pieces of information, avail
|
||||
and delay, which combined with the trigger and current system
|
||||
timestamps allow for applications to keep track of the 'fullness' of
|
||||
the ring buffer and the amount of queued samples.
|
||||
|
||||
The use of these different pointers and time information depends on
|
||||
the application needs:
|
||||
|
||||
- 'avail' reports how much can be written in the ring buffer
|
||||
- 'delay' reports the time it will take to hear a new sample after all
|
||||
queued samples have been played out.
|
||||
|
||||
When timestamps are enabled, the avail/delay information is reported
|
||||
along with a snapshot of system time. Applications can select from
|
||||
CLOCK_REALTIME (NTP corrections including going backwards),
|
||||
CLOCK_MONOTONIC (NTP corrections but never going backwards),
|
||||
CLOCK_MONOTIC_RAW (without NTP corrections) and change the mode
|
||||
dynamically with sw_params
|
||||
|
||||
|
||||
The ALSA API also provide an audio_tstamp which reflects the passage
|
||||
of time as measured by different components of audio hardware. In
|
||||
ascii-art, this could be represented as follows (for the playback
|
||||
case):
|
||||
|
||||
|
||||
--------------------------------------------------------------> time
|
||||
^ ^ ^ ^ ^
|
||||
| | | | |
|
||||
analog link dma app FullBuffer
|
||||
time time time time time
|
||||
| | | | |
|
||||
|< codec delay >|<--hw delay-->|<queued samples>|<---avail->|
|
||||
|<----------------- delay---------------------->| |
|
||||
|<----ring buffer length---->|
|
||||
|
||||
The analog time is taken at the last stage of the playback, as close
|
||||
as possible to the actual transducer
|
||||
|
||||
The link time is taken at the output of the SOC/chipset as the samples
|
||||
are pushed on a link. The link time can be directly measured if
|
||||
supported in hardware by sample counters or wallclocks (e.g. with
|
||||
HDAudio 24MHz or PTP clock for networked solutions) or indirectly
|
||||
estimated (e.g. with the frame counter in USB).
|
||||
|
||||
The DMA time is measured using counters - typically the least reliable
|
||||
of all measurements due to the bursty natured of DMA transfers.
|
||||
|
||||
The app time corresponds to the time tracked by an application after
|
||||
writing in the ring buffer.
|
||||
|
||||
The application can query what the hardware supports, define which
|
||||
audio time it wants reported by selecting the relevant settings in
|
||||
audio_tstamp_config fields, get an estimate of the timestamp
|
||||
accuracy. It can also request the delay-to-analog be included in the
|
||||
measurement. Direct access to the link time is very interesting on
|
||||
platforms that provide an embedded DSP; measuring directly the link
|
||||
time with dedicated hardware, possibly synchronized with system time,
|
||||
removes the need to keep track of internal DSP processing times and
|
||||
latency.
|
||||
|
||||
In case the application requests an audio tstamp that is not supported
|
||||
in hardware/low-level driver, the type is overridden as DEFAULT and the
|
||||
timestamp will report the DMA time based on the hw_pointer value.
|
||||
|
||||
For backwards compatibility with previous implementations that did not
|
||||
provide timestamp selection, with a zero-valued COMPAT timestamp type
|
||||
the results will default to the HDAudio wall clock for playback
|
||||
streams and to the DMA time (hw_ptr) in all other cases.
|
||||
|
||||
The audio timestamp accuracy can be returned to user-space, so that
|
||||
appropriate decisions are made:
|
||||
|
||||
- for dma time (default), the granularity of the transfers can be
|
||||
inferred from the steps between updates and in turn provide
|
||||
information on how much the application pointer can be rewound
|
||||
safely.
|
||||
|
||||
- the link time can be used to track long-term drifts between audio
|
||||
and system time using the (tstamp-trigger_tstamp)/audio_tstamp
|
||||
ratio, the precision helps define how much smoothing/low-pass
|
||||
filtering is required. The link time can be either reset on startup
|
||||
or reported as is (the latter being useful to compare progress of
|
||||
different streams - but may require the wallclock to be always
|
||||
running and not wrap-around during idle periods). If supported in
|
||||
hardware, the absolute link time could also be used to define a
|
||||
precise start time (patches WIP)
|
||||
|
||||
- including the delay in the audio timestamp may
|
||||
counter-intuitively not increase the precision of timestamps, e.g. if a
|
||||
codec includes variable-latency DSP processing or a chain of
|
||||
hardware components the delay is typically not known with precision.
|
||||
|
||||
The accuracy is reported in nanosecond units (using an unsigned 32-bit
|
||||
word), which gives a max precision of 4.29s, more than enough for
|
||||
audio applications...
|
||||
|
||||
Due to the varied nature of timestamping needs, even for a single
|
||||
application, the audio_tstamp_config can be changed dynamically. In
|
||||
the STATUS ioctl, the parameters are read-only and do not allow for
|
||||
any application selection. To work around this limitation without
|
||||
impacting legacy applications, a new STATUS_EXT ioctl is introduced
|
||||
with read/write parameters. ALSA-lib will be modified to make use of
|
||||
STATUS_EXT and effectively deprecate STATUS.
|
||||
|
||||
The ALSA API only allows for a single audio timestamp to be reported
|
||||
at a time. This is a conscious design decision, reading the audio
|
||||
timestamps from hardware registers or from IPC takes time, the more
|
||||
timestamps are read the more imprecise the combined measurements
|
||||
are. To avoid any interpretation issues, a single (system, audio)
|
||||
timestamp is reported. Applications that need different timestamps
|
||||
will be required to issue multiple queries and perform an
|
||||
interpolation of the results
|
||||
|
||||
In some hardware-specific configuration, the system timestamp is
|
||||
latched by a low-level audio subsytem, and the information provided
|
||||
back to the driver. Due to potential delays in the communication with
|
||||
the hardware, there is a risk of misalignment with the avail and delay
|
||||
information. To make sure applications are not confused, a
|
||||
driver_timestamp field is added in the snd_pcm_status structure; this
|
||||
timestamp shows when the information is put together by the driver
|
||||
before returning from the STATUS and STATUS_EXT ioctl. in most cases
|
||||
this driver_timestamp will be identical to the regular system tstamp.
|
||||
|
||||
Examples of typestamping with HDaudio:
|
||||
|
||||
1. DMA timestamp, no compensation for DMA+analog delay
|
||||
$ ./audio_time -p --ts_type=1
|
||||
playback: systime: 341121338 nsec, audio time 342000000 nsec, systime delta -878662
|
||||
playback: systime: 426236663 nsec, audio time 427187500 nsec, systime delta -950837
|
||||
playback: systime: 597080580 nsec, audio time 598000000 nsec, systime delta -919420
|
||||
playback: systime: 682059782 nsec, audio time 683020833 nsec, systime delta -961051
|
||||
playback: systime: 852896415 nsec, audio time 853854166 nsec, systime delta -957751
|
||||
playback: systime: 937903344 nsec, audio time 938854166 nsec, systime delta -950822
|
||||
|
||||
2. DMA timestamp, compensation for DMA+analog delay
|
||||
$ ./audio_time -p --ts_type=1 -d
|
||||
playback: systime: 341053347 nsec, audio time 341062500 nsec, systime delta -9153
|
||||
playback: systime: 426072447 nsec, audio time 426062500 nsec, systime delta 9947
|
||||
playback: systime: 596899518 nsec, audio time 596895833 nsec, systime delta 3685
|
||||
playback: systime: 681915317 nsec, audio time 681916666 nsec, systime delta -1349
|
||||
playback: systime: 852741306 nsec, audio time 852750000 nsec, systime delta -8694
|
||||
|
||||
3. link timestamp, compensation for DMA+analog delay
|
||||
$ ./audio_time -p --ts_type=2 -d
|
||||
playback: systime: 341060004 nsec, audio time 341062791 nsec, systime delta -2787
|
||||
playback: systime: 426242074 nsec, audio time 426244875 nsec, systime delta -2801
|
||||
playback: systime: 597080992 nsec, audio time 597084583 nsec, systime delta -3591
|
||||
playback: systime: 682084512 nsec, audio time 682088291 nsec, systime delta -3779
|
||||
playback: systime: 852936229 nsec, audio time 852940916 nsec, systime delta -4687
|
||||
playback: systime: 938107562 nsec, audio time 938112708 nsec, systime delta -5146
|
||||
|
||||
Example 1 shows that the timestamp at the DMA level is close to 1ms
|
||||
ahead of the actual playback time (as a side time this sort of
|
||||
measurement can help define rewind safeguards). Compensating for the
|
||||
DMA-link delay in example 2 helps remove the hardware buffering abut
|
||||
the information is still very jittery, with up to one sample of
|
||||
error. In example 3 where the timestamps are measured with the link
|
||||
wallclock, the timestamps show a monotonic behavior and a lower
|
||||
dispersion.
|
||||
|
||||
Example 3 and 4 are with USB audio class. Example 3 shows a high
|
||||
offset between audio time and system time due to buffering. Example 4
|
||||
shows how compensating for the delay exposes a 1ms accuracy (due to
|
||||
the use of the frame counter by the driver)
|
||||
|
||||
Example 3: DMA timestamp, no compensation for delay, delta of ~5ms
|
||||
$ ./audio_time -p -Dhw:1 -t1
|
||||
playback: systime: 120174019 nsec, audio time 125000000 nsec, systime delta -4825981
|
||||
playback: systime: 245041136 nsec, audio time 250000000 nsec, systime delta -4958864
|
||||
playback: systime: 370106088 nsec, audio time 375000000 nsec, systime delta -4893912
|
||||
playback: systime: 495040065 nsec, audio time 500000000 nsec, systime delta -4959935
|
||||
playback: systime: 620038179 nsec, audio time 625000000 nsec, systime delta -4961821
|
||||
playback: systime: 745087741 nsec, audio time 750000000 nsec, systime delta -4912259
|
||||
playback: systime: 870037336 nsec, audio time 875000000 nsec, systime delta -4962664
|
||||
|
||||
Example 4: DMA timestamp, compensation for delay, delay of ~1ms
|
||||
$ ./audio_time -p -Dhw:1 -t1 -d
|
||||
playback: systime: 120190520 nsec, audio time 120000000 nsec, systime delta 190520
|
||||
playback: systime: 245036740 nsec, audio time 244000000 nsec, systime delta 1036740
|
||||
playback: systime: 370034081 nsec, audio time 369000000 nsec, systime delta 1034081
|
||||
playback: systime: 495159907 nsec, audio time 494000000 nsec, systime delta 1159907
|
||||
playback: systime: 620098824 nsec, audio time 619000000 nsec, systime delta 1098824
|
||||
playback: systime: 745031847 nsec, audio time 744000000 nsec, systime delta 1031847
|
|
@ -80,7 +80,7 @@ static void hex_dump(const void *src, size_t length, size_t line_size, char *pre
|
|||
* Unescape - process hexadecimal escape character
|
||||
* converts shell input "\x23" -> 0x23
|
||||
*/
|
||||
int unespcape(char *_dst, char *_src, size_t len)
|
||||
static int unescape(char *_dst, char *_src, size_t len)
|
||||
{
|
||||
int ret = 0;
|
||||
char *src = _src;
|
||||
|
@ -304,7 +304,7 @@ int main(int argc, char *argv[])
|
|||
size = strlen(input_tx+1);
|
||||
tx = malloc(size);
|
||||
rx = malloc(size);
|
||||
size = unespcape((char *)tx, input_tx, size);
|
||||
size = unescape((char *)tx, input_tx, size);
|
||||
transfer(fd, tx, rx, size);
|
||||
free(rx);
|
||||
free(tx);
|
||||
|
|
|
@ -872,6 +872,27 @@ can be ORed together:
|
|||
|
||||
==============================================================
|
||||
|
||||
threads-max
|
||||
|
||||
This value controls the maximum number of threads that can be created
|
||||
using fork().
|
||||
|
||||
During initialization the kernel sets this value such that even if the
|
||||
maximum number of threads is created, the thread structures occupy only
|
||||
a part (1/8th) of the available RAM pages.
|
||||
|
||||
The minimum value that can be written to threads-max is 20.
|
||||
The maximum value that can be written to threads-max is given by the
|
||||
constant FUTEX_TID_MASK (0x3fffffff).
|
||||
If a value outside of this range is written to threads-max an error
|
||||
EINVAL occurs.
|
||||
|
||||
The value written is checked against the available RAM pages. If the
|
||||
thread structures would occupy too much (more than 1/8th) of the
|
||||
available RAM pages threads-max is reduced accordingly.
|
||||
|
||||
==============================================================
|
||||
|
||||
unknown_nmi_panic:
|
||||
|
||||
The value in this file affects behavior of handling NMI. When the
|
||||
|
|
|
@ -21,6 +21,7 @@ Currently, these files are in /proc/sys/vm:
|
|||
- admin_reserve_kbytes
|
||||
- block_dump
|
||||
- compact_memory
|
||||
- compact_unevictable_allowed
|
||||
- dirty_background_bytes
|
||||
- dirty_background_ratio
|
||||
- dirty_bytes
|
||||
|
@ -106,6 +107,16 @@ huge pages although processes will also directly compact memory as required.
|
|||
|
||||
==============================================================
|
||||
|
||||
compact_unevictable_allowed
|
||||
|
||||
Available only when CONFIG_COMPACTION is set. When set to 1, compaction is
|
||||
allowed to examine the unevictable lru (mlocked pages) for pages to compact.
|
||||
This should be used on systems where stalls for minor page faults are an
|
||||
acceptable trade for large contiguous free memory. Set to 0 to prevent
|
||||
compaction from moving pages that are unevictable. Default value is 1.
|
||||
|
||||
==============================================================
|
||||
|
||||
dirty_background_bytes
|
||||
|
||||
Contains the amount of dirty memory at which the background kernel
|
||||
|
|
|
@ -267,21 +267,34 @@ call, then it is required that system administrator mount a file system of
|
|||
type hugetlbfs:
|
||||
|
||||
mount -t hugetlbfs \
|
||||
-o uid=<value>,gid=<value>,mode=<value>,size=<value>,nr_inodes=<value> \
|
||||
none /mnt/huge
|
||||
-o uid=<value>,gid=<value>,mode=<value>,pagesize=<value>,size=<value>,\
|
||||
min_size=<value>,nr_inodes=<value> none /mnt/huge
|
||||
|
||||
This command mounts a (pseudo) filesystem of type hugetlbfs on the directory
|
||||
/mnt/huge. Any files created on /mnt/huge uses huge pages. The uid and gid
|
||||
options sets the owner and group of the root of the file system. By default
|
||||
the uid and gid of the current process are taken. The mode option sets the
|
||||
mode of root of file system to value & 01777. This value is given in octal.
|
||||
By default the value 0755 is picked. The size option sets the maximum value of
|
||||
memory (huge pages) allowed for that filesystem (/mnt/huge). The size is
|
||||
rounded down to HPAGE_SIZE. The option nr_inodes sets the maximum number of
|
||||
inodes that /mnt/huge can use. If the size or nr_inodes option is not
|
||||
provided on command line then no limits are set. For size and nr_inodes
|
||||
options, you can use [G|g]/[M|m]/[K|k] to represent giga/mega/kilo. For
|
||||
example, size=2K has the same meaning as size=2048.
|
||||
By default the value 0755 is picked. If the paltform supports multiple huge
|
||||
page sizes, the pagesize option can be used to specify the huge page size and
|
||||
associated pool. pagesize is specified in bytes. If pagesize is not specified
|
||||
the paltform's default huge page size and associated pool will be used. The
|
||||
size option sets the maximum value of memory (huge pages) allowed for that
|
||||
filesystem (/mnt/huge). The size option can be specified in bytes, or as a
|
||||
percentage of the specified huge page pool (nr_hugepages). The size is
|
||||
rounded down to HPAGE_SIZE boundary. The min_size option sets the minimum
|
||||
value of memory (huge pages) allowed for the filesystem. min_size can be
|
||||
specified in the same way as size, either bytes or a percentage of the
|
||||
huge page pool. At mount time, the number of huge pages specified by
|
||||
min_size are reserved for use by the filesystem. If there are not enough
|
||||
free huge pages available, the mount will fail. As huge pages are allocated
|
||||
to the filesystem and freed, the reserve count is adjusted so that the sum
|
||||
of allocated and reserved huge pages is always at least min_size. The option
|
||||
nr_inodes sets the maximum number of inodes that /mnt/huge can use. If the
|
||||
size, min_size or nr_inodes option is not provided on command line then
|
||||
no limits are set. For pagesize, size, min_size and nr_inodes options, you
|
||||
can use [G|g]/[M|m]/[K|k] to represent giga/mega/kilo. For example, size=2K
|
||||
has the same meaning as size=2048.
|
||||
|
||||
While read system calls are supported on files that reside on hugetlb
|
||||
file systems, write system calls are not.
|
||||
|
@ -289,15 +302,23 @@ file systems, write system calls are not.
|
|||
Regular chown, chgrp, and chmod commands (with right permissions) could be
|
||||
used to change the file attributes on hugetlbfs.
|
||||
|
||||
Also, it is important to note that no such mount command is required if the
|
||||
Also, it is important to note that no such mount command is required if
|
||||
applications are going to use only shmat/shmget system calls or mmap with
|
||||
MAP_HUGETLB. Users who wish to use hugetlb page via shared memory segment
|
||||
should be a member of a supplementary group and system admin needs to
|
||||
configure that gid into /proc/sys/vm/hugetlb_shm_group. It is possible for
|
||||
same or different applications to use any combination of mmaps and shm*
|
||||
calls, though the mount of filesystem will be required for using mmap calls
|
||||
without MAP_HUGETLB. For an example of how to use mmap with MAP_HUGETLB see
|
||||
map_hugetlb.c.
|
||||
MAP_HUGETLB. For an example of how to use mmap with MAP_HUGETLB see map_hugetlb
|
||||
below.
|
||||
|
||||
Users who wish to use hugetlb memory via shared memory segment should be a
|
||||
member of a supplementary group and system admin needs to configure that gid
|
||||
into /proc/sys/vm/hugetlb_shm_group. It is possible for same or different
|
||||
applications to use any combination of mmaps and shm* calls, though the mount of
|
||||
filesystem will be required for using mmap calls without MAP_HUGETLB.
|
||||
|
||||
Syscalls that operate on memory backed by hugetlb pages only have their lengths
|
||||
aligned to the native page size of the processor; they will normally fail with
|
||||
errno set to EINVAL or exclude hugetlb pages that extend beyond the length if
|
||||
not hugepage aligned. For example, munmap(2) will fail if memory is backed by
|
||||
a hugetlb page and the length is smaller than the hugepage size.
|
||||
|
||||
|
||||
Examples
|
||||
========
|
||||
|
|
|
@ -22,6 +22,7 @@ CONTENTS
|
|||
- Filtering special vmas.
|
||||
- munlock()/munlockall() system call handling.
|
||||
- Migrating mlocked pages.
|
||||
- Compacting mlocked pages.
|
||||
- mmap(MAP_LOCKED) system call handling.
|
||||
- munmap()/exit()/exec() system call handling.
|
||||
- try_to_unmap().
|
||||
|
@ -450,6 +451,17 @@ list because of a race between munlock and migration, page migration uses the
|
|||
putback_lru_page() function to add migrated pages back to the LRU.
|
||||
|
||||
|
||||
COMPACTING MLOCKED PAGES
|
||||
------------------------
|
||||
|
||||
The unevictable LRU can be scanned for compactable regions and the default
|
||||
behavior is to do so. /proc/sys/vm/compact_unevictable_allowed controls
|
||||
this behavior (see Documentation/sysctl/vm.txt). Once scanning of the
|
||||
unevictable LRU is enabled, the work of compaction is mostly handled by
|
||||
the page migration code and the same work flow as described in MIGRATING
|
||||
MLOCKED PAGES will apply.
|
||||
|
||||
|
||||
mmap(MAP_LOCKED) SYSTEM CALL HANDLING
|
||||
-------------------------------------
|
||||
|
||||
|
|
|
@ -0,0 +1,70 @@
|
|||
zsmalloc
|
||||
--------
|
||||
|
||||
This allocator is designed for use with zram. Thus, the allocator is
|
||||
supposed to work well under low memory conditions. In particular, it
|
||||
never attempts higher order page allocation which is very likely to
|
||||
fail under memory pressure. On the other hand, if we just use single
|
||||
(0-order) pages, it would suffer from very high fragmentation --
|
||||
any object of size PAGE_SIZE/2 or larger would occupy an entire page.
|
||||
This was one of the major issues with its predecessor (xvmalloc).
|
||||
|
||||
To overcome these issues, zsmalloc allocates a bunch of 0-order pages
|
||||
and links them together using various 'struct page' fields. These linked
|
||||
pages act as a single higher-order page i.e. an object can span 0-order
|
||||
page boundaries. The code refers to these linked pages as a single entity
|
||||
called zspage.
|
||||
|
||||
For simplicity, zsmalloc can only allocate objects of size up to PAGE_SIZE
|
||||
since this satisfies the requirements of all its current users (in the
|
||||
worst case, page is incompressible and is thus stored "as-is" i.e. in
|
||||
uncompressed form). For allocation requests larger than this size, failure
|
||||
is returned (see zs_malloc).
|
||||
|
||||
Additionally, zs_malloc() does not return a dereferenceable pointer.
|
||||
Instead, it returns an opaque handle (unsigned long) which encodes actual
|
||||
location of the allocated object. The reason for this indirection is that
|
||||
zsmalloc does not keep zspages permanently mapped since that would cause
|
||||
issues on 32-bit systems where the VA region for kernel space mappings
|
||||
is very small. So, before using the allocating memory, the object has to
|
||||
be mapped using zs_map_object() to get a usable pointer and subsequently
|
||||
unmapped using zs_unmap_object().
|
||||
|
||||
stat
|
||||
----
|
||||
|
||||
With CONFIG_ZSMALLOC_STAT, we could see zsmalloc internal information via
|
||||
/sys/kernel/debug/zsmalloc/<user name>. Here is a sample of stat output:
|
||||
|
||||
# cat /sys/kernel/debug/zsmalloc/zram0/classes
|
||||
|
||||
class size almost_full almost_empty obj_allocated obj_used pages_used pages_per_zspage
|
||||
..
|
||||
..
|
||||
9 176 0 1 186 129 8 4
|
||||
10 192 1 0 2880 2872 135 3
|
||||
11 208 0 1 819 795 42 2
|
||||
12 224 0 1 219 159 12 4
|
||||
..
|
||||
..
|
||||
|
||||
|
||||
class: index
|
||||
size: object size zspage stores
|
||||
almost_empty: the number of ZS_ALMOST_EMPTY zspages(see below)
|
||||
almost_full: the number of ZS_ALMOST_FULL zspages(see below)
|
||||
obj_allocated: the number of objects allocated
|
||||
obj_used: the number of objects allocated to the user
|
||||
pages_used: the number of pages allocated for the class
|
||||
pages_per_zspage: the number of 0-order pages to make a zspage
|
||||
|
||||
We assign a zspage to ZS_ALMOST_EMPTY fullness group when:
|
||||
n <= N / f, where
|
||||
n = number of allocated objects
|
||||
N = total number of objects zspage can store
|
||||
f = fullness_threshold_frac(ie, 4 at the moment)
|
||||
|
||||
Similarly, we assign zspage to:
|
||||
ZS_ALMOST_FULL when n > N / f
|
||||
ZS_EMPTY when n == 0
|
||||
ZS_FULL when n == N
|
21
Kbuild
21
Kbuild
|
@ -13,8 +13,9 @@ define sed-y
|
|||
s:->::; p;}"
|
||||
endef
|
||||
|
||||
quiet_cmd_offsets = GEN $@
|
||||
define cmd_offsets
|
||||
# Use filechk to avoid rebuilds when a header changes, but the resulting file
|
||||
# does not
|
||||
define filechk_offsets
|
||||
(set -e; \
|
||||
echo "#ifndef $2"; \
|
||||
echo "#define $2"; \
|
||||
|
@ -24,9 +25,9 @@ define cmd_offsets
|
|||
echo " * This file was generated by Kbuild"; \
|
||||
echo " */"; \
|
||||
echo ""; \
|
||||
sed -ne $(sed-y) $<; \
|
||||
sed -ne $(sed-y); \
|
||||
echo ""; \
|
||||
echo "#endif" ) > $@
|
||||
echo "#endif" )
|
||||
endef
|
||||
|
||||
#####
|
||||
|
@ -35,16 +36,15 @@ endef
|
|||
bounds-file := include/generated/bounds.h
|
||||
|
||||
always := $(bounds-file)
|
||||
targets := $(bounds-file) kernel/bounds.s
|
||||
targets := kernel/bounds.s
|
||||
|
||||
# We use internal kbuild rules to avoid the "is up to date" message from make
|
||||
kernel/bounds.s: kernel/bounds.c FORCE
|
||||
$(Q)mkdir -p $(dir $@)
|
||||
$(call if_changed_dep,cc_s_c)
|
||||
|
||||
$(obj)/$(bounds-file): kernel/bounds.s Kbuild
|
||||
$(Q)mkdir -p $(dir $@)
|
||||
$(call cmd,offsets,__LINUX_BOUNDS_H__)
|
||||
$(obj)/$(bounds-file): kernel/bounds.s FORCE
|
||||
$(call filechk,offsets,__LINUX_BOUNDS_H__)
|
||||
|
||||
#####
|
||||
# 2) Generate asm-offsets.h
|
||||
|
@ -53,7 +53,6 @@ $(obj)/$(bounds-file): kernel/bounds.s Kbuild
|
|||
offsets-file := include/generated/asm-offsets.h
|
||||
|
||||
always += $(offsets-file)
|
||||
targets += $(offsets-file)
|
||||
targets += arch/$(SRCARCH)/kernel/asm-offsets.s
|
||||
|
||||
# We use internal kbuild rules to avoid the "is up to date" message from make
|
||||
|
@ -62,8 +61,8 @@ arch/$(SRCARCH)/kernel/asm-offsets.s: arch/$(SRCARCH)/kernel/asm-offsets.c \
|
|||
$(Q)mkdir -p $(dir $@)
|
||||
$(call if_changed_dep,cc_s_c)
|
||||
|
||||
$(obj)/$(offsets-file): arch/$(SRCARCH)/kernel/asm-offsets.s Kbuild
|
||||
$(call cmd,offsets,__ASM_OFFSETS_H__)
|
||||
$(obj)/$(offsets-file): arch/$(SRCARCH)/kernel/asm-offsets.s FORCE
|
||||
$(call filechk,offsets,__ASM_OFFSETS_H__)
|
||||
|
||||
#####
|
||||
# 3) Check for missing system calls
|
||||
|
|
96
MAINTAINERS
96
MAINTAINERS
|
@ -625,16 +625,16 @@ F: drivers/iommu/amd_iommu*.[ch]
|
|||
F: include/linux/amd-iommu.h
|
||||
|
||||
AMD KFD
|
||||
M: Oded Gabbay <oded.gabbay@amd.com>
|
||||
L: dri-devel@lists.freedesktop.org
|
||||
T: git git://people.freedesktop.org/~gabbayo/linux.git
|
||||
S: Supported
|
||||
F: drivers/gpu/drm/amd/amdkfd/
|
||||
M: Oded Gabbay <oded.gabbay@amd.com>
|
||||
L: dri-devel@lists.freedesktop.org
|
||||
T: git git://people.freedesktop.org/~gabbayo/linux.git
|
||||
S: Supported
|
||||
F: drivers/gpu/drm/amd/amdkfd/
|
||||
F: drivers/gpu/drm/amd/include/cik_structs.h
|
||||
F: drivers/gpu/drm/amd/include/kgd_kfd_interface.h
|
||||
F: drivers/gpu/drm/radeon/radeon_kfd.c
|
||||
F: drivers/gpu/drm/radeon/radeon_kfd.h
|
||||
F: include/uapi/linux/kfd_ioctl.h
|
||||
F: drivers/gpu/drm/radeon/radeon_kfd.c
|
||||
F: drivers/gpu/drm/radeon/radeon_kfd.h
|
||||
F: include/uapi/linux/kfd_ioctl.h
|
||||
|
||||
AMD MICROCODE UPDATE SUPPORT
|
||||
M: Borislav Petkov <bp@alien8.de>
|
||||
|
@ -1215,6 +1215,7 @@ F: arch/arm/mach-orion5x/ts78xx-*
|
|||
ARM/Mediatek SoC support
|
||||
M: Matthias Brugger <matthias.bgg@gmail.com>
|
||||
L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
|
||||
L: linux-mediatek@lists.infradead.org (moderated for non-subscribers)
|
||||
S: Maintained
|
||||
F: arch/arm/boot/dts/mt6*
|
||||
F: arch/arm/boot/dts/mt8*
|
||||
|
@ -1764,7 +1765,7 @@ S: Supported
|
|||
F: drivers/tty/serial/atmel_serial.c
|
||||
|
||||
ATMEL Audio ALSA driver
|
||||
M: Bo Shen <voice.shen@atmel.com>
|
||||
M: Nicolas Ferre <nicolas.ferre@atmel.com>
|
||||
L: alsa-devel@alsa-project.org (moderated for non-subscribers)
|
||||
S: Supported
|
||||
F: sound/soc/atmel
|
||||
|
@ -1915,16 +1916,14 @@ S: Maintained
|
|||
F: drivers/media/radio/radio-aztech*
|
||||
|
||||
B43 WIRELESS DRIVER
|
||||
M: Stefano Brivio <stefano.brivio@polimi.it>
|
||||
L: linux-wireless@vger.kernel.org
|
||||
L: b43-dev@lists.infradead.org
|
||||
W: http://wireless.kernel.org/en/users/Drivers/b43
|
||||
S: Maintained
|
||||
S: Odd Fixes
|
||||
F: drivers/net/wireless/b43/
|
||||
|
||||
B43LEGACY WIRELESS DRIVER
|
||||
M: Larry Finger <Larry.Finger@lwfinger.net>
|
||||
M: Stefano Brivio <stefano.brivio@polimi.it>
|
||||
L: linux-wireless@vger.kernel.org
|
||||
L: b43-dev@lists.infradead.org
|
||||
W: http://wireless.kernel.org/en/users/Drivers/b43
|
||||
|
@ -1967,10 +1966,10 @@ F: Documentation/filesystems/befs.txt
|
|||
F: fs/befs/
|
||||
|
||||
BECKHOFF CX5020 ETHERCAT MASTER DRIVER
|
||||
M: Dariusz Marcinkiewicz <reksio@newterm.pl>
|
||||
L: netdev@vger.kernel.org
|
||||
S: Maintained
|
||||
F: drivers/net/ethernet/ec_bhf.c
|
||||
M: Dariusz Marcinkiewicz <reksio@newterm.pl>
|
||||
L: netdev@vger.kernel.org
|
||||
S: Maintained
|
||||
F: drivers/net/ethernet/ec_bhf.c
|
||||
|
||||
BFS FILE SYSTEM
|
||||
M: "Tigran A. Aivazian" <tigran@aivazian.fsnet.co.uk>
|
||||
|
@ -2825,6 +2824,7 @@ L: linux-crypto@vger.kernel.org
|
|||
T: git git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6.git
|
||||
S: Maintained
|
||||
F: Documentation/crypto/
|
||||
F: Documentation/DocBook/crypto-API.tmpl
|
||||
F: arch/*/crypto/
|
||||
F: crypto/
|
||||
F: drivers/crypto/
|
||||
|
@ -2895,11 +2895,11 @@ S: Supported
|
|||
F: drivers/net/ethernet/chelsio/cxgb3/
|
||||
|
||||
CXGB3 ISCSI DRIVER (CXGB3I)
|
||||
M: Karen Xie <kxie@chelsio.com>
|
||||
L: linux-scsi@vger.kernel.org
|
||||
W: http://www.chelsio.com
|
||||
S: Supported
|
||||
F: drivers/scsi/cxgbi/cxgb3i
|
||||
M: Karen Xie <kxie@chelsio.com>
|
||||
L: linux-scsi@vger.kernel.org
|
||||
W: http://www.chelsio.com
|
||||
S: Supported
|
||||
F: drivers/scsi/cxgbi/cxgb3i
|
||||
|
||||
CXGB3 IWARP RNIC DRIVER (IW_CXGB3)
|
||||
M: Steve Wise <swise@chelsio.com>
|
||||
|
@ -2916,11 +2916,11 @@ S: Supported
|
|||
F: drivers/net/ethernet/chelsio/cxgb4/
|
||||
|
||||
CXGB4 ISCSI DRIVER (CXGB4I)
|
||||
M: Karen Xie <kxie@chelsio.com>
|
||||
L: linux-scsi@vger.kernel.org
|
||||
W: http://www.chelsio.com
|
||||
S: Supported
|
||||
F: drivers/scsi/cxgbi/cxgb4i
|
||||
M: Karen Xie <kxie@chelsio.com>
|
||||
L: linux-scsi@vger.kernel.org
|
||||
W: http://www.chelsio.com
|
||||
S: Supported
|
||||
F: drivers/scsi/cxgbi/cxgb4i
|
||||
|
||||
CXGB4 IWARP RNIC DRIVER (IW_CXGB4)
|
||||
M: Steve Wise <swise@chelsio.com>
|
||||
|
@ -5222,7 +5222,7 @@ F: arch/x86/kernel/tboot.c
|
|||
INTEL WIRELESS WIMAX CONNECTION 2400
|
||||
M: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com>
|
||||
M: linux-wimax@intel.com
|
||||
L: wimax@linuxwimax.org (subscribers-only)
|
||||
L: wimax@linuxwimax.org (subscribers-only)
|
||||
S: Supported
|
||||
W: http://linuxwimax.org
|
||||
F: Documentation/wimax/README.i2400m
|
||||
|
@ -5300,6 +5300,13 @@ F: drivers/char/ipmi/
|
|||
F: include/linux/ipmi*
|
||||
F: include/uapi/linux/ipmi*
|
||||
|
||||
QCOM AUDIO (ASoC) DRIVERS
|
||||
M: Patrick Lai <plai@codeaurora.org>
|
||||
M: Banajit Goswami <bgoswami@codeaurora.org>
|
||||
L: alsa-devel@alsa-project.org (moderated for non-subscribers)
|
||||
S: Supported
|
||||
F: sound/soc/qcom/
|
||||
|
||||
IPS SCSI RAID DRIVER
|
||||
M: Adaptec OEM Raid Solutions <aacraid@adaptec.com>
|
||||
L: linux-scsi@vger.kernel.org
|
||||
|
@ -5918,7 +5925,7 @@ F: arch/powerpc/platforms/512x/
|
|||
F: arch/powerpc/platforms/52xx/
|
||||
|
||||
LINUX FOR POWERPC EMBEDDED PPC4XX
|
||||
M: Alistair Popple <alistair@popple.id.au>
|
||||
M: Alistair Popple <alistair@popple.id.au>
|
||||
M: Matt Porter <mporter@kernel.crashing.org>
|
||||
W: http://www.penguinppc.org/
|
||||
L: linuxppc-dev@lists.ozlabs.org
|
||||
|
@ -6391,7 +6398,7 @@ S: Supported
|
|||
F: drivers/watchdog/mena21_wdt.c
|
||||
|
||||
MEN CHAMELEON BUS (mcb)
|
||||
M: Johannes Thumshirn <johannes.thumshirn@men.de>
|
||||
M: Johannes Thumshirn <johannes.thumshirn@men.de>
|
||||
S: Supported
|
||||
F: drivers/mcb/
|
||||
F: include/linux/mcb.h
|
||||
|
@ -7947,10 +7954,10 @@ L: rtc-linux@googlegroups.com
|
|||
S: Maintained
|
||||
|
||||
QAT DRIVER
|
||||
M: Tadeusz Struk <tadeusz.struk@intel.com>
|
||||
L: qat-linux@intel.com
|
||||
S: Supported
|
||||
F: drivers/crypto/qat/
|
||||
M: Tadeusz Struk <tadeusz.struk@intel.com>
|
||||
L: qat-linux@intel.com
|
||||
S: Supported
|
||||
F: drivers/crypto/qat/
|
||||
|
||||
QIB DRIVER
|
||||
M: Mike Marciniszyn <infinipath@intel.com>
|
||||
|
@ -8101,7 +8108,7 @@ S: Maintained
|
|||
F: drivers/net/wireless/rt2x00/
|
||||
|
||||
RAMDISK RAM BLOCK DEVICE DRIVER
|
||||
M: Nick Piggin <npiggin@kernel.dk>
|
||||
M: Jens Axboe <axboe@kernel.dk>
|
||||
S: Maintained
|
||||
F: Documentation/blockdev/ramdisk.txt
|
||||
F: drivers/block/brd.c
|
||||
|
@ -8177,6 +8184,7 @@ X: kernel/torture.c
|
|||
|
||||
REAL TIME CLOCK (RTC) SUBSYSTEM
|
||||
M: Alessandro Zummo <a.zummo@towertech.it>
|
||||
M: Alexandre Belloni <alexandre.belloni@free-electrons.com>
|
||||
L: rtc-linux@googlegroups.com
|
||||
Q: http://patchwork.ozlabs.org/project/rtc-linux/list/
|
||||
S: Maintained
|
||||
|
@ -8647,11 +8655,9 @@ F: drivers/scsi/sg.c
|
|||
F: include/scsi/sg.h
|
||||
|
||||
SCSI SUBSYSTEM
|
||||
M: "James E.J. Bottomley" <JBottomley@parallels.com>
|
||||
M: "James E.J. Bottomley" <JBottomley@odin.com>
|
||||
L: linux-scsi@vger.kernel.org
|
||||
T: git git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6.git
|
||||
T: git git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6.git
|
||||
T: git git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-pending-2.6.git
|
||||
T: git git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git
|
||||
S: Maintained
|
||||
F: drivers/scsi/
|
||||
F: include/scsi/
|
||||
|
@ -9967,6 +9973,7 @@ F: drivers/media/pci/tw68/
|
|||
TPM DEVICE DRIVER
|
||||
M: Peter Huewe <peterhuewe@gmx.de>
|
||||
M: Marcel Selhorst <tpmdd@selhorst.net>
|
||||
R: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
|
||||
W: http://tpmdd.sourceforge.net
|
||||
L: tpmdd-devel@lists.sourceforge.net (moderated for non-subscribers)
|
||||
Q: git git://github.com/PeterHuewe/linux-tpmdd.git
|
||||
|
@ -10120,11 +10127,11 @@ F: include/linux/cdrom.h
|
|||
F: include/uapi/linux/cdrom.h
|
||||
|
||||
UNISYS S-PAR DRIVERS
|
||||
M: Benjamin Romer <benjamin.romer@unisys.com>
|
||||
M: David Kershner <david.kershner@unisys.com>
|
||||
L: sparmaintainer@unisys.com (Unisys internal)
|
||||
S: Supported
|
||||
F: drivers/staging/unisys/
|
||||
M: Benjamin Romer <benjamin.romer@unisys.com>
|
||||
M: David Kershner <david.kershner@unisys.com>
|
||||
L: sparmaintainer@unisys.com (Unisys internal)
|
||||
S: Supported
|
||||
F: drivers/staging/unisys/
|
||||
|
||||
UNIVERSAL FLASH STORAGE HOST CONTROLLER DRIVER
|
||||
M: Vinayak Holikatti <vinholikatti@gmail.com>
|
||||
|
@ -10681,7 +10688,7 @@ F: drivers/media/rc/winbond-cir.c
|
|||
WIMAX STACK
|
||||
M: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com>
|
||||
M: linux-wimax@intel.com
|
||||
L: wimax@linuxwimax.org (subscribers-only)
|
||||
L: wimax@linuxwimax.org (subscribers-only)
|
||||
S: Supported
|
||||
W: http://linuxwimax.org
|
||||
F: Documentation/wimax/README.wimax
|
||||
|
@ -10972,6 +10979,7 @@ L: linux-mm@kvack.org
|
|||
S: Maintained
|
||||
F: mm/zsmalloc.c
|
||||
F: include/linux/zsmalloc.h
|
||||
F: Documentation/vm/zsmalloc.txt
|
||||
|
||||
ZSWAP COMPRESSED SWAP CACHING
|
||||
M: Seth Jennings <sjennings@variantweb.net>
|
||||
|
|
28
Makefile
28
Makefile
|
@ -10,9 +10,10 @@ NAME = Hurr durr I'ma sheep
|
|||
# Comments in this file are targeted only to the developer, do not
|
||||
# expect to learn how to build the kernel reading this file.
|
||||
|
||||
# Do not use make's built-in rules and variables
|
||||
# (this increases performance and avoids hard-to-debug behaviour);
|
||||
MAKEFLAGS += -rR
|
||||
# o Do not use make's built-in rules and variables
|
||||
# (this increases performance and avoids hard-to-debug behaviour);
|
||||
# o Look for make include files relative to root of kernel src
|
||||
MAKEFLAGS += -rR --include-dir=$(CURDIR)
|
||||
|
||||
# Avoid funny character set dependencies
|
||||
unexport LC_ALL
|
||||
|
@ -344,12 +345,9 @@ endif
|
|||
export COMPILER
|
||||
endif
|
||||
|
||||
# Look for make include files relative to root of kernel src
|
||||
MAKEFLAGS += --include-dir=$(srctree)
|
||||
|
||||
# We need some generic definitions (do not try to remake the file).
|
||||
$(srctree)/scripts/Kbuild.include: ;
|
||||
include $(srctree)/scripts/Kbuild.include
|
||||
scripts/Kbuild.include: ;
|
||||
include scripts/Kbuild.include
|
||||
|
||||
# Make variables (CC, etc...)
|
||||
AS = $(CROSS_COMPILE)as
|
||||
|
@ -533,7 +531,7 @@ ifeq ($(config-targets),1)
|
|||
# Read arch specific Makefile to set KBUILD_DEFCONFIG as needed.
|
||||
# KBUILD_DEFCONFIG may point out an alternative default configuration
|
||||
# used for 'make defconfig'
|
||||
include $(srctree)/arch/$(SRCARCH)/Makefile
|
||||
include arch/$(SRCARCH)/Makefile
|
||||
export KBUILD_DEFCONFIG KBUILD_KCONFIG
|
||||
|
||||
config: scripts_basic outputmakefile FORCE
|
||||
|
@ -609,7 +607,7 @@ endif # $(dot-config)
|
|||
# Defaults to vmlinux, but the arch makefile usually adds further targets
|
||||
all: vmlinux
|
||||
|
||||
include $(srctree)/arch/$(SRCARCH)/Makefile
|
||||
include arch/$(SRCARCH)/Makefile
|
||||
|
||||
KBUILD_CFLAGS += $(call cc-option,-fno-delete-null-pointer-checks,)
|
||||
|
||||
|
@ -782,8 +780,8 @@ ifeq ($(shell $(CONFIG_SHELL) $(srctree)/scripts/gcc-goto.sh $(CC)), y)
|
|||
KBUILD_AFLAGS += -DCC_HAVE_ASM_GOTO
|
||||
endif
|
||||
|
||||
include $(srctree)/scripts/Makefile.kasan
|
||||
include $(srctree)/scripts/Makefile.extrawarn
|
||||
include scripts/Makefile.kasan
|
||||
include scripts/Makefile.extrawarn
|
||||
|
||||
# Add user supplied CPPFLAGS, AFLAGS and CFLAGS as the last assignments
|
||||
KBUILD_CPPFLAGS += $(KCPPFLAGS)
|
||||
|
@ -1026,12 +1024,6 @@ headerdep:
|
|||
$(Q)find $(srctree)/include/ -name '*.h' | xargs --max-args 1 \
|
||||
$(srctree)/scripts/headerdep.pl -I$(srctree)/include
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
PHONY += depend dep
|
||||
depend dep:
|
||||
@echo '*** Warning: make $@ is unnecessary now.'
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Firmware install
|
||||
INSTALL_FW_PATH=$(INSTALL_MOD_PATH)/lib/firmware
|
||||
|
|
|
@ -32,7 +32,7 @@ config HAVE_OPROFILE
|
|||
|
||||
config OPROFILE_NMI_TIMER
|
||||
def_bool y
|
||||
depends on PERF_EVENTS && HAVE_PERF_EVENTS_NMI
|
||||
depends on PERF_EVENTS && HAVE_PERF_EVENTS_NMI && !PPC64
|
||||
|
||||
config KPROBES
|
||||
bool "Kprobes"
|
||||
|
|
|
@ -44,6 +44,7 @@ struct task_struct;
|
|||
extern unsigned long thread_saved_pc(struct task_struct *);
|
||||
|
||||
/* Do necessary setup to start up a newly executed thread. */
|
||||
struct pt_regs;
|
||||
extern void start_thread(struct pt_regs *, unsigned long, unsigned long);
|
||||
|
||||
/* Free all resources held by a thread. */
|
||||
|
|
|
@ -18,7 +18,6 @@ struct thread_info {
|
|||
unsigned int flags; /* low level flags */
|
||||
unsigned int ieee_state; /* see fpu.h */
|
||||
|
||||
struct exec_domain *exec_domain; /* execution domain */
|
||||
mm_segment_t addr_limit; /* thread address space */
|
||||
unsigned cpu; /* current CPU */
|
||||
int preempt_count; /* 0 => preemptable, <0 => BUG */
|
||||
|
@ -35,7 +34,6 @@ struct thread_info {
|
|||
#define INIT_THREAD_INFO(tsk) \
|
||||
{ \
|
||||
.task = &tsk, \
|
||||
.exec_domain = &default_exec_domain, \
|
||||
.addr_limit = KERNEL_DS, \
|
||||
.preempt_count = INIT_PREEMPT_COUNT, \
|
||||
}
|
||||
|
|
|
@ -43,7 +43,6 @@ struct thread_info {
|
|||
int preempt_count; /* 0 => preemptable, <0 => BUG */
|
||||
struct task_struct *task; /* main task structure */
|
||||
mm_segment_t addr_limit; /* thread address space */
|
||||
struct exec_domain *exec_domain;/* execution domain */
|
||||
__u32 cpu; /* current CPU */
|
||||
unsigned long thr_ptr; /* TLS ptr */
|
||||
};
|
||||
|
@ -56,7 +55,6 @@ struct thread_info {
|
|||
#define INIT_THREAD_INFO(tsk) \
|
||||
{ \
|
||||
.task = &tsk, \
|
||||
.exec_domain = &default_exec_domain, \
|
||||
.flags = 0, \
|
||||
.cpu = 0, \
|
||||
.preempt_count = INIT_PREEMPT_COUNT, \
|
||||
|
|
|
@ -171,18 +171,6 @@ static inline void __user *get_sigframe(struct ksignal *ksig,
|
|||
return frame;
|
||||
}
|
||||
|
||||
/*
|
||||
* translate the signal
|
||||
*/
|
||||
static inline int map_sig(int sig)
|
||||
{
|
||||
struct thread_info *thread = current_thread_info();
|
||||
if (thread->exec_domain && thread->exec_domain->signal_invmap
|
||||
&& sig < 32)
|
||||
sig = thread->exec_domain->signal_invmap[sig];
|
||||
return sig;
|
||||
}
|
||||
|
||||
static int
|
||||
setup_rt_frame(struct ksignal *ksig, sigset_t *set, struct pt_regs *regs)
|
||||
{
|
||||
|
@ -231,7 +219,7 @@ setup_rt_frame(struct ksignal *ksig, sigset_t *set, struct pt_regs *regs)
|
|||
return err;
|
||||
|
||||
/* #1 arg to the user Signal handler */
|
||||
regs->r0 = map_sig(ksig->sig);
|
||||
regs->r0 = ksig->sig;
|
||||
|
||||
/* setup PC of user space signal handler */
|
||||
regs->ret = (unsigned long)ksig->ka.sa.sa_handler;
|
||||
|
|
|
@ -52,7 +52,7 @@ static void show_callee_regs(struct callee_regs *cregs)
|
|||
print_reg_file(&(cregs->r13), 13);
|
||||
}
|
||||
|
||||
void print_task_path_n_nm(struct task_struct *tsk, char *buf)
|
||||
static void print_task_path_n_nm(struct task_struct *tsk, char *buf)
|
||||
{
|
||||
struct path path;
|
||||
char *path_nm = NULL;
|
||||
|
@ -77,7 +77,6 @@ void print_task_path_n_nm(struct task_struct *tsk, char *buf)
|
|||
done:
|
||||
pr_info("Path: %s\n", path_nm);
|
||||
}
|
||||
EXPORT_SYMBOL(print_task_path_n_nm);
|
||||
|
||||
static void show_faulting_vma(unsigned long address, char *buf)
|
||||
{
|
||||
|
|
|
@ -2133,16 +2133,6 @@ menu "Userspace binary formats"
|
|||
|
||||
source "fs/Kconfig.binfmt"
|
||||
|
||||
config ARTHUR
|
||||
tristate "RISC OS personality"
|
||||
depends on !AEABI
|
||||
help
|
||||
Say Y here to include the kernel code necessary if you want to run
|
||||
Acorn RISC OS/Arthur binaries under Linux. This code is still very
|
||||
experimental; if this sounds frightening, say N and sleep in peace.
|
||||
You can also say M here to compile this support as a module (which
|
||||
will be called arthur).
|
||||
|
||||
endmenu
|
||||
|
||||
menu "Power management options"
|
||||
|
@ -2175,6 +2165,9 @@ source "arch/arm/Kconfig.debug"
|
|||
source "security/Kconfig"
|
||||
|
||||
source "crypto/Kconfig"
|
||||
if CRYPTO
|
||||
source "arch/arm/crypto/Kconfig"
|
||||
endif
|
||||
|
||||
source "lib/Kconfig"
|
||||
|
||||
|
|
|
@ -12,7 +12,7 @@
|
|||
#
|
||||
|
||||
ifneq ($(MACHINE),)
|
||||
include $(srctree)/$(MACHINE)/Makefile.boot
|
||||
include $(MACHINE)/Makefile.boot
|
||||
endif
|
||||
|
||||
# Note: the following conditions must always be true:
|
||||
|
|
|
@ -12,7 +12,6 @@ CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
|
|||
CONFIG_FPE_NWFPE=y
|
||||
CONFIG_BINFMT_AOUT=m
|
||||
CONFIG_BINFMT_MISC=m
|
||||
CONFIG_ARTHUR=m
|
||||
CONFIG_NET=y
|
||||
CONFIG_PACKET=y
|
||||
CONFIG_UNIX=y
|
||||
|
|
|
@ -0,0 +1,130 @@
|
|||
|
||||
menuconfig ARM_CRYPTO
|
||||
bool "ARM Accelerated Cryptographic Algorithms"
|
||||
depends on ARM
|
||||
help
|
||||
Say Y here to choose from a selection of cryptographic algorithms
|
||||
implemented using ARM specific CPU features or instructions.
|
||||
|
||||
if ARM_CRYPTO
|
||||
|
||||
config CRYPTO_SHA1_ARM
|
||||
tristate "SHA1 digest algorithm (ARM-asm)"
|
||||
select CRYPTO_SHA1
|
||||
select CRYPTO_HASH
|
||||
help
|
||||
SHA-1 secure hash standard (FIPS 180-1/DFIPS 180-2) implemented
|
||||
using optimized ARM assembler.
|
||||
|
||||
config CRYPTO_SHA1_ARM_NEON
|
||||
tristate "SHA1 digest algorithm (ARM NEON)"
|
||||
depends on KERNEL_MODE_NEON
|
||||
select CRYPTO_SHA1_ARM
|
||||
select CRYPTO_SHA1
|
||||
select CRYPTO_HASH
|
||||
help
|
||||
SHA-1 secure hash standard (FIPS 180-1/DFIPS 180-2) implemented
|
||||
using optimized ARM NEON assembly, when NEON instructions are
|
||||
available.
|
||||
|
||||
config CRYPTO_SHA1_ARM_CE
|
||||
tristate "SHA1 digest algorithm (ARM v8 Crypto Extensions)"
|
||||
depends on KERNEL_MODE_NEON
|
||||
select CRYPTO_SHA1_ARM
|
||||
select CRYPTO_HASH
|
||||
help
|
||||
SHA-1 secure hash standard (FIPS 180-1/DFIPS 180-2) implemented
|
||||
using special ARMv8 Crypto Extensions.
|
||||
|
||||
config CRYPTO_SHA2_ARM_CE
|
||||
tristate "SHA-224/256 digest algorithm (ARM v8 Crypto Extensions)"
|
||||
depends on KERNEL_MODE_NEON
|
||||
select CRYPTO_SHA256_ARM
|
||||
select CRYPTO_HASH
|
||||
help
|
||||
SHA-256 secure hash standard (DFIPS 180-2) implemented
|
||||
using special ARMv8 Crypto Extensions.
|
||||
|
||||
config CRYPTO_SHA256_ARM
|
||||
tristate "SHA-224/256 digest algorithm (ARM-asm and NEON)"
|
||||
select CRYPTO_HASH
|
||||
depends on !CPU_V7M
|
||||
help
|
||||
SHA-256 secure hash standard (DFIPS 180-2) implemented
|
||||
using optimized ARM assembler and NEON, when available.
|
||||
|
||||
config CRYPTO_SHA512_ARM_NEON
|
||||
tristate "SHA384 and SHA512 digest algorithm (ARM NEON)"
|
||||
depends on KERNEL_MODE_NEON
|
||||
select CRYPTO_SHA512
|
||||
select CRYPTO_HASH
|
||||
help
|
||||
SHA-512 secure hash standard (DFIPS 180-2) implemented
|
||||
using ARM NEON instructions, when available.
|
||||
|
||||
This version of SHA implements a 512 bit hash with 256 bits of
|
||||
security against collision attacks.
|
||||
|
||||
This code also includes SHA-384, a 384 bit hash with 192 bits
|
||||
of security against collision attacks.
|
||||
|
||||
config CRYPTO_AES_ARM
|
||||
tristate "AES cipher algorithms (ARM-asm)"
|
||||
depends on ARM
|
||||
select CRYPTO_ALGAPI
|
||||
select CRYPTO_AES
|
||||
help
|
||||
Use optimized AES assembler routines for ARM platforms.
|
||||
|
||||
AES cipher algorithms (FIPS-197). AES uses the Rijndael
|
||||
algorithm.
|
||||
|
||||
Rijndael appears to be consistently a very good performer in
|
||||
both hardware and software across a wide range of computing
|
||||
environments regardless of its use in feedback or non-feedback
|
||||
modes. Its key setup time is excellent, and its key agility is
|
||||
good. Rijndael's very low memory requirements make it very well
|
||||
suited for restricted-space environments, in which it also
|
||||
demonstrates excellent performance. Rijndael's operations are
|
||||
among the easiest to defend against power and timing attacks.
|
||||
|
||||
The AES specifies three key sizes: 128, 192 and 256 bits
|
||||
|
||||
See <http://csrc.nist.gov/encryption/aes/> for more information.
|
||||
|
||||
config CRYPTO_AES_ARM_BS
|
||||
tristate "Bit sliced AES using NEON instructions"
|
||||
depends on KERNEL_MODE_NEON
|
||||
select CRYPTO_ALGAPI
|
||||
select CRYPTO_AES_ARM
|
||||
select CRYPTO_ABLK_HELPER
|
||||
help
|
||||
Use a faster and more secure NEON based implementation of AES in CBC,
|
||||
CTR and XTS modes
|
||||
|
||||
Bit sliced AES gives around 45% speedup on Cortex-A15 for CTR mode
|
||||
and for XTS mode encryption, CBC and XTS mode decryption speedup is
|
||||
around 25%. (CBC encryption speed is not affected by this driver.)
|
||||
This implementation does not rely on any lookup tables so it is
|
||||
believed to be invulnerable to cache timing attacks.
|
||||
|
||||
config CRYPTO_AES_ARM_CE
|
||||
tristate "Accelerated AES using ARMv8 Crypto Extensions"
|
||||
depends on KERNEL_MODE_NEON
|
||||
select CRYPTO_ALGAPI
|
||||
select CRYPTO_ABLK_HELPER
|
||||
help
|
||||
Use an implementation of AES in CBC, CTR and XTS modes that uses
|
||||
ARMv8 Crypto Extensions
|
||||
|
||||
config CRYPTO_GHASH_ARM_CE
|
||||
tristate "PMULL-accelerated GHASH using ARMv8 Crypto Extensions"
|
||||
depends on KERNEL_MODE_NEON
|
||||
select CRYPTO_HASH
|
||||
select CRYPTO_CRYPTD
|
||||
help
|
||||
Use an implementation of GHASH (used by the GCM AEAD chaining mode)
|
||||
that uses the 64x64 to 128 bit polynomial multiplication (vmull.p64)
|
||||
that is part of the ARMv8 Crypto Extensions
|
||||
|
||||
endif
|
|
@ -6,13 +6,35 @@ obj-$(CONFIG_CRYPTO_AES_ARM) += aes-arm.o
|
|||
obj-$(CONFIG_CRYPTO_AES_ARM_BS) += aes-arm-bs.o
|
||||
obj-$(CONFIG_CRYPTO_SHA1_ARM) += sha1-arm.o
|
||||
obj-$(CONFIG_CRYPTO_SHA1_ARM_NEON) += sha1-arm-neon.o
|
||||
obj-$(CONFIG_CRYPTO_SHA256_ARM) += sha256-arm.o
|
||||
obj-$(CONFIG_CRYPTO_SHA512_ARM_NEON) += sha512-arm-neon.o
|
||||
|
||||
ce-obj-$(CONFIG_CRYPTO_AES_ARM_CE) += aes-arm-ce.o
|
||||
ce-obj-$(CONFIG_CRYPTO_SHA1_ARM_CE) += sha1-arm-ce.o
|
||||
ce-obj-$(CONFIG_CRYPTO_SHA2_ARM_CE) += sha2-arm-ce.o
|
||||
ce-obj-$(CONFIG_CRYPTO_GHASH_ARM_CE) += ghash-arm-ce.o
|
||||
|
||||
ifneq ($(ce-obj-y)$(ce-obj-m),)
|
||||
ifeq ($(call as-instr,.fpu crypto-neon-fp-armv8,y,n),y)
|
||||
obj-y += $(ce-obj-y)
|
||||
obj-m += $(ce-obj-m)
|
||||
else
|
||||
$(warning These ARMv8 Crypto Extensions modules need binutils 2.23 or higher)
|
||||
$(warning $(ce-obj-y) $(ce-obj-m))
|
||||
endif
|
||||
endif
|
||||
|
||||
aes-arm-y := aes-armv4.o aes_glue.o
|
||||
aes-arm-bs-y := aesbs-core.o aesbs-glue.o
|
||||
sha1-arm-y := sha1-armv4-large.o sha1_glue.o
|
||||
sha1-arm-neon-y := sha1-armv7-neon.o sha1_neon_glue.o
|
||||
sha256-arm-neon-$(CONFIG_KERNEL_MODE_NEON) := sha256_neon_glue.o
|
||||
sha256-arm-y := sha256-core.o sha256_glue.o $(sha256-arm-neon-y)
|
||||
sha512-arm-neon-y := sha512-armv7-neon.o sha512_neon_glue.o
|
||||
sha1-arm-ce-y := sha1-ce-core.o sha1-ce-glue.o
|
||||
sha2-arm-ce-y := sha2-ce-core.o sha2-ce-glue.o
|
||||
aes-arm-ce-y := aes-ce-core.o aes-ce-glue.o
|
||||
ghash-arm-ce-y := ghash-ce-core.o ghash-ce-glue.o
|
||||
|
||||
quiet_cmd_perl = PERL $@
|
||||
cmd_perl = $(PERL) $(<) > $(@)
|
||||
|
@ -20,4 +42,7 @@ quiet_cmd_perl = PERL $@
|
|||
$(src)/aesbs-core.S_shipped: $(src)/bsaes-armv7.pl
|
||||
$(call cmd,perl)
|
||||
|
||||
.PRECIOUS: $(obj)/aesbs-core.S
|
||||
$(src)/sha256-core.S_shipped: $(src)/sha256-armv4.pl
|
||||
$(call cmd,perl)
|
||||
|
||||
.PRECIOUS: $(obj)/aesbs-core.S $(obj)/sha256-core.S
|
||||
|
|
|
@ -0,0 +1,518 @@
|
|||
/*
|
||||
* aes-ce-core.S - AES in CBC/CTR/XTS mode using ARMv8 Crypto Extensions
|
||||
*
|
||||
* Copyright (C) 2015 Linaro Ltd <ard.biesheuvel@linaro.org>
|
||||
*
|
||||
* This program is free software; you can redistribute it and/or modify
|
||||
* it under the terms of the GNU General Public License version 2 as
|
||||
* published by the Free Software Foundation.
|
||||
*/
|
||||
|
||||
#include <linux/linkage.h>
|
||||
#include <asm/assembler.h>
|
||||
|
||||
.text
|
||||
.fpu crypto-neon-fp-armv8
|
||||
.align 3
|
||||
|
||||
.macro enc_round, state, key
|
||||
aese.8 \state, \key
|
||||
aesmc.8 \state, \state
|
||||
.endm
|
||||
|
||||
.macro dec_round, state, key
|
||||
aesd.8 \state, \key
|
||||
aesimc.8 \state, \state
|
||||
.endm
|
||||
|
||||
.macro enc_dround, key1, key2
|
||||
enc_round q0, \key1
|
||||
enc_round q0, \key2
|
||||
.endm
|
||||
|
||||
.macro dec_dround, key1, key2
|
||||
dec_round q0, \key1
|
||||
dec_round q0, \key2
|
||||
.endm
|
||||
|
||||
.macro enc_fround, key1, key2, key3
|
||||
enc_round q0, \key1
|
||||
aese.8 q0, \key2
|
||||
veor q0, q0, \key3
|
||||
.endm
|
||||
|
||||
.macro dec_fround, key1, key2, key3
|
||||
dec_round q0, \key1
|
||||
aesd.8 q0, \key2
|
||||
veor q0, q0, \key3
|
||||
.endm
|
||||
|
||||
.macro enc_dround_3x, key1, key2
|
||||
enc_round q0, \key1
|
||||
enc_round q1, \key1
|
||||
enc_round q2, \key1
|
||||
enc_round q0, \key2
|
||||
enc_round q1, \key2
|
||||
enc_round q2, \key2
|
||||
.endm
|
||||
|
||||
.macro dec_dround_3x, key1, key2
|
||||
dec_round q0, \key1
|
||||
dec_round q1, \key1
|
||||
dec_round q2, \key1
|
||||
dec_round q0, \key2
|
||||
dec_round q1, \key2
|
||||
dec_round q2, \key2
|
||||
.endm
|
||||
|
||||
.macro enc_fround_3x, key1, key2, key3
|
||||
enc_round q0, \key1
|
||||
enc_round q1, \key1
|
||||
enc_round q2, \key1
|
||||
aese.8 q0, \key2
|
||||
aese.8 q1, \key2
|
||||
aese.8 q2, \key2
|
||||
veor q0, q0, \key3
|
||||
veor q1, q1, \key3
|
||||
veor q2, q2, \key3
|
||||
.endm
|
||||
|
||||
.macro dec_fround_3x, key1, key2, key3
|
||||
dec_round q0, \key1
|
||||
dec_round q1, \key1
|
||||
dec_round q2, \key1
|
||||
aesd.8 q0, \key2
|
||||
aesd.8 q1, \key2
|
||||
aesd.8 q2, \key2
|
||||
veor q0, q0, \key3
|
||||
veor q1, q1, \key3
|
||||
veor q2, q2, \key3
|
||||
.endm
|
||||
|
||||
.macro do_block, dround, fround
|
||||
cmp r3, #12 @ which key size?
|
||||
vld1.8 {q10-q11}, [ip]!
|
||||
\dround q8, q9
|
||||
vld1.8 {q12-q13}, [ip]!
|
||||
\dround q10, q11
|
||||
vld1.8 {q10-q11}, [ip]!
|
||||
\dround q12, q13
|
||||
vld1.8 {q12-q13}, [ip]!
|
||||
\dround q10, q11
|
||||
blo 0f @ AES-128: 10 rounds
|
||||
vld1.8 {q10-q11}, [ip]!
|
||||
beq 1f @ AES-192: 12 rounds
|
||||
\dround q12, q13
|
||||
vld1.8 {q12-q13}, [ip]
|
||||
\dround q10, q11
|
||||
0: \fround q12, q13, q14
|
||||
bx lr
|
||||
|
||||
1: \dround q12, q13
|
||||
\fround q10, q11, q14
|
||||
bx lr
|
||||
.endm
|
||||
|
||||
/*
|
||||
* Internal, non-AAPCS compliant functions that implement the core AES
|
||||
* transforms. These should preserve all registers except q0 - q2 and ip
|
||||
* Arguments:
|
||||
* q0 : first in/output block
|
||||
* q1 : second in/output block (_3x version only)
|
||||
* q2 : third in/output block (_3x version only)
|
||||
* q8 : first round key
|
||||
* q9 : secound round key
|
||||
* ip : address of 3rd round key
|
||||
* q14 : final round key
|
||||
* r3 : number of rounds
|
||||
*/
|
||||
.align 6
|
||||
aes_encrypt:
|
||||
add ip, r2, #32 @ 3rd round key
|
||||
.Laes_encrypt_tweak:
|
||||
do_block enc_dround, enc_fround
|
||||
ENDPROC(aes_encrypt)
|
||||
|
||||
.align 6
|
||||
aes_decrypt:
|
||||
add ip, r2, #32 @ 3rd round key
|
||||
do_block dec_dround, dec_fround
|
||||
ENDPROC(aes_decrypt)
|
||||
|
||||
.align 6
|
||||
aes_encrypt_3x:
|
||||
add ip, r2, #32 @ 3rd round key
|
||||
do_block enc_dround_3x, enc_fround_3x
|
||||
ENDPROC(aes_encrypt_3x)
|
||||
|
||||
.align 6
|
||||
aes_decrypt_3x:
|
||||
add ip, r2, #32 @ 3rd round key
|
||||
do_block dec_dround_3x, dec_fround_3x
|
||||
ENDPROC(aes_decrypt_3x)
|
||||
|
||||
.macro prepare_key, rk, rounds
|
||||
add ip, \rk, \rounds, lsl #4
|
||||
vld1.8 {q8-q9}, [\rk] @ load first 2 round keys
|
||||
vld1.8 {q14}, [ip] @ load last round key
|
||||
.endm
|
||||
|
||||
/*
|
||||
* aes_ecb_encrypt(u8 out[], u8 const in[], u8 const rk[], int rounds,
|
||||
* int blocks)
|
||||
* aes_ecb_decrypt(u8 out[], u8 const in[], u8 const rk[], int rounds,
|
||||
* int blocks)
|
||||
*/
|
||||
ENTRY(ce_aes_ecb_encrypt)
|
||||
push {r4, lr}
|
||||
ldr r4, [sp, #8]
|
||||
prepare_key r2, r3
|
||||
.Lecbencloop3x:
|
||||
subs r4, r4, #3
|
||||
bmi .Lecbenc1x
|
||||
vld1.8 {q0-q1}, [r1, :64]!
|
||||
vld1.8 {q2}, [r1, :64]!
|
||||
bl aes_encrypt_3x
|
||||
vst1.8 {q0-q1}, [r0, :64]!
|
||||
vst1.8 {q2}, [r0, :64]!
|
||||
b .Lecbencloop3x
|
||||
.Lecbenc1x:
|
||||
adds r4, r4, #3
|
||||
beq .Lecbencout
|
||||
.Lecbencloop:
|
||||
vld1.8 {q0}, [r1, :64]!
|
||||
bl aes_encrypt
|
||||
vst1.8 {q0}, [r0, :64]!
|
||||
subs r4, r4, #1
|
||||
bne .Lecbencloop
|
||||
.Lecbencout:
|
||||
pop {r4, pc}
|
||||
ENDPROC(ce_aes_ecb_encrypt)
|
||||
|
||||
ENTRY(ce_aes_ecb_decrypt)
|
||||
push {r4, lr}
|
||||
ldr r4, [sp, #8]
|
||||
prepare_key r2, r3
|
||||
.Lecbdecloop3x:
|
||||
subs r4, r4, #3
|
||||
bmi .Lecbdec1x
|
||||
vld1.8 {q0-q1}, [r1, :64]!
|
||||
vld1.8 {q2}, [r1, :64]!
|
||||
bl aes_decrypt_3x
|
||||
vst1.8 {q0-q1}, [r0, :64]!
|
||||
vst1.8 {q2}, [r0, :64]!
|
||||
b .Lecbdecloop3x
|
||||
.Lecbdec1x:
|
||||
adds r4, r4, #3
|
||||
beq .Lecbdecout
|
||||
.Lecbdecloop:
|
||||
vld1.8 {q0}, [r1, :64]!
|
||||
bl aes_decrypt
|
||||
vst1.8 {q0}, [r0, :64]!
|
||||
subs r4, r4, #1
|
||||
bne .Lecbdecloop
|
||||
.Lecbdecout:
|
||||
pop {r4, pc}
|
||||
ENDPROC(ce_aes_ecb_decrypt)
|
||||
|
||||
/*
|
||||
* aes_cbc_encrypt(u8 out[], u8 const in[], u8 const rk[], int rounds,
|
||||
* int blocks, u8 iv[])
|
||||
* aes_cbc_decrypt(u8 out[], u8 const in[], u8 const rk[], int rounds,
|
||||
* int blocks, u8 iv[])
|
||||
*/
|
||||
ENTRY(ce_aes_cbc_encrypt)
|
||||
push {r4-r6, lr}
|
||||
ldrd r4, r5, [sp, #16]
|
||||
vld1.8 {q0}, [r5]
|
||||
prepare_key r2, r3
|
||||
.Lcbcencloop:
|
||||
vld1.8 {q1}, [r1, :64]! @ get next pt block
|
||||
veor q0, q0, q1 @ ..and xor with iv
|
||||
bl aes_encrypt
|
||||
vst1.8 {q0}, [r0, :64]!
|
||||
subs r4, r4, #1
|
||||
bne .Lcbcencloop
|
||||
vst1.8 {q0}, [r5]
|
||||
pop {r4-r6, pc}
|
||||
ENDPROC(ce_aes_cbc_encrypt)
|
||||
|
||||
ENTRY(ce_aes_cbc_decrypt)
|
||||
push {r4-r6, lr}
|
||||
ldrd r4, r5, [sp, #16]
|
||||
vld1.8 {q6}, [r5] @ keep iv in q6
|
||||
prepare_key r2, r3
|
||||
.Lcbcdecloop3x:
|
||||
subs r4, r4, #3
|
||||
bmi .Lcbcdec1x
|
||||
vld1.8 {q0-q1}, [r1, :64]!
|
||||
vld1.8 {q2}, [r1, :64]!
|
||||
vmov q3, q0
|
||||
vmov q4, q1
|
||||
vmov q5, q2
|
||||
bl aes_decrypt_3x
|
||||
veor q0, q0, q6
|
||||
veor q1, q1, q3
|
||||
veor q2, q2, q4
|
||||
vmov q6, q5
|
||||
vst1.8 {q0-q1}, [r0, :64]!
|
||||
vst1.8 {q2}, [r0, :64]!
|
||||
b .Lcbcdecloop3x
|
||||
.Lcbcdec1x:
|
||||
adds r4, r4, #3
|
||||
beq .Lcbcdecout
|
||||
vmov q15, q14 @ preserve last round key
|
||||
.Lcbcdecloop:
|
||||
vld1.8 {q0}, [r1, :64]! @ get next ct block
|
||||
veor q14, q15, q6 @ combine prev ct with last key
|
||||
vmov q6, q0
|
||||
bl aes_decrypt
|
||||
vst1.8 {q0}, [r0, :64]!
|
||||
subs r4, r4, #1
|
||||
bne .Lcbcdecloop
|
||||
.Lcbcdecout:
|
||||
vst1.8 {q6}, [r5] @ keep iv in q6
|
||||
pop {r4-r6, pc}
|
||||
ENDPROC(ce_aes_cbc_decrypt)
|
||||
|
||||
/*
|
||||
* aes_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[], int rounds,
|
||||
* int blocks, u8 ctr[])
|
||||
*/
|
||||
ENTRY(ce_aes_ctr_encrypt)
|
||||
push {r4-r6, lr}
|
||||
ldrd r4, r5, [sp, #16]
|
||||
vld1.8 {q6}, [r5] @ load ctr
|
||||
prepare_key r2, r3
|
||||
vmov r6, s27 @ keep swabbed ctr in r6
|
||||
rev r6, r6
|
||||
cmn r6, r4 @ 32 bit overflow?
|
||||
bcs .Lctrloop
|
||||
.Lctrloop3x:
|
||||
subs r4, r4, #3
|
||||
bmi .Lctr1x
|
||||
add r6, r6, #1
|
||||
vmov q0, q6
|
||||
vmov q1, q6
|
||||
rev ip, r6
|
||||
add r6, r6, #1
|
||||
vmov q2, q6
|
||||
vmov s7, ip
|
||||
rev ip, r6
|
||||
add r6, r6, #1
|
||||
vmov s11, ip
|
||||
vld1.8 {q3-q4}, [r1, :64]!
|
||||
vld1.8 {q5}, [r1, :64]!
|
||||
bl aes_encrypt_3x
|
||||
veor q0, q0, q3
|
||||
veor q1, q1, q4
|
||||
veor q2, q2, q5
|
||||
rev ip, r6
|
||||
vst1.8 {q0-q1}, [r0, :64]!
|
||||
vst1.8 {q2}, [r0, :64]!
|
||||
vmov s27, ip
|
||||
b .Lctrloop3x
|
||||
.Lctr1x:
|
||||
adds r4, r4, #3
|
||||
beq .Lctrout
|
||||
.Lctrloop:
|
||||
vmov q0, q6
|
||||
bl aes_encrypt
|
||||
subs r4, r4, #1
|
||||
bmi .Lctrhalfblock @ blocks < 0 means 1/2 block
|
||||
vld1.8 {q3}, [r1, :64]!
|
||||
veor q3, q0, q3
|
||||
vst1.8 {q3}, [r0, :64]!
|
||||
|
||||
adds r6, r6, #1 @ increment BE ctr
|
||||
rev ip, r6
|
||||
vmov s27, ip
|
||||
bcs .Lctrcarry
|
||||
teq r4, #0
|
||||
bne .Lctrloop
|
||||
.Lctrout:
|
||||
vst1.8 {q6}, [r5]
|
||||
pop {r4-r6, pc}
|
||||
|
||||
.Lctrhalfblock:
|
||||
vld1.8 {d1}, [r1, :64]
|
||||
veor d0, d0, d1
|
||||
vst1.8 {d0}, [r0, :64]
|
||||
pop {r4-r6, pc}
|
||||
|
||||
.Lctrcarry:
|
||||
.irp sreg, s26, s25, s24
|
||||
vmov ip, \sreg @ load next word of ctr
|
||||
rev ip, ip @ ... to handle the carry
|
||||
adds ip, ip, #1
|
||||
rev ip, ip
|
||||
vmov \sreg, ip
|
||||
bcc 0f
|
||||
.endr
|
||||
0: teq r4, #0
|
||||
beq .Lctrout
|
||||
b .Lctrloop
|
||||
ENDPROC(ce_aes_ctr_encrypt)
|
||||
|
||||
/*
|
||||
* aes_xts_encrypt(u8 out[], u8 const in[], u8 const rk1[], int rounds,
|
||||
* int blocks, u8 iv[], u8 const rk2[], int first)
|
||||
* aes_xts_decrypt(u8 out[], u8 const in[], u8 const rk1[], int rounds,
|
||||
* int blocks, u8 iv[], u8 const rk2[], int first)
|
||||
*/
|
||||
|
||||
.macro next_tweak, out, in, const, tmp
|
||||
vshr.s64 \tmp, \in, #63
|
||||
vand \tmp, \tmp, \const
|
||||
vadd.u64 \out, \in, \in
|
||||
vext.8 \tmp, \tmp, \tmp, #8
|
||||
veor \out, \out, \tmp
|
||||
.endm
|
||||
|
||||
.align 3
|
||||
.Lxts_mul_x:
|
||||
.quad 1, 0x87
|
||||
|
||||
ce_aes_xts_init:
|
||||
vldr d14, .Lxts_mul_x
|
||||
vldr d15, .Lxts_mul_x + 8
|
||||
|
||||
ldrd r4, r5, [sp, #16] @ load args
|
||||
ldr r6, [sp, #28]
|
||||
vld1.8 {q0}, [r5] @ load iv
|
||||
teq r6, #1 @ start of a block?
|
||||
bxne lr
|
||||
|
||||
@ Encrypt the IV in q0 with the second AES key. This should only
|
||||
@ be done at the start of a block.
|
||||
ldr r6, [sp, #24] @ load AES key 2
|
||||
prepare_key r6, r3
|
||||
add ip, r6, #32 @ 3rd round key of key 2
|
||||
b .Laes_encrypt_tweak @ tail call
|
||||
ENDPROC(ce_aes_xts_init)
|
||||
|
||||
ENTRY(ce_aes_xts_encrypt)
|
||||
push {r4-r6, lr}
|
||||
|
||||
bl ce_aes_xts_init @ run shared prologue
|
||||
prepare_key r2, r3
|
||||
vmov q3, q0
|
||||
|
||||
teq r6, #0 @ start of a block?
|
||||
bne .Lxtsenc3x
|
||||
|
||||
.Lxtsencloop3x:
|
||||
next_tweak q3, q3, q7, q6
|
||||
.Lxtsenc3x:
|
||||
subs r4, r4, #3
|
||||
bmi .Lxtsenc1x
|
||||
vld1.8 {q0-q1}, [r1, :64]! @ get 3 pt blocks
|
||||
vld1.8 {q2}, [r1, :64]!
|
||||
next_tweak q4, q3, q7, q6
|
||||
veor q0, q0, q3
|
||||
next_tweak q5, q4, q7, q6
|
||||
veor q1, q1, q4
|
||||
veor q2, q2, q5
|
||||
bl aes_encrypt_3x
|
||||
veor q0, q0, q3
|
||||
veor q1, q1, q4
|
||||
veor q2, q2, q5
|
||||
vst1.8 {q0-q1}, [r0, :64]! @ write 3 ct blocks
|
||||
vst1.8 {q2}, [r0, :64]!
|
||||
vmov q3, q5
|
||||
teq r4, #0
|
||||
beq .Lxtsencout
|
||||
b .Lxtsencloop3x
|
||||
.Lxtsenc1x:
|
||||
adds r4, r4, #3
|
||||
beq .Lxtsencout
|
||||
.Lxtsencloop:
|
||||
vld1.8 {q0}, [r1, :64]!
|
||||
veor q0, q0, q3
|
||||
bl aes_encrypt
|
||||
veor q0, q0, q3
|
||||
vst1.8 {q0}, [r0, :64]!
|
||||
subs r4, r4, #1
|
||||
beq .Lxtsencout
|
||||
next_tweak q3, q3, q7, q6
|
||||
b .Lxtsencloop
|
||||
.Lxtsencout:
|
||||
vst1.8 {q3}, [r5]
|
||||
pop {r4-r6, pc}
|
||||
ENDPROC(ce_aes_xts_encrypt)
|
||||
|
||||
|
||||
ENTRY(ce_aes_xts_decrypt)
|
||||
push {r4-r6, lr}
|
||||
|
||||
bl ce_aes_xts_init @ run shared prologue
|
||||
prepare_key r2, r3
|
||||
vmov q3, q0
|
||||
|
||||
teq r6, #0 @ start of a block?
|
||||
bne .Lxtsdec3x
|
||||
|
||||
.Lxtsdecloop3x:
|
||||
next_tweak q3, q3, q7, q6
|
||||
.Lxtsdec3x:
|
||||
subs r4, r4, #3
|
||||
bmi .Lxtsdec1x
|
||||
vld1.8 {q0-q1}, [r1, :64]! @ get 3 ct blocks
|
||||
vld1.8 {q2}, [r1, :64]!
|
||||
next_tweak q4, q3, q7, q6
|
||||
veor q0, q0, q3
|
||||
next_tweak q5, q4, q7, q6
|
||||
veor q1, q1, q4
|
||||
veor q2, q2, q5
|
||||
bl aes_decrypt_3x
|
||||
veor q0, q0, q3
|
||||
veor q1, q1, q4
|
||||
veor q2, q2, q5
|
||||
vst1.8 {q0-q1}, [r0, :64]! @ write 3 pt blocks
|
||||
vst1.8 {q2}, [r0, :64]!
|
||||
vmov q3, q5
|
||||
teq r4, #0
|
||||
beq .Lxtsdecout
|
||||
b .Lxtsdecloop3x
|
||||
.Lxtsdec1x:
|
||||
adds r4, r4, #3
|
||||
beq .Lxtsdecout
|
||||
.Lxtsdecloop:
|
||||
vld1.8 {q0}, [r1, :64]!
|
||||
veor q0, q0, q3
|
||||
add ip, r2, #32 @ 3rd round key
|
||||
bl aes_decrypt
|
||||
veor q0, q0, q3
|
||||
vst1.8 {q0}, [r0, :64]!
|
||||
subs r4, r4, #1
|
||||
beq .Lxtsdecout
|
||||
next_tweak q3, q3, q7, q6
|
||||
b .Lxtsdecloop
|
||||
.Lxtsdecout:
|
||||
vst1.8 {q3}, [r5]
|
||||
pop {r4-r6, pc}
|
||||
ENDPROC(ce_aes_xts_decrypt)
|
||||
|
||||
/*
|
||||
* u32 ce_aes_sub(u32 input) - use the aese instruction to perform the
|
||||
* AES sbox substitution on each byte in
|
||||
* 'input'
|
||||
*/
|
||||
ENTRY(ce_aes_sub)
|
||||
vdup.32 q1, r0
|
||||
veor q0, q0, q0
|
||||
aese.8 q0, q1
|
||||
vmov r0, s0
|
||||
bx lr
|
||||
ENDPROC(ce_aes_sub)
|
||||
|
||||
/*
|
||||
* void ce_aes_invert(u8 *dst, u8 *src) - perform the Inverse MixColumns
|
||||
* operation on round key *src
|
||||
*/
|
||||
ENTRY(ce_aes_invert)
|
||||
vld1.8 {q0}, [r1]
|
||||
aesimc.8 q0, q0
|
||||
vst1.8 {q0}, [r0]
|
||||
bx lr
|
||||
ENDPROC(ce_aes_invert)
|
|
@ -0,0 +1,524 @@
|
|||
/*
|
||||
* aes-ce-glue.c - wrapper code for ARMv8 AES
|
||||
*
|
||||
* Copyright (C) 2015 Linaro Ltd <ard.biesheuvel@linaro.org>
|
||||
*
|
||||
* This program is free software; you can redistribute it and/or modify
|
||||
* it under the terms of the GNU General Public License version 2 as
|
||||
* published by the Free Software Foundation.
|
||||
*/
|
||||
|
||||
#include <asm/hwcap.h>
|
||||
#include <asm/neon.h>
|
||||
#include <asm/hwcap.h>
|
||||
#include <crypto/aes.h>
|
||||
#include <crypto/ablk_helper.h>
|
||||
#include <crypto/algapi.h>
|
||||
#include <linux/module.h>
|
||||
|
||||
MODULE_DESCRIPTION("AES-ECB/CBC/CTR/XTS using ARMv8 Crypto Extensions");
|
||||
MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
|
||||
MODULE_LICENSE("GPL v2");
|
||||
|
||||
/* defined in aes-ce-core.S */
|
||||
asmlinkage u32 ce_aes_sub(u32 input);
|
||||
asmlinkage void ce_aes_invert(void *dst, void *src);
|
||||
|
||||
asmlinkage void ce_aes_ecb_encrypt(u8 out[], u8 const in[], u8 const rk[],
|
||||
int rounds, int blocks);
|
||||
asmlinkage void ce_aes_ecb_decrypt(u8 out[], u8 const in[], u8 const rk[],
|
||||
int rounds, int blocks);
|
||||
|
||||
asmlinkage void ce_aes_cbc_encrypt(u8 out[], u8 const in[], u8 const rk[],
|
||||
int rounds, int blocks, u8 iv[]);
|
||||
asmlinkage void ce_aes_cbc_decrypt(u8 out[], u8 const in[], u8 const rk[],
|
||||
int rounds, int blocks, u8 iv[]);
|
||||
|
||||
asmlinkage void ce_aes_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[],
|
||||
int rounds, int blocks, u8 ctr[]);
|
||||
|
||||
asmlinkage void ce_aes_xts_encrypt(u8 out[], u8 const in[], u8 const rk1[],
|
||||
int rounds, int blocks, u8 iv[],
|
||||
u8 const rk2[], int first);
|
||||
asmlinkage void ce_aes_xts_decrypt(u8 out[], u8 const in[], u8 const rk1[],
|
||||
int rounds, int blocks, u8 iv[],
|
||||
u8 const rk2[], int first);
|
||||
|
||||
struct aes_block {
|
||||
u8 b[AES_BLOCK_SIZE];
|
||||
};
|
||||
|
||||
static int num_rounds(struct crypto_aes_ctx *ctx)
|
||||
{
|
||||
/*
|
||||
* # of rounds specified by AES:
|
||||
* 128 bit key 10 rounds
|
||||
* 192 bit key 12 rounds
|
||||
* 256 bit key 14 rounds
|
||||
* => n byte key => 6 + (n/4) rounds
|
||||
*/
|
||||
return 6 + ctx->key_length / 4;
|
||||
}
|
||||
|
||||
static int ce_aes_expandkey(struct crypto_aes_ctx *ctx, const u8 *in_key,
|
||||
unsigned int key_len)
|
||||
{
|
||||
/*
|
||||
* The AES key schedule round constants
|
||||
*/
|
||||
static u8 const rcon[] = {
|
||||
0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1b, 0x36,
|
||||
};
|
||||
|
||||
u32 kwords = key_len / sizeof(u32);
|
||||
struct aes_block *key_enc, *key_dec;
|
||||
int i, j;
|
||||
|
||||
if (key_len != AES_KEYSIZE_128 &&
|
||||
key_len != AES_KEYSIZE_192 &&
|
||||
key_len != AES_KEYSIZE_256)
|
||||
return -EINVAL;
|
||||
|
||||
memcpy(ctx->key_enc, in_key, key_len);
|
||||
ctx->key_length = key_len;
|
||||
|
||||
kernel_neon_begin();
|
||||
for (i = 0; i < sizeof(rcon); i++) {
|
||||
u32 *rki = ctx->key_enc + (i * kwords);
|
||||
u32 *rko = rki + kwords;
|
||||
|
||||
rko[0] = ror32(ce_aes_sub(rki[kwords - 1]), 8);
|
||||
rko[0] = rko[0] ^ rki[0] ^ rcon[i];
|
||||
rko[1] = rko[0] ^ rki[1];
|
||||
rko[2] = rko[1] ^ rki[2];
|
||||
rko[3] = rko[2] ^ rki[3];
|
||||
|
||||
if (key_len == AES_KEYSIZE_192) {
|
||||
if (i >= 7)
|
||||
break;
|
||||
rko[4] = rko[3] ^ rki[4];
|
||||
rko[5] = rko[4] ^ rki[5];
|
||||
} else if (key_len == AES_KEYSIZE_256) {
|
||||
if (i >= 6)
|
||||
break;
|
||||
rko[4] = ce_aes_sub(rko[3]) ^ rki[4];
|
||||
rko[5] = rko[4] ^ rki[5];
|
||||
rko[6] = rko[5] ^ rki[6];
|
||||
rko[7] = rko[6] ^ rki[7];
|
||||
}
|
||||
}
|
||||
|
||||
/*
|
||||
* Generate the decryption keys for the Equivalent Inverse Cipher.
|
||||
* This involves reversing the order of the round keys, and applying
|
||||
* the Inverse Mix Columns transformation on all but the first and
|
||||
* the last one.
|
||||
*/
|
||||
key_enc = (struct aes_block *)ctx->key_enc;
|
||||
key_dec = (struct aes_block *)ctx->key_dec;
|
||||
j = num_rounds(ctx);
|
||||
|
||||
key_dec[0] = key_enc[j];
|
||||
for (i = 1, j--; j > 0; i++, j--)
|
||||
ce_aes_invert(key_dec + i, key_enc + j);
|
||||
key_dec[i] = key_enc[0];
|
||||
|
||||
kernel_neon_end();
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int ce_aes_setkey(struct crypto_tfm *tfm, const u8 *in_key,
|
||||
unsigned int key_len)
|
||||
{
|
||||
struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
|
||||
int ret;
|
||||
|
||||
ret = ce_aes_expandkey(ctx, in_key, key_len);
|
||||
if (!ret)
|
||||
return 0;
|
||||
|
||||
tfm->crt_flags |= CRYPTO_TFM_RES_BAD_KEY_LEN;
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
struct crypto_aes_xts_ctx {
|
||||
struct crypto_aes_ctx key1;
|
||||
struct crypto_aes_ctx __aligned(8) key2;
|
||||
};
|
||||
|
||||
static int xts_set_key(struct crypto_tfm *tfm, const u8 *in_key,
|
||||
unsigned int key_len)
|
||||
{
|
||||
struct crypto_aes_xts_ctx *ctx = crypto_tfm_ctx(tfm);
|
||||
int ret;
|
||||
|
||||
ret = ce_aes_expandkey(&ctx->key1, in_key, key_len / 2);
|
||||
if (!ret)
|
||||
ret = ce_aes_expandkey(&ctx->key2, &in_key[key_len / 2],
|
||||
key_len / 2);
|
||||
if (!ret)
|
||||
return 0;
|
||||
|
||||
tfm->crt_flags |= CRYPTO_TFM_RES_BAD_KEY_LEN;
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
static int ecb_encrypt(struct blkcipher_desc *desc, struct scatterlist *dst,
|
||||
struct scatterlist *src, unsigned int nbytes)
|
||||
{
|
||||
struct crypto_aes_ctx *ctx = crypto_blkcipher_ctx(desc->tfm);
|
||||
struct blkcipher_walk walk;
|
||||
unsigned int blocks;
|
||||
int err;
|
||||
|
||||
desc->flags &= ~CRYPTO_TFM_REQ_MAY_SLEEP;
|
||||
blkcipher_walk_init(&walk, dst, src, nbytes);
|
||||
err = blkcipher_walk_virt(desc, &walk);
|
||||
|
||||
kernel_neon_begin();
|
||||
while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) {
|
||||
ce_aes_ecb_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
|
||||
(u8 *)ctx->key_enc, num_rounds(ctx), blocks);
|
||||
err = blkcipher_walk_done(desc, &walk,
|
||||
walk.nbytes % AES_BLOCK_SIZE);
|
||||
}
|
||||
kernel_neon_end();
|
||||
return err;
|
||||
}
|
||||
|
||||
static int ecb_decrypt(struct blkcipher_desc *desc, struct scatterlist *dst,
|
||||
struct scatterlist *src, unsigned int nbytes)
|
||||
{
|
||||
struct crypto_aes_ctx *ctx = crypto_blkcipher_ctx(desc->tfm);
|
||||
struct blkcipher_walk walk;
|
||||
unsigned int blocks;
|
||||
int err;
|
||||
|
||||
desc->flags &= ~CRYPTO_TFM_REQ_MAY_SLEEP;
|
||||
blkcipher_walk_init(&walk, dst, src, nbytes);
|
||||
err = blkcipher_walk_virt(desc, &walk);
|
||||
|
||||
kernel_neon_begin();
|
||||
while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) {
|
||||
ce_aes_ecb_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
|
||||
(u8 *)ctx->key_dec, num_rounds(ctx), blocks);
|
||||
err = blkcipher_walk_done(desc, &walk,
|
||||
walk.nbytes % AES_BLOCK_SIZE);
|
||||
}
|
||||
kernel_neon_end();
|
||||
return err;
|
||||
}
|
||||
|
||||
static int cbc_encrypt(struct blkcipher_desc *desc, struct scatterlist *dst,
|
||||
struct scatterlist *src, unsigned int nbytes)
|
||||
{
|
||||
struct crypto_aes_ctx *ctx = crypto_blkcipher_ctx(desc->tfm);
|
||||
struct blkcipher_walk walk;
|
||||
unsigned int blocks;
|
||||
int err;
|
||||
|
||||
desc->flags &= ~CRYPTO_TFM_REQ_MAY_SLEEP;
|
||||
blkcipher_walk_init(&walk, dst, src, nbytes);
|
||||
err = blkcipher_walk_virt(desc, &walk);
|
||||
|
||||
kernel_neon_begin();
|
||||
while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) {
|
||||
ce_aes_cbc_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
|
||||
(u8 *)ctx->key_enc, num_rounds(ctx), blocks,
|
||||
walk.iv);
|
||||
err = blkcipher_walk_done(desc, &walk,
|
||||
walk.nbytes % AES_BLOCK_SIZE);
|
||||
}
|
||||
kernel_neon_end();
|
||||
return err;
|
||||
}
|
||||
|
||||
static int cbc_decrypt(struct blkcipher_desc *desc, struct scatterlist *dst,
|
||||
struct scatterlist *src, unsigned int nbytes)
|
||||
{
|
||||
struct crypto_aes_ctx *ctx = crypto_blkcipher_ctx(desc->tfm);
|
||||
struct blkcipher_walk walk;
|
||||
unsigned int blocks;
|
||||
int err;
|
||||
|
||||
desc->flags &= ~CRYPTO_TFM_REQ_MAY_SLEEP;
|
||||
blkcipher_walk_init(&walk, dst, src, nbytes);
|
||||
err = blkcipher_walk_virt(desc, &walk);
|
||||
|
||||
kernel_neon_begin();
|
||||
while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) {
|
||||
ce_aes_cbc_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
|
||||
(u8 *)ctx->key_dec, num_rounds(ctx), blocks,
|
||||
walk.iv);
|
||||
err = blkcipher_walk_done(desc, &walk,
|
||||
walk.nbytes % AES_BLOCK_SIZE);
|
||||
}
|
||||
kernel_neon_end();
|
||||
return err;
|
||||
}
|
||||
|
||||
static int ctr_encrypt(struct blkcipher_desc *desc, struct scatterlist *dst,
|
||||
struct scatterlist *src, unsigned int nbytes)
|
||||
{
|
||||
struct crypto_aes_ctx *ctx = crypto_blkcipher_ctx(desc->tfm);
|
||||
struct blkcipher_walk walk;
|
||||
int err, blocks;
|
||||
|
||||
desc->flags &= ~CRYPTO_TFM_REQ_MAY_SLEEP;
|
||||
blkcipher_walk_init(&walk, dst, src, nbytes);
|
||||
err = blkcipher_walk_virt_block(desc, &walk, AES_BLOCK_SIZE);
|
||||
|
||||
kernel_neon_begin();
|
||||
while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) {
|
||||
ce_aes_ctr_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
|
||||
(u8 *)ctx->key_enc, num_rounds(ctx), blocks,
|
||||
walk.iv);
|
||||
nbytes -= blocks * AES_BLOCK_SIZE;
|
||||
if (nbytes && nbytes == walk.nbytes % AES_BLOCK_SIZE)
|
||||
break;
|
||||
err = blkcipher_walk_done(desc, &walk,
|
||||
walk.nbytes % AES_BLOCK_SIZE);
|
||||
}
|
||||
if (nbytes) {
|
||||
u8 *tdst = walk.dst.virt.addr + blocks * AES_BLOCK_SIZE;
|
||||
u8 *tsrc = walk.src.virt.addr + blocks * AES_BLOCK_SIZE;
|
||||
u8 __aligned(8) tail[AES_BLOCK_SIZE];
|
||||
|
||||
/*
|
||||
* Minimum alignment is 8 bytes, so if nbytes is <= 8, we need
|
||||
* to tell aes_ctr_encrypt() to only read half a block.
|
||||
*/
|
||||
blocks = (nbytes <= 8) ? -1 : 1;
|
||||
|
||||
ce_aes_ctr_encrypt(tail, tsrc, (u8 *)ctx->key_enc,
|
||||
num_rounds(ctx), blocks, walk.iv);
|
||||
memcpy(tdst, tail, nbytes);
|
||||
err = blkcipher_walk_done(desc, &walk, 0);
|
||||
}
|
||||
kernel_neon_end();
|
||||
|
||||
return err;
|
||||
}
|
||||
|
||||
static int xts_encrypt(struct blkcipher_desc *desc, struct scatterlist *dst,
|
||||
struct scatterlist *src, unsigned int nbytes)
|
||||
{
|
||||
struct crypto_aes_xts_ctx *ctx = crypto_blkcipher_ctx(desc->tfm);
|
||||
int err, first, rounds = num_rounds(&ctx->key1);
|
||||
struct blkcipher_walk walk;
|
||||
unsigned int blocks;
|
||||
|
||||
desc->flags &= ~CRYPTO_TFM_REQ_MAY_SLEEP;
|
||||
blkcipher_walk_init(&walk, dst, src, nbytes);
|
||||
err = blkcipher_walk_virt(desc, &walk);
|
||||
|
||||
kernel_neon_begin();
|
||||
for (first = 1; (blocks = (walk.nbytes / AES_BLOCK_SIZE)); first = 0) {
|
||||
ce_aes_xts_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
|
||||
(u8 *)ctx->key1.key_enc, rounds, blocks,
|
||||
walk.iv, (u8 *)ctx->key2.key_enc, first);
|
||||
err = blkcipher_walk_done(desc, &walk,
|
||||
walk.nbytes % AES_BLOCK_SIZE);
|
||||
}
|
||||
kernel_neon_end();
|
||||
|
||||
return err;
|
||||
}
|
||||
|
||||
static int xts_decrypt(struct blkcipher_desc *desc, struct scatterlist *dst,
|
||||
struct scatterlist *src, unsigned int nbytes)
|
||||
{
|
||||
struct crypto_aes_xts_ctx *ctx = crypto_blkcipher_ctx(desc->tfm);
|
||||
int err, first, rounds = num_rounds(&ctx->key1);
|
||||
struct blkcipher_walk walk;
|
||||
unsigned int blocks;
|
||||
|
||||
desc->flags &= ~CRYPTO_TFM_REQ_MAY_SLEEP;
|
||||
blkcipher_walk_init(&walk, dst, src, nbytes);
|
||||
err = blkcipher_walk_virt(desc, &walk);
|
||||
|
||||
kernel_neon_begin();
|
||||
for (first = 1; (blocks = (walk.nbytes / AES_BLOCK_SIZE)); first = 0) {
|
||||
ce_aes_xts_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
|
||||
(u8 *)ctx->key1.key_dec, rounds, blocks,
|
||||
walk.iv, (u8 *)ctx->key2.key_enc, first);
|
||||
err = blkcipher_walk_done(desc, &walk,
|
||||
walk.nbytes % AES_BLOCK_SIZE);
|
||||
}
|
||||
kernel_neon_end();
|
||||
|
||||
return err;
|
||||
}
|
||||
|
||||
static struct crypto_alg aes_algs[] = { {
|
||||
.cra_name = "__ecb-aes-ce",
|
||||
.cra_driver_name = "__driver-ecb-aes-ce",
|
||||
.cra_priority = 0,
|
||||
.cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER |
|
||||
CRYPTO_ALG_INTERNAL,
|
||||
.cra_blocksize = AES_BLOCK_SIZE,
|
||||
.cra_ctxsize = sizeof(struct crypto_aes_ctx),
|
||||
.cra_alignmask = 7,
|
||||
.cra_type = &crypto_blkcipher_type,
|
||||
.cra_module = THIS_MODULE,
|
||||
.cra_blkcipher = {
|
||||
.min_keysize = AES_MIN_KEY_SIZE,
|
||||
.max_keysize = AES_MAX_KEY_SIZE,
|
||||
.ivsize = AES_BLOCK_SIZE,
|
||||
.setkey = ce_aes_setkey,
|
||||
.encrypt = ecb_encrypt,
|
||||
.decrypt = ecb_decrypt,
|
||||
},
|
||||
}, {
|
||||
.cra_name = "__cbc-aes-ce",
|
||||
.cra_driver_name = "__driver-cbc-aes-ce",
|
||||
.cra_priority = 0,
|
||||
.cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER |
|
||||
CRYPTO_ALG_INTERNAL,
|
||||
.cra_blocksize = AES_BLOCK_SIZE,
|
||||
.cra_ctxsize = sizeof(struct crypto_aes_ctx),
|
||||
.cra_alignmask = 7,
|
||||
.cra_type = &crypto_blkcipher_type,
|
||||
.cra_module = THIS_MODULE,
|
||||
.cra_blkcipher = {
|
||||
.min_keysize = AES_MIN_KEY_SIZE,
|
||||
.max_keysize = AES_MAX_KEY_SIZE,
|
||||
.ivsize = AES_BLOCK_SIZE,
|
||||
.setkey = ce_aes_setkey,
|
||||
.encrypt = cbc_encrypt,
|
||||
.decrypt = cbc_decrypt,
|
||||
},
|
||||
}, {
|
||||
.cra_name = "__ctr-aes-ce",
|
||||
.cra_driver_name = "__driver-ctr-aes-ce",
|
||||
.cra_priority = 0,
|
||||
.cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER |
|
||||
CRYPTO_ALG_INTERNAL,
|
||||
.cra_blocksize = 1,
|
||||
.cra_ctxsize = sizeof(struct crypto_aes_ctx),
|
||||
.cra_alignmask = 7,
|
||||
.cra_type = &crypto_blkcipher_type,
|
||||
.cra_module = THIS_MODULE,
|
||||
.cra_blkcipher = {
|
||||
.min_keysize = AES_MIN_KEY_SIZE,
|
||||
.max_keysize = AES_MAX_KEY_SIZE,
|
||||
.ivsize = AES_BLOCK_SIZE,
|
||||
.setkey = ce_aes_setkey,
|
||||
.encrypt = ctr_encrypt,
|
||||
.decrypt = ctr_encrypt,
|
||||
},
|
||||
}, {
|
||||
.cra_name = "__xts-aes-ce",
|
||||
.cra_driver_name = "__driver-xts-aes-ce",
|
||||
.cra_priority = 0,
|
||||
.cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER |
|
||||
CRYPTO_ALG_INTERNAL,
|
||||
.cra_blocksize = AES_BLOCK_SIZE,
|
||||
.cra_ctxsize = sizeof(struct crypto_aes_xts_ctx),
|
||||
.cra_alignmask = 7,
|
||||
.cra_type = &crypto_blkcipher_type,
|
||||
.cra_module = THIS_MODULE,
|
||||
.cra_blkcipher = {
|
||||
.min_keysize = 2 * AES_MIN_KEY_SIZE,
|
||||
.max_keysize = 2 * AES_MAX_KEY_SIZE,
|
||||
.ivsize = AES_BLOCK_SIZE,
|
||||
.setkey = xts_set_key,
|
||||
.encrypt = xts_encrypt,
|
||||
.decrypt = xts_decrypt,
|
||||
},
|
||||
}, {
|
||||
.cra_name = "ecb(aes)",
|
||||
.cra_driver_name = "ecb-aes-ce",
|
||||
.cra_priority = 300,
|
||||
.cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER|CRYPTO_ALG_ASYNC,
|
||||
.cra_blocksize = AES_BLOCK_SIZE,
|
||||
.cra_ctxsize = sizeof(struct async_helper_ctx),
|
||||
.cra_alignmask = 7,
|
||||
.cra_type = &crypto_ablkcipher_type,
|
||||
.cra_module = THIS_MODULE,
|
||||
.cra_init = ablk_init,
|
||||
.cra_exit = ablk_exit,
|
||||
.cra_ablkcipher = {
|
||||
.min_keysize = AES_MIN_KEY_SIZE,
|
||||
.max_keysize = AES_MAX_KEY_SIZE,
|
||||
.ivsize = AES_BLOCK_SIZE,
|
||||
.setkey = ablk_set_key,
|
||||
.encrypt = ablk_encrypt,
|
||||
.decrypt = ablk_decrypt,
|
||||
}
|
||||
}, {
|
||||
.cra_name = "cbc(aes)",
|
||||
.cra_driver_name = "cbc-aes-ce",
|
||||
.cra_priority = 300,
|
||||
.cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER|CRYPTO_ALG_ASYNC,
|
||||
.cra_blocksize = AES_BLOCK_SIZE,
|
||||
.cra_ctxsize = sizeof(struct async_helper_ctx),
|
||||
.cra_alignmask = 7,
|
||||
.cra_type = &crypto_ablkcipher_type,
|
||||
.cra_module = THIS_MODULE,
|
||||
.cra_init = ablk_init,
|
||||
.cra_exit = ablk_exit,
|
||||
.cra_ablkcipher = {
|
||||
.min_keysize = AES_MIN_KEY_SIZE,
|
||||
.max_keysize = AES_MAX_KEY_SIZE,
|
||||
.ivsize = AES_BLOCK_SIZE,
|
||||
.setkey = ablk_set_key,
|
||||
.encrypt = ablk_encrypt,
|
||||
.decrypt = ablk_decrypt,
|
||||
}
|
||||
}, {
|
||||
.cra_name = "ctr(aes)",
|
||||
.cra_driver_name = "ctr-aes-ce",
|
||||
.cra_priority = 300,
|
||||
.cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER|CRYPTO_ALG_ASYNC,
|
||||
.cra_blocksize = 1,
|
||||
.cra_ctxsize = sizeof(struct async_helper_ctx),
|
||||
.cra_alignmask = 7,
|
||||
.cra_type = &crypto_ablkcipher_type,
|
||||
.cra_module = THIS_MODULE,
|
||||
.cra_init = ablk_init,
|
||||
.cra_exit = ablk_exit,
|
||||
.cra_ablkcipher = {
|
||||
.min_keysize = AES_MIN_KEY_SIZE,
|
||||
.max_keysize = AES_MAX_KEY_SIZE,
|
||||
.ivsize = AES_BLOCK_SIZE,
|
||||
.setkey = ablk_set_key,
|
||||
.encrypt = ablk_encrypt,
|
||||
.decrypt = ablk_decrypt,
|
||||
}
|
||||
}, {
|
||||
.cra_name = "xts(aes)",
|
||||
.cra_driver_name = "xts-aes-ce",
|
||||
.cra_priority = 300,
|
||||
.cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER|CRYPTO_ALG_ASYNC,
|
||||
.cra_blocksize = AES_BLOCK_SIZE,
|
||||
.cra_ctxsize = sizeof(struct async_helper_ctx),
|
||||
.cra_alignmask = 7,
|
||||
.cra_type = &crypto_ablkcipher_type,
|
||||
.cra_module = THIS_MODULE,
|
||||
.cra_init = ablk_init,
|
||||
.cra_exit = ablk_exit,
|
||||
.cra_ablkcipher = {
|
||||
.min_keysize = 2 * AES_MIN_KEY_SIZE,
|
||||
.max_keysize = 2 * AES_MAX_KEY_SIZE,
|
||||
.ivsize = AES_BLOCK_SIZE,
|
||||
.setkey = ablk_set_key,
|
||||
.encrypt = ablk_encrypt,
|
||||
.decrypt = ablk_decrypt,
|
||||
}
|
||||
} };
|
||||
|
||||
static int __init aes_init(void)
|
||||
{
|
||||
if (!(elf_hwcap2 & HWCAP2_AES))
|
||||
return -ENODEV;
|
||||
return crypto_register_algs(aes_algs, ARRAY_SIZE(aes_algs));
|
||||
}
|
||||
|
||||
static void __exit aes_exit(void)
|
||||
{
|
||||
crypto_unregister_algs(aes_algs, ARRAY_SIZE(aes_algs));
|
||||
}
|
||||
|
||||
module_init(aes_init);
|
||||
module_exit(aes_exit);
|
|
@ -301,7 +301,8 @@ static struct crypto_alg aesbs_algs[] = { {
|
|||
.cra_name = "__cbc-aes-neonbs",
|
||||
.cra_driver_name = "__driver-cbc-aes-neonbs",
|
||||
.cra_priority = 0,
|
||||
.cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER,
|
||||
.cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER |
|
||||
CRYPTO_ALG_INTERNAL,
|
||||
.cra_blocksize = AES_BLOCK_SIZE,
|
||||
.cra_ctxsize = sizeof(struct aesbs_cbc_ctx),
|
||||
.cra_alignmask = 7,
|
||||
|
@ -319,7 +320,8 @@ static struct crypto_alg aesbs_algs[] = { {
|
|||
.cra_name = "__ctr-aes-neonbs",
|
||||
.cra_driver_name = "__driver-ctr-aes-neonbs",
|
||||
.cra_priority = 0,
|
||||
.cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER,
|
||||
.cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER |
|
||||
CRYPTO_ALG_INTERNAL,
|
||||
.cra_blocksize = 1,
|
||||
.cra_ctxsize = sizeof(struct aesbs_ctr_ctx),
|
||||
.cra_alignmask = 7,
|
||||
|
@ -337,7 +339,8 @@ static struct crypto_alg aesbs_algs[] = { {
|
|||
.cra_name = "__xts-aes-neonbs",
|
||||
.cra_driver_name = "__driver-xts-aes-neonbs",
|
||||
.cra_priority = 0,
|
||||
.cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER,
|
||||
.cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER |
|
||||
CRYPTO_ALG_INTERNAL,
|
||||
.cra_blocksize = AES_BLOCK_SIZE,
|
||||
.cra_ctxsize = sizeof(struct aesbs_xts_ctx),
|
||||
.cra_alignmask = 7,
|
||||
|
|
|
@ -0,0 +1,94 @@
|
|||
/*
|
||||
* Accelerated GHASH implementation with ARMv8 vmull.p64 instructions.
|
||||
*
|
||||
* Copyright (C) 2015 Linaro Ltd. <ard.biesheuvel@linaro.org>
|
||||
*
|
||||
* This program is free software; you can redistribute it and/or modify it
|
||||
* under the terms of the GNU General Public License version 2 as published
|
||||
* by the Free Software Foundation.
|
||||
*/
|
||||
|
||||
#include <linux/linkage.h>
|
||||
#include <asm/assembler.h>
|
||||
|
||||
SHASH .req q0
|
||||
SHASH2 .req q1
|
||||
T1 .req q2
|
||||
T2 .req q3
|
||||
MASK .req q4
|
||||
XL .req q5
|
||||
XM .req q6
|
||||
XH .req q7
|
||||
IN1 .req q7
|
||||
|
||||
SHASH_L .req d0
|
||||
SHASH_H .req d1
|
||||
SHASH2_L .req d2
|
||||
T1_L .req d4
|
||||
MASK_L .req d8
|
||||
XL_L .req d10
|
||||
XL_H .req d11
|
||||
XM_L .req d12
|
||||
XM_H .req d13
|
||||
XH_L .req d14
|
||||
|
||||
.text
|
||||
.fpu crypto-neon-fp-armv8
|
||||
|
||||
/*
|
||||
* void pmull_ghash_update(int blocks, u64 dg[], const char *src,
|
||||
* struct ghash_key const *k, const char *head)
|
||||
*/
|
||||
ENTRY(pmull_ghash_update)
|
||||
vld1.64 {SHASH}, [r3]
|
||||
vld1.64 {XL}, [r1]
|
||||
vmov.i8 MASK, #0xe1
|
||||
vext.8 SHASH2, SHASH, SHASH, #8
|
||||
vshl.u64 MASK, MASK, #57
|
||||
veor SHASH2, SHASH2, SHASH
|
||||
|
||||
/* do the head block first, if supplied */
|
||||
ldr ip, [sp]
|
||||
teq ip, #0
|
||||
beq 0f
|
||||
vld1.64 {T1}, [ip]
|
||||
teq r0, #0
|
||||
b 1f
|
||||
|
||||
0: vld1.64 {T1}, [r2]!
|
||||
subs r0, r0, #1
|
||||
|
||||
1: /* multiply XL by SHASH in GF(2^128) */
|
||||
#ifndef CONFIG_CPU_BIG_ENDIAN
|
||||
vrev64.8 T1, T1
|
||||
#endif
|
||||
vext.8 T2, XL, XL, #8
|
||||
vext.8 IN1, T1, T1, #8
|
||||
veor T1, T1, T2
|
||||
veor XL, XL, IN1
|
||||
|
||||
vmull.p64 XH, SHASH_H, XL_H @ a1 * b1
|
||||
veor T1, T1, XL
|
||||
vmull.p64 XL, SHASH_L, XL_L @ a0 * b0
|
||||
vmull.p64 XM, SHASH2_L, T1_L @ (a1 + a0)(b1 + b0)
|
||||
|
||||
vext.8 T1, XL, XH, #8
|
||||
veor T2, XL, XH
|
||||
veor XM, XM, T1
|
||||
veor XM, XM, T2
|
||||
vmull.p64 T2, XL_L, MASK_L
|
||||
|
||||
vmov XH_L, XM_H
|
||||
vmov XM_H, XL_L
|
||||
|
||||
veor XL, XM, T2
|
||||
vext.8 T2, XL, XL, #8
|
||||
vmull.p64 XL, XL_L, MASK_L
|
||||
veor T2, T2, XH
|
||||
veor XL, XL, T2
|
||||
|
||||
bne 0b
|
||||
|
||||
vst1.64 {XL}, [r1]
|
||||
bx lr
|
||||
ENDPROC(pmull_ghash_update)
|
|
@ -0,0 +1,320 @@
|
|||
/*
|
||||
* Accelerated GHASH implementation with ARMv8 vmull.p64 instructions.
|
||||
*
|
||||
* Copyright (C) 2015 Linaro Ltd. <ard.biesheuvel@linaro.org>
|
||||
*
|
||||
* This program is free software; you can redistribute it and/or modify it
|
||||
* under the terms of the GNU General Public License version 2 as published
|
||||
* by the Free Software Foundation.
|
||||
*/
|
||||
|
||||
#include <asm/hwcap.h>
|
||||
#include <asm/neon.h>
|
||||
#include <asm/simd.h>
|
||||
#include <asm/unaligned.h>
|
||||
#include <crypto/cryptd.h>
|
||||
#include <crypto/internal/hash.h>
|
||||
#include <crypto/gf128mul.h>
|
||||
#include <linux/crypto.h>
|
||||
#include <linux/module.h>
|
||||
|
||||
MODULE_DESCRIPTION("GHASH secure hash using ARMv8 Crypto Extensions");
|
||||
MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
|
||||
MODULE_LICENSE("GPL v2");
|
||||
|
||||
#define GHASH_BLOCK_SIZE 16
|
||||
#define GHASH_DIGEST_SIZE 16
|
||||
|
||||
struct ghash_key {
|
||||
u64 a;
|
||||
u64 b;
|
||||
};
|
||||
|
||||
struct ghash_desc_ctx {
|
||||
u64 digest[GHASH_DIGEST_SIZE/sizeof(u64)];
|
||||
u8 buf[GHASH_BLOCK_SIZE];
|
||||
u32 count;
|
||||
};
|
||||
|
||||
struct ghash_async_ctx {
|
||||
struct cryptd_ahash *cryptd_tfm;
|
||||
};
|
||||
|
||||
asmlinkage void pmull_ghash_update(int blocks, u64 dg[], const char *src,
|
||||
struct ghash_key const *k, const char *head);
|
||||
|
||||
static int ghash_init(struct shash_desc *desc)
|
||||
{
|
||||
struct ghash_desc_ctx *ctx = shash_desc_ctx(desc);
|
||||
|
||||
*ctx = (struct ghash_desc_ctx){};
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int ghash_update(struct shash_desc *desc, const u8 *src,
|
||||
unsigned int len)
|
||||
{
|
||||
struct ghash_desc_ctx *ctx = shash_desc_ctx(desc);
|
||||
unsigned int partial = ctx->count % GHASH_BLOCK_SIZE;
|
||||
|
||||
ctx->count += len;
|
||||
|
||||
if ((partial + len) >= GHASH_BLOCK_SIZE) {
|
||||
struct ghash_key *key = crypto_shash_ctx(desc->tfm);
|
||||
int blocks;
|
||||
|
||||
if (partial) {
|
||||
int p = GHASH_BLOCK_SIZE - partial;
|
||||
|
||||
memcpy(ctx->buf + partial, src, p);
|
||||
src += p;
|
||||
len -= p;
|
||||
}
|
||||
|
||||
blocks = len / GHASH_BLOCK_SIZE;
|
||||
len %= GHASH_BLOCK_SIZE;
|
||||
|
||||
kernel_neon_begin();
|
||||
pmull_ghash_update(blocks, ctx->digest, src, key,
|
||||
partial ? ctx->buf : NULL);
|
||||
kernel_neon_end();
|
||||
src += blocks * GHASH_BLOCK_SIZE;
|
||||
partial = 0;
|
||||
}
|
||||
if (len)
|
||||
memcpy(ctx->buf + partial, src, len);
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int ghash_final(struct shash_desc *desc, u8 *dst)
|
||||
{
|
||||
struct ghash_desc_ctx *ctx = shash_desc_ctx(desc);
|
||||
unsigned int partial = ctx->count % GHASH_BLOCK_SIZE;
|
||||
|
||||
if (partial) {
|
||||
struct ghash_key *key = crypto_shash_ctx(desc->tfm);
|
||||
|
||||
memset(ctx->buf + partial, 0, GHASH_BLOCK_SIZE - partial);
|
||||
kernel_neon_begin();
|
||||
pmull_ghash_update(1, ctx->digest, ctx->buf, key, NULL);
|
||||
kernel_neon_end();
|
||||
}
|
||||
put_unaligned_be64(ctx->digest[1], dst);
|
||||
put_unaligned_be64(ctx->digest[0], dst + 8);
|
||||
|
||||
*ctx = (struct ghash_desc_ctx){};
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int ghash_setkey(struct crypto_shash *tfm,
|
||||
const u8 *inkey, unsigned int keylen)
|
||||
{
|
||||
struct ghash_key *key = crypto_shash_ctx(tfm);
|
||||
u64 a, b;
|
||||
|
||||
if (keylen != GHASH_BLOCK_SIZE) {
|
||||
crypto_shash_set_flags(tfm, CRYPTO_TFM_RES_BAD_KEY_LEN);
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
/* perform multiplication by 'x' in GF(2^128) */
|
||||
b = get_unaligned_be64(inkey);
|
||||
a = get_unaligned_be64(inkey + 8);
|
||||
|
||||
key->a = (a << 1) | (b >> 63);
|
||||
key->b = (b << 1) | (a >> 63);
|
||||
|
||||
if (b >> 63)
|
||||
key->b ^= 0xc200000000000000UL;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static struct shash_alg ghash_alg = {
|
||||
.digestsize = GHASH_DIGEST_SIZE,
|
||||
.init = ghash_init,
|
||||
.update = ghash_update,
|
||||
.final = ghash_final,
|
||||
.setkey = ghash_setkey,
|
||||
.descsize = sizeof(struct ghash_desc_ctx),
|
||||
.base = {
|
||||
.cra_name = "ghash",
|
||||
.cra_driver_name = "__driver-ghash-ce",
|
||||
.cra_priority = 0,
|
||||
.cra_flags = CRYPTO_ALG_TYPE_SHASH | CRYPTO_ALG_INTERNAL,
|
||||
.cra_blocksize = GHASH_BLOCK_SIZE,
|
||||
.cra_ctxsize = sizeof(struct ghash_key),
|
||||
.cra_module = THIS_MODULE,
|
||||
},
|
||||
};
|
||||
|
||||
static int ghash_async_init(struct ahash_request *req)
|
||||
{
|
||||
struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
|
||||
struct ghash_async_ctx *ctx = crypto_ahash_ctx(tfm);
|
||||
struct ahash_request *cryptd_req = ahash_request_ctx(req);
|
||||
struct cryptd_ahash *cryptd_tfm = ctx->cryptd_tfm;
|
||||
|
||||
if (!may_use_simd()) {
|
||||
memcpy(cryptd_req, req, sizeof(*req));
|
||||
ahash_request_set_tfm(cryptd_req, &cryptd_tfm->base);
|
||||
return crypto_ahash_init(cryptd_req);
|
||||
} else {
|
||||
struct shash_desc *desc = cryptd_shash_desc(cryptd_req);
|
||||
struct crypto_shash *child = cryptd_ahash_child(cryptd_tfm);
|
||||
|
||||
desc->tfm = child;
|
||||
desc->flags = req->base.flags;
|
||||
return crypto_shash_init(desc);
|
||||
}
|
||||
}
|
||||
|
||||
static int ghash_async_update(struct ahash_request *req)
|
||||
{
|
||||
struct ahash_request *cryptd_req = ahash_request_ctx(req);
|
||||
|
||||
if (!may_use_simd()) {
|
||||
struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
|
||||
struct ghash_async_ctx *ctx = crypto_ahash_ctx(tfm);
|
||||
struct cryptd_ahash *cryptd_tfm = ctx->cryptd_tfm;
|
||||
|
||||
memcpy(cryptd_req, req, sizeof(*req));
|
||||
ahash_request_set_tfm(cryptd_req, &cryptd_tfm->base);
|
||||
return crypto_ahash_update(cryptd_req);
|
||||
} else {
|
||||
struct shash_desc *desc = cryptd_shash_desc(cryptd_req);
|
||||
return shash_ahash_update(req, desc);
|
||||
}
|
||||
}
|
||||
|
||||
static int ghash_async_final(struct ahash_request *req)
|
||||
{
|
||||
struct ahash_request *cryptd_req = ahash_request_ctx(req);
|
||||
|
||||
if (!may_use_simd()) {
|
||||
struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
|
||||
struct ghash_async_ctx *ctx = crypto_ahash_ctx(tfm);
|
||||
struct cryptd_ahash *cryptd_tfm = ctx->cryptd_tfm;
|
||||
|
||||
memcpy(cryptd_req, req, sizeof(*req));
|
||||
ahash_request_set_tfm(cryptd_req, &cryptd_tfm->base);
|
||||
return crypto_ahash_final(cryptd_req);
|
||||
} else {
|
||||
struct shash_desc *desc = cryptd_shash_desc(cryptd_req);
|
||||
return crypto_shash_final(desc, req->result);
|
||||
}
|
||||
}
|
||||
|
||||
static int ghash_async_digest(struct ahash_request *req)
|
||||
{
|
||||
struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
|
||||
struct ghash_async_ctx *ctx = crypto_ahash_ctx(tfm);
|
||||
struct ahash_request *cryptd_req = ahash_request_ctx(req);
|
||||
struct cryptd_ahash *cryptd_tfm = ctx->cryptd_tfm;
|
||||
|
||||
if (!may_use_simd()) {
|
||||
memcpy(cryptd_req, req, sizeof(*req));
|
||||
ahash_request_set_tfm(cryptd_req, &cryptd_tfm->base);
|
||||
return crypto_ahash_digest(cryptd_req);
|
||||
} else {
|
||||
struct shash_desc *desc = cryptd_shash_desc(cryptd_req);
|
||||
struct crypto_shash *child = cryptd_ahash_child(cryptd_tfm);
|
||||
|
||||
desc->tfm = child;
|
||||
desc->flags = req->base.flags;
|
||||
return shash_ahash_digest(req, desc);
|
||||
}
|
||||
}
|
||||
|
||||
static int ghash_async_setkey(struct crypto_ahash *tfm, const u8 *key,
|
||||
unsigned int keylen)
|
||||
{
|
||||
struct ghash_async_ctx *ctx = crypto_ahash_ctx(tfm);
|
||||
struct crypto_ahash *child = &ctx->cryptd_tfm->base;
|
||||
int err;
|
||||
|
||||
crypto_ahash_clear_flags(child, CRYPTO_TFM_REQ_MASK);
|
||||
crypto_ahash_set_flags(child, crypto_ahash_get_flags(tfm)
|
||||
& CRYPTO_TFM_REQ_MASK);
|
||||
err = crypto_ahash_setkey(child, key, keylen);
|
||||
crypto_ahash_set_flags(tfm, crypto_ahash_get_flags(child)
|
||||
& CRYPTO_TFM_RES_MASK);
|
||||
|
||||
return err;
|
||||
}
|
||||
|
||||
static int ghash_async_init_tfm(struct crypto_tfm *tfm)
|
||||
{
|
||||
struct cryptd_ahash *cryptd_tfm;
|
||||
struct ghash_async_ctx *ctx = crypto_tfm_ctx(tfm);
|
||||
|
||||
cryptd_tfm = cryptd_alloc_ahash("__driver-ghash-ce",
|
||||
CRYPTO_ALG_INTERNAL,
|
||||
CRYPTO_ALG_INTERNAL);
|
||||
if (IS_ERR(cryptd_tfm))
|
||||
return PTR_ERR(cryptd_tfm);
|
||||
ctx->cryptd_tfm = cryptd_tfm;
|
||||
crypto_ahash_set_reqsize(__crypto_ahash_cast(tfm),
|
||||
sizeof(struct ahash_request) +
|
||||
crypto_ahash_reqsize(&cryptd_tfm->base));
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static void ghash_async_exit_tfm(struct crypto_tfm *tfm)
|
||||
{
|
||||
struct ghash_async_ctx *ctx = crypto_tfm_ctx(tfm);
|
||||
|
||||
cryptd_free_ahash(ctx->cryptd_tfm);
|
||||
}
|
||||
|
||||
static struct ahash_alg ghash_async_alg = {
|
||||
.init = ghash_async_init,
|
||||
.update = ghash_async_update,
|
||||
.final = ghash_async_final,
|
||||
.setkey = ghash_async_setkey,
|
||||
.digest = ghash_async_digest,
|
||||
.halg.digestsize = GHASH_DIGEST_SIZE,
|
||||
.halg.base = {
|
||||
.cra_name = "ghash",
|
||||
.cra_driver_name = "ghash-ce",
|
||||
.cra_priority = 300,
|
||||
.cra_flags = CRYPTO_ALG_TYPE_AHASH | CRYPTO_ALG_ASYNC,
|
||||
.cra_blocksize = GHASH_BLOCK_SIZE,
|
||||
.cra_type = &crypto_ahash_type,
|
||||
.cra_ctxsize = sizeof(struct ghash_async_ctx),
|
||||
.cra_module = THIS_MODULE,
|
||||
.cra_init = ghash_async_init_tfm,
|
||||
.cra_exit = ghash_async_exit_tfm,
|
||||
},
|
||||
};
|
||||
|
||||
static int __init ghash_ce_mod_init(void)
|
||||
{
|
||||
int err;
|
||||
|
||||
if (!(elf_hwcap2 & HWCAP2_PMULL))
|
||||
return -ENODEV;
|
||||
|
||||
err = crypto_register_shash(&ghash_alg);
|
||||
if (err)
|
||||
return err;
|
||||
err = crypto_register_ahash(&ghash_async_alg);
|
||||
if (err)
|
||||
goto err_shash;
|
||||
|
||||
return 0;
|
||||
|
||||
err_shash:
|
||||
crypto_unregister_shash(&ghash_alg);
|
||||
return err;
|
||||
}
|
||||
|
||||
static void __exit ghash_ce_mod_exit(void)
|
||||
{
|
||||
crypto_unregister_ahash(&ghash_async_alg);
|
||||
crypto_unregister_shash(&ghash_alg);
|
||||
}
|
||||
|
||||
module_init(ghash_ce_mod_init);
|
||||
module_exit(ghash_ce_mod_exit);
|
|
@ -0,0 +1,125 @@
|
|||
/*
|
||||
* sha1-ce-core.S - SHA-1 secure hash using ARMv8 Crypto Extensions
|
||||
*
|
||||
* Copyright (C) 2015 Linaro Ltd.
|
||||
* Author: Ard Biesheuvel <ard.biesheuvel@linaro.org>
|
||||
*
|
||||
* This program is free software; you can redistribute it and/or modify
|
||||
* it under the terms of the GNU General Public License version 2 as
|
||||
* published by the Free Software Foundation.
|
||||
*/
|
||||
|
||||
#include <linux/linkage.h>
|
||||
#include <asm/assembler.h>
|
||||
|
||||
.text
|
||||
.fpu crypto-neon-fp-armv8
|
||||
|
||||
k0 .req q0
|
||||
k1 .req q1
|
||||
k2 .req q2
|
||||
k3 .req q3
|
||||
|
||||
ta0 .req q4
|
||||
ta1 .req q5
|
||||
tb0 .req q5
|
||||
tb1 .req q4
|
||||
|
||||
dga .req q6
|
||||
dgb .req q7
|
||||
dgbs .req s28
|
||||
|
||||
dg0 .req q12
|
||||
dg1a0 .req q13
|
||||
dg1a1 .req q14
|
||||
dg1b0 .req q14
|
||||
dg1b1 .req q13
|
||||
|
||||
.macro add_only, op, ev, rc, s0, dg1
|
||||
.ifnb \s0
|
||||
vadd.u32 tb\ev, q\s0, \rc
|
||||
.endif
|
||||
sha1h.32 dg1b\ev, dg0
|
||||
.ifb \dg1
|
||||
sha1\op\().32 dg0, dg1a\ev, ta\ev
|
||||
.else
|
||||
sha1\op\().32 dg0, \dg1, ta\ev
|
||||
.endif
|
||||
.endm
|
||||
|
||||
.macro add_update, op, ev, rc, s0, s1, s2, s3, dg1
|
||||
sha1su0.32 q\s0, q\s1, q\s2
|
||||
add_only \op, \ev, \rc, \s1, \dg1
|
||||
sha1su1.32 q\s0, q\s3
|
||||
.endm
|
||||
|
||||
.align 6
|
||||
.Lsha1_rcon:
|
||||
.word 0x5a827999, 0x5a827999, 0x5a827999, 0x5a827999
|
||||
.word 0x6ed9eba1, 0x6ed9eba1, 0x6ed9eba1, 0x6ed9eba1
|
||||
.word 0x8f1bbcdc, 0x8f1bbcdc, 0x8f1bbcdc, 0x8f1bbcdc
|
||||
.word 0xca62c1d6, 0xca62c1d6, 0xca62c1d6, 0xca62c1d6
|
||||
|
||||
/*
|
||||
* void sha1_ce_transform(struct sha1_state *sst, u8 const *src,
|
||||
* int blocks);
|
||||
*/
|
||||
ENTRY(sha1_ce_transform)
|
||||
/* load round constants */
|
||||
adr ip, .Lsha1_rcon
|
||||
vld1.32 {k0-k1}, [ip, :128]!
|
||||
vld1.32 {k2-k3}, [ip, :128]
|
||||
|
||||
/* load state */
|
||||
vld1.32 {dga}, [r0]
|
||||
vldr dgbs, [r0, #16]
|
||||
|
||||
/* load input */
|
||||
0: vld1.32 {q8-q9}, [r1]!
|
||||
vld1.32 {q10-q11}, [r1]!
|
||||
subs r2, r2, #1
|
||||
|
||||
#ifndef CONFIG_CPU_BIG_ENDIAN
|
||||
vrev32.8 q8, q8
|
||||
vrev32.8 q9, q9
|
||||
vrev32.8 q10, q10
|
||||
vrev32.8 q11, q11
|
||||
#endif
|
||||
|
||||
vadd.u32 ta0, q8, k0
|
||||
vmov dg0, dga
|
||||
|
||||
add_update c, 0, k0, 8, 9, 10, 11, dgb
|
||||
add_update c, 1, k0, 9, 10, 11, 8
|
||||
add_update c, 0, k0, 10, 11, 8, 9
|
||||
add_update c, 1, k0, 11, 8, 9, 10
|
||||
add_update c, 0, k1, 8, 9, 10, 11
|
||||
|
||||
add_update p, 1, k1, 9, 10, 11, 8
|
||||
add_update p, 0, k1, 10, 11, 8, 9
|
||||
add_update p, 1, k1, 11, 8, 9, 10
|
||||
add_update p, 0, k1, 8, 9, 10, 11
|
||||
add_update p, 1, k2, 9, 10, 11, 8
|
||||
|
||||
add_update m, 0, k2, 10, 11, 8, 9
|
||||
add_update m, 1, k2, 11, 8, 9, 10
|
||||
add_update m, 0, k2, 8, 9, 10, 11
|
||||
add_update m, 1, k2, 9, 10, 11, 8
|
||||
add_update m, 0, k3, 10, 11, 8, 9
|
||||
|
||||
add_update p, 1, k3, 11, 8, 9, 10
|
||||
add_only p, 0, k3, 9
|
||||
add_only p, 1, k3, 10
|
||||
add_only p, 0, k3, 11
|
||||
add_only p, 1
|
||||
|
||||
/* update state */
|
||||
vadd.u32 dga, dga, dg0
|
||||
vadd.u32 dgb, dgb, dg1a0
|
||||
bne 0b
|
||||
|
||||
/* store new state */
|
||||
vst1.32 {dga}, [r0]
|
||||
vstr dgbs, [r0, #16]
|
||||
bx lr
|
||||
ENDPROC(sha1_ce_transform)
|
|
@ -0,0 +1,96 @@
|
|||
/*
|
||||
* sha1-ce-glue.c - SHA-1 secure hash using ARMv8 Crypto Extensions
|
||||
*
|
||||
* Copyright (C) 2015 Linaro Ltd <ard.biesheuvel@linaro.org>
|
||||
*
|
||||
* This program is free software; you can redistribute it and/or modify
|
||||
* it under the terms of the GNU General Public License version 2 as
|
||||
* published by the Free Software Foundation.
|
||||
*/
|
||||
|
||||
#include <crypto/internal/hash.h>
|
||||
#include <crypto/sha.h>
|
||||
#include <crypto/sha1_base.h>
|
||||
#include <linux/crypto.h>
|
||||
#include <linux/module.h>
|
||||
|
||||
#include <asm/hwcap.h>
|
||||
#include <asm/neon.h>
|
||||
#include <asm/simd.h>
|
||||
|
||||
#include "sha1.h"
|
||||
|
||||
MODULE_DESCRIPTION("SHA1 secure hash using ARMv8 Crypto Extensions");
|
||||
MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
|
||||
MODULE_LICENSE("GPL v2");
|
||||
|
||||
asmlinkage void sha1_ce_transform(struct sha1_state *sst, u8 const *src,
|
||||
int blocks);
|
||||
|
||||
static int sha1_ce_update(struct shash_desc *desc, const u8 *data,
|
||||
unsigned int len)
|
||||
{
|
||||
struct sha1_state *sctx = shash_desc_ctx(desc);
|
||||
|
||||
if (!may_use_simd() ||
|
||||
(sctx->count % SHA1_BLOCK_SIZE) + len < SHA1_BLOCK_SIZE)
|
||||
return sha1_update_arm(desc, data, len);
|
||||
|
||||
kernel_neon_begin();
|
||||
sha1_base_do_update(desc, data, len, sha1_ce_transform);
|
||||
kernel_neon_end();
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int sha1_ce_finup(struct shash_desc *desc, const u8 *data,
|
||||
unsigned int len, u8 *out)
|
||||
{
|
||||
if (!may_use_simd())
|
||||
return sha1_finup_arm(desc, data, len, out);
|
||||
|
||||
kernel_neon_begin();
|
||||
if (len)
|
||||
sha1_base_do_update(desc, data, len, sha1_ce_transform);
|
||||
sha1_base_do_finalize(desc, sha1_ce_transform);
|
||||
kernel_neon_end();
|
||||
|
||||
return sha1_base_finish(desc, out);
|
||||
}
|
||||
|
||||
static int sha1_ce_final(struct shash_desc *desc, u8 *out)
|
||||
{
|
||||
return sha1_ce_finup(desc, NULL, 0, out);
|
||||
}
|
||||
|
||||
static struct shash_alg alg = {
|
||||
.init = sha1_base_init,
|
||||
.update = sha1_ce_update,
|
||||
.final = sha1_ce_final,
|
||||
.finup = sha1_ce_finup,
|
||||
.descsize = sizeof(struct sha1_state),
|
||||
.digestsize = SHA1_DIGEST_SIZE,
|
||||
.base = {
|
||||
.cra_name = "sha1",
|
||||
.cra_driver_name = "sha1-ce",
|
||||
.cra_priority = 200,
|
||||
.cra_flags = CRYPTO_ALG_TYPE_SHASH,
|
||||
.cra_blocksize = SHA1_BLOCK_SIZE,
|
||||
.cra_module = THIS_MODULE,
|
||||
}
|
||||
};
|
||||
|
||||
static int __init sha1_ce_mod_init(void)
|
||||
{
|
||||
if (!(elf_hwcap2 & HWCAP2_SHA1))
|
||||
return -ENODEV;
|
||||
return crypto_register_shash(&alg);
|
||||
}
|
||||
|
||||
static void __exit sha1_ce_mod_fini(void)
|
||||
{
|
||||
crypto_unregister_shash(&alg);
|
||||
}
|
||||
|
||||
module_init(sha1_ce_mod_init);
|
||||
module_exit(sha1_ce_mod_fini);
|
|
@ -7,4 +7,7 @@
|
|||
extern int sha1_update_arm(struct shash_desc *desc, const u8 *data,
|
||||
unsigned int len);
|
||||
|
||||
extern int sha1_finup_arm(struct shash_desc *desc, const u8 *data,
|
||||
unsigned int len, u8 *out);
|
||||
|
||||
#endif
|
|
@ -22,127 +22,47 @@
|
|||
#include <linux/cryptohash.h>
|
||||
#include <linux/types.h>
|
||||
#include <crypto/sha.h>
|
||||
#include <crypto/sha1_base.h>
|
||||
#include <asm/byteorder.h>
|
||||
#include <asm/crypto/sha1.h>
|
||||
|
||||
#include "sha1.h"
|
||||
|
||||
asmlinkage void sha1_block_data_order(u32 *digest,
|
||||
const unsigned char *data, unsigned int rounds);
|
||||
|
||||
|
||||
static int sha1_init(struct shash_desc *desc)
|
||||
{
|
||||
struct sha1_state *sctx = shash_desc_ctx(desc);
|
||||
|
||||
*sctx = (struct sha1_state){
|
||||
.state = { SHA1_H0, SHA1_H1, SHA1_H2, SHA1_H3, SHA1_H4 },
|
||||
};
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
|
||||
static int __sha1_update(struct sha1_state *sctx, const u8 *data,
|
||||
unsigned int len, unsigned int partial)
|
||||
{
|
||||
unsigned int done = 0;
|
||||
|
||||
sctx->count += len;
|
||||
|
||||
if (partial) {
|
||||
done = SHA1_BLOCK_SIZE - partial;
|
||||
memcpy(sctx->buffer + partial, data, done);
|
||||
sha1_block_data_order(sctx->state, sctx->buffer, 1);
|
||||
}
|
||||
|
||||
if (len - done >= SHA1_BLOCK_SIZE) {
|
||||
const unsigned int rounds = (len - done) / SHA1_BLOCK_SIZE;
|
||||
sha1_block_data_order(sctx->state, data + done, rounds);
|
||||
done += rounds * SHA1_BLOCK_SIZE;
|
||||
}
|
||||
|
||||
memcpy(sctx->buffer, data + done, len - done);
|
||||
return 0;
|
||||
}
|
||||
|
||||
|
||||
int sha1_update_arm(struct shash_desc *desc, const u8 *data,
|
||||
unsigned int len)
|
||||
{
|
||||
struct sha1_state *sctx = shash_desc_ctx(desc);
|
||||
unsigned int partial = sctx->count % SHA1_BLOCK_SIZE;
|
||||
int res;
|
||||
/* make sure casting to sha1_block_fn() is safe */
|
||||
BUILD_BUG_ON(offsetof(struct sha1_state, state) != 0);
|
||||
|
||||
/* Handle the fast case right here */
|
||||
if (partial + len < SHA1_BLOCK_SIZE) {
|
||||
sctx->count += len;
|
||||
memcpy(sctx->buffer + partial, data, len);
|
||||
return 0;
|
||||
}
|
||||
res = __sha1_update(sctx, data, len, partial);
|
||||
return res;
|
||||
return sha1_base_do_update(desc, data, len,
|
||||
(sha1_block_fn *)sha1_block_data_order);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(sha1_update_arm);
|
||||
|
||||
|
||||
/* Add padding and return the message digest. */
|
||||
static int sha1_final(struct shash_desc *desc, u8 *out)
|
||||
{
|
||||
struct sha1_state *sctx = shash_desc_ctx(desc);
|
||||
unsigned int i, index, padlen;
|
||||
__be32 *dst = (__be32 *)out;
|
||||
__be64 bits;
|
||||
static const u8 padding[SHA1_BLOCK_SIZE] = { 0x80, };
|
||||
|
||||
bits = cpu_to_be64(sctx->count << 3);
|
||||
|
||||
/* Pad out to 56 mod 64 and append length */
|
||||
index = sctx->count % SHA1_BLOCK_SIZE;
|
||||
padlen = (index < 56) ? (56 - index) : ((SHA1_BLOCK_SIZE+56) - index);
|
||||
/* We need to fill a whole block for __sha1_update() */
|
||||
if (padlen <= 56) {
|
||||
sctx->count += padlen;
|
||||
memcpy(sctx->buffer + index, padding, padlen);
|
||||
} else {
|
||||
__sha1_update(sctx, padding, padlen, index);
|
||||
}
|
||||
__sha1_update(sctx, (const u8 *)&bits, sizeof(bits), 56);
|
||||
|
||||
/* Store state in digest */
|
||||
for (i = 0; i < 5; i++)
|
||||
dst[i] = cpu_to_be32(sctx->state[i]);
|
||||
|
||||
/* Wipe context */
|
||||
memset(sctx, 0, sizeof(*sctx));
|
||||
return 0;
|
||||
sha1_base_do_finalize(desc, (sha1_block_fn *)sha1_block_data_order);
|
||||
return sha1_base_finish(desc, out);
|
||||
}
|
||||
|
||||
|
||||
static int sha1_export(struct shash_desc *desc, void *out)
|
||||
int sha1_finup_arm(struct shash_desc *desc, const u8 *data,
|
||||
unsigned int len, u8 *out)
|
||||
{
|
||||
struct sha1_state *sctx = shash_desc_ctx(desc);
|
||||
memcpy(out, sctx, sizeof(*sctx));
|
||||
return 0;
|
||||
sha1_base_do_update(desc, data, len,
|
||||
(sha1_block_fn *)sha1_block_data_order);
|
||||
return sha1_final(desc, out);
|
||||
}
|
||||
|
||||
|
||||
static int sha1_import(struct shash_desc *desc, const void *in)
|
||||
{
|
||||
struct sha1_state *sctx = shash_desc_ctx(desc);
|
||||
memcpy(sctx, in, sizeof(*sctx));
|
||||
return 0;
|
||||
}
|
||||
|
||||
EXPORT_SYMBOL_GPL(sha1_finup_arm);
|
||||
|
||||
static struct shash_alg alg = {
|
||||
.digestsize = SHA1_DIGEST_SIZE,
|
||||
.init = sha1_init,
|
||||
.init = sha1_base_init,
|
||||
.update = sha1_update_arm,
|
||||
.final = sha1_final,
|
||||
.export = sha1_export,
|
||||
.import = sha1_import,
|
||||
.finup = sha1_finup_arm,
|
||||
.descsize = sizeof(struct sha1_state),
|
||||
.statesize = sizeof(struct sha1_state),
|
||||
.base = {
|
||||
.cra_name = "sha1",
|
||||
.cra_driver_name= "sha1-asm",
|
||||
|
|
|
@ -25,147 +25,60 @@
|
|||
#include <linux/cryptohash.h>
|
||||
#include <linux/types.h>
|
||||
#include <crypto/sha.h>
|
||||
#include <asm/byteorder.h>
|
||||
#include <crypto/sha1_base.h>
|
||||
#include <asm/neon.h>
|
||||
#include <asm/simd.h>
|
||||
#include <asm/crypto/sha1.h>
|
||||
|
||||
#include "sha1.h"
|
||||
|
||||
asmlinkage void sha1_transform_neon(void *state_h, const char *data,
|
||||
unsigned int rounds);
|
||||
|
||||
|
||||
static int sha1_neon_init(struct shash_desc *desc)
|
||||
{
|
||||
struct sha1_state *sctx = shash_desc_ctx(desc);
|
||||
|
||||
*sctx = (struct sha1_state){
|
||||
.state = { SHA1_H0, SHA1_H1, SHA1_H2, SHA1_H3, SHA1_H4 },
|
||||
};
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int __sha1_neon_update(struct shash_desc *desc, const u8 *data,
|
||||
unsigned int len, unsigned int partial)
|
||||
{
|
||||
struct sha1_state *sctx = shash_desc_ctx(desc);
|
||||
unsigned int done = 0;
|
||||
|
||||
sctx->count += len;
|
||||
|
||||
if (partial) {
|
||||
done = SHA1_BLOCK_SIZE - partial;
|
||||
memcpy(sctx->buffer + partial, data, done);
|
||||
sha1_transform_neon(sctx->state, sctx->buffer, 1);
|
||||
}
|
||||
|
||||
if (len - done >= SHA1_BLOCK_SIZE) {
|
||||
const unsigned int rounds = (len - done) / SHA1_BLOCK_SIZE;
|
||||
|
||||
sha1_transform_neon(sctx->state, data + done, rounds);
|
||||
done += rounds * SHA1_BLOCK_SIZE;
|
||||
}
|
||||
|
||||
memcpy(sctx->buffer, data + done, len - done);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int sha1_neon_update(struct shash_desc *desc, const u8 *data,
|
||||
unsigned int len)
|
||||
unsigned int len)
|
||||
{
|
||||
struct sha1_state *sctx = shash_desc_ctx(desc);
|
||||
unsigned int partial = sctx->count % SHA1_BLOCK_SIZE;
|
||||
int res;
|
||||
|
||||
/* Handle the fast case right here */
|
||||
if (partial + len < SHA1_BLOCK_SIZE) {
|
||||
sctx->count += len;
|
||||
memcpy(sctx->buffer + partial, data, len);
|
||||
if (!may_use_simd() ||
|
||||
(sctx->count % SHA1_BLOCK_SIZE) + len < SHA1_BLOCK_SIZE)
|
||||
return sha1_update_arm(desc, data, len);
|
||||
|
||||
return 0;
|
||||
}
|
||||
kernel_neon_begin();
|
||||
sha1_base_do_update(desc, data, len,
|
||||
(sha1_block_fn *)sha1_transform_neon);
|
||||
kernel_neon_end();
|
||||
|
||||
if (!may_use_simd()) {
|
||||
res = sha1_update_arm(desc, data, len);
|
||||
} else {
|
||||
kernel_neon_begin();
|
||||
res = __sha1_neon_update(desc, data, len, partial);
|
||||
kernel_neon_end();
|
||||
}
|
||||
|
||||
return res;
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int sha1_neon_finup(struct shash_desc *desc, const u8 *data,
|
||||
unsigned int len, u8 *out)
|
||||
{
|
||||
if (!may_use_simd())
|
||||
return sha1_finup_arm(desc, data, len, out);
|
||||
|
||||
kernel_neon_begin();
|
||||
if (len)
|
||||
sha1_base_do_update(desc, data, len,
|
||||
(sha1_block_fn *)sha1_transform_neon);
|
||||
sha1_base_do_finalize(desc, (sha1_block_fn *)sha1_transform_neon);
|
||||
kernel_neon_end();
|
||||
|
||||
return sha1_base_finish(desc, out);
|
||||
}
|
||||
|
||||
/* Add padding and return the message digest. */
|
||||
static int sha1_neon_final(struct shash_desc *desc, u8 *out)
|
||||
{
|
||||
struct sha1_state *sctx = shash_desc_ctx(desc);
|
||||
unsigned int i, index, padlen;
|
||||
__be32 *dst = (__be32 *)out;
|
||||
__be64 bits;
|
||||
static const u8 padding[SHA1_BLOCK_SIZE] = { 0x80, };
|
||||
|
||||
bits = cpu_to_be64(sctx->count << 3);
|
||||
|
||||
/* Pad out to 56 mod 64 and append length */
|
||||
index = sctx->count % SHA1_BLOCK_SIZE;
|
||||
padlen = (index < 56) ? (56 - index) : ((SHA1_BLOCK_SIZE+56) - index);
|
||||
if (!may_use_simd()) {
|
||||
sha1_update_arm(desc, padding, padlen);
|
||||
sha1_update_arm(desc, (const u8 *)&bits, sizeof(bits));
|
||||
} else {
|
||||
kernel_neon_begin();
|
||||
/* We need to fill a whole block for __sha1_neon_update() */
|
||||
if (padlen <= 56) {
|
||||
sctx->count += padlen;
|
||||
memcpy(sctx->buffer + index, padding, padlen);
|
||||
} else {
|
||||
__sha1_neon_update(desc, padding, padlen, index);
|
||||
}
|
||||
__sha1_neon_update(desc, (const u8 *)&bits, sizeof(bits), 56);
|
||||
kernel_neon_end();
|
||||
}
|
||||
|
||||
/* Store state in digest */
|
||||
for (i = 0; i < 5; i++)
|
||||
dst[i] = cpu_to_be32(sctx->state[i]);
|
||||
|
||||
/* Wipe context */
|
||||
memset(sctx, 0, sizeof(*sctx));
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int sha1_neon_export(struct shash_desc *desc, void *out)
|
||||
{
|
||||
struct sha1_state *sctx = shash_desc_ctx(desc);
|
||||
|
||||
memcpy(out, sctx, sizeof(*sctx));
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int sha1_neon_import(struct shash_desc *desc, const void *in)
|
||||
{
|
||||
struct sha1_state *sctx = shash_desc_ctx(desc);
|
||||
|
||||
memcpy(sctx, in, sizeof(*sctx));
|
||||
|
||||
return 0;
|
||||
return sha1_neon_finup(desc, NULL, 0, out);
|
||||
}
|
||||
|
||||
static struct shash_alg alg = {
|
||||
.digestsize = SHA1_DIGEST_SIZE,
|
||||
.init = sha1_neon_init,
|
||||
.init = sha1_base_init,
|
||||
.update = sha1_neon_update,
|
||||
.final = sha1_neon_final,
|
||||
.export = sha1_neon_export,
|
||||
.import = sha1_neon_import,
|
||||
.finup = sha1_neon_finup,
|
||||
.descsize = sizeof(struct sha1_state),
|
||||
.statesize = sizeof(struct sha1_state),
|
||||
.base = {
|
||||
.cra_name = "sha1",
|
||||
.cra_driver_name = "sha1-neon",
|
||||
|
|
|
@ -0,0 +1,125 @@
|
|||
/*
|
||||
* sha2-ce-core.S - SHA-224/256 secure hash using ARMv8 Crypto Extensions
|
||||
*
|
||||
* Copyright (C) 2015 Linaro Ltd.
|
||||
* Author: Ard Biesheuvel <ard.biesheuvel@linaro.org>
|
||||
*
|
||||
* This program is free software; you can redistribute it and/or modify
|
||||
* it under the terms of the GNU General Public License version 2 as
|
||||
* published by the Free Software Foundation.
|
||||
*/
|
||||
|
||||
#include <linux/linkage.h>
|
||||
#include <asm/assembler.h>
|
||||
|
||||
.text
|
||||
.fpu crypto-neon-fp-armv8
|
||||
|
||||
k0 .req q7
|
||||
k1 .req q8
|
||||
rk .req r3
|
||||
|
||||
ta0 .req q9
|
||||
ta1 .req q10
|
||||
tb0 .req q10
|
||||
tb1 .req q9
|
||||
|
||||
dga .req q11
|
||||
dgb .req q12
|
||||
|
||||
dg0 .req q13
|
||||
dg1 .req q14
|
||||
dg2 .req q15
|
||||
|
||||
.macro add_only, ev, s0
|
||||
vmov dg2, dg0
|
||||
.ifnb \s0
|
||||
vld1.32 {k\ev}, [rk, :128]!
|
||||
.endif
|
||||
sha256h.32 dg0, dg1, tb\ev
|
||||
sha256h2.32 dg1, dg2, tb\ev
|
||||
.ifnb \s0
|
||||
vadd.u32 ta\ev, q\s0, k\ev
|
||||
.endif
|
||||
.endm
|
||||
|
||||
.macro add_update, ev, s0, s1, s2, s3
|
||||
sha256su0.32 q\s0, q\s1
|
||||
add_only \ev, \s1
|
||||
sha256su1.32 q\s0, q\s2, q\s3
|
||||
.endm
|
||||
|
||||
.align 6
|
||||
.Lsha256_rcon:
|
||||
.word 0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5
|
||||
.word 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5
|
||||
.word 0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3
|
||||
.word 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174
|
||||
.word 0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc
|
||||
.word 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da
|
||||
.word 0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7
|
||||
.word 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967
|
||||
.word 0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13
|
||||
.word 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85
|
||||
.word 0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3
|
||||
.word 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070
|
||||
.word 0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5
|
||||
.word 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3
|
||||
.word 0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208
|
||||
.word 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2
|
||||
|
||||
/*
|
||||
* void sha2_ce_transform(struct sha256_state *sst, u8 const *src,
|
||||
int blocks);
|
||||
*/
|
||||
ENTRY(sha2_ce_transform)
|
||||
/* load state */
|
||||
vld1.32 {dga-dgb}, [r0]
|
||||
|
||||
/* load input */
|
||||
0: vld1.32 {q0-q1}, [r1]!
|
||||
vld1.32 {q2-q3}, [r1]!
|
||||
subs r2, r2, #1
|
||||
|
||||
#ifndef CONFIG_CPU_BIG_ENDIAN
|
||||
vrev32.8 q0, q0
|
||||
vrev32.8 q1, q1
|
||||
vrev32.8 q2, q2
|
||||
vrev32.8 q3, q3
|
||||
#endif
|
||||
|
||||
/* load first round constant */
|
||||
adr rk, .Lsha256_rcon
|
||||
vld1.32 {k0}, [rk, :128]!
|
||||
|
||||
vadd.u32 ta0, q0, k0
|
||||
vmov dg0, dga
|
||||
vmov dg1, dgb
|
||||
|
||||
add_update 1, 0, 1, 2, 3
|
||||
add_update 0, 1, 2, 3, 0
|
||||
add_update 1, 2, 3, 0, 1
|
||||
add_update 0, 3, 0, 1, 2
|
||||
add_update 1, 0, 1, 2, 3
|
||||
add_update 0, 1, 2, 3, 0
|
||||
add_update 1, 2, 3, 0, 1
|
||||
add_update 0, 3, 0, 1, 2
|
||||
add_update 1, 0, 1, 2, 3
|
||||
add_update 0, 1, 2, 3, 0
|
||||
add_update 1, 2, 3, 0, 1
|
||||
add_update 0, 3, 0, 1, 2
|
||||
|
||||
add_only 1, 1
|
||||
add_only 0, 2
|
||||
add_only 1, 3
|
||||
add_only 0
|
||||
|
||||
/* update state */
|
||||
vadd.u32 dga, dga, dg0
|
||||
vadd.u32 dgb, dgb, dg1
|
||||
bne 0b
|
||||
|
||||
/* store new state */
|
||||
vst1.32 {dga-dgb}, [r0]
|
||||
bx lr
|
||||
ENDPROC(sha2_ce_transform)
|
|
@ -0,0 +1,114 @@
|
|||
/*
|
||||
* sha2-ce-glue.c - SHA-224/SHA-256 using ARMv8 Crypto Extensions
|
||||
*
|
||||
* Copyright (C) 2015 Linaro Ltd <ard.biesheuvel@linaro.org>
|
||||
*
|
||||
* This program is free software; you can redistribute it and/or modify
|
||||
* it under the terms of the GNU General Public License version 2 as
|
||||
* published by the Free Software Foundation.
|
||||
*/
|
||||
|
||||
#include <crypto/internal/hash.h>
|
||||
#include <crypto/sha.h>
|
||||
#include <crypto/sha256_base.h>
|
||||
#include <linux/crypto.h>
|
||||
#include <linux/module.h>
|
||||
|
||||
#include <asm/hwcap.h>
|
||||
#include <asm/simd.h>
|
||||
#include <asm/neon.h>
|
||||
#include <asm/unaligned.h>
|
||||
|
||||
#include "sha256_glue.h"
|
||||
|
||||
MODULE_DESCRIPTION("SHA-224/SHA-256 secure hash using ARMv8 Crypto Extensions");
|
||||
MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
|
||||
MODULE_LICENSE("GPL v2");
|
||||
|
||||
asmlinkage void sha2_ce_transform(struct sha256_state *sst, u8 const *src,
|
||||
int blocks);
|
||||
|
||||
static int sha2_ce_update(struct shash_desc *desc, const u8 *data,
|
||||
unsigned int len)
|
||||
{
|
||||
struct sha256_state *sctx = shash_desc_ctx(desc);
|
||||
|
||||
if (!may_use_simd() ||
|
||||
(sctx->count % SHA256_BLOCK_SIZE) + len < SHA256_BLOCK_SIZE)
|
||||
return crypto_sha256_arm_update(desc, data, len);
|
||||
|
||||
kernel_neon_begin();
|
||||
sha256_base_do_update(desc, data, len,
|
||||
(sha256_block_fn *)sha2_ce_transform);
|
||||
kernel_neon_end();
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int sha2_ce_finup(struct shash_desc *desc, const u8 *data,
|
||||
unsigned int len, u8 *out)
|
||||
{
|
||||
if (!may_use_simd())
|
||||
return crypto_sha256_arm_finup(desc, data, len, out);
|
||||
|
||||
kernel_neon_begin();
|
||||
if (len)
|
||||
sha256_base_do_update(desc, data, len,
|
||||
(sha256_block_fn *)sha2_ce_transform);
|
||||
sha256_base_do_finalize(desc, (sha256_block_fn *)sha2_ce_transform);
|
||||
kernel_neon_end();
|
||||
|
||||
return sha256_base_finish(desc, out);
|
||||
}
|
||||
|
||||
static int sha2_ce_final(struct shash_desc *desc, u8 *out)
|
||||
{
|
||||
return sha2_ce_finup(desc, NULL, 0, out);
|
||||
}
|
||||
|
||||
static struct shash_alg algs[] = { {
|
||||
.init = sha224_base_init,
|
||||
.update = sha2_ce_update,
|
||||
.final = sha2_ce_final,
|
||||
.finup = sha2_ce_finup,
|
||||
.descsize = sizeof(struct sha256_state),
|
||||
.digestsize = SHA224_DIGEST_SIZE,
|
||||
.base = {
|
||||
.cra_name = "sha224",
|
||||
.cra_driver_name = "sha224-ce",
|
||||
.cra_priority = 300,
|
||||
.cra_flags = CRYPTO_ALG_TYPE_SHASH,
|
||||
.cra_blocksize = SHA256_BLOCK_SIZE,
|
||||
.cra_module = THIS_MODULE,
|
||||
}
|
||||
}, {
|
||||
.init = sha256_base_init,
|
||||
.update = sha2_ce_update,
|
||||
.final = sha2_ce_final,
|
||||
.finup = sha2_ce_finup,
|
||||
.descsize = sizeof(struct sha256_state),
|
||||
.digestsize = SHA256_DIGEST_SIZE,
|
||||
.base = {
|
||||
.cra_name = "sha256",
|
||||
.cra_driver_name = "sha256-ce",
|
||||
.cra_priority = 300,
|
||||
.cra_flags = CRYPTO_ALG_TYPE_SHASH,
|
||||
.cra_blocksize = SHA256_BLOCK_SIZE,
|
||||
.cra_module = THIS_MODULE,
|
||||
}
|
||||
} };
|
||||
|
||||
static int __init sha2_ce_mod_init(void)
|
||||
{
|
||||
if (!(elf_hwcap2 & HWCAP2_SHA2))
|
||||
return -ENODEV;
|
||||
return crypto_register_shashes(algs, ARRAY_SIZE(algs));
|
||||
}
|
||||
|
||||
static void __exit sha2_ce_mod_fini(void)
|
||||
{
|
||||
crypto_unregister_shashes(algs, ARRAY_SIZE(algs));
|
||||
}
|
||||
|
||||
module_init(sha2_ce_mod_init);
|
||||
module_exit(sha2_ce_mod_fini);
|
|
@ -0,0 +1,716 @@
|
|||
#!/usr/bin/env perl
|
||||
|
||||
# ====================================================================
|
||||
# Written by Andy Polyakov <appro@openssl.org> for the OpenSSL
|
||||
# project. The module is, however, dual licensed under OpenSSL and
|
||||
# CRYPTOGAMS licenses depending on where you obtain it. For further
|
||||
# details see http://www.openssl.org/~appro/cryptogams/.
|
||||
#
|
||||
# Permission to use under GPL terms is granted.
|
||||
# ====================================================================
|
||||
|
||||
# SHA256 block procedure for ARMv4. May 2007.
|
||||
|
||||
# Performance is ~2x better than gcc 3.4 generated code and in "abso-
|
||||
# lute" terms is ~2250 cycles per 64-byte block or ~35 cycles per
|
||||
# byte [on single-issue Xscale PXA250 core].
|
||||
|
||||
# July 2010.
|
||||
#
|
||||
# Rescheduling for dual-issue pipeline resulted in 22% improvement on
|
||||
# Cortex A8 core and ~20 cycles per processed byte.
|
||||
|
||||
# February 2011.
|
||||
#
|
||||
# Profiler-assisted and platform-specific optimization resulted in 16%
|
||||
# improvement on Cortex A8 core and ~15.4 cycles per processed byte.
|
||||
|
||||
# September 2013.
|
||||
#
|
||||
# Add NEON implementation. On Cortex A8 it was measured to process one
|
||||
# byte in 12.5 cycles or 23% faster than integer-only code. Snapdragon
|
||||
# S4 does it in 12.5 cycles too, but it's 50% faster than integer-only
|
||||
# code (meaning that latter performs sub-optimally, nothing was done
|
||||
# about it).
|
||||
|
||||
# May 2014.
|
||||
#
|
||||
# Add ARMv8 code path performing at 2.0 cpb on Apple A7.
|
||||
|
||||
while (($output=shift) && ($output!~/^\w[\w\-]*\.\w+$/)) {}
|
||||
open STDOUT,">$output";
|
||||
|
||||
$ctx="r0"; $t0="r0";
|
||||
$inp="r1"; $t4="r1";
|
||||
$len="r2"; $t1="r2";
|
||||
$T1="r3"; $t3="r3";
|
||||
$A="r4";
|
||||
$B="r5";
|
||||
$C="r6";
|
||||
$D="r7";
|
||||
$E="r8";
|
||||
$F="r9";
|
||||
$G="r10";
|
||||
$H="r11";
|
||||
@V=($A,$B,$C,$D,$E,$F,$G,$H);
|
||||
$t2="r12";
|
||||
$Ktbl="r14";
|
||||
|
||||
@Sigma0=( 2,13,22);
|
||||
@Sigma1=( 6,11,25);
|
||||
@sigma0=( 7,18, 3);
|
||||
@sigma1=(17,19,10);
|
||||
|
||||
sub BODY_00_15 {
|
||||
my ($i,$a,$b,$c,$d,$e,$f,$g,$h) = @_;
|
||||
|
||||
$code.=<<___ if ($i<16);
|
||||
#if __ARM_ARCH__>=7
|
||||
@ ldr $t1,[$inp],#4 @ $i
|
||||
# if $i==15
|
||||
str $inp,[sp,#17*4] @ make room for $t4
|
||||
# endif
|
||||
eor $t0,$e,$e,ror#`$Sigma1[1]-$Sigma1[0]`
|
||||
add $a,$a,$t2 @ h+=Maj(a,b,c) from the past
|
||||
eor $t0,$t0,$e,ror#`$Sigma1[2]-$Sigma1[0]` @ Sigma1(e)
|
||||
# ifndef __ARMEB__
|
||||
rev $t1,$t1
|
||||
# endif
|
||||
#else
|
||||
@ ldrb $t1,[$inp,#3] @ $i
|
||||
add $a,$a,$t2 @ h+=Maj(a,b,c) from the past
|
||||
ldrb $t2,[$inp,#2]
|
||||
ldrb $t0,[$inp,#1]
|
||||
orr $t1,$t1,$t2,lsl#8
|
||||
ldrb $t2,[$inp],#4
|
||||
orr $t1,$t1,$t0,lsl#16
|
||||
# if $i==15
|
||||
str $inp,[sp,#17*4] @ make room for $t4
|
||||
# endif
|
||||
eor $t0,$e,$e,ror#`$Sigma1[1]-$Sigma1[0]`
|
||||
orr $t1,$t1,$t2,lsl#24
|
||||
eor $t0,$t0,$e,ror#`$Sigma1[2]-$Sigma1[0]` @ Sigma1(e)
|
||||
#endif
|
||||
___
|
||||
$code.=<<___;
|
||||
ldr $t2,[$Ktbl],#4 @ *K256++
|
||||
add $h,$h,$t1 @ h+=X[i]
|
||||
str $t1,[sp,#`$i%16`*4]
|
||||
eor $t1,$f,$g
|
||||
add $h,$h,$t0,ror#$Sigma1[0] @ h+=Sigma1(e)
|
||||
and $t1,$t1,$e
|
||||
add $h,$h,$t2 @ h+=K256[i]
|
||||
eor $t1,$t1,$g @ Ch(e,f,g)
|
||||
eor $t0,$a,$a,ror#`$Sigma0[1]-$Sigma0[0]`
|
||||
add $h,$h,$t1 @ h+=Ch(e,f,g)
|
||||
#if $i==31
|
||||
and $t2,$t2,#0xff
|
||||
cmp $t2,#0xf2 @ done?
|
||||
#endif
|
||||
#if $i<15
|
||||
# if __ARM_ARCH__>=7
|
||||
ldr $t1,[$inp],#4 @ prefetch
|
||||
# else
|
||||
ldrb $t1,[$inp,#3]
|
||||
# endif
|
||||
eor $t2,$a,$b @ a^b, b^c in next round
|
||||
#else
|
||||
ldr $t1,[sp,#`($i+2)%16`*4] @ from future BODY_16_xx
|
||||
eor $t2,$a,$b @ a^b, b^c in next round
|
||||
ldr $t4,[sp,#`($i+15)%16`*4] @ from future BODY_16_xx
|
||||
#endif
|
||||
eor $t0,$t0,$a,ror#`$Sigma0[2]-$Sigma0[0]` @ Sigma0(a)
|
||||
and $t3,$t3,$t2 @ (b^c)&=(a^b)
|
||||
add $d,$d,$h @ d+=h
|
||||
eor $t3,$t3,$b @ Maj(a,b,c)
|
||||
add $h,$h,$t0,ror#$Sigma0[0] @ h+=Sigma0(a)
|
||||
@ add $h,$h,$t3 @ h+=Maj(a,b,c)
|
||||
___
|
||||
($t2,$t3)=($t3,$t2);
|
||||
}
|
||||
|
||||
sub BODY_16_XX {
|
||||
my ($i,$a,$b,$c,$d,$e,$f,$g,$h) = @_;
|
||||
|
||||
$code.=<<___;
|
||||
@ ldr $t1,[sp,#`($i+1)%16`*4] @ $i
|
||||
@ ldr $t4,[sp,#`($i+14)%16`*4]
|
||||
mov $t0,$t1,ror#$sigma0[0]
|
||||
add $a,$a,$t2 @ h+=Maj(a,b,c) from the past
|
||||
mov $t2,$t4,ror#$sigma1[0]
|
||||
eor $t0,$t0,$t1,ror#$sigma0[1]
|
||||
eor $t2,$t2,$t4,ror#$sigma1[1]
|
||||
eor $t0,$t0,$t1,lsr#$sigma0[2] @ sigma0(X[i+1])
|
||||
ldr $t1,[sp,#`($i+0)%16`*4]
|
||||
eor $t2,$t2,$t4,lsr#$sigma1[2] @ sigma1(X[i+14])
|
||||
ldr $t4,[sp,#`($i+9)%16`*4]
|
||||
|
||||
add $t2,$t2,$t0
|
||||
eor $t0,$e,$e,ror#`$Sigma1[1]-$Sigma1[0]` @ from BODY_00_15
|
||||
add $t1,$t1,$t2
|
||||
eor $t0,$t0,$e,ror#`$Sigma1[2]-$Sigma1[0]` @ Sigma1(e)
|
||||
add $t1,$t1,$t4 @ X[i]
|
||||
___
|
||||
&BODY_00_15(@_);
|
||||
}
|
||||
|
||||
$code=<<___;
|
||||
#ifndef __KERNEL__
|
||||
# include "arm_arch.h"
|
||||
#else
|
||||
# define __ARM_ARCH__ __LINUX_ARM_ARCH__
|
||||
# define __ARM_MAX_ARCH__ 7
|
||||
#endif
|
||||
|
||||
.text
|
||||
#if __ARM_ARCH__<7
|
||||
.code 32
|
||||
#else
|
||||
.syntax unified
|
||||
# ifdef __thumb2__
|
||||
# define adrl adr
|
||||
.thumb
|
||||
# else
|
||||
.code 32
|
||||
# endif
|
||||
#endif
|
||||
|
||||
.type K256,%object
|
||||
.align 5
|
||||
K256:
|
||||
.word 0x428a2f98,0x71374491,0xb5c0fbcf,0xe9b5dba5
|
||||
.word 0x3956c25b,0x59f111f1,0x923f82a4,0xab1c5ed5
|
||||
.word 0xd807aa98,0x12835b01,0x243185be,0x550c7dc3
|
||||
.word 0x72be5d74,0x80deb1fe,0x9bdc06a7,0xc19bf174
|
||||
.word 0xe49b69c1,0xefbe4786,0x0fc19dc6,0x240ca1cc
|
||||
.word 0x2de92c6f,0x4a7484aa,0x5cb0a9dc,0x76f988da
|
||||
.word 0x983e5152,0xa831c66d,0xb00327c8,0xbf597fc7
|
||||
.word 0xc6e00bf3,0xd5a79147,0x06ca6351,0x14292967
|
||||
.word 0x27b70a85,0x2e1b2138,0x4d2c6dfc,0x53380d13
|
||||
.word 0x650a7354,0x766a0abb,0x81c2c92e,0x92722c85
|
||||
.word 0xa2bfe8a1,0xa81a664b,0xc24b8b70,0xc76c51a3
|
||||
.word 0xd192e819,0xd6990624,0xf40e3585,0x106aa070
|
||||
.word 0x19a4c116,0x1e376c08,0x2748774c,0x34b0bcb5
|
||||
.word 0x391c0cb3,0x4ed8aa4a,0x5b9cca4f,0x682e6ff3
|
||||
.word 0x748f82ee,0x78a5636f,0x84c87814,0x8cc70208
|
||||
.word 0x90befffa,0xa4506ceb,0xbef9a3f7,0xc67178f2
|
||||
.size K256,.-K256
|
||||
.word 0 @ terminator
|
||||
#if __ARM_MAX_ARCH__>=7 && !defined(__KERNEL__)
|
||||
.LOPENSSL_armcap:
|
||||
.word OPENSSL_armcap_P-sha256_block_data_order
|
||||
#endif
|
||||
.align 5
|
||||
|
||||
.global sha256_block_data_order
|
||||
.type sha256_block_data_order,%function
|
||||
sha256_block_data_order:
|
||||
#if __ARM_ARCH__<7
|
||||
sub r3,pc,#8 @ sha256_block_data_order
|
||||
#else
|
||||
adr r3,sha256_block_data_order
|
||||
#endif
|
||||
#if __ARM_MAX_ARCH__>=7 && !defined(__KERNEL__)
|
||||
ldr r12,.LOPENSSL_armcap
|
||||
ldr r12,[r3,r12] @ OPENSSL_armcap_P
|
||||
tst r12,#ARMV8_SHA256
|
||||
bne .LARMv8
|
||||
tst r12,#ARMV7_NEON
|
||||
bne .LNEON
|
||||
#endif
|
||||
add $len,$inp,$len,lsl#6 @ len to point at the end of inp
|
||||
stmdb sp!,{$ctx,$inp,$len,r4-r11,lr}
|
||||
ldmia $ctx,{$A,$B,$C,$D,$E,$F,$G,$H}
|
||||
sub $Ktbl,r3,#256+32 @ K256
|
||||
sub sp,sp,#16*4 @ alloca(X[16])
|
||||
.Loop:
|
||||
# if __ARM_ARCH__>=7
|
||||
ldr $t1,[$inp],#4
|
||||
# else
|
||||
ldrb $t1,[$inp,#3]
|
||||
# endif
|
||||
eor $t3,$B,$C @ magic
|
||||
eor $t2,$t2,$t2
|
||||
___
|
||||
for($i=0;$i<16;$i++) { &BODY_00_15($i,@V); unshift(@V,pop(@V)); }
|
||||
$code.=".Lrounds_16_xx:\n";
|
||||
for (;$i<32;$i++) { &BODY_16_XX($i,@V); unshift(@V,pop(@V)); }
|
||||
$code.=<<___;
|
||||
#if __ARM_ARCH__>=7
|
||||
ite eq @ Thumb2 thing, sanity check in ARM
|
||||
#endif
|
||||
ldreq $t3,[sp,#16*4] @ pull ctx
|
||||
bne .Lrounds_16_xx
|
||||
|
||||
add $A,$A,$t2 @ h+=Maj(a,b,c) from the past
|
||||
ldr $t0,[$t3,#0]
|
||||
ldr $t1,[$t3,#4]
|
||||
ldr $t2,[$t3,#8]
|
||||
add $A,$A,$t0
|
||||
ldr $t0,[$t3,#12]
|
||||
add $B,$B,$t1
|
||||
ldr $t1,[$t3,#16]
|
||||
add $C,$C,$t2
|
||||
ldr $t2,[$t3,#20]
|
||||
add $D,$D,$t0
|
||||
ldr $t0,[$t3,#24]
|
||||
add $E,$E,$t1
|
||||
ldr $t1,[$t3,#28]
|
||||
add $F,$F,$t2
|
||||
ldr $inp,[sp,#17*4] @ pull inp
|
||||
ldr $t2,[sp,#18*4] @ pull inp+len
|
||||
add $G,$G,$t0
|
||||
add $H,$H,$t1
|
||||
stmia $t3,{$A,$B,$C,$D,$E,$F,$G,$H}
|
||||
cmp $inp,$t2
|
||||
sub $Ktbl,$Ktbl,#256 @ rewind Ktbl
|
||||
bne .Loop
|
||||
|
||||
add sp,sp,#`16+3`*4 @ destroy frame
|
||||
#if __ARM_ARCH__>=5
|
||||
ldmia sp!,{r4-r11,pc}
|
||||
#else
|
||||
ldmia sp!,{r4-r11,lr}
|
||||
tst lr,#1
|
||||
moveq pc,lr @ be binary compatible with V4, yet
|
||||
bx lr @ interoperable with Thumb ISA:-)
|
||||
#endif
|
||||
.size sha256_block_data_order,.-sha256_block_data_order
|
||||
___
|
||||
######################################################################
|
||||
# NEON stuff
|
||||
#
|
||||
{{{
|
||||
my @X=map("q$_",(0..3));
|
||||
my ($T0,$T1,$T2,$T3,$T4,$T5)=("q8","q9","q10","q11","d24","d25");
|
||||
my $Xfer=$t4;
|
||||
my $j=0;
|
||||
|
||||
sub Dlo() { shift=~m|q([1]?[0-9])|?"d".($1*2):""; }
|
||||
sub Dhi() { shift=~m|q([1]?[0-9])|?"d".($1*2+1):""; }
|
||||
|
||||
sub AUTOLOAD() # thunk [simplified] x86-style perlasm
|
||||
{ my $opcode = $AUTOLOAD; $opcode =~ s/.*:://; $opcode =~ s/_/\./;
|
||||
my $arg = pop;
|
||||
$arg = "#$arg" if ($arg*1 eq $arg);
|
||||
$code .= "\t$opcode\t".join(',',@_,$arg)."\n";
|
||||
}
|
||||
|
||||
sub Xupdate()
|
||||
{ use integer;
|
||||
my $body = shift;
|
||||
my @insns = (&$body,&$body,&$body,&$body);
|
||||
my ($a,$b,$c,$d,$e,$f,$g,$h);
|
||||
|
||||
&vext_8 ($T0,@X[0],@X[1],4); # X[1..4]
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
&vext_8 ($T1,@X[2],@X[3],4); # X[9..12]
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
&vshr_u32 ($T2,$T0,$sigma0[0]);
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
&vadd_i32 (@X[0],@X[0],$T1); # X[0..3] += X[9..12]
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
&vshr_u32 ($T1,$T0,$sigma0[2]);
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
&vsli_32 ($T2,$T0,32-$sigma0[0]);
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
&vshr_u32 ($T3,$T0,$sigma0[1]);
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
&veor ($T1,$T1,$T2);
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
&vsli_32 ($T3,$T0,32-$sigma0[1]);
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
&vshr_u32 ($T4,&Dhi(@X[3]),$sigma1[0]);
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
&veor ($T1,$T1,$T3); # sigma0(X[1..4])
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
&vsli_32 ($T4,&Dhi(@X[3]),32-$sigma1[0]);
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
&vshr_u32 ($T5,&Dhi(@X[3]),$sigma1[2]);
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
&vadd_i32 (@X[0],@X[0],$T1); # X[0..3] += sigma0(X[1..4])
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
&veor ($T5,$T5,$T4);
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
&vshr_u32 ($T4,&Dhi(@X[3]),$sigma1[1]);
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
&vsli_32 ($T4,&Dhi(@X[3]),32-$sigma1[1]);
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
&veor ($T5,$T5,$T4); # sigma1(X[14..15])
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
&vadd_i32 (&Dlo(@X[0]),&Dlo(@X[0]),$T5);# X[0..1] += sigma1(X[14..15])
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
&vshr_u32 ($T4,&Dlo(@X[0]),$sigma1[0]);
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
&vsli_32 ($T4,&Dlo(@X[0]),32-$sigma1[0]);
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
&vshr_u32 ($T5,&Dlo(@X[0]),$sigma1[2]);
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
&veor ($T5,$T5,$T4);
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
&vshr_u32 ($T4,&Dlo(@X[0]),$sigma1[1]);
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
&vld1_32 ("{$T0}","[$Ktbl,:128]!");
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
&vsli_32 ($T4,&Dlo(@X[0]),32-$sigma1[1]);
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
&veor ($T5,$T5,$T4); # sigma1(X[16..17])
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
&vadd_i32 (&Dhi(@X[0]),&Dhi(@X[0]),$T5);# X[2..3] += sigma1(X[16..17])
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
&vadd_i32 ($T0,$T0,@X[0]);
|
||||
while($#insns>=2) { eval(shift(@insns)); }
|
||||
&vst1_32 ("{$T0}","[$Xfer,:128]!");
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
|
||||
push(@X,shift(@X)); # "rotate" X[]
|
||||
}
|
||||
|
||||
sub Xpreload()
|
||||
{ use integer;
|
||||
my $body = shift;
|
||||
my @insns = (&$body,&$body,&$body,&$body);
|
||||
my ($a,$b,$c,$d,$e,$f,$g,$h);
|
||||
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
&vld1_32 ("{$T0}","[$Ktbl,:128]!");
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
&vrev32_8 (@X[0],@X[0]);
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
eval(shift(@insns));
|
||||
&vadd_i32 ($T0,$T0,@X[0]);
|
||||
foreach (@insns) { eval; } # remaining instructions
|
||||
&vst1_32 ("{$T0}","[$Xfer,:128]!");
|
||||
|
||||
push(@X,shift(@X)); # "rotate" X[]
|
||||
}
|
||||
|
||||
sub body_00_15 () {
|
||||
(
|
||||
'($a,$b,$c,$d,$e,$f,$g,$h)=@V;'.
|
||||
'&add ($h,$h,$t1)', # h+=X[i]+K[i]
|
||||
'&eor ($t1,$f,$g)',
|
||||
'&eor ($t0,$e,$e,"ror#".($Sigma1[1]-$Sigma1[0]))',
|
||||
'&add ($a,$a,$t2)', # h+=Maj(a,b,c) from the past
|
||||
'&and ($t1,$t1,$e)',
|
||||
'&eor ($t2,$t0,$e,"ror#".($Sigma1[2]-$Sigma1[0]))', # Sigma1(e)
|
||||
'&eor ($t0,$a,$a,"ror#".($Sigma0[1]-$Sigma0[0]))',
|
||||
'&eor ($t1,$t1,$g)', # Ch(e,f,g)
|
||||
'&add ($h,$h,$t2,"ror#$Sigma1[0]")', # h+=Sigma1(e)
|
||||
'&eor ($t2,$a,$b)', # a^b, b^c in next round
|
||||
'&eor ($t0,$t0,$a,"ror#".($Sigma0[2]-$Sigma0[0]))', # Sigma0(a)
|
||||
'&add ($h,$h,$t1)', # h+=Ch(e,f,g)
|
||||
'&ldr ($t1,sprintf "[sp,#%d]",4*(($j+1)&15)) if (($j&15)!=15);'.
|
||||
'&ldr ($t1,"[$Ktbl]") if ($j==15);'.
|
||||
'&ldr ($t1,"[sp,#64]") if ($j==31)',
|
||||
'&and ($t3,$t3,$t2)', # (b^c)&=(a^b)
|
||||
'&add ($d,$d,$h)', # d+=h
|
||||
'&add ($h,$h,$t0,"ror#$Sigma0[0]");'. # h+=Sigma0(a)
|
||||
'&eor ($t3,$t3,$b)', # Maj(a,b,c)
|
||||
'$j++; unshift(@V,pop(@V)); ($t2,$t3)=($t3,$t2);'
|
||||
)
|
||||
}
|
||||
|
||||
$code.=<<___;
|
||||
#if __ARM_MAX_ARCH__>=7
|
||||
.arch armv7-a
|
||||
.fpu neon
|
||||
|
||||
.global sha256_block_data_order_neon
|
||||
.type sha256_block_data_order_neon,%function
|
||||
.align 4
|
||||
sha256_block_data_order_neon:
|
||||
.LNEON:
|
||||
stmdb sp!,{r4-r12,lr}
|
||||
|
||||
sub $H,sp,#16*4+16
|
||||
adrl $Ktbl,K256
|
||||
bic $H,$H,#15 @ align for 128-bit stores
|
||||
mov $t2,sp
|
||||
mov sp,$H @ alloca
|
||||
add $len,$inp,$len,lsl#6 @ len to point at the end of inp
|
||||
|
||||
vld1.8 {@X[0]},[$inp]!
|
||||
vld1.8 {@X[1]},[$inp]!
|
||||
vld1.8 {@X[2]},[$inp]!
|
||||
vld1.8 {@X[3]},[$inp]!
|
||||
vld1.32 {$T0},[$Ktbl,:128]!
|
||||
vld1.32 {$T1},[$Ktbl,:128]!
|
||||
vld1.32 {$T2},[$Ktbl,:128]!
|
||||
vld1.32 {$T3},[$Ktbl,:128]!
|
||||
vrev32.8 @X[0],@X[0] @ yes, even on
|
||||
str $ctx,[sp,#64]
|
||||
vrev32.8 @X[1],@X[1] @ big-endian
|
||||
str $inp,[sp,#68]
|
||||
mov $Xfer,sp
|
||||
vrev32.8 @X[2],@X[2]
|
||||
str $len,[sp,#72]
|
||||
vrev32.8 @X[3],@X[3]
|
||||
str $t2,[sp,#76] @ save original sp
|
||||
vadd.i32 $T0,$T0,@X[0]
|
||||
vadd.i32 $T1,$T1,@X[1]
|
||||
vst1.32 {$T0},[$Xfer,:128]!
|
||||
vadd.i32 $T2,$T2,@X[2]
|
||||
vst1.32 {$T1},[$Xfer,:128]!
|
||||
vadd.i32 $T3,$T3,@X[3]
|
||||
vst1.32 {$T2},[$Xfer,:128]!
|
||||
vst1.32 {$T3},[$Xfer,:128]!
|
||||
|
||||
ldmia $ctx,{$A-$H}
|
||||
sub $Xfer,$Xfer,#64
|
||||
ldr $t1,[sp,#0]
|
||||
eor $t2,$t2,$t2
|
||||
eor $t3,$B,$C
|
||||
b .L_00_48
|
||||
|
||||
.align 4
|
||||
.L_00_48:
|
||||
___
|
||||
&Xupdate(\&body_00_15);
|
||||
&Xupdate(\&body_00_15);
|
||||
&Xupdate(\&body_00_15);
|
||||
&Xupdate(\&body_00_15);
|
||||
$code.=<<___;
|
||||
teq $t1,#0 @ check for K256 terminator
|
||||
ldr $t1,[sp,#0]
|
||||
sub $Xfer,$Xfer,#64
|
||||
bne .L_00_48
|
||||
|
||||
ldr $inp,[sp,#68]
|
||||
ldr $t0,[sp,#72]
|
||||
sub $Ktbl,$Ktbl,#256 @ rewind $Ktbl
|
||||
teq $inp,$t0
|
||||
it eq
|
||||
subeq $inp,$inp,#64 @ avoid SEGV
|
||||
vld1.8 {@X[0]},[$inp]! @ load next input block
|
||||
vld1.8 {@X[1]},[$inp]!
|
||||
vld1.8 {@X[2]},[$inp]!
|
||||
vld1.8 {@X[3]},[$inp]!
|
||||
it ne
|
||||
strne $inp,[sp,#68]
|
||||
mov $Xfer,sp
|
||||
___
|
||||
&Xpreload(\&body_00_15);
|
||||
&Xpreload(\&body_00_15);
|
||||
&Xpreload(\&body_00_15);
|
||||
&Xpreload(\&body_00_15);
|
||||
$code.=<<___;
|
||||
ldr $t0,[$t1,#0]
|
||||
add $A,$A,$t2 @ h+=Maj(a,b,c) from the past
|
||||
ldr $t2,[$t1,#4]
|
||||
ldr $t3,[$t1,#8]
|
||||
ldr $t4,[$t1,#12]
|
||||
add $A,$A,$t0 @ accumulate
|
||||
ldr $t0,[$t1,#16]
|
||||
add $B,$B,$t2
|
||||
ldr $t2,[$t1,#20]
|
||||
add $C,$C,$t3
|
||||
ldr $t3,[$t1,#24]
|
||||
add $D,$D,$t4
|
||||
ldr $t4,[$t1,#28]
|
||||
add $E,$E,$t0
|
||||
str $A,[$t1],#4
|
||||
add $F,$F,$t2
|
||||
str $B,[$t1],#4
|
||||
add $G,$G,$t3
|
||||
str $C,[$t1],#4
|
||||
add $H,$H,$t4
|
||||
str $D,[$t1],#4
|
||||
stmia $t1,{$E-$H}
|
||||
|
||||
ittte ne
|
||||
movne $Xfer,sp
|
||||
ldrne $t1,[sp,#0]
|
||||
eorne $t2,$t2,$t2
|
||||
ldreq sp,[sp,#76] @ restore original sp
|
||||
itt ne
|
||||
eorne $t3,$B,$C
|
||||
bne .L_00_48
|
||||
|
||||
ldmia sp!,{r4-r12,pc}
|
||||
.size sha256_block_data_order_neon,.-sha256_block_data_order_neon
|
||||
#endif
|
||||
___
|
||||
}}}
|
||||
######################################################################
|
||||
# ARMv8 stuff
|
||||
#
|
||||
{{{
|
||||
my ($ABCD,$EFGH,$abcd)=map("q$_",(0..2));
|
||||
my @MSG=map("q$_",(8..11));
|
||||
my ($W0,$W1,$ABCD_SAVE,$EFGH_SAVE)=map("q$_",(12..15));
|
||||
my $Ktbl="r3";
|
||||
|
||||
$code.=<<___;
|
||||
#if __ARM_MAX_ARCH__>=7 && !defined(__KERNEL__)
|
||||
|
||||
# ifdef __thumb2__
|
||||
# define INST(a,b,c,d) .byte c,d|0xc,a,b
|
||||
# else
|
||||
# define INST(a,b,c,d) .byte a,b,c,d
|
||||
# endif
|
||||
|
||||
.type sha256_block_data_order_armv8,%function
|
||||
.align 5
|
||||
sha256_block_data_order_armv8:
|
||||
.LARMv8:
|
||||
vld1.32 {$ABCD,$EFGH},[$ctx]
|
||||
# ifdef __thumb2__
|
||||
adr $Ktbl,.LARMv8
|
||||
sub $Ktbl,$Ktbl,#.LARMv8-K256
|
||||
# else
|
||||
adrl $Ktbl,K256
|
||||
# endif
|
||||
add $len,$inp,$len,lsl#6 @ len to point at the end of inp
|
||||
|
||||
.Loop_v8:
|
||||
vld1.8 {@MSG[0]-@MSG[1]},[$inp]!
|
||||
vld1.8 {@MSG[2]-@MSG[3]},[$inp]!
|
||||
vld1.32 {$W0},[$Ktbl]!
|
||||
vrev32.8 @MSG[0],@MSG[0]
|
||||
vrev32.8 @MSG[1],@MSG[1]
|
||||
vrev32.8 @MSG[2],@MSG[2]
|
||||
vrev32.8 @MSG[3],@MSG[3]
|
||||
vmov $ABCD_SAVE,$ABCD @ offload
|
||||
vmov $EFGH_SAVE,$EFGH
|
||||
teq $inp,$len
|
||||
___
|
||||
for($i=0;$i<12;$i++) {
|
||||
$code.=<<___;
|
||||
vld1.32 {$W1},[$Ktbl]!
|
||||
vadd.i32 $W0,$W0,@MSG[0]
|
||||
sha256su0 @MSG[0],@MSG[1]
|
||||
vmov $abcd,$ABCD
|
||||
sha256h $ABCD,$EFGH,$W0
|
||||
sha256h2 $EFGH,$abcd,$W0
|
||||
sha256su1 @MSG[0],@MSG[2],@MSG[3]
|
||||
___
|
||||
($W0,$W1)=($W1,$W0); push(@MSG,shift(@MSG));
|
||||
}
|
||||
$code.=<<___;
|
||||
vld1.32 {$W1},[$Ktbl]!
|
||||
vadd.i32 $W0,$W0,@MSG[0]
|
||||
vmov $abcd,$ABCD
|
||||
sha256h $ABCD,$EFGH,$W0
|
||||
sha256h2 $EFGH,$abcd,$W0
|
||||
|
||||
vld1.32 {$W0},[$Ktbl]!
|
||||
vadd.i32 $W1,$W1,@MSG[1]
|
||||
vmov $abcd,$ABCD
|
||||
sha256h $ABCD,$EFGH,$W1
|
||||
sha256h2 $EFGH,$abcd,$W1
|
||||
|
||||
vld1.32 {$W1},[$Ktbl]
|
||||
vadd.i32 $W0,$W0,@MSG[2]
|
||||
sub $Ktbl,$Ktbl,#256-16 @ rewind
|
||||
vmov $abcd,$ABCD
|
||||
sha256h $ABCD,$EFGH,$W0
|
||||
sha256h2 $EFGH,$abcd,$W0
|
||||
|
||||
vadd.i32 $W1,$W1,@MSG[3]
|
||||
vmov $abcd,$ABCD
|
||||
sha256h $ABCD,$EFGH,$W1
|
||||
sha256h2 $EFGH,$abcd,$W1
|
||||
|
||||
vadd.i32 $ABCD,$ABCD,$ABCD_SAVE
|
||||
vadd.i32 $EFGH,$EFGH,$EFGH_SAVE
|
||||
it ne
|
||||
bne .Loop_v8
|
||||
|
||||
vst1.32 {$ABCD,$EFGH},[$ctx]
|
||||
|
||||
ret @ bx lr
|
||||
.size sha256_block_data_order_armv8,.-sha256_block_data_order_armv8
|
||||
#endif
|
||||
___
|
||||
}}}
|
||||
$code.=<<___;
|
||||
.asciz "SHA256 block transform for ARMv4/NEON/ARMv8, CRYPTOGAMS by <appro\@openssl.org>"
|
||||
.align 2
|
||||
#if __ARM_MAX_ARCH__>=7 && !defined(__KERNEL__)
|
||||
.comm OPENSSL_armcap_P,4,4
|
||||
#endif
|
||||
___
|
||||
|
||||
open SELF,$0;
|
||||
while(<SELF>) {
|
||||
next if (/^#!/);
|
||||
last if (!s/^#/@/ and !/^$/);
|
||||
print;
|
||||
}
|
||||
close SELF;
|
||||
|
||||
{ my %opcode = (
|
||||
"sha256h" => 0xf3000c40, "sha256h2" => 0xf3100c40,
|
||||
"sha256su0" => 0xf3ba03c0, "sha256su1" => 0xf3200c40 );
|
||||
|
||||
sub unsha256 {
|
||||
my ($mnemonic,$arg)=@_;
|
||||
|
||||
if ($arg =~ m/q([0-9]+)(?:,\s*q([0-9]+))?,\s*q([0-9]+)/o) {
|
||||
my $word = $opcode{$mnemonic}|(($1&7)<<13)|(($1&8)<<19)
|
||||
|(($2&7)<<17)|(($2&8)<<4)
|
||||
|(($3&7)<<1) |(($3&8)<<2);
|
||||
# since ARMv7 instructions are always encoded little-endian.
|
||||
# correct solution is to use .inst directive, but older
|
||||
# assemblers don't implement it:-(
|
||||
sprintf "INST(0x%02x,0x%02x,0x%02x,0x%02x)\t@ %s %s",
|
||||
$word&0xff,($word>>8)&0xff,
|
||||
($word>>16)&0xff,($word>>24)&0xff,
|
||||
$mnemonic,$arg;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
foreach (split($/,$code)) {
|
||||
|
||||
s/\`([^\`]*)\`/eval $1/geo;
|
||||
|
||||
s/\b(sha256\w+)\s+(q.*)/unsha256($1,$2)/geo;
|
||||
|
||||
s/\bret\b/bx lr/go or
|
||||
s/\bbx\s+lr\b/.word\t0xe12fff1e/go; # make it possible to compile with -march=armv4
|
||||
|
||||
print $_,"\n";
|
||||
}
|
||||
|
||||
close STDOUT; # enforce flush
|
File diff suppressed because it is too large
Load Diff
|
@ -0,0 +1,128 @@
|
|||
/*
|
||||
* Glue code for the SHA256 Secure Hash Algorithm assembly implementation
|
||||
* using optimized ARM assembler and NEON instructions.
|
||||
*
|
||||
* Copyright © 2015 Google Inc.
|
||||
*
|
||||
* This file is based on sha256_ssse3_glue.c:
|
||||
* Copyright (C) 2013 Intel Corporation
|
||||
* Author: Tim Chen <tim.c.chen@linux.intel.com>
|
||||
*
|
||||
* This program is free software; you can redistribute it and/or modify it
|
||||
* under the terms of the GNU General Public License as published by the Free
|
||||
* Software Foundation; either version 2 of the License, or (at your option)
|
||||
* any later version.
|
||||
*
|
||||
*/
|
||||
|
||||
#include <crypto/internal/hash.h>
|
||||
#include <linux/crypto.h>
|
||||
#include <linux/init.h>
|
||||
#include <linux/module.h>
|
||||
#include <linux/mm.h>
|
||||
#include <linux/cryptohash.h>
|
||||
#include <linux/types.h>
|
||||
#include <linux/string.h>
|
||||
#include <crypto/sha.h>
|
||||
#include <crypto/sha256_base.h>
|
||||
#include <asm/simd.h>
|
||||
#include <asm/neon.h>
|
||||
|
||||
#include "sha256_glue.h"
|
||||
|
||||
asmlinkage void sha256_block_data_order(u32 *digest, const void *data,
|
||||
unsigned int num_blks);
|
||||
|
||||
int crypto_sha256_arm_update(struct shash_desc *desc, const u8 *data,
|
||||
unsigned int len)
|
||||
{
|
||||
/* make sure casting to sha256_block_fn() is safe */
|
||||
BUILD_BUG_ON(offsetof(struct sha256_state, state) != 0);
|
||||
|
||||
return sha256_base_do_update(desc, data, len,
|
||||
(sha256_block_fn *)sha256_block_data_order);
|
||||
}
|
||||
EXPORT_SYMBOL(crypto_sha256_arm_update);
|
||||
|
||||
static int sha256_final(struct shash_desc *desc, u8 *out)
|
||||
{
|
||||
sha256_base_do_finalize(desc,
|
||||
(sha256_block_fn *)sha256_block_data_order);
|
||||
return sha256_base_finish(desc, out);
|
||||
}
|
||||
|
||||
int crypto_sha256_arm_finup(struct shash_desc *desc, const u8 *data,
|
||||
unsigned int len, u8 *out)
|
||||
{
|
||||
sha256_base_do_update(desc, data, len,
|
||||
(sha256_block_fn *)sha256_block_data_order);
|
||||
return sha256_final(desc, out);
|
||||
}
|
||||
EXPORT_SYMBOL(crypto_sha256_arm_finup);
|
||||
|
||||
static struct shash_alg algs[] = { {
|
||||
.digestsize = SHA256_DIGEST_SIZE,
|
||||
.init = sha256_base_init,
|
||||
.update = crypto_sha256_arm_update,
|
||||
.final = sha256_final,
|
||||
.finup = crypto_sha256_arm_finup,
|
||||
.descsize = sizeof(struct sha256_state),
|
||||
.base = {
|
||||
.cra_name = "sha256",
|
||||
.cra_driver_name = "sha256-asm",
|
||||
.cra_priority = 150,
|
||||
.cra_flags = CRYPTO_ALG_TYPE_SHASH,
|
||||
.cra_blocksize = SHA256_BLOCK_SIZE,
|
||||
.cra_module = THIS_MODULE,
|
||||
}
|
||||
}, {
|
||||
.digestsize = SHA224_DIGEST_SIZE,
|
||||
.init = sha224_base_init,
|
||||
.update = crypto_sha256_arm_update,
|
||||
.final = sha256_final,
|
||||
.finup = crypto_sha256_arm_finup,
|
||||
.descsize = sizeof(struct sha256_state),
|
||||
.base = {
|
||||
.cra_name = "sha224",
|
||||
.cra_driver_name = "sha224-asm",
|
||||
.cra_priority = 150,
|
||||
.cra_flags = CRYPTO_ALG_TYPE_SHASH,
|
||||
.cra_blocksize = SHA224_BLOCK_SIZE,
|
||||
.cra_module = THIS_MODULE,
|
||||
}
|
||||
} };
|
||||
|
||||
static int __init sha256_mod_init(void)
|
||||
{
|
||||
int res = crypto_register_shashes(algs, ARRAY_SIZE(algs));
|
||||
|
||||
if (res < 0)
|
||||
return res;
|
||||
|
||||
if (IS_ENABLED(CONFIG_KERNEL_MODE_NEON) && cpu_has_neon()) {
|
||||
res = crypto_register_shashes(sha256_neon_algs,
|
||||
ARRAY_SIZE(sha256_neon_algs));
|
||||
|
||||
if (res < 0)
|
||||
crypto_unregister_shashes(algs, ARRAY_SIZE(algs));
|
||||
}
|
||||
|
||||
return res;
|
||||
}
|
||||
|
||||
static void __exit sha256_mod_fini(void)
|
||||
{
|
||||
crypto_unregister_shashes(algs, ARRAY_SIZE(algs));
|
||||
|
||||
if (IS_ENABLED(CONFIG_KERNEL_MODE_NEON) && cpu_has_neon())
|
||||
crypto_unregister_shashes(sha256_neon_algs,
|
||||
ARRAY_SIZE(sha256_neon_algs));
|
||||
}
|
||||
|
||||
module_init(sha256_mod_init);
|
||||
module_exit(sha256_mod_fini);
|
||||
|
||||
MODULE_LICENSE("GPL");
|
||||
MODULE_DESCRIPTION("SHA256 Secure Hash Algorithm (ARM), including NEON");
|
||||
|
||||
MODULE_ALIAS_CRYPTO("sha256");
|
|
@ -0,0 +1,14 @@
|
|||
#ifndef _CRYPTO_SHA256_GLUE_H
|
||||
#define _CRYPTO_SHA256_GLUE_H
|
||||
|
||||
#include <linux/crypto.h>
|
||||
|
||||
extern struct shash_alg sha256_neon_algs[2];
|
||||
|
||||
int crypto_sha256_arm_update(struct shash_desc *desc, const u8 *data,
|
||||
unsigned int len);
|
||||
|
||||
int crypto_sha256_arm_finup(struct shash_desc *desc, const u8 *data,
|
||||
unsigned int len, u8 *hash);
|
||||
|
||||
#endif /* _CRYPTO_SHA256_GLUE_H */
|
|
@ -0,0 +1,101 @@
|
|||
/*
|
||||
* Glue code for the SHA256 Secure Hash Algorithm assembly implementation
|
||||
* using NEON instructions.
|
||||
*
|
||||
* Copyright © 2015 Google Inc.
|
||||
*
|
||||
* This file is based on sha512_neon_glue.c:
|
||||
* Copyright © 2014 Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
||||
*
|
||||
* This program is free software; you can redistribute it and/or modify it
|
||||
* under the terms of the GNU General Public License as published by the Free
|
||||
* Software Foundation; either version 2 of the License, or (at your option)
|
||||
* any later version.
|
||||
*
|
||||
*/
|
||||
|
||||
#include <crypto/internal/hash.h>
|
||||
#include <linux/cryptohash.h>
|
||||
#include <linux/types.h>
|
||||
#include <linux/string.h>
|
||||
#include <crypto/sha.h>
|
||||
#include <crypto/sha256_base.h>
|
||||
#include <asm/byteorder.h>
|
||||
#include <asm/simd.h>
|
||||
#include <asm/neon.h>
|
||||
|
||||
#include "sha256_glue.h"
|
||||
|
||||
asmlinkage void sha256_block_data_order_neon(u32 *digest, const void *data,
|
||||
unsigned int num_blks);
|
||||
|
||||
static int sha256_update(struct shash_desc *desc, const u8 *data,
|
||||
unsigned int len)
|
||||
{
|
||||
struct sha256_state *sctx = shash_desc_ctx(desc);
|
||||
|
||||
if (!may_use_simd() ||
|
||||
(sctx->count % SHA256_BLOCK_SIZE) + len < SHA256_BLOCK_SIZE)
|
||||
return crypto_sha256_arm_update(desc, data, len);
|
||||
|
||||
kernel_neon_begin();
|
||||
sha256_base_do_update(desc, data, len,
|
||||
(sha256_block_fn *)sha256_block_data_order_neon);
|
||||
kernel_neon_end();
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int sha256_finup(struct shash_desc *desc, const u8 *data,
|
||||
unsigned int len, u8 *out)
|
||||
{
|
||||
if (!may_use_simd())
|
||||
return crypto_sha256_arm_finup(desc, data, len, out);
|
||||
|
||||
kernel_neon_begin();
|
||||
if (len)
|
||||
sha256_base_do_update(desc, data, len,
|
||||
(sha256_block_fn *)sha256_block_data_order_neon);
|
||||
sha256_base_do_finalize(desc,
|
||||
(sha256_block_fn *)sha256_block_data_order_neon);
|
||||
kernel_neon_end();
|
||||
|
||||
return sha256_base_finish(desc, out);
|
||||
}
|
||||
|
||||
static int sha256_final(struct shash_desc *desc, u8 *out)
|
||||
{
|
||||
return sha256_finup(desc, NULL, 0, out);
|
||||
}
|
||||
|
||||
struct shash_alg sha256_neon_algs[] = { {
|
||||
.digestsize = SHA256_DIGEST_SIZE,
|
||||
.init = sha256_base_init,
|
||||
.update = sha256_update,
|
||||
.final = sha256_final,
|
||||
.finup = sha256_finup,
|
||||
.descsize = sizeof(struct sha256_state),
|
||||
.base = {
|
||||
.cra_name = "sha256",
|
||||
.cra_driver_name = "sha256-neon",
|
||||
.cra_priority = 250,
|
||||
.cra_flags = CRYPTO_ALG_TYPE_SHASH,
|
||||
.cra_blocksize = SHA256_BLOCK_SIZE,
|
||||
.cra_module = THIS_MODULE,
|
||||
}
|
||||
}, {
|
||||
.digestsize = SHA224_DIGEST_SIZE,
|
||||
.init = sha224_base_init,
|
||||
.update = sha256_update,
|
||||
.final = sha256_final,
|
||||
.finup = sha256_finup,
|
||||
.descsize = sizeof(struct sha256_state),
|
||||
.base = {
|
||||
.cra_name = "sha224",
|
||||
.cra_driver_name = "sha224-neon",
|
||||
.cra_priority = 250,
|
||||
.cra_flags = CRYPTO_ALG_TYPE_SHASH,
|
||||
.cra_blocksize = SHA224_BLOCK_SIZE,
|
||||
.cra_module = THIS_MODULE,
|
||||
}
|
||||
} };
|
|
@ -21,6 +21,7 @@ generic-y += preempt.h
|
|||
generic-y += resource.h
|
||||
generic-y += rwsem.h
|
||||
generic-y += scatterlist.h
|
||||
generic-y += seccomp.h
|
||||
generic-y += sections.h
|
||||
generic-y += segment.h
|
||||
generic-y += sembuf.h
|
||||
|
|
|
@ -269,6 +269,16 @@ static inline void __kvm_flush_dcache_pud(pud_t pud)
|
|||
void kvm_set_way_flush(struct kvm_vcpu *vcpu);
|
||||
void kvm_toggle_cache(struct kvm_vcpu *vcpu, bool was_enabled);
|
||||
|
||||
static inline bool __kvm_cpu_uses_extended_idmap(void)
|
||||
{
|
||||
return false;
|
||||
}
|
||||
|
||||
static inline void __kvm_extend_hypmap(pgd_t *boot_hyp_pgd,
|
||||
pgd_t *hyp_pgd,
|
||||
pgd_t *merged_hyp_pgd,
|
||||
unsigned long hyp_idmap_start) { }
|
||||
|
||||
#endif /* !__ASSEMBLY__ */
|
||||
|
||||
#endif /* __ARM_KVM_MMU_H__ */
|
||||
|
|
|
@ -1,11 +0,0 @@
|
|||
#ifndef _ASM_ARM_SECCOMP_H
|
||||
#define _ASM_ARM_SECCOMP_H
|
||||
|
||||
#include <linux/unistd.h>
|
||||
|
||||
#define __NR_seccomp_read __NR_read
|
||||
#define __NR_seccomp_write __NR_write
|
||||
#define __NR_seccomp_exit __NR_exit
|
||||
#define __NR_seccomp_sigreturn __NR_rt_sigreturn
|
||||
|
||||
#endif /* _ASM_ARM_SECCOMP_H */
|
Some files were not shown because too many files have changed in this diff Show More
Loading…
Reference in New Issue