Commit Graph

644 Commits

Author SHA1 Message Date
Herbert Xu c5d18e984a [IPSEC]: Fix catch-22 with algorithm IDs above 31
As it stands it's impossible to use any authentication algorithms
with an ID above 31 portably.  It just happens to work on x86 but
fails miserably on ppc64.

The reason is that we're using a bit mask to check the algorithm
ID but the mask is only 32 bits wide.

After looking at how this is used in the field, I have concluded
that in the long term we should phase out state matching by IDs
because this is made superfluous by the reqid feature.  For current
applications, the best solution IMHO is to allow all algorithms when
the bit masks are all ~0.

The following patch does exactly that.

This bug was identified by IBM when testing on the ppc64 platform
using the NULL authentication algorithm which has an ID of 251.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-22 00:46:42 -07:00
Denis V. Lunev 2c8dd11636 [XFRM]: Compilation warnings in xfrm_user.c.
When CONFIG_SECURITY_NETWORK_XFRM is undefined the following warnings appears:
net/xfrm/xfrm_user.c: In function 'xfrm_add_pol_expire':
net/xfrm/xfrm_user.c:1576: warning: 'ctx' may be used uninitialized in this function
net/xfrm/xfrm_user.c: In function 'xfrm_get_policy':
net/xfrm/xfrm_user.c:1340: warning: 'ctx' may be used uninitialized in this function
(security_xfrm_policy_alloc is noop for the case).

It seems that they are result of the commit
03e1ad7b5d ("LSM: Make the Labeled IPsec
hooks more stack friendly")

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-14 14:47:48 -07:00
David S. Miller df39e8ba56 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/ehea/ehea_main.c
	drivers/net/wireless/iwlwifi/Kconfig
	drivers/net/wireless/rt2x00/rt61pci.c
	net/ipv4/inet_timewait_sock.c
	net/ipv6/raw.c
	net/mac80211/ieee80211_sta.c
2008-04-14 02:30:23 -07:00
Paul Moore 03e1ad7b5d LSM: Make the Labeled IPsec hooks more stack friendly
The xfrm_get_policy() and xfrm_add_pol_expire() put some rather large structs
on the stack to work around the LSM API.  This patch attempts to fix that
problem by changing the LSM API to require only the relevant "security"
pointers instead of the entire SPD entry; we do this for all of the
security_xfrm_policy*() functions to keep things consistent.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-12 19:07:52 -07:00
Patrick McHardy bcf0dda8d2 [XFRM]: xfrm_user: fix selector family initialization
Commit df9dcb45 ([IPSEC]: Fix inter address family IPsec tunnel handling)
broke openswan by removing the selector initialization for tunnel mode
in case it is uninitialized.

This patch restores the initialization, fixing openswan, but probably
breaking inter-family tunnels again (unknown since the patch author
disappeared). The correct thing for inter-family tunnels is probably
to simply initialize the selector family explicitly.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-09 15:08:24 -07:00
David S. Miller 8e8e43843b Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/usb/rndis_host.c
	drivers/net/wireless/b43/dma.c
	net/ipv6/ndisc.c
2008-03-27 18:48:56 -07:00
YOSHIFUJI Hideaki c346dca108 [NET] NETNS: Omit net_device->nd_net without CONFIG_NET_NS.
Introduce per-net_device inlines: dev_net(), dev_net_set().
Without CONFIG_NET_NS, no namespace other than &init_net exists.
Let's explicitly define them to help compiler optimizations.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-26 04:39:53 +09:00
YOSHIFUJI Hideaki 9bb182a700 [XFRM] MIP6: Fix address keys for routing search.
Each MIPv6 XFRM state (DSTOPT/RH2) holds either destination or source
address to be mangled in the IPv6 header (that is "CoA").
On Inter-MN communication after both nodes binds each other,
they use route optimized traffic two MIPv6 states applied, and
both source and destination address in the IPv6 header
are replaced by the states respectively.
The packet format is correct, however, next-hop routing search
are not.
This patch fixes it by remembering address pairs for later states.

Based on patch from Masahide NAKAMURA <nakam@linux-ipv6.org>.

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-25 10:23:57 +09:00
Kazunori MIYAZAWA df9dcb4588 [IPSEC]: Fix inter address family IPsec tunnel handling.
Signed-off-by: Kazunori MIYAZAWA <kazunori@miyazawa.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 14:51:51 -07:00
Timo Teras 4c563f7669 [XFRM]: Speed up xfrm_policy and xfrm_state walking
Change xfrm_policy and xfrm_state walking algorithm from O(n^2) to O(n).
This is achieved adding the entries to one more list which is used
solely for walking the entries.

This also fixes some races where the dump can have duplicate or missing
entries when the SPD/SADB is modified during an ongoing dump.

Dumping SADB with 20000 entries using "time ip xfrm state" the sys
time dropped from 1.012s to 0.080s.

Signed-off-by: Timo Teras <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 21:31:08 -08:00
YOSHIFUJI Hideaki b791160b5a [XFRM]: Fix ordering issue in xfrm_dst_hash_transfer().
Keep ordering of policy entries with same selector in
xfrm_dst_hash_transfer().

Issue should not appear in usual cases because multiple policy entries
with same selector are basically not allowed so far.  Bug was pointed
out by Sebastien Decugis <sdecugis@hongo.wide.ad.jp>.

We could convert bydst from hlist to list and use list_add_tail()
instead.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: Sebastien Decugis <sdecugis@hongo.wide.ad.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-17 23:29:30 -08:00
YOSHIFUJI Hideaki 073a371987 [XFRM]: Avoid bogus BUG() when throwing new policy away.
From: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

When we destory a new policy entry, we need to tell
xfrm_policy_destroy() explicitly that the entry is not
alive yet.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-14 14:52:38 -08:00
Paul Mundt 0f4bda005f net: xfrm statistics depend on INET
net/built-in.o: In function `xfrm_policy_init':
/home/pmundt/devel/git/sh-2.6.25/net/xfrm/xfrm_policy.c:2338: undefined reference to `snmp_mib_init'

snmp_mib_init() is only built in if CONFIG_INET is set.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-14 14:48:45 -08:00
Herbert Xu b318e0e4ef [IPSEC]: Fix bogus usage of u64 on input sequence number
Al Viro spotted a bogus use of u64 on the input sequence number which
is big-endian.  This patch fixes it by giving the input sequence number
its own member in the xfrm_skb_cb structure.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 22:50:35 -08:00
Joy Latten 405137d16f [IPSEC]: Add support for aes-ctr.
The below patch allows IPsec to use CTR mode with AES encryption
algorithm. Tested this using setkey in ipsec-tools.

Signed-off-by: Joy Latten <latten@austin.ibm.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 23:11:56 -08:00
Al Viro 0c11b9428f [PATCH] switch audit_get_loginuid() to task_struct *
all callers pass something->audit_context

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-02-01 14:04:59 -05:00
Masahide NAKAMURA 9472c9ef64 [XFRM]: Fix statistics.
o Outbound sequence number overflow error status
  is counted as XfrmOutStateSeqError.
o Additionaly, it changes inbound sequence number replay
  error name from XfrmInSeqOutOfWindow to XfrmInStateSeqError
  to apply name scheme above.
o Inbound IPv4 UDP encapsuling type mismatch error is wrongly
  mapped to XfrmInStateInvalid then this patch fiex the error
  to XfrmInStateMismatch.

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:30 -08:00
Adrian Bunk 5255dc6e14 [XFRM]: Remove unused exports.
This patch removes the following no longer used EXPORT_SYMBOL's:
- xfrm_input.c: xfrm_parse_spi
- xfrm_state.c: xfrm_replay_check
- xfrm_state.c: xfrm_replay_advance

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:29 -08:00
Eric Dumazet 533cb5b0a6 [XFRM]: constify 'struct xfrm_type'
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:20 -08:00
Herbert Xu 1a6509d991 [IPSEC]: Add support for combined mode algorithms
This patch adds support for combined mode algorithms with GCM being
the first algorithm supported.

Combined mode algorithms can be added through the xfrm_user interface
using the new algorithm payload type XFRMA_ALG_AEAD.  Each algorithms
is identified by its name and the ICV length.

For the purposes of matching algorithms in xfrm_tmpl structures,
combined mode algorithms occupy the same name space as encryption
algorithms.  This is in line with how they are negotiated using IKE.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:03 -08:00
Herbert Xu 6fbf2cb774 [IPSEC]: Allow async algorithms
Now that ESP uses authenc we can turn on the support for async
algorithms in IPsec.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:02 -08:00
Denis V. Lunev b7c6ba6eb1 [NETNS]: Consolidate kernel netlink socket destruction.
Create a specific helper for netlink kernel socket disposal. This just
let the code look better and provides a ground for proper disposal
inside a namespace.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Tested-by: Alexey Dobriyan <adobriyan@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:07 -08:00
Ilpo Järvinen 1486cbd777 [XFRM] xfrm_policy: kill some bloat
net/xfrm/xfrm_policy.c:
  xfrm_audit_policy_delete | -692
  xfrm_audit_policy_add    | -692
 2 functions changed, 1384 bytes removed, diff: -1384

net/xfrm/xfrm_policy.c:
  xfrm_audit_common_policyinfo | +704
 1 function changed, 704 bytes added, diff: +704

net/xfrm/xfrm_policy.o:
 3 functions changed, 704 bytes added, 1384 bytes removed, diff: -680

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:49 -08:00
Sebastian Siewior 50dd79653e [XFRM]: Remove ifdef crypto.
and select the crypto subsystem if neccessary

Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:12 -08:00
Eric Dumazet 6666351df9 [XFRM]: xfrm_state_clone() should be static, not exported
xfrm_state_clone() is not used outside of net/xfrm/xfrm_state.c
There is no need to export it.

Spoted by sparse checker.
   CHECK   net/xfrm/xfrm_state.c
net/xfrm/xfrm_state.c:1103:19: warning: symbol 'xfrm_state_clone' was not
declared. Should it be static?

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:49 -08:00
WANG Cong 64c31b3f76 [XFRM] xfrm_policy_destroy: Rename and relative fixes.
Since __xfrm_policy_destroy is used to destory the resources
allocated by xfrm_policy_alloc. So using the name
__xfrm_policy_destroy is not correspond with xfrm_policy_alloc.
Rename it to xfrm_policy_destroy.

And along with some instances that call xfrm_policy_alloc
but not using xfrm_policy_destroy to destroy the resource,
fix them.

Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:46 -08:00
Masahide NAKAMURA d66e37a99d [XFRM] Statistics: Add outbound-dropping error.
o Increment PolError counter when flow_cache_lookup() returns
  errored pointer.

o Increment NoStates counter at larval-drop.

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:45 -08:00
Ilpo Järvinen cf35f43e6e [XFRM]: Kill some bloat
net/xfrm/xfrm_state.c:
  xfrm_audit_state_delete          | -589
  xfrm_replay_check                | -542
  xfrm_audit_state_icvfail         | -520
  xfrm_audit_state_add             | -589
  xfrm_audit_state_replay_overflow | -523
  xfrm_audit_state_notfound_simple | -509
  xfrm_audit_state_notfound        | -521
 7 functions changed, 3793 bytes removed, diff: -3793

net/xfrm/xfrm_state.c:
  xfrm_audit_helper_pktinfo | +522
  xfrm_audit_helper_sainfo  | +598
 2 functions changed, 1120 bytes added, diff: +1120

net/xfrm/xfrm_state.o:
 9 functions changed, 1120 bytes added, 3793 bytes removed, diff: -2673

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:43 -08:00
Herbert Xu dbb1db8b59 [IPSEC]: Return EOVERFLOW when output sequence number overflows
Previously we made it an error on the output path if the sequence number
overflowed.  However we did not set the err variable accordingly.  This
patch sets err to -EOVERFLOW in that case.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:32 -08:00
Eric Dumazet 9a429c4983 [NET]: Add some acquires/releases sparse annotations.
Add __acquires() and __releases() annotations to suppress some sparse
warnings.

example of warnings :

net/ipv4/udp.c:1555:14: warning: context imbalance in 'udp_seq_start' - wrong
count at exit
net/ipv4/udp.c:1571:13: warning: context imbalance in 'udp_seq_stop' -
unexpected unlock

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:31 -08:00
Herbert Xu 9dd3245a2a [IPSEC]: Move all calls to xfrm_audit_state_icvfail to xfrm_input
Let's nip the code duplication in the bud :)

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:10 -08:00
Herbert Xu fcb8c156c8 [IPSEC]: Fix double free on skb on async output
When the output transform returns EINPROGRESS due to async operation we'll
free the skb the straight away as if it were an error.  This patch fixes
that so that the skb is freed when the async operation completes.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:09 -08:00
Masahide NAKAMURA b15c4bcd15 [XFRM]: Fix outbound statistics.
Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:04 -08:00
Paul Moore e1af9f270b [XFRM]: Drop packets when replay counter would overflow
According to RFC4303, section 3.3.3 we need to drop outgoing packets which
cause the replay counter to overflow:

   3.3.3.  Sequence Number Generation

   The sender's counter is initialized to 0 when an SA is established.
   The sender increments the sequence number (or ESN) counter for this
   SA and inserts the low-order 32 bits of the value into the Sequence
   Number field.  Thus, the first packet sent using a given SA will
   contain a sequence number of 1.

   If anti-replay is enabled (the default), the sender checks to ensure
   that the counter has not cycled before inserting the new value in the
   Sequence Number field.  In other words, the sender MUST NOT send a
   packet on an SA if doing so would cause the sequence number to cycle.
   An attempt to transmit a packet that would result in sequence number
   overflow is an auditable event.  The audit log entry for this event
   SHOULD include the SPI value, current date/time, Source Address,
   Destination Address, and (in IPv6) the cleartext Flow ID.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:02 -08:00
Paul Moore afeb14b490 [XFRM]: RFC4303 compliant auditing
This patch adds a number of new IPsec audit events to meet the auditing
requirements of RFC4303.  This includes audit hooks for the following events:

 * Could not find a valid SA [sections 2.1, 3.4.2]
   . xfrm_audit_state_notfound()
   . xfrm_audit_state_notfound_simple()

 * Sequence number overflow [section 3.3.3]
   . xfrm_audit_state_replay_overflow()

 * Replayed packet [section 3.4.3]
   . xfrm_audit_state_replay()

 * Integrity check failure [sections 3.4.4.1, 3.4.4.2]
   . xfrm_audit_state_icvfail()

While RFC4304 deals only with ESP most of the changes in this patch apply to
IPsec in general, i.e. both AH and ESP.  The one case, integrity check
failure, where ESP specific code had to be modified the same was done to the
AH code for the sake of consistency.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:01 -08:00
Paul Moore 68277accb3 [XFRM]: Assorted IPsec fixups
This patch fixes a number of small but potentially troublesome things in the
XFRM/IPsec code:

 * Use the 'audit_enabled' variable already in include/linux/audit.h
   Removed the need for extern declarations local to each XFRM audit fuction

 * Convert 'sid' to 'secid' everywhere we can
   The 'sid' name is specific to SELinux, 'secid' is the common naming
   convention used by the kernel when refering to tokenized LSM labels,
   unfortunately we have to leave 'ctx_sid' in 'struct xfrm_sec_ctx' otherwise
   we risk breaking userspace

 * Convert address display to use standard NIP* macros
   Similar to what was recently done with the SPD audit code, this also also
   includes the removal of some unnecessary memcpy() calls

 * Move common code to xfrm_audit_common_stateinfo()
   Code consolidation from the "less is more" book on software development

 * Proper spacing around commas in function arguments
   Minor style tweak since I was already touching the code

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:59:40 -08:00
Masahide NAKAMURA 8ea843495d [XFRM]: Add packet processing statistics option.
Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:59:39 -08:00
Masahide NAKAMURA 0aa647746e [XFRM]: Support to increment packet dropping statistics.
Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:59:39 -08:00
Masahide NAKAMURA 558f82ef6e [XFRM]: Define packet dropping statistics.
This statistics is shown factor dropped by transformation
at /proc/net/xfrm_stat for developer.
It is a counter designed from current transformation source code
and defined as linux private MIB.

See Documentation/networking/xfrm_proc.txt for the detail.

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:59:38 -08:00
Masahide NAKAMURA a1b051405b [XFRM] IPv6: Fix dst/routing check at transformation.
IPv6 specific thing is wrongly removed from transformation at net-2.6.25.
This patch recovers it with current design.

o Update "path" of xfrm_dst since IPv6 transformation should
  care about routing changes. It is required by MIPv6 and
  off-link destined IPsec.
o Rename nfheader_len which is for non-fragment transformation used by
  MIPv6 to rt6i_nfheader_len as IPv6 name space.

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:59:36 -08:00
Herbert Xu 910ef70aa3 [IPSEC]: Do xfrm_state_check_space before encapsulation
While merging the IPsec output path I moved the encapsulation output
operation to the top of the loop so that it sits outside of the locked
section.  Unfortunately in doing so it now sits in front of the space
check as well which could be a fatal error.

This patch rearranges the calls so that the space check happens as
the thing on the output path.

This patch also fixes an incorrect goto should the encapsulation output
fail.

Thanks to Kazunori MIYAZAWA for finding this bug.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:59:13 -08:00
Pavel Emelyanov 4bda4f250d [XFRM]: Fix potential race vs xfrm_state(only)_find and xfrm_hash_resize.
The _find calls calculate the hash value using the
xfrm_state_hmask, without the xfrm_state_lock. But the
value of this mask can change in the _resize call under
the state_lock, so we risk to fail in finding the desired
entry in hash.

I think, that the hash value is better to calculate
under the state lock.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:58:07 -08:00
Herbert Xu aef2178599 [IPSEC]: Fix zero return value in xfrm_lookup on error
Further testing shows that my ICMP relookup patch can cause xfrm_lookup
to return zero on error which isn't very nice since it leads to the caller
dying on null pointer dereference.  The bug is due to not setting err
to ENOENT just before we leave xfrm_lookup in case of no policy.

This patch moves the err setting to where it should be.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:57:54 -08:00
Herbert Xu 8b7817f3a9 [IPSEC]: Add ICMP host relookup support
RFC 4301 requires us to relookup ICMP traffic that does not match any
policies using the reverse of its payload.  This patch implements this
for ICMP traffic that originates from or terminates on localhost.

This is activated on outbound with the new policy flag XFRM_POLICY_ICMP,
and on inbound by the new state flag XFRM_STATE_ICMP.

On inbound the policy check is now performed by the ICMP protocol so
that it can repeat the policy check where necessary.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:57:23 -08:00
Herbert Xu d5422efe68 [IPSEC]: Added xfrm_decode_session_reverse and xfrmX_policy_check_reverse
RFC 4301 requires us to relookup ICMP traffic that does not match any
policies using the reverse of its payload.  This patch adds the functions
xfrm_decode_session_reverse and xfrmX_policy_check_reverse so we can get
the reverse flow to perform such a lookup.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:57:22 -08:00
Herbert Xu 815f4e57e9 [IPSEC]: Make xfrm_lookup flags argument a bit-field
This patch introduces an enum for bits in the flags argument of xfrm_lookup.
This is so that we can cram more information into it later.

Since all current users use just the values 0 and 1, XFRM_LOOKUP_WAIT has
been added with the value 1 << 0 to represent the current meaning of flags.

The test in __xfrm_lookup has been changed accordingly.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:57:21 -08:00
Herbert Xu 005011211f [IPSEC]: Add xfrm_input_state helper
This patch adds the xfrm_input_state helper function which returns the
current xfrm state being processed on the input path given an sk_buff.
This is currently only used by xfrm_input but will be used by ESP upon
asynchronous resumption.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:57:05 -08:00
Denis Cheng df01812eba [XFRM] net/xfrm/xfrm_state.c: use LIST_HEAD instead of LIST_HEAD_INIT
single list_head variable initialized with LIST_HEAD_INIT could almost
always can be replaced with LIST_HEAD declaration, this shrinks the code
and looks better.

Signed-off-by: Denis Cheng <crquan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:56:54 -08:00
Denis V. Lunev 5a3e55d68e [NET]: Multiple namespaces in the all dst_ifdown routines.
Move dst entries to a namespace loopback to catch refcounting leaks.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:56:44 -08:00
Herbert Xu 2fcb45b6b8 [IPSEC]: Use the correct family for input state lookup
When merging the input paths of IPsec I accidentally left a hard-coded
AF_INET for the state lookup call.  This broke IPv6 obviously.  This
patch fixes by getting the input callers to specify the family through
skb->cb.

Credit goes to Kazunori Miyazawa for diagnosing this and providing an
initial patch.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:55:49 -08:00
Paul Moore 875179fa60 [IPSEC]: SPD auditing fix to include the netmask/prefix-length
Currently the netmask/prefix-length of an IPsec SPD entry is not included in
any of the SPD related audit messages.  This can cause a problem when the
audit log is examined as the netmask/prefix-length is vital in determining
what network traffic is affected by a particular SPD entry.  This patch fixes
this problem by adding two additional fields, "src_prefixlen" and
"dst_prefixlen", to the SPD audit messages to indicate the source and
destination netmasks.  These new fields are only included in the audit message
when the netmask/prefix-length is less than the address length, i.e. the SPD
entry applies to a network address and not a host address.

Example audit message:

 type=UNKNOWN[1415] msg=audit(1196105849.752:25): auid=0 \
   subj=root:system_r:unconfined_t:s0-s0:c0.c1023 op=SPD-add res=1 \
   src=192.168.0.0 src_prefixlen=24 dst=192.168.1.0 dst_prefixlen=24

In addition, this patch also fixes a few other things in the
xfrm_audit_common_policyinfo() function.  The IPv4 string formatting was
converted to use the standard NIPQUAD_FMT constant, the memcpy() was removed
from the IPv6 code path and replaced with a typecast (the memcpy() was acting
as a slow, implicit typecast anyway), and two local variables were created to
make referencing the XFRM security context and selector information cleaner.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:55:19 -08:00
Joonwoo Park dcaee95a1b [IPSEC]: kmalloc + memset conversion to kzalloc
2007/11/26, Patrick McHardy <kaber@trash.net>:
> How about also switching vmalloc/get_free_pages to GFP_ZERO
> and getting rid of the memset entirely while you're at it?
>

xfrm_hash: kmalloc + memset conversion to kzalloc
fix to avoid memset entirely.

Signed-off-by: Joonwoo Park <joonwpark81@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:55:05 -08:00
David S. Miller 294b4baf29 [IPSEC]: Kill afinfo->nf_post_routing
After changeset:

	[NETFILTER]: Introduce NF_INET_ hook values

It always evaluates to NF_INET_POST_ROUTING.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:53:55 -08:00
Herbert Xu 1bf06cd2e3 [IPSEC]: Add async resume support on input
This patch adds support for async resumptions on input.  To do so, the
transform would return -EINPROGRESS and subsequently invoke the
function xfrm_input_resume to resume processing.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:53:54 -08:00
Herbert Xu 60d5fcfb19 [IPSEC]: Remove nhoff from xfrm_input
The nhoff field isn't actually necessary in xfrm_input.  For tunnel
mode transforms we now throw away the output IP header so it makes no
sense to fill in the nexthdr field.  For transport mode we can now let
the function transport_finish do the setting and it knows where the
nexthdr field is.

The only other thing that needs the nexthdr field to be set is the
header extraction code.  However, we can simply move the protocol
extraction out of the generic header extraction.

We want to minimise the amount of info we have to carry around between
transforms as this simplifies the resumption process for async crypto.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:53:53 -08:00
Herbert Xu d26f398400 [IPSEC]: Make x->lastused an unsigned long
Currently x->lastused is u64 which means that it cannot be
read/written atomically on all architectures.  David Miller observed
that the value stored in it is only an unsigned long which is always
atomic.

So based on his suggestion this patch changes the internal
representation from u64 to unsigned long while the user-interface
still refers to it as u64.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:53:52 -08:00
Herbert Xu 0ebea8ef35 [IPSEC]: Move state lock into x->type->input
This patch releases the lock on the state before calling
x->type->input.  It also adds the lock to the spots where they're
currently needed.

Most of those places (all except mip6) are expected to disappear with
async crypto.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:53:52 -08:00
Herbert Xu 668dc8af31 [IPSEC]: Move integrity stat collection into xfrm_input
Similar to the moving out of the replay processing on the output, this
patch moves the integrity stat collectin from x->type->input into
xfrm_input.

This would eventually allow transforms such as AH/ESP to be lockless.

The error value EBADMSG (currently unused in the crypto layer) is used
to indicate a failed integrity check.  In future this error can be
directly returned by the crypto layer once we switch to aead
algorithms.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:53:51 -08:00
Herbert Xu b2aa5e9d43 [IPSEC]: Store xfrm states in security path directly
As it is xfrm_input first collects a list of xfrm states on the stack
before storing them in the packet's security path just before it
returns.  For async crypto, this construction presents an obstacle
since we may need to leave the loop after each transform.

In fact, it's much easier to just skip the stack completely and always
store to the security path.  This is proven by the fact that this
patch actually shrinks the code.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:53:50 -08:00
Herbert Xu 716062fd4c [IPSEC]: Merge most of the input path
As part of the work on asynchronous cryptographic operations, we need
to be able to resume from the spot where they occur.  As such, it
helps if we isolate them to one spot.

This patch moves most of the remaining family-specific processing into
the common input code.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:53:50 -08:00
Herbert Xu c6581a457e [IPSEC]: Add async resume support on output
This patch adds support for async resumptions on output.  To do so,
the transform would return -EINPROGRESS and subsequently invoke the
function xfrm_output_resume to resume processing.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:53:49 -08:00
Herbert Xu 862b82c6f9 [IPSEC]: Merge most of the output path
As part of the work on asynchrnous cryptographic operations, we need
to be able to resume from the spot where they occur.  As such, it
helps if we isolate them to one spot.

This patch moves most of the remaining family-specific processing into
the common output code.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:53:48 -08:00
Herbert Xu 227620e295 [IPSEC]: Separate inner/outer mode processing on input
With inter-family transforms the inner mode differs from the outer
mode.  Attempting to handle both sides from the same function means
that it needs to handle both IPv4 and IPv6 which creates duplication
and confusion.

This patch separates the two parts on the input path so that each
function deals with one family only.

In particular, the functions xfrm4_extract_inut/xfrm6_extract_inut
moves the pertinent fields from the IPv4/IPv6 IP headers into a
neutral format stored in skb->cb.  This is then used by the inner mode
input functions to modify the inner IP header.  In this way the input
function no longer has to know about the outer address family.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:53:46 -08:00
Herbert Xu a2deb6d26f [IPSEC]: Move x->outer_mode->output out of locked section
RO mode is the only one that requires a locked output function.  So
it's easier to move the lock into that function rather than requiring
everyone else to run under the lock.

In particular, this allows us to move the size check into the output
function without causing a potential dead-lock should the ICMP error
somehow hit the same SA on transmission.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:53:44 -08:00
Herbert Xu 25ee3286dc [IPSEC]: Merge common code into xfrm_bundle_create
Half of the code in xfrm4_bundle_create and xfrm6_bundle_create are
common.  This patch extracts that logic and puts it into
xfrm_bundle_create.  The rest of it are then accessed through afinfo.

As a result this fixes the problem with inter-family transforms where
we treat every xfrm dst in the bundle as if it belongs to the top
family.

This patch also fixes a long-standing error-path bug where we may free
the xfrm states twice.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:53:43 -08:00
Herbert Xu 66cdb3ca27 [IPSEC]: Move flow construction into xfrm_dst_lookup
This patch moves the flow construction from the callers of
xfrm_dst_lookup into that function.  It also changes xfrm_dst_lookup
so that it takes an xfrm state as its argument instead of explicit
addresses.

This removes any address-specific logic from the callers of
xfrm_dst_lookup which is needed to correctly support inter-family
transforms.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:53:42 -08:00
Herbert Xu 550ade8432 [IPSEC]: Use dst->header_len when resizing on output
Currently we use x->props.header_len when resizing on output.
However, if we're resizing at all we might as well go the whole hog
and do it for the whole dst.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:53:36 -08:00
Pavel Emelyanov b24b8a247f [NET]: Convert init_timer into setup_timer
Many-many code in the kernel initialized the timer->function
and  timer->data together with calling init_timer(timer). There
is already a helper for this. Use it for networking code.

The patch is HUGE, but makes the code 130 lines shorter
(98 insertions(+), 228 deletions(-)).

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:53:35 -08:00
Eric Dumazet 0f99be0d11 [XFRM]: xfrm_algo_clone() allocates too much memory
alg_key_len is the length in bits of the key, not in bytes.

Best way to fix this is to move alg_len() function from net/xfrm/xfrm_user.c 
to include/net/xfrm.h, and to use it in xfrm_algo_clone()

alg_len() is renamed to xfrm_alg_len() because of its global exposition.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-08 23:39:06 -08:00
Eric Dumazet 2d60abc2a9 [XFRM]: Do not define km_migrate() if !CONFIG_XFRM_MIGRATE
In include/net/xfrm.h we find :

#ifdef CONFIG_XFRM_MIGRATE
extern int km_migrate(struct xfrm_selector *sel, u8 dir, u8 type,
                      struct xfrm_migrate *m, int num_bundles);
...
#endif

We can also guard the function body itself in net/xfrm/xfrm_state.c
with same condition.

(Problem spoted by sparse checker)
make C=2 net/xfrm/xfrm_state.o
...
net/xfrm/xfrm_state.c:1765:5: warning: symbol 'km_migrate' was not declared. Should it be static?
...

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-04 00:47:03 -08:00
Paul Moore 5951cab136 [XFRM]: Audit function arguments misordered
In several places the arguments to the xfrm_audit_start() function are
in the wrong order resulting in incorrect user information being
reported.  This patch corrects this by pacing the arguments in the
correct order.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-12-20 00:00:45 -08:00
Paul Moore 9ab4c954ce [XFRM]: Display the audited SPI value in host byte order.
Currently the IPsec protocol SPI values are written to the audit log in
network byte order which is different from almost all other values which
are recorded in host byte order.  This patch corrects this inconsistency
by writing the SPI values to the audit record in host byte order.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-12-14 13:54:35 -08:00
Herbert Xu 75b8c13326 [IPSEC]: Fix potential dst leak in xfrm_lookup
If we get an error during the actual policy lookup we don't free the
original dst while the caller expects us to always free the original
dst in case of error.

This patch fixes that.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-12-11 04:38:08 -08:00
Herbert Xu 5e5234ff17 [IPSEC]: Fix uninitialised dst warning in __xfrm_lookup
Andrew Morton reported that __xfrm_lookup generates this warning:

net/xfrm/xfrm_policy.c: In function '__xfrm_lookup':
net/xfrm/xfrm_policy.c:1449: warning: 'dst' may be used uninitialized in this function

This is because if policy->action is of an unexpected value then dst will
not be initialised.  Of course, in practice this should never happen since
the input layer xfrm_user/af_key will filter out all illegal values.  But
the compiler doesn't know that of course.

So this patch fixes this by taking the conservative approach and treat all
unknown actions the same as a blocking action.

Thanks to Andrew for finding this and providing an initial fix.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2007-11-30 00:50:31 +11:00
Patrick McHardy 5dba479711 [XFRM]: Fix leak of expired xfrm_states
The xfrm_timer calls __xfrm_state_delete, which drops the final reference
manually without triggering destruction of the state. Change it to use
xfrm_state_put to add the state to the gc list when we're dropping the
last reference. The timer function may still continue to use the state
safely since the final destruction does a del_timer_sync().

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2007-11-27 11:10:07 +08:00
Herbert Xu 8053fc3de7 [IPSEC]: Temporarily remove locks around copying of non-atomic fields
The change 050f009e16

	[IPSEC]: Lock state when copying non-atomic fields to user-space

caused a regression.

Ingo Molnar reports that it causes a potential dead-lock found by the
lock validator as it tries to take x->lock within xfrm_state_lock while
numerous other sites take the locks in opposite order.

For 2.6.24, the best fix is to simply remove the added locks as that puts
us back in the same state as we've been in for years.  For later kernels
a proper fix would be to reverse the locking order for every xfrm state
user such that if x->lock is taken together with xfrm_state_lock then
it is to be taken within it.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2007-11-26 19:07:34 +08:00
Adrian Bunk 87ae9afdca cleanup asm/scatterlist.h includes
Not architecture specific code should not #include <asm/scatterlist.h>.

This patch therefore either replaces them with
#include <linux/scatterlist.h> or simply removes them if they were
unused.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-11-02 08:47:06 +01:00
David S. Miller 0e0940d4bb [IPSEC]: Fix scatterlist handling in skb_icv_walk().
Use sg_init_one() and sg_init_table() as needed.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-26 00:39:27 -07:00
Jens Axboe 642f149031 SG: Change sg_set_page() to take length and offset argument
Most drivers need to set length and offset as well, so may as well fold
those three lines into one.

Add sg_assign_page() for those two locations that only needed to set
the page, where the offset/length is set outside of the function context.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-24 11:20:47 +02:00
Heiko Carstens b3b724f48c net: fix xfrm build - missing scatterlist.h include
net/xfrm/xfrm_algo.c: In function 'skb_icv_walk':
net/xfrm/xfrm_algo.c:555: error: implicit declaration of function
'sg_set_page'
make[2]: *** [net/xfrm/xfrm_algo.o] Error 1

Cc: David Miller <davem@davemloft.net>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-23 09:49:27 +02:00
Jens Axboe fa05f1286b Update net/ to use sg helpers
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-22 21:19:56 +02:00
Herbert Xu 13996378e6 [IPSEC]: Rename mode to outer_mode and add inner_mode
This patch adds a new field to xfrm states called inner_mode.  The existing
mode object is renamed to outer_mode.

This is the first part of an attempt to fix inter-family transforms.  As it
is we always use the outer family when determining which mode to use.  As a
result we may end up shoving IPv4 packets into netfilter6 and vice versa.

What we really want is to use the inner family for the first part of outbound
processing and the outer family for the second part.  For inbound processing
we'd use the opposite pairing.

I've also added a check to prevent silly combinations such as transport mode
with inter-family transforms.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-17 21:35:51 -07:00
Herbert Xu 17c2a42a24 [IPSEC]: Store afinfo pointer in xfrm_mode
It is convenient to have a pointer from xfrm_state to address-specific
functions such as the output function for a family.  Currently the
address-specific policy code calls out to the xfrm state code to get
those pointers when we could get it in an easier way via the state
itself.

This patch adds an xfrm_state_afinfo to xfrm_mode (since they're
address-specific) and changes the policy code to use it.  I've also
added an owner field to do reference counting on the module providing
the afinfo even though it isn't strictly necessary today since IPv6
can't be unloaded yet.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-17 21:33:12 -07:00
Herbert Xu 1bfcb10f67 [IPSEC]: Add missing BEET checks
Currently BEET mode does not reinject the packet back into the stack
like tunnel mode does.  Since BEET should behave just like tunnel mode
this is incorrect.

This patch fixes this by introducing a flags field to xfrm_mode that
tells the IPsec code whether it should terminate and reinject the packet
back into the stack.

It then sets the flag for BEET and tunnel mode.

I've also added a number of missing BEET checks elsewhere where we check
whether a given mode is a tunnel or not.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-17 21:31:50 -07:00
Herbert Xu aa5d62cc87 [IPSEC]: Move type and mode map into xfrm_state.c
The type and mode maps are only used by SAs, not policies.  So it makes
sense to move them from xfrm_policy.c into xfrm_state.c.  This also allows
us to mark xfrm_get_type/xfrm_put_type/xfrm_get_mode/xfrm_put_mode as
static.

The only other change I've made in the move is to get rid of the casts
on the request_module call for types.  They're unnecessary because C
will promote them to ints anyway.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-17 21:31:12 -07:00
Herbert Xu 440725000c [IPSEC]: Fix length check in xfrm_parse_spi
Currently xfrm_parse_spi requires there to be 16 bytes for AH and ESP.
In contrived cases there may not actually be 16 bytes there since the
respective header sizes are less than that (8 and 12 currently).

This patch changes the test to use the actual header length instead of 16.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-17 21:30:34 -07:00
Denis V. Lunev cd40b7d398 [NET]: make netlink user -> kernel interface synchronious
This patch make processing netlink user -> kernel messages synchronious.
This change was inspired by the talk with Alexey Kuznetsov about current
netlink messages processing. He says that he was badly wrong when introduced 
asynchronious user -> kernel communication.

The call netlink_unicast is the only path to send message to the kernel
netlink socket. But, unfortunately, it is also used to send data to the
user.

Before this change the user message has been attached to the socket queue
and sk->sk_data_ready was called. The process has been blocked until all
pending messages were processed. The bad thing is that this processing
may occur in the arbitrary process context.

This patch changes nlk->data_ready callback to get 1 skb and force packet
processing right in the netlink_unicast.

Kernel -> user path in netlink_unicast remains untouched.

EINTR processing for in netlink_run_queue was changed. It forces rtnl_lock
drop, but the process remains in the cycle until the message will be fully
processed. So, there is no need to use this kludges now.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 21:15:29 -07:00
Herbert Xu b7c6538cd8 [IPSEC]: Move state lock into x->type->output
This patch releases the lock on the state before calling x->type->output.
It also adds the lock to the spots where they're currently needed.

Most of those places (all except mip6) are expected to disappear with
async crypto.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:55:03 -07:00
Herbert Xu 050f009e16 [IPSEC]: Lock state when copying non-atomic fields to user-space
This patch adds locking so that when we're copying non-atomic fields such as
life-time or coaddr to user-space we don't get a partial result.

For af_key I've changed every instance of pfkey_xfrm_state2msg apart from
expiration notification to include the keys and life-times.  This is in-line
with XFRM behaviour.

The actual cases affected are:

* pfkey_getspi: No change as we don't have any keys to copy.
* key_notify_sa:
	+ ADD/UPD: This wouldn't work otherwise.
	+ DEL: It can't hurt.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:55:02 -07:00
Herbert Xu 68325d3b12 [XFRM] user: Move attribute copying code into copy_to_user_state_extra
Here's a good example of code duplication leading to code rot.  The
notification patch did its own netlink message creation for xfrm states.
It duplicated code that was already in dump_one_state.  Guess what, the
next time (and the time after) when someone updated dump_one_state the
notification path got zilch.

This patch moves that code from dump_one_state to copy_to_user_state_extra
and uses it in xfrm_notify_sa too.  Unfortunately whoever updates this
still needs to update xfrm_sa_len since the notification path wants to
know the exact size for allocation.

At least I've added a comment saying so and if someone still forgest, we'll
have a WARN_ON telling us so.

I also changed the security size calculation to use xfrm_user_sec_ctx since
that's what we actually put into the skb.  However it makes no practical
difference since it has the same size as xfrm_sec_ctx.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:55:02 -07:00
Herbert Xu 658b219e93 [IPSEC]: Move common code into xfrm_alloc_spi
This patch moves some common code that conceptually belongs to the xfrm core
from af_key/xfrm_user into xfrm_alloc_spi.

In particular, the spin lock on the state is now taken inside xfrm_alloc_spi.
Previously it also protected the construction of the response PF_KEY/XFRM
messages to user-space.  This is inconsistent as other identical constructions
are not protected by the state lock.  This is bad because they in fact should
be protected but only in certain spots (so as not to hold the lock for too
long which may cause packet drops).

The SPI byte order conversion has also been moved.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:55:01 -07:00
Herbert Xu 75ba28c633 [IPSEC]: Remove gratuitous km wake-up events on ACQUIRE
There is no point in waking people up when creating/updating larval states
because they'll just go back to sleep again as larval states by definition
cannot be found by xfrm_state_find.

We should only wake them up when the larvals mature or die.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:55:01 -07:00
Herbert Xu 007f0211a8 [IPSEC]: Store IPv6 nh pointer in mac_header on output
Current the x->mode->output functions store the IPv6 nh pointer in the
skb network header.  This is inconvenient because the network header then
has to be fixed up before the packet can leave the IPsec stack.  The mac
header field is unused on output so we can use that to store this instead.

This patch does that and removes the network header fix-up in xfrm_output.

It also uses ipv6_hdr where appropriate in the x->type->output functions.

There is also a minor clean-up in esp4 to make it use the same code as
esp6 to help any subsequent effort to merge the two.

Lastly it kills two redundant skb_set_* statements in BEET that were
simply copied over from transport mode.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:55:00 -07:00
Herbert Xu 1ecafede83 [IPSEC]: Remove bogus ref count in xfrm_secpath_reject
Constructs of the form

	xfrm_state_hold(x);
	foo(x);
	xfrm_state_put(x);

tend to be broken because foo is either synchronous where this is totally
unnecessary or if foo is asynchronous then the reference count is in the
wrong spot.

In the case of xfrm_secpath_reject, the function is synchronous and therefore
we should just kill the reference count.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:54:59 -07:00
Herbert Xu 45b17f48ea [IPSEC]: Move RO-specific output code into xfrm6_mode_ro.c
The lastused update check in xfrm_output can be done just as well in
the mode output function which is specific to RO.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:54:56 -07:00
Herbert Xu cdf7e668d4 [IPSEC]: Unexport xfrm_replay_notify
Now that the only callers of xfrm_replay_notify are in xfrm, we can remove
the export.

This patch also removes xfrm_aevent_doreplay since it's now called in just
one spot.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:54:55 -07:00
Herbert Xu 436a0a4022 [IPSEC]: Move output replay code into xfrm_output
The replay counter is one of only two remaining things in the output code
that requires a lock on the xfrm state (the other being the crypto).  This
patch moves it into the generic xfrm_output so we can remove the lock from
the transforms themselves.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:54:54 -07:00
Herbert Xu 83815dea47 [IPSEC]: Move xfrm_state_check into xfrm_output.c
The functions xfrm_state_check and xfrm_state_check_space are only used by
the output code in xfrm_output.c so we can move them over.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:54:54 -07:00
Herbert Xu 406ef77c89 [IPSEC]: Move common output code to xfrm_output
Most of the code in xfrm4_output_one and xfrm6_output_one are identical so
this patch moves them into a common xfrm_output function which will live
in net/xfrm.

In fact this would seem to fix a bug as on IPv4 we never reset the network
header after a transform which may upset netfilter later on.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:54:53 -07:00
Eric W. Biederman 2774c7aba6 [NET]: Make the loopback device per network namespace.
This patch makes loopback_dev per network namespace.  Adding
code to create a different loopback device for each network
namespace and adding the code to free a loopback device
when a network namespace exits.

This patch modifies all users the loopback_dev so they
access it as init_net.loopback_dev, keeping all of the
code compiling and working.  A later pass will be needed to
update the users to use something other than the initial network
namespace.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:52:49 -07:00
Daniel Lezcano de3cb747ff [NET]: Dynamically allocate the loopback device, part 1.
This patch replaces all occurences to the static variable
loopback_dev to a pointer loopback_dev. That provides the
mindless, trivial, uninteressting change part for the dynamic
allocation for the loopback.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Acked-By: Kirill Korotaev <dev@sw.ru>
Acked-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:52:14 -07:00
Herbert Xu 0cfad07555 [NETLINK]: Avoid pointer in netlink_run_queue
I was looking at Patrick's fix to inet_diag and it occured
to me that we're using a pointer argument to return values
unnecessarily in netlink_run_queue.  Changing it to return
the value will allow the compiler to generate better code
since the value won't have to be memory-backed.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:51:24 -07:00
Eric W. Biederman b4b510290b [NET]: Support multiple network namespaces with netlink
Each netlink socket will live in exactly one network namespace,
this includes the controlling kernel sockets.

This patch updates all of the existing netlink protocols
to only support the initial network namespace.  Request
by clients in other namespaces will get -ECONREFUSED.
As they would if the kernel did not have the support for
that netlink protocol compiled in.

As each netlink protocol is updated to be multiple network
namespace safe it can register multiple kernel sockets
to acquire a presence in the rest of the network namespaces.

The implementation in af_netlink is a simple filter implementation
at hash table insertion and hash table look up time.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:49:09 -07:00
Eric W. Biederman e9dc865340 [NET]: Make device event notification network namespace safe
Every user of the network device notifiers is either a protocol
stack or a pseudo device.  If a protocol stack that does not have
support for multiple network namespaces receives an event for a
device that is not in the initial network namespace it quite possibly
can get confused and do the wrong thing.

To avoid problems until all of the protocol stacks are converted
this patch modifies all netdev event handlers to ignore events on
devices that are not in the initial network namespace.

As the rest of the code is made network namespace aware these
checks can be removed.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:49:09 -07:00
Joy Latten ab5f5e8b14 [XFRM]: xfrm audit calls
This patch modifies the current ipsec audit layer
by breaking it up into purpose driven audit calls.

So far, the only audit calls made are when add/delete
an SA/policy. It had been discussed to give each
key manager it's own calls to do this, but I found
there to be much redundnacy since they did the exact
same things, except for how they got auid and sid, so I
combined them. The below audit calls can be made by any
key manager. Hopefully, this is ok.

Signed-off-by: Joy Latten <latten@austin.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:49:02 -07:00
Thomas Graf f7944fb191 [XFRM] policy: Replace magic number with XFRM_POLICY_OUT
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:48:34 -07:00
Thomas Graf fd21150a0f [XFRM] netlink: Inline attach_encap_tmpl(), attach_sec_ctx(), and attach_one_addr()
These functions are only used once and are a lot easier to understand if
inlined directly into the function.

Fixes by Masahide NAKAMURA.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:48:26 -07:00
Thomas Graf 15901a2746 [XFRM] netlink: Remove dependency on rtnetlink
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:48:25 -07:00
Thomas Graf 5424f32e48 [XFRM] netlink: Use nlattr instead of rtattr
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:48:25 -07:00
Thomas Graf 35a7aa08bf [XFRM] netlink: Rename attribute array from xfrma[] to attrs[]
Increases readability a lot.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:48:24 -07:00
Thomas Graf fab448991d [XFRM] netlink: Enhance indexing of the attribute array
nlmsg_parse() puts attributes at array[type] so the indexing
method can be simpilfied by removing the obscuring "- 1".

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:48:23 -07:00
Thomas Graf cf5cb79f69 [XFRM] netlink: Establish an attribute policy
Adds a policy defining the minimal payload lengths for all the attributes
allowing for most attribute validation checks to be removed from in
the middle of the code path. Makes updates more consistent as many format
errors are recognised earlier, before any changes have been attempted.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:48:23 -07:00
Thomas Graf a7bd9a45c8 [XFRM] netlink: Use nlmsg_parse() to parse attributes
Uses nlmsg_parse() to parse the attributes. This actually changes
behaviour as unknown attributes (type > MAXTYPE) no longer cause
an error. Instead unknown attributes will be ignored henceforth
to keep older kernels compatible with more recent userspace tools.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:48:22 -07:00
Thomas Graf 7deb226490 [XFRM] netlink: Use nlmsg_new() and type-safe size calculation helpers
Moves all complex message size calculation into own inlined helper
functions and makes use of the type-safe netlink interface.

Using nlmsg_new() simplifies the calculation itself as it takes care
of the netlink header length by itself.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:48:22 -07:00
Thomas Graf cfbfd45a8c [XFRM] netlink: Clear up some of the CONFIG_XFRM_SUB_POLICY ifdef mess
Moves all of the SUB_POLICY ifdefs related to the attribute size
calculation into a function.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:48:21 -07:00
Thomas Graf c26445acbc [XFRM] netlink: Move algorithm length calculation to its own function
Adds alg_len() to calculate the properly padded length of an
algorithm attribute to simplify the code.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:48:21 -07:00
Thomas Graf c0144beaec [XFRM] netlink: Use nla_put()/NLA_PUT() variantes
Also makes use of copy_sec_ctx() in another place and removes
duplicated code.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:48:20 -07:00
Thomas Graf 082a1ad573 [XFRM] netlink: Use nlmsg_broadcast() and nlmsg_unicast()
This simplifies successful return codes from >0 to 0.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:48:20 -07:00
Thomas Graf 7b67c8575f [XFRM] netlink: Use nlmsg_data() instead of NLMSG_DATA()
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:48:19 -07:00
Thomas Graf 9825069d09 [XFRM] netlink: Use nlmsg_end() and nlmsg_cancel()
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:48:18 -07:00
Thomas Graf 79b8b7f4ab [XFRM] netlink: Use nlmsg_put() instead of NLMSG_PUT()
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:48:18 -07:00
Jesper Juhl b5890d8ba4 [XFRM]: Clean up duplicate includes in net/xfrm/
This patch cleans up duplicate includes in
	net/xfrm/

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-08-13 22:52:08 -07:00
Paul Moore e6e0871cce Net/Security: fix memory leaks from security_secid_to_secctx()
The security_secid_to_secctx() function returns memory that must be freed
by a call to security_release_secctx() which was not always happening.  This
patch fixes two of these problems (all that I could find in the kernel source
at present).

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2007-08-02 11:52:26 -04:00
Joakim Koskela 48b8d78315 [XFRM]: State selection update to use inner addresses.
This patch modifies the xfrm state selection logic to use the inner
addresses where the outer have been (incorrectly) used. This is
required for beet mode in general and interfamily setups in both
tunnel and beet mode.

Signed-off-by: Joakim Koskela <jookos@gmail.com>
Signed-off-by: Herbert Xu     <herbert@gondor.apana.org.au>
Signed-off-by: Diego Beltrami <diego.beltrami@gmail.com>
Signed-off-by: Miika Komu     <miika@iki.fi>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-31 02:28:33 -07:00
Herbert Xu 196b003620 [IPSEC]: Ensure that state inner family is set
Similar to the issue we had with template families which
specified the inner families of policies, we need to set
the inner families of states as the main xfrm user Openswan
leaves it as zero.

af_key is unaffected because the inner family is set by it
and not the KM.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-31 02:28:32 -07:00
Paul Mundt 20c2df83d2 mm: Remove slab destructors from kmem_cache_create().
Slab destructors were no longer supported after Christoph's
c59def9f22 change. They've been
BUGs for both slab and slub, and slob never supported them
either.

This rips out support for the dtor pointer from kmem_cache_create()
completely and fixes up every single callsite in the kernel (there were
about 224, not including the slab allocator definitions themselves,
or the documentation references).

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2007-07-20 10:11:58 +09:00
YOSHIFUJI Hideaki 7dc12d6dd6 [NET] XFRM: Fix whitespace errors.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2007-07-19 10:45:15 +09:00
Patrick McHardy bd0bf0765e [XFRM]: Fix crash introduced by struct dst_entry reordering
XFRM expects xfrm_dst->u.next to be same pointer as dst->next, which
was broken by the dst_entry reordering in commit 1e19e02c~, causing
an oops in xfrm_bundle_ok when walking the bundle upwards.

Kill xfrm_dst->u.next and change the only user to use dst->next instead.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-18 01:55:52 -07:00
Jamal Hadi Salim 628529b6ee [XFRM] Introduce standalone SAD lookup
This allows other in-kernel functions to do SAD lookups.
The only known user at the moment is pktgen.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:16:35 -07:00
Patrick McHardy 281216177a [XFRM]: Fix MTU calculation for non-ESP SAs
My IPsec MTU optimization patch introduced a regression in MTU calculation
for non-ESP SAs, the SA's header_len needs to be subtracted from the MTU if
the transform doesn't provide a ->get_mtu() function.

Reported-and-tested-by: Marco Berizzi <pupilla@hotmail.com>

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-18 22:30:15 -07:00
Joy Latten 4aa2e62c45 xfrm: Add security check before flushing SAD/SPD
Currently we check for permission before deleting entries from SAD and
SPD, (see security_xfrm_policy_delete() security_xfrm_state_delete())
However we are not checking for authorization when flushing the SPD and
the SAD completely. It was perhaps missed in the original security hooks
patch.

This patch adds a security check when flushing entries from the SAD and
SPD.  It runs the entire database and checks each entry for a denial.
If the process attempting the flush is unable to remove all of the
entries a denial is logged the the flush function returns an error
without removing anything.

This is particularly useful when a process may need to create or delete
its own xfrm entries used for things like labeled networking but that
same process should not be able to delete other entries or flush the
entire database.

Signed-off-by: Joy Latten<latten@austin.ibm.com>
Signed-off-by: Eric Paris <eparis@parisplace.org>
Signed-off-by: James Morris <jmorris@namei.org>
2007-06-07 13:42:46 -07:00
David S. Miller aad0e0b9b6 [XFRM]: xfrm_larval_drop sysctl should be __read_mostly.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-31 01:23:24 -07:00
David S. Miller 01e67d08fa [XFRM]: Allow XFRM_ACQ_EXPIRES to be tunable via sysctl.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-31 01:23:23 -07:00
David S. Miller 14e50e57ae [XFRM]: Allow packet drops during larval state resolution.
The current IPSEC rule resolution behavior we have does not work for a
lot of people, even though technically it's an improvement from the
-EAGAIN buisness we had before.

Right now we'll block until the key manager resolves the route.  That
works for simple cases, but many folks would rather packets get
silently dropped until the key manager resolves the IPSEC rules.

We can't tell these folks to "set the socket non-blocking" because
they don't have control over the non-block setting of things like the
sockets used to resolve DNS deep inside of the resolver libraries in
libc.

With that in mind I coded up the patch below with some help from
Herbert Xu which provides packet-drop behavior during larval state
resolution, controllable via sysctl and off by default.

This lays the framework to either:

1) Make this default at some point or...

2) Move this logic into xfrm{4,6}_policy.c and implement the
   ARP-like resolution queue we've all been dreaming of.
   The idea would be to queue packets to the policy, then
   once the larval state is resolved by the key manager we
   re-resolve the route and push the packets out.  The
   packets would timeout if the rule didn't get resolved
   in a certain amount of time.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-24 18:17:54 -07:00
Herbert Xu 26b8e51e98 [IPSEC]: Fix warnings with casting int to pointer
This patch adds some casts to shut up the warnings introduced by my
last patch that added a common interator function for xfrm algorightms.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-22 16:12:26 -07:00
Herbert Xu c92b3a2f1f [IPSEC] pfkey: Load specific algorithm in pfkey_add rather than all
This is a natural extension of the changeset

    [XFRM]: Probe selected algorithm only.

which only removed the probe call for xfrm_user.  This patch does exactly
the same thing for af_key.  In other words, we load the algorithm requested
by the user rather than everything when adding xfrm states in af_key.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-19 14:21:18 -07:00
Herbert Xu 6253db055e [IPSEC]: Don't warn if high-order hash resize fails
Multi-page allocations are always likely to fail.  Since such failures
are expected and non-critical in xfrm_hash_alloc, we shouldn't warn about
them.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-14 02:19:11 -07:00
Herbert Xu b5505c6e10 [IPSEC]: Check validity of direction in xfrm_policy_byid
The function xfrm_policy_byid takes a dir argument but finds the policy
using the index instead.  We only use the dir argument to update the
policy count for that direction.  Since the user can supply any value
for dir, this can corrupt our policy count.

I know this is the problem because a few days ago I was deleting
policies by hand using indicies and accidentally typed in the wrong
direction.  It still deleted the policy and at the time I thought
that was cool.  In retrospect it isn't such a good idea :)

I decided against letting it delete the policy anyway just in case
we ever remove the connection between indicies and direction.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-14 02:15:47 -07:00
Jamal Hadi Salim 5a6d34162f [XFRM] SPD info TLV aggregation
Aggregate the SPD info TLVs.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-04 12:55:39 -07:00
Jamal Hadi Salim af11e31609 [XFRM] SAD info TLV aggregationx
Aggregate the SAD info TLVs.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-04 12:55:13 -07:00
Masahide NAKAMURA 157bfc2502 [XFRM]: Restrict upper layer information by bundle.
On MIPv6 usage, XFRM sub policy is enabled.
When main (IPsec) and sub (MIPv6) policy selectors have the same
address set but different upper layer information (i.e. protocol
number and its ports or type/code), multiple bundle should be created.
However, currently we have issue to use the same bundle created for
the first time with all flows covered by the case.

It is useful for the bundle to have the upper layer information
to be restructured correctly if it does not match with the flow.

1. Bundle was created by two policies
Selector from another policy is added to xfrm_dst.
If the flow does not match the selector, it goes to slow path to
restructure new bundle by single policy.

2. Bundle was created by one policy
Flow cache is added to xfrm_dst as originated one. If the flow does
not match the cache, it goes to slow path to try searching another
policy.

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-30 00:58:09 -07:00
Jamal Hadi Salim ecfd6b1837 [XFRM]: Export SPD info
With this patch you can use iproute2 in user space to efficiently see
how many policies exist in different directions.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-28 21:20:32 -07:00
David S. Miller 1a028e5072 [NET]: Revert sk_buff walker cleanups.
This reverts eefa390628

The simplification made in that change works with the assumption that
the 'offset' parameter to these functions is always positive or zero,
which is not true.  It can be and often is negative in order to access
SKB header values in front of skb->data.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-27 15:21:23 -07:00
Jamal Hadi Salim 566ec03448 [XFRM]: Missing bits to SAD info.
This brings the SAD info in sync with net-2.6.22/net-2.6

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 14:12:15 -07:00
Jean Delvare eefa390628 [NET]: Clean up sk_buff walkers.
I noticed recently that, in skb_checksum(), "offset" and "start" are
essentially the same thing and have the same value throughout the
function, despite being computed differently. Using a single variable
allows some cleanups and makes the skb_checksum() function smaller,
more readable, and presumably marginally faster.

We appear to have many other "sk_buff walker" functions built on the
exact same model, so the cleanup applies to them, too. Here is a list
of the functions I found to be affected:

net/appletalk/ddp.c:atalk_sum_skb()
net/core/datagram.c:skb_copy_datagram_iovec()
net/core/datagram.c:skb_copy_and_csum_datagram()
net/core/skbuff.c:skb_copy_bits()
net/core/skbuff.c:skb_store_bits()
net/core/skbuff.c:skb_checksum()
net/core/skbuff.c:skb_copy_and_csum_bit()
net/core/user_dma.c:dma_skb_copy_datagram_iovec()
net/xfrm/xfrm_algo.c:skb_icv_walk()
net/xfrm/xfrm_algo.c:skb_to_sgvec()

OTOH, I admit I'm a bit surprised, the cleanup is rather obvious so I'm
really wondering if I am missing something. Can anyone please comment
on this?

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 00:44:22 -07:00
Jamal Hadi Salim 28d8909bc7 [XFRM]: Export SAD info.
On a system with a lot of SAs, counting SAD entries chews useful
CPU time since you need to dump the whole SAD to user space;
i.e something like ip xfrm state ls | grep -i src | wc -l
I have seen taking literally minutes on a 40K SAs when the system
is swapping.
With this patch, some of the SAD info (that was already being tracked)
is exposed to user space. i.e you do:
ip xfrm state count
And you get the count; you can also pass -s to the command line and
get the hash info.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 00:10:29 -07:00
Stephen Hemminger 3ff50b7997 [NET]: cleanup extra semicolons
Spring cleaning time...

There seems to be a lot of places in the network code that have
extra bogus semicolons after conditionals.  Most commonly is a
bogus semicolon after: switch() { }

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:29:24 -07:00
Patrick McHardy af65bdfce9 [NETLINK]: Switch cb_lock spinlock to mutex and allow to override it
Switch cb_lock to mutex and allow netlink kernel users to override it
with a subsystem specific mutex for consistent locking in dump callbacks.
All netlink_dump_start users have been audited not to rely on any
side-effects of the previously used spinlock.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:29:03 -07:00
Patrick McHardy c5c2523893 [XFRM]: Optimize MTU calculation
Replace the probing based MTU estimation, which usually takes 2-3 iterations
to find a fitting value and may underestimate the MTU, by an exact calculation.

Also fix underestimation of the XFRM trailer_len, which causes unnecessary
reallocations.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:28:38 -07:00
David Howells 716ea3a7aa [NET]: Move generic skbuff stuff from XFRM code to generic code
Move generic skbuff stuff from XFRM code to generic code so that
AF_RXRPC can use it too.

The kdoc comments I've attached to the functions needs to be checked
by whoever wrote them as I had to make some guesses about the workings
of these functions.

Signed-off-By: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:28:33 -07:00
Thomas Graf c702e8047f [NETLINK]: Directly return -EINTR from netlink_dump_start()
Now that all users of netlink_dump_start() use netlink_run_queue()
to process the receive queue, it is possible to return -EINTR from
netlink_dump_start() directly, therefore simplying the callers.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:27:33 -07:00
Thomas Graf 1d00a4eb42 [NETLINK]: Remove error pointer from netlink message handler
The error pointer argument in netlink message handlers is used
to signal the special case where processing has to be interrupted
because a dump was started but no error happened. Instead it is
simpler and more clear to return -EINTR and have netlink_run_queue()
deal with getting the queue right.

nfnetlink passed on this error pointer to its subsystem handlers
but only uses it to signal the start of a netlink dump. Therefore
it can be removed there as well.

This patch also cleans up the error handling in the affected
message handlers to be consistent since it had to be touched anyway.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:27:30 -07:00
Thomas Graf 45e7ae7f71 [NETLINK]: Ignore control messages directly in netlink_run_queue()
Changes netlink_rcv_skb() to skip netlink controll messages and don't
pass them on to the message handler.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:27:29 -07:00
Thomas Graf d35b685640 [NETLINK]: Ignore !NLM_F_REQUEST messages directly in netlink_run_queue()
netlink_rcv_skb() is changed to skip messages which don't have the
NLM_F_REQUEST bit to avoid every netlink family having to perform this
check on their own.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:27:29 -07:00
Arnaldo Carvalho de Melo dc5fc579b9 [NETLINK]: Use nlmsg_trim() where appropriate
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:37 -07:00
Arnaldo Carvalho de Melo 27a884dc3c [SK_BUFF]: Convert skb->tail to sk_buff_data_t
So that it is also an offset from skb->head, reduces its size from 8 to 4 bytes
on 64bit architectures, allowing us to combine the 4 bytes hole left by the
layer headers conversion, reducing struct sk_buff size to 256 bytes, i.e. 4
64byte cachelines, and since the sk_buff slab cache is SLAB_HWCACHE_ALIGN...
:-)

Many calculations that previously required that skb->{transport,network,
mac}_header be first converted to a pointer now can be done directly, being
meaningful as offsets or pointers.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:28 -07:00
Arnaldo Carvalho de Melo 9c70220b73 [SK_BUFF]: Introduce skb_transport_header(skb)
For the places where we need a pointer to the transport header, it is
still legal to touch skb->h.raw directly if just adding to,
subtracting from or setting it to another layer header.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:25:31 -07:00
James Morris 9d729f72dc [NET]: Convert xtime.tv_sec to get_seconds()
Where appropriate, convert references to xtime.tv_sec to the
get_seconds() helper function.

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:23:32 -07:00
Joy Latten 661697f728 [IPSEC] XFRM_USER: kernel panic when large security contexts in ACQUIRE
When sending a security context of 50+ characters in an ACQUIRE 
message, following kernel panic occurred.

kernel BUG in xfrm_send_acquire at net/xfrm/xfrm_user.c:1781!
cpu 0x3: Vector: 700 (Program Check) at [c0000000421bb2e0]
    pc: c00000000033b074: .xfrm_send_acquire+0x240/0x2c8
    lr: c00000000033b014: .xfrm_send_acquire+0x1e0/0x2c8
    sp: c0000000421bb560
   msr: 8000000000029032
  current = 0xc00000000fce8f00
  paca    = 0xc000000000464b00
    pid   = 2303, comm = ping
kernel BUG in xfrm_send_acquire at net/xfrm/xfrm_user.c:1781!
enter ? for help
3:mon> t
[c0000000421bb650] c00000000033538c .km_query+0x6c/0xec
[c0000000421bb6f0] c000000000337374 .xfrm_state_find+0x7f4/0xb88
[c0000000421bb7f0] c000000000332350 .xfrm_tmpl_resolve+0xc4/0x21c
[c0000000421bb8d0] c0000000003326e8 .xfrm_lookup+0x1a0/0x5b0
[c0000000421bba00] c0000000002e6ea0 .ip_route_output_flow+0x88/0xb4
[c0000000421bbaa0] c0000000003106d8 .ip4_datagram_connect+0x218/0x374
[c0000000421bbbd0] c00000000031bc00 .inet_dgram_connect+0xac/0xd4
[c0000000421bbc60] c0000000002b11ac .sys_connect+0xd8/0x120
[c0000000421bbd90] c0000000002d38d0 .compat_sys_socketcall+0xdc/0x214
[c0000000421bbe30] c00000000000869c syscall_exit+0x0/0x40
--- Exception: c00 (System Call) at 0000000007f0ca9c
SP (fc0ef8f0) is in userspace

We are using size of security context from xfrm_policy to determine
how much space to alloc skb and then putting security context from
xfrm_state into skb. Should have been using size of security context 
from xfrm_state to alloc skb. Following fix does that

Signed-off-by: Joy Latten <latten@austin.ibm.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-13 16:14:35 -07:00
Herbert Xu 4c4d51a731 [IPSEC]: Reject packets within replay window but outside the bit mask
Up until this point we've accepted replay window settings greater than
32 but our bit mask can only accomodate 32 packets.  Thus any packet
with a sequence number within the window but outside the bit mask would
be accepted.

This patch causes those packets to be rejected instead.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-05 00:07:39 -07:00
Dave Jones b6f99a2119 [NET]: fix up misplaced inlines.
Turning up the warnings on gcc makes it emit warnings
about the placement of 'inline' in function declarations.
Here's everything that was under net/

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-03-22 12:27:49 -07:00
Joy Latten 961995582e [XFRM]: ipsecv6 needs a space when printing audit record.
This patch adds a space between printing of the src and dst ipv6 addresses.
Otherwise, audit or other test tools may fail to process the audit
record properly because they cannot find the dst address.

Signed-off-by: Joy Latten <latten@austin.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-03-20 00:09:47 -07:00
Joy Latten 75e252d981 [XFRM]: Fix missing protocol comparison of larval SAs.
I noticed that in xfrm_state_add we look for the larval SA in a few
places without checking for protocol match. So when using both 
AH and ESP, whichever one gets added first, deletes the larval SA. 
It seems AH always gets added first and ESP is always the larval 
SA's protocol since the xfrm->tmpl has it first. Thus causing the
additional km_query()

Adding the check eliminates accidental double SA creation. 

Signed-off-by: Joy Latten <latten@austin.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-03-12 17:14:07 -07:00
Eric Paris 16bec31db7 [IPSEC]: xfrm audit hook misplaced in pfkey_delete and xfrm_del_sa
Inside pfkey_delete and xfrm_del_sa the audit hooks were not called if
there was any permission/security failures in attempting to do the del
operation (such as permission denied from security_xfrm_state_delete).
This patch moves the audit hook to the exit path such that all failures
(and successes) will actually get audited.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Venkat Yekkirala <vyekkirala@trustedcs.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-03-07 16:08:11 -08:00
Eric Paris ef41aaa0b7 [IPSEC]: xfrm_policy delete security check misplaced
The security hooks to check permissions to remove an xfrm_policy were
actually done after the policy was removed.  Since the unlinking and
deletion are done in xfrm_policy_by* functions this moves the hooks
inside those 2 functions.  There we have all the information needed to
do the security check and it can be done before the deletion.  Since
auditing requires the result of that security check err has to be passed
back and forth from the xfrm_policy_by* functions.

This patch also fixes a bug where a deletion that failed the security
check could cause improper accounting on the xfrm_policy
(xfrm_get_policy didn't have a put on the exit path for the hold taken
by xfrm_policy_by*)

It also fixes the return code when no policy is found in
xfrm_add_pol_expire.  In old code (at least back in the 2.6.18 days) err
wasn't used before the return when no policy is found and so the
initialization would cause err to be ENOENT.  But since err has since
been used above when we don't get a policy back from the xfrm_policy_by*
function we would always return 0 instead of the intended ENOENT.  Also
fixed some white space damage in the same area.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Venkat Yekkirala <vyekkirala@trustedcs.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-03-07 16:08:09 -08:00
Patrick McHardy b08d5840d2 [NET]: Fix kfree(skb)
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-28 09:42:14 -08:00
David S. Miller 3a765aa528 [XFRM] xfrm_user: Fix return values of xfrm_add_sa_expire.
As noted by Kent Yoder, this function will always return an
error.  Make sure it returns zero on success.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-28 09:41:57 -08:00
Kazunori MIYAZAWA 928ba4169d [IPSEC]: Fix the address family to refer encap_family
Fix the address family to refer encap_family
when comparing with a kernel generated xfrm_state

Signed-off-by: Kazunori MIYAZAWA <miyazawa@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-13 12:57:16 -08:00
David S. Miller 13fcfbb067 [XFRM]: Fix OOPSes in xfrm_audit_log().
Make sure that this function is called correctly, and
add BUG() checking to ensure the arguments are sane.

Based upon a patch by Joy Latten.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-12 13:53:54 -08:00
YOSHIFUJI Hideaki a716c1197d [NET] XFRM: Fix whitespace errors.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-10 23:20:24 -08:00
David S. Miller 9783e1df7a Merge branch 'HEAD' of master.kernel.org:/pub/scm/linux/kernel/git/herbert/crypto-2.6
Conflicts:

	crypto/Kconfig
2007-02-08 15:25:18 -08:00
David S. Miller e610e679dd [XFRM]: xfrm_migrate() needs exporting to modules.
Needed by xfrm_user and af_key.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 13:29:15 -08:00
Shinta Sugimoto f6ed0ec0ee [PFKEYV2]: CONFIG_NET_KEY_MIGRATE option
Add CONFIG_NET_KEY_MIGRATE option which makes it possible for user
application to send or receive MIGRATE message to/from PF_KEY socket.

Signed-off-by: Shinta Sugimoto <shinta.sugimoto@ericsson.com>
Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 13:15:05 -08:00
Shinta Sugimoto d0473655c8 [XFRM]: CONFIG_XFRM_MIGRATE option
Add CONFIG_XFRM_MIGRATE option which makes it possible for for user
application to send or receive MIGRATE message to/from netlink socket.

Signed-off-by: Shinta Sugimoto <shinta.sugimoto@ericsson.com>
Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 13:13:07 -08:00
Shinta Sugimoto 5c79de6e79 [XFRM]: User interface for handling XFRM_MSG_MIGRATE
Add user interface for handling XFRM_MSG_MIGRATE. The message is issued
by user application. When kernel receives the message, procedure of
updating XFRM databases will take place.

Signed-off-by: Shinta Sugimoto <shinta.sugimoto@ericsson.com>
Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 13:12:32 -08:00
Shinta Sugimoto 80c9abaabf [XFRM]: Extension for dynamic update of endpoint address(es)
Extend the XFRM framework so that endpoint address(es) in the XFRM
databases could be dynamically updated according to a request (MIGRATE
message) from user application. Target XFRM policy is first identified
by the selector in the MIGRATE message. Next, the endpoint addresses
of the matching templates and XFRM states are updated according to
the MIGRATE message.

Signed-off-by: Shinta Sugimoto <shinta.sugimoto@ericsson.com>
Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 13:11:42 -08:00
Miika Komu cdca72652a [IPSEC]: exporting xfrm_state_afinfo
This patch exports xfrm_state_afinfo.

Signed-off-by: Miika Komu <miika@iki.fi>
Signed-off-by: Diego Beltrami <Diego.Beltrami@hiit.fi>
Signed-off-by: Kazunori Miyazawa <miyazawa@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 12:39:00 -08:00
Noriaki TAKAMIYA 6a0dc8d733 [IPSEC]: added the entry of Camellia cipher algorithm to ealg_list[]
This patch adds the entry of Camellia cipher algorithm to ealg_list[].

Signed-off-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2007-02-07 09:21:05 +11:00
Herbert Xu a6c7ab55dd [IPSEC]: Policy list disorder
The recent hashing introduced an off-by-one bug in policy list insertion.
Instead of adding after the last entry with a lesser or equal priority,
we're adding after the successor of that entry.

This patch fixes this and also adds a warning if we detect a duplicate
entry in the policy list.  This should never happen due to this if clause.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-01-23 20:25:51 -08:00
Christoph Hellwig 22e7005023 [XFRM_USER]: avoid pointless void ** casts
All ->doit handlers want a struct rtattr **, so pass down the right
type.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-01-03 18:38:13 -08:00
Martin Willi b836267aa7 [XFRM]: Algorithm lookup using .compat name
Installing an IPsec SA using old algorithm names (.compat) does not work
if the algorithm is not already loaded. When not using the PF_KEY
interface, algorithms are not preloaded in xfrm_probe_algs() and
installing a IPsec SA fails.

Signed-off-by: Martin Willi <martin@strongswan.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-31 14:06:51 -08:00
Linus Torvalds 2685b267bc Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (48 commits)
  [NETFILTER]: Fix non-ANSI func. decl.
  [TG3]: Identify Serdes devices more clearly.
  [TG3]: Use msleep.
  [TG3]: Use netif_msg_*.
  [TG3]: Allow partial speed advertisement.
  [TG3]: Add TG3_FLG2_IS_NIC flag.
  [TG3]: Add 5787F device ID.
  [TG3]: Fix Phy loopback.
  [WANROUTER]: Kill kmalloc debugging code.
  [TCP] inet_twdr_hangman: Delete unnecessary memory barrier().
  [NET]: Memory barrier cleanups
  [IPSEC]: Fix inetpeer leak in ipv4 xfrm dst entries.
  audit: disable ipsec auditing when CONFIG_AUDITSYSCALL=n
  audit: Add auditing to ipsec
  [IRDA] irlan: Fix compile warning when CONFIG_PROC_FS=n
  [IrDA]: Incorrect TTP header reservation
  [IrDA]: PXA FIR code device model conversion
  [GENETLINK]: Fix misplaced command flags.
  [NETLIK]: Add a pointer to the Generic Netlink wiki page.
  [IPV6] RAW: Don't release unlocked sock.
  ...
2006-12-07 09:05:15 -08:00
Christoph Lameter e18b890bb0 [PATCH] slab: remove kmem_cache_t
Replace all uses of kmem_cache_t with struct kmem_cache.

The patch was generated using the following script:

	#!/bin/sh
	#
	# Replace one string by another in all the kernel sources.
	#

	set -e

	for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do
		quilt add $file
		sed -e "1,\$s/$1/$2/g" $file >/tmp/$$
		mv /tmp/$$ $file
		quilt refresh
	done

The script was run like this

	sh replace kmem_cache_t "struct kmem_cache"

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:25 -08:00
Christoph Lameter 54e6ecb239 [PATCH] slab: remove SLAB_ATOMIC
SLAB_ATOMIC is an alias of GFP_ATOMIC

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:24 -08:00
Joy Latten c9204d9ca7 audit: disable ipsec auditing when CONFIG_AUDITSYSCALL=n
Disables auditing in ipsec when CONFIG_AUDITSYSCALL is
disabled in the kernel.

Also includes a bug fix for xfrm_state.c as a result of
original ipsec audit patch.

Signed-off-by: Joy Latten <latten@austin.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-06 20:14:23 -08:00
Joy Latten 161a09e737 audit: Add auditing to ipsec
An audit message occurs when an ipsec SA
or ipsec policy is created/deleted.

Signed-off-by: Joy Latten <latten@austin.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-06 20:14:22 -08:00
Kazunori MIYAZAWA 7cf4c1a5fd [IPSEC]: Add support for AES-XCBC-MAC
The glue of xfrm.

Signed-off-by: Kazunori MIYAZAWA <miyazawa@linux-ipv6.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-12-06 18:38:51 -08:00
Jamal Hadi Salim 94b9bb5480 [XFRM] Optimize SA dumping
Same comments as in "[XFRM] Optimize policy dumping"

The numbers are (20K SAs):
2006-12-06 18:38:45 -08:00
Jamal Hadi Salim baf5d743d1 [XFRM] Optimize policy dumping
This change optimizes the dumping of Security policies.

1) Before this change ..
speedopolis:~# time ./ip xf pol

real    0m22.274s
user    0m0.000s
sys     0m22.269s

2) Turn off sub-policies

speedopolis:~# ./ip xf pol

real    0m13.496s
user    0m0.000s
sys     0m13.493s

i suppose the above is to be expected

3) With this change ..
speedopolis:~# time ./ip x policy

real    0m7.901s
user    0m0.008s
sys     0m7.896s
2006-12-06 18:38:44 -08:00
David Howells 9db7372445 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:

	drivers/ata/libata-scsi.c
	include/linux/libata.h

Futher merge of Linus's head and compilation fixups.

Signed-Off-By: David Howells <dhowells@redhat.com>
2006-12-05 17:01:28 +00:00
David Howells 4c1ac1b491 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:

	drivers/infiniband/core/iwcm.c
	drivers/net/chelsio/cxgb2.c
	drivers/net/wireless/bcm43xx/bcm43xx_main.c
	drivers/net/wireless/prism54/islpci_eth.c
	drivers/usb/core/hub.h
	drivers/usb/input/hid-core.c
	net/core/netpoll.c

Fix up merge failures with Linus's head and fix new compilation failures.

Signed-Off-By: David Howells <dhowells@redhat.com>
2006-12-05 14:37:56 +00:00
David S. Miller b4ad86bf52 [XFRM] xfrm_user: Better validation of user templates.
Since we never checked the ->family value of templates
before, many applications simply leave it at zero.
Detect this and fix it up to be the pol->family value.

Also, do not clobber xp->family while reading in templates,
that is not necessary.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-03 19:19:26 -08:00
Jamal Hadi Salim 2b5f6dcce5 [XFRM]: Fix aevent structuring to be more complete.
aevents can not uniquely identify an SA. We break the ABI with this
patch, but consensus is that since it is not yet utilized by any
(known) application then it is fine (better do it now than later).

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 22:22:25 -08:00
Miika Komu 8511d01d7c [IPSEC]: Add netlink interface for the encapsulation family.
Signed-off-by: Miika Komu <miika@iki.fi>
Signed-off-by: Diego Beltrami <Diego.Beltrami@hiit.fi>
Signed-off-by: Kazunori Miyazawa <miyazawa@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:31:49 -08:00
Miika Komu 76b3f055f3 [IPSEC]: Add encapsulation family.
Signed-off-by: Miika Komu <miika@iki.fi>
Signed-off-by: Diego Beltrami <Diego.Beltrami@hiit.fi>
Signed-off-by: Kazunori Miyazawa <miyazawa@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:31:48 -08:00
Jamal Hadi Salim b798a9ede2 [XFRM]: Convert a few __u8 to proper u8
Caught by the EyeBalls(tm) of Thomas Graf

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:30:50 -08:00
Jamal Hadi Salim 0c51f53c57 [XFRM]: Make flush notifier prettier when subpolicy used
Might as well make flush notifier prettier when subpolicy used

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:30:49 -08:00
Thomas Graf 4e9b826935 [NETLINK]: Remove unused dst_pid field in netlink_skb_parms
The destination PID is passed directly to netlink_unicast()
respectively netlink_multicast().

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:30:43 -08:00
Arnaldo Carvalho de Melo cdbc6dae5c [XFRM]: Use kmemdup where appropriate
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:30:22 -08:00
Jamal Hadi Salim 1459bb36b1 [XFRM]: Make copy_to_user_policy_type take a type
Make copy_to_user_policy_type take a type instead a policy and
fix its users to pass the type

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:14 -08:00
Andrew Morton 776810217a [XFRM]: uninline xfrm_selector_match()
Six callsites, huge.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:36 -08:00
Venkat Yekkirala 67f83cbf08 SELinux: Fix SA selection semantics
Fix the selection of an SA for an outgoing packet to be at the same
context as the originating socket/flow. This eliminates the SELinux
policy's ability to use/sendto SAs with contexts other than the socket's.

With this patch applied, the SELinux policy will require one or more of the
following for a socket to be able to communicate with/without SAs:

1. To enable a socket to communicate without using labeled-IPSec SAs:

allow socket_t unlabeled_t:association { sendto recvfrom }

2. To enable a socket to communicate with labeled-IPSec SAs:

allow socket_t self:association { sendto };
allow socket_t peer_sa_t:association { recvfrom };

Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: James Morris <jmorris@namei.org>
2006-12-02 21:21:34 -08:00
Al Viro 5d36b1803d [XFRM]: annotate ->new_mapping()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:18 -08:00
Masahide NAKAMURA 9abbffee86 [XFRM] STATE: Fix to respond error to get operation if no matching entry exists.
When application uses XFRM_MSG_GETSA to get state entry through
netlink socket and kernel has no matching one, the application expects
reply message with error status by kernel.

Kernel doesn't send the message back in the case of Mobile IPv6 route
optimization protocols (i.e. routing header or destination options
header). This is caused by incorrect return code "0" from
net/xfrm/xfrm_user.c(xfrm_user_state_lookup) and it makes kernel skip
to acknowledge at net/netlink/af_netlink.c(netlink_rcv_skb).

This patch fix to reply ESRCH to application.

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: TAKAMIYA Noriaki <takamiya@po.ntts.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-11-25 15:16:52 -08:00
David Howells c4028958b6 WorkStruct: make allyesconfig
Fix up for make allyesconfig.

Signed-Off-By: David Howells <dhowells@redhat.com>
2006-11-22 14:57:56 +00:00
Jamal Hadi Salim 785fd8b8a5 [XFRM]: nlmsg length not computed correctly in the presence of subpolicies
I actually dont have a test case for these; i just found them by
inspection. Refer to patch "[XFRM]: Sub-policies broke policy events"
for more info

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Acked-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-11-21 16:16:35 -08:00
Jamal Hadi Salim 334f3d45d3 [XFRM]: Sub-policies broke policy events
XFRM policy events are broken when sub-policy feature is turned on.
A simple test to verify this:
run ip xfrm mon on one window and add then delete a policy on another
window ..

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Acked-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-11-21 16:16:34 -08:00
David S. Miller 54489c14c0 [XFRM] xfrm_user: Fix unaligned accesses.
Use memcpy() to move xfrm_address_t objects in and out
of netlink messages.  The vast majority of xfrm_user was
doing this properly, except for copy_from_user_state()
and copy_to_user_state().

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-30 15:24:35 -08:00
Patrick McHardy 2fab22f2d3 [XFRM]: Fix xfrm_state accounting
xfrm_state_num needs to be increased for XFRM_STATE_ACQ states created
by xfrm_state_find() to prevent the counter from going negative when
the state is destroyed.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-24 15:34:00 -07:00
David S. Miller 918049f013 [XFRM]: Fix xfrm_state_num going negative.
Missing counter bump when hashing in a new ACQ
xfrm_state.

Now that we have two spots to do the hash grow
check, break it out into a helper function.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-15 23:14:18 -07:00
Venkat Yekkirala 3bccfbc7a7 IPsec: fix handling of errors for socket policies
This treats the security errors encountered in the case of
socket policy matching, the same as how these are treated in
the case of main/sub policies, which is to return a full lookup
failure.

Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: James Morris <jmorris@namei.org>
2006-10-11 23:59:39 -07:00
Venkat Yekkirala 5b368e61c2 IPsec: correct semantics for SELinux policy matching
Currently when an IPSec policy rule doesn't specify a security
context, it is assumed to be "unlabeled" by SELinux, and so
the IPSec policy rule fails to match to a flow that it would
otherwise match to, unless one has explicitly added an SELinux
policy rule allowing the flow to "polmatch" to the "unlabeled"
IPSec policy rules. In the absence of such an explicitly added
SELinux policy rule, the IPSec policy rule fails to match and
so the packet(s) flow in clear text without the otherwise applicable
xfrm(s) applied.

The above SELinux behavior violates the SELinux security notion of
"deny by default" which should actually translate to "encrypt by
default" in the above case.

This was first reported by Evgeniy Polyakov and the way James Morris
was seeing the problem was when connecting via IPsec to a
confined service on an SELinux box (vsftpd), which did not have the
appropriate SELinux policy permissions to send packets via IPsec.

With this patch applied, SELinux "polmatching" of flows Vs. IPSec
policy rules will only come into play when there's a explicit context
specified for the IPSec policy rule (which also means there's corresponding
SELinux policy allowing appropriate domains/flows to polmatch to this context).

Secondly, when a security module is loaded (in this case, SELinux), the
security_xfrm_policy_lookup() hook can return errors other than access denied,
such as -EINVAL.  We were not handling that correctly, and in fact
inverting the return logic and propagating a false "ok" back up to
xfrm_lookup(), which then allowed packets to pass as if they were not
associated with an xfrm policy.

The solution for this is to first ensure that errno values are
correctly propagated all the way back up through the various call chains
from security_xfrm_policy_lookup(), and handled correctly.

Then, flow_cache_lookup() is modified, so that if the policy resolver
fails (typically a permission denied via the security module), the flow
cache entry is killed rather than having a null policy assigned (which
indicates that the packet can pass freely).  This also forces any future
lookups for the same flow to consult the security module (e.g. SELinux)
for current security policy (rather than, say, caching the error on the
flow cache entry).

This patch: Fix the selinux side of things.

This makes sure SELinux polmatching of flow contexts to IPSec policy
rules comes into play only when an explicit context is associated
with the IPSec policy rule.

Also, this no longer defaults the context of a socket policy to
the context of the socket since the "no explicit context" case
is now handled properly.

Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: James Morris <jmorris@namei.org>
2006-10-11 23:59:37 -07:00
James Morris 134b0fc544 IPsec: propagate security module errors up from flow_cache_lookup
When a security module is loaded (in this case, SELinux), the
security_xfrm_policy_lookup() hook can return an access denied permission
(or other error).  We were not handling that correctly, and in fact
inverting the return logic and propagating a false "ok" back up to
xfrm_lookup(), which then allowed packets to pass as if they were not
associated with an xfrm policy.

The way I was seeing the problem was when connecting via IPsec to a
confined service on an SELinux box (vsftpd), which did not have the
appropriate SELinux policy permissions to send packets via IPsec.

The first SYNACK would be blocked, because of an uncached lookup via
flow_cache_lookup(), which would fail to resolve an xfrm policy because
the SELinux policy is checked at that point via the resolver.

However, retransmitted SYNACKs would then find a cached flow entry when
calling into flow_cache_lookup() with a null xfrm policy, which is
interpreted by xfrm_lookup() as the packet not having any associated
policy and similarly to the first case, allowing it to pass without
transformation.

The solution presented here is to first ensure that errno values are
correctly propagated all the way back up through the various call chains
from security_xfrm_policy_lookup(), and handled correctly.

Then, flow_cache_lookup() is modified, so that if the policy resolver
fails (typically a permission denied via the security module), the flow
cache entry is killed rather than having a null policy assigned (which
indicates that the packet can pass freely).  This also forces any future
lookups for the same flow to consult the security module (e.g. SELinux)
for current security policy (rather than, say, caching the error on the
flow cache entry).

Signed-off-by: James Morris <jmorris@namei.org>
2006-10-11 23:59:34 -07:00
Diego Beltrami 0a69452cb4 [XFRM]: BEET mode
This patch introduces the BEET mode (Bound End-to-End Tunnel) with as
specified by the ietf draft at the following link:

http://www.ietf.org/internet-drafts/draft-nikander-esp-beet-mode-06.txt

The patch provides only single family support (i.e. inner family =
outer family).

Signed-off-by: Diego Beltrami <diego.beltrami@gmail.com>
Signed-off-by: Miika Komu     <miika@iki.fi>
Signed-off-by: Herbert Xu     <herbert@gondor.apana.org.au>
Signed-off-by: Abhinav Pathak <abhinav.pathak@hiit.fi>
Signed-off-by: Jeff Ahrenholz <ahrenholz@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-04 00:31:09 -07:00
David S. Miller ae8c05779a [XFRM]: Clearing xfrm_policy_count[] to zero during flush is incorrect.
When we flush policies, we do a type match so we might not
actually delete all policies matching a certain direction.

So keep track of how many policies we actually kill and
subtract that number from xfrm_policy_count[dir] at the
end.

Based upon a patch by Masahide NAKAMURA.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-04 00:31:02 -07:00
Masahide NAKAMURA 667bbcb6c0 [XFRM] STATE: Use destination address for src hash.
Src hash is introduced for Mobile IPv6 route optimization usage.
On current kenrel code it is calculated with source address only.
It results we uses the same hash value for outbound state (when
the node has only one address for Mobile IPv6).
This patch use also destination address as peer information for
src hash to be dispersed.

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-04 00:31:02 -07:00
Masahide NAKAMURA 7b4dc3600e [XFRM]: Do not add a state whose SPI is zero to the SPI hash.
SPI=0 is used for acquired IPsec SA and MIPv6 RO state.
Such state should not be added to the SPI hash
because we do not care about it on deleting path.

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2006-09-28 18:02:49 -07:00
Al Viro 8122adf06e [XFRM]: xfrm_spi_hash() annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-28 18:02:44 -07:00
Al Viro 61f4627b2f [XFRM]: xfrm_replay_advance() annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-28 18:02:41 -07:00
Al Viro a252cc2371 [XFRM]: xrfm_replay_check() annotations
seq argument is net-endian

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-28 18:02:40 -07:00
Al Viro 6067b2baba [XFRM]: xfrm_parse_spi() annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-28 18:02:39 -07:00
Al Viro a94cfd1974 [XFRM]: xfrm_state_lookup() annotations
spi argument of xfrm_state_lookup() is net-endian

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-28 18:02:37 -07:00
Al Viro 26977b4ed7 [XFRM]: xfrm_alloc_spi() annotated
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-28 18:02:36 -07:00
Patrick McHardy a1e59abf82 [XFRM]: Fix wildcard as tunnel source
Hashing SAs by source address breaks templates with wildcards as tunnel
source since the source address used for hashing/lookup is still 0/0.
Move source address lookup to xfrm_tmpl_resolve_one() so we can use the
real address in the lookup.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:19:06 -07:00
James Morris d1d9facfd1 [XFRM]: remove xerr_idxp from __xfrm_policy_check()
It seems that during the MIPv6 respin, some code which was originally
conditionally compiled around CONFIG_XFRM_ADVANCED was accidently left
in after the config option was removed.

This patch removes an extraneous pointer (xerr_idxp) which is no
longer needed.

Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:18:49 -07:00
Masahide NAKAMURA a9917c0665 [XFRM] STATE: Fix flusing with hash mask.
This is a minor fix about transformation state flushing
for net-2.6.19. Please apply it.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:18:45 -07:00
Alexey Dobriyan e5d679f339 [NET]: Use SLAB_PANIC
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:18:19 -07:00
David S. Miller acba48e1a3 [XFRM]: Respect priority in policy lookups.
Even if we find an exact match in the hash table,
we must inspect the inexact list to look for a match
with a better priority.

Noticed by Masahide NAKAMURA <nakam@linux-ipv6.org>.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:18:05 -07:00
David S. Miller 44e36b42a8 [XFRM]: Extract common hashing code into xfrm_hash.[ch]
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:08:49 -07:00
David S. Miller 2518c7c2b3 [XFRM]: Hash policies when non-prefixed.
This idea is from Alexey Kuznetsov.

It is common for policies to be non-prefixed.  And for
that case we can optimize lookups, insert, etc. quite
a bit.

For each direction, we have a dynamically sized policy
hash table for non-prefixed policies.  We also have a
hash table on policy->index.

For prefixed policies, we have a list per-direction which
we will consult on lookups when a non-prefix hashtable
lookup fails.

This still isn't as efficient as I would like it.  There
are four immediate problems:

1) Lots of excessive refcounting, which can be fixed just
   like xfrm_state was
2) We do 2 hash probes on insert, one to look for dups and
   one to allocate a unique policy->index.  Althought I wonder
   how much this matters since xfrm_state inserts do up to
   3 hash probes and that seems to perform fine.
3) xfrm_policy_insert() is very complex because of the priority
   ordering and entry replacement logic.
4) Lots of counter bumping, in addition to policy refcounts,
   in the form of xfrm_policy_count[].  This is merely used
   to let code path(s) know that some IPSEC rules exist.  So
   this count is indexed per-direction, maybe that is overkill.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:08:48 -07:00
David S. Miller c1969f294e [XFRM]: Hash xfrm_state objects by source address too.
The source address is always non-prefixed so we should use
it to help give entropy to the bydst hash.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:08:47 -07:00
David S. Miller a47f0ce05a [XFRM]: Kill excessive refcounting of xfrm_state objects.
The refcounting done for timers and hash table insertions
are just wasted cycles.  We can eliminate all of this
refcounting because:

1) The implicit refcount when the xfrm_state object is active
   will always be held while the object is in the hash tables.
   We never kfree() the xfrm_state until long after we've made
   sure that it has been unhashed.

2) Timers are even easier.  Once we mark that x->km.state as
   anything other than XFRM_STATE_VALID (__xfrm_state_delete
   sets it to XFRM_STATE_DEAD), any timer that fires will
   do nothing and return without rearming the timer.

   Therefore we can defer the del_timer calls until when the
   object is about to be freed up during GC.  We have to use
   del_timer_sync() and defer it to GC because we can't do
   a del_timer_sync() while holding x->lock which all callers
   of __xfrm_state_delete hold.

This makes SA changes even more light-weight.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:08:47 -07:00
David S. Miller 1c09539975 [XFRM]: Purge dst references to deleted SAs passively.
Just let GC and other normal mechanisms take care of getting
rid of DST cache references to deleted xfrm_state objects
instead of walking all the policy bundles.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:08:46 -07:00
David S. Miller c7f5ea3a4d [XFRM]: Do not flush all bundles on SA insert.
Instead, simply set all potentially aliasing existing xfrm_state
objects to have the current generation counter value.

This will make routes get relooked up the next time an existing
route mentioning these aliased xfrm_state objects gets used,
via xfrm_dst_check().

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:08:45 -07:00
David S. Miller 2575b65434 [XFRM]: Simplify xfrm_spi_hash
It can use __xfrm{4,6}_addr_hash().

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:08:44 -07:00
David S. Miller a624c108e5 [XFRM]: Put more keys into destination hash function.
Besides the daddr, key the hash on family and reqid too.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:08:43 -07:00
David S. Miller 9d4a706d85 [XFRM]: Add generation count to xfrm_state and xfrm_dst.
Each xfrm_state inserted gets a new generation counter
value.  When a bundle is created, the xfrm_dst objects
get the current generation counter of the xfrm_state
they will attach to at dst->xfrm.

xfrm_bundle_ok() will return false if it sees an
xfrm_dst with a generation count different from the
generation count of the xfrm_state that dst points to.

This provides a facility by which to passively and
cheaply invalidate cached IPSEC routes during SA
database changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:08:42 -07:00
David S. Miller f034b5d4ef [XFRM]: Dynamic xfrm_state hash table sizing.
The grow algorithm is simple, we grow if:

1) we see a hash chain collision at insert, and
2) we haven't hit the hash size limit (currently 1*1024*1024 slots), and
3) the number of xfrm_state objects is > the current hash mask

All of this needs some tweaking.

Remove __initdata from "hashdist" so we can use it safely at run time.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:08:41 -07:00
David S. Miller 8f126e37c0 [XFRM]: Convert xfrm_state hash linkage to hlists.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:08:40 -07:00
David S. Miller edcd582152 [XFRM]: Pull xfrm_state_by{spi,src} hash table knowledge out of afinfo.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:08:39 -07:00
David S. Miller 2770834c9f [XFRM]: Pull xfrm_state_bydst hash table knowledge out of afinfo.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:08:38 -07:00
Masahide NAKAMURA f7b6983f0f [XFRM] POLICY: Support netlink socket interface for sub policy.
Sub policy can be used through netlink socket.
PF_KEY uses main only and it is TODO to support sub.

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:08:35 -07:00
Masahide NAKAMURA 41a49cc3c0 [XFRM]: Add sorting interface for state and template.
Under two transformation policies it is required to merge them.
This is a platform to sort state for outbound and templates
for inbound respectively.
It will be used when Mobile IPv6 and IPsec are used at the same time.

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:08:34 -07:00
Masahide NAKAMURA 4e81bb8336 [XFRM] POLICY: sub policy support.
Sub policy is introduced. Main and sub policy are applied the same flow.
(Policy that current kernel uses is named as main.)
It is required another transformation policy management to keep IPsec
and Mobile IPv6 lives separate.
Policy which lives shorter time in kernel should be a sub i.e. normally
main is for IPsec and sub is for Mobile IPv6.
(Such usage as two IPsec policies on different database can be used, too.)

Limitation or TODOs:
 - Sub policy is not supported for per socket one (it is always inserted as main).
 - Current kernel makes cached outbound with flowi to skip searching database.
   However this patch makes it disabled only when "two policies are used and
   the first matched one is bypass case" because neither flowi nor bundle
   information knows about transformation template size.

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2006-09-22 15:08:34 -07:00
Masahide NAKAMURA c11f1a15c5 [XFRM] POLICY: Add Kconfig to support sub policy.
Add Kconfig to support sub policy.

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:08:33 -07:00
Masahide NAKAMURA 97a64b4577 [XFRM]: Introduce XFRM_MSG_REPORT.
XFRM_MSG_REPORT is a message as notification of state protocol and
selector from kernel to user-space.

Mobile IPv6 will use it when inbound reject is occurred at route
optimization to make user-space know a binding error requirement.

Based on MIPL2 kernel patch.

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:08:30 -07:00
Masahide NAKAMURA df0ba92a99 [XFRM]: Trace which secpath state is reject factor.
For Mobile IPv6 usage, it is required to trace which secpath state is
reject factor in order to notify it to user space (to know the address
which cannot be used route optimized communication).

Based on MIPL2 kernel patch.

This patch was also written by: Henrik Petander <petander@tcs.hut.fi>

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:08:29 -07:00
Masahide NAKAMURA e23c7194a8 [XFRM] STATE: Add Mobile IPv6 route optimization protocols to netlink interface.
Add Mobile IPv6 route optimization protocols to netlink interface.
Route optimization states carry care-of address.
Based on MIPL2 kernel patch.

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:06:59 -07:00
Masahide NAKAMURA 654b32c6aa [XFRM]: Fix message about transformation user interface.
Transformation user interface is not only for IPsec.
Based on MIPL2 kernel patch.

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:06:45 -07:00
Masahide NAKAMURA e53820de0f [XFRM] IPV6: Restrict bundle reusing
For outbound transformation, bundle is checked whether it is
suitable for current flow to be reused or not. In such IPv6 case
as below, transformation may apply incorrect bundle for the flow instead
of creating another bundle:

- The policy selector has destination prefix length < 128
  (Two or more addresses can be matched it)
- Its bundle holds dst entry of default route whose prefix length < 128
  (Previous traffic was used such route as next hop)
- The policy and the bundle were used a transport mode state and
  this time flow address is not matched the bundled state.

This issue is found by Mobile IPv6 usage to protect mobility signaling
by IPsec, but it is not a Mobile IPv6 specific.
This patch adds strict check to xfrm_bundle_ok() for each
state mode and address when prefix length is less than 128.

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:06:44 -07:00
Masahide NAKAMURA 9afaca0579 [XFRM] IPV6: Update outbound state timestamp for each sending.
With this patch transformation state is updated last used time
for each sending. Xtime is used for it like other state lifetime
expiration.
Mobile IPv6 enabled nodes will want to know traffic status of each
binding (e.g. judgement to request binding refresh by correspondent node,
or to keep home/care-of nonce alive by mobile node).
The last used timestamp is an important hint about it.
Based on MIPL2 kernel patch.

This patch was also written by: Henrik Petander <petander@tcs.hut.fi>

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:06:43 -07:00
Noriaki TAKAMIYA 060f02a3bd [XFRM] STATE: Introduce care-of address.
Care-of address is carried by state as a transformation option like
IPsec encryption/authentication algorithm.

Based on MIPL2 kernel patch.

Signed-off-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp>
Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2006-09-22 15:06:42 -07:00
Masahide NAKAMURA 9e51fd371a [XFRM]: Rename secpath_has_tunnel to secpath_has_nontransport.
On current kernel inbound transformation state is allowed transport and
disallowed tunnel mode when mismatch is occurred between tempates and states.
As the result of adding two more modes by Mobile IPv6, this function name
is misleading. Inbound transformation can allow only transport mode
when mismatch is occurred between template and secpath.
Based on MIPL2 kernel patch.

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:06:40 -07:00
Masahide NAKAMURA fbd9a5b47e [XFRM] STATE: Common receive function for route optimization extension headers.
XFRM_STATE_WILDRECV flag is introduced; the last resort state is set
it and receives packet which is not route optimized but uses such
extension headers i.e. Mobile IPv6 signaling (binding update and
acknowledgement).  A node enabled Mobile IPv6 adds the state.

Based on MIPL2 kernel patch.

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:06:39 -07:00
Masahide NAKAMURA f3bd484021 [XFRM]: Restrict authentication algorithm only when inbound transformation protocol is IPsec.
For Mobile IPv6 usage, routing header or destination options header is
used and it doesn't require this comparison. It is checked only for
IPsec template.

Based on MIPL2 kernel patch.

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:06:38 -07:00
Masahide NAKAMURA eb2971b68a [XFRM] STATE: Search by address using source address list.
This is a support to search transformation states by its addresses
by using source address list for Mobile IPv6 usage.
To use it from user-space, it is also added a message type for
source address as a xfrm state option.
Based on MIPL2 kernel patch.

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:06:35 -07:00
Masahide NAKAMURA 6c44e6b7ab [XFRM] STATE: Add source address list.
Support source address based searching.
Mobile IPv6 will use it.
Based on MIPL2 kernel patch.

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:06:34 -07:00
Masahide NAKAMURA dc00a52560 [XFRM] STATE: Allow non IPsec protocol.
It will be added two more transformation protocols (routing header
and destination options header) for Mobile IPv6.
xfrm_id_proto_match() can be handle zero as all, IPSEC_PROTO_ANY as
all IPsec and otherwise as exact one.
Based on MIPL2 kernel patch.

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:06:32 -07:00
Masahide NAKAMURA 5794708f11 [XFRM]: Introduce a helper to compare id protocol.
Put the helper to header for future use.
Based on MIPL2 kernel patch.

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:06:24 -07:00
Masahide NAKAMURA 7e49e6de30 [XFRM]: Add XFRM_MODE_xxx for future use.
Transformation mode is used as either IPsec transport or tunnel.
It is required to add two more items, route optimization and inbound trigger
for Mobile IPv6.
Based on MIPL2 kernel patch.

This patch was also written by: Ville Nuorvala <vnuorval@tcs.hut.fi>

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:05:15 -07:00
Venkat Yekkirala cb969f072b [MLSXFRM]: Default labeling of socket specific IPSec policies
This defaults the label of socket-specific IPSec policies to be the
same as the socket they are set on.

Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:53:28 -07:00
Venkat Yekkirala beb8d13bed [MLSXFRM]: Add flow labeling
This labels the flows that could utilize IPSec xfrms at the points the
flows are defined so that IPSec policy and SAs at the right label can
be used.

The following protos are currently not handled, but they should
continue to be able to use single-labeled IPSec like they currently
do.

ipmr
ip_gre
ipip
igmp
sit
sctp
ip6_tunnel (IPv6 over IPv6 tunnel device)
decnet

Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:53:27 -07:00
Serge Hallyn 0d681623d3 [MLSXFRM]: Add security context to acquire messages using netlink
This includes the security context of a security association created
for use by IKE in the acquire messages sent to IKE daemons using
netlink/xfrm_user. This would allow the daemons to include the
security context in the negotiation, so that the resultant association
is unique to that security context.

Signed-off-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:53:25 -07:00
Venkat Yekkirala e0d1caa7b0 [MLSXFRM]: Flow based matching of xfrm policy and state
This implements a seemless mechanism for xfrm policy selection and
state matching based on the flow sid. This also includes the necessary
SELinux enforcement pieces.

Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:53:24 -07:00
Herbert Xu e4d5b79c66 [CRYPTO] users: Use crypto_comp and crypto_has_*
This patch converts all users to use the new crypto_comp type and the
crypto_has_* functions.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-09-21 11:46:22 +10:00
Herbert Xu 07d4ee583e [IPSEC]: Use HMAC template and hash interface
This patch converts IPsec to use the new HMAC template.  The names of
existing simple digest algorithms may still be used to refer to their
HMAC composites.

The same structure can be used by other MACs such as AES-XCBC-MAC.

This patch also switches from the digest interface to hash.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-21 11:46:18 +10:00
Herbert Xu 6b7326c849 [IPSEC] ESP: Use block ciphers where applicable
This patch converts IPSec/ESP to use the new block cipher type where
applicable.  Similar to the HMAC conversion, existing algorithm names
have been kept for compatibility.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-09-21 11:46:14 +10:00
Herbert Xu 04ff126094 [IPSEC]: Add compatibility algorithm name support
This patch adds a compatibility name field for each IPsec algorithm.  This
is needed when parameterised algorithms are used.  For example, "md5" will
become "hmac(md5)", and "aes" will become "cbc(aes)".

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-09-21 11:46:14 +10:00
Herbert Xu 9409f38a0c [IPSEC]: Move linux/crypto.h inclusion out of net/xfrm.h
The header file linux/crypto.h is only needed by a few files so including
it in net/xfrm.h (which is included by half of the networking stack) is a
waste.  This patch moves it out of net/xfrm.h and into the specific header
files that actually need it.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-09-21 11:16:30 +10:00
David S. Miller d49c73c729 [IPSEC]: Validate properly in xfrm_dst_check()
If dst->obsolete is -1, this is a signal from the
bundle creator that we want the XFRM dst and the
dsts that it references to be validated on every
use.

I misunderstood this intention when I changed
xfrm_dst_check() to always return NULL.

Now, when we purge a dst entry, by running dst_free()
on it.  This will set the dst->obsolete to a positive
integer, and we want to return NULL in that case so
that the socket does a relookup for the route.

Thus, if dst->obsolete<0, let stale_bundle() validate
the state, else always return NULL.

In general, we need to do things more intelligently
here because we flush too much state during rule
changes.  Herbert Xu has some ideas wherein the key
manager gives us some help in this area.  We can also
use smarter state management algorithms inside of
the kernel as well.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-13 18:55:53 -07:00
Panagiotis Issaris 0da974f4f3 [NET]: Conversions from kmalloc+memset to k(z|c)alloc.
Signed-off-by: Panagiotis Issaris <takis@issaris.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-21 14:51:30 -07:00
Jörn Engel 6ab3d5624e Remove obsolete #include <linux/config.h>
Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30 19:25:36 +02:00
Adrian Bunk 244055fdc8 [XFRM]: unexport xfrm_state_mtu
This patch removes the unused EXPORT_SYMBOL(xfrm_state_mtu).

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29 16:58:33 -07:00
Darrel Goeddel c7bdb545d2 [NETLINK]: Encapsulate eff_cap usage within security framework.
This patch encapsulates the usage of eff_cap (in netlink_skb_params) within
the security framework by extending security_netlink_recv to include a required
capability parameter and converting all direct usage of eff_caps outside
of the lsm modules to use the interface.  It also updates the SELinux
implementation of the security_netlink_send and security_netlink_recv
hooks to take advantage of the sid in the netlink_skb_params struct.
This also enables SELinux to perform auditing of netlink capability checks.
Please apply, for 2.6.18 if possible.

Signed-off-by: Darrel Goeddel <dgoeddel@trustedcs.com>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by:  James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29 16:57:55 -07:00
David S. Miller 6f68dc3775 [NET]: Fix warnings after LSM-IPSEC changes.
Assignment used as truth value in xfrm_del_sa()
and xfrm_get_policy().

Wrong argument type declared for security_xfrm_state_delete()
when SELINUX is disabled.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:49 -07:00
Catherine Zhang c8c05a8eec [LSM-IPsec]: SELinux Authorize
This patch contains a fix for the previous patch that adds security
contexts to IPsec policies and security associations.  In the previous
patch, no authorization (besides the check for write permissions to
SAD and SPD) is required to delete IPsec policies and security
assocations with security contexts.  Thus a user authorized to change
SAD and SPD can bypass the IPsec policy authorization by simply
deleteing policies with security contexts.  To fix this security hole,
an additional authorization check is added for removing security
policies and security associations with security contexts.

Note that if no security context is supplied on add or present on
policy to be deleted, the SELinux module allows the change
unconditionally.  The hook is called on deletion when no context is
present, which we may want to change.  At present, I left it up to the
module.

LSM changes:

The patch adds two new LSM hooks: xfrm_policy_delete and
xfrm_state_delete.  The new hooks are necessary to authorize deletion
of IPsec policies that have security contexts.  The existing hooks
xfrm_policy_free and xfrm_state_free lack the context to do the
authorization, so I decided to split authorization of deletion and
memory management of security data, as is typical in the LSM
interface.

Use:

The new delete hooks are checked when xfrm_policy or xfrm_state are
deleted by either the xfrm_user interface (xfrm_get_policy,
xfrm_del_sa) or the pfkey interface (pfkey_spddelete, pfkey_delete).

SELinux changes:

The new policy_delete and state_delete functions are added.

Signed-off-by: Catherine Zhang <cxzhang@watson.ibm.com>
Signed-off-by: Trent Jaeger <tjaeger@cse.psu.edu>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:45 -07:00
Herbert Xu b59f45d0b2 [IPSEC] xfrm: Abstract out encapsulation modes
This patch adds the structure xfrm_mode.  It is meant to represent
the operations carried out by transport/tunnel modes.

By doing this we allow additional encapsulation modes to be added
without clogging up the xfrm_input/xfrm_output paths.

Candidate modes include 4-to-6 tunnel mode, 6-to-4 tunnel mode, and
BEET modes.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:28:39 -07:00
Herbert Xu 546be2405b [IPSEC] xfrm: Undo afinfo lock proliferation
The number of locks used to manage afinfo structures can easily be reduced
down to one each for policy and state respectively.  This is based on the
observation that the write locks are only held by module insertion/removal
which are very rare events so there is no need to further differentiate
between the insertion of modules like ipv6 versus esp6.

The removal of the read locks in xfrm4_policy.c/xfrm6_policy.c might look
suspicious at first.  However, after you realise that nobody ever takes
the corresponding write lock you'll feel better :)

As far as I can gather it's an attempt to guard against the removal of
the corresponding modules.  Since neither module can be unloaded at all
we can leave it to whoever fixes up IPv6 unloading :)

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:28:37 -07:00
Alexey Dobriyan 4195f81453 [NET]: Fix "ntohl(ntohs" bugs
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-22 16:53:22 -07:00
Ingo Molnar e959d8121f [XFRM]: fix incorrect xfrm_policy_afinfo_lock use
xfrm_policy_afinfo_lock can be taken in bh context, at:

 [<c013fe1a>] lockdep_acquire_read+0x54/0x6d
 [<c0f6e024>] _read_lock+0x15/0x22
 [<c0e8fcdb>] xfrm_policy_get_afinfo+0x1a/0x3d
 [<c0e8fd10>] xfrm_decode_session+0x12/0x32
 [<c0e66094>] ip_route_me_harder+0x1c9/0x25b
 [<c0e770d3>] ip_nat_local_fn+0x94/0xad
 [<c0e2bbc8>] nf_iterate+0x2e/0x7a
 [<c0e2bc50>] nf_hook_slow+0x3c/0x9e
 [<c0e3a342>] ip_push_pending_frames+0x2de/0x3a7
 [<c0e53e19>] icmp_push_reply+0x136/0x141
 [<c0e543fb>] icmp_reply+0x118/0x1a0
 [<c0e54581>] icmp_echo+0x44/0x46
 [<c0e53fad>] icmp_rcv+0x111/0x138
 [<c0e36764>] ip_local_deliver+0x150/0x1f9
 [<c0e36be2>] ip_rcv+0x3d5/0x413
 [<c0df760f>] netif_receive_skb+0x337/0x356
 [<c0df76c3>] process_backlog+0x95/0x110
 [<c0df5fe2>] net_rx_action+0xa5/0x16d
 [<c012d8a7>] __do_softirq+0x6f/0xe6
 [<c0105ec2>] do_softirq+0x52/0xb1

this means that all write-locking of xfrm_policy_afinfo_lock must be
bh-safe. This patch fixes xfrm_policy_register_afinfo() and
xfrm_policy_unregister_afinfo().

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-29 18:33:21 -07:00
Ingo Molnar f3111502c0 [XFRM]: fix incorrect xfrm_state_afinfo_lock use
xfrm_state_afinfo_lock can be read-locked from bh context, so take it
in a bh-safe manner in xfrm_state_register_afinfo() and
xfrm_state_unregister_afinfo(). Found by the lock validator.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-29 18:33:20 -07:00
Ingo Molnar 8dff7c2970 [XFRM]: fix softirq-unsafe xfrm typemap->lock use
xfrm typemap->lock may be used in softirq context, so all write_lock()
uses must be softirq-safe.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-29 18:33:18 -07:00
Jamal Hadi Salim 2717096ab4 [XFRM]: Fix aevent timer.
Send aevent immediately if we have sent nothing since last timer and
this is the first packet.

Fixes a corner case when packet threshold is very high, the timer low
and a very low packet rate input which is bursty.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-14 15:03:05 -07:00
Herbert Xu dbe5b4aaaf [IPSEC]: Kill unused decap state structure
This patch removes the *_decap_state structures which were previously
used to share state between input/post_input.  This is no longer
needed.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-01 00:54:16 -08:00
Patrick McHardy be33690d8f [XFRM]: Fix aevent related crash
When xfrm_user isn't loaded xfrm_nl is NULL, which makes IPsec crash because
xfrm_aevent_is_on passes the NULL pointer to netlink_has_listeners as socket.
A second problem is that the xfrm_nl pointer is not cleared when the socket
is releases at module unload time.

Protect references of xfrm_nl from outside of xfrm_user by RCU, check
that the socket is present in xfrm_aevent_is_on and set it to NULL
when unloading xfrm_user.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:40:54 -08:00
Arjan van de Ven 4a3e2f711a [NET] sem2mutex: net/
Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:33:17 -08:00
David S. Miller 253aa11578 [IPSEC] xfrm_user: Kill PAGE_SIZE check in verify_sec_ctx_len()
First, it warns when PAGE_SIZE >= 64K because the ctx_len
field is 16-bits.

Secondly, if there are any real length limitations it can
be verified by the security layer security_xfrm_state_alloc()
call.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:23:35 -08:00
David S. Miller a70fcb0ba3 [XFRM]: Add some missing exports.
To fix the case of modular xfrm_user.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 19:18:52 -08:00
David S. Miller ee857a7d67 [XFRM]: Move xfrm_nl to xfrm_state.c from xfrm_user.c
xfrm_user could be modular, and since generic code uses this symbol
now...

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 19:18:37 -08:00
David S. Miller 0ac8475248 [XFRM]: Make sure xfrm_replay_timer_handler() is declared early enough.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 19:18:23 -08:00
Jamal Hadi Salim 6c5c8ca7ff [IPSEC]: Sync series - policy expires
This is similar to the SA expire insertion patch - only it inserts
expires for SP.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 19:17:25 -08:00
Jamal Hadi Salim 53bc6b4d29 [IPSEC]: Sync series - SA expires
This patch allows a user to insert SA expires. This is useful to
do on an HA backup for the case of byte counts but may not be very
useful for the case of time based expiry.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 19:17:03 -08:00
Jamal Hadi Salim 980ebd2579 [IPSEC]: Sync series - acquire insert
This introduces a feature similar to the one described in RFC 2367:
"
   ... the application needing an SA sends a PF_KEY
   SADB_ACQUIRE message down to the Key Engine, which then either
   returns an error or sends a similar SADB_ACQUIRE message up to one or
   more key management applications capable of creating such SAs.
   ...
   ...
   The third is where an application-layer consumer of security
   associations (e.g.  an OSPFv2 or RIPv2 daemon) needs a security
   association.

        Send an SADB_ACQUIRE message from a user process to the kernel.

        <base, address(SD), (address(P),) (identity(SD),) (sensitivity,)
          proposal>

        The kernel returns an SADB_ACQUIRE message to registered
          sockets.

        <base, address(SD), (address(P),) (identity(SD),) (sensitivity,)
          proposal>

        The user-level consumer waits for an SADB_UPDATE or SADB_ADD
        message for its particular type, and then can use that
        association by using SADB_GET messages.

 "
An app such as OSPF could then use ipsec KM to get keys

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 19:16:40 -08:00
Jamal Hadi Salim d51d081d65 [IPSEC]: Sync series - user
Add xfrm as the user of the core changes

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 19:16:12 -08:00
Jamal Hadi Salim f8cd54884e [IPSEC]: Sync series - core changes
This patch provides the core functionality needed for sync events
for ipsec. Derived work of Krisztian KOVACS <hidden@balabit.hu>

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 19:15:11 -08:00
Herbert Xu 752c1f4c78 [IPSEC]: Kill post_input hook and do NAT-T in esp_input directly
The only reason post_input exists at all is that it gives us the
potential to adjust the checksums incrementally in future which
we ought to do.

However, after thinking about it for a bit we can adjust the
checksums without using this post_input stuff at all.  The crucial
point is that only the inner-most NAT-T SA needs to be considered
when adjusting checksums.  What's more, the checksum adjustment
comes down to a single u32 due to the linearity of IP checksums.

We just happen to have a spare u32 lying around in our skb structure :)
When ip_summed is set to CHECKSUM_NONE on input, the value of skb->csum
is currently unused.  All we have to do is to make that the checksum
adjustment and voila, there goes all the post_input and decap structures!

I've left in the decap data structures for now since it's intricately
woven into the sec_path stuff.  We can kill them later too.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-27 13:00:40 -08:00
Herbert Xu 21380b81ef [XFRM]: Eliminate refcounting confusion by creating __xfrm_state_put().
We often just do an atomic_dec(&x->refcnt) on an xfrm_state object
because we know there is more than 1 reference remaining and thus
we can elide the heavier xfrm_state_put() call.

Do this behind an inline function called __xfrm_state_put() so that is
more obvious and also to allow us to more cleanly add refcount
debugging later.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-23 16:10:53 -08:00
Patrick McHardy 42cf93cd46 [NETFILTER]: Fix bridge netfilter related in xfrm_lookup
The bridge-netfilter code attaches a fake dst_entry with dst->ops == NULL
to purely bridged packets. When these packets are SNATed and a policy
lookup is done, xfrm_lookup crashes because it tries to dereference
dst->ops.

Change xfrm_lookup not to dereference dst->ops before checking for the
DST_NOXFRM flag and set this flag in the fake dst_entry.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-23 16:10:51 -08:00
Patrick McHardy 9951101438 [XFRM]: Fix policy double put
The policy is put once immediately and once at the error label, which results
in the following Oops:

kernel BUG at net/xfrm/xfrm_policy.c:250!
invalid opcode: 0000 [#2]
PREEMPT
[...]
CPU:    0
EIP:    0060:[<c028caf7>]    Not tainted VLI
EFLAGS: 00210246   (2.6.16-rc3 #39)
EIP is at __xfrm_policy_destroy+0xf/0x46
eax: d49f2000   ebx: d49f2000   ecx: f74bd880   edx: f74bd280
esi: d49f2000   edi: 00000001   ebp: cd506dcc   esp: cd506dc8
ds: 007b   es: 007b   ss: 0068
Process ssh (pid: 31970, threadinfo=cd506000 task=cfb04a70)
Stack: <0>cd506000 cd506e34 c028e92b ebde7280 cd506e58 cd506ec0 f74bd280 00000000
       00000214 0000000a 0000000a 00000000 00000002 f7ae6000 00000000 cd506e58
       cd506e14 c0299e36 f74bd280 e873fe00 c02943fd cd506ec0 ebde7280 f271f440
Call Trace:
 [<c0103a44>] show_stack_log_lvl+0xaa/0xb5
 [<c0103b75>] show_registers+0x126/0x18c
 [<c0103e68>] die+0x14e/0x1db
 [<c02b6809>] do_trap+0x7c/0x96
 [<c0104237>] do_invalid_op+0x89/0x93
 [<c01035af>] error_code+0x4f/0x54
 [<c028e92b>] xfrm_lookup+0x349/0x3c2
 [<c02b0b0d>] ip6_datagram_connect+0x317/0x452
 [<c0281749>] inet_dgram_connect+0x49/0x54
 [<c02404d2>] sys_connect+0x51/0x68
 [<c0240928>] sys_socketcall+0x6f/0x166
 [<c0102aa1>] syscall_call+0x7/0xb

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-19 22:11:50 -08:00
Herbert Xu 00de651d14 [IPSEC]: Fix strange IPsec freeze.
Problem discovered and initial patch by Olaf Kirch:

	there's a problem with IPsec that has been bugging some of our users
	for the last couple of kernel revs. Every now and then, IPsec will
	freeze the machine completely. This is with openswan user land,
	and with kernels up to and including 2.6.16-rc2.

	I managed to debug this a little, and what happens is that we end
	up looping in xfrm_lookup, and never get out. With a bit of debug
	printks added, I can this happening:

		ip_route_output_flow calls xfrm_lookup

		xfrm_find_bundle returns NULL (apparently we're in the
			middle of negotiating a new SA or something)

		We therefore call xfrm_tmpl_resolve. This returns EAGAIN
			We go to sleep, waiting for a policy update.
			Then we loop back to the top

		Apparently, the dst_orig that was passed into xfrm_lookup
			has been dropped from the routing table (obsolete=2)
			This leads to the endless loop, because we now create
			a new bundle, check the new bundle and find it's stale
			(stale_bundle -> xfrm_bundle_ok -> dst_check() return 0)

	People have been testing with the patch below, which seems to fix the
	problem partially. They still see connection hangs however (things
	only clear up when they start a new ping or new ssh). So the patch
	is obvsiouly not sufficient, and something else seems to go wrong.

	I'm grateful for any hints you may have...

I suggest that we simply bail out always.  If the dst decides to die
on us later on, the packet will be dropped anyway.  So there is no
great urgency to retry here.  Once we have the proper resolution
queueing, we can then do the retry again.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Olaf Kirch <okir@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-13 16:01:27 -08:00
Al Viro 1b8623545b [PATCH] remove bogus asm/bug.h includes.
A bunch of asm/bug.h includes are both not needed (since it will get
pulled anyway) and bogus (since they are done too early).  Removed.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-02-07 20:56:35 -05:00
Kris Katterjohn 09a626600b [NET]: Change some "if (x) BUG();" to "BUG_ON(x);"
This changes some simple "if (x) BUG();" statements to "BUG_ON(x);"

Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-09 14:16:18 -08:00
Patrick McHardy eb9c7ebe69 [NETFILTER]: Handle NAT in IPsec policy checks
Handle NAT of decapsulated IPsec packets by reconstructing the struct flowi
of the original packet from the conntrack information for IPsec policy
checks.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-07 12:57:37 -08:00
Patrick McHardy 3e3850e989 [NETFILTER]: Fix xfrm lookup in ip_route_me_harder/ip6_route_me_harder
ip_route_me_harder doesn't use the port numbers of the xfrm lookup and
uses ip_route_input for non-local addresses which doesn't do a xfrm
lookup, ip6_route_me_harder doesn't do a xfrm lookup at all.

Use xfrm_decode_session and do the lookup manually, make sure both
only do the lookup if the packet hasn't been transformed already.

Makeing sure the lookup only happens once needs a new field in the
IP6CB, which exceeds the size of skb->cb. The size of skb->cb is
increased to 48b. Apparently the IPv6 mobile extensions need some
more room anyway.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-07 12:57:33 -08:00
Trent Jaeger 5f8ac64b15 [LSM-IPSec]: Corrections to LSM-IPSec Nethooks
This patch contains two corrections to the LSM-IPsec Nethooks patches
previously applied.  

(1) free a security context on a failed insert via xfrm_user 
interface in xfrm_add_policy.  Memory leak.

(2) change the authorization of the allocation of a security context
in a xfrm_policy or xfrm_state from both relabelfrom and relabelto 
to setcontext.

Signed-off-by: Trent Jaeger <tjaeger@cse.psu.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-06 13:22:39 -08:00
Trent Jaeger df71837d50 [LSM-IPSec]: Security association restriction.
This patch series implements per packet access control via the
extension of the Linux Security Modules (LSM) interface by hooks in
the XFRM and pfkey subsystems that leverage IPSec security
associations to label packets.  Extensions to the SELinux LSM are
included that leverage the patch for this purpose.

This patch implements the changes necessary to the XFRM subsystem,
pfkey interface, ipv4/ipv6, and xfrm_user interface to restrict a
socket to use only authorized security associations (or no security
association) to send/receive network packets.

Patch purpose:

The patch is designed to enable access control per packets based on
the strongly authenticated IPSec security association.  Such access
controls augment the existing ones based on network interface and IP
address.  The former are very coarse-grained, and the latter can be
spoofed.  By using IPSec, the system can control access to remote
hosts based on cryptographic keys generated using the IPSec mechanism.
This enables access control on a per-machine basis or per-application
if the remote machine is running the same mechanism and trusted to
enforce the access control policy.

Patch design approach:

The overall approach is that policy (xfrm_policy) entries set by
user-level programs (e.g., setkey for ipsec-tools) are extended with a
security context that is used at policy selection time in the XFRM
subsystem to restrict the sockets that can send/receive packets via
security associations (xfrm_states) that are built from those
policies.

A presentation available at
www.selinux-symposium.org/2005/presentations/session2/2-3-jaeger.pdf
from the SELinux symposium describes the overall approach.

Patch implementation details:

On output, the policy retrieved (via xfrm_policy_lookup or
xfrm_sk_policy_lookup) must be authorized for the security context of
the socket and the same security context is required for resultant
security association (retrieved or negotiated via racoon in
ipsec-tools).  This is enforced in xfrm_state_find.

On input, the policy retrieved must also be authorized for the socket
(at __xfrm_policy_check), and the security context of the policy must
also match the security association being used.

The patch has virtually no impact on packets that do not use IPSec.
The existing Netfilter (outgoing) and LSM rcv_skb hooks are used as
before.

Also, if IPSec is used without security contexts, the impact is
minimal.  The LSM must allow such policies to be selected for the
combination of socket and remote machine, but subsequent IPSec
processing proceeds as in the original case.

Testing:

The pfkey interface is tested using the ipsec-tools.  ipsec-tools have
been modified (a separate ipsec-tools patch is available for version
0.5) that supports assignment of xfrm_policy entries and security
associations with security contexts via setkey and the negotiation
using the security contexts via racoon.

The xfrm_user interface is tested via ad hoc programs that set
security contexts.  These programs are also available from me, and
contain programs for setting, getting, and deleting policy for testing
this interface.  Testing of sa functions was done by tracing kernel
behavior.

Signed-off-by: Trent Jaeger <tjaeger@cse.psu.edu>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:24 -08:00
David S. Miller 9b78a82c1c [IPSEC]: Fix policy updates missed by sockets
The problem is that when new policies are inserted, sockets do not see
the update (but all new route lookups do).

This bug is related to the SA insertion stale route issue solved
recently, and this policy visibility problem can be fixed in a similar
way.

The fix is to flush out the bundles of all policies deeper than the
policy being inserted.  Consider beginning state of "outgoing"
direction policy list:

	policy A --> policy B --> policy C --> policy D

First, realize that inserting a policy into a list only potentially
changes IPSEC routes for that direction.  Therefore we need not bother
considering the policies for other directions.  We need only consider
the existing policies in the list we are doing the inserting.

Consider new policy "B'", inserted after B.

	policy A --> policy B --> policy B' --> policy C --> policy D

Two rules:

1) If policy A or policy B matched before the insertion, they
   appear before B' and thus would still match after inserting
   B'

2) Policy C and D, now "shadowed" and after policy B', potentially
   contain stale routes because policy B' might be selected
   instead of them.

Therefore we only need flush routes assosciated with policies
appearing after a newly inserted policy, if any.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-22 07:39:48 -08:00
David S. Miller 399c180ac5 [IPSEC]: Perform SA switchover immediately.
When we insert a new xfrm_state which potentially
subsumes an existing one, make sure all cached
bundles are flushed so that the new SA is used
immediately.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-19 14:23:23 -08:00
Thomas Graf 88fc2c8431 [XFRM]: Use generic netlink receive queue processor
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-10 02:26:40 +01:00
Thomas Graf a8f74b2288 [NETLINK]: Make netlink_callback->done() optional
Most netlink families make no use of the done() callback, making
it optional gets rid of all unnecessary dummy implementations.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-10 02:26:40 +01:00
Jesper Juhl a51482bde2 [NET]: kfree cleanup
From: Jesper Juhl <jesper.juhl@gmail.com>

This is the net/ part of the big kfree cleanup patch.

Remove pointless checks for NULL prior to calling kfree() in net/.

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Arnaldo Carvalho de Melo <acme@conectiva.com.br>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2005-11-08 09:41:34 -08:00
Herbert Xu 80b30c1023 [IPSEC]: Kill obsolete get_mss function
Now that we've switched over to storing MTUs in the xfrm_dst entries,
we no longer need the dst's get_mss methods.  This patch gets rid of
them.

It also documents the fact that our MTU calculation is not optimal
for ESP.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-26 00:48:45 -02:00
Al Viro dd0fc66fb3 [PATCH] gfp flags annotations - part 1
- added typedef unsigned int __nocast gfp_t;

 - replaced __nocast uses for gfp flags with gfp_t - it gives exactly
   the same warnings as far as sparse is concerned, doesn't change
   generated code (from gcc point of view we replaced unsigned int with
   typedef) and documents what's going on far better.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-08 15:00:57 -07:00
Herbert Xu 77d8d7a684 [IPSEC]: Document that policy direction is derived from the index.
Here is a patch that adds a helper called xfrm_policy_id2dir to
document the fact that the policy direction can be and is derived
from the index.

This is based on a patch by YOSHIFUJI Hideaki and 210313105@suda.edu.cn.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-05 12:15:12 -07:00
Randy Dunlap 83fa3400eb [XFRM]: fix sparse gfp nocast warnings
Fix implicit nocast warnings in xfrm code:
net/xfrm/xfrm_policy.c:232:47: warning: implicit cast to nocast type

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-04 22:45:35 -07:00
Patrick McHardy e104411b82 [XFRM]: Always release dst_entry on error in xfrm_lookup
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-08 15:11:55 -07:00
Eric Dumazet ba89966c19 [NET]: use __read_mostly on kmem_cache_t , DEFINE_SNMP_STAT pointers
This patch puts mostly read only data in the right section
(read_mostly), to help sharing of these data between CPUS without
memory ping pongs.

On one of my production machine, tcp_statistics was sitting in a
heavily modified cache line, so *every* SNMP update had to force a
reload.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:11:18 -07:00
Patrick McHardy 066286071d [NETLINK]: Add "groups" argument to netlink_kernel_create
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:01:11 -07:00
Patrick McHardy ac6d439d20 [NETLINK]: Convert netlink users to use group numbers instead of bitmasks
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:00:54 -07:00
Patrick McHardy 43e943c32b [NETLINK]: Fix missing dst_groups initializations in netlink_broadcast users
netlink_broadcast users must initialize NETLINK_CB(skb).dst_groups to the
destination group mask for netlink_recvmsg.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:00:34 -07:00
Harald Welte 4fdb3bb723 [NETLINK]: Add properly module refcounting for kernel netlink sockets.
- Remove bogus code for compiling netlink as module
- Add module refcounting support for modules implementing a netlink
  protocol
- Add support for autoloading modules that implement a netlink protocol
  as soon as someone opens a socket for that protocol

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:35:08 -07:00
Herbert Xu a4f1bac625 [XFRM]: Fix possible overflow of sock->sk_policy
Spotted by, and original patch by, Balazs Scheidler.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-26 15:43:17 -07:00
Sam Ravnborg 6a2e9b738c [NET]: move config options out to individual protocols
Move the protocol specific config options out to the specific protocols.
With this change net/Kconfig now starts to become readable and serve as a
good basis for further re-structuring.

The menu structure is left almost intact, except that indention is
fixed in most cases. Most visible are the INET changes where several
"depends on INET" are replaced with a single ifdef INET / endif pair.

Several new files were created to accomplish this change - they are
small but serve the purpose that config options are now distributed
out where they belongs.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-11 21:13:56 -07:00
Herbert Xu d094cd83c0 [IPSEC]: Add xfrm_state_afinfo->init_flags
This patch adds the xfrm_state_afinfo->init_flags hook which allows
each address family to perform any common initialisation that does
not require a corresponding destructor call.

It will be used subsequently to set the XFRM_STATE_NOPMTUDISC flag
in IPv4.

It also fixes up the error codes returned by xfrm_init_state.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: James Morris <jmorris@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-20 13:19:41 -07:00
Herbert Xu 72cb6962a9 [IPSEC]: Add xfrm_init_state
This patch adds xfrm_init_state which is simply a wrapper that calls
xfrm_get_type and subsequently x->type->init_state.  It also gets rid
of the unused args argument.

Abstracting it out allows us to add common initialisation code, e.g.,
to set family-specific flags.

The add_time setting in xfrm_user.c was deleted because it's already
set by xfrm_state_alloc.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: James Morris <jmorris@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-20 13:18:08 -07:00
Herbert Xu 0603eac0d6 [IPSEC]: Add XFRMA_SA/XFRMA_POLICY for delete notification
This patch changes the format of the XFRM_MSG_DELSA and
XFRM_MSG_DELPOLICY notification so that the main message
sent is of the same format as that received by the kernel
if the original message was via netlink.  This also means
that we won't lose the byid information carried in km_event.

Since this user interface is introduced by Jamal's patch
we can still afford to change it.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18 22:54:36 -07:00
Jamal Hadi Salim ee57eef99b [IPSEC] Use NLMSG_LENGTH in xfrm_exp_state_notify
Small fixup to use netlink macros instead of hardcoding.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18 22:45:56 -07:00
Patrick McHardy 7d6dfe1f5b [IPSEC] Fix xfrm_state leaks in error path
Herbert Xu wrote:
> @@ -1254,6 +1326,7 @@ static int pfkey_add(struct sock *sk, st
>       if (IS_ERR(x))
>               return PTR_ERR(x);
>
> +     xfrm_state_hold(x);

This introduces a leak when xfrm_state_add()/xfrm_state_update()
fail. We hold two references (one from xfrm_state_alloc(), one
from xfrm_state_hold()), but only drop one. We need to take the
reference because the reference from xfrm_state_alloc() can
be dropped by __xfrm_state_delete(), so the fix is to drop both
references on error. Same problem in xfrm_user.c.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18 22:45:31 -07:00
Herbert Xu f60f6b8f70 [IPSEC] Use XFRM_MSG_* instead of XFRM_SAP_*
This patch removes XFRM_SAP_* and converts them over to XFRM_MSG_*.
The netlink interface is meant to map directly onto the underlying
xfrm subsystem.  Therefore rather than using a new independent
representation for the events we can simply use the existing ones
from xfrm_user.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2005-06-18 22:44:37 -07:00
Herbert Xu e7443892f6 [IPSEC] Set byid for km_event in xfrm_get_policy
This patch fixes policy deletion in xfrm_user so that it sets
km_event.data.byid.  This puts xfrm_user on par with what af_key
does in this case.
   
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2005-06-18 22:44:18 -07:00
Herbert Xu bf08867f91 [IPSEC] Turn km_event.data into a union
This patch turns km_event.data into a union.  This makes code that
uses it clearer.
  
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2005-06-18 22:44:00 -07:00
Herbert Xu 4666faab09 [IPSEC] Kill spurious hard expire messages
This patch ensures that the hard state/policy expire notifications are
only sent when the state/policy is successfully removed from their
respective tables.

As it is, it's possible for a state/policy to both expire through
reaching a hard limit, as well as being deleted by the user.

Note that this behaviour isn't actually forbidden by RFC 2367.
However, it is a quality of implementation issue.

As an added bonus, the restructuring in this patch will help
eventually in moving the expire notifications from softirq
context into process context, thus improving their reliability.

One important side-effect from this change is that SAs reaching
their hard byte/packet limits are now deleted immediately, just
like SAs that have reached their hard time limits.

Previously they were announced immediately but only deleted after
30 seconds.

This is bad because it prevents the system from issuing an ACQUIRE
command until the existing state was deleted by the user or expires
after the time is up.

In the scenario where the expire notification was lost this introduces
a 30 second delay into the system for no good reason.
 
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2005-06-18 22:43:22 -07:00
Jamal Hadi Salim 26b15dad9f [IPSEC] Add complete xfrm event notification
Heres the final patch.
What this patch provides

- netlink xfrm events
- ability to have events generated by netlink propagated to pfkey
  and vice versa.
- fixes the acquire lets-be-happy-with-one-success issue

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2005-06-18 22:42:13 -07:00
Hideaki YOSHIFUJI 92d63decc0 From: Kazunori Miyazawa <kazunori@miyazawa.org>
[XFRM] Call dst_check() with appropriate cookie

This fixes infinite loop issue with IPv6 tunnel mode.

Signed-off-by: Kazunori Miyazawa <kazunori@miyazawa.org>
Signed-off-by: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-05-26 12:58:04 -07:00
Herbert Xu 31c26852cb [IPSEC]: Verify key payload in verify_one_algo
We need to verify that the payload contains enough data so that
attach_one_algo can copy alg_key_len bits from the payload.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-05-19 12:39:49 -07:00
Herbert Xu b9e9dead05 [IPSEC]: Fixed alg_key_len usage in attach_one_algo
The variable alg_key_len is in bits and not bytes.  The function
attach_one_algo is currently using it as if it were in bytes.
This causes it to read memory which may not be there.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-05-19 12:39:04 -07:00
Evgeniy Polyakov d48102007d [XFRM]: skb_cow_data() does not set proper owner for new skbs.
It looks like skb_cow_data() does not set 
proper owner for newly created skb.

If we have several fragments for skb and some of them
are shared(?) or cloned (like in async IPsec) there 
might be a situation when we require recreating skb and 
thus using skb_copy() for it.
Newly created skb has neither a destructor nor a socket
assotiated with it, which must be copied from the old skb.
As far as I can see, current code sets destructor and socket
for the first one skb only and uses truesize of the first skb
only to increment sk_wmem_alloc value.

If above "analysis" is correct then attached patch fixes that.

Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-05-18 22:51:45 -07:00
Herbert Xu aabc9761b6 [IPSEC]: Store idev entries
I found a bug that stopped IPsec/IPv6 from working.  About
a month ago IPv6 started using rt6i_idev->dev on the cached socket dst
entries.  If the cached socket dst entry is IPsec, then rt6i_idev will
be NULL.

Since we want to look at the rt6i_idev of the original route in this
case, the easiest fix is to store rt6i_idev in the IPsec dst entry just
as we do for a number of other IPv6 route attributes.  Unfortunately
this means that we need some new code to handle the references to
rt6i_idev.  That's why this patch is bigger than it would otherwise be.

I've also done the same thing for IPv4 since it is conceivable that
once these idev attributes start getting used for accounting, we
probably need to dereference them for IPv4 IPsec entries too.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-05-03 16:27:10 -07:00
David S. Miller 0f4821e7b9 [XFRM/RTNETLINK]: Decrement qlen properly in {xfrm_,rt}netlink_rcv().
If we free up a partially processed packet because it's
skb->len dropped to zero, we need to decrement qlen because
we are dropping out of the top-level loop so it will do
the decrement for us.

Spotted by Herbert Xu.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-05-03 16:15:59 -07:00
David S. Miller 09e1430598 [NETLINK]: Fix infinite loops in synchronous netlink changes.
The qlen should continue to decrement, even if we
pop partially processed SKBs back onto the receive queue.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-05-03 15:30:05 -07:00
Herbert Xu 2a0a6ebee1 [NETLINK]: Synchronous message processing.
Let's recap the problem.  The current asynchronous netlink kernel
message processing is vulnerable to these attacks:

1) Hit and run: Attacker sends one or more messages and then exits
before they're processed.  This may confuse/disable the next netlink
user that gets the netlink address of the attacker since it may
receive the responses to the attacker's messages.

Proposed solutions:

a) Synchronous processing.
b) Stream mode socket.
c) Restrict/prohibit binding.

2) Starvation: Because various netlink rcv functions were written
to not return until all messages have been processed on a socket,
it is possible for these functions to execute for an arbitrarily
long period of time.  If this is successfully exploited it could
also be used to hold rtnl forever.

Proposed solutions:

a) Synchronous processing.
b) Stream mode socket.

Firstly let's cross off solution c).  It only solves the first
problem and it has user-visible impacts.  In particular, it'll
break user space applications that expect to bind or communicate
with specific netlink addresses (pid's).

So we're left with a choice of synchronous processing versus
SOCK_STREAM for netlink.

For the moment I'm sticking with the synchronous approach as
suggested by Alexey since it's simpler and I'd rather spend
my time working on other things.

However, it does have a number of deficiencies compared to the
stream mode solution:

1) User-space to user-space netlink communication is still vulnerable.

2) Inefficient use of resources.  This is especially true for rtnetlink
since the lock is shared with other users such as networking drivers.
The latter could hold the rtnl while communicating with hardware which
causes the rtnetlink user to wait when it could be doing other things.

3) It is still possible to DoS all netlink users by flooding the kernel
netlink receive queue.  The attacker simply fills the receive socket
with a single netlink message that fills up the entire queue.  The
attacker then continues to call sendmsg with the same message in a loop.

Point 3) can be countered by retransmissions in user-space code, however
it is pretty messy.

In light of these problems (in particular, point 3), we should implement
stream mode netlink at some point.  In the mean time, here is a patch
that implements synchronous processing.  

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-05-03 14:55:09 -07:00
Thomas Graf 492b558b31 [XFRM]: Cleanup xfrm_msg_min and xfrm_dispatch
Converts xfrm_msg_min and xfrm_dispatch to use c99 designated
initializers to make greping a little bit easier. Also replaces
two hardcoded message type with meaningful names.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-05-03 14:26:40 -07:00
Patrick McHardy 5c5d281a93 [XFRM]: Fix existence lookup in xfrm_state_find
Use 'daddr' instead of &tmpl->id.daddr, since the latter
might be zero.  Also, only perform the lookup when
tmpl->id.spi is non-zero.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-04-21 20:12:32 -07:00
Linus Torvalds 1da177e4c3 Linux-2.6.12-rc2
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!
2005-04-16 15:20:36 -07:00