Commit Graph

46 Commits

Author SHA1 Message Date
Jesse Gross d9d59089c4 openvswitch: Move LRO check from transmit to receive.
The check for LRO packets was incorrectly put in the transmit path
instead of on receive.  Since this check is supposed to protect OVS
(and other parts of the system) from packets that it cannot handle
it is obviously not useful on egress.  Therefore, this commit moves
it back to the receive side.

The primary problem that this caused is upcalls to userspace tried
to segment the packet even though no segmentation information is
available.  This would later cause NULL pointer dereferences when
skb_gso_segment() did nothing.

Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-01-21 23:57:26 -08:00
Jesse Gross 92eb1d4771 openvswitch: Use RCU callback when detaching netdevices.
Currently, each time a device is detached from an OVS datapath
we call synchronize RCU before freeing associated data structures.
However, if a bridge is deleted (which detaches all ports) when
many devices are connected then there can be a long delay.  This
switches to use call_rcu() to group the cost together.

Reported-by: Justin Pettit <jpettit@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-11-28 14:04:34 -08:00
Ansis Atteka 39c7caebc9 openvswitch: add skb mark matching and set action
This patch adds support for skb mark matching and set action.

Signed-off-by: Ansis Atteka <aatteka@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-11-26 11:33:18 -08:00
Shan Wei 404f2f1019 net: openvswitch: use this_cpu_ptr per-cpu helper
just use more faster this_cpu_ptr instead of per_cpu_ptr(p, smp_processor_id());

Signed-off-by: Shan Wei <davidshan@tencent.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-11-16 13:26:20 -08:00
Ansis Atteka 3fdbd1ce11 openvswitch: add ipv6 'set' action
This patch adds ipv6 set action functionality. It allows to change
traffic class, flow label, hop-limit, ipv6 source and destination
address fields.

Signed-off-by: Ansis Atteka <aatteka@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-11-13 15:57:33 -08:00
Mehak Mahajan c061853381 openvswitch: Process RARP packets with ethertype 0x8035 similar to ARP packets.
With this commit, OVS will match the data in the RARP packets having
ethertype 0x8035, in the same way as the data in the ARP packets.

Signed-off-by: Mehak Mahajan <mmahajan@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-11-02 14:14:31 -07:00
Mehak Mahajan d04d382980 openvswitch: Store flow key len if ARP opcode is not request or reply.
We currently only extract the ARP payload if the opcode indicates
that it is a request or reply.  However, we also only set the
key length in these situations even though it should still be
possible to match on the opcode.  There's no real reason to
restrict the ARP opcode since all have the same format so this
simply removes the check.

Signed-off-by: Mehak Mahajan <mmahajan@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-10-30 17:17:09 -07:00
Jesse Gross c1c92b6a5b openvswitch: Print device when warning about over MTU packets.
If an attempt is made to transmit a packet that is over the device's
MTU then we log it using the datapath's name.  However, it is much
more helpful to use the device name instead.

Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-10-30 15:48:48 -07:00
Peter Senna Tschudin a2bf91b5b8 net/openvswitch/vport.c: Remove unecessary semicolon
Found by http://coccinelle.lip6.fr/

Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-18 16:08:19 -04:00
David S. Miller b48b63a1f6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	net/netfilter/nfnetlink_log.c
	net/netfilter/xt_LOG.c

Rather easy conflict resolution, the 'net' tree had bug fixes to make
sure we checked if a socket is a time-wait one or not and elide the
logging code if so.

Whereas on the 'net-next' side we are calculating the UID and GID from
the creds using different interfaces due to the user namespace changes
from Eric Biederman.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-15 11:43:53 -04:00
Eric W. Biederman 15e473046c netlink: Rename pid to portid to avoid confusion
It is a frequent mistake to confuse the netlink port identifier with a
process identifier.  Try to reduce this confusion by renaming fields
that hold port identifiers portid instead of pid.

I have carefully avoided changing the structures exported to
userspace to avoid changing the userspace API.

I have successfully built an allyesconfig kernel with this change.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-10 15:30:41 -04:00
David S. Miller cefd81cfec Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch 2012-09-04 15:22:28 -04:00
Pravin B Shelar 15eac2a742 openvswitch: Increase maximum number of datapath ports.
Use hash table to store ports of datapath. Allow 64K ports per switch.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-09-03 19:20:49 -07:00
Jesse Gross c303aa94cd openvswitch: Fix FLOW_BUFSIZE definition.
The vlan encapsulation fields in the maximum flow defintion were
never updated when the representation changed before upstreaming.
In theory this could cause a kernel panic when a maximum length
flow is used.  In practice this has never happened (to my knowledge)
because skb allocations are padded out to a cache line so you would
need the right combination of flow and packet being sent to userspace.

Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-09-03 19:06:27 -07:00
Joe Stringer 39855b5ba9 openvswitch: Fix typo
Signed-off-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-09-02 12:18:25 -07:00
Wei Yongjun 80f0fd8a7f openvswitch: using kfree_rcu() to simplify the code
The callback function of call_rcu() just calls a kfree(), so we
can use kfree_rcu() instead of call_rcu() + callback function.

spatch with a semantic match is used to found this problem.
(http://coccinelle.lip6.fr/)

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Acked-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-31 15:55:38 -04:00
Pravin B Shelar 46df7b8145 openvswitch: Add support for network namespaces.
Following patch adds support for network namespace to openvswitch.
Since it must release devices when namespaces are destroyed, a
side effect of this patch is that the module no longer keeps a
refcount but instead cleans up any state when it is unloaded.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-08-22 14:48:55 -07:00
Jesse Gross 4185392da4 openvswitch: Relax set header validation.
When installing a flow with an action to set a particular field we
need to validate that the packets that are part of the flow actually
contain that header.  With IP we use zeroed addresses and with TCP/UDP
the check is for zeroed ports.  This check is overly broad and can catch
packets like DHCP requests that have a zero source address in a
legitimate header.  This changes the check to look for a zeroed protocol
number for IP or for both ports be zero for TCP/UDP before considering
the header to not exist.

Reported-by: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-08-06 15:49:47 -07:00
Jesse Gross 6081030769 Revert "openvswitch: potential NULL deref in sample()"
This reverts commit 5b3e7e6cb5.

The problem that the original commit was attempting to fix can
never happen in practice because validation is done one a per-flow
basis rather than a per-packet basis.  Adding additional checks at
runtime is unnecessary and inconsistent with the rest of the code.

CC: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-27 13:45:51 -07:00
Dan Carpenter 5b3e7e6cb5 openvswitch: potential NULL deref in sample()
If there is no OVS_SAMPLE_ATTR_ACTIONS set then "acts_list" is NULL and
it leads to a NULL dereference when we call nla_len(acts_list).  This
is a static checker fix, not something I have seen in testing.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-23 00:59:54 -07:00
David S. Miller c073cfc89f Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch
Jesse Gross says:

====================
A few bug fixes and small enhancements for net-next/3.6.
 ...
Ansis Atteka (1):
      openvswitch: Do not send notification if ovs_vport_set_options() failed

Ben Pfaff (1):
      openvswitch: Check gso_type for correct sk_buff in queue_gso_packets().

Jesse Gross (2):
      openvswitch: Enable retrieval of TCP flags from IPv6 traffic.
      openvswitch: Reset upper layer protocol info on internal devices.

Leo Alterman (1):
      openvswitch: Fix typo in documentation.

Pravin B Shelar (1):
      openvswitch: Check currect return value from skb_gso_segment()

Raju Subramanian (1):
      openvswitch: Replace Nicira Networks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-20 16:16:34 -07:00
Ben Pfaff a1b5d0dd28 openvswitch: Check gso_type for correct sk_buff in queue_gso_packets().
At the point where it was used, skb_shinfo(skb)->gso_type referred to a
post-GSO sk_buff.  Thus, it would always be 0.  We want to know the pre-GSO
gso_type, so we need to obtain it before segmenting.

Before this change, the kernel would pass inconsistent data to userspace:
packets for UDP fragments with nonzero offset would be passed along with
flow keys that indicate a zero offset (that is, the flow key for "later"
fragments claimed to be "first" fragments).  This inconsistency tended
to confuse Open vSwitch userspace, causing it to log messages about
"failed to flow_del" the flows with "later" fragments.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-07-20 14:47:54 -07:00
Pravin B Shelar 92e5dfc34c openvswitch: Check currect return value from skb_gso_segment()
Fix return check typo.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-07-20 14:46:29 -07:00
Jesse Gross 7fe99e2d43 openvswitch: Reset upper layer protocol info on internal devices.
It's possible that packets that are sent on internal devices (from
the OVS perspective) have already traversed the local IP stack.
After they go through the internal device, they will again travel
through the IP stack which may get confused by the presence of
existing information in the skb. The problem can be observed
when switching between namespaces. This clears out that information
to avoid problems but deliberately leaves other metadata alone.
This is to provide maximum flexibility in chaining together OVS
and other Linux components.

Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-05-25 11:29:30 -07:00
David S. Miller 028940342a Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-05-16 22:17:37 -04:00
Joe Perches e87cc4728f net: Convert net_ratelimit uses to net_<level>_ratelimited
Standardize the net core ratelimited logging functions.

Coalesce formats, align arguments.
Change a printk then vprintk sequence to use printf extension %pV.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-15 13:45:03 -04:00
Dan Carpenter 8aa51d64c1 openvswitch: checking wrong variable in queue_userspace_packet()
"skb" is non-NULL here, for example we dereference it in skb_clone().
The intent was to test "nskb" which was just set.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-13 15:47:34 -04:00
Pravin B Shelar 072ae6314a openvswitch: Validation of IPv6 set port action uses IPv4 header
When the kernel validates set TCP/UDP port actions, it looks at
the ports in the existing flow to make sure that the L4 header exists.
However, these actions always use the IPv4 version of the struct.
Following patch fixes this by checking for flow ip protocol first.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-05-07 17:23:10 -07:00
Raju Subramanian caf2ee14bb openvswitch: Replace Nicira Networks.
Replaced all instances of Nicira Networks(, Inc) to Nicira, Inc.

Signed-off-by: Raju Subramanian <rsubramanian@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-05-03 18:55:23 -07:00
Ansis Atteka 4cb6e116bb openvswitch: Release rtnl_lock if ovs_vport_cmd_build_info() failed.
This patch fixes a possible lock-up bug where rtnl_lock might not
get released.

Signed-off-by: Ansis Atteka <aatteka@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-05-03 18:40:38 -07:00
Eric Dumazet 95c9617472 net: cleanup unsigned to unsigned int
Use of "unsigned int" is preferred to bare "unsigned" in net tree.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 12:44:40 -04:00
David S. Miller 06eb4eafbd Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-04-10 14:30:45 -04:00
Ansis Atteka 03fbf8b387 openvswitch: Do not send notification if ovs_vport_set_options() failed
There is no need to send a notification if ovs_vport_set_options() failed
and ovs_vport_cmd_set() did not change anything.

Signed-off-by: Ansis Atteka <aatteka@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-04-09 12:18:08 -07:00
Jesse Gross c55177e3e1 openvswitch: Enable retrieval of TCP flags from IPv6 traffic.
We currently check that a packet is IPv4 and TCP before fetching the
TCP flags.  This enables fetching from IPv6 packets as well.

Reported-by: Michael Mao <mmao@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-04-02 15:13:36 -07:00
Jesse Gross bf32fecdc1 openvswitch: Add length check when retrieving TCP flags.
When collecting TCP flags we check that the IP header indicates that
a TCP header is present but not that the packet is actually long
enough to contain the header.  This adds a check to prevent reading
off the end of the packet.

In practice, this is only likely to result in reading of bad data and
not a crash due to the presence of struct skb_shared_info at the end
of the packet.

Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-04-02 14:28:57 -07:00
David S. Miller 028d6a6767 openvswitch: Stop using NLA_PUT*().
These macros contain a hidden goto, and are thus extremely error
prone and make code hard to audit.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-01 18:11:37 -04:00
David Howells 9ffc93f203 Remove all #inclusions of asm/system.h
Remove all #inclusions of asm/system.h preparatory to splitting and killing
it.  Performed with the following command:

perl -p -i -e 's!^#\s*include\s*<asm/system[.]h>.*\n!!' `grep -Irl '^#\s*include\s*<asm/system[.]h>' *`

Signed-off-by: David Howells <dhowells@redhat.com>
2012-03-28 18:30:03 +01:00
David S. Miller b2d3298e09 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-03-09 14:34:20 -08:00
Jesse Gross 81e5d41d7e openvswitch: Fix checksum update for actions on UDP packets.
When modifying IP addresses or ports on a UDP packet we don't
correctly follow the rules for unchecksummed packets.  This meant
that packets without a checksum can be given a incorrect new checksum
and packets with a checksum can become marked as being unchecksummed.
This fixes it to handle those requirements.

Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-03-07 14:36:57 -08:00
Ben Pfaff 651a68ea2c openvswitch: Honor dp_ifindex, when specified, for vport lookup by name.
When OVS_VPORT_ATTR_NAME is specified and dp_ifindex is nonzero, the
logical behavior would be for the vport name lookup scope to be limited
to the specified datapath, but in fact the dp_ifindex value was ignored.
This commit causes the search scope to be honored.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-03-06 15:04:04 -08:00
Danny Kukawka 7ce5d22219 net: use eth_hw_addr_random() and reset addr_assign_type
Use eth_hw_addr_random() instead of calling random_ether_addr()
to set addr_assign_type correctly to NET_ADDR_RANDOM.

Reset the state to NET_ADDR_PERM as soon as the MAC get
changed via .ndo_set_mac_address.

v2: adapt to renamed eth_hw_addr_random()

Signed-off-by: Danny Kukawka <danny.kukawka@bisect.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-15 15:34:17 -05:00
Ben Pfaff 77676fdbd5 openvswitch: Fix multipart datapath dumps.
The logic to split up the list of datapaths into multiple Netlink messages
was simply wrong, causing the list to be terminated after the first part.
Only about the first 50 datapaths would be dumped.  This fixes the
problem.

Reported-by: Paul Ingram <paul@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-01-17 23:56:19 -05:00
Shan Wei 2b2d465631 net: kill duplicate included header
For net part, remove duplicate included header.

Signed-off-by: Shan Wei <davidshan@tencent.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-01-17 10:31:12 -05:00
Devendra Naga 8d9d399f14 net: remove version.h includes in net/openvswitch/
remove version.h includes in net/openswitch/ as reported by make versioncheck.

Signed-off-by: Devendra Naga <devendra.aaru@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-01-17 10:07:58 -05:00
Dan Carpenter f0a98ae8db openvswitch: small potential memory leak in ovs_vport_alloc()
We're unlikely to hit this leak, but the static checkers complain if we
don't take care of it.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-06 12:58:57 -05:00
Jesse Gross ccb1352e76 net: Add Open vSwitch kernel components.
Open vSwitch is a multilayer Ethernet switch targeted at virtualized
environments.  In addition to supporting a variety of features
expected in a traditional hardware switch, it enables fine-grained
programmatic extension and flow-based control of the network.
This control is useful in a wide variety of applications but is
particularly important in multi-server virtualization deployments,
which are often characterized by highly dynamic endpoints and the need
to maintain logical abstractions for multiple tenants.

The Open vSwitch datapath provides an in-kernel fast path for packet
forwarding.  It is complemented by a userspace daemon, ovs-vswitchd,
which is able to accept configuration from a variety of sources and
translate it into packet processing rules.

See http://openvswitch.org for more information and userspace
utilities.

Signed-off-by: Jesse Gross <jesse@nicira.com>
2011-12-03 09:35:17 -08:00