If a large number of namespaces is spawned on a node and TIPC is
enabled in each of these, the excessive printk tracing of network
events will cause the system to grind down to a near halt.
The traces are still of debug value, so instead of removing them
completely we fix it by changing the link state and node availability
logging debug traces.
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If net namespace is supported in tipc, each namespace will be treated
as a separate tipc node. Therefore, every namespace must own its
private tipc node address. This means the "tipc_own_addr" global
variable of node address must be moved to tipc_net structure to
satisfy the requirement. It's turned out that users also can assign
node address for every namespace.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Tested-by: Tero Aho <Tero.Aho@coriant.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
TIPC broadcast link is statically established and its relevant states
are maintained with the global variables: "bcbearer", "bclink" and
"bcl". Allowing different namespace to own different broadcast link
instances, these variables must be moved to tipc_net structure and
broadcast link instances would be allocated and initialized when
namespace is created.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Tested-by: Tero Aho <Tero.Aho@coriant.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Global variables associated with node table are below:
- node table list (node_htable)
- node hash table list (tipc_node_list)
- node table lock (node_list_lock)
- node number counter (tipc_num_nodes)
- node link number counter (tipc_num_links)
To make node table support namespace, above global variables must be
moved to tipc_net structure in order to keep secret for different
namespaces. As a consequence, these variables are allocated and
initialized when namespace is created, and deallocated when namespace
is destroyed. After the change, functions associated with these
variables have to utilize a namespace pointer to access them. So
adding namespace pointer as a parameter of these functions is the
major change made in the commit.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Tested-by: Tero Aho <Tero.Aho@coriant.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 908344cdda ("tipc: fix bug in multicast congestion handling")
introduced a race in the broadcast link wakeup functionality.
This patch eliminates this broadcast link wakeup race caused by
operation on the wakeup list without proper locking. If this race
hit and corrupted the list all subsequent wakeup messages would be
lost, resulting in a considerable memory leak.
Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use standard SKB list APIs associated with struct sk_buff_head to
manage link's deferred queue, simplifying relevant code.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The node subscribe infrastructure represents a virtual base class, so
its users, such as struct tipc_port and struct publication, can derive
its implemented functionalities. However, after the removal of struct
tipc_port, struct publication is left as its only single user now. So
defining an abstract infrastructure for one user becomes no longer
reasonable. If corresponding new functions associated with the
infrastructure are moved to name_table.c file, the node subscription
infrastructure can be removed as well.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix sparse warnings about non-static declaration of static functions
in the new tipc netlink API.
Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add TIPC_NL_NODE_GET to the new tipc netlink API.
This command can dump the address and node status of all nodes in the
tipc cluster.
Netlink logical layout of returned node/address data:
-> node
-> address
-> up flag
Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Locking dependency detected below possible unsafe locking scenario:
CPU0 CPU1
T0: tipc_named_rcv() tipc_rcv()
T1: [grab nametble write lock]* [grab node lock]*
T2: tipc_update_nametbl() tipc_node_link_up()
T3: tipc_nodesub_subscribe() tipc_nametbl_publish()
T4: [grab node lock]* [grab nametble write lock]*
The opposite order of holding nametbl write lock and node lock on
above two different paths may result in a deadlock. If we move the
the updating of the name table after link state named out of node
lock, the reverse order of holding locks will be eliminated, and
as a result, the deadlock risk.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
One aim of commit 50100a5e39 ("tipc:
use pseudo message to wake up sockets after link congestion") was
to handle link congestion abatement in a uniform way for both unicast
and multicast transmit. However, the latter doesn't work correctly,
and has been broken since the referenced commit was applied.
If a user now sends a burst of multicast messages that is big
enough to cause broadcast link congestion, it will be put to sleep,
and not be waked up when the congestion abates as it should be.
This has two reasons. First, the flag that is used, TIPC_WAKEUP_USERS,
is set correctly, but in the wrong field. Instead of setting it in the
'action_flags' field of the arrival node struct, it is by mistake set
in the dummy node struct that is owned by the broadcast link, where it
will never tested for. Second, we cannot use the same flag for waking
up unicast and multicast users, since the function tipc_node_unlock()
needs to pick the wakeup pseudo messages to deliver from different
queues. It must hence be able to distinguish between the two cases.
This commit solves this problem by adding a new flag
TIPC_WAKEUP_BCAST_USERS, and a new function tipc_bclink_wakeup_user().
The latter is to be called by tipc_node_unlock() when the named flag,
now set in the correct field, is encountered.
v2: using explicit 'unsigned int' declaration instead of 'uint', as
per comment from David Miller.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the current implementation, each 'struct tipc_node' instance keeps
a linked list of those ports/sockets that are connected to the node
represented by that struct. The purpose of this is to let the node
object know which sockets to alert when it loses contact with its peer
node, i.e., which sockets need to have their connections aborted.
This entails an unwanted direct reference from the node structure
back to the port/socket structure, and a need to grab port_lock
when we have to make an upcall to the port. We want to get rid of
this unecessary BH entry point into the socket, and also eliminate
its use of port_lock.
In this commit, we instead let the node struct keep list of "connected
socket" structs, which each represents a connected socket, but is
allocated independently by the node at the moment of connection. If
the node loses contact with its peer node, the list is traversed, and
a "connection abort" message is created for each entry in the list. The
message is sent to it respective connected socket using the ordinary
data path, and the receiving socket aborts its connections upon reception
of the message.
This enables us to get rid of the direct reference from 'struct node' to
´struct port', and another unwanted BH access point to the latter.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current link implementation keeps a linked list of blocked ports/
sockets that is populated when there is link congestion. The purpose
of this is to let the link know which users to wake up when the
congestion abates.
This adds unnecessary complexity to the data structure and the code,
since it forces us to involve the link each time we want to delete
a socket. It also forces us to grab the spinlock port_lock within
the scope of node_lock. We want to get rid of this direct dependence,
as well as the deadlock hazard resulting from the usage of port_lock.
In this commit, we instead let the link keep list of a "wakeup" pseudo
messages for use in such situations. Those messages are sent to the
pending sockets via the ordinary message reception path, and wake up
the socket's owner when they are received.
This enables us to get rid of the 'waiting_ports' linked lists in struct
tipc_port that manifest this direct reference. As a consequence, we can
eliminate another BH entry into the socket, and hence the need to grab
port_lock. This is a further step in our effort to remove port_lock
altogether.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In a previous commit series ("tipc: new unicast transmission code")
we introduced a new message sending function, tipc_link_xmit2(),
and moved the unicast data users over to use that function. We now
let the internal name table distributor do the same.
The interaction between the name distributor and the node/link
layer also becomes significantly simpler, so we can eliminate
the function tipc_link_names_xmit().
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Message fragmentation is currently performed at link level, inside
the protection of node_lock. This potentially binds up the sending
link structure for a long time, instead of letting it do other tasks,
such as handle reception of new packets.
In this commit, we make the MTUs of each active link become easily
accessible from the socket level, i.e., without taking any spinlock
or dereferencing the target link pointer. This way, we make it possible
to perform fragmentation in the sending socket, before sending the
whole fragment chain to the link for transport.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function tipc_link_frag_rcv() is in reality a re-entrant generic
message reassemby function that has nothing in particular to do with
the link, where it is defined now. This becomes obvious when we see
the need to call the function from other places in the code.
In this commit rename it to tipc_buf_append() and move it to the file
msg.c. We also simplify its signature by moving the tail pointer to
the control block of the head buffer, hence making the head buffer
self-contained.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Each node action flag should be set or cleared separately, instead
we now set the whole flags variable in one shot, and it's turned
out to be hard to see which other flags are affected. Therefore,
for instance, we explicitly clear TIPC_WAIT_OWN_LINKS_DOWN bit in
node_lost_contact().
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rename node flags to action_flags as well as its enum names so
that they can reflect its real meanings.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Postpone the actions of delivering name tables until after node
lock is released, avoiding to do it under asynchronous context.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since previously what all publications pertaining to the lost node
were removed from name table was finished in tasklet context
asynchronously, we need to TIPC_NAMES_GONE flag indicating whether
the node cleanup work is finished or not. But now as the cleanup work
has been finished when node lock is released, the flag becomes
meaningless for us.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Postpone the actions of notifying subscriptions until after node lock
is released, avoiding to asynchronously execute registered handlers
when node is lost.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rename setup_blocked variable of node struct to a more common
name called "flags", which will be used to represent kinds of
node states.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 78acb1f9b8 ("tipc: add
ioctl to fetch link names") introduced a buffer overflow bug where
specially crafted ioctl requests could cause out-of-bounds indexing
of the node->links array. This was caused by an incorrect check vs
MAX_BEARERS, and the static code checker complaint is:
net/tipc/node.c:459 tipc_node_get_linkname() error: buffer overflow 'node->links' 2 <= 2
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We add a new ioctl for AF_TIPC that can be used to fetch the
logical name for a link to a remote node on a given bearer. This
should be used in combination with link state subscriptions.
The logical name size limit definitions are moved to tipc.h, as
they are now also needed by the new ioctl.
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When links are established over a bearer plane, we create a node
local publication containing information about the peer node and
bearer plane. This allows TIPC applications to use the standard
TIPC topology server subscription mechanism to get notifications
when a link goes up or down.
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now tipc routing hierarchy comprises the structures 'node', 'link'and
'bearer'. The whole hierarchy is protected by a big read/write lock,
tipc_net_lock, to ensure that nothing is added or removed while code
is accessing any of these structures. Obviously the locking policy
makes node, link and bearer components closely bound together so that
their relationship becomes unnecessarily complex. In the worst case,
such locking policy not only has a negative influence on performance,
but also it's prone to lead to deadlock occasionally.
In order o decouple the complex relationship between bearer and node
as well as link, the locking policy is adjusted as follows:
- Bearer level
RTNL lock is used on update side, and RCU is used on read side.
Meanwhile, all bearer instances including broadcast bearer are
saved into bearer_list array.
- Node and link level
All node instances are saved into two tipc_node_list and node_htable
lists. The two lists are protected by node_list_lock on write side,
and they are guarded with RCU lock on read side. All members in node
structure including link instances are protected by node spin lock.
- The relationship between bearer and node
When link accesses bearer, it first needs to find the bearer with
its bearer identity from the bearer_list array. When bearer accesses
node, it can iterate the node_htable hash list with the node
address to find the corresponding node.
In the new locking policy, every component has its private locking
solution and the relationship between bearer and node is very simple,
that is, they can find each other with node address or bearer identity
from node_htable hash list or bearer_list array.
Until now above all changes have been done, so tipc_net_lock can be
removed safely.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Tested-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently on both paths of message transmission and reception, the
read lock of tipc_net_lock must be held before bearer is accessed,
while the write lock of tipc_net_lock has to be taken before bearer
is configured. Although it can ensure that bearer is always valid on
the two data paths, link and bearer is closely bound together.
So as the part of effort of removing tipc_net_lock, the locking
policy of bearer protection will be adjusted as below: on the two
data paths, RCU is used, and on the configuration path of bearer,
RTNL lock is applied.
Now RCU just covers the path of message reception. To make it possible
to protect the path of message transmission with RCU, link should not
use its stored bearer pointer to access bearer, but it should use the
bearer identity of its attached bearer as index to get bearer instance
from bearer_list array, which can help us decouple the relationship
between bearer and link. As a result, bearer on the path of message
transmission can be safely protected by RCU when we access bearer_list
array within RCU lock protection.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Tested-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Without properly implicit or explicit read memory barrier, it's
unsafe to read an atomic variable with atomic_read() from another
thread which is different with the thread of changing the atomic
variable with atomic_inc() or atomic_dec(). So a stale tipc_num_links
may be got with atomic_read() in tipc_node_get_links(). If the
tipc_num_links variable type is converted from atomic to unsigned
integer and node list lock is used to protect it, the issue would
be avoided.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As tipc_node_list is protected by rcu read lock on read side, it's
unnecessary to hold node_list_lock to protect tipc_node_list in
tipc_node_get_links(). Instead, node_list_lock should just protects
tipc_num_nodes in the function.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert tipc_node_list list and node_htable hash list to RCU lists.
On read side, the two lists are protected with RCU read lock, and
on update side, node_list_lock is applied to them.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a node is created, tipc_net_lock read lock is first held and
then node_create_lock is grabbed in order to prevent the same node
from being created and inserted into both node list and hlist twice.
But when we query node from the two node lists, we only hold
tipc_net_lock read lock without grabbing node_create_lock. Obviously
this locking policy is unable to guarantee that the two node lists
are always synchronized especially when the operation of changing
and accessing them occurs in different contexts like currently doing.
Therefore, rename node_create_lock to node_list_lock to protect the
two node lists, that is, whenever node is inserted into them or node
is queried from them, the node_list_lock should be always held. As a
result, tipc_net_lock read lock becomes redundant and then can be
removed from the node query functions.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tipc_node_create routine doesn't need to check whether a node
object specified with a node address exists or not because its
caller(ie, tipc_disc_recv_msg routine) has checked this before
calling it.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rename the following functions, which are shorter and more in line
with common naming practice in the network subsystem.
tipc_bclink_send_msg->tipc_bclink_xmit
tipc_bclink_recv_pkt->tipc_bclink_rcv
tipc_disc_recv_msg->tipc_disc_rcv
tipc_link_send_proto_msg->tipc_link_proto_xmit
link_recv_proto_msg->tipc_link_proto_rcv
link_send_sections_long->tipc_link_iovec_long_xmit
tipc_link_send_sections_fast->tipc_link_iovec_xmit_fast
tipc_link_send_sync->tipc_link_sync_xmit
tipc_link_recv_sync->tipc_link_sync_rcv
tipc_link_send_buf->__tipc_link_xmit
tipc_link_send->tipc_link_xmit
tipc_link_send_names->tipc_link_names_xmit
tipc_named_recv->tipc_named_rcv
tipc_link_recv_bundle->tipc_link_bundle_rcv
tipc_link_dup_send_queue->tipc_link_dup_queue_xmit
link_send_long_buf->tipc_link_frag_xmit
tipc_multicast->tipc_port_mcast_xmit
tipc_port_recv_mcast->tipc_port_mcast_rcv
tipc_port_reject_sections->tipc_port_iovec_reject
tipc_port_recv_proto_msg->tipc_port_proto_rcv
tipc_connect->tipc_port_connect
__tipc_connect->__tipc_port_connect
__tipc_disconnect->__tipc_port_disconnect
tipc_disconnect->tipc_port_disconnect
tipc_shutdown->tipc_port_shutdown
tipc_port_recv_msg->tipc_port_rcv
tipc_port_recv_sections->tipc_port_iovec_rcv
release->tipc_release
accept->tipc_accept
bind->tipc_bind
get_name->tipc_getname
poll->tipc_poll
send_msg->tipc_sendmsg
send_packet->tipc_send_packet
send_stream->tipc_send_stream
recv_msg->tipc_recvmsg
recv_stream->tipc_recv_stream
connect->tipc_connect
listen->tipc_listen
shutdown->tipc_shutdown
setsockopt->tipc_setsockopt
getsockopt->tipc_getsockopt
Above changes have no impact on current users of the functions.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit 7d33939f47
("tipc: delay delete of link when failover is needed") we
introduced a loop for finding and removing a link pointer
in an array. The removal is done after we have left the loop,
giving the impression that one may remove the wrong pointer
if no matching element is found.
This is not really a bug, since we know that there will always
be a matching element, but it looks wrong, and causes a smatch
warning.
We fix this loop with this commit.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a bearer is disabled, all its attached links are deleted.
Ideally, we should do link failover to redundant links on other bearers,
if there are any, in such cases. This would be consistent with current
behavior when a link is reset, but not deleted. However, due to the
complexity involved, and the (wrongly) perceived low demand for this
feature, it was never implemented until now.
We mark the doomed link for deletion with a new flag, but wait until the
failover process is finished before we actually delete it. With the
improved link tunnelling/failover code introduced earlier in this commit
series, it is now easy to identify a spot in the code where the failover
is finished and it is safe to delete the marked link. Moreover, the test
for the flag and the deletion can be done synchronously, and outside the
most time critical data path.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The functionality related to link addition and failover is unnecessarily
hard to understand and maintain. We try to improve this by renaming
some of the functions, at the same time adding or improving the
explanatory comments around them. Names such as "tipc_rcv()" etc. also
align better with what is used in other networking components.
The changes in this commit are purely cosmetic, no functional changes
are made.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove dead code;
tipc_bearer_find_interface
tipc_node_redundant_links
This may break out of tree version of TIPC if there still is one.
But that maybe a good thing :-)
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
sk_buff lists are currently relased by looping over the list and
explicitly releasing each buffer.
We replace all occurrences of this loop with a call to kfree_skb_list().
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the first fragment of a long data data message is received on a link, a
reassembly buffer large enough to hold the data from this and all subsequent
fragments of the message is allocated. The payload of each new fragment is
copied into this buffer upon arrival. When the last fragment is received, the
reassembled message is delivered upwards to the port/socket layer.
Not only is this an inefficient approach, but it may also cause bursts of
reassembly failures in low memory situations. since we may fail to allocate
the necessary large buffer in the first place. Furthermore, after 100 subsequent
such failures the link will be reset, something that in reality aggravates the
situation.
To remedy this problem, this patch introduces a different approach. Instead of
allocating a big reassembly buffer, we now append the arriving fragments
to a reassembly chain on the link, and deliver the whole chain up to the
socket layer once the last fragment has been received. This is safe because
the retransmission layer of a TIPC link always delivers packets in strict
uninterrupted order, to the reassembly layer as to all other upper layers.
Hence there can never be more than one fragment chain pending reassembly at
any given time in a link, and we can trust (but still verify) that the
fragments will be chained up in the correct order.
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I'm not sure why, but the hlist for each entry iterators were conceived
list_for_each_entry(pos, head, member)
The hlist ones were greedy and wanted an extra parameter:
hlist_for_each_entry(tpos, pos, head, member)
Why did they need an extra pos parameter? I'm not quite sure. Not only
they don't really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.
Besides the semantic patch, there was some manual work required:
- Fix up the actual hlist iterators in linux/list.h
- Fix up the declaration of other iterators based on the hlist ones.
- A very small amount of places were using the 'node' parameter, this
was modified to use 'obj->member' instead.
- Coccinelle didn't handle the hlist_for_each_entry_safe iterator
properly, so those had to be fixed up manually.
The semantic patch which is mostly the work of Peter Senna Tschudin is here:
@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;
type T;
expression a,c,d,e;
identifier b;
statement S;
@@
-T b;
<+... when != b
(
hlist_for_each_entry(a,
- b,
c, d) S
|
hlist_for_each_entry_continue(a,
- b,
c) S
|
hlist_for_each_entry_from(a,
- b,
c) S
|
hlist_for_each_entry_rcu(a,
- b,
c, d) S
|
hlist_for_each_entry_rcu_bh(a,
- b,
c, d) S
|
hlist_for_each_entry_continue_rcu_bh(a,
- b,
c) S
|
for_each_busy_worker(a, c,
- b,
d) S
|
ax25_uid_for_each(a,
- b,
c) S
|
ax25_for_each(a,
- b,
c) S
|
inet_bind_bucket_for_each(a,
- b,
c) S
|
sctp_for_each_hentry(a,
- b,
c) S
|
sk_for_each(a,
- b,
c) S
|
sk_for_each_rcu(a,
- b,
c) S
|
sk_for_each_from
-(a, b)
+(a)
S
+ sk_for_each_from(a) S
|
sk_for_each_safe(a,
- b,
c, d) S
|
sk_for_each_bound(a,
- b,
c) S
|
hlist_for_each_entry_safe(a,
- b,
c, d, e) S
|
hlist_for_each_entry_continue_rcu(a,
- b,
c) S
|
nr_neigh_for_each(a,
- b,
c) S
|
nr_neigh_for_each_safe(a,
- b,
c, d) S
|
nr_node_for_each(a,
- b,
c) S
|
nr_node_for_each_safe(a,
- b,
c, d) S
|
- for_each_gfn_sp(a, c, d, b) S
+ for_each_gfn_sp(a, c, d) S
|
- for_each_gfn_indirect_valid_sp(a, c, d, b) S
+ for_each_gfn_indirect_valid_sp(a, c, d) S
|
for_each_host(a,
- b,
c) S
|
for_each_host_safe(a,
- b,
c, d) S
|
for_each_mesh_entry(a,
- b,
c, d) S
)
...+>
[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foudnation.org: redo intrusive kvm changes]
Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Upon establishing a first link between two nodes, there is
currently a risk that the two endpoints will disagree on exactly
which sequence number reception and acknowleding of broadcast
packets should start.
The following scenarios may happen:
1: Node A sends an ACTIVATE message to B, telling it to start acking
packets from sequence number N.
2: Node A sends out broadcast N, but does not expect an acknowledge
from B, since B is not yet in its broadcast receiver's list.
3: Node A receives ACK for N from all nodes except B, and releases
packet N.
4: Node B receives the ACTIVATE, activates its link endpoint, and
stores the value N as sequence number of first expected packet.
5: Node B sends a NAME_DISTR message to A.
6: Node A receives the NAME_DISTR message, and activates its endpoint.
At this moment B is added to A's broadcast receiver's set.
Node A also sets sequence number 0 as the first broadcast packet
to be received from B.
7: Node A sends broadcast N+1.
8: B receives N+1, determines there is a gap in the sequence, since
it is expecting N, and sends a NACK for N back to A.
9: Node A has already released N, so no retransmission is possible.
The broadcast link in direction A->B is stale.
In addition to, or instead of, 7-9 above, the following may happen:
10: Node B sends broadcast M > 0 to A.
11: Node A receives M, falsely decides there must be a gap, since
it is expecting packet 0, and asks for retransmission of packets
[0,M-1].
12: Node B has already released these packets, so the broadcast
link is stale in direction B->A.
We solve this problem by introducing a new unicast message type,
BCAST_PROTOCOL/STATE, to convey the sequence number of the next
sent broadcast packet to the other endpoint, at exactly the moment
that endpoint is added to the own node's broadcast receivers list,
and before any other unicast messages are permitted to be sent.
Furthermore, we don't allow any node to start receiving and
processing broadcast packets until this new synchronization
message has been received.
To maintain backwards compatibility, we still open up for
broadcast reception if we receive a NAME_DISTR message without
any preceding broadcast sync message. In this case, we must
assume that the other end has an older code version, and will
never send out the new synchronization message. Hence, for mixed
old and new nodes, the issue arising in 7-12 of the above may
happen with the same probability as before.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Rename the "supported" flag in bclink structure to "recv_permitted"
to better reflect what it is used for. When this flag is set for a
given node, we are permitted to receive and acknowledge broadcast
messages from that node. Convert it to a bool at the same time,
since it is not used to store any numerical values.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
The "supportable" flag in bclink structure is a compatibility flag
indicating whether a peer node is capable of receiving TIPC broadcast
messages. However, all TIPC versions since tipc-1.5, and after the
inclusion in the upstream Linux kernel in 2006, support this capability.
It is highly unlikely that anybody is still using such an old
version of TIPC, let alone that they want to mix it with TIPC-2.0
nodes. Therefore, we now remove the "supportable" flag.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
All messages should go directly to the kernel log. The TIPC
specific error, warning, info and debug trace macro's are
removed and all references replaced with pr_err, pr_warn,
pr_info and pr_debug.
Commonly used sub-strings are explicitly declared as a const
char to reduce .text size.
Note that this means the debug messages (changed to pr_debug),
are now enabled through dynamic debugging, instead of a TIPC
specific Kconfig option (TIPC_DEBUG). The latter will be
phased out completely
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
[PG: use pr_fmt as suggested by Joe Perches <joe@perches.com>]
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Some of the comment blocks are floating in limbo between two
functions, or between blocks of code. Delete the extra line
feeds between any comment and its associated following block
of code, to be consistent with the majority of the rest of
the kernel. Also delete trailing newlines at EOF and fix
a couple trivial typos in existing comments.
This is a 100% cosmetic change with no runtime impact. We get
rid of over 500 lines of non-code, and being blank line deletes,
they won't even show up as noise in git blame.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Untie gcc's hands and let it do what it wants within the
individual source files. There are two files, node.c and
port.c -- only the latter effectively changes (gcc-4.5.2).
Objdump shows gcc deciding to not inline port_peernode().
Suggested-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduces routines that test whether a given network address is
equal to a node's own network address or if it lies within the node's
own network cluster, and which work properly regardless of whether
the node is using the default network address <0.0.0> or a non-zero
network address that is assigned later on. In essence, these routines
ensure that address <0.0.0> is treated as an alias for "this node",
regardless of which network address the node is actually using.
Old users of the pre-existing more strict match in_own_cluster()
have been accordingly redirected to what is now called
in_own_cluster_exact() --- which does not extend matching to <0,0,0>.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Removes all references to the global variable that records whether
TIPC is running in "single node" mode or "network" mode, since this
information can be easily deduced from the global variable that
records TIPC's network address. (i.e. a non-zero network address
means that TIPC is running in network mode.)
The changes made update most existing mode-based checks to use the
network address global variable. A few checks that are no longer
needed are removed entirely, along with any associated code lying on
non-executable control paths.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Optimizes processing done when contact with a neighboring node is
established to avoid recording the current state of outgoing broadcast
messages if the neighboring node isn't a valid broadcast link destination,
since this state information isn't needed for such nodes.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Eliminates a block of comments that describe how routing table updates
are to be handled. These comments no longer apply following the removal
of TIPC's prototype multi-cluster support.
Note that these changes are essentially cosmetic in nature, and have
no impact on the actual operation of TIPC.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Gets rid of two inlined routines that simply call existing sk_buff
manipulation routines, since there is no longer any extra processing
done by the helper routines.
Note that these changes are essentially cosmetic in nature, and have
no impact on the actual operation of TIPC.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Relocates information about the size of TIPC's node table index and
its associated hash function, since only node subsystem routines need
to have access to this information.
Note that these changes are essentially cosmetic in nature, and have
no impact on the actual operation of TIPC.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Adds support for the new "node signature" in neighbor discovery messages,
which is a 16 bit identifier chosen randomly when TIPC is initialized.
This field makes it possible for nodes receiving a neighbor discovery
message to detect if multiple neighboring nodes are using the same network
address (i.e. <Z.C.N>), even when the messages are arriving on different
interfaces.
This first phase of node signature support creates the signature,
incorporates it into outgoing neighbor discovery messages, and tracks
the signature used by valid neighbors. An upcoming patch builds on this
foundation to implement the improved duplicate neighbor detection checking.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Eliminates support for the broadcast tag field, which is no longer
used by broadcast link NACK messages.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Completely redesigns broadcast link ACK and NACK mechanisms to prevent
spurious retransmit requests in dual LAN networks, and to prevent the
broadcast link from stalling due to the failure of a receiving node to
acknowledge receiving a broadcast message or request its retransmission.
Note: These changes only impact the timing of when ACK and NACK messages
are sent, and not the basic broadcast link protocol itself, so inter-
operability with nodes using the "classic" algorithms is maintained.
The revised algorithms are as follows:
1) An explicit ACK message is still sent after receiving 16 in-sequence
messages, and implicit ACK information continues to be carried in other
unicast link message headers (including link state messages). However,
the timing of explicit ACKs is now based on the receiving node's absolute
network address rather than its relative network address to ensure that
the failure of another node does not delay the ACK beyond its 16 message
target.
2) A NACK message is now typically sent only when a message gap persists
for two consecutive incoming link state messages; this ensures that a
suspected gap is not confirmed until both LANs in a dual LAN network have
had an opportunity to deliver the message, thereby preventing spurious NACKs.
A NACK message can also be generated by the arrival of a single link state
message, if the deferred queue is so big that the current message gap
cannot be the result of "normal" mis-ordering due to the use of dual LANs
(or one LAN using a bonded interface). Since link state messages typically
arrive at different nodes at different times the problem of multiple nodes
issuing identical NACKs simultaneously is inherently avoided.
3) Nodes continue to "peek" at NACK messages sent by other nodes. If
another node requests retransmission of a message gap suspected (but not
yet confirmed) by the peeking node, the peeking node forgets about the
gap and does not generate a duplicate retransmit request. (If the peeking
node subsequently fails to receive the lost message, later link state
messages will cause it to rediscover and confirm the gap and send another
NACK.)
4) Message gap "equality" is now determined by the start of the gap only.
This is sufficient to deal with the most common cases of message loss,
and eliminates the need for complex end of gap computations.
5) A peeking node no longer tries to determine whether it should send a
complementary NACK, since the most common cases of message loss don't
require it to be sent. Consequently, the node no longer examines the
"broadcast tag" field of a NACK message when peeking.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Fix a bug that can prevent TIPC from sending broadcast messages to a node
if contact with the node is lost and then regained. The problem occurs if
the broadcast link first clears the flag indicating the node is part of the
link's distribution set (when it loses contact with the node), and later
fails to restore the flag (when contact is regained); restoration fails
if contact with the node is regained by implicit unicast link activation
triggered by the arrival of a data message, rather than explicitly by the
arrival of a link activation message.
The broadcast link now uses separate fields to track whether a node is
theoretically capable of receiving broadcast messages versus whether it is
actually part of the link's distribution set. The former member is updated
by the receipt of link protocol messages, which can occur at any time; the
latter member is updated only when contact with the node is gained or lost.
This change also permits the simplification of several conditional
expressions since the broadcast link's "supported" field can now only be
set if there are working links to the associated node.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Adds checks to TIPC's broadcast link so that it ignores any
acknowledgement message containing a sequence number that does not
correspond to an unacknowledged message currently in the broadcast
link's transmit queue.
This change prevents the broadcast link from becoming stalled if a
newly booted node receives stale broadcast link acknowledgement
information from another node that has not yet fully synchronized
its end of the broadcast link to reflect the current state of the
new node's end.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Fixes oversight that allowed broadcast link node map to be updated without
first taking the broadcast link spinlock that protects the map. As part
of this fix the node map has been incorporated into the broadcast link
structure to make the need for such protection more evident.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Enhances TIPC to ensure that a node that loses contact with a
neighboring node does not allow contact to be re-established until
it sees that its peer has also recognized the loss of contact.
Previously, nodes that were connected by two or more links could
encounter a situation in which node A would lose contact with node B
on all of its links, purge its name table of names published by B,
and then fail to repopulate those names once contact with B was restored.
This would happen because B was able to re-establish one or more links
so quickly that it never reached a point where it had no links to A --
meaning that B never saw a loss of contact with A, and consequently
didn't re-publish its names to A.
This problem is now prevented by enhancing the cleanup done by TIPC
following a loss of contact with a neighboring node to ensure that
node A ignores all messages sent by B until it receives a LINK_PROTOCOL
message that indicates B has lost contact with A, thereby preventing
the (re)establishment of links between the nodes. The loss of contact
is recognized when a RESET or ACTIVATE message is received that has
a "redundant link exists" field of 0, indicating that B's sending link
endpoint is in a reset state and that B has no other working links.
Additionally, TIPC now suppresses the sending of (most) link protocol
messages to a neighboring node while it is cleaning up after an earlier
loss of contact with that node. This stops the peer node from prematurely
activating its link endpoint, which would prevent TIPC from later
activating its own end. TIPC still allows outgoing RESET messages to
occur during cleanup, to avoid problems if its own node recognizes
the loss of contact first and tries to notify the peer of the situation.
Finally, TIPC now recognizes an impending loss of contact with a peer node
as soon as it receives a RESET message on a working link that is the
peer's only link to the node, and ensures that the link protocol
suppression mentioned above goes into effect right away -- that is,
even before its own link endpoints have failed. This is necessary to
ensure correct operation when there are redundant links between the nodes,
since otherwise TIPC would send an ACTIVATE message upon receiving a RESET
on its first link and only begin suppressing when a RESET on its second
link was received, instead of initiating suppression with the first RESET
message as it needs to.
Note: The reworked cleanup code also eliminates a check that prevented
a link endpoint's discovery object from responding to incoming messages
while stale name table entries are being purged. This check is now
unnecessary and would have slowed down re-establishment of communication
between the nodes in some situations.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Ensure that broadcast link messages that have not been acknowledged
by a newly failed node do not get an implied acknowledgement until the
failed node is removed from the broadcast link's map of reachable nodes.
Previously, a race condition allowed a new broadcast link message to be
sent after the implicit acknowledgement processing was completed, but
before the map of reachable nodes was updated, resulting in the message
having an expected acknowledgement count that required the failed node
to explicitly acknowledge the message. Since this would never occur
the new message would remain in the broadcast link's transmit queue
forever, eventually causing the link to become congested and "stall".
Delaying the implicit acknowledgement processing until after the update
of the map of reachable nodes eliminates this race condition and prevents
stalling.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Enhances cleanup of broadcast link-related information when contact
with a node is lost.
1) All broadcast link-related cleanup now occurs only if the lost node
was capable of communicating over the broadcast link.
2) Following cleanup, the lost node is marked as no longer supporting
the broadcast link, ensuring that any remaining broadcast messages
received from that node prior to the re-establishment of a normal
communication link are ignored.
Thanks to Surya [Suryanarayana.Garlapati@emerson.com] for contributing
a prototype version of this patch.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Enhances link creation code as follows:
1) Detects illegal attempts to add a requested link earlier in the
link creation process. This prevents TIPC from wasting time
initializing a link object it then throws away, and also eliminates
the code needed to do the throwing away.
2) Passes in the node object associated with the requested link.
This allows TIPC to eliminate a search to locate the node object,
as well as code that attempted to create the node if it doesn't
exist.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Function names like "tipc_node_has_redundant_links" are unweildy
and result in long lines even for simple lines. The "has" doesn't
contribute any value add, so dropping that is a slight step in the
right direction. This is a cosmetic change, basic result of:
for i in `grep -l tipc_node_has_ *` ; do sed -i s/tipc_node_has_/tipc_node_/ $i ; done
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Relocates network-related variables into the subsystem files where
they are now primarily used (following the recent rework of TIPC's
node table), and converts globals into locals where possible. Changes
the initialization of tipc_num_links from run-time to compile-time,
and eliminates the net_start routine that becomes empty as a result.
Also eliminates the corresponding net_stop routine by moving its
(trivial) content into the one location that called the routine.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Replaces the dynamically allocated array of pointers to the cluster's
node objects with a static hash table. Hash collisions are resolved
using chaining, with a typical hash chain having only a single node,
to avoid degrading performance during processing of incoming packets.
The conversion to a hash table reduces the memory requirements for
TIPC's node table to approximately the same size it had prior to
the previous commit.
In addition to the hash table itself, TIPC now also maintains a
linked list for the node objects, sorted by ascending network address.
This list allows TIPC to continue sending responses to user space
applications that request node and link information in sorted order.
The list also improves performance when name table update messages are
sent by making it easier to identify the nodes that must be notified.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Converts the fields of the global "tipc_net" structure into individual
variables. Since the struct was never referenced as a complete unit,
its existence was pointless. This will facilitate upcoming changes to
TIPC's node table and simpify upcoming relocation of the variables so
they are only visible to the files that actually use them.
This change is essentially cosmetic in nature, and doesn't affect the
operation of TIPC.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Removes a race condition that could cause TIPC's internal counter
of the number of links it has to neighboring nodes to have the
incorrect value if two independent threads of control simultaneously
create new link endpoints connecting to two different nodes using two
different bearers. Such under counting would result in TIPC failing to
list the final link(s) in its response to a configuration request to
list all of the node's links. The counter is now updated atomically
to ensure that simultaneous increments do not interfere with each
other.
Thanks go to Peter Butler <pbutler@pt.com> for his assistance in
diagnosing and fixing this problem.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Relocates the code that notifies users of node subscriptions so that
it is adjacent to the rest of the routines that implement TIPC's node
subscription capability. Renames the name table routine that is
invoked by a node subscription to better reflect its purpose and to
be consistent with other, similar name table routines.
These changes are cosmetic in nature, and do not alter the behavior
of TIPC.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Prevents a null pointer dereference from occurring if a node subscription
is triggered at the same time that the subscribing port or publication is
terminating the subscription. The problem arises if the triggering routine
asynchronously activates and deregisters the node subscription while
deregistration is already underway -- the deregistration routine may find
that the pointer it has just verified to be non-NULL is now NULL.
To avoid this race condition the triggering routine now simply marks the
node subscription as defunct (to prevent it from re-activating)
instead of deregistering it. The subscription is now both deregistered
and destroyed only when the subscribing port or publication code terminates
the node subscription.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Introduces a pair of helper routines that convert the network address
for a TIPC node into the network address for its cluster or zone.
This is a cosmetic change designed to avoid future errors caused by
the incorrect use of address bitmasks, and does not alter the existing
operation of TIPC.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Fixes a typo in the calculation of the network address of a node's own
cluster when generating a response to the configuration command that
lists all of the node's links. The correct mask value for a <Z.C.N>
network address uses 1's for the 8-bit zone and 12-bit cluster parts
and 0's for the 12-bit node part.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Combines two distinct structures containing information about a TIPC bearer
into a single structure. The structures were previously kept separate so
that public information about a bearer could be made available to plug-in
media types using TIPC's native API, while the remaining information was
kept private for use by TIPC itself. However, now that the native API has
been removed there is no longer any need for this arrangement.
Since one of the structures was already embedded within the other, the
change largely involves replacing instances of "publ.foo" with "foo".
The changes do not otherwise alter the operation of TIPC bearers.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Cleans up TIPC's source code to eliminate the needless initialization
of static variables to zero.
These changes are purely cosmetic and do not alter the operation of TIPC
in any way.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cleans up TIPC's source code to eliminate deviations from generally
accepted coding conventions relating to leading/trailing white space
and white space around commas, braces, cases, and sizeof.
These changes are purely cosmetic and do not alter the operation of TIPC
in any way.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eliminates a number of #include statements that no longer serve any
useful purpose.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eliminates obsolete calls to two of TIPC's main debugging macros, as well
as a pair of associated debugging routines that are no longer required.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eliminates a sorted list TIPC uses to keep track of the neighboring
nodes it has links to, since this duplicates information already present
in the internal array of node object pointers.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eliminates routines, data structures, and files that were intended
to allow TIPC to support a network containing multiple clusters.
Currently, TIPC supports only networks consisting of a single cluster
within a single zone, so this code is unnecessary.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eliminates routines and data structures that were intended to allow
TIPC to route messages to other clusters. Currently, TIPC supports only
networks consisting of a single cluster within a single zone, so this
code is unnecessary.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simplifies routines and data structures that were intended to allow
TIPC to support slave nodes (i.e. nodes that did not have links to
all of the other nodes in its cluster, forcing TIPC to route messages
that it could not deliver directly through a non-slave node).
Currently, TIPC supports only networks containing non-slave nodes,
so this code is unnecessary.
Note: The latest edition of the TIPC 2.0 Specification has eliminated
the concept of slave nodes entirely.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eliminates routines, data structures, and files that were intended
to allows TIPC to support a network containing multiple zones.
Currently, TIPC supports only networks consisting of a single cluster
within a single zone, so this code is unnecessary.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Removes several function declarations that aren't used anywhere,
either because they reference routines that no longer exist or
because all users of the function reference it after it has already
been defined.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gets rid of #include statements that are no longer required as a
result of the merging of obsolete native API header file content
into other TIPC include files.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Do some cleanups of TIPC based on make namespacecheck
1. Don't export unused symbols
2. Eliminate dead code
3. Make functions and variables local
4. Rename buf_acquire to tipc_buf_acquire since it is used in several files
Compile tested only.
This make break out of tree kernel modules that depend on TIPC routines.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove all instances of legacy, or as yet to be implemented code
that is currently living within an #if 0 ... #endif block.
In the rare instance that some of it be needed in the future,
it can still be dragged out of history, but there is no need
for it to sit in mainline.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change "return (EXPR);" to "return EXPR;"
return is not a function, parentheses are not required.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ensure that TIPC does not re-establish communication with a
neighboring node until it has finished updating all data structures
containing information about that node to reflect the earlier loss of
contact. Previously, it was possible for TIPC to perform its purge of
name table entries relating to the node once contact had already been
re-established, resulting in the unwanted removal of valid name table
entries.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eliminate unnecessary checking for null node pointer and redundant
check of second active link array entry.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These functions have enough code in them such that they
seem like sensible targets for un-inlining. Prior to doing
that, this adds the tipc_ prefix to the functions, so that
in the event of a panic dump or similar, the subsystem from
which the functions come from is immediately clear.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Frans Pop <elendil@planet.nl>
Cc: Per Liden <per.liden@ericsson.com>
Cc: Jon Maloy <jon.maloy@ericsson.com>
Cc: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix warnings from current gcc about using non-const strings as printf
args in TIPC. Compile tested only (not a TIPC user).
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Morton reported a build failure on sparc32, because TIPC
uses names like "struct node" and there is a like named data
structure defined in linux/node.h
This just regexp replaces "struct node*" to "struct tipc_node*"
to avoid this and any future similar problems.
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch ensures that TIPC configuration commands that display info
about neighboring nodes and their links take the spinlocks that
protect the node list and link lists from changing while the lists
are being traversed.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch ensures that the simultaneous discovery of the same
neighboring node by multiple interfaces does not cause TIPC to add
the node into its internal data structures more than once.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
fix two warnings generated by sparse:
link.c:2386 symbol 'msgcount' shadows an earlier one
node.c:244 symbol 'addr_string' shadows an earlier one
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes a minor bug that prevents "tipc-config -l" from
displaying the multicast link if a TIPC node has never successfully
established at least one unicast link.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now allocates reply space for "get links" request based on number of actual
links, not number of potential links. Also, limits reply to "get links" and
"get nodes" requests to 32KB to match capabilities of tipc-config utility
that issued request.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
locking init cleanups:
- convert " = SPIN_LOCK_UNLOCKED" to spin_lock_init() or DEFINE_SPINLOCK()
- convert rwlocks in a similar manner
this patch was generated automatically.
Motivation:
- cleanliness
- lockdep needs control of lock initialization, which the open-coded
variants do not give
- it's also useful for -rt and for lock debugging in general
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Incorporates several related fixes:
- switchover now occurs when switching from an active link to a standby link
- failure of a standby link no longer initiates switchover
- links now display correct # of received packtes following reactivation
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This also contains enhancements to simplify comparisons in name table
publication removal algorithm and to simplify name table sanity checking
when shutting down TIPC.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch contains the following possible cleanups:
- make needlessly global code static
- #if 0 the following unused global functions:
- name_table.c: tipc_nametbl_print()
- name_table.c: tipc_nametbl_dump()
- net.c: tipc_net_next_node()
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tried to run the new tipc stack through sparse.
Following patch fixes all cases where 0 was used
as replacement of NULL.
Use NULL to document this is a pointer and to silence sparse.
This brough sparse warning count down with 127 to 24 warnings.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Updated copyright notice to include the year the file was
actually created. Information about file creation dates
was extracted from the files in the old CVS repository
at tipc.sourceforge.net.
Signed-off-by: Per Liden <per.liden@nospam.ericsson.com>
The license header in each file now more clearly state that this
code is licensed under a dual BSD/GPL. Before this was only
evident if you looked at the MODULE_LICENSE line in core.c.
Signed-off-by: Per Liden <per.liden@nospam.ericsson.com>
TIPC (Transparent Inter Process Communication) is a protocol designed for
intra cluster communication. For more information see
http://tipc.sourceforge.net
Signed-off-by: Per Liden <per.liden@nospam.ericsson.com>