The most traditional interface of the network subsystem is the Berkeley socket interface. Historically, the Berkeley socket interface was developed at the University of California at Berkeley as a part of BSD 4.2 from 1981 to 1983. These days, it is present in virtually all flavors of Unix and Windows.
The Berkeley socket interface centers around the concept of a socket as an object that facilitates communication. The socket can be bound to a local address and connected to a remote address. Data can be sent and received over a socket.
int socket (int domain, int type, int protocol);
Domain specifies socket protocol class:
PF_UNIX - local communication
PF_INET - IPv4 protocol family
PF_INET6 - IPv6 protocol family
PF_IPX - IPX protocol family
PF_NETLINK - kernel communication
PF_PACKET - raw packet communication
Type specifies socket semantics:
SOCK_STREAM - reliable bidirectional ordered stream
SOCK_RDM - reliable bidirectional unordered messages
SOCK_DGRAM - unreliable bidirectional unordered messages
SOCK_SEQPACKET - reliable bidirectional ordered messages
SOCK_RAW - raw packets
Protocol specifies socket protocol:
0 - class and type determine protocol
other - identification of supported protocol
The socket call creates the socket object. An error is returned if the combination of class, type, and protocol is not supported.
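The behavior described above can be tried with a minimal sketch, assuming a POSIX system. The helper name try_socket is hypothetical and IPPROTO_UDP is used only as an example of a protocol that does not match the stream type.

```c
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: tries to create a socket and reports the outcome.
   Returns 0 on success or the errno value on failure. */
int try_socket(int domain, int type, int protocol)
{
    int fd = socket(domain, type, protocol);
    if (fd < 0) {
        int err = errno;
        printf("socket(%d, %d, %d) failed: %s\n",
               domain, type, protocol, strerror(err));
        return err;
    }
    printf("socket(%d, %d, %d) succeeded\n", domain, type, protocol);
    close(fd);
    return 0;
}
```

Calling try_socket (PF_INET, SOCK_STREAM, 0) lets the class and type determine the protocol, while try_socket (PF_INET, SOCK_STREAM, IPPROTO_UDP) asks for a combination that is expected to be rejected.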
int bind (int sockfd, struct sockaddr *my_addr, socklen_t addrlen);

#define __SOCKADDR_COMMON(sa_prefix) \
  sa_family_t sa_prefix##family

struct sockaddr_in
{
  __SOCKADDR_COMMON (sin_);
  in_port_t sin_port;
  struct in_addr sin_addr;
  unsigned char sin_zero [sizeof (struct sockaddr) -
                          __SOCKADDR_COMMON_SIZE -
                          sizeof (in_port_t) -
                          sizeof (struct in_addr)];
};

struct sockaddr_in6
{
  __SOCKADDR_COMMON (sin6_);
  in_port_t sin6_port;
  uint32_t sin6_flowinfo;
  struct in6_addr sin6_addr;
  uint32_t sin6_scope_id;
};
The bind call binds the socket to a given local address. The binding is typically necessary to tell the socket what local address to listen on for incoming connections.
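As an illustration, the following sketch fills in a struct sockaddr_in and binds a socket to the IPv4 loopback address. The helper names bind_loopback and local_port are hypothetical. Passing port 0 is an assumption made for convenience, it asks the kernel to pick a free port, which getsockname then reveals.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: creates a socket of the given type and binds it to
   127.0.0.1 on the given port. Returns the descriptor or -1 on failure. */
int bind_loopback(int type, unsigned short port)
{
    struct sockaddr_in addr;
    int fd = socket(PF_INET, type, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);                    /* network byte order */
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);  /* 127.0.0.1 */

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: returns the local port the socket is bound to. */
int local_port(int fd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    if (getsockname(fd, (struct sockaddr *)&addr, &len) < 0)
        return -1;
    return ntohs(addr.sin_port);
}
```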
int listen (int sockfd, int backlog);
The listen call tells the socket to listen for incoming connections and sets the length of the incoming connection queue.
int accept (int sockfd, struct sockaddr *addr, socklen_t *addrlen);
The accept call accepts an incoming connection on a listening socket that is SOCK_SEQPACKET, SOCK_STREAM or SOCK_RDM.
The function returns a new socket and an address that the new socket is connected to and keeps the original socket untouched.
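The interplay of bind, listen and accept can be sketched in a single process, assuming a TCP socket on the loopback interface. The helper name accept_one_loopback is hypothetical. The client end is created in the same process only to keep the sketch self-contained, and it relies on the kernel queueing the connection before accept is called.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: accepts one loopback connection made from within the
   same process. Returns 0 on success, -1 on failure. */
int accept_one_loopback(void)
{
    struct sockaddr_in addr, peer;
    socklen_t len = sizeof addr, peer_len = sizeof peer;
    int result = -1, client = -1, conn = -1;
    char byte = 0;

    int listener = socket(PF_INET, SOCK_STREAM, 0);
    if (listener < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(0);                       /* kernel picks a port */
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    if (bind(listener, (struct sockaddr *)&addr, sizeof addr) < 0) goto out;
    if (listen(listener, 4) < 0) goto out;          /* queue length of 4 */
    if (getsockname(listener, (struct sockaddr *)&addr, &len) < 0) goto out;

    /* The kernel queues the connection, so connect returns before accept. */
    client = socket(PF_INET, SOCK_STREAM, 0);
    if (client < 0) goto out;
    if (connect(client, (struct sockaddr *)&addr, sizeof addr) < 0) goto out;

    /* accept returns a new socket; peer receives the address of the client. */
    conn = accept(listener, (struct sockaddr *)&peer, &peer_len);
    if (conn < 0) goto out;

    if (write(client, "x", 1) == 1 && read(conn, &byte, 1) == 1 && byte == 'x')
        result = 0;
out:
    if (conn >= 0) close(conn);
    if (client >= 0) close(client);
    close(listener);
    return result;
}
```

The listening socket stays open after accept returns and could be passed to accept again to serve further connections.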
int connect (int sockfd, const struct sockaddr *serv_addr, socklen_t addrlen);
The connect call connects a socket that is SOCK_SEQPACKET, SOCK_STREAM or SOCK_RDM to a remote address. For other socket types, it sets a remote address of the socket.
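The second case can be illustrated with a datagram socket. In the sketch below, which assumes UDP over the loopback interface, connect only records the remote address, after which send can be used without naming a destination. The helper name udp_connect_demo is hypothetical.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: sends "ping" over a connected UDP socket and returns
   the number of bytes that arrived at the receiver, or -1 on failure. */
ssize_t udp_connect_demo(char *buf, size_t size)
{
    struct sockaddr_in addr, peer;
    socklen_t len = sizeof addr, peer_len = sizeof peer;
    ssize_t n = -1;

    int receiver = socket(PF_INET, SOCK_DGRAM, 0);
    int sender = socket(PF_INET, SOCK_DGRAM, 0);
    if (receiver < 0 || sender < 0) goto out;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);  /* port 0: kernel picks */
    if (bind(receiver, (struct sockaddr *)&addr, sizeof addr) < 0) goto out;
    if (getsockname(receiver, (struct sockaddr *)&addr, &len) < 0) goto out;

    /* For a datagram socket this only sets the remote address. */
    if (connect(sender, (struct sockaddr *)&addr, sizeof addr) < 0) goto out;
    if (getpeername(sender, (struct sockaddr *)&peer, &peer_len) < 0) goto out;
    if (peer.sin_port != addr.sin_port) goto out;

    if (send(sender, "ping", 4, 0) != 4) goto out;  /* no address needed */
    n = recv(receiver, buf, size, 0);
out:
    if (receiver >= 0) close(receiver);
    if (sender >= 0) close(sender);
    return n;
}
```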
ssize_t send (int sockfd, const void *buf, size_t len, int flags);
ssize_t sendto (int sockfd, const void *buf, size_t len, int flags,
                const struct sockaddr *to, socklen_t tolen);
ssize_t sendmsg (int sockfd, const struct msghdr *msg, int flags);

struct msghdr
{
  void *msg_name;           // optional address
  socklen_t msg_namelen;    // optional address length
  struct iovec *msg_iov;    // array for scatter gather
  size_t msg_iovlen;        // array for scatter gather length
  void *msg_control;        // additional control data
  socklen_t msg_controllen; // additional control data length
  int msg_flags;
};
The send family of calls sends data over a socket. Either the socket is connected or the remote address is specified. The write call can also be used but the flags cannot be specified in that case.
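When the socket is not connected, the remote address accompanies each call. The following sketch, assuming UDP over the loopback interface, sends a datagram with sendto and has the receiving side echo it back to the address reported by recvfrom. The helper names bound_udp_socket and udp_echo_once are hypothetical.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: creates a UDP socket bound to a free loopback port
   and stores the resulting address in *addr. */
static int bound_udp_socket(struct sockaddr_in *addr)
{
    socklen_t len = sizeof *addr;
    int fd = socket(PF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    memset(addr, 0, sizeof *addr);
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(fd, (struct sockaddr *)addr, sizeof *addr) < 0 ||
        getsockname(fd, (struct sockaddr *)addr, &len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: sends text to a server socket, which echoes it back.
   Returns the length of the reply or -1 on failure. */
ssize_t udp_echo_once(const char *text, char *reply, size_t reply_size)
{
    struct sockaddr_in server_addr, from;
    socklen_t from_len = sizeof from;
    char buf[256];
    ssize_t n = -1;

    int server = bound_udp_socket(&server_addr);
    int client = socket(PF_INET, SOCK_DGRAM, 0);
    if (server < 0 || client < 0) goto out;

    /* Unconnected socket: the destination is given with every call. */
    if (sendto(client, text, strlen(text), 0,
               (struct sockaddr *)&server_addr, sizeof server_addr) < 0)
        goto out;

    /* The server learns the sender address and replies to it. */
    n = recvfrom(server, buf, sizeof buf, 0, (struct sockaddr *)&from, &from_len);
    if (n < 0) goto out;
    if (sendto(server, buf, (size_t)n, 0, (struct sockaddr *)&from, from_len) < 0) {
        n = -1;
        goto out;
    }
    n = recv(client, reply, reply_size, 0);
out:
    if (server >= 0) close(server);
    if (client >= 0) close(client);
    return n;
}
```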
ssize_t recv (int sockfd, void *buf, size_t len, int flags);
ssize_t recvfrom (int sockfd, void *buf, size_t len, int flags,
                  struct sockaddr *from, socklen_t *fromlen);
ssize_t recvmsg (int sockfd, struct msghdr *msg, int flags);

struct msghdr
{
  void *msg_name;           // optional address
  socklen_t msg_namelen;    // optional address length
  struct iovec *msg_iov;    // array for scatter gather
  size_t msg_iovlen;        // array for scatter gather length
  void *msg_control;        // additional control data
  socklen_t msg_controllen; // additional control data length
  int msg_flags;
};
The recv family of calls receives data over a socket. The read call can also be used but the flags cannot be specified in that case.
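The scatter gather arrays of struct msghdr can be sketched as follows. The example gathers two buffers into one datagram with sendmsg and scatters the datagram into two buffers with recvmsg. It assumes a pair of connected local datagram sockets created with socketpair, chosen only for brevity, and the helper name scatter_gather_demo and the message contents are hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: sends "HEAD" and "payload" as one datagram and
   receives it into two separate buffers. Returns the number of bytes
   received or -1 on failure. */
ssize_t scatter_gather_demo(char *head, size_t head_size,
                            char *tail, size_t tail_size)
{
    int sv[2];
    char part1[] = "HEAD";
    char part2[] = "payload";
    struct iovec out[2] = { { part1, 4 }, { part2, 7 } };
    struct iovec in[2] = { { head, head_size }, { tail, tail_size } };
    struct msghdr msg;
    ssize_t n = -1;

    if (socketpair(PF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;

    /* Gather: both buffers leave as a single 11 byte message. */
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = out;
    msg.msg_iovlen = 2;
    if (sendmsg(sv[0], &msg, 0) == 11) {
        /* Scatter: the first buffer is filled before the second one. */
        memset(&msg, 0, sizeof msg);
        msg.msg_iov = in;
        msg.msg_iovlen = 2;
        n = recvmsg(sv[1], &msg, 0);
    }
    close(sv[0]);
    close(sv[1]);
    return n;
}
```

With a four byte first buffer, the header lands in head and the rest of the message in tail, without any copying in the application.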
The additional control data can provide data such as a list of queued errors or additional protocol and transport information. The additional control data is structured as a list of items, each with a header and a payload, and the content is protocol specific.
int select (int setsize, fd_set *readfds, fd_set *writefds,
            fd_set *exceptfds, struct timeval *timeout);
int poll (struct pollfd *ufds, unsigned int nfds, int timeout);

struct pollfd
{
  int fd;
  short events;   // requested events
  short revents;  // returned events
};
The select call is used to wait for data on several sockets at the same time. The arguments are sets of file descriptors, usually implemented as bitmaps. The file descriptors in readfds are waited for until a read would not block, those in writefds until a write would not block, and those in exceptfds until an exceptional condition occurs. The call returns the number of file descriptors that meet the condition of the wait.
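A minimal sketch of waiting on two descriptors with select follows. The helper name first_readable is hypothetical. Note that the first argument of select is the highest descriptor number plus one, and that the sets are overwritten with the result, so they have to be rebuilt before every call.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: waits up to timeout_ms milliseconds until a or b
   becomes readable. Returns the readable descriptor (a is preferred when
   both are ready) or -1 on timeout or error. */
int first_readable(int a, int b, long timeout_ms)
{
    fd_set readfds;
    struct timeval timeout;
    int highest = a > b ? a : b;

    FD_ZERO(&readfds);
    FD_SET(a, &readfds);
    FD_SET(b, &readfds);
    timeout.tv_sec = timeout_ms / 1000;
    timeout.tv_usec = (timeout_ms % 1000) * 1000;

    /* Only the read set is used, the other two sets are left empty. */
    if (select(highest + 1, &readfds, NULL, NULL, &timeout) <= 0)
        return -1;

    /* On return, the set contains only the descriptors that are ready. */
    return FD_ISSET(a, &readfds) ? a : b;
}
```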
The poll call makes it possible to more precisely distinguish what events to wait for.
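The difference can be sketched with a single descriptor. The requested events are given per descriptor in the events field and the kernel reports what actually happened in the revents field. The helper name poll_one is hypothetical, POLLIN and POLLOUT are the standard flags for readable and writable.

```c
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: polls one descriptor for the requested events and
   stores the returned events in *revents. Returns the result of poll,
   that is the number of ready descriptors, 0 on timeout, -1 on error. */
int poll_one(int fd, short events, int timeout_ms, short *revents)
{
    struct pollfd p;
    int n;

    p.fd = fd;
    p.events = events;   /* requested events */
    p.revents = 0;       /* returned events, filled in by the kernel */

    n = poll(&p, 1, timeout_ms);
    *revents = p.revents;
    return n;
}
```

Requesting POLLIN | POLLOUT on an idle connected socket is expected to report only POLLOUT, once data arrives POLLIN is reported as well.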
int getsockopt (int sockfd, int level, int optname, void *optval, socklen_t *optlen);
int setsockopt (int sockfd, int level, int optname, const void *optval, socklen_t optlen);
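The two calls above read and modify options of a socket. As a minimal sketch, the following helpers set and read integer options. The helper names are hypothetical and the SOL_SOCKET level with the SO_REUSEADDR and SO_TYPE options is picked only as a common example, it is not named in the text.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: allows the local address to be reused. */
int enable_reuseaddr(int fd)
{
    int on = 1;
    return setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on);
}

/* Hypothetical helper: reads an option whose value is an int. */
int get_int_option(int fd, int level, int optname, int *value)
{
    socklen_t len = sizeof *value;  /* in: buffer size, out: value size */
    return getsockopt(fd, level, optname, value, &len);
}
```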
Unix sockets represent a class of sockets used for local communication between processes. The sockets are represented by a file name or an abstract socket name.
struct sockaddr_un
{
  sa_family_t sun_family;           // set to AF_UNIX
  char sun_path [UNIX_PATH_MAX];    // socket name, 108 bytes on Linux
};
It is also possible to use sockets without names. The socketpair function creates a pair of connected sockets that can be inherited by child processes and used for communication.
int socketpair (int domain, int type, int protocol, int sockets [2]);
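A minimal sketch of the inheritance follows, assuming a POSIX system with fork. The helper name greet_from_child and the message text are hypothetical. The parent and the child each close the end they do not use.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that sends a greeting over an
   inherited socket. Returns the number of bytes read or -1 on failure. */
ssize_t greet_from_child(char *buf, size_t size)
{
    static const char msg[] = "hello from child";
    int sv[2];
    ssize_t n;
    pid_t pid;

    if (socketpair(PF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: inherited both ends, keeps only sv[1]. */
        close(sv[0]);
        if (write(sv[1], msg, sizeof msg - 1) < 0)
            _exit(1);
        _exit(0);
    }

    /* Parent: keeps only sv[0]. */
    close(sv[1]);
    n = read(sv[0], buf, size);
    close(sv[0]);
    waitpid(pid, NULL, 0);
    return n;
}
```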
Unix sockets can use additional control data to send file descriptors or to send process credentials (PID, UID, GID) whose correctness is verified by the kernel.
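Sending a file descriptor can be sketched with the SCM_RIGHTS control message, following the usual pattern with the CMSG macros. The helper names send_fd and recv_fd are hypothetical. One byte of ordinary data accompanies the control data, which is an assumption made because stream sockets need some data to carry it.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for one descriptor. */
union fd_control {
    char buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align;
};

/* Hypothetical helper: sends descriptor fd over the Unix socket sock. */
int send_fd(int sock, int fd)
{
    char dummy = 'F';
    struct iovec iov = { &dummy, 1 };
    union fd_control ctrl;
    struct msghdr msg;
    struct cmsghdr *c;

    memset(&msg, 0, sizeof msg);
    memset(&ctrl, 0, sizeof ctrl);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof ctrl.buf;

    c = CMSG_FIRSTHDR(&msg);            /* header of the control item */
    c->cmsg_level = SOL_SOCKET;
    c->cmsg_type = SCM_RIGHTS;
    c->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(c), &fd, sizeof(int));   /* payload: the descriptor */

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receives a descriptor, returns it or -1. */
int recv_fd(int sock)
{
    char dummy;
    struct iovec iov = { &dummy, 1 };
    union fd_control ctrl;
    struct msghdr msg;
    struct cmsghdr *c;
    int fd;

    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof ctrl.buf;

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;
    c = CMSG_FIRSTHDR(&msg);
    if (c == NULL || c->cmsg_level != SOL_SOCKET || c->cmsg_type != SCM_RIGHTS)
        return -1;
    memcpy(&fd, CMSG_DATA(c), sizeof(int));   /* new descriptor number */
    return fd;
}
```

The received descriptor generally has a different number than the sent one but refers to the same open file. In practice the two ends of the socket would belong to different processes.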
Important uses of the Unix sockets include the X protocol.
> netstat --unix --all
(servers and established)
Proto RefCnt Flags     Type    State      Path
unix  2      [ ACC ]   STREAM  LISTENING  /var/run/acpid.socket
unix  2      [ ACC ]   STREAM  LISTENING  /tmp/.font-unix/fs7100
unix  2      [ ACC ]   STREAM  LISTENING  /tmp/.gdm_socket
unix  2      [ ACC ]   STREAM  LISTENING  /tmp/.X11-unix/X0
unix  2      [ ACC ]   STREAM  LISTENING  /tmp/.ICE-unix/4088
unix  2      [ ACC ]   STREAM  LISTENING  /var/run/dbus/system_bus_socket
unix  3      [ ]       STREAM  CONNECTED  /var/run/dbus/system_bus_socket
unix  2      [ ]       DGRAM              @/var/run/hal/hotplug_socket
unix  2      [ ]       DGRAM              @udevd
unix  2      [ ACC ]   STREAM  LISTENING  /tmp/xmms_ceres.0
unix  3      [ ]       STREAM  CONNECTED  /tmp/.X11-unix/X0
unix  3      [ ]       STREAM  CONNECTED  /tmp/.ICE-unix/4088
Netlink sockets represent a class of sockets used for communication between processes and the kernel. The sockets are represented by a netlink family that is specified in place of protocol when creating the socket.
NETLINK_ARPD - ARP table
NETLINK_ROUTE - routing updates and modifications of IPv4 routing table
NETLINK_ROUTE6 - routing updates and modifications of IPv6 routing table
NETLINK_FIREWALL - IPv4 firewall
...
Messages sent over the netlink socket have a standardized format. Macros and libraries are provided for handling messages of specific netlink families.
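The message format and the macros can be sketched on a Linux system as follows. The example assumes the NETLINK_ROUTE family and the RTM_GETLINK request from linux/rtnetlink.h, which are not named in the text, and the helper name count_links is hypothetical. It asks the kernel to dump the network interfaces and walks the replies with the NLMSG_OK and NLMSG_NEXT macros.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical helper: counts network interfaces reported by the kernel.
   Returns the count or -1 on failure. Linux specific. */
int count_links(void)
{
    struct { struct nlmsghdr hdr; struct ifinfomsg ifi; } req;
    struct nlmsghdr buf[512];   /* 8 KB receive buffer, suitably aligned */
    int links = 0, done = 0;

    /* The netlink family is given in place of the protocol. */
    int fd = socket(PF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
    if (fd < 0)
        return -1;

    /* Every message starts with the standardized header. */
    memset(&req, 0, sizeof req);
    req.hdr.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
    req.hdr.nlmsg_type = RTM_GETLINK;
    req.hdr.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
    req.hdr.nlmsg_seq = 1;
    req.ifi.ifi_family = AF_UNSPEC;

    if (send(fd, &req, req.hdr.nlmsg_len, 0) < 0) {
        close(fd);
        return -1;
    }

    while (!done) {
        struct nlmsghdr *h;
        int len = (int)recv(fd, buf, sizeof buf, 0);
        if (len <= 0) {
            links = -1;
            break;
        }
        /* One receive may return several messages back to back. */
        for (h = buf; NLMSG_OK(h, len); h = NLMSG_NEXT(h, len)) {
            if (h->nlmsg_type == NLMSG_DONE) {
                done = 1;
                break;
            }
            if (h->nlmsg_type == NLMSG_ERROR) {
                links = -1;
                done = 1;
                break;
            }
            if (h->nlmsg_type == RTM_NEWLINK)
                links++;
        }
    }
    close(fd);
    return links;
}
```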
From the application programmer perspective, Winsock sockets offer an interface that is, in principle, based on that of the Berkeley sockets. From the service programmer perspective, Winsock offers an interface that allows service providers to install multiple protocol libraries underneath the unified API. The interface, called SPI (Service Provider Interface), distinguishes two types of services, transport and naming, and allows layering of protocol libraries.