Message Passing IPC has a simple API
(diagram: a pipe between two processes; fd[1] is the write end, fd[0] the read end)
int pipe (int pipefd[2]);             /* pipe() takes no flags; pipe2() does */
ssize_t read (pipefd[0], buf, len);   /* read end  */
ssize_t write (pipefd[1], buf, len);  /* write end */
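The three calls map directly onto POSIX; a minimal sketch (error handling for write/close omitted):

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Round-trip a message through a pipe: write on fd[1], read on fd[0].
 * Returns the number of bytes read, or -1 if the pipe cannot be created. */
ssize_t pipe_roundtrip(const char *msg, char *buf, size_t buflen)
{
    int fd[2];
    if (pipe(fd) < 0)
        return -1;
    (void)write(fd[1], msg, strlen(msg));
    ssize_t n = read(fd[0], buf, buflen);
    close(fd[0]);
    close(fd[1]);
    return n;
}
```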
Single system: flow allocation
(diagram: applications X and Y, each bound to a port ID, with a flow through the IPC facility)
Flow allocation: reserves resources in the IPC facility and assigns port IDs (analogous to file descriptors)
Applications X and Y are known (instantiated by the kernel); the IPC facility is part of the OS
<port_id> alloc (pid, …);
dealloc (port_id);
Implementation dependent: message passing or shared memory
Single system: connection
(diagram: as before, with an application-level connection on top of the flow through the IPC facility)
Flow allocation: reserves resources in the IPC facility and assigns port IDs (analogous to file descriptors)
Applications X and Y are known (instantiated by the kernel); the IPC facility is part of the OS
<port_id> alloc (pid, …);
<port_id> accept (…);
dealloc (port_id);
read/write (port_id, sdu *, len);
Implementation dependent: message passing or shared memory
On top of the IPC facility, the application facility establishes a connection (application protocol)
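The alloc/dealloc interface above is implementation dependent; a toy in-memory port table (hypothetical names, not a real OS API) might look like:

```c
#include <assert.h>

#define MAX_PORTS 16

static int port_used[MAX_PORTS];  /* 0 = free, 1 = allocated */

/* Reserve a port ID, analogous to alloc(); returns -1 if none is free. */
int port_alloc(void)
{
    for (int i = 0; i < MAX_PORTS; i++)
        if (!port_used[i]) { port_used[i] = 1; return i; }
    return -1;
}

/* Release a port ID, analogous to dealloc(port_id). */
void port_dealloc(int id)
{
    if (id >= 0 && id < MAX_PORTS)
        port_used[id] = 0;
}
```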
Two systems connected by a wire
(diagram: applications X on one system and Y on the other, joined by a wire)
Applications X and Y need to register a name that is unique across both systems!
Two systems connected by a wire
(diagram: an IPC process on each system, with an IRM above each)
IPC process: provides the IPC service, not only locally but across the two systems.
Requires an IPC-process component that manages the medium (MAC).
IRM (IPC Resource Manager): creates/destroys IPC processes. Ideally part of the OS.
Two systems connected by a wire
(diagram: the two IPC processes now form a DISTRIBUTED IPC FACILITY providing a flow between port IDs; applications X and Y form a DISTRIBUTED APPLICATION FACILITY on top)
Locating an application: if it's not here, it's over there… or it doesn't exist.
reg(pid, name);
unreg(pid);
alloc (name, …);
dealloc (port_id);
Multiple systems sharing a transmission medium such as radio
(diagram: systems X, Y and Z with IPC processes 1, 2 and 3 sharing the medium in one distributed IPC facility, each with its own IRM)
Need to synchronize names between the systems:
register names (X,1), (Z,3), … -> a directory function
small scope: broadcasting information is a feasible option
IPC processes need addresses for scalability
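The directory function can be sketched as a simple name-to-address table (illustrative names and types; a real directory would be distributed and synchronized between the systems):

```c
#include <assert.h>
#include <string.h>

/* Toy directory: maps registered application names to IPC-process
 * addresses, e.g. (X,1), (Z,3). Purely illustrative. */
#define DIR_SIZE 8

struct dir_entry { const char *name; int addr; };
static struct dir_entry directory[DIR_SIZE];
static int dir_count;

void dir_register(const char *name, int addr)
{
    if (dir_count < DIR_SIZE)
        directory[dir_count++] = (struct dir_entry){ name, addr };
}

/* Returns the address of `name`, or -1:
 * "if it's not here, it's over there... or doesn't exist." */
int dir_lookup(const char *name)
{
    for (int i = 0; i < dir_count; i++)
        if (strcmp(directory[i].name, name) == 0)
            return directory[i].addr;
    return -1;
}
```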
IPC API
• Application processes (APs) communicate using a “port”
– portID: ~ file descriptor
• 6 operations:
– int _registerApp(appName, List<difName>)
– portId _allocateFlow(destAppName, List<QoSParams>)
– int _write(portId, sdu)
– sdu _read(portId)
– int _deallocate(portId)
– int _unregisterApp(appName, List<difName>)
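A toy in-memory model of the registration and flow-allocation operations (illustrative only; the real API also carries DIF names and QoS parameters, as the signatures above show):

```c
#include <assert.h>
#include <string.h>

/* Toy model of registerApp/allocateFlow. Illustrative, not the real API. */
#define MAX_APPS 8

static const char *apps[MAX_APPS];   /* registered application names */
static int next_port = 1;            /* port IDs handed out to flows */

int register_app(const char *name)
{
    for (int i = 0; i < MAX_APPS; i++)
        if (!apps[i]) { apps[i] = name; return 0; }
    return -1;
}

/* allocate_flow: succeeds only if the destination name is registered,
 * and then hands back a fresh port ID (~ file descriptor). */
int allocate_flow(const char *dest)
{
    for (int i = 0; i < MAX_APPS; i++)
        if (apps[i] && strcmp(apps[i], dest) == 0)
            return next_port++;
    return -1;  /* destination unknown */
}
```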
Building networks: Normal IPCP
(diagram: systems X and Y, with IPCPs A1-A2, B1-B2, C1-C2, D1 and E1-E2 stacked in DIFs)
Normal IPC Process (IPCP):
Provides IPC to higher layers (DIFs/DAFs)
Uses IPC from lower layers (DIFs)
Building networks: IPCP registration
(diagram: IPCP D1 on system X, registered as D1/A2 in DIF A and D1/B1 in DIF B)
IPCP D1 registers in 2 DIFs (A, B)
Building networks: IPCP registration
(diagram: new IPCP D2 on system Y, registered as D2/A1 in DIF A)
Create IPCP D2, which can register in DIF A (optional)
Building networks: flow allocation
(diagram: a flow through DIF A between D2/A1 and D1/A2)
IPCP D2 allocates a flow with D1.
D2 can now send messages to D1.
Building networks: enrollment
(diagram: D1 and D2 now members of a new DIF, D)
A new operation, enrollment: “joining a DIF”
authentication
exchanging some basic information
configuration parameters
addresses
current equivalent: joining a WiFi network
Building networks: enrollment
(diagram: D1, D2 and D3 forming DIF D over the DIFs below)
D3 performs the same procedures. DIF “D” now has 3 members.
Flow allocation in normal IPCP
(diagram: flows F1 and F2 carried by DIF D members D1, D2 and D3)
Flow allocation = reservation of resources
Flows vs Connections
(diagram: flows F1 and F2 between port_ids at the layer boundary; an EFCP connection between CEP-ids inside the layer)
Flow: implemented by a lower layer, between port_ids.
Resource: an EFCP connection inside a layer, between CEP-ids.
Flow allocation = reservation of resources.
Error and Flow Control Protocol
• DTP ~ UDP
– Fragmentation
– Reassembly
– Sequencing
– Concatenation
– Separation
• DTCP ~ TCP
– Transmission control
– Retransmission control
– Flow control
• Loosely coupled by a state vector
• Based on Delta-t
Delta-t (Watson, 1981)
• Developed at Lawrence Livermore labs; a unique approach.
– Assumes all connections exist all the time.
– Keeps caches of state for the ones with recent activity.
• Watson proves that the conditions for distributed synchronization are met if and only if 3 timers are bounded:
– Maximum Packet Lifetime: MPL
– Maximum number of Retries: R
– Maximum time before ACK: A
• It follows that no explicit state synchronization (i.e. hard state) is necessary.
– SYNs and FINs are unnecessary.
• 1981: Watson shows that TCP has all three timers and more.
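Watson's bound can be written down directly. This sketch assumes the commonly quoted form delta_t = MPL + A + R, treating R as the bound on total retransmission time, and a conservative quiet time of 3 delta_t before connection state may be discarded; the exact constants vary in the literature:

```c
#include <assert.h>

/* Delta-t bound (sketch): with Maximum Packet Lifetime (MPL), maximum
 * time before ACK (A) and maximum retry time (R) all bounded, no packet
 * or acknowledgement of a connection can arrive later than
 * delta_t = MPL + A + R after it was sent. */
double delta_t(double mpl, double a, double r)
{
    return mpl + a + r;
}

/* Conservative quiet time before connection state can be discarded
 * and identifiers reused: a few delta_t intervals (3 chosen here). */
double quiet_time(double mpl, double a, double r)
{
    return 3.0 * delta_t(mpl, a, r);
}
```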
Inside the normal IPC process
(diagram: components of the normal IPC process, behind the IPC API)
Data Transfer: SDU Delimiting, Data Transfer, Relaying and Multiplexing, SDU Protection
Data Transfer Control: Transmission Control, Retransmission Control, Flow Control, coupled through state vectors
Layer Management: RIB Daemon, RIB, CDAP Parser/Generator, CACEP, Enrollment, Flow Allocation, Resource Allocation, Forwarding Table Generator, Authentication
(alongside: the Application Process with application-specific tasks, IPC Resource Management, and IPC management tasks such as SDU Protection and Multiplexing)
• Authentication of all processes
• The RIB Daemon manages state objects
• The EFCP protocol performs SDU transport
Basic concept of RINA
“Simplicity is a prerequisite for reliability” – E. Dijkstra
The lowest layer: shim DIFs
(diagram: shim IPC processes on both systems forming a distributed IPC facility; the legacy protocol provides the connection below the flow between port IDs)
Shim IPC process: wraps the IPC API around a legacy protocol, e.g. Ethernet.
Shim DIF over UDP using DNS
(message sequence between the two systems:)
Register in the shim DIF: check whether the name already exists in the shim DIF (uniqueness!)
Flow allocation: query the directory (DNS)
Data transfer
• A structure of recursive layers that provide IPC (Inter-Process Communication) services to applications on top
• There’s a single type of layer that repeats as many times as required by the network designer
• Separation of mechanism from policy
• All layers have the same functions, with different scope and range.
– Not all instances of layers may need all functions, but they don’t need more.
• A layer is a Distributed Application that performs and manages IPC.
– A Distributed IPC Facility (DIF)
• This yields a scalable architecture
Adoption path
(diagram: three protocol stacks)
Today: Applications over TCP/IP or UDP/IP, over Ethernet, over the physical media.
FP7 IRATI project prototype: Applications over a stack of DIFs, carried over TCP/IP or UDP/IP or Ethernet, over the physical media.
End goal: Applications over a stack of DIFs directly over the physical media (IoT?).
RINA: DTP + Flow Control
The Internet relies on TCP congestion control.
What would happen if the dominant traffic became UDP?
Mobility in RINA
(diagram: device X moving between LTE DIFs over a backhaul DIF, with flows F3 and F4 towards Y)
Enroll / unenroll in the LTE DIFs
Registration / unregistration of the application in DIFs, and flow allocations
No routing or address updates!
The device moves physically, but no IPCP moves logically within a DIF!
Layering in Networking – OSI vs. RINA
• From the OSI perspective: each layer performs a specific task or tasks and builds upon the preceding layer until the communications are complete.
• In the Recursive InterNetworking Architecture (RINA): every layer (called a “Distributed InterProcess Communication (IPC) Facility”, DIF) has mechanisms and policies with the goal of providing and managing the communication among its entities.
– “Mechanisms” are the same in all DIFs.
– “Policies” can be programmed differently in different DIFs.
Congestion Control in RINA vs. the Internet
FP7-619305 PRISTINE Collaborative Project, B8.2, Work Package 3
(diagram legend: processes with specific functionality vs. processes with generic functionality)
Congestion Control in the Internet
• Problems with the Internet:
– TCP scalability with:
• the diameter of the network
• the bottleneck link capacity
• the number of flows
– Different link types
– Split-TCP (PEPs):
• conflicts with IPsec and SSL
• scalability with the number of flows
• processing delay at the splitters
Congestion Control in RINA
• Our goal: highlighting the benefits of RINA Congestion Control (CC) by showing that improvements that have been made to TCP on the Internet “naturally appear” in RINA, without their side effects.
In 3 steps:
1. Inspecting a DIF and its modules,
2. Showing DIF organizations,
3. Comparative results
IPC Modules
(diagram: inside an N-DIF, between the (N+1)-DIFs above and (N-1)-DIFs A and B below: Delimiting; EFCP, split into the Data Transfer Protocol (DTP) and the Data Transfer Control Protocol (DTCP); queues; and the Relaying and Multiplexing Task. Flow aggregation is based on the QoSCube!)
Error and Flow Control Protocol (EFCP)
• Ensures reliability, ordering, and flow and congestion control.
• Each EFCP instance consists of distinct instances of
– the Data Transfer Protocol (DTP), and
– the Data Transfer Control Protocol (DTCP) (optional),
which coordinate through a state vector.
• Features:
– Window-based flow control
– Rate-based flow control
– In-order or out-of-order delivery
– ACKs: delayed, selective, with allowable gaps
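A minimal window-based flow-control sketch in the spirit of DTCP's transmission control (hypothetical structure, not the actual EFCP state vector):

```c
#include <assert.h>

/* The sender may have at most `window` unacknowledged PDUs outstanding. */
struct flow_ctrl {
    unsigned next_seq;   /* next sequence number to send  */
    unsigned last_ack;   /* highest acknowledged sequence */
    unsigned window;     /* credit granted by the receiver */
};

/* Returns 1 and consumes a sequence number if sending is allowed. */
int fc_try_send(struct flow_ctrl *fc)
{
    if (fc->next_seq - fc->last_ack >= fc->window)
        return 0;             /* window closed: wait for an ACK */
    fc->next_seq++;
    return 1;
}

void fc_on_ack(struct flow_ctrl *fc, unsigned ack)
{
    if (ack > fc->last_ack)
        fc->last_ack = ack;   /* slide the window forward */
}
```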
Data Transfer Protocol (DTP)
• Required.
• Consists of the tightly bound mechanisms found in all DIFs.
• Roughly equivalent to UDP.
• There is one instance of DTP for each flow.
• Some policies: RTT Estimator, SenderInactivity, RcvrInactivity.
Data Transfer Control Protocol (DTCP)
• Optional.
• Provides the loosely bound mechanisms.
• Each DTCP instance is paired with a DTP instance.
• Controls the flow, based on its policies and the content of the shared state vector.
• Some policies: TxControl, SenderAck, RxTimerExpiry, ECN.
Step 2: Some DIF Configurations in RINA
• Two simple RINA stack configurations obtained by different organizations of DIFs
(diagram: the same DIF building block repeated in two different arrangements)
A General DIF Configuration
“Error and Flow Control Protocol” (EFCP), “Relaying and Multiplexing Task” (RMT), “Resource Allocation” (RA).
(diagram: along the path, a queue builds up at each hop; the sender before it reduces its rate or might be blocked, hop by hop back towards the source)
This is the so-called pushback method. Other feedback types are under investigation.
Step 3: A Comparative Example of Stacks
(topology: two links, 10 Mbps with 75 ms delay and 10 Mbps with 25 ms delay)
Horizontal: Consecutive DIFs
From P. Teymoori, M. Welzl, S. Gjessing, E. Grasa, R. Riggio, K. Rausch, D. Siracusa: "Congestion Control in the Recursive InterNetworking Architecture (RINA)", IEEE ICC 2016, Kuala Lumpur, Malaysia, 23-27 May 2016.
Vertical: Stacked DIFs
Topology and results from the same publication.
1 sender, 1 receiver: the sender starts flow 1 (large) at time 0 and flow 2 (small) at time 10.
(diagram annotation on both Scenario 1 and Scenario 2: “There is only one flow carrying packets here”)
Around: In-Network Resource Pooling
• A follow-up to: I. Psaras, L. Saino, and G. Pavlou, "Revisiting Resource Pooling: The Case for In-Network Resource Sharing", Proceedings of the 13th ACM Workshop on Hot Topics in Networks, ACM, 2014.
• Easily implementable in RINA by an RMT routing policy.
(topology: senders S1 and S2 reach receivers R1 and R2 through Router1-Router4 over links of 10, 3 and 2 Mbps; goal: local stability, global fairness (1:1))
Result: Jain’s fairness index for the two flows was 0.999, which shows global fairness, while local stability was provided through RINA-ACC.
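The reported figure uses Jain's fairness index, J = (sum of x_i)^2 / (n * sum of x_i^2), which is 1.0 for perfectly equal rates:

```c
#include <assert.h>

/* Jain's fairness index over n throughput values.
 * J = 1.0 means perfectly fair; J = 1/n means one flow takes all. */
double jain_index(const double *x, int n)
{
    double sum = 0.0, sumsq = 0.0;
    for (int i = 0; i < n; i++) {
        sum += x[i];
        sumsq += x[i] * x[i];
    }
    return (sum * sum) / (n * sumsq);
}
```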
Discussion
• RINA can solve the Internet’s problems by
– breaking up the long control loop into shorter ones,
• link-specific congestion control
– controlling flow aggregates inside the network, and
• less competition among flows
– enabling the deployment of arbitrary congestion control mechanisms per DIF,
• coupled with the policies in other modules.
• And of course, there are other questions that need to be answered before a full deployment.
Large-scale RINA Experimentation on FIRE+
RINA for converged operator networks in 5G scenarios
Leonardo Bergesio, Eduard Grasa – i2CAT
June 27th 2016, Athens

An Operator Network today and the converged vision
A (C)ON today…
• Different access technologies use dedicated aggregation and backbone network segments, resulting in poor utilization of the infrastructure.
• Users will have different subscriber profiles (mobile vs. fixed).
• The provider must deploy duplicated service platforms (mobile vs. fixed).
• Service continuity is challenging when transitioning between types of access.
• Management and deployment of new services is cumbersome.
A CON vision…
• Any access media, any application requirement, supported by a common network infrastructure
• Single architecture, single management system, single user database (regardless of access)
(diagram: devices, places and users reach regional data centres through Access, Aggregation, Local Points of Presence and Core segments, over radio and fiber; functions range from capillarity, capacity and mobility support, through multiplexing, switching and transport, to user/session management, control functions and local/regional managed services)
The all-IP approach
Current protocol stack limitations
(diagram: the xDSL, FTTH, WiFi and 4G access protocol stacks)
Building a CON today…
(diagram: access networks feed metropolitan aggregation segments (Carrier Ethernet, MPLS, …) with micro and metro DCs; an IP/MPLS core/backbone connects regional and national DCs; an Internet border provides private peering with other operators or Internet transit towards an Internet eXchange Point (IXP); an IP eXchange border connects to the IPX network for IMS traffic)
• Micro DC: mobile-edge computing, C-RAN
• Metro/Regional/National DCs (functions split according to the degree of distribution desired by the provider): service platforms (DNS, SMTP, etc.), LTE EPC (S-GW, P-GW, MME), IMS, cloud hosting, user accounts, billing, NOC, etc.
Are “all-IP networks” fit for this purpose?
• The computer networking & telecom industry has been steadily moving towards an “all IP” world.
– Is “all-IP convergence” a simple, scalable, robust, manageable, performing and future-proof solution for all types of computer networks?
• It could be, if
– the “IP protocol suite” had been designed with generality in mind, allowing its protocols to adapt to specific network environments, and
– the “IP protocol suite” were well known for having no scalability, performance or security issues.
The RINA approach
And its benefits
There is a better approach: RINA
1. A network architecture resulting from a fundamental theory of computer networking.
2. Networking is InterProcess Communication (IPC) and only IPC. It unifies networking and distributed computing: the network is a distributed application that provides IPC.
3. There is a single type of layer with programmable functions that repeats as many times as needed by the network designers: recursion, easier management.
4. All layers provide the same service: instances of communication (flows) to two or more application instances, with certain characteristics (delay, loss, in-order delivery, etc.): programmability, a limited number of protocols + policies.
5. There are only 3 types of systems: hosts, interior routers and border routers. No middleboxes (firewalls, NATs, etc.) are needed.
6. Deploy it over, under and next to current networking technologies.
“IP protocol suite” macro-structure
• Functional layers organized for modularity; each layer provides a different service to the layer above.
– As the reference model is applied to the real world, it proves to be incomplete. As a consequence, new layers are patched into the reference model as needed (layers 2.5, VLANs, VPNs, virtual network overlays, tunnels, MAC-in-MAC, etc.).
(diagram: the reference model in theory vs. in practice)
Network management
Commonality is the key to effective network management.
• Commonality and consistency in RINA greatly simplify management models, opening the door to increased automation in multi-layer networks.
– Reduced opex and network downtime, faster network service delivery, fewer components that need to be standardised.
From managing a set of layers, each with its own protocols, concepts and definitions… to managing a common, repeating structure of two protocols and different policies.
Naming and addressing, mobility, routing
No need for special protocols

Name                | Indicates        | Property                              | RINA | IP
Application name    | What             | Location independent                  | Yes  | No
Node address        | Where            | Location dependent, route independent | Yes  | No
Point of Attachment | How to get there | Route dependent                       | Yes  | Yes (twice: IP, MAC)
Security: DIFs are securable containers
Secure layers instead of protocols; expose less to apps; scope.
(diagram: within a DIF, joining the DIF requires authentication and access control; allocating a flow to a destination application requires access control; sending/receiving SDUs through the N-1 DIF gets confidentiality and integrity; DIF operation is logged/audited)

RINA: a consistent security model, enforced by each layer via pluggable policies.
IP protocol suite: each protocol has its own security model/functions (IPsec, TLS, BGPsec, DNSsec, etc.).
RINA: scope as a native construct, controlled connectivity by default.
IP protocol suite: single (global) scope, connectivity to everyone by default; scope via ad-hoc means: firewalls, ACLs, VLANs, VPNs, etc.
RINA: complete naming and addressing; separation of synchronization from port allocation.
IP protocol suite: no application names; addresses exposed to applications; well-known ports.
Deployment
Clean-slate concepts but incremental deployment
• IPv6 brings very small improvements over IPv4, yet requires a clean-slate deployment (it is not compatible with IPv4).
• RINA can be deployed incrementally where it has the right incentives, and interoperate with current technologies (IP, Ethernet, MPLS, etc.):
– over IP (just like any overlay, such as VXLAN, NVGRE, GTP-U, etc.)
– below IP (just like any underlay, such as MPLS or MAC-in-MAC)
– next to IP (gateways/protocol translation, as with IPv6)
(diagram: a RINA provider network carrying both sockets applications over IP and RINA-supported applications, over IP or Ethernet or MPLS, etc., interconnecting with an IP network)
Network Programmability
• Centralized control of data forwarding
– GSMPv3 (label switches: ATM, MPLS, optical), OpenFlow (Ethernet, IP, evolving)
• APIs for controlling network services & network devices
– ONF SDN architecture, IEEE P1520 (from 1998; P1520 distinguished between virtual devices and hardware)
(figure: ONF’s SDN architecture)
Separation of mechanism from policy
(diagram: the IPC process components behind the IPC API. Data Transfer: SDU Delimiting, Data Transfer, Relaying and Multiplexing, SDU Protection. Data Transfer Control: Retransmission Control and Flow Control, coupled through state vectors. Layer Management: RIB Daemon, RIB, CDAP Parser/Generator, CACEP, Enrollment, Flow Allocation, Resource Allocation, Routing, Authentication, Namespace Management, Security Management.)
• All layers have the same mechanisms and 2 protocols (EFCP for data transfer, CDAP for layer management), programmable via policies.
– All data transfer and layer management functions are programmable!
• Don’t specify/implement protocols, only policies.
– Re-use the common layer structure, re-use policies across layers.
• This approach greatly simplifies the network structure, minimizing the management overhead and the cost of supporting new requirements, new physical media or new applications.
From all-IP to RINA
Examples exploiting recursion
Recursion instead of virtualization (I)
• The RINA recursive layering structure cleans up and generalizes the current protocol stack.
• Example 1: PBB-VPLS (Virtual Private LAN Service)
– Uses MAC-in-MAC encapsulation to isolate the provider’s core from customers’ addresses and VLANs.
(DIF stack: a Green Customer VPN DIF over the Provider VPN Service DIF, over Metro and Core DIFs, over point-to-point DIFs)
Recursion instead of virtualization (II)
• Example 2: LTE (Long Term Evolution)
– Uses PDCP and GTP to transport the user’s IP payload, and also relies on an internal IP network.
(diagram: the LTE user plane from the UE via eNodeB, S-GW and P-GW to the public Internet: PDCP/RLC/MAC/L1 on LTE-Uu, protocol conversion at the eNodeB, GTP-U over UDP over the internal LTE transport IP network on S1-U and S5/S8, EPS bearers end to end, SGi towards the Internet)
(RINA equivalent: a Public Internet DIF over a Mobile Access Network Top Level DIF, over a multi-access radio DIF and Mobile Operator Transport DIFs, over point-to-point DIFs)
Recursion instead of virtualization (III)
• Example 3: Data Center Network with NVO3
– Network Virtualization Over Layer 3 uses overlay virtual networks on top of the DCN’s layer-3 fabric to support multi-tenancy.
• Recursion provides a cleaner, simpler solution than virtualization:
– repeat the same building block, with the same interface.
(diagram: VM-to-VM traffic over TCP, UDP or SCTP at the transport layer, a tenant IPv4/IPv6 overlay, VXLAN over UDP over the IPv4/IPv6 fabric layer, and 802.1Q/802.3 Ethernet between servers, ToR, fabric and spine switches, with protocol conversion and local bridging; RINA equivalent: a Tenant DIF over a DC Fabric DIF over point-to-point DIFs)
A Service Provider Network design with RINA
Service provider, RINA, Internet (e-mall), wired access
(diagram: from a host in the customer network through the CPE and access router in the city A cabinets, the edge service router and MAN provider-edge and interior routers in the city A MAN, and core routers in the core PoPs of cities A and B, to an e-mall border router at the Internet (e-mall) eXchange Point. DIFs, bottom to top: point-to-point DIFs, MAN Access and MAN Core DIFs, a Core Backbone DIF, the Service Provider Top Level DIF, and E-mall1/E-mall2 DIFs.)
Service provider, RINA, Internet (e-mall), wireless access
(diagram: the same structure as the wired case, with the mobile host (or border router) attaching through a Cell DIF at the cell tower (eNodeB) and a Mobile Access DIF towards the mobile edge service router)
Example mobile network with RINA
Better design
(diagram: from the mobile host over a BS DIF (radio) through base-station interior routers and border routers, up through District, Metro and Regional DIFs to the Public Internet DIF and application-specific DIFs, each running over its under-DIFs in the operator core)
• In this example the “e-mall” DIFs (providing access to applications) are available via the regional DIF, but they could also be available through metro or district DIFs.
– Essentially, every border router can be a “mobility anchor”; no need to do anything special.
Example with 4 levels (where needed)
(diagram: BS, District, Metro and Regional DIFs across dense urban, urban and sub-urban areas)
• 4 levels of DIFs may not be needed everywhere (e.g. a suburban area may not have enough density to require a district DIF).
• If more levels are needed to scale, they can be added anywhere in the network.
RINA macro-structure (layers)
Single type of layer, consistent API, programmable policies
(diagram: applications A and B on hosts communicate across border and interior routers; DIFs nest recursively, with a consistent API through the layers)
(inside each IPC process, behind the IPC API: Data Transfer with SDU Delimiting, Data Transfer, Relaying and Multiplexing and SDU Protection; Data Transfer Control with Retransmission and Flow Control, coupled through state vectors; Layer Management with RIB Daemon, RIB, CDAP Parser/Generator, CACEP, Enrollment, Flow Allocation, Resource Allocation, Routing, Authentication, Namespace Management and Security Management. Left to right: increasing timescale (functions performed less often) and complexity.)
Radio Access DIF and District DIF
Example connectivity graphs
CELL DIF (BS = IPCP at Base Station, H = IPCP at Host): stars of hosts, including multi-homed hosts, around each base station, e.g. Cell DIF 1 and Cell DIF 2 (radio), each running over district DIFs.
DISTRICT DIF (BS = IPCP at Base Station, H = IPCP at Host, BR = IPCP at Border Router): base stations and multi-homed hosts connected to border routers over cell and metro DIFs, e.g. District DIF 1 and District DIF 2, with an e-mall DIF reachable through them.
Metro DIF and Regional DIF
Example connectivity graphs
METRO DIF (H = IPCP at Host, BR = IPCP at Border Router): many hosts, including multi-homed ones, connected through border routers, over district and regional DIFs.
REGIONAL DIF (H = IPCP at Host, BR = IPCP at Border Router): fixed metro DIFs interconnected through border routers, towards the Public Internet DIF.
Securing RINA networks
Tutorial 1: RINA: a future-proof approach towards re-architecting the infocomms protocol stack, supporting Cloud, IoT and beyond-5G requirements
Miquel Tarzan (presenter), Eduard Grasa, Ondrej Lichtner, Ondrej Rysavy, Hamid Asgari, John Day, Lou Chitkushev
FP7 PRISTINE
EuCNC 2016, Athens, June 27th 2016
Protecting layers instead of protocols
All layers have the same, consistent security model.
• Benefits of having an architecture instead of a protocol suite: the architecture tells you where security-related functions are placed.
– Instead of thinking protocol security (BGPsec, DNSsec, IPsec, TLS, etc.), think security of the architecture: no more ‘each protocol has its own security’, ‘add another protocol for security’ or ‘add another box that does security’.
(diagram: joining a DIF requires authentication and access control; operating on the IPCP’s RIB requires access control on RIB operations; sending/receiving PDUs through the N-1 DIF gets confidentiality and integrity; DIF operation is logged; access control governs DIF membership)
Separation of mechanism from policy
(diagram: the IPC process components, annotated with the security functions each hosts: authentication; access control on layer management operations; access control on joining the DIF; coordination of security functions; confidentiality and integrity)
• All layers have the same mechanisms and 2 protocols (EFCP for data transfer, CDAP for layer management), programmable via policies.
• Don’t specify/implement security protocols, only security policies.
– Re-use the common layer structure, re-use security policies across layers.
• This approach greatly simplifies the network structure, minimizing the cost of security and improving the security level.
– “Complexity is the worst enemy of security” (B. Schneier)
Source: J. Small’s master’s thesis
Separating port allocation from synchronization
Complete application naming
• RINA: with application names, there is no need for well-known ports. Port-ids have local scope (they are not carried in protocol headers).
• CEP-ids (carried in protocol headers) are dynamically generated for each flow.
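The split can be sketched as follows (illustrative values; a real IPCP would draw cep-ids from a cryptographically strong generator):

```c
#include <assert.h>

/* The port-id is a local handle returned to the application; the
 * connection endpoint id (cep-id) is generated per flow and is the
 * only identifier carried in EFCP headers. Illustrative only. */
struct flow { int port_id; unsigned cep_id; };

static int next_port = 1;
static unsigned seed = 12345u;

/* Simple deterministic PRNG standing in for dynamic cep-id choice. */
static unsigned next_cep(void)
{
    seed = seed * 1103515245u + 12345u;
    return seed;
}

struct flow flow_allocate(void)
{
    struct flow f = { next_port++, next_cep() };
    return f;
}
```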
(diagram: in RINA, applications read/write on local port-ids while EFCP instances use cep-ids such as 8736 and 9123 in headers; in TCP/IP, the same number, e.g. port 12 or 78, serves as both the local handle and the on-the-wire identifier)
• TCP/IP: well-known, statically assigned ports are used to identify application endpoints, and addresses are exposed to applications.
• Ports are also used to identify TCP instances (in protocol headers), so an attacker only needs to guess the source port-id.
Scope as a native construct
Recursion provides isolation
• Size each DIF to the scope its supported applications need.
– Only allow in those that really need to connect to the apps.
• No need for extra tools to do that: scope is built in.
– DIFs are securable containers; no need for firewalls.

Internet (TCP/IP): the default model is global connectivity; scope is controlled via firewalls, ACLs, VLANs, Virtual Private Networks, etc.
RINA: controlled connectivity by default; scope is a native concept in the architecture (the DIF). Example: a provider’s internal network layers are hidden from customers and other providers.
CDAP + access control for layer management
• There’s only one application protocol, CDAP.
• How are layers managed?
– Through access control and operations applied to objects in the RIB.
• With access control, just one protocol suffices.
– No need for one protocol per function (routing, …).
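A toy RIB illustrating the idea that one protocol plus access control suffices: layer management becomes access-controlled operations on named objects (the object names and the access model here are invented for illustration):

```c
#include <assert.h>
#include <string.h>

#define RIB_SIZE 8

struct rib_obj { const char *name; int value; int writable_by_peer; };
static struct rib_obj rib[RIB_SIZE];
static int rib_count;

void rib_create(const char *name, int value, int writable_by_peer)
{
    if (rib_count < RIB_SIZE)
        rib[rib_count++] = (struct rib_obj){ name, value, writable_by_peer };
}

/* A peer's CDAP-like WRITE succeeds only if access control allows it. */
int rib_write(const char *name, int value)
{
    for (int i = 0; i < rib_count; i++)
        if (strcmp(rib[i].name, name) == 0) {
            if (!rib[i].writable_by_peer)
                return -1;           /* access denied */
            rib[i].value = value;
            return 0;
        }
    return -1;                       /* no such object */
}
```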
Authentication and SDU protection policies
[Figure: customer and provider networks – P2P DIFs between interior and border routers; Provider 1 Regional and Backbone DIFs; an Access DIF towards the customer network; a Multi-provider DIF spanning Provider 1 and Provider 2 networks; IPCPs A, B, C, D along the path.]
• DIFs are securable containers; the strength of the authentication and SDU Protection policies depends on the DIF's operational environment
• DIFs shared between provider and customer (blue DIF) may require strong authentication and encryption, especially if operating over wireless (red DIF)
• DIFs internal to a provider may do without authentication: accessing the DIF requires physically compromising the provider's assets (green and orange DIFs)
Authentication policy: SSH2-based (I)
• Once applications (including IPCPs) have a flow allocated, they go
through the application connection establishment phase
– Negotiate application protocol (CDAP) version and RIB version, and authenticate
• The specified authentication policy is based on SSH2 authentication
(using per-IPCP public/private RSA key pairs), adapted to the
RINA environment
Crypto SDU protection policy
• A crypto policy that encrypts/decrypts the PCI and payload of
EFCP PDUs
– In general, SDU protection is used by a DIF to protect its own data
(the PCIs of data transfer PDUs and full layer management PDUs)
• Not assuming the N-1 DIF will provide reliable, in-order
delivery -> using counter mode (as in IPsec)
– AES128 and AES256 as supported encryption algorithms
• An HMAC code protects the integrity of the PDU
– SHA256 chosen as the hash algorithm
[Figure: two IPCPs exchanging PDUs over an N-1 flow; SDU Protection on each side frames each PDU as counter | encrypted data | HMAC.]
@ictpristine Athens, 27th June 2016 3
Introduction
How to install?
Where to get it?
Documentation and useful links.
Requirements
OMNeT++ discrete event simulator
Windows, Linux, FreeBSD environment
Free for non-commercial purposes
Works with OMNeT++ versions 4.4, 4.5 and 4.6
work in progress on 5.0
C++ for implementation, NED for description
C++11 with gcc 4.9.2
No other libraries or frameworks needed
Potential cooperation with INET framework
Installation
Out-of-the-box
VM (http://nes.fit.vutbr.cz/ivesely/vm/RINASim.zip)
Windows
1) Download OMNeT++
http://www.omnetpp.org/omnetpp
2) ./configure && make
http://omnetpp.org/doc/omnetpp/InstallGuide.pdf
3) Download RINASim
https://github.com/kvetak/RINA/archive/master.zip
4) Import RINASim project
Languages
NED
to define models and interconnections
*.ned
C++
to implement model behavior
*.h and *.cc
Message definition
to generate C++ message classes easily
*.msg
Design
Split between mechanism and policy
Interface-like modules
Simulations allow parameters to be changed
Statically preconfigured
NED parameters in omnetpp.ini
config.xml
On-the-fly
DAF Components
Application Process
contains AE(s)
manages DAF
enrollment
data and mgmt flows
DAF Components
IPC Resource Manager
interconnects APs with IPCs
passes messages from
applications to DIFs
DIF Allocator
Maintains naming and
addressing info
Knows via which IPCP an
AP/IPCP is reachable
Common Distributed Application Protocol
Simulation module used by
AE and RIBd
CDAP
Sends/Receives CDAP
messages
CDAPSplitter
Delegates CDAP message to
appropriate module
CDAPMsgLog
Statistic collector
CACE + AUTH
Used by Enrollment or during
authentication phase
Flow Allocator
Manages flow lifecycle
FA
Core functionality
FAI_portId_cepId
Instance
NFlowTable
Information about all (N)-DIF
flows
NewFlowRequestPolicy
Score or Min compare
AllocateRetryPolicy
Triggered when the retry threshold is reached
QoSComparerPolicy
For multi QoSCube routing
purposes
Error and Flow Control Protocol
EFCP
Manages EFCP instances
EFCPTable
Table of known EFCPIs
Delimiting_portId
Creates SDUs from incoming PDUs
EFCPI_cepId
Provides DTP and DTCP services
MockEFCPI
Provides unreliable communication for
IPCP management messages
Simple en/decapsulator between SDUs
and PDUs
EFCP Instance
DTP
Actual Data Transfer
DTCP
Handles Flow Control and
Retransmission
DTPState
Holds all DTP related variables
DTCPState
Holds all DTCP related variables
EFCP policies
Triggered during various DTP states
Resource Allocator
Provides access to
(N-1)-DIFs and their resources
RA
Core functionality
Manages IPCP’s QoSCubes
NM1FlowTable
Information about current (N-1)-flows
PDUFwdGenerator
Forwarding information management
QueueAllocPolicy
How and when should RMT queues be
allocated?
QueueIdGenerator
In which RMT queue should a PDU be
stored?
AddressComparator
Syntax and comparison of addresses
Relaying and Multiplexing Task
Relays incoming/outgoing PDUs to their proper destination (either an EFCP
instance or an (N-1)-flow)
RMT
The core PDU forwarder
SchedulingPolicy
When a PDU needs to be sent/received, which queue should it be taken from?
QueueMonitorPolicy
Keeping information about port/queue states
MaxQueuePolicy
What should happen to a queue when it overflows?
PDUForwardingPolicy
Where should a PDU be relayed, based on a given header?
Routing
The policy computing
optimal paths to other
destinations by given
metrics
Usually some sort of
routing algorithm
exchanging information
with other members of a
DIF
Interactive Demo
How does IPC work between two hosts
connected to a common node?
HostA – Switch – HostB
Cookbook
Topology
2× host with single AP
1× interior router
2× datarate channels between them
Task
1) Setup network
2) Schedule simulation
3) Run
Goal
To observe IPC between two hosts interconnected by an
interior router
1) Setup network
Create new simulation in folder
examples/Athens/Demo
1) Setup network
Open Demo.ned, add two Host1AP modules onto the
canvas and one InteriorRouter2Int
Rename them with F6
Connect them with DatarateChannel
Notable Events
t=5
hostA enrolls to Layer01 and Layer11
t=10
hostA creates flows for AP communication
t=15
SourceA and DestinationB apps exchange ping
messages
t=20
hostA deallocates Layer11 flow
Conclusion
RINASim
Educational tool
A way to visualize what is happening in a native
RINA network
Helps flatten the learning curve
Research tool
http://ict-pristine.eu/?page_id=35
The IRATI stack
A programmable RINA implementation for Linux OS
27th June 2016
Vincenzo Maffione, Nextworks
Pre-IRATI prototypes implementing RINA
● ProtoRINA (https://github.com/ProtoRINA/users/wiki)
● Alba (closed source)
High level design choices:
● Focus on validation of the architecture
● Completely user-space implementations → written in Java
● No direct access to Network Interface Cards (NICs) → only run over sockets
Consequences:
● Limited deployability for real world scenarios
● Limited performance
DIF components for a complete RINA stack
[Figure: IPC Process components over the IPC API – Data Transfer (SDU Delimiting, Data Transfer with per-connection State Vectors, Relaying and Multiplexing, SDU Protection), Data Transfer Control (Retransmission Control, Flow Control) and Layer Management (RIB Daemon, RIB, CDAP Parser/Generator, CACEP, Authentication, Enrollment, Flow Allocation, Resource Allocation, Routing, Namespace Management, Security Management), all within a System (Host). Timescale increases towards Layer Management (functions performed less often).]
About IPC Processes
There are two categories:
● Normal IPC Processes
○ Have RINA-compatible Northbound/Southbound
interfaces
○ Implement all the DIF functionalities
● Shim IPC Processes
○ Have a RINA-compatible Northbound interface
■ The IPC API
○ Wrap the legacy transport technology they lay over
■ Ethernet (802.1q)
■ TCP/IP
■ Hypervisor shared-memory mechanisms
[Figure: a system with a Shim IPC Process at rank 0 over the hardware, and (Normal) IPC Processes stacked at ranks 1 and 2.]
The IRATI stack
Open source RINA implementation for Linux OS, available at https://github.com/irati/stack.
Developed in 2013-2014 within the FP7-IRATI project, with the following goals:
● Implementation for a UNIX-like OS of all the basic DIF functionalities (Flow allocation,
Enrollment, Routing, Data Transfer, Data Transfer Control, etc.) from scratch
● Support to run over Ethernet (802.1q)
● Support to run inside Virtual Machines, using I/O paravirtualization
● Support to run over the TCP/IP traditional network stack (through socket API)
● Provide a solid baseline for further RINA research work
IRATI functionalities splitting (2)
Split user-space functionalities in different daemon processes:
● A separate IPC Process Daemon to implement layer
management functionalities of each IPC Process
● An IPC Manager Daemon to coordinate layer
management among applications and IPC Process
daemons
Rationale: each process runs in a different container → a more
reliable solution that minimizes interference in case of problems
[Figure: user space hosts applications 1..N, one IPC Process Daemon per IPC Process, and the IPC Manager Daemon, all above the kernel.]
Communication among components
Two mechanisms:
● System calls:
○ Bootstrapping: create/destroy the kernel-path IPC Processes
○ Used by applications and IPC Process Daemons to send and
receive SDUs (data-path)
○ User-space-originated
● Netlink:
○ A bus-like mechanism for IPC
○ A Linux standard for applications controlling the network stack
○ Control messages exchanged between
■ Applications ← → IPC Manager
■ IPC Manager ← → IPC Process Daemons
■ IPC Process Daemons ← → kernel
○ Messages are originated by user-space or kernel-space
IRATI design decisions
● Linux OS vs other operating systems — Pros: adoption, community, stability, documentation, support. Cons: monolithic kernel (the RINA/IPC model may be better suited to micro-kernels).
● User/kernel split vs user-space only — Pros: IPC as a fundamental OS service, access to device drivers, hardware offload, IP over RINA, performance. Cons: more complex implementation and debugging.
● C/C++ vs Java, Python, … — Pros: native implementation, performance. Cons: less portability.
● Multiple user-space daemons vs a single one — Pros: reliability, isolation between IPCPs and the IPC Manager. Cons: communication overhead, more complex implementation.
● Soft-irqs/tasklets vs workqueues (kernel) — Pros: minimize latency and context switches of data going through the “stack”. Cons: more complex kernel locking and debugging.
IRATI kernel-space architecture (1)
KIPCM → manages the syscalls:
● IPCP management:
○ ipc_create
○ ipc_destroy
● Flow management:
○ allocate_port
○ deallocate_port
● SDU I/O (fast-path):
○ sdu_read
○ sdu_write
○ mgmt_sdu_read
○ mgmt_sdu_write
[Figure: kernel-space architecture – the Kernel IPC Manager (KIPCM) and Kernel Flow Allocator (KFA) behind the syscall/Netlink API towards user space, IPCP factories, the normal IPC Process datapath core (EFCP, Relaying and Multiplexing Task, PDU forwarding function, SDU protection), Netlink support, and shims over 802.1q (with RINA-ARP), hypervisors and TCP/UDP.]
IRATI kernel-space architecture (2)
Netlink support:
● Abstracts message reception,
sending, parsing & crafting
● 40+ message types (control-path):
○ assign_to_dif_req
○ assign_to_dif_resp
○ register_app
○ unregister_app
○ ...
IRATI kernel-space architecture (3)
KIPCM:
● Counterpart of the IPC Manager
● Manages the lifecycle of the IPC
Processes and the KFA
○ ipcp-id → ipcp-instance
● Same API for all the IPC
Processes regardless of the type →
abstraction
KFA:
● Counterpart of the Flow Allocator
● Manages ports and flows
○ port-id → ipcp-instance
Kernel IPC Process factories
Different IPC Process types:
● The Northbound interface is the same
● Each IPC Process implements its “core” code:
● Shim IPC Processes:
● Each type provides a different implementation
● Normal IPC Processes:
● A programmable implementation for all of them
IPC Process factories:
● Abstract factory design pattern
● Used by IPC Process modules to publish/unpublish their availability into the system
IPC Process kernel API
● The IPC Process kernel API is the same for all IPC Processes
● Each type decides which operations it will support
● Some are specific to normal IPCPs or to shims, others are common to both
ipcp_ops →
● .connection_create = normal_connection_create
● .connection_update = normal_connection_update
● .connection_destroy = normal_connection_destroy
● .connection_create_arrived = normal_connection_arrived
● .pft_add = normal_pft_add
● .pft_remove = normal_pft_remove
● .pft_dump = normal_pft_dump
● .application_register = shim_application_register
● .application_unregister = shim_application_unregister
● .assign_to_dif = shim_assign_to_dif
● .sdu_write = shim_sdu_write
● .flow_allocate_request = shim_allocate_request
● .flow_allocate_response = shim_allocate_response
● .flow_deallocate = shim_deallocate
The kernel normal IPC Process (1)
Contains an EFCP container and a Relaying and Multiplexing Task
EFCP Container
● Multiple EFCP instances (one per connection)
● Mux/Demux among EFCP instances
● EFCP instance:
○ Implements Watson’s Delta-T
○ Data Transfer Protocol
○ Data Transfer Control Protocol
■ Retransmission → RTX queue
■ Flow control → Closed Window queue
○ State Vector for DTP and DTCP to interact
[Figure: EFCP container holding multiple EFCP instances, each with DTP, DTCP and a shared State Vector, below the IPC Process API interface and above the Relaying and Multiplexing Task (RMT).]
The kernel normal IPC Process (2)
Contains an EFCP container and a Relaying and Multiplexing Task
RMT
● Ingress queues
○ For packets arriving from N-1 flows
● Egress queues
○ For packets to be transmitted to N-1 flows
● Schedules transmission among queues
○ Scheduling algorithm is policy → programmability
● Accesses the PDU Forwarding Table (PFT)
○ Locally generated (EFCP-I) packets
○ Foreign packets to be forwarded
IRATI Shim IPC Processes
● Implemented entirely in kernel-space → easy access to I/O devices
● Lowest part of the IRATI stack → they wrap a legacy transport technology
● Currently 3 shims available:
● shim-eth-vlan:
● Runs over 802.1Q (Ethernet with VLAN tagging)
● shim-hv:
● Targets hypervisor-based environments (QEMU-KVM and Xen)
● Allows removing unnecessary layering commonly used in traditional VM networking environments (e.g.
software bridges, virtual-NICs), with two advantages:
● Increased performance
● Reduced maintenance costs
● shim-tcp-udp
● Runs RINA over traditional TCP or UDP sockets
● Allows for interoperability over existing IP networks.
Shim-eth-vlan architecture
[Figure: the Shim IPC Process over 802.1Q sits in the kernel between the KIPCM/KFA (which expose the RINA IPC API towards the IPC Process Daemon and IPC Manager Daemon in user space) and the devices layer; it uses RINARP (rinarp_add, rinarp_remove, rinarp_resolve) for address resolution, dev_queue_xmit for transmission and shim_eth_rcv for reception, with shim_eth_create/shim_eth_destroy for instance lifecycle.]
Packet workflow example
[Figure: TX and RX paths – an SDU written from user space passes through the KIPCM/KFA into a Normal IPC Process instance (EFCP instance with DTP/DTCP and the DT state, then RMT queues consulting the PDU forwarding table), possibly through further Normal IPCP instances, down to a Shim IPC Process instance and out over the legacy technology; reception follows the reverse path.]
librina: the IRATI user-space library
● Completely abstracts the interactions with the kernel
○ syscalls and netlink
● Provides functionalities to applications and IRATI daemons
● More a middleware than a library
○ Explicit memory allocation → no garbage collection
○ Event-based
○ Multi-threaded
○ Built from scratch in C++ → STL only
○ Design patterns → singletons, observers, factories, reactors
○ Concurrency → threads, mutexes, semaphores, condition variables
IRATI user-space architecture (2)
● IPC Manager Daemon
● Manages the IPC Processes lifecycle
● Broker between applications and IPC Processes
● Local management agent, to interact with a remote DIF Management System
● DIF Allocator client to search for applications - possibly not available through local DIFs
[Figure: user-space architecture – applications A, B and C each link librina and talk to the kernel via system calls and Netlink sockets; the IPC Manager (main logic, RIB & RIB Daemon, management agent, DIF Allocator) and the Normal IPC Process layer-management daemon (RIB & RIB Daemon, resource allocation, flow allocation, enrollment, PDU Forwarding Table Generation) additionally use Netlink sockets and sysfs.]
IRATI user-space architecture (3)
● IPC Process Daemon
● Layer Management components (RIB Daemon, RIB, CDAP parsers/generators, CACEP, Enrollment, Flow
Allocation, Resource Allocation, Routing, PDU Forwarding Table Generation, Security Management)
IPC Manager Daemon
[Figure: IPC Manager Daemon (C++) – a main event loop blocks on EventProducer.eventWait() and dispatches to the core classes (IPC Process Manager, Flow Manager, Application Registration Manager), which call into librina's IPC Process Factory, IPC Process and Application Manager abstractions via message, event and model classes (system calls and Netlink messages underneath); a console thread serves CLI sessions over a local TCP connection, and a Bootstrapper reads the configuration file into configuration classes.]
IPC Process Daemon
[Figure: IPC Process Daemon (C++) over librina (C++) – a CDAP message reader thread blocks on KernelIPCProcess.readMgmtSDU() and delivers messages to RIBDaemon.cdapMessageReceived(); the main event loop blocks on EventProducer.eventWait(); layer management function classes (Enrollment Task, Flow Allocator, Resource Allocator, Forwarding Table Generator, Registration Manager) operate on the Resource Information Base (RIB) through the RIB Daemon, call the IPCManager or KernelIPCProcess, and reply via RIBDaemon.sendCDAPMessage() / KernelIPCProcess.writeMgmtSDU(); supporting classes include the Delimiter, Encoder and CDAP parser.]
Example workflow: Flow allocation
An application requests a flow to another application, without specifying what DIF to use
Application A
Kernel
User space
IPC Manager
Daemon
IPC Process
Daemon
1. Allocate Flow Request
(NL)
2. Check app permissions
3. Decide what DIF to use
4. Forward request to adequate IPC Process Daemon
5. Allocate Flow Request (NL)
6. Request port-id (syscall)
7. Create connection request (NL)
8. On create connection response (NL),
write CDAP message to N-1 port (syscall)
9. On getting an incoming CDAP message
response (syscall), update connection (NL)
10. On getting update connection
response (NL) reply to IPC Manager (NL)
11. Allocate Flow Request Result (NL)
12. Forward response to app
13. Allocate Flow Request
Result (NL)
14. Read data from the flow
(syscall) or write data to the
flow (syscall)
Extending IRATI for programmability
● Original IRATI stack had hardwired policies
● FP7-PRISTINE extends IRATI with a Software Development Kit (SDK)
○ Allows extension modules to be plugged in/out at run-time → Dynamic code loading
○ Define public APIs for each component of the normal IPC Process
■ To be used by plugins
○ SDK is implemented by the RINA Plugin Infrastructure (RPI):
■ User-space RPI (uRPI) to manage plugins for user-space components
■ Kernel-space RPI (kRPI) to manage plugins for kernel-space components
○ IPC Manager Daemon holds the catalog of plugins/policies installed on the system
● Added management agent subsystem in the IPC Manager daemon
○ Allow management actions from remote DIF Management System
Policy-set concept
● Policy-set = The set of all policies defined on a single component of the software
architecture
○ A different policy set for each DIF component
○ Rationale: different “behavioural” policies in the same component can cooperate (share state)
in a plugin-specific way.
● Two types of policies (software-wise):
○ parameters, e.g. A-timer value for DTCP, MaxQLength for RMT queues
○ behaviours, e.g. SchedulingPolicy for RMT, NewFlowAccessControl for the Security Manager
IRATI with programmability support
[Figure: policies plugged into the normal IPCP. Data transfer side (EFCP, RMT, SDU Protection, …): RTT computation, transmission control, ECN, TTL, CRC, encryption, forwarding, scheduling, MaxQ, monitoring and pushback-notify policies. Layer management side (librina, RIB & RIB Daemon, resource allocation, flow allocation, enrollment, namespace management, security management, routing): address assignment, address validation, directory replication, enrollment sequence, routing, new flow, PFT generation, authentication, accounting-control and coordination-control policies.]
RINA kernel-space plugin infrastructure (kRPI)
● Plugins are Loadable Kernel Modules (LKM)
○ They publish a list of policy-sets, which are made available to the IRATI stack.
● Factories, named after each policy set, provide operations to create/delete instances of policy set classes
● Different policy-set class per component, since each
component has different policies.
● “OO” approach
○ All policy set classes derive from a base class
○ All components derive from a base class
RINA user-space plugin infrastructure (uRPI)
● Same concepts as kRPI (factories, lifecycle, policy classes), different implementation
● Plugins are shared objects dynamically loaded by the IPCP Daemon through the libdl library
Code status
● Sources are partitioned into four different packages
○ rinad: provides the IPC Process & IPC Manager daemons (user-space parts)
■ Depends on librina
○ rina-tools: provides the rina-echo-time application
■ Depends on librina
○ librina: user-space libraries
■ Depends on the IRATI modified Linux kernel
○ linux: the Linux sources enhanced with RINA functionalities (kernel-space parts)
■ Sources almost entirely confined to net/rina, to allow easier upgrades
● The build system for librina, rinad and rina-tools is based on autotools
librina sublibraries
● librina-application
○ Provides the APIs that allow an application to use RINA natively
■ allocate and deallocate flows
■ read and write SDUs on those flows
■ register/unregister to one or more DIFs
● librina-ipc-manager
○ Provides the APIs that allow the IPC Manager to perform the tasks related to IPC Process management
■ creation, deletion and configuration of IPC Processes
● librina-ipc-process
○ Provides APIs that allow an IPC Process to
■ configure the PDU forwarding table
■ create and delete EFCP instances
■ request the allocation of kernel resources to support a flow
● librina-cdap
○ Implementation of the CDAP protocol