
An article about the future of networks with references to various mesh network routing protocols

Last Update:2015-05-20
Version:001
Language:en


Routers, supplied by companies such as Cisco, Huawei, Juniper, etc., are the cornerstone of Enterprise networks and a possible intrusion vector for stealing trade secrets in a context of fierce international competition. It is said that, some 15 years ago, the German government financed the GPG open source project after it discovered that the US government had been spying on its diplomatic delegation at the United Nations (UN) through backdoors present in a US-made router [1]. Cisco routers are likely to include backdoors [2]. Huawei routers are banned in certain countries for similar reasons [3]. And Alcatel routers do not seem to be exempt from backdoors either [4].

15 years after rumours of router intrusions emerged, Edward Snowden's revelations have provided evidence of the existence of backdoors in telecommunication equipment. Yet companies, governments and the military still rely on routers for very sensitive infrastructure and thus expose themselves to remote intrusion and trade secret theft. Free Software routers such as the Linux Router Project [5] are a good solution to eliminate backdoors because most of their code can be audited. But they failed to scale and match carrier-grade reliability due to the limitations of the PC architecture, the absence of open source drivers for high performance network cards, or the difficulty of hiring developers who understand the Linux kernel network stack. Lost Oasis, an independent French telecommunication company [6] which used to be a major user of the Linux Router Project in the early 2000s, now relies on proprietary routers for its backbone network.

However, one alternative has not yet been fully considered for Enterprise networks: mesh topology.

Most enterprise networks are centralized. They are based on a so-called hierarchical topology, where a central high performance router acts as border gateway to the outside world and aggregates network traffic from smaller routers in charge of each region or department of an organisation. This central router has to provide extremely high routing performance, something that only specialized hardware can achieve.

But if we look in detail, nothing except maybe access control rules actually requires a centralized networking architecture. Network routing is often compared to car traffic management. There is for example no need for cars to pass through Beijing in order to travel from Shanghai to Shenzhen. It is the same for networks: there is no need for network packets to go through a central router at the headquarters of a company in order to travel from one corporate department to another. With a good car navigation technology, such as Amap [7] or Google Maps [8], car drivers can even know in real time which small roads can be used to circumvent congestion on highways. It is the same for networks: modern routing protocols can automatically circumvent network delays on congested routers and find a faster path at any time.

Thanks to advances in routing protocols such as babel [9] or OLSR [10], it is now possible to design an Enterprise network based on the above metaphor of roads and car navigation. Every network cable or wireless network acts like a road for network packets. Every PC and every smartphone acts like a crossing. Every PC and every smartphone embeds a routing service that acts like a car navigation system by tracking the speed of the network traffic on each cable or wireless network. Network packets that are sent from one part of the company to another can thus find the most efficient route at any time. If a network cable required to access a server is always congested, adding a second cable will likely solve the congestion, just like adding a second access road can solve congestion on the way to a popular exhibition center. By advising packets to take the least congested route at any time, traffic gets automatically split between the two cables (for a server) or between the two roads (for an exhibition center).
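The "car navigation" behaviour described above can be sketched with a toy distance-vector computation. This is only an illustration, not the actual babel or OLSR implementation: the node names, the link costs (imagined here as measured delays in milliseconds) and the function names are all assumptions made for the example.

```python
import math

# Hypothetical link costs (e.g. measured delay in ms). All names are illustrative.
LINKS = {
    ("office", "hq"): 5,
    ("hq", "server"): 40,   # the only cable to the server, and it is congested
    ("office", "lab"): 8,
}

def adjacency(links):
    """Build a symmetric map {node: {neighbour: cost}} from the link list."""
    adj = {}
    for (a, b), cost in links.items():
        adj.setdefault(a, {})[b] = cost
        adj.setdefault(b, {})[a] = cost
    return adj

def routes_to(dest, links):
    """Distance-vector (Bellman-Ford) relaxation towards one destination.

    Returns {node: (total_cost, next_hop)}.
    """
    adj = adjacency(links)
    table = {node: (math.inf, None) for node in adj}
    table[dest] = (0, dest)
    for _ in range(len(adj) - 1):        # |V| - 1 rounds are always enough
        changed = False
        for node, nbrs in adj.items():
            for nbr, cost in nbrs.items():
                candidate = table[nbr][0] + cost
                if candidate < table[node][0]:
                    table[node] = (candidate, nbr)
                    changed = True
        if not changed:
            break
    return table

before = routes_to("server", LINKS)
print(before["office"])    # (45, 'hq'): everything goes through the congested cable

# "Adding a second cable": a new, uncongested link between the lab and the server.
with_second_cable = dict(LINKS)
with_second_cable[("lab", "server")] = 6
after = routes_to("server", with_second_cable)
print(after["office"])     # (14, 'lab'): traffic now avoids the congested cable
```

A real protocol recomputes such tables continuously as measured costs change, which is what lets packets follow the least congested route at any time.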

What we have just described is called a mesh network. It is known to be the most resilient form of network since it can still operate in case of partial destruction, which is not the case for hierarchical enterprise networks. Mesh networks are used primarily by the military to quickly deploy a wireless network on a battlefield. Each soldier's PC acts as a router for neighbouring soldiers. Casualties among soldiers do not affect the general availability of the network.

But mesh networks could have many civilian applications in data centers or in wide area wired networks.

Let us imagine for example a datacenter with 160 servers. Let us split the 160 servers into 32 groups of 5 servers. Each server in a group is connected to an unmanaged switch through its first network interface. The second network interface of each server is then connected to a server in another group, so that the 5 servers of a group are connected to 5 different groups. The network cables connected to the second interface of each server together form a so-called “hypercube”, a geometric structure similar to the cube (see illustration below) but in a 5-dimensional space. A total of 320 cables are used to interconnect 160 servers with a huge potential bandwidth and high resiliency: each server can access a server in another group through 5 different possible exit routes. The routing protocol (babel for example) finds at any time which of the 5 possible exit routes is the best to reach another server.
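One possible way to wire such a hypercube can be sketched as follows. The numbering scheme is an assumption made for illustration (the article does not specify one): groups are numbered 0 to 31, servers 0 to 4 within each group, and server k of a group is cabled to server k of the group whose number differs in bit k.

```python
DIMENSIONS = 5
GROUPS = 2 ** DIMENSIONS          # 32 groups, one per corner of the hypercube
SERVERS_PER_GROUP = DIMENSIONS    # 5 servers per group, one per dimension

def peer(group, server):
    """Return the (group, server) reached through a server's second interface.

    Assumed wiring: server k is cabled to server k of the group whose
    number differs from its own group number in bit k only.
    """
    return (group ^ (1 << server), server)

def neighbour_groups(group):
    """The 5 groups directly reachable from a given group."""
    return sorted(peer(group, s)[0] for s in range(SERVERS_PER_GROUP))

def hops_between(group_a, group_b):
    """Minimum number of inter-group cables to cross: the number of differing bits."""
    return bin(group_a ^ group_b).count("1")

print(GROUPS * SERVERS_PER_GROUP)   # 160 servers in total
print(neighbour_groups(0))          # [1, 2, 4, 8, 16]
print(hops_between(0, 31))          # 5, the worst case in 5 dimensions
```

With this numbering, any two groups are at most 5 inter-group cables apart, and every group has 5 distinct exit routes for the routing protocol to choose from.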

Example of application of mesh networking to data center management

Illustration - a hypercube mesh network for data center management (credit Wendelin project)

Let us now imagine a company with 1000 users of laptops and smartphones and 30 servers in 20 different countries. This company uses a combination of network technologies: optical fiber, 3G, 4G, DSL, Wi-Fi, etc. For this type of situation, we can use a structure called a “random mesh”. Each laptop, smartphone or server randomly creates 10 links to other laptops, smartphones or servers in the world. Each link uses some kind of encapsulation such as GRE. Links play here the same role as cables in the previous datacenter example. The routing protocol (babel for example) finds at any time the fastest route between two devices by combining links. With about 1000 devices and 10 links per device, this route does not usually require more than 3 successive links.
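The claim about short routes can be explored with a small simulation. This is a rough sketch under simplifying assumptions: links are chosen uniformly at random, every link counts as one hop regardless of its actual latency, and only hop counts (not the fastest route a real protocol would pick) are measured. The function names and the random seed are arbitrary choices for the example.

```python
import random
from collections import deque

def random_mesh(nodes, links_per_node, rng):
    """Each node creates `links_per_node` links to distinct random other nodes."""
    adj = {n: set() for n in range(nodes)}
    for n in range(nodes):
        for p in rng.sample(range(nodes - 1), links_per_node):
            m = p if p < n else p + 1      # skip n itself
            adj[n].add(m)
            adj[m].add(n)                  # links are usable in both directions
    return adj

def hop_counts(adj, source):
    """Breadth-first search: minimum number of links from source to every node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist

rng = random.Random(42)
mesh = random_mesh(1000, 10, rng)
sources = rng.sample(range(1000), 20)      # measure from a sample of devices
hops = [h for s in sources
          for node, h in hop_counts(mesh, s).items() if node != s]
average = sum(hops) / len(hops)
within_three = sum(h <= 3 for h in hops) / len(hops)
print(f"average hops: {average:.2f}, share within 3 links: {within_three:.1%}")
```

Because each device ends up with roughly 20 neighbours (10 links it created plus those created towards it), the number of devices reachable grows very quickly with each additional link, which is why a handful of links is enough to cross the whole mesh.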

The re6st open source project, which was initiated by my company, is an example of implementation of the “random mesh” approach based on babel. It has been used since 2013 to solve downtime problems often found in transnational deployments of online business applications for large European and Japanese companies. The configuration of routers at peering points sometimes includes errors that lead either to extremely high latency (e.g. 800 ms from Hong Kong to Hong Kong) or to connectivity loss (e.g. from Dublin to Paris via a broken router in Amsterdam). The use of re6st helps reduce latency (e.g. 100 ms from Hong Kong to Hong Kong via Singapore) or recover connectivity (e.g. from Dublin to Paris via Marseilles) by discovering alternate routes. It thus provides better online access to business applications in a multinational company without having to rely on redundant dedicated lines.

The online gaming industry in China could be another possible application of mesh networking. By creating a fully connected mesh between all gaming servers and deploying babel with re6st, it is possible to circumvent congested routes between the north and south of China, between cities or between telecommunication companies. The babel protocol was extended in 2014 to optimize routes based on low latency, which is exactly what online gamers are expecting.

Mesh networks have many other applications: telematics in the automotive industry [11, 12, 13], distributed mesh cloud [14], internet of things, smart cities, naval control systems, etc. One should however be careful about one aspect of mesh networks: security. As in any distributed system, an intrusion in one part of the system bears the risk of propagating to the whole system. Since the system is distributed, there are many more entry points than in a centralized system. Critics of distributed networking architectures often point to this risk to justify a conservative approach, but they ignore the danger of the single point of failure in hierarchical networks, which can instantly bring down the whole network.

The babel protocol provides a first solution to strengthen security: cryptographic authentication. Thanks to the efforts of a Yandex engineer in Russia, all nodes in a babel network can authenticate each other: this reduces the risk of accepting intruders [15]. re6st provides another solution: authentication of links [16]. Intruders without a valid certificate cannot create links to other nodes of a re6st network. re6st can also revoke the certificates of compromised nodes. For large corporations or distributed cloud operators, a hybrid approach combining central definition of firewall policies with distributed implementation of packet filtering rules may provide the best of both worlds.
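The general idea of nodes rejecting routing messages from intruders can be illustrated with a keyed hash (HMAC). This is a deliberately simplified stand-in that uses a shared secret key; it does not reproduce the wire format of babel or the certificate handling of re6st, and the key, message and function names are hypothetical.

```python
import hashlib
import hmac

SHARED_KEY = b"example-mesh-key"   # hypothetical secret known to legitimate nodes only

def sign(message: bytes, key: bytes = SHARED_KEY) -> bytes:
    """Compute an authentication tag for a routing message."""
    return hmac.new(key, message, hashlib.sha256).digest()

def accept(message: bytes, tag: bytes, key: bytes = SHARED_KEY) -> bool:
    """Accept a message only if its tag was produced with the shared key."""
    return hmac.compare_digest(sign(message, key), tag)

update = b"route 2001:db8::/32 metric 96"            # illustrative payload
print(accept(update, sign(update)))                  # True: sent by a legitimate node
print(accept(update, sign(update, b"wrong-key")))    # False: sender lacks the key
```

A node that does not know the key cannot forge a valid tag, so its route announcements are ignored by the rest of the mesh.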

I hope that this article will arouse your curiosity and lead you to research more about networking protocols that have made immense progress compared to the 26-year-old OSPF used by most corporations (RFC 1131, published in October 1989). There are many protocols similar to babel which are worth considering: AODV [17], batman [18], OLSR, RPL [19], etc. “Fair routing”, an algorithm that prevents malicious intruders from diverting network traffic [20], could also help solve the unresolved problem of building a truly secure network. RINA [21], a new networking protocol supported by John Day and Louis Pouzin (two pioneers who inspired the Internet), introduces an innovative approach that unifies all network protocols better than IPv6. Overall, network innovation is still alive and potentially very useful to design Enterprise networks more efficiently, as long as one tries to look beyond traditional suppliers of hardware routers.

References

Contact

Cédric Le Ninivin
cedric (dot) leninivin (at) nexedi (dot) com

Klaus Wölfel
klaus (dot) woelfel (at) nexedi (dot) com

Jean-Paul Smets
jp (at) rapid (dot) space
Jean-Paul Smets is the founder and CEO of Nexedi. After graduating in mathematics and computer science at ENS (Paris), he started his career as a civil servant at the French Ministry of Economy. He then left government to start a small company called “Nexedi” where he developed his first Free Software, an Enterprise Resource Planning (ERP) designed to manage the production of swimsuits in the not-so-warm but friendly north of France. ERP5 was born. In parallel, he led with Hartmut Pilch (FFII) the successful campaign to protect software innovation against the dangers of software patents. The campaign eventually succeeded by rallying more than 100,000 supporters and thousands of CEOs of European software companies (both open source and proprietary). The proposed directive on the patentability of computer-implemented inventions was rejected on 6 July 2005 by the European Parliament by an overwhelming majority of 648 to 14 votes, showing how small companies in Europe can together defeat the powerful lobbying of large corporations. Since then, he has helped Nexedi to grow either organically or by investing in new ventures led by bright entrepreneurs.

Ni Yan
ni (dot) yan (at) nexedi (dot) com

The Routerless Enterprise
