Symmetric Messages and Asynchronous Messages (Part 1)

Packt
05 May 2015
31 min read
In this article by Kingston Smiler. S, author of the book OpenFlow Cookbook, we describe the steps involved in sending and processing symmetric messages and asynchronous messages in the switch. It contains the following recipes:

- Sending and processing a hello message
- Sending and processing an echo request and a reply message
- Sending and processing an error message
- Sending and processing an experimenter message
- Handling a Get Asynchronous Configuration message from the controller, which is used to fetch the list of asynchronous events that will be sent from the switch
- Sending a packet-in message to the controller
- Sending a flow-removed message to the controller
- Sending a port-status message to the controller
- Sending a controller role-status message to the controller
- Sending a table-status message to the controller
- Sending a request-forward message to the controller
- Handling a packet-out message from the controller
- Handling a barrier message from the controller

Symmetric messages can be sent by both the controller and the switch without any solicitation between them. The OpenFlow switch should be able to send and process the following symmetric messages to or from the controller (error messages, however, are not processed by the switch):

- Hello message
- Echo request and echo reply message
- Error message
- Experimenter message

Asynchronous messages are sent by both the controller and the switch when there is any state change in the system. Like symmetric messages, asynchronous messages should also be sent without any solicitation between the switch and the controller. The switch should be able to send the following asynchronous messages to the controller:

- Packet-in message
- Flow-removed message
- Port-status message
- Table-status message
- Controller role-status message
- Request-forward message

Similarly, the switch should be able to receive and process the following controller-to-switch messages:

- Packet-out message
- Barrier message

The controller can program or instruct the switch to send only a subset of asynchronous messages of interest, using an asynchronous configuration message. Based on this configuration, the switch should send only that subset of asynchronous messages over the communication channel. The switch should replicate and send asynchronous messages to all controllers according to the asynchronous configuration message sent from each controller, and it should maintain this configuration information on a per communication channel basis.

Sending and processing a hello message

The OFPT_HELLO message is used by both the switch and the controller to identify and negotiate the OpenFlow version supported by both devices. Hello messages are considered part of the communication channel establishment procedure: the switch should send a hello message to the controller immediately after the TCP/TLS connection to the controller is established.

How to do it...

As hello messages are transmitted by both the switch and the controller, the switch should be able to send, receive, and process them. The following sections explain these procedures in detail.

Sending the OFPT_HELLO message

The message format used to send the hello message from the switch is as follows. It consists of the OpenFlow header followed by zero or more elements of variable size:

/* OFPT_HELLO. This message includes zero or more
 * hello elements having variable size. */
struct ofp_hello {
    struct ofp_header header;
    /* Hello element list */
    struct ofp_hello_elem_header elements[0]; /* List of elements */
};

The version field in ofp_header should be set to the highest OpenFlow protocol version supported by the switch. The elements field is optional and may contain an element definition, which takes the following TLV format:

/* Version bitmap Hello Element */
struct ofp_hello_elem_versionbitmap {
    uint16_t type;       /* OFPHET_VERSIONBITMAP. */
    uint16_t length;     /* Length in bytes of this element. */
    /* Followed by:
     *   - Exactly (length - 4) bytes containing the bitmaps, then
     *   - Exactly (length + 7)/8*8 - (length) (between 0 and 7)
     *     bytes of all-zero bytes */
    uint32_t bitmaps[0]; /* List of bitmaps - supported versions */
};

The type field should be set to OFPHET_VERSIONBITMAP, and the length field to the length of this element. The bitmaps field should be set to the list of OpenFlow versions the switch supports; the number of bitmaps included depends on the highest version number supported by the switch. ofp_versions 0 to 31 are encoded in the first bitmap, ofp_versions 32 to 63 in the second bitmap, and so on. For example, if the switch supports only version 1.0 (ofp_version = 0x01) and version 1.3 (ofp_version = 0x04), then the first bitmap should be set to 0x00000012. Refer to the send_hello_message() function in the of/openflow.c file for the procedure to build and send the OFPT_HELLO message.

Receiving the OFPT_HELLO message

The switch should be able to receive and process OFPT_HELLO messages sent from the controller. The controller uses the same message format, structures, and enumerations defined in the previous section of this recipe. Once the switch receives a hello message, it should calculate the protocol version to be used for messages exchanged with the controller, as follows:

- If the hello message received from the controller contains an optional OFPHET_VERSIONBITMAP element and the bitmaps field contains a valid value, then the negotiated version is the highest version that appears both in the received bitmaps and in the switch's own set of supported protocol versions.
- If the hello message doesn't contain an OFPHET_VERSIONBITMAP element, then the negotiated version is the smaller of the highest protocol version supported by the switch and the version field set in the OpenFlow header of the received hello message.

If the negotiated version is supported by the switch, then the OpenFlow connection between the controller and the switch continues. Otherwise, the switch should send an OFPT_ERROR message with the type field set to OFPET_HELLO_FAILED, the code field set to OFPHFC_INCOMPATIBLE, and optionally an ASCII string explaining the situation in the data field, and then terminate the connection.

There's more…

Once the switch and the controller have negotiated the OpenFlow protocol version to be used, the connection setup procedure is complete. From then on, both the controller and the switch can send OpenFlow protocol messages to each other.
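To make the negotiation rule concrete, here is a minimal sketch of the version calculation. It covers only the first bitmap (versions 0 to 31), and the negotiate_version() helper and its parameters are illustrative assumptions, not part of the book's of/openflow.c:

#include <stdint.h>

/* Pick the protocol version per the rules above.
 * our_bitmap: bit N is set if we support ofp_version N.
 * peer_bitmap: first bitmap from OFPHET_VERSIONBITMAP, or NULL if absent. */
static int negotiate_version(uint32_t our_bitmap, uint8_t our_max,
                             uint8_t peer_header_version,
                             const uint32_t *peer_bitmap)
{
    if (peer_bitmap != NULL) {
        uint32_t common = our_bitmap & *peer_bitmap;
        for (int v = 31; v >= 0; v--)   /* highest common version wins */
            if (common & (1u << v))
                return v;
        return -1;                      /* no common version */
    }
    /* No bitmap element: smaller of our highest version and theirs. */
    return peer_header_version < our_max ? peer_header_version : our_max;
}

A caller that gets -1 back would send OFPET_HELLO_FAILED / OFPHFC_INCOMPATIBLE and terminate the connection, as described above.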
Sending and processing an echo request and a reply message

Echo request and reply messages are used by both the controller and the switch to maintain and verify the liveness of the controller-switch connection. Echo messages can also be used to measure the latency and bandwidth of that connection. On reception of an echo request message, the switch should respond with an echo reply message.

How to do it...

As echo messages are transmitted by both the switch and the controller, the switch should be able to send, receive, and process them. The following sections explain these procedures in detail.

Sending the OFPT_ECHO_REQUEST message

The OpenFlow specification doesn't specify how frequently the echo message has to be sent from the switch; the switch may choose to send an echo request message to the controller periodically, at a configured interval. Similarly, the OpenFlow specification doesn't define the timeout (the longest period of time the switch should wait) for receiving the echo reply from the controller. After sending an echo request message to the controller, the switch should wait for the echo reply for the configured timeout period. If the switch doesn't receive the echo reply within this period, it should initiate the connection interruption procedure.

The OFPT_ECHO_REQUEST message contains an OpenFlow header followed by an undefined data field of arbitrary length. The data field might be filled with the timestamp at which the echo request was sent, with various lengths or values to measure bandwidth, or be zero-sized for just checking the liveness of the connection. In most open source implementations of OpenFlow, the echo request message contains only the header field and no body. Refer to the send_echo_request() function in the of/openflow.c file for the procedure to build and send the echo request message.

Receiving the OFPT_ECHO_REQUEST message

The switch should be able to receive and process OFPT_ECHO_REQUEST messages sent from the controller. The controller uses the same message format, structures, and enumerations defined in the previous section of this recipe. Once the switch receives an echo request message, it should build an OFPT_ECHO_REPLY message, which consists of ofp_header and an arbitrary-length data field. While forming the echo reply, the switch should copy the content of the arbitrary-length field of the request message into the reply message. Refer to the process_echo_request() function in the of/openflow.c file for the procedure to handle an echo request message and send the echo reply.

Processing the OFPT_ECHO_REPLY message

The switch should be able to receive the echo reply message from the controller. If the switch sent the echo request message to measure latency or bandwidth, then on receiving the echo reply it should parse the arbitrary-length data field and calculate the bandwidth, latency, and so on.

There's more…

If the OpenFlow switch implementation is divided into multiple layers, then echo request and reply processing should be handled in the deepest possible layer. For example, if the implementation is divided into user-space processing and kernel-space processing, then echo request and reply handling should live in the kernel space.
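As a minimal sketch of the copy-back behaviour described above: the reply is byte-for-byte the request with only the message type changed, which preserves both the xid and any arbitrary payload. The conn type and send_openflow_message() helper are assumed placeholders, not the book's actual API, and the OpenFlow definitions (struct ofp_header, OFPT_ECHO_REPLY) are assumed to be in scope:

#include <stdlib.h>
#include <string.h>
#include <arpa/inet.h>

static void echo_reply(struct conn *c, const struct ofp_header *req)
{
    size_t len = ntohs(req->length);       /* header + arbitrary data */
    struct ofp_header *reply = malloc(len);
    if (reply == NULL)
        return;
    memcpy(reply, req, len);               /* copies xid and payload verbatim */
    reply->type = OFPT_ECHO_REPLY;         /* only the type changes */
    send_openflow_message(c, reply);
    free(reply);
}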
Sending and processing an error message

Error messages are used by both the controller and the switch to notify the other end of the connection about any problem. They are typically used by the switch to inform the controller that a request sent by the controller failed to execute.

How to do it...

Whenever the switch wants to send an error message to the controller, it should build an OFPT_ERROR message, which takes the following format:

/* OFPT_ERROR: Error message (datapath -> controller). */
struct ofp_error_msg {
    struct ofp_header header;
    uint16_t type;
    uint16_t code;
    uint8_t data[0]; /* Variable-length data. Interpreted based
                      * on the type and code. No padding. */
};

The type field indicates the high-level type of error. The code value is interpreted based on the type. The data field carries variable-length data that is interpreted based on both the type and the code; it should contain an ASCII text string that adds detail about why the error occurred. Unless specified otherwise, the data field should contain at least 64 bytes of the failed message that caused this error. If the failed message is shorter than 64 bytes, then the data field should contain the full message without any padding. If the switch needs to send an error message in response to a specific message from the controller (say, OFPET_BAD_REQUEST, OFPET_BAD_ACTION, OFPET_BAD_INSTRUCTION, OFPET_BAD_MATCH, or OFPET_FLOW_MOD_FAILED), then the xid field of the OpenFlow header in the error message should be set to that of the offending request message. Refer to the send_error_message() function in the of/openflow.c file for the procedure to build and send an error message. Note that if the switch sends an error message for a request from the controller (because of an error condition), it need not also send a reply message to that request.

Sending and processing an experimenter message

Experimenter messages provide a way for the switch to offer additional, vendor-defined functionality.

How to do it...

The controller sends the experimenter message in the experimenter message format. Once the switch receives this message, it should invoke the appropriate vendor-specific functions.

Handling a Get Asynchronous Configuration message from the controller

The OpenFlow specification provides a mechanism for the controller to fetch the list of asynchronous events that will be sent from the switch over a given controller channel. This is achieved by sending a Get Asynchronous Configuration message (OFPT_GET_ASYNC_REQUEST) to the switch.

How to do it...

The Get Asynchronous Configuration request message (OFPT_GET_ASYNC_REQUEST) has no body other than ofp_header. On receiving this message, the switch should respond with an OFPT_GET_ASYNC_REPLY message, filling the property list with the asynchronous configuration events / property types that the relevant controller channel is preconfigured to receive. The switch should get this information from its internal data structures. Refer to the process_async_config_request() function in the of/openflow.c file for the procedure to process the Get Asynchronous Configuration request message from the controller.

Sending a packet-in message to the controller

Packet-in messages (OFP_PACKET_IN) are sent from the switch to the controller to transfer a packet received on one of the switch ports to the controller for further processing.
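The "at least 64 bytes of the failed message" rule and the xid copy are easy to get wrong, so here is a minimal sketch of building such an error message. The conn type and send_openflow_message() helper are assumed placeholders rather than the book's actual helpers, and the OpenFlow definitions are assumed to be in scope:

#include <stdlib.h>
#include <string.h>
#include <arpa/inet.h>

static void send_error_for(struct conn *c, uint16_t type, uint16_t code,
                           const struct ofp_header *failed_req)
{
    size_t data_len = ntohs(failed_req->length);
    if (data_len > 64)
        data_len = 64;                       /* at least 64 bytes is enough */
    size_t len = sizeof(struct ofp_error_msg) + data_len;

    struct ofp_error_msg *err = calloc(1, len);
    if (err == NULL)
        return;
    err->header.version = failed_req->version;
    err->header.type    = OFPT_ERROR;
    err->header.length  = htons((uint16_t)len);
    err->header.xid     = failed_req->xid;   /* xid of the offending request */
    err->type = htons(type);                 /* e.g. OFPET_BAD_REQUEST */
    err->code = htons(code);
    memcpy(err->data, failed_req, data_len); /* prefix of the failed message */
    send_openflow_message(c, &err->header);
    free(err);
}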
By default, a packet-in message should be sent to all controllers that are in the equal (OFPCR_ROLE_EQUAL) and master (OFPCR_ROLE_MASTER) roles. It should not be sent to controllers that are in the slave state. There are three ways in which the switch can send a packet-in event to the controller:

- Table-miss entry: when there is no matching flow entry for the incoming packet, the switch can send the packet to the controller.
- TTL checking: when the TTL value in a packet reaches zero, the switch can send the packet to the controller.
- A "send to controller" action in the matching entry (either a flow table entry or a group table entry) for the packet.

How to do it...

When the switch wants to send a packet received in its data path to the controller, it should use the following message format:

/* Packet received on port (datapath -> controller). */
struct ofp_packet_in {
    struct ofp_header header;
    uint32_t buffer_id;     /* ID assigned by datapath. */
    uint16_t total_len;     /* Full length of frame. */
    uint8_t reason;         /* Reason packet is being sent
                             * (one of OFPR_*) */
    uint8_t table_id;       /* ID of the table that was looked up */
    uint64_t cookie;        /* Cookie of the flow entry that was
                             * looked up. */
    struct ofp_match match; /* Packet metadata. Variable size. */
    /* The variable size and padded match is always followed by:
     *   - Exactly 2 all-zero padding bytes, then
     *   - An Ethernet frame whose length is inferred from header.length.
     * The padding bytes preceding the Ethernet frame ensure that the IP
     * header (if any) following the Ethernet header is 32-bit aligned. */
    uint8_t pad[2];  /* Align to 64 bit + 16 bit */
    uint8_t data[0]; /* Ethernet frame */
};

The buffer_id field should be set to an opaque value generated by the switch. When the packet is buffered, the data portion of the packet-in message should contain some bytes of data from the incoming packet. If the packet is sent to the controller because of a "send to controller" action of a table entry, then the max_len field of ofp_action_output should be used as the size of the packet to include in the packet-in message. If the packet is sent to the controller for any other reason, then the miss_send_len field of the OFPT_SET_CONFIG message should be used to determine the size. If the packet is not buffered, either because buffers are unavailable or because of an explicit configuration via OFPCML_NO_BUFFER, then the entire packet should be included in the data portion of the packet-in message, with the buffer_id value set to OFP_NO_BUFFER.

The data field should be set to the complete packet or a fraction of it, and the total_len field should be set to the length of the packet included in the data field. The reason field should be set to one of the following values defined in the enumeration, based on the context that triggered the packet-in event:

/* Why is this packet being sent to the controller? */
enum ofp_packet_in_reason {
    OFPR_TABLE_MISS = 0,   /* No matching flow (table-miss
                            * flow entry). */
    OFPR_APPLY_ACTION = 1, /* Output to controller in
                            * apply-actions. */
    OFPR_INVALID_TTL = 2,  /* Packet has invalid TTL */
    OFPR_ACTION_SET = 3,   /* Output to controller in action set. */
    OFPR_GROUP = 4,        /* Output to controller in group bucket. */
    OFPR_PACKET_OUT = 5,   /* Output to controller in packet-out. */
};

If the packet-in message was triggered by a flow entry's "send to controller" action, then the cookie field should be set to the cookie of the flow entry that caused the packet to be sent to the controller. This field should be set to -1 if the cookie cannot be associated with a particular flow.

When the packet-in message is triggered by the "send to controller" action of a table entry, some changes may already have been applied to the packet in previous stages of the pipeline. This information needs to be carried along with the packet-in message, and it can be carried in the match field of the packet-in message as a set of OXM (short for OpenFlow Extensible Match) TLVs. If the switch includes OXM TLVs in the packet-in message, then the match field should contain a set of OXM TLVs that include context fields. The standard context fields that can be added to these OXM TLVs are OFPXMT_OFB_IN_PORT, OFPXMT_OFB_IN_PHY_PORT, OFPXMT_OFB_METADATA, and OFPXMT_OFB_TUNNEL_ID.

When the switch receives the packet on a physical port and this information needs to be carried in the packet-in message, OFPXMT_OFB_IN_PORT and OFPXMT_OFB_IN_PHY_PORT should have the same value: the OpenFlow port number of that physical port. When the switch receives the packet on a logical port and this information needs to be carried in the packet-in message, the switch should set the logical port's port number in OFPXMT_OFB_IN_PORT and the underlying physical port's port number in OFPXMT_OFB_IN_PHY_PORT. For example, consider a packet received on a tunnel interface defined over a Link Aggregation Group (LAG) with two member ports: the packet-in message should carry the tunnel interface's port_no in the OFPXMT_OFB_IN_PORT field and the physical interface's port_no in the OFPXMT_OFB_IN_PHY_PORT field. Refer to the send_packet_in_message() function in the of/openflow.c file for the procedure to send a packet-in event to the controller.

How it works...

The switch can send either the entire packet it receives from the switch port to the controller, or only a fraction of it. When the switch is configured to send only a fraction of the packet, it should buffer the packet in its memory and send a portion of the packet data; this is controlled by the switch configuration. If the switch is configured to buffer the packet, and it has sufficient memory to do so, then the packet-in message should contain the following:

- A fraction of the packet. The size of the packet data to include in the packet-in message is configured via the switch configuration message and is 128 bytes by default. When the packet-in message results from a table-entry action, the output action itself can specify the size of the packet to be sent to the controller; for all other packet-in messages, the size is defined in the switch configuration.
- The buffer ID to be used by the controller when it wants to forward the message at a later point in time.

There's more…

A switch that implements buffering is expected to expose some details through documentation, such as the amount of available buffers and the period of time for which buffered data remains available. The switch should also implement a procedure to release a buffered packet when there is no response from the controller to the packet-in event.
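The rules above for how much of the frame goes into the data field condense into one small helper. This is an illustrative sketch; the parameters mirror the fields named above (max_len from the output action, miss_send_len from the switch configuration) rather than any actual of/openflow.c signature:

#include <stdint.h>
#include <arpa/inet.h>

/* How many bytes of the frame to put in the packet-in data field. */
static uint16_t packet_in_data_len(uint16_t frame_len, int buffered,
                                   const struct ofp_action_output *out_act,
                                   uint16_t miss_send_len)
{
    if (!buffered)
        return frame_len;                      /* OFP_NO_BUFFER: whole frame */
    uint16_t max = (out_act != NULL)
                       ? ntohs(out_act->max_len) /* "send to controller" action */
                       : miss_send_len;          /* table miss, invalid TTL, ... */
    return frame_len < max ? frame_len : max;
}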
Sending a flow-removed message to the controller

A flow-removed message (OFPT_FLOW_REMOVED) is sent from the switch to the controller when a flow entry is removed from the flow table. This message should be sent only when the OFPFF_SEND_FLOW_REM flag in the flow entry is set, and only over the controller channels through which a controller has asked to receive this event. The controller can express its interest in receiving this event by sending the switch configuration message to the switch. By default, OFPT_FLOW_REMOVED should be sent to all controllers that are in the equal (OFPCR_ROLE_EQUAL) and master (OFPCR_ROLE_MASTER) roles, and not to controllers in the slave state.

How to do it...

When the switch removes an entry from the flow table, it should build an OFPT_FLOW_REMOVED message with the following format and send it to the controllers that have already shown interest in this event:

/* Flow removed (datapath -> controller). */
struct ofp_flow_removed {
    struct ofp_header header;
    uint64_t cookie;        /* Opaque controller-issued identifier. */
    uint16_t priority;      /* Priority level of flow entry. */
    uint8_t reason;         /* One of OFPRR_*. */
    uint8_t table_id;       /* ID of the table */
    uint32_t duration_sec;  /* Time flow was alive in seconds. */
    uint32_t duration_nsec; /* Time flow was alive in nanoseconds
                             * beyond duration_sec. */
    uint16_t idle_timeout;  /* Idle timeout from original flow mod. */
    uint16_t hard_timeout;  /* Hard timeout from original flow mod. */
    uint64_t packet_count;
    uint64_t byte_count;
    struct ofp_match match; /* Description of fields. Variable size. */
};

The cookie field should be set to the cookie of the flow entry, the priority field to the priority of the flow entry, and the reason field to one of the following values defined in the enumeration:

/* Why was this flow removed? */
enum ofp_flow_removed_reason {
    OFPRR_IDLE_TIMEOUT = 0, /* Flow idle time exceeded idle_timeout. */
    OFPRR_HARD_TIMEOUT = 1, /* Time exceeded hard_timeout. */
    OFPRR_DELETE = 2,       /* Evicted by a DELETE flow mod. */
    OFPRR_GROUP_DELETE = 3, /* Group was removed. */
    OFPRR_METER_DELETE = 4, /* Meter was removed. */
    OFPRR_EVICTION = 5,     /* Switch eviction to free resources. */
};

The duration_sec and duration_nsec fields should be set to the elapsed lifetime of the flow entry in the switch; the total duration in nanoseconds can be computed as duration_sec * 10^9 + duration_nsec. All the other fields, such as idle_timeout, hard_timeout, and so on, should be set to the corresponding values from the flow entry; that is, they can be copied directly from the flow mod that created the entry. The packet_count and byte_count fields should be set to the packet count and byte count associated with the flow entry, respectively. If these values are not available, the fields should be set to the maximum possible value. Refer to the send_flow_removed_message() function in the of/openflow.c file for the procedure to send a flow-removed event message to the controller.

Sending a port-status message to the controller

Port-status messages (OFPT_PORT_STATUS) are sent from the switch to the controller when there is any change in port status, or when a new port is added, removed, or modified in the switch's data path.
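As a quick worked example of the duration encoding: a flow alive for 2.5 seconds is reported as duration_sec = 2 and duration_nsec = 500000000, and a consumer reassembles the total as below. This is a generic sketch, not code from of/openflow.c; it assumes the wire fields arrive in network byte order:

#include <stdint.h>
#include <arpa/inet.h>

/* Reassemble the flow lifetime reported in an OFPT_FLOW_REMOVED message. */
static uint64_t flow_lifetime_ns(const struct ofp_flow_removed *fr)
{
    return (uint64_t)ntohl(fr->duration_sec) * 1000000000ULL
         + ntohl(fr->duration_nsec);
}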
The switch should send this message only over the controller channels through which a controller has asked to receive it. The controller can express its interest in this event by sending an asynchronous configuration message to the switch. By default, the port-status message should be sent to all controllers configured in the switch, including controllers in the slave role (OFPCR_ROLE_SLAVE).

How to do it...

The switch should construct an OFPT_PORT_STATUS message with the following format and send it to the controllers that have already shown interest in this event:

/* A physical port has changed in the datapath */
struct ofp_port_status {
    struct ofp_header header;
    uint8_t reason; /* One of OFPPR_*. */
    uint8_t pad[7]; /* Align to 64-bits. */
    struct ofp_port desc;
};

The reason field should be set to one of the following values as defined in the enumeration:

/* What changed about the physical port */
enum ofp_port_reason {
    OFPPR_ADD = 0,    /* The port was added. */
    OFPPR_DELETE = 1, /* The port was removed. */
    OFPPR_MODIFY = 2, /* Some attribute of the port has changed. */
};

The desc field should be set to the port description. The switch need not fill in all properties of the port description: it should fill in the properties that have changed and may optionally include the unchanged ones. Refer to the send_port_status_message() function in the of/openflow.c file for the procedure to send a port-status message to the controller.

Sending a controller role-status message to the controller

Controller role-status messages (OFPT_ROLE_STATUS) are sent from the switch to the set of controllers when the role of a controller changes as a result of an OFPT_ROLE_REQUEST message. For example, if there are three controllers connected to a switch (say controller1, controller2, and controller3) and controller1 sends an OFPT_ROLE_REQUEST message to the switch, then the switch should send an OFPT_ROLE_STATUS message to controller2 and controller3.

How to do it...

The switch should build the OFPT_ROLE_STATUS message with the following format and send it to all the other controllers:

/* Role status event message. */
struct ofp_role_status {
    struct ofp_header header; /* Type OFPT_ROLE_REQUEST /
                               * OFPT_ROLE_REPLY. */
    uint32_t role;            /* One of OFPCR_ROLE_*. */
    uint8_t reason;           /* One of OFPCRR_*. */
    uint8_t pad[3];           /* Align to 64 bits. */
    uint64_t generation_id;   /* Master Election Generation Id */
    /* Role Property list */
    struct ofp_role_prop_header properties[0];
};

The reason field should be set to one of the following values as defined in the enumeration:

/* What changed about the controller role */
enum ofp_controller_role_reason {
    OFPCRR_MASTER_REQUEST = 0, /* Another controller asked
                                * to be master. */
    OFPCRR_CONFIG = 1,         /* Configuration changed on the
                                * switch. */
    OFPCRR_EXPERIMENTER = 2,   /* Experimenter data changed. */
};

The role field should be set to the new role of the controller, and generation_id to the generation ID of the OFPT_ROLE_REQUEST message that triggered the OFPT_ROLE_STATUS message. If the reason code is OFPCRR_EXPERIMENTER, then the role property list should be set in the following format:

/* Role property types. */
enum ofp_role_prop_type {
    OFPRPT_EXPERIMENTER = 0xFFFF, /* Experimenter property. */
};

/* Experimenter role property */
struct ofp_role_prop_experimenter {
    uint16_t type;         /* One of OFPRPT_EXPERIMENTER. */
    uint16_t length;       /* Length in bytes of this property. */
    uint32_t experimenter; /* Experimenter ID which takes the same
                            * form as struct
                            * ofp_experimenter_header. */
    uint32_t exp_type;     /* Experimenter defined. */
    /* Followed by:
     *   - Exactly (length - 12) bytes containing the experimenter data,
     *   - Exactly (length + 7)/8*8 - (length) (between 0 and 7)
     *     bytes of all-zero bytes */
    uint32_t experimenter_data[0];
};

The experimenter field carries the experimenter ID, which takes the same form as in struct ofp_experimenter_header. Refer to the send_role_status_message() function in the of/openflow.c file for the procedure to send a role-status message to the controller.
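The fan-out rule (notify everyone except the requester) is simple but worth pinning down. A minimal sketch follows; the channel list, the conn type, and the send_role_status() helper are illustrative assumptions, not the book's actual data structures:

/* After a successful OFPT_ROLE_REQUEST from `origin`, notify every
 * other controller channel with an OFPT_ROLE_STATUS message. */
static void broadcast_role_status(struct conn *channels, struct conn *origin,
                                  uint32_t new_role, uint8_t reason,
                                  uint64_t generation_id)
{
    for (struct conn *c = channels; c != NULL; c = c->next) {
        if (c == origin)
            continue;   /* the requester gets an OFPT_ROLE_REPLY instead */
        send_role_status(c, new_role, reason, generation_id);
    }
}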
Sending a table-status message to the controller

Table-status messages (OFPT_TABLE_STATUS) are sent from the switch to the controller when there is any change in table status; for example, when the number of entries in the table crosses a threshold value called the vacancy threshold. The switch should send this message only over the controller channels through which a controller has asked to receive it. The controller can express its interest in this event by sending the asynchronous configuration message to the switch.

How to do it...

The switch should build an OFPT_TABLE_STATUS message with the following format and send it to the controllers that have already shown interest in this event:

/* A table config has changed in the datapath */
struct ofp_table_status {
    struct ofp_header header;
    uint8_t reason;              /* One of OFPTR_*. */
    uint8_t pad[7];              /* Pad to 64 bits */
    struct ofp_table_desc table; /* New table config. */
};

The reason field should be set to one of the following values defined in the enumeration:

/* What changed about the table */
enum ofp_table_reason {
    OFPTR_VACANCY_DOWN = 3, /* Vacancy down threshold event. */
    OFPTR_VACANCY_UP = 4,   /* Vacancy up threshold event. */
};

When the number of free entries in the table crosses the vacancy_down threshold, the switch should set the reason code to OFPTR_VACANCY_DOWN. Once a vacancy-down event has been generated, the switch should not generate any further vacancy-down events until a vacancy-up event is generated. Likewise, when the number of free entries in the table crosses the vacancy_up threshold, the switch should set the reason code to OFPTR_VACANCY_UP, and once that event has been generated, the switch should not generate any further vacancy-up events until a vacancy-down event is generated. The table field should be set to the table description. Refer to the send_table_status_message() function in the of/openflow.c file for the procedure to send a table-status message to the controller.
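The alternating vacancy-up/vacancy-down behaviour is a hysteresis, which a single state flag captures. This is a generic sketch under stated assumptions: the flow_table fields and the send_table_status() helper are invented for illustration, while the OFPTR_* constants are the real ones from above:

#include <stdint.h>

struct flow_table {
    uint32_t max_entries, active_entries;
    uint8_t vacancy_down, vacancy_up;  /* thresholds, percent of free entries */
    uint8_t last_vacancy_event;        /* OFPTR_VACANCY_DOWN / _UP, or 0 */
};

/* Generate alternating vacancy events: after a DOWN event, stay quiet
 * until an UP event has fired, and vice versa. */
static void check_vacancy(struct flow_table *t)
{
    if (t->max_entries == 0)
        return;
    uint32_t free_pct =
        100u * (t->max_entries - t->active_entries) / t->max_entries;

    if (free_pct <= t->vacancy_down &&
        t->last_vacancy_event != OFPTR_VACANCY_DOWN) {
        t->last_vacancy_event = OFPTR_VACANCY_DOWN;
        send_table_status(t, OFPTR_VACANCY_DOWN);
    } else if (free_pct >= t->vacancy_up &&
               t->last_vacancy_event != OFPTR_VACANCY_UP) {
        t->last_vacancy_event = OFPTR_VACANCY_UP;
        send_table_status(t, OFPTR_VACANCY_UP);
    }
}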
Sending a request-forward message to the controller

When the switch receives a modify request message from the controller to modify the state of group or meter entries, then after successful modification of that state, the switch should forward the request to all other controllers as a request-forward message (OFPT_REQUESTFORWARD). The switch should send this message only over the controller channels through which a controller has asked to receive it. The controller can express its interest in this event by sending an asynchronous configuration message to the switch.

How to do it...

The switch should build the OFPT_REQUESTFORWARD message with the following format and send it to the controllers that have already shown interest in this event:

/* Group/Meter request forwarding. */
struct ofp_requestforward_header {
    struct ofp_header header;  /* Type OFPT_REQUESTFORWARD. */
    struct ofp_header request; /* Request being forwarded. */
};

The request field should be set to the request that was received from the controller. Refer to the send_request_forward_message() function in the of/openflow.c file for the procedure to send a request-forward message to the controller.

Handling a packet-out message from the controller

Packet-out messages (OFPT_PACKET_OUT) are sent from the controller to the switch when the controller wishes to send a packet out through the switch's data path via a switch port.

How to do it...

There are two ways in which the controller can send a packet-out message to the switch:

- Construct the full packet: in this case, the controller generates the complete packet and adds an action list to the packet-out message. The action list defines how the packet should be processed by the switch. If the switch receives a packet-out message with buffer_id set to OFP_NO_BUFFER, then it should look into the action list and, based on the actions to be performed, do one of the following: modify the packet and send it via the switch port mentioned in the action list, or hand the packet over to OpenFlow pipeline processing, based on the OFPP_TABLE port specified in the action list.
- Use a packet buffer in the switch: in this mechanism, the switch uses the buffer that was created at the time of sending the packet-in message to the controller. While sending the packet-in message, the switch adds a buffer_id to it; when the controller wants to send a packet-out message that uses this buffer, it includes the same buffer_id in the packet-out message. On receiving a packet-out message with a valid buffer_id, the switch should fetch the packet from the buffer and send it via the switch port. Once the packet is sent out, the switch should free the memory allocated to the cached buffer.

Handling a barrier message from the controller

A switch implementation may arbitrarily reorder the messages sent from the controller in order to maximize performance. So, if the controller wants to enforce in-order processing of messages, it uses barrier messages (OFPT_BARRIER_REQUEST and OFPT_BARRIER_REPLY). The switch must not reorder any messages across a barrier message. For example, if the controller sends a group add message followed by a flow add message referencing that group, it inserts a barrier message between them so that the group is installed before the flow entry is processed.

How to do it...

When the controller wants to send messages that are related to each other, it sends a barrier message between them. The switch should process these messages as follows:

- Messages before the barrier request must be processed fully before the barrier, including sending any resulting replies or errors.
- The barrier request message should then be processed and a barrier reply sent. While sending the barrier reply, the switch should copy the xid value from the barrier request message.
- The switch should then process the remaining messages.

Neither the barrier request nor the barrier reply message has a body; both consist only of the ofp_header.
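A minimal sketch of the barrier handling described above follows. The flush_pending_messages() and send_openflow_message() helpers and the conn type are assumed placeholders for whatever the implementation uses to drain and transmit its queues:

#include <arpa/inet.h>

static void process_barrier_request(struct conn *c,
                                    const struct ofp_header *req)
{
    /* Finish everything received before the barrier, including
     * sending any resulting replies or errors. */
    flush_pending_messages(c);

    struct ofp_header reply = {
        .version = req->version,
        .type    = OFPT_BARRIER_REPLY,
        .length  = htons(sizeof reply),  /* no body, header only */
        .xid     = req->xid,             /* reply copies the request's xid */
    };
    send_openflow_message(c, &reply);
    /* Only now resume processing messages queued after the barrier. */
}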
Summary

This article covered the symmetric and asynchronous messages sent and received by the OpenFlow switch, along with the procedures for handling these messages.


Introduction to SDN - Transformation from legacy to SDN

Packt
07 Jun 2017
23 min read
In this article, by Reza Toghraee, the author of the book Learning OpenDaylight, we will cover:

- What is and what is not SDN
- Components of an SDN
- The difference between SDN and overlay
- SDN controllers

You might have heard about Software-Defined Networking (SDN). If you are in the networking industry, this is probably a topic you started studying the first time you heard the term. To understand the importance of SDN and the SDN controller, look at Google: Google quietly built its own networking switches and controller, called Jupiter, a home-grown project that is mostly software driven and supports Google's massive scale. The premise of SDN is that there is a controller that knows the whole network. OpenDaylight (ODL) is an SDN controller; in other words, it's the central brain of the network.

Why we are going towards SDN

Everyone who hears about SDN should ask why we are talking about it at all: what problem is it trying to solve? If we look at traditional networking (layer 2 and layer 3, with routing protocols such as BGP and OSPF), we are completely dominated by what are called protocols. These protocols have in fact been very helpful to the industry. They are mostly standard, so different vendors and products can communicate with each other: a Cisco router can establish a BGP session with a Huawei switch, and an open source Quagga router can exchange OSPF routes with a Juniper firewall. Routing protocols are a constant standard with solid foundations, but if you need to override something in your network routing, you have to find a trick within the protocols, even if it is just a static route. SDN can help us break out of the routing-protocol cage and look at different ways to forward traffic: it can directly program each switch and even override a route installed by a routing protocol. There are high-level benefits to using SDN, a few of which are as follows:

- An integrated network: in traditional networking we had a standalone model; each switch was managed separately, ran its own routing protocol instance, and processed routing information messages from its neighbors. In SDN, we migrate to a centralized model, where the SDN controller becomes the single point of configuration of the network, where you apply policies and configuration.
- Scalable layer 2 across layer 3: having a layer 2 network span multiple layer 3 networks is something all network architects are interested in, and to date we have been using proprietary methods such as OTV, or a service provider's VPLS service, to achieve it. With SDN, we can create layer 2 networks across multiple switches or layer 3 domains (using VXLAN) and expand those layer 2 networks. In many cloud environments, where virtual machines are distributed across different hosts in different datacenters, this is a major requirement.
- Third-party application programmability: this is a very generic term, isn't it? What it refers to is letting other applications communicate with your network. For example, in many new distributed IP storage systems, the IP storage controller can talk to the network to provide the best, shortest path to a storage node. With SDN, we let other applications control the network. Of course, this control has limits, and SDN doesn't allow an application to scrap the whole network.
- Flexible application-based network: in SDN, everything is an application.
  L2/L3, BGP, VMware integration, and so on are all applications running on the SDN controller.
- Service chaining: on the fly, you can add a firewall or a load balancer into the path. This is service insertion.
- Unified wired and wireless: this is an ideal benefit, having a controller that supports both wired and wireless networks. OpenDaylight is the only controller that supports the CAPWAP protocol, which allows integration with wireless access points.

Components of an SDN

A software-defined network infrastructure has two key components:

- The SDN controller (only one, possibly deployed as a highly available cluster)
- The SDN-enabled switches (multiple switches, mostly in a Clos topology in a datacenter)

The SDN controller is the single brain of the SDN domain. In fact, an SDN domain is very similar to a chassis-based switch: you can think of the supervisor or management module of a chassis-based switch as the SDN controller, and the line cards and I/O cards as SDN switches. The main difference between an SDN network and a chassis-based switch is that you can scale out the SDN with multiple switches, whereas in a chassis-based switch you are limited by the number of slots in the chassis.

Controlling the fabric

It is very important to understand the main technologies involved in SDN. These methods are used by SDN controllers to manage and control the SDN network. In general, there are two methods available for controlling the fabric:

- Direct fabric programming: in this method, the SDN controller communicates directly with the SDN-enabled switches via southbound protocols such as OpenFlow, NETCONF, and OVSDB. The SDN controller programs each switch member with the relevant information about the fabric and how to forward traffic. Direct fabric programming is the method used by OpenDaylight.
- Overlay: in the overlay method, the SDN controller doesn't rely on programming the network switches and routers. Instead, it builds a virtual overlay network on top of the existing underlay network. The underlay can be a layer 2 or layer 3 network of traditional switches and routers, simply providing IP connectivity. The SDN controller uses this platform to build the overlay using encapsulation protocols such as VXLAN and NVGRE (see the header sketch after this list). VMware NSX uses overlay technology to build and control the virtual fabric.
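To make the overlay idea concrete, here is the VXLAN encapsulation header from RFC 7348 as a C struct: overlay endpoints wrap each tenant Ethernet frame in an outer IP/UDP header plus these 8 bytes, and the controller only needs IP reachability between endpoints. This is a generic illustration, not code from any particular controller:

#include <stdint.h>

/* VXLAN header (RFC 7348): 8 bytes carried after the outer UDP header.
 * Full encapsulation: outer Ethernet / outer IP / outer UDP / VXLAN /
 * original (inner) Ethernet frame. */
struct vxlan_hdr {
    uint8_t flags;        /* 0x08 when a valid VNI is present */
    uint8_t reserved1[3];
    uint8_t vni[3];       /* 24-bit VXLAN Network Identifier: which
                           * virtual layer 2 segment this frame belongs
                           * to (~16 million possible segments) */
    uint8_t reserved2;
};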
SDN controllers

One of the key fundamentals of SDN is disaggregation: disaggregation of software and hardware in a network, and disaggregation of the control and forwarding planes. The SDN controller is the main brain of an SDN environment; it's a strategic control point within the network, responsible for communicating information to:

- Routers, switches, and other network devices behind them. SDN controllers use APIs or protocols (such as OpenFlow or NETCONF) to communicate with these devices. This communication is known as southbound.
- Upstream switches, routers, or applications and the aforementioned business logic (via APIs or protocols). This communication is known as northbound. An example of northbound communication is a BGP session between a legacy router and the SDN controller.

If you are familiar with chassis-based switches like the Cisco Catalyst 6500 or Nexus 7k, you can imagine an SDN network as a chassis, with switches and routers as its I/O line cards and the SDN controller as its supervisor or management module. In fact, SDN is similar to a very scalable chassis with no limit on the number of physical slots: the SDN controller plays a role similar to the management module of a chassis-based switch, controlling all switches via its southbound protocols and APIs. The following table compares an SDN controller and a chassis-based switch:

SDN controller | Chassis-based switch
Supports any switch hardware | Supports only specific switch line cards
Can scale out to an unlimited number of switches | Limited to the number of physical slots in the chassis
Supports high redundancy via multiple controllers in a cluster | Supports dual management redundancy (active/standby)
Communicates with switches via southbound protocols such as OpenFlow, NETCONF, BGP PCEP | Uses proprietary protocols between the management module and line cards
Communicates with routers, switches, and applications outside the SDN domain via northbound protocols such as BGP, OSPF, and direct APIs | Communicates with routers and switches outside the chassis via standard protocols such as BGP, OSPF, or APIs

The first protocol to popularize the concept behind SDN was OpenFlow. When it was conceptualized by networking researchers at Stanford back in 2008, it was meant to manipulate the data plane to optimize traffic flows and make adjustments, so the network could quickly adapt to changing requirements. OpenFlow is a protocol designed to update the flow tables in a switch, allowing the SDN controller to access the forwarding table of each member switch, or in other words, to connect the control plane and the data plane in the SDN world. Version 1.0 of the OpenFlow specification was released in December 2009; it continues to be enhanced under the management of the Open Networking Foundation, a user-led organization focused on advancing the development of open standards and the adoption of SDN technologies.

After the introduction of OpenFlow, NOX was introduced as the original OpenFlow controller (at that point the term SDN controller was not yet in use). NOX provided a high-level API capable of managing, and also developing, network control applications; separate applications had to run on top of NOX to manage the network. NOX was initially developed by Nicira Networks (which was acquired by VMware and eventually became part of VMware NSX) and was introduced alongside OpenFlow in 2009. NOX was a closed source product, but it was ultimately donated to the SDN community, which led to multiple forks and sub-projects out of the original NOX. For example, POX is a sub-project of NOX that provides Python support. Both NOX and POX were early controllers; NOX appears inactive in development, whereas POX is still in use by the research community, as it is Python based and easy to deploy. POX is hosted at http://github.com/noxrepo/pox

Apart from being the first OpenFlow (or SDN) controller, NOX also established a programming model that was inherited by subsequent controllers. The model was based on processing OpenFlow messages, with each incoming OpenFlow message triggering an event that had to be processed individually. This model was simple to implement, but it was neither efficient nor robust, and it couldn't scale. Nicira, along with NTT and Google, then started developing ONIX, which was meant to be more abstract and scalable for large deployments.
ONIX became the base for Nicira (the core of VMware NSX, the network virtualization platform), and there are rumors that it is also the base for Google's WAN controller. ONIX was planned to become open source and be donated to the community, but for various reasons the main contributors decided against it, which forced the SDN community to focus on developing other platforms.

In 2010, a new controller was introduced: Beacon, which became one of the most popular controllers. It was born out of contributions by developers from Stanford University. Beacon is a Java-based open source OpenFlow controller created in 2010; it has been widely used for teaching and research, and served as the basis of Floodlight. Beacon had the first built-in web user interface, which was a huge step forward in the market of SDN controllers, and it was easier to deploy and run than NOX. Beacon influenced the design of the controllers that came after it, although it supported only star topologies, which was one of its limitations.

Floodlight was a successful SDN controller built as a fork of Beacon. BigSwitch Networks develops Floodlight along with other contributors. In 2013, Beacon's popularity started to shrink and Floodlight's started to grow. Floodlight fixed many of Beacon's issues and added many additional features, which made it one of the most feature-rich controllers available. It had a web interface, a Java-based GUI, and could also be integrated with OpenStack using the quantum plugin. Integration with OpenStack was a big step forward, as Floodlight could then provide networking to a large pool of virtual machines, compute, and storage. Floodlight adoption increased with the evolution of OpenStack and its adopters, giving it greater popularity and applicability than the controllers that came before it; most controllers that came after Floodlight also supported OpenStack integration. Floodlight is still supported and developed by the community and BigSwitch Networks, and it is the base for Big Cloud Fabric (BigSwitch's commercial SDN controller).

Other open source SDN controllers have also been introduced, such as Trema (Ruby-based, from NEC), Ryu (supported by NTT), FlowER, LOOM, and the recent OpenMUL. The following table shows the current open source SDN controllers:

Active open source SDN controllers | Non-active open source SDN controllers
Floodlight | Beacon
OpenContrail | FlowER
OpenDaylight | NOX
LOOM | NodeFlow
OpenMUL |
ONOS |
POX |
Ryu |
Trema |

OpenDaylight

OpenDaylight started in early 2013, originally led by IBM and Cisco, as a new collaborative open source project. OpenDaylight is hosted under the Linux Foundation and has drawn support and interest from many developers and adopters. It is a platform that provides common foundations and a robust array of services for SDN environments. OpenDaylight uses a controller model that supports OpenFlow as well as other southbound protocols; it is the first open source controller capable of employing non-OpenFlow, proprietary control protocols, which lets OpenDaylight integrate with modern, multi-vendor networks.

The first release of OpenDaylight, code-named Hydrogen, arrived in February 2014, followed by Helium in September 2014. The Helium release was significant because it marked a change in direction for the platform that has influenced the way subsequent controllers have been architected.
The main change was in the service abstraction layer, the part of the controller platform that sits just above the southbound protocols, such as OpenFlow, isolating them from the northbound side where the applications reside. Hydrogen used an API-driven service abstraction layer (AD-SAL), which had limitations: specifically, the controller needed to know about every type of device in the network and have an inventory of drivers to support them. Helium introduced a model-driven service abstraction layer (MD-SAL), which meant the controller didn't have to account for all the types of equipment installed in the network, allowing it to manage a wide range of hardware and southbound protocols. The Helium release made the framework much more agile and adaptable to changes in applications: an application could now request changes to the model, which would be received by the abstraction layer and forwarded to the network devices.

The OpenDaylight platform built on this advancement in its third release, Lithium, introduced in June 2015. This release focused on broadening the programmability of the network, enabling organizations to create their own service architectures to deliver dynamic network services in a cloud environment and craft intent-based policies. The Lithium release was worked on by more than 400 individuals, with contributions from Big Switch Networks, Cisco, Ericsson, HP, NEC, and others, making it one of the fastest-growing open source projects ever. The fourth release, Beryllium, came out in February 2016, and the most recent fifth release, Boron, was released in September 2016.

Many vendors have built and developed commercial SDN controller solutions based on OpenDaylight, each enhancing or adding features to OpenDaylight as a differentiating factor. Vendors use OpenDaylight in their products as:

- A base for a commercial version with additional proprietary functionality (for example, Brocade, Ericsson, and Ciena)
- Part of the infrastructure in their Network as a Service (or XaaS) offerings (for example, Telstra and IBM)
- Elements for use in their solutions (for example, ConteXtream, now part of HP)

Open Networking Operating System (ONOS), which was open sourced in December 2014, is focused on serving the needs of service providers. While it is not as widely adopted as OpenDaylight, ONOS has been finding success and gaining momentum around WAN use cases. ONOS is backed by numerous organizations, including AT&T, Cisco, Fujitsu, Ericsson, Ciena, Huawei, NTT, SK Telecom, NEC, and Intel, many of whom are also participants in and supporters of OpenDaylight.

Apart from open source SDN controllers, there are many commercial, proprietary controllers available in the market; VMware NSX, Cisco APIC, BigSwitch Big Cloud Fabric, HP VAN, and NEC ProgrammableFlow are examples of commercial, proprietary products. The following table lists the commercially available controllers and their relationship to OpenDaylight:

ODL-based | ODL-friendly | Non-ODL based
Avaya | Cyan (acquired by Ciena) | BigSwitch
Brocade | HP | Juniper
Ciena | NEC | Cisco
ConteXtream (HP) | Nuage | Plexxi
Coriant | PLUMgrid |
Ericsson | Pluribus |
Extreme | Sonus |
Huawei (also ships a non-ODL controller) | | VMware NSX

Core features of SDN

Regardless of whether the SDN platform is open source or proprietary, there are core features and capabilities that it must support.
These capabilities include:

- Fabric programmability: providing the ability to redirect traffic, apply filters to packets (dynamically), and leverage templates to streamline the creation of custom applications. Northbound APIs must make the control information centralized in the controller available to SDN applications for modification. This ensures the controller can dynamically adjust the underlying network to optimize traffic flows to use the least expensive path, take varying bandwidth constraints into consideration, and meet quality of service (QoS) requirements.
- Southbound protocol support: enabling the controller to communicate with switches and routers and to manipulate and optimize how they manage the flow of traffic. Currently, OpenFlow is the most standard protocol used between different networking vendors, although other southbound protocols can be used. An SDN platform should support different versions of OpenFlow in order to provide compatibility with different switching equipment.
- External API support: ensuring the controller can be used within varied orchestration and cloud environments such as VMware vSphere, OpenStack, and so on. Using APIs, the orchestration platform can communicate with the SDN platform in order to publish network policies. For example, VMware vSphere can talk to the SDN platform to extend virtual distributed switches (VDS) from the virtual environment to the physical underlay network, without requiring a network engineer to configure the network.
- Centralized monitoring and visualization: since the SDN controller has full visibility of the network, it can offer end-to-end visibility and centralized management to improve overall performance, simplify the identification of issues, and accelerate troubleshooting. The SDN controller should be able to discover and present a logical abstraction of all the physical links in the network, as well as a map of connected devices (MAC addresses) corresponding to the virtual or physical devices attached to the network. The SDN controller should support monitoring protocols, such as syslog and SNMP, and APIs for integration with third-party management and monitoring systems.
- Performance: performance in an SDN environment mainly depends on how fast the SDN controller fills the flow tables of the SDN-enabled switches. Most SDN controllers pre-populate the flow tables on switches to minimize delay. When an SDN-enabled switch receives a packet that doesn't match any entry in its flow table, it sends the packet to the SDN controller in order to find out where the packet needs to be forwarded. A robust SDN solution should ensure that the number of such requests from switches is minimal and that the SDN controller doesn't become a bottleneck in the network.
- High availability and scalability: controllers must support high-availability clusters to ensure reliability and service continuity in case a controller fails. Clustering in an SDN controller extends to scalability: a modern SDN platform should allow more controller nodes to be added, with load balancing, in order to increase performance and availability. Modern SDN controllers support clustering across multiple geographical locations.
- Security: since all switches communicate with the SDN controller, the communication channel needs to be secured to ensure unauthorized devices can't compromise the network.
  The SDN controller should secure the southbound channels and use encrypted messaging and mutual authentication to provide access control. Beyond that, the SDN controller must implement preventive mechanisms against denial-of-service attacks. Deployment of authorization levels and access controls for multi-tenant SDN platforms is also a key requirement.

Apart from the aforementioned features, SDN controllers are likely to expand their function in the future. They may become a network operating system and change the way we used to build networks with hardware, switches, SFPs, and gigs of bandwidth. The future will look more software defined, as the silicon and hardware industry has already delivered on its promises of high-performance 40G and 100G networking chips; the industry needs more time to digest the new hardware and silicon and to refresh its equipment with new gear supporting ten times the current performance.

Current SDN controllers

In this section, the different SDN controllers are laid out in a table. This will help you understand the current market players in SDN and how OpenDaylight relates to them:

- Brocade SDN Controller (based on OpenDaylight: yes; commercial): a commercial version of OpenDaylight, fully supported and with extra reliability modules.
- Cisco APIC (based on OpenDaylight: no; commercial): Cisco Application Policy Infrastructure Controller (APIC) is the unifying automation and management point for the Application Centric Infrastructure (ACI) data center fabric. Cisco uses the APIC controller and Nexus 9k switches to build the fabric, with OpFlex as the main southbound protocol.
- Ericsson SDN Controller (based on OpenDaylight: yes; commercial): Ericsson's SDN controller is a commercial (hardened) version of the OpenDaylight SDN controller. Domain-specific control applications that use the SDN controller as a platform form the basis of the three commercial products in Ericsson's SDN controller portfolio.
- Juniper OpenContrail / Contrail (based on OpenDaylight: no; both commercial and open source): OpenContrail is open source, and Contrail itself is a commercial product. Juniper Contrail Networking is an open SDN solution that consists of the Contrail controller, the Contrail vRouter, an analytics engine, and published northbound APIs for cloud and NFV. OpenContrail is also available for free from Juniper. Contrail promotes and uses MPLS in the datacenter.
- NEC ProgrammableFlow (based on OpenDaylight: no; commercial): NEC provides its own SDN controller and switches. The NEC SDN platform is a common choice for enterprises and has a lot of traction and some case studies.
- Avaya SDN Fx Controller (based on OpenDaylight: yes; commercial): based on OpenDaylight, bundled as a solution package.
- Big Cloud Fabric (based on OpenDaylight: no; commercial): the BigSwitch Networks solution is based on the Floodlight open source project. Big Cloud Fabric is a robust, clean SDN controller and works with bare-metal whitebox switches. It includes Switch Light OS, a switch operating system that can be loaded on whitebox switches with Broadcom Trident 2 or Tomahawk silicon. The benefit of Big Cloud Fabric is that you are not bound to any hardware and can use bare-metal switches from any vendor.
- Ciena's Agility (based on OpenDaylight: yes; commercial): Ciena's Agility multilayer WAN controller is built atop the open source baseline of the OpenDaylight Project, an open, modular framework created by a vendor-neutral ecosystem (rather than a vendor-centric ego-system) that enables network operators to source network services and applications from both Ciena's Agility and others.
HP VAN (Virtual Application Network) | No | Commercial | The building block of the HP open SDN ecosystem; the controller allows third-party developers to deliver innovative SDN solutions.
Huawei Agile controller | Yes and No (depending on edition) | Commercial | Huawei's SDN controller, which integrates as a solution with Huawei enterprise switches.
Nuage | No | Commercial | Nuage Networks VSP provides SDN capabilities for clouds of all sizes. It is implemented as a non-disruptive overlay for all existing virtualized and non-virtualized server and network resources.
Pluribus Netvisor | No | Commercial | Netvisor Premium and Open Netvisor Linux are distributed network operating systems. Open Netvisor integrates a traditional, interoperable networking stack (L2/L3/VXLAN) with an SDN distributed controller that runs in every switch of the fabric.
VMware NSX | No | Commercial | VMware NSX is an overlay type of SDN, which currently works with VMware vSphere, with OpenStack support planned. VMware NSX also has built-in firewall, router, and L4 load balancer functionality, allowing micro-segmentation.

OpenDaylight as an SDN controller

Previously, we went through the role of the SDN controller and a brief history of ODL. ODL is a modular open SDN platform that allows developers to build any network or business application on top of it to drive the network in the way they want. ODL has currently reached its fifth release (Boron, the fifth element in the periodic table); ODL releases are named after periodic table elements, starting from the first release, Hydrogen. ODL has a six-month release cycle, and with many developers working on expanding the platform, two releases per year are expected from the community. For technical readers to understand it more clearly, the following diagram will help:

The ODL platform has a broad set of use cases for multivendor, brownfield and greenfield deployments, service providers, and enterprises, and is a foundation for networks of the future. Service providers are using ODL to migrate their services to a software-enabled level with automatic service delivery, moving away from a circuit-based mindset of service delivery. They also work on providing virtualized CPE with NFV support in order to provide flexible offerings. Enterprises use ODL for many use cases: from data center networking, cloud and NFV, network automation and resource optimization, to visibility, control, and deploying a fully SDN-enabled campus network.

ODL uses MD-SAL (a Model-Driven Service Abstraction Layer), which makes it very scalable and lets it incorporate new applications and protocols faster. ODL supports multiple standard and proprietary southbound protocols; for example, with full support for OpenFlow and OVSDB, ODL can communicate with any standard hardware (or even virtual switches such as Open vSwitch (OVS)) supporting such protocols. With such support, ODL can be deployed and used in multivendor environments, controlling hardware from different vendors from a single console, no matter what vendor and what device it is, as long as standard southbound protocols are supported.

ODL uses a micro-services architecture model, which allows users to control applications, protocols, and plugins while deploying SDN applications. ODL is also able to manage the connection between external consumers and providers.
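Since ODL's northbound interface is exposed over RESTCONF, even a few lines of scripting are enough to read the controller's view of the network. The following is a minimal sketch in PowerShell; the host name is a placeholder, and the port (8181), credentials (admin/admin), and the network-topology path assume an out-of-the-box ODL install, so verify them against your own deployment:

# Query OpenDaylight's northbound RESTCONF API and list discovered nodes.
$controller = "http://odl-controller:8181"   # hypothetical controller host
$pair       = "admin:admin"                  # default ODL credentials (assumed)
$headers    = @{ Authorization = "Basic " + [Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes($pair)) }

# Retrieve the operational topology (nodes and links known to ODL)
$topology = Invoke-RestMethod -Uri "$controller/restconf/operational/network-topology:network-topology" -Headers $headers -Method Get

foreach ($topo in $topology.'network-topology'.topology) {
    Write-Host "Topology:" $topo.'topology-id'
    foreach ($node in $topo.node) { Write-Host "  Node:" $node.'node-id' }
}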
The following diagram explains the ODL footprint and the different components and projects within ODL:

Micro-services architecture

ODL stores its YANG data structures in a common data store and uses messaging infrastructure between different components to enable a model-driven approach to describing the network and its functions. In the ODL MD-SAL, any SDN application can be integrated as a service and then loaded into the SDN controller. These services (apps) can be chained together in any number of ways to match the application's needs. This concept allows users to install and enable only the protocols and services they need, which makes the system light and efficient. Services and applications created by users can also be shared with others in the ecosystem, since SDN application deployment for ODL follows a modular design.

ODL supports multiple southbound protocols: OpenFlow and OpenFlow extensions such as Table Type Patterns (TTP), as well as other protocols including NETCONF, BGP/PCEP, CAPWAP, and OVSDB. ODL also supports the Cisco OpFlex protocol:

The ODL platform provides a framework for authentication, authorization, and accounting (AAA), as well as automatic discovery and securing of network devices and controllers. Another key area in security is the use of encrypted and authenticated communication through southbound protocols with the switches and routers within the SDN domain. Most southbound protocols support security encryption mechanisms.

Summary

In this article we learned about SDN and why it is important. We reviewed SDN controller products and the ODL history, as well as the core features of SDN controllers and the market-leading controllers. We managed to dive into some details about SDN.

Resources for Article:

Further resources on this subject:
- Managing Network Devices [article]
- Setting Up a Network Backup Server with Bacula [article]
- Point-to-Point Networks [article]

Oracle VM Management

Packt
16 Oct 2009
6 min read
Before we get to managing the VMs in Oracle VM Manager, let's take a quick look at the Oracle VM Manager by logging into it.

Getting started with Oracle VM Manager

In this article, we will perform the following actions while exploring the Oracle VM Manager:

- Registering an account
- Logging in to Oracle VM Manager
- Creating a Server Pool

After we are done with the Oracle VM Manager installation, we will use one of the following links to log on to the Oracle VM Manager:

- Within the local machine: http://127.0.0.1:8888/OVS
- Logging in remotely: http://vmmgr:8888/OVS

Here, vmmgr refers to the host name or IP address of your Oracle VM Manager host.

How to register an account

An account can be registered in several ways. If, during the installation of Oracle VM Manager, we chose to configure the default admin account "admin", then we can use this account directly to log on to Oracle's IntraCloud portal, the Oracle VM Manager. We will explain user accounts in detail later, along with why we need separate accounts with separate roles for fine-grained access control; something that is crucial for security purposes. So let's have a quick look at the three available options:

- Default installation: This option applies if we have performed the default installation ourselves and have gone ahead to create the account ourselves. Here we have the default administrator role.
- Request for account creation: Contacting the administrator of Oracle VM Manager is another way to obtain an account with privileges such as administrator, manager, or user.
- Create yourself: If we need to carry out the basic functions of a common user with the operator's role, such as creating and using virtual machines or importing resources, we can create a new account ourselves. However, we will need the administrator to assign the server pools and groups to our account before we can get started. Here, by default, we are granted a user role. We will talk more about roles later in this article.

Now let's go about registering a new account with Oracle VM Manager. Once on the Oracle VM Manager Login page, click on the Register link. We are presented with the following screen. We must enter a Username of our choice and a hard-to-crack password twice. Also, we have to fill in our First Name and Last Name and complete the registration with a valid email address. Click Next:

Next, we need to confirm our account details by clicking on the Confirm button. Now our account will be created and a confirmation message is displayed on the Oracle VM Manager Login screen. It should be noted that we will need some Server Pools and groups before we can get started. We will have to ask the administrator to assign us access to those pools and groups. It's time now to log in to our newly created account.

Logging in to Oracle VM Manager

Again, we will need to access the URL locally by typing http://127.0.0.1:8888/OVS, or remotely by typing http://hostname:8888/OVS. If we are accessing the Oracle VM Manager Portal remotely, replace "hostname" with either the FQDN (Fully Qualified Domain Name), if the machine is registered in our DNS, or just the host name of the VM Manager machine. We can log in to the portal by simply typing in the Username and Password that we just created. Depending on the role and the server pools that we have been assigned, the tabs shown in the following table will be displayed on the screen. To change the role, we will need to contact our enterprise domain administrator.
Only administrators are allowed to change the roles of accounts. If we forget our password, we can click on Forgot Password, and on submitting our account name, the password will be sent to the registered email address that we provided when we registered the account. The following table shows the tabs that are displayed for each Oracle VM Manager role:

Role | Granted tabs
User | Virtual Machines, Resources
Administrator | Virtual Machines, Resources, Servers, Server Pools, Administration
Manager | Virtual Machines, Resources, Servers, Server Pools

We can of course change the roles by editing the Profile (in the upper-right section of the portal). As can be seen in the following screenshot, we have access to the Virtual Machines pane and the Resources pane. We will continue to add Servers to the pool when logged in as admin.

Oracle VM management: Managing Server Pool

A Server Pool is logically an autonomous region that contains one or more physical servers, and the dynamic nature of such pools, and pools of pools, makes up what we call an infinite Cloud infrastructure. Currently, Oracle has its Cloud portal with Amazon, but it is very much viable to have an IntraCloud portal or private Cloud where we can run all sorts of Linux and Windows flavors on our Cloud backbone. It eventually rests on an array of SAN, NAS, or other next-generation storage substrate on which the VMs reside. We must ensure that we have the following prerequisites properly checked before creating the Virtual Machines on our IntraCloud Oracle VM:

- Oracle VM Servers: These are available to deploy as Utility Server, Server Pool Master, and Virtual Machine Servers.
- Repositories: Used for Live Migration or Hot Migration of the VMs, and for local storage on the Oracle VM Servers.
- FQDN/IP address of Oracle VM Servers: It is better to have the Oracle VM Servers known as OracleVM01.AVASTU.COM and OracleVM02.AVASTU.COM. This way you don't have to bother about IP changes or infrastructural relocation of the IntraCloud to another location.
- Oracle VM Agent passwords: Needed to access the Oracle VM Servers.

Let's now go about exploring the design process of the Oracle VM. Then we will do the following systematically:

- Creating the Server Pool
- Editing Server Pool information
- Search and retrieval within a Server Pool
- Restoring a Server Pool
- Enabling HA
- Deleting a Server Pool

However, we can carry out these actions only as a Manager or an Administrator. But first, let's take a look at the decisions on what type of Server Pools will suit us best, and what the architectural considerations could be around building your Oracle VM farm.

Working with PowerShell

Packt
07 Sep 2015
17 min read
In this article, you will cover:

- Retrieving system information – Configuration Service cmdlets
- Administering hosts and machines – Host and MachineCreation cmdlets
- Managing additional components – StoreFront Admin and Logging cmdlets

(For more resources related to this topic, see here.)

Introduction

With hundreds or thousands of hosts to configure and machines to deploy, configuring all the components manually would be difficult. As with previous XenDesktop releases, XenDesktop 7.6 ships with an integrated set of PowerShell modules. With these, IT technicians are able to reduce the time required to perform management tasks by creating PowerShell scripts, which can be used to deploy, manage, and troubleshoot most of the XenDesktop components at scale. Working with PowerShell instead of the XenDesktop GUI gives you more flexibility in terms of the kinds of operations you can execute, with a set of additional features to use during the infrastructure creation and configuration phases.

Retrieving system information – Configuration Service cmdlets

In this recipe, we will use and explain a general-purpose set of PowerShell cmdlets: the Configuration Service category. This is used to retrieve general configuration parameters, and to obtain information about the implementation of the XenDesktop Configuration Service.

Getting ready

No preliminary tasks are required. You have already installed the Citrix XenDesktop PowerShell SDK during the installation of the Desktop Controller role machine(s). To be able to run a PowerShell script (.ps1 format), you have to enable script execution from the PowerShell prompt in the following way:

Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Force

How to do it…

In this section, we will explain and execute the commands associated with the XenDesktop System and Services configuration area:

1. Connect to one of the Desktop Broker servers, by using a Remote Desktop connection, for instance.
2. Right-click on the PowerShell icon on the Windows taskbar and select the Run as Administrator option.
3. Load the Citrix PowerShell modules by typing the following command and then pressing the Enter key:

Asnp Citrix*

As an alternative to the Asnp command, you can use the Add-PSSnapin command.

4. Retrieve the active and configured Desktop Controller features by running the following command:

Get-ConfigEnabledFeature

5. To retrieve the current status of the Config Service, run the following command. The output will be OK in the absence of configuration issues:

Get-ConfigServiceStatus

6. To get the connection string used by the Configuration Service to connect to the XenDesktop database, run the following command:

Get-ConfigDBConnection

7. Starting from the previously received output, it's possible to configure the connection string to let the Configuration Service use the system DB. For this command, you have to specify the Server, Initial Catalog, and Integrated Security parameters:

Set-ConfigDBConnection –DBConnection "Server=<Servername\InstanceName>; Initial Catalog=<DatabaseName>; Integrated Security=<True | False>"

8. Starting from an existing Citrix database, you can generate a SQL procedure file to use as a backup to recreate the database.
Run the following command to complete this task, specifying the DatabaseName and ServiceGroupName parameters:

Get-ConfigDBSchema -DatabaseName <DatabaseName> -ServiceGroupName <ServiceGroupName> > Path:\FileName.sql

You need to configure a destination database with the same name as that of the source DB, otherwise the script will fail!

9. To retrieve information about the active Configuration Service objects (Instance, Service, and Service Group), run the following three commands respectively:

Get-ConfigRegisteredServiceInstance
Get-ConfigService
Get-ConfigServiceGroup

10. To test a set of operations that check the status of the Configuration Service, run the following script:

#------------ Script - Configuration Service
#------------ Define Variables
$Server_Conn="SqlDatabaseServer.xdseven.local\CITRIX,1434"
$Catalog_Conn="CitrixXD7-Site-First"
#------------
Write-Host "XenDesktop - Configuration Service CmdLets"
#---------- Clear the existing Configuration Service DB connection
$Clear = Set-ConfigDBConnection -DBConnection $null
Write-Host "Clearing any previous DB connection - Status: " $Clear
#---------- Set the Configuration Service DB connection string
$New_Conn = Set-ConfigDBConnection -DBConnection "Server=$Server_Conn; Initial Catalog=$Catalog_Conn; Integrated Security=$true"
Write-Host "Configuring the DB string connection - Status: " $New_Conn
$Configured_String = Get-ConfigDBConnection
Write-Host "The new configured DB connection string is: " $Configured_String

You have to save this script with the .ps1 extension, in order to invoke it with PowerShell. Be sure to change the parameters specific to your infrastructure, in order to be able to run the script. This is shown in the following screenshot:

How it works...

The Configuration Service cmdlets of XenDesktop PowerShell allow you to manage the Configuration Service and its related information: the metadata for the entire XenDesktop infrastructure, the service instances registered within the VDI architecture, and the collections of these services, called Service Groups. This set of commands offers the ability to retrieve and check the DB connection string used to contact the configured XenDesktop SQL Server database. These operations are performed with the Get-ConfigDBConnection command (to retrieve the current configuration) and the Set-ConfigDBConnection command (to configure the DB connection string); both commands use the DB server name with the instance name, the DB name, and Integrated Security as information fields.

In the attached script, we have regenerated a database connection string. To make sure we are able to recreate it, first of all we cleared any existing connection, setting it to null (see the command associated with the $Clear variable); then we defined the $New_Conn variable, using the Set-ConfigDBConnection command; all the parameters are defined at the top of the script, in the form of variables. Use the Write-Host command to echo results to the standard output.

There's more...

In some cases, you may need to retrieve the state of the registered services, in order to verify their availability. You can use the Test-ConfigServiceInstanceAvailability cmdlet, retrieving whether the service is responding or not, and its response time. Run the following example to test the use of this command:

Get-ConfigRegisteredServiceInstance | Test-ConfigServiceInstanceAvailability | more

Use the –ForceWaitForOneOfEachType parameter to stop the check for a service category when one of its services responds.
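As a variation on the previous script, the same cmdlets can be combined into a small self-healing check. The following is a minimal sketch, reusing the $Server_Conn and $Catalog_Conn example values from the script above; the comparison against the "OK" string assumes the textual service status shown earlier in this recipe, so adapt it if your SDK version returns a different object:

# Re-apply the Configuration Service DB connection only when the service
# reports a problem. Variable values are the same examples used above.
Asnp Citrix*

$Server_Conn  = "SqlDatabaseServer.xdseven.local\CITRIX,1434"
$Catalog_Conn = "CitrixXD7-Site-First"

$Status = Get-ConfigServiceStatus
if ("$Status" -ne "OK") {
    Write-Host "Configuration Service status is '$Status' - re-applying the DB connection string"
    Set-ConfigDBConnection -DBConnection "Server=$Server_Conn; Initial Catalog=$Catalog_Conn; Integrated Security=$true"
} else {
    Write-Host "Configuration Service is healthy"
}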
Administering hosts and machines – Host and MachineCreation cmdlets

In this recipe, we will describe how to create the connection between the hypervisor and the XenDesktop servers, and the way to generate machines to assign to end users, all by using Citrix PowerShell.

Getting ready

No preliminary tasks are required. You have already installed the Citrix XenDesktop PowerShell SDK during the installation of the Desktop Controller role machine(s). To be sure you are able to run a PowerShell script (the .ps1 format), you have to enable script execution from the PowerShell prompt in this way:

Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Force

How to do it…

In this section, we will discuss the PowerShell commands used to connect XenDesktop with the supported hypervisors, plus the creation of machines from the command line:

1. Connect to one of the Desktop Broker servers.
2. Click on the PowerShell icon on the Windows taskbar.
3. Load the Citrix PowerShell modules by typing the following command, and then press the Enter key:

Asnp Citrix*

4. To list the available hypervisor types, execute this task:

Get-HypHypervisorPlugin –AdminAddress <BrokerAddress>

5. To list the configured properties for the XenDesktop root-level location (XDHyp:), execute the following command:

Get-ChildItem XDHyp:\HostingUnits

Please refer to the PSPath, Storage, and PersonalvDiskStorage output fields to retrieve information on the storage configuration.

6. Execute the following cmdlet to add a storage resource to the XenDesktop Controller host:

Add-HypHostingUnitStorage –LiteralPath <HostPathLocation> -StoragePath <StoragePath> -StorageType <OSStorage|PersonalvDiskStorage> -AdminAddress <BrokerAddress>

7. To generate a snapshot for an existing VM, perform the following task:

New-HypVMSnapshot –LiteralPath <HostPathLocation> -SnapshotDescription <Description>

Use the Get-HypVMMacAddress -LiteralPath <HostPathLocation> command to list the MAC addresses of specified desktop VMs.

8. To provision machine instances starting from the Desktop base image template, run the following command:

New-ProvScheme –ProvisioningSchemeName <SchemeName> -HostingUnitName <HypervisorServer> -IdentityPoolName <PoolName> -MasterImageVM <BaseImageTemplatePath> -VMMemoryMB <MemoryAssigned> -VMCpuCount <NumberofCPU>

To specify the creation of instances with the Personal vDisk technology, use the following option: -UsePersonalVDiskStorage.

9. After the creation process, retrieve the provisioning scheme information by running the following command:

Get-ProvScheme –ProvisioningSchemeName <SchemeName>

10. To modify the resources assigned to desktop instances in a provisioning scheme, use the Set-ProvScheme cmdlet. The permitted parameters are –ProvisioningSchemeName, -VMCpuCount, and –VMMemoryMB.

11. To update the desktop instances to the latest version of the Desktop base image template, run the following cmdlet:

Publish-ProvMasterVmImage –ProvisioningSchemeName <SchemeName> -MasterImageVM <BaseImageTemplatePath>

If you do not want to keep the pre-update instance version to use as a restore checkpoint, use the –DoNotStoreOldImage option.

12. To create machine instances based on the previously configured provisioning scheme for an MCS architecture, run this command:

New-ProvVM –ProvisioningSchemeName <SchemeName> -ADAccountName "Domain\MachineAccount"

Use the -FastBuild option to make the machine creation process faster. On the other hand, you cannot start up the machines until the process has been completed.
13. Retrieve the configured desktop instances by using the next cmdlet:

Get-ProvVM –ProvisioningSchemeName <SchemeName> -VMName <MachineName>

14. To remove an existing virtual desktop, use the following command:

Remove-ProvVM –ProvisioningSchemeName <SchemeName> -VMName <MachineName> -AdminAddress <BrokerAddress>

15. The next script combines the use of some of the commands listed in this recipe:

#------------ Script - Hosting + MCS
#-----------------------------------
#------------ Define Variables
$LitPath = "XDHyp:\HostingUnits\VMware01"
$StorPath = "XDHyp:\HostingUnits\VMware01\datastore1.storage"
$Controller_Address="192.168.110.30"
$HostUnitName = "VMware01"
$IDPool = $(Get-AcctIdentityPool -IdentityPoolName VDI-DESKTOP)
$BaseVMPath = "XDHyp:\HostingUnits\VMware01\XD7-W8MCS-01.vm"
#------------ Creating a storage location
Add-HypHostingUnitStorage –LiteralPath $LitPath -StoragePath $StorPath -StorageType OSStorage -AdminAddress $Controller_Address
#---------- Creating a Provisioning Scheme
New-ProvScheme –ProvisioningSchemeName Deploy_01 -HostingUnitName $HostUnitName -IdentityPoolName $IDPool.IdentityPoolName -MasterImageVM "$BaseVMPath\T0-Post.snapshot" -VMMemoryMB 4096 -VMCpuCount 2 -CleanOnBoot
#---------- List the VMs configured on the Hypervisor Host
dir $LitPath\*.vm
exit

How it works...

The Host and MachineCreation cmdlet groups manage the interfacing with the hypervisor hosts, in terms of machines and storage resources. This allows you to create the desktop instances to assign to end users, starting from an existing and mapped Desktop virtual machine. The Get-HypHypervisorPlugin command retrieves and lists the available hypervisors to use to deploy virtual desktops and to configure the storage types. You need to configure an operating system storage area or a Personal vDisk storage zone. The way to map an existing storage location from the hypervisor to the XenDesktop Controller is by running the Add-HypHostingUnitStorage cmdlet. In this case you have to specify the destination path on which the storage object will be created (LiteralPath), the source storage path on the hypervisor machine(s) (StoragePath), and the StorageType previously discussed. The storage types are in the form XDHyp:\HostingUnits\<UnitName>. To list all the configured storage objects, execute the following command:

dir XDHyp:\HostingUnits\<UnitName>\*.storage

After configuring the storage area, we discussed the Machine Creation Services (MCS) architecture. In this cmdlet collection, we have commands to generate VM snapshots from which to deploy desktop instances (New-HypVMSnapshot), specifying a name and a description for the generated disk snapshot. Starting from the available disk image, the New-ProvScheme command lets you create a resource provisioning scheme, on which you specify the desktop base image and the resources to assign to the desktop instances (in terms of CPU and RAM: -VMCpuCount and –VMMemoryMB), whether to generate these virtual desktops in non-persistent mode (the -CleanOnBoot option), and whether to use the Personal vDisk technology (-UsePersonalVDiskStorage). It's possible to update the deployed instances to the latest base image update through the use of the Publish-ProvMasterVmImage command.
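Putting New-ProvScheme and New-ProvVM together, scaling out is usually just a loop over machine accounts. The following is a hedged sketch of batch provisioning; the scheme name, domain, and account naming pattern are invented for the example, and it assumes the AD machine accounts already exist (for instance, pre-created with the XenDesktop Acct cmdlets):

# Provision five MCS desktops against an existing provisioning scheme.
# All names below are illustrative placeholders.
Asnp Citrix*

$Scheme = "Deploy_01"
1..5 | ForEach-Object {
    # Hypothetical pre-created AD computer accounts: VDI-01$ .. VDI-05$
    $Account = "XDSEVEN\VDI-{0:D2}$" -f $_
    Write-Host "Creating desktop instance for account $Account"
    New-ProvVM -ProvisioningSchemeName $Scheme -ADAccountName $Account
}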
Returning to the script generated in step 15, we located all the main storage paths (the $LitPath and $StorPath variables) needed to realize a provisioning scheme; then we implemented a provisioning procedure for a desktop based on an existing base image snapshot, with two vCPUs and 4GB of RAM for the delivered instances, which will be cleaned every time they stop and start (by using the -CleanOnBoot option). You can navigate the local and remote storage paths configured with the XenDesktop Broker machine; to list an object category (such as VM or Snapshot), you can execute this command:

dir XDHyp:\HostingUnits\<UnitName>\*.<category>

There's more...

The discussed cmdlets also offer a technique to protect a virtual desktop from accidental deletion or unauthorized use. The Machine Creation cmdlets group includes a particular command that allows you to lock critical desktops: Lock-ProvVM. This cmdlet requires as parameters the name of the scheme to which the desktop refers (-ProvisioningSchemeName) and the ID of the virtual desktop to lock (-VMID). You can retrieve the virtual machine ID by running the Get-ProvVM command discussed previously. To revert the machine lock, and free the desktop instance from accidental deletion or improper use, you have to execute the Unlock-ProvVM cmdlet, using the same parameters shown for the lock procedure.

Managing additional components – StoreFront Admin and Logging cmdlets

In this recipe, we will explain how to manage and configure the StoreFront component by using the available Citrix PowerShell cmdlets. Moreover, we will explain how to manage and check the configuration of the system logging activities.

Getting ready

No preliminary tasks are required. You have already installed the Citrix XenDesktop PowerShell SDK during the installation of the Desktop Controller role machine(s). To be able to run a PowerShell script (in the .ps1 format), you have to enable script execution from the PowerShell prompt in this way:

Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Force

How to do it…

In this section, we will explain and execute the commands associated with the Citrix StoreFront system:

1. Connect to one of the Desktop Broker servers.
2. Click on the PowerShell icon on the Windows taskbar.
3. Load the Citrix PowerShell modules by typing the following command, and then press the Enter key:

Asnp Citrix*

To execute a command, you have to press the Enter key after completing the correct command syntax.

4. Retrieve the currently existing StoreFront service instances by running the following command:

Get-SfService

To limit the number of rows in the output, you can add the –MaxRecordCount <value> parameter.

5. To list detailed information about the StoreFront service(s) currently configured, execute the following command:

Get-SfServiceInstance –AdminAddress <ControllerAddress>

The status of the currently active StoreFront instances can be retrieved by using the Get-SfServiceStatus command. The OK output will confirm correct service execution.

6. To list the task history associated with the configured StoreFront instances, you have to run the following command:

Get-SfTask

You can filter for the ID of the desired task (-taskid) and sort the results by using the –sortby parameter.
7. To retrieve the installed database schema versions, you can execute the following command:

Get-SfInstalledDBVersion

By applying the –Upgrade and –Downgrade filters, you will receive, respectively, the schemas for which the database version can be upgraded or reverted to a previous compatible one.

8. To modify the StoreFront configuration to register its state on a different database, you can use the following command:

Set-SfDBConnection -DBConnection <DBConnectionString> -AdminAddress <ControllerAddress>

Be careful when you specify the database connection string; if not specified, the existing database connections and configurations will be cleared!

9. To check that the database connection has been correctly configured, the following command is available:

Test-SfDBConnection -DBConnection <DBConnectionString> -AdminAddress <ControllerAddress>

10. Moving on to the second discussed group, the Logging cmdlets, you can retrieve the current status of the logging service by running the following command:

Get-LogServiceStatus

11. To verify the language used and whether the logging service has been enabled, run the following command:

Get-LogSite

The available configurable locales are en, ja, zh-CN, de, es, and fr. The available states are Enabled, Disabled, NotSupported, and Mandatory. The NotSupported state indicates an incorrect configuration of the listed parameters.

12. To retrieve detailed information about the running logging service, you have to use the following command:

Get-LogService

As discussed earlier for the StoreFront commands, you can filter the output by applying the –MaxRecordCount <value> parameter.

13. In order to get all the operations logged within a specified time range, run the following command; this will return the global operations count:

Get-LogSummary –StartDateRange <StartDate> -EndDateRange <EndDate>

The date format must be the following: YYYY-MM-DD HH:MM:SS.

14. To list the collected operations per day in the specified time period, run the previous command in the following way:

Get-LogSummary –StartDateRange <StartDate> -EndDateRange <EndDate> -intervalSeconds 86400

The value 86400 is the number of seconds in a day.

15. To retrieve the connection string information for the database on which logging data is stored, execute the following command:

Get-LogDataStore

16. To retrieve detailed information about the high level operations performed on the XenDesktop infrastructure, you have to run the following command:

Get-LogHighLevelOperation –Text <Text included in the Operation> -StartTime <Formatted Date and Time> -EndTime <Formatted Date and Time> -IsSuccessful <true | false> -User <Domain\UserName> -OperationType <AdminActivity | ConfigurationChange>

The indicated filters are not mandatory. If you do not apply any filters, all the logged operations will be returned. This could be a very long output.

17. The same information can be retrieved for the low level system operations in the following way:

Get-LogLowLevelOperation -StartTime <Formatted Date and Time> -EndTime <Formatted Date and Time> -IsSuccessful <true | false> -User <Domain\UserName> -OperationType <AdminActivity | ConfigurationChange>

In the How it works section, we will explain the difference between the high and low level operations.
18. To log when a high level operation starts and stops, respectively, use the following two commands:

Start-LogHighLevelOperation –Text <Operation Description Text> -Source <Operation Source> -StartTime <Formatted Date and Time> -OperationType <AdminActivity | ConfigurationChange>
Stop-LogHighLevelOperation –HighLevelOperationId <OperationID> -IsSuccessful <true | false>

The Stop-LogHighLevelOperation must refer to an existing started high level operation, because they are related tasks.

How it works...

Here, we have discussed two more PowerShell command collections for the XenDesktop 7 versions: the cmdlets related to the StoreFront platform, and the Logging set of commands. The first collection is quite limited in terms of operations compared with the other discussed cmdlets. In fact, the only actions permitted with the StoreFront PowerShell set of commands are retrieving configurations and settings about the configured stores and the linked database. More activities can be performed regarding the modification of existing StoreFront clusters, by using the Get-SfCluster, Add-SfServerToCluster, New-SfCluster, and Set-SfCluster set of operations.

More interesting is the PowerShell Logging collection. In this case, you can retrieve all the system-logged data, which falls into two principal categories:

- High-level operations: These tasks group all the system configuration changes that are executed by using Desktop Studio, Desktop Director, or Citrix PowerShell.
- Low-level operations: This category relates to all the system configuration changes that are executed by a service, and not by using the system software's consoles.

With the low level operations command, you can filter for the specific high level operation to which the low level operation refers, by specifying the -HighLevelOperationId parameter. This cmdlet category also gives you the ability to track the start and stop of a high level operation, by using Start-LogHighLevelOperation and Stop-LogHighLevelOperation. In the second case, you have to specify the previously started high level operation.

There's more...

If there is too much information in the log store, you have the ability to clear all of it. To purge all the log entries, we use the following command:

Remove-LogOperation -UserName <DBAdministrativeCredentials> -Password <DBUserPassword> -StartDateRange <StartDate> -EndDateRange <EndDate>

The unencrypted –Password parameter can be replaced by –SecurePassword, which takes the password in secure string form. The credentials must be database administrative credentials, with delete permissions on the destination database. This is a non-reversible operation, so make sure that you want to delete the logs in the specified time range, or verify that you have some form of data backup.

Resources for Article:

Further resources on this subject:
- Working with Virtual Machines [article]
- Virtualization [article]
- Upgrading VMware Virtual Infrastructure Setups [article]

NSX Core Components

Packt
05 Jan 2016
16 min read
In this article by Ranjit Singh Thakurratan, the author of the book Learning VMware NSX, we discuss some of the core components of NSX. The article begins with a brief introduction to the NSX core components, followed by a detailed discussion of each. We will go over the three different control planes and see how each of the NSX core components fits into this architecture. Next, we will cover the VXLAN architecture and the transport zones that allow us to create and extend overlay networks across multiple clusters. We will also look at NSX Edge and the distributed firewall in greater detail, and take a look at the newest NSX feature of multi-vCenter, or cross-vCenter, NSX deployment. By the end of this article, you will have a thorough understanding of the NSX core components and their functional inter-dependencies.

In this article, we will cover the following topics:

- An introduction to the NSX core components
- NSX Manager
- NSX Controller clusters
- VXLAN architecture overview
- Transport zones
- NSX Edge
- Distributed firewall
- Cross-vCenter NSX

(For more resources related to this topic, see here.)

An introduction to the NSX core components

The foundational core components of NSX are divided across three different planes. The core components of an NSX deployment consist of NSX Manager, Controller clusters, and hypervisor kernel modules. Each of these is crucial for your NSX deployment; however, they are decoupled to a certain extent to allow resiliency during the failure of multiple components. For example, if your Controller clusters fail, your virtual machines will still be able to communicate with each other without any network disruption. You have to ensure that the NSX components are always deployed in a clustered environment so that they are protected by vSphere HA.

The high-level architecture of NSX primarily describes three different planes in which each of the core components fits: the Management plane, the Control plane, and the Data plane. The following figure represents how the three planes are interlinked with each other. The management plane is how an end user interacts with NSX as a centralized access point, while the data plane carries north-south and east-west traffic.

Let's look at some of the important components in the preceding figure:

- Management plane: The management plane primarily consists of NSX Manager, a centralized network management component that provides a single management point. It also provides the REST API that a user can use to perform all the NSX functions and actions. During the deployment phase, the management plane is established when the NSX appliance is deployed and configured. The management plane directly interacts with the control plane and also with the data plane. NSX Manager is itself managed via the vSphere web client and the CLI. NSX Manager is configured to interact with vSphere and ESXi, and once configured, all of the NSX components are then configured and managed via the vSphere web GUI.
- Control plane: The control plane consists of the NSX Controllers, which manage the state of virtual networks. NSX Controllers also enable overlay networks (VXLAN), which are multicast-free, making it easier to create new VXLAN networks without having to enable multicast functionality on physical switches. The Controllers also keep track of all the information about the virtual machines, hosts, and VXLAN networks, and can perform ARP suppression as well.
No data passes through the control plane, and a loss of the Controllers does not affect network functionality between virtual machines.

Overlay networks and VXLANs can be used interchangeably. They both represent L2 over L3 virtual networks.

- Data plane: The NSX data plane primarily consists of the NSX logical switch. The NSX logical switch is a part of the vSphere distributed switch and is created when a VXLAN network is created. The logical switch and other NSX services, such as logical routing and the logical firewall, are enabled at the hypervisor kernel level after the installation of the hypervisor kernel modules (VIBs). This logical switch is the key to enabling overlay networks that can encapsulate and send traffic over existing physical networks. It also allows gateway devices that enable L2 bridging between virtual and physical workloads. The data plane receives its updates from the control plane, as hypervisors maintain local virtual machine and VXLAN (logical switch) mapping tables. A loss of the data plane will cause a loss of the overlay (VXLAN) networks, as virtual machines that are part of an NSX logical switch will not be able to send and receive data.

NSX Manager

NSX Manager, once deployed and configured, can deploy the Controller cluster appliances and prepare the ESXi hosts, which involves installing the various vSphere installation bundles (VIBs) that enable network virtualization features such as VXLAN, logical switching, the logical firewall, and logical routing. NSX Manager can also deploy and configure Edge gateway appliances and their services. The NSX version as of this writing is 6.2, which only supports 1:1 vCenter connectivity. NSX Manager is deployed as a single virtual machine and relies on VMware's HA functionality to ensure its availability. There is no NSX Manager clustering available as of this writing. It is important to note that a loss of NSX Manager will lead to a loss of management and API access, but will not disrupt virtual machine connectivity. Finally, NSX Manager's configuration UI allows an administrator to collect log bundles and also to back up the NSX configuration.

NSX Controller clusters

The NSX Controllers provide control plane functionality, distributing logical routing and VXLAN network information to the underlying hypervisors. Controllers are deployed as virtual appliances, and they should be deployed in the same vCenter to which NSX Manager is connected. In a production environment, it is recommended to deploy a minimum of three Controllers. For better availability and scalability, we need to ensure that DRS anti-affinity rules are configured to deploy the Controllers on separate ESXi hosts. The control plane traffic to the management and data planes is secured by certificate-based authentication. It is important to note that Controller nodes employ a scale-out mechanism, where each Controller node uses a slicing mechanism that divides the workload equally across all the nodes. This renders all the Controller nodes active at all times. If one Controller node fails, then the other nodes are reassigned the tasks that were owned by the failed node, to ensure operational status. VMware NSX Controllers use a Paxos-based algorithm within the NSX Controller cluster. The Controllers remove the dependency on multicast routing/PIM in the physical network and also suppress broadcast traffic in VXLAN networks. NSX version 6.2 only supports three Controller nodes.

VXLAN architecture overview

One of the most important functions of NSX is enabling virtual networks.
These virtual networks, or overlay networks, have become very popular because they can leverage existing network infrastructure without the need to modify it in any way. The decoupling of logical networks from the physical infrastructure allows users to scale rapidly. Overlay networks, or VXLAN, were developed by a host of vendors that include Arista, Cisco, Citrix, Red Hat, and Broadcom. Because of this joint development effort, the VXLAN standard can be implemented by multiple vendors.

VXLAN is a layer 2 over layer 3 tunneling protocol that allows logical network segments to extend over routable networks. This is achieved by encapsulating the Ethernet frame with additional UDP, IP, and VXLAN headers. Consequently, this increases the size of the packet by 50 bytes. Hence, VMware recommends increasing the MTU size to a minimum of 1600 bytes for all the interfaces in the physical infrastructure and any associated vSwitches.

When a virtual machine generates traffic meant for another virtual machine on the same virtual network, the hosts on which these source and destination virtual machines run are called VXLAN Tunnel End Points (VTEPs). VTEPs are configured as separate VMkernel interfaces on the hosts. The outer IP header block in the VXLAN frame contains the source and destination IP addresses, which identify the source and destination hypervisors. When a packet leaves the source virtual machine, it is encapsulated at the source hypervisor and sent to the target hypervisor. The target hypervisor, upon receiving this packet, decapsulates the Ethernet frame and forwards it to the destination virtual machine. Once an ESXi host is prepared from NSX Manager, we need to configure its VTEP. NSX supports multiple VXLAN vmknics per host for uplink load balancing, and guest VLAN tagging is also supported.

A sample packet flow

We face a challenging situation when a virtual machine generates traffic (Broadcast, Unknown Unicast, or Multicast, collectively BUM traffic) meant for another virtual machine on the same virtual network (VNI) on a different host. Control plane modes play a crucial role in how the VXLAN traffic is replicated, depending on the mode selected for the logical switch/transport scope:

- Unicast
- Hybrid
- Multicast

By default, a logical switch inherits its replication mode from the transport zone. However, we can also set this on a per-logical-switch basis. A segment ID is needed for the Multicast and Hybrid modes.

The following is a representation of the VXLAN-encapsulated packet showing the VXLAN headers:

As indicated in the preceding figure, the outer IP header identifies the source and destination VTEPs. The VXLAN header also includes the Virtual Network Identifier (VNI), a 24-bit unique network identifier. This allows the scaling of virtual networks beyond the 4094-VLAN limitation of physical switches. Two virtual machines that are part of the same virtual network will have the same virtual network identifier, similar to how two machines on the same VLAN share the same VLAN ID.
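Because every interface in the transport path has to carry the extra 50 bytes of encapsulation, a quick MTU audit is worth doing before preparing hosts. The following is a small sketch using VMware PowerCLI (it assumes PowerCLI is installed and an active Connect-VIServer session is open); the 1600-byte threshold comes from the recommendation above:

$minMtu = 1600   # minimum recommended MTU for VXLAN transport

# Flag every VMkernel adapter whose MTU is below the recommended minimum
foreach ($vmhost in Get-VMHost) {
    Get-VMHostNetworkAdapter -VMHost $vmhost -VMKernel |
        Where-Object { $_.Mtu -lt $minMtu } |
        ForEach-Object {
            Write-Host ("{0} - {1}: MTU {2} (below {3})" -f $vmhost.Name, $_.Name, $_.Mtu, $minMtu)
        }
}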
Transport zones

A group of ESXi hosts that are able to communicate with one another over the physical network by means of VTEPs are said to be in the same transport zone. A transport zone defines the extension of a logical switch across multiple ESXi clusters, which can span multiple virtual distributed switches. A typical environment has more than one virtual distributed switch spanning multiple hosts. A transport zone enables a logical switch to extend across multiple virtual distributed switches, and any ESXi host that is a part of this transport zone can have virtual machines as a part of that logical network. A logical switch is always created as part of a transport zone, and ESXi hosts can participate in that zone. The following figure shows a transport zone that defines the extension of a logical switch across multiple virtual distributed switches:

NSX Edge Services Gateway

The NSX Edge Services Gateway (ESG) offers a feature-rich set of services that include NAT, routing, firewall, load balancing, L2/L3 VPN, and DHCP/DNS relay. The NSX API allows each of these services to be deployed, configured, and consumed on demand. The ESG is deployed as a virtual machine from NSX Manager, which is accessed using the vSphere web client. Four different form factors are offered for differently sized environments, and it is important that you factor in enough resources for the appropriate ESG when building your environment. The following are the available size options for an ESG appliance:

- X-Large: The X-Large form factor is suitable for high performance firewall, load balancer, and routing deployments, or a combination of multiple services. When the X-Large form factor is selected, the ESG is deployed with six vCPUs and 8GB of RAM.
- Quad-Large: The Quad-Large form factor is ideal for a high performance firewall. It is deployed with four vCPUs and 1GB of RAM.
- Large: The Large form factor is suitable for medium performance routing and firewall deployments. It is recommended that, in production, you start with the Large form factor. The Large ESG is deployed with two vCPUs and 1GB of RAM.
- Compact: The Compact form factor is suitable for DHCP and DNS relay functions. It is deployed with one vCPU and 512MB of RAM.

Once deployed, a form factor can be upgraded by using the API or the UI. The upgrade action will incur an outage. Edge gateway services can also be deployed in an Active/Standby mode to ensure high availability and resiliency. A heartbeat network between the Edge appliances ensures state replication and uptime. If the active gateway goes down and the "declared dead time" passes, the standby Edge appliance takes over. The default declared dead time is 15 seconds and can be reduced to 6 seconds.

Let's look at some of the Edge services:

- Network Address Translation: The NSX Edge supports both source and destination NAT, and NAT is allowed for all traffic flowing through the Edge appliance. If the Edge appliance supports more than 100 virtual machines, it is recommended that a Quad-Large instance be deployed to allow high performance translation.
- Routing: The NSX Edge allows centralized routing, so that the logical networks deployed in the NSX domain can be routed to the external physical network. The Edge supports multiple routing protocols, including OSPF, iBGP, and eBGP, and also supports static routing.
- Load balancing: The NSX Edge also offers load balancing functionality for traffic between virtual machines. The load balancer supports different balancing mechanisms, including IP hash, least connections, URI-based, and round robin.
- Firewall: NSX Edge provides stateful firewall functionality that is ideal for north-south traffic flowing between the physical and virtual workloads behind the Edge gateway.
The Edge firewall can be deployed alongside the hypervisor kernel-based distributed firewall, which is primarily used to enforce security policies between workloads on the same logical network.

- L2/L3 VPN: The Edge also provides L2 and L3 VPNs, which make it possible to extend L2 domains between two sites. An IPsec site-to-site connection between two NSX Edges or other VPN termination devices can also be set up.
- DHCP/DNS relay: NSX Edge also offers DHCP and DNS relay functions, allowing you to offload these services to the Edge gateway. The Edge only supports DNS relay functionality and can forward any DNS requests to the DNS server. The Edge gateway can be configured as a DHCP server to provide and manage IP addresses, default gateways, DNS servers, and search domain information for workloads connected to the logical networks.

Distributed firewall

NSX provides L2–L4 stateful firewall services by means of a distributed firewall that runs in the ESXi hypervisor kernel. Because the firewall is a function of the ESXi kernel, it provides massive throughput and performs at near line rate. When the ESXi host is initially prepared by NSX, the distributed firewall service is installed in the kernel by deploying the kernel VIB: the VMware Internetworking Service Insertion Platform, or VSIP. VSIP is responsible for monitoring and enforcing security policies on all the traffic flowing through the data plane. The distributed firewall (DFW) throughput and performance scale horizontally as more ESXi hosts are added.

DFW instances are associated with each vNIC, and every vNIC requires one DFW instance. A virtual machine with two vNICs has two DFW instances associated with it, each monitoring its own vNIC and applying security policies to it. DFW is ideally deployed to protect virtual-to-virtual or virtual-to-physical traffic, which makes it very effective at protecting east-west traffic between workloads that are part of the same logical network. DFW policies can also be used to restrict traffic between virtual machines and external networks, because they are applied at the vNIC of the virtual machine. Any virtual machine that does not require firewall protection can be added to the exclusion list. A diagrammatic representation is shown as follows:

DFW fully supports vMotion, and the rules applied to a virtual machine always follow the virtual machine. This means any manual or automated vMotion triggered by DRS does not cause any disruption in its protection status. The VSIP kernel module also adds spoofguard and traffic redirection functionality. The spoofguard function maintains a VM name and IP address mapping table and protects against IP spoofing. Spoofguard is disabled by default and needs to be manually enabled per logical switch or virtual distributed switch port group. Traffic redirection allows traffic to be redirected to a third-party appliance for enhanced monitoring, if needed. This allows third-party vendors to interface with DFW directly and offer custom services as required.
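Like everything else in NSX, the DFW rule set is also exposed through NSX Manager's REST API, which is handy for auditing rules outside the vSphere web client. The following is a minimal PowerShell sketch; the manager host name is a placeholder, the /api/4.0/firewall/globalroot-0/config path is the NSX-v 6.x DFW configuration endpoint, and the XML element names reflect that version of the API, so verify both against your NSX release:

# Read the distributed firewall configuration and list section/rule names.
# Assumes the NSX Manager certificate is trusted; Basic auth keeps the
# example short - harden both for real use.
$nsxManager = "https://nsx-manager.corp.local"   # hypothetical NSX Manager
$cred = Get-Credential                           # NSX admin credentials
$pair = "{0}:{1}" -f $cred.UserName, $cred.GetNetworkCredential().Password
$headers = @{ Authorization = "Basic " + [Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes($pair)) }

$dfw = Invoke-RestMethod -Uri "$nsxManager/api/4.0/firewall/globalroot-0/config" -Headers $headers -Method Get

foreach ($section in $dfw.firewallConfiguration.layer3Sections.section) {
    Write-Host "Section:" $section.name
    foreach ($rule in $section.rule) { Write-Host "  Rule:" $rule.name }
}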
Cross-vCenter NSX

With NSX 6.2, VMware introduced an interesting feature that allows you to manage multiple vCenter NSX environments using a primary NSX Manager. This allows easy management and also enables lots of new functionality, including extended networks and features such as distributed logical routing. A cross-vCenter NSX deployment also allows centralized management and eases disaster recovery architectures.

In a cross-vCenter deployment, there are multiple vCenters, each paired with its own NSX Manager. One NSX Manager is assigned as the primary, while the other NSX Managers become secondary. The primary NSX Manager can then deploy a universal Controller cluster that provides the control plane. Unlike in a standalone vCenter-NSX deployment, secondary NSX Managers do not deploy their own Controller clusters.

The primary NSX Manager also creates objects whose scope is universal, meaning that these objects extend to all the secondary NSX Managers. These universal objects are synchronized across all the secondary NSX Managers and can be edited and changed by the primary NSX Manager only. This does not prevent you from creating local objects on each of the NSX Managers. Similar to local NSX objects, the primary NSX Manager can create global objects such as universal transport zones, universal logical switches, universal distributed routers, universal firewall rules, and universal security objects. There can be only one universal transport zone in a cross-vCenter NSX environment. After it is created, it is synchronized across all the secondary NSX Managers. When a logical switch is created inside a universal transport zone, it becomes a universal logical switch that spans a layer 2 network across all the vCenters. All traffic is routed using the universal logical router, and any traffic that needs to be routed between a universal logical switch and a logical switch (local scope) requires an ESG.

Summary

We began the article with a brief introduction to the NSX core components, looking at the management, control, and data planes. We then discussed NSX Manager and the NSX Controller clusters. This was followed by a VXLAN architecture overview, where we looked at the VXLAN packet. We then discussed transport zones and NSX Edge gateway services. We ended the article with NSX distributed firewall services and an overview of cross-vCenter NSX deployment.

Resources for Article:

Further resources on this subject:
- vRealize Automation and the Deconstruction of Components [article]
- Monitoring and Troubleshooting Networking [article]
- Managing Pools for Desktops [article]

Chaos Conf 2018 Recap: Chaos engineering hits maturity as community moves towards controlled experimentation

Richard Gall
12 Oct 2018
11 min read
Conferences can sometimes be confusing. Even at the most professional and well-planned conferences, you sometimes just take a minute and think: what's actually the point of this? Am I learning anything? Am I meant to be networking? Will anyone notice if I steal extra food for the journey home?

Chaos Conf 2018 was different, however. It had a clear purpose: to take the first step in properly forging a chaos engineering community. After almost a decade somewhat hidden in the corners of particularly innovative teams at Netflix and Amazon, chaos engineering might feel that its time has come. As software infrastructure becomes more complex and less monolithic, and as business and consumer demands expect more of the software systems that have become integral to the very functioning of life, resiliency has never been more important but more challenging to achieve.

But while it feels like the right time for chaos engineering, it hasn't quite established itself in the mainstream. This is something the conference host, Gremlin, a platform that offers chaos engineering as a service, is acutely aware of. On the one hand it's actively helping push chaos engineering into the hands of businesses, but on the other, its growth and success, backed by millions in VC cash (and faith), depend upon chaos engineering becoming a mainstream discipline in the DevOps and SRE worlds.

It's perhaps for this reason that the conference felt so important. It was, according to Gremlin, the first ever public chaos engineering conference. And while it was relatively small in the grand scheme of many of today's festival-esque conferences attended by thousands of delegates (Dreamforce, the Salesforce conference, was also running in San Francisco in the same week), the fact that the conference had quickly sold out all 350 of its tickets, with more hopefuls on waiting lists, indicates that this was an event that had been eagerly awaited. And with some big names from the industry, notably Adrian Cockcroft from AWS and Jessie Frazelle from Microsoft, Chaos Conf had the air of an event that had outgrown its insider status before it had even begun. The renovated cinema and bar in San Francisco's Mission District, complete with pinball machines upstairs, was the perfect container for a passionate community that had grown out of the clean corporate environs of Silicon Valley to embrace the chaotic mess that resembles modern software engineering.

Kolton Andrus sets out a vision for the future of Gremlin and chaos engineering

Chaos Conf was quick to deliver big news. The keynote speech, by Gremlin co-founder Kolton Andrus, launched Gremlin's brand new Application Level Fault Injection (ALFI) feature, which makes it possible to run chaos experiments at the application level. Andrus broke the news by building towards it with a story of the progression of chaos engineering. Starting with Chaos Monkey, the tool first developed by Netflix, and moving from infrastructure to network, he showed how, as chaos engineering has evolved, it requires and facilitates different levels of control and insight into how your software works.

"As we're maturing, the host level failures and the network level failures are necessary to building a robust and resilient system, but not sufficient. We need more - we need a finer granularity," Andrus explained. This is where ALFI comes in. By allowing Gremlin users to inject failure at the application level, it gives them more control over the 'blast radius' of their chaos experiments.
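To make the idea concrete without standing in for Gremlin's actual ALFI implementation, here is a toy sketch of application-level fault injection: a wrapper that disrupts a controlled fraction of calls, which is essentially what limiting the 'blast radius' means. Every name and the 10% failure rate are invented for the example:

function Invoke-WithChaos {
    param(
        [scriptblock]$Action,         # the real application call
        [double]$FailureRate = 0.1,   # fraction of calls to disrupt (the blast radius)
        [int]$LatencyMs = 500         # injected delay before a simulated failure
    )
    if ((Get-Random -Minimum 0.0 -Maximum 1.0) -lt $FailureRate) {
        Start-Sleep -Milliseconds $LatencyMs    # simulate a slow dependency
        throw "Injected fault: simulated dependency failure"
    }
    & $Action
}

# Roughly 1 in 10 of these calls is delayed and then fails
try   { Invoke-WithChaos -Action { Write-Host "calling the payment service..." } }
catch { Write-Host "Caught: $_" }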
The narrative Andrus was setting was clear, and would ultimately inform the ethos of the day - chaos engineering isn't just about chaos, it's about controlled experimentation to ensure resiliency. Doing that requires a level of intelligence - technical and organizational - about how the various components of your software work, and how humans interact with them.

Adrian Cockcroft on the importance of historical context and domain knowledge

Adrian Cockcroft (@adrianco), VP at AWS, followed Andrus' talk. He took the opportunity to set the broader context of chaos engineering, highlighting how tackling system failures is often a question of culture - how we approach system failure and think about our software. "Developers love to learn things from first principles," he said. "But some historical context and domain knowledge can help illuminate the path and obstacles." If this sounds like Cockcroft was about to stray into theoretical territory, he certainly didn't: he offered a taxonomy of failure that provides a practical framework for thinking through potential failure at every level. He also touched on how he sees the future of resiliency evolving, focusing on:

Observability of systems
Epidemic failure modes
Automation and continuous chaos

The crucial point Cockcroft made is that cloud is the big driver for chaos engineering. "As datacenters migrate to the cloud, fragile and manual disaster recovery will be replaced by chaos engineering" read one of his slides. But more than that, the cloud also paves the way for the future of the discipline, one where 'chaos' is simply an automated part of the test and deployment pipeline.

Selling chaos engineering to your boss

Kriss Rochefolle, DevOps engineer and author of one of the best-selling DevOps books in French, delivered a short talk on how engineers can sell chaos to their boss. He challenged the assumption that a rational proposal, informed by ROI, is the best way to sell chaos engineering. He suggested instead that engineers need to play into emotions, presenting chaos engineering as a method for tackling and minimizing the fear of (inevitable) failure. Follow Kriss on Twitter: @crochefolle

Walmart and chaos engineering

Vilas Veraraghavan, Walmart's Director of Engineering, was keen to clarify that Walmart doesn't practice chaos. Rather, it practices resiliency - chaos engineering is simply a method the organization uses to achieve that. It was particularly useful to note the entire process that Vilas' team adopts when it comes to resiliency, which has largely developed out of Vilas' own work building his team from scratch. You can learn more about how Walmart is using chaos engineering for software resiliency in this post.

Twitter's Ronnie Chen on diving and planning for failure

Ronnie Chen (@rondoftw) is an engineering manager at Twitter. But she didn't talk about Twitter. In fact, she didn't even talk about engineering. Instead she spoke about her experience as a technical diver. By talking about her experiences, Ronnie was able to make a number of vital points about how to manage and tackle failure as a team. With mortality rates so high in diving, it's a good example of the relationship between complexity and risk. Chen made the point that things don't fail because of a single catalyst. Instead, failures - particularly fatal ones - happen because of a 'failure cascade'.
Chen never made the link explicit, but the comparison is clear - the ultimate outcome (i.e. success or failure) is impacted by a whole range of situational and behavioral factors that we can't afford to ignore. Chen also made the point that, in diving, inexperienced people should be at the front of an expedition. "If your inexperienced people are leading, they're learning and growing, and being able to operate with a safety net... when you do this, all kinds of hidden dependencies reveal themselves... every undocumented assumption, every piece of ancient team lore that you didn't even know you were relying on, comes to light."

Charity Majors on the importance of observability

Charity Majors (@mipsytipsy), CEO of Honeycomb, talked in detail about the key differences between monitoring and observability. As with other talks, context was important: a world where architectural complexity has grown rapidly in the space of a decade. Majors made the point that this increase in complexity has taken us from having known unknowns in our architectures to many more unknown unknowns in a distributed system. This means that monitoring is dead - it simply isn't sophisticated enough to deal with the complexities and dependencies within a distributed system. Observability, meanwhile, allows you to understand "what's happening in your systems just by observing it from the outside." Put simply, it lets you understand how your software is functioning from your perspective - almost turning it inside out. Majors then linked the concept of observability to the broader philosophy of chaos engineering - echoing some of the points raised by Adrian Cockcroft in his keynote. But this was her key takeaway: "Software engineers spend too much time looking at code in elaborately falsified environments, and not enough time observing it in the real world." This leads to one conclusion - the importance of testing in production. "Accept no substitute."

Tammy Butow and Ana Medina on making an impact

Tammy Butow (@tammybutow) and Ana Medina (@Ana_M_Medina) from Gremlin took us through how to put chaos engineering into practice - from integrating it into your organizational culture to some practical tests you can run. One of the best examples of putting chaos into practice is Gremlin's concept of 'Failure Fridays', in which chaos testing becomes a valuable step in the product development process - a way to dogfood a product and test how a customer experiences it. Tammy and Ana also suggested using chaos engineering to test new versions of technologies before you properly upgrade in production. To end their talk, they demo'd a chaos battle between EKS (Kubernetes on AWS) and AKS (Kubernetes on Azure), running an app container attack, a packet loss attack, and a region failover attack.

Jessie Frazelle on how containers can empower experimentation

Jessie Frazelle (@jessfraz) didn't actually talk that much about chaos engineering. However, like Ronnie Chen's talk, chaos engineering seeped through what she said about bugs and containers. Bugs, for Frazelle, are a way of exploring how things work, and how different parts of a software infrastructure interact with each other: "Bugs are like my favorite thing... some people really hate when they get one of those bugs that turns out to be a rabbit hole and you're kind of debugging it until the end of time... while debugging those bugs I hate them but afterwards, I'm like, that was crazy!"
This was essentially an endorsement of the core concept of chaos engineering - injecting bugs into your software to understand how it reacts. Jessie then went on to talk about containers, joking that they're NOT REAL. This is because they're made up of numerous different component parts, like cgroups, namespaces, and LSMs. She contrasted containers with virtual machines, zones, and jails, which are 'first class concepts' - in other words, real things (Jessie wrote about this in detail last year in this blog post). In practice, what this means is that whereas containers are like Lego pieces, VMs, zones, and jails are like a pre-assembled Lego set that you don't need to play around with in the same way. From this perspective, it's easy to see how containers are relevant to chaos engineering - they empower a level of experimentation that you simply don't have with other virtualization technologies. "The box says to build the Death Star. But you can build whatever you want."

The chaos ends...

Chaos Conf was undoubtedly a huge success, and a lot of credit has to go to Gremlin for organizing the conference. It's clear that the team cares a lot about the chaos engineering community and wants it to expand in a way that transcends the success of the Gremlin platform. While chaos engineering might not feel relevant to a lot of people at the moment, it's only a matter of time before its impact is felt. That doesn't mean that everyone will suddenly become a chaos engineer by July 2019, but the cultural ripples will likely be felt across the software engineering landscape. Without Chaos Conf, though, it would be difficult to see chaos engineering growing as a discipline or set of practices. By sharing ideas and learning how other people work, a more coherent picture of chaos engineering started to emerge, one that can quickly make an impact in ways people wouldn't have expected six months ago. You can watch videos of all the talks from Chaos Conf 2018 on YouTube.
Importance of Windows RDS in Horizon View

Packt
30 Oct 2014
15 min read
In this article, Jason Ventresco, the author of VMware Horizon View 6 Desktop Virtualization Cookbook, explains Windows Remote Desktop Services (RDS) and how it is implemented in Horizon View. He discusses configuring the Windows RDS server and creating an RDS farm in Horizon View. (For more resources related to this topic, see here.)

Configuring the Windows RDS server for use with Horizon View

This recipe will provide an introduction to the minimum steps required to configure Windows RDS and integrate it with our Horizon View pod. For a more in-depth discussion on Windows RDS optimization and management, consult the Microsoft TechNet page for Windows Server 2012 R2 (http://technet.microsoft.com/en-us/library/hh801901.aspx).

Getting ready

VMware Horizon View supports the following versions of Windows Server for use with RDS:

Windows Server 2008 R2: Standard, Enterprise, or Datacenter, with SP1 or later installed
Windows Server 2012: Standard or Datacenter
Windows Server 2012 R2: Standard or Datacenter

The examples shown in this article were performed on Windows Server 2012 R2. Additionally, all of the applications required have already been installed on the server, which in this case included Microsoft Office 2010. Microsoft Office has specific licensing requirements when used with Windows Server RDS. Consult Microsoft's Licensing of Microsoft Desktop Application Software for Use with Windows Server Remote Desktop Services document (http://www.microsoft.com/licensing/about-licensing/briefs/remote-desktop-services.aspx) for additional information. The Windows RDS feature requires a licensing server component called the Remote Desktop Licensing role service. For reasons of availability, it is not recommended that you install it on the RDS host itself, but rather on an existing server that serves some other function, or even on a dedicated server if possible. Ideally, the RDS licensing role should be installed on multiple servers for redundancy reasons. The Remote Desktop Licensing role service is different from the Microsoft Windows Key Management System (KMS), as it is used solely for Windows RDS hosts. Consult the Microsoft TechNet article, RD Licensing Configuration on Windows Server 2012 (http://blogs.technet.com/b/askperf/archive/2013/09/20/rd-licensing-configuration-on-windows-server-2012.aspx), for the steps required to install the Remote Desktop Licensing role service. Additionally, consult the Microsoft document Licensing Windows Server 2012 R2 Remote Desktop Services (http://download.microsoft.com/download/3/D/4/3D42BDC2-6725-4B29-B75A-A5B04179958B/WindowsServerRDS_VLBrief.pdf) for information about the licensing options for Windows RDS, which include both per-user and per-device options.

Windows RDS host – hardware recommendations

The following resources represent a starting point for assigning CPU and RAM resources to Windows RDS hosts. The actual resources required will vary based on the applications being used and the number of concurrent users, so it is important to monitor server utilization and adjust the CPU and RAM specifications if required.
The following are the requirements:

One vCPU for every 15 concurrent RDS sessions
A base amount of RAM equal to 2 GB per vCPU, plus 64 MB of additional RAM for each concurrent RDS session
Additional RAM equal to the application requirements, multiplied by the estimated number of concurrent users of the application
Sufficient hard drive space to store RDS user profiles, which will vary based on the configuration of the Windows RDS host: Windows RDS supports multiple options to control user profile configuration and growth, including an RD user home directory, RD roaming user profiles, and mandatory profiles. For information about these and other options, consult the Microsoft TechNet article, Manage User Profiles for Remote Desktop Services, at http://technet.microsoft.com/en-us/library/cc742820.aspx. This space is only required if you intend to store user profiles locally on the RDS hosts.

Horizon View Persona Management is not supported and will not work with Windows RDS hosts. Consider native Microsoft features such as those described previously in this recipe, or third-party tools such as AppSense Environment Manager (http://www.appsense.com/products/desktop/desktopnow/environment-manager). Based on these values, a Windows Server 2012 R2 RDS host running Microsoft Office 2010 that will support 100 concurrent users will require the following resources:

Seven vCPUs to support up to 105 concurrent RDS sessions
45.25 GB of RAM, based on the following calculations: 20.25 GB of base RAM (2 GB for each vCPU, plus 64 MB for each of the 100 users), plus a total of 25 GB of additional RAM to support Microsoft Office 2010 (Office 2010 recommends 256 MB of RAM for each user)

While the vCPU and RAM requirements might seem excessive at first, remember that to deploy a virtual desktop for each of these 100 users, we would need at least 100 vCPUs and 100 GB of RAM, which is much more than what our Windows RDS host requires. By default, Horizon View allows only 150 unique RDS user sessions for each available Windows RDS host, so we need to deploy multiple RDS hosts if users need to stream two applications at once or if we anticipate having more than 150 connections. It is possible to change the number of supported sessions, but it is not recommended due to potential performance issues.

Importing the Horizon View RDS AD group policy templates

Some of the settings configured throughout this article are applied using AD group policy templates. Prior to using the RDS feature, these templates should be distributed either to the RDS hosts, in order to be used with the Windows local group policy editor, or to an AD domain controller, where they can be applied using the domain. Complete the following steps to install the View RDS group policy templates: When referring to VMware Horizon View installation packages, y.y.y refers to the version number and xxxxxx refers to the build number. When you download packages, the actual version and build numbers will be in a numeric format. For example, the filename of the current Horizon View 6 GPO bundle is VMware-Horizon-View-Extras-Bundle-3.1.0-2085634.zip. Obtain the VMware-Horizon-View-GPO-Bundle-y.y.y-xxxxxx.zip file, unzip it, and copy the en-US folder, the vmware_rdsh.admx file, and the vmware_rdsh_server.admx file to the C:\Windows\PolicyDefinitions folder on either an AD domain controller or your target RDS host, based on how you wish to manage the policies.
Make note of the following points while doing so:

If you want to set the policies locally on each RDS host, you will need to copy the files to each server.
If you wish to set the policies using domain-based AD group policies, you will need to copy the files to the domain controllers, the group policy Central Store (http://support.microsoft.com/kb/929841), or to the workstation from which you manage these domain-based group policies.

How to do it…

The following steps outline the procedure to enable RDS on a Windows Server 2012 R2 host. The host used in this recipe has already been joined to the domain and logged in to with an AD account that has administrative permissions on the server. Perform the following steps:

Open the Windows Server Manager utility and go to Manage | Add Roles and Features to open the Add Roles and Features Wizard. On the Before you Begin page, click on Next. On the Installation Type page, shown in the following screenshot, select Remote Desktop Services installation and click on Next. On the Deployment Type page, select Quick Start and click on Next. You can also implement the required roles using the standard deployment method outlined in the Deploy the Session Virtualization Standard deployment section of the Microsoft TechNet article, Test Lab Guide: Remote Desktop Services Session Virtualization Standard Deployment (http://technet.microsoft.com/en-us/library/hh831610.aspx). If you use this method, you will complete the component installation and proceed to step 9 in this recipe. On the Deployment Scenario page, select Session-based desktop deployment and click on Next. On the Server Selection page, select a server from the list under Server Pool, click the red, highlighted button to add the server to the list of selected servers, and click on Next. This is shown in the following screenshot: On the Confirmation page, check the box marked Restart the destination server automatically if required and click on Deploy. On the Completion page, monitor the installation process and click on Close when finished in order to complete the installation. If a reboot is required, the server will reboot without the need to click on Close. Once the reboot completes, proceed with the remaining steps. Set the RDS licensing server using the Set-RDLicenseConfiguration Windows PowerShell command. In this example, we are configuring the local RDS host to point to redundant license servers (RDS-LIC1 and RDS-LIC2) and setting the license mode to PerUser. This command must be executed on the target RDS host. After entering the command, confirm the values for the license mode and license server name by answering Y when prompted. Refer to the following code:

Set-RDLicenseConfiguration -LicenseServer @("RDS-LIC1.vjason.local","RDS-LIC2.vjason.local") -Mode PerUser

This setting can also be applied using group policies, either on the local computer or using Active Directory (AD). The policies are shown in the following screenshot, and you can locate them by going to Computer Configuration | Policies | Administrative Templates | Windows Components | Remote Desktop Services | Remote Desktop Session Host | Licensing when using AD-based policies. If you are using local group policies, there will be no Policies folder in the path. Use local computer or AD group policies to limit users to one session per RDS host using the Restrict Remote Desktop Services users to a single Remote Desktop Services session policy.
The policy is shown in the following screenshot, and you can locate it by navigating to Computer Configuration | Policies | Administrative Templates | Windows Components | Remote Desktop Services | Remote Desktop Session Host | Connections:

Use local computer or AD group policies to enable time zone redirection. You can locate the policy by navigating to Computer Configuration | Policies | Administrative Templates | Windows Components | Horizon View RDSH Services | Remote Desktop Session Host | Device and Resource Redirection when using AD-based policies. If you are using local group policies, there will be no Policies folder in the path. To enable the setting, set Allow time zone redirection to Enabled. Use local computer or AD group policies to enable the Windows Basic aero-styled theme. You can locate the policy by going to User Configuration | Policies | Administrative Templates | Control Panel | Personalization when using AD-based policies. If you are using local group policies, there will be no Policies folder in the path. To configure the theme, set Force a specific visual style file or force Windows Classic to Enabled and set Path to Visual Style to %windir%\resources\Themes\Aero\aero.msstyles. Use local computer or AD group policies to start Runonce.exe when the RDS session starts. You can locate the policy by going to User Configuration | Policies | Windows Settings | Scripts (Logon/Logoff) when using AD-based policies. If you are using local group policies, there will be no Policies folder in the path. To configure the logon settings, double-click on Logon, click on Add, enter runonce.exe in the Script Name box, and enter /AlternateShellStartup in the Script Parameters box. On the Windows RDS host, double-click on the 64-bit Horizon View Agent installer to begin the installation process. The installer should have a name similar to VMware-viewagent-x86_64-y.y.y-xxxxxx.exe. On the Welcome to the Installation Wizard for VMware Horizon View Agent page, click on Next. On the License Agreement page, select the I accept the terms in the license agreement radio button and click on Next. On the Custom Setup page, either leave all the options set to default or, if you are not using vCenter Operations Manager, deselect this optional component of the agent, and click on Next. On the Register with Horizon View Connection Server page, shown in the following screenshot, enter the hostname or IP address of one of the Connection Servers in the pod where the RDS host will be used. If the user performing the installation of the agent software is an administrator in the Horizon View environment, leave the Authentication setting set to default; otherwise, select the Specify administrator credentials radio button and provide the username and password of an account that has administrative rights in Horizon View. Click on Next to continue:

On the Ready to Install the Program page, click on Install to begin the installation. When the installation completes, reboot the server if prompted. The Windows RDS service is now enabled, configured with the optimal settings for use with VMware Horizon View, and has the necessary agent software installed. This process should be repeated on additional RDS hosts, as needed, to support the target number of concurrent RDS sessions.
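If you need to roll the agent out to many RDS hosts, the installer can also be run silently from a script. A minimal sketch follows; the property names are assumptions drawn from typical View Agent silent installs, so validate them against the VMware Horizon View documentation for your release before using this:

VMware-viewagent-x86_64-y.y.y-xxxxxx.exe /s /v"/qn VDM_SERVER_NAME=viewcs01.vjason.local VDM_SERVER_USERNAME=vjason\administrator VDM_SERVER_PASSWORD=MySecret1"

Here, viewcs01.vjason.local stands in for one of your View Connection Servers, and the credentials belong to an account with administrative rights in Horizon View.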
How it works…

The following resources provide detailed information about the configuration options used in this recipe:

Microsoft TechNet's Set-RDLicenseConfiguration article at http://technet.microsoft.com/en-us/library/jj215465.aspx provides the complete syntax of the PowerShell command used to configure the RDS licensing settings.
Microsoft TechNet's Remote Desktop Services Client Access Licenses (RDS CALs) article at http://technet.microsoft.com/en-us/library/cc753650.aspx explains the different RDS license types, which reveals that an RDS per-user Client Access License (CAL) allows our Horizon View clients to access the RDS servers from an unlimited number of endpoints while still consuming only one RDS license.
The Microsoft TechNet article, Remote Desktop Session Host, Licensing (http://technet.microsoft.com/en-us/library/ee791926(v=ws.10).aspx) provides additional information on the group policies used to configure the RDS licensing options.
The VMware document Setting up Desktop and Application Pools in View (https://pubs.vmware.com/horizon-view-60/index.jsp?topic=%2Fcom.vmware.horizon-view.desktops.doc%2FGUID-931FF6F3-44C1-4102-94FE-3C9BFFF8E38D.html) explains that the Windows Basic aero-styled theme is the only theme supported by Horizon View, and demonstrates how to implement it.
The VMware document Setting up Desktop and Application Pools in View (https://pubs.vmware.com/horizon-view-60/topic/com.vmware.horizon-view.desktops.doc/GUID-443F9F6D-C9CB-4CD9-A783-7CC5243FBD51.html) explains why time zone redirection is required, as it ensures that the Horizon View RDS client session will use the same time zone as the client device.
The VMware document Setting up Desktop and Application Pools in View (https://pubs.vmware.com/horizon-view-60/topic/com.vmware.horizon-view.desktops.doc/GUID-85E4EE7A-9371-483E-A0C8-515CF11EE51D.html) explains why we need to add the runonce.exe /AlternateShellStartup command to the RDS logon script. This ensures that applications which require Windows Explorer will work properly when streamed using Horizon View.

Creating an RDS farm in Horizon View

This recipe will discuss the steps that are required to create an RDS farm in our Horizon View pod. An RDS farm is a collection of Windows RDS hosts and serves as the point of integration between the View Connection Server and the individual applications installed on each RDS server. Additionally, key settings concerning client session handling and client connection protocols are set at the RDS farm level within Horizon View.

Getting ready

To create an RDS farm in Horizon View, we need to have at least one RDS host registered with our View pod. Assuming that the Horizon View Agent installation completed successfully in the previous recipe, we should see the RDS hosts registered in the Registered Machines menu under View Configuration of our View Manager Admin console. The tasks required to create the RDS farm are performed using the Horizon View Manager Admin console.

How to do it…

The following steps outline the procedure used to create an RDS farm. In this example, we have already created and registered two Windows RDS hosts named WINRDS01 and WINRDS02. Perform the following steps:

Navigate to Resources | Farms and click on Add, as shown in the following screenshot: On the Identification and Settings page, shown in the following screenshot, provide a farm ID, a description if desired, make any desired changes to the default settings, and then click on Next.
The settings can be changed to On if needed: On the Select RDS Hosts page, shown in the following screenshot, click on the RDS hosts to be added to the farm and then click on Next: On the Ready to Complete page, review the configuration and click on Finish. The RDS farm has been created, which allows us to create application pools.

How it works…

The following RDS farm settings can be changed at any time and are described in the following points:

Default display protocol: PCoIP (default) and RDP are available.
Allow users to choose protocol: By default, Horizon View Clients can select their preferred protocol; we can change this setting to No in order to enforce the farm default.
Empty session timeout (applications only): This denotes the amount of time that must pass after a client closes all RDS applications before the RDS farm takes the action specified in the When timeout occurs setting. The default setting is 1 minute.
When timeout occurs: This determines which action is taken by the RDS farm when the session's timeout deadline passes; the options are Log off or Disconnect (default).
Log off disconnected sessions: This determines what happens when a View RDS session is disconnected; the options are Never (default), Immediate, or After. If After is selected, a time in minutes must be provided.

Summary

We have learned about configuring the Windows RDS server for use with Horizon View and about creating an RDS farm in Horizon View.

Resources for Article:

Further resources on this subject: Backups in the VMware View Infrastructure [Article] An Introduction to VMware Horizon Mirage [Article] Designing and Building a Horizon View 6.0 Infrastructure [Article]
OpenStack Networking in a Nutshell

Packt
22 Sep 2016
13 min read
Information technology (IT) applications are rapidly moving from dedicated infrastructure to cloud-based infrastructure. This move to the cloud started with server virtualization, where a hardware server runs as a virtual machine on a hypervisor. The adoption of cloud-based applications has accelerated due to factors such as globalization and outsourcing, where diverse teams need to collaborate in real time. Server hardware connects to network switches using Ethernet and IP to establish network connectivity. However, as servers move from physical to virtual, the network boundary also moves from the physical network to the virtual network. Traditionally, applications, servers, and networking were tightly integrated, but modern enterprises and IT infrastructure demand flexibility in order to support complex applications. The flexibility of cloud infrastructure requires networking to be dynamic and scalable. Software Defined Networking (SDN) and Network Functions Virtualization (NFV) play a critical role in data centers in order to deliver the flexibility and agility demanded by cloud-based applications. By providing practical management tools and abstractions that hide the underlying physical network's complexity, SDN allows operators to build complex networking capabilities on demand. OpenStack is an open source cloud platform that helps build public and private clouds at scale. Within OpenStack, the name of the OpenStack Networking project is Neutron. The functionality of Neutron can be classified as core and service. This article by Sriram Subramanian and Sreenivas Voruganti, authors of the book Software Defined Networking (SDN) with OpenStack, aims to provide a short introduction to OpenStack Networking. We will cover the following topics in this article:

Understanding traffic flows between virtual and physical networks
Neutron entities that support Layer 2 (L2) networking
Layer 3 (L3) routing between OpenStack Networks
Securing OpenStack network traffic
Advanced networking services in OpenStack
OpenStack and SDN

The terms Neutron and OpenStack Networking are used interchangeably throughout this article. (For more resources related to this topic, see here.)

Virtual and physical networking

Server virtualization led to the adoption of virtualized applications and workloads running inside physical servers. While physical servers are connected to the physical network equipment, modern networking has pushed the boundary of networking into the virtual domain as well. Virtual switches, firewalls, and routers play a critical role in the flexibility provided by cloud infrastructure.

Figure 1: Networking components for server virtualization

The preceding figure describes a typical virtualized server and the various networking components. The virtual machines are connected to a virtual switch inside the compute node (or server). The traffic is secured using virtual routers and firewalls. The compute node is connected to a physical switch, which is the entry point into the physical network. Let us now walk through different traffic flow scenarios using the picture above as the background. In Figure 2, traffic from one VM to another on the same compute node is forwarded by the virtual switch itself; it does not reach the physical network. You can even apply firewall rules to traffic between the two virtual machines.

Figure 2: Traffic flow between two virtual machines on the same server

Next, let us have a look at how traffic flows between virtual machines across two compute nodes.
In Figure 3, the traffic comes out of the first compute node and reaches the physical switch. The physical switch forwards the traffic to the second compute node, and the virtual switch within the second compute node steers the traffic to the appropriate VM.

Figure 3: Traffic flow between two virtual machines on different servers

Finally, here is the depiction of traffic flow when a virtual machine sends or receives traffic from the Internet. The physical switch forwards the traffic to the physical router and firewall, which is presumed to be connected to the Internet.

Figure 4: Traffic flow from a virtual machine to an external network

As seen from the above diagrams, the physical and the virtual network components work together to provide connectivity to virtual machines and applications.

Tenant isolation

As a cloud platform, OpenStack supports multiple users grouped into tenants. One of the key requirements of a multi-tenant cloud is to isolate the data traffic belonging to one tenant from the rest of the tenants that use the same infrastructure. OpenStack supports different ways of achieving isolation, and it is the responsibility of the virtual switch to implement it.

Layer 2 (L2) capabilities in OpenStack

The connectivity to a physical or virtual switch is also known as Layer 2 (L2) connectivity in networking terminology. Layer 2 connectivity is the most fundamental form of network connectivity needed for virtual machines. As mentioned earlier, OpenStack supports core and service functionality. The L2 connectivity for virtual machines falls under the core capability of OpenStack Networking, whereas Router, Firewall, and so on fall under the service category. The L2 connectivity in OpenStack is realized using two constructs, called Network and Subnet. Operators can use the OpenStack CLI or the web interface to create Networks and Subnets, and when virtual machines are instantiated, operators can associate them with the appropriate Networks.

Creating a Network using OpenStack CLI

A Network defines the Layer 2 (L2) boundary for all the instances that are associated with it. All the virtual machines within a Network are part of the same L2 broadcast domain. The Liberty release introduced a new OpenStack CLI (Command Line Interface) for different services; we will use the new CLI to create a Network.

Creating a Subnet using OpenStack CLI

A Subnet is a range of IP addresses that are assigned to virtual machines on the associated Network. OpenStack Neutron configures a DHCP server with this IP address range and, by default, starts one DHCP server instance per Network. Note: Unlike Network, for Subnet, we need to use the regular neutron CLI command in the Liberty release. Both commands are sketched below.
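The original recipe showed these two commands as screenshots. A plausible reconstruction for a Liberty-era deployment follows; the names Network1 and Subnet1 and the 192.168.10.0/24 range are illustrative assumptions:

openstack network create Network1
neutron subnet-create Network1 192.168.10.0/24 --name Subnet1

The first command uses the new OpenStack CLI to create the Network; the second uses the regular neutron CLI to create a Subnet on it, as noted above.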
Associating a Network and Subnet to a virtual machine

To give a complete perspective, we will create a virtual machine using the OpenStack web interface and show you how to associate a Network and Subnet to it. In your OpenStack web interface, navigate to Project | Compute | Instances. Click on the Launch Instances action on the right-hand side. In the resulting window, enter the name for your instance and choose how you want to boot it. To associate a network and a subnet with the instance, click on the Networking tab. If you have more than one tenant network, you will be able to choose the network you want to associate with the instance; if you have exactly one network, the web interface will automatically select it. As mentioned earlier, providing isolation for tenant network traffic is a key requirement for any cloud. OpenStack Neutron uses Network and Subnet to define the boundaries and isolate data traffic between different tenants. Depending on the Neutron configuration, the actual isolation of traffic is accomplished by the virtual switches; VLAN and VXLAN are common networking technologies used to isolate traffic.

Layer 3 (L3) capabilities in OpenStack

Once L2 connectivity is established, the virtual machines within one Network can send or receive traffic between themselves. However, two virtual machines belonging to two different Networks will not be able to communicate with each other automatically. This is done to provide privacy and isolation for tenant networks. In order to allow traffic from one Network to reach another Network, OpenStack Networking supports an entity called Router. The default implementation of OpenStack uses namespaces to support L3 routing capabilities.

Creating a Router using OpenStack CLI

Operators can create Routers using the OpenStack CLI or the web interface. They can then add more than one Subnet as an interface to the Router, which allows the Networks associated with the Router to exchange traffic with one another. The command to create a Router with a specified name is sketched at the end of this section.

Associating a Subnet to a Router

Once a Router is created, the next step is to associate one or more Subnets to it. After running the association command (also sketched at the end of this section), the Subnet represented by subnet1 is associated to the Router router1.

Securing network traffic in OpenStack

The security of network traffic is critical, and OpenStack supports two mechanisms to secure it. Security groups allow traffic within a tenant's network to be secured; Linux iptables on the compute nodes are used to implement OpenStack security groups. The traffic that goes outside of a tenant's network - to another Network or the Internet - is secured using the OpenStack Firewall Service functionality. Like routing, Firewall is a service within Neutron. The Firewall service also uses iptables, but the scope of iptables is limited to the OpenStack Router used as part of the Firewall Service.

Using security groups to secure traffic within a network

In order to secure traffic going from one VM to another within a given Network, we must create a security group. The next step is to create one or more rules within the security group. As an example, let us create a rule which allows only incoming UDP traffic on port 8080 from any source IP address. The final step is to associate this security group and its rules to a virtual machine instance, using the nova boot command. All of these commands are sketched at the end of this section. Once the virtual machine instance has a security group associated with it, the incoming traffic will be monitored and, depending upon the rules inside the security group, data traffic may be blocked or permitted to reach the virtual machine. Note: it is possible to block ingress or egress traffic using security groups.
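The commands referenced in this section were shown as screenshots in the original article. A rough sketch of the sequence follows; router1 and subnet1 match the names used above, while sec1, the image, the flavor, and vm1 are illustrative assumptions:

neutron router-create router1
neutron router-interface-add router1 subnet1
neutron security-group-create sec1
neutron security-group-rule-create --direction ingress --protocol udp --port-range-min 8080 --port-range-max 8080 sec1
nova boot --image cirros --flavor m1.tiny --security-groups sec1 vm1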
Using firewall service to secure traffic

We have seen that security groups provide fine-grained control over what traffic is allowed to and from a virtual machine instance. Another layer of security supported by OpenStack is Firewall as a Service (FWaaS). FWaaS enforces security at the Router level, whereas security groups enforce security at the virtual machine interface level. The main use case of FWaaS is to protect all virtual machine instances within a Network from threats and attacks from outside the Network. These could come from virtual machines that are part of another Network in the same OpenStack cloud, or from some entity on the Internet attempting unauthorized access. Let us now see how FWaaS is used in OpenStack. In FWaaS, a set of firewall rules is grouped into a firewall policy, and a firewall is then created that implements one policy at a time. This firewall is then associated to a Router. A firewall rule can be created using the neutron firewall-rule-create command; as an example, a rule that blocks the ICMP protocol will stop applications like ping at the firewall. The next step is to create a firewall policy. In real-world scenarios, security administrators will define several rules and consolidate them under a single policy; for example, all rules that block various types of traffic can be combined into a single policy. The final step is to create a firewall and associate it with a router. All three commands are sketched at the end of this section. If no Router is specified when creating the firewall, the OpenStack behavior is to associate the firewall (and, in turn, the policy and rules) with all the Routers available for that tenant. The neutron firewall-create command supports an option to pick a specific Router as well.
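Again, the original showed these commands as screenshots. A plausible sketch, with rule1, policy1, and fw1 as illustrative names:

neutron firewall-rule-create --protocol icmp --action deny --name rule1
neutron firewall-policy-create --firewall-rules rule1 policy1
neutron firewall-create policy1 --name fw1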
Advanced networking services

Besides routing and firewall, there are a few other commonly used networking technologies supported by OpenStack. Let's take a quick look at these without delving deep into the respective commands.

Load Balancing as a Service (LBaaS)

Virtual machine instances created in OpenStack are used to run applications. Most applications are required to support redundancy and concurrent access; for example, a web server may be accessed by a large number of users at the same time. One of the common strategies to handle scale and redundancy is to implement load balancing for incoming requests. In this approach, a Load Balancer distributes incoming service requests onto a pool of servers, which process the requests, thus providing higher throughput. If one of the servers in the pool fails, the Load Balancer removes it from the pool and the subsequent service requests are distributed among the remaining servers. Users of the application use the IP address of the Load Balancer to access the application and are unaware of the pool of servers. OpenStack implements the Load Balancer using HAProxy software and a Linux namespace.

Virtual Private Network as a Service (VPNaaS)

As mentioned earlier, tenant isolation requires data traffic to be segregated and secured within an OpenStack cloud. However, there are times when external entities need to be part of the same Network without removing the firewall-based security. This can be accomplished using a Virtual Private Network (VPN). A VPN connects two endpoints on different networks over a public Internet connection, such that the endpoints appear to be directly connected to each other. VPNs also provide confidentiality and integrity of transmitted data. Neutron provides a service plugin that enables OpenStack users to connect two networks using a VPN. The reference implementation of the VPN plugin in Neutron uses Openswan to create an IPSec-based VPN. IPSec is a suite of protocols that provides a secure connection between two endpoints by encrypting each IP packet transferred between them.

OpenStack and SDN context

So far in this article, we have seen the different networking capabilities provided by OpenStack. Let us now look at two capabilities in OpenStack that enable SDN to be leveraged effectively.

Choice of technology

OpenStack, being an open source platform, bundles open source networking solutions as the default implementation for these networking capabilities. For example, routing is supported using namespaces, security using iptables, and load balancing using HAProxy. Historically, these networking capabilities were implemented using customized hardware and software, most of them being proprietary solutions. These custom solutions are capable of much higher performance and are well supported by their vendors, and hence they have a place in the OpenStack and SDN ecosystem. From its initial releases, OpenStack has been designed for extensibility. Vendors can write their own extensions and then easily configure OpenStack to use their extension instead of the default solutions. This allows operators to deploy the networking technology of their choice.

OpenStack API for networking

One of the most powerful capabilities of OpenStack is the extensive support for APIs. All services of OpenStack interact with one another using well-defined RESTful APIs. This allows custom implementations and pluggable components to provide powerful enhancements for practical cloud implementation. For example, when a Network is created using the OpenStack web interface, a RESTful request is sent to the Horizon service. This in turn invokes a RESTful API to validate the user using the Keystone service. Once validated, Horizon sends another RESTful API request to Neutron to actually create the Network.

Summary

As seen in this article, OpenStack supports a wide variety of networking functionality right out of the box. The importance of isolating tenant traffic and the need to allow customized solutions require OpenStack to support flexible configuration. We also highlighted some key aspects of OpenStack that will play a key role in deploying Software Defined Networking in datacenters, thereby supporting powerful cloud architectures and solutions.

Resources for Article:

Further resources on this subject: Setting Up a Network Backup Server with Bacula [article] Jenkins 2.0: The impetus for DevOps Movement [article] Integrating Accumulo into Various Cloud Platforms [article]
Citrix XenApp Performance Essentials

Packt
21 Aug 2013
18 min read
(For more resources related to this topic, see here.)

Optimizing Session Startup

The most frequent complaint that system administrators receive from users about XenApp is definitely that applications start slowly. Users rarely consider that, at least the first time they launch an application published by XenApp, an entire logon process takes place. In this article you'll learn:

Which steps form the logon process and which systems are involved
The most common causes of logon delays and how to mitigate them
The use of some advanced XenApp features, like session pre-launch

The logon process

Let's briefly review the logon process that starts when a user launches an application through the Web Interface or through a link created by the Receiver. The following diagram explains the logon process:

The logon process

Resolution: The user launches an application (A) and the Web Interface queries the Data Collector (B), which returns the least-loaded server for the requested application (C). The Web Interface generates an ICA file and passes it to the client (D).

Connection: The Citrix client running on the user's PC establishes a connection to the session-host server specified in the ICA file. In the handshake process, client and server agree on the security level and capabilities.

Remote Desktop Services (RDS) license: Windows Server validates that an RDS/Terminal Server (TS) license is available.

AD authentication: Windows Server authenticates the user against the Active Directory (AD). If the authentication is successful, the server queries account details from the AD, including Group Policies (GPOs) and roaming profiles.

Citrix license: XenApp validates that a Citrix license is available.

Session startup: If the user has a roaming profile, Windows downloads it from the specified location (usually a file server). Windows then applies any GPOs and XenApp applies any Citrix policies. Windows executes applications included in the Startup menu and finally launches the requested application.

Some other steps may be necessary if other Citrix components (for example, the Citrix Access Gateway) are included in your infrastructure.

Analysing the logon process

Users perceive the overall duration of the process as the time from when they click on the icon until the appearance of the application on their desktop. To troubleshoot slowness, a system administrator must know the duration of the individual steps.

Citrix EdgeSight

Citrix EdgeSight is a performance and availability management solution for XenApp and XenDesktop. If you own an Enterprise or Platinum XenApp license, you're entitled to install EdgeSight Basic (for Enterprise licensing) or Advanced (for Platinum licensing). You can also license it as a standalone product. If you deployed Citrix EdgeSight in your farm, you can run the Session Startup Duration Detail report, which includes information on the duration of both server-side and client-side steps. This report is available only with EdgeSight Advanced. For each session, you can drill down the report to display information about server-side and client-side startup processes. An example is shown in the following screenshot:

EdgeSight's Session Startup Duration Detail report

The columns report the time (in milliseconds) spent by the startup process in the different steps. SSD is the total server-side time, while CSD is the total client-side time.
You can find a full description of the available reports and the meaning of the different acronyms in the EdgeSight Report List at http://community.citrix.com/display/edgesight/EdgeSight+5.4+Report+List. In the preceding example, most of the time was spent in the Profile Load (PLSD) and Login Script Execution (LSESD) steps on the server and in the Session Creation (SCCD) step on the client. EdgeSight is a very valuable tool to analyze your farm. The available reports cover all the critical areas and give detailed information about the "hidden" work of Citrix XenApp. With the Session Startup Duration Detail report you can identify which steps cause a slow session startup. If you want to understand why a server-side step takes too much time - like the Profile Load step in the preceding example, which lasted more than 15 seconds - you need a different tool.

Windows Performance Toolkit

Windows Performance Toolkit (WPT) is a tool included in the Windows ADK, freely downloadable from the Microsoft website (http://www.microsoft.com/en-us/download/details.aspx?id=30652). You need an Internet connection to install the Windows ADK. You can run the setup on a client with Internet access and configure it to download all the required components to a folder. Move the folder to your server and perform an offline installation. WPT has two components:

Windows Performance Recorder, which is used to record all the performance data in an .etl file
Windows Performance Analyzer, a graphical program to analyze the recorded data

Run the following command from the WPT installation folder to capture slow logons:

C:\WPT>xperf -on base+latency+dispatcher+NetworkTrace+Registry+FileIO -stackWalk CSwitch+ReadyThread+ThreadCreate+Profile -BufferSize 128 -start UserTrace -on "Microsoft-Windows-Shell-Core+Microsoft-Windows-Wininit+Microsoft-Windows-Folder Redirection+Microsoft-Windows-User Profiles Service+Microsoft-Windows-GroupPolicy+Microsoft-Windows-Winlogon+Microsoft-Windows-Security-Kerberos+Microsoft-Windows-User Profiles General+e5ba83f6-07d0-46b1-8bc7-7e669a1d31dc+63b530f8-29c9-4880-a5b4-b8179096e7b8+2f07e2ee-15db-40f1-90ef-9d7ba282188a" -BufferSize 1024 -MinBuffers 64 -MaxBuffers 128 -MaxFile 1024

After having recorded at least one slow logon, run the following command to stop recording and save the performance data to an .etl file:

C:\WPT>xperf -stop -stop UserTrace -d merged.etl

This command creates a file called merged.etl in the WPT install folder. You can open this file with Windows Performance Analyzer. The Windows Performance Analyzer timeline is shown in the following screenshot:

Windows Performance Analyzer timeline

Windows Performance Analyzer shows a timeline with the duration of each step; for any point in time you can view the running processes, the usage of CPU and memory, the number of I/O operations, and the bytes sent or received through the network. WPT is a great tool to identify the reason for delays in Windows; it has, however, no visibility into Citrix processes. This is why EdgeSight is still necessary for complete troubleshooting.

Common causes of logon delays

After having analyzed many logon problems, I found that the slowness was usually caused by one or more of the following reasons:

Authentication issues
Profile issues
GPO and logon script issues

In the next paragraphs, you'll learn how to identify those issues and how to mitigate them.
Even if you can't use the advanced tools (EdgeSight, WPT, and so on) described in the preceding sections, you can follow the guidelines and best practices in the next sections to solve or prevent most of the problems related to the logon process.

Authentication issues

During the logon process, authentication happens at multiple stages; at minimum, when a user logs on to the Web Interface and when the session-host server creates a session for launching the requested application. Citrix XenApp integrates with Active Directory; the authentication is therefore performed by a Domain Controller (DC) server of your domain. Slowness in the Domain Controller's response, caused by an overloaded server, can slow down the entire process. Worse, if the Domain Controller is unavailable, a domain member server may try to connect for 30 seconds before timing out and choosing a different DC. Domain member servers choose the Domain Controller for authenticating users based on their membership of Active Directory Sites. If sites are not correctly configured or don't reflect the real topology of your network, a domain member server may decide to use a remote Domain Controller, through a slow WAN link, instead of using a Domain Controller on the same LAN.

Profile issues

Each user has a profile, which is a collection of personal files and settings. Windows offers different types of profiles, with advantages and disadvantages, as shown in the following table:

Local: The profile folder is local to each server.
Roaming: The profile folder is saved on central storage (usually a file server).
Mandatory: A read-only profile is assigned to users; changes are not saved across sessions.

From the administrator's point of view, mandatory profiles are the best option because they are simple to maintain, allow fast logons, and users can't modify Windows or application settings. This option, however, is not often feasible. I could use mandatory profiles only in specific cases, for example, when users have to run only a single application without the need to customize it. Local profiles are almost never used in a XenApp environment because, even if they offer the fastest logon time, they are not consistent across servers and sessions. Furthermore, you'll end up with all your session-host servers storing local profiles for all your users, and that is a waste of disk space. Finally, if you're provisioning your servers with Provisioning Server, this option is not applicable, as local profiles would be saved in the local cache, which is deleted every time the server reboots. System administrators usually choose roaming profiles for their users. Roaming profiles indeed allow consistency across servers and sessions and preserve user settings. Roaming profiles are, however, the most significant cause of slow logons. Without continuous control, they can rapidly grow to a large size. A small profile with a large number of files, for example, a profile with many cookies, can cause delays too. Roaming profiles also suffer from the last write wins problem. In a distributed environment like a XenApp farm, it is not unlikely that users are connected to different servers at the same time. Profiles are updated when users log off, so with different sessions on different servers, some settings could be overwritten or, worse, the profile could be corrupted.

Folder redirection

To reduce the size of roaming profiles, you can redirect most of the user folders to a different location.
Instead of saving files in the user's profile, you can configure Windows to save them on a file share. The advantages of using folder redirection are:

The data in the redirected folders is not included in the synchronization job of the roaming profile, making the user logon and logoff processes faster
Using disk quotas and redirecting folders to different disks, you can limit how much space is taken up by single folders instead of the whole profile
Windows Offline Files technology allows users to access their files even when no network connection is available
You can redirect some folders (for example, the Start Menu) to a read-only share, giving all your users the same content

Folder redirection is configured through group policies, as shown in the following screenshot:

Configuring Folder Redirection

For each folder, you can choose to redirect it to a fixed location (useful if you want to provide the same content to all your users), to a subfolder (named as the username) under a fixed root path, to the user's home directory, or to the local user profile location. You may also apply different redirections based on group membership and define advanced settings for the Documents folder. In my experience, folder redirection plays a key role in speeding up the logon process. You should enable it at least for the Desktop and My Documents folders if you're using roaming profiles.

Background upload

With Windows 2008 R2, Microsoft added the ability to perform a periodic upload of the user's profile registry file (NTUSER.DAT) to the file share. Even if this option wasn't added to address the last write wins problem, it may help to avoid profile corruption, and Microsoft recommends enabling it through a GPO, as shown in the following screenshot:

Enabling Background upload

Citrix Profile Management

Citrix developed its own solution for managing profiles, Citrix Profile Management. You're entitled to use Citrix Profile Management if you have an active Subscription Advantage for the following products:

XenApp Enterprise and Platinum edition
XenDesktop Advanced, Enterprise, and Platinum edition

You need to install the software on each computer whose user profiles you want to manage. In a XenApp farm, install it on your session-host servers.

Features

Citrix Profile Management was designed specifically to solve some of the problems Windows roaming profiles introduced. Its main features are:

Support for multiple sessions, without the last write wins problem
Ability to manage large profiles, without the need to perform a full sync when the user logs on
Support for v1 (Windows XP/2003) and v2 (Windows Vista/Seven/2008) profiles
Ability to define inclusion/exclusion lists
Extended synchronization that can include files and folders external to the profile, to support legacy applications

Configuring

Citrix Profile Management is configured using Windows Group Policy. In the Profile Management package, downloadable from the Citrix website, you can find the administrative template (.admx) and its language file (.adml). Copy the ADMX file to C:\Windows\PolicyDefinitions and the ADML file to the language subfolder of C:\Windows\PolicyDefinitions (for example, on English operating systems, the language folder is en-US); a command sketch follows at the end of this section. A new Profile Management folder under Citrix is then available in your GPOs, as shown in the following screenshot:

Profile Management's settings in Windows GPOs

Profile Management settings are in the Computer section; therefore, link the GPO to the Organizational Unit (OU) that contains your session-host servers.
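As a minimal sketch of the ADMX copy step described above, run the following from the folder where you unzipped the Profile Management package; the ctxprofile filenames are assumptions, as the actual names vary by Profile Management version:

copy ctxprofile.admx C:\Windows\PolicyDefinitions\
copy en-US\ctxprofile.adml C:\Windows\PolicyDefinitions\en-US\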
Profiles priority order
If you deployed Citrix Profile Management, it takes precedence over any other profile assignment method. The priority order on a XenApp server is the following:

Citrix Profile Management
Remote Desktop Services profile assigned by a GPO
Remote Desktop Services profile assigned by a user property
Roaming profile assigned by a GPO
Roaming profile assigned by a user property

Troubleshooting
Citrix Profile Management includes logging functionality that you can enable via GPO as shown in the following screenshot:

Enabling the logging functionality

With the Log settings setting, you can also enable verbose logging for specific events or actions. Logs are usually saved in %SystemRoot%\System32\Logfiles\UserProfileManager, but you can change the path with the Path to log file property. Profile Management's logs are also useful to troubleshoot slow logons. Each step is logged with a timestamp, so by analyzing those logs you can find which steps take a long time.

GPO and logon script issues
In a Windows environment, it's common to apply settings and customizations via Group Policy Objects (GPOs) or using logon scripts. Numerous GPOs and long-running scripts can significantly impact the speed of the logon process. It's sometimes hard to find which GPOs or scripts are causing delays. A suggestion is to move the XenApp server or a test user account into a new Organizational Unit, with no policies applied and blocked inheritance. Comparing the logon time in this scenario with the normal logon time can help you understand how GPOs and scripts affect the logon process. The following are some of the best practices when working with GPOs and logon scripts:

Reduce the number of GPOs by merging them when possible. The time Windows takes to apply 10 GPOs is much more than the time to apply a single GPO including all their settings.
Disable unused GPO sections. It's common to have GPOs with only computer or user settings. Explicitly disabling the unused sections can speed up the time required to apply the GPOs.
Use GPOs instead of logon scripts. Windows 2008 introduced Group Policy Preferences, which can be used to perform common tasks (map network drives, change registry keys, and so on) previously performed by logon scripts. The following screenshot displays configuring drive maps through GPOs.

Configuring drive maps through GPO
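When you need to see exactly which policies a session is processing, the built-in gpresult tool is a quick first step. Run from the session-host server (an elevated prompt is assumed), it produces a report listing every applied GPO, which makes it easier to spot candidates for merging or for disabling unused sections:

# Generate an HTML report of applied GPOs for the logged-on user and computer
gpresult /h C:\Temp\gpo-report.html /f

# Or just list the applied GPOs in the console
gpresult /r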
Session pre-launch, sharing, and lingering
Setting up a session is the most time-consuming task Citrix and Windows have to perform when a user requests an application. In the latest version of XenApp, Citrix added some features to anticipate the session setup (pre-launch) and to improve the sharing of the same session between different applications (lingering).

Session pre-launch
Session pre-launch is a new feature of XenApp 6.5. Instead of waiting for the user to launch an application, you can configure XenApp to set up a session as soon as the user logs on to the farm. At the moment, session pre-launch works only if the user logs on using the Receiver, not through the Web Interface. This means that when the user requests an application, a session is already loaded and all the steps of the logon process you've learned have already taken place. The application can start without any delay. From my experience, this is a feature much appreciated by users, and I use it in production farms. Please note that if you enable session pre-launch, a license is consumed at the time the user logs on.

Configuring
A session pre-launch is based on a published application on your farm. A common mistake is thinking that when you configure a pre-launch application, Citrix effectively launches that application when the user logs on. The application is actually used as a template for the session. Citrix uses some of its settings, like users, servers/worker groups, color depth, and so on. To create a pre-launch session, right-click on the application and choose Other Tasks | Create pre-launch application as shown in the following screenshot:

Creating pre-launch application

To avoid confusion, I suggest renaming the configured pre-launch session, removing the actual application name; for example, you can name it Pre-launch WGProd. A pre-launched session can be used to run applications that have the same settings as the application you chose when you created the session. For example, it can be used by applications that run on the same servers. If you published different groups of applications, usually assigned to different worker groups, you should create pre-launch sessions choosing one application from each group to be sure that all your users benefit from this feature.

Life cycle of a session
If you configured a pre-launch session, when the Receiver passes the user credentials to the XenApp server, a new session is created. The server that will host the session is chosen in the usual way by the Data Collector. In Citrix AppCenter, you can identify pre-launched sessions from the value in the Application State column as shown in the following screenshot:

Pre-launched session

Using Citrix policies, you can set the maximum time a pre-launch session is kept:

Pre-launch Disconnect Timer Interval is the time after which the pre-launch session is put in the disconnected state
Pre-launch Terminate Timer Interval is the time after which the pre-launch session is terminated

Session sharing
Session sharing occurs when a user has an open session on a server and launches an application that is published on the same server. The launch time for the second application is quicker because Citrix doesn't need to set up a new session for it. Session sharing is enabled by default if you publish your applications in seamless window mode. In this mode, applications appear in their own windows without containing an ICA session window. They seem physically installed on the client. Session sharing fails if applications are published with different settings (for example, color depth, encryption, and so on). Make sure to publish your applications using the same settings if possible. Session sharing takes precedence over load balancing; the only exception is if the server reports full load. You can force XenApp to override the load check and to also use fully loaded servers for session sharing. Refer to CTX126839 for the requested registry changes. This is, however, not a recommended configuration; a fully loaded server can lead to poor performance.

Session lingering
If a user closes all the applications running in a session, the session is ended too. Sometimes it would be useful to keep the session open to speed up the start of new applications. With XenApp 6.5, you can configure a lingering time: XenApp waits before closing a session even if all the running applications are closed.
Configuring
With Citrix user policies, you can configure two timers for session lingering:

Linger Disconnect Timer Interval is the time after which a session without applications is put in the disconnected state
Linger Terminate Timer Interval is the time after which a session without applications is terminated

If you're running an older version of XenApp, you can keep a session open even if users close all the running applications with the KeepMeLoggedIn tool; refer to CTX128579.

Summary
The optimization of the logon process can greatly improve the user experience. With EdgeSight and Windows Performance Toolkit, you can perform a deep analysis and detect any causes of delay. If you can't use those tools, you are still able to implement some guidelines and best practices that will surely make user logons faster. Setting up a session is a time-consuming task. With XenApp 6.5, Citrix implemented some new features to improve session management. With session pre-launch and session lingering, you can maximize the reuse of existing sessions when users request an application, speeding up launch times.

Resources for Article:
Further resources on this subject:
Managing Citrix Policies [Article]
Getting Started with XenApp 6 [Article]
Getting Started with the Citrix Access Gateway Product Family [Article]

6 signs you need containers

Richard Gall
05 Feb 2019
9 min read
I'm not about to tell you containers are a hot new trend - clearly, they aren't. Today, they are an important part of the mainstream software development industry that probably won't be disappearing any time soon. But while containers certainly can't be described as a niche or marginal way of deploying applications, they aren't necessarily ubiquitous. There are still developers or development teams yet to fully appreciate the usefulness of containers. You might know them - you might even be one of them. Joking aside, there are often many reasons why people aren't using containers. Sometimes these are good reasons: maybe you just don't need them. Often, however, you do need them, but the mere thought of changing your systems and workflow can feel like more trouble than it's worth. If everything seems to be (just about) working, why shake things up? Well, I'm here to tell you that more often than not it is worthwhile. But to know that you're not wasting your time and energy, there are a few important signs that can tell you if you should be using containers.

Download Containerize Your Apps with Docker and Kubernetes for free, courtesy of Microsoft.

Your codebase is too complex
There are few developers in the world who would tell you that their codebase couldn't do with a little pruning and simplification. But if your code has grown into a beast that everyone fears and doesn't really understand, containers could probably help you a lot.

Why do containers help simplify your codebase?
Let's think about how spaghetti code actually happens. Yes, it always happens by accident, but usually it's something that evolves out of years of solving intractable problems with knock-on effects and consequences that only need to be solved later. By using containers you can begin to think differently about your code. Instead of everything being tied up together, like a complex concrete network of road junctions, containers allow you to isolate specific parts of it. When you can better isolate your code, you can also isolate different problems and domains. This is one of the reasons that containers are so closely aligned with microservices.

Software testing is nightmarish
The efficiency benefits of containers are well documented, but the way containers can help the software testing process is often underplayed - this probably says more about a general inability to treat testing with the respect and time it deserves as much as anything else.

How do containers make testing easier?
There are a number of reasons containers make software testing easier. On the one hand, by using containers you're reducing that gap between the development environment and production, which means you shouldn't be faced with as many surprises once your code hits production as you sometimes might. Containers also make the testing process faster - you only need to test against a container image, you don't need a fully-fledged testing environment for every application you do tests on. What this all boils down to is that testing becomes much quicker and easier. In theory, then, this means the testing process fits much more neatly within the development workflow. Code quality should never be seen as a bottleneck; with containers it becomes much easier to embed the principle in your workflow.
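To make that concrete, here's roughly what a container-based test run can look like. This is a sketch rather than a prescription: the image name, the Dockerfile, and the npm test command are all assumptions standing in for your own project's equivalents:

# Build an image from the project's Dockerfile and run the test suite inside it
docker build -t myapp:test .
docker run --rm myapp:test npm test

Because the image is built the same way in CI and on a laptop, the environment the tests run in is the environment the code ships in.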
Read next: How to build 12 factor microservices on Docker

Your software isn't secure - you've had breaches that could have been prevented
Spaghetti code and a lack of effective testing can lead to major security risks. If no one really knows what's going on inside your applications and inside your code, it's inevitable that you'll have vulnerabilities. And, in turn, it's highly likely these vulnerabilities will be exploited.

How can containers make software security easier?
Because containers allow you to make changes to parts of your software infrastructure (rather than requiring wholesale changes), this makes security patches much easier to achieve. Essentially, you can isolate the problem and tackle it. Without containers, it becomes harder to isolate specific pieces of your infrastructure, which means any changes could have a knock-on effect on other parts of your code that you can't predict. That all being said, it probably is worth mentioning that containers do still pose a significant set of security challenges. While simplicity in your codebase can make testing easier, you are replacing simplicity at that level with increased architectural complexity. To really feel the benefits of container security, you need a strong sense of how your container deployments are working together and how they might interact.

Your software infrastructure is expensive (you feel the despair of vendor lock-in)
Running multiple virtual machines can quickly get expensive. In terms of both storage and memory, if you want to scale up, you're going to be running through resources at a rapid rate. While you might end up spending big on more traditional compute resources, the tools around container management and automation are getting cheaper. One of the costs of many organizations' software infrastructure is lock-in. This isn't just about price, it's about the restrictions that come with sticking with a certain software vendor - you're spending money on software systems that are almost literally restricting your capacity for growth and change.

How do containers solve the software infrastructure problem and reduce vendor lock-in?
Traditional software infrastructure - whether that's on-premise servers or virtual ones - is a fixed cost - you invest in the resources you need, and then either use it or you don't. With containers running on, say, cloud, it becomes a lot easier to manage your software spend alongside strategic decisions about scalability. Fundamentally, it means you can avoid vendor lock-in. Yes, you might still be paying a lot of money for AWS or Azure, but because containers are much more portable, moving your applications between providers is much less hassle and risk.
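In practice, that portability is mostly a matter of retagging and pushing the same image to a different registry. The registry address and image names below are placeholders:

# Move an existing image to another provider's registry
docker tag myapp:1.0 registry.other-cloud.example.com/team/myapp:1.0
docker push registry.other-cloud.example.com/team/myapp:1.0

The application itself doesn't change; only where it's pulled from does.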
Read next: CNCF releases 9 security best practices for Kubernetes, to protect a customer's infrastructure

DevOps is a war, not a way of working
Like containers, DevOps could hardly be considered a hot new trend any more. But this doesn't mean it's now part of the norm. There are plenty of organizations that simply don't get DevOps, or, at the very least, seem to be stumbling their way through sprint meetings with little real alignment between development and operations. There could be multiple causes for this conflict (maybe people just don't get on), but DevOps often fails where the code that's being written and deployed is too complicated for anyone to properly take accountability. This takes us back to the issue of the complex codebase. Think of it this way - if code is a gigantic behemoth that can't be easily broken up, the unintended effects and consequences of every new release and update can cause some big problems - both personally and technically.

How do containers solve DevOps challenges?
Containers can help solve the problems that DevOps aims to tackle by breaking up software into different pieces. This means that developers and operations teams have much more clarity on what code is being written and why, as well as what it should do. Indeed, containers arguably facilitate DevOps practices much more effectively than DevOps proponents have been able to in pre-container years.

Adding new product features is a pain
The issue of adding features or improving applications is a complaint that reaches far beyond the development team. Product management, marketing - these departments will all bemoan the inability to make necessary changes or add new features that they will argue are business critical. Often, developers will take the heat. But traditional monolithic applications make life difficult for developers - you simply can't make changes or updates easily. It's like wanting to replace a radiator and having to redo your house's plumbing. This actually returns us to the earlier point about DevOps - containers make DevOps easier because they enable faster delivery cycles. You can make changes to an application at the level of a container or set of containers. Indeed, you might even simply kill one container and replace it with a new one. In turn, this means you can change and build things much more quickly.

How do containers make it easier to update or build new features?
To continue with the radiator analogy: containers would allow you to replace or change an individual radiator without having to gut your home. Essentially, if you want to add a new feature or change an element, you wouldn't need to go into your application and make wholesale changes - that may have unintended consequences - instead, you can simply make a change by running the resources you need inside a new container (or set of containers).

Watch for the warning signs
As with any technology decision, it's well worth paying careful attention to your own needs and demands. So, before fully committing to containers, or containerizing an application, keep a close eye on the signs that they could be a valuable option. Containers may well force you to come face to face with the reality of technical debt - and if they do, so be it. There's no time like the present, after all. Of course, all of the problems listed above are ultimately symptoms of broader issues or challenges you face as a development team or wider organization. Containers shouldn't be seen as a sure-fire corrective, but they can be an important element in changing your culture and processes.

Learn how to containerize your apps with a new eBook, free courtesy of Microsoft. Download it here.
Monitoring and Troubleshooting Networking

Packt
21 Oct 2015
21 min read
This article by Muhammad Zeeshan Munir, author of the book VMware vSphere Troubleshooting, covers troubleshooting vSphere virtual distributed switches, vSphere standard virtual switches, VLANs, uplinks, DNS, and routing—some of the core issues a seasoned system engineer has to deal with on a daily basis. This article will cover all these topics and give you hands-on, step-by-step instructions to manage and monitor your network resources. The following topics will be covered in this article:

Different network troubleshooting commands
VLANs troubleshooting
Verification of physical trunks and VLAN configuration
Testing of VM connectivity
VMkernel interface troubleshooting
Configuration commands (vicfg-vmknic and esxcli network ip interface)
Use of Direct Console User Interface (DCUI) to verify configuration

(For more resources related to this topic, see here.)

Network troubleshooting commands
Some of the commands that can be used for network troubleshooting include net-dvs, esxcli network, vicfg-route, vicfg-vmknic, vicfg-dns, vicfg-nics, and vicfg-vswitch. You can use the net-dvs command to troubleshoot VMware distributed dvSwitches. The command shows all the information regarding the VMware distributed dvSwitch configuration. The net-dvs command reads the information from the /etc/vmware/dvsdata.db file and displays all the data in the console. A vSphere host keeps updating its dvsdata.db file every five minutes.

Connect to a vSphere host using PuTTY. Enter your user name and password when prompted. Type the following command in the CLI:

net-dvs

You will see something similar to the following screenshot: In the preceding screenshot, you can see that the first line represents the UUID of a VMware distributed switch. The second line shows the maximum number of ports a distributed switch can have. The line com.vmware.common.alias = dvswitch-Network-Pools represents the name of a distributed switch. The next line, com.vmware.common.uplinkPorts: dvUplink1 to dvUplinkn, shows the uplink ports a distributed switch has. The distributed switch MTU is set to 1,600, and you can see the information about CDP just below it. CDP information can be useful to troubleshoot connectivity issues. You can see com.vmware.common.respools.list listing networking resource pools, while com.vmware.common.host.uplinkPorts shows the port numbers assigned to uplink ports. Further details about these uplink ports are explained for each uplink port by its port number. You can also see the port statistics as displayed in the following screenshot. When you perform troubleshooting, these statistics can help you to check the behavior of the distributed switch and the ports. From these statistics, you can diagnose if the data packets are going in and out. As you can see in the following screenshot, all the metrics regarding packet drops are zero. If you find in your troubleshooting that packets are being dropped, you can easily start finding the root cause of the problem:

Unfortunately, the net-dvs command is very poorly documented, and usually it is hard to find useful references. Moreover, it is not supported by VMware. However, you can use it with the -h switch to display more options.
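If you prefer working from PowerCLI, much of the same distributed switch configuration can be inspected with the VDS cmdlets, which are supported (unlike net-dvs). A short sketch, assuming PowerCLI is already connected to vCenter with Connect-VIServer:

# List distributed switches with their MTU, uplink count, and version
Get-VDSwitch | Select-Object Name, Mtu, NumUplinkPorts, Version

# Show the port groups (and their VLAN configuration) on one switch
Get-VDSwitch -Name 'dvSwitch-Network-Pools' | Get-VDPortgroup |
    Select-Object Name, VlanConfiguration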
Repairing a dvsdata.db file
Sometimes, the dvsdata.db file of a vSphere host becomes corrupted and you face different types of distributed switch errors, for example, unable to create proxy DVS. In this case, when you try to run the net-dvs command on a vSphere host, it will fail with an error as well. As mentioned earlier, the net-dvs command reads data from the /etc/vmware/dvsdata.db file—it fails because it is unable to read data from the file. The possible cause for the corruption of the dvsdata.db file could be a network outage; or, when a vSphere host is disconnected from vCenter and deleted, it might still have the old information in its cache. You can resolve this issue by restoring the dvsdata.db file by following these steps:

Through PuTTY, connect to a functioning vSphere host in your infrastructure.
Copy the dvsdata.db file from the vSphere host. The file can be found in /etc/vmware/dvsdata.db.
Transfer the copied dvsdata.db file to the corrupted vSphere host and overwrite it.
Restart your vSphere host.
Once the vSphere host is up and running, use PuTTY to connect to it.
Run the net-dvs command.

The command should be executed successfully this time without any errors.

ESXCLI network
The esxcli network command is a longtime friend of the system administrator and the support staff for troubleshooting network-related issues. The esxcli network command will be used to examine different network configurations and to troubleshoot problems. You can type esxcli network to quickly see a help reference and the different options that can be used with the command. Let's walk through some useful esxcli network troubleshooting commands. Type the following command into your vSphere CLI to list all the virtual machines and the networks they are on. You can see that the command returned the World ID, virtual machine name, number of ports, and the network:

esxcli network vm list
World ID  Name                                                 Num Ports  Networks
--------  ---------------------------------------------------  ---------  ---------------
14323012  cluster08_(5fa21117-18f7-427c-84d1-c63922199e05)             1  dvportgroup-372
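The same esxcli namespace is also reachable from PowerCLI through Get-EsxCli, which is handy when you want to run these checks against many hosts in a loop. A sketch using the V2 interface, reusing the host name from this article's examples:

# Run esxcli network vm list remotely via PowerCLI
$esxcli = Get-EsxCli -VMHost crimv3esx001.linxsol.com -V2
$esxcli.network.vm.list.Invoke() | Select-Object WorldID, Name, Networks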
Now use the World ID of the virtual machine returned by the last command to list all the ports the virtual machine is currently using. You can see the virtual switch name, MAC address of the NIC, IP address, and uplink port ID:

esxcli network vm port list -w 14323012
Port ID: 50331662
vSwitch: dvSwitch-Network-Pools
Portgroup: dvportgroup-372
DVPort ID: 1063
MAC Address: 00:50:56:01:00:7e
IP Address: 0.0.0.0
Team Uplink: all(2)
Uplink Port ID: 0
Active Filters:

Type the following command in the CLI to list the statistics of the virtual switch—you need to replace the port ID as returned by the last command after the -p flag:

esxcli network port stats get -p 50331662
Packet statistics for port 50331662
Packets received: 10787391024
Packets sent: 7661812086
Bytes received: 3048720170788
Bytes sent: 154147668506
Broadcast packets received: 17831672
Broadcast packets sent: 309404
Multicast packets received: 656
Multicast packets sent: 52
Unicast packets received: 10769558696
Unicast packets sent: 7661502630
Receive packets dropped: 92865923
Transmit packets dropped: 0

Type the following command to list complete information about the network card of the virtual machine:

esxcli network nic stats get -n vmnic0
NIC statistics for vmnic0
Packets received: 2969343419
Packets sent: 155331621
Bytes received: 2264469102098
Bytes sent: 46007679331
Receive packets dropped: 0
Transmit packets dropped: 0
Total receive errors: 78507
Receive length errors: 0
Receive over errors: 22
Receive CRC errors: 0
Receive frame errors: 0
Receive FIFO errors: 78485
Receive missed errors: 0
Total transmit errors: 0
Transmit aborted errors: 0
Transmit carrier errors: 0
Transmit FIFO errors: 0
Transmit heartbeat errors: 0
Transmit window errors: 0

A complete reference of the esxcli network command can be found at https://goo.gl/9OMbVU. All the vicfg-* commands are very helpful and easy to use; I encourage you to learn them in order to make your life easier. Here are some of the vicfg-* commands relevant to network troubleshooting:

vicfg-route: We will use this command to add or remove IP routes and to create and delete default IP gateways.
vicfg-vmknic: We will use this command to perform different operations on VMkernel NICs of vSphere hosts.
vicfg-dns: This command will be used to manipulate DNS information.
vicfg-nics: We will use this command to manipulate vSphere physical NICs.
vicfg-vswitch: We will use this command to create, delete, and modify vSwitch information.

Troubleshooting uplinks
We will use the vicfg-nics command to manage physical network adapters of vSphere hosts. The vicfg-nics command can also be used to set up the speed, the VMkernel name for the uplink adapters, duplex settings, driver information, and link state information of the NIC.

Connect to your vMA appliance console and set up the target vSphere host:
vifptarget --set crimv3esx001.linxsol.com

List all the network cards available in the vSphere host. See the following screenshot for the output:
vicfg-nics -l

You can see that my vSphere host has five network cards, from vmnic0 to vmnic5. You are able to see the PCI and driver information. The link state for all the network cards is up. You can also see two types of network card speeds: 1000 Mbps and 9000 Mbps. There is also a card name in the Description field, the MTU, and the MAC address for each network card.

You can set up a network card to auto-negotiate as follows:
vicfg-nics --auto vmnic0

Now let's set the speed of vmnic0 to 1000 and its duplex setting to full:
vicfg-nics --duplex full --speed 1000 vmnic0
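The PowerCLI equivalent is useful when auditing NIC speed and duplex across a whole cluster rather than one host at a time. A sketch, again assuming an existing Connect-VIServer session:

# Report speed and duplex for every physical NIC on a host
Get-VMHostNetworkAdapter -VMHost crimv3esx001.linxsol.com -Physical |
    Select-Object Name, BitRatePerSec, FullDuplex, Mac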
Troubleshooting virtual switches
The last command we will discuss in this article is vicfg-vswitch. The vicfg-vswitch command is a very powerful command that can be used to manipulate the day-to-day operations of a virtual switch. I will show you how to create and configure port groups and virtual switches.

Set up a vSphere host in the vMA appliance for which you want to get information about virtual switches:
vifptarget --set crimv3esx001.linxsol.com

Type the following command to list all the information about the switches the vSphere host has. You can see the command output in the screenshot that follows:
vicfg-vswitch -l

You can see that the vSphere host has one virtual switch and two virtual NICs carrying traffic for the management network and for vMotion. The virtual switch has 128 ports, and 7 of them are in the used state. There are two uplinks to the switch with the MTU set to 1500, while two VLANs are being used: one for the management network and one for the vMotion traffic. You can also see three distributed switches named OpenStack, dvSwitch-External-Networks, and dvSwitch-Network-Pools. Prefixing dv to the distributed switch name is a common practice, and it can help you to easily recognize a distributed switch.

I will go through adding a new virtual switch:
vicfg-vswitch --add vSwitch002

This creates a virtual switch with 128 ports and an MTU of 1500. You can use the --mtu flag to specify a different MTU. Now add an uplink adapter (vmnic0) to the newly created virtual switch vSwitch002:
vicfg-vswitch --link vmnic0 vSwitch002

To add a port group to the virtual switch, use the following command:
vicfg-vswitch --add-pg portgroup002 vSwitch002

Now add an uplink adapter to the port group:
vicfg-vswitch --add-pg-uplink vmnic0 --pg portgroup002 vSwitch002

We have discussed all the commands to create a virtual switch and its port groups and to add uplinks. Now we will see how to delete and edit the configuration of a virtual switch. An uplink NIC can be deleted from the port group using the -N flag. Remove vmnic0 from portgroup002:
vicfg-vswitch --del-pg-uplink vmnic0 --pg portgroup002 vSwitch002

You can delete the recently created port group as follows:
vicfg-vswitch --del-pg portgroup002 vSwitch002

To delete a switch, you first need to remove the uplink adapter from the virtual switch. You need to use the -U flag, which unlinks the uplink from the switch:
vicfg-vswitch --unlink vmnic0 vSwitch002

You can delete a virtual switch using the -d flag. Here is how you do it:
vicfg-vswitch --delete vSwitch002

You can check the Cisco Discovery Protocol (CDP) settings by using the --get-cdp flag with the vicfg-vswitch command. The following command resulted in putting the CDP in the Listen state, which indicates that the vSphere host is configured to receive CDP information from the physical switch:
vi-admin@vma:~[crimv3esx001.linxsol.com]> vicfg-vswitch --get-cdp vSwitch0
listen

You can configure the CDP options for the vSphere host to down, listen, or advertise. In the Listen mode, the vSphere host tries to discover and publish this information received from a Cisco switch port, though the information of the vSwitch cannot be seen by the Cisco device. In the Advertise mode, the vSphere host doesn't discover and publish the information about the Cisco switch; instead, it publishes information about its vSwitch to the Cisco switch device.
vicfg-vswitch --set-cdp both vSwitch0
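For completeness, the same standard-switch operations can be scripted from PowerCLI, which spares you the flag syntax. A sketch against the same host:

# Create a standard vSwitch with an uplink, then add a port group
$vmhost = Get-VMHost crimv3esx001.linxsol.com
$vs = New-VirtualSwitch -VMHost $vmhost -Name vSwitch002 -Nic vmnic0
New-VirtualPortGroup -VirtualSwitch $vs -Name portgroup002

# Tear it down again
Remove-VirtualPortGroup -VirtualPortGroup (Get-VirtualPortGroup -VMHost $vmhost -Name portgroup002) -Confirm:$false
Remove-VirtualSwitch -VirtualSwitch $vs -Confirm:$false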
Troubleshooting VLANs
Virtual LANs (VLANs) are used to separate the physical switching segment into different logical switching segments in order to segregate the broadcast domains. VLANs not only provide network segmentation but also provide us with a method of effective network management. They also increase overall network security, and nowadays they are very commonly used in infrastructure. If not set up correctly, VLANs can leave your vSphere host with no connectivity, and you can face some very common problems where you are unable to ping or resolve host names anymore. Some common errors are exposed, such as Destination host unreachable and Connection failed.

A Private VLAN (PVLAN) is an extended version of a VLAN that divides a logical broadcast domain into further segments and forms private groups. PVLANs are divided into primary and secondary PVLANs. The primary PVLAN is the original VLAN that is divided into smaller segments; it hosts all the secondary PVLANs within it. Secondary PVLANs live within primary PVLANs, and individual secondary PVLANs are recognized by the VLAN IDs linked to them. Just like their ancestor VLANs, the packets that travel within secondary PVLANs are tagged with their associated IDs. The physical switch then recognizes whether the packets are tagged as isolated, community, or promiscuous.

As network troubleshooting involves taking care of many different aspects, one aspect you will come across in the troubleshooting cycle is actually troubleshooting VLANs. vSphere Enterprise Plus licensing is a requirement to connect a host using a virtual distributed switch and VLANs. You can see the three different network segments in the following screenshot. VLAN A connects all the virtual machines on different vSphere hosts; VLAN B is responsible for carrying management network traffic; and VLAN C is responsible for carrying vMotion-related traffic. In order to create PVLANs on your vSphere host, you also need the support of a physical switch:

For detailed information about the vSphere network, refer to the VMware official networking guide for vSphere 5.5 at http://goo.gl/SYySFL.

Verifying physical trunks and VLAN configuration
The first and most important step in troubleshooting your VLAN problem is to look into the VLAN configuration of your vSphere host. You should always start by verifying it. Let's walk through how to verify the network configuration of the management network and the VLAN configuration from the vSphere client:

Open and log in to your vSphere client.
Click on the vSphere host you are trying to troubleshoot.
Click on the Configuration menu and choose Networking, and then Properties of the switch you are troubleshooting.
Choose the network you are troubleshooting from the list, and click on Edit. This will open a new window.
Verify the VLAN ID for Management Network. Match it against the VLAN ID provided by your network administrator.

Verifying VLAN configuration from CLI
Following are the steps for verifying the VLAN configuration from the CLI:

Log in to vSphere CLI.
Type the following command in the console:
esxcfg-vswitch -l

Alternatively, in the vMA appliance, type the vicfg-vswitch command—the output is similar for both commands:
vicfg-vswitch -l

The output of the esxcfg-vswitch -l command is as follows:
Switch Name      Num Ports   Used Ports  Configured Ports  MTU     Uplinks
vSwitch0         128         7           128               1500    vmnic3,vmnic2

PortGroup Name        VLAN ID  Used Ports  Uplinks
vMotion               2231     1           vmnic3,vmnic2
Management Network    2230     1           vmnic3,vmnic2
---Omitted output---

The output of the vicfg-vswitch -l command is as follows:
Switch Name     Num Ports       Used Ports      Configured Ports    MTU     Uplinks
vSwitch0        128             7               128                 1500    vmnic2,vmnic3

PortGroup Name                VLAN ID   Used Ports      Uplinks
vMotion                       2231      1               vmnic2,vmnic3
Management Network            2230      1               vmnic3,vmnic2
---Omitted output---

Match it with your network configuration. If the VLAN ID is incorrect or missing, you can add or edit it using the following command from the vSphere CLI:
esxcfg-vswitch -v 2233 -p "Management Network" vSwitch0

To add or edit the VLAN ID from the vMA appliance, use the following command:
vicfg-vswitch --vlan 2233 --pg "Management Network" vSwitch0

Verifying VLANs from PowerCLI
Verifying information about VLANs from PowerCLI is fairly simple. Type the following command into the console after connecting with vCenter using Connect-VIServer:

Get-VirtualPortGroup -VMHost crimv3esx001.linxsol.com | Select Name, VirtualSwitch, VLanId
Name                    VirtualSwitch    VlanId
----                    -------------    ------
vMotion                 vSwitch0         2231
Management Network      vSwitch0         2233

Verifying PVLANs and secondary PVLANs
When you have configured PVLANs or secondary PVLANs in your vSphere infrastructure, you may arrive at a situation where you need to troubleshoot them. This topic will provide you with some tips to obtain and view information about PVLANs and secondary PVLANs, as follows:

Log in to the vSphere client and click on Networking.
Select a distributed switch and right-click on it.
From the menu, choose Edit Settings and click on it. This will open the Distributed Switch Settings window.
Click on the third tab, named Private VLAN.
In the section on the left, named Primary private VLAN ID, verify the VLAN ID provided by your network engineer. You can verify the VLAN ID of the secondary PVLAN in the next section on the right.

Testing virtual machine connectivity
Whenever you are troubleshooting, virtual-machine-to-virtual-machine testing is very important. It helps you to isolate the problem domain to a smaller scope. When performing virtual-machine-to-virtual-machine testing, you should always move the virtual machines to a single vSphere host. You can then start troubleshooting the network using basic commands, such as ping. If ping works, you are ready to test further and move the virtual machines to other hosts; if it still doesn't work, it is most likely a configuration problem on the physical switch or a mismatched physical trunk configuration.
The most common problem in this scenario is a problematic physical switch configuration.

Troubleshooting VMkernel interfaces
In this section, we will see how to troubleshoot VMkernel interfaces:

Confirm VLAN tagging
Ping to check connectivity
vicfg-vmknic
esxcli network ip interface for local configuration
esxcli network ip interface list
Add or remove
Set
esxcli network ip interface ipv4 get

You should know how to use these commands to test whether everything is working. You should be able to ping to ensure connectivity exists. We will use the vicfg-vmknic command to configure vSphere VMkernel NICs. Let's create a new VMkernel NIC in a vSphere host using the following steps:

Log in to your VMware vSphere CLI.
Type the following command to create a new VMkernel NIC:
vicfg-vmknic -h crimv3esx001.linxsol.com --add --ip 10.2.0.10 -n 255.255.255.0 'portgroup01'

You can enable vMotion using the vicfg-vmknic command with the --enable-vmotion flag. (You will not be able to enable vMotion from ESXCLI.) vMotion provides migration of your virtual machines with zero downtime.

You can delete an existing VMkernel NIC as follows:
vicfg-vmknic -h crimv3esx001.linxsol.com --delete 'portgroup01'

Now check which VMkernel NICs are available in the system by typing the following command:
vicfg-vmknic -l
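The PowerCLI route to the same result is New-VMHostNetworkAdapter, which can also flag the interface for vMotion in one step. A sketch using the same addresses as above and assuming portgroup01 lives on vSwitch0:

# Create a VMkernel NIC on portgroup01 and enable vMotion on it
New-VMHostNetworkAdapter -VMHost crimv3esx001.linxsol.com -PortGroup portgroup01 `
    -VirtualSwitch vSwitch0 -IP 10.2.0.10 -SubnetMask 255.255.255.0 -VMotionEnabled:$true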
Verifying configuration from DCUI
When you successfully install vSphere, the first yellow screen that you see is called the vSphere DCUI. The DCUI is a frontend management system that helps perform some basic system administration tasks. It also offers the best way to troubleshoot some problems that may be difficult to troubleshoot through vMA, vCLI, or PowerCLI. Further, it is very useful when your host becomes unresponsive in vCenter or is not accessible from any of the management tools. Some useful tasks that can be performed using the DCUI are as follows:

Configuring the Lockdown mode
Checking connectivity of the Management Network by ping
Configuring and restarting network settings
Restarting management agents
Viewing logs
Resetting the vSphere configuration
Changing the root password

Verifying network connectivity from DCUI
The vSphere host automatically assigns the first network card available to the system for the management network. Moreover, the default installation of the vSphere host does not let you set up VLAN tags until the VMkernel has been loaded. Verifying network connectivity from the DCUI is important but easy. To do so, follow these steps:

Press F2 and enter your root user name and password. Click OK.
Use the cursor keys to go down to the Test Management Network option.
Click Enter, and you will see a new screen. Here you can enter up to three IP addresses and the host name to be resolved. You can also type your gateway address on this screen to see if you are able to reach your gateway. In the host name field, you can enter your DNS server name to test if the name resolves successfully.
Press Esc to get back and Esc again to log off from the vSphere DCUI.

Verifying management network from DCUI
You can also verify the settings of your management network from the DCUI:

Press F2 and enter your root user name and password. Click OK.
Use the cursor keys to go down to the Configure Management Network option and click Enter.
Click Enter again after selecting the first option, Network Adapters.
On the next screen, you will see a list of all the network adapters your system has. It will show you the Device Name, Hardware Type, Label, MAC address of the network card, and the status as Connected or Disconnected. From the given network cards, you can select or deselect any of the network cards by pressing the spacebar on your keyboard.
Press Esc to get back and Esc again to log off from the vSphere DCUI.

As you can see in the preceding screenshot, you can also configure the IP address and DNS settings for your vSphere host. You can also use the DCUI to configure VLANs and the DNS suffix for your vSphere host.

Summary
In this article, we took a deep dive into the troubleshooting commands and some of the monitoring tools used to monitor network performance. Having various platforms on which to execute different commands helps you to tailor your troubleshooting approach: for example, for troubleshooting a single vSphere host you may like to use esxcli, but for a bunch of vSphere hosts you would want to automate scripting tasks from PowerCLI or from a vMA appliance.

Resources for Article:
Further resources on this subject:
Upgrading VMware Virtual Infrastructure Setups [article]
VMware vRealize Operations Performance and Capacity Management [article]
Working with Virtual Machines [article]

It's Black Friday: But what's the business (and developer) cost of downtime?

Richard Gall
23 Nov 2018
4 min read
Black Friday is back and, as you've probably already noticed, with a considerable vengeance. According to Adobe Analytics data, online spending is predicted to hit $3.7 billion over this holiday season in the U.S., up from $2.9 billion in 2017. But while consumers clamour for deals and businesses reap the rewards, it's important to remember there's a largely hidden plane of software engineering labour. Without this army of developers, consumers would most likely be hitting their devices in frustration, while business leaders would be missing tough revenue targets - so, as we enter Black Friday, let's pour one out for all those engineers on call and trying their best to keep eCommerce sites on their feet.

Here's to the software engineers keeping things running on Black Friday
Of course, the pain that hits on days like Black Friday and Cyber Monday can be minimised with smart planning and effective decision making long before those sales begin. However, for engineering teams under-resourced and lacking the right tools, that is simply impossible. This means that software engineers are left in a position where they're treading water, knowing that they're going to be sinking once those big days come around. It doesn't have to be like this. With smarter leadership and, indeed, more respect for the intensive work engineers put in to make websites and apps actually work, revenue-driving platforms can become more secure, resilient, and stable.

Chaos engineering platform Gremlin publishes the 'true cost of downtime'
This is the central argument of chaos engineering platform Gremlin, who we've covered a number of times this year. To coincide with Black Friday, the team has put together what they believe is the 'true cost of downtime'. On the one hand this is a good marketing hook for their chaos engineering platform, but, cynicism aside, it's also a good explanation of why the principles of chaos engineering can be so valuable from both a business and developer perspective. Estimating the annual revenue of some of the biggest companies in the world, Gremlin has then created an interactive table to demonstrate what the cost of downtime for each of those businesses would be, for the length of time you are on the page. For 20 minutes of downtime, Amazon.com would have lost a staggering $4.4 million. For Walgreens it's more than $80,000. Gremlin provide some context to all this, saying:

"Enterprise commerce businesses typically rely on a complex microservices architecture, from fulfillment, to website security, ability to scale with holiday traffic, and payment processing - there is a lot that can go wrong and impact revenue, damage customer trust, and consume engineering time. If an ecommerce site isn't 100% online and performant, it's losing revenue."

"The holiday season is especially demanding for SREs working in ecommerce. Even the most skilled engineering teams can struggle to keep up with the demands of peak holiday traffic (i.e. Black Friday and Cyber Monday). Just going down for a few seconds can mean thousands in lost revenue, but for some sites, downtime can be exponentially more expensive."
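The arithmetic behind figures like these is straightforward: spread annual revenue evenly across the year and multiply by the length of the outage. A PowerShell sketch with a hypothetical $100 billion annual revenue (the even-spread assumption is itself a simplification - Black Friday traffic is anything but even, so peak-hour losses run higher):

# Rough cost of a 20-minute outage, assuming revenue is spread evenly over the year
$annualRevenue = 100e9                          # hypothetical, in dollars
$perMinute = $annualRevenue / (365 * 24 * 60)   # roughly $190,000 per minute
$outageCost = $perMinute * 20
"{0:C0} for a 20-minute outage" -f $outageCost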
For Gremlin, chaos engineering is clearly the answer to many of the problems days like Black Friday pose. While it might not work for every single organization, it's nevertheless true that failing to pay attention to the value of your applications and websites at an hour-by-hour level could be incredibly damaging. With outages on Facebook, WhatsApp, and Instagram happening earlier this week, these problems aren't hidden away - they're in full view of the public. What does remain hidden, however, is the work and stress that goes into tackling these issues and ensuring things are working as they should be. Perhaps it's time to start learning the lessons of Black Friday - business revenues will be that little bit healthier, but engineers will also be that little bit happier.

Proxmox VE Fundamentals

Packt
04 Apr 2016
12 min read
In this article written by Rik Goldman, author of the book Learning Proxmox VE, we introduce Proxmox Virtual Environment (PVE), which is a mature, complete, well-supported, enterprise-class virtualization environment for servers. It is an open source tool—based on the Debian GNU/Linux distribution—that manages containers, virtual machines, storage, virtualized networks, and high-availability clustering through a well-designed, web-based interface or via the command-line interface. (For more resources related to this topic, see here.)

Developers provided the first stable release of Proxmox VE in 2008; four years and eight point releases later, ZDNet's Ken Hess boldly, but quite sensibly, declared Proxmox VE the ultimate hypervisor (http://www.zdnet.com/article/proxmox-the-ultimate-hypervisor/). Four years later, PVE is on version 4.1, in use by at least 90,000 hosts and more than 500 commercial customers in 140 countries; the web-based administrative interface itself is translated into nineteen languages. This article will explore the fundamental technologies underlying PVE's hypervisor features: LXC, KVM, and QEMU. To do so, we will develop a working understanding of virtual machines, containers, and their appropriate use. We will cover the following topics:

Proxmox VE in brief
Virtualization and containerization with PVE
Proxmox VE virtual machines, KVM, and QEMU
Containerization with PVE and LXC

Proxmox VE in brief
With Proxmox VE, Proxmox Server Solutions GmbH (https://www.proxmox.com/en/about) provides us with an enterprise-ready, open source type II hypervisor. Later, you'll find some of the features that make Proxmox VE such a strong enterprise candidate. The license for Proxmox VE is very deliberately the GNU Affero General Public License (V3) (https://www.gnu.org/licenses/agpl-3.0.html). From among the many free and open source compatible licenses available, this is a significant choice because it is "specifically designed to ensure cooperation with the community in the case of network server software."

PVE is primarily administered from an integrated web interface or from the command line, locally or via SSH. Consequently, there is no need for a separate management server and the associated expenditure. In this way, Proxmox VE significantly contrasts with alternative enterprise virtualization solutions by vendors such as VMware. Proxmox VE instances/nodes can be incorporated into PVE clusters and centrally administered from a unified web interface. Proxmox VE provides for live migration—the movement of a virtual machine or container from one cluster node to another without any disruption of services. This is a rather unique feature of PVE and not common to competing products.

Features                          Proxmox VE                                       VMware vSphere
Hardware requirements             Flexible                                         Strict compliance with HCL
Integrated management interface   Web- and shell-based (browser and SSH)           No. Requires dedicated management server at additional cost
Simple subscription structure     Yes; based on number of premium support tickets  No
                                  per year and CPU socket count
High availability                 Yes                                              Yes
VM live migration                 Yes                                              Yes
Supports containers               Yes                                              No
Virtual machine OS support        Windows and Linux                                Windows, Linux, and Unix
Community support                 Yes                                              No
Live VM snapshots                 Yes                                              Yes

Contrasting Proxmox VE and VMware vSphere features

For a complete catalog of features, see the Proxmox VE datasheet at https://www.proxmox.com/images/download/pve/docs/Proxmox-VE-Datasheet.pdf.
Like its competitors, PVE is a hypervisor: a typical hypervisor is software that creates, runs, configures, and manages virtual machines based on an administrator or engineer's choices. PVE is known as a type II hypervisor because the virtualization layer is built upon an operating system. A type II hypervisor, such as PVE, runs directly over the operating system. In Proxmox VE's case, the operating system is Debian; since the release of PVE 4.0, the underlying operating system has been Debian "Jessie." By contrast, a type I hypervisor (such as VMware's ESXi) runs directly on bare metal, without the mediation of an operating system. It has no additional function beyond managing virtualization and the physical hardware.

As a type II hypervisor, Proxmox VE is built on the Debian project. Debian is a GNU/Linux distribution renowned for its reliability, commitment to security, and its thriving and dedicated community of contributing developers. Debian-based GNU/Linux distributions are arguably the most popular GNU/Linux distributions for the desktop. One characteristic that distinguishes Debian from competing distributions is its release policy: Debian releases only when its development community can stand behind it for its stability, security, and usability. Debian does not distinguish between long-term support releases and regular releases as do some other distributions. Instead, all Debian releases receive strong support and critical updates through the first year following the next release. (Since 2007, a major release of Debian has been made about every two years. Debian 8, Jessie, was released just about on schedule in 2015.) Proxmox VE's reliance on Debian is thus a testament to its commitment to these values: stability, security, and usability over scheduled releases that favor cutting-edge features.

PVE provides its virtualization functionality through three open technologies, and the efficiency with which they're integrated by its administrative web interface:

LXC
KVM
QEMU

To understand how this foundation serves Proxmox VE, we must first be able to clearly understand the relationship between virtualization (or, specifically, hardware virtualization) and containerization (OS virtualization). As we proceed, their respective use cases should become clear.

Virtualization and containerization with Proxmox VE
It is correct to ultimately understand containerization as a type of virtualization. However, here, we'll look first to conceptually distinguish a virtual machine from a container by focusing on contrasting characteristics. Simply put, virtualization is a technique through which we provide fully-functional computing resources without a demand for the resources' physical organization, locations, or relative proximity. Briefly put, virtualization technology allows you to share and allocate the resources of a physical computer into multiple execution environments. Without context, virtualization is a vague term, encapsulating the abstraction of such resources as storage, networks, servers, desktop environments, and even applications from their concrete hardware requirements through software implementation solutions called hypervisors.
Virtualization thus affords us more flexibility, more functionality, and a significant positive impact on our budgets—often realized with merely the resources we have at hand. In terms of PVE, virtualization most commonly refers to the abstraction of all aspects of a discrete computing system from its hardware. In this context, virtualization is the creation, in other words, of a virtual machine or VM, with its own operating system and applications. A VM may be initially understood as a computer that has the same functionality as a physical machine. Likewise, it may be incorporated into and communicated with via a network exactly as a machine with physical hardware would. Put yet another way, from inside a VM, we will experience no difference by which we can distinguish it from a physical computer. The virtual machine, moreover, hasn't the physical footprint of its physical counterparts. The hardware it relies on is, in fact, provided by software that borrows the hardware resources from a host installed on a physical machine (or bare metal). Nevertheless, the software components of the virtual machine, from the applications to the operating system, are distinctly separated from those of the host machine. This advantage is realized when it comes to allocating physical space for resources. For example, we may have a PVE server running a web server, database server, firewall, and log management system—all as discrete virtual machines. Rather than consuming the physical space, resources, and labor of maintaining four physical machines, we simply make physical room for the single Proxmox VE server and configure an appropriate virtual LAN as necessary.

In a white paper entitled Putting Server Virtualization to Work, AMD articulates well the benefits of virtualization to businesses and developers (https://www.amd.com/Documents/32951B_Virtual_WP.pdf):

Top 5 business benefits of virtualization:
Increases server utilization
Improves service levels
Streamlines manageability and security
Decreases hardware costs
Reduces facility costs

The benefits of virtualization for a development and test environment:
Lowers capital and space requirements
Lowers power and cooling costs
Increases efficiencies through shorter test cycles
Faster time-to-market

To these benefits, let's add portability and encapsulation: the unique ability to migrate a live VM from one PVE host to another—without suffering a service outage. Proxmox VE makes the creation and control of virtual machines possible through the combined use of two free and open source technologies: Kernel-based Virtual Machine (or KVM) and Quick Emulator (QEMU). Used together, we refer to this integration of tools as KVM-QEMU.

KVM
KVM has been an integral part of the Linux kernel since February 2007. This kernel module allows GNU/Linux users and administrators to take advantage of an architecture's hardware virtualization extensions; for our purposes, these extensions are AMD's AMD-V and Intel's VT-x for the x86_64 architecture. To really make the most of Proxmox VE's feature set, you'll therefore very much want to install on an x86_64 machine with a CPU with integrated virtualization extensions. For a full list of AMD and Intel processors supported by KVM, visit Intel at http://ark.intel.com/Products/VirtualizationTechnology or AMD at http://support.amd.com/en-us/kb-articles/Pages/GPU120AMDRVICPUsHyperVWin8.aspx.
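A quick way to confirm those extensions are present before installing is to query /proc/cpuinfo on the target machine. Run over SSH from any workstation (the host name is a placeholder); a result greater than zero means VT-x (vmx) or AMD-V (svm) is exposed to the operating system:

# Count CPU flags indicating hardware virtualization support
ssh root@pve01 "egrep -c '(vmx|svm)' /proc/cpuinfo"

Note that a zero here can also mean the extensions are merely disabled in the firmware, so check the BIOS/UEFI settings before ruling a machine out.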
QEMU
QEMU provides an emulation and virtualization interface that can be scripted or otherwise controlled by a user.

Visualizing the relationship between KVM and QEMU

Without Proxmox VE, we could essentially define the hardware, create a virtual disk, and start and stop a virtualized server from the command line using QEMU. Alternatively, we could rely on any one of an array of GUI frontends for QEMU (a list of GUIs available for various platforms can be found at http://wiki.qemu.org/Links#GUI_Front_Ends). Of course, working with these solutions is productive only if you're interested in what goes on behind the scenes in PVE when virtual machines are defined. Proxmox VE's management of virtual machines is itself managing QEMU through its API. Managing QEMU from the command line can be tedious. The following is a line from a script that launched Raspbian, a Debian remix intended for the architecture of the Raspberry Pi, on an x86 Intel machine running Ubuntu. When we see how easy it is to manage VMs from Proxmox VE's administrative interfaces, we'll sincerely appreciate that relative simplicity:

qemu-system-arm -kernel kernel-qemu -cpu arm1176 -m 256 -M versatilepb -no-reboot -serial stdio -append "root=/dev/sda2 panic=1" -hda ./$raspbian_img -hdb swap

If you're familiar with QEMU's emulation features, it's perhaps important to note that we can't manage emulation through the tools and features Proxmox VE provides—despite its reliance on QEMU. From a bash shell provided by Debian, it's possible. However, the emulation can't be controlled through PVE's administration and management interfaces.

Containerization with Proxmox VE
Containers are a class of virtual machines (as containerization has enjoyed a renaissance since 2005, the term OS virtualization has become synonymous with containerization and is often used for clarity). However, by way of contrast with VMs, containers share operating system components, such as libraries and binaries, with the host operating system; a virtual machine does not.

Visually contrasting virtual machines with containers

The container advantage
This arrangement potentially allows a container to run leaner and with fewer hardware resources borrowed from the host. For many authors, pundits, and users, containers also offer a demonstrable advantage in terms of speed and efficiency. (However, it should be noted here that as resources such as RAM and more powerful CPUs become cheaper, this advantage will diminish.) The Proxmox VE container is made possible through LXC from version 4.0 (it was made possible through OpenVZ in previous PVE versions). LXC is the third fundamental technology serving Proxmox VE's ultimate interest. Like KVM and QEMU, LXC (or Linux Containers) is an open source technology. It allows a host to run, and an administrator to manage, multiple operating system instances as isolated containers on a single physical host. Conceptually, then, a container very clearly represents a class of virtualization rather than an opposing concept. Nevertheless, it's helpful to maintain a clear distinction between a virtual machine and a container as we come to terms with PVE. The ideal implementation of a Proxmox VE guest is contingent on our distinguishing and choosing between a virtual-machine solution and a container solution. Since Proxmox VE containers share components of the host operating system and can offer advantages in terms of efficiency, this text will guide you through the creation of containers whenever the intended guest can be fully realized with Debian Jessie as our hypervisor's operating system without sacrificing features.
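On a PVE node itself, the two guest types are managed by separate command-line tools, which makes the distinction concrete. A sketch, again over SSH to a hypothetical node:

# KVM-QEMU virtual machines on this node
ssh root@pve01 "qm list"

# LXC containers on this node
ssh root@pve01 "pct list"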
When our intent is a guest running a Microsoft Windows operating system, for example, a Proxmox VE container ceases to be a solution. In such a case, we turn, instead, to creating a virtual machine. We must rely on a VM precisely because the operating system components that Debian can share with a Linux container are not components a Microsoft Windows operating system can make use of.

Summary

In this article, we have come to terms with the three open source technologies that provide Proxmox VE's foundational features: containerization and virtualization with LXC, KVM, and QEMU. Along the way, we've come to understand that containers, while being a type of virtualization, have characteristics that distinguish them from virtual machines. These differences will be crucial as we determine which technology to rely on for a virtual server solution with Proxmox VE.

Resources for Article:

Further resources on this subject:
- Deploying App-V 5 in a Virtual Environment [article]
- Setting Up a Spark Virtual Environment [article]
- Basic Concepts of Proxmox Virtual Environment [article]

Working with VMware Infrastructure

Packt
04 Mar 2015
21 min read
In this article by Daniel Langenhan, the author of VMware vRealize Orchestrator Cookbook, we will take a closer look at how Orchestrator interacts with vCenter Server and vRealize Automation (vRA—formerly known as vCloud Automation Center, vCAC). vRA uses Orchestrator to access and automate infrastructure using Orchestrator plugins. We will take a look at how to make Orchestrator workflows available to vRA. We will investigate the following recipes:

- Unmounting all the CD-ROMs of all VMs in a cluster
- Provisioning a VM from a template
- An approval process for VM provisioning

(For more resources related to this topic, see here.)

There are quite a lot of plugins for Orchestrator to interact with VMware infrastructure and programs:

- vCenter Server
- vCloud Director (vCD)
- vRealize Automation (vRA—formerly known as vCloud Automation Center, vCAC)
- Site Recovery Manager (SRM)
- VMware Auto Deploy
- Horizon (View and Virtual Desktops)
- vRealize Configuration Manager (earlier known as vCenter Configuration Manager)
- vCenter Update Manager
- vCenter Operations Manager, vCOps (only example packages)

VMware, as of the writing of this article, is still renaming its products. An overview of all plugins and their names and download links can be found at http://www.vcoteam.info/links/plug-ins.html. There are quite a lot of plugins, and we will not be able to cover all of them, so we will focus on the one that is most used, vCenter. Sadly, vCloud Director is earmarked by VMware to disappear for everyone but service providers, so there is no real need to show any workflow for it. We will also work with vRA and see how it interacts with Orchestrator.

vSphere automation

The interaction between Orchestrator and vCenter is done using the vCenter API. Here is the explanation of the interaction, which you can refer to in the following figure. A user starts an Orchestrator workflow (1) either in an interactive way via the vSphere Web Client, the Orchestrator Web Operator, or the Orchestrator Client, or via the API. The workflow in Orchestrator will then send a job (2) to vCenter and receive a task ID back (type VC:Task). vCenter will then start enacting the job (3). Using the vim3WaitTaskEnd action (4), Orchestrator pauses until the task has been completed. If we do not use the wait task, we can't be certain whether the task has ended or failed. It is extremely important to use the vim3WaitTaskEnd action whenever we send a job to vCenter. When the wait task reports that the job has finished, the workflow will be marked as finished.

The vCenter MoRef

The MoRef (Managed Object Reference) is a unique ID for every object inside vCenter. MoRefs are basically strings; some examples are shown here:

- VM: vm-301
- Network: network-312 (a distributed port group appears as, for example, dvportgroup-242)
- Datastore: datastore-101
- ESXi host: host-44
- Data center: datacenter-21
- Cluster: domain-c41

The MoRefs are typically stored in the attribute .id or .key of the Orchestrator API object. For example, the MoRef of a vSwitch Network is VC:Network.id. To browse for MoRefs, you can use the Managed Object Browser (MOB), documented at https://pubs.vmware.com/vsphere-55/index.jsp#com.vmware.wssdk.pg.doc/PG_Appx_Using_MOB.20.1.html.

The vim3WaitTaskEnd action

As already said, vim3WaitTaskEnd is one of the most central actions while interacting with vCenter.
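To make the pattern concrete, here is a minimal scriptable-task sketch that submits a reconfiguration job and hands the returned task to the wait action. The module path shown is where vim3WaitTaskEnd usually lives, but treat it as an assumption and verify it in your Orchestrator client; the new VM name is a placeholder:

// Sketch only: rename a VM and wait for the resulting vCenter task.
// Assumes vm is an in-parameter of type VC:VirtualMachine and that
// vim3WaitTaskEnd lives in its usual module (verify the path).
var spec = new VcVirtualMachineConfigSpec();
spec.name = "renamed-vm";                      // placeholder value
var vcTask = vm.reconfigVM_Task(spec);         // returns a VC:Task immediately
// Poll every 5 seconds; false = don't log progress percentages
var actionResult = System.getModule("com.vmware.library.vc.basic")
    .vim3WaitTaskEnd(vcTask, false, 5);
System.log("Task ended with result: " + actionResult);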
The action has the following variables:

- vcTask (IN, VC:Task): carries the reconfiguration task from the script to the wait task
- progress (IN, Boolean): if true, the progress of the task is written to the logs as a percentage
- pollRate (IN, Number): how often the action should check vCenter for task completion
- ActionResult (OUT, Any): returns the task's result

The wait task will check, at regular intervals (pollRate), the status of a task that has been submitted to vCenter. The task can have the following states:

- Queued: the task is queued and will be executed as soon as possible.
- Running: the task is currently running. If progress is set to true, the progress in percentage will be displayed in the logs.
- Success: the task finished successfully.
- Error: the task has failed and an error will be thrown.

Other vCenter wait actions

There are actually five waiting tasks that come with the vCenter Server plugin. Here's an overview of the other four:

- vim3WaitToolsStarted: waits until the VMware Tools are started on a VM or until a timeout is reached.
- vim3WaitForPrincipalIP: waits until the VMware Tools report the primary IP of a VM or until a timeout is reached. This typically indicates that the operating system is ready to receive network traffic. The action will return the primary IP.
- vim3WaitDnsNameInTools: waits until the VMware Tools report a given DNS name of a VM or until a timeout is reached. The in-parameter addNumberToName is not used and can be set to Null.
- WaitTaskEndOrVMQuestion: waits until a task is finished or the VM raises a question. A vCenter question requires user interaction.

vRealize Automation (vRA)

Automation has changed since the beginning of Orchestrator. Before tools such as vCloud Director or vCloud Automation Center (vCAC)/vRealize Automation (vRA) existed, Orchestrator was the main tool for automating vCenter resources. With version 6.2 of vCloud Automation Center (vCAC), the product has been renamed vRealize Automation. Now vRA is deemed to become the central cornerstone in the VMware automation effort. vRealize Orchestrator (vRO) is used by vRA to interact with and automate VMware and non-VMware products and infrastructure elements.

Throughout the various vCAC/vRA releases, the role of Orchestrator has changed substantially. Orchestrator started off as an extension to vCAC and became a central part of vRA:

- In vCAC 5.x, Orchestrator was only an extension of the IaaS life cycle; it was tied in using the stubs.
- vCAC 6.0 integrated Orchestrator as an XaaS (Everything as a Service) provider using the Advanced Service Designer (ASD).
- In vCAC 6.1, Orchestrator is used to perform all VMware NSX operations (VMware's new network virtualization and automation), meaning that it became even more of a central part of the IaaS services.
- With vCAC 6.2, the Advanced Service Designer was enhanced to allow more complex form designs, allowing better leverage of Orchestrator workflows.

As you can see in the following figure, vRA connects to the vCenter Server using an infrastructure endpoint that allows vRA to conduct basic infrastructure actions, such as power operations, cloning, and so on. It doesn't allow any complex interactions with the vSphere infrastructure, such as HA configurations. Using the Advanced Service Endpoints, vRA integrates the Orchestrator (vRO) plugins as additional services. This allows vRA to offer the entire plugin infrastructure as services to its tenants.
The vCenter Server, AD, and PowerShell plugins are typical integrations that are used with vRA. Using the Advanced Service Designer (ASD), you can create integrations that use Orchestrator workflows. ASD allows you to offer Orchestrator workflows as vRA catalog items, making it possible for tenants to access any IT service that can be configured with Orchestrator via its plugins. The following diagram shows an example using the Active Directory plugin. The Orchestrator plugin provides access to the AD services. By creating a custom resource using the exposed AD infrastructure, we can create a service blueprint and resource actions, both of which are based on Orchestrator workflows that use the AD plugin.

The other method of integrating Orchestrator into the IaaS life cycle, which was predominantly used in vCAC 5.x, is to use the stubs. The build process of a VM has several steps; each step can be assigned a customizable workflow (called a stub). You can configure vRA to run an Orchestrator workflow at these stubs in order to facilitate a few customized actions. Such actions could change a VM's HA or DRS configuration, or use the guest integration to install or configure a program on the VM.

Installation

How to install and configure vRA is outside the scope of this article, but take a look at http://www.kendrickcoleman.com/index.php/Tech-Blog/how-to-install-vcloud-automation-center-vcac-60-part-1-identity-appliance.html for more information. If you don't have the hardware or the time to install vRA yourself, you can use the VMware Hands-on Labs, which can be accessed after clicking on Try for Free at http://hol.vmware.com.

The vRA Orchestrator plugin

Due to the renaming, the vRA plugin is called vRealize Orchestrator vRA Plug-in 6.2.0; however, the file you download and use is named o11nplugin-vcac-6.2.0-2287231.vmoapp. The plugin currently creates a workflow folder called vCloud Automation Center.

vRA-integrated Orchestrator

The vRA appliance comes with an installed and configured vRO instance; however, the best practice for a production environment is to use a dedicated Orchestrator installation; even better would be an Orchestrator cluster.

Dynamic Types or XaaS

XaaS means Everything (X) as a Service. The introduction of Dynamic Types in Orchestrator version 5.5.1 does exactly that; it allows you to build your own plugins and interact with infrastructure that has not yet received its own plugin. Take a look at this article by Christophe Decanini, which integrates Twitter with Orchestrator using Dynamic Types: http://www.vcoteam.info/articles/learn-vco/282-dynamic-types-tutorial-implement-your-own-twitter-plug-in-without-any-scripting.html.

Read more…

To read more about Orchestrator integration with vRA, please take a look at the official VMware documentation. Please note that the official documentation you need to look at is about vRealize Automation, and not about vCloud Automation Center; as of the writing of this article, the documentation can be found at https://www.vmware.com/support/pubs/vrealize-automation-pubs.html.

- The document called Advanced Service Design deals with vRO and the Advanced Service Designer
- The document called Machine Extensibility discusses customization using stubs

Unmounting all the CD-ROMs of all VMs in a cluster

This is an easy recipe to start with, but one you can really put to work in your existing infrastructure. The workflow will unmount all CD-ROMs from a running VM. A mounted CD-ROM may block a VM from being vMotioned.
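The recipe that follows builds this with standard workflow elements. As a point of reference, the same logic expressed as a single scriptable task might look roughly like the sketch below; the module paths are assumptions to verify against your environment, and the logic edits the CD-ROM devices directly rather than calling the Disconnect all detachable devices workflow used in the recipe:

// Sketch only: disconnect connected CD-ROM drives on every VM in a cluster.
// Assumes cluster is an in-parameter of type VC:ClusterComputeResource and
// that the named actions live in their usual modules (verify the paths).
var vms = System.getModule("com.vmware.library.vc.vm").getAllVMsOfCluster(cluster);
for each (var vm in vms) {
    var changes = [];
    for each (var device in vm.config.hardware.device) {
        if (device instanceof VcVirtualCdrom &&
                device.connectable != null && device.connectable.connected) {
            // Mark the device as disconnected and build an edit spec for it
            device.connectable.connected = false;
            device.connectable.startConnected = false;
            var edit = new VcVirtualDeviceConfigSpec();
            edit.operation = VcVirtualDeviceConfigSpecOperation.edit;
            edit.device = device;
            changes.push(edit);
        }
    }
    if (changes.length > 0) {
        var spec = new VcVirtualMachineConfigSpec();
        spec.deviceChange = changes;
        var vcTask = vm.reconfigVM_Task(spec);
        // Always wait for the vCenter task, as discussed above
        System.getModule("com.vmware.library.vc.basic").vim3WaitTaskEnd(vcTask, false, 5);
        System.log("Disconnected CD-ROM(s) on " + vm.name);
    }
}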
Getting ready

We need a VM that can mount a CD-ROM either as an ISO from a host or from the client. Before you start the workflow, make sure that the VM is powered on and has an ISO connected to it.

How to do it...

Create a new workflow with the following variables:

- cluster (VC:ClusterComputeResource, IN): used to input the cluster
- clusterVMs (Array of VC:VirtualMachine, Attribute): used to capture all the VMs in the cluster

Add the getAllVMsOfCluster action to the schema; bind the cluster in-parameter as its input and the clusterVMs attribute as its actionResult. Now, add a Foreach element to the schema and assign it the workflow Disconnect all detachable devices from a running virtual machine. Assign clusterVMs to the Foreach element as a parameter. Save and run the workflow.

How it works...

This recipe shows how fast and easily you can design solutions that help you with everyday vCenter problems. The problem is that VMs that have CD-ROMs or floppies mounted may experience problems using vMotion, making it impossible for them to be used with DRS. The reality is that a lot of admins mount CD-ROMs and then forget to disconnect them. Scheduling this script every evening just before the nighttime backups will make sure that a production cluster is able to make full use of DRS and is therefore better load-balanced. You can improve this workflow by integrating an exclusion list.

See also

Refer to the example workflow, 7.01 UnMount CD-ROM from Cluster.

Provisioning a VM from a template

In this recipe, we will build a deployment workflow for Windows and Linux VMs. We will learn how to create workflows and reduce the number of input variables.

Getting ready

We need a Linux or Windows template that we can clone and provision.

How to do it…

We have split this recipe into two sections. In the first section, we will create a configuration element, and in the second, we will create the workflow.

Creating a configuration

We will use a configuration for all reusable variables. Build a configuration element that contains the following items:

- productId (String): the Windows product ID—the licensing code
- joinDomain (String): the FQDN of the Windows domain to join
- domainAdmin (Credential): the credentials used to join the domain
- licenseMode (VC:CustomizationLicenseDataMode): for example, perServer
- licenseUsers (Number): the number of licensed concurrent users
- inTimezone (Enums:MSTimeZone): the time zone
- fullName (String): the full name of the user
- orgName (String): the organization name
- newAdminPassword (String): the new administrator password
- dnsServerList (Array of String): the list of DNS servers
- dnsDomain (String): the DNS domain
- gateway (Array of String): the list of gateways

Creating the base workflow

Now we will create the base workflow. Create the workflow as shown in the following figure by adding the given elements:

- Clone, Windows with single NIC and credential
- Clone, Linux with single NIC
- Custom decision

Use the Clone, Windows… workflow to create all variables. Link up the ones that you have defined in the configuration as attributes.
The rest are defined as follows:

- vmName (String, IN): the new virtual machine's name
- vm (VC:VirtualMachine, IN): the virtual machine to clone
- folder (VC:VmFolder, IN): the virtual machine folder
- datastore (VC:Datastore, IN): the datastore in which you store the virtual machine
- pool (VC:ResourcePool, IN): the resource pool in which you create the virtual machine
- network (VC:Network, IN): the network to which you attach the virtual network interface
- ipAddress (String, IN): the fixed valid IP address
- subnetMask (String, IN): the subnet mask
- template (Boolean, Attribute): whether to mark the new VM as a template; set the value to No
- powerOn (Boolean, Attribute): whether to power on the VM after creation; set the value to Yes
- doSysprep (Boolean, Attribute): whether to run Windows Sysprep; set the value to Yes
- dhcp (Boolean, Attribute): whether to use DHCP; set the value to No
- newVM (VC:VirtualMachine, OUT): the newly created VM

The following sub-workflow in-parameters will be set to special values.

For Clone, Windows with single NIC and credential:
- host: Null
- joinWorkgroup: Null
- macAddress: Null
- netBIOS: Null
- primaryWINS: Null
- secondaryWINS: Null
- name: vmName
- clientName: vmName

For Clone, Linux with single NIC:
- host: Null
- macAddress: Null
- name: vmName
- clientName: vmName

Define the in-parameter vm as input for the Custom decision and add the following script. The script checks whether the name of the guest OS contains the word Microsoft:

var guestOS = vm.config.guestFullName;
System.log(guestOS);
if (guestOS.indexOf("Microsoft") >= 0) {
    return true;
} else {
    return false;
}

Save and run the workflow. This workflow will now create a new VM from an existing VM and customize it with a fixed IP.

How it works…

As you can see, creating workflows to automate vCenter deployments is pretty straightforward. Dealing with the various in-parameters of workflows can be quite overwhelming. The best way to deal with this problem is to hide variables away by defining them centrally using a configuration, or to define them locally as attributes. Using configurations has the advantage that you can create them once and reuse them as needed. You can even push the concept a bit further by defining multiple configurations for multiple purposes, such as different environments.

While creating a new workflow for automation, a typical approach is as follows:

- Look for a workflow that you need.
- Run the workflow normally to check out what it actually does.
- Either create a new workflow that uses the original, or duplicate and edit the one you tried, modifying it until it does what you want.

A fast way to deal with a lot of variables is to drag every element you need into the schema and then use the binding to create the variables as needed. You may have noticed that this workflow only lets you select vSwitch networks, not distributed vSwitch networks. You can improve this workflow with the following features:

- Read the existing Sysprep information stored in your vCenter Server
- Generate different predefined configurations (for example, DEV or Prod)

There's more...

We can improve the workflow by implementing the ability to change the vCPU count and the memory of the VM. Follow these steps to implement it:

Move the out-parameter newVM to be an attribute.
Add the following variables:

- vCPU (Number, IN): the number of vCPUs
- memory (Number, IN): the amount of VM memory
- vcTask (VC:Task, Attribute): carries the reconfiguration task from the script to the wait task
- progress (Boolean, Attribute): value No, for vim3WaitTaskEnd
- pollRate (Number, Attribute): value 5, for vim3WaitTaskEnd
- ActionResult (Any, Attribute): for vim3WaitTaskEnd

Add the following actions and workflows according to the next figure:

- shutdownVMAndForce
- changeVMvCPU
- vim3WaitTaskEnd
- changeVMRAM
- Start virtual machine

Bind newVM to all the appropriate input parameters of the added actions and workflows. Bind the actionResult (VC:Task) of each change action to the corresponding vim3WaitTaskEnd.

See also

Refer to the example workflows, 7.02.1 Provision VM (Base) and 7.02.2 Provision VM (HW custom), as well as the configuration element, 7 VM provisioning.

An approval process for VM provisioning

In this recipe, we will see how to create a workflow that waits for an approver to approve the VM creation before provisioning it. We will learn how to combine mail and external events in a workflow to make it interact with different users.

Getting ready

For this recipe, we first need the provisioning workflow that we created in the Provisioning a VM from a template recipe. You can use the example workflow, 7.02.1 Provision VM (Base). Additionally, we need a functional e-mail system as well as a workflow to send e-mails. You can use the example workflow, 4.02.1 SendMail, as well as its configuration item, 4.2.1 Working with e-mail.

How to do it…

We will split this recipe into three parts. First, we will create a configuration element; then, we will create the workflow; and lastly, we will use a presentation to make the workflow usable.

Creating a configuration element

We will use a configuration for all reusable variables. Build a configuration element that contains the following items:

- templates (Array/VC:VirtualMachine): all the VMs that serve as templates
- folders (Array/VC:VmFolder): all the VM folders that are targets for VM provisioning
- networks (Array/VC:Network): all the VM networks that are targets for VM provisioning
- resourcePools (Array/VC:ResourcePool): all the resource pools that are targets for VM provisioning
- datastores (Array/VC:Datastore): all the datastores that are targets for VM provisioning
- daysToApproval (Number): the number of days the approval should be available for
- approver (String): the e-mail address of the approver

Please note that you also have to define or use the configuration elements for SendMail, as well as the Provision VM workflows. You can use the examples contained in the example package.
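Workflow attributes are normally linked to a configuration element directly in the workflow editor, as we do in the next step. If you ever need to read such values from a scriptable task instead, the sketch below shows one way to do it; the category path and element name are assumptions based on this article's example package:

// Sketch only: reading approval settings from a configuration element.
// The "7 approval" category path and element name are assumptions;
// prefer binding attributes to the configuration in the workflow editor.
var category = Server.getConfigurationElementCategoryWithPath("7 approval");
var element = null;
for each (var e in category.configurationElements) {
    if (e.name == "7 approval") {
        element = e;
    }
}
var approver = element.getAttributeWithKey("approver").value;
var daysToApproval = element.getAttributeWithKey("daysToApproval").value;
System.log("Approvals go to " + approver + " and stay open for " + daysToApproval + " days");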
Creating a workflow

Create a new workflow and add the following variables:

- mailRequester (String, IN): the e-mail address of the requester
- vmName (String, IN): the name of the new virtual machine
- vm (VC:VirtualMachine, IN): the virtual machine to be cloned
- folder (VC:VmFolder, IN): the virtual machine folder
- datastore (VC:Datastore, IN): the datastore in which you store the virtual machine
- pool (VC:ResourcePool, IN): the resource pool in which you create the virtual machine
- network (VC:Network, IN): the network to which you attach the virtual network interface
- ipAddress (String, IN): the fixed valid IP address
- subnetMask (String, IN): the subnet mask
- isExternalEvent (Boolean, Attribute): a value of true defines this event as external
- mailApproverSubject (String, Attribute): the subject line of the mail sent to the approver
- mailApproverContent (String, Attribute): the content of the mail that is sent to the approver
- mailRequesterSubject (String, Attribute): the subject line of the mail sent to the requester when the VM is provisioned
- mailRequesterContent (String, Attribute): the content of the mail that is sent to the requester when the VM is provisioned
- mailRequesterDeclinedSubject (String, Attribute): the subject line of the mail sent to the requester when the VM is declined
- mailRequesterDeclinedContent (String, Attribute): the content of the mail that is sent to the requester when the VM is declined
- eventName (String, Attribute): the name of the external event
- endDate (Date, Attribute): the end date for the wait for the external event
- approvalSuccess (Boolean, Attribute): whether the VM has been approved

Now add all the attributes we defined in the configuration element and link them to the configuration. Create the workflow as shown in the following figure by adding the given elements:

- Scriptable task
- 4.02.1 SendMail (example workflow)
- Wait for custom event
- Decision
- Provision VM (example workflow)

Edit the scriptable task and bind the following variables to it:

- In: vmName, ipAddress, mailRequester, template, approver, daysToApproval
- Out: mailApproverSubject, mailApproverContent, mailRequesterSubject, mailRequesterContent, mailRequesterDeclinedSubject, mailRequesterDeclinedContent, eventName, endDate

Add the following script to the scriptable task:

//construct event name
eventName = "provision-" + vmName;

//add days to today for approval
var today = new Date();
endDate = new Date(today);
endDate.setDate(today.getDate() + daysToApproval);

//construct external URL for approval
var myURL = new URL();
myURL = System.customEventUrl(eventName, false);
externalURL = myURL.url;

//mail to approver
mailApproverSubject = "Approval needed: " + vmName;
mailApproverContent = "Dear Approver,\n the user " + mailRequester +
    " would like to provision a VM from template " + template.name +
    ".\n To approve please click here: " + externalURL;

//VM provisioned
mailRequesterSubject = "VM ready: " + vmName;
mailRequesterContent = "Dear Requester,\n the VM " + vmName +
    " has been provisioned and is now available under IP: " + ipAddress;

//declined
mailRequesterDeclinedSubject = "Declined: " + vmName;
mailRequesterDeclinedContent = "Dear Requester,\n the VM " + vmName +
    " has been declined by " + approver;

Bind the out-parameter of Wait for custom event to approvalSuccess. Configure the Decision element with approvalSuccess as true. Bind all the other variables to the workflow elements.
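The SendMail example workflow wraps Orchestrator's Mail plug-in. If you would rather send the approver notification straight from a scriptable task, the plug-in's EmailMessage object can be used along the following lines; the SMTP host and from-address are placeholders, and in practice these settings belong in a configuration element such as 4.2.1 Working with e-mail:

// Sketch only: sending the approver mail directly via the Mail plug-in.
var message = new EmailMessage();
message.smtpHost = "smtp.example.com";               // placeholder SMTP relay
message.fromAddress = "orchestrator@example.com";    // placeholder sender
message.toAddress = approver;
message.subject = mailApproverSubject;
message.addMimePart(mailApproverContent, "text/plain");
message.sendMessage();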
Improving with the presentation

We will now edit the workflow's presentation in order to make it workable for the requester. To do so, click on Presentation and follow the steps to alter the presentation, as seen in the following screenshot. Add the property Predefined list of elements to the following in-parameters, with these values:

- template: #templates
- folder: #folders
- datastore: #datastores
- pool: #resourcePools
- network: #networks

You can now use the General tab of each in-parameter to change the displayed text. Save and close the workflow.

How it works…

This is a very simplified example of an approval workflow to create VMs. The aim of this recipe is to introduce you to the method and ideas behind building such a workflow. This workflow will only give a requester the choices that are configured in the configuration element, making the workflow quite safe for users who have only limited know-how of the IT environment.

When the requester submits the workflow, an e-mail is sent to the approver. The e-mail contains a link which, when clicked, triggers the external event and approves the VM. If the VM is approved, it will be provisioned, and when the provisioning has finished, an e-mail is sent to the requester stating that the VM is now available. If the VM is not approved within a certain timeframe, the requester will receive an e-mail saying that the VM was not approved.

To make this workflow fully functional, you can add permissions for a requester group to the workflow and Orchestrator so that users can request a VM via vCenter. Things you can do to improve the workflow are as follows:

- Schedule the provisioning for a future date.
- Use resource elements for the e-mail content and replace the placeholders as needed.
- Add an error workflow in case the provisioning fails.
- Use AD to read out the current user's e-mail address and full name.
- Create a workflow that lets an approver configure the configuration elements that a requester can choose from.
- Reduce the selections by creating, for instance, a development and a production configuration that contain the correct folders, datastores, networks, and so on.
- Create a decommissioning workflow that is automatically scheduled so that the VM is destroyed after a given period of time.

See also

Refer to the example workflow, 7.03 Approval, and the configuration element, 7 approval.

Summary

In this article, we discussed one of the important aspects of the interaction of Orchestrator with vCenter Server and vRealize Automation, namely VM provisioning.

Resources for Article:

Further resources on this subject:
- Importance of Windows RDS in Horizon View [article]
- Metrics in vRealize Operations [article]
- Designing and Building a Horizon View 6.0 Infrastructure [article]

New for 2020 in operations and infrastructure engineering

Richard Gall
19 Dec 2019
5 min read
It’s an exciting time if you work in operations and software infrastructure. Indeed, you could even say that as the pace of change and innovation increases, your role only becomes more important. Operations and systems engineers, solution architects, everyone - your jobs are all about bringing stability, order, and control into what can sometimes feel like chaos.

As anyone who’s been working in the industry knows, managing change, from a personal perspective, requires a lot of effort. To keep on top of what’s happening - what tools are being released and updated, what approaches are gaining traction - you need to have one eye on the future and another on the wider industry. To help you with that challenge and get you ready for 2020, we’ve put together a list of what’s new for 2020 - and what you should start learning.

Learn how to make Kubernetes work for you

It goes without saying that Kubernetes was huge in 2019. But there are plenty of murmurs and grumblings that it’s too complicated and adds an additional burden for engineering and operations teams. To a certain extent there’s some truth in this - and arguably now would be a good time to accept that just because it seems like everyone is using Kubernetes, it doesn’t mean it’s the right solution for you.

However, having said that, 2020 will be all about understanding how to make Kubernetes relevant to you. This doesn’t mean you should just drop the way you work and start using Kubernetes, but it does mean that spending some time with the platform and getting a better sense of how it could be used in the future is a useful way to spend your learning time in 2020. Explore Packt's extensive range of Kubernetes eBooks and videos on the Packt store.

Learn how to architect

If software has eaten the world, then by the same token perhaps complexity has well and truly eaten software as we know it. Indeed, Kubernetes is arguably just one of the symptoms and causes of this complexity. Another is the growing demand for architects in engineering and IT teams. There are a number of different ‘architecture’ job roles circulating across the industry, from solutions architect to application architect. While they each have their own subtle differences, and will even vary from company to company, they’re all roles that are about organizing and managing different pieces into something that is both stable and value-driving.

Cloud has been particularly instrumental in making architect roles more prominent in the industry. As organizations look to resist the pitfalls of lock-in and better manage resources (financial and otherwise), it will be down to architects to balance business and technology concerns carefully.

Learn how to architect cloud native applications. Read Architecting Cloud Computing Solutions. Get to grips with everything you need to know to be a software architect. Pick up Software Architect's Handbook.

Artificial intelligence

It’s strange that the hype around AI doesn’t seem to have reached the world of ops. Perhaps this is because the area is more resistant to the spin that comes with AI, preferring instead to focus more on the technical capabilities of tools and platforms. Whatever the case, it’s nevertheless true that AI will play an important part in how we manage and secure infrastructure. From monitoring system health, to automating infrastructure deployments and configuration, and even identifying security threats, artificial intelligence is already an important component for operations engineers and others.
Indeed, artificial intelligence is being embedded inside products and platforms that ops teams are using - this means the need to ‘learn’ artificial intelligence is somewhat reduced. But it would be wrong to think it’s something that can just be managed from a dashboard. In 2020 it will be essential to better understand where and how artificial intelligence can fit into your operations and architectural toolchain. Find artificial intelligence eBooks and videos in Packt's collection of curated data science bundles.

Observability, monitoring, tracing, and logging

One of the challenges of software complexity is understanding exactly what’s going on under the hood. Yes, the network might be unreliable, as the saying goes, but what makes things even worse is that we’re not even sure why. This is where observability and the next generation of monitoring, logging, and tracing all come into play.

Having detailed insights into how applications and infrastructures are performing, how resources are being managed, and what things are actually causing problems is vitally important from a team perspective. Without the ability to understand these things, knowledge becomes siloed inside the brains of specific engineers, and you start to have points of failure at a personnel level.

There are, of course, a wide range of tools and products available that can make monitoring and tracing easy (or easier, at least). But understanding which ones are right for your needs still requires some time learning and exploring the options out there. Make sure you do exactly that in 2020. Learn how to monitor distributed systems with Learn Centralized Logging and Monitoring with Kubernetes.

Making serverless a reality

We’ve talked about serverless a lot this year. But as a concept there’s still considerable confusion about what role it should play in modern DevOps processes. Indeed, even the nomenclature is a little confusing. Platforms using their own terminology, such as ‘lambdas’ and ‘functions’, only add to the sense that serverless is something amorphous and hard to pin down.

So, in 2020, we need to work out how to make serverless work for us. Just as we need to consider how Kubernetes might be relevant to our needs, we need to consider in what ways serverless represents both a technical and business opportunity. Search Packt's library for the latest serverless eBooks and videos. Explore more technology eBooks and videos on the Packt store.