Symmetric Messages and Asynchronous Messages (Part 1)

Packt
05 May 2015
31 min read
In this article by Kingston Smiler. S, author of the book OpenFlow Cookbook, we describe the steps involved in sending and processing symmetric messages and asynchronous messages in the switch. It contains the following recipes:

- Sending and processing a hello message
- Sending and processing an echo request and a reply message
- Sending and processing an error message
- Sending and processing an experimenter message
- Handling a Get Asynchronous Configuration message from the controller, which is used to fetch the list of asynchronous events that will be sent from the switch
- Sending a packet-in message to the controller
- Sending a flow-removed message to the controller
- Sending a port-status message to the controller
- Sending a controller role-status message to the controller
- Sending a table-status message to the controller
- Sending a request-forward message to the controller
- Handling a packet-out message from the controller
- Handling a barrier message from the controller

Symmetric messages can be sent by both the controller and the switch without any solicitation between them. The OpenFlow switch should be able to send and process the following symmetric messages to or from the controller (error messages, however, are not processed by the switch):

- Hello message
- Echo request and echo reply message
- Error message
- Experimenter message

Asynchronous messages are sent by both the controller and the switch when there is any state change in the system. Like symmetric messages, asynchronous messages should also be sent without any solicitation between the switch and the controller. The switch should be able to send the following asynchronous messages to the controller:

- Packet-in message
- Flow-removed message
- Port-status message
- Table-status message
- Controller role-status message
- Request-forward message

Similarly, the switch should be able to receive and process the following controller-to-switch messages:

- Packet-out message
- Barrier message

The controller can program or instruct the switch to send only a subset of asynchronous messages of interest, using an asynchronous configuration message. Based on this configuration, the switch should send only that subset of asynchronous messages over the communication channel. The switch should replicate and send asynchronous messages to all controllers according to the asynchronous configuration message sent from each controller, and it should maintain this configuration information on a per communication channel basis.

Sending and processing a hello message

The OFPT_HELLO message is used by both the switch and the controller to identify and negotiate the OpenFlow version supported by both devices. Hello messages are considered part of the communication channel establishment procedure: the switch should send a hello message to the controller immediately after the TCP/TLS connection to the controller is established.

How to do it...

As hello messages are transmitted by both the switch and the controller, the switch should be able to send, receive, and process them. The following sections explain these procedures in detail.

Sending the OFPT_HELLO message

The message format used to send the hello message from the switch is as follows. It consists of the OpenFlow header followed by zero or more elements of variable size:

/* OFPT_HELLO. This message includes zero or more
 * hello elements having variable size. */
struct ofp_hello {
    struct ofp_header header;
    /* Hello element list */
    struct ofp_hello_elem_header elements[0]; /* List of elements */
};

The version field in ofp_header should be set to the highest OpenFlow protocol version supported by the switch. The elements field is optional and may contain an element definition, which takes the following TLV format:

/* Version bitmap Hello Element */
struct ofp_hello_elem_versionbitmap {
    uint16_t type;       /* OFPHET_VERSIONBITMAP. */
    uint16_t length;     /* Length in bytes of this element. */
    /* Followed by:
     *   - Exactly (length - 4) bytes containing the bitmaps, then
     *   - Exactly (length + 7)/8*8 - (length) (between 0 and 7)
     *     bytes of all-zero bytes */
    uint32_t bitmaps[0]; /* List of bitmaps - supported versions */
};

The type field should be set to OFPHET_VERSIONBITMAP, and the length field to the length of this element. The bitmaps field should be set to the list of OpenFlow versions the switch supports; the number of bitmaps included depends on the highest version number supported by the switch. ofp_versions 0 to 31 are encoded in the first bitmap, ofp_versions 32 to 63 in the second bitmap, and so on. For example, if the switch supports only version 1.0 (ofp_version = 0x01) and version 1.3 (ofp_version = 0x04), then the first bitmap should be set to 0x00000012. Refer to the send_hello_message() function in the of/openflow.c file for the procedure to build and send the OFPT_HELLO message.

Receiving the OFPT_HELLO message

The switch should be able to receive and process OFPT_HELLO messages sent from the controller. The controller uses the same message format, structures, and enumerations defined in the previous section of this recipe. Once the switch receives a hello message, it should calculate the protocol version to be used for messages exchanged with the controller, as follows:

- If the hello message received from the controller contains an optional OFPHET_VERSIONBITMAP element and the bitmaps field contains a valid value, then the negotiated version is the highest version that appears both in the received bitmaps and in the switch's own set of supported protocol versions.
- If the hello message doesn't contain an OFPHET_VERSIONBITMAP element, then the negotiated version is the smaller of the highest protocol version supported by the switch and the version field set in the OpenFlow header of the received hello message.

If the negotiated version is supported by the switch, then the OpenFlow connection between the controller and the switch continues. Otherwise, the switch should send an OFPT_ERROR message with the type field set to OFPET_HELLO_FAILED, the code field set to OFPHFC_INCOMPATIBLE, and optionally an ASCII string explaining the situation in the data field, and then terminate the connection.

There's more…

Once the switch and the controller have negotiated the OpenFlow protocol version to be used, the connection setup procedure is complete. From then on, both the controller and the switch can send OpenFlow protocol messages to each other.
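To make the negotiation rule concrete, here is a minimal sketch of the version calculation. It covers only the first bitmap (versions 0 to 31), and the negotiate_version() helper and its parameters are illustrative assumptions, not part of the book's of/openflow.c:

#include <stdint.h>

/* Pick the protocol version per the rules above.
 * our_bitmap: bit N is set if we support ofp_version N.
 * peer_bitmap: first bitmap from OFPHET_VERSIONBITMAP, or NULL if absent. */
static int negotiate_version(uint32_t our_bitmap, uint8_t our_max,
                             uint8_t peer_header_version,
                             const uint32_t *peer_bitmap)
{
    if (peer_bitmap != NULL) {
        uint32_t common = our_bitmap & *peer_bitmap;
        for (int v = 31; v >= 0; v--)   /* highest common version wins */
            if (common & (1u << v))
                return v;
        return -1;                      /* no common version */
    }
    /* No bitmap element: smaller of our highest version and theirs. */
    return peer_header_version < our_max ? peer_header_version : our_max;
}

A caller that gets -1 back would send OFPET_HELLO_FAILED / OFPHFC_INCOMPATIBLE and terminate the connection, as described above.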
Sending and processing an echo request and a reply message

Echo request and reply messages are used by both the controller and the switch to maintain and verify the liveness of the controller-switch connection. Echo messages can also be used to measure the latency and bandwidth of that connection. On reception of an echo request message, the switch should respond with an echo reply message.

How to do it...

As echo messages are transmitted by both the switch and the controller, the switch should be able to send, receive, and process them. The following sections explain these procedures in detail.

Sending the OFPT_ECHO_REQUEST message

The OpenFlow specification doesn't specify how frequently the echo message has to be sent from the switch; the switch may choose to send an echo request message to the controller periodically, at a configured interval. Similarly, the OpenFlow specification doesn't define the timeout (the longest period of time the switch should wait) for receiving the echo reply from the controller. After sending an echo request message to the controller, the switch should wait for the echo reply for the configured timeout period. If the switch doesn't receive the echo reply within this period, it should initiate the connection interruption procedure.

The OFPT_ECHO_REQUEST message contains an OpenFlow header followed by an undefined data field of arbitrary length. The data field might be filled with the timestamp at which the echo request was sent, with various lengths or values to measure bandwidth, or be zero-sized for just checking the liveness of the connection. In most open source implementations of OpenFlow, the echo request message contains only the header field and no body. Refer to the send_echo_request() function in the of/openflow.c file for the procedure to build and send the echo request message.

Receiving the OFPT_ECHO_REQUEST message

The switch should be able to receive and process OFPT_ECHO_REQUEST messages sent from the controller. The controller uses the same message format, structures, and enumerations defined in the previous section of this recipe. Once the switch receives an echo request message, it should build an OFPT_ECHO_REPLY message, which consists of ofp_header and an arbitrary-length data field. While forming the echo reply, the switch should copy the content of the arbitrary-length field of the request message into the reply message. Refer to the process_echo_request() function in the of/openflow.c file for the procedure to handle an echo request message and send the echo reply.

Processing the OFPT_ECHO_REPLY message

The switch should be able to receive the echo reply message from the controller. If the switch sent the echo request message to measure latency or bandwidth, then on receiving the echo reply it should parse the arbitrary-length data field and calculate the bandwidth, latency, and so on.

There's more…

If the OpenFlow switch implementation is divided into multiple layers, then echo request and reply processing should be handled in the deepest possible layer. For example, if the implementation is divided into user-space processing and kernel-space processing, then echo request and reply handling should live in the kernel space.
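As a minimal sketch of the copy-back behaviour described above: the reply is byte-for-byte the request with only the message type changed, which preserves both the xid and any arbitrary payload. The conn type and send_openflow_message() helper are assumed placeholders, not the book's actual API, and the OpenFlow definitions (struct ofp_header, OFPT_ECHO_REPLY) are assumed to be in scope:

#include <stdlib.h>
#include <string.h>
#include <arpa/inet.h>

static void echo_reply(struct conn *c, const struct ofp_header *req)
{
    size_t len = ntohs(req->length);       /* header + arbitrary data */
    struct ofp_header *reply = malloc(len);
    if (reply == NULL)
        return;
    memcpy(reply, req, len);               /* copies xid and payload verbatim */
    reply->type = OFPT_ECHO_REPLY;         /* only the type changes */
    send_openflow_message(c, reply);
    free(reply);
}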
Sending and processing an error message

Error messages are used by both the controller and the switch to notify the other end of the connection about any problem. They are typically used by the switch to inform the controller that a request sent by the controller failed to execute.

How to do it...

Whenever the switch wants to send an error message to the controller, it should build an OFPT_ERROR message, which takes the following format:

/* OFPT_ERROR: Error message (datapath -> controller). */
struct ofp_error_msg {
    struct ofp_header header;
    uint16_t type;
    uint16_t code;
    uint8_t data[0]; /* Variable-length data. Interpreted based
                      * on the type and code. No padding. */
};

The type field indicates the high-level type of error. The code value is interpreted based on the type. The data field carries variable-length data that is interpreted based on both the type and the code; it should contain an ASCII text string that adds detail about why the error occurred. Unless specified otherwise, the data field should contain at least 64 bytes of the failed message that caused this error. If the failed message is shorter than 64 bytes, then the data field should contain the full message without any padding. If the switch needs to send an error message in response to a specific message from the controller (say, OFPET_BAD_REQUEST, OFPET_BAD_ACTION, OFPET_BAD_INSTRUCTION, OFPET_BAD_MATCH, or OFPET_FLOW_MOD_FAILED), then the xid field of the OpenFlow header in the error message should be set to that of the offending request message. Refer to the send_error_message() function in the of/openflow.c file for the procedure to build and send an error message. Note that if the switch sends an error message for a request from the controller (because of an error condition), it need not also send a reply message to that request.

Sending and processing an experimenter message

Experimenter messages provide a way for the switch to offer additional, vendor-defined functionality.

How to do it...

The controller sends the experimenter message in the experimenter message format. Once the switch receives this message, it should invoke the appropriate vendor-specific functions.

Handling a Get Asynchronous Configuration message from the controller

The OpenFlow specification provides a mechanism for the controller to fetch the list of asynchronous events that will be sent from the switch over a given controller channel. This is achieved by sending a Get Asynchronous Configuration message (OFPT_GET_ASYNC_REQUEST) to the switch.

How to do it...

The Get Asynchronous Configuration request message (OFPT_GET_ASYNC_REQUEST) has no body other than ofp_header. On receiving this message, the switch should respond with an OFPT_GET_ASYNC_REPLY message, filling the property list with the asynchronous configuration events / property types that the relevant controller channel is preconfigured to receive. The switch should get this information from its internal data structures. Refer to the process_async_config_request() function in the of/openflow.c file for the procedure to process the Get Asynchronous Configuration request message from the controller.

Sending a packet-in message to the controller

Packet-in messages (OFP_PACKET_IN) are sent from the switch to the controller to transfer a packet received on one of the switch ports to the controller for further processing.
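The "at least 64 bytes of the failed message" rule and the xid copy are easy to get wrong, so here is a minimal sketch of building such an error message. The conn type and send_openflow_message() helper are assumed placeholders rather than the book's actual helpers, and the OpenFlow definitions are assumed to be in scope:

#include <stdlib.h>
#include <string.h>
#include <arpa/inet.h>

static void send_error_for(struct conn *c, uint16_t type, uint16_t code,
                           const struct ofp_header *failed_req)
{
    size_t data_len = ntohs(failed_req->length);
    if (data_len > 64)
        data_len = 64;                       /* at least 64 bytes is enough */
    size_t len = sizeof(struct ofp_error_msg) + data_len;

    struct ofp_error_msg *err = calloc(1, len);
    if (err == NULL)
        return;
    err->header.version = failed_req->version;
    err->header.type    = OFPT_ERROR;
    err->header.length  = htons((uint16_t)len);
    err->header.xid     = failed_req->xid;   /* xid of the offending request */
    err->type = htons(type);                 /* e.g. OFPET_BAD_REQUEST */
    err->code = htons(code);
    memcpy(err->data, failed_req, data_len); /* prefix of the failed message */
    send_openflow_message(c, &err->header);
    free(err);
}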
By default, a packet-in message should be sent to all controllers that are in the equal (OFPCR_ROLE_EQUAL) and master (OFPCR_ROLE_MASTER) roles. It should not be sent to controllers that are in the slave state. There are three ways in which the switch can send a packet-in event to the controller:

- Table-miss entry: when there is no matching flow entry for the incoming packet, the switch can send the packet to the controller.
- TTL checking: when the TTL value in a packet reaches zero, the switch can send the packet to the controller.
- A "send to controller" action in the matching entry (either a flow table entry or a group table entry) for the packet.

How to do it...

When the switch wants to send a packet received in its data path to the controller, it should use the following message format:

/* Packet received on port (datapath -> controller). */
struct ofp_packet_in {
    struct ofp_header header;
    uint32_t buffer_id;     /* ID assigned by datapath. */
    uint16_t total_len;     /* Full length of frame. */
    uint8_t reason;         /* Reason packet is being sent
                             * (one of OFPR_*) */
    uint8_t table_id;       /* ID of the table that was looked up */
    uint64_t cookie;        /* Cookie of the flow entry that was
                             * looked up. */
    struct ofp_match match; /* Packet metadata. Variable size. */
    /* The variable size and padded match is always followed by:
     *   - Exactly 2 all-zero padding bytes, then
     *   - An Ethernet frame whose length is inferred from header.length.
     * The padding bytes preceding the Ethernet frame ensure that the IP
     * header (if any) following the Ethernet header is 32-bit aligned. */
    uint8_t pad[2];  /* Align to 64 bit + 16 bit */
    uint8_t data[0]; /* Ethernet frame */
};

The buffer_id field should be set to an opaque value generated by the switch. When the packet is buffered, the data portion of the packet-in message should contain some bytes of data from the incoming packet. If the packet is sent to the controller because of a "send to controller" action of a table entry, then the max_len field of ofp_action_output should be used as the size of the packet to include in the packet-in message. If the packet is sent to the controller for any other reason, then the miss_send_len field of the OFPT_SET_CONFIG message should be used to determine the size. If the packet is not buffered, either because buffers are unavailable or because of an explicit configuration via OFPCML_NO_BUFFER, then the entire packet should be included in the data portion of the packet-in message, with the buffer_id value set to OFP_NO_BUFFER.

The data field should be set to the complete packet or a fraction of it, and the total_len field should be set to the length of the packet included in the data field. The reason field should be set to one of the following values defined in the enumeration, based on the context that triggered the packet-in event:

/* Why is this packet being sent to the controller? */
enum ofp_packet_in_reason {
    OFPR_TABLE_MISS = 0,   /* No matching flow (table-miss
                            * flow entry). */
    OFPR_APPLY_ACTION = 1, /* Output to controller in
                            * apply-actions. */
    OFPR_INVALID_TTL = 2,  /* Packet has invalid TTL */
    OFPR_ACTION_SET = 3,   /* Output to controller in action set. */
    OFPR_GROUP = 4,        /* Output to controller in group bucket. */
    OFPR_PACKET_OUT = 5,   /* Output to controller in packet-out. */
};

If the packet-in message was triggered by a flow entry's "send to controller" action, then the cookie field should be set to the cookie of the flow entry that caused the packet to be sent to the controller. This field should be set to -1 if the cookie cannot be associated with a particular flow.

When the packet-in message is triggered by the "send to controller" action of a table entry, some changes may already have been applied to the packet in previous stages of the pipeline. This information needs to be carried along with the packet-in message, and it can be carried in the match field of the packet-in message as a set of OXM (short for OpenFlow Extensible Match) TLVs. If the switch includes OXM TLVs in the packet-in message, then the match field should contain a set of OXM TLVs that include context fields. The standard context fields that can be added to these OXM TLVs are OFPXMT_OFB_IN_PORT, OFPXMT_OFB_IN_PHY_PORT, OFPXMT_OFB_METADATA, and OFPXMT_OFB_TUNNEL_ID.

When the switch receives the packet on a physical port and this information needs to be carried in the packet-in message, OFPXMT_OFB_IN_PORT and OFPXMT_OFB_IN_PHY_PORT should have the same value: the OpenFlow port number of that physical port. When the switch receives the packet on a logical port and this information needs to be carried in the packet-in message, the switch should set the logical port's port number in OFPXMT_OFB_IN_PORT and the underlying physical port's port number in OFPXMT_OFB_IN_PHY_PORT. For example, consider a packet received on a tunnel interface defined over a Link Aggregation Group (LAG) with two member ports: the packet-in message should carry the tunnel interface's port_no in the OFPXMT_OFB_IN_PORT field and the physical interface's port_no in the OFPXMT_OFB_IN_PHY_PORT field. Refer to the send_packet_in_message() function in the of/openflow.c file for the procedure to send a packet-in event to the controller.

How it works...

The switch can send either the entire packet it receives from the switch port to the controller, or only a fraction of it. When the switch is configured to send only a fraction of the packet, it should buffer the packet in its memory and send a portion of the packet data; this is controlled by the switch configuration. If the switch is configured to buffer the packet, and it has sufficient memory to do so, then the packet-in message should contain the following:

- A fraction of the packet. The size of the packet data to include in the packet-in message is configured via the switch configuration message and is 128 bytes by default. When the packet-in message results from a table-entry action, the output action itself can specify the size of the packet to be sent to the controller; for all other packet-in messages, the size is defined in the switch configuration.
- The buffer ID to be used by the controller when it wants to forward the message at a later point in time.

There's more…

A switch that implements buffering is expected to expose some details through documentation, such as the amount of available buffers and the period of time for which buffered data remains available. The switch should also implement a procedure to release a buffered packet when there is no response from the controller to the packet-in event.
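The rules above for how much of the frame goes into the data field condense into one small helper. This is an illustrative sketch; the parameters mirror the fields named above (max_len from the output action, miss_send_len from the switch configuration) rather than any actual of/openflow.c signature:

#include <stdint.h>
#include <arpa/inet.h>

/* How many bytes of the frame to put in the packet-in data field. */
static uint16_t packet_in_data_len(uint16_t frame_len, int buffered,
                                   const struct ofp_action_output *out_act,
                                   uint16_t miss_send_len)
{
    if (!buffered)
        return frame_len;                      /* OFP_NO_BUFFER: whole frame */
    uint16_t max = (out_act != NULL)
                       ? ntohs(out_act->max_len) /* "send to controller" action */
                       : miss_send_len;          /* table miss, invalid TTL, ... */
    return frame_len < max ? frame_len : max;
}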
Sending a flow-removed message to the controller

A flow-removed message (OFPT_FLOW_REMOVED) is sent from the switch to the controller when a flow entry is removed from the flow table. This message should be sent only when the OFPFF_SEND_FLOW_REM flag in the flow entry is set, and only over the controller channels through which a controller has asked to receive this event. The controller can express its interest in receiving this event by sending the switch configuration message to the switch. By default, OFPT_FLOW_REMOVED should be sent to all controllers that are in the equal (OFPCR_ROLE_EQUAL) and master (OFPCR_ROLE_MASTER) roles, and not to controllers in the slave state.

How to do it...

When the switch removes an entry from the flow table, it should build an OFPT_FLOW_REMOVED message with the following format and send it to the controllers that have already shown interest in this event:

/* Flow removed (datapath -> controller). */
struct ofp_flow_removed {
    struct ofp_header header;
    uint64_t cookie;        /* Opaque controller-issued identifier. */
    uint16_t priority;      /* Priority level of flow entry. */
    uint8_t reason;         /* One of OFPRR_*. */
    uint8_t table_id;       /* ID of the table */
    uint32_t duration_sec;  /* Time flow was alive in seconds. */
    uint32_t duration_nsec; /* Time flow was alive in nanoseconds
                             * beyond duration_sec. */
    uint16_t idle_timeout;  /* Idle timeout from original flow mod. */
    uint16_t hard_timeout;  /* Hard timeout from original flow mod. */
    uint64_t packet_count;
    uint64_t byte_count;
    struct ofp_match match; /* Description of fields. Variable size. */
};

The cookie field should be set to the cookie of the flow entry, the priority field to the priority of the flow entry, and the reason field to one of the following values defined in the enumeration:

/* Why was this flow removed? */
enum ofp_flow_removed_reason {
    OFPRR_IDLE_TIMEOUT = 0, /* Flow idle time exceeded idle_timeout. */
    OFPRR_HARD_TIMEOUT = 1, /* Time exceeded hard_timeout. */
    OFPRR_DELETE = 2,       /* Evicted by a DELETE flow mod. */
    OFPRR_GROUP_DELETE = 3, /* Group was removed. */
    OFPRR_METER_DELETE = 4, /* Meter was removed. */
    OFPRR_EVICTION = 5,     /* Switch eviction to free resources. */
};

The duration_sec and duration_nsec fields should be set to the elapsed lifetime of the flow entry in the switch; the total duration in nanoseconds can be computed as duration_sec * 10^9 + duration_nsec. All the other fields, such as idle_timeout, hard_timeout, and so on, should be set to the corresponding values from the flow entry; that is, they can be copied directly from the flow mod that created the entry. The packet_count and byte_count fields should be set to the packet count and byte count associated with the flow entry, respectively. If these values are not available, the fields should be set to the maximum possible value. Refer to the send_flow_removed_message() function in the of/openflow.c file for the procedure to send a flow-removed event message to the controller.

Sending a port-status message to the controller

Port-status messages (OFPT_PORT_STATUS) are sent from the switch to the controller when there is any change in port status, or when a new port is added, removed, or modified in the switch's data path.
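As a quick worked example of the duration encoding: a flow alive for 2.5 seconds is reported as duration_sec = 2 and duration_nsec = 500000000, and a consumer reassembles the total as below. This is a generic sketch, not code from of/openflow.c; it assumes the wire fields arrive in network byte order:

#include <stdint.h>
#include <arpa/inet.h>

/* Reassemble the flow lifetime reported in an OFPT_FLOW_REMOVED message. */
static uint64_t flow_lifetime_ns(const struct ofp_flow_removed *fr)
{
    return (uint64_t)ntohl(fr->duration_sec) * 1000000000ULL
         + ntohl(fr->duration_nsec);
}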
The switch should send this message only over the controller channels through which a controller has asked to receive it. The controller can express its interest in this event by sending an asynchronous configuration message to the switch. By default, the port-status message should be sent to all controllers configured in the switch, including controllers in the slave role (OFPCR_ROLE_SLAVE).

How to do it...

The switch should construct an OFPT_PORT_STATUS message with the following format and send it to the controllers that have already shown interest in this event:

/* A physical port has changed in the datapath */
struct ofp_port_status {
    struct ofp_header header;
    uint8_t reason; /* One of OFPPR_*. */
    uint8_t pad[7]; /* Align to 64-bits. */
    struct ofp_port desc;
};

The reason field should be set to one of the following values as defined in the enumeration:

/* What changed about the physical port */
enum ofp_port_reason {
    OFPPR_ADD = 0,    /* The port was added. */
    OFPPR_DELETE = 1, /* The port was removed. */
    OFPPR_MODIFY = 2, /* Some attribute of the port has changed. */
};

The desc field should be set to the port description. The switch need not fill in all properties of the port description: it should fill in the properties that have changed and may optionally include the unchanged ones. Refer to the send_port_status_message() function in the of/openflow.c file for the procedure to send a port-status message to the controller.

Sending a controller role-status message to the controller

Controller role-status messages (OFPT_ROLE_STATUS) are sent from the switch to the set of controllers when the role of a controller changes as a result of an OFPT_ROLE_REQUEST message. For example, if there are three controllers connected to a switch (say controller1, controller2, and controller3) and controller1 sends an OFPT_ROLE_REQUEST message to the switch, then the switch should send an OFPT_ROLE_STATUS message to controller2 and controller3.

How to do it...

The switch should build the OFPT_ROLE_STATUS message with the following format and send it to all the other controllers:

/* Role status event message. */
struct ofp_role_status {
    struct ofp_header header; /* Type OFPT_ROLE_REQUEST /
                               * OFPT_ROLE_REPLY. */
    uint32_t role;            /* One of OFPCR_ROLE_*. */
    uint8_t reason;           /* One of OFPCRR_*. */
    uint8_t pad[3];           /* Align to 64 bits. */
    uint64_t generation_id;   /* Master Election Generation Id */
    /* Role Property list */
    struct ofp_role_prop_header properties[0];
};

The reason field should be set to one of the following values as defined in the enumeration:

/* What changed about the controller role */
enum ofp_controller_role_reason {
    OFPCRR_MASTER_REQUEST = 0, /* Another controller asked
                                * to be master. */
    OFPCRR_CONFIG = 1,         /* Configuration changed on the
                                * switch. */
    OFPCRR_EXPERIMENTER = 2,   /* Experimenter data changed. */
};

The role field should be set to the new role of the controller, and generation_id to the generation ID of the OFPT_ROLE_REQUEST message that triggered the OFPT_ROLE_STATUS message. If the reason code is OFPCRR_EXPERIMENTER, then the role property list should be set in the following format:

/* Role property types. */
enum ofp_role_prop_type {
    OFPRPT_EXPERIMENTER = 0xFFFF, /* Experimenter property. */
};

/* Experimenter role property */
struct ofp_role_prop_experimenter {
    uint16_t type;         /* One of OFPRPT_EXPERIMENTER. */
    uint16_t length;       /* Length in bytes of this property. */
    uint32_t experimenter; /* Experimenter ID which takes the same
                            * form as struct
                            * ofp_experimenter_header. */
    uint32_t exp_type;     /* Experimenter defined. */
    /* Followed by:
     *   - Exactly (length - 12) bytes containing the experimenter data,
     *   - Exactly (length + 7)/8*8 - (length) (between 0 and 7)
     *     bytes of all-zero bytes */
    uint32_t experimenter_data[0];
};

The experimenter field carries the experimenter ID, which takes the same form as in struct ofp_experimenter_header. Refer to the send_role_status_message() function in the of/openflow.c file for the procedure to send a role-status message to the controller.
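The fan-out rule (notify everyone except the requester) is simple but worth pinning down. A minimal sketch follows; the channel list, the conn type, and the send_role_status() helper are illustrative assumptions, not the book's actual data structures:

/* After a successful OFPT_ROLE_REQUEST from `origin`, notify every
 * other controller channel with an OFPT_ROLE_STATUS message. */
static void broadcast_role_status(struct conn *channels, struct conn *origin,
                                  uint32_t new_role, uint8_t reason,
                                  uint64_t generation_id)
{
    for (struct conn *c = channels; c != NULL; c = c->next) {
        if (c == origin)
            continue;   /* the requester gets an OFPT_ROLE_REPLY instead */
        send_role_status(c, new_role, reason, generation_id);
    }
}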
Sending a table-status message to the controller

Table-status messages (OFPT_TABLE_STATUS) are sent from the switch to the controller when there is any change in table status; for example, when the number of entries in the table crosses a threshold value called the vacancy threshold. The switch should send this message only over the controller channels through which a controller has asked to receive it. The controller can express its interest in this event by sending the asynchronous configuration message to the switch.

How to do it...

The switch should build an OFPT_TABLE_STATUS message with the following format and send it to the controllers that have already shown interest in this event:

/* A table config has changed in the datapath */
struct ofp_table_status {
    struct ofp_header header;
    uint8_t reason;              /* One of OFPTR_*. */
    uint8_t pad[7];              /* Pad to 64 bits */
    struct ofp_table_desc table; /* New table config. */
};

The reason field should be set to one of the following values defined in the enumeration:

/* What changed about the table */
enum ofp_table_reason {
    OFPTR_VACANCY_DOWN = 3, /* Vacancy down threshold event. */
    OFPTR_VACANCY_UP = 4,   /* Vacancy up threshold event. */
};

When the number of free entries in the table crosses the vacancy_down threshold, the switch should set the reason code to OFPTR_VACANCY_DOWN. Once a vacancy-down event has been generated, the switch should not generate any further vacancy-down events until a vacancy-up event is generated. Likewise, when the number of free entries in the table crosses the vacancy_up threshold, the switch should set the reason code to OFPTR_VACANCY_UP, and once that event has been generated, the switch should not generate any further vacancy-up events until a vacancy-down event is generated. The table field should be set to the table description. Refer to the send_table_status_message() function in the of/openflow.c file for the procedure to send a table-status message to the controller.
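The alternating vacancy-up/vacancy-down behaviour is a hysteresis, which a single state flag captures. This is a generic sketch under stated assumptions: the flow_table fields and the send_table_status() helper are invented for illustration, while the OFPTR_* constants are the real ones from above:

#include <stdint.h>

struct flow_table {
    uint32_t max_entries, active_entries;
    uint8_t vacancy_down, vacancy_up;  /* thresholds, percent of free entries */
    uint8_t last_vacancy_event;        /* OFPTR_VACANCY_DOWN / _UP, or 0 */
};

/* Generate alternating vacancy events: after a DOWN event, stay quiet
 * until an UP event has fired, and vice versa. */
static void check_vacancy(struct flow_table *t)
{
    if (t->max_entries == 0)
        return;
    uint32_t free_pct =
        100u * (t->max_entries - t->active_entries) / t->max_entries;

    if (free_pct <= t->vacancy_down &&
        t->last_vacancy_event != OFPTR_VACANCY_DOWN) {
        t->last_vacancy_event = OFPTR_VACANCY_DOWN;
        send_table_status(t, OFPTR_VACANCY_DOWN);
    } else if (free_pct >= t->vacancy_up &&
               t->last_vacancy_event != OFPTR_VACANCY_UP) {
        t->last_vacancy_event = OFPTR_VACANCY_UP;
        send_table_status(t, OFPTR_VACANCY_UP);
    }
}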
Sending a request-forward message to the controller

When the switch receives a modify request message from the controller to modify the state of group or meter entries, then after successful modification of that state, the switch should forward the request to all other controllers as a request-forward message (OFPT_REQUESTFORWARD). The switch should send this message only over the controller channels through which a controller has asked to receive it. The controller can express its interest in this event by sending an asynchronous configuration message to the switch.

How to do it...

The switch should build the OFPT_REQUESTFORWARD message with the following format and send it to the controllers that have already shown interest in this event:

/* Group/Meter request forwarding. */
struct ofp_requestforward_header {
    struct ofp_header header;  /* Type OFPT_REQUESTFORWARD. */
    struct ofp_header request; /* Request being forwarded. */
};

The request field should be set to the request that was received from the controller. Refer to the send_request_forward_message() function in the of/openflow.c file for the procedure to send a request-forward message to the controller.

Handling a packet-out message from the controller

Packet-out messages (OFPT_PACKET_OUT) are sent from the controller to the switch when the controller wishes to send a packet out through the switch's data path via a switch port.

How to do it...

There are two ways in which the controller can send a packet-out message to the switch:

- Construct the full packet: in this case, the controller generates the complete packet and adds an action list to the packet-out message. The action list defines how the packet should be processed by the switch. If the switch receives a packet-out message with buffer_id set to OFP_NO_BUFFER, then it should look into the action list and, based on the actions to be performed, do one of the following: modify the packet and send it via the switch port mentioned in the action list, or hand the packet over to OpenFlow pipeline processing, based on the OFPP_TABLE port specified in the action list.
- Use a packet buffer in the switch: in this mechanism, the switch uses the buffer that was created at the time of sending the packet-in message to the controller. While sending the packet-in message, the switch adds a buffer_id to it; when the controller wants to send a packet-out message that uses this buffer, it includes the same buffer_id in the packet-out message. On receiving a packet-out message with a valid buffer_id, the switch should fetch the packet from the buffer and send it via the switch port. Once the packet is sent out, the switch should free the memory allocated to the cached buffer.

Handling a barrier message from the controller

A switch implementation may arbitrarily reorder the messages sent from the controller in order to maximize performance. So, if the controller wants to enforce in-order processing of messages, it uses barrier messages (OFPT_BARRIER_REQUEST and OFPT_BARRIER_REPLY). The switch must not reorder any messages across a barrier message. For example, if the controller sends a group add message followed by a flow add message referencing that group, it inserts a barrier message between them so that the group is installed before the flow entry is processed.

How to do it...

When the controller wants to send messages that are related to each other, it sends a barrier message between them. The switch should process these messages as follows:

- Messages before the barrier request must be processed fully before the barrier, including sending any resulting replies or errors.
- The barrier request message should then be processed and a barrier reply sent. While sending the barrier reply, the switch should copy the xid value from the barrier request message.
- The switch should then process the remaining messages.

Neither the barrier request nor the barrier reply message has a body; both consist only of the ofp_header.
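A minimal sketch of the barrier handling described above follows. The flush_pending_messages() and send_openflow_message() helpers and the conn type are assumed placeholders for whatever the implementation uses to drain and transmit its queues:

#include <arpa/inet.h>

static void process_barrier_request(struct conn *c,
                                    const struct ofp_header *req)
{
    /* Finish everything received before the barrier, including
     * sending any resulting replies or errors. */
    flush_pending_messages(c);

    struct ofp_header reply = {
        .version = req->version,
        .type    = OFPT_BARRIER_REPLY,
        .length  = htons(sizeof reply),  /* no body, header only */
        .xid     = req->xid,             /* reply copies the request's xid */
    };
    send_openflow_message(c, &reply);
    /* Only now resume processing messages queued after the barrier. */
}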
Summary

This article covered the symmetric and asynchronous messages sent and received by the OpenFlow switch, along with the procedures for handling these messages.


Introduction to SDN - Transformation from legacy to SDN

Packt
07 Jun 2017
23 min read
In this article, by Reza Toghraee, the author of the book Learning OpenDaylight, we will cover:

- What is and what is not SDN
- Components of an SDN
- The difference between SDN and overlay
- SDN controllers

You might have heard about Software-Defined Networking (SDN). If you are in the networking industry, this is probably a topic you started studying the first time you heard the term. To understand the importance of SDN and the SDN controller, look at Google: Google quietly built its own networking switches and controller, called Jupiter, a home-grown project that is mostly software driven and supports Google's massive scale. The premise of SDN is that there is a controller that knows the whole network. OpenDaylight (ODL) is an SDN controller; in other words, it's the central brain of the network.

Why we are going towards SDN

Everyone who hears about SDN should ask why we are talking about it at all: what problem is it trying to solve? If we look at traditional networking (layer 2 and layer 3, with routing protocols such as BGP and OSPF), we are completely dominated by what are called protocols. These protocols have in fact been very helpful to the industry. They are mostly standard, so different vendors and products can communicate with each other: a Cisco router can establish a BGP session with a Huawei switch, and an open source Quagga router can exchange OSPF routes with a Juniper firewall. Routing protocols are a constant standard with solid foundations, but if you need to override something in your network routing, you have to find a trick within the protocols, even if it is just a static route. SDN can help us break out of the routing-protocol cage and look at different ways to forward traffic: it can directly program each switch and even override a route installed by a routing protocol. There are high-level benefits to using SDN, a few of which are as follows:

- An integrated network: in traditional networking we had a standalone model; each switch was managed separately, ran its own routing protocol instance, and processed routing information messages from its neighbors. In SDN, we migrate to a centralized model, where the SDN controller becomes the single point of configuration of the network, where you apply policies and configuration.
- Scalable layer 2 across layer 3: having a layer 2 network span multiple layer 3 networks is something all network architects are interested in, and to date we have been using proprietary methods such as OTV, or a service provider's VPLS service, to achieve it. With SDN, we can create layer 2 networks across multiple switches or layer 3 domains (using VXLAN) and expand those layer 2 networks. In many cloud environments, where virtual machines are distributed across different hosts in different datacenters, this is a major requirement.
- Third-party application programmability: this is a very generic term, isn't it? What it refers to is letting other applications communicate with your network. For example, in many new distributed IP storage systems, the IP storage controller can talk to the network to provide the best, shortest path to a storage node. With SDN, we let other applications control the network. Of course, this control has limits, and SDN doesn't allow an application to scrap the whole network.
- Flexible application-based network: in SDN, everything is an application.
  L2/L3, BGP, VMware integration, and so on are all applications running on the SDN controller.
- Service chaining: on the fly, you can add a firewall or a load balancer into the path. This is service insertion.
- Unified wired and wireless: this is an ideal benefit, having a controller that supports both wired and wireless networks. OpenDaylight is the only controller that supports the CAPWAP protocol, which allows integration with wireless access points.

Components of an SDN

A software-defined network infrastructure has two key components:

- The SDN controller (only one, possibly deployed as a highly available cluster)
- The SDN-enabled switches (multiple switches, mostly in a Clos topology in a datacenter)

The SDN controller is the single brain of the SDN domain. In fact, an SDN domain is very similar to a chassis-based switch: you can think of the supervisor or management module of a chassis-based switch as the SDN controller, and the line cards and I/O cards as SDN switches. The main difference between an SDN network and a chassis-based switch is that you can scale out the SDN with multiple switches, whereas in a chassis-based switch you are limited by the number of slots in the chassis.

Controlling the fabric

It is very important to understand the main technologies involved in SDN. These methods are used by SDN controllers to manage and control the SDN network. In general, there are two methods available for controlling the fabric:

- Direct fabric programming: in this method, the SDN controller communicates directly with the SDN-enabled switches via southbound protocols such as OpenFlow, NETCONF, and OVSDB. The SDN controller programs each switch member with the relevant information about the fabric and how to forward traffic. Direct fabric programming is the method used by OpenDaylight.
- Overlay: in the overlay method, the SDN controller doesn't rely on programming the network switches and routers. Instead, it builds a virtual overlay network on top of the existing underlay network. The underlay can be a layer 2 or layer 3 network of traditional switches and routers, simply providing IP connectivity. The SDN controller uses this platform to build the overlay using encapsulation protocols such as VXLAN and NVGRE (see the header sketch after this list). VMware NSX uses overlay technology to build and control the virtual fabric.
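To make the overlay idea concrete, here is the VXLAN encapsulation header from RFC 7348 as a C struct: overlay endpoints wrap each tenant Ethernet frame in an outer IP/UDP header plus these 8 bytes, and the controller only needs IP reachability between endpoints. This is a generic illustration, not code from any particular controller:

#include <stdint.h>

/* VXLAN header (RFC 7348): 8 bytes carried after the outer UDP header.
 * Full encapsulation: outer Ethernet / outer IP / outer UDP / VXLAN /
 * original (inner) Ethernet frame. */
struct vxlan_hdr {
    uint8_t flags;        /* 0x08 when a valid VNI is present */
    uint8_t reserved1[3];
    uint8_t vni[3];       /* 24-bit VXLAN Network Identifier: which
                           * virtual layer 2 segment this frame belongs
                           * to (~16 million possible segments) */
    uint8_t reserved2;
};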
SDN controllers

One of the key fundamentals of SDN is disaggregation: disaggregation of software and hardware in a network, and disaggregation of the control and forwarding planes. The SDN controller is the main brain of an SDN environment; it's a strategic control point within the network, responsible for communicating information to:

- Routers, switches, and other network devices behind them. SDN controllers use APIs or protocols (such as OpenFlow or NETCONF) to communicate with these devices. This communication is known as southbound.
- Upstream switches, routers, or applications and the aforementioned business logic (via APIs or protocols). This communication is known as northbound. An example of northbound communication is a BGP session between a legacy router and the SDN controller.

If you are familiar with chassis-based switches like the Cisco Catalyst 6500 or Nexus 7k, you can imagine an SDN network as a chassis, with switches and routers as its I/O line cards and the SDN controller as its supervisor or management module. In fact, SDN is similar to a very scalable chassis with no limit on the number of physical slots: the SDN controller plays a role similar to the management module of a chassis-based switch, controlling all switches via its southbound protocols and APIs. The following table compares an SDN controller and a chassis-based switch:

SDN controller | Chassis-based switch
Supports any switch hardware | Supports only specific switch line cards
Can scale out to an unlimited number of switches | Limited to the number of physical slots in the chassis
Supports high redundancy via multiple controllers in a cluster | Supports dual management redundancy (active/standby)
Communicates with switches via southbound protocols such as OpenFlow, NETCONF, BGP PCEP | Uses proprietary protocols between the management module and line cards
Communicates with routers, switches, and applications outside the SDN domain via northbound protocols such as BGP, OSPF, and direct APIs | Communicates with routers and switches outside the chassis via standard protocols such as BGP, OSPF, or APIs

The first protocol to popularize the concept behind SDN was OpenFlow. When it was conceptualized by networking researchers at Stanford back in 2008, it was meant to manipulate the data plane to optimize traffic flows and make adjustments, so the network could quickly adapt to changing requirements. OpenFlow is a protocol designed to update the flow tables in a switch, allowing the SDN controller to access the forwarding table of each member switch, or in other words, to connect the control plane and the data plane in the SDN world. Version 1.0 of the OpenFlow specification was released in December 2009; it continues to be enhanced under the management of the Open Networking Foundation, a user-led organization focused on advancing the development of open standards and the adoption of SDN technologies.

After the introduction of OpenFlow, NOX was introduced as the original OpenFlow controller (at that point the term SDN controller was not yet in use). NOX provided a high-level API capable of managing, and also developing, network control applications; separate applications had to run on top of NOX to manage the network. NOX was initially developed by Nicira Networks (which was acquired by VMware and eventually became part of VMware NSX) and was introduced alongside OpenFlow in 2009. NOX was a closed source product, but it was ultimately donated to the SDN community, which led to multiple forks and sub-projects out of the original NOX. For example, POX is a sub-project of NOX that provides Python support. Both NOX and POX were early controllers; NOX appears inactive in development, whereas POX is still in use by the research community, as it is Python based and easy to deploy. POX is hosted at http://github.com/noxrepo/pox

Apart from being the first OpenFlow (or SDN) controller, NOX also established a programming model that was inherited by subsequent controllers. The model was based on processing OpenFlow messages, with each incoming OpenFlow message triggering an event that had to be processed individually. This model was simple to implement, but it was neither efficient nor robust, and it couldn't scale. Nicira, along with NTT and Google, then started developing ONIX, which was meant to be more abstract and scalable for large deployments.
ONIX became the base for Nicira (the core of VMware NSX, the network virtualization platform), and there are rumors that it is also the base for Google's WAN controller. ONIX was planned to become open source and be donated to the community, but for various reasons the main contributors decided against it, which forced the SDN community to focus on developing other platforms.

In 2010, a new controller was introduced: Beacon, which became one of the most popular controllers. It was born out of contributions by developers from Stanford University. Beacon is a Java-based open source OpenFlow controller created in 2010; it has been widely used for teaching and research, and served as the basis of Floodlight. Beacon had the first built-in web user interface, which was a huge step forward in the market of SDN controllers, and it was easier to deploy and run than NOX. Beacon influenced the design of the controllers that came after it, although it supported only star topologies, which was one of its limitations.

Floodlight was a successful SDN controller built as a fork of Beacon. BigSwitch Networks develops Floodlight along with other contributors. In 2013, Beacon's popularity started to shrink and Floodlight's started to grow. Floodlight fixed many of Beacon's issues and added many additional features, which made it one of the most feature-rich controllers available. It had a web interface, a Java-based GUI, and could also be integrated with OpenStack using the quantum plugin. Integration with OpenStack was a big step forward, as Floodlight could then provide networking to a large pool of virtual machines, compute, and storage. Floodlight adoption increased with the evolution of OpenStack and its adopters, giving it greater popularity and applicability than the controllers that came before it; most controllers that came after Floodlight also supported OpenStack integration. Floodlight is still supported and developed by the community and BigSwitch Networks, and it is the base for Big Cloud Fabric (BigSwitch's commercial SDN controller).

Other open source SDN controllers have also been introduced, such as Trema (Ruby-based, from NEC), Ryu (supported by NTT), FlowER, LOOM, and the recent OpenMUL. The following table shows the current open source SDN controllers:

Active open source SDN controllers | Non-active open source SDN controllers
Floodlight | Beacon
OpenContrail | FlowER
OpenDaylight | NOX
LOOM | NodeFlow
OpenMUL |
ONOS |
POX |
Ryu |
Trema |

OpenDaylight

OpenDaylight started in early 2013, originally led by IBM and Cisco, as a new collaborative open source project. OpenDaylight is hosted under the Linux Foundation and has drawn support and interest from many developers and adopters. It is a platform that provides common foundations and a robust array of services for SDN environments. OpenDaylight uses a controller model that supports OpenFlow as well as other southbound protocols; it is the first open source controller capable of employing non-OpenFlow, proprietary control protocols, which lets OpenDaylight integrate with modern, multi-vendor networks.

The first release of OpenDaylight, code-named Hydrogen, arrived in February 2014, followed by Helium in September 2014. The Helium release was significant because it marked a change in direction for the platform that has influenced the way subsequent controllers have been architected.
The main change was in the service abstraction layer, the part of the controller platform that sits just above the southbound protocols, such as OpenFlow, isolating them from the northbound side where the applications reside. Hydrogen used an API-driven service abstraction layer (AD-SAL), which had limitations: specifically, the controller needed to know about every type of device in the network and have an inventory of drivers to support them. Helium introduced a model-driven service abstraction layer (MD-SAL), which meant the controller didn't have to account for all the types of equipment installed in the network, allowing it to manage a wide range of hardware and southbound protocols. The Helium release made the framework much more agile and adaptable to changes in applications: an application could now request changes to the model, which would be received by the abstraction layer and forwarded to the network devices.

The OpenDaylight platform built on this advancement in its third release, Lithium, introduced in June 2015. This release focused on broadening the programmability of the network, enabling organizations to create their own service architectures to deliver dynamic network services in a cloud environment and craft intent-based policies. The Lithium release was worked on by more than 400 individuals, with contributions from Big Switch Networks, Cisco, Ericsson, HP, NEC, and others, making it one of the fastest-growing open source projects ever. The fourth release, Beryllium, came out in February 2016, and the most recent fifth release, Boron, was released in September 2016.

Many vendors have built and developed commercial SDN controller solutions based on OpenDaylight, each enhancing or adding features to OpenDaylight as a differentiating factor. Vendors use OpenDaylight in their products as:

- A base for a commercial version with additional proprietary functionality (for example, Brocade, Ericsson, and Ciena)
- Part of the infrastructure in their Network as a Service (or XaaS) offerings (for example, Telstra and IBM)
- Elements for use in their solutions (for example, ConteXtream, now part of HP)

Open Networking Operating System (ONOS), which was open sourced in December 2014, is focused on serving the needs of service providers. While it is not as widely adopted as OpenDaylight, ONOS has been finding success and gaining momentum around WAN use cases. ONOS is backed by numerous organizations, including AT&T, Cisco, Fujitsu, Ericsson, Ciena, Huawei, NTT, SK Telecom, NEC, and Intel, many of whom are also participants in and supporters of OpenDaylight.

Apart from open source SDN controllers, there are many commercial, proprietary controllers available in the market; VMware NSX, Cisco APIC, BigSwitch Big Cloud Fabric, HP VAN, and NEC ProgrammableFlow are examples of commercial, proprietary products. The following table lists the commercially available controllers and their relationship to OpenDaylight:

ODL-based | ODL-friendly | Non-ODL based
Avaya | Cyan (acquired by Ciena) | BigSwitch
Brocade | HP | Juniper
Ciena | NEC | Cisco
ConteXtream (HP) | Nuage | Plexxi
Coriant | PLUMgrid |
Ericsson | Pluribus |
Extreme | Sonus |
Huawei (also ships a non-ODL controller) | | VMware NSX

Core features of SDN

Regardless of whether the SDN platform is open source or proprietary, there are core features and capabilities that it must support.
These capabilities include:

- Fabric programmability: providing the ability to redirect traffic, apply filters to packets (dynamically), and leverage templates to streamline the creation of custom applications. Northbound APIs must make the control information centralized in the controller available to SDN applications for modification. This ensures the controller can dynamically adjust the underlying network to optimize traffic flows to use the least expensive path, take varying bandwidth constraints into consideration, and meet quality of service (QoS) requirements.
- Southbound protocol support: enabling the controller to communicate with switches and routers and to manipulate and optimize how they manage the flow of traffic. Currently, OpenFlow is the most standard protocol used between different networking vendors, although other southbound protocols can be used. An SDN platform should support different versions of OpenFlow in order to provide compatibility with different switching equipment.
- External API support: ensuring the controller can be used within varied orchestration and cloud environments such as VMware vSphere, OpenStack, and so on. Using APIs, the orchestration platform can communicate with the SDN platform in order to publish network policies. For example, VMware vSphere can talk to the SDN platform to extend virtual distributed switches (VDS) from the virtual environment to the physical underlay network, without requiring a network engineer to configure the network.
- Centralized monitoring and visualization: since the SDN controller has full visibility of the network, it can offer end-to-end visibility and centralized management to improve overall performance, simplify the identification of issues, and accelerate troubleshooting. The SDN controller should be able to discover and present a logical abstraction of all the physical links in the network, as well as a map of connected devices (MAC addresses) corresponding to the virtual or physical devices attached to the network. The SDN controller should support monitoring protocols, such as syslog and SNMP, and APIs for integration with third-party management and monitoring systems.
- Performance: performance in an SDN environment mainly depends on how fast the SDN controller fills the flow tables of the SDN-enabled switches. Most SDN controllers pre-populate the flow tables on switches to minimize delay. When an SDN-enabled switch receives a packet that doesn't match any entry in its flow table, it sends the packet to the SDN controller in order to find out where the packet needs to be forwarded. A robust SDN solution should ensure that the number of such requests from switches is minimal and that the SDN controller doesn't become a bottleneck in the network.
- High availability and scalability: controllers must support high-availability clusters to ensure reliability and service continuity in case a controller fails. Clustering in an SDN controller extends to scalability: a modern SDN platform should allow more controller nodes to be added, with load balancing, in order to increase performance and availability. Modern SDN controllers support clustering across multiple geographical locations.
- Security: since all switches communicate with the SDN controller, the communication channel needs to be secured to ensure unauthorized devices can't compromise the network.
  The SDN controller should secure the southbound channels and use encrypted messaging and mutual authentication to provide access control. Beyond that, the SDN controller must implement preventive mechanisms against denial-of-service attacks. Deployment of authorization levels and access controls for multi-tenant SDN platforms is also a key requirement.

Apart from the aforementioned features, SDN controllers are likely to expand their function in the future. They may become a network operating system and change the way we used to build networks with hardware, switches, SFPs, and gigs of bandwidth. The future will look more software defined, as the silicon and hardware industry has already delivered on its promises of high-performance 40G and 100G networking chips; the industry needs more time to digest the new hardware and silicon and to refresh its equipment with new gear supporting ten times the current performance.

Current SDN controllers

In this section, the different SDN controllers are laid out in a table. This will help you understand the current market players in SDN and how OpenDaylight relates to them:

- Brocade SDN Controller (based on OpenDaylight: yes; commercial): a commercial version of OpenDaylight, fully supported and with extra reliability modules.
- Cisco APIC (based on OpenDaylight: no; commercial): Cisco Application Policy Infrastructure Controller (APIC) is the unifying automation and management point for the Application Centric Infrastructure (ACI) data center fabric. Cisco uses the APIC controller and Nexus 9k switches to build the fabric, with OpFlex as the main southbound protocol.
- Ericsson SDN Controller (based on OpenDaylight: yes; commercial): Ericsson's SDN controller is a commercial (hardened) version of the OpenDaylight SDN controller. Domain-specific control applications that use the SDN controller as a platform form the basis of the three commercial products in Ericsson's SDN controller portfolio.
- Juniper OpenContrail / Contrail (based on OpenDaylight: no; both commercial and open source): OpenContrail is open source, and Contrail itself is a commercial product. Juniper Contrail Networking is an open SDN solution that consists of the Contrail controller, the Contrail vRouter, an analytics engine, and published northbound APIs for cloud and NFV. OpenContrail is also available for free from Juniper. Contrail promotes and uses MPLS in the datacenter.
- NEC ProgrammableFlow (based on OpenDaylight: no; commercial): NEC provides its own SDN controller and switches. The NEC SDN platform is a common choice for enterprises and has a lot of traction and some case studies.
- Avaya SDN Fx Controller (based on OpenDaylight: yes; commercial): based on OpenDaylight, bundled as a solution package.
- Big Cloud Fabric (based on OpenDaylight: no; commercial): the BigSwitch Networks solution is based on the Floodlight open source project. Big Cloud Fabric is a robust, clean SDN controller and works with bare-metal whitebox switches. It includes Switch Light OS, a switch operating system that can be loaded on whitebox switches with Broadcom Trident 2 or Tomahawk silicon. The benefit of Big Cloud Fabric is that you are not bound to any hardware and can use bare-metal switches from any vendor.
- Ciena's Agility (based on OpenDaylight: yes; commercial): Ciena's Agility multilayer WAN controller is built atop the open source baseline of the OpenDaylight Project, an open, modular framework created by a vendor-neutral ecosystem (rather than a vendor-centric ego-system) that enables network operators to source network services and applications from both Ciena's Agility and others.
HP VAN (Virtual Application Network) | No | Commercial | The building block of the HP open SDN ecosystem; the controller allows third-party developers to deliver innovative SDN solutions.
Huawei Agile controller | Yes and No (depending on edition) | Commercial | Huawei's SDN controller, which integrates as a solution with Huawei enterprise switches.
Nuage | No | Commercial | Nuage Networks VSP provides SDN capabilities for clouds of all sizes. It is implemented as a non-disruptive overlay for all existing virtualized and non-virtualized server and network resources.
Pluribus Netvisor | No | Commercial | Netvisor Premium and Open Netvisor Linux are distributed network operating systems. Open Netvisor integrates a traditional, interoperable networking stack (L2/L3/VXLAN) with an SDN distributed controller that runs in every switch of the fabric.
VMware NSX | No | Commercial | VMware NSX is an overlay type of SDN, which currently works with VMware vSphere, with OpenStack support planned. VMware NSX also has built-in firewall, router, and L4 load balancer functionality, allowing micro-segmentation.

OpenDaylight as an SDN controller

Previously, we went through the role of the SDN controller and a brief history of ODL. ODL is a modular open SDN platform that allows developers to build any network or business application on top of it to drive the network in the way they want. ODL has currently reached its fifth release (Boron, the fifth element in the periodic table); ODL releases are named after periodic table elements, starting from the first release, Hydrogen. ODL has a six-month release cycle, and with many developers working on expanding the platform, two releases per year are expected from the community. For technical readers to understand it more clearly, the following diagram will help:

The ODL platform has a broad set of use cases for multivendor, brownfield and greenfield deployments, service providers, and enterprises, and is a foundation for networks of the future. Service providers are using ODL to migrate their services to a software-enabled level with automatic service delivery, moving away from a circuit-based mindset of service delivery. They also work on providing virtualized CPE with NFV support in order to provide flexible offerings. Enterprises use ODL for many use cases: from data center networking, cloud and NFV, network automation and resource optimization, to visibility, control, and deploying a fully SDN-enabled campus network.

ODL uses MD-SAL (a Model-Driven Service Abstraction Layer), which makes it very scalable and lets it incorporate new applications and protocols faster. ODL supports multiple standard and proprietary southbound protocols; for example, with full support for OpenFlow and OVSDB, ODL can communicate with any standard hardware (or even virtual switches such as Open vSwitch (OVS)) supporting such protocols. With such support, ODL can be deployed and used in multivendor environments, controlling hardware from different vendors from a single console, no matter what vendor and what device it is, as long as standard southbound protocols are supported.

ODL uses a micro-services architecture model, which allows users to control applications, protocols, and plugins while deploying SDN applications. ODL is also able to manage the connection between external consumers and providers.
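Since ODL's northbound interface is exposed over RESTCONF, even a few lines of scripting are enough to read the controller's view of the network. The following is a minimal sketch in PowerShell; the host name is a placeholder, and the port (8181), credentials (admin/admin), and the network-topology path assume an out-of-the-box ODL install, so verify them against your own deployment:

# Query OpenDaylight's northbound RESTCONF API and list discovered nodes.
$controller = "http://odl-controller:8181"   # hypothetical controller host
$pair       = "admin:admin"                  # default ODL credentials (assumed)
$headers    = @{ Authorization = "Basic " + [Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes($pair)) }

# Retrieve the operational topology (nodes and links known to ODL)
$topology = Invoke-RestMethod -Uri "$controller/restconf/operational/network-topology:network-topology" -Headers $headers -Method Get

foreach ($topo in $topology.'network-topology'.topology) {
    Write-Host "Topology:" $topo.'topology-id'
    foreach ($node in $topo.node) { Write-Host "  Node:" $node.'node-id' }
}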
The following diagram explains the ODL footprint and the different components and projects within ODL:

Micro-services architecture

ODL stores its YANG data structures in a common data store and uses messaging infrastructure between different components to enable a model-driven approach to describing the network and its functions. In the ODL MD-SAL, any SDN application can be integrated as a service and then loaded into the SDN controller. These services (apps) can be chained together in any number of ways to match the application's needs. This concept allows users to install and enable only the protocols and services they need, which makes the system light and efficient. Services and applications created by users can also be shared with others in the ecosystem, since SDN application deployment for ODL follows a modular design.

ODL supports multiple southbound protocols: OpenFlow and OpenFlow extensions such as Table Type Patterns (TTP), as well as other protocols including NETCONF, BGP/PCEP, CAPWAP, and OVSDB. ODL also supports the Cisco OpFlex protocol:

The ODL platform provides a framework for authentication, authorization, and accounting (AAA), as well as automatic discovery and securing of network devices and controllers. Another key area in security is the use of encrypted and authenticated communication through southbound protocols with the switches and routers within the SDN domain. Most southbound protocols support security encryption mechanisms.

Summary

In this article we learned about SDN and why it is important. We reviewed SDN controller products and the ODL history, as well as the core features of SDN controllers and the market-leading controllers. We managed to dive into some details about SDN.

Resources for Article:

Further resources on this subject:
- Managing Network Devices [article]
- Setting Up a Network Backup Server with Bacula [article]
- Point-to-Point Networks [article]

Oracle VM Management

Packt
16 Oct 2009
6 min read
Before we get to managing the VMs in Oracle VM Manager, let's take a quick look at the Oracle VM Manager by logging into it.

Getting started with Oracle VM Manager

In this article, we will perform the following actions while exploring the Oracle VM Manager:

- Registering an account
- Logging in to Oracle VM Manager
- Creating a Server Pool

After we are done with the Oracle VM Manager installation, we will use one of the following links to log on to the Oracle VM Manager:

- Within the local machine: http://127.0.0.1:8888/OVS
- Logging in remotely: http://vmmgr:8888/OVS

Here, vmmgr refers to the host name or IP address of your Oracle VM Manager host.

How to register an account

An account can be registered in several ways. If, during the installation of Oracle VM Manager, we chose to configure the default admin account "admin", then we can use this account directly to log on to Oracle's IntraCloud portal, the Oracle VM Manager. We will explain user accounts in detail later, along with why we need separate accounts with separate roles for fine-grained access control; something that is crucial for security purposes. So let's have a quick look at the three available options:

- Default installation: This option applies if we have performed the default installation ourselves and have gone ahead to create the account ourselves. Here we have the default administrator role.
- Request for account creation: Contacting the administrator of Oracle VM Manager is another way to obtain an account with privileges such as administrator, manager, or user.
- Create yourself: If we need to carry out the basic functions of a common user with the operator's role, such as creating and using virtual machines or importing resources, we can create a new account ourselves. However, we will need the administrator to assign the server pools and groups to our account before we can get started. Here, by default, we are granted a user role. We will talk more about roles later in this article.

Now let's go about registering a new account with Oracle VM Manager. Once on the Oracle VM Manager Login page, click on the Register link. We are presented with the following screen. We must enter a Username of our choice and a hard-to-crack password twice. Also, we have to fill in our First Name and Last Name and complete the registration with a valid email address. Click Next:

Next, we need to confirm our account details by clicking on the Confirm button. Now our account will be created and a confirmation message is displayed on the Oracle VM Manager Login screen. It should be noted that we will need some Server Pools and groups before we can get started. We will have to ask the administrator to assign us access to those pools and groups. It's time now to log in to our newly created account.

Logging in to Oracle VM Manager

Again, we will need to access the URL locally by typing http://127.0.0.1:8888/OVS, or remotely by typing http://hostname:8888/OVS. If we are accessing the Oracle VM Manager Portal remotely, replace "hostname" with either the FQDN (Fully Qualified Domain Name), if the machine is registered in our DNS, or just the host name of the VM Manager machine. We can log in to the portal by simply typing in the Username and Password that we just created. Depending on the role and the server pools that we have been assigned, the tabs shown in the following table will be displayed on the screen. To change the role, we will need to contact our enterprise domain administrator.
Only administrators are allowed to change the roles of accounts. If we forget our password, we can click on Forgot Password, and on submitting our account name, the password will be sent to the registered email address that we provided when we registered the account. The following table shows the tabs that are displayed for each Oracle VM Manager role:

Role | Granted tabs
User | Virtual Machines, Resources
Administrator | Virtual Machines, Resources, Servers, Server Pools, Administration
Manager | Virtual Machines, Resources, Servers, Server Pools

We can of course change the roles by editing the Profile (in the upper-right section of the portal). As can be seen in the following screenshot, we have access to the Virtual Machines pane and the Resources pane. We will continue to add Servers to the pool when logged in as admin.

Oracle VM management: Managing Server Pool

A Server Pool is logically an autonomous region that contains one or more physical servers, and the dynamic nature of such pools, and pools of pools, makes up what we call an infinite Cloud infrastructure. Currently, Oracle has its Cloud portal with Amazon, but it is very much viable to have an IntraCloud portal or private Cloud where we can run all sorts of Linux and Windows flavors on our Cloud backbone. It eventually rests on an array of SAN, NAS, or other next-generation storage substrate on which the VMs reside. We must ensure that we have the following prerequisites properly checked before creating the Virtual Machines on our IntraCloud Oracle VM:

- Oracle VM Servers: These are available to deploy as Utility Server, Server Pool Master, and Virtual Machine Servers.
- Repositories: Used for Live Migration or Hot Migration of the VMs, and for local storage on the Oracle VM Servers.
- FQDN/IP address of Oracle VM Servers: It is better to have the Oracle VM Servers known as OracleVM01.AVASTU.COM and OracleVM02.AVASTU.COM. This way you don't have to bother about IP changes or infrastructural relocation of the IntraCloud to another location.
- Oracle VM Agent passwords: Needed to access the Oracle VM Servers.

Let's now go about exploring the design process of the Oracle VM. Then we will do the following systematically:

- Creating the Server Pool
- Editing Server Pool information
- Search and retrieval within a Server Pool
- Restoring a Server Pool
- Enabling HA
- Deleting a Server Pool

However, we can carry out these actions only as a Manager or an Administrator. But first, let's take a look at the decisions on what type of Server Pools will suit us best, and what the architectural considerations could be around building your Oracle VM farm.

Working with PowerShell

Packt
07 Sep 2015
17 min read
In this article, you will cover:

- Retrieving system information – Configuration Service cmdlets
- Administering hosts and machines – Host and MachineCreation cmdlets
- Managing additional components – StoreFront Admin and Logging cmdlets

(For more resources related to this topic, see here.)

Introduction

With hundreds or thousands of hosts to configure and machines to deploy, configuring all the components manually would be difficult. As with previous XenDesktop releases, XenDesktop 7.6 ships with an integrated set of PowerShell modules. With these, IT technicians are able to reduce the time required to perform management tasks by creating PowerShell scripts, which can be used to deploy, manage, and troubleshoot most of the XenDesktop components at scale. Working with PowerShell instead of the XenDesktop GUI gives you more flexibility in terms of the kinds of operations you can execute, with a set of additional features to use during the infrastructure creation and configuration phases.

Retrieving system information – Configuration Service cmdlets

In this recipe, we will use and explain a general-purpose set of PowerShell cmdlets: the Configuration Service category. This is used to retrieve general configuration parameters, and to obtain information about the implementation of the XenDesktop Configuration Service.

Getting ready

No preliminary tasks are required. You have already installed the Citrix XenDesktop PowerShell SDK during the installation of the Desktop Controller role machine(s). To be able to run a PowerShell script (.ps1 format), you have to enable script execution from the PowerShell prompt in the following way:

Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Force

How to do it…

In this section, we will explain and execute the commands associated with the XenDesktop System and Services configuration area:

1. Connect to one of the Desktop Broker servers, by using a Remote Desktop connection, for instance.
2. Right-click on the PowerShell icon on the Windows taskbar and select the Run as Administrator option.
3. Load the Citrix PowerShell modules by typing the following command and then pressing the Enter key:

Asnp Citrix*

As an alternative to the Asnp command, you can use the Add-PSSnapin command.

4. Retrieve the active and configured Desktop Controller features by running the following command:

Get-ConfigEnabledFeature

5. To retrieve the current status of the Config Service, run the following command. The output will be OK in the absence of configuration issues:

Get-ConfigServiceStatus

6. To get the connection string used by the Configuration Service to connect to the XenDesktop database, run the following command:

Get-ConfigDBConnection

7. Starting from the previously received output, it's possible to configure the connection string to let the Configuration Service use the system DB. For this command, you have to specify the Server, Initial Catalog, and Integrated Security parameters:

Set-ConfigDBConnection –DBConnection "Server=<Servername\InstanceName>; Initial Catalog=<DatabaseName>; Integrated Security=<True | False>"

8. Starting from an existing Citrix database, you can generate a SQL procedure file to use as a backup to recreate the database.
Run the following command to complete this task, specifying the DatabaseName and ServiceGroupName parameters:

Get-ConfigDBSchema -DatabaseName <DatabaseName> -ServiceGroupName <ServiceGroupName> > Path:\FileName.sql

You need to configure a destination database with the same name as that of the source DB, otherwise the script will fail!

9. To retrieve information about the active Configuration Service objects (Instance, Service, and Service Group), run the following three commands respectively:

Get-ConfigRegisteredServiceInstance
Get-ConfigService
Get-ConfigServiceGroup

10. To test a set of operations that check the status of the Configuration Service, run the following script:

#------------ Script - Configuration Service
#------------ Define Variables
$Server_Conn="SqlDatabaseServer.xdseven.local\CITRIX,1434"
$Catalog_Conn="CitrixXD7-Site-First"
#------------
Write-Host "XenDesktop - Configuration Service CmdLets"
#---------- Clear the existing Configuration Service DB connection
$Clear = Set-ConfigDBConnection -DBConnection $null
Write-Host "Clearing any previous DB connection - Status: " $Clear
#---------- Set the Configuration Service DB connection string
$New_Conn = Set-ConfigDBConnection -DBConnection "Server=$Server_Conn; Initial Catalog=$Catalog_Conn; Integrated Security=$true"
Write-Host "Configuring the DB string connection - Status: " $New_Conn
$Configured_String = Get-ConfigDBConnection
Write-Host "The new configured DB connection string is: " $Configured_String

You have to save this script with the .ps1 extension, in order to invoke it with PowerShell. Be sure to change the parameters specific to your infrastructure, in order to be able to run the script. This is shown in the following screenshot:

How it works...

The Configuration Service cmdlets of XenDesktop PowerShell allow you to manage the Configuration Service and its related information: the metadata for the entire XenDesktop infrastructure, the service instances registered within the VDI architecture, and the collections of these services, called Service Groups. This set of commands offers the ability to retrieve and check the DB connection string used to contact the configured XenDesktop SQL Server database. These operations are performed with the Get-ConfigDBConnection command (to retrieve the current configuration) and the Set-ConfigDBConnection command (to configure the DB connection string); both commands use the DB server name with the instance name, the DB name, and Integrated Security as information fields.

In the attached script, we have regenerated a database connection string. To make sure we are able to recreate it, first of all we cleared any existing connection, setting it to null (see the command associated with the $Clear variable); then we defined the $New_Conn variable, using the Set-ConfigDBConnection command; all the parameters are defined at the top of the script, in the form of variables. Use the Write-Host command to echo results to the standard output.

There's more...

In some cases, you may need to retrieve the state of the registered services, in order to verify their availability. You can use the Test-ConfigServiceInstanceAvailability cmdlet, retrieving whether the service is responding or not, and its response time. Run the following example to test the use of this command:

Get-ConfigRegisteredServiceInstance | Test-ConfigServiceInstanceAvailability | more

Use the –ForceWaitForOneOfEachType parameter to stop the check for a service category when one of its services responds.
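As a variation on the previous script, the same cmdlets can be combined into a small self-healing check. The following is a minimal sketch, reusing the $Server_Conn and $Catalog_Conn example values from the script above; the comparison against the "OK" string assumes the textual service status shown earlier in this recipe, so adapt it if your SDK version returns a different object:

# Re-apply the Configuration Service DB connection only when the service
# reports a problem. Variable values are the same examples used above.
Asnp Citrix*

$Server_Conn  = "SqlDatabaseServer.xdseven.local\CITRIX,1434"
$Catalog_Conn = "CitrixXD7-Site-First"

$Status = Get-ConfigServiceStatus
if ("$Status" -ne "OK") {
    Write-Host "Configuration Service status is '$Status' - re-applying the DB connection string"
    Set-ConfigDBConnection -DBConnection "Server=$Server_Conn; Initial Catalog=$Catalog_Conn; Integrated Security=$true"
} else {
    Write-Host "Configuration Service is healthy"
}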
Administering hosts and machines – Host and MachineCreation cmdlets

In this recipe, we will describe how to create the connection between the hypervisor and the XenDesktop servers, and the way to generate machines to assign to end users, all by using Citrix PowerShell.

Getting ready

No preliminary tasks are required. You have already installed the Citrix XenDesktop PowerShell SDK during the installation of the Desktop Controller role machine(s). To be sure you are able to run a PowerShell script (the .ps1 format), you have to enable script execution from the PowerShell prompt in this way:

Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Force

How to do it…

In this section, we will discuss the PowerShell commands used to connect XenDesktop with the supported hypervisors, plus the creation of machines from the command line:

1. Connect to one of the Desktop Broker servers.
2. Click on the PowerShell icon on the Windows taskbar.
3. Load the Citrix PowerShell modules by typing the following command, and then press the Enter key:

Asnp Citrix*

4. To list the available hypervisor types, execute this task:

Get-HypHypervisorPlugin –AdminAddress <BrokerAddress>

5. To list the configured properties for the XenDesktop root-level location (XDHyp:), execute the following command:

Get-ChildItem XDHyp:\HostingUnits

Please refer to the PSPath, Storage, and PersonalvDiskStorage output fields to retrieve information on the storage configuration.

6. Execute the following cmdlet to add a storage resource to the XenDesktop Controller host:

Add-HypHostingUnitStorage –LiteralPath <HostPathLocation> -StoragePath <StoragePath> -StorageType <OSStorage|PersonalvDiskStorage> -AdminAddress <BrokerAddress>

7. To generate a snapshot for an existing VM, perform the following task:

New-HypVMSnapshot –LiteralPath <HostPathLocation> -SnapshotDescription <Description>

Use the Get-HypVMMacAddress -LiteralPath <HostPathLocation> command to list the MAC addresses of specified desktop VMs.

8. To provision machine instances starting from the Desktop base image template, run the following command:

New-ProvScheme –ProvisioningSchemeName <SchemeName> -HostingUnitName <HypervisorServer> -IdentityPoolName <PoolName> -MasterImageVM <BaseImageTemplatePath> -VMMemoryMB <MemoryAssigned> -VMCpuCount <NumberofCPU>

To specify the creation of instances with the Personal vDisk technology, use the following option: -UsePersonalVDiskStorage.

9. After the creation process, retrieve the provisioning scheme information by running the following command:

Get-ProvScheme –ProvisioningSchemeName <SchemeName>

10. To modify the resources assigned to desktop instances in a provisioning scheme, use the Set-ProvScheme cmdlet. The permitted parameters are –ProvisioningSchemeName, -VMCpuCount, and –VMMemoryMB.

11. To update the desktop instances to the latest version of the Desktop base image template, run the following cmdlet:

Publish-ProvMasterVmImage –ProvisioningSchemeName <SchemeName> -MasterImageVM <BaseImageTemplatePath>

If you do not want to keep the pre-update instance version to use as a restore checkpoint, use the –DoNotStoreOldImage option.

12. To create machine instances based on the previously configured provisioning scheme for an MCS architecture, run this command:

New-ProvVM –ProvisioningSchemeName <SchemeName> -ADAccountName "Domain\MachineAccount"

Use the -FastBuild option to make the machine creation process faster. On the other hand, you cannot start up the machines until the process has been completed.
13. Retrieve the configured desktop instances by using the next cmdlet:

Get-ProvVM –ProvisioningSchemeName <SchemeName> -VMName <MachineName>

14. To remove an existing virtual desktop, use the following command:

Remove-ProvVM –ProvisioningSchemeName <SchemeName> -VMName <MachineName> -AdminAddress <BrokerAddress>

15. The next script combines the use of some of the commands listed in this recipe:

#------------ Script - Hosting + MCS
#-----------------------------------
#------------ Define Variables
$LitPath = "XDHyp:\HostingUnits\VMware01"
$StorPath = "XDHyp:\HostingUnits\VMware01\datastore1.storage"
$Controller_Address="192.168.110.30"
$HostUnitName = "VMware01"
$IDPool = $(Get-AcctIdentityPool -IdentityPoolName VDI-DESKTOP)
$BaseVMPath = "XDHyp:\HostingUnits\VMware01\XD7-W8MCS-01.vm"
#------------ Creating a storage location
Add-HypHostingUnitStorage –LiteralPath $LitPath -StoragePath $StorPath -StorageType OSStorage -AdminAddress $Controller_Address
#---------- Creating a Provisioning Scheme
New-ProvScheme –ProvisioningSchemeName Deploy_01 -HostingUnitName $HostUnitName -IdentityPoolName $IDPool.IdentityPoolName -MasterImageVM "$BaseVMPath\T0-Post.snapshot" -VMMemoryMB 4096 -VMCpuCount 2 -CleanOnBoot
#---------- List the VMs configured on the Hypervisor Host
dir $LitPath\*.vm
exit

How it works...

The Host and MachineCreation cmdlet groups manage the interfacing with the hypervisor hosts, in terms of machines and storage resources. This allows you to create the desktop instances to assign to end users, starting from an existing and mapped Desktop virtual machine. The Get-HypHypervisorPlugin command retrieves and lists the available hypervisors to use to deploy virtual desktops and to configure the storage types. You need to configure an operating system storage area or a Personal vDisk storage zone. The way to map an existing storage location from the hypervisor to the XenDesktop Controller is by running the Add-HypHostingUnitStorage cmdlet. In this case you have to specify the destination path on which the storage object will be created (LiteralPath), the source storage path on the hypervisor machine(s) (StoragePath), and the StorageType previously discussed. The storage types are in the form XDHyp:\HostingUnits\<UnitName>. To list all the configured storage objects, execute the following command:

dir XDHyp:\HostingUnits\<UnitName>\*.storage

After configuring the storage area, we discussed the Machine Creation Services (MCS) architecture. In this cmdlet collection, we have commands to generate VM snapshots from which to deploy desktop instances (New-HypVMSnapshot), specifying a name and a description for the generated disk snapshot. Starting from the available disk image, the New-ProvScheme command lets you create a resource provisioning scheme, on which you specify the desktop base image and the resources to assign to the desktop instances (in terms of CPU and RAM: -VMCpuCount and –VMMemoryMB), whether to generate these virtual desktops in non-persistent mode (the -CleanOnBoot option), and whether to use the Personal vDisk technology (-UsePersonalVDiskStorage). It's possible to update the deployed instances to the latest base image update through the use of the Publish-ProvMasterVmImage command.
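Putting New-ProvScheme and New-ProvVM together, scaling out is usually just a loop over machine accounts. The following is a hedged sketch of batch provisioning; the scheme name, domain, and account naming pattern are invented for the example, and it assumes the AD machine accounts already exist (for instance, pre-created with the XenDesktop Acct cmdlets):

# Provision five MCS desktops against an existing provisioning scheme.
# All names below are illustrative placeholders.
Asnp Citrix*

$Scheme = "Deploy_01"
1..5 | ForEach-Object {
    # Hypothetical pre-created AD computer accounts: VDI-01$ .. VDI-05$
    $Account = "XDSEVEN\VDI-{0:D2}$" -f $_
    Write-Host "Creating desktop instance for account $Account"
    New-ProvVM -ProvisioningSchemeName $Scheme -ADAccountName $Account
}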
Returning to the script generated in step 15, we located all the main storage paths (the $LitPath and $StorPath variables) needed to realize a provisioning scheme; then we implemented a provisioning procedure for a desktop based on an existing base image snapshot, with two vCPUs and 4GB of RAM for the delivered instances, which will be cleaned every time they stop and start (by using the -CleanOnBoot option). You can navigate the local and remote storage paths configured with the XenDesktop Broker machine; to list an object category (such as VM or Snapshot), you can execute this command:

dir XDHyp:\HostingUnits\<UnitName>\*.<category>

There's more...

The discussed cmdlets also offer a technique to protect a virtual desktop from accidental deletion or unauthorized use. The Machine Creation cmdlets group includes a particular command that allows you to lock critical desktops: Lock-ProvVM. This cmdlet requires as parameters the name of the scheme to which the desktop refers (-ProvisioningSchemeName) and the ID of the virtual desktop to lock (-VMID). You can retrieve the virtual machine ID by running the Get-ProvVM command discussed previously. To revert the machine lock, and free the desktop instance from accidental deletion or improper use, you have to execute the Unlock-ProvVM cmdlet, using the same parameters shown for the lock procedure.

Managing additional components – StoreFront Admin and Logging cmdlets

In this recipe, we will explain how to manage and configure the StoreFront component by using the available Citrix PowerShell cmdlets. Moreover, we will explain how to manage and check the configuration of the system logging activities.

Getting ready

No preliminary tasks are required. You have already installed the Citrix XenDesktop PowerShell SDK during the installation of the Desktop Controller role machine(s). To be able to run a PowerShell script (in the .ps1 format), you have to enable script execution from the PowerShell prompt in this way:

Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Force

How to do it…

In this section, we will explain and execute the commands associated with the Citrix StoreFront system:

1. Connect to one of the Desktop Broker servers.
2. Click on the PowerShell icon on the Windows taskbar.
3. Load the Citrix PowerShell modules by typing the following command, and then press the Enter key:

Asnp Citrix*

To execute a command, you have to press the Enter key after completing the correct command syntax.

4. Retrieve the currently existing StoreFront service instances by running the following command:

Get-SfService

To limit the number of rows in the output, you can add the –MaxRecordCount <value> parameter.

5. To list detailed information about the StoreFront service(s) currently configured, execute the following command:

Get-SfServiceInstance –AdminAddress <ControllerAddress>

The status of the currently active StoreFront instances can be retrieved by using the Get-SfServiceStatus command. The OK output will confirm correct service execution.

6. To list the task history associated with the configured StoreFront instances, you have to run the following command:

Get-SfTask

You can filter for the ID of the desired task (-taskid) and sort the results by using the –sortby parameter.
7. To retrieve the installed database schema versions, you can execute the following command:

Get-SfInstalledDBVersion

By applying the –Upgrade and –Downgrade filters, you will receive, respectively, the schemas for which the database version can be upgraded or reverted to a previous compatible one.

8. To modify the StoreFront configuration to register its state on a different database, you can use the following command:

Set-SfDBConnection -DBConnection <DBConnectionString> -AdminAddress <ControllerAddress>

Be careful when you specify the database connection string; if not specified, the existing database connections and configurations will be cleared!

9. To check that the database connection has been correctly configured, the following command is available:

Test-SfDBConnection -DBConnection <DBConnectionString> -AdminAddress <ControllerAddress>

10. Moving on to the second discussed group, the Logging cmdlets, you can retrieve the current status of the logging service by running the following command:

Get-LogServiceStatus

11. To verify the language used and whether the logging service has been enabled, run the following command:

Get-LogSite

The available configurable locales are en, ja, zh-CN, de, es, and fr. The available states are Enabled, Disabled, NotSupported, and Mandatory. The NotSupported state indicates an incorrect configuration of the listed parameters.

12. To retrieve detailed information about the running logging service, you have to use the following command:

Get-LogService

As discussed earlier for the StoreFront commands, you can filter the output by applying the –MaxRecordCount <value> parameter.

13. In order to get all the operations logged within a specified time range, run the following command; this will return the global operations count:

Get-LogSummary –StartDateRange <StartDate> -EndDateRange <EndDate>

The date format must be the following: YYYY-MM-DD HH:MM:SS.

14. To list the collected operations per day in the specified time period, run the previous command in the following way:

Get-LogSummary –StartDateRange <StartDate> -EndDateRange <EndDate> -intervalSeconds 86400

The value 86400 is the number of seconds in a day.

15. To retrieve the connection string information for the database on which logging data is stored, execute the following command:

Get-LogDataStore

16. To retrieve detailed information about the high level operations performed on the XenDesktop infrastructure, you have to run the following command:

Get-LogHighLevelOperation –Text <Text included in the Operation> -StartTime <Formatted Date and Time> -EndTime <Formatted Date and Time> -IsSuccessful <true | false> -User <Domain\UserName> -OperationType <AdminActivity | ConfigurationChange>

The indicated filters are not mandatory. If you do not apply any filters, all the logged operations will be returned. This could be a very long output.

17. The same information can be retrieved for the low level system operations in the following way:

Get-LogLowLevelOperation -StartTime <Formatted Date and Time> -EndTime <Formatted Date and Time> -IsSuccessful <true | false> -User <Domain\UserName> -OperationType <AdminActivity | ConfigurationChange>

In the How it works section, we will explain the difference between the high and low level operations.
18. To log when a high level operation starts and stops, respectively, use the following two commands:

Start-LogHighLevelOperation –Text <Operation Description Text> -Source <Operation Source> -StartTime <Formatted Date and Time> -OperationType <AdminActivity | ConfigurationChange>
Stop-LogHighLevelOperation –HighLevelOperationId <OperationID> -IsSuccessful <true | false>

The Stop-LogHighLevelOperation must refer to an existing started high level operation, because they are related tasks.

How it works...

Here, we have discussed two more PowerShell command collections for the XenDesktop 7 versions: the cmdlets related to the StoreFront platform, and the Logging set of commands. The first collection is quite limited in terms of operations compared with the other discussed cmdlets. In fact, the only actions permitted with the StoreFront PowerShell set of commands are retrieving configurations and settings about the configured stores and the linked database. More activities can be performed regarding the modification of existing StoreFront clusters, by using the Get-SfCluster, Add-SfServerToCluster, New-SfCluster, and Set-SfCluster set of operations.

More interesting is the PowerShell Logging collection. In this case, you can retrieve all the system-logged data, which falls into two principal categories:

- High-level operations: These tasks group all the system configuration changes that are executed by using Desktop Studio, Desktop Director, or Citrix PowerShell.
- Low-level operations: This category relates to all the system configuration changes that are executed by a service, and not by using the system software's consoles.

With the low level operations command, you can filter for the specific high level operation to which the low level operation refers, by specifying the -HighLevelOperationId parameter. This cmdlet category also gives you the ability to track the start and stop of a high level operation, by using Start-LogHighLevelOperation and Stop-LogHighLevelOperation. In the second case, you have to specify the previously started high level operation.

There's more...

If there is too much information in the log store, you have the ability to clear all of it. To purge all the log entries, we use the following command:

Remove-LogOperation -UserName <DBAdministrativeCredentials> -Password <DBUserPassword> -StartDateRange <StartDate> -EndDateRange <EndDate>

The unencrypted –Password parameter can be replaced by –SecurePassword, which takes the password in secure string form. The credentials must be database administrative credentials, with delete permissions on the destination database. This is a non-reversible operation, so make sure that you want to delete the logs in the specified time range, or verify that you have some form of data backup.

Resources for Article:

Further resources on this subject:
- Working with Virtual Machines [article]
- Virtualization [article]
- Upgrading VMware Virtual Infrastructure Setups [article]

NSX Core Components

Packt
05 Jan 2016
16 min read
In this article by Ranjit Singh Thakurratan, the author of the book Learning VMware NSX, we discuss some of the core components of NSX. The article begins with a brief introduction to the NSX core components, followed by a detailed discussion of each. We will go over the three different control planes and see how each of the NSX core components fits into this architecture. Next, we will cover the VXLAN architecture and the transport zones that allow us to create and extend overlay networks across multiple clusters. We will also look at NSX Edge and the distributed firewall in greater detail, and take a look at the newest NSX feature of multi-vCenter, or cross-vCenter, NSX deployment. By the end of this article, you will have a thorough understanding of the NSX core components and their functional inter-dependencies.

In this article, we will cover the following topics:

- An introduction to the NSX core components
- NSX Manager
- NSX Controller clusters
- VXLAN architecture overview
- Transport zones
- NSX Edge
- Distributed firewall
- Cross-vCenter NSX

(For more resources related to this topic, see here.)

An introduction to the NSX core components

The foundational core components of NSX are divided across three different planes. The core components of an NSX deployment consist of NSX Manager, Controller clusters, and hypervisor kernel modules. Each of these is crucial for your NSX deployment; however, they are decoupled to a certain extent to allow resiliency during the failure of multiple components. For example, if your Controller clusters fail, your virtual machines will still be able to communicate with each other without any network disruption. You have to ensure that the NSX components are always deployed in a clustered environment so that they are protected by vSphere HA.

The high-level architecture of NSX primarily describes three different planes in which each of the core components fits: the Management plane, the Control plane, and the Data plane. The following figure represents how the three planes are interlinked with each other. The management plane is how an end user interacts with NSX as a centralized access point, while the data plane carries north-south and east-west traffic.

Let's look at some of the important components in the preceding figure:

- Management plane: The management plane primarily consists of NSX Manager, a centralized network management component that provides a single management point. It also provides the REST API that a user can use to perform all the NSX functions and actions. During the deployment phase, the management plane is established when the NSX appliance is deployed and configured. The management plane directly interacts with the control plane and also with the data plane. NSX Manager is itself managed via the vSphere web client and the CLI. NSX Manager is configured to interact with vSphere and ESXi, and once configured, all of the NSX components are then configured and managed via the vSphere web GUI.
- Control plane: The control plane consists of the NSX Controllers, which manage the state of virtual networks. NSX Controllers also enable overlay networks (VXLAN), which are multicast-free, making it easier to create new VXLAN networks without having to enable multicast functionality on physical switches. The Controllers also keep track of all the information about the virtual machines, hosts, and VXLAN networks, and can perform ARP suppression as well.
No data passes through the control plane, and a loss of the Controllers does not affect network functionality between virtual machines.

Overlay networks and VXLANs can be used interchangeably. They both represent L2 over L3 virtual networks.

- Data plane: The NSX data plane primarily consists of the NSX logical switch. The NSX logical switch is a part of the vSphere distributed switch and is created when a VXLAN network is created. The logical switch and other NSX services, such as logical routing and the logical firewall, are enabled at the hypervisor kernel level after the installation of the hypervisor kernel modules (VIBs). This logical switch is the key to enabling overlay networks that can encapsulate and send traffic over existing physical networks. It also allows gateway devices that enable L2 bridging between virtual and physical workloads. The data plane receives its updates from the control plane, as hypervisors maintain local virtual machine and VXLAN (logical switch) mapping tables. A loss of the data plane will cause a loss of the overlay (VXLAN) networks, as virtual machines that are part of an NSX logical switch will not be able to send and receive data.

NSX Manager

NSX Manager, once deployed and configured, can deploy the Controller cluster appliances and prepare the ESXi hosts, which involves installing the various vSphere installation bundles (VIBs) that enable network virtualization features such as VXLAN, logical switching, the logical firewall, and logical routing. NSX Manager can also deploy and configure Edge gateway appliances and their services. The NSX version as of this writing is 6.2, which only supports 1:1 vCenter connectivity. NSX Manager is deployed as a single virtual machine and relies on VMware's HA functionality to ensure its availability. There is no NSX Manager clustering available as of this writing. It is important to note that a loss of NSX Manager will lead to a loss of management and API access, but will not disrupt virtual machine connectivity. Finally, NSX Manager's configuration UI allows an administrator to collect log bundles and also to back up the NSX configuration.

NSX Controller clusters

The NSX Controllers provide control plane functionality, distributing logical routing and VXLAN network information to the underlying hypervisors. Controllers are deployed as virtual appliances, and they should be deployed in the same vCenter to which NSX Manager is connected. In a production environment, it is recommended to deploy a minimum of three Controllers. For better availability and scalability, we need to ensure that DRS anti-affinity rules are configured to deploy the Controllers on separate ESXi hosts. The control plane traffic to the management and data planes is secured by certificate-based authentication. It is important to note that Controller nodes employ a scale-out mechanism, where each Controller node uses a slicing mechanism that divides the workload equally across all the nodes. This renders all the Controller nodes active at all times. If one Controller node fails, then the other nodes are reassigned the tasks that were owned by the failed node, to ensure operational status. VMware NSX Controllers use a Paxos-based algorithm within the NSX Controller cluster. The Controllers remove the dependency on multicast routing/PIM in the physical network and also suppress broadcast traffic in VXLAN networks. NSX version 6.2 only supports three Controller nodes.

VXLAN architecture overview

One of the most important functions of NSX is enabling virtual networks.
These virtual networks, or overlay networks, have become very popular because they can leverage existing network infrastructure without the need to modify it in any way. The decoupling of logical networks from the physical infrastructure allows users to scale rapidly. Overlay networks, or VXLAN, were developed by a host of vendors that include Arista, Cisco, Citrix, Red Hat, and Broadcom. Because of this joint development effort, the VXLAN standard can be implemented by multiple vendors.

VXLAN is a layer 2 over layer 3 tunneling protocol that allows logical network segments to extend over routable networks. This is achieved by encapsulating the Ethernet frame with additional UDP, IP, and VXLAN headers. Consequently, this increases the size of the packet by 50 bytes. Hence, VMware recommends increasing the MTU size to a minimum of 1600 bytes for all the interfaces in the physical infrastructure and any associated vSwitches.

When a virtual machine generates traffic meant for another virtual machine on the same virtual network, the hosts on which these source and destination virtual machines run are called VXLAN Tunnel End Points (VTEPs). VTEPs are configured as separate VMkernel interfaces on the hosts. The outer IP header block in the VXLAN frame contains the source and destination IP addresses, which identify the source and destination hypervisors. When a packet leaves the source virtual machine, it is encapsulated at the source hypervisor and sent to the target hypervisor. The target hypervisor, upon receiving this packet, decapsulates the Ethernet frame and forwards it to the destination virtual machine. Once an ESXi host is prepared from NSX Manager, we need to configure its VTEP. NSX supports multiple VXLAN vmknics per host for uplink load balancing, and guest VLAN tagging is also supported.

A sample packet flow

We face a challenging situation when a virtual machine generates traffic (Broadcast, Unknown Unicast, or Multicast, collectively BUM traffic) meant for another virtual machine on the same virtual network (VNI) on a different host. Control plane modes play a crucial role in how the VXLAN traffic is replicated, depending on the mode selected for the logical switch/transport scope:

- Unicast
- Hybrid
- Multicast

By default, a logical switch inherits its replication mode from the transport zone. However, we can also set this on a per-logical-switch basis. A segment ID is needed for the Multicast and Hybrid modes.

The following is a representation of the VXLAN-encapsulated packet showing the VXLAN headers:

As indicated in the preceding figure, the outer IP header identifies the source and destination VTEPs. The VXLAN header also includes the Virtual Network Identifier (VNI), a 24-bit unique network identifier. This allows the scaling of virtual networks beyond the 4094-VLAN limitation of physical switches. Two virtual machines that are part of the same virtual network will have the same virtual network identifier, similar to how two machines on the same VLAN share the same VLAN ID.
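Because every interface in the transport path has to carry the extra 50 bytes of encapsulation, a quick MTU audit is worth doing before preparing hosts. The following is a small sketch using VMware PowerCLI (it assumes PowerCLI is installed and an active Connect-VIServer session is open); the 1600-byte threshold comes from the recommendation above:

$minMtu = 1600   # minimum recommended MTU for VXLAN transport

# Flag every VMkernel adapter whose MTU is below the recommended minimum
foreach ($vmhost in Get-VMHost) {
    Get-VMHostNetworkAdapter -VMHost $vmhost -VMKernel |
        Where-Object { $_.Mtu -lt $minMtu } |
        ForEach-Object {
            Write-Host ("{0} - {1}: MTU {2} (below {3})" -f $vmhost.Name, $_.Name, $_.Mtu, $minMtu)
        }
}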
Transport zones

A group of ESXi hosts that are able to communicate with one another over the physical network by means of VTEPs are said to be in the same transport zone. A transport zone defines the extension of a logical switch across multiple ESXi clusters, which can span multiple virtual distributed switches. A typical environment has more than one virtual distributed switch spanning multiple hosts. A transport zone enables a logical switch to extend across multiple virtual distributed switches, and any ESXi host that is a part of this transport zone can have virtual machines as a part of that logical network. A logical switch is always created as part of a transport zone, and ESXi hosts can participate in that zone. The following figure shows a transport zone that defines the extension of a logical switch across multiple virtual distributed switches:

NSX Edge Services Gateway

The NSX Edge Services Gateway (ESG) offers a feature-rich set of services that include NAT, routing, firewall, load balancing, L2/L3 VPN, and DHCP/DNS relay. The NSX API allows each of these services to be deployed, configured, and consumed on demand. The ESG is deployed as a virtual machine from NSX Manager, which is accessed using the vSphere web client. Four different form factors are offered for differently sized environments, and it is important that you factor in enough resources for the appropriate ESG when building your environment. The following are the available size options for an ESG appliance:

- X-Large: The X-Large form factor is suitable for high performance firewall, load balancer, and routing deployments, or a combination of multiple services. When the X-Large form factor is selected, the ESG is deployed with six vCPUs and 8GB of RAM.
- Quad-Large: The Quad-Large form factor is ideal for a high performance firewall. It is deployed with four vCPUs and 1GB of RAM.
- Large: The Large form factor is suitable for medium performance routing and firewall deployments. It is recommended that, in production, you start with the Large form factor. The Large ESG is deployed with two vCPUs and 1GB of RAM.
- Compact: The Compact form factor is suitable for DHCP and DNS relay functions. It is deployed with one vCPU and 512MB of RAM.

Once deployed, a form factor can be upgraded by using the API or the UI. The upgrade action will incur an outage. Edge gateway services can also be deployed in an Active/Standby mode to ensure high availability and resiliency. A heartbeat network between the Edge appliances ensures state replication and uptime. If the active gateway goes down and the "declared dead time" passes, the standby Edge appliance takes over. The default declared dead time is 15 seconds and can be reduced to 6 seconds.

Let's look at some of the Edge services:

- Network Address Translation: The NSX Edge supports both source and destination NAT, and NAT is allowed for all traffic flowing through the Edge appliance. If the Edge appliance supports more than 100 virtual machines, it is recommended that a Quad-Large instance be deployed to allow high performance translation.
- Routing: The NSX Edge allows centralized routing, so that the logical networks deployed in the NSX domain can be routed to the external physical network. The Edge supports multiple routing protocols, including OSPF, iBGP, and eBGP, and also supports static routing.
- Load balancing: The NSX Edge also offers load balancing functionality for traffic between virtual machines. The load balancer supports different balancing mechanisms, including IP hash, least connections, URI-based, and round robin.
- Firewall: NSX Edge provides stateful firewall functionality that is ideal for north-south traffic flowing between the physical and virtual workloads behind the Edge gateway.
The Edge firewall can be deployed alongside the hypervisor kernel-based distributed firewall, which is primarily used to enforce security policies between workloads on the same logical network.

- L2/L3 VPN: The Edge also provides L2 and L3 VPNs, which make it possible to extend L2 domains between two sites. An IPsec site-to-site connection between two NSX Edges or other VPN termination devices can also be set up.
- DHCP/DNS relay: NSX Edge also offers DHCP and DNS relay functions, allowing you to offload these services to the Edge gateway. The Edge only supports DNS relay functionality and can forward any DNS requests to the DNS server. The Edge gateway can be configured as a DHCP server to provide and manage IP addresses, default gateways, DNS servers, and search domain information for workloads connected to the logical networks.

Distributed firewall

NSX provides L2–L4 stateful firewall services by means of a distributed firewall that runs in the ESXi hypervisor kernel. Because the firewall is a function of the ESXi kernel, it provides massive throughput and performs at near line rate. When the ESXi host is initially prepared by NSX, the distributed firewall service is installed in the kernel by deploying the kernel VIB: the VMware Internetworking Service Insertion Platform, or VSIP. VSIP is responsible for monitoring and enforcing security policies on all the traffic flowing through the data plane. The distributed firewall (DFW) throughput and performance scale horizontally as more ESXi hosts are added.

DFW instances are associated with each vNIC, and every vNIC requires one DFW instance. A virtual machine with two vNICs has two DFW instances associated with it, each monitoring its own vNIC and applying security policies to it. DFW is ideally deployed to protect virtual-to-virtual or virtual-to-physical traffic, which makes it very effective at protecting east-west traffic between workloads that are part of the same logical network. DFW policies can also be used to restrict traffic between virtual machines and external networks, because they are applied at the vNIC of the virtual machine. Any virtual machine that does not require firewall protection can be added to the exclusion list. A diagrammatic representation is shown as follows:

DFW fully supports vMotion, and the rules applied to a virtual machine always follow the virtual machine. This means any manual or automated vMotion triggered by DRS does not cause any disruption in its protection status. The VSIP kernel module also adds spoofguard and traffic redirection functionality. The spoofguard function maintains a VM name and IP address mapping table and protects against IP spoofing. Spoofguard is disabled by default and needs to be manually enabled per logical switch or virtual distributed switch port group. Traffic redirection allows traffic to be redirected to a third-party appliance for enhanced monitoring, if needed. This allows third-party vendors to interface with DFW directly and offer custom services as required.
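Like everything else in NSX, the DFW rule set is also exposed through NSX Manager's REST API, which is handy for auditing rules outside the vSphere web client. The following is a minimal PowerShell sketch; the manager host name is a placeholder, the /api/4.0/firewall/globalroot-0/config path is the NSX-v 6.x DFW configuration endpoint, and the XML element names reflect that version of the API, so verify both against your NSX release:

# Read the distributed firewall configuration and list section/rule names.
# Assumes the NSX Manager certificate is trusted; Basic auth keeps the
# example short - harden both for real use.
$nsxManager = "https://nsx-manager.corp.local"   # hypothetical NSX Manager
$cred = Get-Credential                           # NSX admin credentials
$pair = "{0}:{1}" -f $cred.UserName, $cred.GetNetworkCredential().Password
$headers = @{ Authorization = "Basic " + [Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes($pair)) }

$dfw = Invoke-RestMethod -Uri "$nsxManager/api/4.0/firewall/globalroot-0/config" -Headers $headers -Method Get

foreach ($section in $dfw.firewallConfiguration.layer3Sections.section) {
    Write-Host "Section:" $section.name
    foreach ($rule in $section.rule) { Write-Host "  Rule:" $rule.name }
}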
Cross-vCenter NSX

With NSX 6.2, VMware introduced an interesting feature that allows you to manage multiple vCenter NSX environments using a primary NSX Manager. This allows easy management and also enables lots of new functionality, including extended networks and features such as distributed logical routing. A cross-vCenter NSX deployment also allows centralized management and eases disaster recovery architectures.

In a cross-vCenter deployment, there are multiple vCenters, each paired with its own NSX Manager. One NSX Manager is assigned as the primary, while the other NSX Managers become secondary. The primary NSX Manager can then deploy a universal Controller cluster that provides the control plane. Unlike in a standalone vCenter-NSX deployment, secondary NSX Managers do not deploy their own Controller clusters.

The primary NSX Manager also creates objects whose scope is universal, meaning that these objects extend to all the secondary NSX Managers. These universal objects are synchronized across all the secondary NSX Managers and can be edited and changed by the primary NSX Manager only. This does not prevent you from creating local objects on each of the NSX Managers. Similar to local NSX objects, the primary NSX Manager can create global objects such as universal transport zones, universal logical switches, universal distributed routers, universal firewall rules, and universal security objects. There can be only one universal transport zone in a cross-vCenter NSX environment. After it is created, it is synchronized across all the secondary NSX Managers. When a logical switch is created inside a universal transport zone, it becomes a universal logical switch that spans a layer 2 network across all the vCenters. All traffic is routed using the universal logical router, and any traffic that needs to be routed between a universal logical switch and a logical switch (local scope) requires an ESG.

Summary

We began the article with a brief introduction to the NSX core components, looking at the management, control, and data planes. We then discussed NSX Manager and the NSX Controller clusters. This was followed by a VXLAN architecture overview, where we looked at the VXLAN packet. We then discussed transport zones and NSX Edge gateway services. We ended the article with NSX distributed firewall services and an overview of cross-vCenter NSX deployment.

Resources for Article:

Further resources on this subject:
- vRealize Automation and the Deconstruction of Components [article]
- Monitoring and Troubleshooting Networking [article]
- Managing Pools for Desktops [article]

Chaos Conf 2018 Recap: Chaos engineering hits maturity as community moves towards controlled experimentation

Richard Gall
12 Oct 2018
11 min read
Conferences can sometimes be confusing. Even at the most professional and well-planned conferences, you sometimes just take a minute and think: what's actually the point of this? Am I learning anything? Am I meant to be networking? Will anyone notice if I steal extra food for the journey home?

Chaos Conf 2018 was different, however. It had a clear purpose: to take the first step in properly forging a chaos engineering community. After almost a decade somewhat hidden in the corners of particularly innovative teams at Netflix and Amazon, chaos engineering might feel that its time has come. As software infrastructure becomes more complex and less monolithic, and as business and consumer demands expect more of the software systems that have become integral to the very functioning of life, resiliency has never been more important but more challenging to achieve.

But while it feels like the right time for chaos engineering, it hasn't quite established itself in the mainstream. This is something the conference host, Gremlin, a platform that offers chaos engineering as a service, is acutely aware of. On the one hand it's actively helping push chaos engineering into the hands of businesses, but on the other, its growth and success, backed by millions in VC cash (and faith), depend upon chaos engineering becoming a mainstream discipline in the DevOps and SRE worlds.

It's perhaps for this reason that the conference felt so important. It was, according to Gremlin, the first ever public chaos engineering conference. And while it was relatively small in the grand scheme of many of today's festival-esque conferences attended by thousands of delegates (Dreamforce, the Salesforce conference, was also running in San Francisco in the same week), the fact that the conference had quickly sold out all 350 of its tickets, with more hopefuls on waiting lists, indicates that this was an event that had been eagerly awaited. And with some big names from the industry, notably Adrian Cockcroft from AWS and Jessie Frazelle from Microsoft, Chaos Conf had the air of an event that had outgrown its insider status before it had even begun. The renovated cinema and bar in San Francisco's Mission District, complete with pinball machines upstairs, was the perfect container for a passionate community that had grown out of the clean corporate environs of Silicon Valley to embrace the chaotic mess that resembles modern software engineering.

Kolton Andrus sets out a vision for the future of Gremlin and chaos engineering

Chaos Conf was quick to deliver big news. The keynote speech, by Gremlin co-founder Kolton Andrus, launched Gremlin's brand new Application Level Fault Injection (ALFI) feature, which makes it possible to run chaos experiments at the application level. Andrus broke the news by building towards it with a story of the progression of chaos engineering. Starting with Chaos Monkey, the tool first developed by Netflix, and moving from infrastructure to network, he showed how, as chaos engineering has evolved, it requires and facilitates different levels of control and insight into how your software works.

"As we're maturing, the host level failures and the network level failures are necessary to building a robust and resilient system, but not sufficient. We need more - we need a finer granularity," Andrus explained. This is where ALFI comes in. By allowing Gremlin users to inject failure at the application level, it gives them more control over the 'blast radius' of their chaos experiments.
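To make the idea concrete without standing in for Gremlin's actual ALFI implementation, here is a toy sketch of application-level fault injection: a wrapper that disrupts a controlled fraction of calls, which is essentially what limiting the 'blast radius' means. Every name and the 10% failure rate are invented for the example:

function Invoke-WithChaos {
    param(
        [scriptblock]$Action,         # the real application call
        [double]$FailureRate = 0.1,   # fraction of calls to disrupt (the blast radius)
        [int]$LatencyMs = 500         # injected delay before a simulated failure
    )
    if ((Get-Random -Minimum 0.0 -Maximum 1.0) -lt $FailureRate) {
        Start-Sleep -Milliseconds $LatencyMs    # simulate a slow dependency
        throw "Injected fault: simulated dependency failure"
    }
    & $Action
}

# Roughly 1 in 10 of these calls is delayed and then fails
try   { Invoke-WithChaos -Action { Write-Host "calling the payment service..." } }
catch { Write-Host "Caught: $_" }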
The narrative Andrus was setting was clear, and would ultimately inform the ethos of the day - chaos engineering isn't just about chaos, it's about controlled experimentation to ensure resiliency. Doing that requires a level of intelligence - technical and organizational - about how the various components of your software work, and how humans interact with them.

Adrian Cockcroft on the importance of historical context and domain knowledge

Adrian Cockcroft (@adrianco), VP at AWS, followed Andrus' talk. He took the opportunity to set the broader context of chaos engineering, highlighting how tackling system failures is often a question of culture - how we approach system failure and think about our software. "Developers love to learn things from first principles," he said. "But some historical context and domain knowledge can help illuminate the path and obstacles." If this sounds like Cockcroft was about to stray into theoretical territory, he certainly didn't: he offered a taxonomy of failure that provides a practical framework for thinking through potential failure at every level. He also touched on how he sees the future of resiliency evolving, focusing on:

Observability of systems
Epidemic failure modes
Automation and continuous chaos

The crucial point Cockcroft made is that cloud is the big driver for chaos engineering. "As datacenters migrate to the cloud, fragile and manual disaster recovery will be replaced by chaos engineering" read one of his slides. But more than that, the cloud also paves the way for the future of the discipline, one where 'chaos' is simply an automated part of the test and deployment pipeline.

Selling chaos engineering to your boss

Kriss Rochefolle, DevOps engineer and author of one of the best-selling DevOps books in French, delivered a short talk on how engineers can sell chaos to their boss. He challenged the assumption that a rational proposal, informed by ROI, is the best way to sell chaos engineering. He suggested instead that engineers need to play into emotions, presenting chaos engineering as a method for tackling and minimizing the fear of (inevitable) failure. Follow Kriss on Twitter: @crochefolle

Walmart and chaos engineering

Vilas Veraraghavan, Walmart's Director of Engineering, was keen to clarify that Walmart doesn't practice chaos. Rather, it practices resiliency - chaos engineering is simply a method the organization uses to achieve that. It was particularly useful to note the entire process that Vilas' team adopts when it comes to resiliency, which has largely developed out of Vilas' own work building his team from scratch. You can learn more about how Walmart is using chaos engineering for software resiliency in this post.

Twitter's Ronnie Chen on diving and planning for failure

Ronnie Chen (@rondoftw) is an engineering manager at Twitter. But she didn't talk about Twitter. In fact, she didn't even talk about engineering. Instead she spoke about her experience as a technical diver. By talking about her experiences, Ronnie was able to make a number of vital points about how to manage and tackle failure as a team. With mortality rates so high in diving, it's a good example of the relationship between complexity and risk. Chen made the point that things don't fail because of a single catalyst. Instead, failures - particularly fatal ones - happen because of a 'failure cascade'.
Chen never made the link explicit, but the comparison is clear - the ultimate outcome (i.e. success or failure) is impacted by a whole range of situational and behavioral factors that we can't afford to ignore. Chen also made the point that, in diving, inexperienced people should be at the front of an expedition. "If your inexperienced people are leading, they're learning and growing, and being able to operate with a safety net... when you do this, all kinds of hidden dependencies reveal themselves... every undocumented assumption, every piece of ancient team lore that you didn't even know you were relying on, comes to light."

Charity Majors on the importance of observability

Charity Majors (@mipsytipsy), CEO of Honeycomb, talked in detail about the key differences between monitoring and observability. As with other talks, context was important: a world where architectural complexity has grown rapidly in the space of a decade. Majors made the point that this increase in complexity has taken us from having known unknowns in our architectures to many more unknown unknowns in a distributed system. This means that monitoring is dead - it simply isn't sophisticated enough to deal with the complexities and dependencies within a distributed system. Observability, meanwhile, allows you to understand "what's happening in your systems just by observing it from the outside." Put simply, it lets you understand how your software is functioning from your perspective - almost turning it inside out. Majors then linked the concept of observability to the broader philosophy of chaos engineering - echoing some of the points raised by Adrian Cockcroft in his keynote. But this was her key takeaway: "Software engineers spend too much time looking at code in elaborately falsified environments, and not enough time observing it in the real world." This leads to one conclusion - the importance of testing in production. "Accept no substitute."

Tammy Butow and Ana Medina on making an impact

Tammy Butow (@tammybutow) and Ana Medina (@Ana_M_Medina) from Gremlin took us through how to put chaos engineering into practice - from integrating it into your organizational culture to some practical tests you can run. One of the best examples of putting chaos into practice is Gremlin's concept of 'Failure Fridays', in which chaos testing becomes a valuable step in the product development process - a way to dogfood a product and test how a customer experiences it. Tammy and Ana also suggested using chaos engineering to test new versions of technologies before you properly upgrade in production. To end their talk, they demo'd a chaos battle between EKS (Kubernetes on AWS) and AKS (Kubernetes on Azure), running an app container attack, a packet loss attack, and a region failover attack.

Jessie Frazelle on how containers can empower experimentation

Jessie Frazelle (@jessfraz) didn't actually talk that much about chaos engineering. However, like Ronnie Chen's talk, chaos engineering seeped through what she said about bugs and containers. Bugs, for Frazelle, are a way of exploring how things work, and how different parts of a software infrastructure interact with each other: "Bugs are like my favorite thing... some people really hate when they get one of those bugs that turns out to be a rabbit hole and you're kind of debugging it until the end of time... while debugging those bugs I hate them but afterwards, I'm like, that was crazy!"
This was essentially an endorsement of the core concept of chaos engineering - injecting bugs into your software to understand how it reacts. Jessie then went on to talk about containers, joking that they're NOT REAL. This is because they're made up of numerous different component parts, like cgroups, namespaces, and LSMs. She contrasted containers with virtual machines, zones, and jails, which are 'first class concepts' - in other words, real things (Jessie wrote about this in detail last year in this blog post). In practice, what this means is that whereas containers are like Lego pieces, VMs, zones, and jails are like a pre-assembled Lego set that you don't need to play around with in the same way. From this perspective, it's easy to see how containers are relevant to chaos engineering - they empower a level of experimentation that you simply don't have with other virtualization technologies. "The box says to build the Death Star. But you can build whatever you want."

The chaos ends...

Chaos Conf was undoubtedly a huge success, and a lot of credit has to go to Gremlin for organizing the conference. It's clear that the team cares a lot about the chaos engineering community and wants it to expand in a way that transcends the success of the Gremlin platform. While chaos engineering might not feel relevant to a lot of people at the moment, it's only a matter of time before its impact is felt. That doesn't mean that everyone will suddenly become a chaos engineer by July 2019, but the cultural ripples will likely be felt across the software engineering landscape. Without Chaos Conf, though, it would be difficult to see chaos engineering growing as a discipline or set of practices. By sharing ideas and learning how other people work, a more coherent picture of chaos engineering started to emerge, one that can quickly make an impact in ways people wouldn't have expected six months ago. You can watch videos of all the talks from Chaos Conf 2018 on YouTube.
Importance of Windows RDS in Horizon View

Packt
30 Oct 2014
15 min read
In this article, Jason Ventresco, the author of VMware Horizon View 6 Desktop Virtualization Cookbook, explains Windows Remote Desktop Services (RDS) and how it is implemented in Horizon View. He discusses configuring the Windows RDS server and creating an RDS farm in Horizon View. (For more resources related to this topic, see here.)

Configuring the Windows RDS server for use with Horizon View

This recipe will provide an introduction to the minimum steps required to configure Windows RDS and integrate it with our Horizon View pod. For a more in-depth discussion on Windows RDS optimization and management, consult the Microsoft TechNet page for Windows Server 2012 R2 (http://technet.microsoft.com/en-us/library/hh801901.aspx).

Getting ready

VMware Horizon View supports the following versions of Windows Server for use with RDS:

Windows Server 2008 R2: Standard, Enterprise, or Datacenter, with SP1 or later installed
Windows Server 2012: Standard or Datacenter
Windows Server 2012 R2: Standard or Datacenter

The examples shown in this article were performed on Windows Server 2012 R2. Additionally, all of the applications required have already been installed on the server, which in this case included Microsoft Office 2010. Microsoft Office has specific licensing requirements when used with Windows Server RDS. Consult Microsoft's Licensing of Microsoft Desktop Application Software for Use with Windows Server Remote Desktop Services document (http://www.microsoft.com/licensing/about-licensing/briefs/remote-desktop-services.aspx) for additional information. The Windows RDS feature requires a licensing server component called the Remote Desktop Licensing role service. For reasons of availability, it is not recommended that you install it on the RDS host itself, but rather on an existing server that serves some other function, or even on a dedicated server if possible. Ideally, the RDS licensing role should be installed on multiple servers for redundancy reasons. The Remote Desktop Licensing role service is different from the Microsoft Windows Key Management System (KMS), as it is used solely for Windows RDS hosts. Consult the Microsoft TechNet article, RD Licensing Configuration on Windows Server 2012 (http://blogs.technet.com/b/askperf/archive/2013/09/20/rd-licensing-configuration-on-windows-server-2012.aspx), for the steps required to install the Remote Desktop Licensing role service. Additionally, consult the Microsoft document Licensing Windows Server 2012 R2 Remote Desktop Services (http://download.microsoft.com/download/3/D/4/3D42BDC2-6725-4B29-B75A-A5B04179958B/WindowsServerRDS_VLBrief.pdf) for information about the licensing options for Windows RDS, which include both per-user and per-device options.

Windows RDS host – hardware recommendations

The following resources represent a starting point for assigning CPU and RAM resources to Windows RDS hosts. The actual resources required will vary based on the applications being used and the number of concurrent users, so it is important to monitor server utilization and adjust the CPU and RAM specifications if required.
The following are the requirements:

One vCPU for every 15 concurrent RDS sessions
A base amount of RAM equal to 2 GB per vCPU, plus 64 MB of additional RAM for each concurrent RDS session
Additional RAM equal to the application requirements, multiplied by the estimated number of concurrent users of the application
Sufficient hard drive space to store RDS user profiles, which will vary based on the configuration of the Windows RDS host: Windows RDS supports multiple options to control user profile configuration and growth, including an RD user home directory, RD roaming user profiles, and mandatory profiles. For information about these and other options, consult the Microsoft TechNet article, Manage User Profiles for Remote Desktop Services, at http://technet.microsoft.com/en-us/library/cc742820.aspx. This space is only required if you intend to store user profiles locally on the RDS hosts.

Horizon View Persona Management is not supported and will not work with Windows RDS hosts. Consider native Microsoft features such as those described previously in this recipe, or third-party tools such as AppSense Environment Manager (http://www.appsense.com/products/desktop/desktopnow/environment-manager). Based on these values, a Windows Server 2012 R2 RDS host running Microsoft Office 2010 that will support 100 concurrent users will require the following resources:

Seven vCPUs to support up to 105 concurrent RDS sessions
45.25 GB of RAM, based on the following calculations: 20.25 GB of base RAM (2 GB for each vCPU, plus 64 MB for each of the 100 users), plus a total of 25 GB of additional RAM to support Microsoft Office 2010 (Office 2010 recommends 256 MB of RAM for each user)

While the vCPU and RAM requirements might seem excessive at first, remember that to deploy a virtual desktop for each of these 100 users, we would need at least 100 vCPUs and 100 GB of RAM, which is much more than what our Windows RDS host requires. By default, Horizon View allows only 150 unique RDS user sessions for each available Windows RDS host, so we need to deploy multiple RDS hosts if users need to stream two applications at once or if we anticipate having more than 150 connections. It is possible to change the number of supported sessions, but it is not recommended due to potential performance issues.

Importing the Horizon View RDS AD group policy templates

Some of the settings configured throughout this article are applied using AD group policy templates. Prior to using the RDS feature, these templates should be distributed either to the RDS hosts, in order to be used with the Windows local group policy editor, or to an AD domain controller, where they can be applied using the domain. Complete the following steps to install the View RDS group policy templates: When referring to VMware Horizon View installation packages, y.y.y refers to the version number and xxxxxx refers to the build number. When you download packages, the actual version and build numbers will be in a numeric format. For example, the filename of the current Horizon View 6 GPO bundle is VMware-Horizon-View-Extras-Bundle-3.1.0-2085634.zip. Obtain the VMware-Horizon-View-GPO-Bundle-y.y.y-xxxxxx.zip file, unzip it, and copy the en-US folder, the vmware_rdsh.admx file, and the vmware_rdsh_server.admx file to the C:\Windows\PolicyDefinitions folder on either an AD domain controller or your target RDS host, based on how you wish to manage the policies.
Make note of the following points while doing so:

If you want to set the policies locally on each RDS host, you will need to copy the files to each server.
If you wish to set the policies using domain-based AD group policies, you will need to copy the files to the domain controllers, the group policy Central Store (http://support.microsoft.com/kb/929841), or to the workstation from which you manage these domain-based group policies.

How to do it…

The following steps outline the procedure to enable RDS on a Windows Server 2012 R2 host. The host used in this recipe has already been joined to the domain and logged in to with an AD account that has administrative permissions on the server. Perform the following steps:

Open the Windows Server Manager utility and go to Manage | Add Roles and Features to open the Add Roles and Features Wizard. On the Before you Begin page, click on Next. On the Installation Type page, shown in the following screenshot, select Remote Desktop Services installation and click on Next. On the Deployment Type page, select Quick Start and click on Next. You can also implement the required roles using the standard deployment method outlined in the Deploy the Session Virtualization Standard deployment section of the Microsoft TechNet article, Test Lab Guide: Remote Desktop Services Session Virtualization Standard Deployment (http://technet.microsoft.com/en-us/library/hh831610.aspx). If you use this method, you will complete the component installation and proceed to step 9 in this recipe. On the Deployment Scenario page, select Session-based desktop deployment and click on Next. On the Server Selection page, select a server from the list under Server Pool, click the red, highlighted button to add the server to the list of selected servers, and click on Next. This is shown in the following screenshot: On the Confirmation page, check the box marked Restart the destination server automatically if required and click on Deploy. On the Completion page, monitor the installation process and click on Close when finished in order to complete the installation. If a reboot is required, the server will reboot without the need to click on Close. Once the reboot completes, proceed with the remaining steps. Set the RDS licensing server using the Set-RDLicenseConfiguration Windows PowerShell command. In this example, we are configuring the local RDS host to point to redundant license servers (RDS-LIC1 and RDS-LIC2) and setting the license mode to PerUser. This command must be executed on the target RDS host. After entering the command, confirm the values for the license mode and license server name by answering Y when prompted. Refer to the following code:

Set-RDLicenseConfiguration -LicenseServer @("RDS-LIC1.vjason.local","RDS-LIC2.vjason.local") -Mode PerUser

This setting can also be applied using group policies, either on the local computer or using Active Directory (AD). The policies are shown in the following screenshot, and you can locate them by going to Computer Configuration | Policies | Administrative Templates | Windows Components | Remote Desktop Services | Remote Desktop Session Host | Licensing when using AD-based policies. If you are using local group policies, there will be no Policies folder in the path. Use local computer or AD group policies to limit users to one session per RDS host using the Restrict Remote Desktop Services users to a single Remote Desktop Services session policy.
The policy is shown in the following screenshot, and you can locate it by navigating to Computer Configuration | Policies | Administrative Templates | Windows Components | Remote Desktop Services | Remote Desktop Session Host | Connections:

Use local computer or AD group policies to enable time zone redirection. You can locate the policy by navigating to Computer Configuration | Policies | Administrative Templates | Windows Components | Horizon View RDSH Services | Remote Desktop Session Host | Device and Resource Redirection when using AD-based policies. If you are using local group policies, there will be no Policies folder in the path. To enable the setting, set Allow time zone redirection to Enabled. Use local computer or AD group policies to enable the Windows Basic aero-styled theme. You can locate the policy by going to User Configuration | Policies | Administrative Templates | Control Panel | Personalization when using AD-based policies. If you are using local group policies, there will be no Policies folder in the path. To configure the theme, set Force a specific visual style file or force Windows Classic to Enabled and set Path to Visual Style to %windir%\resources\Themes\Aero\aero.msstyles. Use local computer or AD group policies to start Runonce.exe when the RDS session starts. You can locate the policy by going to User Configuration | Policies | Windows Settings | Scripts (Logon/Logoff) when using AD-based policies. If you are using local group policies, there will be no Policies folder in the path. To configure the logon settings, double-click on Logon, click on Add, enter runonce.exe in the Script Name box, and enter /AlternateShellStartup in the Script Parameters box. On the Windows RDS host, double-click on the 64-bit Horizon View Agent installer to begin the installation process. The installer should have a name similar to VMware-viewagent-x86_64-y.y.y-xxxxxx.exe. On the Welcome to the Installation Wizard for VMware Horizon View Agent page, click on Next. On the License Agreement page, select the I accept the terms in the license agreement radio button and click on Next. On the Custom Setup page, either leave all the options set to default or, if you are not using vCenter Operations Manager, deselect this optional component of the agent, and click on Next. On the Register with Horizon View Connection Server page, shown in the following screenshot, enter the hostname or IP address of one of the Connection Servers in the pod where the RDS host will be used. If the user performing the installation of the agent software is an administrator in the Horizon View environment, leave the Authentication setting set to default; otherwise, select the Specify administrator credentials radio button and provide the username and password of an account that has administrative rights in Horizon View. Click on Next to continue:

On the Ready to Install the Program page, click on Install to begin the installation. When the installation completes, reboot the server if prompted. The Windows RDS service is now enabled, configured with the optimal settings for use with VMware Horizon View, and has the necessary agent software installed. This process should be repeated on additional RDS hosts, as needed, to support the target number of concurrent RDS sessions.
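If you need to roll the agent out to many RDS hosts, the installer can also be run silently from a script. A minimal sketch follows; the property names are assumptions drawn from typical View Agent silent installs, so validate them against the VMware Horizon View documentation for your release before using this:

VMware-viewagent-x86_64-y.y.y-xxxxxx.exe /s /v"/qn VDM_SERVER_NAME=viewcs01.vjason.local VDM_SERVER_USERNAME=vjason\administrator VDM_SERVER_PASSWORD=MySecret1"

Here, viewcs01.vjason.local stands in for one of your View Connection Servers, and the credentials belong to an account with administrative rights in Horizon View.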
How it works…

The following resources provide detailed information about the configuration options used in this recipe:

Microsoft TechNet's Set-RDLicenseConfiguration article at http://technet.microsoft.com/en-us/library/jj215465.aspx provides the complete syntax of the PowerShell command used to configure the RDS licensing settings.
Microsoft TechNet's Remote Desktop Services Client Access Licenses (RDS CALs) article at http://technet.microsoft.com/en-us/library/cc753650.aspx explains the different RDS license types, which reveals that an RDS per-user Client Access License (CAL) allows our Horizon View clients to access the RDS servers from an unlimited number of endpoints while still consuming only one RDS license.
The Microsoft TechNet article, Remote Desktop Session Host, Licensing (http://technet.microsoft.com/en-us/library/ee791926(v=ws.10).aspx) provides additional information on the group policies used to configure the RDS licensing options.
The VMware document Setting up Desktop and Application Pools in View (https://pubs.vmware.com/horizon-view-60/index.jsp?topic=%2Fcom.vmware.horizon-view.desktops.doc%2FGUID-931FF6F3-44C1-4102-94FE-3C9BFFF8E38D.html) explains that the Windows Basic aero-styled theme is the only theme supported by Horizon View, and demonstrates how to implement it.
The VMware document Setting up Desktop and Application Pools in View (https://pubs.vmware.com/horizon-view-60/topic/com.vmware.horizon-view.desktops.doc/GUID-443F9F6D-C9CB-4CD9-A783-7CC5243FBD51.html) explains why time zone redirection is required, as it ensures that the Horizon View RDS client session will use the same time zone as the client device.
The VMware document Setting up Desktop and Application Pools in View (https://pubs.vmware.com/horizon-view-60/topic/com.vmware.horizon-view.desktops.doc/GUID-85E4EE7A-9371-483E-A0C8-515CF11EE51D.html) explains why we need to add the runonce.exe /AlternateShellStartup command to the RDS logon script. This ensures that applications which require Windows Explorer will work properly when streamed using Horizon View.

Creating an RDS farm in Horizon View

This recipe will discuss the steps that are required to create an RDS farm in our Horizon View pod. An RDS farm is a collection of Windows RDS hosts and serves as the point of integration between the View Connection Server and the individual applications installed on each RDS server. Additionally, key settings concerning client session handling and client connection protocols are set at the RDS farm level within Horizon View.

Getting ready

To create an RDS farm in Horizon View, we need to have at least one RDS host registered with our View pod. Assuming that the Horizon View Agent installation completed successfully in the previous recipe, we should see the RDS hosts registered in the Registered Machines menu under View Configuration of our View Manager Admin console. The tasks required to create the RDS farm are performed using the Horizon View Manager Admin console.

How to do it…

The following steps outline the procedure used to create an RDS farm. In this example, we have already created and registered two Windows RDS hosts named WINRDS01 and WINRDS02. Perform the following steps:

Navigate to Resources | Farms and click on Add, as shown in the following screenshot: On the Identification and Settings page, shown in the following screenshot, provide a farm ID, a description if desired, make any desired changes to the default settings, and then click on Next.
The settings can be changed to On if needed: On the Select RDS Hosts page, shown in the following screenshot, click on the RDS hosts to be added to the farm and then click on Next: On the Ready to Complete page, review the configuration and click on Finish. The RDS farm has been created, which allows us to create application pools.

How it works…

The following RDS farm settings can be changed at any time and are described in the following points:

Default display protocol: PCoIP (default) and RDP are available.
Allow users to choose protocol: By default, Horizon View Clients can select their preferred protocol; we can change this setting to No in order to enforce the farm default.
Empty session timeout (applications only): This denotes the amount of time that must pass after a client closes all RDS applications before the RDS farm takes the action specified in the When timeout occurs setting. The default setting is 1 minute.
When timeout occurs: This determines which action is taken by the RDS farm when the session's timeout deadline passes; the options are Log off or Disconnect (default).
Log off disconnected sessions: This determines what happens when a View RDS session is disconnected; the options are Never (default), Immediate, or After. If After is selected, a time in minutes must be provided.

Summary

We have learned about configuring the Windows RDS server for use with Horizon View and about creating an RDS farm in Horizon View.

Resources for Article:

Further resources on this subject: Backups in the VMware View Infrastructure [Article] An Introduction to VMware Horizon Mirage [Article] Designing and Building a Horizon View 6.0 Infrastructure [Article]
OpenStack Networking in a Nutshell

Packt
22 Sep 2016
13 min read
Information technology (IT) applications are rapidly moving from dedicated infrastructure to cloud-based infrastructure. This move to the cloud started with server virtualization, where a hardware server runs as a virtual machine on a hypervisor. The adoption of cloud-based applications has accelerated due to factors such as globalization and outsourcing, where diverse teams need to collaborate in real time. Server hardware connects to network switches using Ethernet and IP to establish network connectivity. However, as servers move from physical to virtual, the network boundary also moves from the physical network to the virtual network. Traditionally, applications, servers, and networking were tightly integrated, but modern enterprises and IT infrastructure demand flexibility in order to support complex applications. The flexibility of cloud infrastructure requires networking to be dynamic and scalable. Software Defined Networking (SDN) and Network Functions Virtualization (NFV) play a critical role in data centers in order to deliver the flexibility and agility demanded by cloud-based applications. By providing practical management tools and abstractions that hide the underlying physical network's complexity, SDN allows operators to build complex networking capabilities on demand. OpenStack is an open source cloud platform that helps build public and private clouds at scale. Within OpenStack, the name of the OpenStack Networking project is Neutron. The functionality of Neutron can be classified as core and service. This article by Sriram Subramanian and Sreenivas Voruganti, authors of the book Software Defined Networking (SDN) with OpenStack, aims to provide a short introduction to OpenStack Networking. We will cover the following topics in this article:

Understanding traffic flows between virtual and physical networks
Neutron entities that support Layer 2 (L2) networking
Layer 3 (L3) routing between OpenStack Networks
Securing OpenStack network traffic
Advanced networking services in OpenStack
OpenStack and SDN

The terms Neutron and OpenStack Networking are used interchangeably throughout this article. (For more resources related to this topic, see here.)

Virtual and physical networking

Server virtualization led to the adoption of virtualized applications and workloads running inside physical servers. While physical servers are connected to the physical network equipment, modern networking has pushed the boundary of networking into the virtual domain as well. Virtual switches, firewalls, and routers play a critical role in the flexibility provided by cloud infrastructure.

Figure 1: Networking components for server virtualization

The preceding figure describes a typical virtualized server and the various networking components. The virtual machines are connected to a virtual switch inside the compute node (or server). The traffic is secured using virtual routers and firewalls. The compute node is connected to a physical switch, which is the entry point into the physical network. Let us now walk through different traffic flow scenarios using the picture above as the background. In Figure 2, traffic from one VM to another on the same compute node is forwarded by the virtual switch itself; it does not reach the physical network. You can even apply firewall rules to traffic between the two virtual machines.

Figure 2: Traffic flow between two virtual machines on the same server

Next, let us have a look at how traffic flows between virtual machines across two compute nodes.
In Figure 3, the traffic comes out of the first compute node and reaches the physical switch. The physical switch forwards the traffic to the second compute node, and the virtual switch within the second compute node steers the traffic to the appropriate VM.

Figure 3: Traffic flow between two virtual machines on different servers

Finally, here is the depiction of traffic flow when a virtual machine sends or receives traffic from the Internet. The physical switch forwards the traffic to the physical router and firewall, which is presumed to be connected to the Internet.

Figure 4: Traffic flow from a virtual machine to an external network

As seen from the above diagrams, the physical and the virtual network components work together to provide connectivity to virtual machines and applications.

Tenant isolation

As a cloud platform, OpenStack supports multiple users grouped into tenants. One of the key requirements of a multi-tenant cloud is to isolate the data traffic belonging to one tenant from the rest of the tenants that use the same infrastructure. OpenStack supports different ways of achieving isolation, and it is the responsibility of the virtual switch to implement it.

Layer 2 (L2) capabilities in OpenStack

The connectivity to a physical or virtual switch is also known as Layer 2 (L2) connectivity in networking terminology. Layer 2 connectivity is the most fundamental form of network connectivity needed for virtual machines. As mentioned earlier, OpenStack supports core and service functionality. The L2 connectivity for virtual machines falls under the core capability of OpenStack Networking, whereas Router, Firewall, and so on fall under the service category. The L2 connectivity in OpenStack is realized using two constructs, called Network and Subnet. Operators can use the OpenStack CLI or the web interface to create Networks and Subnets, and when virtual machines are instantiated, operators can associate them with the appropriate Networks.

Creating a Network using OpenStack CLI

A Network defines the Layer 2 (L2) boundary for all the instances that are associated with it. All the virtual machines within a Network are part of the same L2 broadcast domain. The Liberty release introduced a new OpenStack CLI (Command Line Interface) for different services; we will use the new CLI to create a Network.

Creating a Subnet using OpenStack CLI

A Subnet is a range of IP addresses that are assigned to virtual machines on the associated Network. OpenStack Neutron configures a DHCP server with this IP address range and, by default, starts one DHCP server instance per Network. Note: Unlike Network, for Subnet, we need to use the regular neutron CLI command in the Liberty release. Both commands are sketched below.
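The original recipe showed these two commands as screenshots. A plausible reconstruction for a Liberty-era deployment follows; the names Network1 and Subnet1 and the 192.168.10.0/24 range are illustrative assumptions:

openstack network create Network1
neutron subnet-create Network1 192.168.10.0/24 --name Subnet1

The first command uses the new OpenStack CLI to create the Network; the second uses the regular neutron CLI to create a Subnet on it, as noted above.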
Associating a Network and Subnet to a virtual machine

To give a complete perspective, we will create a virtual machine using the OpenStack web interface and show you how to associate a Network and Subnet to it. In your OpenStack web interface, navigate to Project | Compute | Instances. Click on the Launch Instances action on the right-hand side. In the resulting window, enter the name for your instance and choose how you want to boot it. To associate a network and a subnet with the instance, click on the Networking tab. If you have more than one tenant network, you will be able to choose the network you want to associate with the instance; if you have exactly one network, the web interface will automatically select it. As mentioned earlier, providing isolation for tenant network traffic is a key requirement for any cloud. OpenStack Neutron uses Network and Subnet to define the boundaries and isolate data traffic between different tenants. Depending on the Neutron configuration, the actual isolation of traffic is accomplished by the virtual switches; VLAN and VXLAN are common networking technologies used to isolate traffic.

Layer 3 (L3) capabilities in OpenStack

Once L2 connectivity is established, the virtual machines within one Network can send or receive traffic between themselves. However, two virtual machines belonging to two different Networks will not be able to communicate with each other automatically. This is done to provide privacy and isolation for tenant networks. In order to allow traffic from one Network to reach another Network, OpenStack Networking supports an entity called Router. The default implementation of OpenStack uses namespaces to support L3 routing capabilities.

Creating a Router using OpenStack CLI

Operators can create Routers using the OpenStack CLI or the web interface. They can then add more than one Subnet as an interface to the Router, which allows the Networks associated with the Router to exchange traffic with one another. The command to create a Router with a specified name is sketched at the end of this section.

Associating a Subnet to a Router

Once a Router is created, the next step is to associate one or more Subnets to it. After running the association command (also sketched at the end of this section), the Subnet represented by subnet1 is associated to the Router router1.

Securing network traffic in OpenStack

The security of network traffic is critical, and OpenStack supports two mechanisms to secure it. Security groups allow traffic within a tenant's network to be secured; Linux iptables on the compute nodes are used to implement OpenStack security groups. The traffic that goes outside of a tenant's network - to another Network or the Internet - is secured using the OpenStack Firewall Service functionality. Like routing, Firewall is a service within Neutron. The Firewall service also uses iptables, but the scope of iptables is limited to the OpenStack Router used as part of the Firewall Service.

Using security groups to secure traffic within a network

In order to secure traffic going from one VM to another within a given Network, we must create a security group. The next step is to create one or more rules within the security group. As an example, let us create a rule which allows only incoming UDP traffic on port 8080 from any source IP address. The final step is to associate this security group and its rules to a virtual machine instance, using the nova boot command. All of these commands are sketched at the end of this section. Once the virtual machine instance has a security group associated with it, the incoming traffic will be monitored and, depending upon the rules inside the security group, data traffic may be blocked or permitted to reach the virtual machine. Note: it is possible to block ingress or egress traffic using security groups.
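The commands referenced in this section were shown as screenshots in the original article. A rough sketch of the sequence follows; router1 and subnet1 match the names used above, while sec1, the image, the flavor, and vm1 are illustrative assumptions:

neutron router-create router1
neutron router-interface-add router1 subnet1
neutron security-group-create sec1
neutron security-group-rule-create --direction ingress --protocol udp --port-range-min 8080 --port-range-max 8080 sec1
nova boot --image cirros --flavor m1.tiny --security-groups sec1 vm1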
Using firewall service to secure traffic

We have seen that security groups provide fine-grained control over what traffic is allowed to and from a virtual machine instance. Another layer of security supported by OpenStack is Firewall as a Service (FWaaS). FWaaS enforces security at the Router level, whereas security groups enforce security at the virtual machine interface level. The main use case of FWaaS is to protect all virtual machine instances within a Network from threats and attacks from outside the Network. These could come from virtual machines that are part of another Network in the same OpenStack cloud, or from some entity on the Internet attempting unauthorized access. Let us now see how FWaaS is used in OpenStack. In FWaaS, a set of firewall rules is grouped into a firewall policy, and a firewall is then created that implements one policy at a time. This firewall is then associated to a Router. A firewall rule can be created using the neutron firewall-rule-create command; as an example, a rule that blocks the ICMP protocol will stop applications like ping at the firewall. The next step is to create a firewall policy. In real-world scenarios, security administrators will define several rules and consolidate them under a single policy; for example, all rules that block various types of traffic can be combined into a single policy. The final step is to create a firewall and associate it with a router. All three commands are sketched at the end of this section. If no Router is specified when creating the firewall, the OpenStack behavior is to associate the firewall (and, in turn, the policy and rules) with all the Routers available for that tenant. The neutron firewall-create command supports an option to pick a specific Router as well.
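Again, the original showed these commands as screenshots. A plausible sketch, with rule1, policy1, and fw1 as illustrative names:

neutron firewall-rule-create --protocol icmp --action deny --name rule1
neutron firewall-policy-create --firewall-rules rule1 policy1
neutron firewall-create policy1 --name fw1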
Advanced networking services

Besides routing and firewall, there are a few other commonly used networking technologies supported by OpenStack. Let's take a quick look at these without delving deep into the respective commands.

Load Balancing as a Service (LBaaS)

Virtual machine instances created in OpenStack are used to run applications. Most applications are required to support redundancy and concurrent access; for example, a web server may be accessed by a large number of users at the same time. One of the common strategies to handle scale and redundancy is to implement load balancing for incoming requests. In this approach, a Load Balancer distributes incoming service requests onto a pool of servers, which process the requests, thus providing higher throughput. If one of the servers in the pool fails, the Load Balancer removes it from the pool and the subsequent service requests are distributed among the remaining servers. Users of the application use the IP address of the Load Balancer to access the application and are unaware of the pool of servers. OpenStack implements the Load Balancer using HAProxy software and a Linux namespace.

Virtual Private Network as a Service (VPNaaS)

As mentioned earlier, tenant isolation requires data traffic to be segregated and secured within an OpenStack cloud. However, there are times when external entities need to be part of the same Network without removing the firewall-based security. This can be accomplished using a Virtual Private Network (VPN). A VPN connects two endpoints on different networks over a public Internet connection, such that the endpoints appear to be directly connected to each other. VPNs also provide confidentiality and integrity of transmitted data. Neutron provides a service plugin that enables OpenStack users to connect two networks using a VPN. The reference implementation of the VPN plugin in Neutron uses Openswan to create an IPSec-based VPN. IPSec is a suite of protocols that provides a secure connection between two endpoints by encrypting each IP packet transferred between them.

OpenStack and SDN context

So far in this article, we have seen the different networking capabilities provided by OpenStack. Let us now look at two capabilities in OpenStack that enable SDN to be leveraged effectively.

Choice of technology

OpenStack, being an open source platform, bundles open source networking solutions as the default implementation for these networking capabilities. For example, routing is supported using namespaces, security using iptables, and load balancing using HAProxy. Historically, these networking capabilities were implemented using customized hardware and software, most of them being proprietary solutions. These custom solutions are capable of much higher performance and are well supported by their vendors, and hence they have a place in the OpenStack and SDN ecosystem. From its initial releases, OpenStack has been designed for extensibility. Vendors can write their own extensions and then easily configure OpenStack to use their extension instead of the default solutions. This allows operators to deploy the networking technology of their choice.

OpenStack API for networking

One of the most powerful capabilities of OpenStack is the extensive support for APIs. All services of OpenStack interact with one another using well-defined RESTful APIs. This allows custom implementations and pluggable components to provide powerful enhancements for practical cloud implementation. For example, when a Network is created using the OpenStack web interface, a RESTful request is sent to the Horizon service. This in turn invokes a RESTful API to validate the user using the Keystone service. Once validated, Horizon sends another RESTful API request to Neutron to actually create the Network.

Summary

As seen in this article, OpenStack supports a wide variety of networking functionality right out of the box. The importance of isolating tenant traffic and the need to allow customized solutions require OpenStack to support flexible configuration. We also highlighted some key aspects of OpenStack that will play a key role in deploying Software Defined Networking in datacenters, thereby supporting powerful cloud architectures and solutions.

Resources for Article:

Further resources on this subject: Setting Up a Network Backup Server with Bacula [article] Jenkins 2.0: The impetus for DevOps Movement [article] Integrating Accumulo into Various Cloud Platforms [article]
Citrix XenApp Performance Essentials

Packt
21 Aug 2013
18 min read
(For more resources related to this topic, see here.)

Optimizing Session Startup

The most frequent complaint that system administrators receive from users about XenApp is definitely that applications start slowly. Users rarely consider that, at least the first time they launch an application published by XenApp, an entire logon process takes place. In this article you'll learn:

Which steps form the logon process and which systems are involved
The most common causes of logon delays and how to mitigate them
The use of some advanced XenApp features, like session pre-launch

The logon process

Let's briefly review the logon process that starts when a user launches an application through the Web Interface or through a link created by the Receiver. The following diagram explains the logon process:

The logon process

Resolution: The user launches an application (A) and the Web Interface queries the Data Collector (B), which returns the least-loaded server for the requested application (C). The Web Interface generates an ICA file and passes it to the client (D).

Connection: The Citrix client running on the user's PC establishes a connection to the session-host server specified in the ICA file. In the handshake process, client and server agree on the security level and capabilities.

Remote Desktop Services (RDS) license: Windows Server validates that an RDS/Terminal Server (TS) license is available.

AD authentication: Windows Server authenticates the user against the Active Directory (AD). If the authentication is successful, the server queries account details from the AD, including Group Policies (GPOs) and roaming profiles.

Citrix license: XenApp validates that a Citrix license is available.

Session startup: If the user has a roaming profile, Windows downloads it from the specified location (usually a file server). Windows then applies any GPOs and XenApp applies any Citrix policies. Windows executes applications included in the Startup menu and finally launches the requested application.

Some other steps may be necessary if other Citrix components (for example, the Citrix Access Gateway) are included in your infrastructure.

Analysing the logon process

Users perceive the overall duration of the process as the time from when they click on the icon until the appearance of the application on their desktop. To troubleshoot slowness, a system administrator must know the duration of the individual steps.

Citrix EdgeSight

Citrix EdgeSight is a performance and availability management solution for XenApp and XenDesktop. If you own an Enterprise or Platinum XenApp license, you're entitled to install EdgeSight Basic (for Enterprise licensing) or Advanced (for Platinum licensing). You can also license it as a standalone product. If you deployed Citrix EdgeSight in your farm, you can run the Session Startup Duration Detail report, which includes information on the duration of both server-side and client-side steps. This report is available only with EdgeSight Advanced. For each session, you can drill down the report to display information about server-side and client-side startup processes. An example is shown in the following screenshot:

EdgeSight's Session Startup Duration Detail report

The columns report the time (in milliseconds) spent by the startup process in the different steps. SSD is the total server-side time, while CSD is the total client-side time.
You can find a full description of the available reports and the meaning of the different acronyms in the EdgeSight Report List at http://community.citrix.com/display/edgesight/EdgeSight+5.4+Report+List. In the preceding example, most of the time was spent in the Profile Load (PLSD) and Login Script Execution (LSESD) steps on the server and in the Session Creation (SCCD) step on the client. EdgeSight is a very valuable tool to analyze your farm. The available reports cover all the critical areas and give detailed information about the "hidden" work of Citrix XenApp. With the Session Startup Duration Detail report you can identify which steps cause a slow session startup. If you want to understand why a server-side step takes too much time - like the Profile Load step in the preceding example, which lasted more than 15 seconds - you need a different tool.

Windows Performance Toolkit

Windows Performance Toolkit (WPT) is a tool included in the Windows ADK, freely downloadable from the Microsoft website (http://www.microsoft.com/en-us/download/details.aspx?id=30652). You need an Internet connection to install the Windows ADK. You can run the setup on a client with Internet access and configure it to download all the required components to a folder. Move the folder to your server and perform an offline installation. WPT has two components:

Windows Performance Recorder, which is used to record all the performance data in an .etl file
Windows Performance Analyzer, a graphical program to analyze the recorded data

Run the following command from the WPT installation folder to capture slow logons:

C:\WPT>xperf -on base+latency+dispatcher+NetworkTrace+Registry+FileIO -stackWalk CSwitch+ReadyThread+ThreadCreate+Profile -BufferSize 128 -start UserTrace -on "Microsoft-Windows-Shell-Core+Microsoft-Windows-Wininit+Microsoft-Windows-Folder Redirection+Microsoft-Windows-User Profiles Service+Microsoft-Windows-GroupPolicy+Microsoft-Windows-Winlogon+Microsoft-Windows-Security-Kerberos+Microsoft-Windows-User Profiles General+e5ba83f6-07d0-46b1-8bc7-7e669a1d31dc+63b530f8-29c9-4880-a5b4-b8179096e7b8+2f07e2ee-15db-40f1-90ef-9d7ba282188a" -BufferSize 1024 -MinBuffers 64 -MaxBuffers 128 -MaxFile 1024

After having recorded at least one slow logon, run the following command to stop recording and save the performance data to an .etl file:

C:\WPT>xperf -stop -stop UserTrace -d merged.etl

This command creates a file called merged.etl in the WPT install folder. You can open this file with Windows Performance Analyzer. The Windows Performance Analyzer timeline is shown in the following screenshot:

Windows Performance Analyzer timeline

Windows Performance Analyzer shows a timeline with the duration of each step; for any point in time you can view the running processes, the usage of CPU and memory, the number of I/O operations, and the bytes sent or received through the network. WPT is a great tool to identify the reason for delays in Windows; it has, however, no visibility into Citrix processes. This is why EdgeSight is still necessary for complete troubleshooting.

Common causes of logon delays

After having analyzed many logon problems, I found that the slowness was usually caused by one or more of the following reasons:

Authentication issues
Profile issues
GPO and logon script issues

In the next paragraphs, you'll learn how to identify those issues and how to mitigate them.
Even if you can't use the advanced tools (EdgeSight, WPT, and so on) described in the preceding sections, you can follow the guidelines and best practices in the next sections to solve or prevent most of the problems related to the logon process.

Authentication issues

During the logon process, authentication happens at multiple stages; at minimum, when a user logs on to the Web Interface and when the session-host server creates a session for launching the requested application. Citrix XenApp integrates with Active Directory; the authentication is therefore performed by a Domain Controller (DC) server of your domain. Slowness in the Domain Controller's response, caused by an overloaded server, can slow down the entire process. Worse, if the Domain Controller is unavailable, a domain member server may try to connect for 30 seconds before timing out and choosing a different DC. Domain member servers choose the Domain Controller for authenticating users based on their membership of Active Directory Sites. If sites are not correctly configured or don't reflect the real topology of your network, a domain member server may decide to use a remote Domain Controller, through a slow WAN link, instead of using a Domain Controller on the same LAN.

Profile issues

Each user has a profile, which is a collection of personal files and settings. Windows offers different types of profiles, with advantages and disadvantages, as shown in the following table:

Local: The profile folder is local to each server.
Roaming: The profile folder is saved on central storage (usually a file server).
Mandatory: A read-only profile is assigned to users; changes are not saved across sessions.

From the administrator's point of view, mandatory profiles are the best option because they are simple to maintain, allow fast logons, and users can't modify Windows or application settings. This option, however, is not often feasible. I could use mandatory profiles only in specific cases, for example, when users have to run only a single application without the need to customize it. Local profiles are almost never used in a XenApp environment because, even if they offer the fastest logon time, they are not consistent across servers and sessions. Furthermore, you'll end up with all your session-host servers storing local profiles for all your users, and that is a waste of disk space. Finally, if you're provisioning your servers with Provisioning Server, this option is not applicable, as local profiles would be saved in the local cache, which is deleted every time the server reboots. System administrators usually choose roaming profiles for their users. Roaming profiles indeed allow consistency across servers and sessions and preserve user settings. Roaming profiles are, however, the most significant cause of slow logons. Without continuous control, they can rapidly grow to a large size. A small profile with a large number of files, for example, a profile with many cookies, can cause delays too. Roaming profiles also suffer from the last write wins problem. In a distributed environment like a XenApp farm, it is not unlikely that users are connected to different servers at the same time. Profiles are updated when users log off, so with different sessions on different servers, some settings could be overwritten or, worse, the profile could be corrupted.

Folder redirection

To reduce the size of roaming profiles, you can redirect most of the user folders to a different location.
Instead of saving files in the user's profile, you can configure Windows to save them on a file share. The advantages of using folder redirection are:

The data in the redirected folders is not included in the synchronization job of the roaming profile, making the user logon and logoff processes faster
Using disk quotas and redirecting folders to different disks, you can limit how much space is taken up by single folders instead of the whole profile
Windows Offline Files technology allows users to access their files even when no network connection is available
You can redirect some folders (for example, the Start Menu) to a read-only share, giving all your users the same content

Folder redirection is configured through group policies, as shown in the following screenshot:

Configuring Folder Redirection

For each folder, you can choose to redirect it to a fixed location (useful if you want to provide the same content to all your users), to a subfolder (named as the username) under a fixed root path, to the user's home directory, or to the local user profile location. You may also apply different redirections based on group membership and define advanced settings for the Documents folder. In my experience, folder redirection plays a key role in speeding up the logon process. You should enable it at least for the Desktop and My Documents folders if you're using roaming profiles.

Background upload

With Windows 2008 R2, Microsoft added the ability to perform a periodic upload of the user's profile registry file (NTUSER.DAT) to the file share. Even if this option wasn't added to address the last write wins problem, it may help to avoid profile corruption, and Microsoft recommends enabling it through a GPO, as shown in the following screenshot:

Enabling Background upload

Citrix Profile Management

Citrix developed its own solution for managing profiles, Citrix Profile Management. You're entitled to use Citrix Profile Management if you have an active Subscription Advantage for the following products:

XenApp Enterprise and Platinum edition
XenDesktop Advanced, Enterprise, and Platinum edition

You need to install the software on each computer whose user profiles you want to manage. In a XenApp farm, install it on your session-host servers.

Features

Citrix Profile Management was designed specifically to solve some of the problems Windows roaming profiles introduced. Its main features are:

Support for multiple sessions, without the last write wins problem
Ability to manage large profiles, without the need to perform a full sync when the user logs on
Support for v1 (Windows XP/2003) and v2 (Windows Vista/Seven/2008) profiles
Ability to define inclusion/exclusion lists
Extended synchronization that can include files and folders external to the profile, to support legacy applications

Configuring

Citrix Profile Management is configured using Windows Group Policy. In the Profile Management package, downloadable from the Citrix website, you can find the administrative template (.admx) and its language file (.adml). Copy the ADMX file to C:\Windows\PolicyDefinitions and the ADML file to the language subfolder of C:\Windows\PolicyDefinitions (for example, on English operating systems, the language folder is en-US); a command sketch follows at the end of this section. A new Profile Management folder under Citrix is then available in your GPOs, as shown in the following screenshot:

Profile Management's settings in Windows GPOs

Profile Management settings are in the Computer section; therefore, link the GPO to the Organizational Unit (OU) that contains your session-host servers.
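As a minimal sketch of the ADMX copy step described above, run the following from the folder where you unzipped the Profile Management package; the ctxprofile filenames are assumptions, as the actual names vary by Profile Management version:

copy ctxprofile.admx C:\Windows\PolicyDefinitions\
copy en-US\ctxprofile.adml C:\Windows\PolicyDefinitions\en-US\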
Profiles priority order
If you deployed Citrix Profile Management, it takes precedence over any other profile assignment method. The priority order on a XenApp server is the following:

Citrix Profile Management
Remote Desktop Services profile assigned by a GPO
Remote Desktop Services profile assigned by a user property
Roaming profile assigned by a GPO
Roaming profile assigned by a user property

Troubleshooting
Citrix Profile Management includes logging functionality that you can enable via GPO as shown in the following screenshot:

Enabling the logging functionality

With the Log settings setting, you can also enable verbose logging for specific events or actions. Logs are usually saved in %SystemRoot%\System32\Logfiles\UserProfileManager, but you can change the path with the Path to log file property. Profile Management's logs are also useful to troubleshoot slow logons. Each step is logged with a timestamp, so by analyzing those logs you can find which steps take a long time.

GPO and logon script issues
In a Windows environment, it's common to apply settings and customizations via Group Policy Objects (GPOs) or using logon scripts. Numerous GPOs and long-running scripts can significantly impact the speed of the logon process. It's sometimes hard to find which GPOs or scripts are causing delays. A suggestion is to move the XenApp server or a test user account into a new Organizational Unit, with no policies applied and blocked inheritance. Comparing the logon time in this scenario with the normal logon time can help you understand how GPOs and scripts affect the logon process. The following are some of the best practices when working with GPOs and logon scripts:

Reduce the number of GPOs by merging them when possible. The time Windows takes to apply 10 GPOs is much more than the time to apply a single GPO including all their settings.
Disable unused GPO sections. It's common to have GPOs with only computer or user settings. Explicitly disabling the unused sections can speed up the time required to apply the GPOs.
Use GPOs instead of logon scripts. Windows 2008 introduced Group Policy Preferences, which can be used to perform common tasks (map network drives, change registry keys, and so on) previously performed by logon scripts. The following screenshot displays configuring drive maps through GPOs.

Configuring drive maps through GPO
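When you need to see exactly which policies a session is processing, the built-in gpresult tool is a quick first step. Run from the session-host server (an elevated prompt is assumed), it produces a report listing every applied GPO, which makes it easier to spot candidates for merging or for disabling unused sections:

# Generate an HTML report of applied GPOs for the logged-on user and computer
gpresult /h C:\Temp\gpo-report.html /f

# Or just list the applied GPOs in the console
gpresult /r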
Session pre-launch, sharing, and lingering
Setting up a session is the most time-consuming task Citrix and Windows have to perform when a user requests an application. In the latest version of XenApp, Citrix added some features to anticipate the session setup (pre-launch) and to improve the sharing of the same session between different applications (lingering).

Session pre-launch
Session pre-launch is a new feature of XenApp 6.5. Instead of waiting for the user to launch an application, you can configure XenApp to set up a session as soon as the user logs on to the farm. At the moment, session pre-launch works only if the user logs on using the Receiver, not through the Web Interface. This means that when the user requests an application, a session is already loaded and all the steps of the logon process you've learned have already taken place. The application can start without any delay. From my experience, this is a feature much appreciated by users, and I use it in production farms. Please note that if you enable session pre-launch, a license is consumed at the time the user logs on.

Configuring
A session pre-launch is based on a published application on your farm. A common mistake is thinking that when you configure a pre-launch application, Citrix effectively launches that application when the user logs on. The application is actually used as a template for the session. Citrix uses some of its settings, like users, servers/worker groups, color depth, and so on. To create a pre-launch session, right-click on the application and choose Other Tasks | Create pre-launch application as shown in the following screenshot:

Creating pre-launch application

To avoid confusion, I suggest renaming the configured pre-launch session, removing the actual application name; for example, you can name it Pre-launch WGProd. A pre-launched session can be used to run applications that have the same settings as the application you chose when you created the session. For example, it can be used by applications that run on the same servers. If you published different groups of applications, usually assigned to different worker groups, you should create pre-launch sessions choosing one application from each group to be sure that all your users benefit from this feature.

Life cycle of a session
If you configured a pre-launch session, when the Receiver passes the user credentials to the XenApp server, a new session is created. The server that will host the session is chosen in the usual way by the Data Collector. In Citrix AppCenter, you can identify pre-launched sessions from the value in the Application State column as shown in the following screenshot:

Pre-launched session

Using Citrix policies, you can set the maximum time a pre-launch session is kept:

Pre-launch Disconnect Timer Interval is the time after which the pre-launch session is put in the disconnected state
Pre-launch Terminate Timer Interval is the time after which the pre-launch session is terminated

Session sharing
Session sharing occurs when a user has an open session on a server and launches an application that is published on the same server. The launch time for the second application is quicker because Citrix doesn't need to set up a new session for it. Session sharing is enabled by default if you publish your applications in seamless window mode. In this mode, applications appear in their own windows without containing an ICA session window. They seem physically installed on the client. Session sharing fails if applications are published with different settings (for example, color depth, encryption, and so on). Make sure to publish your applications using the same settings if possible. Session sharing takes precedence over load balancing; the only exception is if the server reports full load. You can force XenApp to override the load check and to also use fully loaded servers for session sharing. Refer to CTX126839 for the requested registry changes. This is, however, not a recommended configuration; a fully loaded server can lead to poor performance.

Session lingering
If a user closes all the applications running in a session, the session is ended too. Sometimes it would be useful to keep the session open to speed up the start of new applications. With XenApp 6.5, you can configure a lingering time: XenApp waits before closing a session even if all the running applications are closed.
Configuring
With Citrix user policies, you can configure two timers for session lingering:

Linger Disconnect Timer Interval is the time after which a session without applications is put in the disconnected state
Linger Terminate Timer Interval is the time after which a session without applications is terminated

If you're running an older version of XenApp, you can keep a session open even if users close all the running applications with the KeepMeLoggedIn tool; refer to CTX128579.

Summary
The optimization of the logon process can greatly improve the user experience. With EdgeSight and Windows Performance Toolkit, you can perform a deep analysis and detect any causes of delay. If you can't use those tools, you are still able to implement some guidelines and best practices that will surely make user logons faster. Setting up a session is a time-consuming task. With XenApp 6.5, Citrix implemented some new features to improve session management. With session pre-launch and session lingering, you can maximize the reuse of existing sessions when users request an application, speeding up launch times.

Resources for Article:
Further resources on this subject:
Managing Citrix Policies [Article]
Getting Started with XenApp 6 [Article]
Getting Started with the Citrix Access Gateway Product Family [Article]

6 signs you need containers

Richard Gall
05 Feb 2019
9 min read
I'm not about to tell you containers are a hot new trend - clearly, they aren't. Today, they are an important part of the mainstream software development industry that probably won't be disappearing any time soon. But while containers certainly can't be described as a niche or marginal way of deploying applications, they aren't necessarily ubiquitous. There are still developers or development teams yet to fully appreciate the usefulness of containers. You might know them - you might even be one of them. Joking aside, there are often many reasons why people aren't using containers. Sometimes these are good reasons: maybe you just don't need them. Often, however, you do need them, but the mere thought of changing your systems and workflow can feel like more trouble than it's worth. If everything seems to be (just about) working, why shake things up? Well, I'm here to tell you that more often than not it is worthwhile. But to know that you're not wasting your time and energy, there are a few important signs that can tell you if you should be using containers.

Download Containerize Your Apps with Docker and Kubernetes for free, courtesy of Microsoft.

Your codebase is too complex
There are few developers in the world who would tell you that their codebase couldn't do with a little pruning and simplification. But if your code has grown into a beast that everyone fears and doesn't really understand, containers could probably help you a lot.

Why do containers help simplify your codebase?
Let's think about how spaghetti code actually happens. Yes, it always happens by accident, but usually it's something that evolves out of years of solving intractable problems with knock-on effects and consequences that only need to be solved later. By using containers you can begin to think differently about your code. Instead of everything being tied up together, like a complex concrete network of road junctions, containers allow you to isolate specific parts of it. When you can better isolate your code, you can also isolate different problems and domains. This is one of the reasons that containers are so closely aligned with microservices.

Software testing is nightmarish
The efficiency benefits of containers are well documented, but the way containers can help the software testing process is often underplayed - this probably says more about a general inability to treat testing with the respect and time it deserves as much as anything else.

How do containers make testing easier?
There are a number of reasons containers make software testing easier. On the one hand, by using containers you're reducing that gap between the development environment and production, which means you shouldn't be faced with as many surprises once your code hits production as you sometimes might. Containers also make the testing process faster - you only need to test against a container image, you don't need a fully-fledged testing environment for every application you do tests on. What this all boils down to is that testing becomes much quicker and easier. In theory, then, this means the testing process fits much more neatly within the development workflow. Code quality should never be seen as a bottleneck; with containers it becomes much easier to embed the principle in your workflow.
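To make that concrete, here's roughly what a container-based test run can look like. This is a sketch rather than a prescription: the image name, the Dockerfile, and the npm test command are all assumptions standing in for your own project's equivalents:

# Build an image from the project's Dockerfile and run the test suite inside it
docker build -t myapp:test .
docker run --rm myapp:test npm test

Because the image is built the same way in CI and on a laptop, the environment the tests run in is the environment the code ships in.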
Read next: How to build 12 factor microservices on Docker

Your software isn't secure - you've had breaches that could have been prevented
Spaghetti code and a lack of effective testing can lead to major security risks. If no one really knows what's going on inside your applications and inside your code, it's inevitable that you'll have vulnerabilities. And, in turn, it's highly likely these vulnerabilities will be exploited.

How can containers make software security easier?
Because containers allow you to make changes to parts of your software infrastructure (rather than requiring wholesale changes), this makes security patches much easier to achieve. Essentially, you can isolate the problem and tackle it. Without containers, it becomes harder to isolate specific pieces of your infrastructure, which means any changes could have a knock-on effect on other parts of your code that you can't predict. That all being said, it probably is worth mentioning that containers do still pose a significant set of security challenges. While simplicity in your codebase can make testing easier, you are replacing simplicity at that level with increased architectural complexity. To really feel the benefits of container security, you need a strong sense of how your container deployments are working together and how they might interact.

Your software infrastructure is expensive (you feel the despair of vendor lock-in)
Running multiple virtual machines can quickly get expensive. In terms of both storage and memory, if you want to scale up, you're going to be running through resources at a rapid rate. While you might end up spending big on more traditional compute resources, the tools around container management and automation are getting cheaper. One of the costs of many organizations' software infrastructure is lock-in. This isn't just about price, it's about the restrictions that come with sticking with a certain software vendor - you're spending money on software systems that are almost literally restricting your capacity for growth and change.

How do containers solve the software infrastructure problem and reduce vendor lock-in?
Traditional software infrastructure - whether that's on-premise servers or virtual ones - is a fixed cost - you invest in the resources you need, and then either use it or you don't. With containers running on, say, cloud, it becomes a lot easier to manage your software spend alongside strategic decisions about scalability. Fundamentally, it means you can avoid vendor lock-in. Yes, you might still be paying a lot of money for AWS or Azure, but because containers are much more portable, moving your applications between providers is much less hassle and risk.
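In practice, that portability is mostly a matter of retagging and pushing the same image to a different registry. The registry address and image names below are placeholders:

# Move an existing image to another provider's registry
docker tag myapp:1.0 registry.other-cloud.example.com/team/myapp:1.0
docker push registry.other-cloud.example.com/team/myapp:1.0

The application itself doesn't change; only where it's pulled from does.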
Read next: CNCF releases 9 security best practices for Kubernetes, to protect a customer's infrastructure

DevOps is a war, not a way of working
Like containers, DevOps could hardly be considered a hot new trend any more. But this doesn't mean it's now part of the norm. There are plenty of organizations that simply don't get DevOps, or, at the very least, seem to be stumbling their way through sprint meetings with little real alignment between development and operations. There could be multiple causes for this conflict (maybe people just don't get on), but DevOps often fails where the code that's being written and deployed is too complicated for anyone to properly take accountability. This takes us back to the issue of the complex codebase. Think of it this way - if code is a gigantic behemoth that can't be easily broken up, the unintended effects and consequences of every new release and update can cause some big problems - both personally and technically.

How do containers solve DevOps challenges?
Containers can help solve the problems that DevOps aims to tackle by breaking up software into different pieces. This means that developers and operations teams have much more clarity on what code is being written and why, as well as what it should do. Indeed, containers arguably facilitate DevOps practices much more effectively than DevOps proponents have been able to in pre-container years.

Adding new product features is a pain
The issue of adding features or improving applications is a complaint that reaches far beyond the development team. Product management, marketing - these departments will all bemoan the inability to make necessary changes or add new features that they will argue are business critical. Often, developers will take the heat. But traditional monolithic applications make life difficult for developers - you simply can't make changes or updates easily. It's like wanting to replace a radiator and having to redo your house's plumbing. This actually returns us to the earlier point about DevOps - containers make DevOps easier because they enable faster delivery cycles. You can make changes to an application at the level of a container or set of containers. Indeed, you might even simply kill one container and replace it with a new one. In turn, this means you can change and build things much more quickly.

How do containers make it easier to update or build new features?
To continue with the radiator analogy: containers would allow you to replace or change an individual radiator without having to gut your home. Essentially, if you want to add a new feature or change an element, you wouldn't need to go into your application and make wholesale changes - that may have unintended consequences - instead, you can simply make a change by running the resources you need inside a new container (or set of containers).

Watch for the warning signs
As with any technology decision, it's well worth paying careful attention to your own needs and demands. So, before fully committing to containers, or containerizing an application, keep a close eye on the signs that they could be a valuable option. Containers may well force you to come face to face with the reality of technical debt - and if they do, so be it. There's no time like the present, after all. Of course, all of the problems listed above are ultimately symptoms of broader issues or challenges you face as a development team or wider organization. Containers shouldn't be seen as a sure-fire corrective, but they can be an important element in changing your culture and processes.

Learn how to containerize your apps with a new eBook, free courtesy of Microsoft. Download it here.
Monitoring and Troubleshooting Networking

Packt
21 Oct 2015
21 min read
This article by Muhammad Zeeshan Munir, author of the book VMware vSphere Troubleshooting, covers troubleshooting vSphere virtual distributed switches, vSphere standard virtual switches, VLANs, uplinks, DNS, and routing—some of the core issues a seasoned system engineer has to deal with on a daily basis. This article will cover all these topics and give you hands-on, step-by-step instructions to manage and monitor your network resources. The following topics will be covered in this article:

Different network troubleshooting commands
VLANs troubleshooting
Verification of physical trunks and VLAN configuration
Testing of VM connectivity
VMkernel interface troubleshooting
Configuration commands (vicfg-vmknic and esxcli network ip interface)
Use of Direct Console User Interface (DCUI) to verify configuration

(For more resources related to this topic, see here.)

Network troubleshooting commands
Some of the commands that can be used for network troubleshooting include net-dvs, esxcli network, vicfg-route, vicfg-vmknic, vicfg-dns, vicfg-nics, and vicfg-vswitch. You can use the net-dvs command to troubleshoot VMware distributed dvSwitches. The command shows all the information regarding the VMware distributed dvSwitch configuration. The net-dvs command reads the information from the /etc/vmware/dvsdata.db file and displays all the data in the console. A vSphere host keeps updating its dvsdata.db file every five minutes.

Connect to a vSphere host using PuTTY. Enter your user name and password when prompted. Type the following command in the CLI:

net-dvs

You will see something similar to the following screenshot: In the preceding screenshot, you can see that the first line represents the UUID of a VMware distributed switch. The second line shows the maximum number of ports a distributed switch can have. The line com.vmware.common.alias = dvswitch-Network-Pools represents the name of a distributed switch. The next line, com.vmware.common.uplinkPorts: dvUplink1 to dvUplinkn, shows the uplink ports a distributed switch has. The distributed switch MTU is set to 1,600, and you can see the information about CDP just below it. CDP information can be useful to troubleshoot connectivity issues. You can see com.vmware.common.respools.list listing networking resource pools, while com.vmware.common.host.uplinkPorts shows the port numbers assigned to uplink ports. Further details about these uplink ports are explained for each uplink port by its port number. You can also see the port statistics as displayed in the following screenshot. When you perform troubleshooting, these statistics can help you to check the behavior of the distributed switch and the ports. From these statistics, you can diagnose if the data packets are going in and out. As you can see in the following screenshot, all the metrics regarding packet drops are zero. If you find in your troubleshooting that packets are being dropped, you can easily start finding the root cause of the problem:

Unfortunately, the net-dvs command is very poorly documented, and usually it is hard to find useful references. Moreover, it is not supported by VMware. However, you can use it with the -h switch to display more options.
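If you prefer working from PowerCLI, much of the same distributed switch configuration can be inspected with the VDS cmdlets, which are supported (unlike net-dvs). A short sketch, assuming PowerCLI is already connected to vCenter with Connect-VIServer:

# List distributed switches with their MTU, uplink count, and version
Get-VDSwitch | Select-Object Name, Mtu, NumUplinkPorts, Version

# Show the port groups (and their VLAN configuration) on one switch
Get-VDSwitch -Name 'dvSwitch-Network-Pools' | Get-VDPortgroup |
    Select-Object Name, VlanConfiguration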
Repairing a dvsdata.db file
Sometimes, the dvsdata.db file of a vSphere host becomes corrupted and you face different types of distributed switch errors, for example, unable to create proxy DVS. In this case, when you try to run the net-dvs command on a vSphere host, it will fail with an error as well. As mentioned earlier, the net-dvs command reads data from the /etc/vmware/dvsdata.db file—it fails because it is unable to read data from the file. The possible cause for the corruption of the dvsdata.db file could be a network outage; or, when a vSphere host is disconnected from vCenter and deleted, it might still have the old information in its cache. You can resolve this issue by restoring the dvsdata.db file by following these steps:

Through PuTTY, connect to a functioning vSphere host in your infrastructure.
Copy the dvsdata.db file from the vSphere host. The file can be found in /etc/vmware/dvsdata.db.
Transfer the copied dvsdata.db file to the corrupted vSphere host and overwrite it.
Restart your vSphere host.
Once the vSphere host is up and running, use PuTTY to connect to it.
Run the net-dvs command.

The command should be executed successfully this time without any errors.

ESXCLI network
The esxcli network command is a longtime friend of the system administrator and the support staff for troubleshooting network-related issues. The esxcli network command will be used to examine different network configurations and to troubleshoot problems. You can type esxcli network to quickly see a help reference and the different options that can be used with the command. Let's walk through some useful esxcli network troubleshooting commands. Type the following command into your vSphere CLI to list all the virtual machines and the networks they are on. You can see that the command returned the World ID, virtual machine name, number of ports, and the network:

esxcli network vm list
World ID  Name                                                 Num Ports  Networks
--------  ---------------------------------------------------  ---------  ---------------
14323012  cluster08_(5fa21117-18f7-427c-84d1-c63922199e05)             1  dvportgroup-372
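The same esxcli namespace is also reachable from PowerCLI through Get-EsxCli, which is handy when you want to run these checks against many hosts in a loop. A sketch using the V2 interface, reusing the host name from this article's examples:

# Run esxcli network vm list remotely via PowerCLI
$esxcli = Get-EsxCli -VMHost crimv3esx001.linxsol.com -V2
$esxcli.network.vm.list.Invoke() | Select-Object WorldID, Name, Networks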
Now use the World ID of the virtual machine returned by the last command to list all the ports the virtual machine is currently using. You can see the virtual switch name, MAC address of the NIC, IP address, and uplink port ID:

esxcli network vm port list -w 14323012
Port ID: 50331662
vSwitch: dvSwitch-Network-Pools
Portgroup: dvportgroup-372
DVPort ID: 1063
MAC Address: 00:50:56:01:00:7e
IP Address: 0.0.0.0
Team Uplink: all(2)
Uplink Port ID: 0
Active Filters:

Type the following command in the CLI to list the statistics of the virtual switch—you need to replace the port ID as returned by the last command after the -p flag:

esxcli network port stats get -p 50331662
Packet statistics for port 50331662
Packets received: 10787391024
Packets sent: 7661812086
Bytes received: 3048720170788
Bytes sent: 154147668506
Broadcast packets received: 17831672
Broadcast packets sent: 309404
Multicast packets received: 656
Multicast packets sent: 52
Unicast packets received: 10769558696
Unicast packets sent: 7661502630
Receive packets dropped: 92865923
Transmit packets dropped: 0

Type the following command to list complete information about the network card of the virtual machine:

esxcli network nic stats get -n vmnic0
NIC statistics for vmnic0
Packets received: 2969343419
Packets sent: 155331621
Bytes received: 2264469102098
Bytes sent: 46007679331
Receive packets dropped: 0
Transmit packets dropped: 0
Total receive errors: 78507
Receive length errors: 0
Receive over errors: 22
Receive CRC errors: 0
Receive frame errors: 0
Receive FIFO errors: 78485
Receive missed errors: 0
Total transmit errors: 0
Transmit aborted errors: 0
Transmit carrier errors: 0
Transmit FIFO errors: 0
Transmit heartbeat errors: 0
Transmit window errors: 0

A complete reference of the esxcli network command can be found at https://goo.gl/9OMbVU. All the vicfg-* commands are very helpful and easy to use; I encourage you to learn them in order to make your life easier. Here are some of the vicfg-* commands relevant to network troubleshooting:

vicfg-route: We will use this command to add or remove IP routes and to create and delete default IP gateways.
vicfg-vmknic: We will use this command to perform different operations on VMkernel NICs of vSphere hosts.
vicfg-dns: This command will be used to manipulate DNS information.
vicfg-nics: We will use this command to manipulate vSphere physical NICs.
vicfg-vswitch: We will use this command to create, delete, and modify vSwitch information.

Troubleshooting uplinks
We will use the vicfg-nics command to manage physical network adapters of vSphere hosts. The vicfg-nics command can also be used to set up the speed, the VMkernel name for the uplink adapters, duplex settings, driver information, and link state information of the NIC.

Connect to your vMA appliance console and set up the target vSphere host:
vifptarget --set crimv3esx001.linxsol.com

List all the network cards available in the vSphere host. See the following screenshot for the output:
vicfg-nics -l

You can see that my vSphere host has five network cards, from vmnic0 to vmnic5. You are able to see the PCI and driver information. The link state for all the network cards is up. You can also see two types of network card speeds: 1000 Mbps and 9000 Mbps. There is also a card name in the Description field, the MTU, and the MAC address for each network card.

You can set up a network card to auto-negotiate as follows:
vicfg-nics --auto vmnic0

Now let's set the speed of vmnic0 to 1000 and its duplex setting to full:
vicfg-nics --duplex full --speed 1000 vmnic0
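The PowerCLI equivalent is useful when auditing NIC speed and duplex across a whole cluster rather than one host at a time. A sketch, again assuming an existing Connect-VIServer session:

# Report speed and duplex for every physical NIC on a host
Get-VMHostNetworkAdapter -VMHost crimv3esx001.linxsol.com -Physical |
    Select-Object Name, BitRatePerSec, FullDuplex, Mac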
Troubleshooting virtual switches
The last command we will discuss in this article is vicfg-vswitch. The vicfg-vswitch command is a very powerful command that can be used to manipulate the day-to-day operations of a virtual switch. I will show you how to create and configure port groups and virtual switches.

Set up a vSphere host in the vMA appliance for which you want to get information about virtual switches:
vifptarget --set crimv3esx001.linxsol.com

Type the following command to list all the information about the switches the vSphere host has. You can see the command output in the screenshot that follows:
vicfg-vswitch -l

You can see that the vSphere host has one virtual switch and two virtual NICs carrying traffic for the management network and for vMotion. The virtual switch has 128 ports, and 7 of them are in the used state. There are two uplinks to the switch with the MTU set to 1500, while two VLANs are being used: one for the management network and one for the vMotion traffic. You can also see three distributed switches named OpenStack, dvSwitch-External-Networks, and dvSwitch-Network-Pools. Prefixing dv to the distributed switch name is a common practice, and it can help you to easily recognize a distributed switch.

I will go through adding a new virtual switch:
vicfg-vswitch --add vSwitch002

This creates a virtual switch with 128 ports and an MTU of 1500. You can use the --mtu flag to specify a different MTU. Now add an uplink adapter (vmnic0) to the newly created virtual switch vSwitch002:
vicfg-vswitch --link vmnic0 vSwitch002

To add a port group to the virtual switch, use the following command:
vicfg-vswitch --add-pg portgroup002 vSwitch002

Now add an uplink adapter to the port group:
vicfg-vswitch --add-pg-uplink vmnic0 --pg portgroup002 vSwitch002

We have discussed all the commands to create a virtual switch and its port groups and to add uplinks. Now we will see how to delete and edit the configuration of a virtual switch. An uplink NIC can be deleted from the port group using the -N flag. Remove vmnic0 from portgroup002:
vicfg-vswitch --del-pg-uplink vmnic0 --pg portgroup002 vSwitch002

You can delete the recently created port group as follows:
vicfg-vswitch --del-pg portgroup002 vSwitch002

To delete a switch, you first need to remove the uplink adapter from the virtual switch. You need to use the -U flag, which unlinks the uplink from the switch:
vicfg-vswitch --unlink vmnic0 vSwitch002

You can delete a virtual switch using the -d flag. Here is how you do it:
vicfg-vswitch --delete vSwitch002

You can check the Cisco Discovery Protocol (CDP) settings by using the --get-cdp flag with the vicfg-vswitch command. The following command resulted in putting the CDP in the Listen state, which indicates that the vSphere host is configured to receive CDP information from the physical switch:
vi-admin@vma:~[crimv3esx001.linxsol.com]> vicfg-vswitch --get-cdp vSwitch0
listen

You can configure the CDP options for the vSphere host to down, listen, or advertise. In the Listen mode, the vSphere host tries to discover and publish this information received from a Cisco switch port, though the information of the vSwitch cannot be seen by the Cisco device. In the Advertise mode, the vSphere host doesn't discover and publish the information about the Cisco switch; instead, it publishes information about its vSwitch to the Cisco switch device.
vicfg-vswitch --set-cdp both vSwitch0
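For completeness, the same standard-switch operations can be scripted from PowerCLI, which spares you the flag syntax. A sketch against the same host:

# Create a standard vSwitch with an uplink, then add a port group
$vmhost = Get-VMHost crimv3esx001.linxsol.com
$vs = New-VirtualSwitch -VMHost $vmhost -Name vSwitch002 -Nic vmnic0
New-VirtualPortGroup -VirtualSwitch $vs -Name portgroup002

# Tear it down again
Remove-VirtualPortGroup -VirtualPortGroup (Get-VirtualPortGroup -VMHost $vmhost -Name portgroup002) -Confirm:$false
Remove-VirtualSwitch -VirtualSwitch $vs -Confirm:$false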
Troubleshooting VLANs
Virtual LANs (VLANs) are used to separate the physical switching segment into different logical switching segments in order to segregate the broadcast domains. VLANs not only provide network segmentation but also provide us with a method of effective network management. They also increase overall network security, and nowadays they are very commonly used in infrastructure. If not set up correctly, VLANs can leave your vSphere host with no connectivity, and you can face some very common problems where you are unable to ping or resolve host names anymore. Some common errors are exposed, such as Destination host unreachable and Connection failed.

A Private VLAN (PVLAN) is an extended version of a VLAN that divides a logical broadcast domain into further segments and forms private groups. PVLANs are divided into primary and secondary PVLANs. The primary PVLAN is the original VLAN that is divided into smaller segments; it hosts all the secondary PVLANs within it. Secondary PVLANs live within primary PVLANs, and individual secondary PVLANs are recognized by the VLAN IDs linked to them. Just like their ancestor VLANs, the packets that travel within secondary PVLANs are tagged with their associated IDs. The physical switch then recognizes whether the packets are tagged as isolated, community, or promiscuous.

As network troubleshooting involves taking care of many different aspects, one aspect you will come across in the troubleshooting cycle is actually troubleshooting VLANs. vSphere Enterprise Plus licensing is a requirement to connect a host using a virtual distributed switch and VLANs. You can see the three different network segments in the following screenshot. VLAN A connects all the virtual machines on different vSphere hosts; VLAN B is responsible for carrying management network traffic; and VLAN C is responsible for carrying vMotion-related traffic. In order to create PVLANs on your vSphere host, you also need the support of a physical switch:

For detailed information about the vSphere network, refer to the VMware official networking guide for vSphere 5.5 at http://goo.gl/SYySFL.

Verifying physical trunks and VLAN configuration
The first and most important step in troubleshooting your VLAN problem is to look into the VLAN configuration of your vSphere host. You should always start by verifying it. Let's walk through how to verify the network configuration of the management network and the VLAN configuration from the vSphere client:

Open and log in to your vSphere client.
Click on the vSphere host you are trying to troubleshoot.
Click on the Configuration menu and choose Networking, and then Properties of the switch you are troubleshooting.
Choose the network you are troubleshooting from the list, and click on Edit. This will open a new window.
Verify the VLAN ID for Management Network. Match it against the VLAN ID provided by your network administrator.

Verifying VLAN configuration from CLI
Following are the steps for verifying the VLAN configuration from the CLI:

Log in to vSphere CLI.
Type the following command in the console:
esxcfg-vswitch -l

Alternatively, in the vMA appliance, type the vicfg-vswitch command—the output is similar for both commands:
vicfg-vswitch -l

The output of the esxcfg-vswitch -l command is as follows:
Switch Name      Num Ports   Used Ports  Configured Ports  MTU     Uplinks
vSwitch0         128         7           128               1500    vmnic3,vmnic2

PortGroup Name        VLAN ID  Used Ports  Uplinks
vMotion               2231     1           vmnic3,vmnic2
Management Network    2230     1           vmnic3,vmnic2
---Omitted output---

The output of the vicfg-vswitch -l command is as follows:
Switch Name     Num Ports       Used Ports      Configured Ports    MTU     Uplinks
vSwitch0        128             7               128                 1500    vmnic2,vmnic3

PortGroup Name                VLAN ID   Used Ports      Uplinks
vMotion                       2231      1               vmnic2,vmnic3
Management Network            2230      1               vmnic3,vmnic2
---Omitted output---

Match it with your network configuration. If the VLAN ID is incorrect or missing, you can add or edit it using the following command from the vSphere CLI:
esxcfg-vswitch -v 2233 -p "Management Network" vSwitch0

To add or edit the VLAN ID from the vMA appliance, use the following command:
vicfg-vswitch --vlan 2233 --pg "Management Network" vSwitch0

Verifying VLANs from PowerCLI
Verifying information about VLANs from PowerCLI is fairly simple. Type the following command into the console after connecting with vCenter using Connect-VIServer:

Get-VirtualPortGroup -VMHost crimv3esx001.linxsol.com | Select Name, VirtualSwitch, VLanId
Name                    VirtualSwitch    VlanId
----                    -------------    ------
vMotion                 vSwitch0         2231
Management Network      vSwitch0         2233

Verifying PVLANs and secondary PVLANs
When you have configured PVLANs or secondary PVLANs in your vSphere infrastructure, you may arrive at a situation where you need to troubleshoot them. This topic will provide you with some tips to obtain and view information about PVLANs and secondary PVLANs, as follows:

Log in to the vSphere client and click on Networking.
Select a distributed switch and right-click on it.
From the menu, choose Edit Settings and click on it. This will open the Distributed Switch Settings window.
Click on the third tab, named Private VLAN.
In the section on the left, named Primary private VLAN ID, verify the VLAN ID provided by your network engineer. You can verify the VLAN ID of the secondary PVLAN in the next section on the right.

Testing virtual machine connectivity
Whenever you are troubleshooting, virtual-machine-to-virtual-machine testing is very important. It helps you to isolate the problem domain to a smaller scope. When performing virtual-machine-to-virtual-machine testing, you should always move the virtual machines to a single vSphere host. You can then start troubleshooting the network using basic commands, such as ping. If ping works, you are ready to test further and move the virtual machines to other hosts; if it still doesn't work, it is most likely a configuration problem on the physical switch or a mismatched physical trunk configuration.
The most common problem in this scenario is a problematic physical switch configuration.

Troubleshooting VMkernel interfaces
In this section, we will see how to troubleshoot VMkernel interfaces:

Confirm VLAN tagging
Ping to check connectivity
vicfg-vmknic
esxcli network ip interface for local configuration
esxcli network ip interface list
Add or remove
Set
esxcli network ip interface ipv4 get

You should know how to use these commands to test whether everything is working. You should be able to ping to ensure connectivity exists. We will use the vicfg-vmknic command to configure vSphere VMkernel NICs. Let's create a new VMkernel NIC in a vSphere host using the following steps:

Log in to your VMware vSphere CLI.
Type the following command to create a new VMkernel NIC:
vicfg-vmknic -h crimv3esx001.linxsol.com --add --ip 10.2.0.10 -n 255.255.255.0 'portgroup01'

You can enable vMotion using the vicfg-vmknic command with the --enable-vmotion flag. (You will not be able to enable vMotion from ESXCLI.) vMotion provides migration of your virtual machines with zero downtime.

You can delete an existing VMkernel NIC as follows:
vicfg-vmknic -h crimv3esx001.linxsol.com --delete 'portgroup01'

Now check which VMkernel NICs are available in the system by typing the following command:
vicfg-vmknic -l
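The PowerCLI route to the same result is New-VMHostNetworkAdapter, which can also flag the interface for vMotion in one step. A sketch using the same addresses as above and assuming portgroup01 lives on vSwitch0:

# Create a VMkernel NIC on portgroup01 and enable vMotion on it
New-VMHostNetworkAdapter -VMHost crimv3esx001.linxsol.com -PortGroup portgroup01 `
    -VirtualSwitch vSwitch0 -IP 10.2.0.10 -SubnetMask 255.255.255.0 -VMotionEnabled:$true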
Verifying configuration from DCUI
When you successfully install vSphere, the first yellow screen that you see is called the vSphere DCUI. The DCUI is a frontend management system that helps perform some basic system administration tasks. It also offers the best way to troubleshoot some problems that may be difficult to troubleshoot through vMA, vCLI, or PowerCLI. Further, it is very useful when your host becomes unresponsive in vCenter or is not accessible from any of the management tools. Some useful tasks that can be performed using the DCUI are as follows:

Configuring the Lockdown mode
Checking connectivity of the Management Network by ping
Configuring and restarting network settings
Restarting management agents
Viewing logs
Resetting the vSphere configuration
Changing the root password

Verifying network connectivity from DCUI
The vSphere host automatically assigns the first network card available to the system for the management network. Moreover, the default installation of the vSphere host does not let you set up VLAN tags until the VMkernel has been loaded. Verifying network connectivity from the DCUI is important but easy. To do so, follow these steps:

Press F2 and enter your root user name and password. Click OK.
Use the cursor keys to go down to the Test Management Network option.
Click Enter, and you will see a new screen. Here you can enter up to three IP addresses and the host name to be resolved. You can also type your gateway address on this screen to see if you are able to reach your gateway. In the host name field, you can enter your DNS server name to test if the name resolves successfully.
Press Esc to get back and Esc again to log off from the vSphere DCUI.

Verifying management network from DCUI
You can also verify the settings of your management network from the DCUI:

Press F2 and enter your root user name and password. Click OK.
Use the cursor keys to go down to the Configure Management Network option and click Enter.
Click Enter again after selecting the first option, Network Adapters.
On the next screen, you will see a list of all the network adapters your system has. It will show you the Device Name, Hardware Type, Label, MAC address of the network card, and the status as Connected or Disconnected. From the given network cards, you can select or deselect any of the network cards by pressing the spacebar on your keyboard.
Press Esc to get back and Esc again to log off from the vSphere DCUI.

As you can see in the preceding screenshot, you can also configure the IP address and DNS settings for your vSphere host. You can also use the DCUI to configure VLANs and the DNS suffix for your vSphere host.

Summary
In this article, we took a deep dive into the troubleshooting commands and some of the monitoring tools used to monitor network performance. Having various platforms on which to execute different commands helps you to tailor your troubleshooting approach: for example, for troubleshooting a single vSphere host you may like to use esxcli, but for a bunch of vSphere hosts you would want to automate scripting tasks from PowerCLI or from a vMA appliance.

Resources for Article:
Further resources on this subject:
Upgrading VMware Virtual Infrastructure Setups [article]
VMware vRealize Operations Performance and Capacity Management [article]
Working with Virtual Machines [article]

It's Black Friday: But what's the business (and developer) cost of downtime?

Richard Gall
23 Nov 2018
4 min read
Black Friday is back and, as you've probably already noticed, with a considerable vengeance. According to Adobe Analytics data, online spending is predicted to hit $3.7 billion over this holiday season in the U.S., up from $2.9 billion in 2017. But while consumers clamour for deals and businesses reap the rewards, it's important to remember there's a largely hidden plane of software engineering labour. Without this army of developers, consumers would most likely be hitting their devices in frustration, while business leaders would be missing tough revenue targets - so, as we enter Black Friday, let's pour one out for all those engineers on call and trying their best to keep eCommerce sites on their feet.

Here's to the software engineers keeping things running on Black Friday
Of course, the pain that hits on days like Black Friday and Cyber Monday can be minimised with smart planning and effective decision making long before those sales begin. However, for engineering teams under-resourced and lacking the right tools, that is simply impossible. This means that software engineers are left in a position where they're treading water, knowing that they're going to be sinking once those big days come around. It doesn't have to be like this. With smarter leadership and, indeed, more respect for the intensive work engineers put in to make websites and apps actually work, revenue-driving platforms can become more secure, resilient, and stable.

Chaos engineering platform Gremlin publishes the 'true cost of downtime'
This is the central argument of chaos engineering platform Gremlin, who we've covered a number of times this year. To coincide with Black Friday, the team has put together what they believe is the 'true cost of downtime'. On the one hand this is a good marketing hook for their chaos engineering platform, but, cynicism aside, it's also a good explanation of why the principles of chaos engineering can be so valuable from both a business and developer perspective. Estimating the annual revenue of some of the biggest companies in the world, Gremlin has then created an interactive table to demonstrate what the cost of downtime for each of those businesses would be, for the length of time you are on the page. For 20 minutes of downtime, Amazon.com would have lost a staggering $4.4 million. For Walgreens it's more than $80,000. Gremlin provide some context to all this, saying:

"Enterprise commerce businesses typically rely on a complex microservices architecture, from fulfillment, to website security, ability to scale with holiday traffic, and payment processing - there is a lot that can go wrong and impact revenue, damage customer trust, and consume engineering time. If an ecommerce site isn't 100% online and performant, it's losing revenue."

"The holiday season is especially demanding for SREs working in ecommerce. Even the most skilled engineering teams can struggle to keep up with the demands of peak holiday traffic (i.e. Black Friday and Cyber Monday). Just going down for a few seconds can mean thousands in lost revenue, but for some sites, downtime can be exponentially more expensive."
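The arithmetic behind figures like these is straightforward: spread annual revenue evenly across the year and multiply by the length of the outage. A PowerShell sketch with a hypothetical $100 billion annual revenue (the even-spread assumption is itself a simplification - Black Friday traffic is anything but even, so peak-hour losses run higher):

# Rough cost of a 20-minute outage, assuming revenue is spread evenly over the year
$annualRevenue = 100e9                          # hypothetical, in dollars
$perMinute = $annualRevenue / (365 * 24 * 60)   # roughly $190,000 per minute
$outageCost = $perMinute * 20
"{0:C0} for a 20-minute outage" -f $outageCost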
For Gremlin, chaos engineering is clearly the answer to many of the problems days like Black Friday pose. While it might not work for every single organization, it's nevertheless true that failing to pay attention to the value of your applications and websites at an hour-by-hour level could be incredibly damaging. With outages on Facebook, WhatsApp, and Instagram happening earlier this week, these problems aren't hidden away - they're in full view of the public. What does remain hidden, however, is the work and stress that goes into tackling these issues and ensuring things are working as they should be. Perhaps it's time to start learning the lessons of Black Friday - business revenues will be that little bit healthier, but engineers will also be that little bit happier.

Proxmox VE Fundamentals

Packt
04 Apr 2016
12 min read
In this article written by Rik Goldman, author of the book Learning Proxmox VE, we introduce Proxmox Virtual Environment (PVE), which is a mature, complete, well-supported, enterprise-class virtualization environment for servers. It is an open source tool—based on the Debian GNU/Linux distribution—that manages containers, virtual machines, storage, virtualized networks, and high-availability clustering through a well-designed, web-based interface or via the command-line interface. (For more resources related to this topic, see here.)

Developers provided the first stable release of Proxmox VE in 2008; four years and eight point releases later, ZDNet's Ken Hess boldly, but quite sensibly, declared Proxmox VE the ultimate hypervisor (http://www.zdnet.com/article/proxmox-the-ultimate-hypervisor/). Four years later, PVE is on version 4.1, in use by at least 90,000 hosts and more than 500 commercial customers in 140 countries; the web-based administrative interface itself is translated into nineteen languages. This article will explore the fundamental technologies underlying PVE's hypervisor features: LXC, KVM, and QEMU. To do so, we will develop a working understanding of virtual machines, containers, and their appropriate use. We will cover the following topics:

Proxmox VE in brief
Virtualization and containerization with PVE
Proxmox VE virtual machines, KVM, and QEMU
Containerization with PVE and LXC

Proxmox VE in brief
With Proxmox VE, Proxmox Server Solutions GmbH (https://www.proxmox.com/en/about) provides us with an enterprise-ready, open source type II hypervisor. Later, you'll find some of the features that make Proxmox VE such a strong enterprise candidate. The license for Proxmox VE is very deliberately the GNU Affero General Public License (V3) (https://www.gnu.org/licenses/agpl-3.0.html). From among the many free and open source compatible licenses available, this is a significant choice because it is "specifically designed to ensure cooperation with the community in the case of network server software."

PVE is primarily administered from an integrated web interface or from the command line, locally or via SSH. Consequently, there is no need for a separate management server and the associated expenditure. In this way, Proxmox VE significantly contrasts with alternative enterprise virtualization solutions by vendors such as VMware. Proxmox VE instances/nodes can be incorporated into PVE clusters and centrally administered from a unified web interface. Proxmox VE provides for live migration—the movement of a virtual machine or container from one cluster node to another without any disruption of services. This is a rather unique feature of PVE and not common to competing products.

Features                          Proxmox VE                                       VMware vSphere
Hardware requirements             Flexible                                         Strict compliance with HCL
Integrated management interface   Web- and shell-based (browser and SSH)           No. Requires dedicated management server at additional cost
Simple subscription structure     Yes; based on number of premium support tickets  No
                                  per year and CPU socket count
High availability                 Yes                                              Yes
VM live migration                 Yes                                              Yes
Supports containers               Yes                                              No
Virtual machine OS support        Windows and Linux                                Windows, Linux, and Unix
Community support                 Yes                                              No
Live VM snapshots                 Yes                                              Yes

Contrasting Proxmox VE and VMware vSphere features

For a complete catalog of features, see the Proxmox VE datasheet at https://www.proxmox.com/images/download/pve/docs/Proxmox-VE-Datasheet.pdf.
Like its competitors, PVE is a hypervisor: a typical hypervisor is software that creates, runs, configures, and manages virtual machines based on an administrator or engineer's choices. PVE is known as a type II hypervisor because the virtualization layer is built upon an operating system. A type II hypervisor, such as PVE, runs directly over the operating system. In Proxmox VE's case, the operating system is Debian; since the release of PVE 4.0, the underlying operating system has been Debian "Jessie." By contrast, a type I hypervisor (such as VMware's ESXi) runs directly on bare metal, without the mediation of an operating system. It has no additional function beyond managing virtualization and the physical hardware.

As a type II hypervisor, Proxmox VE is built on the Debian project. Debian is a GNU/Linux distribution renowned for its reliability, commitment to security, and its thriving and dedicated community of contributing developers. Debian-based GNU/Linux distributions are arguably the most popular GNU/Linux distributions for the desktop. One characteristic that distinguishes Debian from competing distributions is its release policy: Debian releases only when its development community can stand behind it for its stability, security, and usability. Debian does not distinguish between long-term support releases and regular releases as do some other distributions. Instead, all Debian releases receive strong support and critical updates through the first year following the next release. (Since 2007, a major release of Debian has been made about every two years. Debian 8, Jessie, was released just about on schedule in 2015.) Proxmox VE's reliance on Debian is thus a testament to its commitment to these values: stability, security, and usability over scheduled releases that favor cutting-edge features.

PVE provides its virtualization functionality through three open technologies, and the efficiency with which they're integrated by its administrative web interface:

LXC
KVM
QEMU

To understand how this foundation serves Proxmox VE, we must first be able to clearly understand the relationship between virtualization (or, specifically, hardware virtualization) and containerization (OS virtualization). As we proceed, their respective use cases should become clear.

Virtualization and containerization with Proxmox VE
It is correct to ultimately understand containerization as a type of virtualization. However, here, we'll look first to conceptually distinguish a virtual machine from a container by focusing on contrasting characteristics. Simply put, virtualization is a technique through which we provide fully-functional computing resources without a demand for the resources' physical organization, locations, or relative proximity. Briefly put, virtualization technology allows you to share and allocate the resources of a physical computer into multiple execution environments. Without context, virtualization is a vague term, encapsulating the abstraction of such resources as storage, networks, servers, desktop environments, and even applications from their concrete hardware requirements through software implementation solutions called hypervisors.
Virtualization thus affords us more flexibility, more functionality, and a significant positive impact on our budgets—often realized with merely the resources we have at hand. In terms of PVE, virtualization most commonly refers to the abstraction of all aspects of a discrete computing system from its hardware. In this context, virtualization is the creation, in other words, of a virtual machine or VM, with its own operating system and applications. A VM may be initially understood as a computer that has the same functionality as a physical machine. Likewise, it may be incorporated into and communicated with via a network exactly as a machine with physical hardware would. Put yet another way, from inside a VM, we will experience no difference by which we can distinguish it from a physical computer. The virtual machine, moreover, hasn't the physical footprint of its physical counterparts. The hardware it relies on is, in fact, provided by software that borrows the hardware resources from a host installed on a physical machine (or bare metal). Nevertheless, the software components of the virtual machine, from the applications to the operating system, are distinctly separated from those of the host machine. This advantage is realized when it comes to allocating physical space for resources. For example, we may have a PVE server running a web server, database server, firewall, and log management system—all as discrete virtual machines. Rather than consuming the physical space, resources, and labor of maintaining four physical machines, we simply make physical room for the single Proxmox VE server and configure an appropriate virtual LAN as necessary.

In a white paper entitled Putting Server Virtualization to Work, AMD articulates well the benefits of virtualization to businesses and developers (https://www.amd.com/Documents/32951B_Virtual_WP.pdf):

Top 5 business benefits of virtualization:
Increases server utilization
Improves service levels
Streamlines manageability and security
Decreases hardware costs
Reduces facility costs

The benefits of virtualization for a development and test environment:
Lowers capital and space requirements
Lowers power and cooling costs
Increases efficiencies through shorter test cycles
Faster time-to-market

To these benefits, let's add portability and encapsulation: the unique ability to migrate a live VM from one PVE host to another—without suffering a service outage. Proxmox VE makes the creation and control of virtual machines possible through the combined use of two free and open source technologies: Kernel-based Virtual Machine (or KVM) and Quick Emulator (QEMU). Used together, we refer to this integration of tools as KVM-QEMU.

KVM
KVM has been an integral part of the Linux kernel since February 2007. This kernel module allows GNU/Linux users and administrators to take advantage of an architecture's hardware virtualization extensions; for our purposes, these extensions are AMD's AMD-V and Intel's VT-x for the x86_64 architecture. To really make the most of Proxmox VE's feature set, you'll therefore very much want to install on an x86_64 machine with a CPU with integrated virtualization extensions. For a full list of AMD and Intel processors supported by KVM, visit Intel at http://ark.intel.com/Products/VirtualizationTechnology or AMD at http://support.amd.com/en-us/kb-articles/Pages/GPU120AMDRVICPUsHyperVWin8.aspx.
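A quick way to confirm those extensions are present before installing is to query /proc/cpuinfo on the target machine. Run over SSH from any workstation (the host name is a placeholder); a result greater than zero means VT-x (vmx) or AMD-V (svm) is exposed to the operating system:

# Count CPU flags indicating hardware virtualization support
ssh root@pve01 "egrep -c '(vmx|svm)' /proc/cpuinfo"

Note that a zero here can also mean the extensions are merely disabled in the firmware, so check the BIOS/UEFI settings before ruling a machine out.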
QEMU
QEMU provides an emulation and virtualization interface that can be scripted or otherwise controlled by a user.

Visualizing the relationship between KVM and QEMU

Without Proxmox VE, we could essentially define the hardware, create a virtual disk, and start and stop a virtualized server from the command line using QEMU. Alternatively, we could rely on any one of an array of GUI frontends for QEMU (a list of GUIs available for various platforms can be found at http://wiki.qemu.org/Links#GUI_Front_Ends). Of course, working with these solutions is productive only if you're interested in what goes on behind the scenes in PVE when virtual machines are defined. Proxmox VE's management of virtual machines is itself managing QEMU through its API. Managing QEMU from the command line can be tedious. The following is a line from a script that launched Raspbian, a Debian remix intended for the architecture of the Raspberry Pi, on an x86 Intel machine running Ubuntu. When we see how easy it is to manage VMs from Proxmox VE's administrative interfaces, we'll sincerely appreciate that relative simplicity:

qemu-system-arm -kernel kernel-qemu -cpu arm1176 -m 256 -M versatilepb -no-reboot -serial stdio -append "root=/dev/sda2 panic=1" -hda ./$raspbian_img -hdb swap

If you're familiar with QEMU's emulation features, it's perhaps important to note that we can't manage emulation through the tools and features Proxmox VE provides—despite its reliance on QEMU. From a bash shell provided by Debian, it's possible. However, the emulation can't be controlled through PVE's administration and management interfaces.

Containerization with Proxmox VE
Containers are a class of virtual machines (as containerization has enjoyed a renaissance since 2005, the term OS virtualization has become synonymous with containerization and is often used for clarity). However, by way of contrast with VMs, containers share operating system components, such as libraries and binaries, with the host operating system; a virtual machine does not.

Visually contrasting virtual machines with containers

The container advantage
This arrangement potentially allows a container to run leaner and with fewer hardware resources borrowed from the host. For many authors, pundits, and users, containers also offer a demonstrable advantage in terms of speed and efficiency. (However, it should be noted here that as resources such as RAM and more powerful CPUs become cheaper, this advantage will diminish.) The Proxmox VE container is made possible through LXC from version 4.0 (it was made possible through OpenVZ in previous PVE versions). LXC is the third fundamental technology serving Proxmox VE's ultimate interest. Like KVM and QEMU, LXC (or Linux Containers) is an open source technology. It allows a host to run, and an administrator to manage, multiple operating system instances as isolated containers on a single physical host. Conceptually, then, a container very clearly represents a class of virtualization rather than an opposing concept. Nevertheless, it's helpful to maintain a clear distinction between a virtual machine and a container as we come to terms with PVE. The ideal implementation of a Proxmox VE guest is contingent on our distinguishing and choosing between a virtual-machine solution and a container solution. Since Proxmox VE containers share components of the host operating system and can offer advantages in terms of efficiency, this text will guide you through the creation of containers whenever the intended guest can be fully realized with Debian Jessie as our hypervisor's operating system without sacrificing features.
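On a PVE node itself, the two guest types are managed by separate command-line tools, which makes the distinction concrete. A sketch, again over SSH to a hypothetical node:

# KVM-QEMU virtual machines on this node
ssh root@pve01 "qm list"

# LXC containers on this node
ssh root@pve01 "pct list"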
When our intent is a guest running a Microsoft Windows operating system, for example, a Proxmox VE container ceases to be a solution. In such a case, we turn, instead, to creating a virtual machine. We must rely on a VM precisely because the operating system components that Debian can share with a Linux container are not components a Microsoft Windows operating system can make use of.

Summary

In this article, we have come to terms with the three open source technologies that provide Proxmox VE's foundational features: containerization and virtualization with LXC, KVM, and QEMU. Along the way, we've come to understand that containers, while being a type of virtualization, have characteristics that distinguish them from virtual machines. These differences will be crucial as we determine which technology to rely on for a virtual server solution with Proxmox VE.

Resources for Article:

Further resources on this subject:
- Deploying App-V 5 in a Virtual Environment [article]
- Setting Up a Spark Virtual Environment [article]
- Basic Concepts of Proxmox Virtual Environment [article]

Working with VMware Infrastructure

Packt
04 Mar 2015
21 min read
In this article by Daniel Langenhan, the author of VMware vRealize Orchestrator Cookbook, we will take a closer look at how Orchestrator interacts with vCenter Server and vRealize Automation (vRA—formerly known as vCloud Automation Center, vCAC). vRA uses Orchestrator to access and automate infrastructure using Orchestrator plugins. We will take a look at how to make Orchestrator workflows available to vRA. We will investigate the following recipes:

- Unmounting all the CD-ROMs of all VMs in a cluster
- Provisioning a VM from a template
- An approval process for VM provisioning

(For more resources related to this topic, see here.)

There are quite a lot of plugins for Orchestrator to interact with VMware infrastructure and programs:

- vCenter Server
- vCloud Director (vCD)
- vRealize Automation (vRA—formerly known as vCloud Automation Center, vCAC)
- Site Recovery Manager (SRM)
- VMware Auto Deploy
- Horizon (View and Virtual Desktops)
- vRealize Configuration Manager (earlier known as vCenter Configuration Manager)
- vCenter Update Manager
- vCenter Operations Manager, vCOps (only example packages)

VMware, as of the writing of this article, is still renaming its products. An overview of all plugins and their names and download links can be found at http://www.vcoteam.info/links/plug-ins.html. There are quite a lot of plugins, and we will not be able to cover all of them, so we will focus on the one that is most used, vCenter. Sadly, vCloud Director is earmarked by VMware to disappear for everyone but service providers, so there is no real need to show any workflow for it. We will also work with vRA and see how it interacts with Orchestrator.

vSphere automation

The interaction between Orchestrator and vCenter is done using the vCenter API. Here is the explanation of the interaction, which you can refer to in the following figure. A user starts an Orchestrator workflow (1) either in an interactive way via the vSphere Web Client, the Orchestrator Web Operator, or the Orchestrator Client, or via the API. The workflow in Orchestrator will then send a job (2) to vCenter and receive a task ID back (type VC:Task). vCenter will then start enacting the job (3). Using the vim3WaitTaskEnd action (4), Orchestrator pauses until the task has been completed. If we do not use the wait task, we can't be certain whether the task has ended or failed. It is extremely important to use the vim3WaitTaskEnd action whenever we send a job to vCenter. When the wait task reports that the job has finished, the workflow will be marked as finished.

The vCenter MoRef

The MoRef (Managed Object Reference) is a unique ID for every object inside vCenter. MoRefs are basically strings; some examples are shown here:

- VM: vm-301
- Network: network-312 (a distributed port group appears as, for example, dvportgroup-242)
- Datastore: datastore-101
- ESXi host: host-44
- Data center: datacenter-21
- Cluster: domain-c41

The MoRefs are typically stored in the attribute .id or .key of the Orchestrator API object. For example, the MoRef of a vSwitch Network is VC:Network.id. To browse for MoRefs, you can use the Managed Object Browser (MOB), documented at https://pubs.vmware.com/vsphere-55/index.jsp#com.vmware.wssdk.pg.doc/PG_Appx_Using_MOB.20.1.html.

The vim3WaitTaskEnd action

As already said, vim3WaitTaskEnd is one of the most central actions while interacting with vCenter.
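To make the pattern concrete, here is a minimal scriptable-task sketch that submits a reconfiguration job and hands the returned task to the wait action. The module path shown is where vim3WaitTaskEnd usually lives, but treat it as an assumption and verify it in your Orchestrator client; the new VM name is a placeholder:

// Sketch only: rename a VM and wait for the resulting vCenter task.
// Assumes vm is an in-parameter of type VC:VirtualMachine and that
// vim3WaitTaskEnd lives in its usual module (verify the path).
var spec = new VcVirtualMachineConfigSpec();
spec.name = "renamed-vm";                      // placeholder value
var vcTask = vm.reconfigVM_Task(spec);         // returns a VC:Task immediately
// Poll every 5 seconds; false = don't log progress percentages
var actionResult = System.getModule("com.vmware.library.vc.basic")
    .vim3WaitTaskEnd(vcTask, false, 5);
System.log("Task ended with result: " + actionResult);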
The action has the following variables:

- vcTask (IN, VC:Task): carries the reconfiguration task from the script to the wait task
- progress (IN, Boolean): if true, the progress of the task is written to the logs as a percentage
- pollRate (IN, Number): how often the action should check vCenter for task completion
- ActionResult (OUT, Any): returns the task's result

The wait task will check, at regular intervals (pollRate), the status of a task that has been submitted to vCenter. The task can have the following states:

- Queued: the task is queued and will be executed as soon as possible.
- Running: the task is currently running. If progress is set to true, the progress in percentage will be displayed in the logs.
- Success: the task finished successfully.
- Error: the task has failed and an error will be thrown.

Other vCenter wait actions

There are actually five waiting tasks that come with the vCenter Server plugin. Here's an overview of the other four:

- vim3WaitToolsStarted: waits until the VMware Tools are started on a VM or until a timeout is reached.
- vim3WaitForPrincipalIP: waits until the VMware Tools report the primary IP of a VM or until a timeout is reached. This typically indicates that the operating system is ready to receive network traffic. The action will return the primary IP.
- vim3WaitDnsNameInTools: waits until the VMware Tools report a given DNS name of a VM or until a timeout is reached. The in-parameter addNumberToName is not used and can be set to Null.
- WaitTaskEndOrVMQuestion: waits until a task is finished or the VM raises a question. A vCenter question requires user interaction.

vRealize Automation (vRA)

Automation has changed since the beginning of Orchestrator. Before tools such as vCloud Director or vCloud Automation Center (vCAC)/vRealize Automation (vRA) existed, Orchestrator was the main tool for automating vCenter resources. With version 6.2 of vCloud Automation Center (vCAC), the product has been renamed vRealize Automation. Now vRA is deemed to become the central cornerstone in the VMware automation effort. vRealize Orchestrator (vRO) is used by vRA to interact with and automate VMware and non-VMware products and infrastructure elements.

Throughout the various vCAC/vRA releases, the role of Orchestrator has changed substantially. Orchestrator started off as an extension to vCAC and became a central part of vRA:

- In vCAC 5.x, Orchestrator was only an extension of the IaaS life cycle; it was tied in using the stubs.
- vCAC 6.0 integrated Orchestrator as an XaaS (Everything as a Service) provider using the Advanced Service Designer (ASD).
- In vCAC 6.1, Orchestrator is used to perform all VMware NSX operations (VMware's new network virtualization and automation), meaning that it became even more of a central part of the IaaS services.
- With vCAC 6.2, the Advanced Service Designer was enhanced to allow more complex form designs, allowing better leverage of Orchestrator workflows.

As you can see in the following figure, vRA connects to the vCenter Server using an infrastructure endpoint that allows vRA to conduct basic infrastructure actions, such as power operations, cloning, and so on. It doesn't allow any complex interactions with the vSphere infrastructure, such as HA configurations. Using the Advanced Service Endpoints, vRA integrates the Orchestrator (vRO) plugins as additional services. This allows vRA to offer the entire plugin infrastructure as services to its tenants.
The vCenter Server, AD, and PowerShell plugins are typical integrations that are used with vRA. Using the Advanced Service Designer (ASD), you can create integrations that use Orchestrator workflows. ASD allows you to offer Orchestrator workflows as vRA catalog items, making it possible for tenants to access any IT service that can be configured with Orchestrator via its plugins. The following diagram shows an example using the Active Directory plugin. The Orchestrator plugin provides access to the AD services. By creating a custom resource using the exposed AD infrastructure, we can create a service blueprint and resource actions, both of which are based on Orchestrator workflows that use the AD plugin.

The other method of integrating Orchestrator into the IaaS life cycle, which was predominantly used in vCAC 5.x, is to use the stubs. The build process of a VM has several steps; each step can be assigned a customizable workflow (called a stub). You can configure vRA to run an Orchestrator workflow at these stubs in order to facilitate a few customized actions. Such actions could change a VM's HA or DRS configuration, or use the guest integration to install or configure a program on the VM.

Installation

How to install and configure vRA is outside the scope of this article, but take a look at http://www.kendrickcoleman.com/index.php/Tech-Blog/how-to-install-vcloud-automation-center-vcac-60-part-1-identity-appliance.html for more information. If you don't have the hardware or the time to install vRA yourself, you can use the VMware Hands-on Labs, which can be accessed after clicking on Try for Free at http://hol.vmware.com.

The vRA Orchestrator plugin

Due to the renaming, the vRA plugin is called vRealize Orchestrator vRA Plug-in 6.2.0; however, the file you download and use is named o11nplugin-vcac-6.2.0-2287231.vmoapp. The plugin currently creates a workflow folder called vCloud Automation Center.

vRA-integrated Orchestrator

The vRA appliance comes with an installed and configured vRO instance; however, the best practice for a production environment is to use a dedicated Orchestrator installation; even better would be an Orchestrator cluster.

Dynamic Types or XaaS

XaaS means Everything (X) as a Service. The introduction of Dynamic Types in Orchestrator version 5.5.1 does exactly that; it allows you to build your own plugins and interact with infrastructure that has not yet received its own plugin. Take a look at this article by Christophe Decanini, which integrates Twitter with Orchestrator using Dynamic Types: http://www.vcoteam.info/articles/learn-vco/282-dynamic-types-tutorial-implement-your-own-twitter-plug-in-without-any-scripting.html.

Read more…

To read more about Orchestrator integration with vRA, please take a look at the official VMware documentation. Please note that the official documentation you need to look at is about vRealize Automation, and not about vCloud Automation Center; as of the writing of this article, the documentation can be found at https://www.vmware.com/support/pubs/vrealize-automation-pubs.html.

- The document called Advanced Service Design deals with vRO and the Advanced Service Designer
- The document called Machine Extensibility discusses customization using stubs

Unmounting all the CD-ROMs of all VMs in a cluster

This is an easy recipe to start with, but one you can really put to work in your existing infrastructure. The workflow will unmount all CD-ROMs from a running VM. A mounted CD-ROM may block a VM from being vMotioned.
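The recipe that follows builds this with standard workflow elements. As a point of reference, the same logic expressed as a single scriptable task might look roughly like the sketch below; the module paths are assumptions to verify against your environment, and the logic edits the CD-ROM devices directly rather than calling the Disconnect all detachable devices workflow used in the recipe:

// Sketch only: disconnect connected CD-ROM drives on every VM in a cluster.
// Assumes cluster is an in-parameter of type VC:ClusterComputeResource and
// that the named actions live in their usual modules (verify the paths).
var vms = System.getModule("com.vmware.library.vc.vm").getAllVMsOfCluster(cluster);
for each (var vm in vms) {
    var changes = [];
    for each (var device in vm.config.hardware.device) {
        if (device instanceof VcVirtualCdrom &&
                device.connectable != null && device.connectable.connected) {
            // Mark the device as disconnected and build an edit spec for it
            device.connectable.connected = false;
            device.connectable.startConnected = false;
            var edit = new VcVirtualDeviceConfigSpec();
            edit.operation = VcVirtualDeviceConfigSpecOperation.edit;
            edit.device = device;
            changes.push(edit);
        }
    }
    if (changes.length > 0) {
        var spec = new VcVirtualMachineConfigSpec();
        spec.deviceChange = changes;
        var vcTask = vm.reconfigVM_Task(spec);
        // Always wait for the vCenter task, as discussed above
        System.getModule("com.vmware.library.vc.basic").vim3WaitTaskEnd(vcTask, false, 5);
        System.log("Disconnected CD-ROM(s) on " + vm.name);
    }
}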
Getting ready

We need a VM that can mount a CD-ROM either as an ISO from a host or from the client. Before you start the workflow, make sure that the VM is powered on and has an ISO connected to it.

How to do it...

Create a new workflow with the following variables:

- cluster (VC:ClusterComputeResource, IN): used to input the cluster
- clusterVMs (Array of VC:VirtualMachine, Attribute): used to capture all the VMs in the cluster

Add the getAllVMsOfCluster action to the schema; bind the cluster in-parameter as its input and the clusterVMs attribute as its actionResult. Now, add a Foreach element to the schema and assign it the workflow Disconnect all detachable devices from a running virtual machine. Assign clusterVMs to the Foreach element as a parameter. Save and run the workflow.

How it works...

This recipe shows how fast and easily you can design solutions that help you with everyday vCenter problems. The problem is that VMs that have CD-ROMs or floppies mounted may experience problems using vMotion, making it impossible for them to be used with DRS. The reality is that a lot of admins mount CD-ROMs and then forget to disconnect them. Scheduling this script every evening just before the nighttime backups will make sure that a production cluster is able to make full use of DRS and is therefore better load-balanced. You can improve this workflow by integrating an exclusion list.

See also

Refer to the example workflow, 7.01 UnMount CD-ROM from Cluster.

Provisioning a VM from a template

In this recipe, we will build a deployment workflow for Windows and Linux VMs. We will learn how to create workflows and reduce the number of input variables.

Getting ready

We need a Linux or Windows template that we can clone and provision.

How to do it…

We have split this recipe into two sections. In the first section, we will create a configuration element, and in the second, we will create the workflow.

Creating a configuration

We will use a configuration for all reusable variables. Build a configuration element that contains the following items:

- productId (String): the Windows product ID—the licensing code
- joinDomain (String): the FQDN of the Windows domain to join
- domainAdmin (Credential): the credentials used to join the domain
- licenseMode (VC:CustomizationLicenseDataMode): for example, perServer
- licenseUsers (Number): the number of licensed concurrent users
- inTimezone (Enums:MSTimeZone): the time zone
- fullName (String): the full name of the user
- orgName (String): the organization name
- newAdminPassword (String): the new administrator password
- dnsServerList (Array of String): the list of DNS servers
- dnsDomain (String): the DNS domain
- gateway (Array of String): the list of gateways

Creating the base workflow

Now we will create the base workflow. Create the workflow as shown in the following figure by adding the given elements:

- Clone, Windows with single NIC and credential
- Clone, Linux with single NIC
- Custom decision

Use the Clone, Windows… workflow to create all variables. Link up the ones that you have defined in the configuration as attributes.
The rest are defined as follows:

- vmName (String, IN): the new virtual machine's name
- vm (VC:VirtualMachine, IN): the virtual machine to clone
- folder (VC:VmFolder, IN): the virtual machine folder
- datastore (VC:Datastore, IN): the datastore in which you store the virtual machine
- pool (VC:ResourcePool, IN): the resource pool in which you create the virtual machine
- network (VC:Network, IN): the network to which you attach the virtual network interface
- ipAddress (String, IN): the fixed valid IP address
- subnetMask (String, IN): the subnet mask
- template (Boolean, Attribute): whether to mark the new VM as a template; set the value to No
- powerOn (Boolean, Attribute): whether to power on the VM after creation; set the value to Yes
- doSysprep (Boolean, Attribute): whether to run Windows Sysprep; set the value to Yes
- dhcp (Boolean, Attribute): whether to use DHCP; set the value to No
- newVM (VC:VirtualMachine, OUT): the newly created VM

The following sub-workflow in-parameters will be set to special values.

For Clone, Windows with single NIC and credential:
- host: Null
- joinWorkgroup: Null
- macAddress: Null
- netBIOS: Null
- primaryWINS: Null
- secondaryWINS: Null
- name: vmName
- clientName: vmName

For Clone, Linux with single NIC:
- host: Null
- macAddress: Null
- name: vmName
- clientName: vmName

Define the in-parameter vm as input for the Custom decision and add the following script. The script checks whether the name of the guest OS contains the word Microsoft:

var guestOS = vm.config.guestFullName;
System.log(guestOS);
if (guestOS.indexOf("Microsoft") >= 0) {
    return true;
} else {
    return false;
}

Save and run the workflow. This workflow will now create a new VM from an existing VM and customize it with a fixed IP.

How it works…

As you can see, creating workflows to automate vCenter deployments is pretty straightforward. Dealing with the various in-parameters of workflows can be quite overwhelming. The best way to deal with this problem is to hide variables away by defining them centrally using a configuration, or to define them locally as attributes. Using configurations has the advantage that you can create them once and reuse them as needed. You can even push the concept a bit further by defining multiple configurations for multiple purposes, such as different environments.

While creating a new workflow for automation, a typical approach is as follows:

- Look for a workflow that you need.
- Run the workflow normally to check out what it actually does.
- Either create a new workflow that uses the original, or duplicate and edit the one you tried, modifying it until it does what you want.

A fast way to deal with a lot of variables is to drag every element you need into the schema and then use the binding to create the variables as needed. You may have noticed that this workflow only lets you select vSwitch networks, not distributed vSwitch networks. You can improve this workflow with the following features:

- Read the existing Sysprep information stored in your vCenter Server
- Generate different predefined configurations (for example, DEV or Prod)

There's more...

We can improve the workflow by implementing the ability to change the vCPU count and the memory of the VM. Follow these steps to implement it:

Move the out-parameter newVM to be an attribute.
Add the following variables:

- vCPU (Number, IN): the number of vCPUs
- memory (Number, IN): the amount of VM memory
- vcTask (VC:Task, Attribute): carries the reconfiguration task from the script to the wait task
- progress (Boolean, Attribute): value No, for vim3WaitTaskEnd
- pollRate (Number, Attribute): value 5, for vim3WaitTaskEnd
- ActionResult (Any, Attribute): for vim3WaitTaskEnd

Add the following actions and workflows according to the next figure:

- shutdownVMAndForce
- changeVMvCPU
- vim3WaitTaskEnd
- changeVMRAM
- Start virtual machine

Bind newVM to all the appropriate input parameters of the added actions and workflows. Bind the actionResult (VC:Task) of each change action to the corresponding vim3WaitTaskEnd.

See also

Refer to the example workflows, 7.02.1 Provision VM (Base) and 7.02.2 Provision VM (HW custom), as well as the configuration element, 7 VM provisioning.

An approval process for VM provisioning

In this recipe, we will see how to create a workflow that waits for an approver to approve the VM creation before provisioning it. We will learn how to combine mail and external events in a workflow to make it interact with different users.

Getting ready

For this recipe, we first need the provisioning workflow that we created in the Provisioning a VM from a template recipe. You can use the example workflow, 7.02.1 Provision VM (Base). Additionally, we need a functional e-mail system as well as a workflow to send e-mails. You can use the example workflow, 4.02.1 SendMail, as well as its configuration item, 4.2.1 Working with e-mail.

How to do it…

We will split this recipe into three parts. First, we will create a configuration element; then, we will create the workflow; and lastly, we will use a presentation to make the workflow usable.

Creating a configuration element

We will use a configuration for all reusable variables. Build a configuration element that contains the following items:

- templates (Array/VC:VirtualMachine): all the VMs that serve as templates
- folders (Array/VC:VmFolder): all the VM folders that are targets for VM provisioning
- networks (Array/VC:Network): all the VM networks that are targets for VM provisioning
- resourcePools (Array/VC:ResourcePool): all the resource pools that are targets for VM provisioning
- datastores (Array/VC:Datastore): all the datastores that are targets for VM provisioning
- daysToApproval (Number): the number of days the approval should be available for
- approver (String): the e-mail address of the approver

Please note that you also have to define or use the configuration elements for SendMail, as well as the Provision VM workflows. You can use the examples contained in the example package.
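Workflow attributes are normally linked to a configuration element directly in the workflow editor, as we do in the next step. If you ever need to read such values from a scriptable task instead, the sketch below shows one way to do it; the category path and element name are assumptions based on this article's example package:

// Sketch only: reading approval settings from a configuration element.
// The "7 approval" category path and element name are assumptions;
// prefer binding attributes to the configuration in the workflow editor.
var category = Server.getConfigurationElementCategoryWithPath("7 approval");
var element = null;
for each (var e in category.configurationElements) {
    if (e.name == "7 approval") {
        element = e;
    }
}
var approver = element.getAttributeWithKey("approver").value;
var daysToApproval = element.getAttributeWithKey("daysToApproval").value;
System.log("Approvals go to " + approver + " and stay open for " + daysToApproval + " days");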
Creating a workflow

Create a new workflow and add the following variables:

- mailRequester (String, IN): the e-mail address of the requester
- vmName (String, IN): the name of the new virtual machine
- vm (VC:VirtualMachine, IN): the virtual machine to be cloned
- folder (VC:VmFolder, IN): the virtual machine folder
- datastore (VC:Datastore, IN): the datastore in which you store the virtual machine
- pool (VC:ResourcePool, IN): the resource pool in which you create the virtual machine
- network (VC:Network, IN): the network to which you attach the virtual network interface
- ipAddress (String, IN): the fixed valid IP address
- subnetMask (String, IN): the subnet mask
- isExternalEvent (Boolean, Attribute): a value of true defines this event as external
- mailApproverSubject (String, Attribute): the subject line of the mail sent to the approver
- mailApproverContent (String, Attribute): the content of the mail that is sent to the approver
- mailRequesterSubject (String, Attribute): the subject line of the mail sent to the requester when the VM is provisioned
- mailRequesterContent (String, Attribute): the content of the mail that is sent to the requester when the VM is provisioned
- mailRequesterDeclinedSubject (String, Attribute): the subject line of the mail sent to the requester when the VM is declined
- mailRequesterDeclinedContent (String, Attribute): the content of the mail that is sent to the requester when the VM is declined
- eventName (String, Attribute): the name of the external event
- endDate (Date, Attribute): the end date for the wait for the external event
- approvalSuccess (Boolean, Attribute): whether the VM has been approved

Now add all the attributes we defined in the configuration element and link them to the configuration. Create the workflow as shown in the following figure by adding the given elements:

- Scriptable task
- 4.02.1 SendMail (example workflow)
- Wait for custom event
- Decision
- Provision VM (example workflow)

Edit the scriptable task and bind the following variables to it:

- In: vmName, ipAddress, mailRequester, template, approver, daysToApproval
- Out: mailApproverSubject, mailApproverContent, mailRequesterSubject, mailRequesterContent, mailRequesterDeclinedSubject, mailRequesterDeclinedContent, eventName, endDate

Add the following script to the scriptable task:

//construct event name
eventName = "provision-" + vmName;

//add days to today for approval
var today = new Date();
endDate = new Date(today);
endDate.setDate(today.getDate() + daysToApproval);

//construct external URL for approval
var myURL = new URL();
myURL = System.customEventUrl(eventName, false);
externalURL = myURL.url;

//mail to approver
mailApproverSubject = "Approval needed: " + vmName;
mailApproverContent = "Dear Approver,\n the user " + mailRequester +
    " would like to provision a VM from template " + template.name +
    ".\n To approve please click here: " + externalURL;

//VM provisioned
mailRequesterSubject = "VM ready: " + vmName;
mailRequesterContent = "Dear Requester,\n the VM " + vmName +
    " has been provisioned and is now available under IP: " + ipAddress;

//declined
mailRequesterDeclinedSubject = "Declined: " + vmName;
mailRequesterDeclinedContent = "Dear Requester,\n the VM " + vmName +
    " has been declined by " + approver;

Bind the out-parameter of Wait for custom event to approvalSuccess. Configure the Decision element with approvalSuccess as true. Bind all the other variables to the workflow elements.
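The SendMail example workflow wraps Orchestrator's Mail plug-in. If you would rather send the approver notification straight from a scriptable task, the plug-in's EmailMessage object can be used along the following lines; the SMTP host and from-address are placeholders, and in practice these settings belong in a configuration element such as 4.2.1 Working with e-mail:

// Sketch only: sending the approver mail directly via the Mail plug-in.
var message = new EmailMessage();
message.smtpHost = "smtp.example.com";               // placeholder SMTP relay
message.fromAddress = "orchestrator@example.com";    // placeholder sender
message.toAddress = approver;
message.subject = mailApproverSubject;
message.addMimePart(mailApproverContent, "text/plain");
message.sendMessage();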
Improving with the presentation

We will now edit the workflow's presentation in order to make it workable for the requester. To do so, click on Presentation and follow the steps to alter the presentation, as seen in the following screenshot. Add the property Predefined list of elements to the following in-parameters, with these values:

- template: #templates
- folder: #folders
- datastore: #datastores
- pool: #resourcePools
- network: #networks

You can now use the General tab of each in-parameter to change the displayed text. Save and close the workflow.

How it works…

This is a very simplified example of an approval workflow to create VMs. The aim of this recipe is to introduce you to the method and ideas behind building such a workflow. This workflow will only give a requester the choices that are configured in the configuration element, making the workflow quite safe for users who have only limited know-how of the IT environment.

When the requester submits the workflow, an e-mail is sent to the approver. The e-mail contains a link which, when clicked, triggers the external event and approves the VM. If the VM is approved, it will be provisioned, and when the provisioning has finished, an e-mail is sent to the requester stating that the VM is now available. If the VM is not approved within a certain timeframe, the requester will receive an e-mail saying that the VM was not approved.

To make this workflow fully functional, you can add permissions for a requester group to the workflow and Orchestrator so that users can request a VM via vCenter. Things you can do to improve the workflow are as follows:

- Schedule the provisioning for a future date.
- Use resource elements for the e-mail content and replace the placeholders as needed.
- Add an error workflow in case the provisioning fails.
- Use AD to read out the current user's e-mail address and full name.
- Create a workflow that lets an approver configure the configuration elements that a requester can choose from.
- Reduce the selections by creating, for instance, a development and a production configuration that contain the correct folders, datastores, networks, and so on.
- Create a decommissioning workflow that is automatically scheduled so that the VM is destroyed after a given period of time.

See also

Refer to the example workflow, 7.03 Approval, and the configuration element, 7 approval.

Summary

In this article, we discussed one of the important aspects of the interaction of Orchestrator with vCenter Server and vRealize Automation, namely VM provisioning.

Resources for Article:

Further resources on this subject:
- Importance of Windows RDS in Horizon View [article]
- Metrics in vRealize Operations [article]
- Designing and Building a Horizon View 6.0 Infrastructure [article]

New for 2020 in operations and infrastructure engineering

Richard Gall
19 Dec 2019
5 min read
It’s an exciting time if you work in operations and software infrastructure. Indeed, you could even say that as the pace of change and innovation increases, your role only becomes more important. Operations and systems engineers, solution architects, everyone - your jobs are all about bringing stability, order, and control into what can sometimes feel like chaos.

As anyone who’s been working in the industry knows, managing change, from a personal perspective, requires a lot of effort. To keep on top of what’s happening - what tools are being released and updated, what approaches are gaining traction - you need to have one eye on the future and another on the wider industry. To help you with that challenge and get you ready for 2020, we’ve put together a list of what’s new for 2020 - and what you should start learning.

Learn how to make Kubernetes work for you

It goes without saying that Kubernetes was huge in 2019. But there are plenty of murmurs and grumblings that it’s too complicated and adds an additional burden for engineering and operations teams. To a certain extent there’s some truth in this - and arguably now would be a good time to accept that just because it seems like everyone is using Kubernetes, it doesn’t mean it’s the right solution for you.

However, having said that, 2020 will be all about understanding how to make Kubernetes relevant to you. This doesn’t mean you should just drop the way you work and start using Kubernetes, but it does mean that spending some time with the platform and getting a better sense of how it could be used in the future is a useful way to spend your learning time in 2020. Explore Packt's extensive range of Kubernetes eBooks and videos on the Packt store.

Learn how to architect

If software has eaten the world, then by the same token perhaps complexity has well and truly eaten software as we know it. Indeed, Kubernetes is arguably just one of the symptoms and causes of this complexity. Another is the growing demand for architects in engineering and IT teams. There are a number of different ‘architecture’ job roles circulating across the industry, from solutions architect to application architect. While they each have their own subtle differences, and will even vary from company to company, they’re all roles that are about organizing and managing different pieces into something that is both stable and value-driving.

Cloud has been particularly instrumental in making architect roles more prominent in the industry. As organizations look to resist the pitfalls of lock-in and better manage resources (financial and otherwise), it will be down to architects to balance business and technology concerns carefully.

Learn how to architect cloud native applications. Read Architecting Cloud Computing Solutions. Get to grips with everything you need to know to be a software architect. Pick up Software Architect's Handbook.

Artificial intelligence

It’s strange that the hype around AI doesn’t seem to have reached the world of ops. Perhaps this is because the area is more resistant to the spin that comes with AI, preferring instead to focus more on the technical capabilities of tools and platforms. Whatever the case, it’s nevertheless true that AI will play an important part in how we manage and secure infrastructure. From monitoring system health, to automating infrastructure deployments and configuration, and even identifying security threats, artificial intelligence is already an important component for operations engineers and others.
Indeed, artificial intelligence is being embedded inside products and platforms that ops teams are using - this means the need to ‘learn’ artificial intelligence is somewhat reduced. But it would be wrong to think it’s something that can just be managed from a dashboard. In 2020 it will be essential to better understand where and how artificial intelligence can fit into your operations and architectural toolchain. Find artificial intelligence eBooks and videos in Packt's collection of curated data science bundles.

Observability, monitoring, tracing, and logging

One of the challenges of software complexity is understanding exactly what’s going on under the hood. Yes, the network might be unreliable, as the saying goes, but what makes things even worse is that we’re not even sure why. This is where observability and the next generation of monitoring, logging, and tracing all come into play.

Having detailed insights into how applications and infrastructures are performing, how resources are being managed, and what things are actually causing problems is vitally important from a team perspective. Without the ability to understand these things, knowledge becomes siloed inside the brains of specific engineers, and you start to have points of failure at a personnel level.

There are, of course, a wide range of tools and products available that can make monitoring and tracing easy (or easier, at least). But understanding which ones are right for your needs still requires some time learning and exploring the options out there. Make sure you do exactly that in 2020. Learn how to monitor distributed systems with Learn Centralized Logging and Monitoring with Kubernetes.

Making serverless a reality

We’ve talked about serverless a lot this year. But as a concept there’s still considerable confusion about what role it should play in modern DevOps processes. Indeed, even the nomenclature is a little confusing. Platforms using their own terminology, such as ‘lambdas’ and ‘functions’, only add to the sense that serverless is something amorphous and hard to pin down.

So, in 2020, we need to work out how to make serverless work for us. Just as we need to consider how Kubernetes might be relevant to our needs, we need to consider in what ways serverless represents both a technical and business opportunity. Search Packt's library for the latest serverless eBooks and videos. Explore more technology eBooks and videos on the Packt store.