How-To Tutorials - Cloud Computing


Monitoring Physical Network Bandwidth Using OpenStack Ceilometer

Packt
14 Oct 2015
9 min read
In this article by Chandan Dutta Chowdhury and Sriram Subramanian, authors of the book OpenStack Networking Cookbook, we will explore various means to monitor network resource utilization using Ceilometer.

Introduction

Due to the dynamic nature of virtual infrastructure and multitenancy, OpenStack administrators need to monitor the resources used by tenants. The resource utilization data can be used to bill the users of a public cloud and to debug infrastructure-related problems. Administrators can also use the utilization reports for better capacity planning. In this post, we will look at OpenStack Ceilometer to meter the physical network resource utilization.

Ceilometer components

The OpenStack Ceilometer project provides you with telemetry services. It can collect the statistics of resource utilization and also provide threshold monitoring and alarm services. The Ceilometer project is composed of the following components:

- ceilometer-agent-central: This component is a centralized agent that collects the monitoring statistics by polling individual resources.
- ceilometer-agent-notification: This agent runs on a central server and consumes notification events on the OpenStack message bus.
- ceilometer-collector: This component is responsible for dispatching events to the data stores.
- ceilometer-api: This component provides the REST API server that the Ceilometer clients interact with.
- ceilometer-agent-compute: This agent runs on each compute node and collects resource utilization statistics for the local node.
- ceilometer-alarm-notifier: This component runs on the central server and is responsible for sending alarms.
- ceilometer-alarm-evaluator: This component monitors the resource utilization and evaluates the conditions to raise an alarm when the resource utilization crosses predefined thresholds.

Installation

An installation of any typical OpenStack service consists of the following steps:

1. Creating a catalog entry in Keystone for the service
2. Installing the service packages that provide you with a REST API server to access the service

Creating a catalog entry

The following steps describe the creation of a catalog entry for the Ceilometer service:

1. An OpenStack service requires a user associated with it for authorization and authentication. To create a service user, use the following command and provide the password when prompted:

   openstack user create --password-prompt ceilometer

2. This service user will need to have administrative privileges. Use the following command in order to add an administrative role to the service user:

   openstack role add --project service --user ceilometer admin

3. Next, we will create a service record for the metering service:

   openstack service create --name ceilometer --description "Telemetry" metering

4. The final step in the creation of the service catalog is to associate this service with its API endpoints. These endpoints are REST URLs to access the OpenStack service and provide access for public, administrative, and internal use:

   openstack endpoint create --publicurl http://controller:8777 --internalurl http://controller:8777 --adminurl http://controller:8777 --region RegionOne metering
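Before installing the packages, you can confirm that the catalog entry was registered correctly. This quick check is a suggestion rather than part of the original procedure; it assumes you have admin credentials loaded in your shell:

   openstack service list | grep metering
   openstack endpoint list | grep 8777

Both commands should show the Telemetry service and the three endpoints pointing at http://controller:8777.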
Installation of the service packages

Let's now install the packages required to provide the metering service. On an Ubuntu system, use the following command to install the Ceilometer packages:

   apt-get install ceilometer-api ceilometer-collector ceilometer-agent-central ceilometer-agent-notification ceilometer-alarm-evaluator ceilometer-alarm-notifier python-ceilometerclient

Configuration

The configuration of an OpenStack service requires the following information:

- Service database-related details, such as the database server address, database name, and login credentials in order to access the database. Ceilometer uses MongoDB to store the resource monitoring data. A MongoDB database user with appropriate access for this service should be created. Conventionally, the user and database name match the name of the OpenStack service; for example, for the metering service, we will use ceilometer as both the database user and the database name.
- A Keystone user associated with the service (the one we created in the installation section).
- The credentials to connect to the OpenStack message bus, such as RabbitMQ.

All these details need to be provided in the service configuration file. The main configuration file for Ceilometer is /etc/ceilometer/ceilometer.conf. The following is a sample template for the Ceilometer configuration. The placeholders mentioned here must be replaced appropriately:

- CEILOMETER_PASS: This is the password for the Ceilometer service user
- RABBIT_PASS: This is the message queue password for an OpenStack user
- CEILOMETER_DBPASS: This is the MongoDB password for the Ceilometer database
- TELEMETRY_SECRET: This is the secret key used by Ceilometer to sign messages

Let's have a look at the following configuration:

   [DEFAULT]
   ...
   auth_strategy = keystone
   rpc_backend = rabbit

   [keystone_authtoken]
   ...
   auth_uri = http://controller:5000/v2.0
   identity_uri = http://controller:35357
   admin_tenant_name = service
   admin_user = ceilometer
   admin_password = CEILOMETER_PASS

   [oslo_messaging_rabbit]
   ...
   rabbit_host = controller
   rabbit_userid = openstack
   rabbit_password = RABBIT_PASS

   [database]
   ...
   connection = mongodb://ceilometer:CEILOMETER_DBPASS@controller:27017/ceilometer

   [service_credentials]
   ...
   os_auth_url = http://controller:5000/v2.0
   os_username = ceilometer
   os_tenant_name = service
   os_password = CEILOMETER_PASS
   os_endpoint_type = internalURL
   os_region_name = RegionOne

   [publisher]
   ...
   telemetry_secret = TELEMETRY_SECRET

Starting the metering service

Once the configuration is done, make sure that all the Ceilometer components are restarted in order to use the updated configuration:

   # service ceilometer-agent-central restart
   # service ceilometer-agent-notification restart
   # service ceilometer-api restart
   # service ceilometer-collector restart
   # service ceilometer-alarm-evaluator restart
   # service ceilometer-alarm-notifier restart
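At this point it is worth a quick sanity check that the API server is up and talking to its database. This check is not part of the original text and assumes you have admin credentials sourced in your shell; the meter list may well be empty on a fresh installation:

   ceilometer meter-list

If the command returns a (possibly empty) table rather than an authentication or connection error, the metering service is running.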
The Ceilometer compute agent must be installed and configured on the Compute node to monitor the OpenStack virtual resources. In this article, we will focus on using the SNMP daemon running on the Compute and Network nodes to monitor the physical resource utilization data.

Physical network monitoring

The SNMP daemon must be installed and configured on all the OpenStack Compute and Network nodes for which we intend to monitor the physical network bandwidth. The Ceilometer central agent polls the SNMP daemon on the OpenStack Compute and Network nodes to collect the physical resource utilization data.

The SNMP configuration

The following steps describe the process of installing and configuring the SNMP daemon:

1. To configure the SNMP daemon on the OpenStack nodes, you will need to install the following packages:

   apt-get install snmpd snmp

2. Update the SNMP configuration file, /etc/snmp/snmpd.conf, to bind the daemon on all the network interfaces:

   agentAddress udp:161

3. Update the /etc/snmp/snmpd.conf file to create an SNMP community and a view, and to provide access to the required MIBs. The following configuration assumes that the OpenStack nodes, where the Ceilometer central agent runs, have addresses in the 192.168.0.0/24 subnet:

   com2sec OS_net_sec 192.168.0.0/24 OS_community
   group OS_Group any OS_net_sec
   view OS_view included .1 80
   access OS_Group "" any noauth 0 OS_view none none

4. Next, restart the SNMP daemon with the updated configuration:

   service snmpd restart

5. You can verify that the SNMP data is accessible from the Ceilometer central agent by running the following snmpwalk command from the nodes that run the central agent. In this example, 192.168.0.2 is the IP of the Compute node running the SNMP daemon, while the snmpwalk command itself is executed on the node running the central agent. You should see a list of SNMP OIDs and their values:

   snmpwalk -mALL -v2c -cOS_community 192.168.0.2

The Ceilometer data source configuration

Once the SNMP configuration is completed on the Compute and Network nodes, a Ceilometer data source must be configured for these nodes. A pipeline defines the transformation of the data collected by Ceilometer. The pipeline definition consists of a data source and a sink. A source defines the start of the pipeline and is associated with meters, resources, and a collection interval. It also defines the associated sink. A sink defines the end of the transformation pipeline, the transformation applied to the data, and the method used to publish the transformed data. Sending a notification over the message bus is the commonly used publishing mechanism.

To use data samples provided by the SNMP daemon on the Compute and Network nodes, we will configure a meter_snmp data source with a predefined meter_sink in the Ceilometer pipeline on the Central node. We will use the OpenStack Controller node to run the central agent. The data source will also map the central agent to the Compute and Network nodes.

In the /etc/ceilometer/pipeline.yaml file, add the following lines. This configuration will allow the central agent to poll the SNMP daemon on three OpenStack nodes with the IPs 192.168.0.2, 192.168.0.3, and 192.168.0.4:

   - name: meter_snmp
     interval: 60
     resources:
         - snmp://OS_community@192.168.0.2
         - snmp://OS_community@192.168.0.3
         - snmp://OS_community@192.168.0.4
     meters:
         - "hardware.cpu*"
         - "hardware.memory*"
         - "hardware.disk*"
         - "hardware.network*"
     sinks:
         - meter_sink

Restart the Ceilometer central agent in order to load the pipeline configuration change:

   # service ceilometer-agent-central restart

Sample network usage data

With the SNMP daemon configured, the data source in place, and the central agent reloaded, the statistics of the physical network utilization will be collected by Ceilometer. Use the following command to view the utilization data:

   ceilometer statistics -m hardware.network.incoming.datagram -q resource=<node-ip>

You can use various meters such as hardware.network.incoming.datagram and hardware.network.outgoing.datagram.
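To pull the same network meters for every node defined in the pipeline in one go, a small shell loop over the Ceilometer CLI can help. This is a convenience sketch rather than part of the original procedure; it assumes admin credentials are sourced and that the node IPs match the pipeline definition above:

   for node in 192.168.0.2 192.168.0.3 192.168.0.4; do
       echo "=== ${node} ==="
       # incoming and outgoing datagram counters for the physical interfaces of each node
       ceilometer statistics -m hardware.network.incoming.datagram -q resource=${node}
       ceilometer statistics -m hardware.network.outgoing.datagram -q resource=${node}
   done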
All the meters associated with the physical resource monitoring for a host can be viewed using the following command, where 192.168.0.2 is the node IP as defined in the pipeline SNMP source:

   ceilometer meter-list | grep 192.168.0.2

How it works

The SNMP daemon runs on each Compute and Network node. It provides a measure of the physical resource usage on the local node. The data provided by the SNMP daemon on the various Compute and Network nodes is collected by the Ceilometer central agent. The Ceilometer central agent must be told about the various nodes that are capable of providing an SNMP-based usage sample; this mapping is provided in the Ceilometer pipeline definition. A single central agent can collect the data from multiple nodes, or multiple central agents can be used to partition the data collection task, with each central agent collecting the data for a few nodes.

Summary

In this article, we learned about the Ceilometer components and their installation and configuration. We also learned about monitoring the physical network. You can also refer to the following books by Packt Publishing that can help you build your knowledge of OpenStack:

- OpenStack Cloud Computing Cookbook, Second Edition by Kevin Jackson and Cody Bunch
- Learning OpenStack Networking (Neutron) by James Denton
- Implementing Cloud Storage with OpenStack Swift by Amar Kapadia, Sreedhar Varma, and Kris Rajana
- VMware vCloud Director Cookbook by Daniel Langenhan

Resources for Article:

Further resources on this subject:

- Using the OpenStack Dashboard [article]
- The orchestration service for OpenStack [article]
- Setting up VPNaaS in OpenStack [article]


Monitoring OpenStack Networks

Packt
07 Oct 2015
6 min read
In this article by Chandan Dutta Chowdhury and Sriram Subramanian, authors of the book OpenStack Networking Cookbook, we will explore various means to monitor network resource utilization using Ceilometer. We will cover the following topics:

- Virtual machine bandwidth monitoring
- L3 bandwidth monitoring

Introduction

Due to the dynamic nature of virtual infrastructure and multiple users sharing the same cloud platform, the OpenStack administrator needs to track how the tenants use the resources. The data can also help in capacity planning by giving an estimate of the capacity of the physical devices and the trends of resource usage.

The OpenStack Ceilometer project provides you with a telemetry service. It can measure the usage of resources by collecting statistics across the various OpenStack components. The usage data is collected over the message bus or by polling the various components. OpenStack Neutron provides Ceilometer with the statistics related to virtual networks; Ceilometer interacts with both the Neutron and Nova services to gather this data.

The setup used to implement these recipes has two compute nodes and one node for the controller and networking services.

Virtual machine bandwidth monitoring

OpenStack Ceilometer collects the resource utilization of virtual machines by running a Ceilometer compute agent on all the compute nodes. These agents collect the various metrics related to each virtual machine running on the compute node. The data that is collected is periodically sent to the Ceilometer collector over the message bus. In this recipe, we will learn how to use the Ceilometer client to check the bandwidth utilization of a virtual machine.

Getting ready

For this recipe, you will need the following information:

- The SSH login credentials for a node where the OpenStack client packages are installed
- A shell RC file that initializes the environment variables for the CLI

How to do it…

The following steps will show you how to determine the bandwidth utilization of a virtual machine:

1. Using the appropriate credentials, SSH into the OpenStack node installed with the OpenStack client packages.
2. Source the shell RC file to initialize the environment variables required for the CLI commands.
3. Use the nova list command to find the ID of the virtual machine instance that is to be monitored.
4. Use the ceilometer resource-list | grep <virtual-machine-id> command to find the resource ID of the network port associated with the virtual machine. Note down the resource ID of the virtual port associated with the virtual machine for use in the later commands. The virtual port resource ID is a combination of the virtual machine ID and the name of the tap interface for the virtual port; it is named in the form instance-<virtual-machine-id>-<tap-interface-name>.
5. Use ceilometer meter-list -q resource=<virtual-port-resource-id> to find the meters associated with the network port on the virtual machine.
6. Next, use ceilometer statistics -m <meter-name> -q resource=<virtual-port-resource-id> to view the network usage statistics. Use the meters that we discovered in the last step to view the associated data.

Ceilometer stores the port bandwidth data for the incoming and outgoing packets and bytes, and their rates.
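Strung together, the discovery steps above can be captured in a short shell sketch. This is not part of the original recipe: the placeholders in angle brackets must be replaced with the IDs printed by your own environment, the RC file name is illustrative, and the meter names network.incoming.bytes and network.outgoing.bytes are assumptions; use whichever meters the meter-list step actually returns.

   source ~/keystonerc_admin            # your shell RC file
   nova list                            # note the <virtual-machine-id> of the instance
   ceilometer resource-list | grep <virtual-machine-id>
   # copy the full port resource ID (instance-<virtual-machine-id>-<tap-interface-name>)
   ceilometer meter-list -q resource=<virtual-port-resource-id>
   ceilometer statistics -m network.incoming.bytes -q resource=<virtual-port-resource-id>
   ceilometer statistics -m network.outgoing.bytes -q resource=<virtual-port-resource-id>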
How it works…

The OpenStack Ceilometer compute agent collects the statistics related to the network ports connected to the virtual machines and posts them on the message bus. These statistics are collected by the Ceilometer collector daemon. The Ceilometer client can be used to query a meter and filter the statistical data based on the resource ID.

L3 bandwidth monitoring

OpenStack Neutron provides you with metering commands in order to enable Layer 3 (L3) traffic monitoring. The metering commands create a label that can hold a list of packet-matching rules. Neutron counts and associates any L3 packet that matches these rules with the metering label. In this recipe, we will learn how to use the L3 traffic monitoring commands of Neutron to enable packet counting.

Getting ready

For this recipe, we will use a virtual machine that is connected to a network that, in turn, is connected to a router. We will use a network called private with a CIDR of 10.10.10.0/24.

For this recipe, you will need the following information:

- The SSH login credentials for a node where the OpenStack client packages are installed
- A shell RC file that initializes the environment variables for the CLI
- The name of the L3 metering label
- The CIDR for which the traffic needs to be measured

How to do it…

The following steps will show you how to enable monitoring of traffic to or from any L3 network:

1. Using the appropriate credentials, SSH into the OpenStack node installed with the OpenStack client packages.
2. Source the shell RC file to initialize the environment variables required for the CLI commands.
3. Use the neutron meter-label-create command to create a metering label. Note the label ID, as this will be used later with the Ceilometer commands.
4. Use the neutron meter-label-rule-create command to create a rule that associates a network address with the label that we created in the last step. In our case, we will count any packet that reaches the gateway from the 10.10.10.0/24 CIDR to which the virtual machine is connected.
5. Use the ceilometer meter-list command with the resource filter to find the meters associated with the label resource.
6. Use the ceilometer statistics command to view the number of packets matching the metering label.

The packet counting is now enabled, and the bandwidth statistics can be viewed using Ceilometer.

How it works…

The Neutron metering agent implements the packet counting meter in the L3 router. It uses iptables to implement a packet counter. The Neutron agent collects the counter statistics periodically and posts them on the message bus, from where they are collected by the Ceilometer collector daemon.

Summary

In this article, we learned about ways to monitor the usage of virtual and physical networking resources. The resource utilization data can be used to bill the users of a public cloud and to debug infrastructure-related problems.

Resources for Article:

Further resources on this subject:

- Using the OpenStack Dashboard [Article]
- Installing Red Hat CloudForms on Red Hat OpenStack [Article]
- Securing OpenStack Networking [Article]


Deploying Highly Available OpenStack

Packt
21 Sep 2015
17 min read
In this article by Arthur Berezin, the author of the book OpenStack Configuration Cookbook, we will cover the following topics:

- Installing Pacemaker
- Installing HAProxy
- Configuring Galera cluster for MariaDB
- Installing RabbitMQ with mirrored queues
- Configuring highly available OpenStack services

Many organizations choose OpenStack for its distributed architecture and ability to deliver an Infrastructure as a Service (IaaS) platform for mission-critical applications. In such environments, it is crucial to configure all OpenStack services in a highly available configuration to provide as much uptime as possible for the control plane services of the cloud.

Deploying a highly available control plane for OpenStack can be achieved in various configurations. Each of these configurations would serve a certain set of demands and introduce a growing set of prerequisites. Pacemaker is used to create active-active clusters to guarantee services' resilience to possible faults. Pacemaker is also used to create a virtual IP address for each of the services. HAProxy serves as a load balancer for incoming calls to the services' APIs. This article discusses neither high availability of virtual machine instances nor of the Nova-Compute service of the hypervisor.

Most OpenStack services are stateless; they store persistent data in a SQL database, which is potentially a single point of failure that we should make highly available. In this article, we will deploy a highly available database using MariaDB and Galera, which implements multimaster replication. To ensure availability of the message bus, we will configure RabbitMQ with mirrored queues.

This article discusses configuring each service separately on a three-controller layout that runs the OpenStack controller services, including Neutron, the database, and the RabbitMQ message bus. All services can be configured on several controller nodes, or each service could be implemented on its own separate set of hosts.

Installing Pacemaker

All OpenStack services consist of system Linux services. The first step of ensuring service availability is to configure Pacemaker clusters for each service, so that Pacemaker monitors the services. In case of failure, Pacemaker restarts the failed service. In addition, we will use Pacemaker to create a virtual IP address for each of OpenStack's services to ensure that services are accessible using the same IP address when failures occur and the actual service has relocated to another host. In this section, we will install Pacemaker and prepare it to configure highly available OpenStack services.

Getting ready

To ensure maximum availability, we will install and configure three hosts to serve as controller nodes. Prepare three controller hosts with identical hardware and network layout. We will base our configuration for most of the OpenStack services on the configuration used in a single-controller layout, and we will deploy Neutron network services on all three controller nodes.
How to do it…

Run the following steps on all three highly available controller nodes:

1. Install the Pacemaker packages:

   [root@controller1 ~]# yum install -y pcs pacemaker corosync fence-agents-all resource-agents

2. Enable and start the pcsd service:

   [root@controller1 ~]# systemctl enable pcsd
   [root@controller1 ~]# systemctl start pcsd

3. Set a password for the hacluster user; the password should be identical on all the nodes:

   [root@controller1 ~]# echo 'password' | passwd --stdin hacluster

   We will use the hacluster password throughout the HAProxy configuration.

4. Authenticate all controller nodes, using the -p option to give the password on the command line, and provide the same password you set in the previous step:

   [root@controller1 ~]# pcs cluster auth controller1 controller2 controller3 -u hacluster -p password --force

At this point, you may run pcs commands from a single controller node instead of running commands on each node separately.

There's more…

You may find the complete Pacemaker documentation, which includes installation documentation, a complete configuration reference, and examples, on the Cluster Labs website at http://clusterlabs.org/doc/.

Installing HAProxy

Addressing high availability for OpenStack includes avoiding high load on a single host and ensuring that incoming TCP connections to all API endpoints are balanced across the controller hosts. We will use HAProxy, an open source load balancer, which is particularly suited for HTTP load balancing as it supports session persistence and layer 7 processing.

Getting ready

In this section, we will install HAProxy on all controller hosts, configure a Pacemaker cluster for the HAProxy services, and prepare for the OpenStack services configuration.

How to do it…
Run the following steps on all controller nodes:

1. Install the HAProxy package:

   # yum install -y haproxy

2. Enable the nonlocal binding kernel parameter:

   # echo net.ipv4.ip_nonlocal_bind=1 >> /etc/sysctl.d/haproxy.conf
   # echo 1 > /proc/sys/net/ipv4/ip_nonlocal_bind

3. Configure the HAProxy load balancer settings for the Galera database, RabbitMQ, and Keystone services. Edit /etc/haproxy/haproxy.cfg with the following configuration:

   global
       daemon

   defaults
       mode tcp
       maxconn 10000
       timeout connect 2s
       timeout client 10s
       timeout server 10s

   frontend vip-db
       bind 192.168.16.200:3306
       timeout client 90s
       default_backend db-vms-galera

   backend db-vms-galera
       option httpchk
       stick-table type ip size 2
       stick on dst
       timeout server 90s
       server rhos5-db1 192.168.16.58:3306 check inter 1s port 9200
       server rhos5-db2 192.168.16.59:3306 check inter 1s port 9200
       server rhos5-db3 192.168.16.60:3306 check inter 1s port 9200

   frontend vip-rabbitmq
       bind 192.168.16.213:5672
       timeout client 900m
       default_backend rabbitmq-vms

   backend rabbitmq-vms
       balance roundrobin
       timeout server 900m
       server rhos5-rabbitmq1 192.168.16.61:5672 check inter 1s
       server rhos5-rabbitmq2 192.168.16.62:5672 check inter 1s
       server rhos5-rabbitmq3 192.168.16.63:5672 check inter 1s

   frontend vip-keystone-admin
       bind 192.168.16.202:35357
       default_backend keystone-admin-vms

   backend keystone-admin-vms
       balance roundrobin
       server rhos5-keystone1 192.168.16.64:35357 check inter 1s
       server rhos5-keystone2 192.168.16.65:35357 check inter 1s
       server rhos5-keystone3 192.168.16.66:35357 check inter 1s

   frontend vip-keystone-public
       bind 192.168.16.202:5000
       default_backend keystone-public-vms

   backend keystone-public-vms
       balance roundrobin
       server rhos5-keystone1 192.168.16.64:5000 check inter 1s
       server rhos5-keystone2 192.168.16.65:5000 check inter 1s
       server rhos5-keystone3 192.168.16.66:5000 check inter 1s

   This configuration file is an example of configuring HAProxy as a load balancer for the MariaDB, RabbitMQ, and Keystone services.

4. We need to authenticate on all nodes before we are allowed to change the configuration, so that all nodes can be configured from one point. Use the previously configured hacluster user and password to do this:

   # pcs cluster auth controller1 controller2 controller3 -u hacluster -p password --force

5. Create a Pacemaker cluster for the HAProxy service as follows (note that you can now run pcs commands from a single controller node):

   # pcs cluster setup --name ha-controller controller1 controller2 controller3
   # pcs cluster enable --all
   # pcs cluster start --all

6. Finally, using the pcs resource create command, create a cloned systemd resource that will run a highly available active-active HAProxy service on all controller hosts:

   # pcs resource create lb-haproxy systemd:haproxy op monitor start-delay=10s --clone

7. Create the virtual IP address for each of the services:

   # pcs resource create vip-db IPaddr2 ip=192.168.16.200
   # pcs resource create vip-rabbitmq IPaddr2 ip=192.168.16.213
   # pcs resource create vip-keystone IPaddr2 ip=192.168.16.202

8. You may use the pcs status command to verify whether all resources are successfully running:

   # pcs status
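Before relying on the load balancer, it can save time to validate the configuration file syntax on each node. The check below is a suggestion and not part of the original recipe:

   # haproxy -c -f /etc/haproxy/haproxy.cfg

HAProxy only checks syntax here; the backend servers (Galera, RabbitMQ, and Keystone) are configured in the following sections, so their health checks will stay down until then.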
Configuring Galera cluster for MariaDB

Galera is a multimaster cluster for MariaDB, which is based on synchronous replication between all cluster nodes. Effectively, Galera treats a cluster of MariaDB nodes as one single master node that reads and writes to all nodes. Galera replication happens at transaction commit time, by broadcasting the transaction write set to the cluster for application. The client connects directly to the DBMS and experiences behavior close to that of the native DBMS. The wsrep API (write set replication API) defines the interface between Galera replication and the DBMS.

Getting ready

In this section, we will install the Galera cluster packages for MariaDB on our three controller nodes, and then we will configure Pacemaker to monitor all Galera services. Pacemaker can be stopped on all cluster nodes, as shown, if it is running from previous steps:

   # pcs cluster stop --all

How to do it…

Perform the following steps on all controller nodes:

1. Install the Galera packages for MariaDB:

   # yum install -y mariadb-galera-server xinetd resource-agents

2. Edit /etc/sysconfig/clustercheck and add the following lines:

   MYSQL_USERNAME="clustercheck"
   MYSQL_PASSWORD="password"
   MYSQL_HOST="localhost"

3. Edit the Galera configuration file /etc/my.cnf.d/galera.cnf with the following lines. Make sure to enter the host's IP address in the bind-address parameter:

   [mysqld]
   skip-name-resolve=1
   binlog_format=ROW
   default-storage-engine=innodb
   innodb_autoinc_lock_mode=2
   innodb_locks_unsafe_for_binlog=1
   query_cache_size=0
   query_cache_type=0
   bind-address=[host-IP-address]
   wsrep_provider=/usr/lib64/galera/libgalera_smm.so
   wsrep_cluster_name="galera_cluster"
   wsrep_slave_threads=1
   wsrep_certify_nonPK=1
   wsrep_max_ws_rows=131072
   wsrep_max_ws_size=1073741824
   wsrep_debug=0
   wsrep_convert_LOCK_to_trx=0
   wsrep_retry_autocommit=1
   wsrep_auto_increment_control=1
   wsrep_drupal_282555_workaround=0
   wsrep_causal_reads=0
   wsrep_notify_cmd=
   wsrep_sst_method=rsync

   You can learn more about each of Galera's default options on the documentation page at http://galeracluster.com/documentation-webpages/configuration.html.

4. Add the following lines to the xinetd configuration file /etc/xinetd.d/galera-monitor:

   service galera-monitor
   {
       port            = 9200
       disable         = no
       socket_type     = stream
       protocol        = tcp
       wait            = no
       user            = root
       group           = root
       groups          = yes
       server          = /usr/bin/clustercheck
       type            = UNLISTED
       per_source      = UNLIMITED
       log_on_success  =
       log_on_failure  = HOST
       flags           = REUSE
   }

5. Start and enable the xinetd and pcsd services:

   # systemctl enable xinetd
   # systemctl start xinetd
   # systemctl enable pcsd
   # systemctl start pcsd

6. Authenticate on all nodes. Use the previously configured hacluster user and password to do this as follows:

   # pcs cluster auth controller1 controller2 controller3 -u hacluster -p password --force

   Now commands can be run from a single controller node.

7. Create a Pacemaker cluster for the Galera service:

   # pcs cluster setup --name controller-db controller1 controller2 controller3
   # pcs cluster enable --all
   # pcs cluster start --all

8. Add the Galera service resource to the Galera Pacemaker cluster:

   # pcs resource create galera galera enable_creation=true wsrep_cluster_address="gcomm://controller1,controller2,controller3" meta master-max=3 ordered=true op promote timeout=300s on-fail=block --master

9. Create a user for the clustercheck xinetd service:

   mysql -e "CREATE USER 'clustercheck'@'localhost' IDENTIFIED BY 'password';"

See also

You can find the complete Galera documentation, which includes installation documentation, a complete configuration reference, and examples, on the Galera Cluster website at http://galeracluster.com/documentation-webpages/.
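Before moving on to RabbitMQ, it is worth confirming that the Galera cluster actually formed. These checks are suggestions rather than part of the original recipe; they assume the clustercheck user and the xinetd listener on port 9200 configured above:

   # mysql -e "SHOW STATUS LIKE 'wsrep_cluster_size';"
   # curl http://controller1:9200

The first command should report a cluster size of 3, and the second should return an HTTP 200 response from the clustercheck script, which is the same health check HAProxy uses for the db-vms-galera backend.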
Installing RabbitMQ with mirrored queues

RabbitMQ is used as a message bus for services to communicate with each other. By default, the queues are located on a single node, which makes the RabbitMQ service a single point of failure. To avoid RabbitMQ being a single point of failure, we will configure RabbitMQ to use mirrored queues across multiple nodes. Each mirrored queue consists of one master and one or more slaves, with the oldest slave being promoted to the new master if the old master disappears for any reason. Messages published to the queue are replicated to all slaves.

Getting ready

In this section, we will install the RabbitMQ packages on our three controller nodes and configure RabbitMQ to mirror its queues across all controller nodes; then we will configure Pacemaker to monitor all RabbitMQ services.

How to do it…

Perform the following steps on all controller nodes:

1. Install the RabbitMQ packages on all controller nodes:

   # yum -y install rabbitmq-server

2. Start the rabbitmq-server service once so that the Erlang cookie is generated, and then stop it:

   # systemctl start rabbitmq-server
   # systemctl stop rabbitmq-server

3. RabbitMQ cluster nodes use a cookie to determine whether they are allowed to communicate with each other; for nodes to be able to communicate, they must have the same cookie. Copy .erlang.cookie from controller1 to controller2 and controller3:

   [root@controller1 ~]# scp /var/lib/rabbitmq/.erlang.cookie root@controller2:/var/lib/rabbitmq/
   [root@controller1 ~]# scp /var/lib/rabbitmq/.erlang.cookie root@controller3:/var/lib/rabbitmq/

4. Start and enable Pacemaker on all nodes:

   # systemctl enable pcsd
   # systemctl start pcsd

   Since we already authenticated all nodes of the cluster in the previous section, we can now run the following commands on controller1.

5. Create a new Pacemaker cluster for the RabbitMQ service as follows:

   [root@controller1 ~]# pcs cluster setup --name rabbitmq controller1 controller2 controller3
   [root@controller1 ~]# pcs cluster enable --all
   [root@controller1 ~]# pcs cluster start --all

6. Add a systemd resource for the RabbitMQ service to the Pacemaker cluster:

   [root@controller1 ~]# pcs resource create rabbitmq-server systemd:rabbitmq-server op monitor start-delay=20s --clone

7. Since all RabbitMQ nodes must join the cluster one at a time, stop RabbitMQ on controller2 and controller3:

   [root@controller2 ~]# rabbitmqctl stop_app
   [root@controller3 ~]# rabbitmqctl stop_app

8. Join controller2 to the cluster and start RabbitMQ on it:

   [root@controller2 ~]# rabbitmqctl join_cluster rabbit@controller1
   [root@controller2 ~]# rabbitmqctl start_app

9. Now join controller3 to the cluster as well and start RabbitMQ on it:

   [root@controller3 ~]# rabbitmqctl join_cluster rabbit@controller1
   [root@controller3 ~]# rabbitmqctl start_app

10. At this point, the cluster should be configured, and we need to set RabbitMQ's HA policy to mirror the queues to all RabbitMQ cluster nodes as follows:

   [root@controller1 ~]# rabbitmqctl set_policy HA '^(?!amq.).*' '{"ha-mode": "all"}'

There's more…

The RabbitMQ cluster should now be configured with all the queues mirrored to all controller nodes.
To verify the cluster's state, you can use the rabbitmqctl cluster_status and rabbitmqctl list_policies commands from each of the controller nodes as follows:

   [root@controller1 ~]# rabbitmqctl cluster_status
   [root@controller1 ~]# rabbitmqctl list_policies

To verify Pacemaker's cluster status, you may use the pcs status command as follows:

   [root@controller1 ~]# pcs status

See also

For complete documentation on how RabbitMQ implements the mirrored queues feature, and for additional configuration options, you can refer to the project's documentation pages at https://www.rabbitmq.com/clustering.html and https://www.rabbitmq.com/ha.html.

Configuring highly available OpenStack services

Most OpenStack services are stateless web services that keep persistent data in a SQL database and use a message bus for inter-service communication. We will use Pacemaker and HAProxy to run OpenStack services in an active-active highly available configuration, so that traffic for each of the services is load balanced across all controller nodes and the cloud can be easily scaled out to more controller nodes if needed. We will configure Pacemaker clusters for each of the services that will run on all controller nodes. We will also use Pacemaker to create a virtual IP address for each of OpenStack's services, so rather than addressing a specific node, services will be addressed by their corresponding virtual IP address. We will use HAProxy to load balance incoming requests to the services across all controller nodes.

Getting ready

In this section, we will use the virtual IP addresses we created for the services with Pacemaker and HAProxy in the previous sections. We will also configure the OpenStack services to use the highly available Galera-clustered database and RabbitMQ with mirrored queues. This is an example for the Keystone service; please refer to the Packt website for the complete configuration of all OpenStack services.

How to do it…

Perform the following steps on all controller nodes:

1. Install the Keystone service on all controller nodes:

   yum install -y openstack-keystone openstack-utils openstack-selinux

2. Generate a Keystone service token on controller1 and copy it to controller2 and controller3 using scp:

   [root@controller1 ~]# export SERVICE_TOKEN=$(openssl rand -hex 10)
   [root@controller1 ~]# echo $SERVICE_TOKEN > ~/keystone_admin_token
   [root@controller1 ~]# scp ~/keystone_admin_token root@controller2:~/keystone_admin_token
   [root@controller1 ~]# scp ~/keystone_admin_token root@controller3:~/keystone_admin_token

3. Export the Keystone service token on controller2 and controller3 as well:

   [root@controller2 ~]# export SERVICE_TOKEN=$(cat ~/keystone_admin_token)
   [root@controller3 ~]# export SERVICE_TOKEN=$(cat ~/keystone_admin_token)

Note: perform the following commands on all controller nodes.
4. Configure the Keystone service on all controller nodes to use the RabbitMQ virtual IP:

   # openstack-config --set /etc/keystone/keystone.conf DEFAULT admin_token $SERVICE_TOKEN
   # openstack-config --set /etc/keystone/keystone.conf DEFAULT rabbit_host vip-rabbitmq

5. Configure the Keystone service endpoints to point to the Keystone virtual IP:

   # openstack-config --set /etc/keystone/keystone.conf DEFAULT admin_endpoint 'http://vip-keystone:%(admin_port)s/'
   # openstack-config --set /etc/keystone/keystone.conf DEFAULT public_endpoint 'http://vip-keystone:%(public_port)s/'

6. Configure Keystone to connect to the SQL database using the Galera cluster virtual IP:

   # openstack-config --set /etc/keystone/keystone.conf database connection mysql://keystone:keystonetest@vip-mysql/keystone
   # openstack-config --set /etc/keystone/keystone.conf database max_retries -1

7. On controller1, create the Keystone PKI setup and sync the database:

   [root@controller1 ~]# keystone-manage pki_setup --keystone-user keystone --keystone-group keystone
   [root@controller1 ~]# chown -R keystone:keystone /var/log/keystone /etc/keystone/ssl/
   [root@controller1 ~]# su keystone -s /bin/sh -c "keystone-manage db_sync"

8. Using rsync, copy the Keystone SSL certificates from controller1 to controller2 and controller3:

   [root@controller1 ~]# rsync -av /etc/keystone/ssl/ controller2:/etc/keystone/ssl/
   [root@controller1 ~]# rsync -av /etc/keystone/ssl/ controller3:/etc/keystone/ssl/

9. Make sure that the keystone user is the owner of the newly copied files on controller2 and controller3:

   [root@controller2 ~]# chown -R keystone:keystone /etc/keystone/ssl/
   [root@controller3 ~]# chown -R keystone:keystone /etc/keystone/ssl/

10. Create a systemd resource for the Keystone service; use --clone to ensure it runs in an active-active configuration:

   [root@controller1 ~]# pcs resource create keystone systemd:openstack-keystone op monitor start-delay=10s --clone

11. Create the endpoint and user account for Keystone with the Keystone VIP as given:

   [root@controller1 ~]# export SERVICE_ENDPOINT="http://vip-keystone:35357/v2.0"
   [root@controller1 ~]# keystone service-create --name=keystone --type=identity --description="Keystone Identity Service"
   [root@controller1 ~]# keystone endpoint-create --service keystone --publicurl 'http://vip-keystone:5000/v2.0' --adminurl 'http://vip-keystone:35357/v2.0' --internalurl 'http://vip-keystone:5000/v2.0'
   [root@controller1 ~]# keystone user-create --name admin --pass keystonetest
   [root@controller1 ~]# keystone role-create --name admin
   [root@controller1 ~]# keystone tenant-create --name admin
   [root@controller1 ~]# keystone user-role-add --user admin --role admin --tenant admin

12. On all controller nodes, create a keystonerc_admin file with the OpenStack admin credentials, using the Keystone VIP:

   cat > ~/keystonerc_admin << EOF
   export OS_USERNAME=admin
   export OS_TENANT_NAME=admin
   export OS_PASSWORD=password
   export OS_AUTH_URL=http://vip-keystone:35357/v2.0/
   export PS1='[\u@\h \W(keystone_admin)]\$ '
   EOF

13. Source the keystonerc_admin credentials file to be able to run authenticated OpenStack commands:

   [root@controller1 ~]# source ~/keystonerc_admin

   At this point, you should be able to execute Keystone commands and create the Services tenant:

   [root@controller1 ~]# keystone tenant-create --name services --description "Services Tenant"

Summary

In this article, we have covered the installation of Pacemaker and HAProxy, the configuration of a Galera cluster for MariaDB, the installation of RabbitMQ with mirrored queues, and the configuration of highly available OpenStack services.
Resources for Article:

Further resources on this subject:

- Using the OpenStack Dashboard [article]
- Installing OpenStack Swift [article]
- Architecture and Component Overview [article]


How to Run Code in the Cloud with AWS Lambda

Ankit Patial
11 Sep 2015
5 min read
AWS Lambda is a new compute service introduced by AWS to run a piece of code in response to events. The source of these events can be AWS S3, AWS SNS, AWS Kinesis, AWS Cognito, and user applications using the AWS SDK. The idea behind this is to create backend services that are cost effective and highly scalable. If you believe in the Unix philosophy and you build your applications as components, then AWS Lambda is a nice feature that you can make use of.

Some of its benefits

- Cost-effective: AWS Lambdas are not always executing; they are triggered on certain events and have a maximum execution time of 60 seconds (that's a lot of time to do many operations, but not all). There is zero wastage, and maximum savings on the resources used.
- No hassle of maintaining infrastructure: Create a Lambda and forget. There is no need to worry about scaling infrastructure as load increases. It will all be done automatically by AWS.
- Integration with other AWS services: An AWS Lambda function can be triggered in response to various events of other AWS services. The following services can trigger a Lambda: AWS S3, AWS SNS (Publish), AWS Kinesis, AWS Cognito, and a custom call using the AWS SDK.

Creating a Lambda function

First, log in to your AWS account (create one if you haven't got one). Under Compute Services, click on the Lambda option. You will see a screen with a "Get Started Now" button. Click on it, and then you will be on a screen to write your first Lambda function. Choose a name for it that will describe it best. Give it a nice description and move on to the code. We can code it in one of two ways: inline code or uploading a zip file.

Inline code

Inline code will be very helpful for writing simple scripts like image editing. The AMI (Amazon Machine Image) that Lambda runs on comes with the Ghostscript and ImageMagick libraries preinstalled, as well as Node.js packages like aws-sdk and imagemagick.

Let's create a Lambda that can list the installed packages on the AMI that runs Lambda:

1. I will name it ls-packages.
2. The description will be "list installed packages on AMI".
3. For code entry type, choose Edit Code Inline.
4. For the code template, choose None, and paste the code below in:

   var cp = require('child_process');

   exports.handler = function(event, context) {
     cp.exec('rpm -qa', function (err, stdout, stderr) {
       if (err) {
         return context.fail(err);
       }
       console.log(stdout);
       context.succeed('Done');
     });
   };

5. Handler name: handler. This will be the entry point function name. You can change it as you like.
6. Role: select Create new role, Basic execution role. You will be prompted to create an IAM role with the required permission, that is, access to create logs. Press "Allow."
7. For the Memory (MB), I am going to keep it low: 128.
8. Timeout (s): keep it at the default of 3.
9. Press Create Lambda function.

You will see your first Lambda created and showing up in the Lambda Function list. Select it if it is not already selected, and click on the Actions drop-down. At the top, select the Edit/Test option. You will see your Lambda function in edit mode. Ignore the Sample event section on the left; just click the Invoke button at the bottom right, wait for a few seconds, and you will see nice details in the Execution result. The Execution logs are where you will find the list of installed packages on the machine that you can utilize.

I wish there was a way to install custom packages, or at least have the latest versions of the installed packages running. I mean, look at ghostscript-8.70-19.23.amzn1.x86_64. It is an old version, published in 2009. Maybe AWS will add such features in the future. I certainly hope so.
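Once the function exists, you don't have to drive it from the console; the same "custom call using the AWS SDK" trigger mentioned above can be exercised from the command line. This is a suggested sketch, not part of the original walkthrough, and it assumes the AWS CLI is configured with credentials that are allowed to invoke the function:

   $ aws lambda invoke --function-name ls-packages --payload '{}' output.txt
   $ cat output.txt

The response file contains whatever the handler passed to context.succeed (here, the string "Done"); the rpm -qa listing itself lands in the function's CloudWatch logs.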
Upload a zip file

Now suppose you have created something more complicated that spans multiple code files and uses NPM packages that are not available on the Lambda AMI. No worries: just create a simple Node.js app, install your packages, write up your code, and we are good to deploy it. A few things need to be taken care of:

- Zip the node_modules folder along with the code; don't exclude it while zipping your code.
- The steps will be the same as for inline code, but with one addition: File name. The file name will be the path to the entry file, so if you have a lib directory in your code with an index.js file, then you can mention it as lib/index.js.

Monitoring

On the Lambda dashboard, you will see a nice graph of various events like Invocation Count, Invocation Duration, Invocation Failures, and Throttled Invocations. You will also find the logs created by Lambda functions in AWS CloudWatch (Administration & Security).

Conclusion

AWS Lambda is a unique and very useful service. It can help us build nice, scalable backends for mobile applications. It can also help you centralize many components that can be shared across applications that you are running on and off the AWS infrastructure.

About the author

Ankit Patial has a Masters in Computer Applications and nine years of experience with custom APIs, web and desktop applications using .NET technologies, RoR, and Node.js. As CTO of SimSaw Inc and Pink Hand Technologies, his job is to learn and help his team implement the best practices of using cloud computing and JavaScript technologies.


How to Optimize PUT Requests

Packt
04 Sep 2015
8 min read
In this article by Naoya Hashimoto, author of the book Amazon S3 Cookbook, we look at how to optimize PUT requests. It is effective to use multipart upload because it can aggregate throughput by parallelizing PUT requests and uploading a large object in parts. It is recommended that the size of each part be between 25 and 50 MB for higher-bandwidth networks and around 10 MB for mobile networks.

Amazon S3 is a highly scalable, reliable, and low-latency data storage service at a very low cost, designed for mission-critical and primary data storage. It provides the Amazon S3 APIs to simplify your programming tasks. S3 performance optimization is composed of several factors, for example, which region to choose to reduce latency, considering the naming scheme, and optimizing the PUT and GET operations.

Multipart upload consists of a three-step process: the first step is initiating the upload, next is uploading the object parts, and finally, after uploading all the parts, the multipart upload is completed. The following methods are currently supported for uploading objects with multipart upload:

- AWS SDK for Android
- AWS SDK for iOS
- AWS SDK for Java
- AWS SDK for JavaScript
- AWS SDK for PHP
- AWS SDK for Python
- AWS SDK for Ruby
- AWS SDK for .NET
- REST API
- AWS CLI

In order to try multipart upload and see how much it aggregates throughput, we use the AWS SDK for Node.js and the s3 module via npm (the package manager for Node.js). The AWS CLI also supports multipart upload: when you use the AWS CLI s3 or s3api subcommand to upload an object, the object is automatically uploaded via multipart requests.

Getting ready

You need to complete the following setup in advance:

- Sign up on AWS and be able to access S3 with your IAM credentials
- Install and set up the AWS CLI on your PC, or use an Amazon Linux AMI
- Install Node.js

It is recommended that you run the benchmark from your local PC or, if you use an EC2 instance, that you launch the instance and create the S3 bucket in different regions. For example, if you launch an instance in the Asia Pacific Tokyo region, you should create the S3 bucket in the US Standard region. The reason is that the latency between EC2 and S3 is very low, and it is otherwise hard to see the difference.

How to do it…

We upload a 300 MB file to an S3 bucket over HTTP in two ways: one using multipart upload and the other without multipart upload, to compare the times. To clearly see how the performance differs, I launched an instance and created an S3 bucket in different regions as follows:

- EC2 instance: Asia Pacific Tokyo region (ap-northeast-1)
- S3 bucket: US Standard region (us-east-1)

First, we install the s3 Node.js module via npm, create a dummy file, and upload the object into a bucket using a sample Node.js script without enabling multipart upload; then we do the same with multipart upload enabled, so that we can see how multipart upload performs the operation.
Now, let's move on to the instructions:

1. Install s3 via the npm command:

   $ cd aws-nodejs-sample/
   $ npm install s3

2. Create a 300 MB dummy file:

   $ file=300mb.dmp
   $ dd if=/dev/zero of=${file} bs=10M count=30

3. Put in the following script and save it as s3_upload.js:

   // Load the SDK
   var AWS = require('aws-sdk');
   var s3 = require('s3');
   var conf = require('./conf');

   // Load parameters
   var client = s3.createClient({
     maxAsyncS3: conf.maxAsyncS3,
     s3RetryCount: conf.s3RetryCount,
     s3RetryDelay: conf.s3RetryDelay,
     multipartUploadThreshold: conf.multipartUploadThreshold,
     multipartUploadSize: conf.multipartUploadSize,
   });

   var params = {
     localFile: conf.localFile,
     s3Params: {
       Bucket: conf.Bucket,
       Key: conf.localFile,
     },
   };

   // upload objects
   console.log("## s3 Parameters");
   console.log(conf);
   console.log("## Begin uploading.");
   var uploader = client.uploadFile(params);
   uploader.on('error', function(err) {
     console.error("Unable to upload:", err.stack);
   });
   uploader.on('progress', function() {
     console.log("Progress", uploader.progressMd5Amount, uploader.progressAmount, uploader.progressTotal);
   });
   uploader.on('end', function() {
     console.log("## Finished uploading.");
   });

4. Create a configuration file and save it as conf.js in the same directory as s3_upload.js:

   exports.maxAsyncS3 = 20; // default value
   exports.s3RetryCount = 3; // default value
   exports.s3RetryDelay = 1000; // default value
   exports.multipartUploadThreshold = 20971520; // default value
   exports.multipartUploadSize = 15728640; // default value
   exports.Bucket = "your-bucket-name";
   exports.localFile = "300mb.dmp";
   exports.Key = "300mb.dmp";

How it works…

First of all, let's try uploading the 300 MB object using multipart upload, and then upload the same file without using multipart upload. You can upload the object and see how long it takes by typing the following command:

   $ time node s3_upload.js
   ## s3 Parameters
   { maxAsyncS3: 20,
     s3RetryCount: 3,
     s3RetryDelay: 1000,
     multipartUploadThreshold: 20971520,
     multipartUploadSize: 15728640,
     localFile: './300mb.dmp',
     Bucket: 'bucket-sample-us-east-1',
     Key: './300mb.dmp' }
   ## Begin uploading.
   Progress 0 16384 314572800
   Progress 0 32768 314572800
   …
   Progress 0 314572800 314572800
   Progress 0 314572800 314572800
   ## Finished uploading.

   real 0m16.111s
   user 0m4.164s
   sys 0m0.884s

As it took about 16 seconds to upload the object, the transfer rate was 18.75 MB/sec.

Then, let's change the following parameters in the configuration (conf.js) and see the result. The 300 MB object is now uploaded through only one S3 client, with exports.maxAsyncS3 = 1 and exports.multipartUploadThreshold = 2097152000, so that multipart upload is effectively disabled:

   exports.maxAsyncS3 = 1;
   exports.s3RetryCount = 3; // default value
   exports.s3RetryDelay = 1000; // default value
   exports.multipartUploadThreshold = 2097152000;
   exports.multipartUploadSize = 15728640; // default value
   exports.Bucket = "your-bucket-name";
   exports.localFile = "300mb.dmp";
   exports.Key = "300mb.dmp";

Let's see the result after changing the parameters in the configuration (conf.js):

   $ time node s3_upload.js
   ## s3 Parameters
   …
   ## Begin uploading.
   Progress 0 16384 314572800
   …
   Progress 0 314572800 314572800
   ## Finished uploading.

   real 0m41.887s
   user 0m4.196s
   sys 0m0.728s

As it took about 42 seconds to upload the object, the transfer rate was 7.14 MB/sec.
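The AWS CLI exposes similar knobs for its automatic multipart behavior, so you can run a comparable experiment without the Node.js script. The commands below are a suggested sketch, not part of the original recipe; the bucket name is a placeholder, and the threshold and chunk-size values simply mirror the defaults used in conf.js:

   $ aws configure set default.s3.multipart_threshold 20MB
   $ aws configure set default.s3.multipart_chunksize 15MB
   $ aws configure set default.s3.max_concurrent_requests 20
   $ time aws s3 cp 300mb.dmp s3://your-bucket-name/300mb.dmp

Raising multipart_threshold above the object size forces a single PUT, which is the CLI equivalent of the second conf.js run above.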
Now, let's quickly check each parameter, and then get to the conclusion:

- maxAsyncS3 defines the maximum number of simultaneous requests that the S3 client opens to Amazon S3. The default value is 20.
- s3RetryCount defines the number of retries when a request fails. The default value is 3.
- s3RetryDelay is how many milliseconds the S3 client will wait when a request fails. The default value is 1000.
- multipartUploadThreshold defines the size above which objects are uploaded via multipart requests. The object will be uploaded via a multipart request if it is greater than the size you specified. The default value is 20 MB, the minimum is 5 MB, and the maximum is 5 GB.
- multipartUploadSize defines the size of each part when uploaded via a multipart request. The default value is 15 MB, the minimum is 5 MB, and the maximum is 5 GB.

The following table shows the speed test scores with different parameters:

   Parameter                  Run 1        Run 2      Run 3      Run 4      Run 5
   maxAsyncS3                 1            20         20         40         30
   s3RetryCount               3            3          3          3          3
   s3RetryDelay               1000         1000       1000       1000       1000
   multipartUploadThreshold   2097152000   20971520   20971520   20971520   20971520
   multipartUploadSize        15728640     15728640   31457280   15728640   10728640
   Time (seconds)             41.88        16.11      17.41      16.37      9.68
   Transfer rate (MB/sec)     7.51         19.53      18.07      19.22      32.50

In conclusion, multipart upload is effective for optimizing the PUT operation by aggregating throughput. However, you need to benchmark your scenario and evaluate the retry count, delay, number of parts, and multipart upload size based on the networks that your application belongs to.

There's more…

Multipart upload specification

There are limits to using multipart upload. The following table shows the specification of multipart upload:

   Item                                                                  Specification
   Maximum object size                                                   5 TB
   Maximum number of parts per upload                                    10,000
   Part numbers                                                          1 to 10,000 (inclusive)
   Part size                                                             5 MB to 5 GB; the last part can be less than 5 MB
   Maximum number of parts returned for a list parts request             1,000
   Maximum number of multipart uploads returned in a list
   multipart uploads request                                             1,000

Multipart upload and charging

If you initiate a multipart upload and abort the request, Amazon S3 deletes the upload artifacts and any parts you have uploaded, and you are not billed for them. However, you are charged for all storage, bandwidth, and requests for the multipart upload requests and the associated parts of an object after the operation is completed. The point is that you are charged when a multipart upload is completed (not aborted).

See also

- Multipart Upload Overview: https://docs.aws.amazon.com/AmazonS3/latest/dev/mpuoverview.html
- AWS SDK for Node.js: http://docs.aws.amazon.com/AWSJavaScriptSDK/guide/node-intro.htm
- Node.js S3 package on npm: https://www.npmjs.com/package/s3
- Amazon Simple Storage Service: Introduction to Amazon S3: http://docs.aws.amazon.com/AmazonS3/latest/dev/Introduction.html
- (PFC403) Maximizing Amazon S3 Performance | AWS re:Invent 2014: http://www.slideshare.net/AmazonWebServices/pfc403-maximizing-amazon-s3-performance-aws-reinvent-2014
- AWS re:Invent 2014 | (PFC403) Maximizing Amazon S3 Performance: https://www.youtube.com/watch?v=_FHRzq7eHQc

Summary

In this article, we learned how to optimize PUT requests by uploading a large object in parts.

Resources for Article:

Further resources on this subject:

- Amazon Web Services [article]
- Achieving High-Availability on AWS Cloud [article]
- Architecture and Component Overview [article]


Installing Red Hat CloudForms on Red Hat OpenStack

Packt
04 Sep 2015
8 min read
In this article by Sangram Rath, the author of the book Hybrid Cloud Management with Red Hat CloudForms, we go through the steps required to install, configure, and use Red Hat CloudForms on Red Hat Enterprise Linux OpenStack. However, you should be able to install it on OpenStack running on any other Linux distribution. The following topics are covered in this article:

- System requirements
- Deploying the Red Hat CloudForms Management Engine appliance
- Configuring the appliance
- Accessing and navigating the CloudForms web console

System requirements

Installing the Red Hat CloudForms Management Engine appliance requires an existing virtual or cloud infrastructure. The following are the latest supported platforms:

- OpenStack
- Red Hat Enterprise Virtualization
- VMware vSphere

The system requirements for installing CloudForms are different for different platforms. Since this book talks about installing it on OpenStack, we will look at the system requirements for OpenStack. You need a minimum of:

- Four vCPUs
- 6 GB RAM
- 45 GB disk space

The flavor we select to launch the CloudForms instance must meet or exceed the preceding requirements. For a list of system requirements for other platforms, refer to the following links:

- System requirements for Red Hat Enterprise Virtualization: https://access.redhat.com/documentation/en-US/Red_Hat_CloudForms/3.1/html/Installing_CloudForms_on_Red_Hat_Enterprise_Virtualization/index.html
- System requirements for installing CloudForms on VMware vSphere: https://access.redhat.com/documentation/en-US/Red_Hat_CloudForms/3.1/html/Installing_CloudForms_on_VMware_vSphere/index.html

Additional OpenStack requirements

Before we can launch a CloudForms instance, we need to ensure that some additional requirements are met (see the CLI sketch below for the first two items):

- Security group: Ensure that a rule is created to allow traffic on port 443 in the security group that will be used to launch the appliance.
- Flavor: Based on the system requirements for running the CloudForms appliance, we can either use an existing flavor, such as m1.large, or create a new flavor for the CloudForms Management Engine appliance. To create a new flavor, click on the Create Flavor button under the Flavor option in Admin and fill in the required parameters, especially these three: at least four vCPUs, at least 6144 MB of RAM, and at least 45 GB of disk space.
- Key pair: Although you can just use the default username and password to log in to the appliance at the VNC console, it is good to have access to a key pair as well, if required, for remote SSH.

Deploying the Red Hat CloudForms Management Engine appliance

Now that we are aware of the resource and security requirements for Red Hat CloudForms, let's look at how to obtain a copy of the appliance and run it.

Obtaining the appliance

The CloudForms Management Engine appliance for OpenStack can be downloaded from your Red Hat customer portal under the Red Hat CloudForms product page. You need access to a Red Hat CloudForms subscription to be able to do so. At the time of writing this book, the direct download link for this is https://rhn.redhat.com/rhn/software/channel/downloads/Download.do?cid=20037.

For more information on obtaining the subscription and appliance, or to request a trial, visit http://www.redhat.com/en/technologies/cloud-computing/cloudforms.

Note: If you are unable to get access to Red Hat CloudForms, ManageIQ (the open source version) can also be used for hands-on experience.
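For the security group rule and the flavor mentioned under Additional OpenStack requirements, the Horizon steps can also be done from the CLI. This is a suggested sketch rather than part of the original text; the flavor name m1.cfme is illustrative, and the commands assume the 2015-era nova CLI and the default security group:

   # a flavor with 4 vCPUs, 6144 MB RAM, and a 45 GB disk
   nova flavor-create m1.cfme auto 6144 45 4
   # allow HTTPS traffic to the appliance
   nova secgroup-add-rule default tcp 443 443 0.0.0.0/0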
Creating the appliance image in OpenStack Before launching the appliance, we need to create an image in OpenStack for the appliance, since OpenStack requires instances to be launched from an image. You can create a new Image under Project with the following parameters (see the screenshot given for assistance): Enter a name for the image. Enter the image location in Image Source (HTTP URL). Set the Format as QCOW2. Optionally, set the Minimum Disk size. Optionally, set Minimum Ram. Make it Public if required and Create An Image. Note that if you have a newer release of OpenStack, there may be some additional options, but the preceding are what need to be filled in—most importantly the download URL of the Red Hat CloudForms appliance. Wait for the Status field to reflect as Active before launching the instance, as shown in this screenshot: Launching the appliance instance In OpenStack, under Project, select Instances and then click on Launch Instance. In the Launch Instance wizard enter the following instance information in the Details tab: Select an Availabilty Zone. Enter an Instance Name. Select Flavor. Set Instance Count. Set Instance Boot Source as Boot from image. Select CloudForms Management Engine Appliance under Image Name. The final result should appear similar to the following figure: Under the Access & Security tab, ensure that the correct Key Pair and Security Group tab are selected, like this: For Networking, select the proper networks that will provide the required IP addresses and routing, as shown here: Other options, such as Post-Creation and Advanced Options, are optional and can be left blank. Click on Launch when ready to start creating the instance. Wait for the instance state to change to Running before proceeding to the next step. Note If you are accessing the CloudForms Management Engine from the Internet, a Floating IP address needs to be associated with the instance. This can be done from Project, under Access & Security and then the Floating IPs tab. The Red Hat CloudForms web console The web console provides a graphical user interface for working with the CloudForms Management Engine Appliance. The web console can be accessed from a browser on any machine that has network access to the CloudForms Management Engine server. System requirements The system requirements for accessing the Red Hat CloudForms web console are: A Windows, Linux, or Mac computer A modern browser, such as Mozilla Firefox, Google Chrome, and Internet Explorer 8 or above Adobe Flash Player 9 or above The CloudForms Management Engine Appliance must already be installed and activated in your enterprise environment Accessing the Red Hat CloudForms Management Engine web console Type the hostname or floating IP assigned to the instance prefixed by https in a supported browser to access the appliance. Enter default username as admin and the password as smartvm to log in to the appliance, as shown in this screenshot: You should log in to only one tab in each browser, as the console settings are saved for the active tab only. The CloudForms Management Engine also does not guarantee that the browser's Back button will produce the desired results. Use the breadcrumbs provided in the console. Navigating the web console The web console has a primary top-level menu that provides access to feature sets such as Insight, Control, and Automate, along with menus used to add infrastructure and cloud providers, create service catalogs and view or raise requests. 
The secondary menu appears below the top primary menu, and its options change based on the primary menu option selected. In certain cases, a third, sub-level menu may also appear for additional options based on the selection in the secondary menu. The feature sets available in Red Hat CloudForms are categorized under eight menu items:

Cloud Intelligence: This provides a dashboard view of your hybrid cloud infrastructure for the selected parameters. Whatever is displayed here can be configured as a widget. It also provides additional insights into the hybrid cloud in the form of reports, chargeback configuration and information, timeline views, and an RSS feeds section.

Services: This provides options for creating templates and service catalogs that help in provisioning multitier workloads across providers. It also lets you create and approve requests for these service catalogs.

Clouds: This option in the top menu lets you add cloud providers; define availability zones; and create tenants, flavors, security groups, and instances.

Infrastructure: This option, in a way similar to Clouds, lets you add infrastructure providers; define clusters; view, discover, and add hosts; provision VMs; work with data stores and repositories; view requests; and configure the PXE.

Control: This section lets you define compliance and control policies for the infrastructure providers using events, conditions, and actions based on the conditions. You can further combine these policies into policy profiles. Another important feature is alerting the administrators, which is configured from here. You can also simulate these policies, import and export them, and view logs.

Automate: This menu option lets you manage life cycle tasks such as provisioning and retirement, and automation of resources. You can create provisioning dialogs to provision hosts and virtual machines, and service dialogs to provision service catalogs. Dialog import/export, logs, and requests for automation are all managed from this menu option.

Optimize: This menu option provides utilization, planning, and bottleneck summaries for the hybrid cloud environment. You can also generate reports for these individual metrics.

Configure: Here, you can customize the look of the dashboard; view queued and running tasks; and check errors and warnings for VMs and the UI. It lets you configure the CloudForms Management Engine appliance settings, such as the database, additional worker appliances, SmartProxy, and white labelling. You can also perform maintenance tasks such as updates and manual modification of the CFME server configuration files.

Summary

In this article, we deployed the Red Hat CloudForms Management Engine Appliance in an OpenStack environment, and you learned where to configure the hostname, network settings, and time zone. We then used the floating IP of the instance to access the appliance from a web browser, and you learned where the different feature sets are and how to navigate around.

Resources for Article: Further resources on this subject: Introduction to Microsoft Azure Cloud Services [article] Apache CloudStack Architecture [article] Using OpenStack Swift [article]

Using the OpenStack Dashboard

Packt
18 Aug 2015
11 min read
In this article by Kevin Jackson, Cody Bunch, and Egle Sigler, the authors of the book OpenStack Cloud Computing Cookbook - Third Edition, we will cover the following topics: Using OpenStack Dashboard with LBaaS Using OpenStack Dashboard with OpenStack Orchestration (For more resources related to this topic, see here.) Using OpenStack Dashboard with LBaaS The OpenStack Dashboard has the ability to view, create, and edit Load Balancers, add Virtual IPs (VIPs), and add nodes behind a Load Balancer. Dashboard also provides a user interface for creating HA Proxy server Load Balance services for our instances. We do this first by creating load balancing pools and then adding running instances to those pools. In this section, we will create an HTTP Load Balance pool, create a VIP, and configure instances to be part of the pool. The result will be the ability to use the HTTP Load Balancer pool address to send traffic to two instances running Apache. Getting ready Load a web browser, point it to our OpenStack Dashboard address at http://192.168.100.200/, and log in as an admin user. How to do it... First we will create an HTTP Load Balance pool. Creating pools To create a Load Balancer pool for a logged in user, carry out the following steps: To manage Load Balancers within our OpenStack Dashboard, select the Load Balancers tab: This will show available Load Balancer pools. Since we currently do not have any created, click on the Add Pool button in the top-right corner to add a pool. After clicking on the Add Pool button, we are presented with a modal window. Fill out the details to add a new pool: Set name, description, and provider in the modal window. We name our pool web-pool and give an appropriate description. We choose to go with a default provider since we are creating an HA Proxy. Select a subnet for the pool by clicking on the drop-down menu. All of our instances are attached to the private network, so we select 10.200.0.0/24. We select the HTTP protocol, but HTTPS and TCP are also available. This selection will depend on what kind of applications you are running. Select your preferred routing algorithm; we choose the ROUND_ROBIN balancing method. Other options are LEAST_CONNECTIONS, and SOURCE_IP. We leave the Admin State set to UP. Click on the Add button to create a new pool. You should see the new pool created in the pool list: Adding pool members To add instances to the Load Balancer, follow these steps: After adding a pool, you should still be on the Load Balancer page. Click on the Members tab. You should see a list of active members, if you have any, or an empty list: On the Members tab, click on the Add Member button. This will present you with the following menu: We select the pool we just created, web-pool, and specify members or instances that will be part of this pool. Select weights for the members of the pool. In our case, both members of the pool will have equal weights, so we assign the weight as 1. The selected protocol port will be used to access all members of the pool and, since we are using HTTP, we set the port to 80. We set Admin State to UP. Click on the Add button to add members to the pool. Now the member list should contain two newly added nodes: Adding a VIP to the Load Balancer pool Creating a VIP on the external network will allow access to the Load Balance pool and the instances behind it. To create the VIP, carry out the following steps: From the Load Balancer page, select the Pools tab and click on the drop-down arrow next to the Edit Pool button. 
This will give you a drop-down menu with an option to add a VIP: Click on the Add VIP option. This will present you with the modal window for creating a new VIP: We enter a custom name and description for our VIP. For VIP Subnet, we pick external subnet 192.168.100.0/24, followed by an available IP in that subnet. We choose 192.168.100.12. Enter 80 for Protocol Port, followed by selecting HTTP for Protocol. Set -1 for Connection Limit if you do not wish to have a maximum number of connections for this VIP. Click on the Add button to create the VIP. This will create a new VIP and show the current pool list: Now, when we click on web-pool, we will see the following details: Click on the web-vip link in the details to view the VIP details: You can test this Load Balancer by entering the VIP's IP in a browser. If you selected ROUND_ROBIN for your routing algorithm, each time you refresh your browser it should hit a different node. Deleting the Load Balancer To delete the Load Balancer, we will first need to delete the attached VIP and then delete the pool. From the Load Balancer page, check the Pools tab and click on the drop-down arrow next to the Edit Pool button. This will give you a drop-down menu with an option to delete a VIP: Selecting the Delete VIP drop-down option will give you a warning and ask you to confirm the deletion of the VIP. Click on the Delete VIP button to confirm: After deleting the VIP, now we can delete the pool. From the Load Balancer page's Pools tab, click on the drop-down arrow next to the appropriate Edit Pool button: Select the Delete Pool option from the drop-down list to delete the pool. You will get asked to confirm the deletion. If you are ready to delete the Load Balance pool, click on the Delete Pool button: How it works... We created a Load Balance pool and added two instances with Apache to it. We also created a virtual IP to be used on the external network and assigned it to our pool. To do this, we executed the following steps: Create a pool from the Load Balancer page's Pools tab. Select the subnet to which all the nodes are attached when creating the pool. Add members to the pool. Create a VIP for the pool. Both the pool and VIP can be edited after being created. Additional members can also be added to the pool at a later time. Using OpenStack Dashboard with OpenStack Orchestration Heat is the OpenStack Orchestration engine that enables users to quickly spin up whole environments using templates. Heat templates, known as Heat Orchestration Templates (HOT), are Yet Another Markup Language (YAML) based files. The files describe the resources being used, the type and the size of the instances, the network an instance will be attached to, among other pieces of information required to run that environment. We showed you how to use the Heat command line client. In this section, we will show how to use an existing Heat template file in OpenStack Dashboard to spin up two web servers running Apache, connected to a third instance running HA Proxy. Getting ready Load a web browser, point it to our OpenStack Dashboard address at http://192.168.100.200/ and log in as an admin user. How to do it... First, we will launch stack within our OpenStack Dashboard. Launching stacks To launch a Heat stack for a logged in user, carry out the following steps: To view available Heat stacks within our OpenStack Dashboard, select the Stacks tab under the Orchestration menu: After clicking on the Stacks tab, you will see all running stacks in your environment. 
In our case, our list is empty: Click on the Launch Stack button to create a new stack. You will see the following window: There are several ways to specify what template source to use in a stack: File, Direct Input, or URL. Choose which option is the most convenient for you. For our example, you can either use the URL directly or upload a file. The template file can be downloaded from https://raw.githubusercontent.com/OpenStackCookbook/OpenStackCookbook/master/cookbook.yaml. We will upload files from our system. Just like we downloaded the cookbook.yaml file, we can also download the Environment Source file. In this case, we do not have to use the environment source, but it makes it convenient. The environment file stores the values we would have to enter into the browser manually, but instead loads the values for us on the Launch Stack screen, as shown in step 8. In our example, we are using the environment file that can be downloaded from https://raw.githubusercontent.com/OpenStackCookbook/OpenStackCookbook/master/cookbook-env.yaml. Update the public_net_id, private_net_id, and private_subnet_id fields to match your environment. After selecting the Template Source and Environment Source files, click on Next: Our sample environment file contains the following code: parameters: key_name: demokey image: trusty-image flavor: m1.tiny public_net_id: 5e5d24bd-9d1f-4ed1-84b5-0b7e2a9a233b private_net_id: 25153759-994f-4835-9b13-bf0ec77fb336 private_subnet_id: 4cf2c09c-b3d5-40ed-9127-ec40e5e38343 Clicking on Next will give you a Launch Stack window with all the inputs: Note that most of the inputs in our template are now populated. If you did not specify the environment source file in the previous step, you will need to enter the key_name, image, flavor, public_net_id, private_net_id, and private_subnet_id fields. These fields are specific to each template used. Your templates may have different fields. Enter the stack name and user password for your user. If you are logged in as admin or demo, the password is openstack. Click on the Launch button to start stack creation. If all inputs were correct, you should see your stack being created: After the stack creation finishes and if there were no errors during creation, you will see your stack's status updated to Complete: Viewing stack details After launching a stack, there is a lot of information associated with it, including inputs, outputs, and, in the case of errors, information about why stack creation failed. To view the details of the stack, click on the stack name from the Stacks list. The first available view is Topology: Explore the topology by clicking on the nodes. If the graph does not fully fit or you would like a different perspective, you can drag the graph around the window. The next tab under Stack Detail will provide all of the information that was used in creating the stack: Stack information available in the Overview tab is as follows:    Info    Status    Outputs    Stack parameters    Launch parameters The Resources tab will show all the HEAT resources that were created during stack launch: If there were any errors during stack launch, check this page to see which component's creation failed. The Events tab shows all the events that occurred when the stack was created. This page can also be very helpful in troubleshooting Heat templates: While your Heat stack is running, you can also see how many instances it created under the Compute tab's Instance option. 
The following is what our instances look like on the Instances page: Note that the test1 instance was not part of the stack creation. All the other VMs were created during the stack launch. Deleting stacks Stack deletion is simple; however, it will delete all resources that were created during stack launch. Follow these steps: To delete a stack, first view the available stacks on the Stacks page: Click on the appropriate Delete Stack button to delete a stack. You will be asked to confirm the deletion: After confirming deletion, all resources associated with the stack will be deleted. How it works... We have used the OpenStack Dashboard to launch, view, and delete Orchestration stacks. We first needed to download a sample HA Proxy Heat Orchestration Template from GitHub. Since we were using an environment file, we also had to modify the appropriate inputs. Your own templates may have different inputs. After launching our HA Proxy stack, we explored its topology, resources, and events. Resources created during stack launch will also be reflected in the rest of your environment. If you are launching new instances, all of them will also be available on the Instance page. Delete and modify resources created during the stack launch only through Orchestration section in the OpenStack dashboard or on the command line. Deleting stacks through the dashboard will delete all associated resources. Unlock the full potential of OpenStack with 'OpenStack Cloud Computing Third Edition' - updated and improved for the latest challenges and opportunities unleashed by cloud.
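For readers who prefer the command line, the flattened parameter listing shown earlier corresponds to an environment file laid out like this, and the python-heatclient commands of that era can drive the same workflow. The stack name cookbook is illustrative:

parameters:
  key_name: demokey
  image: trusty-image
  flavor: m1.tiny
  public_net_id: 5e5d24bd-9d1f-4ed1-84b5-0b7e2a9a233b
  private_net_id: 25153759-994f-4835-9b13-bf0ec77fb336
  private_subnet_id: 4cf2c09c-b3d5-40ed-9127-ec40e5e38343

# Launch the stack from the template plus the environment file
heat stack-create -f cookbook.yaml -e cookbook-env.yaml cookbook

# Watch the stack and its resources come up, and inspect events when troubleshooting
heat stack-list
heat resource-list cookbook
heat event-list cookbook

# Tear everything down again (this deletes all resources created by the stack)
heat stack-delete cookbook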

Achieving High-Availability on AWS Cloud

Packt
11 Aug 2015
18 min read
In this article, by Aurobindo Sarkar and Amit Shah, author of the book Learning AWS, we will introduce some key design principles and approaches to achieving high availability in your applications deployed on the AWS cloud. As a good practice, you want to ensure that your mission-critical applications are always available to serve your customers. The approaches in this article will address availability across the layers of your application architecture including availability aspects of key infrastructural components, ensuring there are no single points of failure. In order to address availability requirements, we will use the AWS infrastructure (Availability Zones and Regions), AWS Foundation Services (EC2 instances, Storage, Security and Access Control, Networking), and the AWS PaaS services (DynamoDB, RDS, CloudFormation, and so on). (For more resources related to this topic, see here.) Defining availability objectives Achieving high availability can be costly. Therefore, it is important to ensure that you align your application availability requirements with your business objectives. There are several options to achieve the level of availability that is right for your application. Hence, it is essential to start with a clearly defined set of availability objectives and then make the most prudent design choices to achieve those objectives at a reasonable cost. Typically, all systems and services do not need to achieve the highest levels of availability possible; at the same time ensure you do not introduce a single point of failure in your architecture through dependencies between your components. For example, a mobile taxi ordering service needs its ordering-related service to be highly available; however, a specific customer's travel history need not be addressed at the same level of availability. The best way to approach high availability design is to assume that anything can fail, at any time, and then consciously design against it. "Everything fails, all the time." - Werner Vogels, CTO, Amazon.com In other words, think in terms of availability for each and every component in your application and its environment because any given component can turn into a single point of failure for your entire application. Availability is something you should consider early on in your application design process, as it can be hard to retrofit it later. Key among these would be your database and application architecture (for example, RESTful architecture). In addition, it is important to understand that availability objectives can influence and/or impact your design, development, test, and running your system on the cloud. Finally, ensure you proactively test all your design assumptions and reduce uncertainty by injecting or forcing failures instead of waiting for random failures to occur. The nature of failures There are many types of failures that can happen at any time. These could be a result of disk failures, power outages, natural disasters, software errors, and human errors. In addition, there are several points of failure in any given cloud application. These would include DNS or domain services, load balancers, web and application servers, database servers, application services-related failures, and data center-related failures. You will need to ensure you have a mitigation strategy for each of these types and points of failure. It is highly recommended that you automate and implement detailed audit trails for your recovery strategy, and thoroughly test as many of these processes as possible. 
In the next few sections, we will discuss various strategies to achieve high availability for your application. Specifically, we will discuss the use of AWS features and services such as: VPC Amazon Route 53 Elastic Load Balancing, auto-scaling Redundancy Multi-AZ and multi-region deployments Setting up VPC for high availability Before setting up your VPC, you will need to carefully select your primary site and a disaster recovery (DR) site. Leverage AWS's global presence to select the best regions and availability zones to match your business objectives. The choice of a primary site is usually the closest region to the location of a majority of your customers and the DR site could be in the next closest region or in a different country depending on your specific requirements. Next, we need to set up the network topology, which essentially includes setting up the VPC and the appropriate subnets. The public facing servers are configured in a public subnet; whereas the database servers and other application servers hosting services such as the directory services will usually reside in the private subnets. Ensure you chose different sets of IP addresses across the different regions for the multi-region deployment, for example 10.0.0.0/16 for the primary region and 192.168.0.0/16 for the secondary region to avoid any IP addressing conflicts when these regions are connected via a VPN tunnel. Appropriate routing tables and ACLs will also need to be defined to ensure traffic can traverse between them. Cross-VPC connectivity is required so that data transfer can happen between the VPCs (say, from the private subnets in one region over to the other region). The secure VPN tunnels are basically IPSec tunnels powered by VPN appliances—a primary and a secondary tunnel should be defined (in case the primary IPSec tunnel fails). It is imperative you consult with your network specialists through all of these tasks. An ELB is configured in the primary region to route traffic across multiple availability zones; however, you need not necessarily commission the ELB for your secondary site at this time. This will help you avoid costs for the ELB in your DR or secondary site. However, always weigh these costs against the total cost/time required for recovery. It might be worthwhile to just commission the extra ELB and keep it running. Gateway servers and NAT will need to be configured as they act as gatekeepers for all inbound and outbound Internet access. Gateway servers are defined in the public subnet with appropriate licenses and keys to access your servers in the private subnet for server administration purposes. NAT is required for servers located in the private subnet to access the Internet and is typically used for automatic patch updates. Again, consult your network specialists for these tasks. Elastic load balancing and Amazon Route 53 are critical infrastructure components for scalable and highly available applications; we discuss these services in the next section. Using ELB and Route 53 for high availability In this section, we describe different levels of availability and the role ELBs and Route 53 play from an availability perspective. Instance availability The simplest guideline here is to never run a single instance in a production environment. The simplest approach to improving greatly from a single server scenario is to spin up multiple EC2 instances and stick an ELB in front of them. The incoming request load is shared by all the instances behind the load balancer. 
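As a rough illustration of the building blocks described so far, the following AWS CLI sketch creates a VPC with a public subnet in each of two Availability Zones and puts a classic ELB with a health check in front of them. All names, IDs, and CIDR blocks are placeholders rather than values from the book; in practice you would substitute the VPC and subnet IDs returned by the earlier calls:

# VPC for the primary region (10.0.0.0/16, as suggested above)
aws ec2 create-vpc --cidr-block 10.0.0.0/16

# One public subnet per Availability Zone (substitute the VPC ID returned above)
aws ec2 create-subnet --vpc-id vpc-11111111 --cidr-block 10.0.1.0/24 --availability-zone us-east-1a
aws ec2 create-subnet --vpc-id vpc-11111111 --cidr-block 10.0.2.0/24 --availability-zone us-east-1b

# Classic ELB spanning both subnets, listening on HTTP port 80
aws elb create-load-balancer --load-balancer-name web-elb \
  --listeners "Protocol=HTTP,LoadBalancerPort=80,InstanceProtocol=HTTP,InstancePort=80" \
  --subnets subnet-aaaaaaaa subnet-bbbbbbbb

# Health check used to take unhealthy instances out of rotation
aws elb configure-health-check --load-balancer-name web-elb \
  --health-check Target=HTTP:80/index.html,Interval=30,Timeout=5,UnhealthyThreshold=2,HealthyThreshold=3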
ELB uses the least outstanding requests routing algorithm to spread HTTP/HTTPS requests across healthy instances. This algorithm favors instances with the fewest outstanding requests. Even though it is not recommended to have different instance sizes between or within the AZs, the ELB will adjust the number of requests it sends to smaller or larger instances based on response times. In addition, ELBs use cross-zone load balancing to distribute traffic across all healthy instances regardless of AZs. Hence, ELBs help balance the request load even if there is an unequal number of instances in different AZs at any given time (perhaps due to a failed instance in one of the AZs). There is no bandwidth charge for cross-zone traffic (if you are using an ELB).

Instances that fail can be seamlessly replaced using auto scaling while other instances continue to operate. Though auto-replacement of instances works really well, storing application state or caching locally on your instances can make problems harder to detect and recover from when instances are replaced. Instance failure is detected and the traffic is shifted to healthy instances, which then carry the additional load.

Health checks are used to determine the health of the instances and the application. TCP and/or HTTP-based heartbeats can be created for this purpose. It is worthwhile implementing health checks iteratively to arrive at the right set that meets your goals. In addition, you can customize the frequency and the failure thresholds as well. Finally, if all your instances are down, then AWS will return a 503.

Zonal availability or availability zone redundancy

Availability zones are distinct geographical locations engineered to be insulated from failures in other zones. It is critically important to run your application stack in more than one zone to achieve high availability. However, be mindful of component-level dependencies across zones and cross-zone service calls leading to substantial latencies in your application or application failures during availability zone failures.

For sites with very high request loads, a three-zone configuration might be preferred to handle zone-level failures. In this situation, if one zone goes down, the other two AZs can ensure continuing high availability and a better customer experience. In the event of a zone failure, there are several challenges in a Multi-AZ configuration, resulting from the rapidly shifting traffic to the other AZs. In such situations, the load balancers need to expire connections quickly, and lingering connections to caches must be addressed. In addition, careful configuration is required for smooth failover by ensuring all clusters are appropriately auto scaled, avoiding cross-zone calls in your services, and avoiding mismatched timeouts across your architecture.

ELBs can be used to balance across multiple availability zones. Each load balancer will contain one or more DNS records. The DNS record will contain multiple IP addresses, and DNS round-robin can be used to balance traffic between the availability zones. You can expect the DNS records to change over time. Using multiple AZs can result in traffic imbalances between AZs due to clients caching DNS records; however, ELBs can help reduce the impact of this caching.

Regional availability or regional redundancy

ELB and Amazon Route 53 have been integrated to support a single application across multiple regions. Route 53 is AWS's highly available and scalable DNS and health checking service.
Route 53 supports high availability architectures by health checking load balancer nodes and rerouting traffic to avoid the failed nodes, and by supporting implementation of multi-region architectures. In addition, Route 53 uses Latency Based Routing (LBR) to route your customers to the endpoint that has the least latency. If multiple primary sites are implemented with appropriate health checks configured, then in cases of failure, traffic shifts away from that site to the next closest region. Region failures can present several challenges as a result of rapidly shifting traffic (similar to the case of zone failures). These can include auto scaling, time required for instance startup, and the cache fill time (as we might need to default to our data sources, initially). Another difficulty usually arises from the lack of information or clarity on what constitutes the minimal or critical stack required to keep the site functioning as normally as possible. For example, any or all services will need to be considered as critical in these circumstances. The health checks are essentially automated requests sent over the Internet to your application to verify that your application is reachable, available, and functional. This can include both your EC2 instances and your application. As answers are returned only for the resources that are healthy and reachable from the outside world, the end users can be routed away from a failed application. Amazon Route 53 health checks are conducted from within each AWS region to check whether your application is reachable from that location. The DNS failover is designed to be entirely automatic. After you have set up your DNS records and health checks, no manual intervention is required for failover. Ensure you create appropriate alerts to be notified when this happens. Typically, it takes about 2 to 3 minutes from the time of the failure to the point where traffic is routed to an alternate location. Compare this to the traditional process where an operator receives an alarm, manually configures the DNS update, and waits for the DNS changes to propagate. The failover happens entirely within the Amazon Route 53 data plane. Depending on your availability objectives, there is an additional strategy (using Route 53) that you might want to consider for your application. For example, you can create a backup static site to maintain a presence for your end customers while your primary dynamic site is down. In the normal course, Route 53 will point to your dynamic site and maintain health checks for it. Furthermore, you will need to configure Route 53 to point to the S3 storage, where your static site resides. If your primary site goes down, then traffic can be diverted to the static site (while you work to restore your primary site). You can also combine this static backup site strategy with a multiple region deployment. Setting up high availability for application and data layers In this section, we will discuss approaches for implementing high availability in the application and data layers of your application architecture. The auto healing feature of AWS OpsWorks provides a good recovery mechanism from instance failures. All OpsWorks instances have an agent installed. If an agent does not communicate with the service for a short duration, then OpsWorks considers the instance to have failed. If auto healing is enabled at the layer and an instance becomes unhealthy, then OpsWorks first terminates the instance and starts a new one as per the layer configuration. 
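The Route 53 health checks and DNS failover described above can be configured with the AWS CLI. A rough sketch follows; the domain, hosted zone ID, and endpoint values are placeholders, and a matching secondary record with "Failover": "SECONDARY" would point at the backup site or static S3 website:

# Health check that probes the primary site's load balancer endpoint
aws route53 create-health-check --caller-reference primary-site-check \
  --health-check-config Type=HTTP,FullyQualifiedDomainName=primary-elb.example.com,Port=80,ResourcePath=/

# Primary failover record that uses the health check created above
aws route53 change-resource-record-sets --hosted-zone-id Z1EXAMPLE --change-batch '{
  "Changes": [{
    "Action": "CREATE",
    "ResourceRecordSet": {
      "Name": "www.example.com",
      "Type": "CNAME",
      "TTL": 60,
      "SetIdentifier": "primary",
      "Failover": "PRIMARY",
      "HealthCheckId": "<health-check-id-from-previous-command>",
      "ResourceRecords": [{"Value": "primary-elb.example.com"}]
    }
  }]
}'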
In the application layer, we can also perform cold starts from preconfigured images or warm starts from scaled-down instances for your web servers and application servers in a secondary region. By leveraging auto scaling, we can quickly ramp up these servers to handle full production loads. In this configuration, you would deploy the web servers and application servers across multiple AZs in your primary region, while the standby servers need not be launched in your secondary region until you actually need them. However, keep the preconfigured AMIs for these servers ready to launch in your secondary region.

The data layer can comprise SQL databases, NoSQL databases, caches, and so on. These can be AWS managed services such as RDS, DynamoDB, and S3, or your own SQL and NoSQL databases such as Oracle, SQL Server, or MongoDB running on EC2 instances. AWS services come with HA built in, while using database products running on EC2 instances offers a do-it-yourself option. It can be advantageous to use AWS services if you want to avoid taking on database administration responsibilities. For example, with the increasing sizes of your databases, you might choose to shard your databases, which is easy to do. However, resharding your databases while taking in live traffic can be a very complex undertaking and present availability risks. Choosing to use the AWS DynamoDB service in such a situation offloads this work to AWS, thereby resulting in higher availability out of the box.

AWS provides many different data replication options, and we will discuss a few of those in the following several paragraphs. DynamoDB automatically replicates your data across several AZs to provide higher levels of data durability and availability. In addition, you can use data pipelines to copy your data from one region to another. The DynamoDB Streams functionality can also be leveraged to replicate to another DynamoDB table in a different region. For very high volumes, the low-latency Kinesis services can also be used for this replication across multiple regions distributed all over the world. You can also enable the Multi-AZ setting for the AWS RDS service to ensure AWS replicates your data to a different AZ within the same region. In the case of Amazon S3, the S3 bucket contents can be copied to a different bucket, and the failover can be managed on the client side. Depending on the volume of data, always think in terms of multiple machines, multiple threads, and multiple parts to significantly reduce the time it takes to upload data to S3 buckets.

While using your own database (running on EC2 instances), use your database-specific high availability features for within-region and cross-region database deployments. For example, if you are using SQL Server, you can leverage the SQL Server AlwaysOn feature for synchronous and asynchronous replication across the nodes. If the volume of data is high, then you can also use SQL Server log shipping to first upload your data to Amazon S3 and then restore it into your SQL Server instance on AWS. A similar approach for Oracle databases uses the OSB Cloud Module and RMAN. You can also replicate your non-RDS databases (on-premise or on AWS) to AWS RDS databases. You will typically define two nodes in the primary region with synchronous replication and a third node in the secondary region with asynchronous replication. NoSQL databases such as MongoDB and Cassandra have their own asynchronous replication features that can be leveraged for replication to a different region.
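For the managed data services mentioned above, much of the redundancy is a matter of flags at creation time. The following AWS CLI sketch is illustrative only; identifiers, instance classes, table attributes, and credentials are placeholders:

# RDS MySQL instance with a synchronous standby in another AZ of the same region
aws rds create-db-instance --db-instance-identifier appdb \
  --engine mysql --db-instance-class db.m3.medium \
  --allocated-storage 100 --multi-az \
  --master-username dbadmin --master-user-password '<password>'

# DynamoDB table; replication across the AZs of the region is handled by the service itself
aws dynamodb create-table --table-name sessions \
  --attribute-definitions AttributeName=session_id,AttributeType=S \
  --key-schema AttributeName=session_id,KeyType=HASH \
  --provisioned-throughput ReadCapacityUnits=10,WriteCapacityUnits=10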
In addition, you can create Read Replicas for your databases in other AZs and regions. In this case, if your master database fails followed by a failure of your secondary database, then one of the read replicas can be promoted to being the master. In hybrid architectures, where you need to replicate between on-premise and AWS data sources, you can do so through a VPN connection between your data center and AWS. In case of any connectivity issues, you can also temporarily store pending data updates in SQS, and process them when the connectivity is restored. Usually, data is actively replicated to the secondary region while all other servers like the web servers and application servers are maintained in a cold state to control costs. However, in cases of high availability for web scale or mission critical applications, you can also choose to deploy your servers in active configuration across multiple regions. Implementing high availability in the application In this section, we will discuss a few design principles to use in your application from a high availability perspective. We will briefly discuss using highly available AWS services to implement common features in mobile and Internet of Things (IoT) applications. Finally, we also cover running packaged applications on the AWS cloud. Designing your application services to be stateless and following a micro services-oriented architecture approach can help the overall availability of your application. In such architectures, if a service fails then that failure is contained or isolated to that particular service while the rest of your application services continue to serve your customers. This approach can lead to an acceptable degraded experience rather than outright failures or worse. You should also store user or session information in a central location such as the AWS ElastiCache and then spread information across multiple AZs for high availability. Another design principle is to rigorously implement exception handling in your application code, and in each of your services to ensure graceful exit in case of failures. Most mobile applications share common features including user authentication and authorization, data synchronization across devices; user behavior analytics; retention tracking, storing, sharing, and delivering media globally; sending push notifications; store shared data; stream real-time data; and so on. There are a host of highly available AWS services that can be used for implementing such mobile application functionality. For example, you can use Amazon Cognito to authenticate users, Amazon Mobile Analytics for analyzing user behavior and tracking retention, Amazon SNS for push notifications and Amazon Kinesis for streaming real-time data. In addition, other AWS services such as S3, DynamoDB, IAM, and so on can also be effectively used to complete most mobile application scenarios. For mobile applications, you need to be especially sensitive about latency issues; hence, it is important to leverage AWS regions to get as close to your customers as possible. Similar to mobile applications, for IoT applications you can use the same highly available AWS services to implement common functionality such as device analytics and device messaging/notifications. You can also leverage Amazon Kinesis to ingest data from hundreds of thousands of sensors that are continuously generating massive quantities of data. Aside from your own custom applications, you can also run packaged applications such as SAP on AWS. 
These would typically include replicated standby systems, Multi-AZ and multi-region deployments, hybrid architectures spanning your own data center, and AWS cloud (connected via VPN or AWS Direct Connect service), and so on. For more details, refer to the specific package guides for achieving high availability on the AWS cloud. Summary In this article, we reviewed some of the strategies you can follow for achieving high availability in your cloud application. We emphasized the importance of both designing your application architecture for availability and using the AWS infrastructural services to get the best results. Resources for Article: Further resources on this subject: Securing vCloud Using the vCloud Networking and Security App Firewall [article] Introduction to Microsoft Azure Cloud Services [article] AWS Global Infrastructure [article]

Setting up VPNaaS in OpenStack

Packt
11 Aug 2015
7 min read
In this article by Omar Khedher, author of the book Mastering OpenStack, we will create a VPN setup between two sites using the Neutron VPN driver plugin. VPN as a Service (VPNaaS) is a network feature set provided by Neutron, the OpenStack networking service, and it has been fully integrated into OpenStack since the Havana release. We will set up two private networks in two different OpenStack sites and create some IPSec site-to-site connections. The instances that are located in each OpenStack private network should be able to connect to each other across the VPN tunnel. The article assumes that you have two OpenStack environments in different networks.

General settings

The following figure illustrates the VPN topology between two OpenStack sites, providing a secure connection between a database instance that runs in a data center in Amsterdam and a web server that runs in a data center in Tunis. Each site has a private and a public network, as well as subnets that are managed by a Neutron router.

Enabling the VPNaaS

The VPN driver plugin must be configured in /etc/neutron/neutron.conf. To enable it, add the following driver:
# nano /etc/neutron/neutron.conf
service_plugins = neutron.services.vpn.plugin.VPNDriverPlugin

Next, we'll add the VPNaaS module interface in /usr/share/openstack-dashboard/openstack_dashboard/local/local_settings.py, as follows:
'enable_VPNaaS': True,

Finally, restart the web server and the neutron-server service using the following commands:
# service httpd restart
# /etc/init.d/neutron-server restart

Site's configuration

Let's assume that each OpenStack site has at least one router attached to an external gateway that can provide access to the private network attached to an instance, such as a database or a web server virtual machine. A simple network topology of a private OpenStack environment running in the Amsterdam site will look like this:

The private OpenStack environment running in the Tunis site will look like the following image:

VPN management in the CLI

Neutron offers several commands that can be used to create and manage VPN connections in OpenStack. The essential commands associated with VPN management include the following:

vpn-service-create
vpn-service-delete
vpn-service-list
vpn-ikepolicy-create
vpn-ikepolicy-list
vpn-ipsecpolicy-create
vpn-ipsecpolicy-delete
vpn-ipsecpolicy-list
vpn-ipsecpolicy-show
ipsec-site-connection-create
ipsec-site-connection-delete
ipsec-site-connection-list

Creating an IKE policy

In the first VPN phase, we can create an IKE policy in Horizon.
The following figure shows a simple IKE setup of the OpenStack environment that is located in the Amsterdam site: The same settings can be applied via the Neutron CLI using the vpn-ikepolicy-create command, as follows: Syntax: neutron vpn-ikepolicy-create --description <description> --auth-algorithm <auth-algorithm> --encryption-algorithm <encryption-algorithm> --ike-version <ike-version> --lifetime <units=UNITS, value=VALUE> --pfs <pfs> --phase1-negotiation-mode <phase1-negotiation-mode> --name <NAME> Creating an IPSec policy An IPSec policy in an OpenStack environment that is located in the Amsterdam site can be created in Horizon in the following way: The same settings can be applied via the Neutron CLI using the vpn-ipsecpolicy-create command, as follows: Syntax: neutron vpn-ipsecpolicy-create --description <description> --auth-algorithm <auth-algorithm> --encapsulation-mode <encapsulation-mode> --encryption-algorithm <encryption-algorithm> --lifetime <units=UNITS,value=VALUE> --pfs <pfs> --transform-protocol <transform-algorithm> --name <NAME> Creating a VPN service To create a VPN service, you will need to specify the router facing the external and attach the web server instance to the private network in the Amsterdam site. The router will act as a VPN gateway. We can add a new VPN service from Horizon in the following way: The same settings can be applied via the Neutron CLI using the vpn-service-create command, as follows: Syntax: neutron vpn-service-create --tenant-id <tenant-id> --description <description> ROUTER SUBNET --name <name> Creating an IPSec connection The following step requires an identification of the peer gateway of the remote site that is located in Tunis. We will need to collect the IPv4 address of the external gateway interface of the remote site as well as the remote private subnet. In addition to this, a pre-shared key (PSK) has to be defined and exchanged between both the sites in order to bring the tunnel up after establishing a successful PSK negotiation during the second phase of the VPN setup. You can collect the router information either from Horizon or from the command line. To get subnet-related information on the OpenStack Cloud that is located in the Tunis site, you can run the following command in the network node: # neutron router-list The preceding command yields the following result: We can then list the ports of Router-TN and check the attached subnets of each port using the following command line: # neutron router-port-list Router-TN The preceding command gives the following result: Now, we can add a new IPSec site connection from the Amsterdam OpenStack Cloud in the following way: Similarly, we can perform the same configuration to create the VPN service using the neutron ipsec-site-connection-create command line, as follows: Syntax: neutron ipsec-site-connection-create --name <NAME> --description <description> ---vpnservice-id <VPNSERVICE> --ikepolicy-id <IKEPOLICY> --ipsecpolicy-id <IPSECPOLICY> --peer-address <PEER-ADDRESS> --peer-id <PEER-ID> --psk <PRESHAREDKEY> Remote site settings A complete VPN setup requires you to perform the same steps in the OpenStack Cloud that is located in the Tunis site. For a successful VPN phase 1 configuration, you must set the same IKE and IPSec policy attributes and change only the naming convention for each IKE and IPSec setup. Creating a VPN service on the second OpenStack Cloud will look like this: The last tricky part involves gathering the same router information in the Amsterdam site. 
Using the command line in the network node in the Amsterdam Cloud side, we can perform the following Neutron command: # neutron router-list The preceding command yields the following result: To get a list of networks that are attached to Router-AMS, we can execute the following command line: # neutron router-port-list Router-AMS The preceding command gives the following result: The external and private subnets that are associated with Router-AMS can be checked from Horizon as well. Now that we have the peer router gateway and remote private subnet, we will need to fill the same PSK that was configured previously. The next figure illustrates the new IPSec site connection on the OpenStack Cloud that is located in the Tunis site: At the Amsterdam site, we can check the creation of the new IPSec site connection by means of the neutron CLI, as follows: # neutron ipsec-site-connection-list The preceding command will give the following result: The same will be done at the Tunis site, as follows: # neutron ipsec-site-connection-list The preceding command gives the following result: Managing security groups For VPNaaS to work when connecting the Amsterdam and Tunis subnets, you will need to create a few additional rules in each project's default security group to enable not only the pings by adding a general ICMP rule, but also SSH on port 22. Additionally, we can create a new security group called Application_PP to restrict traffic on port 53 (DNS), 80 (HTTP), and 443 (HTTPS), as follows: # neutron security-group-create Application_PP --description "allow web traffic from the Internet" # neutron security-group-rule-create --direction ingress --protocol tcp --port_range_min 80 --port_range_max 80 Application_PP --remote-ip-prefix 0.0.0.0/0 # neutron security-group-rule-create --direction ingress --protocol tcp --port_range_min 53 --port_range_max 53 Application_PP --remote-ip-prefix 0.0.0.0/0 # neutron security-group-rule-create --direction ingress --protocol tcp --port_range_min 443 --port_range_max 443 Application_PP --remote-ip-prefix 0.0.0.0/0 From Horizon, we will see the following security group rules added: Summary In this article, we created a VPN setup between two sites using the Neutron VPN driver plugin in Neutron. This process included enabling the VPNaas, configuring the site, and creating an IKE policy, an IPSec policy, a VPN service, and an IPSec connection. To continue Mastering OpenStack, take a look to see what else you will learn in the book here.
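For reference, the IKE policy, IPSec policy, VPN service, and IPSec site connection created from Horizon in this article map onto the Neutron CLI roughly as follows, run on the Amsterdam side. The policy names, subnet name, addresses, CIDRs, and PSK are illustrative and must match the values configured on the Tunis side:

neutron vpn-ikepolicy-create --auth-algorithm sha1 --encryption-algorithm aes-128 \
  --ike-version v1 --pfs group5 --phase1-negotiation-mode main AMS-ike-policy

neutron vpn-ipsecpolicy-create --auth-algorithm sha1 --encapsulation-mode tunnel \
  --encryption-algorithm aes-128 --pfs group5 --transform-protocol esp AMS-ipsec-policy

neutron vpn-service-create --name AMS-vpn-service Router-AMS private-subnet-ams

neutron ipsec-site-connection-create --name AMS-to-TN \
  --vpnservice-id AMS-vpn-service --ikepolicy-id AMS-ike-policy --ipsecpolicy-id AMS-ipsec-policy \
  --peer-address 172.24.4.233 --peer-id 172.24.4.233 \
  --peer-cidr 10.10.10.0/24 --psk secretpsk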

AWS Global Infrastructure

Packt
06 Jul 2015
5 min read
In this article by Uchit Vyas, who is the author of the Mastering AWS Development book, we will see how to use AWS services in detail. It is important to have a choice of placing applications as close as possible to your users or customers, in order to ensure the lowest possible latency and best user experience while deploying them. AWS offers a choice of nine regions located all over the world (for example, East Coast of the United States, West Coast of the United States, Europe, Tokyo, Singapore, Sydney, and Brazil), 26 redundant Availability Zones, and 53 Amazon CloudFront points of presence. (For more resources related to this topic, see here.) It is very crucial and important to have the option to put applications as close as possible to your customers and end users by ensuring the best possible lowest latency and user-expected features and experience, when you are creating and deploying apps for performance. For this, AWS provides worldwide means to the regions located all over the world. To be specific via name and location, they are as follows: US East (Northern Virginia) region US West (Oregon) region US West (Northern California) region EU (Ireland) Region Asia Pacific (Singapore) region Asia Pacific (Sydney) region Asia Pacific (Tokyo) region South America (Sao Paulo) region US GovCloud In addition to regions, AWS has 25 redundant Availability Zones and 51 Amazon CloudFront points of presence. Apart from these infrastructure-level highlights, they have plenty of managed services that can be the cream of AWS candy bar! The managed services bucket has the following listed services: Security: For every organization, security in each and every aspect is the vital element. For that, AWS has several remarkable security features that distinguishes AWS from other Cloud providers as follows : Certifications and accreditations Identity and Access Management Right now, I am just underlining the very important security features. Global infrastructure: AWS provides a fully-functional, flexible technology infrastructure platform worldwide, with managed services over the globe with certain characteristics, for example: Multiple global locations for deployment Low-latency CDN service Reliable, low-latency DNS service Compute: AWS offers a huge range of various cloud-based core computing services (including variety of compute instances that can be auto scaled to justify the needs of your users and application), a managed elastic load balancing service, and more of fully managed desktop resources on the pathway of cloud. Some of the common characteristics of computer services include the following: Broad choice of resizable compute instances Flexible pricing opportunities Great discounts for always on compute resources Lower hourly rates for elastic workloads Wide-ranging networking configuration selections A widespread choice of operating systems Virtual desktops Save further as you grow with tiered pricing model Storage: AWS offers low cost with high durability and availability with their storage services. With pay-as-you-go pricing model with no commitment, provides more flexibility and agility in services and processes for storage with a highly secured environment. AWS provides storage solutions and services for backup, archive, disaster recovery, and so on. They also support block, file, and object kind of storages with a highly available and flexible infrastructure. 
A few major characteristics of the storage services are as follows:

Cost-effective, high-scale storage varieties
Data protection and data management
Storage gateway
Choice of instance storage options

Content delivery and networking: AWS offers a wide set of networking services that enables you to create a logically isolated network that the architect defines and to create a private network connection to the AWS infrastructure, with a fault tolerant, scalable, and highly available DNS service. It also provides content delivery services to your end users, with very low latency and high data transfer speeds, through the AWS CDN service. A few major characteristics of content delivery and networking are as follows:

Application and media files delivery
Software and large file distribution
Private content

Databases: AWS offers fully managed, distributed relational and NoSQL database services. Moreover, the database services are capable of in-memory caching, sharding, and scaling with or without data warehouse solutions. A few major database services are as follows:

RDS
DynamoDB
Redshift
ElastiCache

Application services: AWS provides a variety of managed application services at lower cost, such as application streaming and queuing, transcoding, push notification, searching, and so on. A few major application services are as follows:

AppStream
CloudSearch
Elastic Transcoder
SWF, SES, SNS, SQS

Deployment and management: AWS offers management of credentials to explore AWS services, such as monitoring services, application services, and updating stacks of AWS resources. They also have deployment and security services, along with AWS API activity tracking. A few major deployment and management services are as follows:

IAM
CloudWatch
Elastic Beanstalk
CloudFormation
Data Pipeline
OpsWorks
CloudHSM
CloudTrail

Summary

There are a few more important services from AWS, such as support, integration with existing infrastructure, Big Data, and the ecosystem, which put it on top of other infrastructure providers. As a cloud architect, it is necessary to learn about cloud service offerings and their all-important functionalities.

Resources for Article: Further resources on this subject: Amazon DynamoDB - Modelling relationships, Error handling [article] Managing Microsoft Cloud [article] Amazon Web Services [article]

Installing OpenStack Swift

Packt
04 Jun 2015
10 min read
In this article by Amar Kapadia, Sreedhar Varma, and Kris Rajana, authors of the book OpenStack Object Storage (Swift) Essentials, we will see how IT administrators can install OpenStack Swift. The version discussed here is the Juno release of OpenStack. Installation of Swift has several steps and requires careful planning before beginning the process. A simple installation consists of installing all Swift components on a single node, and a complex installation consists of installing Swift on several proxy server nodes and storage server nodes. The number of storage nodes can be in the order of thousands across multiple zones and regions. Depending on your installation, you need to decide on the number of proxy server nodes and storage server nodes that you will configure. This article demonstrates a manual installation process; advanced users may want to use utilities such as Puppet or Chef to simplify the process. This article walks you through an OpenStack Swift cluster installation that contains one proxy server and five storage servers. (For more resources related to this topic, see here.) Hardware planning This section describes the various hardware components involved in the setup. Since Swift deals with object storage, disks are going to be a major part of hardware planning. The size and number of disks required should be calculated based on your requirements. Networking is also an important component, where factors such as a public or private network and a separate network for communication between storage servers need to be planned. Network throughput of at least 1 GB per second is suggested, while 10 GB per second is recommended. The servers we set up as proxy and storage servers are dual quad-core servers with 12 GB of RAM. In our setup, we have a total of 15 x 2 TB disks for Swift storage; this gives us a total size of 30 TB. However, with in-built replication (with a default replica count of 3), Swift maintains three copies of the same data. Therefore, the effective capacity for storing files and objects is approximately 10 TB, taking filesystem overhead into consideration. This is further reduced due to less than 100 percent utilization. The following figure depicts the nodes of our Swift cluster configuration: The storage servers have container, object, and account services running in them. Server setup and network configuration All the servers are installed with the Ubuntu server operating system (64-bit LTS version 14.04). You'll need to configure three networks, which are as follows: Public network: The proxy server connects to this network. This network provides public access to the API endpoints within the proxy server. Storage network: This is a private network and it is not accessible to the outside world. All the storage servers and the proxy server will connect to this network. Communication between the proxy server and the storage servers and communication between the storage servers take place within this network. In our configuration, the IP addresses assigned in this network are 172.168.10.0 and 172.168.10.99. Replication network: This is also a private network that is not accessible to the outside world. It is dedicated to replication traffic, and only storage servers connect to it. All replication-related communication between storage servers takes place within this network. In our configuration, the IP addresses assigned in this network are 172.168.9.0 and 172.168.9.99. 
Pre-installation steps

In order for the various servers to communicate easily, edit the /etc/hosts file and add the host names of each server to it. This has to be done on all the nodes. An illustrative example of these /etc/hosts entries for the proxy server node appears in the sketch at the end of this section. Install the Network Time Protocol (NTP) service on the proxy server node and the storage server nodes. This helps all the nodes synchronize their services effectively without any clock delays. The pre-installation steps to be performed are as follows:

Run the following command to install the NTP service:

# apt-get install ntp

Configure the proxy server node to be the reference server that the storage server nodes set their time from. Make sure that the following line is present in /etc/ntp.conf on the proxy server node:

server ntp.ubuntu.com

For the NTP configuration on the storage server nodes, add the following line to /etc/ntp.conf. Comment out the remaining lines with server addresses such as 0.ubuntu.pool.ntp.org, 1.ubuntu.pool.ntp.org, 2.ubuntu.pool.ntp.org, and 3.ubuntu.pool.ntp.org:

# server 0.ubuntu.pool.ntp.org
# server 1.ubuntu.pool.ntp.org
# server 2.ubuntu.pool.ntp.org
# server 3.ubuntu.pool.ntp.org
server s-swift-proxy

Restart the NTP service on each server with the following command:

# service ntp restart

Downloading and installing Swift

The Ubuntu Cloud Archive is a special repository that allows users to install new releases of OpenStack. The steps required to download and install Swift are as follows:

Enable the capability to install new releases of OpenStack, and install the latest version of Swift on each node using the commands shown in the original article. The second command creates a file named cloudarchive-juno.list in /etc/apt/sources.list.d, whose content is "deb http://ubuntu-cloud.archive.canonical.com/ubuntu".

Now, update the OS using the following command:

# apt-get update && apt-get dist-upgrade

On all the Swift nodes, we will install the prerequisite software and services using this command:

# apt-get install swift rsync memcached python-netifaces python-xattr python-memcache

Next, we create a swift folder under /etc and give users permission to access this folder, using the following commands:

# mkdir -p /etc/swift/
# chown -R swift:swift /etc/swift

Download the /etc/swift/swift.conf file from GitHub using this command:

# curl -o /etc/swift/swift.conf https://raw.githubusercontent.com/openstack/swift/stable/juno/etc/swift.conf-sample

Modify the /etc/swift/swift.conf file and add a variable called swift_hash_path_suffix in the swift-hash section. We then create a unique hash string using # python -c "from uuid import uuid4; print uuid4()" or # openssl rand -hex 10, and assign it to this variable. We then add another variable called swift_hash_path_prefix to the swift-hash section, and assign to it another hash string created using the method described in the preceding step. These strings are used in the hashing process to determine the mappings in the ring. The swift.conf file must be identical on all the nodes in the cluster.
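The screenshots referenced in the preceding steps are not reproduced here. The following sketch shows what the /etc/hosts entries and the swift-hash values might look like for this example cluster; the storage host names, IP addresses, and hash strings are illustrative placeholders rather than values taken from the book.

# Hypothetical /etc/hosts entries (proxy plus five storage nodes on the storage network):
cat <<'EOF' | sudo tee -a /etc/hosts
172.168.10.51 s-swift-proxy
172.168.10.52 s-swift-storage1
172.168.10.53 s-swift-storage2
172.168.10.54 s-swift-storage3
172.168.10.55 s-swift-storage4
172.168.10.56 s-swift-storage5
EOF

# Generate the two random strings and print a [swift-hash] section that can be
# pasted into /etc/swift/swift.conf; the same values must be used on every node.
SUFFIX=$(openssl rand -hex 10)
PREFIX=$(openssl rand -hex 10)
printf '[swift-hash]\nswift_hash_path_suffix = %s\nswift_hash_path_prefix = %s\n' "$SUFFIX" "$PREFIX"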
Setting up storage server nodes

This section explains the additional steps needed to set up the storage server nodes, which will run the object, container, and account services.

Installing services

The first step required to set up a storage server node is installing the services. Let's look at the steps involved:

On each storage server node, install the packages for the swift-account, swift-container, and swift-object services, along with xfsprogs (for the XFS filesystem), using this command:

# apt-get install swift-account swift-container swift-object xfsprogs

Download the account-server.conf, container-server.conf, and object-server.conf samples from GitHub, using the following commands:

# curl -o /etc/swift/account-server.conf https://raw.githubusercontent.com/openstack/swift/stable/juno/etc/account-server.conf-sample
# curl -o /etc/swift/container-server.conf https://raw.githubusercontent.com/openstack/swift/stable/juno/etc/container-server.conf-sample
# curl -o /etc/swift/object-server.conf https://raw.githubusercontent.com/openstack/swift/stable/juno/etc/object-server.conf-sample

Edit the /etc/swift/account-server.conf file with the appropriate section for your environment. Do the same for the /etc/swift/container-server.conf and /etc/swift/object-server.conf files.

Formatting and mounting hard disks

On each storage server node, we need to identify the hard disks that will be used to store data. We will then format the hard disks and mount them on a directory, which Swift will then use to store data. We will not create any RAID levels or subpartitions on these hard disks because they are not necessary for Swift; they will be used as entire disks. The operating system will be installed on separate disks, which will be RAID configured.

First, identify the hard disks that are going to be used for storage and format them. In our storage server, we have identified sdb, sdc, and sdd to be used for storage. We will perform the following operations on sdb; these four steps should be repeated for sdc and sdd as well:

Carry out the partitioning for sdb and create the filesystem using these commands:

# fdisk /dev/sdb
# mkfs.xfs /dev/sdb1

Then create a directory at /srv/node/sdb1 that will be used to mount the filesystem, and give the swift user permission to access this directory. These operations can be performed using the following commands:

# mkdir -p /srv/node/sdb1
# chown -R swift:swift /srv/node/sdb1

We set up an entry in fstab for the sdb1 partition on the sdb hard disk, as follows. This will automatically mount sdb1 on /srv/node/sdb1 upon every boot. Add the following line to the /etc/fstab file:

/dev/sdb1 /srv/node/sdb1 xfs noatime,nodiratime,nobarrier,logbufs=8 0 2

Mount sdb1 on /srv/node/sdb1 using the following command:

# mount /srv/node/sdb1

RSYNC and RSYNCD

In order for Swift to perform replication of data, we need to configure rsync by setting up rsyncd.conf. This is done by performing the following steps:

Create the rsyncd.conf file in the /etc folder:

# vi /etc/rsyncd.conf

We set up synchronization within the replication network by adding module definitions for the account, container, and object services to this configuration file (an illustrative rsyncd.conf is sketched after this section). 172.168.9.52 is the IP address that is on the replication network for this storage server. Use the appropriate replication network IP addresses for the corresponding storage servers.
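The configuration referenced above is not reproduced in this extract. As a reference, here is a minimal sketch of what the rsyncd.conf modules for the account, container, and object services typically look like; the address shown is the replication network IP of this particular storage server, and the path assumes the /srv/node mount point used earlier. Treat it as an illustration under those assumptions rather than the book's exact configuration.

cat <<'EOF' | sudo tee /etc/rsyncd.conf
uid = swift
gid = swift
log file = /var/log/rsyncd.log
pid file = /var/run/rsyncd.pid
address = 172.168.9.52

[account]
max connections = 2
path = /srv/node/
read only = false
lock file = /var/lock/account.lock

[container]
max connections = 2
path = /srv/node/
read only = false
lock file = /var/lock/container.lock

[object]
max connections = 2
path = /srv/node/
read only = false
lock file = /var/lock/object.lock
EOF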
We then have to edit the /etc/default/rsync file and set RSYNC_ENABLE to true using the following configuration option:

RSYNC_ENABLE=true

Next, we restart the rsync service using this command:

# service rsync restart

Then we create the swift cache and recon directories using the following commands:

# mkdir -p /var/cache/swift
# mkdir -p /var/swift/recon

Setting permissions on them is done using these commands:

# chown -R swift:swift /var/cache/swift
# chown -R swift:swift /var/swift/recon

Repeat these steps on every storage server.

Setting up the proxy server node

This section explains the steps required to set up the proxy server node, which are as follows:

Install the following services only on the proxy server node:

# apt-get install python-swiftclient python-keystoneclient python-keystonemiddleware swift-proxy

Swift does not natively support HTTPS; OpenSSL, which was already installed as part of the operating system installation, is used to support HTTPS. We are going to use the OpenStack Keystone service for authentication. In order to set up the proxy-server.conf file for this, we download the sample configuration file from the following link and edit it:

https://raw.githubusercontent.com/openstack/swift/stable/juno/etc/proxy-server.conf-sample

# vi /etc/swift/proxy-server.conf

The proxy-server.conf file should be edited so that it contains the correct auth_host, admin_token, admin_tenant_name, admin_user, and admin_password values:

admin_token = 01d8b673-9ebb-41d2-968a-d2a85daa1324
admin_tenant_name = admin
admin_user = admin
admin_password = changeme

Next, we create a keystone-signing directory and give the swift user permissions to it using the following commands:

# mkdir -p /home/swift/keystone-signing
# chown -R swift:swift /home/swift/keystone-signing

Summary

In this article, you learned how to install and set up the OpenStack Swift service to provide object storage, and how to install and set up the Keystone service to provide authentication for users to access the Swift object storage.

Resources for Article:

Further resources on this subject:
Troubleshooting in OpenStack Cloud Computing [Article]
Using OpenStack Swift [Article]
Playing with Swift [Article]

Introduction to Microsoft Azure Cloud Services

Packt
04 Jun 2015
10 min read
In this article by Gethyn Ellis, author of the book Microsoft Azure IaaS Essentials, we will understand cloud computing and the various services offered by it. (For more resources related to this topic, see here.) Understanding cloud computing What do we mean when we talk about cloud from an information technology perspective? People mention cloud services; where do we get the services from? What services are offered? The Wikipedia definition of cloud computing is as follows: "Cloud computing is a computing term or metaphor that evolved in the late 1990s, based on utility and consumption of computer resources. Cloud computing involves application systems which are executed within the cloud and operated through internet enabled devices. Purely cloud computing does not rely on the use of cloud storage as it will be removed upon users download action. Clouds can be classified as public, private and [hybrid cloud|hybrid]." If you have worked with virtualization, then the concept of cloud is not completely alien to you. With virtualization, you can group a bunch of powerful hardware together, using a hypervisor. A hypervisor is a kind of software, operating system, or firmware that allows you to run virtual machines. Some of the popular Hypervisors on the market are VMware ESX or Microsoft's Hyper-V. Then, you can use this powerful hardware to run a set of virtual servers or guests. The guests share the resources of the host in order to execute and provide the services and computing resources of your IT department. The IT department takes care of everything from maintaining the hypervisor hosts to managing and maintaining the virtual servers and guests. The internal IT department does all the work. This is sometimes termed as a private cloud. Third-party suppliers, such as Microsoft, VMware, and Amazon, have a public cloud offering. With a public cloud, some computing services are provided to you on the Internet, and you can pay for what you use, which is like a utility bill. For example, let's take the utilities you use at home. This model can be really useful for start-up business that might not have an accurate demand forecast for their services, or the demand may change very quickly. Cloud computing can also be very useful for established businesses, who would like to make use of the elastic billing model. The more services you consume, the more you pay when you get billed at the end of the month. There are various types of public cloud offerings and services from a number of different providers. The TechNet top ten cloud providers are as follows: VMware Microsoft Bluelock Citrix Joyent Terrmark Salesforce.com Century Link RackSpace Amazon Web Services It is interesting to read that in 2013, Microsoft was only listed ninth in the list. With a new CEO, Microsoft has taken a new direction and put its Azure cloud offering at the heart of the business model. To quote one TechNet 2014 attendee: "TechNet this year was all about Azure, even the on premises stuff was built on the Azure model" With a different direction, it seems pretty clear that Microsoft is investing heavily in its cloud offering, and this will be further enhanced with further investment. This will allow a hybrid cloud environment, a combination of on-premises and public cloud, to be combined to offer organizations that ultimate flexibility when it comes to consuming IT resources. Services offered The term cloud is used to describe a variety of service offerings from multiple providers. 
You could argue, in fact, that the term cloud doesn't actually mean anything specific in terms of the service that you're consuming. It is, in fact, just a term that means you are consuming an IT service from a provider. Be it an internal IT department in the form of a private cloud or a public offering from some cloud provider, a public cloud, or it could be some combination of both in the form of a hybrid cloud. So, then what are the services that cloud providers offer? Virtualization and on-premises technology Most business even in today's cloudy environment has some on-premises technology. Until virtualization became popular and widely deployed several years ago, it was very common to have a one-to-one relationship between a physical hardware server with its own physical resources, such as CPU, RAM, storage, and the operating system installed on the physical server. It became clear that in this type of environment, you would need a lot of physical servers in your data center. An expanding and sometimes, a sprawling environment brings its own set of problems. The servers need cooling and heat management as well as a power source, and all the hardware and software needs to be maintained. Also, in terms of utilization, this model left lots of resources under-utilized: Virtualization changed this to some extent. With virtualization, you can create several guests or virtual servers that are configured to share the resources of the underlying host, each with their own operating system installed. It is possible to run both a Windows and Linux guest on the same physical host using virtualization. This allows you to maximize the resource utilization and allows your business to get a better return on investment on its hardware infrastructure: Virtualization is very much a precursor to cloud; many virtualized environments are sometimes called private clouds, so having an understanding of virtualization and how it works will give you a good grounding in some of the concepts of a cloud-based infrastructure. Software as a service (SaaS) SaaS is a subscription where you need to pay to use the software for the time that you're using it. You don't own any of the infrastructures, and you don't have to manage any of the servers or operating systems, you simply consume the software that you will be using. You can think of SaaS as like taking a taxi ride. When you take a taxi ride, you don't own the car, you don't need to maintain the car, and you don't even drive the car. You simply tell the taxi driver or his company when and where you want to travel somewhere, and they will take care of getting you there. The longer the trip, that is, the longer you use the taxi, the more you pay. An example of Microsoft's Software as a service would be the Azure SQL Database. The following diagram shows the cloud-based SQL database: Microsoft offers customers a SQL database that is fully hosted and maintained in Microsoft data centers, and the customer simply has to make use of the service and the database. So, we can compare this to having an on-premises database. To have an on-premises database, you need a Windows Server machine (physical or virtual) with the appropriate version of SQL Server installed. 
The server would need enough CPU, RAM, and storage to fulfill the needs of your database, and you need to manage and maintain the environment, applying various patches to the operating systems as they become available, installing, and testing various SQL Server service packs as they become available, and all the while, your application makes use of the database platform. With the SQL Azure database, you have no overhead, you simply need to connect to the Microsoft Azure portal and request a SQL database by following the wizard: Simply, give the database a name. In this case, it's called Helpdesk, select the service tier you want. In this example, I have chosen the Basic service tier. The service tier will define things, such as the resources available to your database, and impose limits, in terms of database size. With the Basic tier, you have a database size limit of 2 GB. You can specify the server that you want to create your database with, accept the defaults on the other settings, click on the check button, and the database gets created: It's really that simple. You will then pay for what you use in terms of database size and data access. In a later section, you will see how to set up a Microsoft Azure account. Platform as a service (PaaS) With PaaS, you rent the hardware, operating system, storage, and network from the public cloud service provider. PaaS is an offshoot of SaaS. Initially, SaaS didn't take off quickly, possibly because of the lack of control that IT departments and business thought they were going to suffer as a result of using the SaaS cloud offering. Going back to the transport analogy, you can compare PaaS to car rentals. When you rent a car, you don't need to make the car, you don't need to own the car, and you have no responsibility to maintain the car. You do, however, need to drive the car if you are going to get to your required destination. In PaaS terms, the developer and the system administrator have slightly more control over how the environment is set up and configured but still much of the work is taken care of by the cloud service provider. So, the hardware, operating system, and all the other components that run your application are managed and taken care of by the cloud provider, but you get a little more control over how things are configured. A geographically dispersed website would be a good example of an application offered on a PaaS offering. Infrastructure as a service (IaaS) With IaaS, you have much more control over the environment, and everything is customizable. Going with the transport analogy again, you can compare it to buying a car. The service provides you with the car upfront, and you are then responsible for using the car to ensure that it gets you from A to B. You are also responsible to fix the car if something goes wrong, and also ensure that the car is maintained by servicing it regularly, adding fuel, checking the tyre pressure, and so on. You have more control, but you also have more to do in terms of maintenance. Microsoft Azure has an offering. You can deploy a virtual machine, you can specify what OS you want, how much RAM you want the virtual machine to have, you can specify where the server will sit in terms of Microsoft data centers, and you can set up and configure recoverability and high availability for your Azure virtual machine: Hybrid environments With a hybrid environment, you get a combination of on-premises infrastructure and cloud services. 
It allows you to flexibly add resilience and high availability to your existing infrastructure. It's perfectly possible for the cloud to act as a disaster recovery site for your existing infrastructure. Microsoft Azure In order to work with the examples in this article, you need sign up for a Microsoft account. You can visit http://azure.microsoft.com/, and create an account all by yourself by completing the necessary form as follows: Here, you simply enter your details; you can use your e-mail address as your username. Enter the credentials specified. Return to the Azure website, and if you want to make use of the free trial, click on the free trial link. Currently, you get $125 worth of free Azure services. Once you have clicked on the free trial link, you will have to verify your details. You will also need to enter a credit card number and its details. Microsoft assures that you won't be charged during the free trial. Enter the appropriate details and click on Sign Up: Summary In this article, we looked at and discussed some of the terminology around the cloud. From the services offered to some of the specific features available in Microsoft Azure, you should be able to differentiate between a public and private cloud. You can also now differentiate between some of the public cloud offerings. Resources for Article: Further resources on this subject: Windows Azure Service Bus: Key Features [article] Digging into Windows Azure Diagnostics [article] Using the Windows Azure Platform PowerShell Cmdlets [article]

Architecture and Component Overview

Packt
22 May 2015
14 min read
In this article by Dan Radez, author of the book OpenStack Essentials, we will be understanding the internal architecture of the components that make up OpenStack. OpenStack has a very modular design, and because of this design, there are lots of moving parts. It's overwhelming to start walking through installing and using OpenStack without understanding the internal architecture of the components that make up OpenStack. Each component in OpenStack manages a different resource that can be virtualized for the end user. Separating each of the resources that can be virtualized into separate components makes the OpenStack architecture very modular. If a particular service or resource provided by a component is not required, then the component is optional to an OpenStack deployment. Let's start by outlining some simple categories to group these services into. (For more resources related to this topic, see here.) OpenStack architecture Logically, the components of OpenStack can be divided into three groups: Control Network Compute The control tier runs the Application Programming Interfaces (API) services, web interface, database, and message bus. The network tier runs network service agents for networking, and the compute node is the virtualization hypervisor. It has services and agents to handle virtual machines. All of the components use a database and/or a message bus. The database can be MySQL, MariaDB, or PostgreSQL. The most popular message buses are RabbitMQ, Qpid, and ActiveMQ. For smaller deployments, the database and messaging services usually run on the control node, but they could have their own nodes if required. In a simple multi-node deployment, each of these groups is installed onto a separate server. OpenStack could be installed on one node or two nodes, but a good baseline for being able to scale out later is to put each of these groups on their own node. An OpenStack cluster can also scale far beyond three nodes. Now that a base logical architecture of OpenStack is defined, let's look at what components make up this basic architecture. To do that, we'll first touch on the web interface and then work towards collecting the resources necessary to launch an instance. Finally, we will look at what components are available to add resources to a launched instance. Dashboard The OpenStack dashboard is the web interface component provided with OpenStack. You'll sometimes hear the terms dashboard and Horizon used interchangeably. Technically, they are not the same thing. This web interface is referred to as the dashboard. The team that develops the web interface maintains both the dashboard interface and the Horizon framework that the dashboard uses. More important than getting these terms right is understanding the commitment that the team that maintains this code base has made to the OpenStack project. They have pledged to include support for all the officially accepted components that are included in OpenStack. Visit the OpenStack website (http://www.openstack.org/) to get an official list of OpenStack components. The dashboard cannot do anything that the API cannot do. All the actions that are taken through the dashboard result in calls to the API to complete the task requested by the end user. Throughout, we will examine how to use the web interface and the API clients to execute tasks in an OpenStack cluster. Next, we will discuss both the dashboard and the underlying components that the dashboard makes calls to when creating OpenStack resources. 
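Because everything the dashboard does goes through the same APIs, the command-line clients can be used interchangeably with the web interface. As a hedged illustration of what that looks like in practice (the controller host name, user, tenant, and password below are placeholders, not values from this book), a client session is typically authenticated by exporting credentials and then calling the per-service clients:

# Credentials that the CLI clients read from the environment; values are examples only.
export OS_AUTH_URL=http://controller:5000/v2.0
export OS_TENANT_NAME=admin
export OS_USERNAME=admin
export OS_PASSWORD=secret

# Each client contacts its own API endpoint, discovered through Keystone's service catalog.
keystone token-get    # request a token from the identity service
glance image-list     # list images registered with the image service
nova list             # list instances known to the compute service

The same requests are what the dashboard issues on your behalf when you click through the web interface.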
Keystone Keystone is the identity management component. The first thing that needs to happen while connecting to an OpenStack deployment is authentication. In its most basic installation, Keystone will manage tenants, users, and roles and be a catalog of services and endpoints for all the components in the running cluster. Everything in OpenStack must exist in a tenant. A tenant is simply a grouping of objects. Users, instances, and networks are examples of objects. They cannot exist outside of a tenant. Another name for a tenant is project. On the command line, the term tenant is used. In the web interface, the term project is used. Users must be granted a role in a tenant. It's important to understand this relationship between the user and a tenant via a role. We will look at how to create the user and tenant and how to associate the user with a role in a tenant. For now, understand that a user cannot log in to the cluster unless they are members of a tenant. Even the administrator has a tenant. Even the users the OpenStack components use to communicate with each other have to be members of a tenant to be able to authenticate. Keystone also keeps a catalog of services and endpoints of each of the OpenStack components in the cluster. This is advantageous because all of the components have different API endpoints. By registering them all with Keystone, an end user only needs to know the address of the Keystone server to interact with the cluster. When a call is made to connect to a component other than Keystone, the call will first have to be authenticated, so Keystone will be contacted regardless. Within the communication to Keystone, the client also asks Keystone for the address of the component the user intended to connect to. This makes managing the endpoints easier. If all the endpoints were distributed to the end users, then it would be a complex process to distribute a change in one of the endpoints to all of the end users. By keeping the catalog of services and endpoints in Keystone, a change is easily distributed to end users as new requests are made to connect to the components. By default, Keystone uses username/password authentication to request a token and Public Key Infrastructure (PKI) tokens for subsequent requests. The token has a user's roles and tenants encoded into it. All the components in the cluster can use the information in the token to verify the user and the user's access. Keystone can also be integrated into other common authentication systems instead of relying on the username and password authentication provided by Keystone. Glance Glance is the image management component. Once we're authenticated, there are a few resources that need to be available for an instance to launch. The first resource we'll look at is the disk image to launch from. Before a server is useful, it needs to have an operating system installed on it. This is a boilerplate task that cloud computing has streamlined by creating a registry of pre-installed disk images to boot from. Glance serves as this registry within an OpenStack deployment. In preparation for an instance to launch, a copy of a selected Glance image is first cached to the compute node where the instance is being launched. Then, a copy is made to the ephemeral disk location of the new instance. Subsequent instances launched on the same compute node using the same disk image will use the cached copy of the Glance image. The images stored in Glance are sometimes called sealed-disk images. 
These images are disk images that have had the operating system installed but have had things such as Secure Shell (SSH) host key, and network device MAC addresses removed. This makes the disk images generic, so they can be reused and launched repeatedly without the running copies conflicting with each other. To do this, the host-specific information is provided or generated at boot. The provided information is passed in through a post-boot configuration facility called cloud-init. The images can also be customized for special purposes beyond a base operating system install. If there was a specific purpose for which an instance would be launched many times, then some of the repetitive configuration tasks could be performed ahead of time and built into the disk image. For example, if a disk image was intended to be used to build a cluster of web servers, it would make sense to install a web server package on the disk image before it was used to launch an instance. It would save time and bandwidth to do it once before it is registered with Glance instead of doing this package installation and configuration over and over each time a web server instance is booted. There are quite a few ways to build these disk images. The simplest way is to do a virtual machine install manually, make sure that the host-specific information is removed, and include cloud-init in the built image. Cloud-init is packaged in most major distributions; you should be able to simply add it to a package list. There are also tools to make this happen in a more autonomous fashion. Some of the more popular tools are virt-install, Oz, and appliance-creator. The most important thing about building a cloud image for OpenStack is to make sure that cloud-init is installed. Cloud-init is a script that should run post boot to connect back to the metadata service. Neutron Neutron is the network management component. With Keystone, we're authenticated, and from Glance, a disk image will be provided. The next resource required for launch is a virtual network. Neutron is an API frontend (and a set of agents) that manages the Software Defined Networking (SDN) infrastructure for you. When an OpenStack deployment is using Neutron, it means that each of your tenants can create virtual isolated networks. Each of these isolated networks can be connected to virtual routers to create routes between the virtual networks. A virtual router can have an external gateway connected to it, and external access can be given to each instance by associating a floating IP on an external network with an instance. Neutron then puts all configuration in place to route the traffic sent to the floating IP address through these virtual network resources into a launched instance. This is also called Networking as a Service (NaaS). NaaS is the capability to provide networks and network resources on demand via software. By default, the OpenStack distribution we will install uses Open vSwitch to orchestrate the underlying virtualized networking infrastructure. Open vSwitch is a virtual managed switch. As long as the nodes in your cluster have simple connectivity to each other, Open vSwitch can be the infrastructure configured to isolate the virtual networks for the tenants in OpenStack. There are also many vendor plugins that would allow you to replace Open vSwitch with a physical managed switch to handle the virtual networks. Neutron even has the capability to use multiple plugins to manage multiple network appliances. 
As an example, Open vSwitch and a vendor's appliance could be used in parallel to manage virtual networks in an OpenStack deployment. This is a great example of how OpenStack is built to provide flexibility and choice to its users. Networking is the most complex component of OpenStack to configure and maintain. This is because Neutron is built around core networking concepts. To successfully deploy Neutron, you need to understand these core concepts and how they interact with one another. Nova Nova is the instance management component. An authenticated user who has access to a Glance image and has created a network for an instance to live on is almost ready to tie all of this together and launch an instance. The last resources that are required are a key pair and a security group. A key pair is simply an SSH key pair. OpenStack will allow you to import your own key pair or generate one to use. When the instance is launched, the public key is placed in the authorized_keys file so that a password-less SSH connection can be made to the running instance. Before that SSH connection can be made, the security groups have to be opened to allow the connection to be made. A security group is a firewall at the cloud infrastructure layer. The OpenStack distribution we'll use will have a default security group with rules to allow instances to communicate with each other within the same security group, but rules will have to be added for Internet Control Message Protocol (ICMP), SSH, and other connections to be made from outside the security group. Once there's an image, network, key pair, and security group available, an instance can be launched. The resource's identifiers are provided to Nova, and Nova looks at what resources are being used on which hypervisors, and schedules the instance to spawn on a compute node. The compute node gets the Glance image, creates the virtual network devices, and boots the instance. During the boot, cloud-init should run and connect to the metadata service. The metadata service provides the SSH public key needed for SSH login to the instance and, if provided, any post-boot configuration that needs to happen. This could be anything from a simple shell script to an invocation of a configuration management engine. Cinder Cinder is the block storage management component. Volumes can be created and attached to instances. Then, they are used on the instances as any other block device would be used. On the instance, the block device can be partitioned and a file system can be created and mounted. Cinder also handles snapshots. Snapshots can be taken of the block volumes or of instances. Instances can also use these snapshots as a boot source. There is an extensive collection of storage backends that can be configured as the backing store for Cinder volumes and snapshots. By default, Logical Volume Manager (LVM) is configured. GlusterFS and Ceph are two popular software-based storage solutions. There are also many plugins for hardware appliances. Swift Swift is the object storage management component. Object storage is a simple content-only storage system. Files are stored without the metadata that a block file system has. These are simply containers and files. The files are simply content. Swift has two layers as part of its deployment: the proxy and the storage engine. The proxy is the API layer. It's the service that the end user communicates with. The proxy is configured to talk to the storage engine on the user's behalf. 
By default, the storage engine is the Swift storage engine. It's able to do software-based storage distribution and replication. GlusterFS and Ceph are also popular storage backends for Swift. They have similar distribution and replication capabilities to those of Swift storage. Ceilometer Ceilometer is the telemetry component. It collects resource measurements and is able to monitor the cluster. Ceilometer was originally designed as a metering system for billing users. As it was being built, there was a realization that it would be useful for more than just billing and turned into a general-purpose telemetry system. Ceilometer meters measure the resources being used in an OpenStack deployment. When Ceilometer reads a meter, it's called a sample. These samples get recorded on a regular basis. A collection of samples is called a statistic. Telemetry statistics will give insights into how the resources of an OpenStack deployment are being used. The samples can also be used for alarms. Alarms are nothing but monitors that watch for a certain criterion to be met. These alarms were originally designed for Heat autoscaling. Heat Heat is the orchestration component. Orchestration is the process of launching multiple instances that are intended to work together. In orchestration, there is a file, known as a template, used to define what will be launched. In this template, there can also be ordering or dependencies set up between the instances. Data that needs to be passed between the instances for configuration can also be defined in these templates. Heat is also compatible with AWS CloudFormation templates and implements additional features in addition to the AWS CloudFormation template language. To use Heat, one of these templates is written to define a set of instances that needs to be launched. When a template launches a collection of instances, it's called a stack. When a stack is spawned, the ordering and dependencies, shared conflagration data, and post-boot configuration are coordinated via Heat. Heat is not configuration management. It is orchestration. It is intended to coordinate launching the instances, passing configuration data, and executing simple post-boot configuration. A very common post-boot configuration task is invoking an actual configuration management engine to execute more complex post-boot configuration. Summary The list of components that have been covered is not the full list. This is just a small subset to get you started with using and understanding OpenStack. Resources for Article: Further resources on this subject: Creating Routers [article] Using OpenStack Swift [article] Troubleshooting in OpenStack Cloud Computing [article]

Patterns for Data Processing

Packt
29 Apr 2015
12 min read
In this article by Marcus Young, the author of the book Implementing Cloud Design Patterns for AWS, we will cover the following patterns:

Queuing chain pattern
Job observer pattern

(For more resources related to this topic, see here.)

Queuing chain pattern

In the queuing chain pattern, we will use a type of publish-subscribe (pub-sub) model with an instance that generates work asynchronously for another server to pick up and work with. This is described in the following diagram: the diagram depicts the scenario we will solve, which is solving fibonacci numbers asynchronously. We will spin up a Creator server that will generate random integers and publish them into an SQS queue, myinstance-tosolve. We will then spin up a second instance that continuously attempts to grab a message from the myinstance-tosolve queue, solves the fibonacci sequence of the number contained in the message body, and stores the result as a new message in the myinstance-solved queue. Information on the fibonacci algorithm can be found at http://en.wikipedia.org/wiki/Fibonacci_number. This scenario is very basic, as it is the core of the microservices architectural model. In this scenario, we could add as many worker servers as we see fit with no change to infrastructure, which is the real power of the microservices model.

The first thing we will do is create a new SQS queue. From the SQS console, select Create New Queue. From the Create New Queue dialog, enter myinstance-tosolve into the Queue Name text box and select Create Queue. This will create the queue and bring you back to the main SQS console, where you can view the queues created. Repeat this process, entering myinstance-solved for the second queue name. When complete, the SQS console should list both queues. In the following code snippets, you will need the URL for the queues. You can retrieve them from the SQS console by selecting the appropriate queue, which will bring up an information box; the queue URL is listed as URL.

Next, we will launch a creator instance, which will create random integers and write them into the myinstance-tosolve queue via its URL noted previously. From the EC2 console, spin up an instance as per your environment from the AWS Linux AMI. Once it is ready, SSH into it (note that acctarn, mykey, and mysecret need to be replaced with your actual credentials):

[ec2-user@ip-10-203-10-170 ~]$ [[ -d ~/.aws ]] && rm -rf ~/.aws/config || mkdir ~/.aws
[ec2-user@ip-10-203-10-170 ~]$ echo $'[default]\naws_access_key_id=mykey\naws_secret_access_key=mysecret\nregion=us-east-1' > .aws/config
[ec2-user@ip-10-203-10-170 ~]$ for i in {1..100}; do
value=$(shuf -i 1-50 -n 1)
aws sqs send-message --queue-url https://queue.amazonaws.com/acctarn/myinstance-tosolve --message-body ${value} >/dev/null 2>&1
done

Once the snippet completes, we should have 100 messages in the myinstance-tosolve queue, ready to be retrieved.
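Before moving on, it can be reassuring to confirm that the creator actually queued the work. A quick way to do this from the same shell (the account ID in the queue URL is a placeholder, as before) is to ask SQS for the approximate message count:

[ec2-user@ip-10-203-10-170 ~]$ aws sqs get-queue-attributes \
  --queue-url https://queue.amazonaws.com/acctarn/myinstance-tosolve \
  --attribute-names ApproximateNumberOfMessages

A value close to 100 indicates that the creator loop completed successfully.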
Now that those messages are ready to be picked up and solved, we will spin up a new EC2 instance, again as per your environment from the AWS Linux AMI. Once it is ready, SSH into it (note that acctarn, mykey, and mysecret need to be valid and set to your credentials):

[ec2-user@ip-10-203-10-169 ~]$ [[ -d ~/.aws ]] && rm -rf ~/.aws/config || mkdir ~/.aws
[ec2-user@ip-10-203-10-169 ~]$ echo $'[default]\naws_access_key_id=mykey\naws_secret_access_key=mysecret\nregion=us-east-1' > .aws/config
[ec2-user@ip-10-203-10-169 ~]$ sudo yum install -y ruby-devel gcc >/dev/null 2>&1
[ec2-user@ip-10-203-10-169 ~]$ sudo gem install json >/dev/null 2>&1
[ec2-user@ip-10-203-10-169 ~]$ cat <<'EOF' | sudo tee -a /usr/local/bin/fibsqs >/dev/null 2>&1
#!/bin/sh
while [ true ]; do

function fibonacci {
  a=1
  b=1
  i=0
  while [ $i -lt $1 ]
  do
    printf "%d\n" $a
    let sum=$a+$b
    let a=$b
    let b=$sum
    let i=$i+1
  done
}

message=$(aws sqs receive-message --queue-url https://queue.amazonaws.com/acctarn/myinstance-tosolve)
if [[ -n $message ]]; then
  body=$(echo $message | ruby -e "require 'json'; p JSON.parse(gets)['Messages'][0]['Body']" | sed 's/"//g')
  handle=$(echo $message | ruby -e "require 'json'; p JSON.parse(gets)['Messages'][0]['ReceiptHandle']" | sed 's/"//g')
  aws sqs delete-message --queue-url https://queue.amazonaws.com/acctarn/myinstance-tosolve --receipt-handle $handle
  echo "Solving '${body}'."
  solved=$(fibonacci $body)
  parsed_solve=$(echo $solved | sed 's/\n/ /g')
  echo "'${body}' solved."
  aws sqs send-message --queue-url https://queue.amazonaws.com/acctarn/myinstance-solved --message-body "${parsed_solve}"
fi

sleep 1
done
EOF
[ec2-user@ip-10-203-10-169 ~]$ sudo chown ec2-user:ec2-user /usr/local/bin/fibsqs && chmod +x /usr/local/bin/fibsqs

There will be no output from this code snippet yet, so now let's run the fibsqs command we created. This will continuously poll the myinstance-tosolve queue, solve the fibonacci sequence for each integer, and store the result in the myinstance-solved queue:

[ec2-user@ip-10-203-10-169 ~]$ fibsqs
Solving '48'.
'48' solved.
{
    "MD5OfMessageBody": "73237e3b7f7f3491de08c69f717f59e6",
    "MessageId": "a249b392-0477-4afa-b28c-910233e7090f"
}
Solving '6'.
'6' solved.
{
    "MD5OfMessageBody": "620b0dd23c3dddbac7cce1a0d1c8165b",
    "MessageId": "9e29f847-d087-42a4-8985-690c420ce998"
}

While this is running, we can verify the movement of messages from the tosolve queue into the solved queue by viewing the Messages Available column in the SQS console. This shows that the worker virtual machine is in fact doing work, but we can prove that it is working correctly by viewing the messages in the myinstance-solved queue. To view messages, right-click on the myinstance-solved queue and select View/Delete Messages. If this is your first time viewing messages in SQS, you will receive a warning box that explains the impact of viewing messages in a queue. From the View/Delete Messages in myinstance-solved dialog, select Start Polling for Messages. We can now see that we are in fact working from a queue.

Job observer pattern

The previous two patterns show a very basic understanding of passing messages around a complex system so that components (machines) can work independently from each other. While they are a good starting place, the system as a whole could improve if it were more autonomous. Given the previous example, we could very easily duplicate the worker instance if either one of the SQS queues grew large, but using the Amazon-provided CloudWatch service we can automate this process. Using CloudWatch, we might end up with a system that resembles the following diagram. For this pattern, we will not start from scratch but directly from the previous priority queuing pattern.
The major difference between the previous diagram and the diagram displayed in the priority queuing pattern is the addition of a CloudWatch alarm on the myinstance-tosolve-priority queue, and the addition of an auto scaling group for the worker instances. The behavior of this pattern is that we will define a depth for our priority queue that we deem too high, and create an alarm for that threshold. If the number of messages in that queue goes beyond that point, it will notify the auto scaling group to spin up an instance. When the alarm goes back to OK, meaning that the number of messages is below the threshold, it will scale down as much as our auto scaling policy allows. Before we start, make sure any worker instances are terminated. The first thing we should do is create an alarm. From the CloudWatch console in AWS, click Alarms on the side bar and select Create Alarm. From the new Create Alarm dialog, select Queue Metrics under SQS Metrics. This will bring us to a Select Metric section. Type myinstance-tosolve-priority ApproximateNumberOfMessagesVisible into the search box and hit Enter. Select the checkbox for the only row and select Next. From the Define Alarm, make the following changes and then select Create Alarm: In the Name textbox, give the alarm a unique name. In the Description textbox, give the alarm a general description. In the Whenever section, set 0 to 1. In the Actions section, click Delete for the only Notification. In the Period drop-down, select 1 Minute. In the Statistic drop-down, select Sum. Now that we have our alarm in place, we need to create a launch configuration and auto scaling group that refers this alarm. Create a new launch configuration from the AWS Linux AMI with details as per your environment. However, set the user data to (note that acctarn, mykey, and mysecret need to be valid): #!/bin/bash[[ -d /home/ec2-user/.aws ]] && rm -rf /home/ec2-user/.aws/config ||mkdir /home/ec2-user/.awsecho $'[default]naws_access_key_id=mykeynaws_secret_access_key=mysecretnregion=us-east-1' > /home/ec2-user/.aws/configchown ec2-user:ec2-user /home/ec2-user/.aws -Rcat <<EOF >/usr/local/bin/fibsqs#!/bin/shfunction fibonacci {a=1b=1i=0while [ $i -lt $1 ]doprintf "%dn" $alet sum=$a+$blet a=$blet b=$sumlet i=$i+1done}number="$1"solved=$(fibonacci $number)parsed_solve=$(echo $solved | sed 's/n/ /g')aws sqs send-message --queue-url https://queue.amazonaws.com/acctarn/myinstance-solved --message-body "${parsed_solve}"exit 0EOFchown ec2-user:ec2-user /usr/local/bin/fibsqschmod +x /usr/local/bin/fibsqsyum install -y libxml2 libxml2-devel libxslt libxslt-devel gcc ruby-devel>/dev/null 2>&1gem install nokogiri -- --use-system-libraries >/dev/null 2>&1gem install shoryuken >/dev/null 2>&1cat <<EOF >/home/ec2-user/config.ymlaws:access_key_id: mykeysecret_access_key: mysecretregion: us-east-1 # or <%= ENV['AWS_REGION'] %>receive_message:attributes:- receive_count- sent_atconcurrency: 25, # The number of allocated threads to process messages.Default 25delay: 25, # The delay in seconds to pause a queue when it'sempty. 
Default 0queues:- [myinstance-tosolve-priority, 2]- [myinstance-tosolve, 1]EOFcat <<EOF >/home/ec2-user/worker.rbclass MyWorkerinclude Shoryuken::Workershoryuken_options queue: 'myinstance-tosolve', auto_delete: truedef perform(sqs_msg, body)puts "normal: #{body}"%x[/usr/local/bin/fibsqs #{body}]endendclass MyFastWorkerinclude Shoryuken::Workershoryuken_options queue: 'myinstance-tosolve-priority', auto_delete:truedef perform(sqs_msg, body)puts "priority: #{body}"%x[/usr/local/bin/fibsqs #{body}]endendEOFchown ec2-user:ec2-user /home/ec2-user/worker.rb /home/ec2-user/config.ymlscreen -dm su - ec2-user -c 'shoryuken -r /home/ec2-user/worker.rb -C /home/ec2-user/config.yml' Next, create an auto scaling group that uses the launch configuration we just created. The rest of the details for the auto scaling group are as per your environment. However, set it to start with 0 instances and do not set it to receive traffic from a load balancer. Once the auto scaling group has been created, select it from the EC2 console and select Scaling Policies. From here, click Add Policy to create a policy similar to the one shown in the following screenshot and click Create: Next, we get to trigger the alarm. To do this, we will again submit random numbers into both the myinstance-tosolve and myinstance-tosolve-priority queues: [ec2-user@ip-10-203-10-79 ~]$ [[ -d ~/.aws ]] && rm -rf ~/.aws/config ||mkdir ~/.aws[ec2-user@ip-10-203-10-79 ~]$ echo $'[default]naws_access_key_id=mykeynaws_secret_access_key=mysecretnregion=us-east-1' > .aws/config[ec2-user@ip-10-203-10-79 ~]$ for i in {1..100}; dovalue=$(shuf -i 1-50 -n 1)aws sqs send-message --queue-url https://queue.amazonaws.com/acctarn/myinstance-tosolve --message-body ${value} >/dev/null 2>&1done[ec2-user@ip-10-203-10-79 ~]$ for i in {1..100}; dovalue=$(shuf -i 1-50 -n 1)aws sqs send-message --queue-url https://queue.amazonaws.com/acctarn/myinstance-tosolvepriority--message-body ${value} >/dev/null 2>&1done After five minutes, the alarm will go into effect and our auto scaling group will launch an instance to respond to it. This can be viewed from the Scaling History tab for the auto scaling group in the EC2 console. Even though our alarm is set to trigger after one minute, CloudWatch only updates in intervals of five minutes. This is why our wait time was not as short as our alarm. Our auto scaling group has now responded to the alarm by launching an instance. Launching an instance by itself will not resolve this, but using the user data from the Launch Configuration, it should configure itself to clear out the queue, solve the fibonacci of the message, and finally submit it to the myinstance-solved queue. If this is successful, our myinstance-tosolve-priority queue should get emptied out. We can verify from the SQS console as before. And finally, our alarm in CloudWatch is back to an OK status. We are now stuck with the instance because we have not set any decrease policy. I won't cover this in detail, but to set it, we would create a new alarm that triggers when the message count is a lower number such as 0, and set the auto scaling group to decrease the instance count when that alarm is triggered. This would allow us to scale out when we are over the threshold, and scale in when we are under the threshold. This completes the final pattern for data processing. 
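The scale-in side is left as an exercise in the text above. A hedged sketch of what it could look like with the AWS CLI follows; the auto scaling group name, policy name, and alarm name are placeholders, and the thresholds simply mirror the scale-out alarm in reverse.

# Create a policy that removes one instance from the (placeholder) auto scaling group.
# The "=" form avoids the CLI treating the negative adjustment as an option.
aws autoscaling put-scaling-policy \
  --auto-scaling-group-name myinstance-workers \
  --policy-name myinstance-scale-in \
  --adjustment-type ChangeInCapacity \
  --scaling-adjustment=-1

# Alarm when the priority queue has had roughly no visible messages for a minute,
# attaching the policy ARN returned by the previous command.
aws cloudwatch put-metric-alarm \
  --alarm-name myinstance-tosolve-priority-empty \
  --namespace AWS/SQS \
  --metric-name ApproximateNumberOfMessagesVisible \
  --dimensions Name=QueueName,Value=myinstance-tosolve-priority \
  --statistic Sum \
  --period 60 \
  --evaluation-periods 1 \
  --threshold 0 \
  --comparison-operator LessThanOrEqualToThreshold \
  --alarm-actions <scale-in-policy-arn>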
Summary

In this article, in the queuing chain pattern, we walked through creating independent systems that use the Amazon-provided SQS service to solve fibonacci numbers without interacting with each other directly. Then, we took the topic even deeper in the job observer pattern, and covered how to tie in auto scaling policies and alarms from the CloudWatch service to scale out when the priority queue gets too deep.

Resources for Article:

Further resources on this subject:
In the Cloud [Article]
Mobile Administration [Article]
Testing Your Recipes and Getting Started with ChefSpec [Article]

High Availability, Protection, and Recovery using Microsoft Azure

Packt
02 Apr 2015
23 min read
Microsoft Azure can be used to protect your on-premise assets such as virtual machines, applications, and data. In this article by Marcel van den Berg, the author of Managing Microsoft Hybrid Clouds, you will learn how to use Microsoft Azure to store backup data, replicate data, and even for orchestration of a failover and failback of a complete data center. We will focus on the following topics: High Availability in Microsoft Azure Introduction to geo-replication Disaster recovery using Azure Site Recovery (For more resources related to this topic, see here.) High availability in Microsoft Azure One of the most important limitations of Microsoft Azure is the lack of an SLA for single-instance virtual machines. If a virtual machine is not part of an availability set, that instance is not covered by any kind of SLA. The reason for this is that when Microsoft needs to perform maintenance on Azure hosts, in many cases, a reboot is required. Reboot means the virtual machines on that host will be unavailable for a while. So, in order to accomplish High Availability for your application, you should have at least two instances of the application running at any point in time. Microsoft is working on some sort of hot patching which enables virtual machines to remain active on hosts being patched. Details are not available at the moment of writing. High Availability is a crucial feature that must be an integral part of an architectural design, rather than something that can be "bolted on" to an application afterwards. Designing for High Availability involves leveraging both the development platform as well as available infrastructure in order to ensure an application's responsiveness and overall reliability. The Microsoft Azure Cloud platform offers software developers PaaS extensibility features and network administrators IaaS computing resources that enable availability to be built into an application's design from the beginning. The good news is that organizations with mission-critical applications can now leverage core features within the Microsoft Azure platform in order to deploy highly available, scalable, and fault-tolerant cloud services that have been shown to be more cost-effective than traditional approaches that leverage on-premises systems. Microsoft Failover Clustering support Windows Server Failover Clustering (WSFC) is not supported on Azure. However, Microsoft does support SQL Server AlwaysOn Availability Groups. For AlwaysOn Availability Groups, there is currently no support for availability group listeners in Azure. Also, you must work around a DHCP limitation in Azure when creating WSFC clusters in Azure. After you create a WSFC cluster using two Azure virtual machines, the cluster name cannot start because it cannot acquire a unique virtual IP address from the DHCP service. Instead, the IP address assigned to the cluster name is a duplicate address of one of the nodes. This has a cascading effect that ultimately causes the cluster quorum to fail, because the nodes cannot properly connect to one another. So if your application uses Failover Clustering, it is likely that you will not move it over to Azure. It might run, but Microsoft will not assist you when you encounter issues. Load balancing Besides clustering, we can also create highly available nodes using load balancing. Load balancing is useful for stateless servers. These are servers that are identical to each other and do not have a unique configuration or data. 
When two or more virtual machines deliver the same application logic, you will need a mechanism that is able to redirect network traffic to those virtual machines. The Windows Network Load Balancing (NLB) feature in Windows Server is not supported on Microsoft Azure. An Azure load balancer does exactly this. It analyzes incoming network traffic of Azure, determines the type of traffic, and reroutes it to a service.   The Azure load balancer is running provided as a cloud service. In fact, this cloud service is running on virtual appliances managed by Microsoft. These are completely software-defined. The moment an administrator adds an endpoint, a set of load balancers is instructed to pass incoming network traffic on a certain port to a port on a virtual machine. If a load balancer fails, another one will take over. Azure load balancing is performed at layer 4 of the OSI mode. This means the load balancer is not aware of the application content of the network packages. It just distributes packets based on network ports. To load balance over multiple virtual machines, you can create a load-balanced set by performing the following steps: In Azure Management Portal, select the virtual machine whose service should be load balanced. Select Endpoints in the upper menu. Click on Add. Select Add a stand-alone endpoint and click on the right arrow. Select a name or a protocol and set the public and private port. Enable create a load-balanced set and click on the right arrow. Next, fill in a name for the load-balanced set. Fill in the probe port, the probe interval, and the number of probes. This information is used by the load balancer to check whether the service is available. It will connect to the probe port number; do that according to the interval. If the specified number of probes all result in unable to connect, the load balancer will no longer distribute traffic to this virtual machine. Click on the check mark. The load balancing mechanism available is based on a hash. Microsoft Azure Load Balancer uses a five tuple (source IP, source port, destination IP, destination port, and protocol type) to calculate the hash that is used to map traffic to the available servers. A second load balancing mode was introduced in October 2014. It is called Source IP Affinity (also known as session affinity or client IP affinity). On using Source IP affinity, connections initiated from the same client computer go to the same DIP endpoint. These load balancers provide high availability inside a single data center. If a virtual machine part of a cluster of instances fails, the load balancer will notice this and remove that virtual machine IP address from a table. However, load balancers will not protect for failure of a complete data center. The domains that are used to direct clients to an application will route to a particular virtual IP that is bound to an Azure data center. To access application even if an Azure region has failed, you can use Azure Traffic Manager. This service can be used for several purposes: To failover to a different Azure region if a disaster occurs To provide the best user experience by directing network traffic to Azure region closest to the location of the user To reroute traffic to another Azure region whenever there's any planned maintenance The main task of Traffic Manager is to map a DNS query to an IP address that is the access point of a service. This job can be compared for example with a job of someone working with the X-ray machine at an airport. 
You have probably seen the multiple rows of X-ray machines at an airport; the length of each queue differs at any given moment. An officer standing at the entrance of the area distributes people over the available X-ray machines so that all queues remain roughly equal in length.

Traffic Manager provides you with a choice of load-balancing methods, including performance, failover, and round-robin. Performance load balancing measures the latency between the client and the cloud service endpoint. Traffic Manager is not aware of the actual load on the virtual machines servicing the applications.

As Traffic Manager resolves endpoints of Azure cloud services only, it cannot be used for load balancing between an Azure region and a non-Azure region (for example, Amazon EC2) or between on-premises and Azure services.

It performs health checks on a regular basis by querying the endpoints of the services. If an endpoint does not respond, Traffic Manager will stop distributing network traffic to that endpoint for as long as the state of the endpoint is unavailable.

Traffic Manager is available in all Azure regions. Microsoft charges for the service based on the number of DNS queries that are received by Traffic Manager. As the service is attached to an Azure subscription, you will be required to contact Azure support to transfer Traffic Manager to a different subscription.

The following table shows the differences between Azure's built-in load balancer and Traffic Manager:

                       Load balancer                         Traffic Manager
Distribution targets   Must reside in the same region        Can be across regions
Load balancing         Five-tuple hash, Source IP Affinity   Performance, failover, and round-robin
Level                  OSI layer 4, TCP/UDP ports            OSI layer 4, DNS queries

Third-party load balancers

In certain configurations, the default Azure load balancer might not be sufficient. There are several vendors supporting or starting to support Azure. One of them is Kemp Technologies.

Kemp Technologies offers a free load balancer for Microsoft Azure. The Virtual LoadMaster (VLM) provides layer 7 application delivery. The virtual appliance has some limitations compared to the commercially available unit: the maximum bandwidth is limited to 100 Mbps, and High Availability is not offered, which means the Kemp LoadMaster for Azure free edition is a single point of failure. Also, the number of SSL transactions per second is limited.

One of the use cases in which a third-party load balancer is required is when Microsoft Remote Desktop Gateway is used. As you might know, Citrix has been supporting the use of Citrix XenApp and Citrix XenDesktop running on Azure since 2013. This means service providers can offer cloud-based desktops and applications using these Citrix solutions. To make this a working configuration, session affinity is required. Session affinity makes sure that network traffic is always routed over the same server.

Windows Server 2012 Remote Desktop Gateway uses two HTTP channels, one for input and one for output, which must be routed over the same Remote Desktop Gateway. The Azure load balancer is only able to do round-robin load balancing, which does not guarantee that both channels use the same server. However, hardware and software load balancers that support IP affinity, cookie-based affinity, or SSL ID-based affinity (and thus ensure that both HTTP connections are routed to the same server) can be used with Remote Desktop Gateway.

Another use case is load balancing of Active Directory Federation Services (ADFS).
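Before moving on to the ADFS scenario, here is a minimal sketch of creating a Traffic Manager profile with the classic Azure PowerShell module. The profile name, the trafficmanager.net prefix, and the two cloud service endpoints are hypothetical placeholders.

# Create a Traffic Manager profile that uses the Performance load-balancing method
$tmProfile = New-AzureTrafficManagerProfile -Name "contoso-web-tm" `
    -DomainName "contosoweb.trafficmanager.net" -LoadBalancingMethod "Performance" `
    -Ttl 30 -MonitorProtocol "Http" -MonitorPort 80 -MonitorRelativePath "/"

# Register one cloud service per Azure region as an endpoint of the profile
$tmProfile = Add-AzureTrafficManagerEndpoint -TrafficManagerProfile $tmProfile `
    -DomainName "contosoweb-we.cloudapp.net" -Type "CloudService" -Status "Enabled"
$tmProfile = Add-AzureTrafficManagerEndpoint -TrafficManagerProfile $tmProfile `
    -DomainName "contosoweb-ne.cloudapp.net" -Type "CloudService" -Status "Enabled"

# Commit the endpoint changes to the profile
Set-AzureTrafficManagerProfile -TrafficManagerProfile $tmProfile

Clients then resolve contosoweb.trafficmanager.net (typically through a CNAME record on the company's own domain), and Traffic Manager answers with the endpoint selected by its health checks and the chosen load-balancing method.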
Microsoft Azure can be used as a backup for on-premises Active Directory (AD). Suppose your organization is using Office 365. To provide single sign-on, a federation has been set up between the Office 365 directory and your on-premises AD. If your on-premises ADFS fails, external users will not be able to authenticate. By using Microsoft Azure for ADFS, you can provide high availability for authentication. Kemp LoadMaster for Azure can be used to load balance network traffic to ADFS and is able to do proper load balancing.

To install Kemp LoadMaster, perform the following steps:

Download the publish profile settings file from https://windows.azure.com/download/publishprofile.aspx.
Use PowerShell for Azure with the Import-AzurePublishSettingsFile command.
Upload the KEMP-supplied VHD file to your Microsoft Azure storage account.
Publish the VHD as an image. The image can then be used to create virtual machines.

The complete steps are described in the documentation provided by Kemp.

Geo-replication of data

Microsoft Azure has geo-replication of Azure Storage enabled by default. This means all of your data is not only stored at three different locations in the primary region, but also replicated and stored at three different locations in the paired region. However, this replicated data cannot be accessed by the customer. Microsoft has to declare a data center or storage stamp as lost before it will fail over to the secondary location. In the rare circumstance where a failed storage stamp cannot be recovered, you will experience many hours of downtime, so you have to make sure you have your own disaster recovery procedures in place.

Zone Redundant Storage

Microsoft offers a third option you can use to store data. Zone Redundant Storage (ZRS) is a mix of the two options for data redundancy and allows data to be replicated to a secondary data center or facility located in the same region or to a paired region. Instead of storing six copies of data like geo-replicated storage does, only three copies of data are stored. So, ZRS is a mix of locally redundant storage and geo-replicated storage. The cost for ZRS is about 66 percent of the cost for GRS.

Snapshots of the Microsoft Azure disk

Server virtualization solutions such as Hyper-V and VMware vSphere offer the ability to save the state of a running virtual machine. This can be useful when you're making changes to the virtual machine but want to have the ability to reverse those changes if something goes wrong. This feature is called a snapshot. Basically, a virtual disk is saved by marking it as read-only. All writes to the disk after a snapshot has been initiated are stored on a temporary virtual disk. When a snapshot is deleted, those changes are committed from the delta disk to the initial disk.

While the Microsoft Azure Management Portal does not have a feature to create snapshots, there is an ability to make point-in-time copies of virtual disks attached to virtual machines. Microsoft Azure Storage has the ability of versioning: under the hood, this works differently from snapshots in Hyper-V, as it creates a snapshot blob of the base blob. Snapshots are by no means a replacement for a backup, but it is nice to know that you can save the state and quickly revert if required.

Introduction to geo-replication

By default, Microsoft replicates all data stored on Microsoft Azure Storage to the secondary location located in the paired region. Customers are able to enable or disable this replication.
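Both the redundancy option and the blob snapshots described in the previous section can be scripted with the classic Azure PowerShell cmdlets. The following is a minimal sketch; the storage account, container, and blob names are hypothetical placeholders.

# Create a storage account with geo-redundant storage; other -Type values include
# Standard_LRS (locally redundant), Standard_ZRS, and Standard_RAGRS (read-access geo-redundant)
New-AzureStorageAccount -StorageAccountName "contosodata" -Location "West Europe" -Type "Standard_GRS"

# Toggle geo-replication on an existing account (switches between locally redundant and geo-redundant)
Set-AzureStorageAccount -StorageAccountName "contosodata" -GeoReplicationEnabled $true

# Take a point-in-time snapshot of a VHD blob (the blob 'versioning' described above)
$key  = (Get-AzureStorageKey -StorageAccountName "contosodata").Primary
$ctx  = New-AzureStorageContext -StorageAccountName "contosodata" -StorageAccountKey $key
$blob = Get-AzureStorageBlob -Container "vhds" -Blob "web01-osdisk.vhd" -Context $ctx
$snapshot = $blob.ICloudBlob.CreateSnapshot()

A snapshot created this way can later be copied back over the base blob to revert the disk to that point in time.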
When geo-replication is enabled, customers are charged for it. When Geo Redundant Storage (GRS) has been enabled on a storage account, all data is asynchronously replicated. At the secondary location, data is stored on three different storage nodes, so even when two nodes fail, the data is still accessible.

However, before the Read Access Geo Redundant feature was available, customers had no way to actually access the replicated data. The replicated data could only be used by Microsoft when the primary storage could not be recovered again. Microsoft will try everything to restore data in the primary location and avoid a so-called geo-failover process. A geo-failover process means that a storage account's secondary location (the replicated data) will be configured as the new primary location. The problem is that a geo-failover cannot be done per storage account, but needs to be done at the storage stamp level. A storage stamp has multiple racks of storage nodes. You can imagine how much data and how many customers are involved when a storage stamp needs to fail over. A failover will have an effect on the availability of applications. Also, because of the asynchronous replication, some data will be lost when a failover is performed. Microsoft is working on an API that allows customers to fail over a storage account themselves.

When geo-redundant replication is enabled, you will only benefit from it when Microsoft has a major issue. Geo-redundant storage is a replacement neither for a backup nor for a disaster recovery solution.

Microsoft states that the Recovery Point Objective (RPO) for Geo Redundant Storage will be about 15 minutes. That means that if a failover is required, customers can lose about 15 minutes of data. Microsoft does not provide an SLA on how long geo-replication will take, nor does it give an indication of the Recovery Time Objective (RTO). The RTO indicates the time required by Microsoft to make the data available again after a major failure that requires a failover. Microsoft once had to deal with a failure of storage stamps; they did not perform a failover, but it took many hours to restore the storage service to a normal level.

In 2013, Microsoft introduced a new feature called Read Access Geo Redundant Storage (RA-GRS). This feature allows customers to perform reads on the replicated data. It increases the read availability from 99.9 percent when GRS is used to above 99.99 percent when RA-GRS is enabled. Microsoft charges more when RA-GRS is enabled. RA-GRS is an interesting addition for applications that are primarily meant for read-only purposes. When the primary location is not available and Microsoft has not performed a failover, writes are not possible. The availability of the Azure Virtual Machine service is not increased by enabling RA-GRS: while the VHD data is replicated and can be read, the virtual machine itself is not replicated. Perhaps this will be a feature in the future.

Disaster recovery using Azure Site Recovery

Disaster recovery has always been one of the top priorities for organizations. IT has become a very important, if not mission-critical, factor for doing business. A failure of IT could result in the loss of money, customers, orders, and brand value.
There are many situations that can disrupt IT, such as:

Hurricanes
Floods
Earthquakes
Disasters such as the failure of a nuclear power plant
Fire
Human error
Outbreak of a virus
Hardware or software failure

While these threats are clear and the risk of being hit by one can be calculated, many organizations do not have proper protection against them. Disaster recovery solutions can help an organization to continue doing business in three different situations:

Avoiding a possible failure of IT infrastructure by moving servers to a different location.
Avoiding a disaster situation, such as hurricanes or floods, since such situations are generally well known in advance due to weather forecasting capabilities.
Recovering as quickly as possible when a disaster has hit the data center. Disaster recovery is done when a disaster unexpectedly hits the data center, such as a fire, hardware error, or human error.

Some reasons for not having a proper disaster recovery plan are complexity, lack of time, and ignorance; however, in most cases, a lack of budget and the belief that disaster recovery is expensive are the main reasons. Almost all organizations that have been hit by a major disaster causing unacceptable periods of downtime started to implement a disaster recovery plan, including the technology, immediately after they recovered. However, in many cases, this insight came too late. According to Gartner, 43 percent of companies experiencing disasters never reopen and 29 percent close within 2 years.

Server virtualization has made disaster recovery a lot easier and more cost-effective. Verifying that your DR procedure actually works as designed and matches the RTO and RPO is much easier using virtual machines.

Since Windows Server 2012, Hyper-V has had a feature for asynchronous replication of a virtual machine's virtual disks to another location. This feature, Hyper-V Replica, is very easy to enable and configure and does not cost extra. Hyper-V Replica is storage agnostic, which means the storage type at the primary site can be different from the storage type used at the secondary site. So, Hyper-V Replica works perfectly when your virtual machines are hosted on, for example, EMC storage at the primary site while an HP solution is used at the secondary site. (A PowerShell sketch of enabling Hyper-V Replica follows after the list of scenarios below.)

While replication is a must for DR, another very useful feature in DR is automation. As an administrator, you really appreciate the option to click on a button after deciding to perform a failover, and then sit back and relax. Recovery is mostly a stressful job when your primary location is flooded or burned, and lots of things can go wrong if recovery is done manually. This is why Microsoft designed Azure Site Recovery.

Azure Site Recovery is able to assist in disaster recovery in several scenarios:

A customer has two data centers, both running Hyper-V managed by System Center Virtual Machine Manager. Hyper-V Replica is used to replicate data at the virtual machine level.
A customer has two data centers, both running Hyper-V managed by System Center Virtual Machine Manager. NetApp storage is used to replicate between the two sites at the storage level.
A customer has a single data center running Hyper-V managed by System Center Virtual Machine Manager.
A customer has two data centers, both running VMware vSphere. In this case, InMage Scout software is used to replicate between the two data centers. Azure is not used for orchestration.
A customer has a single data center not managed by System Center Virtual Machine Manager.
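As mentioned above, Hyper-V Replica is the replication engine behind several of these scenarios. The following is a minimal sketch of enabling it for a single virtual machine with the Hyper-V PowerShell module; the server and virtual machine names are hypothetical, and Kerberos authentication over HTTP port 80 is assumed.

# On the Replica (secondary) server: allow it to receive replication traffic
Set-VMReplicationServer -ReplicationEnabled $true -AllowedAuthenticationType Kerberos `
    -ReplicationAllowedFromAnyServer $true -DefaultStorageLocation "D:\Replica"

# On the primary server: enable replication for one VM and start the initial copy
# (the 5-minute interval parameter requires Windows Server 2012 R2 or later)
Enable-VMReplication -VMName "sql01" -ReplicaServerName "hv-replica.contoso.com" `
    -ReplicaServerPort 80 -AuthenticationType Kerberos -ReplicationFrequencySec 300
Start-VMInitialReplication -VMName "sql01"

For the Hyper-V scenarios, Azure Site Recovery builds on top of this per-VM replication and adds the orchestration described next.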
In the second scenario, Microsoft Azure is used as a secondary data center if a disaster makes the primary data center unavailable.

Microsoft has also announced support for a scenario where vSphere is used on-premises and Azure Site Recovery is used to replicate data to Azure. To enable this, InMage software will be used. Details were not available at the time this article was written.

In the first two scenarios described, Site Recovery is used to orchestrate the failover and failback to the secondary location. Management is done using the Azure Management Portal, which is available in any browser that supports HTML5, so a failover can be initiated even from a tablet or smartphone.

Using Azure as a secondary data center for disaster recovery

Azure Site Recovery went into preview in June 2014. For organizations using Hyper-V, there is no direct need to have a secondary data center, as Azure can be used as a target for Hyper-V Replica. Some of the characteristics of the service are as follows:

Allows nondisruptive disaster recovery failover testing
Automated reconfiguration of the network configuration of guests
Storage agnostic: supports any type of on-premises storage supported by Hyper-V
Support for VSS to enable application consistency
Protects more than 1,000 virtual machines (Microsoft tested with 2,000 virtual machines and this went well)

To be able to use Site Recovery, customers do not have to use System Center Virtual Machine Manager; Site Recovery can be used without it installed. When SCVMM is used, Site Recovery will use information such as the virtual networks provided by SCVMM to map them to networks available in Microsoft Azure.

Site Recovery does not support the ability to send a copy of the virtual hard disks on removable media to an Azure data center to avoid doing the initial replication over the WAN (seeding). Customers will need to transfer all the replication data over the network. ExpressRoute will help to get a much better throughput compared to a site-to-site VPN over the Internet.

Failover to Azure can be as simple as clicking on a single button. Site Recovery will then create new virtual machines in Azure and start them in the order defined in the recovery plan. A recovery plan is a workflow that defines the startup sequence of virtual machines. It is possible to pause the recovery plan to allow a manual check, for example; if all is okay, the recovery plan will continue doing its job. Multiple recovery plans can be created.

Microsoft Volume Shadow Copy Service (VSS) is supported, which allows application consistency.

Replication of data can be configured at intervals of 30 seconds, 5 minutes, or 15 minutes. Replication is performed asynchronously. For recovery, 24 recovery points are available. These are like snapshots or point-in-time copies. If the most recent replica cannot be used (for example, because of damaged data), another replica can be used for the restore.

You can also configure extended replication. In extended replication, your Replica server forwards changes that occur on the primary virtual machines to a third server (the extended Replica server). After a planned or unplanned failover from the primary server to the Replica server, the extended Replica server provides further business continuity protection. As with ordinary replication, you configure extended replication by using Hyper-V Manager, Windows PowerShell (using the -Extended option), or WMI.

At the moment, only the VHD virtual disk format is supported.
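Because only VHD is supported, virtual machines whose disks use the newer VHDX format have to be converted before they can be protected. A minimal sketch with the Hyper-V PowerShell module follows; the paths, the virtual machine name, and the IDE controller position are hypothetical, and the virtual machine must be shut down during the conversion.

# Convert a VHDX system disk to the VHD format so that it can be replicated to Azure
Convert-VHD -Path "D:\VMs\web01\web01-osdisk.vhdx" -DestinationPath "D:\VMs\web01\web01-osdisk.vhd"

# Swap the converted disk into the virtual machine (IDE controller 0, location 0 assumed)
Remove-VMHardDiskDrive -VMName "web01" -ControllerType IDE -ControllerNumber 0 -ControllerLocation 0
Add-VMHardDiskDrive -VMName "web01" -ControllerType IDE -ControllerNumber 0 -ControllerLocation 0 `
    -Path "D:\VMs\web01\web01-osdisk.vhd"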
Generation 2 virtual machines that can be created on Hyper-V are not supported by Site Recovery. Generation 2 virtual machines have a simplified virtual hardware model and support Unified Extensible Firmware Interface (UEFI) firmware instead of BIOS-based firmware. Boot from PXE, SCSI hard disk, or SCSI DVD, as well as Secure Boot, are supported in Generation 2 virtual machines. However, on March 19, Microsoft responded to numerous customer requests regarding Site Recovery support for Generation 2 virtual machines: Site Recovery will soon support Gen 2 VMs. On failover, the VM will be converted to a Gen 1 VM; on failback, the VM will be converted back to Gen 2. This conversion is done until the Azure platform natively supports Gen 2 VMs.

Customers using Site Recovery are charged only for the consumption of storage as long as they do not perform a failover or a failover test.

Failback is also supported. After running in Microsoft Azure for a while, customers are likely to move their virtual machines back to the on-premises primary data center. Site Recovery will replicate back only the changed data.

Note that customer data is not stored in Microsoft Azure when Hyper-V Recovery Manager is used. Azure is used to coordinate the failover and recovery. To be able to do this, it stores information on network mappings, runbooks, and the names of virtual machines and virtual networks. All data sent to Azure is encrypted.

By using Azure Site Recovery, we can perform service orchestration in terms of replication, planned failover, unplanned failover, and test failover. The entire engine is powered by Azure Site Recovery Manager.

Let's have a closer look at the main features of Azure Site Recovery. It enables three main scenarios:

Test Failover or DR Drills: Enables support for application testing by creating test virtual machines and networks as specified by the user. Without impacting production workloads or their protection, HRM can quickly enable periodic workload testing.
Planned Failovers (PFO): For compliance or in the event of a planned outage, customers can use planned failovers: virtual machines are shut down, final changes are replicated to ensure zero data loss, and then the virtual machines are brought up in order on the recovery site as specified by the recovery plan (RP). More importantly, failback is a single-click gesture that executes a planned failover in the reverse direction.
Unplanned Failovers (UFO): In the event of an unplanned outage or a natural disaster, HRM opportunistically attempts to shut down the primary machines if some of the virtual machines are still running when the disaster strikes. It then automates their recovery on the secondary site as specified by the RP.

If your secondary site uses a different IP subnet, Site Recovery is able to change the IP configuration of your virtual machines during the failover.

Part of the Site Recovery installation is the installation of a VMM provider, a component that communicates with Microsoft Azure. Site Recovery can be used even if you have a single VMM server to manage both the primary and secondary sites.

Site Recovery does not rely on the availability of any component in the primary site when performing a failover. So, it doesn't matter if the complete site, including the link to Azure, has been destroyed; Site Recovery will still be able to perform the coordinated failover.

Azure Site Recovery to customer-owned sites is billed per protected virtual machine per month. The cost is approximately €12 per month. Microsoft bills for the average number of protected virtual machines per month.
So, if you are protecting 20 virtual machines in the first half of the month and 0 in the second half, you will be charged for 10 virtual machines for that month. When Azure is used as a target, Microsoft will only charge for the consumption of storage during replication. The cost for this scenario is €40.22 per month per protected instance. As soon as you perform a test failover or an actual failover, Microsoft will charge for the virtual machine's CPU and memory consumption.

Summary

This article covered the concepts of High Availability in Microsoft Azure and disaster recovery using Azure Site Recovery, and also gave an introduction to geo-replication.

Resources for Article:

Further resources on this subject:

Windows Azure Mobile Services - Implementing Push Notifications using [article]
Configuring organization network services [article]
Integration with System Center Operations Manager 2012 SP1 [article]