
How-To Tutorials - Cloud Computing


Windows Azure Service Bus: Key Features

Packt
06 Dec 2012
13 min read
(For more resources related to this topic, see here.)

Service Bus

The Windows Azure Service Bus provides a hosted, secure, and widely available infrastructure for widespread communication, large-scale event distribution, naming, and service publishing. Service Bus provides connectivity options for Windows Communication Foundation (WCF) and other service endpoints, including REST endpoints, that would otherwise be difficult or impossible to reach. Endpoints can be located behind Network Address Translation (NAT) boundaries, bound to frequently changing, dynamically assigned IP addresses, or both.

Getting started

To get started with the features of Service Bus, make sure you have the Windows Azure SDK installed.

Queues

Queues in the AppFabric feature (distinct from Windows Azure Storage queues) offer a FIFO message delivery capability, which suits applications that expect messages in a certain order. Just like ordinary Azure queues, Service Bus queues enable the decoupling of your application components, so the application can keep functioning even when some parts of it are offline. Among the differences between the two queue types: Service Bus queues can hold larger messages and can be used in conjunction with the Access Control Service (ACS).

Working with queues

To create a queue, go to the Windows Azure portal and select the Service Bus, Access Control & Caching tab. Next, select Service Bus, select the namespace, and click on New Queue; the following screen will appear. If you did not set up a namespace earlier, you need to create one before you can create a queue.

Several properties can be configured during the setup of a queue:

- Name uniquely identifies the queue in the namespace.
- Default Message Time To Live is the default TTL applied to messages. It can also be set in code and is a TimeSpan value.
- Duplicate Detection History Time Window determines how long the (unique) message IDs of received messages are retained to check for duplicate messages. This property is ignored if the Requires Duplicate Detection option is not set. Keep in mind that a long detection history means message IDs are persisted for that whole period; if you process many messages, the queue size will grow, and so does your bill.
- When a message expires, or when the queue size limit is reached, messages are dead-lettered, meaning they end up in a separate queue named $DeadLetterQueue. Imagine a scenario where heavy traffic in your queue results in messages landing in the dead letter queue; your application should be robust enough to process these messages as well.
- Lock Duration defines the duration of the lock taken when the PeekLock() method is called. PeekLock() hides a specific message from other consumers/processors until the lock duration expires, so this value typically needs to be sufficient to process and delete the message.

A sample scenario

Remember the differences between the two queue types that Windows Azure offers: Service Bus queues are able to guarantee first-in, first-out delivery and to support transactions. In our scenario, a user posts a geotopic on the canvas containing text and also uploads a video using the parallel upload functionality. The WCF service CreateGeotopic() should then post a message in the queue to enter the geotopic, and when the file finishes uploading, a second message is sent to the queue.
These two messages together should be treated as a single transaction, and Geotopia.Processor processes them only once the media file has finished uploading. In this example, you can see how a transaction is handled and how a message can be abandoned and made available on the queue again. If the geotopic is validated as a whole (the file was uploaded properly), the worker role reroutes the message to a designated audit trail queue, to keep track of actions taken by the system, and also sends it to a topic (see the next section) dedicated to messages that need to be pushed to mobile devices. The messages in this topic are in turn processed by a worker role. The reason for choosing a separate worker role is that it yields a loosely coupled solution that can be scaled in a fine-grained way, by scaling only the back-end worker role. See the following diagram for an overview of this scenario.

In the previous section, we already created a queue named geotopiaqueue. In order to work with queues, you need the service identity of the service namespace (in this case, we use a service identity with symmetric issuer and key credentials).

Preparing the project

In order to make use of the Service Bus capabilities, you need to add a reference to Microsoft.ServiceBus.dll, located in <drive>:\Program Files\Microsoft SDKs\Windows Azure\.NET SDK\2012-06\ref. Next, add the following using statements to your file:

```csharp
using Microsoft.ServiceBus;
using Microsoft.ServiceBus.Messaging;
```

Your project is now ready to make use of Service Bus queues. In the configuration settings of the web role project hosting the WCF services, add a new configuration setting named ServiceBusQueue with the following value:

```
Endpoint=sb://<servicenamespace>.servicebus.windows.net/;SharedSecretIssuer=<issuerName>;SharedSecretValue=<yoursecret>
```

The properties of the queue you configured in the Windows Azure portal can also be set programmatically.

Sending messages

Messages sent to a Service Bus queue are instances of BrokeredMessage. This class contains standard properties such as TimeToLive and MessageId. An important property is Properties, which is of type IDictionary<string, object> and lets you add additional data. The body of the message can be set in the constructor of BrokeredMessage, where the parameter must be of a type decorated with the [Serializable] attribute. The following code snippet shows how to send a BrokeredMessage:

```csharp
MessagingFactory factory = MessagingFactory.CreateFromConnectionString(connectionString);
MessageSender sender = factory.CreateMessageSender("geotopiaqueue");
sender.Send(new BrokeredMessage(
    new Geotopic
    {
        id = id,
        subject = subject,
        text = text,
        PostToFacebook = PostToFacebook,
        accessToken = accessToken,
        MediaFile = MediaFile // Uri of uploaded media file
    }));
```

As the scenario requires two messages to be sent in a certain order and treated as a single transaction, we need to add some more logic to the code snippet. Right before this message is sent, the media file is uploaded by using the BlobUtil class; consider sending the media file together with the BrokeredMessage if it is small enough. The upload might be a long-running operation, depending on the size of the file. The asynchronous upload process returns a Uri, which is passed to the BrokeredMessage.
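The transactional wrapper itself is not shown in the original snippet. A minimal sketch of what it could look like, assuming both messages go to the same queue through the sender created above (the two BrokeredMessage variable names below are hypothetical):

```csharp
using System.Transactions;

// Wrap both sends so that neither message becomes visible on
// geotopiaqueue unless the whole operation commits.
using (TransactionScope scope = new TransactionScope())
{
    sender.Send(geotopicMessage);   // message describing the posted geotopic
    sender.Send(mediaReadyMessage); // message signalling the finished upload
    scope.Complete();               // omitting this rolls both sends back
}
```

Service Bus supports transactions spanning operations against a single messaging entity, which is the pattern used here.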
The situation is as follows:

- A multimedia file is uploaded from the client to Windows Azure Blob storage using a parallel upload (or passed on in the message). A parallel upload breaks the media file into several chunks and uploads them separately, using multithreading.
- A message is sent to geotopiaqueue, and Geotopia.Processor processes the messages in the queues in a single transaction.

Receiving messages

On the other side of the Service Bus queue resides our worker role, Geotopia.Processor, which performs the following tasks:

- It grabs the messages from the queue
- It sends each message straight to a table in Windows Azure Storage for auditing purposes
- It creates a geotopic that can be subscribed to

The following code snippet shows how to perform these three tasks:

```csharp
MessagingFactory factory = MessagingFactory.CreateFromConnectionString(connectionString);
MessageReceiver receiver = factory.CreateMessageReceiver("geotopiaqueue");
BrokeredMessage receivedMessage = receiver.Receive();
try
{
    ProcessMessage(receivedMessage);
    receivedMessage.Complete();
}
catch (Exception e)
{
    receivedMessage.Abandon();
}
```

Cross-domain communication

We created a new web role in our Geotopia solution, hosting the WCF services we want to expose. As the client is a Silverlight one (and runs in the browser), we face cross-domain communication. To protect against security vulnerabilities and to prevent cross-site requests from a Silverlight client to services without the user noticing, Silverlight by default allows only site-of-origin communication. A possible exploitation of a web application is cross-site request forgery, an exploit that can occur when cross-domain communication is allowed; for example, a Silverlight application sending commands to some service running somewhere on the Internet. As we want the Geotopia Silverlight client to access the WCF service running in another domain, we need to explicitly allow cross-domain operations. This can be achieved by adding a file named clientaccesspolicy.xml at the root of the domain where the WCF service is hosted and allowing this cross-domain access. Another option is to add a crossdomain.xml file at the root where the service is hosted. Please go to http://msdn.microsoft.com/en-us/library/cc197955(v=vs.95).aspx to find more details on cross-domain communication issues.

Comparison

The following table shows the similarities and differences between Windows Azure queues and Service Bus queues:

| Criteria | Windows Azure queue | Service Bus queue |
| --- | --- | --- |
| Ordering guarantee | No, but best-effort first-in, first-out | First-in, first-out |
| Delivery guarantee | At least once | At most once; use the PeekLock() method to ensure that no messages are missed. PeekLock() together with the Complete() method enables a two-stage receive operation. |
| Transaction support | No | Yes, by using TransactionScope |
| Receive mode | Peek & Lease | Peek & Lock, Receive & Delete |
| Lease/lock duration | Between 30 seconds and 7 days | Between 60 seconds and 5 minutes |
| Lease/lock granularity | Message level | Queue level |
| Batched receive | Yes, by using GetMessages(count) | Yes, by using the prefetch property or transactions |
| Scheduled delivery | Yes | Yes |
| Automatic dead lettering | No | Yes |
| In-place update | Yes | No |
| Duplicate detection | No | Yes |
| WCF integration | No | Yes, through WCF bindings |
| WF integration | Not standard; needs a customized activity | Yes, out-of-the-box activities |
| Message size | Maximum 64 KB | Maximum 256 KB |
| Maximum queue size | 100 TB, the limit of a storage account | 1, 2, 3, 4, or 5 GB; configurable |
| Message TTL | Maximum 7 days | Unlimited |
| Number of queues | Unlimited | 10,000 per service namespace |
| Management protocol | REST over HTTP(S) | REST over HTTP(S) |
| Runtime protocol | REST over HTTP(S) | REST over HTTP(S) |
| Queue naming rules | Maximum of 63 characters | Maximum of 260 characters |
| Queue length function | Yes, value is approximate | Yes, exact value |
| Throughput | Maximum of 2,000 messages/second | Maximum of 2,000 messages/second |
| Authentication | Symmetric key | ACS claims |
| Role-based access control | No | Yes, through ACS roles |
| Identity provider federation | No | Yes |
| Costs | $0.01 per 10,000 transactions | $0.01 per 10,000 transactions |
| Billable operations | Every call that touches "storage" | Only Send and Receive operations |
| Storage costs | $0.14 per GB per month | None |
| ACS transaction costs | None, since ACS is not supported | $1.99 per 100,000 token requests |

Background information

There are some additional characteristics of Service Bus queues that need your attention:

- In order to guarantee the FIFO mechanism, you need to use messaging sessions.
- Using Receive & Delete on Service Bus queues reduces transaction costs, since it is counted as a single operation.
- The maximum size of a Base64-encoded message on the Windows Azure queue is 48 KB; for standard encoding it is 64 KB.
- Sending messages to a Service Bus queue that has reached its size limit throws an exception that needs to be caught.
- When the throughput limit is reached, the Windows Azure queue service returns an HTTP 503 error response. Implement retry logic to handle this.
- Throttled (rejected) requests are not billable.
- ACS transactions are based on instances of the message factory class. The received token expires after 20 minutes, meaning that you only need three tokens per hour of execution.

Topics and subscriptions

Topics and subscriptions are useful when, instead of the single consumer of a queue, multiple consumers take part in the pattern. Imagine users in our scenario wanting to subscribe to topics posted by friends. In such a scenario, a subscription is created on a topic and the worker role processes it; for example, mobile clients can receive push notifications from the worker role. Sending messages to a topic works in a similar way to sending messages to a Service Bus queue.

Preparing the project

In the Windows Azure portal, go to the Service Bus, Access Control & Caching tab. Select Topics and create a new topic, as shown in the following screenshot. Next, click on OK and a new topic is created for you. The next thing you need to do is create a subscription on this topic.
To do this, select New Subscription and create a new subscription, as shown in the following screenshot.

Using filters

By default, topics and subscriptions form a publish/subscribe mechanism where messages are made available to all registered subscriptions. To actively influence a subscription (and receive only those messages that are of interest to you), you can create subscription filters. A SqlFilter can be passed as a parameter to the CreateSubscription method of the NamespaceManager class. SqlFilter operates on the properties of the messages, so we need to extend the message. In our scenario, we are only interested in messages concerning a certain subject. The way to achieve this is shown in the following code snippet:

```csharp
BrokeredMessage message = new BrokeredMessage(
    new Geotopic
    {
        id = id,
        subject = subject,
        text = text,
        PostToFacebook = PostToFacebook,
        accessToken = accessToken,
        mediaFile = fileContent
    });
// used for topics & subscriptions
message.Properties["subject"] = subject;
```

The preceding piece of code extends the BrokeredMessage with a subject property that can be used in a SqlFilter. A filter can only be applied in code on the subscription, not in the Windows Azure portal. This is fine, because in Geotopia, users must be able to subscribe to interesting topics, and for every topic that does not yet exist, a new subscription is created and processed by the worker role, the processor. The worker role contains the following code snippet in one of its threads:

```csharp
Uri uri = ServiceBusEnvironment.CreateServiceUri("sb", "<yournamespace>", string.Empty);
string name = "owner";
string key = "<yourkey>";

// get some credentials
TokenProvider tokenProvider = TokenProvider.CreateSharedSecretTokenProvider(name, key);

// create the namespace client
NamespaceManager namespaceClient = new NamespaceManager(
    ServiceBusEnvironment.CreateServiceUri("sb", "geotopiaservicebus", string.Empty),
    tokenProvider);
MessagingFactory factory = MessagingFactory.Create(uri, tokenProvider);

BrokeredMessage message = new BrokeredMessage();
message.Properties["subject"] = "interestingsubject";
MessageSender sender = factory.CreateMessageSender("dataqueue");
sender.Send(message); // message is sent to the topic

SubscriptionDescription subDesc = namespaceClient.CreateSubscription(
    "geotopiatopic",
    "SubscriptionOnMe",
    new SqlFilter("subject='interestingsubject'"));

// the processing loop
while (true)
{
    MessageReceiver receiver =
        factory.CreateMessageReceiver("geotopiatopic/subscriptions/SubscriptionOnMe");
    // it now only gets messages containing the property 'subject'
    // with the value 'interestingsubject'
    BrokeredMessage receivedMessage = receiver.Receive();
    try
    {
        ProcessMessage(receivedMessage);
        receivedMessage.Complete();
    }
    catch (Exception e)
    {
        receivedMessage.Abandon();
    }
}
```

Windows Azure Caching

Windows Azure offers caching capabilities out of the box. Caching is fast, because it is built as an in-memory (fast), distributed (running on different servers) technology. Windows Azure Caching offers two types of cache:

- Caching deployed on a role
- Shared caching

When you decide to host caching on your Windows Azure roles, you need to pick one of two deployment alternatives. The first is dedicated caching, where a worker role is fully dedicated to running as a caching store and its memory is used for caching. The second option is a co-located topology, meaning that a certain percentage of the available memory in your roles is assigned and reserved for in-memory caching purposes.
Keep in mind that the second option is the more cost-effective one, as you don't have a role running just for its memory. Shared caching is the central caching repository managed by the platform, accessible to your hosted services. You need to register the shared caching mechanism in the Service Bus, Access Control & Caching section of the portal, configuring a namespace and the size of the cache (remember, there is money involved). This caching facility is shared and runs inside a multitenant environment.
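The article stops short of showing cache access in code. As a rough sketch, assuming the caching client assembly (Microsoft.ApplicationServer.Caching) is referenced and the cache settings are present in the role configuration (someGeotopic is a placeholder for an instance of the Geotopic type used earlier), working with the default cache could look like this:

```csharp
using Microsoft.ApplicationServer.Caching;

// The factory reads its settings from the role's configuration.
DataCacheFactory cacheFactory = new DataCacheFactory();
DataCache cache = cacheFactory.GetDefaultCache();

// Cached objects must be serializable, like the Geotopic type above.
cache.Put("geotopic-42", someGeotopic);               // add or overwrite an entry
Geotopic cached = (Geotopic)cache.Get("geotopic-42"); // returns null on a cache miss
```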


Creating Java EE Applications

Packt
24 Oct 2014
16 min read
In this article by Grant Shipley, author of Learning OpenShift, we are going to learn how to use OpenShift in order to create and deploy Java-EE-based applications using the JBoss Enterprise Application Platform (EAP) application server. To illustrate the concepts of Java EE, we are going to create an application that displays an interactive map containing all of the major league baseball parks in the United States. We will start by covering some background information on the Java EE framework and then introduce each part of the sample application. We will learn how to create the sample application, named mlbparks, by creating the JBoss EAP container, then adding a database, creating the web services, and lastly creating the responsive map UI.

(For more resources related to this topic, see here.)

Evolution of Java EE

I can't think of a single programming language other than Java that has so many fans while at the same time having a large community of developers that profess their hatred towards it. The bad reputation that Java has can largely be attributed to early promises made by the community when the language was first released, and to those promises then not being fulfilled. Developers were told that we would be able to write once and run anywhere, but we quickly found out that this meant we could write once and then debug on every platform. Java was also perceived to consume more memory than required and was accused of being overly verbose by relying heavily on XML configuration files.

Another problem the language had was not being able to focus on and excel at one particular task. We used Java to create thick client applications, applets that could be downloaded via a web browser, embedded applications, web applications, and so on. Having Java available as a tool that could complete most projects was a great thing, but the implementation for each project was often confusing. For example, let's examine the history of GUI development using the Java programming language. When the language was first introduced, it included an API called the Abstract Window Toolkit (AWT) that was essentially a Java wrapper around native UI components supplied by the operating system. When Java 1.2 was released, the AWT implementation was deprecated in favor of the Swing API, which contained GUI elements written in 100 percent Java. By this time, a lot of developers were quickly growing frustrated with the available APIs, and a new toolkit called the Standard Widget Toolkit (SWT) was developed as yet another UI toolkit for Java. SWT was developed at IBM, is the windowing toolkit in use by the Eclipse IDE, and is considered by most to be the superior toolkit for creating applications.

As you can see, rapid changes in the core functionality of the language, coupled with the refusal of some vendors to ship the JRE as part of the operating system, left a bad taste in most developers' mouths. Another reason why developers began switching from Java to more attractive programming languages was the implementation of Enterprise JavaBeans (EJB). The first Java EE release occurred in December 1999, and the Java community is just now beginning to recover from the complexity introduced by the language in order to create applications. If you were able to escape creating applications using early EJBs, consider yourself one of the lucky ones, as many of your fellow developers were consumed by implementing large-scale systems using this new technology.
It wasn't fun; trust me. I was there and experienced it firsthand. When developers began abandoning Java EE, they seemed to go in one of two directions. Developers who understood that the Java language itself was quite beautiful and useful adopted the Spring Framework methodology of having enterprise-grade features while sticking with a Plain Old Java Object (POJO) implementation. Other developers were wooed away by languages that were considered more modern, such as Ruby and the popular Rails framework. While the rise in popularity of both Ruby and Spring was happening, the team behind Java EE continued to improve and innovate, which resulted in the creation of a new implementation that is both easy to use and develop with. I am happy to report that if you haven't taken a look at Java EE in the last few years, now is the time to do so. Working with the language after a long hiatus has been a rewarding and pleasurable experience.

Introducing the sample application

For the remainder of this article, we are going to develop an application called mlbparks that displays a map of the United States with a pin on the map representing the location of each major league baseball stadium. The requirements for the application are as follows:

- A single map that a user can zoom in and out of
- As the user moves the map around, the map must be updated with all baseball stadiums that are located in the shown area
- The location of the stadiums must be searchable based on map coordinates that are passed to the REST-based API
- The data should be transferred in the JSON format
- The web application must be responsive so that it is displayed correctly regardless of the resolution of the browser
- When a stadium is listed on the map, the user should be able to click on the stadium to view details about the associated team

The end-state application will look like the following screenshot. The user will also be able to zoom in on a specific location by double-clicking on the map or by clicking on the + zoom button in the top-left corner of the application. For example, if a user zooms the map in to the Phoenix, Arizona area of the United States, they will be able to see the information for the Arizona Diamondbacks stadium, as shown in the following screenshot. To view this sample application running live, open your browser and type http://mlbparks-packt.rhcloud.com.

Now that we have our requirements and know what the end result should look like, let's start creating our application.

Creating a JBoss EAP application

For the sample application that we are going to develop as part of this article, we are going to take advantage of the JBoss EAP application server that is available on the OpenShift platform. The JBoss EAP application server is a fully tested, stable, and supported platform for deploying mission-critical applications. Some developers prefer to use the open source community application server from JBoss called WildFly. Keep in mind when choosing WildFly over EAP that it only comes with community-based support and is a bleeding-edge application server. To get started with building the mlbparks application, the first thing we need to do is create a gear that contains the cartridge for our JBoss EAP runtime. For this, we are going to use the RHC tools.
Open up your terminal application and enter the following command:

```bash
$ rhc app create mlbparks jbosseap-6
```

Once the previous command is executed, you should see the following output:

```
Application Options
-------------------
Domain:     yourDomainName
Cartridges: jbosseap-6 (addtl. costs may apply)
Gear Size:  default
Scaling:    no

Creating application 'mlbparks' ... done
Waiting for your DNS name to be available ... done
Cloning into 'mlbparks'...

Your application 'mlbparks' is now available.

  URL:        http://mlbparks-yourDomainName.rhcloud.com/
  SSH to:     [email protected]
  Git remote: ssh://[email protected]/~/git/mlbparks.git/
  Cloned to:  /home/gshipley/code/mlbparks

Run 'rhc show-app mlbparks' for more details about your app.
```

If you have a paid subscription to OpenShift Online, you might want to consider using a medium- or large-size gear to host your Java-EE-based applications. To create this application using a medium-size gear, use the following command:

```bash
$ rhc app create mlbparks jbosseap-6 -g medium
```

Adding database support to the application

Now that our application gear has been created, the next thing we want to do is embed a database cartridge that will hold the information about the baseball stadiums we want to track. Given that we are going to develop an application that doesn't require referential integrity but provides a REST-based API that returns JSON, it makes sense to use MongoDB as our database.

MongoDB is arguably the most popular NoSQL database available today. The company behind the database, MongoDB, offers paid subscriptions and support plans for production deployments. For more information on this popular NoSQL database, visit www.mongodb.com.

Run the following command to embed a database into our existing mlbparks OpenShift gear:

```bash
$ rhc cartridge add mongodb-2.4 -a mlbparks
```

Once the preceding command is executed and the database has been added to your application, you will see the following information on the screen, containing the username and password for the database:

```
Adding mongodb-2.4 to application 'mlbparks' ... done

mongodb-2.4 (MongoDB 2.4)
-------------------------
Gears:          Located with jbosseap-6
Connection URL: mongodb://$OPENSHIFT_MONGODB_DB_HOST:$OPENSHIFT_MONGODB_DB_PORT/
Database Name:  mlbparks
Password:       q_6eZ22-fraN
Username:       admin

MongoDB 2.4 database added. Please make note of these credentials:

  Root User:      admin
  Root Password:  yourPassword
  Database Name:  mlbparks
  Connection URL: mongodb://$OPENSHIFT_MONGODB_DB_HOST:$OPENSHIFT_MONGODB_DB_PORT/
```

Importing the MLB stadiums into the database

Now that we have our application gear created and our database added, we need to populate the database with the information about the stadiums that we are going to place on the map.
The data is provided as a JSON document and contains the following information:

- The name of the baseball team
- The total payroll for the team
- The location of the stadium, represented as longitude and latitude
- The name of the stadium
- The name of the city where the stadium is located
- The league the baseball club belongs to (National or American)
- The year the data is relevant for
- All of the players on the roster, including their position and salary

A sample for the Arizona Diamondbacks looks like the following document:

```json
{
  "name": "Diamondbacks",
  "payroll": 89000000,
  "coordinates": [
    -112.066662,
    33.444799
  ],
  "ballpark": "Chase Field",
  "city": "Phoenix",
  "league": "National League",
  "year": "2013",
  "players": [
    {
      "name": "Miguel Montero",
      "position": "Catcher",
      "salary": 10000000
    },
    …………
  ]
}
```

In order to import the preceding data, we are going to use SSH. To get started with the import, SSH into your OpenShift gear for the mlbparks application by issuing the following command at your terminal prompt:

```bash
$ rhc app ssh mlbparks
```

Once we are connected to the remote gear, we need to download the JSON file and store it in the /tmp directory of our gear. To complete these steps, use the following commands on your remote gear:

```bash
$ cd /tmp
$ wget https://raw.github.com/gshipley/mlbparks/master/mlbparks.json
```

Wget is a software package available on most Linux-based operating systems for retrieving files using HTTP, HTTPS, or FTP. Once the file has finished downloading, take a quick look at its contents using your favorite text editor in order to get familiar with the structure of the document.

When you are comfortable with the data that we are going to import into the database, execute the following command on the remote gear to populate MongoDB with the JSON documents:

```bash
$ mongoimport --jsonArray -d $OPENSHIFT_APP_NAME -c teams --type json \
    --file /tmp/mlbparks.json -h $OPENSHIFT_MONGODB_DB_HOST \
    --port $OPENSHIFT_MONGODB_DB_PORT -u $OPENSHIFT_MONGODB_DB_USERNAME \
    -p $OPENSHIFT_MONGODB_DB_PASSWORD
```

If the command was executed successfully, you should see the following output on the screen:

```
connected to: 127.7.150.130:27017
Fri Feb 28 20:57:24.125 check 9 30
Fri Feb 28 20:57:24.126 imported 30 objects
```

What just happened? To understand this, we need to break the command we issued into smaller chunks, as detailed in the following table:

| Command/argument | Description |
| --- | --- |
| mongoimport | The command provided by MongoDB that allows users to import data into a database. |
| --jsonArray | Specifies that we are importing an array of JSON documents. |
| -d $OPENSHIFT_APP_NAME | Specifies the database into which we import the data. We use a system environment variable to target the database that was created by default when we embedded the database cartridge in our application. |
| -c teams | Defines the collection into which we want to import the data. If the collection does not exist, it is created. |
| --type json | Specifies the type of file we are importing. |
| --file /tmp/mlbparks.json | Specifies the full path and name of the file that we are importing into the database. |
| -h $OPENSHIFT_MONGODB_DB_HOST | Specifies the host of the MongoDB server. |
| --port $OPENSHIFT_MONGODB_DB_PORT | Specifies the port of the MongoDB server. |
| -u $OPENSHIFT_MONGODB_DB_USERNAME | Specifies the username used to authenticate to the database. |
| -p $OPENSHIFT_MONGODB_DB_PASSWORD | Specifies the password used to authenticate to the database. |

To verify that the data was loaded properly, you can use the following command, which prints out the number of documents in the teams collection of the mlbparks database:

```bash
$ mongo -quiet $OPENSHIFT_MONGODB_DB_HOST:$OPENSHIFT_MONGODB_DB_PORT/$OPENSHIFT_APP_NAME \
    -u $OPENSHIFT_MONGODB_DB_USERNAME -p $OPENSHIFT_MONGODB_DB_PASSWORD \
    --eval "db.teams.count()"
```

The result should be 30.

Lastly, we need to create a 2d index on the teams collection to ensure that we can perform spatial queries on the data. Geospatial queries are what allow us to search for specific documents that fall within a given location, as provided by the latitude and longitude parameters. To add the 2d index to the teams collection, enter the following command on the remote gear:

```bash
$ mongo $OPENSHIFT_MONGODB_DB_HOST:$OPENSHIFT_MONGODB_DB_PORT/$OPENSHIFT_APP_NAME \
    --eval 'db.teams.ensureIndex( { coordinates : "2d" } );'
```

Adding database support to our Java application

The next step in creating the mlbparks application is adding the MongoDB driver dependency to our application. OpenShift Online supports the popular Apache Maven build system as the default way of compiling source code and resolving dependencies. Maven was originally created to simplify the build process by allowing developers to specify the JARs their application depends on. This alleviates the bad practice of checking JAR files into the source code repository and provides a way to share JARs across several projects. This is accomplished via a pom.xml file that contains configuration items and dependency information for the project.

In order to add the dependency for the MongoDB client to our mlbparks application, we need to modify the pom.xml file in the root directory of the Git repository. The Git repository was cloned to our local machine during the application creation step that we performed earlier in this article. Open up your favorite text editor and modify the pom.xml file to include the following lines of code in the <dependencies> block:

```xml
<dependency>
  <groupId>org.mongodb</groupId>
  <artifactId>mongo-java-driver</artifactId>
  <version>2.9.1</version>
</dependency>
```

Once you have added the dependency, commit the changes to your local repository by using the following command:

```bash
$ git commit -am "added MongoDB dependency"
```

Finally, let's push the change to our Java application to include the MongoDB database drivers, using the git push command:

```bash
$ git push
```

The first time the Maven build system builds the application, it downloads all the dependencies for the application and then caches them. Because of this, the first build will always take a bit longer than any subsequent build.
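If you want to watch that first build happen, the RHC tools can stream the gear's log files, which include the Maven output during a deployment. This is an optional check, not part of the original walkthrough:

```bash
# Stream the application and build logs from the OpenShift gear
$ rhc tail mlbparks
```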
Creating the database access class

At this point, we have our application created, the MongoDB database embedded, all the information for the baseball stadiums imported, and the dependency for our database driver added to our application. The next step is to do some actual coding by creating a Java class that will act as the interface for connecting to and communicating with the MongoDB database.

Create a Java file named DBConnection.java in the mlbparks/src/main/java/org/openshift/mlbparks/mongo directory and add the following source code:

```java
package org.openshift.mlbparks.mongo;

import java.net.UnknownHostException;
import javax.annotation.PostConstruct;
import javax.enterprise.context.ApplicationScoped;
import javax.inject.Named;
import com.mongodb.DB;
import com.mongodb.Mongo;

@Named
@ApplicationScoped
public class DBConnection {
    private DB mongoDB;

    public DBConnection() {
        super();
    }

    @PostConstruct
    public void afterCreate() {
        // Connection details are supplied by OpenShift as environment variables.
        String mongoHost = System.getenv("OPENSHIFT_MONGODB_DB_HOST");
        String mongoPort = System.getenv("OPENSHIFT_MONGODB_DB_PORT");
        String mongoUser = System.getenv("OPENSHIFT_MONGODB_DB_USERNAME");
        String mongoPassword = System.getenv("OPENSHIFT_MONGODB_DB_PASSWORD");
        String mongoDBName = System.getenv("OPENSHIFT_APP_NAME");
        int port = Integer.decode(mongoPort);

        Mongo mongo = null;
        try {
            mongo = new Mongo(mongoHost, port);
        } catch (UnknownHostException e) {
            System.out.println("Couldn't connect to MongoDB: " + e.getMessage()
                    + " :: " + e.getClass());
        }
        mongoDB = mongo.getDB(mongoDBName);
        if (mongoDB.authenticate(mongoUser, mongoPassword.toCharArray()) == false) {
            System.out.println("Failed to authenticate DB ");
        }
    }

    public DB getDB() {
        return mongoDB;
    }
}
```

The preceding source code, as well as all source code for this article, is available on GitHub at https://github.com/gshipley/mlbparks.

The code snippet simply creates an application-scoped bean that is available until the application is shut down. The @ApplicationScoped annotation is used when creating application-wide data or constants that should be available to all users of the application. We chose this scope because we want to maintain a single connection class for the database that is shared among all requests. The next bit of interesting code is the afterCreate method, which authenticates against the database using the system environment variables.

Once you have created the DBConnection.java file and added the preceding source code, add the file to your local repository and commit the changes as follows:

```bash
$ git add .
$ git commit -am "Adding database connection class"
```

Creating the beans.xml file

The DBConnection class we just created makes use of Context Dependency Injection (CDI), which is part of the Java EE specification, for dependency injection. According to the official specification for CDI, an application that uses CDI must have a file called beans.xml. The file must be present and located under the WEB-INF directory. Given this requirement, create a file named beans.xml under the mlbparks/src/main/webapp/WEB-INF directory and add the following lines of code (the xmlns declarations, missing in the flattened original, are added here so the file is valid XML):

```xml
<?xml version="1.0"?>
<beans xmlns="http://java.sun.com/xml/ns/javaee"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://jboss.org/schema/cdi/beans_1_0.xsd"/>
```

After you have added the beans.xml file, add and commit it to your local Git repository:

```bash
$ git add .
$ git commit -am "Adding beans.xml for CDI"
```

Summary

In this article, we learned about the evolution of Java EE, created a JBoss EAP application, and created the database access class.

Resources for Article:

Further resources on this subject:
- Using OpenShift [Article]
- Common performance issues [Article]
- The Business Layer (Java EE 7 First Look) [Article]


KubeCon + CloudNativeCon North America 2019 Highlights: Helm 3.0 release, CodeReady Workspaces 2.0, and more!

Savia Lobo
26 Nov 2019
6 min read
Update: On November 26, the OpenFaaS community released a post with a few of its highlights at KubeCon, San Diego. The post also includes highlights from OpenFaaS Cloud, Flux from Weaveworks, Okteto, Dive from Buoyant, and k3s going GA.

KubeCon + CloudNativeCon 2019, held in San Diego, North America from November 18 to 21, brought together over 12,000 attendees to discuss and advance containers, Kubernetes, and cloud native computing. The conference was home to many major announcements, including the release of Helm 3.0, Red Hat's CodeReady Workspaces 2.0, the general availability of Managed Istio on IBM Cloud Kubernetes Service, and many more.

Major highlights at KubeCon + CloudNativeCon 2019

General availability of Managed Istio on IBM Cloud Kubernetes Service

IBM Cloud announced that Managed Istio on its IBM Cloud Kubernetes Service is generally available. This service provides a seamless installation of Istio, automatic updates, lifecycle management of the Istio control plane components, and integration with platform logging and monitoring tools. With Managed Istio, a user's service mesh is tuned for optimal performance in IBM Cloud Kubernetes Service. Istio is a service mesh that provides its features without developers having to make any modifications to their applications. The Istio installation is tuned to perform optimally on IBM Cloud Kubernetes Service and is pre-configured to work out of the box with IBM Log Analysis with LogDNA and IBM Cloud Monitoring with Sysdig.

Red Hat announces CodeReady Workspaces 2.0

CodeReady Workspaces 2.0 helps developers build applications and services in an environment similar to production, that is, with all apps running on Red Hat OpenShift. A few new services and tools in CodeReady Workspaces 2.0 include:

- Air-gapped installs: These enable CodeReady Workspaces to be downloaded, scanned, and moved into more secure environments when access to the public internet is limited or unavailable. It doesn't "call back" to public internet services.
- An updated user interface: This brings an improved desktop-like experience to developers.
- Support for VSCode extensions: This gives developers access to thousands of IDE extensions.
- Devfile: A sharable workspace configuration that specifies everything a developer needs to work, including repositories, runtimes, build tools, and IDE plugins, and is stored and versioned with the code in Git.
- Production-consistent containers for developers: This clones the sources where needed and adds development tools (such as debuggers, language servers, unit test tools, and build tools) as sidecar containers, so that the running application container mirrors production.

Brad Micklea, vice president of Developer Tools, Developer Programs, and Advocacy at Red Hat, said, "Red Hat is working to make developing in cloud native environments easier, offering the features developers need without requiring deep container knowledge. Red Hat CodeReady Workspaces 2 is well-suited for security-sensitive environments and those organizations that work with consultants and offshore development teams."

To know more about CodeReady Workspaces 2.0, read the press release on the Red Hat official blog.

Helm 3.0 released

Built upon the success of Helm 2, the internal implementation of Helm 3 has changed considerably. The most apparent change in Helm 3.0 is the removal of Tiller. A rich set of new features has been added as a result of the community's input and requirements.
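The Tiller change is easiest to see side by side. As a small illustration (the chart and release names below are placeholders): Helm 2 required initializing Tiller in the cluster and took the release name as a flag, while Helm 3 drops `helm init` entirely and takes the release name as a positional argument.

```bash
# Helm 2 (Tiller-based)
helm init                             # installed Tiller into the cluster
helm install --name myapp ./mychart   # release name passed via a flag

# Helm 3 (no Tiller)
helm install myapp ./mychart          # release name is a positional argument
helm upgrade myapp ./mychart
```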
A few of these features include:

- Improved upgrade strategy: Helm 3 uses 3-way strategic merge patches
- Secrets as the default storage driver
- Go import path changes
- Validating chart values with JSONSchema

Some features have been deprecated or refactored in ways that make them incompatible with Helm 2, and some new experimental features have been introduced, including OCI support. The Helm Go SDK has also been refactored for general use, the goal being to share and reuse code open sourced with the broader Go community. To know more about Helm 3.0 in detail, read the official blog post.

AWS, Intuit, and Weaveworks collaborate on Argo Flux

Recently, Weaveworks announced a partnership with Intuit to create Argo Flux, a major open source project to drive GitOps application delivery for Kubernetes via an industry-wide community. Argo Flux combines the Argo CD project led by Intuit with the Flux CD project driven by Weaveworks, two well-known open source tools with strong community support. At KubeCon, AWS announced that it is integrating the GitOps tooling based on Argo Flux in Elastic Kubernetes Service and Flagger for AWS App Mesh. The collaboration resulted in a new project called GitOps Engine to simplify application deployment in Kubernetes. The GitOps Engine will be responsible for the following functionality:

- Access to Git repositories
- Kubernetes resource cache
- Manifest generation
- Resource reconciliation
- Sync planning

To know more about this collaboration in detail, read the GitOps Engine page on GitHub.

Grafana Labs announces general availability of Loki 1.0

Grafana Labs, an open source analytics and monitoring solution provider, announced that Loki version 1.0 is generally available for production use. Loki is an open source logging platform that provides developers with an easy-to-use, highly efficient, and cost-effective approach to log aggregation. With Loki 1.0, users can instantaneously switch between metrics and logs, preserving context and reducing MTTR. By storing compressed, unstructured logs and only indexing metadata, Loki is cost-effective and simple to operate by design. It includes a set of components that can be composed into a fully featured logging stack. Grafana Cloud offers a high-performance, hosted Loki service that allows users to store all logs together in a single place with usage-based pricing. Read about Loki 1.0 on GitHub to know more in detail.

Rancher extends Kubernetes to the edge with the general availability of K3s

Rancher, creator of the vendor-agnostic and cloud-agnostic Kubernetes management platform, announced the general availability of K3s, a lightweight, certified Kubernetes distribution purpose-built for small-footprint workloads. Rancher partnered with ARM to build a highly optimized version of Kubernetes for the edge. K3s is packaged as a single binary under 40 MB with a small footprint, which reduces the dependencies and steps needed to install and run Kubernetes in resource-constrained environments such as IoT and edge devices. To know more about this announcement in detail, read the official press release.

There were many additional announcements, including Portworx launching PX-Autopilot, Huawei presenting its latest advances on KubeEdge, Diamanti announcing its Spektra hybrid cloud solution, and many more. To know more about all the keynotes and tutorials at KubeCon North America 2019, visit its GitHub page.


Using Amazon Simple Notification Service (SNS) to create an SNS topic

Vijin Boricha
30 Apr 2018
5 min read
Simple Notification Service is a managed web service that you, as an end user, can leverage to send messages to various subscribing endpoints. SNS works in a publisher-subscriber, or producer and consumer, model: producers create and send messages to a particular topic, which is in turn consumed by one or more subscribers over a supported set of protocols. At the time of writing this book, SNS supports HTTP, HTTPS, email, push notifications in the form of SMS, as well as AWS Lambda and Amazon SQS as the preferred subscriber endpoints.

In today's tutorial, we will learn about Amazon Simple Notification Service (SNS) and how to create your very own simple SNS topics.

SNS is a really simple and yet extremely useful service that you can use for a variety of purposes, the most common being pushing notifications or system alerts to cloud administrators whenever a particular event occurs. We have been using SNS throughout this book for this same purpose; however, there are many more features and use cases that SNS can be leveraged for. For example, you can use SNS to send out promotional emails or SMS to a large group of targeted audiences, or even use it as a mobile push notification service where the messages are pushed directly to your Android or iOS applications.

With this in mind, let's quickly go ahead and create a simple SNS topic of our own:

1. First, log in to your AWS Management Console and, from the Filter option, filter out the SNS service. Alternatively, you can access the SNS dashboard directly at https://console.aws.amazon.com/sns. If this is your first time with SNS, simply select the Get Started option to begin.
2. At the SNS dashboard, select the Create topic option, as shown in the following screenshot.
3. You will be prompted to provide a suitable Topic name and its corresponding Display name. Topics form the core functionality of SNS: you can use topics to send messages to a particular type of subscribing consumer. Remember, a single topic can be subscribed to by more than one consumer.
4. Once you have typed in the required fields, select the Create topic option to complete the process.

That's it! Simple, isn't it? Having created your topic, you can now go ahead and associate it with one or more subscribers. To do so, we first need to create one or more subscriptions:

1. Select the Create subscription option provided under the newly created topic, as shown in the following screenshot.
2. In the Create subscription dialog box, select a suitable Protocol that will subscribe to the newly created topic. In this case, I've selected Email as the Protocol.
3. Provide a valid email address in the subsequent Endpoint field. The Endpoint field will vary based on the selected protocol. Once completed, click on the Create subscription button to complete the process.

With the subscription created, you will now have to validate it. To do so, launch your email application and select the Confirm subscription link in the mail that you will have received. Once the subscription is confirmed, you will be redirected to a confirmation page where you can view the subscribed topic's name as well as the subscription ID, as shown in the following screenshot.

You can use the same process to create and assign multiple subscribers to the same topic. For example, select the Create subscription option, as performed earlier, and from the Protocol drop-down list, select SMS as the new protocol.
Next, provide a valid phone number in the subsequent Endpoint field. The number can be prefixed by your country code, as shown in the following screenshot. Once completed, click on the Create subscription button to complete the process.

With the subscriptions created successfully, you can now test the two by publishing a message to your topic. To do so, select the Publish to topic option from your topic's page. Once a message is published here, SNS will attempt to deliver that message to each of its subscribing endpoints; in this case, to the email address as well as the phone number. Type in a suitable Subject name followed by the actual message that you wish to send. Note that if your character count exceeds 160 for an SMS, SNS will automatically send another SMS with the remaining characters. You can optionally switch the Message format between Raw and JSON to match your requirements. Once completed, select Publish Message.

Check your email application once more for the published message; you should receive an email, as shown in the following screenshot. Similarly, you can create and associate one or more such subscriptions with each of the topics that you create.

We learned about creating simple Amazon SNS topics. You read an excerpt from the book AWS Administration - The Definitive Guide - Second Edition, written by Yohan Wadia. This book will help you build a highly secure, fault-tolerant, and scalable Cloud environment.
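If you prefer scripting these console steps, they map directly onto the AWS CLI. This is an optional aside, not part of the book excerpt; the topic name, ARN, account number, and endpoints below are placeholders:

```bash
# Create the topic (the command prints the topic ARN used below)
aws sns create-topic --name my-topic

# Subscribe an email endpoint (the recipient still has to confirm)
aws sns subscribe \
    --topic-arn arn:aws:sns:us-east-1:123456789012:my-topic \
    --protocol email \
    --notification-endpoint user@example.com

# Publish a test message to all confirmed subscribers
aws sns publish \
    --topic-arn arn:aws:sns:us-east-1:123456789012:my-topic \
    --subject "Test" \
    --message "Hello from SNS"
```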


Ansible 2 for automating networking tasks on Google Cloud Platform [Tutorial]

Vijin Boricha
31 Jul 2018
8 min read
Google Cloud Platform is one of the largest and most innovative cloud providers out there. It is used by various industry leaders such as Coca-Cola, Spotify, and Philips. Amazon Web Services and Google Cloud are constantly involved in a price war, which benefits consumers greatly. Google Cloud Platform covers 12 geographical regions across four continents, with new regions coming up every year. In this tutorial, we will learn about Google Compute Engine and network services, and how Ansible 2 can be leveraged to automate common networking tasks. This is an excerpt from Ansible 2 Cloud Automation Cookbook, written by Aditya Patawari and Vikas Aggarwal.

Managing network and firewall rules

By default, inbound connections are not allowed to any of the instances. One way to allow the traffic is by allowing incoming connections to a certain port of instances carrying a particular tag. For example, we can tag all the webservers as http and allow incoming connections to ports 80 and 8080 for all the instances carrying the http tag.

How to do it…

We will create a firewall rule with a source tag using the gce_net module:

```yaml
- name: Create Firewall Rule with Source Tags
  gce_net:
    name: my-network
    fwname: "allow-http"
    allowed: tcp:80,8080
    state: "present"
    target_tags: "http"
    subnet_region: us-west1
    service_account_email: "{{ service_account_email }}"
    project_id: "{{ project_id }}"
    credentials_file: "{{ credentials_file }}"
  tags:
    - recipe6
```

Using tags for firewalls is not possible all the time. A lot of organizations whitelist internal IP ranges or allow office IPs to reach the instances over the network. A simple way to allow a range of IP addresses is to use a source range:

```yaml
- name: Create Firewall Rule with Source Range
  gce_net:
    name: my-network
    fwname: "allow-internal"
    state: "present"
    src_range: ['10.0.0.0/16']
    subnet_name: public-subnet
    allowed: 'tcp'
    service_account_email: "{{ service_account_email }}"
    project_id: "{{ project_id }}"
    credentials_file: "{{ credentials_file }}"
  tags:
    - recipe6
```

How it works...

In step 1, we created a firewall rule called allow-http to allow incoming requests to TCP ports 80 and 8080. Since our instance app is tagged with http, it can accept incoming traffic on ports 80 and 8080. In step 2, we allowed access for all instances with an IP in 10.0.0.0/16, a private IP address range. Along with the connection parameters and the source IP address CIDR, we defined the network name and subnet name, and we allowed all TCP connections. If we want to restrict the rule to a port or a range of ports, we can use tcp:80 or tcp:4000-5000, respectively.

Managing the load balancer

An important reason to use a cloud is to achieve scalability at a relatively low cost. Load balancers play a key role in scalability: we can attach multiple instances behind a load balancer to distribute the traffic between the instances. The Google Cloud load balancer also supports health checks, which help ensure that traffic is sent to healthy instances only.

How to do it…

Let us create a load balancer and attach an instance to it:

```yaml
- name: create load balancer and attach to instance
  gce_lb:
    name: loadbalancer1
    region: us-west1
    members: ["{{ zone }}/app"]
    httphealthcheck_name: hc
    httphealthcheck_port: 80
    httphealthcheck_path: "/"
    service_account_email: "{{ service_account_email }}"
    project_id: "{{ project_id }}"
    credentials_file: "{{ credentials_file }}"
  tags:
    - recipe7
```

For creating a load balancer, we need to supply a comma-separated list of instances.
We also need to provide health check parameters, including a name, a port, and the path to which a GET request can be sent.

Managing GCE images in Ansible 2

Images are a collection of a boot loader, an operating system, and a root filesystem. There are public images provided by Google and various open source communities, and we can use these images to create an instance. GCE also gives us the capability to create our own image, which we can use to boot instances.

It is important to understand the difference between an image and a snapshot. A snapshot is incremental, but it is just a disk snapshot; due to its incremental nature, it is better for creating backups. Images consist of more information, such as a boot loader, and are non-incremental in nature. However, it is possible to import images from a different cloud provider or datacenter to GCE. Another reason we recommend snapshots for backup is that taking a snapshot does not require us to shut down the instance, whereas building an image does. Why build images at all? We will discover that in subsequent sections.

How to do it…

Let us create an image for now:

```yaml
- name: stop the instance
  gce:
    instance_names: app
    zone: "{{ zone }}"
    machine_type: f1-micro
    image: centos-7
    state: stopped
    service_account_email: "{{ service_account_email }}"
    credentials_file: "{{ credentials_file }}"
    project_id: "{{ project_id }}"
    disk_size: 15
    metadata: "{{ instance_metadata }}"
  tags:
    - recipe8

- name: create image
  gce_img:
    name: app-image
    source: app
    zone: "{{ zone }}"
    state: present
    service_account_email: "{{ service_account_email }}"
    pem_file: "{{ credentials_file }}"
    project_id: "{{ project_id }}"
  tags:
    - recipe8

- name: start the instance
  gce:
    instance_names: app
    zone: "{{ zone }}"
    machine_type: f1-micro
    image: centos-7
    state: started
    service_account_email: "{{ service_account_email }}"
    credentials_file: "{{ credentials_file }}"
    project_id: "{{ project_id }}"
    disk_size: 15
    metadata: "{{ instance_metadata }}"
  tags:
    - recipe8
```

How it works...

In these tasks, we stop the instance first and then create the image, supplying just the instance name along with the standard connection parameters. Finally, we start the instance back up. The parameters of these tasks are self-explanatory.

Creating instance templates

Instance templates define various characteristics of an instance and related attributes. Some of these attributes are:

- Machine type (f1-micro, n1-standard-1, custom)
- Image (we created one in the previous tip, app-image)
- Zone (us-west1-a)
- Tags (we have a firewall rule for tag http)

How to do it…

Once a template is created, we can use it to create a managed instance group that can be auto-scaled based on various parameters. Instance templates are typically available globally, as long as we do not specify a restrictive parameter such as a specific subnet or disk:

```yaml
- name: create instance template named app-template
  gce_instance_template:
    name: app-template
    size: f1-micro
    tags: http,http-server
    image: app-image
    state: present
    subnetwork: public-subnet
    subnetwork_region: us-west1
    service_account_email: "{{ service_account_email }}"
    credentials_file: "{{ credentials_file }}"
    project_id: "{{ project_id }}"
  tags:
    - recipe9
```

We have specified the machine type, image, subnets, and tags. This template can be used to create instance groups.
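If you want to confirm the template outside of Ansible, the Google Cloud SDK can list and inspect instance templates. This is an optional verification step, not part of the original recipe:

```bash
# List all instance templates in the project, then inspect ours
$ gcloud compute instance-templates list
$ gcloud compute instance-templates describe app-template
```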
Instance groups let us manage a group of identical virtual machines as a single entity. These virtual machines are created from an instance template, like the one which we created in the previous tip. Now, if we have to make a change in instance configuration, that change would be applied to all the instances in the group. How to do it… Perhaps, the most important feature of an instance group is auto-scaling. In event of high resource requirements, the instance group can scale up to a predefined number automatically: - name: create an instance group with autoscaling gce_mig: name: app-mig zone: "{{ zone }}" service_account_email: "{{ service_account_email }}" credentials_file: "{{ credentials_file }}" project_id: "{{ project_id }}" state: present size: 2 named_ports: - name: http port: 80 template: app-template autoscaling: enabled: yes name: app-autoscaler policy: min_instances: 2 max_instances: 5 cool_down_period: 90 cpu_utilization: target: 0.6 load_balancing_utilization: target: 0.8 tags: - recipe10 How it works... The preceding task creates an instance group with an initial size of two instances, defined by size. We have named port 80 as HTTP. This can be used by other GCE components to route traffic. We have used the template that we created in the previous recipe. We also enable autoscaling with a policy to allow scaling up to five instances. At any given point, at least two instances would be running. We are scaling on two parameters, cpu_utilization, where 0.6 would trigger scaling after the utilization exceeds 60% and load_balancing_utilization where the scaling will trigger after 80% of the requests per minutes capacity is reached. Typically, when an instance is booted, it might take some time for initialization and startup. Data collected during that period might not make much sense. The parameter, cool_down_period, indicates that we should start collecting data from the instance after 90 seconds and should not trigger scaling based on data before. We learnt a few networking tricks to manage public cloud infrastructure effectively. You can know more about building the public cloud infrastructure by referring to this book Ansible 2 Cloud Automation Cookbook. Why choose Ansible for your automation and configuration management needs? Getting Started with Ansible 2 Top 7 DevOps tools in 2018
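All of the recipes above assume that connection details such as service_account_email, project_id, credentials_file, and zone are already defined as variables. A minimal sketch of a vars file that could supply them is shown below; the file name and the sample values are assumptions for illustration only, not part of the original recipes:

# group_vars/all.yml (illustrative values only; replace with your own project details)
service_account_email: ansible@my-project.iam.gserviceaccount.com
project_id: my-project
credentials_file: /home/ansible/my-project-credentials.json
zone: us-west1-a

With these variables in place, each recipe can be run selectively through its tag, for example ansible-playbook site.yml --tags recipe6 (the playbook file name site.yml is likewise an assumption).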

Introduction to Web Experience Factory

Packt
24 Sep 2012
20 min read
What is Web Experience Factory? Web Experience Factory is a rapid application development tool, which applies software automation technology to construct applications. By using WEF, developers can quickly create single applications that can be deployed to a variety of platforms, such as IBM WebSphere Application Server and IBM WebSphere Portal Server , which in turn can serve your application to standard browsers, mobile phones, tablets, and so on. Web Experience Factory is the new product derived from the former WebSphere Portlet Factory (WPF) product. In addition to creating portal applications, WEF always had the capability of creating exceptional web applications. In fact, the initial product developed by Bowstreet, the company which originally created WPF, was meant to create web applications, way before the dawn of portal technologies. As the software automation technology developed by Bowstreet could easily be adapted to produce portal applications, it was then tailored for the portal market. This same adaptability is now expanded to enable WEF to target different platforms and multiple devices. Key benefits of using Web Experience Factory for portlet development While WEF has the capability of targeting several platforms, we will be focusing on IBM WebSphere Portal applications. The following are a few benefits of WEF for the portal space: Significantly improves productivity Makes portal application development easier Contains numerous components (builders) to facilitate portal application development Insulates the developer from the complexity of the low-level development tasks Automatically handles the deployment and redeployment of the portal project (WAR file ) to the portal Reduces portal development costs The development environment Before we discuss key components of WEF, let's take a look at the development environment. From a development environment perspective, WEF is a plugin that is installed into either Eclipse or IBM Rational Application Developer for WebSphere. As a plugin, it uses all the standard features from these development environments at the same time that it provides its own perspective and views to enable the development of portlets with WEF. Let's explore the WEF development perspective in Eclipse. The WEF development environment is commonly referred to as the designer. While we explore this perspective, you will read about new WEF-specific terms. In this section, we will neither define nor discuss them, but don't worry. Later on in this article, you will learn all about these new WEF terms. The following screenshot shows the WEF perspective with its various views and panes: The top-left pane, identified by number 1, shows the Project Explorer tab. In this pane you can navigate to the WEF project, which has a structure similar to a JEE project. WEF adds a few extra folders to host the WEF-specific files. Box 1 also contains a tab to access the Package Explorer view. The Package Explorer view enables you to navigate the several directories containing the .jar files. These views can be arranged in different ways within this Eclipse perspective. The area identified by number 2 shows the Outline view. This view holds the builder call list. This view also holds two important icons. The first one is the "Regeneration" button. This is the first icon from left to right, immediately above the builder call table header. Honestly, we do not know what the graphical image of this icon is supposed to convey. 
Some people say it looks like a candlelight, others say it looks like a chess pawn. We even heard people referring to this icon as the "Fisher-Price" icon, because it looks like the Fisher-Price children's toy. The button right next to the Regeneration button is the button to access the Builder palette. From the Builder palette, you can select all builders available in WEF. Box number 3 presents the panes available to work on several areas of the designer. The screenshot inside this box shows the Builder Call Editor. This is the area where you will be working with the builders you add to your model. Lastly, box number 4 displays the Applied Profiles view. This view displays content only when the open model contains profile-enabled inputs, which is not the case in this screenshot. The following screenshot shows the right-hand side pane, which contains four tabs—Source, Design, Model XML, and Builder Call Editor. The preceding screenshot shows the content displayed when you select the first tab from the right-hand side pane, the Source tab. The Source tab exposes two panes. The left-hand side pane contains the WebApp tree, and the right-hand side pane contains the source code for elements selected from the WebApp tree. Although it is not our intention to define the WEF elements in this section, it is important to make an exception to explain to you what the WebApp tree is. The WebApp tree is a graphical representation of your application. This tree represents an abstract object identified as WebApp object. As you add builders to your models or modify them, these builders add or modify elements in this WebApp object. You cannot modify this object directly except through builders. The preceding screenshot shows the source code for the selected element in the WebApp tree. The code shows what WEF has written and the code to be compiled. The following screenshot shows the Design pane. The Design pane displays the user interface elements placed on a page either directly or as they are created by builders. It enables you to have a good sense of what you are building from a UI perspective. The following screenshot shows the content of a model represented as an XML structure in the Model XML tab. The highlighted area in the right-hand side pane shows the XML representation of the sample_PG builder, which has been selected in the WebApp tree. We will discuss the next tab, Builder Call Editor, when we address builders in the next section. Key components of WEF—builders, models, and profiles Builders, models, and profiles comprise the key components of WEF. These three components work together to enable software automation through WEF. Here, we will explain and discuss in details what they are and what they do. Builders Builders are at the core of WEF technology. There have been many definitions for builders. Our favorite is the one that defines builders as "software components, which encapsulate design patterns". Let's look at the paradigm of software development as it maps to software patterns. Ultimately, everything a developer does in terms of software development can be defined as patterns. There are well-known patterns, simple and complex patterns, well-documented patterns, and patterns that have never been documented. Even simple, tiny code snippets can be mapped to patterns. Builders are the components that capture these countless patterns in a standard way, and present them to developers in an easy, common, and user-friendly interface. 
This way, developers can use and reuse these patterns to accomplish their tasks. Builders enable developers to put together these encapsulated patterns in a meaningful fashion in such a way that they become full-fl edged applications, which address business needs. In this sense, developers can focus more on quickly and efficiently building the business solutions instead of focusing on low-level, complex, and time consuming development activities. Through the builder technology, senior and experienced developers at the IBM labs can identify, capture, and code these countless patterns into reusable components. When you are using builders, you are using code that has not only been developed by a group, which has already put a lot of thought and effort into the development task, but also a component, which has been extensively tested by IBM. Here, we will refer to the IBM example, because they are the makers of WEF—but overall, any developer can create builders. Simple and complex builders The same way that development activities can range from very simple to very complex tasks, builders can also range from very simple to very complex. Simple builders can perform tasks such as placing an attribute on a tag, highlighting a row of a table, or creating a simple link. Equally, there are complex builders , which perform complex and extensive tasks. These builders can save WEF developers' days worth of work, troubleshooting, and aggravation. For instance, there are builders for accessing, retrieving, and transforming data from backend systems, builders to create tables, form, and hundreds of others. The face of builders The following screenshot shows a Button builder in the Builder Editor pane: All builders have a common interface, which enables developers to provide builder input values. The builder input values define several aspects concerning how the application code will be generated by this builder. Through the Builder Editor pane, developers define how a builder will contribute to the process of creating your application, be it a portlet, a web application, or a widget. Any builder contains required and optional builder inputs. The required inputs are identified with an asterisk symbol (*) in front of their names. For instance, the preceding screenshot representing the Button builder shows two required inputs—Page and Tag. As you can see through the preceding screenshot, builder input values can be provided through several ways. The following table describes the items identified by the numbered labels: Label number Description Function 1 Free form inputs Enables developer to type in any appropriate value. 2 Drop-down controls Enables developer to select values from a predefined list, which is populated based on the context of the input. This type of input is dynamically populated with possible influence from other builder inputs, other builders in the same model, or even other aspects of the current WEF project. 3 Picker controls This type of control enables users make a selection from multiple source types such as variables, action list builders, methods defined in the current model, public methods defined in java classes and exposed through the Linked Java Class builder, and so on. The values selected through the picker controls can be evaluated at runtime. 4 Profiling assignment button This button enables developers to profile-enable the value for this input. 
In another words, through this button, developers indicate that the value for this input will come from a profile to be evaluated at regeneration time.     Through these controls, builders contribute to make the modeling process faster at the same time it reduces errors, because only valid options and within the proper context are presented. Builders are also adaptive. Inputs, controls, and builder sections are either presented, hidden, or modified depending upon the resulting context that is being automatically built by the builder. This capability not only guides the developers to make the right choices, but it also helps developers become more productive. Builder artifacts We have already mentioned that builders either add artifacts to or modify existing artifacts in the WebApp abstract object. In this section, we will show you an instance of these actions. In order to demonstrate this, we will not walk you through a sample. Rather, we will show you this process through a few screenshots from a model. Here, we will simulate the action of adding a button to a portlet page. In WEF, it is common to start portlet development with a plain HTML page, which contains mostly placeholder tags. These placeholders, usually represented by the names of span or div tags, indicate locations where code will be added by the properly selected builders. The expression "code will be added" can be quite encompassing. Builders can create simple HTML code, JavaScript code, stylesheet values, XML schemas, Java code, and so on. In this case, we mean to say that builders have the capability of creating any code required to carry on the task or tasks for which they have been designed. In our example, we will start with a plain and simple HTML page, which is added to a model either through a Page builder or an Imported Page builder. Our sample page contains the following HTML content: Now, let's use a Button builder to add a button artifact to this sample_PG page, more specifically to the sampleButton span tag. Assume that this button performs some action through a Method builder (Java Method), which in turn returns the same page. The following screenshot shows what the builder will look like after we provide all the inputs we will describe ahead: Let's discuss the builder inputs we have provided in the preceding screenshot. The first input we provide to this builder is the builder name. Although this input is not required, you should always name your builders. Some naming convention should be used for naming your builders. If you do not name your builders, WEF will name them for you. The following table shows same sample names, which adds an underscore followed by two or three letters to indentify the builder type: Builder type Builder name Button search_BTN Link details_LNK Page search_PG Data Page searchCriteria_DP Variable searchInputs_VAR Imported Model results_IM Model Container customer_MC There are several schools of thoughts regarding naming convention. Some scholars like to debate in favor of one or another. Regardless of the naming convention you adopt, you need to make sure that the same convention is followed by the entire development team. The next inputs relate to the location where the content created by this builder will be placed. For User Interface builders, you need to specify which page will be targeted. You also need to specify, within that page, the tag with which this builder will be associated. 
Besides specifying a tag based on the name, you can also use the other location techniques to define this location. In our simple example, we will be selecting the sample_PG page. If you were working on a sample, and if you would click on the drop-down control, you would see that only the available pages would be displayed as options from which you could choose. When a page is not selected, the tag input does not display any value. That is because the builders know how to present only valid options based on the inputs you have previously provided. For this example, we will select sample_PG for page input. After doing so, the Tag input is populated with all the HTML tags available on this page. We selected the sampleButton tag. This means that the content to be created on this page will be placed at the same location where this tag currently exists. It replaces the span tag type, but it preserves the other attributes, which make sense for the builder being currently added. Another input is the label value to be displayed. Once again, here you can type in a value, you can select a value from the picker, or you can specify a value to be provided by a profile. In this sample, we have typed in Sample Button. For the Button builder, you need to define the action to be performed when the button is clicked. Here also, the builder presents only the valid actions from which we can select one. We have selected, Link to an action. For the Action input, we select sample_MTD. This is the mentioned method, which performs some action and returns the same page. Now that the input values to this Button builder have been provided, we will inspect the content created by this builder. Inspecting content created by builders The builder call list has a small gray arrow icon in front of each builder type. By clicking on this icon, you cause the designer to show the content and artifacts created by the selected builder: By clicking on the highlighted link, the designer displays the WebApp tree in its right-hand side pane. By expanding the Pages node, you can see that one of the nodes is sample_BTN, which is our button. By clicking on this element, the Source pane displays the sample page with which we started. If necessary, click on the Source tab at the bottom of the page to expose the source pane. Once the WebApp tree is shown, by clicking on the sample_BTN element, the right-hand side pane highlights the content created by the Button builder we have added. Let's compare the code shown in the preceding screenshot against the original code shown by the screenshot depicturing the Sample Page builder. Please refer to the screenshot that shows a Sample Page builder named sample_PG. This screenshot shows that the sample_PG builder contains simple HTML tags defined in the Page Contents (HTML) input. By comparing these two screenshots, the first difference we notice is that after adding the Button builder, our initial simple HTML page became a JSP page, as denoted by the numerous JSP notations on this page. We can also notice that the initial sampleButton span tag has been replaced by an input tag of the button type. This tag includes an onClick JavaScript event. The code for this JavaScript event is provided by JSP scriptlet created by the Button builder. As we learned in this section, builders add diverse content to the WebApp abstract object. They can add artifacts such as JSP pages, JavaScript code, Java classes, and so on, or they can modify content already created by other builders. 
In summary, builders add or modify any content or artifacts in order to carry on their purpose according to the design pattern they represent. Models Another important element of WEF is the Model component. Model is a container for builder calls. The builder call list is maintained in an XML file with a .model extension. The builder call list represents the list of builders added to a model. The Outline view of the WEF perspective displays the list of builders that have been added to a model. The following screenshot displays the list of builder calls contained in a sample model: To see what a builder call looks like inside the model, you can click on the gray arrow icon in front of the builder type and inspect it in the Model XML tab. For instance, let's look at the Button builder call inside the sample model we described in the previous section. The preceding image represents a builder call the way it is stored in the model file. This builder call is one of the XML elements found in the BuilderCallList node, which in turn is child of the Model node. Extra information is also added at the end of this file. This XML model file contains the input names and the values for each builder you have added to this model. WEF operates on this information and the set of instructions contained in these XML elements, to build your application by invoking a process known as generation or regeneration to actually build the executable version of your application, be it a portlet, a web application, or a widget. We will discuss more on regeneration at the end of this article. It is important to notice that models contain only the builder call list, not the builders themselves. Although the terms—builder call and builder are used interchangeably most of the times, technically they are different. Builder call can be defined as an entry in your model, which identifies the builder by the Builder call ID, and then provides inputs to that builder. Builders are the elements or components that actually perform the tasks of interacting with the WebApp object. These elements are the builder definition file (an XML file) and a Java Class. A builder can optionally have a coordinator class. This class coordinates the behavior of the builder interface you interact with through the Builder Editor. Modeling Unlike traditional development process utilizing pure Java, JSP, JavaScript coding, WEF enables developers to model their application. By modeling, WEF users actually define the instructions of how the tool will build the final intended application. The time-consuming, complex, and tedious coding and testing tasks have already been done by the creators of the builders. It is now left to the WEF developer to select the right builders and provide the right inputs to these builders in order to build the application. In this sense, WEF developers are actually modelers. A modeler works with a certain level of abstraction by not writing or interacting directly with the executable code. This is not to say that WEF developers do not have to understand or write some Java or eventually JavaScript code. It means that, when some code writing is necessary, the amount and complexity of this code is reduced as WEF does the bulk of the coding for you. There are many advantages to the modeling approach. Besides the fact that it significantly speeds the development process, it also manages changes to the underlying code, without requiring you to deal with low-level coding. You only change the instructions that generate your application. 
WEF handles all the intricacies and dependencies for you. In the software development lifecycle, requests to change requirements and functionality after implementation are very common. It is given that your application will change after you have coded it. So, be proactive by utilizing a tool, which efficiently and expeditiously handles these changes. WEF has been built with the right mechanism to graciously handle change request scenarios. That is because changing the instructions to build the code is much faster and easier than changing the code itself. Code generation versus software automation While software has been vastly utilized to automate an infinite number of processes in countless domains, very little has been done to facilitate and improve software automation itself. Prior to being a tool for building portlets, WEF exploits the quite dormant paradigm of software automation. It is beyond the scope of this book to discuss software automation in details, but it is suffice to say that builders, profiles, and the regeneration engine enable the automation of the process of creating software. In the particular case of WEF, the automation process targets web applications and portlets, but it keeps on expanding to other domains, such as widgets and mobile phones. WEF is not a code generation tool. While code generation tools utilize a static process mostly based on templates, WEF implements software automation to achieve not only high productivity but also variability. Profiles In the development world, the word profile can signify many things. From the WEF perspective, profile represents a means to provide variability to an application. WEF also enables profiles or profile entries to be exposed to external users. In this way, external users can modify predefined aspects of the application without assistance from development or redeployment of the application. The externalized elements are the builder input values. By externalizing the builder input values, line of business, administrators, and even users can change these values causing WEF to serve a new flavor of their application. Profile names, profile entry names, which map to the builder inputs, and their respective values are initially stored in an XML file with a .pset extention . This is part of your project and is deployed with your project. Once it is deployed, it can be stored in other persistence mechanisms, for example a database. WEF provides an interface to enable developers to create profile, define entries and their initial values, as well as define the mechanism that will select which profile to use at runtime. By selecting a profile, all the entry values associated with that profile will be applied to your application, providing an unlimited level of variability. Variability can be driven by personalization, configuration, LDAP attributes, roles, or it can even be explicitly set through the Java methods. The following screenshot shows the Manage Profile tab of Profile Manager. The Profile Manager enables you to manage every aspect related to profile sets. The top portion of this screenshot lists the three profiles available in this profile set. The bottom part of this screenshot shows the profile entries and their respective values for the selected profile:

Creating an Application from Scratch

Packt
13 Nov 2013
5 min read
(For more resources related to this topic, see here.) Creating an application Using the Command Line Tool, we are going to create a brand new application. This application is also going to be a Sinatra application that displays some basic date and time information. First, navigate to a new directory that will be used to contain the code. Everything in this directory will be uploaded to AppFog when we create the new application.

$ mkdir insideaf4
$ cd insideaf4

Now, create a new file called insideaf4.rb. The contents of the file should look like the following:

require 'sinatra'

get '/' do
  erb :index
end

This tells Sinatra to listen for requests to the base URL of / and then render the index page that we will create next. If you are using Ruby 1.8.7, you may need to add the following line at the top: require 'rubygems' Next, create a new directory called views under the insideaf4 directory:

$ mkdir views
$ cd views

Now we are going to create a new file under the views directory called index.erb. This file will be the one that displays the date and time information for our example. The following are the contents of the index.erb file:

<html><head>
<title>Current Time</title>
</head><body>
<% time = Time.new %>
<h1>Current Time</h1>
<table border="1" cellpadding="5">
  <tr>
    <td>Name</td>
    <td>Value</td>
  </tr>
  <tr>
    <td>Date (M/D/Y)</td>
    <td><%= time.strftime('%m/%d/%Y') %></td>
  </tr>
  <tr>
    <td>Time</td>
    <td><%= time.strftime('%I:%M %p') %></td>
  </tr>
  <tr>
    <td>Month</td>
    <td><%= time.strftime('%B') %></td>
  </tr>
  <tr>
    <td>Day</td>
    <td><%= time.strftime('%A') %></td>
  </tr>
</table>
</body></html>

This file will create a table that shows a number of different ways to format the date and time. Embedded in the HTML code are Ruby snippets that look like <%= %>. Inside of these snippets we use Ruby's strftime method to display the current date and time in a number of different string formats. At the beginning of the file, we create a new instance of a Time object, which is automatically set to the current time. Then we use the strftime method to display different values in the table. For more information on using Ruby dates, please see the documentation available at http://www.ruby-doc.org/core-2.0.0/Time.html. Testing the application Before creating an application in AppFog, it is useful to test it out locally first. To do this you will again need the Sinatra gem installed. If you need to do that, refer to Appendix, Installing the AppFog Gem. The following is the command to run your small application:

$ ruby insideaf4.rb

You will see the Sinatra application start and then you can navigate to http://localhost:4567/ in a browser. You should see a page that has the current date and time information like the following screenshot: To terminate the application, return to the command line and press Control+C. Publishing to AppFog Now that you have a working application, you can publish it to AppFog and create the new AppFog application. Before you begin, make sure you are in the root directory of your project. For this example that was the insideaf4 directory. Next, you will need to log in to AppFog.

$ af login
Attempting login to [https://api.appfog.com]
Email: [email protected]
Password: ********
Successfully logged into [https://api.appfog.com]

You may be asked for your e-mail and password again, but the tool may remember your session if you logged in recently. Now you can push your application to AppFog, which will create a new application for you. Make sure you are in the correct directory and use the Push command.
You will be prompted for a number of settings during the publishing process. In each case there will be a list of options along with a default. The default value will be listed with a capital letter or listed by itself in square brackets. For our purposes, you can just press Enter for each prompt to accept the default value. The only exception to that is the step that prompts you to choose an infrastructure. In that step you will need to make a selection. $ af push insideaf4 Would you like to deploy from the current directory? [Yn]: Detected a Sinatra Application, is this correct? [Yn]: 1: AWS US East - Virginia 2: AWS EU West - Ireland 3: AWS Asia SE - Singapore 4: HP AZ 2 - Las Vegas Select Infrastructure: Application Deployed URL [insideaf4.aws.af.cm]: Memory reservation (128M, 256M, 512M, 1G, 2G) [128M]: How many instances? [1]: Bind existing services to insideaf4? [yN]: Create services to bind to insideaf4? [yN]: Would you like to save this configuration? [yN]: Creating Application: OK Uploading Application: Checking for available resources: OK Packing application: OK Uploading (1K): OK Push Status: OK Staging Application insideaf4: OK Starting Application insideaf4: OK Summary Hence we have learned how to create application using Appfog. Resources for Article : Further resources on this subject: AppFog Top Features You Need to Know [Article] SSo, what is Node.js? [Article] Learning to add dependencies [Article]
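If you want to experiment with the strftime format strings used in index.erb before running the full Sinatra application, a few lines of standalone Ruby are enough. The following snippet is purely illustrative and is not one of the tutorial's files:

# strftime_demo.rb - quick check of the format strings used in index.erb
time = Time.new                    # a Time object set to the current time
puts time.strftime('%m/%d/%Y')     # date, for example 11/13/2013
puts time.strftime('%I:%M %p')     # 12-hour time with AM/PM
puts time.strftime('%B')           # full month name
puts time.strftime('%A')           # full day name

Run it with ruby strftime_demo.rb and compare the output with the table rendered by the application.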

HDFS and MapReduce

Packt
17 Jul 2014
11 min read
(For more resources related to this topic, see here.) Essentials of HDFS HDFS is a distributed filesystem that has been designed to run on top of a cluster of industry standard hardware. The architecture of HDFS is such that there is no specific need for high-end hardware. HDFS is a highly fault-tolerant system and can handle failures of nodes in a cluster without loss of data. The primary goal behind the design of HDFS is to serve large data files efficiently. HDFS achieves this efficiency and high throughput in data transfer by enabling streaming access to the data in the filesystem. The following are the important features of HDFS: Fault tolerance: Many computers working together as a cluster are bound to have hardware failures. Hardware failures such as disk failures, network connectivity issues, and RAM failures could disrupt processing and cause major downtime. This could lead to data loss as well slippage of critical SLAs. HDFS is designed to withstand such hardware failures by detecting faults and taking recovery actions as required. The data in HDFS is split across the machines in the cluster as chunks of data called blocks. These blocks are replicated across multiple machines of the cluster for redundancy. So, even if a node/machine becomes completely unusable and shuts down, the processing can go on with the copy of the data present on the nodes where the data was replicated. Streaming data: Streaming access enables data to be transferred in the form of a steady and continuous stream. This means if data from a file in HDFS needs to be processed, HDFS starts sending the data as it reads the file and does not wait for the entire file to be read. The client who is consuming this data starts processing the data immediately, as it receives the stream from HDFS. This makes data processing really fast. Large data store: HDFS is used to store large volumes of data. HDFS functions best when the individual data files stored are large files, rather than having large number of small files. File sizes in most Hadoop clusters range from gigabytes to terabytes. The storage scales linearly as more nodes are added to the cluster. Portable: HDFS is a highly portable system. Since it is built on Java, any machine or operating system that can run Java should be able to run HDFS. Even at the hardware layer, HDFS is flexible and runs on most of the commonly available hardware platforms. Most production level clusters have been set up on commodity hardware. Easy interface: The HDFS command-line interface is very similar to any Linux/Unix system. The commands are similar in most cases. So, if one is comfortable with the performing basic file actions in Linux/Unix, using commands with HDFS should be very easy. The following two daemons are responsible for operations on HDFS: Namenode Datanode The namenode and datanodes daemons talk to each other via TCP/IP. Configuring HDFS All HDFS-related configuration is done by adding/updating the properties in the hdfs-site.xml file that is found in the conf folder under the Hadoop installation folder. The following are the different properties that are part of the hdfs-site.xml file: dfs.namenode.servicerpc-address: This specifies the unique namenode RPC address in the cluster. Services/daemons such as the secondary namenode and datanode daemons use this address to connect to the namenode daemon whenever it needs to communicate. 
This property is shown in the following code snippet: <property> <name>dfs.namenode.servicerpc-address</name> <value>node1.hcluster:8022</value> </property> dfs.namenode.http-address: This specifies the URL that can be used to monitor the namenode daemon from a browser. This property is shown in the following code snippet: <property> <name>dfs.namenode.http-address</name> <value>node1.hcluster:50070</value> </property> dfs.replication: This specifies the replication factor for data block replication across the datanode daemons. The default is 3 as shown in the following code snippet: <property> <name>dfs.replication</name> <value>3</value> </property> dfs.blocksize: This specifies the block size. In the following example, the size is specified in bytes (134,217,728 bytes is 128 MB): <property> <name>dfs.blocksize</name> <value>134217728</value> </property> fs.permissions.umask-mode: This specifies the umask value that will be used when creating files and directories in HDFS. This property is shown in the following code snippet: <property> <name>fs.permissions.umask-mode</name> <value>022</value> </property> The read/write operational flow in HDFS To get a better understanding of HDFS, we need to understand the flow of operations for the following two scenarios: A file is written to HDFS A file is read from HDFS HDFS uses a single-write, multiple-read model, where the files are written once and read several times. The data cannot be altered once written. However, data can be appended to the file by reopening it. All files in the HDFS are saved as data blocks. Writing files in HDFS The following sequence of steps occur when a client tries to write a file to HDFS: The client informs the namenode daemon that it wants to write a file. The namenode daemon checks to see whether the file already exists. If it exists, an appropriate message is sent back to the client. If it does not exist, the namenode daemon makes a metadata entry for the new file. The file to be written is split into data packets at the client end and a data queue is built. The packets in the queue are then streamed to the datanodes in the cluster. The list of datanodes is given by the namenode daemon, which is prepared based on the data replication factor configured. A pipeline is built to perform the writes to all datanodes provided by the namenode daemon. The first packet from the data queue is then transferred to the first datanode daemon. The block is stored on the first datanode daemon and is then copied to the next datanode daemon in the pipeline. This process goes on till the packet is written to the last datanode daemon in the pipeline. The sequence is repeated for all the packets in the data queue. For every packet written on the datanode daemon, a corresponding acknowledgement is sent back to the client. If a packet fails to write onto one of the datanodes, the datanode daemon is removed from the pipeline and the remainder of the packets is written to the good datanodes. The namenode daemon notices the under-replication of the block and arranges for another datanode daemon where the block could be replicated. After all the packets are written, the client performs a close action, indicating that the packets in the data queue have been completely transferred. The client informs the namenode daemon that the write operation is now complete. 
The following diagram shows the data block replication process across the datanodes during a write operation in HDFS: Reading files in HDFS The following steps occur when a client tries to read a file in HDFS: The client contacts the namenode daemon to get the location of the data blocks of the file it wants to read. The namenode daemon returns the list of addresses of the datanodes for the data blocks. For any read operation, HDFS tries to return the node with the data block that is closest to the client. Here, closest refers to network proximity between the datanode daemon and the client. Once the client has the list, it connects the closest datanode daemon and starts reading the data block using a stream. After the block is read completely, the connection to datanode is terminated and the datanode daemon that hosts the next block in the sequence is identified and the data block is streamed. This goes on until the last data block for that file is read. The following diagram shows the read operation of a file in HDFS: Understanding the namenode UI Hadoop provides web interfaces for each of its services. The namenode UI or the namenode web interface is used to monitor the status of the namenode and can be accessed using the following URL: http://<namenode-server>:50070/ The namenode UI has the following sections: Overview: The general information section provides basic information of the namenode with options to browse the filesystem and the namenode logs. The following is the screenshot of the Overview section of the namenode UI: The Cluster ID parameter displays the identification number of the cluster. This number is same across all the nodes within the cluster. A block pool is a set of blocks that belong to a single namespace. The Block Pool Id parameter is used to segregate the block pools in case there are multiple namespaces configured when using HDFS federation. In HDFS federation, multiple namenodes are configured to scale the name service horizontally. These namenodes are configured to share datanodes amongst themselves. We will be exploring HDFS federation in detail a bit later. Summary: The following is the screenshot of the cluster's summary section from the namenode web interface: Under the Summary section, the first parameter is related to the security configuration of the cluster. If Kerberos (the authorization and authentication system used in Hadoop) is configured, the parameter will show as Security is on. If Kerberos is not configured, the parameter will show as Security is off. The next parameter displays information related to files and blocks in the cluster. Along with this information, the heap and non-heap memory utilization is also displayed. The other parameters displayed in the Summary section are as follows: Configured Capacity: This displays the total capacity (storage space) of HDFS. DFS Used: This displays the total space used in HDFS. Non DFS Used: This displays the amount of space used by other files that are not part of HDFS. This is the space used by the operating system and other files. DFS Remaining: This displays the total space remaining in HDFS. DFS Used%: This displays the total HDFS space utilization shown as percentage. DFS Remaining%: This displays the total HDFS space remaining shown as percentage. Block Pool Used: This displays the total space utilized by the current namespace. Block Pool Used%: This displays the total space utilized by the current namespace shown as percentage. 
As you can see in the preceding screenshot, in this case, the value matches that of the DFS Used% parameter. This is because there is only one namespace (one namenode) and HDFS is not federated. DataNodes usages% (Min, Median, Max, stdDev): This displays the usages across all datanodes in the cluster. This helps administrators identify unbalanced nodes, which may occur when data is not uniformly placed across the datanodes. Administrators have the option to rebalance the datanodes using a balancer. Live Nodes: This link displays all the datanodes in the cluster as shown in the following screenshot: Dead Nodes: This link displays all the datanodes that are currently in a dead state in the cluster. A dead state for a datanode daemon is when the datanode daemon is not running or has not sent a heartbeat message to the namenode daemon. Datanodes are unable to send heartbeats if there exists a network connection issue between the machines that host the datanode and namenode daemons. Excessive swapping on the datanode machine causes the machine to become unresponsive, which also prevents the datanode daemon from sending heartbeats. Decommissioning Nodes: This link lists all the datanodes that are being decommissioned. Number of Under-Replicated Blocks: This represents the number of blocks that have not replicated as per the replication factor configured in the hdfs-site.xml file. Namenode Journal Status: The journal status provides location information of the fsimage file and the state of the edits logfile. The following screenshot shows the NameNode Journal Status section: NameNode Storage: The namenode storage table provides the location of the fsimage file along with the type of the location. In this case, it is IMAGE_AND_EDITS, which means the same location is used to store the fsimage file as well as the edits logfile. The other types of locations are IMAGE, which stores only the fsimage file and EDITS, which stores only the edits logfile. The following screenshot shows the NameNode Storage information:
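Beyond the namenode UI, most day-to-day interaction with HDFS happens through the command-line interface mentioned earlier. The commands below are a minimal, illustrative sample; the paths and file names are placeholders, and they assume a client that is already configured to reach the cluster:

hdfs dfs -mkdir -p /user/hadoop/input                     # create a directory in HDFS
hdfs dfs -put localfile.txt /user/hadoop/input            # copy a local file into HDFS
hdfs dfs -ls /user/hadoop/input                           # list the directory contents
hdfs dfs -cat /user/hadoop/input/localfile.txt            # stream the file contents back
hdfs dfs -setrep -w 2 /user/hadoop/input/localfile.txt    # change the replication factor for one file
hdfs fsck /user/hadoop/input -files -blocks               # report blocks and replication for the path

As a quick worked example of the configuration values discussed above: with the default dfs.blocksize of 128 MB, a 1 GB file is split into eight blocks, and with dfs.replication set to 3 it occupies roughly 3 GB of raw datanode storage.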

Top reasons why businesses should adopt enterprise collaboration tools

Guest Contributor
05 Mar 2019
8 min read
Following the trends of the modern digital workplace, organizations apply automation even to the domains that are intrinsically human-centric. Collaboration is one of them. And if we can say that organizations have already gained broad experience in digitizing business processes while foreseeing potential pitfalls, the situation is different with collaboration. The automation of collaboration processes can bring a significant number of unexpected challenges even to those companies that have tested the waters. State of Collaboration 2018 reveals a curious fact: even though organizations can be highly involved in collaborative initiatives, employees still report that both they and their companies are poorly prepared to collaborate. Almost a quarter of respondents (24%) affirm that they lack relevant enterprise collaboration tools, while 27% say that their organizations undervalue collaboration and don't offer any incentives for them to support it. Two reasons can explain these stats: The collaboration process can be hardly standardized and split into precise workflows. The number of collaboration scenarios is enormous, and it’s impossible to get them all into a single software solution. It’s also pretty hard to manage collaboration, assess its effectiveness, or understand bottlenecks. Unlike business process automation systems that play a critical role in an organization and ensure core production or business activities, enterprise collaboration tools are mostly seen as supplementary solutions, so they are the last to be implemented. Moreover, as organizations often don’t spend much effort on adapting collaboration tools to their specifics, the end solutions are frequently subject to poor adoption. At the same time, the IT market offers numerous enterprise collaboration tools Slack, Trello, Stride, Confluence, Google Suite, Workplace by Facebook, SharePoint and Office 365, to mention a few, compete to win enterprises’ loyalty. But how to choose the right enterprise Collaboration tools and make them effective? Or how to make employees use the implemented enterprise Collaboration tools actively? To answer these questions and understand how to succeed in their collaboration-focused projects, organizations have to examine both tech- and employee-related challenges they may face. Challenges rooted in technologies From the enterprise Collaboration tools' deployment model to its customization and integration flexibility, companies should consider a whole array of aspects before they decide which solution they will implement. Selecting a technologically suitable solution Finding a proper solution is a long process that requires companies to make several important decisions: Cloud or on-premises? By choosing the deployment type, organizations define their future infrastructure to run the solution, required management efforts, data location, and the amount of customization available. Cloud solutions can help enterprises save both technical and human resources. However, companies often mistrust them because of multiple security concerns. On-premises solutions can be attractive from the customization, performance, and security points of view, but they are resource-demanding and expensive due to high licensing costs. Ready-to-use or custom? Today many vendors offer ready-made enterprise collaboration tools, particularly in the field of enterprise intranets. This option is attractive for organizations because they can save on customizing a solution from scratch. 
However, with ready-made products, organizations can face a bigger risk of following a vendor’s rigid politics (subscription/ownership price, support rates, functional capabilities, etc.). If companies choose custom enterprise collaboration software, they have a wider choice of IT service providers to cooperate with and adjust their solutions to their needs. One tool or several integrated tools? Some organizations prefer using a couple of apps that cover different collaboration needs (for example, document management, video conferencing, instant messaging). At the same time, companies can also go for a centralized solution, such as SharePoint or Office 365 that can support all collaboration types and let users create a centralized enterprise collaboration environment. Exploring integration options Collaboration isn’t an isolated process. It is tightly related to business or organizational activities that employees do. That’s why integration capabilities are among the most critical aspects companies should check before investing in their collaboration stack. Connecting an enterprise Collaboration tool to ERP, CRM, HRM, or ITSM solutions will not only contribute to the business process consistency but will also reduce the risk of collaboration gaps and communication inconsistencies. Planning ongoing investment Like any other business solution, an enterprise collaboration tool requires financial investment to implement, customize (even ready-made solutions require tuning), and support it. The initial budget will strongly depend on the deployment type, the estimated number of users, and needed customizations. While planning their yearly collaboration investment, companies should remember that their budgets should cover not only the activities necessary to ensure the solution’s technical health but also a user adoption program. Eliminating duplicate functionality Let’s consider the following scenario: a company implements a collaboration tool that includes the project management functionality, while they also run a legacy project management system. The same situation can happen with time tracking, document management, knowledge management systems, and other stand-alone solutions. In this case, it will be reasonable to consider switching to the new suite completely and depriving the legacy one. For example, by choosing SharePoint Server or Online, organizations can unite various functions within a single solution. To ensure a smooth transition to a new environment, SharePoint developers can migrate all the data from legacy systems, thus making it part of the new solution. Choosing a security vector As mentioned before, the solution’s deployment model dictates the security measures that organizations have to take. Sometimes security is the paramount reason that holds enterprises’ collaboration initiatives back. Security concerns are particularly characteristic of organizations that hesitate between on-premises and cloud solutions. SharePoint and Office 365 trends 2018 show that security represents the major worry for organizations that consider changing their on-premises deployments for cloud environments. What’s even more surprising is that while software providers, like Microsoft, are continually improving their security measures, the degree of concern keeps on growing. The report mentioned above reveals that 50% of businesses were concerned about security in 2018 compared to 36% in 2017 and 32% in 2016. 
Human-related challenges

Technology challenges are multiple, but they all can be solved quite quickly, especially if a company partners with a professional IT service provider that backs them up at the tech level. At the same time, companies should be ready to face employee-related barriers that may ruin their collaboration effort.

Changing employees' typical style of collaboration
Don't expect that your employees will welcome the new collaboration solution. It's about to change their typical collaboration style, which may be difficult for many. Some employees won't share their knowledge openly, while others will find it difficult to switch from one-to-one discussions to digitized team meetings. In this context, change management should work at two levels: a technological one and a mental one. Companies should not just explain to employees how to use the new solution effectively, but also show each team how to adapt the collaboration system to the needs of each team member without damaging the usual collaboration flow.

Finding the right tools for collaborators and non-collaborators
Every team consists of different personalities. Some people can be open to collaboration; others can be quite hesitant. The task is to ensure productive co-working between these two very different types of employees and everyone in between. Teams shouldn't wait for instant collaboration consistency or general satisfaction. These are only possible to achieve if the entire team works together to create an optimal collaboration area for each individual.

Launching digital collaboration within large distributed teams
When it comes to organizing collaboration within a small or medium-sized team, collaboration difficulties can be quite simple to avoid, as the collaboration flow is moderate. But when it comes to collaboration in big teams, the risk of failure increases dramatically. Organizing effective communication among remote employees, connecting distributed offices, offering relevant collaboration areas to the entire team and its subteams, and enabling cross-device consistency of collaboration are just a few of the steps to undertake for effective teamwork.

Preparing strategies to overcome adoption difficulties
The biggest human-related challenge is the poor adoption of an enterprise collaboration system. It can be hard for employees to get used to the new solution and accept the new communication medium, its UI, and its logic. Adoption issues are critical to address because they may engender more severe consequences than the tech-related ones. Say, if there is a functional defect in a solution, a company can fix it within a few days. However, if there are adoption issues, it means that all the efforts an organization puts into technology polishing can be blown away because their employees don't use the solution at all. Ongoing training and communication between the collaboration manager and particular teams is a must to keep employees satisfied with the solution they use.

Is there more pain than gain?
On recognizing all the challenges, companies might feel that there are too many barriers to overcome to get a decent collaboration solution. So maybe it's reasonable to stay away from the collaboration race? Is it the case? Not really. If you take a look at Internet Trends 2018, you will see that there are multiple improvements that companies get as they adopt enterprise collaboration tools. 
Typical advantages include reduced meeting time, quicker onboarding, less time required for support, more effective document management, and a substantial rise in teams’ productivity. If your company wants to get all these advantages, be brave to face the possible collaboration challenges to get a great reward. Author Bio Sandra Lupanova is SharePoint and Office 365 Evangelist at Itransition, a software development and IT consulting company headquartered in Denver. Sandra focuses on the SharePoint and Office 365 capabilities, challenges that companies face while adopting these platforms, as well as shares practical tips on how to improve SharePoint and Office 365 deployments through her articles.

Building Mobile Apps

Packt
21 Apr 2014
6 min read
(For more resources related to this topic, see here.) As mobile apps get closer to becoming the de-facto channel to do business on the move, more and more software vendors are providing easy to use mobile app development platforms for developers to build powerful HTML5, CSS, and JavaScript based apps. Most of these mobile app development platforms provide the ability to build native, web, and hybrid apps. There are several very feature rich and popular mobile app development toolkits available in the market today. Some of them worth mentioning are: Appery (http://appery.io) AppBuilder (http://www.telerik.com/appbuilder) Phonegap (http://phonegap.com/) Appmachine (http://www.appmachine.com/) AppMakr (http://www.appmakr.com/) (AppMakr is currently not starting new apps on their existing legacy platform. Any customers with existing apps still have full access to the editor and their apps.) Codiqa (https://codiqa.com) Conduit (http://mobile.conduit.com/) Apache Cordova (http://cordova.apache.org/) And there are more. The list is only a partial list of the amazing tools currently in the market for building and deploying mobile apps quickly. The Heroku platform integrates with the Appery.io (http://appery.io) mobile app development platform to provide a seamless app development experience. With the Appery.io mobile app development platform, the process of developing a mobile app is very straightforward. You build the user interface (UI) of your app using drag-and-drop from an available palette. The palette contains a rich set of user interface controls. Create the navigation flow between the different screens of the app, and link the actions to be taken when certain events such as clicking a button. Voila! You are done. You save the app and test it there using the Test menu option. Once you are done with testing the app, you can host the app using Appery's own hosting service or the Heroku hosting service. Mobile app development was never this easy. Introducing Appery.io Appery.io (http://www.appery.io) is a cloud-based mobile app development platform. With Appery.io, you can build powerful mobile apps leveraging the easy to use drag-and-drop tools combined with the ability to use client side JavaScript to provide custom functionality. Appery.io enables you to create real world mobile apps using built-in support for backend data stores, push notifications, server code besides plugins to consume third-party REST APIs and help you integrate with a plethora of external services. Appery.io is an innovative and intuitive way to develop mobile apps for any device, be it iOS, Windows or Android. Appery.io takes enterprise level data integration to the next level by exposing your enterprise data to mobile apps in a secure and straightforward way. It uses Exadel's (Appery.io's parent company) RESTExpress API to enable sharing your enterprise data with mobile apps through a REST interface. Appery.io's mobile UI development framework provides a rich toolkit to design the UI using many types of visual components required for the mobile apps including google maps, Vimeo and Youtube integration. You can build really powerful mobile apps and deploy them effortlessly using drag and drop functionality inside the Appery.io app builder. What is of particular interest to Heroku developers is Appery.io's integration with mobile backend services with option to deploy on the Heroku platform with the click of a button. 
This is a powerful feature wherein you do not need to install any software on your local machine and can build and deploy real-world mobile apps on cloud-based platforms such as Appery.io and Heroku. In this section, we create a simple mobile app and deploy it on Heroku. In doing so, we will also learn:

How to create a mobile UI form
How to configure your backend services (REST or database)
How to integrate your UI with backend services
How to deploy the app to Heroku
How to test your mobile app

We will also review the salient features of the Appery.io mobile app development platform, focusing on the ease of development of these apps and how one can easily leverage web services to deploy apps and consume data from any system.

Getting Appery.io

The cool thing about Appery.io is that it is a cloud-based mobile app development toolkit and can be accessed from any popular web browser. To get started, create an account at http://appery.io and you are all set. You can sign up for a basic starter version, which provides the ability to develop one app per account, and go all the way up to the paid Premium and Enterprise grade subscriptions.

Introducing the Appery.io app builder

The Appery.io app builder is a cloud-based mobile application development kit that can be used to build mobile apps for any platform. It consists of intuitive tooling and a rich controls palette to help developers drag and drop controls onto the device layout and design robust mobile apps. The Appery.io app builder has the following sections:

Device layout section: This section contains the mock layout of the device onto which the developer can drag and drop visual controls to create a user interface.
Palette: This contains a rich list of visual controls, such as buttons, text boxes, Google Maps controls, and more, that developers can use to build the user experience.
Project explorer: This section consists of many things, including project files, application-level settings/configuration, available themes for the device, custom components, available CSS and JavaScript code, templates, pop-ups, and one of the key elements—the available backend services.
Key menu options: Save and Test for the app being designed.
Page properties: This section consists of the design-level properties for the page being designed. Modifying these properties changes the user interface labels or the underlying properties of the page elements.
Events: This is another very important section of the app builder, containing the event-to-action associations for the various elements of the page. For example, it can contain the action to be taken when a click event happens on a button on the page.

The following Appery.io app builder screenshot highlights the various sections of the rich toolkit available for mobile app developers to build apps quickly:

Creating your first Appery.io mobile app

Building a mobile app is quite straightforward using Appery.io's feature-rich app development platform. To get started, create an Appery.io account at http://appery.io and log in using valid credentials:

Click on the Create new app link in the left section of your Appery.io account page.
Enter the new app name, for example, herokumobile app, and click on Create.
Enter the name of the first/launch page of your mobile app and click on Create page.

This creates the new Appery.io app and points the user to the Appery.io app builder to design the start page of the new mobile app.
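Although this walkthrough is entirely drag-and-drop, recall that Appery.io also lets you attach custom client-side JavaScript to events. Purely as a hedged illustration, and assuming the generated page uses jQuery (Appery.io apps of this generation are jQuery Mobile based), a custom click handler might look like the following; the selector is a placeholder you would adjust to your app's generated markup:

// Hypothetical selector for a button component; inspect your app's markup for the real one
$('[name="mobilebutton_1"]').on('click', function() {
    // Custom logic runs alongside the actions configured visually in the Events section
    alert('Hello from custom JavaScript!');
});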

Disaster Recovery for Hyper-V

Packt
29 Jan 2013
9 min read
(For more resources related to this topic, see here.)

Hyper-V and Windows Server 2012 come with tools and solutions to make sure that your virtual machines will be up, running, and highly available. Components such as Failover Cluster can ensure that your servers are accessible, even in case of failures. However, disasters can occur and bring all the servers and services offline. Natural disasters, viruses, data corruption, human errors, and many other factors can make your entire system unavailable. People think that High Availability (HA) is a solution for Disaster Recovery (DR) and that they can use it to replace DR. Actually, HA is only one component of a DR plan, which consists of processes, policies, procedures, a backup and recovery plan, documentation, tests, Service Level Agreements (SLA), best practices, and so on. The objective of DR is simply to maintain business continuity in case of any disaster. In a Hyper-V environment, we have options to utilize core components for a DR plan, such as Hyper-V Replica, which replicates your virtual machines to another host or cluster and makes them available if the first host is offline, or backup and restore, to bring VMs back in case you lose everything. This article will walk you through the most important processes for setting up disaster recovery for your virtual machines running on Hyper-V.

Backing up Hyper-V and virtual machines using Windows Server Backup

Previous versions of Hyper-V had complications and incompatibilities with the built-in backup tool, forcing administrators to acquire other solutions for backing up and restoring. Windows Server 2012 comes with a tool known as Windows Server Backup (WSB), which has full Hyper-V integration, allowing you to back up and restore your server, applications, Hyper-V, and virtual machines. WSB is easy to use and provides a low-cost option for small and medium companies. This recipe will guide you through the steps to back up your virtual machines using the Windows Server Backup tool.

Getting ready

Windows Server Backup does not support tapes. Make sure that you have a disk, external storage, or a network share with enough free space to back up your virtual machines before you start.

How to do it...

The following steps will show you how to install the Windows Server Backup feature and how to schedule a task to back up your Hyper-V settings and virtual machines:

To install the Windows Server Backup feature, open Server Manager from the taskbar.
In the Server Manager Dashboard, click on Manage and select Add Roles and Features. On the Before you begin page, click on Next four times.
Under the Add Roles and Features Wizard, select Windows Server Backup from the Features section, as shown in the following screenshot:
Click on Next and then click on Install. Wait for the installation to be completed.
After the installation, open the Start menu and type wbadmin.msc to open the Windows Server Backup tool.
To change the backup performance options, click on Configure Performance in the pane on the right-hand side of the Windows Server Backup console. In the Optimize Backup Performance window, we have three options to select from—Normal backup performance, Faster backup performance, and Custom, as shown in the following screenshot:
In the Windows Server Backup console, in the pane on the right-hand side, select the backup that you want to perform. The two available options are Backup Schedule, to schedule an automatic backup, and Backup Once, for a single backup.
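A one-off backup of specific virtual machines can also be launched from an elevated command prompt. This is a hedged sketch, assuming your Windows Server 2012 build supports the -hyperv option of wbadmin; the target drive and VM name are placeholders, not values from this recipe:

wbadmin start backup -backupTarget:E: -hyperv:"VM01" -quiet

Here, E: would be the backup volume and VM01 the virtual machine to protect; -quiet suppresses the confirmation prompt.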
The next steps will show how to schedule an automatic backup:

In the Backup Schedule Wizard, on the Getting Started page, click on Next.
In the Select Backup Configuration page, select Full Server to back up all the server data, or click on Custom to select specific items to back up. If you want to back up only Hyper-V and virtual machines, click on Custom and then Next.
In Select Items for Backup, click on Add Items.
In the Select Items window, select Hyper-V to back up all the virtual machines and the host component, as shown in the following screenshot. You can also expand Hyper-V and select the virtual machines that you want to back up. When finished, click on OK.
Back in the Select Items for Backup window, click on Advanced Settings to change Exclusions and VSS Settings.
In the Advanced Settings window, in the Exclusions tab, click on Add Exclusion to add any necessary exclusions.
Click on the VSS Settings tab to select either VSS full Backup or VSS copy Backup, as shown in the following screenshot. Click on OK.
In the Select Items for Backup window, confirm the items that will be backed up and click on Next.
In the Specify Backup Time page, select Once a day and the time for a daily backup, or select More than once a day and the times, and then click on Next.
In the Specify Destination Type page, select the option Back up to a hard disk that is dedicated for backups (recommended), Back up to a volume, or Back up to a shared network folder, as shown in the following screenshot, and click on Next. If you select the first option, the disk you choose will be formatted and dedicated to storing the backup data only.
In Select Destination Disk, click on Show All Available Disks to list the disks, select the one you want to use to store your backup, and click on OK. Click on Next twice. If you have selected the Back up to a hard disk that is dedicated for backups (recommended) option, you will see a warning message saying that the disk will be formatted. Click on Yes to confirm.
In the Confirmation window, double-check the options you selected and click on Finish, as shown in the following screenshot:

After that, the schedule will be created. Wait until the scheduled time has passed and check whether the backup finished successfully.

How it works...

Many Windows administrators used to miss the NTBackup tool from the old Windows Server 2003 days because of its capabilities and features. The Windows Server Backup tool, introduced in Windows Server 2008, has many limitations, such as no tape support, no advanced schedule options, fewer backup options, and so on. When we talk about Hyper-V in this regard, the problem is even worse: Windows Server 2008 has minimal support and features for it. In Windows Server 2012, the same tool is available with some limitations; however, it provides at least the core components to back up, schedule, and restore Hyper-V and your virtual machines. By default, WSB is not installed; the feature installation is done through Server Manager. After its installation, the tool can be accessed via the console or the command line. Before you start backing up your servers, it is good to configure the backup performance options you want to use. By default, all backups are created as normal, which creates a full backup of all the selected data. This is an interesting option when low amounts of data are backed up. You can also select the Faster backup performance option. This backs up only the changes between the last and the current backup, decreasing both the backup time and the amount of stored data.
This is a good option to save storage space and backup time for large amounts of data. A backup schedule can be created to automate your backup operations. In the Backup Schedule Wizard, you can back up your entire server or a custom selection of volumes, applications, or files. For backing up Hyper-V and its virtual machines, the best option is the customized backup, so that you don't have to back up the whole physical server. When Hyper-V is present on the host, the system shows Hyper-V, and you will be able to select all the virtual machines and the host component configuration to be backed up. During the wizard, you can also change the advanced options, such as exclusions and Volume Shadow Copy Services (VSS) settings. WSB has two VSS backup options—VSS full backup and VSS copy backup. When you opt for VSS full backup, everything is backed up, and after that, the application may truncate log files. If you are using other backup solutions that integrate with WSB, these logs are essential for future backups, such as incremental ones. To preserve the log files, you can use VSS copy backup so that other applications will not have problems with incremental backups. After selecting the items for backup, you have to select the backup time. This is another limitation compared to the old NTBackup: there are only two schedule options, namely Once a day or More than once a day. If you prefer to create a different backup schedule, such as weekly backups, you can use the WSB commandlets in PowerShell. Moving forward, in the backup destination type, you can select between a dedicated hard disk, a volume, or a network folder to save your backups in. When confirming all the items, the backup schedule will be ready to back up your system. You can also use the Backup Once option to create a single backup of your system.

There's more...

To check whether previous backups were successful or not, you can use the details option in the WSB console. These details can be used as logs to get more information about the last (previous), next, and all backups. To access these logs, open Windows Server Backup and, under Status, select View details. The following screenshot shows an example of the Last backup. To see which files were backed up, click on the View list of all backed up files link.

Checking the Windows Server Backup commandlets

Some options, such as advanced schedules, policies, jobs, and other configurations, can only be created through commandlets in PowerShell. To see all the available Windows Server Backup commandlets, type the following command:

Get-Command -Module WindowsServerBackup
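As an illustration of what these commandlets make possible, here is a minimal, hedged sketch that builds a one-off backup policy for a single virtual machine and runs it. It assumes the WindowsServerBackup module's policy cmdlets behave as on Windows Server 2012; the VM name and target volume are placeholders, not values from this recipe:

$policy = New-WBPolicy                                               ## new, empty backup policy
$vm = Get-WBVirtualMachine | Where-Object { $_.VMName -eq "VM01" }   ## "VM01" is a placeholder VM name
Add-WBVirtualMachine -Policy $policy -VirtualMachine $vm             ## back up only this VM
$target = New-WBBackupTarget -VolumePath "E:"                        ## "E:" is a placeholder volume
Add-WBBackupTarget -Policy $policy -Target $target
Start-WBBackup -Policy $policy                                       ## run the backup once

A recurring variant could combine a policy like this with Set-WBSchedule or an ordinary scheduled task.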
See also

The Restoring Hyper-V and virtual machines using Windows Server Backup recipe in this article


Modern Cloud Native architectures: Microservices, Containers, and Serverless - Part 2

Guest Contributor
14 Aug 2018
8 min read
This whitepaper is written by Mina Andrawos, an experienced engineer who has developed deep experience in the Go language and modern software architectures. He regularly writes articles and tutorials about the Go language, and also shares open source projects. Mina Andrawos has authored the book Cloud Native programming with Golang, which provides practical techniques, code examples, and architectural patterns required to build cloud native microservices in the Go language. He is also the author of the Mastering Go Programming and the Modern Golang Programming video courses. We published Part 1 of this paper yesterday; Part 2 covers containers and serverless applications. Let us get started.

Containers

The technology of software containers is the next key technology that needs to be discussed to practically explain cloud native applications. A container is simply the idea of encapsulating some software inside an isolated user space, or "container." For example, a MySQL database can be isolated inside a container, where the environmental variables and the configuration that it needs will live. Software outside the container will not see the environmental variables or configuration contained inside the container by default. Multiple containers can exist on the same local virtual machine, cloud virtual machine, or hardware server. Containers provide the ability to run numerous isolated software services, with all their configurations, software dependencies, runtimes, tools, and accompanying files, on the same machine. In a cloud environment, this ability translates into saved costs and effort, as the need for provisioning and buying server nodes for each microservice diminishes, since different microservices can be deployed on the same host without disrupting each other. Containers combined with microservices architectures are powerful tools to build modern, portable, scalable, and cost-efficient software. In a production environment, more than a single server node combined with numerous containers would be needed to achieve scalability and redundancy. Containers also add more benefits to cloud native applications beyond microservices isolation. With a container, you can move your microservices, with all the configuration, dependencies, and environmental variables they need, to fresh server nodes without the need to reconfigure the environment, achieving powerful portability. Due to the power and popularity of software container technology, some new operating systems, like CoreOS or Photon OS, are built from the ground up to function as hosts for containers. One of the most popular software container projects in the software industry is Docker. Major organizations such as Cisco, Google, and IBM utilize Docker containers in their infrastructure as well as in their products. Another notable project in the software containers world is Kubernetes. Kubernetes is a tool that allows the automation of deployment, management, and scaling of containers. It was built by Google to facilitate the management of their containers, which number in the billions per week. Kubernetes provides some powerful features, such as load balancing between containers, restarts for failed containers, and orchestration of the storage utilized by the containers. The project is part of the Cloud Native Computing Foundation, along with Prometheus.
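To make these ideas concrete, here is a small, hedged sketch using the Docker and Kubernetes command-line tools. The container names, password, and deployment name are placeholders, not part of the original paper:

# Run a MySQL database isolated inside a container; its configuration travels with it
docker run -d --name mysql-db -e MYSQL_ROOT_PASSWORD=changeme mysql:5.7

# A second, differently configured MySQL can run on the same host without conflict
docker run -d --name mysql-db2 -e MYSQL_ROOT_PASSWORD=changeme2 -p 3307:3306 mysql:5.7

# With Kubernetes, a containerized microservice can be scaled out with one command
kubectl scale deployment events-service --replicas=3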
Container complexities

In the case of containers, the task of managing them can sometimes get rather complex, for the same reasons as managing expanding numbers of microservices. As containers or microservices grow in number, there needs to be a mechanism to identify where each container or microservice is deployed, what its purpose is, and what resources it needs to keep running.

Serverless applications

Serverless architecture is a new software architectural paradigm that was popularized by the AWS Lambda service. In order to fully understand serverless applications, we must first cover an important concept known as 'Function as a Service', or FaaS for short. Function as a Service, or FaaS, is the idea that a cloud provider such as Amazon, or even a local piece of software such as Fission.io or funktion, provides a service where a user can request a function to run remotely in order to perform a very specific task; after the function concludes, the results are returned to the user. No services or stateful data are maintained, and the function code is provided by the user to the service that runs it. The idea behind properly designed cloud native production applications that utilize the serverless architecture is that, instead of building multiple microservices expected to run continuously in order to carry out individual tasks, you build an application that has fewer microservices combined with FaaS, where FaaS covers tasks that don't need a continuously running service. FaaS is a smaller construct than a microservice. For example, in the case of the event booking application we covered earlier, there were multiple microservices covering different tasks. If we use a serverless application model, some of those microservices would be replaced with a number of functions that serve the same purpose. Here is a diagram that showcases the application utilizing a serverless architecture: In this diagram, the event handler microservice as well as the booking handler microservice were replaced with a number of functions that produce the same functionality. This eliminates the need to run and maintain the two existing microservices. Serverless architectures have the advantage that no virtual machines and/or containers need to be provisioned to build the part of the application that utilizes FaaS. The computing instances that run the functions cease to exist, from the user's point of view, once their functions conclude. Furthermore, the number of microservices and/or containers that need to be monitored and maintained by the user decreases, saving cost, time, and effort. Serverless architectures provide yet another powerful software building tool in the hands of software engineers and architects to design flexible and scalable software. Well-known FaaS offerings are AWS Lambda by Amazon, Azure Functions by Microsoft, Cloud Functions by Google, and many more. Another definition of serverless applications is applications that utilize the BaaS, or backend as a service, paradigm. BaaS is the idea that developers only write the client code of their application, which then relies on several pre-built software services hosted in the cloud and accessible via APIs. BaaS is popular in mobile app programming, where developers rely on a number of backend services to drive the majority of the functionality of the application. Examples of BaaS services are Firebase and Parse.
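As a concrete illustration of the FaaS request/response model described above, here is a hedged sketch using the AWS CLI against Lambda. The function name and payload are placeholders (and the syntax shown is AWS CLI v1; v2 additionally needs --cli-binary-format raw-in-base64-out for raw JSON payloads):

# Ask the provider to run a function once; "book-event" is a placeholder function name
aws lambda invoke --function-name book-event --payload '{"eventId": "123", "seats": 2}' response.json

# The function's result comes back to the caller; no server is kept running in between
cat response.json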
Disadvantages of serverless applications

Like microservices and cloud native applications, the serverless architecture is not suitable for all scenarios. The functions provided by FaaS don't keep state by themselves, which means special considerations need to be observed when writing the function code. This is unlike a full microservice, where the developer has full control over the state. One approach to keeping state with FaaS, in spite of this limitation, is to propagate the state to a database or a memory cache like Redis. The startup times for the functions are not always fast, since there is time spent sending the request to the FaaS service provider and then, in some cases, the time needed to start a computing instance that runs the function. These delays have to be accounted for when designing serverless applications. FaaS do not run continuously like microservices, which makes them unsuitable for any task that requires continuous running of the software. Serverless applications also have the same limitation as other cloud native applications, where portability of the application from one cloud provider to another, or from the cloud to a local environment, becomes challenging because of vendor lock-in.

Conclusion

Cloud computing architectures have opened avenues for developing efficient, scalable, and reliable software. This paper covered some significant concepts in the world of cloud computing, such as microservices, cloud native applications, containers, and serverless applications. Microservices are the building blocks of most scalable cloud native applications; they decouple the application tasks into various efficient services. Containers are how microservices can be isolated and deployed safely to production environments without polluting them. Serverless applications decouple application tasks into smaller constructs, mostly called functions, that can be consumed via APIs. Cloud native applications make use of all those architectural patterns to build scalable, reliable, and always available software. You have read Part 2 of Modern Cloud Native architectures, a white paper by Mina Andrawos. Also read Part 1, which covers microservices and cloud native applications along with their advantages and disadvantages. If you are interested to learn more, check out Mina's Cloud Native programming with Golang to explore practical techniques for building cloud-native apps that are scalable, reliable, and always available.

About Author: Mina Andrawos

Mina Andrawos is an experienced engineer who has developed deep experience in Go from using it personally and professionally. He regularly authors articles and tutorials about the language, and also shares Go's open source projects. He has written numerous Go applications with varying degrees of complexity. Other than Go, he has skills in Java, C#, Python, and C++. He has worked with various databases and software architectures. He is also skilled in the agile methodology for software development. Besides software development, he has working experience of scrum mastering, sales engineering, and software product management.

Build Java EE containers using Docker [Tutorial]
Are containers the end of virtual machines?
Why containers are driving DevOps


Modern Cloud Native architectures: Microservices, Containers, and Serverless - Part 1

Guest Contributor
13 Aug 2018
9 min read
This whitepaper is written by Mina Andrawos, an experienced engineer who has developed deep experience in the Go language and modern software architectures. He regularly writes articles and tutorials about the Go language, and also shares open source projects. Mina Andrawos has authored the book Cloud Native programming with Golang, which provides practical techniques, code examples, and architectural patterns required to build cloud native microservices in the Go language. He is also the author of the Mastering Go Programming and the Modern Golang Programming video courses. This paper sheds some light on, and provides practical exposure to, some key topics in the modern software industry, namely cloud native applications. This includes microservices, containers, and serverless applications. The paper will cover the practical advantages and disadvantages of the technologies covered.

Microservices

The microservices architecture has gained a reputation as a powerful approach to architecting modern software applications. So what are microservices? Microservices can be described simply as the idea of separating the functionality required from a software application into multiple independent small software services, or "microservices." Each microservice is responsible for an individual, focused task. In order for microservices to collaborate to form a large, scalable application, they communicate and exchange data. Microservices were born out of the need to tame the complexity and inflexibility of "monolithic" applications. A monolithic application is a type of application where all the required functionality is coded together into the same service. For example, here is a diagram representing a monolithic events (concerts, shows, and so on) booking application that takes care of the booking payment processing and event reservation: The application can be used by a customer to book a concert or a show. A user interface will be needed. Furthermore, we will also need a search functionality to look for events, a bookings handler to process the user booking and then save it, and an events handler to help find the event, ensure it has seats available, and then link it to the booking. In a production-level application, more tasks will be needed, like payment processing for example, but for now let's focus on the four tasks outlined in the above figure. This monolithic application will work well with small to medium load. It will run on a single server, connect to a single database, and will probably be written in a single programming language. Now, what will happen if the business grows exponentially and hundreds of thousands or millions of users need to be handled and processed? Initially, the short-term solution would be to ensure that the server where the application runs has powerful enough hardware specifications to withstand higher loads, and if not, to add more memory, storage, and processing power to the server. This is called vertical scaling, which is the act of increasing the power of the hardware, like RAM and hard drive capacity, to run heavy applications. However, this is typically not sustainable in the long run as the load on the application continues to grow. Another challenge with monolithic applications is the inflexibility caused by being limited to only one or two programming languages. This inflexibility can affect the overall quality and efficiency of the application.
For example, Node.js is a popular JavaScript framework for building web applications, whereas R is popular for data science applications. A monolithic application makes it difficult to utilize both technologies, whereas in a microservices application, we can simply build a data science service written in R and a web service written in Node.js. The microservices version of the events application will take the below form: This application will be capable of scaling among multiple servers, a practice known as horizontal scaling. Each service can be deployed on a different server with dedicated resources, or in separate containers (more on that later). The different services can be written in different programming languages, enabling greater flexibility, and different dedicated teams can focus on different services, achieving more overall quality for the application. Another notable advantage of using microservices is the ease of continuous delivery, which is the ability to deploy software often and at any time. The reason why microservices make continuous delivery easier is that a new feature deployed to one microservice is less likely to affect other microservices, compared to monolithic applications.
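To make the shape of such a service concrete, here is a minimal, hedged sketch of one focused microservice in Go (the language used by this paper's companion book). The route, port, and sample data are placeholders, not a prescribed design:

package main

import (
	"encoding/json"
	"log"
	"net/http"
)

// Event is a minimal data model for a hypothetical events handler microservice.
type Event struct {
	ID   string `json:"id"`
	Name string `json:"name"`
}

func main() {
	// One focused task: serve event lookups over HTTP/JSON.
	http.HandleFunc("/events", func(w http.ResponseWriter, r *http.Request) {
		// A real service would query a data store; this is placeholder data.
		events := []Event{{ID: "1", Name: "Rock concert"}, {ID: "2", Name: "Comedy show"}}
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(events)
	})
	// The service is stateless, so more copies can run behind a load balancer (horizontal scaling).
	log.Fatal(http.ListenAndServe(":8080", nil))
}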
Issues with Microservices

One notable drawback of relying heavily on microservices is the fact that they can become too complicated to manage in the long run, as they grow in number and scope. There are approaches to mitigate this, such as utilizing monitoring tools like Prometheus to detect problems, using container technologies such as Docker to avoid polluting the host environments, and avoiding over-designing the services. However, these approaches take effort and time.

Cloud native applications

Microservices architectures are a natural fit for cloud native applications. A cloud native application is simply defined as an application built from the ground up for cloud computing architectures. This means that our application is cloud native if we design it as if it is expected to be deployed on a distributed and scalable infrastructure. For example, building an application with a redundant microservices architecture (we'll see an example shortly) makes the application cloud native, since this architecture allows our application to be deployed in a distributed manner that makes it scalable and almost always available. A cloud native application does not always need to be deployed to a public cloud like AWS; we can deploy it to our own distributed, cloud-like infrastructure instead, if we have one. In fact, what makes an application fully cloud native goes beyond just using microservices. Your application should employ continuous delivery, which is your ability to continuously deliver updates to your production applications without disruptions. Your application should also make use of services like message queues and technologies like containers and serverless (containers and serverless are important topics for modern software architectures, so they are covered in Part 2 of this paper). Cloud native applications assume access to numerous server nodes, to pre-deployed software services like message queues or load balancers, and to easy integration with continuous delivery services, among other things. If you deploy your cloud native application to a commercial cloud like AWS or Azure, your application gets the option to utilize cloud-only software services. For example, DynamoDB is a powerful database engine that can only be used on Amazon Web Services for production applications. Another example is the DocumentDB database in Azure. There are also cloud-only message queues, such as Amazon Simple Queue Service (SQS), which can be used to allow communication between microservices in the Amazon Web Services cloud.
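As an illustration of consuming such a cloud-only queue, here is a hedged AWS CLI sketch; the queue URL and message body are placeholders:

# One microservice hands work to another via SQS; the queue URL is a placeholder
aws sqs send-message --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/bookings --message-body '{"eventId": "123", "seats": 2}'

# A consuming microservice polls the same queue for work
aws sqs receive-message --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/bookings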
As mentioned earlier, cloud native microservices should be designed to allow redundancy between services. If we take the events booking application as an example, the application will look like this: Multiple server nodes would be allocated per microservice, allowing a redundant microservices architecture to be deployed. If the primary node or service fails for any reason, the secondary can take over, ensuring lasting reliability and availability for cloud native applications. This availability is vital for fault-intolerant applications such as e-commerce platforms, where downtime translates into large amounts of lost revenue. Cloud native applications provide great value for developers, enterprises, and startups. A notable tool worth mentioning in the world of microservices and cloud computing is Prometheus. Prometheus is an open source system monitoring and alerting tool that can be used to monitor complex microservices architectures and alert when an action needs to be taken. Prometheus was originally created by SoundCloud to monitor their systems, but then grew to become an independent project. The project is now a part of the Cloud Native Computing Foundation, a foundation tasked with building a sustainable ecosystem for cloud native applications.

Cloud native limitations

For cloud native applications, you will face some challenges if the need arises to migrate some or all of the application. That is due to multiple reasons, depending on where your application is deployed. For example, if your cloud native application is deployed on a public cloud like AWS, cloud native APIs are not cross-cloud-platform, so a DynamoDB database API utilized in an application will only work on AWS, but not on Azure, since DynamoDB belongs exclusively to AWS. The API will also never work in a local environment, because DynamoDB can only be utilized in AWS in production. Another reason is that there are assumptions made when some cloud native applications are built, like the fact that there will be a virtually unlimited number of server nodes to utilize when needed and that a new server node can be made available very quickly. These assumptions are sometimes hard to guarantee in a local data center environment, where real servers, networking hardware, and wiring need to be purchased. This brings us to the end of Part 1 of this whitepaper. Check out Part 2 tomorrow to learn about containers and serverless applications, along with their practical advantages and limitations.

About Author: Mina Andrawos

Mina Andrawos is an experienced engineer who has developed deep experience in Go from using it personally and professionally. He regularly authors articles and tutorials about the language, and also shares Go's open source projects. He has written numerous Go applications with varying degrees of complexity. Other than Go, he has skills in Java, C#, Python, and C++. He has worked with various databases and software architectures. He is also skilled in the agile methodology for software development. Besides software development, he has working experience of scrum mastering, sales engineering, and software product management.

Building microservices from a monolith Java EE app [Tutorial]
6 Ways to blow up your Microservices!
Have Microservices killed the monolithic architecture? Maybe not!

Ceph Instant Deployment

Packt
09 Feb 2015
14 min read
In this article by Karan Singh, author of the book Learning Ceph, we will cover the following topics:

Creating a sandbox environment with VirtualBox
From zero to Ceph – deploying your first Ceph cluster
Scaling up your Ceph cluster – monitor and OSD addition

(For more resources related to this topic, see here.)

Creating a sandbox environment with VirtualBox

We can test deploy Ceph in a sandbox environment using Oracle VirtualBox virtual machines. This virtual setup can help us discover and perform experiments with Ceph storage clusters as if we were working in a real environment. Since Ceph is open source, software-defined storage deployed on top of commodity hardware in a production environment, we can imitate a fully functioning Ceph environment on virtual machines, instead of real commodity hardware, for our testing purposes. Oracle VirtualBox is free software available at http://www.virtualbox.org for Windows, Mac OS X, and Linux. We must fulfil the system requirements for the VirtualBox software so that it can function properly during our testing. We assume that your host operating system is a Unix variant; on Microsoft Windows host machines, use an absolute path to run the VBoxManage command, which is by default C:\Program Files\Oracle\VirtualBox\VBoxManage.exe. The system requirements for VirtualBox depend upon the number and configuration of the virtual machines running on top of it. Your VirtualBox host requires an x86-type processor (Intel or AMD), a few gigabytes of memory (to run three Ceph virtual machines), and a couple of gigabytes of hard drive space. To begin with, we must download VirtualBox from http://www.virtualbox.org/ and then follow the installation procedure once it has been downloaded. We will also need to download the CentOS 6.4 Server ISO image from http://vault.centos.org/6.4/isos/. To set up our sandbox environment, we will create a minimum of three virtual machines; you can create even more machines for your Ceph cluster based on the hardware configuration of your host machine. We will first create a single VM and install the OS on it; after this, we will clone this VM twice. This will save us a lot of time and increase our productivity. Let's begin by performing the following steps to create the first virtual machine.

The VirtualBox host machine used throughout this demonstration is Mac OS X, which is a UNIX-type host. If you are performing these steps on a non-UNIX machine, that is, on a Windows-based host, keep in mind that the VirtualBox host-only adapter name will be something like VirtualBox Host-Only Ethernet Adapter #<adapter number>. Please run these commands with the correct adapter names. On Windows-based hosts, you can check the VirtualBox networking options in Oracle VM VirtualBox Manager by navigating to File | VirtualBox Settings | Network | Host-only Networks.

After the installation of the VirtualBox software, a network adapter is created that you can use, or you can create a new adapter with a custom IP.

For UNIX-based VirtualBox hosts:

# VBoxManage hostonlyif remove vboxnet1
# VBoxManage hostonlyif create
# VBoxManage hostonlyif ipconfig vboxnet1 --ip 192.168.57.1 --netmask 255.255.255.0

For Windows-based VirtualBox hosts:

# VBoxManage.exe hostonlyif remove "VirtualBox Host-Only Ethernet Adapter"
# VBoxManage.exe hostonlyif create
# VBoxManage.exe hostonlyif ipconfig "VirtualBox Host-Only Ethernet Adapter" --ip 192.168.57.1 --netmask 255.255.255.0

VirtualBox comes with a GUI manager.
If your host is running Linux OS, it should have an X-desktop environment (GNOME or KDE) installed. Open Oracle VM VirtualBox Manager and create a new virtual machine with the following specifications, using the GUI-based New Virtual Machine Wizard, or use the CLI commands mentioned at the end of every step:

1 CPU
1024 MB memory
10 GB x 4 hard disks (one drive for the OS and three drives for Ceph OSDs)
2 network adapters
CentOS 6.4 ISO attached to the VM

The following is the step-by-step process to create the virtual machine using CLI commands:

Create your first virtual machine:

# VBoxManage createvm --name ceph-node1 --ostype RedHat_64 --register
# VBoxManage modifyvm ceph-node1 --memory 1024 --nic1 nat --nic2 hostonly --hostonlyadapter2 vboxnet1

For Windows VirtualBox hosts:

# VBoxManage.exe modifyvm ceph-node1 --memory 1024 --nic1 nat --nic2 hostonly --hostonlyadapter2 "VirtualBox Host-Only Ethernet Adapter"

Create a CD drive and attach the CentOS ISO image to the first virtual machine:

# VBoxManage storagectl ceph-node1 --name "IDE Controller" --add ide --controller PIIX4 --hostiocache on --bootable on
# VBoxManage storageattach ceph-node1 --storagectl "IDE Controller" --type dvddrive --port 0 --device 0 --medium CentOS-6.4-x86_64-bin-DVD1.iso

Make sure you execute the preceding command from the same directory where you saved the CentOS ISO image, or specify the location where you saved it.

Create a SATA interface and the OS hard drive, and attach them to the VM; make sure the VirtualBox host has enough free space for creating VM disks. If not, select a host drive that has free space:

# VBoxManage storagectl ceph-node1 --name "SATA Controller" --add sata --controller IntelAHCI --hostiocache on --bootable on
# VBoxManage createhd --filename OS-ceph-node1.vdi --size 10240
# VBoxManage storageattach ceph-node1 --storagectl "SATA Controller" --port 0 --device 0 --type hdd --medium OS-ceph-node1.vdi

Create the first Ceph disk and attach it to the VM:

# VBoxManage createhd --filename ceph-node1-osd1.vdi --size 10240
# VBoxManage storageattach ceph-node1 --storagectl "SATA Controller" --port 1 --device 0 --type hdd --medium ceph-node1-osd1.vdi

Create the second Ceph disk and attach it to the VM:

# VBoxManage createhd --filename ceph-node1-osd2.vdi --size 10240
# VBoxManage storageattach ceph-node1 --storagectl "SATA Controller" --port 2 --device 0 --type hdd --medium ceph-node1-osd2.vdi

Create the third Ceph disk and attach it to the VM:

# VBoxManage createhd --filename ceph-node1-osd3.vdi --size 10240
# VBoxManage storageattach ceph-node1 --storagectl "SATA Controller" --port 3 --device 0 --type hdd --medium ceph-node1-osd3.vdi

Now, at this point, we are ready to power on our ceph-node1 VM. You can do this by selecting the ceph-node1 VM from Oracle VM VirtualBox Manager and clicking on the Start button, or you can run the following command:

# VBoxManage startvm ceph-node1 --type gui

As soon as you start your VM, it should boot from the ISO image. After this, you should install CentOS on the VM. If you are not already familiar with Linux OS installation, you can follow the documentation at https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Installation_Guide/index.html.
Once you have successfully installed the operating system, edit the network configuration of the machine:

Edit /etc/sysconfig/network and change the hostname parameter:

HOSTNAME=ceph-node1

Edit the /etc/sysconfig/network-scripts/ifcfg-eth0 file and add:

ONBOOT=yes
BOOTPROTO=dhcp

Edit the /etc/sysconfig/network-scripts/ifcfg-eth1 file and add:

ONBOOT=yes
BOOTPROTO=static
IPADDR=192.168.57.101
NETMASK=255.255.255.0

Edit the /etc/hosts file and add:

192.168.57.101 ceph-node1
192.168.57.102 ceph-node2
192.168.57.103 ceph-node3

Once the network settings have been configured, restart the VM and log in via SSH from your host machine. Also, test the Internet connectivity on this machine, which is required to download Ceph packages:

# ssh root@192.168.57.101

Once the network setup has been configured correctly, you should shut down your first VM so that we can make two clones of it. If you do not shut down your first VM, the cloning operation might fail.

Create a clone of ceph-node1 named ceph-node2:

# VBoxManage clonevm --name ceph-node2 ceph-node1 --register

Create a clone of ceph-node1 named ceph-node3:

# VBoxManage clonevm --name ceph-node3 ceph-node1 --register

After the cloning operation is complete, you can start all three VMs:

# VBoxManage startvm ceph-node1
# VBoxManage startvm ceph-node2
# VBoxManage startvm ceph-node3

Set up the ceph-node2 VM with the correct hostname and network configuration:

Edit /etc/sysconfig/network and change the hostname parameter:

HOSTNAME=ceph-node2

Edit the /etc/sysconfig/network-scripts/ifcfg-<first interface name> file and add:

DEVICE=<correct device name of your first network interface, check ifconfig -a>
ONBOOT=yes
BOOTPROTO=dhcp
HWADDR=<correct MAC address of your first network interface, check ifconfig -a>

Edit the /etc/sysconfig/network-scripts/ifcfg-<second interface name> file and add:

DEVICE=<correct device name of your second network interface, check ifconfig -a>
ONBOOT=yes
BOOTPROTO=static
IPADDR=192.168.57.102
NETMASK=255.255.255.0
HWADDR=<correct MAC address of your second network interface, check ifconfig -a>

Edit the /etc/hosts file and add:

192.168.57.101 ceph-node1
192.168.57.102 ceph-node2
192.168.57.103 ceph-node3

After performing these changes, restart the virtual machine to bring the new hostname into effect. The restart will also update your network configuration.

Set up the ceph-node3 VM with the correct hostname and network configuration:

Edit /etc/sysconfig/network and change the hostname parameter:

HOSTNAME=ceph-node3

Edit the /etc/sysconfig/network-scripts/ifcfg-<first interface name> file and add:

DEVICE=<correct device name of your first network interface, check ifconfig -a>
ONBOOT=yes
BOOTPROTO=dhcp
HWADDR=<correct MAC address of your first network interface, check ifconfig -a>

Edit the /etc/sysconfig/network-scripts/ifcfg-<second interface name> file and add:

DEVICE=<correct device name of your second network interface, check ifconfig -a>
ONBOOT=yes
BOOTPROTO=static
IPADDR=192.168.57.103
NETMASK=255.255.255.0
HWADDR=<correct MAC address of your second network interface, check ifconfig -a>

Edit the /etc/hosts file and add:

192.168.57.101 ceph-node1
192.168.57.102 ceph-node2
192.168.57.103 ceph-node3

After performing these changes, restart the virtual machine to bring the new hostname into effect; the restart will also update your network configuration. At this point, we have prepared three virtual machines and made sure each VM can communicate with the others.
They should also have access to the Internet to install Ceph packages.

From zero to Ceph – deploying your first Ceph cluster

To deploy our first Ceph cluster, we will use the ceph-deploy tool to install and configure Ceph on all three virtual machines. The ceph-deploy tool is a part of the Ceph software-defined storage and is used for easier deployment and management of your Ceph storage cluster. Since we created three virtual machines that run CentOS 6.4 and have connectivity to the Internet as well as private network connections, we will configure these machines as a Ceph storage cluster, as shown in the following diagram:

Configure ceph-node1 for SSH passwordless login to the other nodes. Execute the following command from ceph-node1; while configuring SSH, leave the passphrase empty and proceed with the default settings:

# ssh-keygen

Copy the SSH key IDs to ceph-node2 and ceph-node3 by providing their root passwords. After this, you should be able to log in to these nodes without a password:

# ssh-copy-id ceph-node2

Install and configure EPEL on all Ceph nodes. EPEL is the repository for installing extra packages for your Linux system; install it by executing the following command on all Ceph nodes:

# rpm -ivh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm

Make sure the baseurl parameter is enabled in the /etc/yum.repos.d/epel.repo file. The baseurl parameter defines the URL for extra Linux packages. Also make sure the mirrorlist parameter is disabled (commented out) in this file; problems have been observed during installation if the mirrorlist parameter is enabled in the epel.repo file. Perform this step on all three nodes.

Install ceph-deploy on the ceph-node1 machine by executing the following command from ceph-node1:

# yum install ceph-deploy

Next, we will create a Ceph cluster using ceph-deploy by executing the following commands from ceph-node1:

## Create a directory for ceph
# mkdir /etc/ceph
# cd /etc/ceph
# ceph-deploy new ceph-node1

The new subcommand of ceph-deploy deploys a new cluster with ceph as the cluster name, which is the default; it generates a cluster configuration file and keyring files. List the present working directory; you will find the ceph.conf and ceph.mon.keyring files.

In this testing, we will intentionally install the Emperor release (v0.72) of the Ceph software, which is not the latest release. Later in this book, we will demonstrate the upgrade from the Emperor to the Firefly release of Ceph. To install the Ceph software binaries on all the machines using ceph-deploy, execute the following command from ceph-node1:

# ceph-deploy install --release emperor ceph-node1 ceph-node2 ceph-node3

The ceph-deploy tool will first install all the dependencies, followed by the Ceph Emperor binaries. Once the command completes successfully, check the Ceph version on all the nodes, as follows:

# ceph -v

Create your first monitor on ceph-node1:

# ceph-deploy mon create-initial

Once monitor creation is successful, check your cluster status. Your cluster will not be healthy at this stage:

# ceph status

Create object storage devices (OSDs) on the ceph-node1 machine and add them to the Ceph cluster by executing the following steps:

List the disks on the VM:

# ceph-deploy disk list ceph-node1

From the output, carefully identify the disks (other than the OS-partition disks) on which we should create Ceph OSDs. In our case, the disk names will ideally be sdb, sdc, and sdd.
The disk zap subcommand will destroy the existing partition table and content of the disk. Before running the following command, make sure you use the correct disk device names:

# ceph-deploy disk zap ceph-node1:sdb ceph-node1:sdc ceph-node1:sdd

The osd create subcommand will first prepare the disk, that is, erase the disk and create a filesystem on it, which is xfs by default. Then, it will activate the disk's first partition as the data partition and the second partition as the journal:

# ceph-deploy osd create ceph-node1:sdb ceph-node1:sdc ceph-node1:sdd

Check the cluster status for the new OSD entries:

# ceph status

At this stage, your cluster will not be healthy. We need to add a few more nodes to the Ceph cluster so that it can set up a distributed, replicated object store and hence become healthy.

Scaling up your Ceph cluster – monitor and OSD addition

Now we have a single-node Ceph cluster. We should scale it up to make it a distributed, reliable storage cluster. To scale up a cluster, we should add more monitor nodes and OSDs. As per our plan, we will now configure the ceph-node2 and ceph-node3 machines as monitor as well as OSD nodes.

Adding the Ceph monitor

A Ceph storage cluster requires at least one monitor to run. For high availability, a Ceph storage cluster relies on an odd number of monitors greater than one, for example, 3 or 5, to form a quorum. It uses the Paxos algorithm to maintain quorum majority. Since we already have one monitor running on ceph-node1, let's create two more monitors for our Ceph cluster.

The firewall rules should not block communication between the Ceph monitor nodes. If they do, you need to adjust the firewall rules in order to let the monitors form a quorum. Since this is our test setup, let's disable the firewall on all three nodes. We will run these commands from the ceph-node1 machine, unless otherwise specified:

# service iptables stop
# chkconfig iptables off
# ssh ceph-node2 service iptables stop
# ssh ceph-node2 chkconfig iptables off
# ssh ceph-node3 service iptables stop
# ssh ceph-node3 chkconfig iptables off

Deploy a monitor on ceph-node2 and ceph-node3:

# ceph-deploy mon create ceph-node2
# ceph-deploy mon create ceph-node3

The deploy operation should be successful; you can then check your newly added monitors in the Ceph status. You might encounter warning messages related to clock skew on the new monitor nodes. To resolve this, set up Network Time Protocol (NTP) on the new monitor nodes:

# chkconfig ntpd on
# ssh ceph-node2 chkconfig ntpd on
# ssh ceph-node3 chkconfig ntpd on
# ntpdate pool.ntp.org
# ssh ceph-node2 ntpdate pool.ntp.org
# ssh ceph-node3 ntpdate pool.ntp.org
# /etc/init.d/ntpd start
# ssh ceph-node2 /etc/init.d/ntpd start
# ssh ceph-node3 /etc/init.d/ntpd start

Adding the Ceph OSD

At this point, we have a running Ceph cluster with three monitors and three OSDs. Now we will scale our cluster and add more OSDs. To accomplish this, we will run the following commands from the ceph-node1 machine, unless otherwise specified. We will follow the same method for OSD addition as before:

# ceph-deploy disk list ceph-node2 ceph-node3
# ceph-deploy disk zap ceph-node2:sdb ceph-node2:sdc ceph-node2:sdd
# ceph-deploy disk zap ceph-node3:sdb ceph-node3:sdc ceph-node3:sdd
# ceph-deploy osd create ceph-node2:sdb ceph-node2:sdc ceph-node2:sdd
# ceph-deploy osd create ceph-node3:sdb ceph-node3:sdc ceph-node3:sdd
# ceph status

Check the cluster status for the new OSDs.
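To verify the final state, the following checks are useful; ceph osd tree is not part of the original recipe, but it is a standard Ceph command for listing OSDs grouped by host:

## Overall cluster state; it should eventually report a healthy cluster
# ceph status
## Tree view of all OSDs and the hosts they live on
# ceph osd tree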
At this stage, your cluster will be healthy, with nine OSDs in and up.

Summary

The software-defined nature of Ceph provides a great deal of flexibility to its adopters. Unlike other proprietary storage systems, which are hardware dependent, Ceph can be easily deployed and tested on almost any computer system available today. Moreover, if getting physical machines is a challenge, you can use virtual machines to install Ceph, as mentioned in this article, but keep in mind that such a setup should only be used for testing purposes. In this article, we learned how to create a set of virtual machines using the VirtualBox software, followed by Ceph deployment as a three-node cluster using the ceph-deploy tool. We also added a couple of OSDs and monitor machines to our cluster in order to demonstrate its dynamic scalability. We recommend you deploy a Ceph cluster of your own using the instructions mentioned in this article.

Resources for Article:

Further resources on this subject:

Linux Shell Scripting - various recipes to help you [article]
GNU Octave: Data Analysis Examples [article]
What is Kali Linux [article]


The Software-defined Data Center

Packt
14 Nov 2016
33 min read
In this article by Valentin Hamburger, author of the book Building VMware Software-Defined Data Centers, we are introduced to the software-defined data center (SDDC), a term coined by VMware to describe the move to a cloud-like IT experience. The term software-defined is the important bit of information. It basically means that every key function in the data center is performed and controlled by software, instead of hardware. This opens a whole new way of operating, maintaining, but also innovating in a modern data center.

(For more resources related to this topic, see here.)

But what does a so-called SDDC look like, and why is a whole industry pushing so hard towards its adoption? This question might also be a reason why you are reading this article, which is meant to provide a deeper understanding of SDDC and give practical examples and hints on how to build and run such a data center. It will also build the skill of mapping business challenges to IT solutions, a practice which is becoming more and more important these days. IT has come a long way from a pure back-office, task-oriented role in the early days to a business-relevant asset which can help organizations compete with their competition. There has been a major shift from a pure infrastructure provider role to a business enablement function. Today, most organizations' business is only as good as their internal IT's agility and ability to innovate. There are many examples in various markets where a whole business branch was built on IT innovations, such as Netflix, Amazon Web Services, Uber, and Airbnb, just to name a few. However, it is unfair to compare any startup with a traditional organization. A startup has one application to maintain, and it has to build up a customer base. A traditional organization has a proven and wide customer base and many applications to maintain. So it needs to adapt its internal IT to become a digital enterprise, with all the flexibility and agility of a startup, while maintaining the trust in and control over its legacy services. This article will cover the following points:

Why there is a demand for SDDC in IT
What SDDC is
Understanding the business challenges and mapping them to SDDC deliverables
The relation of an SDDC to an internal private cloud
Identifying new data center opportunities and possibilities
Becoming a center of innovation to empower your organization's business

The demand for change

Today, organizations face various challenges in the market to stay relevant. The biggest change was clearly introduced by smartphones and tablets. These were not just computers in a smaller device; they changed the way IT is delivered and consumed by end users. These devices proved that it can be simple to consume and install applications: just search in an app store, choose what you like, and use it as long as you like it. If you do not need it any longer, simply remove it. All with very simplistic commands and easy-to-use gestures. More and more people rely on IT services by using a smartphone as their terminal to almost everything. These devices created a demand for fast and easy application and service delivery. So in a way, smartphones have not only transformed the whole mobile market, they have also transformed how modern applications and services are delivered from organizations to their customers.
Although it would be quite unfair to compare a large enterprise data center with an app store, or enterprise service delivery with an app install on a mobile device, there are startups and industries which rely solely on the smartphone as their target for services, such as Uber or WhatsApp. On the other side, smartphone apps also introduce a whole new way of delivering IT services, since a company never knows how many people will use an app simultaneously; in the backend, it still has to run web servers and databases to continuously provide content and data for these apps. This also introduces a new value model for all other companies. People have started to judge a company by the quality of its available smartphone apps, and have even started to migrate to companies offering better smartphone integration than their previous provider. This is not bound to a single industry, but affects a broad spectrum of industries today, such as the financial industry, car manufacturers, insurance groups, and even food retailers, just to name a few. A classic data center structure might not be ideal for quick and seamless service delivery. These architectures are created by projects to serve a particular use case for a couple of years. Examples of such bigger application environments are web server farms, traditional SAP environments, or a data warehouse. Traditionally, these were designed with an assumption about their growth and use. Special project teams have set them up across the data center pillars, as shown in the following figure. Typically, these project teams disband after the application environment has been completed. All these pillars in the data center are required to work together, but each of them also needs to mind its own business. Mostly, these different divisions also have their own processes, which then may integrate into a data-center-wide process. There was a good reason to structure a data center in this way: the simple fact that nobody can be an expert in every discipline. Companies started to create groups to operate certain areas of the data center, each building its own expertise in its own subject. This evolved and became the most widely applied model for IT operations within organizations. Many, if not all, bigger organizations have adopted this approach, and people build their careers on these definitions. It served IT well for decades and ensured that each party added its best knowledge to any given project. However, this setup has one flaw: it was not designed for massive change and scale. The bigger these divisions get, the slower they can react to requests from other groups in the data center. This introduces a bidirectional issue: since all groups may grow at a similar rate, the overall service delivery time might also increase exponentially. Unfortunately, this also introduces a cost factor when it comes to service deployments across these pillars. Each new service an organization might introduce or develop will require each area of IT to contribute. Traditionally, this is done by human handovers from one department to the other. Each of these handovers will delay the overall project time or service delivery time, which is often referred to as time to market. It reflects the time needed from the request of a new service to its actual delivery. It is important to mention that this is a level of complexity every modern organization has to deal with when it comes to application deployment today.
The difference between organizations might lie in the size of the separate units, but the principle is always the same. Most organizations try to bring their overall service delivery time down to become quicker and more agile, for business reasons as well as IT cost reasons. In some organizations, delivering a brand new service from request to final rollout may take 90 working days. This means a requestor might wait 18 weeks, just over four months, from requesting a new business service to its actual delivery. Do not forget that this reflects the complete service delivery, across all groups, until it is ready for production. Also, after these 90 days the requirements of the original request might have changed, which would mean repeating the entire process.

Often a quicker time to market is driven by the lines of business (LOB) owners to respond to a competitor in the market who might already deliver their services faster. This means that today's IT has changed from a pure internal service provider to a business enabler, supporting its organization in fighting the competition with advanced and innovative services. While this gives the IT department a great chance to enable and support its organization's business, it also introduces a threat at the same time. If the internal IT struggles to deliver what the business is asking for, it may lead to shadow IT within the organization.

The term shadow IT describes a situation where either the LOBs of an organization or its application developers have grown so disappointed with the internal IT delivery times that they use an external provider for their requirements. This behavior is not agreed with IT security and can lead to serious business or legal trouble. It happens more often than one might expect, and it can be as simple as putting some internal files on a public cloud storage provider. These services grant quick results: it is as simple as Register – Download – Use. They are very quick at enrolling new users and sometimes provide limited use for free. The developer or business owner might not even be aware that something non-compliant is going on while using these services.

So besides the business demand for quicker service delivery and the security aspect, an organization's IT department now also faces the pressure of staying relevant. But an SDDC can provide much more value to IT than just staying relevant. The automated data center is an enabler for innovation and trust and introduces a new era of IT delivery. It can not only provide faster service delivery to the business, it can also enable new services or offerings that help the whole organization be innovative for its customers and partners.

Business challenges – the use case

Today's business strategies often involve digital delivery of services of any kind. This implies that the requirements a modern organization has towards its internal IT have changed drastically. Unfortunately, in some organizations the business owners and the IT department tend to have communication issues; sometimes they even operate completely disconnected from each other, as if each of them were their own small company within the organization. Nevertheless, a lot of data center automation projects are driven by enhanced business requirements. In some of these cases, the IT department has not been made aware of what these business requirements look like, or even what the actual business challenges are.
Sometimes IT gets as little information as: "We are doing cloud now." This is a dangerous simplification, since the use case is key when it comes to designing and identifying the right solution to solve the organization's challenges. It is important to get the requirements from both sides: the IT delivery side as well as the business side with its expectations. Here is a simple example of how a use case might be identified and mapped to a technical implementation.

The business view

John works as a business owner in an insurance company. He recognizes that their biggest competitor in the market has started to offer a mobile application to their clients. The app is simple: it allows online contract management, tells clients which products they have enrolled in, and provides rich information about contract timelines and possible consolidation options. He asks his manager to start a project to deliver such an application to their own customers. Since it is only a simple smartphone application, he expects that its development might take a couple of weeks, after which a beta phase can start. To be competitive, he estimates that they should have something usable for their customers within a maximum of five months. Based on these facts, he gets approval from his manager to request such a product from the internal IT.

The IT view

Tom is the data center manager of this insurance company. He is informed that the business wants a smartphone application that offers all kinds of services to new and existing customers. He is responsible for creating a project and bringing all the necessary people on board to support it and finally deliver the service to the business. The programming of the app will be done by an external consulting company. Tom discusses a couple of questions regarding this request with his team:

- How many users do we need to serve?
- How much time do we need to create this environment?
- What is the expected level of availability?
- How much compute power/disk space might be required?

After a round of brainstorming and intense discussion, the team is still quite unsure how to answer these questions. For every question there are variables the team cannot predict. What if only a few of their thousands of users adopt the app and the middleware environment is oversized? What if user adoption rises within a couple of days and it is undersized? What if adoption falls, leaving the environment overprovisioned and therefore too expensive? Tom and his team conclude that they need a dynamic solution to serve the business request. He creates a mapping of possible technical capabilities to the use case and uses it to discuss with his CIO if and how it can be implemented.

Business challenge: an easy-to-use app to win new customers and keep existing ones.

- Question: How many users do we need to serve?
  IT capability: Dynamic scaling of the environment based on actual performance demand.
- Question: How much time do we need to create this environment?
  IT capability: To fulfill the expectations, the environment needs to be flexible: start small, scale big.
- Question: What is the expected level of availability?
  IT capability: Analytics and monitoring across all layers, including a possible self-healing approach.
- Question: How much compute power/disk space might be required?
  IT capability: Create compute nodes on demand, based on actual performance requirements, and introduce a capacity-on-demand model for the required resources.
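The first capability in this mapping, dynamic scaling based on actual demand, is the kind of logic an automation layer evaluates continuously. The following minimal Python sketch shows the idea; all sizing figures (users per node, minimum and maximum node counts) are invented assumptions for illustration, not measured values:

```python
import math

# Assumed sizing figures for illustration only -- real values would come
# from load testing and the monitoring system of the environment.
USERS_PER_WEB_NODE = 500      # concurrent users one web node can serve
MIN_NODES = 2                 # "start small": smallest HA-capable footprint
MAX_NODES = 20                # upper bound agreed with capacity management

def required_web_nodes(concurrent_users: int) -> int:
    """Derive the web tier size from actual demand instead of guessing upfront."""
    needed = math.ceil(concurrent_users / USERS_PER_WEB_NODE)
    return max(MIN_NODES, min(needed, MAX_NODES))

# Example: demand observed by monitoring at three points in time.
for users in (300, 4200, 12000):
    print(f"{users:>6} concurrent users -> {required_web_nodes(users)} web nodes")
```

In a real SDDC, the observed demand would come from the monitoring system and the resulting node count would drive an automated deployment workflow rather than a print statement.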
Given this table, Tom realizes that with their current data center structure it is quite difficult to deliver what the business is asking for. He has also received a couple of requirements from other departments which point in a similar direction. Based on these mappings, he identifies that they need to change their way of deploying services and applications. They will need a fair amount of automation, and they will have to span these functionalities across every data center department as a holistic approach, as shown in the following diagram:

In this example, Tom has actually identified a very strong use case for SDDC in his company. Based on the business requirements of a "simple" application, the whole IT delivery of this company needs to adapt. While this may sound like pure fiction, these are the challenges modern organizations face today. It is very important to identify the required capabilities for the entire data center and not just for a single department. You will also have to serve the legacy applications and bring them onto the new model, so it is important to find a solution which serves the new business case as well as the legacy applications. In the first stage of any SDDC introduction in an organization, it is key to always keep an eye on the big picture.

Tools to enable SDDC

There is a basic and broadly accepted definition of what an SDDC needs to offer. It can be considered the second evolutionary step after server virtualization. It offers an abstraction layer above the infrastructure components such as compute, storage, and network by using automation and tools such as a self-service catalog. In a way, it represents a virtualization of the whole data center, with the purpose of simplifying the request and deployment of complex services. Other capabilities of an SDDC are:

- Automated infrastructure/service consumption
- Policy-based service and application deployment
- Changes to services can be made easily and instantly
- All infrastructure layers are automated (storage, network, and compute)
- No human intervention is needed for infrastructure/service deployment
- A high level of standardization is used
- Business logic for chargeback or showback functionality

All of the preceding points define an SDDC technically. But it is important to understand that an SDDC is meant to solve the business challenges of the organization running it. That means that, based on the actual business requirements, each SDDC will serve a different use case. Of course, there is a basic setup you can adopt and roll out, but it is important to understand your organization's business challenges in order to prevent planning or design shortcomings.

To realize this functionality, an SDDC needs a couple of software tools. These are designed to work together to deliver a seamless environment. The different parts can be seen like gears in a watch, where each gear has an equally important role in making the clockwork function correctly. It is important to remember this when building your SDDC, since missing one part can make another very complex or even impossible to implement afterwards. This is the list of VMware tools building an SDDC:

- vRealize Business for Cloud
- vRealize Operations Manager
- vRealize Log Insight
- vRealize Automation
- vRealize Orchestrator
- vRealize Automation Converged Blueprint
- vRealize Code Stream
- VMware NSX
- VMware vSphere

vRealize Business for Cloud is a chargeback/showback tool. It can be used to track the cost of services as well as the cost of the whole data center.
Since the agility of an SDDC is much higher than that of a traditional data center, it is important to also track and show the cost of adding new services. This is not only important from a financial perspective; it also serves as a control mechanism to ensure users are not deploying uncontrolled services and leaving them running even when they are no longer required.

vRealize Operations Manager serves two main functions. One is to help with troubleshooting and analytics for the whole SDDC platform; it has an analytics engine which applies machine learning to the behavior of its monitored components. The other important function is capacity management. It is capable of providing what-if analyses and warns about possible resource shortages well before they occur. These functions also use the machine learning algorithms and become more accurate over time, which is very important in a dynamic environment where on-demand provisioning is granted.

vRealize Log Insight is a unified log management tool. It offers rich functionality and can search and profile a large amount of log files in seconds. It is recommended to use it as the universal log endpoint for all components in your SDDC, including all operating systems as well as applications and your underlying hardware. In the event of an error, it is much simpler to have a central log management system that is easily searchable and delivers results in seconds.

vRealize Automation (vRA) is the base automation tool. It provides the cloud portal for interacting with your SDDC. The portal offers the business logic such as service catalogs, service requests, approvals, and application life cycles. However, it relies strongly on vRealize Orchestrator for the technical automation part. vRA can also tap into external clouds to extend the internal data center; extending an SDDC this way is mostly referred to as hybrid cloud, and there are a couple of supported cloud offerings vRA can manage.

vRealize Orchestrator (vRO) provides the workflow engine and the technical automation part of the SDDC. It is literally the orchestrator of your new data center. vRO can easily be bound together with vRA to form a very powerful automation suite, where anything with an application programming interface (API) can be integrated. It is also required for integrating third-party solutions into your deployment workflows, such as a configuration management database (CMDB), IP address management (IPAM), or ticketing systems via IT service management (ITSM).

vRealize Automation Converged Blueprint, formerly known as vRealize Automation Application Services, is an add-on functionality to vRA which takes care of application installations. It can be used with pre-existing scripts (such as Windows PowerShell or Bash on Linux), but also with variables received from vRA. This makes it very powerful when it comes to on-demand application installations. This tool can also make use of vRO to provide even better capabilities for complex application installations.

vRealize Code Stream is an addition to vRA and serves specific use cases in the DevOps area of the SDDC. It can be used with various development frameworks such as Jenkins. It can also be used as a tool for developers to build and operate their own software test, QA, and deployment environments. Not only can developers build these separate stages, the migration from one stage to another can also be fully automated by scripts. This makes it a very powerful tool when it comes to staging and deploying modern and traditional applications within the SDDC.
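All of these tools are themselves driven through REST APIs, which is what makes the "anything with an API can be integrated" principle practical. As a minimal illustration, the following Python sketch submits a request for a catalog item over REST. It is a sketch only: the endpoint paths and the two-step template pattern are assumptions modeled loosely on the vRA 7.x consumer catalog API, and the host, tenant, and credentials are invented placeholders; consult the API reference of your vRA version before relying on any of this.

```python
import requests

VRA = "https://vra.corp.example"   # hypothetical vRA appliance URL
TENANT = "vsphere.local"           # hypothetical tenant name

def get_token(user: str, password: str) -> str:
    """Authenticate against vRA and return a bearer token (path assumed)."""
    r = requests.post(f"{VRA}/identity/api/tokens",
                      json={"username": user, "password": password,
                            "tenant": TENANT})
    r.raise_for_status()
    return r.json()["id"]

def request_catalog_item(token: str, item_id: str) -> str:
    """Request an entitled catalog item: fetch the request template,
    then POST it back (paths assumed)."""
    headers = {"Authorization": f"Bearer {token}",
               "Accept": "application/json"}
    base = (f"{VRA}/catalog-service/api/consumer/"
            f"entitledCatalogItems/{item_id}/requests")
    template = requests.get(f"{base}/template", headers=headers)
    template.raise_for_status()
    r = requests.post(base, headers=headers, json=template.json())
    r.raise_for_status()
    return r.json()["id"]   # ID of the submitted request, for status tracking

token = get_token("jdoe", "secret")                      # placeholder creds
print("Submitted request:", request_catalog_item(token, "cat-item-123"))
```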
VMware NSX is the network virtualization component. Given the complexity some applications and services might introduce, NSX provides a solid solution to help address it. The challenges include:

- Dynamic network creation
- Microsegmentation
- Advanced security
- Network function virtualization

VMware vSphere mostly forms the base infrastructure and is used as the hypervisor for server virtualization. You are probably familiar with vSphere and its functionality. However, since the SDDC introduces a change to your data center architecture, it is recommended to revisit some of the vSphere features and configurations. By using the full potential of vSphere, it is possible to save effort on the automation aspects as well as on the service/application deployment part of the SDDC.

This is your toolbox for building the platform of an automated data center. All of these tools bring tremendous value and possibilities, but they also introduce change. This change needs to be addressed as part of the overall SDDC design and installation effort. Embrace the change.

The implementation journey

While a big part of this article focuses on building and configuring the SDDC, it is important to mention that there are also non-technical aspects to consider. Creating a new way of operating and running your data center will always involve people, so it is important to briefly touch on this part of the SDDC as well. There are three major topics relevant for every successful SDDC deployment, as shown in the following image. As with the tools, these three disciplines need to work together to enable the change and make sure all benefits can be fully leveraged. The three categories are:

- People
- Process
- Technology

The process category

Data center processes are as established and settled as IT itself. They have come a long way, from the first operator tasks such as changing tapes or starting procedures, up to highly sophisticated processes ensuring that service deployment and management work as expected. However, some of these processes might not be fit for purpose anymore once automation is applied to a data center. To build an SDDC it is very important to revisit the data center processes and adapt them to work with the new automation tasks. The tools offer integration points into processes, but it is equally important to remove bottlenecks from the processes as well. Keep in mind that if you automate a bad process, the process will still be bad, just fully automated. So it is also necessary to revisit those processes so that they become lean and effective.

Remember Tom, the data center manager. He successfully identified that they need an SDDC to fulfill the business requirements and mapped the use case to IT capabilities. While this mapping mainly describes what IT needs to deliver technically, it also implies that the current IT processes need to adapt to this new delivery model.

The process change example in Tom's organization

If the compute department works on a service involving OS deployment, they need to fill out an Excel sheet with IP addresses and server names and send it to the networking department.
The network admins ensure that there is no double booking by reserving the IP address and approving the requested host name. After successfully proving the uniqueness of this data, name and IP are added to the organization's DNS server. The manual part of this process is no longer feasible once the data center enters the automation era: imagine that every time somebody orders a service involving a VM/OS deployment, the network department gets an e-mail containing the Excel sheet with the IP and host name combination, and the whole process stops until this step is manually finished. To overcome this, the process has to be changed to use an automated IPAM solution. The new process has to track IPs and host names programmatically to ensure there is no duplication within the entire data center, and after successfully checking the uniqueness of the data, it has to be added to the Domain Name System (DNS). A minimal sketch of such a programmatic check follows at the end of this section.

While this is a simple example involving one small process, normally a large number of processes need to be reviewed for a fully automated data center. This is a very important task and should not be underestimated, since it can be the differentiator between success and failure of an SDDC. Think about all the other processes in place which control the deploy/enable/install mechanics in your data center. Here is a small example list of questions to ask regarding established processes:

- What is our current IPAM/DNS process?
- Do we need to consider a CMDB integration?
- What is our current ticketing process (ITSM)?
- What is our process to get resources from network, storage, and compute?
- What OS/VM deployment process is currently in place?
- What is our process to deploy an application (handovers, steps, departments involved)?
- What does our current approval process look like?
- Do we need a technical approval to deliver a service?
- Do we need a business approval to deliver a service?
- What integration processes do we have for a service/application deployment? DNS, Active Directory (AD), Dynamic Host Configuration Protocol (DHCP), routing, Information Technology Infrastructure Library (ITIL), and so on

Regarding the approval questions: approvals are normally an exception to the automation, since they are meant to be manual in the first place (whether technical or business). If the answers to all the other example questions involve human interaction as well, consider changing those processes to be fully automated by the SDDC. Since human intervention creates waiting times, it has to be avoided during service deployments in an automated data center. Think of the robotic assembly lines today's car manufacturers use: the processes they have implemented, refined over decades of experience, are all designed to stop the line only in case of an emergency. The same holds true for the SDDC. Enable automated deployment through your processes, and stop the automation only in case of an emergency.

Identifying processes is the simple part; changing them is the tricky part. Keep in mind that this is an all-new model of IT delivery, so there is no golden way of doing it. Once you have committed to changing those processes, keep monitoring whether they truly fulfill their requirements. This leads to another process principle in the SDDC: Continual Service Improvement (CSI). Revisit what you have changed from time to time and make sure those processes still work as expected; if they don't, change them again.
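Returning to the IPAM example above, here is a minimal Python sketch of the programmatic uniqueness check that replaces the Excel handover. The registry is held in memory purely for illustration, and the DNS registration is a stub; a real implementation would call the APIs of the organization's IPAM and DNS systems.

```python
# Minimal sketch of a programmatic IPAM/DNS step replacing the Excel handover.
# The registry is in memory for illustration; a real implementation would
# query the organization's IPAM system and DNS service via their APIs.

class IpamRegistry:
    def __init__(self):
        self._by_ip = {}       # ip -> hostname
        self._by_name = {}     # hostname -> ip

    def reserve(self, hostname: str, ip: str) -> None:
        """Reserve a hostname/IP pair, guaranteeing uniqueness of both."""
        if ip in self._by_ip:
            raise ValueError(f"IP {ip} already booked by {self._by_ip[ip]}")
        if hostname in self._by_name:
            raise ValueError(f"Hostname {hostname} already in use")
        self._by_ip[ip] = hostname
        self._by_name[hostname] = ip
        self._register_dns(hostname, ip)

    def _register_dns(self, hostname: str, ip: str) -> None:
        # Placeholder: in practice this would be a dynamic DNS update or a
        # call to the DNS appliance's REST API.
        print(f"DNS A record added: {hostname} -> {ip}")

ipam = IpamRegistry()
ipam.reserve("web01.corp.example", "10.10.20.15")
ipam.reserve("web02.corp.example", "10.10.20.16")
```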
The people category

Since every data center is run by people, it is important to consider that a change of technology will also impact those people. There are claims that an SDDC can be run with only half the staff, since everything is automated. The truth is that an SDDC transforms IT roles in a data center: some classic roles might vanish, while others are added by this change. It is unrealistic to say that you can run an automated data center with half the staff, but it is realistic to say that your staff can concentrate on innovation and development instead of spending all of their time keeping the lights on. This is the change an automated data center introduces. It opens up the possibility for current administrators to evolve into more architecture- and design-focused roles.

The people example in Tom's organization

Currently there are two admins in the compute department working for Tom. They manage and maintain the virtual environment, which is largely VMware vSphere. They create VMs manually, deploy an OS via a network install routine (which was a requirement for physical installs, so they kept the process), and then hand the finished VMs over to the next department to complete the service they are meant for. Recently they have experienced a lot of demand for VMs, and each of them configures 10 to 12 VMs per day. Given this, they cannot concentrate on other aspects of their job, such as improving the OS deployment or the handover process.

At first glance it seems the SDDC might replace these two employees, since the tools will largely automate their work. But that is like saying a jackhammer will replace a construction worker. Actually, their roles shift towards a more architectural focus. They need to come up with a template for OS installations and improvements to further automate the deployment process. They might also need to add new services and components to the SDDC to keep fulfilling the business needs. So instead of creating all the VMs manually, they now focus on designing a blueprint that can be replicated as easily and efficiently as possible. While their tasks have changed, their workforce is still important for operating and running the SDDC. And given that they now focus on design and architectural tasks, they also have the time to introduce innovative functions and additions to the data center.

Keep in mind that an automated data center affects all departments in an IT organization. The tasks of the network and storage as well as the application and database teams will change too. In fact, in an SDDC it is quite impossible to keep operating the departments disconnected from each other, since a deployment affects all of them. This also implies that all of these departments will have admins shifting to higher-level functions in order to make the automation possible. In the industry, this shift is often referred to as operational transformation: not only do the tools have to be in place, you also have to change the way the staff operates the data center. In most cases organizations decide to form a so-called center of excellence (CoE) to administer and operate the automated data center. This virtual group of admins is very similar to a project group in a traditional data center; the difference is that these people should be permanently assigned to the CoE of an SDDC.
Typically, you might have one champion from each department taking part in this virtual team. Each person acts as an expert and ambassador for their department. With this principle, it can be ensured that decisions and overlapping processes are well defined and ready to function across the departments. As an ambassador, each participant should also advertise the new functionality within their department and enable their colleagues to fully support the new data center approach. Good technical expertise as well as good communication skills are important for each member of the CoE.

The technology category

This is the third aspect of the triangle for successfully implementing an SDDC in your environment. It is often the part which gets most of the attention, sometimes at the expense of the other two. However, all three topics need to be considered equally. Think of it like a three-legged chair: if one leg is missing, it can never stand.

The term technology does not only refer to the new tools required to deploy services. It also refers to already established technology which has to be integrated with the automation toolset (often referred to as third-party integration). This might be your AD, DHCP server, e-mail system, and so on. There might also be technology which does not enable or empower the data center automation, so instead of only thinking about adding tools, think about tools to be removed or replaced as well. This is a normal IT lifecycle task and has gone through many iterations already. Think of the fax machine or the telex: you probably do not use them anymore, as they have been replaced by e-mail and messaging.

The technology example in Tom's organization

The team uses some tools to make their daily work easier when it comes to new service deployments. One of them is a little graphical user interface for quickly adding content to AD. The admins use it to insert the host name and organizational unit and to create the computer account. It was meant to save admin time, since they don't have to open all the various menus in the AD configuration to accomplish these tasks. With automated service delivery, this has to be done programmatically: once a new OS is deployed, the deployment tool has to add it to AD, including all requirements. Since AD offers an API, this can easily be automated and integrated into the deployment workflow (a minimal sketch follows below). Instead of painfully integrating the graphical tool, this is now done by interfacing directly with the organization's AD, ultimately replacing the old graphical tool.

The automated deployment of a service across the entire data center requires a fair amount of communication; not in the traditional way, but machine-to-machine communication leveraging programmable interfaces. Using such APIs is another important aspect of the applied data center technologies. Most of today's data center tools, from backup all the way up to web servers, come with APIs. The better an API is documented, the easier the integration into the automation tool. In some cases you might need the vendors to support you with the integration of their tools. If you have identified a tool in the data center which offers no API or even command-line interface (CLI) option at all, try to find a way around this software or consider replacing it with a new tool. APIs are the equivalent of handovers in the manual world.
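As an illustration of the AD integration just described, here is a minimal sketch using the ldap3 Python library to create a computer account programmatically. The server name, service account, and OU path are invented placeholders; a production version would also set further attributes (such as userAccountControl) and validate certificates properly.

```python
from ldap3 import Server, Connection, NTLM

# All names and credentials below are hypothetical placeholders.
AD_SERVER = "dc01.corp.example"
AD_USER = "CORP\\svc-deploy"    # service account used by the automation
AD_PASSWORD = "********"

def create_computer_account(hostname: str) -> None:
    """Create a computer object in AD the way a deployment workflow would."""
    server = Server(AD_SERVER, use_ssl=True)
    conn = Connection(server, user=AD_USER, password=AD_PASSWORD,
                      authentication=NTLM, auto_bind=True)
    dn = f"CN={hostname.upper()},OU=Servers,DC=corp,DC=example"
    conn.add(dn,
             object_class=["top", "person", "organizationalPerson",
                           "user", "computer"],
             attributes={"sAMAccountName": f"{hostname.upper()}$"})
    if conn.result["description"] != "success":
        raise RuntimeError(f"AD add failed: {conn.result}")
    conn.unbind()

create_computer_account("web01")
```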
The better the communication works between the tools, the faster and easier a deployment will be completed. To coordinate and control all this communication, you need far more than scripts: this is a task for an orchestrator, which can run all the necessary integration workflows from a central point. This orchestrator acts like the conductor of a big orchestra and forms the backbone of your SDDC.

Why are these three topics so important?

The technology aspect closes the triangle and brings the people and process parts together. If the processes are not altered to fit the new deployment methods, automation will be painful and complex to implement. If the deployment stops at some point because the processes require manual intervention, the people will have to fill in this gap. They then have new roles but also need to maintain some of their old tasks to keep the process running. With such an unbalanced implementation of an automated data center, the workload for people can actually increase while the service delivery times may not dramatically decrease. This may lead to avoidance of the automated tasks, since manual intervention might be seen as faster by individual admins.

So it is very important to accept all three aspects as the main parts of the SDDC implementation journey. They all need to be addressed equally and thoughtfully to unveil the benefits and improvements an automated data center has to offer. Keep in mind that this truly is a journey: an SDDC is not implemented in days but in months. Given this, the implementation team in the data center also has this time to adapt themselves and their processes to this new way of delivering IT services. All necessary departments and their leads need to be involved in this procedure; an SDDC implementation is always a team effort.

Additional possibilities and opportunities

All the previously mentioned topics serve the sole goal of installing and using the SDDC within your data center. However, once you have the SDDC running, the real fun begins, since you can start to introduce additional functionality impossible for any traditional data center. Let's briefly touch on some of the possibilities from an IT view.

The self-healing data center

This is a concept where the automatic deployment of services is connected to a monitoring system. Once the monitoring system detects that a service or environment may be facing constraints, it can automatically trigger an additional deployment for this service to increase throughput. While this is application dependent, for infrastructure services it can become quite handy. Think of automated ESXi host deployments if compute power is becoming a constraint, or datastore deployments if disk space is running low. If this automation acts too aggressively for your organization, it can be used with an approval function: once the monitoring detects a shortcoming, it asks for approval to fix it with a deployment action. Instead of getting an e-mail from your monitoring system stating that a constraint was identified, you get an e-mail with the constraint and the resolving action; all you need to do is approve the action.
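A minimal sketch of this self-healing loop is shown below, with the monitoring query, the remediation workflow, and the approval step all stubbed out. The threshold and the datastore example are assumptions; in a VMware-based SDDC these roles would typically be played by vRealize Operations and a vRealize Orchestrator workflow.

```python
# Sketch of the self-healing loop: monitoring detects a constraint and either
# remediates automatically or asks a human for approval first.

APPROVAL_REQUIRED = True       # set False for fully automatic remediation
DATASTORE_FREE_MIN_GB = 500    # invented threshold for the example

def get_datastore_free_gb() -> int:
    # Stub: would query the monitoring system (e.g. vRealize Operations).
    return 310

def deploy_additional_datastore() -> None:
    # Stub: would call the orchestration workflow that provisions storage.
    print("Triggered workflow: provision additional datastore")

def request_approval(action: str, reason: str) -> bool:
    # Stub: would send an e-mail or open a ticket and wait for the answer.
    print(f"Approval requested for '{action}' because: {reason}")
    return True    # assume the operator approves in this example

free = get_datastore_free_gb()
if free < DATASTORE_FREE_MIN_GB:
    reason = f"free capacity {free} GB below threshold {DATASTORE_FREE_MIN_GB} GB"
    if not APPROVAL_REQUIRED or request_approval("add datastore", reason):
        deploy_additional_datastore()
```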
The self-scaling data center

A similar principle is to use a capacity management tool to predict the growth of your environment. If it approaches a trigger, the system can automatically generate an order letter containing all the components needed to satisfy the growing capacity demand. This can then be sent to finance or purchasing management for approval, and before you even run into any capacity constraints, the new gear might be available and ready to run. However, consider the regular turnaround time for ordering hardware, which affects how far in the future you have to set the trigger for such functionality.

Both of these opportunities are more than just nice-to-haves; they enable your data center to be truly flexible and proactive. Because an SDDC offers a high degree of agility, it also needs some self-monitoring to stay flexible and usable and to fulfill unpredicted demand.

Summary

In this article we discussed the main principles and definitions of an SDDC. It provided an overview of the opportunities and possibilities this new data center architecture offers, covered the changes introduced by this new approach, and discussed the implementation journey and its involvement of people, processes, and technology.

Resources for Article:

Further resources on this subject:

- VM, It Is Not What You Think! [article]
- Introducing vSphere vMotion [article]
- Creating a VM using VirtualBox - Ubuntu Linux [article]