How-To Tutorials


Python LDAP Applications: Part 4 - LDAP Schema

Packt
23 Oct 2009
8 min read
Using Schema Information

As with most LDAP servers, OpenLDAP provides schema access to LDAP clients. An LDAP schema defines object classes, attributes, matching rules, and other LDAP structures. In this article, we will take a brief look at what might be the most complex module in the Python-LDAP API, the ldap.schema module. This module provides programmatic access to the schema, using the LDAP subschema record and the subschema's subentries obtained from the LDAP server.

The module has two major components. The first is the SubSchema object, which contains the schema definition and provides numerous functions for navigating through the definitions stored in the schema. The second component is the model, which contains classes that describe structural components (Schema Elements) of the schema. For example, the model contains classes like ObjectClass, AttributeType, MatchingRule, and DITContentRule.

Getting the Schema from the LDAP Server

The ldap.schema module does not automatically retrieve the schema information. It must be fetched from the server with an LDAP search operation. The schema is always stored in a specific entry, almost always accessible with the DN cn=subschema. (If it is elsewhere, and is accessible, the Root DSE will note the location.) We can retrieve the record by doing a search with a base scope:

>>> res = l.search_s('cn=subschema',
...        ldap.SCOPE_BASE,
...        '(objectclass=*)',
...        ['*','+']
...        )
>>> subschema_entry = ldaphelper.get_search_results(res)[0]
>>> subschema_subentry = subschema_entry.get_attributes()
>>>

The search configuration above should return only one record – the record for cn=subschema. Because most of the schema attributes are operational attributes, we need to specify, in the list of attributes, both * for all regular attributes and + for all operational attributes.

The ldaphelper.get_search_results() function we created early in this series returns a list of LDAPSearchResult objects. Since we know that we want the first one (in a list of one), we can use the [0] notation at the end to return just the first item in the resulting list. Now, subschema_entry contains the LDAPSearchResult object for the cn=subschema record. We need the list of attributes – namely, the schema-defining attributes, usually called the subschema subentry. We can use the get_attributes() method to retrieve the dict of attributes. Now we have the information necessary for creating a new SubSchema object.

The SubSchema Object

The SubSchema object provides access to the details of the schema definitions. The SubSchema() constructor takes one parameter: a dictionary of attributes that contains the subschema subentry information. This is the information we retrieved and stored in the subschema_subentry variable above. Creating a new SubSchema object is done like this:

>>> subschema = ldap.schema.SubSchema( subschema_subentry )
>>>

Now we can access the schema information. We can, for instance, get the schema information for the cn attribute:

>>> cn_attr = subschema.get_obj( ldap.schema.AttributeType, 'cn' )
>>> cn_attr.names
('cn', 'commonName')
>>> cn_attr.desc
'RFC2256: common name(s) for which the entity is known by'
>>> cn_attr.oid
'2.5.4.3'
>>>

The first line employs the get_obj() method to retrieve an AttributeType object. The call to get_obj() above uses two parameters. The first is the class (a subclass of SchemaElement) that represents an attribute. This is ldap.schema.AttributeType. If we were getting an object class instead of an attribute, we would use the same method, but pass an ldap.schema.ObjectClass as the first parameter. The second parameter is a string name (or OID) of the attribute. We could have used 'commonName' or '2.5.4.3' and attained the same result.

The cn_attr object (an instance of an AttributeType class) has a number of properties representing schema statements. For example, in the example above, the names property contains a tuple of the attribute names for that attribute, and the desc property contains the value of the description, as specified in the schema. The oid attribute contains the Object Identifier (OID) for the CN attribute.

Let's look at one more method of the SubSchema class before moving on to the final script in this article. Using the attribute_types() method of the SubSchema class, we can find out what attributes are required for a record, and what attributes are allowed. For example, consider a record that has the object classes account and simpleSecurityObject. The uid=authenticate,ou=system,dc=example,dc=com entry in our directory information tree is an example of such a user. We can use the attribute_types() method to get information about what attributes this record can or must have:

>>> oc_list = ['account', 'simpleSecurityObject']
>>> oc_attrs = subschema.attribute_types( oc_list )
>>> must_attrs = oc_attrs[0]
>>> may_attrs = oc_attrs[1]
>>>
>>> for ( oid, attr_obj ) in must_attrs.iteritems():
...     print "Must have %s" % attr_obj.names[0]
...
Must have userPassword
Must have objectClass
Must have uid
>>> for ( oid, attr_obj ) in may_attrs.iteritems():
...     print "May have %s" % attr_obj.names[0]
...
May have o
May have ou
May have seeAlso
May have description
May have l
May have host
>>>

The oc_list list has the names of the two object classes in which we are interested: account and simpleSecurityObject. Passing this list to the attribute_types() method, we get a two-item tuple. The first item in the tuple is a dictionary of required attributes. The key in the dictionary is the OID:

>>> must_attrs.keys()
['2.5.4.35', '2.5.4.0', '0.9.2342.19200300.100.1.1']
>>>

The value in the dictionary is an AttributeType object corresponding to the attribute defined for the OID key:

>>> must_attrs['2.5.4.35'].oid
'2.5.4.35'
>>>

In the code snippet above, we assigned each value in the two-item tuple to a different variable: must_attrs contains the first item in the tuple – the dictionary of must-have attributes. may_attrs contains a dictionary of the attributes that are allowed, but not required.

Iterating through the dictionaries and printing the output, we can see that the required attributes for a record that used both the account and the simpleSecurityObject object classes would be userPassword, objectClass, and uid. Several other attributes are allowed, but not required: o, ou, seeAlso, description, l, and host. We could find out which object class definitions required or allowed which of these attributes using the get_obj() method we looked at above:

>>> oc_obj = subschema.get_obj( ldap.schema.ObjectClass, 'account' )
>>> oc_obj.may
('description', 'seeAlso', 'localityName', 'organizationName', 'organizationalUnitName', 'host')
>>> oc_obj.must
('userid',)
>>>
>>> oc_obj = subschema.get_obj( ldap.schema.ObjectClass,
...        'simpleSecurityObject' )
>>> oc_obj.must
('userPassword',)
>>> oc_obj.may
()
>>>

From the above, we can see that most of the required and optional attributes come from the account definition, while only userPassword comes from the simpleSecurityObject definition. The requirement of the objectClass attribute comes from the top object class, the ultimate ancestor of all structural object classes.

The schema support offered by the Python-LDAP API makes it possible to program schema-aware clients that can, for instance, perform client-side schema checking, dynamically build forms for creating records, or compare definitions between different LDAP servers on a network.

Unfortunately, the ldap.schema module is poorly documented. For most of the module, the best source of information is the __doc__ strings embedded in the code:

>>> print ldap.schema.SubSchema.attribute_types.__doc__

    Returns a 2-tuple of all must and may attributes including
    all inherited attributes of superior object classes
    by walking up classes along the SUP attribute.

    The attributes are stored in a ldap.cidict.cidict dictionary.

    object_class_list
        list of strings specifying object class names or OIDs
    attr_type_filter
        list of 2-tuples containing lists of class attributes
        which has to be matched
    raise_keyerror
        All KeyError exceptions for non-existent schema elements
        are ignored
    ignore_dit_content_rule
        A DIT content rule governing the structural object class
        is ignored

>>>

In some cases, though, the best source of documentation is the code itself. The last script in this article will provide an example of how the schema information can be used.

An Example Script: suggest_attributes.py

This example script compares the attributes in a user-specified record with the possible attributes, and prints out an annotated list of "suggested" available attributes. This script is longer than the other scripts in this article, but it makes use of similar techniques, and we will be able to move through it quickly.
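As a rough idea of the technique, the short sketch below is my own illustration, not the article's actual suggest_attributes.py script. It uses attribute_types() to compare an entry's current attributes against the schema's required and allowed attributes. It assumes subschema was built as shown above and that entry_attrs is the attribute dictionary returned by get_attributes(); the function name and its internals are hypothetical.

# Sketch only: compare an entry's attributes with what the schema allows.
# Assumes `subschema` is an ldap.schema.SubSchema built as shown above and
# `entry_attrs` is the dict returned by LDAPSearchResult.get_attributes().
def suggest_attributes(subschema, entry_attrs):
    # LDAP attribute names are case-insensitive, so compare in lower case.
    present = set(name.lower() for name in entry_attrs.keys())
    # The key for the object class list may vary in case depending on the server.
    oc_list = entry_attrs.get('objectClass', entry_attrs.get('objectclass', []))
    must_attrs, may_attrs = subschema.attribute_types(oc_list)
    for oid, attr_obj in must_attrs.iteritems():
        name = attr_obj.names[0]
        if name.lower() not in present:
            print "MISSING required attribute: %s" % name
    for oid, attr_obj in may_attrs.iteritems():
        name = attr_obj.names[0]
        if name.lower() not in present:
            print "Suggested optional attribute: %s" % name

A fuller script could also annotate each suggestion with the description from the corresponding AttributeType object, much as the cn example above read the desc property.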


Deploying a Vert.x application

Packt
24 Sep 2013
12 min read
(For more resources related to this topic, see here.)

Setting up an Ubuntu box

We are going to set up an Ubuntu virtual machine using the Vagrant tool. This virtual machine will simulate a real production server. If you already have an Ubuntu box (or a similar Linux box) handy, you can skip this step and move on to setting up a user.

Vagrant (http://www.vagrantup.com/) is a tool for managing virtual machines. Many people use it to manage their development environments so that they can easily share them and test their software on different operating systems. For us, it is a perfect tool to practice Vert.x deployment into a Linux environment.

Install Vagrant by heading to the Downloads area at http://vagrantup.com and selecting the latest version. Select a package for your operating system and run the installer. Once it is done you should have a vagrant command available on the command line as follows:

vagrant -v

Navigate to the root directory of our project and run the following command:

vagrant init precise64 http://files.vagrantup.com/precise64.box

This will generate a file called Vagrantfile in the project folder. It contains configuration for the virtual machine we're about to create. We initialized a precise64 box, which is shorthand for the 64-bit version of Ubuntu 12.04 Precise Pangolin.

Open the file in an editor and find the following line:

# config.vm.network :private_network, ip: "192.168.33.10"

Uncomment the line by removing the # character. This will enable private networking for the box. We will be able to conveniently access it with the IP address 192.168.33.10 locally.

Run the following command to download, install, and launch the virtual machine:

vagrant up

This command launches the virtual machine configured in the Vagrantfile. On first launch it will also download it. Because of this, running the command may take a while. Once the command is finished you can check the status of the virtual machine by running vagrant status, suspend it by running vagrant suspend, bring it back up by running vagrant up, and remove it by running vagrant destroy.

Setting up a user

For any application deployment, it's a good idea to have an application-specific user configured. The sole purpose of the user is to run the application. This gives you a nice way to control permissions and make sure the application can only do what it's supposed to.

Open a shell connection to our Linux box. If you followed the steps to set up a Vagrant box, you can do this by running the following command in the project root directory:

vagrant ssh

Add a new user called mindmaps using the following command:

sudo useradd -d /home/mindmaps -m mindmaps

Also specify a password for the new user using the following command (and make a note of the password you choose; you'll need it):

sudo passwd mindmaps

Install Java on the server

Install Java for the Linux box, as described in Getting Started with Vert.x. As a quick reminder, Java can be installed on Ubuntu with the following command:

sudo apt-get install openjdk-7-jdk

On fresh Ubuntu installations, it is a good idea to always make sure the package manager index is up-to-date before installing any packages. This is also the case for our Ubuntu virtual machine. Run the following command if the Java installation fails:

sudo apt-get update

Installing MongoDB on the server

We also need MongoDB to be installed on the server, for persisting the mind maps.

Setting up privileged ports

Our application is configured to serve requests on port 8080. When we deploy to the Internet, we don't want users to have to know anything about ports, which means we should deploy our app to the default HTTP port 80 instead. On Unix systems (such as Linux) port 80 can only be bound to by the root user. Because it is not a good idea to run applications as the root user, we should set up a special privilege for the mindmaps user to bind to port 80. We can do this with the authbind utility. authbind is a Linux utility that can be used to bind processes to privileged ports without requiring root access.

Install authbind using the package manager with the following command:

sudo apt-get install authbind

Set up a privilege for the mindmaps user to bind to port 80 by creating a file in the authbind configuration directory with the following commands:

cd /etc/authbind/byport/
sudo touch 80
sudo chown mindmaps:mindmaps 80
sudo chmod 700 80

When authbind is run, it checks from this directory whether there is a file corresponding to the used port and whether the current user has access to it. Here we have created such a file.

Many people prefer to have a web server such as Nginx or Apache as a frontend and not expose backend services to the Internet directly. This can also be done with Vert.x. In that case, you could just deploy Vert.x to port 8080 and skip the authbind configuration. Then, you would need to configure reverse proxying for the Vert.x application in your web server. Note that we are using the event bus bridge in our application, and that uses HTTP WebSockets as the transport mechanism. This means the front-end web server must be able to also proxy WebSocket traffic. Nginx is able to do this starting from version 1.3 and Apache from version 2.4.5.

Installing Vert.x on the server

Switch to the mindmaps user in the shell on the virtual machine using the following command:

sudo su - mindmaps

Install Vert.x for this user, as described in Getting Started with Vert.x. As a quick reminder, it can be done by downloading and unpacking the latest distribution from http://vertx.io.

Making the application port configurable

Let's move back to our application code for a moment. During development we have been running the application on port 8080, but on the server we will want to run it on port 80. To support both of these scenarios we can make the port configurable through an environment variable.

Vert.x makes environment variables available to verticles through the container API. In JavaScript, the variables can be found in the container.env object. Let's use it to give our application a port at runtime. Find the following line in the deployment verticle app.js:

port: 8080,

Change it to the following line:

port: parseInt(container.env.get('MINDMAPS_PORT')) || 8080,

This gets the MINDMAPS_PORT environment variable, and parses it from a string to an integer using the standard JavaScript parseInt function. If no port has been given, the default value 8080 is used.

We also need to change the host configuration of the web server. So far, we have been binding to localhost, but now we also want the application to be accessible from outside the server. Find the following line in app.js:

host: "localhost",

Change it to the following line:

host: "0.0.0.0",

Using the host 0.0.0.0 will make the server bind to all IPv4 network interfaces the server has.
Setting up the application on the server

We are going to need some way of transferring the application code itself to the server, as well as delivering incremental updates as new versions of the application are developed. One of the simplest ways to accomplish this is to just transfer the application files over using the rsync tool, which is what we will do. rsync is a widely used Unix tool for transferring files between machines. It has some useful features over plain file copying, such as only copying the deltas of what has changed, and two-way synchronization of files.

Create a directory for the application, in the home directory of the mindmaps user, using the following command:

mkdir ~/app

Go back to the application root directory and transfer the files from it to the new remote directory:

rsync -rtzv . mindmaps@192.168.33.10:~/app

Testing the setup

At this point, the project working tree should already be in the application directory on the remote server, because we have transferred it over using rsync. You should also be able to run it on the virtual machine, provided that you have the JDK, Vert.x, and MongoDB installed, and that you have authbind installed and configured. You can run the app with the following commands:

cd ~/app
JAVA_OPTS="-Djava.net.preferIPv4Stack=true" MINDMAPS_PORT=80 authbind ~/vert.x-2.0.1-final/bin/vertx run app.js

Let's go through the command bit by bit as follows: We pass a Java system parameter called java.net.preferIPv4Stack to Java via the JAVA_OPTS environment variable. This will have Java use IPv4 networking only. We need it because the authbind utility only supports IPv4. We also explicitly set the application to use port 80 using the MINDMAPS_PORT environment variable. We wrap the Vert.x command with the authbind command. Finally, there's the call to Vert.x. Substitute the path to the Vert.x executable with the path you installed Vert.x to.

After starting the application, you should be able to see it by navigating to http://192.168.33.10 in a browser.

Setting up an upstart service

We have our application fully operational, but it isn't very convenient or reliable to have to start it manually. What we'll do next is to set up an Ubuntu upstart job that will make sure the application is always running and survives things like server restarts.

Upstart is an Ubuntu utility that handles task supervision and automated starting and stopping of tasks when the machine starts up, shuts down, or when some other events occur. It is similar to the /sbin/init daemon, but is arguably easier to configure, which is the reason we'll be using it.

The first thing we need to do is set up an upstart configuration file. Open an editor with root access (using sudo) for a new file /etc/init/mindmaps.conf, and set its contents as follows:

start on runlevel [2345]
stop on runlevel [016]
setuid mindmaps
setgid mindmaps
env JAVA_OPTS="-Djava.net.preferIPv4Stack=true"
env MINDMAPS_PORT=80
chdir /home/mindmaps/app
exec authbind /home/mindmaps/vert.x-2.0.1-final/bin/vertx run app.js

Let's go through the file bit by bit as follows: On the first two lines, we configure when this service will start and stop. This is defined using runlevels, which are numeric identifiers of different states of the operating system (http://en.wikipedia.org/wiki/Runlevel). 2, 3, 4, and 5 designate runlevels where the system is operational; 0, 1, and 6 designate runlevels where the system is stopping or restarting. We set the user and group the service will run as to the mindmaps user and its group. We set the two environment variables we also used earlier when testing the service: JAVA_OPTS for letting Java know it should only use the IPv4 stack, and MINDMAPS_PORT to let our application know that it should use port 80. We change the working directory of the service to where our application resides, using the chdir directive. Finally, we define the command that starts the service. It is the vertx command wrapped in the authbind command. Be sure to change the directory for the vertx binary to match the directory you installed Vert.x to.

Let's give the mindmaps user the permission to manage this job so that we won't have to always run it as root. Open up the /etc/sudoers file in an editor with the following command:

sudo /usr/sbin/visudo

At the end of the file, add the following line:

mindmaps ALL = (root) NOPASSWD: /sbin/start mindmaps, /sbin/stop mindmaps, /sbin/restart mindmaps, /sbin/status mindmaps

The visudo command is used to configure the privileges of different users to use the sudo command. With the line we added, we enabled the mindmaps user to run a few specific commands without having to supply a password.

At this point you should be able to start and stop the application as the mindmaps user:

sudo start mindmaps

You also have the following additional commands available for managing the service:

sudo status mindmaps
sudo restart mindmaps
sudo stop mindmaps

If there is a problem with the commands, there might be some configuration error. The upstart service will log errors to the file /var/log/upstart/mindmaps.log. You will need to open it using the sudo command.

Deploying new versions

Deploying a new version of the application consists of the following two steps:
- Transferring new files over using rsync
- Restarting the mindmaps service

We can make this even easier by creating a shell script that executes both steps. Create a file called deploy.sh in the root directory of the project and set its contents as:

#!/bin/sh
rsync -rtzv . mindmaps@192.168.33.10:~/app/
ssh mindmaps@192.168.33.10 sudo restart mindmaps

Make the script executable, using the following command:

chmod +x deploy.sh

After this, just run the following command whenever you want a new version on the server:

./deploy.sh

To make deployment even more streamlined, you can set up SSH public key authentication so that you won't need to supply the password of the mindmaps user as you deploy. See https://help.ubuntu.com/community/SSH/OpenSSH/Keys for more information.

Summary

In this article, we have learned the following things:
- How to set up a Linux server for Vert.x production deployment
- How to set up deployment for a Vert.x application using rsync
- How to start and supervise a Vert.x process using upstart

Resources for Article:

Further resources on this subject:
- IRC-style chat with TCP server and event bus [Article]
- Coding for the Real-time Web [Article]
- Integrating Storm and Hadoop [Article]


Digging into the Architecture

Packt
30 Jul 2013
31 min read
(For more resources related to this topic, see here.)

The big picture

A very short description of a WaveMaker application could be: a Spring MVC server running in a Java container, such as Tomcat, serving file and JSON requests for a Dojo Toolkit-based JavaScript browser client. Unfortunately, such "elevator" descriptions can create more questions than they answer.

For starters, although we will often refer to it as "the server," the WaveMaker server might be more aptly called an application server in most architectures. Sure, it is possible to have a useful application without additional servers or services beyond the WaveMaker server, but this is not typical. We could have a rich user interface to read against some in-memory data set, for example. Far more commonly, the Java services running in the WaveMaker server are calling off to other servers or services, such as relational databases and RESTful web services. This means the WaveMaker server is often the middle or application tier server of a multi-tier application's architecture.

Yet at the same time, the WaveMaker server can be eliminated completely. Applications can be packaged for uploading to PhoneGap Build, http://build.phonegap.com/, directly from WaveMaker Studio. Both PhoneGap and the associated Apache project Cordova, http://cordova.apache.org, provide APIs to enable JavaScript to access native device functionality, such as capturing images with the camera and obtaining GPS location information. Packaged up and installed as a native application, the JavaScript files are loaded from the device's file system instead of being downloaded from a server via HTTP. This means there is no origin domain to be constrained by. If the application only uses web services, or otherwise doesn't need additional services, such as database access, the WaveMaker server is neither used nor needed.

Just because an application isn't installed on a mobile device from an app store doesn't mean we can't run it on a mobile device. Browsers on mobile devices are more capable than ever before. This means our client could be any device with a modern browser.

You must also consider licensing in light of the bigger picture. WaveMaker, WaveMaker Studio, and the applications created with the Studio are released under the Apache 2.0 license, http://www.apache.org/licenses/LICENSE-2.0. The WaveMaker project was first released by WaveMaker Software in 2007. In March 2011, VMware (http://vmware.com) acquired the WaveMaker project. It was under VMware that WaveMaker 6.5 was released. In April 2013, Pramati Technologies (http://pramati.com) acquired the assets of WaveMaker for its CloudJee (http://cloudjee.com) platform. WaveMaker continues to be developed and released by Pramati Technologies.

Now that we understand where our client and server sit in the larger world, we will be primarily focused within and between those two parts. The overall picture of the client and server looks as shown in the following diagram:

We will examine each piece of this diagram in detail during the course of this book. We shall start with the JavaScript client.

Getting comfortable with the JavaScript client

The client is a JavaScript client that runs in a modern browser. This means that most of the client, the HTML and DOM nodes that the browser interfaces with specifically, are created by JavaScript at runtime. The application is styled using CSS, and we can use HTML in our applications. However, we don't use HTML to define buttons and forms. Instead, we define components, such as widgets, and set their properties. These component class names and properties are used as arguments to functions that create DOM nodes for us.

Dojo Toolkit

To do this, WaveMaker uses the Dojo Toolkit, http://dojotoolkit.org/. Dojo, as it is generally referred to, is a modular, cross-browser JavaScript framework with three sections. Dojo Core provides the base toolkit. On top of that are Dojo's visual widgets, called Dijits. Finally, DojoX contains additional extensions such as charts and a color picker. DojoCampus' Dojo Explorer, http://dojocampus.com/explorer/, has a good selection of single unit demos across the toolkit, many with source code. Dojo allows developers to define widgets using HTML or JavaScript. WaveMaker users will better recognize the JavaScript approach.

Specifically, WaveMaker 6.5.X uses version 1.6.1 of Dojo. Of the browsers supported by Dojo 1.6.1, http://dojotoolkit.org/reference-guide/1.8/releasenotes/1.6.html, Opera's "Dojo Core only" support prevents it from being supported by WaveMaker. This could change with Opera's move to WebKit.

Building on top of the Dojo Toolkit, WaveMaker provides its own collections of widgets and underlying components. Although both can be called components, the name component is generally used for the non-visible parts, such as service calls to the server and the event notification system. Widgets, such as the Dijits, are visible components such as buttons and editors. Many, but not all, of the WaveMaker widgets extend functionality from Dojo widgets. When they do extend Dijits, WaveMaker widgets often add numerous functions and behaviors that are not part of Dojo. Examples include controlling the read-only state, formatting display values for currency, and merging components, such as buttons with icons in them. Combined with the WaveMaker runtime layers, these enhancements make it easy to assemble rich clients using only properties.

WaveMaker's select editor (wm.SelectMenu), for example, extends the Dojo Toolkit ComboBox (dijit.form.ComboBox) or the FilteringSelect (dijit.form.FilteringSelect) as needed. By default, a select menu has Dojo FilteringSelect as its editor, but it will use ComboBox instead if the user is on a mobile device or the developer has cleared the RestrictValues property tick box.

A required select menu editor

Let's consider the case of disabling a submit button when the user has not made a required list selection. In Dojo, this is done using JavaScript code, and for an experienced Dojo developer, this is not difficult. For those who may primarily consider Dojo a martial arts studio, however, it is likely another matter altogether. Using the WaveMaker framework provided widgets, no code is required to set up this inter-connection. This is simply a matter of visually linking or binding the button's disabled property to the list's emptySelection property in the graphical binding dialog. Now the button will be disabled if the user has not made a selection in the grid's list of items. Logically, we can think of this as setting the disabled property to the value of the grid's emptySelection property, where emptySelection is true unless and until a row has been selected.

Where WaveMaker most notably varies from the Dojo way of things is the layout engine. WaveMaker handles the layout of container widgets using its own engine. Containers are those widgets that contain other widgets, such as panels, tabs, and dialogs. This makes it easier for developers to arrange widgets in WaveMaker Studio. A result of this is that border, padding, and margin are set using properties on widgets, not by CSS. Border, padding, and margin are widget properties in WaveMaker, and are not controlled by CSS.

Dojo made easy

Having the Dojo framework available to us makes web development easier both when using the WaveMaker framework and when doing custom work. Dojo's modular and object-oriented functions, such as dojo.declare and dojo.inherited, for example, simplify creating custom components. The key takeaway here is that Dojo itself is available to you as a developer if you wish to use it directly. Many developers never need to utilize this capability, but it is available to you if you ever do wish to take advantage of it.

Running the CRM Simple sample again, from either the console in the browser development tools or custom project page code, we could use Dojo's byId() function to get a div, for example, the main title label: dojo.byId("main_labelTitle"). In practice, the WaveMaker style of getting a DOM node via the component name, for example, main.labelTitle.domNode, is more practical and returns the same result.

If a function or ability in Dojo is useful, the WaveMaker framework usually provides a wrapper of some sort for you. Just as often, the WaveMaker version is friendlier or otherwise easier to use in some way. For example, this.connect(), WaveMaker's version of dojo.connect(), tracks connections for you. This avoids the need for you to remember to call disconnect() to remove the reference added by every call to connect(). For more information about using Dojo functions in WaveMaker, see the Dojo framework page in the WaveMaker documentation at http://dev.wavemaker.com/wiki/bin/wmdoc_6.5/Dojo+Framework.

Binding and events

Two solid examples of WaveMaker taking a powerful feature of Dojo and providing friendlier versions are topic notifications and event handling. dojo.connect() enables you to register a method to be called when something happens. In other words: "when X happens, please also do Y". Studio provides visual tooling for this in the events section of a component's properties. Buttons have an event drop-down menu for their click event. Asynchronous server call components, live variables, and service variables have tooled events for reviewing data just before the call is made and for the successful, and not so successful, returns from the call. These menus are populated with listings of likely components and, if appropriate, functions. Invoking other service calls, particularly when a server call depends on data from the results of some previous server call, and navigation calls to other layers and pages within the application are easy examples of how WaveMaker's visual tooling of dojo.connect simplifies web development.

WaveMaker's binding dialog is a graphical interface on the topic subscription system. Here we are "binding" a live variable that returns rows from the lineitem table to be filtered by the data value of the orderid editor in the form on the new order page:

The result of this binding is that when the value of the orderid editor changes, the value in the filter parameter of this live variable will be updated. An event indicating that the value of this orderid editor has changed is published when the data value changes. This live variable's filter is being subscribed to that topic and can now update its value accordingly.
Loading the client

Web applications start from index.html, and a WaveMaker application is no different. If we examine index.html of a WaveMaker application, we see the total content is less than 100 lines. We have some meta tags in the head, mostly for Internet Explorer (MSIE) and iOS support. In the body, there are more entries to help out with older versions of MSIE, including script tags to use Chrome Frame if we so choose. If we cut all that away, index.html is rather simple.

In the head, we load the CSS containing the project's theme and define a few lines of style classes for wavemakerNode and _wm_loading:

<script>var wmThemeUrl = "/wavemaker/lib/wm/base/widget/themes/wm_default/theme.css";</script>
<style type="text/css">
  #wavemakerNode {
    height: 100%;
    overflow: hidden;
    position: relative;
  }
  #_wm_loading {
    text-align: center;
    margin: 25% 0px 25% 0px;
  }
</style>

Next we load the file config.js, which, as its name suggests, is about configuration. The following line of code is used to load the file:

<script type="text/javascript" src="config.js"></script>

Config.js defines the various settings, variables, and helper functions needed to initialize the application, such as the locale setting. Moving into the body tag of index.html, we find a div named wavemakerNode:

<div id="wavemakerNode">

The next div tag is the loader gif, which is given in the following code:

<div id="_wm_loading" style="z-index: 100;">
  <table style='width:100%;height: 100%;'><tr><td align='center'><img alt="Loading" src="/wavemaker/lib/boot/images/loader.gif" />&nbsp;&nbsp;Loading...</td></tr></table>
</div>

This is the standard spinner shown while the application is loading. With the loader gif now spinning, we begin the real work with runtimeLoader.js, as given in the following line of code:

<script type="text/javascript" src="/wavemaker/lib/runtimeLoader.js"></script>

When running a project from Studio, the client runtime is loaded from Studio via WaveMaker. Config.js and index.html are modified for deployment while the client runtime is copied into the application's webapproot. runtimeLoader, as its name suggests, loads the WaveMaker runtime. With the runtime loaded, we can now load the top level project.a.js file, which defines our application using the dojo.declare() method. The following line of code loads the file:

<script type="text/javascript" src="project.a.js"></script>

Finally, with our application class defined, we set up an instance of our application in wavemakerNode and run it.

There are two modes for loading a WaveMaker application: debug and gzip mode. The debug mode is useful for debugging, as you would expect. The gzip mode is the default mode. The test mode of the Run, Test, or Compile button in Studio re-deploys the active project and opens it in debug mode. This is the only difference between using Test and Run in Studio. The Test button adds ?debug to the URL of the browser window; the Run button does not. Any WaveMaker application can be loaded in debug mode by adding debug to the URL parameters. For example, to load the CRM Simple application from within WaveMaker in debug mode, use the URL http://crm_simple.localhost:8094.com/?debug; detecting debug in the URL sets the djConfig.debugBoot flag, which alters the path used in runtimeLoader:

djConfig.debugBoot = location.search.indexOf("debug") >= 0;

Like a compiled program, debug mode preserves variable names and all the other details that optimization removes, which we would want available to us when debugging. However, JavaScript is not compiled into byte code or machine-specific instructions. On the other hand, in gzip mode, the browser loads a few optimized packages containing all the source code in merged files. This reduces the number of files needed to load our application, which significantly improves loading time. These optimized packages are also minified. Minification removes whitespace and replaces variable names with short names, further reducing the volume of code to be parsed by the browser, and therefore further improving performance. The result is a significant reduction in the number of requests needed and the number of bytes transferred to load an application. A stock application in gzip mode requires 22 to 24 requests to load some 300 KB to 400 KB of content, depending on the application. In debug mode, the same app transfers over 1.5 MB in more than 500 requests.

The index.html file, and when security is enabled, login.html, are yours to edit. If you are comfortable doing so, you can customize these files, such as adding additional script tags. In practice, you shouldn't need to customize index.html, as you have full control of the application loaded into the wavemakerNode. Also, upgrade scripts in future versions of WaveMaker may need to programmatically update index.html and login.html. Changes to the X-UA-Compatible meta tag are often required when support for newer versions of Internet Explorer becomes available, for example. These scripts can't possibly know about every customization you may make. Customization of index.html may cause these scripts to fail, and may require you to manually update these files. If you do encounter such a situation, simply use the index.html file from a project newly created in the new version as a template.

Springing into the server side

The WaveMaker server is a Java application running in a Java Virtual Machine (JVM). Like the client, it builds upon proven frameworks and libraries. In the case of the server, the foundational block is the SpringSource framework, http://www.springsource.org/, or the Spring framework. The Spring framework is the most popular enterprise Java development framework today, and for good reason. The server of a WaveMaker application is a Spring application that includes the WaveMaker common, json, and runtime modules. More specifically, the WaveMaker server uses the Spring Web MVC framework to create a DispatcherServlet that delegates client requests to their handlers. WaveMaker uses only a handful of controllers, as we will see in the next section. The effective result is that it is the request URL that is used to direct a service call to the correct service. The method value of the request is the name of the client-exposed function within the service to be called. In the case of overloaded functions, the signature of the params value is used to find the method matching by signature. We will look at example requests and responses shortly.

Behind this controller is not only the power of the Spring framework, but also a number of leading frameworks such as Hibernate and JaxWS, and libraries such as log4j and Apache Commons. Here too, these libraries are available to you both directly in any custom work you might do and indirectly as tooled features of Studio. As we are working with a Spring server, we will be seeing Spring beans often as we examine the server-side configuration. One need not be familiar with Spring to reap its benefits when using custom Java in WaveMaker.
Spring makes it easy to get access to other beans from our Java code. For example, if our project has imported a database as MyDB, we could get access to the service and any exposed functions in that service using getServiceBean(). The following code illustrates the use of getServiceBean():

MyDB myDbSvc = (MyDB)RuntimeAccess.getInstance().getServiceBean("mydb");

We start by getting an instance of the WaveMaker runtime. From the returned runtime instance, we can use the getServiceBean() method to get a service bean for our mydb database service. There are other ways we could have got access to the service from our Java code; this one is pretty straightforward.

Starting from web.xml

Just as the client side starts with index.html, a Java servlet starts in WEB-INF with web.xml. A WaveMaker application's web.xml is a rather straightforward Spring MVC web.xml. You'll notice many servlet-mappings, a few listeners, and filters. Unlike index.html, web.xml is managed directly by Studio. If you need to add elements to the web-app context, add them to user-web.xml. The content of user-web.xml is merged into web.xml when generating the deployment package.

The most interesting entry is probably the contextConfigLocation of /WEB-INF/project-springapp.xml. Project-springapp.xml is a Spring beans file. Immediately after the schema declaration is a series of resource imports. These imports include the services and entities that we create in Studio as we import databases and otherwise add services to our project.

If you open project-spring.xml in WEB-INF, near the top of the file you'll see a comment noting how project-spring.xml is yours to edit. For experienced Spring users, here is the entry point to add any additional imports you may need. An example of such can be found at http://dev.wavemaker.com/wiki/bin/Spring. In that example, an additional XML file, ServerFileProcessor.xml, is used to enable component scanning on a package and sets some properties on those components. Project-spring.xml is then used to import ServerFileProcessor.xml into the application context. Many users of WaveMaker still think of Spring as the season between Winter and Summer. Such users do not need to think about these XML files. However, for those who are experienced with Java, the full power of the Spring framework is accessible to them.

Also in project-springapp.xml is a list of URL mappings. These mappings specify request URLs that require handling by the file controller. Gzipped resources, for example, require the header Content-Encoding to be set to gzip. This informs the browser that the content is gzip encoded and must be uncompressed before being parsed.

There are a few names that use ag in the server. WaveMaker Software, the company, was formerly known as ActiveGrid, and had a previous web development tool by the same name. The use of ag and com.activegrid stems back to the project's roots, first put down when the company was still known as ActiveGrid.

Closing out web.xml is the Acegi filter mapping. Acegi is the security module used in WaveMaker 6.5. Even when security is not enabled in an application, the Acegi filter mapping is included in web.xml. When security is not enabled in the project, an empty project-security.xml is used.

Client and server communication

Now that we've examined the client and server, we need to better understand the communication between the two. WaveMaker almost exclusively uses the HTTP methods GET and POST. In HTTP, GET is used, as you might suspect even without ever having heard of RFC 2616 (https://tools.ietf.org/html/rfc2616), to request, or get, a specific resource. Unless installed as a native application on a mobile device, a WaveMaker web application is loaded via a GET method. From index.html and runtimeLoader.js to the user-defined pages and any images used on those pages, the applications themselves are loaded into the browser using GET.

All service calls, database reads and writes, or otherwise any invocations of Java service functions, on the other hand, are POST. The URL of these POST functions is always the service name followed by .json. For example, calls to a Java service named userPrefSvc would always be to the URL /userPrefSvc.json. Inside the POST method's request payload will be any required parameters, including the method of the service to be invoked. The response will be the response returned from that call. PUT methods are not possible because we cannot, nor do we want to, know all possible WaveMaker server calls at "design time", while the project files are open for writing in the Studio. This pattern avoids any URL length constraints, enabling lengthy datasets to be transferred while freeing up the URL to pass parameters such as page state.

Let's take a look at an example. If you want to follow along in your browser's console, this is the third request of three when we select "Fog City Books" in the CRM Simple application when running the application with the console open. The following URL is the request URL:

http://crm_simple.localhost:8094/services/runtimeService.json

The following is the request payload:

{"params":["custpurchaseDB","com.custpurchasedb.data.Lineitem",null,{"properties":["id","item"],"filters":["id.orderid=9"],"matchMode":"start","ignoreCase":false},{"maxResults":500,"firstResult":0}],"method":"read","id":251422}

The response is as follows:

{"dataSetSize":2,"result":[{"id":{"itemid":2,"orderid":9},"item":{"itemid":2,"itemname":"Kidnapped","price":12.99},"quantity":2},{"id":{"itemid":10,"orderid":9},"item":{"itemid":10,"itemname":"Gravitys Rainbow","price":11.99},"quantity":1}]}

As we expect, the request URL is to a service (in this case named runtimeService), with the .json extension. The runtime service is the built-in WaveMaker service for reading and writing with the Hibernate (http://www.hibernate.org) data models generated by importing a database. The security service and the WaveMaker service are the other built-in services used at runtime. The security service is used for security functions such as getUserName() and logout(). Note this does not include login, which is handled by Acegi. The WaveMaker service has functions such as getServerTimeOffset(), used to adjust for time zones, and remoteRESTCall(), used to proxy some web service calls.

How the runtime service functions is easy to understand by observation. Inside the request payload we have, as the URL suggested, a JavaScript Object Notation (JSON) structure. JSON (http://www.json.org/) is a lightweight data-interchange format regularly used in AJAX applications. Dissecting our example request from the top, the structure enclosed in the outermost {}'s looks like the following:

{"params":[…….],"method":"read","id":251422}

We have three top-level name-value pairs in our request object: params, method, and id. The id is 251422; method is read; and the params value is an array, as indicated by the [] brackets:

["custpurchaseDB","com.custpurchasedb.data.Lineitem",null,{},{}]

In our case, we have an array of five values. The first is the database service name, custpurchaseDB. Next we have what appears to be the package and class name we will be reading from, not unlike from in a SQL query. After which, we have a null and two objects. JSON is friendly to human reading, and we could continue to unwrap the two objects in this request in a similar fashion. We will return to these when we discuss database services. For now, let's check out the response.

At the top level, we have dataSetSize, the number of results, and the array of the results:

{"dataSetSize":2,"result":[]}

Inside our result array we have two objects:

[{"id":{"itemid":2,"orderid":9},"item":{"itemid":2,"itemname":"Kidnapped","price":12.99},"quantity":2},{"id":{"itemid":10,"orderid":9},"item":{"itemid":10,"itemname":"Gravitys Rainbow","price":11.99},"quantity":1}]

Our first item has the compound key of itemid 2 with orderid 9. This is the item Kidnapped, which is a book costing $12.99. The other object in our result array also has the orderid 9, as we expect when reading line items from the selected order. This one is also a book, the item Gravity's Rainbow.

Types

To be more precise about the com.custpurchasedb.data.Lineitem parameter in our read request, it is actually the type name of the read request. WaveMaker projects define types, from primitive types such as Boolean to custom complex types such as Lineitem. In our runtime read example, com.custpurchasedb.data.Lineitem is both the package and class name of the imported Hibernate entity and the type name for the line item entity in the project.

Maintaining type information enables WaveMaker to ease a number of development issues. As the client knows the structure of the data it is getting from the server, it knows how to display that data with minimal developer configuration, if any. At design time, Studio uses type information in many areas to help us correctly configure our application. For example, when we set up a grid, type information enables Studio to present us with a list of possible column choices for the grid's dataset type. Likewise, when we add a form to the canvas for a database insert, it is type information that Studio uses to fill the form with appropriate editors.

Lineitem is a project-wide type as it is defined in the server side. In the process of compiling the project's Java service sources, WaveMaker defines system types for any type returned to the client in a client-facing function. To be added to the type system, a class must:
- Be public
- Define public getters and setters
- Be returned by a client-exposed function
- Have a service class that extends JavaServiceSuperClass or uses the @ExposeToClient annotation

WaveMaker 6.5.1 has a bug that prevents types from being generated as expected. Be certain to use 6.5.2 or newer versions to avoid this defect.

It is possible to create new project types by adding a Java service class to the project that only defines types. Following is an example that adds a new simple type called Record to the project. Our definition of Record consists of an integer ID and a string. Note that there are two classes here. MyCustomTypes is the service class containing a method returning the type Record. As we will not be calling it, the function getNewRecord() need not do anything other than return a Record. Creating a new default instance is an easy way to do this.
The class Record is defined as an inner class. An inner class is a class defined within another class. In our case, Record is defined within MyCustomTypes: // Java Service class MyCustomTypes package com.myco.types; import com.wavemaker.runtime.javaservice.JavaServiceSuperClass; import com.wavemaker.runtime.service.annotations.ExposeToClient; public class MyCustomTypes extends JavaServiceSuperClass { public Record getNewRecord(){ return new Record(); } // Inner class Record public class Record{ private int id; private String name; public int getId(){ return id; } public void setId(int id){ this.id = id; } public String getName(){ return this.name; } public void setName(String name){ this.name = name; } } }  To add the preceding code to our WaveMaker project, we would add a Java service to the project using the class name MyCustomTypes in the Package and Class Name editor of the New Java Service dialog. The preceding code extends JavaServiceSuperClass and uses the package com.myco.types. A project can also have client-only types using the type definition option from the advanced section of the Studio insert menu. Type definitions are useful when we want to be able to pass structured data around within the client but we will not be sending or receiving that type to the server. For example, we may want to have application scoped wm.Variable storing a collection of current record selection information. This would enable us to keep track of a number of state items across all pages. Communication with the server is likely to be using only a few of those types at a time, so no such structure exists in the server side. Using wm.Variable enables us to bind each Record ID without using code. The insert type definition menu brings up the Type Definition Generator dialog. The generator takes JSON input and is pre-populated with a sample type. The sample type defines a person object, albeit an unusual one, with a name, an array of numbers for age, a Boolean (hasFoot), and a related person object, friend. Replace the sample type with your own JSON structure. Be certain to change the type name to something meaningful. After generating the type, you'll immediately see the newly minted type in type selectors, such as the type field of wm.Variable. Studio is pretty good at recognizing type changes. If for some reason Studio does not recognize a type change, the easiest thing to do is to get Studio to re-read the owning object. If a wm.Variable fails to show a newly added field to a type in its properties, change the type of the variable from the modified type to some other type and then back again. Studio is also an application One of the more complex WaveMaker applications is the Studio. That's right, Studio is itself an application built out of WaveMaker widgets and using the runtime and server. Being the large, complex application we use to build applications, it can sometimes be difficult to understand where the runtime ends and Studio begins. With that said, Studio remains a treasure trove of examples and ideas to explore. Let's open a finder, explorer, shell, or however you prefer to view the file system of a WaveMaker Studio installation. Let's look in the studio folder. If you've installed WaveMaker to c:program filesWaveMaker6.5.3.Release, the default on Windows, we're looking at c:program filesWaveMaker6.5.3.Releasestudio. This is the webapproot of the Studio project: For files, we've discussed index.html in loading the client. The type definition for the project types is types.js. 
The types.js definition is how the client learns of the server's Java types. Moving on to the directories alphabetically, we start with the app folder. The app folder can be considered a large utility folder these days. The branding folder, http://dev.wavemaker.com/wiki/bin/wmdoc_6.5/Branding, is a sample of the branding feature for when you want to easily re-brand applications for different customers. The build folder contains the optimized build files we discussed when loading our application in gzip mode. This build folder is for the Studio itself. The images folder is, as we would hope, where images are kept. The content of the doc in jsdoc is pretty old. Use jsref at the online wiki, http://dev.wavemaker.com/wiki/bin/wmjsref_6.5/WebHome, for a client API reference instead. Language contains the National Language Support (NLS) files to localize Studio into other languages. In 6.5.X, there is a Japanese (ja) and Spanish (es) directory in addition to the English (en) default thanks to the efforts of the WaveMaker community and a corporate partner. For more on internationalization applications with WaveMaker, navigate to http://dev.wavemaker.com/wiki/bin/wmdoc_6.5/Localization#HLocalizingtheClientLanguage. The lib folder is very interesting, so let's wrap up this top level before we dig into that one. The META-INF folder contains artifacts from the WaveMaker Maven build process that probably should be removed for 6.5.2. The pages folder contains the page definitions for Studio's pages. These pages can be opened in Studio. They also can be a treasure trove of tips and tricks if you see something when using Studio that you don't know how to do in your application. Be careful however, as some pages are old and use outdated classes or techniques. Other constructs are only used by Studio and aren't tooled. This means some pages use components that can only be created by code. The other major difference between a project's pages folder is that Studio page folders do not contain the same number of files. They do not have the optimized pageName.a.js file, for example. The services folder contains the Service Method Definition (SMD) files for Studio's services. These are summaries of a projects exposed services, one file per service, used at runtime by the client. Each callable function, its input parameter, and its return types are defined. Finally, WEB-INF we have discussed already when we examined web.xml. In Studio's case, replace project with studio in the file names. Also under WEB-INF, we have classes and lib. The classes folder contains Java class files and additional XML files. These files are on the classpath. WEB-INFlib contains JAR files. Studio requires significantly more JAR files, which are automatically added to projects created by Studio. Now let's get back to the lib folder. Astute readers of our walk through of index.html likely noticed the references to /wavemaker/lib in src tags for things such as runtimeloader. You might have also noticed that this folder was not present in the project and wondered how these tags could not fail. As a quick look at the URL of Studio running in a browser will demonstrate, /wavemaker is the Studio's context. This means the JavaScript runtime is only copied in as part of generating the deployment package. The lib folder is loaded directly from Studio's context when you test run an application from Studio using the Run or Test button. RuntimeLoader.js we encountered following index.html as it is the start of the loading of client modules. 
Manifest.js is an entry point into the loading process. Boot contains pre-initialization, such as the spinning loader image. Next we have another build folder. This one is used by applications and contains all possible build files. Not every JavaScript module is packaged up into an optimized build file; some modules are so specific or rarely used that they are best loaded individually. Otherwise, if a build package is available to applications, they use it. Dojo lives in the dojo folder. I hope you don't find it surprising to find a dijit, dojo, and dojox folder in there. The github folder provides the library path github for JS Beautifier, http://jsbeautifier.org/. The images in the project images folder include a copy of Silk Icons, http://www.famfamfam.com/lab/icons/silk/, a great Creative Commons licensed PNG icon set.

This brings us to wm. We definitely saved the most interesting folder for our last stop on this tour. In lib/wm, we have manifest.js, the top level of module loading when using debug mode in the runtime loader. In wm/lib/base is the top level of the WaveMaker module space used at runtime. This means in wm/lib/base we have the WaveMaker components and widgets folders. These two folders contain the sets of classes most commonly used by WaveMaker developers writing any custom JavaScript in a project. This also means we will be back in these folders again.

Summary

In this article, we reviewed the WaveMaker architecture. We started with some context of what we mean by "client" and "server" in the context of this book. We then proceeded to dig into the client and the server. We reviewed how both build upon leading frameworks, the Dojo Toolkit and the SpringSource Framework in particular. We examined the running of an application from the network point of view and how the client and server communicated throughout. We dissected a JSON request to the runtime service and encountered project types. We also learned about both project and client type definitions. We ended by revisiting the file system. This time, however, we walked through a Studio installation: Studio is also a WaveMaker application. In the next article, we'll get comfortable with Studio as a visual tool. We'll look at everything from the properties panels to the built-in source code editors.

Resources for Article: Further resources on this subject: Binding Web Services in ESB—Web Services Gateway [Article] Service Oriented JBI: Invoking External Web Services from ServiceMix [Article] Web Services in Apache OFBiz [Article]
Symmetric Messages and Asynchronous Messages (Part 1)

Packt
05 May 2015
31 min read
In this article by Kingston Smiler. S, author of the book OpenFlow Cookbook describes the steps involved in sending and processing symmetric messages and asynchronous messages in the switch and contains the following recipes: Sending and processing a hello message Sending and processing an echo request and a reply message Sending and processing an error message Sending and processing an experimenter message Handling a Get Asynchronous Configuration message from the controller, which is used to fetch a list of asynchronous events that will be sent from the switch Sending a Packet-In message to the controller Sending a Flow-removed message to the controller Sending a port-status message to the controller Sending a controller-role status message to the controller Sending a table-status message to the controller Sending a request-forward message to the controller Handling a packet-out message from the controller Handling a barrier-message from the controller (For more resources related to this topic, see here.) Symmetric messages can be sent from both the controller and the switch without any solicitation between them. The OpenFlow switch should be able to send and process the following symmetric messages to or from the controller, but error messages will not be processed by the switch: Hello message Echo request and echo reply message Error message Experimenter message Asynchronous messages are sent by both the controller and the switch when there is any state change in the system. Like symmetric messages, asynchronous messages also should be sent without any solicitation between the switch and the controller. The switch should be able to send the following asynchronous messages to the controller: Packet-in message Flow-removed message Port-status message Table-status message Controller-role status message Request-forward message Similarly, the switch should be able to receive, or process, the following controller-to-switch messages: Packet-out message Barrier message The controller can program or instruct the switch to send a subset of interested asynchronous messages using an asynchronous configuration message. Based on this configuration, the switch should send the subset of asynchronous messages only via the communication channel. The switch should replicate and send asynchronous messages to all the controllers based on the information present in the asynchronous configuration message sent from each controller. The switch should maintain asynchronous configuration information on a per communication channel basis. Sending and processing a hello message The OFPT_HELLO message is used by both the switch and the controller to identify and negotiate the OpenFlow version supported by both the devices. Hello messages should be sent from the switch once the TCP/TLS connection is established and are considered part of the communication channel establishment procedure. The switch should send a hello message to the controller immediately after establishing the TCP/TLS connection with the controller. How to do it... As hello messages are transmitted by both the switch and the controller, the switch should be able to send, receive, and process the hello message. The following section explains these procedures in detail. Sending the OFPT_HELLO message The message format to be used to send the hello message from the switch is as follows. This message includes the OpenFlow header along with zero or more elements that have variable size: /* OFPT_HELLO. 
This message includes zero or more    hello elements having variable size. */ struct ofp_hello { struct ofp_header header; /* Hello element list */ struct ofp_hello_elem_header elements[0]; /* List of elements */ }; The version field in the ofp_header should be set with the highest OpenFlow protocol version supported by the switch. The elements field is an optional field and might contain the element definition, which takes the following TLV format: /* Version bitmap Hello Element */ struct ofp_hello_elem_versionbitmap { uint16_t type;           /* OFPHET_VERSIONBITMAP. */ uint16_t length;         /* Length in bytes of this element. */        /* Followed by:          * - Exactly (length - 4) bytes containing the bitmaps,          * then Exactly (length + 7)/8*8 - (length) (between 0          * and 7) bytes of all-zero bytes */ uint32_t bitmaps[0]; /* List of bitmaps - supported versions */ }; The type field should be set with OFPHET_VERSIONBITMAP. The length field should be set to the length of this element. The bitmaps field should be set with the list of the OpenFlow versions the switch supports. The number of bitmaps included in the field should depend on the highest version number supported by the switch. The ofp_versions 0 to 31 should be encoded in the first bitmap, ofp_versions 32 to 63 should be encoded in the second bitmap, and so on. For example, if the switch supports only version 1.0 (ofp_versions = 0 x 01) and version 1.3 (ofp_versions = 0 x 04), then the first bitmap should be set to 0 x 00000012. Refer to the send_hello_message() function in the of/openflow.c file for the procedure to build and send the OFPT_Hello message. Receiving the OFPT_HELLO message The switch should be able to receive and process the OFPT_HELLO messages that are sent from the controller. The controller also uses the same message format, structures, and enumerations as defined in the previous section of this recipe. Once the switch receives the hello message, it should calculate the protocol version to be used for messages exchanged with the controller. The procedure required to calculate the protocol version to be used is as follows: If the hello message received from the switch contains an optional OFPHET_VERSIONBITMAP element and the bitmap field contains a valid value, then the negotiated version should be the highest common version among the supported protocol versions in the controller, with the bitmap field in the OFPHET_VERSIONBITMAP element. If the hello message doesn't contain any OFPHET_VERSIONBITMAP element, then the negotiated version should be the smallest of the switch-supported protocol versions and the version field set in the OpenFlow header of the received hello message. If the negotiated version is supported by the switch, then the OpenFlow connection between the controller and the switch continues. Otherwise, the switch should send an OFPT_ERROR message with the type field set as OFPET_HELLO_FAILED, the code field set as OFPHFC_INCOMPATIBLE, and an optional ASCII string explaining the situation in the data and terminate the connection. There's more… Once the switch and the controller negotiate the OpenFlow protocol version to be used, the connection setup procedure is complete. From then on, both the controller and the switch can send OpenFlow protocol messages to each other. 
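To make the version negotiation above concrete, here is a small illustrative Python sketch. It is not the of/openflow.c implementation referenced earlier, and the framing helpers and constants are assumptions made only for this example; it packs an OFPT_HELLO message carrying an OFPHET_VERSIONBITMAP element and computes the negotiated version from two bitmaps.

import struct

OFPT_HELLO = 0                 # message type in the OpenFlow header
OFPHET_VERSIONBITMAP = 1       # hello element type

def version_bitmap(wire_versions):
    # Encode wire versions (0x01 for 1.0, 0x04 for 1.3, ...) as bit positions.
    bitmap = 0
    for v in wire_versions:
        bitmap |= 1 << v
    return bitmap

def build_hello(xid, wire_versions):
    bitmap = version_bitmap(wire_versions)
    # One element: type (2 bytes), length (2 bytes), a single 32-bit bitmap.
    element = struct.pack('!HHI', OFPHET_VERSIONBITMAP, 8, bitmap)
    header = struct.pack('!BBHI', max(wire_versions), OFPT_HELLO,
                         8 + len(element), xid)
    return header + element

def negotiate(local_wire_versions, peer_bitmap):
    # Highest version present in both bitmaps, or None if there is no overlap.
    common = version_bitmap(local_wire_versions) & peer_bitmap
    return common.bit_length() - 1 if common else None

print(hex(version_bitmap([0x01, 0x04])))   # 0x12, as in the example above
print(hex(negotiate([0x01, 0x04], 0x10)))  # 0x4 -> OpenFlow 1.3

Run against the worked example in the text, a switch supporting only 1.0 and 1.3 produces the bitmap 0x00000012, and negotiation with a peer that advertises only 1.3 settles on wire version 0x04.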
Sending and processing an echo request and a reply message Echo request and reply messages are used by both the controller and the switch to maintain and verify the liveliness of the controller-switch connection. Echo messages are also used to calculate the latency and bandwidth of the controller-switch connection. On reception of an echo request message, the switch should respond with an echo reply message. How to do it... As echo messages are transmitted by both the switch and the controller, the switch should be able to send, receive, and process them. The following section explains these procedures in detail. Sending the OFPT_ECHO_REQUEST message The OpenFlow specification doesn't specify how frequently this echo message has to be sent from the switch. However, the switch might choose to send an echo request message periodically to the controller with the configured interval. Similarly, the OpenFlow specification doesn't mention what the timeout (the longest period of time the switch should wait) for receiving echo reply message from the controller should be. After sending an echo request message to the controller, the switch should wait for the echo reply message for the configured timeout period. If the switch doesn't receive the echo reply message within this period, then it should initiate the connection interruption procedure. The OFPT_ECHO_REQUEST message contains an OpenFlow header followed by an undefined data field of arbitrary length. The data field might be filled with the timestamp at which the echo request message was sent, various lengths or values to measure the bandwidth, or be zero-size for just checking the liveliness of the connection. In most open source implementations of OpenFlow, the echo request message only contains the header field and doesn't contain any body. Refer to the send_echo_request() function in the of/openflow.c file for the procedure to build and send the echo_request message. Receiving OFPT_ECHO_REQUEST The switch should be able to receive and process OFPT_ECHO_REQUEST messages that are sent from the controller. The controller also uses the same message format, structures, and enumerations as defined in the previous section of this recipe. Once the switch receives the echo request message, it should build the OFPT_ECHO_REPLY message. This message consists of ofp_header and an arbitrary-length data field. While forming the echo reply message, the switch should copy the content present in the arbitrary-length field of the request message to the reply message. Refer to the process_echo_request() function in the of/openflow.c file for the procedure to handle and process the echo request message and send the echo reply message. Processing OFPT_ECHO_REPLY message The switch should be able to receive the echo reply message from the controller. If the switch sends the echo request message to calculate the latency or bandwidth, on receiving the echo reply message, it should parse the arbitrary-length data field and can calculate the bandwidth, latency, and so on. There's more… If the OpenFlow switch implementation is divided into multiple layers, then the processing of the echo request and reply should be handled in the deepest possible layer. For example, if the OpenFlow switch implementation is divided into user-space processing and kernel-space processing, then the echo request and reply message handling should be in the kernel space. 
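As a rough sketch of the echo handling just described (illustrative only, not the send_echo_request()/process_echo_request() code in of/openflow.c; the timestamp payload and constants are assumptions), the request below carries its send time so that the reply, which simply copies the arbitrary data field back, can be used to measure latency:

import struct
import time

OFP_VERSION = 0x04        # assuming OpenFlow 1.3 framing
OFPT_ECHO_REQUEST = 2
OFPT_ECHO_REPLY = 3

def build_echo_request(xid):
    # Optional payload: the send time, so latency can be measured on the reply.
    payload = struct.pack('!d', time.time())
    header = struct.pack('!BBHI', OFP_VERSION, OFPT_ECHO_REQUEST,
                         8 + len(payload), xid)
    return header + payload

def build_echo_reply(request):
    # The reply copies the request's arbitrary-length data field verbatim
    # and reuses its xid.
    version, _, length, xid = struct.unpack('!BBHI', request[:8])
    header = struct.pack('!BBHI', version, OFPT_ECHO_REPLY, length, xid)
    return header + request[8:]

def latency_from_reply(reply):
    sent, = struct.unpack('!d', reply[8:16])
    return time.time() - sent

request = build_echo_request(xid=42)
print(latency_from_reply(build_echo_reply(request)))

If no reply arrives within the configured timeout, the switch would then start the connection interruption procedure mentioned above.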
Sending and processing an error message Error messages are used by both the controller and the switch to notify the other end of the connection about any problem. Error messages are typically used by the switch to inform the controller about failure of execution of the request sent from the controller. How to do it... Whenever the switch wants to send the error message to the controller, it should build the OFPT_ERROR message, which takes the following message format: /* OFPT_ERROR: Error message (datapath -> the controller). */ struct ofp_error_msg { struct ofp_header header; uint16_t type; uint16_t code; uint8_t data[0]; /* Variable-length data. Interpreted based on the type and code. No padding. */ }; The type field indicates a high-level type of error. The code value is interpreted based on the type. The data value is a piece of variable-length data that is interpreted based on both the type and the value. The data field should contain an ASCII text string that adds details about why the error occurred. Unless specified otherwise, the data field should contain at least 64 bytes of the failed message that caused this error. If the failed message is shorter 64 bytes, then the data field should contain the full message without any padding. If the switch needs to send an error message in response to a specific message from the controller (say, OFPET_BAD_REQUEST, OFPET_BAD_ACTION, OFPET_BAD_INSTRUCTION, OFPET_BAD_MATCH, or OFPET_FLOW_MOD_FAILED), then the xid field of the OpenFlow header in the error message should be set with the offending request message. Refer to the send_error_message() function in the of/openflow.c file for the procedure to build and send an error message. If the switch needs to send an error message for a request message sent from the controller (because of an error condition), then the switch need not send the reply message to that request. Sending and processing an experimenter message Experimenter messages provide a way for the switch to offer additional vendor-defined functionalities. How to do it... The controller sends the experimenter message with the format. Once the switch receives this message, it should invoke the appropriate vendor-specific functions. Handling a "Get Asynchronous Configuration message" from the controller The OpenFlow specification provides a mechanism in the controller to fetch the list of asynchronous events that can be sent from the switch to the controller channel. This is achieved by sending the "Get Asynchronous Configuration message" (OFPT_GET_ASYNC_REQUEST) to the switch. How to do it... The message format to be used to get the asynchronous configuration message (OFPT_GET_ASYNC_REQUEST) doesn't have any body other than ofp_header. On receiving this OFPT_GET_ASYNC_REQUEST message, the switch should respond with the OFPT_GET_ASYNC_REPLY message. The switch should fill the property list with the list of asynchronous configuration events / property types that the relevant controller channel is preconfigured to receive. The switch should get this information from its internal data structures. Refer to the process_async_config_request() function in the of/openflow.c file for the procedure to process the get asynchronous configuration request message from the controller. Sending a packet-in message to the controller Packet-in messages (OFP_PACKET_IN) are sent from the switch to the controller to transfer a packet received from one of the switch-ports to the controller for further processing. 
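Before getting into the packet-in details, here is a minimal Python sketch of the error-message rules above: reuse the xid of the offending request and attach at least the first 64 bytes of it, or the whole message if it is shorter. The constants and framing are illustrative assumptions, not the send_error_message() implementation in of/openflow.c.

import struct

OFP_VERSION = 0x04
OFPT_ERROR = 1
OFPET_BAD_REQUEST = 1      # example error type
OFPBRC_BAD_TYPE = 1        # example error code for that type

def build_error(failed_request, err_type=OFPET_BAD_REQUEST, err_code=OFPBRC_BAD_TYPE):
    # Reuse the xid of the offending request so the controller can correlate.
    _, _, _, xid = struct.unpack('!BBHI', failed_request[:8])
    # At least 64 bytes of the failed message, or the full message if shorter.
    data = failed_request[:64]
    length = 8 + 4 + len(data)
    header = struct.pack('!BBHI', OFP_VERSION, OFPT_ERROR, length, xid)
    return header + struct.pack('!HH', err_type, err_code) + data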
By default, a packet-in message should be sent to all the controllers that are in equal (OFPCR_ROLE_EQUAL) and master (OFPCR_ROLE_MASTER) roles. This message should not be sent to controllers that are in the slave state. There are three ways by which the switch can send a packet-in event to the controller: Table-miss entry: When there is no matching flow entry for the incoming packet, the switch can send the packet to the controller. TTL checking: When the TTL value in a packet reaches zero, the switch can send the packet to the controller. The "send to the controller" action in the matching entry (either the flow table entry or the group table entry) of the packet. How to do it... When the switch wants to send a packet received in its data path to the controller, the following message format should be used: /* Packet received on port (datapath -> the controller). */ struct ofp_packet_in { struct ofp_header header; uint32_t buffer_id; /* ID assigned by datapath. */ uint16_t total_len; /* Full length of frame. */ uint8_t reason;     /* Reason packet is being sent                      * (one of OFPR_*) */ uint8_t table_id;   /* ID of the table that was looked up */ uint64_t cookie;   /* Cookie of the flow entry that was                      * looked up. */ struct ofp_match match; /* Packet metadata. Variable size. */ /* The variable size and padded match is always followed by: * - Exactly 2 all-zero padding bytes, then * - An Ethernet frame whose length is inferred from header.length. * The padding bytes preceding the Ethernet frame ensure that IP * header (if any) following the Ethernet header is 32-bit aligned. */ uint8_t pad[2]; /* Align to 64 bit + 16 bit */ uint8_t data[0]; /* Ethernet frame */ }; The buffer-id field should be set to the opaque value generated by the switch. When the packet is buffered, the data portion of the packet-in message should contain some bytes of data from the incoming packet. If the packet is sent to the controller because of the "send to the controller" action of a table entry, then the max_len field of ofp_action_output should be used as the size of the packet to be included in the packet-in message. If the packet is sent to the controller for any other reason, then the miss_send_len field of the OFPT_SET_CONFIG message should be used to determine the size of the packet. If the packet is not buffered, either because of unavailability of buffers or an explicit configuration via OFPCML_NO_BUFFER, then the entire packet should be included in the data portion of the packet-in message with the buffer-id value as OFP_NO_BUFFER. The date field should be set to the complete packet or a fraction of the packet. The total_length field should be set to the length of the packet included in the data field. The reason field should be set with any one of the following values defined in the enumeration, based on the context that triggers the packet-in event: /* Why is this packet being sent to the controller? */ enum ofp_packet_in_reason { OFPR_TABLE_MISS = 0,   /* No matching flow (table-miss                        * flow entry). */ OFPR_APPLY_ACTION = 1, /* Output to the controller in                        * apply-actions. */ OFPR_INVALID_TTL = 2, /* Packet has invalid TTL */ OFPR_ACTION_SET = 3,   /* Output to the controller in action set. */ OFPR_GROUP = 4,       /* Output to the controller in group bucket. */ OFPR_PACKET_OUT = 5,   /* Output to the controller in packet-out. 
*/ }; If the packet-in message was triggered by the flow-entry "send to the controller" action, then the cookie field should be set with the cookie of the flow entry that caused the packet to be sent to the controller. This field should be set to -1 if the cookie cannot be associated with a particular flow. When the packet-in message is triggered by the "send to the controller" action of a table entry, there is a possibility that some changes have already been applied over the packet in previous stages of the pipeline. This information needs to be carried along with the packet-in message, and it can be carried in the match field of the packet-in message with a set of OXM (short for OpenFlow Extensible Match) TLVs. If the switch includes an OXM TLV in the packet-in message, then the match field should contain a set of OXM TLVs that include context fields. The standard context fields that can be added into the OXL TLVs are OFPXMT_OFB_IN_PORT, OFPXMT_OFB_IN_PHY_PORT, OFPXMT_OFB_METADATA, and OFPXMT_OFB_TUNNEL_ID. When the switch receives the packet in the physical port, and this packet information needs to be carried in the packet-in message, then OFPXMT_OFB_IN_PORT and OFPXMT_OFB_IN_PHY_PORT should have the same value, which is the OpenFlow port number of that physical port. When the switch receives the packet in the logical port and this packet information needs to be carried in the packet-in message, then the switch should set the logical port's port number in OFPXMT_OFB_IN_PORT and the physical port's port number in OFPXMT_OFB_IN_PHY_PORT. For example, consider a packet received on a tunnel interface defined over a Link Aggregation Group (LAG) with two member ports. Then the packet-in message should carry the tunnel interface's port_no to the OFPXMT_OFB_IN_PORT field and the physical interface's port_no to the OFPXMT_OFB_IN_PHY_PORT field. Refer to the send_packet_in_message() function in the of/openflow.c file for the procedure to send a packet-in message event to the controller. How it works... The switch can send either the entire packet it receives from the switch port to the controller, or a fraction of the packet to the controller. When the switch is configured to send only a fraction of the packet, it should buffer the packet in its memory and send a portion of packet data. This is controlled by the switch configuration. If the switch is configured to buffer the packet, and it has sufficient memory to buffer the packet, then the packet-in message should contain the following: A fraction of the packet. This is the size of the packet to be included in the packet-in message, configured via the switch configuration message. By default, it is 128 bytes. When the packet-in message is resulted by a table-entry action, then the output action itself can specify the size of the packet to be sent to the controller. For all other packet-in messages, it is defined in the switch configuration. The buffer ID to be used by the controller when the controller wants to forward the message at a later point in time. There's more… The switch that implements buffering should be expected to expose some details, such as the amount of available buffers, the period of time the buffered data will be available, and so on, through documentation. The switch should implement the procedure to release the buffered packet when there is no response from the controller to the packet-in event. 
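The buffering rules above reduce to a small decision: which buffer_id to advertise and how many bytes of the frame to attach. The Python sketch below is only an illustration of that logic (the buffer_packet callback and the helper itself are assumptions, not the of/openflow.c code); it uses max_len from the output action when the packet-in is action-triggered and miss_send_len otherwise.

OFP_NO_BUFFER = 0xffffffff
OFPCML_NO_BUFFER = 0xffff
OFPR_TABLE_MISS = 0
OFPR_APPLY_ACTION = 1

def packet_in_payload(frame, reason, miss_send_len,
                      action_max_len=None, buffer_packet=None):
    """Return (buffer_id, data) for a packet-in message.

    buffer_packet is an assumed callback that stores the full frame and
    returns an opaque buffer id, or None when no buffer space is available.
    """
    # Pick the configured limit for this trigger.
    if reason == OFPR_APPLY_ACTION and action_max_len is not None:
        limit = action_max_len
    else:
        limit = miss_send_len

    # Explicitly unbuffered, or no buffering support: send the whole frame.
    if limit == OFPCML_NO_BUFFER or buffer_packet is None:
        return OFP_NO_BUFFER, frame

    buffer_id = buffer_packet(frame)
    if buffer_id is None:                  # out of buffer space
        return OFP_NO_BUFFER, frame
    return buffer_id, frame[:limit]        # buffered: attach only a fraction

A buffered entry would also need the release timer mentioned above, so that stored frames are freed if the controller never responds to the packet-in event.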
Sending a flow-removed message to the controller A flow-removed message (OFPT_FLOW_REMOVED) is sent from the switch to the controller when a flow entry is removed from the flow table. This message should be sent to the controller only when the OFPFF_SEND_FLOW_REM flag in the flow entry is set. The switch should send this message only to the controller channel wherein the controller requested the switch to send this event. The controller can express its interest in receiving this event by sending the switch configuration message to the switch. By default, OFPT_FLOW_REMOVED should be sent to all the controllers that are in equal (OFPCR_ROLE_EQUAL) and master (OFPCR_ROLE_MASTER) roles. This message should not be sent to a controller that is in the slave state. How to do it... When the switch removes an entry from the flow table, it should build an OFPT_FLOW_REMOVED message with the following format and send this message to the controllers that have already shown interest in this event: /* Flow removed (datapath -> the controller). */ struct ofp_flow_removed { struct ofp_header header; uint64_t cookie;       /* Opaque the controller-issued identifier. */ uint16_t priority;     /* Priority level of flow entry. */ uint8_t reason;         /* One of OFPRR_*. */ uint8_t table_id;       /* ID of the table */ uint32_t duration_sec; /* Time flow was alive in seconds. */ uint32_t duration_nsec; /* Time flow was alive in nanoseconds                          * beyond duration_sec. */ uint16_t idle_timeout; /* Idle timeout from original flow mod. */ uint16_t hard_timeout; /* Hard timeout from original flow mod. */ uint64_t packet_count; uint64_t byte_count; struct ofp_match match; /* Description of fields.Variable size. */ }; The cookie field should be set with the cookie of the flow entry, the priority field should be set with the priority of the flow entry, and the reason field should be set with one of the following values defined in the enumeration: /* Why was this flow removed? */ enum ofp_flow_removed_reason { OFPRR_IDLE_TIMEOUT = 0,/* Flow idle time exceeded idle_timeout. */ OFPRR_HARD_TIMEOUT = 1, /* Time exceeded hard_timeout. */ OFPRR_DELETE = 2,       /* Evicted by a DELETE flow mod. */ OFPRR_GROUP_DELETE = 3, /* Group was removed. */ OFPRR_METER_DELETE = 4, /* Meter was removed. */ OFPRR_EVICTION = 5,     /* the switch eviction to free resources. */ }; The duration_sec and duration_nsec should be set with the elapsed time of the flow entry in the switch. The total duration in nanoseconds can be computed as duration_sec*109 + duration_nsec. All the other fields, such as idle_timeout, hard_timeoutm, and so on, should be set with the appropriate value from the flow entry, that is, these values can be directly copied from the flow mode that created this entry. The packet_count and byte_count should be set with the number of packet count and the byte count associated with the flow entry, respectively. If the values are not available, then these fields should be set with the maximum possible value. Refer to the send_flow_removed_message() function in the of/openflow.c file for the procedure to send a flow removed event message to the controller. Sending a port-status message to the controller Port-status messages (OFPT_PORT_STATUS) are sent from the switch to the controller when there is any change in the port status or when a new port is added, removed, or modified in the switch's data path. 
The switch should send this message only to the controller channel that the controller requested the switch to send it. The controller can express its interest to receive this event by sending an asynchronous configuration message to the switch. By default, the port-status message should be sent to all configured controllers in the switch, including the controller in the slave role (OFPCR_ROLE_SLAVE). How to do it... The switch should construct an OFPT_PORT_STATUS message with the following format and send this message to the controllers that have already shown interest in this event: /* A physical port has changed in the datapath */ struct ofp_port_status { struct ofp_header header; uint8_t reason; /* One of OFPPR_*. */ uint8_t pad[7]; /* Align to 64-bits. */ struct ofp_port desc; }; The reason field should be set to one of the following values as defined in the enumeration: /* What changed about the physical port */ enum ofp_port_reason { OFPPR_ADD = 0,   /* The port was added. */ OFPPR_DELETE = 1, /* The port was removed. */ OFPPR_MODIFY = 2, /* Some attribute of the port has changed. */ }; The desc field should be set to the port description. In the port description, all properties need not be filled by the switch. The switch should fill the properties that have changed, whereas the unchanged properties can be included optionally. Refer to the send_port_status_message() function in the of/openflow.c file for the procedure to send port_status_message to the controller. Sending a controller role-status message to the controller Controller role-status messages (OFPT_ROLE_STATUS) are sent from the switch to the set of controllers when the role of a controller is changed as a result of an OFPT_ROLE_REQUEST message. For example, if there are three the controllers connected to a switch (say controller1, controller2, and controller3) and controller1 sends an OFPT_ROLE_REQUEST message to the switch, then the switch should send an OFPT_ROLE_STATUS message to controller2 and controller3. How to do it... The switch should build the OFPT_ROLE_STATUS message with the following format and send it to all the other controllers: /* Role status event message. */ struct ofp_role_status { struct ofp_header header; /* Type OFPT_ROLE_REQUEST /                            * OFPT_ROLE_REPLY. */ uint32_t role;           /* One of OFPCR_ROLE_*. */ uint8_t reason;           /* One of OFPCRR_*. */ uint8_t pad[3];           /* Align to 64 bits. */ uint64_t generation_id;   /* Master Election Generation Id */ /* Role Property list */ struct ofp_role_prop_header properties[0]; }; The reason field should be set with one of the following values as defined in the enumeration: /* What changed about the controller role */ enum ofp_controller_role_reason { OFPCRR_MASTER_REQUEST = 0, /* Another the controller asked                            * to be master. */ OFPCRR_CONFIG = 1,         /* Configuration changed on the                            * the switch. */ OFPCRR_EXPERIMENTER = 2,   /* Experimenter data changed. */ }; The role should be set to the new role of the controller. The generation_id should be set with the generation ID of the OFPT_ROLE_REQUEST message that triggered the OFPT_ROLE_STATUS message. If the reason code is OFPCRR_EXPERIMENTER, then the role property list should be set in the following format: /* Role property types. */ enum ofp_role_prop_type { OFPRPT_EXPERIMENTER = 0xFFFF, /* Experimenter property. 
*/ };   /* Experimenter role property */ struct ofp_role_prop_experimenter { uint16_t type;         /* One of OFPRPT_EXPERIMENTER. */ uint16_t length;       /* Length in bytes of this property. */ uint32_t experimenter; /* Experimenter ID which takes the same                        * form as struct                        * ofp_experimenter_header. */ uint32_t exp_type;     /* Experimenter defined. */ /* Followed by: * - Exactly (length - 12) bytes containing the experimenter data, * - Exactly (length + 7)/8*8 - (length) (between 0 and 7) * bytes of all-zero bytes */ uint32_t experimenter_data[0]; }; The experimenter field in the experimenter ID should take the same format as the experimenter structure. Refer to the send_role_status_message() function in the of/openflow.c file for the procedure to send a role status message to the controller. Sending a table-status message to the controller Table-status messages (OFPT_TABLE_STATUS) are sent from the switch to the controller when there is any change in the table status; for example, the number of entries in the table crosses the threshold value, called the vacancy threshold. The switch should send this message only to the controller channel in which the controller requested the switch to send it. The controller can express its interest to receive this event by sending the asynchronous configuration message to the switch. How to do it... The switch should build an OFPT_TABLE_STATUS message with the following format and send this message to the controllers that have already shown interest in this event: /* A table config has changed in the datapath */ struct ofp_table_status { struct ofp_header header; uint8_t reason;             /* One of OFPTR_*. */ uint8_t pad[7];             /* Pad to 64 bits */ struct ofp_table_desc table; /* New table config. */ }; The reason field should be set with one of the following values defined in the enumeration: /* What changed about the table */ enum ofp_table_reason { OFPTR_VACANCY_DOWN = 3, /* Vacancy down threshold event. */ OFPTR_VACANCY_UP = 4,   /* Vacancy up threshold event. */ }; When the number of free entries in the table crosses the vacancy_down threshold, the switch should set the reason code as OFPTR_VACANCY_DOWN. Once the vacancy_down event is generated by the switch, the switch should not generate any further vacancy down event until a vacancy up event is generated. When the number of free entries in the table crosses the vacancy_up threshold value, the switch should set the reason code as OFPTR_VACANCY_UP. Again, once the vacancy up event is generated by the switch, the switch should not generate any further vacancy up event until a vacancy down event is generated. The table field should be set with the table description. Refer to the send_table_status_message() function in the of/openflow.c file for the procedure to send a table status message to the controller. Sending a request-forward message to the controller When a the switch receives a modify request message from the controller to modify the state of a group or meter entries, after successful modification of the state, the switch should forward this request message to all other controllers as a request forward message (OFPT_REQUESTFORWAD). The switch should send this message only to the controller channel in which the controller requested the switch to send this event. The controller can express its interest to receive this event by sending an asynchronous configuration message to the switch. How to do it... 
The switch should build the OFPT_REQUESTFORWAD message with the following format, and send this message to the controllers that have already shown interest in this event: /* Group/Meter request forwarding. */ struct ofp_requestforward_header { struct ofp_header header; /* Type OFPT_REQUESTFORWARD. */ struct ofp_header request; /* Request being forwarded. */ }; The request field should be set with the request that received from the controller. Refer to the send_request_forward_message() function in the of/openflow.c file for the procedure to send request_forward_message to the controller. Handling a packet-out message from the controller Packet-out (OFPT_PACKET_OUT) messages are sent from the controller to the switch when the controller wishes to send a packet out through the switch's data path via a switch port. How to do it... There are two ways in which the controller can send a packet-out message to the switch: Construct the full packet: In this case, the controller generates the complete packet and adds the action list field to the packet-out message. The action field contains a list of actions defining how the packet should be processed by the switch. If the switch receives a packet_out message with buffer_id set as OFP_NO_BUFFER, then the switch should look into the action list, and based on the action to be performed, it can do one of the following: Modify the packet and send it via the switch port mentioned in the action list Hand over the packet to OpenFlow's pipeline processing, based on the OFPP_TABLE specified in the action list Use a packet buffer in the switch: In this mechanism, the switch should use the buffer that was created at the time of sending the packet-in message to the controller. While sending the packet_in message to the controller, the switch adds the buffer_id to the packet_in message. When the controller wants to send a packet_out message that uses this buffer, the controller includes this buffer_id in the packet_out message. On receiving the packet_out message with a valid buffer_id, the switch should fetch the packet from the buffer and send it via the switch port. Once the packet is sent out, the switch should free the memory allocated to the buffer, which was cached. Handling a barrier message from the controller The switch implementation could arbitrarily reorder the message sent from the controller to maximize its performance. So, if the controller wants to enforce the processing of the messages in order, then barrier messages are used. Barrier messages (OFPT_TABLE_STATUS) are sent from the controller to the switch to ensure message ordering. The switch should not reorder any messages across the barrier message. For example, if the controller is sending a group add message, followed by a flow add message referencing the group, then the message order should be preserved in the barrier message. How to do it... When the controller wants to send messages that are related to each other, it sends a barrier message between these messages. The switch should process these messages as follows: Messages before a barrier request should be processed fully before the barrier, including sending any resulting replies or errors. The barrier request message should then be processed and a barrier reply should be sent. While sending the barrier reply message, the switch should copy the xid value from the barrier request message. The switch should process the remaining messages. Both the barrier request and barrier reply messages don't have any body. 
They only have the ofp_header. Summary This article covers the list of symmetric and asynchronous messages sent and received by the OpenFlow switch, along with the procedure for handling these messages. Resources for Article: Further resources on this subject: The OpenFlow Controllers [article] Untangle VPN Services [article] Getting Started [article]
Creating and Using Templates with Cacti 0.8

Packt
10 Nov 2009
3 min read
For example, let's say we have a network of four Linux servers, two Unix servers, and one Cisco router. Here, if we use a template, we will need to make only three different templates: one for the Linux servers, one for the Unix servers, and one for the Cisco router. You may ask why we have to make a template for the Cisco router? We will make it so that we can use it later. Cacti templates can be imported and exported via the Console under Import/Export. You can only import templates that have been created on a system that is at the same, or an earlier, version of Cacti. At the end of this article, there is a list of third-party templates that can be imported. Types of Cacti templates Cacti templates are broken into two areas: Graph templates Host templates In the above figure, under the Templates section, you will see the three types of templates that come with the Cacti basic installation. If you click on one of those links, you will get the complete list of templates for that type. Graph templates Graphs are used to visualize the data you have collected. A graph template provides a skeleton for an actual graph. When you have more then one system/device, a graph template will save you lots of time and also reduce the possibility of error. Any parameters defined within a graph template are copied to all the graphs that are created using this template. Creating a graph template Now, we are going to create a new graph template. Under the Templates heading, click on Graph Templates. A list of the already available graph templates will be shown. Click on the Add link in the top right corner. You will get a screen like this: Here, you have to give the values that will be used in future to create the graph. If you have checked Use Per-Graph Value (Ignore this Value), then every time while using this graph template to create a graph, you have to give an input for this option. It's always best to enable this option for title field. After filling all these fields, click on the Create button. The graph template will be created. Now, we need to add a Graph Template Item and Graph Item Inputs to complete this graph template. Graph Template Item Graph template items are the various items that will be shown on the graph. To add a graph template item, click on Add on the right side of the Graph Template Items box. You will get this page. The following are the fields that can be filled in: When creating a graph item, you must always start with an AREA item before using STACK; otherwise, your graph will not render. Graph Item Inputs The second box is Graph Item Inputs. Graph item inputs are the input source through which the data will be collected. To add a new graph item input, click on the Add link on the right side of the Graph Item Inputs box. Below are the various fields that have to be filled out for a graph input item: After completing all these fields, click on the Create button. Do this again to add more graph item inputs to this graph template.
Implement Long-short Term Memory (LSTM) with TensorFlow

Gebin George
06 Mar 2018
4 min read
This article is an excerpt from the book Deep Learning Essentials, written by Wei Di, Anurag Bhardwaj, and Jianing Wei. The book will help you get started with the essentials of deep learning and neural network modeling.

In today's tutorial, we will look at an example of using LSTM in TensorFlow to perform sentiment classification. The input to the LSTM will be a sentence or sequence of words. The output of the LSTM will be a binary value indicating a positive sentiment with 1 and a negative sentiment with 0. We will use a many-to-one LSTM architecture for this problem, since it maps multiple inputs onto a single output. The figure LSTM: Basic cell architecture shows this architecture in more detail. As shown there, the input is a sequence of word tokens (in this case, a sequence of three words). Each word token is input at a new time step and is fed to the hidden state for the corresponding time step. For example, the word Book is input at time step t and is fed to the hidden state ht.

Sentiment analysis

To implement this model in TensorFlow, we need to first define a few variables as follows:

batch_size = 4
lstm_units = 16
num_classes = 2
max_sequence_length = 4
embedding_dimension = 64
num_iterations = 1000

As shown previously, batch_size dictates how many sequences of tokens we can input in one batch for training. lstm_units represents the total number of LSTM cells in the network. max_sequence_length represents the maximum possible length of a given sequence. Once defined, we now proceed to initialize TensorFlow-specific data structures for the input data as follows:

import tensorflow as tf

labels = tf.placeholder(tf.float32, [batch_size, num_classes])
raw_data = tf.placeholder(tf.int32, [batch_size, max_sequence_length])

Given that we are working with word tokens, we would like to represent them using a good feature representation technique. Let us assume the word embedding representation takes a word token and projects it onto an embedding space of dimension embedding_dimension. The two-dimensional input data containing raw word tokens is now transformed into a three-dimensional word tensor, with the added dimension representing the word embedding. We also use pre-computed word embeddings, stored in a word_vectors data structure. We initialize the data structures as follows:

data = tf.Variable(tf.zeros([batch_size, max_sequence_length, embedding_dimension]), dtype=tf.float32)
data = tf.nn.embedding_lookup(word_vectors, raw_data)

Now that the input data is ready, we look at defining the LSTM model. As shown previously, we need to create lstm_units of a basic LSTM cell. Since we need to perform a classification at the end, we wrap the LSTM unit with a dropout wrapper. To perform a full temporal pass of the data on the defined network, we unroll the LSTM using the dynamic_rnn routine of TensorFlow.
We also initialize a random weight matrix and a constant value of 0.1 as the bias vector, as follows:

weight = tf.Variable(tf.truncated_normal([lstm_units, num_classes]))
bias = tf.Variable(tf.constant(0.1, shape=[num_classes]))
lstm_cell = tf.contrib.rnn.BasicLSTMCell(lstm_units)
wrapped_lstm_cell = tf.contrib.rnn.DropoutWrapper(cell=lstm_cell, output_keep_prob=0.8)
output, state = tf.nn.dynamic_rnn(wrapped_lstm_cell, data, dtype=tf.float32)

Once the output is generated by the dynamically unrolled RNN, we transpose its shape, multiply it by the weight matrix, and add the bias vector to it to compute the final prediction value:

output = tf.transpose(output, [1, 0, 2])
last = tf.gather(output, int(output.get_shape()[0]) - 1)
prediction = (tf.matmul(last, weight) + bias)
weight = tf.cast(weight, tf.float64)
last = tf.cast(last, tf.float64)
bias = tf.cast(bias, tf.float64)

Since the initial prediction needs to be refined, we define an objective function with cross-entropy to minimize the loss as follows:

loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction, labels=labels))
optimizer = tf.train.AdamOptimizer().minimize(loss)

After this sequence of steps, we have a trained, end-to-end LSTM network for sentiment classification of arbitrary-length sentences. To summarize, we saw how effectively we can implement an LSTM network using TensorFlow. If you are interested in knowing more, check out the book Deep Learning Essentials, which will help you take your first steps in training efficient deep learning models and applying them in various practical scenarios.
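As an addendum to the excerpt: the code above stops at graph construction, so a training loop is still needed to actually fit the network. The sketch below is one plausible way to drive the graph for num_iterations steps; the get_next_batch() helper is hypothetical (it must return token IDs shaped [batch_size, max_sequence_length] and one-hot labels shaped [batch_size, num_classes]), and this loop is not part of the book's code.

# Hypothetical helper, assumed to exist:
# get_next_batch(batch_size) -> (token_ids, one_hot_labels)

correct_prediction = tf.equal(tf.argmax(prediction, 1), tf.argmax(labels, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(num_iterations):
        batch_tokens, batch_labels = get_next_batch(batch_size)
        feed = {raw_data: batch_tokens, labels: batch_labels}
        _, batch_loss = sess.run([optimizer, loss], feed_dict=feed)
        if step % 100 == 0:
            batch_accuracy = sess.run(accuracy, feed_dict=feed)
            print("step %d, loss %.4f, accuracy %.2f"
                  % (step, batch_loss, batch_accuracy))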
Metaprogramming and the Groovy MOP

Packt
31 May 2010
6 min read
(For more resources on Groovy DSL, see here.)

In a nutshell, the term metaprogramming refers to writing code that can dynamically change its behavior at runtime. A Meta-Object Protocol (MOP) refers to the capabilities in a dynamic language that enable metaprogramming. In Groovy, the MOP consists of four distinct capabilities within the language: reflection, metaclasses, categories, and expandos. The MOP is at the core of what makes Groovy so useful for defining DSLs. The MOP is what allows us to bend the language in different ways in order to meet our needs, by changing the behavior of classes on the fly. This section will guide you through the capabilities of the MOP.

Reflection

To use Java reflection, we first need to access the Class object for any Java object in which we are interested, through its getClass() method. Using the returned Class object, we can query everything from the list of methods or fields of the class to the modifiers that the class was declared with. Below, we see some of the ways that we can access a Class object in Java and the methods we can use to inspect the class at runtime.

import java.lang.reflect.Field;
import java.lang.reflect.Method;

public class Reflection {
    public static void main(String[] args) {
        String s = new String();
        Class sClazz = s.getClass();
        Package _package = sClazz.getPackage();
        System.out.println("Package for String class: ");
        System.out.println(" " + _package.getName());

        Class oClazz = Object.class;
        System.out.println("All methods of Object class:");
        Method[] methods = oClazz.getMethods();
        for (int i = 0; i < methods.length; i++)
            System.out.println(" " + methods[i].getName());

        try {
            Class iClazz = Class.forName("java.lang.Integer");
            Field[] fields = iClazz.getDeclaredFields();
            System.out.println("All fields of Integer class:");
            for (int i = 0; i < fields.length; i++)
                System.out.println(" " + fields[i].getName());
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
        }
    }
}

We can access the Class object from an instance by calling its Object.getClass() method. If we don't have an instance of the class to hand, we can get the Class object by using .class after the class name, for example, String.class. Alternatively, we can call the static Class.forName, passing to it a fully-qualified class name. Class has numerous methods, such as getPackage(), getMethods(), and getDeclaredFields(), that allow us to interrogate the Class object for details about the Java class under inspection. The preceding example will output various details about the String, Object, and Integer classes.

Groovy Reflection shortcuts

Groovy, as we would expect by now, provides shortcuts that let us reflect classes easily. In Groovy, we can shortcut the getClass() method as a property access .class, so we can access the class object in the same way whether we are using the class name or an instance. We can treat the .class as a String, and print it directly without calling Class.getName(), as follows:

The variable greeting is declared with a dynamic type, but has the type java.lang.String after the "Hello" String is assigned to it. Classes are first-class objects in Groovy, so we can assign String to a variable. When we do this, the object that is assigned is of type java.lang.Class. However, it describes the String class itself, so printing it will report java.lang.String. Groovy also provides shortcuts for accessing packages, methods, fields, and just about all other reflection details that we need from a class.
We can access these straight off the class identifier, as follows:

println "Package for String class"
println " " + String.package
println "All methods of Object class:"
Object.methods.each { println " " + it }
println "All fields of Integer class:"
Integer.fields.each { println " " + it }

Incredibly, these six lines of code do all of the same work as the 30 lines in our Java example. If we look at the preceding code, it contains nothing that is more complicated than it needs to be. Referencing String.package to get the Java package of a class is as succinct as you can make it. As usual, String.methods and String.fields return Groovy collections, so we can apply a closure to each element with the each method. What's more, the Groovy version outputs a lot more useful detail about the package, methods, and fields. When using an instance of an object, we can use the same shortcuts through the class field of the instance.

def greeting = "Hello"
assert greeting.class.package == String.package

Expandos

An Expando is a dynamic representation of a typical Groovy bean. Expandos support typical get and set style bean access, but in addition to this they will accept gets and sets to arbitrary properties. If we try to access a non-existing property, the Expando does not mind: instead of causing an exception, it will return null. If we set a non-existent property, the Expando will add that property and set the value. In order to create an Expando, we instantiate an object of class groovy.util.Expando.

def customer = new Expando()
assert customer.properties == [:]
assert customer.id == null
assert customer.properties == [:]

customer.id = 1001
customer.firstName = "Fred"
customer.surname = "Flintstone"
customer.street = "1 Rock Road"

assert customer.id == 1001
assert customer.properties == [ id:1001, firstName:'Fred', surname:'Flintstone', street:'1 Rock Road']

customer.properties.each { println it }

The id field of customer is accessible on the Expando shown in the preceding example even when it does not exist as a property of the bean. Once a property has been set, it can be accessed by using the normal field getter: for example, customer.id. Expandos are a useful extension to normal beans where we need to be able to dump arbitrary properties into a bag and we don't want to write a custom class to do so.

A neat trick with Expandos is what happens when we store a closure in a property. As we would expect, an Expando closure property is accessible in the same way as a normal property. However, because it is a closure, we can apply function call syntax to it to invoke the closure. This has the effect of seeming to add a new method on the fly to the Expando.

customer.prettyPrint = {
    println "Customer has following properties"
    customer.properties.each {
        if (it.key != 'prettyPrint')
            println " " + it.key + ": " + it.value
    }
}
customer.prettyPrint()

Here we appear to be able to add a prettyPrint() method to the customer object, which outputs to the console:

Customer has following properties
 surname: Flintstone
 street: 1 Rock Road
 firstName: Fred
 id: 1001
Key trends in software infrastructure in 2019: observability, chaos, and cloud complexity

Richard Gall
17 Dec 2018
10 min read
Software infrastructure has, over the last decade or so, become a key concern for developers of all stripes. Long gone are narrowly defined job roles; thanks to DevOps, accountability for how code is now shared between teams on both development and deployment sides. For anyone that’s ever been involved in the messy frustration of internal code wars, this has been a welcome change. But as developers who have traditionally sat higher up the software stack dive deeper into the mechanics of deploying and maintaining software, for those of us working in system administration, DevOps, SRE, and security (the list is endless, apologies if I’ve forgotten you), the rise of distributed systems only brings further challenges. Increased complexity not only opens up new points of failure and potential vulnerability, at a really basic level it makes understanding what’s actually going on difficult. And, essentially, this is what it will mean to work in software delivery and maintenance in 2019. Understanding what’s happening, minimizing downtime, taking steps to mitigate security threats - it’s a cliche, but finding strategies to become more responsive rather than reactive will be vital. Indeed, many responses to these kind of questions have emerged this year. Chaos engineering and observability, for example, have both been gaining traction within the SRE world, and are slowly beginning to make an impact beyond that particular job role. But let’s take a deeper look at what is really going to matter in the world of software infrastructure and architecture in 2019. Observability and the rise of the service mesh Before we decide what to actually do, it’s essential to know what’s actually going on. That seems obvious, but with increasing architectural complexity, that’s getting harder. Observability is a term that’s being widely thrown around as a response to this - but it has been met with some cynicism. For some developers, observability is just a sexed up way of talking about good old fashioned monitoring. But although the two concepts have a lot in common, observability is more of an approach, a design pattern maybe, rather than a specific activity. This post from The New Stack explains the difference between monitoring and observability incredibly well. Observability is “a measure of how well internal states of a system can be inferred from knowledge of its external outputs.” which means observability is more a property of a system, rather than an activity. There are a range of tools available to help you move towards better observability. Application management and logging tools like Splunk, Datadog, New Relic and Honeycomb can all be put to good use and are a good first step towards developing a more observable system. Want to learn how to put monitoring tools to work? Check out some of these titles: AWS Application Architecture and Management [Video]     Hands on Microservices Monitoring and Testing       Software Architecture with Spring 5.0      As well as those tools, if you’re working with containers, Kubernetes has some really useful features that can help you more effectively monitor your container deployments. In May, Google announced StackDriver Kubernetes Monitoring, which has seen much popularity across the community. Master monitoring with Kubernetes. 
Explore these titles: Google Cloud Platform Administration     Mastering Kubernetes      Kubernetes in 7 Days [Video]        But there’s something else emerging alongside observability which only appears to confirm it’s importance: that thing is the notion of a service mesh. The service mesh is essentially a tool that allows you to monitor all the various facets of your software infrastructure helping you to manage everything from performance to security to reliability. There are a number of different options out there when it comes to service meshes - Istio, Linkerd, Conduit and Tetrate being the 4 definitive tools out there at the moment. Learn more about service meshes inside these titles: Microservices Development Cookbook     The Ultimate Openshift Bootcamp [Video]     Cloud Native Application Development with Java EE [Video]       Why is observability important? Observability is important because it sets the foundations for many aspects of software management and design in various domains. Whether you’re an SRE or security engineer, having visibility on the way in which your software is working will be essential in 2019. Chaos engineering Observability lays the groundwork for many interesting new developments, chaos engineering being one of them. Based on the principle that modern, distributed software is inherently unreliable, chaos engineering ‘stress tests’ software systems. The word ‘chaos’ is a bit of a misnomer. All testing and experimentation on your software should follow a rigorous and almost scientific structure. Using something called chaos experiments - adding something unexpected into your system, or pulling a piece of it out like a game of Jenga - chaos engineering helps you to better understand the way it will act in various situations. In turn, this allows you to make the necessary changes that can help ensure resiliency. Chaos engineering is particularly important today simply because so many people, indeed, so many things, depend on software to actually work. From an eCommerce site to a self driving car, if something isn’t working properly there could be terrible consequences. It’s not hard to see how chaos engineering fits alongside something like observability. To a certain extent, it’s really another way of achieving observability. By running chaos experiments, you can draw out issues that may not be visible in usual scenarios. However, the caveat is that chaos engineering isn’t an easy thing to do. It requires a lot of confidence and engineering intelligence. Running experiments shouldn’t be done carelessly - in many ways, the word ‘chaos’ is a bit of a misnomer. All testing and experimentation on your software should follow a rigorous and almost scientific structure. While chaos engineering isn’t straightforward, there are tools and platforms available to make it more manageable. Gremlin is perhaps the best example, offering what they describe as ‘resiliency-as-a-service’. But if you’re not ready to go in for a fully fledged platform, it’s worth looking at open source tools like Chaos Monkey and ChaosToolkit. Want to learn how to put the principles of chaos engineering into practice? Check out this title: Microservice Patterns and Best Practices       Learn the principles behind resiliency with these SRE titles: Real-World SRE       Practical Site Reliability Engineering       Better integrated security and code testing Both chaos engineering and observability point towards more testing. 
And this shouldn't be surprising: testing is to be expected in a world where people are accountable for unpredictable systems. But what's particularly important is how testing is integrated. Whether it's for security or simply performance, we're gradually moving towards a world where testing is part of the build and deploy process, not completely isolated from it. There is a diverse range of tools that all hint at this move. Archery, for example, is a tool designed for both developers and security testers to better identify and assess security vulnerabilities at various stages of the development lifecycle. With a useful dashboard, it neatly ties into the wider trend of observability. ArchUnit (sounds similar but completely unrelated) is a Java testing library that allows you to test a variety of different architectural components. Similarly on the testing front, headless browsers continue to dominate. We've seen some of the major browsers bringing out headless browsers, which will no doubt delight many developers. Headless browsers allow developers to run front end tests on their code as if it were live and running in the browser. If this sounds a lot like PhantomJS, that's because it is actually quite a bit like PhantomJS. However, headless browsers do make the testing process much faster.

Smarter software purchasing and the move to hybrid cloud

The key trends we've seen in software architecture are about better understanding your software. But this level of insight and understanding doesn't matter if there's no alignment between key decision makers and purchasers. This can manifest itself in various ways. Essentially, it's a symptom of decision makers being disconnected from engineers buried deep in their software. This is by no means a new problem, but with cloud coming to define just about every aspect of software, it's now much easier for confusion to take hold. The best thing about cloud is also the worst thing - the huge scope of opportunities it opens up. It makes decision making a minefield - which provider should we use? What parts of it do we need? What's going to be most cost effective? Of course, with hybrid cloud, there's a clear way of meeting those issues. But it's by no means a silver bullet. Whatever cloud architecture you have, strong leadership and stakeholder management are essential. This is something that ThoughtWorks references in its most recent edition of Radar (November 2018). Identifying two trends they call 'bounded buy' and 'risk commensurate vendor strategy', ThoughtWorks highlights how organizations can find their SaaS of choice shaping their strategy in its own image (bounded buy) or look to outsource business critical applications, functions or services. ThoughtWorks explains: "This trade-off has become apparent as the major cloud providers have expanded their range of service offerings. For example, using AWS Secret Management Service can speed up initial development and has the benefit of ecosystem integration, but it will also add more inertia if you ever need to migrate to a different cloud provider than it would if you had implemented, for example, Vault". Relatedly, ThoughtWorks also identifies a problem with how organizations manage cost. In the report they discuss what they call 'run cost as architecture fitness function', which is really an elaborate way of saying - make sure you look at how much things cost. So, for example, don't use serverless blindly.
While it might look like a cheap option for smaller projects, your costs could quickly spiral and leave you spending more than you would if you ran it on a typical cloud server. Get to grips with hybrid cloud: Hybrid Cloud for Architects, Building Hybrid Clouds with Azure Stack. Become an effective software and solutions architect in 2019: AWS Certified Solutions Architect - Associate Guide, Architecting Cloud Computing Solutions, Hands-On Cloud Solutions with Azure

Software complexity needs are best communicated in a simple language: money

In practice, this takes us all the way back to the beginning - it's simply the financial underbelly of observability. Performance, visibility, resilience - these matter because they directly impact the bottom line. That might sound obvious, but if you're trying to make the case, say, for implementing chaos engineering, or using any other particular facet of a SaaS offering, communicating to other stakeholders in financial terms can give you buy-in and help to guarantee alignment. If 2019 should be about anything, it's getting closer to this fantasy of alignment. In the end, it will keep everyone happy - engineers and businesses.

Installing MAME4All (Intermediate)

Packt
27 Sep 2013
3 min read
(For more resources related to this topic, see here.) Getting ready You will need: A Raspberry Pi An SD card with the official Raspberry Pi OS, Raspbian, properly loaded A USB keyboard A USB mouse A 5V 1A power supply with Micro-USB connector A network connection And a screen hooked up to your Raspberry Pi How to do it... Perform the following steps for installing MAME4All: From the command line, enter startx to launch the desktop environment. From the desktop, launch the Pi Store application by double-clicking on the Pi Store icon. At the top-right of the application, there will be a Log In link. Click on this link and log in with your registered account. Type MAME4All in the search bar, and press Enter. Click on the MAME4All result. At the application's information page, click on the Download button on the right-hand side of the screen. MAME4All will automatically download, and a window will appear showing the installation process. Press any button to close the window once it has finished installing. MAME4All will look for your game files in the /usr/local/bin/indiecity/InstalledApps/MAME4ALL-pi/Full/roms directory. Perform the following steps for running MAME4All from the Pi Store: From the desktop, launch the Pi Store application by double-clicking on the Pi Store icon. At the top-right of the application, there will be a Log In link. Click on the link and log in with your registered account. Click on the My Library tab. Click on MAME4All, and then click on Launch. For running MAME4All from the command line, perform the following steps: Type cd /usr/local/bin/indiecity/InstalledApps/mame4all_pi/Full and press Enter. Type ./mame and press Enter for launching MAME4All. How it works... MAME4All is a Multi Arcade Machine Emulator that takes advantage of the Raspberry Pi's GPU to achieve very fast emulation of arcade machines. It is able to achieve this speed by compiling with DispManX, which offloads SDL code to the graphics core via OpenGL ES. When you run MAME4All, it looks for any game files you have in the roms directory and displays them in a menu for you to select from. If it doesn't find any files, it exits after a few seconds. The default keys for MAME4All-Pi are: 5 for inserting coins 1 for player 1 to start Arrow keys for player 1 joystick controls Ctrl, Alt, space bar, Z, X, and C for default action keys You can modify the MAME4All configuration by editing the /usr/local/bin/indiecity/InstalledApps/mame4all_pi/Full/mame.cfg file. There's more... A few useful reference links: For information on MAME project go to http://mamedev.org/ For information on MAME4All project go to http://code.google.com/p/mame4all-pi/ Summary In this article we saw how to install, launch, and play with a specially created version of MAME for the Raspberry Pi from the Pi Store Resources for Article: Further resources on this subject: Creating a file server (Samba) [Article] Webcam and Video Wizardry [Article] Coding with Minecraft [Article]

Man, Do I Like Templates!

Packt
07 Jul 2015
22 min read
In this article by Italo Maia, author of the book Building Web Applications with Flask, we will discuss what Jinja2 is, and how Flask uses Jinja2 to implement the View layer and awe you. Be prepared! (For more resources related to this topic, see here.) What is Jinja2 and how is it coupled with Flask? Jinja2 is a library found at http://jinja.pocoo.org/; you can use it to produce formatted text with bundled logic. Unlike the Python format function, which only allows you to replace markup with variable content, you can have a control structure, such as a for loop, inside a template string and use Jinja2 to parse it. Let's consider this example: from jinja2 import Template x = """ <p>Uncle Scrooge nephews</p> <ul> {% for i in my_list %} <li>{{ i }}</li> {% endfor %} </ul> """ template = Template(x) # output is an unicode string print template.render(my_list=['Huey', 'Dewey', 'Louie']) In the preceding code, we have a very simple example where we create a template string with a for loop control structure ("for tag", for short) that iterates over a list variable called my_list and prints the element inside a "li HTML tag" using curly braces {{ }} notation. Notice that you could call render in the template instance as many times as needed with different key-value arguments, also called the template context. A context variable may have any valid Python variable name—that is, anything in the format given by the regular expression [a-zA-Z_][a-zA-Z0-9_]*. For a full overview on regular expressions (Regex for short) with Python, visit https://docs.python.org/2/library/re.html. Also, take a look at this nice online tool for Regex testing http://pythex.org/. A more elaborate example would make use of an environment class instance, which is a central, configurable, extensible class that may be used to load templates in a more organized way. Do you follow where we are going here? This is the basic principle behind Jinja2 and Flask: it prepares an environment for you, with a few responsive defaults, and gets your wheels in motion. What can you do with Jinja2? Jinja2 is pretty slick. You can use it with template files or strings; you can use it to create formatted text, such as HTML, XML, Markdown, and e-mail content; you can put together templates, reuse templates, and extend templates; you can even use extensions with it. The possibilities are countless, and combined with nice debugging features, auto-escaping, and full unicode support. Auto-escaping is a Jinja2 configuration where everything you print in a template is interpreted as plain text, if not explicitly requested otherwise. Imagine a variable x has its value set to <b>b</b>. If auto-escaping is enabled, {{ x }} in a template would print the string as given. If auto-escaping is off, which is the Jinja2 default (Flask's default is on), the resulting text would be b. Let's understand a few concepts before covering how Jinja2 allows us to do our coding. First, we have the previously mentioned curly braces. 
Double curly braces are a delimiter that allows you to evaluate a variable or function from the provided context and print it into the template: from jinja2 import Template # create the template t = Template("{{ variable }}") # – Built-in Types – t.render(variable='hello you') >> u"hello you" t.render(variable=100) >> u"100" # you can evaluate custom classes instances class A(object): def __str__(self):    return "__str__" def __unicode__(self):    return u"__unicode__" def __repr__(self):    return u"__repr__" # – Custom Objects Evaluation – # __unicode__ has the highest precedence in evaluation # followed by __str__ and __repr__ t.render(variable=A()) >> u"__unicode__" In the preceding example, we see how to use curly braces to evaluate variables in your template. First, we evaluate a string and then an integer. Both result in a unicode string. If we evaluate a class of our own, we must make sure there is a __unicode__ method defined, as it is called during the evaluation. If a __unicode__ method is not defined, the evaluation falls back to __str__ and __repr__, sequentially. This is easy. Furthermore, what if we want to evaluate a function? Well, just call it: from jinja2 import Template # create the template t = Template("{{ fnc() }}") t.render(fnc=lambda: 10) >> u"10" # evaluating a function with argument t = Template("{{ fnc(x) }}") t.render(fnc=lambda v: v, x='20') >> u"20" t = Template("{{ fnc(v=30) }}") t.render(fnc=lambda v: v) >> u"30" To output the result of a function in a template, just call the function as any regular Python function. The function return value will be evaluated normally. If you're familiar with Django, you might notice a slight difference here. In Django, you do not need the parentheses to call a function, or even pass arguments to it. In Flask, the parentheses are always needed if you want the function return evaluated. The following two examples show the difference between Jinja2 and Django function call in a template: {# flask syntax #} {{ some_function() }}   {# django syntax #} {{ some_function }} You can also evaluate Python math operations. Take a look: from jinja2 import Template # no context provided / needed Template("{{ 3 + 3 }}").render() >> u"6" Template("{{ 3 - 3 }}").render() >> u"0" Template("{{ 3 * 3 }}").render() >> u"9" Template("{{ 3 / 3 }}").render() >> u"1" Other math operators will also work. You may use the curly braces delimiter to access and evaluate lists and dictionaries: from jinja2 import Template Template("{{ my_list[0] }}").render(my_list=[1, 2, 3]) >> u'1' Template("{{ my_list['foo'] }}").render(my_list={'foo': 'bar'}) >> u'bar' # and here's some magic Template("{{ my_list.foo }}").render(my_list={'foo': 'bar'}) >> u'bar' To access a list or dictionary value, just use normal plain Python notation. With dictionaries, you can also access a key value using variable access notation, which is pretty neat. Besides the curly braces delimiter, Jinja2 also has the curly braces/percentage delimiter, which uses the notation {% stmt %} and is used to execute statements, which may be a control statement or not. Its usage depends on the statement, where control statements have the following notation: {% stmt %} {% endstmt %} The first tag has the statement name, while the second is the closing tag, which has the name of the statement appended with end in the beginning. You must be aware that a non-control statement may not have a closing tag. 
Let's look at some examples: {% block content %} {% for i in items %} {{ i }} - {{ i.price }} {% endfor %} {% endblock %} The preceding example is a little more complex than what we have been seeing. It uses a control statement for loop inside a block statement (you can have a statement inside another), which is not a control statement, as it does not control execution flow in the template. Inside the for loop you see that the i variable is being printed together with the associated price (defined elsewhere). A last delimiter you should know is {# comments go here #}. It is a multi-line delimiter used to declare comments. Let's see two examples that have the same result: {# first example #} {# second example #} Both comment delimiters hide the content between {# and #}. As can been seen, this delimiter works for one-line comments and multi-line comments, what makes it very convenient. Control structures We have a nice set of built-in control structures defined by default in Jinja2. Let's begin our studies on it with the if statement. {% if true %}Too easy{% endif %} {% if true == true == True %}True and true are the same{% endif %} {% if false == false == False %}False and false also are the same{% endif %} {% if none == none == None %}There's also a lowercase None{% endif %} {% if 1 >= 1 %}Compare objects like in plain python{% endif %} {% if 1 == 2 %}This won't be printed{% else %}This will{% endif %} {% if "apples" != "oranges" %}All comparison operators work = ]{% endif %} {% if something %}elif is also supported{% elif something_else %}^_^{% endif %} The if control statement is beautiful! It behaves just like a python if statement. As seen in the preceding code, you can use it to compare objects in a very easy fashion. "else" and "elif" are also fully supported. You may also have noticed that true and false, non-capitalized, were used together with plain Python Booleans, True and False. As a design decision to avoid confusion, all Jinja2 templates have a lowercase alias for True, False, and None. By the way, lowercase syntax is the preferred way to go. If needed, and you should avoid this scenario, you may group comparisons together in order to change precedence evaluation. See the following example: {% if 5 < 10 < 15 %}true{%else%}false{% endif %} {% if (5 < 10) < 15 %}true{%else%}false{% endif %} {% if 5 < (10 < 15) %}true{%else%}false{% endif %} The expected output for the preceding example is true, true, and false. The first two lines are pretty straightforward. In the third line, first, (10<15) is evaluated to True, which is a subclass of int, where True == 1. Then 5 < True is evaluated, which is certainly false. The for statement is pretty important. One can hardly think of a serious Web application that does not have to show a list of some kind at some point. The for statement can iterate over any iterable instance and has a very simple, Python-like syntax: {% for item in my_list %} {{ item }}{# print evaluate item #} {% endfor %} {# or #} {% for key, value in my_dictionary.items() %} {{ key }}: {{ value }} {% endfor %} In the first statement, we have the opening tag indicating that we will iterate over my_list items and each item will be referenced by the name item. The name item will be available inside the for loop context only. In the second statement, we have an iteration over the key value tuples that form my_dictionary, which should be a dictionary (if the variable name wasn't suggestive enough). Pretty simple, right? The for loop also has a few tricks in store for you. 
When building HTML lists, it's a common requirement to mark each list item in alternating colors in order to improve readability or mark the first or/and last item with some special markup. Those behaviors can be achieved in a Jinja2 for-loop through access to a loop variable available inside the block context. Let's see some examples: {% for i in ['a', 'b', 'c', 'd'] %} {% if loop.first %}This is the first iteration{% endif %} {% if loop.last %}This is the last iteration{% endif %} {{ loop.cycle('red', 'blue') }}{# print red or blue alternating #} {{ loop.index }} - {{ loop.index0 }} {# 1 indexed index – 0 indexed index #} {# reverse 1 indexed index – reverse 0 indexed index #} {{ loop.revindex }} - {{ loop.revindex0 }} {% endfor %} The for loop statement, as in Python, also allow the use of else, but with a slightly different meaning. In Python, when you use else with for, the else block is only executed if it was not reached through a break command like this: for i in [1, 2, 3]: pass else: print "this will be printed" for i in [1, 2, 3]: if i == 3:    break else: print "this will never not be printed" As seen in the preceding code snippet, the else block will only be executed in a for loop if the execution was never broken by a break command. With Jinja2, the else block is executed when the for iterable is empty. For example: {% for i in [] %} {{ i }} {% else %}I'll be printed{% endfor %} {% for i in ['a'] %} {{ i }} {% else %}I won't{% endfor %} As we are talking about loops and breaks, there are two important things to know: the Jinja2 for loop does not support break or continue. Instead, to achieve the expected behavior, you should use loop filtering as follows: {% for i in [1, 2, 3, 4, 5] if i > 2 %} value: {{ i }}; loop.index: {{ loop.index }} {%- endfor %} In the first tag you see a normal for loop together with an if condition. You should consider that condition as a real list filter, as the index itself is only counted per iteration. Run the preceding example and the output will be the following: value:3; index: 1 value:4; index: 2 value:5; index: 3 Look at the last observation in the preceding example—in the second tag, do you see the dash in {%-? It tells the renderer that there should be no empty new lines before the tag at each iteration. Try our previous example without the dash and compare the results to see what changes. We'll now look at three very important statements used to build templates from different files: block, extends, and include. block and extends always work together. The first is used to define "overwritable" blocks in a template, while the second defines a parent template that has blocks, for the current template. Let's see an example: # coding:utf-8 with open('parent.txt', 'w') as file:    file.write(""" {% block template %}parent.txt{% endblock %} =========== I am a powerful psychic and will tell you your past   {#- "past" is the block identifier #} {% block past %} You had pimples by the age of 12. {%- endblock %}   Tremble before my power!!!""".strip())   with open('child.txt', 'w') as file:    file.write(""" {% extends "parent.txt" %}   {# overwriting the block called template from parent.txt #} {% block template %}child.txt{% endblock %}   {#- overwriting the block called past from parent.txt #} {% block past %} You've bought an ebook recently. 
{%- endblock %}""".strip()) with open('other.txt', 'w') as file:    file.write(""" {% extends "child.txt" %} {% block template %}other.txt{% endblock %}""".strip())   from jinja2 import Environment, FileSystemLoader   env = Environment() # tell the environment how to load templates env.loader = FileSystemLoader('.') # look up our template tmpl = env.get_template('parent.txt') # render it to default output print tmpl.render() print "" # loads child.html and its parent tmpl = env.get_template('child.txt') print tmpl.render() # loads other.html and its parent env.get_template('other.txt').render() Do you see the inheritance happening, between child.txt and parent.txt? parent.txt is a simple template with two block statements, called template and past. When you render parent.txt directly, its blocks are printed "as is", because they were not overwritten. In child.txt, we extend the parent.txt template and overwrite all its blocks. By doing that, we can have different information in specific parts of a template without having to rewrite the whole thing. With other.txt, for example, we extend the child.txt template and overwrite only the block-named template. You can overwrite blocks from a direct parent template or from any of its parents. If you were defining an index.txt page, you could have default blocks in it that would be overwritten when needed, saving lots of typing. Explaining the last example, Python-wise, is pretty simple. First, we create a Jinja2 environment (we talked about this earlier) and tell it how to load our templates, then we load the desired template directly. We do not have to bother telling the environment how to find parent templates, nor do we need to preload them. The include statement is probably the easiest statement so far. It allows you to render a template inside another in a very easy fashion. Let's look at an example: with open('base.txt', 'w') as file: file.write(""" {{ myvar }} You wanna hear a dirty joke? {% include 'joke.txt' %} """.strip()) with open('joke.txt', 'w') as file: file.write(""" A boy fell in a mud puddle. {{ myvar }} """.strip())   from jinja2 import Environment, FileSystemLoader   env = Environment() # tell the environment how to load templates env.loader = FileSystemLoader('.') print env.get_template('base.txt').render(myvar='Ha ha!') In the preceding example, we render the joke.txt template inside base.txt. As joke.txt is rendered inside base.txt, it also has full access to the base.txt context, so myvar is printed normally. Finally, we have the set statement. It allows you to define variables for inside the template context. Its use is pretty simple: {% set x = 10 %} {{ x }} {% set x, y, z = 10, 5+5, "home" %} {{ x }} - {{ y }} - {{ z }} In the preceding example, if x was given by a complex calculation or a database query, it would make much more sense to have it cached in a variable, if it is to be reused across the template. As seen in the example, you can also assign a value to multiple variables at once. Macros Macros are the closest to coding you'll get inside Jinja2 templates. The macro definition and usage are similar to plain Python functions, so it is pretty easy. 
Let's try an example: with open('formfield.html', 'w') as file: file.write(''' {% macro input(name, value='', label='') %} {% if label %} <label for='{{ name }}'>{{ label }}</label> {% endif %} <input id='{{ name }}' name='{{ name }}' value='{{ value }}'></input> {% endmacro %}'''.strip()) with open('index.html', 'w') as file: file.write(''' {% from 'formfield.html' import input %} <form method='get' action='.'> {{ input('name', label='Name:') }} <input type='submit' value='Send'></input> </form> '''.strip())   from jinja2 import Environment, FileSystemLoader   env = Environment() env.loader = FileSystemLoader('.') print env.get_template('index.html').render() In the preceding example, we create a macro that accepts a name argument and two optional arguments: value and label. Inside the macro block, we define what should be output. Notice we can use other statements inside a macro, just like a template. In index.html we import the input macro from inside formfield.html, as if formfield was a module and input was a Python function using the import statement. If needed, we could even rename our input macro like this: {% from 'formfield.html' import input as field_input %} You can also import formfield as a module and use it as follows: {% import 'formfield.html' as formfield %} When using macros, there is a special case where you want to allow any named argument to be passed into the macro, as you would in a Python function (for example, **kwargs). With Jinja2 macros, these values are, by default, available in a kwargs dictionary that does not need to be explicitly defined in the macro signature. For example: # coding:utf-8 with open('formfield.html', 'w') as file:    file.write(''' {% macro input(name) -%} <input id='{{ name }}' name='{{ name }}' {% for k,v in kwargs.items() -%}{{ k }}='{{ v }}' {% endfor %}></input> {%- endmacro %} '''.strip())with open('index.html', 'w') as file:    file.write(''' {% from 'formfield.html' import input %} {# use method='post' whenever sending sensitive data over HTTP #} <form method='post' action='.'> {{ input('name', type='text') }} {{ input('passwd', type='password') }} <input type='submit' value='Send'></input> </form> '''.strip())   from jinja2 import Environment, FileSystemLoader   env = Environment() env.loader = FileSystemLoader('.') print env.get_template('index.html').render() As you can see, kwargs is available even though you did not define a kwargs argument in the macro signature. Macros have a few clear advantages over plain templates, that you notice with the include statement: You do not have to worry about variable names in the template using macros You can define the exact required context for a macro block through the macro signature You can define a macro library inside a template and import only what is needed Commonly used macros in a Web application include a macro to render pagination, another to render fields, and another to render forms. You could have others, but these are pretty common use cases. Regarding our previous example, it is good practice to use HTTPS (also known as, Secure HTTP) to send sensitive information, such as passwords, over the Internet. Be careful about that! Extensions Extensions are the way Jinja2 allows you to extend its vocabulary. 
Extensions are not enabled by default, so you can enable an extension only when and if you need, and start using it without much trouble: env = Environment(extensions=['jinja2.ext.do',   'jinja2.ext.with_']) In the preceding code, we have an example where you create an environment with two extensions enabled: do and with. Those are the extensions we will study in this article. As the name suggests, the do extension allows you to "do stuff". Inside a do tag, you're allowed to execute Python expressions with full access to the template context. Flask-Empty, a popular flask boilerplate available at https://github.com/italomaia/flask-empty uses the do extension to update a dictionary in one of its macros, for example. Let's see how we could do the same: {% set x = {1:'home', '2':'boat'} %} {% do x.update({3: 'bar'}) %} {%- for key,value in x.items() %} {{ key }} - {{ value }} {%- endfor %} In the preceding example, we create the x variable with a dictionary, then we update it with {3: 'bar'}. You don't usually need to use the do extension but, when you do, a lot of coding is saved. The with extension is also very simple. You use it whenever you need to create block scoped variables. Imagine you have a value you need cached in a variable for a brief moment; this would be a good use case. Let's see an example: {% with age = user.get_age() %} My age: {{ age }} {% endwith %} My age: {{ age }}{# no value here #} As seen in the example, age exists only inside the with block. Also, variables set inside a with block will only exist inside it. For example: {% with %} {% set count = query.count() %} Current Stock: {{ count }} Diff: {{ prev_count - count }} {% endwith %} {{ count }} {# empty value #} Filters Filters are a marvelous thing about Jinja2! This tool allows you to process a constant or variable before printing it to the template. The goal is to implement the formatting you want, strictly in the template. To use a filter, just call it using the pipe operator like this: {% set name = 'junior' %} {{ name|capitalize }} {# output is Junior #} Its name is passed to the capitalize filter that processes it and returns the capitalized value. To inform arguments to the filter, just call it like a function, like this: {{ ['Adam', 'West']|join(' ') }} {# output is Adam West #} The join filter will join all values from the passed iterable, putting the provided argument between them. Jinja2 has an enormous quantity of available filters by default. That means we can't cover them all here, but we can certainly cover a few. capitalize and lower were seen already. 
Let's look at some further examples: {# prints default value if input is undefined #} {{ x|default('no opinion') }} {# prints default value if input evaluates to false #} {{ none|default('no opinion', true) }} {# prints input as it was provided #} {{ 'some opinion'|default('no opinion') }}   {# you can use a filter inside a control statement #} {# sort by key case-insensitive #} {% for key in {'A':3, 'b':2, 'C':1}|dictsort %}{{ key }}{% endfor %} {# sort by key case-sensitive #} {% for key in {'A':3, 'b':2, 'C':1}|dictsort(true) %}{{ key }}{% endfor %} {# sort by value #} {% for key in {'A':3, 'b':2, 'C':1}|dictsort(false, 'value') %}{{ key }}{% endfor %} {{ [3, 2, 1]|first }} - {{ [3, 2, 1]|last }} {{ [3, 2, 1]|length }} {# prints input length #} {# same as in python #} {{ '%s, =D'|format("I'm John") }} {{ "He has two daughters"|replace('two', 'three') }} {# safe prints the input without escaping it first#} {{ '<input name="stuff" />'|safe }} {{ "there are five words here"|wordcount }} Try the preceding example to see exactly what each filter does. After reading this much about Jinja2, you're probably thinking: "Jinja2 is cool but this is a book about Flask. Show me the Flask stuff!". Ok, ok, I can do that! Of what we have seen so far, almost everything can be used with Flask with no modifications. As Flask manages the Jinja2 environment for you, you don't have to worry about creating file loaders and stuff like that. One thing you should be aware of, though, is that, because you don't instantiate the Jinja2 environment yourself, you can't really pass to the class constructor, the extensions you want to activate. To activate an extension, add it to Flask during the application setup as follows: from flask import Flask app = Flask(__name__) app.jinja_env.add_extension('jinja2.ext.do') # or jinja2.ext.with_ if __name__ == '__main__': app.run() Messing with the template context You can use the render_template method to load a template from the templates folder and then render it as a response. from flask import Flask, render_template app = Flask(__name__)   @app.route("/") def hello():    return render_template("index.html") If you want to add values to the template context, as seen in some of the examples in this article, you would have to add non-positional arguments to render_template: from flask import Flask, render_template app = Flask(__name__)   @app.route("/") def hello():    return render_template("index.html", my_age=28) In the preceding example, my_age would be available in the index.html context, where {{ my_age }} would be translated to 28. my_age could have virtually any value you want to exhibit, actually. Now, what if you want all your views to have a specific value in their context, like a version value—some special code or function; how would you do it? Flask offers you the context_processor decorator to accomplish that. You just have to annotate a function that returns a dictionary and you're ready to go. 
For example: from flask import Flask, render_template app = Flask(__name__) @app.context_processor def luck_processor(): from random import randint def lucky_number():    return randint(1, 10) return dict(lucky_number=lucky_number) @app.route("/") def hello(): # lucky_number will be available in the index.html context by default return render_template("index.html") Summary In this article, we saw how to render templates using only Jinja2, how control statements look and how to use them, how to write a comment, how to print variables in a template, how to write and use macros, how to load and use extensions, and how to register context processors. I don't know about you, but this article felt like a lot of information! I strongly advise you to experiment with the examples. Knowing your way around Jinja2 will save you a lot of headaches. Resources for Article: Further resources on this subject: Recommender systems dissected Deployment and Post Deployment [article] Handling sessions and users [article] Introduction to Custom Template Filters and Tags [article]
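As a quick follow-up to the context processor example above: because luck_processor returns the lucky_number function itself (not its result), the template has to call it with parentheses. The sketch below is a minimal, self-contained variation of that example; it uses render_template_string instead of a separate index.html file purely so the snippet can run on its own, which is a simplification rather than part of the original example.

# Minimal, runnable variation of the context processor example above.
# The template line shows the key point: lucky_number is registered as a
# callable, so the template calls it with ().
from random import randint
from flask import Flask, render_template_string

app = Flask(__name__)

@app.context_processor
def luck_processor():
    def lucky_number():
        return randint(1, 10)
    return dict(lucky_number=lucky_number)

@app.route("/")
def hello():
    # lucky_number is available here because of the context processor
    return render_template_string("Your lucky number is {{ lucky_number() }}")

if __name__ == "__main__":
    app.run()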

Implementing Apache Spark K-Means Clustering method on digital breath test data for road safety

Savia Lobo
01 Mar 2018
7 min read
[box type="note" align="" class="" width=""]This article is an excerpt taken from a book Mastering Apache Spark 2.x - Second Edition written by Romeo Kienzler. In this book, you will learn to use Spark as a big data operating system, understand how to implement advanced analytics on the new APIs, and explore how easy it is to use Spark in day-to-day tasks.[/box] In today’s tutorial, we have used the Road Safety test data from our previous article, to show how one can attempt to find clusters in data using K-Means algorithm with Apache Spark MLlib. Theory on Clustering The K-Means algorithm iteratively attempts to determine clusters within the test data by minimizing the distance between the mean value of cluster center vectors, and the new candidate cluster member vectors. The following equation assumes dataset members that range from X1 to Xn; it also assumes K cluster sets that range from S1 to Sk, where K <= n. K-Means in practice The K-Means MLlib functionality uses the LabeledPoint structure to process its data and so it needs numeric input data. As the same data from the last section is being reused, we will not explain the data conversion again. The only change that has been made in data terms in this section, is that processing in HDFS will now take place under the /data/spark/kmeans/ directory. Additionally, the conversion Scala script for the K-Means example produces a record that is all comma-separated. The development and processing for the K-Means example has taken place under the /home/hadoop/spark/kmeans directory to separate the work from other development. The sbt configuration file is now called kmeans.sbt and is identical to the last example, except for the project name: name := "K-Means" The code for this section can be found in the software package under chapter7K-Means. So, looking at the code for kmeans1.scala, which is stored under kmeans/src/main/scala, some similar actions occur. The import statements refer to the Spark context and configuration. This time, however, the K-Means functionality is being imported from MLlib. Additionally, the application class name has been changed for this example to kmeans1: import org.apache.spark.SparkContext import org.apache.spark.SparkContext._ import org.apache.spark.SparkConf import org.apache.spark.mllib.linalg.Vectors import org.apache.spark.mllib.clustering.{KMeans,KMeansModel} object kmeans1 extends App { The same actions are being taken as in the last example to define the data file--to define the Spark configuration and create a Spark context: val hdfsServer = "hdfs://localhost:8020" val hdfsPath      = "/data/spark/kmeans/" val dataFile     = hdfsServer + hdfsPath + "DigitalBreathTestData2013- MALE2a.csv" val sparkMaster = "spark://localhost:7077" val appName = "K-Means 1" val conf = new SparkConf() conf.setMaster(sparkMaster) conf.setAppName(appName) val sparkCxt = new SparkContext(conf) Next, the CSV data is loaded from the data file and split by comma characters into the VectorData variable: val csvData = sparkCxt.textFile(dataFile) val VectorData = csvData.map { csvLine => Vectors.dense( csvLine.split(',').map(_.toDouble)) } A KMeans object is initialized, and the parameters are set to define the number of clusters and the maximum number of iterations to determine them: val kMeans = new KMeans val numClusters       = 3 val maxIterations     = 50 Some default values are defined for the initialization mode, number of runs, and Epsilon, which we needed for the K-Means call but did not vary for the processing. 
Finally, these parameters were set against the KMeans object: val initializationMode = KMeans.K_MEANS_PARALLEL val numRuns     = 1 val numEpsilon       = 1e-4 kMeans.setK( numClusters ) kMeans.setMaxIterations( maxIterations ) kMeans.setInitializationMode( initializationMode ) kMeans.setRuns( numRuns ) kMeans.setEpsilon( numEpsilon ) We cached the training vector data to improve the performance and trained the KMeans object using the vector data to create a trained K-Means model: VectorData.cache val kMeansModel = kMeans.run( VectorData ) We have computed the K-Means cost and number of input data rows, and have output the results via println statements. The cost value indicates how tightly the clusters are packed and how separate the clusters are: val kMeansCost = kMeansModel.computeCost( VectorData ) println( "Input data rows : " + VectorData.count() ) println( "K-Means Cost  : " + kMeansCost ) Next, we have used the K-Means Model to print the cluster centers as vectors for each of the three clusters that were computed: kMeansModel.clusterCenters.foreach{ println } Finally, we use the K-Means model predict function to create a list of cluster membership predictions. We then count these predictions by value to give a count of the data points in each cluster. This shows which clusters are bigger and whether there really are three clusters: val clusterRddInt = kMeansModel.predict( VectorData ) val clusterCount = clusterRddInt.countByValue clusterCount.toList.foreach{ println } } // end object kmeans1 So, in order to run this application, it must be compiled and packaged from the kmeans subdirectory as the Linux pwd command shows here: [hadoop@hc2nn kmeans]$ pwd /home/hadoop/spark/kmeans [hadoop@hc2nn kmeans]$ sbt package Loading /usr/share/sbt/bin/sbt-launch-lib.bash [info] Set current project to K-Means (in build file:/home/hadoop/spark/kmeans/) [info] Compiling 2 Scala sources to /home/hadoop/spark/kmeans/target/scala-2.10/classes... [info] Packaging /home/hadoop/spark/kmeans/target/scala-2.10/k- means_2.10-1.0.jar ... [info] Done packaging. [success] Total time: 20 s, completed Feb 19, 2015 5:02:07 PM Once this packaging is successful, we check HDFS to ensure that the test data is ready. As in the last example, we convert our data to numeric form using the convert.scala file, provided in the software package. We will process the DigitalBreathTestData2013- MALE2a.csv data file in the HDFS directory, /data/spark/kmeans, as follows: [hadoop@hc2nn nbayes]$ hdfs dfs -ls /data/spark/kmeans Found 3 items -rw-r--r--   3 hadoop supergroup 24645166 2015-02-05 21:11 /data/spark/kmeans/DigitalBreathTestData2013-MALE2.csv -rw-r--r--   3 hadoop supergroup 5694226 2015-02-05 21:48 /data/spark/kmeans/DigitalBreathTestData2013-MALE2a.csv drwxr-xr-x - hadoop supergroup   0 2015-02-05 21:46 /data/spark/kmeans/result The spark-submit tool is used to run the K-Means application. The only change in this command is that the class is now kmeans1: spark-submit --class kmeans1 --master spark://localhost:7077 --executor-memory 700M --total-executor-cores 100 /home/hadoop/spark/kmeans/target/scala-2.10/k-means_2.10-1.0.jar The output from the Spark cluster run is shown to be as follows: Input data rows : 467054 K-Means Cost  : 5.40312223450789E7 The previous output shows the input data volume, which looks correct; it also shows the K- Means cost value. 
The cost is based on the Within Set Sum of Squared Errors (WSSSE) which basically gives a measure how well the found cluster centroids are matching the distribution of the data points. The better they are matching, the lower the cost. The following link https://datasciencelab.wordpress.com/2013/12/27/finding-the-k-in-k-means-clustering/ explains WSSSE and how to find a good value for k in more detail. Next come the three vectors, which describe the data cluster centers with the correct number of dimensions. Remember that these cluster centroid vectors will have the same number of columns as the original vector data: [0.24698249738061878,1.3015883142472253,0.005830116872250263,2.917374778855 5207,1.156645130895448,3.4400290524342454] [0.3321793984152627,1.784137241326256,0.007615970459266097,2.58319870759289 17,119.58366028156011,3.8379106085083468] [0.25247226760684494,1.702510963969387,0.006384899819416975,2.2314042480006 88,52.202897927594805,3.551509158139135] Finally, cluster membership is given for clusters 1 to 3 with cluster 1 (index 0) having the largest membership at 407539 member vectors: (0,407539) (1,12999) (2,46516) To summarize, we saw a practical  example that shows how K-means algorithm is used to cluster data with the help of Apache Spark. If you found this post useful, do check out this book Mastering Apache Spark 2.x - Second Edition to learn about the latest enhancements in Apache Spark 2.x, such as interactive querying of live data and unifying DataFrames and Datasets.
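The example above uses Scala, but the same RDD-based MLlib API is also exposed in Python. The PySpark sketch below mirrors the main steps (load the CSV, build dense vectors, train with three clusters and fifty iterations, then inspect the cost, centers, and cluster sizes). The HDFS path and parameters are taken from the Scala example, while the master URL and application name are assumptions for a local lab setup rather than part of the original exercise.

# Rough PySpark equivalent of the Scala kmeans1 example above.
from pyspark import SparkConf, SparkContext
from pyspark.mllib.clustering import KMeans
from pyspark.mllib.linalg import Vectors

conf = SparkConf().setMaster("spark://localhost:7077").setAppName("K-Means 1 (Python)")
sc = SparkContext(conf=conf)

data_file = "hdfs://localhost:8020/data/spark/kmeans/DigitalBreathTestData2013-MALE2a.csv"
csv_data = sc.textFile(data_file)

# Convert each comma-separated line into a dense numeric vector
vector_data = csv_data.map(lambda line: Vectors.dense([float(x) for x in line.split(",")]))
vector_data.cache()

# Train with the same parameters as the Scala example: 3 clusters, 50 iterations
model = KMeans.train(vector_data, k=3, maxIterations=50,
                     initializationMode="k-means||", epsilon=1e-4)

print("Input data rows : %d" % vector_data.count())
print("K-Means Cost    : %f" % model.computeCost(vector_data))

# Print the cluster centers, then count the members of each cluster
for center in model.clusterCenters:
    print(center)

cluster_counts = model.predict(vector_data).countByValue()
for cluster_id, count in sorted(cluster_counts.items()):
    print(cluster_id, count)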

Popular Data sources and models in SAP Analytics Cloud

Kunal Chaudhari
03 Jan 2018
12 min read
[box type="note" align="" class="" width=""]This article is an excerpt from a book written by Riaz Ahmed titled Learning SAP Analytics Cloud.This book deals with the basics of SAP Analytics Cloud (formerly known as SAP BusinessObjects Cloud) and unveil significant features for a beginner.[/box] Our article provides a brief overview of the different data sources and models, available in SAP Analytics Cloud. A model is the foundation of every analysis you create to evaluate the performance of your organization. It is a high-level design that exposes the analytic requirements of end users. Planning and analytics are the two types of models you can create in SAP Analytics Cloud. Analytics models are simpler and more flexible, while planning models are full-featured models in which you work with planning features. Preconfigured with dimensions for time and categories, planning models support multi-currency and security features at both model and dimension levels.   To determine what content to include in your model, you must first identify the columns from the source data on which users need to query. The columns you need in your model reside in some sort of data source. SAP Analytics Cloud supports three types of data sources: files (such as CSV or Excel files) that usually reside on your computer, live data connections from a connected remote system, and cloud apps. In addition to the files on your computer, you can use on-premise data sources, such as SAP Business Warehouse, SAP ERP, SAP Universe, SQL database, and more, to acquire data for your models. In the cloud, you can get data from apps such as Concur, Google Drive, SAP Business ByDesign, SAP Hybris Cloud, OData Services, and Success Factors. The following figure depicts these data sources. The cloud app data sources you can use with SAP Analytics Cloud are displayed above the firewall mark, while those in your local network are shown under the firewall. As you can see in the following figure, there are over twenty data sources currently supported by SAP Analytics Cloud. The methods of connecting to these data sources also vary from each other. However, some instances provided in this article would give you an idea on how connections are established to acquire data. The connection methods provided here relate to on-premise and cloud app data sources. Create a direct live connection to SAP HANA Execute the following steps to connect to the on-premise SAP HANA system to use live data in SAP Analytics Cloud. Live data means that you can get up-to-the-minute data when you open a story in SAP Analytics Cloud. In this case, any changes made to the data in the source system are reflected immediately. Usually, there are two ways to establish a connection to a data source--use the Connection option from the main menu, or specify the data source during the process of creating a model. However, live data connections must be established via the Connection menu option prior to creating the corresponding model. Here are the steps: From the main menu, select Connection. On the Connections page, click on the Add Connection icon (+), and select Live Data Connection | SAP HANA. In the New Live Connection dialog, enter a name for the connection (for example, HANA). From the Connection Type drop-down list, select Direct. The Direct option is used when you connect to a data source that resides inside your corporate network. The Path option requires a reverse proxy to the HANA XS server. 
The SAP Cloud Platform and Cloud options in this list are used when you are connecting to SAP cloud environments. When you select the Direct option, the System Type is set to HANA and the protocol is set to HTTPS. Enter the hostname and port number in respective text boxes. The Authentication Method list contains two options: User Name and Password and SAML Single Sign On. The SAML Single Sign On option requires that the SAP HANA system is already configured to use SAML authentication. If not, choose the User Name and Password option and enter these credentials in relevant boxes. Click on OK to finish the process. A new connection will appear on the Connection page, which can now be used as a data source for models. However, in order to complete this exercise, we will go through a short demo of this process here. From the main menu, go to Create | Model. On the New Model page, select Use a datasource. From the list that appears on your right side, select Live Data connection. In the dialog that is displayed, select the HANA connection you created in the previous steps from the System list. From the Data Source list, select the HANA view you want to work with. The list of views may be very long, and a search feature is available to help you locate the source you are looking for. Finally, enter the name and the optional description for the new model, and click on OK. The model will be created, and its definitions will appear on another page. Connecting remote systems to import data In addition to creating live connections, you can also create connections that allow you to import data into SAP Analytics Cloud. In these types of connections that you make to access remote systems, data is imported (copied) to SAP Analytics Cloud. Any changes users make in the source data do not affect the imported data. To establish connections with these remote systems, you need to install some additional components. For example, you must install SAP HANA Cloud connector to access SAP Business Planning and Consolidation (BPC) for Netweaver . Similarly, SAP Analytics Cloud agent should be installed for SAP Business Warehouse (BW), SQL Server, SAP ERP, and others. Take a look at the connection figure illustrated on a previous page. The following set of steps provide instructions to connect to SAP ERP. You can either connect to this system from the Connection menu or establish the connection while creating a model. In these steps, we will adopt the latter approach. From the main menu, go to Create | Model. 2. Click on the Use a datasource option on the choose how you'd like to start your model page. 3. From the list of available datasources to your right, select SAP ERP. 4. From the Connection Name list, select Create New Connection. 5. Enter a name for the connection (for example, ERP) in the Connection Name box. You can also provide a       description to further elaborate the new connection. 6. For Server Type, select Application Server and enter values for System,   System Number, Client ID, System ID, Language, User Name, and Password. Click the Create button after providing this information. 7. Next, you need to create a query based on the SAP ERP system data. Enter  a name for the query, for example, sales. 8. In the same dialog, expand the ERP object where the data exists. Locate and select the object, and then choose the data columns you want to include in your model. You are provided with a preview of the data before importing. On the preview window, click on Done to start the import process. 
The imported data will appear on the Data Integration page, which is the initial screen in the model creation segment. Connect Google Drive to import data You went through two scenarios in which you saw how data can be fetched. In the first scenario, you created a live connection to create a model on live data, while in the second one, you learned how to import data from remote systems. In this article, you will be guided to create a model using a cloud app called Google Drive. Google Drive is a file storage and synchronization service developed by Google. It allows users to store files in the cloud, synchronize files across devices, and share files. Here are the steps to use the data stored on Google Drive: From the main menu, go to Create | Model. On the choose how you'd like to start your model page, select Get data from an app. From the available apps to your right, select Google Drive.  In the Import Model From Google Drive dialog, click on the Select Data button.  If you are not already logged into Google Drive, you will be prompted to log in.  Another dialog appears displaying a list of compatible files. Choose a file, and click on the Select button. You are brought back to the Import Model From Google Drive dialog, where you have to enter a model name and an optional description. After providing this information, click on the Import button. The import process will start, and after a while, you will see the Data Integration screen populated with the data from the selected Google Drive file. Refreshing imported data SAP Analytics Cloud allows you to refresh your imported data. With this option, you can re-import the data on demand to get the latest values. You can perform this refresh operation either manually or create an import schedule to refresh the data at a specific date and time or on a recurring basis. The following data sources support scheduling: SAP Business Planning and Consolidation (BPC) SAP Business Warehouse (BW) Concur OData services An SAP Analytics BI platform universe (UNX) query SAP ERP Central Component (SAP ECC) SuccessFactors [DC3] HCM suite Excel and comma-separated values (CSV) files imported from a file server (not imported from your local machine) SQL databases You can adopt the following method to access the schedule settings for a model: Select Connection from the main menu. The Connection page appears. The Schedule Status tab on this page lists all updates and import jobs associated with any data source. Alternatively, go to main menu | Browse | Models. The Models page appears. The updatable model on the list will have a number of data sources shown in the Datasources column. In the Datasources column, click on the View More link. The update and import jobs associated with this data source will appear. The Update Model and Import Data job are the two types of jobs that are run either immediately or on a schedule. To run an Import Data job immediately, choose Import Data in the Action column. If you want to run an Update Model job, select a job to open it. The following refreshing methods specify how you want existing data to be handled. The Import Data jobs are listed here: Update: Selecting this option updates the existing data and adds new entries to the target model. Clean and Replace: Any existing data is wiped out and new entries are added to the target model. Append: Nothing is done with the existing data. Only new entries are added to the target model. 
The Update Model jobs are listed here: Clean and Replace: This deletes the existing data and adds new entries to the target model. Append: This keeps the existing data as is and adds new entries to the target model. The Schedule Settings option allows you to select one of the following schedule options: None: The import is performed immediately Once: The import is performed only once at a scheduled time Repeating: The import is executed according to a repeating pattern; you can select a start and end date and time as well as a recurrence pattern After setting your preferences, click on the Save icon to save your scheduling settings. If you chose the None option for scheduling, select Update Model or Import Data to run the update or import job now. Once a scheduled job completes, its result appears on the Schedule Status tab displaying any errors or warnings. If you see such daunting messages, select the job to see the details. Expand an entry in the Refresh Manager panel to get more information about the scary stuff. If the import process rejected any rows in the dataset, you are provided with an option to download the rejected rows as a CSV file for offline examination. Fix the data in the source system, or fix the error in the downloaded CSV file and upload data from it. After creating your models, you access them via the main menu | Browse | Models path. The Models page, as illustrated in the following figure, is the main interface where you manage your models. All existing models are listed under the Models tab. You can open a model by clicking on its name. Public dimensions are saved separately from models and appear on the Public Dimensions tab. When you create a new model or modify an existing model, you can add these public dimensions. If you are using multiple currencies in your data, the exchange rates are maintained in separate tables. These are saved independently of any model and are listed on the Currency Conversion tab. Data for geographic locations, which are displayed and used in your data analysis, is maintained on the Points of Interest tab. The toolbar provided under the four tabs carries icons to perform common operations for managing models. Click on the New Model icon to create a new model. Select a model by placing a check mark (A) in front of it. Then click on the Copy Selected Model icon to make an exact copy of the selected model. Use the delete icon to remove the selected models. The Clear Selected Model option removes all the data from the selected model. The list of data import options that are supported is available from a menu beneath the Import Data icon on the toolbar. You can export a model to a .csv file once or on a recurring schedule using Export Model As File. SAP Analytics Cloud can help transform how you discover, plan, predict, collaborate, visualize, and extend all in one solution. In addition to on-premise data sources, you can fetch data from a variety of other cloud apps and even from Excel and text files to build your data models and then create stories based on these models. If you enjoyed this excerpt, check out the book Learning SAP Analytics Cloud to know more about professional data analysis using different types of charts, tables, geo maps, and more with SAP Analytics Cloud.    

Introduction to SDN - Transformation from legacy to SDN

Packt
07 Jun 2017
23 min read
In this article by Reza Toghraee, the author of the book Learning OpenDaylight, we will cover:
What SDN is and what it is not
The components of an SDN
The difference between SDN and overlay
The SDN controllers
You might have heard about Software-Defined Networking (SDN). If you are in the networking industry, this is probably a topic you studied the first time you heard about SDN. To understand the importance of SDN and the SDN controller, let's look at Google. Google quietly built its own networking switches and controller, called Jupiter: a home-grown, mostly software-driven project that supports Google's massive scale. The basic idea of SDN is that there is a controller that knows the whole network. OpenDaylight (ODL) is an SDN controller; in other words, it's the central brain of the network.
Why we are going towards SDN
Everyone who hears about SDN should ask why we are talking about it at all. What problem is it trying to solve? If we look at traditional networking (layer 2 and layer 3 with routing protocols such as BGP and OSPF), we are completely dominated by so-called protocols. These protocols have in fact been very helpful to the industry. They are mostly standard, so different vendors and products can communicate with each other using them. A Cisco router can establish a BGP session with a Huawei switch, and an open source Quagga router can exchange OSPF routes with a Juniper firewall. Routing protocols are constant standards with solid bases. If you need to override something in your network routing, you have to find a trick within the protocols, even if it is just a static route. SDN can help us escape the routing protocol cage and look at different ways to forward traffic. SDN can directly program each switch, or even override a route that was installed by a routing protocol. There are high-level benefits of using SDN; we explain a few of them as follows:
An integrated network: We used to have a standalone concept in traditional networks. Each switch was managed separately, ran its own routing protocol instance, and processed routing information messages from its neighbors. In SDN, we are migrating to a centralized model, where the SDN controller becomes the single point of configuration of the network and the place where you apply policies and configuration.
Scalable layer 2 across layer 3: Having a layer 2 network span multiple layer 3 networks is something all network architects are interested in, and until now we have been using proprietary methods such as OTV, or a service provider VPLS service. With SDN, we can create layer 2 networks across multiple switches or layer 3 domains (using VXLAN) and expand those layer 2 networks. In many cloud environments, where virtual machines are distributed across different hosts in different datacenters, this is a major requirement.
Third-party application programmability: This is a very generic term, isn't it? What I'm referring to is letting other applications communicate with your network. For example, in many new distributed IP storage systems, the IP storage controller is able to talk to the network to provide the best, shortest path to the storage node. With SDN, we are letting other applications control the network. Of course, this control has limits, and SDN doesn't allow an application to wreck the whole network.
Flexible application-based network: In SDN, everything is an application.
L2/L3, BGP, VMware integration, and so on are all applications running in the SDN controller.
Service chaining: You can add a firewall or a load balancer in the path on the fly. This is service insertion.
Unified wired and wireless: This is an ideal benefit: a controller that supports both the wired and the wireless network. OpenDaylight is the only controller which supports the CAPWAP protocol, which allows integration with wireless access points.
Components of an SDN
A software-defined network infrastructure has two main key components:
The SDN controller (only one, which could be deployed as a highly available cluster)
The SDN-enabled switches (multiple switches, mostly in a Clos topology in a datacenter)
The SDN controller is the single brain of the SDN domain. In fact, an SDN domain is very similar to a chassis-based switch. You can think of the supervisor or management module of a chassis-based switch as the SDN controller, and the rest of the line cards and I/O cards as the SDN switches. The main difference between an SDN network and a chassis-based switch is that you can scale out the SDN with multiple switches, whereas in a chassis-based switch you are limited to the number of slots in that chassis.
Controlling the fabric
It is very important that you understand the main technologies involved in SDN. These methods are used by SDN controllers to manage and control the SDN network. In general, there are two methods available for controlling the fabric:
Direct fabric programming: In this method, the SDN controller communicates directly with the SDN-enabled switches via southbound protocols such as OpenFlow, NETCONF, and OVSDB. The SDN controller programs each switch member with the related information about the fabric and how to forward traffic. Direct fabric programming is the method used by OpenDaylight.
Overlay: In the overlay method, the SDN controller doesn't rely on programming the network switches and routers. Instead, it builds a virtual overlay network on top of the existing underlay network. The underlay network can be an L2 or L3 network with traditional network switches and routers, just providing IP connectivity. The SDN controller uses this platform to build the overlay using encapsulation protocols such as VXLAN and NVGRE. VMware NSX uses overlay technology to build and control the virtual fabric.
SDN controllers
One of the key fundamentals of SDN is disaggregation: disaggregation of software and hardware in a network, and also disaggregation of the control and forwarding planes. The SDN controller is the main brain and controller of an SDN environment; it's a strategic control point within the network and is responsible for communicating information to:
Routers, switches, and the other network devices behind them. SDN controllers use APIs or protocols (such as OpenFlow or NETCONF) to communicate with these devices. This communication is known as southbound.
Upstream switches, routers, or applications and the aforementioned business logic (via APIs or protocols). This communication is known as northbound. An example of northbound communication is a BGP session between a legacy router and the SDN controller.
If you are familiar with chassis-based switches like the Cisco Catalyst 6500 or Nexus 7k chassis, you can imagine an SDN network as a chassis, with switches and routers as its I/O line cards and the SDN controller as its supervisor or management module. In fact, SDN is similar to a very scalable chassis where you don't have any limitation on the number of physical slots.
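To make the northbound side concrete before we continue the chassis comparison, here is a minimal Python sketch, not taken from the book, that asks a controller's northbound REST API which topology it has discovered. It assumes an OpenDaylight instance reachable at 192.168.10.5 with RESTCONF enabled and the default admin/admin credentials; all of those values are illustrative and should be replaced with your own.

import requests

# Hypothetical controller address and default ODL credentials -- replace with your own values.
CONTROLLER = "http://192.168.10.5:8181"
AUTH = ("admin", "admin")

# RESTCONF exposes the controller's operational data store over HTTP; the
# network-topology model lists every node and link discovered via the southbound side.
url = CONTROLLER + "/restconf/operational/network-topology:network-topology"
response = requests.get(url, auth=AUTH, headers={"Accept": "application/json"})
response.raise_for_status()

for topology in response.json()["network-topology"]["topology"]:
    print("Topology:", topology.get("topology-id"))
    for node in topology.get("node", []):
        print("  node:", node["node-id"])
    for link in topology.get("link", []):
        print("  link:", link["link-id"])

The same GET issued against the config data store (/restconf/config/...) would show configured intent rather than discovered state, which is a useful distinction when checking whether a policy has actually been pushed to the switches.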
The SDN controller is similar in role to the management module of a chassis-based switch, and it controls all switches via its southbound protocols and APIs. The following table compares the SDN controller and a chassis-based switch:

SDN controller | Chassis-based switch
Supports any switch hardware | Supports only specific switch line cards
Can scale out to an unlimited number of switches | Limited to the number of physical slots in the chassis
Supports high redundancy via multiple controllers in a cluster | Supports dual management redundancy, active/standby
Communicates with switches via southbound protocols such as OpenFlow, NETCONF, BGP PCEP | Uses proprietary protocols between the management module and line cards
Communicates with routers, switches, and applications outside of the SDN via northbound protocols such as BGP and OSPF, and direct APIs | Communicates with other routers and switches outside of the chassis via standard protocols such as BGP and OSPF, or APIs

The first protocol that popularized the concept behind SDN was OpenFlow. When it was conceptualized by networking researchers at Stanford back in 2008, it was meant to manipulate the data plane to optimize traffic flows and make adjustments, so the network could quickly adapt to changing requirements. Version 1.0 of the OpenFlow specification was released in December of 2009; it continues to be enhanced under the management of the Open Networking Foundation, a user-led organization focused on advancing the development of open standards and the adoption of SDN technologies. OpenFlow was the first protocol that helped popularize SDN. OpenFlow is a protocol designed to update the flow tables in a switch, allowing the SDN controller to access the forwarding table of each member switch or, in other words, to connect the control plane and data plane in the SDN world. Back in 2008, when OpenFlow was conceptualized at Stanford University, its initial use was to alter switch forwarding tables to optimize traffic flows and make adjustments, so the network could quickly adapt to changing requirements. After the introduction of OpenFlow, NOX was introduced as the original OpenFlow controller (at that point there was still no concept of an SDN controller). NOX provided a high-level API capable of managing, and also developing, network control applications. Separate applications were required to run on top of NOX to manage the network. NOX was initially developed by Nicira Networks (which was acquired by VMware and ultimately became part of VMware NSX). NOX was introduced along with OpenFlow in 2009. NOX was a closed source product, but it was ultimately donated to the SDN community, which led to multiple forks and sub-projects based on the original NOX. For example, POX is a sub-project of NOX which provides Python support. Both NOX and POX were early controllers. NOX appears to be inactive in terms of development; however, POX is still in use by the research community, as it is a Python-based project and can be easily deployed. POX is hosted at http://github.com/noxrepo/pox. Apart from being the first OpenFlow (and SDN) controller, NOX also established a programming model that was inherited by subsequent controllers. The model was based on the processing of OpenFlow messages, with each incoming OpenFlow message triggering an event that had to be processed individually. This model was simple to implement, but it was not efficient or robust, and it couldn't scale. Nicira, along with NTT and Google, started developing ONIX, which was meant to be more abstract and scalable for large deployments.
ONIX became the base for Nicira (the core of VMware NSX, the network virtualization platform), and there are rumors that it is also the base for Google's WAN controller. ONIX was planned to become open source and be donated to the community, but for some reason the main contributors decided not to do so, which forced the SDN community to focus on developing other platforms. In 2010 a new controller was introduced, the Beacon controller, and it became one of the most popular controllers. It was born with contributions from developers at Stanford University. Beacon is a Java-based open source OpenFlow controller created in 2010. It has been widely used for teaching and research, and as the basis of Floodlight. Beacon had the first built-in web user interface, which was a huge step forward in the market of SDN controllers. It also provided an easier way to deploy and run compared to NOX. Beacon influenced the design of the controllers that came after it; however, it only supported star topologies, which was one of its limitations. Floodlight was a successful SDN controller built as a fork of Beacon. BigSwitch Networks develops Floodlight along with other contributors. In 2013, Beacon's popularity started to shrink and Floodlight started to gain ground. Floodlight fixed many issues of Beacon and added lots of additional features, which made it one of the most feature-rich controllers available. It also had a web interface, a Java-based GUI, and could be integrated with OpenStack using the quantum plugin. Integration with OpenStack was a big step forward, as it could be used to provide networking to a large pool of virtual machines, compute, and storage. Floodlight adoption increased with the evolution of OpenStack and its adopters. This gave Floodlight greater popularity and applicability than the controllers that came before it, and most controllers that came after Floodlight also supported OpenStack integration. Floodlight is still supported and developed by the community and BigSwitch Networks, and is the base for Big Cloud Fabric (BigSwitch's commercial SDN controller). Other open source SDN controllers have also been introduced, such as Trema (Ruby-based, from NEC), Ryu (supported by NTT), FlowER, LOOM, and the recent OpenMUL. The following table shows the current open source SDN controllers:

Active open source SDN controllers | Non-active open source SDN controllers
Floodlight | Beacon
OpenContrail | FlowER
OpenDaylight | NOX
LOOM | NodeFlow
OpenMUL |
ONOS |
POX |
Ryu |
Trema |

OpenDaylight
OpenDaylight started in early 2013 and was originally led by IBM and Cisco. It was a new collaborative open source project. OpenDaylight is hosted under the Linux Foundation and has drawn support and interest from many developers and adopters. OpenDaylight is a platform that provides common foundations and a robust array of services for SDN environments. OpenDaylight uses a controller model which supports OpenFlow as well as other southbound protocols. It is the first open source controller capable of employing non-OpenFlow proprietary control protocols, which lets OpenDaylight integrate with modern, multi-vendor networks. The first release of OpenDaylight came in February 2014 with the code name Hydrogen, followed by Helium in September 2014. The Helium release was significant because it marked a change in direction for the platform that has influenced the way subsequent controllers have been architected.
The main change was in the service abstraction layer, which is the part of the controller platform that resides just above the southbound protocols, such as OpenFlow, isolating them from the northbound side where the applications reside. Hydrogen used an API-driven Service Abstraction Layer (AD-SAL), which had limitations: specifically, it meant the controller needed to know about every type of device in the network and have an inventory of drivers to support them. Helium introduced a Model-driven Service Abstraction Layer (MD-SAL), which meant the controller didn't have to account for all the types of equipment installed in the network, allowing it to manage a wide range of hardware and southbound protocols. The Helium release made the framework much more agile and adaptable to changes in the applications; an application could now request changes to the model, which would be received by the abstraction layer and forwarded to the network devices. The OpenDaylight platform built on this advancement in its third release, Lithium, which was introduced in June of 2015. This release focused on broadening the programmability of the network, enabling organizations to create their own service architectures to deliver dynamic network services in a cloud environment and craft intent-based policies. The Lithium release was worked on by more than 400 individuals, with contributions from Big Switch Networks, Cisco, Ericsson, HP, NEC, and so on, making it one of the fastest growing open source projects ever. The fourth release, Beryllium, came out in February of 2016, and the most recent fifth release, Boron, was released in September 2016. Many vendors have built and developed commercial SDN controller solutions based on OpenDaylight. Each product has enhanced or added features to OpenDaylight to have some differentiating factor. Vendors use OpenDaylight in their products in several ways:
As a base, while selling a commercial version with additional proprietary functionality, for example: Brocade, Ericsson, Ciena, and so on
As part of the infrastructure in their Network as a Service (or XaaS) offerings, for example: Telstra, IBM, and so on
As elements for use in their solution, for example: ConteXtream (now part of HP)
Open Networking Operating System (ONOS), which was open sourced in December 2014, is focused on serving the needs of service providers. Although it is not as widely adopted as OpenDaylight, ONOS has been finding success and gaining momentum around WAN use cases. ONOS is backed by numerous organizations including AT&T, Cisco, Fujitsu, Ericsson, Ciena, Huawei, NTT, SK Telecom, NEC, and Intel, many of whom are also participants in and supporters of OpenDaylight. Apart from open source SDN controllers, there are many commercial, proprietary controllers available in the market. Products such as VMware NSX, Cisco APIC, BigSwitch Big Cloud Fabric, HP VAN, and NEC ProgrammableFlow are examples of commercial and proprietary products. The following table lists the commercially available controllers and their relationship to OpenDaylight:

ODL-based | ODL-friendly | Non-ODL based
Avaya | Cyan (acquired by Ciena) | BigSwitch
Brocade | HP | Juniper
Ciena | NEC | Cisco
ConteXtream (HP) | Nuage | Plexxi
Coriant | | PLUMgrid
Ericsson | Pluribus | Extreme
Sonus | Huawei (also ships non-ODL controller) | VMware NSX

Core features of SDN
Regardless of whether it is an open source or a proprietary SDN platform, there are core features and capabilities that the SDN platform is required to support.
These capabilities include:
Fabric programmability: Providing the ability to redirect traffic, apply filters to packets (dynamically), and leverage templates to streamline the creation of custom applications, and ensuring northbound APIs make the control information centralized in the controller available to be changed by SDN applications. This ensures the controller can dynamically adjust the underlying network to optimize traffic flows to use the least expensive path, take varying bandwidth constraints into consideration, and meet quality of service (QoS) requirements.
Southbound protocol support: Enabling the controller to communicate with switches and routers and to manipulate and optimize how they manage the flow of traffic. Currently, OpenFlow is the most standard protocol used between different networking vendors, although there are other southbound protocols that can be used. An SDN platform should support different versions of OpenFlow in order to provide compatibility with different switching equipment.
External API support: Ensuring the controller can be used within varied orchestration and cloud environments such as VMware vSphere, OpenStack, and so on. Using APIs, the orchestration platform can communicate with the SDN platform in order to publish network policies. For example, VMware vSphere can talk to the SDN platform to extend the virtual distributed switches (VDS) from the virtual environment to the physical underlay network without requiring a network engineer to configure the network.
Centralized monitoring and visualization: Since the SDN controller has full visibility over the network, it can offer end-to-end visibility of the network and centralized management to improve overall performance, simplify the identification of issues, and accelerate troubleshooting. The SDN controller is able to discover and present a logical abstraction of all the physical links in the network; it can also discover and present a map of connected devices (MAC addresses) that relate to the virtual or physical devices connected to the network. The SDN controller should support monitoring protocols, such as syslog and SNMP, as well as APIs, in order to integrate with third-party management and monitoring systems.
Performance: Performance in an SDN environment mainly depends on how fast the SDN controller fills the flow tables of the SDN-enabled switches. Most SDN controllers pre-populate the flow tables on switches to minimize the delay. When an SDN-enabled switch receives a packet that doesn't match an entry in its flow table, it sends the packet to the SDN controller in order to find out where the packet needs to be forwarded. A robust SDN solution should ensure that the number of requests from switches is kept to a minimum and that the SDN controller doesn't become a bottleneck in the network.
High availability and scalability: Controllers must support high availability clusters to ensure reliability and service continuity in case a controller fails. Clustering in an SDN controller extends to scalability: a modern SDN platform should allow more controller nodes to be added, with load balancing, in order to increase performance and availability. Modern SDN controllers support clustering across multiple geographical locations.
Security: Since all switches communicate with the SDN controller, the communication channel needs to be secured to ensure unauthorized devices don't compromise the network.
The SDN controller should secure the southbound channels and use encrypted messaging and mutual authentication to provide access control. Apart from that, the SDN controller must implement preventive mechanisms against denial of service attacks. The deployment of authorization levels and access controls for multi-tenant SDN platforms is also a key requirement.
Apart from the aforementioned features, SDN controllers are likely to expand their function in the future. They may become a network operating system and change the way we used to build networks with hardware, switches, SFPs, and gigs of bandwidth. The future will look more software defined, as the silicon and hardware industry has already delivered on its promises of high-performance 40G and 100G networking chips. The industry needs more time to digest the new hardware and silicon and to refresh its equipment with new gear supporting 10 times the current performance.
Current SDN controllers
In this section, I'm putting the different SDN controllers in a table. This will help you to understand the current market players in SDN and how OpenDaylight relates to them:

Vendor/product | Based on OpenDaylight? | Commercial/open source | Description
Brocade SDN controller | Yes | Commercial | It's a commercial version of OpenDaylight, fully supported and with extra reliable modules.
Cisco APIC | No | Commercial | Cisco Application Policy Infrastructure Controller (APIC) is the unifying automation and management point for the Application Centric Infrastructure (ACI) data center fabric. Cisco uses the APIC controller and Nexus 9k switches to build the fabric, and OpFlex as the main southbound protocol.
Ericsson SDN controller | Yes | Commercial | Ericsson's SDN controller is a commercial (hardened) version of the OpenDaylight SDN controller. Domain-specific control applications that use the SDN controller as a platform form the basis of the three commercial products in Ericsson's SDN controller portfolio.
Juniper OpenContrail/Contrail | No | Both | OpenContrail is open source, and Contrail itself is a commercial product. Juniper Contrail Networking is an open SDN solution that consists of the Contrail controller, the Contrail vRouter, an analytics engine, and published northbound APIs for cloud and NFV. OpenContrail is also available for free from Juniper. Contrail promotes and uses MPLS in the datacenter.
NEC ProgrammableFlow | No | Commercial | NEC provides its own SDN controller and switches. The NEC SDN platform is one of the choices for enterprises and has lots of traction and some case studies.
Avaya SDN Fx controller | Yes | Commercial | Based on OpenDaylight, bundled as a solution package.
Big Cloud Fabric | No | Commercial | BigSwitch Networks' solution is based on the Floodlight open source project. Big Cloud Fabric is a robust, clean SDN controller and works with bare-metal whitebox switches. Big Cloud Fabric includes Switch Light OS, a switch operating system that can be loaded on whitebox switches with Broadcom Trident 2 or Tomahawk silicon. The benefit of Big Cloud Fabric is that you are not bound to any hardware and can use bare-metal switches from any vendor.
Ciena's Agility | Yes | Commercial | Ciena's Agility multilayer WAN controller is built atop the open source baseline of the OpenDaylight Project, an open, modular framework created by a vendor-neutral ecosystem (rather than a vendor-centric ego-system) that will enable network operators to source network services and applications from both Ciena's Agility and others.
HP VAN (Virtual Application Network) | No | Commercial | The building block of the HP open SDN ecosystem; the controller allows third-party developers to deliver innovative SDN solutions.
Huawei Agile Controller | Yes and No (based on editions) | Commercial | Huawei's SDN controller, which integrates as a solution with Huawei enterprise switches.
Nuage | No | Commercial | Nuage Networks VSP provides SDN capabilities for clouds of all sizes. It is implemented as a non-disruptive overlay for all existing virtualized and non-virtualized server and network resources.
Pluribus Netvisor | No | Commercial | Netvisor Premium and Open Netvisor Linux are distributed network operating systems. Open Netvisor integrates a traditional, interoperable networking stack (L2/L3/VXLAN) with an SDN distributed controller that runs in every switch of the fabric.
VMware NSX | No | Commercial | VMware NSX is an overlay type of SDN which currently works with VMware vSphere. The plan is to support OpenStack in the future. VMware NSX also has a built-in firewall, router, and L4 load balancers, allowing micro-segmentation.

OpenDaylight as an SDN controller
Previously, we went through the role of the SDN controller and a brief history of ODL. ODL is a modular open SDN platform which allows developers to build any network or business application on top of it to drive the network in the way they want. OpenDaylight has currently reached its fifth release (Boron, which is the fifth element in the periodic table). ODL releases are named after the elements of the periodic table, starting from the first release, Hydrogen. ODL has a six-month release cycle, and with many developers working on expanding ODL, two releases per year are expected from the community. For technical readers to understand it more clearly, the following diagram will help. The ODL platform has a broad set of use cases for multivendor, brownfield and greenfield networks, service providers, and enterprises, and it is a foundation for networks of the future. Service providers are using ODL to migrate their services to a software-enabled level with automatic service delivery, moving away from a circuit-based mindset of service delivery. They also work on providing a virtualized CPE with NFV support in order to offer flexible services. Enterprises use ODL for many use cases, from datacenter networking, cloud and NFV, network automation and resource optimization, and visibility and control, to deploying a fully SDN campus network. ODL uses an MD-SAL, which makes it very scalable and lets it incorporate new applications and protocols faster. ODL supports multiple standard and proprietary southbound protocols; for example, with full support for OpenFlow and OVSDB, ODL can communicate with any standard hardware (or even virtual switches such as Open vSwitch (OVS)) supporting those protocols. With such support, ODL can be deployed and used in multivendor environments and control hardware from different vendors from a single console, no matter what vendor and what device it is, as long as it supports standard southbound protocols. ODL uses a micro services architecture model which allows users to control applications, protocols, and plugins while deploying SDN applications. ODL is also able to manage the connections between external consumers and providers.
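To ground this, here is a hedged Python sketch of what direct fabric programming looks like through ODL's RESTCONF northbound interface: it writes a single drop flow into table 0 of a switch the controller knows as openflow:1. The controller address, credentials, node ID, and flow values are illustrative assumptions rather than anything prescribed by the book; the payload follows the flow-node-inventory model used by the OpenFlow plugin.

import requests

CONTROLLER = "http://192.168.10.5:8181"   # hypothetical ODL instance
AUTH = ("admin", "admin")                 # default ODL credentials -- change in production

# A flow written to the config data store; ODL's OpenFlow plugin pushes it to the switch.
flow = {
    "flow-node-inventory:flow": [{
        "id": "demo-drop-10",
        "table_id": 0,
        "priority": 200,
        "match": {
            "ethernet-match": {"ethernet-type": {"type": 2048}},  # match IPv4 traffic
            "ipv4-destination": "10.0.0.99/32"
        },
        "instructions": {"instruction": [{
            "order": 0,
            "apply-actions": {"action": [{"order": 0, "drop-action": {}}]}
        }]}
    }]
}

url = (CONTROLLER + "/restconf/config/opendaylight-inventory:nodes/"
       "node/openflow:1/flow-node-inventory:table/0/flow/demo-drop-10")
resp = requests.put(url, json=flow, auth=AUTH)
print(resp.status_code)  # 200 or 201 means the flow was accepted into the config data store

A matching GET on the same URL, or on the operational data store once the switch reports the flow back, is a quick way to confirm that the controller accepted and installed it.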
The following diagram explains the ODL footprint and the different components and projects within ODL:
Micro services architecture
ODL stores its YANG data structures in a common data store and uses a messaging infrastructure between the different components to enable a model-driven approach to describing the network and its functions. In the ODL MD-SAL, any SDN application can be integrated as a service and then loaded into the SDN controller. These services (apps) can be chained together in any number of ways to match the application's needs. This concept allows users to install and enable only the protocols and services they need, which makes the system light and efficient. Also, services and applications created by users can be shared with others in the ecosystem, since SDN application deployment for ODL follows a modular design. ODL supports multiple southbound protocols: OpenFlow and OpenFlow extensions such as Table Type Patterns (TTP), as well as other protocols including NETCONF, BGP/PCEP, CAPWAP, and OVSDB. ODL also supports the Cisco OpFlex protocol. The ODL platform provides a framework for authentication, authorization, and accounting (AAA), as well as automatic discovery and securing of network devices and controllers. Another key area in security is the use of encrypted and authenticated communication through the southbound protocols with switches and routers within the SDN domain. Most southbound protocols support security and encryption mechanisms.
Summary
In this article we learned about SDN and why it is important. We reviewed the SDN controller products and the ODL history, as well as the core features of SDN controllers and the market-leading controllers, and we dived into some of the details of SDN.
Resources for Article: Further resources on this subject: Managing Network Devices [article] Setting Up a Network Backup Server with Bacula [article] Point-to-Point Networks [article]

RabbitMQ Acknowledgements

Packt
16 Sep 2013
3 min read
Acknowledgements (Intermediate)
This task will examine reliable message delivery from the RabbitMQ server to a consumer.
Getting ready
If a consumer takes a message/order from our queue and the consumer dies, our unprocessed message/order will die with it. In order to make sure a message is never lost, RabbitMQ supports message acknowledgments, or acks. When a consumer has received and processed a message, an acknowledgment is sent to RabbitMQ from the consumer, informing RabbitMQ that it is free to delete the message. If a consumer dies without sending an acknowledgment, RabbitMQ will redeliver the message to another consumer.
How to do it...
Let's navigate to our source code examples folder and locate the folder Message-Acknowledgement. Take a look at the consumer.js script and examine the changes we have made to support acks. We pass the {ack:true} option to the q.subscribe function, which tells the queue that messages should be acknowledged before being removed:
q.subscribe({ack:true}, function(message) {
When our message has been processed, we call q.shift, which informs RabbitMQ that the message has been processed and can now be removed from the queue:
q.shift();
You can also use the prefetchCount option to increase the window of how many messages the server will send you before you need to send an acknowledgement. {ack:true, prefetchCount:1} is the default and will only send you one message before you acknowledge. Setting prefetchCount to 0 will make that window unlimited. A low value will impact performance, so it may be worth considering a higher value.
Let's demonstrate this concept. Edit the consumer.js script located in the folder Message-Acknowledgement, and simply comment out the line q.shift(), which will stop the consumer from acknowledging the messages.
Open a command-line console and start RabbitMQ:
rabbitmq-server
Now open a command-line console, navigate to our source code examples folder, and locate the folder Message-Acknowledgement. Execute the following command:
Message-Acknowledgement> node producer
Let the producer create several message/orders; press Ctrl + C while on the command-line console to stop the producer creating orders. Now execute the following to begin consuming messages:
Message-Acknowledgement> node consumer
Let's open another command-line console and run list_queues:
rabbitmqctl list_queues messages_ready messages_unacknowledged
The response should display our shop queue; the details include the name, the number of messages ready to be processed, and one message which has not been acknowledged:
Listing queues ...
shop.queue 9 1
...done.
If you press Ctrl + C on the command-line console to stop the consumer script and then list the queues again, you will notice the message has returned to the queue:
Listing queues ...
shop.queue 10 0
...done.
If you undo the change we made to the consumer.js script and re-run these steps, the application will work correctly, consuming messages one at a time and sending an acknowledgment to RabbitMQ when each message has been processed.
Summary
This article explained the reliable message delivery process in RabbitMQ using acknowledgements. It also listed the steps that demonstrate acknowledgements for a messaging application using scripts in RabbitMQ.
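The acknowledgement pattern itself is not tied to Node.js. As a rough cross-check, the following minimal Python sketch does the same thing with the pika client; the localhost broker is an assumption, the queue name shop.queue is taken from the output above, and basic_qos(prefetch_count=1) plays the role of the prefetchCount option discussed earlier.

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# Declare the queue defensively; the declare parameters must match whatever
# settings your producer used when it declared shop.queue.
channel.queue_declare(queue="shop.queue")

# Deliver only one unacknowledged message at a time (like prefetchCount: 1).
channel.basic_qos(prefetch_count=1)

def handle_order(ch, method, properties, body):
    print("processing order:", body)
    # Acknowledge only after the work is done; if the consumer dies first,
    # RabbitMQ re-queues the message and delivers it to another consumer.
    ch.basic_ack(delivery_tag=method.delivery_tag)

# auto_ack=False is the equivalent of subscribing with {ack: true}.
channel.basic_consume(queue="shop.queue", on_message_callback=handle_order, auto_ack=False)
channel.start_consuming()

Commenting out the basic_ack call reproduces the same behavior as removing q.shift(): the messages_unacknowledged count grows, and the messages return to the queue when the consumer stops.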
Resources for Article: Further resources on this subject: Getting Started with Oracle Information Integration [Article] Messaging with WebSphere Application Server 7.0 (Part 1) [Article] Using Virtual Destinations (Advanced) [Article]

Managing Nano Server with Windows PowerShell and Windows PowerShell DSC

Packt
05 Jul 2017
8 min read
In this article by Charbel Nemnom, the author of the book Getting Started with Windows Nano Server, we will cover the following topics:
Remote server graphical tools
Server manager
Hyper-V manager
Microsoft management console
Managing Nano Server with PowerShell
Remote server graphical tools
Without the Graphical User Interface (GUI), it's not easy to carry out the daily management and maintenance of Windows Server. For this reason, Microsoft integrated Nano Server with all the existing graphical tools that you are familiar with, such as Hyper-V Manager, Failover Cluster Manager, Server Manager, Registry Editor, File Explorer, Disk and Device Manager, server configuration, Computer Management, and the Users and Groups console. All of those tools and consoles can be used to manage Nano Server remotely. The GUI is always the easiest option to use. In this section, we will discuss how to access and set the most common configurations in Nano Server with remote graphical tools.
Server manager
Before we start managing Nano Server, we need to obtain the IP address or the computer name of the Nano Server instance, physical or virtual machine, that we want to connect to and manage remotely. Log in to your management machine and make sure you have installed the latest Remote Server Administration Tools (RSAT) for Windows Server 2016 or Windows 10. You can download the latest RSAT tools from the following link: https://www.microsoft.com/en-us/download/details.aspx?id=45520
Launch Server Manager as shown in Figure 1, and add the Nano Server(s) that you would like to manage:
Figure 1: Managing Nano Server using server manager
You can refresh the view and browse all events and services as you would expect. I want to point out that Best Practices Analyzer (BPA) is not supported in Nano Server. BPA is completely cmdlet-based and was written in C# back in the days of PowerShell 2.0. It also statically uses some .NET XML library code that was not part of the .NET Framework at that time. So, do not expect to see Best Practices Analyzer in Server Manager.
Hyper-V manager
The next console that you will probably want to access is Hyper-V Manager. Right-click on the Nano Server name in Server Manager and select the Hyper-V Manager console, as shown in Figure 2:
Figure 2: Managing Nano Server using Hyper-V manager
Hyper-V Manager will launch with full support, as you would expect when managing full Windows Server 2016 Hyper-V, the free Hyper-V Server, Server Core, and Nano Server with the Hyper-V role.
Microsoft management console
You can use the Microsoft Management Console (MMC) to manage Nano Server as well. From the command line, type mmc.exe. From the File menu, click Add/Remove Snap-in… and then select Computer Management and click Add. Choose Another computer and add the IP address or the computer name of your Nano Server machine, then click OK. As shown in Figure 3, you can expand System Tools and check the tools that you are familiar with (Event Viewer, Local Users and Groups, Shares, and Services). Please note that some of these MMC tools, such as Task Scheduler and Disk Management, cannot be used against Nano Server. Also, for certain tools you need to open some ports in the Windows firewall:
Figure 3: Managing Nano Server using Microsoft Management Console
Managing Nano Server with PowerShell
For most IT administrators, the graphical user interface is the easiest way to work. On the other hand, PowerShell brings speed and automation.
That's why in Windows Server 2016, the Nano Server deployment option of Windows Server comes with full PowerShell remoting support. The purpose of the core PowerShell engine is to manage Nano Server instances at scale. PowerShell remoting support includes DSC, the Windows Server cmdlets (network, storage, Hyper-V, and so on), remote file transfer, remote script authoring and debugging, and PowerShell Web Access. Windows PowerShell version 5.1 on Nano Server supports some new features, including the following:
Copying files via PowerShell sessions
Remote file editing in PowerShell ISE
Interactive script debugging over a PowerShell session
Remote script debugging within PowerShell ISE
Remote host process connection and debugging
PowerShell version 5.1 is available in different editions, which denote varying feature sets and platform compatibility: the Desktop edition targets full Server, Server Core, and the Windows desktop, while the Core edition targets Nano Server and Windows IoT. You can find a list of Windows PowerShell features not yet available in Nano Server here. As Nano Server is still evolving, we will see what the next cadence update brings for the unavailable PowerShell features. If you want to manage your Nano Server, you can use PowerShell remoting, or, if your Nano Server instance is running in a virtual machine, you can also use PowerShell Direct; more on that at the end of this section. In order to manage a Nano Server installation using PowerShell remoting, carry out the following steps:
You may need to start the WinRM service on your management machine to enable remote connections. From the PowerShell console, type the following command:
net start WinRM
If you want to manage Nano Server in a workgroup environment, open the PowerShell console and type the following command, substituting the server name or IP with the right value. Using your machine name is the easiest option, but if your device is not uniquely named on your network, you can use the IP address instead:
Set-Item WSMan:\localhost\Client\TrustedHosts -Value "servername or IP"
If you want to connect to multiple devices, you can use commas and quotation marks to separate each device:
Set-Item WSMan:\localhost\Client\TrustedHosts -Value "servername or IP, servername or IP"
You can also allow connections to a specific network subnet using the following command:
Set-Item WSMan:\localhost\Client\TrustedHosts -Value 10.10.100.*
To test Windows PowerShell remoting against Nano Server and check that it's working, you can use the following command:
Test-WSMan -ComputerName "servername or IP" -Credential servername\Administrator -Authentication Negotiate
You can now start an interactive session with Nano Server. Open an elevated PowerShell console and type the following command:
Enter-PSSession -ComputerName "servername or IP" -Credential servername\Administrator
In the following example, we will create two virtual machines on the Nano Server Hyper-V host using PowerShell remoting.
From your management machine, open an elevated PowerShell console or the PowerShell scripting environment, and run the following script (make sure to update the variables to match your environment):
#region Variables
$NanoSRV = 'NANOSRV-HV01'
$Cred = Get-Credential "Demo\SuperNano"
$Session = New-PSSession -ComputerName $NanoSRV -Credential $Cred
$CimSession = New-CimSession -ComputerName $NanoSRV -Credential $Cred
$VMTemplatePath = 'C:\Temp'
$vSwitch = 'Ext_vSwitch'
$VMName = 'DemoVM-0'
#endregion
# Copying the VM template from the management machine to Nano Server
Get-ChildItem -Path $VMTemplatePath -Filter *.VHDX -Recurse | Copy-Item -ToSession $Session -Destination D:\
1..2 | ForEach-Object {
New-VM -CimSession $CimSession -Name "$VMName$_" -VHDPath "D:\$VMName$_.vhdx" -MemoryStartupBytes 1024GB `
-SwitchName $vSwitch -Generation 2
Start-VM -CimSession $CimSession -VMName "$VMName$_" -Passthru
}
In this script, we are creating a PowerShell session and a CIM session to Nano Server. A CIM session is a client-side object representing a connection to a local or remote computer. Then we copy the VM templates from the management machine to Nano Server over PowerShell remoting; when the copy is completed, we create two Generation 2 virtual machines and finally start them. After a couple of seconds, you can launch the Hyper-V Manager console and see the new VMs running on the Nano Server host, as shown in Figure 4:
Figure 4: Creating virtual machines on Nano Server host using PowerShell remoting
If you have installed Nano Server in a virtual machine running on a Hyper-V host, you can use PowerShell Direct to connect directly from your Hyper-V host to your Nano Server VM, without any network connection, by using the following command:
Enter-PSSession -VMName <VMName> -Credential .\Administrator
So instead of specifying the computer name, we specified the VM name. PowerShell Direct is powerful; it's one of my favorite features. You can configure a bunch of VMs from scratch in just a couple of seconds, without any network connection. Moreover, if you have Nano Server running as a Hyper-V host, as shown in the example earlier, you can use PowerShell remoting first to connect to Nano Server from your management machine, and then leverage PowerShell Direct to manage the virtual machines running on top of Nano Server. In this example, we use two PowerShell technologies (PS remoting and PS Direct). This is very powerful and opens up many possibilities for effectively managing Nano Server. To do that, you can use the following commands:
#region Variables
$NanoSRV = 'NANOSRV-HV01'    # Nano Server name or IP address
$DomainCred = Get-Credential "Demo\SuperNano"
$VMLocalCred = Get-Credential "~\Administrator"
$Session = New-PSSession -ComputerName $NanoSRV -Credential $DomainCred
#endregion
Invoke-Command -Session $Session -ScriptBlock {
Get-VM
Invoke-Command -VMName (Get-VM).Name -Credential $Using:VMLocalCred -ScriptBlock {
hostname
Tzutil /g
}
}
In this script, we create a PowerShell session into the Nano Server physical host, and then use PowerShell Direct to list all VMs, including their hostnames and time zones. The result is shown in Figure 5:
Figure 5: Nested PowerShell remoting
Summary
In this article, we discussed how to manage a Nano Server installation using remote server graphical tools and Windows PowerShell remoting.
Resources for Article: Further resources on this subject: Exploring Windows PowerShell 5.0 [article] Exchange Server 2010 Windows PowerShell: Mailboxes and Reports [article] Exchange Server 2010 Windows PowerShell: Managing Mailboxes [article]