
How-To Tutorials - Programming

1081 Articles

Using the ReactiveCocoa 4 Style

Packt
13 Jan 2016
5 min read
In this article by Cecil Costa, author of Reactive Programming with Swift, we will cover the following topics: Setting up a project Developing a currency class (For more resources related to this topic, see here.) The first version of ReactiveCocoa for Swift was 3.0; however, it worked like the Objective-C way of programming, except for some operators and some other stuff. For this reason, the ReactiveCocoa team decided to create another version, quickly taking the advantages of the new features of Swift 2. Setting up the project Open Xcode and create a new project called Chapter 6 Shop Cart; ensure that Swift is the main programming language. Install the ReactiveCocoa using your favorite method; just ensure that version 4 is installed. For example, if you use CocoaPods, make sure that you have specified version 4. Once you have set up the project, go to the storyboard. Here, we will start by adding a layout where the user can see the whole supermarket catalog on a table view, the Total label in the bottom-left corner, and a Checkout button in the bottom-right corner. Of course, a label with a screen title would be fine to make it more visible to the user. In short, you can have a simple layout, as shown in the following screenshot: Now we have to think about how to display the products. Here, we have to display a picture, its price, description, a label with the quantity that was added to the shopping cart, button to add one unit of the product, and a button to remove it. To do it, just add a Table View cell on the only table we have on the screen. Place an image view on the cell, three labels (one for the product name, the other for its price, and the third for the quantity in the basket), and two buttons. Set the buttons' image to minus and plus images. The final cell layout will look similar to what is shown in the following screenshot. Feel free to change this layout if you are a creative person. Using the Assistant Editor, connect the table view, the Total label, and the Checkout button with their corresponding property in the ViewController class:    @IBOutlet weak var catalogTableView: UITableView!     @IBOutlet weak var totalLabel: UILabel!     @IBOutlet weak var checkoutButton: UIButton! Finally, let's create a class for the table view cell that we have created. Add a new file to your project called ProductCell.swift. Import UIKit, and create an empty class that inherits from UITableViewCell using the following code: import UIKit class ProductCell:UITableViewCell { } Return to the storyboard, select Table View Cell, go to its Identity Inspector with command+option+3, and change its class to ProductCell, as is demonstrated in this screenshot: Now, with command+option+4, we can go to Attribute Inspector, and change the cell identifier to productcell: Open the Assistant Editor, and connect the UI cell components with their corresponding attributes: @IBOutlet weak var nameLabel:UILabel!     @IBOutlet weak var priceLabel:UILabel!     @IBOutlet weak var productImage:UIImageView!     @IBOutlet weak var addButton:UIButton!     @IBOutlet weak var removeButton:UIButton!     @IBOutlet weak var numberOfItemsLabel:UILabel! Great, the basic setup for the first screen is done. Now, let's start implementing our shopping cart. Developing the Currency class In this app, a currency is an important class. The output on the screen will change according to the user's currency. 
The only stored properties we will need are the name, a three-letter code such as USD for US Dollars or GBP for British Pounds, and the rate relative to the base currency. As mentioned at the beginning of this article, the base currency will be GBP; however, you can easily change it by altering the static attribute called baseCurrency. The final code for this class is as simple as this:

    class Currency: NSObject {
        var name: String
        var rate: Double

        static var baseCurrency: Currency = {
            return Currency(name: "GBP", rate: 1.0)
        }()

        init(name: String, rate: Double) {
            self.name = name
            self.rate = rate
        }
    }

Why does this class inherit from NSObject? The reason is that objects of the Currency type can only be marked as dynamic if they are Objective-C compatible; to achieve this, you just need to inherit from NSObject.

Summary

To summarize this article, we covered the basic steps to set up our project. We installed ReactiveCocoa and added a layout to our project. We also developed our dynamic Currency class, which inherits from NSObject.

Resources for Article:

Further resources on this subject: Exploring Swift [article] The Swift Programming Language [article] Sprites, Camera, Actions! [article]
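As a quick usage sketch (not part of the original listing), the Currency class above can convert a price stored in the base currency into whatever currency the user selects. The USD rate and the price below are assumed values used purely for illustration:

    // Minimal usage sketch, assuming the Currency class defined above.
    let usdCurrency = Currency(name: "USD", rate: 1.45)   // assumed rate, for illustration only
    let basePrice = 10.0                                  // a price in the base currency (GBP)

    // Convert from the base currency into the selected currency via the rates.
    let convertedPrice = basePrice * usdCurrency.rate / Currency.baseCurrency.rate
    print("\(usdCurrency.name) \(convertedPrice)")        // prints "USD 14.5"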


Façade Pattern – Being Adaptive with Façade

Packt
12 Jan 2016
11 min read
In this article by Chetan Giridhar, author of the book, Learning Python Design Patterns - Second Edition, we will get introduced to the Façade design pattern and how it is used in software application development. We will work with a sample use case and implement it in Python v3.5. In brief, we will cover the following topics in this article: An understanding of the Façade design pattern with a UML diagram A real-world use case with the Python v3.5 code implementation The Façade pattern and principle of least knowledge (For more resources related to this topic, see here.) Understanding the Façade design pattern Façade is generally referred to as the face of the building, especially an attractive one. It can be also referred to as a behavior or appearance that gives a false idea of someone's true feelings or situation. When people walk past a façade, they can appreciate the exterior face but aren't aware of the complexities of the structure within. This is how a façade pattern is used. Façade hides the complexities of the internal system and provides an interface to the client that can access the system in a very simplified way. Consider an example of a storekeeper. Now, when you, as a customer, visit a store to buy certain items, you're not aware of the layout of the store. You typically approach the storekeeper who is well aware of the store system. Based on your requirements, the storekeeper picks up items and hands them over to you. Isn't this easy? The customer need not know how the store looks and s/he gets the stuff done through a simple interface, the storekeeper. The Façade design pattern essentially does the following: It provides a unified interface to a set of interfaces in a subsystem and defines a high-level interface that helps the client use the subsystem in an easy way. Façade discusses representing a complex subsystem with a single interface object. It doesn't encapsulate the subsystem but actually combines the underlying subsystems. It promotes the decoupling of the implementation with multiple clients. A UML class diagram We will now discuss the Façade pattern with the help of the following UML diagram: As we observe the UML diagram, you'll realize that there are three main participants in this pattern: Façade: The main responsibility of a façade is to wrap up a complex group of subsystems so that it can provide a pleasing look to the outside world. System: This represents a set of varied subsystems that make the whole system compound and difficult to view or work with. Client: The client interacts with the Façade so that it can easily communicate with the subsystem and get the work completed. It doesn't have to bother about the complex nature of the system. You will now learn a little more about the three main participants from the data structure's perspective. Façade The following points will give us a better idea of Façade: It is an interface that knows which subsystems are responsible for a request It delegates the client's requests to the appropriate subsystem objects using composition For example, if the client is looking for some work to be accomplished, it need not have to go to individual subsystems but can simply contact the interface (Façade) that gets the work done System In the Façade world, System is an entity that performs the following: It implements subsystem functionality and is represented by a class. Ideally, a System is represented by a group of classes that are responsible for different operations. 
It handles the work assigned by the Façade object but has no knowledge of the façade and keeps no reference to it. For instance, when the client requests the Façade for a certain service, Façade chooses the right subsystem that delivers the service based on the type of service Client Here's how we can describe the client: The client is a class that instantiates the Façade It makes requests to the Façade to get the work done from the subsystems Implementing the Façade pattern in the real world To demonstrate the applications of the Façade pattern, let's take an example that we'd have experienced in our lifetime. Consider that you have a marriage in your family and you are in charge of all the arrangements. Whoa! That's a tough job on your hands. You have to book a hotel or place for marriage, talk to a caterer for food arrangements, organize a florist for all the decorations, and finally handle the musical arrangements expected for the event. In yesteryears, you'd have done all this by yourself, such as talking to the relevant folks, coordinating with them, negotiating on the pricing, but now life is simpler. You go and talk to an event manager who handles this for you. S/he will make sure that they talk to the individual service providers and get the best deal for you. From the Façade pattern perspective we will have the following three main participants: Client: It's you who need all the marriage preparations to be completed in time before the wedding. They should be top class and guests should love the celebrations. Façade: The event manager who's responsible for talking to all the folks that need to work on specific arrangements such as food, flower decorations, among others Subsystems: They represent the systems that provide services such as catering, hotel management, and flower decorations Let's develop an application in Python v3.5 and implement this use case. We start with the client first. It's you! Remember, you're the one who has been given the responsibility to make sure that the marriage preparations are done and the event goes fine! However, you're being clever here and passing on the responsibility to the event manager, isn't it? Let's now look at the You class. In this example, you create an object of the EventManager class so that the manager can work with the relevant folks on marriage preparations while you relax. class You(object):     def __init__(self):         print("You:: Whoa! Marriage Arrangements??!!!")     def askEventManager(self):         print("You:: Let's Contact the Event Manager\n\n")         em = EventManager()         em.arrange()     def __del__(self):         print("You:: Thanks to Event Manager, all preparations done! Phew!") Let's now move ahead and talk about the Façade class. As discussed earlier, the Façade class simplifies the interface for the client. In this case, EventManager acts as a façade and simplifies the work for You. Façade talks to the subsystems and does all the booking and preparations for the marriage on your behalf. 
Here is the Python code for the EventManager class: class EventManager(object):         def __init__(self):         print("Event Manager:: Let me talk to the folks\n")         def arrange(self):         self.hotelier = Hotelier()         self.hotelier.bookHotel()                 self.florist = Florist()         self.florist.setFlowerRequirements()                  self.caterer = Caterer()         self.caterer.setCuisine()                 self.musician = Musician()         self.musician.setMusicType() Now that we're done with the Façade and client, let's dive into the subsystems. We have developed the following classes for this scenario: Hotelier is for the hotel bookings. It has a method to check whether the hotel is free on that day (__isAvailable) and if it is free for booking the Hotel (bookHotel). The Florist class is responsible for flower decorations. Florist has the setFlowerRequirements() method to be used to set the expectations on the kind of flowers needed for the marriage decoration. The Caterer class is used to deal with the caterer and is responsible for the food arrangements. Caterer exposes the setCuisine() method to accept the type of cuisine to be served at the marriage. The Musician class is designed for musical arrangements at the marriage. It uses the setMusicType() method to understand the music requirements for the event. class Hotelier(object):     def __init__(self):         print("Arranging the Hotel for Marriage? --")         def __isAvailable(self):         print("Is the Hotel free for the event on given day?")         return True       def bookHotel(self):         if self.__isAvailable():             print("Registered the Booking\n\n")     class Florist(object):     def __init__(self):         print("Flower Decorations for the Event? --")         def setFlowerRequirements(self):         print("Carnations, Roses and Lilies would be used for Decorations\n\n")     class Caterer(object):     def __init__(self):         print("Food Arrangements for the Event --")         def setCuisine(self):         print("Chinese & Continental Cuisine to be served\n\n")     class Musician(object):     def __init__(self):         print("Musical Arrangements for the Marriage --")         def setMusicType(self):         print("Jazz and Classical will be played\n\n")   you = You() you.askEventManager() The output of the preceding code is given here: In the preceding code example: The EventManager class is the Façade that simplifies the interface for You EventManager uses composition to create objects of the subsystems such as Hotelier, Caterer, and others The principle of least knowledge As you have learned in the initial parts of this article, the Façade provides a unified system that makes subsystems easy to use. It also decouples the client from the subsystem of components. The design principle that is employed behind the Façade pattern is the principle of least knowledge. The principle of least knowledge guides us to reduce the interactions between objects to just a few friends that are close enough to you. In real terms, it means the following:: When designing a system, for every object created, one should look at the number of classes that it interacts with and the way in which the interaction happens. Following the principle, make sure that we avoid situations where there are many classes created tightly coupled to each other. If there are a lot of dependencies between classes, the system becomes hard to maintain. 
Any changes in one part of the system can lead to unintentional changes to other parts of the system, which means that the system is exposed to regressions, and this should be avoided.

Summary

We began the article by first understanding the Façade design pattern and the context in which it's used. We understood the basis of Façade and how it is effectively used in software architecture. We looked at how the Façade design pattern creates a simplified interface for clients to use. It simplifies the complexity of subsystems so that the client benefits. The Façade doesn't encapsulate the subsystem, and the client is free to access the subsystems even without going through the Façade. You also learned the pattern with a UML diagram and a sample code implementation in Python v3.5. We understood the principle of least knowledge and how its philosophy governs the Façade design pattern.

Further resources on this subject: Asynchronous Programming with Python [article] Optimization in Python [article] The Essentials of Working with Python Collections [article]
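Before moving on: the output screenshot referenced in the example above does not reproduce in this text version, so here is a rough sketch of what driving the example prints, derived from the print() calls in the listings (exact blank lines depend on the \n characters in those strings, and the last line appears when the You object is destroyed):

    # Driver, as given at the end of the listing above
    you = You()
    you.askEventManager()

    # Approximate console output:
    # You:: Whoa! Marriage Arrangements??!!!
    # You:: Let's Contact the Event Manager
    # Event Manager:: Let me talk to the folks
    # Arranging the Hotel for Marriage? --
    # Is the Hotel free for the event on given day?
    # Registered the Booking
    # Flower Decorations for the Event? --
    # Carnations, Roses and Lilies would be used for Decorations
    # Food Arrangements for the Event --
    # Chinese & Continental Cuisine to be served
    # Musical Arrangements for the Marriage --
    # Jazz and Classical will be played
    # You:: Thanks to Event Manager, all preparations done! Phew!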


Using JavaScript with HTML

Packt
12 Jan 2016
13 min read
In this article by Syed Omar Faruk Towaha, author of the book JavaScript Projects for Kids, we will discuss about HTML, HTML canvas, implementing JavaScript codes on our HTML pages, and few JavaScript operations. (For more resources related to this topic, see here.) HTML HTML is a markup language. What does it mean? Well, a markup language processes and presents texts using specific codes for formatting, styling, and layout design. There are lots of markup languages; for example, Business Narrative Markup Language (BNML), ColdFusion Markup Language (CFML), Opera Binary Markup Language (OBML), Systems Biology Markup Language (SBML), Virtual Human Markup Language (VHML), and so on. However, in modern web, we use HTML. HTML is based on Standard Generalized Markup Language (SGML). SGML was basically used to design document papers. There are a number of versions of HTML. HTML 5 is the latest version. Throughout this book, we will use the latest version of HTML. Before you start learning HTML, think about your favorite website. What does the website contain? A few web pages? You may see some texts, few images, one or two text field, buttons, and some more elements on each of the webpages. Each of these elements are formatted by HTML. Let me introduce you to a web page. On your Internet browser, go to https://www.google.com. You will see a page similar to the following image: The first thing that you will see on the top of your browser is the title of the webpage: Here, the marked box, 1, is the title of the web page that we loaded. The second box, 2, indicates some links or some texts. The word Google in the middle of the page is an image. The third box, 3, indicates two buttons. Can you tell me what Sign in on the right-hand top of the page is? Yes, it is a button. Let's demonstrate the basic structure of HTML. The term tag will be used frequently to demonstrate the structure. An HTML tag is nothing but a few predefined words between the less than sign (<) and the greater than sign (>). Therefore, the structure of a tag is <WORD>, where WORD is the predefined text that is recognized by the Internet browsers. This type of tag is called open tag. There is another type of tag that is known as a close tag. The structure of a close tag is as </WORD>. You just have to put a forward slash after the less than sign. After this section, you will be able to make your own web page with a few texts using HTML. The structure of an HTML page is similar to the following image: This image has eight tags. Let me introduce all these tags with their activities: 1:This is the <html> tag, which is an open tag and it closes at line 15 with the </html> tag. These tags tell your Internet browser that all the texts and scripts in these two tags are HTML documents. 2:This is the <head> tag, which is an open tag and closes at line 7 with the </head> tag. These tags contain the title, script, style, and metadata of a web page. 3:This is the <title> tag, which closes at line 6 with the </title> tag. This tag contains the title of a webpage. The previous image had the title Google. To see this on the web browser you need to type like that: <title> Google </title> 4:This is the close tag of <title> tag 5:This is the closing tag of <head> tag 6:This is the <body> tag, closes at line 13 with tag </body> Whatever you can see on a web page is written between these two tags. Every element, image, link, and so on are formatted here. 
To see This is a web page on your browser, you need to type similar to the following:     <body> This is a web page </body> 7:The </body> tag closes here. 8:The </html> tag is closed here. Your first webpage You have just learned the eight basic tags of an HTML page. You can now make your own web page. How? Why not you try with me? Open your text editor. Press Ctrl + N, which will open a new untitled file as shown in the following image: Type the following HTML codes on the blank page: <html>     <head>       <title>         My Webpage!       </title>     </head>     <body>       This is my webpage :)     </body>   </html> Then, press Ctrl + Shift + S that will tell you to save your code somewhere on your computer, as follows: Type a suitable name on the File Name: field. I would like to name my HTML file webpage, therefore, I typed webpage.html. You may be thinking why I added an .html extension. As this is an HTML document, you need to add .html or .htm after the name that you give to your webpage. Press the Save button.This will create an HTML document on your computer. Go to the directory, where you saved your HTML file. Remember that you can give your web page any name. However, this name will not be visible on your browser. It is not the title of your webpage. It is a good practice not to keep a blank space on your web page's name. Consider that you want to name your HTML file This is my first webpage.html. Your computer will face no trouble showing the result on the Internet browsers; however, when your website will be on a server, this name may face a problem. Therefore, I suggest you to keep an underscore (_) where you need to add a space similar to the following: This_is_my_first_webpage.html. You will find a file similar to the following image: Now, double-click on the file. You will see your first web page on your Internet browser! You typed My Webpage! between the <title> and </title> tags, which is why your browser is showing this in the first selection box, 1. Also, you typed This is my webpage :) between the <body> and </body> tags. Therefore, you can see the text on your browser in the second selection box, 2. Congratulations! You created your first web page! You can edit your codes and other texts of the webpage.html file by right-clicking on the file and selecting Open with Atom. You must save (Ctrl + S) your codes and text before reopening the file with your browser. Implementing Canvas To add canvas on your HTML page, you need to define the height and width of your canvas in the <canvas> and </canvas> tags as shown in the following: <html>   <head>     <title>Canvas</title>   </head>   <body>   <canvas id="canvasTest" width="200" height="100"     style="border:2px solid #000;">       </canvas>   </body> </html> We have defined canvas id as canvasTest, which will be used to play with the canvas. We used the inline CSS on our canvas. A 2 pixel solid border is used to have a better view of the canvas. Adding JavaScript Now, we are going to add few lines of JavaScript to our canvas. We need to add our JavaScript just after the <canvas>…</canvas> tags in the <script> and </script> tags. 
Drawing a rectangle To test our canvas, let's draw a rectangle in the canvas by typing the following codes: <script type="text/javascript">   var canvas = document.getElementById("canvasTest"); //called our     canvas by id   var canvasElement = canvas.getContext("2d"); // made our canvas     2D   canvasElement.fillStyle = "black"; //Filled the canvas black   canvasElement.fillRect(10, 10, 50, 50); //created a rectangle </script> In the script, we declared two JavaScript variables. The canvas variable is used to hold the content of our canvas using the canvas ID, which we used on our <canvas>…</canvas> tags. The canvasElement variable is used to hold the context of the canvas. We made fillStyle black so that the rectangle that we want to draw becomes black when filled. We used canvasElement.fillRect(x, y, w, h); for the shape of the rectangle. Where x is the distance of the rectangle from the x axis, y is the distance of the rectangle from the y axis. The w and h parameters are the width and height of the rectangle, respectively. The full code is similar to the following: <html>   <head>     <title>Canvas</title>   </head>   <body>     <canvas id="canvasTest" width="200" height="100"       style="border:2px solid #000;">     </canvas>     <script type="text/javascript">       var canvas = document.getElementById("canvasTest"); //called         our canvas by id       var canvasElement = canvas.getContext("2d"); // made our         canvas 2D       canvasElement.fillStyle = "black"; //Filled the canvas black       canvasElement.fillRect(10, 10, 50, 50); //created a         rectangle     </script>   </body> </html> The output of the code is as follows: Drawing a line To draw a line in the canvas, you need to insert the following code in your <script> and </script> tags: <script type="text/javascript">   var c = document.getElementById("canvasTest");   var canvasElement = c.getContext("2d");   canvasElement.moveTo(0,0);   canvasElement.lineTo(100,100);   canvasElement.stroke(); </script> Here, canvasElement.moveTo(0,0); is used to make our line start from the (0,0) co-ordinate of our canvas. The canvasElement.lineTo(100,100); statement is used to make the line diagonal. The canvasElement.stroke(); statement is used to make the line visible. Here is the output of the code: A quick exercise Draw a line using canvas and JavaScript that will be parallel to the y axis of the canvas Draw a rectangle having 300 px height and 200 px width and draw a line on the same canvas touching the rectangle. Assignment operators An assignment operator assigns a value to an operator. I believe you that already know about assignment operators, don't you? Well, you used an equal sign (=) between a variable and its value. By doing this, you assigned the value to the variable. Let's see the following example: Var name = "Sherlock Holmes" The Sherlock Holmes string is assigned to the name variable. You already learned about the increment and decrement operators. Can you tell me what will be the output of the following codes: var x = 3; x *= 2; document.write(x); The output will be 6. Do you remember why this happened? The x *= 2; statement is similar to x = x * 2; as x is equal to 3 and later multiplied by 2. The final number (3 x 2 = 6) is assigned to the same x variable. 
That's why we got the output as shown in the following screenshot: Let's perform following exercise: What is the output of the following codes: var w = 32; var x = 12; var y = 9; var z = 5; w++; w--; x*2; y = x; y--; z%2; document.write("w = "+w+", x = "+x+", y = "+y+", z = "+z); The output that we get is w = 32, x = 12, y = 11, z = 5. JavaScript comparison and logical operators If you want to do something logical and compare two numbers or variables in JavaScript, you need to use a few logical operators. The following are a few examples of the comparison operators: Operator Description == Equal to != Not equal to > Greater than < Less than => Equal to or greater than <= Less than or equal to The example of each operator is shown in the following screenshot: According to mozilla.org, "Object-oriented programming (OOP) is a programming paradigm that uses abstraction to create models based on the real world. OOP uses several techniques from previously established paradigms, including modularity, polymorphism, and encapsulation." Nicholas C. Zakas states that "OOP languages typically are identified through their use of classes to create multiple objects that have the same properties and methods." You probably have assumed that JavaScript is an object-oriented programming language. Yes, you are absolutely right. Let's see why it is an OOP language. We call a computer programming language object oriented, if it has the following few features: Inheritance Polymorphism Encapsulation Abstraction Before going any further, let's discuss objects. We create objects in JavaScript in the following manner. var person = new Object(); person.name = "Harry Potter"; person.age = 22; person.job = "Magician"; We created an object for person. We added a few properties of person. If we want to access any property of the object, we need to call the property. Consider that you want to have a pop up of the name property of the preceding person object. You can do this with the following method: person.callName = function(){  alert(this.name); }; We can write the preceding code as the following: var person = {   name: "Harry Potter",   age: 22,   job: "Magician",   callName: function(){     alert(this.name);   } }; Inheritance in JavaScript Inherit means derive something (characteristics, quality, and so on) from one's parents or ancestors. In programming languages, when a class or an object is based on another class or object in order to maintain the same behavior of mother class or object is known as inheritance. We can also say that this is a concept of gaining properties or behaviors of something else. Suppose, X inherits something from Y, it is like X is a type of Y. JavaScript occupies the inheritance capability. Let's see an example. A bird inherits from animal as a bird is a type of animal. Therefore, a bird can do the same thing as an animal. This kind of relationship in JavaScript is a little complex and needs a syntax. We need to use a special object called prototype, which assigns the properties to a type. We need to remember that only function has prototypes. Our Animal function should look similar to the following: function Animal(){ //We can code here. }; To add few properties to the function, we need to add a prototype, as shown in the following: Animal.prototype.eat = function(){ alert("Animal can eat."); }; Summary In this article, you have learned how to write HTML code and implement JavaScript code with the HTML file and HTML canvas. You have learned a few arithmetic operations with JavaScript. 
The sections in this article come from different chapters of the book, so the flow may not feel entirely continuous. I hope you read the original book and practice the code discussed there.

Resources for Article:

Further resources on this subject: Walking You Through Classes [article] JavaScript Execution with Selenium [article] Object-Oriented JavaScript with Backbone Classes [article]
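To round out the inheritance discussion above with a concrete example: the article describes a bird being a type of animal but stops at the Animal prototype. A minimal sketch of how a hypothetical Bird could inherit from Animal (the Bird function and its fly method are illustrative additions, not from the original text) might look like this:

    function Animal() {
      // We can code here.
    }
    Animal.prototype.eat = function () {
      alert("Animal can eat.");
    };

    // Hypothetical subtype, added for illustration.
    function Bird() {
    }
    Bird.prototype = Object.create(Animal.prototype); // Bird inherits Animal's prototype
    Bird.prototype.constructor = Bird;
    Bird.prototype.fly = function () {
      alert("Bird can fly.");
    };

    var sparrow = new Bird();
    sparrow.eat(); // alerts "Animal can eat." (inherited from Animal)
    sparrow.fly(); // alerts "Bird can fly."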


Cython Won't Bite

Packt
10 Jan 2016
9 min read
In this article by Philip Herron, the author of the book Learning Cython Programming - Second Edition, we see how Cython is much more than just a programminglanguage. Its origin can be traced to Sage, the mathematics software package, where it was used to increase the performance of mathematical computations, such as those involving matrices. More generally, I tend to consider Cython as an alternative to Swig to generate really good python bindings to native code. Language bindings have been around for years and Swig was one of the first and best tools to generate bindings for multitudes of languages. Cython generates bindings for Python code only, and this single purpose approach means it generates the best Python bindings you can get outside of doing it all manually; attempt the latter only if you're a Python core developer. For me, taking control of legacy software by generating language bindings is a great way to reuse any software package. Consider a legacy application written in C/C++; adding advanced modern features like a web server for a dashboard or message bus is not a trivial thing to do. More importantly, Python comes with thousands of packages that have been developed, tested, and used by people for a long time, and can do exactly that. Wouldn't it be great to take advantage of all of this code? With Cython, we can do exactly this, and I will demonstrate approaches with plenty of example codes along the way. This article will be dedicated to the core concepts on using Cython, including compilation, and will provide a solid reference and introduction for all to Cython core concepts. In this article, we will cover: Installing Cython Getting started - Hello World Using distutils with Cython Calling C functions from Python Type conversion (For more resources related to this topic, see here.) Installing Cython Since Cython is a programming language, we must install its respective compiler, which just so happens to be so aptly named Cython. There are many different ways to install Cython. The preferred one would be to use pip: $ pip install Cython This should work on both Linux and Mac. Alternatively, you can use your Linux distribution's package manager to install Cython: $ yum install cython # will work on Fedora and Centos $ apt-get install cython # will work on Debian based systems In Windows, although there are a plethora of options available, following this Wiki is the safest option to stay up to date: http://wiki.cython.org/InstallingOnWindows Emacs mode There is an emacs mode available for Cython. Although the syntax is nearly the same as Python, there are differences that conflict in simply using Python mode. You can choose to grab the cython-mode.el from the Cython source code (inside the Tools directory.) The preferred way of installing packages to emacs would be to use a package repository such as MELPA(). To add the package repository to emacs, open your ~/.emacs configuration file and add the following code: (when (>= emacs-major-version 24) (require 'package) (add-to-list 'package-archives '("melpa" . 
"http://melpa.org/packages/") t) (package-initialize)) Once you add this and reload your configuration to install the cython mode, you can simply run the following: 'M-x package-install RET cython-mode' Once this is installed, you can activate the mode by adding this into your emacs config file: (require 'cython-mode) You can always activate the mode manually at any time with the following: 'M-x cython-mode RET' Getting the code examples Throughout this book, I intend to show real examples that are easy to digest to help you get a feel of the different things you can achieve with Cython. To access and download the code used, please clone the following repository: $ git clone git://github.com/redbrain/cython-book.git Getting started – Hello World As you will see when running the Hello World program, Cython generates native python modules. Therefore, while running any Cython code, you will reference it via a module import in Python. Let's build the module: $ cd cython-book/chapter1/helloworld $ make You should have now created helloworld.so! This is a Cython module of the same name of the Cython source code file. While in the same directory of the shared object module, you can invoke this code by running a respective Python import: $ python Python 2.7.3 (default, Aug 1 2012, 05:16:07) [GCC 4.6.3] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import helloworld Hello World from cython! As you can see from opening helloworld.pyx, it looks just like a normal Python Hello World application; but as previously stated, Cython generates modules. These modules need a name so that it can be correctly imported by the python runtime. The Cython compiler simply uses the name of the source code file. It then requires us to compile this to the same shared object name. Overall, Cython source code files have the .pyx,.pxd, and .pxi extensions. For now, all we care about are the .pyx files; the others are for cimports and includes respectively within a .pyx module file. The following screenshot depicts the compilation flow required to have a callable native python module: I wrote a basic makefile so that you can simply run make to compile these examples. Here's the code to do this manually: $ cython helloworld.pyx $ gcc/clang -g -O2 -fpic `python-config --cflags` -c helloworld.c -o helloworld.o $ gcc/clang -shared -o helloworld.so helloworld.o `python-config –libs Using DistUtils with Cython You can compile this using Python distutils and cythonize. Open setup.py: from distutils.core import setup from Cython.Build import cythonize setup( ext_modules = cythonize("helloworld.pyx") ) Using the cythonize function as part of the ext_modules section will build any specified Cython source into an installable Python module. This will compile helloworld.pyx into the same shared library. This provides the Python practice to distribute native modules as part of distutils. Calling C functions from Python We should be careful when talking about Python and Cython for clarity, since the syntax is so similar. Let's wrap a simple AddFunction in C and make it callable from Python. Firstly, open a file called AddFunction.c, and write a simple function into it: #include <stdio.h> int AddFunction(int a, int b) { printf("look we are within your c code!n"); return a + b; } This is the C code we will call, which is just a simple function to add two integers. Now, let's get Python to call it. 
Open a file called AddFunction.h, wherein we will declare our prototype: #ifndef __ADDFUNCTION_H__ #define __ADDFUNCTION_H__ extern int AddFunction (int, int); #endif //__ADDFUNCTION_H__ We need this so that Cython can see the prototype for the function we want to call. In practice, you will already have your headers in your own project with your prototypes and declarations already available. Open a file called AddFunction.pyx, and insert the following code in to it: cdef extern from "AddFunction.h": cdef int AddFunction(int, int) Here, we have to declare what code we want to call. The cdef is a keyword signifying that this is from the C code that will be linked in. Now, we need a Python entry point: def Add(a, b): return AddFunction(a, b) This Add is a Python callable inside a PyAddFunction module. Again, I have provided a handy makefile to produce the module:   $ cd cython-book/chapter1/ownmodule $ make cython -2 PyAddFunction.pyx gcc -g -O2 -fpic -c PyAddFunction.c -o PyAddFunction.o `python-config --includes` gcc -g -O2 -fpic -c AddFunction.c -o AddFunction.o gcc -g -O2 -shared -o PyAddFunction.so AddFunction.o PyAddFunc-tion.o `python-config --libs` Notice that AddFunction.c is compiled into the same PyAddFunction.so shared object. Now, let's call this AddFunction and check to see if C can add numbers correctly: $ python >>> from PyAddFunction import Add >>> Add(1,2) look we are within your c code!! 3 Notice the print statement inside AddFunction.c::AddFunction and that the final result is printed correctly. Therefore, we know the control hit the C code and did the calculation in C and not inside the Python runtime. This is a revelation to what is possible. Python can be cited to be slow in some circumstances. Using this technique, it makes it possible for Python code to bypass its own runtime and to run in an unsafe context, which is unrestricted by the Python runtime, which is much faster. Type conversion Notice that we had to declare a prototype inside the cython source code PyAddFunction.pyx: cdef extern from "AddFunction.h": cdef int AddFunction(int, int) It let the compiler know that there is a function called AddFunction, and that it takes two int's and returns an int. This is all the information the compiler needs to know besides the host and target operating system's calling convention in order to call this function safely. Then, we created the Python entry point, which is a python callable that takes two parameters: def Add(a, b): return AddFunction(a, b) Inside this entry point, it simply returned the native AddFunction and passed the two Python objects as parameters. This is what makes Cython so powerful. Here, the Cython compiler must inspect the function call and generate code to safely try and convert these Python objects to native C integers. This becomes difficult when precision is taken into account, and potential overflow, which just so happens to be a major use case since it handles everything so well. Also, remember that this function returns an integer and Cython also generates code to convert the integer return into a valid Python object. Summary Overall, we installed the Cython compiler, ran the Hello World example, and took into consideration that we need to compile all code into native shared objects. We also saw how to wrap native C code to be callable from Python, and how to do type conversion of parameters and return to C code and back to Python. 
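One small gap in the excerpt above: the helloworld.pyx listing itself never appears, only its build and import. Assuming it matches the output shown earlier ("Hello World from cython!"), a minimal version would be nothing more than a module-level print, which Cython compiles into the helloworld.so module:

    # helloworld.pyx -- minimal sketch, assumed from the import output shown above
    print("Hello World from cython!")

Building it either with the provided makefile or with the distutils setup.py shown earlier (for example, python setup.py build_ext --inplace) produces the shared object that Python then imports as helloworld.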
Resources for Article: Further resources on this subject: Monte Carlo Simulation and Options [article] Understanding Cython [article] Scaling your Application Across Nodes with Spring Python's Remoting [article]


Planning Your CRM Implementation

Packt
08 Jan 2016
13 min read
In this article by Joseph Murray and Brian Shaughnessy, authors of the book Using CiviCRM, Second Edition, you will learn to plan your implementation of CiviCRM so that your project has the best chance to achieve greater organization success. In this article, we will do the following: Identify potential barriers to success and learn how to overcome them Select an appropriate development methodology (For more resources related to this topic, see here.) Barriers to success Constituent Relationship Management (CRM) initiatives can be difficult. They require change, often impacting processes and workflows that are at the heart of your staff's daily responsibilities. They force you to rethink how you are managing your existing data, and decide what new data you want to begin collecting. They may require you to restructure external relationships, even as you rebuild internal processes and tools. Externally, the experience of your constituents, as they interact with your organization, may need to change so that they provide more value and fewer barriers to involvement. Internally, business processes and supporting technological systems may need to change in order to break down departmental operations' silos, increase efficiencies, and enable more effective targeting, improved responsiveness, and new initiatives. The success of the CRM project often depends on changing the behavior and attitude of individuals across the organization and replacing, changing, and/or integrating many IT systems used across the organization. To realize success as you manage the potential culture changes, you may need to ask staff members to take on tasks and responsibilities that do not directly benefit them or the department managed by their supervisor, but provides value to the organization by what it enables others in the organization to accomplish. As a result, it is often very challenging to align the interests of the staff and organizational units with the organization's broader interest in seeking improved constituent relations, as promised by the CRM strategy. This is why an executive sponsor, such as the executive director is so important. They must help staff see beyond their immediate scope of responsibility and buy-in to the larger goals of the organization. On the technical side, CRM projects for mid- and large-sized organizations typically involve replacing or integrating other systems. Configuring and customizing a new software system, migrating data into it, testing and deploying it, and training the staff members to use it, can be a challenge at the best of times. Doing it for multiple systems and more users multiplies the challenge. Since a CRM initiative generally involves integrating separate systems, you must be prepared to face the potential complexity of working with disparate data schemas requiring transformations and cleanup for interoperability, and keeping middleware in sync with changes in multiple independent software packages. Unfortunately, these challenges to the CRM implementation initiative may lead to a project failure if they are not identified and addressed. Common causes for failure may include: Lack of executive-level sponsorship resulting in improperly resolved turf wars. IT-led initiatives have a greater tendency to focus on cost efficiency. This focus will generally not result in better constituent relations that are oriented toward achieving the organization's mission. 
An IT approach, particularly where users and usability experts are not on the project team, may also lead to poor user adoption if the system is not adapted to their needs, or even if the users are poorly trained. No customer data integration approach resulting in not overcoming the data silos problem, no consolidated view of constituents, poorer targeting, and an inability to realize enterprise-level benefits. Lack of buy-in, leading to a lack of use of the new CRM system and continued use of the old processes and systems it was meant to supplant. Lack of training and follow-up causing staff anxiety and opposition. This may cause non-use or misuse of the system, resulting in poor data handling and mix-ups in the way in which constituents are treated. Not customizing enough to actually meet the requirements of the organization in the areas of: Data integration Business processes User experiences Over-customizing, causes: The costs of the system to escalate, especially through future upgrades The best practices incorporated in the base functionality to be overridden in some cases The user forms to become overly complex The user experiences to become off-putting No strategy for dealing with the technical challenges associated with developing, extending, and/or integrating the CRM software system, leads to: Cost overruns Poorly designed and built software Poor user experiences Incomplete or invalid data This does not mean that project failure is inevitable or common. These clearly identifiable causes of failure can be overcome through effective project planning. Perfection is the enemy of the good CRM systems and their functional components such as fundraising, ticket sales, communication, event registration, membership management, and case management are essential for the core operations of most non profit. Since they are so central to the daily operations of the organization, there is often legitimate fear of project failure when considering changes. This fear can easily create a perfectionist mentality, where the project team attempts to overcompensate by creating too much oversight, contingency planning, and project discovery time in an effort to avoid missing any potentially useful feature that could be integrated into the project. While planning is good, perfection may not be good, as perfection is often the enemy of the good. CRM implementations often risk erring on the side of what is known as the MIT approach (tongue-in-cheek). The MIT approach believes in, and attempts to design, construct, and deploy, the right thing right from the start. Its big-brain approach to problem solving leads to correctness, completeness, and consistency in the design. It values simplicity in the user interface over simplicity in the implementation design. The other end of the spectrum is captured with aphorisms like "Less is More," "KISS" (Keep it simple, stupid), and "Worse is Better." This alternate view willingly accepts deviations from correctness, completeness, and consistency in design in favor of general simplicity or simplicity of implementation over simplicity of user interface. The reason that such counter-intuitive approaches to developing solutions have become respected and popular is the problems and failures that can result from trying to do it all perfectly from the start. Neither end of the spectrum is healthier. Handcuffing the project to an unattainable standard of perfection or over simplifying in order to artificially reduce complexity will both lead to project failure. 
Value and success are generally found somewhere in between. As a project manager, it will be your responsibility to set the tone, determine priorities, and plan the implementation and development process. One rule that may help achieve balance and move the project forward is, Release early, release often. This is commonly embraced in the open source community where collaboration is essential to success. This motto: Captures the intent of catching errors earlier Allows users to realize value from the system sooner Allows users to better imagine and articulate what the software should do through ongoing use and interaction with a working system early in the process Creates a healthy cyclical process where end users are providing rapid feedback into the development process, and where those ideas are considered, implemented, and released on a regular basis There is no perfect antidote to these two extremes—only an awareness of the tendency for projects (and stakeholders) to lean in one of the two directions—and the realization the both extremes should be avoided. Development methodologies Whatever approach your organization decides to take for developing and implementing its CRM strategy, it's usually good to have an agreed upon process and methodology. Your processes define the steps to be taken as you implement the project. Your methodology defines the rules for the process, that is, the methods to be used throughout the course of the project. The spirit of this problem-solving approach can be seen in the Traditional Waterfall Development model and in the contrasting Iterative and Incremental Development model. Projects naturally change and evolve over time. You may find that you embrace one of these methodologies for initial implementation and then migrate to a different method or mixed method for maintenance and future development work. By no means should you feel restricted by the definitions provided, but rather adjust the principles to meet your changing needs throughout the course of the project. That being said, it's important that your team understands the project rules at a given point in time so that the project management principles are respected. The conventional Waterfall Development methodology The traditional Waterfall method of software development is sometimes thought of as "big design upfront." It employs a sequential approach to development, moving from needs analysis and requirements, to architectural and user experience, detailed design, implementation, integration, testing, deployment, and maintenance. The output of each step or phase flows downward, such as water, to the next step in the process, as illustrated by the arrows in the following figure: The Waterfall model tends to be more formal, more planned, includes more documentation, and often has a stronger division of labor. This methodology benefits from having clear, linear, and progressive development steps in which each phase builds upon the previous one. However, it can suffer from inflexibility if used too rigidly. For example, if during the verification and quality assurance phase you realize a significant functionality gap resulting from incorrect (or changing) specification requirements, then it may be difficult to interject those new needs into the process. The "release early, release often" iterative principle mentioned earlier can help overcome that inflexibility. 
If the overall process is kept tight and the development window short, you can justify delaying the new functionality or corrective specifications for the next release. If you do not embrace the "release early, release often" principle—either because it is not supported by the project team or the scope of the project does not permit it—you should still anticipate the need for flexibility and build it into your methodology. The overriding principle is to define the rules of the game early, so your project team knows what options are available at each stage of development. Iterative development methodology Iterative development models depart from this structure by breaking the work up into chunks that can be developed and delivered separately. The Waterfall process is used in each phase or segment of the project, but the overall project structure is not necessarily held to the same rigid process. As one moves farther away from the Waterfall approach, there is a greater emphasis on evaluating incrementally delivered pieces of the solution and incorporating feedback on what has already been developed into the planning of future work, as illustrated in the loop in the following figure: This methodology seeks to take what is good in the traditional Waterfall approach—structure, clearly defined linear steps, a strong development/quality assurance/roll out process—and improve it through shorter development cycles that are centered on smaller segments of the overall project. Perhaps, the biggest challenge in this model is the project management role, as it may result in many moving pieces that must be tightly coordinated in order to release the final working product. An iterative development method can also feel a little like a dog chasing its tail—you keep getting work done, but never feel like you're getting anywhere. It may be necessary to limit how many iterations take place before you pause the process, take a step back, and consider where you are in the project's bigger picture—before you set new goals and begin the cyclical process again. Agile development methodology Agile development methodologies are an effective derivative of the iterative development model that moves one step further away from the Waterfall model. They are characterized by requirements and solutions evolving together, requiring work teams to be drawn from all the relevant parts of the organization. They organize themselves to work in rapid 1 to 4 week iteration cycles. Agile centers on time-based release cycles, and in this way, it differs from the other methodologies discussed, which are oriented more toward functionality-based releases. The following figure illustrates the implementation of an Agile methodology that highlights short daily Scrum status meetings, a product backlog containing features or user stories for each iteration, and a sprint backlog containing revisable and re-prioritizable work items for the team during the current iteration. A deliberate effort is usually made in order to ensure that the sprint backlog is long enough to ensure that the lowest priority items will not be dealt with before the end of the iteration. Although they can be put onto the list of work items that may or may not be selected for the next iteration, the idea is that the client or the product owner should, at some point, decide that it's not worth investing more resources in the "nice to have, but not really necessary" items. 
But having those low-priority backlog items are equally as important for maximizing developer efficiency. If developers are able to work through the higher priority issues faster than originally expected, the backlog items give them a chance to chip away at the "nice to have" features while keeping within the time-based release cycle (and your budget constraints). As one might expect, this methodology relies heavily on effective prioritization. Since software releases and development cycles adhere to rigid timeframes, only high-priority issues or features are actively addressed at a given point in time; the remaining incomplete issues falling lower on the list are subject to reassignment for the next cycle. While an Agile development model may seem attractive (and rightly so), there are a few things to realize before embracing it: The process of reviewing, prioritizing, and managing the issue list does take time and effort. Each cycle will require the team to evaluate the status of issues and reprioritize for the next release. Central to the success of this model is a rigid allegiance to time-based releases and a ruthless determination to prevent feature creep. Yes – you will have releases that are delayed a week or see must-have features or bug fixes that enter the issue queue late in the process. However, the exceptions must not become the rule, or you lose your agility. Summary This article lists common barriers to the success of CRM initiatives that arise because of people issues or technical issues. These include problems getting systems and tools supporting disparate business functions to provide integrated functionality. We advocate a pragmatic approach to implementing a CRM strategy for your organization. We encourage the adoption of a change in approach and associated processes and methodologies that works for your organization: wide ranges in the level of structure, formality, and planning all work in different organizations. Resources for Article: Further resources on this subject: Getting Dynamics CRM 2015 Data into Power BI [article] Customization in Microsoft Dynamics CRM [article] Getting Started with Microsoft Dynamics CRM 2013 Marketing [article]

Distributed resource scheduler

Packt
07 Jan 2016
14 min read
In this article, written by Christian Stankowic, author of the book vSphere High Performance Essentials, we look at the Distributed Resource Scheduler. In cluster setups, Distributed Resource Scheduler (DRS) can assist you with automatically balancing CPU load, and Storage DRS does the same for storage load. DRS monitors the ESXi hosts in a cluster and migrates the running VMs using vMotion, primarily to ensure that all the VMs get the resources they need; secondarily, it tries to balance the cluster. In addition to this, Storage DRS monitors the shared storage for information about latency and capacity consumption. If Storage DRS recognizes the potential to optimize storage resources, it will make use of Storage vMotion to balance the load. We will cover Storage DRS in detail later. (For more resources related to this topic, see here.)
Working of DRS
DRS primarily uses two metrics to determine the cluster balance:
Active host CPU: It includes the usage (CPU task time in ms) and ready (wait time in ms per VM to get scheduled on physical cores) metrics.
Active host memory: It describes the number of memory pages that are predicted to have changed in the last 20 seconds. A math-sampling algorithm calculates this amount; however, it is quite inaccurate.
Active host memory is often used for resource capacity purposes. Be careful when using this value as an indicator, as it only describes how aggressively a workload changes its memory. Depending on your application architecture, it may not measure how much memory a particular VM really needs. Think about applications that allocate a lot of memory for the purpose of caching; using the active host memory metric for capacity purposes might lead to inappropriate settings.
The migration threshold controls DRS's aggressiveness and defines how much a cluster can be imbalanced. Refer to the following DRS levels for a detailed explanation:
Most conservative (priority 1): Only affinity/anti-affinity constraints are applied.
More conservative (priorities 1–2): This will also apply recommendations addressing significant improvements.
Balanced, the default (priorities 1–3): Recommendations that, at least, promise good improvements are applied.
More aggressive (priorities 1–4): DRS also applies recommendations that only promise a moderate improvement.
Most aggressive (priorities 1–5): Recommendations addressing even smaller improvements are applied.
Apart from the migration threshold, two other metrics are calculated: Target Host Load Standard Deviation (THLSD) and Current Host Load Standard Deviation (CHLSD). THLSD defines how much a cluster node's load can differ from the others for the cluster to still be considered balanced. The migration threshold and the particular ESXi host's active CPU and memory values heavily influence this metric. CHLSD measures whether the cluster is currently balanced: if this value differs from the THLSD, the cluster is imbalanced and DRS will calculate recommendations in order to balance it. In addition to this, DRS also calculates the vMotion overhead that is needed for a migration. If a migration's overhead is deemed higher than its benefit, vMotion will not be executed. DRS also evaluates the migration recommendations multiple times in order to avoid ping-pong migrations.
By default, once enabled, DRS is polled every five minutes (300 seconds). Depending on your landscape, it might be required to change this behavior. To do so, you need to alter the vpxd.cfg configuration file on the vCenter Server machine.
Search for the following lines and alter the polling period (in seconds):

<config>
  <drm>
    <pollPeriodSec>
      300
    </pollPeriodSec>
  </drm>
</config>

Refer to the following locations for the configuration file, depending on your vCenter implementation:
vCenter Server Appliance: /etc/vmware-vpx/vpxd.cfg
vCenter Server (Windows): C:\ProgramData\VMware\VMware VirtualCenter\vpxd.cfg
Check-list – performance tuning
There are a couple of things to be considered when optimizing DRS for high-performance setups:
Make sure to use hosts with a homogenous CPU and memory configuration. Having different nodes will make DRS less effective.
Use at least a 1 Gbps network connection for vMotion. For better performance, it is recommended to use 10 Gbps instead.
For virtual machines, it is a common procedure not to oversize them. Only configure as much CPU and memory resources as needed. Migrating workloads with unneeded resources takes more time.
Make sure not to exceed the ESXi host and cluster limits that are mentioned in the VMware vSphere Configuration Maximums document. For vSphere 5.5, refer to https://www.vmware.com/pdf/vsphere5/r55/vsphere-55-configuration-maximums.pdf. For vSphere 6.0, refer to https://www.vmware.com/pdf/vsphere6/r60/vsphere-60-configuration-maximums.pdf.
Configure DRS
To configure DRS for your cluster, proceed with the following steps:
Select your cluster from the inventory tab and click Manage and Settings.
Under Services, select vSphere DRS. Click Edit.
Select whether DRS should act in the Partially Automated or Fully Automated mode. In partially automated mode, DRS will place VMs on appropriate hosts once they are powered on; however, it will not migrate the running workloads. In fully automated mode, DRS will also migrate the running workloads in order to balance the cluster load. The Manual mode only gives you recommendations, and the administrator can select the recommendations to apply. To create resource pools at the cluster level, you will need to have at least the manual mode enabled.
Select the DRS aggressiveness. Refer to the preceding list of DRS levels for a short explanation. Using more aggressive DRS levels is only recommended when having homogenous CPU and memory setups!
When creating VMware support calls regarding DRS issues, a DRS dump file called drmdump is important. This file contains various metrics that DRS uses to calculate the possible migration benefits. On the vCenter Server Appliance, this file is located in /var/log/vmware/vpx/drmdump/clusterName. On the Windows variant, the file is located in %ALLUSERSPROFILE%\VMware\VMware VirtualCenter\Logs\drmdump\clusterName.
VMware also offers an online tool called VM Resource and Availability Service (http://hasimulator.vmware.com), which tells you which VMs can be restarted during ESXi host failures. It requires you to upload this metric file in order to give you the results. This can be helpful when simulating failure scenarios.
Enhanced vMotion capability
Enhanced vMotion Compatibility (EVC) enables your cluster to migrate workloads between ESXi hosts with different processor generations. Unfortunately, it is not possible to migrate workloads between Intel-based and AMD-based servers; EVC only enables migrations between different Intel or different AMD CPU generations. Once enabled, all the ESXi hosts are configured to provide the same set of CPU functions.
In other words, the functions of newer CPU generations are disabled to match those of the older ESXi hosts in the cluster in order to create a common baseline. Configuring EVC To enable EVC, perform the following steps: Select the affected cluster from the inventory tab. Click on Manage, Settings, VMware EVC, and Edit. Choose Enable EVC for AMD Hosts or Enable EVC for Intel Hosts. Select the appropriate CPU generation for the cluster (the oldest). Make sure that Compatibility acknowledges your configuration. Save the changes, as follows: As mixing older hosts in high-performance clusters is not recommended, you should also avoid using EVC. To sum it up, keep the following steps in mind when planning the use of DRS: Enable DRS if you plan to have automatic load balancing; this is highly recommended for high-performance setups. Adjust the DRS aggressiveness level to match your requirements. Too aggressive migration thresholds may result in too many migrations, therefore, play with this setting to find the best for you. Make sure to have a separated vMotion network. Using the same logical network components as for the VM traffic is not recommended and might result in poor workload performance. Don't overload ESXi hosts to spare some CPU resources for vMotion processes in order to avoid performance bottlenecks during migrations. In high-performance setups, mixing various CPU and memory configurations is not recommended to achieve better performance. Try not to use EVC. Also, keep license constraints in mind when configuring DRS. Some software products might require additional licenses if it runs on multiple servers. We will focus on this later. Affinity and anti-affinity rules Sometimes, it is necessary to separate workloads or stick them together. To name some examples, think about the classical multi-tier applications such as the following: Frontend layer Database layer Backend layer One possibility would be to separate the particular VMs on multiple ESXi hosts to increase resilience. If a single ESXi host that is serving all the workloads crashes, all application components are affected by this fault. Moving all the participating application VMs to one single ESXi can result in higher performance as network traffic does not need to leave the ESXi host. However, there are more use cases to create affinity and anti-affinity rules, as shown in the following: Diving into production, development, and test workloads. For example, it would possible to separate production from the development and test workloads. This is a common procedure that many application vendors require. Licensing reasons (for example, license bound to the USB dongle, per core licensing, software assurance denying vMotion, and so on.) Application interoperability incompatibility (for example, applications need to run on separated hosts). As VMware vSphere has no knowledge about the license conditions of the workloads running virtualized, it is very important to check your software vendor's license agreements. You, as a virtual infrastructure administrator, are responsible to ensure that your software is fully licensed. Some software vendors require special licenses when running virtualized/on multiple hosts. There are two kinds of affinity/anti-affinity rules: VM-Host (relationship between VMs and ESXi hosts) and VM-VM (intra-relationship between particular VMs). Each rule consists of at least one VM and host DRS group. These groups also contain at least one entry. 
Every rule has a designation, where the administrator can choose between must or should. Implementing a rule with the should designation results in a preference on hosts satisfying all the configured rules. If no applicable host is found, the VM is put on another host in order to ensure at least the workload is running. If the must designation is selected, a VM is only running on hosts that are satisfying the configured rules. If no applicable host is found, the VM cannot be moved or started. This configuration approach is strict and requires excessive testing in order to avoid unplanned effects. DRS rules are rather combined than ranked. Therefore, if multiple rules are defined for a particular VM/host or VM/VM combination, the power-on process is only granted if all the rules apply to the requested action. If two rules are conflicting for a particular VM/host or VM/VM combination, the first rule is chosen and the other rule is automatically disabled. Especially, the use of the must rules should be evaluated very carefully as HA might not restart some workloads if these rules cannot be followed in case of a host crash. Configuring affinity/anti-affinity rules In this example, we will have a look at two use cases that affinity/anti-affinity rules can apply to. Example 1: VM-VM relationship This example consists of two VMs serving a two-tier application: db001 (database VM) and web001 (frontend VM). It is advisable to have both VMs running on the same physical host in order to reduce networking hops to connect the frontend server to its database. To configure the VM-VM affinity rule, proceed with the following steps: Select your cluster from the inventory tab and click Manage and VM/Host Rule underneath Configuration. Click Add. Enter a readable rule name (for example, db001-web001-bundle) and select Enable rule. Select the Keep Virtual Machines Together type and select the affected VMs. Click OK to save the rule, as shown in the following: When migrating one of the virtual machines using vMotion, the other VM will also migrate. Example 2: VM-Host relationship In this example, a VM (vcsa) is pinned to a particular ESXi host of a two-node cluster designated for production workloads. To configure the VM-Host affinity rule, proceed with the following steps: Select your cluster from the inventory tab and click Manage and VM/Host Groups underneath Configuration. Click Add. Enter a group name for the VM; make sure to select the VM Group type. Also, click Add to add the affected VM. Click Add once again. Enter a group name for the ESXi host; make sure to select the Host Group type. Later, click Add to add the ESXi host. Select VM/Host Rule underneath Configuration and click Add. Enter a readable rule name (for example, vcsa-to-esxi02) and select Enable rule. Select the Virtual Machines to Hosts type and select the previously created VM and host groups. Make sure to select Must run on hosts in group or Should run on hosts in group before clicking OK, as follows: Migrating the virtual machine to another host will fail with the following error message if Must run on hosts in group was selected earlier: Keep the following in mind when designing affinity and anti-affinity rules: Enable DRS. Double-check your software vendor's licensing agreements. Make sure to test your affinity/anti-affinity rules by simulating vMotion processes. Also, simulate host failures by using maintenance mode to ensure that your rules are working as expected. Note that the created rules also apply to HA and DPM. 
KISS – Keep it simple, stupid. Try to avoid utilizing too many rules, or multiple rules for one VM/host combination.
Distributed power management
High-performance setups are often the opposite of efficient, green infrastructures; however, high-performing virtual infrastructure setups can be efficient as well. Distributed Power Management (DPM) can help you with reducing the power costs and consumption of your virtual infrastructure. It is part of DRS and monitors the CPU and memory usage of all workloads running in the cluster. If it is possible to run all VMs on fewer hosts, DPM will put one or more ESXi hosts in standby mode (they will be powered off) after migrating the VMs using vMotion.
DPM tries to keep the CPU and memory usage between 45% and 81% for all the cluster nodes by default. If this range is exceeded, hosts will be powered on or off. Setting two advanced parameters can change this behavior:
DemandCapacityRatioTarget: Utilization target for the ESXi hosts (default: 63%)
DemandCapacityRatioToleranceHost: Utilization range around the target utilization (default: 18%)
The range is calculated as (DemandCapacityRatioTarget - DemandCapacityRatioToleranceHost) to (DemandCapacityRatioTarget + DemandCapacityRatioToleranceHost). With the default values, this gives (63% - 18%) to (63% + 18%), that is, the 45% to 81% range mentioned earlier.
To control a server's power state, DPM makes use of these three protocols in the following order:
Intelligent Platform Management Interface (IPMI)
Hewlett Packard Integrated Lights-Out (HP iLO)
Wake-on-LAN (WoL)
To enable IPMI/HP iLO management, you will need to configure the Baseboard Management Controller (BMC) IP address and other access information. To configure them, follow the given steps:
Log in to vSphere Web Client and select the host that you want to configure for power management.
Click on Configuration and select the Power Management tab.
Select Properties and enter an IP address, MAC address, username, and password for the server's BMC. Note that entering hostnames will not work, as shown in the following:
To enable DPM for a cluster, perform the following steps:
Select the cluster from the inventory tab and select Manage.
From the Services tab, select vSphere DRS and click Edit.
Expand the Power Management tab and select Manual or Automatic. Also, select the threshold that DPM will use to make power decisions. The higher the value, the faster DPM will put the ESXi hosts in standby mode, as follows:
It is also possible to disable DPM for a particular host (for example, the strongest in your cluster). To do so, select the cluster and then select Manage and Host Options. Check the host and click Edit. Make sure to select Disabled for the Power Management option.
Consider the following when planning to utilize DPM:
Make sure your servers have a supported BMC, such as HP iLO or IPMI.
Evaluate the right DPM threshold. Also, keep your servers' boot time (including firmware initialization) in mind and test your configuration before running it in production.
Keep in mind that DPM also uses active memory and CPU usage for its decisions. Booting VMs might claim all of their memory without using many active memory resources. If hosts are powered down while plenty of VMs are booting, this might result in extensive swapping.
Summary
In this article, you learned how to implement the affinity and anti-affinity rules. You have also learned how to save power, while still achieving our workload requirements.
Resources for Article: Further resources on this subject: Monitoring and Troubleshooting Networking [article] Storage Scalability [article] Upgrading VMware Virtual Infrastructure Setups [article]

Parallelization using Reducers

Packt
06 Jan 2016
18 min read
In this article by Akhil Wali, the author of the book Mastering Clojure, we will study this particular abstraction of collections and how it is quite orthogonal to viewing collections as sequences. Sequences and laziness are great way of handling collections. The Clojure standard library provides several functions to handle and manipulate sequences. However, abstracting a collection as a sequence has an unfortunate consequence—any computation that is performed over all the elements of a sequence is inherently sequential. All standard sequence functions create a new collection that be similar to the collection they are passed. Interestingly, performing a computation over a collection without creating a similar collection—even as an intermediary result—is quite useful. For example, it is often required to reduce a given collection to a single value through a series of transformations in an iterative manner. This sort of computation does not necessarily require the intermediary results of each transformation to be saved. (For more resources related to this topic, see here.) A consequence of iteratively computing values from a collection is that we cannot parallelize it in a straightforward way. Modern map-reduce frameworks handle this kind of computation by pipelining the elements of a collection through several transformations in parallel and finally reducing the results into a single result. Of course, the result could be a new collection as well. A drawback is that this methodology produces concrete collections as intermediate results of each transformation, which is rather wasteful. For example, if we want to filter values from a collection, a map-reduce strategy would require creating empty collections to represent values that are left out of the reduction step to produce the final result. This incurs unnecessary memory allocation and also creates additional work for the reduction step that produces the final result. Hence, there’s scope for optimizing these kinds of computations. This brings us to the notion of treating computations over collections as reducers to attain better performance. Of course, this doesn't mean that reducers are a replacement for sequences. Sequences and laziness are great for abstracting computations that create and manipulate collections, while reducers are specialized high-performance abstractions of collections in which a collection needs to be piped through several transformations and combined to produce the final result. Reducers achieve a performance gain in the following ways: Reducing the amount of memory allocated to produce the desired result Parallelizing the process of reducing a collection into a single result, which could be an entirely new collection The clojure.core.reducers namespace provides several functions for processing collections using reducers. Let's now examine how reducers are implemented and also study a few examples that demonstrate how reducers can be used. Using reduce to transform collections Sequences and functions that operate on sequences preserve the sequential ordering between the constituent elements of a collection. Lazy sequences avoid unnecessary realization of elements in a collection until they are required for a computation, but the realization of these values is still performed in a sequential manner. However, this characteristic of sequential ordering may not be desirable for all computations performed over it. 
For example, it's not possible to map a function over a vector and then lazily realize values in the resulting collection by random access, since the map function converts the collection that it is supplied into a sequence. Also, functions such as map and filter are lazy but still sequential by nature. Consider a unary function as shown in Example 3.1 that we intend to map it over a given vector. The function must compute a value from the one it is supplied, and also perform a side effect so that we can observe its application over the elements in a collection: Example 3.1. A simple unary function (defn square-with-side-effect [x]   (do     (println (str "Side-effect: " x))     (* x x))) The square-with-side-effect function defined here simply returns the square of a number x using the * function. This function also prints the value of x using a println form whenever it is called. Suppose this function is mapped over a given vector. The resulting collection will have to be realized completely if a computation has to be performed over it, even if all the elements from the resulting vector are not required. This can be demonstrated as follows: user> (def mapped (map square-with-side-effect [0 1 2 3 4 5])) #'user/mapped user> (reduce + (take 3 mapped)) Side-effect: 0 Side-effect: 1 Side-effect: 2 Side-effect: 3 Side-effect: 4 Side-effect: 5 5 As previously shown, the mapped variable contains the result of mapping the square-with-side-effect function over a vector. If we try to sum the first three values in the resulting collection using the reduce, take, and + functions, all the values in the [0 1 2 3 4 5] vector are printed as a side effect. This means that the square-with-side-effect function was applied to all the elements in the initial vector, despite the fact that only the first three elements were actually required by the reduce form. Of course, this can be solved by using the seq function to convert the vector to a sequence before we map the square-with-side-effect function over it. But then, we lose the ability to efficiently access elements in a random order in the resulting collection. To dive deeper into why this actually happens, you first need to understand how the standard map function is actually implemented. A simplified definition of the map function is shown here: Example 3.2. A simplified definition of the map function (defn map [f coll]   (cons (f (first coll))         (lazy-seq (map f (rest coll))))) The definition of map in Example 3.2 is a simplified and rather incomplete one, as it doesn't check for an empty collection and cannot be used over multiple collections. That aside, this definition of map does indeed apply a function f to all the elements in a coll collection. This is implemented using a composition of the cons, first, rest, and lazy-seq forms. The implementation can be interpreted as, "apply the f function to the first element in the coll collection, and then map f over the rest of the collection in a lazy manner." An interesting consequence of this implementation is that the map function has the following characteristics: The ordering among the elements in the coll collection is preserved. This computation is performed recursively. The lazy-seq form is used to perform the computation in a lazy manner. The use of the first and rest forms indicates that coll must be a sequence, and the cons form will also produce a result that is a sequence. Hence, the map function accepts a sequence and builds a new one. 
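As a quick check of that last point (this snippet is mine, not one of the book's numbered examples), you can confirm in the REPL that mapping over a vector always hands you back a sequence, not a vector:

user> (vector? (map inc [1 2 3]))
false
user> (seq? (map inc [1 2 3]))
true

This is why efficient random access is lost once map has been applied; the result has to be traversed like any other sequence.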
Another interesting characteristic about lazy sequences is that they are realized in chunks. This means that a lazy sequence is realized in chunks of 32 elements, each as an optimization, when the values in the sequence are actually required. Sequences that behave this way are termed as chunked sequences. Of course, not all sequences are chunked, and we can check whether a given sequence is chunked using the chunked-seq? predicate. The range function returns a chunked sequence, shown as follows: user> (first (map #(do (print !) %) (range 70))) !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 0 user> (nth (map #(do (print !) %) (range 70)) 32) !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 32 Both the statements in the output shown previously select a single element from a sequence returned by the map function. The function passed to the map function in both the above statements prints the ! character and returns the value supplied to it. In the first statement, the first 32 elements of the resulting sequence are realized, even though only the first element is required. Similarly, the second statement is observed to realize the first 64 elements of the resulting sequence when the element at the 32nd position is obtained using the nth function. Chunked sequences have been an integral part of Clojure since version 1.1. However, none of the properties of the sequences described above are needed to transform a given collection into a result that is not a sequence. If we were to handle such computations efficiently, we cannot build on functions that return sequences, such as map and filter. Incidentally, the reduce function does not necessarily produce a sequence. It also has a couple of other interesting properties: The reduce function actually lets the collection it is passed to define how it is computed over or reduced. Thus, reduce is collection independent. Also, the reduce function is versatile enough to build a single value or an entirely new collection as well. For example, using reduce with the * or + function will create a single-valued result, while using it with the cons or concat function can create a new collection as the result. Thus, reduce can build anything. A collection is said to be reducible if it defines how it can be reduced to a single result. The binary function that is used by the reduce function along with a collection is also termed as a reducing function. A reducing function requires two arguments: one to represent the result of the computation so far, and another to represent an input value that has to be combined into the result. Several reducing functions can be composed into one, which effectively changes how the reduce function processes a given collection. This composition is done using reducers, which can be thought of as a shortened version of the term reducing function transformers. The use of sequences and laziness can be compared to the use of reducers to perform a given computation by Rich Hickey's infamous pie-maker analogy. Suppose a pie-maker has been supplied a bag of apples with an intent to reduce the apples to a pie. There are a couple of transformations needed to perform this task. First, the stickers on all the apples have to be removed; as in, we map a function to "take the sticker off" the apples in the collection. Also, all the rotten apples will have to be removed, which is analogous to using the filter function to remove elements from a collection. Instead of performing this work herself, the pie-maker delegates it to her assistant. 
The assistant could first take the stickers off all the apples, thus producing a new collection, and then take out the rotten apples to produce another new collection, which illustrates the use of lazy sequences. But then, the assistant would be doing unnecessary work by removing the stickers off of the rotten apples, which will have to be discarded later anyway. On the other hand, the assistant could delay this work until the actual reduction of the processed apples into a pie is performed. Once the work actually needs to be performed, the assistant will compose the two tasks of mapping and filtering the collection of apples, thus avoiding any unnecessary work. This case depicts the use of reducers for composing and transforming the tasks needed to effectively reduce a collection of apples to a pie. By using reducers, we create a recipe of tasks to reduce a collection of apples to a pie and delay all processing until the final reduction, instead of dealing with collections of apples as intermediary results of each task: The following namespaces must be included in your namespace declaration for the upcoming examples. (ns my-namespace   (:require [clojure.core.reducers :as r])) The clojure.core.reducers namespace requires Java 6 with the jsr166y.jar or Java 7+ for fork/join support.  Let's now briefly explore how reducers are actually implemented. Functions that operate on sequences use the clojure.lang.ISeq interface to abstract the behavior of a collection. In the case of reducers, the common interface that we must build upon is that of a reducing function. As we mentioned earlier, a reducing function is a two-arity function in which the first argument is the result produced so far and the second argument is the current input, which has to be combined with the first argument. The process of performing a computation over a collection and producing a result can be generalized into three distinct cases. They can be described as follows: A new collection with the same number of elements as that of the collection it is supplied needs to be produced. This one-to-one case is analogous to using the map function. The computation shrinks the supplied collection by removing elements from it. This can be done using the filter function. The computation could also be expansive, in which case it produces a new collection that contains an increased number of elements. This is like what the mapcat function does. These cases depict the different ways by which a collection can be transformed into the desired result. Any computation, or reduction, over a collection can be thought of as an arbitrary sequence of such transformations. These transformations are represented by transformers, which are functions that transform a reducing function. They can be implemented as shown here in Example 3.3: Example 3.3. Transformers (defn mapping [f]   (fn [rf]     (fn [result input]       (rf result (f input)))))    (defn filtering [p?]   (fn [rf]     (fn [result input]       (if (p? input)         (rf result input)         result))))   (defn mapcatting [f]   (fn [rf]     (fn [result input]       (reduce rf result (f input))))) The mapping, filtering, and mapcatting functions in the above example Example 3.3 represent the core logic of the map, filter, and mapcat functions respectively. All of these functions are transformers that take a single argument and return a new function. 
The returned function transforms a supplied reducing function, represented by rf, and returns a new reducing function, created using this expression: (fn [result input] ... ). Functions returned by the mapping, filtering, and mapcatting functions are termed as reducing function transformers. The mapping function applies the f function to the current input, represented by the input variable. The value returned by the f function is then combined with the accumulated result, represented by result, using the reducing function rf. This transformer is a frighteningly pure abstraction of the standard map function that applies an f function over a collection. The mapping function makes no assumptions about the structure of the collection it is supplied, or how the values returned by the f function are combined to produce the final result. Similarly, the filtering function uses a predicate, p?, to check whether the current input of the rf reducing function must combined into the final result, represented by result. If the predicate is not true, then the reducing function will simply return the result value without any modification. The mapcatting function uses the reduce function to combine the value result with the result of the (f input) expression. In this transformer, we can assume that the f function will return a new collection and the rf reducing function will somehow combine two collections into a new one. One of the foundations of the reducers library is the CollReduce protocol defined in the clojure.core.protocols namespace. This protocol defines the behavior of a collection when it is passed as an argument to the reduce function, and it is declared as shown below Example 3.4: Example 3.4. The CollReduce protocol (defprotocol CollReduce   (coll-reduce [coll rf init])) The clojure.core.reducers namespace defines a reducer function that creates a reducible collection by dynamically extending the CollReduce protocol, as shown in this code Example 3.5: The reducer function (defn reducer   ([coll xf]    (reify      CollReduce      (coll-reduce [_ rf init]        (coll-reduce coll (xf rf) init))))) The reducer function combines a collection (coll) and a reducing function transformer (xf), which is returned by the mapping, filtering, and mapcatting functions, to produce a new reducible collection. When reduce is invoked on a reducible collection, it will ultimately ask the collection to reduce itself using the reducing function returned by the (xf rf) expression. Using this mechanism, several reducing functions can be composed into a computation that has to be performed over a given collection. Also, the reducer function needs to be defined only once, and the actual implementation of coll-reduce is provided by the collection supplied to the reducer function. Now, we can redefine the reduce function to simply invoke the coll-reduce function implemented by a given collection, as shown herein Example 3.6: Example 3.6. Redefining the reduce function (defn reduce   ([rf coll]    (reduce rf (rf) coll))   ([rf init coll]    (coll-reduce coll rf init))) As shown in the above code Example 3.6, the reduce function delegates the job of reducing a collection to the collection itself using the coll-reduce function. Also, the reduce function will use the rf reducing function to supply the init argument when it is not specified. An interesting consequence of this definition of reduce is that the rf function must produce an identity value when it is supplied no arguments. 
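To make these last two points a bit more tangible, here is a small REPL sketch of mine (not part of the book's numbered examples) that relies only on the mapping transformer from Example 3.3 and the standard + function, which already returns its identity value when called with no arguments:

user> (+)
0
user> (reduce ((mapping inc) +) 0 [1 2 3 4])
14

Here, (mapping inc) transforms the + reducing function into one that increments each input before adding it to the accumulated result, which is exactly the behavior we will get from the reducer-based map later on.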
The standard reduce function even uses the CollReduce protocol to delegate the job of reducing a collection to the collection itself, but it will also fall back on the default definition of reduce if the supplied collection does not implement the CollReduce protocol. Since Clojure 1.4, the reduce function allows a collection to define how it is reduced using the clojure.core.CollReduce protocol. Clojure 1.5 introduced the clojure.core.reducers namespace, which extends the use of this protocol. All the standard Clojure collections, namely lists, vectors, sets, and maps, implement the CollReduce protocol. The reducer function can be used to build a sequence of transformations to be applied on a collection when it is passed as an argument to the reduce function. This can be demonstrated as follows: user> (r/reduce + 0 (r/reducer [1 2 3 4] (mapping inc))) 14 user> (reduce + 0 (r/reducer [1 2 3 4] (mapping inc))) 14 In this output, the mapping function is used with the inc function to create a reducing function transformer that increments all the elements in a given collection. This transformer is then combined with a vector using the reducer function to produce a reducible collection. The call to reduce in both of the above statements is transformed into the (reduce + [2 3 4 5]) expression, thus producing the result as 14. We can now redefine the map, filter, and mapcat functions using the reducer function, as shown belowin Example 3.7: Redefining the map, filter and mapcat functions using the reducer form (defn map [f coll]   (reducer coll (mapping f)))   (defn filter [p? coll]   (reducer coll (filtering p?)))   (defn mapcat [f coll]   (reducer coll (mapcatting f))) As shown in Example 3.7, the map, filter, and mapcat functions are now simply compositions of the reducer form with the mapping, filtering, and mapcatting transformers respectively. The definitions of CollReduce, reducer, reduce, map, filter, and mapcat are simplified versions of their actual definitions in the clojure.core.reducers namespace. The definitions of the map, filter, and mapcat functions shown in Example 3.7 have the same shape as the standard versions of these functions, shown as follows: user> (r/reduce + (r/map inc [1 2 3 4])) 14 user> (r/reduce + (r/filter even? [1 2 3 4])) 6 user> (r/reduce + (r/mapcat range [1 2 3 4])) 10 Hence, the map, filter, and mapcat functions from the clojure.core.reducers namespace can be used in the same way as the standard versions of these functions. The reducers library also provides a take function that can be used as a replacement for the standard take function. We can use this function to reduce the number of calls to the square-with-side-effect function (from Example 3.1) when it is mapped over a given vector, as shown below: user> (def mapped (r/map square-with-side-effect [0 1 2 3 4 5])) #'user/mapped user> (reduce + (r/take 3 mapped)) Side-effect: 0 Side-effect: 1 Side-effect: 2 Side-effect: 3 5 Thus, using the map and take functions from the clojure.core.reducers namespace as shown above avoids the application of the square-with-side-effect function over all five elements in the [0 1 2 3 4 5] vector, as only the first three are required. The reducers library also provides variants of the standard take-while, drop, flatten, and remove functions which are based on reducers. Effectively, functions based on reducers will require a lesser number of allocations than sequence-based functions, thus leading to an improvement in performance. 
For example, consider the process and process-with-reducer functions, shown here: Example 3.8. Functions to process a collection of numbers using sequences and reducers (defn process [nums]   (reduce + (map inc (map inc (map inc nums))))) (defn process-with-reducer [nums]   (reduce + (r/map inc (r/map inc (r/map inc nums))))) This process function in Example 3.8 applies the inc function over a collection of numbers represented by nums using the map function. The process-with-reducer function performs the same action but uses the reducer variant of the map function. The process-with-reducer function will take a lesser amount of time to produce its result from a large vector compared to the process function, as shown here: user> (def nums (vec (range 1000000))) #'user/nums user> (time (process nums)) "Elapsed time: 471.217086 msecs" 500002500000 user> (time (process-with-reducer nums)) "Elapsed time: 356.767024 msecs" 500002500000 The process-with-reducer function gets a slight performance boost as it requires a lesser number of memory allocations than the process function. The performance of this computation can be improved by a greater scale if we can somehow parallelize it. Summary In this article, we explored the clojure.core.reducers library in detail. We took a look at how reducers are implemented and also how we can use reducers to handle large collections of data in an efficient manner. Resources for Article:   Further resources on this subject: Getting Acquainted with Storm [article] Working with Incanter Datasets [article] Application Performance [article]

Raster Calculations

Packt
06 Jan 2016
8 min read
In this article by Alexander Bruy, author of the book QGIS 2 Cookbook, we will see some of the most common operations related to Digital Elevation Models (DEM). (For more resources related to this topic, see here.)
Calculating a hillshade layer
A hillshade layer is commonly used to enhance the appearance of a map, and it can be computed from a DEM. This recipe shows how to compute it.
Getting ready
Open the dem_to_prepare.tif layer. This layer contains a DEM in the EPSG:4326 CRS with elevation data in feet. These characteristics are unsuitable for running most terrain analysis algorithms, so we will modify this layer to make it suitable.
How to do it...
In the Processing toolbox, find the Hillshade algorithm and double-click on it to open it, as shown in the following screenshot:
Select the DEM in the Input layer field. Leave the rest of the parameters with their default values.
Click on Run to run the algorithm. The hillshade layer will be added to the QGIS project, as shown in the following screenshot:
How it works...
As in the case of the slope, the algorithm is part of the GDAL library. You will see that the parameters are quite similar to the slope case. This is because the slope is used to compute the hillshade layer. Based on the slope and the aspect of the terrain in each cell, and using the position of the sun defined by the Azimuth and Altitude fields, the algorithm computes the illumination that the cell will receive. You can try changing the values of these parameters to alter the appearance of the layer.
There's more...
As in the case of slope, there are alternative options to compute the hillshade. The SAGA one in the Processing toolbox has a feature that is worth mentioning. The SAGA hillshade algorithm contains a field named method. This field is used to select the method used to compute the hillshade value, and the last method available, Raytracing, differs from the other ones in that it models the real behavior of light, making an analysis that is not local but uses the full information of the DEM instead. This renders more precise hillshade layers, but the processing time can be notably larger.
Enhancing your map view with a hillshade layer
You can combine the hillshade layer with your other layers to enhance their appearance. Since you have used a DEM to compute the hillshade layer, it should already be in your QGIS project along with the hillshade itself. However, it will be covered by the hillshade, since layers produced by processing are added on top. Move the DEM to the top of the layer list, so you see it (and not the hillshade layer), and style it to something like the following screenshot:
In the Properties dialog of the layer, move to the Transparency section and set the Global transparency value to 50%, as shown in the following screenshot:
Now you should see the hillshade layer through the DEM, and the combination of both of them will look like the following screenshot:
Analyzing hydrology
A common analysis from a DEM is to compute hydrological elements, such as the channel network or the set of watersheds. This recipe shows the steps to follow to do it.
Getting ready
Open the DEM that we prepared in the previous recipe.
How to do it...
In the Processing toolbox, find the Fill sinks algorithm and double-click on it to open it:
Select the DEM in the DEM field and run the algorithm. It will generate a new filtered DEM layer. From now on, we will use only this DEM in the recipe, not the original one.
Open the Catchment Area in and select the filtered DEM in the Elevation field: Run the algorithm. It will generate a Catchment Area layer. Open the Channel network algorithm and fill it, as shown in the following screenshot: Run the algorithm. It will extract the channel network from the DEM based on Catchment Area and generate it as both raster and vector layer. Open the Watershed basins algorithm and fill it, as shown in the following screenshot: Run the algorithm. It will generate a raster layer with the watersheds calculated from the DEM and the channel network. Each watershed is a hydrological unit that represents the area that flows into a junction defined by the channel network: How it works... Starting from the DEM, the preceding described steps follow a typical workflow for hydrological analysis: First, the sinks are removed. This is a required preparation whenever you plan to do hydrological analysis. The DEM might contain sinks where a flow direction cannot be computed, which represents a problem in order to model the movement of water across those cells. Removing them solves this problem. The catchment area is computed from the DEM. The values in the catchment area layer represent the area upstream of each cell. That is, the total area in which, if water is dropped, it will eventually reach the cell. Cells with high values of catchment area will likely contain a river, whereas cell with lower values will have the overland flow. By setting a threshold on the catchment area values, we can separate the river cells (those above the threshold) from the remaining ones and extract the channel network. Finally, we compute the watersheds associated with each junction in the channel network extracted in the last step. There's more... The key parameter in the preceding workflow is the catchment area threshold. If a larger threshold is used, fewer cells will be considered as river cells, and the resulting channel network will be sparser. Because the watersheds are computed based on the channel network, it will result in a lower number of watersheds. You can try yourself with different values of the catchment area threshold. Here, you can see the result for threshold equal to 10,00,000 and 5,00,00,000. The following screenshot shows the result of threshold equal to 10,00,000: The following screenshot shows the result of threshold equal to 5,00,00,000: Note that in the previous case, with a higher threshold value, there is only one single watershed in the resulting layer. The threshold values are expressed in the units of the catchment area, which, because the cell size is assumed to be in meters, are in square meters. Calculating a topographic index Because the topography defines and influences most of the processes that take place in a given terrain, the DEM can be used to extract many different parameters that give us information about those processes. This recipe shows to calculate a popular one named the Topographic Wetness Index, which estimates the soil wetness based on the topography. Getting ready Open the DEM that we prepared in the Calculating a hillshade layer recipe. How to do it... Calculate a slope layer using the Slope, Aspect, and Curvature algorithm from the Processing toolbox. Calculate a catchment area layer using the Catchment Area algorithm from the Processing toolbox. Note that you must use a sinkless DEM, as the DEM that we generated in the previous recipe with the Fill sinks algorithm. 
Open the Topographic Wetness Index algorithm from the Processing toolbox and fill it, as shown in the following screenshot: Run the algorithm. It will create a layer with the Topographic Wetness Index field indicating the soil wetness in each cell: How it works... The index combines slope and catchment area, two parameters that influence the soil wetness. If the catchment area value is high, it means more water will flow into the cell thus increasing its soil wetness. A low value of slope will have a similar effect because the water that flows into the cell will not flow out of it quickly. The algorithm expects the slope to be expressed in radians. That's the reason why the Slope, Aspect, and Curvature algorithm has to be used because it produces its slope output in radians. The Slope algorithm that you will also find, which is based on the GDAL library, creates a slope layer with values expressed in degrees. You can use that layer if you convert its units by using the raster calculator. There's more... Other indices based on the same input layers can be found in different algorithm in the Processing toolbox. The Stream Power Index and the LS factor fields use the slope and catchment area as inputs as well and can be related to potential erosion. Summary In this article, we saw the working of the hillshade layer and a topographic index along with their calculation technique. We also saw how to analyze hydrology. Resources for Article: Further resources on this subject: Identifying the Best Places [article] Style Management in QGIS [article] Geolocating photos on the map [article]

Building a Command-line Tool

Packt
06 Jan 2016
15 min read
In this article Eduardo Díaz, author of the book Clojure for Java Developers, we will learn how to build a command line tool in Clojure. We'll also look to some new features in the Clojure world; we'll be discussing core.async and transducers, which are the new ways to write asynchronous programs. core.async is a very exciting method to create thousands of light threads, along with the capability to manage them. Transducers are a way to separate computation from the source of data; you can use transducers with the data flowing between the light threads or use them with a vector of data. (For more resources related to this topic, see here.) The requirements First, let's take the time to understand the requirements fully. Let's try to summarize our requirement in a single statement: We need to know all the places in a set of pages that use the same CSS selector. This seems to be a well-defined problem, but let's not forget to specify some things in order to make the best possible decisions. We want to be able to use more than one browser. We want to look for the same CSS selector in almost all the pages. It will be better to have an image of where the elements are used, instead of a text report. We want it to be a command-line app. We need to count the number of elements that a CSS selector can match in a single page and in all of them; this might help us get a sense of the classes we can change freely and the ones we should never touch. It should be written in Clojure! How can we solve the previous requirements? Java and Clojure already have a wide variety of libraries that we can use. Let's have a look at a couple of them, which we can use in this example. Automating the browser The biggest issue seems to be finding a simple way to build a cross browser. Even automating a single browser sounds like a complex task; how could we automate different browsers? You probably have heard about selenium, a library that enables us to automate a browser. It is normally used to test, but it also lets us take screenshots to lookup for certain elements and allows us to run custom JavaScript on the browser and in turn its architecture allows it to run on different browsers. It does seem like a great fit. In the modern world, you can use selenium for almost any language you want; however, it is written in Java and you can expect a first class support if you are running in the JVM. We are using Clojure and we can expect better integration with the Clojure language; for this particular project we will rely on clj-webdriver (https://github.com/semperos/clj-webdriver). It is an open source project that features an idiomatic Clojure API for selenium, called Taxi. You can find the documentation for Taxi at https://github.com/semperos/clj-webdriver/wiki/Introduction%3A-Taxi Parsing the command-line parameters We want to build a command-line app and if we want to do it in the best possible way, it is important to think of our users. Command-line users are comfortable with using their apps in a standard way. One of the most used and best-known standards to pass arguments to command-line apps is the standard GNU. There are libraries in most of the languages that help you parse command-line arguments and Clojure is no exception. Let's use the tools.cli library (https://github.com/clojure/tools.cli). The tools.cli library is a parser for the command-line arguments, it adheres to the GNU standard and makes it very easy to create a command-line interface that is safe and familiar to use. 
Some very interesting features that tools.cli gives you are:
Instant creation of a help message that shows each available option
Custom parsing of options (you can give your own function to the parser, so you'll get the exact value you want)
Custom validation of options (you can give a function to the parser so it validates what you are passing to the command line)
It works with ClojureScript, so you can create your own Node.js command-line tools with ClojureScript and use tools.cli to parse the arguments!
The README file on GitHub is very helpful and is updated for the latest version. Therefore, I recommend that you have a look to understand all the possibilities of this awesome library. We have everything we need in order to build our app, so let's write it now.
Getting to know the check-css Project
This project has the following three main namespaces:
Core
Browser
Page
Let's check the responsibilities of each of these namespaces.
The check-css.browser namespace

(ns check-css.browser
  (:require [clj-webdriver.taxi :as taxi]
            [clojure.string :as s]
            [clojure.java.io :as io]))

(defn exec-site-fn [urls f & {:keys [driver]
                              :or {driver {:browser :chrome}}}]
  (taxi/with-driver driver
    (doseq [url urls]
      (taxi/to url)
      (f url))))

This is very simple code: it defines the exec-site-fn function, which receives a list of URLs and an optional driver configuration. If you don't specify a driver, it will be Chrome by default. Taxi includes a with-driver macro, which allows you to execute a procedure with a single browser in a sequential manner. We get the following benefits from this:
We need the resources (memory, CPU) for only one browser
We don't need to coordinate parallel execution
We don't need to think about closing the resources correctly (in this case, the browser)
So this function just executes something for some URLs using a single browser; we can think of it as just a helper function.
The check-css.page namespace

(ns check-css.page
  (:require [clj-webdriver.taxi :as taxi]
            [clojure.string :as s]
            [clojure.java.io :as io]))

(defn execute-script-fn [base-path js-src selector url]
  (let [path-url (s/replace url #"/" "_")
        path (str base-path "/" path-url ".png")]
    (taxi/execute-script (s/replace js-src #"#selector#" selector))
    (println (str "Checking site " url))
    (taxi/take-screenshot :file path)))

This is again a helper function, which does two things:
It executes some JavaScript that you pass along (replacing references to #selector with the passed selector).
It takes a screenshot.
How can we use this to our advantage? It is very easy to use JavaScript to mark the elements you are interested in. You can use this script as shown:

var els = document.querySelectorAll('#selector#');
for (var i = 0; i < els.length; i++) {
    els[i].style.border = '2px solid red';
}

Therefore, we just need to use everything together for this to work, and we'll see that in the check-css.core namespace.
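Before moving on, it is worth noting that the same execute-script hook can also report how many nodes matched the selector, not just highlight them. The following is a hedged sketch of mine rather than code from the book: it assumes that taxi/execute-script hands back whatever value the injected JavaScript returns (as the underlying Selenium executeScript call does), and the count-elements-fn helper name is made up for illustration:

(defn count-elements-fn [selector url]
  ;; Ask the browser how many elements match the CSS selector on the
  ;; currently loaded page, and log the count next to the URL.
  (let [js (str "return document.querySelectorAll('" selector "').length;")
        n  (taxi/execute-script js)]
    (println (str url " -> " n " element(s) matching " selector))
    n))

A function like this could be passed to exec-site-fn in the same way as execute-script-fn, which is one way to collect the per-page statistics mentioned later without adding any new dependencies.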
The check-css.core namespace (ns check-css.core …) (def cli-options   [["-s" "--selector SELECTOR" "CSS Selector"]    ["-p" "--path PATH" "The base folder for images"]    ["-b" "--browser BROWSER" "Browser"     :default :chrome     :parse-fn keyword]    ["-h" "--help"]]) (defn find-css-usages [browser selector output-path urls]   (let [js-src (-> (io/resource "script.js") slurp)         apply-script-fn (partial p/execute-script-fn                                  output-path                                  js-src                                  selector)]     (doseq [url urls]       (b/exec-site-fn urls apply-script-fn                       :driver {:browser browser})))) (defn -main [& args]   (let [{:keys [options arguments summary]} (parse-opts args cli-options)         {:keys [browser selector path help]} options         urls arguments]     (if-not help       (find-css-usages browser selector path urls)       (exit 0 summary)))) This code looks very simple; here we can see the usage of tools.cli and the function that takes everything together, find-css-usages. This function: It reads the JavaScript file from the classpath It creates a function that only receives the url, so it is compatible with the function f in exec-site-fn. This is all that is needed to execute our program. Now we can do the following from the command line: # lein uberjar # java -jar target/uberjar/check-css-0.1.0-SNAPSHOT-standalone.jar -p . -s "input" -b chrome http://www.google.com http://www.facebook.com It creates a couple of screenshots of Google and Facebook, pointing out the elements that are inputs. Granted, we can do something more interesting with our app, but for now, let's focus on the code. There are a couple of things we want to do to this code. The first thing is that we want to have some sort of statistical record of how many elements were found, not just the screenshots. The second important thing has to do with an opportunity to learn about core.async and what's coming up next in the Clojure world. Core.async Core.async is yet another way of programming concurrently, it uses the idea of lightweight threads and channels it to communicate between them. Why lightweight threads? The lightweight threads are used in languages like go and erlang. They pride in being able to run thousands of threads in a single process. What is the difference between the lightweight threads and traditional threads? The traditional threads need to reserve memory and this also takes some time. If you want to create a couple thousand threads, you will be using a noticeable amount of memory for each thread and asking the kernel to do that also takes time. What difference do lightweight threads make? To have a couple hundred lightweight threads you only need to create a couple of threads, there is no need to reserve memory. The lightweight threads are merely a software idea. This can be achieved with most languages and Clojure adds first class support (without changing the language, this is part of the lisp power) using core.async! Let's have a look of how it works. There are two concepts that you need to keep in mind: Goblocks: They are the lightweight threads. Channels: The channels are a way to communicate between various goblocks, you can think of them as queues. Goblocks can publish a message to the channel and other goblocks can take a message from them. 
Just as there are integration patterns for queues, there are integration patterns for channels, where you will find concepts similar to broadcasting, filtering, and mapping. Now, let's play a little with each of them so you can understand how to use them for our program. Goblocks You will find goblocks in the clojure.core.async namespace. Goblocks are extremely easy to use, you need the go macro and you will do something similar to this: (ns test   (:require [clojure.core.async :refer [go]]))   (go   (println "Running in a goblock!")) They are similar to threads; you just need to remember that you can create goblocks freely. There can be thousands of running goblocks in a single JVM. Channels You can actually use anything you like to communicate between goblocks, but it is recommended that you use channels. Channels have two main operations namely, putting and getting. Let's see how to do it: (ns test   (:require [clojure.core.async :refer [go chan >! <!]]))   (let [c (chan)]   (go (println (str "The data in the channel is" (<! c))))   (go (>! c 6))) That's it! It looks pretty simple, as you can see there are three main functions that we are using with channels: chan: This function creates a channel; the channels can store messages in a buffer and if you want that functionality you should just pass the size of the buffer to the chan function. If no size is specified, the channel can store only one message. >!: The put function must be used within a goblock; it receives a channel and the value you want to publish to it. This function is blocking; if a channel's buffer is already full, it will block until something is consumed from the channel. <! The take function must be used within a goblock, it receives the channel you are taking from. It is blocking, if you haven't published something in the channel it will park until there's data available. There are lots of other functions that you can use with channels, for now let's add two related functions that you will probably use soon >!!: The blocking put, it works exactly the same as the put function, except it can be used from anywhere. Remember, if a channel cannot take more data this function will block the entire thread from where it runs. <!!: The blocking take, it works exactly as the take function, except you can use this from anywhere not just inside goblocks. Just keep in mind that this blocks the thread where it runs until there's data available If you look into the core.async API docs, (http://clojure.github.io/core.async/) you will find a fair amount of functions. Some of them look similar to the functions that give you functionalities similar to queues, let's look at the broadcast function. (ns test   (:require [clojure.core.async.lab :refer [broadcast]]             [clojure.core.async :refer [chan <! >!! go-loop]])   (let [c1 (chan 5)       c2 (chan 5)       bc (broadcast c1 c2)]   (go-loop []     (println "Getting from the first channel" (<! c1))     (recur))   (go-loop []     (println "Getting from the second channel" (<! C2))     (recur))   (>!! bc 5)   (>!! bc 9)) With this you can now publish to several channels at the same time, this is helpful to subscribe multiple processes to a single source of events, with a great amount of separation of concerns. If you take a good look, you will also find familiar functions over there: map, filter, and reduce. Depending of the version of core.async, some of these functions could not be there anymore. Why are these functions there? 
Those functions are for modifying collections of data, right? The reason is that there has been a good amount of effort towards using channels as higher-level abstractions. The idea is to see channels as collections of events, if you think of them that way it's easy to see that you can create a new channel by mapping every element of an old channel, or you can create a new channel by filtering away some elements. In recent versions of Clojure, the abstraction has become even more noticeable with transducers. Transducers Transducers are a way to separate the computations from the input source, simply they are a way to apply a sequence of steps to a sequence or a channel. Let's look at an example for a sequence. (let [odd-counts (comp (map count)                        (filter odd?))       vs [[1 2 3 4 5 6]           [:a :c :d :e]           [:test]]]   (sequence odd-counts vs)) comp feels similar to the threading macros, it composes functions and stores the steps of the computation. The interesting part is that we can use this same odd-counts transformation with a channel, as shown: (let [odd-counts (comp (map count)                        (filter odd?))       input (chan)       output (chan 5 odd-counts)]   (go-loop []     (let [x (<! output)]       (println x))       (recur))   (>!! input [1 2 3 4 5 6])   (>!! input [:a :c :d :e])   (>!! input [:test])) This is quite interesting and now you can use this to understand how to improve the code of the check-css program. The main thing we'll gain (besides learning core.async and how to use transducers) is the visibility and separation of concerns; if we have channels publishing events, then it becomes extremely simple to add new subscribers without changing anything. Summary In this article we have learned how to build a command line tool, what are the requirements we needed to build the tool, how we can use this in different projects. Resources for Article: Further resources on this subject: Big Data [article] Implementing a Reusable Mini-firewall Using Transducers [article] Developing a JavaFX Application for iOS [article]

Remote Sensing and Histogram

Packt
04 Jan 2016
10 min read
In this article by Joel Lawhead, the author of Learning GeoSpatial Analysis with Python - Second Edition, we will discuss remote sensing. This field grows more exciting every day as more satellites are launched and the distribution of data becomes easier. The high availability of satellite and aerial images as well as the interesting new types of sensors that are being launched each year is changing the role that remote sensing plays in understanding our world. In this field, Python is quite capable. In remote sensing, we step through each pixel in an image and perform a type of query or mathematical process. An image can be thought of as a large numerical array and in remote sensing, these arrays can be as large as tens of megabytes to several gigabytes. While Python is fast, only C-based libraries can provide the speed that is needed to loop through the arrays at a tolerable speed. (For more resources related to this topic, see here.) In this article, whenever possible, we’ll use Python Imaging Library (PIL) for image processing and NumPy, which provides multidimensional array mathematics. While written in C for speed, these libraries are designed for Python and provide Python’s API. In this article, we’ll start with basic image manipulation and build on each exercise all the way to automatic change detection. Here are the topics that we’ll cover: Swapping image bands Creating image histograms Swapping image bands Our eyes can only see colors in the visible spectrum as combinations of red, green, and blue (RGB). Airborne and spaceborne sensors can collect wavelengths of the energy that is outside the visible spectrum. In order to view this data, we move images representing different wavelengths of light reflectance in and out of the RGB channels in order to make color images. These images often end up as bizarre and alien color combinations that can make visual analysis difficult. An example of a typical satellite image is seen in the following Landsat 7 satellite scene near the NASA's Stennis Space Center in Mississippi along the Gulf of Mexico, which is a leading space center for remote sensing and geospatial analysis in general: Most of the vegetation appears red and water appears almost black. This image is one type of false color image, meaning that the color of the image is not based on the RGB light. However, we can change the order of the bands or swap certain bands in order to create another type of false color image that looks more like the world we are used to seeing. In order to do so, you first need to download this image as a ZIP file from the following: http://git.io/vqs41 We need to install the GDAL library with Python bindings. The Geospatial Data Abstraction Library(GDAL)includes a module called gdalnumeric that loads and saves remotely sensed images to and from NumPy arrays for easy manipulation. GDAL itself is a data access library and does not provide much in the name of processing. Therefore, in this article, we will rely heavily on NumPy to actually change images. In this example, we’ll load the image in a NumPy array using gdalnumeric and then, we’ll immediately save it in a new .tiff file. However, upon saving, we’ll use NumPy’s advanced array slicing feature to change the order of the bands. Images in NumPy are multidimensional arrays in the order of band, height, and width. Therefore, an image with three bands will be an array of length three, containing an array for each band, the height, and width of the image. 
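A quick way to confirm that ordering is to load the image and print the array's shape; the band count comes first. The following is a minimal sketch, assuming the FalseColor.tif file downloaded above sits next to the script:

from osgeo import gdalnumeric

# load the three-band image into a NumPy array
arr = gdalnumeric.LoadFile("FalseColor.tif")

# prints (bands, rows, cols) - the band count appears first
print(arr.shape)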
It’s important to note that NumPy references the array locations as y,x (row, column) instead of the usual column and row format that we work with in spreadsheets and other software: from osgeo import gdalnumeric src = "FalseColor.tif" arr = gdalnumeric.LoadFile(src) gdalnumeric.SaveArray(arr[[1, 0, 2], :], "swap.tif",format="GTiff", prototype=src) Also in the SaveArray() method, the last argument is called prototype. This argument let’s you specify another image for GDAL from which we can copy spatial reference information and some other image parameters. Without this argument, we’d end up with an image without georeferencing information, which can not be used in a geographical information system (GIS). In this case, we specified our input image file name as the images are identical, except for the band order. The result of this example produces the swap.tif image, which is a much more visually appealing image with green vegetation and blue water: There’s only one problem with this image. It’s a little dark and difficult to see. Let’s see if we can figure out why. Creating histograms A histogram shows the statistical frequency of data distribution in a dataset. In the case of remote sensing, the dataset is an image, the data distribution is the frequency of the pixels in the range of 0 to 255, which is the range of the 8-byte numbers used to store image information on computers. In an RGB image, color is represented as a three-digit tuple with (0,0,0) being black and (255,255,255) being white. We can graph the histogram of an image with the frequency of each value along the y axis and the range of 255 possible pixel values along the x axis. We can use the Turtle graphics engine that is included with Python to create a simple GIS. We can also use it to easily graph histograms. Histograms are usually a one-off product that makes a quick script, like this example, great. Also, histograms are typically displayed as a bar graph with the width of the bars representing the size of the grouped data bins. However, in an image, each bin is only one value, so we’ll create a line graph. We’ll use the histogram function in this example and create a red, green, and blue line for each band. The graphing portion of this example also defaults to scaling the y axis values to the maximum RGB frequency that is found in the image. Technically, the y axis represents the maximum frequency, that is, the number of pixels in the image, which would be the case if the image was of one color. We’ll use the turtle module again; however, this example could be easily converted to any graphical output module. However, this format makes the distribution harder to see. 
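Before drawing anything with turtle, the raw frequency counts can be computed directly with NumPy; the following is a minimal sketch, assuming the swap.tif file created above and one bin per possible 8-bit value:

from osgeo import gdalnumeric
import numpy as np

arr = gdalnumeric.LoadFile("swap.tif")

# one frequency count per band, with one bin for each 8-bit value 0-255
for band in arr:
    counts, _ = np.histogram(band, bins=np.arange(257))
    # peak frequency and the pixel value at which it occurs
    print(counts.max(), counts.argmax())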
Let’s take a look at our swap.tif image: from osgeo import gdalnumeric import turtle as t def histogram(a, bins=list(range(0, 256))): fa = a.flat n = gdalnumeric.numpy.searchsorted(gdalnumeric.numpy.sort(fa), bins) n = gdalnumeric.numpy.concatenate([n, [len(fa)]]) hist = n[1:]-n[:-1] return hist defdraw_histogram(hist, scale=True): t.color("black") axes = ((-355, -200), (355, -200), (-355, -200), (-355, 250)) t.up() for p in axes: t.goto(p) t.down() t.up() t.goto(0, -250) t.write("VALUE", font=("Arial, ", 12, "bold")) t.up() t.goto(-400, 280) t.write("FREQUENCY", font=("Arial, ", 12, "bold")) x = -355 y = -200 t.up() for i in range(1, 11): x = x+65 t.goto(x, y) t.down() t.goto(x, y-10) t.up() t.goto(x, y-25) t.write("{}".format((i*25)), align="center") x = -355 y = -200 t.up() pixels = sum(hist[0]) if scale: max = 0 for h in hist: hmax = h.max() if hmax> max: max = hmax pixels = max label = pixels/10 for i in range(1, 11): y = y+45 t.goto(x, y) t.down() t.goto(x-10, y) t.up() t.goto(x-15, y-6) t.write("{}" .format((i*label)), align="right") x_ratio = 709.0 / 256 y_ratio = 450.0 / pixels colors = ["red", "green", "blue"] for j in range(len(hist)): h = hist[j] x = -354 y = -199 t.up() t.goto(x, y) t.down() t.color(colors[j]) for i in range(256): x = i * x_ratio y = h[i] * y_ratio x = x - (709/2) y = y + -199 t.goto((x, y)) im = "swap.tif" histograms = [] arr = gdalnumeric.LoadFile(im) for b in arr: histograms.append(histogram(b)) draw_histogram(histograms) t.pen(shown=False) t.done() Here's what the histogram for swap.tif looks similar to after running the example: As you can see, all the three bands are grouped closely towards the left-hand side of the graph and all have values that are less than 125. As these values approach zero, the image becomes darker, which is not surprising. Just for fun, let’s run the script again and when we call the draw_histogram() function, we’ll add the scale=False option to get an idea of the size of the image and provide an absolute scale. Therefore, we take the following line: draw_histogram(histograms) Change it to the following: draw_histogram(histograms, scale=False) This change will produce the following histogram graph: As you can see, it’s harder to see the details of the value distribution. However, this absolute scale approach is useful if you are comparing multiple histograms from different products that are produced from the same source image. Now that we understand the basics of looking at an image statistically using histograms, how do we make our image brighter? Performing a histogram stretch A histogram stretch operation does exactly what the name suggests. It distributes the pixel values across the whole scale. By doing so, we have more values at the higher-intensity level and the image becomes brighter. Therefore, in this example, we’ll use our histogram function; however, we’ll add another function called stretch() that takes an image array, creates the histogram, and then spreads out the range of values for each band. 
We’ll run these functions on swap.tif and save the result in an image called stretched.tif: import gdalnumeric import operator from functools import reduce def histogram(a, bins=list(range(0, 256))): fa = a.flat n = gdalnumeric.numpy.searchsorted(gdalnumeric.numpy.sort(fa), bins) n = gdalnumeric.numpy.concatenate([n, [len(fa)]]) hist = n[1:]-n[:-1] return hist def stretch(a): hist = histogram(a) lut = [] for b in range(0, len(hist), 256): step = reduce(operator.add, hist[b:b+256]) / 255 n = 0 for i in range(256): lut.append(n / step) n = n + hist[i+b] gdalnumeric.numpy.take(lut, a, out=a) return a src = "swap.tif" arr = gdalnumeric.LoadFile(src) stretched = stretch(arr) gdalnumeric.SaveArray(arr, "stretched.tif", format="GTiff", prototype=src) The stretch algorithm will produce the following image. Look how much brighter and visually appealing it is: We can run our turtle graphics histogram script on stretched.tif by changing the filename in the im variable to stretched.tif: im = "stretched.tif" This run will give us the following histogram: As you can see, all the three bands are distributed evenly now. Their relative distribution to each other is the same; however, in the image, they are now spread across the spectrum. Summary In this article, we covered the foundations of remote sensing including band swapping and histograms. The authors of GDAL have a set of Python examples, covering some advanced topics that may be of interest, available at https://svn.osgeo.org/gdal/trunk/gdal/swig/python/samples/. Resources for Article: Further resources on this subject: Python Libraries for Geospatial Development[article] Python Libraries[article] Learning R for Geospatial Analysis [article]

Using Cloud Applications and Containers

Xavier Bruhiere
10 Nov 2015
7 min read
We can find a certain comfort while developing an application on our local computer. We debug logs in real time. We know the exact location of everything, for we probably started it by ourselves. Make it work, make it right, make it fast - Kent Beck Optimization is the root of all devil - Donald Knuth So hey, we hack around until interesting results pop up (ok that's a bit exaggerated). The point is, when hitting the production server our code will sail a much different sea. And a much more hostile one. So, how to connect to third party resources ? How do you get a clear picture of what is really happening under the hood ? In this post we will try to answer those questions with existing tools. We won't discuss continuous integration or complex orchestration. Instead, we will focus on what it takes to wrap a typical program to make it run as a public service. A sample application Before diving into the real problem, we need some code to throw on remote servers. Our sample application below exposes a random key/value store over http. // app.js // use redis for data storage var Redis = require('ioredis'); // and express to expose a RESTFul API var express = require('express'); var app = express(); // connecting to redis server var redis = new Redis({ host: process.env.REDIS_HOST || '127.0.0.1', port: process.env.REDIS_PORT || 6379 }); // store random float at the given path app.post('/:key', function (req, res) { var key = req.params.key var value = Math.random(); console.log('storing', value,'at', key) res.json({set: redis.set(key, value)}); }); // retrieve the value at the given path app.get('/:key', function (req, res) { console.log('fetching value at ', req.params.key); redis.get(req.params.key).then(function(err, result) { res.json({ result: result || err }); }) }); var server = app.listen(3000, function () { var host = server.address().address; var port = server.address().port; console.log('Example app listening at http://%s:%s', host, port); }); And we define the following package.json and Dockerfile. { "name": "sample-app", "version": "0.1.0", "scripts": { "start": "node app.js" }, "dependencies": { "express": "^4.12.4", "ioredis": "^1.3.6", }, "devDependencies": {} } # Given a correct package.json, those two lines alone will properly install and run our code FROM node:0.12-onbuild # application's default port EXPOSE 3000 A Dockerfile ? Yeah, here is a first step toward cloud computation under control. Packing our code and its dependencies into a container will allow us to ship and launch the application with a few reproducible commands. # download official redis image docker pull redis # cd to the root directory of the app and build the container docker build -t article/sample . # assuming we are logged in to hub.docker.com, upload the resulting image for future deployment docker push article/sample Enough for the preparation, time to actually run the code. Service Discovery The server code needs a connection to redis. We can't hardcode it because host and port are likely to change under different deployments. Fortunately The Twelve-Factor App provides us with an elegant solution. The twelve-factor app stores config in environment variables (often shortened to env vars or env). Env vars are easy to change between deploys without changing any code; Indeed, this strategy integrates smoothly with an infrastructure composed of containers. 
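This pattern is language-agnostic. Purely as an illustration (the snippet below is not part of the sample app), the same environment-driven configuration in Python would read the REDIS_HOST and REDIS_PORT variables with local defaults:

import os

# read connection details from the environment, falling back to local defaults
redis_host = os.environ.get("REDIS_HOST", "127.0.0.1")
redis_port = int(os.environ.get("REDIS_PORT", "6379"))

print("connecting to redis at {0}:{1}".format(redis_host, redis_port))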
docker run --detach --name redis redis # 7c5b7ff0b3f95e412fc7bee4677e1c5a22e9077d68ad19c48444d55d5f683f79 # fetch redis container virtual ip export REDIS_HOST=$(docker inspect -f '{{ .NetworkSettings.IPAddress }}' redis) # note : we don't specify REDIS_PORT as the redis container listens on the default port (6379) docker run -it --rm --name sample --env REDIS_HOST=$REDIS_HOST article/sample # > [email protected] start /usr/src/app # > node app.js # Example app listening at http://:::3000 In another terminal, we can check everything is working as expected. export SAMPLE_HOST=$(docker inspect -f '{{ .NetworkSettings.IPAddress }}' sample)) curl -X POST $SAMPLE_HOST:3000/test # {"set":{"isFulfilled":false,"isRejected":false}} curl -X GET $SAMPLE_HOST:3000/test # {"result":"0.5807915225159377"} We didn't precise any network informations but even so, containers can communicate. This method is widely used and projects like etcd or consul let us automate the whole process. Monitoring Performances can be a critical consideration for end-user experience or infrastructure costs. We should be able to identify bottlenecks or abnormal activities and once again, we will take advantage of containers and open source projects. Without modifying the running server, let's launch three new components to build a generic monitoring infrastructure. Influxdb is a fast time series database where we will store containers metrics. Since we properly defined the application into two single-purpose containers, it will give us an interesting overview of what's going on. # default parameters export INFLUXDB_PORT=8086 export INFLUXDB_USER=root export INFLUXDB_PASS=root export INFLUXDB_NAME=cadvisor # Start database backend docker run --detach --name influxdb --publish 8083:8083 --publish $INFLUXDB_PORT:8086 --expose 8090 --expose 8099 --env PRE_CREATE_DB=$INFLUXDB_NAME tutum/influxdb export INFLUXDB_HOST=$(docker inspect -f '{{ .NetworkSettings.IPAddress }}' influxdb) cadvisor Analyzes resource usage and performance characteristics of running containers. The command flags will instruct it how to use the database above to store metrics. docker run --detach --name cadvisor --volume=/var/run:/var/run:rw --volume=/sys:/sys:ro --volume=/var/lib/docker/:/var/lib/docker:ro --publish=8080:8080 google/cadvisor:latest --storage_driver=influxdb --storage_driver_user=$INFLUXDB_USER --storage_driver_password=$INFLUXDB_PASS --storage_driver_host=$INFLUXDB_HOST:$INFLUXDB_PORT --log_dir=/ # A live dashboard is available at $CADVISOR_HOST:8080/containers # We can also point the brower to $INFLUXDB_HOST:8083, with credentials above, to inspect containers data. # Query example: # > list series # > select time,memory_usage from stats where container_name='cadvisor' limit 1000 # More infos: https://github.com/google/cadvisor/blob/master/storage/influxdb/influxdb.go Grafana is a feature rich metrics dashboard and graph editor for Graphite, InfluxDB and OpenTSB. From its web interface, we will query the database and graph the metrics cadvisor collected and stored. docker run --detach --name grafana -p 8000:80 -e INFLUXDB_HOST=$INFLUXDB_HOST -e INFLUXDB_PORT=$INFLUXDB_PORT -e INFLUXDB_NAME=$INFLUXDB_NAME -e INFLUXDB_USER=$INFLUXDB_USER -e INFLUXDB_PASS=$INFLUXDB_PASS -e INFLUXDB_IS_GRAFANADB=true tutum/grafana # Get login infos generated docker logs grafana  Now we can head to localhost:8000 and build a custom dashboard to monitor the server. 
I won't repeat the comprehensive documentation, but here is a query example:

# note: cadvisor stores metrics in series named 'stats'
select difference(cpu_cumulative_usage) where container_name='cadvisor' group by time 60s

Grafana's autocompletion feature shows us what we can track: CPU, memory, and network usage, among other metrics. We all love screenshots and dashboards, so here is a final reward for our hard work.

Conclusion

Development best practices and a good understanding of powerful tools gave us a rigorous workflow to launch applications with confidence. To sum up:

Containers bundle code and requirements for flexible deployment and execution isolation.
Environment variables store third-party service information, giving developers a predictable and robust way to read it.
InfluxDB + cAdvisor + Grafana form a complete monitoring solution, independent of the project's implementation.

We fulfilled our expectations, but there is room for improvement. As mentioned, service discovery could be automated, and we also omitted how to manage logs. There are many discussions around this complex subject, and we can expect new improvements in our toolbox shortly.

About the author

Xavier Bruhiere is the CEO of Hive Tech. He contributes to many community projects, including Oculus Rift, Myo, Docker and Leap Motion. In his spare time he enjoys playing tennis, the violin and the guitar. You can reach him at @XavierBruhiere.

Synchronizing Tests

Packt
04 Nov 2015
9 min read
In this article by Unmesh Gundecha, author of Selenium Testing Tools Cookbook Second Edition, you will cover the following topics: Synchronizing a test with an implicit wait Synchronizing a test with an explicit wait Synchronizing a test with custom-expected conditions While building automated scripts for a complex web application using Selenium WebDriver, we need to ensure that the test flow is maintained for reliable test automation. When tests are run, the application may not always respond with the same speed. For example, it might take a few seconds for a progress bar to reach 100 percent, a status message to appear, a button to become enabled, and a window or pop-up message to open. You can handle these anticipated timing problems by synchronizing your test to ensure that Selenium WebDriver waits until your application is ready before performing the next step. There are several options that you can use to synchronize your test. In this article, we will see various features of Selenium WebDriver to implement synchronization in tests. (For more resources related to this topic, see here.) Synchronizing a test with an implicit wait The Selenium WebDriver provides an implicit wait for synchronizing tests. When an implicit wait is implemented in tests, if WebDriver cannot find an element in the Document Object Model (DOM), it will wait for a defined amount of time for the element to appear in the DOM. Once the specified wait time is over, it will try searching for the element once again. If the element is not found in specified time, it will throw NoSuchElement exception. In other terms, an implicit wait polls the DOM for a certain amount of time when trying to find an element or elements if they are not immediately available. The default setting is 0. Once set, the implicit wait is set for the life of the WebDriver object's instance. In this recipe, we will briefly explore the use of an implicit wait; however, it is recommended to avoid or minimize the use of an implicit wait. How to do it... Let's create a test on a demo AJAX-enabled application as follows: @Test public void testWithImplicitWait() { //Go to the Demo AjAX Application WebDriver driver = new FirefoxDriver(); driver.get("http://dl.dropbox.com/u/55228056/AjaxDemo.html"); //Set the Implicit Wait time Out to 10 Seconds driver.manage().timeouts().implicitlyWait(10, TimeUnit.SECONDS); try { //Get link for Page 4 and click on it WebElement page4button = driver.findElement(By.linkText("Page 4")); page4button.click(); //Get an element with id page4 and verify it's text WebElement message = driver.findElement(By.id("page4")); assertTrue(message.getText().contains("Nunc nibh tortor")); } catch (NoSuchElementException e) { fail("Element not found!!"); e.printStackTrace(); } finally { driver.quit(); } } How it works... The Selenium WebDriver provides the Timeouts interface for configuring the implicit wait. The Timeouts interface provides an implicitlyWait() method, which accepts the time the driver should wait when searching for an element. In this example, a test will wait for an element to appear in DOM for 10 seconds: driver.manage().timeouts().implicitlyWait(10, TimeUnit.SECONDS); Until the end of a test or an implicit wait is set back to 0, every time an element is searched using the findElement() method, the test will wait for 10 seconds for an element to appear. 
Using implicit wait may slow down tests when an application responds normally, as it will wait for each element appearing in the DOM and increase the overall execution time. Minimize or avoid using an implicit wait. Use Explicit wait, which provides more control when compared with an implicit wait. See also Synchronizing a test with an explicit wait Synchronizing a test with custom-expected conditions Synchronizing a test with an explicit wait The Selenium WebDriver provides an explicit wait for synchronizing tests, which provides a better way to wait over an implicit wait. Unlike an implicit wait, you can use predefined conditions or custom conditions or wait before proceeding further in the code. The Selenium WebDriver provides WebDriverWait and ExpectedConditions classes for implementing an explicit wait. The ExpectedConditions class provides a set of predefined conditions to wait before proceeding further in the code. The following table shows some common conditions that we frequently come across when automating web browsers supported by the ExpectedConditions class:   Predefined condition    Selenium method An element is visible and enabled elementToBeClickable(By locator) An element is selected elementToBeSelected(WebElement element) Presence of an element presenceOfElementLocated(By locator) Specific text present in an element textToBePresentInElement(By locator, java.lang.String text) Element value textToBePresentInElementValue(By locator, java.lang.String text) Title titleContains(java.lang.String title)  For more conditions, visit http://seleniumhq.github.io/selenium/docs/api/java/index.html. In this recipe, we will explore some of these conditions with the WebDriverWait class. How to do it... Let's implement a test that uses the ExpectedConditions.titleContains() method to implement an explicit wait as follows: @Test public void testExplicitWaitTitleContains() { //Go to the Google Home Page WebDriver driver = new FirefoxDriver(); driver.get("http://www.google.com"); //Enter a term to search and submit WebElement query = driver.findElement(By.name("q")); query.sendKeys("selenium"); query.click(); //Create Wait using WebDriverWait. //This will wait for 10 seconds for timeout before title is updated with search term //If title is updated in specified time limit test will move to the text step //instead of waiting for 10 seconds WebDriverWait wait = new WebDriverWait(driver, 10); wait.until(ExpectedConditions.titleContains("selenium")); //Verify Title assertTrue(driver.getTitle().toLowerCase().startsWith("selenium")); driver.quit(); } How it works... We can define an explicit wait for a set of common conditions using the ExpectedConditions class. First, we need to create an instance of the WebDriverWait class by passing the driver instance and timeout for a wait as follows: WebDriverWait wait = new WebDriverWait(driver, 10); Next, ExpectedCondition is passed to the wait.until() method as follows: wait.until(ExpectedConditions.titleContains("selenium")); The WebDriverWait object will call the ExpectedConditions class object every 500 milliseconds until it returns successfully. See also Synchronizing a test with an implicit wait Synchronizing a test with custom-expected conditions Synchronizing a test with custom-expected conditions With the explicit wait mechanism, we can also build custom-expected conditions along with common conditions using the ExpectedConditions class. 
This comes in handy when a wait cannot be handled with a common condition supported by the ExpectedConditions class. In this recipe, we will explore how to create a custom condition. How to do it... We will create a test that will create a wait until an element appears on the page using the ExpectedCondition class as follows: @Test public void testExplicitWait() { WebDriver driver = new FirefoxDriver(); driver.get("http://dl.dropbox.com/u/55228056/AjaxDemo.html"); try { WebElement page4button = driver.findElement(By.linkText("Page 4")); page4button.click(); WebElement message = new WebDriverWait(driver, 5) .until(new ExpectedCondition<WebElement>(){ public WebElement apply(WebDriver d) { return d.findElement(By.id("page4")); }}); assertTrue(message.getText().contains("Nunc nibh tortor")); } catch (NoSuchElementException e) { fail("Element not found!!"); e.printStackTrace(); } finally { driver.quit(); } } How it works... The Selenium WebDriver provides the ability to implement the custom ExpectedCondition interface along with the WebDriverWait class for creating a custom-wait condition, as needed by a test. In this example, we created a custom condition, which returns a WebElement object once the inner findElement() method locates the element within a specified timeout as follows: WebElement message = new WebDriverWait(driver, 5) .until(new ExpectedCondition<WebElement>(){ @Override public WebElement apply(WebDriver d) { return d.findElement(By.id("page4")); }}); There's more... A custom wait can be created in various ways. In the following section, we will explore some common examples for implementing a custom wait. Waiting for element's attribute value update Based on the events and actions performed, the value of an element's attribute might change at runtime. For example, a disabled textbox gets enabled based on the user's rights. A custom wait can be created on the attribute value of the element. In the following example, the ExpectedCondition waits for a Boolean return value, based on the attribute value of an element: new WebDriverWait(driver, 10).until(new ExpectedCondition<Boolean>() { public Boolean apply(WebDriver d) { return d.findElement(By.id("userName")).getAttribute("readonly").contains("true"); }}); Waiting for an element's visibility Developers hide or display elements based on the sequence of actions, user rights, and so on. The specific element might exist in the DOM, but are hidden from the user, and when the user performs a certain action, it appears on the page. A custom-wait condition can be created based on the element's visibility as follows: new WebDriverWait(driver, 10).until(new ExpectedCondition<Boolean>() { public Boolean apply(WebDriver d) { return d.findElement(By.id("page4")).isDisplayed(); }}); Waiting for DOM events The web application may be using a JavaScript framework such as jQuery for AJAX and content manipulation. For example, jQuery is used to load a big JSON file from the server asynchronously on the page. While jQuery is reading and processing this file, a test can check its status using the active attribute. 
A custom wait can be implemented by executing the JavaScript code and checking the return value as follows: new WebDriverWait(driver, 10).until(new ExpectedCondition<Boolean>() { public Boolean apply(WebDriver d) { JavascriptExecutor js = (JavascriptExecutor) d; return (Boolean)js.executeScript("return jQuery.active == 0"); }}); See also Synchronizing a test with an implicit wait Synchronizing a test with an explicit wait Summary In this article, you have learned how the Selenium WebDriver helps in maintaining a reliable automated test. Using the Selenium WebDriver, you also learned how you can synchronize a test using the implicit and the explicit wait methods. You also saw how to synchronize a test with custom-expected conditions. Resources for Article: Further resources on this subject: Javascript Execution With Selenium [article] Learning Selenium Testing Tools With Python [article] Cross-Browser Tests Using Selenium Webdriver [article]

Moving Spatial Data From One Format to Another

Packt
03 Nov 2015
29 min read
In this article by Michael Diener, author of the Python Geospatial Analysis Cookbook, we will cover the following topics: Converting a Shapefile to a PostGIS table using ogr2ogr Batch importing a folder of Shapefiles into PostGIS using ogr2ogr Batch exporting a list of tables from PostGIS to Shapefiles Converting an OpenStreetMap (OSM) XML to a Shapefile Converting a Shapefile (vector) to a GeoTiff (raster) Converting a GeoTiff (raster) to a Shapefile (vector) using GDAL (For more resources related to this topic, see here.) Introduction Geospatial data comes in hundreds of formats and massaging this data from one format to another is a simple task. The ability to convert data types, such as rasters or vectors, belongs to data wrangling tasks that are involved in geospatial analysis. Here is an example of a raster and vector dataset so you can see what I am talking about: Source: Michael Diener drawing The best practice is to run analysis functions or models on data stored in a common format, such as a Postgresql PostGIS database or a set of Shapefiles, in a common coordinate system. For example, running analysis on input data stored in multiple formats is also possible, but you can expect to find the devil in the details of your results if something goes wrong or your results are not what you expect. This article looks at some common data formats and demonstrates how to move these formats around from one to another with the most common tools. Converting a Shapefile to a PostGIS table using ogr2ogr The simplest way to transform data from one format to another is to directly use the ogr2ogr tool that comes with the installation of GDAL. This powerful tool can convert over 200 geospatial formats. In this solution, we will execute the ogr2ogr utility from within a Python script to execute generic vector data conversions. The python code is, therefore, used to execute this command-line tool and pass around variables that are needed to create your own scripts for data imports or exports. The use of this tool is also recommended if you are not really interested in coding too much and simply want to get the job done to move your data. A pure python solution is, of course, possible but it is definitely targeted more at developers (or python purists). Getting ready To run this script, you will need the GDAL utility application installed on your system. Windows users can visit OSGeo4W (http://trac.osgeo.org/osgeo4w) and download the 32-bit or 64-bit Windows installer as follows: Simply double-click on the installer to start it. Navigate to the bottommost option, Advanced Installation | Next. Click on Next to download from the Internet (this is the first default option). Click on Next to accept default location of path or change to your liking. Click on Next to accept the location of local saved downloads (default). Click on Next to accept the direct connection (default). Click on Next to select a default download site. Now, you should finally see the menu. Click on + to open the command-line utilities and you should see the following: Now, select gdal. The GDAL/OGR library and command line tools to install it. Click on Next to start downloading it, and then install it. For Ubuntu Linux users, use the following steps for installation: Execute this simple one-line command: $ sudo apt-get install gdal-bin This will get you up and running so that you can execute ogr2ogr directly from your terminal. Next, set up your Postgresql database using the PostGIS extension. 
First, we will create a new user to manage our new database and tables: Sudo su createuser  –U postgres –P pluto Enter a password for the new role. Enter the password again for the new role. Enter a password for postgres users since you're going to create a user with the help of the postgres user.The –P option prompts you to give the new user, called pluto, a password. For the following examples, our password is stars; I would recommend a much more secure password for your production database. Setting up your Postgresql database with the PostGIS extension in Windows is the same as setting it up in Ubuntu Linux. Perform the following steps to do this: Navigate to the c:Program FilesPostgreSQL9.3bin folder. Then, execute this command and follow the on-screen instructions as mentioned previously: Createuser.exe –U postgres –P pluto To create the database, we will use the command-line createdb command similar to the postgres user to create a database named py_geoan_cb. We will then assign the pluto user to be the database owner; here is the command to do this: $ sudo su createdb –O pluto –U postgres py_geoan_cb Windows users can visit the c:Program FilesPostgreSQL9.3bin and execute the createdb.exe command: createdb.exe –O pluto –U postgres py_geoan_cb Next, create the PostGIS extension for our newly created database: psql –U postgres -d py_geoan_cb -c "CREATE EXTENSION postgis;" Windows users can also execute psql from within the c:Program FilesPostgreSQL9.3bin folder: psql.exe –U postgres –d py_geoan_cb –c "CREATE EXTENSION postgis;" Lastly, create a schema called geodata to store the new spatial table. It is common to store spatial data in another schema outside the Postgresql default schema, public. Create the schema as follows: For Ubuntu Linux users: sudo -u postgres psql -d py_geoan_cb -c "CREATE SCHEMA geodata AUTHORIZATION pluto;" For Windows users: psql.exe –U postgres –d py_geoan_cb –c "CREATE SCHEMA geodata AUTHORIZATION pluto;" How to do it... Now let's get into the actual importing of our Shapefile into a PostGIS database that will automatically create a new table from our Shapefile: #!/usr/bin/env python # -*- coding: utf-8 -*- import subprocess # database options db_schema = "SCHEMA=geodata" overwrite_option = "OVERWRITE=YES" geom_type = "MULTILINESTRING" output_format = "PostgreSQL" # database connection string db_connection = """PG:host=localhost port=5432   user=pluto dbname=py_test password=stars""" # input shapefile input_shp = "../geodata/bikeways.shp" # call ogr2ogr from python subprocess.call(["ogr2ogr","-lco", db_schema, "-lco", overwrite_option, "-nlt", geom_type, "-f", output_format, db_connection,  input_shp]) Now we can call our script from the command line: $ python ch03-01_shp2pg.py How it works... We begin with importing the standard python module subprocess that will call the ogr2ogr command-line tool. Next, we'll set a range of variables that are used as input arguments and various options for ogr2ogr to execute. Starting with the SCHEMA=geodata Postgresql database, we'll set a nondefault database schema for the destination of our new table. It is best practice to store your spatial data tables in a separate schema outside the "public" schema, which is set as the default. This practice will make backups and restores much easier and keep your database better organized. Next, we'll create a overwrite_option variable that's set to "yes" so that we can overwrite any table with the same name when its created. 
This is helpful when you want to completely replace the table with new data, otherwise, it is recommended to use the -append option. We'll also specify the geometry type because, sometimes, ogr2ogr does not always guess the correct geometry type of our Shapefile so setting this value saves you any worry. Now, setting our output_format variable with the PostgreSQL keyword tells ogr2ogr that we want to output data into a Postgresql database. This is then followed by the db_connection variable, which specifies our database connection information. Do not forget that the database must already exist along with the "geodata" schema, otherwise, we will get an error. The last input_shp variable gives the full path to our Shapefile, including the .shp file ending. Now, we will call the subprocess module ,which will then call the ogr2ogr command-line tool and pass along the variable options required to run the tool. We'll pass this function an array of arguments, the first object in the array being the ogr2ogr command-line tool name. After this, we'll pass each option after another in the array to complete the call. Subprocess can be used to call any command-line tool directly. It takes a list of parameters that are separated by spaces. This passing of parameters is quite fussy, so make sure you follow along closely and don't add any extra spaces or commas. Last but not least, we need to execute our script from the command line to actually import our Shapefile by calling the python interpreter and passing the script. Then, head over to the PgAdmin Postgresql database viewer and see if it worked, or even better, open up Quantum GIS (www.qgis.org) and take a look at the newly created tables. See also If you would like to see the full list of options available with the ogr2ogr command, simple enter the following in the command line: $ ogr2ogr –help You will see the full list of options that are available. Also, visit http://gdal.org/ogr2ogr.html to read the required documentation. Batch importing a folder of Shapefiles into PostGIS using ogr2ogr We would like to extend our last script to loop over a folder full of Shapefiles and import them into PostGIS. Most importing tasks involve more than one file to import so this makes it a very practical task. How to do it... The following steps will batch import a folder of Shapefiles into PostGIS using ogr2ogr: Our script will reuse the previous code in the form of a function so that we can batch process a list of Shapefiles to import into the Postgresql PostGIS database. 
We will create our list of Shapefiles from a single folder for the sake of simplicity: #!/usr/bin/env python # -*- coding: utf-8 -*- import subprocess import os import ogr def discover_geom_name(ogr_type):     """     :param ogr_type: ogr GetGeomType()     :return: string geometry type name     """     return {ogr.wkbUnknown            : "UNKNOWN",             ogr.wkbPoint              : "POINT",             ogr.wkbLineString         : "LINESTRING",             ogr.wkbPolygon            : "POLYGON",             ogr.wkbMultiPoint         : "MULTIPOINT",             ogr.wkbMultiLineString    : "MULTILINESTRING",             ogr.wkbMultiPolygon       : "MULTIPOLYGON",             ogr.wkbGeometryCollection : "GEOMETRYCOLLECTION",             ogr.wkbNone               : "NONE",             ogr.wkbLinearRing         : "LINEARRING"}.get(ogr_type) def run_shp2pg(input_shp):     """     input_shp is full path to shapefile including file ending     usage:  run_shp2pg('/home/geodata/myshape.shp')     """     db_schema = "SCHEMA=geodata"     db_connection = """PG:host=localhost port=5432                     user=pluto dbname=py_geoan_cb password=stars"""     output_format = "PostgreSQL"     overwrite_option = "OVERWRITE=YES"     shp_dataset = shp_driver.Open(input_shp)     layer = shp_dataset.GetLayer(0)     geometry_type = layer.GetLayerDefn().GetGeomType()     geometry_name = discover_geom_name(geometry_type)     print (geometry_name)     subprocess.call(["ogr2ogr", "-lco", db_schema, "-lco", overwrite_option,                      "-nlt", geometry_name, "-skipfailures",                      "-f", output_format, db_connection, input_shp]) # directory full of shapefiles shapefile_dir = os.path.realpath('../geodata') # define the ogr spatial driver type shp_driver = ogr.GetDriverByName('ESRI Shapefile') # empty list to hold names of all shapefils in directory shapefile_list = [] for shp_file in os.listdir(shapefile_dir):     if shp_file.endswith(".shp"):         # apped join path to file name to outpout "../geodata/myshape.shp"         full_shapefile_path = os.path.join(shapefile_dir, shp_file)         shapefile_list.append(full_shapefile_path) # loop over list of Shapefiles running our import function for each_shapefile in shapefile_list:     run_shp2pg(each_shapefile)  print ("importing Shapefile: " + each_shapefile) Now, we can simply run our new script from the command line once again: $ python ch03-02_batch_shp2pg.py How it works... Here, we will reuse our code from the previous script but have converted it into a python function called run_shp2pg (input_shp), which takes exactly one argument to complete the path to the Shapefile we want to import. The input argument must include a Shapefile ending with .shp. We have a helper function that will get the geometry type as a string by reading in the Shapefile feature layer and outputting the geometry type so that the ogr commands know what to expect. This does not always work and some errors can occur. The skipfailures option will plow over any errors that are thrown during insert and will still populate our tables. To begin with, we need to define the folder that contains all our Shapefiles to be imported. Next up, we'll create an empty list object called shapefile_list that will hold a list of all the Shapefiles we want to import. The first for loop is used to get the list of all the Shapefiles in the directory specified using the standard python os.listdir() function. 
We do not want all the files in this folder; we only want files with ending with .shp, hence, the if statement will evaluate to True if the file ends with .shp. Once the .shp file is found, we need to append the file path to the file name to create a single string that holds the path plus the Shapefile name, and this is our variable called full_shapefile_path. The final part of this is to add each new file with its attached path to our shapefile_list list object. So, we now have our final list to loop through. It is time to loop through each Shapefile in our new list and run our run_shp2pg(input_shp) function for each Shapefile in the list by importing it into our Postgresql PostGIS database. See also If you have a lot of Shapefiles, and by this I mean mean hundred or more Shapefiles, performance will be one consideration and will, therefore, indicate that there are a lot of machines with free resources. Batch exporting a list of tables from PostGIS to Shapefiles We will now change directions and take a look at how we can batch export a list of tables from our PostGIS database into a folder of Shapefiles. We'll again use the ogr2ogr command-line tool from within a python script so that you can include it in your application programming workflow. Near the end, you can also see how this all of works in one single command line. How to do it... The script will fire the ogr2ogr command and loop over a list of tables to export the Shapefile format into an existing folder. So, let's take a look at how to do this using the following code: #!/usr/bin/env python # -*- coding: utf-8 -*- # import subprocess import os # folder to hold output Shapefiles destination_dir = os.path.realpath('../geodata/temp') # list of postGIS tables postgis_tables_list = ["bikeways", "highest_mountains"] # database connection parameters db_connection = """PG:host=localhost port=5432 user=pluto         dbname=py_geoan_cb password=stars active_schema=geodata""" output_format = "ESRI Shapefile" # check if destination directory exists if not os.path.isdir(destination_dir):     os.mkdir(destination_dir)     for table in postgis_tables_list:         subprocess.call(["ogr2ogr", "-f", output_format, destination_dir,                          db_connection, table])         print("running ogr2ogr on table: " + table) else:     print("oh no your destination directory " + destination_dir +           " already exist please remove it then run again") # commandline call without using python will look like this # ogr2ogr -f "ESRI Shapefile" mydatadump # PG:"host=myhost user=myloginname dbname=mydbname password=mypassword"   neighborhood parcels Now, we'll call our script from the command line as follows: $ python ch03-03_batch_postgis2shp.py How it works... Beginning with a simple import of our subprocess and os modules, we'll immediately define our destination directory where we want to store the exported Shapefiles. This variable is followed by the list of table names that we want to export. This list can only include files located in the same Postgresql schema. The schema is defined as the active_schema so that ogr2ogr knows where to find the tables to be exported. Once again, we'll define the output format as ESRI Shapefile. Now, we'll check whether the destination folder exists. If it does, continue and call our loop. Then, loop through the list of tables stored in our postgis_tables_list variable. If the destination folder does not exist, you will see an error printed on the screen. There's more... 
If you are programming an application, then executing the ogr2ogr command from inside your script is definitely quick and easy. On the other hand, for a one-off job, simply executing the command-line tool is what you want when you export your list of Shapefiles. To do this in a one-liner, please take a look at the following information box. A one line example of calling the ogr2ogr batch of the PostGIS table to Shapefiles is shown here if you simply want to execute this once and not in a scripting environment: ogr2ogr -f "ESRI Shapefile" /home/ch03/geodata/temp PG:"host=localhost user=pluto dbname=py_geoan_cb password=stars" bikeways highest_mountains The list of tables you want to export is located as a list separated by spaces. The destination location of the exported Shapefiles is ../geodata/temp. Note this this /temp directory must exist. Converting an OpenStreetMap (OSM) XML to a Shapefile OpenStreetMap (OSM) has a wealth of free data, but to use it with most other applications, we need to convert it to another format, such as a Shapefile or Postgresql PostGIS database. This recipe will use the ogr2ogr tool to do the conversion for us within a python script. The benefit derived here is simplicity. Getting ready To get started, you will need to download the OSM data at http://www.openstreetmap.org/export#map=17/37.80721/-122.47305 and saving the file (.osm) to your /ch03/geodata directory. The download button is located on the bar on the left-hand side, and when pressed, it should immediately start the download (refer to the following image). The area we are testing is from San Francisco, just before the golden gate bridge. If you choose to download another area from OSM feel free to, but make sure that you take a small area like preceding example link. If you select a larger area, the OSM web tool will give you a warning and disable the download button. The reason is simple: if the dataset is very large, it will most likely be better suited for another tool, such as osm2pgsql (http://wiki.openstreetmap.org/wiki/Osm2pgsql), for you conversion. If you need to get OSM data for a large area and want to export to Shapefile, it would be advisable to use another tool, such as osm2pgsql, which will first import your data to a Postgresql database. Then, export from the PostGIS database to Shapefile using the pgsql2shp tool.  A python tool to import OSM data into a PostGIS database is also available and is called imposm (located here at http://imposm.org/). Version 2 of it is written in python and version 3 is written in the "go" programming language if you want to give it a try. How to do it... 
Using the subprocess module, we will execute ogr2ogr to convert the OSM data that we downloaded into a new Shapefile: #!/usr/bin/env python # -*- coding: utf-8 -*- # convert / import osm xml .osm file into a Shapefile import subprocess import os import shutil # specify output format output_format = "ESRI Shapefile" # complete path to input OSM xml file .osm input_osm = '../geodata/OSM_san_francisco_westbluff.osm' # Windows users can uncomment these two lines if needed # ogr2ogr = r"c:/OSGeo4W/bin/ogr2ogr.exe" # ogr_info = r"c:/OSGeo4W/bin/ogrinfo.exe" # view what geometry types are available in our OSM file subprocess.call([ogr_info, input_osm]) destination_dir = os.path.realpath('../geodata/temp') if os.path.isdir(destination_dir):     # remove output folder if it exists     shutil.rmtree(destination_dir)     print("removing existing directory : " + destination_dir)     # create new output folder     os.mkdir(destination_dir)     print("creating new directory : " + destination_dir)     # list of geometry types to convert to Shapefile     geom_types = ["lines", "points", "multilinestrings", "multipolygons"]     # create a new Shapefile for each geometry type     for g_type in geom_types:         subprocess.call([ogr2ogr,                "-skipfailures", "-f", output_format,                  destination_dir, input_osm,                  "layer", g_type,                  "--config","OSM_USE_CUSTOM_INDEXING", "NO"])         print("done creating " + g_type) # if you like to export to SPATIALITE from .osm # subprocess.call([ogr2ogr, "-skipfailures", "-f", #         "SQLITE", "-dsco", "SPATIALITE=YES", #         "my2.sqlite", input_osm]) Now we can call our script from the command line: $ python ch03-04_osm2shp.py Go and have a look at your ../geodata folder to see the newly created Shapefiles, and try to open them up in Quantum GIS, which is a free GIS software (www.qgis.org) How it works... This script should be clear as we are using the subprocess module call to fire our ogr2ogr command-line tool. Specify our OSM dataset as an input file, including the full path to the file. The Shapefile name is not supplied as ogr2ogr and will output a set of Shapefiles, one for each geometry shape according to the geometry type it finds inside the OSM file. We only need to specify the name of the folder where we want ogr2ogr to export the Shapefiles to, automatically creating the folder if it does not exist. Windows users: if you do not have your ogr2ogr tool mapped to your environment variables, you can simply uncomment at lines 16 and 17 in the preceding code and replace the path shown with the path on your machine to the Windows executables. The first subprocess call prints out the screen that the geometry types have found inside the OSM file. This is helpful in most cases to help you identify what is available. Shapefiles can only support one geometry type per file, and this is why ogr2ogr outputs a folder full of Shapefiles, each one representing a separate geometry type. Lastly, we'll call subprocess to execute ogr2ogr, passing in the output "ESRI Shapefile" file type, output folder, and the name of the OSM dataset. Converting a Shapefile (vector) to a GeoTiff (raster) Moving data from format to format also includes moving from vector to raster or the other way around. In this recipe, we move from a vector (Shapefile) to a raster (GeoTiff) with the python gdal and ogr modules. 
Converting a Shapefile (vector) to a GeoTiff (raster)

Moving data from format to format also includes moving from vector to raster or the other way around. In this recipe, we move from a vector (Shapefile) to a raster (GeoTiff) with the python gdal and ogr modules.

Getting ready
We need to be inside our virtual environment again, so fire it up so that we can access our gdal and ogr python modules. As usual, enter your python virtual environment with the workon pygeoan_cb command:
$ source venvs/pygeoan_cb/bin/activate

How to do it...
Let's dive in and convert our golf course polygon Shapefile into a GeoTiff; here is the code to do this:
Import the ogr and gdal libraries, and then define the output pixel size along with the value that will be assigned to null:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
from osgeo import ogr
from osgeo import gdal

# set pixel size
pixel_size = 1
no_data_value = -9999

Set up the input Shapefile that we want to convert alongside the new GeoTiff raster that will be created when the script is executed:

# Shapefile input name
# input projection must be in a cartesian system in meters
# input wgs 84 or EPSG:4326 will NOT work!!!
input_shp = r'../geodata/ply_golfcourse-strasslach3857.shp'

# TIF raster file to be created
output_raster = r'../geodata/ply_golfcourse-strasslach.tif'

Now we need to create the input Shapefile object, get the layer information, and finally set the extent values:

# open the data source, get the layer object,
# and assign the extent coordinates
open_shp = ogr.Open(input_shp)
shp_layer = open_shp.GetLayer()
x_min, x_max, y_min, y_max = shp_layer.GetExtent()

Here, we need to calculate the resolution as a distance-to-pixel value:

# calculate raster resolution
x_res = int((x_max - x_min) / pixel_size)
y_res = int((y_max - y_min) / pixel_size)

Our new raster type is a GeoTiff, so we must explicitly tell gdal to get this driver. The driver is then able to create a new GeoTiff by passing in the filename of the new raster we want to create, the x direction resolution, followed by the y direction resolution, then the number of bands (in this case, 1), and lastly, the raster type, gdal.GDT_Byte:

# set the image type for export
image_type = 'GTiff'
driver = gdal.GetDriverByName(image_type)

new_raster = driver.Create(output_raster, x_res, y_res, 1, gdal.GDT_Byte)
new_raster.SetGeoTransform((x_min, pixel_size, 0, y_max, 0, -pixel_size))

Now we can access the new raster band and assign the no data value and the inner data values for the new raster. All the inner values will receive a value of 255, as set in the burn_values argument:

# get the raster band we want to export to
raster_band = new_raster.GetRasterBand(1)

# assign the no data value to empty cells
raster_band.SetNoDataValue(no_data_value)

# run vector to raster on the new raster with the input Shapefile
gdal.RasterizeLayer(new_raster, [1], shp_layer, burn_values=[255])

Here we go! Let's run the script to create our new raster:
$ python ch03-05_shp2raster.py
You can inspect the resulting raster by opening it in QGIS (http://www.qgis.org).
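If you do not have QGIS at hand, a quick sanity check is also possible from Python. The following sketch simply calls the gdalinfo command-line tool that ships with GDAL (assuming it is on your PATH) and then reads the basic raster properties with the gdal module:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Sketch: sanity-check the freshly created GeoTiff without opening QGIS.
import subprocess
from osgeo import gdal

output_raster = '../geodata/ply_golfcourse-strasslach.tif'

# print size, projection, and extent with the GDAL command-line tool
subprocess.call(["gdalinfo", output_raster])

# or read the basics directly with the gdal module
dataset = gdal.Open(output_raster)
print("size: {0} x {1} pixels, {2} band(s)".format(
    dataset.RasterXSize, dataset.RasterYSize, dataset.RasterCount))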
How it works...
There are several steps involved in this code, so please follow along, as some points could lead to trouble if you are not sure what values to input. We start with the import of the gdal and ogr modules, respectively, since they will do the work for us by inputting a Shapefile (vector) and outputting a GeoTiff (raster).
The pixel_size variable is very important since it will determine the size of the new raster we create. In this example, we only have two polygons, so we set pixel_size = 1 to keep a fine border. If you have many polygons stretching across the globe in one Shapefile, it is wiser to set this value to 25 or more; otherwise, you could end up with a 10 GB raster and your machine will run all night long! The no_data_value parameter is needed to tell GDAL what value to set in the empty space around our input polygons, and we set it to -9999 so that it is easily identified.
Next, we simply set the input Shapefile, stored in the EPSG:3857 Web Mercator projection, and the output GeoTiff. Check to make sure that you change the file names accordingly if you want to use some other dataset.
We start by working with the OGR module to open the Shapefile and retrieve its layer information and extent. The extent is important because it is used to calculate the size of the output raster; the width and height values must be integers, represented by the x_res and y_res variables.
Note that the projection of your Shapefile must be in meters, not degrees. This is very important since this will NOT work with EPSG:4326 (WGS 84), for example. The reason is that the coordinate units are LAT/LON: WGS 84 is not a flat plane projection and cannot be drawn as is. Our x_res and y_res values would evaluate to 0, since we cannot get a real ratio using degrees. This is a result of us not being able to simply subtract coordinate x from coordinate y, because the units are in degrees and not in a flat plane meter projection.
Now, moving on to the raster setup, we define the type of raster we want to export as a GTiff. Then, we can get the correct GDAL driver for that raster type. Once the raster type is set, we can create a new, empty raster dataset, passing in the raster file name, the width and height of the raster in pixels, the number of raster bands, and finally, the type of raster in GDAL terms, that is, gdal.GDT_Byte. These five parameters are mandatory to create a new raster.
Next, we call SetGeoTransform, which handles transforming between the pixel/line raster space and the projection coordinate space. We then activate band 1, as it is the only band we have in our raster, and assign the no data value for all the empty space around the polygon. The final step is to call the gdal.RasterizeLayer() function and pass in our new raster, the band, the Shapefile layer, and the value to assign to the inside of our raster. The value of all the pixels inside our polygon will be 255.

See also
If you are interested, you can visit the command-line tool gdal_rasterize at http://www.gdal.org/gdal_rasterize.html. You can run it straight from the command line.
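As noted in the preceding explanation, the input Shapefile has to be in a metric projection such as EPSG:3857. If your data is still stored in EPSG:4326 (WGS 84), one way to reproject it before running the recipe is a quick ogr2ogr call. The following is only a sketch with placeholder file names, assuming ogr2ogr is on your PATH:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Sketch: reproject a WGS 84 Shapefile to EPSG:3857 (Web Mercator)
# so that its coordinate units are meters before rasterizing it.
import subprocess

input_wgs84 = "../geodata/my_polygons_wgs84.shp"   # placeholder input file
output_3857 = "../geodata/my_polygons_3857.shp"    # placeholder output file

subprocess.call(["ogr2ogr",
                 "-s_srs", "EPSG:4326",   # source projection
                 "-t_srs", "EPSG:3857",   # target projection in meters
                 "-f", "ESRI Shapefile",
                 output_3857, input_wgs84])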
Converting a raster (GeoTiff) to a vector (Shapefile) using GDAL

We have now looked at how we can go from vector to raster, so it is time to go from raster to vector. This method is much more common because most of our vector data is derived from remotely sensed data, such as satellite images, orthophotos, or some other remote sensing dataset, such as lidar.

Getting ready
As usual, please enter your python virtual environment with the help of the workon pygeoan_cb command:
$ source venvs/pygeoan_cb/bin/activate

How to do it...
Now let's begin:
Import the ogr and gdal modules. Go straight ahead and open the raster that we want to convert by passing it the file name on disk. Then, get the raster band:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
from osgeo import ogr
from osgeo import gdal

# get the raster datasource
open_image = gdal.Open("../geodata/cadaster_borders-2tone-black-white.png")
input_band = open_image.GetRasterBand(3)

Set up the output vector file as a Shapefile with output_shp, and then get the Shapefile driver. Now we can create the output from our driver and create the layer:

# create the output datasource
output_shp = "../geodata/cadaster_raster"
shp_driver = ogr.GetDriverByName("ESRI Shapefile")

# create the output file name
output_shapefile = shp_driver.CreateDataSource(output_shp + ".shp")
new_shapefile = output_shapefile.CreateLayer(output_shp, srs=None)

Our final step is to run the gdal.Polygonize function, which does the heavy lifting by converting our raster to vector:

gdal.Polygonize(input_band, None, new_shapefile, -1, [], callback=None)
new_shapefile.SyncToDisk()

Execute the new script:
$ python ch03-06_raster2shp.py

How it works...
Working with ogr and gdal is similar in all our recipes; we must define the inputs and get the appropriate file driver to open the files. The GDAL library is very powerful: in only one line of code, we can convert a raster to a vector with the gdal.Polygonize function. All the preceding code is simply setup code to define which format we want to work with, so that we can set up the appropriate drivers to input and output our new files.

Summary
In this article, we covered converting a Shapefile to a PostGIS table using ogr2ogr, batch importing a folder of Shapefiles into PostGIS using ogr2ogr, batch exporting a list of tables from PostGIS to Shapefiles, converting an OpenStreetMap (OSM) XML to a Shapefile, converting a Shapefile (vector) to a GeoTiff (raster), and converting a GeoTiff (raster) to a Shapefile (vector) using GDAL.

Resources for Article:
Further resources on this subject:
The Essentials of Working with Python Collections [article]
Symbolizers [article]
Preparing to Build Your Own GIS Application [article]
HTML5 APIs

In this article by Dmitry Sheiko, author of the book JavaScript Unlocked, we will create our first web component. (For more resources related to this topic, see here.)

Creating the first web component
You might be familiar with the HTML5 video element (http://www.w3.org/TR/html5/embedded-content-0.html#the-video-element). By placing a single element in your HTML, you get a widget that plays a video. This element accepts a number of attributes to set up the player, and if you want to enhance it, you can use its public API and subscribe listeners to its events (http://www.w3.org/2010/05/video/mediaevents.html). So, we reuse this element whenever we need a player and only customize it for a project-relevant look and feel.
If only we had enough of these elements to pick from every time we needed a widget on a page. However, it is not realistic to include every widget we may ever need in the HTML specification. What we do have is the API to create custom elements, such as video. We can define an element, package its compounds (JavaScript, HTML, CSS, images, and so on), and then just link it from the consuming HTML. In other words, we can create an independent and reusable web component, which we then use by placing the corresponding custom element (<my-widget />) in our HTML. We can restyle the element, and if needed, we can utilize the element API and events.
For example, if you need a date picker, you can take an existing web component, let's say the one available at http://component.kitchen/components/x-tag/datepicker. All that we have to do is download the component sources (for example, using the Bower package manager) and link to the component from our HTML code:

<link rel="import" href="bower_components/x-tag-datepicker/src/datepicker.js">

Declare the component in the HTML code:

<x-datepicker name="2012-02-02"></x-datepicker>

This is supposed to go smoothly in the latest versions of Chrome, but it probably won't work in other browsers. Running a web component requires a number of new technologies to be unlocked in the client browser, such as Custom Elements, HTML Imports, Shadow DOM, and templates (including JavaScript templates). The Custom Elements API allows us to define new HTML elements along with their behavior and properties. The Shadow DOM encapsulates the DOM subtree required by a custom element. And support for HTML Imports means that, given a link, the user agent enables a web component by including its HTML on the page. We can use a polyfill (http://webcomponents.org/) to ensure support for all of the required technologies in all the major browsers:

<script src="./bower_components/webcomponentsjs/webcomponents.min.js"></script>

Do you fancy writing your own web components? Let's do it. Our component acts similar to HTML's details/summary: when one clicks on the summary, the details show up. So we create x-details.html, where we put the component styles and the JavaScript with the component API:
So we create x-details.html, where we put component styles and JavaScript with component API: x-details.html <style> .x-details-summary { font-weight: bold; cursor: pointer; } .x-details-details { transition: opacity 0.2s ease-in-out, transform 0.2s ease-in-out; transform-origin: top left; } .x-details-hidden { opacity: 0; transform: scaleY(0); } </style> <script> "use strict"; /** * Object constructor representing x-details element * @param {Node} el */ var DetailsView = function( el ){ this.el = el; this.initialize(); }, // Creates an object based in the HTML Element prototype element = Object.create( HTMLElement.prototype ); /** @lend DetailsView.prototype */ Object.assign( DetailsView.prototype, { /** * @constracts DetailsView */ initialize: function(){ this.summary = this.renderSummary(); this.details = this.renderDetails(); this.summary.addEventListener( "click", this.onClick.bind( this ), false ); this.el.textContent = ""; this.el.appendChild( this.summary ); this.el.appendChild( this.details ); }, /** * Render summary element */ renderSummary: function(){ var div = document.createElement( "a" ); div.className = "x-details-summary"; div.textContent = this.el.dataset.summary; return div; }, /** * Render details element */ renderDetails: function(){ var div = document.createElement( "div" ); div.className = "x-details-details x-details-hidden"; div.textContent = this.el.textContent; return div; }, /** * Handle summary on click * @param {Event} e */ onClick: function( e ){ e.preventDefault(); if ( this.details.classList.contains( "x-details-hidden" ) ) { return this.open(); } this.close(); }, /** * Open details */ open: function(){ this.details.classList.toggle( "x-details-hidden", false ); }, /** * Close details */ close: function(){ this.details.classList.toggle( "x-details-hidden", true ); } }); // Fires when an instance of the element is created element.createdCallback = function() { this.detailsView = new DetailsView( this ); }; // Expose method open element.open = function(){ this.detailsView.open(); }; // Expose method close element.close = function(){ this.detailsView.close(); }; // Register the custom element document.registerElement( "x-details", { prototype: element }); </script> Further in JavaScript code, we create an element based on a generic HTML element (Object.create( HTMLElement.prototype )). Here we could inherit from a complex element (for example, video) if needed. We register a x-details custom element using the earlier one created as prototype. With element.createdCallback, we subscribe a handler that will be called when a custom element created. Here we attach our view to the element to enhance it with the functionality that we intend for it. Now we can use the component in HTML, as follows: <!DOCTYPE html> <html> <head> <title>X-DETAILS</title> <!-- Importing Web Component's Polyfill --> <!-- uncomment for non-Chrome browsers script src="./bower_components/webcomponentsjs/webcomponents.min.js"></script--> <!-- Importing Custom Elements --> <link rel="import" href="./x-details.html"> </head> <body> <x-details data-summary="Click me"> Nunc iaculis ac erat eu porttitor. Curabitur facilisis ligula et urna egestas mollis. Aliquam eget consequat tellus. Sed ullamcorper ante est. In tortor lectus, ultrices vel ipsum eget, ultricies facilisis nisl. Suspendisse porttitor blandit arcu et imperdiet. </x-details> </body> </html> Summary This article covered basically how we can create our own custom advanced elements that can be easily reused, restyled, and enhanced. 
The assets required to render such elements (HTML, CSS, JavaScript, and images) are bundled as web components. So, we can now literally build the web from components, similar to how buildings are made from bricks.

Resources for Article:
Further resources on this subject:
An Introduction to Kibana [article]
Working On Your Bot [article]
Icons [article]

Interactive Documents

This article by Julian Hillebrand and Maximilian H. Nierhoff, authors of the book Mastering RStudio for R Development, covers the following topics:
The two main ways to create interactive R Markdown documents
Creating R Markdown and Shiny documents and presentations
Using the ggvis package with R Markdown
Embedding different types of interactive charts in documents
Deploying interactive R Markdown documents
(For more resources related to this topic, see here.)

Creating interactive documents with R Markdown
In this article, we want to focus on the opportunities to create interactive documents with R Markdown and RStudio. This is, of course, particularly interesting for the readers of a document, since it enables them to interact with the document by changing chart types, parameters, values, or other similar things. In principle, there are two ways to make an R Markdown document interactive: firstly, you can use the Shiny web application framework of RStudio, and secondly, there is the possibility of incorporating various interactive chart types by using the corresponding packages.

Using R Markdown and Shiny
Besides building complete web applications, there is also the possibility of integrating entire Shiny applications into R Markdown documents and presentations. Since we have already learned all the basic functions of R Markdown, and the use and logic of Shiny, we will focus in the following lines on integrating a simple Shiny app into an R Markdown file. In order for Shiny and R Markdown to work together, the argument runtime: shiny must be added to the YAML header of the file.
Of course, the RStudio IDE offers a quick way to create a new Shiny document or presentation. Click on the new file icon, choose R Markdown, and in the popup window, select Shiny from the left-hand side menu. In the Shiny menu, you can decide whether you want to start with the Shiny Document option or the Shiny Presentation option.

Shiny Document
After choosing the Shiny Document option, a prefilled .Rmd file opens. It differs from the familiar R Markdown interface in that there is a Run Document button instead of the knit button and icon. The prefilled .Rmd file produces an R Markdown document with a working, interactive Shiny application. You can change the number of bins in the plot and also adjust the bandwidth. All these changes get rendered in real time, directly in your document.

Shiny Presentation
Likewise, when you click on Shiny Presentation in the selection menu, a prefilled .Rmd file opens. Because it is a presentation, the output format is changed to ioslides_presentation in the YAML header, and the button in the code pane is now called Run Presentation. Otherwise, a Shiny Presentation looks just like the normal R Markdown presentations. The Shiny app gets embedded in a slide, and you can again interact with the underlying data of the application.

Disassembling a Shiny R Markdown document
Of course, the question arises: how is it possible to embed a whole Shiny application into an R Markdown document without the two usual basic files, ui.R and server.R? In fact, the rmarkdown package creates an invisible server.R file by extracting the R code from the code chunks. Reactive elements get placed into the index.html file of the HTML output, while the whole R Markdown document acts as the ui.R file.

Embedding interactive charts into R Markdown
The next way is to embed interactive chart types into R Markdown documents by using various R packages that enable us to create interactive charts.
Some of these packages are as follows:
ggvis
rCharts
googleVis
dygraphs
Therefore, we will not introduce them again, but will introduce some more packages that enable us to build interactive charts. They are:
threejs
networkD3
metricsgraphics
plotly
Please keep in mind that the interactivity logically only works with the HTML output of R Markdown.

Using ggvis for interactive R Markdown documents
Broadly speaking, ggvis is the successor of the well-known graphics package ggplot2. The interactivity options of ggvis, which are based on the reactive programming model of the Shiny framework, are also useful for creating interactive R Markdown documents. To create an interactive R Markdown document with ggvis, you need to click on the new file icon, then on R Markdown..., choose Shiny in the left menu of the new window, and finally, click on OK to create the document. As mentioned before, since ggvis uses the reactive model of Shiny, we need to create the R Markdown document this way. If you want to include an interactive ggvis plot within a normal R Markdown file, make sure to include the runtime: shiny argument in the YAML header.
As shown, readers of this R Markdown document can easily adjust the bandwidth, and also the kernel model. The interactive controls are created with the input_ functions. In our example, we used the controls input_slider() and input_select(); some of the other controls are input_checkbox(), input_numeric(), and so on. These controls have different arguments depending on the type of input. For both controls in our example, we used the label argument, which is just a text label shown next to the controls. Other arguments are id (a unique identifier for the assigned control) and map (a function that remaps the output).

Summary
In this article, we have learned the two main ways to create interactive R Markdown documents. On the one hand, there is the versatile, usable Shiny framework. This includes the inbuilt Shiny document and presentation options in RStudio, and also the ggvis package, which takes advantage of the Shiny framework to build its interactivity. On the other hand, we introduced several already known, and also some new, R packages that make it possible to create several different types of interactive charts. Most of them achieve this by binding R to existing JavaScript libraries.

Resources for Article:
Further resources on this subject:
Jenkins Continuous Integration [article]
Aspects of Data Manipulation in R [article]
Find Friends on Facebook [article]