
How-To Tutorials - Web Development

1797 Articles

Smart Features to Improve Your Efficiency

Packt
15 Sep 2015
11 min read
In this article by Denis Patin and Stefan Rosca, authors of the book WebStorm Essentials, we are going to deal with a number of really smart features that will enable you to fundamentally change your approach to web development and gain maximum benefit from WebStorm. We are going to study the following in this article:

- On-the-fly code analysis
- Smart code features
- The multiselect feature
- The refactoring facility

On-the-fly code analysis

WebStorm performs static code analysis on your code on the fly. The editor checks the code based on the language used and the rules you specify, and highlights warnings and errors as you type. This is a very powerful feature: it means you don't need an external linter, and it catches most errors quickly, making a dynamic and complex language like JavaScript more predictable and easier to use.

Runtime errors and static errors (such as syntax or performance issues) are two different things. To investigate the first kind, you need tests or a debugger, and these have little to do with the IDE itself (although when such facilities are integrated into the IDE, the synergy is better). You can examine the second kind of error the same way, but is that convenient? Just imagine having to run your tests after writing each new line of code. Won't it be more efficient to use something that keeps an eye on and analyzes each word being typed, notifying you about probable performance issues and bugs, code style and workflow issues, and various validation issues, and warning you of dead code and other likely execution problems before the code is ever run, to say nothing of reporting inadvertent misprints? WebStorm is the best fit for this. It performs a deep analysis of each line and each word in the code. Moreover, you needn't interrupt your development process while WebStorm scans your code; the analysis is performed on the fly, hence the name.

WebStorm also enables you to get a full inspection report on demand. To get it, go to the menu: Code | Inspect Code. This pops up the Specify Inspection Scope dialog, where you can define what exactly you would like to inspect; then click OK. Depending on what is selected and how large it is, you may need to wait a little for the process to finish, after which you will see the detailed results where the Terminal window is usually located. You can expand all the items if needed. To the right of the inspection result list is an explanation window. To jump to an erroneous line of code, simply click on the corresponding item, and you will be taken to that line.

Besides simply indicating where an issue is located, WebStorm unequivocally suggests ways to eliminate it. You often needn't make any changes yourself: WebStorm already has quick fixes, which you just click on to have them instantly inserted into the code.
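To give a feel for the kind of problems the on-the-fly analysis reports, here is a small hypothetical snippet (our own illustration, not from the book) with the sort of issues WebStorm typically highlights as you type:

```typescript
// Issues that static analysis flags without ever running the code.
function calculateTotal(prices: number[]): number {
    let total = 0;
    let unused = 42;            // flagged: 'unused' is never used
    for (const price of prices) {
        total += price;
    }
    return total;
    console.log(total);         // flagged: unreachable code after 'return'
}

console.log(calculateTotal([10, 20, 30])); // 60
// console.log(calculateTotl([1]));        // if uncommented: misprinted name flagged
```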
Smart code features

Being an Integrated Development Environment (IDE) that aims to be intelligent, WebStorm provides a really powerful pack of features with which you can greatly improve your efficiency and save a lot of time. One of the most useful features is code completion. WebStorm continually analyzes and processes the code of the whole project and smartly suggests the pieces of code appropriate in the current context; alongside the method names, you can even see where those methods are used.

Of course, code completion itself is not a fresh innovation; however, WebStorm performs it in a much smarter way than other IDEs do. WebStorm can auto-complete a lot of things: class and function names, keywords and parameters, types and properties, punctuation, and even file paths. By default, the code completion facility is on. To invoke it, simply start typing some code, and WebStorm will suggest, for example, the methods available on the object you are working with. You can navigate through the list of suggestions using your mouse or the Up and Down arrow keys. However, the list can be very long, which makes it inconvenient to browse. To narrow it down to the things appropriate in the current context, keep typing the next letters. Besides typing the initial consecutive letters of the method, you can type something from the middle of the method name, or even use the CamelCase style, which is usually the quickest way of typing really long method names.

It may turn out that, for some reason, code completion isn't working automatically. To invoke it manually, press Ctrl + Space (the same on Mac and Windows). To insert the suggested method, press Enter; to replace the string next to the current cursor position with the suggested method, press Tab. If you want the facility to also arrange the correct syntactic surroundings for the method, press Shift + ⌘ + Enter on Mac or Ctrl + Shift + Enter on Windows, and the missing brackets and/or new lines will be inserted according to the styling standards of the current language.

Multiselect feature

With the multiple selection (or simply multiselect) feature, you can place the cursor in several locations simultaneously, and when you type, the code will be applied at all these positions. For example, suppose you need to add a different background color to each table cell and then make the cells twenty pixels wide. To avoid performing these identical tasks repeatedly, and to save a lot of time, place the cursor after the first <td> tag, hold Alt, and put the cursor in each <td> tag you are going to style. Now you can start typing the necessary attribute: bgcolor. Note that WebStorm performs smart code completion here too, regardless of whether you are typing on a single line or not. You get empty values for the bgcolor attributes, which you can fill out individually a bit later. You also need to change the width, so you can continue typing. As the cell widths are fixed, simply add the value for the width attributes as well.

Moreover, the multiselect feature can select identical values or words by itself, so you needn't place the cursor in multiple locations manually. Let us look at this feature through another example. Say you changed your mind and decided to colorize not the backgrounds but the borders of several consecutive cells. You might instantly think of a simple find-and-replace, but you don't want to replace all attribute occurrences, only several consecutive ones. To do this, place the cursor on the first attribute you are going to change, and press Ctrl + G on Mac or Alt + J on Windows as many times as you need. One by one, the same attributes will be selected, and you can replace each bgcolor attribute with bordercolor. You can also select all occurrences of any word at once by pressing Ctrl + ⌘ + G on Mac or Ctrl + Alt + Shift + J on Windows.
To get out of the multiselect mode, click in a different position or press the Esc key.

Refactoring facility

Throughout the development process, it is almost unavoidable that you will have to refactor. Also, the bigger your code base, the more difficult it becomes to control the code, and when you need to refactor some of it, you are likely to run into issues relating to, for example, naming omissions or overlooked function usages. You have learned that WebStorm performs a thorough code analysis, so it understands what is connected with what; when a change occurs, it collates the pieces and decides what is and is not acceptable to change in the rest of the code.

Let us try a simple example. In a big HTML file, you have the following line:

```html
<input id="search" type="search" placeholder="search" />
```

And in a big JavaScript file, you have another one:

```javascript
var search = document.getElementById('search');
```

You decide to rename the id attribute's value of the input element to search_field because it is less confusing. You could simply rename it here, but then you would have to manually find all the occurrences of the word search in the code. The word is evidently rather frequent, so you would spend a lot of time deciding whether each usage is relevant in the current context, and there is a high probability that you would miss something important, with even more time then spent investigating the resulting issue. Instead, you can entrust WebStorm with this task. Select the code unit to refactor (in our case, the search value of the id attribute) and press Ctrl + T on Mac or Ctrl + Alt + Shift + T on Windows (or simply use the Refactor menu) to call the Refactor This popup. There, choose the Rename… item and enter the new name for the selected code unit (search_field in our case). To get a preview of what will happen during the refactoring, click the Preview button, and all the changes to be applied will be displayed at the bottom. You can walk through the hierarchical tree and apply the changes by clicking the Do Refactor button. If you don't need a preview, you can simply click the Refactor button. What you will see is that the id attribute gets the search_field value, while the type and placeholder attributes remain untouched even though they have the same value, and in the JavaScript file you get getElementById('search_field').

Note that even though WebStorm can perform various smart tasks, it still remains a program, and issues can occur because of so-called artificial intelligence imperfection, so you should always be careful when refactoring. In particular, manually check the var declarations, because WebStorm can sometimes apply the changes to them as well, which is not always necessary because of scoping.

Of course, this is just a little of what you can do with refactoring. The basic things that the refactoring facility allows you to do are as follows:

- Rename…: You are already familiar with this refactoring. With it, you can rename code units, and WebStorm will automatically fix all references to them in the code. The shortcut is Shift + F6.
- Change Signature…: This feature is used mainly for changing function names and for adding, removing, reordering, or renaming function parameters, that is, changing the function signature. The shortcut is ⌘ + F6 on Mac and Ctrl + F6 on Windows.
- Move…: This feature enables you to move files or directories within a project, and it simultaneously repairs all references to these project elements in the code, so you needn't repair them manually. The shortcut is F6.
- Copy…: With this feature, you can copy a file, a directory, or even a class, with its structure, from one place to another. The shortcut is F5.
- Safe Delete…: This feature is really helpful. It allows you to safely delete any code or entire files from the project. When performing this refactoring, you will be asked whether comments and strings, or all text files, should be inspected for occurrences of the piece of code in question. The shortcut is ⌘ + Delete on Mac and Alt + Delete on Windows.
- Variable…: This refactoring declares a new variable into which the result of the selected statement or expression is put. It can be useful when you realize there are too many occurrences of a certain expression, so the expression can be turned into a variable that it merely initializes. The shortcut is Alt + ⌘ + V on Mac and Ctrl + Alt + V on Windows.
- Parameter…: When you need to add a new parameter to a method and update its calls accordingly, use this feature. The shortcut is Alt + ⌘ + P on Mac and Ctrl + Alt + P on Windows.
- Method…: During this refactoring, the selected code block undergoes analysis, through which the input and output variables are detected, and the extracted function receives the output variable as a return value. The shortcut is Alt + ⌘ + M on Mac and Ctrl + Alt + M on Windows.
- Inline…: The inline refactoring works in the opposite direction to the extract method refactoring: it replaces surplus variables with their initializers, making the code more compact and concise. The shortcut is Alt + ⌘ + N on Mac and Ctrl + Alt + N on Windows.

Summary

In this article, you have learned about the most distinctive features of WebStorm, which are the core constituents of improving your efficiency in building web applications.


Welcome to JavaScript in the full stack

Packt
15 Sep 2015
12 min read
In this article by Mithun Satheesh, the author of the book Web Development with MongoDB and NodeJS, you will not only learn how to use JavaScript to develop a complete single-page web application such as Gmail, but you will also see how to achieve the following with JavaScript throughout the remaining part of the book:

- Completely power the backend using Node.js and Express.js
- Persist data with a powerful document-oriented database such as MongoDB
- Write dynamic HTML pages using Handlebars.js
- Deploy your entire project to the cloud using services such as Heroku and AWS

With the introduction of Node.js, JavaScript has officially gone in a direction that was never possible before. Now you can use JavaScript on the server, and you can also use it to develop full-scale, enterprise-level applications. When you combine this with the power of MongoDB and its JSON-powered data, you can work with JavaScript in every layer of your application.

A short introduction to Node.js

One of the things people get most confused about when getting acquainted with Node.js is understanding what exactly it is. Is it a different language altogether, just a framework, or something else? Node.js is definitely not a new language, and it is not just a framework on top of JavaScript either. It can be considered a runtime environment for JavaScript built on top of Google's V8 engine. So, it provides a context in which we can write JS code on any platform where Node.js can be installed. That is, anywhere!

Now a bit of history. Back in 2009, Ryan Dahl gave a presentation at JSConf that changed JavaScript forever. During his presentation, he introduced Node.js to the JavaScript community, and after a roughly 45-minute talk he concluded it, receiving a standing ovation from the audience in the process. He was inspired to write Node.js after he saw a simple file upload progress bar on Flickr, the image-sharing site. Realizing that the site was going about the whole process the wrong way, he decided that there had to be a better solution. Now let's go through the features of Node.js that make it unique among server-side programming environments.

The advantage that the V8 engine brings

The V8 engine was developed by Google and was made open source in 2008. As we all know, JavaScript is an interpreted language, and it will not be as efficient as a compiled language, as each line of code gets interpreted one by one while the code is executed. The V8 engine brings in an efficient model here: the JavaScript code is compiled into machine-level code, and the execution happens on the compiled code instead of interpreting the JavaScript. But even though Node.js uses the V8 engine, Joyent, the company that maintains Node.js development, does not always update the V8 engine to the latest versions that Google actively releases.

Node is single threaded!

You might be asking how a single-threaded model helps. Typical PHP, ASP.NET, Ruby, or Java based servers follow a model where each client request results in the instantiation of a new thread or even a process. When it comes to Node.js, requests are run on the same thread, with even shared resources. A common question is what the advantage of such a model is. To understand this, we should understand the problem that Node.js tries to resolve.
It tries to do asynchronous processing on a single thread to provide more performance and scalability for applications that have to handle a lot of web traffic. Imagine web applications that handle millions of concurrent requests; if the server makes a new thread for handling each request that comes in, it will consume a lot of resources, and we would end up trying to add more and more servers to increase the scalability of the application. The single-threaded, asynchronous processing model has its advantage in this context, and you can process many more concurrent requests with fewer server-side resources. There is a downside to this approach, though: by default, Node.js will not utilize all the CPU cores available on the server it is running on without the help of an extra module such as pm2.

The point that Node.js is single threaded doesn't mean that Node doesn't use threads internally. It means that the developer, and the execution context that the developer's code is exposed to, have no control over the threading model used internally by Node.js. If you are new to the concept of threads and processes, we suggest going through some preliminary articles on the topic. There are plenty of YouTube videos on it as well. The following reference could be used as a starting point: http://www.cs.ucsb.edu/~rich/class/cs170/notes/IntroThreads/

Nonblocking asynchronous execution

One of the most powerful features of Node is that it is event-driven and asynchronous. So how does an asynchronous model work? Imagine you have a block of code, and at some nth line you have an operation that is time consuming. What happens to the lines that follow the nth line while this operation executes? In normal synchronous programming models, the lines that follow the nth line have to wait until the operation at the nth line completes. An asynchronous model handles this case differently.

To handle this scenario in an asynchronous approach, we need to segment the code that follows the nth line into two sections. The first section is dependent on the result of the operation at the nth line, and the second is independent of it. We wrap the dependent code in a function, with the result of the operation as its parameter, and register it as a callback to the operation, to run on its success. Once the operation completes, the callback function is triggered with its result. Meanwhile, we can continue executing the result-independent lines without waiting for the result. In this scenario, therefore, execution is never blocked waiting for an operation to complete; it just goes on, with callback functions registered for each operation's completion. Simply put, you assign a callback function to an operation, and when Node determines that the completion event has fired, it executes your callback function at that moment.

We can look at the following example to understand the asynchronous nature in detail:

```javascript
console.log('One');
console.log('Two');
setTimeout(function() {
    console.log('Three');
}, 2000);
console.log('Four');
console.log('Five');
```

In a typical synchronous programming language, executing the preceding code would yield the following output:

One
Two
... (2 second delay) ...
Three
Four
Five

However, in an asynchronous approach, the following output is seen:

One
Two
Four
Five
... (approx. 2 second delay) ...
Three

The function that actually logs Three is known as a callback to the setTimeout function.
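To make the dependent/independent segmentation concrete, here is a minimal sketch of our own (the filename is hypothetical) using Node's built-in fs.readFile, where only the code that actually needs the file contents goes into the callback:

```typescript
import { readFile } from 'fs';

// Dependent code: wrapped in a callback, runs only once the read completes.
readFile('config.json', 'utf8', (err, contents) => {
    if (err) {
        console.error('Read failed:', err.message);
        return;
    }
    console.log('File length:', contents.length);
});

// Independent code: runs immediately, without waiting for the file read.
console.log('Reading config.json in the background...');
```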
If you are interested in learning more about asynchronous models and the callback concept in JavaScript, the Mozilla Developer Network (MDN) has many articles that explain these concepts in detail.

Node Package Manager

Writing applications with Node is really enjoyable when you realize the sheer wealth of information and tools at your disposal! Using Node's package manager (npm), you can find literally tens of thousands of modules that can be installed and used within your application with just a few keystrokes! One of the biggest reasons for the success of Node.js is npm, which is one of the best package managers out there, with a very gentle learning curve. If this is the first package manager you have ever been exposed to, you should consider yourself lucky! On a monthly basis, npm handles more than a billion downloads, and it has around 150,000 packages currently available for you to download. You can view the library of available modules by visiting www.npmjs.com. Downloading and installing any module within your application is as simple as executing the npm install package command. Have you written a module that you want to share with the world? You can package it using npm and upload it to the public www.npmjs.org registry just as easily! If you are not sure how a module you installed works, the source code is right there in your project's node_modules folder, waiting to be explored!

Sharing and reusing JavaScript

While developing web applications, you will always end up writing validations for your UI on both the client and the server, as client-side validations are required for a better UI experience and server-side validations for better application security. Think about two different languages in action: you would have the same logic implemented on both the server and the client side. With Node.js, you can share the common functions between server and client, reducing code duplication to a great extent.

Have you ever worked on optimizing the load time of the client-side components of a Single Page Application (SPA) rendered with template engines such as Underscore? You would end up thinking about ways to share the rendering of templates between the server and the client at the same time; some call this hybrid templating. Node.js resolves the problem of duplicated client templates better than any other server-side technology, simply because we can use the same JS templating framework and the same templates on both the server and the client. If you are taking this point lightly, note that the problem it resolves is not just the reuse of validations or templates on the server and client. Think about a single-page application being built: you would need to implement subsets of the server-side models in the client-side MV* framework too. Now think about the templates, models, and controller subsets being shared on both client and server. We are solving a higher-order code redundancy scenario, aren't we?
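As a minimal sketch of such sharing (our own illustration; the module and function names are invented), the same validation function can be written once and consumed by both the Node server and the browser bundle:

```typescript
// validators.ts: one definition, imported by both the server and the client build.
export function isValidEmail(email: string): boolean {
    // Deliberately simple pattern, for illustration only.
    return /^\S+@\S+\.\S+$/.test(email);
}

// Server-side usage (Node):
//   import { isValidEmail } from './validators';
//   if (!isValidEmail(req.body.email)) { /* reject the request */ }

// Client-side usage (bundled for the browser):
//   import { isValidEmail } from './validators';
//   if (!isValidEmail(form.email.value)) { /* show an inline error */ }
```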
Not just for building web servers!

Node.js is not just for writing JavaScript on the server side. Yes, we have discussed this point earlier: Node.js sets up an environment in which JavaScript code can work anywhere Node can be installed. It can be a powerful solution for creating command-line tools, as well as full-featured, locally run applications that have nothing to do with the Web or a browser. Grunt.js is a great example of a Node-powered command-line tool that many web developers use daily to automate everyday tasks such as build processes, compiling CoffeeScript, launching Node servers, running tests, and more. In addition to command-line tools, Node is increasingly popular among the hardware crowd with the NodeBots movement. Johnny-Five and Cylon.js are two popular Node libraries that provide a framework for working with robotics; search YouTube for Node robots and you will see a lot of examples. There is also a chance that you are already using a text editor developed on Node.js: GitHub's open source editor named Atom is one such application, and it is hugely popular.

Real-time web with Socket.io

One of the important reasons behind the origin of Node.js was to support real-time web applications. Node.js has a couple of hugely popular frameworks built for real-time web applications, namely Socket.io and SockJS. These frameworks make it quite simple to build instant, collaboration-based applications such as Google Drive and Mozilla's TogetherJS. Before the introduction of WebSockets in modern browsers, real-time behavior was achieved via long polling, which was not a great solution for a real-time experience. While WebSockets is a feature that is only supported in modern browsers, Socket.io acts as a framework that also features seamless fallback implementations for legacy browsers. If you need to understand more about the use of WebSockets in applications, here is a good resource on MDN that you can explore: https://developer.mozilla.org/en-US/docs/Web/API/WebSockets_API/Writing_WebSocket_client_applications

Networking and file IO

In addition to the powerful nonblocking, asynchronous nature of Node, it also has very robust networking and filesystem tools available via its core modules. With Node's networking modules, you can create server and client applications that accept network connections and communicate via streams and pipes.
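As a minimal sketch of those core networking modules (our own illustration, using only Node's built-in net module), here is a TCP echo server together with a client that talks to it:

```typescript
import * as net from 'net';

// A TCP server that echoes back whatever each client sends.
const server = net.createServer((socket) => {
    socket.on('data', (chunk) => {
        socket.write(chunk); // stream the bytes straight back
    });
});

server.listen(4000, () => {
    console.log('Echo server listening on port 4000');

    // A client connecting to the same server, for demonstration.
    const client = net.connect(4000, 'localhost', () => {
        client.write('hello');
    });
    client.on('data', (reply) => {
        console.log('Server echoed:', reply.toString()); // "hello"
        client.end();
        server.close();
    });
});
```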
The origin of io.js

io.js is a fork of Node.js that was created to stay up to date with the latest developments in both V8 and the wider JS community. Joyent was taking care of Node.js releases, and the process followed for release management lacked an open governance model. This led to scenarios where newer developments in V8, as well as in the JS community, were not incorporated into its releases. For example, if you wanted to write JavaScript using the latest ECMAScript 6 (ES6) features, you had to run Node in harmony mode. Joyent is surely not to be blamed for this, as they were more concerned about the stability of Node.js releases than about frequent updates to the stack. This led to the io.js fork, which is kept up to date with the latest JavaScript and V8 updates. So it's better to keep your eyes on the releases of both Node and io.js to stay current with the Node.js world.

Summary

We discussed the amazing current state of JavaScript and how it can be used to power the full stack of a web application. Not that you needed any convincing in the first place, but I hope you're excited and ready to get started writing web applications using Node.js and MongoDB!


Writing SOLID JavaScript code with TypeScript

Packt
15 Sep 2015
12 min read
In this article, Remo H. Jansen, author of the book Learning TypeScript, explains that in the early days of software development, developers used to write code in procedural programming languages. In procedural programming languages, programs follow a top-to-bottom approach, and the logic is wrapped in functions. New styles of computer programming, such as modular programming and structured programming, emerged when developers realized that procedural computer programs could not provide them with the desired level of abstraction, maintainability, and reusability. The development community created a series of recommended practices and design patterns to improve the level of abstraction and reusability of procedural programming languages, but some of these guidelines required a certain level of expertise. In order to facilitate adherence to these guidelines, a new style of computer programming known as object-oriented programming (OOP) was created.

Developers quickly noticed some common OOP mistakes and came up with five rules that every OOP developer should follow to create a system that is easy to maintain and extend over time. These five rules are known as the SOLID principles. SOLID is an acronym introduced by Michael Feathers, which stands for the following principles:

- Single responsibility principle (SRP): This principle states that a software component (function, class, or module) should focus on one unique task (have only one responsibility).
- Open/closed principle (OCP): This principle states that software entities should be designed with application growth (new code) in mind (be open to extension), but the application growth should require as few changes to the existing code as possible (be closed for modification); a minimal sketch follows this list.
- Liskov substitution principle (LSP): This principle states that we should be able to replace a class in a program with another class as long as both classes implement the same interface. After replacing the class, no other changes should be required, and the program should continue to work as it did originally.
- Interface segregation principle (ISP): This principle states that we should split very large, general-purpose interfaces into smaller and more specific ones (many client-specific interfaces) so that clients only have to know about the methods that are of interest to them.
- Dependency inversion principle (DIP): This principle states that entities should depend on abstractions (interfaces) as opposed to depending on concretions (classes).
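Of the five principles, the open/closed principle is the only one not given its own code example later in this article, so here is a minimal sketch of our own (the interface and class names are invented for illustration): new behavior is added by writing new classes that implement an interface, while the code consuming the interface never changes.

```typescript
interface IDiscount {
    apply(price: number): number;
}

class NoDiscount implements IDiscount {
    apply(price: number): number { return price; }
}

class SeasonalDiscount implements IDiscount {
    apply(price: number): number { return price * 0.9; } // 10% off
}

// Closed for modification: checkout never changes when new discounts appear.
function checkout(price: number, discount: IDiscount): number {
    return discount.apply(price);
}

// Open for extension: adding a new rule is just a new class.
class ClearanceDiscount implements IDiscount {
    apply(price: number): number { return price * 0.5; }
}

console.log(checkout(100, new SeasonalDiscount()));  // 90
console.log(checkout(100, new ClearanceDiscount())); // 50
```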
JavaScript does not support interfaces, and most developers find its class support (prototypes) unintuitive. This may lead us to think that writing JavaScript code that adheres to the SOLID principles is not possible. However, with TypeScript we can write truly SOLID JavaScript. In this article, we will learn how to write TypeScript code that adheres to the SOLID principles, so that our applications are easy to maintain and extend over time. Let's start by taking a look at interfaces and classes in TypeScript.

Interfaces

The feature we will miss the most when developing large-scale web applications with JavaScript is probably interfaces. Following the SOLID principles can help us improve the quality of our code, and writing good code is a must when working on a large project. The problem is that if we attempt to follow the SOLID principles with JavaScript, we will soon realize that without interfaces, we can never write truly OOP code that adheres to the SOLID principles. Fortunately for us, TypeScript features interfaces. Wikipedia's definition of interfaces in OOP is:

In object-oriented languages, the term interface is often used to define an abstract type that contains no data or code, but defines behaviors as method signatures.

Implementing an interface can be understood as signing a contract. The interface is a contract, and when we sign it (implement it), we must follow its rules. The interface rules are the signatures of the methods and properties, and we must implement them. Usually in OOP languages, a class can extend another class and implement one or more interfaces, while an interface can extend one or more interfaces but cannot extend a class. In TypeScript, interfaces don't strictly follow this behavior. The two main differences are that in TypeScript:

- An interface can extend another interface or class.
- An interface can define data and behavior, as opposed to only behavior.

An interface in TypeScript can be declared using the interface keyword:

```typescript
interface IPerson {
    greet(): void;
}
```

Classes

Support for classes is another essential feature for writing code that adheres to the SOLID principles. We can create classes in JavaScript using prototypes, but it is not as trivial as it is in other OOP languages such as Java or C#. The ECMAScript 6 (ES6) specification of JavaScript introduces native support for the class keyword, but unfortunately ES6 is not compatible with the many old browsers that are still around. However, TypeScript features classes and allows us to use them today, because we can indicate to the compiler which version of JavaScript we would like to target (including ES3, ES5, and ES6). Let's start by declaring a simple class:

```typescript
class Person implements IPerson {
    public name : string;
    public surname : string;
    public email : string;
    constructor(name : string, surname : string, email : string) {
        this.email = email;
        this.name = name;
        this.surname = surname;
    }
    greet() {
        alert("Hi!");
    }
}

var me : Person = new Person("Remo", "Jansen", "[email protected]");
```

We use classes to represent the type of an object or entity. A class is composed of a name, attributes, and methods. The class above is named Person and contains three attributes or properties (name, surname, and email) and two methods (constructor and greet). The class attributes are used to describe the object's characteristics, while the class methods are used to describe its behavior. The class above uses the implements keyword to implement the IPerson interface, so all the methods declared by the IPerson interface (greet) must be implemented by the Person class. A constructor is a special method used by the new keyword to create instances (also known as objects) of our class. We have declared a variable named me, which holds an instance of the class Person. The new keyword uses the Person class's constructor to return an object whose type is Person.

Single Responsibility Principle

This principle states that a software component (usually a class) should adhere to the Single Responsibility Principle (SRP). The Person class above represents a person, including all their characteristics (attributes) and behaviors (methods).
Now, let's add some email validation logic to showcase the advantages of the SRP:

```typescript
class Person {
    public name : string;
    public surname : string;
    public email : string;
    constructor(name : string, surname : string, email : string) {
        this.surname = surname;
        this.name = name;
        if (this.validateEmail(email)) {
            this.email = email;
        } else {
            throw new Error("Invalid email!");
        }
    }
    validateEmail(email : string) {
        var re = /\S+@\S+\.\S+/;
        return re.test(email);
    }
    greet() {
        alert("Hi! I'm " + this.name + ". You can reach me at " + this.email);
    }
}
```

When an object doesn't follow the SRP and knows too much (has too many properties) or does too much (has too many methods), we say that the object is a God object. The preceding Person class is a God object because we have added a method named validateEmail that is not really related to the Person class's behavior. Deciding which attributes and methods should or should not be part of a class is a relatively subjective decision. If we spend some time analyzing our options, we should be able to find a way to improve the design of our classes. We can refactor the Person class by declaring an Email class, which is responsible for email validation, and using it as an attribute in the Person class:

```typescript
class Email {
    public email : string;
    constructor(email : string) {
        if (this.validateEmail(email)) {
            this.email = email;
        } else {
            throw new Error("Invalid email!");
        }
    }
    validateEmail(email : string) {
        var re = /\S+@\S+\.\S+/;
        return re.test(email);
    }
}
```

Now that we have an Email class, we can remove the responsibility of validating emails from the Person class and update its email attribute to use the type Email instead of string:

```typescript
class Person {
    public name : string;
    public surname : string;
    public email : Email;
    constructor(name : string, surname : string, email : Email) {
        this.email = email;
        this.name = name;
        this.surname = surname;
    }
    greet() {
        alert("Hi!");
    }
}
```

Making sure that a class has a single responsibility makes it easier to see what it does and how we can extend or improve it. We can further improve our Person and Email classes by increasing the level of abstraction of our classes. For example, when we use the Email class, we don't really need to be aware of the existence of the validateEmail method, so this method could be private or internal (invisible from outside the Email class). As a result, the Email class would be much simpler to understand. When we increase the level of abstraction of an object, we can say that we are encapsulating that object. Encapsulation is also known as information hiding. For example, the Email class allows us to use emails without having to worry about email validation, because the class deals with it for us. We can make this clearer by using access modifiers (public or private) to flag as private all the class attributes and methods that we want to abstract from the users of the Email class:

```typescript
class Email {
    private email : string;
    constructor(email : string) {
        if (this.validateEmail(email)) {
            this.email = email;
        } else {
            throw new Error("Invalid email!");
        }
    }
    private validateEmail(email : string) {
        var re = /\S+@\S+\.\S+/;
        return re.test(email);
    }
    get():string {
        return this.email;
    }
}
```

We can then simply use the Email class without explicitly performing any kind of validation:

```typescript
var email = new Email("[email protected]");
```

Liskov Substitution Principle

The Liskov Substitution Principle (LSP) states: "Subtypes must be substitutable for their base types." Let's take a look at an example to understand what this means.
We are going to declare a class whose responsibility is to persist objects in some kind of storage. We will start by declaring the following interface:

```typescript
interface IPersistanceService {
    save(entity : any) : number;
}
```

After declaring the IPersistanceService interface, we can implement it. We will use cookies as the storage for the application's data:

```typescript
class CookiePersitanceService implements IPersistanceService {
    save(entity : any) : number {
        var id = Math.floor((Math.random() * 100) + 1);
        // Cookie persistance logic...
        return id;
    }
}
```

We will continue by declaring a class named FavouritesController, which has a dependency on the IPersistanceService interface:

```typescript
class FavouritesController {
    private _persistanceService : IPersistanceService;
    constructor(persistanceService : IPersistanceService) {
        this._persistanceService = persistanceService;
    }
    public saveAsFavourite(articleId : number) {
        return this._persistanceService.save(articleId);
    }
}
```

We can finally create an instance of FavouritesController and pass an instance of CookiePersitanceService via its constructor:

```typescript
var favController = new FavouritesController(new CookiePersitanceService());
```

The LSP allows us to replace a dependency with another implementation as long as both implementations are based on the same base type. For example, say we decide to stop using cookies as storage and use the HTML5 local storage API instead, without having to worry about the FavouritesController code being affected by this change:

```typescript
class LocalStoragePersitanceService implements IPersistanceService {
    save(entity : any) : number {
        var id = Math.floor((Math.random() * 100) + 1);
        // Local storage persistance logic...
        return id;
    }
}
```

We can then replace it without having to make any changes to the FavouritesController class:

```typescript
var favController = new FavouritesController(new LocalStoragePersitanceService());
```

Interface Segregation Principle

In the previous example, our interface was IPersistanceService, and it was implemented by the classes LocalStoragePersitanceService and CookiePersitanceService. The interface was consumed by the class FavouritesController, so we say that this class is a client of the IPersistanceService API. The Interface Segregation Principle (ISP) states that no client should be forced to depend on methods it does not use. To adhere to the ISP, we need to keep in mind that when we declare the API of our application's components (how two or more software components cooperate and exchange information with each other), declaring many client-specific interfaces is better than declaring one general-purpose interface. Let's take a look at an example. If we are designing an API to control all the elements in a vehicle (engine, radio, heating, navigation, lights, and so on), we could have one general-purpose interface that allows controlling every single element of the vehicle:

```typescript
interface IVehicle {
    getSpeed() : number;
    getVehicleType() : string;
    isTaxPayed() : boolean;
    isLightsOn() : boolean;
    isLightsOff() : boolean;
    startEngine() : void;
    acelerate() : number;
    stopEngine() : void;
    startRadio() : void;
    playCd() : void;
    stopRadio() : void;
}
```

If a class has a dependency on (is a client of) the IVehicle interface but only wants to use the radio methods, we would be facing a violation of the ISP because, as we have already learned, no client should be forced to depend on methods it does not use.
The solution is to split the IVehicle interface into many client-specific interfaces, so that our class can adhere to the ISP by depending only on IRadio:

```typescript
interface IVehicle {
    getSpeed() : number;
    getVehicleType() : string;
    isTaxPayed() : boolean;
    isLightsOn() : boolean;
}

interface ILights {
    isLightsOn() : boolean;
    isLightsOff() : boolean;
}

interface IRadio {
    startRadio() : void;
    playCd() : void;
    stopRadio() : void;
}

interface IEngine {
    startEngine() : void;
    acelerate() : number;
    stopEngine() : void;
}
```

Dependency Inversion Principle

The Dependency Inversion (DI) principle states that we should depend upon abstractions, not upon concretions. In the previous section, we implemented FavouritesController, and we were able to replace one implementation of IPersistanceService with another without having to make any additional change to FavouritesController. This was possible because we followed the DI principle: FavouritesController has a dependency on the IPersistanceService interface (an abstraction) rather than on the LocalStoragePersitanceService or CookiePersitanceService classes (concretions). The DI principle also allows us to use an inversion of control (IoC) container, a tool used to reduce the coupling between the components of an application. If you want to learn more about IoC, refer to Inversion of Control Containers and the Dependency Injection pattern by Martin Fowler at http://martinfowler.com/articles/injection.html.

Summary

In this article, we looked at classes, interfaces, and the SOLID principles.


Understanding the Datastore

Packt
14 Sep 2015
41 min read
In this article by Mohsin Hijazee, the author of the book Mastering Google App Engine, we will see that learning something is hard, but unlearning it is even harder. The main reason why learning something new is hard is not that it is hard in and of itself, but the fact that, most of the time, you have to unlearn a lot in order to learn a little. This is quite true for the datastore. It is built to scale at the so-called Google scale, and that's why, in order to be proficient with it, you will have to unlearn some of what you know. Your learning as a computer science student or a programmer has been so deeply enriched by the relational model that anything else may seem quite hard to grasp, and this is the reason why learning the Google datastore is quite hard.

Things have been complicated further by Google's own official documentation, which presents the datastore in a manner that makes it seem closer to something like Django's ORM, Rails' ActiveRecord, or SQLAlchemy. Then, all of a sudden, it starts to list the datastore's limitations, with only a very brief mention, or at times no mention at all, of why those limitations exist. Since you know only the limitations, and not why they are there in the first place, you may be unable to work around them or to mold your problem space into the new solution space that the Google datastore represents. We will try to fix this. Hence, the following are our goals in this article:

- To understand BigTable and its data model
- To look at the physical data storage in BigTable and the operations that are available on it
- To understand how BigTable scales
- To understand the datastore and the way it models data on top of BigTable

So, there's a lot to learn. Let's get started on our journey of exploring the datastore.

The BigTable

If you decided to fetch every web page hosted on the planet, download and store a copy of it, and later process every page to extract data from it, you would find out that your own laptop or desktop is not good enough to accomplish this task. It has barely enough storage to store even a fraction of the pages. Laptops usually come with a 1 TB hard disk drive, which seems quite enough for a person who is not much into video content such as movies.

Assuming that there are 2 billion websites, each with an average of 50 pages, and each page weighing around 250 KB, this sums up to around 23,000+ TB (roughly 22 petabytes), which would need 23,000 such laptops to store, with a 1 TB hard drive in each. Assuming the same statistics, if you were able to download at a whopping speed of 100 MBps, it would take you about seven years to download the whole content to one such gigantic hard drive, if you had one in your laptop. Let's suppose that you downloaded the content in whatever time it took and stored it. Now you need to analyze and process it too. If processing takes about 50 milliseconds per page, a single machine would need well over a century to process the entire dataset. The world would have changed a lot by then, leaving your data and processed results obsolete. This is the kind of scale for which BigTable is built. Every Google product that you see (Search, Analytics, Finance, Gmail, Docs, Drive, and Google Maps) is built on top of BigTable.
If you want to read more about BigTable, you can go through the academic paper from Google Research, which is available at http://static.googleusercontent.com/media/research.google.com/en//archive/bigtable-osdi06.pdf.

The data model

Let's examine the data model of BigTable at a logical level. BigTable is basically a key-value store, so everything that you store falls under a unique key, just like in PHP's arrays, Ruby's hashes, or Python's dicts:

```
# PHP
$person['name'] = 'Mohsin';

# Ruby or Python
person['name'] = 'Mohsin'
```

However, this is only a partial picture; we will learn the details gradually. So, let's understand this step by step. A BigTable installation can have multiple tables, just like a MySQL database can have multiple tables. The difference is that a MySQL installation might have multiple databases, which in turn might have multiple tables, whereas in the case of BigTable, the first major storage unit is a table.

Each table can have hundreds of columns, which can be divided into groups called column families. You define column families at the time of creating a table, and they cannot be altered later, but each column family might have hundreds of columns, which you can define even after the creation of the table. The notation used to address a column within its column family is like job:title, where job is the column family and title is the column. So here, you have a job column family that stores all the information about the job of the user, and title is supposed to store the job title. One of the important facts about these columns is that there's no concept of datatypes in BigTable, as you'd encounter in relational database systems. Everything is just an uninterpreted sequence of bytes, which means nothing to BigTable; what the bytes really mean is up to you. They might be a very long integer, a string, or JSON-encoded data.

Now, let's turn our attention to the rows. There are two major characteristics of rows that we are concerned with. First, each row has a key, which must be unique. The key again consists of an uninterpreted string of bytes and can be up to 64 KB in length. A key can be anything you want it to be; all that's required is that it be unique within the table, and if it is not, you will overwrite the contents of the existing row with the same key. Which key should you use for a row in your table? That question requires some consideration. To answer it, you need to understand how the data is actually stored; until then, you can assume that each key has to be a unique string of bytes within the scope of a table and can be up to 64 KB in length.
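To summarize the shape of this data model, here is a small sketch of our own in TypeScript (not a real BigTable API): a table is a set of keyed rows whose "family:column" cells hold nothing but raw bytes.

```typescript
// Each cell is addressed as "family:column" and holds uninterpreted bytes.
type RowKey = string;                    // unique within the table, up to 64 KB
type Cells = Map<string, Uint8Array>;    // e.g. "job:title" -> bytes

const table = new Map<RowKey, Cells>();

// Rows need not share the same columns; the meaning of the bytes is up to us.
const encoder = new TextEncoder();
table.set('Mohsin', new Map([
    ['personal:age', encoder.encode('29')],       // a number, stored as bytes
    ['professional:company', encoder.encode('Sony')],
]));
table.set('Kim', new Map([
    ['personal:gender', encoder.encode('male')],  // an extra column, only in this row
]));
```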
Now that we know about tables, column families, columns, rows, and row keys, let's look at an example of a BigTable that stores employees' information; let's pretend that we are creating something similar to LinkedIn here. So, here's the table, with a personal and a professional column family:

| Key (name) | personal:lastname | personal:age | professional:company | professional:designation |
|---|---|---|---|---|
| Mohsin | Hijazee | 29 | Sony | Senior Designer |
| Peter | Smith | 34 | Panasonic | General Manager |
| Kim | Yong | 32 | Sony | Director |
| Ricky | Martin | 45 | Panasonic | CTO |
| Paul | Jefferson | 39 | LG | Sales Head |

So, this is a sample BigTable. The first column is the name, and we have chosen it as the key. It is of course not a good key, because the first name cannot be guaranteed to be unique, even in small groups, let alone in millions of records. However, for the sake of this example, we will assume that the name is unique. Another reason behind assuming the name's uniqueness is that we want to build up our understanding gradually: we picked the first name as the row's key for now, but we will improve on this as we learn more.

Next, we have two column families. The personal column family holds all the personal attributes of the employees, and the professional column family holds all the professional attributes. When referring to a column within a family, the notation is family:column, so personal:age contains the age of the employee. If you look at professional:designation and personal:age, it seems that the former's contents are strings while the latter stores integers. That's false. No column stores anything but plain bytes, with no notion of what they mean; the meaning and interpretation of the bytes is up to the user of the data. From BigTable's point of view, each column just contains plain old bytes.

Another thing that differs drastically from an RDBMS such as MySQL is that each row need not have the same number of columns. Each row can adopt whatever layout it wants. So, the second row's personal column family could have two more columns that store gender and nationality.

For this particular example, the data is in no particular order; it was written down as it came to mind. To summarize: BigTable is a key-value store where keys must be unique and have a length of at most 64 KB; the columns are divided into column families, which are fixed when the table is defined, although each column family might have hundreds of columns created as and when needed; and column contents have no datatype, comprising just plain old bytes.

There's one minor detail left, which is not important for our purposes but is worth mentioning for the sake of the completeness of BigTable's data model. Each column value is stored with a timestamp that is accurate to the microsecond, and in this way, multiple versions of a column value are available. The number of recent versions to keep is configurable at the table level, but since we are not going to deal with BigTable directly, this detail is not important to us.

How is the data stored?

Now that we know about row keys, column families, and columns, we will gradually move towards examining this data model in detail and understanding how the data is actually stored. We will examine the logical storage and then dive into the actual structure, as it ends up on the disk. The data presented in the earlier table was in no particular order. However, when stored, the data is always sorted by the row key, so it will actually be stored like this:

| Key (name) | personal:lastname | personal:age | professional:company | professional:designation |
|---|---|---|---|---|
| Kim | Yong | 32 | Sony | Director |
| Mohsin | Hijazee | 29 | Sony | Senior Designer |
| Paul | Jefferson | 39 | LG | Sales Head |
| Peter | Smith | 34 | Panasonic | General Manager |
| Ricky | Martin | 45 | Panasonic | CTO |

OK, so what happened here? The name column is the key of the table, and now the whole table is sorted by the key. That's exactly how it is stored on the disk as well. An important thing about the sorting is that it is lexicographic, not semantic. By lexicographic, we mean that the rows are sorted by byte value, not by the textness or semantics of the key.
This matters because even within the Latin character set, different languages have different sort orders for letters, such as letters in English versus German or French. However, none of that, nor the Unicode collation order, applies here; the data is sorted by plain byte values. In our instance, since K has a smaller byte value than M (a lower ASCII/Unicode value), it comes first. Now, suppose that some European language considers and sorts M before K; that's not how the data would be laid out here, because this is a plain, blind, simple sort by byte value, with no regard for semantic value. In fact, to BigTable this is not even text; it's just a string of bytes. Just a hint: this ordering of keys is something that we will exploit when modeling data. How? We'll see later.

The physical storage

Now that we understand the logical data model and how it is organized, it's time to take a closer look at how this data is actually stored on the disk. On a physical disk, the stored data is sorted by the key: key 1 is followed by its value, key 2 is followed by its value, and so on. At the end of the file, there's a sorted list of just the keys and their offsets from the start of the file, forming an index block. This particular format actually has a name: SSTable (Sorted String Table), because it contains strings (the keys) and they are sorted. It is, of course, tabular data, and hence the name.

Whenever your data is sorted, you have certain advantages, the first and foremost being that when you look up an item or a range of items, your dataset is already in order. We will discuss this in detail later in this article. Now, if we start from the beginning of the file and read sequentially, noting down every key and its offset in a format such as key:offset, we effectively create an index of the whole file in a single scan. That is exactly what the index block at the end of the SSTable is. Since the keys are sorted in the file, we simply read it sequentially to the end, effectively creating an index of the data. Furthermore, since this index contains only the keys and their offsets in the file, it occupies much less space.

Now, assume an SSTable that is, say, 500 MB in size. We only need to load the index from the end of the file into memory, and whenever we are asked for a key or a range of keys, we just search the in-memory index (thus not touching the disk at all). If we find the data, only then do we seek the disk at the given offset, because we know the offset of that particular key from the index loaded in memory.

Some limitations

Pretty smart, neat, and elegant, you might say! Yes, it is. However, there's a catch. If you want to insert a new row, its key must go in sorted order, and even if you know exactly where in the file the key should be placed to avoid re-sorting the data, you still need to rewrite the whole file, along with its index, in the new sorted order. Hence, a large amount of I/O is required for just a single row insertion. The same goes for deleting a row, because the file must then be sorted and rewritten again. Updates are OK as long as the key itself is not altered, because an altered key is, in effect, a new key altogether.
Some limitations

Pretty smart, neat, and elegant, you would say! Yes, it is. However, there's a catch. If you want to insert a new row, its key must fall into the sorted order, and even if you know exactly where this key should be placed in the file, you still need to rewrite the whole file, along with its index, in the new sorted order. Hence, a large amount of I/O is required for just a single row insertion. The same goes for deleting a row, because the file has to be sorted and rewritten again. Updates are OK as long as the key itself is not altered, because altering the key is, in effect, creating a new key altogether: a modified key would have a different place in the sorted order, depending on what the key actually is, and hence the whole file would be rewritten.

As an example, say you have a row with the key all-boys, and then you change the key of that row to x-rays-of-zebra. After this modification, the row will end up near the end of the file, whereas previously it was probably at the beginning, because all-boys comes before x-rays-of-zebra when sorted. This seems pretty limiting, and it looks like inserting or removing a key is quite expensive. However, this is not the case, as we will see next.

Random writes and deletion

There's one last thing worth a mention before we examine the operations that are available on a BigTable. We'd like to examine how random writes and the deletion of rows are handled, because these seem quite expensive, as we just saw in the preceding section.

The idea is very simple. All the reads, writes, and removals don't go straight to the disk. Instead, an in-memory SSTable is created along with its index, both of which are empty when created. We'll call it the MemTable from this point onwards, for the sake of simplicity. Every read checks the index of this table first, and if the record is found there, well and good. If it is not, then the index of the SSTable on the disk is checked and the desired row is returned. When a new row has to be written, we don't touch the disk at all and simply enter the row into the MemTable, along with its record in the MemTable's index. To delete a key, we simply mark it as deleted in memory, regardless of whether it is in the MemTable or in the on-disk table. (The original article illustrated the allocation of rows into the MemTable with a diagram here.)

Now, when the MemTable grows to a certain size, it is written to the disk as a new SSTable. Since this only depends on the size of the MemTable and happens much less frequently, it is much faster. Each time the MemTable grows beyond a configured size, it is flushed to the disk as a new SSTable. However, the index of each flushed SSTable is still kept in memory so that we can quickly check incoming read requests and locate the row in any table without touching the disk. Finally, when the number of SSTables reaches a certain count, the SSTables are merged and collapsed into a single SSTable. Since each SSTable is just a sorted set of keys, a merge sort is applied. This merging process is quite fast.

Congratulations! You've just learned about the most atomic storage unit in BigData solutions such as BigTable, HBase, Hypertable, Cassandra, and LevelDB. That's how they actually store and process the data.
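The read/write/delete flow just described can be sketched in a few lines of Python. This is a deliberately simplified model (a plain dict stands in for the MemTable, and there is no real flushing or merging); it only shows where each operation is resolved:

# Toy model of the write path: reads consult the MemTable first,
# while writes and deletes never touch the "disk" table directly.
DELETED = object()                 # tombstone marker

disk_table = {b"key1": b"alpha"}   # stands in for the on-disk SSTable
memtable = {}                      # in-memory, initially empty

def write(key, value):
    memtable[key] = value          # no disk I/O at all

def delete(key):
    memtable[key] = DELETED        # just mark it; drop it on flush/merge

def read(key):
    if key in memtable:
        value = memtable[key]
        return None if value is DELETED else value
    return disk_table.get(key)     # fall back to the on-disk index

write(b"key2", b"beta")
delete(b"key1")
print(read(b"key2"))  # b'beta'
print(read(b"key1"))  # None (tombstoned)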
Now that we know how a BigTable table is actually stored on the disk and how reads and writes are handled, it's time to take a closer look at the available operations.

Operations on BigTable

Up to this point, we know that a BigTable table is a collection of rows that have unique keys of up to 64 KB in length, and that the data is stored according to the lexicographic sort order of the keys. We also examined how it is laid out on the disk and how reads, writes, and removals are handled. Now, the question is, which operations are available on this data? The following are the operations that are available to us:

Fetching a row by using its key
Inserting a new key
Deleting a row
Updating a row
Reading a range of rows from the starting row key to the ending row key

Reading

The first operation is pretty simple. You have a key, and you want the associated row. Since the whole data set is sorted by the key, all we need to do is perform a binary search on it, and we'll be able to locate the desired row within a few lookups, even within a set of a million rows. In practice, the index at the end of the SSTable is loaded into memory, and the binary search is actually performed on it. In light of what we know from the previous section, the index of the MemTable is already in memory. In case there are multiple SSTables, because the MemTable was flushed to the disk many times as it grew too large, all the indexes of all the SSTables are present in memory, and a quick binary search is performed on them.

Writing

The second operation available to us is the ability to insert a new row. So, we have a key and the values that we want to insert into the table. Given what we now know about physical storage and SSTables, we can understand this very well. The write happens directly on the in-memory MemTable, and its index, which is also in memory, is updated. Since no disk access is required to write the row, the whole file doesn't have to be rewritten on disk, because, yet again, all of it is in memory. This operation is very fast and almost instantaneous. However, if the MemTable grows in size, it will be flushed to the disk as a new SSTable, along with its index, while a copy of the index is retained in memory. Finally, we also saw that when the number of SSTables reaches a certain count, they are merged and collapsed to form a new, bigger table.

Deleting

It seems that since all the keys are in a sorted order on the disk, and deleting a key would disrupt the sort order, a rewrite of the whole file would be a big I/O overhead. However, it is not, as it can be handled smartly. Since all the indexes, including that of the MemTable and those of the tables that resulted from flushing a larger MemTable to the disk, are already in memory, deleting a row only requires us to find the required key in the in-memory indexes and mark it as deleted. Now, whenever someone tries to read the row, the in-memory indexes will be checked, and although an entry will be there, it will be marked as deleted and won't be returned. When the MemTable is being flushed to the disk, or multiple tables are being collapsed, this key and the associated row will be excluded from the write process. Hence, they are totally gone from the storage.

Updating

Updating a row is no different, but there are two cases. The first case is the one in which not only the values but also the key is modified. In this case, it is like removing the row with the old key and inserting a row with the new key. We have already seen both of these operations in detail, so this one should be obvious. The case where only the values are modified is even simpler: we only have to locate the row from the indexes, load it into memory if it is not already there, and modify it. That's all.
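The point lookups behind all of these operations can be sketched as a binary search over several in-memory indexes, newest first, so that a fresh write shadows an older value. This is a hedged, simplified model, not BigTable's real structures:

import bisect

# Each "table" is a sorted list of (key, value) pairs plus a parallel key list.
def make_index(pairs):
    pairs = sorted(pairs)
    return [k for k, _ in pairs], pairs

# Newest first: the MemTable, then flushed SSTables.
tables = [
    make_index([(b"key2", b"beta-v2")]),                     # MemTable
    make_index([(b"key1", b"alpha"), (b"key2", b"beta")]),   # older SSTable
]

def read(key):
    for keys, pairs in tables:                # check the newest data first
        i = bisect.bisect_left(keys, key)     # binary search per index
        if i < len(keys) and keys[i] == key:
            return pairs[i][1]
    return None

print(read(b"key2"))  # b'beta-v2' (the newer version wins)
print(read(b"key1"))  # b'alpha'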
Scanning a range

This last operation is quite interesting. You can scan a range of keys, from a starting key to an ending key. For instance, you can return all the rows that have a key greater than or equal to key1 and less than or equal to key2, effectively forming a range. Since looking up a single key is a fast operation, we only have to locate the first key of the range. Then, we start reading the consecutive keys one after the other till we encounter a key that is greater than key2, at which point we stop scanning, and the keys that we scanned so far are our query's result. This is what it looks like:

Name | Department | Company
Chris Harris | Research & Development | Google
Christopher Graham | Research & Development | LG
Debra Lee | Accounting | Sony
Ernest Morrison | Accounting | Apple
Fred Black | Research & Development | Sony
Janice Young | Research & Development | Google
Jennifer Sims | Research & Development | Panasonic
Joyce Garrett | Human Resources | Apple
Joyce Robinson | Research & Development | Apple
Judy Bishop | Human Resources | Google
Kathryn Crawford | Human Resources | Google
Kelly Bailey | Research & Development | LG
Lori Tucker | Human Resources | Sony
Nancy Campbell | Accounting | Sony
Nicole Martinez | Research & Development | LG
Norma Miller | Human Resources | Sony
Patrick Ward | Research & Development | Sony
Paula Harvey | Research & Development | LG
Stephanie Chavez | Accounting | Sony
Stephanie Mccoy | Human Resources | Panasonic

In the preceding table, we said that the starting key will be greater than or equal to Ernest and the ending key will be less than or equal to Kathryn. So, we locate the first key that is greater than or equal to Ernest, which happens to be Ernest Morrison. Then, we keep scanning, picking and returning each key as long as it is less than or equal to Kathryn. Judy Bishop still satisfies this condition, so it is returned; the next key, Kathryn Crawford, sorts after the plain string Kathryn, so it is not returned and the scan stops there. The rows before it are returned. This is the last operation that is available to us on BigTable.
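A minimal Python sketch of such a range scan over a sorted key list, using the names above (bisect stands in for the index lookup):

import bisect

keys = sorted([
    "Chris Harris", "Christopher Graham", "Debra Lee", "Ernest Morrison",
    "Fred Black", "Janice Young", "Jennifer Sims", "Joyce Garrett",
    "Joyce Robinson", "Judy Bishop", "Kathryn Crawford", "Kelly Bailey",
])

def scan(start, end):
    i = bisect.bisect_left(keys, start)      # locate the first key >= start
    result = []
    while i < len(keys) and keys[i] <= end:  # read consecutively until key > end
        result.append(keys[i])
        i += 1
    return result

print(scan("Ernest", "Kathryn"))
# ['Ernest Morrison', 'Fred Black', 'Janice Young', 'Jennifer Sims',
#  'Joyce Garrett', 'Joyce Robinson', 'Judy Bishop']
# 'Kathryn Crawford' sorts after 'Kathryn', so it is excluded.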
Selecting a key

Now that we have examined the data model and the storage layout, we are in a better position to talk about key selection for a table. As we know, the stored data is sorted by the key; this does not impact writing, deleting, updating, or fetching a single row. However, the operation that is impacted by the key is the scanning of a range.

Let's think about the previous table again, and assume that it is part of some system that processes payrolls for companies, and the companies pay us for the task of processing their payroll. Now, let's suppose that Sony asks us to process their data and generate a payroll for them. Right now, we cannot do anything of this kind efficiently. We can just make our program scan the whole table, and hence all the records (which might be in millions), and only pick the records where professional:company has the value of Sony. This would be inefficient. Instead, what we can do is put the sorted nature of row keys to our service: select the company name as the key and concatenate the department and name along with it. So, the new table will look like this:

Key | Name | Department | Company
Apple-Accounting-Ernest Morrison | Ernest Morrison | Accounting | Apple
Apple-Human Resources-Joyce Garrett | Joyce Garrett | Human Resources | Apple
Apple-Research & Development-Joyce Robinson | Joyce Robinson | Research & Development | Apple
Google-Human Resources-Judy Bishop | Judy Bishop | Human Resources | Google
Google-Human Resources-Kathryn Crawford | Kathryn Crawford | Human Resources | Google
Google-Research & Development-Chris Harris | Chris Harris | Research & Development | Google
Google-Research & Development-Janice Young | Janice Young | Research & Development | Google
LG-Research & Development-Christopher Graham | Christopher Graham | Research & Development | LG
LG-Research & Development-Kelly Bailey | Kelly Bailey | Research & Development | LG
LG-Research & Development-Nicole Martinez | Nicole Martinez | Research & Development | LG
LG-Research & Development-Paula Harvey | Paula Harvey | Research & Development | LG
Panasonic-Human Resources-Stephanie Mccoy | Stephanie Mccoy | Human Resources | Panasonic
Panasonic-Research & Development-Jennifer Sims | Jennifer Sims | Research & Development | Panasonic
Sony-Accounting-Debra Lee | Debra Lee | Accounting | Sony
Sony-Accounting-Nancy Campbell | Nancy Campbell | Accounting | Sony
Sony-Accounting-Stephanie Chavez | Stephanie Chavez | Accounting | Sony
Sony-Human Resources-Lori Tucker | Lori Tucker | Human Resources | Sony
Sony-Human Resources-Norma Miller | Norma Miller | Human Resources | Sony
Sony-Research & Development-Fred Black | Fred Black | Research & Development | Sony
Sony-Research & Development-Patrick Ward | Patrick Ward | Research & Development | Sony

So, this is the new format. We just welded the company, department, and name together as the key, and since the table is always sorted by the key, that's what it looks like, as shown in the preceding table. Now, suppose that we receive a request from Google to process their data. All we have to do is perform a scan starting from the key greater than or equal to Google and less than L, because that's the next letter. This scan is highlighted in the previous table. Now, the next request is more specific: Sony asks us to process their data, but only for their accounting department. How do we do that? Quite simple! In this case, our starting key will be greater than or equal to Sony-Accounting, and the ending key can be Sony-Accountinga, where an a is appended to mark the end key of the range. The scanned range and the returned rows are highlighted in the previous table.
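Building on the earlier range-scan sketch, here is a hedged Python illustration of how such composite keys turn a "filter by company and department" query into a cheap prefix scan (the key layout follows the table above):

import bisect

keys = sorted([
    "Sony-Accounting-Debra Lee",
    "Sony-Accounting-Nancy Campbell",
    "Sony-Accounting-Stephanie Chavez",
    "Sony-Human Resources-Lori Tucker",
    "Google-Human Resources-Judy Bishop",
    "LG-Research & Development-Kelly Bailey",
])

def prefix_scan(prefix):
    # All keys sharing a prefix form one contiguous block in sorted order.
    i = bisect.bisect_left(keys, prefix)
    result = []
    while i < len(keys) and keys[i].startswith(prefix):
        result.append(keys[i])
        i += 1
    return result

print(prefix_scan("Sony-Accounting"))
# ['Sony-Accounting-Debra Lee', 'Sony-Accounting-Nancy Campbell',
#  'Sony-Accounting-Stephanie Chavez']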
BigTable – a hands-on approach

Okay, enough of the theory. It is now time to take a break and perform some hands-on experimentation. By now, we know about 80 percent of BigTable; the other 20 percent of the complexity lies in scaling it to more than one machine. Our discussion so far assumed a single-machine environment, as if the BigTable table were on our laptop, and that's about it. You might really want to experiment with what you have learned. Fortunately, given that you have the latest version of Google Chrome or Mozilla Firefox, that's easy. You have BigTable right there! How? Let me explain.

Basically, from the ideas that we looked at pertaining to sorted key-value storage, the indexes of the sorted files, and all the operations performed on them, including scanning, a separate component was extracted, called LevelDB. Meanwhile, as HTML was evolving towards HTML5, a need was felt to store data locally. Initially, SQLite3 was embedded in browsers, and there was a querying interface for you to play with. So, all in all, you had an SQL database in the browser, which yielded a lot of possibilities. However, in recent years, W3C deprecated this specification and urged browser vendors not to implement it. Instead of web databases based on SQLite3, browsers now have databases based on LevelDB, which are actually key-value stores where storage is always sorted by key. Hence, besides looking up a key, you can scan across a range of keys. Covering the IndexedDB API here would be beyond the scope of this book, but if you want to understand it and find out what the theory that we talked about looks like in practice, you can try using IndexedDB in your browser by visiting http://code.tutsplus.com/tutorials/working-with-indexeddb--net-34673. The concepts of keys and the scanning of key ranges are exactly like those that we examined here as regards BigTable, and those about indexes are mainly from the concepts that we will examine in a later section about datastores.

Scaling BigTable to BigData

By now, you have probably understood the data model of BigTable, how it is laid out on the disk, and the advantages it offers. To recap once again: a BigTable installation may have many tables, each table may have many column families that are defined at the time of creating the table, and each column family may have many columns, as required. Rows are identified by keys, which have a maximum length of 64 KB, and the stored data is sorted by the key. We can read, update, and delete a single row. We can also scan a range of rows from a starting key to an ending key. So now the question is, how does this scale? We will provide a very high-level overview, neglecting the micro details, to keep things simple and build a mental model that is useful to us as consumers of BigTable; we're not supposed to clone BigTable's implementation, after all.

As we saw earlier, the basic storage unit in BigTable is a file format called SSTable, which stores key-value pairs sorted by the key and has an index at its end. We also examined how reads, writes, and deletes work on an in-memory copy of the table that is merged periodically with the table present on the disk. Lastly, we also mentioned that when the in-memory table is flushed to the disk and the resulting SSTables reach a certain configurable count, they are merged into a bigger table.

The view so far covers the data model, its physical layout, and how operations work on it when the data resides on a single machine, such as a situation where your laptop holds a telephone directory of the whole of Europe. However, how does this work at larger scales? Neglecting the minor implementation details and the complexities that arise in distributed systems, the overall architecture and working principles are simple. In the case of a single machine, there's only one SSTable file (or a few, in case they have not been merged into one) that has to be taken care of, and all the operations are performed on it. However, in case this file does not fit on a single machine, we will of course have to add another machine, and half of the SSTable will reside on one machine, while the other half will be on the other machine. This split would of course mean that each machine would have a range of keys. For instance, if we have 1 million keys (that look like key1, key2, key3, and so on), then the keys from key1 to key500000 might be on one machine, while the keys from key500001 to key1000000 will be on the second machine.
So, we can say that each machine has a different key range for the same table. Now, although the data resides on two different machines, it is of course a single table that sprawls over the two. These partitions or separate parts are called tablets. (The original article showed a diagram of the key allocation across the two machines here.)

We will keep this system to only two machines and 1 million rows for the sake of discussion, but there may be cases where some 20 billion keys sprawl over about 12,000 machines, with each machine holding a different range of keys. However, let's continue with this small cluster consisting of only two nodes. Now, the problem is this: as an external user who has no knowledge of which machine has which portion of the SSTable (and eventually, of the key ranges on each machine), how can a key, say, key489087, be located? For this, we will have to add something like a telephone directory, where we look up the table name and the desired key, and we learn which machine we should contact to get the data associated with that key. So, we are going to add another node, which will be called the master. This master again contains a simple, plain SSTable, which is familiar to us. However, its key-value pairs are very interesting ones. Since this table contains data about the other BigTable tables, let's call it the METADATA table. In the METADATA table, we will adopt the following format for the keys:

tablename_ending-row-key

Since we have only two machines, each holding one tablet of our table, the METADATA table will look like this:

Key | Value
employees_key500000 | 192.168.0.2
employees_key1000000 | 192.168.0.3

The master stores the location of each tablet server against a row key that is the encoding of the table name and the ending row key of the tablet; to route a request, this METADATA table has to be scanned. The master assigns tablets to different machines when required. Each tablet is about 100 MB to 200 MB in size. So, if we want to fetch a key, all we need to know is the following:

The location of the master server
The table in which we are looking for the key
The key itself

Now, we will concatenate the table name with the key and perform a scan on the METADATA table on the master node. Let's suppose that we are looking for key600000 in the employees table. So, we would actually first look for the key employees_key600000 in the table on the master machine. As you are familiar with the scan operation on an SSTable (and METADATA is just an SSTable), we are looking for a key that is greater than or equal to employees_key600000, which happens to be employees_key1000000. From this lookup, the key that we get is employees_key1000000, against which the IP address 192.168.0.3 is listed. This means that this is the machine that we should connect to, to fetch our data.
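Here is a hedged Python sketch of this routing step, reusing the scan idea: find the first METADATA key greater than or equal to tablename_key, and the value is the tablet server to ask. One assumption I am adding for the sketch to work: the row numbers are zero-padded, because with plain byte-wise ordering "key1000000" would sort before "key500000". The IPs and names are the example's, not a real API.

import bisect

# METADATA: "table_ending-row-key" -> tablet server, kept sorted by key.
metadata = [
    ("employees_key0500000", "192.168.0.2"),
    ("employees_key1000000", "192.168.0.3"),
]
meta_keys = [k for k, _ in metadata]

def locate(table, key):
    # First METADATA entry whose ending row key is >= our key.
    i = bisect.bisect_left(meta_keys, "%s_%s" % (table, key))
    if i == len(metadata):
        return None
    return metadata[i][1]

print(locate("employees", "key0600000"))  # 192.168.0.3
print(locate("employees", "key0400000"))  # 192.168.0.2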
We used the word keys, and not key, because this is a range scan operation. This will be clearer with another example. Let's suppose that we want to process rows with keys starting from key400000 to key800000. Now, if you look at the distribution of data across the machines, you'll see that half of the required range is on one machine, while the other half is on the other. In this case, when we consult the METADATA table, two rows will be returned to us, because key400000 is less than key500000 (which is the ending row key for the data on the first machine) and key800000 is less than key1000000, which is the ending row key for the data on the second machine. So, with these two rows returned, we have two locations to fetch our data from.

This leads to an interesting side-effect. As the data resides on two different machines, it can be read or processed in parallel, which improves system performance. This is one reason why, even with larger datasets, the performance of BigTable won't deteriorate as badly as it would if all the data were on a single, large machine.

The datastore thyself

Until this point, everything that we talked about concerned BigTable, and we did not mention datastore at all. Now is the time to look at datastore in detail, because we understand BigTable quite well by now. Datastore is effectively a solution built on top of BigTable as a persistent NoSQL layer for Google App Engine. Since a BigTable installation may have different tables, the data for all the applications is stored in six separate tables, where each table stores a different aspect of, or information about, the data. Don't worry about memorizing things about data modeling and how to use it for now, as this is something that we are going to look into in greater detail later.

The fundamental unit of storage in datastore is called a property. You can think of a property as a column. So, a property has a name and a type. You can group multiple properties into a Kind, which effectively is a Python class and analogous to a table in the RDBMS world. Here's a pseudo-code sample:

# 1. Define our Kind and what it looks like.
class Person(object):
    name = StringProperty()
    age = IntegerProperty()

# 2. Create entities of kind Person.
ali = Person(name='Ali', age=24)
bob = Person(name='Bob', age=34)
david = Person(name='David', age=44)
zain = Person(name='Zain', age=54)

# 3. Save them.
ali.put()
bob.put()
david.put()
zain.put()

This looks a lot like an ORM such as Django's ORM, SQLAlchemy, or Rails' ActiveRecord. The Person class is called a Kind in App Engine's terminology. The StringProperty and IntegerProperty property classes are used to indicate the type of the data that is supposed to be stored. We created instances of the Person class, such as ali; each such instance is called an entity in App Engine's terminology. Each entity, when stored, has a key that is not only unique throughout your application but, combined with your application ID, becomes unique throughout all the applications hosted on Google App Engine.

All entities of all Kinds for all apps are stored in a single BigTable, and they are stored in a way where all the property values are serialized and stored in a single BigTable column. Hence, no separate columns are defined for each property. This is interesting, and required as well, because as Google App Engine's architects we would not know what Kind of data people are going to store or the number and types of properties that they would define, so it makes sense to serialize the whole thing as one and store it in one column. So, this is how it looks:

Key | Kind | Data
agtkZXZ-bWdhZS0wMXIQTXIGUGVyc29uIgNBbGkM | Person | {name: 'Ali', age: 24}
agtkZXZ-bWdhZS0wMXIPCxNTVVyc29uIgNBbGkM | Person | {name: 'Bob', age: 34}
agtkZXZ-bWdhZS0wMXIPCxIGUGVyc29uIgNBbBQM | Person | {name: 'David', age: 44}
agtkZXZ-bWdhZS0wMXIPCxIGUGVyc29uIRJ3bGkM | Person | {name: 'Zain', age: 54}

The key appears to be random, but it is not. A key is formed by concatenating your application ID, your Kind name (Person here), and either a unique identifier that is auto-generated by Google App Engine or a string that is supplied by you.
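To make the single-column layout concrete, here is a small, hedged Python sketch of how an entity might be folded into one key and one serialized value. The key format and the JSON serialization are illustrative stand-ins; datastore's real encoding is a different binary format:

import base64, json

def make_key(app_id, kind, entity_id):
    # Illustrative only: app ID + Kind + ID, wrapped in URL-safe base64.
    raw = "%s|%s|%s" % (app_id, kind, entity_id)
    return base64.urlsafe_b64encode(raw.encode()).decode()

def to_row(app_id, kind, entity_id, properties):
    # Everything about the entity lands in a single serialized column.
    return {
        "key": make_key(app_id, kind, entity_id),
        "kind": kind,
        "data": json.dumps(properties),
    }

row = to_row("myapp", "Person", 1, {"name": "Ali", "age": 24})
print(row["key"])   # bXlhcHB8UGVyc29ufDE=
print(row["data"])  # {"name": "Ali", "age": 24}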
The key seems cryptic, but it is not safe to pass it around in public, as someone might decode it and take advantage of it. Basically, it is just base64 encoded and can easily be decoded to reveal the entity's Kind name and ID. A better way would be to encrypt it using a secret key before passing it around in public; on receiving it back, you decrypt it using the same key. A gist that serves this purpose is available on GitHub at https://gist.github.com/mohsinhijazee/07cdfc2826a565b50a68. However, for it to work, you need to edit your app.yaml file so that it includes the following:

libraries:
- name: pycrypto
  version: latest

Then, you can call the encrypt() method on the key while passing it around and decrypt it back using the decrypt() method, as follows:

person = Person(name='peter', age=10)
key = person.put()
url_safe_key = key.urlsafe()
safe_to_pass_around = encrypt(SECRET_KEY, url_safe_key)

Now, when you have a key from the outside, you should first decrypt it and then use it, as follows:

key_from_outside = request.params.get('key')
url_safe_key = decrypt(SECRET_KEY, key_from_outside)
key = ndb.Key(urlsafe=url_safe_key)
person = key.get()

The key object is now good for use. To summarize, just get the URL-safe key by calling the ndb.Key.urlsafe() method and encrypt it so that it can be passed around. On return, just do the reverse. If you really want to see how the encrypt and decrypt operations are implemented, they are reproduced as follows with only minimal comments, as cryptography is not our main subject:

import os
import base64
from Crypto.Cipher import AES

BLOCK_SIZE = 32
PADDING = '#'

def _pad(data, pad_with=PADDING):
    # Pad data up to a multiple of BLOCK_SIZE with the given character.
    return data + (BLOCK_SIZE - len(data) % BLOCK_SIZE) * pad_with

def encrypt(secret_key, data):
    cipher = AES.new(_pad(secret_key, '@')[:32])
    return base64.b64encode(cipher.encrypt(_pad(data)))

def decrypt(secret_key, encrypted_data):
    cipher = AES.new(_pad(secret_key, '@')[:32])
    return cipher.decrypt(base64.b64decode(encrypted_data)).rstrip(PADDING)

KEY = 'your-key-super-duper-secret-key-here-only-first-32-characters-are-used'
encrypted = encrypt(KEY, 'Hello, world!')
print encrypted
print decrypt(KEY, encrypted)

More explanation on how this works is given at https://gist.github.com/mohsinhijazee/07cdfc2826a565b50a68. Now, let's come back to our main subject, datastore. As you can see, all the data is stored in a single column, and if we want to query something, for instance, people who are older than 25, we have no way to do this. So, how will this work? Let's examine this next.

Supporting queries

Now, what if we want to get information pertaining to all the people who are older than, say, 30? In the current scheme of things, this does not seem doable, because the data is serialized and dumped, as shown in the previous table. Datastore solves this problem by using the sorted property values that are to be queried upon as keys. So here, we want to query by age. Datastore will create a record in another table, called the index table. This index table is nothing but a plain BigTable, where the row keys incorporate the property value that you want to query on. Hence, a scan and a quick lookup is possible.
Here's what it would look like:

Key | Entity key
Myapp-Person-age-24 | agtkZXZ-bWdhZS0wMXIQTXIGUGVyc29uIgNBbGkM
Myapp-Person-age-34 | agtkZXZ-bWdhZS0wMXIPCxNTVVyc29uIgNBbGkM
Myapp-Person-age-44 | agtkZXZ-bWdhZS0wMXIPCxIGUGVyc29uIgNBbBQM
Myapp-Person-age-54 | agtkZXZ-bWdhZS0wMXIPCxIGUGVyc29uIRJ3bGkM

Implementation details

So, all in all, datastore actually builds a NoSQL solution on top of BigTable by using the following six tables:

A table to store entities
A table to store entities by Kind
A table to store indexes for the property values in the ascending order
A table to store indexes for the property values in the descending order
A table to store indexes for multiple properties together
A table to keep track of the next unique ID for each Kind

Let us look at each table in turn. The first table is used to store entities for all the applications. We have examined this in an example.

The second table just stores the Kind names. Nothing fancy here; it's just some metadata that datastore maintains for itself. Think of this: you want to get all the entities that are of the Person Kind. How will you do this? If you look at the entities table alone and the operations that are available to us on a BigTable table, you will see that there's no way to fetch all the entities of a certain Kind. This table does exactly this. It looks like this:

Key | Entity key
Myapp-Person-agtkZXZ-bWdhZS0wMXIQTXIGUGVyc29uIgNBbGkM | agtkZXZ-bWdhZS0wMXIQTXIGUGVyc29uIgNBbGkM
Myapp-Person-agtkZXZ-bWdhZS0wMXIQTXIGUGVyc29uIgNBb854 | agtkZXZ-bWdhZS0wMXIQTXIGUGVyc29uIgNBb854
Myapp-Person-agtkZXZ-bWdhZS0wMXIQTXIGUGVy748IgNBbGkM | agtkZXZ-bWdhZS0wMXIQTXIGUGVy748IgNBbGkM

So, as you can see, this is just a simple BigTable table where the keys follow the [app ID]-[Kind name]-[entity key] pattern.

Tables 3, 4, and 5 from the six tables mentioned in the preceding list are similar to the index table that we examined in the Supporting queries section. This leaves us with the last table. As you know, while storing entities, it is important to have a unique key for each row. Since all the entities from all the apps are stored in a single table, keys should be unique across the whole table. When datastore generates a key for an entity that has to be stored, it combines your application ID and the Kind name of the entity. This part of the key only makes the entity unique across all the other entities in the table, but not within the set of your own entities. For that, you need a number appended to it. This is exactly similar to how AUTO INCREMENT works in the RDBMS world, where the value of a column is automatically incremented to ensure that it is unique. So, that's exactly what the last table is for. It keeps track of the last ID that was used by each Kind of each application, and it looks like this:

Key | Next ID
Myapp-Person | 65

In this table, the key follows the [application ID]-[Kind name] format, and the value is the next ID, which is 65 in this particular case. When a new entity of Kind Person is created, it will be assigned 65 as the ID, and the row will get a new value of 66. Our application has only one Kind defined, which is Person; therefore, there's only one row in this table, because we are only keeping track of the next ID for this one Kind. If we had another Kind, say, Group, it would have its own row in this table.
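To tie the pieces together before we summarize, here is a hedged Python sketch of how such an index table can answer "people older than 30": the query becomes a range scan over index keys, and the matching entity keys would then be fetched from the entities table. The key formats mimic the tables above, and the ages are zero-padded so that byte-wise order matches numeric order; both are assumptions of this sketch, and datastore's real value encoding differs.

import bisect

# Index table: "app-Kind-property-value" -> entity key, sorted by key.
index = [
    ("Myapp-Person-age-024", "key_ali"),
    ("Myapp-Person-age-034", "key_bob"),
    ("Myapp-Person-age-044", "key_david"),
    ("Myapp-Person-age-054", "key_zain"),
]
index_keys = [k for k, _ in index]

def older_than(age):
    start = "Myapp-Person-age-%03d" % (age + 1)
    i = bisect.bisect_left(index_keys, start)
    prefix = "Myapp-Person-age-"
    # Every following key with the same prefix is a match.
    return [v for k, v in index[i:] if k.startswith(prefix)]

print(older_than(30))  # ['key_bob', 'key_david', 'key_zain']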
Summary

We started this article with the problem of storing huge amounts of data, processing it in bulk, and randomly accessing it. This arose from the fact that we were ambitious enough to want to store every single web page on earth and process it to extract some results from it. We introduced a solution called BigTable and examined its data model. We saw that in BigTable, we can define multiple tables, with each table having multiple column families, which are defined at the time of creating the table. We learned that column families are logical groupings of columns and that new columns can be defined in a column family as needed. We also learned that the data stored in BigTable has no meaning of its own; BigTable stores it just as plain bytes, and its interpretation and meaning are up to the user of the data. We also learned that each row in BigTable has a unique row key, which can be up to 64 KB in length.

Lastly, we turned our attention to datastore, a NoSQL storage solution built on top of BigTable for Google App Engine. We briefly covered some datastore terminology: properties (columns), entities (rows), and Kinds (tables). We learned that all data is stored across six different BigTable tables, each capturing a different aspect of the data. Most importantly, we learned that all the entities of all the apps hosted on Google App Engine are stored in a single BigTable, and all properties go into a single BigTable column. We also learned how querying is supported by additional index tables that are keyed by property values and list the corresponding entity keys.

This concludes our discussion on Google App Engine's datastore and its underlying technology, workings, and related concepts. Next, we will learn how to model our data on top of datastore. What we learned in this chapter will help us enormously in understanding how to better model our data to take full advantage of the underlying mechanisms.

Resources for Article:

Further resources on this subject:
Google Guice [article]
The EventBus Class [article]
Integrating Google Play Services [article]


Deploy Node.js Apps with AWS Code Deploy

Ankit Patial
14 Sep 2015
7 min read
As an application developer, you must be familiar with the complexity of deploying apps to a fleet of servers with minimum downtime. AWS introduced a new service called AWS Code Deploy to ease the deployment of applications to EC2 instances on the AWS cloud. Before explaining the full process, I will assume that you are using AWS VPC, that all of your EC2 instances are inside the VPC, and that each instance has an IAM role. Let's see how we can deploy a Node.js application to AWS.

Install the AWS Code Deploy agent

The first thing you need to do is to install aws-codedeploy-agent on each machine that you want your code deployed on. Before installing the agent, please make sure that a trust relationship for codedeploy.us-west-2.amazonaws.com and codedeploy.us-east-1.amazonaws.com is added to the IAM role that the EC2 instance is using. Not sure where that is? Click on the top-left dropdown with your account name in the AWS console and select the Security Credentials option. You will be redirected to a new page; select Roles from the left menu and look for the IAM role that the EC2 instance is using. Click it, scroll to the bottom, and you will see the Edit Trust Relationship button. Click this button to edit the trust relationship, and make sure it looks like the following:

...
"Principal": {
  "Service": [
    "ec2.amazonaws.com",
    "codedeploy.us-west-2.amazonaws.com",
    "codedeploy.us-east-1.amazonaws.com"
  ]
}
...

OK, we are good to install the AWS Code Deploy agent, so make sure ruby2.0 is installed. Use the following script to install the agent:

aws s3 cp s3://aws-codedeploy-us-east-1/latest/install ./install-aws-codedeploy-agent --region us-east-1
chmod +x ./install-aws-codedeploy-agent
sudo ./install-aws-codedeploy-agent auto
rm install-aws-codedeploy-agent

Hopefully nothing will go wrong, and the agent will be installed, up and running. To check whether it is running, try the following command:

sudo service codedeploy-agent status

Let's move to the next step.

Create a Code Deploy application

Log in to your AWS account. Under Deployment & Management, click on the Code Deploy link; on the next screen, click on the Get Started Now button and complete the following things:

Choose Custom Deployment and click the Skip Walkthrough button.
Create New Application; the following are the steps to create an application:
- Application Name: the display name for the application you want to deploy.
- Deployment Group Name: this is something similar to environments, such as LIVE, STAGING, and QA.
- Add Instances: you can choose Amazon EC2 instances by name, group name, and so on. In case you are using the autoscaling feature, you can add that auto scaling group too.
- Deployment Config: this is a way to specify how we want to deploy the application, whether we want to deploy to one server at a time, half of the servers at a time, or all at once.
- Service Role: choose the IAM role that has access to the S3 bucket that we will use to hold code revisions.
- Hit the Create Application button.

OK, we just created a Code Deploy application. Let's pause here and move to our Node.js app to get it ready for deployment.

Code revision

OK, you have written your app and you are ready to deploy it. The most important thing your app needs is an appspec.yml file. This file is used by the Code Deploy agent to perform various steps during the deployment life cycle. In simple words, the deployment process includes the following steps:

Stop the previous application if it is already deployed; on the first deployment, this step will not exist.
Update to the latest code, that is, copy files to the application directory.
Install new packages or run DB migrations.
Start the application.
Check whether the application is working.
Roll back if something went wrong.

All the above steps seem easy, but they are time consuming and painful to perform each time. Let's see how we can perform them easily with AWS Code Deploy.

Let's say we have the following appspec.yml file in our code, and also a bin folder in the app that contains executable sh scripts to perform certain things that I will explain next. First of all, take an example of appspec.yml:

version: 0.0
os: linux
files:
  - source: /
    destination: /home/ec2-user/my-app
permissions:
  - object: /
    pattern: "**"
    owner: ec2-user
    group: ec2-user
hooks:
  ApplicationStop:
    - location: bin/app-stop
      timeout: 10
      runas: ec2-user
  AfterInstall:
    - location: bin/install-pkgs
      timeout: 1200
      runas: ec2-user
  ApplicationStart:
    - location: bin/app-start
      timeout: 60
      runas: ec2-user
  ValidateService:
    - location: bin/app-validate
      timeout: 10
      runas: ec2-user

The files section tells Code Deploy which files to copy and provides a destination for them:

files:
  - source: /
    destination: /home/ec2-user/my-app

We can specify the permissions to be set for the source files on copy:

permissions:
  - object: /
    pattern: "**"
    owner: ec2-user
    group: ec2-user

Hooks are executed in order during the Code Deploy life cycle. We have the ApplicationStop, DownloadBundle, BeforeInstall, Install, AfterInstall, ApplicationStart, and ValidateService hooks, which all have the same syntax:

hooks:
  deployment-lifecycle-event-name:
    - location: script-location
      timeout: timeout-in-seconds
      runas: user-name

Here, location is the relative path from the code root to the script file that you want to execute, timeout is the maximum time a script can run, and runas is the OS user to run the script as; sometimes you may want to run a script with different user privileges.

Let's bundle the app, excluding unwanted files such as the node_modules folder, and zip it. I use the AWS CLI to deploy my code revisions; you can install awscli using pip, the installer from PyPI (the Python Package Index):

sudo pip install awscli

I am using an awscli profile that has access to the S3 code revision bucket in my account. Here is a code sample that can help:

aws --profile simsaw-baas deploy push --no-ignore-hidden-files --application-name MY_CODE_DEPLOY_APP_NAME --s3-location s3://MY_CODE_REVISONS/MY_CODE_DEPLOY_APP_NAME/APP_BUILD --source MY_APP_BUILD.zip

Now the code revision is published to S3, and the same revision is registered with the Code Deploy application named MY_CODE_DEPLOY_APP_NAME (the name of the application you created earlier in the second step). Now go back to the AWS console, Code Deploy.

Deploy Code Revision

Select your Code Deploy application from the application list shown on the Code Deploy dashboard. It will take you to the next window, where you can see the published revision(s). Expand the revision and click on Deploy This Revision. You will be redirected to a new window with options such as application and deployment group. Choose them carefully and hit deploy. Wait for the magic to happen. Code Deploy has another option to deploy your app from GitHub. The process for it will be almost the same, except that you need not push code revisions to S3.
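If you prefer to trigger deployments from a script instead of the console, here is a hedged Python sketch using boto3's CodeDeploy client. The application name, group name, and bucket are the placeholders from above; treat the parameters as a starting point and check the boto3 documentation for your version:

import boto3

client = boto3.client('codedeploy', region_name='us-east-1')

# Trigger a deployment of an S3-hosted revision to a deployment group.
response = client.create_deployment(
    applicationName='MY_CODE_DEPLOY_APP_NAME',
    deploymentGroupName='STAGING',
    revision={
        'revisionType': 'S3',
        's3Location': {
            'bucket': 'MY_CODE_REVISONS',
            'key': 'MY_CODE_DEPLOY_APP_NAME/APP_BUILD',
            'bundleType': 'zip',
        },
    },
)
print(response['deploymentId'])  # poll this ID to track progress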
About the author

Ankit Patial has a Masters in Computer Applications and nine years of experience with custom APIs, and web and desktop applications using .NET technologies, RoR, and Node.js. As a CTO with SimSaw Inc and Pink Hand Technologies, his job is to learn and to help his team implement the best practices of using cloud computing and JavaScript technologies.


Introduction to Spring Web Application in No Time

Packt
10 Sep 2015
8 min read
Many official Spring tutorials have both a Gradle build and a Maven build, so you will find examples easily if you decide to stick with Maven. Spring 4 is fully compatible with Java 8, so it would be a shame not to take advantage of lambdas to simplify our code base. In this article by Geoffroy Warin, author of the book Mastering Spring MVC 4, we will also see some Git commands. It's a good idea to keep track of your progress and commit when you are in a stable state.

(For more resources related to this topic, see here.)

Getting started with Spring Tool Suite

One of the best ways to get started with Spring and discover the numerous tutorials and starter projects that the Spring community offers is to download Spring Tool Suite (STS). STS is a custom version of Eclipse designed to work with various Spring projects, as well as Groovy and Gradle. Even if, like me, you have another IDE that you would rather work with, I recommend that you give STS a shot, because it gives you the opportunity to explore Spring's vast ecosystem in a matter of minutes with the "Getting Started" projects. So, let's visit https://Spring.io/tools/sts/all and download the latest release of STS.

Before we generate our first Spring Boot project, we will need to install the Gradle support for STS. You can find a Manage IDE Extensions button on the dashboard. You will then need to download the Gradle Support software in the Language and framework tooling section. I recommend installing the Groovy Eclipse plugin along with the Groovy 2.4 compiler, as shown in the following screenshot. These will be needed later in this article when we set up acceptance tests with Geb.

We now have two main options to get started. The first option is to navigate to File | New | Spring Starter Project, as shown in the following screenshot. This will give you the same options as http://start.spring.io, embedded in your IDE.

The second way is to navigate to File | New | Import Getting Started Content. This will give you access to all the tutorials available on Spring.io. You will have the choice of working with either Gradle or Maven, as shown in the following screenshot.

You can also check out the starter code to follow along with the tutorial, or get the complete code directly. There is a lot of very interesting content available in the Getting Started Content. It will demonstrate the integration of Spring with various technologies that you might be interested in.

For the moment, we will generate a web project as shown in the preceding image. It will be a Gradle application, producing a JAR file and using Java 8. Here is the configuration we want to use:

Property | Value
Name | masterSpringMvc
Type | Gradle project
Packaging | Jar
Java version | 1.8
Language | Java
Group | masterSpringMvc
Artifact | masterSpringMvc
Version | 0.0.1-SNAPSHOT
Description | Be creative!
Package | masterSpringMvc

On the second screen, you will be asked for the Spring Boot version you want to use and the dependencies that should be added to the project. At the time of writing this, the latest version of Spring Boot was 1.2.5. Ensure that you always check out the latest release. The latest snapshot version of Spring Boot will also be available by the time you read this. If Spring Boot 1.3 isn't released by then, you can probably give it a shot. One of its big features is the awesome dev tools. Refer to https://spring.io/blog/2015/06/17/devtools-in-spring-boot-1-3 for more details.
At the bottom of the configuration window, you will see a number of checkboxes representing the various Spring Boot starter libraries. These are dependencies that can be appended to your build file. They provide autoconfiguration for various Spring projects. We are only interested in Spring MVC for the moment, so we will check only the Web checkbox.

A JAR for a web application?

Some of you might find it odd to package your web application as a JAR file. While it is still possible to use WAR files for packaging, it is not always the recommended practice. By default, Spring Boot will create a fat JAR, which will include all the application's dependencies and provide a convenient way to start a web server using java -jar. Our application will be packaged as a JAR file. If you want to create a WAR file, refer to http://spring.io/guides/gs/convert-jar-to-war/.

Have you clicked on Finish yet? If you have, you should get the following project structure:

We can see our main class, MasterSpringMvcApplication, and its test suite, MasterSpringMvcApplicationTests. There are also two empty folders, static and templates, where we will put our static web assets (images, styles, and so on) and, obviously, our templates (JSP, FreeMarker, Thymeleaf). The next file is an empty application.properties file, which is the default Spring Boot configuration file. It's a very handy file, and we'll see how Spring Boot uses it throughout this article. The last file is build.gradle, the build file that we will detail in a moment.

If you feel ready to go, run the main method of the application. This will launch a web server for us. To do this, go to the main method of the application and navigate to Run as | Spring Application in the toolbar, either by right-clicking on the class or by clicking on the green play button in the toolbar. Doing so and navigating to http://localhost:8080 will produce an error. Don't worry, and read on.

Now I will show you how to generate the same project without STS, and we will come back to all these files.

Getting started with IntelliJ

IntelliJ IDEA is a very popular tool among Java developers. For the past few years, I've been very pleased to pay JetBrains a yearly fee for this awesome editor. IntelliJ also has a way of creating Spring Boot projects very quickly. Go to the new project menu and select the Spring Initializr project type. This will give us exactly the same options as STS. You will need to import the Gradle project into IntelliJ. I recommend generating the Gradle wrapper first (refer to the following Gradle build section). If needed, you can reimport the project by opening its build.gradle file again.

Getting started with start.spring.io

Go to http://start.spring.io to get started with start.spring.io. The system behind this remarkable Bootstrap-like website should be familiar to you! You will see the same options as in STS when you go to the previously mentioned link. Clicking on Generate Project will download a ZIP file containing our starter project.

Getting started with the command line

For those of you who are addicted to the console, it is possible to curl http://start.spring.io. Doing so will display instructions on how to structure your curl request.
For instance, to generate the same project as earlier, you can issue the following command:

$ curl http://start.spring.io/starter.tgz -d name=masterSpringMvc -d dependencies=web -d language=java -d JavaVersion=1.8 -d type=gradle-project -d packageName=masterSpringMvc -d packaging=jar -d baseDir=app | tar -xzvf -

% Total % Received % Xferd Average Speed Time Time Time Current
100 1255 100 1119 100 136 1014 123 0:00:01 0:00:01 --:--:-- 1015
x app/
x app/src/
x app/src/main/
x app/src/main/java/
x app/src/main/java/com/
x app/src/main/java/com/geowarin/
x app/src/main/resources/
x app/src/main/resources/static/
x app/src/main/resources/templates/
x app/src/test/
x app/src/test/java/
x app/src/test/java/com/
x app/src/test/java/com/geowarin/
x app/build.gradle
x app/src/main/java/com/geowarin/AppApplication.java
x app/src/main/resources/application.properties
x app/src/test/java/com/geowarin/AppApplicationTests.java

And voilà! You are now ready to get started with Spring without leaving the console; a dream come true. You might consider creating an alias for the previous command; it will help you prototype Spring applications very quickly.
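The same request can be scripted in Python if that is handier for your tooling. This is a hedged sketch using the requests library with the form parameters from the curl example above (the parameter names follow that example; check start.spring.io for the current API):

import io, tarfile
import requests

params = {
    'name': 'masterSpringMvc',
    'dependencies': 'web',
    'language': 'java',
    'JavaVersion': '1.8',
    'type': 'gradle-project',
    'packageName': 'masterSpringMvc',
    'packaging': 'jar',
    'baseDir': 'app',
}

# POST the form fields, as curl -d does, and unpack the returned archive.
response = requests.post('http://start.spring.io/starter.tgz', data=params)
response.raise_for_status()
with tarfile.open(fileobj=io.BytesIO(response.content), mode='r:gz') as tgz:
    tgz.extractall('.')
print('Project extracted to ./app')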
Summary

In this article, we leveraged Spring Boot's autoconfiguration capabilities to build an application with zero boilerplate or configuration files, and we saw how to set up and configure Spring Tool Suite, IntelliJ, and start.spring.io!

Resources for Article:

Further resources on this subject:
Welcome to the Spring Framework [article]
Mailing with Spring Mail [article]
Creating a Spring Application [article]

The Symfony Framework – Installation and Configuration

Packt
08 Sep 2015
17 min read
 In this article by Wojciech Bancer, author of the book, Symfony2 Essentials, we will learn the basics of Symfony, its installation, configuration, and use. The Symfony framework is currently one of the most popular PHP frameworks existing within the PHP developer's environment. Version 2, which was released a few years ago, has been a great improvement, and in my opinion was one of the key elements for making the PHP ecosystem suitable for larger enterprise projects. The framework version 2.0 not only required the modern PHP version (minimal version required for Symfony is PHP 5.3.8), but also uses state-of-the-art technology — namespaces and anonymous functions. Authors also put a lot of efforts to provide long term support and to minimize changes, which break the compatibility between versions. Also, Symfony forced developers to use a few useful design concepts. The key one, introduced in Symfony, was DependencyInjection. (For more resources related to this topic, see here.) In most cases, the article will refer to the framework as Symfony2. If you want to look over the Internet or Google about this framework, apart from using Symfony keyword you may also try to use the Symfony2 keyword. This was the way recommended some time ago by one of the creators to make searching or referencing to the specific framework version easier in future. Key reasons to choose Symfony2 Symfony2 is recognized in the PHP ecosystem as a very well-written and well-maintained framework. Design patterns that are recommended and forced within the framework allow work to be more efficient in the group, this allows better tests and the creation of reusable code. Symfony's knowledge can also be verified through a certificate system, and this allows its developers to be easily found and be more recognized on the market. Last but not least, the Symfony2 components are used as parts of other projects, for example, look at the following: Drupal phpBB Laravel eZ Publish and more Over time, there is a good chance that you will find the parts of the Symfony2 components within other open source solutions. Bundles and extendable architecture are also some of the key Symfony2 features. They not only allow you to make your work easier through the easy development of reusable code, but also allows you to find smaller or larger pieces of code that you can embed and use within your project to speed up and make your work faster. The standards of Symfony2 also make it easier to catch errors and to write high-quality code; its community is growing every year. The history of Symfony There are many Symfony versions around, and it's good to know the differences between them to learn how the framework was evolving during these years. The first stable Symfony version — 1.0 — was released in the beginning of 2007 and was supported for three years. In mid-2008, version 1.1 was presented, which wasn't compatible with the previous release, and it was difficult to upgrade any old project to this. Symfony 1.2 version was released shortly after this, at the end of 2008. Migrating between these versions was much easier, and there were no dramatic changes in the structure. The final versions of Symfony 1's legacy family was released nearly one year later. Simultaneously, there were two version releases, 1.3 and 1.4. Both were identical, but Symfony 1.4 did not have deprecated features, and it was recommended to start new projects with it. Version 1.4 had 3 years of support. If you look into the code, version 1.x was very different from version 2. 
The company behind Symfony (the French company SensioLabs) made a bold move and decided to rewrite the whole framework from scratch. The first release of Symfony2 wasn't perfect, but it was very promising. It relied on Git submodules (the composer did not exist back then). The 2.1 and 2.2 versions were closer to the one we use now, although migrating between them still required a lot of effort. Finally, Symfony 2.3 was released: the first long-term support version within the 2.x branch. After this version, the changes provided within the next minor versions (2.4, 2.5, and 2.6) were not so drastic and usually did not break compatibility.

This article was written based on the latest stable Symfony version, 2.7.4, and was tested with PHP 5.5. This Symfony version is marked as a so-called long-term support version, and updates for it will be released for three years after the first 2.7 release.

Installation

You don't need a configured web server prior to installing Symfony2. If you have at least PHP version 5.4, you can use the standalone server provided by Symfony2. This server is suitable for development purposes and should not be used in production.

It is strongly recommended to work with a Linux/UNIX system for both the development and production deployment of Symfony2 framework applications. While it is possible to install and operate on a Windows box, due to its different nature, working with Windows can sometimes force you to maintain a separate fragment of code for this system. Even if your primary OS is Windows, it is strongly recommended to configure a Linux system in a virtual environment. Also, there are solutions that will help you automate the whole process. As an example, see the https://www.vagrantup.com/ website.

To install Symfony2, you can use any of the following methods:

Use the new Symfony2 installer script (currently the only officially recommended method). Please note that the installer requires at least PHP 5.4.
Use the composer dependency manager to install a Symfony project.
Download a zip or tgz package and unpack it.

It does not really matter which method you choose, as they all give you similar results.

Installing Symfony2 by using an installer

To install Symfony2 through the installer, go to the Symfony website at http://symfony.com/download and install the Symfony2 installer by issuing the following commands:

$ sudo curl -LsS http://symfony.com/installer -o /usr/local/bin/symfony
$ sudo chmod +x /usr/local/bin/symfony

After this, you can install Symfony by just typing the following command:

$ symfony new <new_project_folder>

To install the Symfony2 framework for a to-do application, execute the following command:

$ symfony new todoapp

This command installs the latest stable Symfony2 version, creates the Symfony2 application in the newly created todoapp folder, and prepares some basic structure for you to work with. After the app creation, you can verify that your local PHP is properly configured for Symfony2 by typing the following command:

$ php app/check.php

If everything goes fine, the script should complete with the following message:

[OK] Your system is ready to run Symfony projects

Symfony2 is equipped with a standalone server, which makes development easier. If you want to run it, type the following command:

$ php app/console server:run

If everything went all right, you will see a message that your server is working on the IP 127.0.0.1 and port 8000.
If there is an error, make sure you are not running anything else that is listening on port 8000. It is also possible to run the server on a different port or IP if you have such a requirement, by adding the address and port as a parameter:

$ php app/console server:run 127.0.0.1:8080

If everything works, you can now visit http://127.0.0.1:8000/ and you will see Symfony's welcome page. This page presents you with a nice welcome message and useful documentation links.

The Symfony2 directory structure

Let's dive into the initial directory structure within a typical Symfony application:

app
bin
src
vendor
web

While Symfony2 is very flexible in terms of directory structure, it is recommended to keep this basic structure. The following table describes each directory's purpose:

Directory | Used for
app | This holds the general configuration, routing, security configuration, database parameters, and many others. It is also the recommended place for putting new view files. This directory is a starting point.
bin | This holds some helper executables. It is not really important during the development process, and is rarely modified.
src | This directory holds the project's PHP code (usually your bundles).
vendor | These are the third-party libraries used within the project. Usually, this directory contains all the open source third-party bundles, libraries, and other resources. It's worth mentioning that it's recommended to keep the files within this directory outside the versioning system, and you should not modify them under any circumstances. Fortunately, there are ways to modify the code if it suits your needs more; this will be demonstrated when we implement user management within our to-do application.
web | This is the directory that is accessible through the web server. It holds the main entry point to the application (usually the app.php and app_dev.php files), CSS files, JavaScript files, and all the files that need to be available through the web server (user-uploadable files).

So, in most cases, you will usually be modifying and creating PHP files within the src/ directory, view and configuration files within the app/ directory, and JS/CSS files within the web/ directory. The main directory also holds a few files, as follows:

.gitignore
README.md
composer.json
composer.lock

The .gitignore file's purpose is to provide some preconfigured settings for the Git repository, while the composer.json and composer.lock files are used by the composer dependency manager.

What is a bundle?

Within a Symfony2 application, you will come across the term "bundle" quite often. A bundle is something similar to a plugin. It can hold literally any code: controllers, views, models, and services. A bundle can integrate other non-Symfony2 libraries and hold some JavaScript/CSS code as well. We can say that almost everything is a bundle in Symfony2; even some of the core framework features together form a bundle. A bundle usually implements a single feature or functionality. The code you are writing when you write a Symfony2 application is also a bundle.

There are two types of bundles. The first kind is the bundle you write within the application, which is project-specific and not reusable. For this purpose, there is a special bundle called AppBundle created for you when you install the Symfony2 project.
Also, there are reusable bundles that are shared across various projects, either written by you or your team, or provided by third-party vendors. Your own bundles are usually stored within the src/ directory, while the third-party bundles sit within the vendor/ directory. The vendor directory is used to store third-party libraries and is managed by the composer. As such, it should never be modified by you. There are many reusable open source bundles that help you implement various features within an application. You can find bundles to help you with user management, writing RESTful APIs, making better documentation, connecting to Facebook and AWS, and even generating a whole admin panel. There are tons of bundles, and every day brings new ones. If you want to explore open source bundles and look around at what's available, I recommend starting with the http://knpbundles.com/ website. The bundle name is correlated with the PHP namespace. As such, it needs to follow some technical rules, and it needs to end with the Bundle suffix. A few examples of correct names are AppBundle, AcmeDemoBundle, CompanyBlogBundle, and CompanySocialForumBundle. Composer Symfony2 is built from components, and it would be very difficult to manage the dependencies between them and the framework without a dependency manager. To make installing and managing these components easier, Symfony2 uses a manager called composer. You can get it from the https://getcomposer.org/ website. The composer makes it easy to install and check all dependencies, download them, and integrate them into your work. If you want to find additional packages that can be installed with the composer, you should visit https://packagist.org/. This site is the main composer repository and contains information about most of the packages that are installable with the composer. To install the composer, go to https://getcomposer.org/download/ and follow the download instructions. They should be similar to the following: $ curl -sS https://getcomposer.org/installer | php If the download was successful, you should see the composer.phar file in your directory. Move this to the project location, in the same place where you have the composer.json and composer.lock files. You can also install it globally, if you prefer, with these two commands: $ curl -sS https://getcomposer.org/installer | php $ sudo mv composer.phar /usr/local/bin/composer You will usually need only three composer commands: require, install, and update. The require command is executed when you need to add a new dependency. The install command is used to install the package. The update command is used when you need to fetch the latest versions of your dependencies, as specified within the JSON file. The difference between install and update is subtle, but very important. If you execute the update command, your composer.lock file gets updated with the version of the code you just fetched and downloaded. The install command uses the information stored in the composer.lock file and fetches the versions recorded there. When should you use install? For example, if you deploy the code to a server, you should use install rather than update, as it will deploy the version of the code stored in composer.lock rather than download the latest version (which may be untested by you). Also, if you work in a team and you just got an update through Git, you should use install to fetch the vendor code updated by other developers.
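As an illustration, a typical sequence after pulling changes from a shared repository might look like the following sketch (it assumes composer.phar sits in the project root, as described earlier):

$ git pull
$ php composer.phar install

This installs exactly the versions recorded in composer.lock, so every team member and every server ends up with the same vendor code.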
You should use the update command if you want to check whether there are updated versions of the packages you have installed; if, for example, a new minor version of Symfony2 is released, the update command will fetch it. As an example, let's install one extra package for user management called FOSUserBundle (FOS stands for Friends of Symfony). We will only install it here; we will not configure it. To install FOSUserBundle, we need to know the correct package name and version. The easiest way is to search the Packagist site at https://packagist.org/ for the package. If you type fosuserbundle, the search should return a package called friendsofsymfony/user-bundle as one of the top results. The download counts visible on the right-hand side might also be helpful in determining how popular the bundle is. If you click on this, you will end up on a page with detailed information about that bundle, such as the homepage, versions, and requirements of the package. Type the following command: $ php composer.phar require friendsofsymfony/user-bundle ^1.3 Using version ^1.3 for friendsofsymfony/user-bundle ./composer.json has been updated Loading composer repositories with package information Updating dependencies (including require-dev) - Installing friendsofsymfony/user-bundle (v1.3.6) Loading from cache friendsofsymfony/user-bundle suggests installing willdurand/propel-typehintable-behavior (Needed when using the propel implementation) Writing lock file Generating autoload files ... Which version of the package you choose is up to you. If you are interested in package versioning standards, see the composer website at https://getcomposer.org/doc/01-basic-usage.md#package-versions to get more information. The composer holds all the configurable information about dependencies and where to install them in a special JSON file called composer.json. Let's take a look at this: { "name": "wbancer/todoapp", "license": "proprietary", "type": "project", "autoload": { "psr-0": { "": "src/", "SymfonyStandard": "app/SymfonyStandard/" } }, "require": { "php": ">=5.3.9", "symfony/symfony": "2.7.*", "doctrine/orm": "~2.2,>=2.2.3,<2.5", // [...] "incenteev/composer-parameter-handler": "~2.0", "friendsofsymfony/user-bundle": "^1.3" }, "require-dev": { "sensio/generator-bundle": "~2.3" }, "scripts": { "post-root-package-install": [ "SymfonyStandard\\Composer::hookRootPackageInstall" ], "post-install-cmd": [ // post installation steps ], "post-update-cmd": [ // post update steps ] }, "config": { "bin-dir": "bin" }, "extra": { // [...] } } The most important section is the one with the require key. It holds all the information about the packages we want to use within the project. The scripts key contains a set of instructions to run post-install and post-update. The extra key in this case contains some settings specific to the Symfony2 framework. Note that one of the values in here points to the parameters.yml file. This file is the main file holding the custom machine-specific parameters. The meaning of the other keys is rather obvious. If you look into the vendor/ directory, you will notice that our package has been installed in the vendor/friendsofsymfony/user-bundle directory. The configuration files Each application needs to hold some global and machine-specific parameters and configuration.
Symfony2 holds configuration within the app/config directory, split into a few files as follows: config.yml config_dev.yml config_prod.yml config_test.yml parameters.yml parameters.yml.dist routing.yml routing_dev.yml security.yml services.yml All the files except the parameters.yml* files contain global configuration, while the parameters.yml file holds machine-specific information such as the database host, database name, user, password, and SMTP configuration. The default parameters file generated by the symfony new command will be similar to the following one. This file is auto-generated during composer install: parameters: database_driver: pdo_mysql database_host: 127.0.0.1 database_port: null database_name: symfony database_user: root database_password: null mailer_transport: smtp mailer_host: 127.0.0.1 mailer_user: null mailer_password: null secret: 93b0eebeffd9e229701f74597e10f8ecf4d94d7f As you can see, it mostly holds parameters related to the database and SMTP, as well as the secret key that is used internally by Symfony2. Here, you can add your custom parameters using the same syntax. It is a good practice to keep machine-specific data such as passwords, tokens, api-keys, and access keys within this file only. Putting passwords in the general config.yml file is considered a security risk. Alongside the global configuration file (config.yml), there are files called routing*.yml that contain the routing information for the development and production configurations. The security.yml file holds information related to authentication and securing access to the application. Note that some files contain information for the development, production, or test mode. You can define the mode both when you run Symfony through the command-line console and when you run it through the web server. In most cases, while developing, you will be using the dev mode. The Symfony2 console To finish, let's take a look at the Symfony console script. We used it before to fire up the development server, but it offers more. Execute the following: $ php app/console You will see a list of supported commands, each with a short description. Each of the standard commands comes with help, so I will not describe each of them here, but it is worth mentioning a few commonly used ones: app/console cache:clear: Symfony in production uses a lot of caching. Therefore, if you need to change values within a template (twig) or within configuration files while in production mode, you will need to clear the cache. Cache is also one of the reasons why it's worth working in the development mode. app/console container:debug: This displays all configured public services. app/console router:debug: This displays all routing configuration along with the method, scheme, host, and path. app/console security:check: This checks your composer packages against known security vulnerabilities. You should run this command regularly. Summary In this article, we demonstrated how to use the Symfony2 installer, test the configuration, run the development server, and play around with the Symfony2 command line. We also installed the composer and learned how to install a package using it. To demonstrate how Symfony2 enables you to make web applications faster, we will learn through examples that can be found in real life. To make this task easier, we will produce a real to-do web application with a modern look and a few working features.
In case you are interested in other Symfony books that Packt has in store for you, here they are: Symfony 1.3 Web Application Development, Tim Bowler, Wojciech Bancer Extending Symfony2 Web Application Framework, Sébastien Armand Resources for Article: Further resources on this subject: A Command-line Companion Called Artisan [article] Creating and Using Composer Packages [article] Services [article]

Packt
07 Sep 2015
14 min read

Working with Charts

In this article by Anand Dayalan, the author of the book Ext JS 6 By Example, he explores the different types of chart components in Ext JS and ends with a sample project called the Expense Analyzer. The following topics will be covered: Chart types Bar and column charts Area and line charts Pie charts 3D charts The expense analyzer – a sample project (For more resources related to this topic, see here.) Charts Ext JS is almost like a one-stop shop for all your JavaScript framework needs. Yes, Ext JS also includes charts, along with all the other rich components you have learned about so far. Chart types There are three types of charts: cartesian, polar, and spacefilling. The cartesian chart Ext.chart.CartesianChart (xtype: cartesian or chart) A cartesian chart has two directions: X and Y. By default, X is horizontal and Y is vertical. Charts that use the cartesian coordinates are column, bar, area, line, and scatter. The polar chart Ext.chart.PolarChart (xtype: polar) These charts have two axes: angular and radial. Charts that plot values using the polar coordinates are pie and radar. The spacefilling chart Ext.chart.SpaceFillingChart (xtype: spacefilling) These charts fill the complete area of the chart. Bar and column charts For bar and column charts, at the minimum, you need to provide a store, axes, and series. The basic column chart Let's start with a simple basic column chart. First, let's create a simple store with inline hardcoded data as follows: Ext.define('MyApp.model.Population', { extend: 'Ext.data.Model', fields: ['year', 'population'] }); Ext.define('MyApp.store.Population', { extend: 'Ext.data.Store', storeId: 'population', model: 'MyApp.model.Population', data: [ { "year": "1610","population": 350 }, { "year": "1650","population": 50368 }, { "year": "1700", "population": 250888 }, { "year": "1750","population": 1170760 }, { "year": "1800","population": 5308483 }, { "year": "1900","population": 76212168 }, { "year": "1950","population": 151325798 }, { "year": "2000","population": 281421906 }, { "year": "2010","population": 308745538 }, ] }); var store = Ext.create("MyApp.store.Population"); Now, let's create the chart using Ext.chart.CartesianChart (xtype: cartesian or chart) and use the store created above. Ext.create('Ext.Container', { renderTo: Ext.getBody(), width: 500, height: 500, layout: 'fit', items: [{ xtype: 'chart', insetPadding: { top: 60, bottom: 20, left: 20, right: 40 }, store: store, axes: [{ type: 'numeric', position: 'left', grid: true, title: { text: 'Population in Millions', fontSize: 16 }, }, { type: 'category', title: { text: 'Year', fontSize: 16 }, position: 'bottom', } ], series: [{ type: 'bar', xField: 'year', yField: ['population'] }], sprites: { type: 'text', text: 'United States Population', font: '25px Helvetica', width: 120, height: 35, x: 100, y: 40 } } ] }); Important things to note in the preceding code are axes, series, and sprites. Axes can be one of three types: numeric, time, and category. In series, you can see that the type is set to bar. Ext JS uses the bar series type to render both column and bar charts; if you want a horizontal bar chart, you also have to set flipXY to true in the chart config. The sprites config used here is quite straightforward; it is optional, not a must. The grid property can be specified for both axes, although we have specified it only for one axis here. The insetPadding is used to specify the padding of the chart so that there is room to render other information, such as the title.
If we don't specify insetPadding, the title and other information may overlap with the chart. The output of the preceding code is shown here: The bar chart As mentioned before, in order to get the bar chart, you can just use the same code, but specify flipXY as true and change the positions of the axes accordingly, as shown in the following code: Ext.create('Ext.Container', { renderTo: Ext.getBody(), width: 500, height: 500, layout: 'fit', items: [{ xtype: 'chart', flipXY: true, insetPadding: { top: 60, bottom: 20, left: 20, right: 40 }, store: store, axes: [{ type: 'numeric', position: 'bottom', grid: true, title: { text: 'Population in Millions', fontSize: 16 }, }, { type: 'category', title: { text: 'Year', fontSize: 16 }, position: 'left', } ], series: [{ type: 'bar', xField: 'year', yField: ['population'] } ], sprites: { type: 'text', text: 'United States Population', font: '25px Helvetica', width: 120, height: 35, x: 100, y: 40 } } ] }); The output of the preceding code is shown in the following screenshot: The stacked chart Now, let's say you want to plot two values in each category in the column chart. You can either stack them or have two columns for each category. Let's modify our column chart example to render a stacked column chart. For this, we need an additional numeric field in the store, and we need to specify two fields for yField in the series. You can stack more than two fields, but for this example, we will stack only two. Take a look at the following code: Ext.define('MyApp.model.Population', { extend: 'Ext.data.Model', fields: ['year', 'total','slaves'] }); Ext.define('MyApp.store.Population', { extend: 'Ext.data.Store', storeId: 'population', model: 'MyApp.model.Population', data: [ { "year": "1790", "total": 3.9, "slaves": 0.7 }, { "year": "1800", "total": 5.3, "slaves": 0.9 }, { "year": "1810", "total": 7.2, "slaves": 1.2 }, { "year": "1820", "total": 9.6, "slaves": 1.5 }, { "year": "1830", "total": 12.9, "slaves": 2 }, { "year": "1840", "total": 17, "slaves": 2.5 }, { "year": "1850", "total": 23.2, "slaves": 3.2 }, { "year": "1860", "total": 31.4, "slaves": 4 }, ] }); var store = Ext.create("MyApp.store.Population"); Ext.create('Ext.Container', { renderTo: Ext.getBody(), width: 500, height: 500, layout: 'fit', items: [{ xtype: 'cartesian', store: store, insetPadding: { top: 60, bottom: 20, left: 20, right: 40 }, axes: [{ type: 'numeric', position: 'left', grid: true, title: { text: 'Population in Millions', fontSize: 16 }, }, { type: 'category', title: { text: 'Year', fontSize: 16 }, position: 'bottom', } ], series: [{ type: 'bar', xField: 'year', yField: ['total','slaves'] } ], sprites: { type: 'text', text: 'United States Slaves Distribution 1790 to 1860', font: '20px Helvetica', width: 120, height: 35, x: 60, y: 40 } } ] }); The output of the stacked column chart is shown here: If you want to render multiple fields without stacking, you can simply set the stacked property of the series to false to get the following output: There are many options available in the chart.
Let's take a look at some of the commonly used options: tooltip: This can be added easily by setting a tooltip property in the series. legend: This can be rendered on any of the four sides of the chart by specifying the legend config. sprites: This can be an array if you want to specify multiple pieces of information, such as a header, footer, and so on. Here is the code for the same store configured with some advanced options: Ext.create('Ext.Container', { renderTo: Ext.getBody(), width: 500, height: 500, layout: 'fit', items: [{ xtype: 'chart', legend: { docked: 'bottom' }, insetPadding: { top: 60, bottom: 20, left: 20, right: 40 }, store: store, axes: [{ type: 'numeric', position: 'left', grid: true, title: { text: 'Population in Millions', fontSize: 16 }, minimum: 0, }, { type: 'category', title: { text: 'Year', fontSize: 16 }, position: 'bottom', } ], series: [{ type: 'bar', xField: 'year', stacked: false, title: ['Total', 'Slaves'], yField: ['total', 'slaves'], tooltip: { trackMouse: true, style: 'background: #fff', renderer: function (storeItem, item) { this.setHtml('In ' + storeItem.get('year') + ' ' + item.field + ' population was ' + storeItem.get(item.field) + ' m'); } } } ], sprites: [{ type: 'text', text: 'United States Slaves Distribution 1790 to 1860', font: '20px Helvetica', width: 120, height: 35, x: 60, y: 40 }, { type: 'text', text: 'Source: http://www.wikipedia.org', fontSize: 10, x: 12, y: 440 }] }] }); The output with tooltip, legend, and footer is shown here: The 3D bar chart If you simply change the type of the series to bar3d instead of bar, you'll get the 3D column chart, as shown in the following screenshot: Area and line charts Area and line charts are also cartesian charts. The area chart To render an area chart, simply replace the series in the previous example with the following code: series: [{ type: 'area', xField: 'year', stacked: false, title: ['Total','slaves'], yField: ['total', 'slaves'], style: { stroke: "#94ae0a", fillOpacity: 0.6, } }] The output of the preceding code is shown here: Similar to the stacked column chart, you can have a stacked area chart as well by setting stacked to true in the series. If you set stacked to true in the preceding example, you'll get the following output:  Figure 7.1 The line chart To get the line chart shown in Figure 7.1, use the following series config in the preceding example instead: series: [{ type: 'line', xField: 'year', title: ['Total'], yField: ['total'] }, { type: 'line', xField: 'year', title: ['Slaves'], yField: ['slaves'] }], The pie chart This is one of the most frequently used charts in many applications and reporting tools. Ext.chart.PolarChart (xtype: polar) should be used to render a pie chart.
The basic pie chart Specify the type as pie, and specify the angleField and label to render a basic pie chart, as shown in the following code: Ext.define('MyApp.store.Expense', { extend: 'Ext.data.Store', alias: 'store.expense', fields: [ 'cat', 'spent'], data: [ { "cat": "Restaurant", "spent": 100}, { "cat": "Travel", "spent": 150}, { "cat": "Insurance", "spent": 500}, { "cat": "Rent", "spent": 1000}, { "cat": "Groceries", "spent": 400}, { "cat": "Utilities", "spent": 300}, ] }); var store = Ext.create("MyApp.store.Expense"); Ext.create('Ext.Container', { renderTo: Ext.getBody(), width: 600, height: 500, layout: 'fit', items: [{ xtype: 'polar', legend: { docked: 'bottom' }, insetPadding: { top: 100, bottom: 20, left: 20, right: 40 }, store: store, series: [{ type: 'pie', angleField: 'spent', label: { field: 'cat', }, tooltip: { trackMouse: true, renderer: function (storeItem, item) { var value = ((parseFloat(storeItem.get('spent') / storeItem.store.sum('spent')) * 100.0).toFixed(2)); this.setHtml(storeItem.get('cat') + ': ' + value + '%'); } } }] }] }); The donut chart Just by setting the donut property of the series in the preceding example to 40, you'll get the following chart. Here, donut is the percentage of the radius of the hole compared to the entire disk: The 3D pie chart Ext JS 6 introduced some improvements to the 3D pie chart: it now supports labels and configurable 3D aspects, such as thickness, distortion, and so on. Let's use the same model and store that were used in the pie chart example and create a 3D pie chart as follows: Ext.create('Ext.Container', { renderTo: Ext.getBody(), width: 600, height: 500, layout: 'fit', items: [{ xtype: 'polar', legend: { docked: 'bottom' }, insetPadding: { top: 100, bottom: 20, left: 80, right: 80 }, store: store, series: [{ type: 'pie3d', donut: 50, thickness: 70, distortion: 0.5, angleField: 'spent', label: { field: 'cat' }, tooltip: { trackMouse: true, renderer: function (storeItem, item) { var value = ((parseFloat(storeItem.get('spent') / storeItem.store.sum('spent')) * 100.0).toFixed(2)); this.setHtml(storeItem.get('cat') + ': ' + value + '%'); } } }] }] }); The following image shows the output of the preceding code: The expense analyzer – a sample project Now that you have learned about the different kinds of charts available in Ext JS, let's use them to create a sample project called the Expense Analyzer. The following screenshot shows the design of this sample project: Let's use Sencha Cmd to scaffold our application. Run the following command in the terminal or command window: sencha -sdk <path to SDK>/ext-6.0.0.415/ generate app EA ./expense-analyzer Now, let's remove all the unwanted files and code and add some additional files to create this project. The final folder structure and some of the important files are shown in the following Figure 7.2: The complete source code is not given in this article; only some of the important files are shown, and in between, some less important code has been truncated. The complete source is available at https://github.com/ananddayalan/extjs-by-example-expense-analyzer.  Figure 7.2 Now, let's create the grid shown in the design. The following code is used to create the grid.
This List view extends from Ext.grid.Panel, uses the expense store for its data, and has three columns: Ext.define('EA.view.main.List', { extend: 'Ext.grid.Panel', xtype: 'mainlist', maxHeight: 400, requires: [ 'EA.store.Expense'], title: 'Year to date expense by category', store: { type: 'expense' }, columns: { defaults: { flex:1 }, items: [{ text: 'Category', dataIndex: 'cat' }, { formatter: "date('F')", text: 'Month', dataIndex: 'date' }, { text: 'Spent', dataIndex: 'spent' }] } }); I have not used pagination here. The maxHeight config is used to limit the height of the grid, and it also enables the scroll bar, because we have more records than fit within the given maximum height. The following code creates the expense store used in the preceding example. This is a simple store with inline data. Here, we have not created a separate model; the fields are added directly to the store: Ext.define('EA.store.Expense', { extend: 'Ext.data.Store', alias: 'store.expense', storeId: 'expense', fields: [{ name:'date', type: 'date' }, 'cat', 'spent' ], data: { items: [ { "date": "1/1/2015", "cat": "Restaurant", "spent": 100 }, { "date": "1/1/2015", "cat": "Travel", "spent": 22 }, { "date": "1/1/2015", "cat": "Insurance", "spent": 343 }, // Truncated code ]}, proxy: { type: 'memory', reader: { type: 'json', rootProperty: 'items' } } }); Next, let's create the bar chart shown in the design. In the bar chart, we will use another store called expensebyMonthStore, which we'll populate with data from the expense store. The following 3D bar chart has two types of axes: numeric and category. We have used the month part of the date field as the category, and a renderer is used to display that month part: Ext.define('EA.view.main.Bar', { extend: 'Ext.chart.CartesianChart', requires: ['Ext.chart.axis.Category', 'Ext.chart.series.Bar3D', 'Ext.chart.axis.Numeric', 'Ext.chart.interactions.ItemHighlight'], xtype: 'mainbar', height: 500, padding: { top: 50, bottom: 20, left: 100, right: 100 }, legend: { docked: 'bottom' }, insetPadding: { top: 100, bottom: 20, left: 20, right: 40 }, store: { type: 'expensebyMonthStore' }, axes: [{ type: 'numeric', position: 'left', grid: true, minimum: 0, title: { text: 'Spendings in $', fontSize: 16 }, }, { type: 'category', position: 'bottom', title: { text: 'Month', fontSize: 16 }, label: { font: 'bold Arial', rotate: { degrees: 300 } }, renderer: function (date) { return ["Jan", "Feb", "Mar", "Apr", "May"][date.getMonth()]; } } ], series: [{ type: 'bar3d', xField: 'date', stacked: false, title: ['Total'], yField: ['total'] }], sprites: [{ type: 'text', text: 'Expense by Month', font: '20px Helvetica', width: 120, height: 35, x: 60, y: 40 }] }); Now, let's create the MyApp.model.ExpensebyMonth store used in the preceding bar chart view. This store will hold the total amount spent in each month. This data is populated by grouping the expense store on the date field.
Take a look at how the data property is configured to populate the data: Ext.define('MyApp.model.ExpensebyMonth', { extend: 'Ext.data.Model', fields: [{name:'date', type: 'date'}, 'total'] }); Ext.define('MyApp.store.ExpensebyMonth', { extend: 'Ext.data.Store', alias: 'store.expensebyMonthStore', model: 'MyApp.model.ExpensebyMonth', data: (function () { var data = []; var expense = Ext.createByAlias('store.expense'); expense.group('date'); var groups = expense.getGroups(); groups.each(function (group) { data.push({ date: group.config.groupKey, total: group.sum('spent') }); }); return data; })() }); Then, the following code is used to generate the pie chart. This chart uses the expense store, but only shows one selected month of data at a time. A drop-down box is added to the main view to select the month. The beforerender listener is used to filter the expense store so that it shows only the data for the month of January on load: Ext.define('EA.view.main.Pie', { extend: 'Ext.chart.PolarChart', requires: ['Ext.chart.series.Pie3D'], xtype: 'mainpie', height: 800, legend: { docked: 'bottom' }, insetPadding: { top: 100, bottom: 20, left: 80, right: 80 }, listeners: { beforerender: function () { var dateFiter = new Ext.util.Filter({ filterFn: function(item) { return item.data.date.getMonth() == 0; } }); Ext.getStore('expense').addFilter(dateFiter); } }, store: { type: 'expense' }, series: [{ type: 'pie3d', donut: 50, thickness: 70, distortion: 0.5, angleField: 'spent', label: { field: 'cat', } }] }); So far, we have created the grid, the bar chart, the pie chart, and the stores required for this sample application. Now, we need to link them together in the main view. The following code shows the main view from the classic toolkit. The main view is simply a tab control and specifies which view to render for each tab; a possible sketch of the onMonthSelect handler it references is shown after the resources list below: Ext.define('EA.view.main.Main', { extend: 'Ext.tab.Panel', xtype: 'app-main', requires: [ 'Ext.plugin.Viewport', 'Ext.window.MessageBox', 'EA.view.main.MainController', 'EA.view.main.List', 'EA.view.main.Bar', 'EA.view.main.Pie' ], controller: 'main', autoScroll: true, ui: 'navigation', // Truncated code items: [{ title: 'Year to Date', iconCls: 'fa-bar-chart', items: [ { html: '<h3>Your average expense per month is: ' + Ext.createByAlias('store.expensebyMonthStore').average('total') + '</h3>', height: 70, }, { xtype: 'mainlist'}, { xtype: 'mainbar' } ] }, { title: 'By Month', iconCls: 'fa-pie-chart', items: [{ xtype: 'combo', value: 'Jan', fieldLabel: 'Select Month', store: ['Jan', 'Feb', 'Mar', 'Apr', 'May'], listeners: { select: 'onMonthSelect' } }, { xtype: 'mainpie' }] }] }); Summary In this article, we looked at the different kinds of charts available in Ext JS. We also created a simple sample project called the Expense Analyzer and used some of the concepts you learned in this article. Resources for Article: Further resources on this subject: Ext JS 5 – an Introduction [article] Constructing Common UI Widgets [article] Static Data Management [article]
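For completeness, here is one possible implementation of the onMonthSelect handler wired up in the combo above. The article does not show the body of EA.view.main.MainController, so treat this as a sketch: the controller alias and the expense store id follow the sample project, but the filtering logic is an assumption modeled on the pie view's beforerender listener.

Ext.define('EA.view.main.MainController', {
    extend: 'Ext.app.ViewController',
    alias: 'controller.main',

    // Replaces the current filter on the expense store so that the pie
    // chart only shows records belonging to the month picked in the combo.
    onMonthSelect: function (combo, record) {
        var monthIndex = combo.getStore().indexOf(record);
        var store = Ext.getStore('expense');
        store.clearFilter();
        store.addFilter(new Ext.util.Filter({
            filterFn: function (item) {
                return item.data.date.getMonth() === monthIndex;
            }
        }));
    }
});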

Packt
03 Sep 2015
31 min read

Understanding TDD

In this article by Viktor Farcic and Alex Garcia, the authors of the book Test-Driven Java Development, we will go through TDD: a simple procedure of writing tests before the actual implementation. It's an inversion of the traditional approach where testing is performed after the code is written. (For more resources related to this topic, see here.) Red-green-refactor Test-driven development is a process that relies on the repetition of a very short development cycle. It is based on the test-first concept of extreme programming (XP) that encourages simple design with a high level of confidence. The procedure that drives this cycle is called red-green-refactor. The procedure itself is simple and it consists of a few steps that are repeated over and over again: Write a test. Run all tests. Write the implementation code. Run all tests. Refactor. Run all tests. Since a test is written before the actual implementation, it is supposed to fail. If it doesn't, the test is wrong. It describes something that already exists or it was written incorrectly. Being in the green state while writing tests is a sign of a false positive. Tests like these should be removed or refactored. While writing tests, we are in the red state. When the implementation of a test is finished, all tests should pass and then we will be in the green state. If the last test failed, the implementation is wrong and should be corrected. Either the test we just finished is incorrect or the implementation of that test did not meet the specification we had set. If any but the last test failed, we broke something and the changes should be reverted. When this happens, the natural reaction is to spend as much time as needed to fix the code so that all tests are passing. However, this is wrong. If a fix is not done in a matter of minutes, the best thing to do is to revert the changes. After all, everything worked not long ago. An implementation that broke something is obviously wrong, so why not go back to where we started and think again about the correct way to implement the test? That way, we waste only minutes on a wrong implementation instead of much more time correcting something that was not done right in the first place. Existing test coverage (excluding the implementation of the last test) should be sacred. We change the existing code through intentional refactoring, not as a way to fix recently written code. Do not make the implementation of the last test final, but provide just enough code for this test to pass. Write the code in any way you want, but do it fast. Once everything is green, we have confidence that there is a safety net in the form of tests. From this moment on, we can proceed to refactor the code. This means that we are making the code better and more optimal without introducing new features. While refactoring is in place, all tests should be passing all the time. If, while refactoring, one of the tests fails, the refactoring broke an existing functionality and, as before, the changes should be reverted. At this stage, not only are we not changing any features, but we are also not introducing any new tests. All we're doing is making the code better while continuously running all tests to make sure that nothing got broken. At the same time, we're proving code correctness and cutting down on future maintenance costs. Once refactoring is finished, the process is repeated. It's an endless loop of a very short cycle. Speed is the key Imagine a game of ping pong (or table tennis).
The game is very fast; sometimes it is hard to even follow the ball when professionals play the game. TDD is very similar. TDD veterans tend not to spend more than a minute on either side of the table (test and implementation). Write a short test and run all tests (ping), write the implementation and run all tests (pong), write another test (ping), write the implementation of that test (pong), refactor and confirm that all tests are passing (score), and then repeat: ping, pong, ping, pong, ping, pong, score, serve again. Do not try to write the perfect code. Instead, try to keep the ball rolling until you think that the time is right to score (refactor). Time between switching from tests to implementation (and vice versa) should be measured in minutes (if not seconds). It's not about testing The T in TDD is often misunderstood. Test-driven development is the way we approach the design. It is the way to force us to think about the implementation and about what the code needs to do before writing it. It is the way to focus on requirements and the implementation of just one thing at a time: organize your thoughts and better structure the code. This does not mean that tests resulting from TDD are useless; it is far from that. They are very useful and they allow us to develop with great speed without being afraid that something will be broken. This is especially true when refactoring takes place. Being able to reorganize the code while having the confidence that no functionality is broken is a huge boost to its quality. The main objective of test-driven development is testable code design, with tests as a very useful side product. Testing Even though the main objective of test-driven development is the approach to code design, tests are still a very important aspect of TDD, and we should have a clear understanding of the two major groups of techniques as follows: Black-box testing White-box testing The black-box testing Black-box testing (also known as functional testing) treats the software under test as a black box without knowing its internals. Tests use software interfaces and try to ensure that they work as expected. As long as the functionality of the interfaces remains unchanged, tests should pass even if the internals are changed. The tester is aware of what the program should do, but does not have the knowledge of how it does it. Black-box testing is the most commonly used type of testing in traditional organizations that have testers as a separate department, especially when they are not proficient in coding and have difficulties understanding it. This technique provides an external perspective on the software under test. Some of the advantages of black-box testing are as follows: Efficient for large segments of code Code access, understanding the code, and the ability to code are not required Separation between the user's and developer's perspectives Some of the disadvantages of black-box testing are as follows: Limited coverage, since only a fraction of test scenarios is performed Inefficient testing due to the tester's lack of knowledge about the software internals Blind coverage, since the tester has limited knowledge about the application If tests are driving the development, they are often done in the form of acceptance criteria that is later used as a definition of what should be developed. Automated black-box testing relies on some form of automation, such as behavior-driven development (BDD).
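To make this concrete, a black-box test exercises only the public interface and knows nothing about the internals. The following is a minimal JUnit sketch, assuming a hypothetical PriceCalculator class; however its discount logic is implemented internally, the observable behavior must stay the same:

import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class PriceCalculatorBlackBoxTest {

    // Only the public API is exercised, so the test survives any
    // internal refactoring as long as the behavior is unchanged.
    @Test
    public void whenTotalIsAtLeastHundredThenTenPercentDiscountIsApplied() {
        PriceCalculator calculator = new PriceCalculator();
        assertEquals(90.0, calculator.totalWithDiscount(100.0), 0.001);
    }
}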
The white-box testing White-box testing (also known as clear-box testing, glass-box testing, transparent-box testing, and structural testing) looks inside the software that is being tested and uses that knowledge as part of the testing process. If, for example, an exception should be thrown under certain conditions, a test might want to reproduce those conditions. White-box testing requires internal knowledge of the system and programming skills. It provides an internal perspective on the software under test. Some of the advantages of white-box testing are as follows: Efficient in finding errors and problems The required knowledge of the internals of the software under test is beneficial for thorough testing Allows finding hidden errors Programmer introspection Helps in optimizing the code Due to the required internal knowledge of the software, maximum coverage is obtained Some of the disadvantages of white-box testing are as follows: It might not find unimplemented or missing features It requires high-level knowledge of the internals of the software under test It requires code access Tests are often tightly coupled to the implementation details of the production code, causing unwanted test failures when the code is refactored White-box testing is almost always automated and, in most cases, takes the form of unit tests. When white-box testing is done before the implementation, it takes the form of TDD. The difference between quality checking and quality assurance Approaches to testing can also be distinguished by the objectives they are trying to accomplish. Those objectives are often split between quality checking (QC) and quality assurance (QA). While quality checking is focused on defect identification, quality assurance tries to prevent defects. QC is product-oriented and intends to make sure that results are as expected. On the other hand, QA is more focused on processes that assure that quality is built in. It tries to make sure that correct things are done in the correct way. While quality checking had a more important role in the past, with the emergence of TDD, acceptance test-driven development (ATDD), and later on behavior-driven development (BDD), the focus has been shifting towards quality assurance. Better tests No matter whether one is using black-box, white-box, or both types of testing, the order in which they are written is very important. Requirements (specifications and user stories) are written before the code that implements them. They come first, so they define the code, not the other way around. The same can be said for tests. If they are written after the code is done, that code (and the functionality it implements) is, in a certain way, defining the tests. Tests that are defined by an already existing application are biased. They have a tendency to confirm what the code does, rather than to test whether the client's expectations are met or that the code is behaving as expected. With manual testing, that is less the case, since it is often done by a siloed QC department (even though it's often called QA). They tend to work on test definitions in isolation from developers. That in itself leads to bigger problems caused by inevitably poor communication and the police syndrome, where testers are not trying to help the team write applications with quality built in, but to find faults at the end of the process. The sooner we find problems, the cheaper it is to fix them.
Tests written in the TDD fashion (including its flavors such as ATDD and BDD) are an attempt to develop applications with quality built in from the very start. It's an attempt to avoid having problems in the first place. Mocking In order for tests to run fast and provide constant feedback, the code needs to be organized in such a way that the methods, functions, and classes can be easily replaced with mocks and stubs. A common word for these types of replacements of the actual code is test double. Speed of execution can be severely affected by external dependencies; for example, our code might need to communicate with a database. By mocking external dependencies, we are able to increase that speed drastically. Execution of a whole unit test suite should be measured in minutes, if not seconds. Designing the code in a way that it can be easily mocked and stubbed forces us to better structure that code by applying separation of concerns. More important than speed is the benefit of the removal of external factors. Setting up databases, web servers, external APIs, and other dependencies that our code might need is both time consuming and unreliable. In many cases, those dependencies might not even be available. For example, we might need to create code that communicates with a database and have someone else create the schema. Without mocks, we would need to wait until that schema is set. With or without mocks, the code should be written in a way that we can easily replace one dependency with another. Executable documentation Another very useful aspect of TDD (and well-structured tests in general) is documentation. In most cases, it is much easier to find out what the code does by looking at its tests than at the implementation itself. What is the purpose of some method? Look at the tests associated with it. What is the desired functionality of some part of the application UI? Look at the tests associated with it. Documentation written in the form of tests is one of the pillars of TDD and deserves further explanation. The main problem with (traditional) software documentation is that it is not up to date most of the time. As soon as some part of the code changes, the documentation stops reflecting the actual situation. This statement applies to almost any type of documentation, with requirements and test cases being the most affected. The necessity to document code is often a sign that the code itself is not well written. Moreover, no matter how hard we try, documentation inevitably gets outdated. Developers shouldn't rely on system documentation because it is almost never up to date. Besides, no documentation can provide as detailed and up-to-date a description of the code as the code itself. Using code as documentation does not exclude other types of documents. The key is to avoid duplication. If details of the system can be obtained by reading the code, other types of documentation can provide quick guidelines and a high-level overview. Non-code documentation should answer questions such as what the general purpose of the system is and what technologies are used by it. In many cases, a simple README is enough to provide the quick start that developers need. Sections such as project description, environment setup, installation, and build and packaging instructions are very helpful for newcomers. From there on, code is the bible. Implementation code provides all the needed details, while test code acts as the description of the intent behind the production code.
Tests are executable documentation, with TDD being the most common way to create and maintain it. Assuming that some form of Continuous Integration (CI) is in use, if some part of the test-documentation is incorrect, it will fail and be fixed soon afterwards. CI solves the problem of incorrect test-documentation, but it does not ensure that all functionality is documented. For this reason (among many others), test-documentation should be created in the TDD fashion. If all functionality is defined as tests before the implementation code is written and the execution of all tests is successful, then tests act as complete and up-to-date information that can be used by developers. What should we do with the rest of the team? Testers, customers, managers, and other non-coders might not be able to obtain the necessary information from the production and test code. As we saw earlier, the two most common types of testing are black-box and white-box testing. This division is important since it also divides testers into those who do know how to write or at least read code (white-box testing) and those who don't (black-box testing). In some cases, testers can do both types. However, more often than not, they do not know how to code, so documentation that is usable for developers is not usable for them. If documentation needs to be decoupled from the code, unit tests are not a good match. That is one of the reasons why BDD came into being. BDD can provide the documentation necessary for non-coders, while still maintaining the advantages of TDD and automation. Customers need to be able to define new functionality of the system, as well as to get information about all the important aspects of the current system. That documentation should not be too technical (code is not an option), but it still must always be up to date. BDD narratives and scenarios are one of the best ways to provide this type of documentation. The ability to act as acceptance criteria (written before the code), be executed frequently (preferably on every commit), and be written in natural language makes BDD stories not only always up to date, but usable by those who do not want to inspect the code. Documentation is an integral part of the software. As with any other part of the code, it needs to be tested often so that we're sure that it is accurate and up to date. The only cost-effective way to have accurate and up-to-date information is to have executable documentation that can be integrated into your continuous integration system. TDD as a methodology is a good way to move in this direction. On a low level, unit tests are the best fit. On the other hand, BDD provides a good way to work on a functional level while maintaining understanding accomplished using natural language. No debugging We (the authors of this article) almost never debug the applications we're working on! This statement might sound pompous, but it's true. We almost never debug because there is rarely a reason to debug an application. When tests are written before the code and the code coverage is high, we can have high confidence that the application works as expected. This does not mean that applications written using TDD do not have bugs; they do. All applications do. However, when that happens, it is easy to isolate bugs by simply looking for the code that is not covered with tests. The tests themselves might not cover some cases. In that situation, the action is to write additional tests.
With high code coverage, finding the cause of a bug through tests is much faster than spending time debugging line by line until the culprit is found. With all this in mind, let's go through the TDD best practices. Best practices Coding best practices are a set of informal rules that the software development community has learned over time, which can help improve the quality of software. While each application needs a level of creativity and originality (after all, we're trying to build something new or better), coding practices help us avoid some of the problems others faced before us. If you're just starting with TDD, it is a good idea to apply some (if not all) of the best practices generated by others. For easier classification of test-driven development best practices, we divided them into four categories: Naming conventions Processes Development practices Tools As you'll see, not all of them are exclusive to TDD. Since a big part of test-driven development consists of writing tests, many of the best practices presented in the following sections apply to testing in general, while others are related to general coding best practices. No matter the origin, all of them are useful when practicing TDD. Take the advice with a certain dose of skepticism. Being a great programmer is not only about knowing how to code, but also about being able to decide which practice, framework, or style best suits the project and the team. Being agile is not about following someone else's rules, but about knowing how to adapt to circumstances and choose the best tools and practices that suit the team and the project. Naming conventions Naming conventions help to organize tests better, so that it is easier for developers to find what they're looking for. Another benefit is that many tools expect those conventions to be followed. There are many naming conventions in use, and those presented here are just a drop in the ocean. The logic is that any naming convention is better than none. Most important is that everyone on the team knows what conventions are being used and is comfortable with them. Choosing more popular conventions has the advantage that newcomers to the team can get up to speed fast, since they can leverage existing knowledge to find their way around. Separate the implementation from the test code Benefits: It avoids accidentally packaging tests together with production binaries; many build tools expect tests to be in a certain source directory. Common practice is to have at least two source directories. Implementation code should be located in src/main/java and test code in src/test/java. In bigger projects, the number of source directories can increase, but the separation between implementation and tests should remain as is. Build tools such as Gradle and Maven expect this source directory separation, as well as the naming conventions. You might have noticed that the build.gradle files that we used throughout this article did not explicitly specify what to test nor what classes to use to create a .jar file. Gradle assumes that tests are in src/test/java and that the implementation code that should be packaged into a jar file is in src/main/java. Place test classes in the same package as the implementation Benefits: Knowing that tests are in the same package as the code helps in finding the code faster. As stated in the previous practice, even though the packages are the same, the classes are in separate source directories. All exercises throughout this article followed this convention.
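As an illustration, a project that follows both practices might be laid out as follows (the package name is hypothetical):

src/main/java/com/example/tictactoe/TicTacToe.java
src/test/java/com/example/tictactoe/TicTacToeTest.java

Both files declare the same package, com.example.tictactoe, so tests can access package-private members, while the build tool still keeps the test code out of the production .jar file.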
Name test classes in a similar fashion to the classes they test Benefits: Knowing that tests have a similar name to the classes they are testing helps in finding the classes faster. One commonly used practice is to name tests the same as the implementation classes, with the suffix Test. If, for example, the implementation class is TickTackToe, the test class should be TickTackToeTest. However, in all cases, with the exception of those we used throughout the refactoring exercises, we prefer the suffix Spec. It helps to make a clear distinction that test methods are primarily created as a way to specify what will be developed. Testing is a great subproduct of those specifications. Use descriptive names for test methods Benefits: It helps in understanding the objective of tests. Using method names that describe tests is beneficial when trying to figure out why some tests failed or when the coverage should be increased with more tests. It should be clear what conditions are set before the test, what actions are performed, and what the expected outcome is. There are many different ways to name test methods, and our preferred method is to name them using the Given/When/Then syntax used in BDD scenarios. Given describes (pre)conditions, When describes actions, and Then describes the expected outcome. If some test does not have preconditions (usually set using @Before and @BeforeClass annotations), Given can be skipped. Let's take a look at one of the specifications we created for our TickTackToe application: @Test public void whenPlayAndWholeHorizontalLineThenWinner() { ticTacToe.play(1, 1); // X ticTacToe.play(1, 2); // O ticTacToe.play(2, 1); // X ticTacToe.play(2, 2); // O String actual = ticTacToe.play(3, 1); // X assertEquals("X is the winner", actual); } Just by reading the name of the method, we can understand what it is about. When we play and the whole horizontal line is populated, then we have a winner. Do not rely only on comments to provide information about the test objective. Comments do not appear when tests are executed from your favorite IDE, nor do they appear in reports generated by CI or build tools. Processes TDD processes are the core set of practices. Successful implementation of TDD depends on the practices described in this section. Write a test before writing the implementation code Benefits: It ensures that testable code is written; it ensures that every line of code gets tests written for it. By writing or modifying the test first, the developer is focused on requirements before starting to work on the implementation code. This is the main difference compared to writing tests after the implementation is done. The additional benefit is that with the tests written first, we are avoiding the danger that the tests work as quality checking instead of quality assurance. We're trying to ensure that quality is built in, as opposed to checking later whether we met quality objectives. Only write new code when the test is failing Benefits: It confirms that the test does not work without the implementation. If tests are passing without the need to write or modify the implementation code, then either the functionality is already implemented or the test is defective. If new functionality is indeed missing, then the test always passes and is therefore useless. Tests should fail for the expected reason. Even though there are no guarantees that the test is verifying the right thing, if it fails first and for the expected reason, confidence that the verification is correct should be high.
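To illustrate the fail-first rule, here is a minimal sketch of a single red-green iteration, reusing the StringCalculator example that appears later in this article. At the moment the test is written, the class does not exist yet, so the test cannot pass; that is the red state:

// src/test/java (red: written first, fails until add() is implemented)
import org.junit.Assert;
import org.junit.Test;

public class StringCalculatorSpec {

    @Test
    public final void whenEmptyStringIsUsedThenReturnValueIsZero() {
        Assert.assertEquals(0, StringCalculator.add(""));
    }
}

// src/main/java (green: the simplest implementation that makes the test pass)
public class StringCalculator {
    public static int add(final String numbers) {
        return 0;
    }
}

Returning a constant looks naive, but it is exactly "just enough code for this test to pass"; the next failing test will force a real implementation.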
Rerun all tests every time the implementation code changes Benefits: It ensures that there is no unexpected side effect caused by code changes. Every time any part of the implementation code changes, all tests should be run. Ideally, tests are fast to execute and can be run by the developer locally. Once code is submitted to version control, all tests should be run again to ensure that there was no problem due to code merges. This is especially important when more than one developer is working on the code. Continuous integration tools such as Jenkins (http://jenkins-ci.org/), Hudson (http://hudson-ci.org/), Travis (https://travis-ci.org/), and Bamboo (https://www.atlassian.com/software/bamboo) should be used to pull the code from the repository, compile it, and run tests. All tests should pass before a new test is written Benefits: The focus is maintained on a small unit of work; the implementation code is (almost) always in working condition. It is sometimes tempting to write multiple tests before the actual implementation. In other cases, developers ignore problems detected by existing tests and move towards new features. This should be avoided whenever possible. In most cases, breaking this rule will only introduce technical debt that will need to be paid with interest. One of the goals of TDD is that the implementation code is (almost) always working as expected. Some projects, due to pressure to reach the delivery date or maintain the budget, break this rule and dedicate time to new features, leaving the task of fixing the code associated with failed tests for later. These projects usually end up postponing the inevitable. Refactor only after all tests are passing Benefits: This type of refactoring is safe. If all the implementation code that can be affected has tests and they are all passing, it is relatively safe to refactor. In most cases, there is no need for new tests. Small modifications to existing tests should be enough. The expected outcome of refactoring is to have all tests passing both before and after the code is modified. Development practices The practices listed in this section are focused on the best way to write tests. Write the simplest code to pass the test Benefits: It ensures a cleaner and clearer design; it avoids unnecessary features. The idea is that the simpler the implementation, the better and easier it is to maintain the product. The idea adheres to the keep it simple stupid (KISS) principle. This states that most systems work best if they are kept simple rather than made complex; therefore, simplicity should be a key goal in design, and unnecessary complexity should be avoided. Write assertions first, act later Benefits: This clarifies the purpose of the requirements and tests early. Once the assertion is written, the purpose of the test is clear and the developer can concentrate on the code that will accomplish that assertion and, later on, on the actual implementation. Minimize assertions in each test Benefits: This avoids assertion roulette; it allows execution of more asserts. If multiple assertions are used within one test method, it might be hard to tell which of them caused a test failure. This is especially common when tests are executed as part of the continuous integration process. If the problem cannot be reproduced on a developer's machine (as may be the case if the problem is caused by environmental issues), fixing it may be difficult and time consuming. When one assert fails, execution of that test method stops.
Minimize assertions in each test

Benefits: This avoids assertion roulette; it allows execution of more asserts.

If multiple assertions are used within one test method, it might be hard to tell which of them caused a test failure. This is especially common when tests are executed as part of the continuous integration process. If the problem cannot be reproduced on a developer's machine (as may be the case if the problem is caused by environmental issues), fixing the problem may be difficult and time consuming. When one assert fails, execution of that test method stops. If there are other asserts in that method, they will not be run, and information that can be used in debugging is lost. Last but not least, having multiple asserts creates confusion about the objective of the test. This practice does not mean that there should always be only one assert per test method. If there are other asserts that test the same logical condition or unit of functionality, they can be used within the same method. Let's go through a few examples:

@Test
public final void whenOneNumberIsUsedThenReturnValueIsThatSameNumber() {
    Assert.assertEquals(3, StringCalculator.add("3"));
}

@Test
public final void whenTwoNumbersAreUsedThenReturnValueIsTheirSum() {
    Assert.assertEquals(3+6, StringCalculator.add("3,6"));
}

The preceding code contains two specifications that clearly define what the objective of those tests is. By reading the method names and looking at the assert, there should be clarity on what is being tested. Consider the following, for example:

@Test
public final void whenNegativeNumbersAreUsedThenRuntimeExceptionIsThrown() {
    RuntimeException exception = null;
    try {
        StringCalculator.add("3,-6,15,-18,46,33");
    } catch (RuntimeException e) {
        exception = e;
    }
    Assert.assertNotNull("Exception was not thrown", exception);
    Assert.assertEquals("Negatives not allowed: [-6, -18]", exception.getMessage());
}

This specification has more than one assert, but they are testing the same logical unit of functionality. The first assert confirms that the exception exists, and the second that its message is correct. When multiple asserts are used in one test method, they should all contain messages that explain the failure. This way, debugging the failed assert is easier. In the case of one assert per test method, messages are welcome but not necessary, since it should be clear from the method name what the objective of the test is.

@Test
public final void whenAddIsUsedThenItWorks() {
    Assert.assertEquals(0, StringCalculator.add(""));
    Assert.assertEquals(3, StringCalculator.add("3"));
    Assert.assertEquals(3+6, StringCalculator.add("3,6"));
    Assert.assertEquals(3+6+15+18+46+33, StringCalculator.add("3,6,15,18,46,33"));
    Assert.assertEquals(3+6+15, StringCalculator.add("3,6\n15"));
    Assert.assertEquals(3+6+15, StringCalculator.add("//;\n3;6;15"));
    Assert.assertEquals(3+1000+6, StringCalculator.add("3,1000,1001,6,1234"));
}

This test has many asserts. It is unclear what the functionality is, and if one of them fails, it is unknown whether the rest would work or not. It might be hard to understand the failure when this test is executed through some of the CI tools.

Do not introduce dependencies between tests

Benefits: The tests work in any order independently, whether all or only a subset is run.

Each test should be independent of the others. Developers should be able to execute any individual test, a set of tests, or all of them. Often, due to the test runner's design, there is no guarantee that tests will be executed in any particular order. If there are dependencies between tests, they might easily be broken with the introduction of new ones.

Tests should run fast

Benefits: These tests are used often.

If it takes a lot of time to run tests, developers will stop using them or run only a small subset related to the changes they are making. The benefit of fast tests, besides fostering their usage, is quick feedback. The sooner the problem is detected, the easier it is to fix it. Knowledge about the code that produced the problem is still fresh.
If the developer has already started working on the next feature while waiting for the completion of the test run, he might decide to postpone fixing the problem until that new feature is developed. On the other hand, if he drops his current work to fix the bug, time is lost in context switching. Tests should be so quick that developers can run all of them after each change without getting bored or frustrated.

Use test doubles

Benefits: This reduces code dependencies, and test execution will be faster.

Mocks are prerequisites for the fast execution of tests and the ability to concentrate on a single unit of functionality. By mocking dependencies external to the method that is being tested, the developer is able to focus on the task at hand without spending time setting them up. In the case of bigger teams, those dependencies might not even be developed yet. Also, the execution of tests without mocks tends to be slow. Good candidates for mocks are databases, other products, services, and so on.

Use set-up and tear-down methods

Benefits: This allows set-up and tear-down code to be executed before and after the class or each method.

In many cases, some code needs to be executed before the test class or before each method in a class. For this purpose, JUnit has the @BeforeClass and @Before annotations that should be used for the setup phase. @BeforeClass executes the associated method once, before the first test method in the class is run. @Before executes the associated method before each test is run. Both should be used when there are certain preconditions required by tests. The most common example is setting up test data in the (hopefully in-memory) database. At the opposite end are the @After and @AfterClass annotations, which should be used for the tear-down phase. Their main purpose is to destroy data or state created during the setup phase or by the tests themselves. As stated in one of the previous practices, each test should be independent of the others. Moreover, no test should be affected by the others. The tear-down phase helps maintain the system as if no test had previously been executed.
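As a sketch of how these four annotations are typically arranged (the InMemoryDatabase helper is a hypothetical placeholder, not part of our examples):

public class RepositorySpec {

    private static InMemoryDatabase database; // hypothetical test helper

    @BeforeClass
    public static void beforeClass() {
        // executed once, before the first test method is run
        database = InMemoryDatabase.start();
    }

    @Before
    public void before() {
        // executed before each test method
        database.insertTestData();
    }

    @After
    public void after() {
        // executed after each test method
        database.deleteAllData();
    }

    @AfterClass
    public static void afterClass() {
        // executed once, after the last test method has finished
        database.stop();
    }
}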
Do not use base classes in tests

Benefits: It provides test clarity.

Developers often approach test code in the same way as the implementation. One of the common mistakes is to create base classes that are extended by tests. This practice avoids code duplication at the expense of test clarity. When possible, base classes used for testing should be avoided or limited. Having to navigate from the test class to its parent, the parent of the parent, and so on in order to understand the logic behind tests often introduces unnecessary confusion. Clarity in tests should be more important than avoiding code duplication.

Tools

TDD, and coding and testing in general, are heavily dependent on other tools and processes. Some of the most important ones are as follows. Each of them is too big a topic to be explored in this article, so they will be described only briefly.

Code coverage and Continuous Integration (CI)

Benefits: It gives assurance that everything is tested.

Code coverage practices and tools are very valuable in determining that all code, branches, and complexity are exercised by tests. Some of the tools are JaCoCo (http://www.eclemma.org/jacoco/), Clover (https://www.atlassian.com/software/clover/overview), and Cobertura (http://cobertura.github.io/cobertura/). Continuous Integration (CI) tools are a must for all except the most trivial projects. Some of the most used tools are Jenkins (http://jenkins-ci.org/), Hudson (http://hudson-ci.org/), Travis (https://travis-ci.org/), and Bamboo (https://www.atlassian.com/software/bamboo).

Use TDD together with BDD

Benefits: Both developer unit tests and functional customer-facing tests are covered.

While TDD with unit tests is a great practice, in many cases it does not provide all the testing that projects need. TDD is fast to develop, helps the design process, and gives confidence through fast feedback. On the other hand, BDD is more suitable for integration and functional testing, provides a better process for requirement gathering through narratives, and is a better way of communicating with clients through scenarios. Both should be used, and together they provide a full process that involves all stakeholders and team members. TDD (based on unit tests) and BDD should be driving the development process. Our recommendation is to use TDD for high code coverage and fast feedback, and BDD as automated acceptance tests. While TDD is mostly oriented towards white-box testing, BDD often aims at black-box testing. Both TDD and BDD are trying to focus on quality assurance instead of quality checking.

Summary

You learned that TDD is a way to design code through a short and repeatable cycle called red-green-refactor. Failure is an expected state that should not only be embraced, but enforced throughout the TDD process. This cycle is so short that we move from one phase to another with great speed. While code design is the main objective, tests created throughout the TDD process are a valuable asset that should be utilized, and they profoundly change our view of traditional testing practices. We went through the most common of those practices, such as white-box and black-box testing, tried to put them into the TDD perspective, and showed the benefits that they can bring to each other. You discovered that mocks are a very important tool that is often a must when writing tests. Finally, we discussed how tests can and should be utilized as executable documentation and how TDD can make debugging much less necessary. This article also walked you through the TDD best practices in detail, refreshing the knowledge and experience you gained along the way. Now that we are armed with theoretical knowledge, it is time to set up the development environment and get an overview and comparison of different testing frameworks and tools.

Resources for Article:

Further resources on this subject:
RESTful Services JAX-RS 2.0 [article]
Java Refactoring in NetBeans [article]
Developing a JavaFX Application for iOS [article]
Testing Exceptional Flow

Packt
03 Sep 2015
22 min read
In this article by Frank Appel, author of the book Testing with JUnit, we will learn that special care has to be taken when testing a component's functionality under exception-raising conditions. You'll also learn how to use the various capture and verification possibilities and discuss their pros and cons. As robust software design is one of the declared goals of the test-first approach, we're going to see how tests intertwine with the fail fast strategy on selected boundary conditions. Finally, we're going to conclude with an in-depth explanation of working with collaborators under exceptional flow and see how stubbing of exceptional behavior can be achieved.

The topics covered in this article are as follows:

Testing patterns
Treating collaborators

(For more resources related to this topic, see here.)

Testing patterns

Testing exceptional flow is a bit trickier than verifying the outcome of normal execution paths. The following section will explain why and introduce the different techniques available to get this job done.

Using the fail statement

"Always expect the unexpected"
– Adage based on Heraclitus

Testing corner cases often results in the necessity to verify that a functionality throws a particular exception. Think, for example, of a java.util.List implementation. It aborts the attempt to retrieve a list element at a non-existing index with java.lang.ArrayIndexOutOfBoundsException. Working with exceptional flow is somewhat special because, without any precautions, the exercise phase would terminate immediately. But this is not what we want, since it results in a test failure. Indeed, the exception itself is the expected outcome of the behavior we want to check. From this, it follows that we have to capture the exception before we can verify anything. As we all know, we do this in Java with a try-catch construct. The try block contains the actual invocation of the functionality we are about to test. The catch block again allows us to get a grip on the expected outcome—the exception thrown during the exercise. Note that we usually keep our hands off Error, so we confine the angle of view in this article to exceptions. So far so good, but we have to bear in mind that if no exception is thrown, this has to be classified as misbehavior. Consequently, the test has to fail. JUnit's built-in assertion capabilities provide the org.junit.Assert.fail method, which can be used to achieve this. The method unconditionally throws an instance of java.lang.AssertionError if called. The classical approach to testing exceptional flow with JUnit adds a fail statement straight after the functionality invocation within the try block. The idea behind this is that the statement should never be reached if the SUT behaves correctly. But if not, the assertion error marks the test as failed. It is self-evident that capturing should narrow down the expected exception as much as possible. Do not catch IOException if you expect FileNotFoundException, for example. Unintentionally thrown exceptions must pass the catch block unaffected, lead to a test failure and, therefore, give you a good hint for troubleshooting with their stack trace. We insinuated that the fetch-count range check of our timeline example would probably be better off throwing IllegalArgumentException on boundary violations.
Let's have a look at how we can change the setFetchCountExceedsLowerBound test to verify this behavior with the try-catch exception testing pattern (see the following listing):

@Test
public void setFetchCountExceedsLowerBound() {
    int tooSmall = Timeline.FETCH_COUNT_LOWER_BOUND - 1;
    try {
        timeline.setFetchCount( tooSmall );
        fail();
    } catch( IllegalArgumentException actual ) {
        String message = actual.getMessage();
        String expected = format( Timeline.ERROR_EXCEEDS_LOWER_BOUND, tooSmall );
        assertEquals( expected, message );
        assertTrue( message.contains( valueOf( tooSmall ) ) );
    }
}

It can be clearly seen how setFetchCount, the functionality under test, is called within the try block, directly followed by a fail statement. The caught exception is narrowed down to the expected type. The test uses the inline fixture setup to initialize the exceeds-lower-bound value in the tooSmall local variable because it is used more than once. The verification checks that the thrown message matches an expected one. Our test calculates the expectation with the aid of java.lang.String.format (static import), based on the same pattern that is also used internally by the timeline to produce the text. Once again, we loosen encapsulation a bit to ensure that the malicious value gets mentioned correctly. Purists may prefer only the String.contains variant, which, on the other hand, would be less accurate. Although this works fine, it looks pretty ugly and is not very readable. Besides, it blurs the separation of the exercise and verification phases a bit, and so it is no wonder that other techniques have been invented for exception testing.

Annotated expectations

After the arrival of annotations in the Java language, JUnit got a thorough overhauling. We already mentioned the @Test type used to mark a particular method as an executable test. To simplify exception testing, it has been given the expected attribute. This defines that the anticipated outcome of a unit test should be an exception, and it accepts a subclass of Throwable to specify its type. Running a test of this kind captures exceptions automatically and checks whether the caught type matches the specified one. The following snippet shows how this can be used to validate that our timeline constructor doesn't accept null as the injection parameter:

@Test( expected = IllegalArgumentException.class )
public void constructWithNullAsItemProvider() {
    new Timeline( null, mock( SessionStorage.class ) );
}

Here, we've got a test whose body statements merge setup and exercise in one line for compactness. Although the verification result is specified ahead of the method's signature definition, it is, of course, evaluated last. This means that the runtime test structure isn't twisted. But it is a bit of a downside from the readability point of view, as it breaks the usual test format. However, the approach bears a real risk when used in more complex scenarios. The next listing shows an alternative version of setFetchCountExceedsLowerBound using the expected attribute:

@Test( expected = IllegalArgumentException.class )
public void setFetchCountExceedsLowerBound() {
    Timeline timeline = new Timeline( null, null );
    timeline.setFetchCount( Timeline.FETCH_COUNT_LOWER_BOUND - 1 );
}

On the face of it, this might look fine because the test run would succeed, apparently with a green bar. But given that the timeline constructor already throws IllegalArgumentException due to the initialization with null, the virtual point of interest is never reached.
So any setFetchCount implementation will pass this test. This renders it not only useless, but it even lulls you into a false sense of security! Certainly, the approach is most hazardous when checking for runtime exceptions because they can be thrown undeclared. Thus, they can emerge practically everywhere and overshadow the original test intention unnoticed. Not being able to validate the state of the thrown exception narrows down the reasonable operational area of this concept to simple use cases, such as the constructor parameter verification mentioned previously. Finally, here are two more remarks on the initial example. First, it might be debatable whether IllegalArgumentException is appropriate for an argument-not-null check from a design point of view. But as this discussion is as old as the hills and probably will never be settled, we won't argue about that. IllegalArgumentException was favored over NullPointerException basically because it seemed to be an evident way to build up a comprehensible example. To specify a different behavior of the tested use case, one simply has to define another Throwable type as the expected value. Second, as a side effect, the test shows how a generated test double can make our life much easier. You've probably already noticed that the session storage stand-in created on the fly serves as a dummy. This is quite nice, as we don't have to implement one manually and as it decouples the test from storage-related signatures, which might otherwise break the test in the future when changed. But keep in mind that such a created-on-the-fly dummy lacks the implicit no-operation check. Hence, this approach might be too fragile under some circumstances. With annotations being too brittle for most usage scenarios and the try-fail-catch pattern being too crabbed, JUnit provides a special test helper called ExpectedException, which we'll take a look at now.

Verification with the ExpectedException rule

The third possibility offered to verify exceptions is the ExpectedException class. This type belongs to a special category of test utilities. For the moment, it is sufficient to know that rules allow us to embed a test method into custom pre- and post-operations at runtime. In doing so, the expected exception helper can catch the thrown instance and perform the appropriate verifications. A rule has to be defined as a non-static public field, annotated with @Rule, as shown in the following TimelineTest excerpt. See how the rule object gets set up implicitly here with a factory method:

public class TimelineTest {

    @Rule
    public ExpectedException thrown = ExpectedException.none();

    [...]

    @Test
    public void setFetchCountExceedsUpperBound() {
        int tooLargeValue = FETCH_COUNT_UPPER_BOUND + 1;
        thrown.expect( IllegalArgumentException.class );
        thrown.expectMessage( valueOf( tooLargeValue ) );

        timeline.setFetchCount( tooLargeValue );
    }

    [...]
}

Compared to the try-fail-catch approach, the code is easier to read and write. The helper instance supports several methods to specify the anticipated outcome. Apart from the static imports of constants used for compactness, this specification reproduces pretty much the same validations as the original test. ExpectedException#expectMessage expects a substring of the actual message, in case you wonder, and we omitted the exact formatting here for brevity. In case the exercise phase of setFetchCountExceedsUpperBound does not throw an exception, the rule ensures that the test fails.
In this context, it is about time we mentioned the utility's factory method, none. Its name indicates that, as long as no expectations are configured, the helper assumes that a test run should terminate normally. This means that no artificial fail has to be issued. This way, a mix of standard and exceptional flow tests can coexist in one and the same test case. Even so, the test helper has to be configured prior to the exercise phase, which still leaves room for improvement with respect to canonizing the test structure. As we'll see next, Java 8's ability to compact closures into lambda expressions enables us to write even leaner and cleaner structured exceptional flow tests.

Capturing exceptions with closures

When writing tests, we strive to end up with a clear representation of separated test phases in the correct order. All of the previous approaches for testing exceptional flow did a more or less poor job in this regard. Looking once more at the classical try-fail-catch pattern, we recognize, however, that it comes closest. It strikes us that if we put some work into it, we can extract exception capturing into a reusable utility method. This method would accept a functional interface—the representation of the exception-throwing functionality under test—and return the caught exception. The ThrowableCaptor test helper puts the idea into practice:

public class ThrowableCaptor {

    @FunctionalInterface
    public interface Actor {
        void act() throws Throwable;
    }

    public static Throwable thrownBy( Actor actor ) {
        try {
            actor.act();
        } catch( Throwable throwable ) {
            return throwable;
        }
        return null;
    }
}

We see the Actor interface that serves as a functional callback. It gets executed within a try block of the thrownBy method. If an exception is thrown, which should be the normal path of execution, it gets caught and returned as the result. Bear in mind that we have omitted the fail statement of the original try-fail-catch pattern. We consider the capturer a helper for the exercise phase. Thus, we merely return null if no exception is thrown and leave it to the afterworld to deal correctly with the situation. How capturing with this helper works in combination with a lambda expression is shown by the next variant of setFetchCountExceedsUpperBound, and this time, we've achieved the clear phase separation we're in search of:

@Test
public void setFetchCountExceedsUpperBound() {
    int tooLarge = FETCH_COUNT_UPPER_BOUND + 1;

    Throwable actual = thrownBy( () -> timeline.setFetchCount( tooLarge ) );

    assertNotNull( actual );
    assertTrue( actual instanceof IllegalArgumentException );
    String message = actual.getMessage();
    assertTrue( message.contains( valueOf( tooLarge ) ) );
    assertEquals( format( ERROR_EXCEEDS_UPPER_BOUND, tooLarge ), message );
}

Please note that we've added an additional not-null check compared to the verifications of the previous version, and that it now precedes the message extraction (otherwise, a null result would surface as NullPointerException instead of an assertion failure). We do this as a replacement for the non-existing failure enforcement. Indeed, the following instanceof check would fail implicitly if actual was null. But this would also be misleading, since it overshadows the true failure reason. Stating that actual must not be null points out clearly the expected postcondition that has not been met. One of the libraries presented there will be AssertJ. The latter is mainly intended to improve validation expressions. But it also provides a test helper, which supports the closure pattern you've just learned to make use of. Another choice to avoid writing your own helper could be the library Fishbowl, [FISBOW].
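For comparison, here is a sketch of how the same specification could look with AssertJ's assertThatThrownBy helper (shown only as an outlook; the constants are the ones from the previous listings):

import static org.assertj.core.api.Assertions.assertThatThrownBy;

@Test
public void setFetchCountExceedsUpperBound() {
    int tooLarge = FETCH_COUNT_UPPER_BOUND + 1;

    assertThatThrownBy( () -> timeline.setFetchCount( tooLarge ) )
        .isInstanceOf( IllegalArgumentException.class )
        .hasMessageContaining( valueOf( tooLarge ) );
}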
Now that we understand the available testing patterns, let's discuss a few system-spanning aspects of dealing with exceptional flow in practice.

Treating collaborators

The considerations we've made about how a software system can be built upon collaborating components foreshadow that we have to take good care when modelling our overall strategy for exceptional flow. Because of this, we'll start this section with an introduction to the fail fast strategy, which is a perfect match for the test-first approach. The second part of the section will show you how to deal with checked exceptions thrown by collaborators.

Fail fast

Until now, we've learned that exceptions can serve in corner cases as an expected outcome, which we need to verify with tests. As an example, we've changed the behavior of our timeline fetch-count setter. The new version throws IllegalArgumentException if the given parameter is out of range. While we've explained how to test this, you may have wondered whether throwing an exception is actually an improvement. On the contrary, you might think, doesn't the exception make our program more fragile, as it bears the risk of an ugly error popping up or even of crashing the entire application? Aren't those things we want to prevent by all means? So, wouldn't it be better to stick with the old version and silently ignore arguments that are out of range? At first sight, this may sound reasonable, but doing so is ostrich-like head-in-the-sand behavior, according to the motto: if we can't see them, they aren't there, and so, they can't hurt us. Ignoring an input that is obviously wrong can lead to misbehavior of our software system later on, and the reason for the problem will probably be much harder to track down compared to an immediate failure. Generally speaking, this practice disconnects the effects of a problem from its cause. As a consequence, you often have to deal with stack traces leading to dead ends, or worse. Consider, for example, that we'd initialize the timeline fetch-count as an invariant employed by a constructor argument. Moreover, the value we use would be negative and silently ignored by the component. In addition, our application would make some item position calculations based on this value. Sure enough, the calculation results would be faulty. If we're lucky, an exception would be thrown when, for example, trying to access a particular item based on these calculations. However, the given stack trace would reveal nothing about the reason that originally led to the situation. And if we're unlucky, the misbehavior will not be detected until the software has been released to end users. On the other hand, with the new version of setFetchCount, this kind of translocated problem can never occur. A failure trace would point directly to the initial programming mistake, hence avoiding follow-up issues. This means that failing immediately and visibly increases robustness due to short feedback cycles and pertinent exceptions. Jim Shore has given this design strategy the name fail fast, [SHOR04]. Shore points out that the heart of fail fast is assertions. Similar to the JUnit assert statements, an assertion fails on a condition that isn't met. Typical assertions might be not-null checks, in-range checks, and so on. But how do we decide whether it's necessary to fail fast? While assertions on input arguments are obviously a potential use case, checking return values or invariants may be as well.
Sometimes, such conditions are described in code comments, such as // foo should never be null because..., which is a clear indication that the note should be replaced with an appropriate assertion. The next snippet demonstrates the principle:

public void doWithAssert() {
    [...]
    boolean condition = ...; // check some invariant
    if( !condition ) {
        throw new IllegalStateException( "Condition not met." );
    }
    [...]
}

But be careful not to overdo things, because in most cases, code will fail fast by default. So, you don't have to include a not-null check after each and every variable assignment, for example. Such paranoid programming styles decrease readability for no value-add at all. A last point to consider is your overall exception-handling strategy. The intention of assertions is to reveal programming or configuration mistakes as early as possible. Because of this, we strictly make use of runtime exception types only. Catching exceptions at random somewhere up the call stack, of course, thwarts the whole purpose of this approach. So, beware of the absurd try-catch-log pattern that you often see scattered all over the code of scrubs, and which is demonstrated in the next listing as a deterrent only:

private Data readSomeData() {
    try {
        return source.readData();
    } catch( Exception hardLuck ) {
        // NEVER DO THIS!
        hardLuck.printStackTrace();
    }
    return null;
}

The sample code projects exceptional flow onto null return values and disguises the fact that something seriously went wrong. It surely does not get better by using a logging framework or, even worse, by swallowing the exception completely. Analysis of an error by means of stack trace logs is cumbersome and often fault-prone. In particular, this approach usually leads to logs jammed with ignored traces, where one more or less does not attract attention. In such an environment, it's like looking for a needle in a haystack when trying to find out why a follow-up problem occurs. Instead, use a central exception handling mechanism at reasonable boundaries. You can create a bottom-level exception handler around a GUI's message loop. Ensure that background threads report problems appropriately, or secure event notification mechanisms, for example. Otherwise, you shouldn't bother with exception handling in your code. As outlined in the next paragraph, securing resource management with try-finally should be sufficient most of the time.

The stubbing of exceptional behavior

Every now and then, we come across collaborators that declare checked exceptions in some or all of their method signatures. There has been a debate going on for years now about whether or not checked exceptions are evil, [HEVEEC]. However, in our daily work, we simply can't elude them, as they pop up in adapters around third-party code or are burnt into legacy code we aren't able to change. So, what are the options we have in these situations?

"It is funny how people think that the important thing about exceptions is handling them. That's not the important thing about exceptions. In a well-written application there's a ratio of ten to one, in my opinion, of try finally to try catch."
– Anders Hejlsberg, [HEVEEC]

Cool. This means that we also declare the exception type in question on our own method signature and let someone else up the call stack solve the tricky things, right? Although it makes life easier for us at the moment, acting like this is probably not the brightest idea.
If everybody follows that strategy, the higher we get on the stack, the more exception types will occur. This doesn't scale well and, even worse, it exposes details from the depths of the call hierarchy. Because of this, people sometimes simplify things by declaring java.lang.Exception as the thrown type. Indeed, this rids them of the throws declaration tail. But it's also a pauper's oath, as it reduces the Java type concept to absurdity. Fair enough. So, we're presumably better off dealing with checked exceptions as soon as they occur. But hey, wouldn't this contradict Hejlsberg's statement? And what shall we do with the gatecrasher, meaning: is there always a reasonable handling approach? Fortunately, there is, and it absolutely conforms with the quote and the preceding fail fast discussion. We wrap the caught checked exception in an appropriate runtime exception, which we afterwards throw instead. This way, every caller of our component's functionality can use it without worrying about exception handling. If necessary, it is sufficient to use a try-finally block to ensure the disposal or closure of open resources, for example. As described previously, we leave exception handling to bottom-line handlers around the message loop or the like. Now that we know what we have to do, the next question is, how can we achieve this with tests? Luckily, with the knowledge about stubs, you're almost there. Normally, handling a checked exception represents a boundary condition. We can regard the thrown exception as an indirect input to our SUT. All we have to do is let the stub throw an expected exception (precondition) and check whether the envelope gets delivered properly (postcondition). For better understanding, let's go through the steps in our timeline example. For this section, we consider that our SessionStorage collaborator declares IOException on its methods, for whatever reason. The storage interface is shown in the next listing:

public interface SessionStorage {
    void storeTop( Item top ) throws IOException;
    Item readTop() throws IOException;
}

Next, we'll have to write a test that reflects our thoughts. First, we create an IOException instance that will serve as an indirect input. Looking at the next snippet, you can see how we configure our storage stub to throw this instance on a call to storeTop. As the method does not return anything, the Mockito stubbing pattern looks a bit different than earlier. This time, it starts with the expectation definition. In addition, we use Mockito's any matcher, which defines that the exception should be thrown for those calls to storeTop where the given argument is assignment-compatible with the specified type token. After this, we're ready to exercise the fetchItems method and capture the actual outcome. We expect it to be an instance of IllegalStateException, just to keep things simple. See how we verify that the caught exception wraps the original cause and that the message matches a predefined constant on our component class:

@Test
public void fetchItemWithExceptionOnStoreTop() throws IOException {
    IOException cause = new IOException();
    doThrow( cause ).when( storage ).storeTop( any( Item.class ) );

    Throwable actual = thrownBy( () -> timeline.fetchItems() );

    assertNotNull( actual );
    assertTrue( actual instanceof IllegalStateException );
    assertSame( cause, actual.getCause() );
    assertEquals( Timeline.ERROR_STORE_TOP, actual.getMessage() );
}

With the test in place, the implementation is pretty easy.
Let's assume that we have the item storage extracted to a private timeline method named storeTopItem. It gets called somewhere down the road of fetchItems and, in turn, calls a private method, getTopItem. Fixing the compile errors, we end up with a try-catch block because we have to deal with the IOException thrown by storeTop. Our first error handling should be empty to ensure that our test case actually fails. The following snippet shows the ultimate version, which will make the test finally pass:

static final String ERROR_STORE_TOP = "Unable to save top item";

[...]

private void storeTopItem() {
    try {
        sessionStorage.storeTop( getTopItem() );
    } catch( IOException cause ) {
        throw new IllegalStateException( ERROR_STORE_TOP, cause );
    }
}

Of course, real-world situations can sometimes be more challenging, for example, when the collaborator throws a mix of checked and runtime exceptions. At times, this results in tedious work. But if the same type of wrapping exception can always be used, the implementation can often be simplified. First, re-throw all runtime exceptions; second, catch exceptions by their common super type and re-throw them embedded within a wrapping runtime exception (the following listing shows the principle):

private void storeTopItem() {
    try {
        sessionStorage.storeTop( getTopItem() );
    } catch( RuntimeException rte ) {
        throw rte;
    } catch( Exception cause ) {
        throw new IllegalStateException( ERROR_STORE_TOP, cause );
    }
}

Summary

In this article, you learned how to validate the proper behavior of an SUT with respect to exceptional flow. You experienced how to apply the various capture and verification options, and we discussed their strengths and weaknesses. Supplementary to the test-first approach, you were taught the concepts of the fail fast design strategy and recognized how adopting it increases the overall robustness of applications. Last but not least, we explained how to handle collaborators that throw checked exceptions and how to stub their exceptional behavior.

Resources for Article:

Further resources on this subject:
Progressive Mockito [article]
Using Mock Objects to Test Interactions [article]
Ensuring Five-star Rating in the MarketPlace [article]
Phalcon's ORM

Packt
25 Aug 2015
9 min read
In this article by Calin Rada, the author of the book Learning Phalcon PHP, we will go through a few of the ORM CRUD operations (update and delete) and database transactions.

(For more resources related to this topic, see here.)

By using the ORM, there is virtually no need to write any SQL in your code. Everything is OOP, using the models to perform operations. The first, and the most basic, operation is retrieving data. In the old days, you would do this:

$result = mysql_query("SELECT * FROM article");

The class that our models are extending is Phalcon\Mvc\Model. This class has some very useful methods built in, such as find(), findFirst(), count(), sum(), maximum(), minimum(), average(), save(), create(), update(), and delete().

CRUD – updating data

Updating data is as easy as creating it. The only thing that we need to do is find the record that we want to update. Open the article manager and add the following code:

public function update($id, $data)
{
    $article = Article::findFirstById($id);

    if (!$article) {
        throw new Exception('Article not found', 404);
    }

    $article->setArticleShortTitle($data['article_short_title']);
    $article->setUpdatedAt(new Phalcon\Db\RawValue('NOW()'));

    if (false === $article->update()) {
        foreach ($article->getMessages() as $message) {
            $error[] = (string) $message;
        }
        throw new Exception(json_encode($error));
    }

    return $article;
}

As you can see, we are passing a new variable, $id, to the update method and searching for an article whose ID is equal to the value of the $id variable. For the sake of an example, this method will update only the article title and the updated_at field for now. Next, we will create a new dummy method, as we did for the article create. Open modules/Backoffice/Controllers/ArticleController.php and add the following code:

public function updateAction($id)
{
    $this->view->disable();

    $article_manager = $this->getDI()->get('core_article_manager');

    try {
        $article = $article_manager->update($id, [
            'article_short_title' => 'Modified article 1'
        ]);
        echo $article->getId(), " was updated.";
    } catch (Exception $e) {
        echo $e->getMessage();
    }
}

If you access http://www.learning-phalcon.localhost/backoffice/article/update/1 now, you should see the "1 was updated." response. Going back to the article list, you will see the new title, and the Updated column will have a new value.

CRUD – deleting data

Deleting data is easier, since we don't need to do more than call the built-in delete() method. Open the article manager and add the following code:

public function delete($id)
{
    $article = Article::findFirstById($id);

    if (!$article) {
        throw new Exception('Article not found', 404);
    }

    if (false === $article->delete()) {
        foreach ($article->getMessages() as $message) {
            $error[] = (string) $message;
        }
        throw new Exception(json_encode($error));
    }

    return true;
}

We will once again create a dummy method to delete records. Open modules/Backoffice/Controllers/ArticleController.php, and add the following code:

public function deleteAction($id)
{
    $this->view->disable();

    $article_manager = $this->getDI()->get('core_article_manager');

    try {
        $article_manager->delete($id);
        echo "Article was deleted.";
    } catch (Exception $e) {
        echo $e->getMessage();
    }
}

To test this, simply access http://www.learning-phalcon.localhost/backoffice/article/delete/1. If everything went well, you should see the "Article was deleted." message. Going back to the article list, you won't be able to see the article with ID 1 anymore.
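For completeness, retrieving data with the built-in finders mentioned at the beginning of this article is just as compact. Here is a quick sketch (the conditions are illustrative only):

// All articles
$articles = Article::find();

// The first article matching a condition, with a bound parameter
$article = Article::findFirst([
    'conditions' => 'is_published = :published:',
    'bind'       => ['published' => 1],
]);

// By primary key, as used throughout this article
$article = Article::findFirstById($id);

// Aggregate helpers
$total = Article::count();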
These are the four basic methods: create, read, update, and delete. Later in this book, we will use these methods a lot. If you need/want to, you can use the Phalcon Developer Tools to generate CRUD automatically. Check out https://github.com/phalcon/phalcon-devtools for more information.

Using PHQL

Personally, I am not a fan of PHQL. I prefer using the ORM or raw queries. But if you feel comfortable with it, feel free to use it. PHQL is quite similar to writing raw SQL queries. The main difference is that you will need to pass a model instead of a table name, and use the models manager service or directly call the Phalcon\Mvc\Model\Query class. Here is a method similar to the built-in find() method:

public function find()
{
    $query = new Phalcon\Mvc\Model\Query("SELECT * FROM App\Core\Models\Article", $this->getDI());

    $articles = $query->execute();

    return $articles;
}

To use the models manager, we need to inject this new service. Open the global services file, config/service.php, and add the following code:

$di['modelsManager'] = function () {
    return new Phalcon\Mvc\Model\Manager();
};

Now let's rewrite the find() method by making use of the modelsManager service:

public function find()
{
    $query = $this->modelsManager->createQuery("SELECT * FROM App\Core\Models\Article");

    $articles = $query->execute();

    return $articles;
}

If we need to bind parameters, the method can look like this one:

public function find()
{
    $query = $this->modelsManager->createQuery("SELECT * FROM App\Core\Models\Article WHERE id = :id:");

    $articles = $query->execute(array(
        'id' => 2
    ));

    return $articles;
}

We are not going to use PHQL at all in our project. If you are interested in it, you can find more information in the official documentation at http://docs.phalconphp.com/en/latest/reference/phql.html.

Using raw SQL

Sometimes, using raw SQL is the only way to perform complex queries. Let's see what raw SQL looks like for a custom find() method and a custom update() method:

<?php

use Phalcon\Mvc\Model\Resultset\Simple as Resultset;

class Article extends Base
{
    public static function rawFind()
    {
        $sql = "SELECT * FROM article WHERE id > 0";
        $article = new self();

        return new Resultset(null, $article, $article->getReadConnection()->query($sql));
    }

    public function rawUpdate()
    {
        $sql = "UPDATE article SET is_published = 1";
        $this->getWriteConnection()->execute($sql);
    }
}

As you can see, the rawFind() method returns an instance of Phalcon\Mvc\Model\Resultset\Simple. The rawUpdate() method just executes the query (in this example, we mark all the articles as published). Note that the original rawUpdate() was declared static while using $this, and issued an UPDATE through the read connection; it is shown here as an instance method using the write connection. You might have noticed the getReadConnection() method. This method is very useful when you need to iterate over a large amount of data or if, for example, you use a master-slave connection. As an example, consider the following code snippet:

<?php

class Article extends Base
{
    public function initialize()
    {
        $this->setReadConnectionService('a_slave_db_connection_service'); // By default, it is 'db'
        $this->setWriteConnectionService('db');
    }
}

Working with models can be a complex subject. We cannot cover everything in this book, but we will work with many common techniques to achieve this part of our project. Please spare a little time and read more about working with models at http://docs.phalconphp.com/en/latest/reference/models.html.

Database transactions

If you need to perform multiple database operations, then in most cases you need to ensure that every operation is successful, for the sake of data integrity.
A good database architecture is not always enough to solve potential integrity issues. This is the case where you should use transactions. Let's take as an example a virtual wallet that can be represented as shown in the next few tables.

The User table looks like the following:

ID | NAME
1  | John Doe

The Wallet table looks like this:

ID | USER_ID | BALANCE
1  | 1       | 5000

The Wallet transactions table looks like the following:

ID | WALLET_ID | AMOUNT | DESCRIPTION
1  | 1         | 5000   | Bonus credit
2  | 1         | -1800  | Apple store

How can we create a new user, credit their wallet, and then debit it as the result of a purchase action? This can be achieved in three ways using transactions:

Manual transactions
Implicit transactions
Isolated transactions

A manual transactions example

Manual transactions are useful when we are using only one connection and the transactions are not very complex. For example, if any error occurs during an update operation, we can roll back the changes without affecting the data integrity:

<?php

class UserController extends Phalcon\Mvc\Controller
{
    public function saveAction()
    {
        $this->db->begin();

        $user = new User();
        $user->name = "John Doe";

        if (false === $user->save()) {
            $this->db->rollback();
            return;
        }

        $wallet = new Wallet();
        $wallet->user_id = $user->id;
        $wallet->balance = 0;

        if (false === $wallet->save()) {
            $this->db->rollback();
            return;
        }

        $walletTransaction = new WalletTransaction();
        $walletTransaction->wallet_id = $wallet->id;
        $walletTransaction->amount = 5000;
        $walletTransaction->description = 'Bonus credit';

        if (false === $walletTransaction->save()) {
            $this->db->rollback();
            return;
        }

        $walletTransaction1 = new WalletTransaction();
        $walletTransaction1->wallet_id = $wallet->id;
        $walletTransaction1->amount = -1800;
        $walletTransaction1->description = 'Apple store';

        if (false === $walletTransaction1->save()) {
            $this->db->rollback();
            return;
        }

        $this->db->commit();
    }
}

An implicit transactions example

Implicit transactions are very useful when we need to perform operations on related tables / existing relationships. The related records are assigned to their parent, and a single save() call on the parent persists the whole graph within a transaction, filling in the foreign keys for us:

<?php

class UserController extends Phalcon\Mvc\Controller
{
    public function saveAction()
    {
        $walletTransactions[0] = new WalletTransaction();
        $walletTransactions[0]->amount = 5000;
        $walletTransactions[0]->description = 'Bonus credit';

        $walletTransactions[1] = new WalletTransaction();
        $walletTransactions[1]->amount = -1800;
        $walletTransactions[1]->description = 'Apple store';

        $wallet = new Wallet();
        $wallet->balance = 0;
        $wallet->transactions = $walletTransactions;

        $user = new User();
        $user->name = "John Doe";
        $user->wallet = $wallet;

        $user->save();
    }
}

An isolated transactions example

Isolated transactions are always executed in a separate connection, and they require a transaction manager:

<?php

use Phalcon\Mvc\Model\Transaction\Manager as TxManager,
    Phalcon\Mvc\Model\Transaction\Failed as TxFailed;

class UserController extends Phalcon\Mvc\Controller
{
    public function saveAction()
    {
        try {
            $manager = new TxManager();
            $transaction = $manager->get();

            $user = new User();
            $user->setTransaction($transaction);
            $user->name = "John Doe";

            if ($user->save() == false) {
                $transaction->rollback("Cannot save user");
            }

            $wallet = new Wallet();
            $wallet->setTransaction($transaction);
            $wallet->user_id = $user->id;
            $wallet->balance = 0;

            if ($wallet->save() == false) {
                $transaction->rollback("Cannot save wallet");
            }

            $walletTransaction = new WalletTransaction();
            $walletTransaction->setTransaction($transaction);
            $walletTransaction->wallet_id = $wallet->id;
            $walletTransaction->amount = 5000;
            $walletTransaction->description = 'Bonus credit';

            if ($walletTransaction->save() == false) {
                $transaction->rollback("Cannot create transaction");
            }

            $walletTransaction1 = new WalletTransaction();
            $walletTransaction1->setTransaction($transaction);
            $walletTransaction1->wallet_id = $wallet->id;
            $walletTransaction1->amount = -1800;
            $walletTransaction1->description = 'Apple store';

            if ($walletTransaction1->save() == false) {
                $transaction->rollback("Cannot create transaction");
            }

            $transaction->commit();
        } catch (TxFailed $e) {
            echo "Error: ", $e->getMessage();
        }
    }
}

Summary

In this article, you learned something about the ORM in general and how to use some of the main built-in methods to perform CRUD operations. You also learned about database transactions and how to use PHQL or raw SQL queries.

Resources for Article:

Further resources on this subject:
Using Phalcon Models, Views, and Controllers [article]
Your first FuelPHP application in 7 easy steps [article]
PHP Magic Features [article]
CSS3 – Selectors, Typography, Color Modes, and New Features

Packt
24 Aug 2015
9 min read
In this article by Ben Frain, the author of Responsive Web Design with HTML5 and CSS3, Second Edition, we'll cover the following topics:

What pseudo-classes are
The :last-child selector
The nth-child selectors
The nth rules
nth-based selection in responsive web design

(For more resources related to this topic, see here.)

CSS3 gives us more power to select elements based upon where they sit in the structure of the DOM. Let's consider a common design treatment; we're working on the navigation bar for a larger viewport and we want to have all but the last link over on the left. Historically, we would have needed to solve this problem by adding a class name to the last link so that we could select it, like this:

<nav class="nav-Wrapper">
  <a href="/home" class="nav-Link">Home</a>
  <a href="/About" class="nav-Link">About</a>
  <a href="/Films" class="nav-Link">Films</a>
  <a href="/Forum" class="nav-Link">Forum</a>
  <a href="/Contact-Us" class="nav-Link nav-LinkLast">Contact Us</a>
</nav>

This in itself can be problematic. For example, sometimes, just getting a content management system to add a class to a final list item can be frustratingly difficult. Thankfully, in those eventualities, it's no longer a concern. We can solve this problem and many more with CSS3 structural pseudo-classes.

The :last-child selector

CSS 2.1 already had a selector applicable for the first item in a list:

div:first-child {
  /* Styles */
}

However, CSS3 adds a selector that can also match the last:

div:last-child {
  /* Styles */
}

Let's look at how that selector could fix our prior problem:

@media (min-width: 60rem) {
  .nav-Wrapper {
    display: flex;
  }
  .nav-Link:last-child {
    margin-left: auto;
  }
}

There are also useful selectors for when something is the only item (:only-child) and the only item of a type (:only-of-type).
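As a quick, hedged aside (this variation isn't part of the example files), :only-child could center a navigation link in the flex container when it happens to be the only one rendered:

.nav-Link:only-child {
  margin-left: auto;
  margin-right: auto;
}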
The nth-child selectors

The nth-child selectors let us solve even more difficult problems. With the same markup as before, let's consider how nth-child selectors allow us to select any link(s) within the list. Firstly, what about selecting every other list item? We could select the odd ones like this:

.nav-Link:nth-child(odd) {
  /* Styles */
}

Or, if you wanted to select the even ones:

.nav-Link:nth-child(even) {
  /* Styles */
}

Understanding what nth rules do

For the uninitiated, nth-based selectors can look pretty intimidating. However, once you've mastered the logic and syntax, you'll be amazed what you can do with them. Let's take a look. CSS3 gives us incredible flexibility with a few nth-based rules:

nth-child(n)
nth-last-child(n)
nth-of-type(n)
nth-last-of-type(n)

We've seen that we can use (odd) or (even) values already in an nth-based expression, but the (n) parameter can be used in another couple of ways:

As an integer; for example, :nth-child(2) would select the second item
As a numeric expression; for example, :nth-child(3n+1) would start at 1 and then select every third element

The integer-based usage is easy enough to understand; just enter the number of the element you want to select. The numeric expression version of the selector is the part that can be a little baffling for mere mortals. If math is easy for you, I apologize for this next section. For everyone else, let's break it down.

Breaking down the math

Let's consider 10 spans on a page (you can play about with these by looking at example_05-05):

<span></span>
<span></span>
<span></span>
<span></span>
<span></span>
<span></span>
<span></span>
<span></span>
<span></span>
<span></span>

By default they will be styled like this:

span {
  height: 2rem;
  width: 2rem;
  background-color: blue;
  display: inline-block;
}

As you might imagine, this gives us 10 squares in a line. OK, let's look at how we can select different ones with nth-based selections. For practicality, when considering the expression within the parentheses, I start from the right. So, for example, if I want to figure out what (2n+3) will select, I start with the right-most number (the three here indicates the third item from the left) and know it will select every second element from that point on. So adding this rule:

span:nth-child(2n+3) {
  color: #f90;
  border-radius: 50%;
}

targets the third item and then every subsequent second one after that too (if there were 100 items, it would continue selecting every second one). How about selecting everything from the second item onwards? Well, although you could write :nth-child(1n+2), you don't actually need the first number 1 as, unless otherwise stated, n is equal to 1. We can therefore just write :nth-child(n+2). Likewise, if we wanted to select every third element, rather than write :nth-child(3n+3), we can just write :nth-child(3n), as every third item would begin at the third item anyway, without needing to explicitly state it. The expression can also use negative numbers; for example, :nth-child(3n-2) starts at -2 and then selects every third item. You can also change the direction. By default, once the first part of the selection is found, the subsequent ones go down the elements in the DOM (and therefore from left to right in our example). However, you can reverse that with a minus. For example:

span:nth-child(-2n+3) {
  background-color: #f90;
  border-radius: 50%;
}

This example finds the third item again, but then goes in the opposite direction to select every two elements (up the DOM tree and therefore from right to left in our example). Hopefully, the nth-based expressions are making perfect sense now? The nth-child and nth-last-child selectors differ in that the nth-last-child variant works from the opposite end of the document tree. For example, :nth-last-child(-n+3) starts at 3 from the end and then selects all the items after it. Finally, let's consider :nth-of-type and :nth-last-of-type. While the previous examples count any children regardless of type (always remember the nth-child selector targets all children at the same DOM level, regardless of classes), :nth-of-type and :nth-last-of-type let you be specific about the type of item you want to select.
Consider the following markup (example_05-06):

<span class="span-class"></span>
<span class="span-class"></span>
<span class="span-class"></span>
<span class="span-class"></span>
<span class="span-class"></span>
<div class="span-class"></div>
<div class="span-class"></div>
<div class="span-class"></div>
<div class="span-class"></div>
<div class="span-class"></div>

If we used the selector:

.span-class:nth-of-type(-2n+3) {
  background-color: #f90;
  border-radius: 50%;
}

even though all the elements have the same span-class, we will only actually be targeting the span elements (as they are the first type selected). We will see how CSS4 selectors can solve this issue shortly.

CSS3 doesn't count like JavaScript and jQuery!

If you're used to using JavaScript and jQuery, you'll know that they count from 0 upwards (zero index based). For example, if selecting an element in JavaScript or jQuery, an integer value of 1 would actually be the second element. CSS3, however, starts at 1, so a value of 1 is the first item it matches.

nth-based selection in responsive web designs

Just to close out this little section, I want to illustrate a real-life responsive web design problem and how we can use nth-based selection to solve it. Remember the horizontal scrolling panel from example_05-02? Let's consider how that might look in a situation where horizontal scrolling isn't possible. So, using the same markup, let's turn the top 10 grossing films of 2014 into a grid. For some viewports the grid will only be two items wide; as the viewport increases, we show three items; and at larger sizes still, we show four. Here is the problem though: regardless of the viewport size, we want to prevent any items on the bottom row from having a border on the bottom. You can view this code at example_05-09. With four items per row, the bottom two items carry an unwanted bottom border; that's what we need to remove. However, I want a robust solution, so that if there were another item on the bottom row, the border would also be removed from that one. Now, because there are a different number of items on each row at different viewports, we will also need to change the nth-based selection at different viewports. For the sake of brevity, I'll show you the selection that matches four items per row (the larger of the viewports). You can view the code sample to see the amended selection at the different viewports.

@media (min-width: 55rem) {
  .Item {
    width: 25%;
  }
  /* Get me every fourth item and of those, only ones that are in the last four items */
  .Item:nth-child(4n+1):nth-last-child(-n+4),
  /* Now get me every one after that same collection too. */
  .Item:nth-child(4n+1):nth-last-child(-n+4) ~ .Item {
    border-bottom: 0;
  }
}

You'll notice here that we are chaining the nth-based pseudo-class selectors. It's important to understand that the first doesn't filter the selection for the next; rather, the element has to match each of the selections. For our preceding example, the first element has to be the first item of four and also be one of the last four. Nice! Thanks to nth-based selections, we have a defensive set of rules to remove the bottom border regardless of the viewport size or the number of items we are showing.

Summary

In this article, we've learned what structural pseudo-classes are. We've also learned what nth rules do and seen nth-based selection at work in responsive web design.

CSS Grids for RWD

Packt
24 Aug 2015
12 min read
In this article by the author, Ricardo Zea, of the book, Mastering Responsive Web Design, we're going to learn how to create a custom CSS grid.

Responsive Web Design (RWD) has introduced a new layer of work for everyone building responsive websites and apps. When we have to test our work on different devices and in different dimensions, wherever the content breaks, we need to add a breakpoint and test again. This can happen many, many times. So, building a website or app will take a bit longer than it used to.

To make things a little more interesting, as web designers and developers, we need to be mindful of how the content is laid out at different dimensions and how a grid can help us structure the content to different layouts.

Now that we have mentioned grids, have you ever asked yourself, "what do we use a grid for anyway?" To borrow a few terms from the design industry and answer that question, we use a grid to allow the content to have rhythm, proportion, and balance. The objective is that those who use our websites/apps will have a more pleasant experience with our content, since it will be easier to scan (rhythm), easier to read (proportion), and organized (balance).

In order to speed up the design and build processes while keeping all the content properly formatted in different dimensions, many authors and companies have created CSS frameworks and CSS grids that contain not only a grid but also many other features and styles that can be leveraged by using a simple class name. As time goes by and browsers start supporting more and more CSS3 properties, such as Flexbox, it'll become easier to work with layouts. This will render the grids inside CSS frameworks almost unnecessary. Let's see what CSS grids are all about and how they can help us with RWD.

Creating a custom CSS grid

Since we're mastering RWD, we have the luxury of creating our own CSS grid. However, we need to work smart, not hard. Let's lay out our CSS grid requirements:

It should have 12 columns.
It should be 1200px wide to account for 1280px screens.
It should be fluid, with relative units (percentages) for the columns and gutters.
It should use the mobile-first approach.
It should use the SCSS syntax.
It should be reusable for other projects.
It should be simple to use and understand.
It should be easily scalable.

Here's what our 1200px wide, 12-column grid with 20px gutters looks like:

The left and right padding in black are 10px each. We'll convert those 10px into percentages at the end of this process.

Doing the math

We're going to use the RWD magic formula: (target ÷ context) x 100 = result %.

Our context is going to be 1200px. So let's convert one column: 80 ÷ 1200 x 100 = 6.67%.

For two columns, we have to account for the gutter, which is 20px. In other words, we can't say that two columns are exactly 160px; that's not entirely correct. Two columns are: 80px + 20px + 80px = 180px. Let's now convert two columns: 180 ÷ 1200 x 100 = 15%.

For three columns, we now have to account for two gutters: 80px + 20px + 80px + 20px + 80px = 280px. Let's now convert three columns: 280 ÷ 1200 x 100 = 23.33%.

Can you see the pattern now? Every time we add a column, all we need to do is add 100 to the pixel value we convert, and this accounts for the gutters too! Check the screenshot of the grid we saw moments ago: you can see the values of the columns increment by 100.
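Since each extra column simply adds 100 to the pixel value we convert, the whole set of widths can be generated rather than worked out by hand. Here is a small Node.js sketch, purely illustrative (the rounding differs slightly from the hand-calculated values, which drop trailing zeros):

var context = 1200; // the grid is 1200px wide

for (var n = 1; n <= 12; n++) {
  // n columns span n * 80px plus (n - 1) * 20px of gutters, i.e. n * 100 - 20.
  var target = n * 100 - 20;
  var width = (target / context * 100).toFixed(2);
  console.log(n + ' columns: ' + target + ' ÷ 1200 x 100 = ' + width + '%');
}

The output matches the equations that follow.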
So, all the equations are as follows:

1 column: 80 ÷ 1200 x 100 = 6.67%
2 columns: 180 ÷ 1200 x 100 = 15%
3 columns: 280 ÷ 1200 x 100 = 23.33%
4 columns: 380 ÷ 1200 x 100 = 31.67%
5 columns: 480 ÷ 1200 x 100 = 40%
6 columns: 580 ÷ 1200 x 100 = 48.33%
7 columns: 680 ÷ 1200 x 100 = 56.67%
8 columns: 780 ÷ 1200 x 100 = 65%
9 columns: 880 ÷ 1200 x 100 = 73.33%
10 columns: 980 ÷ 1200 x 100 = 81.67%
11 columns: 1080 ÷ 1200 x 100 = 90%
12 columns: 1180 ÷ 1200 x 100 = 98.33%

Let's create the SCSS for the 12-column grid:

//Grid 12 Columns
.grid {
  &-1 { width:6.67%; }
  &-2 { width:15%; }
  &-3 { width:23.33%; }
  &-4 { width:31.67%; }
  &-5 { width:40%; }
  &-6 { width:48.33%; }
  &-7 { width:56.67%; }
  &-8 { width:65%; }
  &-9 { width:73.33%; }
  &-10 { width:81.67%; }
  &-11 { width:90%; }
  &-12 { width:98.33%; }
}

Using hyphens (-) to separate words allows for easier selection of the terms when editing the code.

Adding the UTF-8 character set directive and a Credits section

Don't forget to include the UTF-8 encoding directive at the top of the file to let browsers know the character set we're using. Let's spruce up our code by adding a Credits section at the top. The code is as follows:

@charset "UTF-8";

/* Custom Fluid & Responsive Grid System
Structure: Mobile-first (min-width)
Syntax: SCSS
Grid: Float-based
Created by: Your Name
Date: MM/DD/YY */

//Grid 12 Columns
.grid {
  &-1 { width:6.67%; }
  &-2 { width:15%; }
  &-3 { width:23.33%; }
  &-4 { width:31.67%; }
  &-5 { width:40%; }
  &-6 { width:48.33%; }
  &-7 { width:56.67%; }
  &-8 { width:65%; }
  &-9 { width:73.33%; }
  &-10 { width:81.67%; }
  &-11 { width:90%; }
  &-12 { width:98.33%; }
}

Notice the Credits are commented with CSS style comments: /* */. These types of comments, depending on the way we compile our SCSS files, don't get stripped out. This way, the Credits are always visible so that others know who authored the file. This may or may not work for teams. Also, the impact on file size of having the Credits display is imperceptible, if any.

Including the box-sizing property and the mobile-first mixin

Including the box-sizing property allows the browser's box model to account for the padding inside the containers; this means the padding gets subtracted rather than added, thus maintaining the defined width(s). Since the structure of our custom CSS grid is going to be mobile-first, we need to include the mixin that will handle this aspect:

@charset "UTF-8";

/* Custom Fluid & Responsive Grid System
Structure: Mobile-first (min-width)
Syntax: SCSS
Grid: Float-based
Created by: Your Name
Date: MM/DD/YY */

*, *:before, *:after {
  box-sizing: border-box;
}

//Mobile-first Media Queries Mixin
@mixin forLargeScreens($width) {
  @media (min-width: $width/16+em) { @content }
}

//Grid 12 Columns
.grid {
  &-1 { width:6.67%; }
  &-2 { width:15%; }
  &-3 { width:23.33%; }
  &-4 { width:31.67%; }
  &-5 { width:40%; }
  &-6 { width:48.33%; }
  &-7 { width:56.67%; }
  &-8 { width:65%; }
  &-9 { width:73.33%; }
  &-10 { width:81.67%; }
  &-11 { width:90%; }
  &-12 { width:98.33%; }
}
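The division by 16 in the mixin is nothing more than a pixel-to-em conversion, assuming the browser default font size of 16px. The same arithmetic in plain JavaScript, just to make the mixin's math explicit:

function pxToEm(px) {
  return px / 16 + 'em'; // 16px is the default browser font size
}

console.log(pxToEm(640)); // '40em', so forLargeScreens(640) compiles to @media (min-width: 40em)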
The main container and converting 10px to a percentage value

Since we're using the mobile-first approach, our main container is going to be 100% wide by default; but we're also going to give it a maximum width of 1200px, since the requirement is to create a grid of that size. We're also going to convert 10px into a percentage value, so using the RWD magic formula: 10 ÷ 1200 x 100 = 0.83%. However, as we've seen before, 10px, or in this case 0.83%, is not enough padding and makes the content appear too close to the edge of the main container. So we're going to increase the padding to 20px: 20 ÷ 1200 x 100 = 1.67%.

We're also going to horizontally center the main container with margin: auto;. There's no need to declare zero values for the top and bottom margins to center horizontally. In other words, margin: 0 auto; isn't necessary; just declaring margin: auto; is enough.

Let's include these values now:

@charset "UTF-8";

/* Custom Fluid & Responsive Grid System
Structure: Mobile-first (min-width)
Syntax: SCSS
Grid: Float-based
Created by: Your Name
Date: MM/DD/YY */

*, *:before, *:after {
  box-sizing: border-box;
}

//Mobile-first Media Queries Mixin
@mixin forLargeScreens($width) {
  @media (min-width: $width/16+em) { @content }
}

//Main Container
.container-12 {
  width: 100%;
  //Change this value to ANYTHING you want, no need to edit anything else.
  max-width: 1200px;
  padding: 0 1.67%;
  margin: auto;
}

//Grid 12 Columns
.grid {
  &-1 { width:6.67%; }
  &-2 { width:15%; }
  &-3 { width:23.33%; }
  &-4 { width:31.67%; }
  &-5 { width:40%; }
  &-6 { width:48.33%; }
  &-7 { width:56.67%; }
  &-8 { width:65%; }
  &-9 { width:73.33%; }
  &-10 { width:81.67%; }
  &-11 { width:90%; }
  &-12 { width:98.33%; }
}

In the padding property, it's the same if we type 0.83% or .83%. We can omit the zero. It's always a good practice to keep our code as streamlined as possible. This is the same principle as when we use hexadecimal shorthand values: #336699 is the same as #369.

Making it mobile-first

On small screens, all the columns are going to be 100% wide. Since we're working with a single-column layout, we don't use gutters; this means we don't have to declare margins, at least yet. At 640px, the grid will kick in and assign corresponding percentages to each column, so we're going to include the columns in a 40em (640px) media query and float them to the left. At this point, we need gutters, so we declare .83% margins to the left and right.

I chose 40em (640px) arbitrarily and only as a starting point. Remember to create content-based breakpoints rather than device-based ones.

The code is as follows:

@charset "UTF-8";

/* Custom Fluid & Responsive Grid System
Structure: Mobile-first (min-width)
Syntax: SCSS
Grid: Float-based
Created by: Your Name
Date: MM/DD/YY */

*, *:before, *:after {
  box-sizing: border-box;
}

//Mobile-first Media Queries Mixin
@mixin forLargeScreens($width) {
  @media (min-width: $width/16+em) { @content }
}

//Main Container
.container-12 {
  width: 100%;
  //Change this value to ANYTHING you want, no need to edit anything else.
  max-width: 1200px;
  padding: 0 1.67%;
  margin: auto;
}

//Grid
.grid {
  //Global Properties - Mobile-first
  &-1, &-2, &-3, &-4, &-5, &-6, &-7, &-8, &-9, &-10, &-11, &-12 {
    width: 100%;
  }
  @include forLargeScreens(640) { //Totally arbitrary width, it's only a starting point.
    //Global Properties - Large screens
    &-1, &-2, &-3, &-4, &-5, &-6, &-7, &-8, &-9, &-10, &-11, &-12 {
      float: left;
      margin: 0 .83%;
    }
    //Grid 12 Columns
    &-1 { width:6.67%; }
    &-2 { width:15%; }
    &-3 { width:23.33%; }
    &-4 { width:31.67%; }
    &-5 { width:40%; }
    &-6 { width:48.33%; }
    &-7 { width:56.67%; }
    &-8 { width:65%; }
    &-9 { width:73.33%; }
    &-10 { width:81.67%; }
    &-11 { width:90%; }
    &-12 { width:98.33%; }
  }
}
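Before moving on, it's worth convincing ourselves that the numbers add up: an element spanning n columns, plus its two .83% gutters, should occupy exactly n twelfths of the container. A quick, illustrative sanity check:

var gutters = 2 * (10 / 1200 * 100); // two 0.83% margins, 1.67% in total

for (var n = 1; n <= 12; n++) {
  var width = (n * 100 - 20) / 1200 * 100; // the .grid-n width
  var total = width + gutters;
  // Both columns print the same value: n twelfths of 100%.
  console.log(n, total.toFixed(2), (n * 100 / 12).toFixed(2));
}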
Adding the row and float clearing rules

If we use rows in our HTML structure or add the class .clear to a tag, we can declare all the float clearing values in a single nested rule with the :before and :after pseudo-elements. It's the same thing to use single or double colons when declaring pseudo-elements. The double colon is the CSS3 syntax and the single colon is the CSS2.1 syntax. The idea was to be able to differentiate them at a glance so a developer could tell which CSS version they were written in. However, IE8 and below do not support the double-colon syntax.

The float clearing technique is an adaptation of David Walsh's CSS snippet (http://davidwalsh.name/css-clear-fix). We're also adding a rule for the rows with a bottom margin of 10px to separate them from each other, while removing that margin from the last row to avoid creating unwanted extra spacing at the bottom. Finally, we add the clearing rule for legacy IEs.

Let's include these rules now:

@charset "UTF-8";

/* Custom Fluid & Responsive Grid System
Structure: Mobile-first (min-width)
Syntax: SCSS
Grid: Float-based
Created by: Your Name
Date: MM/DD/YY */

*, *:before, *:after {
  box-sizing: border-box;
}

//Mobile-first Media Queries Mixin
@mixin forLargeScreens($width) {
  @media (min-width: $width/16+em) { @content }
}

//Main Container
.container-12 {
  width: 100%;
  //Change this value to ANYTHING you want, no need to edit anything else.
  max-width: 1200px;
  padding: 0 1.67%;
  margin: auto;
}

//Grid
.grid {
  //Global Properties - Mobile-first
  &-1, &-2, &-3, &-4, &-5, &-6, &-7, &-8, &-9, &-10, &-11, &-12 {
    width: 100%;
  }
  @include forLargeScreens(640) { //Totally arbitrary width, it's only a starting point.
    //Global Properties - Large screens
    &-1, &-2, &-3, &-4, &-5, &-6, &-7, &-8, &-9, &-10, &-11, &-12 {
      float: left;
      margin: 0 .83%;
    }
    //Grid 12 Columns
    &-1 { width:6.67%; }
    &-2 { width:15%; }
    &-3 { width:23.33%; }
    &-4 { width:31.67%; }
    &-5 { width:40%; }
    &-6 { width:48.33%; }
    &-7 { width:56.67%; }
    &-8 { width:65%; }
    &-9 { width:73.33%; }
    &-10 { width:81.67%; }
    &-11 { width:90%; }
    &-12 { width:98.33%; }
  }
}

//Clear Floated Elements - http://davidwalsh.name/css-clear-fix
.clear, .row {
  &:before, &:after {
    content: '';
    display: table;
  }
  &:after {
    clear: both;
  }
}

//Use rows to nest containers
.row {
  margin-bottom: 10px;
  &:last-of-type {
    margin-bottom: 0;
  }
}

//Legacy IE
.clear {
  zoom: 1;
}

Let's recap our CSS grid requirements:

12 columns: Starting from .grid-1 to .grid-12.
1200px wide to account for 1280px screens: The .container-12 container has max-width: 1200px;.
Fluid and relative units (percentages) for the columns and gutters: The percentages go from 6.67% to 98.33%.
Mobile-first: We added the mobile-first mixin (using min-width) and nested the grid inside of it.
The SCSS syntax: The whole file is Sass-based.
Reusable: As long as we're using 12 columns and the mobile-first approach, we can use this CSS grid multiple times.
Simple to use and understand: The class names are very straightforward. The .grid-6 class is used for an element that spans 6 columns, .grid-7 for one that spans 7 columns, and so on.
Easily scalable: If we want to use 980px instead of 1200px, all we need to do is change the value in the .container-12 max-width property. Since all the elements use relative units (percentages), everything will adapt proportionally to the new width, to any width for that matter.

Pretty sweet if you ask me.

Summary

A lot to digest here, eh? Creating our custom CSS grid with the traditional floats technique was mostly a matter of spotting the pattern: each additional column just adds 100 to the pixel value we convert. Now, we can create a 12-column grid at any width we want.

Creating test suites, specs and expectations in Jest

Packt
12 Aug 2015
7 min read
In this article by Artemij Fedosejev, the author of React.js Essentials, we will take a look at test suites, specs, and expectations.

To write a test for JavaScript functions, you need a testing framework. Fortunately, Facebook built their own unit test framework for JavaScript called Jest. It is built on top of Jasmine, another well-known JavaScript test framework. If you're familiar with Jasmine you'll find Jest's approach to testing very similar. However, I'll make no assumptions about your prior experience with testing frameworks and discuss the basics first.

The fundamental idea of unit testing is that you test only one piece of functionality in your application, usually implemented by one function. And you test it in isolation, meaning that all other parts of your application which that function depends on are not used by your tests. Instead, they are imitated by your tests. To imitate a JavaScript object is to create a fake one that simulates the behavior of the real object. In unit testing, the fake object is called a mock and the process of creating it is called mocking.

Jest automatically mocks dependencies when you're running your tests. Better yet, it automatically finds tests to execute in your repository. Let's take a look at an example.

Create a directory called ./snapterest/source/js/utils/ and create a new file called TweetUtils.js within it, with the following contents:

function getListOfTweetIds(tweets) {
  return Object.keys(tweets);
}

module.exports.getListOfTweetIds = getListOfTweetIds;

The TweetUtils.js file is a module with the getListOfTweetIds() utility function for our application to use. Given an object with tweets, getListOfTweetIds() returns an array of tweet IDs. Using the CommonJS module pattern we export this function:

module.exports.getListOfTweetIds = getListOfTweetIds;

Jest Unit Testing

Now let's write our first unit test with Jest. We'll be testing our getListOfTweetIds() function. Create a new directory: ./snapterest/source/js/utils/__tests__/. Jest will run any tests in any __tests__ directories that it finds within your project structure, so it's important to name your directories with tests: __tests__. Create a TweetUtils-test.js file inside of __tests__:

jest.dontMock('../TweetUtils');

describe('Tweet utilities module', function () {
  it('returns an array of tweet ids', function () {
    var TweetUtils = require('../TweetUtils');
    var tweetsMock = {
      tweet1: {},
      tweet2: {},
      tweet3: {}
    };
    var expectedListOfTweetIds = ['tweet1', 'tweet2', 'tweet3'];
    var actualListOfTweetIds = TweetUtils.getListOfTweetIds(tweetsMock);

    expect(actualListOfTweetIds).toEqual(expectedListOfTweetIds);
  });
});

First we tell Jest not to mock our TweetUtils module:

jest.dontMock('../TweetUtils');

We do this because Jest will automatically mock modules returned by the require() function. In our test we're requiring the TweetUtils module:

var TweetUtils = require('../TweetUtils');

Without the jest.dontMock('../TweetUtils') call, Jest would return an imitation of our TweetUtils module, instead of the real one. But in this case we actually need the real TweetUtils module, because that's what we're testing.

Creating test suites

Next we call a global Jest function describe(). In our TweetUtils-test.js file we're not just creating a single test; instead, we're creating a suite of tests. A suite is a collection of tests that collectively test a bigger unit of functionality.
For example, a suite can have multiple tests which test all individual parts of a larger module. In our example, we have a TweetUtils module with a number of utility functions. In that situation we would create a suite for the TweetUtils module and then create tests for each individual utility function, like getListOfTweetIds().

describe defines a suite and takes two parameters:

Suite name: the description of what is being tested: 'Tweet utilities module'.
Suite implementation: the function that implements this suite.

In our example, the suite is:

describe('Tweet utilities module', function () {
  // Suite implementation goes here...
});

Defining specs

How do you create an individual test? In Jest, individual tests are called specs. They are defined by calling another global Jest function, it(). Just like describe(), it() takes two parameters:

Spec name: the title that describes what is being tested by this spec: 'returns an array of tweet ids'.
Spec implementation: the function that implements this spec.

In our example, the spec is:

it('returns an array of tweet ids', function () {
  // Spec implementation goes here...
});

Let's take a closer look at the implementation of our spec:

var TweetUtils = require('../TweetUtils');
var tweetsMock = {
  tweet1: {},
  tweet2: {},
  tweet3: {}
};
var expectedListOfTweetIds = ['tweet1', 'tweet2', 'tweet3'];
var actualListOfTweetIds = TweetUtils.getListOfTweetIds(tweetsMock);

expect(actualListOfTweetIds).toEqual(expectedListOfTweetIds);

This spec tests whether the getListOfTweetIds() method of our TweetUtils module returns an array of tweet IDs when given an object with tweets.

First we import the TweetUtils module:

var TweetUtils = require('../TweetUtils');

Then we create a mock object that simulates the real tweets object:

var tweetsMock = {
  tweet1: {},
  tweet2: {},
  tweet3: {}
};

The only requirement for this mock object is to have tweet IDs as object keys. The values are not important, hence we choose empty objects. Key names are not important either, so we can name them tweet1, tweet2, and tweet3. This mock object doesn't fully simulate the real tweet object. Its sole purpose is to simulate the fact that its keys are tweet IDs.

The next step is to create an expected list of tweet IDs:

var expectedListOfTweetIds = ['tweet1', 'tweet2', 'tweet3'];

We know what tweet IDs to expect because we've mocked a tweets object with the same IDs.

The next step is to extract the actual tweet IDs from our mocked tweets object. For that we use getListOfTweetIds(), which takes the tweets object and returns an array of tweet IDs:

var actualListOfTweetIds = TweetUtils.getListOfTweetIds(tweetsMock);

We pass tweetsMock to that method and store the result in actualListOfTweetIds. The reason this variable is named actualListOfTweetIds is because this list of tweet IDs is produced by the actual getListOfTweetIds() function that we're testing.

Setting Expectations

The final step will introduce us to a new important concept:

expect(actualListOfTweetIds).toEqual(expectedListOfTweetIds);

Let's think about the process of testing. We need to take an actual value produced by the method that we're testing, getListOfTweetIds(), and match it to the expected value that we know in advance. The result of that match will determine if our test has passed or failed.
The reason why we can guess what getListOfTweetIds() will return in advance is because we've prepared the input for it; that's our mock object:

var tweetsMock = {
  tweet1: {},
  tweet2: {},
  tweet3: {}
};

So we can expect the following output from calling TweetUtils.getListOfTweetIds(tweetsMock):

['tweet1', 'tweet2', 'tweet3']

But because something can go wrong inside of getListOfTweetIds(), we cannot guarantee this result; we can only expect it. That's why we need to create an expectation. In Jest, an Expectation is built using expect(), which takes an actual value, for example, actualListOfTweetIds:

expect(actualListOfTweetIds)

Then we chain it with a Matcher function that compares the actual value with the expected value and tells Jest whether the expectation was met:

expect(actualListOfTweetIds).toEqual(expectedListOfTweetIds);

In our example we use the toEqual() matcher function to compare two arrays. See the Jest documentation for a list of all built-in matcher functions.

And that's how you create a spec. A spec contains one or more expectations. Each expectation tests the state of your code. A spec can be either a passing spec or a failing spec. A spec is a passing spec only when all expectations are met; otherwise it's a failing spec.

Well done, you've written your first testing suite with a single spec that has one expectation. Continue reading React.js Essentials to continue your journey into testing.
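As a quick exercise before you do: a suite rarely stays at one spec. Here is what a second spec covering an edge case might look like, added inside the same describe() block (this spec is hypothetical and not part of the book's sample code):

it('returns an empty array when given an empty tweets object', function () {
  var TweetUtils = require('../TweetUtils');

  // No tweets at all: Object.keys({}) yields an empty array.
  var actualListOfTweetIds = TweetUtils.getListOfTweetIds({});

  expect(actualListOfTweetIds).toEqual([]);
});

Assuming Jest is installed as a development dependency and package.json contains "scripts": { "test": "jest" }, running npm test will find and execute both specs automatically.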


A Typical JavaScript Project

Packt
11 Aug 2015
29 min read
In this article by Phillip Fehre, author of the book JavaScript Domain-Driven Design, we will explore a practical approach to developing software with advanced business logic. There are many strategies to keep development flowing and the code and thoughts organized: there are frameworks building on conventions, there are different software paradigms such as object orientation and functional programming, and there are methodologies such as test-driven development. All these pieces solve problems and are like tools in a toolbox to help manage growing complexity in software, but they also mean that today, when starting something new, there are loads of decisions to make even before we get started at all. Do we want to develop a single-page application? Do we want to develop following the standards of a framework closely, or do we want to set our own? These kinds of decisions are important, but they also largely depend on the context of the application, and in most cases the best answer to the questions is: it depends.

So, how do we really start? Do we really even know what our problem is and, if we understand it, does this understanding match that of others? Developers are very seldom the domain experts on a given topic. Therefore, the development process needs input from outside, through experts of the business domain, when it comes to specifying the behavior a system should have. Of course, this is not only true for a completely new project developed from the ground up; it can also be applied to any new feature added during the development of an application or product. So, even if your project is well on its way already, there will come a time when a new feature just seems to bog the whole thing down, and at this stage you may want to think about alternative ways of approaching this new piece of functionality. Domain-driven design gives us another useful piece to play with, especially to solve the need to interact with other developers, business experts, and product owners.

As, in the modern era, JavaScript becomes a more and more persuasive choice to build projects in, and in many cases, like browser-based web applications, it actually is the only viable choice, the need to design software with JavaScript is more pressing than ever. In the past, the issues of a more involved software design were focused on either backend or client application development; with the rise of JavaScript as a language to develop complete systems in, this has changed. The development of a JavaScript client in the browser is a complex part of developing the application as a whole, and so is the development of server-side JavaScript applications with the rise of Node.js. In modern development, JavaScript plays a major role and therefore needs to receive the same amount of attention in development practices and processes as other languages and frameworks have in the past.

A browser-based client-side application often holds the same amount of logic as the backend, or even more. With this change, a lot of new problems and solutions have arisen, the first being the movement toward better encapsulation and modularization of JavaScript projects. New frameworks have arisen and established themselves as the bases for many projects. Last but not least, JavaScript made the jump from being the language in the browser to move more and more to the server side, by means of Node.js or as the query language of choice in some NoSQL databases.
Let me take you on a tour of developing a piece of software, taking you through the stages of creating an application from start to finish using the concepts domain-driven design introduced and how they can be interpreted and applied. In this article, you will cover:

The core idea of domain-driven design
Our business scenario: managing an orc dungeon
Tracking the business logic
Understanding the core problem and selecting the right solution
Learning what domain-driven design is

The core idea of domain-driven design

There are many software development methodologies around, all with pros and cons, but all also have a core idea which is to be applied and understood to get the methodology right. For domain-driven design, the core lies in the realization that, since we are not the experts in the domain the software is placed in, we need to gather input from other people who are experts. This realization means that we need to optimize our development process to gather and incorporate this input.

So, what does this mean for JavaScript? When thinking about a browser application to expose a certain functionality to a consumer, we need to think about many things, for example:

How does the user expect the application to behave in the browser?
How does the business workflow work?
What does the user know about the workflow?

These three questions already involve three different types of experts: a person skilled in user experience can help with the first query, a business domain expert can address the second query, and a third person can research the target audience and provide input on the last query. Bringing all of this together is the goal we are trying to achieve. While the different types of people matter, the core idea is that the process of getting them involved is always the same. We provide a common way to talk about the process and establish a quick feedback loop for them to review. In JavaScript, this can be easier than in most other languages due to the nature of it being run in a browser, readily available to be modified and prototyped with; an advantage Java Enterprise Applications can only dream of. We can work closely with the user experience designer, adjusting the expected interface and at the same time changing the workflow dynamically to suit our business needs, first on the frontend in the browser and later moving the knowledge out of the prototype to the backend, if necessary.

Managing an orc dungeon

Domain-driven design is often stated in the context of having complex business logic to deal with. In fact, most software development practices are not really useful when dealing with a very small, cut-out problem. Like with every tool, you need to be clear when it is the right time to use it. So, what does really fall into the realm of complex business logic? It means that the software has to describe a real-world scenario, which normally involves human thinking and interaction. Writing software that deals with decisions which 90 per cent of the time go a certain way and ten per cent of the time go some other way is notoriously hard, especially when explaining it to people not familiar with software. These kinds of decisions are the core of many business problems, but even though this is an interesting problem to solve, following how the next accounting software is developed does not make an interesting read. With this in mind, I would like to introduce you to the problem we are trying to solve: managing a dungeon.
Inside the dungeon

Running an orc dungeon seems pretty simple from the outside, but managing it without getting killed is actually rather complicated. For this reason, we are contacted by an orc master who struggles with keeping his dungeon running smoothly. When we arrive at the dungeon, he explains to us how it actually works and what factors come into play.

Even greenfield projects often have some status quo that works. This is important to keep in mind, since it means that we don't have to come up with the feature set, but match the feature set of the current reality.

Many outside factors play a role, and the dungeon is not as independent as it would like to be. After all, it is part of the orc kingdom, and the king demands that his dungeons make him money. However, money is just part of the deal. How does it actually make money? The prisoners need to mine gold, and to do that there needs to be a certain number of prisoners kept in the dungeon. The way an orc kingdom is run also results in the constant arrival of new prisoners: new captures from war, those who couldn't afford their taxes, and so on. There always needs to be room for new prisoners.

The good thing is that every dungeon is interconnected, and to achieve its goals it can rely on others by requesting a prisoner transfer to either fill up free cells or get rid of overflowing prisoners in its cells. These options allow the dungeon masters to keep a close watch on the prisoners being kept and the amount of cell space available. Sending off prisoners to other dungeons as needed and requesting new ones from other dungeons, in case there is too much free cell space available, keeps the mining workforce at an optimal level for maximizing profit, while at the same time being ready to accommodate the eventual arrival of a high-value inmate sent directly to the dungeon. So far, the explanation is sound, but let's dig a little deeper and see what is going on.

Managing incoming prisoners

Prisoners can arrive for a couple of reasons, such as when a dungeon is overflowing and decides to transfer some of its inmates to a dungeon with free cells; unless they flee on the way, they will eventually arrive at our dungeon sooner or later. Another source of prisoners is the ever-expanding orc kingdom itself. The orcs will constantly enslave new folk, and telling our king, "Sorry, we don't have room", is not a valid option; it might actually result in us being one of the new prisoners. Looking at this, our dungeon will fill up eventually, but we need to make sure this doesn't happen. The way to handle this is by transferring inmates early enough to make room. This is obviously going to be the most complicated thing; we need to weigh several factors to decide when and how many prisoners to transfer.

The reason we can't simply solve this via thresholds is that, looking at the dungeon structure, this is not the only way we can lose inmates. After all, people are not always happy with being gold-mining slaves and may decide the risk of dying in a prison is as high as dying while fleeing, so they decide to flee. The same is true while prisoners are on the move between different dungeons, and this is not unlikely. So even though we have a hard limit of physical cells, we need to deal with the soft number of incoming and outgoing prisoners. This is a classical problem in business software. Matching these numbers against each other and optimizing for a certain outcome is basically what computer data analysis is all about.
The current state of the art

With all this in mind, it becomes clear that the orc master's current system of keeping track via a badly written note on a napkin is not perfect. In fact, it almost got him killed multiple times already. To give you an example of what can happen, he tells the story of how one time the king captured four clan leaders and wanted to make them miners just to humiliate them. However, when arriving at the dungeon, he realized that there was no room and had to travel to the next dungeon to drop them off, all while having them laugh at him because he obviously didn't know how to run a kingdom. This was due to our orc master having forgotten about the arrival of eight transfers just the day before.

Another time, the orc master was not able to deliver any gold when the king's sheriff arrived, because he didn't know he only had one-third of his required prisoners to actually mine anything. This time it was due to having multiple people count the inmates, and instead of recording them cell by cell, they actually tried to do it in their heads. For an orc, this is a setup for failure. All this comes down to bad organization, and having your system for managing dungeon inmates drawn on the back of a napkin certainly qualifies as such.

Digital dungeon management

Guided by the recent failures, the orc master has finally realized it is time to move to modern times, and he wants to revolutionize the way he manages his dungeon by making everything digital. He strives to have a system that basically takes the busywork out of managing by automatically calculating the necessary transfers according to the current number of cells filled. He would like to just sit back, relax, and let the computer do all the work for him.

A common pattern when talking with a business expert about software is that they are not aware of what can be done. Always remember that we, as developers, are the software experts and therefore are the only ones who are able to manage these expectations.

It is time now for us to think about what we need to know about the details and how to deal with the different scenarios. The orc master is not really familiar with the concepts of software development, so we need to make sure we talk in a language he can follow and understand, while making sure we get all the answers we need. We are hired for our expertise in software development, so we need to make sure to manage the expectations as well as the feature set and development flow. The development itself is of course going to be an iterative process, since we can't expect to get a list of everything needed right in one go. It also means that we will need to keep possible changes in mind. This is an essential part of structuring complex business software. Software containing more complex business logic is prone to change rapidly as the business adapts itself and the users leverage the functionality the software provides. Therefore, it is essential to keep a common language between the people who understand the business and the developers who understand the software.

Incorporate the business terms wherever possible; it will ease communication between the business domain experts and you as a developer and therefore prevent misunderstandings early on.

Specification

The best way to create a good understanding of what a piece of software needs to do, at least to be useful, is to get an understanding of what the future users were doing before your software existed.
Therefore, we sit down with the orc master as he is managing his incoming and outgoing prisoners and let him walk us through what he is doing on a day-to-day basis. The dungeon comprises 100 cells, each either occupied by a prisoner or empty at the moment. When managing these cells, we can identify distinct tasks by watching the orc do his job. Drawing out what we see, we can roughly sketch it like this:

There are a couple of organizationally important events and states to be tracked. They are:

Currently available or empty cells
Outgoing transfer states
Incoming transfer states

Each transfer can be in multiple states that the master has to know about to make further decisions on what to do next. Keeping a view of the world like this is not easy, especially accounting for the number of concurrent updates happening. Tracking the state of everything results in further tasks for our master to do:

Update the tracking
Start outgoing transfers when too many cells are occupied
Respond to incoming transfers by starting to track them
Ask for incoming transfers if the number of occupied cells is too low

So, what does each of them involve?

Tracking available cells

The current state of the dungeon is reflected by the state of its cells, so the first task is to get this knowledge. In its basic form, this is easily achievable by simply counting every occupied and every empty cell and writing down the values. Right now, our orc master tours the dungeon in the morning, noting each free cell and assuming that every other one must be occupied. To make sure he does not get into trouble, he no longer trusts his subordinates to do that!

The problem is that there is only one central sheet to keep track of everything, so his keepers may overwrite each other's information accidentally if there is more than one person counting and writing down cells. Still, this is a good start and is sufficient as it is right now, although it misses some information that would be interesting to have, for example, the number of inmates fleeing the dungeon and an understanding of the expected free cells based on this rate. For us, this means that we need to be able to track this information inside the application, since ultimately we want to project the expected number of free cells so that we can effectively create recommendations or warnings based on the dungeon state.

Starting outgoing transfers

The second part is to actually handle getting rid of prisoners in case the dungeon fills up. In this concrete case, this means that if the number of free cells drops beneath 10, it is time to move prisoners out, since there may be new prisoners arriving at any time. This strategy works pretty reliably since, from experience, it has been established that there are hardly any larger transports, so the recommendation is to stick with it in the beginning. However, we can already see some optimizations which currently are too complex.

Drawing from the experience of the business is important, as it makes it possible to encode such knowledge and reduce mistakes, but be mindful, since encoding detailed experience is probably one of the most complex things to do.

In the future, we want to optimize this based on the rate of inmates fleeing the dungeon and new prisoners arriving due to being captured, as well as the projection of new arrivals from transfers.
All this is impossible right now, since it would just overwhelm the current tracking system, but it actually comes down to capturing as much data as possible and analyzing it, which is something modern computer systems are good at. After all, it could save the orc master's head!

Tracking the state of incoming transfers

On some days, a raven will arrive bringing news that some prisoners have been sent on their way to be transferred to our dungeon. There really is nothing we can do about it, but the protocol is to send the raven out five days prior to the prisoners actually arriving to give the dungeon a chance to prepare. Should prisoners flee along the way, another raven will be sent informing the dungeon of this embarrassing situation. These messages have to be sifted through every day to make sure there actually is room available for those arriving. This is a big part of projecting the number of filled cells, and also the most variable part. It is important to note that every message should only be processed once, but it can arrive at any time during the day. Right now, they are all dealt with by one orc, who throws them out immediately after noting what the content results in. One problem with the current system is that, since other dungeons are managed the same way ours currently is, they react with quick and large transfers when they get in trouble, which makes this quite unpredictable.

Initiating incoming transfers

Besides keeping the prisoners where they belong, mining gold is the second major goal of the dungeon. To do this, there needs to be a certain number of prisoners available to man the machines, otherwise production will essentially halt. This means that whenever too many cells become abandoned, it is time to fill them, so the orc master sends a raven to request new prisoners. This again takes five days and, unless they flee along the way, works reliably. In the past, it still has been a major problem for the dungeon due to the long delay. If the number of filled cells drops below 50, the dungeon will no longer produce any gold, and not making money is a reason to replace the current dungeon master. If all the orc master does is react to the situation, it means that there will probably be about five days in which no gold will be mined. This is one of the major pain points in the current system, because projecting the number of filled cells five days out seems rather impossible, so all the orcs can do right now is react.

All in all, this gives us a rough idea of what the dungeon master is looking for and which tasks need to be accomplished to replace the current system. Of course, this does not have to happen in one go but can be done gradually, so everybody adjusts. Right now, it is time for us to identify where to start.

From greenfield to application

We are JavaScript developers, so it seems obvious for us to build a web application to implement this. As the problem is described, it is clear that starting out simply and growing the application as we further analyze the situation is the way to go. Right now, we don't really have a clear understanding of how some parts should be handled, since the business process has not evolved to this level yet. Also, it is possible that new features will arise or things will start being handled differently as our software begins to get used. The steps described leave room for optimization based on collected data, so we first need the data to see how predictions can work.
This means that we need to start by tracking as many events as possible in the dungeon. Running down the list, the first step is always to get a view of which state we are in; this means tracking the available cells and providing an interface for this. To start out, this can be done via a counter, but this can't be our final solution. We then need to grow toward tracking events and summing those up to be able to make predictions for the future.

The first route and model

Of course there are many other ways to get started, but what it boils down to in most cases is that it is now time to choose the base to build on. By this I mean deciding on a framework or set of libraries to build upon. This happens alongside the decision on what database is used to back our application and many other small decisions, which are influenced by the decisions around the framework and libraries. A clear understanding of how the frontend should be built is important as well, since building a single-page application, which implements a large amount of logic in the frontend and is backed by an API layer, differs a lot from an application which implements most logic on the server side.

Don't worry if you are unfamiliar with express or any other technology used in the following. You don't need to understand every single detail, but you will get the idea of how developing an application with a framework is achieved.

Since we don't have a clear understanding yet of which way the application will ultimately take, we try to push as many decisions as possible out, but decide on the stuff we immediately need. As we are developing in JavaScript, the application is going to be developed in Node.js, and express is going to be our framework of choice. To make our life easier, we first decide that we are going to implement the frontend in plain HTML using EJS embedded JavaScript templates, since this will keep the logic in one place. This seems sensible, since spreading the logic of a complex application across multiple layers will complicate things even further. Also, getting rid of the eventual errors during transport will ease our way toward a solid application in the beginning. We can push the decision about the database out and work with simple objects stored in RAM for our first prototype; this is, of course, no long-term solution, but we can at least validate some structure before we need to decide on another major piece of software, which brings along a lot of expectations as well. With all this in mind, we set up the application.

In the following section and throughout the book, we are using Node.js to build a small backend. At the time of writing, the currently active version was Node.js 0.10.33. Node.js can be obtained from http://nodejs.org/ and is available for Windows, Mac OS X, and Linux. The foundation for our web application is provided by express, available via the Node Package Manager (NPM), at the time of writing in version 3.0.3:

$ npm install -g express
$ express --ejs inmatr

For the sake of brevity, the glue code in the following is omitted, but like all other code presented in the book, it is available on the GitHub repository https://github.com/sideshowcoder/ddd-js-sample-code.
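To give you an idea of what that omitted glue amounts to, a minimal app.js for this setup might look roughly like the following. This is a sketch in express 3 style and assumes the middleware and routes files export their functions; the actual file in the repository differs in its details:

var express = require('express');
var app = express();

app.set('view engine', 'ejs');                  // render the EJS templates from ./views
app.use(express.bodyParser());                  // populate req.body from the form posts
app.use(require('./middleware/load_context'));  // load the dungeon into req.context
app.use(app.router);                            // dispatch to the routes registered below

// The route registration goes here, as covered in the following sections.

app.listen(3000);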
Creating the model

The most basic parts of the application are set up now. We can move on to creating our dungeon model in models/dungeon.js and add the following code to it to keep a model and its loading and saving logic:

var Dungeon = function(cells) {
  this.cells = cells;
  this.bookedCells = 0;
};

Keeping in mind that this will eventually be stored in a database, we also need to be able to find a dungeon in some way, so a find method seems reasonable. This method should already adhere to the Node.js callback style to make our lives easier when switching to a real database. Even though we pushed this decision out, the assumption is clear: even if we decide against a database, the dungeon reference will be stored and requested from outside the process in the future. The following shows an example with the find method:

var dungeons = {};

Dungeon.find = function(id, callback) {
  if (!dungeons[id]) {
    dungeons[id] = new Dungeon(100);
  }
  callback(null, dungeons[id]);
};

The first route and loading the dungeon

Now that we have this in place, we can move on to actually reacting to requests. In express, defining the needed routes does this. Since we need to make sure we have our current dungeon available, we also use middleware to load it when a request comes in. Using the methods we just created, we can add a middleware to the express stack to load the dungeon whenever a request comes in. A middleware is a piece of code which gets executed whenever a request reaches its level of the stack; for example, the router used to dispatch requests to defined functions is implemented as a middleware, as is logging and so on. This is a common pattern for many other kinds of interactions as well, such as user login. Assuming for now that we only manage one dungeon, our dungeon loading middleware is created by adding a file in middleware/load_context.js with the following code:

function(req, res, next) {
  req.context = req.context || {};
  Dungeon.find('main', function(err, dungeon) {
    req.context.dungeon = dungeon;
    next();
  });
}

Displaying the page

With this, we are now able to simply display information about the dungeon and track any changes made to it inside the request. Creating a view to render the state, as well as a form to modify it, are the essential parts of our GUI. Since we decided to implement the logic server-side, they are rather barebones. Creating a view under views/index.ejs allows us to render everything to the browser via express later. The following example is the HTML code for the frontend:

<h1>Inmatr</h1>
<p>You currently have <%= dungeon.free %> of <%= dungeon.cells %> cells available.</p>

<form action="/cells/book" method="post">
  <select name="cells">
    <% for(var i = 1; i < 11; i++) { %>
    <option value="<%= i %>"><%= i %></option>
    <% } %>
  </select>
  <button type="submit" name="book" value="book">Book cells</button>
  <button type="submit" name="free" value="free">Free cells</button>
</form>
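Note that the view reads dungeon.free, and the routes we are about to write will call dungeon.book() and dungeon.unbook(); these members are part of the omitted glue code. A minimal sketch of what they might look like in models/dungeon.js (illustrative only, the repository's version may differ):

Dungeon.prototype.book = function(cells) {
  this.bookedCells += cells;
};

Dungeon.prototype.unbook = function(cells) {
  this.bookedCells -= cells;
};

// The number of free cells is derived from the total and the booked ones.
Object.defineProperty(Dungeon.prototype, 'free', {
  get: function() { return this.cells - this.bookedCells; }
});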
Gluing the application together via express

Now we are almost done: we have a display for the state, a model to track what is changing, and a middleware to load this model as needed. To glue it all together, we will use express to register our routes and call the necessary functions. We mainly need two routes: one to display the page and one to accept and process the form input. Displaying the page is done when a user hits the index page, so we need to bind to the root path. Accepting the form input is already declared in the form itself as /cells/book, so we can just create a route for it. In express, we define routes in relation to the main app object and according to the HTTP verbs, as follows:

app.get('/', routes.index);
app.post('/cells/book', routes.cells.book);

Adding this to the main app.js file allows express to wire things up. The routes themselves are implemented as follows in the routes/index.js file:

var routes = {
  index: function(req, res) {
    res.render('index', req.context);
  },
  cells: {
    book: function(req, res) {
      var dungeon = req.context.dungeon;
      var cells = parseInt(req.body.cells);
      if (req.body.book) {
        dungeon.book(cells);
      } else {
        dungeon.unbook(cells);
      }
      res.redirect('/');
    }
  }
};

With this done, we have a working application to track free and used cells. The following shows the frontend output for the tracking system:

Moving the application forward

This is only the first step toward the application that will hopefully automate what is currently done by hand. With the first start in place, it is now time to make sure we can move the application along. We have to think about what this application is supposed to do and identify the next steps.

After presenting the current state back to the business, the next request is most likely to be to integrate some kind of login, since it should not be possible to modify the state of the dungeon unless you are authorized to do so. Since this is a web application, most people are familiar with it having a login. This moves us into a complicated space in which we need to start specifying the roles in the application along with their access patterns, so it is not clear if this is the way to go.

Another route to take is to start moving the application towards tracking events instead of the pure number of free cells. From a developer's point of view, this is probably the most interesting route, but the immediate business value might be hard to justify, since without the login it seems unusable. We need to create an endpoint to record events, such as a fleeing prisoner, and then modify the state of the dungeon according to those tracked events. This is based on the assumption that the highest value of the application will lie in the prediction of prisoner movement. When we want to track free cells in such a way, we will need to modify the way our first version of the application works. The logic on what events need to be created will have to move somewhere, most logically the frontend, and the dungeon will no longer be the single source of truth for the dungeon state. Rather, it will be an aggregator for the state, which is modified by the generation of events.

Thinking about the application in such a way makes some things clear. We are not completely sure what the value proposition of the application will ultimately be. This leads us down a dangerous path, since the design decisions that we make now will impact how we build new features inside the application. This is also a problem in case our assumption about the main value proposition turns out to be wrong. In that case, we may have built quite a complex event tracking system which does not really solve the problem but complicates things. Every state modification needs to be transformed into a series of events where a simple state update on an object may have been enough. Not only does this design not solve the real problem, explaining it to the orc master is also tough. There are certain abstractions missing, and the communication is not following a pattern established as the business language.
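To make that trade-off concrete, recording a fleeing prisoner in the event-based variant would mean something like the following instead of a single counter update (purely illustrative, the event names are made up):

var events = [];

function recordEvent(type, data) {
  events.push({ type: type, data: data, at: Date.now() });
}

recordEvent('prisonerArrived', { cell: 17 });
recordEvent('prisonerFled', { cell: 42 });

// The dungeon no longer holds the truth; the state is computed from all events.
function occupiedCells() {
  return events.reduce(function(count, event) {
    if (event.type === 'prisonerArrived') { return count + 1; }
    if (event.type === 'prisonerFled') { return count - 1; }
    return count;
  }, 0);
}

Every state change now has to exist as an event, which is exactly the overhead described above.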
We need an alternative approach to keep the business more involved. Also, we need to keep development simple by using abstraction on the business logic and not on the technologies provided by the frameworks that are used.

Summary

In this article, you were introduced to a typical business application and how it is developed. It showed how domain-driven design can help steer clear of common issues during development to create a more problem-tailored application.