Digging into the Architecture

Packt
30 Jul 2013
31 min read
The big picture

A very short description of a WaveMaker application could be: a Spring MVC server running in a Java container, such as Tomcat, serving file and JSON requests for a Dojo Toolkit-based JavaScript browser client. Unfortunately, such "elevator" descriptions can create more questions than they answer.

For starters, although we will often refer to it as "the server," the WaveMaker server might be more aptly called an application server in most architectures. Sure, it is possible to have a useful application without additional servers or services beyond the WaveMaker server, but this is not typical. We could have a rich user interface reading against some in-memory data set, for example. Far more commonly, the Java services running in the WaveMaker server are calling off to other servers or services, such as relational databases and RESTful web services. This means the WaveMaker server is often the middle or application tier server of a multi-tier application's architecture.

Yet at the same time, the WaveMaker server can be eliminated completely. Applications can be packaged for uploading to PhoneGap Build, http://build.phonegap.com/, directly from WaveMaker Studio. Both PhoneGap and the associated Apache project Cordova, http://cordova.apache.org, provide APIs to enable JavaScript to access native device functionality, such as capturing images with the camera and obtaining GPS location information. Packaged up and installed as a native application, the JavaScript files are loaded from the device's file system instead of being downloaded from a server via HTTP. This means there is no origin domain to be constrained by. If the application only uses web services, or otherwise doesn't need additional services, such as database access, the WaveMaker server is neither used nor needed.

Just because an application isn't installed on a mobile device from an app store doesn't mean we can't run it on a mobile device. Browsers on mobile devices are more capable than ever before. This means our client could be any device with a modern browser.

You must also consider licensing in light of the bigger picture. WaveMaker, WaveMaker Studio, and the applications created with the Studio are released under the Apache 2.0 license, http://www.apache.org/licenses/LICENSE-2.0. The WaveMaker project was first released by WaveMaker Software in 2007. In March 2011, VMware (http://vmware.com) acquired the WaveMaker project. It was under VMware that WaveMaker 6.5 was released. In April 2013, Pramati Technologies (http://pramati.com) acquired the assets of WaveMaker for its CloudJee (http://cloudjee.com) platform. WaveMaker continues to be developed and released by Pramati Technologies.

Now that we understand where our client and server sit in the larger world, we will be primarily focused within and between those two parts. The overall picture of the client and server is shown in the following diagram. We will examine each piece of this diagram in detail during the course of this book. We shall start with the JavaScript client.

Getting comfortable with the JavaScript client

The client is a JavaScript client that runs in a modern browser. This means that most of the client, specifically the HTML and DOM nodes that the browser interfaces with, is created by JavaScript at runtime. The application is styled using CSS, and we can use HTML in our applications. However, we don't use HTML to define buttons and forms.
Instead, we define components, such as widgets, and set their properties. These component class names and properties are used as arguments to functions that create DOM nodes for us.

Dojo Toolkit

To do this, WaveMaker uses the Dojo Toolkit, http://dojotoolkit.org/. Dojo, as it is generally referred to, is a modular, cross-browser JavaScript framework with three sections. Dojo Core provides the base toolkit; on top of it are Dojo's visual widgets, called Dijits; finally, DojoX contains additional extensions such as charts and a color picker. DojoCampus' Dojo Explorer, http://dojocampus.com/explorer/, has a good selection of single unit demos across the toolkit, many with source code. Dojo allows developers to define widgets using HTML or JavaScript. WaveMaker users will better recognize the JavaScript approach. Specifically, WaveMaker 6.5.X uses version 1.6.1 of Dojo. Of the browsers supported by Dojo 1.6.1, http://dojotoolkit.org/reference-guide/1.8/releasenotes/1.6.html, Opera's "Dojo Core only" support prevents it from being supported by WaveMaker. This could change with Opera's move to WebKit.

Building on top of the Dojo Toolkit, WaveMaker provides its own collection of widgets and underlying components. Although both can be called components, the name component is generally used for the non-visible parts, such as service calls to the server and the event notification system. Widgets, such as the Dijits, are visible components such as buttons and editors. Many, but not all, of the WaveMaker widgets extend functionality from Dojo widgets. When they do extend Dijits, WaveMaker widgets often add numerous functions and behaviors that are not part of Dojo. Examples include controlling the read-only state, formatting display values for currency, and merging components, such as buttons with icons in them. Combined with the WaveMaker runtime layers, these enhancements make it easy to assemble rich clients using only properties. WaveMaker's select editor (wm.SelectMenu), for example, extends the Dojo Toolkit ComboBox (dijit.form.ComboBox) or the FilteringSelect (dijit.form.FilteringSelect) as needed. By default, a select menu has the Dojo FilteringSelect as its editor, but it will use ComboBox instead if the user is on a mobile device or the developer has cleared the RestrictValues property tick box.

A required select menu editor

Let's consider the case of disabling a submit button when the user has not made a required list selection. In Dojo, this is done using JavaScript code, and for an experienced Dojo developer, this is not difficult. For those who may primarily consider Dojo a martial arts studio, however, it is likely another matter altogether. Using the widgets provided by the WaveMaker framework, no code is required to set up this inter-connection. It is simply a matter of visually linking, or binding, the button's disabled property to the list's emptySelection property in the graphical binding dialog. Now the button will be disabled if the user has not made a selection in the grid's list of items. Logically, we can think of this as setting the disabled property to the value of the grid's emptySelection property, where emptySelection is true unless and until a row has been selected.

Where WaveMaker most notably varies from the Dojo way of things is the layout engine. WaveMaker handles the layout of container widgets using its own engine. Containers are those widgets that contain other widgets, such as panels, tabs, and dialogs.
This makes it easier for developers to arrange widgets in WaveMaker Studio. A result of this is that border, padding, and margin are set using properties on widgets, not by CSS.

Dojo made easy

Having the Dojo framework available to us makes web development easier, both when using the WaveMaker framework and when doing custom work. Dojo's modular and object-oriented functions, such as dojo.declare and dojo.inherited, for example, simplify creating custom components. The key takeaway here is that Dojo itself is available to you as a developer if you wish to use it directly. Many developers never need to utilize this capability, but it is available to you if you ever do wish to take advantage of it. Running the CRM Simple sample again, from either the console in the browser development tools or custom project page code, we could use Dojo's byId() function to get a div, for example, the main title label: dojo.byId("main_labelTitle"). In practice, the WaveMaker style of getting a DOM node via the component name, for example, main.labelTitle.domNode, is more practical and returns the same result.

If a function or ability in Dojo is useful, the WaveMaker framework usually provides a wrapper of some sort for you. Just as often, the WaveMaker version is friendlier or otherwise easier to use in some way. For example, this.connect(), WaveMaker's version of dojo.connect(), tracks connections for you. This avoids the need for you to remember to call disconnect() to remove the reference added by every call to connect(). For more information about using Dojo functions in WaveMaker, see the Dojo framework page in the WaveMaker documentation at http://dev.wavemaker.com/wiki/bin/wmdoc_6.5/Dojo+Framework.

Binding and events

Two solid examples of WaveMaker taking a powerful feature of Dojo and providing friendlier versions are topic notifications and event handling. dojo.connect() enables you to register a method to be called when something happens; in other words: "when X happens, please also do Y". Studio provides visual tooling for this in the events section of a component's properties. Buttons have an event drop-down menu for their click event. Asynchronous server call components, live variables, and service variables have tooled events for reviewing data just before the call is made and for the successful, and not so successful, returns from the call. These menus are populated with listings of likely components and, if appropriate, functions. Invoking other service calls, particularly when a server call depends on data from the results of some previous server call, and navigation calls to other layers and pages within the application are easy examples of how WaveMaker's visual tooling of dojo.connect simplifies web development.

WaveMaker's binding dialog is a graphical interface on the topic subscription system. Here we are "binding" a live variable that returns rows from the lineitem table to be filtered by the data value of the orderid editor in the form on the new order page. The result of this binding is that when the value of the orderid editor changes, the value in the filter parameter of this live variable will be updated. An event indicating that the value of this orderid editor has changed is published when the data value changes. This live variable's filter is subscribed to that topic and can now update its value accordingly.
Loading the client

Web applications start from index.html, and a WaveMaker application is no different. If we examine index.html of a WaveMaker application, we see the total content is less than 100 lines. We have some meta tags in the head, mostly for Internet Explorer (MSIE) and iOS support. In the body, there are more entries to help out with older versions of MSIE, including script tags to use Chrome Frame if we so choose. If we cut all that away, index.html is rather simple. In the head, we load the CSS containing the project's theme and define a few lines of style classes for wavemakerNode and _wm_loading:

<script>var wmThemeUrl = "/wavemaker/lib/wm/base/widget/themes/wm_default/theme.css";</script>
<style type="text/css">
#wavemakerNode {
height: 100%;
overflow: hidden;
position: relative;
}
#_wm_loading {
text-align: center;
margin: 25% 0px 25% 0px;
}
</style>

Next we load the file config.js, which, as its name suggests, is about configuration. The following line of code is used to load the file:

<script type="text/javascript" src="config.js"></script>

Config.js defines the various settings, variables, and helper functions needed to initialize the application, such as the locale setting. Moving into the body tag of index.html, we find a div named wavemakerNode:

<div id="wavemakerNode">

The next div tag is the loader gif, which is given in the following code:

<div id="_wm_loading" style="z-index: 100;">
<table style='width:100%;height: 100%;'><tr><td align='center'><img alt="Loading" src="/wavemaker/lib/boot/images/loader.gif" />&nbsp;&nbsp;Loading...</td></tr></table>
</div>

This is the standard spinner shown while the application is loading. With the loader gif now spinning, we begin the real work with runtimeLoader.js, as given in the following line of code:

<script type="text/javascript" src="/wavemaker/lib/runtimeLoader.js"></script>

When running a project from Studio, the client runtime is loaded from Studio via the /wavemaker context. Config.js and index.html are modified for deployment, while the client runtime is copied into the application's webapproot. runtimeLoader, as its name suggests, loads the WaveMaker runtime. With the runtime loaded, we can now load the top level project.a.js file, which defines our application using the dojo.declare() method. The following line of code loads the file:

<script type="text/javascript" src="project.a.js"></script>

Finally, with our application class defined, we set up an instance of our application in wavemakerNode and run it.

There are two modes for loading a WaveMaker application: debug and gzip mode. The debug mode is useful for debugging, as you would expect. The gzip mode is the default mode. The test mode of the Run, Test, or Compile button in Studio re-deploys the active project and opens it in debug mode. This is the only difference between using Test and Run in Studio: the Test button adds ?debug to the URL of the browser window; the Run button does not. Any WaveMaker application can be loaded in debug mode by adding debug to the URL parameters. For example, to load the CRM Simple application from within WaveMaker in debug mode, use the URL http://crm_simple.localhost:8094/?debug. Detecting debug in the URL sets the djConfig.debugBoot flag, which alters the path used in runtimeLoader:

djConfig.debugBoot = location.search.indexOf("debug") >= 0;

Like a compiled program, debug mode preserves variable names and all the other details that optimization removes and which we would want available when debugging.
However, JavaScript is not compiled into byte code or machine specific instructions. In gzip mode, on the other hand, the browser loads a few optimized packages containing all the source code in merged files. This reduces the number of files needed to load our application, which significantly improves loading time. These optimized packages are also minified. Minification removes whitespace and replaces variable names with short names, further reducing the volume of code to be parsed by the browser, and therefore further improving performance. The result is a significant reduction in the number of requests needed and the number of bytes transferred to load an application. A stock application in gzip mode requires 22 to 24 requests to load some 300 KB to 400 KB of content, depending on the application. In debug mode, the same app transfers over 1.5 MB in more than 500 requests.

The index.html file, and when security is enabled, login.html, are yours to edit. If you are comfortable doing so, you can customize these files, such as adding additional script tags. In practice, you shouldn't need to customize index.html, as you have full control of the application loaded into the wavemakerNode. Also, upgrade scripts in future versions of WaveMaker may need to programmatically update index.html and login.html. Changes to the X-UA-Compatible meta tag are often required when support for newer versions of Internet Explorer becomes available, for example. These scripts can't possibly know about every customization you may make. Customization of index.html may cause these scripts to fail, and may require you to manually update these files. If you do encounter such a situation, simply use the index.html file from a project newly created in the new version as a template.

Springing into the server side

The WaveMaker server is a Java application running in a Java Virtual Machine (JVM). Like the client, it builds upon proven frameworks and libraries. In the case of the server, the foundational block is the SpringSource framework, http://www.springsource.org/, also known simply as the Spring framework. The Spring framework is the most popular enterprise Java development framework today, and for good reason. The server of a WaveMaker application is a Spring application that includes the WaveMaker common, json, and runtime modules. More specifically, the WaveMaker server uses the Spring Web MVC framework to create a DispatcherServlet that delegates client requests to their handlers. WaveMaker uses only a handful of controllers, as we will see in the next section. The effective result is that it is the request URL that is used to direct a service call to the correct service. The method value of the request is the name of the client-exposed function within the service to be called. In the case of overloaded functions, the signature of the params value is used to find the matching method by signature. We will look at example requests and responses shortly. Behind this controller is not only the power of the Spring framework, but also a number of leading frameworks such as Hibernate and JAX-WS, and libraries such as log4j and Apache Commons. Here too, these libraries are available to you, both directly in any custom work you might do and indirectly as tooled features of Studio. As we are working with a Spring server, we will be seeing Spring beans often as we examine the server-side configuration. One need not be familiar with Spring to reap its benefits when using custom Java in WaveMaker.
Spring makes it easy to get access to other beans from our Java code. For example, if our project has imported a database as MyDB, we could get access to the service and any exposed functions in that service using getServiceBean(). The following code illustrates the use of getServiceBean():

MyDB myDbSvc = (MyDB) RuntimeAccess.getInstance().getServiceBean("mydb");

We start by getting an instance of the WaveMaker runtime. From the returned runtime instance, we can use the getServiceBean() method to get a service bean for our mydb database service. There are other ways we could have got access to the service from our Java code; this one is pretty straightforward.

Starting from web.xml

Just as the client side starts with index.html, a Java servlet starts in WEB-INF with web.xml. A WaveMaker application's web.xml is a rather straightforward Spring MVC web.xml. You'll notice many servlet-mappings, a few listeners, and filters. Unlike index.html, web.xml is managed directly by Studio. If you need to add elements to the web-app context, add them to user-web.xml. The content of user-web.xml is merged into web.xml when generating the deployment package.

The most interesting entry is probably the contextConfigLocation of /WEB-INF/project-springapp.xml. Project-springapp.xml is a Spring beans file. Immediately after the schema declaration is a series of resource imports. These imports include the services and entities that we create in Studio as we import databases and otherwise add services to our project. If you open project-spring.xml in WEB-INF, near the top of the file you'll see a comment noting how project-spring.xml is yours to edit. For experienced Spring users, here is the entry point to add any additional imports you may need. An example of such can be found at http://dev.wavemaker.com/wiki/bin/Spring. In that example, an additional XML file, ServerFileProcessor.xml, is used to enable component scanning on a package and set some properties on those components. Project-spring.xml is then used to import ServerFileProcessor.xml into the application context. Many users of WaveMaker still think of Spring as the season between Winter and Summer; such users do not need to think about these XML files. However, for those who are experienced with Java, the full power of the Spring framework is accessible to them.

Also in project-springapp.xml is a list of URL mappings. These mappings specify request URLs that require handling by the file controller. Gzipped resources, for example, require the header Content-Encoding to be set to gzip. This informs the browser that the content is gzip encoded and must be uncompressed before being parsed.

There are a few names that use ag in the server. WaveMaker Software, the company, was formerly known as ActiveGrid, and had a previous web development tool by the same name. The use of ag and com.activegrid stems back to the project's roots, first put down when the company was still known as ActiveGrid.

Closing out web.xml is the Acegi filter mapping. Acegi is the security module used in WaveMaker 6.5. Even when security is not enabled in an application, the Acegi filter mapping is included in web.xml. When security is not enabled in the project, an empty project-security.xml is used.

Client and server communication

Now that we've examined the client and server, we need to better understand the communication between the two. WaveMaker almost exclusively uses the HTTP methods GET and POST.
In HTTP, GET is used, as you might suspect even without ever having heard of RFC 2616 (https://tools.ietf.org/html/rfc2616), to request, or get, a specific resource. Unless installed as a native application on a mobile device, a WaveMaker web application is loaded via the GET method. From index.html and runtimeLoader.js to the user-defined pages and any images used on those pages, the applications themselves are loaded into the browser using GET.

All service calls, database reads and writes, or otherwise any invocations of Java service functions, on the other hand, use POST. The URL of these POST requests is always the service name with a .json extension. For example, calls to a Java service named userPrefSvc would always be to the URL /userPrefSvc.json. Inside the POST method's request payload will be any required parameters, including the method of the service to be invoked. The response will be the response returned from that call. Per-method URLs are not possible because we cannot, nor do we want to, know all possible WaveMaker server calls at "design time", while the project files are open for writing in the Studio. This pattern avoids any URL length constraints, enabling lengthy datasets to be transferred while freeing up the URL to pass parameters such as page state.

Let's take a look at an example. If you want to follow along in your browser's console, this is the third request of three when we select "Fog City Books" in the CRM Simple application when running the application with the console open. The following URL is the request URL:

http://crm_simple.localhost:8094/services/runtimeService.json

The following is the request payload:

{"params":["custpurchaseDB","com.custpurchasedb.data.Lineitem",null,{"properties":["id","item"],"filters":["id.orderid=9"],"matchMode":"start","ignoreCase":false},{"maxResults":500,"firstResult":0}],"method":"read","id":251422}

The response is as follows:

{"dataSetSize":2,"result":[{"id":{"itemid":2,"orderid":9},"item":{"itemid":2,"itemname":"Kidnapped","price":12.99},"quantity":2},{"id":{"itemid":10,"orderid":9},"item":{"itemid":10,"itemname":"Gravitys Rainbow","price":11.99},"quantity":1}]}

As we expect, the request URL is to a service (in this case named runtimeService), with the .json extension. The runtime service is the built-in WaveMaker service for reading and writing with the Hibernate (http://www.hibernate.org) data models generated by importing a database. The security service and the WaveMaker service are the other built-in services used at runtime. The security service is used for security functions such as getUserName() and logout(); note this does not include login, which is handled by Acegi. The WaveMaker service has functions such as getServerTimeOffset(), used to adjust for time zones, and remoteRESTCall(), used to proxy some web service calls.

How the runtime service functions is easy to understand by observation. Inside the request payload we have, as the URL suggested, a JavaScript Object Notation (JSON) structure. JSON (http://www.json.org/) is a lightweight data-interchange format regularly used in AJAX applications. Dissecting our example request from the top, the structure enclosed in the outermost {}'s looks like the following:

{"params":[…],"method":"read","id":251422}

We have three top-level name-value pairs in our request object: params, method, and id.
The id is 251422, the method is read, and the params value is an array, as indicated by the [] brackets:

["custpurchaseDB","com.custpurchasedb.data.Lineitem",null,{},{}]

In our case, we have an array of five values. The first is the database service name, custpurchaseDB. Next we have what appears to be the package and class name we will be reading from, not unlike the from clause in a SQL query. After which, we have a null and two objects. JSON is friendly to human reading, and we could continue to unwrap the two objects in this request in a similar fashion; we will return to them when we discuss database services. For now, let's check out the response. At the top level, we have dataSetSize, the number of results, and the array of the results:

{"dataSetSize":2,"result":[]}

Inside our result array we have two objects:

[{"id":{"itemid":2,"orderid":9},"item":{"itemid":2,"itemname":"Kidnapped","price":12.99},"quantity":2},{"id":{"itemid":10,"orderid":9},"item":{"itemid":10,"itemname":"Gravitys Rainbow","price":11.99},"quantity":1}]

Our first item has the compound key of itemid 2 with orderid 9. This is the item Kidnapped, which is a book costing $12.99. The other object in our result array also has the orderid 9, as we expect when reading line items from the selected order. This one is also a book, the item Gravity's Rainbow.

Types

To be more precise about the com.custpurchasedb.data.Lineitem parameter in our read request, it is actually the type name of the read request. WaveMaker projects define types, from primitive types such as Boolean to custom complex types such as Lineitem. In our runtime read example, com.custpurchasedb.data.Lineitem is both the package and class name of the imported Hibernate entity and the type name for the line item entity in the project. Maintaining type information enables WaveMaker to ease a number of development issues. As the client knows the structure of the data it is getting from the server, it knows how to display that data with minimal developer configuration, if any. At design time, Studio uses type information in many areas to help us correctly configure our application. For example, when we set up a grid, type information enables Studio to present us with a list of possible column choices for the grid's dataset type. Likewise, when we add a form to the canvas for a database insert, it is type information that Studio uses to fill the form with appropriate editors.

Lineitem is a project-wide type as it is defined in the server side. In the process of compiling the project's Java service sources, WaveMaker defines system types for any type returned to the client in a client-facing function. To be added to the type system, a class must:

- Be public
- Define public getters and setters
- Be returned by a client-exposed function
- Have a service class that extends JavaServiceSuperClass or uses the @ExposeToClient annotation

WaveMaker 6.5.1 has a bug that prevents types from being generated as expected. Be certain to use 6.5.2 or newer versions to avoid this defect.

It is possible to create new project types by adding a Java service class to the project that only defines types. Following is an example that creates a new simple type called Record in the project. Our definition of Record consists of an integer ID and a string. Note that there are two classes here. MyCustomTypes is the service class containing a method returning the type Record. As we will not be calling it, the function getNewRecord() need not do anything other than return a Record. Creating a new default instance is an easy way to do this.
The class Record is defined as an inner class. An inner class is a class defined within another class. In our case, Record is defined within MyCustomTypes:

// Java service class MyCustomTypes
package com.myco.types;

import com.wavemaker.runtime.javaservice.JavaServiceSuperClass;
import com.wavemaker.runtime.service.annotations.ExposeToClient;

public class MyCustomTypes extends JavaServiceSuperClass {
    public Record getNewRecord() {
        return new Record();
    }

    // Inner class Record
    public class Record {
        private int id;
        private String name;

        public int getId() {
            return id;
        }

        public void setId(int id) {
            this.id = id;
        }

        public String getName() {
            return this.name;
        }

        public void setName(String name) {
            this.name = name;
        }
    }
}

To add the preceding code to our WaveMaker project, we would add a Java service to the project using the class name MyCustomTypes in the Package and Class Name editor of the New Java Service dialog. The preceding code extends JavaServiceSuperClass and uses the package com.myco.types.

A project can also have client-only types using the type definition option from the advanced section of the Studio insert menu. Type definitions are useful when we want to be able to pass structured data around within the client but we will not be sending or receiving that type to the server. For example, we may want to have an application-scoped wm.Variable storing a collection of current record selection information. This would enable us to keep track of a number of state items across all pages. Communication with the server is likely to be using only a few of those types at a time, so no such structure exists in the server side. Using wm.Variable enables us to bind each Record ID without using code.

The insert type definition menu brings up the Type Definition Generator dialog. The generator takes JSON input and is pre-populated with a sample type. The sample type defines a person object, albeit an unusual one, with a name, an array of numbers for age, a Boolean (hasFoot), and a related person object, friend. Replace the sample type with your own JSON structure. Be certain to change the type name to something meaningful. After generating the type, you'll immediately see the newly minted type in type selectors, such as the type field of wm.Variable. Studio is pretty good at recognizing type changes. If for some reason Studio does not recognize a type change, the easiest thing to do is to get Studio to re-read the owning object. If a wm.Variable fails to show a newly added field of a type in its properties, change the type of the variable from the modified type to some other type and then back again.

Studio is also an application

One of the more complex WaveMaker applications is Studio. That's right, Studio is itself an application built out of WaveMaker widgets, using the runtime and server. Being the large, complex application we use to build applications, it can sometimes be difficult to understand where the runtime ends and Studio begins. With that said, Studio remains a treasure trove of examples and ideas to explore. Let's open a finder, explorer, shell, or however you prefer to view the file system of a WaveMaker Studio installation, and look in the studio folder. If you've installed WaveMaker to c:\program files\WaveMaker\6.5.3.Release, the default on Windows, we're looking at c:\program files\WaveMaker\6.5.3.Release\studio. This is the webapproot of the Studio project.

For files, we've discussed index.html in loading the client. The type definition for the project types is types.js.
The types.js definition is how the client learns of the server's Java types. Moving on to the directories alphabetically, we start with the app folder. The app folder can be considered a large utility folder these days. The branding folder, http://dev.wavemaker.com/wiki/bin/wmdoc_6.5/Branding, is a sample of the branding feature for when you want to easily re-brand applications for different customers. The build folder contains the optimized build files we discussed when loading our application in gzip mode; this build folder is for the Studio itself. The images folder is, as we would hope, where images are kept. The content of the doc in jsdoc is pretty old; use jsref at the online wiki, http://dev.wavemaker.com/wiki/bin/wmjsref_6.5/WebHome, for a client API reference instead. The language folder contains the National Language Support (NLS) files to localize Studio into other languages. In 6.5.X, there is a Japanese (ja) and a Spanish (es) directory in addition to the English (en) default, thanks to the efforts of the WaveMaker community and a corporate partner. For more on internationalizing applications with WaveMaker, navigate to http://dev.wavemaker.com/wiki/bin/wmdoc_6.5/Localization#HLocalizingtheClientLanguage.

The lib folder is very interesting, so let's wrap up this top level before we dig into that one. The META-INF folder contains artifacts from the WaveMaker Maven build process that probably should be removed for 6.5.2. The pages folder contains the page definitions for Studio's pages. These pages can be opened in Studio. They also can be a treasure trove of tips and tricks if you see something when using Studio that you don't know how to do in your application. Be careful however, as some pages are old and use outdated classes or techniques. Other constructs are only used by Studio and aren't tooled. This means some pages use components that can only be created by code. The other major difference from a project's pages folder is that Studio page folders do not contain the same number of files; they do not have the optimized pageName.a.js file, for example. The services folder contains the Service Method Definition (SMD) files for Studio's services. These are summaries of a project's exposed services, one file per service, used at runtime by the client. Each callable function, its input parameters, and its return type are defined. Finally, WEB-INF we have discussed already when we examined web.xml. In Studio's case, replace project with studio in the file names. Also under WEB-INF, we have classes and lib. The classes folder contains Java class files and additional XML files; these files are on the classpath. WEB-INF\lib contains JAR files. Studio requires significantly more JAR files, which are automatically added to projects created by Studio.

Now let's get back to the lib folder. Astute readers of our walk through of index.html likely noticed the references to /wavemaker/lib in src tags for things such as runtimeLoader. You might have also noticed that this folder was not present in the project and wondered how these tags could not fail. As a quick look at the URL of Studio running in a browser will demonstrate, /wavemaker is the Studio's context. The JavaScript runtime is only copied in as part of generating the deployment package; the lib folder is loaded directly from Studio's context when you test run an application from Studio using the Run or Test button. RuntimeLoader.js we encountered following index.html, as it is the start of the loading of client modules.
Manifest.js is an entry point into the loading process. The boot folder contains pre-initialization, such as the spinning loader image. Next we have another build folder. This one is used by applications and contains all possible build files. Not every JavaScript module is packaged up into an optimized build file. Some modules are so specific or rarely used that they are best loaded individually. Otherwise, if there's a build package available to applications, they use them. Dojo lives in the dojo folder; I hope you don't find it surprising to find a dijit, dojo, and dojox folder in there. The folder github provides the library path github for JS Beautifier, http://jsbeautifier.org/. The images in the project images folder include a copy of Silk Icons, http://www.famfamfam.com/lab/icons/silk/, a great Creative Commons licensed PNG icon set.

This brings us to wm. We definitely saved the most interesting folder for our last stop on this tour. In lib/wm, we have manifest.js, the top level of module loading when using debug mode in the runtime loader. wm/lib/base is the top level of the WaveMaker module space used at runtime. This means in wm/lib/base we have the WaveMaker components and widgets folders. These two folders contain the most commonly used sets of classes by WaveMaker developers using any custom JavaScript in a project. This also means we will be back in these folders again too.

Summary

In this article, we reviewed the WaveMaker architecture. We started with some context of what we mean by "client" and "server" in the context of this book. We then proceeded to dig into the client and the server. We reviewed how both build upon leading frameworks, the Dojo Toolkit and the SpringSource Framework in particular. We examined the running of an application from the network point of view and how the client and server communicated throughout. We dissected a JSON request to the runtime service and encountered project types. We also learned about both project and client type definitions. We ended by revisiting the file system. This time, however, we walked through a Studio installation; Studio is also a WaveMaker application. In the next article, we'll get comfortable with the Studio as a visual tool. We'll look at everything from the properties panels to the built-in source code editors.

Man, Do I Like Templates!

Packt
07 Jul 2015
22 min read
In this article by Italo Maia, author of the book Building Web Applications with Flask, we will discuss what Jinja2 is, and how Flask uses Jinja2 to implement the View layer and awe you. Be prepared!

What is Jinja2 and how is it coupled with Flask?

Jinja2 is a library found at http://jinja.pocoo.org/; you can use it to produce formatted text with bundled logic. Unlike the Python format function, which only allows you to replace markup with variable content, you can have a control structure, such as a for loop, inside a template string and use Jinja2 to parse it. Let's consider this example:

from jinja2 import Template

x = """
<p>Uncle Scrooge nephews</p>
<ul>
{% for i in my_list %}
<li>{{ i }}</li>
{% endfor %}
</ul>
"""
template = Template(x)
# output is a unicode string
print template.render(my_list=['Huey', 'Dewey', 'Louie'])

In the preceding code, we have a very simple example where we create a template string with a for loop control structure ("for tag", for short) that iterates over a list variable called my_list and prints the element inside an "li HTML tag" using the curly braces {{ }} notation. Notice that you could call render on the template instance as many times as needed with different key-value arguments, also called the template context. A context variable may have any valid Python variable name—that is, anything in the format given by the regular expression [a-zA-Z_][a-zA-Z0-9_]*.

For a full overview of regular expressions (Regex for short) with Python, visit https://docs.python.org/2/library/re.html. Also, take a look at this nice online tool for Regex testing: http://pythex.org/.

A more elaborate example would make use of an environment class instance, which is a central, configurable, extensible class that may be used to load templates in a more organized way. Do you follow where we are going here? This is the basic principle behind Jinja2 and Flask: it prepares an environment for you, with a few responsive defaults, and gets your wheels in motion.

What can you do with Jinja2?

Jinja2 is pretty slick. You can use it with template files or strings; you can use it to create formatted text, such as HTML, XML, Markdown, and e-mail content; you can put together templates, reuse templates, and extend templates; you can even use extensions with it. The possibilities are countless, and they are combined with nice debugging features, auto-escaping, and full unicode support.

Auto-escaping is a Jinja2 configuration where everything you print in a template is interpreted as plain text, if not explicitly requested otherwise. Imagine a variable x has its value set to <b>b</b>. If auto-escaping is enabled, {{ x }} in a template would print the string as given. If auto-escaping is off, which is the Jinja2 default (Flask's default is on), the resulting text would be b, rendered in bold by the browser. A short sketch of this behaviour appears just below.

Let's understand a few concepts before covering how Jinja2 allows us to do our coding. First, we have the previously mentioned curly braces.
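Here is that auto-escaping sketch, as a quick aside before we look at the delimiters in detail. This is a minimal example using plain Jinja2 (nothing Flask-specific; Flask simply flips the autoescape default to on for its HTML templates):

```python
from jinja2 import Environment

# With auto-escaping enabled (Flask's default), markup inside a variable is
# escaped, so the browser would show the literal text <b>b</b>.
env_on = Environment(autoescape=True)
print(env_on.from_string("{{ x }}").render(x="<b>b</b>"))
# -> &lt;b&gt;b&lt;/b&gt;

# With auto-escaping off (plain Jinja2's default), the markup is emitted
# as-is, so the browser would render a bold "b".
env_off = Environment(autoescape=False)
print(env_off.from_string("{{ x }}").render(x="<b>b</b>"))
# -> <b>b</b>
```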
Double curly braces are a delimiter that allows you to evaluate a variable or function from the provided context and print it into the template:

from jinja2 import Template

# create the template
t = Template("{{ variable }}")

# – Built-in Types –
t.render(variable='hello you')
>> u"hello you"
t.render(variable=100)
>> u"100"

# you can evaluate custom classes instances
class A(object):
    def __str__(self):
        return "__str__"
    def __unicode__(self):
        return u"__unicode__"
    def __repr__(self):
        return u"__repr__"

# – Custom Objects Evaluation –
# __unicode__ has the highest precedence in evaluation
# followed by __str__ and __repr__
t.render(variable=A())
>> u"__unicode__"

In the preceding example, we see how to use curly braces to evaluate variables in your template. First, we evaluate a string and then an integer. Both result in a unicode string. If we evaluate a class of our own, we must make sure there is a __unicode__ method defined, as it is called during the evaluation. If a __unicode__ method is not defined, the evaluation falls back to __str__ and __repr__, sequentially. This is easy. Furthermore, what if we want to evaluate a function? Well, just call it:

from jinja2 import Template

# create the template
t = Template("{{ fnc() }}")
t.render(fnc=lambda: 10)
>> u"10"

# evaluating a function with argument
t = Template("{{ fnc(x) }}")
t.render(fnc=lambda v: v, x='20')
>> u"20"

t = Template("{{ fnc(v=30) }}")
t.render(fnc=lambda v: v)
>> u"30"

To output the result of a function in a template, just call the function as any regular Python function. The function return value will be evaluated normally. If you're familiar with Django, you might notice a slight difference here. In Django, you do not need the parentheses to call a function, or even pass arguments to it. In Flask, the parentheses are always needed if you want the function return value evaluated. The following two examples show the difference between the Jinja2 and Django function call in a template:

{# flask syntax #}
{{ some_function() }}

{# django syntax #}
{{ some_function }}

You can also evaluate Python math operations. Take a look:

from jinja2 import Template

# no context provided / needed
Template("{{ 3 + 3 }}").render()
>> u"6"
Template("{{ 3 - 3 }}").render()
>> u"0"
Template("{{ 3 * 3 }}").render()
>> u"9"
Template("{{ 3 / 3 }}").render()
>> u"1"

Other math operators will also work. You may use the curly braces delimiter to access and evaluate lists and dictionaries:

from jinja2 import Template

Template("{{ my_list[0] }}").render(my_list=[1, 2, 3])
>> u'1'
Template("{{ my_list['foo'] }}").render(my_list={'foo': 'bar'})
>> u'bar'

# and here's some magic
Template("{{ my_list.foo }}").render(my_list={'foo': 'bar'})
>> u'bar'

To access a list or dictionary value, just use normal plain Python notation. With dictionaries, you can also access a key value using variable access notation, which is pretty neat.

Besides the curly braces delimiter, Jinja2 also has the curly braces/percentage delimiter, which uses the notation {% stmt %} and is used to execute statements, which may be a control statement or not. Its usage depends on the statement, where control statements have the following notation:

{% stmt %}
{% endstmt %}

The first tag has the statement name, while the second is the closing tag, which has the name of the statement prefixed with end. You must be aware that a non-control statement may not have a closing tag.
Let's look at some examples:

{% block content %}
{% for i in items %}
{{ i }} - {{ i.price }}
{% endfor %}
{% endblock %}

The preceding example is a little more complex than what we have been seeing. It uses a for loop control statement inside a block statement (you can have a statement inside another); the block statement is not a control statement, as it does not control execution flow in the template. Inside the for loop you see that the i variable is being printed together with the associated price (defined elsewhere).

A last delimiter you should know is {# comments go here #}. It is a multi-line delimiter used to declare comments. Let's see two examples that have the same result:

{# first example #}
{# second example #}

Both comment delimiters hide the content between {# and #}. As can be seen, this delimiter works for one-line comments and multi-line comments, which makes it very convenient.

Control structures

We have a nice set of built-in control structures defined by default in Jinja2. Let's begin our studies on it with the if statement.

{% if true %}Too easy{% endif %}
{% if true == true == True %}True and true are the same{% endif %}
{% if false == false == False %}False and false also are the same{% endif %}
{% if none == none == None %}There's also a lowercase None{% endif %}
{% if 1 >= 1 %}Compare objects like in plain python{% endif %}
{% if 1 == 2 %}This won't be printed{% else %}This will{% endif %}
{% if "apples" != "oranges" %}All comparison operators work = ]{% endif %}
{% if something %}elif is also supported{% elif something_else %}^_^{% endif %}

The if control statement is beautiful! It behaves just like a Python if statement. As seen in the preceding code, you can use it to compare objects in a very easy fashion. "else" and "elif" are also fully supported. You may also have noticed that true and false, non-capitalized, were used together with plain Python Booleans, True and False. As a design decision to avoid confusion, all Jinja2 templates have a lowercase alias for True, False, and None. By the way, lowercase syntax is the preferred way to go.

If needed, and you should avoid this scenario, you may group comparisons together in order to change precedence evaluation. See the following example:

{% if 5 < 10 < 15 %}true{% else %}false{% endif %}
{% if (5 < 10) < 15 %}true{% else %}false{% endif %}
{% if 5 < (10 < 15) %}true{% else %}false{% endif %}

The expected output for the preceding example is true, true, and false. The first two lines are pretty straightforward. In the third line, first, (10 < 15) is evaluated to True, which is a subclass of int, where True == 1. Then 5 < True is evaluated, which is certainly false.

The for statement is pretty important. One can hardly think of a serious web application that does not have to show a list of some kind at some point. The for statement can iterate over any iterable instance and has a very simple, Python-like syntax:

{% for item in my_list %}
{{ item }}{# print evaluate item #}
{% endfor %}

{# or #}

{% for key, value in my_dictionary.items() %}
{{ key }}: {{ value }}
{% endfor %}

In the first statement, we have the opening tag indicating that we will iterate over my_list items and each item will be referenced by the name item. The name item will be available inside the for loop context only. In the second statement, we have an iteration over the key-value tuples that form my_dictionary, which should be a dictionary (if the variable name wasn't suggestive enough). Pretty simple, right? The for loop also has a few tricks in store for you.
When building HTML lists, it's a common requirement to mark each list item in alternating colors in order to improve readability, or to mark the first and/or last item with some special markup. Those behaviors can be achieved in a Jinja2 for loop through access to a loop variable available inside the block context. Let's see some examples:

{% for i in ['a', 'b', 'c', 'd'] %}
{% if loop.first %}This is the first iteration{% endif %}
{% if loop.last %}This is the last iteration{% endif %}
{{ loop.cycle('red', 'blue') }}{# print red or blue alternating #}
{{ loop.index }} - {{ loop.index0 }} {# 1 indexed index – 0 indexed index #}
{# reverse 1 indexed index – reverse 0 indexed index #}
{{ loop.revindex }} - {{ loop.revindex0 }}
{% endfor %}

The for loop statement, as in Python, also allows the use of else, but with a slightly different meaning. In Python, when you use else with for, the else block is only executed if it was not reached through a break command, like this:

for i in [1, 2, 3]:
    pass
else:
    print "this will be printed"

for i in [1, 2, 3]:
    if i == 3:
        break
else:
    print "this will never be printed"

As seen in the preceding code snippet, the else block will only be executed in a for loop if the execution was never broken by a break command. With Jinja2, the else block is executed when the for iterable is empty. For example:

{% for i in [] %}
{{ i }}
{% else %}I'll be printed{% endfor %}

{% for i in ['a'] %}
{{ i }}
{% else %}I won't{% endfor %}

As we are talking about loops and breaks, there are two important things to know: the Jinja2 for loop does not support break or continue; instead, to achieve the expected behavior, you should use loop filtering, as follows:

{% for i in [1, 2, 3, 4, 5] if i > 2 %}
value: {{ i }}; loop.index: {{ loop.index }}
{%- endfor %}

In the first tag you see a normal for loop together with an if condition. You should consider that condition as a real list filter, as the index itself is only counted per iteration. Run the preceding example and the output will be the following:

value: 3; loop.index: 1
value: 4; loop.index: 2
value: 5; loop.index: 3

Look at the last observation in the preceding example—in the second tag, do you see the dash in {%-? It tells the renderer that there should be no empty new lines before the tag at each iteration. Try our previous example without the dash and compare the results to see what changes.

We'll now look at three very important statements used to build templates from different files: block, extends, and include.

block and extends always work together. The first is used to define "overwritable" blocks in a template, while the second defines a parent template that has blocks, for the current template. Let's see an example:

# coding:utf-8
with open('parent.txt', 'w') as file:
    file.write("""
{% block template %}parent.txt{% endblock %}
===========
I am a powerful psychic and will tell you your past

{#- "past" is the block identifier #}
{% block past %}
You had pimples by the age of 12.
{%- endblock %}

Tremble before my power!!!""".strip())

with open('child.txt', 'w') as file:
    file.write("""
{% extends "parent.txt" %}

{# overwriting the block called template from parent.txt #}
{% block template %}child.txt{% endblock %}

{#- overwriting the block called past from parent.txt #}
{% block past %}
You've bought an ebook recently.
{%- endblock %}""".strip())

with open('other.txt', 'w') as file:
    file.write("""
{% extends "child.txt" %}
{% block template %}other.txt{% endblock %}""".strip())

from jinja2 import Environment, FileSystemLoader

env = Environment()
# tell the environment how to load templates
env.loader = FileSystemLoader('.')
# look up our template
tmpl = env.get_template('parent.txt')
# render it to default output
print tmpl.render()
print ""
# loads child.txt and its parent
tmpl = env.get_template('child.txt')
print tmpl.render()
# loads other.txt and its parent
env.get_template('other.txt').render()

Do you see the inheritance happening between child.txt and parent.txt? parent.txt is a simple template with two block statements, called template and past. When you render parent.txt directly, its blocks are printed "as is", because they were not overwritten. In child.txt, we extend the parent.txt template and overwrite all its blocks. By doing that, we can have different information in specific parts of a template without having to rewrite the whole thing. With other.txt, for example, we extend the child.txt template and overwrite only the block named template. You can overwrite blocks from a direct parent template or from any of its parents. If you were defining an index.txt page, you could have default blocks in it that would be overwritten when needed, saving lots of typing.

Explaining the last example, Python-wise, is pretty simple. First, we create a Jinja2 environment (we talked about this earlier) and tell it how to load our templates, then we load the desired template directly. We do not have to bother telling the environment how to find parent templates, nor do we need to preload them.

The include statement is probably the easiest statement so far. It allows you to render a template inside another in a very easy fashion. Let's look at an example:

with open('base.txt', 'w') as file:
    file.write("""
{{ myvar }}
You wanna hear a dirty joke?
{% include 'joke.txt' %}
""".strip())

with open('joke.txt', 'w') as file:
    file.write("""
A boy fell in a mud puddle. {{ myvar }}
""".strip())

from jinja2 import Environment, FileSystemLoader

env = Environment()
# tell the environment how to load templates
env.loader = FileSystemLoader('.')
print env.get_template('base.txt').render(myvar='Ha ha!')

In the preceding example, we render the joke.txt template inside base.txt. As joke.txt is rendered inside base.txt, it also has full access to the base.txt context, so myvar is printed normally.

Finally, we have the set statement. It allows you to define variables inside the template context. Its use is pretty simple:

{% set x = 10 %}
{{ x }}
{% set x, y, z = 10, 5+5, "home" %}
{{ x }} - {{ y }} - {{ z }}

In the preceding example, if x was given by a complex calculation or a database query, it would make much more sense to have it cached in a variable, if it is to be reused across the template. As seen in the example, you can also assign a value to multiple variables at once.

Macros

Macros are the closest to coding you'll get inside Jinja2 templates. The macro definition and usage are similar to plain Python functions, so it is pretty easy.
Let's try an example:

with open('formfield.html', 'w') as file:
    file.write('''
{% macro input(name, value='', label='') %}
{% if label %}
<label for='{{ name }}'>{{ label }}</label>
{% endif %}
<input id='{{ name }}' name='{{ name }}' value='{{ value }}'></input>
{% endmacro %}'''.strip())

with open('index.html', 'w') as file:
    file.write('''
{% from 'formfield.html' import input %}
<form method='get' action='.'>
{{ input('name', label='Name:') }}
<input type='submit' value='Send'></input>
</form>
'''.strip())

from jinja2 import Environment, FileSystemLoader

env = Environment()
env.loader = FileSystemLoader('.')
print env.get_template('index.html').render()

In the preceding example, we create a macro that accepts a name argument and two optional arguments: value and label. Inside the macro block, we define what should be output. Notice we can use other statements inside a macro, just like in a template. In index.html we import the input macro from inside formfield.html, as if formfield were a module and input were a Python function, using the import statement. If needed, we could even rename our input macro like this:

{% from 'formfield.html' import input as field_input %}

You can also import formfield as a module and use it as follows:

{% import 'formfield.html' as formfield %}

When using macros, there is a special case where you want to allow any named argument to be passed into the macro, as you would in a Python function (for example, **kwargs). With Jinja2 macros, these values are, by default, available in a kwargs dictionary that does not need to be explicitly defined in the macro signature. For example:

# coding:utf-8
with open('formfield.html', 'w') as file:
    file.write('''
{% macro input(name) -%}
<input id='{{ name }}' name='{{ name }}' {% for k,v in kwargs.items() -%}{{ k }}='{{ v }}' {% endfor %}></input>
{%- endmacro %}
'''.strip())

with open('index.html', 'w') as file:
    file.write('''
{% from 'formfield.html' import input %}
{# use method='post' whenever sending sensitive data over HTTP #}
<form method='post' action='.'>
{{ input('name', type='text') }}
{{ input('passwd', type='password') }}
<input type='submit' value='Send'></input>
</form>
'''.strip())

from jinja2 import Environment, FileSystemLoader

env = Environment()
env.loader = FileSystemLoader('.')
print env.get_template('index.html').render()

As you can see, kwargs is available even though you did not define a kwargs argument in the macro signature.

Macros have a few clear advantages over plain templates, which you notice with the include statement:

- You do not have to worry about variable names in the template using macros
- You can define the exact required context for a macro block through the macro signature
- You can define a macro library inside a template and import only what is needed

Commonly used macros in a web application include a macro to render pagination, another to render fields, and another to render forms. You could have others, but these are pretty common use cases.

Regarding our previous example, it is good practice to use HTTPS (also known as Secure HTTP) to send sensitive information, such as passwords, over the Internet. Be careful about that!

Extensions

Extensions are the way Jinja2 allows you to extend its vocabulary.
Extensions are not enabled by default, so you can enable an extension only when and if you need, and start using it without much trouble: env = Environment(extensions=['jinja2.ext.do',   'jinja2.ext.with_']) In the preceding code, we have an example where you create an environment with two extensions enabled: do and with. Those are the extensions we will study in this article. As the name suggests, the do extension allows you to "do stuff". Inside a do tag, you're allowed to execute Python expressions with full access to the template context. Flask-Empty, a popular flask boilerplate available at https://github.com/italomaia/flask-empty uses the do extension to update a dictionary in one of its macros, for example. Let's see how we could do the same: {% set x = {1:'home', '2':'boat'} %} {% do x.update({3: 'bar'}) %} {%- for key,value in x.items() %} {{ key }} - {{ value }} {%- endfor %} In the preceding example, we create the x variable with a dictionary, then we update it with {3: 'bar'}. You don't usually need to use the do extension but, when you do, a lot of coding is saved. The with extension is also very simple. You use it whenever you need to create block scoped variables. Imagine you have a value you need cached in a variable for a brief moment; this would be a good use case. Let's see an example: {% with age = user.get_age() %} My age: {{ age }} {% endwith %} My age: {{ age }}{# no value here #} As seen in the example, age exists only inside the with block. Also, variables set inside a with block will only exist inside it. For example: {% with %} {% set count = query.count() %} Current Stock: {{ count }} Diff: {{ prev_count - count }} {% endwith %} {{ count }} {# empty value #} Filters Filters are a marvelous thing about Jinja2! This tool allows you to process a constant or variable before printing it to the template. The goal is to implement the formatting you want, strictly in the template. To use a filter, just call it using the pipe operator like this: {% set name = 'junior' %} {{ name|capitalize }} {# output is Junior #} Its name is passed to the capitalize filter that processes it and returns the capitalized value. To inform arguments to the filter, just call it like a function, like this: {{ ['Adam', 'West']|join(' ') }} {# output is Adam West #} The join filter will join all values from the passed iterable, putting the provided argument between them. Jinja2 has an enormous quantity of available filters by default. That means we can't cover them all here, but we can certainly cover a few. capitalize and lower were seen already. 
Let's look at some further examples: {# prints default value if input is undefined #} {{ x|default('no opinion') }} {# prints default value if input evaluates to false #} {{ none|default('no opinion', true) }} {# prints input as it was provided #} {{ 'some opinion'|default('no opinion') }}   {# you can use a filter inside a control statement #} {# sort by key case-insensitive #} {% for key in {'A':3, 'b':2, 'C':1}|dictsort %}{{ key }}{% endfor %} {# sort by key case-sensitive #} {% for key in {'A':3, 'b':2, 'C':1}|dictsort(true) %}{{ key }}{% endfor %} {# sort by value #} {% for key in {'A':3, 'b':2, 'C':1}|dictsort(false, 'value') %}{{ key }}{% endfor %} {{ [3, 2, 1]|first }} - {{ [3, 2, 1]|last }} {{ [3, 2, 1]|length }} {# prints input length #} {# same as in python #} {{ '%s, =D'|format("I'm John") }} {{ "He has two daughters"|replace('two', 'three') }} {# safe prints the input without escaping it first#} {{ '<input name="stuff" />'|safe }} {{ "there are five words here"|wordcount }} Try the preceding example to see exactly what each filter does. After reading this much about Jinja2, you're probably thinking: "Jinja2 is cool but this is a book about Flask. Show me the Flask stuff!". Ok, ok, I can do that! Of what we have seen so far, almost everything can be used with Flask with no modifications. As Flask manages the Jinja2 environment for you, you don't have to worry about creating file loaders and stuff like that. One thing you should be aware of, though, is that, because you don't instantiate the Jinja2 environment yourself, you can't really pass to the class constructor, the extensions you want to activate. To activate an extension, add it to Flask during the application setup as follows: from flask import Flask app = Flask(__name__) app.jinja_env.add_extension('jinja2.ext.do') # or jinja2.ext.with_ if __name__ == '__main__': app.run() Messing with the template context You can use the render_template method to load a template from the templates folder and then render it as a response. from flask import Flask, render_template app = Flask(__name__)   @app.route("/") def hello():    return render_template("index.html") If you want to add values to the template context, as seen in some of the examples in this article, you would have to add non-positional arguments to render_template: from flask import Flask, render_template app = Flask(__name__)   @app.route("/") def hello():    return render_template("index.html", my_age=28) In the preceding example, my_age would be available in the index.html context, where {{ my_age }} would be translated to 28. my_age could have virtually any value you want to exhibit, actually. Now, what if you want all your views to have a specific value in their context, like a version value—some special code or function; how would you do it? Flask offers you the context_processor decorator to accomplish that. You just have to annotate a function that returns a dictionary and you're ready to go. 
For example: from flask import Flask, render_template app = Flask(__name__)   @app.context_processor def luck_processor(): from random import randint def lucky_number():    return randint(1, 10) return dict(lucky_number=lucky_number)   @app.route("/") def hello(): # lucky_number will be available in the index.html context by default return render_template("index.html") Inside index.html, {{ lucky_number() }} would then print a random number between 1 and 10 on every request. Summary In this article, we saw how to render templates using only Jinja2, how control statements look and how to use them, how to write a comment, how to print variables in a template, how to write and use macros, how to load and use extensions, and how to register context processors. I don't know about you, but this article felt like a lot of information! I strongly advise you to experiment with the examples. Knowing your way around Jinja2 will save you a lot of headaches. Resources for Article: Further resources on this subject: Recommender systems dissected Deployment and Post Deployment [article] Handling sessions and users [article] Introduction to Custom Template Filters and Tags [article]

Understanding PHP basics

Packt
17 Feb 2016
27 min read
In this article by Antonio Lopez Zapata, the author of the book Learning PHP 7, you will see that you need to understand not only the syntax of the language, but also its grammatical rules, that is, when and why to use each element of the language. Luckily for you, some languages come from the same root. For example, Spanish and French are romance languages as they both evolved from spoken Latin; this means that these two languages share a lot of rules, and learning Spanish if you already know French is much easier. (For more resources related to this topic, see here.) Programming languages are quite the same. If you already know another programming language, it will be very easy for you to go through this chapter. If it is your first time though, you will need to understand from scratch all the grammatical rules, so it might take some more time. But fear not! We are here to help you in this endeavor. In this chapter, you will learn about these topics: PHP in web applications Control structures Functions PHP in web applications Even though the main purpose of this chapter is to show you the basics of PHP, doing so in a reference-manual way is not interesting enough. If we were to copy paste what the official documentation says, you might as well go there and read it by yourself. Instead, let's not forget the main purpose of this book and your main goal—to write web applications with PHP. We will show you how you can apply everything you are learning as soon as possible, before you get too bored. In order to do that, we will go through the journey of building an online bookstore. At the very beginning, you might not see the usefulness of it, but that is just because we still haven't seen all that PHP can do. Getting information from the user Let's start by building a home page. In this page, we are going to figure out whether the user is looking for a book or just browsing. How do we find this out? The easiest way right now is to inspect the URL that the user used to access our application and extract some information from there. Save this content as your index.php file: <?php $looking = isset($_GET['title']) || isset($_GET['author']); ?> <!DOCTYPE html> <html lang="en"> <head> <meta charset="UTF-8"> <title>Bookstore</title> </head> <body> <p>Are you looking for a book? <?php echo (int) $looking; ?></p> <p>The book you are looking for is</p> <ul> <li><b>Title</b>: <?php echo $_GET['title']; ?></li> <li><b>Author</b>: <?php echo $_GET['author']; ?></li> </ul> </body> </html> And now, access http://localhost:8000/?author=Harper Lee&title=To Kill a Mockingbird. You will see that the page is printing some of the information that you passed on to the URL. For each request, PHP stores in an array, called $_GET, all the parameters that are coming from the query string. Each key of the array is the name of the parameter, and its associated value is the value of the parameter. So, $_GET contains two entries: $_GET['author'] contains Harper Lee and $_GET['title'] contains To Kill a Mockingbird. On the first highlighted line, we are assigning a Boolean value to the $looking variable. If either $_GET['title'] or $_GET['author'] exists, this variable will be true; otherwise, false. Just after that, we close the PHP tag and then we start printing some HTML, but as you can see, we are actually mixing HTML with PHP code. Another interesting line here is the second highlighted line. We are printing the content of $looking, but before that, we cast the value. Casting means forcing PHP to transform a type of value to another one.
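As a quick standalone illustration (this snippet is not part of index.php; you can run it on its own), explicit casts look like this:

<?php
var_dump((int) "28 books");  // int(28): only the leading digits are kept
var_dump((int) 3.99);        // int(3): the decimal part is dropped
var_dump((bool) "");         // bool(false): an empty string is falsy
var_dump((string) false);    // string(0) "": which is why index.php casts to int before echoing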
Casting a Boolean to an integer means that the resultant value will be 1 if the Boolean is true or 0 if the Boolean is false. As $looking is true since $_GET contains valid keys, the page shows 1. If we try to access the same page without sending any information as in http://localhost:8000, the browser will say "Are you looking for a book? 0". Depending on the settings of your PHP configuration, you will see two notice messages complaining that you are trying to access the keys of the array that do not exist. Casting versus type juggling We already knew that when PHP needs a specific type of variable, it will try to transform it, which is called type juggling. But PHP is quite flexible, so sometimes, you have to be the one specifying the type that you need. When printing something with echo, PHP tries to transform everything it gets into strings. Since the string version of the false Boolean is an empty string, this would not be useful for our application. Casting the Boolean to an integer first assures that we will see a value, even if it is just "0". HTML forms HTML forms are one of the most popular ways to collect information from users. They consist a series of fields called inputs in the HTML world and a final submit button. In HTML, the form tag contains two attributes: the action points, where the form will be submitted and method that specifies which HTTP method the form will use—GET or POST. Let's see how it works. Save the following content as login.html and go to http://localhost:8000/login.html: <!DOCTYPE html> <html lang="en"> <head> <meta charset="UTF-8"> <title>Bookstore - Login</title> </head> <body> <p>Enter your details to login:</p> <form action="authenticate.php" method="post"> <label>Username</label> <input type="text" name="username" /> <label>Password</label> <input type="password" name="password" /> <input type="submit" value="Login"/> </form> </body> </html> This form contains two fields, one for the username and one for the password. You can see that they are identified by the name attribute. If you try to submit this form, the browser will show you a Page Not Found message, as it is trying to access http://localhost:8000/authenticate.phpand the web server cannot find it. Let's create it then: <?php $submitted = !empty($_POST); ?> <!DOCTYPE html> <html lang="en"> <head> <meta charset="UTF-8"> <title>Bookstore</title> </head> <body> <p>Form submitted? <?php echo (int) $submitted; ?></p> <p>Your login info is</p> <ul> <li><b>username</b>: <?php echo $_POST['username']; ?></li> <li><b>password</b>: <?php echo $_POST['password']; ?></li> </ul> </body> </html> As with $_GET, $_POST is an array that contains the parameters received by POST. In this piece of code, we are first asking whether that array is not empty—note the ! operator. Afterwards, we just display the information received, just as in index.php. Note that the keys of the $_POST array are the values for the name argument of each input field. Control structures So far, our files have been executed line by line. Due to that, we are getting some notices on some scenarios, such as when the array does not contain what we are looking for. Would it not be nice if we could choose which lines to execute? Control structures to the rescue! A control structure is like a traffic diversion sign. It directs the execution flow depending on some predefined conditions. There are different control structures, but we can categorize them in conditionals and loops. 
A conditional allows us to choose whether to execute a statement or not. A loop will execute a statement as many times as you need. Let's take a look at each one of them. Conditionals A conditional evaluates a Boolean expression, that is, something that resolves to either true or false. If the expression is true, it will execute everything inside its block of code. A block of code is a group of statements enclosed by {}. Let's see how it works: <?php echo "Before the conditional."; if (4 > 3) { echo "Inside the conditional."; } if (3 > 4) { echo "This will not be printed."; } echo "After the conditional."; In this piece of code, we are using two conditionals. A conditional is defined by the keyword if followed by a Boolean expression in parentheses and by a block of code. If the expression is true, it will execute the block; otherwise, it will skip it. You can increase the power of conditionals by adding the keyword else. This tells PHP to execute a block of code if the previous conditions were not satisfied. Let's see an example: if (2 > 3) { echo "Inside the conditional."; } else { echo "Inside the else."; } This will execute the code inside else as the condition of if was not satisfied. Finally, you can also add an elseif keyword followed by another condition and block of code to continue asking PHP for more conditions. You can add as many elseif as you need after if. If you add else, it has to be the last one of the chain of conditions. Also keep in mind that as soon as PHP finds a condition that resolves to true, it will stop evaluating the rest of the conditions: <?php if (4 > 5) { echo "Not printed"; } elseif (4 > 4) { echo "Not printed"; } elseif (4 == 4) { echo "Printed."; } elseif (4 > 2) { echo "Not evaluated."; } else { echo "Not evaluated."; } if (4 == 4) { echo "Printed"; } In this last example, the first condition that evaluates to true is the one that is highlighted. After that, PHP does not evaluate any more conditions until a new if starts. With this knowledge, let's try to clean up a bit of our application, executing statements only when needed. Copy this code to your index.php file: <!DOCTYPE html> <html lang="en"> <head> <meta charset="UTF-8"> <title>Bookstore</title> </head> <body> <p> <?php if (isset($_COOKIE['username'])) { echo "You are " . $_COOKIE['username']; } else { echo "You are not authenticated."; } ?> </p> <?php if (isset($_GET['title']) && isset($_GET['author'])) { ?> <p>The book you are looking for is</p> <ul> <li><b>Title</b>: <?php echo $_GET['title']; ?></li> <li><b>Author</b>: <?php echo $_GET['author']; ?></li> </ul> <?php } else { ?> <p>You are not looking for a book?</p> <?php } ?> </body> </html> In this new code, we are mixing conditionals and HTML code in two different ways. The first one opens a PHP tag and adds an if-else clause that will print whether we are authenticated or not with echo. No HTML is merged within the conditionals, which makes it clear. The second option—the second highlighted block—shows an uglier solution, but this is sometimes necessary. When you have to print a lot of HTML code, echo is not that handy, and it is better to close the PHP tag; print all the HTML you need and then open the tag again. You can do that even inside the code block of an if clause, as you can see in the code. Mixing PHP and HTML If you feel like the last file we edited looks rather ugly, you are right. Mixing PHP and HTML is confusing, and you have to avoid it by all means.
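One way to keep things manageable is to do all the decision making in plain PHP at the top of the file and leave the markup below as presentation only. Here is a minimal sketch of the idea (the greeting.php file and the $greeting variable are made up for illustration; they are not part of the bookstore code):

<?php
// greeting.php: all the logic happens first, in plain PHP
if (isset($_COOKIE['username'])) {
    $greeting = 'You are ' . $_COOKIE['username'];
} else {
    $greeting = 'You are not authenticated.';
}
?>
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Bookstore</title>
</head>
<body>
    <!-- the markup below only prints values; it takes no decisions -->
    <p><?php echo $greeting; ?></p>
</body>
</html>

The only PHP left in the markup is a single echo, which keeps the HTML readable.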
Let's edit our authenticate.php file too, as it is trying to access $_POST entries that might not be there. The new content of the file would be as follows: <?php $submitted = isset($_POST['username']) && isset($_POST['password']); if ($submitted) { setcookie('username', $_POST['username']); } ?> <!DOCTYPE html> <html lang="en"> <head> <meta charset="UTF-8"> <title>Bookstore</title> </head> <body> <?php if ($submitted): ?> <p>Your login info is</p> <ul> <li><b>username</b>: <?php echo $_POST['username']; ?></li> <li><b>password</b>: <?php echo $_POST['password']; ?></li> </ul> <?php else: ?> <p>You did not submitted anything.</p> <?php endif; ?> </body> </html> This code also contains conditionals, which we already know. We are setting a variable to know whether we've submitted a login or not and to set the cookies if we have. However, the highlighted lines show you a new way of including conditionals with HTML. This way, tries to be more readable when working with HTML code, avoiding the use of {} and instead using : and endif. Both syntaxes are correct, and you should use the one that you consider more readable in each case. Switch-case Another control structure similar to if-else is switch-case. This structure evaluates only one expression and executes the block depending on its value. Let's see an example: <?php switch ($title) { case 'Harry Potter': echo "Nice story, a bit too long."; break; case 'Lord of the Rings': echo "A classic!"; break; default: echo "Dunno that one."; break; } The switch case takes an expression; in this case, a variable. It then defines a series of cases. When the case matches the current value of the expression, PHP executes the code inside it. As soon as PHP finds break, it will exit switch-case. In case none of the cases are suitable for the expression, if there is a default case         , PHP will execute it, but this is optional. You also need to know that breaks are mandatory if you want to exit switch-case. If you do not specify any, PHP will keep on executing statements, even if it encounters a new case. Let's see a similar example but without breaks: <?php $title = 'Twilight'; switch ($title) { case 'Harry Potter': echo "Nice story, a bit too long."; case 'Twilight': echo 'Uh...'; case 'Lord of the Rings': echo "A classic!"; default: echo "Dunno that one."; } If you test this code in your browser, you will see that it is printing "Uh...A classic!Dunno that one.". PHP found that the second case is valid so it executes its content. But as there are no breaks, it keeps on executing until the end. This might be the desired behavior sometimes, but not usually, so we need to be careful when using it! Loops Loops are control structures that allow you to execute certain statements several times—as many times as you need. You might use them on several different scenarios, but the most common one is when interacting with arrays. For example, imagine you have an array with elements but you do not know what is in it. You want to print all its elements so you loop through all of them. There are four types of loops. Each of them has their own use cases, but in general, you can transform one type of loop into another. Let's see them closely While While is the simplest of the loops. It executes a block of code until the expression to evaluate returns false. Let's see one example: <?php $i = 1; while ($i < 4) { echo $i . " "; $i++; } Here, we are defining a variable with the value 1. Then, we have a while clause in which the expression to evaluate is $i < 4. 
This loop will execute the content of the block of code until that expression is false. As you can see, inside the loop we are incrementing the value of $i by 1 each time, so after three iterations, the loop will end. Check out the output of that script, and you will see "1 2 3". The last value printed is 3, so by that time, $i was 3. After that, we increased its value to 4, so when the while was evaluating whether $i < 4, the result was false. Whiles and infinite loops One of the most common problems with while loops is creating an infinite loop. If the code inside while does not update any of the variables considered in the while expression so that it can become false at some point, PHP will never exit the loop! For This is the most complex of the four loops. For defines an initialization expression, an exit condition, and the end of the iteration expression. When PHP first encounters the loop, it executes what is defined as the initialization expression. Then, it evaluates the exit condition, and if it resolves to true, it enters the loop. After executing everything inside the loop, it executes the end of the iteration expression. Once this is done, it will evaluate the end condition again, going through the loop code and the end of iteration expression until it evaluates to false. As always, an example will help clarify this: <?php for ($i = 1; $i < 10; $i++) { echo $i . " "; } The initialization expression is $i = 1 and is executed only the first time. The exit condition is $i < 10, and it is evaluated at the beginning of each iteration. The end of the iteration expression is $i++, which is executed at the end of each iteration. This example prints numbers from 1 to 9. Another more common usage of the for loop is with arrays: <?php $names = ['Harry', 'Ron', 'Hermione']; for ($i = 0; $i < count($names); $i++) { echo $names[$i] . " "; } In this example, we have an array of names. As it is defined as a list, its keys will be 0, 1, and 2. The loop initializes the $i variable to 0, and it will iterate until the value of $i is no longer less than the number of elements in the array, which is 3. In the first iteration, $i is 0; in the second, it will be 1; and in the third one, it will be 2. When $i is 3, it will not enter the loop as the exit condition evaluates to false. On each iteration, we are printing the content of the $i position of the array; hence, the result of this code will be all three names in the array. Be careful with exit conditions It is very common to set an exit condition that is not exactly what we need, especially with arrays. Remember that arrays start with 0 if they are a list, so an array of 3 elements will have entries 0, 1, and 2. Defining the exit condition as $i <= count($array) will cause an error in your code, as when $i is 3, it also satisfies the exit condition and will try to access the key 3, which does not exist. Foreach The last, but not least, type of loop is foreach. This loop is exclusive for arrays, and it allows you to iterate an array entirely, even if you do not know its keys. There are two options for the syntax, as you can see in these examples: <?php $names = ['Harry', 'Ron', 'Hermione']; foreach ($names as $name) { echo $name . " "; } foreach ($names as $key => $name) { echo $key . " -> " . $name . " "; } The foreach loop accepts an array; in this case, $names. It specifies a variable, which will contain the value of the entry of the array. You can see that we do not need to specify any end condition, as PHP will know when the array has been iterated.
Optionally, you can specify a variable that will contain the key of each iteration, as in the second loop. Foreach loops are also useful with maps, where the keys are not necessarily numeric. The order in which PHP will iterate the array will be the same order in which you used to insert the content in the array. Let's use some loops in our application. We want to show the available books in our home page. We have the list of books in an array, so we will have to iterate all of them with a foreach loop, printing some information from each one. Append the following code to the body tag in index.php: <?php endif; $books = [ [ 'title' => 'To Kill A Mockingbird', 'author' => 'Harper Lee', 'available' => true, 'pages' => 336, 'isbn' => 9780061120084 ], [ 'title' => '1984', 'author' => 'George Orwell', 'available' => true, 'pages' => 267, 'isbn' => 9780547249643 ], [ 'title' => 'One Hundred Years Of Solitude', 'author' => 'Gabriel Garcia Marquez', 'available' => false, 'pages' => 457, 'isbn' => 9785267006323 ], ]; ?> <ul> <?php foreach ($books as $book): ?> <li> <i><?php echo $book['title']; ?></i> - <?php echo $book['author']; ?> <?php if (!$book['available']): ?> <b>Not available</b> <?php endif; ?> </li> <?php endforeach; ?> </ul> The highlighted code shows a foreach loop using the : notation, which is better when mixing it with HTML. It iterates all the $books arrays, and for each book, it will print some information as a HTML list. Also note that we have a conditional inside a loop, which is perfectly fine. Of course, this conditional will be executed for each entry in the array, so you should keep the block of code of your loops as simple as possible. Functions A function is a reusable block of code that, given an input, performs some actions and optionally returns a result. You already know several predefined functions, such as empty, in_array, or var_dump. These functions come with PHP so you do not have to reinvent the wheel, but you can create your own very easily. You can define functions when you identify portions of your application that have to be executed several times or just to encapsulate some functionality. Function declaration Declaring a function means to write it down so that it can be used later. A function has a name, takes arguments, and has a block of code. Optionally, it can define what kind of value is returning. The name of the function has to follow the same rules as variable names; that is, it has to start by a letter or underscore and can contain any letter, number, or underscore. It cannot be a reserved word. Let's see a simple example: function addNumbers($a, $b) { $sum = $a + $b; return $sum; } $result = addNumbers(2, 3); Here, the function's name is addNumbers, and it takes two arguments: $a and $b. The block of code defines a new variable $sum that is the sum of both the arguments and then returns its content with return. In order to use this function, you just need to call it by its name, sending all the required arguments, as shown in the highlighted line. PHP does not support overloaded functions. Overloading refers to the ability of declaring two or more functions with the same name but different arguments. As you can see, you can declare the arguments without knowing what their types are, so PHP would not be able to decide which function to use. Another important thing to note is the variable scope. We are declaring a $sum variable inside the block of code, so once the function ends, the variable will not be accessible any more. 
This means that the scope of variables declared inside the function is just the function itself. Furthermore, if you had a $sum variable declared outside the function, it would not be affected at all since the function cannot access that variable unless we send it as an argument. Function arguments A function gets information from outside via arguments. You can define any number of arguments—including 0. These arguments need at least a name so that they can be used inside the function, and there cannot be two arguments with the same name. When invoking the function, you need to send the arguments in the same order as we declared them. A function may contain optional arguments; that is, you are not forced to provide a value for those arguments. When declaring the function, you need to provide a default value for those arguments, so in case the user does not provide a value, the function will use the default one: function addNumbers($a, $b, $printResult = false) { $sum = $a + $b; if ($printResult) { echo 'The result is ' . $sum; } return $sum; } $sum1 = addNumbers(1, 2); $sum1 = addNumbers(3, 4, false); $sum1 = addNumbers(5, 6, true); // it will print the result This new function takes two mandatory arguments and an optional one. The default value is false, and is used as a normal value inside the function. The function will print the result of the sum if the user provides true as the third argument, which happens only the third time that the function is invoked. For the first two times, $printResult is set to false. The arguments that the function receives are just copies of the values that the user provided. This means that if you modify these arguments inside the function, it will not affect the original values. This feature is known as sending arguments by value. Let's see an example: function modify($a) { $a = 3; } $a = 2; modify($a); var_dump($a); // prints 2 We are declaring the $a variable with the value 2, and then we are calling the modify method, sending $a. The modify method modifies the $a argument, setting its value to 3. However, this does not affect the original value of $a, which remains 2 as you can see in the var_dump output. If what you want is to actually change the value of the original variable used in the invocation, you need to pass the argument by reference. To do that, you add & in front of the argument when declaring the function: function modify(&$a) { $a = 3; } Now, after invoking the modify function, $a will always be 3. Arguments by value versus by reference PHP allows you to pass arguments by reference, and in fact, some native functions of PHP use arguments by reference—remember the array sorting functions; they did not return the sorted array; instead, they sorted the array provided. But using arguments by reference is a way of confusing developers. Usually, when someone uses a function, they expect a result, and they do not want their provided arguments to be modified. So, try to avoid it; people will be grateful! The return statement You can have as many return statements as you want inside your function, but PHP will exit the function as soon as it finds one. This means that if you have two consecutive return statements, the second one will never be executed. Still, having multiple return statements can be useful if they are inside conditionals. Add this function inside your functions.php file: function loginMessage() { if (isset($_COOKIE['username'])) { return "You are " .
$_COOKIE['username']; } else { return "You are not authenticated."; } } Let's use it in your index.php file by replacing the highlighted content—note that to save some trees, I replaced most of the code that was not changed at all with //…: //... <body> <p><?php echo loginMessage(); ?></p> <?php if (isset($_GET['title']) && isset($_GET['author'])): ?> //... Additionally, you can omit the return statement if you do not want the function to return anything. In this case, the function will end once it reaches the end of the block of code. Type hinting and return types With the release of PHP 7, the language allows developers to be more specific about what functions get and return. You can, always optionally, specify the type of argument that the function needs (type hinting) and the type of result the function will return (return type). Let's first see an example: <?php declare(strict_types=1); function addNumbers(int $a, int $b, bool $printSum): int { $sum = $a + $b; if ($printSum) { echo 'The sum is ' . $sum; } return $sum; } addNumbers(1, 2, true); addNumbers(1, '2', true); // it fails when strict_types is 1 addNumbers(1, 'something', true); // it always fails This function states that the first two arguments need to be integers, the third one a Boolean, and that the result will be an integer. Now, you know that PHP has type juggling, so it can usually transform a value of one type to its equivalent value of another type, for example, the string 2 can be used as integer 2. To stop PHP from using type juggling with the arguments and results of functions, you can declare the strict_types directive as shown in the first highlighted line. This directive has to be declared at the top of each file where you want to enforce this behavior. The three invocations work as follows: The first invocation sends two integers and a Boolean, which is what the function expects. So, regardless of the value of strict_types, it will always work. The second invocation sends an integer, a string, and a Boolean. The string has a valid integer value, so if PHP was allowed to use type juggling, the invocation would work just fine. But in this example, it will fail because of the declaration on top of the file. The third invocation will always fail as the something string cannot be transformed into a valid integer. Let's try to use a function within our project. In our index.php file, we have a foreach loop that iterates the books and prints them. The code inside the loop is kind of hard to understand as it is mixing HTML with PHP, and there is a conditional too. Let's try to abstract the logic inside the loop into a function. First, create the new functions.php file with the following content: <?php function printableTitle(array $book): string { $result = '<i>' . $book['title'] . '</i> - ' . $book['author']; if (!$book['available']) { $result .= ' <b>Not available</b>'; } return $result; } This file will contain our functions. The first one, printableTitle, takes an array representing a book and builds a string with a nice representation of the book in HTML. The code is the same as before, just encapsulated in a function. Now, index.php will have to include the functions.php file and then use the function inside the loop. Let's see how this is done: <?php require_once 'functions.php' ?> <!DOCTYPE html> <html lang="en"> //... ?> <ul> <?php foreach ($books as $book): ?> <li><?php echo printableTitle($book); ?> </li> <?php endforeach; ?> </ul> //... Well, now our loop looks way cleaner, right?
Also, if we need to print the title of the book somewhere else, we can reuse the function instead of duplicating code! Summary In this article, we went through all the basics of procedural PHP while writing simple examples in order to practice them. You now know how to use variables and arrays with control structures and functions and how to get information from HTTP requests among others. Resources for Article: Further resources on this subject: Getting started with Modernizr using PHP IDE[article] PHP 5 Social Networking: Implementing Public Messages[article] Working with JSON in PHP jQuery[article]

An Overview of the Node Package Manager

Packt
11 Aug 2011
5 min read
Node Web Development npm package format An npm package is a directory structure with a package.json file describing the package. This is exactly what we just referred to as a Complex Module, except npm recognizes many more package.json tags than does Node. The starting point for npm's package.json is the CommonJS Packages/1.0 specification. The documentation for npm's package.json implementation is accessed with the following command: $ npm help json A basic package.json file is as follows: { name: "packageName", version: "1.0", main: "mainModuleName", modules: { "mod1": "lib/mod1", "mod2": "lib/mod2" } } The file is in JSON format which, as a JavaScript programmer, you should already have seen a few hundred times. The most important tags are name and version. The name will appear in URLs and command names, so choose one that's safe for both. If you desire to publish a package in the public npm repository it's helpful to check and see if a particular name is already being used, at http://search.npmjs.org or with the following command: $ npm search packageName The main tag is treated the same as complex modules. It references the module that will be returned when invoking require('packageName'). Packages can contain many modules within themselves, and those can be listed in the modules list. Packages can be bundled as tar-gzip tarballs, especially to send them over the Internet. A package can declare dependencies on other packages. That way npm can automatically install other modules required by the module being installed. Dependencies are declared as follows: "dependencies": { "foo" : "1.0.0 - 2.9999.9999" , "bar" : ">=1.0.2 <2.1.2" } The description and keywords fields help people to find the package when searching in an npm repository (http://search.npmjs.org). Ownership of a package can be documented in the homepage, author, or contributors fields: "description": "My wonderful packages walks dogs", "homepage": "http://npm.dogs.org/dogwalker/", "author": [email protected] Some npm packages provide executable programs meant to be in the user's PATH. These are declared using the bin tag. It's a map of command names to the script which implements that command. The command scripts are installed into the directory containing the node executable using the name given. bin: { 'nodeload.js': './nodeload.js', 'nl.js': './nl.js' }, The directories tag documents the package directory structure. The lib directory is automatically scanned for modules to load. There are other directory tags for binaries, manuals, and documentation. directories: { lib: './lib', bin: './bin' }, The script tags are script commands run at various events in the lifecycle of the package. These events include install, activate, uninstall, update, and more. For more information about script commands, use the following command: $ npm help scripts This was only a taste of the npm package format; see the documentation (npm help json) for more. Finding npm packages By default npm modules are retrieved over the Internet from the public package registry maintained on http://npmjs.org. If you know the module name it can be installed simply by typing the following: $ npm install moduleName But what if you don't know the module name? How do you discover the interesting modules? The website http://npmjs.org publishes an index of the modules in that registry, and the http://search.npmjs.org site lets you search that index. 
npm also has a command-line search function to consult the same index: $ npm search mp3 mediatags Tools extracting for media meta-data tags =coolaj86 util m4a aac mp3 id3 jpeg exiv xmp node3p An Amazon MP3 downloader for NodeJS. =ncb000gt Of course upon finding a module it's installed as follows: $ npm install mediatags After installing a module one may want to see the documentation, which would be on the module's website. The homepage tag in the package.json lists that URL. The easiest way to look at the package.json file is with the npm view command, as follows: $ npm view zombie ... { name: 'zombie', description: 'Insanely fast, full-stack, headless browser testing using Node.js', ... version: '0.9.4', homepage: 'http://zombie.labnotes.org/', ... npm ok You can use npm view to extract any tag from package.json, like the following, which lets you view just the homepage tag: $ npm view zombie homepage http://zombie.labnotes.org/ Using the npm commands The main npm command has a long list of sub-commands for specific package management operations. These cover every aspect of the lifecycle of publishing packages (as a package author), and downloading, using, or removing packages (as an npm consumer). Getting help with npm Perhaps the most important thing is to learn where to turn to get help. The main help is delivered along with the npm command and is accessed as follows: $ npm help For most of the commands you can access the help text for that command by typing the following: $ npm help <command> The npm website (http://npmjs.org/) has a FAQ that is also delivered with the npm software. Perhaps the most important question (and answer) is: Why does npm hate me? npm is not capable of hatred. It loves everyone, even you.
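To recap, a typical first session with a package strings together the commands covered in this article; mediatags is just the example package found earlier, so substitute any package name you like:

$ npm search mediatags     # look the package up in the public registry
$ npm view mediatags       # inspect its package.json before installing
$ npm install mediatags    # install it into the local node_modules directory
$ npm help install         # read the built-in help for a sub-command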

Replication in MySQL Admin

Packt
17 Jan 2011
10 min read
Replication is an interesting feature of MySQL that can be used for a variety of purposes. It can help to balance server load across multiple machines, ease backups, provide a workaround for the lack of fulltext search capabilities in InnoDB, and much more. The basic idea behind replication is to reflect the contents of one database server (this can include all databases, only some of them, or even just a few tables) to more than one instance. Usually, those instances will be running on separate machines, even though this is not technically necessary. Traditionally, MySQL replication is based on the surprisingly simple idea of repeating the execution of all statements issued that can modify data—not SELECT—against a single master machine on other machines as well. Provided all secondary slave machines had identical data contents when the replication process began, they should automatically remain in sync. This is called Statement Based Replication (SBR). With MySQL 5.1, Row Based Replication (RBR) was added as an alternative method for replication, targeting some of the deficiencies SBR brings with it. While at first glance it may seem superior (and more reliable), it is not a silver bullet—the pain points of RBR are simply different from those of SBR. Even though there are certain use cases for RBR, all recipes in this chapter will be using Statement Based Replication. While MySQL makes replication generally easy to use, it is still important to understand what happens internally to be able to know the limitations and consequences of the actions and decisions you will have to make. We assume you already have a basic understanding of replication in general, but we will still go into a few important details. Statement Based Replication SBR is based on a simple but effective principle: if two or more machines have the same set of data to begin with, they will remain identical if all of them execute the exact same SQL statements in the same order. Executing all statements manually on multiple machines would be extremely tedious and impractical. SBR automates this process. In simple terms, it takes care of sending all the SQL statements that change data on one server (the master) to any number of additional instances (the slaves) over the network. The slaves receiving this stream of modification statements execute them automatically, thereby effectively reproducing the changes the master machine made to its data originally. That way they will keep their local data files in sync with the master's. One thing worth noting here is that the network connection between the master and its slave(s) need not be permanent. In case the link between a slave and its master fails, the slave will remember up to which point it had read the data last time and will continue from there once the network becomes available again. In order to minimize the dependency on the network link, the slaves will retrieve the binary logs (binlogs) from the master as quickly as they can, storing them on their local disk in files called relay logs. This way, the connection, which might be some sort of dial-up link, can be terminated much sooner while executing the statements from the local relay-log asynchronously. The relay log is just a copy of the master's binlog. The following image shows the overall architecture: Filtering In the image you can see that each slave may have its individual configuration on whether it executes all the statements coming in from the master, or just a selection of those. 
This can be helpful when you have some slaves dedicated to special tasks, where they might not need all the information from the master. All of the binary logs have to be sent to each slave, even though it might then decide to throw away most of them. Depending on the size of the binlogs, the number of slaves and the bandwidth of the connections in between, this can be a heavy burden on the network, especially if you are replicating via wide area networks. Even though the general idea of transferring SQL statements over the wire is rather simple, there are lots of things that can go wrong, especially because MySQL offers some configuration options that are quite counter-intuitive and lead to hard-to-find problems. For us, this has become a best practice: "Only use qualified statements and replicate-*-table configuration options for intuitively predictable replication!" What this means is that the only filtering rules that produce intuitive results are those based on the replicate-do-table and replicate-ignore-table configuration options. This includes those variants with wildcards, but specifically excludes the all-database options like replicate-do-db and replicate-ignore-db. These directives are applied on the slave side on all incoming relay logs. The master-side binlog-do-* and binlog-ignore-* configuration directives influence which statements are sent to the binlog and which are not. We strongly recommend against using them, because apart from hard-to-predict results they will make the binlogs undesirable for server backup and restore. They are often of limited use anyway as they do not allow individual configurations per slave but apply to all of them. Setting up automatically updated slaves of a server based on a SQL dump In this recipe, we will show you how to prepare a dump file of a MySQL master server and use it to set up one or more replication slaves. These will automatically be updated with changes made on the master server over the network. Getting ready You will need a running MySQL master database server that will act as the replication master and at least one more server to act as a replication slave. This needs to be a separate MySQL instance with its own data directory and configuration. It can reside on the same machine if you just want to try this out. In practice, a second machine is recommended because this technique's very goal is to distribute data across multiple pieces of hardware, not place an even higher burden on a single one. For production systems you should pick a time to do this when there is a lighter load on the master machine, often during the night when there are less users accessing the system. Taking the SQL dump uses some extra resources, but unless your server is maxed out already, the performance impact usually is not a serious problem. Exactly how long the dump will take depends mostly on the amount of data and speed of the I/O subsystem. You will need an administrative operating system account on the master and the slave servers to edit the MySQL server configuration files on both of them. Moreover, an administrative MySQL database user is required to set up replication. We will just replicate a single database called sakila in this example. Replicating more than one database In case you want to replicate more than one schema, just add their names to the commands shown below. To replicate all of them, just leave out any database name from the command line. How to do it... 
At the operating system level, connect to the master machine and open the MySQL configuration file with a text editor. Usually it is called my.ini on Windows and my.cnf on other operating systems. On the master machine, make sure the following entries are present and add them to the [mysqld] section if not already there: server-id=1000 log-bin=master-bin If one or both entries already exist, do not change them but simply note their values. The log-bin setting need not have a value, but can stand alone as well. Restart the master server if you need to modify the configuration. Create a user account on the master that can be used by the slaves to connect: master> grant replication slave on *.* to 'repl'@'%' identified by 'slavepass'; Using the mysqldump tool included in the default MySQL install, create the initial copy to set up the slave(s): $ mysqldump -uUSER -pPASS --master-data --single-transaction sakila > sakila_master.sql Transfer the sakila_master.sql dump file to each slave you want to set up, for example, by using an external drive or network copy. On the slave, make sure the following entries are present and add them to the [mysqld] section if not present: server-id=1001 replicate-wild-do-table=sakila.% When adding more than one slave, make sure the server-id setting is unique among master and all clients. Restart the slave server. Connect to the slave server and issue the following commands (assuming the data dump was stored in the /tmp directory): slave> create database sakila; slave> use sakila; slave> source /tmp/sakila_master.sql; slave> CHANGE MASTER TO master_host='master.example.com', master_port=3306, master_user='repl', master_password='slavepass'; slave> START SLAVE; Verify the slave is running with: slave> SHOW SLAVE STATUS\G ************************** 1. row *************************** ... Slave_IO_Running: Yes Slave_SQL_Running: Yes ... How it works... Some of the instructions discussed in the previous section are to make sure that both master and slave are configured with different server-id settings. This is of paramount importance for a successful replication setup. If you fail to provide unique server-id values to all your server instances, you might see strange replication errors that are hard to debug. Moreover, the master must be configured to write binlogs—a record of all statements manipulating data (this is what the slaves will receive). Before taking a full content dump of the sakila demo database, we create a user account for the slaves to use. This needs the REPLICATION SLAVE privilege. Then a data dump is created with the mysqldump command line tool. Notice the provided parameters --master-data and --single-transaction. The former is needed to have mysqldump include information about the precise moment the dump was created in the resulting output. The latter parameter is important when using InnoDB tables, because only then will the dump be created based on a transactional snapshot of the data. Without it, statements changing data while the tool was running could lead to an inconsistent dump. The output of the command is redirected to the sakila_master.sql file. As the sakila database is not very big, you should not see any problems. However, if you apply this recipe to larger databases, make sure you send the data to a volume with sufficient free disk space—the SQL dump can become quite large.
To save space here, you may optionally pipe the output through gzip or bzip2 at the cost of a higher CPU load on both the master and the slaves, because they will need to unpack the dump before they can load it, of course. If you open the uncompressed dump file with an editor, you will see a line with a CHANGE MASTER TO statement. This is what --master-data is for. Once the file is imported on a slave, it will know at which point in time (well, rather at which binlog position) this dump was taken. Everything that happened on the master after that needs to be replicated. Finally, we configure that slave to use the credentials set up on the master before to connect and then start the replication. Notice that the CHANGE MASTER TO statement used for that does not include the information about the log positions or file names because that was already taken from the dump file just read in. From here on the slave will go ahead and record all SQL statements sent from the master, store them in its relay logs, and then execute them against the local data set. This recipe is very important because the following recipes are based on this! So in case you have not fully understood the above steps yet, we recommend you go through them again, before trying out more complicated setups.
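As a final sanity check that statements really flow from the master to the slave, make a small change on the master and look for it on the slave. The sakila schema's language table is a convenient place for such a test; the row inserted here is only an example and can be removed afterwards:

master> INSERT INTO sakila.language (name) VALUES ('Esperanto');
slave> SELECT name FROM sakila.language ORDER BY language_id DESC LIMIT 1;

If replication is working, the slave returns 'Esperanto' almost immediately.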

Importing Structure and Data Using phpMyAdmin

Packt
12 Oct 2009
9 min read
A feature was added in version 2.11.0: an import file may contain the DELIMITER keyword. This enables phpMyAdmin to mimic the mysql command-line interpreter. The DELIMITER separator is used to delineate the part of the file containing a stored procedure, as these procedures can themselves contain semicolons. The default values for the Import interface are defined in $cfg['Import']. Before examining the actual import dialog, let's discuss some limits issues. Limits for the transfer When we import, the source file is usually on our client machine; so, it must travel to the server via HTTP. This transfer takes time and uses resources that may be limited in the web server's PHP configuration. Instead of using HTTP, we can upload our file to the server using a protocol such as FTP, as described in the Web Server Upload Directories section. This method circumvents the web server's PHP upload limits. Time limits First, let's consider the time limit. In config.inc.php, the $cfg['ExecTimeLimit'] configuration directive assigns, by default, a maximum execution time of 300 seconds (five minutes) for any phpMyAdmin script, including the scripts that process data after the file has been uploaded. A value of 0 removes the limit, and in theory, gives us infinite time to complete the import operation. If the PHP server is running in safe mode, modifying $cfg['ExecTimeLimit'] will have no effect. This is because the limits set in php.ini or in user-related web server configuration file (such as .htaccess or virtual host configuration files) take precedence over this parameter. Of course, the time it effectively takes, depends on two key factors: Web server load MySQL server load The time taken by the file, as it travels between the client and the server,does not count as execution time because the PHP script starts to execute only once the file has been received on the server. Therefore, the $cfg['ExecTimeLimit'] parameter has an impact only on the time used to process data (like decompression or sending it to the MySQL server). Other limits The system administrator can use the php.ini file or the web server's virtual host configuration file to control uploads on the server. The upload_max_filesize parameter specifies the upper limit or the maximum file size that can be uploaded via HTTP. This one is obvious, but another less obvious parameter is post_max_size. As HTTP uploading is done via the POST method, this parameter may limit our transfers. For more details about the POST method, please refer to http://en.wikipedia.org/wiki/Http#Request_methods. The memory_limit parameter is provided to avoid web server child processes from grabbing too much of the server memory—phpMyAdmin also runs as a child process. Thus, the handling of normal file uploads, especially compressed dumps, can be compromised by giving this parameter a small value. Here, no preferred value can be recommended; the value depends on the size of uploaded data. The memory limit can also be tuned via the $cfg['MemoryLimit'] parameter in config.inc.php. Finally, file uploads must be allowed by setting file_uploads to On. Otherwise, phpMyAdmin won't even show the Location of the textfile dialog. It would be useless to display this dialog, as the connection would be refused later by the PHP component of the web server. Partial imports If the file is too big, there are ways in which we can resolve the situation. 
If we still have access to the original data, we could use phpMyAdmin to generate smaller CSV export files, choosing the Dump n rows starting at record # n dialog. If this were not possible, we will have to use a text editor to split the file into smaller sections. Another possibility is to use the upload directory mechanism, which accesses the directory defined in $cfg['UploadDir']. This feature is explained later in this article. In recent phpMyAdmin versions, the Partial import feature can also solve this file size problem. By selecting the Allow interrupt… checkbox, the import process will interrupt itself if it detects that it is close to the time limit. We can also specify a number of queries to skip from the start, in case we successfully import a number of rows and wish to continue from that point. Temporary directory On some servers, a security feature called open_basedir can be set up in a way that impedes the upload mechanism. In this case, or for any other reason, when uploads are problematic, the $cfg['TempDir'] parameter can be set with the value of a temporary directory. This is probably a subdirectory of phpMyAdmin's main directory, into which the web server is allowed to put the uploaded file. Importing SQL files Any file containing MySQL statements can be imported via this mechanism. The dialog is available in the Database view or the Table view, via the Import subpage, or in the Query window. There is no relation between the currently selected table (here author) and the actual contents of the SQL file that will be imported. All the contents of the SQL file will be imported, and it is those contents that determine which tables or databases are affected. However, if the imported file does not contain any SQL statements to select a database, all statements in the imported file will be executed on the currently selected database. Let's try an import exercise. First, we make sure that we have a current SQL export of the book table. This export file must contain the structure and the data. Then we drop the book table—yes, really! We could also simply rename it. Now it is time to import the file back. We should be on the Import subpage, where we can see the Location of the text file dialog. We just have to hit the Browse button and choose our file. phpMyAdmin is able to detect which compression method (if any) has been applied to the file. Depending on the phpMyAdmin version, and the extensions that are available in the PHP component of the web server, there is variation in the formats that the program can decompress. However, to import successfully, phpMyAdmin must be informed of the character set of the file to be imported. The default value is utf8. However, if we know that the import file was created with another character set, we should specify it here. An SQL compatibility mode selector is available at import time. This mode should be adjusted to match the actual data that we are about to import, according to the type of the server where the data was previously exported. To start the import, we click Go. The import procedure continues and we receive a message: Import has been successfully finished, 2 queries executed. We can browse our newly-created tables to confirm the success of the import operation. The file could be imported for testing in a different database or even in a MySQL server. Importing CSV files In this section, we will examine how to import CSV files. There are two possible methods—CSV and CSV using LOAD DATA. 
The first method is implemented internally by phpMyAdmin and is the recommended one for its simplicity. With the second method, phpMyAdmin receives the file to be loaded and passes it to MySQL. In theory, this method should be faster. However, it has more requirements due to MySQL itself (see the Requirements sub-section of the CSV using LOAD DATA section). Differences between SQL and CSV formats There are some differences between these two formats. The CSV file format contains data only, so we must already have an existing table in place. This table does not need to have the same structure as the original table (from which the data comes); the Column names dialog enables us to choose which columns are affected in the target table. Because the table must exist prior to the import, the CSV import dialog is available only from the Import subpage in the Table view, and not in the Database view. Exporting a test file Before trying an import, let's generate an author.csv export file from the author table. We use the default values in the CSV export options. We can then Empty the author table; we should avoid dropping this table because we still need the table structure. CSV From the author table menu, we select Import and then CSV. We can influence the behavior of the import in a number of ways. By default, importing does not modify existing data (based on primary or unique keys). However, the Replace table data with file option instructs phpMyAdmin to use REPLACE statements instead of INSERT statements, so that existing rows are replaced with the imported data. Using Ignore duplicate rows, INSERT IGNORE statements are generated. These cause MySQL to ignore any duplicate key problems during insertion. A duplicate key from the import file does not replace existing data, and the procedure continues for the next line of CSV data. We can then specify the character that terminates each field, the character that encloses data, and the character that escapes the enclosing character. Usually this is the backslash (\). For example, with a double quote as the enclosing character, if a data field contains a double quote, it must be expressed as "some data \" some other data". For Lines terminated by, recent versions of phpMyAdmin offer the auto choice, which should be tried first as it detects the end-of-line character automatically. We can also specify manually which characters terminate the lines. The usual choice is \n for UNIX-based systems, \r\n for DOS or Windows systems, and \r for Mac-based systems (up to Mac OS 9). If in doubt, we can use a hexadecimal file editor on our client computer (not part of phpMyAdmin) to examine the exact codes. By default, phpMyAdmin expects a CSV file with the same number of fields and the same field order as the target table. But this can be changed by entering a comma-separated list of column names in Column names, respecting the source file format. For example, let's say our source file contains only the author ID and the author name information: "1","John Smith" "2","Maria Sunshine" We'd have to put id, name in Column names to match the source file. When we click Go, the import is executed and we get a confirmation. We might also see the actual INSERT queries generated if the total size of the file is not too big. Import has been successfully finished, 2 queries executed. INSERT INTO `author` VALUES ('1', 'John Smith', '+01 445 789-1234') # 1 row(s) affected. INSERT INTO `author` VALUES ('2', 'Maria Sunshine', '333-3333') # 1 row(s) affected.
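For comparison, the CSV using LOAD DATA method hands the uploaded file to the MySQL server, which parses it itself. Roughly speaking, a statement along the following lines is executed; this is only a sketch to show the idea, and the file path, separators, and column list here are illustrative rather than exactly what phpMyAdmin generates:

LOAD DATA LOCAL INFILE '/tmp/author.csv'
INTO TABLE author
FIELDS TERMINATED BY ',' ENCLOSED BY '"' ESCAPED BY '\\'
LINES TERMINATED BY '\n'
(id, name, phone);

Because the parsing happens inside MySQL, the server and the connection must allow this kind of statement (for example, the FILE privilege for server-side files, or the local_infile settings for the LOCAL variant), which is why this method has more requirements than the plain CSV import.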

Relational Databases with SQLAlchemy

Packt
02 Nov 2015
28 min read
In this article by Matthew Copperwaite, author of the book Learning Flask Framework, he talks about how relational databases are the bedrock upon which almost every modern web applications are built. Learning to think about your application in terms of tables and relationships is one of the keys to a clean, well-designed project. We will be using SQLAlchemy, a powerful object relational mapper that allows us to abstract away the complexities of multiple database engines, to work with the database directly from within Python. In this article, we shall: Present a brief overview of the benefits of using a relational database Introduce SQLAlchemy, The Python SQL Toolkit and Object Relational Mapper Configure our Flask application to use SQLAlchemy Write a model class to represent blog entries Learn how to save and retrieve blog entries from the database Perform queries—sorting, filtering, and aggregation Create schema migrations using Alembic (For more resources related to this topic, see here.) Why use a relational database? Our application's database is much more than a simple record of things that we need to save for future retrieval. If all we needed to do was save and retrieve data, we could easily use flat text files. The fact is, though, that we want to be able to perform interesting queries on our data. What's more, we want to do this efficiently and without reinventing the wheel. While non-relational databases (sometimes known as NoSQL databases) are very popular and have their place in the world of web, relational databases long ago solved the common problems of filtering, sorting, aggregating, and joining tabular data. Relational databases allow us to define sets of data in a structured way that maintains the consistency of our data. Using relational databases also gives us, the developers, the freedom to focus on the parts of our app that matter. In addition to efficiently performing ad hoc queries, a relational database server will also do the following: Ensure that our data conforms to the rules set forth in the schema Allow multiple people to access the database concurrently, while at the same time guaranteeing the consistency of the underlying data Ensure that data, once saved, is not lost even in the event of an application crash Relational databases and SQL, the programming language used with relational databases, are topics worthy of an entire book. Because this book is devoted to teaching you how to build apps with Flask, I will show you how to use a tool that has been widely adopted by the Python community for working with databases, namely, SQLAlchemy. SQLAlchemy abstracts away many of the complications of writing SQL queries, but there is no substitute for a deep understanding of SQL and the relational model. For that reason, if you are new to SQL, I would recommend that you check out the colorful book Learn SQL the Hard Way, Zed Shaw available online for free at http://sql.learncodethehardway.org/. Introducing SQLAlchemy SQLAlchemy is an extremely powerful library for working with relational databases in Python. Instead of writing SQL queries by hand, we can use normal Python objects to represent database tables and execute queries. There are a number of benefits to this approach which are listed as follows: Your application can be developed entirely in Python. Subtle differences between database engines are abstracted away. 
This allows you to use a lightweight database such as SQLite for local development and testing, and then switch to a database designed for high loads (such as PostgreSQL) in production. Database errors are less common because there are now two layers between your application and the database server: the Python interpreter itself (which will catch the obvious syntax errors), and SQLAlchemy, which has well-defined APIs and its own layer of error-checking. Your database code may become more efficient, thanks to SQLAlchemy's unit-of-work model which helps reduce unnecessary round-trips to the database. SQLAlchemy also has facilities for efficiently pre-fetching related objects, known as eager loading. Object Relational Mapping (ORM) makes your code more maintainable, following a principle known as don't repeat yourself (DRY). Suppose you add a column to a model. With SQLAlchemy it will be available whenever you use that model. If, on the other hand, you had hand-written SQL queries strewn throughout your app, you would need to update each query, one at a time, to ensure that you were including the new column. SQLAlchemy can help you avoid SQL injection vulnerabilities. Excellent library support: There are a multitude of useful libraries that can work directly with your SQLAlchemy models to provide things like maintenance interfaces and RESTful APIs. I hope you're excited after reading this list. If all the items in this list don't make sense to you right now, don't worry. Now that we have discussed some of the benefits of using SQLAlchemy, let's install it and start coding. If you'd like to learn more about SQLAlchemy, there is an article devoted entirely to its design in The Architecture of Open-Source Applications, available online for free at http://aosabook.org/en/sqlalchemy.html. Installing SQLAlchemy We will use pip to install SQLAlchemy into the blog app's virtualenv. To activate your virtualenv, change into the project directory and source the activate script as follows: $ cd ~/projects/blog $ source bin/activate (blog) $ pip install sqlalchemy Downloading/unpacking sqlalchemy … Successfully installed sqlalchemy Cleaning up... You can check if your installation succeeded by opening a Python interpreter and checking the SQLAlchemy version; note that your exact version number is likely to differ. $ python >>> import sqlalchemy >>> sqlalchemy.__version__ '0.9.0b2' Using SQLAlchemy in our Flask app SQLAlchemy works very well with Flask on its own, but the author of Flask has released a special Flask extension named Flask-SQLAlchemy that provides helpers for many common tasks, and can save us from having to re-invent the wheel later on. Let's use pip to install this extension: (blog) $ pip install flask-sqlalchemy … Successfully installed flask-sqlalchemy Flask provides a standard interface for developers who are interested in building extensions. As the framework has grown in popularity, the number of high quality extensions has increased. If you'd like to take a look at some of the more popular extensions, there is a curated list available on the Flask project website at http://flask.pocoo.org/extensions/. Choosing a database engine SQLAlchemy supports a multitude of popular database dialects, including SQLite, MySQL, and PostgreSQL. Depending on the database you would like to use, you may need to install an additional Python package containing a database driver. Listed next are several popular databases supported by SQLAlchemy and the corresponding pip-installable driver.
Some databases have multiple driver options, so I have listed the most popular one first.
SQLite: not needed, part of the Python standard library since version 2.5
MySQL: MySQL-python, PyMySQL (pure Python), OurSQL
PostgreSQL: psycopg2
Firebird: fdb
Microsoft SQL Server: pymssql, PyODBC
Oracle: cx-Oracle
SQLite comes as standard with Python and does not require a separate server process, so it is perfect for getting up and running quickly. For simplicity in the examples that follow, I will demonstrate how to configure the blog app for use with SQLite. If you have a different database in mind that you would like to use for the blog project, feel free to use pip to install the necessary driver package at this time. Connecting to the database Using your favorite text editor, open the config.py module for our blog project (~/projects/blog/app/config.py). We are going to add an SQLAlchemy-specific setting to instruct Flask-SQLAlchemy how to connect to our database. The new lines are highlighted in the following: class Configuration(object): APPLICATION_DIR = current_directory DEBUG = True SQLALCHEMY_DATABASE_URI = 'sqlite:///%s/blog.db' % APPLICATION_DIR The SQLALCHEMY_DATABASE_URI is composed of the following parts: dialect+driver://username:password@host:port/database Because SQLite databases are stored in local files, the only information we need to provide is the path to the database file. On the other hand, if you wanted to connect to PostgreSQL running locally, your URI might look something like this: postgresql://postgres:secretpassword@localhost:5432/blog_db If you're having trouble connecting to your database, try consulting the SQLAlchemy documentation on database URIs: http://docs.sqlalchemy.org/en/rel_0_9/core/engines.html Now that we've specified how to connect to the database, let's create the object responsible for actually managing our database connections. This object is provided by the Flask-SQLAlchemy extension and is conveniently named SQLAlchemy. Open app.py and make the following additions: from flask import Flask from flask.ext.sqlalchemy import SQLAlchemy from config import Configuration app = Flask(__name__) app.config.from_object(Configuration) db = SQLAlchemy(app) These changes instruct our Flask app, and in turn SQLAlchemy, how to communicate with our application's database. The next step will be to create a table for storing blog entries and, to do so, we will create our first model. Creating the Entry model A model is the data representation of a table that we want to store in the database. Models have attributes called columns that represent the items of data in the table. So, if we were creating a Person model, we might have columns for storing the first and last name, date of birth, home address, hair color, and so on. Since we are interested in creating a model to represent blog entries, we will have columns for things like the title and body content. Note that we don't say a People model or Entries model: models are singular even though they commonly represent many different objects. With SQLAlchemy, creating a model is as easy as defining a class and specifying a number of attributes assigned to that class. Let's start with a very basic model for our blog entries.
Create a new file named models.py in the blog project's app/ directory and enter the following code: import datetime, re from app import db def slugify(s): return re.sub('[^\w]+', '-', s).lower() class Entry(db.Model): id = db.Column(db.Integer, primary_key=True) title = db.Column(db.String(100)) slug = db.Column(db.String(100), unique=True) body = db.Column(db.Text) created_timestamp = db.Column(db.DateTime, default=datetime.datetime.now) modified_timestamp = db.Column( db.DateTime, default=datetime.datetime.now, onupdate=datetime.datetime.now) def __init__(self, *args, **kwargs): super(Entry, self).__init__(*args, **kwargs) # Call parent constructor. self.generate_slug() def generate_slug(self): self.slug = '' if self.title: self.slug = slugify(self.title) def __repr__(self): return '<Entry: %s>' % self.title There is a lot going on, so let's start with the imports and work our way down. We begin by importing the standard library datetime and re modules. We will be using datetime to get the current date and time, and re to do some string manipulation. The next import statement brings in the db object that we created in app.py. As you recall, the db object is an instance of the SQLAlchemy class, which is a part of the Flask-SQLAlchemy extension. The db object provides access to the classes that we need to construct our Entry model, which is just a few lines ahead. Before the Entry model, we define a helper function slugify, which we will use to give our blog entries some nice URLs. The slugify function takes a string like A post about Flask and uses a regular expression to turn a human-readable string into something usable in a URL, and so returns a-post-about-flask. Next is the Entry model. Our Entry model is a normal class that extends db.Model. By extending db.Model, our Entry class will inherit a variety of helpers which we'll use to query the database. The attributes of the Entry model are a simple mapping of the names and data that we wish to store in the database and are listed as follows: id: This is the primary key for our database table. This value is set for us automatically by the database when we create a new blog entry, usually an auto-incrementing number for each new entry. While we will not explicitly set this value, a primary key comes in handy when you want to refer one model to another. title: The title for a blog entry, stored as a String column with a maximum length of 100. slug: The URL-friendly representation of the title, stored as a String column with a maximum length of 100. This column also specifies unique=True, so that no two entries can share the same slug. body: The actual content of the post, stored in a Text column. This differs from the String type of the Title and Slug as you can store as much text as you like in this field. created_timestamp: The time a blog entry was created, stored in a DateTime column. We instruct SQLAlchemy to automatically populate this column with the current time by default when an entry is first saved. modified_timestamp: The time a blog entry was last updated. SQLAlchemy will automatically update this column with the current time whenever we save an entry. For short strings such as titles or names of things, the String column is appropriate, but when the text may be especially long it is better to use a Text column, as we did for the entry body. We've overridden the constructor for the class (__init__) so that when a new model is created, it automatically sets the slug for us based on the title.
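To see what the slugify helper produces, we can try it in a quick interactive session; the example title here is arbitrary:

>>> import re
>>> def slugify(s):
...     return re.sub('[^\w]+', '-', s).lower()
...
>>> slugify('A post about Flask')
'a-post-about-flask'

Any run of characters that are not letters, digits, or underscores is collapsed into a single hyphen, and the result is lowercased, which is exactly the URL-friendly form we want to store in the slug column.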
The last piece is the __repr__ method which is used to generate a helpful representation of instances of our Entry class. The specific meaning of __repr__ is not important but allows you to reference the object that the program is working with, when debugging. A final bit of code needs to be added to main.py, the entry-point to our application, to ensure that the models are imported. Add the highlighted changes to main.py as follows: from app import app, db import models import views if __name__ == '__main__': app.run() Creating the Entry table In order to start working with the Entry model, we first need to create a table for it in our database. Luckily, Flask-SQLAlchemy comes with a nice helper for doing just this. Create a new sub-folder named scripts in the blog project's app directory. Then create a file named create_db.py: (blog) $ cd app/ (blog) $ mkdir scripts (blog) $ touch scripts/create_db.py Add the following code to the create_db.py module. This function will automatically look at all the code that we have written and create a new table in our database for the Entry model based on our models: from main import db if __name__ == '__main__': db.create_all() Execute the script from inside the app/ directory. Make sure the virtualenv is active. If everything goes successfully, you should see no output. (blog) $ python create_db.py (blog) $ If you encounter errors while creating the database tables, make sure you are in the app directory, with the virtualenv activated, when you run the script. Next, ensure that there are no typos in your SQLALCHEMY_DATABASE_URI setting. Working with the Entry model Let's experiment with our new Entry model by saving a few blog entries. We will be doing this from the Python interactive shell. At this stage let's install IPython, a sophisticated shell with features like tab-completion (that the default Python shell lacks): (blog) $ pip install ipython Now check if we are in the app directory and let's start the shell and create a couple of entries as follows: (blog) $ ipython In []: from models import * # First things first, import our Entry model and db object. In []: db # What is db? Out[]: <SQLAlchemy engine='sqlite:////home/charles/projects/blog/app/blog.db'> If you are familiar with the normal Python shell but not IPython, things may look a little different at first. The main thing to be aware of is that In[] refers to the code you type in, and Out[] is the output of the commands you put in to the shell. IPython has a neat feature that allows you to print detailed information about an object. This is done by typing in the object's name followed by a question-mark (?). Introspecting the Entry model provides a bit of information, including the argument signature and the string representing that object (known as the docstring) of the constructor: In []: Entry? # What is Entry and how do we create it? Type: _BoundDeclarativeMeta String Form:<class 'models.Entry'> File: /home/charles/projects/blog/app/models.py Docstring: <no docstring> Constructor information: Definition:Entry(self, *args, **kwargs) We can create Entry objects by passing column values in as the keyword-arguments. In the preceding example, it uses **kwargs; this is a shortcut for taking a dict object and using it as the values for defining the object, as shown next: In []: first_entry = Entry(title='First entry', body='This is the body of my first entry.') In order to save our first entry, we will to add it to the database session. 
The session is simply an object that represents our actions on the database. Even after adding it to the session, it will not be saved to the database yet. In order to save the entry to the database, we need to commit our session: In []: db.session.add(first_entry) In []: first_entry.id is None # No primary key, the entry has not been saved. Out[]: True In []: db.session.commit() In []: first_entry.id Out[]: 1 In []: first_entry.created_timestamp Out[]: datetime.datetime(2014, 1, 25, 9, 49, 53, 1337) As you can see from the preceding code examples, once we commit the session, a unique id will be assigned to our first entry and the created_timestamp will be set to the current time. Congratulations, you've created your first blog entry! Try adding a few more on your own. You can add multiple entry objects to the same session before committing, so give that a try as well. At any point while you are experimenting, feel free to delete the blog.db file and re-run the create_db.py script to start over with a fresh database. Making changes to an existing entry In order to make changes to an existing Entry, simply make your edits and then commit. Let's retrieve our Entry using the id that was returned to use earlier, make some changes and commit it. SQLAlchemy will know that it needs to be updated. Here is how you might make edits to the first entry: In []: first_entry = Entry.query.get(1) In []: first_entry.body = 'This is the first entry, and I have made some edits.' In []: db.session.commit() And just like that your changes are saved. Deleting an entry Deleting an entry is just as easy as creating one. Instead of calling db.session.add, we will call db.session.delete and pass in the Entry instance that we wish to remove: In []: bad_entry = Entry(title='bad entry', body='This is a lousy entry.') In []: db.session.add(bad_entry) In []: db.session.commit() # Save the bad entry to the database. In []: db.session.delete(bad_entry) In []: db.session.commit() # The bad entry is now deleted from the database. Retrieving blog entries While creating, updating, and deleting are fairly straightforward operations, the real fun starts when we look at ways to retrieve our entries. We'll start with the basics, and then work our way up to more interesting queries. We will use a special attribute on our model class to make queries: Entry.query. This attribute exposes a variety of APIs for working with the collection of entries in the database. Let's simply retrieve a list of all the entries in the Entry table: In []: entries = Entry.query.all() In []: entries # What are our entries? Out[]: [<Entry u'First entry'>, <Entry u'Second entry'>, <Entry u'Third entry'>, <Entry u'Fourth entry'>] As you can see, in this example, the query returns a list of Entry instances that we created. When no explicit ordering is specified, the entries are returned to us in an arbitrary order chosen by the database. 
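If we only want to know how many entries are stored, rather than loading them all into memory, the query can also ask the database for a count. This is a small sketch; the number shown assumes the four entries created above are still present:

In []: Entry.query.count()
Out[]: 4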
Let's specify that we want the entries returned to us in an alphabetical order by title: In []: Entry.query.order_by(Entry.title.asc()).all() Out []: [<Entry u'First entry'>, <Entry u'Fourth entry'>, <Entry u'Second entry'>, <Entry u'Third entry'>] Shown next is how you would list your entries in reverse-chronological order, based on when they were last updated: In []: oldest_to_newest = Entry.query.order_by(Entry.modified_timestamp.desc()).all() Out []: [<Entry: Fourth entry>, <Entry: Third entry>, <Entry: Second entry>, <Entry: First entry>] Filtering the list of entries It is very useful to be able to retrieve the entire collection of blog entries, but what if we want to filter the list? We could always retrieve the entire collection and then filter it in Python using a loop, but that would be very inefficient. Instead we will rely on the database to do the filtering for us, and simply specify the conditions for which entries should be returned. In the following example, we will specify that we want to filter by entries where the title equals 'First entry'. In []: Entry.query.filter(Entry.title == 'First entry').all() Out[]: [<Entry u'First entry'>] If this seems somewhat magical to you, it's because it really is! SQLAlchemy uses operator overloading to convert expressions like <Model>.<column> == <some value> into an abstracted object called BinaryExpression. When you are ready to execute your query, these data-structures are then translated into SQL. A BinaryExpression is simply an object that represents the logical comparison and is produced by over-riding the standards methods that are typically called on an object when comparing values in Python. In order to retrieve a single entry, you have two options, .first() and .one(). Their differences and similarities are summarized in the following table: Number of matching rows first() behavior one() behavior 1 Return the object. Return the object. 0 Return None. Raise sqlalchemy.orm.exc.NoResultFound 2+ Return the first object (based on either explicit ordering or the ordering chosen by the database). Raise sqlalchemy.orm.exc.MultipleResultsFound Let's try the same query as before, but instead of calling .all(), we will call .first() to retrieve a single Entry instance: In []: Entry.query.filter(Entry.title == 'First entry').first() Out[]: <Entry u'First entry'> Notice how previously .all() returned a list containing the object, whereas .first() returned just the object itself. Special lookups In the previous example we tested for equality, but there are many other types of lookups possible. In the following table, have listed some that you may find useful. A complete list can be found in the SQLAlchemy documentation. Example Meaning Entry.title == 'The title' Entries where the title is "The title", case-sensitive. Entry.title != 'The title' Entries where the title is not "The title". Entry.created_timestamp < datetime.date(2014, 1, 25) Entries created before January 25, 2014. For less than or equal, use <=. Entry.created_timestamp > datetime.date(2014, 1, 25) Entries created after January 25, 2014. For greater than or equal, use >=. Entry.body.contains('Python') Entries where the body contains the word "Python", case-sensitive. Entry.title.endswith('Python') Entries where the title ends with the string "Python", case-sensitive. Note that this will also match titles that end with the word "CPython", for example. Entry.title.startswith('Python') Entries where the title starts with the string "Python", case-sensitive. 
Note that this will also match titles like "Pythonistas". Entry.body.ilike('%python%') Entries where the body contains the word "python" anywhere in the text, case-insensitive. The "%" character is a wild-card. Entry.title.in_(['Title one', 'Title two']) Entries where the title is in the given list, either 'Title one' or 'Title two'. Combining expressions The expressions listed in the preceding table can be combined using bitwise operators to produce arbitrarily complex expressions. Let's say we want to retrieve all blog entries that have the word Python or Flask in the title. To accomplish this, we will create two contains expressions, then combine them using Python's bitwise OR operator which is a pipe| character unlike a lot of other languages that use a double pipe || character: Entry.query.filter(Entry.title.contains('Python') | Entry.title.contains('Flask')) Using bitwise operators, we can come up with some pretty complex expressions. Try to figure out what the following example is asking for: Entry.query.filter( (Entry.title.contains('Python') | Entry.title.contains('Flask')) & (Entry.created_timestamp > (datetime.date.today() - datetime.timedelta(days=30))) ) As you probably guessed, this query returns all entries where the title contains either Python or Flask, and which were created within the last 30 days. We are using Python's bitwise OR and AND operators to combine the sub-expressions. For any query you produce, you can view the generated SQL by printing the query as follows: In []: query = Entry.query.filter( (Entry.title.contains('Python') | Entry.title.contains('Flask')) & (Entry.created_timestamp > (datetime.date.today() - datetime.timedelta(days=30))) ) In []: print str(query) SELECT entry.id AS entry_id, ... FROM entry WHERE ( (entry.title LIKE '%%' || :title_1 || '%%') OR (entry.title LIKE '%%' || :title_2 || '%%') ) AND entry.created_timestamp > :created_timestamp_1 Negation There is one more piece to discuss, which is negation. If we wanted to get a list of all blog entries which did not contain Python or Flask in the title, how would we do that? SQLAlchemy provides two ways to create these types of expressions, using either Python's unary negation operator (~) or by calling db.not_(). This is how you would construct this query with SQLAlchemy: Using unary negation: In []: Entry.query.filter(~(Entry.title.contains('Python') | Entry.title.contains('Flask')))   Using db.not_(): In []: Entry.query.filter(db.not_(Entry.title.contains('Python') | Entry.title.contains('Flask'))) Operator precedence Not all operations are considered equal to the Python interpreter. This is like in math class, where we learned that expressions like 2 + 3 * 4 are equal to 14 and not 20, because the multiplication operation occurs first. In Python, bitwise operators all have a higher precedence than things like equality tests, so this means that when you are building your query expression, you have to pay attention to the parentheses. Let's look at some example Python expressions and see the corresponding query: Expression Result (Entry.title == 'Python' | Entry.title == 'Flask') Wrong! SQLAlchemy throws an error because the first thing to be evaluated is actually the 'Python' | Entry.title! (Entry.title == 'Python') | (Entry.title == 'Flask') Right. Returns entries where the title is either "Python" or "Flask". ~Entry.title == 'Python' Wrong! SQLAlchemy will turn this into a valid SQL query, but the results will not be meaningful. ~(Entry.title == 'Python') Right. 
Returns entries where the title is not equal to "Python". If you find yourself struggling with the operator precedence, it's a safe bet to put parentheses around any comparison that uses ==, !=, <, <=, >, and >=. Making changes to the schema The final topic we will discuss in this article is how to make modifications to an existing Model definition. From the project specification, we know we would like to be able to save drafts of our blog entries. Right now we don't have any way to tell whether an entry is a draft or not, so we will need to add a column that let's us store the status of our entry. Unfortunately, while db.create_all() works perfectly for creating tables, it will not automatically modify an existing table; to do this we need to use migrations. Adding Flask-Migrate to our project We will use Flask-Migrate to help us automatically update our database whenever we change the schema. In the blog virtualenv, install Flask-migrate using pip: (blog) $ pip install flask-migrate The author of SQLAlchemy has a project called alembic; Flask-Migrate makes use of this and integrates it with Flask directly, making things easier. Next we will add a Migrate helper to our app. We will also create a script manager for our app. The script manager allows us to execute special commands within the context of our app, directly from the command-line. We will be using the script manager to execute the migrate command. Open app.py and make the following additions: from flask import Flask from flask.ext.migrate import Migrate, MigrateCommand from flask.ext.script import Manager from flask.ext.sqlalchemy import SQLAlchemy from config import Configuration app = Flask(__name__) app.config.from_object(Configuration) db = SQLAlchemy(app) migrate = Migrate(app, db) manager = Manager(app) manager.add_command('db', MigrateCommand) In order to use the manager, we will add a new file named manage.py along with app.py. Add the following code to manage.py: from app import manager from main import * if __name__ == '__main__': manager.run() This looks very similar to main.py, the key difference being that instead of calling app.run(), we are calling manager.run(). Django has a similar, although auto-generated, manage.py file that serves a similar function. Creating the initial migration Before we can start changing our schema, we need to create a record of its current state. To do this, run the following commands from inside your blog's app directory. The first command will create a migrations directory inside the app folder which will track the changes we make to our schema. The second command db migrate will create a snapshot of our current schema so that future changes can be compared to it. (blog) $ python manage.py db init Creating directory /home/charles/projects/blog/app/migrations ... done ... (blog) $ python manage.py db migrate INFO [alembic.migration] Context impl SQLiteImpl. INFO [alembic.migration] Will assume non-transactional DDL. Generating /home/charles/projects/blog/app/migrations/versions/535133f91f00_.py ... done Finally, we will run db upgrade to run the migration which will indicate to the migration system that everything is up-to-date: (blog) $ python manage.py db upgrade INFO [alembic.migration] Context impl SQLiteImpl. INFO [alembic.migration] Will assume non-transactional DDL. INFO [alembic.migration] Running upgrade None -> 535133f91f00, empty message Adding a status column Now that we have a snapshot of our current schema, we can start making changes. 
We will be adding a new column named status, which will store an integer value corresponding to a particular status. Although there are only two statuses at the moment (PUBLIC and DRAFT), using an integer instead of a Boolean gives us the option to easily add more statuses in the future. Open models.py and make the following additions to the Entry model: class Entry(db.Model): STATUS_PUBLIC = 0 STATUS_DRAFT = 1 id = db.Column(db.Integer, primary_key=True) title = db.Column(db.String(100)) slug = db.Column(db.String(100), unique=True) body = db.Column(db.Text) status = db.Column(db.SmallInteger, default=STATUS_PUBLIC) created_timestamp = db.Column(db.DateTime, default=datetime.datetime.now) ... From the command-line, we will once again be running db migrate to generate the migration script. You can see from the command's output that it found our new column: (blog) $ python manage.py db migrate INFO [alembic.migration] Context impl SQLiteImpl. INFO [alembic.migration] Will assume non-transactional DDL. INFO [alembic.autogenerate.compare] Detected added column 'entry.status' Generating /home/charles/projects/blog/app/migrations/versions/2c8e81936cad_.py ... done Because we have blog entries in the database, we need to make a small modification to the auto-generated migration to ensure the statuses for the existing entries are initialized to the proper value. To do this, open up the migration file (mine is migrations/versions/2c8e81936cad_.py) and change the following line: op.add_column('entry', sa.Column('status', sa.SmallInteger(), nullable=True)) The replacement of nullable=True with server_default='0' tells the migration script to not set the column to null by default, but instead to use 0: op.add_column('entry', sa.Column('status', sa.SmallInteger(), server_default='0')) Finally, run db upgrade to run the migration and create the status column: (blog) $ python manage.py db upgrade INFO [alembic.migration] Context impl SQLiteImpl. INFO [alembic.migration] Will assume non-transactional DDL. INFO [alembic.migration] Running upgrade 535133f91f00 -> 2c8e81936cad, empty message Congratulations, your Entry model now has a status field! Summary By now you should be familiar with using SQLAlchemy to work with a relational database. We covered the benefits of using a relational database and an ORM, configured a Flask application to connect to a relational database, and created SQLAlchemy models. All this allowed us to create relationships between our data and perform queries. To top it off, we also used a migration tool to handle future database schema changes. We will set aside the interactive interpreter and start creating views to display blog entries in the web browser. We will put all our SQLAlchemy knowledge to work by creating interesting lists of blog entries, as well as a simple search feature. We will build a set of templates to make the blogging site visually appealing, and learn how to use the Jinja2 templating language to eliminate repetitive HTML coding. Resources for Article:   Further resources on this subject: Man, Do I Like Templates! [article] Snap – The Code Snippet Sharing Application [article] Deploying on your own server [article]


How to publish Docker and integrate with Maven

Pravin Dhandre
11 Apr 2018
6 min read
We have learned how to create Dockers, and how to run them, but these Dockers are stored in our system. Now we need to publish them so that they are accessible anywhere. In this post, we will learn how to publish our Docker images, and how to finally integrate Maven with Docker to easily do the same steps for our microservices. Understanding repositories In our previous example, when we built a Docker image, we published it into our local system repository so we can execute Docker run. Docker will be able to find them; this local repository exists only on our system, and most likely we need to have this access to wherever we like to run our Docker. For example, we may create our Docker in a pipeline that runs on a machine that creates our builds, but the application itself may run in our pre production or production environments, so the Docker image should be available on any system that we need. One of the great advantages of Docker is that any developer building an image can run it from their own system exactly as they would on any server. This will minimize the risk of having something different in each environment, or not being able to reproduce production when you try to find the source of a problem. Docker provides a public repository, Docker Hub, that we can use to publish and pull images, but of course, you can use private Docker repositories such as Sonatype Nexus, VMware Harbor, or JFrog Artifactory. To learn how to configure additional repositories refer to the repositories documentation. Docker Hub registration After registering, we need to log into our account, so we can publish our Dockers using the Docker tool from the command line using Docker login: docker login Login with your Docker ID to push and pull images from Docker Hub. If you don't have a Docker ID, head over to https://hub.Docker.com to create one. Username: mydockerhubuser Password: Login Succeeded When we need to publish a Docker, we must always be logged into the registry that we are working with; remember to log into Docker. Publishing a Docker Now we'd like to publish our Docker image to Docker Hub; but before we can, we need to build our images for our repository. When we create an account in Docker Hub, a repository with our username will be created; in this example, it will be mydockerhubuser. In order to build the Docker for our repository, we can use this command from our microservice directory: docker build . -t mydockerhubuser/chapter07 This should be quite a fast process since all the different layers are cached: Sending build context to Docker daemon 21.58MB Step 1/3 : FROM openjdk:8-jdk-alpine ---> a2a00e606b82 Step 2/3 : ADD target/*.jar microservice.jar ---> Using cache ---> 4ae1b12e61aa Step 3/3 : ENTRYPOINT java -jar microservice.jar ---> Using cache ---> 70d76cbf7fb2 Successfully built 70d76cbf7fb2 Successfully tagged mydockerhubuser/chapter07:latest Now that our Docker is built, we can push it to Docker Hub with the following command: docker push mydockerhubuser/chapter07 This command will take several minutes since the whole image needs to be uploaded. With our Docker published, we can now run it from any Docker system with the following command: docker run mydockerhubuser/chapter07 Or else, we can run it as a daemon, with: docker run -d mydockerhubuser/chapter07 Integrating Docker with Maven Now that we know most of the Docker concepts, we can integrate Docker with Maven using the Docker-Maven-plugin created by fabric8, so we can create Docker as part of our Maven builds. 
First, we will move our Dockerfile to a different folder. In the IntelliJ Project window, right-click on the src folder and choose New | Directory. We will name it Docker. Now, drag and drop the existing Dockerfile into this new directory, and we will change it to the following: FROM openjdk:8-jdk-alpine ADD maven/*.jar microservice.jar ENTRYPOINT ["java","-jar", "microservice.jar"] To manage the Dockerfile better, we just move into our project folders. When our Docker is built using the plugin, the contents of our application will be created in a folder named Maven, so we change the Dockerfile to reference that folder. Now, we will modify our Maven pom.xml, and add the Dockerfile-Maven-plugin in the build | plugins section: <build> .... <plugins> .... <plugin> <groupId>io.fabric8</groupId> <artifactId>Docker-maven-plugin</artifactId> <version>0.23.0</version> <configuration> <verbose>true</verbose> <images> <image> <name>mydockerhubuser/chapter07</name> <build> <dockerFileDir>${project.basedir}/src/Docker</dockerFileDir> <assembly> <descriptorRef>artifact</descriptorRef> </assembly> <tags> <tag>latest</tag> <tag>${project.version}</tag> </tags> </build> <run> <ports> <port>8080:8080</port> </ports> </run> </image> </images> </configuration> </plugin> </plugins> </build> Here, we are specifying how to create our Docker, where the Dockerfile is, and even which version of the Docker we are building. Additionally, we specify some parameters when our Docker runs, such as the port that it exposes. If we need IntelliJ to reload the Maven changes, we may need to click on the Reimport all maven projects button in the Maven Project window. For building our Docker using Maven, we can use the Maven Project window by running the task Docker: build, or by running the following command: mvnw docker:build This will build the Docker image, but we require to have it before it's packaged, so we can perform the following command: mvnw package docker:build We can also publish our Docker using Maven, either with the Maven Project window to run the Docker: push task, or by running the following command: mvnw docker:push This will push our Docker into the Docker Hub, but if we'd like to do everything in just one command, we can just use the following code: mvnw package docker:build docker:push Finally, the plugin provides other tasks such as Docker: run, Docker: start, and Docker: stop, which we can use in the commands that we've already learned on the command line. With this, we learned how to publish docker manually and integrate them into the Maven lifecycle. Do check out the book Hands-On Microservices with Kotlin to start simplifying development of microservices and building high quality service environment. Check out other posts: The key differences between Kubernetes and Docker Swarm How to publish Microservice as a service onto a Docker Building Docker images using Dockerfiles  


Using PhpStorm in a Team

Packt
26 Dec 2014
11 min read
In this article by Mukund Chaudhary and Ankur Kumar, authors of the book PhpStorm Cookbook, we will cover the following recipes: Getting a VCS server Creating a VCS repository Connecting PhpStorm to a VCS repository Storing a PhpStorm project in a VCS repository (For more resources related to this topic, see here.) Getting a VCS server The first action that you have to undertake is to decide which version of VCS you are going to use. There are a number of systems available, such as Git and Subversion (commonly known as SVN). It is free and open source software that you can download and install on your development server. There is another system named concurrent versions system (CVS). Both are meant to provide a code versioning service to you. SVN is newer and supposedly faster than CVS. Since SVN is the newer system and in order to provide information to you on the latest matters, this text will concentrate on the features of Subversion only. Getting ready So, finally that moment has arrived when you will start off working in a team by getting a VCS system for you and your team. The installation of SVN on the development system can be done in two ways: easy and difficult. The difficult step can be skipped without consideration because that is for the developers who want to contribute to the Subversion system. Since you are dealing with PhpStorm, you need to remember the easier way because you have a lot more to do. How to do it... The installation step is very easy. There is this aptitude utility available with Debian-based systems, and there is the Yum utility available with Red Hat-based systems. Perform the following steps: You just need to issue the command apt-get install subversion. The operating system's package manager will do the remaining work for you. In a very short time, after flooding the command-line console with messages, you will have the Subversion system installed. To check whether the installation was successful, you need to issue the command whereis svn. If there is a message, it means that you installed Subversion successfully. If you do not want to bear the load of installing Subversion on your development system, you can use commercial third-party servers. But that is more of a layman's approach to solving problems, and no PhpStorm cookbook author will recommend that you do that. You are a software engineer; you should not let go easily. How it works... When you install the version control system, you actually install a server that provides the version control service to a version control client. The subversion control service listens for incoming connections from remote clients on port number 3690 by default. There's more... If you want to install the older companion, CVS, you can do that in a similar way, as shown in the following steps: You need to download the archive for the CVS server software. You need to unpack it from the archive using your favorite unpacking software. You can move it to another convenient location since you will not need to disturb this folder in the future. You then need to move into the directory, and there will start your compilation process. You need to do #. /configure to create the make targets. Having made the target, you need to enter #make install to complete the installation procedure. Due to it being older software, you might have to compile from the source code as the only alternative. 
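To put the easier route together, a typical session on a Debian-based development server might look like the following; the exact paths and output will vary from system to system:

$ sudo apt-get install subversion
$ whereis svn
svn: /usr/bin/svn /usr/share/man/man1/svn.1.gz

If whereis prints a path to the svn binary, the client, along with the svnadmin and svnserve tools that usually ship in the same package, is ready to use.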
Creating a VCS repository More often than not, a PHP programmer is expected to know some system concepts because it is often required to change settings for the PHP interpreter. The changes could be in the form of, say, changing the execution time or adding/removing modules, and so on. In order to start working in a team, you are going to get your hands dirty with system actions. Getting ready You will have to create a new repository on the development server so that PhpStorm can act as a client and get connected. Here, it is important to note the difference between an SVN client and an SVN server—an SVN client can be any of these: a standalone client or an embedded client such as an IDE. The SVN server, on the other hand, is a single item. It is a continuously running process on a server of your choice. How to do it... You need to be careful while performing this activity as a single mistake can ruin your efforts. Perform the following steps: There is a command svnadmin that you need to know. Using this command, you can create a new directory on the server that will contain the code base in it. Again, you should be careful when selecting a directory on the server as it will appear in your SVN URL for the rest part of your life. The command should be executed as: svnadmin create /path/to/your/repo/ Having created a new repository on the server, you need to make certain settings for the server. This is just a normal phenomenon because every server requires a configuration. The SVN server configuration is located under /path/to/your/repo/conf/ with the name svnserve.conf. Inside the file, you need to make three changes. You need to add these lines at the bottom of the file: anon-access = none auth-access = write password-db = passwd There has to be a password file to authorize a list of users who will be allowed to use the repository. The password file in this case will be named passwd (the default filename). The contents in the file will be a number of lines, each containing a username and the corresponding password in the form of username = password. Since these files are scanned by the server according to a particular algorithm, you don't have the freedom to leave deliberate spaces in the file—there will be error messages displayed in those cases. Having made the appropriate settings, you can now make the SVN service run so that an SVN client can access it. You need to issue the command svnserve -d to do that. It is always good practice to keep checking whether what you do is correct. To validate proper installation, you need to issue the command svn ls svn://user@host/path/to/subversion/repo/. The output will be as shown in the following screenshot:   How it works... The svnadmin command is used to perform admin tasks on the Subversion server. The create option creates a new folder on the server that acts as the repository for access from Subversion clients. The configuration file is created by default at the time of server installation. The contents that are added to the file are actually the configuration directives that control the behavior of the Subversion server. Thus, the settings mentioned prevent anonymous access and restrict the write operations to certain users whose access details are mentioned in a file. The command svnserve is again a command that needs to be run on the server side and which starts the instance of the server. The -d switch mentions that the server should be run as a daemon (system process). 
This also means that your server will continue running until you manually stop it or the entire system goes down. Again, you can skip this section if you have opted for a third-party version control service provider. Connecting PhpStorm to a VCS repository The real utility of software is when you use it. So, having installed the version control system, you need to be prepared to use it. Getting ready With SVN being client-server software, having installed the server, you now need a client. Again, you will have difficulty searching for a good SVN client. Don't worry; the client has been factory-provided to you inside PhpStorm. The PhpStorm SVN client provides you with features that accelerate your development task by providing you detailed information about the changes made to the code. So, go ahead and connect PhpStorm to the Subversion repository you created. How to do it... In order to connect PhpStorm to the Subversion repository, you need to activate the Subversion view. It is available at View | Tool Windows | Svn Repositories. Perform the following steps to activate the Subversion view: 1. Having activated the Subversion view, you now need to add the repository location to PhpStorm. To do that, you need to use the + symbol in the top-left corner in the view you have opened, as shown in the following screenshot: Upon selecting the Add option, there is a question asked by PhpStorm about the location of the repository. You need to provide the full location of the repository. Once you provide the location, you will be able to see the repository in the same Subversion view in which you have pressed the Add button. Here, you should always keep in mind the correct protocol to use. This depends on the way you installed the Subversion system on the development machine. If you used the default installation by installing from the installer utility (apt-get or aptitude), you need to specify svn://. If you have configured SVN to be accessible via SSH, you need to specify svn+ssh://. If you have explicitly configured SVN to be used with the Apache web server, you need to specify http://. If you configured SVN with Apache over the secure protocol, you need to specify https://. Storing a PhpStorm project in a VCS repository Here comes the actual start of the teamwork. Even if you and your other team members have connected to the repository, what advantage does it serve? What is the purpose solved by merely connecting to the version control repository? Correct. The actual thing is the code that you work on. It is the code that earns you your bread. Getting ready You should now store a project in the Subversion repository so that the other team members can work and add more features to your code. It is time to add a project to version control. It is not that you need to start a new project from scratch to add to the repository. Any project, any work that you have done and you wish to have the team work on now can be added to the repository. Since the most relevant project in the current context is the cooking project, you can try adding that. There you go. How to do it... In order to add a project to the repository, perform the following steps: You need to use the menu item provided at VCS | Import into version control | Share project (subversion). PhpStorm will ask you a question, as shown in the following screenshot: Select the correct hierarchy to define the share target—the correct location where your project will be saved. 
If you wish to create the tags and branches in the code base, you need to select the checkbox for the same. It is good practice to provide comments to the commits that you make. The reason behind this is apparent when you sit down to create a release document. It also makes the change more understandable for the other team members. PhpStorm then asks you the format you want the working copy to be in. This is related to the version of the version control software. You just need to smile and select the latest version number and proceed, as shown in the following screenshot:   Having done that, PhpStorm will now ask you to enter your credentials. You need to enter the same credentials that you saved in the configuration file (see the Creating a VCS repository recipe) or the credentials that your service provider gave you. You can ask PhpStorm to save the credentials for you, as shown in the following screenshot:   How it works... Here it is worth understanding what is going on behind the curtains. When you do any Subversion related task in PhpStorm, there is an inbuilt SVN client that executes the commands for you. Thus, when you add a project to version control, the code is given a version number. This makes the version system remember the state of the code base. In other words, when you add the code base to version control, you add a checkpoint that you can revisit at any point in future for the time the code base is under the same version control system. Interesting phenomenon, isn't it? There's more... If you have installed the version control software yourself and if you did not make the setting to store the password in encrypted text, PhpStorm will provide you a warning about it, as shown in the following screenshot: Summary We got to know about version control systems, step-by-step process to create a VCS repository, and connecting PhpStorm to a VCS repository. Resources for Article:  Further resources on this subject: FuelPHP [article] A look into the high-level programming operations for the PHP language [article] PHP Web 2.0 Mashup Projects: Your Own Video Jukebox: Part 1 [article]


Debugging REST Web Services

Packt
20 Oct 2009
13 min read
(For more resources on this subject, see here.)

Message tracing

The first symptom that you will notice when you run into problems is that the client does not behave the way you want it to: for example, there is no output, or the wrong output. Since the outcome of running a REST client depends on the request that you send over the wire and the response that you receive over the wire, one of the first things to do is to capture the messages and verify that they are in the correct, expected format.

REST services and clients interact using messages, usually in pairs of request and response. So if there are problems, they are caused by errors in the messages being exchanged. Sometimes the user only has control over a REST client and does not have access to the implementation details of the service. Sometimes the user implements the REST service for others to consume. Sometimes the web browser acts as a client; sometimes a PHP application on a server acts as a REST client. Irrespective of where the client is and where the service is, you can use message capturing tools to capture the messages and try to figure out the problem.

Because the service and client use messages to interact with each other, we can always place a message capturing tool in the middle to capture those messages. The tool does not have to run on the same machine as the client or the service; it can run on either machine, or on a third machine. The following figure illustrates how the message interaction would look with a message capturing tool in place.

If the REST client is a web browser and we want to capture the request and response involved in a message interaction, we would have to point the web browser to the message capturing tool and let the tool send the request to the service on behalf of the browser. Then, since it is the tool that sent the request to the service, the service responds to the tool. The message capturing tool in turn sends the response it received from the service back to the web browser. In this scenario, the tool in the middle gains access to both the request and the response, and hence can reveal those messages for us to inspect.

When the client does not seem to work, here is the list of things that you might need to look for:

- Whether the client sends a message
- Whether you are able to receive a response from the service
- Whether the request message sent by the client is in the correct format, including the HTTP headers
- Whether the response sent by the server is in the correct format, including the HTTP headers

In order to check for the above, you require a message capturing tool to trace the messages. There are multiple tools that you can use to capture the messages that are sent from the client to the service and vice versa. Wireshark (http://www.wireshark.org/) is one such tool that can be used to capture any network traffic. It is an open-source tool and is available under the GNU General Public License version 2. However, this tool can be a bit complicated if you are looking for something simple. Apache TCPMon (http://ws.apache.org/commons/tcpmon/) is another tool that is designed to trace web services messages. It is a Java-based tool that can be used with web services to capture the messages.
Because TCPMon is a message capturing tool, it can be used to intercept messages sent between client and service and, as explained earlier, it can be run on the client machine, the server machine, or a third independent machine. The only catch is that you need Java installed on your system to run this tool. You can also find a C-based implementation of a similar tool with Apache Axis2/C (http://ws.apache.org/axis2/c); however, that tool does not have a graphical user interface.

There is a set of steps that you need to follow, which are more or less the same across all of these tools, in order to prepare the tool for capturing messages:

- Define the target host name
- Define the target port number
- Define the listen port number

Target host name is the name of the host machine on which the service is running. As an example, if we want to debug the request sent to the Yahoo spelling suggestion service, hosted at http://search.yahooapis.com/WebSearchService/V1/spellingSuggestion, the host name would be search.yahooapis.com. We can use either the name of the host or the IP address of the host, because the tools can deal with both formats in place of the host name. As an example, if the service is hosted on the local machine, we could use either localhost or 127.0.0.1 in place of the host name.

Target port number is the port number on which the web server hosting the service is listening; usually this is 80. As an example, for the Yahoo spelling suggestion service hosted at http://search.yahooapis.com/WebSearchService/V1/spellingSuggestion, the target port number is 80. Note that when the service URL does not mention any port number, we can assume the default. If the service runs on a port other than 80, the port number appears after the host name, separated from it by the ':' character. As an example, if we have our web server running on port 8080 on the local machine, we would have a service URL similar to http://localhost:8080/rest/04/library/book.php. Here, the host name is localhost and the target port is 8080.

Listen port is the port on which the tool will be listening, to capture the messages from the client before sending them on to the service. As an example, say that we want to use port 9090 as our listen port to capture the messages while using the Yahoo spelling suggestion service. Under normal circumstances, we would use a URL similar to the following with the web browser to send the request to the service:

http://search.yahooapis.com/WebSearchService/V1/spellingSuggestion?appid=YahooDemo&query=apocalipto

When we want to send this request through the message capturing tool, with the tool's listen port set to 9090 and assuming that the tool is running on the local machine, we would instead use the following URL with the web browser:

http://localhost:9090/WebSearchService/V1/spellingSuggestion?appid=YahooDemo&query=apocalipto

Note that we are not sending this request directly to search.yahooapis.com, but rather to the tool listening on port 9090 on localhost. Once the tool receives the request, it will capture the request, forward it to the target host, receive the response, and forward that response to the web browser. The following figure shows the Apache TCPMon tool. You can see localhost being used as the target host, 80 as the target port number, and 9090 as the listening port number.
Once you fill in these fields and click on the Add button, a new tab is added in the tool, with a pane as shown in the next figure, where the captured messages passing between the client and the service are displayed.

Before you can capture the messages, there is one more step: change the client code to point to port number 9090, since our monitoring tool is now listening on that port. Originally, we were using port 80:

$url = 'http://localhost:80/rest/04/library/book.php';

or just:

$url = 'http://localhost/rest/04/library/book.php';

because the default port number used by a web server is port 80, and the client was talking directly to the service. However, with the tool in place, we are going to make the client talk to the tool listening on port 9090; the tool in turn will talk to the service. Note that in this sample all three parties—the client, the service, and the tool—run on the same machine, so we keep using localhost as our host name. Now we change the service endpoint address used by the client to contain port 9090, which makes sure that the client talks to the tool:

$url = 'http://localhost:9090/rest/04/library/book.php';

As you can see, the tool has captured the request and the response. The request appears at the top and the response at the bottom. The request is a GET request to the resource located at /rest/04/library/book.php. The response is a success response, with the HTTP 200 OK code; after the HTTP headers, the response body, which is in XML, follows.

As mentioned earlier, the first step in debugging is to verify that the client has sent a request and that the service responded. In the above example, we have both the request and the response in place. If either is missing, we need to check what is wrong on the respective side.

If the client request is missing, check for the following in the code:

- Are you using the correct URL in the client?
- Have you written the request to the wire in the client? When using Curl, this is usually done by the curl_exec function.

If the response is missing, check for the following:

- Are you connected to the network? Your service could be hosted on a remote machine.
- Have you written a response from the service? That is, have you returned the correct string value from the service? In PHP, using the echo function to write the required response to the wire usually does this. If you are using a PHP framework, you may have to use the framework-specific mechanism instead. As an example, if you are using the Zend_Rest_Server class, you have to call the handle() method to make sure that the response is sent to the client.

Here is a sample error scenario. As you can see, the response is 404 Not Found. If you look at the request, you can see that there is a typo in it: we have missed the 'k' from our resource URL and hence sent the request to /rest/04/library/boo.php, which does not exist, whereas the correct resource URL is /rest/04/library/book.php.

Next, let us look at the Yahoo search example that was discussed earlier to identify some advanced concepts. We want to capture the request sent by the web browser and the response sent by the server for the following request:

http://search.yahooapis.com/WebSearchService/V1/spellingSuggestion?appid=YahooDemo&query=apocalipto

As discussed earlier, the target host name is search.yahooapis.com and the target port number is 80. Let's use 9091 as the listen port.
Let us use the web browser to send the request through the tool so that we can capture the request and response. Since the tool is listening on port 9091, we would use the following URL with the web browser:

http://localhost:9091/WebSearchService/V1/spellingSuggestion?appid=YahooDemo&query=apocalipto

When you use the above URL, the web browser sends the request to the tool, and the tool gets the response from the service and forwards it to the web browser. We can see that the web browser gets the response. However, if we have a look at the TCPMon tool's captured messages, we see that the service has sent some binary data instead of XML data, even though the web browser is displaying the response in XML format. So what went wrong? In fact, nothing is wrong. The service sent the data in binary format because the web browser requested that format. If you look closely at the request sent, you will see the following:

GET /WebSearchService/V1/spellingSuggestion?appid=YahooDemo&query=apocalipto HTTP/1.1
Host: search.yahooapis.com:9091
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.9) Gecko/20071025 Firefox/2.0.0.9
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.7,zh-cn;q=0.3
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive

In the request, the web browser has used the following HTTP header:

Accept-Encoding: gzip,deflate

This tells the service that the web browser can handle data that comes in gzip compressed format, hence the service sends the data compressed. Obviously, it is not possible to look into the XML messages and debug them if the response is compressed, so we should ideally capture the messages in XML format. To do this, we can modify the request message in the TCPMon pane itself and resend the message. First remove the following line:

Accept-Encoding: gzip,deflate

Then click on the Resend button. Once we click on the Resend button, we will get the response in XML format.

Errors in building XML

While forming XML as a request payload or response payload, we can run into errors through simple mistakes. Some are easy to spot but some are not. Most XML errors can be avoided by following a simple rule of thumb: each opening XML tag should have an equivalent closing tag. A missing closing tag is the most common mistake made while building XML payloads. In the above diagram, if you look carefully at the circle, the ending tag for the book element is missing. A new starting tag for a new book is started before the first book is closed. This would cause the XML parsing on the client side to fail. In this case I am using the Library system sample, and here is the PHP source code causing the problem:

echo "<books>";
while ($line = mysql_fetch_array($result, MYSQL_ASSOC)) {
    echo "<book>";
    foreach ($line as $key => $col_value) {
        echo "<$key>$col_value</$key>";
    }
    //echo "</book>";
}
echo "</books>";

Here I have intentionally commented out printing the closing tag to demonstrate the error scenario. However, while writing this code, I could have missed that line by accident as well, leaving the system buggy. While looking for XML-related errors, you can use the manual technique that we just used: look for missing tags.
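For reference, restoring the commented-out echo statement is all that is needed to produce well-formed output. The corrected version of the same loop looks like this:

echo "<books>";
while ($line = mysql_fetch_array($result, MYSQL_ASSOC)) {
    echo "<book>";
    foreach ($line as $key => $col_value) {
        echo "<$key>$col_value</$key>";
    }
    echo "</book>"; // closing tag restored so each <book> element is properly terminated
}
echo "</books>";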
If the process looks complicated and you cannot seem to find any XML errors in the response or request that you are trying to debug, you can copy the XML captured with the tool and run it through an XML validator tool. For example, you can use an online tool such as http://www.w3schools.com/XML/xml_validator.asp. You can also check whether the XML file is well formed using an XML parser.
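As an illustration of the parser approach—this snippet is not from the article, and the filename is only a placeholder—PHP's DOM extension can report exactly where a document stops being well formed:

libxml_use_internal_errors(true); // collect parse errors instead of emitting warnings

$doc = new DOMDocument();
$xml = file_get_contents('captured_response.xml'); // the payload captured with the tool

if ($doc->loadXML($xml) === false) {
    foreach (libxml_get_errors() as $error) {
        // Each error records the line and column where parsing failed.
        echo "Line {$error->line}, column {$error->column}: {$error->message}";
    }
    libxml_clear_errors();
} else {
    echo "The XML is well formed.";
}

Running this against the output of the buggy loop above would report an opening/ending tag mismatch, pointing straight at the missing </book> tag.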

Making a Web Server in Node.js

Packt
25 Feb 2016
38 min read
In this article, we will cover the following topics: Setting up a router Serving static files Caching content in memory for immediate delivery Optimizing performance with streaming Securing against filesystem hacking exploits (For more resources related to this topic, see here.) One of the great qualities of Node is its simplicity. Unlike PHP or ASP, there is no separation between the web server and code, nor do we have to customize large configuration files to get the behavior we want. With Node, we can create the web server, customize it, and deliver content. All this can be done at the code level. This article demonstrates how to create a web server with Node and feed content through it, while implementing security and performance enhancements to cater for various situations. If we don't have Node installed yet, we can head to http://nodejs.org and hit the INSTALL button appearing on the homepage. This will download the relevant file to install Node on our operating system. Setting up a router In order to deliver web content, we need to make a Uniform Resource Identifier (URI) available. This recipe walks us through the creation of an HTTP server that exposes routes to the user. Getting ready First let's create our server file. If our main purpose is to expose server functionality, it's a general practice to call the server.js file (because the npm start command runs the node server.js command by default). We could put this new server.js file in a new folder. It's also a good idea to install and use supervisor. We use npm (the module downloading and publishing command-line application that ships with Node) to install. On the command-line utility, we write the following command: sudo npm -g install supervisor Essentially, sudo allows administrative privileges for Linux and Mac OS X systems. If we are using Node on Windows, we can drop the sudo part in any of our commands. The supervisor module will conveniently autorestart our server when we save our changes. To kick things off, we can start our server.js file with the supervisor module by executing the following command: supervisor server.js For more on possible arguments and the configuration of supervisor, check out https://github.com/isaacs/node-supervisor. How to do it... In order to create the server, we need the HTTP module. So let's load it and use the http.createServer method as follows: var http = require('http'); http.createServer(function (request, response) {   response.writeHead(200, {'Content-Type': 'text/html'});   response.end('Woohoo!'); }).listen(8080); Now, if we save our file and access localhost:8080 on a web browser or using curl, our browser (or curl) will exclaim Woohoo! But the same will occur at localhost:8080/foo. Indeed, any path will render the same behavior. So let's build in some routing. We can use the path module to extract the basename variable of the path (the final part of the path) and reverse any URI encoding from the client with decodeURI as follows: var http = require('http'); var path = require('path'); http.createServer(function (request, response) {   var lookup=path.basename(decodeURI(request.url)); We now need a way to define our routes. One option is to use an array of objects as follows: var pages = [   {route: '', output: 'Woohoo!'},   {route: 'about', output: 'A simple routing with Node example'},   {route: 'another page', output: function() {return 'Here's     '+this.route;}}, ]; Our pages array should be placed above the http.createServer call. 
Within our server, we need to loop through our array and see if the lookup variable matches any of our routes. If it does, we can supply the output. We'll also implement some 404 error-related handling as follows: http.createServer(function (request, response) {   var lookup=path.basename(decodeURI(request.url));   pages.forEach(function(page) {     if (page.route === lookup) {       response.writeHead(200, {'Content-Type': 'text/html'});       response.end(typeof page.output === 'function'       ? page.output() : page.output);     }   });   if (!response.finished) {      response.writeHead(404);      response.end('Page Not Found!');   } }).listen(8080); How it works... The callback function we provide to http.createServer gives us all the functionality we need to interact with our server through the request and response objects. We use request to obtain the requested URL and then we acquire its basename with path. We also use decodeURI, without which another page route would fail as our code would try to match another%20page against our pages array and return false. Once we have our basename, we can match it in any way we want. We could send it in a database query to retrieve content, use regular expressions to effectuate partial matches, or we could match it to a filename and load its contents. We could have used a switch statement to handle routing, but our pages array has several advantages—it's easier to read, easier to extend, and can be seamlessly converted to JSON. We loop through our pages array using forEach. Node is built on Google's V8 engine, which provides us with a number of ECMAScript 5 (ES5) features. These features can't be used in all browsers as they're not yet universally implemented, but using them in Node is no problem! The forEach function is an ES5 implementation; the ES3 way is to use the less convenient for loop. While looping through each object, we check its route property. If we get a match, we write the 200 OK status and content-type headers, and then we end the response with the object's output property. The response.end method allows us to pass a parameter to it, which it writes just before finishing the response. In response.end, we have used a ternary operator (?:) to conditionally call page.output as a function or simply pass it as a string. Notice that the another page route contains a function instead of a string. The function has access to its parent object through the this variable, and allows for greater flexibility in assembling the output we want to provide. In the event that there is no match in our forEach loop, response.end would never be called and therefore the client would continue to wait for a response until it times out. To avoid this, we check the response.finished property and if it's false, we write a 404 header and end the response. The response.finished flag is affected by the forEach callback, yet it's not nested within the callback. Callback functions are mostly used for asynchronous operations, so on the surface this looks like a potential race condition; however, the forEach loop does not operate asynchronously; it blocks until all loops are complete. There's more... There are many ways to extend and alter this example. There are also some great non-core modules available that do the legwork for us. Simple multilevel routing Our routing so far only deals with a single level path. A multilevel path (for example, /about/node) will simply return a 404 error message. 
We can alter our object to reflect a subdirectory-like structure, remove path, and use request.url for our routes instead of path.basename as follows: var http=require('http'); var pages = [   {route: '/', output: 'Woohoo!'},   {route: '/about/this', output: 'Multilevel routing with Node'},   {route: '/about/node', output: 'Evented I/O for V8 JavaScript.'},   {route: '/another page', output: function () {return 'Here's '     + this.route; }} ]; http.createServer(function (request, response) {   var lookup = decodeURI(request.url); When serving static files, request.url must be cleaned prior to fetching a given file. Check out the Securing against filesystem hacking exploits recipe in this article. Multilevel routing could be taken further; we could build and then traverse a more complex object as follows: {route: 'about', childRoutes: [   {route: 'node', output: 'Evented I/O for V8 JavaScript'},   {route: 'this', output: 'Complex Multilevel Example'} ]} After the third or fourth level, this object would become a leviathan to look at. We could alternatively create a helper function to define our routes that essentially pieces our object together for us. Alternatively, we could use one of the excellent noncore routing modules provided by the open source Node community. Excellent solutions already exist that provide helper methods to handle the increasing complexity of scalable multilevel routing. Parsing the querystring module Two other useful core modules are url and querystring. The url.parse method allows two parameters: first the URL string (in our case, this will be request.url) and second a Boolean parameter named parseQueryString. If the url.parse method is set to true, it lazy loads the querystring module (saving us the need to require it) to parse the query into an object. This makes it easy for us to interact with the query portion of a URL as shown in the following code: var http = require('http'); var url = require('url'); var pages = [   {id: '1', route: '', output: 'Woohoo!'},   {id: '2', route: 'about', output: 'A simple routing with Node     example'},   {id: '3', route: 'another page', output: function () {     return 'Here's ' + this.route; }   }, ]; http.createServer(function (request, response) {   var id = url.parse(decodeURI(request.url), true).query.id;   if (id) {     pages.forEach(function (page) {       if (page.id === id) {         response.writeHead(200, {'Content-Type': 'text/html'});         response.end(typeof page.output === 'function'         ? page.output() : page.output);       }     });   }   if (!response.finished) {     response.writeHead(404);     response.end('Page Not Found');   } }).listen(8080); With the added id properties, we can access our object data by, for instance, localhost:8080?id=2. The routing modules There's an up-to-date list of various routing modules for Node at https://github.com/joyent/node/wiki/modules#wiki-web-frameworks-routers. These community-made routers cater to various scenarios. It's important to research the activity and maturity of a module before taking it into a production environment. NodeZoo (http://nodezoo.com) is an excellent tool to research the state of a NODE module. See also The Serving static files and Securing against filesystem hacking exploits recipes discussed in this article Serving static files If we have information stored on disk that we want to serve as web content, we can use the fs (filesystem) module to load our content and pass it through the http.createServer callback. 
This is a basic conceptual starting point to serve static files; as we will learn in the following recipes, there are much more efficient solutions. Getting ready We'll need some files to serve. Let's create a directory named content, containing the following three files: index.html styles.css script.js Add the following code to the HTML file index.html: <html>   <head>     <title>Yay Node!</title>     <link rel=stylesheet href=styles.css type=text/css>     <script src=script.js type=text/javascript></script>   </head>   <body>     <span id=yay>Yay!</span>   </body> </html> Add the following code to the script.js JavaScript file: window.onload = function() { alert('Yay Node!'); }; And finally, add the following code to the CSS file style.css: #yay {font-size:5em;background:blue;color:yellow;padding:0.5em} How to do it... As in the previous recipe, we'll be using the core modules http and path. We'll also need to access the filesystem, so we'll require fs as well. With the help of the following code, let's create the server and use the path module to check if a file exists: var http = require('http'); var path = require('path'); var fs = require('fs'); http.createServer(function (request, response) {   var lookup = path.basename(decodeURI(request.url)) ||     'index.html';   var f = 'content/' + lookup;   fs.exists(f, function (exists) {     console.log(exists ? lookup + " is there"     : lookup + " doesn't exist");   }); }).listen(8080); If we haven't already done it, then we can initialize our server.js file by running the following command: supervisor server.js Try loading localhost:8080/foo. The console will say foo doesn't exist, because it doesn't. The localhost:8080/script.js URL will tell us that script.js is there, because it is. Before we can serve a file, we are supposed to let the client know the content-type header, which we can determine from the file extension. So let's make a quick map using an object as follows: var mimeTypes = {   '.js' : 'text/javascript',   '.html': 'text/html',   '.css' : 'text/css' }; We could extend our mimeTypes map later to support more types. Modern browsers may be able to interpret certain mime types (like text/javascript), without the server sending a content-type header, but older browsers or less common mime types will rely upon the correct content-type header being sent from the server. Remember to place mimeTypes outside of the server callback, since we don't want to initialize the same object on every client request. If the requested file exists, we can convert our file extension into a content-type header by feeding path.extname into mimeTypes and then pass our retrieved content-type to response.writeHead. If the requested file doesn't exist, we'll write out a 404 error and end the response as follows: //requires variables, mimeType object... http.createServer(function (request, response) {     var lookup = path.basename(decodeURI(request.url)) ||     'index.html';   var f = 'content/' + lookup;   fs.exists(f, function (exists) {     if (exists) {       fs.readFile(f, function (err, data) {         if (err) {response.writeHead(500); response.end('Server           Error!'); return; }         var headers = {'Content-type': mimeTypes[path.extname          (lookup)]};         response.writeHead(200, headers);         response.end(data);       });       return;     }     response.writeHead(404); //no such file found!     response.end();   }); }).listen(8080); At the moment, there is still no content sent to the client. 
We have to get this content from our file, so we wrap the response handling in an fs.readFile method callback as follows: //http.createServer, inside fs.exists: if (exists) {   fs.readFile(f, function(err, data) {     var headers={'Content-type': mimeTypes[path.extname(lookup)]};     response.writeHead(200, headers);     response.end(data);   });  return; } Before we finish, let's apply some error handling to our fs.readFile callback as follows: //requires variables, mimeType object... //http.createServer,  path exists, inside if(exists):  fs.readFile(f, function(err, data) {     if (err) {response.writeHead(500); response.end('Server       Error!');  return; }     var headers = {'Content-type': mimeTypes[path.extname      (lookup)]};     response.writeHead(200, headers);     response.end(data);   }); return; } Notice that return stays outside of the fs.readFile callback. We are returning from the fs.exists callback to prevent further code execution (for example, sending the 404 error). Placing a return statement in an if statement is similar to using an else branch. However, the pattern of the return statement inside the if loop is encouraged instead of if else, as it eliminates a level of nesting. Nesting can be particularly prevalent in Node due to performing a lot of asynchronous tasks, which tend to use callback functions. So, now we can navigate to localhost:8080, which will serve our index.html file. The index.html file makes calls to our script.js and styles.css files, which our server also delivers with appropriate mime types. We can see the result in the following screenshot: This recipe serves to illustrate the fundamentals of serving static files. Remember, this is not an efficient solution! In a real world situation, we don't want to make an I/O call every time a request hits the server; this is very costly especially with larger files. In the following recipes, we'll learn better ways of serving static files. How it works... Our script creates a server and declares a variable called lookup. We assign a value to lookup using the double pipe || (OR) operator. This defines a default route if path.basename is empty. Then we pass lookup to a new variable that we named f in order to prepend our content directory to the intended filename. Next, we run f through the fs.exists method and check the exist parameter in our callback to see if the file is there. If the file does exist, we read it asynchronously using fs.readFile. If there is a problem accessing the file, we write a 500 server error, end the response, and return from the fs.readFile callback. We can test the error-handling functionality by removing read permissions from index.html as follows: chmod -r index.html Doing so will cause the server to throw the 500 server error status code. To set things right again, run the following command: chmod +r index.html chmod is a Unix-type system-specific command. If we are using Windows, there's no need to set file permissions in this case. As long as we can access the file, we grab the content-type header using our handy mimeTypes mapping object, write the headers, end the response with data loaded from the file, and finally return from the function. If the requested file does not exist, we bypass all this logic, write a 404 error message, and end the response. There's more... The favicon icon file is something to watch out for. We will explore the file in this section. The favicon gotcha When using a browser to test our server, sometimes an unexpected server hit can be observed. 
This is the browser requesting the default favicon.ico icon file that servers can provide. Apart from the initial confusion of seeing additional hits, this is usually not a problem. If the favicon request does begin to interfere, we can handle it as follows: if (request.url === '/favicon.ico') {   console.log('Not found: ' + f);   response.end();   return; } If we wanted to be more polite to the client, we could also inform it of a 404 error by using response.writeHead(404) before issuing response.end. See also The Caching content in memory for immediate delivery recipe The Optimizing performance with streaming recipe The Securing against filesystem hacking exploits recipe Caching content in memory for immediate delivery Directly accessing storage on each client request is not ideal. For this task, we will explore how to enhance server efficiency by accessing the disk only on the first request, caching the data from file for that first request, and serving all further requests out of the process memory. Getting ready We are going to improve upon the code from the previous task, so we'll be working with server.js and in the content directory, with index.html, styles.css, and script.js. How to do it... Let's begin by looking at our following script from the previous recipe Serving Static Files: var http = require('http'); var path = require('path'); var fs = require('fs');    var mimeTypes = {   '.js' : 'text/javascript',   '.html': 'text/html',   '.css' : 'text/css' };   http.createServer(function (request, response) {   var lookup = path.basename(decodeURI(request.url)) ||     'index.html';   var f = 'content/'+lookup;   fs.exists(f, function (exists) {     if (exists) {       fs.readFile(f, function(err,data) {         if (err) {           response.writeHead(500); response.end('Server Error!');           return;         }         var headers = {'Content-type': mimeTypes[path.extname          (lookup)]};         response.writeHead(200, headers);         response.end(data);       });     return;     }     response.writeHead(404); //no such file found!     response.end('Page Not Found');   }); } We need to modify this code to only read the file once, load its contents into memory, and respond to all requests for that file from memory afterwards. To keep things simple and preserve maintainability, we'll extract our cache handling and content delivery into a separate function. So above http.createServer, and below mimeTypes, we'll add the following: var cache = {}; function cacheAndDeliver(f, cb) {   if (!cache[f]) {     fs.readFile(f, function(err, data) {       if (!err) {         cache[f] = {content: data} ;       }       cb(err, data);     });     return;   }   console.log('loading ' + f + ' from cache');   cb(null, cache[f].content); } //http.createServer A new cache object and a new function called cacheAndDeliver have been added to store our files in memory. Our function takes the same parameters as fs.readFile so we can replace fs.readFile in the http.createServer callback while leaving the rest of the code intact as follows: //...inside http.createServer:   fs.exists(f, function (exists) {   if (exists) {     cacheAndDeliver(f, function(err, data) {       if (err) {         response.writeHead(500);         response.end('Server Error!');         return; }       var headers = {'Content-type': mimeTypes[path.extname(f)]};       response.writeHead(200, headers);       response.end(data);     }); return;   } //rest of path exists code (404 handling)... 
When we execute our server.js file and access localhost:8080 twice, consecutively, the second request causes the console to display the following output: loading content/index.html from cache loading content/styles.css from cache loading content/script.js from cache How it works... We defined a function called cacheAndDeliver, which like fs.readFile, takes a filename and callback as parameters. This is great because we can pass the exact same callback of fs.readFile to cacheAndDeliver, padding the server out with caching logic without adding any extra complexity visually to the inside of the http.createServer callback. As it stands, the worth of abstracting our caching logic into an external function is arguable, but the more we build on the server's caching abilities, the more feasible and useful this abstraction becomes. Our cacheAndDeliver function checks to see if the requested content is already cached. If not, we call fs.readFile and load the data from disk. Once we have this data, we may as well hold onto it, so it's placed into the cache object referenced by its file path (the f variable). The next time anyone requests the file, cacheAndDeliver will see that we have the file stored in the cache object and will issue an alternative callback containing the cached data. Notice that we fill the cache[f] property with another new object containing a content property. This makes it easier to extend the caching functionality in the future as we would just have to place extra properties into our cache[f] object and supply logic that interfaces with these properties accordingly. There's more... If we were to modify the files we are serving, the changes wouldn't be reflected until we restart the server. We can do something about that. Reflecting content changes To detect whether a requested file has changed since we last cached it, we must know when the file was cached and when it was last modified. To record when the file was last cached, let's extend the cache[f] object as follows: cache[f] = {content: data,timestamp: Date.now() // store a Unix                                                 // time stamp }; That was easy! Now let's find out when the file was updated last. The fs.stat method returns an object as the second parameter of its callback. This object contains the same useful information as the command-line GNU (GNU's Not Unix!) coreutils stat. The fs.stat function supplies three time-related properties: last accessed (atime), last modified (mtime), and last changed (ctime). The difference between mtime and ctime is that ctime will reflect any alterations to the file, whereas mtime will only reflect alterations to the content of the file. Consequently, if we changed the permissions of a file, ctime would be updated but mtime would stay the same. We want to pay attention to permission changes as they happen so let's use the ctime property as shown in the following code: //requires and mimeType object.... var cache = {}; function cacheAndDeliver(f, cb) {   fs.stat(f, function (err, stats) {     if (err) { return console.log('Oh no!, Error', err); }     var lastChanged = Date.parse(stats.ctime),     isUpdated = (cache[f]) && lastChanged  > cache[f].timestamp;     if (!cache[f] || isUpdated) {       fs.readFile(f, function (err, data) {         console.log('loading ' + f + ' from file');         //rest of cacheAndDeliver   }); //end of fs.stat } If we're using Node on Windows, we may have to substitute ctime with mtime, since ctime supports at least Version 0.10.12. 
The contents of cacheAndDeliver have been wrapped in an fs.stat callback, two variables have been added, and the if(!cache[f]) statement has been modified. We parse the ctime property of the second parameter dubbed stats using Date.parse to convert it to milliseconds since midnight, January 1st, 1970 (the Unix epoch) and assign it to our lastChanged variable. Then we check whether the requested file's last changed time is greater than when we cached the file (provided the file is indeed cached) and assign the result to our isUpdated variable. After that, it's merely a case of adding the isUpdated Boolean to the conditional if(!cache[f]) statement via the || (or) operator. If the file is newer than our cached version (or if it isn't yet cached), we load the file from disk into the cache object. See also The Optimizing performance with streaming recipe discussed in this article Optimizing performance with streaming Caching content certainly improves upon reading a file from disk for every request. However, with fs.readFile, we are reading the whole file into memory before sending it out in a response object. For better performance, we can stream a file from disk and pipe it directly to the response object, sending data straight to the network socket a piece at a time. Getting ready We are building on our code from the last example, so let's get server.js, index.html, styles.css, and script.js ready. How to do it... We will be using fs.createReadStream to initialize a stream, which can be piped to the response object. In this case, implementing fs.createReadStream within our cacheAndDeliver function isn't ideal because the event listeners of fs.createReadStream will need to interface with the request and response objects, which for the sake of simplicity would preferably be dealt with in the http.createServer callback. For brevity's sake, we will discard our cacheAndDeliver function and implement basic caching within the server callback as follows: //...snip... requires, mime types, createServer, lookup and f //  vars...   fs.exists(f, function (exists) {   if (exists) {     var headers = {'Content-type': mimeTypes[path.extname(f)]};     if (cache[f]) {       response.writeHead(200, headers);       response.end(cache[f].content);       return;    } //...snip... rest of server code... Later on, we will fill cache[f].content while we are interfacing with the readStream object. The following code shows how we use fs.createReadStream: var s = fs.createReadStream(f); The preceding code will return a readStream object that streams the file, which is pointed at by variable f. The readStream object emits events that we need to listen to. We can listen with addEventListener or use the shorthand on method as follows: var s = fs.createReadStream(f).on('open', function () {   //do stuff when the readStream opens }); Because createReadStream returns the readStream object, we can latch our event listener straight onto it using method chaining with dot notation. Each stream is only going to open once; we don't need to keep listening to it. 
Therefore, we can use the once method instead of on to automatically stop listening after the first event occurrence as follows: var s = fs.createReadStream(f).once('open', function () {   //do stuff when the readStream opens }); Before we fill out the open event callback, let's implement some error handling as follows: var s = fs.createReadStream(f).once('open', function () {   //do stuff when the readStream opens }).once('error', function (e) {   console.log(e);   response.writeHead(500);   response.end('Server Error!'); }); The key to this whole endeavor is the stream.pipe method. This is what enables us to take our file straight from disk and stream it directly to the network socket via our response object as follows: var s = fs.createReadStream(f).once('open', function () {   response.writeHead(200, headers);   this.pipe(response); }).once('error', function (e) {   console.log(e);   response.writeHead(500);   response.end('Server Error!'); }); But what about ending the response? Conveniently, stream.pipe detects when the stream has ended and calls response.end for us. There's one other event we need to listen to, for caching purposes. Within our fs.exists callback, underneath the createReadStream code block, we write the following code: fs.stat(f, function(err, stats) {   var bufferOffset = 0;   cache[f] = {content: new Buffer(stats.size)};   s.on('data', function (chunk) {     chunk.copy(cache[f].content, bufferOffset);     bufferOffset += chunk.length;   }); }); //end of createReadStream We've used the data event to capture the buffer as it's being streamed, and copied it into a buffer that we supplied to cache[f].content, using fs.stat to obtain the file size for the file's cache buffer. For this case, we're using the classic mode data event instead of the readable event coupled with stream.read() (see http://nodejs.org/api/stream.html#stream_readable_read_size_1) because it best suits our aim, which is to grab data from the stream as soon as possible. How it works... Instead of the client waiting for the server to load the entire file from disk prior to sending it to the client, we use a stream to load the file in small ordered pieces and promptly send them to the client. With larger files, this is especially useful as there is minimal delay between the file being requested and the client starting to receive the file. We did this by using fs.createReadStream to start streaming our file from disk. The fs.createReadStream method creates a readStream object, which inherits from the EventEmitter class. The EventEmitter class accomplishes the evented part pretty well. Due to this, we'll be using listeners instead of callbacks to control the flow of stream logic. We then added an open event listener using the once method since we want to stop listening to the open event once it is triggered. We respond to the open event by writing the headers and using the stream.pipe method to shuffle the incoming data straight to the client. If the client becomes overwhelmed with processing, stream.pipe applies backpressure, which means that the incoming stream is paused until the backlog of data is handled. While the response is being piped to the client, the content cache is simultaneously being filled. To achieve this, we had to create an instance of the Buffer class for our cache[f].content property. A Buffer class must be supplied with a size (or array or string), which in our case is the size of the file. 
To get the size, we used the asynchronous fs.stat method and captured the size property in the callback. The data event returns a Buffer variable as its only callback parameter. The default value of bufferSize for a stream is 64 KB; any file whose size is less than the value of the bufferSize property will only trigger one data event because the whole file will fit into the first chunk of data. But for files that are greater than the value of the bufferSize property, we have to fill our cache[f].content property one piece at a time. Changing the default readStream buffer size We can change the buffer size of our readStream object by passing an options object with a bufferSize property as the second parameter of fs.createReadStream. For instance, to double the buffer, you could use fs.createReadStream(f,{bufferSize: 128 * 1024});. We cannot simply concatenate each chunk with cache[f].content because this will coerce binary data into string format, which, though no longer in binary format, will later be interpreted as binary. Instead, we have to copy all the little binary buffer chunks into our binary cache[f].content buffer. We created a bufferOffset variable to assist us with this. Each time we add another chunk to our cache[f].content buffer, we update our new bufferOffset property by adding the length of the chunk buffer to it. When we call the Buffer.copy method on the chunk buffer, we pass bufferOffset as the second parameter, so our cache[f].content buffer is filled correctly. Moreover, operating with the Buffer class renders performance enhancements with larger files because it bypasses the V8 garbage-collection methods, which tend to fragment a large amount of data, thus slowing down Node's ability to process them. There's more... While streaming has solved the problem of waiting for files to be loaded into memory before delivering them, we are nevertheless still loading files into memory via our cache object. With larger files or a large number of files, this could have potential ramifications. Protecting against process memory overruns Streaming allows for intelligent and minimal use of memory for processing large memory items. But even with well-written code, some apps may require significant memory. There is a limited amount of heap memory. By default, V8's memory is set to 1400 MB on 64-bit systems and 700 MB on 32-bit systems. This can be altered by running node with --max-old-space-size=N, where N is the amount of megabytes (the actual maximum amount that it can be set to depends upon the OS, whether we're running on a 32-bit or 64-bit architecture—a 32-bit may peak out around 2 GB and of course the amount of physical RAM available). The --max-old-space-size method doesn't apply to buffers, since it applies to the v8 heap (memory allocated for JavaScript objects and primitives) and buffers are allocated outside of the v8 heap. If we absolutely had to be memory intensive, we could run our server on a large cloud platform, divide up the logic, and start new instances of node using the child_process class, or better still the higher level cluster module. There are other more advanced ways to increase the usable memory, including editing and recompiling the v8 code base. The http://blog.caustik.com/2012/04/11/escape-the-1-4gb-v8-heap-limit-in-node-js link has some tips along these lines. In this case, high memory usage isn't necessarily required and we can optimize our code to significantly reduce the potential for memory overruns. 
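To make the cluster suggestion above a little more concrete, here is a minimal sketch—not part of the recipe, and the port and response body are only placeholders—of how the same kind of HTTP server could be forked across CPU cores:

var cluster = require('cluster');
var http = require('http');
var numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
  // Fork one worker per CPU core; the workers share the listening socket.
  for (var i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
} else {
  http.createServer(function (request, response) {
    response.writeHead(200, {'Content-Type': 'text/plain'});
    response.end('Handled by worker ' + cluster.worker.id);
  }).listen(8080);
}

Each worker is a separate Node process with its own heap, so memory-hungry work is spread across processes rather than pushing a single process toward the heap limit.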
There is less benefit to caching larger files because the slight speed improvement relative to the total download time is negligible, while the cost of caching them is quite significant in ratio to our available process memory. We can also improve cache efficiency by implementing an expiration time on cache objects, which can then be used to clean the cache, consequently removing files in low demand and prioritizing high demand files for faster delivery. Let's rearrange our cache object slightly as follows: var cache = {   store: {},   maxSize : 26214400, //(bytes) 25mb } For a clearer mental model, we're making a distinction between the cache object as a functioning entity and the cache object as a store (which is a part of the broader cache entity). Our first goal is to only cache files under a certain size; we've defined cache.maxSize for this purpose. All we have to do now is insert an if condition within the fs.stat callback as follows: fs.stat(f, function (err, stats) {   if (stats.size<cache.maxSize) {     var bufferOffset = 0;     cache.store[f] = {content: new Buffer(stats.size),       timestamp: Date.now() };     s.on('data', function (data) {       data.copy(cache.store[f].content, bufferOffset);       bufferOffset += data.length;     });   } }); Notice that we also slipped in a new timestamp property into our cache.store[f] method. This is for our second goal—cleaning the cache. Let's extend cache as follows: var cache = {   store: {},   maxSize: 26214400, //(bytes) 25mb   maxAge: 5400 * 1000, //(ms) 1 and a half hours   clean: function(now) {     var that = this;     Object.keys(this.store).forEach(function (file) {       if (now > that.store[file].timestamp + that.maxAge) {         delete that.store[file];       }     });   } }; So in addition to maxSize, we've created a maxAge property and added a clean method. We call cache.clean at the bottom of the server with the help of the following code: //all of our code prior   cache.clean(Date.now()); }).listen(8080); //end of the http.createServer The cache.clean method loops through the cache.store function and checks to see if it has exceeded its specified lifetime. If it has, we remove it from the store. One further improvement and then we're done. The cache.clean method is called on each request. This means the cache.store function is going to be looped through on every server hit, which is neither necessary nor efficient. It would be better if we clean the cache, say, every two hours or so. We'll add two more properties to cache—cleanAfter to specify the time between cache cleans, and cleanedAt to determine how long it has been since the cache was last cleaned, as follows: var cache = {   store: {},   maxSize: 26214400, //(bytes) 25mb   maxAge : 5400 * 1000, //(ms) 1 and a half hours   cleanAfter: 7200 * 1000,//(ms) two hours   cleanedAt: 0, //to be set dynamically   clean: function (now) {     if (now - this.cleanAfter>this.cleanedAt) {       this.cleanedAt = now;       that = this;       Object.keys(this.store).forEach(function (file) {         if (now > that.store[file].timestamp + that.maxAge) {           delete that.store[file];         }       });     }   } }; So we wrap our cache.clean method in an if statement, which will allow a loop through cache.store only if it has been longer than two hours (or whatever cleanAfter is set to) since the last clean. 
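As a design alternative—not part of the recipe—the clean-up could also be driven by a timer instead of piggybacking on requests, which keeps request handling free of housekeeping. A minimal sketch, assuming the cache object defined above is in scope:

var cleaner = setInterval(function () {
  cache.clean(Date.now());
}, cache.cleanAfter);

// Allow the process to exit naturally even if the timer is still scheduled.
cleaner.unref();

With this in place, the cache.clean(Date.now()) call at the bottom of the server callback would no longer be needed.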
See also The Securing against filesystem hacking exploits recipe discussed in this article Securing against filesystem hacking exploits For a Node app to be insecure, there must be something an attacker can interact with for exploitation purposes. Due to Node's minimalist approach, the onus is on the programmer to ensure that their implementation doesn't expose security flaws. This recipe will help identify some security risk anti-patterns that could occur when working with the filesystem. Getting ready We'll be working with the same content directory as we did in the previous recipes. But we'll start a new insecure_server.js file (there's a clue in the name!) from scratch to demonstrate mistaken techniques. How to do it... Our previous static file recipes tend to use path.basename to acquire a route, but this ignores intermediate paths. If we accessed localhost:8080/foo/bar/styles.css, our code would take styles.css as the basename property and deliver content/styles.css to us. How about we make a subdirectory in our content folder? Call it subcontent and move our script.js and styles.css files into it. We'd have to alter our script and link tags in index.html as follows: <link rel=stylesheet type=text/css href=subcontent/styles.css> <script src=subcontent/script.js type=text/javascript></script> We can use the url module to grab the entire pathname property. So let's include the url module in our new insecure_server.js file, create our HTTP server, and use pathname to get the whole requested path as follows: var http = require('http'); var url = require('url'); var fs = require('fs');   http.createServer(function (request, response) {   var lookup = url.parse(decodeURI(request.url)).pathname;   lookup = (lookup === "/") ? '/index.html' : lookup;   var f = 'content' + lookup;   console.log(f);   fs.readFile(f, function (err, data) {     response.end(data);   }); }).listen(8080); If we navigate to localhost:8080, everything works great! We've gone multilevel, hooray! For demonstration purposes, a few things have been stripped out from the previous recipes (such as fs.exists); but even with them, this code presents the same security hazards if we type the following: curl localhost:8080/../insecure_server.js Now we have our server's code. An attacker could also access /etc/passwd with a few attempts at guessing its relative path as follows: curl localhost:8080/../../../../../../../etc/passwd If we're using Windows, we can download and install curl from http://curl.haxx.se/download.html. In order to test these attacks, we have to use curl or another equivalent because modern browsers will filter these sort of requests. As a solution, what if we added a unique suffix to each file we wanted to serve and made it mandatory for the suffix to exist before the server coughs it up? That way, an attacker could request /etc/passwd or our insecure_server.js file because they wouldn't have the unique suffix. To try this, let's copy the content folder and call it content-pseudosafe, and rename our files to index.html-serve, script.js-serve, and styles.css-serve. Let's create a new server file and name it pseudosafe_server.js. Now all we have to do is make the -serve suffix mandatory as follows: //requires section ...snip... http.createServer(function (request, response) {   var lookup = url.parse(decodeURI(request.url)).pathname;   lookup = (lookup === "/") ? '/index.html-serve'     : lookup + '-serve';   var f = 'content-pseudosafe' + lookup; //...snip... rest of the server code... 
For feedback purposes, we'll also include some 404 handling with the help of fs.exists as follows: //requires, create server etc fs.exists(f, function (exists) {   if (!exists) {     response.writeHead(404);     response.end('Page Not Found!');     return;   } //read file etc So, let's start our pseudosafe_server.js file and try out the same exploit by executing the following command: curl -i localhost:8080/../insecure_server.js We've used the -i argument so that curl will output the headers. The result? A 404, because the file it's actually looking for is ../insecure_server.js-serve, which doesn't exist. So what's wrong with this method? Well it's inconvenient and prone to error. But more importantly, an attacker can still work around it! Try this by typing the following: curl localhost:8080/../insecure_server.js%00/index.html And voilà! There's our server code again. The solution to our problem is path.normalize, which cleans up our pathname before it gets to fs.readFile as shown in the following code: http.createServer(function (request, response) {   var lookup = url.parse(decodeURI(request.url)).pathname;   lookup = path.normalize(lookup);   lookup = (lookup === "/") ? '/index.html' : lookup;   var f = 'content' + lookup } Prior recipes haven't used path.normalize and yet they're still relatively safe. The path.basename method gives us the last part of the path, thus removing any preceding double dot paths (../) that would take an attacker higher up the directory hierarchy than should be allowed. How it works... Here we have two filesystem exploitation techniques: the relative directory traversal and poison null byte attacks. These attacks can take different forms, such as in a POST request or from an external file. They can have different effects—if we were writing to files instead of reading them, an attacker could potentially start making changes to our server. The key to security in all cases is to validate and clean any data that comes from the user. In insecure_server.js, we pass whatever the user requests to our fs.readFile method. This is foolish because it allows an attacker to take advantage of the relative path functionality in our operating system by using ../, thus gaining access to areas that should be off limits. By adding the -serve suffix, we didn't solve the problem, we put a plaster on it, which can be circumvented by the poison null byte. The key to this attack is the %00 value, which is a URL hex code for the null byte. In this case, the null byte blinds Node to the ../insecure_server.js portion, but when the same null byte is sent through to our fs.readFile method, it has to interface with the kernel. But the kernel gets blinded to the index.html part. So our code sees index.html but the read operation sees ../insecure_server.js. This is known as null byte poisoning. To protect ourselves, we could use a regex statement to remove the ../ parts of the path. We could also check for the null byte and spit out a 400 Bad Request statement. But we don't have to, because path.normalize filters out the null byte and relative parts for us. There's more... Let's further delve into how we can protect our servers when it comes to serving static files. Whitelisting If security was an extreme priority, we could adopt a strict whitelisting approach. In this approach, we would create a manual route for each file we are willing to deliver. Anything not on our whitelist would return a 404 error. 
We can place a whitelist array above http.createServer as follows: var whitelist = [   '/index.html',   '/subcontent/styles.css',   '/subcontent/script.js' ]; And inside our http.createServer callback, we'll put an if statement to check if the requested path is in the whitelist array, as follows: if (whitelist.indexOf(lookup) === -1) {   response.writeHead(404);   response.end('Page Not Found!');   return; } And that's it! We can test this by placing a file non-whitelisted.html in our content directory and then executing the following command: curl -i localhost:8080/non-whitelisted.html This will return a 404 error because non-whitelisted.html isn't on the whitelist. Node static The module's wiki page (https://github.com/joyent/node/wiki/modules#wiki-web-frameworks-static) has a list of static file server modules available for different purposes. It's a good idea to ensure that a project is mature and active before relying upon it to serve your content. The node-static module is a well-developed module with built-in caching. It's also compliant with the RFC2616 HTTP standards specification, which defines how files should be delivered over HTTP. The node-static module implements all the essentials discussed in this article and more. For the next example, we'll need the node-static module. You could install it by executing the following command: npm install node-static The following piece of code is slightly adapted from the node-static module's GitHub page at https://github.com/cloudhead/node-static: var static = require('node-static'); var fileServer = new static.Server('./content'); require('http').createServer(function (request, response) {   request.addListener('end', function () {     fileServer.serve(request, response);   }); }).listen(8080); The preceding code will interface with the node-static module to handle server-side and client-side caching, use streams to deliver content, and filter out relative requests and null bytes, among other things. Summary To learn more about Node.js and creating web servers, the following books published by Packt Publishing (https://www.packtpub.com/) are recommended: Node Cookbook Second Edition (https://www.packtpub.com/web-development/node-cookbook-second-edition) Node.js Design Patterns (https://www.packtpub.com/web-development/nodejs-design-patterns) Node Web Development Second Edition (https://www.packtpub.com/web-development/node-web-development-second-edition) Resources for Article: Further resources on this subject: Working With Commands And Plugins [article] Node.js Fundamentals And Asynchronous Javascript [article] Building A Movie API With Express [article]
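As a footnote to the filesystem-security recipe above, the following sketch combines path.normalize with the whitelist check before any read is attempted. The content directory, file names, and port simply mirror the earlier examples; treat it as a sketch rather than a replacement for a hardened module such as node-static:

var http = require('http');
var url = require('url');
var path = require('path');
var fs = require('fs');

var whitelist = [
  '/index.html',
  '/subcontent/styles.css',
  '/subcontent/script.js'
];

http.createServer(function (request, response) {
  // normalize the requested path first, as discussed in the recipe
  var lookup = path.normalize(url.parse(decodeURI(request.url)).pathname);
  lookup = (lookup === '/') ? '/index.html' : lookup;
  // then refuse anything that is not explicitly whitelisted
  if (whitelist.indexOf(lookup) === -1) {
    response.writeHead(404);
    response.end('Page Not Found!');
    return;
  }
  fs.readFile('content' + lookup, function (err, data) {
    if (err) {
      response.writeHead(404);
      response.end('Page Not Found!');
      return;
    }
    response.end(data);
  });
}).listen(8080);

Either check alone already helps; taken together, a request such as ../insecure_server.js or the poison null byte variant never reaches fs.readFile unless it names a file we have chosen to serve.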
Caching in Symfony

Packt
05 Apr 2016
15 min read
In this article by Sohail Salehi, author of the book, Mastering Symfony, we are going to discuss performance improvement using cache. Caching is a vast subject and needs its own book to be covered properly. However, in our Symfony project, we are interested in two types of caches only: Application cache Database cache We will see what caching facilities are provided in Symfony by default and how we can use them. We are going to apply the caching techniques on some methods in our projects and watch the performance improvement. By the end of this article, you will have a firm understanding about the usage of HTTP cache headers in the application layer and caching libraries. (For more resources related to this topic, see here.) Definition of cache Cache is a temporary place that stores contents that can be served faster when they are needed. Considering that we already have a permanent place on disk to store our web contents (templates, codes, and database tables), cache sounds like a duplicate storage. That is exactly what they are. They are duplicates and we need them because, in return for consuming an extra space to store the same data, they provide a very fast response to some requests. So this is a very good trade-off between storage and performance. To give you an example about how good this deal can be, consider the following image. On the left side, we have a usual client/server request/response model and let's say the response latency is two seconds and there are only 100 users who hit the same content per hour: On the right side, however, we have a cache layer that sits between the client and server. What it does basically is receive the same request and pass it to the server. The server sends a response to the cache and, because this response is new to the cache, it will save a copy (duplicate) of the response and then pass it back to the client. The latency is 2 + 0.2 seconds. However, it doesn't add up, does it? The purpose of using cache was to improve the overall performance and reduce the latency. It has already added more delays to the cycle. With this result, how could it possibly be beneficial? The answer is in the following image: Now, with the response being cached, imagine the same request comes through. (We have about 100 requests/hour for the same content, remember?) This time, the cache layer looks into its space, finds the response, and sends it back to the client, without bothering the server. The latency is 0.2 seconds. Of course, these are only imaginary numbers and situations. However, in the simplest form, this is how cache works. It might not be very helpful on a low traffic website; however, when we are dealing with thousands of concurrent users on a high traffic website, then we can appreciate the value of caching. So, according to the previous images, we can define some terminology and use them in this article as we continue. In the first image, when a client asked for that page, it wasn't exited and the cache layer had to store a copy of its contents for the future references. This is called Cache Miss. However, in the second image, we already had a copy of the contents stored in the cache and we benefited from it. This is called Cache Hit. Characteristics of a good cache If you do a quick search, you will find that a good cache is defined as the one which misses only once. In other words, this cache miss happens only if the content has not been requested before. This feature is necessary but it is not sufficient. 
To clarify the situation a little bit, let's add two more terminology here. A cache can be in one of the following states: fresh (has the same contents as the original response) and stale (has the old response's contents that have now changed on the server). The important question here is for how long should a cache be kept? We have the power to define the freshness of a cache via a setting expiration period. We will see how to do this in the coming sections. However, just because we have this power doesn't mean that we are right about the content's freshness. Consider the situation shown in the following image: If we cache a content for a long time, cache miss won't happen again (which satisfies the preceding definition), but the content might lose its freshness according to the dynamic resources that might change on the server. To give you an example, nobody likes to read the news of three months ago when they open the BBC website. Now, we can modify the definition of a good cache as follows: A cache strategy is considered to be good if cache miss for the same content happens only once, while the cached contents are still fresh. This means that defining the cache expiry time won't be enough and we need another strategy to keep an eye on cache freshness. This happens via a cache validation strategy. When the server sends a response, we can set the validation rules on the basis of what really matters on the server side, and this way, we can keep the contents stored in the cache fresh, as shown in the following image. We will see how to do this in Symfony soon. Caches in a Symfony project In this article, we will focus on two types of caches: The gateway cache (which is called reverse proxy cache as well) and doctrine cache. As you might have guessed, the gateway cache deals with all of the HTTP cache headers. Symfony comes with a very strong gateway cache out of the box. All you need to do is just activate it in your front controller then start defining your cache expiration and validation strategies inside your controllers. That said, it does not mean that you are forced or restrained to use the Symfony cache only. If you prefer other reverse proxy cache libraries (that is, Varnish or Django), you are welcome to use them. The caching configurations in Symfony are transparent such that you don't need to change a single line inside your controllers when you change your caching libraries. Just modify your config.yml file and you will be good to go. However, we all know that caching is not for application layers and views only. Sometimes, we need to cache any database-related contents as well. For our Doctrine ORM, this includes metadata cache, query cache, and result cache. Doctrine comes with its own bundle to handle these types of caches and it uses a wide range of libraries (APC, Memcached, Redis, and so on) to do the job. Again, we don't need to install anything to use this cache bundle. If we have Doctrine installed already, all we need to do is configure something and then all the Doctrine caching power will be at our disposal. Putting these two caching types together, we will have a big picture to cache our Symfony project: As you can see in this image, we might have a problem with the final cached page. Imagine that we have a static page that might change once a week, and in this page, there are some blocks that might change on a daily or even hourly basis, as shown in the following image. The User dashboard in our project is a good example. 
Thus, if we set the expiration on the gateway cache to one week, we cannot reflect all of those rapid updates in our project and task controllers. To solve this problem, we can leverage from Edge Side Includes (ESI) inside Symfony. Basically, any part of the page that has been defined inside an ESI tag can tell its own cache story to the gateway cache. Thus, we can have multiple cache strategies living side by side inside a single page. With this solution, our big picture will look as follows: Thus, we are going to use the default Symfony and Doctrine caching features for application and model layers and you can also use some popular third-party bundles for more advanced settings. If you completely understand the caching principals, moving to other caching bundles would be like a breeze. Key players in the HTTP cache header Before diving into the Symfony application cache, let's familiarize ourselves with the elements that we need to handle in our cache strategies. To do so, open https://www.wikipedia.org/ in your browser and inspect any resource with the 304 response code and ponder on request/response headers inside the Network tab: Among the response elements, there are four cache headers that we are interested in the most: expires and cache-control, which will be used for an expiration model, and etag and last-modified, which will be used for a validation model. Apart from these cache headers, we can have variations of the same cache (compressed/uncompressed) via the Vary header and we can define a cache as private (accessible by a specific user) or public (accessible by everyone). Using the Symfony reverse proxy cache There is no complicated or lengthy procedure required to activate the Symfony's gateway cache. Just open the front controller and uncomment the following lines: // web/app.php <?php //... require_once __DIR__.'/../app/AppKernel.php'; //un comment this line require_once __DIR__.'/../app/AppCache.php'; $kernel = new AppKernel('prod', false); $kernel->loadClassCache(); // and this line $kernel = new AppCache($kernel); // ... ?> Now, the kernel is wrapped around the Application Cache layer, which means that any request coming from the client will pass through this layer first. Set the expiration for the dashboard page Log in to your project and click on the Request/Response section in the debug toolbar. Then, scroll down to Response Headers and check the contents: As you can see, only cache-control is sitting there with some default values among the cache headers that we are interested in. When you don't set any value for Cache-Control, Symfony considers the page contents as private to keep them safe. Now, let's go to the Dashboard controller and add some gateway cache settings to the indexAction() method: // src/AppBundle/Controller/DashboardController.php <?php namespace AppBundleController; use SymfonyBundleFrameworkBundleControllerController; use SymfonyComponentHttpFoundationResponse; class DashboardController extends Controller { public function indexAction() { $uId = $this->getUser()->getId(); $util = $this->get('mava_util'); $userProjects = $util->getUserProjects($uId); $currentTasks= $util->getUserTasks($uId, 'in progress'); $response = new Response(); $date = new DateTime('+2 days'); $response->setExpires($date); return $this->render( 'CoreBundle:Dashboard:index.html.twig', array( 'currentTasks' => $currentTasks, 'userProjects' => $userProjects ), $response ); } } You might have noticed that we didn't change the render() method. 
Instead, we added the response settings as the third parameter of this method. This is a good solution because now we can keep the current template structure and adding new settings won't require any other changes in the code. However, you might wonder what other options do we have? We can save the whole $this->render() method in a variable and assign a response setting to it as follows: // src/AppBundle/Controller/DashboardController.php <?php // ... $res = $this->render( 'AppBundle:Dashboard:index.html.twig', array( 'currentTasks' => $currentTasks, 'userProjects' => $userProjects ) ); $res->setExpires($date); return $res; ?> Still looks like a lot of hard work for a simple response header setting. So let me introduce a better option. We can use the @Cache annotation as follows: // src/AppBundle/Controller/DashboardController.php <?php namespace AppBundleController; use SymfonyBundleFrameworkBundleControllerController; use SensioBundleFrameworkExtraBundleConfigurationCache; class DashboardController extends Controller { /** * @Cache(expires="next Friday") */ public function indexAction() { $uId = $this->getUser()->getId(); $util = $this->get('mava_util'); $userProjects = $util->getUserProjects($uId); $currentTasks= $util->getUserTasks($uId, 'in progress'); return $this->render( 'AppBundle:Dashboard:index.html.twig', array( 'currentTasks' => $currentTasks, 'userProjects' => $userProjects )); } } Have you noticed that the response object is completely removed from the code? With an annotation, all response headers are sent internally, which helps keep the original code clean. Now that's what I call zero-fee maintenance. Let's check our response headers in Symfony's debug toolbar and see what it looks like: The good thing about the @Cache annotation is that they can be nested. Imagine you have a controller full of actions. You want all of them to have a shared maximum age of half an hour except one that is supposed to be private and should be expired in five minutes. This sounds like a lot of code if you going are to use the response objects directly, but with an annotation, it will be as simple as this: <?php //... /** * @Cache(smaxage="1800", public="true") */ class DashboardController extends Controller { public function firstAction() { //... } public function secondAction() { //... } /** * @Cache(expires="300", public="false") */ public function lastAction() { //... } } The annotation defined before the controller class will apply to every single action, unless we explicitly add a new annotation for an action. Validation strategy In the previous example, we set the expiry period very long. This means that if a new task is assigned to the user, it won't show up in his dashboard because of the wrong caching strategy. To fix this issue, we can validate the cache before using it. There are two ways for validation: We can check the content's date via the Last-Modified header: In this technique, we certify the freshness of a content via the time it has been modified. In other words, if we keep track of the dates and times of each change on a resource, then we can simply compare that date with cache's date and find out if it is still fresh. We can use the ETag header as a unique content signature: The other solution is to generate a unique string based on the contents and evaluate the cache's freshness based on its signature. We are going to try both of them in the Dashboard controller and see them in action. Using the right validation header is totally dependent on the current code. 
In some actions, calculating modified dates is way easier than creating a digital footprint, while in others, going through the date and time function might looks costly. Of course, there are situations where generating both headers are critical. So creating it is totally dependent on the code base and what you are going to achieve. As you can see, we have two entities in the indexAction() method and, considering the current code, generating the ETag header looks practical. So the validation header will look as follows: // src/AppBundle/Controller/DashboardController.php <?php //... class DashboardController extends Controller { /** * @Cache(ETag="userProjects ~ finishedTasks") */ public function indexAction() { //... } } The next time a request arrives, the cache layer looks into the ETag value in the controller, compares it with its own ETag, and calls the indexAction() method; only, there is a difference between these two. How to mix expiration and validation strategies Imagine that we want to keep the cache fresh for 10 minutes and simultaneously keep an eye on any changes over user projects or finished tasks. It is obvious that tasks won't finish every 10 minutes and it is far beyond reality to expect changes on project status during this period. So what we can do to make our caching strategy efficient is that we can combine Expiration and Validation together and apply them to the Dashboard Controller as follows: // src/CoreBundle/Controller/DashboardController.php <?php //... /** * @Cache(expires="600") */ class DashboardController extends Controller { /** * @Cache(ETag="userProjects ~ finishedTasks") */ public function indexAction() { //... } } Keep in mind that Expiration has a higher priority over Validation. In other words, the cache is fresh for 10 minutes, regardless of the validation status. So when you visit your dashboard for the first time, a new cache plus a 302 response (not modified) is generated automatically and you will hit cache for the next 10 minutes. However, what happens after 10 minutes is a little different. Now, the expiration status is not satisfying; thus, the HTTP flow falls into the validation phase and in case nothing happened to the finished tasks status or the your project status, then a new expiration period is generated and you hit the cache again. However, if there is any change in your tasks or project status, then you will hit the server to get the real response, and a new cache from response's contents, new expiration period, and new ETag are generated and stored in the cache layer for future references. Summary In this article, you learned about the basics of gateway and Doctrine caching. We saw how to set expiration and validation strategies using HTTP headers such as Cache-Control, Expires, Last-Modified, and ETag. You learned how to set public and private access levels for a cache and use an annotation to define cache rules in the controller. Resources for Article: Further resources on this subject: User Interaction and Email Automation in Symfony 1.3: Part1 [article] The Symfony Framework – Installation and Configuration [article] User Interaction and Email Automation in Symfony 1.3: Part2 [article]
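As a pointer beyond the gateway cache covered above, Doctrine's caches are switched on through configuration alone. The fragment below is only a sketch: the apc and memcached drivers and the exact key names are assumptions that depend on the DoctrineBundle version installed in your project:

# app/config/config.yml (sketch only -- adjust drivers to what is installed)
doctrine:
    orm:
        metadata_cache_driver: apc
        query_cache_driver: apc
        result_cache_driver:
            type: memcached
            host: localhost
            port: 11211

With drivers like these in place, Doctrine can reuse mapping metadata, parsed DQL, and (where requested) query results instead of rebuilding them on every request.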
Reactive Data Streams

Packt
03 Jun 2015
11 min read
In this article by Shiti Saxena, author of the book Mastering Play Framework for Scala, we will discuss the Iteratee approach used to handle such situations. This article also covers the basics of handling data streams with a brief explanation of the following topics: Iteratees Enumerators Enumeratees (For more resources related to this topic, see here.) Iteratee Iteratee is defined as a trait, Iteratee[E, +A], where E is the input type and A is the result type. The state of an Iteratee is represented by an instance of Step, which is defined as follows: sealed trait Step[E, +A] {def it: Iteratee[E, A] = this match {case Step.Done(a, e) => Done(a, e)case Step.Cont(k) => Cont(k)case Step.Error(msg, e) => Error(msg, e)}}object Step {//done state of an iterateecase class Done[+A, E](a: A, remaining: Input[E]) extends Step[E, A]//continuing state of an iteratee.case class Cont[E, +A](k: Input[E] => Iteratee[E, A]) extendsStep[E, A]//error state of an iterateecase class Error[E](msg: String, input: Input[E]) extends Step[E,Nothing]} The input used here represents an element of the data stream, which can be empty, an element, or an end of file indicator. Therefore, Input is defined as follows: sealed trait Input[+E] {def map[U](f: (E => U)): Input[U] = this match {case Input.El(e) => Input.El(f(e))case Input.Empty => Input.Emptycase Input.EOF => Input.EOF}}object Input {//An input elementcase class El[+E](e: E) extends Input[E]// An empty inputcase object Empty extends Input[Nothing]// An end of file inputcase object EOF extends Input[Nothing]} An Iteratee is an immutable data type and each result of processing an input is a new Iteratee with a new state. To handle the possible states of an Iteratee, there is a predefined helper object for each state. They are: Cont Done Error Let's see the definition of the readLine method, which utilizes these objects: def readLine(line: List[Array[Byte]] = Nil): Iteratee[Array[Byte],String] = Cont {case Input.El(data) => {val s = data.takeWhile(_ != 'n')if (s.length == data.length) {readLine(s :: line)} else {Done(new String(Array.concat((s :: line).reverse: _*),"UTF-8").trim(), elOrEmpty(data.drop(s.length + 1)))}}case Input.EOF => {Error("EOF found while reading line", Input.Empty)}case Input.Empty => readLine(line)} The readLine method is responsible for reading a line and returning an Iteratee. As long as there are more bytes to be read, the readLine method is called recursively. On completing the process, an Iteratee with a completed state (Done) is returned, else an Iteratee with state continuous (Cont) is returned. In case the method encounters EOF, an Iteratee with state Error is returned. In addition to these, Play Framework exposes a companion Iteratee object, which has helper methods to deal with Iteratees. The API exposed through the Iteratee object is documented at https://www.playframework.com/documentation/2.3.x/api/scala/index.html#play.api.libs.iteratee.Iteratee$. The Iteratee object is also used internally within the framework to provide some key features. For example, consider the request body parsers. 
The apply method of the BodyParser object is defined as follows: def apply[T](debugName: String)(f: RequestHeader =>Iteratee[Array[Byte], Either[Result, T]]): BodyParser[T] = newBodyParser[T] {def apply(rh: RequestHeader) = f(rh)override def toString = "BodyParser(" + debugName + ")"} So, to define BodyParser[T], we need to define a method that accepts RequestHeader and returns an Iteratee whose input is an Array[Byte] and results in Either[Result,T]. Let's look at some of the existing implementations to understand how this works. The RawBuffer parser is defined as follows: def raw(memoryThreshold: Int): BodyParser[RawBuffer] =BodyParser("raw, memoryThreshold=" + memoryThreshold) { request =>import play.core.Execution.Implicits.internalContextval buffer = RawBuffer(memoryThreshold)Iteratee.foreach[Array[Byte]](bytes => buffer.push(bytes)).map {_ =>buffer.close()Right(buffer)}} The RawBuffer parser uses Iteratee.forEach method and pushes the input received into a buffer. The file parser is defined as follows: def file(to: File): BodyParser[File] = BodyParser("file, to=" +to) { request =>import play.core.Execution.Implicits.internalContextIteratee.fold[Array[Byte], FileOutputStream](newFileOutputStream(to)) {(os, data) =>os.write(data)os}.map { os =>os.close()Right(to)}} The file parser uses the Iteratee.fold method to create FileOutputStream of the incoming data. Now, let's see the implementation of Enumerator and how these two pieces fit together. Enumerator Similar to the Iteratee, an Enumerator is also defined through a trait and backed by an object of the same name: trait Enumerator[E] {parent =>def apply[A](i: Iteratee[E, A]): Future[Iteratee[E, A]]...}object Enumerator{def apply[E](in: E*): Enumerator[E] = in.length match {case 0 => Enumerator.emptycase 1 => new Enumerator[E] {def apply[A](i: Iteratee[E, A]): Future[Iteratee[E, A]] =i.pureFoldNoEC {case Step.Cont(k) => k(Input.El(in.head))case _ => i}}case _ => new Enumerator[E] {def apply[A](i: Iteratee[E, A]): Future[Iteratee[E, A]] =enumerateSeq(in, i)}}...} Observe that the apply method of the trait and its companion object are different. The apply method of the trait accepts Iteratee[E, A] and returns Future[Iteratee[E, A]], while that of the companion object accepts a sequence of type E and returns an Enumerator[E]. Now, let's define a simple data flow using the companion object's apply method; first, get the character count in a given (Seq[String]) line: val line: String = "What we need is not the will to believe, butthe wish to find out."val words: Seq[String] = line.split(" ")val src: Enumerator[String] = Enumerator(words: _*)val sink: Iteratee[String, Int] = Iteratee.fold[String,Int](0)((x, y) => x + y.length)val flow: Future[Iteratee[String, Int]] = src(sink)val result: Future[Int] = flow.flatMap(_.run) The variable result has the Future[Int] type. We can now process this to get the actual count. In the preceding code snippet, we got the result by following these steps: Building an Enumerator using the companion object's apply method: val src: Enumerator[String] = Enumerator(words: _*) Getting Future[Iteratee[String, Int]] by binding the Enumerator to an Iteratee: val flow: Future[Iteratee[String, Int]] = src(sink) Flattening Future[Iteratee[String,Int]] and processing it: val result: Future[Int] = flow.flatMap(_.run) Fetching the result from Future[Int]: Thankfully, Play provides a shortcut method by merging steps 2 and 3 so that we don't have to repeat the same process every time. 
The method is represented by the |>>> symbol. Using the shortcut method, our code is reduced to this: val src: Enumerator[String] = Enumerator(words: _*)val sink: Iteratee[String, Int] = Iteratee.fold[String, Int](0)((x, y)=> x + y.length)val result: Future[Int] = src |>>> sink Why use this when we can simply use the methods of the data type? In this case, do we use the length method of String to get the same value (by ignoring whitespaces)? In this example, we are getting the data as a single String but this will not be the only scenario. We need ways to process continuous data, such as a file upload, or feed data from various networking sites, and so on. For example, suppose our application receives heartbeats at a fixed interval from all the devices (such as cameras, thermometers, and so on) connected to it. We can simulate a data stream using the Enumerator.generateM method: val dataStream: Enumerator[String] = Enumerator.generateM {Promise.timeout(Some("alive"), 100 millis)} In the preceding snippet, the "alive" String is produced every 100 milliseconds. The function passed to the generateM method is called whenever the Iteratee bound to the Enumerator is in the Cont state. This method is used internally to build enumerators and can come in handy when we want to analyze the processing for an expected data stream. An Enumerator can be created from a file, InputStream, or OutputStream. Enumerators can be concatenated or interleaved. The Enumerator API is documented at https://www.playframework.com/documentation/2.3.x/api/scala/index.html#play.api.libs.iteratee.Enumerator$. Using the Concurrent object The Concurrent object is a helper that provides utilities for using Iteratees, enumerators, and Enumeratees concurrently. Two of its important methods are: Unicast: It is useful when sending data to a single iterate. Broadcast: It facilitates sending the same data to multiple Iteratees concurrently. Unicast For example, the character count example in the previous section can be implemented as follows: val unicastSrc = Concurrent.unicast[String](channel =>channel.push(line))val unicastResult: Future[Int] = unicastSrc |>>> sink The unicast method accepts the onStart, onError, and onComplete handlers. In the preceding code snippet, we have provided the onStart method, which is mandatory. The signature of unicast is this: def unicast[E](onStart: (Channel[E]) ⇒ Unit,onComplete: ⇒ Unit = (),onError: (String, Input[E]) ⇒ Unit = (_: String, _: Input[E])=> ())(implicit ec: ExecutionContext): Enumerator[E] {…} So, to add a log for errors, we can define the onError handler as follows: val unicastSrc2 = Concurrent.unicast[String](channel => channel.push(line),onError = { (msg, str) => Logger.error(s"encountered $msg for$str")}) Now, let's see how broadcast works. Broadcast The broadcast[E] method creates an enumerator and a channel and returns a (Enumerator[E], Channel[E]) tuple. 
The enumerator and channel thus obtained can be used to broadcast data to multiple Iteratees:

val (broadcastSrc: Enumerator[String], channel: Concurrent.Channel[String]) = Concurrent.broadcast[String]

private val vowels: Seq[Char] = Seq('a', 'e', 'i', 'o', 'u')

def getVowels(str: String): String = {
  val result = str.filter(c => vowels.contains(c))
  result
}

def getConsonants(str: String): String = {
  val result = str.filterNot(c => vowels.contains(c))
  result
}

val vowelCount: Iteratee[String, Int] = Iteratee.fold[String, Int](0)((x, y) => x + getVowels(y).length)
val consonantCount: Iteratee[String, Int] = Iteratee.fold[String, Int](0)((x, y) => x + getConsonants(y).length)

val vowelInfo: Future[Int] = broadcastSrc |>>> vowelCount
val consonantInfo: Future[Int] = broadcastSrc |>>> consonantCount

words.foreach(w => channel.push(w))
channel.end()

vowelInfo onSuccess { case count => println(s"vowels:$count") }
consonantInfo onSuccess { case count => println(s"consonants:$count") }

Enumeratee

Enumeratee is also defined using a trait and its companion object with the same Enumeratee name. It is defined as follows:

trait Enumeratee[From, To] {
  ...
  def applyOn[A](inner: Iteratee[To, A]): Iteratee[From, Iteratee[To, A]]
  def apply[A](inner: Iteratee[To, A]): Iteratee[From, Iteratee[To, A]] = applyOn[A](inner)
  ...
}

An Enumeratee transforms the Iteratee given to it as input and returns a new Iteratee. Let's look at a method that defines an Enumeratee by implementing the applyOn method. An Enumeratee's flatten method accepts Future[Enumeratee] and returns another Enumeratee, which is defined as follows:

def flatten[From, To](futureOfEnumeratee: Future[Enumeratee[From, To]]) = new Enumeratee[From, To] {
  def applyOn[A](it: Iteratee[To, A]): Iteratee[From, Iteratee[To, A]] =
    Iteratee.flatten(futureOfEnumeratee.map(_.applyOn[A](it))(dec))
}

In the preceding snippet, applyOn is called on the Enumeratee whose future is passed and dec is defaultExecutionContext. Defining an Enumeratee using the companion object is a lot simpler. The companion object has a lot of methods to deal with enumeratees, such as map, transform, collect, take, filter, and so on. The API is documented at https://www.playframework.com/documentation/2.3.x/api/scala/index.html#play.api.libs.iteratee.Enumeratee$.

Let's define an Enumeratee by working through a problem. The example we used in the previous section to find the count of vowels and consonants will not work correctly if a vowel is capitalized in a sentence, that is, the result of src |>>> vowelCount will be incorrect when the line variable is defined as follows:

val line: String = "What we need is not the will to believe, but the wish to find out.".toUpperCase

To fix this, let's alter the case of all the characters in the data stream to lowercase. We can use an Enumeratee to update the input provided to the Iteratee. Now, let's define an Enumeratee to return a given string in lowercase:

val toSmallCase: Enumeratee[String, String] = Enumeratee.map[String] { s => s.toLowerCase }

There are two ways to add an Enumeratee to the dataflow. It can be bound to the following:
Enumerators
Iteratees

Summary

In this article, we discussed the concept of Iteratees, Enumerators, and Enumeratees. We also saw how they were implemented in Play Framework and used internally.

Resources for Article: Further resources on this subject: Play Framework: Data Validation Using Controllers [Article] Play Framework: Introduction to Writing Modules [Article] Integrating with other Frameworks [Article]
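Picking up the final point from the Enumeratee section above, the two bindings might look like the following sketch. It assumes the src, vowelCount, and toSmallCase values defined in this article, plus the &> (through) operator on Enumerator and the &>> (transform) operator on Enumeratee from Play 2.3's iteratee library:

// Bound to the Enumerator: lowercase the stream before it reaches the sink
val lowerCasedSrc: Enumerator[String] = src &> toSmallCase
val vowelsFixed: Future[Int] = lowerCasedSrc |>>> vowelCount

// Bound to the Iteratee: adapt the sink so it lowercases whatever it receives
val lowerCasingSink: Iteratee[String, Int] = toSmallCase &>> vowelCount
val vowelsFixed2: Future[Int] = src |>>> lowerCasingSink

Both variants produce the same count; the choice usually comes down to whether the transformation conceptually belongs to the data source or to the consumer.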
Introducing JAX-RS API

Packt
21 Sep 2015
25 min read
 In this article by Jobinesh Purushothaman, author of the book, RESTful Java Web Services, Second Edition, we will see that there are many tools and frameworks available in the market today for building RESTful web services. There are some recent developments with respect to the standardization of various framework APIs by providing unified interfaces for a variety of implementations. Let's take a quick look at this effort. (For more resources related to this topic, see here.) As you may know, Java EE is the industry standard for developing portable, robust, scalable, and secure server-side Java applications. The Java EE 6 release took the first step towards standardizing RESTful web service APIs by introducing a Java API for RESTful web services (JAX-RS). JAX-RS is an integral part of the Java EE platform, which ensures portability of your REST API code across all Java EE-compliant application servers. The first release of JAX-RS was based on JSR 311. The latest version is JAX-RS 2 (based on JSR 339), which was released as part of the Java EE 7 platform. There are multiple JAX-RS implementations available today by various vendors. Some of the popular JAX-RS implementations are as follows: Jersey RESTful web service framework: This framework is an open source framework for developing RESTful web services in Java. It serves as a JAX-RS reference implementation. You can learn more about this project at https://jersey.java.net. Apache CXF: This framework is an open source web services framework. CXF supports both JAX-WS and JAX-RS web services. To learn more about CXF, refer to http://cxf.apache.org. RESTEasy: This framework is an open source project from JBoss, which provides various modules to help you build a RESTful web service. To learn more about RESTEasy, refer to http://resteasy.jboss.org. Restlet: This framework is a lightweight, open source RESTful web service framework. It has good support for building both scalable RESTful web service APIs and lightweight REST clients, which suits mobile platforms well. You can learn more about Restlet at http://restlet.com. Remember that you are not locked down to any specific vendor here, the RESTful web service APIs that you build using JAX-RS will run on any JAX-RS implementation as long as you do not use any vendor-specific APIs in the code. JAX-RS annotations                                      The main goal of the JAX-RS specification is to make the RESTful web service development easier than it has been in the past. As JAX-RS is a part of the Java EE platform, your code becomes portable across all Java EE-compliant servers. Specifying the dependency of the JAX-RS API To use JAX-RS APIs in your project, you need to add the javax.ws.rs-api JAR file to the class path. If the consuming project uses Maven for building the source, the dependency entry for the javax.ws.rs-api JAR file in the Project Object Model (POM) file may look like the following: <dependency> <groupId>javax.ws.rs</groupId> <artifactId>javax.ws.rs-api</artifactId> <version>2.0.1</version><!-- set the tight version --> <scope>provided</scope><!-- compile time dependency --> </dependency> Using JAX-RS annotations to build RESTful web services Java annotations provide the metadata for your Java class, which can be used during compilation, during deployment, or at runtime in order to perform designated tasks. The use of annotations allows us to create RESTful web services as easily as we develop a POJO class. 
Here, we leave the interception of the HTTP requests and representation negotiations to the framework and concentrate on the business rules necessary to solve the problem at hand. If you are not familiar with Java annotations, go through the tutorial available at http://docs.oracle.com/javase/tutorial/java/annotations/. Annotations for defining a RESTful resource REST resources are the fundamental elements of any RESTful web service. A REST resource can be defined as an object that is of a specific type with the associated data and is optionally associated to other resources. It also exposes a set of standard operations corresponding to the HTTP method types such as the HEAD, GET, POST, PUT, and DELETE methods. @Path The @javax.ws.rs.Path annotation indicates the URI path to which a resource class or a class method will respond. The value that you specify for the @Path annotation is relative to the URI of the server where the REST resource is hosted. This annotation can be applied at both the class and the method levels. A @Path annotation value is not required to have leading or trailing slashes (/), as you may see in some examples. The JAX-RS runtime will parse the URI path templates in the same way even if they have leading or trailing slashes. Specifying the @Path annotation on a resource class The following code snippet illustrates how you can make a POJO class respond to a URI path template containing the /departments path fragment: import javax.ws.rs.Path; @Path("departments") public class DepartmentService { //Rest of the code goes here } The /department path fragment that you see in this example is relative to the base path in the URI. The base path typically takes the following URI pattern: http://host:port/<context-root>/<application-path>. Specifying the @Path annotation on a resource class method The following code snippet shows how you can specify @Path on a method in a REST resource class. Note that for an annotated method, the base URI is the effective URI of the containing class. For instance, you will use the URI of the following form to invoke the getTotalDepartments() method defined in the DepartmentService class: /departments/count, where departments is the @Path annotation set on the class. import javax.ws.rs.GET; import javax.ws.rs.Path; import javax.ws.rs.Produces; @Path("departments") public class DepartmentService { @GET @Path("count") @Produces("text/plain") public Integer getTotalDepartments() { return findTotalRecordCount(); } //Rest of the code goes here } Specifying variables in the URI path template It is very common that a client wants to retrieve data for a specific object by passing the desired parameter to the server. JAX-RS allows you to do this via the URI path variables as discussed here. The URI path template allows you to define variables that appear as placeholders in the URI. These variables would be replaced at runtime with the values set by the client. The following example illustrates the use of the path variable to request for a specific department resource. The URI path template looks like /departments/{id}. At runtime, the client can pass an appropriate value for the id parameter to get the desired resource from the server. For instance, the URI path of the /departments/10 format returns the IT department details to the caller. The following code snippet illustrates how you can pass the department ID as a path variable for deleting a specific department record. The path URI looks like /departments/10. 
import javax.ws.rs.Path; import javax.ws.rs.DELETE; @Path("departments") public class DepartmentService { @DELETE @Path("{id}") public void removeDepartment(@PathParam("id") short id) { removeDepartmentEntity(id); } //Other methods removed for brevity } In the preceding code snippet, the @PathParam annotation is used for copying the value of the path variable to the method parameter. Restricting values for path variables with regular expressions JAX-RS lets you use regular expressions in the URI path template for restricting the values set for the path variables at runtime by the client. By default, the JAX-RS runtime ensures that all the URI variables match the following regular expression: [^/]+?. The default regular expression allows the path variable to take any character except the forward slash (/). What if you want to override this default regular expression imposed on the path variable values? Good news is that JAX-RS lets you specify your own regular expression for the path variables. For example, you can set the regular expression as given in the following code snippet in order to ensure that the department name variable present in the URI path consists only of lowercase and uppercase alphanumeric characters: @DELETE @Path("{name: [a-zA-Z][a-zA-Z_0-9]}") public void removeDepartmentByName(@PathParam("name") String deptName) { //Method implementation goes here } If the path variable does not match the regular expression set of the resource class or method, the system reports the status back to the caller with an appropriate HTTP status code, such as 404 Not Found, which tells the caller that the requested resource could not be found at this moment. Annotations for specifying request-response media types The Content-Type header field in HTTP describes the body's content type present in the request and response messages. The content types are represented using the standard Internet media types. A RESTful web service makes use of this header field to indicate the type of content in the request or response message body. JAX-RS allows you to specify which Internet media types of representations a resource can produce or consume by using the @javax.ws.rs.Produces and @javax.ws.rs.Consumes annotations, respectively. @Produces The @javax.ws.rs.Produces annotation is used for defining the Internet media type(s) that a REST resource class method can return to the client. You can define this either at the class level (which will get defaulted for all methods) or the method level. The method-level annotations override the class-level annotations. The possible Internet media types that a REST API can produce are as follows: application/atom+xml application/json application/octet-stream application/svg+xml application/xhtml+xml application/xml text/html text/plain text/xml The following example uses the @Produces annotation at the class level in order to set the default response media type as JSON for all resource methods in this class. At runtime, the binding provider will convert the Java representation of the return value to the JSON format. import javax.ws.rs.Path; import javax.ws.rs.Produces; import javax.ws.rs.core.MediaType; @Path("departments") @Produces(MediaType.APPLICATION_JSON) public class DepartmentService{ //Class implementation goes here... } @Consumes The @javax.ws.rs.Consumes annotation defines the Internet media type(s) that the resource class methods can accept. 
You can define the @Consumes annotation either at the class level (which will get defaulted for all methods) or the method level. The method-level annotations override the class-level annotations. The possible Internet media types that a REST API can consume are as follows: application/atom+xml application/json application/octet-stream application/svg+xml application/xhtml+xml application/xml text/html text/plain text/xml multipart/form-data application/x-www-form-urlencoded The following example illustrates how you can use the @Consumes attribute to designate a method in a class to consume a payload presented in the JSON media type. The binding provider will copy the JSON representation of an input message to the Department parameter of the createDepartment() method. import javax.ws.rs.Consumes; import javax.ws.rs.core.MediaType; import javax.ws.rs.POST; @POST @Consumes(MediaType.APPLICATION_JSON) public void createDepartment(Department entity) { //Method implementation goes here… } The javax.ws.rs.core.MediaType class defines constants for all media types supported in JAX-RS. To learn more about the MediaType class, visit the API documentation available at http://docs.oracle.com/javaee/7/api/javax/ws/rs/core/MediaType.html. Annotations for processing HTTP request methods In general, RESTful web services communicate over HTTP with the standard HTTP verbs (also known as method types) such as GET, PUT, POST, DELETE, HEAD, and OPTIONS. @GET A RESTful system uses the HTTP GET method type for retrieving the resources referenced in the URI path. The @javax.ws.rs.GET annotation designates a method of a resource class to respond to the HTTP GET requests. The following code snippet illustrates the use of the @GET annotation to make a method respond to the HTTP GET request type. In this example, the REST URI for accessing the findAllDepartments() method may look like /departments. The complete URI path may take the following URI pattern: http://host:port/<context-root>/<application-path>/departments. //imports removed for brevity @Path("departments") public class DepartmentService { @GET @Produces(MediaType.APPLICATION_JSON) public List<Department> findAllDepartments() { //Find all departments from the data store List<Department> departments = findAllDepartmentsFromDB(); return departments; } //Other methods removed for brevity } @PUT The HTTP PUT method is used for updating or creating the resource pointed by the URI. The @javax.ws.rs.PUT annotation designates a method of a resource class to respond to the HTTP PUT requests. The PUT request generally has a message body carrying the payload. The value of the payload could be any valid Internet media type such as the JSON object, XML structure, plain text, HTML content, or binary stream. When a request reaches a server, the framework intercepts the request and directs it to the appropriate method that matches the URI path and the HTTP method type. The request payload will be mapped to the method parameter as appropriate by the framework. The following code snippet shows how you can use the @PUT annotation to designate the editDepartment() method to respond to the HTTP PUT request. 
The payload present in the message body will be converted and copied to the department parameter by the framework: @PUT @Path("{id}") @Consumes(MediaType.APPLICATION_JSON) public void editDepartment(@PathParam("id") Short id, Department department) { //Updates department entity to data store updateDepartmentEntity(id, department); } @POST The HTTP POST method posts data to the server. Typically, this method type is used for creating a resource. The @javax.ws.rs.POST annotation designates a method of a resource class to respond to the HTTP POST requests. The following code snippet shows how you can use the @POST annotation to designate the createDepartment() method to respond to the HTTP POST request. The payload present in the message body will be converted and copied to the department parameter by the framework: @POST public void createDepartment(Department department) { //Create department entity in data store createDepartmentEntity(department); } @DELETE The HTTP DELETE method deletes the resource pointed by the URI. The @javax.ws.rs.DELETE annotation designates a method of a resource class to respond to the HTTP DELETE requests. The following code snippet shows how you can use the @DELETE annotation to designate the removeDepartment() method to respond to the HTTP DELETE request. The department ID is passed as the path variable in this example. @DELETE @Path("{id}") public void removeDepartment(@PathParam("id") Short id) { //remove department entity from data store removeDepartmentEntity(id); } @HEAD The @javax.ws.rs.HEAD annotation designates a method to respond to the HTTP HEAD requests. This method is useful for retrieving the metadata present in the response headers, without having to retrieve the message body from the server. You can use this method to check whether a URI pointing to a resource is active or to check the content size by using the Content-Length response header field, and so on. The JAX-RS runtime will offer the default implementations for the HEAD method type if the REST resource is missing explicit implementation. The default implementation provided by runtime for the HEAD method will call the method designated for the GET request type, ignoring the response entity retuned by the method. @OPTIONS The @javax.ws.rs.OPTIONS annotation designates a method to respond to the HTTP OPTIONS requests. This method is useful for obtaining a list of HTTP methods allowed on a resource. The JAX-RS runtime will offer a default implementation for the OPTIONS method type, if the REST resource is missing an explicit implementation. The default implementation offered by the runtime sets the Allow response header to all the HTTP method types supported by the resource. Annotations for accessing request parameters You can use this offering to extract the following parameters from a request: a query, URI path, form, cookie, header, and matrix. Mostly, these parameters are used in conjunction with the GET, POST, PUT, and DELETE methods. @PathParam A URI path template, in general, has a URI part pointing to the resource. It can also take the path variables embedded in the syntax; this facility is used by clients to pass parameters to the REST APIs as appropriate. The @javax.ws.rs.PathParam annotation injects (or binds) the value of the matching path parameter present in the URI path template into a class field, a resource class bean property (the getter method for accessing the attribute), or a method parameter. 
Typically, this annotation is used in conjunction with the HTTP method type annotations such as @GET, @POST, @PUT, and @DELETE. The following example illustrates the use of the @PathParam annotation to read the value of the path parameter, id, into the deptId method parameter. The URI path template for this example looks like /departments/{id}: //Other imports removed for brevity javax.ws.rs.PathParam @Path("departments") public class DepartmentService { @DELETE @Path("{id}") public void removeDepartment(@PathParam("id") Short deptId) { removeDepartmentEntity(deptId); } //Other methods removed for brevity } The REST API call to remove the department resource identified by id=10 looks like DELETE /departments/10 HTTP/1.1. We can also use multiple variables in a URI path template. For example, we can have the URI path template embedding the path variables to query a list of departments from a specific city and country, which may look like /departments/{country}/{city}. The following code snippet illustrates the use of @PathParam to extract variable values from the preceding URI path template: @Produces(MediaType.APPLICATION_JSON) @Path("{country}/{city} ") public List<Department> findAllDepartments( @PathParam("country") String countyCode, @PathParam("city") String cityCode) { //Find all departments from the data store for a country //and city List<Department> departments = findAllMatchingDepartmentEntities(countyCode, cityCode ); return departments; } @QueryParam The @javax.ws.rs.QueryParam annotation injects the value(s) of a HTTP query parameter into a class field, a resource class bean property (the getter method for accessing the attribute), or a method parameter. The following example illustrates the use of @QueryParam to extract the value of the desired query parameter present in the URI. This example extracts the value of the query parameter, name, from the request URI and copies the value into the deptName method parameter. The URI that accesses the IT department resource looks like /departments?name=IT: @GET @Produces(MediaType.APPLICATION_JSON) public List<Department> findAllDepartmentsByName(@QueryParam("name") String deptName) { List<Department> depts= findAllMatchingDepartmentEntities (deptName); return depts; } @MatrixParam Matrix parameters are another way of defining parameters in the URI path template. The matrix parameters take the form of name-value pairs in the URI path, where each pair is preceded by semicolon (;). For instance, the URI path that uses a matrix parameter to list all departments in Bangalore city looks like /departments;city=Bangalore. The @javax.ws.rs.MatrixParam annotation injects the matrix parameter value into a class field, a resource class bean property (the getter method for accessing the attribute), or a method parameter. The following code snippet demonstrates the use of the @MatrixParam annotation to extract the matrix parameters present in the request. The URI path used in this example looks like /departments;name=IT;city=Bangalore. @GET @Produces(MediaType.APPLICATION_JSON) @Path("matrix") public List<Department> findAllDepartmentsByNameWithMatrix(@MatrixParam("name") String deptName, @MatrixParam("city") String locationCode) { List<Department> depts=findAllDepartmentsFromDB(deptName, city); return depts; } You can use PathParam, QueryParam, and MatrixParam to pass the desired search parameters to the REST APIs. Now, you may ask when to use what? 
Although there are no strict rules here, a very common practice followed by many is to use PathParam to drill down to the entity class hierarchy. For example, you may use the URI of the following form to identify an employee working in a specific department: /departments/{dept}/employees/{id}. QueryParam can be used for specifying attributes to locate the instance of a class. For example, you may use URI with QueryParam to identify employees who have joined on January 1, 2015, which may look like /employees?doj=2015-01-01. The MatrixParam annotation is not used frequently. This is useful when you need to make a complex REST style query to multiple levels of resources and subresources. MatrixParam is applicable to a particular path element, while the query parameter is applicable to the entire request. @HeaderParam The HTTP header fields provide necessary information about the request and response contents in HTTP. For example, the header field, Content-Length: 348, for an HTTP request says that the size of the request body content is 348 octets (8-bit bytes). The @javax.ws.rs.HeaderParam annotation injects the header values present in the request into a class field, a resource class bean property (the getter method for accessing the attribute), or a method parameter. The following example extracts the referrer header parameter and logs it for audit purposes. The referrer header field in HTTP contains the address of the previous web page from which a request to the currently processed page originated: @POST public void createDepartment(@HeaderParam("Referer") String referer, Department entity) { logSource(referer); createDepartmentInDB(department); } Remember that HTTP provides a very wide selection of headers that cover most of the header parameters that you are looking for. Although you can use custom HTTP headers to pass some application-specific data to the server, try using standard headers whenever possible. Further, avoid using a custom header for holding properties specific to a resource, or the state of the resource, or parameters directly affecting the resource. @CookieParam The @javax.ws.rs.CookieParam annotation injects the matching cookie parameters present in the HTTP headers into a class field, a resource class bean property (the getter method for accessing the attribute), or a method parameter. The following code snippet uses the Default-Dept cookie parameter present in the request to return the default department details: @GET @Path("cook") @Produces(MediaType.APPLICATION_JSON) public Department getDefaultDepartment(@CookieParam("Default-Dept") short departmentId) { Department dept=findDepartmentById(departmentId); return dept; } @FormParam The @javax.ws.rs.FormParam annotation injects the matching HTML form parameters present in the request body into a class field, a resource class bean property (the getter method for accessing the attribute), or a method parameter. The request body carrying the form elements must have the content type specified as application/x-www-form-urlencoded. Consider the following HTML form that contains the data capture form for a department entity. 
@FormParam

The @javax.ws.rs.FormParam annotation injects the matching HTML form parameters present in the request body into a class field, a resource class bean property (the getter method for accessing the attribute), or a method parameter. The request body carrying the form elements must have the content type specified as application/x-www-form-urlencoded. Consider the following HTML form, which allows the user to enter the department entity details:

<!DOCTYPE html>
<html>
<head>
    <title>Create Department</title>
</head>
<body>
    <form method="POST" action="/resources/departments">
        Department Id: <input type="text" name="departmentId"> <br>
        Department Name: <input type="text" name="departmentName"> <br>
        <input type="submit" value="Add Department" />
    </form>
</body>
</html>

Upon clicking the submit button on the HTML form, the department details that you entered will be posted to the REST URI, /resources/departments. The following code snippet shows the use of the @FormParam annotation for extracting the HTML form fields and copying them to the resource class method parameters:

@Path("departments")
public class DepartmentService {
    @POST
    //Specifies content type as "application/x-www-form-urlencoded"
    @Consumes(MediaType.APPLICATION_FORM_URLENCODED)
    public void createDepartment(
        @FormParam("departmentId") short departmentId,
        @FormParam("departmentName") String departmentName) {
        createDepartmentEntity(departmentId, departmentName);
    }
}

@DefaultValue

The @javax.ws.rs.DefaultValue annotation specifies a default value for the request parameters accessed using one of the following annotations: PathParam, QueryParam, MatrixParam, CookieParam, FormParam, or HeaderParam. The default value is used if no matching parameter value is found for the variables annotated using one of the preceding annotations. The following REST resource method will make use of the default values set for the from and to method parameters if the corresponding query parameters are missing in the URI path:

@GET
@Produces(MediaType.APPLICATION_JSON)
public List<Department> findAllDepartmentsInRange(
    @DefaultValue("0") @QueryParam("from") Integer from,
    @DefaultValue("100") @QueryParam("to") Integer to) {
    return findAllDepartmentEntitiesInRange(from, to);
}

@Context

The JAX-RS runtime offers different context objects, which can be used for accessing information associated with the resource class, operating environment, and so on. You may find various context objects that hold information associated with the URI path, request, HTTP header, security, and so on. Some of these context objects also provide utility methods for dealing with the request and response content. JAX-RS allows you to reference the desired context objects in the code via dependency injection. JAX-RS provides the @javax.ws.rs.Context annotation that injects the matching context object into the target field. You can specify the @Context annotation on a class field, a resource class bean property (the getter method for accessing the attribute), or a method parameter. The following example illustrates the use of the @Context annotation to inject the javax.ws.rs.core.UriInfo context object into a method parameter. The UriInfo instance provides access to the application and request URI information.
This example uses UriInfo to read the value of the path parameter, name, from a request such as /departments/IT (the URI path template being /departments/{name}):

@GET
@Path("{name}")
@Produces(MediaType.APPLICATION_JSON)
public List<Department> findAllDepartmentsByName(@Context UriInfo uriInfo) {
    String deptName = uriInfo.getPathParameters().getFirst("name");
    List<Department> depts = findAllMatchingDepartmentEntities(deptName);
    return depts;
}

Here is a list of the commonly used classes and interfaces that can be injected using the @Context annotation:

- javax.ws.rs.core.Application: This class defines the components of a JAX-RS application and supplies additional metadata
- javax.ws.rs.core.UriInfo: This interface provides access to the application and request URI information
- javax.ws.rs.core.Request: This interface provides methods for request processing, such as reading the method type and precondition evaluation
- javax.ws.rs.core.HttpHeaders: This interface provides access to the HTTP header information
- javax.ws.rs.core.SecurityContext: This interface provides access to security-related information
- javax.ws.rs.ext.Providers: This interface offers the runtime lookup of provider instances such as MessageBodyReader, MessageBodyWriter, ExceptionMapper, and ContextResolver
- javax.ws.rs.ext.ContextResolver<T>: This interface supplies the requested context to the resource classes and other providers
- javax.servlet.http.HttpServletRequest: This interface provides the client request information for a servlet
- javax.servlet.http.HttpServletResponse: This interface is used for sending a response to a client
- javax.servlet.ServletContext: This interface provides methods for a servlet to communicate with its servlet container
- javax.servlet.ServletConfig: This interface carries the servlet configuration parameters
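As a further illustration of the same injection mechanism, the following sketch injects javax.ws.rs.core.HttpHeaders to read a request header; the logClientDetails() and findAllDepartmentEntities() helpers are assumed for illustration only:

@GET
@Produces(MediaType.APPLICATION_JSON)
public List<Department> findAllDepartments(@Context HttpHeaders headers) {
    //getRequestHeader() returns all the values set for the given header name
    List<String> userAgent = headers.getRequestHeader("User-Agent");
    logClientDetails(userAgent);
    return findAllDepartmentEntities();
}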
@BeanParam

The @javax.ws.rs.BeanParam annotation allows you to inject all matching request parameters into a single bean object. The @BeanParam annotation can be set on a class field, a resource class bean property (the getter method for accessing the attribute), or a method parameter. The bean class can have fields or properties annotated with one of the request parameter annotations, namely @PathParam, @QueryParam, @MatrixParam, @HeaderParam, @CookieParam, or @FormParam. Apart from the request parameter annotations, the bean can have the @Context annotation if there is a need.

Consider the example that we discussed for @FormParam. The createDepartment() method that we used in that example has two parameters annotated with @FormParam:

public void createDepartment(
    @FormParam("departmentId") short departmentId,
    @FormParam("departmentName") String departmentName)

Let's see how we can use @BeanParam for the preceding method to give it a more logical, meaningful signature by grouping all the related fields into an aggregator class, thereby avoiding too many parameters in the method signature. The DepartmentBean class that we use for this example is as follows:

public class DepartmentBean {
    @FormParam("departmentId")
    private short departmentId;

    @FormParam("departmentName")
    private String departmentName;

    //getters and setters for the above fields
    //are not shown here to save space
}

The following code snippet demonstrates the use of the @BeanParam annotation to inject the DepartmentBean instance that contains all the FormParam values extracted from the request message body:

@POST
public void createDepartment(@BeanParam DepartmentBean deptBean) {
    createDepartmentEntity(deptBean.getDepartmentId(),
        deptBean.getDepartmentName());
}

@Encoded

By default, the JAX-RS runtime decodes all request parameters before injecting the extracted values into the target variables annotated with one of the following annotations: @FormParam, @PathParam, @MatrixParam, or @QueryParam. You can use @javax.ws.rs.Encoded to disable the automatic decoding of the parameter values. With the @Encoded annotation, the values of the parameters are provided in the encoded form itself. This annotation can be used on a class, method, or parameter. If you set this annotation on a method, it will disable decoding for all parameters defined for this method. You can use this annotation on a class to disable decoding for all parameters of all methods. In the following example, the value of the query parameter called name is injected into the method parameter in the URL-encoded form (without decoding). The method implementation should take care of decoding the value in such cases:

@GET
@Produces(MediaType.APPLICATION_JSON)
public List<Department> findAllDepartmentsByName(
    @Encoded @QueryParam("name") String deptName) {
    //Method body is removed for brevity
}

URL encoding converts a string into a valid URL format, which may contain alphabetic characters, numerals, and some special characters supported in the URL string. To learn about the URL specification, visit http://www.w3.org/Addressing/URL/url-spec.html.
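To make the decoding responsibility explicit, here is a sketch of how the preceding method could decode the injected value itself using java.net.URLDecoder; the exception handling shown is one possible choice, not a prescribed pattern:

@GET
@Produces(MediaType.APPLICATION_JSON)
public List<Department> findAllDepartmentsByName(
    @Encoded @QueryParam("name") String deptName) {
    try {
        //The injected value is still URL encoded (for example, R%26D),
        //so decode it before using it for the lookup
        String decodedName = java.net.URLDecoder.decode(deptName, "UTF-8");
        return findAllMatchingDepartmentEntities(decodedName);
    } catch (java.io.UnsupportedEncodingException e) {
        throw new WebApplicationException(e);
    }
}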
Summary

With the use of annotations, the JAX-RS API provides a simple development model for RESTful web service programming. In case you are interested in other Java RESTful web services books that Packt has in store for you, here are the links:

RESTful Java Web Services, Jose Sandoval
RESTful Java Web Services Security, René Enríquez, Andrés Salazar C

Resources for Article:

Further resources on this subject:
The Importance of Securing Web Services [article]
Understanding WebSockets and Server-sent Events in Detail [article]
Adding health checks [article]

Create Collisions on Objects

Packt
19 Feb 2016
10 min read
In this article by Julien Moreau-Mathis, the author of the book Babylon.JS Essentials, you will learn about creating and customizing scenes using the materials and meshes provided by the designer.

(For more resources related to this topic, see here.)

In this article, let's play with the gameplay itself and create interactions with the objects in the scene by creating collisions and physics simulations. Collisions are important to add realism to your scenes if you want to walk without crossing the walls. Moreover, to make your scenes more alive, let's introduce physics simulation with Babylon.js and, finally, see how easy it is to integrate these two notions in your scenes.

We are going to discuss the following topics:
Checking collisions in a scene
Physics simulation

Checking collisions in a scene

Starting from the concept, configuring and checking collisions in a scene can be done without mathematical notions. We all have the notions of gravity and ellipsoid, even if the ellipsoid word is not necessarily familiar.

How do collisions work in Babylon.js?

Let's start with a simple scene containing a camera, a light, a plane, and a box. The goal is to block the current camera of the scene in order not to cross the objects in the scene. In other words, we want the camera to stay above the plane and stop moving forward if it collides with the box.

To perform these actions, the collisions system provided by Babylon.js uses what we call a collider. A collider is based on the bounding box of an object. A bounding box simply represents the minimum and maximum positions of a mesh's vertices and is computed automatically by Babylon.js. This means that you have to do nothing particularly tricky to configure collisions on objects. All the collisions are based on these bounding boxes of each mesh to determine whether the camera should be blocked or not. You'll also see that the physics engines use the same kind of collider to simulate physics.

If the geometry of a mesh is changed by modifying the vertices' positions, the bounding box information can be refreshed by calling the following code:

myMesh.refreshBoundingInfo();

To draw the bounding box of an object, simply set the .showBoundingBox property of the mesh instance to true with the following code:

myMesh.showBoundingBox = true;

Configuring collisions in a scene

Let's practice with the Babylon.js collisions engine itself. You'll see that the engine is particularly well hidden, as you only have to enable checks on the scene and the objects.

First, configure the scene to enable collisions and wake the engine up. If the following property is false, all the next properties will be ignored by Babylon.js and the collisions engine will stay in stand-by mode, so it's easy to enable or disable collisions in a scene without modifying any other properties:

scene.collisionsEnabled = true; // Enable collisions in scene

Next, configure the camera to check collisions. The collisions engine will check collisions for all the rendered cameras that have collisions enabled. Here, we have only one camera to configure:

camera.checkCollisions = true; // Check collisions for THIS camera

To finish, just set the .checkCollisions property of each mesh (the plane and box) to true to activate collisions:

plane.checkCollisions = true;
box.checkCollisions = true;

Now, the collisions engine will check the collisions in the scene for the camera on both the plane and box meshes.
You guessed it; if you want to enable collisions only on the plane and you want the camera to walk across the box, you'll have to set the .checkCollisions property of the box to false:

plane.checkCollisions = true;
box.checkCollisions = false;

Configuring gravity and ellipsoid

In the previous section, the camera checks collisions on the plane and box but is not subjected to a famous force named gravity. To enrich the collisions in your scene, you can apply the gravitational force, for example, to go down the stairs.

First, enable gravity on the camera by setting the .applyGravity property to true:

camera.applyGravity = true; // Enable gravity on the camera

Customize the gravity direction by setting a BABYLON.Vector3 to the .gravity property of your scene:

scene.gravity = new BABYLON.Vector3(0.0, -9.81, 0.0); // To stay on earth

Of course, the gravity in space should be as follows:

scene.gravity = BABYLON.Vector3.Zero(); // No gravity in space

Don't hesitate to play with the values in order to adjust the gravity to your scene referential.

The last parameter to enrich the collisions in your scene is the camera's ellipsoid. The ellipsoid represents the camera's dimensions in the scene. In other words, it adjusts the collisions according to the X, Y, and Z axes of the ellipsoid (an ellipsoid is represented by a BABYLON.Vector3). For example, if the camera must measure 1.80 m (Y axis) and the minimum distance to collide on the X (sides) and Z (forward) axes must be 1 m, then the ellipsoid must be (X=1, Y=1.8, Z=1). Simply set the .ellipsoid property of the camera as follows:

camera.ellipsoid = new BABYLON.Vector3(1, 1.8, 1);

The default value of a camera's ellipsoid is (X=0.5, Y=1.0, Z=0.5). As for the gravity, don't hesitate to adjust the X, Y, and Z values to your scene referential.

Physics simulation

Physics simulation is pretty different from the collisions system, as it does not occur on the cameras but on the objects of the scene themselves. In other words, if physics is enabled (and configured) on a box, the box will interact with the other meshes in the scene and will try to reproduce real physical movements.

For example, let's take a sphere in the air. If you apply physics to the sphere, the sphere will fall until it collides with another mesh and, according to the given parameters, will tend to bounce and roll in the scene. The example files reproduce this behavior with a sphere that falls on the box in the middle of the scene.

Enabling physics in Babylon.js

In Babylon.js, physics simulations can be done using plugins only. Two plugins are available, using either the Cannon.js framework or the Oimo.js framework. These two frameworks are included in the Babylon.js GitHub repository in the dist folder.
Each scene has its own physics simulation system, which can be enabled with the following lines:

var gravity = new BABYLON.Vector3(0, -9.81, 0);
scene.enablePhysics(gravity, new BABYLON.OimoJSPlugin());
// or
scene.enablePhysics(gravity, new BABYLON.CannonJSPlugin());

The .enablePhysics(gravity, plugin) function takes the following two arguments:
The gravitational force of the scene to apply to the objects
The plugin to use:
Oimo.js: This is the new BABYLON.OimoJSPlugin() plugin
Cannon.js: This is the new BABYLON.CannonJSPlugin() plugin

To disable physics in a scene, simply call the .disablePhysicsEngine() function using the following code:

scene.disablePhysicsEngine();

Impostors

Once the physics simulations are enabled in a scene, you can configure the physics properties (or physics states) of the scene meshes. To configure the physics properties of a mesh, the BABYLON.Mesh class provides a function named .setPhysicsState(impostor, options). The parameters are as follows:

The impostor: Each kind of mesh has its own impostor according to its form. For example, a box will tend to slip while a sphere will tend to roll. There are several types of impostors.
The options: These options define the values used in the physics equations. They are the mass, friction, and restitution.

Let's have a box named box with a mass of 1 and set its physics properties:

box.setPhysicsState(BABYLON.PhysicsEngine.BoxImpostor, { mass: 1 });

This is all; the box is now configured to interact with other configured meshes by following the physics equations. Let's imagine that the box is in the air. It will fall until it collides with another configured mesh.

Now, let's have a sphere named sphere with a mass of 2 and set its physics properties:

sphere.setPhysicsState(BABYLON.PhysicsEngine.SphereImpostor, { mass: 2 });

You'll notice that the sphere, which is a particular kind of mesh, has its own impostor (the sphere impostor). In contrast to the box, the physics equations of the plugin will make the sphere roll, while the box will slip on other meshes. According to their masses, if the box and sphere collide, the sphere will tend to push the box harder.

The available impostors in Babylon.js are as follows:
The box impostor: BABYLON.PhysicsEngine.BoxImpostor
The sphere impostor: BABYLON.PhysicsEngine.SphereImpostor
The plane impostor: BABYLON.PhysicsEngine.PlaneImpostor
The cylinder impostor: BABYLON.PhysicsEngine.CylinderImpostor

In fact, in Babylon.js, the box, plane, and cylinder impostors are the same for the Cannon.js and Oimo.js plugins.

Regardless of the impostor parameter, the options parameter is the same. These options consist of three parameters, which are as follows:

The mass: This represents the mass of the mesh in the world. The heavier the mesh is, the harder it is to stop its movement.
The friction: This represents the force opposed to the meshes in contact. In other words, it represents how slippery the mesh is. To give you an idea, glass has a friction equal to 1. We can consider that the friction is in the range [0, 1].
The restitution: This represents how much the mesh will bounce on others. Imagine a ping-pong ball and its table. If the table's material is a carpet, the restitution will be small, but if the table's material is glass, the restitution will be maximum. A realistic interval for the restitution is [0, 1].

In the example files, these parameters are set and, if you play with them, you'll see that these three parameters are linked together in the physics equations.
Applying a force (impulse) on a mesh

At any moment, you can apply a new force or impulse to a configured mesh. Let's take an explosion: a box is located at the coordinates (X=0, Y=0, Z=0) and an explosion happens below the box at the coordinates (X=0, Y=-5, Z=0). In real life, the box should be pushed up: this action is possible by calling a function named .applyImpulse(force, contactPoint) provided by the BABYLON.Mesh class. Once the mesh is configured with its options and impostor, you can, at any moment, call this function to apply a force to the object. The parameters are as follows:

The force: This represents the force on the X, Y, and Z axes
The contact point: This represents the origin of the force on the X, Y, and Z axes

For example, the explosion generates a force only on Y that is equal to 10 (10 is an arbitrary value) and has its origin at the coordinates (X=0, Y=-5, Z=0):

mesh.applyImpulse(new BABYLON.Vector3(0, 10, 0), new BABYLON.Vector3(0, -5, 0));

Once the impulse is applied to the mesh (only once), the box is pushed up and will then fall according to its physics options (mass, friction, and restitution).

Summary

You are now ready to configure collisions in your scenes and simulate physics. Whether by code or by the artists, you have learned the pipeline to make your scenes more alive.

Resources for Article:

Further resources on this subject:
Building Games with HTML5 and Dart [article]
Fundamentals of XHTML MP in Mobile Web Development [article]
MODx Web Development: Creating Lists [article]