
How-To Tutorials - Programming


API Gateway and its Need

Packt
21 Feb 2018
9 min read
In this article by Umesh R Sharma, author of the book Practical Microservices, we will cover the API Gateway pattern and the need for it, with simple and short examples. (For more resources related to this topic, see here.)

Dynamic websites show a lot of information on a single page. A common example is the order success summary page, which shows both the cart details and the customer address. To build it, the frontend has to fire separate queries to the customer detail service and the order detail service. This is a very simple example of a single page depending on multiple services. Because each microservice deals with only one concern, showing a lot of information on one page results in many API calls from that page. A website or mobile page can therefore become very chatty in terms of the calls it makes to display its data.

Another problem is that a microservice sometimes talks over a protocol other than HTTP, such as Thrift, and outside consumers can't deal directly with the microservice in that protocol.

Also, as a mobile screen is smaller than a web page, the data required by a mobile API call differs from that of a desktop call. A developer may want to return less data to the mobile API, or maintain different versions of the API calls for mobile and desktop. So you can end up facing a problem such as this: each client calls different web services and has to keep track of them, and developers have to preserve backward compatibility because the API URLs are embedded in clients such as mobile apps.

Why do we need the API Gateway?

All of the preceding problems can be addressed by putting an API Gateway in place. The API Gateway acts as a proxy between the API consumer and the API servers. To address the first problem in our scenario, the client makes only one call, such as /successOrderSummary, to the API Gateway. The API Gateway, on behalf of the consumer, calls the order and user detail services, combines the results, and serves the response to the client. So it basically acts as a facade for API calls, which may internally call many APIs. The API Gateway serves many purposes, some of which are as follows.

Authentication

The API Gateway can take on the overhead of authenticating API calls from outside. After that, the internal calls can skip the security check. If a request comes from inside the VPC, removing the security check decreases the network latency a bit and lets developers focus more on business logic than on security.

Different protocols

Internally, microservices can use different protocols to talk to each other: Thrift, TCP, UDP, RMI, SOAP, and so on. For clients, there can be a single REST-based HTTP entry point. Clients hit the API Gateway over HTTP, and the API Gateway makes the internal calls in the required protocols, combines the results from all the web services, and responds to the client in the required protocol; in most cases, that protocol will be HTTP.

Load balancing

The API Gateway can work as a load balancer to handle requests in the most efficient manner. It can keep track of the request load it has sent to different nodes of a particular service, and it should be intelligent enough to balance the load between the nodes of that service. With NGINX Plus in the picture, NGINX can be a good candidate for the API Gateway; it has many of the features needed to address the problems usually handled by an API Gateway.
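Before moving on, here is a minimal sketch of the facade behaviour described above. The /successOrderSummary endpoint name comes from the text, but the downstream URLs, the orderId parameter, and the response shape are illustrative assumptions only, not part of the book's example:

import java.util.HashMap;
import java.util.Map;

import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.client.RestTemplate;

@RestController
public class OrderSummaryController {

    // Plain Spring RestTemplate used to call the downstream services.
    private final RestTemplate rest = new RestTemplate();

    @RequestMapping("/successOrderSummary")
    public Map<String, Object> successOrderSummary(@RequestParam("orderId") String orderId) {
        // Hypothetical downstream endpoints; in a real deployment these would be
        // resolved through service discovery rather than hard-coded URLs.
        Object orderDetail = rest.getForObject("http://localhost:10001/order/" + orderId, Object.class);
        Object customerDetail = rest.getForObject("http://localhost:10002/customer/" + orderId, Object.class);

        // Combine both results into a single response for the client.
        Map<String, Object> summary = new HashMap<>();
        summary.put("order", orderDetail);
        summary.put("customer", customerDetail);
        return summary;
    }
}

The single call to /successOrderSummary replaces the two calls the frontend would otherwise have to make itself.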
Request dispatching (including service discovery)

One of the main features of the gateway is to reduce communication between the client and the microservices. The gateway initiates the required microservice calls in parallel, so from the client side there is only one hit. It hits all the required services, waits for their results, and after obtaining the responses from all the services, combines them and sends the result back to the client. Reactive microservice designs can help you achieve this.

Working with service discovery adds many extra features. It can indicate which node of a service is the master and which is the slave; the same goes for a database, where write requests can go to the master and read requests to a slave. This is just the basic rule, but users can apply many more rules on the basis of the metadata provided to the API Gateway. The gateway can also record the basic response time from each node of a service instance, so that higher-priority API calls can be routed to the fastest responding node. Again, the rules that can be defined depend on the API Gateway you are using and how it is implemented.

Response transformation

Being the first and single point of entry for all API calls, the API Gateway knows which type of client is calling: a mobile client, a web client, or another external consumer. It can make the internal calls on behalf of the client and return the data shaped for each client, as per its needs and configuration.

Circuit breaker

To handle partial failure, the API Gateway uses a technique called the circuit breaker pattern. A failure in one service can cause cascading failures across all the service calls in the stack. The API Gateway can keep an eye on a threshold for any microservice; if a service passes that threshold, the gateway marks that API as an open circuit and decides not to make the call for a configured time. Hystrix (by Netflix) serves this purpose efficiently; its default threshold is 20 request failures in 5 seconds. Developers can also provide a fallback for an open circuit, which can be a dummy service. Once the API starts giving results as expected again, the gateway marks it as a closed circuit.

Pros and cons of the API Gateway

Using an API Gateway has its own pros and cons. The previous sections have already described the advantages; here they are summarized as points.

Pros:
- Microservices can focus on business logic
- Clients can get all the data in a single hit
- Authentication, logging, and monitoring can be handled by the API Gateway
- It gives the flexibility for clients and microservices to talk over completely independent protocols
- It can give tailor-made results, as per the client's needs
- It can handle partial failure

In addition to the preceding pros, there are also some trade-offs to using this pattern.

Cons:
- It can cause performance degradation because of everything that happens on the API Gateway
- A discovery service has to be implemented alongside it
- It can become a single point of failure
- Managing routing is an overhead of the pattern
- It adds an additional network hop to each call
- Overall, it increases the complexity of the system
- Putting too much logic in the gateway leads to another dependency problem

So, before using the API Gateway, both aspects should be considered; the decision to include an API Gateway in the system increases its cost as well.
Before putting effort, cost, and management into this pattern, it is recommended to analyze how much you can gain from it.

Example of API Gateway

In this example, we will show only a sample product page that fetches data from a product detail service. The example could be extended in many directions, but our focus is only to show how the API Gateway pattern works, so we will keep it simple and small. This example uses Zuul from Netflix as the API Gateway. Spring also has an integration of Zuul, so we are building this example with Spring Boot.

For the sample API Gateway implementation, we will use http://start.spring.io/ to generate an initial template of our code. Spring Initializr is the project from Spring that helps beginners generate basic Spring Boot code. A user has to set a minimum configuration and can hit the Generate Project button. If you want to set more specific details regarding the project, you can see all the configuration settings by clicking on the Switch to the full version button, as shown in the following screenshot.

Let's create a controller in the same package as the main application class and put the following code in the file:

@SpringBootApplication
@RestController
public class ProductDetailController {

    @Resource
    ProductDetailService pdService;

    @RequestMapping(value = "/product/{id}")
    public ProductDetail getAllProduct(@PathVariable("id") String id) {
        return pdService.getProductDetailById(id);
    }
}

In the preceding code, the assumption is that the pdService bean interacts with a Spring Data repository for product details and fetches the result for the required product ID. Another assumption is that this service is running on port 10000. Just to make sure everything is running, hitting a URL such as http://localhost:10000/product/1 should return some JSON as a response.

For the API Gateway, we will create another Spring Boot application with Zuul support. Zuul can be activated by just adding the @EnableZuulProxy annotation. The following is the code to start the simple Zuul proxy:

@SpringBootApplication
@EnableZuulProxy
public class ApiGatewayExampleInSpring {
    public static void main(String[] args) {
        SpringApplication.run(ApiGatewayExampleInSpring.class, args);
    }
}

Everything else is managed in configuration. In the application.properties file of the API Gateway, the content will be something like the following:

zuul.routes.product.path=/product/**
zuul.routes.product.url=http://localhost:10000
ribbon.eureka.enabled=false
server.port=8080

With this configuration, we are defining a rule such as this: any request for a URL like /product/xxx is passed on to http://localhost:10000. To the outside world, the URL will be http://localhost:8080/product/1, which is internally forwarded to port 10000. If we defined a spring.application.name variable with the value product in the product detail microservice, then we wouldn't need to define the path property here (zuul.routes.product.path=/product/**), as Zuul, by default, would map the route to /product.

The example taken here for an API Gateway is not very intelligent, but Zuul is a very capable API Gateway. Depending on the routes, filters, and caching defined in Zuul's properties, one can build a very powerful API Gateway.

Summary

In this article, you learned about the API Gateway, the need for it, and its pros and cons, with a code example.
Resources for Article:   Further resources on this subject: What are Microservices? [article] Microservices and Service Oriented Architecture [article] Breaking into Microservices Architecture [article]


Consuming Diagnostic Analyzers in .NET projects

Packt
20 Feb 2018
6 min read
We know how to write diagnostic analyzers to analyze and report issues about .NET source code and contribute them to the .NET developer community. In this article by Manish Vasani, author of the book Roslyn Cookbook, we will show you how to search, install, view, and configure the analyzers that have already been published by various analyzer authors on NuGet and the VS Extension gallery. We will cover the following recipes (for more resources related to this topic, see here):

- Searching and installing analyzers through the NuGet package manager
- Searching and installing VSIX analyzers through the VS extension gallery
- Viewing and configuring analyzers in solution explorer in Visual Studio
- Using a ruleset file and the ruleset editor to configure analyzers

Diagnostic analyzers are extensions to the Roslyn C# compiler and the Visual Studio IDE that analyze user code and report diagnostics. Users will see these diagnostics in the error list after building the project from Visual Studio, and even when building the project on the command line. They will also see the diagnostics live while editing the source code in the Visual Studio IDE. Analyzers can report diagnostics to enforce specific code styles, improve code quality and maintenance, recommend design guidelines, or even report very domain-specific issues which cannot be covered by the core compiler.

Analyzers can be installed in a .NET project either as a NuGet package or as a VSIX; the two packaging schemes lead to differences in the analyzer experience, as the following recipes show. Analyzers are supported on the various flavors of .NET Standard, .NET Core, and .NET Framework projects, for example, class libraries, console apps, and so on.

Searching and installing analyzers through the NuGet package manager

In this recipe we will show you how to search for and install analyzer NuGet packages in the NuGet package manager in Visual Studio, and see how the analyzer diagnostics from an installed NuGet package light up in the project build and as live diagnostics during code editing in Visual Studio.

Getting ready

You will need to have Visual Studio 2017 installed on your machine for this recipe. You can install the free community version of Visual Studio 2017 from https://www.visualstudio.com/thank-you-downloading-visual-studio/?sku=Community&rel=15.

How to do it…

1. Create a C# class library project, say ClassLibrary, in Visual Studio 2017.
2. In solution explorer, right click on the solution or project node and execute the Manage NuGet Packages command. This brings up the NuGet Package Manager, which can be used to search for and install NuGet packages in the solution or project.
3. In the search bar, type the following text to find NuGet packages tagged as analyzers: Tags:"analyzers". Note that some well-known packages are tagged as analyzer instead, so you may also want to search: Tags:"analyzer".
4. Check or uncheck the Include prerelease checkbox to the right of the search bar to show or hide prerelease analyzer packages. The packages are listed by number of downloads, with the most downloaded package at the top.
5. Select a package to install, say System.Runtime.Analyzers, pick a specific version, say 1.1.0, and click Install.
6. Click the I Accept button on the License Acceptance dialog to install the NuGet package.
7. Verify that the installed analyzer(s) show up under the Analyzers node in the solution explorer.
8. Verify that the project file has a new ItemGroup with the following analyzer references from the installed analyzer package:

<ItemGroup>
  <Analyzer Include="..\packages\System.Runtime.Analyzers.1.1.0\analyzers\dotnet\cs\System.Runtime.Analyzers.dll" />
  <Analyzer Include="..\packages\System.Runtime.Analyzers.1.1.0\analyzers\dotnet\cs\System.Runtime.CSharp.Analyzers.dll" />
</ItemGroup>

9. Add the following code to your C# project:

namespace ClassLibrary
{
    public class MyAttribute : System.Attribute
    {
    }
}

10. Verify that the analyzer diagnostic from the installed analyzer is shown in the error list.
11. Open a Visual Studio 2017 Developer Command Prompt and build the project to verify that the analyzer executes during the command-line build and that the analyzer diagnostic is reported.
12. Create a new C# project in VS2017, add the same code to it as in step 9, and verify that no analyzer diagnostic shows up in the error list or on the command line, confirming that the analyzer package was only installed to the selected project in steps 1-6.

Note that CA1018 (Custom attribute should have AttributeUsage defined) has been moved to a separate analyzer assembly in later versions of the FxCop/System.Runtime.Analyzers package. It is recommended that you install the Microsoft.CodeAnalysis.FxCopAnalyzers NuGet package to get the latest group of Microsoft-recommended analyzers.

Searching and installing VSIX analyzers through the VS extension gallery

In this recipe we will show you how to search for and install analyzer VSIX packages in the Visual Studio Extension manager and see how the analyzer diagnostics from an installed VSIX light up as live diagnostics during code editing in Visual Studio.

Getting ready

You will need to have Visual Studio 2017 installed on your machine for this recipe. You can install the free community version of Visual Studio 2017 from https://www.visualstudio.com/thank-you-downloading-visual-studio/?sku=Community&rel=15.

How to do it…

1. Create a C# class library project, say ClassLibrary, in Visual Studio 2017.
2. From the top-level menu, execute Tools | Extensions and Updates.
3. Navigate to Online | Visual Studio Marketplace in the left tab of the dialog to view the VSIXes available in the Visual Studio extension gallery/marketplace.
4. Search analyzers in the search text box in the upper right corner of the dialog and download an analyzer VSIX, say Refactoring Essentials for Visual Studio.
5. Once the download completes, you will get a message at the bottom of the dialog that the install will be scheduled to execute once Visual Studio and related windows are closed. Close the dialog and then close the Visual Studio instance to start the install.
6. In the VSIX Installer dialog, click Modify to start the installation.
7. The subsequent message prompts you to kill all the active Visual Studio and satellite processes. Save all your relevant work in all the open Visual Studio instances, and click End Tasks to kill these processes and install the VSIX.
8. After installation, restart VS, click Tools | Extensions and Updates, and verify that the Refactoring Essentials VSIX is installed.
9. Create a new C# project with the following source code and verify analyzer diagnostic RECS0085 (Redundant array creation expression) in the error list:

namespace ClassLibrary
{
    public class Class1
    {
        void Method()
        {
            int[] values = new int[] { 1, 2, 3 };
        }
    }
}

10. Build the project from Visual Studio 2017 or the command line and confirm that no analyzer diagnostic shows up in the Output Window or on the command line respectively, confirming that the VSIX analyzer did not execute as part of the build.
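As a brief illustration of acting on these diagnostics (not part of the original recipes), the CA1018 warning raised for the MyAttribute class above can be resolved by declaring how the attribute may be used, and the RECS0085 warning by dropping the redundant array creation expression. The AttributeTargets.All choice is only illustrative, and the exact diagnostics you see depend on the analyzer versions installed:

namespace ClassLibrary
{
    // Declaring AttributeUsage addresses CA1018 (Custom attribute should have
    // AttributeUsage defined); AttributeTargets.All is just an illustrative choice.
    [System.AttributeUsage(System.AttributeTargets.All)]
    public class MyAttribute : System.Attribute
    {
    }

    public class Class1
    {
        void Method()
        {
            // Using an array initializer without "new int[]" addresses RECS0085
            // (Redundant array creation expression).
            int[] values = { 1, 2, 3 };
        }
    }
}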
Resources for Article: Further resources on this subject: C++, SFML, Visual Studio, and Starting the first game [article] Connecting to Microsoft SQL Server Compact 3.5 with Visual Studio [article] Creating efficient reports with Visual Studio [article]


Introduction to Performance Testing and JMeter

Packt
20 Feb 2018
11 min read
In this article by Bayo Erinle, the author of the book Performance Testing with JMeter 3, we will explore some of the options that make JMeter a great tool of choice for performance testing. (For more resources related to this topic, see here.)

Performance testing and tuning

There is a strong relationship between performance testing and tuning, in the sense that one often leads to the other. Often, end-to-end testing unveils system or application bottlenecks that are regarded as unacceptable against the project's target goals. Once those bottlenecks are discovered, the next step for most teams is a series of tuning efforts to make the application perform adequately. Such efforts normally include, but are not limited to, the following:

- Configuration changes in system resources
- Optimizing database queries
- Reducing round trips in application calls, sometimes leading to redesigning and re-architecting problematic modules
- Scaling out application and database server capacity
- Reducing the application's resource footprint
- Optimizing and refactoring code, including eliminating redundancy and reducing execution time

Tuning efforts may also commence if the application has reached acceptable performance but the team wants to reduce the amount of system resources being used, decrease the volume of hardware needed, or further increase system performance. After each change (or series of changes), the test is re-executed to see whether the performance has improved or declined as a result. The process continues until the performance results reach the acceptable goals. The outcome of these test-tuning cycles normally produces a baseline.

Baselines

Baselining is the process of capturing performance metric data for the sole purpose of evaluating the efficacy of successive changes to the system or application. It is important that all characteristics and configurations, except those specifically being varied for comparison, remain the same, in order to make effective comparisons as to which change (or series of changes) is driving results toward the targeted goal. Armed with such baseline results, subsequent changes can be made to the system configuration or application, and testing results can be compared to see whether such changes were relevant or not. Some considerations when generating baselines include the following:

- They are application-specific
- They can be created for a system, application, or module
- They are metrics/results
- They should not be overgeneralized
- They evolve and may need to be redefined from time to time
- They act as a shared frame of reference
- They are reusable
- They help identify changes in performance

Load and stress testing

Load testing is the process of putting demand on a system and measuring its response, that is, determining how much volume the system can handle. Stress testing is the process of subjecting the system to unusually high loads, far beyond its normal usage pattern, to determine its responsiveness. These are different from performance testing, whose sole purpose is to determine the response and effectiveness of a system, that is, how fast the system is. Since load ultimately affects how a system responds, performance testing is always done in conjunction with stress testing.

JMeter to the rescue

One of the areas performance testing covers is testing tools. Which testing tool do you use to put the system and application under load? There are numerous testing tools available to perform this operation, from free to commercial solutions.
However, our focus will be on Apache JMeter, a free, open source, cross-platform desktop application from the Apache Software Foundation. JMeter has been around since 1998, according to the historic change logs on its official site, making it a mature, robust, and reliable testing tool. Cost may also have played a role in its wide adoption. Small companies usually may not want to foot the bill for commercial testing tools, which often place restrictions, for example, on how many concurrent users one can spin off. My first encounter with JMeter was a result of exactly this. I worked in a small shop that had paid for a commercial testing tool, but during the course of testing, we outran the licensing limits on how many concurrent users we needed to simulate for realistic test plans. Since JMeter was free, we explored it and were quite delighted with the offerings and the sheer amount of features we got for free. Here are some of its features:

- Performance tests of different server types, including web (HTTP and HTTPS), SOAP, database, LDAP, JMS, mail, and native commands or shell scripts
- Complete portability across various operating systems
- Full multithreading framework allowing concurrent sampling by many threads and simultaneous sampling of different functions by separate thread groups
- Full-featured Test IDE that allows fast Test Plan recording, building, and debugging
- Dashboard Report for detailed analysis of application performance indexes and key transactions
- In-built integration with real-time reporting and analysis tools, such as Graphite, InfluxDB, and Grafana, to name a few
- Complete dynamic HTML reports
- Graphical User Interface (GUI)
- HTTP proxy recording server
- Caching and offline analysis/replaying of test results
- High extensibility
- Live view of results as testing is being conducted

JMeter allows multiple concurrent users to be simulated on the application, allowing you to work toward most of the target goals mentioned earlier, such as attaining a baseline and identifying bottlenecks. It will help answer questions such as the following:

- Will the application still be responsive if 50 users are accessing it concurrently?
- How reliable will it be under a load of 200 users?
- How much of the system resources will be consumed under a load of 250 users?
- What will the throughput look like with 1000 users active in the system?
- What will be the response time for the various components in the application under load?

JMeter, however, should not be confused with a browser. It doesn't perform all the operations supported by browsers; in particular, JMeter does not execute the JavaScript found in HTML pages, nor does it render HTML pages the way a browser does. It does give you the ability to view request responses as HTML through many of its listeners, but the timings are not included in any samples. Furthermore, there are limitations on how many users can be spun up on a single machine. These vary depending on the machine specifications (for example, memory, processor speed, and so on) and the test scenarios being executed. In our experience, we have mostly been able to successfully spin off 250-450 users on a single machine with a 2.2 GHz processor and 8 GB of RAM.

Up and running with JMeter

Now, let's get up and running with JMeter, beginning with its installation.

Installation

JMeter comes as a bundled archive, so it is super easy to get started with it. Those working in corporate environments behind a firewall, or on machines with non-admin privileges, appreciate this all the more.
To get started, grab the latest binary release by pointing your browser to http://jmeter.apache.org/download_jmeter.cgi. At the time of writing this, the current release version is 3.1. The download site offers the bundle as both a .zip file and a .tgz file. We go with the .zip file option, but feel free to download the .tgz file if that's your preferred way of grabbing archives. Once downloaded, extract the archive to a location of your choice. The location you extracted the archive to will be referred to as JMETER_HOME.

Provided you have a JDK/JRE correctly installed and a JAVA_HOME environment variable set, you are all set and ready to run! The following screenshot shows a trimmed-down directory structure of a vanilla JMeter install (the JMETER_HOME folder structure). The following are some of the folders in Apache-JMeter-3.2, as shown in the preceding screenshot:

- bin: This folder contains executable scripts to run and perform other operations in JMeter
- docs: This folder contains a well-documented user guide
- extras: This folder contains miscellaneous items, including samples illustrating the usage of the Apache Ant build tool (http://ant.apache.org/) with JMeter, and BeanShell scripting
- lib: This folder contains utility JAR files needed by JMeter (you may add additional JARs here to use from within JMeter; we will cover this in detail later)
- printable_docs: This is the printable documentation

Installing the Java JDK

Follow these steps to install the Java JDK:

1. Go to http://www.oracle.com/technetwork/java/javase/downloads/index.html.
2. Download the Java JDK (not the JRE) compatible with the system that you will use to test. At the time of writing, JDK 1.8 (update 131) was the latest.
3. Double-click on the executable and follow the onscreen instructions.

On Windows systems, the default location for the JDK is under Program Files. While there is nothing wrong with this, the issue is that the folder name contains a space, which can sometimes be problematic when attempting to set PATH and run programs, such as JMeter, that depend on the JDK from the command line. With this in mind, it is advisable to change the default location to something like C:\tools\jdk.

Setting up JAVA_HOME

Here are the steps to set up the JAVA_HOME environment variable on Windows and Unix operating systems.

On Windows

For illustrative purposes, assume that you have installed the Java JDK at C:\tools\jdk:

1. Go to Control Panel.
2. Click on System.
3. Click on Advanced system settings.
4. Add a new environment variable with the name JAVA_HOME and the value C:\tools\jdk.
5. Locate Path (under system variables, in the bottom half of the screen).
6. Click on Edit.
7. Append %JAVA_HOME%\bin to the end of the existing path value (if any).

On Unix

For illustrative purposes, assume that you have installed the Java JDK at /opt/tools/jdk:

1. Open up a Terminal window.
2. Run export JAVA_HOME=/opt/tools/jdk.
3. Run export PATH=$PATH:$JAVA_HOME/bin.

It is advisable to set this in your shell profile settings, such as .bash_profile (for bash users) or .zshrc (for zsh users), so that you won't have to set it for each new Terminal window you open.

Running JMeter

Once installed, the bin folder under the JMETER_HOME folder contains all the executable scripts that can be run. Based on the operating system that you installed JMeter on, you either execute the shell scripts (.sh files) on operating systems that are Unix/Linux flavored, or their batch (.bat file) counterparts on operating systems that are Windows flavored. JMeter test plans are saved as XML files with a .jmx extension.
We refer to them as test scripts or JMX files. The executable scripts in the bin folder include the following:

- jmeter.sh: This script launches the JMeter GUI (the default)
- jmeter-n.sh: This script launches JMeter in non-GUI mode (it takes a JMX file as input)
- jmeter-n-r.sh: This script launches JMeter in non-GUI mode remotely
- jmeter-t.sh: This opens a JMX file in the GUI
- jmeter-server.sh: This script starts JMeter in server mode (this will be kicked off on the master node when testing with multiple machines remotely)
- mirror-server.sh: This script runs the mirror server for JMeter
- shutdown.sh: This script gracefully shuts down a running non-GUI instance
- stoptest.sh: This script abruptly shuts down a running non-GUI instance

To start JMeter, open a Terminal shell, change to the JMETER_HOME/bin folder, and run the following command on Unix/Linux:

./jmeter.sh

Alternatively, run the following command on Windows:

jmeter.bat

Take a moment to explore the GUI. Hover over each icon to see a short description of what it does. The Apache JMeter team has done an excellent job with the GUI. Most icons are very similar to what you are used to, which helps ease the learning curve for new adopters. Some of the icons, for example, stop and shutdown, are disabled until a scenario/test is being conducted. The JVM_ARGS environment variable can be used to override the JVM settings in the jmeter.bat or jmeter.sh script. Consider the following example:

export JVM_ARGS="-Xms1024m -Xmx1024m -Dpropname=propvalue"

Command-line options

To see all the options available to start JMeter, run the JMeter executable with the -? option. The options provided are as follows:

./jmeter.sh -?
-? print command line options and exit
-h, --help print usage information and exit
-v, --version print the version information and exit
-p, --propfile <argument> the jmeter property file to use
-q, --addprop <argument> additional JMeter property file(s)
-t, --testfile <argument> the jmeter test(.jmx) file to run
-l, --logfile <argument> the file to log samples to
-j, --jmeterlogfile <argument> jmeter run log file (jmeter.log)
-n, --nongui run JMeter in nongui mode
...
-J, --jmeterproperty <argument>=<value> Define additional JMeter properties
-G, --globalproperty <argument>=<value> Define Global properties (sent to servers), e.g. -Gport=123 or -Gglobal.properties
-D, --systemproperty <argument>=<value> Define additional system properties
-S, --systemPropertyFile <argument> additional system property file(s)

This is a snippet (non-exhaustive list) of what you might see if you do the same.

Summary

In this article we learned about the relationship between performance testing and tuning, and how to install and run JMeter.

Resources for Article: Further resources on this subject: Functional Testing with JMeter [article] Creating an Apache JMeter™ test workbench [article] Getting Started with Apache Spark DataFrames [article]


Getting Inside a C++ Multithreaded Application

Maya Posch
13 Feb 2018
8 min read
This C++ programming tutorial is taken from Maya Posch's Mastering C++ Multithreading. In its most basic form, a multithreaded application consists of a singular process with two or more threads. These threads can be used in a variety of ways; for example, to allow the process to respond to events in an asynchronous manner by using one thread per incoming event or type of event, or to speed up the processing of data by splitting the work across multiple threads.

Examples of asynchronous response to events include the processing of user interface (GUI) and network events on separate threads, so that neither type of event has to wait on the other, or can block events from being responded to in time. Generally, a single thread performs a single task, such as the processing of GUI or network events, or the processing of data.

For this basic example, the application will start with a singular thread, which will then launch a number of threads and wait for them to finish. Each of these new threads will perform its own task before finishing.

Let's start with the includes and global variables for our application:

#include <iostream>
#include <thread>
#include <mutex>
#include <vector>
#include <random>

using namespace std;

// --- Globals
mutex values_mtx;
mutex cout_mtx;
vector<int> values;

// Forward declarations of the functions used by the threads (defined below).
void threadFnc(int tid);
int randGen(const int& min, const int& max);

Both the I/O stream and vector headers should be familiar to anyone who has ever used C++: the former is used here for the standard output (cout), and vector for storing a sequence of values. The random header is new in C++11, and as the name suggests, it offers classes and methods for generating random sequences. We use it here to make our threads do something interesting. Finally, the thread and mutex includes are the core of our multithreaded application; they provide the basic means for creating threads, and allow for thread-safe interactions between them.

Moving on, we create two mutexes: one for the global vector and one for cout, since the latter is not thread-safe.

Next we create the main function as follows:

int main() {
    values.push_back(42);

We then push a fixed value onto the vector instance; this one will be used by the threads we create in a moment:

    thread tr1(threadFnc, 1);
    thread tr2(threadFnc, 2);
    thread tr3(threadFnc, 3);
    thread tr4(threadFnc, 4);

We create new threads, and provide them with the name of the method to use, passing along any parameters -- in this case, just a single integer:

    tr1.join();
    tr2.join();
    tr3.join();
    tr4.join();

Next, we wait for each thread to finish before we continue, by calling join() on each thread instance:

    cout << "Input: " << values[0] << ", Result 1: " << values[1] << ", Result 2: " << values[2] << ", Result 3: " << values[3] << ", Result 4: " << values[4] << "\n";
    return 1;
}

At this point, we expect that each thread has done whatever it's supposed to do, and added the result to the vector, which we then read out and show the user. Of course, this shows almost nothing of what really happens in the application, mostly just the essential simplicity of using threads.
Next, let's see what happens inside the method that we pass to each thread instance:

void threadFnc(int tid) {
    cout_mtx.lock();
    cout << "Starting thread " << tid << ".\n";
    cout_mtx.unlock();

    values_mtx.lock();
    int val = values[0];
    values_mtx.unlock();

When we obtain the initial value set in the vector, we copy it to a local variable so that we can immediately release the mutex for the vector, to enable other threads to use it:

    int rval = randGen(0, 10);
    val += rval;

These last two lines contain the essence of what the created threads do: they take the initial value and add a randomly generated value to it. The randGen() method takes two parameters, defining the range of the returned value:

    cout_mtx.lock();
    cout << "Thread " << tid << " adding " << rval << ". New value: " << val << ".\n";
    cout_mtx.unlock();

    values_mtx.lock();
    values.push_back(val);
    values_mtx.unlock();
}

Finally, we (safely) log a message informing the user of the result of this action before adding the new value to the vector. In both cases, we use the respective mutex to ensure that there can be no overlap with any of the other threads. Once the method reaches this point, the thread containing it will terminate, and the main thread will have one fewer thread to wait for to rejoin.

Lastly, we'll take a look at the randGen() method. Here we can see some multithreading-specific additions as well:

int randGen(const int& min, const int& max) {
    static thread_local mt19937 generator(hash<thread::id>()(this_thread::get_id()));
    uniform_int_distribution<int> distribution(min, max);
    return distribution(generator);
}

This preceding method takes a minimum and maximum value as explained earlier, which limit the range of the random numbers it can return. At its core, it uses an mt19937-based generator, which employs a 32-bit Mersenne Twister algorithm with a state size of 19937 bits. This is a common and appropriate choice for most applications.

Of note here is the use of the thread_local keyword. What this means is that even though it is defined as a static variable, its scope will be limited to the thread using it. Every thread will thus create its own generator instance, which is important when using the random number API in the STL. A hash of the internal thread identifier (not our own) is used as the seed for the generator. This ensures that each thread gets a fairly unique seed for its generator instance, allowing for better random number sequences.

Finally, we create a new uniform_int_distribution instance using the provided minimum and maximum limits, and use it together with the generator instance to generate the random number, which we return.

Makefile

In order to compile the code described earlier, one could use an IDE, or type the command on the command line. As mentioned in the beginning of this chapter, we'll be using makefiles for the examples in this book. The big advantages of this are that one does not have to repeatedly type in the same extensive command, and it is portable to any system which supports make. The makefile for this example is rather basic:

GCC := g++

OUTPUT := ch01_mt_example
SOURCES := $(wildcard *.cpp)
CCFLAGS := -std=c++11

all: $(OUTPUT)

$(OUTPUT):
	$(GCC) -o $(OUTPUT) $(CCFLAGS) $(SOURCES)

clean:
	rm $(OUTPUT)

.PHONY: all

From the top down, we first define the compiler that we'll use (g++), set the name of the output binary (the .exe extension on Windows will be post-fixed automatically), followed by the gathering of the sources and any important compiler flags.
The wildcard feature allows one to collect the names of all files matching the string following it in one go, without having to define the name of each source file in the folder individually. For the compiler flags, we're only really interested in enabling the C++11 features, for which GCC still requires one to supply this compiler flag. For the all target, we just tell make to build the output binary, which runs g++ with the supplied information. Next we define a simple clean target which just removes the produced binary, and finally, we tell make to not interpret any folder or file named all in the folder, but to use the internal target, with the .PHONY section.

We run this makefile as follows:

$ make

Afterwards, we find an executable file called ch01_mt_example (with the .exe extension attached on Windows) in the same folder. Executing this binary will result in command-line output akin to the following:

$ ./ch01_mt_example.exe
Starting thread 1.
Thread 1 adding 8. New value: 50.
Starting thread 2.
Thread 2 adding 2. New value: 44.
Starting thread 3.
Starting thread 4.
Thread 3 adding 0. New value: 42.
Thread 4 adding 8. New value: 50.
Input: 42, Result 1: 50, Result 2: 44, Result 3: 42, Result 4: 50

What one can see here already is the somewhat asynchronous nature of threads and their output. While threads 1 and 2 appear to run synchronously, threads 3 and 4 clearly run asynchronously. For this reason, and especially with longer-running threads, it's virtually impossible to say in which order the log output and results will be returned. While we use a simple vector to collect the results of the threads, there is no saying whether Result 1 truly originates from the thread which we assigned ID 1 in the beginning. If we need this information, we need to extend the data we return by using an information structure with details on the processing thread, or similar. One could, for example, use a struct like this:

struct result {
    int tid;
    int result;
};

The vector would then be changed to contain result instances rather than integer instances (a short sketch of this change follows at the end of this section). One could pass the initial integer value directly to the thread as part of its parameters, or pass it via some other way.

Want to learn C++ multithreading in detail? You can find Mastering C++ Multithreading here, or explore all our latest C++ eBooks and videos here.
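As a brief sketch of that change (not from the book; it reuses the struct result, the randGen() helper, the headers, and the values_mtx mutex from the listings above, and everything else is illustrative), the shared vector and thread function could be reworked like this:

// The shared vector now stores result instances rather than plain integers.
vector<result> results;

// The initial value is passed directly as a parameter instead of being read
// from the shared vector.
void threadFnc2(int tid, int initial) {
    int val = initial + randGen(0, 10);

    // Record which thread produced which value.
    values_mtx.lock();
    results.push_back(result{tid, val});
    values_mtx.unlock();
}

int main() {
    thread tr1(threadFnc2, 1, 42);
    thread tr2(threadFnc2, 2, 42);
    tr1.join();
    tr2.join();

    // After joining, each entry tells us exactly which thread produced it.
    for (const result& r : results) {
        cout << "Thread " << r.tid << " result: " << r.result << "\n";
    }
    return 0;
}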


Getting Started with SOA and WSO2

Packt
11 Jan 2018
11 min read
In this article by Fidel Prieto Estrada and Ramón Garrido, authors of the book WSO2: Developer's Guide, we will discuss the facts and problems that large companies with huge IT systems had to face, and that finally gave rise to the SOA approach. (For more resources related to this topic, see here.) Once we know what we are talking about, we will introduce the WSO2 technology and describe the role it plays in SOA, followed by the installation and configuration of the WSO2 products we will use. So, in this article, we will learn the basics of SOA.

Service-oriented architecture (SOA) is a style, an approach to designing software in a different way from the standard. SOA is not a technology; it is a paradigm, a design style. There comes a time when a company grows and grows, which means that its IT system also becomes bigger and bigger, fetching a huge amount of data that it has to share with other companies. This typical data may be, for example, any of the following:

- Sales data
- Employee data
- Customer data
- Business information

In this environment, each information need of the company's applications is satisfied by a direct link to the system that owns the required information. So, when a company becomes a large corporation, with many departments and complex business logic, the IT system becomes a spaghetti dish.

(Figure: Spaghetti dish)

The spaghetti dish is a comparison widely used to describe how complex the integration links between applications may become in such a large corporation. In this comparison, each strand of spaghetti represents the link between two applications used to share some kind of information. Thus, when the number of applications needed for our business rises, the amount of information shared grows as well. So, if we draw the map that represents all the links between the whole set of applications, the image will be quite similar to a spaghetti dish, as the following figure shows.

(Figure: Spaghetti integrations, by Oracle - https://image.slidesharecdn.com/2012-09-20-aspire-oraclesoawebinar-finalversion-160109031240/95/maximizing-oracle-apps-capabilities-using-oracle-soa-7-638.jpg?cb=1452309418)

The preceding diagram represents an environment that is closed, monolithic, and inefficient, with the following features:

- The architecture is split into blocks divided by business areas. Each area is closed to the rest of the areas, so interaction between them is quite difficult.
- These isolated blocks are hard to maintain.
- Each block is managed by just one provider, which knows that business area deeply.
- It is difficult for the company to change the provider that manages each business area, due to the risk involved.
- The company cannot protect itself against the abuses of the provider. The provider may commit many abuses, such as raising the fare for the provided service, violating the service level agreement (SLA), breaching the schedule, and many others we can imagine. In these situations, the company lacks the instruments to fight back, because if the business area managed by the provider stops working, the impact on the company's profits is much larger than the cost of tolerating the provider's abuses.
- The provider has deeper knowledge of the customer's business than the customer itself.
- The maintenance cost is high, due to the complexity of the network, for many reasons; consider the following examples:
  - It is difficult to perform an impact analysis when a new functionality is needed, which means a high cost and a long time to evaluate any fix, and a higher cost for each fix in turn.
  - The complex interconnection network is difficult to know in depth, so finding the cause of a failure or malfunction may become quite a task.
  - When one system is down, most of the others may be down as well.
- A business process usually involves different databases and applications. Thus, when a user has to run a business process in the company, they need to use different applications, access different networks, and log in with different credentials in each one; this makes the business quite inefficient, making simple tasks take too much time.
- When a system in your puzzle uses an obsolete technology, which is quite common with legacy systems, you will always be tied to it and to its incompatibility issues with brand new technologies.
- Managing a fine-grained security policy that controls who has access to each piece of data is simply a utopia.

Something must be done to face all these problems, and SOA is the approach that puts this in order. SOA is the final approach, arrived at after previous attempts to tidy up this chaos. We can take a look at the origin of SOA in the white paper The 25-year history of SOA, by Erik Townsend (http://www.eriktownsend.com/white-papers/technology). It is quite an interesting read, in which Erik traces the origin of SOA back to the manufacturing industry. I agree with that idea, and it is easy to see how improvements in the manufacturing industry, or other industries, end up being applied to the IT world; take these examples:

- Hardware buses on motherboards have been used for decades, and now we can also find a software bus, the Enterprise Service Bus (ESB), in a company. The hardware bus connects hardware devices such as the microprocessor, memory, or hard drive; the software bus connects applications.
- A hardware router in a network routes small fragments of data between different networks to lead those packets to the destination network. Message router software, which implements the message router enterprise integration pattern, routes data objects between applications.
- We create software factories to develop software using the same paradigm as a manufacturing industry.
- Lean IT is a trending topic nowadays. Roughly speaking, it tries to optimize IT processes by removing muda (a Japanese word meaning wastefulness or uselessness). It is based on the benefits of the lean manufacturing applied by Toyota in the '70s, after the oil crisis, which led it to the top position in the car manufacturing industry.
- We find an analogy between what object-oriented languages mean to programming and what SOA represents to system integration.
- We can also find analogies between ITIL v3 and SOA. The way ITIL v3 manages the company's services can be applied to managing SOA services at many points. ITIL v3 deals with the services that a company offers and how to manage them, and SOA deals with the services a company uses to expose data from one system to the rest. Both conceptions are quite similar if we think of the ITIL v3 company as the IT department and of the company's service as the SOA service.

There is another quite interesting read, Note on Distributed Computing, from Sun Microsystems Laboratories, published in 1994.
In this paper, four members of Sun Microsystems discuss the problems that a company faces when it expands, the systems that make up the IT core of the company, and its need to share information. You can find it at http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.48.7969&rep=rep1&type=pdf.

In the early '90s, when companies were starting to computerize, they needed to share information from one system to another, which was not an easy task at all. There was discussion on how to handle local and remote information, as well as which technology to use to share that information. The Network File System (NFS), from Sun Microsystems, was a good attempt to share that information, but there was still a lot of work left to do. After NFS, other approaches came, such as CORBA and Microsoft DCOM, but they still kept the dependencies between the whole set of connected applications. Refer to the following diagram:

(Figure: The SOA approach versus CORBA and DCOM)

Finally, with the SOA approach, by the end of the '90s, independent applications were able to share their data while avoiding dependencies. This data interchange is done using services. An SOA service is a data interchange between different systems that follows a set of rules. These rules are the so-called SOA principles, which we will explain as we move on.

SOA principles

The SOA principles are the rules that we always have to keep in mind when taking any kind of decision in an SOA organization, such as:

- Analyzing proposals for services
- Deciding whether to add a new functionality to a service or to split it into two services
- Solving performance issues
- Designing new services

There is no industry-wide agreement about the SOA principles, and some organizations publish their own. Now, we will go through the principles that will help us understand their importance:

- Service standardization: Services must comply with the communication and design agreements defined for the catalog they belong to. These include both high-level specifications and low-level details, such as those mentioned here:
  - Service name
  - Functional details
  - Input data
  - Output data
  - Protocols
  - Security
- Service loose coupling: Services in the catalog must be independent of each other. The only thing a service should know about the rest of the services in the catalog is that they exist. The way to achieve this is by defining service contracts, so that when a service needs to use another one, it just uses that service's contract.
- Service abstraction: The service should be a black box defined only by its contracts. The contract specifies the input and output parameters, with no information at all about how the processing is performed. This reduces the coupling with other services to a minimum.
- Service reusability: This is the most important principle, and it means that services must be conceived to be reused by the maximum number of consumers. The service must be reusable in any context and by any consumer, not only by the application that originated the need for the service. Other applications in the company must be able to consume that service, and even systems outside the company in case the service is published, for example, for the citizenship. To achieve this, the service must obviously be independent of any technology and must not be coupled to a specific business process.
  If we have a service working in one context and it is needed in a wider context, the right choice is to modify the service so that it can be consumed in both contexts.
- Service autonomy: A service must have a high degree of control over its runtime environment and over the logic it represents. The more control a service has over its underlying resources, the fewer dependencies it has and the more predictable and reliable it is. Resources may be hardware or software resources; for example, the network is a hardware resource, and a database table is a software resource. It would be ideal to have a service with exclusive ownership of its resources, but otherwise it should have enough control to minimize its dependencies on shared resources.
- Service statelessness: Services must have no state; that is, a service does not retain information about the data it has processed. All the data needed comes from the input parameters every time the service is consumed, and the information needed during processing dies when the processing ends. Managing all that state information would put the service's availability in serious trouble.
- Service discovery: With the goal of maximizing reuse, services must be discoverable. Everyone should know the list of services and their detailed information. To achieve that aim, services have metadata that describes them, stored in a repository or catalog. This metadata must be accessible easily and automatically (programmatically), using, for example, Universal Description, Discovery, and Integration (UDDI). Thus, we avoid building or commissioning a new service when we already have one, or several, providing that information by composition.
- Service composability: A service with more complex requirements must use other existing services to achieve its aim, instead of implementing logic that is already available in other services.
- Service granularity: Services must offer a relevant piece of business. The functionality of the service must not be so simple that its output always needs to be complemented with another service's functionality. Likewise, the functionality of the service must not be so complex that no consumer in the company uses the whole set of information returned by the service.
- Service normalization: As in other areas, such as database design, services must be decomposed, avoiding redundant logic. This principle may be relaxed in some cases, for example due to performance issues, where the priority is a quick response for the business.
- Vendor independence: As we discussed earlier, services must not be attached to any technology. The service definition must be technology-independent, and any vendor-specific feature must not affect the design of the service.

Summary

In this article, we discussed the issues that gave rise to SOA, described its main principles, and explained how to make a standard organization into an SOA organization. To achieve this aim, we named the WSO2 product we need: WSO2 Enterprise Integrator. Finally, we learned how to install, configure, and start it up.


Starting Out

Packt
10 Aug 2017
21 min read
In this article by Chris Simmonds, author of the book Mastering Embedded Linux Programming – Second Edition, you are about to begin working on your next project, and this time it is going to be running Linux. What should you think about before you put finger to keyboard? Let's begin with a high-level look at embedded Linux and see why it is popular, what the implications of open source licenses are, and what kind of hardware you will need to run Linux. (For more resources related to this topic, see here.)

Linux first became a viable choice for embedded devices around 1999. That was when Axis (https://www.axis.com) released their first Linux-powered network camera and TiVo (https://business.tivo.com/) their first Digital Video Recorder (DVR). Since 1999, Linux has become ever more popular, to the point that today it is the operating system of choice for many classes of product. At the time of writing, in 2017, there are about two billion devices running Linux. That includes a large number of smartphones running Android, which uses a Linux kernel, and hundreds of millions of set-top boxes, smart TVs, and Wi-Fi routers, not to mention a very diverse range of devices such as vehicle diagnostics, weighing scales, industrial devices, and medical monitoring units that ship in smaller volumes.

So, why does your TV run Linux? At first glance, the function of a TV is simple: it has to display a stream of video on a screen. Why is a complex Unix-like operating system like Linux necessary? The simple answer is Moore's Law: Gordon Moore, co-founder of Intel, observed in 1965 that the density of components on a chip would double approximately every two years. That applies to the devices that we design and use in our everyday lives just as much as it does to desktops, laptops, and servers. At the heart of most embedded devices is a highly integrated chip that contains one or more processor cores and interfaces with main memory, mass storage, and peripherals of many types. This is referred to as a System on Chip, or SoC, and SoCs are increasing in complexity in accordance with Moore's Law. A typical SoC has a technical reference manual that stretches to thousands of pages.

Your TV is not simply displaying a video stream as the old analog sets used to do. The stream is digital, possibly encrypted, and it needs processing to create an image. Your TV is (or soon will be) connected to the Internet. It can receive content from smartphones, tablets, and home media servers. It can be (or soon will be) used to play games, and so on. You need a full operating system to manage this degree of complexity.

Here are some points that drive the adoption of Linux:

- Linux has the necessary functionality. It has a good scheduler, a good network stack, support for USB, Wi-Fi, Bluetooth, many kinds of storage media, good support for multimedia devices, and so on. It ticks all the boxes.
- Linux has been ported to a wide range of processor architectures, including some that are very commonly found in SoC designs: ARM, MIPS, x86, and PowerPC.
- Linux is open source, so you have the freedom to get the source code and modify it to meet your needs. You, or someone working on your behalf, can create a board support package for your particular SoC board or device. You can add protocols, features, and technologies that may be missing from the mainline source code. You can remove features that you don't need to reduce memory and storage requirements. Linux is flexible.
Linux has an active community; in the case of the Linux kernel, very active. There is a new release of the kernel every 8 to 10 weeks, and each release contains code from more than 1,000 developers. An active community means that Linux is up to date and supports current hardware, protocols, and standards.
Open source licenses guarantee that you have access to the source code. There is no vendor tie-in.

For these reasons, Linux is an ideal choice for complex devices. But there are a few caveats I should mention here. Complexity makes it harder to understand. Coupled with the fast-moving development process and the decentralized structures of open source, you have to put some effort into learning how to use it and to keep on re-learning as it changes.

Selecting the right operating system

Is Linux suitable for your project? Linux works well where the problem being solved justifies the complexity. It is especially good where connectivity, robustness, and complex user interfaces are required. However, it cannot solve every problem, so here are some things to consider before you jump in:

Is your hardware up to the job? Compared to a traditional real-time operating system (RTOS) such as VxWorks, Linux requires a lot more resources. It needs at least a 32-bit processor and lots more memory. I will go into more detail in the section on typical hardware requirements.
Do you have the right skill set? The early parts of a project, such as board bring-up, require detailed knowledge of Linux and how it relates to your hardware. Likewise, when debugging and tuning your application, you will need to be able to interpret the results. If you don't have the skills in-house, you may want to outsource some of the work.
Is your system real-time? Linux can handle many real-time activities so long as you pay attention to certain details.

Consider these points carefully. Probably the best indicator of success is to look around for similar products that run Linux and see how they have done it; follow best practice.

The players

Where does open source software come from? Who writes it? In particular, how does this relate to the key components of embedded development—the toolchain, bootloader, kernel, and basic utilities found in the root filesystem? The main players are:

The open source community: This, after all, is the engine that generates the software you are going to be using. The community is a loose alliance of developers, many of whom are funded in some way, perhaps by a not-for-profit organization, an academic institution, or a commercial company. They work together to further the aims of the various projects. There are many of them—some small, some large.
CPU architects: These are the organizations that design the CPUs we use. The important ones here are ARM/Linaro (ARM-based SoCs), Intel (x86 and x86_64), Imagination Technologies (MIPS), and IBM (PowerPC). They implement or, at the very least, influence support for the basic CPU architecture.
SoC vendors (Atmel, Broadcom, Intel, Qualcomm, TI, and many others): They take the kernel and toolchain from the CPU architects and modify them to support their chips. They also create reference boards: designs that are used by the next level down to create development boards and working products.
Board vendors and OEMs: These people take the reference designs from SoC vendors and build them into specific products, for instance, set-top-boxes or cameras, or create more general purpose development boards, such as those from Advantech and Kontron.
An important category is the cheap development boards such as BeagleBoard/BeagleBone and Raspberry Pi that have created their own ecosystems of software and hardware add-ons.

These form a chain, with your project usually at the end, which means that you do not have a free choice of components. You cannot simply take the latest kernel from https://www.kernel.org/, except in a few rare cases, because it does not have support for the chip or board that you are using. This is an ongoing problem with embedded development. Ideally, the developers at each link in the chain would push their changes upstream, but they don't. It is not uncommon to find a kernel which has many thousands of patches that are not merged. In addition, SoC vendors tend to actively develop open source components only for their latest chips, meaning that support for any chip more than a couple of years old will be frozen and not receive any updates.

The consequence is that most embedded designs are based on old versions of software. They do not receive security fixes, performance enhancements, or features that are in newer versions. Problems such as Heartbleed (a bug in the OpenSSL libraries) and ShellShock (a bug in the bash shell) go unfixed. What can you do about it? First, ask questions of your vendors: what is their update policy, how often do they revise kernel versions, what is the current kernel version, what was the one before that, and what is their policy for merging changes upstream? Some vendors are making great strides in this way. You should prefer their chips. Secondly, you can take steps to make yourself more self-sufficient. The article explains the dependencies in more detail and shows you where you can help yourself. Don't just take the package offered to you by the SoC or board vendor and use it blindly without considering the alternatives.

The four elements of embedded Linux

Every project begins by obtaining, customizing, and deploying these four elements: the toolchain, the bootloader, the kernel, and the root filesystem:

Toolchain: The compiler and other tools needed to create code for your target device. Everything else depends on the toolchain.
Bootloader: The program that initializes the board and loads the Linux kernel.
Kernel: This is the heart of the system, managing system resources and interfacing with hardware.
Root filesystem: Contains the libraries and programs that are run once the kernel has completed its initialization.

Of course, there is also a fifth element, not mentioned here. That is the collection of programs specific to your embedded application which make the device do whatever it is supposed to do, be it weigh groceries, display movies, control a robot, or fly a drone. Typically, you will be offered some or all of these elements as a package when you buy your SoC or board. But, for the reasons mentioned in the preceding paragraph, they may not be the best choices for you.

Open source

The components of embedded Linux are open source, so now is a good time to consider what that means, why open source works the way it does, and how this affects the often proprietary embedded device you will be creating from it.

Licenses

When talking about open source, the word free is often used. People new to the subject often take it to mean nothing to pay, and open source software licenses do indeed guarantee that you can use the software to develop and deploy systems for no charge.
However, the more important meaning here is freedom, since you are free to obtain the source code, modify it in any way you see fit, and redeploy it in other systems. These licenses give you this right. Compare that with shareware licenses which allow you to copy the binaries for no cost but do not give you the source code, or other licenses that allow you to use the software for free under certain circumstances, for example, for personal use but not commercial. These are not open source.

I will provide the following comments in the interest of helping you understand the implications of working with open source licenses, but I would like to point out that I am an engineer and not a lawyer. What follows is my understanding of the licenses and the way they are interpreted.

Open source licenses fall broadly into two categories: the copyleft licenses, such as the General Public License (GPL), and the permissive licenses, such as those from the Berkeley Software Distribution (BSD) and others. The permissive licenses say, in essence, that you may modify the source code and use it in systems of your own choosing so long as you do not modify the terms of the license in any way. In other words, with that one restriction, you can do with it what you want, including building it into possibly proprietary systems.

The GPL licenses are similar, but have clauses which compel you to pass the rights to obtain and modify the software on to your end users. In other words, you share your source code. One option is to make it completely public by putting it onto a public server. Another is to offer it only to your end users by means of a written offer to provide the code when requested. The GPL goes further to say that you cannot incorporate GPL code into proprietary programs. Any attempt to do so would make the GPL apply to the whole. In other words, you cannot combine GPL and proprietary code in one program.

So, what about libraries? If they are licensed with the GPL, any program linked with them becomes GPL also. However, most libraries are licensed under the Lesser General Public License (LGPL). If this is the case, you are allowed to link with them from a proprietary program.

All the preceding description relates specifically to GPL v2 and LGPL v2.1. I should mention the latest versions: GPL v3 and LGPL v3. These are controversial, and I will admit that I don't fully understand the implications. However, the intention is to ensure that the GPL v3 and LGPL v3 components in any system can be replaced by the end user, which is in the spirit of open source software for everyone. It does pose some problems though. Some Linux devices are used to gain access to information according to a subscription level or another restriction, and replacing critical parts of the software may compromise that. Set-top-boxes fit into this category. There are also issues with security. If the owner of a device has access to the system code, then so might an unwelcome intruder. Often the defense is to have kernel images that are signed by an authority, the vendor, so that unauthorized updates are not possible. Is that an infringement of my right to modify my device? Opinions differ.

The TiVo set-top-box is an important part of this debate. It uses a Linux kernel, which is licensed under GPL v2. TiVo have released the source code of their version of the kernel and so comply with the license. TiVo also has a bootloader that will only load a kernel binary that is signed by them.
Consequently, you can build a modified kernel for a TiVo box but you cannot load it on the hardware. The Free Software Foundation (FSF) takes the position that this is not in the spirit of open source software and refers to this procedure as Tivoization. The GPL v3 and LGPL v3 were written to explicitly prevent this from happening. Some projects, the Linux kernel in particular, have been reluctant to adopt the version three licenses because of the restrictions they would place on device manufacturers.

Hardware for embedded Linux

If you are designing or selecting hardware for an embedded Linux project, what do you look out for?

Firstly, a CPU architecture that is supported by the kernel—unless you plan to add a new architecture yourself, of course! Looking at the source code for Linux 4.9, there are 31 architectures, each represented by a sub-directory in the arch/ directory. They are all 32- or 64-bit architectures, most with a memory management unit (MMU), but some without. The ones most often found in embedded devices are ARM, MIPS, PowerPC, and x86, each in 32- and 64-bit variants, and all of which have memory management units. Hardware that doesn't have an MMU runs a subset of Linux known as microcontroller Linux, or uClinux. These processor architectures include ARC, Blackfin, MicroBlaze, and Nios. I will mention uClinux from time to time but I will not go into detail because it is a rather specialized topic.

Secondly, you will need a reasonable amount of RAM. 16 MiB is a good minimum, although it is quite possible to run Linux using half that. It is even possible to run Linux with 4 MiB if you are prepared to go to the trouble of optimizing every part of the system. It may even be possible to get lower, but there comes a point at which it is no longer Linux.

Thirdly, there is non-volatile storage, usually flash memory. 8 MiB is enough for a simple device such as a webcam or a simple router. As with RAM, you can create a workable Linux system with less storage if you really want to, but the lower you go, the harder it becomes. Linux has extensive support for flash storage devices, including raw NOR and NAND flash chips, and managed flash in the form of SD cards, eMMC chips, USB flash memory, and so on.

Fourthly, a debug port is very useful, most commonly an RS-232 serial port. It does not have to be fitted on production boards, but makes board bring-up, debugging, and development much easier.

Fifthly, you need some means of loading software when starting from scratch. A few years ago, boards would have been fitted with a Joint Test Action Group (JTAG) interface for this purpose, but modern SoCs have the ability to load boot code directly from removable media, especially SD and micro SD cards, or serial interfaces such as RS-232 or USB.

In addition to these basics, there are interfaces to the specific bits of hardware your device needs to get its job done. Mainline Linux comes with open source drivers for many thousands of different devices, and there are drivers (of variable quality) from the SoC manufacturer and from the OEMs of third-party chips that may be included in the design, but remember my comments on the commitment and ability of some manufacturers. As a developer of embedded devices, you will find that you spend quite a lot of time evaluating and adapting third-party code, if you have it, or liaising with the manufacturer if you don't. Finally, you will have to write the device support for interfaces that are unique to the device, or find someone to do it for you.
Hardware

The worked examples are intended to be generic, but to make them relevant and easy to follow, I have had to choose specific hardware. I have chosen two exemplar devices: the BeagleBone Black and QEMU. The first is a widely-available and cheap development board which can be used in serious embedded hardware. The second is a machine emulator that can be used to create a range of systems that are typical of embedded hardware. It was tempting to use QEMU exclusively, but, like all emulations, it is not quite the same as the real thing. Using a BeagleBone Black, you have the satisfaction of interacting with real hardware and seeing real LEDs flash. I could have selected a board that is more up-to-date than the BeagleBone Black, which is several years old now, but I believe that its popularity gives it a degree of longevity and it means that it will continue to be available for some years yet. In any case, I encourage you to try out as many of the examples as you can, using either of these two platforms, or indeed any embedded hardware you may have to hand.

The BeagleBone Black

The BeagleBone and the later BeagleBone Black are open hardware designs for a small, credit card sized development board produced by CircuitCo LLC. The main repository of information is at https://beagleboard.org/. The main points of the specifications are:

TI AM335x 1 GHz ARM® Cortex-A8 Sitara SoC
512 MiB DDR3 RAM
2 or 4 GiB 8-bit eMMC on-board flash storage
Serial port for debug and development
MicroSD connector, which can be used as the boot device
Mini USB OTG client/host port that can also be used to power the board
Full size USB 2.0 host port
10/100 Ethernet port
HDMI for video and audio output

In addition, there are two 46-pin expansion headers for which there are a great variety of daughter boards, known as capes, which allow you to adapt the board to do many different things. However, you do not need to fit any capes in the examples. In addition to the board itself, you will need:

A mini USB to full-size USB cable (supplied with the board) to provide power, unless you have the last item on this list.
An RS-232 cable that can interface with the 6-pin 3.3V TTL level signals provided by the board. The Beagleboard website has links to compatible cables.
A microSD card and a means of writing to it from your development PC or laptop, which will be needed to load software onto the board.
An Ethernet cable, as some of the examples require network connectivity.
Optional, but recommended, a 5V power supply capable of delivering 1 A or more.

QEMU

QEMU is a machine emulator. It comes in a number of different flavors, each of which can emulate a processor architecture and a number of boards built using that architecture. For example, we have the following:

qemu-system-arm: ARM
qemu-system-mips: MIPS
qemu-system-ppc: PowerPC
qemu-system-x86: x86 and x86_64

For each architecture, QEMU emulates a range of hardware, which you can see by using the option -machine help. Each machine emulates most of the hardware that would normally be found on that board. There are options to link hardware to local resources, such as using a local file for the emulated disk drive.
Here is a concrete example:

$ qemu-system-arm -machine vexpress-a9 -m 256M -drive file=rootfs.ext4,sd -kernel zImage -dtb vexpress-v2p-ca9.dtb -append "console=ttyAMA0,115200 root=/dev/mmcblk0" -serial stdio -net nic,model=lan9118 -net tap,ifname=tap0

The options used in the preceding command line are:

-machine vexpress-a9: Creates an emulation of an ARM Versatile Express development board with a Cortex A-9 processor
-m 256M: Populates it with 256 MiB of RAM
-drive file=rootfs.ext4,sd: Connects the SD interface to the local file rootfs.ext4 (which contains a filesystem image)
-kernel zImage: Loads the Linux kernel from the local file named zImage
-dtb vexpress-v2p-ca9.dtb: Loads the device tree from the local file vexpress-v2p-ca9.dtb
-append "...": Supplies this string as the kernel command-line
-serial stdio: Connects the serial port to the terminal that launched QEMU, usually so that you can log on to the emulated machine via the serial console
-net nic,model=lan9118: Creates a network interface
-net tap,ifname=tap0: Connects the network interface to the virtual network interface tap0

To configure the host side of the network, you need the tunctl command from the User Mode Linux (UML) project; on Debian and Ubuntu, the package is named uml-utilities:

$ sudo tunctl -u $(whoami) -t tap0

This creates a network interface named tap0 which is connected to the network controller in the emulated QEMU machine. You configure tap0 in exactly the same way as any other interface. I will be using Versatile Express for most of my examples, but it should be easy to use a different machine or architecture.

Software

I have used only open source software, both for the development tools and the target operating system and applications. I assume that you will be using Linux on your development system. I tested all the host commands using Ubuntu 14.04 and so there is a slight bias towards that particular version, but any modern Linux distribution is likely to work just fine.

Summary

Embedded hardware will continue to get more complex, following the trajectory set by Moore's Law. Linux has the power and the flexibility to make use of hardware in an efficient way. Linux is just one component of open source software out of the many that you need to create a working product. The fact that the code is freely available means that people and organizations at many different levels can contribute. However, the sheer variety of embedded platforms and the fast pace of development lead to isolated pools of software which are not shared as efficiently as they should be. In many cases, you will become dependent on this software, especially the Linux kernel that is provided by your SoC or board vendor, and to a lesser extent, the toolchain. Some SoC manufacturers are getting better at pushing their changes upstream and the maintenance of these changes is getting easier. Fortunately, there are some powerful tools that can help you create and maintain the software for your device. For example, Buildroot is ideal for small systems and the Yocto Project for larger ones. Before I describe these build tools, I will describe the four elements of embedded Linux, which you can apply to all embedded Linux projects, however they are created.

Resources for Article:

Further resources on this subject:

Programming with Linux [article]
Embedded Linux and Its Elements [article]
Revisiting Linux Network Basics [article]

Creating the First Python Script

Packt
09 Aug 2017
27 min read
In this article by Silas Toms, the author of the book ArcPy and ArcGIS - Second Edition, we will demonstrate how to use ModelBuilder, which ArcGIS professionals are already familiar with, to model their first analysis and then export it out as a script. With the Python environment configured to fit our needs, we can now create and execute ArcPy scripts. To ease into the creation of Python scripts, this article will use ArcGIS ModelBuilder to model a simple analysis, and export it as a Python script. ModelBuilder is very useful for creating Python scripts. It has an operational and a visual component, and all models can be outputted as Python scripts, where they can be further customized. In this article, we will cover the following topics:

Modeling a simple analysis using ModelBuilder
Exporting the model out to a Python script
Windows file paths versus Pythonic file paths
String formatting methods

(For more resources related to this topic, see here.)

Prerequisites

The following are the prerequisites for this article:

ArcGIS 10x and Python 2.7, with arcpy available as a module.
For this article, the accompanying data and scripts should be downloaded from Packt Publishing's website. The completed scripts are available for comparison purposes, and the data will be used for this article's analysis.
To run the code and test code examples, use your favorite IDE or open the IDLE (Python GUI) program from the Start Menu/ArcGIS/Python2.7 folder after installing ArcGIS for Desktop. Use the built-in "interpreter" or code entry interface, indicated by the triple chevron >>> and a blinking cursor.

ModelBuilder

ArcGIS has been in development since the 1970s. Since that time, it has included a variety of programming languages and tools to help GIS users automate analysis and map production. These include the Avenue scripting language in the ArcGIS 3x series, and the ARC Macro Language (AML) in the ARCInfo Workstation days, as well as VBScript up until ArcGIS 10x, when Python was introduced. Another useful tool introduced in ArcGIS 9x was ModelBuilder, a visual programming environment used for both modeling analysis and creating tools that can be used repeatedly with different input feature classes.

A useful feature of ModelBuilder is an export function, which allows modelers to create Python scripts directly from a model. This makes it easier to compare how parameters in a ModelBuilder tool are accepted as compared to how a Python script calls the same tool and supplies its parameters, and how generated feature classes are named and placed within the file structure. ModelBuilder is a helpful tool on its own, and its Python export functionality makes it easy for a GIS analyst to generate and customize ArcPy scripts.

Creating a model and exporting to Python

This article and the associated scripts depend on the downloadable SanFrancisco.gdb geodatabase available from Packt. SanFrancisco.gdb contains data downloaded from https://datasf.org/ and the US Census' American Factfinder website at https://factfinder.census.gov/faces/nav/jsf/pages/index.xhtml. All census and geographic data included in the geodatabase is from the 2010 census. The data is contained within a feature dataset called SanFrancisco. The data in this feature dataset is in NAD 83 California State Plane Zone 3, and the linear unit of measure is the US foot. This corresponds to SRID 2227 in the European Petroleum Survey Group (EPSG) format.
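If you want to confirm the coordinate system from the Python window or IDLE before starting, a short check like the following can help. This is only an illustrative sketch and is not part of the book's exercise; it assumes ArcPy is importable on your machine.

import arcpy

# Build a spatial reference object from its well-known ID (the SRID/EPSG code)
sr = arcpy.SpatialReference(2227)
print(sr.name)            # should report the NAD 1983 StatePlane California III (US Feet) system
print(sr.linearUnitName)  # should report a US foot linear unit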
The analysis which we will create with the model, and eventually export to Python for further refinement, will use bus stops along a specific line in San Francisco. These bus stops will be buffered to create a representative region around each bus stop. The buffered areas will then be intersected with census blocks to find out how many people live within each representative region around the bus stops.

Modeling the Select and Buffer tools

Using ModelBuilder, we will model the basic bus stop analysis. Once it has been modeled, it will be exported as an automatically generated Python script. Follow these steps to begin the analysis:

Open up ArcCatalog, and create a folder connection to the folder containing SanFrancisco.gdb. I have put the geodatabase in a C drive folder called "Projects" for a resulting file path of C:\Projects\SanFrancisco.gdb.
Right-click on the geodatabase, and add a new toolbox called Chapter2Tools.
Right-click on the geodatabase; select New, and then Feature Dataset, from the menu. A dialogue will appear that asks for a name; call it Chapter2Results, and push Next. It will ask for a spatial reference system; enter 2227 into the search bar, and push the magnifying glass icon. This will locate the correct spatial reference system: NAD 1983 StatePlane California III FIPS 0403 Feet. Don't select a vertical reference system, as we are not doing any Z value analysis. Push Next, select the default tolerances, and push Finish.
Next, open ModelBuilder using the ModelBuilder icon or by right-clicking on the Toolbox, and create a new Model. Save the model in the Chapter2Tools toolbox as Chapter2Model1.
Drag in the Bus_Stops feature class and the Select tool from the Analysis/Extract toolset in ArcToolbox.
Open up the Select tool, and name the output feature class Inbound71. Make sure that the feature class is written to the Chapter2Results feature dataset.
Open up the Expression SQL Query Builder, and create the following SQL expression: NAME = '71 IB' AND BUS_SIGNAG = 'Ferry Plaza'.
The next step is to add a Buffer tool from the Analysis/Proximity toolset. The Buffer tool will be used to create buffers around each bus stop. The buffered bus stops allow us to intersect with census data in the form of census blocks, creating the representative regions around each bus stop.
Connect the output of the Select tool (Inbound71) to the Buffer tool. Open up the Buffer tool, add 400 to the Distance field, and change the units to Feet. Leave the rest of the options blank. Click on OK, and return to the model.

Adding in the Intersect tool

Now that we have selected the bus line of interest, and buffered the stops to create representative regions, we will need to intersect the regions with the census blocks to find the population of each representative region. This can be done as follows:

First, add the CensusBlocks2010 feature class from the SanFrancisco feature dataset to the model.
Next, add in the Intersect tool located in the Analysis/Overlay toolset in the ArcToolbox. While we could use a Spatial Join to achieve a similar result, I have used the Intersect tool to capture the area of intersect for use later in the model and script.

At this point, our model should look like this:

Tallying the analysis results

After we have created this simple analysis, the next step is to determine the results for each bus stop.
Finding the number of people that live in census blocks touched by the 400-foot buffer of each bus stop involves examining each row of data in the final feature class, and selecting rows that correspond to the bus stop. Once these are selected, a sum of the selected rows would be calculated either using the Field Calculator or the Summarize tool. All of these methods will work, and yet none are perfect. They take too long, and worse, are not repeatable automatically if an assumption in the model is adjusted (if the buffer is adjusted from 400 feet to 500 feet, for instance).

This is where the traditional uses of ModelBuilder begin to fail analysts. It should be easy to instruct the model to select all rows associated with each bus stop, and then generate a summed population figure for each bus stop's representative region. It would be even better to have the model create a spreadsheet to contain the final results of the analysis. It's time to use Python to take this analysis to the next level.

Exporting the model and adjusting the script

While modeling analysis in ModelBuilder has its drawbacks, there is one fantastic option built into ModelBuilder: the ability to create a model, and then export the model to Python. Along with the ArcGIS Help Documentation, it is the best way to discover the correct Python syntax to use when writing ArcPy scripts.

Create a folder that can hold the exported scripts next to the SanFrancisco geodatabase (for example, C:\Projects\Scripts). This will hold both the exported scripts that ArcGIS automatically generates, and the versions that we will build from those generated scripts. Now, perform the following steps:

Open up the model called Chapter2Model1.
Click on the Model menu in the upper-left side of the screen.
Select Export from the menu.
Select To Python Script.
Save the script as Chapter2Model1.py.

Note that there is also the option to export the model as a graphic. Creating a graphic of the model is a good way to share what the model is doing with other analysts without the need to share the model and the data, and can also be useful when sharing Python scripts as well.

The Automatically generated script

Open the automatically generated script in an IDE. It should look like this:

# -*- coding: utf-8 -*-
# ---------------------------------------------------------------------------
# Chapter2Model1.py
# Created on: 2017-01-26 04:26:31.00000
#   (generated by ArcGIS/ModelBuilder)
# Description:
# ---------------------------------------------------------------------------

# Import arcpy module
import arcpy

# Local variables:
Bus_Stops = "C:\\Projects\\SanFrancisco.gdb\\SanFrancisco\\Bus_Stops"
Inbound71 = "C:\\Projects\\SanFrancisco.gdb\\Chapter2Results\\Inbound71"
Inbound71_400ft_buffer = "C:\\Projects\\SanFrancisco.gdb\\Chapter2Results\\Inbound71_400ft_buffer"
CensusBlocks2010 = "C:\\Projects\\SanFrancisco.gdb\\SanFrancisco\\CensusBlocks2010"
Intersect71Census = "C:\\Projects\\SanFrancisco.gdb\\Chapter2Results\\Intersect71Census"

# Process: Select
arcpy.Select_analysis(Bus_Stops, Inbound71, "NAME = '71 IB' AND BUS_SIGNAG = 'Ferry Plaza'")

# Process: Buffer
arcpy.Buffer_analysis(Inbound71, Inbound71_400ft_buffer, "400 Feet", "FULL", "ROUND", "NONE", "")

# Process: Intersect
arcpy.Intersect_analysis("C:\\Projects\\SanFrancisco.gdb\\Chapter2Results\\Inbound71_400ft_buffer #;C:\\Projects\\SanFrancisco.gdb\\SanFrancisco\\CensusBlocks2010 #", Intersect71Census, "ALL", "", "INPUT")

Let's examine this script line by line.
The first line is preceded by a pound sign ("#"), which again means that this line is a comment; however, it is not ignored by the Python interpreter when the script is executed as usual, but is used to help Python interpret the encoding of the script as described here: http://legacy.python.org/dev/peps/pep-0263. The second commented line and the third line are included for decorative purposes. The next four lines, all commented, are used for providing readers information about the script: what it is called and when it was created, along with a description, which is pulled from the model's properties. Another decorative line is included to visually separate out the informative header from the body of the script. While the commented information section is nice to include in a script for other users of the script, it is not necessary.

The body of the script, or the executable portion of the script, starts with the import arcpy line. Import statements are, by convention, included at the top of the body of the script. In this instance, the only module that is being imported is ArcPy.

ModelBuilder's export function creates not only an executable script, but also comments each section to help mark the different sections of the script. The comments let users know where the variables are located, and where the ArcToolbox tools are being executed. After the import statements come the variables. In this case, the variables represent the file paths to the input and output feature classes. The variable names are derived from the names of the feature classes (the base names of the file paths). The file paths are assigned to the variables using the assignment operator ("="), and the parts of the file paths are separated by two backslashes.

File paths in Python

To store and retrieve data, it is important to understand how file paths are used in Python as compared to how they are represented in Windows. In Python, file paths are strings, and strings in Python have special characters used to represent tabs ("\t"), newlines ("\n"), or carriage returns ("\r"), among many others. These special characters all incorporate single backslashes, making it very hard to create a file path that uses single backslashes. File paths in Windows Explorer all use single backslashes.

Windows Explorer: C:\Projects\SanFrancisco.gdb\Chapter2Results\Intersect71Census

Python was developed within the Linux environment, where file paths have forward slashes. There are a number of methods used to avoid this issue. The first is using file paths with forward slashes. The Python interpreter will understand file paths with forward slashes, as seen in this code:

Python version: "C:/Projects/SanFrancisco.gdb/Chapter2Results/Intersect71Census"

Within a Python script, the Python file path with the forward slashes will definitely work, while the Windows Explorer version might cause the script to throw an exception, as Python strings can have special characters like the newline character "\n", or tab "\t", that will cause the string file path to be read incorrectly by the Python interpreter.

Another method used to avoid the issue with special characters is the one employed by ModelBuilder when it automatically creates the Python scripts from a model. In this case, the backslashes are "escaped" using a second backslash.
The preceding script uses this second method to produce the following results:

Python escaped version: "C:\\Projects\\SanFrancisco.gdb\\Chapter2Results\\Intersect71Census"

The third method, which I use when copying file paths from ArcCatalog or Windows Explorer into scripts, is to create what is known as a "raw" string. This is the same as a regular string, but it includes an "r" before the string begins. This "r" alerts the Python interpreter that the following string does not contain any special characters or escape characters. Here is an example of how it is used:

Python raw string: r"C:\Projects\SanFrancisco.gdb\SanFrancisco\Bus_Stops"

Using raw strings makes it easier to grab a file path from Windows Explorer, and add it to a string inside a script. It also makes it easier to avoid accidentally forgetting to include a set of double backslashes in a file path, which happens all the time and is the cause of many script bugs.

String manipulation

There are three major methods for inserting variables into strings. Each has different advantages and disadvantages of a technical nature. It's good to know about all three, as they have uses beyond our needs here, so let's review them.

String manipulation method 1: string addition

String addition seems like an odd concept at first, as it would not seem possible to "add" strings together, unlike integers or floats which are numbers. However, within Python and other programming languages, this is a normal step. Using the plus sign "+", strings are "added" together to make longer strings, or to allow variables to be added into the middle of existing strings. Here are some examples of this process:

>>> aString = "This is a string"
>>> bString = " and this is another string"
>>> cString = aString + bString
>>> cString

The output is as follows:

'This is a string and this is another string'

Two or more strings can be "added" together, and the result can be assigned to a third variable for using it later in the script. This process can be useful for data processing and formatting. Another similar offshoot of string addition is string multiplication, where strings are multiplied by an integer to produce repeating versions of the string, like this:

>>> "string" * 3
'stringstringstring'

String manipulation method 2: string formatting #1

The second method of string manipulation, known as string formatting, involves adding placeholders into the string, which accept specific kinds of data. This means that these special strings can accept other strings as well as integers and float values. These placeholders use the modulo "%" and a key letter to indicate the type of data to expect. Strings are represented using %s, floats using %f, and integers using %d. The floats can also be adjusted to limit the digits included by adding a modifying number after the modulo. If there is more than one placeholder in a string, the values are passed to the string in a tuple. This method has become less popular, since the third method discussed next was introduced in Python 2.6, but it is still valuable to know, as many older scripts use it.
Here is an example of this method:

>>> origString = "This string has as a placeholder %s"
>>> newString = origString % "and this text was added"
>>> print newString

The output is as follows:

This string has as a placeholder and this text was added

Here is an example when using a float placeholder:

>>> floatString1 = "This string has a float here: %f"
>>> newString = floatString1 % 1.0
>>> print newString

The output is as follows:

This string has a float here: 1.000000

Here is another example when using a float placeholder:

>>> floatString2 = "This string has a float here: %.1f"
>>> newString2 = floatString2 % 1.0
>>> print newString2

The output is as follows:

This string has a float here: 1.0

Here is an example using an integer placeholder:

>>> intString = "Here is an integer: %d"
>>> newString = intString % 1
>>> print newString

The output is as follows:

Here is an integer: 1

String manipulation method 3: string formatting #2

The final method is known as string formatting. It is similar to string formatting method 1, with the added benefit of not requiring a specific data type of placeholder. The placeholders, or tokens as they are also known, are only required to be in order to be accepted. The format function is built into strings; by adding .format to the string, and passing in parameters, the string accepts the values, as seen in the following example:

>>> formatString = "This string has 3 tokens: {0}, {1}, {2}"
>>> newString = formatString.format("String", 2.5, 4)
>>> print newString
This string has 3 tokens: String, 2.5, 4

The tokens don't have to be in order within the string, and can even be repeated by adding a token wherever it is needed within the template. The order of the values applied to the template is derived from the parameters supplied to the .format function, which passes the values to the string. The third method has become my go-to method for string manipulation because of the ability to add the values repeatedly, and because it makes it possible to avoid supplying the wrong type of data to a specific placeholder, unlike the second method.

The ArcPy tools

After the import statements and the variable definitions, the next section of the script is where the analysis is executed. The same tools that we created in the model--the Select, Buffer, and Intersect tools--are included in this section. The same parameters that we supplied in the model are also included here: the inputs and outputs, plus the SQL statement in the Select tool, and the buffer distance in the Buffer tool. The tool parameters are supplied to the tools in the script in the same order as they appear in the tool interfaces in the model. Here is the Select tool in the script:

arcpy.Select_analysis(Bus_Stops, Inbound71, "NAME = '71 IB' AND BUS_SIGNAG = 'Ferry Plaza'")

It works like this: the arcpy module has a "method", or tool, called Select_analysis. This method, when called, requires three parameters: the input feature class (or shapefile), the output feature class, and the SQL statement. In this example, the input is represented by the variable Bus_Stops, and the output feature class is represented by the variable Inbound71, both of which are defined in the variable section. The SQL statement is included as the third parameter.
Note that it could also be represented by a variable if the variable was defined prior to this line; the SQL statement, as a string, could be assigned to a variable, and the variable could replace the SQL statement as the third parameter. Here is an example of parameter replacement using a variable:

sqlStatement = "NAME = '71 IB' AND BUS_SIGNAG = 'Ferry Plaza'"
arcpy.Select_analysis(Bus_Stops, Inbound71, sqlStatement)

While ModelBuilder is good for assigning input and output feature classes to variables, it does not assign variables to every portion of the parameters. This will be an important thing to correct when we adjust and build our own scripts.

The Buffer tool accepts a similar set of parameters as the Select tool. There is an input feature class represented by a variable, an output feature class variable, and the distance that we provided (400 feet in this case), along with a series of parameters that were supplied by default. Note that the parameters rely on keywords, and these keywords can be adjusted within the text of the script to adjust the resulting buffer output. For instance, "Feet" could be adjusted to "Meters", and the buffer would be much larger. Check the help section of the tool to understand better how the other parameters will affect the buffer, and to find the keyword arguments that are accepted by the Buffer tool in ArcPy. Also, as noted earlier, all of the parameters could be assigned to variables, which can save time if the same parameters are used repeatedly throughout a script. Sometimes, the supplied parameter is merely an empty string, as in this case here with the last parameter:

arcpy.Buffer_analysis(Inbound71, Inbound71_400ft_buffer, "400 Feet", "FULL", "ROUND", "NONE", "")

The empty string for the last parameter, which, in this case, signifies that there is no dissolve field for this buffer, is found quite frequently within ArcPy. It could also be represented by two single quotes, but ModelBuilder has been built to use double quotes to encase strings.

The Intersect tool

The last tool, the Intersect tool, uses a different method to represent the files that need to be intersected together when the tool is executed. Because the tool accepts multiple files in the input section (meaning, there is no limit to the number of files that can be intersected together in one operation), it stores all of the file paths within one string. This string can be manipulated using one of the string manipulation methods discussed earlier, or it can be reorganized to accept a Python list that contains the file paths, or variables representing file paths, as the first parameter in any order. The Intersect tool will find the intersection of all of the supplied feature classes.

Adjusting the script

Now is the time to take the automatically generated script, and adjust it to fit our needs. We want the script to both produce the output data, and to have it analyze the data and tally the results into a spreadsheet. This spreadsheet will hold an averaged population value for each bus stop. The average will be derived from each census block that the buffered representative region surrounding the stops intersected. Save the original script as "Chapter2Model1Modified.py".

Adding the CSV module to the script

For this script, we will use the csv module, a useful module for creating Comma-Separated Value spreadsheets. Its simple syntax will make it a useful tool for creating script outputs; a small standalone sketch follows.
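To get a feel for that syntax before wiring it into the analysis, here is a tiny standalone sketch. It is not part of the exported model; the output path and the row values are made up for illustration, and it assumes the Python 2.7 interpreter installed with ArcGIS for Desktop (hence the 'wb' file mode).

import csv

# Write a two-row CSV file to a hypothetical location
with open(r'C:\Projects\Example.csv', 'wb') as csvfile:
    csvwriter = csv.writer(csvfile, delimiter=',')
    csvwriter.writerow(['STOPID', 'AVERAGEPOP'])  # a header row
    csvwriter.writerow([1234, 56.7])              # one made-up data row

The same writer object and writerow call are all we will need later when the real averages are tallied.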
ArcGIS for Desktop also installs the xlrd and xlwt modules, used to read and generate Excel spreadsheets respectively. These modules are also great for data analysis output. After the import arcpy line, add import csv. This will allow us to use the csv module for creating the spreadsheet.

# Import arcpy module
import arcpy
import csv

The next adjustment is made to the Intersect tool. Notice that the two paths included in the input string are also defined as variables in the variable section. Remove the file paths from the input strings, and replace them with a list containing the variable names of the input datasets, as follows:

# Process: Intersect
arcpy.Intersect_analysis([Inbound71_400ft_buffer, CensusBlocks2010], Intersect71Census, "ALL", "", "INPUT")

Accessing the data: using a cursor

Now that the script is in place to generate the raw data we need, we need a way to access the data held in the output feature class from the Intersect tool. This access will allow us to aggregate the rows of data representing each bus stop. We also need a data container to hold the aggregated data in memory before it is written to the spreadsheet. To accomplish the second part, we will use a Python dictionary. To accomplish the first part, we will use a method built into the ArcPy module: the Data Access SearchCursor.

The Python dictionary will be added after the Intersect tool. A dictionary in Python is created using curly brackets {}. Add the following line to the script, below the analysis section:

dataDictionary = {}

This script will use the bus stop IDs as keys for the dictionary. The values will be lists, which will hold all of the population values associated with each busStopID. Add the following lines to generate a Data Cursor:

with arcpy.da.SearchCursor(Intersect71Census, ["STOPID", "POP10"]) as cursor:
    for row in cursor:
        busStopID = row[0]
        pop10 = row[1]
        if busStopID not in dataDictionary.keys():
            dataDictionary[busStopID] = [pop10]
        else:
            dataDictionary[busStopID].append(pop10)

This iteration combines a few ideas in Python and ArcPy. The with...as statement is used to create a variable (cursor), which represents the arcpy.da.SearchCursor object. It could also be written like this:

cursor = arcpy.da.SearchCursor(Intersect71Census, ["STOPID", "POP10"])

The advantage of the with...as structure is that the cursor object is erased from memory when the iteration is completed, which eliminates locks on the feature classes being evaluated. The arcpy.da.SearchCursor function requires an input feature class, and a list of fields to be returned. Optionally, an SQL statement can limit the number of rows returned.

The next line, for row in cursor:, is the iteration through the data. It is not a normal Pythonic iteration, a distinction that will have ramifications in certain instances. For instance, one cannot pass index parameters to the cursor object to only evaluate specific rows within the cursor object, as one can do with a list. When using a Search Cursor, each row of data is returned as a tuple, which cannot be modified. The data can be accessed using indexes.

The if...else condition allows the data to be sorted. As noted earlier, the bus stop ID, which is the first member of the data included in the tuple, will be used as a key. The conditional evaluates if the bus stop ID is included in the dictionary's existing keys (which are contained in a list, and accessed using the dictionary.keys() method).
If it is not, it is added to the keys, and assigned a value that is a list that contains (at first) one piece of data, the population value contained in that row. If it does exist in the keys, the list is appended with the next population value associated with that bus stop ID. With this code, we have now sorted each census block population according to the bus stop with which it is associated.

Next, we need to add code to create the spreadsheet. This code will use the same with...as structure, and will generate an average population value by using two built-in Python functions: sum, which creates a sum from a list of numbers, and len, which will get the length of a list, tuple, or string.

with open(r'C:\Projects\Averages.csv', 'wb') as csvfile:
    csvwriter = csv.writer(csvfile, delimiter=',')
    for busStopID in dataDictionary.keys():
        popList = dataDictionary[busStopID]
        averagePop = sum(popList)/len(popList)
        data = [busStopID, averagePop]
        csvwriter.writerow(data)

The average population value is retrieved from the dictionary using the busStopID key, and then assigned to the variable averagePop. The two data pieces, the busStopID and the averagePop variable, are then added to a list. This list is supplied to a csvwriter object, which knows how to accept the data and write it out to a file located at the file path supplied to the built-in Python function open, used to create simple files.

The script is complete, although it is nice to add one more line to the end to give us visual confirmation that the script has run:

print "Data Analysis Complete"

This last line will create an output indicating that the script has run. Once it is done, go to the location of the output CSV file and open it using Excel or Notepad, and see the results of the analysis. Our first script is complete!

Exceptions and tracebacks

During the process of writing and testing scripts, there will be errors that cause the code to break and throw exceptions. In Python, these are reported as a "traceback", which shows the last few lines of code executed before an exception occurred. To best understand the message, read it from the last line up. It will tell you the type of exception that occurred, and preceding that will be the code that failed, with a line number, which should allow you to find and fix the code. It's not perfect, but it works.

Overwriting files

One common issue is that ArcGIS for Desktop does not allow you to overwrite files without turning on an environment variable. To avoid this issue, you can add a line after the import statements that will make overwriting files possible. Be aware that the original data will be unrecoverable once it is overwritten.
It uses the env module to access the ArcGIS environment:

import arcpy
arcpy.env.overwriteOutput = True

The final script

Here is how the script should look in the end:

# Chapter2Model1Modified.py

# Import arcpy module
import arcpy
import csv

# Local variables:
Bus_Stops = r"C:\Projects\SanFrancisco.gdb\SanFrancisco\Bus_Stops"
CensusBlocks2010 = r"C:\Projects\SanFrancisco.gdb\SanFrancisco\CensusBlocks2010"
Inbound71 = r"C:\Projects\SanFrancisco.gdb\Chapter2Results\Inbound71"
Inbound71_400ft_buffer = r"C:\Projects\SanFrancisco.gdb\Chapter2Results\Inbound71_400ft_buffer"
Intersect71Census = r"C:\Projects\SanFrancisco.gdb\Chapter2Results\Intersect71Census"

# Process: Select
arcpy.Select_analysis(Bus_Stops, Inbound71, "NAME = '71 IB' AND BUS_SIGNAG = 'Ferry Plaza'")

# Process: Buffer
arcpy.Buffer_analysis(Inbound71, Inbound71_400ft_buffer, "400 Feet", "FULL", "ROUND", "NONE", "")

# Process: Intersect
arcpy.Intersect_analysis([Inbound71_400ft_buffer, CensusBlocks2010], Intersect71Census, "ALL", "", "INPUT")

dataDictionary = {}

with arcpy.da.SearchCursor(Intersect71Census, ["STOPID", "POP10"]) as cursor:
    for row in cursor:
        busStopID = row[0]
        pop10 = row[1]
        if busStopID not in dataDictionary.keys():
            dataDictionary[busStopID] = [pop10]
        else:
            dataDictionary[busStopID].append(pop10)

with open(r'C:\Projects\Averages.csv', 'wb') as csvfile:
    csvwriter = csv.writer(csvfile, delimiter=',')
    for busStopID in dataDictionary.keys():
        popList = dataDictionary[busStopID]
        averagePop = sum(popList)/len(popList)
        data = [busStopID, averagePop]
        csvwriter.writerow(data)

print "Data Analysis Complete"

Summary

In this article, you learned how to craft a model of an analysis and export it out to a script. In particular, you learned how to use ModelBuilder to create an analysis and export it out as a script, and how to adjust the script to be more "Pythonic". After explaining the auto-generated script, we adjusted the script to include a results analysis and summation, which was outputted to a CSV file. We also briefly touched on the use of Search Cursors. Also, we saw how built-in modules such as the csv module can be used along with ArcPy to capture analysis output in formatted spreadsheets.

Resources for Article:

Further resources on this subject:

Using the ArcPy DataAccess Module with Feature Classes and Tables [article]
Measuring Geographic Distributions with ArcGIS Tool [article]
Learning to Create and Edit Data in ArcGIS [article]


Games and Exercises

Packt
09 Aug 2017
3 min read
In this article by Shishira Bhat and Ravi Wray, authors of the book, Learn Java in 7 days, we will study the following concepts:

Making an object as the return type for a method
Making an object as the parameter for a method

(For more resources related to this topic, see here.)

Let's start this article by revisiting the reference variables and custom data types:

In the preceding program, p is a variable of datatype, Pen. Yes! Pen is a class, but it is also a datatype, a custom datatype. The p variable stores the address of the Pen object, which is in heap memory. The p variable is a reference that refers to a Pen object. Now, let's get more comfortable by understanding and working with examples.

How to return an Object from a method?

In this section, let's understand return types. In the following code, methods return the inbuilt data types (int and String), and the reason is explained after each method, as follows:

int add () {
    int res = (20+50);
    return res;
}

The add method returns the res (70) variable, which is of the int type. Hence, the return type must be int:

String sendsms () {
    String msg = "hello";
    return msg;
}

The sendsms method returns a variable by the name of msg, which is of the String type. Hence, the return type is String. The data type of the returning value and the return type must be the same.

In the following code snippet, the return type of the givePen method is not an inbuilt data type. However, the return type is a class (Pen). Let's understand the following code:

The givePen () method returns a variable (reference variable) by the name of p, which is of the Pen type. Hence, the return type is Pen:

In the preceding program, tk is a variable of the Ticket type. The method returns tk; hence, the return type of the method is Ticket.

A method accepting an object (parameter)

After seeing how a method can return an object/reference, let's understand how a method can take an object/reference as the input, that is, a parameter. We already understood that if a method takes parameter(s), then we need to pass argument(s).

Example

In the preceding program, the method takes two parameters, i and k. So, while calling/invoking the method, we need to pass two arguments, which are 20.5 and 15. The parameter type and the argument type must be the same. Remember that when class is the datatype, then object is the data. Consider the following example with respect to a non-primitive/class data type and the object as its data:

In the preceding code, the Kid class has the eat method, which takes ch as a parameter of the Chocolate type, that is, the data type of ch is Chocolate, which is a class. When class is the data type, then the object of that class is an actual data or argument. Hence, new Chocolate() is passed as an argument to the eat method. Let's see one more example:

The drink method takes wtr as the parameter of the type, Water, which is a class/non-primitive type; hence, the argument must be an object of the Water class.

Summary

In this article, we have learned what to return when a class is a return type for a method and what to pass as an argument for a method when a class is a parameter for the method.

Resources for Article:

Further resources on this subject:

Saying Hello to Java EE [article]
Getting Started with Sorting Algorithms in Java [article]
Debugging Java Programs using JDB [article]


Parallelize It

Packt
18 Jul 2017
15 min read
In this article by Elliot Forbes, the author of the book Learning Concurrency in Python, we will explain concurrency and parallelism thoroughly, and bring in the necessary CPU knowledge related to them. Concurrency and parallelism are two concepts that are commonly confused. The reality though is that they are quite different, and if you designed software to be concurrent when instead you needed parallel execution, then you could be seriously impacting your software's true performance potential. Due to this, it's vital to know exactly what the two concepts mean so that you can understand the differences. Through knowing these differences, you'll be putting yourself at a distinct advantage when it comes to designing your own high performance software in Python. In this article we'll be covering the following topics:

What is concurrency and what are the major bottlenecks that impact our applications?
What is parallelism and how does this differ from concurrency?

(For more resources related to this topic, see here.)

Understanding concurrency

Concurrency is essentially the practice of doing multiple things at the same time, but not specifically in parallel. It can help us to improve the perceived performance of our applications and it can also improve the speed at which our applications run.

The best way to think of how concurrency works is to imagine one person working on multiple tasks and quickly switching between these tasks. Imagine this one person was working concurrently on a program and at the same time dealing with support requests. This person would focus primarily on the writing of their program and quickly context switch to fixing a bug or dealing with a support issue should there be one. Once they complete the support task, they could context switch again back to writing their program really quickly.

However, in computing there are typically two performance bottlenecks that we have to watch out for and guard against when writing our programs. It's important to know the differences between the two bottlenecks, as if we tried to apply concurrency to a CPU-based bottleneck then you could find that the program actually starts to see performance decreases as opposed to increases. And if you tried to apply parallelism to a task that really requires a concurrent solution then again you could see the same performance hits.

Properties of concurrent systems

All concurrent systems share a similar set of properties; these can be defined as:

Multiple actors: These represent the different processes and threads all trying to actively make progress on their own tasks. We could have multiple processes that contain multiple threads all trying to run at the same time.
Shared resources: This represents the memory, the disk, and other resources that the actors in the above group must utilize in order to perform what they need to do.
Rules: All concurrent systems must follow a strict set of rules that define when actors can and can't acquire locks, access memory, modify state, and so on. These rules are vital in order for these concurrent systems to work, otherwise our programs would tear themselves apart.

Input/Output bottlenecks

Input/Output bottlenecks, or I/O bottlenecks for short, are bottlenecks where your computer spends more time waiting on various inputs and outputs than it does on processing the information. You'll typically find this type of bottleneck when you are working with an I/O heavy application. We could take your standard web browser as an example of a heavy I/O application.
In a browser we typically spend a significantly longer amount of time waiting for network requests to finish for things like style sheets, scripts or HTML pages to load, as opposed to rendering them on the screen. If the rate at which data is requested is slower than the rate at which it is consumed, then you have yourself an I/O bottleneck. One of the main ways to improve the speed of these applications is typically to either improve the speed of the underlying I/O by buying more expensive and faster hardware, or to improve the way in which we handle these I/O requests.

A great example of a program bound by I/O bottlenecks would be a web crawler. The main purpose of a web crawler is to traverse the web and essentially index web pages so that they can be taken into consideration when Google runs its search ranking algorithm to decide the top 10 results for a given keyword. We'll start by creating a very simple script that just requests a page and times how long it takes to request said web page:

import urllib.request
import time

t0 = time.time()
req = urllib.request.urlopen('http://www.example.com')
pageHtml = req.read()
t1 = time.time()
print("Total Time To Fetch Page: {} Seconds".format(t1-t0))

If we break down this code, first we import the two necessary modules, urllib.request and the time module. We then record the starting time, request the web page example.com, record the ending time and print out the time difference. Now say we wanted to add a bit of complexity and follow any links to other pages so that we could index them in the future. We could use a library such as BeautifulSoup in order to make our lives a little easier:

import urllib.request
import time
from bs4 import BeautifulSoup

t0 = time.time()
req = urllib.request.urlopen('http://www.example.com')
t1 = time.time()
print("Total Time To Fetch Page: {} Seconds".format(t1-t0))
soup = BeautifulSoup(req.read(), "html.parser")
for link in soup.find_all('a'):
    print(link.get('href'))
t2 = time.time()
print("Total Execution Time: {} Seconds".format(t2-t0))

When I execute the above program, the two timings are printed to my terminal. You'll notice from this output that the time to fetch the page is over a quarter of a second. Now imagine we wanted to run our web crawler for a million different web pages; our total execution time would be roughly a million times longer. The main real cause for this enormous execution time would be purely down to the I/O bottleneck we face in our program. We spend a massive amount of time waiting on our network requests and a fraction of that time parsing our retrieved page for further links to crawl.
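To make the benefit of concurrency concrete, here is a small sketch that is not part of the original example. It fetches the same hypothetical list of pages first sequentially and then with a thread pool from the standard library's concurrent.futures module, so that the time spent waiting on the network overlaps instead of adding up:

import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# A hypothetical list of pages we might want to crawl
urls = ["http://www.example.com"] * 5

def fetch(url):
    # Each call spends most of its time blocked waiting on the network
    return urllib.request.urlopen(url).read()

# Sequential: each wait happens one after another
t0 = time.time()
for url in urls:
    fetch(url)
print("Sequential: {} Seconds".format(time.time() - t0))

# Concurrent: the waits overlap, so the total is closer to the slowest single request
t0 = time.time()
with ThreadPoolExecutor(max_workers=5) as executor:
    list(executor.map(fetch, urls))
print("Concurrent: {} Seconds".format(time.time() - t0))

On an I/O-bound workload like this, the concurrent version typically finishes in a fraction of the sequential time, even though only one thread is ever executing Python code at any given moment.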
Understanding parallelism

Parallelism is the art of executing two or more actions simultaneously, as opposed to concurrency, in which you make progress on two or more things at the same time. This is an important distinction, and in order to achieve true parallelism, we'll need multiple processors on which to run our code at the same time. A good analogy for parallel processing is a queue for coffee. Imagine you had two queues of 20 people, all waiting to use a single coffee machine so that they can get through the rest of the day; this would be an example of concurrency. Now say you were to introduce a second coffee machine into the mix; this would then be an example of something happening in parallel.

This is exactly how parallel processing works: each of the coffee machines in that room would represent one processing core, and they are able to make progress on tasks simultaneously. A real life example which highlights the true power of parallel processing is your computer's graphics card. These graphics cards tend to have hundreds, if not thousands, of individual processing cores that live independently and can compute things at the same time. The reason we are able to run high-end PC games at such smooth frame rates is due to the fact we've been able to put so many parallel cores onto these cards.

CPU bound bottleneck

A CPU bound bottleneck is typically the inverse of an I/O bound bottleneck. This bottleneck is typically found in applications that do a lot of heavy number crunching or any other task that is computationally expensive. These are programs whose rate of execution is bound by the speed of the CPU; if you throw a faster CPU in your machine, you should see a direct increase in the speed of these programs. If the time your program spends processing data far outweighs the time it spends requesting that data, then you have a CPU bound bottleneck.

How do they work on a CPU?

Understanding the differences outlined in the previous section between both concurrency and parallelism is essential, but it's also very important to understand more about the systems that your software will be running on. Having an appreciation of the different architecture styles as well as the low level mechanics helps you make the most informed decisions in your software design.

Single core CPUs

Single core processors will only ever execute one thread at any given time, as that is all they are capable of. However, in order to ensure that we don't see our applications hanging and being unresponsive, these processors rapidly switch between multiple threads of execution many thousands of times per second. This switching between threads is what is called a "context switch" and involves storing all the necessary information for a thread at a specific point of time and then restoring it at a different point further down the line. Using this mechanism of constantly saving and restoring threads allows us to make progress on quite a number of threads within a given second, and it appears like the computer is doing multiple things at once. It is in fact doing only one thing at any given time, but doing it at such speed that it's imperceptible to users of that machine. When writing multi-threaded applications in Python, it is important to note that these context switches are computationally quite expensive. There is no way to get around this, unfortunately, and much of the design of operating systems these days is about optimizing for these context switches so that we don't feel the pain quite as much.

Advantages of single core CPUs:
They do not require any complex communication protocols between multiple cores
Single core CPUs require less power, which typically makes them better suited for IoT devices

Disadvantages:
They are limited in speed, and larger applications will cause them to struggle and potentially freeze
Heat dissipation issues place a hard limit on how fast a single core CPU can go

Clock rate

One of the key limitations to a single-core application running on a machine is the clock speed of the CPU. When we talk about clock rate, we are essentially talking about how many clock cycles a CPU can execute every second.
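To make the CPU-bound case concrete before we look at how clock speeds have evolved, here is a small sketch that is not part of the original article. It times a deliberately naive, computation-heavy task; almost all of its running time is spent executing instructions rather than waiting on input or output, so only a faster CPU (or more cores put to work) will speed it up:

import time

def count_primes(limit):
    # Naive, deliberately CPU-hungry prime counting
    count = 0
    for n in range(2, limit):
        if all(n % d != 0 for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

t0 = time.time()
print("Primes below 50000: {}".format(count_primes(50000)))
print("Total Time: {} Seconds".format(time.time() - t0))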
For the past 10 years we have watched as manufacturers have been able to keep pace with Moore's law, which was essentially an observation that the number of transistors one was able to place on a piece of silicon doubled roughly every 2 years. This doubling of transistors every 2 years paved the way for exponential gains in single-CPU clock rates, and CPUs went from the low MHz to the 4-5 GHz clock speeds we are seeing on Intel's i7 6700k processor. But with transistors getting as small as a few nanometers across, this is inevitably coming to an end. We've started to hit the boundaries of physics and, unfortunately, if we go any smaller we'll start to be hit by the effects of quantum tunneling. Due to these physical limitations, we need to start looking at other methods in order to improve the speeds at which we are able to compute things. This is where Martelli's model of scalability comes into play.

Martelli model of scalability

The author of Python Cookbook, Alex Martelli, came up with a model of scalability which Raymond Hettinger discussed in his brilliant hour-long talk "Thinking about Concurrency", which he gave at PyCon Russia 2016. This model represents three different types of problems and programs:

1 core: single threaded and single process programs
2-8 cores: multithreaded and multiprocess programs
9+ cores: distributed computing

The first category, the single core, single threaded category, is able to handle a growing number of problems due to the constant improvements in the speed of single core CPUs, and as a result the second category is being rendered more and more obsolete. We will eventually hit a limit with the speed at which a 2-8 core system can run, and as a result we'll have to start looking at other methods, such as multiple CPU systems or even distributed computing. If your problem is worth solving quickly and it requires a lot of power, then the sensible approach is to go with the distributed computing category and spin up multiple machines and multiple instances of your program in order to tackle your problems in a truly parallel manner. Large enterprise systems that handle hundreds of millions of requests are the main inhabitants of this category. You'll typically find that these enterprise systems are deployed on tens, if not hundreds, of high performance, incredibly powerful servers in various locations across the world.

Time-Sharing - the task scheduler

One of the most important parts of the operating system is the task scheduler. This acts as the maestro of the orchestra and directs everything with impeccable precision and incredible timing and discipline. This maestro has only one real goal, and that is to ensure that every task has a chance to run through till completion; the when and where of a task's execution, however, is non-deterministic. That is to say, if we gave a task scheduler two identical competing processes one after the other, there is no guarantee that the first process will complete first. This non-deterministic nature is what makes concurrent programming so challenging.
An excellent example that highlights this non-deterministic behavior is the following code:

import threading
import time
import random

counter = 1

def workerA():
    global counter
    while counter < 1000:
        counter += 1
        print("Worker A is incrementing counter to {}".format(counter))
        sleepTime = random.randint(0,1)
        time.sleep(sleepTime)

def workerB():
    global counter
    while counter > -1000:
        counter -= 1
        print("Worker B is decrementing counter to {}".format(counter))
        sleepTime = random.randint(0,1)
        time.sleep(sleepTime)

def main():
    t0 = time.time()
    thread1 = threading.Thread(target=workerA)
    thread2 = threading.Thread(target=workerB)
    thread1.start()
    thread2.start()
    thread1.join()
    thread2.join()
    t1 = time.time()
    print("Execution Time {}".format(t1-t0))

if __name__ == '__main__':
    main()

Here we have two competing threads in Python that are each trying to accomplish their own goal of either incrementing the counter to 1,000 or conversely decrementing it to -1,000. In a single core processor, there is the possibility that worker A manages to complete its task before worker B has a chance to execute, and the same can be said for worker B. However, there is a third potential possibility, and that is that the task scheduler continues to switch between worker A and worker B an infinite number of times and neither ever completes. The above code incidentally also shows one of the dangers of multiple threads accessing shared resources without any form of synchronization. There is no accurate way to determine what will happen to our counter, and as such our program could be considered unreliable.

Multi-core processors

We've now got some idea as to how single-core processors work, but now it's time to take a look at multicore processors. Multicore processors contain multiple independent processing units or "cores". Each core contains everything it needs in order to execute a sequence of stored instructions. These cores each follow their own cycle:

Fetch - This step involves fetching instructions from program memory. This is dictated by a program counter (PC), which identifies the location of the next step to execute.
Decode - The core converts the instruction that it has just fetched into a series of signals that will trigger various other parts of the CPU.
Execute - Finally, we perform the execute step. This is where we run the instruction that we have just fetched and decoded, and typically the results of this execution are then stored in a CPU register.

Having multiple cores offers us the advantage of being able to work independently on multiple Fetch -> Decode -> Execute cycles. This style of architecture essentially enables us to create higher performance programs that leverage this parallel execution.

Advantages of multicore processors:
We are no longer bound by the same performance limitations that a single core processor is bound by
Applications that are able to take advantage of multiple cores will tend to run faster if well designed

Disadvantages of multicore processors:
They require more power than your typical single core processor
Cross-core communication is no simple feat; there are multiple different ways of handling it

Summary

In this article we covered a multitude of topics, including the differences between concurrency and parallelism. We also looked at how they both leverage the CPU in different ways.
Resources for Article: Further resources on this subject: Python Data Science Up and Running [article] Putting the Fun in Functional Python [article] Basics of Python for Absolute Beginners [article]

Queues and topics

Packt
10 Jul 2017
8 min read
In this article by Luca Stancapiano, the author of the book Mastering Java EE Development with WildFly, we will see how to implement Java Message Service (JMS) in a queue channel using the WildFly console.

(For more resources related to this topic, see here.)

JMS works with channels of messages that manage the messages asynchronously. These channels contain messages that they will collect or remove according to the configuration and the type of channel. These channels are of two types: queues and topics. These channels are highly configurable through the WildFly console. As with all components in WildFly, they can be installed through the console, the command line or directly with the Maven plugins of the project. In the next two sections we will show what they mean and all the possible configurations.

Queues

Queues collect the sent messages that are waiting to be read. The messages are delivered in the order they are sent and, once read, are removed from the queue.

Create the queue from the web console

Let's see the steps to create a new queue through the web console:

Connect to http://localhost:9990/.
Go in Configuration | Subsystems/Messaging - ActiveMQ/default and click on Queues/Topics.
Now select the Queues menu and click on the Add button. You will see a form with the following parameters to insert:

Name: The name of the queue.
JNDI Names: The JNDI names the queue will be bound to.
Durable?: Whether the queue is durable or not.
Selector: The queue selector.

As for all enterprise components, JMS components are callable through the Java Naming and Directory Interface (JNDI). Durable queues keep messages around persistently for any suitable consumer to consume them. Durable queues do not need to concern themselves with which consumer is going to consume the messages at some point in the future. There is just one copy of a message that any consumer in the future can consume.

Message selectors allow you to filter the messages that a message consumer will receive. The filter is a relatively complex language similar to the syntax of an SQL WHERE clause. The selector can use all the message headers and properties for filtering operations, but cannot use the message content. Selectors are mostly useful for channels that broadcast a very large number of messages to their subscribers. On queues, only messages that match the selector will be returned. Others stay in the queue (and thus can be read by a MessageConsumer with a different selector). The following SQL elements are allowed in our filters and we can put them in the Selector field of the form:

AND, OR, NOT: Logical operators. Example: (releaseYear < 1986) AND NOT (title = 'Bad')
String literals: String literals in single quotes, duplicate to escape. Example: title = 'Tom''s'
Number literals: Numbers in Java syntax; they can be double or integer. Example: releaseYear = 1982
Properties: Message properties that follow Java identifier naming. Example: releaseYear = 1983
Boolean literals: TRUE and FALSE. Example: isAvailable = FALSE
( ): Round brackets. Example: (releaseYear < 1981) OR (releaseYear > 1990)
BETWEEN: Checks whether a number is in a range (both numbers inclusive). Example: releaseYear BETWEEN 1980 AND 1989
Header fields: Any headers except JMSDestination, JMSExpiration and JMSReplyTo. Example: JMSPriority = 10
=, <>, <, <=, >, >=: Comparison operators. Example: (releaseYear < 1986) AND (title <> 'Bad')
LIKE: String comparison with wildcards '_' and '%'. Example: title LIKE 'Mirror%'
IN: Finds a value in a set of strings. Example: title IN ('Piece of mind', 'Somewhere in time', 'Powerslave')
IS NULL, IS NOT NULL: Checks whether a value is null or not null. Example: releaseYear IS NULL
*, +, -, /: Arithmetic operators. Example: releaseYear * 2 > 2000 - 18

Now fill the form. In this article we will implement a messaging service to send the coordinates of buses. The queue is created and shown in the queues list.

Create the queue using CLI and Maven WildFly plugin

The same thing can be done with the Command Line Interface (CLI). Start a WildFly instance, go in the bin directory of WildFly and execute the following script:

bash-3.2$ ./jboss-cli.sh
You are disconnected at the moment. Type 'connect' to connect to the server or 'help' for the list of supported commands.
[disconnected /] connect
[standalone@localhost:9990 /] /subsystem=messaging-activemq/server=default/jms-queue=gps_coordinates:add(entries=["java:/jms/queue/GPS"])
{"outcome" => "success"}

The same thing can be done through Maven. Simply add this snippet in your pom.xml:

<plugin>
  <groupId>org.wildfly.plugins</groupId>
  <artifactId>wildfly-maven-plugin</artifactId>
  <version>1.0.2.Final</version>
  <executions>
    <execution>
      <id>add-resources</id>
      <phase>install</phase>
      <goals>
        <goal>add-resource</goal>
      </goals>
      <configuration>
        <resources>
          <resource>
            <address>subsystem=messaging-activemq,server=default,jms-queue=gps_coordinates</address>
            <properties>
              <durable>true</durable>
              <entries>!!["gps_coordinates", "java:/jms/queue/GPS"]</entries>
            </properties>
          </resource>
        </resources>
      </configuration>
    </execution>
    <execution>
      <id>del-resources</id>
      <phase>clean</phase>
      <goals>
        <goal>undeploy</goal>
      </goals>
      <configuration>
        <afterDeployment>
          <commands>
            <command>/subsystem=messaging-activemq/server=default/jms-queue=gps_coordinates:remove</command>
          </commands>
        </afterDeployment>
      </configuration>
    </execution>
  </executions>
</plugin>

The Maven WildFly plugin lets you do admin operations in WildFly using the same custom protocol used by the command line. Two executions are configured:

add-resources: It hooks the install Maven phase and adds the queue, passing the name, JNDI and durable parameters seen in the previous section.
del-resources: It hooks the clean Maven phase and removes the chosen queue by name.
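However the queue is created, consuming from it with one of the selectors from the table above is straightforward. The following sketch is not part of the original article: it assumes a JMS 2.0 container that can inject a JMSContext, and a hypothetical String property named vehicleType that the producer would set on each message.

import javax.annotation.Resource;
import javax.ejb.Stateless;
import javax.inject.Inject;
import javax.jms.JMSContext;
import javax.jms.Queue;

@Stateless
public class MessageQueueReceiver {

    @Inject
    private JMSContext context;

    @Resource(mappedName = "java:/jms/queue/GPS")
    private Queue queue;

    public String receiveBusCoordinate() {
        // The selector is hypothetical: it assumes the producer sets a String
        // property named vehicleType on every message it sends to the queue.
        // Messages that do not match stay in the queue for other consumers.
        return context.createConsumer(queue, "vehicleType = 'bus'")
                      .receiveBody(String.class, 5000);
    }
}

The second argument of receiveBody is a timeout in milliseconds; in a real application you would typically keep the consumer open rather than creating one per call.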
Create the queue through an Arquillian test case

Or we can add and remove the queue through an Arquillian test case:

@RunWith(Arquillian.class)
@ServerSetup(MessagingResourcesSetupTask.class)
public class MessageTestCase {
    ...
    private static final String QUEUE_NAME = "gps_coordinates";
    private static final String QUEUE_LOOKUP = "java:/jms/queue/GPS";

    static class MessagingResourcesSetupTask implements ServerSetupTask {

        @Override
        public void setup(ManagementClient managementClient, String containerId) throws Exception {
            getInstance(managementClient.getControllerClient()).createJmsQueue(QUEUE_NAME, QUEUE_LOOKUP);
        }

        @Override
        public void tearDown(ManagementClient managementClient, String containerId) throws Exception {
            getInstance(managementClient.getControllerClient()).removeJmsQueue(QUEUE_NAME);
        }
    }
    ...
}

The Arquillian org.jboss.as.arquillian.api.ServerSetup annotation lets you use an external setup manager to install or remove new components inside WildFly. In this case we are installing the queue declared with the two variables QUEUE_NAME and QUEUE_LOOKUP. When the test ends, the tearDown method will be invoked automatically and it will remove the installed queue. To use Arquillian it's important to add the WildFly testsuite dependency in your pom.xml project:

...
<dependencies>
  <dependency>
    <groupId>org.wildfly</groupId>
    <artifactId>wildfly-testsuite-shared</artifactId>
    <version>10.1.0.Final</version>
    <scope>test</scope>
  </dependency>
</dependencies>
...

Going in the standalone-full.xml we will find the created queue as:

<subsystem>
  <server name="default">
    ...
    <jms-queue name="gps_coordinates" entries="java:/jms/queue/GPS"/>
    ...
  </server>
</subsystem>

JMS is available in the standalone-full configuration. By default WildFly supports 4 standalone configurations. They can be found in the standalone/configuration directory:

standalone.xml: It supports all components except the messaging and corba/iiop
standalone-full.xml: It supports all components
standalone-ha.xml: It supports all components except the messaging and corba/iiop, with the cluster enabled
standalone-full-ha.xml: It supports all components, with the cluster enabled

To start WildFly with the chosen configuration, simply pass -c with the configuration to the standalone.sh script. Here is a sample to start the standalone full configuration:

./standalone.sh -c standalone-full.xml

Create the java client for the queue

Let's see now how to create a client to send a message to the queue. JMS 2.0 simplifies the creation of clients considerably. Here is a sample of a client inside a stateless Enterprise Java Bean (EJB):

@Stateless
public class MessageQueueSender {

    @Inject
    private JMSContext context;

    @Resource(mappedName = "java:/jms/queue/GPS")
    private Queue queue;

    public void sendMessage(String message) {
        context.createProducer().send(queue, message);
    }
}

The javax.jms.JMSContext is injectable from any EE component. We will see the JMS context in detail in the next section, The JMS Context. The queue is represented in JMS by the javax.jms.Queue interface. It can be injected as a JNDI resource through the @Resource annotation. The JMS context, through the createProducer method, creates a producer represented by the javax.jms.JMSProducer interface, used to send the messages. We can now create a client injecting the stateless bean and sending the string message hello!:

...
@EJB
private MessageQueueSender messageQueueSender;
...
messageQueueSender.sendMessage("hello!");

Summary

In this article we have seen how to implement Java Message Service in a queue channel using the web console, the Command Line Interface, Maven WildFly plugins and Arquillian test cases, and how to create Java clients for the queue.
Resources for Article: Further resources on this subject: WildFly – the Basics [article] WebSockets in Wildfly [article] Creating Java EE Applications [article]

Ruby Strings

Packt
06 Jul 2017
9 min read
In this article by Jordan Hudgens, the author of the book Comprehensive Ruby Programming, you'll learn about the Ruby String data type and walk through how to integrate string data into a Ruby program. Working with words, sentences, and paragraphs is a common requirement in many applications. Additionally, you will learn how to:

Employ string manipulation techniques using core Ruby methods
Demonstrate how to work with the string data type in Ruby

(For more resources related to this topic, see here.)

Using strings in Ruby

A string is a data type in Ruby and contains a set of characters, typically normal English text (or whatever natural language you're building your program for), that you would write. A key point for the syntax of strings is that they have to be enclosed in single or double quotes if you want to use them in a program. The program will throw an error if they are not wrapped inside quotation marks. Let's walk through three scenarios.

Missing quotation marks

If you try to simply declare a string without wrapping it in quotation marks, this results in an error. The error occurs because Ruby thinks that the values are classes and methods.

Printing strings

When we print out a string that we have properly wrapped in quotation marks, everything works as expected. Please note that both single and double quotation marks work properly. It's also important that you do not mix the quotation mark types. For example, if you attempted to run the code:

puts "Name an animal'

you would get an error, because you need to ensure that every quotation mark is matched with a closing (and matching) quotation mark. If you start a string with double quotation marks, the Ruby parser requires that you end the string with the matching double quotation marks.

Storing strings in variables

Lastly, we can store a string inside of a variable and then print the value out to the console. We'll talk more about strings and string interpolation in subsequent sections.

String interpolation guide for Ruby

In this section, we are going to talk about string interpolation in Ruby.

What is string interpolation?

So what exactly is string interpolation? Good question. String interpolation is the process of being able to seamlessly integrate dynamic values into a string. Let's assume we want to slip dynamic words into a string. We can get input from the console and store that input into variables. From there we can call the variables inside of a pre-existing string. For example, let's give a sentence the ability to change based on a user's input:

puts "Name an animal"
animal = gets.chomp
puts "Name a noun"
noun = gets.chomp
p "The quick brown #{animal} jumped over the lazy #{noun}"

Notice the way I insert variables inside the string? They are enclosed in curly brackets and are preceded by a # sign. If I run this code, the output is the sentence with whatever values I typed in inserted in place of the two variables. So, this is how you insert values dynamically in your sentences. If you use sites like Twitter, you'll notice they sometimes display personalized messages such as: Good morning Jordan or Good evening Tiffany. This type of behavior is made possible by inserting a dynamic value in a fixed part of a string, and it leverages string interpolation.

Now, let's use single quotes instead of double quotes to see what happens. The string is printed as it is, without inserting the values for animal and noun. This is exactly what happens when you try using single quotes: Ruby prints the entire string as it is, without any interpolation.
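The screenshots from the original article are not reproduced here, so as a quick sketch (not part of the original text) this is roughly what the two behaviours look like side by side:

name = "Jordan"

puts "Good morning #{name}"   # double quotes interpolate  => Good morning Jordan
puts 'Good morning #{name}'   # single quotes do not       => Good morning #{name}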
Therefore it's important to remember the difference. Another interesting aspect is that anything inside the curly brackets can be a Ruby script. So, technically, you can type your entire algorithm inside these curly brackets, and Ruby will run it perfectly for you. However, it is not recommended for practical programming purposes. For example, I can insert a math equation inside the brackets, and the computed value is printed out.

String manipulation guide

In this section we are going to learn about string manipulation, along with a number of examples of how to integrate string manipulation methods in a Ruby program.

What is string manipulation?

So what exactly is string manipulation? It's the process of altering the format or value of a string, usually by leveraging string methods.

String manipulation code examples

Let's start with an example. Let's say I want my application to always display the word Astros in capital letters. To do that, I simply write:

"Astros".upcase

Now if I always want a string to be in lowercase letters, I can use the downcase method, like so:

"Astros".downcase

Those are both methods I use quite often. However, there are other string methods available that we also have at our disposal. For the rare times when you want to literally swap the case of the letters, you can leverage the swapcase method:

"Astros".swapcase

And lastly, if you want to reverse the order of the letters in the string, we can call the reverse method:

"Astros".reverse

These methods are built into the String data class and we can call them on any string values in Ruby.

Method chaining

Another neat thing we can do is join different methods together to get custom output. For example, I can run:

"Astros".reverse.upcase

The preceding code displays the value SORTSA. This practice of combining different methods with a dot is called method chaining.

Split, strip, and join guides for strings

In this section, we are going to walk through how to use the split and strip methods in Ruby. These methods will help us clean up strings and convert a string to an array so we can access each word as its own value.

Using the strip method

Let's start off by analyzing the strip method. Imagine that the input you get from the user or from the database is poorly formatted and contains white space before and after the value. To clean the data up we can use the strip method. For example:

str = " The quick brown fox jumped over the quick dog "
p str.strip

When you run this code, the output is just the sentence without the white space before and after the words.

Using the split method

Now let's walk through the split method. The split method is a powerful tool that allows you to split a sentence into an array of words or characters. For example, when you type the following code:

str = "The quick brown fox jumped over the quick dog"
p str.split

you'll see that it converts the sentence into an array of words. This method can be particularly useful for long paragraphs, especially when you want to know the number of words in the paragraph. Since the split method converts the string into an array, you can use all the array methods, like size, to see how many words were in the string. We can leverage method chaining to find out how many words are in the string, like so:

str = "The quick brown fox jumped over the quick dog"
p str.split.size

This should return a value of 9, which is the number of words in the sentence.
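Before moving on to counting individual characters, it is worth noting that split also accepts an explicit delimiter. This small sketch is not from the original article; it shows how the same method might break up comma-separated data:

csv_line = "Astros,Cubs,Dodgers"
teams = csv_line.split(",")
p teams        # => ["Astros", "Cubs", "Dodgers"]
p teams.size   # => 3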
To know the number of letters, we can pass an optional argument to the split method and use the format:

str = "The quick brown fox jumped over the quick dog"
p str.split(//).size

And if you want to see all of the individual letters, we can remove the size method call, like this:

p str.split(//)

Your output will be an array containing every individual character in the string. Notice that it also includes spaces as individual characters, which may or may not be what you want a program to return. This method can be quite handy while developing real-world applications. A good practical example of this method is Twitter. Since this social media site restricts users to 140 characters, this method is sure to be a part of the validation code that counts the number of characters in a Tweet.

Using the join method

We've walked through the split method, which allows you to convert a string into a collection of characters. Thankfully, Ruby also has a method that does the opposite, which is to allow you to convert an array of characters into a single string, and that method is called join. Let's imagine a situation where we're asked to reverse the words in a string. This is a common Ruby coding interview question, so it's an important concept to understand, since it tests your knowledge of how strings work in Ruby. Let's imagine that we have a string, such as:

str = "backwards am I"

And we're asked to reverse the words in the string. The pseudocode for the algorithm would be:

Split the string into words
Reverse the order of the words
Merge all of the split words back into a single string

We can actually accomplish each of these requirements in a single line of Ruby code. The following code snippet will perform the task:

str.split.reverse.join(' ')

This code will convert the single string into an array of strings; for the example it will equal ["backwards", "am", "I"]. From there it will reverse the order of the array elements, so the array will equal ["I", "am", "backwards"]. With the words reversed, now we simply need to merge the words into a single string, which is where the join method comes in. Running the join method will convert all of the words in the array into one string.

Summary

In this article, we were introduced to the string data type and how it can be utilized in Ruby. We analyzed how to pass strings into Ruby processes by leveraging string interpolation. We also learned the methods of basic string manipulation and how to find and replace string data. We analyzed how to break strings into smaller components, along with how to clean up string-based data. We even introduced the Array class in this article.

Resources for Article: Further resources on this subject: Ruby and Metasploit Modules [article] Find closest mashup plugin with Ruby on Rails [article] Building tiny Web-applications in Ruby using Sinatra [article]

Command-Line Tools

Packt
06 Jul 2017
9 min read
In this article by Aaron Torres, author of the book Go Cookbook, we will cover the following recipes:

Using command-line arguments
Working with Unix pipes
An ANSI coloring application

(For more resources related to this topic, see here.)

Using command-line arguments

This article will expand on other uses for these arguments by constructing a command that supports nested subcommands. This will demonstrate flag sets and also the use of positional arguments passed into your application. This recipe requires a main function to run. There are a number of third-party packages for dealing with complex nested arguments and flags, but we'll again investigate doing so using only the standard library.

Getting ready

You need to perform the following steps for the installation:

Download and install Go on your operating system at https://golang.org/doc/install and configure your GOPATH.
Open a terminal/console application.
Navigate to your GOPATH/src and create a project directory, for example, $GOPATH/src/github.com/yourusername/customrepo. All code will be run and modified from this directory.
Optionally, install the latest tested version of the code using the go get github.com/agtorre/go-cookbook/ command.

How to do it...

From your terminal/console application, create and navigate to the chapter2/cmdargs directory.
Copy tests from https://github.com/agtorre/go-cookbook/tree/master/chapter2/cmdargs or use this as an exercise to write some of your own.
Create a file called cmdargs.go with the following content:

package main

import (
    "flag"
    "fmt"
    "os"
)

const version = "1.0.0"

const usage = `Usage: %s [command]

Commands:
  Greet
  Version
`

const greetUsage = `Usage: %s greet name [flag]

Positional Arguments:
  name
        the name to greet

Flags:
`

// MenuConf holds all the levels
// for a nested cmd line argument
type MenuConf struct {
    Goodbye bool
}

// SetupMenu initializes the base flags
func (m *MenuConf) SetupMenu() *flag.FlagSet {
    menu := flag.NewFlagSet("menu", flag.ExitOnError)
    menu.Usage = func() {
        fmt.Printf(usage, os.Args[0])
        menu.PrintDefaults()
    }
    return menu
}

// GetSubMenu returns a flag set for a submenu
func (m *MenuConf) GetSubMenu() *flag.FlagSet {
    submenu := flag.NewFlagSet("submenu", flag.ExitOnError)
    submenu.BoolVar(&m.Goodbye, "goodbye", false, "Say goodbye instead of hello")
    submenu.Usage = func() {
        fmt.Printf(greetUsage, os.Args[0])
        submenu.PrintDefaults()
    }
    return submenu
}

// Greet will be invoked by the greet command
func (m *MenuConf) Greet(name string) {
    if m.Goodbye {
        fmt.Println("Goodbye " + name + "!")
    } else {
        fmt.Println("Hello " + name + "!")
    }
}

// Version prints the current version that is
// stored as a const
func (m *MenuConf) Version() {
    fmt.Println("Version: " + version)
}

Create a file called main.go with the following content:

package main

import (
    "fmt"
    "os"
    "strings"
)

func main() {
    c := MenuConf{}
    menu := c.SetupMenu()
    menu.Parse(os.Args[1:])

    // we use arguments to switch between commands
    // flags are also an argument
    if len(os.Args) > 1 {
        // we don't care about case
        switch strings.ToLower(os.Args[1]) {
        case "version":
            c.Version()
        case "greet":
            f := c.GetSubMenu()
            if len(os.Args) < 3 {
                f.Usage()
                return
            }
            if len(os.Args) > 3 {
                f.Parse(os.Args[3:])
            }
            c.Greet(os.Args[2])
        default:
            fmt.Println("Invalid command")
            menu.Usage()
            return
        }
    } else {
        menu.Usage()
        return
    }
}

Run the go build command.
Run the following commands and try a few other combinations of arguments:

$ ./cmdargs -h
Usage: ./cmdargs [command]

Commands:
  Greet
  Version

$ ./cmdargs version
Version: 1.0.0

$ ./cmdargs greet
Usage: ./cmdargs greet name [flag]

Positional Arguments:
  name
        the name to greet

Flags:
  -goodbye
        Say goodbye instead of hello

$ ./cmdargs greet reader
Hello reader!

$ ./cmdargs greet reader -goodbye
Goodbye reader!

If you copied or wrote your own tests, go up one directory and run go test, and ensure all tests pass.

How it works...

Flag sets can be used to set up independent lists of expected arguments, usage strings, and more. The developer is required to do validation on a number of arguments, parse the right subset of arguments into commands, and define usage strings. This can be error prone and requires a lot of iteration to get it completely correct. The flag package makes parsing arguments much easier and includes convenience methods to get the number of flags, arguments, and more. This recipe demonstrates basic ways to construct a complex command-line application using arguments, including a package-level config, required positional arguments, multi-leveled command usage, and how to split these things into multiple files or packages if needed.

Working with Unix pipes

Unix pipes are useful when passing the output of one program to the input of another. Consider the following example:

$ echo "test case" | wc -l
1

In a Go application, the left-hand side of the pipe can be read in using os.Stdin, which acts like a file descriptor. To demonstrate this, this recipe will take an input on the left-hand side of a pipe and return a list of words and their number of occurrences. These words will be tokenized on white space.

Getting ready

Refer to the Getting ready section of the Using command-line arguments recipe.

How to do it...

From your terminal/console application, create a new directory, chapter2/pipes.
Navigate to that directory and copy tests from https://github.com/agtorre/go-cookbook/tree/master/chapter2/pipes or use this as an exercise to write some of your own.
Create a file called pipes.go with the following content:

package main

import (
    "bufio"
    "fmt"
    "os"
)

// WordCount takes a file and returns a map
// with each word as a key and its number of
// appearances as a value
func WordCount(f *os.File) map[string]int {
    result := make(map[string]int)

    // make a scanner to work on the file
    // io.Reader interface
    scanner := bufio.NewScanner(f)
    scanner.Split(bufio.ScanWords)

    for scanner.Scan() {
        result[scanner.Text()]++
    }

    if err := scanner.Err(); err != nil {
        fmt.Fprintln(os.Stderr, "reading input:", err)
    }

    return result
}

func main() {
    fmt.Printf("string: number_of_occurrences\n\n")
    for key, value := range WordCount(os.Stdin) {
        fmt.Printf("%s: %d\n", key, value)
    }
}

Run echo "some string" | go run pipes.go. You may also run:

go build
echo "some string" | ./pipes

You should see the following output:

$ echo "test case" | go run pipes.go
string: number_of_occurrences

test: 1
case: 1

$ echo "test case test" | go run pipes.go
string: number_of_occurrences

test: 2
case: 1

If you copied or wrote your own tests, go up one directory and run go test, and ensure that all tests pass.

How it works...

Working with pipes in Go is pretty simple, especially if you're familiar with working with files. This recipe uses a scanner to tokenize the io.Reader interface of the os.Stdin file object. You can see how you must check for errors after completing all of the reads.
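The recipe suggests copying the published tests or writing your own. One approach, which is an assumption on my part rather than part of the recipe, is to note that WordCount only needs the io.Reader behaviour of *os.File, so a small variant that accepts an io.Reader is trivial to unit test with strings.NewReader. The sketch below would live in a _test.go file in the same directory:

package main

import (
    "bufio"
    "io"
    "strings"
    "testing"
)

// wordCountReader is a hypothetical variant of WordCount that accepts any
// io.Reader instead of *os.File, which makes it easy to test without a file.
func wordCountReader(r io.Reader) map[string]int {
    result := make(map[string]int)
    scanner := bufio.NewScanner(r)
    scanner.Split(bufio.ScanWords)
    for scanner.Scan() {
        result[scanner.Text()]++
    }
    return result
}

func TestWordCountReader(t *testing.T) {
    got := wordCountReader(strings.NewReader("test case test"))
    if got["test"] != 2 || got["case"] != 1 {
        t.Errorf("unexpected counts: %v", got)
    }
}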
An ANSI coloring application

Coloring an ANSI terminal application is handled by a variety of codes before and after a section of text that you want colored. This recipe will explore a basic coloring mechanism to color the text red or keep it plain. For a more complete application, take a look at https://github.com/agtorre/gocolorize, which supports many more colors and text types, and implements the fmt.Formatter interface for ease of printing.

Getting ready

Refer to the Getting ready section of the Using command-line arguments recipe.

How to do it...

From your terminal/console application, create and navigate to the chapter2/ansicolor directory.
Copy tests from https://github.com/agtorre/go-cookbook/tree/master/chapter2/ansicolor or use this as an exercise to write some of your own.
Create a file called color.go with the following content:

package ansicolor

import "fmt"

// Color of text
type Color int

const (
    // ColorNone is default
    ColorNone = iota
    // Red colored text
    Red
    // Green colored text
    Green
    // Yellow colored text
    Yellow
    // Blue colored text
    Blue
    // Magenta colored text
    Magenta
    // Cyan colored text
    Cyan
    // White colored text
    White
    // Black colored text
    Black Color = -1
)

// ColorText holds a string and its color
type ColorText struct {
    TextColor Color
    Text      string
}

func (r *ColorText) String() string {
    if r.TextColor == ColorNone {
        return r.Text
    }
    value := 30
    if r.TextColor != Black {
        value += int(r.TextColor)
    }
    return fmt.Sprintf("\033[0;%dm%s\033[0m", value, r.Text)
}

Create a new directory named example.
Navigate to example and then create a file named main.go with the following content. Ensure that you modify the ansicolor import to use the path you set up in step 1:

package main

import (
    "fmt"

    "github.com/agtorre/go-cookbook/chapter2/ansicolor"
)

func main() {
    r := ansicolor.ColorText{ansicolor.Red, "I'm red!"}
    fmt.Println(r.String())

    r.TextColor = ansicolor.Green
    r.Text = "Now I'm green!"
    fmt.Println(r.String())

    r.TextColor = ansicolor.ColorNone
    r.Text = "Back to normal..."
    fmt.Println(r.String())
}

Run go run main.go. Alternatively, you may also run the following:

go build
./example

You should see the following, with the text colored if your terminal supports the ANSI coloring format:

$ go run main.go
I'm red!
Now I'm green!
Back to normal...

If you copied or wrote your own tests, go up one directory and run go test, and ensure that all the tests pass.

How it works...

This application makes use of a struct to maintain the state of the colored text. In this case, it stores the color of the text and the value of the text. The final string is rendered when you call the String() method, which will either return colored text or plain text, depending on the values stored in the struct. By default, the text will be plain.

Summary

In this article, we demonstrated basic ways to construct a complex command-line application using arguments, including a package-level config, required positional arguments, multi-leveled command usage, and how to split these things into multiple files or packages if needed. We saw how to work with Unix pipes and explored a basic coloring mechanism to color text red or keep it plain.

Resources for Article: Further resources on this subject: Building a Command-line Tool [article] A Command-line Companion Called Artisan [article] Scaffolding with the command-line tool [article]

Exposure to RxJava

Packt
06 Jul 2017
10 min read
In this article by Thomas Nield, the author of the book Learning RxJava, we will cover a quick exposure to RxJava, which is a Java VM implementation of ReactiveX (Reactive Extensions): a library for composing asynchronous and event-based programs by using observable sequences. (For more resources related to this topic, see here.) It is assumed you are fairly comfortable with Java and know how to use classes, interfaces, methods, properties, variables, static/nonstatic scopes, and collections. If you have not done concurrency or multithreading, that is okay. RxJava makes these advanced topics much more accessible. Have your favorite Java development environment ready, whether it is Intellij IDEA, Eclipse, NetBeans, or any other environment of your choosing. Recommended that you have a build automation system as well such as Gradle or Maven, which we will walk through shortly. History of ReactiveX and RxJava As developers, we tend to train ourselves to think in counter-intuitive ways. Modeling our world with code has never been short of challenges. It was not long ago that object-oriented programming was seen as the silver bullet to solve this problem. Making blueprints of what we interact with in real life was a revolutionary idea, and this core concept of classes and objects still impacts how we code today. However, business and user demands continued to grow in complexity. As 2010 approached, it became clear that object-oriented programming only solved part of the problem. Classes and objects do a great job representing an entity with properties and methods, but they become messy when they need to interact with each other in increasingly complex (and often unplanned) ways. Decoupling patterns and paradigms emerged, but this yielded an unwanted side effect of growing amounts of boilerplate code. In response to these problems, functional programming began to make a comeback not to replace object-oriented programming but rather complement it and fill this void. Reactive programming, a functional event-driven programming approach, began to receive special attention.A couple of reactive frameworks emerged ultimately, including Akka and Sodium. But at Microsoft, a computer scientist named Erik Meijer created a reactive programming framework for .NET called Reactive Extensions. In a matter of years, Reactive Extensions (also called ReactiveX or Rx) was ported to several languages and platforms, including JavaScript, Python, C++, Swift, and Java, of course. ReactiveX quickly emerged as a cross-language standard to bring reactive programming into the industry. RxJava, the ReactiveX port for Java, was created in large part by Ben Christensen from Netflix and David Karnok. RxJava 1.0 was released in November 2014, followed by RxJava 2.0 in November 2016. RxJava is the backbone to other ReactiveX JVM ports, such as RxScala, RxKotlin, and RxGroovy. It has become a core technology for Android development and has also found its way into Java backend development. Many RxJava adapter libraries, such as RxAndroid , RxJava-JDBC , RxNetty , and RxJavaFX adapted several Java frameworks to become reactive and work with RxJava out-of-the-box.This all shows that RxJava is more than a library. It is part of a greater ReactiveX ecosystem that represents an entire approach to programming. The fundamental idea of ReactiveX is that events are data and data are events. This is a powerful concept that we will explore, but first, let's step back and look at the world through the reactive lens. 
Thinking reactively Suspend everything you know about Java (and programming in general) for a moment, and let's make some observations about our world. These may sound like obvious statements, but as developers, we can easily overlook them. Bring your attention to the fact that everything is in motion. Traffic, weather, people, conversations, financial transactions, and so on are all moving. Technically, even something stationary as a rock is in motion due to the earth's rotation and orbit. When you consider the possibility that everything can be modeled as in motion, you may find it a bit overwhelming as a developer. Another observation to note is that these different events are happening concurrently. Multiple activities are happening at the same time. Sometimes, they act independently, but other times, they can converge at some point to interact. For instance, a car can drive with no impact on a person jogging. They are two separate streams of events. However, they may converge at some point and the car will stop when it encounters the jogger. If this is how our world works, why do we not model our code this way?. Why do we not model code as multiple concurrent streams of events or data happening at the same time? It is not uncommon for developers to spend more time managing the states of objects and doing it in an imperative and sequential manner. You may structure your code to execute Process 1, Process 2, and then Process 3, which depends on Process 1 and Process 2. Why not kick-off Process 1 and Process 2 simultaneously, and then the completion of these two events immediately kicks-off Process 3? Of course, you can use callbacks and Java concurrency tools, but RxJava makes this much easier and safer to express. Let's make one last observation. A book or music CD is static. A book is an unchanging sequence of words and a CD is a collection of tracks. There is nothing dynamic about them. However, when we read a book, we are reading each word one at a time. Those words are effectively put in motion as a stream being consumed by our eyes. It is no different with a music CD track, where each track is put in motion as sound waves and your ears are consuming each track. Static items can, in fact, be put in motion too. This is an abstract but powerful idea because we made each of these static items a series of events. When we level the playing field between data and events by treating them both the same, we unleash the power of functional programming and unlock abilities you previously might have thought impractical. The fundamental idea behind reactive programming is that events are data and data are events. This may seem abstract, but it really does not take long to grasp when you consider our real-world examples. The runner and car both have properties and states, but they are also in motion. The book and CD are put in motion when they are consumed. Merging the event and data to become one allows the code to feel organic and representative of the world we are modeling. Why should I learn RxJava?  ReactiveX and RxJava paints a broad stroke against many problems programmers face daily, allowing you to express business logic and spend less time engineering code. Have you ever struggled with concurrency, event handling, obsolete data states, and exception recovery? What about making your code more maintainable, reusable, and evolvable so it can keep up with your business? 
It might be presumptuous to call reactive programming a silver bullet for these problems, but it certainly is a progressive leap in addressing them. There is also growing user demand to make applications real time and responsive. Reactive programming allows you to quickly analyze and work with live data sources such as Twitter feeds or stock prices. It can also cancel and redirect work, scale with concurrency, and cope with rapidly emitting data. Composing events and data as streams that can be mixed, merged, filtered, split, and transformed opens up radically effective ways to compose and evolve code. In summary, reactive programming makes many hard tasks easy, enabling you to add value in ways you might have thought impractical earlier. If you have a process written reactively and you discover that you need to run part of it on a different thread, you can implement this change in a matter of seconds. If you find network connectivity issues crashing your application intermittently, you can gracefully use reactive recovery strategies that wait and try again. If you need to inject an operation in the middle of your process, it is as simple as inserting a new operator. Reactive programming is broken up into modular chain links that can be added or removed, which can help overcome all the aforementioned problems quickly. In essence, RxJava allows applications to be tactical and evolvable while maintaining stability in production.

A quick exposure to RxJava

Before we dive deep into the reactive world of RxJava, here is a quick exposure to get your feet wet first. In ReactiveX, the core type you will work with is the Observable. We will be learning more about the Observable later. But essentially, an Observable pushes things. A given Observable<T> pushes things of type T through a series of operators until it arrives at an Observer that consumes the items. For instance, create a new Launcher.java file in your project and put in the following code:

import io.reactivex.Observable;

public class Launcher {
    public static void main(String[] args) {
        Observable<String> myStrings =
            Observable.just("Alpha", "Beta", "Gamma", "Delta", "Epsilon");
    }
}

In our main() method, we have an Observable<String> that will push five string objects. An Observable can push data or events from virtually any source, whether it is a database query or live Twitter feeds. In this case, we are quickly creating an Observable using Observable.just(), which will emit a fixed set of items. However, running this main() method is not going to do anything other than declare Observable<String>. To make this Observable actually push these five strings (which are called emissions), we need an Observer to subscribe to it and receive the items. We can quickly create and connect an Observer by passing a lambda expression that specifies what to do with each string it receives:

import io.reactivex.Observable;

public class Launcher {
    public static void main(String[] args) {
        Observable<String> myStrings =
            Observable.just("Alpha", "Beta", "Gamma", "Delta", "Epsilon");
        myStrings.subscribe(s -> System.out.println(s));
    }
}

When we run this code, we should get the following output:

Alpha
Beta
Gamma
Delta
Epsilon

What happened here is that our Observable<String> pushed each string object one at a time to our Observer, which we shorthanded using the lambda expression s -> System.out.println(s). We pass each string through the parameter s (which I arbitrarily named) and instruct it to print each one.
Lambdas are essentially mini functions that allow us to quickly pass instructions on what action to take with each incoming item. Everything to the left of the arrow -> is the argument (which in this case is a string we named s), and everything to the right is the action (which is System.out.println(s)).

Summary

So in this article, we learned how to look at the world in a reactive way. As a developer, you may have to retrain yourself from a traditional imperative mindset and develop a reactive one. Especially if you have done imperative, object-oriented programming for a long time, this can be challenging. But the return on investment will be significant as your applications will become more maintainable, scalable, and evolvable. You will also have faster turnaround and more legible code. We also got a brief introduction to reactive code and how Observables work through push-based iteration. You will hopefully find reactive programming intuitive and easy to reason with. I hope you find that RxJava not only makes you more productive, but also helps you take on tasks you hesitated to do earlier. So let's get started!

Resources for Article: Further resources on this subject: Understanding the Basics of RxJava [article] Filtering a sequence [article] An Introduction to Reactive Programming [article]

Writing Your First Cucumber Appium Test

Packt
27 Jun 2017
12 min read
In this article, by Nishant Verma, author of the book Mobile Test Automation with Appium, you will learn about creating a new cucumber appium Java project in IntelliJ. Next, you will learn to write a sample feature and automate it, thereby learning how to start an appium server session with an app using the appium app, find locators using the appium inspector and write Java classes for each step implementation in the feature file. We will also discuss how to write a test for mobile web and use the Chrome Developer Tools to find the locators. Let's get started! In this article, we will discuss the following topics:

Create a sample Java project (using gradle)
Introduction to Cucumber
Writing first appium test
Starting appium server session and finding locators
Write a test for mobile web

(For more resources related to this topic, see here.)

Create a sample Java project (using gradle)

Let's create a sample appium Java project in IntelliJ. The below steps will help you do the same:

Launch IntelliJ and click Create New Project on the Welcome screen.
On the New Project screen, select Gradle from the left pane. Project SDK should get populated with the Java version. Click on Next, enter the GroupId as com.test and ArtifactId as HelloAppium. Version would already be populated. Click on Next.
Check the option Use Auto-Import and make sure Gradle JVM is populated. Click on Next.
The Project name field would be auto populated with what you gave as ArtifactId. Choose a project location and click on Finish.
IntelliJ would be running the background task (Gradle build), which can be seen in the status bar. We should have a project created with the default structure.
Open the build.gradle file. You may see a suggestion message; click on Ok, apply suggestion! Enter the below two lines in build.gradle. This would add appium and cucumber-jvm under dependencies:

compile group: 'info.cukes', name: 'cucumber-java', version: '1.2.5'
compile group: 'io.appium', name: 'java-client', version: '5.0.0-BETA6'

Below is how the gradle file should look:

group 'com.test'
version '1.0-SNAPSHOT'

apply plugin: 'java'

sourceCompatibility = 1.5

repositories {
    mavenCentral()
}

dependencies {
    testCompile group: 'junit', name: 'junit', version: '4.11'
    compile group: 'info.cukes', name: 'cucumber-java', version: '1.2.5'
    compile group: 'io.appium', name: 'java-client', version: '5.0.0-BETA6'
}

Once done, navigate to View -> Tools Window -> Gradle and click on the Refresh all gradle projects icon. This would pull all the dependencies into External Libraries.
Navigate to Preferences -> Plugins, search for Cucumber for Java and click on Install (if it's not previously installed).
Repeat the above step for Gherkin and install the same.
Once done, restart IntelliJ if it prompts you to.

We are now ready to write our first sample feature file, but before that let's try to understand a bit about cucumber.

Introduction to Cucumber

Cucumber is a test framework which supports behaviour driven development (BDD). The core idea behind BDD is a domain specific language (known as a DSL), where the tests are written in normal English, expressing how the application or system has to behave. A DSL is an executable test, which starts with a known state, performs some action and verifies the expected state. For example:

Feature: Registration with Facebook/Google

  Scenario: Registration Flow Validation via App
    As a user I should be able to see the Facebook/Google button
    when I try to register myself in Quikr.
    Given I launch the app
    When I click on Register
    Then I should see register with Facebook and Google

Dan North (the creator of BDD) defined behaviour-driven development in 2009 as follows: "BDD is a second-generation, outside-in, pull-based, multiple-stakeholder, multiple-scale, high-automation, agile methodology. It describes a cycle of interactions with well-defined outputs, resulting in the delivery of working, tested software that matters."

Cucumber feature files serve as living documentation and can be implemented in many languages. It was first implemented in Ruby and later extended to Java. Some of the basic concepts of Cucumber are:

The core of Cucumber is text files called features, which contain scenarios. These scenarios express the system or application behaviour. Feature files comprise steps, which are written following the syntax of Gherkin. Each step has a step implementation, which is the code behind that interacts with the application. So, in the above example, Feature, Scenario, Given, When, and Then are keywords.
Feature: Cucumber tests are grouped into features. We use this name because we want engineers to describe the features that a user will be able to use.
Scenario: A scenario expresses the behaviour we want; each feature contains several scenarios. Each scenario is an example of how the system should behave in a particular situation. The expected behaviour of the feature is the sum of its scenarios; for a feature to pass, all of its scenarios must pass.
Test Runner: There are different ways to run a feature file; however, we will use the JUnit runner initially and then move on to the Gradle command for command-line execution.

I hope we now have a brief idea of what Cucumber is. Further details can be read on their site (https://cucumber.io/). In the coming section, we will create a feature file, write a scenario, implement the code behind, and execute it.

Writing first appium test

Till now we have created a sample Java project and added the Appium dependency. Next, we need to add a Cucumber feature file and implement the code behind. Let's start:

Under the project folder, create the directory structure src/test/java/features.
Right-click on the features folder, select New -> File, and enter the name Sample.feature.
In the Sample.feature file, let's write a scenario as shown below, which is about logging in using Google:

Feature: Hello World
  Scenario: Registration Flow Validation via App
    As a user I should be able to see my google account
    when I try to register myself in Quikr
    When I launch Quikr app
    And I choose to log in using Google
    Then I see account picker screen with my email address "[email protected]"

Right-click on the java folder in IntelliJ, select New -> Package, and enter the name steps.
The next step is to implement the Cucumber steps. Click on the line When I launch Quikr app in the Sample.feature file and press Alt+Enter, then select the option Create step definition. It will present you with a pop-up to enter File name, File location, and File type. We need to enter the below values:

File name: HomePageSteps
File location: browse to the steps folder created above
File type: Java

The idea is that the steps belong to a page, and each page would typically have its own step implementation class. Once you click on OK, it creates a sample template in the HomePageSteps class file; the next step is to implement these methods and write the code behind to launch the Quikr app on the emulator.
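For reference, the generated step definition template might look roughly like the following sketch. The exact annotations and placeholder bodies depend on the IntelliJ Cucumber plugin version; only the package and class names come from the steps above, everything else is a plausible guess at what the plugin produces:

package steps;

import cucumber.api.PendingException;
import cucumber.api.java.en.And;
import cucumber.api.java.en.Then;
import cucumber.api.java.en.When;

public class HomePageSteps {

    @When("^I launch Quikr app$")
    public void iLaunchQuikrApp() throws Throwable {
        // Write code here that turns the phrase above into concrete actions
        throw new PendingException();
    }

    @And("^I choose to log in using Google$")
    public void iChooseToLogInUsingGoogle() throws Throwable {
        throw new PendingException();
    }

    @Then("^I see account picker screen with my email address \"([^\"]*)\"$")
    public void iSeeAccountPickerScreenWithMyEmailAddress(String emailAddress) throws Throwable {
        throw new PendingException();
    }
}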
Starting appium server session and finding locators

The first thing we need to do is download a sample app (the Quikr apk in this case):

Download the Quikr app (version 9.16).
Create a folder named app under the HelloAppium project and copy the downloaded apk into that folder.
Launch the Appium GUI app.
Launch the emulator or connect your device (assuming you have Developer Options enabled).
On the Appium GUI app, click on the Android icon and select the below options:

App Path - browse to the .apk location under the app folder
Platform Name - Android
Automation Name - Appium
Platform Version - select the version that matches the emulator from the dropdown (it also allows you to edit the value)
Device Name - enter any string, e.g. Nexus6

Once the above settings are done, click on the General Settings icon and choose the below-mentioned settings. Once the setup is done, click on the icon again to close the pop-up:

Select Prelaunch Application
Select Strict Capabilities
Select Override Existing Sessions
Select Kill Processes Using Server Port Before Launch
Select New Command Timeout and enter the value 7200

Click on Launch. This starts the Appium server session. Once you click on Appium Inspector, it will install the app on the emulator and launch it. If you click on the Record button, it will generate the boilerplate code, which has the Desired Capabilities respective to the run environment and the app location.

We can copy those lines and put them into the code template generated for the step When I launch Quikr app. This is how the code should look after copying it into the method:

@When("^I launch Quikr app$")
public void iLaunchQuikrApp() throws Throwable {
    DesiredCapabilities capabilities = new DesiredCapabilities();
    capabilities.setCapability("appium-version", "1.0");
    capabilities.setCapability("platformName", "Android");
    capabilities.setCapability("platformVersion", "5.1");
    capabilities.setCapability("deviceName", "Nexus6");
    capabilities.setCapability("app", "/Users/nishant/Development/HelloAppium/app/quikr.apk");
    AppiumDriver wd = new AppiumDriver(new URL("http://0.0.0.0:4723/wd/hub"), capabilities);
    wd.manage().timeouts().implicitlyWait(60, TimeUnit.SECONDS);
}

The above code only sets the Desired Capabilities; the Appium server is yet to be started. For now, we can start it from outside, in a terminal (or command prompt), by running the command appium. We can close the Appium inspector and stop the Appium server by clicking on Stop in the Appium GUI app.

To run the above test, we need to do the following:

Start the Appium server via the command line (command: appium --session-override).
In IntelliJ, right-click on the feature file and choose the option Run....

The scope of AppiumDriver is currently local to the method, so we can refactor and extract appiumDriver as a field. To continue automating the other steps, we can use the Appium inspector to find the element handles. We can launch the Appium inspector using the above-mentioned steps, then click on the element whose locator we want to find out, as shown in the below-mentioned screen. Once we have the locator, we can use the Appium API (as shown below) to click it:

appiumDriver.findElement(By.id("sign_in_button")).click();

This way we can implement the remaining steps; a rough sketch of the step class refactored in this way is shown below.
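The following sketch is one possible way to extract appiumDriver as a field shared across steps; it is not the book's exact implementation. The sign_in_button locator is the one found above with the Appium inspector, and the remaining assertion step is left as a placeholder:

package steps;

import cucumber.api.java.en.And;
import cucumber.api.java.en.When;
import io.appium.java_client.AppiumDriver;
import org.openqa.selenium.By;
import org.openqa.selenium.remote.DesiredCapabilities;

import java.net.URL;
import java.util.concurrent.TimeUnit;

public class HomePageSteps {

    // Shared across steps instead of being local to a single method
    private AppiumDriver appiumDriver;

    @When("^I launch Quikr app$")
    public void iLaunchQuikrApp() throws Throwable {
        DesiredCapabilities capabilities = new DesiredCapabilities();
        capabilities.setCapability("platformName", "Android");
        capabilities.setCapability("platformVersion", "5.1");
        capabilities.setCapability("deviceName", "Nexus6");
        capabilities.setCapability("app", "/Users/nishant/Development/HelloAppium/app/quikr.apk");
        appiumDriver = new AppiumDriver(new URL("http://0.0.0.0:4723/wd/hub"), capabilities);
        appiumDriver.manage().timeouts().implicitlyWait(60, TimeUnit.SECONDS);
    }

    @And("^I choose to log in using Google$")
    public void iChooseToLogInUsingGoogle() throws Throwable {
        // Locator found earlier with the Appium inspector
        appiumDriver.findElement(By.id("sign_in_button")).click();
    }

    // The "Then I see account picker screen..." step can be implemented in the
    // same way once its locator has been found with the inspector.
}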
Write a small test for mobile web

To automate a mobile web app, we don't need to install the app on the device. A browser and the app URL are sufficient to start the test automation. We can tweak the code written above by adding the desired capability browserName. We can write a similar scenario and make it mobile web specific:

Scenario: Registration Flow Validation via web
  As a user I want to verify that I get the option of choosing Facebook
  when I choose to register
  When I launch Quikr mobile web
  And I choose to register
  Then I should see an option to register using Facebook

The method for mobile web would look like this:

@When("^I launch Quikr mobile web$")
public void iLaunchQuikrMobileWeb() throws Throwable {
    DesiredCapabilities desiredCapabilities = new DesiredCapabilities();
    desiredCapabilities.setCapability("platformName", "Android");
    desiredCapabilities.setCapability("deviceName", "Nexus");
    desiredCapabilities.setCapability("browserName", "Browser");
    URL url = new URL("http://127.0.0.1:4723/wd/hub");
    appiumDriver = new AppiumDriver(url, desiredCapabilities);
    appiumDriver.get("http://m.quikr.com");
}

In the above code, we don't need platformVersion, but we do need a valid value for the browserName parameter. Possible values for browserName are:

Chrome - for the Chrome browser on Android
Safari - for the Safari browser on iOS
Browser - for the stock browser on Android

We can follow the same steps as above to run the test.

Finding locators in mobile web app

To implement the remaining steps of the above-mentioned feature, we need to find locators for the elements we want to interact with. Once the locators are found, we need to perform the desired operation, which could be click, send keys, and so on. Below mentioned are the steps that will help us find the locators for a mobile web app:

Launch the Chrome browser on your machine and navigate to the mobile site (in our case: http://m.quikr.com).
Select More Tools -> Developer Tools from the Chrome menu.
In the Developer Tools menu items, click on the Toggle device toolbar icon. Once done, the page would be displayed in a mobile layout format.
In order to find the locator of any UI element, click on the first icon of the dev tool bar and then click on the desired element. The HTML in the dev tool layout would change to highlight the selected element; refer to the below screenshot, which shows the same.

In the highlighted panel on the right side, we can see the properties name=query and id=query. We can choose to use the id and implement the step as:

appiumDriver.findElement(By.id("query")).click();

Using the above approach, we can find the locators of the various elements we need to interact with and proceed with our test automation.

Summary

In this article, we briefly described how we would go about writing tests for a native app as well as mobile web. We discussed how to create a project in IntelliJ and write a sample feature file. We also learned how to start the Appium inspector and look for locators. We learned about the Chrome dev tools and how we can use them to find locators for mobile web.

Resources for Article:

Further resources on this subject:

Appium Essentials [article]
Ensuring Five-star Rating in the MarketPlace [article]
Testing in Agile Development and the State of Agile Adoption [article]


Exploring Compilers

Packt
23 Jun 2017
17 min read
In this article by Gabriele Lanaro, author of the book Python High Performance - Second Edition, we will see that Python is a mature and widely used language, and there is a large interest in improving its performance by compiling functions and methods directly to machine code rather than executing instructions in the interpreter. In this article, we will explore two projects--Numba and PyPy--that approach compilation in a slightly different way. Numba is a library designed to compile small functions on the fly. Instead of transforming Python code to C, Numba analyzes and compiles Python functions directly to machine code. PyPy is a replacement interpreter that works by analyzing the code at runtime and optimizing the slow loops automatically.

(For more resources related to this topic, see here.)

Numba

Numba was started in 2012 by Travis Oliphant, the original author of NumPy, as a library for compiling individual Python functions at runtime using the Low-Level Virtual Machine (LLVM) toolchain. LLVM is a set of tools designed for writing compilers. LLVM is language agnostic and is used to write compilers for a wide range of languages (an important example is the clang compiler). One of the core aspects of LLVM is the intermediate representation (the LLVM IR), a very low-level, platform-agnostic language similar to assembly, which can be compiled to machine code for the specific target platform.

Numba works by inspecting Python functions and by compiling them, using LLVM, to the IR. As we have already seen in the last article, speed gains can be obtained when we introduce types for variables and functions. Numba implements clever algorithms to guess the types (this is called type inference) and compiles type-aware versions of the functions for fast execution.

Note that Numba was developed to improve the performance of numerical code. The development efforts often prioritize the optimization of applications that intensively use NumPy arrays. Numba is evolving really fast and can have substantial improvements between releases and, sometimes, backward-incompatible changes. To keep up, ensure that you refer to the release notes for each version. In the rest of this article, we will use Numba version 0.30.1; ensure that you install the correct version to avoid any errors. The complete code examples in this article can be found in the Numba.ipynb notebook.

First steps with Numba

Getting started with Numba is fairly straightforward. As a first example, we will implement a function that calculates the sum of squares of an array. The function definition is as follows:

def sum_sq(a):
    result = 0
    N = len(a)
    for i in range(N):
        result += a[i]**2
    return result

To set up this function with Numba, it is sufficient to apply the nb.jit decorator:

import numba as nb

@nb.jit
def sum_sq(a):
    ...

The nb.jit decorator won't do much when applied. However, when the function is invoked for the first time, Numba will detect the type of the input argument, a, and compile a specialized, performant version of the original function.

To measure the performance gain obtained by the Numba compiler, we can compare the timings of the original and the specialized functions. The original, undecorated function can be easily accessed through the py_func attribute.
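To try this outside of a notebook, a minimal self-contained script might look like the following sketch. The measurements below were taken with IPython's %timeit; here we use time.perf_counter instead, so the absolute numbers will differ from machine to machine:

import time

import numba as nb
import numpy as np


@nb.jit
def sum_sq(a):
    result = 0
    N = len(a)
    for i in range(N):
        result += a[i]**2
    return result


x = np.random.rand(10000)

sum_sq(x)  # the first call triggers compilation for float64 arrays

start = time.perf_counter()
sum_sq.py_func(x)  # original, interpreted version
print("interpreted:", time.perf_counter() - start)

start = time.perf_counter()
sum_sq(x)  # compiled version
print("compiled:", time.perf_counter() - start)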
The timings for the two functions are as follows:

import numpy as np

x = np.random.rand(10000)

# Original
%timeit sum_sq.py_func(x)
100 loops, best of 3: 6.11 ms per loop

# Numba
%timeit sum_sq(x)
100000 loops, best of 3: 11.7 µs per loop

You can see how the Numba version is orders of magnitude faster than the Python version. We can also compare how this implementation stacks up against the standard NumPy operators:

%timeit (x**2).sum()
10000 loops, best of 3: 14.8 µs per loop

In this case, the Numba-compiled function is marginally faster than the NumPy vectorized operations. The reason for the extra speed of the Numba version is likely that the NumPy version allocates an extra array before performing the sum, in contrast to the in-place operations performed by our sum_sq function.

As we didn't use array-specific methods in sum_sq, we can also try to apply the same function to a regular Python list of floating point numbers. Interestingly, Numba is able to obtain a substantial speedup even in this case, compared to a list comprehension:

x_list = x.tolist()
%timeit sum_sq(x_list)
1000 loops, best of 3: 199 µs per loop

%timeit sum([x**2 for x in x_list])
1000 loops, best of 3: 1.28 ms per loop

Considering that all we needed to do was apply a simple decorator to obtain an incredible speedup over different data types, it's no wonder that what Numba does looks like magic. In the following sections, we will dig deeper to understand how Numba works and evaluate the benefits and limitations of the Numba compiler.

Type specializations

As shown earlier, the nb.jit decorator works by compiling a specialized version of the function once it encounters a new argument type. To better understand how this works, we can inspect the decorated function in the sum_sq example. Numba exposes the specialized types through the signatures attribute. Right after the sum_sq definition, we can inspect the available specializations by accessing sum_sq.signatures, as follows:

sum_sq.signatures
# Output:
# []

If we call this function with a specific argument, for instance, an array of float64 numbers, we can see how Numba compiles a specialized version on the fly. If we also apply the function to an array of float32, we can see how a new entry is added to the sum_sq.signatures list:

x = np.random.rand(1000).astype('float64')
sum_sq(x)
sum_sq.signatures
# Result:
# [(array(float64, 1d, C),)]

x = np.random.rand(1000).astype('float32')
sum_sq(x)
sum_sq.signatures
# Result:
# [(array(float64, 1d, C),), (array(float32, 1d, C),)]

It is possible to explicitly compile the function for certain types by passing a signature to the nb.jit function. An individual signature can be passed as a tuple that contains the types we would like to accept. Numba provides a great variety of types that can be found in the nb.types module, and they are also available in the top-level nb namespace. If we want to specify an array of a specific type, we can use the slicing operator, [:], on the type itself. In the following example, we demonstrate how to declare a function that takes an array of float64 as its only argument:

@nb.jit((nb.float64[:],))
def sum_sq(a):
    ...

Note that when we explicitly declare a signature, we are prevented from using other types, as demonstrated in the following example. If we try to pass an array, x, as float32, Numba will raise a TypeError:

sum_sq(x.astype('float32'))
# TypeError: No matching definition for argument type(s) array(float32, 1d, C)

Another way to declare signatures is through type strings.
For example, a function that takes a float64 as input and returns a float64 as output can be declared with the "float64(float64)" string. Array types can be declared using the [:] suffix. To put this together, we can declare a signature for our sum_sq function, as follows:

@nb.jit("float64(float64[:])")
def sum_sq(a):
    ...

You can also pass multiple signatures by passing a list:

@nb.jit(["float64(float64[:])", "float64(float32[:])"])
def sum_sq(a):
    ...

Object mode versus native mode

So far, we have shown how Numba behaves when handling a fairly simple function. In this case, Numba worked exceptionally well, and we obtained great performance on arrays and lists. The degree of optimization obtainable from Numba depends on how well Numba is able to infer the variable types and how well it can translate those standard Python operations to fast type-specific versions. If this happens, the interpreter is side-stepped and we can get performance gains similar to those of Cython.

When Numba cannot infer variable types, it will still try to compile the code, reverting to the interpreter when the types can't be determined or when certain operations are unsupported. In Numba, this is called object mode and is in contrast to the interpreter-free scenario, called native mode.

Numba provides a function, called inspect_types, that helps us understand how effective the type inference was and which operations were optimized. As an example, we can take a look at the types inferred for our sum_sq function:

sum_sq.inspect_types()

When this function is called, Numba will print the types inferred for each specialized version of the function. The output consists of blocks that contain information about variables and the types associated with them. For example, we can examine the N = len(a) line:

# --- LINE 4 ---
#   a = arg(0, name=a)  :: array(float64, 1d, A)
#   $0.1 = global(len: <built-in function len>)  :: Function(<built-in function len>)
#   $0.3 = call $0.1(a)  :: (array(float64, 1d, A),) -> int64
#   N = $0.3  :: int64

N = len(a)

For each line, Numba prints a thorough description of variables, functions, and intermediate results. In the preceding example, you can see (second line) that the argument a is correctly identified as an array of float64 numbers. At LINE 4, the input and return types of the len function are also correctly identified (and likely optimized) as taking an array of float64 numbers and returning an int64. If you scroll through the output, you can see that all the variables have a well-defined type. Therefore, we can be certain that Numba is able to compile the code quite efficiently. This form of compilation is called native mode.

As a counterexample, we can see what happens if we write a function with unsupported operations. For example, as of version 0.30.1, Numba has limited support for string operations. We can implement a function that concatenates a series of strings, and compile it as follows:

@nb.jit
def concatenate(strings):
    result = ''
    for s in strings:
        result += s
    return result

Now, we can invoke this function with a list of strings and inspect the types:

concatenate(['hello', 'world'])
concatenate.signatures
# Output: [(reflected list(str),)]

concatenate.inspect_types()

Numba will return the output of the function for the reflected list(str) type. We can, for instance, examine how line 3 gets inferred.
The output of concatenate.inspect_types() is reproduced here:

# --- LINE 3 ---
#   strings = arg(0, name=strings)  :: pyobject
#   $const0.1 = const(str, )  :: pyobject
#   result = $const0.1  :: pyobject
#   jump 6
# label 6

result = ''

You can see how, this time, each variable or function is of the generic pyobject type rather than a specific one. This means that, in this case, Numba is unable to compile this operation without the help of the Python interpreter. Most importantly, if we time the original and compiled functions, we note that the compiled function is about three times slower than its pure Python counterpart:

x = ['hello'] * 1000

%timeit concatenate.py_func(x)
10000 loops, best of 3: 111 µs per loop

%timeit concatenate(x)
1000 loops, best of 3: 317 µs per loop

This is because the Numba compiler is not able to optimize the code and adds some extra overhead to the function call. As you may have noted, Numba compiled the code without complaints even though it is inefficient. The main reason for this is that Numba can still compile some sections of the code in an efficient manner while falling back to the Python interpreter for the other parts. This compilation strategy is called object mode.

It is possible to force the use of native mode by passing the nopython=True option to the nb.jit decorator. If, for example, we apply this decorator to our concatenate function, we observe that Numba throws an error on the first invocation:

@nb.jit(nopython=True)
def concatenate(strings):
    result = ''
    for s in strings:
        result += s
    return result

concatenate(x)
# Exception:
# TypingError: Failed at nopython (nopython frontend)

This feature is quite useful for debugging and for ensuring that all the code is fast and correctly typed.

Numba and NumPy

Numba was originally developed to easily increase the performance of code that uses NumPy arrays. Currently, many NumPy features are implemented efficiently by the compiler.

Universal functions with Numba

Universal functions are special functions defined in NumPy that are able to operate on arrays of different sizes and shapes according to the broadcasting rules. One of the best features of Numba is the implementation of fast ufuncs. We have already seen some ufunc examples in article 3, Fast Array Operations with NumPy and Pandas. For instance, the np.log function is a ufunc because it can accept scalars and arrays of different sizes and shapes. Universal functions that take multiple arguments also work according to the broadcasting rules; examples of such functions are np.add or np.subtract.

Universal functions can be defined in standard NumPy by implementing the scalar version and using the np.vectorize function to enhance the function with the broadcasting feature. As an example, we will see how to write the Cantor pairing function. A pairing function is a function that encodes two natural numbers into a single natural number so that you can easily interconvert between the two representations. The Cantor pairing function can be written as follows:

import numpy as np

def cantor(a, b):
    return int(0.5 * (a + b)*(a + b + 1) + b)

As already mentioned, it is possible to create a ufunc in pure Python using the np.vectorize decorator:

@np.vectorize
def cantor(a, b):
    return int(0.5 * (a + b)*(a + b + 1) + b)

cantor(np.array([1, 2]), 2)
# Result:
# array([ 8, 12])

Except for the convenience, defining universal functions in pure Python is not very useful, as it requires a lot of function calls affected by interpreter overhead. For this reason, ufunc implementation is usually done in C or Cython, but Numba beats all these methods by its convenience: all that is needed to perform the conversion is to use the equivalent decorator, nb.vectorize, as sketched below.
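The benchmark that follows times a Numba version named cantor against a pure Python version named cantor_py, without showing their definitions or the test arrays x1 and x2. A minimal sketch of how they might be set up is shown here, assuming Numba's lazy (signature-less) vectorize; the array sizes are illustrative assumptions:

import numba as nb
import numpy as np


@np.vectorize
def cantor_py(a, b):
    # Pure Python ufunc created with np.vectorize
    return int(0.5 * (a + b)*(a + b + 1) + b)


@nb.vectorize
def cantor(a, b):
    # Numba-compiled ufunc; the types are inferred on the first call
    return int(0.5 * (a + b)*(a + b + 1) + b)


x1 = np.random.randint(0, 100, 10000)
x2 = np.random.randint(0, 100, 10000)

# Both versions broadcast over the input arrays and agree on the result
assert (cantor(x1, x2) == cantor_py(x1, x2)).all()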
We can compare the speed of the np.vectorize version (called cantor_py in the following code), the Numba version, and the same computation implemented using standard NumPy operations:

# Pure Python
%timeit cantor_py(x1, x2)
100 loops, best of 3: 6.06 ms per loop

# Numba
%timeit cantor(x1, x2)
100000 loops, best of 3: 15 µs per loop

# NumPy
%timeit (0.5 * (x1 + x2)*(x1 + x2 + 1) + x2).astype(int)
10000 loops, best of 3: 57.1 µs per loop

You can see how the Numba version beats all the other options by a large margin! Numba works extremely well here because the function is simple and type inference is possible. An additional advantage of universal functions is that, since they operate on individual values, their evaluation can also be executed in parallel. Numba provides an easy way to parallelize such functions by passing the target="parallel" or target="cuda" keyword argument to the nb.vectorize decorator.

Generalized universal functions

One of the main limitations of universal functions is that they must be defined on scalar values. A generalized universal function, abbreviated gufunc, is an extension of universal functions to procedures that take arrays. A classic example is matrix multiplication. In NumPy, matrix multiplication can be applied using the np.matmul function, which takes two 2D arrays and returns another 2D array. An example usage of np.matmul is as follows:

a = np.random.rand(3, 3)
b = np.random.rand(3, 3)

c = np.matmul(a, b)
c.shape
# Result:
# (3, 3)

As we saw in the previous subsection, a ufunc broadcasts the operation over arrays of scalars; its natural generalization is to broadcast over arrays of arrays. If, for instance, we take two arrays of 3 by 3 matrices, we expect np.matmul to match the matrices pairwise and take their products. In the following example, we take two arrays containing 10 matrices of shape (3, 3). If we apply np.matmul, the product will be applied matrix-wise to obtain a new array containing the 10 results (which are, again, (3, 3) matrices):

a = np.random.rand(10, 3, 3)
b = np.random.rand(10, 3, 3)

c = np.matmul(a, b)
c.shape
# Output
# (10, 3, 3)

The usual rules for broadcasting work in a similar way. For example, if we have an array of ten (3, 3) matrices, which will have a shape of (10, 3, 3), we can use np.matmul to calculate the matrix multiplication of each element with a single (3, 3) matrix. According to the broadcasting rules, the single matrix will be repeated to obtain a size of (10, 3, 3):

a = np.random.rand(10, 3, 3)
b = np.random.rand(3, 3) # Broadcasted to shape (10, 3, 3)

c = np.matmul(a, b)
c.shape
# Result:
# (10, 3, 3)

Numba supports the implementation of efficient generalized universal functions through the nb.guvectorize decorator. As an example, we will implement a function that computes the euclidean distance between two arrays as a gufunc. To create a gufunc, we have to define a function that takes the input arrays, plus an output array where we will store the result of our calculation.
The nb.guvectorize decorator requires two arguments:

The types of the input and output: two 1D arrays as input and a scalar as output
The so-called layout string, which is a representation of the input and output sizes; in our case, we take two arrays of the same size (denoted arbitrarily by n), and we output a scalar

In the following example, we show the implementation of the euclidean function using the nb.guvectorize decorator:

@nb.guvectorize(['float64[:], float64[:], float64[:]'], '(n),(n)->()')
def euclidean(a, b, out):
    N = a.shape[0]
    out[0] = 0.0
    for i in range(N):
        out[0] += (a[i] - b[i])**2

There are a few very important points to be made. Predictably, we declared the types of the inputs a and b as float64[:], because they are 1D arrays. However, what about the output argument? Wasn't it supposed to be a scalar? Yes; however, Numba treats scalar arguments as arrays of size 1. That's why it was declared as float64[:]. Similarly, the layout string indicates that we have two arrays of size (n) and the output is a scalar, denoted by empty brackets--(). However, the array out will be passed as an array of size 1. Also, note that we don't return anything from the function; all the output has to be written to the out array.

The letter n in the layout string is completely arbitrary; you may choose to use k or other letters of your liking. Also, if you want to combine arrays of uneven sizes, you can use layout strings such as (n, m).

Our brand new euclidean function can be conveniently used on arrays of different shapes, as shown in the following example:

a = np.random.rand(2)
b = np.random.rand(2)
c = euclidean(a, b) # Shape: (1,)

a = np.random.rand(10, 2)
b = np.random.rand(10, 2)
c = euclidean(a, b) # Shape: (10,)

a = np.random.rand(10, 2)
b = np.random.rand(2)
c = euclidean(a, b) # Shape: (10,)

How does the speed of euclidean compare to standard NumPy? In the following code, we benchmark a NumPy vectorized version against our previously defined euclidean function:

a = np.random.rand(10000, 2)
b = np.random.rand(10000, 2)

%timeit ((a - b)**2).sum(axis=1)
1000 loops, best of 3: 288 µs per loop

%timeit euclidean(a, b)
10000 loops, best of 3: 35.6 µs per loop

The Numba version, again, beats the NumPy version by a large margin!

Summary

Numba is a tool that compiles fast, specialized versions of Python functions at runtime. In this article, we learned how to compile, inspect, and analyze functions compiled by Numba. We also learned how to implement fast NumPy universal functions that are useful in a wide array of numerical applications. Tools such as PyPy allow us to run Python programs unchanged to obtain significant speed improvements. We demonstrated how to set up PyPy, and we assessed the performance improvements on our particle simulator application.

Resources for Article:

Further resources on this subject:

Getting Started with Python Packages [article]
Python for Driving Hardware [article]
Python Data Science Up and Running [article]