Using Proxies to Monitor Remote Locations with Zabbix 1.8

Packt
29 Mar 2010
7 min read
When proxies are useful

A server is not always able to connect directly to the monitored machines. We could use active items everywhere (where Zabbix agents connect to the Zabbix server, discussed in Chapter 3), but that would require access from all of the monitored hosts to the Zabbix server, and it would not work for SNMP, IPMI, and other monitoring methods. This situation is common when devices belonging to other organizations have to be monitored, or when such restrictions are in place in a large corporate network.

This is where the Zabbix proxy comes in. When it is set up, only the proxy connects to the Zabbix server, and the proxy does all the monitoring on the server's behalf. There is no need for individual connections between the Zabbix server and individual devices in the remote location in either direction—a single firewall rule allowing the Zabbix proxy to connect to the Zabbix server is sufficient.

So how does the proxy work? Let's repeat: only the Zabbix proxy connects to the Zabbix server. This bit of information is crucial to understanding how it actually works. When the proxy connects to the server, it requests configuration information about what it has to monitor, then goes and does just that. As data is gathered, the proxy sends it back to the server. Does that sound familiar? It should, because that part works just like an active agent, except that the proxy can gather data from different systems much like the Zabbix server can. It is almost a super agent—in fact, it is said that the proxy was at first internally named "super agent", which confirms the similarity.

The Zabbix proxy is a very useful feature; the first stable Zabbix version to introduce it was 1.6, back in 2008. Given its usefulness, it is surprising that for a long time no other solutions provided anything comparable. Today, Zabbix proxies continue to lead the way by adding and improving the features that are most useful for remote data collection. As we will see later, proxies provide other benefits as well, so let's get to actually setting one up.

Setting up the proxy

When setting up the proxy for this exercise, it is suggested that you use a separate (physical or virtual) machine. If that is not possible, you can run the proxy on the same machine; hints will be provided for such a setup as well.

Before proceeding, we have to compile the proxy itself. There are a few mandatory parameters when compiling it, --enable-proxy being one. The Zabbix proxy also needs a database, so you have to include support for one. While SQLite is not suggested for the Zabbix server itself because of performance issues, the proxy might be a good candidate for it. Unless it is going to monitor lots of systems, SQLite's performance can be satisfactory, and it is trivial to set up because the Zabbix proxy creates and populates the database automatically.

Additionally, the Zabbix proxy must have support compiled in for the things it will be monitoring, such as SNMP, IPMI, and web monitoring. Let's say we want to compile the Zabbix proxy with SQLite support and the ability to monitor web, SNMP, and IPMI. Obviously, we will need all the relevant support libraries mentioned when we first compiled Zabbix, but this time we'll also need another one—the SQLite development headers. On most distributions they are included in a package named sqlite-devel or similar, so make sure to install this package before proceeding.
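The exact package name differs between distributions, so treat the following commands as an illustrative example (not from the original text) of installing the SQLite development headers as root. On RPM-based distributions:

# yum install sqlite-devel

On Debian or Ubuntu systems, the equivalent package is usually libsqlite3-dev:

# apt-get install libsqlite3-dev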
Then, on the machine where you plan to install the Zabbix proxy, enter the directory where the Zabbix installation source has been extracted and execute:

$ ./configure --enable-proxy --with-sqlite3 --with-libcurl --with-netsnmp --with-openipmi && make

This will both configure and compile the Zabbix proxy. We can again use the simple and universal solution to create a more or less proper package by executing, as root:

# checkinstall --nodoc --install=yes -y --pkgname=zabbix-proxy

The proxy binary is now compiled and in place, so we can turn to the configuration. First things first—the configuration file must be placed in the correct location:

# cp misc/conf/zabbix_proxy.conf /etc/zabbix

Now open /etc/zabbix/zabbix_proxy.conf as root and make some changes to get the proxy running:

Hostname=proxy

Hostname is an important parameter. As with active agents, if it's wrong, nothing will work.

Server=<Zabbix server IP address>

The Zabbix proxy connects to the Zabbix server, so it must know where to connect to.

DBName=/tmp/zabbix_proxy.db

We will try to use an SQLite database, so the full path to the database file must be specified.

Let's try to start the Zabbix proxy now. As root, execute:

# zabbix_proxy

Wait—we did not create the database as we did when installing the server, so the proxy couldn't start up, right? Let's look at the log file: open /tmp/zabbix_proxy.log. Paying close attention, we can find some interesting log records:

27982:20090729:131636.530 Cannot open database file "/tmp/zabbix_proxy.db": No such file or directory
27982:20090729:131636.531 Creating database ...

Wonderful—the Zabbix proxy can automatically create the required SQLite database and populate it. We can also verify that the Zabbix proxy is listening on the port it should be by running:

$ netstat -ntl | grep 10051

The output should confirm that everything is correct.

Monitoring a host through a proxy

Now that we have the proxy compiled, configured, and running, we have to inform Zabbix about it somehow. To do this, open Administration | DM in the frontend, then select Proxies in the first dropdown. No proxies are listed, so we have to create one—click the Create Proxy button and complete the form. Enter proxy in the Proxy name field. The section below, Hosts, allows us to specify which hosts will be monitored by this proxy. To make one host monitored by the proxy, mark Another Host in the Other Hosts list box and click on the << button. When you are done, click Save.

This server should now be monitored by the proxy, right? Again, not quite. As you might remember, we set the Zabbix server IP address in the agent configuration files so that agents would accept connections from that server and send active check data to it. When we change a host to be monitored by a proxy, all connections will be made from the proxy, so the proxy address must be added to the agent daemon's configuration file. As root, edit /etc/zabbix/zabbix_agentd.conf on Another Host and change the Server line to contain the Zabbix proxy IP address instead of the Zabbix server IP address. Save the file, then restart the agent daemon. Now this agent will send all data to the proxy only; no connection will be made to or from the Zabbix server.

If, on the other hand, you have the proxy running on the same host as the Zabbix server, the agent will not notice that it is now queried by the proxy rather than the server, because the source IP will not have changed. The agent is, however, still requesting and sending active checks to port 10051, the Zabbix server port.
To change this to the proxy port, edit /etc/zabbix/zabbix_agentd.conf on Another Host as root and change the ServerPort line to read:

ServerPort=10052

Don't forget to remove the leading hash mark, then restart the agent daemon.

With the host monitored solely by the proxy, let's check whether there is any indication of that in the frontend. Open Configuration | Hosts, make sure Linux servers is selected in the Group dropdown, and take a look at the Name column. As can be seen, Another Host is now prefixed by the proxy name and reads proxy:Another Host. Of course, we are not limited in our choice of proxy name—we could have named it "Nothing to see here" for all that the Zabbix server cares.

But do we always have to go to Administration | DM whenever we want to make a host monitored by a proxy? Click on Another Host in the Name column and observe the available properties. There's one dropdown available, Monitored by proxy. Using this dropdown we can easily assign a host to be monitored by the chosen proxy (remembering to change the server IP address in the agent daemon configuration file, if using active checks).
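To summarize, here is a minimal sketch of the two configuration files involved in this setup. The IP addresses (192.0.2.10 for the Zabbix server, 192.0.2.20 for the proxy) are placeholders of my own and should be replaced with your actual addresses:

# /etc/zabbix/zabbix_proxy.conf (on the proxy machine)
Hostname=proxy
# IP address of the Zabbix server the proxy reports to (placeholder)
Server=192.0.2.10
# SQLite database file; created automatically on first start
DBName=/tmp/zabbix_proxy.db

# /etc/zabbix/zabbix_agentd.conf (on Another Host)
# IP address of the proxy that now queries this agent (placeholder)
Server=192.0.2.20
# Only needed if the proxy listens on a non-default port, as in the same-machine setup above
ServerPort=10052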

Creating a Camel project (Simple)

Packt
27 Aug 2013
8 min read
Getting ready

For the examples in this article, we are going to use Apache Camel version 2.11 (http://camel.apache.org/) and Apache Maven version 2.2.1 or newer (http://maven.apache.org/) as a build tool. Both of these projects can be downloaded for free from their websites. The complete source code for all the examples in this article is available on GitHub in the https://github.com/bibryam/camel-message-routing-examples repository. It contains Camel routes in Spring XML and Java DSL with accompanying unit tests. The source code for this tutorial is located under the project camel-message-routing-examples/creating-camel-project.

How to do it...

In a new Maven project, add the following Camel dependency to the pom.xml:

<dependency>
    <groupId>org.apache.camel</groupId>
    <artifactId>camel-core</artifactId>
    <version>${camel-version}</version>
</dependency>

With this dependency in place, creating our first route requires only a couple of lines of Java code:

public class MoveFileRoute extends RouteBuilder {
    @Override
    public void configure() throws Exception {
        from("file://source")
            .to("log://org.apache.camel.howto?showAll=true")
            .to("file://target");
    }
}

Once the route is defined, the next step is to add it to CamelContext, which is the actual routing engine, and run it as a standalone Java application:

public class Main {
    public static void main(String[] args) throws Exception {
        CamelContext camelContext = new DefaultCamelContext();
        camelContext.addRoutes(new MoveFileRoute());
        camelContext.start();
        Thread.sleep(10000);
        camelContext.stop();
    }
}

That's all it takes to create our first Camel application. Now we can run it using a Java IDE or from the command line with Maven: mvn exec:java.

How it works...

Camel has a modular architecture; its core (the camel-core dependency) contains all the functionality needed to run a Camel application—DSLs for various languages, the routing engine, implementations of EIPs, a number of data converters, and the core components. This is the only dependency needed to run this application. Then there are optional technology-specific connector dependencies (called components) such as JMS, SOAP, JDBC, Twitter, and so on, which are not needed for this example, as the file and log components we used are part of camel-core.

Camel routes are created using a Domain Specific Language (DSL) specifically tailored for application integration. Camel DSLs are high-level languages that allow us to easily create routes, combining various processing steps and EIPs without going into low-level implementation details. In the Java DSL, we create a route by extending RouteBuilder and overriding the configure method. A route represents a chain of processing steps applied to a message based on some rules. The route has a beginning defined by the from endpoint and one or more processing steps commonly called "Processors" (which implement the Processor interface).

Most of these ideas and concepts originate from the Pipes and Filters pattern described in the Enterprise Integration Patterns book by Gregor Hohpe and Bobby Woolf. The book provides an extensive list of patterns, which are also available at http://www.enterpriseintegrationpatterns.com, and the majority of which are implemented by Camel. With the Pipes and Filters pattern, a large processing task is divided into a sequence of smaller independent processing steps (Filters) that are connected by channels (Pipes).
Each Filter processes messages received from the inbound channel and publishes the result to the outbound channel. In our route, the processing steps are reading the file using a polling consumer, logging it, and writing the file to the target folder, all of them piped together by Camel in the sequence specified in the DSL. We can visualize the individual steps in the application with the following diagram.

A route has exactly one input, called a consumer and identified by the keyword from. A consumer receives messages from producers or external systems, wraps them in a Camel-specific format called an Exchange, and starts routing them. There are two types of consumers: a polling consumer that fetches messages periodically (for example, reading files from a folder) and an event-driven consumer that listens for events and gets activated when a message arrives (for example, an HTTP server). All the other processor nodes in the route are either a type of integration pattern or producers used for sending messages to various endpoints. Producers are identified by the keyword to, and they are capable of converting exchanges and delivering them to other channels using the underlying transport mechanism. In our example, the log producer logs the files using the log4j API, whereas the file producer writes them to the target folder.

The route alone is not enough to have a running application; it is only a template that defines the processing steps. The engine that runs and manages the routes is called the Camel Context. A high-level view of CamelContext looks like the following diagram.

CamelContext is a dynamic multithreaded route container, responsible for managing all aspects of the routing: route lifecycle, message conversions, configuration, error handling, monitoring, and so on. When CamelContext is started, it starts the components and endpoints and activates the routes. The routes are kept running until CamelContext is stopped, at which point it performs a graceful shutdown, giving all the in-flight messages time to complete processing. CamelContext is dynamic: it allows us to start and stop routes, add new routes, or remove running routes at runtime. In our example, after adding the MoveFileRoute, we start CamelContext and let it copy files for 10 seconds, and then the application terminates. If we check the target folder, we should see files copied from the source folder.

There's more...

Camel applications can run as standalone applications or can be embedded in other containers such as Spring or Apache Karaf. To make development and deployment to various environments easy, Camel provides a number of DSLs, including Spring XML, Blueprint XML, Groovy, and Scala. Next, we will have a look at the Spring XML DSL.

Using Spring XML DSL

Java and Spring XML are the two most popular DSLs in Camel. Both provide access to all Camel features, and the choice is mostly a matter of taste. The Java DSL is more flexible and requires fewer lines of code, but it can easily become complicated and harder to understand with the use of anonymous inner classes and other Java constructs. The Spring XML DSL, on the other hand, is easier to read and maintain, but it is more verbose, and testing it requires a little more effort. My rule of thumb is to use the Spring XML DSL only when Camel is going to be part of a Spring application (to benefit from other Spring features available in Camel), or when the routing logic has to be easily understood by many people.
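Since testing came up above, it is worth sketching what a unit test for a Java DSL route can look like. The following is a minimal illustration using Camel's camel-test module on the test classpath; the class name and the direct/mock endpoint URIs are made up for this example and are not taken from the accompanying repository:

import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.component.mock.MockEndpoint;
import org.apache.camel.test.junit4.CamelTestSupport;
import org.junit.Test;

public class MoveMessageRouteTest extends CamelTestSupport {

    @Override
    protected RouteBuilder createRouteBuilder() {
        // A simplified route using direct/mock endpoints so the test needs no filesystem
        return new RouteBuilder() {
            @Override
            public void configure() {
                from("direct:in")
                    .to("log://org.apache.camel.howto?showAll=true")
                    .to("mock:out");
            }
        };
    }

    @Test
    public void messageReachesTheEndOfTheRoute() throws Exception {
        MockEndpoint out = getMockEndpoint("mock:out");
        out.expectedMessageCount(1);
        out.expectedBodiesReceived("hello");

        // Send a test message into the route
        template.sendBody("direct:in", "hello");

        assertMockEndpointsSatisfied();
    }
}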
For the routing examples in this article, we are going to show a mixture of Java and Spring XML DSL, but the source code accompanying this article has all the examples in both DSLs. In order to use Spring, we also need the following dependency in our projects:

<dependency>
    <groupId>org.apache.camel</groupId>
    <artifactId>camel-spring</artifactId>
    <version>${camel-version}</version>
</dependency>

The same application for copying files, written in the Spring XML DSL, looks like the following:

<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="
           http://www.springframework.org/schema/beans
           http://www.springframework.org/schema/beans/spring-beans.xsd
           http://camel.apache.org/schema/spring
           http://camel.apache.org/schema/spring/camel-spring.xsd">
    <camelContext xmlns="http://camel.apache.org/schema/spring">
        <route>
            <from uri="file://source"/>
            <to uri="log://org.apache.camel.howto?showAll=true"/>
            <to uri="file://target"/>
        </route>
    </camelContext>
</beans>

Notice that this is a standard Spring XML file with an additional camelContext element containing the route. We can launch the Spring application as part of a web application, an OSGi bundle, or as a standalone application:

public static void main(String[] args) throws Exception {
    AbstractApplicationContext springContext = new ClassPathXmlApplicationContext("META-INF/spring/move-file-context.xml");
    springContext.start();
    Thread.sleep(10000);
    springContext.stop();
}

When the Spring container starts, it instantiates a CamelContext, starts it, and adds the routes without any other code required. That is the complete application written in the Spring XML DSL. More information about Spring support in Apache Camel can be found at http://camel.apache.org/spring.html.

Summary

This article provided a high-level overview of the Camel architecture and demonstrated how to create a simple message-driven application.

Securing and Authenticating Web API

Packt
21 Oct 2015
9 min read
In this article by Rajesh Gunasundaram, author of ASP.NET Web API Security Essentials, we will cover how to secure a Web API using forms authentication and Windows authentication. You will also learn the advantages and disadvantages of using forms and Windows authentication in the Web API. In this article, we will cover the following topics:

The working of forms authentication
Implementing forms authentication in the Web API
Discussing the advantages and disadvantages of using the integrated Windows authentication mechanism
Configuring Windows authentication
Enabling Windows authentication in Katana
Discussing Hawk authentication

The working of forms authentication

In forms authentication, the user credentials are submitted to the server using HTML forms. This can be used in the ASP.NET Web API only if it is consumed from a web application. Forms authentication is built on ASP.NET and uses the ASP.NET membership provider to manage user accounts. It requires a browser client to pass the user credentials to the server: the credentials are sent in the request, and HTTP cookies are used for the authentication.

Let's list out the process of forms authentication step by step:

The browser tries to access a restricted action that requires an authenticated request.
If the browser sends an unauthenticated request, the server responds with an HTTP 302 Found status and redirects the browser to the login page.
To send an authenticated request, the user enters the username and password and submits the form.
If the credentials are valid, the server responds with an HTTP 302 status code that redirects the browser to the originally requested URI, with the authentication cookie in the response.
Any request from the browser will now include the authentication cookie, and the server will grant access to any restricted resource.

The following image illustrates the workflow of forms authentication:

Fig 1 – Illustrates the workflow of forms authentication

Implementing forms authentication in the Web API

To send the credentials to the server, we need an HTML form to submit. Let's use the HTML form or view of an ASP.NET MVC application. The steps to implement forms authentication in an ASP.NET MVC application are as follows:

Create New Project from the Start page in Visual Studio.
Select the Visual C# installed template named Web.
Choose ASP.NET Web Application from the middle panel.
Name the project Chapter06.FormsAuthentication and click OK.

Fig 2 – We have named the ASP.NET Web Application Chapter06.FormsAuthentication

Select the MVC template in the New ASP.NET Project dialog.
Tick Web API under Add folders and core references and press OK, leaving Authentication set to Individual User Accounts.
Fig 3 – Select the MVC template and check Web API under Add folders and core references

In the Models folder, add a class named Contact.cs with the following code:

namespace Chapter06.FormsAuthentication.Models
{
    public class Contact
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public string Email { get; set; }
        public string Mobile { get; set; }
    }
}

Add a Web API controller named ContactsController with the following code snippet:

namespace Chapter06.FormsAuthentication.Api
{
    public class ContactsController : ApiController
    {
        IEnumerable<Contact> contacts = new List<Contact>
        {
            new Contact { Id = 1, Name = "Steve", Email = "[email protected]", Mobile = "+1(234)35434" },
            new Contact { Id = 2, Name = "Matt", Email = "[email protected]", Mobile = "+1(234)5654" },
            new Contact { Id = 3, Name = "Mark", Email = "[email protected]", Mobile = "+1(234)56789" }
        };

        [Authorize]
        // GET: api/Contacts
        public IEnumerable<Contact> Get()
        {
            return contacts;
        }
    }
}

As you can see in the preceding code, we decorated the Get() action in ContactsController with the [Authorize] attribute, so this Web API action can only be accessed by an authenticated request. An unauthenticated request to this action will make the browser redirect to the login page and let the user either register or log in. Once the user is logged in, any request that tries to access this action will be allowed, as it is authenticated: the browser automatically sends the session cookie along with the request, and forms authentication uses this cookie to authenticate the request. It is very important to secure the website using SSL, as forms authentication sends unencrypted credentials.

Discussing the advantages and disadvantages of using the integrated Windows authentication mechanism

First, let's look at the advantages of Windows authentication. Windows authentication is built into Internet Information Services (IIS). It doesn't send the user credentials along with the request, it is best suited for intranet applications, and it doesn't need the user to enter their credentials.

However, with all these advantages, there are a few disadvantages to the Windows authentication mechanism. It requires Kerberos (which works based on tickets) or NTLM, a Microsoft security protocol, to be supported by the client, and the client's PC must be in an Active Directory domain. Windows authentication is not suitable for internet applications, as the client may not necessarily be on the same domain.

Configuring Windows authentication

Let's implement Windows authentication in an ASP.NET MVC application, as follows:

Create New Project from the Start page in Visual Studio.
Select the Visual C# installed template named Web.
Choose ASP.NET Web Application from the middle panel.
Name the project Chapter06.WindowsAuthentication and click OK.

Fig 4 – We have named the ASP.NET Web Application Chapter06.WindowsAuthentication

Change the Authentication mode to Windows Authentication.

Fig 5 – Select Windows Authentication in the Change Authentication window

Select the MVC template in the New ASP.NET Project dialog.
Tick Web API under Add folders and core references and click OK.
Fig 6 – Select the MVC template and check Web API under Add folders and core references

Under the Models folder, add a class named Contact.cs with the following code:

namespace Chapter06.FormsAuthentication.Models
{
    public class Contact
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public string Email { get; set; }
        public string Mobile { get; set; }
    }
}

Add a Web API controller named ContactsController with the following code:

namespace Chapter06.FormsAuthentication.Api
{
    public class ContactsController : ApiController
    {
        IEnumerable<Contact> contacts = new List<Contact>
        {
            new Contact { Id = 1, Name = "Steve", Email = "[email protected]", Mobile = "+1(234)35434" },
            new Contact { Id = 2, Name = "Matt", Email = "[email protected]", Mobile = "+1(234)5654" },
            new Contact { Id = 3, Name = "Mark", Email = "[email protected]", Mobile = "+1(234)56789" }
        };

        [Authorize]
        // GET: api/Contacts
        public IEnumerable<Contact> Get()
        {
            return contacts;
        }
    }
}

The Get() action in ContactsController is decorated with the [Authorize] attribute. However, with Windows authentication, any request is considered an authenticated request if the client is on the same domain, so no explicit login process is required to call the Get() action with an authenticated request. Note that Windows authentication is configured in the Web.config file:

<system.web>
    <authentication mode="Windows" />
</system.web>

Enabling Windows authentication in Katana

The following steps create a console application and enable Windows authentication in Katana:

Create New Project from the Start page in Visual Studio.
Select the Visual C# installed template named Windows Desktop.
Select Console Application from the middle panel.
Name the project Chapter06.WindowsAuthenticationKatana and click OK.

Fig 7 – We have named the Console Application Chapter06.WindowsAuthenticationKatana

Install the NuGet package named Microsoft.Owin.SelfHost from the NuGet Package Manager.

Fig 8 – Install the NuGet package named Microsoft.Owin.SelfHost

Add a Startup class with the following code snippet:

namespace Chapter06.WindowsAuthenticationKatana
{
    class Startup
    {
        public void Configuration(IAppBuilder app)
        {
            var listener = (HttpListener)app.Properties["System.Net.HttpListener"];
            listener.AuthenticationSchemes = AuthenticationSchemes.IntegratedWindowsAuthentication;
            app.Run(context =>
            {
                context.Response.ContentType = "text/plain";
                return context.Response.WriteAsync("Hello Packt Readers!");
            });
        }
    }
}

Add the following code to the Main function in Program.cs:

using (WebApp.Start<Startup>("http://localhost:8001"))
{
    Console.WriteLine("Press any Key to quit Web App.");
    Console.ReadKey();
}

Now run the application and open http://localhost:8001/ in the browser:

Fig 9 – Open the web app in a browser

If you capture the request using Fiddler, you will notice an Authorization Negotiate entry in the request header. Try calling http://localhost:8001/ in Fiddler and you will get a 401 Unauthorized response with WWW-Authenticate headers indicating that the server attaches a Negotiate protocol that uses either Kerberos or NTLM, as follows:

HTTP/1.1 401 Unauthorized
Cache-Control: private
Content-Type: text/html; charset=utf-8
Server: Microsoft-IIS/8.0
WWW-Authenticate: Negotiate
WWW-Authenticate: NTLM
X-Powered-By: ASP.NET
Date: Tue, 01 Sep 2015 19:35:51 IST
Content-Length: 6062
Proxy-Support: Session-Based-Authentication
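If you want to exercise the Windows-authenticated endpoint from code rather than a browser, a small client can reuse the current Windows credentials. The following is a minimal sketch (the class name is made up for the example); setting UseDefaultCredentials makes HttpClient take part in the Negotiate handshake against the self-hosted URL used above:

using System;
using System.Net.Http;

class WindowsAuthClient
{
    static void Main()
    {
        // Send the current Windows identity as part of the Negotiate handshake
        var handler = new HttpClientHandler { UseDefaultCredentials = true };
        using (var client = new HttpClient(handler))
        {
            // The self-hosted Katana endpoint from the example above
            string body = client.GetStringAsync("http://localhost:8001/").Result;
            Console.WriteLine(body);
        }
    }
}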
Discussing Hawk authentication

Hawk authentication is a message authentication code-based HTTP authentication scheme that facilitates partial cryptographic verification of HTTP messages. It requires a symmetric key to be shared between the client and the server. Instead of sending the username and password to the server in order to authenticate the request, Hawk authentication uses these credentials to generate a message authentication code that is passed to the server in the request for authentication.

Hawk authentication is mainly used in scenarios where you need to pass the username and password over an unsecured layer and no SSL is implemented on the server. In such cases, Hawk authentication protects the username and password and passes the message authentication code instead. For example, if you are building a small product where you control both the server and the client, and implementing SSL is too expensive for such a small project, then Hawk is a good option to secure the communication between your server and client.

Summary

Voila! We just secured our Web API using forms- and Windows-based authentication. In this article, you learned how forms authentication works and how it is implemented in the Web API. You also learned about configuring Windows authentication and got to know the advantages and disadvantages of using it. Then you learned about implementing the Windows authentication mechanism in Katana. Finally, we had an introduction to Hawk authentication and the scenarios in which it is used.

Getting Started with Inkscape

Packt
09 Mar 2011
9 min read
This article is based on Inkscape 0.48 Essentials for Web Designers: use the fascinating Inkscape graphics editor to create attractive layout designs, images, and icons for your website.

Vector graphics

Vector graphics are made up of paths. Each path is basically a line with a start and end point, curves, angles, and points that are calculated with a mathematical equation. These paths are not limited to being straight—they can be of any shape and size, and can encompass any number of curves. When you combine them, they create drawings, diagrams, and can even help create certain fonts.

These characteristics make vector graphics very different from JPEG, GIF, or BMP images—all of which are rasterized or bitmap images made up of tiny squares called pixels or bits. If you magnify these images, you will see that they are made up of a grid (a bitmap), and if you keep magnifying them, they become blurry and grainy as each pixel's square grows larger with the zoom level. Computer monitors also use pixels in a grid. However, they use millions of them, so that when you look at a display your eyes see a picture. In high-resolution monitors, the pixels are smaller and closer together to give a crisper image.

How does this all relate to vector-based graphics? Vector-based graphics aren't made up of squares. Since they are based on paths, you can make them larger (by scaling) and the image quality stays the same, lines and edges stay clean, and the same images can be used on items as small as letterheads or business cards or blown up to billboards or used in high-definition animation sequences. This flexibility, often accompanied by smaller file sizes, makes vector graphics ideal—especially in the world of the Internet, varying computer displays, and hosting services for web spaces—which leads us nicely to Inkscape, a tool that can be invaluable for web design.

What is Inkscape and how can it be used?

Inkscape is a free, open source program developed by a group of volunteers under the GNU General Public License (GPL). You not only get a free download, but you can use the program to create items, freely distribute them, modify the program itself, and share that modified program with others.

Inkscape uses Scalable Vector Graphics (SVG), a vector-based drawing language that follows some basic principles:

A drawing can (and should) be scalable to any size without losing detail.
A drawing can use an unlimited number of smaller drawings, used (and reused) in any number of ways, and still be a part of a larger whole.

SVG and World Wide Web Consortium (W3C) web standards are built into Inkscape, which gives it a number of features, including a rich XML (eXtensible Markup Language) format with complete descriptions and animations. Inkscape drawings can be reused in other SVG-compliant drawing programs and can adapt to different presentation methods. It has support across most web browsers (Firefox, Chrome, Opera, Safari, Internet Explorer).

When you draw your objects (rectangles, circles, and so on), arbitrary paths, and text in Inkscape, you also give them attributes such as color, gradient, or patterned fills. Inkscape automatically creates the web code (XML) for each of these objects and tags your images with this code.
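As a rough illustration of what such markup looks like, here is a small hand-written SVG snippet with a rectangle and a circle. It is a simplified sketch; the files Inkscape actually saves include additional namespaces and editor metadata:

<svg xmlns="http://www.w3.org/2000/svg" width="200" height="120">
  <!-- A rounded rectangle with a solid fill and a visible stroke -->
  <rect x="10" y="10" width="120" height="80" rx="8" fill="#3b82c4" stroke="#1d3f63" stroke-width="2"/>
  <!-- A semi-transparent circle overlapping the rectangle -->
  <circle cx="150" cy="60" r="40" fill="#f0a030" opacity="0.8"/>
</svg>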
If need be, the graphics can then be transformed, cloned, and grouped in the code itself. Hyperlinks can even be added for use in web browsers, as can multi-lingual scripting (which isn't available in most commercial vector-based programs) and more—all within Inkscape or in a native programming language. This makes your vector graphics more versatile in the web space than a standard JPG or GIF graphic.

There are still some limitations in the Inkscape program, even though it aims to be fully SVG compliant. For example, as of version 0.48 it still does not support animation or SVG fonts—though there are plans to add these capabilities in future versions.

Installing Inkscape

Inkscape is available for download for the Windows, Macintosh, Linux, and Solaris operating systems. To run on the Mac OS X operating system, it typically runs under X11—an implementation of the X Window System software that makes it possible to run X11-based applications in Mac OS X. The X11 application has shipped with Mac OS X since version 10.5. When you open Inkscape on a Mac, it will first open X11 and run Inkscape within that program. Some shortcut key options are lost, but all functionality is present using the menus and toolbars.

Let's briefly go over how to download and install Inkscape:

Go to the official Inkscape website at http://www.inkscape.org/ and download the appropriate version of the software for your computer.
For the Mac OS X Leopard software, you will also need to download an additional application: the X11 application package 2.4.0 or greater from http://xquartz.macosforge.org/trac/wiki/X112.4.0. Once downloaded, double-click the X11-2.4.0.DMG package first. It will open another folder with the X11 application installer. Double-click that icon to be prompted through an installation wizard.
Double-click the downloaded Inkscape installation package to start the installation. For Mac OS, a DMG file is downloaded; double-click it and then drag and drop the Inkscape package to the Applications folder. For any Windows device, an .EXE file is downloaded; double-click that file to start and complete the installation. For Linux-based computers, there are a number of distributions available; be sure to download and install the correct installation package for your system.
Now find the Inkscape icon in the Applications or Programs folder to open the program. Double-click the Inkscape icon and the program will automatically open to the main screen.

The basics of the software

When you open Inkscape for the first time, you'll see the main screen with a new blank document opened and ready to go. If you are using a Macintosh computer, Inkscape opens within the X11 application and may take slightly longer to load.

The Inkscape interface is based on the GNOME UI standard, which uses visual cues and feedback for its icons. For example:

Hovering your mouse over any icon displays a pop-up description of the icon.
If an icon has a dark gray border, it is active and can be used.
If an icon is grayed out, it is not currently available to use with the current selection.
Icons that are in execution mode (or busy) are covered by a dark shadow. This signifies that the application is busy and won't respond to any edit request.

There is also a Notification Display on the main screen that shows dynamic help messages, key shortcuts, and basic information on how to use the Inkscape software in its current state or based on what objects and tools are selected.
Main screen basics

Within the main screen there is the main menu; the command, snap, and status bars; the tool controls; and a palette bar.

Main menu

You will use the main menu bar the most when working on your projects. It is the central location for every tool and menu item in the program—even those found in the visual toolbars below it on the screen. When you select a main menu item, the Inkscape dialog displays the icon, a text description, and the shortcut key combination for the feature. This can be helpful while first learning the program, as it provides you with easier and often faster ways to use your most commonly used functions.

Toolbars

Let's take a general tour of the toolbars seen on this main screen, paying close attention to the tools we'll use most frequently. If you don't like the location of any of the toolbars, you can also turn them into floating windows on your screen. This lets you move them from their pre-defined locations to a location of your liking. To move any of the toolbars from their docking point, click and drag them out of the window. When you click the upper-left button to close the toolbar window, it will be relocated back into the screen.

Command bar

This toolbar contains the common and most frequently used commands in Inkscape: you can create a new document, open an existing one, save, print, cut, paste, zoom, add text, and much more. Hover your mouse over each icon for details on its function. By default, when you open Inkscape, this toolbar is on the right side of the main screen.

Snap bar

Also found vertically on the right side of the main screen, this toolbar is designed to help with the Snap to features of Inkscape. It lets you easily align items (snap to guides), force objects to align to paths (snap to paths), or snap to bounding boxes and edges.

Tool controls

This toolbar's options change depending on which tool you have selected in the toolbox (described in the next section). When you are creating objects, it provides all the detailed options—size, position, angles, and attributes specific to the tool you are currently using. You have options to select or deselect objects within a layer, rotate or mirror objects, adjust object locations on the canvas, scale objects, and much more. Use it to define object properties when they are selected on the canvas.

Toolbox bar

You'll use the toolbox frequently. It contains all of the main tools for creating objects, selecting and modifying objects, and drawing. To select a tool, click its icon. If you double-click a tool, you can see (and change) that tool's preferences. If you are new to Inkscape, there are a couple of hints about creating and editing text: the Text tool (the A icon) in the toolbox is the only way of creating new text on the canvas, while the T icon in the command bar is used only for editing text that already exists on the canvas.

How to manage complex applications using Kubernetes-based Helm tool [Tutorial]

Savia Lobo
16 Jul 2019
16 min read
Helm is a popular tool in the Kubernetes ecosystem that gives us a way of building packages (known as charts) of related Kubernetes objects that can be deployed in a cohesive way to a cluster. It also allows us to parameterize these packages, so they can be reused in different contexts and deployed to the varying environments that the services they provide might be needed in.

This article is an excerpt taken from the book Kubernetes on AWS, written by Ed Robinson. In this book, you will discover how to utilize the power of Kubernetes to manage and update your applications. In this article, you will learn how to manage complex applications using the Kubernetes-based Helm tool. You will start by learning how to install Helm and then move on to configuring and packaging Helm charts.

Like Kubernetes, development of Helm is overseen by the Cloud Native Computing Foundation. As well as Helm (the package manager), the community maintains a repository of standard charts for a wide range of open source software you can install and run on your cluster. From the Jenkins CI server to MySQL or Prometheus, it's simple to install and run complex deployments involving many underlying Kubernetes resources with Helm.

Installing Helm

If you have already set up your own Kubernetes cluster and have correctly configured kubectl on your machine, then it is simple to install Helm.

On macOS

On macOS, the simplest way to install the Helm client is with Homebrew:

$ brew install kubernetes-helm

On Linux and Windows

Every release of Helm includes prebuilt binaries for Linux, Windows, and macOS. Visit https://github.com/kubernetes/helm/releases to download the version you need for your platform. To install the client, simply unpack the archive and copy the binary onto your path. For example, on a Linux machine you might do the following:

$ tar -zxvf helm-v2.7.2-linux-amd64.tar.gz
$ mv linux-amd64/helm /usr/local/bin/helm

Installing Tiller

Once you have the Helm CLI tool installed on your machine, you can go about installing Helm's server-side component, Tiller. Helm uses the same configuration as kubectl, so start by checking which context you will be installing Tiller onto:

$ kubectl config current-context
minikube

Here, we will be installing Tiller into the cluster referenced by the Minikube context. In this case, this is exactly what we want. If your kubectl is currently pointing to another cluster, you can quickly switch to the context you want to use like this:

$ kubectl config use-context minikube

If you are still not sure that you are using the correct context, take a quick look at the full config and check that the cluster server field is correct:

$ kubectl config view --minify=true

The minify flag removes any config not referenced by the current context. Once you are happy that kubectl is connecting to the correct cluster, we can set up Helm's local environment and install Tiller onto the cluster:

$ helm init
$HELM_HOME has been configured at /Users/edwardrobinson/.helm.
Tiller (the Helm server-side component) has been installed into your Kubernetes Cluster.
Happy Helming!

We can use kubectl to check that Tiller is indeed running on our cluster:

$ kubectl -n kube-system get deploy -l app=helm
NAME            DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
tiller-deploy   1         1         1            1           3m

Once we have verified that Tiller is correctly running on the cluster, let's use the version command.
This will validate that we are able to connect correctly to the API of the Tiller server and return the version number of both the CLI and the Tiller server:

$ helm version
Client: &version.Version{SemVer:"v2.7.2", GitCommit:"8478fb4fc723885b155c924d1c8c410b7a9444e6", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.7.2", GitCommit:"8478fb4fc723885b155c924d1c8c410b7a9444e6", GitTreeState:"clean"}

Installing a chart

Let's start by installing an application using one of the charts provided by the community. You can discover applications that the community has produced Helm charts for at https://hub.kubeapps.com/. As well as making it simple to deploy a wide range of applications to your Kubernetes cluster, it's a great resource for learning some of the best practices the community uses when packaging applications for Helm.

Helm charts can be stored in a repository, so it is simple to install them by name. By default, Helm is already configured to use one remote repository called stable. This makes it simple for us to try out some commonly used applications as soon as Helm is installed.

Before you install a chart, you will need to know three things:

The name of the chart you want to install
The name you will give to this release (if you omit this, Helm will create a random name for the release)
The namespace on the cluster you want to install the chart into (if you omit this, Helm will use the default namespace)

Helm calls each distinct installation of a particular chart a release. Each release has a unique name that is used if you later want to update, upgrade, or even remove a release from your cluster. Being able to install multiple instances of a chart onto a single cluster makes Helm a little different from traditional package managers, which are tied to a single machine and typically only allow one installation of a particular package at a time. But once you have got used to the terminology, it is very simple to understand:

A chart is the package that contains all the information about how to install a particular application or tool to the cluster. You can think of it as a template that can be reused to create many different instances or releases of the packaged application or tool.
A release is a named installation of a chart to a particular cluster. By referring to a release by name, Helm can upgrade a particular release, updating the version of the installed tool or making configuration changes.
A repository is an HTTP server storing charts along with an index file. When configured with the location of a repository, the Helm client can install a chart from that repository by downloading it and then making a new release.

Before you can install a chart onto your cluster, you need to make sure that Helm knows about the repository that you want to use. You can list the repositories that are currently in use by running the helm repo list command:

$ helm repo list
NAME    URL
stable  https://kubernetes-charts.storage.googleapis.com
local   http://127.0.0.1:8879/charts

By default, Helm is configured with a repository named stable, pointing at the community chart repository, and a local repository that points at a local address for testing your own local repository. (You need to be running helm serve for this.) Adding a Helm repository to this list is simple with the helm repo add command.
You can add my Helm repository, which contains some example applications related to this book, by running the following command:

$ helm repo add errm https://charts.errm.co.uk
"errm" has been added to your repositories

In order to pull the latest chart information from the configured repositories, you can run the following command:

$ helm repo update
Hang tight while we grab the latest from your chart repositories...
...Skip local chart repository
...Successfully got an update from the "errm" chart repository
...Successfully got an update from the "stable" chart repository
Update Complete. Happy Helming!

Let's start with one of the simplest applications available in my Helm repository, kubeslate. It provides some very basic information about your cluster, such as the version of Kubernetes you are running and the number of pods, deployments, and services in your cluster. We are going to start with this application because it is very simple and doesn't require any special configuration to run on Minikube, or indeed any other cluster.

Installing a chart from a repository on your cluster couldn't be simpler:

$ helm install --name=my-slate errm/kubeslate

You should see a lot of output from the helm command. First, you will see some metadata about the release, such as its name, status, and namespace:

NAME:   my-slate
LAST DEPLOYED: Mon Mar 26 21:55:39 2018
NAMESPACE: default
STATUS: DEPLOYED

Next, you should see some information about the resources that Helm has instructed Kubernetes to create on the cluster. As you can see, a single service and a single deployment have been created:

RESOURCES:
==> v1/Service
NAME                TYPE       CLUSTER-IP     PORT(S)  AGE
my-slate-kubeslate  ClusterIP  10.100.209.48  80/TCP   0s
==> v1/Deployment
NAME                DESIRED  CURRENT  UP-TO-DATE  AVAILABLE  AGE
my-slate-kubeslate  2        0        0           0          0s
==> v1/Pod(related)
NAME                                 READY  STATUS             AGE
my-slate-kubeslate-77bd7479cf-gckf8  0/1    ContainerCreating  0s
my-slate-kubeslate-77bd7479cf-vvlnz  0/1    ContainerCreating  0s

Finally, there is a section with some notes provided by the chart's author to give us some information about how to start using the application:

NOTES:
To access kubeslate, first start the kubectl proxy:

kubectl proxy

Now open the following URL in your browser:

http://localhost:8001/api/v1/namespaces/default/services/my-slate-kubeslate:http/proxy

Please try reloading the page if you see ServiceUnavailable / no endpoints available for service, as pod creation might take a few moments.

Try following these instructions yourself and open kubeslate in your browser.

Kubeslate deployed with Helm

Configuring a chart

When you use Helm to make a release of a chart, there are certain attributes that you might need to change or configuration you might need to provide. Luckily, Helm provides a standard way for users of a chart to override some or all of the configuration values. In this section, we are going to look at how, as the user of a chart, you might go about supplying configuration to Helm. Later in the chapter, we are going to look at how you can create your own charts and use the configuration passed in to allow your chart to be customized.

When we invoke helm install, there are two ways we can provide configuration values: passing them as command-line arguments, or providing a configuration file. These configuration values are merged with the default values provided by the chart.
This allows a chart author to provide a default configuration that lets users get up and running quickly, while still allowing users to tweak important settings or enable advanced features.

Providing a single value to Helm on the command line is achieved by using the set flag. The kubeslate chart allows us to specify additional labels for the pod(s) that it launches using the podLabels variable. Let's make a new release of the kubeslate chart, and then use the podLabels variable to add an additional hello label with the value world:

$ helm install --name labeled-slate --set podLabels.hello=world errm/kubeslate

Once you have run this command, you should be able to prove that the extra variable you passed to Helm did indeed result in the pods launched by Helm having the correct label. Using the kubectl get pods command with a label selector for the label we applied using Helm should return the pods that have just been launched:

$ kubectl get pods -l hello=world
NAME                                      READY   STATUS
labeled-slate-kubeslate-5b75b58cb-7jpfk   1/1     Running
labeled-slate-kubeslate-5b75b58cb-hcpgj   1/1     Running

As well as being able to pass a configuration to Helm when we create a new release, it is also possible to update the configuration of a pre-existing release using the upgrade command. When we use Helm to update a configuration, the process is much the same as when we updated deployment resources in the last chapter, and a lot of those considerations still apply if we want to avoid downtime in our services. For example, by launching multiple replicas of a service, we can avoid downtime as a new version of a deployment configuration is rolled out.

Let's also upgrade our original kubeslate release to include the same hello: world pod label that we applied to the second release. As you can see, the structure of the upgrade command is quite similar to the install command, but rather than specifying the name of the release with the --name flag, we pass it as the first argument. This is because when we install a chart to the cluster, the name of the release is optional; if we omit it, Helm will create a random name for the release. However, when performing an upgrade, we need to target a pre-existing release to upgrade, so this argument is mandatory:

$ helm upgrade my-slate --set podLabels.hello=world errm/kubeslate

If you now run helm ls, you should see that the release named my-slate has been upgraded to Revision 2. You can test that the deployment managed by this release has been upgraded to include this pod label by repeating our kubectl get command:

$ kubectl get pods -l hello=world
NAME                                      READY   STATUS
labeled-slate-kubeslate-5b75b58cb-7jpfk   1/1     Running
labeled-slate-kubeslate-5b75b58cb-hcpgj   1/1     Running
my-slate-kubeslate-5c8c4bc77-4g4l4        1/1     Running
my-slate-kubeslate-5c8c4bc77-7pdtf        1/1     Running

We can now see that four pods, two from each of our releases, match the label selector we passed to kubectl get.

Passing variables on the command line with the set flag is convenient when we just want to provide values for a few variables, but when we want to pass more complex configurations, it can be simpler to provide the values as a file. Let's prepare a configuration file to apply several labels to our kubeslate pods:

values.yml

podLabels:
  hello: world
  access: internal
  users: admin

We can then use the helm command to apply this configuration file to our release:

$ helm upgrade labeled-slate -f values.yml errm/kubeslate

To learn how to create your own charts, head over to the book.
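To give a taste of how a chart consumes a value such as podLabels, the following is an illustrative template fragment in the Go templating style Helm charts use. It is a sketch of the general approach, not the actual kubeslate chart source:

# templates/deployment.yaml (fragment)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-example
spec:
  template:
    metadata:
      labels:
        app: example
{{- range $key, $value := .Values.podLabels }}
        {{ $key }}: {{ $value }}
{{- end }}

At render time, each key/value pair supplied via --set podLabels.x=y or a values file is added under the pod's labels, which is why the kubectl label selectors above match.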
Packaging Helm charts

While we are developing our chart, it is simple to use the Helm CLI to deploy the chart straight from the local filesystem. However, Helm also allows you to create your own repository in order to share your charts. A Helm repository is a collection of packaged Helm charts, plus an index, stored in a particular directory structure on a standard HTTP web server.

Once you are happy with your chart, you will want to package it so it is ready to distribute in a Helm repository. This is simple to do with the helm package command. When you start to distribute your charts with a repository, versioning becomes important: the version number of a chart in a Helm repository needs to follow the SemVer 2 guidelines.

In order to build a packaged chart, start by checking that you have set an appropriate version number in Chart.yaml. If this is the first time you have packaged your chart, the default will be OK:

$ helm package version-app
Successfully packaged chart and saved it to: ~/helm-charts/version-app-0.1.0.tgz

You can test a packaged chart without uploading it to a repository by using the helm serve command. This command will serve all of the packaged charts found in the current directory and generate an index on the fly:

$ helm serve
Regenerating index. This may take a moment.
Now serving you on 127.0.0.1:8879

You can now try installing your chart by using the local repository:

$ helm install local/version-app

You can also test building an index yourself. A Helm repository is just a collection of packaged charts stored in a directory. In order to discover and search the charts and versions available in a particular repository, the Helm client downloads a special index.yaml that includes metadata about each packaged chart and the location it can be downloaded from.

In order to generate this index file, we need to copy all the packaged charts that we want in our index to the same directory:

cp ~/helm-charts/version-app-0.1.0.tgz ~/helm-repo/

Then, in order to generate the index.yaml file, we use the helm repo index command. You will need to pass the root URL where the packaged charts will be served from. This could be the address of a web server or, on AWS, an S3 bucket:

helm repo index ~/helm-repo --url https://helm-repo.example.org

The chart index is quite a simple format, listing the name of each chart available and then providing a list of the versions available for each named chart. The index also includes a checksum in order to validate the download of charts from the repository:

apiVersion: v1
entries:
  version-app:
  - apiVersion: v1
    created: 2018-01-10T19:28:27.802896842Z
    description: A Helm chart for Kubernetes
    digest: 79aee8b48cab65f0d3693b98ae8234fe889b22815db87861e590276a657912c1
    name: version-app
    urls:
    - https://helm-repo.example.org/version-app-0.1.0.tgz
    version: 0.1.0
generated: 2018-01-10T19:28:27.802428278Z

This is the generated index.yaml file for our new chart repository. Once we have created the index.yaml file, it is simply a question of copying your packaged charts and the index file to the host you have chosen to use. If you are using S3, this might look like this:

aws s3 sync ~/helm-repo s3://my-helm-repo-bucket

In order for Helm to be able to use your repository, your web server (or S3) needs to be correctly configured: the web server needs to serve the index.yaml file with the correct content type header (text/yaml or text/x-yaml), and the charts need to be available at the URLs listed in the index.
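For reference, the Chart.yaml file that provides the version number used by helm package is very small. A minimal Helm 2-style sketch for the version-app chart used above might look like this (the description text is illustrative):

# Chart.yaml
apiVersion: v1
name: version-app
version: 0.1.0
description: A Helm chart for Kubernetes

Bumping the version field here and re-running helm package is what produces a new entry for the chart in the repository index.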
Using your repository

Once you have set up the repository, you can configure Helm to use it:

$ helm repo add my-repo https://helm-repo.example.org
my-repo has been added to your repositories

When you add a repository, Helm validates that it can indeed connect to the URL given and download the index file. You can check this by searching for your chart using helm search:

$ helm search version-app
NAME                 VERSION  DESCRIPTION
my-repo/version-app  0.1.1    A Helm chart for Kubernetes

In this article, you learned how to install Helm and how to configure and package Helm charts. Helm can be used for a wide range of scenarios where you want to deploy resources to a Kubernetes cluster, from providing a simple way for others to install an application you have written on their own clusters, to forming the cornerstone of an internal Platform as a Service within a larger organization. To learn more about configuring your own charts using Helm and about the organizational patterns for Helm, head over to our book, Kubernetes on AWS.

Elastic launches Helm Charts (alpha) for faster deployment of Elasticsearch and Kibana to Kubernetes
Introducing ‘Quarkus’, a Kubernetes native Java framework for GraalVM & OpenJDK HotSpot
Pivotal and Heroku team up to create Cloud Native Buildpacks for Kubernetes

Ensemble Methods to Optimize Machine Learning Models

Guest Contributor
07 Nov 2017
8 min read
[box type="info" align="" class="" width=""]We are happy to bring you an elegant guest post on ensemble methods by Benjamin Rogojan, popularly known as The Seattle Data Guy.[/box] How do data scientists improve their algorithm’s accuracy or improve the robustness of a model? A method that is tried and tested is ensemble learning. It is a must know topic if you claim to be a data scientist and/or a machine learning engineer. Especially, if you are planning to go in for a data science/machine learning interview. Essentially, ensemble learning stays true to the meaning of the word ‘ensemble’. Rather than having several people who are singing at different octaves to create one beautiful harmony (each voice filling in the void of the other), ensemble learning uses hundreds to thousands of models of the same algorithm that work together to find the correct classification. Another way to think about ensemble learning is the fable of the blind men and the elephant. Each blind man in the story seeks to identify the elephant in front of them. However, they work separately and come up with their own conclusions about the animal. Had they worked in unison, they might have been able to eventually figure out what they were looking at. Similarly, ensemble learning utilizes the workings of different algorithms and combines them for a successful and optimal classification. Ensemble methods such as Boosting and Bagging have led to an increased robustness of statistical models with decreased variance. Before we begin with explaining the various ensemble methods, let us have a glance at the common bond between them, Bootstrapping. Bootstrap: The common glue Explaining Bootstrapping can occasionally be missed by many data scientists. However, an understanding of bootstrapping is essential as both the ensemble methods, Boosting and Bagging, are based on the concept of bootstrapping. Figure 1: Bootstrapping In machine learning terms, bootstrap method refers to random sampling with replacement. This sample, after replacement, is referred as a resample. This allows the model or algorithm to get a better understanding of the various biases, variances, and features that exist in the resample. Taking a sample of the data allows the resample to contain different characteristics which the sample might have contained. This would, in turn, affect the overall mean, standard deviation, and other descriptive metrics of a data set. Ultimately, leading to the development of more robust models. The above diagram depicts each sample population having different and non-identical pieces. Bootstrapping is also great for small size data sets that may have a tendency to overfit. In fact, we recommended this to one company who was concerned because their data sets were far from “Big Data”. Bootstrapping can be a solution in this case because algorithms that utilize bootstrapping are more robust and can handle new datasets depending on the methodology chosen (boosting or bagging). The bootstrap method can also test the stability of a solution. By using multiple sample data sets and then testing multiple models, it can increase robustness.  In certain cases, one sample data set may have a larger mean than another or a different standard deviation. This might break a model that was overfitted and not tested using data sets with different variations. One of the many reasons bootstrapping has become so common is because of the increase in computing power. This allows multiple permutations to be done with different resamples. 
Let us now move on to the most prominent ensemble methods: Bagging and Boosting. Ensemble Method 1: Bagging Bagging actually refers to Bootstrap Aggregators. Most papers or posts that explain bagging algorithms are bound to refer to Leo Breiman’s work, a paper published in 1996 called  “Bagging Predictors”. In the paper, Leo describes bagging as: “Bagging predictors is a method for generating multiple versions of a predictor and using these to get an aggregated predictor.” Bagging helps reduce variance from models that are accurate only on the data they were trained on. This problem is also known as overfitting. Overfitting happens when a function fits the data too well. Typically this is because the actual equation is highly complicated to take into account each data point and the outlier. Figure 2: Overfitting Another example of an algorithm that can overfit easily is a decision tree. The models that are developed using decision trees require very simple heuristics. Decision trees are composed of a set of if-else statements done in a specific order. Thus, if the data set is changed to a new data set that might have some bias or difference in the spread of underlying features compared to the previous set, the model will fail to be as accurate as before. This is because the data will not fit the model well. Bagging gets around the overfitting problem by creating its own variance amongst the data. This is done by sampling and replacing data while it tests multiple hypotheses (models). In turn, this reduces the noise by utilizing multiple samples that would most likely be made up of data with various attributes (median, average, etc). Once each model has developed a hypothesis, the models use voting for classification or averaging for regression. This is where the “Aggregating” of the “Bootstrap Aggregating” comes into play. As in the figure shown below, each hypothesis has the same weight as all the others. (When we later discuss boosting, this is one of the places the two methodologies differ.) Figure 3: Bagging Essentially, all these models run at the same time and vote on the hypothesis which is the most accurate. This helps to decrease variance i.e. reduce the overfit. Ensemble Method 2: Boosting Boosting refers to a group of algorithms that utilize weighted averages to make weak learners into stronger learners. Unlike bagging (that has each model run independently and then aggregate the outputs at the end without preference to any model), boosting is all about “teamwork”. Each model that runs dictates what features the next model will focus on. Boosting also requires bootstrapping. However, there is another difference here. Unlike bagging, boosting weights each sample of data. This means some samples will be run more often than others. Why put weights on the samples of data? Figure 4: Boosting When boosting runs each model, it tracks which data samples are the most successful and which are not. The data sets with the most misclassified outputs are given heavier weights. This is because such data sets are considered to have more complexity. Thus, more iterations would be required to properly train the model. During the actual classification stage, boosting tracks the model's error rates to ensure that better models are given better weights. That way, when the “voting” occurs, like in bagging, the models with better outcomes have a stronger pull on the final output. Which of these ensemble methods is right for me? Ensemble methods generally out-perform a single model. 
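To make the contrast concrete before weighing which method to use, here is a hedged scikit-learn sketch that trains a bagged ensemble and a boosted ensemble of decision trees on a synthetic dataset. The dataset and hyperparameters are arbitrary and only meant to show the two approaches side by side; in line with the point above, both will typically outscore the single tree they are built from:

from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, AdaBoostClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic data purely for illustration
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# Bagging: independent trees trained on bootstrap resamples, predictions aggregated by voting
bagging = BaggingClassifier(DecisionTreeClassifier(), n_estimators=100, random_state=0)

# Boosting: trees trained sequentially, with misclassified samples weighted more heavily
boosting = AdaBoostClassifier(n_estimators=100, random_state=0)

for name, model in [("bagging", bagging), ("boosting", boosting)]:
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean accuracy = {scores.mean():.3f}")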
This is why many Kaggle winners have utilized ensemble methodologies. Another important ensemble methodology, not discussed here, is stacking. Boosting and bagging are both great techniques to decrease variance. However, they won't fix every problem, and they have their own issues. There are different reasons why you would use one over the other. Bagging is great for decreasing variance when a model is overfitted. However, of the two methods, boosting is likely to be the better pick, because it is also great for decreasing bias in an underfit model. On the other hand, boosting is more likely to suffer from performance issues. This is where experience and subject matter expertise come in! It may seem easy to jump on the first model that works. However, it is important to analyze the algorithm and all the features it selects. For instance, a decision tree that sets specific leaves shouldn't be implemented if it can't be supported with other data points and visuals. It is not just about trying AdaBoost or random forests on various datasets. The final choice of algorithm should be driven by the results it produces and the support that can be provided for them.

About the author

Benjamin Rogojan

Ben has spent his career focused on healthcare data. He has focused on developing algorithms to detect fraud, reduce patient readmission, and redesign insurance provider policy to help reduce the overall cost of healthcare. He has also helped develop analytics for marketing and IT operations in order to optimize limited resources such as employees and budget. Ben privately consults on data science and engineering problems, both solo and with a company called Acheron Analytics. He has experience working hands-on with technical problems as well as helping leadership teams develop strategies to maximize their data.

Brett Lantz on implementing a decision tree using C5.0 algorithm in R

Packt Editorial Staff
29 Mar 2019
9 min read
Decision tree learners are powerful classifiers that utilize a tree structure to model the relationships among the features and the potential outcomes. This structure earned its name due to the fact that it mirrors the way a literal tree begins at a wide trunk and splits into narrower and narrower branches as it is followed upward. In much the same way, a decision tree classifier uses a structure of branching decisions that channel examples into a final predicted class value. In this article, we demonstrate the implementation of decision tree using C5.0 algorithm in R. This article is taken from the book, Machine Learning with R, Fourth Edition written by Brett Lantz. This 10th Anniversary Edition of the classic R data science book is updated to R 4.0.0 with newer and better libraries. This book features several new chapters that reflect the progress of machine learning in the last few years and help you build your data science skills and tackle more challenging problems There are numerous implementations of decision trees, but the most well-known is the C5.0 algorithm. This algorithm was developed by computer scientist J. Ross Quinlan as an improved version of his prior algorithm, C4.5 (C4.5 itself is an improvement over his Iterative Dichotomiser 3 (ID3) algorithm). Although Quinlan markets C5.0 to commercial clients (see http://www.rulequest.com/ for details), the source code for a single-threaded version of the algorithm was made public, and has therefore been incorporated into programs such as R. The C5.0 decision tree algorithm The C5.0 algorithm has become the industry standard for producing decision trees because it does well for most types of problems directly out of the box. Compared to other advanced machine learning models, the decision trees built by C5.0 generally perform nearly as well but are much easier to understand and deploy. Additionally, as shown in the following table, the algorithm's weaknesses are relatively minor and can be largely avoided. Strengths An all-purpose classifier that does well on many types of problems. Highly automatic learning process, which can handle numeric or nominal features, as well as missing data. Excludes unimportant features. Can be used on both small and large datasets. Results in a model that can be interpreted without a mathematical background (for relatively small trees). More efficient than other complex models. Weaknesses Decision tree models are often biased toward splits on features having a large number of levels. It is easy to overfit or underfit the model. Can have trouble modeling some relationships due to reliance on axis-parallel splits. Small changes in training data can result in large changes to decision logic. Large trees can be difficult to interpret and the decisions they make may seem counterintuitive. To keep things simple, our earlier decision tree example ignored the mathematics involved with how a machine would employ a divide and conquer strategy. Let's explore this in more detail to examine how this heuristic works in practice. Choosing the best split The first challenge that a decision tree will face is to identify which feature to split upon. In the previous example, we looked for a way to split the data such that the resulting partitions contained examples primarily of a single class. The degree to which a subset of examples contains only a single class is known as purity, and any subset composed of only a single class is called pure. 
There are various measurements of purity that can be used to identify the best decision tree splitting candidate. C5.0 uses entropy, a concept borrowed from information theory that quantifies the randomness, or disorder, within a set of class values. Sets with high entropy are very diverse and provide little information about other items that may also belong in the set, as there is no apparent commonality. The decision tree hopes to find splits that reduce entropy, ultimately increasing homogeneity within the groups.

Typically, entropy is measured in bits. If there are only two possible classes, entropy values can range from 0 to 1. For n classes, entropy ranges from 0 to log2(n). In each case, the minimum value indicates that the sample is completely homogeneous, while the maximum value indicates that the data are as diverse as possible, and no group has even a small plurality. In mathematical notation, entropy is specified as:

Entropy(S) = sum over i = 1 to c of ( -pi * log2(pi) )

In this formula, for a given segment of data (S), the term c refers to the number of class levels, and pi refers to the proportion of values falling into class level i. For example, suppose we have a partition of data with two classes: red (60 percent) and white (40 percent). We can calculate the entropy as:

> -0.60 * log2(0.60) - 0.40 * log2(0.40)
[1] 0.9709506

We can visualize the entropy for all possible two-class arrangements. If we know the proportion of examples in one class is x, then the proportion in the other class is (1 - x). Using the curve() function, we can then plot the entropy for all possible values of x:

> curve(-x * log2(x) - (1 - x) * log2(1 - x),
    col = "red", xlab = "x", ylab = "Entropy", lwd = 4)

This results in the following figure:

The total entropy as the proportion of one class varies in a two-class outcome

As illustrated by the peak in entropy at x = 0.50, a 50-50 split results in the maximum entropy. As one class increasingly dominates the other, the entropy reduces to zero.

To use entropy to determine the optimal feature to split upon, the algorithm calculates the change in homogeneity that would result from a split on each possible feature, a measure known as information gain. The information gain for a feature F is calculated as the difference between the entropy in the segment before the split (S1) and the partitions resulting from the split (S2):

InfoGain(F) = Entropy(S1) - Entropy(S2)

One complication is that after a split, the data is divided into more than one partition. Therefore, the function to calculate Entropy(S2) needs to consider the total entropy across all of the partitions. It does this by weighting each partition's entropy according to the proportion of all records falling into that partition. This can be stated in a formula as:

Entropy(S2) = sum over i = 1 to n of ( wi * Entropy(Pi) )

In simple terms, the total entropy resulting from a split is the sum of the entropy of each of the n partitions, weighted by the proportion of examples falling in that partition (wi). The higher the information gain, the better a feature is at creating homogeneous groups after a split on that feature. If the information gain is zero, there is no reduction in entropy for splitting on this feature. On the other hand, the maximum information gain is equal to the entropy prior to the split. This would imply the entropy after the split is zero, which means that the split results in completely homogeneous groups. The previous formulas assume nominal features, but decision trees use information gain for splitting on numeric features as well.
To do so, a common practice is to test various splits that divide the values into groups greater than or less than a threshold. This reduces the numeric feature into a two-level categorical feature that allows information gain to be calculated as usual. The numeric cut point yielding the largest information gain is chosen for the split. Note: Though it is used by C5.0, information gain is not the only splitting criterion that can be used to build decision trees. Other commonly used criteria are Gini index, chi-squared statistic, and gain ratio. For a review of these (and many more) criteria, refer to An Empirical Comparison of Selection Measures for Decision-Tree Induction, Mingers, J, Machine Learning, 1989, Vol. 3, pp. 319-342. Pruning the decision tree As mentioned earlier, a decision tree can continue to grow indefinitely, choosing splitting features and dividing into smaller and smaller partitions until each example is perfectly classified or the algorithm runs out of features to split on. However, if the tree grows overly large, many of the decisions it makes will be overly specific and the model will be overfitted to the training data. The process of pruning a decision tree involves reducing its size such that it generalizes better to unseen data. One solution to this problem is to stop the tree from growing once it reaches a certain number of decisions or when the decision nodes contain only a small number of examples. This is called early stopping or prepruning the decision tree. As the tree avoids doing needless work, this is an appealing strategy. However, one downside to this approach is that there is no way to know whether the tree will miss subtle but important patterns that it would have learned had it grown to a larger size. An alternative, called post-pruning, involves growing a tree that is intentionally too large and pruning leaf nodes to reduce the size of the tree to a more appropriate level. This is often a more effective approach than prepruning because it is quite difficult to determine the optimal depth of a decision tree without growing it first. Pruning the tree later on allows the algorithm to be certain that all of the important data structures were discovered. Note: The implementation details of pruning operations are very technical and beyond the scope of this book. For a comparison of some of the available methods, see A Comparative Analysis of Methods for Pruning Decision Trees, Esposito, F, Malerba, D, Semeraro, G, IEEE Transactions on Pattern Analysis and Machine Intelligence, 1997, Vol. 19, pp. 476-491. One of the benefits of the C5.0 algorithm is that it is opinionated about pruning—it takes care of many of the decisions automatically using fairly reasonable defaults. Its overall strategy is to post-prune the tree. It first grows a large tree that overfits the training data. Later, the nodes and branches that have little effect on the classification errors are removed. In some cases, entire branches are moved further up the tree or replaced by simpler decisions. These processes of grafting branches are known as subtree raising and subtree replacement, respectively. Getting the right balance of overfitting and underfitting is a bit of an art, but if model accuracy is vital, it may be worth investing some time with various pruning options to see if it improves the test dataset performance. To summarize , decision trees are widely used due to their high accuracy and ability to formulate a statistical model in plain language.  
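Before wrapping up, here is a small worked example in R of the entropy and information gain calculation from the splitting discussion above. The class counts are invented purely for illustration; the point is only to show the weighted-entropy arithmetic:

# Entropy of a vector of class labels
entropy <- function(labels) {
  p <- prop.table(table(labels))
  -sum(p * log2(p))
}

# A made-up segment with two classes, and one candidate split of it
segment <- c(rep("red", 9), rep("white", 5))
left    <- c(rep("red", 6), rep("white", 1))
right   <- c(rep("red", 3), rep("white", 4))

# Weight each partition's entropy by the proportion of records it contains
w <- c(length(left), length(right)) / length(segment)
info_gain <- entropy(segment) - sum(w * c(entropy(left), entropy(right)))
info_gain   # a positive value means the split produces more homogeneous groups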
Here, we looked at a highly popular and easily configurable decision tree algorithm, C5.0. The major strength of the C5.0 algorithm over other decision tree implementations is that it is very easy to adjust the training options. Harness the power of R to build flexible, effective, and transparent machine learning models with Brett Lantz’s latest book, Machine Learning with R, Fourth Edition.

Dr. Brandon explains Decision Trees to Jon
Building a classification system with Decision Trees in Apache Spark 2.0
Implementing Decision Trees

Essbase ASO (Aggregate Storage Option)

Packt
14 Oct 2009
5 min read
Welcome to the exciting world of Essbase Analytics known as the Aggregate Storage Option (ASO). Well, now you're ready to take everything one step further. You see, the BSO architecture used by Essbase is the original database architecture as the behind the scenes method of data storage in an Essbase database. The ASO method is entirely different. What is ASO ASO is Essbase's alternative to the sometimes cumbersome BSO method of storing data in an Essbase database. In fact, it is BSO that is exactly what makes Essbase a superior OLAP analytical tool but it is also the BSO that can occasionally be a detriment to the level of system performance demanded in today's business world. In a BSO database, all data is stored, except for dynamically calculated members. All data consolidations and parent-child relationships in the database outline are stored as well. While the block storage method is quite efficient from a data to size ratio perspective, a BSO database can require large amounts of overhead to deliver the retrieval performance demanded by the business customer. The ASO database efficiently stores not only zero level data, but can also store aggregated hierarchical data with the understandings that stored hierarchies can only have the no-consolidation (~) or the addition (+) operator assigned to them and the no-consolidation (~) operator can only be used underneath Label Only members. Outline member consolidations are performed on the fly using dynamic calculations and only at the time of the request for data. This is the main reason why ASO is a valuable option worth consideration when building an Essbase system for your customer. Because of the simplified levels of data stored in the ASO database, a more simplified method of storing the physical data on the disk can also be used. It is this simplified storage method which can help result in higher performance for the customer. Your choice of one database type over the other will always depend on balancing the customer's needs with the server's physical capabilities, along with the volume of data. These factors must be given equal consideration. Creating an aggregate storage Application|Database Believe it or not, creating an ASO Essbase application and database is as easy as creating a BSO application and database. All you need to do is follow these simple steps: Right-click on the server name in your EAS console for the server on which you want to create your ASO application. Select Create application | Using aggregate storage as shown in the following screenshot: Click on Using aggregate storage and that's it. The rest of the steps are easy to follow and basically the same as for a BSO application. To create an ASO application and database, you follow virtually the same steps as you do to create a BSO application and database. However, there are some important differences, and here we list a few: A BSO database outline can be converted into an Aggregate Storage database outline, but an Aggregate Storage database outline cannot be converted into a Block Storage database outline.Steps to convert a BSO application into an ASO application: Open the BSO outline that you wish to convert, select the Essbase database and click on the File | Wizards | Aggregate Storage Outline Conversion option. You will see the first screen Select Source Outline. The source of the outline can be in a file system or on the Essbase Server. 
In this case, we have selected the OTL from the Essbase Server and then click Next as shown in the following screenshot: In the Next screen, the conversion wizard will verify the conversion and display a message that the conversion has completed successfully. Click Next. Here, Essbase prompts you to select the destination of the ASO outline. If you have not yet created an ASO application, you can click on the Create Aggregate Storage Application on the bottom-right corner of the screen as shown in the next screenshot: Enter the Application and the Database name and click on OK. Your new ASO application is created, now click on Finish. Your BSO application is now converted into an ASO application. You may still need to tweak the ASO application settings and outline members to be the best fit for your needs. In an ASO database, all dimensions are Sparse so there is no need to try to determine the best Dense/Sparse settings as you would do with a BSO database. Although Essbase recommends that you only have one Essbase database in an Essbase application, you can create more than one database per application when you are using the BSO. When you create an ASO application, Essbase will only allow one database per application. There is quite a bit to know about ASO but have no fear, with all that you know about Essbase and how to design and build an Essbase system, it will seem easy for you. Keep reading for more valuable information on the ASO for things like, when it is a good time to use ASO, or how do you query ASO databases effectively, or even what are the differences between ASO and BSO. If you understand the differences, you can then understand the benefits.

Introduction to Oracle Service Bus & Oracle Service Registry

Packt
15 Sep 2010
6 min read
(For more resources on BPEL, SOA and Oracle see here.)

If we want our SOA architecture to be highly flexible and agile, we have to ensure loose coupling between different components. As service interfaces and endpoint addresses change over time, we have to remove all point-to-point connections between service providers and service consumers by introducing an intermediate layer—Enterprise Service Bus (ESB). ESB is a key component of every mature SOA architecture and provides several important functionalities, such as message routing, transformation between message types and protocols, the use of adapters, and so on.

Another important requirement for providing flexibility is service reuse. This can be achieved through the use of a UDDI (Universal Description, Discovery and Integration) compliant service registry, which enables us to publish and discover services. Using a service registry, we can also implement dynamic endpoint lookup, so that service consumers retrieve the actual service endpoint address from the registry at runtime.

Oracle Service Bus architecture and features

ESB provides means to manage connections, control the communication between services, supervise the services and their SLAs (Service Level Agreements), and much more. The importance of the ESB often becomes visible after the first development iteration of a SOA composite application has taken place. For example, when a service requires a change in its interface or payload, the ESB can provide the transformation capabilities to mask the differences to existing service consumers. ESB can also mask the location of services, making it easy to migrate services to different servers. There are plenty of other scenarios where ESB is important.

In this article, we will look at the Oracle Service Bus (OSB). OSB presents a communication backbone for transport and routing of messages across an enterprise. It is designed for high-throughput and reliable message delivery to a variety of service providers and consumers. It supports XML as a native data type; however, other data types are also supported. As an intermediary, it processes incoming service request messages, executes the routing logic, and transforms these messages if needed. It can also transform between different transport protocols (HTTP, JMS, File, FTP, and so on). Service response messages follow the inverse path. The message processing is specified in the message flow definition of a proxy service.

OSB provides some functionalities that are similar to the functionalities of the Mediator component within the SOA Composite, such as routing, validation, filtering, and transformation. The major difference is that the Mediator is a mediation component that is meant to work within the SOA Composite and is deployed within a SOA composite application. The OSB, on the other hand, is a standalone service bus. In addition to providing the communication backbone for all SOA (and non-SOA) applications, OSB's mission is to shield application developers from changes in the service endpoints and to prevent those systems from being overloaded with requests from upstream applications.

In addition to the Oracle Service Bus, we can also use the Mediator service component, which also provides mediation capabilities, but only within SOA composite applications. On the other hand, OSB is used for inter-application communication. The following figure shows the functional architecture of Oracle Service Bus (OSB).
We can see that OSB can be categorized into four functional layers:

Messaging layer: Provides support to reliably connect any service by leveraging standards such as HTTP/SOAP, WS-I, WS-Security, WS-Policy, WS-Addressing, SOAP v1.1, SOAP v1.2, EJB, RMI, and so on. It even supports the creation of custom transports using the Custom Transport Software Development Kit (SDK).

Security layer: Provides security at all levels: Transport Security (SSL), Message Security (WS-Policy, WS-Security, and so on), Console Security (SSO and role-based access), and Policy (leverages WS-Security and WS-Policy).

Composition layer: Provides a configuration-driven composition environment. We can use either the Eclipse plug-in environment or the web-based Oracle Service Bus Console. We can model message flows that contain content-based routing, message validation, and exception handling. We can also use message transformations (XSLT, XQuery), service callouts (POJO, Web Services), and a test browser. Automatic synchronization with UDDI registries is also supported.

Management layer: Provides a unified dashboard for service monitoring and management. We can define and monitor Service Level Agreements (SLAs), alerts on operation metrics and message pipelines, and view reports.

Proxy services and business services

OSB uses a specific terminology of proxy and business services. The objective of OSB is to route messages between business services and service consumers through proxy services. Proxy services are generic intermediary web services that implement the mediation logic and are hosted locally on OSB. Proxy services route messages to business services and are exposed to service consumers. A proxy service is configured by specifying its interface, type of transport, and its associated message processing logic. Message flow definitions are used to define the proxy service message handling capabilities.

Business services describe the enterprise services that exchange messages with business processes and which we want to virtualize using the OSB. The definition of a business service is similar to that of a proxy service; however, a business service does not have a message flow definition.

Message flow modeling

Message flows are used to define the message processing logic of proxy services. Message flow modeling includes defining a sequence of activities, where activities are individual actions, such as transformations, reporting, publishing, and exception management. Message flow modeling can be performed using a visual development environment (Eclipse or Oracle Service Bus Console). Message flow definitions are defined using components such as pipelines, branch nodes, and route nodes, as shown in the following figure:

A pipeline is a sequence of stages, representing a one-way processing path. It is used to specify the message flow for service requests and responses. If a service defines more operations, a pipeline might optionally branch into operational pipelines. There are three types of pipelines:

Request pipelines are used to specify the request path of the message flow
Response pipelines are used to specify the response path of a message flow
Error pipelines are used as error handlers

Request and response pipelines are paired together as pipeline pairs. Branch nodes are used as exclusive switches, where the processing can follow one of the branches. A variable in the message context is used as a lookup variable to determine which branch to follow. Route nodes are used to communicate with another service (in most cases a business service).
They cannot have any descendants in the message flow. When the route node sends the request message, the request processing is finished. On the other side, when it receives a response message, the response processing begins. Each pipeline is a sequence of stages that contain user-defined message processing actions. We can choose between a variety of supported actions, such as Publish, Service Callout, For Each, If... Then..., Raise Error, Reply, Resume, Skip, Delete, Insert, Replace, Validate, Alert, Log, Report, and more. Later in this article, we will show you how to use a pipeline on the Travel Approval process. However, let us first look at the Oracle Service Registry, which we will use together with the OSB.

How to build a live interactive visual dashboard in Power BI with Azure Stream

Sugandha Lahoti
17 Apr 2018
4 min read
Azure Stream Analytics is a managed complex event processing interactive data engine. As a built-in output connector, it offers the facility of building live interactive intelligent BI charts and graphics using Microsoft's cloud-based Business Intelligent tool called Power BI. In this tutorial we implement a data architecture pipeline by designing a visual dashboard using Microsoft Power BI and Stream Analytics. Prerequisites of building an interactive visual live dashboard in Power BI with Stream Analytics: Azure subscription Power BI Office365 account (the account email ID should be the same for both Azure and Power BI). It can be a work or school account Integrating Power BI as an output job connector for Stream Analytics To start with connecting the Power BI portal as an output of an existing Stream Analytics job, follow the given steps:  First, select Outputs in the Azure portal under JOB TOPOLOGY:  After clicking on Outputs, click on +Add in the top left corner of the job window, as shown in the following screenshot:  After selecting +Add, you will be prompted to enter the New output connectors of the job. Provide details such as job Output name/alias; under Sink, choose Power BI from the drop-down menu.  On choosing Power BI as the streaming job output Sink, it will automatically prompt you to authorize the Power BI work/personal account with Azure. Additionally, you may create a new Power BI account by clicking on Signup. By authorizing, you are granting access to the Stream Analytics output permanently in the Power BI dashboard. You can also revoke the access by changing the password of the Power BI account or deleting the output/job.  Post the successful authorization of the Power BI account with Azure, there will be options to select Group Workspace, which is the Power BI tenant workspace where you may create the particular dataset to configure processed Stream Analytics events. Furthermore, you also need to define the Table Name as data output. Lastly, click on the Create button to integrate the Power BI data connector for real-time data visuals:   Note: If you don't have any custom workspace defined in the Power BI tenant, the default workspace is My Workspace. If you define a dataset and table name that already exists in another Stream Analytics job/output, it will be overwritten. It is also recommended that you just define the dataset and table name under the specific tenant workspace in the job portal and not explicitly create them in Power BI tenants as Stream Analytics automatically creates them once the job starts and output events start to push into the Power BI dashboard.   On starting the Streaming job with output events, the Power BI dataset would appear under the dataset tab following workspace. The  dataset can contain maximum 200,000 rows and supports real-time streaming events and historical BI report visuals as well: Further Power BI dashboard and reports can be implemented using the streaming dataset. 
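Note that the Power BI dataset only receives rows once the job's query routes events into the output alias configured above. As a rough sketch, a query pushing aggregated telemetry into a Power BI output named powerbi-output might look like the following; the input alias, field names, and window size are hypothetical and will differ in your own job:

SELECT
    vehicleId,
    AVG(speed) AS avgspeed,
    System.Timestamp AS windowend
INTO
    [powerbi-output]
FROM
    [vehicle-telemetry-input] TIMESTAMP BY eventtime
GROUP BY
    vehicleId,
    TumblingWindow(second, 10)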
Alternatively, you may also create tiles in custom dashboards by selecting CUSTOM STREAMING DATA under REAL-TIME DATA, as shown in the following screenshot: After selecting Next, choose the streaming dataset, and then define the visual type, the respective fields, the axis, or the legends: Thus, a complete interactive near real-time Power BI visual dashboard can be implemented with analyzed streamed data from Stream Analytics, as shown in the following screenshot, from the real-world Connected Car-Vehicle Telemetry analytics dashboard:

In this article, we saw a step-by-step implementation of a real-time visual dashboard using Microsoft Power BI with processed data from Azure Stream Analytics as the output data connector. This article is an excerpt from the book Stream Analytics with Microsoft Azure, written by Anindita Basak, Krishna Venkataraman, Ryan Murphy, and Manpreet Singh. To learn more about designing and managing Stream Analytics jobs using reference data and utilizing a petabyte-scale enterprise data store with Azure Data Lake Store, you may refer to this book.

Unlocking the secrets of Microsoft Power BI
Ride the third wave of BI with Microsoft Power BI

Testing WebAssembly modules with Jest [Tutorial]

Sugandha Lahoti
15 Oct 2018
7 min read
WebAssembly (Wasm) represents an important stepping stone for the web platform. Enabling a developer to run compiled code on the web without a plugin or browser lock-in presents many new opportunities. This article is taken from the book Learn WebAssembly by Mike Rourke. This book will introduce you t powerful WebAssembly concepts that will help you write lean and powerful web applications with native performance. Well-tested code prevents regression bugs, simplifies refactoring, and alleviates some of the frustrations that go along with adding new features. Once you've compiled a WebAssembly module, you should write tests to ensure it's functioning as expected, even if you've written tests for C, C++, or Rust code you compiled it from. In this tutorial, we'll use Jest, a JavaScript testing framework, to test the functions in a compiled Wasm module. The code being tested All of the code used in this example is located on GitHub. The code and corresponding tests are very simple and are not representative of real-world applications, but they're intended to demonstrate how to use Jest for testing. The following code represents the file structure of the /testing-example folder: ├── /src | ├── /__tests__ | │ └── main.test.js | └── main.c ├── package.json └── package-lock.json The contents of the C file that we'll test, /src/main.c, is shown as follows: int addTwoNumbers(int leftValue, int rightValue) { return leftValue + rightValue; } float divideTwoNumbers(float leftValue, float rightValue) { return leftValue / rightValue; } double findFactorial(float value) { int i; double factorial = 1; for (i = 1; i <= value; i++) { factorial = factorial * i; } return factorial; } All three functions in the file are performing simple mathematical operations. The package.json file includes a script to compile the C file to a Wasm file for testing. Run the following command to compile the C file: npm run build There should be a file named main.wasm in the /src directory. Let's move on to describing the testing configuration step. Testing configuration The only dependency we'll use for this example is Jest, a JavaScript testing framework built by Facebook. Jest is an excellent choice for testing because it includes most of the features you'll need out of the box, such as coverage, assertions, and mocking. In most cases, you can use it with zero configuration, depending on the complexity of your application. If you're interested in learning more, check out Jest's website at https://jestjs.io. Open a terminal instance in the /chapter-09-node/testing-example folder and run the following command to install Jest: npm install In the package.json file, there are three entries in the scripts section: build, pretest, and test. The build script executes the emcc command with the required flags to compile /src/main.c to /src/main.wasm. The test script executes the jest command with the --verbose flag, which provides additional details for each of the test suites. The pretest script simply runs the build script to ensure /src/main.wasm exists prior to running any tests. Tests file review Let's walk through the test file, located at /src/__tests__/main.test.js, and review the purpose of each section of code. 
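Before digging into the test file, here is a sketch of what the scripts section of package.json described above might look like. The exact emcc flags and Jest version are assumptions on my part (the real values live in the book's repository), so treat this only as an illustration of the build/pretest/test arrangement:

{
  "scripts": {
    "build": "emcc src/main.c -Os -s WASM=1 -s SIDE_MODULE=1 -o src/main.wasm",
    "pretest": "npm run build",
    "test": "jest --verbose"
  },
  "devDependencies": {
    "jest": "^23.0.0"
  }
}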
The first section of the test file instantiates the main.wasm file and assigns the result to the local wasmInstance variable: const fs = require('fs'); const path = require('path'); describe('main.wasm Tests', () => { let wasmInstance; beforeAll(async () => { const wasmPath = path.resolve(__dirname, '..', 'main.wasm'); const buffer = fs.readFileSync(wasmPath); const results = await WebAssembly.instantiate(buffer, { env: { memoryBase: 0, tableBase: 0, memory: new WebAssembly.Memory({ initial: 1024 }), table: new WebAssembly.Table({ initial: 16, element: 'anyfunc' }), abort: console.log } }); wasmInstance = results.instance.exports; }); ... Jest provides life-cycle methods to perform any setup or teardown actions prior to running tests. You can specify functions to run before or after all of the tests (beforeAll()/afterAll()), or before or after each test (beforeEach()/afterEach()). We need a compiled instance of the Wasm module from which we can call exported functions, so we put the instantiation code in the beforeAll() function. We're wrapping the entire test suite in a describe() block for the file. Jest uses a describe() function to encapsulate suites of related tests and test() or it() to represent a single test. Here's a simple example of this concept: const add = (a, b) => a + b; describe('the add function', () => { test('returns 6 when 4 and 2 are passed in', () => { const result = add(4, 2); expect(result).toEqual(6); }); test('returns 20 when 12 and 8 are passed in', () => { const result = add(12, 8); expect(result).toEqual(20); }); }); The next section of code contains all the test suites and tests for each exported function: ... describe('the _addTwoNumbers function', () => { test('returns 300 when 100 and 200 are passed in', () => { const result = wasmInstance._addTwoNumbers(100, 200); expect(result).toEqual(300); }); test('returns -20 when -10 and -10 are passed in', () => { const result = wasmInstance._addTwoNumbers(-10, -10); expect(result).toEqual(-20); }); }); describe('the _divideTwoNumbers function', () => { test.each([ [10, 100, 10], [-2, -10, 5], ])('returns %f when %f and %f are passed in', (expected, a, b) => { const result = wasmInstance._divideTwoNumbers(a, b); expect(result).toEqual(expected); }); test('returns ~3.77 when 20.75 and 5.5 are passed in', () => { const result = wasmInstance._divideTwoNumbers(20.75, 5.5); expect(result).toBeCloseTo(3.77, 2); }); }); describe('the _findFactorial function', () => { test.each([ [120, 5], [362880, 9.2], ])('returns %p when %p is passed in', (expected, input) => { const result = wasmInstance._findFactorial(input); expect(result).toEqual(expected); }); }); }); The first describe() block, for the _addTwoNumbers() function, has two test() instances to ensure that the function returns the sum of the two numbers passed in as arguments. The next two describe() blocks, for the _divideTwoNumbers() and _findFactorial() functions, use Jest's .each feature, which allows you to run the same test with different data. The expect() function allows you to make assertions on the value passed in as an argument. The .toBeCloseTo() assertion in the last _divideTwoNumbers() test checks whether the result is within two decimal places of 3.77. The rest use the .toEqual() assertion to check for equality. Writing tests with Jest is relatively simple, and running them is even easier! Let's try running our tests and reviewing some of the CLI flags that Jest provides. 
Running the wasm tests To run the tests, open a terminal instance in the /chapter-09-node/testing-example folder and run the following command: npm test You should see the following output in your terminal: main.wasm Tests the _addTwoNumbers function ✓ returns 300 when 100 and 200 are passed in (4ms) ✓ returns -20 when -10 and -10 are passed in the _divideTwoNumbers function ✓ returns 10 when 100 and 10 are passed in ✓ returns -2 when -10 and 5 are passed in (1ms) ✓ returns ~3.77 when 20.75 and 5.5 are passed in the _findFactorial function ✓ returns 120 when 5 is passed in (1ms) ✓ returns 362880 when 9.2 is passed in Test Suites: 1 passed, 1 total Tests: 7 passed, 7 total Snapshots: 0 total Time: 1.008s Ran all test suites. If you have a large number of tests, you could remove the --verbose flag from the test script in package.json and only pass the flag to the npm test command if needed. There are several other CLI flags you can pass to the jest command. The following list contains some of the more commonly used flags: --bail: Exits the test suite immediately upon the first failing test suite --coverage: Collects test coverage and displays it in the terminal after the tests have run --watch: Watches files for changes and reruns tests related to changed files You can pass these flags to the npm test command by adding them after a --. For example, if you wanted to use the --bail flag, you'd run this command: npm test -- --bail You can view the entire list of CLI options on the official site at https://jestjs.io/docs/en/cli. In this article, we saw how the Jest testing framework can be leveraged to test a compiled module in WebAssembly to ensure it's functioning correctly. To learn more about WebAssembly and its functionalities read the book, Learn WebAssembly. Blazor 0.6 release and what it means for WebAssembly Introducing Wasmjit: A kernel mode WebAssembly runtime for Linux. Why is everyone going crazy over WebAssembly?

The Factory Method Pattern

Packt
10 Feb 2016
10 min read
In this article by Anshul Verma and Jitendra Zaa, author of the book Apex Design Patterns, we will discuss some problems that can occur mainly during the creation of class instances and how we can write the code for the creation of objects in a more simple, easy to maintain, and scalable way. (For more resources related to this topic, see here.) In this article, we will discuss the the factory method creational design pattern. Often, we find that some classes have common features (behavior) and can be considered classes of the same family. For example, multiple payment classes represent a family of payment services. Credit card, debit card, and net banking are some of the examples of payment classes that have common methods, such as makePayment, authorizePayment, and so on. Using the factory method pattern, we can develop controller classes, which can use these payment services, without knowing the actual payment type at design time. The factory method pattern is a creational design pattern used to create objects of classes from the same family without knowing the exact class name at design time. Using the factory method pattern, classes can be instantiated from the common factory method. The advantage of using this pattern is that it delegates the creation of an object to another class and provides a good level of abstraction. Let's learn this pattern using the following example: The Universal Call Center company is new in business and provides free admin support to customers to resolve issues related to their products. A call center agent can provide some information about the product support; for example, to get the Service Level Agreement (SLA) or information about the total number of tickets allowed to open per month. A developer came up with the following class: public class AdminBasicSupport{ /** * return SLA in hours */ public Integer getSLA() { return 40; } /** * Total allowed support tickets allowed every month */ public Integer allowedTickets() { // As this is basic support return 9999; } } Now, to get the SLA of AdminBasicSupport, we need to use the following code every time: AdminBasicSupport support = new AdminBasicSupport(); System.debug('Support SLA is - '+support.getSLA()); Output - Support SLA is – 40 The "Universal Call Centre" company was doing very well, and in order to grow the business and increase the profit, they started the premium support for customers who were willing to pay for cases and get a quick support. To make them special from the basic support, they changed the SLA to 12 hours and maximum 50 cases could be opened in one month. A developer had many choices to make this happen in the existing code. However, instead of changing the existing code, they created a new class that would handle only the premium support-related functionalities. This was a good decision because of the single responsibility principle. 
public class AdminPremiumSupport{ /** * return SLA in hours */ public Integer getSLA() { return 12; } /** * Total allowed support tickets allowed every month is 50 */ public Integer allowedTickets() { return 50; } } Now, every time any information regarding the SLA or allowed tickets per month is needed, the following Apex code can be used: if(Account.supportType__c == 'AdminBasic') { AdminBasicSupport support = new AdminBasicSupport(); System.debug('Support SLA is - '+support.getSLA()); }else{ AdminPremiumSupport support = new AdminPremiumSupport(); System.debug('Support SLA is - '+support.getSLA()); } As we can see in the preceding example, instead of adding some conditions to the existing class, the developer decided to go with a new class. Each class has its own responsibility, and they need to be changed for only one reason. If any change is needed in the basic support, then only one class needs to be changed. As we all know that this design principle is known as the Single Responsibility Principle. Business was doing exceptionally well in the call center, and they planned to start the golden and platinum support as well. Developers started facing issues with the current approach. Currently, they have two classes for the basic and premium support and requests for two more classes were in the pipeline. There was no guarantee that the support type will not remain the same in future. Because of every new support type, a new class is needed; and therefore, the previous code needs to be updated to instantiate these classes. The following code will be needed to instantiate these classes: if(Account.supportType__c == 'AdminBasic') { AdminBasicSupport support = new AdminBasicSupport(); System.debug('Support SLA is - '+support.getSLA()); }else if(Account.supportType__c == 'AdminPremier') { AdminPremiumSupport support = new AdminPremiumSupport(); System.debug('Support SLA is - '+support.getSLA()); }else if(Account.supportType__c == 'AdminGold') { AdminGoldSupport support = new AdminGoldSupport(); System.debug('Support SLA is - '+support.getSLA()); }else{ AdminPlatinumSupport support = new AdminPlatinumSupport(); System.debug('Support SLA is - '+support.getSLA()); } We are only considering the getSLA() method, but in a real application, there can be other methods and scenarios as well. The preceding code snippet clearly depicts the code duplicity and maintenance nightmare. The following image shows the overall complexity of the example that we are discussing: Although they are using a separate class for each support type, an introduction to a new support class will lead to changes in the code in all existing code locations where these classes are being used. The development team started brainstorming to make sure that the code is capable to extend easily in future with the least impact on the existing code. One of the developers came up with a suggestion to use an interface for all support classes so that every class can have the same methods and they can be referred to using an interface. The following interface was finalized to reduce the code duplicity: public Interface IAdminSupport{ Integer getSLA() ; Integer allowedTickets(); } Methods defined within an interface have no access modifiers and just contain their signatures. Once an interface was created, it was time to update existing classes. In our case, only one line needed to be changed and the remaining part of the code was the same because both the classes already have the getSLA() and allowedTickets() methods. 
Let's take a look at the following line of code: public class AdminPremiumSupport{ This will be changed to the following code: public class AdminBasicSupportImpl implements IAdminSupport{ The following line of code is as follows: public class AdminPremiumSupport{ This will be changed to the following code: public class AdminPremiumSupportImpl implements IAdminSupport{ In the same way, the AdminGoldSupportImpl and AdminPlatinumSupportImpl classes are written. A class diagram is a type of Unified Modeling Language (UML), which describes classes, methods, attributes, and their relationships, among other objects in a system. You can read more about class diagrams at https://en.wikipedia.org/wiki/Class_diagram. The following image shows a class diagram of the code written by developers using an interface: Now, the code to instantiate different classes of the support type can be rewritten as follows: IAdminSupport support = null; if(Account.supportType__c == 'AdminBasic') { support = new AdminBasicSupportImpl(); }else if(Account.supportType__c == 'AdminPremier') { support = new AdminPremiumSupportImpl(); }else if(Account.supportType__c == 'AdminGold') { support = new AdminGoldSupportImpl(); }else{ support = new AdminPlatinumSupportImpl(); } System.debug('Support SLA is - '+support.getSLA()); There is no switch case statement in Apex, and that's why multiple if and else statements are written. As per the product team, a new compiler may be released in 2016 and it will be supported. You can vote for this idea at https://success.salesforce.com/ideaView?id=08730000000BrSIAA0. As we can see, the preceding code is minimized to create a required instance of a concrete class, and then uses an interface to access methods. This concept is known as program to interface. This is one of the most recommended OOP principles suggested to be followed. As interfaces are kinds of contracts, we already know which methods will be implemented by concrete classes, and we can completely rely on the interface to call them, which hides their complex implementation and logic. It has a lot of advantages and a few of them are loose coupling and dependency injection. A concrete class is a complete class that can be used to instantiate objects. Any class that is not abstract or an interface can be considered a concrete class. We still have one problem in the previous approach. The code to instantiate concrete classes is still present at many locations and will still require changes if a new support type is added. If we can delegate the creation of concrete classes to some other class, then our code will be completely independent of the existing code and new support types. This concept of delegating decisions and creation of similar types of classes is known as the factory method pattern. 
The following class can be used to create the concrete classes and will act as a factory: /** * This factory class is used to instantiate the concrete class * of the respective support type * */ public class AdminSupportFactory { public static IAdminSupport getInstance(String supporttype){ IAdminSupport support = null; if(supporttype == 'AdminBasic') { support = new AdminBasicSupportImpl(); }else if(supporttype == 'AdminPremier') { support = new AdminPremiumSupportImpl(); }else if(supporttype == 'AdminGold') { support = new AdminGoldSupportImpl(); }else if(supporttype == 'AdminPlatinum') { support = new AdminPlatinumSupportImpl(); } return support; } } In the preceding code, we only need to call the getInstance(string) method, and this method will take the decision and return the actual implementation. As the return type is an interface, we already know the methods that are defined, and we can use them without actually knowing their implementation. This is a very good example of abstraction. The final class diagram of the factory method pattern that we discussed will look like this: The following code snippet can be used repeatedly by any client code to instantiate a class of any support type: IAdminSupport support = AdminSupportFactory.getInstance('AdminBasic'); System.debug('Support SLA is - '+support.getSLA()); Output: Support SLA is - 40 Reflection in Apex The problem with the preceding design is that whenever a new support type needs to be added, we need to add a condition to AdminSupportFactory. We can instead store the mapping between a support type and its concrete class name in a custom setting. This way, whenever a new concrete class is added, we don't even need to change the factory class; only a new entry needs to be added to the custom setting. Consider a custom setting named Support_Type__c with a text field named Class_Name__c and the following records:
Name - Class name
AdminBasic - AdminBasicSupportImpl
AdminGolden - AdminGoldSupportImpl
AdminPlatinum - AdminPlatinumSupportImpl
AdminPremier - AdminPremiumSupportImpl
Using reflection, the AdminSupportFactory class can now be rewritten to instantiate support types at runtime as follows: /** * This factory class is used to instantiate the concrete class * of the respective support type * */ public class AdminSupportFactory { public static IAdminSupport getInstance(String supporttype) { //Read the custom setting to get the actual class name on the basis of the support type Support_Type__c supportTypeInfo = Support_Type__c.getValues(supporttype); //from the custom setting, get the appropriate class name Type t = Type.forName(supportTypeInfo.Class_Name__c); IAdminSupport retVal = (IAdminSupport)t.newInstance(); return retVal; } } In the preceding code, we are using the Type system class. This is a very powerful class used to instantiate a new class at runtime. It has the following two important methods: forName: This returns the type corresponding to the class name string passed in newInstance: This creates a new object of the specified type Inspecting classes, methods, and variables at runtime without knowing a class name, or instantiating a new object and invoking methods at runtime, is known as Reflection in computer science. One more advantage of using the factory method, a custom setting, and reflection together is that if, in the future, one of the support types needs to be replaced by another implementation permanently, we simply need to change the appropriate mapping in the custom setting without any changes to the code.
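To see how little the client code is affected by this design, the following sketch shows a caller using the reflection-based factory and what adding one more support type would involve. The AdminSilver type and its AdminSilverSupportImpl class below are hypothetical examples, not part of the scenario above; supporting them would only require creating the class and adding a matching Support_Type__c record, with no change to AdminSupportFactory:

// Existing client code keeps working unchanged for any support type
IAdminSupport support = AdminSupportFactory.getInstance(Account.supportType__c);
System.debug('Support SLA is - ' + support.getSLA());

// Hypothetical new support type: only this class and a custom setting record are needed
public class AdminSilverSupportImpl implements IAdminSupport {
    // The SLA and ticket values here are assumptions for illustration only
    public Integer getSLA() { return 24; }
    public Integer allowedTickets() { return 75; }
}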
Summary In this article, we discussed how to deal with various object instantiation scenarios using design patterns, in particular the factory method pattern. Resources for Article: Further resources on this subject: Getting Your APEX Components Logic Right [article] AJAX Implementation in APEX [article] Custom Coding with Apex [article]

How to set up Odoo as a system service [Tutorial]

Sugandha Lahoti
01 Feb 2019
7 min read
In this tutorial, we'll learn the basics of setting up Odoo as a system service.  This article is taken from the book Odoo 12 Development Essentials by Daniel Reis. This book will help you extend your skills with Odoo 12 to build resourceful and open source business applications. Setting up and maintaining servers is a non-trivial topic in itself and should be done by specialists. The information given here is not enough to ensure an average user can create a resilient and secure environment that hosts sensitive data and services. In this article, we'll discuss the following topics: Setting up Odoo as a system service, including the following: Creating a systemd service Creating an Upstart or sysvinit service Checking the Odoo service from the command line The code and scripts used here can be found in the ch14/ directory of the Git repository. Setting up Odoo as a system service We will learn how to set up Odoo as a system service and have it started automatically when the system boots. In Ubuntu or Debian, the init system is responsible for starting services. Historically, Debian (and derived operating systems) has used sysvinit, and Ubuntu has used a compatible system called Upstart. Recently, however, this has changed, and the init system used in both the latest Debian and Ubuntu editions is systemd. This means that there are now two different ways to install a system service, and you need to pick the correct one depending on the version of your operating system. On Ubuntu 16.04 and later, you should be using systemd. However, older versions are still used in many cloud providers, so there is a good chance that you might need to use it. To check whether systemd is used in your system, try the following command: $ man init This command opens the documentation for the currently init system in use, so you're able to check what is being used. Ubuntu on Windows Subsystem for Linux (WSL) is an environment good enough for development only, but may have some quirks and is entirely inappropriate for running production servers. At the time of writing, our tests revealed that while man init identifies the init system as systemd, installing a systemd service doesn't work, while installing a sysvinit service does. Creating a systemd service If the operating system you're using is recent, such as Debian 8 or Ubuntu 16.04, you should be using systemd for the init system. To add a new service to the system, simply create a file describing it. Create a /lib/systemd/system/odoo.service file with the following content: [Unit] Description=Odoo After=postgresql.service [Service] Type=simple User=odoo Group=odoo ExecStart=/home/odoo/odoo-12/odoo-bin -c /etc/odoo/odoo.conf [Install] WantedBy=multi-user.target The Odoo source code includes a sample odoo.service file inside the debian/ directory. Instead of creating a new file, you can copy it and then make the required changes. At the very least, the ExecStart option should be changed according to your setup. Next, we need to register the new service with the following code: $ sudo systemctl enable odoo.service To start this new service, use the following command: $ sudo systemctl start odoo To check its status, run the following command: $ sudo systemctl status odoo Finally, if you want to stop it, use the following command: $ sudo systemctl stop odoo Creating an Upstart or sysvinit service If you're using an older operating system, such as Debian 7 or Ubuntu 15.04, chances are your system is sysvinit or Upstart. 
For the purpose of creating a system service, both should behave in the same way. Some cloud Virtual Private Server (VPS) services are still based on older Ubuntu images, so be aware of this scenario in case you encounter it when deploying your Odoo server. The Odoo source code includes an init script used for the Debian packaged distribution. We can use it as our service init script with minor modifications, as follows: $ sudo cp /home/odoo/odoo-12/debian/init /etc/init.d/odoo $ sudo chmod +x /etc/init.d/odoo At this point, you might want to check the content of the init script. The key parameters are assigned to variables at the top of the file, as illustrated in the following example: PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/bin DAEMON=/usr/bin/odoo NAME=odoo DESC=odoo CONFIG=/etc/odoo/odoo.conf LOGFILE=/var/log/odoo/odoo-server.log PIDFILE=/var/run/${NAME}.pid USER=odoo These variables should be adequate, so we'll prepare the rest of the setup with their default values in mind. However, you can, of course, change them to better suit your needs. The USER variable is the system user under which the server will run. We have already created the expected odoo user. The DAEMON variable is the path to the server executable. Our executable used to start Odoo is in a different location, but we can create the following symbolic link to it: $ sudo ln -s /home/odoo/odoo-12/odoo-bin /usr/bin/odoo $ sudo chown -h odoo /usr/bin/odoo The CONFIG variable is the configuration file we need to use. In a previous section, we created a configuration file in the default expected location, /etc/odoo/odoo.conf. Finally, the LOGFILE variable is the path of the log file. The expected directory for it is /var/log/odoo, which we created when we defined the configuration file. Now we should be able to start and stop our Odoo service, as follows: $ sudo /etc/init.d/odoo start Starting odoo: ok Stopping the service is done in a similar way with the following command: $ sudo /etc/init.d/odoo stop Stopping odoo: ok In Ubuntu, the service command can also be used, as follows: $ sudo service odoo start $ sudo service odoo status $ sudo service odoo stop Now we need to make the service start automatically on system boot; this can be done with the following command: $ sudo update-rc.d odoo defaults After this, when we reboot our server, the Odoo service should start automatically and with no errors. It's a good time to verify that all is working as expected. Checking the Odoo service from the command line At this point, we can confirm whether our Odoo instance is up and responding to requests as expected. If Odoo is running properly, we should be able to get a response from it and see no errors in the log file. We can check whether Odoo is responding to HTTP requests inside the server by using the following command: $ curl http://localhost:8069 <html><head><script>window.location = '/web' + location.hash;</script></head></html> In addition, to see what is in the log file, use the following command: $ sudo less /var/log/odoo/odoo-server.log You can also follow what is being added to the log file live, using tail -f as follows: $ sudo tail -f /var/log/odoo/odoo-server.log Summary In this tutorial, we learned about the steps required for setting up Odoo as a system service. To learn more about Odoo, you should read our book Odoo 12 Development Essentials. You may also take a look at the official documentation at https://www.odoo.com/documentation.
Odoo is an open source product with a vibrant community. Getting involved, asking questions, and contributing is a great way not only to learn but also to build a business network. With this in mind, we can't help but mention the Odoo Community Association (OCA), which promotes collaboration and quality open source code. You can learn more about it at odoo-community.org. "Everybody can benefit from adopting Odoo, whether you're a small start-up or a giant tech company" - an interview with Yenthe van Ginneken. Implement an effective CRM system in Odoo 11 [Tutorial] Handle Odoo application data with ORM API [Tutorial]

Thread synchronization and communication

Packt
19 Jun 2017
20 min read
In this article by Maya Posch, the author of the book Mastering C++ Multithreading, we will work through and understand a basic multithreaded C++ application. While threads are generally used to work on a task more or less independently from other threads, there are many occasions where one would want to pass data between threads, or even control other threads, such as from a central task scheduler thread. This article looks at how such tasks are accomplished. Topics covered in this article include: Use of mutexes, locks, and similar synchronization structures. The use of condition variables and signals to control threads. Safely passing and sharing data between threads. (For more resources related to this topic, see here.) Safety first The central problem with concurrency is that of ensuring safe access to shared resources, including when communicating between threads. There is also the issue of threads being able to communicate and synchronize themselves. What makes multithreaded programming such a challenge is being able to keep track of each interaction between threads and to ensure that each and every form of access is secured, while not falling into the traps of deadlocks and data races. In this article we will look at a fairly complex example involving a task scheduler. This is a form of high-concurrency, high-throughput situation where many different requirements come together, with many potential traps, as we will see in a moment. The scheduler A good example of multithreading with a significant amount of synchronization and communication between threads is the scheduling of tasks. Here, the goal is to accept incoming tasks and assign them to worker threads as quickly as possible. A number of different approaches are possible. Often one has worker threads running in an active loop, constantly polling a central queue for new tasks. Disadvantages of this approach include the wasting of processor cycles on said polling and the congestion which forms at the synchronization mechanism used, generally a mutex. Furthermore, this active polling approach scales very poorly when the number of worker threads increases. Ideally, each worker thread would idly wait until it is needed again. To accomplish this, we have to approach the problem from the other side: not from the perspective of the worker threads, but from that of the queue. Much like the scheduler of an operating system, it is the scheduler which is aware of both the tasks which require processing and the available worker threads. In this approach, a central scheduler instance would accept new tasks and actively assign them to worker threads. Said scheduler instance may also manage these worker threads, adjusting their number and priority depending on the number of incoming tasks, the type of task, or other properties. High-level view At its core, our scheduler or dispatcher is quite simple, functioning like a queue with all of the scheduling logic built into it: As one can see from the high-level view, there really isn't much to it. As we'll see in a moment, the actual implementation does however have a number of complications.
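For contrast with the design that follows, here is a rough sketch of the active-polling approach criticized above; it is illustrative only and not part of the article's project:

#include <queue>
#include <mutex>
#include <atomic>

struct Task { void process() { /* do the actual work here */ } };

std::queue<Task*> taskQueue;
std::mutex queueMutex;
std::atomic<bool> running{true};

// Naive polling worker: spins on the queue, wasting CPU cycles and
// contending on queueMutex even when there is nothing to do.
void pollingWorker() {
    while (running) {
        Task* task = nullptr;
        {
            std::lock_guard<std::mutex> lock(queueMutex);
            if (!taskQueue.empty()) {
                task = taskQueue.front();
                taskQueue.pop();
            }
        }
        if (task) {
            task->process();
            delete task;
        }
        // No task: loop around immediately and poll again.
    }
}

The dispatcher developed below avoids exactly this busy loop by letting idle workers block on a condition variable until work arrives.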
Implementation As is usual, we start off with the main function, contained in main.cpp: #include "dispatcher.h" #include "request.h" #include <iostream> #include <string> #include <csignal> #include <thread> #include <chrono> using namespace std; sig_atomic_t signal_caught = 0; mutex logMutex; The custom headers we include are those for our dispatcher implementation, as well as the Request class we'll be using. Globally we define an atomic variable to be used with the signal handler, as well as a mutex which will synchronize the output (on the standard output) from our logging method. void sigint_handler(int sig) { signal_caught = 1; } Our signal handler function (for SIGINT signals) simply sets the global atomic variable we defined earlier. void logFnc(string text) { logMutex.lock(); cout << text << "\n"; logMutex.unlock(); } In our logging function we use the global mutex to ensure writing to the standard output is synchronized. int main() { signal(SIGINT, &sigint_handler); Dispatcher::init(10); In the main function we install the signal handler for SIGINT to allow us to interrupt the execution of the application. We also call the static init() function on the Dispatcher class to initialize it. cout << "Initialised.\n"; int cycles = 0; Request* rq = 0; while (!signal_caught && cycles < 50) { rq = new Request(); rq->setValue(cycles); rq->setOutput(&logFnc); Dispatcher::addRequest(rq); cycles++; } Next we set up the loop in which we will create new requests. In each cycle we create a new Request instance and use its setValue() function to set an integer value (current cycle number). We also set our logging function on the request instance before adding this new request to the Dispatcher using its static addRequest() function. This loop will continue until the maximum number of cycles have been reached, or SIGINT has been signaled, using Ctrl+C or similar. this_thread::sleep_for(chrono::seconds(5)); Dispatcher::stop(); cout << "Clean-up done.\n"; return 0; } Finally we wait for five seconds, using the thread's sleep_for() function and the chrono::seconds() function from the chrono STL header. We also call the stop() function on the Dispatcher before returning. Request class A request for the Dispatcher always derives from the pure virtual AbstractRequest class: #pragma once #ifndef ABSTRACT_REQUEST_H #define ABSTRACT_REQUEST_H class AbstractRequest { // public: virtual void setValue(int value) = 0; virtual void process() = 0; virtual void finish() = 0; }; #endif This class defines an API with three functions which a deriving class always has to implement, of which the process() and finish() functions are the most generic and likely to be used in any practical implementation. The setValue() function is specific to this demonstration implementation and would likely be adapted or extended to fit a real-life scenario. The advantage of using an abstract class as the basis for a request is that it allows the Dispatcher class to handle many different types of requests, as long as they all adhere to this same basic API.
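As an aside, a minimal sketch of what another request type could look like is shown below; this SleepRequest class is hypothetical and not part of the article's project, but anything implementing the same three-function API can be handed to the dispatcher:

#include "abstract_request.h"
#include <chrono>
#include <thread>

// Hypothetical request type that simulates a slow task by sleeping.
class SleepRequest : public AbstractRequest {
    int value;
public:
    void setValue(int value) { this->value = value; }
    void process() {
        // Pretend to do 'value' milliseconds of work.
        std::this_thread::sleep_for(std::chrono::milliseconds(value));
    }
    void finish() {
        // Nothing to clean up for this simulated task.
    }
};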
Using this abstract interface, we implement a basic Request class: #pragma once #ifndef REQUEST_H #define REQUEST_H #include "abstract_request.h" #include <string> using namespace std; typedef void (*logFunction)(string text); class Request : public AbstractRequest { int value; logFunction outFnc; public: void setValue(int value) { this->value = value; } void setOutput(logFunction fnc) { outFnc = fnc; } void process(); void finish(); }; #endif In its header file we first define the logging function pointer's format. After this we implement the request API, adding the setOutput() function to the base API, which accepts a function pointer for logging. Both setter functions merely assign the provided parameter to their respective private class members. Next, the class function implementations: #include "request.h" void Request::process() { outFnc("Starting processing request " + std::to_string(value) + "..."); // } void Request::finish() { outFnc("Finished request " + std::to_string(value)); }Both of these implementations are very basic, merely using the function pointer to output a string indicating the status of the worker thread. In a practical implementation, one would add the business logic to the process()function, with the finish() function containing any functionality to finish up a request, such as writing a map into a string. Worker class Next, the Worker class. This contains the logic which will be called by the dispatcher in order to process a request: #pragma once #ifndef WORKER_H #define WORKER_H #include "abstract_request.h" #include <condition_variable> #include <mutex> using namespace std; class Worker { condition_variable cv; mutex mtx; unique_lock<mutex> ulock; AbstractRequest* request; bool running; bool ready; public: Worker() { running = true; ready = false; ulock = unique_lock<mutex>(mtx); } void run(); void stop() { running = false; } void setRequest(AbstractRequest* request) { this->request = request; ready = true; } void getCondition(condition_variable* &cv); }; #endif Whereas the adding of a request to the dispatcher does not require any special logic, the Worker class does require the use of condition variables to synchronize itself with the dispatcher. For the C++11 threads API, this requires a condition variable, a mutex and a unique lock. The unique lock encapsulates the mutex and will ultimately be used with the condition variable as we will see in a moment. Beyond this we define methods to start and stop the worker, to set a new request for processing and to obtain access to its internal condition variable. Moving on, the rest of the implementation: #include "worker.h" #include "dispatcher.h" #include <chrono> using namespace std; void Worker::getCondition(condition_variable* &cv) { cv = &(this)->cv; } void Worker::run() { while (running) { if (ready) { ready = false; request->process(); request->finish(); } if (Dispatcher::addWorker(this)) { // Use the ready loop to deal with spurious wake-ups. while (!ready && running) { if (cv.wait_for(ulock, chrono::seconds(1)) == cv_status::timeout) { // We timed out, but we keep waiting unless // the worker is // stopped by the dispatcher. } } } } } Beyond the getter function for the condition variable, we define the run() function, which the dispatcher will run for each worker thread upon starting it. Its main loop merely checks that the stop() function hasn't been called yet, which would have set the running boolean value to false and ended the work thread. 
This is used by the dispatcher when shutting down, allowing it to terminate the worker threads. Since boolean values are generally atomic, setting and checking can be done simultaneously without risk or requiring a mutex. Moving on, the check of the ready variable is to ensure that a request is actually waiting when the thread is first run. On the first run of the worker thread, no request will be waiting and thus attempting to process one would result in a crash. Upon the dispatcher setting a new request, this boolean variable will be set to true. If a request is waiting, the ready variable will be set to false again, after which the request instance will have its process() and finish() functions called. This will run the business logic of the request on the worker thread's thread and finalize it. Finally, the worker thread adds itself to the dispatcher using its static addWorker() function. This function will return false if no new request was available, causing the worker thread to wait until a new request has become available. Otherwise the worker thread will continue with the processing of the new request that the dispatcher will have set on it. If asked to wait, we enter a new loop which will ensure that upon waking up from waiting for the condition variable to be signaled, we woke up because we got signaled by the dispatcher (ready variable set to true), and not because of a spurious wake-up. Last of all, we enter the actual wait() function of the condition variable, using the unique lock instance we created before, along with a timeout. If a timeout occurs, we can either terminate the thread, or keep waiting. Here we choose to do nothing and just re-enter the waiting loop. Dispatcher As the last item, we have the Dispatcher class itself: #pragma once #ifndef DISPATCHER_H #define DISPATCHER_H #include "abstract_request.h" #include "worker.h" #include <queue> #include <mutex> #include <thread> #include <vector> using namespace std; class Dispatcher { static queue<AbstractRequest*> requests; static queue<Worker*> workers; static mutex requestsMutex; static mutex workersMutex; static vector<Worker*> allWorkers; static vector<thread*> threads; public: static bool init(int workers); static bool stop(); static void addRequest(AbstractRequest* request); static bool addWorker(Worker* worker); }; #endif Most of this should look familiar by now. As one should have surmised by now, this is a fully static class. Moving on with its implementation: #include "dispatcher.h" #include <iostream> using namespace std; queue<AbstractRequest*> Dispatcher::requests; queue<Worker*> Dispatcher::workers; mutex Dispatcher::requestsMutex; mutex Dispatcher::workersMutex; vector<Worker*> Dispatcher::allWorkers; vector<thread*> Dispatcher::threads; bool Dispatcher::init(int workers) { thread* t = 0; Worker* w = 0; for (int i = 0; i < workers; ++i) { w = new Worker; allWorkers.push_back(w); t = new thread(&Worker::run, w); threads.push_back(t); } } After setting up the static class members, the init() function is defined. It starts the specified number of worker threads, keeping a reference to each worker and thread instance in their respective vector data structures. bool Dispatcher::stop() { for (int i = 0; i < allWorkers.size(); ++i) { allWorkers[i]->stop(); } cout << "Stopped workers.\n"; for (int j = 0; j < threads.size(); ++j) { threads[j]->join(); cout << "Joined threads.\n"; } } In the stop() function each worker instance has its stop() function called.
This will cause each worker thread to terminate, as we saw earlier in the Worker class description. Finally, we wait for each thread to join (that is, finish), prior to returning. void Dispatcher::addRequest(AbstractRequest* request) { workersMutex.lock(); if (!workers.empty()) { Worker* worker = workers.front(); worker->setRequest(request); condition_variable* cv; worker->getCondition(cv); cv->notify_one(); workers.pop(); workersMutex.unlock(); } else { workersMutex.unlock(); requestsMutex.lock(); requests.push(request); requestsMutex.unlock(); } } The addRequest() function is where things get interesting. In this one function a new request is added. What happens next to it depends on whether a worker thread is waiting for a new request or not. If no worker thread is waiting (the worker queue is empty), the request is added to the request queue. The use of mutexes ensures that the access to these queues occurs safely, as the worker threads will simultaneously try to access both queues as well. An important gotcha to note here is the possibility of a deadlock. That is, a situation where two threads each hold a lock on a resource, with each waiting for the other thread to release its lock before releasing its own. Every situation where more than one mutex is used in a single scope holds this potential. In this function the potential for deadlock lies in the releasing of the lock on the workers mutex and the moment the lock on the requests mutex is obtained. In the case that this function holds the workers mutex and tries to obtain the requests lock (when no worker thread is available), there is a chance that another thread holds the requests mutex (looking for new requests to handle), while simultaneously trying to obtain the workers mutex (finding no requests and adding itself to the workers queue). The solution here is simple: release a mutex before obtaining the next one. In the situation where one feels that more than one mutex lock has to be held, it is paramount to examine and test one's code for potential deadlocks. In this particular situation the workers mutex lock is explicitly released when it is no longer needed, or before the requests mutex lock is obtained, preventing a deadlock. Another important aspect of this particular section of code is the way it signals a worker thread. As one can see in the first section of the if/else block, when the workers queue is not empty, a worker is fetched from the queue, has the request set on it, and then has its condition variable referenced and signaled, or notified. Internally the condition variable uses the mutex we handed it before in the Worker class definition to guarantee only atomic access to it. When the notify_one() function (generally called signal() in other APIs) is called on the condition variable, it will notify the first thread in the queue of threads waiting for the condition variable to return and continue. In the Worker class' run() function we would be waiting for this notification event. Upon receiving it, the worker thread would continue and process the new request. The thread reference will then be removed from the queue until it adds itself again once it is done processing the request.
bool Dispatcher::addWorker(Worker* worker) { bool wait = true; requestsMutex.lock(); if (!requests.empty()) { AbstractRequest* request = requests.front(); worker->setRequest(request); requests.pop(); wait = false; requestsMutex.unlock(); } else { requestsMutex.unlock(); workersMutex.lock(); workers.push(worker); workersMutex.unlock(); } return wait; } With this function a worker thread will add itself to the queue once it is done processing a request. It is similar to the earlier function in that the incoming worker is first actively matched with any request which may be waiting in the request queue. If none are available, the worker is added to the worker queue. Important to note here is that we return a boolean value which indicates whether the calling thread should wait for a new request, or whether it already has received a new request while trying to add itself to the queue. While this code is less complex than that of the previous function, it still holds the same potential deadlock issue due to the handling of two mutexes within the same scope. Here, too, we first release the mutex we hold before obtaining the next one. Makefile The Makefile for this dispatcher example is very basic again, gathering all C++ source files in the current folder and compiling them into a binary using g++: GCC := g++ OUTPUT := dispatcher_demo SOURCES := $(wildcard *.cpp) CCFLAGS := -std=c++11 -g3 all: $(OUTPUT) $(OUTPUT): $(GCC) -o $(OUTPUT) $(CCFLAGS) $(SOURCES) clean: rm $(OUTPUT) .PHONY: all Output After compiling the application, running it produces the following output for the fifty total requests: $ ./dispatcher_demo.exe Initialised. Starting processing request 1... Starting processing request 2... Finished request 1 Starting processing request 3... Finished request 3 Starting processing request 6... Finished request 6 Starting processing request 8... Finished request 8 Starting processing request 9... Finished request 9 Finished request 2 Starting processing request 11... Finished request 11 Starting processing request 12... Finished request 12 Starting processing request 13... Finished request 13 Starting processing request 14... Finished request 14 Starting processing request 7... Starting processing request 10... Starting processing request 15... Finished request 7 Finished request 15 Finished request 10 Starting processing request 16... Finished request 16 Starting processing request 17... Starting processing request 18... Starting processing request 0… At this point we we can already clearly see that even with each request taking almost no time to process, the requests are clearly being executed in parallel. The first request (request 0) only starts being processed after the 16th request, while the second request already finishes after the ninth request, long before this. The factors which determine which thread and thus which request is processed first depends on the OS scheduler and hardware-based scheduling. This clearly shows just how few assumptions one can be made about how a multithreaded application will be executed, even on a single platform. Starting processing request 5... Finished request 5 Starting processing request 20... Finished request 18 Finished request 20 Starting processing request 21... Starting processing request 4... Finished request 21 Finished request 4 Here the fourth and fifth requests also finish in a rather delayed fashion. Starting processing request 23... Starting processing request 24... Starting processing request 22... 
Finished request 24 Finished request 23 Finished request 22 Starting processing request 26... Starting processing request 25... Starting processing request 28... Finished request 26 Starting processing request 27... Finished request 28 Finished request 27 Starting processing request 29... Starting processing request 30... Finished request 30 Finished request 29 Finished request 17 Finished request 25 Starting processing request 19... Finished request 0 At this point the first request finally finishes. This may indicate that the initialization time for the first request will always delay it relative to the successive requests. Running the application multiple times can confirm this. It's important that if the order of processing is relevant, that this randomness does not negatively affect one's application. Starting processing request 33... Starting processing request 35... Finished request 33 Finished request 35 Starting processing request 37... Starting processing request 38... Finished request 37 Finished request 38 Starting processing request 39... Starting processing request 40... Starting processing request 36... Starting processing request 31... Finished request 40 Finished request 39 Starting processing request 32... Starting processing request 41... Finished request 32 Finished request 41 Starting processing request 42... Finished request 31 Starting processing request 44... Finished request 36 Finished request 42 Starting processing request 45... Finished request 44 Starting processing request 47... Starting processing request 48... Finished request 48 Starting processing request 43... Finished request 47 Finished request 43 Finished request 19 Starting processing request 34... Finished request 34 Starting processing request 46... Starting processing request 49... Finished request 46 Finished request 49 Finished request 45 Request 19 also became fairly delayed, showing once again just how unpredictable a multithreaded application can be. If we were processing a large data set in parallel here, with chunks of data in each request, we might have to pause at some points to account for these delays as otherwise our output cache might grow too large. As doing so would negatively affect an application's performance, one might have to look at low-level optimizations, as well as the scheduling of threads on specific processor cores in order to prevent this from happening. Stopped workers. Joined threads. Joined threads. Joined threads. Joined threads. Joined threads. Joined threads. Joined threads. Joined threads. Joined threads. Joined threads. Clean-up done. All ten worker threads which were launched in the beginning terminate here as we call the stop() function of the Dispatcher. Sharing data In this article's example we saw how to share information between threads in addition to the synchronizing of threads. This in the form of the requests we passed from the main thread into the dispatcher, from which each request gets passed on to a different thread. The essential idea behind the sharing of data between threads is that the data to be shared exists somewhere in a way which is accessible to two threads or more. After this we have to ensure that only one thread can modify the data, and that the data does not get modified while it's being read. Generally we would use mutexes or similar to ensure this. Using R/W-locks Readers-writer locks are a possible optimization here, because they allow multiple threads to read simultaneously from a single data source. 
If one has an application in which multiple worker threads read the same information repeatedly, it would be more efficient to use read-write locks than basic mutexes, because the attempts to read the data will not block the other threads. A read-write lock can thus be used as a more advanced version of a mutex, namely as one which adapts its behavior to the type of access. Internally it builds on mutexes (or semaphores) and condition variables. Using shared pointers First available via the Boost library and introduced natively with C++11, shared pointers are an abstraction of memory management using reference counting for heap-allocated instances. They are partially thread-safe, in that multiple shared pointer instances referring to the same object can be created and destroyed from different threads safely, but the referenced object itself is not thread-safe. Depending on the application this may suffice, however. To make them properly thread-safe one can use atomics. Summary In this article we looked at how to pass data between threads in a safe manner as part of a fairly complex scheduler implementation. We also looked at the resulting asynchronous processing of said scheduler and considered some potential alternatives and optimizations for passing data between threads. At this point one should be able to safely pass data between threads, as well as synchronize the access to other shared resources. In the next article we will be looking at the native C++ threading API and its primitives. Resources for Article: Further resources on this subject: Multithreading with Qt [article] Understanding the Dependencies of a C++ Application [article] Boost.Asio C++ Network Programming [article]

Creating TFS Scheduled Jobs

Packt
28 Sep 2015
12 min read
In this article by Gordon Beeming, the author of the book Team Foundation Server 2015 Customization, we are going to cover TFS scheduled jobs. The topics that we are going to cover include: Writing a TFS Job Deploying a TFS Job Removing a TFS Job You would want to write a scheduled job for any logic that needs to be run at specific times, whether it is at certain intervals or at specific times of the day. A scheduled job is not the place to put logic that you would like to run as soon as some other event, such as a check-in or a work item change, occurs. The sample job that we will build in this article will automatically link changesets to work items based on the check-in comments. (For more resources related to this topic, see here.) The project setup First off, we'll start with our project setup. This time, we'll create a Windows console application. Creating a new Windows console application The references that we'll need this time around are: Microsoft.VisualStudio.Services.WebApi.dll Microsoft.TeamFoundation.Common.dll Microsoft.TeamFoundation.Framework.Server.dll All of these can be found in C:\Program Files\Microsoft Team Foundation Server 14.0\Application Tier\TFSJobAgent on the TFS server. That's all the setup that is required for your TFS job project. Any class that inherits from ITeamFoundationJobExtension will be able to be used for a TFS Job. Writing the TFS job So, as mentioned, we are going to need a class that inherits from ITeamFoundationJobExtension. Let's create a class called TfsCommentsToChangeSetLinksJob and inherit from ITeamFoundationJobExtension. As part of this, we will need to implement the Run method, which is part of the interface, like this: public class TfsCommentsToChangeSetLinksJob : ITeamFoundationJobExtension { public TeamFoundationJobExecutionResult Run( TeamFoundationRequestContext requestContext, TeamFoundationJobDefinition jobDefinition, DateTime queueTime, out string resultMessage) { throw new NotImplementedException(); } } Then, we also add the using statement: using Microsoft.TeamFoundation.Framework.Server; Now, for this specific extension, we'll need to add references to the following: Microsoft.TeamFoundation.Client.dll Microsoft.TeamFoundation.VersionControl.Client.dll Microsoft.TeamFoundation.WorkItemTracking.Client.dll All of these can be found in C:\Program Files\Microsoft Team Foundation Server 14.0\Application Tier\TFSJobAgent. Now, for the logic of our plugin, we use the following code inside of the Run method as a basic shell, where we'll then place the specific logic for this plugin. This basic shell adds a try catch block; at the end of the try block we return a successful job run, while in the catch block we append the thrown exception to the job message and return that the job failed: resultMessage = string.Empty; try { // place logic here return TeamFoundationJobExecutionResult.Succeeded; } catch (Exception ex) { resultMessage += "Job Failed: " + ex.ToString(); return TeamFoundationJobExecutionResult.Failed; } Along with this code, you will need the following using statements: using Microsoft.TeamFoundation; using Microsoft.TeamFoundation.Client; using Microsoft.TeamFoundation.VersionControl.Client; using Microsoft.TeamFoundation.WorkItemTracking.Client; using System.Linq; using System.Text.RegularExpressions; So next, we need to place some logic specific to this job in the try block.
First, let's create a connection to TFS for version control: TfsTeamProjectCollection tfsTPC = TfsTeamProjectCollectionFactory.GetTeamProjectCollection( new Uri("http://localhost:8080/tfs")); VersionControlServer vcs = tfsTPC.GetService<VersionControlServer>(); Then, we will query the work item store's history and get the last 25 check-ins: WorkItemStore wis = tfsTPC.GetService<WorkItemStore>(); // get the last 25 check ins foreach (Changeset changeSet in vcs.QueryHistory("$/", RecursionType.Full, 25)) { // place the next logic here } Now that we have the changeset history, we are going to check the comments for any references to work items using a simple regex expression: //try match the regex for a hash number in the comment foreach (Match match in Regex.Matches((changeSet.Comment ?? string.Empty), @"#\d{1,}")) { // place the next logic here } Getting into this loop, we'll know that we have found a valid number in the comment and that we should attempt to link the check-in to that work item. But just the fact that we have found a number doesn't mean that the work item exists, so let's try to find a work item with the found number: int workItemId = Convert.ToInt32(match.Value.TrimStart('#')); var workItem = wis.GetWorkItem(workItemId); if (workItem != null) { // place the next logic here } Here, we are checking to make sure that the work item exists, so if the workItem variable is not null, then we'll proceed to check whether a link between this changeSet and workItem already exists: //now create the link ExternalLink changesetLink = new ExternalLink( wis.RegisteredLinkTypes[ArtifactLinkIds.Changeset], changeSet.ArtifactUri.AbsoluteUri); //you should verify if such a link already exists if (!workItem.Links.OfType<ExternalLink>() .Any(l => l.LinkedArtifactUri == changeSet.ArtifactUri.AbsoluteUri)) { // place the next logic here } If a link does not exist, then we can add a new link: changesetLink.Comment = "Change set " + $"'{changeSet.ChangesetId}'" + " auto linked by a server plugin"; workItem.Links.Add(changesetLink); workItem.Save(); resultMessage += $"Linked CS:{changeSet.ChangesetId} " + $"to WI:{workItem.Id}"; We just have the extra bit here to get the last 25 change sets. If you were using this for production, you would probably want to store the last change set that you processed and then get history up until that point, but I don't think it's needed to illustrate this sample. Then, after getting the list of change sets, we basically process everything 100 percent as before. We check whether there is a comment and whether that comment contains a hash number that we can try to link the changeSet to. We then check whether a work item exists for the number that we found. Next, we add a link from the changeSet to the work item. Then, for each link, we add to the overall resultMessage string so that when we look at the results from our job running, we can see which links were added automatically for us. As you can see, with this approach, we don't interfere with the check-in itself, but rather handle this out-of-band linking of changesets to work items at a later stage. Deploying our TFS Job Deploying the code is very simple; change the project's Output type to Class Library. This can be done by going to the project properties, and then in the Application tab, you will see an Output type drop-down list. Now, build your project.
Then, copy the TfsJobSample.dll and TfsJobSample.pdb output files to the scheduled job plugins folder, which is C:\Program Files\Microsoft Team Foundation Server 14.0\Application Tier\TFSJobAgent\Plugins. Unfortunately, simply copying the files into this folder won't automatically install your scheduled job, and the reason for this is that, as part of the interface of the scheduled job, you don't specify when to run your job. Instead, you register the job as a separate step. Change the Output type back to the Console Application option for the next step. You can, and should, split the TFS job from its installer into different projects, but in our sample, we'll use the same one. Registering, queueing, and deregistering a TFS Job If you try to install the job the way you used to in TFS 2013, you will now get the TF400444 error: TF400444: The creation and deletion of jobs is no longer supported. You may only update the EnabledState or Schedule of a job. Failed to create, delete or update job id 5a7a01e0-fff1-44ee-88c3-b33589d8d3b3 This is because some changes have been made to the job service, for security reasons, and these changes prevent you from using the Client Object Model. You are now forced to use the Server Object Model. The code that you have to write is slightly more complicated and requires you to copy your executable to multiple locations to get it working properly. Place all of the following code in your program.cs file inside the main method. We start off by getting some arguments that are passed through to the application, and if we don't get at least one argument, we don't continue: #region Collect commands from the args if (args.Length != 1 && args.Length != 2) { Console.WriteLine("Usage: TfsJobSample.exe <command "+ "(/r, /i, /u, /q)> [job id]"); return; } string command = args[0]; Guid jobid = Guid.Empty; if (args.Length > 1) { if (!Guid.TryParse(args[1], out jobid)) { Console.WriteLine("Job Id not a valid Guid"); return; } } #endregion We then wrap all our logic in a try catch block, and in our catch, we only write the exception that occurred: try { // place logic here } catch (Exception ex) { Console.WriteLine(ex.ToString()); } Place the next steps inside the try block, unless asked to do otherwise. As part of using the Server Object Model, you'll need to create a DeploymentServiceHost. This requires you to have a connection string to the TFS Configuration database, so make sure that the connection string set in the following is valid for you.
We also need some other generic path information, so we'll mimic what we could expect the job agent's paths to be: #region Build a DeploymentServiceHost string databaseServerDnsName = "localhost"; string connectionString = $"Data Source={databaseServerDnsName};"+ "Initial Catalog=TFS_Configuration;Integrated Security=true;"; TeamFoundationServiceHostProperties deploymentHostProperties = new TeamFoundationServiceHostProperties(); deploymentHostProperties.HostType = TeamFoundationHostType.Deployment | TeamFoundationHostType.Application; deploymentHostProperties.Id = Guid.Empty; deploymentHostProperties.PhysicalDirectory = @"C:\Program Files\Microsoft Team Foundation Server 14.0"+ @"\Application Tier\TFSJobAgent"; deploymentHostProperties.PlugInDirectory = $@"{deploymentHostProperties.PhysicalDirectory}\Plugins"; deploymentHostProperties.VirtualDirectory = "/"; ISqlConnectionInfo connInfo = SqlConnectionInfoFactory.Create(connectionString, null, null); DeploymentServiceHost host = new DeploymentServiceHost(deploymentHostProperties, connInfo, true); #endregion Now that we have a TeamFoundationServiceHost instance, we are able to create a TeamFoundationRequestContext. We'll need it to call methods such as UpdateJobDefinitions, which adds and/or removes our job, and QueryJobDefinition, which is used to queue our job outside of any schedule: using (TeamFoundationRequestContext requestContext = host.CreateSystemContext()) { TeamFoundationJobService jobService = requestContext.GetService<TeamFoundationJobService>(); // place next logic here } We then create a new TeamFoundationJobDefinition instance with all of the information that we want for our TFS job, including the name, schedule, and enabled state: var jobDefinition = new TeamFoundationJobDefinition( "Comments to Change Set Links Job", "TfsJobSample.TfsCommentsToChangeSetLinksJob"); jobDefinition.EnabledState = TeamFoundationJobEnabledState.Enabled; jobDefinition.Schedule.Add(new TeamFoundationJobSchedule { ScheduledTime = DateTime.Now, PriorityLevel = JobPriorityLevel.Normal, Interval = 300, }); Once we have the job definition, we check what the command was and then execute the code that relates to that command. For the /r command, we will just run our TFS job outside of the TFS job agent: if (command == "/r") { string resultMessage; new TfsCommentsToChangeSetLinksJob().Run(requestContext, jobDefinition, DateTime.Now, out resultMessage); } For the /i command, we will install the TFS job: else if (command == "/i") { jobService.UpdateJobDefinitions(requestContext, null, new[] { jobDefinition }); } For the /u command, we will uninstall the TFS Job: else if (command == "/u") { jobService.UpdateJobDefinitions(requestContext, new[] { jobid }, null); } Finally, with the /q command, we will queue the TFS job to be run inside the TFS job agent and outside of its schedule: else if (command == "/q") { jobService.QueryJobDefinition(requestContext, jobid); } Now that we have this code in the program.cs file, we need to compile the project and then copy TfsJobSample.exe and TfsJobSample.pdb to the TFS Tools folder, which is C:\Program Files\Microsoft Team Foundation Server 14.0\Tools. Now open a cmd window as an administrator. Change the directory to the Tools folder and then run your application with the /i command, as follows: Installing the TFS Job Now, you have successfully installed the TFS Job. To uninstall it or force it to be queued, you will need the job ID.
But basically you have to run /u with the job ID to uninstall, like this: Uninstalling the TFS Job You will be following the same approach as prior for queuing, simply specifying the /q command and the job ID. How do I know whether my TFS Job is running? The easiest way to check whether your TFS Job is running or not is to check out the job history table in the configuration database. To do this, you will need the job ID (we spoke about this earlier), which you can obtain by running the following query against the TFS_Configuration database: SELECT JobId FROM Tfs_Configuration.dbo.tbl_JobDefinition WITH ( NOLOCK ) WHERE JobName = 'Comments to Change Set Links Job' With this JobId, we will then run the following lines to query the job history: SElECT * FROM Tfs_Configuration.dbo.tbl_JobHistory WITH (NOLOCK) WHERE JobId = '<place the JobId from previous query here>' This will return you a list of results about the previous times the job was run. If you see that your job has a Result of 6 which is extension not found, then you will need to stop and restart the TFS job agent. You can do this by running the following commands in an Administrator cmd window: net stop TfsJobAgent net start TfsJobAgent Note that when you stop the TFS job agent, any jobs that are currently running will be terminated. Also, they will not get a chance to save their state, which, depending on how they were written, could lead to some unexpected situations when they start again. After the agent has started again, you will see that the Result field is now different as it is a job agent that will know about your job. If you prefer browsing the web to see the status of your jobs, you can browse to the job monitoring page (_oi/_jobMonitoring#_a=history), for example, http://gordon-lappy:8080/tfs/_oi/_jobMonitoring#_a=history. This will give you all the data that you can normally query but with nice graphs and grids. Summary In this article, we looked at how to write, install, uninstall, and queue a TFS Job. You learned that the way we used to install TFS Jobs will no longer work for TFS 2015 because of a change in the Client Object Model for security. Resources for Article: Further resources on this subject: Getting Started with TeamCity[article] Planning for a successful integration[article] Work Item Querying [article]
Read more
  • 0
  • 0
  • 6749