
How-To Tutorials - Programming


Algorithm Analysis

Packt
11 Nov 2016
12 min read
In this article by Prakash and Achyutuni Sri Krishna Rao, authors of the book R Data Structures and Algorithms, we will discuss how an algorithm can be defined as a set of step-by-step instructions which govern the outline of a program that needs to be executed using computational resources. The execution can be in any programming language such as R, Python, or Java. Data is an intricate component of any program, and depending on how that data is organized (its data structure), execution time can vary drastically. That is why the data structure is such a critical component of any good algorithm implementation. (For more resources related to this topic, see here.)

The sorting algorithm, which acts as a connector between the user-defined input and the user-desired output, can be approached in multiple ways:

- Bubble sort and Shell sort, which are simple variants of sorting, but are highly inefficient
- Insertion sort and Selection sort, primarily used for sorting small datasets
- Merge sort, Heap sort, and Quick sort, which are efficient ways of sorting based on the complexities involved in an average system runtime
- Distributed sorts such as counting sort, bucket sort, and radix sort, which can handle both runtime and memory usage

Each of these options can, in turn, handle a particular set of instances more effectively. This leads us to the concept of a "good algorithm". An algorithm can be termed "good" if it possesses attributes such as the following, among many others:

- Shorter running time
- Lower memory utilization
- Simplicity in reading the code
- Generality in accepting inputs

This book will concentrate primarily on running time (time complexity), partly on memory utilization, and on their relationship during program execution.

Introduction

A problem can be approached using multiple algorithms, and each algorithm can be assessed based on certain parameters, such as:

- System runtime
- Memory requirement

However, these parameters are generally affected by external environmental factors, such as:

- Handling of data structures
- System software and hardware configurations
- Style of writing and compiling code
- Programming language

As it is practically impossible to control all external parameters, it becomes difficult to estimate the system runtime of multiple algorithms for performance comparison (ideal scenario analysis). Asymptotic analysis is one such technique which can be used to assess an algorithm's efficiency without actually coding and compiling the entire program. It is a functional form representing a pseudo system runtime based on the size of the input data and the number of operations. It is based on the principle that the growth rate of the input data is directly proportional to the system runtime. For example, in the case of insertion sort, the size represents the length of the input vector, and the number of operations represents the complexity of the sort operations. This analysis can only be used to gauge whether implementing an algorithm is worth considering, rather than to evaluate the relative merits and demerits of algorithms in comparison. The most widely used functional forms of growth rates, based on the size of the input data, are the constant, logarithmic, linear, linearithmic (n log n), quadratic, and exponential forms (the original article lists these in a table). These are also considered pseudo-functional forms for evaluating an algorithm's system runtime.
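To make these growth rates concrete, here is a small illustrative R sketch (ours, not from the book) that plots the common functional forms against input size; the shapes of the curves, rather than the absolute values, are what matter when comparing algorithms:

```r
# Plot the common growth-rate functional forms on log-log axes.
n <- 2^(1:10)                          # input sizes
growth <- data.frame(
  constant     = rep(1, length(n)),
  logarithmic  = log2(n),
  linear       = n,
  linearithmic = n * log2(n),
  quadratic    = n^2
)
matplot(n, growth, type = "l", log = "xy", lty = 1:5, col = 1:5,
        xlab = "Input size (n)", ylab = "Pseudo runtime (operations)")
legend("topleft", legend = names(growth), lty = 1:5, col = 1:5)
```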
Memory management in R

Memory management primarily deals with the administration of available memory and the prediction of additional memory required for the smoother and faster execution of functions. The current section will cover the concept of memory allocation, which deals with the storage of an object in the R environment.

Memory allocation: R allocates memory differently to different objects in its environment. Memory allocation can be determined using the object_size function from the pryr package, which can be installed from the CRAN repository using install.packages("pryr"). The object_size function in pryr is similar to the object.size function in the base package. However, it is more accurate, as it:

- Takes into account the environment size associated with the current object
- Takes into account the shared elements within a given object under consideration

The following are examples of using the object_size function in R to evaluate memory allocation:

> object_size(1)    ## Memory allocated for a single numeric vector
48 B
> object_size("R")  ## Memory allocated for a single character vector
96 B
> object_size(TRUE) ## Memory allocated for a single logical vector
48 B
> object_size(1i)   ## Memory allocated for a single complex vector
56 B

The storage required by an object can be attributed to the following parameters:

- Metadata: The metadata of an object is defined by the type of object used, such as character, integer, logical, and so on. The type can also be helpful during debugging.
- Node pointer: The node pointer maintains the links between the different nodes, and depending on the number of node pointers used, the memory requirement changes. For example, a doubly linked list requires more memory than a singly linked list, as it uses two node pointers to connect to the previous and next nodes.
- Attribute pointer: A pointer keeping references to an object's attributes; this helps to reduce the memory allocated, especially for the data stored by a variable.
- Memory allocation: The length of the vector, representing the currently used space.
- Size: The true allocated length of the vector (its capacity).
- Memory padding: Padding applied to a component; for example, each element begins after an 8-byte boundary.

The object_size() command can also be used to see the inherent memory allocated by each data structure/type (shown as a table in the original article). Let's simulate scenarios with varying vector lengths for different data types such as integer, character, logical, and complex. The simulation is performed for vector lengths from 0 to 60, as follows:

> vec_length <- 0:60
> num_vec_size <- sapply(vec_length, function(x) object_size(seq(x)))
> char_vec_size <- sapply(vec_length, function(x) object_size(rep("a", x)))
> log_vec_size <- sapply(vec_length, function(x) object_size(rep(TRUE, x)))
> comp_vec_size <- sapply(vec_length, function(x) object_size(rep(2i, x)))

Here, num_vec_size computes the memory requirement for each numeric vector of zero to 60 elements; the elements are integers increasing sequentially, as stated in the function. Similarly, incremental memory requirements are calculated for character (char_vec_size), logical (log_vec_size), and complex (comp_vec_size) vectors. The result obtained from the simulation can be plotted as follows:
> par(mfrow = c(2, 2))
> plot(num_vec_size ~ vec_length, xlab = "Numeric seq vector",
+   ylab = "Memory allocated (in bytes)", type = "n")
> abline(h = (c(0, 8, 16, 32, 48, 64, 128) + 40), col = "grey")
> lines(num_vec_size, type = "S")

From the resulting figure (memory allocation based on the length of the vector), it can be observed that the memory allocated to a vector is a function of its length and the object type used. However, the relationship is not linear; rather, it increases in steps. This is because, for better and more consistent performance, R initially requests big blocks of memory from the RAM and manages them internally. These memory blocks are individually assigned to vectors based on the type and the number of elements within. Initially, the memory blocks grow irregularly up to a particular level (128 bytes for numeric/logical vectors, and 176 bytes for character/complex vectors), and later stabilize into small increments of 8 bytes, as can be seen in the plots. Due to these initial memory allocation differences, numeric and logical vectors show similar memory allocation patterns, and complex vectors behave similarly to character vectors.

Memory management helps to run an algorithm efficiently. However, before the execution of any program, we should evaluate it based on its runtime. In the next sub-section, we will discuss the basic concepts involved in obtaining the runtime of any function, and comparing it with similar functions.

System runtime in R

System runtime is essential for benchmarking different algorithms. The process helps us compare different options and pick the best algorithm. The CRAN package microbenchmark is used to evaluate the runtime of any expression/function/code with sub-millisecond accuracy. It is a more accurate replacement for the system.time() function, and all the evaluations are performed in C code to minimize any overhead. The following methods are used to measure the time elapsed:

- The QueryPerformanceCounter interface on Windows
- The clock_gettime API on Linux
- The mach_absolute_time function on macOS
- The gethrtime function on Solaris

In our current example, we shall be using the mtcars data from the datasets package. This data was extracted from the 1974 Motor Trend US magazine, and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973-74 models). Now, we would like to perform an operation in which a specific numeric attribute (mpg, miles per gallon) is averaged over the corresponding unique values of an integer attribute (carb, the number of carburetors). This can be performed in multiple ways, such as aggregate, group_by, by, split, ddply (plyr), tapply, data.table, dplyr, sqldf, and so on. In our current scenario, we have used the following four ways:

- The aggregate function: aggregate(mpg~carb, data=mtcars, mean)
- ddply from the plyr package: ddply(mtcars, .(carb), function(x) mean(x$mpg))
- The data.table format: mtcars_tb[, mean(mpg), by=carb]
- The group_by function: summarize(group_by(mtcars, carb), mean(mpg))

Then, microbenchmark is used to determine the performance of each of the four ways mentioned in the preceding list. Here, we will be evaluating each expression 1,000 times.
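Before running the benchmark, note that the article never defines mtcars_tb. The following setup sketch rests on our assumption that mtcars_tb is simply the data.table copy of mtcars; it loads the required packages and shows the four aggregations side by side:

```r
# Setup for the benchmark below; mtcars_tb is assumed to be the
# data.table version of mtcars, as the article does not define it.
library(plyr)        # load before dplyr so dplyr verbs take precedence
library(dplyr)
library(data.table)

data(mtcars)
mtcars_tb <- as.data.table(mtcars)

# The four equivalent aggregations, each averaging mpg per carburetor count:
aggregate(mpg ~ carb, data = mtcars, mean)
ddply(mtcars, .(carb), function(x) mean(x$mpg))
mtcars_tb[, mean(mpg), by = carb]
summarize(group_by(mtcars, carb), mean(mpg))
```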
> library(microbenchmark)
> MB_res <- microbenchmark(
+   Aggregate_func = aggregate(mpg~carb, data=mtcars, mean),
+   Ddply_func = ddply(mtcars, .(carb), function(x) mean(x$mpg)),
+   Data_table_func = mtcars_tb[, mean(mpg), by=carb],
+   Group_by_func = summarize(group_by(mtcars, carb), mean(mpg)),
+   times = 1000
+ )

The output table is as follows:

> MB_res
Unit: microseconds
            expr      min        lq      mean   median        uq      max neval
  Aggregate_func  851.489  913.8015 1001.9007  944.775 1000.4905 6094.209  1000
      Ddply_func 1370.519 1475.1685 1579.6123 1517.322 1575.7855 6598.578  1000
 Data_table_func  493.739  552.7540  610.7791  577.495  621.6635 3125.179  1000
   Group_by_func  932.129 1008.5540 1095.4193 1033.113 1076.1825 4279.435  1000

The results can also be plotted (the distribution of time, in microseconds, for 1,000 iterations of each type of aggregate operation):

> library(ggplot2)
> autoplot(MB_res)

Among these four expressions, and for the given dataset, data.table performed the aggregation in the least time compared to the others. However, expressions need to be tested under scenarios with a high number of observations, a high number of attributes, or both, before finalizing the best operator.

Best, worst, and average cases

Based on its performance in terms of system runtime, code can be classified under the best, worst, or average category for a particular algorithm. Let's consider a sorting algorithm to understand this in detail. A sorting algorithm arranges a numeric vector in ascending order, wherein the output vector has the smallest number as its first element and the largest number as its last element, with the intermediate elements in increasing order. In the insertion sort algorithm, elements are arranged by shifting positions: each element is inserted, one at a time, into an already-sorted portion of the vector, with larger elements shifted towards the end.

Now, let's define best, worst, and average-case scenarios for the insertion sort algorithm (a sketch of the algorithm follows this list):

- Best case: A best case is one which requires the least running time. For example, a vector with all elements already arranged in increasing order requires the least amount of time for sorting.
- Worst case: A worst case is one which requires the maximum possible runtime to complete sorting a vector. For example, a vector with all elements sorted in decreasing order requires the most time for sorting.
- Average case: An average case is one which requires intermediate time to complete sorting a vector, for example, a vector with half the elements sorted in increasing order and the rest in decreasing order. An average case is assessed using multiple vectors of differently arranged elements.
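For reference, here is a minimal insertion sort written in R (our own illustration, not code from the book), with calls demonstrating the three cases:

```r
# A minimal insertion sort: each element is inserted into the sorted prefix,
# with larger elements shifted one position to the right.
insertion_sort <- function(v) {
  for (i in seq_along(v)[-1]) {
    key <- v[i]
    j <- i - 1
    while (j >= 1 && v[j] > key) {
      v[j + 1] <- v[j]    # shift the larger element towards the end
      j <- j - 1
    }
    v[j + 1] <- key
  }
  v
}

insertion_sort(c(3, 1, 2))   # mixed input
insertion_sort(1:5)          # best case: already sorted, no shifts needed
insertion_sort(5:1)          # worst case: reverse order, maximal shifting
```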
Generally, best-case scenarios are not used to benchmark an algorithm, since they evaluate it most optimistically. However, if the probability of the best case occurring is high, then algorithms can be compared using best-case scenarios. Worst-case scenarios, similarly, evaluate the algorithm most pessimistically; they are used to benchmark algorithms deployed in real-time applications, such as railway network control, air traffic control, and the like. Sometimes, when we are not aware of the input data distribution, it is safest to assess the performance of an algorithm based on the worst-case scenario. Most of the time, the average-case scenario is used as a representative measure of an algorithm's performance; however, this is valid only when we are aware of the input data distribution. Average-case scenarios may not evaluate the algorithm properly if the distribution of the input data is skewed. In the case of sorting, if most of the input vectors are arranged in descending order, the average-case scenario may not be the best form of evaluation. In a nutshell, real-time application scenarios, along with the input data distribution, are the major criteria for analyzing algorithms based on best, worst, and average cases.

Summary

This article summarized the basic concepts and nuances of evaluating algorithms in R. We covered the conceptual theory of memory management and system runtime in R, and discussed the best, worst, and average-case scenarios used to evaluate the performance of algorithms.

Resources for Article:

Further resources on this subject:

- Reconstructing 3D Scenes [article]
- Raster Calculations [article]
- Remote Sensing and Histogram [article]


Why Do We Need Design Patterns?

Packt
10 Nov 2016
16 min read
In this article by Praseed Pai and Shine Xavier, authors of the book .NET Design Patterns, we will try to understand the necessity of choosing a pattern-based approach to software development. We start with some principles of software development which one might find useful while undertaking large projects. The working example in the article starts with a requirements specification and progresses towards a preliminary implementation. We will then try to iteratively improve the solution using patterns and idioms, and come up with a good design that supports a well-defined programming interface. In this process, we will learn about some software development principles one can adhere to, including the following:

- SOLID principles for OOP
- Three key uses of design patterns
- Arlow/Neustadt archetype patterns
- Entity, value, and data transfer objects
- Leveraging the .NET Reflection API for a plugin architecture

(For more resources related to this topic, see here.)

Some principles of software development

Writing quality production code consistently is not easy without some foundational principles under your belt. The purpose of this section is to whet the developer's appetite; towards the end, some references are given for detailed study, as detailed coverage of these principles warrants a separate book of its own. The authors have tried to assimilate the following key principles of software development, which help one write quality code:

- KISS: Keep it simple, Stupid
- DRY: Don't repeat yourself
- YAGNI: You aren't gonna need it
- Low coupling: Minimize coupling between classes
- SOLID principles: Principles for better OOP

William of Ockham framed the maxim Keep it simple, Stupid (KISS); it is also called the law of parsimony. In programming terms, it can be translated as "writing code in a straightforward manner, focusing on a particular solution that solves the problem at hand". This maxim is important because, most often, developers fall into the trap of writing code in a generic manner for unwarranted extensibility. Even though it initially looks attractive, things slowly go out of bounds. The accidental complexity introduced into the code base for catering to improbable scenarios often reduces readability and maintainability. The KISS principle can be applied to every human endeavor; learn more about it by consulting the Web.

Don't repeat yourself (DRY) is a maxim which most programmers often forget while implementing their domain logic. Most often, in a collaborative development scenario, code gets duplicated inadvertently due to a lack of communication and proper design specifications. This bloats the code base, induces subtle bugs, and makes things really difficult to change. By following the DRY maxim at all stages of development, we can avoid additional effort and keep the code consistent. The opposite of DRY is write everything twice (WET).

You aren't gonna need it (YAGNI) is a principle that complements the KISS axiom. It serves as a warning for people who try to write code in the most general manner, anticipating changes right from the word go. Too often, in practice, most of this code is never used, and it becomes a potential source of code smells.

While writing code, one should try to make sure that there are no hard-coded references to concrete classes. It is advisable to program to an interface as opposed to an implementation. This is a key principle which many patterns use to provide behavior acquisition at runtime.
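As a hedged illustration of programming to an interface (the type names below are ours, not from the book), consider:

```csharp
// The client depends only on the abstraction; the concrete implementation
// is supplied from outside (constructor injection), so behavior can be
// swapped at runtime without touching the client.
public interface ITaxCalculator
{
    double Compute(double income);
}

public class FlatRateCalculator : ITaxCalculator
{
    public double Compute(double income) => income * 0.1;
}

public class Client
{
    private readonly ITaxCalculator _calculator;

    public Client(ITaxCalculator calculator) => _calculator = calculator;

    public double Run(double income) => _calculator.Compute(income);
}
```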
A dependency injection framework could be used to reduce coupling between classes. SOLID is a mnemonic acronym that embodies the following five principles for writing better object-oriented software:

1. Single Responsibility Principle (SRP): A class should have only one responsibility. If it is doing more than one unrelated thing, we need to split the class.
2. Open/Closed Principle (OCP): A class should be open for extension, but closed for modification.
3. Liskov Substitution Principle (LSP): Named after Barbara Liskov, a Turing Award laureate, who postulated that a subclass (derived class) should be able to substitute any superclass (base class) reference without affecting the functionality. Even though it sounds obvious, most implementations have quirks which violate this principle.
4. Interface Segregation Principle (ISP): It is more desirable to have multiple interfaces for a class (such classes can also be called components) than one uber-interface that forces the implementation of all methods (both relevant and non-relevant to the solution context).
5. Dependency Inversion Principle (DIP): This principle is very useful for framework design. In the case of frameworks, the client code is invoked by server code, as opposed to the usual process of the client invoking the server. The main principle here is that abstractions should not depend upon details; rather, details should depend upon abstractions. This is also called the "Hollywood Principle" (don't call us, we will call you).

The authors consider the preceding five principles primarily as a verification mechanism, which will be demonstrated by verifying the ensuing case study implementations for violations of these principles. Karl Seguin has written an e-book titled Foundations of Programming – Building Better Software, which covers most of what has been outlined here; read his book to gain an in-depth understanding of most of these topics. The SOLID principles are well covered on the Wikipedia page on the subject, which can be retrieved from https://en.wikipedia.org/wiki/SOLID_(object-oriented_design). Robert Martin's Agile Principles, Patterns, and Practices in C# is a definitive book for learning about SOLID, as Robert Martin himself is the creator of these principles, even though Michael Feathers coined the acronym.
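To ground one of these principles, the following is a hedged Open/Closed Principle sketch (illustrative types of our own, not from the book's case study): new discount rules are added as new classes, without modifying the consuming code.

```csharp
using System.Collections.Generic;

// Open for extension: add a new IDiscountRule implementation.
// Closed for modification: Checkout never changes when rules are added.
public interface IDiscountRule
{
    decimal Apply(decimal amount);
}

public class SeniorDiscount : IDiscountRule
{
    public decimal Apply(decimal amount) => amount * 0.9m;
}

public class Checkout
{
    public decimal Total(decimal amount, IEnumerable<IDiscountRule> rules)
    {
        foreach (var rule in rules)
            amount = rule.Apply(amount);   // each rule transforms the running total
        return amount;
    }
}
```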
Why do we need patterns?

According to the authors, the three key advantages of pattern-oriented software development that stand out are as follows:

- A language/platform-agnostic way to communicate about software artifacts
- A tool for refactoring initiatives (targets for refactoring)
- Better API design

With the advent of the pattern movement, the software development community got a canonical language to communicate about software design, architecture, and implementation. Software development is a craft with trade-offs attached to each strategy, and there are multiple ways to develop software. The various pattern catalogs brought some conceptual unification to this cacophony. Most developers around the world who are worth their salt can understand and speak this language, and we believe you will be able to do the same by the end of the article. Fancy yourself stating the following about your recent implementation:

"For our tax computation example, we have used the command pattern to handle the computation logic. The commands (handlers) are configured using an XML file, and a factory method takes care of the instantiation of classes on the fly using lazy loading. We cache the commands, and avoid instantiation of more objects by imposing singleton constraints on the invocation. We support the prototype pattern, where command objects can be cloned. The command objects have a base implementation, where concrete command objects use the template method pattern to override methods where necessary. The command objects are implemented using the design-by-contract idiom. The whole mechanism is encapsulated using a Façade class, which acts as an API layer for the application logic. The application logic uses entity objects (reference objects) to store the taxable entities; attributes like tax parameters are stored as value objects. We use data transfer objects (DTOs) to transfer the data from the application layer to the computational layer. Arlow/Neustadt-based archetype patterns are the unit of structuring the tax computation logic."

For some developers, the preceding language/platform-independent description of the software being developed is enough to understand the approach taken. This boosts developer productivity (during all phases of the SDLC, including development, maintenance, and support), as the developers are able to form a good mental model of the code base. Without pattern catalogs, such succinct descriptions of a design or implementation would be impossible.

In an Agile software development scenario, we develop software in an iterative fashion. Once a module reaches a certain maturity, developers refactor their code, and while refactoring a module, patterns do help in organizing the logic. The case study given next will help you understand the rationale behind "patterns as refactoring targets". APIs based on well-defined patterns are easy to use and impose less cognitive load on programmers. The success of the ASP.NET MVC framework, NHibernate, and the APIs for writing HTTP modules and handlers in the ASP.NET pipeline are a few testimonies to this.

Personal income tax computation: a case study

Rather than explaining the advantages of patterns in the abstract, the following example will help us see things in action. Computation of annual income tax is a well-known problem domain across the globe; we have chosen an application domain which is well known so that we can focus on the software development issues. The application should receive inputs regarding the demographic profile (UID, Name, Age, Sex, Location) of a citizen and the income details (Basic, DA, HRA, CESS, Deductions) to compute his or her tax liability. The system should have discriminants based on the demographic profile, with separate logic for senior citizens, juveniles, disabled people, old females, and others. By discriminant, we mean that demographic parameters like age, sex, and location should determine the category to which a person belongs, so that category-specific computation can be applied to that individual. As a first iteration, we will implement the logic for the senior citizen and ordinary citizen categories. After a preliminary discussion, our developer created a prototype screen (shown as an image in the original article).

Archetypes and the business archetype pattern

The legendary Swiss psychologist Carl Gustav Jung created the concept of archetypes to explain fundamental entities which arise from a common repository of human experiences. The concept of archetypes percolated to the software industry from psychology.
The Arlow/Neustadt patterns describe business archetype patterns like Party, Customer Call, Product, Money, Unit, Inventory, and so on. Another example is the Apache Maven archetype, which helps us generate projects of different natures, like J2EE apps, Eclipse plugins, OSGi projects, and so on. Microsoft patterns & practices describes archetypes for targeting builds like web applications, rich client applications, mobile applications, and service applications. Various domain-specific archetypes can exist in their respective contexts as organizing and structuring mechanisms. In our case, we will define some archetypes which are common in the taxation domain. Some of the key archetypes in this domain are:

1. SeniorCitizenFemale: Tax payers who are female, and above the age of 60 years
2. SeniorCitizen: Tax payers who are male, and above the age of 60 years
3. OrdinaryCitizen: Tax payers who are male/female, and above 18 years of age
4. DisabledCitizen: Tax payers who have any disability
5. MilitaryPersonnel: Tax payers who are military personnel
6. Juveniles: Tax payers whose age is less than 18 years

We will use the demographic parameters as discriminants to find the archetype which corresponds to the entity. The whole idea of introducing archetypes is to organize the tax computation logic around them: once we are able to resolve the archetype, it is easy to locate and delegate the computation corresponding to it.

Entity, value, and data transfer objects

We are going to create a class which represents a citizen. Since a citizen needs to be uniquely identified, we are going to create an entity object, also called a reference object (from the DDD catalog). The universal identifier (UID) of an entity object is the handle which an application refers to. Entity objects are not identified by their attributes, as there can be two people with the same name; the ID uniquely identifies an entity object. The definition of an entity object is given as follows:

public class TaxableEntity
{
    public int Id { get; set; }
    public string Name { get; set; }
    public int Age { get; set; }
    public char Sex { get; set; }
    public string Location { get; set; }
    public TaxParamVO taxparams { get; set; }
}

In the preceding class definition, Id uniquely identifies the entity object. TaxParamVO is a value object (from the DDD catalog) associated with the entity object. Value objects do not have a conceptual identity; they describe some attributes of things (entities). The definition of TaxParamVO is given as follows:

public class TaxParamVO
{
    public double Basic { get; set; }
    public double DA { get; set; }
    public double HRA { get; set; }
    public double Allowance { get; set; }
    public double Deductions { get; set; }
    public double Cess { get; set; }
    public double TaxLiability { get; set; }
    public bool Computed { get; set; }
}

Ever since Smalltalk, Model-View-Controller (MVC) has been the most dominant paradigm for structuring applications. The application is split into a model layer (which mostly deals with data), a view layer (which acts as a display layer), and a controller (to mediate between the two). In the web development scenario, these are physically partitioned across machines, and to transfer data between layers, the J2EE pattern catalog identified the DTO (data transfer object). The DTO is defined as follows:

public class TaxDTO
{
    public int id { get; set; }
    public TaxParamVO taxparams { get; set; }
}
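A hedged usage sketch (the sample values are purely illustrative) showing how the entity, value object, and DTO fit together:

```csharp
// Build the entity with its associated value object...
var citizen = new TaxableEntity
{
    Id = 1, Name = "John", Age = 65, Sex = 'M', Location = "WB",
    taxparams = new TaxParamVO { Basic = 480000, DA = 98000, HRA = 150000 }
};

// ...and copy only what the computation layer needs into the DTO.
var dto = new TaxDTO { id = citizen.Id, taxparams = citizen.taxparams };
```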
If the layering exists within the same process, we can transfer these objects as-is. If the layers are partitioned across processes or systems, we can use XML or JSON serialization to transfer objects between them.

A computation engine

We need to separate UI processing, input validation, and computation to create a solution which can be extended to handle additional requirements. The computation engine will execute different logic depending upon the command received; the GoF command pattern is leveraged for this. The command pattern consists of four constituents:

- Command object
- Parameters
- Command dispatcher
- Client

The command object's interface has an Execute method. The parameters to the command object are passed through a bag: the client invokes the command object by passing the parameters through a bag to be consumed by the command dispatcher. The parameters are passed to the command object through the following data structure:

public class COMPUTATION_CONTEXT
{
    private Dictionary<string, object> symbols = new Dictionary<string, object>();

    public void Put(string k, object value) { symbols.Add(k, value); }

    public object Get(string k) { return symbols[k]; }
}

The ComputationCommand interface, which all command objects implement, has only one Execute method, which takes the bag as a parameter; the COMPUTATION_CONTEXT data structure acts as the bag here:

public interface ComputationCommand
{
    bool Execute(COMPUTATION_CONTEXT ctx);
}

Since we have already implemented a command interface and a bag to transfer the parameters, it is time to implement a command object. For the sake of simplicity, we will implement two commands, in which we hardcode the tax liability:

public class SeniorCitizenCommand : ComputationCommand
{
    public bool Execute(COMPUTATION_CONTEXT ctx)
    {
        TaxDTO td = (TaxDTO)ctx.Get("tax_cargo");
        //---- Instead of computation, we are assigning
        //---- a constant tax for each archetype
        td.taxparams.TaxLiability = 1000;
        td.taxparams.Computed = true;
        return true;
    }
}

public class OrdinaryCitizenCommand : ComputationCommand
{
    public bool Execute(COMPUTATION_CONTEXT ctx)
    {
        TaxDTO td = (TaxDTO)ctx.Get("tax_cargo");
        //---- Instead of computation, we are assigning
        //---- a constant tax for each archetype
        td.taxparams.TaxLiability = 1500;
        td.taxparams.Computed = true;
        return true;
    }
}

The commands will be invoked by a CommandDispatcher object, which takes an archetype string and a COMPUTATION_CONTEXT object. The CommandDispatcher acts as an API layer for the application:

class CommandDispatcher
{
    public static bool Dispatch(string archetype, COMPUTATION_CONTEXT ctx)
    {
        if (archetype == "SeniorCitizen")
        {
            SeniorCitizenCommand cmd = new SeniorCitizenCommand();
            return cmd.Execute(ctx);
        }
        else if (archetype == "OrdinaryCitizen")
        {
            OrdinaryCitizenCommand cmd = new OrdinaryCitizenCommand();
            return cmd.Execute(ctx);
        }
        else
        {
            return false;
        }
    }
}
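Before wiring the dispatcher to the UI, here is a minimal hedged sketch of invoking it directly (the values are illustrative):

```csharp
var ctx = new COMPUTATION_CONTEXT();
ctx.Put("tax_cargo", new TaxDTO { id = 1, taxparams = new TaxParamVO() });

// Dispatch by archetype name; the command mutates the DTO inside the bag.
bool ok = CommandDispatcher.Dispatch("SeniorCitizen", ctx);
if (ok)
{
    var result = (TaxDTO)ctx.Get("tax_cargo");
    System.Console.WriteLine(result.taxparams.TaxLiability);   // 1000
}
```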
The application-to-engine communication

The data from the application UI, be it web or desktop, has to flow to the computation engine. The following ViewHandler routine shows how data retrieved from the application UI is passed to the engine, via the command dispatcher, by a client:

public static void ViewHandler(TaxCalcForm tf)
{
    TaxableEntity te = GetEntityFromUI(tf);
    if (te == null)
    {
        ShowError();
        return;
    }
    string archetype = ComputeArchetype(te);
    COMPUTATION_CONTEXT ctx = new COMPUTATION_CONTEXT();
    TaxDTO td = new TaxDTO { id = te.Id, taxparams = te.taxparams };
    ctx.Put("tax_cargo", td);
    bool rs = CommandDispatcher.Dispatch(archetype, ctx);
    if (rs)
    {
        TaxDTO temp = (TaxDTO)ctx.Get("tax_cargo");
        tf.Liabilitytxt.Text = Convert.ToString(temp.taxparams.TaxLiability);
        tf.Refresh();
    }
}

At this point, imagine that a change in requirements has been received from the stakeholders: we now need to support tax computation for new categories. Initially, we had different computations for senior citizens and ordinary citizens; now we need to add new archetypes. At the same time, to make the software extensible (loosely coupled) and maintainable, it would be ideal if we could support new archetypes in a configurable manner, as opposed to recompiling the application for every new archetype owing to concrete references. The CommandDispatcher object does not scale well to handle additional archetypes: we need to change the assembly whenever a new archetype is included, as the tax computation logic varies for each archetype. We need a pluggable architecture to add or remove archetypes at will.

A plugin system to make the system extensible

Writing system logic without impacting the application warrants a mechanism for loading a class on the fly. Luckily, the .NET Reflection API provides a mechanism to load a class during runtime and invoke methods within it. A developer worth his salt should learn the Reflection API in order to write systems which change dynamically; in fact, most technologies like ASP.NET, Entity Framework, .NET Remoting, and WCF work because of the availability of the Reflection API in the .NET stack. Henceforth, we will be using an XML configuration file to specify our tax computation logic. A sample XML file is given next:

<?xml version="1.0"?>
<plugins>
    <plugin archetype="OrdinaryCitizen" command="TaxEngine.OrdinaryCitizenCommand"/>
    <plugin archetype="SeniorCitizen" command="TaxEngine.SeniorCitizenCommand"/>
</plugins>

The contents of the XML file can be read very easily using LINQ to XML. We generate a Dictionary object with the following code snippet:

private Dictionary<string, string> LoadData(string xmlfile)
{
    return XDocument.Load(xmlfile)
        .Descendants("plugins")
        .Descendants("plugin")
        .ToDictionary(p => p.Attribute("archetype").Value,
                      p => p.Attribute("command").Value);
}
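The article stops short of showing how this dictionary replaces the hard-coded dispatcher. The following is a hedged sketch of our own (the method and variable names are assumptions, not from the book) of how the archetype-to-type mapping could drive reflection-based instantiation:

```csharp
using System;
using System.Collections.Generic;

public static bool DynamicDispatch(
    Dictionary<string, string> plugins, string archetype, COMPUTATION_CONTEXT ctx)
{
    if (!plugins.TryGetValue(archetype, out var typeName))
        return false;                          // unknown archetype

    var type = Type.GetType(typeName);         // resolve the command type by name
    if (type == null)
        return false;                          // type not found in loaded assemblies

    var cmd = (ComputationCommand)Activator.CreateInstance(type);
    return cmd.Execute(ctx);                   // no concrete references needed
}
```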
Summary

In this article, we covered quite a lot of ground in understanding why pattern-oriented software development is a good way to develop modern software. We started by citing some key principles, and progressed to demonstrate their applicability by iteratively evolving an application which is extensible and resilient to change.

Resources for Article:

Further resources on this subject:

- Debugging Your .NET Application [article]
- JSON with JSON.Net [article]
- Using ASP.NET Controls in SharePoint [article]


Data Access Layer

Packt
09 Nov 2016
13 min read
In this article by Alexander Zaytsev, author of NHibernate 4.0 Cookbook, we will cover the following topics:

- Transaction auto-wrapping for the data access layer
- Setting up an NHibernate repository
- Using Named Queries in the data access layer

(For more resources related to this topic, see here.)

Introduction

There are two styles of data access layer common in today's applications: repositories and data access objects. In reality, the distinction between the two has become quite blurred, but in theory, it is something like this:

- A repository should act like an in-memory collection. Entities are added to and removed from the collection, and its contents can be enumerated. Queries are typically handled by sending query specifications to the repository.
- A DAO (data access object) is simply an abstraction of an application's data access. Its purpose is to hide the implementation details of the database access from the consuming code.

The first recipe shows the beginnings of a typical data access object. The remaining recipes show how to set up a repository-based data access layer with NHibernate's various APIs.

Transaction auto-wrapping for the data access layer

In this recipe, we'll show you how to set up the data access layer to wrap all data access in NHibernate transactions automatically.

How to do it...

1. Create a new class library named Eg.Core.Data.
2. Install NHibernate to Eg.Core.Data using the NuGet Package Manager Console.
3. Add the following two DAO classes:

public class DataAccessObject<T, TId>
    where T : Entity<TId>
{
    private readonly ISessionFactory _sessionFactory;

    private ISession session
    {
        get { return _sessionFactory.GetCurrentSession(); }
    }

    public DataAccessObject(ISessionFactory sessionFactory)
    {
        _sessionFactory = sessionFactory;
    }

    public T Get(TId id)
    {
        return WithinTransaction(() => session.Get<T>(id));
    }

    public T Load(TId id)
    {
        return WithinTransaction(() => session.Load<T>(id));
    }

    public void Save(T entity)
    {
        WithinTransaction(() => session.SaveOrUpdate(entity));
    }

    public void Delete(T entity)
    {
        WithinTransaction(() => session.Delete(entity));
    }

    private TResult WithinTransaction<TResult>(Func<TResult> func)
    {
        if (!session.Transaction.IsActive)
        {
            // Wrap in transaction
            TResult result;
            using (var tx = session.BeginTransaction())
            {
                result = func.Invoke();
                tx.Commit();
            }
            return result;
        }
        // Don't wrap
        return func.Invoke();
    }

    private void WithinTransaction(Action action)
    {
        WithinTransaction<bool>(() =>
        {
            action.Invoke();
            return false;
        });
    }
}

public class DataAccessObject<T> : DataAccessObject<T, Guid>
    where T : Entity
{
}

How it works...

NHibernate requires that all data access occur inside an NHibernate transaction (remember, the ambient transaction created by TransactionScope is not a substitute for an NHibernate transaction). This recipe shows an explicit approach: to ensure that at least all of our data access layer calls are wrapped in transactions, we create a private WithinTransaction method that accepts a delegate consisting of some data access methods, such as session.Save or session.Get. This WithinTransaction method first checks whether the session has an active transaction. If it does, the delegate is invoked immediately; if it doesn't, a new NHibernate transaction is created, the delegate is invoked, and finally the transaction is committed. If the data access method throws an exception, the transaction is rolled back automatically as the exception bubbles up through the using block.
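As a hedged usage sketch (the Product entity and identifier are illustrative, not from the recipe), the DAO would be consumed like this, with each call transparently wrapped in its own transaction when none is active:

```csharp
var dao = new DataAccessObject<Product>(sessionFactory);

var product = dao.Get(productId);   // wrapped in a transaction automatically
product.Name = "Renamed product";
dao.Save(product);                  // wrapped in a second transaction
```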
There's more...

This transactional auto-wrapping can also be set up using SessionWrapper from the unofficial NHibernate AddIns project at https://bitbucket.org/fabiomaulo/unhaddins. This class wraps a standard NHibernate session; by default, it throws an exception when the session is used without an NHibernate transaction, but it can be configured to check for and create a transaction automatically, much in the same way shown here.

See also

- Setting up an NHibernate repository

Setting up an NHibernate repository

Many developers prefer the repository pattern over data access objects. In this recipe, we'll show you how to set up the repository pattern with NHibernate.

How to do it...

1. Create a new, empty class library project named Eg.Core.Data.
2. Add a reference to the Eg.Core project.
3. Add the following IRepository interface:

public interface IRepository<T> : IEnumerable<T>
    where T : Entity
{
    void Add(T item);
    bool Contains(T item);
    int Count { get; }
    bool Remove(T item);
}

4. Create a new, empty class library project named Eg.Core.Data.Impl.
5. Add references to the Eg.Core and Eg.Core.Data projects.
6. Add a new abstract class named NHibernateBase using the following code:

public abstract class NHibernateBase
{
    protected readonly ISessionFactory _sessionFactory;

    protected virtual ISession session
    {
        get { return _sessionFactory.GetCurrentSession(); }
    }

    public NHibernateBase(ISessionFactory sessionFactory)
    {
        _sessionFactory = sessionFactory;
    }

    protected virtual TResult WithinTransaction<TResult>(Func<TResult> func)
    {
        if (!session.Transaction.IsActive)
        {
            // Wrap in transaction
            TResult result;
            using (var tx = session.BeginTransaction())
            {
                result = func.Invoke();
                tx.Commit();
            }
            return result;
        }
        // Don't wrap
        return func.Invoke();
    }

    protected virtual void WithinTransaction(Action action)
    {
        WithinTransaction<bool>(() =>
        {
            action.Invoke();
            return false;
        });
    }
}

7. Add a new class named NHibernateRepository using the following code:

public class NHibernateRepository<T> : NHibernateBase, IRepository<T>
    where T : Entity
{
    public NHibernateRepository(ISessionFactory sessionFactory)
        : base(sessionFactory)
    {
    }

    public void Add(T item)
    {
        WithinTransaction(() => session.Save(item));
    }

    public bool Contains(T item)
    {
        if (item.Id == default(Guid))
            return false;
        return WithinTransaction(() => session.Get<T>(item.Id)) != null;
    }

    public int Count
    {
        get { return WithinTransaction(() => session.Query<T>().Count()); }
    }

    public bool Remove(T item)
    {
        WithinTransaction(() => session.Delete(item));
        return true;
    }

    public IEnumerator<T> GetEnumerator()
    {
        return WithinTransaction(() => session.Query<T>()
            .Take(1000).GetEnumerator());
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return WithinTransaction(() => GetEnumerator());
    }
}

How it works...

The repository pattern, as explained at http://martinfowler.com/eaaCatalog/repository.html, has two key features:

- It behaves as an in-memory collection
- Query specifications are submitted to the repository for satisfaction

In this recipe, we are concerned only with the first feature, behaving as an in-memory collection. The remaining recipes in this article build on this base, and show various methods for satisfying the second point. Because our repository should act like an in-memory collection, it makes sense that our IRepository<T> interface should resemble ICollection<T>. Our NHibernateBase class provides both contextual session management and the automatic transaction wrapping explained in the previous recipe; NHibernateRepository simply implements the members of IRepository<T>.
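A hedged sketch of the repository in use (the Book entity and its Name property are assumed from the cookbook's examples):

```csharp
IRepository<Book> books = new NHibernateRepository<Book>(sessionFactory);

books.Add(new Book { Name = "NHibernate 4.0 Cookbook" });   // runs in a transaction
Console.WriteLine(books.Count);                             // Count query, also wrapped
foreach (var book in books)                                 // enumerates up to 1000 items
    Console.WriteLine(book.Name);
```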
There's more...

The repository pattern reduces data access to its absolute simplest form, but this simplification comes with a price: we lose much of the power of NHibernate behind an abstraction layer. Our application must either do without even basic session methods like Merge, Refresh, and Load, or allow them to leak through the abstraction.

See also

- Transaction auto-wrapping for the data access layer
- Using Named Queries in the data access layer

Using Named Queries in the data access layer

Named Queries encapsulated in query objects are a powerful combination. In this recipe, we'll show you how to use Named Queries with your data access layer.

Getting ready

To complete this recipe you will need the Common Service Locator from Microsoft patterns & practices; the documentation and source code can be found at http://commonservicelocator.codeplex.com. Complete the previous recipe, Setting up an NHibernate repository. Include the Eg.Core.Data.Impl assembly as an additional mapping assembly in your test project's App.config with the following XML:

<mapping assembly="Eg.Core.Data.Impl"/>

How to do it...

1. In the Eg.Core.Data project, add a folder for the Queries namespace.
2. Add the following IQuery interfaces:

public interface IQuery
{
}

public interface IQuery<TResult> : IQuery
{
    TResult Execute();
}

3. Add the following IQueryFactory interface:

public interface IQueryFactory
{
    TQuery CreateQuery<TQuery>() where TQuery : IQuery;
}

4. Change the IRepository interface to implement the IQueryFactory interface, as shown in the following code:

public interface IRepository<T> : IEnumerable<T>, IQueryFactory
    where T : Entity
{
    void Add(T item);
    bool Contains(T item);
    int Count { get; }
    bool Remove(T item);
}

5. In the Eg.Core.Data.Impl project, change the NHibernateRepository constructor and add the _queryFactory field, as shown in the following code:

private readonly IQueryFactory _queryFactory;

public NHibernateRepository(
    ISessionFactory sessionFactory,
    IQueryFactory queryFactory)
    : base(sessionFactory)
{
    _queryFactory = queryFactory;
}

6. Add the following method to NHibernateRepository:

public TQuery CreateQuery<TQuery>() where TQuery : IQuery
{
    return _queryFactory.CreateQuery<TQuery>();
}

7. In the Eg.Core.Data.Impl project, add a folder for the Queries namespace.
8. Install the Common Service Locator using the NuGet Package Manager Console, with the following command:
Install-Package CommonServiceLocator

9. In the Queries namespace, add this QueryFactory class:

public class QueryFactory : IQueryFactory
{
    private readonly IServiceLocator _serviceLocator;

    public QueryFactory(IServiceLocator serviceLocator)
    {
        _serviceLocator = serviceLocator;
    }

    public TQuery CreateQuery<TQuery>() where TQuery : IQuery
    {
        return _serviceLocator.GetInstance<TQuery>();
    }
}

10. Add the following NHibernateQueryBase class:

public abstract class NHibernateQueryBase<TResult>
    : NHibernateBase, IQuery<TResult>
{
    protected NHibernateQueryBase(ISessionFactory sessionFactory)
        : base(sessionFactory)
    {
    }

    public abstract TResult Execute();
}

11. Add an empty INamedQuery interface, as shown in the following code:

public interface INamedQuery
{
    string QueryName { get; }
}

12. Add a NamedQueryBase class, as shown in the following code:

public abstract class NamedQueryBase<TResult>
    : NHibernateQueryBase<TResult>, INamedQuery
{
    protected NamedQueryBase(ISessionFactory sessionFactory)
        : base(sessionFactory)
    {
    }

    public override TResult Execute()
    {
        var nhQuery = GetNamedQuery();
        return WithinTransaction(() => Execute(nhQuery));
    }

    protected abstract TResult Execute(IQuery query);

    protected virtual IQuery GetNamedQuery()
    {
        var nhQuery = session.GetNamedQuery(QueryName);
        SetParameters(nhQuery);
        return nhQuery;
    }

    protected abstract void SetParameters(IQuery nhQuery);

    public virtual string QueryName
    {
        get { return GetType().Name; }
    }
}

13. In Eg.Core.Data.Impl.Test, add a test fixture named QueryTests inherited from NHibernateFixture.
14. Add the following test and three helper methods:

[Test]
public void NamedQueryCheck()
{
    var errors = new StringBuilder();
    var queryObjectTypes = GetNamedQueryObjectTypes();
    var mappedQueries = GetNamedQueryNames();
    foreach (var queryType in queryObjectTypes)
    {
        var query = GetQuery(queryType);
        if (!mappedQueries.Contains(query.QueryName))
        {
            errors.AppendFormat(
                "Query object {0} references non-existent " +
                "named query {1}.",
                queryType, query.QueryName);
            errors.AppendLine();
        }
    }
    if (errors.Length != 0)
        Assert.Fail(errors.ToString());
}

private IEnumerable<Type> GetNamedQueryObjectTypes()
{
    var namedQueryType = typeof(INamedQuery);
    var queryImplAssembly = typeof(BookWithISBN).Assembly;
    var types = from t in queryImplAssembly.GetTypes()
                where namedQueryType.IsAssignableFrom(t)
                    && t.IsClass
                    && !t.IsAbstract
                select t;
    return types;
}

private IEnumerable<string> GetNamedQueryNames()
{
    var nhCfg = NHConfigurator.Configuration;
    var mappedQueries = nhCfg.NamedQueries.Keys
        .Union(nhCfg.NamedSQLQueries.Keys);
    return mappedQueries;
}

private INamedQuery GetQuery(Type queryType)
{
    return (INamedQuery)Activator.CreateInstance(
        queryType,
        new object[] { SessionFactory });
}

15. For our example query, in the Queries namespace of Eg.Core.Data, add the following interface:

public interface IBookWithISBN : IQuery<Book>
{
    string ISBN { get; set; }
}

16. Add the implementation to the Queries namespace of Eg.Core.Data.Impl using the following code:

public class BookWithISBN : NamedQueryBase<Book>, IBookWithISBN
{
    public BookWithISBN(ISessionFactory sessionFactory)
        : base(sessionFactory)
    {
    }

    public string ISBN { get; set; }

    protected override void SetParameters(NHibernate.IQuery nhQuery)
    {
        nhQuery.SetParameter("isbn", ISBN);
    }

    protected override Book Execute(NHibernate.IQuery query)
    {
        return query.UniqueResult<Book>();
    }
}

17. Finally, add the embedded resource mapping, BookWithISBN.hbm.xml, to Eg.Core.Data.Impl with the following XML code:

<?xml version="1.0" encoding="utf-8" ?>
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2">
  <query name="BookWithISBN">
    <![CDATA[
    from Book b where b.ISBN = :isbn
    ]]>
  </query>
</hibernate-mapping>

How it works...

As we learned in the previous recipe, according to the repository pattern, the repository is responsible for fulfilling queries based on the specifications submitted to it. These specifications are limiting: they only concern themselves with whether a particular item matches the given criteria, and don't care for other necessary technical details, such as eager loading of children, batching, query caching, and so on. We need something more powerful than simple where clauses; we lose too much to the abstraction.

The query object pattern defines a query object as a group of criteria that can self-organize into a SQL query. The query object is not responsible for the execution of this SQL; that is handled elsewhere, by some generic query runner, perhaps inside the repository. While a query object can better express the different technical requirements, such as eager loading, batching, and query caching, a generic query runner can't easily implement those concerns for every possible query, especially across the half-dozen query APIs provided by NHibernate. These details of the execution are specific to each query, and should be handled by the query object. This enhanced query object pattern, as Fabio Maulo has named it, not only self-organizes into SQL but also executes the query, returning the results. In this way, the technical concerns of a query's execution are defined and cared for with the query itself, rather than spreading into some highly complex, generic query runner.

According to the abstraction we've built, the repository represents the collection of entities that we are querying. Since the two are already logically linked, if we allow the repository to build the query objects, we can add some context to our code. For example, suppose we have an application service that runs product queries. When we inject dependencies, we could specify IQueryFactory directly; this doesn't give us much information beyond "this service runs queries". If, however, we inject IRepository<Product>, we have a much better idea about what data the service is using.

The IQuery interface is simply a marker interface for our query objects. Besides advertising the purpose of our query objects, it allows us to easily identify them with reflection. The IQuery<TResult> interface is implemented by each query object; it specifies only the return type and a single method to execute the query. The IQueryFactory interface defines a service to create query objects. For the purpose of explanation, the implementation of this service, QueryFactory, is a simple service locator; IQueryFactory is used internally by the repository to instantiate query objects.

The NamedQueryBase class handles most of the plumbing for query objects based on named HQL and SQL queries. As a convention, the name of the query is the name of the query object type; that is, the underlying named query for BookWithISBN is also named BookWithISBN. Each individual query object must simply implement SetParameters and Execute(NHibernate.IQuery query), which usually consists of a simple call to query.List<SomeEntity>() or query.UniqueResult<SomeEntity>(). The INamedQuery interface both identifies the query objects based on Named Queries and provides access to the query name; the NamedQueryCheck test uses this to verify that each INamedQuery query object has a matching named query.
Each query also has an interface. This interface is used to request the query object from the repository, and it defines any parameters used in the query. In this example, IBookWithISBN has a single string parameter, ISBN. The implementation of this query object sets the :isbn parameter on the internal NHibernate query, executes it, and returns the matching Book object. Finally, we also create a mapping containing the named query BookWithISBN, which is loaded into the configuration with the rest of our mappings. The code used to run the query object would look like the following:

var query = bookRepository.CreateQuery<IBookWithISBN>();
query.ISBN = "12345";
var book = query.Execute();

See also

- Transaction auto-wrapping for the data access layer
- Setting up an NHibernate repository

Summary

In this article we learned how to set up transaction auto-wrapping for the data access layer, how to set up an NHibernate repository, and how to use Named Queries in the data access layer.

Resources for Article:

Further resources on this subject:

- Memory Management [article]
- Getting Started with Spring Security [article]
- Design with Spring AOP [article]


Supervision and Monitoring

Packt
02 Nov 2016
8 min read
In this article by Piyush Mishra, author of the book Akka Cookbook, we will learn about the supervision and monitoring of Akka actors. (For more resources related to this topic, see here.)

Using supervision and monitoring, we can write fault-tolerant systems which run continuously for days, months, and years without stopping. Fault tolerance is a property of systems which are intended to remain responsive rather than fail completely in the event of a failure; such systems are known as fault-tolerant, or resilient, systems. In simple words, a fault-tolerant system is one destined to continue as more or less fully operational, perhaps with a reduction in throughput or an increase in response time, despite the partial failure of some of its components. Even if a component fails, the whole system never shuts down; instead, it remains operational and responsive with merely a decreased throughput. Similarly, while designing a distributed system, we need to consider what happens if one or more of its components go down, and design the system so that it can take appropriate action to resolve the issue.

In this article, we will cover the following recipes:

- Creating child actors of a parent actor
- Overriding the life cycle hooks of an actor
- Sending messages to actors and collecting responses

Creating child actors of a parent actor

In this recipe, we will learn how to create child actors of an actor. Akka follows a tree-like structure to create actors, and this is also the recommended practice: by following it, we can handle failures in child actors, as the parent can take care of them. Let's see how to do it.

Getting ready

We need to import the Hello-Akka project in the IDE of our choice. The Akka actor dependency that we added in build.sbt is sufficient for most of the recipes in this article, so we will skip the Getting ready section in the recipes that follow.

How to do it...

1. Create a file named ParentChild.scala in the package com.packt.chapter2.
2. Add the following imports to the top of the file:

import akka.actor.{ActorSystem, Props, Actor}

3. Create the messages to be sent to the actors:

case object CreateChild
case class Greet(msg: String)

4. Define a child actor as follows:

class ChildActor extends Actor {
  def receive = {
    case Greet(msg) =>
      println(s"My parent[${self.path.parent}] greeted me [${self.path}] with $msg")
  }
}

5. Define a parent actor as follows, creating the child actor in its context:

class ParentActor extends Actor {
  def receive = {
    case CreateChild =>
      val child = context.actorOf(Props[ChildActor], "child")
      child ! Greet("Hello Child")
  }
}

6. Create an application object as shown next:

object ParentChild extends App {
  val actorSystem = ActorSystem("Supervision")
  val parent = actorSystem.actorOf(Props[ParentActor], "parent")
  parent ! CreateChild
}

7. Run the preceding application, and you will get the following output:

My parent[akka://Supervision/user/parent] greeted me [akka://Supervision/user/parent/child] with Hello Child

How it works...

In this recipe, we created a child actor which receives a Greet message from its parent. The parent actor creates the child using context.actorOf; this method creates the child actor under the parent, as the actor paths in the output clearly show.
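Beyond parental supervision, Akka also supports monitoring via DeathWatch. The following is a hedged sketch of our own (not part of the recipe), reusing the ChildActor defined above: a parent registers interest in a child's termination and receives a Terminated message when the child stops.

```scala
import akka.actor.{Actor, ActorRef, Props, Terminated}

class WatchingParent extends Actor {
  val child: ActorRef = context.actorOf(Props[ChildActor], "watchedChild")
  context.watch(child)                    // register for Terminated notifications

  def receive = {
    case Terminated(ref) =>
      println(s"Child ${ref.path.name} has stopped")
  }
}
```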
Overriding the life cycle hooks of an actor

Since we are talking about supervision and monitoring of actors, you should understand the life cycle hooks of an actor. In this recipe, you will learn how to override the life cycle hooks of an actor around its start, stop, and restart (preStart, postStop, preRestart, and postRestart).

How to do it…

Create a file called ActorLifeCycle.scala in the package com.packt.chapter2. Add the following imports to the top of the file:

    import akka.actor._
    import akka.actor.SupervisorStrategy._
    import akka.pattern.ask
    import akka.util.Timeout
    import scala.concurrent.Await
    import scala.concurrent.duration._

Create the following messages to be sent to the actors:

    case object Error
    case class StopActor(actorRef: ActorRef)

Create an actor as follows, and override the life cycle methods:

    class LifeCycleActor extends Actor {
      var sum = 1
      override def preRestart(reason: Throwable, message: Option[Any]): Unit = {
        println(s"sum in preRestart is $sum")
      }
      override def preStart(): Unit = println(s"sum in preStart is $sum")
      def receive = {
        case Error => throw new ArithmeticException()
        case _ => println("default msg")
      }
      override def postStop(): Unit = {
        println(s"sum in postStop is ${sum * 3}")
      }
      override def postRestart(reason: Throwable): Unit = {
        sum = sum * 2
        println(s"sum in postRestart is $sum")
      }
    }

Create a supervisor actor as follows:

    class Supervisor extends Actor {
      override val supervisorStrategy =
        OneForOneStrategy(maxNrOfRetries = 10, withinTimeRange = 1.minute) {
          case _: ArithmeticException => Restart
          case t => super.supervisorStrategy.decider.applyOrElse(t, (_: Any) => Escalate)
        }
      def receive = {
        case (props: Props, name: String) => sender ! context.actorOf(props, name)
        case StopActor(actorRef) => context.stop(actorRef)
      }
    }

Create a test application as shown next, and run it:

    object ActorLifeCycle extends App {
      implicit val timeout = Timeout(2.seconds)
      val actorSystem = ActorSystem("Supervision")
      val supervisor = actorSystem.actorOf(Props[Supervisor], "supervisor")
      val childFuture = supervisor ? (Props(new LifeCycleActor), "LifeCycleActor")
      val child = Await.result(childFuture.mapTo[ActorRef], 2.seconds)
      child ! Error
      Thread.sleep(1000)
      supervisor ! StopActor(child)
    }

On running the preceding test application, you will get the following output:

    sum in preStart is 1
    sum in preRestart is 1
    sum in postRestart is 2
    [ERROR] [07/01/2016 00:49:57.568] [Supervision-akka.actor.default-dispatcher-5] [akka://Supervision/user/supervisor/LifeCycleActor] null
    java.lang.ArithmeticException
        at com.packt.chapter2.LifeCycleActor$$anonfun$receive$2.applyOrElse(ActorLifeCycle.scala:51)
    sum in postStop is 6

How it works…

In the preceding recipe, we create an actor that maintains sum as state, and we override its life cycle hooks. We create this actor under the parent supervisor, which handles the ArithmeticException thrown in the child actor. Let's see what happens in the life cycle hooks. When the actor starts, it calls the preStart method, so we see the output "sum in preStart is 1". When the actor throws the exception, it signals the failure to its supervisor, and the supervisor handles the failure by restarting that actor. A restart discards the accumulated state of the old instance: preRestart is called on the old instance (still printing sum as 1), a fresh instance of the actor is then constructed (so sum is back to its initial value of 1), and postRestart is called on that new instance, doubling sum to 2. Finally, when the actor is stopped via StopActor, postStop runs and prints sum * 3, which is 6.
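Restart is just one of the supervision directives. As a sketch of our own (the name LenientSupervisor is an assumption, not from the book), a supervisor could instead resume the failing child, which keeps the existing instance and its state rather than replacing it:

    import akka.actor.{Actor, OneForOneStrategy, Props}
    import akka.actor.SupervisorStrategy.{Resume, Stop, Escalate}
    import scala.concurrent.duration._

    class LenientSupervisor extends Actor {
      override val supervisorStrategy =
        OneForOneStrategy(maxNrOfRetries = 10, withinTimeRange = 1.minute) {
          case _: ArithmeticException => Resume   // keep state, drop the failing message
          case _: IllegalStateException => Stop   // give up on this child
          case _ => Escalate                      // let our own parent decide
        }
      def receive = {
        case (props: Props, name: String) => sender ! context.actorOf(props, name)
      }
    }

Had LifeCycleActor been supervised with Resume, no fresh instance would be created, so sum would keep whatever value it had accumulated before the exception.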
Sending messages to actors and collecting responses

In this recipe, you will learn how a parent sends messages to its children and collects responses from them. To step through this recipe, we need to import the Hello-Akka project in the IDE.

How to do it…

Create a file, SendMessagesToChild.scala, in the package com.packt.chapter2. Add the following imports to the top of the file:

    import akka.actor.{Props, ActorSystem, Actor, ActorRef}

Create messages to be sent to the actors as follows:

    case class DoubleValue(x: Int)
    case object CreateChild
    case object Send
    case class Response(x: Int)

Define a child actor. It doubles the value sent to it:

    class DoubleActor extends Actor {
      def receive = {
        case DoubleValue(number) =>
          println(s"${self.path.name} Got the number $number")
          sender ! Response(number * 2)
      }
    }

Define a parent actor. It creates child actors in its context, sends messages to them, and collects responses from them:

    class ParentActor extends Actor {
      val random = new scala.util.Random
      var childs = scala.collection.mutable.ListBuffer[ActorRef]()
      def receive = {
        case CreateChild =>
          childs ++= List(context.actorOf(Props[DoubleActor]))
        case Send =>
          println(s"Sending messages to child")
          childs.foreach(child => child ! DoubleValue(random.nextInt(10)))
        case Response(x) =>
          println(s"Parent: Response from child ${sender.path.name} is $x")
      }
    }

Create a test application as follows, and run it:

    object SendMessagesToChild extends App {
      val actorSystem = ActorSystem("Hello-Akka")
      val parent = actorSystem.actorOf(Props[ParentActor], "parent")
      parent ! CreateChild
      parent ! CreateChild
      parent ! CreateChild
      parent ! Send
    }

On running the preceding test application, you will get the following output:

    $b Got the number 6
    $a Got the number 5
    $c Got the number 8
    Parent: Response from child $a is 10
    Parent: Response from child $b is 12
    Parent: Response from child $c is 16

How it works…

In this last recipe, we create a child actor called DoubleActor, which doubles the value it gets. We also create a parent actor, which creates a child actor whenever it receives a CreateChild message, and maintains the children in a list. When the parent actor receives the message Send, it sends a random number to each child, and each child, in turn, sends a response to the parent.
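The parent here fires messages with tell (!) and receives the answers asynchronously in its receive block. An alternative sketch of our own uses the ask pattern (?) to collect a response as a future; the doubler value stands for an ActorRef pointing at a DoubleActor and is our assumption:

    import akka.pattern.ask
    import akka.util.Timeout
    import scala.concurrent.duration._
    import scala.concurrent.{Await, Future}

    implicit val timeout = Timeout(2.seconds)
    // doubler is assumed to be an ActorRef to a DoubleActor created elsewhere
    val doubled: Future[Response] = (doubler ? DoubleValue(21)).mapTo[Response]
    println(Await.result(doubled, 2.seconds)) // prints Response(42)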
Summary

In this article, you learned how to supervise and monitor Akka actors as well as create child actors of an actor. We also discussed how to override the life cycle hooks of an actor. Lastly, you learned how a parent sends messages to its children and collects responses from them.

Resources for Article:

Further resources on this subject:
Introduction to Akka [article]
Creating First Akka Application [article]
Making History with Event Sourcing [article]

Learning the Basic Nature of F# Code

Packt
02 Nov 2016
6 min read
In this article by Eriawan Kusumawardhono, author of the book F# High Performance, we will see that F# has been a first-class citizen, a built-in part of the programming language support in Visual Studio, starting from Visual Studio 2010. F# has a unique trait: it is a functional programming language, but at the same time it has OOP support. From the start, F# has run on .NET, although we can also run it cross-platform, for example on Android (using Mono). (For more resources related to this topic, see here.)

Although F# mostly runs faster than C# or VB when doing computations, its own performance characteristics, together with some not-so-obvious bad practices and subtleties, may lead to performance bottlenecks. These bottlenecks may or may not be faster than their C#/VB counterparts, although some of them may share the same performance characteristics, such as the use of .NET APIs. The main goal of this book is to identify performance problems in F#, measuring and also optimizing F# code to run more efficiently, while also maintaining the functional programming style as far as is appropriate.

A basic knowledge of F# (including the functional programming concept and basic OOP) is required as a prerequisite to start understanding the performance problems and the optimization of F#.

There are many ways to define F# performance characteristics and to measure them, but understanding the mechanics of running F# code, especially on top of .NET, is crucial, and it is also a part of the performance characteristics itself. This includes other aspects, such as approaches to identifying concurrency problems and language constructs.

Understanding the nature of F# code

Understanding the nature of F# code is crucial, and it is a definitive prerequisite before we begin to measure how long code runs and how effective it is. We can measure running F# code by running time, but to fully understand why it may run slow or fast, there are some basic concepts we have to consider first.

Before we dive into this, we must meet the basic requirements and setup. After the requirements have been set, we need to put in place the environment settings of Visual Studio 2015. We have to do this because we need to maintain the consistency of the default setting of Visual Studio. The setting should be set to General. These are the steps:

Select the Tools menu from Visual Studio's main menu.
Select Import and Export Settings... and the Import and Export Settings Wizard screen is displayed.
Select Reset all Settings and then Next to proceed.
Select No, just reset my settings, overwriting my current settings and then Next to proceed.
Select General and then Next to proceed.

After setting it up, we will have a consistent layout to be used throughout this book, including the menu locations and the look and feel of Visual Studio.

Now we are going to scratch the surface of the F# runtime with an introductory overview of the common F# runtime, which will give us some insights into F# performance.

F# runtime characteristics

The release of Visual Studio 2015 occurred at the same time as the release of .NET 4.6 and the rest of the tools, including the F# compiler. The compiler version of F# in Visual Studio 2015 is F# 4.0. F# 4.0 has no large differences or notable new features compared to the previous version, F# 3.0 in Visual Studio 2013, and its runtime characteristics are essentially the same, although there are some subtle performance improvements and bug fixes.
For more information on what's new in F# 4.0 (described as release notes), visit: https://github.com/Microsoft/visualfsharp/blob/fsharp4/CHANGELOG.md. At the time of writing this book, the online and offline MSDN Library of F# in Visual Studio does not have F# 4.0 release notes documentation, but you can always go to the GitHub repository of F# to check the latest updates.

These are the common characteristics of F# as a managed programming language:

F# must conform to the .NET CLR. This includes the compatibilities, the IL emitted after compilation, and support for the .NET BCL (the basic class library). Therefore, F# functions and libraries can be used by other CLR-compliant languages such as C#, VB, and managed C++.
The debug symbols (PDB) have the same format and semantics as those of other CLR-compliant languages. This is important, because F# code must be debuggable from other CLR-compliant languages as well.

From the managed-languages perspective, measuring the performance of F# is similar when measured by tools such as the CLR profiler. But from the F# perspective, these are F#-only unique characteristics:

By default, all types in F# are immutable. Therefore, it's safe to assume they are intrinsically thread safe.
F# has a distinctive collection library, and it is immutable by default. It is also safe to assume it is intrinsically thread safe.
F# has a strong type inference model, and when a generic type is inferred without any concrete type, it automatically performs generalization.
By default, functions in F# are implemented internally by creating an internal class derived from F#'s FastFunc. This FastFunc is essentially a delegate that is used by F# to apply functional language constructs such as currying and partial application.
With tail-call recursive optimization in the IL, the F# compiler may emit the .tail IL instruction, and the CLR will recognize this and perform the optimization at runtime.
F# has inline functions as an option.
F# has computation expressions (workflows), which are used to compose functions.
F# async computations don't need Task<T> to implement them. Although F# async doesn't need the Task<T> object, it can interoperate well with the async-await model in C# and VB. The async-await model in C# and VB is inspired by F# async, but behaves semantically differently based on more things than just the usage of Task<T>.

All of these characteristics are not only unique; they can also have performance implications when used to interoperate with C# and VB.

Summary

This article gave a basic introduction to F# in Visual Studio, along with the runtime characteristics of F#.

Resources for Article:

Further resources on this subject:
Creating an F# Project [article]
Unit Testing [article]
Working with Windows Phone Controls [article]

Introduction to Scala

Packt
01 Nov 2016
8 min read
In this article by Diego Pacheco, the author of the book Building Applications with Scala, we will cover the following topics:

Writing a Scala Hello World program using the REPL
Scala language – the basics
Scala variables – var and val
Creating immutable variables

(For more resources related to this topic, see here.)

Scala Hello World using the REPL

Let's get started. Go ahead, open your terminal, and type $ scala in order to open the Scala REPL. Once the REPL is open, you can just type "Hello World". By doing this, you are performing two operations: eval and print. The Scala REPL will create a variable called res0, store your string there, and then print the content of the res0 variable.

Scala REPL Hello World program:

    $ scala
    Welcome to Scala 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_77).
    Type in expressions for evaluation. Or try :help.

    scala> "Hello World"
    res0: String = Hello World

    scala>

Scala is a hybrid language, which means it is both object-oriented (OO) and functional. You can create classes and objects in Scala. Next, we will create a complete Hello World application using classes.

Scala OO Hello World program:

    $ scala
    Welcome to Scala 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_77).
    Type in expressions for evaluation. Or try :help.

    scala> object HelloWorld {
         |   def main(args: Array[String]) = println("Hello World")
         | }
    defined object HelloWorld

    scala> HelloWorld.main(null)
    Hello World

    scala>

First things first, you need to realize that we use the word object instead of class. The Scala language has different constructs compared with Java. An object is a singleton in Scala; it is the same as coding the Singleton pattern in Java. Next, we see the word def, which is used in Scala to create functions. In this program, we create the main function just as we do in Java, and we call the built-in function println in order to print the string Hello World.

Scala imports some Java objects and packages by default. Coding in Scala does not require you to type, for instance, System.out.println("Hello World"), but you can if you want to, as shown in the following:

    $ scala
    Welcome to Scala 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_77).
    Type in expressions for evaluation. Or try :help.

    scala> System.out.println("Hello World")
    Hello World

    scala>

We can and we will do better. Scala has some abstractions for a console application, so we can write this code with fewer lines of code. To accomplish this goal, we need to extend the Scala trait App. When we extend from App, we are performing inheritance, and we don't need to define the main function. We can just put all the code in the body of the object, which is very convenient, and which makes the code clean and simple to read.

Scala HelloWorld App in the Scala REPL:

    $ scala
    Welcome to Scala 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_77).
    Type in expressions for evaluation. Or try :help.

    scala> object HelloWorld extends App {
         |   println("Hello World")
         | }
    defined object HelloWorld

    scala> HelloWorld
    object HelloWorld

    scala> HelloWorld.main(null)
    Hello World

    scala>

After coding the HelloWorld object in the Scala REPL, we can ask the REPL what HelloWorld is and, as you might realize, the REPL answers that HelloWorld is an object. This is a very convenient Scala way to code console applications, because we can have a Hello World application with just three lines of code.
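The App trait also gives you the command-line arguments for free, through an args field it defines. The following is a small sketch of our own (the object name ArgsDemo is an assumption, not from the book):

    object ArgsDemo extends App {
      // args: Array[String] is inherited from the App trait
      args.foreach(arg => println(s"Got argument: $arg"))
    }

Running it with scala ArgsDemo one two would print the two arguments on separate lines.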
Sadly, the same program in Java requires way more code, as you will see in the next section. Java is a great language for performance, but it is a verbose language compared with Scala.

Java Hello World application:

    package scalabook.javacode.chap1;

    public class HelloWorld {
        public static void main(String args[]) {
            System.out.println("Hello World");
        }
    }

The Java application requires six lines of code, while in Scala, we were able to do the same with 50% less code (three lines of code). This is a very simple application; when we are coding complex applications, the difference gets bigger, as a Scala application ends up with far less code than that of Java. Remember that we use an object in Scala in order to have a singleton (a design pattern that makes sure you have just one instance of a class), and if we want to do the same in Java, the code would be something like this:

    package scalabook.javacode.chap1;

    public class HelloWorldSingleton {

        private HelloWorldSingleton() {}

        private static class SingletonHelper {
            private static final HelloWorldSingleton INSTANCE = new HelloWorldSingleton();
        }

        public static HelloWorldSingleton getInstance() {
            return SingletonHelper.INSTANCE;
        }

        public void sayHello() {
            System.out.println("Hello World");
        }

        public static void main(String[] args) {
            getInstance().sayHello();
        }
    }

It's not just about the size of the code; it is all about consistency and the language providing more abstractions for you. If you write less code, you will have fewer bugs in your software at the end of the day.

Scala language – the basics

Scala is a statically typed language with a very expressive type system, which enforces abstractions in a safe yet coherent manner. All values in Scala are objects (though primitives are unboxed at runtime), because at the end of the day, Scala runs on the JVM. Scala enforces immutability as a core functional programming principle. This enforcement happens in multiple aspects of the Scala language: for instance, when you create a variable, you do it in an immutable way, and when you use a collection, you use an immutable collection. Scala also lets you use mutable variables and mutable structures, but it favors immutable ones by design.

Scala variables – var and val

When you are coding in Scala, you create variables using either the var operator or the val operator. The var operator allows you to create mutable state, which is fine as long as you keep it local, stick to the core functional programming principles, and avoid mutable shared state.

Using var in the Scala REPL:

    $ scala
    Welcome to Scala 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_77).
    Type in expressions for evaluation. Or try :help.

    scala> var x = 10
    x: Int = 10

    scala> x
    res0: Int = 10

    scala> x = 11
    x: Int = 11

    scala> x
    res1: Int = 11

    scala>

However, Scala has a more interesting construct called val. Using the val operator makes your variables immutable, which means you can't change their values after you set them. If you try to change the value of a val variable in Scala, the compiler will give you an error. As a Scala developer, you should use val as much as possible, because that's a good functional programming mindset, and it will make your programs better and more correct. In Scala, everything is an object; there are no primitives: the var and val rules apply for everything, be it Int, String, or even a class.

Using val in the Scala REPL:

    $ scala
    Welcome to Scala 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_77).
    Type in expressions for evaluation. Or try :help.

    scala> val x = 10
    x: Int = 10

    scala> x
    res0: Int = 10

    scala> x = 11
    <console>:12: error: reassignment to val
           x = 11
             ^

    scala> x
    res1: Int = 10

    scala>
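To see why favoring val pays off, here is a small sketch of our own that computes the same sum twice, once with a mutable accumulator and once with an immutable fold; the fold version leaves no mutable state to reason about:

    // Mutable style: a var reassigned on every iteration
    var mutableSum = 0
    for (n <- List(1, 2, 3, 4)) mutableSum += n

    // Immutable style: the fold threads an immutable accumulator through the list
    val total = List(1, 2, 3, 4).foldLeft(0)((acc, n) => acc + n)
    // total: Int = 10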
Creating immutable variables

Now let's see how we can define the most common types in Scala, such as Int, Double, Boolean, and String. Remember that you can create these variables using val or var, depending on your requirement.

Scala variable types in the Scala REPL:

    $ scala
    Welcome to Scala 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_77).
    Type in expressions for evaluation. Or try :help.

    scala> val x = 10
    x: Int = 10

    scala> val y = 11.1
    y: Double = 11.1

    scala> val b = true
    b: Boolean = true

    scala> val f = false
    f: Boolean = false

    scala> val s = "A Simple String"
    s: String = A Simple String

    scala>

For these variables, we did not define the type. The Scala language figures it out for us. However, it is possible to specify the type if you want. In Scala, the type comes after the name of the variable, as shown in the following section.

Scala variables with explicit typing in the Scala REPL:

    $ scala
    Welcome to Scala 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_77).
    Type in expressions for evaluation. Or try :help.

    scala> val x: Int = 10
    x: Int = 10

    scala> val y: Double = 11.1
    y: Double = 11.1

    scala> val s: String = "My String "
    s: String = "My String "

    scala> val b: Boolean = true
    b: Boolean = true

    scala>
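Explicit types are also a cheap safety net when inference gives you something you did not intend. This REPL sketch of our own shows the classic integer-division surprise:

    scala> val ratio = 1 / 2
    ratio: Int = 0

    scala> val ratio: Double = 1 / 2
    ratio: Double = 0.0

    scala> val ratio = 1.0 / 2
    ratio: Double = 0.5

Inference faithfully gives Int and integer division in the first case; annotating Double in the second does not help, because the right-hand side is still evaluated as integer division before being widened. The real fix is making one operand a Double, as in the third case.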
Summary

In this article, we learned about some basic constructs and concepts of the Scala language, with functions, collections, and OO in Scala.

Resources for Article:

Further resources on this subject:
Making History with Event Sourcing [article]
Creating Your First Plug-in [article]
Content-based recommendation [article]

Hosting on Google App Engine

Packt
21 Oct 2016
22 min read
In this article by Mat Ryer, the author of the book Go Programming Blueprints, Second Edition, we will see how to build a Go application, deploy it to Google App Engine, and use Google Cloud Datastore, Google's cloud data storage facility for App Engine developers. (For more resources related to this topic, see here.)

Google App Engine gives developers a NoOps (short for No Operations, indicating that developers and engineers have no work to do in order to have their code running and available) way of deploying their applications, and Go has been officially supported as a language option for some years now. Google's architecture runs some of the biggest applications in the world, such as Google Search, Google Maps, and Gmail, so it is a pretty safe bet when it comes to deploying our own code.

Google App Engine allows you to write a Go application, add a few special configuration files, and deploy it to Google's servers, where it will be hosted and made available in a highly available, scalable, and elastic environment. Instances will automatically spin up to meet demand and tear down gracefully when they are no longer needed, with a healthy free quota and preapproved budgets. Along with running application instances, Google App Engine makes available a myriad of useful services, such as fast and high-scale data stores, search, memcache, and task queues. Transparent load balancing means you don't need to build and maintain additional software or hardware to ensure that servers don't get overloaded and that requests are fulfilled quickly.

In this article, we will build the API backend for a question and answer service similar to Stack Overflow or Quora and deploy it to Google App Engine. In the process, we'll explore techniques, patterns, and practices that can be applied to all such applications, as well as dive deep into some of the more useful services available to our application.

Specifically, in this article, you will learn:

How to use the Google App Engine SDK for Go to build and test applications locally before deploying to the cloud
How to use app.yaml to configure your application
How modules in Google App Engine let you independently manage the different components that make up your application
How Google Cloud Datastore lets you persist and query data at scale
A sensible pattern for the modeling of data and working with keys in Google Cloud Datastore
How to use the Google App Engine Users API to authenticate people with Google accounts
A pattern to embed denormalized data into entities

The Google App Engine SDK for Go

In order to run and deploy Google App Engine applications, we must download and configure the Go SDK. Head over to https://cloud.google.com/appengine/downloads and download the latest Google App Engine SDK for Go for your computer. The ZIP file contains a folder called go_appengine, which you should place in an appropriate folder outside of your GOPATH, for example, in /Users/yourname/work/go_appengine.

It is possible that the names of these SDKs will change in the future; if that happens, ensure that you consult the project home page for notes pointing you in the right direction at https://github.com/matryer/goblueprints.

Next, you will need to add the go_appengine folder to your $PATH environment variable, much like what you did with the go folder when you first configured Go.
To test your installation, open a terminal and type this:

    goapp version

You should see something like the following:

    go version go1.6.1 (appengine-1.9.37) darwin/amd64

The actual version of Go is likely to differ and is often a few months behind actual Go releases. This is because the Cloud Platform team at Google needs to do work on its end to support new releases of Go. The goapp command is a drop-in replacement for the go command with a few additional subcommands, so you can do things like goapp test and goapp vet, for example.

Creating your application

In order to deploy an application to Google's servers, we must use the Google Cloud Platform Console to set it up. In a browser, go to https://console.cloud.google.com and sign in with your Google account. Look for the Create Project menu item, which often gets moved around as the console changes from time to time. If you already have some projects, click on a project name to open a submenu, and you'll find it in there. If you can't find what you're looking for, just search for creating an App Engine project and you'll find it.

When the New Project dialog box opens, you will be asked for a name for your application. You are free to call it whatever you like (for example, Answers), but note the Project ID that is generated for you; you will need to refer to this when you configure your app later. You can also click on Edit and specify your own ID, but know that the value must be globally unique, so you'll have to get creative when thinking one up. Here we will use answersapp as the application ID, but you won't be able to use that one, since it has already been taken. You may need to wait a minute or two for your project to get created; there's no need to watch the page, so you can continue and check back later.

App Engine applications are Go packages

Now that the Google App Engine SDK for Go is configured and our application has been created, we can start building it. In Google App Engine, an application is just a normal Go package with an init function that registers handlers via the http.Handle or http.HandleFunc functions. Unlike normal tools, it does not need to be the main package.

Create a new folder (somewhere inside your GOPATH folder) called answersapp/api and add the following main.go file:

    package api

    import (
        "io"
        "net/http"
    )

    func init() {
        http.HandleFunc("/", handleHello)
    }

    func handleHello(w http.ResponseWriter, r *http.Request) {
        io.WriteString(w, "Hello from App Engine")
    }

You will be familiar with most of this by now, but note that there is no ListenAndServe call, and the handlers are set inside the init function rather than main. We are going to handle every request with our simple handleHello function, which will just write a welcoming string.

The app.yaml file

In order to turn our simple Go package into a Google App Engine application, we must add a special configuration file called app.yaml. The file will go at the root of the application or module, so create it inside the answersapp/api folder with the following contents:

    application: YOUR_APPLICATION_ID_HERE
    version: 1
    runtime: go
    api_version: go1

    handlers:
    - url: /.*
      script: _go_app

The file is a simple human- (and machine-) readable configuration file in YAML format (refer to yaml.org for more details). The following table describes each property:

application: The application ID (copied and pasted from when you created your project).
version: Your application version number. You can deploy multiple versions and even split traffic between them to test new features, among other things. We'll just stick with version 1 for now.
runtime: The name of the runtime that will execute your application. Since we're building a Go application, we'll use go.
api_version: The go1 api version is the runtime version supported by Google; you can imagine that this could be go2 in the future.
handlers: A selection of configured URL mappings. In our case, everything will be mapped to the special _go_app script, but you can also specify static files and folders here.

Running simple applications locally

Before we deploy our application, it makes sense to test it locally. We can do this using the App Engine SDK we downloaded earlier. Navigate to your answersapp/api folder and run the following command in a terminal:

    goapp serve

You should see output indicating that an API server is running locally on port :56443, an admin server is running on :8000, and our application (the default module) is now serving at localhost:8080, so let's hit that one in a browser. As you can see by the Hello from App Engine response, our application is running locally.

Navigate to the admin server by changing the port from :8080 to :8000. The admin server is a web portal that we can use to interrogate the internals of our application, including viewing running instances, inspecting the data store, managing task queues, and more.

Deploying simple applications to Google App Engine

To truly understand the power of Google App Engine's NoOps promise, we are going to deploy this simple application to the cloud. Back in the terminal, stop the server by hitting Ctrl+C and run the following command:

    goapp deploy

Your application will be packaged and uploaded to Google's servers. Once it's finished, you should see something like the following:

    Completed update of app: theanswersapp, version: 1

It really is as simple as that. You can prove this by navigating to the endpoint you get for free with every Google App Engine application, remembering to replace the application ID with your own: https://YOUR_APPLICATION_ID_HERE.appspot.com/.

You will see the same output as earlier (the font may render differently, since Google's servers will make assumptions about the content type that the local dev server doesn't). The application is being served over HTTP/2 and is already capable of pretty massive scale, and all we did was write a config file and a few lines of code.

Modules in Google App Engine

A module is a Go package that can be versioned, updated, and managed independently. An app might have a single module, or it can be made up of many modules: each distinct but part of the same application, with access to the same data and services. An application must have a default module, even if it doesn't do much.

Our application will be made up of the following modules:

default: the obligatory default module
api: an API package delivering RESTful JSON
web: a static website serving HTML, CSS, and JavaScript that makes AJAX calls to the API module

Each module will be a Go package and will, therefore, live inside its own folder.

Let's reorganize our project into modules by creating a new folder alongside the api folder called default. We are not going to make our default module do anything other than use it for configuration, as we want our other modules to do all the meaningful work.
But if we leave this folder empty, the Google App Engine SDK will complain that it has nothing to build. Inside the default folder, add the following placeholder main.go file:

    package defaultmodule

    func init() {}

This file does nothing except allow our default module to exist. It would have been nice for our package names to match the folders, but default is a reserved keyword in Go, so we have a good reason to break that rule.

The other module in our application will be called web, so create another folder alongside the api and default folders called web. Here we are only going to build the API for our application and cheat by downloading the web module. Head over to the project home page at https://github.com/matryer/goblueprints, access the content for Second Edition, and look for the download link for the web components for this article in the Downloads section of the README file. The ZIP file contains the source files for the web component, which should be unzipped and placed inside the web folder.

Now, our application structure should look like this:

    /answersapp/api
    /answersapp/default
    /answersapp/web

Specifying modules

To specify which module our api package will become, we must add a property to the app.yaml inside our api folder. Update it to include the module property:

    application: YOUR_APPLICATION_ID_HERE
    version: 1
    runtime: go
    module: api
    api_version: go1

    handlers:
    - url: /.*
      script: _go_app

Since our default module will need to be deployed as well, we also need to add an app.yaml configuration file to it. Duplicate the api/app.yaml file inside default/app.yaml, changing the module to default:

    application: YOUR_APPLICATION_ID_HERE
    version: 1
    runtime: go
    module: default
    api_version: go1

    handlers:
    - url: /.*
      script: _go_app

Routing to modules with dispatch.yaml

In order to route traffic appropriately to our modules, we will create another configuration file called dispatch.yaml, which will let us map URL patterns to the modules. We want all traffic beginning with the /api/ path to be routed to the api module and everything else to the web module. As mentioned earlier, we won't expect our default module to handle any traffic, but it will have more utility later.

In the answersapp folder (alongside our module folders, not inside any of them), create a new file called dispatch.yaml with the following contents:

    application: YOUR_APPLICATION_ID_HERE

    dispatch:
    - url: "*/api/*"
      module: api
    - url: "*/*"
      module: web

The same application property tells the Google App Engine SDK for Go which application we are referring to, and the dispatch section routes URLs to modules.

Google Cloud Datastore

One of the services available to App Engine developers is Google Cloud Datastore, a NoSQL document database built for automatic scaling and high performance. Its limited feature set guarantees very high scale, but understanding the caveats and best practices is vital to a successful project.

Denormalizing data

Developers with experience of relational databases (RDBMS) will often aim to reduce data redundancy (trying to have each piece of data appear only once in their database) by normalizing data, spreading it across many tables, and adding references (foreign keys) before joining it back via a query to build a complete picture. In schemaless and NoSQL databases, we tend to do the opposite. We denormalize data so that each document contains the complete picture it needs, making read times extremely fast, since it only needs to go and get a single thing.
For example, consider how we might model tweets in a relational database such as MySQL or Postgres: a tweet itself contains only its unique ID, a foreign key reference to the Users table representing the author of the tweet, and perhaps many URLs that were mentioned in TweetBody.

One nice feature of this design is that a user can change their Name or AvatarURL and it will be reflected in all of their tweets, past and future: something you wouldn't get for free in a denormalized world. However, in order to present a tweet to the user, we must load the tweet itself, look up (via a join) the user to get their name and avatar URL, and then load the associated data from the URLs table in order to show a preview of any links. At scale, this becomes difficult, because all three tables of data might well be physically separated from each other, which means lots of things need to happen in order to build up this complete picture.

Consider what a denormalized design would look like instead: we still have the same three buckets of data, except that now our tweet contains everything it needs in order to render to the user without having to look up data from anywhere else.

The hardcore relational database designers out there are realizing what this means by now, and it is no doubt making them feel uneasy. Following this approach means that:

Data is repeated: AvatarURL in User is repeated as UserAvatarURL in the tweet (a waste of space, right?)
If the user changes their AvatarURL, UserAvatarURL in the tweet will be out of date

Database design, at the end of the day, comes down to physics. We are deciding that our tweet is going to be read far more times than it is going to be written, so we'd rather take the pain up front and take a hit in storage. There's nothing wrong with repeated data as long as there is an understanding about which set is the master set and which is duplicated for speed.

Changing data is an interesting topic in itself, but let's think about a few reasons why we might be OK with the trade-offs. Firstly, the speed benefit to reading tweets is probably worth the unexpected behavior of changes to master data not being reflected in historical documents; it would be perfectly acceptable to decide to live with this emergent behavior for that reason. Secondly, we might decide that it makes sense to keep a snapshot of data at a specific moment in time. For example, imagine if someone tweets asking whether people like their profile picture. If the picture changed, the tweet context would be lost. For a more serious example, consider what might happen if you were pointing to a row in an Addresses table for an order delivery and the address later changed. Suddenly, the order might look like it was shipped to a different place.

Finally, storage is becoming increasingly cheap, so the need to normalize data to save space is lessened. Twitter even goes as far as copying the entire tweet document for each of your followers. 100 followers on Twitter means that your tweet will be copied at least 100 times, maybe more for redundancy. This sounds like madness to relational database enthusiasts, but Twitter is making smart trade-offs based on its user experience; they'll happily spend a lot of time writing a tweet and storing it many times to ensure that when you refresh your feed, you don't have to wait very long to get updates. If you want to get a sense of the scale of this, check out the Twitter API and look at what a tweet document consists of. It's a lot of data.
Then, go and look at how many followers Lady Gaga has. This has become known in some circles as "the Lady Gaga problem" and is addressed by a variety of different technologies and techniques that are out of the scope of this article.

Now that we have an understanding of good NoSQL design practices, let's implement the types, functions, and methods required to drive the data part of our API.

Entities and data access

To persist data in Google Cloud Datastore, we need a struct to represent each entity. These entity structures will be serialized and deserialized when we save and load data through the datastore API. We can add helper methods to perform the interactions with the data store, which is a nice way to keep such functionality physically close to the entities themselves. For example, we will model an answer with a struct called Answer and add a Create method that in turn calls the appropriate function from the datastore package. This prevents us from bloating our HTTP handlers with lots of data access code and allows us to keep them clean and simple instead.

One of the foundation blocks of our application is the concept of a question. A question can be asked by a user and answered by many. It will have a unique ID so that it is addressable (referable in a URL), and we'll store a timestamp of when it was created:

    type Question struct {
        Key          *datastore.Key `json:"id" datastore:"-"`
        CTime        time.Time      `json:"created"`
        Question     string         `json:"question"`
        User         UserCard       `json:"user"`
        AnswersCount int            `json:"answers_count"`
    }

The UserCard struct represents a denormalized User entity, both of which we'll add later. You can import the datastore package in your Go project using this:

    import "google.golang.org/appengine/datastore"

It's worth spending a little time understanding the datastore.Key type.

Keys in Google Cloud Datastore

Every entity in Datastore has a key, which uniquely identifies it. Keys can be made up of either a string or an integer, depending on what makes sense for your case. You are free to decide the keys for yourself or let Datastore automatically assign them for you; again, your use case will usually decide which is the best approach to take.

Keys are created using the datastore.NewKey and datastore.NewIncompleteKey functions and are used to put and get data into and out of Datastore via the datastore.Get and datastore.Put functions. In Datastore, keys and entity bodies are distinct, unlike in MongoDB or SQL technologies, where the key is just another field in the document or record. This is why we are excluding Key from our Question struct with the datastore:"-" field tag. Like the json tags, this indicates that we want Datastore to ignore the Key field altogether when it is getting and putting data.

Keys may optionally have parents, which is a nice way of grouping associated data together, and Datastore makes certain assurances about such groups of entities, which you can read more about in the Google Cloud Datastore documentation online.

Putting data into Google Cloud Datastore

Before we save data into Datastore, we want to ensure that our question is valid. Add the following method underneath the Question struct definition:

    func (q Question) OK() error {
        if len(q.Question) < 10 {
            return errors.New("question is too short")
        }
        return nil
    }

The OK function will return an error if something is wrong with the question, or else it will return nil. In this case, we just check to make sure the question has at least 10 characters.
To persist this data in the data store, we are going to add a method to the Question struct itself. At the bottom of questions.go, add the following code:

    func (q *Question) Create(ctx context.Context) error {
        log.Debugf(ctx, "Saving question: %s", q.Question)
        if q.Key == nil {
            q.Key = datastore.NewIncompleteKey(ctx, "Question", nil)
        }
        user, err := UserFromAEUser(ctx)
        if err != nil {
            return err
        }
        q.User = user.Card()
        q.CTime = time.Now()
        q.Key, err = datastore.Put(ctx, q.Key, q)
        if err != nil {
            return err
        }
        return nil
    }

The Create method takes a pointer to Question as the receiver, which is important because we want to make changes to the fields. If the receiver was (q Question), without the asterisk, we would get a copy of the question rather than a pointer to it, and any changes we made would only affect our local copy and not the original Question struct itself.

The first thing we do is use log (from the google.golang.org/appengine/log package) to write a debug statement saying we are saving the question. When you run your code in a development environment, you will see this appear in the terminal; in production, it goes into a dedicated logging service provided by Google Cloud Platform.

If the key is nil (which means this is a new question), we assign an incomplete key to the field, which informs Datastore that we want it to generate a key for us. The three arguments we pass are context.Context (which we must pass to all datastore functions and methods), a string describing the kind of entity, and the parent key; in our case, this is nil.

Once we know there is a key in place, we call a method (which we will add later) to get or create a User from the App Engine user, set it on the question, and then set the CTime field (created time) to time.Now, timestamping the point at which the question was asked.

Once we have our Question in good shape, we call datastore.Put to actually place it inside the data store. As usual, the first argument is context.Context, followed by the question key and the question entity itself. Since Google Cloud Datastore treats keys as separate and distinct from entities, we have to do a little extra work if we want to keep them together in our own code. The datastore.Put function returns two values: the complete key and an error. The key return value is actually useful, because we're sending in an incomplete key and asking the data store to create one for us, which it does during the put operation. If successful, it returns a new datastore.Key object to us, representing the completed key, which we then store in the Key field of our Question object. If all is well, we return nil.

Add another helper to update an existing question:

    func (q *Question) Update(ctx context.Context) error {
        if q.Key == nil {
            q.Key = datastore.NewIncompleteKey(ctx, "Question", nil)
        }
        var err error
        q.Key, err = datastore.Put(ctx, q.Key, q)
        if err != nil {
            return err
        }
        return nil
    }

This method is very similar, except that it doesn't set the CTime or User fields, as they will already have been set.
Reading data from Google Cloud Datastore

Reading data with datastore.Get is as simple as putting it with datastore.Put, but since we want to maintain keys in our entities (and the datastore functions don't work like that), it's common to add a helper function like the one we are going to add to questions.go:

    func GetQuestion(ctx context.Context, key *datastore.Key) (*Question, error) {
        var q Question
        err := datastore.Get(ctx, key, &q)
        if err != nil {
            return nil, err
        }
        q.Key = key
        return &q, nil
    }

The GetQuestion function takes context.Context and the datastore.Key of the question to get. It then does the simple task of calling datastore.Get and assigning the key to the entity before returning it. Of course, errors are handled in the usual way.

This is a nice pattern to follow so that users of your code know that they never have to interact with datastore.Get and datastore.Put directly, but rather use the helpers, which can ensure the entities are properly populated with their keys (along with any other tweaks they might want to do before saving or after loading).

Summary

In this article, we saw how Go applications are built for Google App Engine: we created a simple application, configured it with app.yaml, ran it locally, and deployed it with goapp deploy. We also learned how modules in Google App Engine organize the components of an application, and how Google Cloud Datastore, Google's cloud data storage facility for App Engine developers, persists entities and their keys.

Resources for Article:

Further resources on this subject:
Google Forms for Multiple Choice and Fill-in-the-blank Assignments [article]
Publication of Apps [article]
Prerequisites for a Map Application [article]

Deployment and DevOps

Packt
14 Oct 2016
16 min read
In this article by Makoto Hashimoto and Nicolas Modrzyk, the authors of the book Clojure Programming Cookbook, we will cover the recipe Clojure on Amazon Web Services. (For more resources related to this topic, see here.)

Clojure on Amazon Web Services

This recipe is a standalone dish where you can learn how to combine the elegance of Clojure with Amazon Web Services (AWS). AWS was started in 2006 and is used by many businesses for its easy-to-use web services. This style of on-demand service is becoming more and more popular: you can use computing resources and software services on demand, without the need to prepare hardware or install software yourself.

You will mostly make use of the amazonica library, which is a comprehensive Clojure client for the entire Amazon AWS set of APIs. This library wraps the Amazon AWS APIs and supports most AWS services, including EC2, S3, Lambda, Kinesis, Elastic Beanstalk, Elastic MapReduce, and RedShift.

This recipe has received a lot of its content and love from Robin Birtle, a leading member of the Clojure Community in Japan.

Getting ready

You need an AWS account and credentials to use AWS, so this recipe starts by showing you how to do the setup and acquire the necessary keys to get started.

Signing up on AWS

You need to sign up for AWS if you don't have an account yet. In this case, go to https://aws.amazon.com, click on Sign In to the Console, and follow the instructions for creating your account. To complete the sign-up, enter the number of a valid credit card and a phone number.

Getting the access key and secret access key

To call the API, you now need your AWS access key and secret access key. Go to the AWS console, click on your name (located in the top right corner of the screen), and select Security Credentials. Select Access Keys (Access Key ID and Secret Access Key), and on the screen that appears, click on New Access Key. You can then see your access key and secret access key; copy and save these strings for later use.

Setting up dependencies in your project.clj

Let's add the amazonica library to your project.clj and restart your REPL:

    :dependencies [[org.clojure/clojure "1.8.0"]
                   [amazonica "0.3.67"]]

How to do it…

From here on, we will go through some sample usage of the core Amazon services, accessed with Clojure and the amazonica library. The three main ones we will review are as follows:

EC2, Amazon's Elastic Compute Cloud, which allows you to run virtual machines on Amazon's cloud
S3, Simple Storage Service, which gives you cloud-based storage
SQS, Simple Queue Service, which gives you cloud-based data streaming and processing

Let's go through each of these one by one.

Using EC2

Let's assume you have an EC2 micro instance in the Tokyo region. First of all, we will declare the core and ec2 namespaces of amazonica to use:

    (ns aws-examples.ec2-example
      (:require [amazonica.aws.ec2 :as ec2]
                [amazonica.core :as core]))

We will set the access key and secret access key to enable the AWS client API to access AWS, using core/defcredential as follows:

    (core/defcredential "Your Access Key" "Your Secret Access Key" "your region")
    ;;=> {:access-key "Your Access Key", :secret-key "Your Secret Access Key", :endpoint "your region"}

The region you need to specify is, for example, ap-northeast-1, ap-south-1, or us-west-2.
To get the full list of regions, use ec2/describe-regions:

    (ec2/describe-regions)
    ;;=> {:regions [{:region-name "ap-south-1", :endpoint "ec2.ap-south-1.amazonaws.com"}
    ;;=>  .....
    ;;=>  {:region-name "ap-northeast-2", :endpoint "ec2.ap-northeast-2.amazonaws.com"}
    ;;=>  {:region-name "ap-northeast-1", :endpoint "ec2.ap-northeast-1.amazonaws.com"}
    ;;=>  .....
    ;;=>  {:region-name "us-west-2", :endpoint "ec2.us-west-2.amazonaws.com"}]}

ec2/describe-instances returns a very long description, as follows:

    (ec2/describe-instances)
    ;;=> {:reservations [{:reservation-id "r-8efe3c2b", :requester-id "226008221399",
    ;;=>  :owner-id "182672843130", :group-names [], :groups [], ....

To get only the necessary information about instances, we define the following get-instances-info function:

    (defn get-instances-info []
      (let [inst (ec2/describe-instances)]
        (->> (mapcat :instances (inst :reservations))
             (map #(vector
                    [:node-name (->> (filter (fn [x] (= (:key x) "Name")) (:tags %))
                                     first
                                     :value)]
                    [:status (get-in % [:state :name])]
                    [:instance-id (:instance-id %)]
                    [:private-dns-name (:private-dns-name %)]
                    [:global-ip (-> % :network-interfaces first
                                    :private-ip-addresses first
                                    :association :public-ip)]
                    [:private-ip (-> % :network-interfaces first
                                     :private-ip-addresses first
                                     :private-ip-address)]))
             (map #(into {} %))
             (sort-by :node-name))))
    ;;=> #'aws-examples.ec2-example/get-instances-info

Let's try to use the function:

    (get-instances-info)
    ;;=> ({:node-name "ECS Instance - amazon-ecs-cli-setup-my-cluster",
    ;;=>   :status "running",
    ;;=>   :instance-id "i-a1257a3e",
    ;;=>   :private-dns-name "ip-10-0-0-212.ap-northeast-1.compute.internal",
    ;;=>   :global-ip "54.199.234.18",
    ;;=>   :private-ip "10.0.0.212"}
    ;;=>  {:node-name "EcsInstanceAsg",
    ;;=>   :status "terminated",
    ;;=>   :instance-id "i-c5bbef5a",
    ;;=>   :private-dns-name "",
    ;;=>   :global-ip nil,
    ;;=>   :private-ip nil})

As the preceding example shows, we can obtain the list of instance IDs, so we can start/stop instances using ec2/start-instances and ec2/stop-instances accordingly:

    (ec2/start-instances :instance-ids '("i-c5bbef5a"))
    ;;=> {:starting-instances
    ;;=>  [{:previous-state {:code 80, :name "stopped"},
    ;;=>    :current-state {:code 0, :name "pending"},
    ;;=>    :instance-id "i-c5bbef5a"}]}

    (ec2/stop-instances :instance-ids '("i-c5bbef5a"))
    ;;=> {:stopping-instances
    ;;=>  [{:previous-state {:code 16, :name "running"},
    ;;=>    :current-state {:code 64, :name "stopping"},
    ;;=>    :instance-id "i-c5bbef5a"}]}

Using S3

Amazon S3 is secure, durable, and scalable storage in the AWS cloud. It's easy to use for developers and other users, and it provides high durability and availability at low cost: the durability is 99.999999999% and the availability is 99.99%.
Let's create S3 buckets named makoto-bucket-1, makoto-bucket-2, and makoto-bucket-3 (the s3 alias refers to the amazonica.aws.s3 namespace):

    (s3/create-bucket "makoto-bucket-1")
    ;;=> {:name "makoto-bucket-1"}

    (s3/create-bucket "makoto-bucket-2")
    ;;=> {:name "makoto-bucket-2"}

    (s3/create-bucket "makoto-bucket-3")
    ;;=> {:name "makoto-bucket-3"}

s3/list-buckets returns bucket information:

    (s3/list-buckets)
    ;;=> [{:creation-date #object[org.joda.time.DateTime 0x6a09e119 "2016-08-01T07:01:05.000+09:00"],
    ;;=>   :owner {:id "3d6e87f691897059c23bcfb88b17da55f0c9aa02cc2a44e461f1594337059d27", :display-name "tokoma1"},
    ;;=>   :name "makoto-bucket-1"}
    ;;=>  {:creation-date #object[org.joda.time.DateTime 0x7392252c "2016-08-01T17:35:30.000+09:00"],
    ;;=>   :owner {:id "3d6e87f691897059c23bcfb88b17da55f0c9aa02cc2a44e461f1594337059d27", :display-name "tokoma1"},
    ;;=>   :name "makoto-bucket-2"}
    ;;=>  {:creation-date #object[org.joda.time.DateTime 0x4d59b4cb "2016-08-01T17:38:59.000+09:00"],
    ;;=>   :owner {:id "3d6e87f691897059c23bcfb88b17da55f0c9aa02cc2a44e461f1594337059d27", :display-name "tokoma1"},
    ;;=>   :name "makoto-bucket-3"}]

We can see that there are three buckets in the AWS console. Let's delete two of the three buckets (s3/delete-bucket removes a bucket) and list what remains:

    (s3/list-buckets)
    ;;=> [{:creation-date #object[org.joda.time.DateTime 0x56387509 "2016-08-01T07:01:05.000+09:00"],
    ;;=>   :owner {:id "3d6e87f691897059c23bcfb88b17da55f0c9aa02cc2a44e461f1594337059d27", :display-name "tokoma1"},
    ;;=>   :name "makoto-bucket-1"}]

We can see only one bucket now.

Next, we will demonstrate how to send your local data to S3. s3/put-object uploads file content to the specified bucket and key. The following code uploads /etc/hosts to makoto-bucket-1:

    (s3/put-object :bucket-name "makoto-bucket-1"
                   :key "test/hosts"
                   :file (java.io.File. "/etc/hosts"))
    ;;=> {:requester-charged? false, :content-md5 "HkBljfktNTl06yScnMRsjA==",
    ;;=>  :etag "1e40658df92d353974eb249c9cc46c8c", :metadata {:content-disposition nil,
    ;;=>  :expiration-time-rule-id nil, :user-metadata nil, :instance-length 0, :version-id nil,
    ;;=>  :server-side-encryption nil, :etag "1e40658df92d353974eb249c9cc46c8c", :last-modified nil,
    ;;=>  :cache-control nil, :http-expires-date nil, :content-length 0, :content-type nil,
    ;;=>  :restore-expiration-time nil, :content-encoding nil, :expiration-time nil, :content-md5 nil,
    ;;=>  :ongoing-restore nil}}

s3/list-objects lists the objects in a bucket as follows:

    (s3/list-objects :bucket-name "makoto-bucket-1")
    ;;=> {:truncated? false, :bucket-name "makoto-bucket-1", :max-keys 1000, :common-prefixes [],
    ;;=>  :object-summaries [{:storage-class "STANDARD", :bucket-name "makoto-bucket-1",
    ;;=>  :etag "1e40658df92d353974eb249c9cc46c8c",
    ;;=>  :last-modified #object[org.joda.time.DateTime 0x1b76029c "2016-08-01T07:01:16.000+09:00"],
    ;;=>  :owner {:id "3d6e87f691897059c23bcfb88b17da55f0c9aa02cc2a44e461f1594337059d27",
    ;;=>  :display-name "tokoma1"}, :key "test/hosts", :size 380}]}

To obtain the contents of objects in buckets, use s3/get-object:

    (s3/get-object :bucket-name "makoto-bucket-1" :key "test/hosts")
    ;;=> {:bucket-name "makoto-bucket-1", :key "test/hosts",
    ;;=>  :input-stream #object[com.amazonaws.services.s3.model.S3ObjectInputStream 0x24f810e9
    ;;=>  ......
;;=> :last-modified #object[org.joda.time.DateTime 0x79ad1ca9 "2016-08-01T07:01:16.000+09:00"],
;;=> :cache-control nil, :http-expires-date nil, :content-length 380, :content-type "application/octet-stream",
;;=> :restore-expiration-time nil, :content-encoding nil, :expiration-time nil, :content-md5 nil,
;;=> :ongoing-restore nil}}

The result is a map; the content is stream data, available as the value of :object-content. To get the result as a string, we will use slurp as follows:

(slurp (:object-content (s3/get-object :bucket-name "makoto-bucket-1" :key "test/hosts")))
;;=> "127.0.0.1\tlocalhost\n127.0.1.1\tphenix\n\n# The following lines are desirable for IPv6 capable hosts\n::1 ip6-localhost ip6-loopback\nfe00::0 ip6-localnet\nff00::0 ip6-mcastprefix\nff02::1 ip6-allnodes\nff02::2 ip6-allrouters\n\n52.8.30.189 my-cluster01-proxy1 \n52.8.169.10 my-cluster01-master1 \n52.8.198.115 my-cluster01-slave01 \n52.9.12.12 my-cluster01-slave02\n\n52.8.197.100 my-node01\n"

Using Amazon SQS

Amazon SQS is a high-performance, highly available, and scalable queue service. We will demonstrate how easy it is to handle messages on SQS queues using Clojure:

(ns aws-examples.sqs-example
  (:require [amazonica.core :as core]
            [amazonica.aws.sqs :as sqs]))

To create a queue, you can use sqs/create-queue as follows:

(sqs/create-queue :queue-name "makoto-queue"
                  :attributes {:VisibilityTimeout 3000
                               :MaximumMessageSize 65536
                               :MessageRetentionPeriod 1209600
                               :ReceiveMessageWaitTimeSeconds 15})
;;=> {:queue-url "https://sqs.ap-northeast-1.amazonaws.com/864062283993/makoto-queue"}

To get information about a queue, use sqs/get-queue-attributes as follows:

(sqs/get-queue-attributes "makoto-queue")
;;=> {:QueueArn "arn:aws:sqs:ap-northeast-1:864062283993:makoto-queue", ...

You can configure a dead letter queue using sqs/assign-dead-letter-queue as follows:

(sqs/create-queue "DLQ")
;;=> {:queue-url "https://sqs.ap-northeast-1.amazonaws.com/864062283993/DLQ"}
(sqs/assign-dead-letter-queue (sqs/find-queue "makoto-queue") (sqs/find-queue "DLQ") 10)
;;=> nil

Let's list the queues we have defined:

(sqs/list-queues)
;;=> {:queue-urls
;;=> ["https://sqs.ap-northeast-1.amazonaws.com/864062283993/DLQ"
;;=> "https://sqs.ap-northeast-1.amazonaws.com/864062283993/makoto-queue"]}

The following screenshot shows the SQS console:

Let's examine the URLs of the queues:

(sqs/find-queue "makoto-queue")
;;=> "https://sqs.ap-northeast-1.amazonaws.com/864062283993/makoto-queue"
(sqs/find-queue "DLQ")
;;=> "https://sqs.ap-northeast-1.amazonaws.com/864062283993/DLQ"

To send messages, we use sqs/send-message:

(sqs/send-message (sqs/find-queue "makoto-queue") "hello sqs from Clojure")
;;=> {:md5of-message-body "00129c8cc3c7081893765352a2f71f97", :message-id "690ddd68-a2f6-45de-b6f1-164eb3c9370d"}

To receive messages, we use sqs/receive-message:

(sqs/receive-message "makoto-queue")
;;=> {:messages [
;;=> {:md5of-body "00129c8cc3c7081893765352a2f71f97",
;;=> :receipt-handle "AQEB.....", :message-id "bd56fea8-4c9f-4946-9521-1d97057f1a06",
;;=> :body "hello sqs from Clojure"}]}

To remove all messages in your queues, we use sqs/purge-queue:

(sqs/purge-queue :queue-url (sqs/find-queue "makoto-queue"))
;;=> nil

To delete queues, we use sqs/delete-queue:

(sqs/delete-queue "makoto-queue")
;;=> nil
(sqs/delete-queue "DLQ")
;;=> nil

Serverless Clojure with AWS Lambda

Lambda is an AWS product that allows you to run Clojure code without the hassle and expense of setting up and maintaining a server environment.
Behind the scenes, there are still servers involved, but as far as you are concerned, it is a serverless environment. Upload a JAR and you are good to go. Code running on Lambda is invoked in response to an event, such as a file being uploaded to S3, or according to a specified schedule. In production environments, Lambda is normally used in a wider AWS deployment that includes standard server environments, to handle discrete computational tasks, particularly those that benefit from Lambda's horizontal scaling, which just happens with no configuration required. For Clojurians working on personal projects, Lambda is a wonderful combination of power and limitation. Just how far can you hack Lambda given the constraints imposed by AWS?

Clojure namespace helloworld

Start off with a clean, empty project generated using lein new. From there, in your IDE of choice, configure a package and a new Clojure source file. In the following example, the package is com.sakkam and the source file uses the Clojure namespace helloworld. The entry point to your Lambda code is a Clojure function that is exposed as a method of a Java class using Clojure's gen-class. Similar to use and require, the gen-class function can be included in the Clojure ns definition, as follows, or specified separately. You can use any name you want for the handler function, but it must be prefixed with a hyphen and match the method declared in :methods, unless an alternate prefix is specified as part of the :methods definition:

(ns com.sakkam.lambda.helloworld
  (:gen-class
   :methods [^:static [handler [String] String]]))

(defn -handler [s]
  (println (str "Hello," s)))

From the command line, use lein uberjar to create a JAR that can be uploaded to AWS Lambda.

Hello World – the AWS part

Getting your Hello World to work is now a matter of creating a new Lambda within AWS, uploading your JAR, and configuring your handler.

Hello Stream

The handler method we used in our Hello World Lambda function was coded directly and could be extended to accept custom Java classes as part of the method signature. However, for more complex Java integrations, implementing one of AWS's standard interfaces for Lambda is both straightforward and feels more like idiomatic Clojure. The following example replaces our own definition of a handler method with an implementation of a standard interface that is provided as part of the aws-lambda-java-core library. First of all, add the dependency [com.amazonaws/aws-lambda-java-core "1.0.0"] to your project.clj. While you are modifying your project.clj, also add the dependency [org.clojure/data.json "0.2.6"], since we will be manipulating JSON-formatted objects as part of this exercise. Then, either create a new Clojure namespace or modify your existing one so that it looks like the following (the handler function must be named -handleRequest, since handleRequest is specified as part of the interface):

(ns aws-examples.lambda-example
  (:gen-class
   :implements [com.amazonaws.services.lambda.runtime.RequestStreamHandler])
  (:require [clojure.java.io :as io]
            [clojure.data.json :as json]
            [clojure.string :as str]))

(defn -handleRequest [this is os context]
  (let [w (io/writer os)
        parameters (json/read (io/reader is) :key-fn keyword)]
    (println "Lambda Hello Stream Output ")
    (println "this class: " (class this))
    (println "is class:" (class is))
    (println "os class:" (class os))
    (println "context class:" (class context))
    (println "Parameters are " parameters)
    (.flush w)))

Use lein uberjar again to create a JAR file.
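Before uploading, it can be handy to smoke-test the handler locally from the REPL. The following is a minimal sketch, not part of the original tutorial: it fakes the Lambda input and output streams with in-memory ones, and the JSON payload is an arbitrary example. We pass nil for the this and context arguments, which the handler only inspects with class:

(require 'aws-examples.lambda-example)

(let [in  (java.io.ByteArrayInputStream. (.getBytes "{\"name\": \"world\"}"))
      out (java.io.ByteArrayOutputStream.)]
  ;; invoke the gen-class backing function directly; it should print the
  ;; class names and Parameters are {:name "world"}
  (aws-examples.lambda-example/-handleRequest nil in out nil))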
Since we have an existing Lambda function in AWS, we can overwrite the JAR used in the Hello World example. Since the handler function name has changed, we must modify our Lambda configuration to match. This time, the default test that provides parameters in JSON format should work as is, and the result will look something like the following:

We can very easily get a more interesting test of Hello Stream by configuring this Lambda to run whenever a file is uploaded to S3. At the Lambda management page, choose the Event Sources tab, click on Add Event, and choose an S3 bucket to which you can easily add a file. Now, upload a file to the specified S3 bucket and then navigate to the logs of the Hello World Lambda function. You will find that Hello World has been automatically invoked, and a fairly complicated object that represents the uploaded file is supplied as a parameter to our Lambda function.

Real-world Lambdas

To graduate from a Hello World Lambda to real-world Lambdas, the chances are you are going to need richer integration with other AWS facilities. As a minimum, you will probably want to write a file to an S3 bucket or publish a notification to an SNS topic. Amazon provides an SDK that makes this integration straightforward for developers using standard Java. For Clojurians, using the Amazonica Clojure wrapper is a very fast and easy way to achieve the same.

How it works…

Here, we will explain how AWS works.

What Is Amazon EC2?

Using EC2, we don't need to buy hardware or install operating systems. Amazon provides various types of instances for customers' use cases. Each instance type offers various combinations of CPU, memory, storage, and networking capacity. Some instance types are given in the following table. You can select appropriate instances according to the characteristics of your application.

M4: M4 instances are designed for general-purpose computing. This family provides balanced CPU, memory, and network bandwidth.
C4: C4 instances are designed for applications that consume CPU resources. C4 offers the highest CPU performance at the lowest cost.
R3: R3 instances are for memory-intensive applications.
G2: G2 instances have NVIDIA GPUs and are used for graphics applications and GPU computing applications such as deep learning.

The following table shows the available models of the M4 instance type; you can choose the one that fits best.

Model        vCPU  RAM (GiB)  EBS bandwidth (Mbps)
m4.large        2          8                   450
m4.xlarge       4         16                   750
m4.2xlarge      8         32                 1,000
m4.4xlarge     16         64                 2,000
m4.10xlarge    40        160                 4,000

Amazon S3

Amazon S3 is storage for the cloud. It provides a simple web interface that allows you to store and retrieve data. The S3 API is easy to use while ensuring security. S3 provides cloud storage services that are scalable, reliable, fast, and inexpensive.

Buckets and Keys

Buckets are containers for objects stored in Amazon S3. Objects are stored in buckets. A bucket name is unique across all regions in the world, so bucket names are the top-level identifiers of S3 and the units of charging and access control. Keys are the unique identifiers for an object within a bucket. Every object in a bucket has exactly one key. Keys are second-level identifiers and must be unique within a bucket. To identify an object, you use the combination of bucket name and key name.

Objects

Objects are accessed by bucket name and key. Objects consist of data and metadata. Metadata is a set of name-value pairs that describe the characteristics of the object.
Examples of metadata are the date last modified and the content type. Objects can have multiple versions of data.

There's more…

It is clearly impossible to review all the different APIs for all the different services offered via the Amazonica library, but by now you probably have a feeling of the tremendous power in your hands. (Don't forget to give that credit card back to your boss now …) Some other examples of Amazon services are as follows:

Amazon IoT: This provides a way for connected devices to easily and securely interact with cloud applications and other devices.
Amazon Kinesis: This gives you ways of easily loading massive volumes of streaming data into AWS and analyzing it through streaming techniques.

Summary

We hope you enjoyed this appetizer to the book Clojure Programming Cookbook, which presents a set of progressive readings to improve your Clojure skills and make Clojure your de facto everyday language for professional and efficient work. This book presents different topics of generic programming, which are always to the point, with some fun, so that each recipe feels less like a classroom and more like a fun read, with challenging exercises left to the reader to gradually build up skills. See you in the book!

Resources for Article: Further resources on this subject: Customizing Xtext Components [article] Reactive Programming and the Flux Architecture [article] Setup Routine for an Enterprise Spring Application [article]
Fast Data Manipulation with R

Packt
14 Oct 2016
28 min read
Data analysis is a combination of art and science. The art part consists of data exploration and visualization, which is usually done best with good intuition and understanding of the data. The science part consists of statistical analysis, which relies on concrete knowledge of statistics and analytic skills. However, both parts of serious research require proper tools and good skills to work with them. R is exactly the proper tool to do data analysis with. In this article by Kun Ren, author of the book Learning R Programming, we will discuss how R and the data.table package make it easy to transform data and thus greatly unleash our productivity. (For more resources related to this topic, see here.)

Loading data as data frames

The most basic data structures in R are atomic vectors, such as numeric, logical, character, and complex vectors, and lists. An atomic vector stores elements of the same type, while a list is allowed to store elements of different types. The most commonly used data structure in R to store real-world data is the data frame. A data frame stores data in tabular form. In essence, a data frame is a list of vectors with equal lengths but possibly different types. Most of the code in this article is based on a set of fictitious data about some products (you can download the data at https://gist.github.com/renkun-ken/ba2d33f21efded23db66a68240c20c92). We will use the readr package to load the data for better handling of column types. If you don't have this package installed, please run install.packages("readr").

library(readr)
product_info <- read_csv("data/product-info.csv")
product_info
##    id      name  type   class released
## 1 T01    SupCar   toy vehicle      yes
## 2 T02  SupPlane   toy vehicle       no
## 3 M01     JeepX model vehicle      yes
## 4 M02 AircraftX model vehicle      yes
## 5 M03    Runner model  people      yes
## 6 M04    Dancer model  people       no

Once the data is loaded into memory as a data frame, we can take a look at its column types, shown as follows:

sapply(product_info, class)
##          id        name        type       class    released
## "character" "character" "character" "character" "character"

Using built-in functions to manipulate data frames

Although a data frame is essentially a list of vectors, we can access it like a matrix because all the column vectors have the same length. To select rows that meet certain conditions, we will supply a logical vector as the first argument of [] while the second is left empty. For example, we can take out all rows of toy type, shown as follows:

product_info[product_info$type == "toy", ]
##    id     name type   class released
## 1 T01   SupCar  toy vehicle      yes
## 2 T02 SupPlane  toy vehicle       no
product_info[c("id", "name", "class")] ##    id      name   class ## 1 T01    SupCar vehicle ## 2 T02  SupPlane vehicle ## 3 M01     JeepX vehicle ## 4 M02 AircraftX vehicle ## 5 M03    Runner  people ## 6 M04    Dancer  people To filter a data frame by both row and column, we can supply a vector as the first argument to select rows and a vector as the second to select columns. product_info[product_info$type == "toy", c("name", "class", "released")] ##       name   class released ## 1   SupCar vehicle      yes ## 2 SupPlane vehicle       no If the row filtering condition is based on values of certain columns, the preceding code can be very redundant, especially when the condition gets more complicated. Another built-in function to simplify code is subset, as introduced previously. subset(product_info,   subset = type == "model" & released == "yes",   select = name:class) ##        name  type   class ## 3     JeepX model vehicle ## 4 AircraftX model vehicle ## 5    Runner model  people The subset function uses nonstandard evaluation so that we can directly use the columns of the data frame without typing product_info many times because the expressions are meant to be evaluated in the context of the data frame. Similarly, we can use with to evaluate an expression in the context of the data frame, that is, the columns of the data frame can be used as symbols in the expression without repeatedly specifying the data frame. with(product_info, name[released == "no"]) ## [1] "SupPlane" "Dancer" The expression can be more than a simple subsetting. We can summarize the data by counting the occurrences of each possible value of a vector. For example, we can create a table of occurrences of types of records that are released. with(product_info, table(type[released == "yes"])) ## ## model   toy ##     3     1 In addition to the table of product information, we also have a table of product statistics that describe some properties of each product. product_stats <- read_csv("data/product-stats.csv") product_stats ##    id material size weight ## 1 T01    Metal  120   10.0 ## 2 T02    Metal  350   45.0 ## 3 M01 Plastics   50     NA ## 4 M02 Plastics   85    3.0 ## 5 M03     Wood   15     NA ## 6 M04     Wood   16    0.6 Now, think of how we can get the names of products with the top three largest sizes? One way is to sort the records in product_stats by size in descending order, select id values of the top three records, and use these values to filter rows of product_info by id. top_3_id <- product_stats[order(product_stats$size, decreasing = TRUE), "id"][1:3] product_info[product_info$id %in% top_3_id, ] ##    id      name  type   class released ## 1 T01    SupCar   toy vehicle      yes ## 2 T02  SupPlane   toy vehicle       no ## 4 M02 AircraftX model vehicle      yes This approach looks quite redundant. Note that product_info and product_stats actually describe the same set of products in different perspectives. The connection between these two tables is the id column. Each id is unique and means the same product. To access both sets of information, we can put the two tables together into one data frame. 
The simplest way to do this is to use merge:

product_table <- merge(product_info, product_stats, by = "id")
product_table
##    id      name  type   class released material size weight
## 1 M01     JeepX model vehicle      yes Plastics   50     NA
## 2 M02 AircraftX model vehicle      yes Plastics   85    3.0
## 3 M03    Runner model  people      yes     Wood   15     NA
## 4 M04    Dancer model  people       no     Wood   16    0.6
## 5 T01    SupCar   toy vehicle      yes    Metal  120   10.0
## 6 T02  SupPlane   toy vehicle       no    Metal  350   45.0

Now, we have a new data frame that is a combined version of product_info and product_stats, joined on the shared id column. In fact, if you reorder the records in the second table, the two tables can still be correctly merged. With the combined version, we can do things more easily. For example, we can sort the data frame by any column from either of the loaded tables without having to manually work with the other.

product_table[order(product_table$size), ]
##    id      name  type   class released material size weight
## 3 M03    Runner model  people      yes     Wood   15     NA
## 4 M04    Dancer model  people       no     Wood   16    0.6
## 1 M01     JeepX model vehicle      yes Plastics   50     NA
## 2 M02 AircraftX model vehicle      yes Plastics   85    3.0
## 5 T01    SupCar   toy vehicle      yes    Metal  120   10.0
## 6 T02  SupPlane   toy vehicle       no    Metal  350   45.0

To solve the earlier problem, we can directly use the merged table and get the same answer.

product_table[order(product_table$size, decreasing = TRUE), "name"][1:3]
## [1] "SupPlane"  "SupCar"    "AircraftX"

The merged data frame allows us to sort the records by a column in one data frame and filter the records by a column in the other. For example, we can select all records of model type and sort them by weight in descending order. Note that the subset must be taken before reordering: applying a logical vector computed on the original row order to a reordered data frame would silently select the wrong rows.

model_records <- product_table[product_table$type == "model", ]
model_records[order(model_records$weight, decreasing = TRUE), ]
##    id      name  type   class released material size weight
## 2 M02 AircraftX model vehicle      yes Plastics   85    3.0
## 4 M04    Dancer model  people       no     Wood   16    0.6
## 1 M01     JeepX model vehicle      yes Plastics   50     NA
## 3 M03    Runner model  people      yes     Wood   15     NA

Sometimes, the column values are literal but can be converted to standard R data structures to better represent the data. For example, the released column in product_info only takes yes and no, which can be better represented with a logical vector. We can use <- to modify the column values, as we learned previously. However, it is usually better to create a new data frame with the existing columns properly adjusted and new columns added, without polluting the original data.
To do this, we can use transform: transform(product_table,   released = ifelse(released == "yes", TRUE, FALSE),   density = weight / size) ##    id      name  type   class released material size weight ## 1 M01     JeepX model vehicle     TRUE Plastics   50     NA ## 2 M02 AircraftX model vehicle     TRUE Plastics   85    3.0 ## 3 M03    Runner model  people     TRUE     Wood   15     NA ## 4 M04    Dancer model  people    FALSE     Wood   16    0.6 ## 5 T01    SupCar   toy vehicle     TRUE    Metal  120   10.0 ## 6 T02  SupPlane   toy vehicle    FALSE    Metal  350   45.0 ##      density ## 1         NA ## 2 0.03529412 ## 3         NA ## 4 0.03750000 ## 5 0.08333333 ## 6 0.12857143 The result is a new data frame with released converted to a logical vector and a new density column added. You can easily verify that product_table is not modified at all. Additionally, note that transform is like subset, as both functions use nonstandard evaluation to allow direct use of data frame columns as symbols in the arguments so that we don't have to type product_table$ all the time. Now, we will load another table into R. It is the test results of the quality, and durability of each product. We store the data in product_tests. product_tests <- read_csv("data/product-tests.csv") product_tests ##    id quality durability waterproof ## 1 T01      NA         10         no ## 2 T02      10          9         no ## 3 M01       6          4        yes ## 4 M02       6          5        yes ## 5 M03       5         NA        yes ## 6 M04       6          6        yes Note that the values in both quality and durability contain missing values (NA). To exclude all rows with missing values, we can use na.omit(): na.omit(product_tests) ##    id quality durability waterproof ## 2 T02      10          9         no ## 3 M01       6          4        yes ## 4 M02       6          5        yes ## 6 M04       6          6        yes Another way is to use complete.cases() to get a logical vector indicating all complete rows, without any missing value,: complete.cases(product_tests) ## [1] FALSE  TRUE  TRUE  TRUE FALSE  TRUE Then, we can use this logical vector to filter the data frame. For example, we can get the id  column of all complete rows as follows: product_tests[complete.cases(product_tests), "id"] ## [1] "T02" "M01" "M02" "M04" Or, we can get the id column of all incomplete rows: product_tests[!complete.cases(product_tests), "id"] ## [1] "T01" "M03" Note that product_info, product_stats and product_tests all share an id column, and we can merge them altogether. Unfortunately, there's no built-in function to merge an arbitrary number of data frames. We can only merge two existing data frames at a time, or we'll have to merge them recursively. 
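As a side note, that recursive merging can be expressed compactly with base R's Reduce(), which folds the pairwise merge over a list of data frames. The following is a minimal sketch, not from the original text, assuming every frame shares the id column:

# Fold the two-table merge over an arbitrary list of data frames
merge_all <- function(frames) {
  Reduce(function(x, y) merge(x, y, by = "id"), frames)
}
merge_all(list(product_info, product_stats, product_tests))

The pairwise merge shown next performs one step of exactly this fold.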
merge(product_table, product_tests, by = "id") ##    id      name  type   class released material size weight ## 1 M01     JeepX model vehicle      yes Plastics   50     NA ## 2 M02 AircraftX model vehicle      yes Plastics   85    3.0 ## 3 M03    Runner model  people      yes     Wood   15     NA ## 4 M04    Dancer model  people       no     Wood   16    0.6 ## 5 T01    SupCar   toy vehicle      yes    Metal  120   10.0 ## 6 T02  SupPlane   toy vehicle       no    Metal  350   45.0 ##   quality durability waterproof ## 1       6          4        yes ## 2       6          5        yes ## 3       5         NA        yes ## 4       6          6        yes ## 5      NA         10         no ## 6      10          9         no Data wrangling with data.table In the previous section, we had an overview on how we can use built-in functions to work with data frames. Built-in functions work, but are usually verbose. In this section, let's use data.table, an enhanced version of data.frame, and see how it makes data manipulation much easier. Run install.packages("data.table") to install the package. As long as the package is ready, we can load the package and use fread() to read the data files as data.table objects. library(data.table) product_info <- fread("data/product-info.csv") product_stats <- fread("data/product-stats.csv") product_tests <- fread("data/product-tests.csv") toy_tests <- fread("data/product-toy-tests.csv") It is extremely easy to filter data in data.table. To select the first two rows, just use [1:2], which instead selects the first two columns for data.frame. product_info[1:2] ##     id     name type   class released ## 1: T01   SupCar  toy vehicle      yes ## 2: T02 SupPlane  toy vehicle       no To filter by logical conditions, just directly type columns names as variables without quotation as the expression is evaluated within the context of product_info: product_info[type == "model" & class == "people"] ##     id   name  type  class released ## 1: M03 Runner model people      yes ## 2: M04 Dancer model people       no It is easy to select or transform columns. product_stats[, .(id, material, density = size / weight)] ##     id material   density ## 1: T01    Metal 12.000000 ## 2: T02    Metal  7.777778 ## 3: M01 Plastics        NA ## 4: M02 Plastics 28.333333 ## 5: M03     Wood        NA ## 6: M04     Wood 26.666667 The data.table object also supports using key for subsetting, which can be much faster than using ==. We can set a column as key for each data.table: setkey(product_info, id) setkey(product_stats, id) setkey(product_tests, id) Then, we can use a value to directly select rows. product_info["M02"] ##     id      name  type   class released ## 1: M02 AircraftX model vehicle      yes We can also set multiple columns as key so as to use multiple values to subset it. 
setkey(toy_tests, id, date)
toy_tests[.("T02", 20160303)]
##     id     date sample quality durability
## 1: T02 20160303     75       8          8

If two data.table objects share the same key, we can join them easily:

product_info[product_tests]
##     id      name  type   class released quality durability
## 1: M01     JeepX model vehicle      yes       6          4
## 2: M02 AircraftX model vehicle      yes       6          5
## 3: M03    Runner model  people      yes       5         NA
## 4: M04    Dancer model  people       no       6          6
## 5: T01    SupCar   toy vehicle      yes      NA         10
## 6: T02  SupPlane   toy vehicle       no      10          9
##    waterproof
## 1:        yes
## 2:        yes
## 3:        yes
## 4:        yes
## 5:         no
## 6:         no

Instead of creating a new data.table, in-place modification is also supported. The := operator sets the values of a column in place without the overhead of making copies and is, thus, much faster than using <-.

product_info[, released := (released == "yes")]
##     id      name  type   class released
## 1: M01     JeepX model vehicle     TRUE
## 2: M02 AircraftX model vehicle     TRUE
## 3: M03    Runner model  people     TRUE
## 4: M04    Dancer model  people    FALSE
## 5: T01    SupCar   toy vehicle     TRUE
## 6: T02  SupPlane   toy vehicle    FALSE
product_info
##     id      name  type   class released
## 1: M01     JeepX model vehicle     TRUE
## 2: M02 AircraftX model vehicle     TRUE
## 3: M03    Runner model  people     TRUE
## 4: M04    Dancer model  people    FALSE
## 5: T01    SupCar   toy vehicle     TRUE
## 6: T02  SupPlane   toy vehicle    FALSE

Another important argument for subsetting a data.table is by, which is used to split the data into multiple parts, and for each part the second argument (j) is evaluated. For example, the simplest usage of by is counting the records in each group. In the following code, we can count the number of both released and unreleased products:

product_info[, .N, by = released]
##    released N
## 1:     TRUE 4
## 2:    FALSE 2

The group can be defined by more than one variable. For example, a tuple of type and class can be a group, and for each group, we can count the number of records, as follows:

product_info[, .N, by = .(type, class)]
##     type   class N
## 1: model vehicle 2
## 2: model  people 2
## 3:   toy vehicle 2

We can also perform the following statistical calculations for each group:

product_tests[, .(mean_quality = mean(quality, na.rm = TRUE)),
  by = .(waterproof)]
##    waterproof mean_quality
## 1:        yes         5.75
## 2:         no        10.00

We can chain multiple [] in turn. In the following example, we will first join product_info and product_tests by the shared key id and then calculate the mean values of quality and durability for each group of type and class among released products.

product_info[product_tests][released == TRUE,
  .(mean_quality = mean(quality, na.rm = TRUE),
    mean_durability = mean(durability, na.rm = TRUE)),
  by = .(type, class)]
##     type   class mean_quality mean_durability
## 1: model vehicle            6             4.5
## 2: model  people            5             NaN
## 3:   toy vehicle          NaN            10.0

Note that the values of the by columns will be unique in the resulting data.table; we can use keyby instead of by to ensure that they are automatically used as the key of the resulting data.table.
product_info[product_tests][released == TRUE,
  .(mean_quality = mean(quality, na.rm = TRUE),
    mean_durability = mean(durability, na.rm = TRUE)),
  keyby = .(type, class)]
##     type   class mean_quality mean_durability
## 1: model  people            5             NaN
## 2: model vehicle            6             4.5
## 3:   toy vehicle          NaN            10.0

The data.table package also provides functions to perform superfast reshaping of data. For example, we can use dcast() to spread id values along the x-axis as columns and align quality values to all possible date values along the y-axis.

toy_quality <- dcast(toy_tests, date ~ id, value.var = "quality")
toy_quality
##        date T01 T02
## 1: 20160201   9   7
## 2: 20160302  10  NA
## 3: 20160303  NA   8
## 4: 20160403  NA   9
## 5: 20160405   9  NA
## 6: 20160502   9  10

Although a test is conducted for each product every month, the dates may not exactly match each other. This results in missing values if one product has a value on a day but the other has no corresponding value on exactly the same day. One way to fix this is to use year-month data instead of the exact date. In the following code, we will create a new ym column that holds the first 6 characters of toy_tests$date. For example, substr(20160101, 1, 6) will result in 201601.

toy_tests[, ym := substr(toy_tests$date, 1, 6)]
##     id     date sample quality durability     ym
## 1: T01 20160201    100       9          9 201602
## 2: T01 20160302    150      10          9 201603
## 3: T01 20160405    180       9         10 201604
## 4: T01 20160502    140       9          9 201605
## 5: T02 20160201     70       7          9 201602
## 6: T02 20160303     75       8          8 201603
## 7: T02 20160403     90       9          8 201604
## 8: T02 20160502     85      10          9 201605
toy_tests$ym
## [1] "201602" "201603" "201604" "201605" "201602" "201603"
## [7] "201604" "201605"

This time, we will use ym for alignment instead of date:

toy_quality <- dcast(toy_tests, ym ~ id, value.var = "quality")
toy_quality
##        ym T01 T02
## 1: 201602   9   7
## 2: 201603  10   8
## 3: 201604   9   9
## 4: 201605   9  10

Now that the missing values are gone, the quality scores of both products in each month are naturally presented. Sometimes, we will need to combine a number of columns into one column that indicates the measure and another that stores the value. For example, the following code uses melt() to combine the two measures (quality and durability) of the original data into a column named measure and a column of the measured value.

toy_tests2 <- melt(toy_tests, id.vars = c("id", "ym"),
  measure.vars = c("quality", "durability"),
  variable.name = "measure")
toy_tests2
##      id     ym    measure value
##  1: T01 201602    quality     9
##  2: T01 201603    quality    10
##  3: T01 201604    quality     9
##  4: T01 201605    quality     9
##  5: T02 201602    quality     7
##  6: T02 201603    quality     8
##  7: T02 201604    quality     9
##  8: T02 201605    quality    10
##  9: T01 201602 durability     9
## 10: T01 201603 durability     9
## 11: T01 201604 durability    10
## 12: T01 201605 durability     9
## 13: T02 201602 durability     9
## 14: T02 201603 durability     8
## 15: T02 201604 durability     8
## 16: T02 201605 durability     9

The variable names are now contained in the data, which can be directly used by some packages. For example, we can use ggplot2 to plot data in this format.
The following code is an example of a scatter plot with a facet grid of different combination of factors. library(ggplot2) ggplot(toy_tests2, aes(x = ym, y = value)) +   geom_point() +   facet_grid(id ~ measure) The graph generated is shown as follows: The plot can be easily manipulated because the grouping factor (measure) is contained as data rather than columns, which is easier to represent from the perspective of the ggplot2 package. ggplot(toy_tests2, aes(x = ym, y = value, color = id)) +   geom_point() +   facet_grid(. ~ measure) The graph generated is shown as follows: Summary In this article, we used both built-in functions and the data.table package to perform simple data manipulation tasks. Using built-in functions can be verbose while using data.table can be much easier and faster. However, the tasks in real-world data analysis can be much more complex than the examples we demonstrated, which also requires better R programming skills. It is helpful to have a good understanding on how nonstandard evaluation makes data.table so easy to work with, how environment works and scoping rules apply to make your code predictable, and so on. A universal and consistent understanding of how R basically works will certainly give you great confidence to write R code to work with data and enable you to learn packages very quickly. Resources for Article: Further resources on this subject: Supervised Machine Learning [article] Getting Started with Bootstrap [article] Basics of Classes and Objects [article]
Loops, Conditions, and Recursion

Packt
14 Oct 2016
14 min read
In this article from Paul Johnson, author of the book Learning Rust, we will take a look at loops and conditions, which are a fundamental aspect of operation within any programming language. You may be looping around a list attempting to find when something matches, and when a match occurs, branching out to perform some other task; or, you may just want to check a value to see if it meets a condition. In any case, Rust allows you to do this. (For more resources related to this topic, see here.) In this article, we will cover the following topics:

Types of loop available
Different types of branching within loops
Recursive methods
When the semi-colon (;) can be omitted and what it means

Loops

Rust has essentially three types of loop: for, loop, and while.

The for loop

This type of loop is very simple to understand, yet rather powerful in operation. It is simple in that we have a start value, an end condition, and some form of value change, although the power comes from those last two points. Let's take a simple example to start with: a loop that counts from 0 towards 10 and outputs the value:

for x in 0..10 { println!("{},", x); }

We create a variable x that takes the expression (0..10) and does something with it. In Rust terminology, x is not only a variable but also an iterator, as it gives back a value from a series of elements. This is obviously a very simple example. We can also go downwards, but the syntax is slightly different. In C, you would expect something akin to for (i = 10; i > 0; --i). This is not available in Rust, at least not in the stable branches. Instead, we will use the rev() method, which is as follows:

for x in (0..10).rev() { println!("{},", x); }

It is worth noting that, as with the C family, the last number is excluded, so the first example outputs the values 0 to 9; the rev() version generates the same values and then outputs them in reverse, from 9 down to 0. Notice also that the range is in parentheses. This is because the second parameter is the condition. In C#, this will be the equivalent of a foreach. In Rust, it will be as follows:

for var in condition { // do something }

The C# equivalent for the preceding code is:

foreach(var t in condition) // do something

Using enumerate

A loop condition can also be more complex, using multiple conditions and variables. For example, the for loop can be tracked using enumerate. This will keep track of how many times the loop has executed, as shown here:

for (i, j) in (10..20).enumerate() { println!("loop has executed {} times. j = {}", i, j); }

The following is the output:

The enumeration is given in the first variable with the condition in the second. This example is not of that much use, but where it comes into its own is when looping over an iterator. Say we have an array that we need to iterate over to obtain the values. Here, enumerate can be used to obtain the values of the array members.
However, the value returned in the condition will be a reference, so code such as that shown in the following example will fail to execute (line is a & reference, whereas an i32 is expected):

fn main() {
    let my_array: [i32; 7] = [1i32, 3, 5, 7, 9, 11, 13];
    let mut value = 0i32;
    for (_, line) in my_array.iter().enumerate() {
        value += line;
    }
    println!("{}", value);
}

This can be fixed simply by dereferencing the value, as follows:

for (_, line) in my_array.iter().enumerate() { value += *line; }

The iter().enumerate() method can equally be used with the Vec type, as shown in the following code:

fn main() {
    let my_array = vec![1i32, 3, 5, 7, 9, 11, 13];
    let mut value = 0i32;
    for (_, line) in my_array.iter().enumerate() {
        value += *line;
    }
    println!("{}", value);
}

In both cases, the value given at the end will be 49, as shown in the following screenshot:

The _ parameter

You may be wondering what the _ parameter is. It is Rust's way of saying that there is an argument, but we'll never do anything with it; it's a parameter that is only there to ensure that the code compiles. It's a throw-away. The _ parameter cannot be referred to either; whereas we can do something with linenumber in for(linenumber, line), we can't do anything with _ in for(_, line).

The simple loop

A simple form of the loop is called loop:

loop { println!("Hello"); }

The preceding code will output Hello either until the application is terminated or until the loop reaches a terminating statement.

While…

The while condition is of slightly more use, as you will see in the following code snippet:

while (condition) { // do something }

Let's take a look at the following example:

fn main() {
    let mut done = 0u32;
    while done != 32 {
        println!("done = {}", done);
        done += 1;
    }
}

The preceding code will output done = 0 to done = 31. The loop terminates when done equals 32.

Prematurely terminating a loop

Depending on the size of the data being iterated over within a loop, the loop can be costly on processor time. For example, say the server is receiving data from a data-logging application, such as measuring values from a gas chromatograph; over the entire scan, it may record roughly half a million data points with an associated time position. For our purposes, we want to add all of the recorded values until the value is over 1.5, and once that is reached, we can stop the loop. Sounds easy? There is one thing not mentioned: there is no guarantee that the recorded value will ever go over 1.5, so how can we terminate the loop if that value is never reached? We can do this in one of two ways. The first is to use a while loop and introduce a Boolean to act as the test condition. In the following example, my_array represents a very small subsection of the data sent to the server.

fn main() {
    let my_array = vec![0.6f32, 0.4, 0.2, 0.8, 1.3, 1.1, 1.7, 1.9];
    let mut counter: usize = 0;
    let mut result = 0f32;
    let mut test = false;
    while test != true {
        if my_array[counter] > 1.5 {
            test = true;
        } else {
            result += my_array[counter];
            counter += 1;
        }
    }
    println!("{}", result);
}

The result here is 4.4. This code is perfectly acceptable, if slightly long-winded. Rust also allows the use of the break and continue keywords (if you're familiar with C, they work in the same way).
Our code using break will be as follows: fn main() { let my_array = vec![0.6f32, 0.4, 0.2, 0.8, 1.3, 1.1, 1.7, 1.9]; let mut result = 0f32; for(_, value) in my_array.iter().enumerate() { if *value > 1.5 { break; } else { result += *value; } } println!("{}", result); } Again, this will give an answer of 4.4, indicating that the two methods used are the equivalent of each other. If we replace break with continue in the preceding code example, we will get the same result (4.4). The difference between break and continue is that continue jumps to the next value in the iteration rather than jumping out, so if we had the final value of my_array as 1.3, the output at the end should be 5.7. When using break and continue, always keep in mind this difference. While it may not crash the code, mistaking break and continue may lead to results that you may not expect or want. Using loop labels Rust allows us to label our loops. This can be very useful (for example with nested loops). These labels act as symbolic names to the loop and as we have a name to the loop, we can instruct the application to perform a task on that name. Consider the following simple example: fn main() { 'outer_loop: for x in 0..10 { 'inner_loop: for y in 0..10 { if x % 2 == 0 { continue 'outer_loop; } if y % 2 == 0 { continue 'inner_loop; } println!("x: {}, y: {}", x, y); } } } What will this code do? Here x % 2 == 0 (or y % 2 == 0) means that if variable divided by two returns no remainder, then the condition is met and it executes the code in the braces. When x % 2 == 0, or when the value of the loop is an even number, we will tell the application to skip to the next iteration of outer_loop, which is an odd number. However, we will also have an inner loop. Again, when y % 2 is an even value, we will tell the application to skip to the next iteration of inner_loop. In this case, the application will output the following results: While this example may seem very simple, it does allow for a great deal of speed when checking data. Let's go back to our previous example of data being sent to the web service. Recall that we have two values—the recorded data and some other value, for ease, it will be a data point. Each data point is recorded 0.2 seconds apart; therefore, every 5th data point is 1 second. This time, we want all of the values where the data is greater than 1.5 and the associated time of that data point but only on a time when it's dead on a second. As we want the code to be understandable and human readable, we can use a loop label on each loop. The following code is not quite correct. Can you spot why? The code compiles as follows: fn main() { let my_array = vec![0.6f32, 0.4, 0.2, 0.8, 1.3, 1.1, 1.7, 1.9, 1.3, 0.1, 1.6, 0.6, 0.9, 1.1, 1.31, 1.49, 1.5, 0.7]; let my_time = vec![0.2f32, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4, 2.6, 2.8, 3.0, 3.2, 3.4, 3.6, 3.8]; 'time_loop: for(_, time_value) in my_time.iter().enumerate() { 'data_loop: for(_, value) in my_array.iter().enumerate() { if *value < 1.5 { continue 'data_loop; } if *time_value % 5f32 == 0f32 { continue 'time_loop; } println!("Data point = {} at time {}s", *value, *time_value); } } } This example is a very good one to demonstrate the correct operator in use. The issue is the if *time_value % 5f32 == 0f32 line. We are taking a float value and using the modulus of another float to see if we end up with 0 as a float. 
Comparing any value that is not a string, int, long, or bool type to another is never a good plan; especially, if the value is returned by some form of calculation. We can also not simply use continue on the time loop, so, how can we solve this problem? If you recall, we're using _ instead of a named parameter for the enumeration of the loop. These values are always an integer, therefore if we replace _ for a variable name, then we can use % 5 to perform the calculation and the code becomes: 'time_loop: for(time_enum, time_value) in my_time.iter().enumerate() { 'data_loop: for(_, value) in my_array.iter().enumerate() { if *value < 1.5 { continue 'data_loop; } if time_enum % 5 == 0 { continue 'time_loop; } println!("Data point = {} at time {}s", *value, *time_value); } } The next problem is that the output isn't correct. The code gives the following: Data point = 1.7 at time 0.4s Data point = 1.9 at time 0.4s Data point = 1.6 at time 0.4s Data point = 1.5 at time 0.4s Data point = 1.7 at time 0.6s Data point = 1.9 at time 0.6s Data point = 1.6 at time 0.6s Data point = 1.5 at time 0.6s The data point is correct, but the time is way out and continually repeats. We still need the continue statement for the data point step, but the time step is incorrect. There are a couple of solutions, but possibly the simplest will be to store the data and the time into a new vector and then display that data at the end. The following code gets closer to what is required: fn main() { let my_array = vec![0.6f32, 0.4, 0.2, 0.8, 1.3, 1.1, 1.7, 1.9, 1.3, 0.1, 1.6, 0.6, 0.9, 1.1, 1.31, 1.49, 1.5, 0.7]; let my_time = vec![0.2f32, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4, 2.6, 2.8, 3.0, 3.2, 3.4, 3.6, 3.8]; let mut my_new_array = vec![]; let mut my_new_time = vec![]; 'time_loop: for(t, _) in my_time.iter().enumerate() { 'data_loop: for(v, value) in my_array.iter().enumerate() { if *value < 1.5 { continue 'data_loop; } else { if t % 5 != 0 { my_new_array.push(*value); my_new_time.push(my_time[v]); } } if v == my_array.len() { break; } } } for(m, my_data) in my_new_array.iter().enumerate() { println!("Data = {} at time {}", *my_data, my_new_time[m]); } } We will now get the following output: Data = 1.7 at time 1.4 Data = 1.9 at time 1.6 Data = 1.6 at time 2.2 Data = 1.5 at time 3.4 Data = 1.7 at time 1.4 Yes, we now have the correct data, but the time starts again. We're close, but it's not right yet. We aren't continuing the time_loop loop and we will also need to introduce a break statement. To trigger the break, we will create a new variable called done. When v, the enumerator for my_array, reaches the length of the vector (this is the number of elements in the vector), we will change this from false to true. This is then tested outside of the data_loop. If done == true, break out of the loop. 
The final version of the code is as follows: fn main() { let my_array = vec![0.6f32, 0.4, 0.2, 0.8, 1.3, 1.1, 1.7, 1.9, 1.3, 0.1, 1.6, 0.6, 0.9, 1.1, 1.31, 1.49, 1.5, 0.7]; let my_time = vec![0.2f32, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4, 2.6, 2.8, 3.0, 3.2, 3.4, 3.6]; let mut my_new_array = vec![]; let mut my_new_time = vec![]; let mut done = false; 'time_loop: for(t, _) in my_time.iter().enumerate() { 'data_loop: for(v, value) in my_array.iter().enumerate() { if v == my_array.len() - 1 { done = true; } if *value < 1.5 { continue 'data_loop; } else { if t % 5 != 0 { my_new_array.push(*value); my_new_time.push(my_time[v]); } else { continue 'time_loop; } } } if done {break;} } for(m, my_data) in my_new_array.iter().enumerate() { println!("Data = {} at time {}", *my_data, my_new_time[m]); } } Our final output from the code is this: Recursive functions The final form of loop to consider is known as a recursive function. This is a function that calls itself until a condition is met. In pseudocode, the function looks like this: float my_function(i32:a) { // do something with a if (a != 32) { my_function(a); } else { return a; } } An actual implementation of a recursive function would look like this: fn recurse(n:i32) { let v = match n % 2 { 0 => n / 2, _ => 3 * n + 1 }; println!("{}", v); if v != 1 { recurse(v) } } fn main() { recurse(25) } The idea of a recursive function is very simple, but we need to consider two parts of this code. The first is the let line in the recurse function and what it means: let v = match n % 2 { 0 => n / 2, _ => 3 * n + 1 }; Another way of writing this is as follows: let mut v = 0i32; if n % 2 == 0 { v = n / 2; } else { v = 3 * n + 1; } In C#, this will equate to the following: var v = n % 2 == 0 ? n / 2 : 3 * n + 1; The second part is that the semicolon is not being used everywhere. Consider the following example: fn main() { recurse(25) } What is the difference between having and not having a semicolon? Rust operates on a system of blocks called closures. The semicolon closes a block. Let's see what that means. Consider the following code as an example: fn main() { let x = 5u32; let y = { let x_squared = x * x; let x_cube = x_squared * x; x_cube + x_squared + x }; let z = { 2 * x; }; println!("x is {:?}", x); println!("y is {:?}", y); println!("z is {:?}", z); } We have two different uses of the semicolon. If we look at the let y line first: let y = { let x_squared = x * x; let x_cube = x_squared * x; x_cube + x_squared + x // no semi-colon }; This code does the following: The code within the braces is processed. The final line, without the semicolon, is assigned to y. Essentially, this is considered as an inline function that returns the line without the semicolon into the variable. The second line to consider is for z: let z = { 2 * x; }; Again, the code within the braces is evaluated. In this case, the line ends with a semicolon, so the result is suppressed and () to z. When it is executed, we will get the following results: In the code example, the line within fn main calling recurse gives the same result with or without the semicolon. Summary In this, we've covered the different types of loops that are available within Rust, as well as gained an understanding of when to use a semicolon and what it means to omit it. We have also considered enumeration and iteration over a vector and array and how to handle the data held within them. 
Resources for Article: Further resources on this subject: Extra, Extra Collection, and Closure Changes that Rock! [article] Create a User Profile System and use the Null Coalesce Operator [article] Fine Tune Your Web Application by Profiling and Automation [article]
Applying Themes to Sails Applications, Part 2

Luis Lobo
14 Oct 2016
4 min read
In Part 1 of this series covering themes in the Sails Framework, we bootstrapped our sample Sails app (step 1). Here in Part 2, we will complete steps 2 and 3, compiling our theme’s CSS and the necessary Less files and setting up the theme Sails hook to complete our application.

Step 2 – Adding a task for compiling our theme's CSS and the necessary Less files

Let’s pick things back up where we left off in Part 1. We now want to customize our page to have our burrito style. We need to add a task that compiles our themes. Edit your /tasks/config/less.js so that it looks like this one:

module.exports = function (grunt) {
  grunt.config.set('less', {
    dev: {
      files: [{
        expand: true,
        cwd: 'assets/styles/',
        src: ['importer.less'],
        dest: '.tmp/public/styles/',
        ext: '.css'
      }, {
        expand: true,
        cwd: 'assets/themes/export',
        src: ['*.less'],
        dest: '.tmp/public/themes/',
        ext: '.css'
      }]
    }
  });
  grunt.loadNpmTasks('grunt-contrib-less');
};

Basically, we added a second object to the files section, which tells the Less compiler task to look for any Less file in assets/themes/export, compile it, and put the resulting CSS in the .tmp/public/themes folder. In case you were not aware of it, the .tmp/public folder is the one Sails uses to publish its assets. We now create two themes: one is default.less and the other is burrito.less, which is based on default.less. We also have two other Less files, each one holding the variables for one theme. This technique allows you to have one base theme and many other themes based on the default.

/assets/themes/variables.less
@app-navbar-background-color: red;
@app-navbar-brand-color: white;

/assets/themes/variablesBurrito.less
@app-navbar-background-color: green;
@app-navbar-brand-color: yellow;

/assets/themes/export/default.less
@import "../variables.less";
.navbar-inverse {
  background-color: @app-navbar-background-color;
  .navbar-brand {
    color: @app-navbar-brand-color;
  }
}

/assets/themes/export/burrito.less
@import "default.less";
@import "../variablesBurrito.less";

So, burrito.less just inherits from default.less but overrides the variables with its own, creating a new theme based on the default. If you lift Sails now, you will notice that the navigation bar has a red background with white brand text.

Step 3 – Setting up the theme Sails hook

The last step involves creating a hook, a Node module that adds functionality to the Sails core, that catches the hostname and, if it has burrito in it, sets the new theme. First, let’s create the folder for the hook:

mkdir -p ./api/hooks/theme

Now create a file named index.js in that folder with this content:

/**
 * theme hook - Sets the correct CSS to be displayed
 */
module.exports = function (sails) {
  return {
    routes: {
      before: {
        'all /*': function (req, res, next) {
          if (!req.isSocket) {
            // makes theme variable available in views
            res.locals.theme = sails.hooks.theme.getTheme(req);
          }
          return next();
        }
      }
    },
    /**
     * getTheme defines which css needs to be used for this request
     * In this case, we select the theme by pattern matching certain words from the hostname
     */
    getTheme: function (req) {
      var hostname = 'default';
      var theme = 'default';
      try {
        hostname = req.get('host').toLowerCase();
      } catch (e) {
        // host may not be available always (i.e., socket calls.
        // If you need that, add a Host header in your sails socket configuration)
      }
      // if burrito is found on the hostname, change the theme
      if (hostname.indexOf('burrito') > -1) {
        theme = 'burrito';
      }
      return theme;
    }
  };
};

Finally, to test our configuration, we need to add a host entry to our OS hosts file. In Linux/Unix-based operating systems, you have to edit /etc/hosts (with sudo or as root). Add the following line:

127.0.0.1 burrito.smartdelivery.local www.smartdelivery.local

Now navigate using those host names, first to www.smartdelivery.local:

And lastly, navigate to burrito.smartdelivery.local:

You now have your Burrito Smart Delivery! And you have a themed Sails application! I hope you have enjoyed this series. You can get the source code from here. Enjoy!

About the author

Luis Lobo Borobia is the CTO at FictionCity.NET, a mentor and advisor, an independent software engineer and consultant, and a conference speaker. He has a background as a software analyst and designer, creating, designing, and implementing software products, solutions, frameworks, and platforms for several kinds of industries. In the last few years, he has focused on research and development for the Internet of Things, using the latest bleeding-edge software and hardware technologies available.
Asynchronous Programming in F#

Packt
12 Oct 2016
15 min read
This article by Alfonso Garcia Caro Nunez and Suhaib Fahad, authors of the book Mastering F#, sheds light on how writing applications that are non-blocking or react to events is becoming increasingly important in the cloud world we live in. A modern application needs to carry out rich user interaction, communicate with web services, react to notifications, and so on; the execution of reactive applications is controlled by events. Asynchronous programming is characterized by many simultaneously pending reactions to internal or external events. These reactions may or may not be processed in parallel. (For more resources related to this topic, see here.)

In .NET, both C# and F# provide an asynchronous programming experience through keywords and syntax. In this article, we will go through the asynchronous programming model in F#, with a bit of cross-referencing or comparison drawn with the C# world. In this article, you will learn about asynchronous workflows in F#.

Asynchronous workflows in F#

Asynchronous workflows are computation expressions that are set up to run asynchronously. This means that the system runs without blocking the current computation thread when a sleep, I/O, or other asynchronous operation is performed. You may be wondering why we need asynchronous programming and why we can't just use the threading concepts that we have used for so long. The problem with threads is that an operation occupies the thread for the entire time that something happens or a computation is done. On the other hand, asynchronous programming uses a thread only when it is required; otherwise, the code runs normally. There is also a lot of marshalling and unmarshalling (junk) code that we would have to write to overcome the issues we face when dealing directly with threads. Thus, the asynchronous model allows the code to execute efficiently, whether we are downloading a page 50 or 100 times using a single thread or doing I/O operations over the network with a lot of incoming requests from the other endpoint.

The Async module in F# exposes a list of functions to create or consume these asynchronous workflows. The asynchronous pattern allows writing code that looks like it is written for a single-threaded program but internally uses async blocks to execute. There are various triggering functions that provide a wide variety of ways to run an asynchronous workflow: on a background thread, as a .NET Framework task object, or as a computation running in the current thread itself. In this article, we will use the example of downloading the content of a webpage and modifying the data, which is as follows:

let downloadPage (url: string) = async {
    let req = HttpWebRequest.Create(url)
    use! resp = req.AsyncGetResponse()
    use respStream = resp.GetResponseStream()
    use sr = new StreamReader(respStream)
    return sr.ReadToEnd()
}

downloadPage("https://www.google.com") |> Async.RunSynchronously

The preceding function does the following:

The async expression, { … }, generates an object of type Async<string>
These values are not actual results; rather, they are specifications of tasks that need to run and return a string
Async.RunSynchronously takes this object and runs it synchronously

We just wrote a simple function with asynchronous workflows with relative ease, and we can reason about the code much more easily than with Begin/End routines. One of the most important points here is that the code is never blocked during the execution of the asynchronous workflow. This means that we can, in principle, have thousands of outstanding web requests, the limit being the number supported by the machine, not the number of threads that host them.
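To make that point concrete, here is a minimal sketch, not from the original text, that fans out several downloads with the standard Async.Parallel combinator; the URL list is an arbitrary example:

let downloadAll urls =
    urls
    |> List.map downloadPage    // one Async<string> per URL; nothing runs yet
    |> Async.Parallel           // combine them into a single Async<string[]>
    |> Async.RunSynchronously   // run them all; the requests overlap in flight

let pages = downloadAll ["https://www.google.com"; "https://www.bing.com"]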
One of the most important points here is that the code is never blocked during the execution of the asynchronous workflow. This means that we can, in principle, have thousands of outstanding web requests; the limit is the number supported by the machine, not the number of threads that host them.

Using let!

In asynchronous workflows, we use the let! binding to enable execution to continue on other computations or threads while the bound computation is being performed. After the execution is complete, the rest of the asynchronous workflow is executed, thus simulating a sequential execution in an asynchronous way.

In addition to let!, we can also use use! to perform asynchronous bindings; with use!, the object gets disposed when it loses the current scope. In our previous example, we used use! to get the HttpWebResponse object. We can also do the following:

```
let! resp = req.AsyncGetResponse()
// process response
```

While let! is used to start an operation and bind the result to a value, do! is used when the return of the async expression is a unit:

```
do! Async.Sleep(1000)
```

Understanding asynchronous workflows

As explained earlier, asynchronous workflows are nothing but computation expressions with asynchronous patterns. They basically implement the Bind/Return pattern in their inner workings. This means that the let! expression is translated into a call to async.Bind, and return into a call to async.Return; both are defined on the async builder in the F# library. This is compiler functionality that translates the let! expression into computation workflows, and you, as a developer, will never be required to understand it in detail. The purpose of explaining this piece is to show that the internal workings of an asynchronous workflow are nothing but a computation expression. The following listing shows the translated version of the downloadPage function we defined earlier:

```
async.Delay(fun () ->
    let req = HttpWebRequest.Create(url)
    async.Bind(req.AsyncGetResponse(), fun resp ->
        async.Using(resp, fun resp ->
            let respStream = resp.GetResponseStream()
            async.Using(new StreamReader(respStream), fun sr ->
                async.Return(sr.ReadToEnd())
            )
        )
    )
)
```

The following things are happening in the workflow:

- The Delay function has a deferred lambda that executes later.
- The body of the lambda creates an HttpWebRequest, which is forwarded in the variable req to the next segment in the workflow.
- The AsyncGetResponse function is called and a workflow is generated, which knows how to execute the response and what to invoke when the operation is completed. This happens internally through the BeginGetResponse and EndGetResponse functions already present in the HttpWebRequest class; AsyncGetResponse is just a wrapper extension present in the F# Async module.
- The Using function then creates a closure to dispose of the object implementing the IDisposable interface once the workflow is complete.

Async module

The Async module has a list of functions that allow writing or consuming asynchronous code. We will go through each function in detail, with an example, to understand it better.

Async.AsBeginEnd

It is very useful to expose the F# workflow functionality outside F#, say, if we want the APIs to be used and consumed from C#. The Async.AsBeginEnd method makes it possible to expose an asynchronous workflow as a triple of methods (Begin/End/Cancel) following the .NET Asynchronous Programming Model (APM).
Based on our downloadPage function, we can define the Begin, End, and Cancel functions as follows:

```
type Downloader() =
    let beginMethod, endMethod, cancelMethod =
        Async.AsBeginEnd downloadPage

    member this.BeginDownload(url, callback, state : obj) =
        beginMethod(url, callback, state)

    member this.EndDownload(ar) = endMethod ar

    member this.CancelDownload(ar) = cancelMethod(ar)
```

Async.AwaitEvent

The Async.AwaitEvent method creates an asynchronous computation that waits for a single invocation of a .NET framework event by adding a handler to the event:

```
type MyEvent(v : string) =
    inherit EventArgs()
    member this.Value = v

let testAwaitEvent (evt : IEvent<MyEvent>) = async {
    printfn "Before waiting"
    let! r = Async.AwaitEvent evt
    printfn "After waiting: %O" r.Value
    do! Async.Sleep(1000)
    return ()
}

let runAwaitEventTest () =
    let evt = new Event<Handler<MyEvent>, _>()
    Async.Start <| testAwaitEvent evt.Publish
    System.Threading.Thread.Sleep(3000)
    printfn "Before raising"
    evt.Trigger(null, new MyEvent("value"))
    printfn "After raising"

> runAwaitEventTest();;
Before waiting
Before raising
After raising
After waiting: value
```

The testAwaitEvent function listens to the event using Async.AwaitEvent and prints the value. As Async.Start takes some time to start up the thread, we simply call Thread.Sleep to wait on the main thread. This is for example purposes only; we can think of scenarios where a button-click event is awaited and used inside an async block.

Async.AwaitIAsyncResult

This creates an asynchronous computation that waits for the given IAsyncResult to complete. IAsyncResult is the interface of the .NET Asynchronous Programming Model that allows us to write asynchronous programs. The computation returns true if IAsyncResult issues a signal within the given timeout. The timeout parameter is optional; its default value is -1, which corresponds to Timeout.Infinite.

```
let testAwaitIAsyncResult (url: string) = async {
    let req = HttpWebRequest.Create(url)
    let aResp = req.BeginGetResponse(null, null)
    let! asyncResp = Async.AwaitIAsyncResult(aResp, 1000)
    if asyncResp then
        let resp = req.EndGetResponse(aResp)
        use respStream = resp.GetResponseStream()
        use sr = new StreamReader(respStream)
        return sr.ReadToEnd()
    else
        return ""
}

> Async.RunSynchronously (testAwaitIAsyncResult "https://www.google.com")
```

We modified the downloadPage example with AwaitIAsyncResult, which allows a bit more flexibility when we want to add timeouts. In the preceding example, the AwaitIAsyncResult handle waits for 1000 milliseconds and then executes the next steps.

Async.AwaitWaitHandle

This creates a computation that waits on a WaitHandle; wait handles are a mechanism to control the execution of threads. The following is an example with ManualResetEvent:

```
let testAwaitWaitHandle waitHandle = async {
    printfn "Before waiting"
    let! r = Async.AwaitWaitHandle waitHandle
    printfn "After waiting"
}

let runTestAwaitWaitHandle () =
    let event = new System.Threading.ManualResetEvent(false)
    Async.Start <| testAwaitWaitHandle event
    System.Threading.Thread.Sleep(3000)
    printfn "Before raising"
    event.Set() |> ignore
    printfn "After raising"
```

The preceding example uses ManualResetEvent to show how to use AwaitWaitHandle, which is very similar to the event example we saw in the previous topic.

Async.AwaitTask

This returns an asynchronous computation that waits for the given task to complete and returns its result. It helps in consuming C# APIs that expose task-based asynchronous operations:
```
let downloadPageAsTask (url: string) =
    async {
        let req = HttpWebRequest.Create(url)
        use! resp = req.AsyncGetResponse()
        use respStream = resp.GetResponseStream()
        use sr = new StreamReader(respStream)
        return sr.ReadToEnd()
    } |> Async.StartAsTask

let testAwaitTask (t: Task<string>) = async {
    let! r = Async.AwaitTask t
    return r
}

> downloadPageAsTask "https://www.google.com"
  |> testAwaitTask
  |> Async.RunSynchronously;;
```

The preceding function also downloads the web page as HTML content, but it starts the operation as a .NET task object.

Async.FromBeginEnd

The FromBeginEnd method acts as an adapter for the asynchronous workflow interface by wrapping the provided Begin/End methods. It thus allows a large number of existing components that support an asynchronous mode of work to be used; the IAsyncResult interface exposes the functions in the Begin/End pattern for asynchronous programming. We will look at the same download page example using FromBeginEnd:

```
let downloadPageBeginEnd (url: string) = async {
    let req = HttpWebRequest.Create(url)
    use! resp = Async.FromBeginEnd(req.BeginGetResponse, req.EndGetResponse)
    use respStream = resp.GetResponseStream()
    use sr = new StreamReader(respStream)
    return sr.ReadToEnd()
}
```

The function accepts two parameters and automatically identifies the return type; we use BeginGetResponse and EndGetResponse as the functions to call. Internally, Async.FromBeginEnd delegates the asynchronous operation and gets back the handle once EndGetResponse is called.

Async.FromContinuations

This creates an asynchronous computation that captures the current success, exception, and cancellation continuations. To understand these three operations, let's create a sleep function similar to Async.Sleep using a timer:

```
let sleep t =
    Async.FromContinuations(fun (cont, erFun, _) ->
        let rec timer = new Timer(TimerCallback(callback))
        and callback state =
            timer.Dispose()
            cont(())
        timer.Change(t, Timeout.Infinite) |> ignore
    )

let testSleep = async {
    printfn "Before"
    do! sleep 5000
    printfn "After 5000 msecs"
}

Async.RunSynchronously testSleep
```

The sleep function takes an integer and returns a unit; it uses Async.FromContinuations to allow the flow of the program to continue when the timer event is raised. It does so by calling the cont(()) function, which is the continuation that allows the next step in the asynchronous flow to execute. If there is any error, we can call erFun to throw the exception, and it will be handled at the place where this function is called.

Using the FromContinuations function helps us wrap and expose functionality as async, which can be used inside asynchronous workflows. It also helps control the execution of the program, with cancellation and error propagation available through simple APIs.

Async.Start

This starts an asynchronous computation in the thread pool. It accepts an Async<unit> function to start the computation. The downloadPage function can be started as follows:

```
let asyncDownloadPage(url) = async {
    let! result = downloadPage(url)
    printfn "%s" result
}

asyncDownloadPage "http://www.google.com" |> Async.Start
```

We wrap the function in another async function that returns Async<unit> so that it can be called by Async.Start.

Async.StartChild

This starts a child computation within an asynchronous workflow. It allows multiple asynchronous computations to be executed simultaneously, as follows:
```
let subTask v = async {
    printfn "Task %d started" v
    Thread.Sleep (v * 1000)
    printfn "Task %d finished" v
    return v
}

let mainTask = async {
    printfn "Main task started"
    let! childTask1 = Async.StartChild (subTask 1)
    let! childTask2 = Async.StartChild (subTask 5)
    printfn "Subtasks started"
    let! child1Result = childTask1
    printfn "Subtask1 result: %d" child1Result
    let! child2Result = childTask2
    printfn "Subtask2 result: %d" child2Result
    printfn "Subtasks completed"
    return ()
}

Async.RunSynchronously mainTask
```

Async.StartAsTask

This executes a computation in the thread pool and returns a task that will be completed in the corresponding state once the computation terminates. We can use the same example of starting the downloadPage function as a task:

```
let downloadPageAsTask (url: string) =
    async {
        let req = HttpWebRequest.Create(url)
        use! resp = req.AsyncGetResponse()
        use respStream = resp.GetResponseStream()
        use sr = new StreamReader(respStream)
        return sr.ReadToEnd()
    } |> Async.StartAsTask

let task = downloadPageAsTask("http://www.google.com")
printfn "Do some work"
task.Wait()
printfn "done"
```

Async.StartChildAsTask

This creates an asynchronous computation from within an asynchronous computation, starting the given computation as a task:

```
let testAwaitTask = async {
    printfn "Starting"
    let! child = Async.StartChildAsTask <| async {
        printfn "Child started"
        Thread.Sleep(5000)
        printfn "Child finished"
        return 100
    }
    printfn "Waiting for the child task"
    let! result = Async.AwaitTask child
    printfn "Child result %d" result
}
```

Async.StartImmediate

This runs an asynchronous computation, starting immediately on the current operating system thread. It is very similar to the Async.Start function we saw earlier:

```
let asyncDownloadPage(url) = async {
    let! result = downloadPage(url)
    printfn "%s" result
}

asyncDownloadPage "http://www.google.com" |> Async.StartImmediate
```

Async.SwitchToNewThread

This creates an asynchronous computation that creates a new thread and runs its continuation in it:

```
let asyncDownloadPage(url) = async {
    do! Async.SwitchToNewThread()
    let! result = downloadPage(url)
    printfn "%s" result
}

asyncDownloadPage "http://www.google.com" |> Async.Start
```

Async.SwitchToThreadPool

This creates an asynchronous computation that queues a work item that runs its continuation, as follows:

```
let asyncDownloadPage(url) = async {
    do! Async.SwitchToNewThread()
    let! result = downloadPage(url)
    do! Async.SwitchToThreadPool()
    printfn "%s" result
}

asyncDownloadPage "http://www.google.com" |> Async.Start
```

Async.SwitchToContext

This creates an asynchronous computation that runs its continuation in the Post method of the synchronization context. Let's assume that we want to set the text from the downloadPage function on a UI textbox; we would do it as follows:

```
let syncContext = System.Threading.SynchronizationContext()

let asyncDownloadPage(url) = async {
    do! Async.SwitchToContext(syncContext)
    let! result = downloadPage(url)
    textbox.Text <- result
}

asyncDownloadPage "http://www.google.com" |> Async.Start
```

Note that in console applications, the context will be null.

Async.Parallel

The Parallel function allows you to execute individual asynchronous computations queued in the thread pool, using the fork/join pattern:

```
let parallel_download() =
    let sites = ["http://www.bing.com";
                 "http://www.google.com";
                 "http://www.yahoo.com";
                 "http://www.search.com"]
    let htmlOfSites =
        Async.Parallel [for site in sites -> downloadPage site]
        |> Async.RunSynchronously
    printfn "%A" htmlOfSites
```

We use the same example of downloading HTML content, this time in a parallel way.
The preceding example shows the essence of parallel I/O computation:

- The async expression, { ... }, in the downloadPage function defines the asynchronous computation
- These computations are composed in parallel using the fork/join combinator
- In this sample, the composition waits synchronously for the overall result

Async.OnCancel

This registers an interruption handler for an asynchronous computation, invoked when a cancellation occurs. It returns an asynchronous computation holding the handler, which stays active until it is disposed:

```
// This is a simulated cancellable computation. It checks the token source
// to see whether the cancel signal was received.
let computation (tokenSource: System.Threading.CancellationTokenSource) =
    async {
        use! cancelHandler = Async.OnCancel(fun () ->
            printfn "Canceling operation.")
        // Async.Sleep checks for cancellation at the end of the sleep
        // interval, so loop over many short sleep intervals instead of
        // sleeping for a long time.
        while true do
            do! Async.Sleep(100)
    }

let tokenSource1 = new System.Threading.CancellationTokenSource()
let tokenSource2 = new System.Threading.CancellationTokenSource()

Async.Start(computation tokenSource1, tokenSource1.Token)
Async.Start(computation tokenSource2, tokenSource2.Token)
printfn "Started computations."
System.Threading.Thread.Sleep(1000)
printfn "Sending cancellation signal."
tokenSource1.Cancel()
tokenSource2.Cancel()
```

The preceding example uses the Async.OnCancel method to catch and react to the interruption when the CancellationTokenSource is cancelled.

Summary

In this article, we went through detailed explanations of the different semantics of asynchronous programming in F#, used with asynchronous workflows, and we saw a number of functions from the Async module.

Reactive Python - Asynchronous programming to the rescue, Part 2

Xavier Bruhiere
10 Oct 2016
5 min read
This two-part series explores asynchronous programming with Python using asyncio. In Part 1 of this series, we started by building a project that shows how you can use Reactive Python in asynchronous programming. Let's pick it back up here by exploring peer-to-peer communication, then touching on service discovery, before examining the streaming machine-to-machine concept.

Peer-to-peer communication

So far, we've established a websocket connection to process clock events asynchronously. Now that one pin swings between 1s and 0s, let's wire up a buzzer and pretend it buzzes on high states (1) and remains silent on low ones (0). We can rephrase that in Python, like so:

```
# filename: sketches.py

import factory


class Buzzer(factory.FactoryLoop):
    """Buzz on light changes."""

    def setup(self, sound):
        # customize buzz sound
        self.sound = sound

    @factory.reactive
    async def loop(self, channel, signal):
        """Buzzing."""
        behavior = self.sound if signal == '1' else '...'
        self.out('signal {} received -> {}'.format(signal, behavior))
        return behavior
```

So how do we make the components communicate? Since they share a common parent class, we implement a stream method to send arbitrary data and acknowledge reception with, also, arbitrary data. To sum up, we want IOPin to use this API:

```
class IOPin(factory.FactoryLoop):
    # [ ... ]

    @protocol.reactive
    async def loop(self, channel, msg):
        # [ ... ]
        await self.stream('buzzer', bits_stream)
        return 'acknowledged'
```

Service discovery

The first challenge to solve is service discovery: we need to target specific nodes within a fleet of reactive workers. This topic, however, goes beyond the scope of this post series. The shortcut below will do the job (that is, hardcode the nodes we will start) while keeping us focused on reactive messaging.

```
# -*- coding: utf-8 -*-
# vim_fenc=utf-8
#
# filename: mesh.py

"""Provide nodes network knowledge."""

import websockets


class Node(object):

    def __init__(self, name, socket, port):
        print('[ mesh ] registering new node: {}'.format(name))
        self.name = name
        self._socket = socket
        self._port = port

    def uri(self, path):
        return 'ws://{socket}:{port}/{path}'.format(socket=self._socket,
                                                    port=self._port,
                                                    path=path)

    def connection(self, path=''):
        # instantiate the same connection as the `clock` method
        return websockets.connect(self.uri(path))


# TODO: service discovery
def grid():
    """Discover and build nodes network."""
    # of course, a proper service discovery mechanism should be used here
    # (see Consul or ZooKeeper, for example)
    # note: clock is not a server, so it doesn't need a port
    return [
        Node('clock', 'localhost', None),
        Node('blink', 'localhost', 8765),
        Node('buzzer', 'localhost', 8765 + 1)
    ]
```

Streaming machine-to-machine chat

Let's provide FactoryLoop with knowledge of the grid and implement an asynchronous communication channel:

```
# filename: factory.py (continued)

import mesh


class FactoryLoop(object):

    def __init__(self, *args, **kwargs):
        # now every instance will know about the other ones
        self.grid = mesh.grid()
        # ...
```
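To make the grid concrete, here is a short interactive session of my own (illustrative only; the output is what the mesh.py code above would produce) showing how a node can be looked up and turned into a websocket URI:

```
>>> import mesh
>>> grid = mesh.grid()
[ mesh ] registering new node: clock
[ mesh ] registering new node: blink
[ mesh ] registering new node: buzzer
>>> buzzer = next(n for n in grid if n.name == 'buzzer')
>>> buzzer.uri('datafeed')
'ws://localhost:8766/datafeed'
```

This lookup-by-name is exactly what the node helper below encapsulates.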
Continuing factory.py, the node method performs that search, and stream drives the websocket exchange:

```
    def node(self, name):
        """Search for the given node in the grid."""
        return next(filter(lambda x: x.name == name, self.grid))

    async def stream(self, target, data, channel=''):
        self.out('starting to stream message to {}'.format(target))
        # use the node websocket connection defined in mesh.py
        # (the method is exactly the same as the clock's)
        async with self.node(target).connection(channel) as ws:
            for partial in data:
                self.out('> sending payload: {}'.format(partial))
                # websockets requires bytes or strings
                await ws.send(str(partial))
                self.out('< {}'.format(await ws.recv()))
```

We added a few debugging lines to better understand how the data flows through the network. Every implementation of the FactoryLoop can both react to events and communicate with the other nodes it is aware of.

Wrapping up

Time to update arduino.py and run our cluster of three reactive workers:

```
@click.command()
# [ ... ]
def main(sketch, **flags):
    # [ ... ]
    elif sketch == 'buzzer':
        sketchs.Buzzer(sound='buzz buzz buzz').run(flags['socket'],
                                                   flags['port'])
```

Launch three terminals, or use a tool such as foreman to spawn multiple processes. Either way, keep in mind that you will need to track the scripts' output.

```
$ # start IOPin and Buzzer on the same ports we hardcoded in mesh.py
$ ./arduino.py buzzer --port 8766
$ ./arduino.py iopin --port 8765
$ # now that they listen, trigger actions with the clock (targeting IOPin port)
$ ./arduino.py clock --port 8765
[ ... ]
$ # Profit!
```

We just saw one worker reacting to a clock and another reacting to randomly generated events. The websocket protocol allowed us to exchange streaming data and receive arbitrary responses, unlocking sophisticated fleet orchestration. While we limited this example to two nodes, a powerful service discovery mechanism could bring to life a distributed network of microservices.

By completing this post series, you should now have a better understanding of how to use Python with asyncio for asynchronous programming.

About the author

Xavier Bruhiere is a lead developer at AppTurbo in Paris, where he develops innovative prototypes to support company growth. He is addicted to learning, hacking on intriguing hot techs (both soft and hard), and practicing high-intensity sports.

Basics of Classes and Objects

Packt
06 Oct 2016
11 min read
In this article by Steven Lott, the author of the book Modern Python Cookbook, we will see how to use a class to encapsulate data plus processing.

Introduction

The point of computing is to process data. Even when building something like an interactive game, the game state and the player's actions are the data; the processing computes the next game state and the display update. Data plus processing is ubiquitous.

Some games can have a relatively complex internal state. When we think of console games with multiple players and complex graphics, there are complex, real-time state changes. On the other hand, when we think of a very simple casino game like Craps, the game state is very simple. There may be no point established, or one of the numbers 4, 5, 6, 8, 9, or 10 may be the established point. The transitions are relatively simple, and are often denoted by moving markers and chips around on the casino table. The data includes the current state, player actions, and rolls of the dice. The processing is the rules of the game.

A game like Blackjack has a somewhat more complex internal state change as each card is accepted. In games where hands can be split, the state of play can become quite complex. The data includes the current game state, the player's commands, and the cards drawn from the deck. Processing is defined by the rules of the game as modified by any house rules.

In the case of Craps, the player may place bets. Interestingly, the player's input has no effect on the game state: the internal state of the game object is determined entirely by the next throw of the dice. This leads to a class design that's relatively easy to visualize.

Using a class to encapsulate data plus processing

The essential idea of computing is to process data. This is exemplified when we write functions that process data. Often, we'd like to have a number of closely related functions that work with a common data structure. This concept is the heart of object-oriented programming. A class definition will contain a number of methods that control the internal state of an object. The unifying concept behind a class definition is often captured as a summary of the responsibilities allocated to the class. How can we do this effectively? What's a good way to design a class?

Getting Ready

Let's look at a simple, stateful object: a pair of dice. The context for this would be an application that simulates the casino game of Craps. The goal is to use simulated results to help invent a better playing strategy. This will save us from losing real money while we try to beat the house edge.

There's an important distinction between the class definition and an instance of the class, called an object. We call this idea, as a whole, object-oriented programming. Our focus is on writing class definitions; our overall application will create instances of the classes. The behavior that emerges from the collaboration of the instances is the overall goal of the design process. Most of the design effort goes into class definitions. Because of this, the name object-oriented programming can be misleading.

The idea of emergent behavior is an essential ingredient in object-oriented programming. We don't specify every behavior of a program. Instead, we decompose the program into objects and define each object's state and behavior via the object's class. The programming decomposes into class definitions based on their responsibilities and collaborations.
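As a minimal preview of this data-plus-processing pairing (a generic sketch of mine, not part of the recipe; the Tally name is hypothetical), attributes hold the state and methods are the only code that changes it:

```
class Tally:
    """Encapsulate a running total (data) plus the operations on it."""

    def __init__(self):
        self.total = 0        # data: internal state lives in attributes

    def add(self, value):
        self.total += value   # processing: methods control state changes
```

The recipe below applies the same pairing to a pair of dice.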
An object should be viewed as a thing: a noun. The behavior of the class should be viewed as verbs. This gives us a hint as to how we can design classes that work effectively. Object-oriented design is often easiest to understand when it relates to tangible, real-world things. It's often easier to write software that simulates a playing card than to create software that implements an Abstract Data Type (ADT).

For this example, we'll simulate the rolling of dice. For some games, like the casino game of Craps, two dice are used. We'll define a class that models the pair of dice. To be sure that the example is tangible, we'll model the pair of dice in the context of simulating a casino game.

How to do it...

Write down simple sentences that describe what an instance of the class does. We can call these the problem statements. It's essential to focus on short sentences, and to emphasize the nouns and verbs:

- The game of Craps has two standard dice.
- Each die has six faces, with point values from 1 to 6.
- Dice are rolled by a player.
- The total of the dice changes the state of the Craps game. However, those rules are separate from the dice.
- If the two dice match, the number was rolled the hard way. If the two dice do not match, the number was rolled the easy way. Some bets depend on this hard versus easy distinction.

Identify all of the nouns in the sentences. Nouns may identify different classes of objects; these are collaborators. Examples include player and game. Nouns may also identify attributes of the objects in question. Examples include face and point value.

Identify all the verbs in the sentences. Verbs generally become methods of the class in question. Examples include rolled and match. Sometimes, they are methods of other classes; change the state, for example, applies to the Craps game.

Identify any adjectives. Adjectives are words or phrases that clarify a noun. In many cases, some adjectives will clearly be properties of an object. In other cases, the adjectives will describe relationships among objects. In our example, a phrase like "the total of the dice" is a prepositional phrase taking the role of an adjective: "the total of" modifies the noun "the dice". The total is a property of the pair of dice.

Start writing the class with the class statement:

```
class Dice:
```

Initialize the object's attributes in the __init__ method:

```
    def __init__(self):
        self.faces = None
```

We model the internal state of the dice with the self.faces attribute. The self variable is required to be sure that we're referencing an attribute of a given instance of the class; the object is identified by the value of the instance variable, self. We could put some other properties here as well. The alternative is to implement the properties as separate methods; the details of that design decision are the subject of the recipe on using properties for lazy attributes.

Define the object's methods based on the various verbs. In our case, we have several methods that must be defined. Here's how we can implement "dice are rolled by a player":

```
    def roll(self):
        self.faces = (random.randint(1,6), random.randint(1,6))
```

We update the internal state of the dice by setting the self.faces attribute. Again, the self variable is essential for identifying the object to be updated. Note that this method mutates the internal state of the object, and we've elected not to return a value. This makes our approach somewhat like the approach of Python's built-in collection classes: any method that mutates the object does not return a value, as the short aside below illustrates.
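This convention is easy to see in the standard library (standard Python behavior, not part of the recipe): the mutating list.sort method returns None, while the non-mutating sorted built-in returns a new list.

```
>>> values = [3, 1, 2]
>>> result = values.sort()    # mutator: changes the list in place
>>> result is None
True
>>> values
[1, 2, 3]
>>> sorted([3, 1, 2])         # non-mutator: builds and returns a new list
[1, 2, 3]
```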
The next method helps implement "the total of the dice changes the state of the Craps game". The game is a separate object, but this method provides a total that fits the sentence:

```
    def total(self):
        return sum(self.faces)
```

These two methods help answer the hard way and easy way questions:

```
    def hardway(self):
        return self.faces[0] == self.faces[1]

    def easyway(self):
        return self.faces[0] != self.faces[1]
```

It's rare in a casino game to have a rule with a simple logical inverse; it's more common to have a rare third alternative with a remarkably bad payoff rule. In this case, we could have defined easyway as return not self.hardway().

Here's an example of using the class. First, we'll seed the random number generator with a fixed value so that we can get a fixed sequence of results. This is a way to create a unit test for this class:

```
>>> import random
>>> random.seed(1)
```

We'll create a Dice object, d1. We can then set its state with the roll() method. We'll then look at the total() method to see what was rolled, and examine the state by looking at the faces attribute:

```
>>> from ch06_r01 import Dice
>>> d1 = Dice()
>>> d1.roll()
>>> d1.total()
7
>>> d1.faces
(2, 5)
```

We'll create a second Dice object, d2. We can then set its state with the roll() method, and look at the result of the total() method as well as the hardway() method. We'll examine the state by looking at the faces attribute:

```
>>> d2 = Dice()
>>> d2.roll()
>>> d2.total()
4
>>> d2.hardway()
False
>>> d2.faces
(1, 3)
```

Since the two objects are independent instances of the Dice class, a change to d2 has no effect on d1:

```
>>> d1.total()
7
```

How it works...

The core idea here is to use the ordinary rules of grammar (nouns, verbs, and adjectives) as a way to identify the basic features of a class.

Nouns represent things. A good descriptive sentence should focus on tangible, real-world things more than on ideas or abstractions. In our example, dice are real things; we try to avoid abstract terms like randomizers or event generators. It's easier to describe the tangible features of real things, and then locate an abstract implementation that offers some of those tangible features.

The idea of rolling the dice is an example of a physical action that we can model with a method definition. Clearly, this action changes the state of the object. In rare cases (one time in 36), the next state will happen to match the previous state.

Adjectives often hold the potential for confusion. There are several cases:

- Some adjectives like first, last, least, most, next, previous, and so on have a simple interpretation. They can have a lazy implementation as a method or an eager implementation as an attribute value.
- Some adjectives are more complex phrases, like "the total of the dice". This is an adjective phrase built from a noun (total) and a preposition (of). This, too, can be seen as a method or an attribute.
- Some adjectives involve nouns that appear elsewhere in our software. We might have a phrase like "the state of the Craps game", where "state of" modifies another object, the Craps game. This is clearly only tangentially related to the dice themselves; it may reflect a relationship between dice and game. We might add a sentence to the problem statement like "The dice are part of the game" to help clarify the presence of a relationship between game and dice. Prepositional phrases like "are part of" can always be reversed to create a statement from the other object's point of view: "The game contains dice".
This can help clarify the relationships among objects.

In Python, the attributes of an object are, by default, dynamic. We don't specify a fixed list of attributes. We can initialize some (or all) of the attributes in the __init__() method of a class definition. Since attributes aren't static, we have considerable flexibility in our design.

There's more...

Capturing the essential internal state, and the methods that cause state changes, is the first step in good class design. We can summarize some helpful design principles using the acronym SOLID:

- Single Responsibility Principle: A class should have one clearly defined responsibility.
- Open/Closed Principle: A class should be open to extension (generally via inheritance) but closed to modification. We should design our classes so that we don't need to tweak the code to add or change features.
- Liskov Substitution Principle: We need to design inheritance so that a subclass can be used in place of the superclass.
- Interface Segregation Principle: When writing a problem statement, we want to be sure that collaborating classes have as few dependencies as possible. In many cases, this principle will lead us to decompose large problems into many small class definitions.
- Dependency Inversion Principle: It's less than ideal for a class to depend directly on other classes. It's better if a class depends on an abstraction, and a concrete implementation class is substituted for the abstract class.

The goal is to create classes that have the proper behavior and also adhere to the design principles; the sketch after this list shows one of them in action.
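As a small illustration of the Open/Closed Principle (my own sketch, not from the book; the CrookedDice name and its bias are hypothetical), we can extend the Dice class from the recipe by inheritance instead of editing it:

```
import random


class CrookedDice(Dice):
    """Extend Dice without modifying it: weight the rolls toward 6-6.

    Assumes the Dice class defined in the recipe above is in scope.
    """

    def roll(self):
        # hypothetical bias: roughly one roll in three lands hard-way 6s
        if random.random() < 1 / 3:
            self.faces = (6, 6)
        else:
            super().roll()
```

Code written against Dice keeps working when handed a CrookedDice, which also demonstrates the Liskov Substitution Principle: total(), hardway(), and easyway() behave exactly as before, while Dice itself never changes.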

Reactive Python – Asynchronous programming to the rescue, Part 1

Xavier Bruhiere
05 Oct 2016
7 min read
On the Confluent website, you can find this title: "Stream data changes everything". From the creators of Kafka, a real-time messaging system, this is not a surprising assertion. Yet data-streaming infrastructures have gained in popularity, and many projects require the data to be processed as soon as it shows up. This has contributed to the development of famous technologies like Spark Streaming, Apache Storm, and, more broadly, websockets. Websockets in particular brought real-time data feeds to web applications, trying to solve low-latency connections. Coupled with the asynchronous Node.js, you can build a powerful event-based reactive system.

But what about Python? Given the popularity of the language in data science, would it be possible to bring the benefits of this kind of data ingestion to it? As this two-part post series will show, it turns out that modern Python (Python 3.4 or later) supports asynchronous data-streaming apps.

Introducing asyncio

Python 3.4 introduced the asyncio module into the standard library, to provision the language with:

Asynchronous I/O, event loop, coroutines and tasks

While Python treats functions as first-class objects (meaning you can assign them to variables and pass them as arguments), most developers follow an imperative programming style. It seems on purpose:

It requires super human discipline to write readable code in callbacks and if you don't believe me look at any piece of JavaScript code. - Guido van Rossum

So asyncio is the pythonic answer to asynchronous programming. This paradigm makes a lot of sense for otherwise costly I/O operations, or when we need events to trigger code.

Scenario

For fun and profit, let's build such a project. We will simulate a dummy electrical circuit composed of three components:

- A clock regularly ticking
- A board I/O pin randomly choosing to toggle its binary state on clock events
- A buzzer buzzing when the I/O pin flips to one

This sets us up with an interesting machine-to-machine communication problem to solve.

Note that the code snippets in this post make use of features like async and await introduced in Python 3.5. While it would be possible to backport them to Python 3.4, I highly recommend that you follow along with the same version or newer. Anaconda or Pyenv can ease the installation process if necessary.

```
$ python --version
Python 3.5.1
$ pip --version
pip 8.1.2
```

Asynchronous websocket client/server

Our first step, the clock, will introduce both asyncio and websocket basics. We need a straightforward method that fires tick signals through a websocket and waits for acknowledgement:

```
# filename: sketch.py

async def clock(socket, port, tacks=3, delay=1):
```

The async keyword is syntactic sugar introduced in Python 3.5 to replace the previous @asyncio.coroutine decorator. The official PEP 492 explains it all, but the tl;dr is: API quality. To simplify websocket connection plumbing, we can take advantage of the eponymous package: pip install websockets==3.5.1. It hides the protocol's complexity behind an elegant context manager.
```
# filename: sketch.py

    # the path "datafeed" in this uri will be a parameter available on the
    # other side, but we won't use it for this example
    uri = 'ws://{socket}:{port}/datafeed'.format(socket=socket, port=port)

    # manage the connection asynchronously
    async with websockets.connect(uri) as ws:
        for payload in range(tacks):
            print('[ clock ] > {}'.format(payload))
            # send payload and wait for acknowledgement
            await ws.send(str(payload))
            print('[ clock ] < {}'.format(await ws.recv()))
            time.sleep(delay)
```

The keyword await was introduced with async and replaces the old yield from to read values from asynchronous functions. Inside the context manager, the connection stays open and we can stream data to the server we contacted.

The server: IOPin

At the core of our application are entities capable of speaking to each other directly. To make things fun, we will expose the same API as Arduino sketches: a setup method that runs once at startup, and a loop method called when new data is available.

```
# -*- coding: utf-8 -*-
# vim_fenc=utf-8
#
# filename: factory.py

import abc
import asyncio

import websockets


class FactoryLoop(object):
    """Glue components to manage the evented-loop model."""

    __metaclass__ = abc.ABCMeta

    def __init__(self, *args, **kwargs):
        # call user-defined initialization
        self.setup(*args, **kwargs)

    def out(self, text):
        print('[ {} ] {}'.format(type(self).__name__, text))

    @abc.abstractmethod
    def setup(self, *args, **kwargs):
        pass

    @abc.abstractmethod
    async def loop(self, channel, data):
        pass

    def run(self, host, port):
        try:
            server = websockets.serve(self.loop, host, port)
            self.out('serving on {}:{}'.format(host, port))
            asyncio.get_event_loop().run_until_complete(server)
            asyncio.get_event_loop().run_forever()
        except OSError:
            self.out('Cannot bind to this port! Is the server already running?')
        except KeyboardInterrupt:
            self.out('Keyboard interruption, aborting.')
            asyncio.get_event_loop().stop()
        finally:
            asyncio.get_event_loop().close()
```

The child objects will be required to implement setup and loop, while this class takes care of:

- Initializing the sketch
- Registering a websocket server based on an asynchronous callback (loop)
- Telling the event loop to poll for... events

The websockets documentation states that the server callback is expected to have the signature on_connection(websocket, path). This is too low-level for our purpose. Instead, we can write a decorator to manage the asyncio details, message passing, and error handling. We will only call self.loop with application-level-relevant information: the actual message and the websocket path.

```
# filename: factory.py

import functools

import websockets


def reactive(fn):

    @functools.wraps(fn)
    async def on_connection(klass, websocket, path):
        """Dispatch events and wrap execution."""
        klass.out('** new client connected, path={}'.format(path))
        # process messages as long as the connection is open or
        # an error is raised
        while True:
            try:
                message = await websocket.recv()
                acknowledgement = await fn(klass, path, message)
                await websocket.send(acknowledgement or 'n/a')
            except websockets.exceptions.ConnectionClosed as e:
                klass.out('done processing messages: {}\n'.format(e))
                break

    return on_connection
```

Now we can develop a readable IOPin object.
```
# filename: sketch.py

import random

import factory


class IOPin(factory.FactoryLoop):
    """Set an IO pin to 0 or 1 randomly."""

    def setup(self, chance=0.5, sequence=3):
        self.chance = chance
        self.sequence = sequence

    def state(self):
        """Toggle state, sometimes."""
        return 0 if random.random() < self.chance else 1

    @factory.reactive
    async def loop(self, channel, msg):
        """Callback on new data."""
        self.out('new tick triggered on {}: {}'.format(channel, msg))
        bits_stream = [self.state() for _ in range(self.sequence)]
        self.out('toggling pin state: {}'.format(bits_stream))
        # ...
        # ... toggle pin state here
        # ...
        return 'acknowledged'
```

We finally need some glue to run both the clock and the IOPin, and to test whether the latter toggles its state when the former fires new ticks. The following snippet uses a convenient library, click 6.6, to parse command-line arguments.

```
#! /usr/bin/env python
# -*- coding: utf-8 -*-
# vim_fenc=utf-8
#
# filename: arduino.py

import sys
import asyncio

import click

import sketchs


@click.command()
@click.argument('sketch')
@click.option('-s', '--socket', default='localhost', help='Websocket to bind to')
@click.option('-p', '--port', default=8765, help='Websocket port to bind to')
@click.option('-t', '--tacks', default=5, help='Number of clock ticks')
@click.option('-d', '--delay', default=1, help='Clock intervals')
def main(sketch, **flags):
    if sketch == 'clock':
        # delegate the asynchronous execution to the event loop
        asyncio.get_event_loop().run_until_complete(sketchs.clock(**flags))
    elif sketch == 'iopin':
        # arguments in the constructor go as is to our `setup` method
        sketchs.IOPin(chance=0.6).run(flags['socket'], flags['port'])
    else:
        print('unknown sketch, please choose clock, iopin or buzzer')
        return 1
    return 0


if __name__ == '__main__':
    sys.exit(main())
```

Don't forget to chmod +x the script, then start the server in a first terminal: ./arduino.py iopin. When it is listening for connections, start the clock with ./arduino.py clock, and watch them communicate! Note that we used common default host and port here so that they can find each other.

We have a good start with our app, and now in Part 2 we will further explore peer-to-peer communication, service discovery, and the streaming machine-to-machine concept.

About the author

Xavier Bruhiere is a lead developer at AppTurbo in Paris, where he develops innovative prototypes to support company growth. He is addicted to learning, hacking on intriguing hot techs (both soft and hard), and practicing high-intensity sports.