
How-To Tutorials


How to classify emails using deep neural networks after generating TF-IDF

Savia Lobo
21 Feb 2018
9 min read
Note: This article is an excerpt from the book Natural Language Processing with Python Cookbook, written by Krishna Bhavsar, Naresh Kumar, and Pratap Dangeti. The book teaches you how to use NLTK efficiently and implement text classification, identify parts of speech, tag words, and more. You will also learn how to analyze sentence structures and master lexical analysis, syntactic and semantic analysis, pragmatic analysis, and the application of deep learning techniques.

In this article, you will learn how to use deep neural networks to classify emails into one of 20 predefined categories based on the words present in each email. This is a simple model to start with for understanding deep learning and its applications in NLP.

Getting ready

The 20 newsgroups dataset from scikit-learn has been used to illustrate the concept. The number of observations/emails considered for the analysis is 18,846 (11,314 train observations and 7,532 test observations), spread across 20 classes/categories, as shown in the following code:

>>> from sklearn.datasets import fetch_20newsgroups
>>> newsgroups_train = fetch_20newsgroups(subset='train')
>>> newsgroups_test = fetch_20newsgroups(subset='test')
>>> x_train = newsgroups_train.data
>>> x_test = newsgroups_test.data
>>> y_train = newsgroups_train.target
>>> y_test = newsgroups_test.target
>>> print ("List of all 20 categories:")
>>> print (newsgroups_train.target_names)
>>> print ("\n")
>>> print ("Sample Email:")
>>> print (x_train[0])
>>> print ("Sample Target Category:")
>>> print (y_train[0])
>>> print (newsgroups_train.target_names[y_train[0]])

The following screenshot shows a sample first data observation and its target class category. From the first observation, or email, we can infer that the email is talking about a two-door sports car, which we would manually classify into the autos category (the eighth category). Note that the target value is 7 because indexing starts from 0, which is consistent with the actual target class.

How to do it…

Using NLP techniques, we pre-process the data to obtain finalized word vectors that map to the final outcome, one of the 20 categories. The major steps involved are:

1. Pre-processing:
   - Removal of punctuation.
   - Word tokenization.
   - Converting words into lowercase.
   - Stop word removal.
   - Keeping words of length of at least 3.
   - Stemming words.
   - POS tagging.
   - Lemmatization of words.
2. TF-IDF vector conversion.
3. Deep learning model training and testing.
4. Model evaluation and results discussion.

How it works...

The NLTK package has been used for all the pre-processing steps, as it provides all the necessary NLP functionality under one roof:

# Used for pre-processing data
>>> import nltk
>>> from nltk.corpus import stopwords
>>> from nltk.stem import WordNetLemmatizer
>>> import string
>>> import pandas as pd
>>> from nltk import pos_tag
>>> from nltk.stem import PorterStemmer

The preprocessing function written below gathers all the steps in one place for convenience; each step is explained in the sections that follow:

>>> def preprocessing(text):

The following line of code checks each character of the text against the standard punctuation set, replaces punctuation characters with a space, and then normalizes the whitespace by splitting and rejoining the text:
...     text2 = " ".join("".join([" " if ch in string.punctuation else ch for ch in text]).split())

The following code tokenizes the sentences into words based on whitespace and puts them together as a list for applying further steps:

...     tokens = [word for sent in nltk.sent_tokenize(text2) for word in nltk.word_tokenize(sent)]

Converting all cases (upper, lower, and proper) into lowercase reduces duplicates in the corpus:

...     tokens = [word.lower() for word in tokens]

As mentioned earlier, stop words are words that do not carry much weight in understanding a sentence; they are mainly used as connecting words. We remove them with the following lines of code:

...     stopwds = stopwords.words('english')
...     tokens = [token for token in tokens if token not in stopwds]

The following code keeps only the words with a length of at least three characters, removing short words that rarely carry much meaning:

...     tokens = [word for word in tokens if len(word)>=3]

Stemming is applied to the words using the Porter stemmer, which strips the extra suffixes from words:

...     stemmer = PorterStemmer()
...     tokens = [stemmer.stem(word) for word in tokens]

POS tagging is a prerequisite for lemmatization: based on whether a word is a noun, a verb, and so on, lemmatization reduces it to its root word.

...     tagged_corpus = pos_tag(tokens)

The pos_tag function returns the part of speech in four formats for nouns and six formats for verbs: NN (noun, common, singular), NNP (noun, proper, singular), NNPS (noun, proper, plural), NNS (noun, common, plural), VB (verb, base form), VBD (verb, past tense), VBG (verb, present participle), VBN (verb, past participle), VBP (verb, present tense, not third person singular), and VBZ (verb, present tense, third person singular).

...     Noun_tags = ['NN','NNP','NNPS','NNS']
...     Verb_tags = ['VB','VBD','VBG','VBN','VBP','VBZ']
...     lemmatizer = WordNetLemmatizer()

The following function, prat_lemmatize, has been created only to resolve the mismatch between the tags returned by the pos_tag function and the tags expected by the lemmatize function. If the tag for a word falls under the noun or verb tag categories, 'n' or 'v' is passed to the lemmatize function accordingly:

...     def prat_lemmatize(token,tag):
...         if tag in Noun_tags:
...             return lemmatizer.lemmatize(token,'n')
...         elif tag in Verb_tags:
...             return lemmatizer.lemmatize(token,'v')
...         else:
...             return lemmatizer.lemmatize(token,'n')

After performing tokenization and applying all the various operations, we need to join the tokens back together to form a string; the following lines perform exactly that:

...     pre_proc_text = " ".join([prat_lemmatize(token,tag) for token,tag in tagged_corpus])
...     return pre_proc_text

Applying pre-processing to the train and test data:

>>> x_train_preprocessed = []
>>> for i in x_train:
...     x_train_preprocessed.append(preprocessing(i))
>>> x_test_preprocessed = []
>>> for i in x_test:
...     x_test_preprocessed.append(preprocessing(i))
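As a quick sanity check before vectorizing (this snippet is not part of the original recipe, just an illustration), you can run the function on a short made-up sentence and confirm that punctuation, stop words, and short tokens are gone and the remaining words are lowercased, stemmed, and lemmatized:

# Hypothetical example sentence, purely for illustration
sample = "The cars, which were parked outside, are being driven away quickly!!!"
print(preprocessing(sample))
# The result should be a short, lowercased string with punctuation,
# stop words, and words shorter than three characters removed;
# the exact tokens depend on your NLTK version and downloaded models.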
# building TFIDF vectorizer
>>> from sklearn.feature_extraction.text import TfidfVectorizer
>>> vectorizer = TfidfVectorizer(min_df=2, ngram_range=(1, 2), stop_words='english', max_features=10000, strip_accents='unicode', norm='l2')
>>> x_train_2 = vectorizer.fit_transform(x_train_preprocessed).todense()
>>> x_test_2 = vectorizer.transform(x_test_preprocessed).todense()

After the pre-processing step has been completed, the processed TF-IDF vectors have to be sent to the following deep learning code:

# Deep Learning modules
>>> import numpy as np
>>> from keras.models import Sequential
>>> from keras.layers.core import Dense, Dropout, Activation
>>> from keras.optimizers import Adadelta,Adam,RMSprop
>>> from keras.utils import np_utils

The following image shows the output after firing up the preceding Keras code. Keras has been installed on Theano, which eventually works on Python. A GPU with 6 GB memory has been used, with additional libraries (CuDNN and CNMeM) for four to five times faster execution, at the cost of around 20% of the memory; hence only 80% of the 6 GB memory is available.

The following code explains the central part of the deep learning model. The code is self-explanatory, with the number of classes considered as 20, a batch size of 64, and 20 epochs to train:

# Definition hyper parameters
>>> np.random.seed(1337)
>>> nb_classes = 20
>>> batch_size = 64
>>> nb_epochs = 20

The following code converts the 20 categories into one-hot encoded vectors, in which 20 columns are created and the values against the respective classes are given as 1. All other classes are given as 0:

>>> Y_train = np_utils.to_categorical(y_train, nb_classes)

In the following building blocks of Keras code, three hidden layers (1000, 500, and 50 neurons in each layer, respectively) are used, with a dropout of 50% for each layer and Adam as the optimizer:

# Deep Layer Model building in Keras
# del model
>>> model = Sequential()
>>> model.add(Dense(1000, input_shape=(10000,)))
>>> model.add(Activation('relu'))
>>> model.add(Dropout(0.5))
>>> model.add(Dense(500))
>>> model.add(Activation('relu'))
>>> model.add(Dropout(0.5))
>>> model.add(Dense(50))
>>> model.add(Activation('relu'))
>>> model.add(Dropout(0.5))
>>> model.add(Dense(nb_classes))
>>> model.add(Activation('softmax'))
>>> model.compile(loss='categorical_crossentropy', optimizer='adam')
>>> print (model.summary())

The architecture is shown as follows and describes the flow of the data from a start of 10,000 as input. Then there are 1000, 500, 50, and 20 neurons to classify the given email into one of the 20 categories:

The model is trained as per the given metrics:

# Model Training
>>> model.fit(x_train_2, Y_train, batch_size=batch_size, epochs=nb_epochs, verbose=1)

The model has been fitted with 20 epochs, in which each epoch took about 2 seconds. The loss has been minimized from 1.9281 to 0.0241. By using CPU hardware, the time required for training each epoch may increase, as a GPU massively parallelizes the computation with thousands of threads/cores.
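One environment-specific caveat, which is an assumption about newer tooling rather than something covered in this excerpt: the predict_classes method used in the prediction step below only exists in older Keras releases of the Sequential API. If your Keras version no longer provides it, an equivalent minimal sketch is to take the argmax over the predicted class probabilities:

# Minimal sketch for newer Keras versions where predict_classes is unavailable
import numpy as np

# model.predict returns one probability per class; argmax picks the predicted class index
y_train_predclass = np.argmax(model.predict(x_train_2, batch_size=batch_size), axis=1)
y_test_predclass = np.argmax(model.predict(x_test_2, batch_size=batch_size), axis=1)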
Finally, predictions are made on the train and test datasets to determine the accuracy, precision, and recall values:

# Model Prediction
>>> y_train_predclass = model.predict_classes(x_train_2, batch_size=batch_size)
>>> y_test_predclass = model.predict_classes(x_test_2, batch_size=batch_size)
>>> from sklearn.metrics import accuracy_score,classification_report
>>> print ("\n\nDeep Neural Network - Train accuracy:"), (round(accuracy_score(y_train, y_train_predclass),3))
>>> print ("\nDeep Neural Network - Test accuracy:"), (round(accuracy_score(y_test, y_test_predclass),3))
>>> print ("\nDeep Neural Network - Train Classification Report")
>>> print (classification_report(y_train, y_train_predclass))
>>> print ("\nDeep Neural Network - Test Classification Report")
>>> print (classification_report(y_test, y_test_predclass))

The classifier gives a good 99.9% accuracy on the train dataset and 80.7% on the test dataset.

We learned how to classify emails with deep neural networks (DNNs) after generating TF-IDF vectors. If you found this post useful, do check out the book Natural Language Processing with Python Cookbook to further analyze sentence structures and apply various deep learning techniques.


How to use Standard Macro in Workflows

Sunith Shetty
21 Feb 2018
6 min read
Note: This article is an excerpt from the book Learning Alteryx, written by Renato Baruti. In this book, you will learn how to perform self-service analytics and create interactive dashboards using various tools in Alteryx.

Today we will learn about the Standard Macro, which will provide you with a foundation for building enhanced workflows. The CSV file required for this tutorial is available to download here.

Standard Macro

Before getting into the Standard Macro, let's define what a macro is. A macro is a collection of workflow tools that are grouped together into one tool. Using a range of different Interface tools, a macro can be developed and used within a workflow. Any workflow can be turned into a macro, and a repeatable element of a workflow is commonly a good candidate to be converted into one.

There are a couple of ways you can turn your workflow into a Standard Macro. The first is to go to the canvas configuration pane and navigate to the Workflow tab, where you select what type of workflow you want. If you select Macro, Standard Macro should be selected automatically. Now, when you save this workflow it will be saved as a macro. You'll then be able to add it to another workflow and run the process created within the macro itself. The second method is simply to add a Macro Input tool from the Interface tool section onto the canvas; the workflow will then automatically change to a Standard Macro. The following screenshot shows the selection of a Standard Macro under the Workflow tab:

Let's go through an example of creating and deploying a Standard Macro.

Standard Macro Example #1: Create a macro that allows the user to input a number used as a multiplier. Use the multiplier for the DataValueAlt field. The following steps demonstrate this process:

Step 1: Select the Macro Input tool from the Interface tool palette and add the tool onto the canvas. The workflow will automatically change to a Standard Macro.
Step 2: Select the Text Input and Edit Data options within the Macro Input tool configuration.
Step 3: Create a field called Number and enter the values 155, 243, 128, 352, and 357 in each row, as shown in the following image:
Step 4: Rename the Input Name to Input and set the Anchor Abbreviation to I, as shown in the following image:
Step 5: Select the Formula tool from the Preparation tool palette. Connect the Formula tool to the Macro Input tool.
Step 6: Select the + Add Column option in the Select Column drop-down within the Formula tool configuration. Name the field Result.
Step 7: Add the following expression to the expression window: [Number]*0.50
Step 8: Select the Macro Output tool from the Interface tool palette and add the tool onto the canvas. Connect the Macro Output tool to the Formula tool.
Step 9: Rename the Output Name to Output and set the Anchor Abbreviation to O:

The Standard Macro has now been created. Once saved, it can be used as a multiplier that multiplies the five numbers added within the Macro Input tool by 0.50. This is great; however, let's take it a step further and make it dynamic and flexible by allowing the user to enter the multiplier. For instance, currently the multiplier is set to 0.50, but a user may want to change it to 0.25 or 0.10 to determine 25% or 10% of a field's value. Let's continue building out the Standard Macro to make this possible.
Step 1: Select the Text Box tool from the Interface tool palette and drag it onto the canvas. Connect the Text Box tool to the Formula tool on the lightning bolt (the macro indicator). The Action tool will automatically be added to the canvas, as it is what updates the configuration of a workflow with the values provided by interface questions when the workflow is run as an app or macro.
Step 2: Configure the Action tool so that it automatically updates the expression by replacing a specific value. Select Formula | FormulaFields | FormulaField | @expression - value="[Number]*0.50". Select the Replace a specific string: option and enter 0.50. This is where the automation happens, updating the 0.50 to any number the user enters. You will see how this happens in the following steps:
Step 3: In the Enter the text or question to be displayed text box, within the Text Box tool configuration, enter: Please enter a number:
Step 4: Save the workflow as Standard Macro.yxmc. The .yxmc file type indicates that it is a macro-related workflow, as shown in the following image:
Step 5: Open a new workflow.
Step 6: Select the Input Data tool from the In/Out tool palette and connect it to the U.S. Chronic Disease Indicators.csv file.
Step 7: Select the Select tool from the Preparation tool palette and drag it onto the canvas. Connect the Select tool to the Input Data tool.
Step 8: Change the Data Type for the DataValueAlt field to Double.
Step 9: Right-click on the canvas and select Insert | Macro | Standard Macro.
Step 10: Connect the Standard Macro to the Select tool.
Step 11: There will be Questions to answer within the Standard Macro tool configuration. Select DataValueAlt (Double) as the Choose Field option and enter 0.25 in the Please enter a number text box:
Step 12: Add a Browse tool to the Standard Macro tool.
Step 13: Run the workflow:

The goal of creating this Standard Macro was to allow the user to choose the multiplier rather than rely on a static number. Let's recap what has been created and deployed using a Standard Macro. First, Standard Macro.yxmc was developed using Interface tools. The Macro Input (I) was used to enter sample text data for the Number field. The Number field is what gets multiplied by the given multiplier - in this case, 0.50, the static multiplier. The Formula tool was used to create the expression specifying that the Number field will be multiplied by 0.50. The Macro Output (O) was used to output the macro so that it can be used in another workflow. The Text Box tool is where the question Please enter a number is displayed, along with the Action tool that updates the specific value being replaced. The current multiplier, 0.50, is replaced by 0.25, as entered in step 11 above, through a dynamic input in which the user can set the multiplier. Notice that, in the Browse tool output, the Result field has been added, multiplying the values of the DataValueAlt field by the 0.25 multiplier. Change the value in the macro to 0.10 and run the workflow; the Result field is updated to multiply the values of the DataValueAlt field by the 0.10 multiplier. This is a great use case for a Standard Macro and demonstrates how versatile the Interface tools are.

We learned about macros and their dynamic use within workflows. We saw how a Standard Macro was developed to allow the end user to specify what they want the multiplier to be. This is a great way to implement interactivity within a workflow. To know more about building high-quality interactive dashboards and efficient self-service data analytics, do check out the book Learning Alteryx.


API Gateway and its Need

Packt
21 Feb 2018
9 min read
In this article by Umesh R Sharma, author of the book Practical Microservices, we will cover the API Gateway and the need for it, with simple and short examples.

Dynamic websites show a lot on a single page, and there is a lot of information that needs to be shown on the page. A common example is an order success summary page, which shows the cart details and the customer address. To build it, the frontend has to fire separate queries to the customer detail service and the order detail service. This is a very simple example of a single page depending on multiple services. Because a single microservice deals with only one concern, showing a lot of information on one page results in many API calls for that page. So, a website or mobile page can be very chatty in terms of displaying data on a single screen.

Another problem is that, sometimes, microservices talk to each other over protocols other than HTTP, such as Thrift, and external consumers can't deal with a microservice directly in that protocol.

Also, as a mobile screen is smaller than a web page, the data required by a mobile API call differs from that of a desktop call. A developer may want to return less data to the mobile API, or maintain different versions of the API calls for mobile and desktop. So you can end up facing problems such as these: each client calls different web services and has to keep track of them, and developers have to maintain backward compatibility because API URLs are embedded in clients such as mobile apps.

Why do we need the API Gateway?

All of the preceding problems can be addressed with an API Gateway in place. The API Gateway acts as a proxy between the API consumer and the API servers. To address the first problem in that scenario, there will be only one call, such as /successOrderSummary, to the API Gateway. The API Gateway, on behalf of the consumer, calls the order and user detail services, combines the results, and serves them to the client. So, it basically acts as a facade for the API call, which may internally call many APIs. The API Gateway serves many purposes, some of which are as follows.

Authentication

The API Gateway can take on the overhead of authenticating an API call from outside. After that, all the internal calls can skip the security check. If a request comes from inside the VPC, removing the security check decreases network latency a bit and lets developers focus more on business logic than on security concerns.

Different protocols

Sometimes, microservices internally use different protocols to talk to each other; these can be Thrift, TCP, UDP, RMI, SOAP, and so on. For clients, there can be a single REST-based HTTP call. Clients hit the API Gateway over HTTP, and the API Gateway can make the internal calls in the required protocols and combine the results from all the web services at the end. It can then respond to the client in the required protocol; in most cases, that protocol will be HTTP.

Load balancing

The API Gateway can work as a load balancer to handle requests in the most efficient manner. It can keep track of the request load it has sent to different nodes of a particular service, and it should be intelligent enough to balance the load between the different nodes of that service. With NGINX Plus coming into the picture, NGINX is a good candidate for the API Gateway, as it has many of the features needed to address the problems usually handled by an API Gateway.
Request dispatching (including service discovery)

One of the main features of the gateway is to reduce communication between the client and the microservices. It initiates parallel calls to microservices when that is required by the client, so from the client side there is only one hit. The gateway hits all the required services, waits for their results and, after obtaining the responses from all the services, combines the result and sends it back to the client. Reactive microservice designs can help you achieve this.

Working with service discovery adds many extra features. The discovery service can indicate which node is the master of a service and which is a slave; the same goes for the database, where write requests can go to the master and read requests to a slave. This is the basic rule, but users can apply many more rules based on the meta information available to the API Gateway. The gateway can record the basic response time of each node of a service instance, so higher-priority API calls can be routed to the fastest-responding node. Again, the rules that can be defined depend on the API Gateway you are using and how it is implemented.

Response transformation

Being the first and single point of entry for all API calls, the API Gateway knows which type of client is calling: a mobile client, a web client, or another external consumer. It can make the internal calls accordingly and give the data to each type of client as per its needs and configuration.

Circuit breaker

To handle partial failure, the API Gateway uses a technique called the circuit breaker pattern. A failure in one service can cause cascading failures across all the service calls in the stack. The API Gateway can watch a threshold for each microservice; if a service passes that threshold, the gateway marks that API as an open circuit and decides not to make calls to it for a configured time. Hystrix (by Netflix) serves this purpose efficiently; its default threshold is 20 failed requests in 5 seconds. Developers can also define a fallback for an open circuit, which can be a dummy service. Once the API starts giving results as expected again, the gateway marks it as a closed circuit.
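The pattern itself is language-agnostic. The following is a minimal, illustrative sketch of the open/closed-circuit bookkeeping described above, written in Python rather than in the Java/Spring stack used later in this article; the threshold and window values simply mirror the Hystrix defaults mentioned in the text, and the whole class is an assumption made for illustration, not code from the book.

import time

class CircuitBreaker:
    """Illustrative circuit breaker: opens after too many recent failures."""

    def __init__(self, max_failures=20, window_seconds=5, open_seconds=30):
        self.max_failures = max_failures      # failures tolerated inside the window
        self.window_seconds = window_seconds  # sliding window for counting failures
        self.open_seconds = open_seconds      # how long to refuse calls once open
        self.failures = []                    # timestamps of recent failures
        self.opened_at = None                 # when the circuit was opened, if open

    def call(self, func, fallback):
        now = time.time()
        # While open, short-circuit to the fallback until the configured time passes.
        if self.opened_at is not None:
            if now - self.opened_at < self.open_seconds:
                return fallback()
            self.opened_at = None             # half-open: try the real call again
        try:
            return func()
        except Exception:
            # Record the failure and open the circuit if the threshold is crossed.
            self.failures = [t for t in self.failures if now - t < self.window_seconds]
            self.failures.append(now)
            if len(self.failures) >= self.max_failures:
                self.opened_at = now
            return fallback()

A gateway would wrap each downstream service call in such an object, with the fallback returning a cached or dummy response.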
Pros and cons of the API Gateway

Using the API Gateway has its own pros and cons. The previous section already described the advantages of using the API Gateway; they can be summarized as the following pros.

Pros:
- Microservices can focus on business logic
- Clients can get all the data in a single hit
- Authentication, logging, and monitoring can be handled by the API Gateway
- It gives the flexibility to use completely independent protocols for clients and microservices to talk over
- It can give tailor-made results, as per the client's needs
- It can handle partial failure

In addition to the preceding pros, there are also some trade-offs to using this pattern.

Cons:
- It can degrade performance, because a lot happens at the API Gateway
- A discovery service has to be implemented alongside it
- Sometimes it becomes a single point of failure
- Managing routing is an overhead of the pattern
- It adds an additional network hop to every call
- Overall, it increases the complexity of the system
- Putting too much logic in the gateway will lead to another dependency problem

So, both aspects should be considered before using the API Gateway. The decision to include the API Gateway in the system increases the cost as well. Before putting effort, cost, and management into this pattern, it is recommended to analyze how much you can gain from it.

Example of an API Gateway

In this example, we will show only a sample product page that fetches data from a product detail service to give information about the product. This example could be extended in many directions, but our focus is only to show how the API Gateway pattern works, so we will keep it simple and small. This example uses Zuul from Netflix as the API Gateway. Spring also has an implementation of Zuul, so we will create this example with Spring Boot.

For the sample API Gateway implementation, we will use http://start.spring.io/ to generate an initial template of our code. Spring Initializr is a Spring project that helps beginners generate basic Spring Boot code. A user sets a minimum configuration and can hit the Generate Project button. If a user wants to set more specific details about the project, they can see all the configuration settings by clicking on the Switch to the full version button, as shown in the following screenshot:

Let's create a controller in the same package as the main application class and put the following code in the file:

@SpringBootApplication
@RestController
public class ProductDetailController {

    @Resource
    ProductDetailService pdService;

    @RequestMapping(value = "/product/{id}")
    public ProductDetail getAllProduct(@PathVariable("id") String id) {
        return pdService.getProductDetailById(id);
    }
}

The preceding code assumes a pdService bean that interacts with a Spring Data repository for product details and returns the result for the requested product ID. Another assumption is that this service is running on port 10000. Just to make sure everything is running, hitting a URL such as http://localhost:10000/product/1 should return some JSON as a response.

For the API Gateway, we will create another Spring Boot application with Zuul support. Zuul can be activated by just adding the simple @EnableZuulProxy annotation. The following is the code to start a simple Zuul proxy:

@SpringBootApplication
@EnableZuulProxy
public class ApiGatewayExampleInSpring {
    public static void main(String[] args) {
        SpringApplication.run(ApiGatewayExampleInSpring.class, args);
    }
}

Everything else is managed through configuration. In the application.properties file of the API Gateway, the content will be as follows:

zuul.routes.product.path=/product/**
zuul.routes.product.url=http://localhost:10000
ribbon.eureka.enabled=false
server.port=8080

With this configuration, we are defining a rule that says: for any request to a URL such as /product/xxx, pass the request on to http://localhost:10000. To the outside world, the URL will be http://localhost:8080/product/1, which is internally forwarded to port 10000. If we defined a spring.application.name variable with the value product in the product detail microservice, then we would not need to define the URL path property here (zuul.routes.product.path=/product/**), as Zuul, by default, would map the URL to /product.

The example taken here for an API Gateway is not very intelligent, but Zuul is a very capable API Gateway. Depending on the routes, filters, and caching defined in Zuul's properties, one can build a very powerful API Gateway.

Summary

In this article, you learned about the API Gateway, the need for it, and its pros and cons, with a code example.


Getting Started on the Raspberry Pi

Packt
21 Feb 2018
7 min read
In this article, by Soham Chetan Kamani, author of the book Full Stack Web Development with Raspberry Pi 3, we will cover getting started on the Raspberry Pi.

The marvel of the Raspberry Pi, however, doesn't end here. Its extreme portability means we can now do things which were not previously possible with traditional desktop computers. The GPIO pins give us easy access to interface with external devices. This allows the Pi to act as a bridge between embedded electronics and sensors, and the power that Linux gives us. In essence, we can run any code in our favorite programming language (which can run on Linux) and interface it directly with outside hardware quickly and easily. Once we couple this with the wireless networking capabilities introduced in the Raspberry Pi 3, we gain the ability to make applications that would not have been feasible before this device existed.

The Raspberry Pi has become hugely popular as a portable computer, and for good reason. When it comes to what you can do with this tiny piece of technology, the sky's the limit. Back in the day, computers used to be the size of entire neighborhood blocks, and only large corporations doing expensive research could afford them. After that, we went on to embrace personal computers, which were still a bit expensive, but, for the most part, could be bought by the common man. This brings us to where we are today, when we can buy a fully functioning Linux computer, as big as a credit card, for under $30. It is truly a huge leap in making computers available to anyone and everyone.

Web development and portable computing have come a long way. A few years ago we couldn't dream of making a rich, interactive, and performant application that runs in the browser. Today, not only can we do that, but we can also do it all in the palm of our hands (quite literally). When we think of developing an application that uses databases, application servers, sockets, and cloud APIs, the picture that normally comes to mind is that of many server racks sitting in a huge room. In this book, however, we are going to implement all of that using only the Raspberry Pi.

In this article, we will go through the concept of the internet of things and discuss how web development on the Raspberry Pi can help us get there. Following this, we will also learn how to set up our Raspberry Pi and access it from our computer. We will cover the following topics:

- The internet of things
- Our application
- Setting up the Raspberry Pi
- Remote access

The Internet of Things (IoT)

The web has, until today, been a network of computers exchanging data. The limitation of this was that it was a closed loop: people could send and receive data from other people via their computers, but rarely much else. The internet of things, in contrast, is a network of devices or sensors that connect the outside world to the internet. Superficially, nothing is different: the internet is still a network of computers. What has changed is that now these computers are collecting and uploading data from things instead of people. This allows anyone who is connected to obtain information that is not collected by a human. The internet of things as a concept has been around for a long time, but it is only now that almost anyone can connect a sensor or device to the cloud, and the IoT revolution was hugely enabled by the advent of portable computing, which was led by the Raspberry Pi.
A brief look at our application

Throughout this book, we are going to go through different components and aspects of web development and embedded systems. These are all held together by our central goal of making an entire web application capable of sensing and displaying the surrounding temperature and humidity. In order to make a properly functioning system, we have to first build out the individual parts. More difficult still is making sure all the parts work well together. Keeping this in mind, let's take a look at the different components of our technology stack and the problems that each of them solves.

The sensor interface - Perception

The sensor is what connects our otherwise isolated application to the outside world. It will be connected to the GPIO pins of the Raspberry Pi, and we can interface with it through various native libraries. This is the starting point of our data; it is where all the data used by our application is created. If you think about it, every other component of our technology stack only exists to manage, manipulate, and display the data collected from the sensor.

The database - Persistence

"Data" is the term we give to raw information, that is, information that we cannot easily aggregate or understand. Without a way to store, meaningfully process, and retrieve this data, it will always remain "data" and never "information", which is what we actually want. If we just hook up a sensor and display whatever data it reads, we are missing out on a lot of additional information. Let's take the example of temperature: what if we wanted to find out how the temperature changes over time? What if we wanted to find the maximum and minimum temperatures for a particular day, a particular week, or even a custom duration of time? What if we wanted to see temperature variation across locations? There is no way we could do any of this with only the sensor. We also need some sort of persistence and structure for our data, and this is exactly what the database provides. If we structure our data correctly, getting the answers to the above questions is just a matter of a simple database query.

The user interface - Presentation

The user interface is the layer that connects our application to the end user. One of the most challenging aspects of software development is making information meaningful and understandable to regular users of our application. The UI layer serves exactly this purpose: it takes relevant information and shows it in such a way that it is easily understandable to humans. How do we achieve this level of understandability with such a large amount of data? We use visual aids: colors, charts, and diagrams (just like the diagrams in this book make its information easier to understand). An important thing for any developer to understand is that the end user doesn't actually care about any of the back-end stuff. The only thing that matters to them is a good experience. Of course, all the other components serve to make the user's experience better, but it's really the user-facing interface that leaves the first impression, and that's why it's so important to do it well.

The application server - Middleware

This layer consists of the actual server-side code we are going to write to get the application running; it is also called middleware. In addition to being at the exact center of the architecture diagram, this layer also acts as the controller and middleman for the other layers. The HTML pages that form the UI are served through this layer. All the database queries we were talking about earlier are made here. The code that runs in this layer is responsible for retrieving the sensor readings from our external pins and storing the data in our database.
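This excerpt doesn't show the book's actual implementation of these layers, but as a rough illustration of how they fit together, here is a minimal Python sketch. The DHT22 sensor, GPIO pin number, database file name, and the use of Flask and the Adafruit_DHT library are all assumptions made for this example, not details taken from the book:

import sqlite3
import time

import Adafruit_DHT          # assumed sensor library, for a DHT22 on GPIO pin 4
from flask import Flask, jsonify

DB_FILE = "readings.db"      # hypothetical database file
SENSOR, PIN = Adafruit_DHT.DHT22, 4

app = Flask(__name__)

def init_db():
    # Persistence layer: a single table holding timestamped readings.
    with sqlite3.connect(DB_FILE) as db:
        db.execute("CREATE TABLE IF NOT EXISTS readings "
                   "(ts REAL, temperature REAL, humidity REAL)")

def record_reading():
    # Perception layer: read the sensor and store the result.
    humidity, temperature = Adafruit_DHT.read_retry(SENSOR, PIN)
    with sqlite3.connect(DB_FILE) as db:
        db.execute("INSERT INTO readings VALUES (?, ?, ?)",
                   (time.time(), temperature, humidity))

@app.route("/latest")
def latest():
    # Middleware layer: query the database and return JSON for the UI to display.
    with sqlite3.connect(DB_FILE) as db:
        row = db.execute("SELECT ts, temperature, humidity FROM readings "
                         "ORDER BY ts DESC LIMIT 1").fetchone()
    return jsonify({"timestamp": row[0], "temperature": row[1], "humidity": row[2]})

if __name__ == "__main__":
    init_db()
    record_reading()
    app.run(host="0.0.0.0", port=5000)

A real application would, of course, sample the sensor on a schedule and serve proper HTML pages rather than a single JSON endpoint; the point here is only to show how the perception, persistence, and middleware layers line up.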
Summary

We are just warming up! In this article, we got a brief introduction to the concept of the internet of things. We then looked at an overview of what we are going to build throughout the rest of this book, and saw how the Raspberry Pi can help us get there.


What is ASP.NET Core?

Jason De
21 Feb 2018
4 min read
This is a guest post by Jason De Oliveira, the author of Learning ASP.NET Core 2.0.

ASP.NET Core is a cross-platform, open-source framework for building modern web-based applications. It provides everything necessary for building web applications, mobile back ends, and Internet of Things (IoT) applications quickly and easily. The first preview release of ASP.NET came out almost 15 years ago as part of the .NET Framework. Over the years, Microsoft added to and evolved many of its features, until arriving at a complete redesign of the framework, called ASP.NET Core, in June 2016. After ASP.NET Core 1.0 and 1.1, the third and latest installment, version 2.0, was released in August 2017. ASP.NET Core 2.0 applications run not only on the .NET Core Framework but also on the full .NET Framework.

The ASP.NET Core 2.0 framework was designed from the ground up to provide an optimized development framework for applications that can be deployed either in the cloud or on-premises. It consists of modular components with minimal overhead, so it retains a high degree of flexibility when conceiving and implementing software solutions. Additionally, ASP.NET Core 2.0 applications run on Windows, Linux, and macOS. Given this impressive set of features, millions of software developers have adopted it to build and run all types of web applications since its inception.

What are the key features of ASP.NET Core 2.0?

ASP.NET Core 2.0 applications normally have the following characteristics:

- They are service-oriented
- They provide very high performance
- They are easy to deploy and configure
- They support cross-platform scenarios
- They provide high levels of security

The framework helps developers build scalable, reliable, high-performance web applications in C#. Moreover, it allows them to achieve productivity and quality levels never seen before with any of the other frameworks available on the market.

Why web developers need to know ASP.NET Core 2.0

Modern web applications that can run in multiple environments are clearly the future. Developers who want to invest in a sustainable technology that is going to last for a long time have to acquire advanced web development skills today. ASP.NET Core 2.0 will surely become the market standard. It includes everything a web developer could wish for, which allows developers to be more productive and deliver higher quality while lowering maintenance efforts, since everything is already built in. You can build modern web applications as well as powerful Web APIs that can be deployed in any environment. Furthermore, it supports the latest technologies and best practices, such as containers and microservices.

But providing great web applications is not only about their development, which is only the beginning. Once developed, you need to ensure that your applications run successfully and that you can react quickly to unexpected behavior or errors at runtime. That is where supervision and monitoring come into play. A great developer needs to be open to these topics and understand them, in order to build software that supports them by default.

How ASP.NET Core 2.0 might be used in the future

If you want to prepare yourself for the future, you need to learn ASP.NET Core 2.0! It has been designed for the latest standards and best practices.
It supports all current technologies and is open source, which means that it will evolve over time and be adapted to new trends by Microsoft and the various developer communities. Since it runs on Windows, Linux, and macOS, both on-premises and in the cloud, you can deploy and host your web applications in any type of environment, so there is no lock-in or risk in using it for your applications. Seriously, ASP.NET Core 2.0 is the most forward-looking web development framework on the market, so start learning it today!


Tree Test and Surveys

Packt
21 Feb 2018
13 min read
In this article by Pablo Perea and Pau Giner, authors of the book UX Design for Mobile, we will cover different techniques that can be applied according to the needs of the project, and how to obtain the information that we want.

Tree Test

This technique is also called reverse card sorting. It is a method where the participants try to find elements in a given structure. The objective of this method is to discover findability problems and improve the organization and labeling system.

The organization structure used should represent a realistic navigation for the application or web page you are evaluating. If you don't have one by the time the experiment takes place, try to create a realistic scenario, as using a fake organization will not lead to really valuable results.

There are some platforms available to perform this type of experiment with a computer. One example is https://www.optimalworkshop.com. This has several advantages: the experiment can be carried out without requiring the participant to travel, and you can also study the participant's steps rather than just analyzing whether the participant succeeded or not. It may be that the participants found the objectives but had to make many attempts to achieve them.

The method steps

Create the structure: Type or create the navigation structure with all the different levels you want to evaluate.

Create a set of findability tasks: Think about different items that the participant should find, or should give a location for, in the given structure.

Test with participants: The participant will receive a set of tasks to do. The following are some examples of possible tasks:
- Find some different products to buy
- Contact customer support
- Get the shipping rates

The results: At the end, we should have a success rate for each of the tasks. Tasks such as finding products in a store must be done several times, with products located in different sections. This will help us classify our assortment and show us how to organize the first levels of the structure better.

How to improve the organization

Once we find the main weak points and workarounds in our structure, we can create alternative structures to retest and try to find better results. We can repeat this process several times until we get the desired results.

Information Architecture is the field of organizing and labeling content in a web page to support usability and findability. There is a growing community of Information Architecture specialists that supports the Information Architecture Institute (see https://en.wikipedia.org/wiki/Information_architecture).

There are some general lines of work in which we have to invest time in order to improve our Information Architecture.

Content organization

Content can be ordered following different schemes, in the same way that a supermarket orders products according to different criteria. We should try to find the one that best fits our users' needs. We can order the content, dividing it into groups, according to nature, goals, audience, chronological entry, and so on. Each of these approaches will lead to different results, and each will work better with different kinds of users. In the case of mobile applications, it is common to have certain sections that mix content of a different nature, for instance, integrating messages for the user into the contents of the activity view.
However, abusing these types of techniques can turn the section into a confusing area for the user.

Areas naming

There are words that have a completely different meaning from one person to another, especially if those people are thinking of different fields when they use our solution. Understanding our users' needs, and how they think and speak, will help us provide clear names for sections and subsections. For example, the word pool will represent a different set of products for a person looking for summer products than for a person looking for games.

In the case of applications, we will have to find a balance between simplicity and clarity. If space permits, adding a label along with the icon will clarify and reduce the possible ambiguities that may be encountered in recognizing the meaning of these graphic representations. On mobile, where space is really limited, we can find some universal icons, but we must test with users to ensure that they interpret them properly. In the following examples, you can find two different approaches. In the Gmail app, attachment and send are well-known icons and can work without a label. We find a very different scenario in the Samsung Clock app, where it would be really difficult to differentiate between the Alarm, the Stopwatch, and the Timer without labels:

Samsung Clock and Google Gmail app screenshots (source: screenshots from the Samsung Clock and Google Gmail apps)

The working memory limit

The way information is displayed to the user can drastically change the ease with which it is understood. When we talk about mobiles, where space is very limited, limiting the number of options and providing navigation adapted to small spaces can help our users have a more satisfying experience. As you probably know, human working memory is not limitless; it is commonly assumed to be limited to remembering a maximum of seven elements (https://en.wikipedia.org/wiki/The_Magical_Number_Seven,_Plus_or_Minus_Two). Some authors, such as Nelson Cowan, suggest that the number of elements an adult can remember while performing a task is even lower, and give four as the reference number (https://en.wikipedia.org/wiki/Working_memory). This means that your users will understand the information you give them better if you group it into blocks according to these limitations.

Once we create a new structure, we can evaluate the efficiency of the new structure versus the last version. With small improvements, we will be able to increase user engagement. Another way to learn about how users understand the organization of our app or web site is by testing a competitor's product. This is one of the cheapest ways to get a quick prototype. Evaluate as many versions as you can; in each review, you will find new ideas to better organize and present the content of your application or web app.

Surveys

Surveys allow us to gather information from lots of participants without too much effort. Sometimes we need information from a big group of people, and interviewing them one by one would not be affordable. Instead, surveys can quickly provide answers from lots of participants, and we can analyze the results with bulk methods. It is not the purpose of this book to deal in depth with the field of questionnaires, since there are books devoted entirely to this subject. Nevertheless, we will give some brushstrokes on the subject, since questionnaires are commonly used to gather information on both web pages and mobile applications.
Creating proper questions is a key part of the process; it reduces noise and helps the participants provide useful answers. Some questions will require more effort to analyze, but they will give us answers with a deeper level of detail. Questions with pre-established answers are usually easier to automate, and we can get results in less time.

What do we want to discover?

The first thing to do is to define the objective for which we are running the survey. Working with a clear objective will keep the process focused and produce better results. Plan carefully and determine the information that you really need at that moment. We should avoid surveys with lots of questions that do not have a clear purpose. They will produce poor outcomes and turn into meaningless exercises for the participants. On the contrary, if we have a general leitmotiv for the questionnaire, it will also help the participants understand how the effort of completing the survey will help the company, and therefore it will give clear value to the time spent.

You can plan your survey according to different planning approaches; your questions can be focused on short-term or long-term goals:

Long-term planning: Understanding your users' expectations and their view of your product in the long term will help you plan the evolution of your application and create new features that match their needs. For example, imagine that you are designing a music application, and you are unsure whether to focus on mass-market music or to give more visibility to amateur groups. Creating a long-term survey can help you understand what your users want to find in your platform and plan changes that match the conclusions extracted from the survey analysis.

Short-term planning: This is usually related to operational actions. The objective of these kinds of surveys is to gather information for taking actions later with a defined mission. These kinds of surveys are useful when we need to choose between two options, for example, when we are deciding whether or not to make a change in our platform. They can help us decide what type of information is most important for the user when choosing between one group and another, so we can make that information more visible. We will make better decisions by understanding the main aspects our users expect to find in our platform.

Find the participants

Depending on the goal of the survey, we can use a wider range of participants or reduce their number, filtering by demographics, experience, or their relationship with our brand or products. If the goal is to expand our number of users, it may be interesting to widen the search range to participants outside our current set of users. Looking for new niches and features that are interesting to our potential users can encourage new users to try out our application. If, on the contrary, our objective is to keep our current users loyal, consulting them about their preferences and their opinions on what works well in our application can be a great source of improvement. This data, along with usage and navigation data, will reveal areas for improvement, and we will be able to solve navigation and usability problems.

Determining the questions

We can ask different types of questions; depending on the type, we will get more or less detailed answers. If we choose the right type, we can save analysis effort, or we can reduce the number of participants when we require a deep analysis of each response.
It is common to include questions at the beginning of a questionnaire in order to classify the results. These are usually called filtering or screening questions, and they allow us to analyze the answers based on data such as age, gender, or technical skills. These questions have the objective of knowing the person answering the survey. If we know who is completing the questionnaire, we will be able to determine whether the answers given by this user are useful for our goals or not. We can add questions about the experience the participant has with technology in general, or with our app, and about their relationship with the brand.

We can create two kinds of questions based on the type of answers the participant can provide; each of them will lead to different results.

Open-answer questions

The objective of this type of question is to learn more about the participant without guiding the answers. We try to ask objectively about a subject without providing possible answers. The participant answers these questions with open-ended responses, so it is easier to learn more about how that participant thinks and which aspects they find more or less satisfying. While the advantage of this kind of question is that you gain a lot of insights and new ideas, the downside is the cost of managing large amounts of data. So, these types of questions are more useful when the number of participants is small. Here are some examples of open-answer questions:

- How often have you been using our customer service?
- How was your last purchase experience on our platform?

Questions with pre-established answers

These types of questions facilitate the analysis when the number of participants is high. We create questions with a clear objective and give different options to respond to; participants can choose one of the options as their response. The analysis of these types of questions can be automated, and is therefore faster, but it will not give us information as detailed as an open question, in which the participant can share all their ideas about the matter in question. The following is an example of a question with pre-established answers:

Question: How many times have you used our application in the last week?
Answers: 1) More than five times 2) Two to five 3) Once 4) None

Another great advantage is how easy it is to answer these types of questions when the participant does not have much time or interest in responding. In environments such as mobile phones, typing long answers can be costly and frustrating. With these types of questions, we can offer answers that the user can select with a single tap, which can help increase the number of participants completing the form.

It is common to mix both types of questions. Open-answer questions, where the user can respond in more detail, can be included as optional questions; participants willing to share more information can use these fields to introduce more detailed answers. This way, we can run a quicker analysis on the questions with pre-established answers and later analyze the questions that require more careful review.
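As a small illustration of how the analysis of pre-established answers can be automated (a hypothetical example, not part of the original article), a few lines of Python are enough to tally the responses to a question like the one above:

from collections import Counter

# Hypothetical raw responses collected for the question
# "How many times have you used our application in the last week?"
responses = [
    "More than five times", "Once", "None", "Two to five",
    "Once", "More than five times", "None", "Once",
]

counts = Counter(responses)
total = len(responses)
for answer, count in counts.most_common():
    print(f"{answer}: {count} ({count / total:.0%})")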
Humanize the forms

When we create a form, we must think about the person who will answer it. Almost no one likes to fill in questionnaires, especially if they are long and complex. To make our participants feel comfortable filling out all the answers on our form, we have to try to treat the process as a human relationship.

The first thing we should do is explain the reason for our form. If our participants understand how their answers will be used in the project, and how they can help us achieve the goal, they will feel more encouraged to answer the questions and take their role seriously.

Ask only what is strictly necessary for the purpose you have set. We must prevent different departments from introducing questions without a common goal. If the form is to answer the concerns of different departments, all of them should share the same goal; this way, the form will have more cohesion.

The tone used in the form should be friendly and clear. We should not cross the limits of indiscretion with our questions, or the participant may feel overwhelmed. Especially if the participants in our study are not users of our application or our services, we must treat them as strangers. Being respectful and kind is a key point in getting high participation.

Summary

In this article, we saw how to apply the Tree Test technique and how to conduct surveys to gather the information that we want.

Debugging Your.Net

Packt
21 Feb 2018
11 min read
In this article by Yannick Lefebvre, author of WordPress Plugin Development Cookbook - Second Edition, we will cover the following recipes: Creating a new shortcode with parameters, and Managing multiple sets of user settings from a single admin page. WordPress shortcodes are a simple, yet powerful tool that can be used to automate the insertion of code into web pages. For example, a shortcode could be used to automate the insertion of videos from a third-party platform that is not supported natively by WordPress, or to embed content from a popular website. By following the two code samples found in this article, you will learn how to create a WordPress plugin that defines your own shortcode to be able to quickly embed Twitter feeds on a website. You will also learn how to create an administration configuration panel to be able to create a set of configurations that can be referenced when using your newly-created shortcode.

Creating a new shortcode with parameters

While simple shortcodes already provide a lot of potential to output complex content to a page by entering a few characters in the post editor, shortcodes become even more useful when they are coupled with parameters that will be passed to their associated processing function. Using this technique, it becomes very easy to create a shortcode that accelerates the insertion of external content in WordPress posts or pages by only needing to specify the shortcode and the unique identifier of the source element to be displayed. We will illustrate this concept in this recipe by creating a shortcode that will be used to quickly add Twitter feeds to posts or pages.

How to do it...

Navigate to the WordPress plugin directory of your development installation. Create a new directory called ch2-twitter-embed. Navigate to this directory and create a new text file called ch2-twitter-embed.php. Open the new file in a code editor and add an appropriate header at the top of the plugin file, naming the plugin Chapter 2 - Twitter Embed. Add the following line of code to declare a new shortcode and specify the name of the function that should be called when the shortcode is found in posts or pages: add_shortcode( 'twitterfeed', 'ch2te_twitter_embed_shortcode' ); Add the following code section to provide an implementation for the ch2te_twitter_embed_shortcode function: function ch2te_twitter_embed_shortcode( $atts ) { extract( shortcode_atts( array( 'user_name' => 'ylefebvre' ), $atts ) ); if ( !empty( $user_name ) ) { $output = '<a class="twitter-timeline" href="'; $output .= esc_url( 'https://twitter.com/' . $user_name ); $output .= '">Tweets by ' . esc_html( $user_name ); $output .= '</a><script async '; $output .= 'src="//platform.twitter.com/widgets.js"'; $output .= ' charset="utf-8"></script>'; } else { $output = ''; } return $output; } Save and close the plugin file. Log in to the administration page of your development WordPress installation. Click on Plugins in the left-hand navigation menu. Activate your new plugin. Create a new page and use the shortcode [twitterfeed user_name='WordPress'] in the page editor, where WordPress is the Twitter username of the feed to display. Save and view the page to see that the shortcode was replaced by an embedded Twitter feed on your site. Edit the page, remove the user_name parameter and its associated value, leaving only the core [twitterfeed] shortcode in the post, and Save. Refresh the page and see that the feed is still being displayed, but now shows tweets from another account.
How it works... When shortcodes are used with parameters, these extra pieces of data are sent to the associated processing function in the $atts parameter variable. By using a combination of the standard PHP extract and WordPress-specific shortcode_atts functions, our plugin is able to parse the data sent to the shortcode and create an array of identifiers and values that are subsequently transformed into PHP variables that we can use in the rest of our shortcode implementation function. In this specific example, we expect a single variable to be used, called user_name, which will be stored in a PHP variable called $user_name. If the user enters the shortcode without any parameter, a default value of ylefebvre will be assigned to the username variable to ensure that the plugin still works. Since we are going to accept user input in this code, we also verify that the user did not provide an empty string and we use the esc_html and esc_url functions to remove any potentially harmful HTML characters from the input string and make sure that the link destination URL is valid. Once we have access to the twitter username, we can put together the required HTML code that will embed a Twitter feed in our page and display the selected user's tweets. While this example only has one argument, it is possible to define multiple parameters for a shortcode. Managing multiple sets of user settings from a single admin page Throughout this article, you have learned how to create configuration pages to manage single sets of configuration options for our plugins. In some cases, only being able to specify a single set of options will not be enough. For example, looking back at the Twitter embed shortcode plugin that was created, a single configuration panel would only allow users to specify one set of options, such as the desired twitter feed dimensions or the number of tweets to display. A more flexible solution would be to allow users to specify multiple sets of configuration options, which could then be called up by using an extra shortcode parameter (for example, [twitterfeed user_name="WordPress" option_id="2"]). While the first thought that might cross your mind to configure such a plugin is to create a multi-level menu item with submenus to store a number of different settings, this method would produce a very awkward interface for users to navigate. A better way is to use a single panel but give the user a way to select between multiple sets of options to be modified. In this recipe, you will learn how to enhance the previously created Twitter feed shortcode plugin to be able to control the embedded feed size and number of tweets to display from the plugin configuration panel and to give users the ability to specify multiple display sizes. Getting ready You should have already followed the Creating a new shortcode with parameters recipe in the article to have a starting point for this recipe. Alternatively, you can get the resulting code (Chapter 2/ch2-twitter-embed/ch2-twitter-embed.php) from the downloaded code bundle. How to do it... Navigate to the ch2-twitter-embed folder of the WordPress plugin directory of your development installation. Open the ch2-twitter-embed.php file in a text editor. 
Add an activation callback to initialize the plugin options when the plugin is installed or upgraded. Register a function to be called when the administration menu is put together; when this happens, the callback function adds an item to the Settings menu and specifies the function to be called to render the configuration page. Implement the configuration page rendering function. Add the following block of code to register a function that will process user options when submitted to the site: add_action( 'admin_init', 'ch2te_admin_init' ); function ch2te_admin_init() { add_action( 'admin_post_save_ch2te_options', 'process_ch2te_options' ); } Implement the process_ch2te_options function, declared in the previous block of code, together with a utility function used to clean the redirection path. Find the ch2te_twitter_embed_shortcode function and modify it as follows to accept the new option_id parameter and load the plugin options to produce the desired output.
The changes are identified in bold within the recipe: function ch2te_twitter_embed_shortcode( $atts ) { extract( shortcode_atts( array( 'user_name' => 'ylefebvre', 'option_id' => '1' ), $atts ) ); if ( intval( $option_id ) < 1 || intval( $option_id ) > 5 ) { $option_id = 1; } $options = ch2te_get_options( $option_id ); if ( !empty( $user_name ) ) { $output = '<a class="twitter-timeline" href="'; $output .= esc_url( 'https://twitter.com/' . $user_name ); $output .= '" data-width="' . $options['width'] . '" '; $output .= 'data-tweet-limit="' . $options['number_of_tweets']; $output .= '">' . 'Tweets by ' . esc_html( $user_name ); $output .= '</a><script async '; $output .= 'src="//platform.twitter.com/widgets.js"'; $output .= ' charset="utf-8"></script>'; } else { $output = ''; } return $output; } Save and close the plugin file. Deactivate and then Activate the Chapter 2 - Twitter Embed plugin from the administration interface to execute its activation function and create default settings. Navigate to the Settings menu and select the Twitter Embed submenu item to see the newly created configuration panel with the first set of options being displayed and more sets of options accessible through the drop-down list shown at the top of the page. To select the set of options to be used, add the parameter option_id to the shortcode used to display a Twitter feed, as follows: [twitterfeed user_name="WordPress" option_id="1"] How it works... This recipe shows how we can leverage options arrays to create multiple sets of options simply by creating the name of the options array on the fly. Instead of having a specific option name in the first parameter of the get_option function call, we create a string with an option ID. This ID is sent through as a URL parameter on the configuration page and as a hidden text field when processing the form data. On initialization, the plugin only creates a single set of options, which is probably enough for most casual users of the plugin. Doing so will avoid cluttering the site database with useless options. When the user requests to view one of the empty option sets, the plugin creates a new set of options right before rendering the options page. The rest of the code is very similar to the other examples that we saw in this article, since the way to access the array elements remains the same. Summary In this article, the author has explained about the entire process of how to create a new shortcode with parameters and how to manage multiple sets of user settings from a single admin page.   Resources for Article:   Further resources on this subject:Introduction to WordPress Plugin [article] Wordpress: Buddypress Courseware [article] Anatomy of a WordPress Plugin [article]

Is React Native really a Native framework?

Packt
21 Feb 2018
11 min read
This article by Vladimir Novick, author of the book React Native - Building Mobile Apps with JavaScript, introduces the concept of how the the React Native is really a Native framework, it's working, information flow, architecture, and benefits. (For more resources related to this topic, see here.) Introduction So how React Native is different? Well it doesn’t fall under hybrid category because the approach is different. When hybrid apps are trying to make platform specific features reusable between platforms, React Native have platform independent features, but also have lots of specific platform implementations. Meaning that on iOS and on Android code will look different, but somewhere between 70-90 percent of code will be reused. Also React Native does not depend on HTML or CSS. You write in JavaScript, but this JavaScript is compiled to platform specific Native code using React Native bridge. It happens all the time, but it’s optimize to a way, that application will run smoothly in 60fps. So to summarize React Native is not really a Native framework, but It’s much closer to Native code, than hybrid apps. And now let’s dive a bit deeper and understand how JavaScript gets converted into a Native code. How React Native bridge from JavaScript to Native world works? Let’s dive a bit deeper and understand how React Native works under the hood, which will help us understand how JavaScript is compiled to a Native code and how the whole process works. It’s crucial to know how the whole process works, so if you will have performance issues one day, you will understand where it originates. Information flow So we’ve talked about React concepts that power up React Native and one of them is that UI is a function of data. You change the state and React knows what to update. Let’s visualize now how information flows through common React app. Check out the diagram:  We have React component, which passes data to three child components Under the hood what is happening is, Virtual DOM tree is created representing these component hierarchy When state of the parent component is updated, React knows how to pass information to the children Since children are basically representation of UI, React figures out how to batch Browser DOM updates and executes them So now let’s remove Browser DOM and think that instead of batching Browser DOM updates, React Native does the same with calls to Native modules. So what about passing information down to Native modules? It can be done in two ways: Shared mutable data Serializable messages exchanged between JavaScript and Native modules React Native is going with the second approach. Instead of mutating data on shareable objects it passes asynchronous serialized batched messages to React Native Bridge. Bridge is the layer that is responsible for glueing together Native and JavaScript environments. Architecture Let’s take a look at the following diagram, which explains how React Native Architecture is structured and walk through the diagram: In diagram, pictured three layers: Native, Bridge and JavaScript. Native layer is pictured at the last in picture, because the layer that is closer to device itself. Bridge is the layer that connects between JavaScript and Native modules and basically is a transport layer that transport asynchronous serialized batched response messages from JavaScript to Native modules. When event is executed on Native layer. It can be touch, timer, network request. 
Basically any event involving device Native modules, It’s data is collected and is sent to the Bridge as a serialized message. Bridge pass this message to JavaScript layer. JavaScript layer is an event loop. Once Bridge passes Serialized payload to JavaScript, Event is processed and your application logic comes into play. If you update state, triggering your UI to re-render for example, React Native will batch Update UI and send them to the Bridge. Bridge will pass this Serialized batched response to Native layer, which will process all commands, that it can distinguish from serialized batched response and will Update UI accordingly. Threading model Up till now we’ve seen that there are lots of stuff going on under the hood of React Native. It’s important to know that everything is done on three main threads: UI (application main thread) Native modules JavaScript Runtime UI thread is the main Native thread where Native level rendering occurs. It is here, where your platform of choice, iOS or Android, does measures, layouting, drawing. If your application accesses any Native APIs, it’s done on a separate Native modules thread. For example, if you want to access the camera, Geo location, photos, and any other Native API. Panning and gestures in general are also done on this thread. JavaScript Runtime thread is the thread where all your JavaScript code will run. It’s slower than UI thread since it’s based on JavaScript event loop, so if you do complex calculations in your application, that leads to lots of UI changes, these can lead to bad performance. The rule of thumb is that if your UI will change slower than 16.67ms, then UI will appear sluggish. What are benefits of React Native? React Native brings with it lots of advantages for mobile development. We covered some of them briefly before, but let’s go over now in more detail. These advantages are what made React Native so popular and trending nowadays. And most of all it give web developers to start developing Native apps with relatively short learning curve compared to overhead learning Objective-C and Java. Developer experience One of the amazing changes React Native brings to mobile development world is enhancing developer experience. If we check developer experience from the point of view of web developer, it’s awesome. For mobile developer it’s something that every mobile developer have dreamt of. Let’s go over some of the features React Native brings for us out of the box. Chrome DevTools debugging Every web developer is familiar with Chrome Developer tools. These tools give us amazing experience debugging web applications. In mobile development debugging mobile applications can be hard. Also it’s really dependent on your target platform. None of mobile application debugging techniques does not even come near web development experience. In React Native, we already know, that JavaScript event loop is running on a separate thread and it can be connected to Chrome DevTools. By clicking Ctrl/Cmd + D in application simulator, we can attach our JavaScript code to Chrome DevTools and bring web debugging to a mobile world. Let’s take a look at the following screenshot: Here you see a React Native debug tools. By clicking on Debug JS Remotely, a separate Google Chrome window is opened where you can debug your applications by setting breakpoints, profiling CPU and memory usage and much more. Elements tab in Chrome Developer tools won’t be relevant though. For that we have a different option. 
Let’s take a look at what we will get with Chrome Developer tools Remote debugger. Currently Chrome developer tools are focused on Sources tab. You can notice that JavaScript is written in ECMAScript 2015 syntax. For those of you who are not familiar with React JSX, you see weird XML like syntax. Don’t worry, this syntax will be also covered in the book in the context of React Native.  If you put debugger inside your JavaScript code, or a breakpoint in your Chrome development tools, the app will pause on this breakpoint or debugger and you will be able to debug your application while it’s running. Live reload As you can see in React Native debugging menu, the third row says Live Reload. If you enable this option, whenever you change your code and save, the application will be automatically reloaded. This ability to Live reload is something mobile developers only dreamt of. No need to recompile application after each minor code change. Just save and the application will reload itself in simulator. This greatly speed up application development and make it much more fun and easy than conventional mobile development. The workflow for every platform is different while in React Native the experience is the same. Does not matter for which platform you develop. Hot reload Sometimes you develop part of the application which requires several user interactions to get to. Think of, for example logging in, opening menu and choosing some option. When we change our code and save, while live reload is enabled, our application is reloaded and we need to once again do these steps. But it does not have to be like that. React Native gives us amazing experience of hot reloading. If you enable this option in React Native development tools and if you change your React Native component, only the component will be reloaded while you stay on the same screen you were before. This speeds up the development process even more. Component hierarchy inspections I’ve said before, that we cannot use elements panel in Chrome development tools, but how you inspect your component structure in React Native apps? React Native gives us built in option in development tools called Show Inspector. When clicking it, you will get the following window: After inspector is opened, you can select any component on the screen and inspect it. You will get the full hierarchy of your components as well as their styling: In this example I’ve selected Welcome to React Native! text. In the opened pane I can see it’s dimensions, padding margin as well as component hierarchy. As you can see it’s IntroApp/Text/RCTText. RCTText is not a React Native JavaScript component, but a Native text component, connected to React Native bridge. In that way you also can see that component is connected to a Native text component. There are even more dev tools available in React Native, that I will cover later on, but we all can agree, that development experience is outstanding. Web inspired layout techniques Styling for Native mobile apps can be really painful sometimes. Also it’s really different between iOS and Android. React Native brings another solution. As you may’ve seen before the whole concept of React Native is bringing web development experience to mobile app development. That’s also the case for creating layouts. Modern way of creating layout for the web is by using flexbox. React Native decided to adopt this modern technique for web and bring it also to the mobile world with small differences. 
In addition to layout, all styling in React Native is very similar to using inline styles in HTML. Let's take a look at an example: const styles = StyleSheet.create({ container: { flex: 1, justifyContent: 'center', alignItems: 'center', backgroundColor: '#F5FCFF', }, }); As you can see in this example, several flexbox properties are used as well as a background color. This is really reminiscent of CSS properties; however, instead of using background-color, justify-content, and align-items, the properties are named in a camel case manner. In order to apply these styles to a text component, for example, it's enough to pass them as follows: <Text style={styles.container}>Welcome to React Native</Text> Styling will be discussed in the book; however, as you can see from the example before, styling techniques are similar to the web. They are not dependent on any platform and are the same for both iOS and Android.

Code reusability across applications

In terms of code reuse, if an application is properly architected (something we will also learn in this book), around 80% to 90% of code can be reused between iOS and Android. This means that in terms of development speed, React Native beats conventional mobile development. Sometimes even code used for the web can be reused in a React Native environment with small changes. This really brings React Native to the top of the list of the best frameworks for developing Native mobile apps.

Summary

In this article, we briefly learned how React Native is really a Native framework, how it works, its information flow, its architecture, and its benefits.

Resources for Article: Further resources on this subject: Building Mobile Apps [article] Web Development with React and Bootstrap [article] Introduction to JavaScript [article]

VMware vSphere storage, datastores, snapshots

Packt
21 Feb 2018
9 min read
In this article, by Abhilash G B, author of the book VMware vSphere 6.5 Cookbook - Third Edition, we will cover the following: Managing VMFS volumes detected as snapshots, Creating NFSv4.1 datastores with Kerberos authentication, and Enabling Storage I/O Control. (For more resources related to this topic, see here.)

Introduction

Storage is an integral part of any infrastructure. It is used to store the files backing your virtual machines. The most common way to refer to a type of storage presented to a VMware environment is based on the protocol used and the connection type. NFS solutions leverage the existing TCP/IP network infrastructure; hence, they are referred to as IP-based storage. Storage I/O Control (SIOC) is one of the mechanisms used to ensure a fair share of storage bandwidth allocation to all Virtual Machines running on shared storage, regardless of the ESXi host the Virtual Machines are running on.

Managing VMFS volumes detected as snapshots

Some environments maintain copies of the production LUNs as a backup, by replicating them. These replicas are exact copies of the LUNs that were already presented to the ESXi hosts. If for any reason a replicated LUN is presented to an ESXi host, the host will not mount the VMFS volume on the LUN. This is a precaution to prevent data corruption. ESXi identifies each VMFS volume using its signature, denoted by a Universally Unique Identifier (UUID). The UUID is generated when the volume is first created or resignatured and is stored in the LVM header of the VMFS volume. When an ESXi host scans for new LUN devices and the VMFS volumes on them, it compares the physical device ID (NAA ID) of the LUN with the device ID (NAA ID) value stored in the VMFS volume's LVM header. If it finds a mismatch, it flags the volume as a snapshot volume. Volumes detected as snapshots are not mounted by default. There are two ways to mount such volumes/datastores:

Mount by keeping the existing signature intact - This is used when you are attempting to temporarily mount the snapshot volume on an ESXi host that doesn't see the original volume. If you were to attempt mounting the VMFS volume by keeping the existing signature and the host sees the original volume, then you will not be allowed to mount the volume and will be warned about the presence of another VMFS volume with the same UUID.

Mount by generating a new VMFS signature - This has to be used if you are mounting a clone or a snapshot of an existing VMFS datastore to the same host(s). The process of assigning a new signature will not only update the LVM header with the newly generated UUID, but also the physical device ID (NAA ID) of the snapshot LUN. Here, the VMFS volume/datastore will be renamed by prefixing the word snap followed by a random number and the name of the original datastore.

Getting ready

Make sure that the original datastore and its LUN are no longer seen by the ESXi host the snapshot is being mounted to.
How to do it...The following procedure will help mount a VMFS volume from a LUN detected as a snapshot:Log in to the vCenter Server using the vSphere Web Client and use the key combination Ctrl+Alt+2 to switch to the Host and Clusters view.Right click on the ESXi host the snapshot LUN is mapped to and go to Storage | New Datastore.On the New Datastore wizard, select VMFS as the filesystem type and click Next to continue.On the Name and Device selection screen, select the LUN detected as a snaphsot and click Next to continue:On the Mount Option screen, choose to either mount by assigning a new signature or by keeping the existing signature, and click Next to continue:On the Ready to Complete screen, review the setting and click Finish to initiate the operation. Creating NFSv4.1 datastores with Kerberos authenticationVMware introduced support for NFS 4.1 with vSphere 6.0. The vSphere 6.5 added several enhancements:It now supports AES encryptionSupport for IP version 6Support Kerberos's integrity checking mechanismHere, we will learn how to create NFS 4.1 datastores. Although the procedure is similar to NFSv3, there are a few additional steps that needs to be performed. Getting readyFor Kerberos authentication to work, you need to make sure that the ESXi hosts and the NFS Server is joined to the Active Directory domainCreate a new or select an existing AD user for NFS Kerberos authenticationConfigure the NFS Server/Share to Allow access to the AD user chosen for NFS Kerberos authentication How to do it...The following procedure will help you mount an NFS datasture using the NFSv4.1 client with Kerberos authentication enabled:Log in to the vCenter Server using the vSphere Web Client and use the key combination Ctrl+Alt+2 to switch to the Host and Clusters view, select the desired ESXi host and navigate to it  Configure | System | Authentication Services section and supply the credentials of the Active Directory user that was chosen for NFS Kerberon Authentication:Right-click on the desired ESXi host and go to Storage | New Datastore to bring-up the Add Storage wizard.On the New Datastore wizard, select the Type as NFS and click Next to continue.On the Select NFS version screen, select NFS 4.1 and click Next to continue. Keep in mind that it is not recommended to mount an NFS Export using both NFS3 and NFS4.1 client. On the Name and Configuration screen, supply a Name for the Datastore, the NFS export's folder path and NFS Server's IP Address or FQDN. You can also choose to mount the share as ready-only if desired:On the Configure Kerberos Authentication screen, check the Enable Kerberos-based authentication box and choose the type of authentication required and click Next to continue:On the Ready to Complete screen review the settings and click Finish to mount the NFS export. Enabling storage I/O controlThe use of disk shares will work just fine as long as the datastore is seen by a single ESXi host. Unfortunately, that is not a common case. Datastores are often shared among multiple ESXi hosts. When datastores are shared, you bring in more than one local host scheduler into the process of balancing the I/O among the virtual machines. However, these lost host schedules cannot talk to each other and their visibility is limited to the ESXi hosts they are running on. This easily contributes to a serious problem called thenoisy neighbor situation. 
The job of SIOC is to enable some form of communication between local host schedulers so that I/O can be balanced between virtual machines running on separate hosts.

How to do it...

The following procedure will help you enable SIOC on a datastore: Connect to the vCenter Server using the Web Client and switch to the Storage view using the key combination Ctrl+Alt+4. Right-click on the desired datastore and go to Configure Storage I/O Control. On the Configure Storage I/O Control window, select the checkbox Enable Storage I/O Control, set a custom congestion threshold (only if needed), and click OK to confirm the settings. With the Virtual Machine selected from the inventory, navigate to its Configure | General tab and review its datastore capability settings to ensure that SIOC is enabled.

How it works...

As mentioned earlier, SIOC enables communication between these local host schedulers so that I/O can be balanced between virtual machines running on separate hosts. It does so by maintaining a shared file in the datastore that all hosts can read/write/update. When SIOC is enabled on a datastore, it starts monitoring the device latency on the LUN backing the datastore. If the latency crosses the threshold, it throttles the LUN's queue depth on each of the ESXi hosts in an attempt to distribute a fair share of access to the LUN for all the Virtual Machines issuing I/O. The local scheduler on each of the ESXi hosts maintains an iostats file to keep its companion hosts aware of the device I/O statistics observed on the LUN. The file is placed in a directory (naa.xxxxxxxxx) on the same datastore.

For example, suppose there are six virtual machines running on three different ESXi hosts, accessing a shared LUN. Among the six VMs, four of them have a normal share value of 1000 and the remaining two have a high (2000) disk share value set on them. These virtual machines have only a single VMDK attached to them. VM-C on host ESX-02 is issuing a large number of I/O operations. Since that is the only VM accessing the shared LUN from that host, it gets the entire queue's bandwidth. This can induce latency on the I/O operations performed by the VMs on the other hosts: ESX-01 and ESX-03. If SIOC detects the latency value to be greater than the dynamic threshold, it will start throttling the queue depth.

The throttled DQLEN for a VM is calculated as follows: DQLEN for the VM = (VM's Percent of Shares) of (Queue Depth). Example: 12.5% of 64 → (12.5 * 64)/100 = 8. The throttled DQLEN per host is calculated as follows: DQLEN of the Host = Sum of the DQLEN of the VMs on it. Example: VM-A (8) + VM-B (16) = 24. The following diagram shows the effect of SIOC throttling the queue depth.

Summary

In this article we learnt how to mount a VMFS volume from a LUN detected as a snapshot, how to mount an NFS datastore using the NFSv4.1 client with Kerberos authentication enabled, and how to enable SIOC on a datastore.

Resources for Article: Further resources on this subject: Essentials of VMware vSphere [article] Working with VMware Infrastructure [article] Network Virtualization and vSphere [article]

Implementing face detection using the Haar Cascades and AdaBoost algorithm

Sugandha Lahoti
20 Feb 2018
7 min read
[box type="note" align="" class="" width=""]This article is an excerpt from a book written by Ankit Dixit titled Ensemble Machine Learning. This book serves as an effective guide to using ensemble techniques to enhance machine learning models.[/box] In today’s tutorial, we will learn how to apply the AdaBoost classifier in face detection using Haar cascades. Face detection using Haar cascades Object detection using Haar feature-based cascade classifiers is an effective object detection method proposed by Paul Viola and Michael Jones in their paper Rapid Object Detection using a Boosted Cascade of Simple Features in 2001. It is a machine-learning-based approach where a cascade function is trained from a lot of positive and negative images. It is then used to detect objects in other images. Here, we will work with face detection. Initially, the algorithm needs a lot of positive images (images of faces) and negative images (images without faces) to train the classifier. Then we need to extract features from it. Features are nothing but numerical information extracted from the images that can be used to distinguish one image from another; for example, a histogram (distribution of intensity values) is one of the features that can be used to define several characteristics of an image even without looking at the image, such as dark or bright image, the intensity range of the image, contrast, and so on. We will use Haar features to detect faces in an image. Here is a figure showing different Haar features: These features are just like the convolution kernel; to know about convolution, you need to wait for the following chapters. For a basic understanding, convolutions can be described as in the following figure: So we can summarize convolution with these steps: Pick a pixel location from the image. Now crop a sub-image with the selected pixel as the center from the source image with the same size as the convolution kernel. Calculate an element-wise product between the values of the kernel and sub- image. Add the result of the product. Put the resultant value into the new image at the same place where you picked up the pixel location. Now we are going to do a similar kind of procedure, but with a slight difference for our images. Each feature of ours is a single value obtained by subtracting the sum of the pixels under the white rectangle from the sum of the pixels under the black rectangle. Now, all possible sizes and locations of each kernel are used to calculate plenty of features. (Just imagine how much computation it needs. Even a 24x24 window results in over 160,000 features.) For each feature calculation, we need to find the sum of the pixels under the white and black rectangles. To solve this, we will use the concept of integral image; we will discuss this concept very briefly here, as it's not a part of our context. Integral image Integral images are those images in which the pixel value at any (x,y) location is the sum of the all pixel values present before the current pixel. Its use can be understood by the following example: Image on the left and the integral image on the right. Let's see how this concept can help reduce computation time; let us assume a matrix A of size 5x5 representing an image, as shown here: Now, let's say we want to calculate the average intensity over the area highlighted: Region for addition Normally, you'd do the following: 9 + 1 + 2 + 6 + 0 + 5 + 3 + 6 + 5 = 37 37 / 9 = 4.11 This requires a total of 9 operations. 
Doing the same for 100 such operations would require: 100 * 9 = 900 operations. Now, let us first make a integral image of the preceding image: Making this image requires a total of 56 operations. Again, focus on the highlighted portion: To calculate the avg intensity, all you have to do is: (76 - 20) - (24 - 5) = 37 37 / 9 = 4.11 This required a total of 4 operations. To do this for 100 such operations, we would require: 56 + 100 * 4 = 456 operations. For just a hundred operations over a 5x5 matrix, using an integral image requires about 50% less computations. Imagine the difference it makes for large images and other such operations. Creation of an integral image changes other sum difference operations by almost O(1) time complexity, thereby decreasing the number of calculations. It simplifies the calculation of the sum of pixels—no matter how large the number of pixels—to an operation involving just four pixels. Nice, isn't it? It makes things superfast. However, among all of these features we calculated, most of them are irrelevant. For example, consider the following image. The top row shows two good features. The first feature selected seems to focus on the property that the region of the eyes is often darker than the region of the nose and cheeks. The second feature selected relies on the property that the eyes are darker than the bridge of the nose. But the same windows applying on cheeks or any other part is irrelevant. So how do we select the best features out of 160000+ features? It is achieved by AdaBoost. To do this, we apply each and every feature on all the training images. For each feature, it finds the best threshold that will classify the faces as positive and negative. Obviously, there will be errors or misclassifications. We select the features with the minimum error rate, which means they are the features that best classify the face and non-face images. Note: The process is not as simple as this. Each image is given an equal weight in the       beginning. After each classification, the weights of misclassified images are increased. Again, the same process is done. New error rates are calculated among the new weights. This process continues until the required accuracy or error rate is achieved or the required number of features is found. The final classifier is a weighted sum of these weak classifiers. It is called weak because it alone can't classify the image, but together with others, it forms a strong classifier. The paper says that even 200 features provide detection with 95% accuracy. Their final setup had around 6,000 features. (Imagine a reduction from 160,000+ to 6000 features. That is a big gain.) Face detection framework using the Haar cascade and AdaBoost algorithm So now, you take an image take each 24x24 window, apply 6,000 features to it, and check if it is a face or not. Wow! Wow! Isn't this a little inefficient and time consuming? Yes, it is. The authors of the algorithm have a good solution for that. In an image, most of the image region is non-face. So it is a better idea to have a simple method to verify that a window is not a face region. If it is not, discard it in a single shot. Don’t process it again. Instead, focus on the region where there can be a face. This way, we can find more time to check a possible face region. For this, they introduced the concept of a cascade of classifiers. 
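To make the integral image trick concrete, here is a minimal Python sketch (assuming NumPy is available). The 5x5 matrix below is made up for illustration, since the book's figure is not reproduced here; its top-left 3x3 block is arranged to sum to 37, like the highlighted region in the text. The zero padding row and column are an implementation convenience, not something from the original example.

```python
import numpy as np

# Arbitrary 5x5 "image"; the top-left 3x3 block sums to 37 as in the text.
img = np.array([[9, 1, 2, 4, 7],
                [6, 0, 5, 1, 3],
                [3, 6, 5, 2, 8],
                [2, 7, 1, 9, 4],
                [5, 3, 8, 6, 2]])

# Integral image with an extra row/column of zeros on the top/left,
# so region sums need no special cases at the borders.
ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=int)
ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)

def region_sum(ii, r0, c0, r1, c1):
    """Sum of img[r0:r1+1, c0:c1+1] using only 4 lookups on the integral image."""
    return ii[r1 + 1, c1 + 1] - ii[r0, c1 + 1] - ii[r1 + 1, c0] + ii[r0, c0]

# Brute force and the 4-lookup formula agree for the top-left 3x3 block.
r0, c0, r1, c1 = 0, 0, 2, 2
assert region_sum(ii, r0, c0, r1, c1) == img[r0:r1 + 1, c0:c1 + 1].sum()   # 37
print(region_sum(ii, r0, c0, r1, c1) / 9.0)                                 # 4.11 average intensity
```

Once the integral image is built, every Haar feature (a difference of two or three rectangle sums) costs a constant number of lookups, which is what makes evaluating thousands of features per window practical.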
Instead of applying all the 6,000 features to a window, we group the features into different stages of classifiers and apply one by one (normally first few stages will contain very few features). If a window fails in the first stage, discard it. We don’t consider the remaining features in it. If it passes, apply the second stage of features and continue the process. The window that passes all stages is a face region. How cool is the plan!!! The authors' detector had 6,000+ features with 38 stages, with 1, 10, 25, 25, and 50 features in the first five stages (two features in the preceding image were actually obtained as the best two features from AdaBoost). According to the authors, on average, 10 features out of 6,000+ are evaluated per subwindow. So this is a simple, intuitive explanation of how Viola-Jones face detection works. Read the paper for more details. If you found this post useful, do check out the book Ensemble Machine Learning to learn different machine learning aspects such as bagging, boosting, and stacking.    
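The cascades trained this way ship with OpenCV, so the whole detection pipeline can be exercised in a few lines. The sketch below assumes OpenCV (cv2) is installed and that an input file named face.jpg exists; the scaleFactor and minNeighbors values are common defaults, not the settings used in the original paper.

```python
import cv2

# Pre-trained frontal-face Haar cascade bundled with OpenCV; its stages and
# weak classifiers were selected with AdaBoost, as described above.
cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
face_cascade = cv2.CascadeClassifier(cascade_path)

img = cv2.imread("face.jpg")                   # hypothetical input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)   # detection runs on grayscale

# Windows are scanned at multiple scales; most are rejected by the early
# cascade stages, and only promising windows reach the later stages.
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)

cv2.imwrite("faces_detected.jpg", img)
print("Detected %d face(s)" % len(faces))
```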

Introduction to Device Management

Packt
20 Feb 2018
10 min read
In this article by Yatish Patil, the author of the book Microsoft Azure IOT Development Cookbook, we will look at device management using different techniques with Azure IoT Hub. We will see the following recipes: Device registry operations Device twins Device direct methods Device jobs (For more resources related to this topic, see here.) Azure IoT Hub has the capabilities that can be used by a developer to build a robust device management. There could be different use cases or scenarios across multiple industries but these device management capabilities, their patterns and the SDK code remains same, saving the significant time in developing and managing as well as maintaining the millions of devices. Device management will be the central part of any IoT solution. The IoT solution is going to help the users to manage the devices remotely, take actions from the cloud based application like disable, update data, run any command, and firmware update. In this article, we are going to perform all these tasks for device management and will start with creating the device. Device registry operations This sample application is focused on device registry operations and how it works, we will create a console application as our first IoT solution and look at the various device management techniques. Getting ready Let’s create a console application to start with IoT: Create a new project in Visual Studio: Create a Console Application Add IoT Hub connectivity extension in Visual Studio: Add the extension for IoT Hub connectivity Now right click on the Solution and go to Add a Connected Services. Select Azure IoT Hub and click Add. Now select Azure subscription and the IoT Hub created: Select IoT Hub for our application Next it will ask you to add device or you can skip this step and click Complete the configuration. How to do it... Create device identity: initialize the Azure IoT Hub registry connection: registryManager = RegistryManager.CreateFromConnectionString(connectionString); Device device = new Device(); try { device = await registryManager.AddDeviceAsync(new Device(deviceId)); success = true; } catch (DeviceAlreadyExistsException) { success = false; } Retrieve device identity by ID: Device device = new Device(); try { device = await registryManager.GetDeviceAsync(deviceId); } catch (DeviceAlreadyExistsException) { return device; } Delete device identity: Device device = new Device(); try { device = GetDevice(deviceId); await registryManager.RemoveDeviceAsync(device); success = true; } catch (Exception ex) { success = false; } List up to 1000 identities: try { var devicelist = registryManager.GetDevicesAsync(1000); return devicelist.Result; } catch (Exception ex) { // Export all identities to Azure blob storage: var blobClient = storageAccount.CreateCloudBlobClient(); string Containername = "iothubdevices"; //Get a reference to a container var container = blobClient.GetContainerReference(Containername); container.CreateIfNotExists(); //Generate a SAS token var storageUri = GetContainerSasUri(container); await registryManager.ExportDevicesAsync(storageUri, "devices1.txt", false); } Import all identities to Azure blob storage: await registryManager.ImportDevicesAsync(storageUri, OutputStorageUri); How it works... Let’s now understand the steps we performed. We initiated by creating a console application and configured it for the Azure IoT Hub solution. The idea behind this is to see the simple operation for device management. 
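To keep the registry concepts language-neutral, here is a small conceptual sketch in Python (not the Azure SDK) of the same create, retrieve, delete, and list operations working on identity documents with the properties listed above; the DeviceRegistry class and its field choices are illustrative assumptions only.

```python
import base64, os

class DeviceRegistry:
    """Toy, in-memory stand-in for the IoT Hub identity registry."""
    def __init__(self):
        self._devices = {}

    def add_device(self, device_id):
        if device_id in self._devices:
            raise ValueError("device already exists")   # mirrors DeviceAlreadyExistsException
        self._devices[device_id] = {
            "deviceId": device_id,
            "etag": os.urandom(4).hex(),
            "symkey": {"primary": base64.b64encode(os.urandom(32)).decode(),
                       "secondary": base64.b64encode(os.urandom(32)).decode()},
            "status": "enabled",
            "statusReason": None,
            "connectionState": "Disconnected",
        }
        return self._devices[device_id]

    def get_device(self, device_id):
        return self._devices[device_id]

    def remove_device(self, device_id):
        del self._devices[device_id]

    def list_devices(self, max_count=1000):
        return list(self._devices.values())[:max_count]

registry = DeviceRegistry()
registry.add_device("myFirstDevice")
print(registry.get_device("myFirstDevice")["status"])   # enabled
```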
In this article, we started with simple operation for provision of the device by adding it to IoT Hub. We need to create connection to the IoT Hub followed by the created object of registry manager which is a part of devices namespace. Once we are connected we can perform operations like, add device, delete device, get device, these methods are asynchronous ones. IoT Hub also provides a way where in it connects with Azure storage blob for bulk operations like export all devices or import all devices, this works on JSON format only, the entire set of IoT devices gets exported in this way. There's more... Device identities are represented as JSON documents. It consists of properties like: deviceId: It represents the unique identification or the IoT device. ETag: A string representing a weak ETag for the device identity. symkey: A composite object containing a primary and a secondary key, stored in base64 format. status: If enabled, the device can connect. If disabled, this device cannot access any device-facing Endpoint. statusReason: A string that can be used to store the reason for the status changes. connectionState: It can be connected or disconnected. Device twins First we need to understand what device twin is and what is the purpose where we can use the device twin in any IoT solution. The device twin is a JSON formatted document that describes the metadata, properties of any device created within IoT Hub. It describes the individual device specific information. The device twin is made up of: tags, desired properties, and the reported properties. The operation that can be done by a IoT solution are basically update this the data, query for any IoT device. Tags hold the device metadata that can be accessed from IoT solution only. Desired properties are set from IoT solution and can be accessed on the device. Whereas the reported properties are set on the device and retrieved at IoT solution end. How to do it... Store device metadata: var patch = new { properties = new { desired = new { deviceConfig = new { configId = Guid.NewGuid().ToString(), DeviceOwner = "yatish", latitude = "17.5122560", longitude = "70.7760470" } }, reported = new { deviceConfig = new { configId = Guid.NewGuid().ToString(), DeviceOwner = "yatish", latitude = "17.5122560", longitude = "70.7760470" } } }, tags = new { location = new { region = "US", plant = "Redmond43" } } }; await registryManager.UpdateTwinAsync(deviceTwin.DeviceId, JsonConvert.SerializeObject(patch), deviceTwin.ETag); Query device metadata: var query = registryManager.CreateQuery("SELECT * FROM devices WHERE deviceId = '" + deviceTwin.DeviceId + "'"); Report current state of device: var results = await query.GetNextAsTwinAsync(); How it works... In this sample, we retrieved the current information of the device twin and updated the desired properties, which will be accessible on the device side. In the code, we will set the co-ordinates of the device with latitude and longitude values, also the device owner name and so on. This same value will be accessible on the device side. In the similar manner, we can set some properties on the device side which will be a part of the reported properties. While using the device twin we must always consider: Tags can be set, read, and accessed only by backend . Reported properties are set by device and can be read by backend. Desired properties are set by backend and can be read by backend. Use version and last updated properties to detect updates when necessary. 
Each device twin size is limited to 8 KB by default per device by IoT Hub There's more... Device twin metadata always maintains the last updated time stamp for any modifications. This is UTC time stamp maintained in the metadata. Device twin format is JSON format in which the tags, desired, and reported properties are stored, here is sample JSON with different nodes showing how it is stored: "tags": { "$etag": "1234321", "location": { "country": "India" "city": "Mumbai", "zipCode": "400001" } }, "properties": { "desired": { "latitude": 18.75, "longitude": -75.75, "status": 1, "$version": 4 }, "reported": { "latitude": 18.75, "longitude": -75.75, "status": 1, "$version": 4 } } Device direct methods Azure IoT Hub provides a fully managed bi-directional communication between the IoT solution on the backend and the IoT devices in the fields. When there is need for an immediate communication result, a direct method best suites the scenarios. Lets take example in home automation system, one needs to control the AC temperature or on/off the faucet showers. Invoke method from application: public async Task<CloudToDeviceMethodResult> InvokeDirectMethodOnDevice(string deviceId, ServiceClient serviceClient) { var methodInvocation = new CloudToDeviceMethod("WriteToMessage") { ResponseTimeout = TimeSpan.FromSeconds(300) }; methodInvocation.SetPayloadJson("'1234567890'"); var response = await serviceClient.InvokeDeviceMethodAsync(deviceId, methodInvocation); return response; } Method execution on device: deviceClient = DeviceClient.CreateFromConnectionString("", TransportType.Mqtt); deviceClient.SetMethodHandlerAsync("WriteToMessage", new DeviceSimulator().WriteToMessage, null).Wait(); deviceClient.SetMethodHandlerAsync("GetDeviceName", new DeviceSimulator().GetDeviceName, new DeviceData("DeviceClientMethodMqttSample")).Wait(); How it works... Direct method works on request-response interaction with the IoT device and backend solution. It works on timeout basis if no reply within that, it fails. These synchronous requests have by default 30 seconds of timeout, one can modify the timeout and increase up to 3600 depending on the IoT scenarios they have.  The device needs to connect using the MQTT protocol whereas the backend solution can be using HTTP. The JSON data size direct method can work up to 8 KB Device jobs In a typical scenario, device administrator or operators are required to manage the devices in bulk. We look at the device twin which maintains the properties and tags. Conceptually the job is nothing but a wrapper on the possible actions which can be done in bulk. Suppose we have a scenario in which we need to update the properties for multiple devices, in that case one can schedule the job and track the progress of that job. I would like to set the frequency to send the data at every 1 hour instead of every 30 min for 1000 IoT devices. Another example could be to reboot the multiple devices at the same time. Device administrators can perform device registration in bulk using the export and import methods. How to do it... Job to update twin properties. var twin = new Twin(); twin.Properties.Desired["HighTemperature"] = "44"; twin.Properties.Desired["City"] = "Mumbai"; twin.ETag = "*"; return await jobClient.ScheduleTwinUpdateAsync(jobId, "deviceId='"+ deviceId + "'", twin, DateTime.Now, 10); Job status. 
var twin = new Twin(); twin.Properties.Desired["HighTemperature"] = "44"; twin.Properties.Desired["City"] = "Mumbai"; twin.ETag = "*"; return await jobClient.ScheduleTwinUpdateAsync(jobId, "deviceId='"+ deviceId + "'", twin, DateTime.Now, 10); How it works... In this example, we looked at a job updating the device twin information and we can follow up the job for its status to find out if the job was completed or failed. In this case, instead of having single API calls, a job can be created to execute on multiple IoT devices. The job client object provides the jobs available with the IoT Hub using the connection to it. Once we locate the job using its unique ID we can retrieve the status for it. The code snippet mentioned in the How to do it... preceding recipe, uses the temperature properties and updates the data. The job is scheduled to start execution immediately with 10 seconds of execution timeout set. There's more... For a job, the life cycle begins with initiation from the IoT solution. If any job is in execution, we can query to it and see the status of execution. Another most common scenario where this could be useful is the firmware update, reboot, configuration updates, and so on, apart from the device property read or write. Each device job has properties that helps us working with them. The useful properties are start and end date time, status, and lastly device job statistics which gives the job execution statistics. Summary We have learned the device management using different techniques with Azure IoT Hub in detail. We have explained, how the IoT solution is going to help the users to manage the devices remotely, take actions from the cloud based application like disable, update data, run any command, and firmware update. We also performed different tasks for device management. Resources for Article: Further resources on this subject: Device Management in Zenoss Core Network and System Monitoring: Part 1 [article] Device Management in Zenoss Core Network and System Monitoring: Part 2 [article] Managing Network Devices [article]

Decision Trees

Packt
20 Feb 2018
17 min read
In this article by David Toth, the author of the book Data Science Algorithms in a Week, we will cover the following topics: Concepts Analysis Concepts A decision tree is the arrangement of the data in a tree structure where at each node data is separated to different branches according to the value of the attribute at the node. Analysis To construct a decision tree, we will use a standard ID3 learning algorithm that chooses an attribute that classifies the data samples in the best possible way to maximize the information gain – a measure based on information entropy. Information entropy Information entropy of the given data measures the least amount of the information necessary to represent a data item from the given data. The unit of the information entropy is a familiar unit – a bit and a byte, a kilobyte, and so on. The lower information entropy, the more regular, data is, the more pattern occurs in the data and thus less amount of the information is necessary to represent it. That is why compression tools on the computer can take large text files and compress them to a much smaller size, as words and word expressions keep reoccurring, forming a pattern. Coin flipping Imagine we flip and unbiased coin. We would like to know if the result is head or tail. How much information do we need to represent the result? Both words head and tail consists of 4 characters and if we represent one character with one byte (8 bits) as it is standard in ASCII table, then we would need 4 bytes or 32 bits to represent the result. But the information entropy is the least amount of the data necessary to represent the result. We know that there are only two possible results – head or tail. If we agree to represent head with 0 and tail with 1, then 1 bit would be sufficient to communicate the result efficiently. Here the data is the space of the possibilities of the result of the coin throw. It is the set {head,tail} which can represented as a set {0,1}. The actual result is a data item from this set. It turns out that the entropy of the set is 1. This is owing to that the probability of head and tail are both 50%. Now imagine that the coin is biased and throws head 25% of time and tails 75% of time. What would be the entropy of the probability space {0,1} this time? We could certainly represent the result with 1 bit of the information. But can we do better? 1 bit is of course indivisible, but maybe we could generalize the concept of the information to indiscrete amounts. In the previous example, we know nothing about the previous result of the coin flip unless we look at the coin. But in the example with the biased coin, we know that the result tail is more likely to happen. If we recorded n results of coin flips in a file representing heads with 0 and tails with 1, then about 75% of the bits there would have the value 1 and 25% of them would have the value 0. The size of such file would be n bits. But since it is more regular (the pattern of 1s prevails in it) a good compression tool should be able to compress it to less than n bits. To learn the theoretical bound to the compression and the amount of the information necessary to represent a data item we define information entropy precisely. Definition of Information Entropy Suppose that we are given a probability space S with the elements 1, 2, …, n. The probability an element i would be chosen from the probability space is pi. 
Then the information entropy of the probability space is defined as:

E(S) = -p1*log2(p1) - … - pn*log2(pn)

where log2 is the binary logarithm.

So the information entropy of the probability space of unbiased coin throws is:

E = -0.5*log2(0.5) - 0.5*log2(0.5) = 0.5 + 0.5 = 1

When the coin is biased, with a 25% chance of a head and a 75% chance of a tail, then the information entropy of such a space is:

E = -0.25*log2(0.25) - 0.75*log2(0.75) = 0.81127812445

which is less than 1. Thus, for example, if we had a large file with about 25% of 0 bits and 75% of 1 bits, a good compression tool should be able to compress it down to about 81.12% of its size.

Information gain
The information gain is the amount of information entropy gained as a result of a certain procedure. For example, if we would like to know the results of 3 fair coin flips, their information entropy is 3. But if we could look at the third coin, then the information entropy of the result for the remaining 2 coins would be 2. Thus, by looking at the third coin, we gained 1 bit of information, so the information gain was 1.

We may also gain information by dividing the whole set S into subsets, grouping its elements by a shared pattern. If we group the elements by their value of an attribute A, then we define the information gain as:

IG(S, A) = E(S) - Sum over v in values(A) of (|Sv|/|S|) * E(Sv)

where Sv is the set of elements of S that have the value v for the attribute A.

For example, let us calculate the information gain for the 6 rows in the swimming example, taking swimming suit as the attribute. Because we are interested in whether a given row of data is classified as no or yes for the question of whether one should swim, we will use swim preference to calculate the entropy and information gain. We partition the set S by the attribute swimming suit:

Snone = {(none,cold,no),(none,warm,no)}
Ssmall = {(small,cold,no),(small,warm,no)}
Sgood = {(good,cold,no),(good,warm,yes)}

The information entropy of S is:

E(S) = -(1/6)*log2(1/6) - (5/6)*log2(5/6) ~ 0.65002242164

The information entropy of the partitions is:

E(Snone) = -(2/2)*log2(2/2) = -log2(1) = 0, since all instances have the class no.
E(Ssmall) = 0, for a similar reason.
E(Sgood) = -(1/2)*log2(1/2) - (1/2)*log2(1/2) = 1

Therefore, the information gain is:

IG(S,swimming suit) = E(S) - [(2/6)*E(Snone) + (2/6)*E(Ssmall) + (2/6)*E(Sgood)]
= 0.65002242164 - (1/3) = 0.3166890883

If we chose the attribute water temperature to partition the set S instead, what would be the information gain IG(S,water temperature)? The water temperature partitions the set S into the following sets:

Scold = {(none,cold,no),(small,cold,no),(good,cold,no)}
Swarm = {(none,warm,no),(small,warm,no),(good,warm,yes)}

Their entropies are:

E(Scold) = 0, as all instances are classified as no.
E(Swarm) = -(2/3)*log2(2/3) - (1/3)*log2(1/3) ~ 0.91829583405

so the information gain is:

IG(S,water temperature) = E(S) - [(3/6)*E(Scold) + (3/6)*E(Swarm)]
= 0.65002242164 - 0.5*0.91829583405 ~ 0.19087450461

which is less than IG(S,swimming suit). Therefore, we can gain more information about the set S (the classification of its instances) by partitioning it by the attribute swimming suit than by the attribute water temperature. This finding will be the basis of the ID3 algorithm constructing a decision tree in the next paragraph.

ID3 algorithm
The ID3 algorithm constructs a decision tree from the data based on the information gain. In the beginning, we start with the set S. The data items in the set S have various attributes according to which we can partition it. If an attribute A has the values {v1, …, vn}, then we partition the set S into the sets Sv1, …, Svn, where the set Svi is the subset of S whose elements have the value vi for the attribute A.
If each element in the set S has attributes A1, …, Am, then we can partition the set S according to any of these attributes. The ID3 algorithm partitions the set S according to the attribute that yields the highest information gain. Now suppose that this is the attribute A1. Then for the set S we have the partitions Sv1, …, Svn, where A1 has the possible values {v1, …, vn}. Since we have not constructed any tree yet, we first place a root node into the tree. For every partition of S we place a new branch from the root. Every branch represents one value of the selected attribute, and holds the data samples with that value for the attribute. For every new branch we can define a new node that will have the data samples from its ancestor branch. Once we have defined a new node, we choose the remaining attribute with the highest information gain for the data at that node, partition the data at that node further, and then define new branches and nodes. This process can be repeated until we run out of attributes for the nodes, or earlier, when all the data at a node have the same class of interest. In the case of the swimming example, there are only two possible classes for swimming preference: class no and class yes. The last node is called a leaf node and decides the class of a data item.

Tree construction by ID3 algorithm
Here we describe, step by step, how the ID3 algorithm would construct a decision tree from the given data samples in the swimming example. The initial set consists of 6 data samples:

S={(none,cold,no),(small,cold,no),(good,cold,no),(none,warm,no),(small,warm,no),(good,warm,yes)}

In the previous sections we calculated the information gains for the only two non-classifying attributes, swimming suit and water temperature:

IG(S,swimming suit)=0.3166890883
IG(S,water temperature)=0.19087450461

Hence, we would choose the attribute swimming suit, as it has the higher information gain. There is no tree drawn yet, so we start from the root node. As the attribute swimming suit has 3 possible values {none, small, good}, we draw 3 branches out of the root, one for each value. Each branch will have one partition from the partitioned set S: Snone, Ssmall, Sgood. We add nodes to the ends of the branches. The Snone data samples all have the same class, swimming preference = no, so we do not need to branch that node by a further attribute and partition its set. Thus, the node with the data Snone is already a leaf node. The same is true for the node with the data Ssmall. But the node with the data Sgood has two possible classes for swimming preference. Therefore, we will branch the node further. There is only one non-classifying attribute left – water temperature – so there is no need to calculate the information gain for that attribute with the data Sgood. From the node Sgood we will have 2 branches, each with one partition from the set Sgood. One branch will have the set of data samples Sgood,cold={(good,cold,no)}; the other branch will have the partition Sgood,warm={(good,warm,yes)}. Each of these 2 branches will end with a node, and each will be a leaf node, because each has data samples with the same value for the classifying attribute swimming preference. The resulting decision tree has 4 leaf nodes and is the tree shown in the figure of the decision tree for the swimming preference example.
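The entropy and information-gain figures used in this walkthrough are easy to verify programmatically. The following is a minimal sketch (not part of the original listing) that recomputes E(S), IG(S, swimming suit), and IG(S, water temperature) for the six swimming samples; the helper names entropy and information_gain are our own.

import math
from collections import Counter

def entropy(rows, class_idx):
    # E(S) = -sum(p * log2(p)) over the class values present in rows
    counts = Counter(row[class_idx] for row in rows)
    total = len(rows)
    return -sum((c / total) * math.log(c / total, 2) for c in counts.values())

def information_gain(rows, attr_idx, class_idx):
    # IG(S, A) = E(S) - sum(|Sv|/|S| * E(Sv)) over the values v of attribute A
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attr_idx], []).append(row)
    remainder = sum((len(part) / len(rows)) * entropy(part, class_idx)
                    for part in partitions.values())
    return entropy(rows, class_idx) - remainder

S = [('none', 'cold', 'no'), ('small', 'cold', 'no'), ('good', 'cold', 'no'),
     ('none', 'warm', 'no'), ('small', 'warm', 'no'), ('good', 'warm', 'yes')]

print(entropy(S, 2))              # ~0.65002242164
print(information_gain(S, 0, 2))  # swimming suit     ~0.3166890883
print(information_gain(S, 1, 2))  # water temperature ~0.19087450461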
Deciding with a decision tree
Once we have constructed a decision tree from the data with the attributes A1, …, Am and the classes {c1, …, ck}, we can use this decision tree to classify a new data item with the attributes A1, …, Am into one of the classes {c1, …, ck}. Given a new data item that we would like to classify, we can think of each node, including the root, as a question about the data sample: what value does that data sample have for the selected attribute Ai? Based on the answer, we select a branch of the decision tree and move on to the next node. Then another question is answered about the data sample, and another, until the data sample reaches a leaf node. A leaf node has one of the classes {c1, …, ck} associated with it, for example, ci. The decision tree algorithm would then classify the data sample into the class ci.

Deciding a data sample with the swimming preference decision tree
Let us use the decision tree constructed with the ID3 algorithm for the swimming preference example. Consider a data sample (good,cold,?) that we would like to classify with this tree. Start with the data sample at the root of the tree. The first attribute that branches from the root is swimming suit, so we ask for the value of the attribute swimming suit of the sample (good,cold,?). We learn that swimming suit=good, so we move down the rightmost branch with that value for its data samples. We arrive at the node with the attribute water temperature and ask: what is the value of the attribute water temperature for the data sample (good,cold,?)? We learn that for this data sample we have water temperature=cold, so we move down the left branch into the leaf node. This leaf is associated with the class swimming preference=no. Therefore, the decision tree would classify the data sample (good,cold,?) into that class, that is, complete it to the data sample (good,cold,no). So the decision tree says that if one has a good swimming suit but the water temperature is cold, then one would still not want to swim, based on the data collected in the table.

Implementation
decision_tree.py

import math
import imp
import sys
# The anytree module is used to visualize the decision tree constructed by this ID3 algorithm.
from anytree import Node, RenderTree
import common

# Node for the construction of a decision tree.
class TreeNode:
    def __init__(self, var=None, val=None):
        self.children = []
        self.var = var
        self.val = val

    def add_child(self, child):
        self.children.append(child)

    def get_children(self):
        return self.children

    def get_var(self):
        return self.var

    def is_root(self):
        return self.var is None and self.val is None

    def is_leaf(self):
        return len(self.children) == 0

    def name(self):
        if self.is_root():
            return "[root]"
        return "[" + self.var + "=" + self.val + "]"

# Constructs a decision tree where heading is the heading of the table with the data, i.e. the names of the attributes.
# complete_data are data samples with a known value for every attribute.
# enquired_column is the index of the column (starting from zero) which holds the classifying attribute.
defconstuct_decision_tree(heading,complete_data,enquired_column): available_columns=[] for col in range(0,len(heading)): if col!=enquired_column: available_columns.append(col) tree=TreeNode() add_children_to_node(tree,heading,complete_data,available_columns,enquired_ column) return tree #Splits the data samples into the groups with each having a different value for the attribute at the column col. defsplit_data_by_col(data,col): data_groups={} for data_item in data: if data_groups.get(data_item[col])==None: data_groups[data_item[col]]=[] data_groups[data_item[col]].append(data_item) return data_groups #Adds a leaf node to node. defadd_leaf(node,heading,complete_data,enquired_column): node.add_child(TreeNode(heading[enquired_column],complete_data[0][enquired_ column])) #Adds all the descendants to the node. def add_children_to_node(node,heading,complete_data,available_columns,enquired_ column): if len(available_columns)==0: add_leaf(node,heading,complete_data,enquired_column) return -1 selected_col=select_col(complete_data,available_columns,enquired_column) for i inrange(0,len(available_columns)): if available_columns[i]==selected_col: available_columns.pop(i) break data_groups=split_data_by_col(complete_data,selected_col) if(len(data_groups.items())==1): add_leaf(node,heading,complete_data,enquired_column) return -1 for child_group, child_data in data_groups.items(): child=TreeNode(heading[selected_col],child_group) add_children_to_node(child,heading,child_data,list(available_columns),enquired_column) node.add_child(child) #Selects an available column/attribute with the highest information gain. defselect_col(complete_data,available_columns,enquired_column): selected_col=-1 selected_col_information_gain=-1 for col in available_columns: current_information_gain=col_information_gain(complete_data,col,enquired_column) if current_information_gain>selected_col_information_gain: selected_col=col selected_col_information_gain=current_information_gainreturn selected_col #Calculates the information gain when partitioning complete_dataaccording to the attribute at the column col and classifying by the attribute at enquired_column. defcol_information_gain(complete_data,col,enquired_column): data_groups=split_data_by_col(complete_data,col) information_gain=entropy(complete_data,enquired_column) for _,data_group in data_groups.items(): information_gain- =(float(len(data_group))/len(complete_data))*entropy(data_group,enquired_column) return information_gain #Calculates the entropy of the data classified by the attribute at the enquired_column. def entropy(data,enquired_column): value_counts={} for data_item in data: if value_counts.get(data_item[enquired_column])==None: value_counts[data_item[enquired_column]]=0 value_counts[data_item[enquired_column]]+=1 entropy=0 for _,count in value_counts.items(): probability=float(count)/len(data) entropy-=probability*math.log(probability,2) return entropy #A visual output of a tree using the text characters. defdisplay_tree(tree): anytree=convert_tree_to_anytree(tree) for pre, fill, node in RenderTree(anytree): pre=pre.encode(encoding=‘UTF-8’,errors=‘strict’) print(“%s%s” % (pre, node.name)) #A simple textual output of a tree without the visualization. defdisplay_tree_simple(tree): print(‘***Tree structure***’) display_node(tree) sys.stdout.flush() #A simple textual output of a node in a tree. 
defdisplay_node(node): if node.is_leaf(): print(‘The node ‘+node.name()+’ is a leaf node.’) return sys.stdout.write(‘The node ‘+node.name()+’ has children: ‘) for child in node.get_children(): sys.stdout.write(child.name()+’‘) print(‘‘) for child in node.get_children(): display_node(child) #Convert a decision tree into the anytree module tree format to make it ready for rendering. defconvert_tree_to_anytree(tree): anytree=Node(“Root”) attach_children(tree,anytree) return anytree#Attach the children from the decision tree into the anytree tree format. defattach_children(parent_node, parent_anytree_node): for child_node in parent_node.get_children(): child_anytree_node=Node(child_node.name(),parent=parent_anytree_node) attach_children(child_node,child_anytree_node) ###PROGRAM START### if len(sys.argv)<2: sys.exit(‘Please, input as an argument the name of the CSV file.’) csv_file_name=sys.argv[1] (heading,complete_data,incomplete_data,enquired_column)=common.csv_file_to_ ordered_data(csv_file_name) tree=constuct_decision_tree(heading,complete_data,enquired_column) display_tree(tree) common.py #Reads the csv file into the table and then separates the table into heading, complete data, incomplete data and then produces also the index number for the column that is not complete, i.e. contains a question mark. defcsv_file_to_ordered_data(csv_file_name): with open(csv_file_name, ‘rb’) as f: reader = csv.reader(f) data = list(reader) return order_csv_data(data) deforder_csv_data(csv_data): #The first row in the CSV file is the heading of the data table. heading=csv_data.pop(0) complete_data=[] incomplete_data=[] #Let enquired_column be the column of the variable which conditional probability should be calculated. Here set that column to be the last one. enquired_column=len(heading)-1 #Divide the data into the complete and the incomplete data. An incomplete row is the one that has a question mark in the enquired_column. The question mark will be replaced by the calculated Baysian probabilities from the complete data. for data_item in csv_data: if is_complete(data_item,enquired_column): complete_data.append(data_item) else: incomplete_data.append(data_item) return (heading,complete_data,incomplete_data,enquired_column) Program input swim.csv swimming_suit,water_temperature,swimNone,Cold,No None,Warm,NoSmall,Cold,NoSmall,Warm,NoGood,Cold,NoGood,Warm,Yes Program output $ python decision_tree.py swim.csv Root ├── [swimming_suit=Small] │├──[water_temperature=Cold] ││└──[swim=No] │└──[water_temperature=Warm] │└──[swim=No] ├── [swimming_suit=None] │├──[water_temperature=Cold] ││└──[swim=No] │└──[water_temperature=Warm] │└──[swim=No] └── [swimming_suit=Good] ├── [water_temperature=Cold] │└──[swim=No] └── [water_temperature=Warm] └── [swim=Yes] Summary In this article we have learned the concept of decision tree, analysis using ID3 algorithm, and implementation. Resources for Article: Further resources on this subject: Working with Data – Exploratory Data Analysis [article] Introduction to Data Analysis and Libraries [article] Data Analysis Using R [article]
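The listing above builds and renders the tree but does not include a function that applies the "deciding with a decision tree" procedure described earlier. The following is a hedged sketch of what such a helper could look like for a tree returned by constuct_decision_tree from the listing; the function name classify, its arguments, and the direct access to a node's val attribute are our own assumptions, not part of the original code.

# Sketch (not from the original listing): classify one data sample with the constructed tree.
def classify(tree, heading, data_sample, enquired_column):
    node = tree
    while not node.is_leaf():
        children = node.get_children()
        # Every child of a given node tests the same attribute (or carries the class value).
        attribute = children[0].get_var()
        if attribute == heading[enquired_column]:
            # The single child is a leaf holding the class value for this branch.
            node = children[0]
            continue
        col = heading.index(attribute)
        # Follow the branch whose value matches the sample's value for this attribute.
        matching = [c for c in children if c.val == data_sample[col]]
        if not matching:
            return None  # the sample has an attribute value not seen during tree construction
        node = matching[0]
    return node.val

For the swim.csv data above, classify(tree, heading, ['Good', 'Cold', '?'], enquired_column) would be expected to return 'No', mirroring the manual walkthrough of the sample (good,cold,?).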


Introduction to Performance Testing and JMeter

Packt
20 Feb 2018
11 min read
In this article by Bayo Erinle, the author of the book Performance Testing with JMeter 3, we will explore some of the options that make JMeter a great tool of choice for performance testing. (For more resources related to this topic, see here.)

Performance testing and tuning
There is a strong relationship between performance testing and tuning, in the sense that one often leads to the other. Often, end-to-end testing unveils system or application bottlenecks that are regarded as unacceptable against project target goals. Once those bottlenecks are discovered, the next step for most teams is a series of tuning efforts to make the application perform adequately. Such efforts normally include, but are not limited to, the following:

Configuring changes in system resources
Optimizing database queries
Reducing round trips in application calls, sometimes leading to redesigning and re-architecting problematic modules
Scaling out application and database server capacity
Reducing application resource footprint
Optimizing and refactoring code, including eliminating redundancy and reducing execution time

Tuning efforts may also commence if the application has reached acceptable performance but the team wants to reduce the amount of system resources being used, decrease the volume of hardware needed, or further increase system performance. After each change (or series of changes), the test is re-executed to see whether the performance has improved or declined as a result. The process continues until the performance results reach acceptable goals. The outcome of these test-tuning cycles normally produces a baseline.

Baselines
Baselining is the process of capturing performance metric data for the sole purpose of evaluating the efficacy of successive changes to the system or application. It is important that all characteristics and configurations, except those specifically being varied for comparison, remain the same in order to make effective comparisons as to which change (or series of changes) is driving results toward the targeted goal. Armed with such baseline results, subsequent changes can be made to the system configuration or application, and testing results can be compared to see whether such changes were relevant or not. Some considerations when generating baselines include the following:

They are application-specific
They can be created for the system, the application, or modules
They are metrics/results
They should not be over-generalized
They evolve and may need to be redefined from time to time
They act as a shared frame of reference
They are reusable
They help identify changes in performance

Load and stress testing
Load testing is the process of putting demand on a system and measuring its response, that is, determining how much volume the system can handle. Stress testing is the process of subjecting the system to unusually high loads, far beyond its normal usage pattern, to determine its responsiveness. These are different from performance testing, whose sole purpose is to determine the response and effectiveness of a system, that is, how fast the system is. Since load ultimately affects how a system responds, performance testing is always done in conjunction with stress testing.

JMeter to the rescue
One of the areas performance testing covers is testing tools. Which testing tool do you use to put the system and application under load? There are numerous testing tools available to perform this operation, from free to commercial solutions.
However, our focus will be on Apache JMeter, a free, open source, cross-platform desktop application from the Apache Software Foundation. JMeter has been around since 1998 according to the historical change logs on its official site, making it a mature, robust, and reliable testing tool. Cost may also have played a role in its wide adoption. Small companies usually may not want to foot the bill for commercial testing tools, which often place restrictions, for example, on how many concurrent users one can spin off. My first encounter with JMeter was exactly a result of this. I worked in a small shop that had paid for a commercial testing tool, but during the course of testing, we outgrew the licensing limits on how many concurrent users we needed to simulate for realistic test plans. Since JMeter was free, we explored it and were quite delighted with the offerings and the sheer number of features we got for free. Here are some of its features:

Performance tests of different server types, including web (HTTP and HTTPS), SOAP, database, LDAP, JMS, mail, and native commands or shell scripts
Complete portability across various operating systems
Full multithreading framework allowing concurrent sampling by many threads and simultaneous sampling of different functions by separate thread groups
Full-featured Test IDE that allows fast Test Plan recording, building, and debugging
Dashboard Report for detailed analysis of application performance indexes and key transactions
In-built integration with real-time reporting and analysis tools, such as Graphite, InfluxDB, and Grafana, to name a few
Complete dynamic HTML reports
Graphical User Interface (GUI)
HTTP proxy recording server
Caching and offline analysis/replaying of test results
High extensibility
Live view of results as testing is being conducted

JMeter allows multiple concurrent users to be simulated on the application, allowing you to work toward most of the target goals outlined earlier, such as attaining a baseline and identifying bottlenecks. It will help answer questions such as the following:

Will the application still be responsive if 50 users are accessing it concurrently?
How reliable will it be under a load of 200 users?
How much of the system resources will be consumed under a load of 250 users?
What will the throughput look like with 1000 users active in the system?
What will be the response time for the various components in the application under load?

JMeter, however, should not be confused with a browser. It doesn't perform all the operations supported by browsers; in particular, JMeter does not execute the JavaScript found in HTML pages, nor does it render HTML pages the way a browser does. However, it does give you the ability to view request responses as HTML through many of its listeners, although the timings are not included in any samples. Furthermore, there are limitations to how many users can be spun up on a single machine. These vary depending on the machine specifications (for example, memory, processor speed, and so on) and the test scenarios being executed. In our experience, we have mostly been able to successfully spin up 250-450 users on a single machine with a 2.2 GHz processor and 8 GB of RAM.

Up and running with JMeter
Now, let's get up and running with JMeter, beginning with its installation.

Installation
JMeter comes as a bundled archive, so it is super easy to get started with it. Those working in corporate environments behind a firewall, or on machines with non-admin privileges, will appreciate this even more.
To get started, grab the latest binary release by pointing your browser to http://jmeter.apache.org/download_jmeter.cgi. At the time of writing this, the current release version is 3.1. The download site offers the bundle as both a .zip file and a .tgz file. We go with the .zip file option, but feel free to download the .tgz file if that's your preferred way of grabbing archives. Once downloaded, extract the archive to a location of your choice. The location you extracted the archive to will be referred to as JMETER_HOME. Provided you have a JDK/JRE correctly installed and a JAVA_HOME environment variable set, you are all set and ready to run! The following screenshot shows a trimmed down directory structure of a vanilla JMeter install: JMETER_HOME folder structure The following are some of the folders in Apache-JMeter-3.2, as shown in the preceding screenshot: bin: This folder contains executable scripts to run and perform other operations in JMeter docs: This folder contains a well-documented user guide extras: This folder contains miscellaneous items, including samples illustrating the usage of the Apache Ant build tool (http://ant.apache.org/) with JMeter, and bean shell scripting lib: This folder contains utility JAR files needed by JMeter (you may add additional JARs here to use from within JMeter; we will cover this in detail later) printable_docs: This is the printable documentation Installing Java JDK Follow these steps to install Java JDK: Go to http://www.oracle.com/technetwork/java/javase/downloads/index.html. Download Java JDK (not JRE) compatible with the system that you will use to test. At the time of writing, JDK 1.8 (update 131) was the latest. Double-click on the executable and follow the onscreen instructions. On Windows systems, the default location for the JDK is under Program Files. While there is nothing wrong with this, the issue is that the folder name contains a space, which can sometimes be problematic when attempting to set PATH and run programs, such as JMeter, depending on the JDK from the command line. With this in mind, it is advisable to change the default location to something like C:toolsjdk. Setting up JAVA_HOME Here are the steps to set up the JAVA_HOME environment variable on Windows and Unix operating systems. On Windows For illustrative purposes, assume that you have installed Java JDK at C:toolsjdk: Go to Control Panel. Click on System. Click on Advance System settings. Add Environment to the following variables:     Value: JAVA_HOME     Path: C:toolsjdk Locate Path (under system variables, bottom half of the screen). Click on Edit. Append %JAVA_HOME%/bin to the end of the existing path value (if any). On Unix For illustrative purposes, assume that you have installed Java JDK at /opt/tools/jdk: Open up a Terminal window. Export JAVA_HOME=/opt/tools/jdk. Export PATH=$PATH:$JAVA_HOME. It is advisable to set this in your shell profile settings, such as .bash_profile (for bash users) or .zshrc (for zsh users), so that you won't have to set it for each new Terminal window you open. Running JMeter Once installed, the bin folder under the JMETER_HOME folder contains all the executable scripts that can be run. Based on the operating system that you installed JMeter on, you either execute the shell scripts (.sh file) for operating systems that are Unix/Linux flavored, or their batch (.bat file) counterparts on operating systems that are Windows flavored. JMeter files are saved as XML files with a .jmx extension. 
We refer to them as test scripts or JMX files. These scripts include the following: jmeter.sh: This script launches JMeter GUI (the default) jmeter-n.sh: This script launches JMeter in non-GUI mode (takes a JMX file as input) jmeter-n-r.sh: This script launches JMeter in non-GUI mode remotely jmeter-t.sh: This opens a JMX file in the GUI jmeter-server.sh: This script starts JMeter in server mode (this will be kicked off on the master node when testing with multiple machines remotely) mirror-server.sh: This script runs the mirror server for JMeter shutdown.sh: This script gracefully shuts down a running non-GUI instance stoptest.sh: This script abruptly shuts down a running non-GUI instance   To start JMeter, open a Terminal shell, change to the JMETER_HOME/bin folder, and run the following command on Unix/Linux: ./jmeter.sh Alternatively, run the following command on Windows: jmeter.bat Take a moment to explore the GUI. Hover over each icon to see a short description of what it does. The Apache JMeter team has done an excellent job with the GUI. Most icons are very similar to what you are used to, which helps ease the learning curve for new adapters. Some of the icons, for example, stop and shutdown, are disabled for now till a scenario/test is being conducted. The JVM_ARGS environment variable can be used to override JVM settings in the jmeter.bat or jmeter.sh script. Consider the following example: export JVM_ARGS="-Xms1024m -Xmx1024m -Dpropname=propvalue". Command-line options To see all the options available to start JMeter, run the JMeter executable with the -? command. The options provided are as follows: . ./jmeter.sh -? -? print command line options and exit -h, --help print usage information and exit -v, --version print the version information and exit -p, --propfile <argument> the jmeter property file to use -q, --addprop <argument> additional JMeter property file(s) -t, --testfile <argument> the jmeter test(.jmx) file to run -l, --logfile <argument> the file to log samples to -j, --jmeterlogfile <argument> jmeter run log file (jmeter.log) -n, --nongui run JMeter in nongui mode ... -J, --jmeterproperty <argument>=<value> Define additional JMeter properties -G, --globalproperty <argument>=<value> Define Global properties (sent to servers) e.g. -Gport=123 or -Gglobal.properties -D, --systemproperty <argument>=<value> Define additional system properties -S, --systemPropertyFile <argument> additional system property file(s) This is a snippet (non-exhaustive list) of what you might see if you did the same. Summary In this article we have learnt relationship between performance testing and tuning, and how to install and run JMeter.   Resources for Article: Further resources on this subject: Functional Testing with JMeter [article] Creating an Apache JMeter™ test workbench [article] Getting Started with Apache Spark DataFrames [article]
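As a quick, hedged illustration of the command-line options listed in the article above (the test plan and output file names here are hypothetical), a typical non-GUI run on Unix/Linux could look like ./jmeter.sh -n -t my_test_plan.jmx -l results.jtl -j run.log, which executes my_test_plan.jmx without the GUI, writes the sampled results to results.jtl, and sends JMeter's own run log to run.log; on Windows, the same flags are passed to jmeter.bat.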

K Nearest Neighbors

Packt
20 Feb 2018
10 min read
In this article by Gavin Hackeling, author of the book Mastering Machine Learning with scikit-learn - Second Edition, we will start with K Nearest Neighbors (KNN), which is a simple model for regression and classification tasks. It is so simple that its name describes most of its learning algorithm. The titular neighbors are representations of training instances in a metric space. A metric space is a feature space in which the distances between all members of a set are defined. (For more resources related to this topic, see here.)

For classification tasks, a set of tuples of feature vectors and class labels comprises the training set. KNN is capable of binary, multi-class, and multi-label classification; we will focus on binary classification in this article. The simplest KNN classifiers use the mode of the k nearest neighbors' labels to classify test instances, but other strategies can be used. k is often set to an odd number to prevent ties. In regression tasks, the feature vectors are each associated with a response variable that takes a real-valued scalar instead of a label. The prediction is the mean or weighted mean of the k nearest neighbors' response variables.

Lazy learning and non-parametric models
KNN is a lazy learner. Also known as instance-based learners, lazy learners simply store the training data set with little or no processing. In contrast to eager learners, such as simple linear regression, KNN does not estimate the parameters of a model that generalizes the training data during a training phase. Lazy learning has advantages and disadvantages. Training an eager learner is often computationally costly, but prediction with the resulting model is often inexpensive. For simple linear regression, prediction consists only of multiplying the learned coefficient by the feature, and adding the learned intercept parameter. A lazy learner can begin making predictions almost immediately, as there is no training phase, but making each prediction can be costly. In the simplest implementation of KNN, prediction requires calculating the distances between a test instance and all of the training instances.

In contrast to most of the other models we will discuss, KNN is a non-parametric model. A parametric model uses a fixed number of parameters, or coefficients, to define the model that summarizes the data; the number of parameters is independent of the number of training instances. Non-parametric may seem to be a misnomer, as it does not mean that the model has no parameters; rather, non-parametric means that the number of parameters of the model is not fixed, and may grow with the number of training instances. Non-parametric models can be useful when training data is abundant and you have little prior knowledge about the relationship between the response and explanatory variables. KNN makes only one assumption: instances that are near each other are likely to have similar values of the response variable. The flexibility provided by non-parametric models is not always desirable; a model that makes assumptions about the relationship can be useful if training data is scarce or if you already know about the relationship.

Classification with KNN
The goal of classification tasks is to use one or more features to predict the value of a discrete response variable. Let's work through a toy classification problem. Assume that you must use a person's height and weight to predict his or her sex. This problem is called binary classification because the response variable can take one of two labels. The following table records nine training instances.
height   weight  label
158 cm   64 kg   male
170 cm   86 kg   male
183 cm   84 kg   male
191 cm   80 kg   male
155 cm   49 kg   female
163 cm   59 kg   female
180 cm   67 kg   female
158 cm   54 kg   female
170 cm   67 kg   female

We are now using features from two explanatory variables to predict the value of the response variable. KNN is not limited to two features; the algorithm can use an arbitrary number of features, but more than three features cannot be visualized. Let's visualize the data by creating a scatter plot with matplotlib.

# In[1]:
import numpy as np
import matplotlib.pyplot as plt

X_train = np.array([
    [158, 64],
    [170, 86],
    [183, 84],
    [191, 80],
    [155, 49],
    [163, 59],
    [180, 67],
    [158, 54],
    [170, 67]
])
y_train = ['male', 'male', 'male', 'male', 'female', 'female', 'female', 'female', 'female']

plt.figure()
plt.title('Human Heights and Weights by Sex')
plt.xlabel('Height in cm')
plt.ylabel('Weight in kg')
for i, x in enumerate(X_train):
    # Use 'x' markers for instances that are male and diamond markers for instances that are female
    plt.scatter(x[0], x[1], c='k', marker='x' if y_train[i] == 'male' else 'D')
plt.grid(True)
plt.show()

From the plot we can see that men, denoted by the x markers, tend to be taller and weigh more than women. This observation is probably consistent with your experience. Now let's use KNN to predict whether a person with a given height and weight is a man or a woman. Let's assume that we want to predict the sex of a person who is 155 cm tall and who weighs 70 kg. First, we must define our distance measure. In this case we will use Euclidean distance, the straight-line distance between points in a Euclidean space. The Euclidean distance between two points (x1, y1) and (x2, y2) in a two-dimensional space is given by the following:

d = sqrt((x1 - x2)^2 + (y1 - y2)^2)

Next, we must calculate the distances between the query instance and all of the training instances:

height   weight  label    distance from test instance
158 cm   64 kg   male     6.71
170 cm   86 kg   male     21.93
183 cm   84 kg   male     31.30
191 cm   80 kg   male     37.36
155 cm   49 kg   female   21.00
163 cm   59 kg   female   13.60
180 cm   67 kg   female   25.18
158 cm   54 kg   female   16.28
170 cm   67 kg   female   15.30

We will set k to 3, and select the three nearest training instances. The following script calculates the distances between the test instance and the training instances, and identifies the most common sex of the nearest neighbors.

# In[2]:
x = np.array([[155, 70]])
distances = np.sqrt(np.sum((X_train - x)**2, axis=1))
distances

# Out[2]:
array([  6.70820393,  21.9317122 ,  31.30495168,  37.36308338,  21.        ,
        13.60147051,  25.17935662,  16.2788206 ,  15.29705854])

# In[3]:
nearest_neighbor_indices = distances.argsort()[:3]
nearest_neighbor_genders = np.take(y_train, nearest_neighbor_indices)
nearest_neighbor_genders

# Out[3]:
array(['male', 'female', 'female'], dtype='|S6')

# In[4]:
from collections import Counter
b = Counter(np.take(y_train, distances.argsort()[:3]))
b.most_common(1)[0][0]

# Out[4]:
'female'

The following plots the query instance, indicated by the circle, and its three nearest neighbors, indicated by the enlarged markers. Two of the neighbors are female, and one is male. We therefore predict that the test instance is female. Now let's implement a KNN classifier using scikit-learn.
# In[5]: from sklearn.preprocessing import LabelBinarizer from sklearn.neighbors import KNeighborsClassifier lb = LabelBinarizer() y_train_binarized = lb.fit_transform(y_train) y_train_binarized # Out[5]: array([[1], [1], [1], [1], [0], [0], [0], [0], [0]]) # In[6]: K = 3 clf = KNeighborsClassifier(n_neighbors=K) clf.fit(X_train, y_train_binarized.reshape(-1)) prediction_binarized = clf.predict(np.array([155, 70]).reshape(1, -1))[0] predicted_label = lb.inverse_transform(prediction_binarized) predicted_label # Out[6]: array(['female'], dtype='|S6') Our labels are strings; we first use LabelBinarizer to convert them to integers. LabelBinarizer implements the transformer interface, which consists of the methods fit, transform, and fit_transform. fit prepares the transformer; in this case, it creates a mapping from label strings to integers. transform applies the mapping to input labels. fit_transform is a convenience method that calls fit and transform. A transformer should be fit only on the training set. Independently fitting and transforming the training and testing sets could result in inconsistent mappings from labels to integers; in this case, male might be mapped to 1 in the training set and 0 in the testing set. Fitting on the entire dataset should also be avoided, as for some transformers it will leak information about the testing set in to the model. This advantage won't be available in production, so performance measures on the test set may be optimistic. We wil discuss this pitfall more when we extract features from text. Next, we initialize a KNeighborsClassifier. Even through KNN is a lazy learner, it still implements the estimator interface. We call fit and predict just as we did with our simple linear regression object. Finally, we can use our fit LabelBinarizer to reverse the transformation and return a string label. Now let’s use our classifier to make predictions for a test set, and evaluate the performance of our classifier. height weight label 168 cm 65 kg male 170 cm 61 kg male 160 cm 52 kg female 169 cm 67 kg female # In[7]: X_test = np.array([ [168, 65], [180, 96], [160, 52], [169, 67] ]) y_test = ['male', 'male', 'female', 'female'] y_test_binarized = lb.transform(y_test) print('Binarized labels: %s' % y_test_binarized.T[0]) predictions_binarized = clf.predict(X_test) print('Binarized predictions: %s' % predictions_binarized) print('Predicted labels: %s' % lb.inverse_transform(predictions_binarized)) # Out[7]: Binarized labels: [1 1 0 0] Binarized predictions: [0 1 0 0] Predicted labels: ['female' 'male' 'female' 'female'] By comparing our test labels to our classifier's predictions, we find that it incorrectly predicted that one of the male test instances was female. There are two types of errors in binary classification tasks: false positives and false negatives. There are many performance measures for classifiers; some measures may be more appropriate than others depending on the consequences of the types of errors in your application. We will assess our classifier using several common performance measures, including accuracy, precision, and recall. Accuracy is the proportion of test instances that were classified correctly. Our model classified one of the four instances incorrectly, so the accuracy is 75%. 
# In[8]: from sklearn.metrics import accuracy_score print('Accuracy: %s' % accuracy_score(y_test_binarized, predictions_binarized)) # Out[8]: Accuracy: 0.75 Precision is the proportion of test instances that were predicted to be positive that are truly positive. In this example the positive class is male. The assignment of male and “female” to the positive and negative classes is arbitrary, and could be reversed. Our classifier predicted that one of the test instances is the positive class. This instance is truly the positive class, so the classifier’s precision is 100%. # In[9]: from sklearn.metrics import precision_score print('Precision: %s' % precision_score(y_test_binarized, predictions_binarized)) # Out[9]: Precision: 1.0 Recall is the proportion of truly positive test instances that were predicted to be positive. Our classifier predicted that one of the two truly positive test instances is positive. Its recall is therefore 50%. # In[10]: from sklearn.metrics import recall_score print('Recall: %s' % recall_score(y_test_binarized, predictions_binarized)) # Out[10]: Recall: 0.5 Sometimes it is useful to summarize precision and recall with a single statistic, called the F1-score or F1-measure. The F1-score is the harmonic mean of precision and recall. # In[11]: from sklearn.metrics import f1_score print('F1 score: %s' % f1_score(y_test_binarized, predictions_binarized)) # Out[11]: F1 score: 0.666666666667 Note that the arithmetic mean of the precision and recall scores is the upper bound of the F1 score. The F1 score penalizes classifiers more as the difference between their precision and recall scores increases. Finally, the Matthews correlation coefficient is an alternative to the F1 score for measuring the performance of binary classifiers. A perfect classifier’s MCC is 1. A trivial classifier that predicts randomly will score 0, and a perfectly wrong classifier will score -1. MCC is useful even when the proportions of the classes in the test set is severely imbalanced. # In[12]: from sklearn.metrics import matthews_corrcoef print('Matthews correlation coefficient: %s' % matthews_corrcoef(y_test_binarized, predictions_binarized)) # Out[12]: Matthews correlation coefficient: 0.57735026919 scikit-learn also provides a convenience function, classification_report, that reports the precision, recall and F1 score. # In[13]: from sklearn.metrics import classification_report print(classification_report(y_test_binarized, predictions_binarized, target_names=['male'], labels=[1])) # Out[13]: precision recall f1-score support male 1.00 0.50 0.67 2 avg / total 1.00 0.50 0.67 2 Summary In this article we learned about K Nearest Neighbors in which we saw that KNN is lazy learner as well as non-parametric model. We also saw about the classification of KNN. Resources for Article: Further resources on this subject: Introduction to Scikit-Learn [article] Machine Learning in IPython with scikit-learn [article] Machine Learning Models [article]
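The opening of this article notes that KNN also handles regression by averaging the neighbors' response variables. As a brief, hedged illustration of that point (the toy data and variable names below are our own, not from the book), scikit-learn's KNeighborsRegressor could be used to predict weight from height in the same spirit:

import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Toy regression task: predict weight (kg) from height (cm) as the mean of the 3 nearest neighbors.
X_train = np.array([[158], [163], [170], [178], [183], [191]])
y_train = np.array([54, 59, 67, 77, 84, 80])

regressor = KNeighborsRegressor(n_neighbors=3)
regressor.fit(X_train, y_train)

# The prediction for 175 cm is the mean of the weights of the 3 closest heights
# (170, 178, and 183 cm), that is, (67 + 77 + 84) / 3 = 76.0.
print(regressor.predict(np.array([[175]])))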


How to develop a stock price predictive model using Reinforcement Learning and TensorFlow

Aaron Lazar
20 Feb 2018
12 min read
[box type="note" align="" class="" width=""]This article is an extract from the book Predictive Analytics with TensorFlow, authored by Md. Rezaul Karim. This book helps you build, tune, and deploy predictive models with TensorFlow.[/box] In this article we’ll show you how to create a predictive model to predict stock prices, using TensorFlow and Reinforcement Learning. An emerging area for applying Reinforcement Learning is the stock market trading, where a trader acts like a reinforcement agent since buying and selling (that is, action) particular stock changes the state of the trader by generating profit or loss, that is, reward. The following figure shows some of the most active stocks on July 15, 2017 (for an example): Now, we want to develop an intelligent agent that will predict stock prices such that a trader will buy at a low price and sell at a high price. However, this type of prediction is not so easy and is dependent on several parameters such as the current number of stocks, recent historical prices, and most importantly, on the available budget to be invested for buying and selling. The states in this situation are a vector containing information about the current budget, current number of stocks, and a recent history of stock prices (the last 200 stock prices). So each state is a 202-dimensional vector. For simplicity, there are only three actions to be performed by a stock market agent: buy, sell, and hold. So, we have the state and action, what else do you need? Policy, right? Yes, we should have a good policy, so based on that an action will be performed in a state. A simple policy can consist of the following rules: Buying (that is, action) a stock at the current stock price (that is, state) decreases the budget while incrementing the current stock count Selling a stock trades it in for money at the current share price Holding does neither, and performing this action simply waits for a particular time period and yields no reward To find the stock prices, we can use the yahoo_finance library in Python. A general warning you might experience is "HTTPError: HTTP Error 400: Bad Request". But keep trying. Now, let's try to get familiar with this module: >>> from yahoo_finance import Share >>> msoft = Share('MSFT') >>> print(msoft.get_open()) 72.24= >>> print(msoft.get_price()) 72.78 >>> print(msoft.get_trade_datetime()) 2017-07-14 20:00:00 UTC+0000 >>> So as of July 14, 2017, the stock price of Microsoft Inc. went higher, from 72.24 to 72.78, which means about a 7.5% increase. However, this small and just one-day data doesn't give us any significant information. But, at least we got to know the present state for this particular stock or instrument. To install yahoo_finance, issue the following command: $ sudo pip3 install yahoo_finance Now it would be worth looking at the historical data. The following function helps us get the historical data for Microsoft Inc: def get_prices(share_symbol, start_date, end_date, cache_filename): try: stock_prices = np.load(cache_filename) except IOError: share = Share(share_symbol) stock_hist = share.get_historical(start_date, end_date) stock_prices = [stock_price['Open'] for stock_price in stock_ hist] np.save(cache_filename, stock_prices) return stock_prices The get_prices() method takes several parameters such as the share symbol of an instrument in the stock market, the opening date, and the end date. You will also like to specify and cache the historical data to avoid repeated downloading. 
Once you have downloaded the data, it's time to plot the data to get some insights. The following function helps us to plot the price: def plot_prices(prices): plt.title('Opening stock prices') plt.xlabel('day') plt.ylabel('price ($)') plt.plot(prices) plt.savefig('prices.png') Now we can call these two functions by specifying a real argument as follows: if __name__ == '__main__': prices = get_prices('MSFT', '2000-07-01', '2017-07-01', 'historical_stock_prices.npy') plot_prices(prices) Here I have chosen a wide range for the historical data of 17 years to get a better insight. Now, let's take a look at the output of this data: The goal is to learn a policy that gains the maximum net worth from trading in the stock market. So what will a trading agent be achieving in the end? Figure 8 gives you some clue: Well, figure 8 shows that if the agent buys a certain instrument with price $20 and sells at a peak price say at $180, it will be able to make $160 reward, that is, profit. So, implementing such an intelligent agent using RL algorithms is a cool idea? From the previous example, we have seen that for a successful RL agent, we need two operations well defined, which are as follows: How to select an action How to improve the utility Q-function To be more specific, given a state, the decision policy will calculate the next action to take. On the other hand, improve Q-function from a new experience of taking an action. Also, most reinforcement learning algorithms boil down to just three main steps: infer, perform, and learn. During the first step, the algorithm selects the best action (a) given a state (s) using the knowledge it has so far. Next, it performs the action to find out the reward (r) as well as the next state (s'). Then, it improves its understanding of the world using the newly acquired knowledge (s, r, a, s') as shown in the following figure: Now, let's start implementing the decision policy based on which action will be taken for buying, selling, or holding a stock item. Again, we will do it an incremental way. At first, we will create a random decision policy and evaluate the agent's performance. But before that, let's create an abstract class so that we can implement it accordingly: class DecisionPolicy: def select_action(self, current_state, step): pass def update_q(self, state, action, reward, next_state): pass The next task that can be performed is to inherit from this superclass to implement a random decision policy: class RandomDecisionPolicy(DecisionPolicy): def __init__(self, actions): self.actions = actions def select_action(self, current_state, step): action = self.actions[random.randint(0, len(self.actions) - 1)] return action The previous class did nothing except defi ning a function named select_action (), which will randomly pick an action without even looking at the state. Now, if you would like to use this policy, you can run it on the real-world stock price data. This function takes care of exploration and exploitation at each interval of time, as shown in the following figure that form states S1, S2, and S3. The policy suggests an action to be taken, which we may either choose to exploit or otherwise randomly explore another action. As we get rewards for performing an action, we can update the policy function over time: Fantastic, so we have the policy and now it's time to utilize this policy to make decisions and return the performance. 
Now, imagine a real scenario—suppose you're trading on Forex or ForTrade platform, then you can recall that you also need to compute the portfolio and the current profit or loss, that is, reward. Typically, these can be calculated as follows: portfolio = budget + number of stocks * share value reward = new_portfolio - current_portfolio At first, we can initialize values that depend on computing the net worth of a portfolio, where the state is a hist+2 dimensional vector. In our case, it would be 202 dimensional. Then we define the range of tuning the range up to: Length of the prices selected by the user query – (history + 1), since we start from 0, we subtract 1 instead. Then, we should calculate the updated value of the portfolio and from the portfolio, we can calculate the value of the reward, that is, profit. Also, we have already defined our random policy, so we can then select an action from the current policy. Then, we repeatedly update the portfolio values based on the action in each iteration and the new portfolio value after taking the action can be calculated. Then, we need to compute the reward from taking an action at a state. Nevertheless, we also need to update the policy after experiencing a new action. Finally, we compute the final portfolio worth: def run_simulation(policy, initial_budget, initial_num_stocks, prices, hist, debug=False): budget = initial_budget num_stocks = initial_num_stocks share_value = 0 transitions = list() for i in range(len(prices) - hist - 1): if i % 100 == 0: print('progress {:.2f}%'.format(float(100*i) / (len(prices) - hist - 1))) current_state = np.asmatrix(np.hstack((prices[i:i+hist], budget, num_stocks))) current_portfolio = budget + num_stocks * share_value action = policy.select_action(current_state, i) share_value = float(prices[i + hist + 1]) if action == 'Buy' and budget >= share_value: budget -= share_value num_stocks += 1 elif action == 'Sell' and num_stocks > 0: budget += share_value num_stocks -= 1 else: action = 'Hold' new_portfolio = budget + num_stocks * share_value reward = new_portfolio - current_portfolio next_state = np.asmatrix(np.hstack((prices[i+1:i+hist+1], budget, num_stocks))) transitions.append((current_state, action, reward, next_ state)) policy.update_q(current_state, action, reward, next_state) portfolio = budget + num_stocks * share_value if debug: print('${}t{} shares'.format(budget, num_stocks)) return portfolio The previous simulation predicts a somewhat good result; however, it produces random results too often. Thus, to obtain a more robust measurement of success, let's run the simulation a couple of times and average the results. Doing so may take a while to complete, say 100 times, but the results will be more reliable: def run_simulations(policy, budget, num_stocks, prices, hist): num_tries = 100 final_portfolios = list() for i in range(num_tries): final_portfolio = run_simulation(policy, budget, num_stocks, prices, hist) final_portfolios.append(final_portfolio) avg, std = np.mean(final_portfolios), np.std(final_portfolios) return avg, std The previous function computes the average portfolio and the standard deviation by iterating the previous simulation function 100 times. Now, it's time to evaluate the previous agent. As already stated, there will be three possible actions to be taken by the stock trading agent such as buy, sell, and hold. We have a state vector of 202 dimension and budget only $1000. 
Then, the evaluation goes as follows: actions = ['Buy', 'Sell', 'Hold'] hist = 200 policy = RandomDecisionPolicy(actions) budget = 1000.0 num_stocks = 0 avg,std=run_simulations(policy,budget,num_stocks,prices, hist) print(avg, std) >>> 1512.87102405 682.427384814 The first one is the mean and the second one is the standard deviation of the final portfolio. So, our stock prediction agent predicts that as a trader you/we could make a profit about $513. Not bad. However, the problem is that since we have utilized a random decision policy, the result is not so reliable. To be more specific, the second execution will definitely produce a different result: >>> 1518.12039077 603.15350649 Therefore, we should develop a more robust decision policy. Here comes the use of neural network-based QLearning for decision policy. Next, we will see a new hyperparameter epsilon to keep the solution from getting stuck when applying the same action over and over. The lesser its value, the more often it will randomly explore new actions: Next, I am going to write a class containing their functions: Constructor: This helps to set the hyperparameters from the Q-function. It also helps to set the number of hidden nodes in the neural networks. Once we have these two, it helps to define the input and output tensors. It then defines the structure of the neural network. Further, it defines the operations to compute the utility. Then, it uses an optimizer to update model parameters to minimize the loss and sets up the session and initializes variables. select_action: This function exploits the best option with probability 1-epsilon. update_q: This updates the Q-function by updating its model parameters. Refer to the following code: class QLearningDecisionPolicy(DecisionPolicy): def __init__(self, actions, input_dim): self.epsilon = 0.9 self.gamma = 0.001 self.actions = actions output_dim = len(actions) h1_dim = 200 self.x = tf.placeholder(tf.float32, [None, input_dim]) self.y = tf.placeholder(tf.float32, [output_dim]) W1 = tf.Variable(tf.random_normal([input_dim, h1_dim])) b1 = tf.Variable(tf.constant(0.1, shape=[h1_dim])) h1 = tf.nn.relu(tf.matmul(self.x, W1) + b1) W2 = tf.Variable(tf.random_normal([h1_dim, output_dim])) b2 = tf.Variable(tf.constant(0.1, shape=[output_dim])) self.q = tf.nn.relu(tf.matmul(h1, W2) + b2) loss = tf.square(self.y - self.q) self.train_op = tf.train.GradientDescentOptimizer(0.01). minimize(loss) self.sess = tf.Session() self.sess.run(tf.initialize_all_variables()) def select_action(self, current_state, step): threshold = min(self.epsilon, step / 1000.) if random.random() < threshold: # Exploit best option with probability epsilon action_q_vals = self.sess.run(self.q, feed_dict={self.x: current_state}) action_idx = np.argmax(action_q_vals) action = self.actions[action_idx] else: # Random option with probability 1 - epsilon action = self.actions[random.randint(0, len(self.actions) - 1)] return action def update_q(self, state, action, reward, next_state): action_q_vals = self.sess.run(self.q, feed_dict={self.x: state}) next_action_q_vals = self.sess.run(self.q, feed_dict={self.x: next_state}) next_action_idx = np.argmax(next_action_q_vals) action_q_vals[0, next_action_idx] = reward + self.gamma * next_action_q_vals[0, next_action_idx] action_q_vals = np.squeeze(np.asarray(action_q_vals)) self.sess.run(self.train_op, feed_dict={self.x: state, self.y: action_q_vals}) There you go! We have a stock price predictive model running and we’ve built it using Reinforcement Learning and TensorFlow. 
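The article evaluates the random policy with run_simulations but does not show the corresponding call for the Q-learning policy. The following is a hedged sketch of how it could be wired up, reusing the functions defined above; the input dimension hist + 2 follows from the 202-dimensional state described earlier (200 prices plus budget and stock count), and everything else mirrors the earlier random-policy evaluation.

# Sketch (not from the original text): evaluating the neural-network-based policy
# with the same helper functions used for the random policy above.
actions = ['Buy', 'Sell', 'Hold']
hist = 200
prices = get_prices('MSFT', '2000-07-01', '2017-07-01', 'historical_stock_prices.npy')
policy = QLearningDecisionPolicy(actions, input_dim=hist + 2)  # 200 prices + budget + stock count
budget = 1000.0
num_stocks = 0
avg, std = run_simulations(policy, budget, num_stocks, prices, hist)
print(avg, std)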
If you found this tutorial interesting and would like to learn more, head over to grab this book, Predictive Analytics with TensorFlow, by Md. Rezaul Karim.    